SAS Certificate Base Practice Questions (Textbook) Flashcards

How many observations and variables does the data set below contain?

a. 3 observations, 4 variables

b. 3 observations, 3 variables

c. 4 observations, 3 variables

d. can't tell because some values are missing

Rows in the data set are called observations, and columns are called variables. Missing values don't affect the structure of the data set.

How many program steps are executed when the program below is processed?

data user.tables;

infile jobs;

input date name $ job $;

run;

proc sort data=user.tables;

by name; run;

proc print data=user.tables;

run;

a. three

b. four

c. five

d. six

a When it encounters a DATA, PROC, or RUN statement, SAS stops reading statements and executes the previous step in the program. The program above contains one DATA step and two PROC steps, for a total of three program steps.

What type of variable is the variable AcctNum in the data set below? a. numeric b. character c. can be either character or numeric d. can't tell from the data shown

b It must be a character variable, because the values contain letters and underscores, which are not valid characters for numeric values.

What type of variable is the variable Wear in the data set below? a. numeric b. character c. can be either character or numeric d. can't tell from the data shown

a It must be a numeric variable, because the missing value is indicated by a period rather than by a blank.

Which of the following variable names is valid? a. 4BirthDate b. $Cost c. _Items_ d. Tax-Rate

c Variable names follow the same rules as SAS data set names. They can be 1 to 32 characters long, must begin with a letter (A–Z, either uppercase or lowercase) or an underscore, and can continue with any combination of numbers, letters, or underscores.
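
The naming rule can be tried out with a short DATA step. This is only a sketch: the data set name Work.Name_Demo and the variable values are made up for illustration, and it assumes the default variable-naming rules are in effect.

```sas
/* Sketch: every variable name below follows the rule described above. */
/* Work.Name_Demo and all values are hypothetical.                     */
data work.name_demo;
   _Items_    = 12;      /* begins and ends with an underscore       */
   BirthDate4 = 1;       /* a digit is allowed after the first char  */
   Tax_Rate   = 0.07;    /* underscore used in place of a hyphen     */
run;

/* List the variables that were created. */
proc contents data=work.name_demo;
run;
```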

Which of the following files is a permanent SAS file? a. Sashelp.PrdSale b. Sasuser.MySales c. Profits.Quarter1 d. all of the above

d To store a file permanently in a SAS data library, you assign it a libref other than the default Work. For example, by assigning the libref Profits to a SAS data library, you specify that files within the library are to be stored until you delete them. Therefore, SAS files in the Sashelp and Sasuser libraries are permanent files.

In a DATA step, how can you reference a temporary SAS data set named Forecast? a. Forecast b. Work.Forecast c. Sales.Forecast (after assigning the libref Sales) d. only a and b above

d To reference a temporary SAS file in a DATA step or PROC step, you can specify the one-level name of the file (for example, Forecast) or the two-level name using the libref Work (for example, Work.Forecast).
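
A minimal sketch of the two equivalent references follows. The variables Month and Units are invented for the example, and it assumes no User libref has been assigned (so one-level names go to Work).

```sas
/* One-level name: the data set is created in the temporary Work library. */
data Forecast;
   Month = 1;      /* hypothetical values */
   Units = 250;
run;

/* Two-level name: refers to the same temporary data set. */
proc print data=work.forecast;
run;
```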

What is the default length for the numeric variable Balance? a. 5 b. 6 c. 7 d. 8

d The numeric variable Balance has a default length of 8. Numeric values (no matter how many digits they contain) are stored in 8 bytes of storage unless you specify a different length.
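
One way to check this is to compare a variable that takes the default length with one given an explicit LENGTH statement. The data set and the variable SmallNum are hypothetical names used only for this sketch.

```sas
/* Sketch: Balance takes the default numeric length; SmallNum does not. */
data work.len_demo;
   length SmallNum 4;        /* explicit length of 4 bytes */
   Balance  = 1234567.89;    /* no LENGTH statement: stored in 8 bytes */
   SmallNum = 42;
run;

/* The Len column of the variable list shows the stored lengths. */
proc contents data=work.len_demo;
run;
```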

How many statements does the following SAS program contain? proc print data=new.prodsale label double; var state day price1 price2; where state='NC'; label state='Name of State'; run; a. three b. four c. five d. six

c The five statements are • PROC PRINT statement (two lines long) • VAR statement • WHERE statement (on the same line as the VAR statement) • LABEL statement • RUN statement (on the same line as the LABEL statement).

What is a SAS data library? a. a collection of SAS files, such as SAS data sets and catalogs b. in some operating environments, a physical collection of SAS files c. in some operating environments, a logically related collection of SAS files d. all of the above

d Every SAS file is stored in a SAS data library, which is a collection of SAS files, such as SAS data sets and catalogs. In some operating environments, a SAS data library is a physical collection of files. In others, the files are only logically related. In the Windows and UNIX environments, a SAS data library is typically a group of SAS files in the same folder or directory.

If you submit the following program, how does the output look? options pagesize=55 nonumber; proc tabulate data=clinic.admit; class actlevel; var age height weight; table actlevel,(age height weight)*mean; run; options linesize=80; proc means data=clinic.heart min max maxdec=1; var arterial heart cardiac urinary; class survive sex; run; a. The PROC MEANS output has a print line width of 80 characters, but the PROC TABULATE output has no print line width. b. The PROC TABULATE output has no page numbers, but the PROC MEANS output has page numbers. c. Each page of output from both PROC steps is 55 lines long and has no page numbers, and the PROC MEANS output has a print line width of 80 characters. d. The date does not appear on output from either PROC step.

c When you specify a system option, it remains in effect until you change the option or end your SAS session, so both PROC steps generate output that is printed 55 lines per page with no page numbers. If you don't specify a system option, SAS uses the default value for that system option.

In order for the date values 05May1955 and 04Mar2046 to be read correctly, what value must the YEARCUTOFF= option have? a. a value between 1947 and 1954, inclusive b. 1955 or higher c. 1946 or higher d. any value

d As long as you specify an informat with the correct field width for reading the entire date value, the YEARCUTOFF= option doesn't affect date values that have four-digit years.

When you specify an engine for a library, you are always specifying a. the file format for files that are stored in the library. b. the version of SAS that you are using. c. access to other software vendors' files. d. instructions for creating temporary SAS files.

a A SAS engine is a set of internal instructions that SAS uses for writing to and reading from files in a SAS library. Each engine specifies the file format for files that are stored in the library, which in turn enables SAS to access files with a particular format. Some engines access SAS files, and other engines support access to other vendors' files.

Which statement prints a summary of all the files stored in the library named Area51? a. proc contents data=area51._all_ nods; b. proc contents data=area51 _all_ nods; c. proc contents data=area51 _all_ noobs; d. proc contents data=area51 _all_.nods;

a To print a summary of library contents with the CONTENTS procedure, use a period to append the _ALL_ option to the libref. Adding the NODS option suppresses detailed information about the files.
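
The same syntax can be tried against a library that ships with SAS. The sketch below uses Sashelp as a stand-in, because Area51 would first have to be assigned with a LIBNAME statement.

```sas
/* Summary of every file in the Sashelp library.           */
/* _ALL_ is attached to the libref with a period, and NODS */
/* suppresses the per-file detail.                         */
proc contents data=sashelp._all_ nods;
run;
```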

The following PROC PRINT output was created immediately after PROC TABULATE output. Which SAS system options were specified when the report was created? a. OBS=, DATE, and NONUMBER b. PAGENO=1, and DATE c. NUMBER and DATE only d. none of the above

b Clearly, the DATE and PAGENO= options are specified. PAGENO=1 must have been set, because the page number on the output is 1 even though PROC TABULATE output was just produced. If you don't specify PAGENO=, all output in the Output window is numbered sequentially throughout your SAS session.

Which of the following programs correctly references a SAS data set named SalesAnalysis that is stored in a permanent SAS library? a. data saleslibrary.salesanalysis; set mydata.quarter1sales; if sales>100000; run; b. data mysales.totals; set sales_99.salesanalysis; if totalsales>50000; run; c. proc print data=salesanalysis.quarter1; var sales salesrep month; run; d. proc freq data=1999data.salesanalysis; tables quarter*sales; run;

b Librefs must be 1 to 8 characters long, must begin with a letter or underscore, and can contain only letters, numbers, or underscores. After you assign a libref, you specify it as the first element in the two-level name for a SAS file.

Which time span is used to interpret two-digit year values if the YEARCUTOFF= option is set to 1950? a. 1950-2049 b. 1950-2050 c. 1949-2050 d. 1950-2000

a The YEARCUTOFF= option specifies which 100-year span is used to interpret two-digit year values. The default value of YEARCUTOFF= is 1920. However, you can override the default and change the value of YEARCUTOFF= to the first year of another 100-year span. If you specify YEARCUTOFF=1950, then the 100-year span will be from 1950 to 2049.
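
A short sketch of how the 100-year span plays out in practice. The date strings are made up for the example; the DATE7. informat reads two-digit years and DATE9. reads four-digit years.

```sas
/* With YEARCUTOFF=1950, two-digit years fall in the span 1950-2049. */
options yearcutoff=1950;

data _null_;
   d1 = input('01JAN50', date7.);     /* two-digit year 50 is read as 1950 */
   d2 = input('31DEC49', date7.);     /* two-digit year 49 is read as 2049 */
   d3 = input('04MAR2046', date9.);   /* four-digit year: option has no effect */
   put d1= date9. d2= date9. d3= date9.;   /* write the results to the log */
run;
```

The log line should show d1=01JAN1950, d2=31DEC2049, and d3=04MAR2046.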

Assuming you are using SAS code and not special SAS windows, which one of the following statements is false? a. LIBNAME statements can be stored with a SAS program to reference the SAS library automatically when you submit the program. b. When you delete a libref, SAS no longer has access to the files in the library. However, the contents of the library still exist on your operating system. c. Librefs can last from one SAS session to another. d. You can access files that were created with other vendors' software by submitting a LIBNAME statement.

c The LIBNAME statement is global, which means that librefs remain in effect until you modify them, cancel them, or end your SAS session. Therefore, the LIBNAME statement assigns the libref for the current SAS session only. You must assign a libref before accessing SAS files that are stored in a permanent SAS data library.

What does the following statement do? libname osiris spss 'c:\myfiles\sasdata\data'; a. defines a library called Spss using the OSIRIS engine b. defines a library called Osiris using the SPSS engine c. defines two libraries called Osiris and Spss using the default engine d. defines the default library using the OSIRIS and SPSS engines

b In the LIBNAME statement, you specify the library name before the engine name. Both are followed by the path.

What does the following OPTIONS statement do? options pagesize=15 nodate; a. suppresses the date and limits the page size of the log b. suppresses the date and limits the vertical page size for text output c. suppresses the date and limits the vertical page size for text and HTML output d. suppresses the date and limits the horizontal page size for text output

b These options affect the format of listing output only. NODATE suppresses the date and PAGESIZE= determines the number of rows to print on the page.

As you write and edit SAS programs it's a good idea to a. begin DATA and PROC steps in column one. b. indent statements within a step. c. begin RUN statements in column one. d. all of the above

d Although you can write SAS statements in almost any format, a consistent layout enhances readability and enables you to understand the program's purpose. It's a good idea to begin DATA and PROC steps in column one, to indent statements within a step, to begin RUN statements in column one, and to include a RUN statement after every DATA step or PROC step.

What usually happens when an error is detected? a. SAS continues processing the step. b. SAS continues to process the step, and the log displays messages about the error. c. SAS stops processing the step in which the error occurred, and the log displays messages about the error. d. SAS stops processing the step in which the error occurred, and the program output displays messages about the error.

c Syntax errors generally cause SAS to stop processing the step in which the error occurred. When a program that contains an error is submitted, messages regarding the problem also appear in the SAS log. When a syntax error is detected, the SAS log displays the word ERROR, identifies the possible location of the error, and gives an explanation of the error.

A syntax error occurs when a. some data values are not appropriate for the SAS statements that are specified in a program. b. the form of the elements in a SAS statement is correct, but the elements are not valid for that usage. c. program statements do not conform to the rules of the SAS language. d. none of the above

c Syntax errors are common types of errors. Some SAS system options, features of the Editor window, and the DATA step debugger can help you identify syntax errors. Other types of errors include data errors, semantic errors, and execution-time errors.

How can you tell whether you have specified an invalid option in a SAS program? a. A log message indicates an error in a statement that seems to be valid. b. A log message indicates that an option is not valid or not recognized. c. The message "PROC running" or "DATA step running" appears at the top of the active window. d. You can't tell until you view the output from the program.

b When you submit a SAS statement that contains an invalid option, a log message notifies you that the option is not valid or not recognized. You should recall the program, remove or replace the invalid option, check your statement syntax as needed, and resubmit the corrected program.

Which of the following programs contains a syntax error?

b The DATA step contains a misspelled keyword (dat instead of data). However, this is such a common (and easily interpretable) error that SAS produces only a warning message, not an error.

What does the following log indicate about your program? 13 proc print data=sasuser.cargo99 var origin dest cargorev; 22 76 ERROR 22-322: Syntax error, expecting one of the following: ;, (, DATA, DOUBLE, HEADING, LABEL, N, NOOBS, OBS, ROUND, ROWS, SPLIT, STYLE, UNIFORM, WIDTH. ERROR 76-322: Syntax error, statement will be ignored. 11 run; a. SAS identifies a syntax error at the position of the VAR statement. b. SAS is reading VAR as an option in the PROC PRINT statement. c. SAS has stopped processing the program because of errors. d. all of the above

d Because there is a missing semicolon at the end of the PROC PRINT statement, SAS interprets VAR as an option in PROC PRINT and finds a syntax error at that location. SAS stops processing programs when it encounters a syntax error.

Which PROC PRINT step below creates the following output?

c The DATA= option specifies the data set that you are listing, and the ID statement replaces the Obs column with the specified variable. The VAR statement specifies variables and controls the order in which they appear, and the WHERE statement selects rows based on a condition. The LABEL option in the PROC PRINT statement causes the labels that are specified in the LABEL statement to be displayed.

Which of the following PROC PRINT steps is correct if labels are not stored with the data set?

You use the DATA= option to specify the data set to be printed. The LABEL option specifies that variable labels appear in output instead of variable names.

Which of the following statements selects from a data set only those observations for which the value of the variable Style is RANCH, SPLIT, or TWOSTORY?

d In the WHERE statement, the IN operator enables you to select observations based on several values. You specify values in parentheses and separate them by spaces or commas. Character values must be enclosed in quotation marks and must be in the same case as in the data set.
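
A small self-contained sketch of the IN operator. The data set Work.Houses_Demo and its values are invented for illustration; note that the quoted values match the case of the stored data.

```sas
/* Hypothetical sample data: one character and one numeric variable. */
data work.houses_demo;
   input Style $ Price;
   datalines;
RANCH 64000
CONDO 80050
SPLIT 65850
TWOSTORY 69250
;
run;

/* Keep only the three listed styles; the CONDO row is filtered out. */
proc print data=work.houses_demo;
   where style in ('RANCH','SPLIT','TWOSTORY');
run;
```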

If you want to sort your data and create a temporary data set named Calc to store the sorted data, which of the following steps should you submit?

c In a PROC SORT step, you specify the DATA= option to specify the data set to sort. The OUT= option specifies an output data set. The required BY statement specifies the variable(s) to use in sorting the data.
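
As a sketch of these three pieces together, the step below sorts the Sashelp.Class sample data set (used here as a stand-in for the input data) into a temporary data set named Calc.

```sas
/* DATA= names the input, OUT= names the sorted copy, BY names the sort key. */
proc sort data=sashelp.class out=work.calc;
   by age;
run;

/* The input data set is left unchanged; Work.Calc holds the sorted rows. */
proc print data=work.calc;
run;
```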

Which options are used to create the following PROC PRINT output? 13:27 Monday, March 22, 1999 Patient Arterial Heart Cardiac Urinary 203 88 95 66 110 54 83 183 95 0 664 72 111 332 12 210 74 97 369 0 101 80 130 291 0 a. the DATE system option and the LABEL option in PROC PRINT b. the DATE and NONUMBER system options and the DOUBLE and NOOBS options in PROC PRINT c. the DATE and NONUMBER system options and the DOUBLE option in PROC PRINT d. the DATE and NONUMBER system options and the NOOBS option in PROC PRINT

b The DATE and NONUMBER system options cause the output to appear with the date but without page numbers. In the PROC PRINT step, the DOUBLE option specifies double spacing, and the NOOBS option removes the default Obs column.

Which of the following statements can you use in a PROC PRINT step to create this output?

d You do not need to name the variables in a VAR statement if you specify them in the SUM statement, but you can. If you choose not to name the variables in the VAR statement as well, then the SUM statement determines the order of the variables in the output.

What happens if you submit the following program? proc sort data=clinic.diabetes; run; proc print data=clinic.diabetes; var age height weight pulse; where sex='F'; run; a. The PROC PRINT step runs successfully, printing observations in their sorted order. b. The PROC SORT step permanently sorts the input data set. c. The PROC SORT step generates errors and stops processing, but the PROC PRINT step runs successfully, printing observations in their original (unsorted) order. d. The PROC SORT step runs successfully, but the PROC PRINT step generates errors and stops processing.

c The BY statement is required in PROC SORT. Without it, the PROC SORT step fails. However, the PROC PRINT step prints the original data set as requested.

If you submit the following program, which output does it create? proc sort data=finance.loans out=work.loans; by months amount; run; proc print data=work.loans noobs; var months; sum amount payment; where months<360; run;

a Column totals appear at the end of the report in the same format as the values of the variables, so b is incorrect. Work.Loans is sorted by Months and Amount, so c is incorrect. The program sums both Amount and Payment, so d is incorrect.

Choose the statement below that selects rows in which • the amount is less than or equal to $5000 • the account is 101-1092 or the rate equals 0.095.

c To ensure that the compound expression is evaluated correctly, you can use parentheses to group account='101-1092' or rate eq 0.095.

OBS  Account   Amount    Rate    Months  Payment
1    101-1092  $22,000   10.00%  60      $467.43
2    101-1731  $114,000  9.50%   360     $958.57
3    101-1289  $10,000   10.50%  36      $325.02
4    101-3144  $3,500    10.50%  12      $308.52
5    103-1135  $8,700    10.50%  24      $403.47
6    103-1994  $18,500   10.00%  60      $393.07
7    103-2335  $5,000    10.50%  48      $128.02
8    103-3864  $87,500   9.50%   360     $735.75
9    103-3891  $30,000   9.75%   360     $257.75

For example, from the data set above, a and b above select observations 2 and 8 (those that have a rate of 0.095); c selects no observations; and d selects observations 4 and 7 (those that have an amount less than or equal to 5000).
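
The effect of the parentheses can be sketched with the loan values from this card. The data set name Work.Loans_Demo is an assumption, and the amounts and rates are entered as plain numbers (22000, 0.095) rather than formatted values so that simple list input can read them.

```sas
/* Loan values from the card, typed as unformatted numbers. */
data work.loans_demo;
   input Account $ Amount Rate Months Payment;
   datalines;
101-1092 22000 0.1000 60 467.43
101-1731 114000 0.0950 360 958.57
101-1289 10000 0.1050 36 325.02
101-3144 3500 0.1050 12 308.52
103-1135 8700 0.1050 24 403.47
103-1994 18500 0.1000 60 393.07
103-2335 5000 0.1050 48 128.02
103-3864 87500 0.0950 360 735.75
103-3891 30000 0.0975 360 257.75
;
run;

/* Grouped: the amount test must hold together with the OR group. */
/* No row in this sample satisfies both, so nothing is printed.   */
proc print data=work.loans_demo;
   where amount le 5000 and (account='101-1092' or rate eq 0.095);
run;

/* Ungrouped: AND is evaluated before OR, so every row with a rate */
/* of 0.095 is selected regardless of its amount.                  */
proc print data=work.loans_demo;
   where amount le 5000 and account='101-1092' or rate eq 0.095;
run;
```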

What does PROC PRINT display by default? a. PROC PRINT does not create a default report; you must specify the rows and columns to be displayed. b. PROC PRINT displays all observations and variables in the data set. If you want an additional column for observation numbers, you can request it. c. PROC PRINT displays columns in the following order: a column for observation numbers, all character variables, and all numeric variables. d. PROC PRINT displays all observations and variables in the data set, a column for observation numbers on the far left, and variables in the order in which they occur in the data set.

d You can remove the column for observation numbers. You can also specify the variables you want, and you can select observations according to conditions.

Which SAS statement associates the fileref Crime with the raw data file C:\States\Data\Crime? a. filename crime 'c:\states\data\crime'; b. filename crime c:\states\data\crime; c. fileref crime 'c:\states\data\crime'; d. filename 'c:\states\data\crime' crime;

a Before you can read your raw data, you must reference the raw data file by creating a fileref. You assign a fileref by using a FILENAME statement in the same way that you assign a libref by using a LIBNAME statement.

Filerefs remain in effect until a. you change them. b. you cancel them. c. you end your SAS session. d. all of the above

d Like LIBNAME statements, FILENAME statements are global; they remain in effect until you change them, cancel them, or end your SAS session.

Which statement identifies the name of a raw data file to be read with the fileref Products and specifies that the DATA step read only records 1-15? a. infile products obs 15; b. infile products obs=15; c. input products obs=15; d. input products 1-15;

b You use an INFILE statement to specify the raw data file to be read. You can specify a fileref or an actual filename (in quotation marks). The OBS= option in the INFILE statement enables you to process only records 1 through n.

Which of the following programs correctly writes the observations from the data set below to a raw data file?

d The keyword _NULL_ in the DATA statement enables you to use the power of the DATA step without actually creating a SAS data set. You use the FILE and PUT statements to write out the observations from a SAS data set to a raw data file. The FILE statement specifies the raw data file and the PUT statement describes the lines to write to the raw data file. The filename and location that are specified in the FILE statement must be enclosed in quotation marks.
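
A minimal sketch of this pattern, using the Sashelp.Class sample data set as a stand-in for the input. The output path is hypothetical and would need to point to a writable location, and the column positions in the PUT statement are chosen only for illustration.

```sas
/* _NULL_: run the DATA step without creating a SAS data set. */
data _null_;
   set sashelp.class;                   /* read each observation       */
   file 'c:\temp\class.dat';            /* hypothetical raw data file  */
   put name $ 1-8 sex $ 10 age 12-13;   /* one fixed-column line per row */
run;
```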

Which raw data file can be read using column input?

b Column input is appropriate only in some situations. When you use column input, your data must be standard character or numeric values, and they must be in fixed fields. That is, values for a particular variable must be in the same location in all records.

Which program creates the output shown below?

a The INPUT statement creates a variable using the name that you assign to each field. Therefore, when you write an INPUT statement, you need to specify the variable names exactly as you want them to appear in the SAS data set.

Which statement correctly reads the fields in the following order: StockNumber, Price, Item, Finish, Style?

Field Name   Start Column  End Column  Data Type
StockNumber  1             3           character
Finish       5             9           character
Style        11            18          character
Item         20            24          character
Price        27            32          numeric

b You can use column input to read fields in any order. You must specify the variable name to be created, identify character values with a $, and name the correct starting column and ending column for each field.
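
Under the field layout given in the question, such an INPUT statement might look like the sketch below. The data set name and the two data lines are made up; only the column positions come from the card.

```sas
/* Column input reads the fields in the requested order,   */
/* regardless of where they sit in the record.             */
data work.furniture_demo;
   input StockNumber $ 1-3 Price 27-32 Item $ 20-24
         Finish $ 5-9 Style $ 11-18;
   datalines;
310 oak   pedestal table  329.99
311 maple ladder   chair  129.50
;
run;

/* Variables appear in the order in which INPUT defined them. */
proc print data=work.furniture_demo;
run;
```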

Which statement correctly re-defines the values of the variable Income as 100 percent higher? a. income=income*1.00; b. income=income+(income*2.00); c. income=income*2; d. income=*2;

c To re-define the values of the variable Income in an Assignment statement, you specify the variable name on the left side of the equal sign and an appropriate expression including the variable name on the right side of the equal sign.

Which program correctly reads instream data? a. data finance.newloan; input datalines; if country='JAPAN'; MonthAvg=amount/12; 1998 US CARS 194324.12 1998 US TRUCKS 142290.30 1998 CANADA CARS 10483.44 1998 CANADA TRUCKS 93543.64 1998 MEXICO CARS 22500.57 1998 MEXICO TRUCKS 10098.88 1998 JAPAN CARS 15066.43 1998 JAPAN TRUCKS 40700.34 ; b. data finance.newloan; input Year 1-4 Country $ 6-11 Vehicle $ 13-18 Amount 20-28; if country='JAPAN'; MonthAvg=amount/12; datalines; run; c. data finance.newloan; input Year 1-4 Country 6-11 Vehicle 13-18 Amount 20-28; if country='JAPAN'; MonthAvg=amount/12; datalines; 1998 US CARS 194324.12 1998 US TRUCKS 142290.30 1998 CANADA CARS 10483.44 1998 CANADA TRUCKS 93543.64 1998 MEXICO CARS 22500.57 1998 MEXICO TRUCKS 10098.88 1998 JAPAN CARS 15066.43 1998 JAPAN TRUCKS 40700.34 ; d. data finance.newloan; input Year 1-4 Country $ 6-11 Vehicle $ 13-18 Amount 20-28; if country='JAPAN'; MonthAvg=amount/12; datalines; 1998 US CARS 194324.12 1998 US TRUCKS 142290.30 1998 CANADA CARS 10483.44 1998 CANADA TRUCKS 93543.64 1998 MEXICO CARS 22500.57 1998 MEXICO TRUCKS 10098.88 1998 JAPAN CARS 15066.43 1998 JAPAN TRUCKS 40700.34 ;

d To read instream data, you specify a DATALINES statement and data lines, followed by a null statement (single semicolon) to indicate the end of the input data. Program a contains no DATALINES statement, and the INPUT statement doesn't specify the fields to read. Program b contains no data lines, and the INPUT statement in program c doesn't specify the necessary dollar signs for the character variables Country and Vehicle.

Which SAS statement subsets the raw data shown below so that only the observations in which Sex (in the second field) has a value of F are processed? a. if sex=f; b. if sex=F; c. if sex='F'; d. a or b

c To subset data, you can use a subsetting IF statement in any DATA step to process only those observations that meet a specified condition. Because Sex is a character variable, the value F must be enclosed in quotation marks and must be in the same case as in the data set.

Which of the following is not created during the compilation phase? a. the data set descriptor b. the first observation c. the program data vector d. the _N_ and _ERROR_ automatic variables

b At the beginning of the compilation phase, the program data vector is created. The program data vector includes the two automatic variables _N_ and _ERROR_. The descriptor portion of the new SAS data set is created at the end of the compilation phase. The descriptor portion includes the name of the data set, the number of observations and variables, and the names and attributes of the variables. Observations are not written until the execution phase.

During the compilation phase, SAS scans each statement in the DATA step, looking for syntax errors. Which of the following is not considered a syntax error? a. incorrect values and formats b. invalid options or variable names c. missing or invalid punctuation d. missing or misspelled keywords

a Syntax checking can detect many common errors, but it cannot verify the values of variables or the correctness of formats.

Unless otherwise directed, the DATA step executes a. once for each compilation phase. b. once for each DATA step statement. c. once for each record in the input file. d. once for each variable in the input file.

c The DATA step executes once for each record in the input file, unless otherwise directed.

At the beginning of the execution phase, the value of _N_ is 1, the value of _ERROR_ is 0, and the values of the remaining variables are set to a. 0 b. 1 c. undefined d. missing

d The remaining variables are initialized to missing. Missing numeric values are represented by periods, and missing character values are represented by blanks.

Suppose you run a program that causes three DATA step errors. What is the value of the automatic variable _ERROR_ when the observation that contains the third error is processed? a. 0 b. 1 c. 2 d. 3

b The default value of _ERROR_ is 0, which means there is no error. When an error occurs, whether it is one error or multiple errors, the value is set to 1.
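
The behavior of the two automatic variables can be watched in the log with a small sketch like the one below. The variable Qty and the data lines are invented; the second line deliberately holds a non-numeric value so that a data error occurs.

```sas
/* Write _N_ and _ERROR_ to the log on every iteration. */
data _null_;
   input Qty;                  /* numeric variable */
   put _n_= _error_= Qty=;     /* inspect the automatic variables */
   datalines;
10
abc
30
;
run;
```

For the second record, SAS should report invalid data and set _ERROR_ to 1 and Qty to missing; for the first and third records _ERROR_ is 0.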

Which of the following actions occurs at the end of the DATA step? a. The automatic variables _N_ and _ERROR_ are incremented by one. b. The DATA step stops execution. c. The descriptor portion of the data set is written. d. The values of variables created in programming statements are re-set to missing in the program data vector.

d By default, at the end of the DATA step, the values in the program data vector are written to the data set as an observation, the value of the automatic variable _N_ is incremented by one, control returns to the top of the DATA step, and the values of variables created in programming statements are set to missing. The automatic variable _ERROR_ is reset to 0 if necessary.

Look carefully at the DATA step shown below. Based on the INPUT statement, in what order will the variables be stored in the new data set? data perm.update; infile invent; input IDnum $ 15-19 Item $ 1-13 Instock 21-22 BackOrd 24-25; Total=instock+backord; run; a. IDnum Item InStock BackOrd Total b. Item IDnum InStock BackOrd Total c. Total IDnum Item InStock BackOrd d. Total Item IDnum InStock BackOrd

a The order in which variables are defined in the DATA step determines the order in which the variables are stored in the data set.

If SAS cannot interpret syntax errors, then a. data set variables will contain missing values. b. the DATA step does not compile. c. the DATA step still compiles, but it does not execute. d. the DATA step still compiles and executes.

c When SAS can't interpret syntax errors, the DATA step compiles, but it does not execute.

What is wrong with this program? data perm.update; infile invent input Item $ 1-13 IDnum $ 15-19 Instock 21-22 BackOrd 24-25; total=instock+backord; run; a. missing semicolon on second line b. missing semicolon on third line c. incorrect order of variables d. incorrect variable type

a A semicolon is missing from the second line. It will cause an error because the INPUT statement will be interpreted as invalid INFILE statement options.

Look carefully at this section of a SAS session log. Based on the note, what was the most likely problem with the DATA step? NOTE: Invalid data for IDnum in line 7 15-19. RULE: ----+----1----+----2----+----3----+----4 7 Bird Feeder LG088 3 20 Item=Bird Feeder IDnum=. InStock=3 BackOrd=20 Total=23 _ERROR_=1 _N_=1 a. A keyword was misspelled in the DATA step. b. A semicolon was missing from the INFILE statement. c. A variable was misspelled in the INPUT statement. d. A dollar sign was missing in the INPUT statement.

d The third line of the log displays the values for IDnum, which are clearly character values. The fourth line displays the values in the program data vector and shows that the values for IDnum are missing, even though the other values are correctly assigned. Thus, it appears that numeric values were expected for IDnum. A dollar sign, to indicate character values, must be missing from the INPUT statement.
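
A sketch of the corrected step follows. To keep it self-contained it reads the record shown in the log from instream data and writes to a hypothetical Work.Update_Demo data set, rather than using the Invent fileref and Perm libref from the original program.

```sas
/* The dollar sign after IDnum tells SAS to read columns 15-19 */
/* as character data, so LG088 is no longer invalid.           */
data work.update_demo;
   input Item $ 1-13 IDnum $ 15-19 InStock 21-22 BackOrd 24-25;
   Total = instock + backord;
   datalines;
Bird Feeder   LG088  3 20
;
run;

proc print data=work.update_demo;
run;
```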

If you don't specify the LIBRARY= option, your formats are stored in Work.Formats, and they exist a. only for the current procedure. b. only for the current DATA step. c. only for the current SAS session. d. permanently.

c If you do not specify the LIBRARY= option, formats are stored in a default format catalog named Work.Formats. As the libref Work implies, any format that is stored in Work.Formats is a temporary format that exists only for the current SAS session.

Which of the following statements will store your formats in a permanent catalog?

a To store formats in a permanent catalog, you first write a LIBNAME statement to associate the libref with the SAS data library in which the catalog will be stored. Then add the LIB= (or LIBRARY=) option to the PROC FORMAT statement, specifying the name of the catalog.
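
The two steps might be sketched as follows. The folder path, the format name JOBFMT, and its values are assumptions made for illustration; the libref Library is used here because it is a conventional choice for a format library.

```sas
/* Step 1: point a libref at the folder that will hold the catalog. */
libname library 'c:\sas\formats\lib';   /* hypothetical path */

/* Step 2: LIB= stores the format in Library.Formats instead of */
/* the temporary Work.Formats catalog.                          */
proc format lib=library;
   value jobfmt
      103 = 'manager'
      105 = 'text processor';
run;
```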

When creating a format with the VALUE statement, the new format's name • cannot end with a number • cannot end with a period • cannot be the name of a SAS format, and a. cannot be the name of a data set variable. b. must be at least two characters long. c. must be at least eight characters long. d. must begin with a dollar sign ($) if used with a character variable.

d The name of a format that is created with a VALUE statement must begin with a dollar sign ($) if it applies to a character variable.

Which of the following FORMAT procedures is written correctly?

b A semicolon is needed after the PROC FORMAT statement. The VALUE statement begins with the keyword VALUE and ends with a semicolon after all the labels have been defined.

Which of these is false? Ranges in the VALUE statement can specify a. a single value, such as 24 or 'S'. b. a range of numeric values, such as 0–1500. c. a range of character values, such as 'A'–'M'. d. a list of numeric and character values separated by commas, such as 90,'B',180,'D',270.

d You can list values separated by commas, but the list must contain either all numeric values or all character values. Data set variables are either numeric or character.

How many characters can be used in a label? a. 40 b. 96 c. 200 d. 256

d When specifying a label, enclose it in quotation marks and limit the label to 256 characters.

Which keyword can be used to label missing values as well as any values that are not specified in a range? a. LOW b. MISS c. MISSING d. OTHER

d MISS and MISSING are invalid keywords, and LOW does not include missing values. The keyword OTHER can be used in the VALUE statement to label missing values as well as any values that are not specifically included in a range.
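
A brief sketch of OTHER catching a missing value. The format name SCOREFMT, the data set, and its values are all made up for the example.

```sas
/* LOW does not include missing values, so the missing score */
/* falls through to OTHER.                                   */
proc format;
   value scorefmt
      low-<60 = 'Fail'
      60-high = 'Pass'
      other   = 'Not recorded';
run;

data work.scores_demo;
   input Name $ Score;
   datalines;
Ann 72
Bob .
Cy 48
;
run;

/* The format name is followed by a period in the FORMAT statement. */
proc print data=work.scores_demo;
   format score scorefmt.;
run;
```

Bob's missing score should print as Not recorded.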

You can place the FORMAT statement in either a DATA step or a PROC step. What happens when you place the FORMAT statement in a DATA step? a. You temporarily associate the formats with variables. b. You permanently associate the formats with variables. c. You replace the original data with the format labels. d. You make the formats available to other data sets.

b By placing the FORMAT statement in a DATA step, you permanently associate the defined formats with variables.

The format JOBFMT was created in a FORMAT procedure. Which FORMAT statement will apply it to the variable JobTitle in the program output? a. format jobtitle jobfmt; b. format jobtitle jobfmt.; c. format jobtitle=jobfmt; d. format jobtitle='jobfmt';

b To associate a user-defined format with a variable, place a period at the end of the format name when it is used in the FORMAT statement.

Which keyword, when added to the PROC FORMAT statement, will display all the formats in your catalog? a. CATALOG b. LISTFMT c. FMTCAT d. FMTLIB

d Adding the keyword FMTLIB to the PROC FORMAT statement displays a list of all the formats in your catalog, along with descriptions of their values.

If Style has four unique values and you submit the following program, which output do you get? (Assume that all the other variables are numeric.) proc report data=sasuser.houses nowd; column style sqfeet bedrooms price; define style / group; run;

a This program creates a summary report, which consolidates into one row all observations from the data set that have a unique combination of values for the variable Style.

When you define an order variable, a. the detail rows are ordered according to their formatted values. b. you can't create summary reports. c. PROC REPORT displays only the first occurrence of each order variable value in a set of rows that have the same value for all order variables. d. all of the above

d Order variables do order rows according to the formatted values of the order variable, and PROC REPORT suppresses repetitious printing of order values. However, you can't use order variables in a summary report.

Which attributes or options are reflected in this PROC REPORT output?

Style      SqFeet      Price
RANCH         720    $34,550
TWOSTORY     1040    $55,850
SPLIT        1190    $65,850
TWOSTORY     1240    $69,250
RANCH        1250    $64,000
SPLIT        1305    $73,650
CONDO        1390    $79,350
CONDO        1400    $80,050
RANCH        1500    $86,650
RANCH        1535    $89,100
SPLIT        1615    $94,450
TWOSTORY     1745   $102,950
TWOSTORY     1810   $107,250
CONDO        1860   $110,700
CONDO        2105   $127,150

a. SKIPLINE and FORMAT= b. CENTER, HEADLINE, HEADSKIP, and either WIDTH=, SPACING=, or FORMAT= c. SPACING= only d. CENTER, FORMAT=, and HEADLINE

b The HEADLINE option underlines the headings, and the HEADSKIP option skips a line between the headings and the rows in the report. Also, Style is centered, and the column for Price is wider than the default.

To create a summary report that shows the average number of bedrooms and the maximum number of baths for each style of house, which DEFINE statements do you use in your PROC REPORT step?

b To create a summary report, you must define a group variable. To produce the statistics that you want, you must specify the MEAN and MAX statistics for Bedrooms and Baths.
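A minimal sketch of the DEFINE statements this answer describes, assuming the sample data set Sasuser.Houses (with the variables Style, Bedrooms, and Baths named in the question) is available in your session:

```sas
proc report data=sasuser.houses nowd;
   column style bedrooms baths;
   define style    / group;     /* one summary row per value of Style */
   define bedrooms / mean;      /* average number of bedrooms */
   define baths    / max;       /* maximum number of baths */
run;
```

Without the GROUP usage on Style, PROC REPORT would treat the character variable as a display variable and produce a detail row for every observation.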

Which program does not contain an error?

c Program c correctly specifies a computed variable in the COLUMN statement, defines the variable in a DEFINE statement, and computes values using the form variable-name.statistic in a compute block.

What output does this PROC REPORT step produce? proc report data=sasuser.houses nowd; column style sqfeet bedrooms price; run; a. a list report ordered by values of the first variable in the COLUMN statement b. a summary report ordered by values of the first variable in the COLUMN statement c. a list report that displays a row for each observation in the input data set and which calculates the SUM statistic for numeric variables d. a list report that calculates the N (frequency) statistic for character variables

c By default, PROC REPORT displays character variables as display variables. A report that contains one or more display variables has a detail row for each observation in the data set. By default, PROC REPORT displays numeric variables as analysis variables, which are used to calculate the default statistic SUM.

Which of the following programs produces this output?

c In this output, the table cells contain a frequency count for each unique value of an across variable, Style. You don't have to specify across variable values in your PROC REPORT step.

If you submit this program, where does your PROC REPORT output appear? proc report data=sasuser.houses nowd; column style sqfeet bedrooms price; define style / group; run; a. in the PROC REPORT window b. as HTML and/or SAS listing output c. both of the above d. neither of the above

b In nonwindowing mode, your PROC REPORT output appears as HTML and/or as SAS listing output, depending on your option settings.

How can you create output with headings that break as shown below?

Style of     Average    Maximum
House       Bedrooms      Baths
CONDO           2.75        2.5
RANCH           2.25          3
SPLIT       2.666666          3
TWOSTORY           3          3

a. You must specify the SPLIT= option in the PROC REPORT statement and use the split character in column headings in DEFINE statements. b. You must use the default split character in column headings in DEFINE statements. c. You must specify either the WIDTH= or the SPACING= attribute in DEFINE statements. d. These headings split this way by default.

d By default, the column width for a character variable equals the variable's length, and columns for numeric variables have a width of 9. So these headings split this way by default.

Suppose you want to create a report using both character and numeric variables. If you don't use any DEFINE statements in your PROC REPORT step, a. your PROC REPORT step will not execute successfully. b. you can produce only list reports. c. you can order rows by specifying options in the PROC REPORT statement. d. you can produce only summary reports.

b Unless you use DEFINE statements to define order variables or group variables, you can't order rows or produce summary reports. However, DEFINE statements are not required in all PROC REPORT steps.

The default statistics produced by the MEANS procedure are n-count, mean, minimum, maximum, and a. median. b. range. c. standard deviation. d. standard error of the mean.

c By default, the MEANS procedure produces the n-count, mean, standard deviation, minimum, and maximum.

Which statement will limit a PROC MEANS analysis to the variables Boarded, Transfer, and Deplane? a. by boarded transfer deplane; b. class boarded transfer deplane; c. output boarded transfer deplane; d. var boarded transfer deplane;

d To specify the variables that PROC MEANS analyzes, add a VAR statement and list the variable names.

The data set Survey.Health includes the following variables. Which is a poor candidate for PROC MEANS analysis? a. IDnum b. Age c. Height d. Weight

a Unlike Age, Height, or Weight, the values of IDnum are unlikely to yield any useful statistics.

Which of the following statements is true regarding BY-group processing? a. BY variables must be either indexed or sorted. b. Summary statistics are computed for BY variables. c. BY-group processing is preferred when you are categorizing data that contains few variables. d. BY-group processing overwrites your data set with the newly grouped observations

a Unlike CLASS processing, BY-group processing requires that your data already be indexed or sorted in the order of the BY variables. You might need to run the SORT procedure before using PROC MEANS with a BY group.
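To make the CLASS versus BY contrast concrete, here is a small sketch. The data set Work.Flights, the grouping variable Dest, and all values are hypothetical; only the analysis variables Boarded, Transfer, and Deplane come from the cards above.

```sas
data work.flights;              /* hypothetical sample data */
   input Dest $ Boarded Transfer Deplane;
   datalines;
LON 120 14 130
PAR 98 22 105
LON 134 9 140
PAR 101 17 99
;
run;

/* CLASS: no sorting needed, one table with a row group per Dest */
proc means data=work.flights;
   class dest;
   var boarded transfer deplane;   /* limits the analysis to these variables */
run;

/* BY: the data must first be sorted (or indexed) by Dest */
proc sort data=work.flights out=work.sorted;
   by dest;
run;

proc means data=work.sorted;
   by dest;                        /* one separate table per Dest value */
   var boarded transfer deplane;
run;
```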

Which group processing statement produced the PROC MEANS output shown below?

a. class sex survive; b. class survive sex; c. by sex survive; d. by survive sex;

b A CLASS statement produces a single large table, whereas BY-group processing creates a series of small tables. The order of the variables in the CLASS statement determines their order in the output table.

Which program can be used to create the following output?

d You can use either PROC MEANS or PROC SUMMARY to create the table. Adding a PRINT option to the PROC SUMMARY statement produces the same reports as if you used PROC MEANS.

By default, PROC FREQ creates a table of frequencies and percentages for which data set variables? a. character variables b. numeric variables c. both character and numeric variables d. none: variables must always be specified

c By default, PROC FREQ creates a frequency table for each variable in the data set, whether character or numeric.

Frequency distributions work best with variables that contain a. continuous values. b. numeric values. c. categorical values. d. unique values.

c Both continuous values and many unique values can result in lengthy and meaningless tables. Frequency distributions work best with categorical values.

Which PROC FREQ step produced this two-way table?

d An asterisk is used to join the variables in a two-way TABLES statement. The first variable forms the table rows, and the second variable forms the table columns.

Which PROC FREQ step produced this table?

d An asterisk is used to join the variables in crosstabulation tables. The only results that are shown in this table are cell percentages. The NOFREQ option suppresses cell frequencies, the NOROW option suppresses row percentages, and the NOCOL option suppresses column percentages.
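A short sketch of a crosstabulation that shows only cell percentages, as the answer describes. The data set Work.Survey and its values are hypothetical; the variable names Sex and Survive are borrowed from an earlier card purely for illustration.

```sas
data work.survey;               /* hypothetical sample data */
   input Sex $ Survive $;
   datalines;
F Yes
F No
M Yes
M Yes
F Yes
;
run;

proc freq data=work.survey;
   /* Sex forms the rows, Survive forms the columns */
   tables sex*survive / nofreq norow nocol;
run;
```

Removing the three options after the slash restores the default cell contents: frequency, percent, row percent, and column percent.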

Using ODS statements, how many types of output can you generate concurrently? a. 1 (only listing output) b. 2 c. 3 d. as many as you want

d You can generate any number of output types as long as you open the ODS destination for each type of output that you want to create.

If ODS is set to its default settings, what types of output are created by the code below? ods html file='c:\myhtml.htm'; ods pdf file='c:\mypdf.pdf'; a. HTML and PDF b. PDF only c. HTML, PDF, and listing d. No output is created because ODS is closed by default.

c Listing output is created by default, so these statements create HTML, PDF, and listing output.

What is the purpose of closing the Listing destination in the code shown below? ods listing close; ods html ... ; a. It conserves system resources. b. It simplifies your program. c. It makes your program compatible with other hardware platforms. d. It makes your program compatible with previous versions of SAS software.

a By default, SAS programs produce listing output. If you want only HTML output, it's a good idea to close the Listing destination before creating HTML output, because an open destination uses system resources.

When the code shown below is run, what will the file D:\Output\body.html contain? ods html body='d:\output\body.html'; proc print data=work.alpha; run; proc print data=work.beta; run; ods html close; a. The PROC PRINT output for Work.Alpha. b. The PROC PRINT output for Work.Beta. c. The PROC PRINT output for both Work.Alpha and Work.Beta. d. Nothing. No output will be written to D:\Output\body.html.

c When multiple procedures are run while HTML output is open, procedure output is appended to the same body file.

When the code shown below is run, what file will be loaded by the links in D:\Output\contents.html? ods html body='d:\output\body.html' contents='d:\output\contents.html' frame='d:\output\frame.html'; a. D:\Output\body.html b. D:\Output\contents.html c. D:\Output\frame.html d. There are no links from the file D:\Output\contents.html.

a The CONTENTS= option creates a table of contents containing links to the body file, D:\Output\body.html.

The table of contents created by the CONTENTS= option contains a numbered heading for a. each procedure. b. each procedure that creates output. c. each procedure and DATA step. d. each HTML file created by your program.

b The table of contents contains a numbered heading for each procedure that creates output.

When the code shown below is run, what will the file D:\Output\frame.html display? ods html body='d:\output\body.html' contents='d:\output\contents.html' frame='d:\output\frame.html'; a. The file D:\Output\contents.html. b. The file D:\Output\frame.html. c. The files D:\Output\contents.html and D:\Output\body.html. d. It displays no other files.

c The FRAME= option creates an HTML file that integrates the table of contents and the body file.

What is the purpose of the URL= suboptions shown below? ods html body='d:\output\body.html' (url='body.html') contents='d:\output\contents.html' (url='contents.html') frame='d:\output\frame.html'; a. To create absolute link addresses for loading the files from a server. b. To create relative link addresses for loading the files from a server. c. To allow HTML files to be loaded from a local drive. d. To send HTML output to two locations.

b Specifying the URL= suboption in the file specification provides a URL that ODS uses in the links it creates. Specifying a simple (one name) URL creates a relative link address to the file.

Which ODS HTML option was used in creating the following table? a. format=brown b. format='brown' c. style=brown d. style='brown'

c You can change the appearance of HTML output by using the STYLE= option in the ODS HTML statement. The style name doesn't need quotation marks.

What is the purpose of the PATH= option? ods html path='d:\output' (url=none) body='body.html' contents='contents.html' frame='frame.html'; a. It creates absolute link addresses for loading HTML files from a server. b. It creates relative link addresses for loading HTML files from a server. c. It allows HTML files to be loaded from a local drive. d. It specifies the location of HTML file output.

d You use the PATH= option to specify the location for HTML output. When you use the PATH= option, you don't need to specify the full pathname for the body, contents, or frame files.
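The ODS HTML cards above can be pulled together in one sketch. It assumes a Windows session in which the directory d:\output already exists (the path is taken from the questions) and uses the Sashelp.Class sample data set as stand-in input.

```sas
ods listing close;                      /* skip listing output while creating HTML */

ods html path='d:\output' (url=none)    /* location for all three files */
         body='body.html'               /* procedure output is appended here */
         contents='contents.html'       /* links to each procedure's output */
         frame='frame.html';            /* shows contents and body together */

proc print data=sashelp.class;
run;

proc means data=sashelp.class;
run;

ods html close;                         /* close the destination to finish the files */
ods listing;                            /* reopen listing output */
```

Opening frame.html in a browser should show the table of contents beside the body file, with one numbered heading per procedure that produced output.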

Which program creates the output shown below? a. data test2; infile furnture; input StockNum $ 1-3 Finish $ 5-9 Style $ 11-18 Item $ 20-24 Price 26-31; if finish='oak' then delete; retain TotPrice 100; totalprice+price; drop price; run; proc print data=test2 noobs; run; b. data test2; infile furnture; input StockNum $ 1-3 Finish $ 5-9 Style $ 11-18 Item $ 20-24 Price 26-31; if finish='oak' and price

c Program c correctly deletes the observation in which the value of Finish is oak and the value of Price is less than 200. It also creates TotalPrice by summing the variable Price down observations, then drops Price by using the DROP= data set option in the DATA statement.

How is the variable Amount labeled and formatted in the PROC PRINT output? data credit; infile creddata; input Account $ 1-5 Name $ 7-25 Type $ 27 Transact $ 29-35 Amount 37-50; label amount='Amount of Loan'; format amount dollar12.2; run; proc print data=credit label; label amount='Total Amount Loaned'; format amount comma10.; run; a. label Amount of Loan, format DOLLAR12.2 b. label Total Amount Loaned, format COMMA10. c. label Amount, default format d. The PROC PRINT step does not execute because two labels and two formats are assigned to the same variable.

b The PROC PRINT output displays the label Total Amount Loaned for the variable Amount and formats this variable using the COMMA10. format. Temporary labels or formats that are assigned in a PROC step override permanent labels or formats that are assigned in a DATA step.
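The same behavior can be reproduced without the external file. This sketch swaps the INFILE statement for a few hypothetical in-stream records; Work.Credit and its values are made up for the example.

```sas
data work.credit;                        /* hypothetical sample data */
   input Account $ Amount;
   label amount='Amount of Loan';        /* stored in the data set: permanent */
   format amount dollar12.2;
   datalines;
A1001 25000
A1002 4800.5
;
run;

proc print data=work.credit label;
   label amount='Total Amount Loaned';   /* applies to this step only */
   format amount comma10.;
run;

proc contents data=work.credit;          /* still reports the DATA step label and format */
run;
```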

Consider the IF-THEN statement shown below. When the statement is executed, which expression is evaluated first? if finlexam>=95 and (research='A' or (project='A' and present='A')) then Grade='A+'; a. finlexam>=95 b. research='A' c. project='A' and present='A' d. research='A' or (project='A' and present='A')

c Logical comparisons that are enclosed in parentheses are evaluated as true or false before they are compared to other expressions. In the example above, the AND comparison within the nested parentheses is evaluated before being compared to the OR comparison.

100

Consider the small raw data file and program shown below. What is the value of Count after the fourth record is read? data work.newnums; infile numbers; input Tens 2-3; Count+tens; run; a. missing b. 0 c. 30 d. 70

d The Sum statement adds the result of the expression that is on the right side of the plus sign to the numeric variable that is on the left side. The new value is then retained for subsequent observations. The Sum statement treats the missing value as a 0, so the value of Count in the fourth observation would be 10+20+0+40, or 70.

101

Now consider the revised program below. What is the value of Count after the third observation is read? data work.newnums; infile numbers; input Tens 2-3; retain Count 100; count+tens; run; a. missing b. 0 c. 100 d. 130

d The RETAIN statement assigns an initial value of 100 to the variable Count, so the value of Count in the third observation would be 100+10+20+0, or 130.
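Both versions of the running total can be checked side by side. The sketch below assumes the raw data values are 10, 20, missing, and 40 (as the two answers imply) and reads them in-stream with list input rather than from the Numbers fileref; Count100 is a hypothetical second accumulator added for comparison.

```sas
data work.newnums;
   input Tens;
   Count+tens;              /* Sum statement: starts at 0, retained, missing treated as 0 */
   retain Count100 100;     /* same accumulator, but with an initial value of 100 */
   Count100+tens;
   datalines;
10
20
.
40
;
run;

proc print data=work.newnums;
run;
/* Count    = 10, 30, 30, 70
   Count100 = 110, 130, 130, 170 */
```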

102

For the observation shown below, what is the result of the IF-THEN statements?

Status  Type  Count  Action  Control
ok      3     12     E       Go

if status='OK' and type=3 then Count+1; if status='S' or action='E' then Control='Stop';

a. Count = 12 Control = Go b. Count = 13 Control = Stop c. Count = 12 Control = Stop d. Count = 13 Control = Go

c You must enclose character values in quotation marks, and you must specify them in the same case in which they appear in the data set. The value ok is not identical to OK, so the value of Count is not changed by the IF-THEN statement.

103

Which of the following can determine the length of a new variable? a. the length of the variable's first value b. the assignment statement c. the LENGTH statement d. all of the above

d The length of a variable is determined by its first reference in the DATA step. When creating a new character variable, SAS allocates as many bytes of storage space as there are characters in the first value that it encounters for that variable. The first reference to a new variable can also be made with a LENGTH statement or an assignment statement. The length of the variable's first value does not matter once the variable has been referenced in your program.

104

Which set of statements is the most efficient equivalent to the code shown below?

a Answer a is the most efficient. You can write multiple ELSE statements to specify a series of mutually exclusive conditions. The ELSE statement must immediately follow the IF-THEN statement in your program. An ELSE statement executes only if the previous IF-THEN/ELSE statement is false.

105

What is the length of the variable Type, as created in the DATA step below? data finance.newloan; set finance.records; TotLoan+payment; if code='1' then Type='Fixed'; else Type='Variable'; length type $ 10; run; a. 5 b. 8 c. 10 d. It depends on the first value of Type.

a The length of a new variable is determined by the first reference in the DATA step, not by data values. In this case, the length of Type is determined by the value Fixed. The LENGTH statement is in the wrong place; it must be read before any other reference to the variable in the DATA step. The LENGTH statement cannot change the length of an existing variable.
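A corrected version of the idea, as a sketch: moving the LENGTH statement ahead of the first assignment gives Type enough room for the longer value. The data set Work.NewLoan and the in-stream Code values are hypothetical stand-ins for Finance.Records.

```sas
data work.newloan;
   length Type $ 10;                 /* first reference to Type sets its length */
   input Code $;
   if code='1' then Type='Fixed';
   else Type='Variable';             /* fits; with a length of 5 it would be stored as Varia */
   datalines;
1
2
;
run;

proc print data=work.newloan;
run;
```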

106

Which program contains an error?

b To select variables, you can use a DROP or KEEP statement in any DATA step. You can also use the DROP= or KEEP= data set options following a data set name in any DATA or PROC step. However, you cannot use DROP or KEEP statements in PROC steps.

107

If you submit the following program, which variables appear in the new data set? data work.cardiac(drop=age group); set clinic.fitness(keep=age weight group); if group=2 and age>40; run; a. none b. Weight c. Age, Group d. Age, Weight, Group

b The variables Age, Weight, and Group are specified using the KEEP= option in the SET statement. After processing, Age and Group are dropped in the DATA statement.
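The sketch below reruns the program from the question against a small in-stream data set, since the Clinic library may not be assigned. Work.Fitness and its values are hypothetical.

```sas
data work.fitness;                   /* hypothetical stand-in for Clinic.Fitness */
   input Name $ Age Weight Group;
   datalines;
Ann 45 150 2
Bob 38 180 2
Cal 52 170 1
Dee 61 140 2
;
run;

data work.cardiac(drop=age group);           /* dropped only when observations are written */
   set work.fitness(keep=age weight group);  /* Name is never read into the PDV */
   if group=2 and age>40;                    /* Age and Group are still usable here */
run;

proc print data=work.cardiac;        /* only Weight remains, for Ann and Dee */
run;
```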

108

Which of the following programs correctly reads the data set Orders and creates the data set FastOrdr?

c You specify the data set to be created in the DATA statement. The DROP= data set option prevents variables from being written to the data set. Because you use the variable OrdrTime when processing your data, you cannot drop OrdrTime in the SET statement. If you use the KEEP= option in the SET statement, then you must list OrdrTime as one of the variables to be kept.

109

Which of the following statements is false about BY-group processing? When you use the BY statement with the SET statement, a. the data sets that are listed in the SET statement must be indexed or sorted by the values of the BY variable(s). b. the DATA step automatically creates two variables, FIRST. and LAST., for each variable in the BY statement. c. FIRST. and LAST. identify the first and last observation in each BY group, in that order. d. FIRST. and LAST. are stored in the data set.

d When you use the BY statement with the SET statement, the DATA step creates the temporary variables FIRST. and LAST. They are not stored in the data set.
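One common use of FIRST. and LAST. is counting observations per BY group. A minimal sketch, assuming the Sashelp.Class sample data set; the output variable Students is a hypothetical name.

```sas
proc sort data=sashelp.class out=work.class;
   by sex;                            /* BY-group processing needs sorted or indexed input */
run;

data work.counts(keep=sex Students);
   set work.class;
   by sex;
   if first.sex then Students=0;      /* reset at the start of each BY group */
   Students+1;                        /* Sum statement retains the running count */
   if last.sex then output;           /* write one row per BY group */
run;

proc print data=work.counts;          /* FIRST.Sex and LAST.Sex do not appear */
run;
```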

110

There are 500 observations in the data set Company.USA. What is the result of submitting the following program? data work.getobs5(drop=obsnum); obsnum=5; set company.usa(keep=manager payroll) point=obsnum; stop; run; a. an error b. an empty data set c. a continuous loop d. a data set that contains one observation

b The DATA step writes observations to output at the end of the DATA step. However, in this program, the STOP statement stops processing before the end of the DATA step. An explicit OUTPUT statement is needed in order to produce observations.

111

There is no end-of-file condition when you use direct access to read data, so how can your program prevent a continuous loop? a. Do not use a POINT= variable. b. Check for an invalid value of the POINT= variable. c. Do not use an END= variable. d. Include an OUTPUT statement

b To avoid a continuous loop when using direct access, either include a STOP statement or use programming logic that checks for an invalid value of the POINT= variable. If SAS reads an invalid value of the POINT= variable, it sets the automatic variable _ERROR_ to 1. You can use this information to check for conditions that cause continuous processing.

112

Assuming that the data set Company.USA has five or more observations, what is the result of submitting the following program? data work.getobs5(drop=obsnum); obsnum=5; set company.usa(keep=manager payroll) point=obsnum; output; stop; run; a. an error b. an empty data set c. a continuous loop d. a data set that contains one observation

d By combining the POINT= option with the OUTPUT and STOP statements, your program can write a single observation to output.
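The same pattern can be tried against a data set that ships with SAS. This sketch substitutes Sashelp.Class for Company.USA and keeps the variables Name and Age instead of Manager and Payroll; those substitutions are assumptions made so the step runs anywhere.

```sas
data work.getobs5;
   obsnum=5;                                      /* observation number to read */
   set sashelp.class(keep=name age) point=obsnum; /* direct access: no end-of-file */
   output;                                        /* without this, the data set is empty */
   stop;                                          /* without this, the step loops forever */
run;

proc print data=work.getobs5;                     /* one observation; obsnum is not written */
run;
```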

113

Which of the following statements is true regarding direct access of data sets? a. You cannot specify END= with POINT=. b. You cannot specify OUTPUT with POINT=. c. You cannot specify STOP with END=. d. You cannot specify FIRST. with LAST.

a The END= option and POINT= option are incompatible in the same SET statement. Use one or the other in your program.

114

What is the result of submitting the following program? data work.addtoend; set clinic.stress2 end=last; if last; run; a. an error b. an empty data set c. a continuous loop d. a data set that contains one observation

d This program uses the END= option to name a temporary variable that contains an end-of-file marker. That variable, LAST, is set to 1 when the SET statement reads the last observation of the data set.

115

At the start of DATA step processing, during the compilation phase, variables are created in the program data vector (PDV), and observations are set to a. blank b. missing c. 0 d. there are no observations.

d At the bottom of the DATA step, the compilation phase is complete, and the descriptor portion of the new SAS data set is created. There are no observations because the DATA step has not yet executed.

116

The DATA step executes a. continuously if you use the POINT= option and the STOP statement. b. once for each variable in the output data set. c. once for each observation in the input data set. d. until it encounters an OUTPUT statement.

c The DATA step executes once for each observation in the input data set. You use the POINT= option with the STOP statement to prevent continuous looping.

117

Which program will combine Brothers.One and Brothers.Two to produce Brothers.Three?

a This is a case of one-to-one reading, which requires multiple SET statements. Notice that where same-named variables occur, the values that are read in from the second data set replace those that are read in from the first one. Also, the number of observations in the new data set is the number of observations in the smallest original data set.

118

Which program will combine Actors.Props1 and Actors.Props2 to produce Actors.Props3?

c This is a case of interleaving, which requires a list of data set names in the SET statement and one or more BY variables in the BY statement. Notice that observations in each BY group are read sequentially, in the order in which the data sets and BY variables are listed. The new data set contains all the variables from all the input data sets, as well as the total number of records from all input data sets.

119

If you submit the following program, which new data set is created?

a Concatenating appends the observations from one data set to another data set. The new data set contains the total number of records from all input data sets, so b is incorrect. All the variables from all the input data sets appear in the new data set, so c is incorrect.

120

If you concatenate the data sets below in the order shown, what is the value of Sale in observation 2 of the new data set?

a. missing b. $30,000 c. $40,000 d. you cannot concatenate these data sets

a The concatenated data sets are read sequentially, in the order in which they are listed in the SET statement. The second observation in Sales.Reps does not contain a value for Sale, so a missing value appears for this variable. (Note that if you merge the data sets, the value of Sale for the second observation is $30,000.)

121

What happens if you merge the following data sets by variable SSN? a. The values of Age in the 1st data set overwrite the values of Age in the 2nd data set. b. The values of Age in the 2nd data set overwrite the values of Age in the 1st data set. c. The DATA step fails because the two data sets contain same-named variables that have different values. d. The values of Age in the 2nd data set are set to missing.

b If you have variables with the same name in more than one input data set, then values of the same-named variable in the first data set in which it appears are overwritten by values of the same-named variable in subsequent data sets.

122

Suppose you merge data sets Health.Set1 and Health.Set2 below: Which output does the following program create? data work.merged; merge health.set1(in=in1) health.set2(in=in2); by id; if in1 and in2; run; proc print data=work.merged; run;

c The DATA step uses the IN= data set option and the subsetting IF statement to exclude unmatched observations from the output data set. So a and b, which contain unmatched observations, are incorrect.
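Since the two input data sets are not reproduced here, the sketch below uses small hypothetical stand-ins (Work.Set1 and Work.Set2, with made-up values) to show how the IN= variables filter a match-merge.

```sas
data work.set1;                 /* hypothetical: IDs 1, 2, 3 */
   input ID Age;
   datalines;
1 34
2 51
3 47
;
run;

data work.set2;                 /* hypothetical: IDs 2, 3, 4 */
   input ID Weight;
   datalines;
2 160
3 172
4 155
;
run;

data work.merged;
   merge work.set1(in=in1) work.set2(in=in2);
   by id;                       /* both inputs are already in ID order */
   if in1 and in2;              /* keep only IDs found in both data sets */
run;

proc print data=work.merged;    /* IDs 2 and 3 only */
run;
```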

123

The data sets Ensemble.Spring and Ensemble.Summer both contain a variable named Blue. How do you prevent the values of the variable Blue from being overwritten when you merge the two data sets?

d Match-merging overwrites same-named variables in the first data set with same-named variables in subsequent data sets. To prevent overwriting, rename variables by using the RENAME= data set option in the MERGE statement.
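A minimal sketch of the RENAME= approach. The data sets Work.Spring and Work.Summer, the BY variable ID, and the new names SpringBlue and SummerBlue are all hypothetical; only the shared variable name Blue comes from the question.

```sas
data work.spring;               /* hypothetical sample data */
   input ID Blue;
   datalines;
1 10
2 20
;
run;

data work.summer;               /* hypothetical sample data */
   input ID Blue;
   datalines;
1 15
2 25
;
run;

data work.both;
   merge work.spring(rename=(blue=SpringBlue))
         work.summer(rename=(blue=SummerBlue));
   by id;                       /* both values survive under distinct names */
run;

proc print data=work.both;
run;
```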

124

What happens if you submit the following program to merge Blood.Donors1 and Blood.Donors2, shown below? data work.merged; merge blood.donors1 blood.donors2; by id; run;

a. The Merged data set contains some missing values because not all observations have matching observations in the other data set. b. The Merged data set contains 8 observations. c. The DATA step produces errors. d. Values for Units in Blood.Donors2 overwrite values for Units in Blood.Donors1.

c The two input data sets are not sorted by values of the BY variable, so the DATA step produces errors and stops processing.

125

If you merge Company.Staff1 and Company.Staff2 below by ID, how many observations does the new data set contain? a. 4 b. 5 c. 6 d. 9

c In this example, the new data set contains one observation for each unique value of ID. The merged data set is shown below.

126

If you merge data sets Sales.Reps, Sales.Close, and Sales.Bonus by ID, what is the value of Bonus in the third observation in the new data set? a. $4,000 b. $3,000 c. missing d. can't tell from the information given

a In the new data set, the third observation is the second observation for ID number 2 (Kelly Windsor). The value for Bonus is retained from the previous observation because the BY variable value didn't change. The new data set is shown below.

127

Which function calculates the average of the variables Var1, Var2, Var3, and Var4? a. mean(var1,var4) b. mean(var1-var4) c. mean(of var1,var4) d. mean(of var1-var4)

d Use a variable list to specify a range of variables as the function argument. When specifying a variable list, be sure to precede the list with the word OF. If you omit the word OF, the function argument might not be interpreted as expected.
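The effect of omitting OF is easy to see with a few made-up values. In this sketch the variable values and the result names AllFour, EndsOnly, and Difference are hypothetical.

```sas
data _null_;
   var1=1; var2=2; var3=3; var4=10;
   AllFour=mean(of var1-var4);     /* variable list: averages var1 through var4 */
   EndsOnly=mean(var1,var4);       /* averages only var1 and var4 */
   Difference=mean(var1-var4);     /* no OF: averages the single value var1 minus var4 */
   put AllFour= EndsOnly= Difference=;
run;
/* Log: AllFour=4 EndsOnly=5.5 Difference=-9 */
```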

128

Within the data set Hrd.Temp, PayRate is a character variable and Hours is a numeric variable. What happens when the following program is run? data work.temp; set hrd.temp; Salary=payrate*hours; run; a. SAS converts the values of PayRate to numeric values. No message is written to the log. b. SAS converts the values of PayRate to numeric values. A message is written to the log. c. SAS converts the values of Hours to character values. No message is written to the log. d. SAS converts the values of Hours to character values. A message is written to the log.

b When this DATA step is executed, SAS automatically converts the character values of PayRate to numeric values so that the calculation can occur. Whenever data is automatically converted, a message is written to the SAS log stating that the conversion has occurred.

129

A typical value for the character variable Target is 123,456. Which statement correctly converts the values of Target to numeric values when creating the variable TargetNo? a. TargetNo=input(target,comma6.); b. TargetNo=input(target,comma7.); c. TargetNo=put(target,comma6.); d. TargetNo=put(target,comma7.);

b You explicitly convert character values to numeric values by using the INPUT function. Be sure to select an informat that can read the form of the values.

130

A typical value for the numeric variable SiteNum is 12.3. Which statement correctly converts the values of SiteNum to character values when creating the variable Location? a. Location=dept||'/'||input(sitenum,3.1); b. Location=dept||'/'||input(sitenum,4.1); c. Location=dept||'/'||put(sitenum,3.1); d. Location=dept||'/'||put(sitenum,4.1);

d You explicitly convert numeric values to character values by using the PUT function. Be sure to select a format that matches the form of the values.
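Both explicit conversions from the two preceding cards can be checked in a single step. The values 123,456 and 12.3 come from the questions; the value assigned to Dept is hypothetical.

```sas
data _null_;
   Target='123,456';
   TargetNo=input(target,comma7.);          /* character to numeric; width 7 covers the comma */

   SiteNum=12.3;
   Dept='ENG';                              /* hypothetical department code */
   Location=dept||'/'||put(sitenum,4.1);    /* numeric to character; 4 = total width including the period */

   put TargetNo= Location=;
run;
/* Log: TargetNo=123456 Location=ENG/12.3 */
```

Because both conversions are explicit, no conversion note is written to the log.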

131

Suppose the YEARCUTOFF= system option is set to 1920. Which MDY function creates the date value for January 3, 2020? a. MDY(1,3,20) b. MDY(3,1,20) c. MDY(1,3,2020) d. MDY(3,1,2020)

c Because the YEARCUTOFF= system option is set to 1920, SAS sees the two-digit year value 20 as 1920. Four-digit year values are always read correctly.
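A quick way to see the difference, sketched under the same YEARCUTOFF= setting as the question. The variable names TwoDigit and FourDigit are hypothetical, and the OPTIONS statement changes the setting for the rest of the session.

```sas
options yearcutoff=1920;            /* two-digit years fall in the window 1920-2019 */

data _null_;
   TwoDigit=mdy(1,3,20);            /* year 20 is interpreted as 1920 under this setting */
   FourDigit=mdy(1,3,2020);         /* unambiguous: January 3, 2020 */
   put TwoDigit= date9. FourDigit= date9.;
run;
```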

132

The variable Address2 contains values such as Piscataway, NJ. How do you assign the two-letter state abbreviations to a new variable named State? a. State=scan(address2,2); b. State=scan(address2,13,2); c. State=substr(address2,2); d. State=substr(address2,13,2);

a The SCAN function is used to extract words from a character value when you know the order of the words, when their position varies, and when the words are marked by some delimiter. In this case, you don't need to specify delimiters, because the blank and the comma are default delimiters.

133

The variable IDCode contains values such as 123FA and 321MB. The fourth character identifies sex. How do you assign these character codes to a new variable named Sex? a. Sex=scan(idcode,4); b. Sex=scan(idcode,4,1); c. Sex=substr(idcode,4); d. Sex=substr(idcode,4,1);

d The SUBSTR function is best used when you know the exact position of the substring to extract from the character value. You specify the position to start from and the number of characters to extract.
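For illustration, the same call applied to one of the sample values from the question:

```sas
data _null_;
   IDCode='123FA';
   Sex=substr(idcode,4,1);    /* start at position 4, take 1 character */
   put Sex=;                  /* writes Sex=F */
run;
```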

134

Due to growth within the 919 area code, the telephone exchange 555 is being reassigned to the 920 area code. The data set Clients.Piedmont includes the variable Phone, which contains telephone numbers in the form 919-555-1234. Which of the following programs will correctly change the values of Phone?

c The SUBSTR function replaces variable values if it is placed on the left side of an assignment statement. When placed on the right side (as in the previous question), the function extracts a substring.
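The answer choices for this card are not reproduced here, so the following is only a sketch of the technique, not the card's option c. Work.Piedmont, Work.Piedmont2, and the sample phone numbers are hypothetical stand-ins for Clients.Piedmont:

```sas
/* Hypothetical stand-in for Clients.Piedmont */
data work.piedmont;
   input Phone $12.;
   datalines;
919-555-1234
919-444-9876
;
run;

data work.piedmont2;
   set work.piedmont;
   Exchange=substr(phone,5,3);                        /* right side: extract */
   if exchange='555' then substr(phone,1,3)='920';    /* left side: replace */
   put Phone=;
run;
```

Only the first number should change, to 920-555-1234; the second keeps its 919 area code because its exchange is not 555.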

135

Suppose you need to create the variable FullName by concatenating the values of FirstName, which contains first names, and LastName, which contains last names. What's the best way to remove extra blanks between first names and last names?

b The TRIM function removes trailing blanks from character values. In this case, extra blanks must be removed from the values of FirstName. Although answer c also works, the extra TRIM function for the variable LastName is unnecessary. Because of the LENGTH statement, all values of FullName are padded to 40 characters.
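The answer choices are not shown on this card, so the step below is a sketch under assumed lengths (15 for each name) and made-up values. It contrasts plain concatenation with TRIM applied to FirstName only:

```sas
data _null_;
   length FirstName LastName $ 15 FullName $ 40;
   FirstName='Ana';                            /* hypothetical values */
   LastName='Ruiz';
   Padded=firstname||' '||lastname;            /* keeps the trailing blanks of FirstName */
   FullName=trim(firstname)||' '||lastname;    /* one blank between the names */
   put Padded= / FullName=;
run;
```

Trimming LastName as well would not change the stored result, because FullName is padded with blanks to its declared length of 40 either way.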

136

Within the data set Furnitur.Bookcase, the variable Finish contains values such as ash/cherry/teak/matte-black. Which of the following creates a subset of the data in which the values of Finish contain the string walnut? Make the search for the string case-insensitive.

...

137

Which statement is false regarding the use of DO loops? a. They can contain conditional clauses. b. They can generate multiple observations. c. They can be used to combine DATA and PROC steps. d. They can be used to read data.

c DO loops are DATA step statements and cannot be used in conjunction with PROC steps.

138

During each execution of the following DO loop, the value of Earned is calculated and is added to its previous value. How many times does this DO loop execute? data finance.earnings; Amount=1000; Rate=.075/12; do month=1 to 12; Earned+(amount+earned)*rate; end; run; a. 0 b. 1 c. 12 d. 13

c The number of iterations is determined by the DO statement's stop value, which in this case is 12.

139

On January 1 of each year, $5,000 is invested in an account. Complete the DATA step below to determine the value of the account after 15 years if a constant interest rate of 10% is expected. data work.invest; ... Capital+5000; capital+(capital*.10); end; run; a. do count=1 to 15; b. do count=1 to 15 by 10%; c. do count=1 to capital; d. do count=capital to (capital*.10);

a Use a DO loop to perform repetitive calculations starting at 1 and looping 15 times.

140

In the data set Work.Invest, what would be the stored value for Year? data work.invest; do year=1990 to 2004; Capital+5000; capital+(capital*.10); end; run; a. missing b. 1990 c. 2004 d. 2005

d At the end of the fifteenth iteration of the DO loop, the value for Year is incremented to 2005. Because this value exceeds the stop value, the DO loop ends. At the bottom of the DATA step, the current values are written to the data set.
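The program from the question can be checked by adding a PUT statement after the loop; the PUT statement is the only addition:

```sas
data work.invest;
   do year=1990 to 2004;
      Capital+5000;
      capital+(capital*.10);
   end;
   put year=;    /* writes year=2005: the index was incremented past the stop value */
run;
```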

141

Which of the following statements is false regarding the program shown below? data work.invest; do year=1990 to 2004; Capital+5000; capital+(capital*.10); output; end; run; a. The OUTPUT statement writes current values to the data set immediately. b. The stored value for Year is 2005. c. The OUTPUT statement overrides the automatic output at the end of the DATA step. d. The DO loop performs 15 iterations.

b The OUTPUT statement overrides the automatic output at the end of the DATA step. On the last iteration of the DO loop, the value of Year, 2004, is written to the data set.

142

How many observations will the data set Work.Earn contain? data work.earn; Value=2000; do year=1 to 20; Interest=value*.075; value+interest; output; end; run; a. 0 b. 1 c. 19 d. 20

d The number of observations is based on the number of times the OUTPUT statement executes. The new data set has 20 observations, one for each iteration of the DO loop.
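One way to confirm the count, sketched with a second step that reads only the descriptor information of Work.Earn. The NOBS= check and the variable name n are additions, not part of the question:

```sas
data work.earn;
   Value=2000;
   do year=1 to 20;
      Interest=value*.075;
      value+interest;
      output;               /* one observation per iteration */
   end;
run;

data _null_;
   if 0 then set work.earn nobs=n;    /* NOBS= is filled in at compile time */
   put n=;                            /* writes n=20 */
   stop;
run;
```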

143

Which of the following would you use to compare the result of investing $4,000 a year for five years in three different banks that compound interest monthly? Assume a fixed rate for the five-year period. a. DO WHILE statement b. nested DO loops c. DO UNTIL statement d. a DO group

b Place the monthly calculation in a DO loop within a DO loop that iterates once for each year. The DO WHILE and DO UNTIL statements are not used here because the number of required iterations is fixed. A non-iterative DO group would not be useful.
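A sketch of the nested-loop idea. The three annual rates and the data set name Work.Compare are hypothetical, and an extra outer loop over banks is added so that one observation is written per bank:

```sas
data work.compare;
   do Rate=.04, .045, .05;               /* hypothetical annual rates, one per bank */
      Capital=0;
      do Year=1 to 5;
         Capital+4000;                   /* yearly deposit */
         do Month=1 to 12;
            Interest=capital*(rate/12);  /* monthly compounding */
            capital+interest;
         end;
      end;
      output;                            /* final balance for this bank */
   end;
run;
```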

144

Which statement is false regarding DO UNTIL statements? a. The condition is evaluated at the top of the loop, before the enclosed statements are executed. b. The enclosed statements are always executed at least once. c. SAS statements in the DO loop are executed until the specified condition is true. d. The DO loop must have a closing END statement.

a The DO UNTIL condition is evaluated at the bottom of the loop, so the enclosed statements are always executed at least once.
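A small sketch of the difference, using made-up counter variables. The UNTIL condition is already true and the WHILE condition is already false when each loop is reached:

```sas
data _null_;
   x=10;
   do until(x>5);      /* checked at the bottom, so the body runs once */
      UntilCount+1;
   end;
   do while(x<=5);     /* checked at the top, so the body never runs */
      WhileCount+1;
   end;
   put UntilCount= WhileCount=;    /* writes UntilCount=1 WhileCount=0 */
run;
```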

145

Select the DO WHILE statement that would generate the same result as the program below. data work.invest; capital=100000; do until(Capital gt 500000); Year+1; capital+(capital*.10); end; run; a. do while(Capital ge 500000); b. do while(Capital=500000); c. do while(Capital le 500000); d. do while(Capital>500000);

c Because the DO WHILE loop is evaluated at the top of the loop, you specify the condition that must exist in order to execute the enclosed statements.

146

In the following program, complete the statement so that the program stops generating observations when Distance reaches 250 miles or when 10 gallons of fuel have been used. data work.go250; set perm.cars; do gallons=1 to 10 ... ; Distance=gallons*mpg; output; end; run; a. while(Distance<250) b. when(Distance>250) c. over(Distance le 250) d. until(Distance=250)

a The WHILE expression causes the DO loop to stop executing when the value of Distance becomes equal to or greater than 250.
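A sketch with a single hard-coded mpg value in place of Perm.Cars, which is not available here. The value 30 is an assumption:

```sas
data work.go250;
   mpg=30;                                     /* hypothetical stand-in for perm.cars */
   do gallons=1 to 10 while(Distance<250);     /* WHILE is checked before each iteration */
      Distance=gallons*mpg;
      output;
   end;
run;
```

Because the WHILE expression is tested at the top of each iteration, the loop should still run for gallons=9 (Distance was 240 on entry) and stop before gallons=10, so the last observation written has Distance=270.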

147

Which statement is false regarding an ARRAY statement? a. It is an executable statement. b. It can be used to create variables. c. It must contain either all numeric or all character elements. d. It must be used to define an array before the array name can be referenced.

a An ARRAY statement is not an executable statement; it merely defines an array.

148

What belongs within the braces of this ARRAY statement? array contrib{?} qtr1-qtr4; a. quarter b. quarter* c. 1-4 d. 4

d The value in braces indicates the number of elements in the array. In this case, there are four elements.

149

For the program below, select an iterative DO statement to process all elements in the contrib array. data work.contrib; array contrib{4} qtr1-qtr4; ... contrib{i}=contrib{i}*1.25; end; run; a. do i=4; b. do i=1 to 4; c. do until i=4; d. do while i le 4;

b In the DO statement, you specify the index variable that represents the values of the array elements. Then specify the start and stop positions of the array elements.
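The completed program, sketched with one made-up record supplied through DATALINES so that the step can run on its own. The INPUT, DROP, and DATALINES statements are additions:

```sas
data work.contrib;
   input qtr1-qtr4;
   array contrib{4} qtr1-qtr4;
   do i=1 to 4;
      contrib{i}=contrib{i}*1.25;    /* raise each quarter by 25% */
   end;
   drop i;
   datalines;
100 200 300 400
;
run;
```

For this record the stored values would be 125, 250, 375, and 500.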

150

What is the value of the index variable that references Jul in the statements below? array quarter{4} Jan Apr Jul Oct; do i=1 to 4; yeargoal=quarter{i}*1.2; end; a. 1 b. 2 c. 3 d. 4

c The index value represents the position of the array element. In this case, the third element is Jul.

151

Which DO statement would not process all the elements in the factors array shown below? array factors{*} age height weight bloodpr; a. do i=1 to dim(factors); b. do i=1 to dim(*); c. do i=1,2,3,4; d. do i=1 to 4;

b To process all the elements in an array, you can either specify the array dimension or use the DIM function with the array name as the argument.

152

Which statement below is false regarding the use of arrays to create variables? a. The variables are added to the program data vector during the compilation of the DATA step. b. You do not need to specify the array elements in the ARRAY statement. c. By default, all character variables are assigned a length of eight. d. Only character variables can be created.

d Either numeric or character variables can be created by an ARRAY statement.

153

For the first observation, what is the value of diff{i} at the end of the second iteration of the DO loop? array wt{*} weight1-weight10; array diff{9}; do i=1 to 9; diff{i}=wt{i+1}-wt{i}; end; a. 15 b. 10 c. 8 d. -7

a At the end of the second iteration, diff{i} resolves as follows: diff{2}=wt{2+1}-wt{2}; diff{2}=215-200, which is 15.

154

Finish the ARRAY statement below to create temporary array elements that have initial values of 9000, 9300, 9600, and 9900. array goal{4} ... ; a. _temporary_ (9000 9300 9600 9900) b. temporary (9000 9300 9600 9900) c. _temporary_ 9000 9300 9600 9900 d. (temporary) 9000 9300 9600 9900

a To create temporary array elements, specify _TEMPORARY_ after the array name and dimension. Specify an initial value for each element, separated by either blanks or commas, and enclose the values in parentheses.
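A minimal sketch that defines the temporary array and writes each element to the log. The helper variable Current is introduced only for the PUT statement:

```sas
data _null_;
   array goal{4} _temporary_ (9000 9300 9600 9900);
   do i=1 to 4;
      Current=goal{i};     /* copy the element so it can be written by name */
      put i= Current=;
   end;
run;
```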

155

Based on the ARRAY statement below, select the array reference for the array element q50. array ques{3,25} q1-q75; a. ques{q50} b. ques{1,50} c. ques{2,25} d. ques{3,0}

c This two-dimensional array would consist of three rows of 25 elements. The first row would contain q1 through q25, the second row would start with q26 and end with q50, and the third row would start with q51 and end with q75.
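The row-by-row fill order can be checked with a small sketch. The flag value and the variable Marker are made up for the illustration:

```sas
data _null_;
   array ques{3,25} q1-q75;
   q50=1;                 /* flag the element of interest */
   Marker=ques{2,25};     /* row 2, column 25 is the 50th element */
   put Marker=;           /* writes Marker=1 */
run;
```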

156

Select the ARRAY statement that defines the array in the following program. data rainwear.coat; input category high1-high3 / low1-low3; ... do i=1 to 2; do j=1 to 3; compare{i,j}=round(compare{i,j}*1.12); end; end; run; a. array compare{1,6} high1-high3 low1-low3; b. array compare{2,3} high1-high3 low1-low3; c. array compare{3,2} high1-high3 low1-low3; d. array compare{3,3} high1-high3 low1-low3;

b The nested DO loops indicate that the array is named compare and is a two-dimensional array that has two rows and three columns.

157

Which SAS statement correctly uses column input to read the values in the raw data file below in this order: Address (4th field), SquareFeet (second field), Style (first field), Bedrooms (third field)?

c Column input specifies the variable's name, followed by a dollar ($) sign if the values are character values, and the beginning and ending column locations of the raw data values.

158

Which is not an advantage of column input? a. It can be used to read character variables that contain embedded blanks. b. No placeholder is required for missing data. c. Standard as well as nonstandard data values can be read. d. Fields do not have to be separated by blanks or other delimiters.

c Column input is useful for reading standard values only.

159

Which is an example of standard numeric data? a. -34.245 b. $24,234.25 c. 1/2 d. 50%

a A standard numeric value can contain numbers, scientific notation, decimal points, and plus and minus signs. Nonstandard numeric data includes values that contain fractions or special characters such as commas, dollar signs, and percent signs.

160

Formatted input can be used to read a. standard free-format data b. standard data in fixed fields c. nonstandard data in fixed fields d. both standard and nonstandard data in fixed fields

d Formatted input can be used to read both standard and nonstandard data in fixed fields.

161

Which informat should you use to read the values in columns 1-5? a. w. b. $w. c. w.d d. COMMAw.d

b The $w. informat enables you to read character data. The w represents the field width of the data value or the total number of columns that contain the raw data field.

162

The COMMAw.d informat can be used to read which of the following values? a. 12,805 b. $177.95 c. 18 % d. all of the above

d The COMMAw.d informat strips out special characters such as commas, dollar signs, and percent signs from numeric data, and stores only numeric values in a SAS data set.

163

Which INPUT statement correctly reads the values for ModelNumber (first field) after the values for Item (second field)? Both Item and ModelNumber are character variables. a. input +7 Item $9. @1 ModelNumber $5.; b. input +6 Item $9. @1 ModelNumber $5.; c. input @7 Item $9. +1 ModelNumber $5.; d. input @7 Item $9 @1 ModelNumber 5.;

b The +6 pointer control moves the input pointer to the beginning column of Item, and the values are read. Then the @1 pointer control returns to column 1, where the values for ModelNumber are located.

164

Which INPUT statement correctly reads the numeric values for Cost (third field)? a. input @17 Cost 7.2; b. input @17 Cost 9.2.; c. input @17 Cost comma7.; d. input @17 Cost comma9.;

d The values for Cost contain dollar signs and commas, so you must use the COMMAw.d informat. Counting the numbers, dollar sign, comma, and decimal point, the field width is 9 columns. Because the data value contains decimal places, a d value is not needed.

165

Which SAS statement correctly uses formatted input to read the values in this order: Item (first field), UnitCost (second field), Quantity (third field)?

d The default location of the column pointer control is column 1, so a column pointer control is optional for reading the first field. You can use the @n or +n pointer controls to specify the beginning column of the other fields. You can use the $w. informat to read the values for Item, the COMMAw.d informat for UnitCost, and the w.d informat for Quantity.

166

Which raw data file requires the PAD option in the INFILE statement in order to correctly read the data using either column input or formatted input?

a Use the PAD option in the INFILE statement to read variable-length records that contain fixed-field data. The PAD option pads each record with blanks so that all data lines have the same length.

167

The raw data file referenced by the fileref Students contains data that is

b The raw data file contains data that is free format, meaning that the data is not arranged in columns or fixed fields.

168

Which input style should be used to read the values in the raw data file that is referenced by the fileref Students?

c List input should be used to read data that is free format because you do not need to specify the column locations of the data.

169

Which SAS program was used to create the raw data file Teamdat from the SAS data set Work.Scores?

c You can use the DSD option in the FILE statement to specify that data values containing commas should be enclosed in quotation marks.

170

Which SAS statement reads the raw data values in order and assigns them to the variables shown below? Variables: FirstName (character), LastName (character), Age (numeric), School (character), Class (numeric)

a Because the data is free format, list input is used to read the values. With list input, you simply name each variable and identify its type.

171

Which SAS statement should be used to read the raw data file that is referenced by the fileref Salesrep? a. infile salesrep; b. infile salesrep ':'; c. infile salesrep dlm; d. infile salesrep dlm=':';

d The INFILE statement identifies the location of the external data file. The DLM= option specifies the colon (:) as the delimiter that separates each field.

172

Which of the following raw data files can be read by using the MISSOVER option in the INFILE statement? Missing values are indicated with colored blocks.

a You can use the MISSOVER option in the INFILE statement to read the missing values at the end of a record. The MISSOVER option prevents SAS from moving to the next record if values are missing in the current record.

173

Which SAS program correctly reads the data in the raw data file that is referenced by the fileref Volunteer?

b The LENGTH statement extends the length of the character variable LastName so that it is large enough to accommodate the data. Variable attributes such as length are defined the first time a variable is named in a DATA step. The LENGTH statement should precede the INPUT statement so that the correct length is defined.

174

Which type of input should be used to read the values in the raw data file that is referenced by the fileref University? a. column b. formatted c. list d. modified list

d Notice that the values for School contain embedded blanks, and the values for Enrolled are nonstandard numeric values. Modified list input can be used to read the values that contain embedded blanks and nonstandard values.

175

Which SAS statement correctly reads the values for Flavor and Quantity? Make sure the length of each variable can accommodate the values that are shown. a. input Flavor & $9. Quantity : comma.; b. input Flavor & $14. Quantity : comma.; c. input Flavor : $14. Quantity & comma.; d. input Flavor $14. Quantity : comma.;

b The INPUT statement uses list input with format modifiers and informats to read the values for each variable. The ampersand (&) modifier enables you to read character values that contain single embedded blanks. The colon (:) modifier enables you to read nonstandard data values and character values that are longer than eight characters, and that contain no embedded blanks.

176

Which SAS statement correctly reads the raw data values in order and assigns them to these corresponding variables: Year (numeric), School (character), Enrolled (numeric)?

d The values for Year can be read with column, formatted, or list input. However, the values for School and Enrolled are free-format data that contain embedded blanks or nonstandard values. Therefore, these last two variables must be read with modified list input.

177

SAS date values are the number of days since which date? a. January 1, 1900 b. January 1, 1950 c. January 1, 1960 d. January 1, 1970

c A SAS date value is the number of days from January 1, 1960, to the given date.

178

A great advantage of storing dates and times as SAS numeric date and time values is that a. they can easily be edited. b. they can easily be read and understood. c. they can be used in text strings like other character values. d. they can be used in calculations like other numeric values.

d In addition to tracking time intervals, SAS date and time values can be used in calculations like other numeric values. This lets you calculate values that involve dates much more easily than in other programming languages.

179

SAS does not automatically make adjustments for daylight saving time, but it does make adjustments for: a. leap seconds b. leap years c. Julian dates d. time zones

b SAS automatically makes adjustments for leap years.

180

An input data file has date expressions in the form 10222001. Which SAS informat should you use to read these dates? a. DATE6. b. DATE8. c. MMDDYY6. d. MMDDYY8.

d The SAS informat MMDDYYw. reads dates such as 10222001, 10/22/01, or 10-22-01. In this case, the field width is eight.

181

The minimum width of the TIMEw. informat is a. 4 b. 5 c. 6 d. 7

b The minimum acceptable field width for the TIMEw. informat is five. If you specify a w value less than five, you will receive an error message in the SAS log.

182

Shown below are date and time expressions and corresponding SAS datetime informats. Which date and time expression cannot be read by the informat that is shown beside it? a. 30May2000:10:03:17.2 DATETIME20. b. 30May00 10:03:17.2 DATETIME18. c. 30May2000/10:03 DATETIME15. d. 30May2000/1003 DATETIME14.

d In the time value of a date and time expression, you must use delimiters to separate the values for hour, minutes, and seconds.

183

What is the default value of the YEARCUTOFF= system option? a. 1920 b. 1910 c. 1900 d. 1930

a The default value of YEARCUTOFF= is 1920. This enables you to read two-digit years from 00-19 as the years 2000 through 2019.

184

Suppose your input data file contains the date expression 13APR2009. The YEARCUTOFF= system option is set to 1910. SAS will read the date as a. 13APR1909 b. 13APR1920 c. 13APR2009 d. 13APR2020

c The value of the YEARCUTOFF= system option does not affect four-digit year values. Four-digit values are always read correctly.

185

Suppose the YEARCUTOFF= system option is set to 1920. An input file contains the date expression 12/08/1925, which is being read with the MMDDYY8. informat. Which date will appear in your data? a. 08DEC1920 b. 08DEC1925 c. 08DEC2019 d. 08DEC2025

c The w value of the informat MMDDYY8. is too small to read the entire value, so the last two digits of the year are truncated. The year is thus read as 19 instead of 1925. Because the YEARCUTOFF= system option is set to 1920, SAS interprets this year as 2019. To avoid such errors, be sure to specify an informat that is wide enough for your date expressions.
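A sketch that reads the same text with an eight-column and a ten-column informat. The variable names Short and Full are made up, and the OPTIONS statement changes the setting for the rest of the session:

```sas
options yearcutoff=1920;

data _null_;
   Short=input('12/08/1925',mmddyy8.);    /* sees only 12/08/19 */
   Full=input('12/08/1925',mmddyy10.);    /* sees the whole expression */
   put Short= date9. Full= date9.;
run;
```

Following the explanation on this card, Short should print as 08DEC2019 and Full as 08DEC1925.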

186

Suppose your program creates two variables from an input file. Both variables are stored as SAS date values: FirstDay records the start of a billing cycle, and LastDay records the end of that cycle. The code for calculating the total number of days in the cycle would be a. TotDays=lastday-firstday; b. TotDays=lastday-firstday+1; c. TotDays=lastday/firstday; d. You cannot use date values in calculations.

b To find the number of days spanned by two dates, subtract the first day from the last day and add one. Because SAS date values are numeric values, they can easily be used in calculations.
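A brief sketch with a made-up billing cycle (March 2024), using date constants instead of an input file. Epoch is included only to show that January 1, 1960 is stored as 0:

```sas
data _null_;
   Epoch='01JAN1960'd;
   FirstDay='01MAR2024'd;          /* hypothetical billing cycle */
   LastDay='31MAR2024'd;
   TotDays=lastday-firstday+1;     /* add one so both end dates are counted */
   put Epoch= TotDays=;            /* writes Epoch=0 TotDays=31 */
run;
```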

187

You can position the input pointer on a specific record by using a. column pointer controls. b. column specifications. c. line pointer controls. d. line hold specifiers.

c Information for one observation can be spread out over several records. You can write one INPUT statement that contains line pointer controls to specify the records from which values are read.

188

Which pointer control is used to read multiple records sequentially? a. @n b. +n c. / d. all of the above

c The forward slash (/) line pointer control is used to read multiple records sequentially. Each time a / pointer is encountered, the input pointer advances to the next line. @n and +n are column pointer controls.

189

Which pointer control can be used to read records non-sequentially? a. @n b. #n c. +n d. /

b The #n line pointer control is used to read records non-sequentially. The #n specifies the absolute number of the line to which you want to move the pointer.

190

Which SAS statement correctly reads the values for Fname, Lname, Address, City, State, and Zip in order?

a The INPUT statement uses the / line pointer control to move the input pointer forward from the first record to the second record, and from the second record to the third record. The / line pointer control only moves the input pointer forward and must be specified after the instructions for reading the values in the current record. You should place a semicolon only at the end of a complete INPUT statement.

191

Which INPUT statement correctly reads the values for ID in the fourth record, then returns to the first record to read the values for Fname and Lname?

d The first #n line pointer control enables you to read the values for ID from the fourth record. The second #n line pointer control moves back to the first record and reads the values for Fname and Lname. You can use formatted input, column input, or list input to read the values for ID.

192

How many records will be read for each iteration of the DATA step? data spring.sportswr; infile newitems; input #1 Item $ Color $ #3 @8 Price comma6. #2 Fabric $ #3 SKU $ 1-6; run; a. one b. two c. three d. four

c The first time the DATA step executes, the first three records are read, and an observation is written to the data set. During the second iteration, the next three records are read, and the second observation is written to the data set. During the third iteration, the last three records are read, and the final observation is written to the data set.
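The raw data file Newitems is not shown, so the sketch below substitutes six invented in-stream records laid out to fit the INPUT statement (SKU in columns 1-6, Price starting in column 8). Work.Sportswr is a stand-in for Spring.Sportswr:

```sas
data work.sportswr;
   input #1 Item $ Color $
         #3 @8 Price comma6.
         #2 Fabric $
         #3 SKU $ 1-6;
   datalines;
Shirt Blue
Cotton
AB1234 $19.99
Scarf Red
Wool
CD5678 $24.50
;
run;
```

Each iteration consumes three records, so these six lines should yield two observations.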

193

Which INPUT statement correctly reads the values for City, State, and Zip? a. input #3 City $ State $ Zip $; b. input #3 City & $11. State $ Zip $; c. input #3 City $11. +2 State $2. +2 Zip $5.; d. all of the above

b A combination of modified and simple list input can be used to read the values for City, State, and Zip. You need to use modified list input to read the values for City, because one of the values is longer than eight characters and contains an embedded blank. You cannot use formatted input, because the values do not begin and end in the same column in each record.

194

Which program does not read the values in the first record as a variable named Item and the values in the second record as two variables named Inventory and Type?

c The values for Item in the first record are read, then the following / or #n line pointer control advances the input pointer to the second record to read the values for Inventory and Type.

195

Which INPUT statement reads the values for Lname, Fname, Department, and Salary (in that order)?

d You can use either the / or #n line pointer control to advance the input pointer to the second line, in order to read the values for Department and Salary. The colon (:) modifier is used to read the character values that are longer than eight characters (Department) and the nonstandard data values (Salary).

196

Which raw data file poses potential problems when you are reading multiple records for each observation?

c The third raw data file does not contain the same number of records for each observation, so the output from this data set will show invalid data for the ID and salary information in the fourth line.

197

Which is true for the double trailing at sign (@@)? a. It enables the next INPUT statement to read from the current record across multiple iterations of the DATA step. b. It must be the last item that is specified in the INPUT statement. c. It is released when the input pointer moves past the end of the record. d. All of the above.

d The double trailing at sign (@@) enables the next INPUT statement to read from the current record across multiple iterations of the DATA step. It must be the last item that is specified in the INPUT statement. A record that is being held by the double trailing at sign (@@) is not released until the input pointer moves past the end of the record, or until an INPUT statement that has no line-hold specifier executes.

198

A record that is being held by a single trailing at sign (@) is automatically released when a. the input pointer moves past the end of the record. b. the next iteration of the DATA step begins. c. another INPUT statement that has a single trailing at sign (@) executes. d. another value is read from the observation.

b Unlike the double trailing at sign (@@), the single trailing at sign (@) is automatically released when control returns to the top of the DATA step for the next iteration. The trailing @ does not toggle on and off. If another INPUT statement that has a trailing @ executes, the holding effect is still on.

199

Which SAS program correctly creates a separate observation for each block of data?

c Each record in this file contains three repeating blocks of data values for Item and Variety. The INPUT statement reads a block of values for Item and Variety, and then holds the current record by using the double trailing at sign (@@). The values in the program data vector are written to the data set as the first observation. In the next iteration, the INPUT statement reads the next block of values for Item and Variety from the same record.
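The answer choices and the raw data file are not reproduced on this card, so this is only a sketch of the technique with invented records and a hypothetical data set name:

```sas
data work.items;
   input Item $ Variety $ @@;    /* hold the record across DATA step iterations */
   datalines;
apple red pear green plum purple
kiwi gold fig brown lime green
;
run;
```

Each line holds three Item/Variety pairs, so the two lines should produce six observations.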

200

Which SAS program segment reads the values for ID and holds the record for each value of Quantity, so that three observations are created for each record?

d This raw data file contains an ID field that is followed by repeating fields. The first INPUT statement reads the values for ID and uses the @ line-hold specifier to hold the current record for the next INPUT statement in the DATA step. The second INPUT statement reads the values for Quantity. When all of the repeating fields have been read, control returns to the top of the DATA step, and the record is released.
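A sketch of this pattern with invented records; the card's own answer choices are not shown, and Work.Orders is a hypothetical name:

```sas
data work.orders;
   input ID $ @;              /* read ID, hold the record */
   do i=1 to 3;
      input Quantity @;       /* read the next repeating field */
      output;                 /* one observation per Quantity */
   end;
   drop i;
   datalines;
A1 10 20 30
B2 5 15 25
;
run;
```

Three observations are written per record, six in total for these two lines.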

201

Which SAS statement repetitively executes several statements when the value of an index variable named Count ranges from 1 to 50, incremented by 5? a. do count=1 to 50 by 5; b. do while count=1 to 50 by 5; c. do count=1 to 50 + 5; d. do while (count=1 to 50 + 5);

a The iterative DO statement begins the execution of a loop based on the value of an index variable. Here, the loop executes when the value of Count ranges from 1 to 50, incremented by 5.

202

Which option below, when used in a DATA step, writes an observation to the data set after each value for Activity has been read?

a The OUTPUT statement must be included in the loop so that each time a value for Activity is read, an observation is immediately written to the data set.

203

Which SAS statement repetitively executes several statements while the value of Cholesterol is greater than 200? a. do cholesterol > 200; b. do cholesterol gt 200; c. do while (cholesterol > 200); d. do while cholesterol > 200;

c The DO WHILE statement checks for the condition that Cholesterol is greater than 200. The expression must be enclosed in parentheses. The expression is evaluated at the top of the loop, before any statements are executed. If the condition is true, the DO WHILE loop executes. If the expression is false the first time it is evaluated, then the loop never executes.

204

Which choice below is an example of a Sum statement? a. totalpay=1; b. totalpay+1; c. totalpay*1; d. totalpay by 1;

b The Sum statement adds the result of an expression to a counter variable. So the + sign is an essential part of the Sum statement. Here, the value of TotalPay is incremented by 1.
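A small sketch of a Sum statement accumulating a running total. The variable Pay, the data set name, and the three records are invented for the illustration:

```sas
data work.pay;
   input Pay;
   totalpay+pay;    /* retained across iterations, starts at 0, skips missing values */
   datalines;
100
.
250
;
run;
```

TotalPay would be 100, 100, and 350 on the three observations, since the missing value is ignored by the Sum statement.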

205

Which program creates the SAS data set Perm.Topstores from the raw data file shown below?

a. data perm.topstores; infile sales98 missover; input Store Sales : comma. @; do while (sales ne .); month + 1; output; input sales : comma. @; end; run;

b. data perm.topstores; infile sales98 missover; input Store Sales : comma. @; do while (sales ne .); Month=0; month + 1; output; input sales : comma. @; end; run;

c. data perm.topstores; infile sales98 missover; input Store Sales : comma. Month @; do while (sales ne .); month + 1; input sales : comma. @; end; output; run;

d. data perm.topstores; infile sales98 missover; input Store Sales : comma. @; Month=0; do while (sales ne .); month + 1; output; input sales : comma. @; end; run;

d The assignment statement that precedes the DO WHILE loop creates the counter variable Month and assigns an initial value of zero to it. Each time the DO WHILE loop executes, the Sum statement increments the value of Month by 1.
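The raw data file is not reproduced on this card. The sketch below runs the logic of option d against two invented in-stream records, with Work.Topstores and INFILE DATALINES standing in for the permanent library and the Sales98 fileref:

```sas
data work.topstores;
   infile datalines missover;
   input Store Sales : comma. @;
   Month=0;                         /* reset the counter for each store */
   do while (sales ne .);
      month + 1;
      output;
      input sales : comma. @;       /* MISSOVER sets Sales to missing at end of record */
   end;
   datalines;
1001 77,000 83,500 69,000
1002 91,000 88,250
;
run;
```

For these records Month should run 1 to 3 for the first store and 1 to 2 for the second, giving five observations.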

206

How many observations are produced by the DATA step that reads this external file? data perm.choices; infile icecream missover; input Day $ Flavor : $10. @; do while (flavor ne ' '); output; input flavor : $10. @; end; run; a. 3 b. 5 c. 12 d. 15

c This DATA step produces one observation for each repeating field. The MISSOVER option in the INFILE statement prevents SAS from reading the next record when missing values occur at the end of a record. Every observation contains one value for Flavor, paired with the corresponding value for Day. Because there are 12 values for Flavor, there are 12 observations in the data set.

207

When you write a DATA step to create one observation per detail record you need to a. distinguish between header and detail records. b. keep the header record as part of each observation until the next header record is encountered. c. hold the current value of each record type so that the other values in the record can be read. d. all of the above

d In order to create one observation per detail record, it is necessary to distinguish between header and detail records. Use a RETAIN statement to keep the header record as part of each observation until the next header record is encountered. You also need to use the @ line-hold specifier to hold the current value of each record type so that the other values in the record can be read.
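The three requirements in this answer can be sketched in one short DATA step. This is a minimal example under stated assumptions: the data set name work.produce, the record layout (record type in column 1, H for header and P for detail), and the data lines are all hypothetical.

```sas
/* Hypothetical layout: H = header (vegetable name), P = detail (variety, supplier). */
data work.produce (drop=Code);
   infile datalines;
   retain Vegetable;                          /* keep the header value across iterations */
   input Code $1. @;                          /* read the record type and hold the record */
   if Code='H' then input @3 Vegetable $6.;   /* header record: store the vegetable name */
   if Code='P';                               /* subsetting IF: only detail records continue */
   input @3 Variety : $10. @15 Supplier : $15.;
   datalines;
H Onion
P Walla       Acme
P Vidalia     FreshCo
H Tomato
P Roma        Sunny
;
run;

proc print data=work.produce;
run;
```

With these made-up records the step should write three observations, one per P record, each carrying the Vegetable value from the most recent H record. Code is dropped, so only Vegetable, Variety, and Supplier appear in the output.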

208

Which SAS statement reads the value for code (in the first field), and then holds the value until an INPUT statement reads the remaining value in each observation in the same iteration of the DATA step? a. input code $2. @; b. input code $2. @@; c. retain code; d. none of the above

a An INPUT statement is used to read the value for code. The single @ sign at the end of the INPUT statement holds the current record for a later INPUT statement in the same iteration of the DATA step.

209

Which SAS statement checks for the condition that Record equals C and executes a single statement to read the values for Amount? a. if record=c then input @3 Amount comma7.; b. if record='C' then input @3 Amount comma7.; c. if record='C' then do input @3 Amount comma7.; d. if record=C then do input @3 Amount comma7.;

b The IF-THEN statement defines the condition that Record equals C and executes an INPUT statement to read the values for Amount when the condition is true. C must be enclosed in quotation marks and must be specified exactly as shown because it is a character value.

210

After the value for code is read in the sixth iteration, which illustration of the program data vector is correct? data perm.produce (drop=code); infile orders; retain Vegetable; input code $1. @; if code='H' then input @3 vegetable $6.; if code='P'; input @3 Variety : $10. @15 Supplier : $15.; run; proc print data=perm.produce; run;

b The value of Vegetable is retained across iterations of the DATA step. As the sixth iteration begins, the INPUT statement reads the value for code and holds the record, so that the values for Variety and Supplier can be read with an additional INPUT statement.

211

What happens when the fourth iteration of the DATA step is complete? data perm.orders (drop=type); infile produce; retain Fruit; input type $1. @; if type='F' then input @3 fruit $7.; if type='V'; input @3 Variety : $16. @20 Price comma5.; run; a. All of the values in the program data vector are written to the data set as the third observation. b. All of the values in the program data vector are written to the data set as the fourth observation. c. The values for Fruit, Variety, and Price are written to the data set as the third observation. d. The values for Fruit, Variety, and Price are written to the data set as the fourth observation.

c This program creates one observation for each detail record. The RETAIN statement retains the value for Fruit as part of each observation until the values for Variety and Price can be read. The DROP= option in the DATA statement prevents the values for type from being written to the data set.

212

Which SAS statement indicates that several other statements should be executed when Record has a value of A? a. if record='A' then do; b. if record=A then do; c. if record='A' then; d. if record=A then;

a The IF-THEN statement defines the condition that Record equals A and specifies a simple DO group. The keyword DO indicates that several executable statements follow until the DO group is closed by an END statement. The value A must be enclosed in quotation marks and specified exactly as shown because it is a character value.

213

Which is true for the following statements (X indicates a header record)? if code='X' then do; if _n_ > 1 then output; Total=0; input Name $ 3-20; end; a. _N_ equals the number of times the DATA step has begun to execute. b. When code='X' and _n_ > 1 are true, an OUTPUT statement is executed. c. Each header record causes an observation to be written to the data set. d. a and b

d _N_ is an automatic variable whose value is the number of times the DATA step has begun to execute. The expression _n_ > 1 defines a condition where the DATA step has executed more than once. When the conditions code='X' and _n_ > 1 are true, an OUTPUT statement is executed, and Total is initialized to zero. Thus, each header record except for the first one causes an observation to be written to the data set.

214

What happens when the condition type='P' is false? if type='P' then input @3 ID $5. @9 Address $20.; else if type='V' then input @3 Charge 6.; a. The values for ID and Address are read. b. The values for Charge are read. c. Type is assigned the value of V. d. The ELSE statement is executed.

d The condition is false, so the values for ID and Address are not read. Instead, the ELSE statement is executed and defines another condition which might or might not be true.

215

What happens when last has a value other than zero? data perm.househld (drop=code); infile citydata end=last; retain Address; input code $1. @; if code='A' then do; if _n_ > 1 then output; Total=0; input address $ 3-17; end; else if code='N' then total+1; if last then output; run; a. Last has a value of 1. b. The OUTPUT statement writes the last observation to the data set. c. The current value of last is written to the data set. d. a and b

d You can determine when the current record is the last record in an external file by specifying the END= option in the INFILE statement. Last is a temporary numeric variable whose value is zero until the last line is read. Last has a value of 1 after the last line is read. Like automatic variables, the END= variable is not written to the data set.

216

Based on the values in the program data vector, what happens next? data work.supplies (drop=type amount); infile orders end=last; retain Department Extension; input type $1. @; if type='D' then do; if _n_ > 1 then output; Total=0; input @3 department $10. @16 extension $5.; end; else if type='S' then do; input @16 Amount comma5.; total+amount; if last then output; end; run; a. All the values in the program data vector are written to the data set as the first observation. b. The values for Department, Total, and Extension are written to the data set as the first observation. c. The values for Department, Total, and Extension are written to the data set as the fourth observation. d. The value of last changes to 1.

b This program creates one observation for each header record and combines information from each detail record into the summary variable, Total. When the value of type is D and the value of _N_ is greater than 1, the OUTPUT statement executes, and the values for Department, Total, and Extension are written to the data set as the first observation. The variables _N_, last, type, and Amount are not written to the data set.
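The one-observation-per-header pattern can be tried end to end with a small temporary file. This is a simplified sketch rather than the program from the question: the fileref orders, the record layout, and the amounts are hypothetical, and the Extension field is left out to keep the example short.

```sas
/* Hypothetical raw file: D = department header, S = supply detail with an amount. */
filename orders temp;

data _null_;
   file orders;
   put 'D Plumbing';
   put 'S 1,200';
   put 'S 350';
   put 'D Lighting';
   put 'S 980';
run;

data work.supplies (drop=Type Amount);
   infile orders end=Last truncover;  /* Last becomes 1 when the final record is read */
   retain Department;                 /* keep the header value across iterations */
   input Type $1. @;                  /* read the record type and hold the record */
   if Type='D' then do;
      if _n_ > 1 then output;         /* write the previous department before starting a new one */
      Total = 0;                      /* reset the summary variable */
      input @3 Department $10.;
   end;
   else if Type='S' then do;
      input @3 Amount : comma7.;
      Total + Amount;                 /* Sum statement accumulates the detail amounts */
   end;
   if Last then output;               /* write the final department */
run;

proc print data=work.supplies;
run;
```

With these made-up records the step should write two observations: Plumbing with a Total of 1550 and Lighting with a Total of 980. Here the final OUTPUT sits outside the detail branch, a choice made for this sketch so that a file ending in a header record would still be written. TRUNCOVER is included because the header lines are shorter than the $10. field.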