Setting Up Your Practice Files (REQUIRED)
To complete the activities, demos, and practices in this course, you must access SAS software and set up your practice files.
Using the Hands-On Lab
The Hands-On Lab already has the course files loaded for you. All you have to do is launch the lab and start using SAS Studio.
Instructions for using the Hands-On Lab
Using SAS OnDemand for Academics
Setup instructions for SAS OnDemand for Academics
Using your own SAS software
Start your SAS software. Click Open next to the SAS software interface you have, and follow the instructions to set up your data.
Setup Instructions for SAS Studio
Setup Instructions for SAS Enterprise Guide
Setup Instructions for SAS Windowing Environment
SAS® Programming 3: Advanced Techniques
Lesson 01, Section 1 Activity
Open p301a03.sas from the activities folder and perform the following tasks:
- View the DATA step syntax. Run the DATA step. How many rows are in the output table? What is the value of Year?
- Uncomment the OUTPUT statement. Run the DATA step.
- How many rows are in the output table?
- What is the range of values for Year?
- How many times did SAS iterate through the DATA step based on the PUTLOG statement?
- Run the PROC SGPLOT step. In what year will the predicted population of India exceed China?
SAS® Programming 3: Advanced Techniques
Lesson 02, Section 1 Activity
Go to the Programming: SAS 9.4 and SAS Viya documentation.
- Under DATA Step, select Function and CALL Routines.
- Under Dictionary of Functions and CALL Routines, select FIND Function.
- How many arguments are required for the FIND function, and what does the function return?
- How many modifiers are available for the FIND function?
SAS® Programming 3: Advanced Techniques
Lesson 02, Section 1 Activity
Open p302a02.sas from the activities folder and perform the following tasks:
- Complete the assignment statement for Open1MnthBack, which is equal to the Open value from one previous month.
- Complete the assignment statement for Open2MnthBack, which is equal to the Open value from two previous months.
- Run the program and view the results. What is the three-month average (Open3MnthAvg) for 02MAR2010?
SAS® Programming 3: Advanced Techniques
Lesson 02, Section 1 Activity
Open p302a03.sas from the activities folder and perform the following tasks:
- Run the program and verify that Open3MnthAvg is equal to missing for the first two rows. What is its value for the third row?
- Modify the conditional statement to include the calls to LAG1 and LAG2, and delete the two assignment statements containing LAG functions.
- Run the program and view the results. Is Open3MnthAvg equal to 102.90 for 02MAR2010 as in the previous activity?
SAS® Programming 3: Advanced Techniques
Lesson 02, Section 1 Activity
Go to the Programming: SAS 9.4 and SAS Viya documentation.
- Under DATA Step, select Functions and CALL Routines.
- Under Dictionary of Functions and CALL Routines, select FINDW Function.
- Find and review the modifiers for the FIND and FINDW functions.
- Does the FINDW function contain more modifiers that the FIND function, or fewer?
SAS® Programming 3: Advanced Techniques
Lesson 02, Section 1
Practice Level 1: Using the LAG Function
The pg3.np_2016traffic table contains monthly traffic counts for national parks for the year 2016. There are 12 rows for each value of ParkCode, and each row gives the traffic count for months 1 through 12. Calculate the change in traffic count between consecutive months for each park.
Reminder: If you restarted your SAS session, open and submit the libname.sas program in the course files.
- Open the p302p01.sas program in the practices folder and review the code. Which column is named in the BY statement, and what 2 variables are available as a result?
Solution:
ParkCode is named in the BY statement, making FIRST.ParkCode and LAST.ParkCode available.
- Run the program to view the 2016 traffic count data. What numeric columns are included in the report?
Solution:
The output report includes Year, Month, and TrafficCount.
- Modify the program.
- In an assignment statement, use the LAG function to create the column PrevMthTC, which is the previous value of TrafficCount.
- In another assignment statement, create the column OneMthChange, which is TrafficCount minus PrevMthTC.
Solution:
data work.ParkTraffic2016; set pg3.np_2016traffic; by ParkCode; PrevMthTC=lag1(TrafficCount); OneMthChange=TrafficCount-PrevMthTC; run; title '2016 National Park Traffic Counts'; proc print data=work.ParkTraffic2016; run;
- Run the program and view the results. What is the value of PrevMthTC for row 13, the first row for the park code ACAD? Is this correct?
Solution:
The value of PrevMthTC for row 13 is 2265.00. This is not correct. In order to calculate the change in traffic count between consecutive months for each park, the first row for each park should be set to a missing value.
- Before the OneMthChange assignment statement, add an IF/THEN statement that changes the PrevMthTC value to missing for the first occurrence of a park code.
Solution:
/* p302p01_s.sas */ data work.ParkTraffic2016; set pg3.np_2016traffic; by ParkCode; PrevMthTC=lag1(TrafficCount); if first.ParkCode=1 then PrevMthTC=.; OneMthChange=TrafficCount-PrevMthTC; run; title '2016 National Park Traffic Counts'; proc print data=work.ParkTraffic2016; run;
- Run the program and verify the results. What is the value of OneMthChange when Month is equal to 1 for ParkCode values ACAD and ALPO?
Solution:
OneMthChange has a numeric missing value in the first row for each park code.
SAS® Programming 3: Advanced Techniques
Lesson 02, Section 1
Practice Level 2: Using the COUNT and FINDW Functions
The pg3.np_grandcanyon table contains comments regarding Grand Canyon National Park. The canyon consists of the North Rim and the South Rim. Determine how many times the word South appears in each comment. Also, retrieve the word after the first occurrence of the word South in each comment.
Reminder: If you restarted your SAS session, open and submit the libname.sas program in the course files.
- Open the p302p02.sas program in the practices folder. Run the program to view the Grand Canyon comments. How many rows are in the report?
Solution:
There are 76 rows in the report.
- Modify the program.
- Use the COUNT function to create a column named NumSouth that is equal to the number of times the word South appears in each comment. Use the modifier i to ignore case.
- Subset the data to include only the rows that contain a comment with the word South.
Solution:
data work.SouthRim; set pg3.np_grandcanyon; NumSouth=count(Comments,'South','i'); if NumSouth>0; run; title 'Grand Canyon Comments Regarding South Rim'; proc print data=work.SouthRim; run; title;
- Run the program. How many rows contain a comment with the word South?
Solution:
19 rows contain a comment with the word South.
- Modify the program.
- Use the FINDW function to create a column named SouthWordPos that is equal to the word number for the first occurrence of the word South in each comment.
- Specify the space and period as the delimiters that separate words.
- Use the modifier i to ignore case and the modifier e to return the word number instead of the starting position.
- Use the SCAN function to create a column named AfterSouth that is equal to the word after the first occurrence of the word South. Use the same delimiters in the SCAN function as the FINDW function.
Solution:
/* p302p02_s.sas */ data work.SouthRim; set pg3.np_grandcanyon; NumSouth=count(Comments,'South','i'); if NumSouth>0; SouthWordPos=findw(Comments,'South',' .','ei'); AfterSouth=scan(Comments,SouthWordPos+1,' .'); run; title 'Grand Canyon Comments Regarding South Rim'; proc print data=work.SouthRim; run; title;
- Run the program and verify the results. What word occurs most often after the word South?
Solution:
The word Entrance occurs most often.
SAS® Programming 3: Advanced Techniques
Lesson 02, Section 1
Challenge Practice: Using the COUNTC Function
The pg3.np_mammals table contains the scientific and common names of mammals found in national parks. The table lists one scientific name for each mammal and one or more common names. Common names are separated by commas, forward slashes, or asterisks. Create a separate row for each common name.
Reminder: If you restarted your SAS session, open and submit the libname.sas program in the course files.
- Open the p302p03.sas program in the practices folder. Run the program to view the scientific and common names of national park mammals. How many common names are there for the species Vulpes vulpes (row 5)? How many rows are in the report?
Solution:
There are six common names for the Vulpes vulpes species, and there are 3867 rows in the report.
- Modify the program.
- Use the COUNTC function to create a column named SpecCharNum that is equal to the number of times that a special character (comma, forward slash, or asterisk) is found in the Common_Names column per each mammal.
- If SpecCharNum is equal to zero (only one common name), assign the Name column the value of Common_Names and write it to output.
- If SpecCharNum is greater than zero (multiple common names), assign a Name column with each common name from the Common_Names column and write to output.
- Use a DO loop incrementing from a start value of 1 to a stop value of SpecCharNum plus 1.
- Use the SCAN function to capture each common name.
Solution:
/* p302p03_s.sas */ data work.Mammal_Names; set pg3.np_mammals(keep=Scientific_Name Common_Names); SpecCharNum=countc(Common_Names,',/*'); if SpecCharNum=0 then do; Name=Common_Names; output; end; else do i=1 to SpecCharNum+1; Name=scan(Common_Names,i,',/*'); output; end; run; title 'National Park Mammals'; proc print data=work.Mammal_Names; run; title;
- Run the program and verify the results. How many rows were written to the new table for
the species Vulpes vulpes (which starts on row 10)? How many rows are in the final report?
Solution:
Six rows were written for the species Vulpes vulpes, one for each common nmae. The final report contains 5129 rows.
SAS® Programming 3: Advanced Techniques
Lesson 02, Section 2 Activity
Open p302a06.sas from the activities folder and perform the following tasks:
-
Run the PROC PRINT step and view the results.
-
Modify the WHERE statement in the PROC PRINT step to find all the values of Narrative that contain EF3, EF-3, EF4, or EF-4.
-
Run the PROC PRINT step. View the results and the SAS log. How many rows were read based on the WHERE statement?
SAS® Programming 3: Advanced Techniques
Lesson 02, Section 2 Activity
Open p302a07.sas from the activities folder and perform the following tasks:
- Run the program and view the results. What does the value in the Loc column represent?
- Modify the program as follows:
- Uncomment the Narrative_New assignment statement.
- Modify the first argument of the PRXCHANGE function to find the pattern of EF- and change it to the value EF.
- Modify the second argument of the PRXCHANGE function so that all occurrences of the pattern are substituted.
- Run the program. For row 7, how many EF- values were changed to EF.
SAS® Programming 3: Advanced Techniques
Lesson 02, Section 2
Level 1 Practice: Using the PRXMATCH and PRXCHANGE Functions
The pg3.np_acres table contains acreage amounts for national parks. The ParkName column contains the descriptive name for each park. Find the national preserves by locating all rows with a ParkName value that contains the string N PRES, N PRESERVE, NPRES, or NPRE followed by a space. Within the ParkName values, modify the national preserve string to be displayed with the string of NPRES.
Reminder: If you restarted your SAS session, open and submit the libname.sas program in the course files.
- Open the p302p04.sas program in the practices folder. Run the program to view the ParkName values. What is the value of ParkName in row 12?
Solution:
ANIAKCHAK N PRESERVE
- Modify the program.
- In an assignment statement, use the PRXMATCH function to create the column Position, which is equal to the starting position of a string that represents national preserves. All national preserves will contain one of the following strings followed by a space (\s): N PRES, N PRESERVE, NPRES, or NPRE.
- Add a subsetting IF statement to include only the rows where the Position values are greater than zero.
Solution:
data work.NationalPreserves; set pg3.np_acres; Position=prxmatch('/N PRES\s|N PRESERVE\s|NPRES\s|NPRE\s/',ParkName); if Position ne 0; run; title 'National Preserves (NPRE)'; proc print data=work.NationalPreserves; run; title;
- Run the program and verify the results. How many rows are returned for national preserves? What is the value of ParkName in the first row?
Solution:
Nine rows are returned. In the first row the ParkName is ANIAKCHAK N PRESERVE.
- Add an assignment statement that calls the PRXCHANGE function to change the strings N PRES, N PRESERVE, or NPRES to be NPRE. Store the changed values in a column named NewName.
- Run the program and verify that NPRE is displayed in the NewName column for all rows in the table. How many national preserves are in the state of Alaska (AK)?
Solution:
/* p302p04_s.sas */ data work.NationalPreserves; set pg3.np_acres; Position=prxmatch('/N PRES\s|N PRESERVE\s|NPRES\s|NPRE\s/',ParkName); if Position ne 0; NewName=prxchange('s/N PRES\s|N PRESERVE\s|NPRES\s/NPRE /',1,ParkName); run; title 'National Preserves (NPRE)'; proc print data=work.NationalPreserves; run; title;
Four of the national preserves are in Alaska.
SAS® Programming 3: Advanced Techniques
Lesson 02, Section 2
Level 2 Practice: Using the PRXCHANGE Function with Capture Buffers
The sashelp.baseball data set contains salary and performance information for Major League Baseball players (excluding pitchers) who played at least one game in both the 1986 and 1987 seasons. The Name column contains the player's name in the form LastName, FirstName (that is, Mattingly, Don). For each player, rearrange the order of the player's name to be in the form FirstName LastName (that is, Don Mattingly).
Reminder: If you restarted your SAS session, open and submit the libname.sas program in the course files.
- Open the p302p05.sas program in the practices folder. Run the program. What are the values of Name for rows 41 and 236?
Solution:
Row 41: Robidoux, Billy Jo
Row 236: O'Brien, Pete
- Modify the program.
- In an assignment statement, create a column named FirstLastName based on using the PRXCHANGE function to rearrange the order of the Name column. Use three sets of parentheses to create three capture buffers that represent the pattern of the Name column.
- The first set of parentheses represents the last name. The last name can contain an embedded blank (for example, Van Slyke) or a special character (for example, O'Brien).
- The second set of parentheses represents the comma and space.
- The third set of parentheses represents the first name. The first name can contain an embedded blank (that is, Billy Jo). Include a word boundary metacharacter at the end of the pattern to avoid trailing spaces on the first name.
- Use capture buffer references to rearrange the order of the capture buffers so that the player's name is in the form of FirstName LastName.
Solution:
/* p302p05_s.sas */ data work.BaseballPlayers; set sashelp.baseball(keep=Name); FirstLastName=prxchange('s/(\w+\D*\w*)(, )(\w+\s*\w*\b)/$3 $1/',-1,Name); run; title 'Names of Baseball Players'; proc print data=work.BaseballPlayers; run; title;
- Run the program and verify the results.
- What is the value of FirstLastName for row 41?
- What is the value of FirstLastName for row 236?
- What is the byte size of the FirstLastName column?
Solution:
- What is the value of FirstLastName for row 41? Billy Jo Robidoux
- What is the value of FirstLastName for row 236? Pete O'Brien
- What is the byte size of the FirstLastName column? 200 bytes
SAS® Programming 3: Advanced Techniques
Lesson 02, Section 2
Challenge Practice: Using the CALL PRXSUBSTR Routine
The pg3.np_unstructured_codes table contains a single column (Column1) whose contents include park codes and names. Extract the first occurrence of the park codes. Park codes are 3 to 10 uppercase letters.
Reminder: If you restarted your SAS session, open and submit the libname.sas program in the course files.
- Open the p302p06.sas program in the practices folder. Run the program to view the Column1 values. The park codes follow a "C:" or "Code:" prefix. What are the characteristics of the codes?
Solution:
The park codes consist of three or more uppercase letters enclosed in double quotation marks.
- Modify the program.
- Create a column named Expression that is equal to a Perl regular expression that includes at least three uppercase letters followed by zero or more uppercase letters. Be sure to end the expression with the letter o so that it is compiled only once.
- Create a column named PatternID that uses the PRXPARSE function on the Expression column to return a pattern identifier number.
- Use the CALL PRXSUBSTR routine to find the expression (PatternID) in Column1 and create a MyStart column and a MyLength column.
CALL PRXSUBSTR(pattern-identifier-number, source, position <,length>);
Note: Use the SAS documentation (SAS Functions and CALL Routines: Reference) to learn about the CALL PRXSUBSTR routine.
Solution:
data work.ParkCodes; set pg3.np_unstructured_codes; Expression='/[A-Z]{3}[A-Z]*/o'; PatternId=prxparse(Expression); call prxsubstr(PatternID,Column1,MyStart,MyLength); run; title 'Park Codes from Unstructured Column'; proc print data=work.ParkCodes; run; title;
- Run the program. What are the values of the MyStart and MyLength columns in row 1 and row 7?
Solution:
In row 1, MyStart=10 and MyLength=4. In row 7, MyStart=7 and MyLength=6.
- Use the SUBSTR function to create a column named ParkCode based on the MyStart and MyLength columns.
Solution:
/* p302p06_s.sas */ data work.ParkCodes; set pg3.np_unstructured_codes; Expression='/[A-Z]{3}[A-Z]*/o'; PatternId=prxparse(Expression); call prxsubstr(PatternID,Column1,MyStart,MyLength); ParkCode=substr(Column1,MyStart,MyLength); run; title 'Park Codes from Unstructured Column'; proc print data=work.ParkCodes; run; title;
- Run the program and verify the ParkCode values. What are the values of ParkCode in rows 1 and 7?
Solution:
In row 1, ParkCode=ADAM and in row7, ParkCode=AKEPMT.
SAS® Programming 3: Advanced Techniques
Lesson 03, Section 1 Activity
Open p303a01.sas from the activities folder and perform the following tasks:
- Replace the number signs (#) to reference the appropriate number of 2017 temperature columns.
- Modify the temperature-conversion assignment statement by replacing the ??? with the name of the column being incremented.
- Run the program and confirm that you are now seeing Celsius temperatures. What is the lowest average Celsius temperature for each city in 2017?
SAS® Programming 3: Advanced Techniques
Lesson 03, Section 1 Activity
Open p303a02.sas from the activities folder and perform the following tasks:
- Modify the ARRAY statement to use an asterisk in place of the number of elements and to reference all 2018 temperature columns that start with Temp.
- In the DO statement, replace the value of 12 with the DIM function referencing the Temperature array.
- Run the program. Based on the results, how many temperature columns are in the array or the 2018 data?
SAS® Programming 3: Advanced Techniques
Lesson 03, Section 2 Activity
Suppose we need the average precipitation values in centimeters instead of inches. Open p303a03.sas from the activities folder and perform the following tasks:
- Run the DATA step (which does not include the ARRAY syntax). How many rows of rotated data does the new table contain?
- Modify the DATA step to use ARRAY syntax to rotate the data.
- Run the DATA step which now includes the ARRAY syntax. How many rows of rotated data does the new table contain?
- Run the PROC SGPLOT step to create the desired bar chart. What is the highest average quarterly precipitation in centimeters for Dublin?
SAS® Programming 3: Advanced Techniques
Lesson 03, Section 2 Activity
Which ARRAY statements have incorrect syntax for a one-dimensional array? Select all that apply.
SAS® Programming 3: Advanced Techniques
Lesson 03, Section 2
Practice Level 1: Using One-Dimensional Arrays on Numeric Data
The pg3.eu_occ table contains monthly occupancy rates broken down by type of property (Hotel, ShortStay, and Camp) for European countries from January 2004 through September 2017. Calculate the percentage that each type of property represents of the total occupancy for each month and year by country.
Reminder: If you restarted your SAS session, open and submit the libname.sas program in the course files.
- Open the p303p01.sas program in the practices folder. Run the program to view the European occupancy data.
- Add an ARRAY statement to create an array named OccType that references the three existing columns of property type: Hotel, ShortStay, and Camp.
- Add another ARRAY statement to create an array named OccPct that creates the numeric columns HotelPct, ShortStayPct, and CampPct.
Solution:
data work.MonthlyOcc; set pg3.eu_occ(drop=Geo); OccTotal=sum(Hotel,ShortStay,Camp); array OccType[3] Hotel ShortStay Camp; array OccPct[3] HotelPct ShortStayPct CampPct; format Hotel ShortStay Camp OccTotal comma16.; run;
- Add a DO loop with the index column Num.
- Use a start value of 1 and an end value of 3.
- Within the DO loop, add an assignment statement using array references to calculate the percentage of occupancy rate. The percentage (HotelPct, ShortStayPct, and CampPct) is equal to the property type (Hotel, ShortStay, and Camp) divided by the total occupancy (OccTotal).
- Add a DROP statement to eliminate the Num column from the output table.
Solution:
data work.MonthlyOcc; set pg3.eu_occ(drop=Geo); OccTotal=sum(Hotel,ShortStay,Camp); array OccType[3] Hotel ShortStay Camp; array OccPct[3] HotelPct ShortStayPct CampPct; do Num=1 to 3; OccPct[Num]=OccType[Num]/OccTotal; end; format Hotel ShortStay Camp OccTotal comma16. drop Num; run;
- Run the program. What is the value of HotelPct for row 1?
Solution:
The value of HotelPct for row 1 is 0.79709
- Add to the FORMAT statement to format HotelPct, ShortStayPct, and CampPct using the PERCENT8.1 format.
Solution:
data work.MonthlyOcc; set pg3.eu_occ(drop=Geo); OccTotal=sum(Hotel,ShortStay,Camp); array OccType[3] Hotel ShortStay Camp; array OccPct[3] HotelPct ShortStayPct CampPct; do Num=1 to 3; OccPct[Num]=OccType[Num]/OccTotal; end; format Hotel ShortStay Camp OccTotal comma16. HotelPct ShortStayPct CampPct percent8.1; drop Num; run; title 'Percentage of Occupancy by Type'; proc print data=work.MonthlyOcc; run; title;
- Run the program. What is the value of HotelPct for row 1?
Solution:
The value of HotelPct for row 1 is 79.7%
SAS® Programming 3: Advanced Techniques
Lesson 03, Section 2
Practice Level 2: Using One-Dimensional Arrays on Character Data
The pg3.test_answers table contains employee answers (A through E) to 10 test questions. Calculate the test score for each employee by comparing their answers to the correct answers.
Reminder: If you restarted your SAS session, open and submit the libname.sas program in the course files.
- Open the p303p02.sas program in the practices folder. Run the program to view the employee's answers.
- Create an array named EmpAnswer that references the employee's answers to the 10 test questions.
- Create a temporary array named CorAnswer that references the correct answers to the 10 questions as shown below.
Question 1 2 3 4 5 6 7 8 9 10 Answer A C C B E E D B B A
Solution:
array EmpAnswer[10] Q1-Q10; array CorAnswer[10] $ 1 _temporary_ ('A','C','C','B','E','E','D','B','B','A');
- Within a DO loop, use a conditional IF/THEN statement to compare the employee answer to the correct answer for the 10 questions. If the values compare, add 1 to the column Score.
Solution:
do i=1 to 10; if EmpAnswer[i]=CorAnswer[i] then Score+1; end;
- Eliminate the index column from the output table.
Solution:
data work.TestScores; set pg3.test_answers; Score=0; array EmpAnswer[10] Q1-Q10; array CorAnswer[10] $ 1 _temporary_ ('A','C','C','B','E','E','D','B','B','A'); do i=1 to 10; if EmpAnswer[i]=CorAnswer[i] then Score+1; end; drop i; run;
- Run the program and verify the results. How many employees have a perfect score?
Solution:
Four employees have a perfect score.
data work.TestScores; set pg3.test_answers; Score=0; array EmpAnswer[10] Q1-Q10; array CorAnswer[10] $ 1 _temporary_ ('A','C','C','B','E','E','D','B','B','A'); do i=1 to 10; if EmpAnswer[i]=CorAnswer[i] then Score+1; end; drop i; run; title 'Employee Test Results'; proc print data=work.TestScores; run; title;
SAS® Programming 3: Advanced Techniques
Lesson 03, Section 2
Challenge Practice: Calculating the Differences between Array Elements
The pg3.storm_range table contains four recorded wind speeds (Wind1, Wind2, Wind3, and Wind4) for storms from the 1980 season through the 2016 season. Using arrays, calculate the difference in Wind1 and Wind2, Wind2 and Wind3, and Wind3 and Wind4.
Reminder: If you restarted your SAS session, open and submit the libname.sas program in the course files.
- Open the p303p03.sas program in the practices folder. The current DATA step is calculating the three desired wind differences. Run the program to view the wind differences.
- Modify the DATA step to calculate the wind differences using two arrays and a DO loop.
Note: An expression can be used to determine the element number in an array reference.
Solution:
data work.WindDifference; set pg3.storm_range; array Wind[4] Wind1-Wind4; array Diff[3] Diff12 Diff23 Diff34; do i=1 to 3; Diff[i]=Wind[i]-Wind[i+1]; end; run;
- Run the program and verify the results. Based on the PROC MEANS results, what is the maximum value of Diff12, Diff23, and Diff34?
Solution:
The maximum value of Diff12 is 20.
The maximum value of Diff23 is 30.
The maximum value of Diff34 is 30.
data work.WindDifference; set pg3.storm_range; array Wind[4] Wind1-Wind4; array Diff[3] Diff12 Diff23 Diff34; do i=1 to 3; Diff[i]=Wind[i]-Wind[i+1]; end; run; title 'Storm Wind Differences (first 10 rows)'; proc print data=work.WindDifference(obs=10); var Name Basin StartYear Wind1-Wind4 Diff12 Diff23 Diff34; run; title; title 'Summary of Storm Wind Differences'; proc means data=work.WindDifference maxdec=1; var Diff12 Diff23 Diff34; run; title;
SAS® Programming 3: Advanced Techniques
Lesson 03, Section 3 Activity
Given the table shown below:
4 | 7 |
2 | 10 |
16 | 5 |
- Write an array statement to create an array named Numbers, using the values in the table to initialize the array elements.
- What is the value of Numbers[3,2]?
- What is the value of Numbers5?
SAS® Programming 3: Advanced Techniques
Lesson 03, Section 3 Activity
Open p303a05.sas from the activities folder and perform the following tasks:
- Add an ARRAY statement to create a two-dimensional array.
- Name the array PMT.
- The row dimension should reference the values 2015, 2016, and 2017.
- The column dimension should reference the values 1 to 2.
- The array elements should be temporary.
- Use the following as the six initial values:
Year PrecipTotal1 PrecipTotal2 2015 2.291.042016 4.152.342017 0.902.44
- Run the program and view the results. How many dates have daily precipitation greater than 0.3 inches and greater than 20% of the monthly precipitation?
SAS® Programming 3: Advanced Techniques
Lesson 03, Section 3
Practice Level 1: Using a Two-Dimensional Array Based on Initial Values
The pg3.storm_stats table contains statistics such as MaxWindMPH for storms from seasons 1980 to 2016. For the storms in season 1980 and 1981, calculate the difference in a storm's MaxWindMPH compared to the quarterly maximum wind speed per season as shown below.
Quarter | |||||
---|---|---|---|---|---|
1 | 2 | 3 | 4 | ||
Season | 1980 | 132 | 121 | 190 | 138 |
1981 | 127 | 109 | 138 | 127 |
Reminder: If you restarted your SAS session, open and submit the libname.sas program in the course files.
- Open the p303p04.sas program in the practices folder. Run the program to view the MaxWindMPH for storms from seasons 1980 and 1981.
- Add an ARRAY statement to create a two-dimensional array named MWtable.
- The row dimension is based on two rows with a lower boundary of 1980 and an upper boundary of 1981.
- The column dimension is based on four columns corresponding to the four quarters.
- The array needs to be temporary.
- The array elements should include the initial values as shown in the table above.
Solution:
data work.MaxWind; set pg3.storm_stats; where Season between 1980 and 1981; Qtr=qtr(StartDate); array MWtable[1980:1981,4] _temporary_ (132,121,190,138, 127,109,138,127);
- Add an assignment statement to create a column named MaxWindSQ.
- The value of this column will be retrieved from the MWtable array using Season for the row dimension and Qtr for the column dimension.
- Add MaxWindSQ to the VAR statement in the PROC PRINT step. Run the program and verify the results.
Solution:
data work.MaxWind; set pg3.storm_stats; where Season between 1980 and 1981; Qtr=qtr(StartDate); array MWtable[1980:1981,4] _temporary_ (132,121,190,138, 127,109,138,127); MaxWindSQ=MWTable[Season,Qtr]; run; title 'Maximum Winds for Storms Between 1980 and 1981'; proc print data=work.MaxWind; var Season Qtr Name MaxWindMPH MaxWindSQ; run; title;
- Add an assignment statement to create a column named Difference.
- Difference is equal to MaxWindMPH minus MaxWindSQ.
- Add Difference to the VAR statement in the PROC PRINT step. Run the program and verify the results. What is the value of Difference for the 1980 storm named Lester (row 55)?
Solution:
The Difference for the storm named Lester is -150.
data work.MaxWind; set pg3.storm_stats; where Season between 1980 and 1981; Qtr=qtr(StartDate); array MWtable[1980:1981,4] _temporary_ (132,121,190,138, 127,109,138,127); MaxWindSQ=MWTable[Season,Qtr]; Difference=MaxWindMPH-MaxWindSQ; run; title 'Maximum Winds for Storms Between 1980 and 1981'; proc print data=work.MaxWind; var Season Qtr Name MaxWindMPH MaxWindSQ Difference; run; title;
SAS® Programming 3: Advanced Techniques
Lesson 03, Section 3
Practice Level 2: Using a Two-Dimensional Array Based on Loading a Table
The pg3.storm_stats table contains statistics such as MaxWindMPH for storms from seasons 1980 through 2016. The pg3.storm_maxwindseasqtr table contains quarterly maximum wind speeds (MaxWindQ1 to MaxWindQ4) for the same seasons. Calculate the difference in a storm's MaxWindMPH value compared to the quarterly maximum wind speed per season.
Reminder: If you restarted your SAS session, open and submit the libname.sas program in the course files.
- Open the p303p05.sas program in the practices folder. Run the program to view MaxWindMPH for storms from season 1980 to 2016.
- Add an ARRAY statement to create a two-dimensional array named MWtable.
- The row dimension is based on 37 rows with a lower boundary of 1980 and an upper boundary of 2016.
- The column dimension is based on four columns corresponding to the four quarters.
- The array needs to be temporary.
Solution:
data work.MaxWind; array MWTable[1980:2016,4] _temporary_;
- Load the two-dimensional array with the pg3.storm_maxwindseasqtr table.
- Load the array only during the first iteration of the DATA step.
- Use an outer DO loop to iterate through each season.
- Create a one-dimensional array referencing MaxWindQ1-MaxWindQ4.
- Use an inner DO loop to iterate through the quarters.
- Load the two-dimensional array by referencing the one-dimensional array.
Solution:
data work.MaxWind; array MWTable[1980:2016,4] _temporary_; if _N_=1 then do S=1980 to 2016; set pg3.storm_maxwindseasqtr; array MWQ[4] MaxWindQ1-MaxWindQ4; do Q=1 to 4; MWTable[S,Q]=MWQ[Q]; end; end;
- Add the following assignment statements:
- an assignment to create a column named MaxWindSQ. The value of this column will be retrieved from the MWtable array using Season for the row dimension and Qtr for the column dimension.
- an assignment statement to create a column named Difference that is equal to MaxWindMPH minus MaxWindSQ.
- Add MaxWindSQ and Difference to the VAR statement in the PROC PRINT step.
Solution:
data work.MaxWind; array MWTable[1980:2016,4] _temporary_; if _N_=1 then do S=1980 to 2016; set pg3.storm_maxwindseasqtr; array MWQ[4] MaxWindQ1-MaxWindQ4; do Q=1 to 4; MWTable[S,Q]=MWQ[Q]; end; end; set pg3.storm_stats; Qtr=qtr(StartDate); MaxWindSQ=MWTable[Season,Qtr]; Difference=MaxWindMPH-MaxWindSQ; run; title 'Maximum Winds for Storms Between 1980 and 2016'; proc print data=work.MaxWind; var Season Qtr Name MaxWindMPH MaxWindSQ Difference; run; title;
- Remove the index columns and MaxWindQ1-MaxWindQ4. Run the program and verify the results. What is the value of Difference for the 2016 storm named Zena (row 3038)?
Solution:
The value of Difference is -74.
data work.MaxWind; array MWTable[1980:2016,4] _temporary_; if _N_=1 then do S=1980 to 2016; set pg3.storm_maxwindseasqtr; array MWQ[4] MaxWindQ1-MaxWindQ4; do Q=1 to 4; MWTable[S,Q]=MWQ[Q]; end; end; set pg3.storm_stats; Qtr=qtr(StartDate); MaxWindSQ=MWTable[Season,Qtr]; Difference=MaxWindMPH-MaxWindSQ; drop S Q MaxWindQ1-MaxWindQ4; run; title 'Maximum Winds for Storms Between 1980 and 2016'; proc print data=work.MaxWind; var Season Qtr Name MaxWindMPH MaxWindSQ Difference; run; title;
SAS® Programming 3: Advanced Techniques
Lesson 03, Section 3
Challenge Practice: Using a Three-Dimensional Array
The pg3.product_warehouse table contains the warehouse location for all products. The warehouse location is per each ProductLine (values from 21 to 24), ProductCatID (values from 0 to 8), and ProductLocID (values from 0 to 9). The pg3.product_list table contains the ProductID and ProductName values for all products. ProductID consists of 12 digits. The digits in positions 1 and 2 represent ProductLine. The digits in positions 3 and 4 represent ProductCatID. The digit in position 12 represents ProductLocID. Look up the warehouse location using digits from ProductID.
Reminder: If you restarted your SAS session, open and submit the libname.sas program in the course files.
- Open the p303p06.sas program in the practices folder. Run the program to view the list of products.
- Add an ARRAY statement to create a three-dimensional array named W.
- The first dimension is based on four ProductLine values with a lower boundary of 21 and an upper boundary of 24.
- The second dimension is based on nine ProductCatID values with a lower boundary of 0 and an upper boundary of 8.
- The third dimension is based on 10 ProductLocID values with a lower boundary of 0 and an upper boundary of 9.
- The array is based on character elements with a byte size of 5.
- The array needs to be temporary.
Solution:
data work.warehouses; array W[21:24,0:8,0:9] $ 5 _temporary_;
- Load the three-dimensional array with the pg3.product_warehouse table.
- Load the array only during the first iteration of the DATA step.
- Use a DO loop to iterate through each row of the input table (116 observations).
- Load the three-dimensional array by referencing the Warehouse column.
Solution:
data work.warehouses; array W[21:24,0:8,0:9] $ 5 _temporary_; if _N_=1 then do i=1 to 116; set pg3.product_warehouse; W[ProductLine,ProductCatID,ProductLocID]=Warehouse; end;
- Add an assignment statement to create a column named WarehouseLoc.
- The value of this column will be retrieved from the W array using ProductLine for the first dimension, ProductCatID for the second dimension, and ProductLocID for the third dimension.
- Add a KEEP statement to include ProductID, ProductName, and WarehouseLoc in the output table.
Solution:
data work.warehouses; array W[21:24,0:8,0:9] $ 5 _temporary_; if _N_=1 then do i=1 to 116; set pg3.product_warehouse; W[ProductLine,ProductCatID,ProductLocID]=Warehouse; end; set pg3.product_list; ProdID=put(ProductID,12.); ProductLine=input(substr(ProdID,1,2),2.); ProductCatID=input(substr(ProdID,3,2),2.); ProductLocID=input(substr(ProdID,12,1),1.); WarehouseLoc=W[ProductLine,ProductCatID,ProductLocID]; keep ProductID ProductName WarehouseLoc; run; title 'Warehouse Location for Products'; proc print data=work.warehouses; run; title;
- Run the program and verify the results. What is the warehouse location for the product Children's Mitten (row 10)?
Solution:
The warehouse location for the Children's Mitten is A2122.
SAS® Programming 3: Advanced Techniques
Lesson 04, Section 1 Activity
Open the Programming: SAS® 9.4 and SAS® Viya® documentation.
- Under DATA Step, select Component Objects. Under Dictionary of Hash and Hash Iterator Object Language Elements, select DECLARE Statement, Hash and Hash Iterator Objects.
- How many argument_tag:values are valid for the DECLARE statement?
SAS® Programming 3: Advanced Techniques
Lesson 04, Section 2 Activity
Open p304a02.sas from the activities folder and perform the following tasks:
- Add two statements to the DATA step for the Monthly hash object:
- DEFINEKEY method referencing the keys of City and Month
- DEFINEDATA method referencing the data of TempMonAvg and PrecipMonSum
- Run the DATA step and confirm no errors in your SAS log. How many rows were read from the input table into the hash object?
SAS® Programming 3: Advanced Techniques
Lesson 04, Section 3
Practice Level 1: Performing a Table Lookup Using One Key
Create a hash object based on the pg3.np_codelookup table. This table contains national park information. Read the columns ParkCode, State, and GrossAcres from the pg3.np_acres2 table. This table contains acreage amounts for the national parks. Look up the uppercase value of ParkCode in the hash object to retrieve values of ParkName and Type.
Reminder: If you restarted your SAS session, open and submit the libname.sas program in the course files.
- Open the p304p01.sas program in the practices folder. Review the DATA step syntax. What is the name of the hash object being created? What table is loading the hash object? What is the name of the key component? What are the names of the data components?
Solution:
What is the name of the hash object being created? ParkDesc
What table is loading the hash object? pg3.np_codelookup
What is the name of the key component? ParkCode
What are the names of the data components? Parkname and Type
- Add an assignment statement to the DATA step to create the column RC, which is equal to the return code from finding the ParkCode value in the hash object.
Solution:
RC = object-name,FIND(<KEY: value>);
- Run the DATA step. View the log and the output table. Based on the log, how many rows from the pg3.np_codelookup table were read into the hash object? Based on the log, how many rows were read from the pg3.np_acres2 table? Based on the log, how many rows are in the output table? Based on the output table, how many ParkCode values are not found in the hash object (RC not equal to 0)?
Solution:
Based on the log, how many rows from the pg3.np_codelookup table were read into the hash object? 713
Based on the log, how many rows were read from the pg3.np_acres2 table? 368
Based on the log, how many rows are in the output table? 368
Based on the output table, how many ParkCode values are not found in the hash object (RC not equal to 0)? 2 (rows 10 and 30)
- Add a subsetting IF statement to output only the RC values that are equal to 0 (matches). Add a DROP statement to eliminate the RC column.
Solution:
data work.acreage; length ParkCode $ 4 ParkName $ 115 Type $ 28; if _N_=1 then do; declare hash ParkDesc(dataset:'pg3.np_codelookup'); ParkDesc.definekey('ParkCode'); ParkDesc.definedata('ParkName','Type'); ParkDesc.definedone(); call missing(ParkCode,ParkName,Type); end; set pg3.np_acres2; ParkCode=upcase(ParkCode); RC=ParkDesc.find(key:ParkCode); *RC=ParkDesc.find(); if RC=0; drop RC; run; title 'Gross Acres for National Parks'; proc print data=work.acreage; run; title;
- Run the program and verify the results. How many data rows are in the results?
Solution:
366
SAS® Programming 3: Advanced Techniques
Lesson 04, Section 3
Practice Level 2: Performing a Table Lookup Using Three Keys
Create a hash object based on the pg3.storm_range table. This table contains four wind measurements for each combination of StartYear, Name, and Basin. Read the columns from the pg3.storm_summary_cat345 table. This table contains information such as MaxWindMPH and MinPressure for each combination of StartDate, Name, and Basin for category 3, 4, and 5 storms. Look up the appropriate values in the hash object to retrieve the four wind measurements.
Reminder: If you restarted your SAS session, open and submit the libname.sas program in the course files.
- Open the p304p02.sas program in the practices folder. Add statements to the DATA step to create a hash object named Storm.
- Add a DECLARE statement to load the hash object Storm with the table pg3.storm_range.
- Use the DEFINEKEY method to specify the key components of StartYear, Name, and Basin.
- Use the DEFINEDATA method to specify the data components of Wind1, Wind2, Wind3, and Wind4.
- Use the DEFINEDONE method to complete the hash object.
Solution:
data work.storm_cat345_facts; if _N_=1 then do; if 0 then set pg3.storm_range; declare hash Storm(dataset:'pg3.storm_range'); Storm.definekey('StartYear','Name','Basin'); Storm.definedata('Wind1','Wind2','Wind3','Wind4'); Storm.definedone(); end; set pg3.storm_summary_cat345; run;
- Add an assignment statement to create the column ReturnCode, which is equal to the return code from finding the key values in the hash object. You will need to use the YEAR function on StartDate for the first key value followed by the key values of Name and Basin.
Note: Key values must be specified in the FIND method in the same order as specified with the DEFINEKEY method.
Solution:
data work.storm_cat345_facts; if _N_=1 then do; if 0 then set pg3.storm_range; declare hash Storm(dataset:'pg3.storm_range'); Storm.definekey('StartYear','Name','Basin'); Storm.definedata('Wind1','Wind2','Wind3','Wind4'); Storm.definedone(); end; set pg3.storm_summary_cat345; ReturnCode=Storm.find(key:year(StartDate),key:Name,key:Basin); run;
- Run the DATA step. View the log and the results. How many rows from the pg3. storm_range table were read into the hash object? How many rows were read from the pg3. storm_summary_cat345 table? How many rows are in the output table? How many key values are not found in the hash object (ReturnCode not equal to 0)? Why is the StartYear column set to missing values?
Solution:
How many rows from the pg3. storm_range table were read into the hash object? 2959
How many rows were read from the pg3. storm_summary_cat345 table? 570
How many rows are in the output table? 570
How many key values are not found in the hash object (ReturnCode not equal to 0)? Two (rows 505 and 539)
Why is the StartYear column set to missing values? StartYear is a key component. Key components are not copied to the PDV. Only data components are copied to the PDV. Basin and Name are also key components, but those values are not missing because the values are read from pg3.storm_summary_cat345.
- Modify the assignment statement to be a subsetting IF statement to output only the FIND values that are equal to 0 (matches). Drop the StartYear column.
Solution:
data work.storm_cat345_facts; if _N_=1 then do; if 0 then set pg3.storm_range; declare hash Storm(dataset:'pg3.storm_range'); Storm.definekey('StartYear','Name','Basin'); Storm.definedata('Wind1','Wind2','Wind3','Wind4'); Storm.definedone(); end; set pg3.storm_summary_cat345; if Storm.find(key:year(StartDate),key:Name,key:Basin)=0; drop StartYear; run; title 'Storm Statistics for Category 3, 4, and 5'; proc print data=work.storm_cat345_facts; run; title;
- Run the program and verify the results. How many data rows are in the results?
Solution:
568
SAS® Programming 3: Advanced Techniques
Lesson 04, Section 3
Challenge Practice: Performing a Table Lookup with Two Hash Objects and a Table
Create a hash object based on the pg3.storm_range table. This table contains four wind measurements for each combination of StartYear, Name, and Basin. Create another hash object based on the pg3.storm_basincodes table. This table contains the BasinName value for each value of Basin. Read the columns from the pg3.storm_summary_cat345 table. This table contains information such as MaxWindMPH and MinPressure for each combination of StartDate, Name, and Basin for category 3, 4, and 5 storms. Look up the appropriate values in the hash objects to retrieve the four wind measurements and the basin name.
Reminder: If you restarted your SAS session, open and submit the libname.sas program in the course files.
- Open the p304p03.sas program in the practices folder. Review the existing DATA step syntax that is creating the Storm hash object and finding values in the hash object.
Declare and define an additional hash object.- Name the hash object BasinDesc and load the hash object with the table pg3.storm_basincodes.
- Specify a key component of Basin and a data component of BasinName.
- In the IF 0 THEN DO block, include a SET statement for the pg3.storm_basincodes table.
Solution:
data work.storm_cat345_facts work.nonmatches; if _N_=1 then do; if 0 then do; set pg3.storm_range; set pg3.storm_basincodes; end; declare hash Storm(dataset:'pg3.storm_range'); Storm.definekey('StartYear','Name','Basin'); Storm.definedata('Wind1','Wind2','Wind3','Wind4'); Storm.definedone(); declare hash BasinDesc(dataset:'pg3.storm_basincodes'); BasinDesc.definekey('Basin'); BasinDesc.definedata('BasinName'); BasinDesc.definedone(); end;
- After the ReturnCode1 assignment statement, add an assignment statement to create a ReturnCode2 column. This column is equal to the return code from finding the Basin values in the BasinDesc hash object. Add conditional statements to write the output to the work.storm_cat345_facts table if both return codes are equal to zero. Otherwise, write the output to the work.nonmatches table.
Solution:
data work.storm_cat345_facts work.nonmatches; if _N_=1 then do; if 0 then do; set pg3.storm_range; set pg3.storm_basincodes; end; declare hash Storm(dataset:'pg3.storm_range'); Storm.definekey('StartYear','Name','Basin'); Storm.definedata('Wind1','Wind2','Wind3','Wind4'); Storm.definedone(); declare hash BasinDesc(dataset:'pg3.storm_basincodes'); BasinDesc.definekey('Basin'); BasinDesc.definedata('BasinName'); BasinDesc.definedone(); end; set pg3.storm_summary_cat345; ReturnCode1=Storm.find(key:year(StartDate),key:Name,key:Basin); ReturnCode2=BasinDesc.find(key:Basin); if ReturnCode1=0 and ReturnCode2=0 then output work.storm_cat345_facts; else output work.nonmatches; drop StartYear; run; title 'Storm Statistics with Basin Names for Category 3, 4, and 5'; proc print data=work.storm_cat345_facts; run; title; title 'Non-Matches'; proc print data=work.nonmatches; run; title;
- Run the program and verify the results. How many rows are in the work.nonmatches table?
Solution:
11 (9 non-matches for Basin and 2 non-matches for StartYear, Name, and Basin)
SAS® Programming 3: Advanced Techniques
Lesson 04, Section 4 Activity
Open p304a03.sas from the activities folder and perform the following tasks:
- Run the program and view the duplicate key errors in the SAS log.
- Add the MULTIDATA: 'YES' argument to the DECLARE statement for the CapitalPopSort has object.
- Run the program. View the work.CapitalPopSort table. Are there duplicate PctPop values?
SAS® Programming 3: Advanced Techniques
Lesson 04, Section 4
Practice Level 1: Creating a Sorted Table from a Hash Object
The solution to practice 1 involves looking up the values of ParkCode in a hash object to retrieve values of ParkName and Type. The output table work.acreage contains national park information in the default sorted order of ParkCode. Modify the starter program to create an additional output table, work.acreage_sort, which contains the same data as work.acreage but in sorted order by descending GrossAcres.
Reminder: If you restarted your SAS session, open and submit the libname.sas program in the course files.
- Open the p304p04.sas program in the practices folder. This program contains the solution to practice 1. Review the DATA step syntax. Run the program and view the results. Declare and define a hash object named Acreage. Specify an ORDERED argument of DESCENDING and a MULTIDATA argument of YES. Define a key component of GrossAcres and data components of ParkCode, ParkName, Type, State, and GrossAcres.
Solution:
data work.acreage; length ParkCode $ 4 ParkName $ 115 Type $ 28; if _N_=1 then do; declare hash ParkDesc(dataset:'pg3.np_codelookup'); ParkDesc.definekey('ParkCode'); ParkDesc.definedata('ParkName','Type'); ParkDesc.definedone(); call missing(ParkCode,ParkName,Type); declare hash Acreage(ordered:'descending', multidata:'yes'); Acreage.definekey('GrossAcres'); Acreage.definedata('ParkCode','ParkName','Type', 'State','GrossAcres'); Acreage.definedone(); end; set pg3.np_acres2; ParkCode=upcase(ParkCode); RC=ParkDesc.find(key:ParkCode); if RC=0; drop RC; run; title 'Gross Acres for National Parks Sorted by ParkCode'; proc print data=work.acreage; run; title;
- Change the subsetting IF statement for RC to be a conditional IF-THEN statement. If RC is equal to 0, then add the data to the Acreage hash object. Add an IF-THEN statement before the DROP statement to output the hash object Acreage to a table named work.acreage_sort if the last row has been read from the input table. Add an END= option to the SET statement to create a column named Last.
Solution:
data work.acreage; length ParkCode $ 4 ParkName $ 115 Type $ 28; if _N_=1 then do; declare hash ParkDesc(dataset:'pg3.np_codelookup'); ParkDesc.definekey('ParkCode'); ParkDesc.definedata('ParkName','Type'); ParkDesc.definedone(); call missing(ParkCode,ParkName,Type); declare hash Acreage(ordered:'descending', multidata:'yes'); Acreage.definekey('GrossAcres'); Acreage.definedata('ParkCode','ParkName','Type', 'State','GrossAcres'); Acreage.definedone(); end; set pg3.np_acres2 end=Last; ParkCode=upcase(ParkCode); RC=ParkDesc.find(key:ParkCode); if RC=0 then Acreage.add(); if Last=1 then Acreage.output(dataset:'work.acreage_sort'); drop RC; run; title 'Gross Acres for National Parks Sorted by ParkCode'; proc print data=work.acreage; run; title;
- Add a PROC PRINT step for the table work.acreage_sort. What is the smallest value of GrossAcres?
Solution:
0.02 acres
data work.acreage; length ParkCode $ 4 ParkName $ 115 Type $ 28; if _N_=1 then do; declare hash ParkDesc(dataset:'pg3.np_codelookup'); ParkDesc.definekey('ParkCode'); ParkDesc.definedata('ParkName','Type'); ParkDesc.definedone(); call missing(ParkCode,ParkName,Type); declare hash Acreage(ordered:'descending', multidata:'yes'); Acreage.definekey('GrossAcres'); Acreage.definedata('ParkCode','ParkName','Type', 'State','GrossAcres'); Acreage.definedone(); end; set pg3.np_acres2 end=Last; ParkCode=upcase(ParkCode); RC=ParkDesc.find(key:ParkCode); if RC=0 then Acreage.add(); if Last=1 then Acreage.output(dataset:'work.acreage_sort'); drop RC; run; title 'Gross Acres for National Parks Sorted by ParkCode'; proc print data=work.acreage; run; title; title 'Gross Acres for National Parks Sorted by Gross Acres'; proc print data=work.acreage_sort; run; title;
SAS® Programming 3: Advanced Techniques
Lesson 04, Section 4
Practice Level 2: Eliminating a PROC SORT step by Creating a Sorted Table from a Hash Object
The solution to practice 2 involves looking up the values of StartYear, Name, and Basin to retrieve four wind measurement values. The final output table, work.storm_cat345_facts, contains storm information in default sorted order of Season and Name. Modify the starter program to create an additional output table, work.cat345_sort, that contains the data in sorted order by descending MaxWindMPH, Season, and Name.
Reminder: If you restarted your SAS session, open and submit the libname.sas program in the course files.
- Open the p304p05.sas program in the practices folder. Run the DATA step and verify that the output table is sorted by ascending Season and Name. Run the PROC SORT and PROC PRINT steps. Verify that the results are sorted by descending MaxWindMPH, Season, and Name.
- In the DATA step, declare and define a hash object named StormSort. Specify an ORDERED argument of DESCENDING and a MULTIDATA argument of YES.
- Define key components for the StormSort hash object based on the columns specified in the BY statement of the PROC SORT step.
- Define data components for the StormSort hash object based on the columns specified in the KEEP= option in the PROC SORT statement.
Solution:
data work.storm_cat345_facts; if _N_=1 then do; if 0 then set pg3.storm_range; declare hash Storm(dataset:'pg3.storm_range'); Storm.definekey('StartYear','Name','Basin'); Storm.definedata('Wind1','Wind2','Wind3','Wind4'); Storm.definedone(); declare hash StormSort(ordered:'descending', multidata:'yes'); StormSort.definekey('MaxWindMPH','Season','Name'); StormSort.definedata('Season','Name','Wind1','Wind2', 'Wind3','Wind4','MaxWindMPH'); StormSort.definedone(); end; set pg3.storm_summary_cat345; if Storm.find(key:year(StartDate),key:Name,key:Basin)=0; keep Name Basin Wind1-Wind4 Season MaxWindMPH StartDate; run; proc sort data=work.storm_cat345_facts out=work.cat345_sort (keep=Season Name Wind1-Wind4 MaxWindMPH); by descending MaxWindMPH descending Season descending Name; run;
- Change the subsetting IF statement to be a conditional IF-THEN statement. If the FIND method is equal to 0, then add the data to the StormSort hash object. Add an END= option to the SET statement to create a column named Last. Add an IF-THEN statement before the KEEP statement to output the hash object StormSort to a table named work.cat345_sort if the last row has been read from the input table.
Solution:
data work.storm_cat345_facts; if _N_=1 then do; if 0 then set pg3.storm_range; declare hash Storm(dataset:'pg3.storm_range'); Storm.definekey('StartYear','Name','Basin'); Storm.definedata('Wind1','Wind2','Wind3','Wind4'); Storm.definedone(); declare hash StormSort(ordered:'descending', multidata:'yes'); StormSort.definekey('MaxWindMPH','Season','Name'); StormSort.definedata('Season','Name','Wind1','Wind2', 'Wind3','Wind4','MaxWindMPH'); StormSort.definedone(); end; set pg3.storm_summary_cat345 end=Last; if Storm.find(key:year(StartDate),key:Name,key:Basin)=0 then StormSort.add(); if Last=1 then StormSort.output(dataset:'work.cat345_sort'); keep Name Basin Wind1-Wind4 Season MaxWindMPH StartDate; run; title1 'Storm Statistics for Category 3, 4, and 5'; title2 'sorted by descending (MaxWindMPH, Season, and Name)'; proc print data=work.cat345_sort; run; title; proc sort data=work.storm_cat345_facts out=work.cat345_sort (keep=Season Name Wind1-Wind4 MaxWindMPH); by descending MaxWindMPH descending Season descending Name; run;
- Delete the PROC SORT step. Run the program and verify the PROC PRINT results. How many storms have a maximum wind speed of 173 mph?
Solution:
six (rows 9-14)
data work.storm_cat345_facts; if _N_=1 then do; if 0 then set pg3.storm_range; declare hash Storm(dataset:'pg3.storm_range'); Storm.definekey('StartYear','Name','Basin'); Storm.definedata('Wind1','Wind2','Wind3','Wind4'); Storm.definedone(); declare hash StormSort(ordered:'descending', multidata:'yes'); StormSort.definekey('MaxWindMPH','Season','Name'); StormSort.definedata('Season','Name','Wind1','Wind2', 'Wind3','Wind4','MaxWindMPH'); StormSort.definedone(); end; set pg3.storm_summary_cat345 end=Last; if Storm.find(key:year(StartDate),key:Name,key:Basin)=0 then StormSort.add(); if Last=1 then StormSort.output(dataset:'work.cat345_sort'); keep Name Basin Wind1-Wind4 Season MaxWindMPH StartDate; run; title1 'Storm Statistics for Category 3, 4, and 5'; title2 'sorted by descending (MaxWindMPH, Season, and Name)'; proc print data=work.cat345_sort; run; title;
SAS® Programming 3: Advanced Techniques
Lesson 04, Section 4
Challenge Practice: Ignoring Duplicate Keys when Loading a Hash Object
The pg3.storm_final table contains statistics such as MaxWindMPH for the storms from seasons 1980 to 2017. Load the table into a hash object to find the unique combinations of Season and MaxWindMPH. Use the data from the hash object in other hash objects to determine the lowest and highest values of MaxWindMPH per season.
Reminder: If you restarted your SAS session, open and submit the libname.sas program in the course files.
- Open the p304p06.sas program in the practices folder. Review the DATA step syntax, specifically the syntax for the SeasWind hash. Run the DATA step and view the work.SeasonWind table. This table contains the unique combinations of the key components (Season and MaxWindMPH). The default is to store the first unique combination of keys and ignore all subsequent duplicate combinations.
- Uncomment the syntax block that is creating the Seas1 hash object. Review the syntax. Run the DATA step and view the work.SeasonWindLow table. This table contains the first row of each unique Season value, which contains the lowest MaxWindMPH.
Solution:
data _null_; if 0 then set pg3.storm_final(keep=Season MaxWindMPH); declare hash SeasWind(dataset:'pg3.storm_final (where=(MaxWindMPH ne .))', ordered:'ascending'); SeasWind.definekey('Season','MaxWindMPH'); SeasWind.definedone(); *output each unique combination of Season and MaxWindMPH; SeasWind.output(dataset:'work.SeasonWind'); declare hash Seas1(dataset:'work.SeasonWind', ordered:'ascending'); Seas1.definekey('Season'); Seas1.definedata('Season','MaxWindMPH'); Seas1.definedone(); *output the lowest MaxWindMPH per Season; Seas1.output(dataset:'work.SeasonWindLow'); /* declare hash Seas2(dataset:'work.SeasonWind', ordered:'ascending'); Seas2.definekey('Season'); Seas2.definedata('Season','MaxWindMPH'); Seas2.definedone(); *output the highest MaxWindMPH per Season; Seas2.output(dataset:'work.SeasonWindHigh'); */ stop; run;
- Uncomment the syntax block that is creating the Seas2 hash object. Review the syntax. Add an argument to the DECLARE statement that will store the last duplicate key row instead of the duplicate key row. Note: Use the SAS documentation (SAS® Component Objects: Reference) to determine the needed DECLARE statement argument.
Solution:
data _null_; if 0 then set pg3.storm_final(keep=Season MaxWindMPH); declare hash SeasWind(dataset:'pg3.storm_final (where=(MaxWindMPH ne .))', ordered:'ascending'); SeasWind.definekey('Season','MaxWindMPH'); SeasWind.definedone(); *output each unique combination of Season and MaxWindMPH; SeasWind.output(dataset:'work.SeasonWind'); declare hash Seas1(dataset:'work.SeasonWind', ordered:'ascending'); Seas1.definekey('Season'); Seas1.definedata('Season','MaxWindMPH'); Seas1.definedone(); *output the lowest MaxWindMPH per Season; Seas1.output(dataset:'work.SeasonWindLow'); declare hash Seas2(dataset:'work.SeasonWind', ordered:'ascending', duplicate:'r'); Seas2.definekey('Season'); Seas2.definedata('Season','MaxWindMPH'); Seas2.definedone(); *output the highest MaxWindMPH per Season; Seas2.output(dataset:'work.SeasonWindHigh'); stop; run;
- Run the DATA step and view the work.SeasonWindHigh table. This table contains the last row of each unique Season value, which will contain the highest MaxWindMPH. What is the highest MaxWindMPH for season 2017?
Solution:
185
SAS® Programming 3: Advanced Techniques
Lesson 04, Section 5
Practice Level 1: Reading Data in Forward and Reverse Direction
The pg3.np_acres table contains acreage amounts for national parks. Use a hash iterator to create the following two tables:
- work.LowAcres, which contains the 10 parks with the lowest number of acres
- work.HighAcres, which contains the 10 parks with the highest number of acres
Reminder: If you restarted your SAS session, open and submit the libname.sas program in the course files.
- Open the p304p07.sas program in the practices folder. Review the DATA step syntax that is creating the Acres hash object. Add a DECLARE statement to create the hash iterator named A, which is associated with the Acres hash object.
Solution:
data work.LowAcres work.HighAcres; if _N_=1 then do; if 0 then set pg3.np_acres(keep=ParkName GrossAcres); declare hash Acres(dataset:'pg3.np_acres', ordered:'ascending', multidata:'yes'); Acres.definekey('GrossAcres'); Acres.definedata('ParkName','GrossAcres'); Acres.definedone(); declare hiter A('Acres'); end; do i=1 to 10; /* retrieve parks with the lowest number of acres */ end; do i=1 to 10; /* retrieve parks with the highest number of acres */ end; drop i; run;
- In the first DO loop, read the first 10 rows of the hash iterator, which will be the national parks with the lowest number of acres. Output each row to work.LowAcres. In the last DO loop, read the last 10 rows of the hash iterator, which will be the national parks with the highest number of acres. Output each row to work.HighAcres.
Solution:
data work.LowAcres work.HighAcres; if _N_=1 then do; if 0 then set pg3.np_acres(keep=ParkName GrossAcres); declare hash Acres(dataset:'pg3.np_acres', ordered:'ascending', multidata:'yes'); Acres.definekey('GrossAcres'); Acres.definedata('ParkName','GrossAcres'); Acres.definedone(); /* declare a hash iterator */ declare hiter A('Acres'); end; do i=1 to 10; /* retrieve parks with the lowest number of acres */ if i=1 then A.first(); else A.next(); output work.LowAcres; end; do i=1 to 10; /* retrieve parks with the highest number of acres */ if i=1 then A.last(); else A.prev(); output work.HighAcres; end; drop i; run; title 'National Parks with Lowest Acreage'; proc print data=work.LowAcres; run; title; title 'National Parks with Highest Acreage'; proc print data=work.HighAcres; run; title;
- Run the program and verify the results. What is the GrossAcres value for the national park with the 10th highest acreage?
Solution:
3,281,789.43
SAS® Programming 3: Advanced Techniques
Lesson 04, Section 5
Practice Level 2: Reading Data in Forward and Reverse Directions
The pg3.storm_final table contains storm statistics such as MaxWindMPH for seasons 1980 through 2017. Use a hash iterator to create the following two tables:
- work.LowWind, which contains the five storms with the lowest maximum wind speeds
- work.HighWind, which contains the five storms with the highest maximum wind speeds
Reminder: If you restarted your SAS session, open and submit the libname.sas program in the course files.
- Open the p304p08.sas program in the practices folder. Review the DATA step syntax that is defining the Storm hash object.
- Add a DECLARE statement to declare the hash object named Storm. Load the hash object with the pg3.storm_final table, excluding the MaxWindMPH values that are missing. Specify the ORDERED argument with a value of ASCENDING and the MULTIDATA argument with a value of YES. Add another DECLARE statement to create the hash iterator names Stm, which is associated with the Storm hash object.
Solution:
data work.LowWind work.HighWind; if _N_=1 then do; if 0 then set pg3.storm_final (keep=Season Name BasinName MaxWindMPH); declare hash Storm (dataset:'pg3.storm_final(where=(MaxWindMPH ne .))', ordered:'ascending', multidata:'yes'); Storm.definekey('MaxWindMPH'); Storm.definedata('Season','Name','BasinName','MaxWindMPH'); Storm.definedone(); declare hiter Stm('Storm'); end; do i=1 to 5; /* retrieve storms with the lowest maximum winds */ end; do i=1 to 5; end; drop i; run;
- In the first DO loop, read the first five rows of the hash iterator, which will be the storms with the lowest maximum winds. Output each row to work.LowWind. In the last DO loop, read the last five rows of the hash iterator, which will be the storms with the highest maximum winds. Output each row to work.HighWind.
Solution:
data work.LowWind work.HighWind; if _N_=1 then do; if 0 then set pg3.storm_final (keep=Season Name BasinName MaxWindMPH); declare hash Storm (dataset:'pg3.storm_final(where=(MaxWindMPH ne .))', ordered:'ascending', multidata:'yes'); Storm.definekey('MaxWindMPH'); Storm.definedata('Season','Name','BasinName','MaxWindMPH'); Storm.definedone(); declare hiter Stm('Storm'); end; do i=1 to 5; if i=1 then Stm.first(); else Stm.next(); output work.LowWind; end; do i=1 to 5; if i=1 then Stm.last(); else Stm.prev(); output work.HighWind; end; drop i; run; title 'Storms with Lowest Maximum Winds'; proc print data=work.LowWind; run; title; title 'Storms with Highest Maximum Winds'; proc print data=work.HighWind; run; title;
- Run the program and verify the results. Based on the SAS log, how many observations were read from the data set and loaded into the hash object?
Solution:
3071
SAS® Programming 3: Advanced Techniques
Lesson 04, Section 5
Challenge Practice: Reading Data in a Forward Direction Based on a Starting Key
The pg3.storm_final table contains storm statistics such as MaxWindMPH for seasons 1980 through 2017. Use a hash iterator to find the category 5 storm with the lowest maximum wind speed. Then output the 20 category 5 storms with the lowest maximum wind speeds. Category 5 storms have winds greater than or equal to 157 mph.
Reminder: If you restarted your SAS session, open and submit the libname.sas program in the course files.
- Open the p304p09.sas program in the practices folder. Review the DATA step syntax that is defining the Storm hash object and the Stm hash iterator object.
- Within the DO UNTIL loop, add the following two statements:
- Add an assignment statement that creates the column RC based on whether the Cat5Speed value is found in the hash iterator. If the value is found, a zero value is returned. Note: Use the SAS documentation (SAS® Component Objects: Reference) to learn the syntax that is needed for the SETCUR method, which specifies a starting key value for a hash iterator.
- Add a sum statement that increments the Cat5Speed column by a value of 1.
Solution:
data work.category5; if _N_=1 then do; if 0 then set pg3.storm_final (keep=Season Name BasinName MaxWindMPH); declare hash Storm(dataset:'pg3.storm_final', ordered:'ascending', multidata:'yes'); Storm.definekey('MaxWindMPH'); Storm.definedata('Season','Name','BasinName','MaxWindMPH'); Storm.definedone(); declare hiter Stm('Storm'); end; Cat5Speed=157; do until(rc=0); rc=Stm.setcur(key:Cat5Speed); Cat5Speed+1; end; do i=1 to 20; end; run;
- Within the last DO loop, add the following two statements:
- Add a statement to output the existing category 5 storm.
- Add a statement with the appropriate hash iterator method that goes to the next row in the hash iterator.
Solution:
data work.category5; if _N_=1 then do; if 0 then set pg3.storm_final (keep=Season Name BasinName MaxWindMPH); declare hash Storm(dataset:'pg3.storm_final', ordered:'ascending', multidata:'yes'); Storm.definekey('MaxWindMPH'); Storm.definedata('Season','Name','BasinName','MaxWindMPH'); Storm.definedone(); declare hiter Stm('Storm'); end; Cat5Speed=157; do until(rc=0); rc=Stm.setcur(key:Cat5Speed); Cat5Speed+1; end; do i=1 to 20; output; Stm.next(); end; run; title 'Twenty Weakest Category 5 Storms'; proc print data=work.category5; run; title;
- Run the program and verify the results. What is the range of the MaxWindMPH values for the 20 weakest category 5 storms?
Solution:
161 to 173 mph
SAS® Programming 3: Advanced Techniques
Lesson 05, Section 1 Activity
Open p305a02.sas from the activities folder and perform the following tasks:
- Run the program and notice the formatting of BeginDate.
- Add a PICTURE statement to create another date format.
- Name the format MyMonth.
- Specify a default length.
- All date values should have a format layout similar to November of 2017.
Hint: Use the %B and %Y directives. - Declare the data type.
- Modify the FORMAT statement in the PROC FREQ step to use the new format. Run the program. Which month has the most tornadoes?
SAS® Programming 3: Advanced Techniques
Lesson 05, Section 1 Activity
What are the formatted values of the PERCENT column based on the following PICTURE statement?
picture mypct low-high='009.99%' (multiplier=100);
State | COUNT |
PERCENT |
---|---|---|
AL | 6 |
1.8987 |
AR | 22 |
6.9620 |
CA | 1 |
0.3165 |
SAS® Programming 3: Advanced Techniques
Lesson 05, Section 1
Practice Level 1: Specifying a Template Based on Date Directives
The pg3.storm_final table contains storm information such as StartDate and EndDate for storms from the 1980 through 2017 storm seasons. Create a custom date format with the following layout: three-letter-weekday.full-month-name.two-digit-day-of-month.two-digit-year (for example, Sat.September.09.17).
Reminder: If you restarted your SAS session, open and submit the libname.sas program in the course files.
- Open the p305p01.sas program in the practices folder. Run the program. Notice the formatted values of StartDate and EndDate based on the WORDDATE format.
- In the PROC FORMAT step, add a PICTURE statement to create a custom date format.
- Name the format NewDate.
- Set an appropriate default length that accommodates the formatted date values, such as: Sat.September.09.17.
- Specify a date range from LOW to HIGH.
- Use the appropriate date directives to produce a template with the following layout: three-letter-weekday.full-month-name.two-digit-day-of-month.two-digit-year.
- Specify the appropriate DATATYPE= option.
Solution:
proc format; picture NewDate (default=19) low-high = '%a.%B.%0d.%0y' (datatype=date); run; proc sort data=pg3.storm_final out=work.storm_final; by descending StartDate; run; title 'Detail Storm Report by Descending Start Date'; proc print data=work.storm_final; var Name BasinName StartDate EndDate MaxWindMPH MinPressure; format StartDate EndDate worddate; run; title;
- Modify the FORMAT statement in the PROC PRINT step to use the custom date format for the StartDate and EndDate columns.
Solution:
proc format; picture NewDate (default=19) low-high = '%a.%B.%0d.%0y' (datatype=date); run; proc sort data=pg3.storm_final out=work.storm_final; by descending StartDate; run; title 'Detail Storm Report by Descending Start Date'; proc print data=work.storm_final; var Name BasinName StartDate EndDate MaxWindMPH MinPressure; format StartDate EndDate NewDate.; run; title;
- Run the program and verify the results. What is the formatted StartDate value for the last storm that occurred in the year 2016?
Solution:
Tue.December.20.16
SAS® Programming 3: Advanced Techniques
Lesson 05, Section 1
Practice Level 2: Specifying a Template Based on Digit Selectors
The pg3.stocks table contains stock market data for the first weekday of the month for the years 2010 through 2017. Create custom formats to display the DailyChange value with a suffix of USD and the Volume value with a suffix of shares.
Reminder: If you restarted your SAS session, open and submit the libname.sas program in the course files.
- Open the p305p02.sas program in the practices folder. Run the DATA step and PROC PRINT step. View the results. Notice how the values of VolumeChar and DailyChangeChar are left-justified because they are character columns based on the CATX function. Because of this left justification, the commas, decimal points, and words are not aligned.
- In the PROC FORMAT step, add a PICTURE statement to create a custom format in addition to the current PICTURE statement that creates a format named shares.
- Name the format usd.
- Set the default length to 11.
- Specify a numeric range of LOW through less than 0 to account for negative values. Use a template of '009.99 USD' with a prefix of a negative sign.
- Specify a numeric range of 0 to HIGH to account for positive values. Use a template of '009.99 USD'.
Solution:
proc format; picture shares low-high='000,000,009 shares'; picture usd (default=11) low-<0='009.99 USD' (prefix='-') 0-high='009.99 USD'; run;
- Add a FORMAT statement to the DATA step to format Volume with the shares format and DailyChange with the usd format.
Solution:
data work.stock_report; set pg3.stocks(drop=High Low); VolumeChar=catx(' ',put(Volume,comma18.),'shares'); DailyChange=Close-Open; DailyChangeChar=catx(' ',DailyChange,'USD'); format Volume shares. DailyChange usd.; run;
- Run the program and view the results. Notice how the values of Volume and DailyChange are right-justified because they are numeric columns. Because of the right justification, the commas, decimal points, and words line up. Do the DailyChange values match the DailyChangeChar values in the results, other than in regard to justification?
Solution:
Do the DailyChange values match the DailyChangeChar values in the results, other than in regard to justification? Yes
SAS® Programming 3: Advanced Techniques
Lesson 05, Section 1
Challenge Practice: Specifying a Template Based on Digit Selectors and Characters
The work.savings table contains the deposits made to a savings account during the first half of 2019. Create a custom format to display the deposit amounts less than 1000 as the actual digits with a dollar sign but the amounts of 1000 or greater as a character string of $1,000+.
Reminder: If you restarted your SAS session, open and submit the libname.sas program in the course files.
- Open the p305p03.sas program in the practices folder. Run the program and view the formatted values for the Deposits column.
- In the PROC FORMAT step, add a PICTURE statement to create a custom format.
- Name the format MyDep.
- Display values from 0 to less than 1000 as the actual digits with a prefix of a dollar sign.
- Display values of 1000 or greater as the character string of $1,000+.
Note: Use the SAS documentation (Base SAS® Procedures Guide) to determine the PICTURE statement option that treats numbers as message characters rather than digit selectors.
Solution:
proc format; picture MyDep 0-<1000 = '009' (prefix='$') 1000-high = '$1,000+' (noedit); run;
- Modify the FORMAT statement in the PROC PRINT step to use the custom format MyDep for the Deposits column.
Solution:
title 'Monthly Deposits for Savings'; proc print data=work.savings; format Date mmddyy10. Deposits MyDep.; run; title;
- Run the program and verify the results. What PICTURE statement option is treating the numbers as message characters instead of digit selectors for rows 4 and 6?
Solution:
The NOEDIT option treats the numbers as message characters instead of digit selectors.
data work.savings; input Date date9. Deposits; datalines; 07JAN2019 199 04FEB2019 325 04MAR2019 557 01APR2019 1200 06MAY2019 215 03JUN2019 22200 ; run; proc format; picture MyDep 0-<1000 = '009' (prefix='$') 1000-high = '$1,000+' (noedit); run; title 'Monthly Deposits for Savings'; proc print data=work.savings; format Date mmddyy10. Deposits MyDep.; run; title;
SAS® Programming 3: Advanced Techniques
Lesson 05, Section 2 Activity
Let's create a custom function to convert values in inches to values in centimeters. Open p305a04.sas from the activities folder and perform the following tasks:
- Create another custom function in the PROC FCMP step.
- Add a FUNCTION statement to create a function named INtoCM that has a numeric argument of Pin.
- Add the following assignment statement:
Pcm=Pin*2.54;
- Add a RETURN statement to return the value of Pcm.
- Add an ENDSUB statement.
- Run the program and verify that the PrecipCM values are 2.54 times bigger than the Precip values. Does the PROC SQL step use the custom functions successfully?
SAS® Programming 3: Advanced Techniques
Lesson 05, Section 2
Practice Level 1: Creating a Custom Function That Returns a Numeric Value
The pg3.class_tests table contains student scores (ranging from 1 to 10) for four tests and a final exam. Create a function that calculates each student's final score. The final score is the average of six scores: the four tests and the final exam, which is counted twice.
Reminder: If you restarted your SAS session, open and submit the libname.sas program in the course files.
- Open the p305p04.sas program in the practices folder. Highlight and run the DATA step and the PROC PRINT step to view the student scores.
- Create a custom function that calculates each student's final score.
- Add a PROC FCMP statement that has an OUTLIB= option to store the custom function in pg3.myfunctions.class.
- Add a FUNCTION statement to name the function CalcScore. The function should contain five arguments: T1, T2, T3, T4, and F.
- Add the following assignment statement:
FScore=round(sum(of T1-T4, 2*F)/6,.01);
- Add a RETURN statement to return the value of FScore.
- Add an ENDSUB statement and a RUN statement.
Solution:
proc fcmp outlib=pg3.myfunctions.class; function CalcScore(T1, T2, T3, T4, F); FScore=round(sum(of T1-T4, 2*F)/6,.01); return(FScore); endsub; run;
- Highlight and run the PROC FCMP step. View the log and confirm that the function is saved.
Solution:
NOTE: Function CalcScore saved to pg3.myfunctions.class
- After the PROC FCMP step and before the DATA step, add a global OPTIONS statement with the CMPLIB= option to specify the table location of the function, pg3.myfunctions.
Solution:
options cmplib=pg3.myfunctions;
- In the DATA step, add an assignment to create FinalScore. This column is equal to the CalcScore function with the arguments Test1, Test2, Test3, Test4, and Final.
Solution:
data work.scores; set pg3.class_tests; FinalScore=CalcScore(of Test1-Test4, Final); run;
- Run the program and verify the results. How many students have a final score greater than 9.00?
Solution:
Two
proc fcmp outlib=pg3.myfunctions.class; function CalcScore(T1, T2, T3, T4, F); FScore=round(sum(of T1-T4, 2*F)/6,.01); return(FScore); endsub; run; options cmplib=pg3.myfunctions; data work.scores; set pg3.class_tests; FinalScore=CalcScore(of Test1-Test4, Final); run; title 'Student Scores'; proc print data=work.scores; run; title;
SAS® Programming 3: Advanced Techniques
Lesson 05, Section 2
Practice Level 2: Creating a Custom Function That Returns a Character Value
The sashelp.baseball table contains baseball statistics per each player. The Name column contains the player's name with the last name appearing before the first name (for example, Davis, Alan). Create a function that switches the order of the first and last names.
Reminder: If you restarted your SAS session, open and submit the libname.sas program in the course files.
- Open the p305p05.sas program in the practices folder.
- Add a PROC FCMP step to create a custom function.
- Store the function in a package named baseball within the pg3.myfunctions table.
- Name the function flip. The function should contain a character argument named LastFirst and return a character value with an appropriate length.
- Use the following RETURN statement:
return(catx(' ',scan(LastFirst,2,','), scan(LastFirst,1,',')));
- Highlight and run the PROC FCMP step. View the log and confirm that the function is saved.
Solution:
proc fcmp outlib=pg3.myfunctions.baseball; function flip(LastFirst $) $ 18; return(catx(' ',scan(LastFirst,2,','), scan(LastFirst,1,','))); endsub; run;
- Add an OPTIONS statement with the CMPLIB= option to specify the table that SAS searches for the baseball package.
Solution:
options cmplib=pg3.myfunctions;
- In the DATA step, add an assignment to create Player. This column is equal to the flip function with the argument of Name.
Solution:
data work.FlipNames; set sashelp.baseball(keep=Name Team); Player=flip(Name); drop Name; run;
- Run the program and verify the results. What is the name of the baseball player in row 21?
Solution:
Barry Bonds
proc fcmp outlib=pg3.myfunctions.baseball; function flip(LastFirst $) $ 18; return(catx(' ',scan(LastFirst,2,','), scan(LastFirst,1,','))); endsub; run; options cmplib=pg3.myfunctions; data work.FlipNames; set sashelp.baseball(keep=Name Team); Player=flip(Name); drop Name; run; title 'Baseball Players and Teams'; proc print data=work.FlipNames; var Player Team; run; title;
SAS® Programming 3: Advanced Techniques
Lesson 05, Section 2
Challenge Practice: Creating a Custom CALL Routine
The pg3.eu_occ table contains monthly occupancy rates for European countries from January 2004 through September 2017. The column YearMon contains values with the character layout of a four-digit year, the letter M, and a two-digit month (for example, 2017M09). Create a custom CALL routine that extracts the year and month from the character layout to store a numeric SAS date representing the end of the given month for the given year in a specified column.
Reminder: If you restarted your SAS session, open and submit the libname.sas program in the course files.
- Open the p305p06.sas program in the practices folder. Run the program to view the European occupancy data and the missing data for the NewDate column.
- Add a PROC FCMP step to create a custom CALL routine. Note: Use the SAS documentation (Base SAS® Procedures Guide) to determine the needed syntax for creating a CALL routine with the FCMP procedure.
- Store the routine in a package named dates within the pg3.myfunctions table.
- In the SUBROUTINE statement, name the routine C2Ndate. The routine should contain a character argument named CDate and a numeric argument named NDate.
- In the OUTARGS statement, specify the NDate argument as the column for the subroutine to update.
- Use the following statements in the routine:
Year=input(substr(CDate,1,4),4.); Month=input(substr(CDate,6,2),2.); NDate=intnx('month',mdy(Month,1,Year),0,'end');
- Highlight and run the PROC FCMP step. View the log and confirm that C2Ndate is saved.
Solution:
proc fcmp outlib=pg3.myfunctions.dates; subroutine C2Ndate(CDate $, NDate); outargs NDate; Year=input(substr(CDate,1,4),4.); Month=input(substr(CDate,6,2),2.); NDate=intnx('month',mdy(Month,1,Year),0,'end'); endsub; run;
- Add an OPTIONS statement that contains the CMPLIB= option to specify the table that SAS searches for the dates package.
Solution:
options cmplib=pg3.myfunctions;
- In the DATA step, add a CALL statement to reference the C2Ndate routine with arguments of YearMon and NewDate.
Solution:
data work.DateChange; set pg3.eu_occ; call missing(NewDate); call C2Ndate(YearMon,NewDate); format NewDate date9.; run;
- Run the program and verify the results. What are the YearMon and NewDate values for row 20?
Solution:
YearMon is 2016M02 and NewDate is 29FEB2016.
proc fcmp outlib=pg3.myfunctions.dates; subroutine C2Ndate(CDate $, NDate); outargs NDate; Year=input(substr(CDate,1,4),4.); Month=input(substr(CDate,6,2),2.); NDate=intnx('month',mdy(Month,1,Year),0,'end'); endsub; run; options cmplib=pg3.myfunctions; data work.DateChange; set pg3.eu_occ; call missing(NewDate); call C2Ndate(YearMon,NewDate); format NewDate date9.; run; title 'End of Month Occupancies'; proc print data=work.DateChange; run; title;