Setting Up Data for Exercises (REQUIRED)
To complete the activities, demos, and practices in this course, you must access SAS software and set up your practice files.
Using the Hands-On Lab
The Hands-On Lab already has the course files loaded for you. All you have to do is launch the lab and start using SAS Studio.
Instructions for using the Hands-On Lab
Using SAS OnDemand for Academics
Setup instructions for SAS OnDemand for Academics
Using your own SAS software
Start your SAS software. Click Open next to the SAS software interface you have, and follow the instructions to set up your data.
Setup Instructions for SAS Studio
Setup Instructions for SAS Enterprise Guide
Setup Instructions for SAS Windowing Environment
SAS® SQL 1: Essentials
Lesson 01, Section 2 Activity
Open s101a04.sas from the activities folder and perform the following tasks to explore the sq.customer table:
- Remove the asterisk and select only the FirstName, LastName, and State columns. Run the query and view the log and results.
- Remove the OBS=10 data set option and add the INOBS=10 PROC SQL option after the PROC SQL keywords and before the semicolon. Run the query. Are the results the same using the INOBS=10 option? What about the log?
- After the INOBS= option, add the NUMBER option. Run the query. Which column was added to the results?
SAS® SQL 1: Essentials
Lesson 02, Section 1 Activity
Open s102a01.sas from the activities folder and perform the following tasks to find all customers in the states VT, SC, or GA:
- Complete the WHERE clause to filter for customers in the state of VT and run the query. How many customers are from VT?
- Add another expression using the OR operator to select only customers from the state of VT or SC. How many customers are from either VT or SC?
- Switch your current expression to use the IN operator. Add the state of GA. How many customers are from either VT, SC, or GA?
SAS® SQL 1: Essentials
Lesson 02, Section 1 Activity
Open s102a02.sas from the activities folder and perform the following tasks to find all customers with a nonmissing CreditScore value that is less than 500:
- Examine the query. Add a WHERE clause to find all customers with a CreditScore value that is less than 500 and run the query. What do you notice about the values in the CreditScore column? How many rows are in your report?
- Include the AND operator in the WHERE clause to find all rows that are less than 500 and not null. Use a method of your choice. How many rows are in your final report?
SAS® SQL 1: Essentials
Lesson 02, Section 1 Activity
Open s102a03.sas from the activities folder and perform the following tasks to sort the report by CreditScore and LastName:
- Complete the ORDER BY clause and sort by CreditScore. Run the query and examine the report. What is the default sort order?
- Add the keyword DESC after the CreditScore column in the ORDER BY clause. Run the query and examine the report. What does the DESC option do?
- Add a secondary sort column to sort by LastName. Run the query. Who is the first customer on the report?
- Remove LastName from the SELECT clause and rerun the query. Are the results still sorted by LastName within CreditScore?
SAS® SQL 1: Essentials
Lesson 02, Section 1 Activity
Open s102a04.sas from the activities folder and perform the following tasks to enhance a report:
- Examine the query. Add the title "Customers from Hawaii" and a footnote using today's date. Run the program and examine the new title and footnote in your report.
- Apply LABEL="Email Address" to the UserID column and LABEL= "Estimated Income" to the Income column.
- Apply FORMAT=DATE9. to the DOB column and FORMAT=DOLLAR16.2 to the Income column. Run the program and examine the report.
- Change the DOLLAR16.2 format to DOLLAR7.2. Run the program. What happens to the values in the Income column?
SAS® SQL 1: Essentials
Lesson 02, Section 1 Activity
Open s102a05.sas from the activities folder and perform the following tasks to find all customers 70 years old and older:
- Examine and run the query. View the results.
- Add the expression yrdif(dob,'01jan2019'd) in the SELECT clause after UserID to create a new column. Run the query and examine the results. What is the name of the new column?
- Add as Age after your function. Run the query and examine the results. What changes?
- Remove the OBS= data set option in the FROM clause and add a WHERE clause to return rows where Age is greater than or equal to 70. Run the query. Did the query run successfully?
SAS® SQL 1: Essentials
Lesson 02, Section 1
Level 1 Practice: Querying a Table
The sq.transactionfull table contains customer and transaction information.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Using the sq.transactionfull table, write a PROC SQL step to generate a report for large transactions that are not related to tuition payments to universities. Use the following requirements as you generate the report:
- Display the following columns in this order: CustomerName, MerchantName, Type, Service, and Amount from the sq.transactionfull table.
- Select rows that have a transaction Amount value greater than $1,000 and a Service value not equal to University.
- Order the rows such that the largest transaction is listed first.
- Format the Amount column with the DOLLAR10.2 format.
- Label CustomerName as Customer Name, and Amount as Transaction Amount.
- Add the title Large Non-Educational Transactions.
- Run the program and view the results.
Solution:
/*s102s01.sas*/ title 'Large Non-Educational Transactions'; proc sql; select CustomerName label='Customer Name', MerchantName, Type, Service, Amount format=dollar10.2 label='Transaction Amount' from sq.transactionfull where Amount >1000 and Service ne 'University' order by Amount desc; quit; title;
- Display the following columns in this order: CustomerName, MerchantName, Type, Service, and Amount from the sq.transactionfull table.
- What is the first value for Transaction Amount?
Solution:
$4,072.08
SAS® SQL 1: Essentials
Lesson 02, Section 1
Level 2 Practice: Working with Datetime Values
The sq.transactionfull table contains a list of customer and transaction information.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Using the sq.transactionfull table, write a query to create a report displaying transactions that took place in November and December of any year. Use the following requirements as you generate the report:
- Display the following columns in this order: CustomerName, MerchantName, and Amount.
- Create a new column named TransactionDate by using the DATEPART function to extract the SAS date value from the DateTime column. Format the new column using the DATE9. format.
- Filter the data to select rows where the month of the transaction date is November or December and the Service value is not equal to University.
- Order the report by the original DateTime column.
- Format the Amount column with the DOLLAR10.2 format.
- Label CustomerName as Customer Name, MerchantName as Merchant Name, Amount as Transaction Amount, and TransactionDate as Transaction Date.
- Add the title November/December Transactions.
- Run the program and view the results.
Solution:
/*s102s03.sas*/ title 'November/December Transactions'; proc sql; select CustomerName label='Customer Name', MerchantName label='Merchant Name', Amount format=dollar10.2 label='Transaction Amount', datepart(DateTime) as TransactionDate format=date9. label='Transaction Date' from sq.transactionfull where Service ^= 'University' and month(calculated TransactionDate) in (11, 12) order by DateTime; quit; title;
- Display the following columns in this order: CustomerName, MerchantName, and Amount.
- What value of MerchantName is on the first documented transaction in December?
Solution:
Sceneit Cinemas, LLC
SAS® SQL 1: Essentials
Lesson 02, Section 1
Challenge Practice: Conditional Processing with a Dynamic Title
The sq.customer table contains customer demographic and banking information.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Using the sq.customer table, write a query to conditionally create a new column based on a customer's age. Use the following requirements as you generate the report:
- Use a character format to display the first initial of FirstName labeled as Initial and to display LastName labeled as Last Name. In addition, display CreditScore with the label Credit Score.
- Create a new column named Generation based on the customer's date of birth (DOB), using the following logic:
- DOB between 01JAN1928 and 31DEC1945 Generation = Silent.
- DOB between 01JAN1946 and 31DEC1964 Generation = Boomer.
- DOB between 01JAN1965 and 31DEC1979 Generation = GenX.
- DOB between 01JAN1980 and 31DEC1996 Generation = Millennial.
- DOB on or after 01JAN1997 Generation = Post-Millennial.
- Otherwise, set Generation = Unknown.
- Filter the data to select rows where CreditScore is not missing and State = VT.
- Order the report by descending CreditScore within each value of Generation.
- Add a dynamic title that specifies the exact date the report was run: Created on Dynamic Date Value (Hint: Research the %QSYSFUNC macro function). Format the date value using the WEEKDATE format.
- Run the program and view the results.
Solution:
/*s102s04.sas*/ title "Created on %left(%qsysfunc(today(),weekdate.))"; proc sql; select FirstName 'Initial' format=$1., LastName 'Last Name', DOB 'Date of Birth' format=date9., CreditScore 'Credit Score', case when DOB between '01jan1928'd and '31dec1945'd then 'Silent' when DOB between '01jan1946'd and '31dec1964'd then 'Boomer' when DOB between '01jan1965'd and '31dec1979'd then 'GenX' when DOB between '01jan1980'd and '31dec1996'd then 'Millennial' when DOB >= '01jan1997'd then 'Post-Millennial' else 'Unknown' end as Generation from sq.customer where CreditScore is not null and State='VT' order by Generation, CreditScore desc; quit; title;
- Use a character format to display the first initial of FirstName labeled as Initial and to display LastName labeled as Last Name. In addition, display CreditScore with the label Credit Score.
- Confirm that the date in your title is today's date.
SAS® SQL 1: Essentials
Lesson 02, Section 2 Activity
Open s102a06.sas from the activities folder and perform the following tasks to eliminate duplicate values in a table:
- Examine and run the query. View the results.
- Change the State column in the SELECT clause to the Employed column. Run the query. What does this query show?
- Add the Married column in the SELECT clause after the Employed column. Run the query. What does this query show?
SAS® SQL 1: Essentials
Lesson 02, Section 2 Activity
Open s102a07.sas from the activities folder and perform the following tasks to summarize a table using the COUNT function:
- Examine and run the query. View the results. Why is the value of MaritalStatus different from the value of TotalRows?
- Inside the COUNT function, add the DISTINCT keyword in front of the Married column and run the query. What does the new report show?
SAS® SQL 1: Essentials
Lesson 02, Section 2 Activity
Open s102a08.sas from the activities folder and perform the following tasks to summarize data using date functions:
- Examine and run the query. View the results. Which month has the highest value for MedianSpent?
- Replace the MONTH function with the QTR function. Change the name of the Month column to Qtr. Run the query. What is the error?
- Replace Month in the GROUP BY clause with Qtr. Run the query. Which quarter has the highest value for MedianSpent?
SAS® SQL 1: Essentials
Lesson 02, Section 2
Level 1 Practice: Eliminating Duplicates
The sq.globalfull table contains estimated financial information by geographic region and country for the population age 15 years and older.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Using the sq.globalfull table, write a query to generate a report that displays the unique CountryCode values. Use the following requirements as you generate the report:
- Order the rows by the CountryCode.
- Add the title Unique Country Codes.
- Run the program and view the results.
Solution:
/*s102s05.sas*/ title 'Unique Country Codes'; proc sql; select distinct CountryCode from sq.globalfull order by CountryCode; quit;
- Do you see any duplicate CountryCode values?
Solution:
No
- Modify the query to produce a count of the unique CountryCode values.
- Name the result of the count CountryCount.
- Add the title Count of Unique Country Codes.
- Run the program and view the results.
Solution:
title 'Count of Unique Country Codes'; proc sql; select count(distinct CountryCode) as CountryCount from sq.globalfull; quit; title;
- How many unique country codes are in the sq.globalfull table?
Solution:
151
SAS® SQL 1: Essentials
Lesson 02, Section 2
Level 2 Practice: Grouping and Summarizing Data
The sq.transactionfull table contains a list of customer and transaction information.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Using the sq.transactionfull table, write a query to generate a report that identifies which customers have the greatest percentage of suspiciously large transactions (over $500). Use the following requirements as you generate the report:
- Select and group the report by CustomerID.
- Create the following columns:
- TotalTransactions using the COUNT(*) function to count the number of transactions for each value of CustomerID.
- SuspiciousTransactions as SUM(Amount >= 500) to count the number of transactions greater than 500.
- PCTSuspicious by dividing SuspiciousTransactions by TotalTransactions. Format the new column with PERCENT8.2.
- Select only transactions where the Service value is not equal to University.
- Filter the output to display only summary rows where PCTSuspicious > .05.
- Order the report by descending PCTSuspicious.
- Add the title Customers with High Percentage of Suspicious Transactions.
- Run the program and view the results.
Solution:
/*s102s06.sas*/ title 'Customers with High Percentage of Suspicious Transactions'; proc sql; select CustomerID, count(*) as TotalTransactions, sum(Amount >= 500) as SuspiciousTransactions, calculated SuspiciousTransactions/calculated TotalTransactions as PCTSuspicious format=percent8.2 from sq.transactionfull where Service ^= 'University' group by CustomerID having PCTSuspicious>.05 order by PCTSuspicious desc; quit; title;
- Which CustomerID value had the highest percentage of suspicious transactions?
Solution:
1973179983
SAS® SQL 1: Essentials
Lesson 02, Section 2
Challenge Practice: Grouping and Summarizing Data with Calculations
The sq.globalfull table contains estimated financial information by geographic region and country for the population age 15 years and older.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Using the sq.globalfull table, write a query to generate a report that displays how many people in each region ages 15 or older will borrow for health or medical purposes next year and three years from now. The report will include the one-year and three-year forecasted counts and a percent increase between the two.
- Select and group the report by Region.
- Calculate three new columns:
- The estimated number of people who will borrow for health or medical purposes by finding the mean of EstYear1Pct, converting it to a percentage by dividing by 100, and multiplying it by the sum of EstYear1Pop.
- Name the column EstCount1 and format it using commas.
- Use this expression as a guide: mean(EstYear1PCT)/100)*sum(EstYear1Pop).
- The estimated number of people who will borrow for health or medical purposes by finding the mean of EstYear3Pct, converting it to a percentage by dividing by 100, and multiplying it by the sum of EstYear3Pop.
- Name the column EstCount3 and format it using commas.
- Hint: Copy and modify the expression using in the previous step to refer to the three-year population estimate columns.
- Subtract EstCount1 from EstCount3 and divide the result by EstCount1 to determine the percent increase. Name the column PctIncrease and format it using the PERCENT format.
- The estimated number of people who will borrow for health or medical purposes by finding the mean of EstYear1Pct, converting it to a percentage by dividing by 100, and multiplying it by the sum of EstYear1Pop.
- Filter the rows where IndicatorName is Borrowed for health or medical purposes (% age 15+).
- Order the report by descending PctIncrease.
- Add an appropriate title.
- Run the program and view the results.
Solution:
/*s102s07.sas*/ title 'Forecasted Regions who will Borrow for Health or Medical Purpose'; proc sql; select Region, (mean(EstYear1PCT)/100)*sum(EstYear1Pop) as EstCount1 format=comma16., (mean(EstYear3PCT)/100)*sum(EstYear3Pop) as EstCount3 format=comma16., (calculated EstCount3 - calculated EstCount1)/calculated EstCount1 as PctIncrease format=percent7.2 from sq.globalfull where IndicatorName = 'Borrowed for health or medical purposes (% age 15+)' group by Region order by PctIncrease desc; quit; title;
- Which Region had the largest percentage decrease from year 1 to year 3?
Solution:
East Asia & Pacific
SAS® SQL 1: Essentials
Lesson 02, Section 3 Activity
Open s102a09.sas from the activities folder and perform the following tasks:
- Examine and run the query in the Create a Table from a Query section. View the results.
- Add the CREATE TABLE statement and create a table named Top5States. Run the query and confirm that the table was created successfully.
- Run the code below your SQL query. What did the code produce?
SAS® SQL 1: Essentials
Lesson 02, Section 3 Activity
Open s102a10.sas from the activities folder and perform the following tasks to create a new table and insert rows into it:
- Examine the CREATE TABLE statement and run the query only. Confirm that an empty table was created.
- In the Inserting Rows with a Query section, enter the correct column names to complete the INSERT INTO statement. Run the query. How many rows were inserted into the table highcredit?
- In the Inserting Rows with the SET Clause section, complete the INSERT INTO statement with the SET clause and insert yourself as a customer into the highcredit table. Run the query. What does the note in the log say?
- Complete the code to drop the highcredit table.
SAS® SQL 1: Essentials
Lesson 02, Section 4 Activity
Open s102a11.sas from the activities folder and perform the following tasks to find all the available DICTIONARY tables in your SAS session:
- Examine and run the program. View the log and results.
- In the log, note the column labels for the first two columns: Member Name is the DICTIONARY table, and Data Set Label is the description of that table.
- Replace the asterisk in the SELECT clause and select the distinct memname and memlabel columns. Run the query and examine all the available DICTIONARY tables in your SAS session.
- What is the data set label of the MEMBERS DICTIONARY table?
SAS® SQL 1: Essentials
Lesson 02, Section 4
Level 1 Practice: Counting the Number of Tables in a Library
In this practice, you count the number of tables in a library.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Write a query to create a report that displays the count of the number of tables in the SQ library.
- Use DICTIONARY.tables as input.
- Name the calculated column TableCount.
- Add an appropriate title.
- Run the program and view the results.
Solution:
/*s102s08.sas*/ title 'Count of SQ Tables'; proc sql; select count(*) as TableCount from dictionary.tables where libname='SQ'; quit; title;
- Use DICTIONARY.tables as input.
- How many tables are in the sq library?
Solution:
27
SAS® SQL 1: Essentials
Lesson 02, Section 4
Level 2 Practice: Counting the Number of Tables in all Libraries
In this practice, you count the number of tables in all libraries.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Write a query to create a report that displays the count of the number of tables in all libraries.
- Use DICTIONARY.tables as input.
- Name the calculated column TableCount.
- Group the results by the library name.
- Add an appropriate title and display the library name and table count.
- Run the program and view the results.
Solution:
/*s102s09.sas*/ title 'Count of All Tables'; proc sql; select libname, count(*) as TableCount from dictionary.tables group by libname; quit; title;
- Use DICTIONARY.tables as input.
- Which library has the most tables?
Solution:
MAPSGFK. Your results might differ.
SAS® SQL 1: Essentials
Lesson 02, Section 4
Challenge Practice: Finding Tables with a Column Using Pattern Matching
In this practice, you find tables with a column using pattern matching.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Write a query to create a report that displays the list of all tables in the SQ library with a column containing ID in its name.
- Display the table name and the column containing ID.
- Run the program and view the results.
Solution:
/*s102s10.sas*/ title 'SQ Tables containing ID columns'; proc sql; select distinct MemName, Name from dictionary.columns where Libname = 'SQ' and Name contains 'ID'; quit; title;
- How many unique tables have a column that contains ID in its name?
Solution:
13
SAS® SQL 1: Essentials
Lesson 03, Section 1 Activity
Open s103a01.sas from the activities folder and perform the following tasks to perform a default join of two tables:
- Examine and run the two queries to explore the sq.smallcustomer and sq.smalltransaction tables. Confirm that the sq.smallcustomer contains 8 rows and the sq.smalltransaction contains 12 rows.
- In the next section, list the sq.smallcustomer and sq.smalltransaction table in the FROM clause and separate the tables by a comma. Run the query and view the log. What note do you see?
- View the results. Name two issues with the report.
SAS® SQL 1: Essentials
Lesson 03, Section 2 Activity
Open s103a02.sas from the activities folder and perform the following tasks to perform an inner join on tables with different columns names:
- Examine and run the two queries to explore the sq.statepopulation and sq.statecode tables. What columns can you use to join the tables?
- In the Complete the Inner Join section, specify the tables in the FROM clause and perform an inner join. Add the alias p for the sq.statepopulation table, and the alias s for the sq.statecode table.
- Complete the ON expression to match rows where p.Name = s.StateCode. Highlight and run the query. How many rows are in the new report?
SAS® SQL 1: Essentials
Lesson 03, Section 2 Activity
Open s103a03.sas from the activities folder and perform the following tasks to find tables that contain BankID and MerchantID columns:
- Complete the first query by adding the BANKID column name in the WHERE clause. How many tables contain the BankID column?
- Replace BANKID with MERCHANTID. How many tables contain the MerchantID column?
SAS® SQL 1: Essentials
Lesson 03, Section 2 Activity
Open s103a04.sas from the activities folder and perform the following tasks to use a non-equijoin:
- Complete the ON clause to join on rows where customer Income is greater than the LowIncome range, and less than or equal to the HighIncome range using the BETWEEN-AND WHERE operator.
- What tax bracket is Olga Comstock in?
- View your log. What note do you see?
SAS® SQL 1: Essentials
Lesson 03, Section 2
Practice Level 1: Performing an Inner Join
Open s103p01.sas from the practices folder. Modify the program to generate a report that shows the breakdown of employment and marital status for customers in New York City.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Add a PROC SQL step to create a table named work.nyc that combines sq.customer and sq.maritalcode. Use the following requirements:
- This table should include only FirstName, LastName, Employed, and MaritalStatus.
- Perform an inner join on the Married column in the sq.customer table and MaritalCode column in the sq.maritalcode table.
- Filter the Zip column for customers in the 10001 Zip code.
- Run the program and view the results.
Solution:
/*s103s01.sas*/ proc sql; create table work.NYC as select c.FirstName, c.LastName, c.Employed, m.MaritalStatus from sq.customer as c inner join sq.maritalcode as m on c.married=m.MaritalCode where zip=10001; quit; /*Alternate Solution*/ proc sql; create table work.NYC as select c.FirstName, c.LastName, c.Employed, m.MaritalStatus from sq.customer as c, sq.maritalcode as m where c.married=m.MaritalCode and zip=10001; quit;
- This table should include only FirstName, LastName, Employed, and MaritalStatus.
- Execute the PROC FREQ step to generate the crosstabulation of MaritalStatus and Employed.
Overall, are NYC customers likely to be employed or unemployed?
Solution:
employed
- Does this vary across marital status?
Solution:
Yes, NYC customers whose marital status is Single are more likely to be unemployed.
SAS® SQL 1: Essentials
Lesson 03, Section 2
Practice Level 2: Joining on Inequality
Open s103p02.sas from the practices folder. Modify the program to join the sq.customer and sq.agegroup tables based on a customer's year of birth.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Add a PROC SQL step to the top of the program to create a table named work.generation that combines sq.customer and sq.agegroup. Use the following requirements:
- Select FirstName and LastName, and create a column named Year to determine the DOB year of the customer from the sq.customer table. Select the Name column from the sq.agegroup table.
- The StartYear and EndYear columns in the sq.agegroup table indicate the starting and ending years for each generation. Use these columns to perform a non-equijoin using the calculated Year value from the previous step.
- Run the program and view the results.
Solution:
/*s103s02.sas*/ /*Solution*/ proc sql; create table work.generation as select c.FirstName, c.LastName, year(c.DOB) as Year, a.Name from sq.Customer as c inner join sq.AgeGroup as a on year(c.DOB) between a.StartYear and a.EndYear; quit; /*Alternate Solution*/ proc sql; create table work.generation as select c.FirstName, c.LastName, year(c.DOB) as Year, a.Name from sq.Customer as c, sq.AgeGroup as a where calculated Year between a.StartYear and a.EndYear; quit;
- Select FirstName and LastName, and create a column named Year to determine the DOB year of the customer from the sq.customer table. Select the Name column from the sq.agegroup table.
- Execute the PROC SGPLOT step below your query to generate the bar chart. Which age group has the most customers?
Solution:
Millenials or Generation Y
SAS® SQL 1: Essentials
Lesson 03, Section 2
Challenge Practice: Joining More Than Two Tables
Open s103p03.sas from the practices folder. Modify the program to create a table named work.births3 that will be used to create a report showing a three-year projection of births by Region, Division, and State. The data to create this report is stored in four tables as described below.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
sq.statepopulation | sq.regioncode | sq.divisioncode | sq.statecode |
---|---|---|---|
Region (the region code) | RegionCode | DivisionCode | StateCode |
Division (the division code) | RegionName | DivisionName | StateName |
Name (the state code) | |||
Births3 (an indicator of state-lvel, three-year projected births) |
- Using all inner joins, combine the four tables such that the descriptive Region, Division, and State names are combined with the Births3 data. Run the program and view the results.
Solution:
/*s103s03.sas*/ proc sql; create table Births3 as select c.RegionName, d.DivisionName, b.Statename, a.Births3 from sq.statepopulation as a inner join sq.statecode as b on a.name=b.statecode inner join sq.RegionCode as c on a.region=c.regionCode inner join sq.divisioncode as d on a.Division=d.divisioncode; quit; /*Alternate Solution*/ proc sql; create table Births3 as select c.RegionName, d.DivisionName, b.Statename, a.Births3 from sq.statepopulation as a, sq.statecode as b, sq.RegionCode as c, sq.divisioncode as d where a.name=b.statecode and a.region=c.regionCode and a.Division=d.divisioncode; quit;
- Execute the PROC TABULATE step to generate the report. Which division has the highest projected three-year births, and which has the lowest?
Solution:
South Atlantic has the highest. New England has the lowest.
- Which region has the highest projected three-year births, and which has the lowest?
Solution:
South Region has the highest. Northeast Region has the lowest.
SAS® SQL 1: Essentials
Lesson 03, Section 3 Activity
Open s103a05.sas from the activities folder and perform the following tasks to perform a left join:
- Run the query to create a left join between the sq.smallcustomer and sq.smalltransaction tables. Notice the difference within the AccountID columns in rows 8 and 9.
- Remove the column t.AccountID in the SELECT clause. Run the query and examine the results. How many missing AccountID values are in the results?
- Replace c.AccountID with t.AccountID. Replace the c in the column label with a t. How many missing AccountID values are in the results?
SAS® SQL 1: Essentials
Lesson 03, Section 3 Activity
Open s103a06.sas from the activities folder and perform the following tasks to find all transactions not associated with a documented customer:
- Run the query to create a left join between the sq.smalltransaction2 and sq.smallcustomer2 tables. Examine the report. Notice that the rows with missing values in AccountID have been joined.
- In the ON clause, add the expression and t.AccountID is not null. Run the query. Confirm that missing values were not joined.
- Add a WHERE clause with the expression c.AccountID is null to filter for all transactions without a documented customer. Run the query and examine the report. How many transactions do not have a customer associated with them?
SAS® SQL 1: Essentials
Lesson 03, Section 3
Practice Level 1: Using Outer Joins to Find Nonmatches
Join the sq.globalpop and sq.globalmetadata tables to create the work.meta table. Use the work.meta table to generate a report showing the country codes for countries in the sq.globalpop table that do not have any country metadata in the sq.globalmetadata table.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Write a PROC SQL step to join the sq.globalpop and sq.globalmetadata tables and create a table named work.meta. Use the following requirements:
- Select the CountryCode, SeriesName, EstYear1, and EstYear3 columns from the sq.globalpop table and the ShortName and IncomeGroup columns from the sq.globalmetadata table.
- Perform a left join on the sq.globalpop and sq.globalmetadata tables.
- Use the CountryCode column in both tables for the join criteria.
- Run the program and view the results.
Solution:
/*s103s04.sas*/ ************************************; *Solution with Temporary Table *; ************************************; proc sql; create table work.meta as select p.CountryCode, p.SeriesName, p.EstYear1, p.EstYear3, m.ShortName, m.IncomeGroup from sq.globalpop as p left join sq.globalmetadata as m on p.countrycode= m.countrycode; quit;
- Select the CountryCode, SeriesName, EstYear1, and EstYear3 columns from the sq.globalpop table and the ShortName and IncomeGroup columns from the sq.globalmetadata table.
- Create a report showing the unique country codes for which there is no global metadata using the work.meta table.
- Select the CountryCode column from the work.meta table and eliminate duplicate values.
- Filter for rows where the ShortName column is missing. The ShortName column contains values from the sq.globalmetadata table. If the results are missing, then the row did not retrieve information from sq.globalmetadata.
- Order the results by CountryCode.
- Run the program and view the results.
Solution:
title 'Countries with no Metadata'; proc sql; select distinct CountryCode from work.meta where ShortName is null order by CountryCode; quit; title; ************************************; *Solution to Step 1 and Step 2 with No Temporary Table *; ************************************; title 'Countries with no Metadata'; title2 'No Temporary Table'; proc sql; select distinct p.CountryCode from sq.globalpop as p left join sq.globalmetadata as m on p.CountryCode= m.CountryCode where ShortName is null; quit; title;
- Select the CountryCode column from the work.meta table and eliminate duplicate values.
- What is the last CountryCode value in your results?
Solution:
WSM
SAS® SQL 1: Essentials
Lesson 03, Section 3
Practice Level 2: Using Outer Joins to Summarize Data
Generate a report showing the count of customer marital status descriptions for each primary bank. The sq.customer table contains a marital code (Married) and a primary bank ID. The final results should contain BankID, MaritalStatus, Name (name of bank), and Count.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Write a PROC SQL step to join the sq.customer and sq.maritalcode tables. Use the following requirements:
- Select BankID from the sq.customer table and MaritalStatus value from the sq.maritalcode table. Create a new column named Count to count the number of customers. Format the new column using commas.
- Use a left join to select all customers from the sq.customer table, with or without matches in the sq.maritalcode table.
- Use the Married column in the sq.customer table and the MaritalCode column in the sq.maritalcode table as the join criteria.
- Filter rows where the customer's bank ID is not missing.
- Group the data by BankID and MaritalStatus.
- Order the table by descending Count.
- As a checkpoint, run the query.
Solution:
/*s103s05.sas*/ /*a*/ proc sql; select c.BankID, m.MaritalStatus, count(*) as Count format=comma10. from sq.customer as c left join sq.maritalcode as m on c.Married=m.MaritalCode where c.BankID is not null group by c.BankID, m.MaritalStatus order by Count desc; quit;
- Select BankID from the sq.customer table and MaritalStatus value from the sq.maritalcode table. Create a new column named Count to count the number of customers. Format the new column using commas.
- In the same PROC SQL step, add the descriptive bank Name column to the results.
- After the MaritalStatus column in the SELECT clause, add Name from the sq.bank table based on matching BankID values, again using a left join.
- Add Name in the GROUP BY clause after the MaritalStatus column.
- Correctly reference the BankID columns in the query.
- Add an appropriate title.
- Run the program and view the results.
Solution:
/*b*/ title 'Count of Marital Status by Bank'; proc sql; select c.BankID, m.MaritalStatus, b.Name, count(*) as Count format=comma10. from sq.customer as c left join sq.maritalcode as m on c.Married=m.MaritalCode left join sq.bank as b on c.BankID=b.BankID where c.BankID is not null group by c.BankID, m.MaritalStatus, b.Name order by Count desc; quit; title;
- After the MaritalStatus column in the SELECT clause, add Name from the sq.bank table based on matching BankID values, again using a left join.
- Which combination of MaritalStatus and Name has the lowest count of customers?
Solution:
Widowed and Wheatberry Bank, Inc. had a total count of 362.
SAS® SQL 1: Essentials
Lesson 03, Section 3
Challenge Practice: Combining Inner and Outer Joins
Some of the New York City merchants are having difficulty with sales. Your job is to identify merchants who have had no recent sales, capture what industry they are in, and then generate a list of customers in their area who are transacting with other vendors to help them market the appropriate customers to generate sales.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Use a left outer join to combine the sq.merchant and sq.transaction tables to identify New York City merchants (Zip=10001) with no transactions. Use the following requirements:
- From the sq.merchant table, select MerchantName, MerchantID, Type, and Zip.
- Label Type as Merchant Type and Zip as Merchant Zipcode.
- Merchants who do not have any transactions will be those who have no entries in the sq.transaction table.
- Run the query and view the results.
Solution:
/*s103s06.sas*/ /*a*/ title 'NYC Merchants with no Transactions'; proc sql; select a.MerchantName, a.MerchantID, a.Type 'Merchant Type', a.Zip 'Merchant Zipcode' from sq.merchant as a left join sq.transaction as b on a.MerchantID=b.MerchantID where a.Zip=10001 and b.MerchantID is null; quit; title;
- The next step is to find the customers in NYC who have similar Type transactions with other vendors. The problem is that the sq.transactionfull table (where transaction information is combined with customer information) does not contain a column for the customer's ZIP code for us to join on, only the full address. Therefore, we need to extract the ZIP code from the customer.Address column to perform the final join.
- In the same PROC SQL step, use an inner join to combine the above list with the sq.transactionfull table by Type and Zip to generate a report with customer contact information for all NYC customers with similar Type transactions.
- Hint: Assign the sq.transactionfull table an alias of c and use the following expression to extract ZIP from the c.address column to perform this join:
input(scan(c.address, -1), 5.)
- Include CustomerID, CustomerName, and Address from the sq.transactionfull table.
- Label Address as Customer Address.
- Add an appropriate title to the report.
- Run the query and view the results. Ensure that the report shows unique rows of data and is ordered by MerchantID and CustomerID.
Solution:
/*b*/ proc sql; title 'Customers with similar Type Transactions'; title2 'as NYC Merchants with no Sales'; select distinct a.MerchantName, a.MerchantID, a.type 'Merchant Type', a.Zip 'Merchant Zipcode', c.CustomerID, c.Customername, c.Address 'Customer Address' /*Join merchants with transactions. Notice where clause below that determines which rows are returned.*/ from sq.merchant as a left join sq.transaction as b on a.MerchantID=b.MerchantID /*Combine the above list with all customers with similar type transactions.*/ inner join sq.transactionfull as c on a.type=c.type and input(scan(c.address,-1),5.)=a.zip /*the where filter selects NYC merchants who do not exist in the transaction table=no sales*/ where a.zip=10001 and b.MerchantID is null order by 2,5; quit; title;
- How many potential new customers can the merchant Miasma Mitigation, Inc. now market?
Solution:
six potential new customers
SAS® SQL 1: Essentials
Lesson 03, Section 4 Activity
Open s103a07.sas from the activities folder and perform the following tasks to join two tables using the SUBSTR functionr:
- Examine and run the query. Did you receive a syntax error?
- In the ON clause, use the SUBSTR function on t.StateID to extract the first two characters. Run the query. Which StateName is Caberto, Glen Daniel from?
SAS® SQL 1: Essentials
Lesson 03, Section 4 Activity
Open s103a08.sas from the activities folder and perform the following tasks to join a numeric column with a character column:
- Run the queries in the Create a Table and Insert Values section. View the newly created table.
- Run the query in the Join Different Column Types section. What syntax error was generated when you joined columns of different types?
- Leave the program open for the next Activity.
SAS® SQL 1: Essentials
Lesson 03, Section 4 Activity
Open s103a08.sas from the activities folder and perform the following tasks to join a numeric column with a character column by using the PUT function:
- If you have not run the queries in the Create a Table and Insert Values section, run those now.
- Use the PUT function to convert z.Zip in the ON clause to a character value using the z5 format. Run the query.
- What city does the ZipCode value 14216 represent?
SAS® SQL 1: Essentials
Lesson 04, Section 1 Activity
Open s104a01.sas from the activities folder and perform the following tasks to use a subquery to return two columns in the WHERE clause:
- Examine and run the first query. Confirm that the results contain one row and two columns.
- Add the first query as a subquery in the second query to find all states with PopEstimate1 values higher than the average estimated state population.
- Run the query. What is the syntax error in the log?
SAS® SQL 1: Essentials
Lesson 04, Section 1 Activity
Open s104a02.sas from the activities folder and perform the following tasks to use a subquery in the HAVING clause:
- Examine and run the first query. View the results.
- Modify the second query. Copy the value returned by the first query into the subquery against the HAVING clause to return divisions with an average PopEstimate1 value greater than the total average of PopEstimate1. Run the query. How many divisions have a higher average PopEstimate1 value than the average PopEstimate1 value of all the states?
- Remove the static value and add the subquery in the HAVING clause. Run the query.
- How many divisions have a higher average PopEstimate1 value than the average PopEstimate1 value of all the states?
SAS® SQL 1: Essentials
Lesson 04, Section 1 Activity
Open s104a03.sas from the activities folder and perform the following tasks to find all states with a PopEstimate1 value that is lower than the value for New York or Florida:
- Complete the query using the ANY keyword or MAX statistic. Run the query.
- How many states have estimated populations lower than New York or Florida?
SAS® SQL 1: Essentials
Lesson 04, Section 1
Practice Level 1: Using a Subquery That Returns a Single Value
The sq.statepopulation table contains estimated population statistics for every state in the US and its territories. Which states have an estimated three-year population growth greater than the average for all states?
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Using the sq.statepopulation table, write a query that displays the average three-year population growth for all states. Use the nPopChg3 column to calculate the average from the sq.statepopulation table. Run the program and view the results.
Solution:
/*s104s01.sas*/ /*a*/ proc sql; select mean(nPopChg3) from sq.statepopulation; quit;
- Use the query from step 1 to display the states that have a projected three-year population growth greater than the overall average. Use the following requirements:
- Include Name and nPopChg3 in the results.
- Label nPopChg3 as Estimated Growth and format the values with commas.
- Use the query from step 1 to subset the table.
- Order the results by descending nPopChg3.
- Add an appropriate title.
- Run the query and view the results.
Solution:
/*b*/ title "States with an estimated 3-Year Population Growth"; title2 "Greater than the Overall Average"; proc sql; select Name, nPopChg3 label="Estimated Growth" format=comma16. from sq.statepopulation where nPopChg3 > (select mean(nPopChg3) from sq.statepopulation) order by nPopChg3 desc; quit; title;
- Which state has the greatest estimated growth in your results?
Solution:
Texas, with an Estimated Growth value of 379,128
SAS® SQL 1: Essentials
Lesson 04, Section 1
Challenge Practice: Using Nested Subqueries
Create a report that lists all countries that have a higher population than the average of all countries and where the EstYear1 forecast for Outstanding housing loan (% age 15+) increases in EstYear3.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Open s104p05.sas from the practices folder. Run the query to create the table CountryEstPop. The table contains total estimated population for ages 15+ for each country.
- Write a query to find all countries in the CountryEstPop table with a higher population than the mean of all countries.
Solution:
/*s104s05.sas*/ /*b*/ proc sql; select CountryCode from CountryEstPop where EstPop > (select mean(EstPop) from CountryEstPop); quit;
- Use the previous query as a subquery. Use the following requirements:
- Select all columns from the sq.globalfindex table where IndicatorName is Outstanding housing loan (% age 15+), the EstYear1 forecast increases in three years, EstYear1 and EstYear3 are not null, and the country has a population estimate greater than the mean of all countries.
- Convert EstYear1 and EstYear3 to a percent and format the values.
- Create a new column to determine the percent increase by subtracting EstYear3 by EstYear1. Name the column PctIncrease and format the values using a percent.
- Order the results by descending PctIncrease. Add an appropriate title.
- Run the query and view the results.
Solution:
/*c*/ proc sql; select CountryCode, IndicatorName, EstYear1/100 as EstYear1 format=percent7.2, EstYear3/100 as EstYear3 format=percent7.2, calculated EstYear3 - calculated EstYear1 as PctIncrease format=percent7.2 from sq.globalfindex where IndicatorName = "Outstanding housing loan (% age 15+)" and EstYear1 is not null and EstYear3 is not null and EstYear1 < EstYear3 and CountryCode in (select CountryCode from CountryEstPop where EstPop > (select mean(EstPop) from CountryEstPop)) order by PctIncrease desc; quit; title;
- Which country had the largest value for PctIncrease?
Solution:
BGD
SAS® SQL 1: Essentials
Lesson 04, Section 2 Activity
Open s104a04.sas from the activities folder and perform the following tasks to create and use a view:
- Create a view named VWtotalcustomer from the query. Run the query and examine the log.
- Run the code in the section Use the View in the PROCS Below. Which state has the highest number of customers?
SAS® SQL 1: Essentials
Lesson 04, Section 2
Practice Level 1: Using an In-Line View
Determine which customers have an extremely high credit score relative to other customers in their ZIP code (Zip). Similar to the practice in the previous section, extremely high credit is defined as greater than 2 standard deviations above the mean of CreditScore. However, rather than use an overall high-credit threshold for all customers, we need to calculate the threshold for each value of Zip. This can be accomplished using an in-line view.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Open s104p06.sas from the practices folder. Run the query to summarize the HighZipCredit threshold for each Zip value for the first 1000 rows.
- Use the query from step 1 as an in-line view to join with the sq.customer table. Use the following requirements:
- Select c.CustomerID, c.Zip, and c.CreditScore from the sq.customer table, and select s.HighZipCredit from the in-line view. Format the c.Zip column using the Z5. format.
- Perform an inner join with the sq.customer table and the in-line view from step 1. Give the sq.customer table the alias c and the in-line view the alias s.
- Remove the INOBS= option from the in-line view.
- Use c.Zip = s.Zip as the join criteria.
- Filter rows where the customer's c.CreditScore value is greater than the s.HighZipCredit value.
- Order the results by Zip and descending CreditScore.
- Add an appropriate title to the report.
- Run the query and view the results.
Solution:
/*s104s06.sas*/ title 'Customers with Extremely High Credit for their Zipcode'; proc sql; select c.CustomerID, c.Zip format=z5., c.CreditScore, s.HighZipCredit from sq.customer as c inner join (select Zip, sum(avg(CreditScore),(2*std(CreditScore))) as HighZipCredit from sq.customer where CreditScore is not null group by Zip) as s on c.Zip=s.Zip where c.CreditScore>s.HighZipCredit order by Zip, CreditScore desc; quit; title;
- What is the last Zip value and the corresponding CreditScore value in the final report?
Solution:
The last Zip value is 99701 with a CreditScore value of 815.
SAS® SQL 1: Essentials
Lesson 04, Section 2
Practice Level 2: Building a Complex Query Using In-Line Views
Determine which employees have the highest salary for their job title in every state.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Using the sq.employee table, write a query to calculate the maximum Salary value for each value of JobTitle within each State. Use the following requirements:
- Convert the values of the State column to uppercase and name the column State to standardize the state code values.
- Select JobTitle and calculate the maximum salary. Name the column MaxJobSalary.
- Filter rows where State is not null.
- Group the results bythe calculated State and JobTitle. (You must use the CALCULATED keyword. See Column Alias Extensions in the SAS Documentation.
- Order the results by State.
- Add an appropriate title.
- Run the query and view the results.
Solution:
/*s104s07.sas*/ /*a*/ title 'Maximum Salary for Each Job in every State'; proc sql; select upcase(State) as State, JobTitle, max(Salary) as MaxJobSalary from sq.employee where State is not null group by calculated State, JobTitle order by State; quit; title;
- What is the MaxJobSalary value for a Chief Executive Officer in FL?
Solution:
433800
- Use the query from step 1 as an in-line view to join with the sq.employee table. Use the following requirements:
- Display EmployeeID, EmployeeName, State, JobTitle, and Salary for the highest paid employee for each value of JobTitle in every state.
- Perform an inner join with the sq.employee table and the in-line view from step 1. Give the sq.employee table the alias detail and the in-line view the alias summary. Hint: Remove the ORDER BY clause when using an in-line view.
- Use JobTitle, State, and Salary equal to MaxJobSalary as the join criteria. HINT: Make sure State is upper cased when joining.
- Order the report by State and JobTitle.
- Add an appropriate title and format the Salary values with a dollar sign and comma.
- Run the query and view the results.
Solution:
/*b*/ title 'Employees with Highest Salary for their Job in every State'; proc sql; select detail.EmployeeID, detail.EmployeeName, detail.State, detail.JobTitle, detail.Salary format=dollar12. from sq.employee as detail inner join (select upcase(State) as State, JobTitle, max(Salary) as MaxJobSalary from sq.employee where State is not null group by calculated State, JobTitle) as summary on detail.Jobtitle=summary.JobTitle and upcase(detail.State)=Summary.State and detail.Salary=Summary.MaxJobSalary order by detail.State, detail.JobTitle; quit; title;
- Who is the highest paid Auditor I in CA?
Solution:
Worton, Steven
SAS® SQL 1: Essentials
Lesson 04, Section 2
Challenge Practice: Building a Complex Query Using a Join and Subquery
Generate a report of the total estimated number of individuals next year who made or received digital payments in the past year (% age 15+) in South Asia to determine the best country in which to promote a digital payment app.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Calculate the total estimated population for ages 15+ for countries in South Asia in the sq.globalpop table. Use the following requirements:
- Select the CountryCode and calculate the sum of EstYear1. Name the new column EstYear1Pop and format using commas.
- Filter the SeriesName column for estimated population greater than 15 years of age. Hint: Use the LIKE operator in the WHERE clause.
- Use a subquery from the sq.globalmetadata table to include only CountryCodes in South Asia.
- Group by CountryCode.
- Order the results by State.
- Add an appropriate title.
- Run the query and view the results.
Solution:
/*s104s08.sas*/ /*a*/ title 'Estimated Population for Next Year in South Asia'; title2 'Ages 15+'; proc sql; select CountryCode, sum(EstYear1) as EstYear1Pop format=comma14. from sq.globalpop where SeriesName not like '%00-04' and SeriesName not like '%05-09' and SeriesName not like '%10-14' and CountryCode in (select CountryCode from sq.globalmetadata where Region = 'South Asia') group by CountryCode; quit; title;
- Which country has the highest EstYear1Pop value?
Solution:
IND
- Use the query from step 1 as an in-line view to join by CountryCode and multiply the estimated percentage of individuals who made or received digital payments in the past year (% age 15+) by the estimated population for next year. Use the following requirements:
- Select the CountryCode and IndicatorName columns from the sq.globalfindex table.
- Convert the EstYear1 value from sq.globalfindex to a percentage by dividing by 100 and then multiply by the EstYear1Pop value from the in-line view. Name the column Estimate and format using commas. Note: Multiplying the EstYear1 percentage of individuals by the EstYear1Pop total population for the country returns an estimated number of individuals who used digital payments.
- Perform an inner join of sq.globalfindex and the in-line view from step 1. Assign sq.globalfindex the alias f, and assign the in-line view the alias pop. Use CountryCode as the join criteria.
- Filter the rows by the IndicatorName value of Made or received digital payments in the past year (% age 15+).
- Order the results by descending Estimate.
- Add an appropriate title.
- Run the query and view the results.
Solution:
/*b*/ title 'Estimated Population who Made or Received Digital Payments in the Past Year'; title2 'South Asia Ages 15+'; proc sql; select f.CountryCode, f.IndicatorName, ((f.EstYear1/100) * pop.EstYear1Pop) as Estimate format=comma14. from sq.globalfindex as f inner join (select CountryCode, sum(EstYear1) as EstYear1Pop format=comma14. from sq.globalpop where SeriesName not like '%00-04' and SeriesName not like '%05-09' and SeriesName not like '%10-14' and CountryCode in (select CountryCode from sq.globalmetadata where Region = 'South Asia') group by CountryCode) as pop on f.CountryCode = pop.CountryCode where IndicatorName = 'Made or received digital payments in the past year (% age 15+)' order by Estimate desc; quit;
- Which value of CountryCode has the highest estimated use of digital payments?
Solution:
IND has an estimated digital payment use of 177,123,495 people.
SAS® SQL 1: Essentials
Lesson 04, Section 3 Activity
Open s104a05.sas from the activities folder and perform the following tasks to disable remerging of summary statistics:
- Examine and run the query. What note do you see in the log?
- Add the PROC SQL option NOREMERGE. Run the query. Did it run successfully? What was the error in the log?
- Add a GROUP BY clause after the FROM clause and group by Region. Run the query. Did it run successfully?
SAS® SQL 1: Essentials
Lesson 04, Section 3
Practice Level 1: Remerging Summary Statistics
Use the sq.statepopulation table to determine which states have the most estimated births for next year. Include a column showing each state's births as the percent of national births.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Write a query to remerge summary statistics in the sq.statepopulation table. Use the following requirements:
- Select the Name and Births columns.
- Create a new column named PctBirth by dividing Births1 for each state by the sum of Births1 for all states. Format the new column using the PERCENT format.
- Order the results by descending PctBirth.
- Add an appropriate title to the report.
- Run the query and view the results.
Solution:
/*s104s09.sas*/ title 'Estimate Percentage of Births by Each State'; proc sql; select Name, Births1, Births1/sum(Births1) as PctBirth format=percent7.2 from sq.statepopulation order by PctBirth desc; quit; title; /*Alternate Solution*/ title 'Estimate Percentage of Births by Each State'; proc sql; select Name, Births1, Births1/(select sum(Births1) from sq.statepopulation) as PctBirth format=percent7.2 from sq.statepopulation order by PctBirth desc; quit; title;
- Which state has the highest percentage of estimated births for next year?
Solution:
CA has the highest percentage at 12.3%.
SAS® SQL 1: Essentials
Lesson 04, Section 3
Practice Level 2: Using a Subquery in the SELECT Clause with an In-Line Views
Find the top 10 countries that have the highest percentage of estimated global population using the population estimates of ages 15+ in the sq.globalfull table.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Use the sq.globalfull table to write a query to sum the EstYear1Pop of all countries. Use the following requirements:
- Use an in-line view to select the distinct CountryCode and EstYear1Pop values from the sq.globalfull table. Note: You must find the distinct estimated population of each country because the data contains the estimated population for each country multiple times.
- Sum the EstYear1Pop column and name the column new column EstPct. Format the column using the COMMA format.
- Run the query and view the results.
Solution:
/*s104s10.sas*/ /*a*/ title 'Estimated Population of Ages 15+ for Next Year'; proc sql; select sum(EstYear1Pop) as EstPct format=comma16. from (select distinct CountryCode, EstYear1Pop from sq.globalfull); quit; title;
- What is the value of EstPct?
Solution:
5,271,101,653
- Create a new query using the query from step 1 to determine the estimated global population percentage of each country. Use the following requirements:
- Select the distinct CountryCode and ShortName values.
- Determine the percentage population by creating a new column. Divide the EstYear1Pop value of each country by the total value calculated in step 1. Name the new column PctPop and format using the PERCENT format.
- Use the sq.globalfull table.
- Order the results by descending PctPop.
- Limit the results to the top 10 countries.
- Add an appropriate title.
- Run the query and view the results.
Solution:
/*b*/ title 'Top 10 Countries by Estimate Population'; title2 'Ages 15+'; proc sql outobs=10; select distinct CountryCode, ShortName, EstYear1Pop/ (select sum(EstYear1Pop) as EstPct format=comma16. from (select distinct CountryCode, EstYear1Pop from sq.globalfull)) as PctPop format=percent7.2 from sq.globalfull order by PctPop desc; quit; title;
- Which country has the highest estimated population of individuals 15 or over?
Solution:
China with 21.3%
SAS® SQL 1: Essentials
Lesson 04, Section 3
Challenge Practice: Remerging GROUP BY Summary Statistics
In the sq.statepopulation table, states in the United States are categorized by divisions. Calculate the percentage of births for next year by each state for its division.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Use the sq.statepopulation table to write a query to sum the values of Births1 for each division and divide the summarized value by the Births1 values of each state.
Use the following requirements:
- Select the Name, Division, and Births1 columns. Create two new calculated columns.
- The first column should sum all Births1 values. Name the column TotalDivisionEst, and format it using the COMMA format.
- The second column should divide Births1 by the new column, TotalDivisionEst. Name the column PctDivision, and format it using the PERCENT format.
- Group the query by Division.
- Order the results by Division and PctDivision descending.
- Format the Births1 column using the COMMA format.
- Add an appropriate title.
- Run the query and view the results.
Solution:
/*s104s11.sas*/ title "Percentage of Estimated Births by Each Division"; proc sql; select Name, Division, Births1 format=comma14., sum(Births1) as TotalDivisionEst format=comma14., Births1/calculated TotalDivisionEst as PctDivision format=percent7.1 from sq.statepopulation group by Division order by Division, PctDivision desc; quit; title;
- Select the Name, Division, and Births1 columns. Create two new calculated columns.
- Which state has the highest percentage of estimated births in Division 8?
Solution:
AZ with 27.7% of estimated births
SAS® SQL 1: Essentials
Lesson 05, Section 2 Activity
Open s105a01.sas from the activities folder and perform the following tasks to find unique customers who have responded by phone and email:
- Run the first queries to preview the sq.salesemail and sq.salesphone tables. Examine the columns in both tables.
- In the Intersect section, examine and run the query. Did the query run successfully? Why not?
- Add the CORR keyword after the INTERSECT set operator. Run the query. Did the query run successfully? Why?
SAS® SQL 1: Essentials
Lesson 05, Section 2 Activity
Open s105a02.sas from the activities folder and perform the following tasks to find all target customers who have not responded to our sales phone call:
- Run the first queries in the Preview Tables section to preview the sq.salesemail and sq.salesphone tables. Examine the columns in both tables.
- In the EXCEPT section, complete the query to find all customers from the sq.saleslist table who have not responded to our sales call in sq.salesphone.
- How many customers have not responded to our phone call?
SAS® SQL 1: Essentials
Lesson 05, Section 2
Practice Level 1: Using the EXCEPT Set Operator
Use the EXCEPT set operator to generate a report listing of merchants in the sq.merchant table who are not listed in the sq.transaction table.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Write a query using the following requirements:
- Select MerchantID from the sq.merchant table.
- Use the EXCEPT set operator.
- Select MerchantID from the sq.transaction table.
- Order the results by MerchantID.
- Add an appropriate title.
- Run the query and view the results.
Solution:
/*s105s01.sas*/ title 'Merchants without Transactions'; proc sql; select MerchantID from sq.merchant except select MerchantID from sq.transaction order by MerchantID; quit; title;
- How many merchants do not have a transaction?
Solution:
six
SAS® SQL 1: Essentials
Lesson 05, Section 2
Practice Level 2: Using the EXCEPT Set Operator with the DISTINCT Keyword
Using the sq.statepopulation table, generate a list of state codes for states without any customers.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Write a query to list the unique Name values in the sq.statepopulation table. This list represents all available states.
Solution:
proc sql; select distinct Name from sq.statepopulation; quit;
- Write a separate query to list the unique State values from the sq.customer table. This list represents all states where customers reside.
Solution:
proc sql; select distinct State from sq.customer; quit;
- Combine the queries using the EXCEPT set operator to display states with no customers. Add an appropriate title and run the query.
Solution:
title 'States with No Customers'; proc sql; select Name from sq.statepopulation except select distinct State from sq.customer; quit; title;
- For which value (or values) of State are there no customers?
Solution:
PR
SAS® SQL 1: Essentials
Lesson 05, Section 2
Challenge Practice: Using Set Operators to Summarize Data
Determine what percentage of customers have accepted either the phone or email offer. The sq.saleslist table contains the full list of customers presented with an offer. The sq.salesemail and sq.salesphone tables contain email and phone responses.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Write a query to use a UNION SET operation to combine the CustomerID values for customers who accepted either the phone or email offer. This will form the basis of your in-line view.
Solution:
proc sql; select * from(select CustomerID from sq.salesemail where EmailResp = 'Accepted' union select CustomerID from sq.salesphone where PhoneResp = 'Accepted'); quit;
- Write a query to count the number of customers who have accepted either offer (step 1). Use the following requirements:
- Use the following formula to calculate the rate of offer acceptance:
select count(*)/(select count(*) from sq.saleslist) from (your in-line view code from step 1)
- Name the calculated column PctResp. Format it as a percent with no decimals.
- Label the new column Offer Acceptance Rate.
- Add an appropriate title to the report
- Run the query and view the results.
Solution:
/*s105s04.sas*/ title 'Acceptance Rate'; proc sql; select count(*)/(select count(*) from sq.saleslist) as PctResp format=percent5. label='Offer Acceptance Rate' from(select CustomerID from sq.salesemail where EmailResp = 'Accepted' union select CustomerID from sq.salesphone where PhoneResp = 'Accepted'); quit; title;
- Use the following formula to calculate the rate of offer acceptance:
- What is the value of Offer Acceptance Rate?
Solution:
50%
SAS® SQL 1: Essentials
Lesson 05, Section 3
Practice Level 1: Using the OUTER UNION Set Operator
Create a report that shows the email and phone offer responses along with the sales representative if available.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Using the sq.salesphone table as input, write a query to list the following columns:
- CustomerID
- a new column named Response based on the existing PhoneResp column
- SalesRep labeled Sales Rep
- a new column named Channel with the constant text Phone
- Run the query and view the results.
Solution:
proc sql; select CustomerID, PhoneResp as Response, SalesRep 'Sales Rep', 'Phone' as Channel from sq.salesphone; quit;
- Using the sq.salesemail table as input, write a query to list the following columns:
- CustomerID
- a new column named Response based on the existing EmailResp column
- a new column named Channel with the constant text Email
- Run the query and view the results.
Solution:
proc sql; select CustomerID, EmailResp as Response, 'Email' as Channel from sq.salesemail; quit;
- Combine the two query results using the OUTER UNION set operation. Use the following requirements:
- Be mindful of the column alignment. Use the SET operator modifiers as needed.
- Order the results by CustomerID and Response.
- Add an appropriate title.
- Run the query and view the final results.
Solution:
/*s105s05.sas*/ title 'Offer Results with Sales Rep'; proc sql; select CustomerID, PhoneResp as Response, SalesRep 'Sales Rep', 'Phone' as Channel from sq.salesphone outer union corr select CustomerID, EmailResp as Response, 'Email' as Channel from sq.salesemail order by 1,2; quit; title;
- To how many sales representatives can we attribute an accepted offer?
Solution:
two
SAS® SQL 1: Essentials
Lesson 06, Section 1 Activity
Open s106a01.sas from the activities folder and perform the following tasks to create and use user-defined macro variables:
- Run the program. Examine the log and output data. Confirm that the name of the newly created table is customerga and contains 957 rows.
- Replace the values GA and 650 in the %LET statements with NC and 700. Run the program. Examine the log and output data. What is the name of the newly created table? How many rows?
- Change the double quotation marks in the WHERE clause expression to single quotation marks. Run the query. How many rows are in the new table?
SAS® SQL 1: Essentials
Lesson 06, Section 2 Activity
Open s106a02.sas from the activities folder and perform the following tasks to store a number larger than eight digits in a macro variable:
- Run the program in the STEP 1 section to create the macro variables MaxPop and TotalCtry. The macro variables store the estimated maximum three-year population estimate for a country and the total number of countries. View the log. Notice that MaxPop stores a value in scientific notation.
- Examine STEP 2 of the program to find the country with the maximum estimated three-year population. Run the program to use the macro variables that you created. Confirm that no rows were returned.
- Add the FORMAT=10. column modifier to the max(estYear3Pop) column in STEP 1. Run the entire program. Which country has the largest three-year estimated population?
SAS® SQL 1: Essentials
Lesson 06, Section 2
Practice Level 1: Creating a Macro Variable from an SQL Query
Write a program that dynamically returns states in a specified region that have a three-year estimated population change greater than the median population change for the entire region.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Using the sq.statepopulation table, write a query to create the macro variable MedianEst to store the value of the median nPopChg3 of Region 1. Use the following requirements:
- Calculate the median of nPopChg3.
- Use the INTO clause to create a macro variable named MedianEst. Use the TRIMMED keyword.
- Filter the results for rows in the Region column with the value of 1, which is character.
- Add the NOPRINT option in the PROC SQL statement.
- Below the query, view the new macro variable in the log using %PUT:
%put &=MedianEst;
- Run the query and view the log.
Solution:
/*s106s01.sas*/ /*a*/ proc sql noprint; select median(nPopChg3) into :MedianEst trimmed from sq.statepopulation where Region="1"; quit; %put &=MedianEst;
- What is the value of MedianEst?
Solution:
3341
- Using the sq.statepopulation table, write another query to select states and their three-year population change in Region 1 that have a higher estimate than the median of all states in that region. Use the following requirements:
- Select the Name and nPopChg3 columns. Format nPopChg3 so that values are displayed with commas.
- Filter rows for states in region 1 that have an nPopChg3 value greater than the macro variable &MedianEst created in step 1.
- Order by nPopChg3 descending.
- Add the following titles:
- title 1 – States in Region 1 with a 3-Year Estimated Population Change Greater than the Median
- title 2 – Median Estimate: &MedianEst
- Run the query and view the results.
Solution:
/*b*/ title "States in Region 1 with a 3-Year Estimated Population Change Greater than the Median"; title2 "Median Estimate: &MedianEst"; proc sql; select Name,nPopChg3 format=comma14. from sq.statepopulation where Region="1" and nPopChg3 > &MedianEst order by nPopChg3 desc; quit; title;
- How many states are listed in the report?
Solution:
four
- At the beginning of your program, create a macro variable named RegionNum to specify the region to analyze. Use the following requirements:
- Use the %LET statement to create the macro variable RegionNum and set the value equal to 1.
- In your program, replace every location of the character value 1 with &RegionNum. Make sure that the macro variable is enclosed in double quotation marks. Hint: You need to do this replacement in the WHERE clause in each query and in the TITLE statement.
- Run the program and view the final results. Confirm that these results are the same as in step 3.
Solution:
/*c*/ %let RegionNum=1; proc sql noprint; select median(nPopChg3) into :MedianEst trimmed from sq.statepopulation where Region="&RegionNum"; quit; %put &=MedianEst; title "States in Region &RegionNum with a 3-Year Estimated Population Change Greater than the Median"; title2 "Median Estimate: &MedianEst"; proc sql; select Name,nPopChg3 format=comma14. from sq.statepopulation where Region="&RegionNum" and nPopChg3 > &MedianEst order by nPopChg3 desc; quit; title;
- In the %LET statement, replace the value 1 with 2. Run the entire program.
- How many states in Region 2 have a higher three-year estimated population change than the median of all states in that region?
Solution:
six
SAS® SQL 1: Essentials
Lesson 06, Section 2
Challenge Exercise: Splitting One Table into Many
In this practice, you split one table into many.
Reminder: If you restarted your SAS session, you must recreate the SQ library so you can access your practice files. In SAS Studio, open and submit the libname.sas program in the ESQ1M6 folder. In Enterprise Guide, run the Autoexec process flow.
- Write a program that dynamically creates a new table for each distinct value in a column. Use the Region column in the sq.globalmetadata table. Hint: Visit the Extended Learning page and view the SAS blog How to split one data set into many in the SAS Macro Language section or follow the direct link https://blogs.sas.com/content/sasdummy/2015/01/26/how-to-split-one-data-set-into-many/:
Solution:
/*s106s04.sas*/ %let table=sq.globalmetadata; %let column=Region; proc sql noprint; select distinct cat("DATA out_", compress(&column.,,'kad'), "; set &TABLE.(where=(&COLUMN.='", &column.,"')); run;") length=500 into :allsteps separated by ';' from &table.; quit; %macro runSteps; &allsteps.; %mend; %runSteps;
- How many tables were created?
Solution:
eight
SAS® SQL 1: Essentials
Lesson 07, Section 2 Activity
Open s107a01.sas from the activities folder and perform the following tasks to pass Microsoft Access SQL to the database:
Note: You will be NOT be able to do this activity if you are using SAS OnDemand for Academics because this platforms does not have the software necessary to read Microsoft Access files. In addition, you might not be able to do this activity if you are using your own installation of SAS Enterprise Guide.
- Examine the native DBMS query. Run the entire query. Did it produce an error?
- In the WHERE clause, replace IS MISSING with IS NULL. Run the entire query. Did it run successfully?
SAS® SQL 1: Essentials
Lesson 07, Section 3 Activity
Open a browser and perform the following tasks to view SAS DBMS-Specific Reference documentation:
Note: You will be NOT be able to do this activity if you are using SAS OnDemand for Academics because this platforms does not have the software necessary to read Microsoft Access files. In addition, you might not be able to do this activity if you are using you own installation of SAS Enterprise Guide.
- Access SAS Help at http://support.sas.com/documentation.
- Under Popular Documentation, select Programming: SAS 9.4 and Viya.
- Scroll down until you find the Accessing Data section. Inside the section, find SAS/ACCESS and select Relational Databases.
- Scroll down and find the DBMS-Specific Reference section and select a reference of your choice. Review the topic.
- Click the Passing SAS Functions to link and review the SAS functions that pass to your DBMS for processing.
SAS® SQL 1: Essentials
Lesson 07, Section 4 Activity
Open s107a03.sas from the activities folder and perform the following tasks to perform a PROC FEDSQL query:
- Examine and run the query. Did it produce an error?
- In the WHERE clause, replace the double quotation marks around NC with single quotation marks. Run the entire query. Did it run successfully?