Lesson 01

Introduction to Statistical Concepts
Lesson 01, Section 2 Setup for Practices (Required)

To complete the activities, demos, and practices in this course, you must have access to SAS software and set up your practice files.

Follow these instructions to use free software from SAS

Step 1: Access SAS OnDemand for Academics

To use SAS OnDemand for Academics, all you need is a web browser with an internet connection. Sign in with your SAS profile email address and password (the same one you use to access this training). Then, register for a SAS OnDemand for Academics account. You can use this same link to sign in to SAS OnDemand for Academics after you register.

Step 2: Set up your practice files

Setup instructions for SAS OnDemand for Academics


Follow these instructions to use your own SAS software

Step 1: Start your SAS Software

Step 2: Set up your practice files

Click Open next to the software you will use for this course, and then follow the instructions to set up your data. These instructions cannot be used with SAS OnDemand for Academics.

Setup Instructions for SAS Studio

Setup Instructions for SAS Enterprise Guide

Setup Instructions for SAS Windowing Environment


Lesson 02

Introduction to Statistical Concepts
Lesson 02, Section 2 Practice

You want to determine whether the true mean body temperature is 98.6 degrees and whether women's body temperatures are the same as men's body temperatures. The Statdata.NormTemp data set contains the data that you need.

Reminder: Make sure you've defined the Statdata library.
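
A minimal sketch of defining the library, assuming the practice files were saved to a folder on your system. The path shown is a hypothetical placeholder, not the actual course location; substitute the folder that was created during setup.

```sas
/* Assign the libref STATDATA to the folder that holds the practice data. */
/* The path below is a hypothetical placeholder; replace it with your own. */
libname statdata "/path/to/your/practice/files";

/* Optional check: list the contents of NormTemp to confirm the libref works. */
proc contents data=statdata.normtemp;
run;
```

If the libref is assigned correctly, the PROC CONTENTS step should list the variables in NormTemp, including BodyTemp, HeartRate, Gender, and ID.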

  1. Use PROC MEANS to determine the overall mean and standard deviation of the variable BodyTemp in the data set NormTemp. Ensure that SAS displays statistics for all requested combinations of the class variable.

    Submit the following code.
       proc means data=statdata.normtemp
                  maxdec=2 fw=10 printalltypes
                  n mean std q1 q3;
          var BodyTemp;
          class Gender;
          title 'Selected Descriptive Statistics for Body Temp';
       run;
       title;
       
    Review the Summary Statistics output. The overall mean is 98.25. The standard deviation is 0.73.

  2. Do the mean values seem to differ between men and women?

    The values do differ somewhat.

  3. What is the interquartile range of body temperature?

    The interquartile range is 0.90 (98.70 - 97.80). Another option for determining the interquartile range is to include the keyword QRANGE in the list of statistics specified in the PROC MEANS statement. That way SAS will calculate this statistic for you.
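
As a sketch of the QRANGE approach described above, the following step requests the quartiles and the interquartile range together. It assumes the same Statdata.NormTemp data set; the title text is illustrative only.

```sas
/* Request Q1, Q3, and the interquartile range (QRANGE = Q3 - Q1). */
proc means data=statdata.normtemp maxdec=2 q1 q3 qrange;
   var BodyTemp;
   title 'Quartiles and Interquartile Range for Body Temp';
run;
title;
```

The QRANGE column should agree with the difference between the Q3 and Q1 columns.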

Introduction to Statistical Concepts
Lesson 02, Section 3 Practice

You need to determine if the variables BodyTemp and HeartRate in the Statdata.NormTemp data set are normally distributed and if average body temperature is truly 98.6 degrees.

Reminder: Make sure you've defined the Statdata library.

  1. Determine the minimum, maximum, mean, and standard deviation for the variables BodyTemp and HeartRate in the NormTemp data set. Also calculate the skewness and kurtosis statistics. Do the variables appear to be normally distributed?

    You use the NOPRINT option in both the PROC UNIVARIATE and HISTOGRAM statements to suppress the printing of the tabular output. Because the statistics are being reported in the insets of the plots, they are not needed in the output tables.
    proc univariate data=statdata.normtemp noprint;
       var BodyTemp HeartRate;
       histogram BodyTemp HeartRate / normal(mu=est sigma=est noprint);
       inset min max skewness kurtosis / position=ne;
       probplot BodyTemp HeartRate / normal(mu=est sigma=est);
       inset min max skewness kurtosis;
       title 'Descriptive Statistics Using PROC UNIVARIATE';
    run;
    title;
    
    The distributions for both variables look approximately normal.

  2. Create box plots for the BodyTemp and HeartRate variables. Use ID to identify outliers. For BodyTemp, display a reference line at 98.6 degrees. Does the average body temperature seem to be 98.6 degrees?

    Submit the following code.
    proc sgplot data=statdata.normtemp;
       refline 98.6 / axis=y lineattrs=(color=blue);
       vbox BodyTemp / datalabel=ID;
       format ID 3.;
       title "Box Plots of Body Temps";
    run;
    proc sgplot data=statdata.normtemp;
       vbox HeartRate / datalabel=ID;
       format ID 3.;
       title "Box Plots of Heart Rate";
    run;
    title;
    
    The average body temperature seems to be somewhat less than 98.6 degrees.

  3. In the NormTemp data set, which of the following best describes the distributions of BodyTemp and HeartRate?

    a. pretty close to normal
    b. left-skewed
    c. right-skewed
    d. high positive kurtosis
    e. high negative kurtosis

    The correct answer is a. Because the histograms are bell shaped and the data follow the diagonal reference lines in the normal probability plots, BodyTemp and HeartRate both appear to be approximately normally distributed. The skewness and kurtosis statistics are also fairly close to zero for both variables, which is consistent with approximate normality.

Introduction to Statistical Concepts
Lesson 02, Section 4 Practice

You need to generate a 95% confidence interval for the mean of the variable BodyTemp in the Statdata.NormTemp data set.

Reminder: Make sure you've defined the Statdata library.

  1. Use PROC MEANS to generate a 95% confidence interval for the mean of BodyTemp in the NormTemp data set. Is the assumption of normality met to produce a confidence interval for these data?

    Submit the following code.
       proc means data=statdata.normtemp maxdec=2
                  n mean stderr clm;
          var BodyTemp;
          title '95% Confidence Interval for Body Temp';
       run;
       title;
    
    Yes, the normality assumption seems to hold. The sample size is large enough for the sample mean to be approximately normally distributed, and the data values themselves appear to be normally distributed.

  2. What is the confidence interval?

    The 95% confidence interval is 98.12 to 98.38 degrees Fahrenheit.

  3. How do you interpret this interval with regard to the true population mean for body temperature?

    You are 95% confident that the true mean body temperature for the population of all people in the world is somewhere between 98.12 and 98.38 degrees.
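
To make the CLM output less of a black box, the interval can be rebuilt from its parts: the sample mean plus or minus a t critical value times the standard error. The following is a minimal sketch under that standard formula. The data set name ci_parts and the variable names n_obs, xbar, se, t_crit, lower, and upper are hypothetical and chosen for illustration.

```sas
/* Save the sample size, mean, and standard error to a WORK data set. */
proc means data=statdata.normtemp noprint;
   var BodyTemp;
   output out=ci_parts n=n_obs mean=xbar stderr=se;
run;

/* Compute the 95% limits: mean +/- t(0.975, n-1) * standard error. */
data _null_;
   set ci_parts;
   t_crit = tinv(0.975, n_obs - 1);   /* two-sided 95% critical value */
   lower  = xbar - t_crit * se;
   upper  = xbar + t_crit * se;
   put lower=8.2 upper=8.2;           /* written to the SAS log */
run;
```

The limits written to the log should agree with the lower and upper CLM values reported by PROC MEANS.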

Introduction to Statistical Concepts
Lesson 02, Section 5 Practice

You need to perform a one-sample t-test for the variable BodyTemp in the Statdata.NormTemp data set to confirm whether average body temperature is truly 98.6 degrees.

Reminder: Make sure you've defined the Statdata library.

  1. Use PROC UNIVARIATE to perform a one-sample t-test to determine whether the mean of the variable BodyTemp in the data set NormTemp is truly the value 98.6. What is the value of the t statistic and the corresponding p-value?

    Submit the following code:
       ods select testsforlocation;
       proc univariate data=statdata.normtemp mu0=98.6;
          var BodyTemp;
          title 'Testing Whether the Mean Body Temperature = 98.6';
       run;
       title;
    
    The value of the t statistic is -5.45, and the corresponding p-value is <.0001.

  2. What is the null hypothesis?

    The population mean is equal to 98.6.

  3. What is the alternative hypothesis?

    The population mean is not equal to 98.6.

  4. Using an alpha of 0.05, do you reject or fail to reject the null hypothesis?

    Because the p-value is less than the stated alpha level of 0.05, you reject the null hypothesis.
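
The t statistic reported by PROC UNIVARIATE can be checked by hand: it is the difference between the sample mean and the hypothesized value, divided by the standard error of the mean. The following is a minimal sketch of that calculation. The data set name t_parts and the variable names n_obs, xbar, se, mu0, t_stat, and p_value are hypothetical and chosen for illustration.

```sas
/* Save the sample size, mean, and standard error to a WORK data set. */
proc means data=statdata.normtemp noprint;
   var BodyTemp;
   output out=t_parts n=n_obs mean=xbar stderr=se;
run;

/* Compute the one-sample t statistic and its two-sided p-value. */
data _null_;
   set t_parts;
   mu0     = 98.6;                                       /* hypothesized mean */
   t_stat  = (xbar - mu0) / se;
   p_value = 2 * (1 - probt(abs(t_stat), n_obs - 1));    /* two-sided */
   put t_stat=8.2 p_value=pvalue6.4;                     /* written to the SAS log */
run;
```

The values in the log should agree with the Student's t row of the Tests for Location table.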