## Introduction to Statistical Concepts

Lesson 01, Section 2 Setup for Practices (Required)

To complete the activities, demos, and practices in this course, you must **have access to SAS software** and **set up your practice files**.

### Follow these instructions to use free software from SAS

**Step 1: Access SAS OnDemand for Academics**

To use SAS OnDemand for Academics, all you need is a web browser with an internet connection. Sign in with your SAS profile email address and password (the same one you use to access this training). Then, register for a SAS OnDemand for Academics account. You can use this same link to sign in to SAS OnDemand for Academics after you register.

**Step 2: Set up your practice files**

Setup instructions for SAS OnDemand for Academics

### Follow these instructions to use your own SAS software

**Step 1: Start your SAS software**

**Step 2: Set up your practice files**

Click **Open** next to the software you will use for this course, and then follow the instructions to set up your data. These instructions cannot be used with SAS OnDemand for Academics.

- Setup Instructions for SAS Studio
- Setup Instructions for SAS Enterprise Guide
- Setup Instructions for SAS Windowing Environment

Lesson 02, Section 2 Practice

You want to determine whether the true mean body temperature is 98.6 degrees and whether women's body temperatures are the same as men's. The **Statdata.NormTemp** data set contains the data that you need.

**Reminder**: Make sure you've defined the **Statdata** library.
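If the library is not yet defined, a LIBNAME statement along the following lines assigns it. This is only a sketch: the path shown is a hypothetical placeholder, not a location taken from the course materials, so substitute the folder where your practice files were set up.

```sas
/* Assign the Statdata libref. The path below is a hypothetical placeholder; */
/* replace it with the location of your own practice files.                  */
libname statdata "/path/to/your/practice/files";

/* Optional check: list the data sets in the library to confirm that */
/* NormTemp is present.                                               */
proc contents data=statdata._all_ nods;
run;
```

If the log reports that the libref was successfully assigned and the listing includes NormTemp, the practices in this lesson should be able to read the data.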

- Use PROC MEANS to determine the overall mean and standard deviation of the variable **BodyTemp** in the data set **NormTemp**. Ensure that SAS displays statistics for all requested combinations of the class variable.

Submit the following code.

    proc means data=statdata.normtemp maxdec=2 fw=10 printalltypes
               n mean std q1 q3;
       var BodyTemp;
       class Gender;
       title 'Selected Descriptive Statistics for Body Temp';
    run;
    title;

Review the Summary Statistics output. The overall mean is 98.25. The standard deviation is 0.73.

- Do the mean values seem to differ between men and women?

The values do differ somewhat.

- What is the interquartile range of body temperature?

The interquartile range is 0.90 (98.70 - 97.80). Another option for determining the interquartile range is to include the keyword QRANGE in the list of statistics specified in the PROC MEANS statement. That way SAS will calculate this statistic for you.
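The QRANGE alternative mentioned above could look like the following sketch. It reuses the PROC MEANS step from this practice and only adds the QRANGE keyword to the statistics list, so that SAS reports the interquartile range directly.

```sas
/* Same PROC MEANS step as in this practice, with QRANGE added so that */
/* SAS reports the interquartile range (Q3 - Q1) directly.             */
proc means data=statdata.normtemp maxdec=2 fw=10 printalltypes
           n mean std q1 q3 qrange;
   var BodyTemp;
   class Gender;
   title 'Selected Descriptive Statistics for Body Temp';
run;
title;
```

The overall row should then show the interquartile range alongside Q1 and Q3, so the subtraction does not have to be done by hand.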

Lesson 02, Section 3 Practice

You need to determine if the variables **BodyTemp** and **HeartRate** in the **Statdata.NormTemp** data set are normally distributed and if average body temperature is truly 98.6 degrees.

**Reminder**: Make sure you've defined the **Statdata** library.

- Determine the minimum, maximum, mean, and standard deviation for the variables **BodyTemp** and **HeartRate** in the **NormTemp** data set. Also calculate the skewness and kurtosis statistics. Do the variables appear to be normally distributed?

You use the NOPRINT option in both the PROC UNIVARIATE and HISTOGRAM statements to suppress the printing of the tabular output. Because the statistics are being reported in the insets of the plots, they are not needed in the output tables.

    proc univariate data=statdata.normtemp noprint;
       var BodyTemp HeartRate;
       histogram BodyTemp HeartRate / normal(mu=est sigma=est noprint);
       inset min max skewness kurtosis / position=ne;
       probplot BodyTemp HeartRate / normal(mu=est sigma=est);
       inset min max skewness kurtosis;
       title 'Descriptive Statistics Using PROC UNIVARIATE';
    run;
    title;

The distributions for both variables look approximately normal.

- Create box plots for the **BodyTemp** and **HeartRate** variables. Use **ID** to identify outliers. For **BodyTemp**, display a reference line at 98.6 degrees. Does the average body temperature seem to be 98.6 degrees?

Submit the following code.

    proc sgplot data=statdata.normtemp;
       refline 98.6 / axis=y lineattrs=(color=blue);
       vbox BodyTemp / datalabel=ID;
       format ID 3.;
       title "Box Plots of Body Temps";
    run;

    proc sgplot data=statdata.normtemp;
       vbox HeartRate / datalabel=ID;
       format ID 3.;
       title "Box Plots of Heart Rate";
    run;
    title;

The average body temperature seems to be somewhat less than 98.6 degrees.

- In the **NormTemp** data set, which of the following phrases best describes the distributions of **BodyTemp** and **HeartRate**?
  - a. pretty close to normal
  - b. left-skewed
  - c. right-skewed
  - d. to have high positive kurtosis
  - e. to have high negative kurtosis

**a**. Because the histograms are bell shaped and the data follow the diagonal reference lines in the normal probability plots, the variables **BodyTemp** and **HeartRate** both appear to be approximately normally distributed. The skewness and kurtosis statistics are also fairly close to zero for both variables, which supports the same conclusion.

Lesson 02, Section 4 Practice

You need to generate a 95% confidence interval for the mean of the variable **BodyTemp** in the **Statdata.NormTemp** data set.

**Reminder**: Make sure you've defined the **Statdata** library.

- Use PROC MEANS to generate a 95% confidence interval for the mean of **BodyTemp** in the **NormTemp** data set. Is the assumption of normality met to produce a confidence interval for these data?

Submit the following code.

    proc means data=statdata.normtemp maxdec=2 n mean stderr clm;
       var BodyTemp;
       title '95% Confidence Interval for Body Temp';
    run;
    title;

Yes, the normality assumption seems to hold, both because the sample size is large enough for the sample mean to be approximately normally distributed and because the data values themselves appear to be approximately normally distributed.

- What is the confidence interval?

The 95% confidence interval is 98.12 to 98.38 degrees Fahrenheit.

- How do you interpret this interval with regard to the true population mean for body temperature?

You are 95% confident that the true mean body temperature for the population of all people in the world is somewhere between 98.12 and 98.38 degrees.
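As a sketch of where these limits come from: the CLM keyword in PROC MEANS requests two-sided confidence limits for the mean based on the *t* distribution. The symbols below (sample mean $\bar{x}$, sample standard deviation $s$, sample size $n$) are introduced here for illustration, and $n$ is not stated in this section.

$$
\bar{x} \;\pm\; t_{1-\alpha/2,\;n-1}\,\frac{s}{\sqrt{n}}, \qquad \alpha = 0.05
$$

A quick consistency check against the reported values: the interval is centered at the sample mean and extends the same distance on either side.

$$
\frac{98.12 + 98.38}{2} = 98.25, \qquad \frac{98.38 - 98.12}{2} = 0.13
$$

The midpoint matches the overall mean of 98.25 reported in the Section 2 practice, and 0.13 is the margin of error.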

Lesson 02, Section 5 Practice

You need to perform a one-sample *t*-test for the variable **BodyTemp** in the **Statdata.NormTemp** data set to determine whether average body temperature is truly 98.6 degrees.

**Reminder**: Make sure you've defined the **Statdata** library.

- Use PROC UNIVARIATE to perform a one-sample *t*-test to determine whether the mean of the variable **BodyTemp** in the data set **NormTemp** is truly the value 98.6. What is the value of the *t* statistic and the corresponding *p*-value?

Submit the following code:

    ods select testsforlocation;
    proc univariate data=statdata.normtemp mu0=98.6;
       var BodyTemp;
       title 'Testing Whether the Mean Body Temperature = 98.6';
    run;
    title;

The value of the *t* statistic and the corresponding *p*-value are -5.45 and <.0001, respectively.
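For reference, the one-sample *t* statistic has the form below. The symbols ($\bar{x}$ for the sample mean, $s$ for the sample standard deviation, $n$ for the sample size, $\mu_0$ for the hypothesized mean) are introduced here for illustration only.

$$
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}
$$

As a rough check using the rounded values reported in these practices ($\bar{x} \approx 98.25$, $\mu_0 = 98.6$), the numerator is about $-0.35$, so the reported statistic implies a standard error of roughly

$$
\frac{s}{\sqrt{n}} \approx \frac{-0.35}{-5.45} \approx 0.064 .
$$

Because the inputs are rounded to two decimal places, this figure is only approximate.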

- What is the null hypothesis?

The population mean is equal to 98.6.

- What is the alternative hypothesis?

The population mean is not equal to 98.6.

- Using an alpha level of 0.05, do you reject or fail to reject the null hypothesis?

Because the *p*-value is less than the stated alpha level of 0.05, you reject the null hypothesis.
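The same hypothesis test can also be run with PROC TTEST. This is an alternative sketch rather than the approach used in this practice: the H0= option sets the hypothesized mean, and ALPHA= sets the level used for the confidence limits (0.05 is the default, shown here only to make it explicit). The title text is illustrative.

```sas
/* Alternative sketch: one-sample t-test with PROC TTEST.           */
/* H0= gives the null-hypothesis mean; ALPHA= gives the level used  */
/* for the confidence limits (0.05 is the default).                 */
proc ttest data=statdata.normtemp h0=98.6 alpha=0.05;
   var BodyTemp;
   title 'One-Sample t-Test for Mean Body Temperature = 98.6';
run;
title;
```

The output should report the *t* value, degrees of freedom, and two-sided *p*-value for the test, along with confidence limits for the mean, so the reject or fail-to-reject decision can be read from it in the same way.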