Work with data falls into two main camps, traditional statistics and machine learning, and the two complement each other. Statistics remains highly relevant irrespective of the "bigness" of data: its role is what it has always been, and it is even more important now. There is also a need to move from traditional statistical modeling into the machine learning world. This course introduces the statistical background necessary for machine learning using SAS Viya. Knowledge of the statistics relevant to machine learning helps prepare you to become a data scientist. The course prepares you for future instruction in machine learning, including its underlying methodology and statistical foundations, and enables you to develop a deeper understanding of machine learning models.

This course is a prerequisite to many of the courses in the data science curriculum. A more advanced treatment of machine learning occurs in the courses Supervised Machine Learning Pipelines Using SAS® Viya®, Interactive Machine Learning in SAS® Viya®, SAS® Visual Statistics in SAS® Viya®: Interactive Model Building, and Supervised Machine Learning Using SAS® Viya™.

For students interested in statistics for inference and explanatory analysis used in scientific and medical research, Statistics I: Introduction to ANOVA, Regression, and Logistic Regression is an appropriate foundational course.

Learn How To

Explain the relevance of statistics in big data and machine learning.

Relate statistical and data science terminology.

Generate descriptive statistics and explore data with graphs.

Detect associations among variables.

Perform linear regression for explanatory modeling.

Compare explanatory modeling with predictive modeling.

Describe the trade-off between bias and variance.

Fit a logistic regression model for predictive modeling.

Score new data.

Explain the statistical foundations of machine learning.

Discuss data difficulties and modeling issues and their statistical solutions.

Who Should Attend

Anyone in the field of data science who does not yet have a deep understanding of statistical and machine learning concepts, or who wants to enhance that knowledge. This might include business analysts, data analysts, marketing analysts, marketing managers, data scientists, data engineers, financial analysts, data miners, statisticians, mathematicians, and others who work in allied areas.

Prerequisites

Before attending this course, you should have experience using computer software. It is beneficial if you have completed the equivalent of an undergraduate course in statistics covering distribution of data, p-values, hypothesis testing, and regression. No prior SAS experience is needed.

SAS Products Covered

SAS Viya

Course Outline

Statistics and Machine Learning

Relevance of statistics in big data and machine learning.

Terminology and vocabulary.

Introduction to SAS Viya and SAS Studio.

Fundamental Statistical Concepts

Introduction to statistical analysis.

Descriptive statistics.

Inferential statistics.

Explanatory Modeling Using Linear Regression

Correlation and simple linear regression.

Multiple regression and model selection.

Model diagnostics.

Predictive Modeling Using Logistic Regression

Introduction to predictive modeling.

Categorical associations.

Logistic regression model.

Model deployment.

Statistical Foundations of Machine Learning

Overview of machine learning.

Data pre-processing for machine learning models.

Model evaluation, estimation, and post-training tasks.