This is the second course in the Data Curation Professional, SAS Academy for Data Science program. The program is required to earn your SAS data science certification. Designed for SAS data scientists, this program covers SAS data curation techniques, including big data preparation with Hadoop. In this course, you discover how to access your data from a variety of sources, create processes to manage and transform data, and ensure the reliability and consistency of your data.
Learn How To
Read and write data with SAS/ACCESS technologies.
Perform extract, transform, and load (ETL) tasks using SAS Data Integration Studio.
Discover capabilities of the SAS Quality Knowledge Base.
Use DataFlux Data Management Studio to understand and improve your data.
Understand the structure and functionality of the SAS Quality Knowledge Base.
Access the components of SAS Quality Knowledge Base programmatically using SAS code.
Prerequisites
Before attending this course, you should have:
Experience with SAS programming basics and data manipulation techniques.
Familiarity with SQL processing.
You can gain this experience by completing the SAS Programming 1: Essentials, SAS Programming 2: Data Manipulation Techniques, and SAS SQL 1: Essentials courses.
SAS Products Covered
DataFlux Web Studio Server
SAS/ACCESS
SAS Data Integration Studio
DataFlux Data Management Server
Course Outline
SAS/ACCESS Technology Overview
SAS/ACCESS technology overview.
SAS Data Integration Studio: Essentials
Exploring the SAS platform and SAS Data Integration Studio.
Exploring SAS Data Integration Studio basics.
Examining SAS Data Integration Studio jobs and options.
SAS Data Integration Studio: Defining Source Data Metadata
Setting up the environment.
Defining metadata for a library.
Registering metadata for data sources.
Registering SAS table metadata.
Registering DBMS table metadata.
Registering ODBC data source table metadata.
Registering metadata for external files.
SAS Data Integration Studio: Defining Target Data Metadata
Registering metadata for target tables.
Importing metadata.
SAS Data Integration Studio: Working with Jobs
Creating metadata for jobs.
Working with the Join transformation.
SAS Data Integration Studio: Working with Transformations
Working with the Extract and Summary Statistics transformations.
Exploring the SQL transformations.
Creating custom transformations.
Introduction to Data Quality and the SAS Quality Knowledge Base
Introduction to data quality.
SAS Quality Knowledge Base overview.
DataFlux Data Management Studio: Essentials
Overview of Data Management Studio.
DataFlux Repositories.
Quality Knowledge Bases and reference data sources.
Data connections.
DataFlux Data Management Studio: Understanding Data
Methodology review.
Creating data collections.
Designing data explorations.
Creating data profiles.
Profiling other input types.
Designing data standardization schemes.
DataFlux Data Management Studio: Building Data Jobs to Improve Data
Introduction to data jobs.
Standardization, parsing, and casing.
Identification analysis and right fielding.
Branching and gender analysis.
Data enrichment.
DataFlux Data Management Studio: Building Data Jobs for Entity Resolution
Creating match codes.
Clustering records.
Survivorship.
Understanding the SAS Quality Knowledge Base (QKB)
Working with QKB component files.
Working with QKB definitions.
Using SAS Code to Access QKB Components
SAS configuration options for accessing the QKB.
SAS Data Quality Server overview.
The hands-on lab is preconfigured for this course and might not support hands-on practice for your other enrolled courses.
Hands-On Lab Reservation System
When you are planning your study time, keep in mind that the virtual lab takes 45 to 60 minutes to start.