A typical organization loses an estimated 5 percent of its yearly revenue to fraud. This course shows how learning fraud patterns from historical data can be used to fight fraud. The course discusses the use of supervised learning (using a labeled data set), unsupervised learning (using an unlabeled data set), and social network learning (using a networked data set). The techniques can be applied across a wide variety of fraud applications, such as insurance fraud, credit card fraud, anti-money laundering, healthcare fraud, telecommunications fraud, click fraud, tax evasion, and counterfeiting. The course provides a mix of both theoretical and technical insights, as well as practical implementation details. During the course, the instructor reports extensively on his recent research insights about the topic. Various real-life case studies and examples are presented for further clarification.
Learn How To
Preprocess data for fraud detection (sampling, missing values, outliers, categorization, and so on).
Build fraud detection models using supervised analytics (logistic regression, decision trees, neural networks, ensemble models, and so on).
Build fraud detection models using unsupervised analytics (hierarchical clustering, non-hierarchical clustering, k-means, self-organizing maps, and so on).
Build fraud detection models using social network analytics (homophily, featurization, egonets, PageRank, bigraphs, and so on).
Who Should Attend
Fraud analysts, data miners, and data scientists; consultants working in fraud detection; validators auditing fraud models; and researchers in financial services companies, banks, insurance companies, government institutions, healthcare institutions, and consulting firms.
Prerequisites
Before attending this course, you should have a basic knowledge of statistics, including descriptive statistics, confidence intervals, and hypothesis testing.
SAS Products Covered
SAS Enterprise Miner
Course Outline
Fraud Detection
The importance of fraud detection. Defining fraud. Anomalous behavior. Fraud cycle. Types of fraud. Examples of insurance fraud and credit card fraud. Key characteristics of successful fraud analytics models. Fraud detection challenges. Approaches to fraud detection.
Data Preprocessing
Motivation. Types of variables. Sampling. Visual data exploration. Missing values. Outlier detection and treatment. Standardizing data. Transforming data. Coarse classification and grouping of attributes. Recoding categorical variables. Segmentation. Variable selection.
Supervised Methods for Fraud Detection
Target definition. Linear regression. Logistic regression. Decision trees. Ensemble methods: bagging, boosting, random forests. Neural networks. Dealing with skewed class distributions. Evaluating fraud detection models.
Unsupervised Methods for Fraud Detection
Unsupervised learning. Clustering approaches: hierarchical clustering, k-means clustering, self-organizing maps. Peer group analysis. Break point analysis.
Social Networks for Fraud Detection
Social networks and applications. Is fraud a social phenomenon? Social network components. Visualizing social networks. Social network metrics. Community mining. Social-network-based inference (network classifiers and collective inference). From unipartite toward bipartite graphs. Featurizing a bigraph. Fraud propagation. Case study.
Fraud Analytics: Putting It All to Work
Quantitative monitoring: backtesting, benchmarking. Qualitative monitoring: data quality, model design, documentation, corporate governance.