Copyright © 2012, SAS Institute Inc. All rights reserved. INTRODUCTION TO DATA MINING SAS ® ENTERPRISE MINER™ Mary-Elizabeth (“M-E”) Eddlestone Principal Systems Engineer, Analytics SAS Customer Loyalty, SAS Institute, Inc.

Page 1: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

INTRODUCTION TO DATA MINING
SAS® ENTERPRISE MINER™

Mary-Elizabeth (“M-E”) Eddlestone
Principal Systems Engineer, Analytics
SAS Customer Loyalty, SAS Institute, Inc.

Page 2: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

AGENDA

• Overview/Introduction to Data Mining and Predictive Modeling

• Building Models Using SAS® Enterprise Miner™
  • Walk through example
  • Essential steps: Sample, Explore, Modify, Model, Assess, Score
  • Show selection of tools, how to change their properties and surface results

• Building Automated Models using Excel or SAS® Enterprise Guide (Rapid Predictive Modeler)

Page 3: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

INTRODUCTION TO DATA MINING

Page 5: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

DATA MINING GOALS

[Diagram labels: Better Decisions; AGILE or DYNAMIC; IMPROVED PROFITABILITY; INSIGHT; PRECISION; SPEED; PERSONALIZATION]

Page 7: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

ANALYTICS INFERENTIAL

Inferential Statistics
Uses patterns in the sample data to draw inferences about the population represented, accounting for randomness

• Answering yes/no questions about the data (hypothesis testing)
• Describing associations within the data (correlation)
• Modeling relationships within the data (regression)

Source: Wikipedia
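
As a rough illustration of the association and regression ideas on this slide, the sketch below computes a Pearson correlation and a least-squares line in plain Python. The function names, variable names, and toy numbers are made up for this example and are not taken from the presentation.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def pearson_r(x, y):
    """Correlation: strength of the linear association between x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def least_squares_line(x, y):
    """Regression: slope and intercept of the best-fitting straight line."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


# Hypothetical sample: customer age and monthly spend.
age = [25, 32, 41, 48, 56]
spend = [110, 150, 190, 240, 260]

print("correlation:", round(pearson_r(age, spend), 3))
print("slope, intercept:", least_squares_line(age, spend))
```

A hypothesis test would go one step further and ask whether a correlation this large could plausibly arise by chance in a sample of this size.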

Page 8: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

ANALYTICS PREDICTIVE

Predictive Analytics
Encompasses a variety of techniques from statistics, modeling, machine learning, and data mining that analyze current and historical facts to make predictions about future, or otherwise unknown, events. Includes:

• Data Mining
• Forecasting

Source: Wikipedia

Page 9: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

ANALYTICS DATA MINING VERSUS FORECASTING

• DATA MINING
  • Time independent
  • Causal (relationship) focused
  • Categorical, Continuous, Discrete
  • Seldom weights more recent observations

• FORECASTING
  • Time dependent
  • Interval oriented
  • Continuity assumed
  • Frequently weights more recent phenomena

Both are predictive and both model past behavior.
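
One way to see the "weights more recent" distinction is to compare a plain average, which treats every observation equally, with simple exponential smoothing, which gives recent observations more influence. This is only a sketch: the smoothing method, the `alpha` value, and the sales figures are assumptions chosen for illustration.

```python
def simple_mean(values):
    """Every observation counts equally, regardless of when it occurred."""
    return sum(values) / len(values)


def exponentially_weighted_mean(values, alpha):
    """Simple exponential smoothing; values are ordered oldest to newest.

    A larger alpha puts more weight on the most recent observations.
    """
    level = values[0]
    for value in values[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


# Hypothetical monthly sales with a jump in the latest month.
monthly_sales = [100, 100, 100, 100, 160]

print(simple_mean(monthly_sales))                       # 112.0
print(exponentially_weighted_mean(monthly_sales, 0.5))  # 130.0
```

The recency-weighted estimate moves much closer to the latest value, which is the behavior the forecasting column describes.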

Page 10: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

DATA MINING

• Descriptive Data Mining
• Predictive Data Mining

Page 11: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

DATA MINING

• Descriptive Data Mining
  • Clustering (Segmentation)
  • Associations and Sequences

• Predictive Data Mining
  • Classification Models to predict class membership
  • Regression Models to predict a number
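
To make the descriptive side concrete, here is a minimal one-dimensional k-means sketch. K-means is one common clustering technique, chosen here as an assumption; the slides do not say which method is meant. The spend values and starting centers are hypothetical.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group values around the nearest center, then move each center to its group mean."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for value in values:
            nearest = min(range(len(centers)), key=lambda i: abs(value - centers[i]))
            clusters[nearest].append(value)
        # An empty cluster keeps its previous center.
        centers = [
            sum(group) / len(group) if group else centers[i]
            for i, group in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical monthly spend per customer; no target variable is involved.
spend = [12, 15, 14, 80, 85, 90]
centers, segments = kmeans_1d(spend, centers=[12, 90])
print(centers)
print(segments)
```

Note that no outcome is being predicted here; the segments simply describe structure in the data, which is what separates descriptive from predictive data mining.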

Page 12: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

THE GOAL? SCORING!

• Scoring is the act of applying what we’ve learned from data mining to new cases.

• Keep this goal in mind and use it to help formulate the questions and the data needed for data mining and scoring.

Page 13: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

THE ULTIMATE GOAL? BETTER DECISIONS

• The ultimate goal of data mining is to improve decision making.

• As you formulate your problem, also keep in mind how and when model scores will be used.

Page 14: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

EXAMPLE DEVELOPING A CLASSIFICATION MODEL

• Models are developed using historical data in which the behavior is observed or known.

• Information about each subject (in this case an individual), such as age, gender, and previous behaviors, is used as input to the model to see how well the model can distinguish between the people who exhibit the behavior and those who do not.

[Figure legend: marker indicates the behavior was observed in this subject]
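
The idea of learning from historical cases with a known outcome can be sketched with a small logistic regression fitted by gradient descent. This is an illustrative stand-in, not the algorithm SAS Enterprise Miner uses; the two inputs (already scaled to roughly -1 to 1), the data, and the training settings are all hypothetical.

```python
from math import exp


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def predict_probability(weights, bias, x):
    """Estimated probability that the behavior occurs for input vector x."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def fit_logistic(rows, targets, learning_rate=0.1, epochs=2000):
    """Batch gradient descent on the logistic (log-loss) objective."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, targets):
            error = predict_probability(weights, bias, x) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Hypothetical history: [scaled age, scaled prior activity] and the observed behavior.
rows = [[-1.0, -0.5], [-0.6, -1.0], [-0.8, -0.2], [0.7, 0.9], [1.0, 0.4], [0.5, 0.8]]
targets = [0, 0, 0, 1, 1, 1]

weights, bias = fit_logistic(rows, targets)
for x, y in zip(rows, targets):
    print(x, y, round(predict_probability(weights, bias, x), 3))
```

The fitted weights and bias together form the equation that is later applied to new cases.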

Page 15: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

EXAMPLE DATA

Page 16: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

WHY?

• Consider a group of subjects whose relevant behavior is unknown.

• The same information is available for each of these subjects (age, gender, etc.) as is available for the individuals with known behavior.

• We would like to know which individuals are most likely to have the relevant behavior.

Page 17: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

EXAMPLE NEW DATA

?

Page 18: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

SCORING

• The output of a predictive classification model is typically an equation. Models are applied to new cases to calculate the predicted behavior through a process called scoring.

• Scoring, using the equation, calculates each subject’s likelihood to have the relevant behavior. (It also calculates the likelihood to not have the behavior.)
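
A minimal sketch of scoring, assuming the model's equation is a logistic one. The intercept, coefficients, input names, and new cases below are hypothetical placeholders rather than values from any real model.

```python
from math import exp

# Hypothetical equation produced by an earlier modeling step.
INTERCEPT = -4.0
COEFFICIENTS = {"age": 0.05, "prior_purchases": 0.8}


def score(case):
    """Apply the equation to one case and return both likelihoods."""
    z = INTERCEPT + sum(coef * case[name] for name, coef in COEFFICIENTS.items())
    p_event = 1.0 / (1.0 + exp(-z))
    return {"p_event": p_event, "p_non_event": 1.0 - p_event}


# New cases: same inputs as the historical data, behavior unknown.
new_cases = [
    {"id": "A", "age": 30, "prior_purchases": 0},
    {"id": "B", "age": 45, "prior_purchases": 3},
    {"id": "C", "age": 60, "prior_purchases": 1},
]

scored = [{**case, **score(case)} for case in new_cases]
ranked = sorted(scored, key=lambda row: row["p_event"], reverse=True)
for row in ranked:
    print(row["id"], round(row["p_event"], 3), round(row["p_non_event"], 3))
```

Sorting by the event probability puts the individuals most likely to show the behavior at the top, which is usually the form in which scores feed a decision.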

Page 19: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

EXAMPLE SCORED DATA

Page 20: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

THE ANALYTICS LIFECYCLE

Lifecycle steps: Identify / Formulate Problem; Data Preparation; Data Exploration; Transform & Select; Build Model; Validate Model; Deploy Model; Evaluate / Monitor Results

Roles:
• BUSINESS MANAGER: Domain Expert; Makes Decisions; Evaluates Processes and ROI
• IT SYSTEMS / MANAGEMENT: Data Preparation; Model Validation; Model Deployment; Model Monitoring
• BUSINESS ANALYST: Data Exploration; Data Visualization; Report Creation
• DATA MINER / STATISTICIAN: Exploratory Analysis; Descriptive Segmentation; Predictive Modeling

Page 21: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

THE ANALYTICS LIFECYCLE

Lifecycle steps: Identify / Formulate Problem; Data Preparation; Data Exploration; Transform & Select; Build Model; Validate Model; Deploy Model; Evaluate / Monitor Results

Page 22: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

MAIN TYPES OF DATA MARTS

One-Row-per-Subject Data Mart

Multiple-Row-per-Subject Data Mart

Longitudinal Data Mart
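
The slide only names the data mart types. Assuming the usual reading, in which a multiple-row-per-subject table holds repeated records per customer and a one-row-per-subject table holds a single summarized record, the conversion between the two can be sketched as an aggregation. The field names and transactions are hypothetical.

```python
from collections import defaultdict

# Multiple-row-per-subject: one record per customer transaction.
transactions = [
    {"customer_id": 1, "month": "2012-01", "amount": 20.0},
    {"customer_id": 1, "month": "2012-02", "amount": 35.0},
    {"customer_id": 2, "month": "2012-01", "amount": 10.0},
    {"customer_id": 1, "month": "2012-03", "amount": 5.0},
    {"customer_id": 2, "month": "2012-03", "amount": 50.0},
]


def to_one_row_per_subject(rows):
    """Summarize repeated records into a single row of inputs per customer."""
    amounts_by_customer = defaultdict(list)
    for row in rows:
        amounts_by_customer[row["customer_id"]].append(row["amount"])
    return [
        {
            "customer_id": customer_id,
            "n_transactions": len(amounts),
            "total_amount": sum(amounts),
            "max_amount": max(amounts),
        }
        for customer_id, amounts in sorted(amounts_by_customer.items())
    ]


for row in to_one_row_per_subject(transactions):
    print(row)
```

Predictive models of the kind shown earlier generally expect the one-row-per-subject layout, so this summarization step tends to be part of data preparation.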

Page 23: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

THE ANALYTICS LIFECYCLE

Lifecycle steps: Identify / Formulate Problem; Data Preparation; Data Exploration; Transform & Select; Build Model; Validate Model; Deploy Model; Evaluate / Monitor Results

DATA MINER / STATISTICIAN: Exploratory Analysis; Descriptive Segmentation; Predictive Modeling

SAS® Enterprise Miner™ focuses on these aspects of the process.

Page 24: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

SAS® ENTERPRISE MINER™

Page 25: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

SAS® ENTERPRISE MINER™

Page 26: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

SAS® ENTERPRISE MINER™

• Organized and logical GUI for data mining success

• Unmatched suite of modeling techniques and methods

• Sophisticated set of data preparation, summarization and exploration tools

• Business-based model comparisons, reporting and management

Page 27: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

SAS® ENTERPRISE MINER™

• Automated scoring process delivers faster results

• High-performance grid-enabled workbench

• Modern, distributable data mining system suited for large enterprises

• Open, extensible design for ultimate flexibility

Page 28: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

WHAT IS SAS® ENTERPRISE MINER™?

• SAS Enterprise Miner is a sophisticated graphical user interface, designed with the specific needs of data miners in mind.

• SAS Enterprise Miner is a data miner’s workbench that manages the process and provides a comprehensive set of tools to aid the data miner throughout the essential steps, known by the acronym, SEMMA: Sample, Explore, Modify, Model, Assess.

• SAS Enterprise Miner streamlines the data mining process to create highly accurate predictive and descriptive models based on analysis of vast amounts of data from across an enterprise.
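
The SEMMA steps can be pictured as a chain of small functions, each handing its output to the next. The sketch below is a plain-Python analogy only; it does not use or imitate the SAS Enterprise Miner interface, and the data, the `income` input, and the simple threshold model are assumptions made for illustration.

```python
def sample(rows, every=3):
    """Sample: hold out every third row for validation (a deterministic split)."""
    train = [r for i, r in enumerate(rows) if i % every != 0]
    valid = [r for i, r in enumerate(rows) if i % every == 0]
    return train, valid


def explore(rows):
    """Explore: count missing inputs and compute the event rate."""
    missing = sum(1 for r in rows if r["income"] is None)
    event_rate = sum(r["target"] for r in rows) / len(rows)
    return {"missing_income": missing, "event_rate": event_rate}


def modify(rows, fill_value):
    """Modify: replace missing income with a value learned from training data."""
    return [{**r, "income": fill_value if r["income"] is None else r["income"]} for r in rows]


def model(train):
    """Model: a threshold rule halfway between the two class means.

    Assumes events tend to have the higher income.
    """
    events = [r["income"] for r in train if r["target"] == 1]
    non_events = [r["income"] for r in train if r["target"] == 0]
    cutoff = (sum(events) / len(events) + sum(non_events) / len(non_events)) / 2
    return lambda r: 1 if r["income"] > cutoff else 0


def assess(predict, rows):
    """Assess: misclassification rate on held-out data."""
    return sum(1 for r in rows if predict(r) != r["target"]) / len(rows)


data = [
    {"income": 20, "target": 0}, {"income": 25, "target": 0}, {"income": None, "target": 0},
    {"income": 80, "target": 1}, {"income": 30, "target": 0}, {"income": 90, "target": 1},
    {"income": 85, "target": 1}, {"income": 22, "target": 0}, {"income": 95, "target": 1},
]

train, valid = sample(data)
summary = explore(train)
known = [r["income"] for r in train if r["income"] is not None]
fill_value = sum(known) / len(known)
train, valid = modify(train, fill_value), modify(valid, fill_value)
predict = model(train)
error_rate = assess(predict, valid)
print(summary, error_rate)
```

Keeping the imputation value and the cutoff tied to the training rows only is the design point: the validation rows then give an honest read on how the model would behave on new cases.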

Page 29: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

DATA MINING WITH SAS® ENTERPRISE MINER™

Page 30: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

SAS® ENTERPRISE MINER™ 7.1 AND 12.1: MODEL DEVELOPMENT PROCESS (SEMMA)

Sample Explore Modify Model Assess

Utility

Page 31: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

SAS® ENTERPRISE MINER™

Page 32: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

SAS® ENTERPRISE MINER™

• Use the desired tools to define a logical process (SEMMA)

Sample Explore Modify Model Assess

Page 33: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

SAS® ENTERPRISE MINER™

• Modify settings (properties) for the tools.

Page 34: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

SAS® ENTERPRISE MINER™

• Run the flow and check results. Refine as needed.

Page 35: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

DEMONSTRATION

Page 36: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

AUTOMATED PREDICTIVE MODELING

Page 38: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

SAS RAPID PREDICTIVE MODELER: KEY DRIVERS (BUSINESS USERS)

• Need to generate numerous models to solve a variety of business problems in a credible manner

• Models need to be developed in a quick time-frame using a self-service approach

• Do not always want to rely on analytic professionals (e.g., a statistician, modeler, or data miner)
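
The general idea behind automated modeling can be sketched as fitting several candidate models and keeping the one with the lowest validation error. This is a generic illustration and an assumption about the approach; it is not the actual procedure or API of SAS Rapid Predictive Modeler, and the candidate models and data are hypothetical.

```python
def fit_majority(train):
    """Baseline: always predict the more common class."""
    events = sum(y for _, y in train)
    label = 1 if events * 2 >= len(train) else 0
    return lambda x: label


def fit_mean_cutoff(train):
    """Predict the event when the input exceeds the overall training mean."""
    cutoff = sum(x for x, _ in train) / len(train)
    return lambda x: 1 if x > cutoff else 0


def fit_class_midpoint(train):
    """Predict the event when the input exceeds the midpoint of the class means."""
    events = [x for x, y in train if y == 1]
    non_events = [x for x, y in train if y == 0]
    cutoff = (sum(events) / len(events) + sum(non_events) / len(non_events)) / 2
    return lambda x: 1 if x > cutoff else 0


CANDIDATES = {
    "majority": fit_majority,
    "mean_cutoff": fit_mean_cutoff,
    "class_midpoint": fit_class_midpoint,
}


def misclassification_rate(predict, rows):
    return sum(1 for x, y in rows if predict(x) != y) / len(rows)


def pick_champion(train, valid):
    """Fit every candidate on train and compare them on validation data."""
    results = {name: misclassification_rate(fit(train), valid) for name, fit in CANDIDATES.items()}
    champion = min(results, key=results.get)
    return champion, results


# Hypothetical (input, target) pairs.
train = [(10, 0), (12, 0), (15, 0), (18, 0), (40, 1), (45, 1)]
valid = [(11, 0), (20, 0), (42, 1), (25, 0)]

champion, results = pick_champion(train, valid)
print(champion, results)
```

A self-service tool would hide these steps behind a single task, which is what makes it practical for business users to produce many models quickly.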

Page 39: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

SAS RAPID PREDICTIVE MODELER: KEY DRIVERS (ANALYTIC PROFESSIONALS)

• Solve the more complex issues at hand to gain incremental value

• Further customize or refine models for better results

Page 40: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

RAPID PREDICTIVE MODELER

Page 41: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

Open your data in SAS® Enterprise Guide or Microsoft Excel

Use the Rapid Predictive Modeler task and modify settings

Review results

Page 42: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

Microsoft Excel

Page 43: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

SAS Enterprise Guide

Page 46: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

RAPID PREDICTIVE MODELER: BASIC

Page 48: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

RAPID PREDICTIVE MODELER: INTERMEDIATE

Page 50: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

RAPID PREDICTIVE MODELER: ADVANCED

Page 51: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

RAPID PREDICTIVE MODELER: SAMPLE OUTPUT

Page 52: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

Rapid Predictive Modeler: Sample Output

Page 53: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

Rapid Predictive Modeler: Sample Output

Page 54: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

Rapid Predictive Modeler: Sample Output

Page 55: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

DEMONSTRATION

Page 56: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

IN CONCLUSION

Page 57: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

SAS® ENTERPRISE MINER™ BENEFITS

• Support the entire data mining process with a broad set of tools.

• Build more models faster with an easy-to-use Graphical User Interface.

• Enhance accuracy of predictions.

• Surface business information and easily share results through the unique model repository.

Page 58: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

RESOURCES

• SAS Rapid Predictive Modeler Website
  • Product brief, Press release, Brief product demo, etc.

• SAS Enterprise Miner Web Site
• SAS Enterprise Miner Technical Support Web Site
• SAS Enterprise Miner Technical Forum (Join Today!)
• SAS Enterprise Miner Training

• “Rapid Predictive Modeling for Customer Intelligence”
  • SAS Global Forum 2010 paper written by Wayne Thompson and David Duling, SAS Institute Inc., Cary, NC

Page 59: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

POTENTIAL NEXT STEPS

• Work through the example in “Getting Started with SAS® Enterprise Miner™”
  • Both the data and the documentation are available on support.sas.com: http://support.sas.com/documentation/onlinedoc/miner/

• Contact SAS Technical Support if you get stuck
  • There is no charge for this; it is included in your SAS software license.

Page 61: INTRODUCTION TO DATA MINING SAS® ENTERPRISE MINER™

www.SAS.com

THANK YOU FOR USING SAS!