Why, and How, Your Analytics Project Will Fail
Peter McCallum, Director, CBI
TRANSCRIPT
Agenda
Introduction
Pyle’s 9 Rules for Analytics Project Failure
Why navigating Pyle’s 9 Rules still doesn’t guarantee success
Incorporating the analytical model into the business process
Summary
Introduction
Who am I?
20 years’ experience in the IT industry
The last 12 years working exclusively delivering Business Intelligence & Analytical solutions
Have experienced the frustration of seeing a data mining project fail to deliver the quick wins promised
Pyle’s 9 Rules
Who is Dorian Pyle?
What are his rules?
Why are they still relevant?
Pyle’s Rule #1
#1. Jump Right In
Ignore the business
Use whatever data is on hand
Use whatever tools you’re most comfortable with
And don’t worry about how (or whether) your results can actually be applied
Pyle’s Rule #2
#2. Frame the problem in terms of the data
You’ve been given data – mine it!
Don’t stop to ask whether there might be other methods of solving the problem
Don’t think outside of the current data set – simply ignore any environmental or organisational factors
Restate the objective based on “whatever the data can be persuaded to reveal”
Pyle’s Rule #3
#3. Focus only on the most obvious way to frame the problem
Don’t waste your time exploring the data
Concentrate on the technical merits of the model to the exclusion of all else
Aim for the highest degree of technical perfection
Pyle’s Rule #4
#4. Rely on your own judgment
The data miner knows best
The data contains all the required information – focus on revealing the nuggets within
Input from others, especially the business, is unnecessary & should be ignored
Remember – the miner knows best
Pyle’s Rule #5
#5. Find the best algorithms
For any set of data one particular algorithm will produce the best model
So focus on finding the best algorithm
It’s what data mining is all about
Pyle’s Rule #6
#6. Rely on memory
Don’t waste your time documenting
Press on with the data investigation… as fast as possible
Should you ever need to duplicate the investigation you’ll remember exactly what you did and why
Should anyone ever dare ask you to justify or explain your results, you will remember
Pyle’s Rule #7
#7. Intuition is more important than standard practice
Data mining is an art, not a science
Standards are really only intended for “newbies”
All data sets are different, so simply rely on your instincts
Pyle’s Rule #8
#8. Minimize interaction between miners and business managers
Stay away from the business
Rely exclusively on what the data tells you, irrespective of what the business might try to tell you
After all, mining is primarily about letting the tools do the talking
Pyle’s Rule #9
#9. Minimize data preparation
Creating the models themselves is the most interesting part of data mining
Data preparation is dull, tedious & time consuming
Let the tools look after the data preparation for you
Do as little preparation as possible and cut straight to the modeling
The Bigger Picture
“Data mining is part, and a very small part, of a much larger business process. It may be an essential part of a data mining project, but incorporating the results of mining with all the related parts of the corporate project is equally, if not more, important for ultimate success”
Dorian Pyle
Virtuous Cycle of Data Mining
Identify business problem
Transform Data
Measure the results
Act on the Information
Berry & Linoff
Realising Business Value
“The heart of data mining is transforming data into actionable results”
Berry & Linoff
Where’s the payback?
Large multi-national
Undertook a review of their churn management process
Led by an international consulting firm
Executive management sponsorship
Chasing millions in potential benefits
What went right
Everything!
Fully engaged with the business
Invested time in data exploration & preparation
Focused on the business issue rather than the technicalities
Every step documented
Project uncovered some excellent insights
Models developed showed lift of 3X or more
All we had to do was deploy the models
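For readers unfamiliar with the term, lift compares the churn rate among the customers a model ranks as highest risk with the churn rate across the whole customer base. The Python sketch below is an illustration only, not the method used on this project: the function name lift_at_top, the fraction parameter and the sample data are all hypothetical.

```python
def lift_at_top(scores, churned, fraction=0.1):
    """Churn rate in the top-scored fraction divided by the overall churn rate."""
    if not scores or len(scores) != len(churned):
        raise ValueError("scores and churned must be non-empty and the same length")
    # Rank customers from highest to lowest churn score.
    ranked = sorted(zip(scores, churned), key=lambda pair: pair[0], reverse=True)
    # Size of the targeted group (always at least one customer).
    top_n = max(1, int(len(ranked) * fraction))
    top_rate = sum(flag for _, flag in ranked[:top_n]) / top_n
    base_rate = sum(churned) / len(churned)
    if base_rate == 0:
        raise ValueError("lift is undefined when nobody churned")
    return top_rate / base_rate


# Hypothetical data: 12 customers, 4 of whom churned (1 = churned).
scores = [0.95, 0.90, 0.85, 0.60, 0.55, 0.50, 0.45, 0.40, 0.30, 0.20, 0.10, 0.05]
churned = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
top_quarter_lift = lift_at_top(scores, churned, fraction=0.25)
```

With this made-up data the top quarter holds three churners out of three customers, against four out of twelve overall, so the lift comes out at about 3: calling the top-scored quarter reaches churners three times as often as calling customers at random.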
What went wrong
Deploying the models
The Starting Point
[Diagram. Systems shown: Data Warehouse, Mining Tool, Campaign Management System, Customer Management System. Flows shown: Manual Data Extracts, Churn Lists, Outbound Call Lists]
The Issues
Poor integration
Huge degree of manual effort
Large amount of latency
Non-existent feedback loop
The Impacts
Introduced a high degree of risk every time the model was refreshed
Restricted how often the churn propensity models could be run
Drastically reduced the value in running the models
Made it extremely difficult to measure the performance of retention efforts
The Goal
To overcome the issues with the existing process
To make the churn propensity scores more widely available
The Goal (cont’d)
[Diagram. Systems shown: Data Warehouse, Mining Tool, Campaign Management System, Customer Management System. Flows shown: Direct Connect, Contact List, Automated Update, Churn Scores, Direct Connect, Outbound Call Lists]
Challenge #1
[Diagram repeated from The Goal (cont’d)]
The Data Mining platform & licenses had to be completely upgraded
Challenge #2
[Diagram repeated from The Goal (cont’d)]
The Data Warehouse was re-platformed mid-project
Challenge #3
[Diagram repeated from The Goal (cont’d)]
The Campaign Management System was replaced mid-project
Challenge #4
[Diagram repeated from The Goal (cont’d)]
The automated process to update the churn scores in the CRM just did not work
Finally
[Diagram repeated from The Goal (cont’d)]
The Long Awaited Benefits
The time required to refresh the model was slashed by a factor of 10
Churn propensity scores could be refreshed across the entire customer base on a monthly basis
It became possible to accurately measure the success of the retention efforts
The Customer Services Representatives could finally recognize at-risk customers during inbound calls.
Incorporating the model into the business
“The more that the use of the analytical solution can be embedded into the business process being supported, the more likely it is that benefits will be realised”
Incorporating the model into the business (cont’d)
“The key to successful data mining is to incorporate the models into the business”
Berry & Linoff
Summary
Remember Pyle’s 9 Rules
BUT more importantly…
Remember The Bigger Picture
The Bigger Picture
“Data mining is part, and a very small part, of a much larger business process. It may be an essential part of a data mining project, but incorporating the results of mining with all the related parts of the corporate project is equally, if not more, important for ultimate success”
Dorian Pyle