Building Data Science Teams from Scratch (Polish Business Analytics Summit, March 2016)
TRANSCRIPT
Building Data Science Teams from Scratch
Enda Ridge, PhD
Copyright Enda Ridge 2016
#GuerrillaAnalytics http://guerrilla-analytics.net @enda_ridge
2. What You Will Learn
A Data Science Capability
- Why do this? The strategic advantage of a Data Science capability in Retail
- What do you need? The 3 components of a capability
- Where do you start? 5 steps to build a capability
How this will help you
- Leadership: how to set the direction, how to enable a team, how to fit into the enterprise
- Practitioners: the support you must lobby for, your focus in year 1
3. What I’ve Learned
- PhD: ‘Design of Experiments for Tuning Algorithms’
- Boutique Consultancy
- Forensic Data Analytics
- Senior Manager Professional Services
- Head of Algorithms
Timeline: 2004, 2008, 2010, 2012, 2015
No matter the industry, doing agile data science always faces the same challenge…
Organisations do not have the flexibility to accommodate data science
4. The Strategic Advantage of Data Science
5. Typical Challenges
Have we changed customer online behaviour?
Could we tell when our plant will fail?
Can we improve our supply chain?
6. Typical Challenges
Which financial products should we offer?
Where do we next locate a store?
Etc etc
7. Problem characteristics
Complex, interrelated, living systems
8. Problem characteristics
Uncertainty
- Data
- Process
- Questions
- Solutions
9. Problem characteristics
New data, ‘informal’ data sources
- Disparate sources
- Surveys
- Web scrapes
- Logs
- 3rd party
10. Problem characteristics
Huge variety of solutions to try out
- Data joins
- Visualizations
- Algorithms
11. You’re not ready for the factory line
12. What is Data Science?
“Data Science is the discipline of understanding and using data to improve your business”
Mathematics, Statistics, Machine learning, Visualization
- Enhance products
- Find opportunities
- Increase efficiency
13. Strategic advantage?
- Have we changed customer buying behaviour? (Experiment design)
- Could we tell when our plant will fail? (Predictive modelling)
- How do we make our warehouse more efficient? (Operations Research)
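As an illustration of how "Experiment design" relates to the question "Have we changed customer buying behaviour?", the sketch below compares purchase rates between a control group and a group exposed to a change, using a two-proportion z-test. The function name, the counts, and the choice of test are assumptions made for illustration only; the talk does not prescribe a specific method.

```python
import math


def two_proportion_z_test(buyers_a, visitors_a, buyers_b, visitors_b):
    """Return (z, two-sided p-value) for the difference in two purchase rates.

    Hypothetical helper: group A is the control, group B saw the change.
    """
    rate_a = buyers_a / visitors_a
    rate_b = buyers_b / visitors_b
    # Pooled rate under the null hypothesis that both groups behave the same.
    pooled = (buyers_a + buyers_b) / (visitors_a + visitors_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    if std_err == 0:
        raise ValueError("Pooled rate is 0 or 1; the test is undefined.")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts, purely for illustration.
z, p = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value would suggest the difference in buying behaviour is unlikely to be chance alone, which is the kind of confirmation an experiment is meant to provide before results feed into business operations.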
14. Strategic advantage?
- Which financial products should we offer? (Logistic regression)
- Where do we next locate a store? (Geo queries)
15. What Data Science is not…
- Big Data
- Business Intelligence
- Creating beautiful visualizations just because we can
https://vimeo.com/88093956
16. Are you doing Data Science?
Frame a business problem
Gather and generate data
Analyse
Confirm with experiment
Business operations
Data-driven products
Best in class organisations integrate Data Science into everything they do
17. 3 Components of a Data Science Capability
18. Typical mistakes
- Not knowing how Data Science really works in the trenches
19. Typical mistakes
- Not knowing how Data Science really works in the trenches
- Expecting magic
20. Typical mistakes
- Not knowing how Data Science really works in the trenches
- Expecting magic
- Bundling with IT or isolating from IT
21. Typical mistakes
- Not knowing how Data Science really works in the trenches
- Expecting magic
- Bundling with IT or isolating from IT
- Too much structure / bureaucracy
http://workplacereport.com/
22. 3 Components of a Capability
Data Science
- Leadership
- Data
- People and Technology
23. Component 1: Leadership
Set the direction and support it
- Changes to BAU
- Inefficiencies exposed
- Opportunities to capitalise on
Pitfall:
- Data Science very difficult
- Results don’t get used
Frame a business problem
Gather and generate data
Analyse
Confirm with experiment
Business operations
Data-driven products
24. Component 1: Leadership
Set targets and measure progress
What’s a Data Science KPI?
- # of Algorithms in products?
- Improvements to bottom line?
- # of Experiments completed?
- How to cost?
Pitfalls:
- Whimsical projects
- Losing business focus
25. Component 1: Leadership
Prioritise the pipeline
Pitfalls:
- No strategic focus
26. Component 2: People & Technology
Hype says you need geniuses
Reality:
- Communication
- Consulting and Influencing
- Tenacity
- Passion
Pitfalls:
- Failure to understand business context
- Disillusionment at obstacles
- Cannot answer the ‘so what’?
27. Component 2: People & Technology
What you need Pitfall
28. Component 2: People & Technology
Data Science needs technology flexibility
Faced with:
- Overwhelming firewalls
- Irrational fear of Open Source
- IT SLAs for server builds
- Ad-hoc IT support
Pitfalls:
- Premature tech governance
- Technology dictated from above
29. Component 3: Data
Data Scientists need access to your data
In the early days:
- Focus on blockers to access, storage
- Let the Data Scientists work the data
Pitfalls:
- Not taking a strategic view on your data
- Making a data dictionary a pre-requisite
- Letting security perceptions be an excuse
- Sticking to outmoded ideas of ‘production data’
30. 3 Components of a Capability
Data Science
Leadership
- Vision
- Smash barriers
- Priority targets
Data
- Access
- Security
- Service Ops
People and Technology
- Coal face
- Soft skills
- Flexible tech
31. 5 Steps to Build a Capability
32. 5 Steps
Build a customer base
Assemble the right people
Enable them
Engage and Operate
Work with product development
33. Step 1: Build a customer base
- Find the low hanging fruit
- Deliver quick wins
- Educate the organisation
- Market the team
Business benefit, business benefit, business benefit…
34. Step 2: Assemble the right people
Data Science: Data Scientists + Tech Support + Enlightened Customer
35. Step 3: Enable your people
1) Laptops
- Powerful
- Elevated privileges
- Internet access
2) Database
- Pick good enough general analytics database
3) Application Servers
- Internet access
- Plenty of RAM
- Pick a good enough general analytics language
36. Step 4: Engage and Operate
Simple Engagement model
- Short sharp studies
- When are we done?
- What does success look like?
- What Data Science doesn’t do
37. Step 4: Engage and Operate
Simple Operating model
- Track your projects
- Simple conventions on data
- Version control
- Track deliverables
38. Step 5: Work with product development
- Language incompatibility
- Agile incompatibilities
  - What’s a Data Science sprint?
- Influence for Data Science features
  - Data Scientists have user stories too!
- Influence for Data Science data
  - Data Scientists have user stories too!
39. Building a Data Science Capability
The strategic advantage of Data Science: finding opportunities, efficiencies and product enhancements in data
3 components
- Leadership & Targets
- People and Technology
- Data
5 steps
- Build a customer base
- Gather the right people
- Enable them
- Engage and Operate
- Work with product development