agile data
TRANSCRIPT
![Page 1: Agile Data](https://reader033.vdocuments.mx/reader033/viewer/2022042701/55cef487bb61ebd43d8b481a/html5/thumbnails/1.jpg)
AGILE DATA
Christopher Bergh
Head Chef,
DataKitchen
OPEN DATA SCIENCE CONFERENCE, BOSTON 2015
@opendatasci
![Page 2: Agile Data](https://reader033.vdocuments.mx/reader033/viewer/2022042701/55cef487bb61ebd43d8b481a/html5/thumbnails/2.jpg)
AGENDA
Who Am I?
What Is The Problem?
A Look At Agile Through A Data Lens
How To Do Agile Data In Five Shocking Steps
![Page 3: Agile Data](https://reader033.vdocuments.mx/reader033/viewer/2022042701/55cef487bb61ebd43d8b481a/html5/thumbnails/3.jpg)
WHO AM I
Algorithm Nerd
Columbia, MIT, NASA-Ames; ATC Automation
Into in 1990: Fuzzy Logic, Neural Networks, Constraint Satisfaction; Unix/C
Software Nerd
CTO, Dir Engineering, VP Product Management
Into in 2000: Management of Software Teams & Startups; PowerPoint
Data Nerd
COO: ETL Engineers, Analysts & Analytic Tool
Into in 2010: W. Edwards Deming, Data, Bootstrapping; Excel Hacking
![Page 4: Agile Data](https://reader033.vdocuments.mx/reader033/viewer/2022042701/55cef487bb61ebd43d8b481a/html5/thumbnails/4.jpg)
![Page 5: Agile Data](https://reader033.vdocuments.mx/reader033/viewer/2022042701/55cef487bb61ebd43d8b481a/html5/thumbnails/5.jpg)
SO WHAT IS THE PROBLEM?
In one word ….
![Page 6: Agile Data](https://reader033.vdocuments.mx/reader033/viewer/2022042701/55cef487bb61ebd43d8b481a/html5/thumbnails/6.jpg)
LOTSA Technologies in Analytics
![Page 7: Agile Data](https://reader033.vdocuments.mx/reader033/viewer/2022042701/55cef487bb61ebd43d8b481a/html5/thumbnails/7.jpg)
LOTSA People In Analytic Teams
DATA SCIENTIST
REPORTING ANALYST
ETL ENGINEER
DATABASE ARCHITECT
DEV OPS ENGINEER
DATA GOVERNANCE
![Page 8: Agile Data](https://reader033.vdocuments.mx/reader033/viewer/2022042701/55cef487bb61ebd43d8b481a/html5/thumbnails/8.jpg)
LOTSA Data & Analysis
ONE OFF
REUSE
![Page 9: Agile Data](https://reader033.vdocuments.mx/reader033/viewer/2022042701/55cef487bb61ebd43d8b481a/html5/thumbnails/9.jpg)
LOTSA Missed Expectations
[Figure: Communicate, Analyze, and Prepare Data, compared for Business Customer Expectation vs. Analyst Reality]
The business does not think that Analysts are preparing data
Analysts don’t want to prepare data
![Page 10: Agile Data](https://reader033.vdocuments.mx/reader033/viewer/2022042701/55cef487bb61ebd43d8b481a/html5/thumbnails/10.jpg)
Complexity
Another Field, Software Development, Ran into the Same Problems With Complexity ...
… They Used Something Called ‘Agile’ To Solve The Problem
![Page 11: Agile Data](https://reader033.vdocuments.mx/reader033/viewer/2022042701/55cef487bb61ebd43d8b481a/html5/thumbnails/11.jpg)
![Page 12: Agile Data](https://reader033.vdocuments.mx/reader033/viewer/2022042701/55cef487bb61ebd43d8b481a/html5/thumbnails/12.jpg)
AGILEMANIFESTO.ORG
![Page 13: Agile Data](https://reader033.vdocuments.mx/reader033/viewer/2022042701/55cef487bb61ebd43d8b481a/html5/thumbnails/13.jpg)
AGILEMANIFESTO.ORG
analytics
![Page 14: Agile Data](https://reader033.vdocuments.mx/reader033/viewer/2022042701/55cef487bb61ebd43d8b481a/html5/thumbnails/14.jpg)
s/software/analytics/
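Read as a sed-style substitution, the slide swaps a single word in the manifesto text. A minimal Python sketch, assuming one value statement from agilemanifesto.org as sample input; the variable names are illustrative only:

```python
import re

# One value statement from agilemanifesto.org, used here as sample input.
manifesto_line = "Working software over comprehensive documentation"

# sed's s/software/analytics/ replaces the first match on a line,
# so count=1 mirrors that behaviour.
analytics_line = re.sub(r"software", "analytics", manifesto_line, count=1)

print(analytics_line)
```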
![Page 15: Agile Data](https://reader033.vdocuments.mx/reader033/viewer/2022042701/55cef487bb61ebd43d8b481a/html5/thumbnails/15.jpg)
PRACTICES THAT ARE EASY TO APPLY
Development Sprints
User Stories
Daily Meetings
Defined Roles
Retrospectives
Pair Programming
Burn Down Charts
![Page 16: Agile Data](https://reader033.vdocuments.mx/reader033/viewer/2022042701/55cef487bb61ebd43d8b481a/html5/thumbnails/16.jpg)
SOME PRACTICES HAVE BEEN DIFFICULT TO APPLY
Test Driven Development
Branching And Merging
Refactoring
Small Releases
Frequent Or Continuous Integration
Experimentation For Learning
Individual Development Environments
![Page 17: Agile Data](https://reader033.vdocuments.mx/reader033/viewer/2022042701/55cef487bb61ebd43d8b481a/html5/thumbnails/17.jpg)
AGILE – WHAT IS UNIQUE TO ANALYTICS?
PUT THE ANALYST AT THE CENTER
![Page 18: Agile Data](https://reader033.vdocuments.mx/reader033/viewer/2022042701/55cef487bb61ebd43d8b481a/html5/thumbnails/18.jpg)
AGILE – WHAT IS UNIQUE TO ANALYTICS?
ANALYTICS PERCEIVED VALUE DECAY CURVE
![Page 19: Agile Data](https://reader033.vdocuments.mx/reader033/viewer/2022042701/55cef487bb61ebd43d8b481a/html5/thumbnails/19.jpg)
![Page 20: Agile Data](https://reader033.vdocuments.mx/reader033/viewer/2022042701/55cef487bb61ebd43d8b481a/html5/thumbnails/20.jpg)
1. MANAGE YOUR WORK LIKE CODE
Why? Your work is just code: models, transforms, etc.
Use a source code control system (like Git) to enable:
Branching
Merging
Diff
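Because a transform is just text, two versions of it can be compared line by line, which is what a source control diff does. A small sketch using Python's standard `difflib` rather than Git itself; the SQL, table names, and file names are hypothetical:

```python
import difflib

# Two hypothetical versions of the same SQL transform: the production
# copy and an analyst's edited copy on a branch.
production = """SELECT region, SUM(sales) AS total_sales
FROM orders
GROUP BY region
""".splitlines(keepends=True)

branch = """SELECT region, SUM(sales) AS total_sales
FROM orders
WHERE order_date >= '2015-01-01'
GROUP BY region
""".splitlines(keepends=True)

# unified_diff yields header lines, then context lines prefixed with a
# space, removed lines prefixed with "-", and added lines with "+".
diff_lines = list(
    difflib.unified_diff(
        production,
        branch,
        fromfile="production/sales.sql",
        tofile="branch/sales.sql",
    )
)

print("".join(diff_lines))
```

The single added `WHERE` line shows up with a `+` prefix, so a reviewer sees exactly what changed before the branch is merged.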
![Page 21: Agile Data](https://reader033.vdocuments.mx/reader033/viewer/2022042701/55cef487bb61ebd43d8b481a/html5/thumbnails/21.jpg)
2. TEST AND CONTAIN
1. Create and monitor tests
2. Test on separate data from production
3. Run tests early and often
4. Target 20% of code for tests
Unit Tests & System Tests … Keep Adding & Improving
1. Break up your work into components
2. Manage the environment for each component (e.g. Docker, AMI)
3. Practice Environment Version Control
![Page 22: Agile Data](https://reader033.vdocuments.mx/reader033/viewer/2022042701/55cef487bb61ebd43d8b481a/html5/thumbnails/22.jpg)
3. PROVIDE SEPARATE ENVIRONMENTS FOR ANALYSTS
Why?
Analysts need their own data to iterate, develop & explore.
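A separate environment can be as simple as a private copy of the database per analyst. A minimal sketch with Python's built-in `sqlite3` module, where an in-memory database stands in for production; the table, the data, and the `make_sandbox` helper are assumptions for illustration:

```python
import sqlite3

# Stand-in for the production database (in-memory here; an assumption).
production = sqlite3.connect(":memory:")
production.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, sales REAL)")
production.executemany("INSERT INTO orders (sales) VALUES (?)", [(10.0,), (20.0,)])
production.commit()


def make_sandbox(source):
    """Return a private copy of the source database for one analyst."""
    sandbox = sqlite3.connect(":memory:")
    source.backup(sandbox)  # copies every table and row into the sandbox
    return sandbox


analyst_db = make_sandbox(production)

# The analyst can explore destructively without touching production.
analyst_db.execute("DELETE FROM orders WHERE sales < 15")
analyst_db.commit()

sandbox_count = analyst_db.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
production_count = production.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(sandbox_count, production_count)
```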
![Page 23: Agile Data](https://reader033.vdocuments.mx/reader033/viewer/2022042701/55cef487bb61ebd43d8b481a/html5/thumbnails/23.jpg)
4. SUPPORT THREE TYPES OF WORKFLOWS
Small Team
Work directly on production
Feature Branch
Merge back to production branch
Data Governance
3rd party verification before production merge
Review
Test
Approve
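The three workflows differ in how many gates a change must clear before it reaches production. A rough sketch of that idea: the slide names Review, Test, and Approve for the governance flow, while the gates assigned to the other two workflows, and every name in the code, are assumptions.

```python
from enum import Enum


class Workflow(Enum):
    SMALL_TEAM = "small team"
    FEATURE_BRANCH = "feature branch"
    DATA_GOVERNANCE = "data governance"


# Hypothetical mapping from workflow to the gates a change must clear.
# Only the governance row comes from the slide; the rest are assumptions.
REQUIRED_GATES = {
    Workflow.SMALL_TEAM: set(),                  # work directly on production
    Workflow.FEATURE_BRANCH: {"test"},           # assumed: test before merging back
    Workflow.DATA_GOVERNANCE: {"review", "test", "approve"},
}


def can_reach_production(workflow, passed_gates):
    """True when every gate required by the workflow has been passed."""
    return REQUIRED_GATES[workflow] <= set(passed_gates)


print(can_reach_production(Workflow.SMALL_TEAM, []))
print(can_reach_production(Workflow.DATA_GOVERNANCE, ["review", "test"]))
```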
![Page 24: Agile Data](https://reader033.vdocuments.mx/reader033/viewer/2022042701/55cef487bb61ebd43d8b481a/html5/thumbnails/24.jpg)
5. GIVE ANALYSTS ABILITY TO EDIT DATABASE SAFELY
Best-in-class companies take 12 days to integrate new data sources into their analytical systems; industry average companies take 60 days; and laggards average 143 days.
Source: Aberdeen Group: Data Management for BI: Fueling the analytical engine with high-octane information
Figure out how to do this in minutes
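One possible reading of "edit safely" is that an edit only sticks if a check still passes afterwards. The sketch below uses a `sqlite3` transaction for that; the `safe_edit` helper, the table, and the check are hypothetical and are not the approach described in the talk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT)")
conn.executemany(
    "INSERT INTO customers (region) VALUES (?)",
    [("East",), ("West",), ("east",)],
)
conn.commit()


def regions_are_known(connection):
    """Check: every region value belongs to the allowed set."""
    rows = connection.execute("SELECT DISTINCT region FROM customers").fetchall()
    return {row[0] for row in rows} <= {"East", "West"}


def safe_edit(connection, statement, check):
    """Apply a data edit and keep it only if the check passes."""
    try:
        with connection:  # commits on success, rolls back on an exception
            connection.execute(statement)
            if not check(connection):
                raise ValueError("check failed, edit rolled back")
        return True
    except ValueError:
        return False


# A good edit: normalise the mis-cased value.
ok = safe_edit(
    conn, "UPDATE customers SET region = 'East' WHERE region = 'east'", regions_are_known
)

# A bad edit: introduces an unknown region, so it is rolled back.
bad = safe_edit(
    conn, "UPDATE customers SET region = 'North' WHERE id = 2", regions_are_known
)

final_regions = [r[0] for r in conn.execute("SELECT region FROM customers ORDER BY id")]
print(ok, bad, final_regions)
```

The check runs inside the same transaction as the edit, so a failing edit never becomes visible to other users of the database.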
![Page 25: Agile Data](https://reader033.vdocuments.mx/reader033/viewer/2022042701/55cef487bb61ebd43d8b481a/html5/thumbnails/25.jpg)
CONCLUSION
![Page 26: Agile Data](https://reader033.vdocuments.mx/reader033/viewer/2022042701/55cef487bb61ebd43d8b481a/html5/thumbnails/26.jpg)
CONCLUSION