from the big bang to the new economy, a journey in making sense of big data
How old are you?
FROM THE BIG BANG TO THE NEW ECONOMY, A JOURNEY IN MAKING SENSE OF BIG DATA
Patrick Deglon
Director of Engineering, Analytics Area Tech [email protected]/pdeglon
from the Big Bang…
Image: CERN
15 billion years
5 billion years
1 billion years
300,000 years
2 min
0.0000000001 sec
10^-34 sec = 0.0…001 sec (34 zeros)
10^-43 sec = 0.0…001 sec (43 zeros)
From 1996 to 2002, I worked at CERN (the European Laboratory for Particle Physics) for my MS and PhD at the University of Geneva
Geneva, Switzerland
Image: CERN
17-mile underground tunnel for the LEP & LHC accelerators
Source: CERN
Mont Blanc
Image: CERN Source: CERN
Tape robot (Source: CERN)
PAW – Physics Analysis Workstation (Source: Wikipedia)
Data collection & analysis were done in Fortran. Advanced analysis/statistics were done through PAW. [1996-2002]
Example of a particle collision
Solving the puzzle… which particles go together?
[Diagram: four particles A, B, C, D coming from an unknown "?"]
1. AB + CD?
2. AC + BD?
3. AD + BC?
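As a minimal sketch of the pairing puzzle above (in Python, with a hypothetical helper named `pairings`; this is an illustration, not the Fortran code used at CERN), the three candidate combinations of four particles can be enumerated programmatically:

```python
def pairings(items):
    """Yield every way to split an even-sized list into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        # Pair the first item with each possible partner, then recurse on the rest.
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


candidates = list(pairings(["A", "B", "C", "D"]))
for number, combo in enumerate(candidates, start=1):
    print(number, " + ".join(a + b for a, b in combo))
```

The number of pairings grows very quickly with the number of particles, which is why trying every combination is a job for large-scale infrastructure.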
Solution: Big Data infrastructure enables large-scale computations, such as combining all possibilities (cross-product)
Statistical Noise
Signal (particle resonance)
Source: http://www.atlas.ch/news/2011/ATLAS-discovers-its-first-new-particle.html
Schematic View
CERN Example (discovery of a new particle bb)
Size of the electron?
R < 5.1 x 10^-19 m ***
*** Patrick Deglon, Etude de la diffusion Bhabha avec le détecteur L3 au LEP, Th. phys. Genève, 2002; Sc. 3332
Extra dimension?
MS > 1.1 TeV ***
[Diagram labels: e+, e-, graviton, our universe in 4 dimensions, extra dimension]
*** Patrick Deglon, Etude de la diffusion Bhabha avec le détecteur L3 au LEP, Th. phys. Genève, 2002; Sc. 3332
… to the New Economy
What my friends think I do
What my mum thinks I do
What the BU thinks I do
What I think I do
What the BU wants me to do
What I really do
Source: Pierre Donzier
Example #1
KPI reporting & Impact Measurement
Measuring impact of initiatives
Pre/Post analysis illustrative example (Simulation)
[Chart: number of listings from Aug 1st to Oct 1st, 2012 vs. 2011, with points A, B, C, D marking the pre and post periods around the initiative launch; the difference is labeled as the impact of the initiative]
• Used to measure the impact of an initiative in a full market or a market segment
A/B test illustrative example (Simulation)
[Chart: number of purchases from Aug 1st to Oct 1st, test group vs. control group, with the initiative launch marked; the gap between the groups is labeled as the impact of the initiative]
• Randomized Test/Control group methodology is a gold standard in research
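A minimal sketch of the two measurement approaches above, in Python. The function names, the year-over-year adjustment, the equal-sized groups, and the Poisson error approximation are all assumptions made for illustration; the slides do not give the exact formulas.

```python
from math import sqrt


def pre_post_impact(pre, post, pre_last_year, post_last_year):
    """Pre/post impact, using last year's seasonality as the baseline (assumption)."""
    expected_post = pre * post_last_year / pre_last_year
    return post - expected_post


def ab_lift(test_count, control_count):
    """Relative lift of test over control, assuming equal-sized randomized groups.

    The error treats both counts as Poisson-distributed (assumption).
    """
    ratio = test_count / control_count
    relative_error = sqrt(1.0 / test_count + 1.0 / control_count)
    return ratio - 1.0, ratio * relative_error


# Hypothetical numbers, for illustration only.
impact = pre_post_impact(pre=20000, post=30000, pre_last_year=18000, post_last_year=21600)
lift, lift_error = ab_lift(test_count=440, control_count=400)
print(f"pre/post impact: {impact:.0f} listings")
print(f"A/B lift: {lift:.1%} +/- {lift_error:.1%}")
```

The pre/post estimate depends on the baseline being a fair stand-in for what would have happened without the initiative, whereas the randomized test/control comparison does not need that assumption.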
Key Performance Indicators
Simplified Business Flow
Motorola Factory (# Shipments) -> Distribution Channels (# Sales) -> First Usage (# Activations)
Data Flow
Motorola Factory (# Shipments)
Distribution Channels (# Sales)
First Usage (# Activations)
Google BigQuery
Motorola Cloud
Insights
...
Google Spreadsheet as a Reporting Engine
Spreadsheet
BigQuery
HTML body in sheet
Apps Script
Scheduler
Distribution Group
Charts
Google Spreadsheet as a Reporting Engine
Analytics Portal
App Engine
Data Source: BigQuery
Data Source: Google Analytics
iFrame Source: Tableau Server
iFrame Source: Google Documents
Data Source: Spreadsheet & CSV
Report Meta Data: Google Drive (Text file with JSON)
Report Meta Data: Datastore (Report copy & usage tracking)
Users Access Control: Google Users + Drive Sharing
Google Drive
Android App
Web portal
42… so what?
Answer to the Ultimate Question of Life, The Universe, and Everything
Campaign Measurement
Campaigns
• Campaign Id
• Campaign Name
• Time range
• Set of Countries
• Set of Products
X
KPI
• Date
• Country
• Product
• KPI[]
Trend
• Campaign Id
• Date
• Total of KPI[]
Time Series Analysis
Summary
• Campaign Id
• Campaign Name
• Impact Measurement[]
• Statistical Error[]
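The Campaigns x KPI step above can be sketched in Python as follows. The sample rows, the dictionary keys, and the `build_trend` helper are hypothetical. The sketch also assumes that the campaign's time range is not used as a filter here, so that days before the campaign remain available as a baseline for the time series analysis.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample rows following the fields listed on the slide.
campaigns = [
    {"campaign_id": 1, "campaign_name": "Demo", "start": date(2013, 8, 1),
     "end": date(2013, 8, 2), "countries": {"US"}, "products": {"P1"}},
]
kpi_rows = [
    {"date": date(2013, 8, 1), "country": "US", "product": "P1", "kpi": {"activations": 10}},
    {"date": date(2013, 8, 1), "country": "BR", "product": "P1", "kpi": {"activations": 7}},
    {"date": date(2013, 8, 2), "country": "US", "product": "P1", "kpi": {"activations": 12}},
    {"date": date(2013, 8, 2), "country": "US", "product": "P2", "kpi": {"activations": 5}},
]


def build_trend(campaigns, kpi_rows):
    """Cross campaigns with KPI rows and total each KPI per (campaign id, date)."""
    totals = defaultdict(lambda: defaultdict(float))
    for campaign in campaigns:
        for row in kpi_rows:
            # Keep only rows in the campaign's countries and products.
            in_scope = (row["country"] in campaign["countries"]
                        and row["product"] in campaign["products"])
            if in_scope:
                for name, value in row["kpi"].items():
                    totals[(campaign["campaign_id"], row["date"])][name] += value
    return {key: dict(value) for key, value in totals.items()}


trend = build_trend(campaigns, kpi_rows)
for (campaign_id, day), kpis in sorted(trend.items()):
    print(campaign_id, day.isoformat(), kpis)
```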
Campaign Measurement
Campaign Window
Don’t include the weekday cycle in your volatility measurement
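One possible way to follow this advice, sketched in Python: subtract the average of each weekday slot before computing the standard deviation. The helper name and the sample series are hypothetical, and the slides do not say which de-seasonalizing method was actually used.

```python
from statistics import mean, pstdev


def weekday_adjusted_volatility(daily_values):
    """Standard deviation of daily values after removing the mean of each weekday slot."""
    by_slot = {}
    for index, value in enumerate(daily_values):
        by_slot.setdefault(index % 7, []).append(value)
    slot_mean = {slot: mean(values) for slot, values in by_slot.items()}
    residuals = [value - slot_mean[index % 7] for index, value in enumerate(daily_values)]
    return pstdev(residuals)


# Hypothetical pre-campaign series: four identical weeks with a weekend peak.
baseline = [10, 10, 10, 10, 10, 20, 20] * 4
raw_volatility = pstdev(baseline)
adjusted_volatility = weekday_adjusted_volatility(baseline)
print(raw_volatility, adjusted_volatility)
```

In this toy series all of the raw volatility comes from the weekly cycle, so the adjusted value drops to zero. Using the raw value as the statistical error would make a real campaign impact look less significant than it is.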
Campaign Measurement
Define Campaign
Run Campaign
Measure Impacts
Drive Insights
Example #2
Internet Marketing
Case study: Online Search
Natural/Organic Search (free)
Paid Search
Customer behaviors and Internet Marketing Investment
Which customer purchases are influenced by Marketing?
[Timeline diagram, Jan 1st to Mar 1st, showing a customer's purchases ($) and ad clicks]
• Behavioral purchases: uncorrelated to Marketing
• Influenced purchases: correlated to Marketing
• With a window of X days, 2 purchases are missing
• With a window of Y days, all purchases are counted as incremental, but 1 purchase is uncorrelated
Remember this physics problem?
[Diagram: four particles A, B, C, D coming from an unknown "?"]
1. AB + CD?
2. AC + BD?
3. AD + BC?
Solution: Big Data infrastructure enables large-scale computations, such as combining all possibilities (cross-product)
Statistical Noise
Signal (particle resonance)
Source: http://www.atlas.ch/news/2011/ATLAS-discovers-its-first-new-particle.html
Schematic View
Combining correlated and uncorrelated events produces a distribution with statistical noise (which is simple enough to extract) and the signal being searched for
CERN Example (discovery of a new particle bb)
[Histogram: number of events (click-purchase pairs) vs. latency (days), from -14 to +14]
• User clicks on an ad banner at time = 0
• User makes a purchase X days later
• Latency time is computed for each click-purchase pair
• Negative latency: purchase before click (no causality); behavior only
• Positive latency: purchase after click (potential causality); behavior & Internet Marketing impact
• Marketing incrementality (correlated purchases) sits on top of the level of behavioral purchases
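A minimal Python sketch of this latency analysis. The function names and sample data are hypothetical. It assumes that the behavioral level measured at negative latency is the same at positive latency, and that same-day pairs are skipped because their order is ambiguous.

```python
from collections import Counter
from datetime import date


def latency_histogram(clicks, purchases, window_days=14):
    """Count click-purchase pairs per latency (in days) for each user."""
    histogram = Counter()
    for user, click_dates in clicks.items():
        for click in click_dates:
            # Cross-product of this user's clicks and purchases.
            for purchase in purchases.get(user, []):
                latency = (purchase - click).days
                if latency != 0 and abs(latency) <= window_days:
                    histogram[latency] += 1
    return histogram


def incremental_pairs(histogram, window_days=14):
    """Positive-latency pairs minus the behavioral level seen at negative latency."""
    behavioral = sum(histogram[-d] for d in range(1, window_days + 1))
    after_click = sum(histogram[d] for d in range(1, window_days + 1))
    return after_click - behavioral


# Hypothetical data: one click, one purchase before it and two after it.
clicks = {"u1": [date(2013, 2, 1)]}
purchases = {"u1": [date(2013, 1, 25), date(2013, 2, 3), date(2013, 2, 5)]}
histogram = latency_histogram(clicks, purchases)
print(sorted(histogram.items()), incremental_pairs(histogram))
```

As in the particle example, no single pair is labeled as caused by marketing; the signal only shows up statistically, as an excess over the behavioral background.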
… So what?
Method 1
            Sales   ROI
Channel A   8%      +20%
Channel B   5%      -10%
Channel C   1%      +10%
• Reduce spend on channel B
• Invest in channel A
• When prioritizing, ignore channel C
<>
Method 2
            Sales   ROI
Channel A   7%      -20%
Channel B   6%      +30%
Channel C   12%     +60%
• Reduce spend on channel A
• Invest heavily in channel C
• Marketing actually accounts for 25% of the site
Consumer Heterogeneity and Paid Search Effectiveness: A Large Scale Field Experiment, Thomas Blake, Chris Nosko, Steven Tadelis
Case study: Online Search
So, what’s next?
Marketing 101
[2x2 grid: columns "Don't Do Marketing" and "Do Marketing", rows "No Purchase" and "Purchase"; legend: Cost, Direct Return, Incr Return]
Rule #1: Never, ever, spend money unless you really-really have to
So, what’s next?
[Chart: Output (Return/Revenues, Cost, Profit) vs. Investment (costs)]
• Max Profit: ΔReturn = ΔInvestment, i.e. marginal ROI = 0
• Max Sales, No Profit: Total ROI = 0
Rule #2: If you have to spend, you spend to the point of marginal return = 0
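A small numeric illustration of Rule #2 in Python. The square-root return curve and its scale are hypothetical, chosen only because they show diminishing returns; with a real business the curve would have to be measured.

```python
from math import sqrt


def revenue(investment, scale=200.0):
    """Hypothetical return curve with diminishing returns (assumption)."""
    return scale * sqrt(investment)


def profit(investment):
    """Return minus investment."""
    return revenue(investment) - investment


def marginal_roi(investment, step=1.0):
    """Extra return per extra unit of investment, minus 1."""
    return (revenue(investment + step) - revenue(investment)) / step - 1.0


# Scan candidate investment levels and keep the one with the highest profit.
levels = range(0, 50001, 100)
best_level = max(levels, key=profit)
print(best_level, profit(best_level), marginal_roi(best_level), profit(40000))
```

For this curve, profit peaks where the marginal ROI crosses zero. Spending on until the total ROI reaches zero buys more sales but leaves no profit.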
Marginal Return Chart
[Chart: ROI vs. Cumulative Cost, with spend buckets ordered from Spend Bucket 0 (most profitable) through Spend Bucket i to Spend Bucket N (least profitable)]
• Point of marginal return = 0 (maximum profit)
• Current Spend Level
• Areas/initiatives/segments with negative profitability: cost reduction opportunity!
• In-depth analysis required to validate high ROI
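The marginal return chart can be sketched in Python as follows. The bucket names and numbers are hypothetical, ROI is assumed to mean (return - cost) / cost, and each bucket's ROI is treated as the marginal ROI at that point of the cumulative spend.

```python
def split_buckets(buckets):
    """Order spend buckets by ROI and split them at ROI = 0.

    buckets: list of (name, cost, roi) with roi = (return - cost) / cost.
    Returns two lists of (name, cumulative_cost, roi): buckets to keep and to cut.
    """
    ordered = sorted(buckets, key=lambda bucket: bucket[2], reverse=True)
    keep, cut, cumulative_cost = [], [], 0.0
    for name, cost, roi in ordered:
        cumulative_cost += cost
        target = keep if roi >= 0 else cut
        target.append((name, cumulative_cost, roi))
    return keep, cut


# Hypothetical buckets, for illustration only.
buckets = [("brand", 100.0, 0.4), ("display", 50.0, -0.2), ("search", 200.0, 0.1)]
keep, cut = split_buckets(buckets)
recommended_spend = keep[-1][1] if keep else 0.0
print(keep, cut, recommended_spend)
```

Buckets that land in `cut` correspond to the cost reduction opportunity on the chart. Buckets with unusually high ROI at the top of `keep` are the ones the slide suggests validating with deeper analysis.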
CERN vs New Economy
CERN
• Write kilometers-long Fortran code
• Analysis can run for many hours… before a batch robot error
• Study billions of collision data
• Great depth of data structure & complexity
• Know your local expert for questions, but try to find the solution by yourself… much quicker
• Remove “bad runs” (unclean data batch)
• Transform a complex system into insights
• Communicate findings at conferences
• Strong competitive landscape (4 distinct experiments competing to be the first to publish, or to publish better results)
New Economy
• Write miles-long SQL code
• Queries can run for many hours… before a spool space error
• Study billions of customer data
• Great depth of data structure & complexity
• Know your local expert for questions, but try to find the solution by yourself… much quicker
• Remove “wackos” (non-material transactions)
• Transform a complex system into insights
• Communicate recommendations to business reviews
• Strong competitive landscape