intro ab-taguchi
TRANSCRIPT
Prepared for: NYC UX + DATA Meetup
March 12, 2014
Pivotal Labs, New York
A/B and Pairwise Testing
How I Learned to Stop Worrying and Love Data-Driven Decisions
Wednesday, March 12, 14
About Me
• Founded Splitforce in 2013 - Data is power, and it should be easy to leverage
• Marketing for a Chinese media company in Shanghai
• Designed experiments and predictive analytics for ILABS in Montreal
• Studied economics and statistics at McGill University in Montreal
User Base
Publish two different versions of your app...
50% sees version A
50% sees version B
...and see which one is driving desirable user behavior.
Which Version Won?
Version A Version B
Version B: 114% Improvement
Version A Version B
Marketer’s Surprise: ‘FREE’ Loses
✔ ✗
How Obama Raised $60 Million
Four Button Variations
Six Media Variations
24 Combinations!
And the Winner is...
+40% increase in conversion rate
2.9 million additional donors
$60 million value of additional donations
Obamalytics
• Original Conversion Rate: 8.3%
• New Conversion Rate: 11.6%
• 10 million signups from New Version would have been 7.12 million signups with the Original Version
• +2.88 million additional signups
• $21 average donation per signup
• Approximately $60 million in additional donations
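The arithmetic on this slide can be replayed in a few lines. This is a minimal sketch that uses only the rounded figures shown above; the function name `counterfactual_signups` is illustrative and not from the talk. With the rounded 8.3% rate the counterfactual lands near 7.16 million rather than 7.12 million, so the slide's figure presumably rests on an unrounded original rate.

```python
def counterfactual_signups(observed_signups, new_rate, original_rate):
    """Signups the original version would have produced from the same traffic."""
    visitors = observed_signups / new_rate  # traffic implied by the new rate
    return visitors * original_rate


observed = 10_000_000   # signups with the new version (from the slide)
original_rate = 0.083   # rounded rate from the slide
new_rate = 0.116
avg_donation = 21       # dollars per signup (from the slide)

baseline = counterfactual_signups(observed, new_rate, original_rate)
additional = observed - baseline
relative_lift = (new_rate - original_rate) / original_rate
extra_donations = additional * avg_donation

print(f"Relative lift:       {relative_lift:.1%}")
print(f"Baseline signups:    {baseline / 1e6:.2f} million")
print(f"Additional signups:  {additional / 1e6:.2f} million")
print(f"Additional donations: ${extra_donations / 1e6:.1f} million")
```

The result still rounds to roughly $60 million, so the headline number holds even with the rounded inputs.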
Interpreting Test Results
Multivariate Testing
• Every screen has X components (e.g., Marilyn’s hair)
• For each component, we can test Y variations (e.g., green)
• In total, we have Y1 x Y2 x ... x YX combinations
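As a quick illustration of how fast the product grows, here is a small sketch. The helper name `combination_count` is hypothetical; the 4 x 6 case mirrors the button and media variations from the Obama example.

```python
from math import prod


def combination_count(variations_per_component):
    """Number of distinct screens when every variation is crossed with every other."""
    return prod(variations_per_component)


# Four button variations crossed with six media variations.
print(combination_count([4, 6]))        # 24

# Adding one more component with five variations multiplies the total again.
print(combination_count([4, 6, 5]))     # 120
```

Each extra component multiplies the total, which is why multivariate tests run out of traffic quickly.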
Costs of Testing
• Risk of false positives (Type I error, saying something is there when it’s not)
• Need for adequate sample size
• Testing presents an opportunity cost
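The first two costs can be put into rough numbers. The sketch below uses the standard normal-approximation formula for comparing two conversion rates, plus the chance of at least one false positive when several independent comparisons are run. The 5% significance level and 80% power are conventional defaults assumed here, not values from the talk, and both function names are illustrative.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(p_base, p_variant, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant for a two-sided two-proportion test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_base * (1 - p_base) + p_variant * (1 - p_variant)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_base - p_variant) ** 2)


def familywise_error_rate(comparisons, alpha=0.05):
    """Chance of at least one false positive across independent comparisons."""
    return 1 - (1 - alpha) ** comparisons


# Detecting a lift from 8.3% to 11.6% (the rates from the Obama example).
n_large_effect = sample_size_per_variant(0.083, 0.116)
# A smaller lift needs far more traffic per variant.
n_small_effect = sample_size_per_variant(0.083, 0.090)
print(n_large_effect, n_small_effect)

# Comparing 23 challengers against one control at alpha = 0.05 each.
print(f"{familywise_error_rate(23):.0%}")
```

The opportunity cost follows from the same numbers: every visitor assigned to a losing variant while the sample fills up is a conversion that may be lost.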
Design of Experiments
• Let’s say we have four variables:
  • Header Banner (A, B, C)
  • Main Copy (1, 2, 3)
  • Button Color (Cyan, Magenta, Yellow)
  • Call to Action (Buy, Purchase)
Design of Experiments
• Option 1: Full factorial design - multiply out for all different combinations
• Example: (3 header banners) x (3 main copy) x (3 button colors) x (2 CTAs) = 54 combinations
• Can we get similar information with fewer tests?
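To make the full factorial concrete, the sketch below enumerates every combination with `itertools.product`. The dictionary keys are illustrative names; the level labels follow the combination table later in the deck (Buy and Purchase for the call to action).

```python
from itertools import product

factors = {
    "header_banner": ["A", "B", "C"],
    "main_copy": ["1", "2", "3"],
    "button_color": ["Cyan", "Magenta", "Yellow"],
    "call_to_action": ["Buy", "Purchase"],
}

# Cross every level of every factor with every level of the others.
full_factorial = ["".join(levels) for levels in product(*factors.values())]

print(len(full_factorial))     # 54
print(full_factorial[:3])      # ['A1CyanBuy', 'A1CyanPurchase', 'A1MagentaBuy']
```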
Design of Experiments
• Option 2: Orthogonal arrays - test every pair of variable values instead of every full combination
• Risk: pairing leaves some combinations untested, which can hide the effects that variables have on each other
• Mitigation: pair variables that are unlikely to influence each other
L9 Array
Compare any pair of variables across all combinations and you’ll see that they’re all represented!
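The array shown on the slide did not survive in this transcript, so the sketch below builds one standard L9 layout (nine runs, four columns, three levels each) using arithmetic modulo 3 and then checks the property described above. This construction is an assumption for illustration and may list its rows in a different order than the slide; the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_l9():
    """Nine runs over four 3-level columns, built with arithmetic mod 3."""
    return [(a, b, (a + b) % 3, (a + 2 * b) % 3)
            for a in range(3) for b in range(3)]


def is_orthogonal(rows, levels=3):
    """True if every pair of columns shows each level pair equally often."""
    n_cols = len(rows[0])
    expected = len(rows) // (levels * levels)
    for i, j in combinations(range(n_cols), 2):
        counts = Counter((row[i], row[j]) for row in rows)
        if len(counts) != levels * levels or set(counts.values()) != {expected}:
            return False
    return True


l9 = build_l9()
for row in l9:
    print(row)
print(is_orthogonal(l9))   # True
```

In nine runs, each of the six column pairs sees all nine level pairs exactly once, which is what makes the array balanced.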
Design of Experiments
• Four variables:
  • Header Banner (A, B, C)
  • Main Copy (1, 2, 3)
  • Button Color (Cyan, Magenta, Yellow)
  • Call to Action (Buy, Purchase)
Combo # HB MC BC CTA
1 A 1 Cyan Buy
2 A 2 Magenta Purchase
3 A 3 Yellow Buy
4 B 1 Magenta Purchase
5 B 2 Yellow Buy
6 B 3 Cyan Purchase
7 C 1 Yellow Purchase
8 C 2 Cyan Buy
9 C 3 Magenta Buy
We’ve reduced the need to collect data on 54 combinations to just 9 (a 6x efficiency increase)
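The pairwise claim for this table can be checked mechanically. The sketch below copies the nine rows from the table and lists any pair of values that never appears together; `uncovered_pairs` is an illustrative helper name.

```python
from itertools import combinations, product

levels = {
    "HB": ["A", "B", "C"],
    "MC": ["1", "2", "3"],
    "BC": ["Cyan", "Magenta", "Yellow"],
    "CTA": ["Buy", "Purchase"],
}

# The nine combinations from the table, in order.
design = [
    ("A", "1", "Cyan", "Buy"),
    ("A", "2", "Magenta", "Purchase"),
    ("A", "3", "Yellow", "Buy"),
    ("B", "1", "Magenta", "Purchase"),
    ("B", "2", "Yellow", "Buy"),
    ("B", "3", "Cyan", "Purchase"),
    ("C", "1", "Yellow", "Purchase"),
    ("C", "2", "Cyan", "Buy"),
    ("C", "3", "Magenta", "Buy"),
]


def uncovered_pairs(design, levels):
    """Return every (factor, factor, value pair) that no row of the design contains."""
    names = list(levels)
    missing = []
    for i, j in combinations(range(len(names)), 2):
        seen = {(row[i], row[j]) for row in design}
        for pair in product(levels[names[i]], levels[names[j]]):
            if pair not in seen:
                missing.append((names[i], names[j], pair))
    return missing


print(uncovered_pairs(design, levels))   # []
print(54 / len(design))                  # 6.0
```

One caveat: Call to Action has only two levels, so it cannot split evenly across nine runs (Buy appears five times, Purchase four). The table covers every pair, but that column is not perfectly balanced.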
FROM 54 COMBINATIONS
A1CyanBuy, A1CyanPurchase, A1MagentaBuy, A1MagentaPurchase, A1YellowBuy, A1YellowPurchase, A2CyanBuy, A2CyanPurchase, A2MagentaBuy, A2MagentaPurchase, A2YellowBuy, A2YellowPurchase, A3CyanBuy, A3CyanPurchase, A3MagentaBuy, A3MagentaPurchase, A3YellowBuy, A3YellowPurchase, B1CyanBuy, B1CyanPurchase, B1MagentaBuy, B1MagentaPurchase, B1YellowBuy, B1YellowPurchase, B2CyanBuy, B2CyanPurchase, B2MagentaBuy, B2MagentaPurchase, B2YellowBuy, B2YellowPurchase, B3CyanBuy, B3CyanPurchase, B3MagentaBuy, B3MagentaPurchase, B3YellowBuy, B3YellowPurchase, C1CyanBuy, C1CyanPurchase, C1MagentaBuy, C1MagentaPurchase, C1YellowBuy, C1YellowPurchase, C2CyanBuy, C2CyanPurchase, C2MagentaBuy, C2MagentaPurchase, C2YellowBuy, C2YellowPurchase, C3CyanBuy, C3CyanPurchase, C3MagentaBuy, C3MagentaPurchase, C3YellowBuy, C3YellowPurchase
TO JUST 9 (+6X EFFICIENCY)
A1CyanBuy, A2MagentaPurchase, A3YellowBuy, B1MagentaPurchase, B2YellowBuy, B3CyanPurchase, C1YellowPurchase, C2CyanBuy, C3MagentaBuy
Design of Experiments
• Where do orthogonal arrays come from?
  • Derived by hand (like playing Sudoku!)
  • Look them up (U Michigan, U York, Hexawise.com)
• How to choose a design?
  • Number of variables
  • Number of states for each variable
• How to analyze results?
  • Plot data, Analysis of Variance (ANOVA), binning
Analyzing Results
• Plot data and look at it
  • Some things you don’t need statistics to tell you; they’re just there
  • Your eye is a pretty good analysis tool
• Analysis of Variance (ANOVA)
  • One-way ANOVAs to find the influence of one variable on the result (assume that other variables have minimal influence)
  • Two-way ANOVAs to find the influence of two variables on the result at once
• Binning
  • Group combinations based on results (high vs. low)
  • How many Header Banner A’s have a high result? A low result?
Takeaway: You can extrapolate from a subset of combinations to draw conclusions about the full factorial set
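The three analysis steps can be sketched on the nine-combination table. The conversion rates below are invented purely for illustration and are not results from the talk; the function names are also hypothetical. The sketch computes the average result per level (a numeric stand-in for eyeballing a plot), bins each level into high and low results around the median, and computes a one-way ANOVA F statistic by hand.

```python
from collections import defaultdict
from statistics import mean, median

# (HB, MC, BC, CTA, conversion rate in %). Rates are HYPOTHETICAL.
results = [
    ("A", "1", "Cyan", "Buy", 5.0),
    ("A", "2", "Magenta", "Purchase", 5.4),
    ("A", "3", "Yellow", "Buy", 5.2),
    ("B", "1", "Magenta", "Purchase", 6.1),
    ("B", "2", "Yellow", "Buy", 6.5),
    ("B", "3", "Cyan", "Purchase", 6.0),
    ("C", "1", "Yellow", "Purchase", 4.1),
    ("C", "2", "Cyan", "Buy", 4.4),
    ("C", "3", "Magenta", "Buy", 4.1),
]


def group_by_level(results, factor_index):
    """Collect the observed results for each level of one factor."""
    groups = defaultdict(list)
    for row in results:
        groups[row[factor_index]].append(row[-1])
    return groups


def main_effects(results, factor_index):
    """Average result per level, ignoring the other factors."""
    return {lvl: mean(v) for lvl, v in group_by_level(results, factor_index).items()}


def bin_counts(results, factor_index):
    """Count high and low results per level, split at the overall median."""
    cutoff = median(row[-1] for row in results)
    counts = defaultdict(lambda: {"high": 0, "low": 0})
    for row in results:
        counts[row[factor_index]]["high" if row[-1] > cutoff else "low"] += 1
    return dict(counts)


def one_way_f(results, factor_index):
    """One-way ANOVA F statistic: between-level variance over within-level variance."""
    groups = list(group_by_level(results, factor_index).values())
    grand = mean(v for g in groups for v in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = sum(len(g) for g in groups) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


HB = 0  # column index of Header Banner
print(main_effects(results, HB))
print(bin_counts(results, HB))
print(round(one_way_f(results, HB), 2))
```

With this made-up data, Header Banner B stands out on all three views. On real data, a library routine such as `scipy.stats.f_oneway` would also give a p-value for the F statistic.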
Design of Experiments
• Can get pretty complex, but super efficient!
• L36 array - reducing ~94 million combinations to 36
Comparison of A/B Testing Platforms
Google Analytics Optimizely Splitforce
Platform: Web / mWeb X X
Platform: Native Mobile X
A/B Testing X X
Experiment Design: Multivariate X X
Automation X X
Other: In-Browser Editor X X
Other: Consulting X X
In-House vs. Agency
In-House: Pros
• Lower initial costs
• More control over testing process
• Better understanding of business objectives
In-House: Cons
• Long time to build expertise from scratch
• Longer time to start achieving great test results
Agency: Pros
• No need for internal resources
• Faster results as agency provides specialized expertise
• Learn best practices and accelerate internal competency
Agency: Cons
• Higher initial costs
• Less understanding of complexities / nuances of your business
• Less control over testing
Thank You!
For more information:
Zac Aghion, CEO & [email protected]
China: (+86)1592-1631-924
USA: (+1)617-750-6684
www.splitforce.com