Multi Armed Bandits – chalpert@meetup.com
Posted 29-Mar-2015

TRANSCRIPT

Page 1: Multi Armed Bandits chalpert@meetup.com. Survey Click Here

Multi Armed Bandits

chalpert@meetup.com

Page 2:

Survey

Page 3:

Click Here

Page 4:

Click Here

Click-through Rate (Clicks / Impressions): 20%

Page 5:

Click Here Click Here

Page 6:

Click Here Click Here

Click-through Rate: 20% vs ?

Page 7:

Click Here Click Here

Click-through Rate: 20% vs ?

AB Test
• Randomized Controlled Experiment
• Show each button to 50% of users

Page 8:

AB Test Timeline

Time: Before Test → AB Test → After Test (show winner)

Exploration Phase (Testing) during the test, then Exploitation Phase (Show Winner)

Page 9:

Click Here Click Here

Click-through Rate: 20% vs ?

Page 10:

Click Here Click Here

Click-through Rate: 20% vs 30%

Page 11:

• 10,000 impressions/month
• Need 4,000 clicks by EOM
• 30% CTR won't be enough

Page 12:

Need to keep testing (Exploration)

Page 13:
Page 14:

[Grid of many "Click Here" button variants]

A, B, C, D, E, F, G, ... Test
Each variant would be assigned with probability 1/N

N = # of variants

Page 15:

Not everyone is a winner

Page 16:

[Grid of many "Click Here" button variants]

A, B, C, D, E, F, G, ... Test
Each variant would be assigned with probability 1/N

N = # of variants

Page 17:

Need to keep testing (Exploration)

Need to minimize regret (Exploitation)

Page 18:

Multi Armed Bandit

Balance of Exploitation & Exploration

Page 19:

Bandit Algorithm Balances Exploitation & Exploration

AB Test: Before Test → AB Test → After Test
(Discrete Exploitation & Exploration Phases)

Multi Armed Bandit: Before Test → Continuous Exploitation & Exploration
(Bandit Favors Winning Arm)

Page 20:

Bandit Algorithm Reduces Risk of Testing

AB Test: best arm exploited with probability 1/N
– More arms: less exploitation

Bandit: best arm exploited with determined probability
– Reduced exposure to suboptimal arms

Page 21:

Demo

Borrowed from Probabilistic Programming & Bayesian Methods for Hackers

Page 22:
Page 23:

Split Test vs Bandit

Bandit: winner breaks away!
Split test: still sending losers

The AB test would have cost 4.3 percentage points

Page 24:

How it works

Epsilon Greedy Algorithm
ε = Probability of Exploration

Start of round:
– With probability ε: Exploration (each arm shown with probability ε / N)
– With probability 1 − ε: Exploitation (show best arm)

Epsilon Greedy with ε = 1 = AB Test
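The ε / (1 − ε) split above can be sketched as a minimal simulation; the two CTRs mirror the 20% vs 30% buttons from the earlier slides, and all function and variable names are illustrative, not from the talk:

```python
import random

def epsilon_greedy(true_ctrs, epsilon=0.1, rounds=10000, seed=0):
    rng = random.Random(seed)
    clicks = [0] * len(true_ctrs)   # successes per arm
    shows = [0] * len(true_ctrs)    # impressions per arm
    for _ in range(rounds):
        if rng.random() < epsilon:
            # Exploration: pick any arm uniformly (each with probability epsilon / N)
            arm = rng.randrange(len(true_ctrs))
        else:
            # Exploitation: show the arm with the best observed CTR so far
            arm = max(range(len(true_ctrs)),
                      key=lambda i: clicks[i] / shows[i] if shows[i] else 0.0)
        shows[arm] += 1
        if rng.random() < true_ctrs[arm]:
            clicks[arm] += 1
    return shows, clicks

shows, clicks = epsilon_greedy([0.20, 0.30])
# the better (30% CTR) arm should receive most impressions
```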

Page 25:

Epsilon Greedy Issues

• Constant Epsilon:
– Initially under-exploring
– Later over-exploring
– Better if the probability of exploration decreases with sample size (annealing)
• No prior knowledge

Page 26:

Some Alternatives

• Epsilon-First
• Epsilon-Decreasing
• Softmax
• UCB (UCB1, UCB2)
• Bayesian-UCB
• Thompson Sampling (Bayesian Bandits)
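Of the alternatives listed, UCB1 is simple enough to sketch: after trying each arm once, it always plays the arm maximizing mean reward plus the confidence bonus sqrt(2 ln t / n_i). This is a generic implementation of the published algorithm, not code from the talk:

```python
import math
import random

def ucb1(true_ctrs, rounds=20000, seed=2):
    rng = random.Random(seed)
    n = [0] * len(true_ctrs)     # pulls per arm
    s = [0.0] * len(true_ctrs)   # total reward per arm
    for t in range(1, rounds + 1):
        if 0 in n:
            arm = n.index(0)     # initialization: play each arm once
        else:
            # optimism in the face of uncertainty: mean + sqrt(2 ln t / n_i)
            arm = max(range(len(true_ctrs)),
                      key=lambda i: s[i] / n[i] + math.sqrt(2 * math.log(t) / n[i]))
        n[arm] += 1
        s[arm] += 1.0 if rng.random() < true_ctrs[arm] else 0.0
    return n, s

n, s = ucb1([0.20, 0.30])
# the 30% arm should accumulate the large majority of pulls
```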

Page 27:

Bandit Algorithm Comparison

Regret:
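The comparison slide references regret without showing the formula; a standard definition from the bandit literature (my reconstruction, not copied from the deck):

```latex
\rho(T) \;=\; T\,\mu^{*} \;-\; \sum_{t=1}^{T} r_{t}
```

where μ* is the expected reward of the best arm and r_t is the reward received in round t. Lower regret means fewer impressions wasted on losing arms.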

Page 28:

Thompson Sampling

Setup: Assign each arm a Beta distribution with parameters (α, β), tracking (# successes, # failures)

Click Here Click Here Click Here

Beta(α,β) Beta(α,β) Beta(α,β)

Page 29:

Thompson Sampling

Setup: Initialize priors with the ignorant state Beta(1,1) (Uniform distribution)
– Or initialize with an informed prior to aid convergence

Click Here Click Here Click Here

Beta(1,1) Beta(1,1) Beta(1,1)

Page 30:

Thompson Sampling

For each round:
1: Sample random variable X from each arm's Beta distribution
2: Select the arm with the largest X
3: Observe the result of the selected arm
4: Update the prior Beta distribution for the selected arm

Arms: Beta(1,1) | Beta(1,1) | Beta(1,1)
X: 0.7 | 0.2 | 0.4 → first arm selected: Success!

Page 31:

Thompson Sampling (continued)

After the success, the first arm's prior is updated:
Arms: Beta(2,1) | Beta(1,1) | Beta(1,1)

Page 32:

Thompson Sampling (continued)

Next round, with arms Beta(2,1) | Beta(1,1) | Beta(1,1):
X: 0.4 | 0.8 | 0.2 → second arm selected: Failure!

Page 33:

Thompson Sampling (continued)

After the failure, the second arm's prior is updated:
Arms: Beta(2,1) | Beta(1,2) | Beta(1,1)
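The four steps above can be sketched in Python; this is a minimal simulation, and the function and variable names are mine rather than from the talk:

```python
import random

def thompson(true_ctrs, rounds=5000, seed=1):
    rng = random.Random(seed)
    # One Beta(alpha, beta) per arm, initialized to the ignorant Beta(1,1) prior
    alpha = [1] * len(true_ctrs)   # 1 + successes
    beta = [1] * len(true_ctrs)    # 1 + failures
    pulls = [0] * len(true_ctrs)
    for _ in range(rounds):
        # 1: sample X from each arm's Beta distribution
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(len(true_ctrs))]
        # 2: select the arm with the largest X
        arm = max(range(len(true_ctrs)), key=samples.__getitem__)
        # 3: observe the result of the selected arm
        success = rng.random() < true_ctrs[arm]
        # 4: update the selected arm's Beta distribution
        if success:
            alpha[arm] += 1
        else:
            beta[arm] += 1
        pulls[arm] += 1
    return pulls, alpha, beta

pulls, alpha, beta = thompson([0.20, 0.30])
# the 30% arm should accumulate the majority of pulls
```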

Page 34:
Page 35:

Posterior after 100k pulls (30 arms)

Page 36:

Bandits at Meetup

Page 37:

Meetup’s First Bandit

Page 38:

Control: Welcome To Meetup! – 60% Open Rate
Winner: Hi – 75% Open Rate (+25%)

76 Arms

Page 39:

Page 40:

Page 41:

Avoid Linkbaity Subject Lines

Page 42:

Coupon Email (16 Arms)

Control: Save 50%, start your Meetup Group – 42% Open Rate
Winner: Here is a coupon – 53% Open Rate (+26%)

Page 43:

398 Arms

Page 44:
Page 45:

210% Click-through Difference

Best:
Looking to start the perfect Meetup for you?
We'll help you find just the right people

Start the perfect Meetup for you!
We'll help you find just the right people

Worst:
Launch your own Meetup in January and save 50%
Start the perfect Meetup for you
50% off promotion ends February 1st

Page 46:

Choose the Right Metric of Success

• Success was tied to click in the last experiment
• Sale-end & discount messaging had bad results
• Perhaps people don't know that hosting a Meetup costs $$$?
– Better to tie success to group creation

Page 47:

More Issues

• Email open & click delay
• New subject line effect
– Problem when testing notifications
• Monitor success trends to detect weirdness

Page 48:

Seasonality

• Thompson Sampling should naturally adapt to seasonal changes
– A learning rate can be added for faster adaptation

[Two "Click Here" button variants; one is the seasonal winner, the other the winner all other times]
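One common way to add the learning rate mentioned above is to decay each arm's Beta counts every round so old evidence fades and the posterior can track seasonal shifts. This is a hypothetical sketch, not Meetup's implementation:

```python
# Hypothetical discounted update for a single arm's Beta(alpha, beta) counts.
# gamma close to 1 forgets slowly; smaller gamma adapts faster to change.
def discounted_update(alpha, beta, success, gamma=0.99):
    alpha, beta = gamma * alpha, gamma * beta   # decay old evidence
    if success:
        alpha += 1.0
    else:
        beta += 1.0
    return alpha, beta

a, b = 1.0, 1.0          # start from the ignorant Beta(1,1) prior
for _ in range(100):
    a, b = discounted_update(a, b, success=True)
# after a run of successes the posterior mean a / (a + b) approaches 1,
# while the decayed counts cap how much history the arm remembers
```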

Page 49:

Bandit or Split Test?

AB Test good for:
– Biased Tests
– Complicated Tests

Bandit good for:
– Unbiased Tests
– Many Variants
– Time Restraints
– Set It And Forget It

Page 50:

Thanks!

chalpert@meetup.com