may the best man win! - university of...

34
“May the Best Man Win!” Simulation optimization for match-making in e-sports Ilya O. Ryzhov 1 Awais Tariq 2 Warren B. Powell 2 1 Robert H. Smith School of Business University of Maryland College Park, MD 20742 2 Operations Research and Financial Engineering Princeton University Princeton, NJ 08544 INFORMS Annual Meeting November 15, 2011 1 / 34

Upload: others

Post on 04-Jan-2020

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

“May the Best Man Win!”Simulation optimization for match-making in e-sports

Ilya O. Ryzhov1 Awais Tariq2 Warren B. Powell2

1Robert H. Smith School of BusinessUniversity of Maryland

College Park, MD 20742

2Operations Research and Financial EngineeringPrinceton UniversityPrinceton, NJ 08544

INFORMS Annual MeetingNovember 15, 2011

1 / 34

Page 2: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Outline

1 Introduction

2 TrueSkill™ model for learning skill levelsLearning with moment-matchingThe DrawChance policy

3 Match-making with knowledge gradients

4 Moving on: Targeting and selection

5 Conclusions

2 / 34

Page 3: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Outline

1 Introduction

2 TrueSkill™ model for learning skill levelsLearning with moment-matchingThe DrawChance policy

3 Match-making with knowledge gradients

4 Moving on: Targeting and selection

5 Conclusions

3 / 34

Page 4: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Motivation: e-sports

The term “e-sports” refers to competitive multi-player online gaming

Thousands of players simultaneously log on to networks such as XboxLive or Battle.net

4 / 34

Page 5: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Motivation: e-sports

Revenues of South Korean game company NCSoft, 2000-2004 (Huhh 2008).

E-sports have become culturally significant and very profitable

Top players from around the world compete professionally

In 2005, Xbox Live had over 2 million subscribers; Battle.net has over3 million registered players for a single game

5 / 34

Page 6: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Ranking and competition in e-sports

Game services and outside organizations create rankings of players

Casual players are matched up automatically according to their skilllevel

6 / 34

Page 7: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Simulation optimization for match-making

We would like to create fair and challenging games by matchingplayers of similar skill level

The TrueSkill™ system used by Xbox Live views this as a Bayesianlearning problem in which we sequentially learn players’ skills

Unlike e.g. multi-armed bandit problems (Gittins 1989), the goal is tomatch a target rather than find the most skilled player

We compare a value-of-information procedure to the greedy policyused by Microsoft

7 / 34

Page 8: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Outline

1 Introduction

2 TrueSkill™ model for learning skill levelsLearning with moment-matchingThe DrawChance policy

3 Match-making with knowledge gradients

4 Moving on: Targeting and selection

5 Conclusions

8 / 34

Page 9: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Mathematical model

Player i = 0,1, ...,M has an underlying skill level si , unknown to thegame master

Our uncertainty about si is expressed as

si ∼N(

µ0i ,(σ

0i

)2)

The performance of player i in a game is expressed as

pi ∼N(si ,σ

)We assume that performances and skill levels are independent

9 / 34

Page 10: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Non-conjugacy of Bayesian model

We say that player i beats player j if pi > pj in a game between theseplayers

Unfortunately, we never observe the exact values of pi or pj , onlywhich player won

Thus, the posterior belief

P (si ∈ ds |pi > pj) =P (pi > pj |si = s)P (si ∈ ds)

P (pi > pj)

is non-normal

Conjugacy is forced using moment-matching (Minka 2001): plug themean and variance of the posterior into a normal distribution

10 / 34

Page 11: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Moment-matching for approximate conjugacy

Given the beliefs at time n, and the outcome of game n+ 1 between i andj , update (Dangauthier et al. 2007)

µn+1i =

µni +

(σni )

2

σ̄nij·v(

µni −µn

j

σ̄nij

)if pn+1

i > pn+1j ,

µni −

(σni )

2

σ̄nij·v(

µnj −µn

i

σ̄nij

)if pn+1

i < pn+1j ,

(σn+1i

)2=

(σn

i )2

(1− (σn

i )2

σ̄nij·w(

µni −µn

j

σ̄nij

))if pn+1

i > pn+1j ,

(σni )2

(1− (σn

i )2

σ̄nij·w(

µnj −µn

i

σ̄nij

))if pn+1

i < pn+1j ,

with v (x) = φ(x)Φ(x) , w (x) = v (x)(v (x) + x), and

σ̄nij = (σ

ni )2 +

(σnj

)2+ 2σ

2ε .

Intuitively: increase our skill estimate for the winning player.

11 / 34

Page 12: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Choosing an opponent

In Dangauthier et al. (2007), a game between i and j ends in a draw if

|pi −pj |< δ

for some small δ > 0.12 / 34

Page 13: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

The draw probability

After n games, our prediction that the (n+ 1)st game will end in a draw is

Pn(∣∣∣pn+1

i −pn+1j

∣∣∣< δ

)= IEnPn

(∣∣∣pn+1i −pn+1

j

∣∣∣< δ |si ,sj).

For very small δ ,

Pn(∣∣∣pn+1

i −pn+1j

∣∣∣< δ |si ,sj)≈ 1√

2π (2σ2ε )

e−(si−sj)

2

4σ2ε δ .

We take δ → 0 and define the draw probability as

qnij = IEn

1√2π (2σ2

ε )e−(si−sj)

2

4σ2ε

.

13 / 34

Page 14: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Choosing an opponent

Thus, the “probability” of a draw between players i and j is

qnij =1√2π

1√(σni

)2+(

σnj

)2+ 2σ2

ε

e

− (µni −µnj )2

2

((σn

i )2

+(σnj )

2+2σ2

ε

).

We expect the game to be more competitive when this quantity is higher.

14 / 34

Page 15: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Connection to DrawChance

The DrawChance formula given in Herbrich et al. (2006) is

q̃nij =

√√√√ 2σ2ε(

σni

)2+(

σnj

)2+ 2σ2

ε

e

− (µni −µnj )2

2

((σn

i )2

+(σnj )

2+2σ2

ε

)

which is identical to qnij up to a constant scale factor

The DrawChance policy used by Xbox Live greedily selects thematch-up with the highest draw probability

15 / 34

Page 16: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Outline

1 Introduction

2 TrueSkill™ model for learning skill levelsLearning with moment-matchingThe DrawChance policy

3 Match-making with knowledge gradients

4 Moving on: Targeting and selection

5 Conclusions

16 / 34

Page 17: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Simulation optimization for match-making

We interpret the match-making problem as online simulationoptimization with the objective

supπ

N

∑n=0

qn0,X π (µn,σn)

where π is a policy for choosing opponents for a fixed player 0

The concept of value of information (Chick 2006) looks ahead to theoutcome of the next decision

This approach can be adapted to many types of objective functions(Frazier et al. 2008; Ryzhov & Powell 2011a)

17 / 34

Page 18: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Prediction of game outcome

Our Bayesian beliefs provide us with an (approximate) estimate of theoutcome of a hypothetical game between i and j :

Proposition

Under the normality assumption, the probability that player i beats playerj in game n+ 1 is given by

Pn(pn+1i > pn+1

j

)= Φ

µni −µn

j(σni

)2+(

σnj

)2+ 2σ2

ε

.

18 / 34

Page 19: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Prediction of game outcome

Proof.

We compute

Pn(pn+1i > pn+1

j

)= IEnΦ

(si − sj√

2σ2ε

)

=∫

−∞

Φ

(x√2σ2

ε

)1√

((σni

)2+(

σnj

)2)e

− (x−(µi−µj))2

2

((σn

i )2

+(σnj )

2)dx

and recast the last line as P (X ≤ Y ) where X ∼N(0,2σ2

ε

)and

Y ∼N

(µi −µj ,(σn

i )2 +(

σnj

)2)

are independent.

19 / 34

Page 20: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Value of information in match-making

Let

kw ,n+1 =(µw ,n+1,σw ,n+1

), k l ,n+1 =

(µl ,n+1,σ l ,n+1

)be the beliefs that we would have at time n+ 1 if player 0 wins (orloses) against j

Similarly, let qw ,n+10i (or ql ,n+1

0i ) be the draw probabilities if player 0wins (or loses)

The greedy policy would arrange the next game by computing

Fw ,n+1 = maxi

qw ,n+10i , F l ,n+1 max

iql ,n+1

0i

depending on what happens now

20 / 34

Page 21: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Value of information in match-making

If we stop learning after the next game, the optimal match-up is

X n = arg maxj

qn0j + (N−n)F nj

where

F nj = Pn (0 beats j)Fw ,n+1 +Pn (j beats 0)F l ,n+1

is the expected value (pre-game) of the highest draw probability(post-game)

If the total number N of games is unknown, use

X n = arg maxj

qn0j +γ

1− γF nj

where γ is tunable

21 / 34

Page 22: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Experimental results: draw probabilities

In simulations, our method behaved more aggressively than DrawChance...

22 / 34

Page 23: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Experimental results: difference in true skills

...pursued tougher opponents early on, but found better matches later...

23 / 34

Page 24: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Experimental results: errors of estimates

...produced better estimates of player 0’s true skill...

24 / 34

Page 25: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Experimental results: win/loss ratios

...and came closer to a 0.5 win/loss ratio.

25 / 34

Page 26: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Outline

1 Introduction

2 TrueSkill™ model for learning skill levelsLearning with moment-matchingThe DrawChance policy

3 Match-making with knowledge gradients

4 Moving on: Targeting and selection

5 Conclusions

26 / 34

Page 27: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

...but that’s not the end!

In simulation optimization, wemight tune a simulator to seehow performance of a systemcould be improved

But before the simulator can beoptimized, we need to make surethat it is a good model of reality

Targeting and selection: whichsimulation model most closelymatches data from the field?

27 / 34

Page 28: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Targeting and selection

Let c be a deterministic target (e.g. average historical performance)and consider M simulation models

The mean output si of model i matches the target if

|si − c |< δ

We can simulate system i to obtain a noisy observation

pi ∼N(si ,σ

),

and apply Bayesian updating with no moment-matching required

The “draw probability” in this context is given by

qni =1√2π

1√(σni

)2+ σ2

ε

e

− (µni −c)2

2

((σn

i )2

+σ2ε

).

28 / 34

Page 29: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

The value of information

Our goal is to maximize

supπ

IEmaxi

qNi

or its online equivalent, if we are refining an existing simulator

Bayesian analysis tells us that, conditional on our beliefs at time n,

µn+1i ∼N

(µni ,(σ̃

ni )2)

where (σ̃ni )2 = (σn

i )2−(σn+1i

)2

The knowledge gradient approach simulates the system

X n = arg maxi

IEni max

jqn+1j

which is expected to yield the best result after the simulation

29 / 34

Page 30: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Issues for further work

The quantity IEni maxj q

n+1j can be computed in closed form

If i is believed to be suboptimal with high precision, one simulationmay yield no information (see also Ryzhov & Powell 2011b)

30 / 34

Page 31: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Outline

1 Introduction

2 TrueSkill™ model for learning skill levelsLearning with moment-matchingThe DrawChance policy

3 Match-making with knowledge gradients

4 Moving on: Targeting and selection

5 Conclusions

31 / 34

Page 32: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

Conclusions

We have studied online match-making through the framework ofonline optimal learning

In simulations, a look-ahead policy offers some improvement over agreedy policy

The formulation of the problem has interesting implications for futurework in simulation optimization (Ryzhov 2011)

32 / 34

Page 33: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

References

Chick, S.E. (2006) “Subjective probability and Bayesianmethodology.” In Handbooks of Operations Research andManagement Science 13, 225–258.

Dangauthier, P., Herbrich, R., Minka, T. & Graepel, T. (2007)“TrueSkill through time: revisiting the history of chess.” In Advancesin Neural Information Processing Systems 20, 337–344.

Frazier, P.I., Powell, W. & Dayanik, S. (2008) “A knowledge-gradientpolicy for sequential information collection.” SIAM J. on Control andOptimization 47:5, 2410-2439.

Gittins, J. (1989) Multi-armed bandit allocation indices. John Wileyand Sons.

Herbrich, R., Minka, T. & Graepel, T. (2006) “TrueSkill™: a Bayesianskill rating system.” In Advances in Neural Information ProcessingSystems 19, 569–576.

33 / 34

Page 34: May the Best Man Win! - University Of Marylandscholar.rhsmith.umd.edu/sites/default/files/iryzhov/...\May the Best Man Win!" Simulation optimization for match-making in e-sports Ilya

References

Huhh, J. (2008) “Culture and business of PC bangs in Korea.”Games and Culture 3:1, 26–37.

Minka, T. (2001) “A family of algorithms for approximate Bayesianinference.” Ph.D. thesis, MIT.

Ryzhov, I.O. (2011) “Targeting and selection: a new approach tosimulation validation.” In preparation.

Ryzhov, I.O. & Powell, W.B. (2011a) “Information collection on agraph.” Operations Research 59:1, 188–201.

Ryzhov, I.O. & Powell, W.B. (2011b) “The value of information inmulti-armed bandits with exponentially distributed rewards.”Proceedings of the 2011 International Conference on ComputationalScience, 1363–1372.

34 / 34