
Preprocessing Techniques for Computing Nash Equilibria

Vincent Conitzer Duke University

Based on:

Conitzer and Sandholm. A Generalized Strategy Eliminability Criterion and Computational Methods for Applying It. AAAI-05

Conitzer and Sandholm. A Technique for Reducing Normal-Form Games to Compute a Nash Equilibrium. AAMAS-06

Computing Nash equilibria in (2-player) normal-form games

• Computing one Nash equilibrium is PPAD-complete
– Daskalakis, Goldberg, & Papadimitriou ECCC05; Chen & Deng FOCS06

• Determining whether a Nash equilibrium with a certain property exists is typically NP-complete
– Is there an equilibrium with player 1’s utility / each player’s utility / average utility > k? (Even hard to approximate)
– Is there an equilibrium that puts positive / zero probability on pure strategy s?
– Etc.
– Gilboa & Zemel GEB 89; Conitzer & Sandholm IJCAI03/extended draft

• All known algorithms take exponential time in the worst case
– Even Lemke-Howson [Savani & von Stengel FOCS04/Econometrica06]

Preprocessing games

• We are solving for an equilibrium (optimal equilibrium, maybe)

• When can we shrink the game and solve the shrunk game instead?

• If a strategy is (strictly) dominated, can throw it out

• If the game has only a small nonzero component, can focus on that

• We will see generalizations of both of these

A class of “hard” games [Sandholm, Gilpin, Conitzer AAAI05]

0, 2 0, 3 3, 0 0, 0 0, 2 0, 0 0, 2

0, 2 0, 0 0, 0 0, 3 0, 2 3, 0 0, 2

2, 4 2, 0 2, 0 2, 0 4, 2 2, 0 3, 3

3, 3 2, 0 2, 0 2, 0 2, 4 2, 0 4, 2

0, 2 3, 0 0, 3 0, 0 0, 2 0, 0 0, 2

0, 2 0, 0 0, 0 3, 0 0, 2 0, 3 0, 2

4, 2 2, 0 2, 0 2, 0 3, 3 2, 0 2, 4

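[Slide annotation: the equilibrium shown places probability 1/3 on each of three strategies per player and probability 0 on the remaining four.]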
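The equilibrium annotation can be checked mechanically. The sketch below is a minimal illustration, not part of the original slides: it encodes the 7×7 game and tests whether a pair of mixed strategies is a Nash equilibrium by comparing every pure deviation with the current expected payoff. The supports used (rows 3, 4, 7 and columns 1, 5, 7, counting from 1) are an assumption inferred from the payoffs, and the name `is_nash` is hypothetical.

```python
from fractions import Fraction

# The 7x7 "hard" game; each cell is (row player's utility, column player's utility).
GAME = [
    [(0, 2), (0, 3), (3, 0), (0, 0), (0, 2), (0, 0), (0, 2)],
    [(0, 2), (0, 0), (0, 0), (0, 3), (0, 2), (3, 0), (0, 2)],
    [(2, 4), (2, 0), (2, 0), (2, 0), (4, 2), (2, 0), (3, 3)],
    [(3, 3), (2, 0), (2, 0), (2, 0), (2, 4), (2, 0), (4, 2)],
    [(0, 2), (3, 0), (0, 3), (0, 0), (0, 2), (0, 0), (0, 2)],
    [(0, 2), (0, 0), (0, 0), (3, 0), (0, 2), (0, 3), (0, 2)],
    [(4, 2), (2, 0), (2, 0), (2, 0), (3, 3), (2, 0), (2, 4)],
]


def is_nash(game, p_row, p_col):
    """True if neither player can gain by deviating to any pure strategy."""
    n_rows, n_cols = len(game), len(game[0])
    # Expected utility of each pure strategy against the opponent's mixture.
    row_values = [sum(p_col[j] * game[i][j][0] for j in range(n_cols))
                  for i in range(n_rows)]
    col_values = [sum(p_row[i] * game[i][j][1] for i in range(n_rows))
                  for j in range(n_cols)]
    # Expected utility of the mixtures actually played.
    row_payoff = sum(p_row[i] * row_values[i] for i in range(n_rows))
    col_payoff = sum(p_col[j] * col_values[j] for j in range(n_cols))
    return max(row_values) <= row_payoff and max(col_values) <= col_payoff


third = Fraction(1, 3)
# Assumed supports (0-indexed): rows 2, 3, 6 and columns 0, 4, 6.
p_row = [0, 0, third, third, 0, 0, third]
p_col = [third, 0, 0, 0, third, 0, third]
print(is_nash(GAME, p_row, p_col))  # True
```

Exact fractions are used so that the equality between supported strategies' payoffs is not disturbed by floating-point rounding.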

Eliminability concepts

Dominance: strategy always does worse than some other (mixed) strategy
- strong argument
- local reasoning
- easy to compute
- often does not apply

Example (the third row is dominated by the mixture placing .5 on each of the first two rows):

.5      3, 2    2, 3
.5      2, 3    3, 2
        2, 0    2, 1

Nash equilibrium: strategy does not appear in support of any Nash equilibrium
- weaker argument
- global reasoning
- hard to compute
- applies more often

Example (the third row gets probability 0 in the equilibrium shown):

         .5      .5
.5      3, 2    2, 3
.5      2, 3    3, 2
0       4, 0    0, 1

Is there something “in between” that combines good aspects of both? Yes! [Conitzer & Sandholm AAAI05]
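For the dominance example, the claim that a .5/.5 mixture beats the third row can be checked directly. The sketch below is illustrative only: it tests one given mixture rather than searching for a dominating mixture (which in general needs a linear program), and the function name `dominated_by_mixture` is hypothetical.

```python
def dominated_by_mixture(row_utils, target, mixture):
    """True if `mixture` (dict: row index -> probability) gives the row player
    strictly more than row `target` against every column."""
    n_cols = len(row_utils[0])
    return all(
        sum(p * row_utils[r][c] for r, p in mixture.items()) > row_utils[target][c]
        for c in range(n_cols)
    )


# Row player's utilities in the two 3x2 examples.
dominance_example = [[3, 2], [2, 3], [2, 2]]
nash_example = [[3, 2], [2, 3], [4, 0]]
half_half = {0: 0.5, 1: 0.5}

print(dominated_by_mixture(dominance_example, 2, half_half))  # True
# In the second game the third row earns 4 against the first column, more than
# any other row, so no mixture of the other rows can dominate it.
print(dominated_by_mixture(nash_example, 2, half_half))       # False
```

A `False` result only says that this particular mixture does not dominate; here the comment explains why no other mixture would either.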

Definition as game between attacker and defender

Example: er* = sr3, Er = {sr3, sr4}, Ec = {sc3, sc4}

        sc1     sc2     sc3     sc4
sr1    2, 2    2, 2    2, 0    2, 0
sr2    2, 2    2, 2    2, 0    2, 0
sr3    0, 2    0, 2    3, 0    0, 3
sr4    0, 2    0, 2    0, 3    3, 0

• Stage 1: Defender specifies probabilities on E strategies (er* must get > 0)
– On the slide: probabilities 0.4, 0.3 and 0.5, 0.4 are placed on the E strategies

• Stage 2: Attacker chooses one of the E strategies with positive probability to attack and chooses (possibly mixed) attacking strategy

• Stage 3: Defender chooses on which (non-E) strategy to place the remainder of the probability
– On the slide: the remaining 0.1 joins the 0.5 and 0.4
– If attacking outperforms attacked, attacker wins

A spectrum of elimination power

• The larger the Ei sets, the more strategies are eliminable

• If the Ei sets include all strategies, then a strategy is eliminable if and only if no Nash equilibrium places positive probability on it

• If the Ei sets are empty (with the exception of er*) then er* is eliminable if and only if it is dominated

[Figure: spectrum running from dominance to Nash equilibrium as the Ei sets grow larger]

Alternative definition

Same example: er* = sr3, Er = {sr3, sr4}, Ec = {sc3, sc4}

• Stage 1: Defender specifies probabilities on E sets (er* must get > 0)
– On the slide: probabilities 0.4, 0.3 and 0.5, 0.4

• Stage 2: Attacker chooses one of the E strategies with positive probability to attack

• Stage 3: Defender distributes the remainder of the probability (not on E)
– On the slide: 0.05 and 0.05 complete the 0.5 and 0.4

• Stage 4: Attacker chooses attacking strategy
– If attacking outperforms attacked, attacker wins
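Stage 4 of the alternative definition is a plain best-response check once the defender's distribution is complete. The sketch below illustrates that single step for the row player in the 4×4 example game (rows sr1 to sr4, columns sc1 to sc4). It is not the full eliminability test: it does not search over the defender's choices. The placement of 0.5 on sc3, 0.4 on sc4 and 0.05 on each of sc1, sc2 is an assumption about which slide number goes where, and `attacker_wins` is a hypothetical name. A pure attacking strategy suffices at this stage, because a mixed attack earns an average of the pure attacks' payoffs.

```python
# Row player's utilities in the 4x4 example (rows sr1..sr4, columns sc1..sc4).
U_ROW = [
    [2, 2, 2, 2],
    [2, 2, 2, 2],
    [0, 0, 3, 0],
    [0, 0, 0, 3],
]


def attacker_wins(utils, attacked, opponent_dist):
    """True if some strategy strictly outperforms `attacked` against the
    opponent's completed distribution (E part plus remainder)."""
    expected = [sum(p * u for p, u in zip(opponent_dist, row)) for row in utils]
    return max(expected) > expected[attacked]


# Assumed completed column distribution: 0.05, 0.05 (remainder), 0.5, 0.4 (E part).
slide_dist = [0.05, 0.05, 0.5, 0.4]
print(attacker_wins(U_ROW, 2, slide_dist))  # True: sr3 earns 1.5, sr1 earns 2

# A defender who wants sr3 to survive this attack needs more weight on sc3.
heavy_sc3 = [0.15, 0.15, 0.7, 0.0]
print(attacker_wins(U_ROW, 2, heavy_sc3))   # False: sr3 earns about 2.1
```

Surviving one attack is not enough for the defender: with positive probability on sc3, the attacker may attack sc3 instead, which is where the interplay between Er and Ec comes in.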

Equivalence

• Theorem. The alternative definition is equivalent to the original one.

• Proof based on duality (more specifically, Minimax Theorem [von Neumann 1927])

Mixed integer programming approach (using alternative definition)

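• Continuous variables: $p_i(e_i)$, $p_i^{e_{-i}}(s_i)$; binary variables: $b_i(e_i)$

• maximize $p_r(e_r^*)$

• subject to
– for both $i$, for any $e_i \in E_i$:
$$\sum_{e_{-i} \in E_{-i}} p_{-i}(e_{-i}) + \sum_{s_{-i} \in S_{-i} \setminus E_{-i}} p_{-i}^{e_i}(s_{-i}) = 1$$
– for both $i$, for any $e_i \in E_i$:
$$p_i(e_i) \le b_i(e_i)$$
– for both $i$, for any $e_i \in E_i$ and any $d_i \in S_i$:
$$\sum_{e_{-i} \in E_{-i}} p_{-i}(e_{-i})\,\bigl(u_i(e_i, e_{-i}) - u_i(d_i, e_{-i})\bigr) + \sum_{s_{-i} \in S_{-i} \setminus E_{-i}} p_{-i}^{e_i}(s_{-i})\,\bigl(u_i(e_i, s_{-i}) - u_i(d_i, s_{-i})\bigr) \ge (b_i(e_i) - 1)\,U_i$$

• $U_i$ is the maximum difference between two of player $i$’s utilities

• Number of binary variables = $|E_r| + |E_c|$
– Exponential only in this!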

Eliminating strategies in the “hard” game

0, 2 0, 3 3, 0 0, 0 0, 2 0, 0 0, 2

0, 2 0, 0 0, 0 0, 3 0, 2 3, 0 0, 2

2, 4 2, 0 2, 0 2, 0 4, 2 2, 0 3, 3

3, 3 2, 0 2, 0 2, 0 2, 4 2, 0 4, 2

0, 2 3, 0 0, 3 0, 0 0, 2 0, 0 0, 2

0, 2 0, 0 0, 0 3, 0 0, 2 0, 3 0, 2

4, 2 2, 0 2, 0 2, 0 3, 3 2, 0 2, 4

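[Slide annotation: the equilibrium shown places probability 1/3 on each of three strategies per player and probability 0 on the remaining four; the sets Er and Ec are marked on the game.]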

Another preprocessing technique for computing a Nash equilibrium [Conitzer & Sandholm AAMAS06]

[Figure: the original game, made up of a component H and a subcomponent G, next to the reduced game in which G is replaced by a single strategy per player with payoffs πr, πc]

Required structure on original game O

        v1         v2         …    vl         t1         t2         …    tn
u1                                            c11, b1    c12, b1    …    c1n, b1
u2                 H                          c21, b2    c22, b2    …    c2n, b2
…                                             …          …          …    …
uk                                            ck1, bk    ck2, bk    …    ckn, bk
s1      a1, d11    a2, d12    …    al, d1l
s2      a1, d21    a2, d22    …    al, d2l               G
…       …          …          …    …
sm      a1, dm1    a2, dm2    …    al, dml

That is: against any fixed vj, all the si give the row player the same utility aj

against any fixed ui, all the tj give the column player the same utility bi

Solve for equilibrium of G (recursively)

[Figure: the subcomponent G, with rows s1, s2, …, sm and columns t1, t2, …, tn]

• Obtain
– Equilibrium distributions pG(si), pG(tj)
– Players’ expected payoffs in equilibrium πr, πc

Reduced game R

        v1                     v2                     …    vl                     t
u1                                                                                Σj pG(tj) c1j, b1
u2                             H                                                  Σj pG(tj) c2j, b2
…                                                                                 …
uk                                                                                Σj pG(tj) ckj, bk
s       a1, Σi pG(si) di1      a2, Σi pG(si) di2      …    al, Σi pG(si) dil      πr, πc

Row s, columns vj: expected payoffs when row player plays the equilibrium of G, column player plays vj

Row s, column t: expected payoffs when both players play the equilibrium of G

• Theorem. pR(ui), pR(s)·pG(si); pR(vj), pR(t)·pG(tj) constitutes a Nash equilibrium of the original game.

Example

Original game O (equilibrium probabilities in the margins):

               v1      t1      t2
              0.5     0.25    0.25
u1   0.5     2, 2    0, 3    2, 3
s1   0.25    1, 2    4, 0    0, 4
s2   0.25    1, 4    0, 4    4, 0

Subcomponent G:

               t1      t2
              0.5     0.5
s1   0.5     4, 0    0, 4
s2   0.5     0, 4    4, 0

Reduced game R:

               v1      t
              0.5     0.5
u1   0.5     2, 2    1, 3
s    0.5     1, 3    2, 2
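The construction of R and the lifting of its equilibrium back to O can be sketched for this example as follows. This is a minimal illustration under stated assumptions: the strategies are ordered with the ui before the si and the vj before the tj, the equilibria of G and R are supplied by hand rather than computed, and the names `reduce_game` and `lift` are hypothetical.

```python
from fractions import Fraction

# Original game O from the example; rows u1, s1, s2 and columns v1, t1, t2.
# Each cell is (row player's utility, column player's utility).
O = [
    [(2, 2), (0, 3), (2, 3)],
    [(1, 2), (4, 0), (0, 4)],
    [(1, 4), (0, 4), (4, 0)],
]
U_ROWS, S_ROWS = [0], [1, 2]  # indices of the u_i and the s_i
V_COLS, T_COLS = [0], [1, 2]  # indices of the v_j and the t_j


def reduce_game(game, u_rows, s_rows, v_cols, t_cols, p_s, p_t):
    """Replace subcomponent G (s_rows x t_cols) by one strategy per player,
    played according to G's equilibrium distributions p_s and p_t."""
    def cell(row_mix, col_mix):
        # Expected (row, column) payoffs when the two mixtures meet.
        return tuple(
            sum(pr * pc * game[i][j][k] for i, pr in row_mix for j, pc in col_mix)
            for k in (0, 1)
        )
    rows = [[(i, 1)] for i in u_rows] + [list(zip(s_rows, p_s))]
    cols = [[(j, 1)] for j in v_cols] + [list(zip(t_cols, p_t))]
    return [[cell(r, c) for c in cols] for r in rows]


def lift(p_r_rows, p_r_cols, p_s, p_t):
    """Spread the weight on the merged strategies s and t (last entries)
    over the s_i and t_j in proportion to G's equilibrium."""
    rows = list(p_r_rows[:-1]) + [p_r_rows[-1] * p for p in p_s]
    cols = list(p_r_cols[:-1]) + [p_r_cols[-1] * p for p in p_t]
    return rows, cols


half = Fraction(1, 2)
R = reduce_game(O, U_ROWS, S_ROWS, V_COLS, T_COLS, [half, half], [half, half])
print(R == [[(2, 2), (1, 3)], [(1, 3), (2, 2)]])  # True
p_rows, p_cols = lift([half, half], [half, half], [half, half], [half, half])
print(p_rows == [half, half / 2, half / 2])       # True: 0.5, 0.25, 0.25
```

Computing every cell of R as an expected payoff covers all four blocks at once: cells inside H are unchanged, and the merged cell comes out as (πr, πc).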

A more difficult example

        b1      b2      b3
a1     4, 0    1, 2    0, 4
a2     0, 3    2, 2    2, 3
a3     0, 4    1, 4    4, 0

= the game that we solved before!

            v1 = b2    t1 = b1    t2 = b3
u1 = a2     2, 2       0, 3       2, 3
s1 = a1     1, 2       4, 0       0, 4
s2 = a3     1, 4       0, 4       4, 0

• But how (in general) do we find the correct labeling of the strategies as ui, si, vj, tj? Can it be done in polynomial time?

Let’s try to use satisfiability

        b1      b2      b3
a1     4, 0    1, 2    0, 4
a2     0, 3    2, 2    2, 3
a3     0, 4    1, 4    4, 0

• Say that v(σ) = true if we label σ as one of the si or tj (that is, we put it “in” G)

• If a1, a2 are both in G, then b1 must also be in G because a1, a2 get different payoffs against b1

• Equivalently, v(a1) and v(a2) ⇒ v(b1)
– or (¬v(a1) or ¬v(a2) or v(b1))

• Theorem: satisfaction of all such clauses ⇔ the condition (the required structure on O) is satisfied

Clauses for the example

        b1      b2      b3
a1     4, 0    1, 2    0, 4
a2     0, 3    2, 2    2, 3
a3     0, 4    1, 4    4, 0

• v(a1) and v(a2) ⇒ v(b1) and v(b2) and v(b3)
• v(a1) and v(a3) ⇒ v(b1) and v(b3)
• v(a2) and v(a3) ⇒ v(b2) and v(b3)
• v(b1) and v(b2) ⇒ v(a1) and v(a2)
• v(b1) and v(b3) ⇒ v(a1) and v(a3)
• v(b2) and v(b3) ⇒ v(a1) and v(a2) and v(a3)

• Complete characterization of solutions:
– Set at most one variable to true for each player (does not reduce game)
– Set all variables to true (G = whole game!)
– Only nontrivial solution: set v(a1), v(a3), v(b1), v(b3) to true

Simple algorithm

• Algorithm to find nontrivial solution:
– Start with any two variables for the same agent set to true
– Follow the implications
– If all variables set to true, start with next pair of variables
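The steps above can be sketched as follows on the 3×3 example (rows a1 to a3, columns b1 to b3). This is a straightforward illustration rather than the authors' implementation: it re-derives the clauses from the payoffs on the fly, does no caching, and does not treat degenerate cases such as two rows that never differ for the row player. The names `close` and `find_subcomponent` are hypothetical.

```python
from itertools import combinations

# Example game: rows a1..a3, columns b1..b3; cells are (row utility, column utility).
GAME = [
    [(4, 0), (1, 2), (0, 4)],
    [(0, 3), (2, 2), (2, 3)],
    [(0, 4), (1, 4), (4, 0)],
]


def close(game, rows, cols):
    """Follow the implications until no clause adds a new strategy to G."""
    rows, cols = set(rows), set(cols)
    changed = True
    while changed:
        changed = False
        for r1, r2 in combinations(sorted(rows), 2):
            # Columns against which r1 and r2 pay the row player differently
            # must also be in G.
            forced = {c for c in range(len(game[0]))
                      if game[r1][c][0] != game[r2][c][0]}
            if not forced <= cols:
                cols |= forced
                changed = True
        for c1, c2 in combinations(sorted(cols), 2):
            # Rows against which c1 and c2 pay the column player differently
            # must also be in G.
            forced = {r for r in range(len(game))
                      if game[r][c1][1] != game[r][c2][1]}
            if not forced <= rows:
                rows |= forced
                changed = True
    return rows, cols


def find_subcomponent(game):
    """Try each starting pair; return the first closure that is not the whole game."""
    n_rows, n_cols = len(game), len(game[0])
    starts = [((a, b), ()) for a, b in combinations(range(n_rows), 2)]
    starts += [((), (a, b)) for a, b in combinations(range(n_cols), 2)]
    for rows, cols in starts:
        g_rows, g_cols = close(game, rows, cols)
        if len(g_rows) < n_rows or len(g_cols) < n_cols:
            return sorted(g_rows), sorted(g_cols)
    return None


print(find_subcomponent(GAME))  # ([0, 2], [0, 2]), i.e. a1, a3, b1, b3
```

Starting from a1 and a2 the closure grows to the whole game, so the search moves on to a1 and a3, whose closure stops at {a1, a3, b1, b3}.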

Solving the example with the algorithm (pass 1)

b1 b2 b3

a1 4, 0 1, 2 0, 4

a2 0, 3 2, 2 2, 3

a3 0, 4 1, 4 4, 0

• v(a1) and v(a2) ⇒ v(b1) and v(b2) and v(b3)
• v(b2) and v(b3) ⇒ v(a1) and v(a2) and v(a3)

• Variables set to true: v(a1), v(a2), v(a3), v(b1), v(b2), v(b3)

Solving the example with the algorithm (pass 2)

b1 b2 b3

a1 4, 0 1, 2 0, 4

a2 0, 3 2, 2 2, 3

a3 0, 4 1, 4 4, 0

• v(a1) and v(a3) ⇒ v(b1) and v(b3)
• v(b1) and v(b3) ⇒ v(a1) and v(a3)

• Variables set to true: v(a1), v(a3), v(b1), v(b3)

Algorithm complexity

• Theorem. Requires at most O((#rows + #columns)^4) clause applications
– That is, quadratic in the size of the payoff matrix if the game is square

• Can improve in practice by caching previous results

Preprocessing the “hard” game

[Figure: recursive preprocessing of the 7×7 “hard” game. The games shown on the slide, from the subcomponent up to the final reduced game:]

4×4 subcomponent of the hard game:

0, 3    3, 0    0, 0    0, 0
0, 0    0, 0    0, 3    3, 0
3, 0    0, 3    0, 0    0, 0
0, 0    0, 0    3, 0    0, 3

Its 2×2 subcomponents (each shown with probabilities 1/2, 1/2 for both players):

0, 3    3, 0
3, 0    0, 3

4×4 subcomponent after replacing one 2×2 subcomponent:

0, 3    3, 0    0, 0
0, 0    0, 0    1.5, 1.5
3, 0    0, 3    0, 0

After replacing both 2×2 subcomponents (shown with probabilities 1 and 0):

1.5, 1.5    0, 0
0, 0        1.5, 1.5

Hard game after replacing the 4×4 subcomponent:

0, 2    1.5, 1.5    0, 2    0, 2
2, 4    2, 0        4, 2    3, 3
3, 3    2, 0        2, 4    4, 2
4, 2    2, 0        3, 3    2, 4

Remaining 3×3 game (shown with probability 1/3 on every strategy):

2, 4    4, 2    3, 3
3, 3    2, 4    4, 2
4, 2    3, 3    2, 4

Conclusions

• Generalized strategy eliminability criterion [AAAI05]
– Parameterized definition
– At one extreme setting, dominance
– At other extreme, whether a strategy is in the support of any Nash equilibrium
– Efficiently computable for settings close to dominance

• Technique for recursively solving subcomponent [AAMAS06]
– Subcomponent’s solution can be used to shrink original game
– Efficient algorithm for finding subcomponent

• Other techniques? Generalization to extensive form?

Thank you for your attention!
