games where you can play optimally without any memory
1
Games where you can play optimally without any memory
Authors: Hugo Gimbert and Wieslaw Zielonka
Presented by Moria Abadi
2
Arena and Play
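[Figure: an arena whose states are split into $S_{\mathrm{Max}}$ (controlled by Max) and $S_{\mathrm{Min}}$ (controlled by Min); a play is a path through the arena]
color(play) = blue blue yellow …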
3
Payoff Mapping of Player
$u : C^{\omega} \to \mathbb{R} \cup \{-\infty, +\infty\}$
$u(x) \le u(y)$ means that y is at least as good for the player as x
The player wins payoff $u(x)$ in play x
4
Example 1 – Parity Game
$C \subseteq \mathbb{N}$
$u(c_0 c_1 \ldots) = (\limsup_{i \to \infty} c_i) \bmod 2$
Max wins 1 if the highest color visited infinitely often is odd; otherwise his payoff is 0
5
Example 2 – Sup Game
$C \subseteq \mathbb{R}$
$u(c_0 c_1 \ldots) = \sup_{i \in \mathbb{N}} c_i$
Max wins the highest value seen during the play
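A minimal sketch of the parity and sup payoffs, assuming the play is ultimately periodic and given as a finite prefix followed by a cycle that repeats forever (this "lasso" representation and the function names are illustrative assumptions, not part of the slides). Only the colors on the cycle occur infinitely often, which is all the parity payoff looks at, while the sup payoff looks at every color.

```python
def parity_payoff(prefix, cycle):
    """Parity payoff of the play prefix . cycle^omega.

    Only the colors on the cycle occur infinitely often, so the
    lim sup of the colors is the largest color on the cycle.
    """
    if not cycle:
        raise ValueError("the cycle must be non-empty")
    return max(cycle) % 2


def sup_payoff(prefix, cycle):
    """Sup payoff of prefix . cycle^omega: the largest color seen at all."""
    if not cycle:
        raise ValueError("the cycle must be non-empty")
    return max(list(prefix) + list(cycle))
```

For instance, with prefix 5 4 and cycle 2 1, the sup payoff sees the 5 in the prefix while the parity payoff ignores it.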
6
Example 4 – Mean Payoff Game
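$C \subseteq \mathbb{R}$
$u(c_0 c_1 \ldots) = \lim_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n} c_i$
This limit does not always exist, so the payoff is defined with lim sup:
$u(c_0 c_1 \ldots) = \limsup_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n} c_i$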
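To see why the plain limit of the averages can fail to exist, here is a small sketch with an illustrative sequence that is not taken from the slides: blocks of 1s and 0s whose lengths double, 1 00 1111 00000000 and so on. The function name is hypothetical. The running average is exactly 1/3 at the end of every block of 0s and stays above 2/3 at the end of every block of 1s, so the averages oscillate and only lim sup and lim inf are defined.

```python
from fractions import Fraction


def block_end_averages(num_blocks):
    """Running averages of 1^1 0^2 1^4 0^8 ... taken at the end of each block."""
    total, length, averages = 0, 0, []
    for k in range(num_blocks):
        color = 1 if k % 2 == 0 else 0  # blocks alternate between 1s and 0s
        size = 2 ** k                   # each block is twice as long as the previous one
        total += color * size
        length += size
        averages.append(Fraction(total, length))
    return averages


averages = block_end_averages(8)
```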
7
Example 4 – Mean Payoff Game
[Figure: example arena with transitions colored 0, 1 and 2]
$u(c_0 c_1 \ldots) = \limsup_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n} c_i$
8
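Preference Relation of Player
$\sqsubseteq$ is a complete preorder relation on $C^{\omega}$
$x \sqsubseteq y$ means that y is at least as good for the player as x
u induces $\sqsubseteq$: $x \sqsubseteq y$ iff $u(x) \le u(y)$
$x \sqsubset y$ denotes $x \sqsubseteq y$ but not $y \sqsubseteq x$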
9
Antagonistic Games
• $x \sqsubseteq^{-1} y$ iff $y \sqsubseteq x$
$\sqsubseteq$ is the preference relation of Max
$\sqsubseteq^{-1}$ is the preference relation of Min
10
Games, Strategies
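• Game $(G, \sqsubseteq)$
– G is a finite arena $G = (S_{\mathrm{Max}}, S_{\mathrm{Min}}, E)$
– $\sqsubseteq$ is a preference relation for player Max
• $\sigma$: strategy for Max
• $\tau$: strategy for Min
• $p_G(t, \sigma, \tau)$ is the play in G with source t consistent with both $\sigma$ and $\tau$.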
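When both strategies are memoryless, the play from a source state is ultimately periodic and can be computed by following the two strategies until a state repeats. The sketch below assumes a hypothetical encoding: each strategy is a dict from a state to the chosen successor, and colors are attached to transitions through a dict keyed by (source, target). None of these names come from the slides.

```python
def play_lasso(source, sigma, tau, color):
    """Colors of the play from `source` under memoryless sigma (Max) and tau (Min).

    Returns (prefix, cycle): the play is prefix followed by cycle repeated forever.
    """
    move = {**sigma, **tau}  # sigma is defined on Max states, tau on Min states
    first_visit = {}         # state -> number of transitions taken before reaching it
    colors = []
    state = source
    while state not in first_visit:
        first_visit[state] = len(colors)
        successor = move[state]
        colors.append(color[(state, successor)])
        state = successor
    start = first_visit[state]
    return colors[:start], colors[start:]


# Hypothetical arena: a and c belong to Max, b belongs to Min.
color = {("a", "b"): 1, ("b", "a"): 0, ("b", "c"): 2, ("c", "c"): 0}
sigma = {"a": "b", "c": "c"}
```

With tau = {"b": "c"} the play from a leaves the a-b loop and ends in the self-loop on c; with tau = {"b": "a"} it cycles between a and b forever.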
11
Optimal Strategies Intuition
• $p_G(t, \sigma^{\#}, \tau^{\#})$ is a play
• $\sigma^{\#}$ and $\tau^{\#}$ are optimal if neither Max nor Min can gain by changing his strategy unilaterally
12
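Optimal Strategies Definition
$(G, \sqsubseteq)$ is given
$\sigma^{\#}$ and $\tau^{\#}$ are optimal if, for all states s and all strategies $\sigma$ and $\tau$:
$\mathrm{color}(p_G(s, \sigma, \tau^{\#})) \sqsubseteq \mathrm{color}(p_G(s, \sigma^{\#}, \tau^{\#})) \sqsubseteq \mathrm{color}(p_G(s, \sigma^{\#}, \tau))$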
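The saddle-point condition can be tested by brute force on a tiny example. The sketch below is only an illustration under loud assumptions: the arena, its colors and all names are hypothetical, the payoff is the parity payoff, and deviations are restricted to memoryless strategies, whereas the definition quantifies over all strategies.

```python
from itertools import product

# Hypothetical arena: owner of each state, and colored transitions (source, target) -> color.
OWNER = {"a": "Max", "b": "Min", "c": "Max"}
COLOR = {("a", "b"): 1, ("a", "c"): 0, ("b", "a"): 2, ("b", "c"): 1, ("c", "c"): 1}


def strategies(player):
    """All memoryless strategies of `player`: one successor per owned state."""
    owned = sorted(s for s in OWNER if OWNER[s] == player)
    choices = [[t for (s, t) in COLOR if s == state] for state in owned]
    return [dict(zip(owned, pick)) for pick in product(*choices)]


def payoff(source, sigma, tau):
    """Parity payoff of the play from `source`: largest color on the cycle, mod 2."""
    move = {**sigma, **tau}
    first_visit, colors, state = {}, [], source
    while state not in first_visit:
        first_visit[state] = len(colors)
        colors.append(COLOR[(state, move[state])])
        state = move[state]
    return max(colors[first_visit[state]:]) % 2


def is_optimal(sigma_sharp, tau_sharp):
    """Saddle-point test, checked against memoryless deviations only."""
    return all(
        payoff(s, sigma, tau_sharp)
        <= payoff(s, sigma_sharp, tau_sharp)
        <= payoff(s, sigma_sharp, tau)
        for s in OWNER
        for sigma in strategies("Max")
        for tau in strategies("Min")
    )


optimal_pairs = [
    (sigma, tau)
    for sigma in strategies("Max")
    for tau in strategies("Min")
    if is_optimal(sigma, tau)
]
```

In this arena Max secures payoff 1 from every state by moving from a to c, so every pair that passes the test has sigma["a"] == "c".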
13
The Main Question
Under which conditions do Max and Min have optimal memoryless strategies for all games?
Some conditions on $\sqsubseteq$ will be defined.
Min and Max have optimal memoryless strategies iff $\sqsubseteq$ satisfies these conditions.
Parity games, mean payoff games, …
14
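[L]
Rec(C): all languages of finite words over C recognizable by finite automata
Pref(L): all prefixes of the words in L
For $L \in \mathrm{Rec}(C)$: $[L] = \{ x \in C^{\omega} \mid \text{every finite prefix of } x \text{ is in } \mathrm{Pref}(L) \}$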
15
[L] Example
$L = \{ 1^k (01)^m 0^j \mid 0 \le k \le 1,\ 0 \le m,\ 0 \le j \le 1 \}$
$[L] = \{ (01)^{\omega}, (10)^{\omega} \}$
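A brute-force check of this example, assuming L is the language of the regular expression 1?(01)*0? (one 1 at most, then any number of 01 blocks, then one 0 at most). The sketch enumerates Pref(L) up to a length bound and groups it by length. Exactly two prefixes survive at every length, the two alternating words, which is consistent with [L] containing only the two infinite alternating words. The bound MAX_LEN and the helper names are illustrative.

```python
import re
from itertools import product

# Assumed reading of the example: L = {1^k (01)^m 0^j | k, j in {0, 1}, m >= 0}.
L_PATTERN = re.compile(r"1?(01)*0?")


def words_of_L(max_len):
    """All words of L of length at most max_len (brute force over {0,1}*)."""
    return {
        "".join(w)
        for n in range(max_len + 1)
        for w in product("01", repeat=n)
        if L_PATTERN.fullmatch("".join(w))
    }


def prefixes(words):
    """Every prefix, including the empty one, of every given word."""
    return {w[:i] for w in words for i in range(len(w) + 1)}


MAX_LEN = 10
pref = prefixes(words_of_L(MAX_LEN))
by_length = {n: sorted(w for w in pref if len(w) == n) for n in range(1, MAX_LEN + 1)}
```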
16
Lemma 3
$[L \cup M] = [L] \cup [M]$
$x \in \mathrm{Pref}(L)$, $x \notin \mathrm{Pref}(M)$
$x \in \mathrm{Pref}(M)$, $x \notin \mathrm{Pref}(L)$
17
Co-accessible Automaton
• From any state there is a (possibly empty) path to a final state
[Figure: a co-accessible automaton with initial state i and transitions labeled 0 and 1]
C={0,1}
18
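Lemma 4
• Let A=(Q,i,F,Δ) be a co-accessible finite automaton recognizing a language L. Then
$[L] = \{ \mathrm{color}(p) \mid p \text{ is an infinite path in } A,\ \mathrm{source}(p) = i \}$
[Figure: a co-accessible automaton with initial state i over C = {0,1}]
$p = e_0 e_1 e_2 \ldots$: $\forall n$ there is a path from $\mathrm{target}(e_n)$ to a final state $\Rightarrow$ $\mathrm{color}(p) \in [L]$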
19
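Lemma 4
• Let A=(Q,i,F,Δ) be a co-accessible finite automaton recognizing a language L. Then
$[L] = \{ \mathrm{color}(p) \mid p \text{ is an infinite path in } A,\ \mathrm{source}(p) = i \}$
[Figure: a co-accessible automaton with initial state i over C = {0,1}]
$x = c_0 c_1 c_2 \ldots \in [L]$: $\forall n$ there is a path from i matching $c_0 \ldots c_n$ $\Rightarrow$ there is an infinite path p from i with $\mathrm{color}(p) = x$ (by König's lemma, since A is finite)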
20
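Extension of $\sqsubseteq$ and $\sqsubset$
For $X, Y \subseteq C^{\omega}$:
$X \sqsubseteq Y$ iff $\forall x \in X\ \exists y \in Y,\ x \sqsubseteq y$
$X \sqsubset Y$ iff $\exists y \in Y\ \forall x \in X,\ x \sqsubset y$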
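A small sketch of the two lifted relations, under the assumptions that the preference relation is induced by a payoff function u and that the sets are finite (in the slides they are sets of infinite words and may be infinite). The function names are illustrative. Note the different quantifier order: the weak relation asks that every element of X is matched by some element of Y, the strict one asks for a single element of Y that beats all of X.

```python
def set_leq(X, Y, u):
    """Lifted weak relation: every x in X has some y in Y with u(x) <= u(y)."""
    return all(any(u(x) <= u(y) for y in Y) for x in X)


def set_lt(X, Y, u):
    """Lifted strict relation: some y in Y is strictly better than every x in X."""
    return any(all(u(x) < u(y) for x in X) for y in Y)


def identity(value):
    """Illustrative payoff: each play is represented directly by its payoff value."""
    return value
```

With these definitions the empty set is weakly below every set, and nothing is strictly below the empty set.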
21
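Monotony
$\sqsubseteq$ is monotone if $\forall M, N \in \mathrm{Rec}(C)$:
$\exists x \in C^*,\ [xM] \sqsubset [xN] \implies \forall y \in C^*,\ [yM] \sqsubseteq [yN]$
[Figure: the finite words x and y, each followed by the two possible futures M and N]
Intuitively: at each moment during the play, the optimal choice between two possible futures does not depend on the preceding finite play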
22
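Example of non-monotone $\sqsubseteq$
$C = \mathbb{R}$
$u(c_1 c_2 \ldots) = \sup_{n \in \mathbb{N}} \frac{1}{n} \sum_{i=1}^{n} c_i$
[Figure: two finite paths colored x and y, followed by the two possible futures v and w]
$v = 2\,0^{\omega}$, $w = 1^{\omega}$
$u(xv) = 2/5$, $u(xw) = 1$, $u(yv) = 6/5$, $u(yw) = 1$
$u(xv) < u(xw)$ while $u(yw) < u(yv)$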
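The four values can be recomputed exactly. The figure with x and y did not survive extraction, so the sketch assumes x = 0000 and y = 1111, a choice that reproduces the numbers on the slide, together with v = 2 followed by 0s forever and w = 1 forever. For a word made of a finite prefix followed by one constant color, the running averages after the prefix move monotonically towards that color, so the supremum is the larger of the best prefix average and the constant. The function name is hypothetical.

```python
from fractions import Fraction


def sup_mean(prefix, tail):
    """sup over n >= 1 of the average of the first n colors of prefix . tail^omega.

    `tail` is a single color repeated forever. After the prefix the averages
    are monotone and converge to `tail`, so only the prefix averages and the
    limit `tail` can be the supremum.
    """
    total, best = Fraction(0), None
    for n, c in enumerate(prefix, start=1):
        total += c
        average = total / n
        best = average if best is None else max(best, average)
    tail = Fraction(tail)
    return tail if best is None else max(best, tail)


x, y = [0, 0, 0, 0], [1, 1, 1, 1]   # assumed finite words, see the lead-in
u_xv = sup_mean(x + [2], 0)          # v = 2 0 0 0 ...
u_xw = sup_mean(x, 1)                # w = 1 1 1 ...
u_yv = sup_mean(y + [2], 0)
u_yw = sup_mean(y, 1)
```

After x the future w is strictly better than v, while after y the future v is strictly better than w, so the preferred future depends on the past.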
23
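Selectivity
$\sqsubseteq$ is selective if $\forall x \in C^*\ \forall M, N, K \in \mathrm{Rec}(C)$:
$[x (M \cup N)^* K] \sqsubseteq [x M^*] \cup [x N^*] \cup [x K]$
[Figure: two loops M and N, followed by an exit K]
Intuitively: the player cannot improve his payoff by switching between different behaviors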
24
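Example of non-selective $\sqsubseteq$
C={0,1}
$u(c_1 c_2 \ldots) = 1$ if the colors 0 and 1 both occur infinitely often, 0 otherwise
$M = \{ 1^k \mid 0 \le k \}$, $N = \{ 0^k \mid 0 \le k \}$
$(01)^{\omega} \in [(M \cup N)^*]$, $[M^*] = \{ 1^{\omega} \}$, $[N^*] = \{ 0^{\omega} \}$
$u((01)^{\omega}) > u(1^{\omega})$ and $u((01)^{\omega}) > u(0^{\omega})$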
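A minimal sketch of this payoff on ultimately periodic words, given as a prefix and a repeated cycle (an illustrative representation, with a hypothetical function name). A color occurs infinitely often exactly when it appears on the cycle, so alternating between the two loops pays 1 while staying in either loop forever pays 0.

```python
def both_colors_payoff(prefix, cycle):
    """1 if both colors 0 and 1 occur infinitely often in prefix . cycle^omega, else 0."""
    if not cycle:
        raise ValueError("the cycle must be non-empty")
    return 1 if {0, 1} <= set(cycle) else 0


u_alternating = both_colors_payoff([], [0, 1])  # (01)^omega
u_only_ones = both_colors_payoff([], [1])       # 1^omega
u_only_zeros = both_colors_payoff([], [0])      # 0^omega
```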
25
The Main Theorem
Given a preference relation $\sqsubseteq$, both players have optimal memoryless strategies for all games $(G, \sqsubseteq)$ over finite arenas G if and only if the relations $\sqsubseteq$ and $\sqsubseteq^{-1}$ are monotone and selective
26
Proof of Necessary Condition
Given a preference relation $\sqsubseteq$, if both players have optimal memoryless strategies for all games $(G, \sqsubseteq)$ over finite arenas G, then the relations $\sqsubseteq$ and $\sqsubseteq^{-1}$ are monotone and selective
27
Simplification 1
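[Figure: arena A with states $S_A$, relation $\sqsubseteq$ and strategy $\sigma^{\#}$, next to arena B with states $S_B$, relation $\sqsubseteq^{-1}$ and strategy $\tau^{\#}$; exchanging the roles of Max and Min swaps the two]
It is enough to prove the claim only for $\sqsubseteq$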
28
Simplification 2
• It turns out that already for one-player games, if Max has optimal memoryless strategies, $\sqsubseteq$ has to be monotone and selective
[Figure: the one-player arenas form a subclass of the two-player arenas]
29
Lemma 5
Suppose that player Max has optimal memoryless strategies for all games $(G, \sqsubseteq)$ over finite one-player arenas $G = (S_{\mathrm{Max}}, \emptyset, E)$.
Then $\sqsubseteq$ is monotone and selective.
30
Proof of Monotony
Let $x, y \in C^*$ and $M, N \in \mathrm{Rec}(C)$ with $[xM] \sqsubset [xN]$. We shall prove $[yM] \sqsubseteq [yN]$.
• $A_x$ and $A_y$ are deterministic co-accessible automata recognizing {x} and {y}
• $A_N$ and $A_M$ are co-accessible automata recognizing N and M
• W.l.o.g. $A_N$ and $A_M$ have no transition with the initial state as a target
31
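Proof of Monotony
$x, y \in C^*$ and $M, N \in \mathrm{Rec}(C)$ and $[xM] \sqsubset [xN]$ $\Rightarrow$ $[yM] \sqsubseteq [yN]$
If $[M] = \emptyset$: trivial.
$[M] \ne \emptyset$ and $[N] \ne \emptyset$ $\Rightarrow$ by Lemma 4 there is an infinite path from the initial state of $A_M$ and of $A_N$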
32
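Automaton A
[Figure: automaton A built from $A_x$ and $A_y$, joined at a state t and followed by $A_M$ and $A_N$; i marks initial states, F marks final states]
Recognizes $x(M \cup N)$
The colors of all plays are $[x(M \cup N)] = [xM] \cup [xN]$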
33
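[Figure: the automaton A with a play p through $A_x$ and a play q through $A_y$, both passing through the state t]
$x, y \in C^*$ and $M, N \in \mathrm{Rec}(C)$ and $[xM] \sqsubset [xN]$ $\Rightarrow$ $[yM] \sqsubseteq [yN]$
p: play consistent with $\sigma^{\#}$
q: play consistent with $\sigma^{\#}$
$\mathrm{color}(q) \in [yN]$ and $[yM] \cup [yN] \sqsubseteq \mathrm{color}(q)$ $\Rightarrow$ $[yM] \sqsubseteq [yN]$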
34
Proof of Sufficient Condition
Given a preference relation $\sqsubseteq$ such that $\sqsubseteq$ and $\sqsubseteq^{-1}$ are monotone and selective, both players have optimal memoryless strategies for all games $(G, \sqsubseteq)$ over finite arenas G.
35
Arena Number
• G=(S,E)
• $n_G = |E| - |S|$
• Each state has at least one outgoing transition $\Rightarrow$ $n_G \ge 0$
• The proof is by induction on $n_G$
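A minimal sketch of the arena number, assuming an arena is given as a set of states and a set of transitions encoded as (source, color, target) triples; this encoding and the function name are illustrative. When the number is 0, every state has exactly one outgoing transition, so neither player ever has a choice.

```python
def arena_number(states, transitions):
    """n_G = |E| - |S| for an arena in which every state has an outgoing transition."""
    sources = {source for (source, _color, _target) in transitions}
    if not set(states) <= sources:
        raise ValueError("every state needs at least one outgoing transition")
    return len(transitions) - len(states)


# Hypothetical two-state arenas.
no_choice = {("a", 0, "b"), ("b", 1, "a")}
one_choice = no_choice | {("a", 1, "a")}
```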
36
Induction
Basis
For an arena G with $n_G = 0$, every state has exactly one outgoing transition, so the strategies are unique.
Hypothesis
Let G be an arena and let $\sqsubseteq$ be monotone and selective. Suppose Max and Min have optimal memoryless strategies in all games $(H, \sqsubseteq)$ over arenas H such that $n_H < n_G$. Then Max has an optimal memoryless strategy in $(G, \sqsubseteq)$.
37
$\sigma^{\#}$
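• We need to find $\sigma^{\#}$ such that $(\sigma^{\#}, \tau^{\#})$ is optimal
• We will find $\tau^{\#}_m$, which requires memory, such that $(\sigma^{\#}, \tau^{\#}_m)$ is optimal
• Permuting Max and Min, we will find $(\sigma^{\#}_m, \tau^{\#})$ optimal
• $(\sigma^{\#}, \tau^{\#}_m)$ and $(\sigma^{\#}_m, \tau^{\#})$ are optimal $\Rightarrow$ $(\sigma^{\#}, \tau^{\#})$ is optimal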
38
Induction Step
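[Figure: the arena G is split at the state t into two sub-arenas $G_0$ and $G_1$]
$n_{G_i} < n_G$
$(\sigma^{\#}_i, \tau^{\#}_i)$: optimal strategies in $G_i$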
39
Induction Step
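$K_i$: colors of finite plays in $G_i$ from t consistent with $\tau^{\#}_i$
$K_i \in \mathrm{Rec}(C)$ and $\sqsubseteq$ is monotone $\Rightarrow$ $\forall x \in C^*\ [xK_0] \sqsubseteq [xK_1]$ or $\forall x \in C^*\ [xK_1] \sqsubseteq [xK_0]$
W.l.o.g. $\forall x \in C^*\ [xK_1] \sqsubseteq [xK_0]$. So let $\sigma^{\#} = \sigma^{\#}_0$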
40
$\tau^{\#}$
$\tau^{\#}(p) = \tau^{\#}_0(\mathrm{target}(p))$ if the last transition from t was to $G_0$
$\tau^{\#}(p) = \tau^{\#}_1(\mathrm{target}(p))$ if the last transition from t was to $G_1$
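The strategy with memory can be sketched as a function of the finite play so far. The encoding below is hypothetical: a play is a non-empty list of transitions (source, target), the memoryless strategies for the two sub-arenas are dicts from a Min state to the chosen successor, and the set of transitions leaving t into the first sub-arena is passed explicitly. The slide does not say what to do before t is first visited; the sketch assumes the first strategy is used in that case.

```python
def make_tau_sharp(t, transitions_to_g0, tau0, tau1):
    """Combine memoryless tau0 (for G0) and tau1 (for G1) into one strategy with memory."""

    def tau_sharp(play):
        # Assumption: before the first visit to t, behave as in G0.
        which = 0
        for transition in play:
            if transition[0] == t:
                # Remember which sub-arena the latest transition from t entered.
                which = 0 if transition in transitions_to_g0 else 1
        target = play[-1][1]
        return (tau0 if which == 0 else tau1)[target]

    return tau_sharp


# Hypothetical data: from t, the transition to a stays in G0 and the one to b enters G1.
tau_sharp = make_tau_sharp("t", {("t", "a")}, {"m": "p"}, {"m": "q"})
```

The only memory needed is one bit, updated each time the play leaves t.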
41
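$\mathrm{color}(p_G(s, \sigma, \tau^{\#})) \sqsubseteq \mathrm{color}(p_G(s, \sigma^{\#}, \tau^{\#})) \sqsubseteq \mathrm{color}(p_G(s, \sigma^{\#}, \tau))$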
42
$\mathrm{color}(p_G(s, \sigma^{\#}, \tau^{\#})) \sqsubseteq \mathrm{color}(p_G(s, \sigma^{\#}, \tau))$
All plays are in $G_0$
43
$\mathrm{color}(p_G(s, \sigma, \tau^{\#})) \sqsubseteq \mathrm{color}(p_G(s, \sigma^{\#}, \tau^{\#}))$
$p_G(s, \sigma, \tau^{\#})$ traverses the state t
All plays are in $G_0$
44
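$\mathrm{color}(p_G(s, \sigma, \tau^{\#})) \sqsubseteq \mathrm{color}(p_G(s, \sigma^{\#}, \tau^{\#}))$
x: color of the shortest path to t consistent with $\tau^{\#}$
$M_i$: colors of finite plays in $G_i$ from t to t consistent with $\tau^{\#}_i$
$\mathrm{color}(p_G(s, \sigma, \tau^{\#})) \in [x (M_0 \cup M_1)^* (K_0 \cup K_1)] \sqsubseteq [x M_0^*] \cup [x M_1^*] \cup [x (K_0 \cup K_1)]$
$(M_i)^* \subseteq K_i$ $\Rightarrow$ $\mathrm{color}(p_G(s, \sigma, \tau^{\#})) \sqsubseteq [x (K_0 \cup K_1)] = [xK_0] \cup [xK_1] \sqsubseteq [xK_0]$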
45
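$\mathrm{color}(p_G(s, \sigma, \tau^{\#})) \sqsubseteq \mathrm{color}(p_G(s, \sigma^{\#}, \tau^{\#}))$
$\mathrm{color}(p_G(s, \sigma, \tau^{\#})) \sqsubseteq [xK_0] \sqsubseteq \mathrm{color}(p_{G_0}(s, \sigma^{\#}_0, \tau^{\#}_0)) = \mathrm{color}(p_G(s, \sigma^{\#}, \tau^{\#}))$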
46
A Very Important Corollary
Suppose that $\sqsubseteq$ is such that for each finite arena $G = (S_{\mathrm{Max}}, S_{\mathrm{Min}}, E)$ controlled by one player ($S_{\mathrm{Max}} = \emptyset$ or $S_{\mathrm{Min}} = \emptyset$), this player has an optimal memoryless strategy in $(G, \sqsubseteq)$.
Then for all finite two-player arenas G both players have optimal memoryless strategies in the games $(G, \sqsubseteq)$.
47
Mean Payoff Game
$u(c_0 c_1 \ldots) = \limsup_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n} c_i$