randomness for free laurent doyen lsv, ens cachan & cnrs joint work with krishnendu chatterjee,...

72
Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Upload: ashlyn-riley

Post on 16-Dec-2015

217 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Randomness for Free

Laurent Doyen

LSV, ENS Cachan & CNRS

joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Page 2: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Stochastic games on graphs

• Games for synthesis

- Reactive system synthesis = finding a winning strategy in a two-player game

• Game played on a graph

- Infinite number of rounds

- Player’s moves determine successor state

- Outcome = infinite path in the graph

When is randomness more powerful ?

When is randomness for free ?

Page 3: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

• Games for synthesis

- Reactive system synthesis = finding a winning strategy in a game

• Game played on a graph

- Infinite number of rounds

- Player’s moves determine successor state

- Outcome = infinite path in the graph

When is randomness more powerful ?

When is randomness for free ?

… in game structures ?

… in strategies ?

Stochastic games on graphs

Page 4: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Interaction: how players' moves are combined.

Information: what is visible to the players.

• Classification according to Information & Interaction

Round 1

Stochastic games on graphs

Page 5: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Interaction: how players' moves are combined.

Information: what is visible to the players.

• Classification according to Information & Interaction

Round 2

Stochastic games on graphs

Page 6: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Interaction: how players' moves are combined.

Information: what is visible to the players.

• Classification according to Information & Interaction

Round 3

Stochastic games on graphs

Page 7: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Mode of interaction

• Classification according to Information & Interaction Interaction

General case: concurrent & stochastic

Player 1’s move

Player 2’s move

Probability distribution on successor state

Players choose their moves simultaneously and independently

-player games

Page 8: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Mode of interaction

• Classification according to Information & Interaction Interaction

General case: concurrent & stochastic

Player 1’s move

Player 2’s move

Probability distribution on successor state

Players choose their moves simultaneously and independently

-player games (MDP)if A1 or A2 is a singleton.

Page 9: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Mode of interaction

Interaction

General case: concurrent & stochastic

Separation of concurrency & probabilities

Page 10: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Mode of interaction

Interaction

Special case: turn-based

Player 1 state

Player 2 state

In each state, one player’s move determines successor

Page 11: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Knowledge

• Classification according to Information & Interaction

In state , player sees such that

Information

General case: partial observation

Two partitions and

Player 1’s view

Player 2’s view

Page 12: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Knowledge

Information

Special case 1: one-sided complete observation

or

Player 1’s view

Player 2’s view

Page 13: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Knowledge

Information

Special case 2: complete observation

and

Player 1’s view

Player 2’s view

Page 14: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Classification

• Classification according to Information & Interaction

complete observation

one-sided complete obs.

partial observation

turn-based

concurrent

Information Interaction

-player games

Page 15: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Classification

• Classification according to Information & Interaction

complete observation MDP

one-sided complete obs. POMDP

turn-based

concurrent

=

InformationInteraction

-player games

Markov Decision Process

partial observation

=

Page 16: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Strategies

A strategy for Player is a function that maps histories to probability distribution over actions.

Strategies are observation-based:

Page 17: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Strategies

A strategy for Player is a function that maps histories to probability distribution over actions.

Strategies are observation-based:

Special case: pure strategies

Page 18: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Objectives

An objective (for player 1) is a measurable set of infinite sequences of states:

Page 19: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Objectives

An objective (for player 1) is a measurable set of infinite sequences of states:

Examples:

- Reachability, safety - Büchi, coBüchi - Parity - Borel

B C

Büchi coBüchi

Reachability Safety

Page 20: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Value

Probability of finite prefix of a play:

induces a unique probability measure on measurable sets of plays:

and a value function for Player 1:

Page 21: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Value

Probability of finite prefix of a play:

induces a unique probability measure on measurable sets of plays:

and a value function for Player 1:

Our reductions preserve values and existence of optimal strategies.

Page 22: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Outline

• Randomness in game structures

- for free with complete-observation, concurrent

- for free with one-sided, turn-based

• Randomness in strategies

- for free in (PO)MDP

• Corollary: undecidability results

Page 23: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Preliminary

Rational probabilities

Probabilities only

1. Make all probabilistic states have two successors

12 3 4

1

2

3 4

Page 24: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Preliminary

Rational probabilities

Probabilities only

2. Use binary encoding of p and q-p to simulate [Zwick,Paterson 96] :

Page 25: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Preliminary

Rational probabilities

Probabilities only

2. Use binary encoding of p and q-p to simulate [Zwick,Paterson 96] :

Page 26: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Preliminary

Rational probabilities

Probabilities only

2. Use binary encoding of p and q-p to simulate [Zwick,Paterson 96] :

• Polynomial-time reduction, for parity objectives

Page 27: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Outline

• Randomness in game structure

- for free with complete-observation, concurrent

- for free with one-sided, turn-based

• Randomness in strategies

- for free in (PO)MDP

• Corollary: undecidability results

Page 28: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Classification

• Classification according to Information & Interaction

complete observation

one-sided complete obs.

partial observation

turn-based

concurrent

Information Interaction

Page 29: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Reduction

Simulate probabilistic state with concurrent state

Page 30: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Probability to go from s to s’0:

Reduction

Simulate probabilistic state with concurrent state

Page 31: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Reduction

Simulate probabilistic state with concurrent state

Probability to go from s to s’0:

If , then

Page 32: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Probability to go from s to s’0:

If , then

If , then

Reduction

Simulate probabilistic state with concurrent state

Page 33: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Probability to go from s to s’0:

If , then

If , then

Simulate probabilistic state with concurrent state

Each player can unilaterally decide to simulate the original game.

Reduction

Page 34: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Simulate probabilistic state with concurrent state

Player 1 can obtain at least the value v(s) by playing all actions uniformly at random: v’(s) ≥ v(s)

Player 2 can force the value to be at most v(s) by playing all actions uniformly at random: v’(s) ≤ v(s)

Reduction

Page 35: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

For {complete,one-sided,partial} observation, given a game with rational probabilities

we can construct a concurrent game with deterministic transition function

such that:

and existence of optimal observation-based strategies is preserved.

The reduction is in polynomial time for complete-observation parity games.

Reduction

Page 36: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Information leak from the back edges…

Partial information

Page 37: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Information leak from the back edges…

Partial information

No information leakage if all probabilities have same denominator q

use 1/q=gcd of all probabilities in G

Page 38: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Information leak from the back edges…

Partial information

No information leakage if all probabilities have same denominator q

use 1/q=gcd of all probabilities in GThe reduction is then exponential.

Page 39: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Example

Page 40: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Outline

• Randomness in game structure

- for free with complete-observation, concurrent

- for free with one-sided, turn-based

• Randomness in strategies

- for free in (PO)MDP

• Corollary: undecidability results

Page 41: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Overview

• Classification according to Information & Interaction

complete observation

one-sided complete obs.

partial observation

turn-based

concurrent

Information Interaction

Page 42: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Reduction

Simulate probabilistic state with imperfect information turn-based states

Player 1 states

Player 2 state Player 1

observation

Page 43: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Reduction

Each player can unilaterally decide to simulate the probabilistic stateby playing uniformly at random:

Player 2 chooses states (s,0), (s,1) unifiormly at random Player 1 chooses actions 0,1 unifiormly at random

Simulate probabilistic state with imperfect information turn-based states

Page 44: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Games with rational probabilities can be reduced to turn-based-games with deterministic transitions and (at least) one-sided complete observation.

Values and existence of optimal strategies are preserved.

Reduction

Page 45: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Randomness for free

In transition function (this talk):

Page 46: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

When randomness is not for free

• Complete-information turn-based ( ) games

- in deterministic games, value is either 0 or 1 [Martin98]- MDPs with reachability objective can have values in [0,1] Randomness is not for free.

• -player games (MDP & POMDP)

- in deterministic partial info -player games, value is either 0 or 1 [see later] - MDPs have value in [0,1]

Randomness is not for free.

Page 47: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Randomness for free

In transition function (this talk):

Page 48: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Randomness for free

In transition function (this talk):

In strategies (Everett’57, Martin’98, CDHR’07):

Page 49: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Randomness in strategies

Example- concurrent, complete observation

- reachability

Reminder: randomized strategy for Player :

pure strategy for Player :

Randomized strategies are more powerful

Page 50: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Randomness in strategies

Example- turn-based, one-sided complete

observation - reachability

Randomized strategies are more powerful

Page 51: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Outline

• Randomness in game structure

- for free with complete-observation, concurrent

- for free with one-sided, turn-based

• Randomness in strategies

- for free in (PO)MDP

• Corollary: undecidability results

Page 52: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

(p)

Randomness in strategies

Randomized

strategy

probability p

a

b(1-p)

RandStrategy(p) { x = rand(0…1) if x≤p then out(a) else out(b) }

Page 53: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

(p)

Randomness in strategies

Randomized

strategy

probability p

a

b(1-p)

Random generator

Uniform probability

x [0,1]

Pure strategya

Pure strategyb

x ≤ p

x > p

Page 54: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Randomness for free in strategies

For every randomized observation-based strategy , there exists a pure observation-based strategy such that:

1½-player game (POMDP), s0 initial state.

Proof. (assume alphabet of size 2, and fan-out = 2)

Given σ, we show that the value of σ can be obtained as the average of the value of pure strategies σx:

Page 55: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Randomness for free in strategies

Strategies σx are obtained by «de-randomization» of σ

Assume σ(s1)(a) = p and σ(s1)(b) = 1-p

σ is equivalent to playing σ0 with frequency p and σ1 with frequency 1-p where:

Page 56: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Randomness for free in strategies

Strategies σx are obtained by «de-randomization» of σ

Assume σ(s1)(a) = p and σ(s1)(b) = 1-p

σ is equivalent to playing σ0 with frequency p and σ1 with frequency 1-p where:

Page 57: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Randomness for free in strategies

Equivalently, toss a coin x [0,1],play σ0 if x ≤ p, and play σ1 if x > p.Playing σ in G can be viewed as a sequence of coin tosses:

s0

a

b

x ≤ p

x > p

toss x [0,1]

[ p = σ(s1)(a) ]

Page 58: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

s1

Randomness for free in strategies

Equivalently, toss a coin x [0,1],play σ0 if x ≤ p, and play σ1 if x > p.Playing σ in G can be viewed as a sequence of coin tosses:

s0

a

b

x ≤ p

x > p

y ≤ qa

y > qa

y ≤ qb

y > qb

s’

s’’

s’’’

s’’’’

s1

s1

s1

toss x [0,1]

toss y [0,1]

[ p = σ(s1)(a) ]

[ qa = δ(s0,a)(s1) ]s’

[ qb = δ(s0,b)(s1 ) ]s’’’

Page 59: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Randomness for free in strategies

Equivalently, toss a coin x [0,1],play σ0 if x ≤ p, and play σ1 if x > p.Playing σ in G can be viewed as a sequence of coin tosses:

s0

a

b

x ≤ p

x > p

y ≤ qa

y > qa

y ≤ qb

y > qb

s’

s’’

s’’’

s’’’’

s1

s1

s1

s1

toss x [0,1]

toss y [0,1]

toss x [0,1]

[ p = σ(s1)(a) ]

[ qa = δ(s1,a)(s1) ]s’

[ qb = δ(s1,b)(s1 ) ]s’’’

Page 60: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Randomness for free in strategies

Given an infinite sequence x=(xn)n≥0 [0,1]ω, define for all s0, s1, …, sn:

σx is a pure and observation-based strategy !

σx plays like σ, assuming that the result of the coin tosses is the sequence x.

The value of σ is the « average of the outcome » of the strategies σx.

Page 61: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Randomness for free in strategies

Assume x=(xn)n≥0 and y=(yn)n≥0 are fixed.

Then σ determines a unique outcome…which is the outcome of some pure strategy !

s0

a

b

x0 ≤ p0

x0 > p0

y0 ≤ q0a

y0 > q0a

y0 ≤ q0b

y0 > q0b

s’

s’’

s’’’

s’’’’

s1

s1

s1

s1

x1 > p1

y1 > q1b

Page 62: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Randomness for free in strategies

s0

a

b

x0 ≤ p

x0 > p

y0 ≤ qa

y0 > qa

y0 ≤ qb

y0 > qb

s’

s’’

s’’’

s’’’’

s1

s1

s1

s1

Assume x=(xn)n≥0 and y=(yn)n≥0 are fixed.

Let where

Page 63: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Randomness for free in strategies

s0

a

b

x0 ≤ p

x0 > p

y0 ≤ qa

y0 > qa

y0 ≤ qb

y0 > qb

s’

s’’

s’’’

s’’’’

s1

s1

s1

s1

Assume x=(xn)n≥0 and y=(yn)n≥0 are fixed.

Let where

Theorem:

Page 64: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Randomness for free in strategies

s0

a

b

x0 ≤ p

x0 > p

y0 ≤ qa

y0 > qa

y0 ≤ qb

y0 > qb

s’

s’’

s’’’

s’’’’

s1

s1

s1

s1

Assume x=(xn)n≥0 and y=(yn)n≥0 are fixed.

Let where

Playing σ in G is equivalent to choosing (xn) and (yn) uniformly at random, and then producing

Page 65: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Randomness for free in strategies

s0

a

b

x0 ≤ p

x0 > p

y0 ≤ qa

y0 > qa

y0 ≤ qb

y0 > qb

s’

s’’

s’’’

s’’’’

s1

s1

s1

s1

Assume x=(xn)n≥0 and y=(yn)n≥0 are fixed.

Let where

Playing σ in G is equivalent to choosing (xn) and (yn) uniformly at random, and then producing

• The random sequences (xn) and (yn) can be generated separately, and independently !

• σx is a pure strategy !

pure strategy

Page 66: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Proof

The value of σ is the « average of the outcome » of the strategies σx.

Page 67: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Proof

The value of σ is the « average of the outcome » of the strategies σx.

By Fubini’s theorem…

Page 68: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Proof

The value of σ is the « average of the outcome » of the strategies σx.

Pure and randomized strategies are equally powerful in POMDPs !

Page 69: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Randomness for free

In transition function:

In strategies:

Page 70: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Outline

• Randomness in game structure

- for free with complete-observation, concurrent

- for free with one-sided, turn-based

• Randomness in strategies

- for free in (PO)MDP

• Corollary: undecidability results

Page 71: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Corollary

Almost-sure coBüchi (and positive Büchi) with randomized (or pure) strategies is undecidable for POMDPs.

Using [BaierBertrandGrößer’08]:

Using randomness for free in transition functions :

Almost-sure coBüchi (and positive Büchi) is undecidable for deterministic turn-based one-sided complete observation games.

Page 72: Randomness for Free Laurent Doyen LSV, ENS Cachan & CNRS joint work with Krishnendu Chatterjee, Hugo Gimbert, Tom Henzinger

Thank you !

Questions ?