Games, Times, and Probabilities: Value Iteration in Verification and Control Krishnendu Chatterjee Tom Henzinger


Post on 19-Dec-2015


TRANSCRIPT

<ul><li> Slide 1 </li> <li> Games, Times, and Probabilities: Value Iteration in Verification and Control. Krishnendu Chatterjee, Tom Henzinger </li>
<li> Slide 2 </li> <li> Graph Models of Systems: vertices = states; edges = transitions; paths = behaviors </li>
<li> Slide 3 </li> <li> Extended Graph Models (diagram labels): graph; CONTROL: game graph; OBJECTIVE: ω-automaton; PROBABILITIES: Markov decision process; stochastic game; regular game; CLOCKS: timed automaton; stochastic hybrid system </li>
<li> Slide 4 </li> <li> Graphs vs. Games (figure: transition labels a, b) </li>
<li> Slide 5 </li> <li> Games model Open Systems. Two players: environment / controller / input vs. system / plant / output. Multiple players: processes / components / agents. Stochastic players: nature / randomized algorithms </li>
<li> Slide 6 </li> <li> Example. P1: init x := 0 loop choice | x := x+1 mod 2 | x := 0 end choice end loop. φ1: □(x = y). P2: init y := 0 loop choice | y := x | y := x+1 mod 2 end choice end loop. φ2: □(y = 0) </li>
<li> Slide 7 </li> <li> Graph Questions (CTL): ∀□(x = y); ∃□(x = y) </li>
<li> Slide 8 </li> <li> Graph Questions (CTL): ∀□(x = y); ∃□(x = y) (figure: states 00, 10, 11, 01; one X mark) </li>
<li> Slide 9 </li> <li> Zero-Sum Game Questions (ATL [Alur/H/Kupferman]): ⟨⟨P1⟩⟩□(x = y); ⟨⟨P2⟩⟩□(y = 0) </li>
<li> Slide 10 </li> <li> Zero-Sum Game Questions (ATL [Alur/H/Kupferman]): ⟨⟨P1⟩⟩□(x = y); ⟨⟨P2⟩⟩□(y = 0) (figure: states 00, 10, 01, 11) </li>
<li> Slide 11 </li> <li> Zero-Sum Game Questions (ATL [Alur/H/Kupferman]): ⟨⟨P1⟩⟩□(x = y); ⟨⟨P2⟩⟩□(y = 0) (figure: states 00, 10, 01, 11; one X mark) </li>
<li> Slide 12 </li> <li> Zero-Sum Game Questions (ATL [Alur/H/Kupferman]): ⟨⟨P1⟩⟩□(x = y); ⟨⟨P2⟩⟩□(y = 0) (figure: states 00, 10, 01, 11; one X mark) </li>
<li> Slide 13 </li> <li> Nonzero-Sum Game Questions (secure equilibria [Chatterjee/H/Jurdzinski]): ⟨⟨P1⟩⟩□(x = y); ⟨⟨P2⟩⟩□(y = 0) (figure: states 00, 10, 01, 11) </li>
<li> Slide 14 </li> <li> Nonzero-Sum Game Questions (secure equilibria [Chatterjee/H/Jurdzinski]): ⟨⟨P1⟩⟩□(x = y); ⟨⟨P2⟩⟩□(y = 0) (figure: states 00, 10, 01, 11) </li>
<li> Slide 15 </li> <li> Strategies. Strategies x, y: Q* → 
Q. From a state q, a pair (x,y) of a player-1 strategy x ∈ Π1 and a player-2 strategy y ∈ Π2 gives a unique infinite path Outcome_{x,y}(q) ∈ Q^ω. </li>
<li> Slide 16 </li> <li> Strategies. ⟨⟨P1⟩⟩φ1 = (∃x ∈ Π1)(∀y ∈ Π2) φ1(x,y). Short for: q ⊨ ⟨⟨P1⟩⟩φ1 iff (∃x ∈ Π1)(∀y ∈ Π2)(Outcome_{x,y}(q) ⊨ φ1). Strategies x, y: Q* → Q. From a state q, a pair (x,y) of a player-1 strategy x ∈ Π1 and a player-2 strategy y ∈ Π2 gives a unique infinite path Outcome_{x,y}(q) ∈ Q^ω. </li>
<li> Slide 17 </li> <li> Strategies. ⟨⟨P1⟩⟩φ1 = (∃x ∈ Π1)(∀y ∈ Π2) φ1(x,y). ⟨⟨P1⟩⟩φ1 ⟨⟨P2⟩⟩φ2 = (∃x ∈ Π1)(∃y ∈ Π2)[(φ1 ∧ φ2)(x,y) ∧ (∀y ∈ Π2)(φ2 → φ1)(x,y) ∧ (∀x ∈ Π1)(φ2 → φ1)(x,y)]. Strategies x, y: Q* → Q. From a state q, a pair (x,y) of a player-1 strategy x ∈ Π1 and a player-2 strategy y ∈ Π2 gives a unique infinite path Outcome_{x,y}(q) ∈ Q^ω. </li>
<li> Slide 18 </li> <li> Objectives φ1 and φ2. Qualitative: reachability; Büchi; parity (ω-regular). Quantitative: max; lim sup; lim avg </li>
<li> Slide 19 </li> <li> Normal Forms of ω-Regular Sets. Borel-1: Reachability ◇a; Safety □a = ¬◇¬a </li>
<li> Slide 20 </li> <li> Normal Forms of ω-Regular Sets. Borel-1: Reachability ◇a; Safety □a = ¬◇¬a. Borel-2: Büchi □◇a; coBüchi ◇□a = ¬□◇¬a </li>
<li> Slide 21 </li> <li> Normal Forms of ω-Regular Sets. Borel-1: Reachability ◇a; Safety □a = ¬◇¬a. Borel-2: Büchi □◇a; coBüchi ◇□a = ¬□◇¬a. Borel-2.5: Streett ⋀(□◇a → □◇b) = ⋀(◇□¬a ∨ □◇b); Rabin ⋁(◇□a ∧ □◇b); Parity: complement-closed subset of Streett/Rabin </li>
<li> Slide 22 </li> <li> Büchi Game (figure: states q0, q1, q2, q3, q4; sets G and B) </li>
<li> Slide 23 </li> <li> (figure: states q0, q1, q2, q3, q4; sets G and B) Secure equilibrium (x,y) at q0: x: if q1 → q0, then q2 else q4. y: if q3 → q1, then q0 else q4. Strategies require memory. 
</li>
<li> Slide 24 </li> <li> Zero-Sum Games: Determinacy. φ1 = ¬φ2 (figure: W1 = ⟨⟨P1⟩⟩φ1; W2 = ⟨⟨P2⟩⟩φ2) </li>
<li> Slide 25 </li> <li> Nonzero-sum Games (figure regions: W10: ⟨⟨P1⟩⟩(φ1 ∧ ¬φ2); W01: ⟨⟨P2⟩⟩(φ2 ∧ ¬φ1); W11; W00; ⟨⟨P1⟩⟩φ1; ⟨⟨P2⟩⟩φ2) </li>
<li> Slide 26 </li> <li> Objectives. Qualitative: reachability; Büchi; parity (ω-regular). Quantitative: max; lim sup; lim avg </li>
<li> Slide 27 </li> <li> Objectives. Qualitative: reachability; Büchi; parity (ω-regular). Quantitative: max; lim sup; lim avg. (Borel-1; Borel-2; Borel-3) </li>
<li> Slide 28 </li> <li> Quantitative Games: ⟨⟨P1⟩⟩ lim sup; ⟨⟨P1⟩⟩ lim avg (figure: game graph with rewards 4, 2, 2, 0, 2, 0, 0, 4, 3) </li>
<li> Slide 29 </li> <li> Quantitative Games: ⟨⟨P1⟩⟩ lim sup = 3; ⟨⟨P1⟩⟩ lim avg (figure: game graph with rewards 4, 2, 2, 0, 2, 0, 0, 4, 3) </li>
<li> Slide 30 </li> <li> Quantitative Games: ⟨⟨P1⟩⟩ lim sup = 3; ⟨⟨P1⟩⟩ lim avg = 1 (figure: game graph with rewards 4, 2, 2, 0, 2, 0, 0, 4, 3) </li>
<li> Slide 31 </li> <li> Solving Games by Value Iteration. Generalization of the μ-calculus: computing fixpoints of transfer functions (pre; post). Generalization of dynamic programming: iterative optimization. Region R: Q → V (figure: state q annotated with R(q)) </li>
<li> Slide 32 </li> <li> Solving Games by Value Iteration. Generalization of the μ-calculus: computing fixpoints of transfer functions (pre; post). Generalization of dynamic programming: iterative optimization. Region R: Q → V (figure: state q annotated with R(q)). R(q) := pre(R(q)) </li>
<li> Slide 33 </li> <li> Graph. Q: states. Σ: transition labels. δ: Q × Σ → Q: transition function </li>
<li> Slide 34 </li> <li> Graph. Q: states. Σ: transition labels. δ: Q × Σ → Q: transition function. ℛ = [Q → {0,1}]: regions with V = 𝔹. ∃pre: q ∈ ∃pre(R) iff (∃σ ∈ Σ) δ(q,σ) ∈ R. ∀pre: q ∈ ∀pre(R) iff (∀σ ∈ Σ) δ(q,σ) ∈ R </li>
<li> Slide 35 </li> <li> (figure: states a, c, b) ∃◇c = (μX)(c ∨ ∃pre(X)) </li>
<li> Slide 36 </li> <li> Graph (figure: states a, c, b). ∃◇c = (μX)(c ∨ ∃pre(X)) </li>
<li> Slide 37 </li> <li> Graph (figure: states a, c, b). ∃◇c = (μX)(c ∨ ∃pre(X)) </li>
<li> Slide 38 </li> <li> Graph (figure: states a, c, b). ∃◇c = (μX)(c ∨ ∃pre(X)). ∀◇c = (μX)(c ∨ ∀pre(X)) </li>
<li> Slide 39 </li> <li> Graph Reachability. Given R ⊆ Q, find the states from which some path leads to R. 
</li>
<li> Slide 40 </li> <li> Graph Reachability. Given R ⊆ Q, find the states from which some path leads to R. Iterates: R; R ∪ pre(R). ∃◇R = (μX)(R ∨ ∃pre(X)) </li>
<li> Slide 41 </li> <li> Graph Reachability. Given R ⊆ Q, find the states from which some path leads to R. Iterates: R; R ∪ pre(R); R ∪ pre(R) ∪ pre²(R). ∃◇R = (μX)(R ∨ ∃pre(X)) </li>
<li> Slide 42 </li> <li> Graph Reachability. Given R ⊆ Q, find the states from which some path leads to R. Iterates: R; R ∪ pre(R); R ∪ pre(R) ∪ pre²(R); ... ∃◇R = (μX)(R ∨ ∃pre(X)) </li>
<li> Slide 43 </li> <li> Graph Reachability. Given R ⊆ Q, find the states from which all paths lead to R. Iterates: R; R ∪ pre(R); R ∪ pre(R) ∪ pre²(R); ... ∀◇R = (μX)(R ∨ ∀pre(X)) </li>
<li> Slide 44 </li> <li> Value Iteration Algorithms consist of: A. LOCAL PART: ∃pre and ∀pre computation. B. GLOBAL PART: evaluation of a fixpoint expression. We need to generalize both parts to solve games. </li>
<li> Slide 45 </li> <li> Turn-based Game. Q1, Q2: states (Q = Q1 ∪ Q2). Σ: transition labels. δ: Q × Σ → Q: transition function </li>
<li> Slide 46 </li> <li> Turn-based Game. Q1, Q2: states (Q = Q1 ∪ Q2). Σ: transition labels. δ: Q × Σ → Q: transition function. ℛ = [Q → {0,1}]: regions with V = 𝔹. 1pre: q ∈ 1pre(R) iff q ∈ Q1 ∧ (∃σ ∈ Σ) δ(q,σ) ∈ R, or q ∈ Q2 ∧ (∀σ ∈ Σ) δ(q,σ) ∈ R </li>
<li> Slide 47 </li> <li> Q1, Q2: states (Q = Q1 ∪ Q2). Σ: transition labels. δ: Q × Σ → Q: transition function. ℛ = [Q → 
{0,1}]: regions with V = 𝔹. 1pre: q ∈ 1pre(R) iff q ∈ Q1 ∧ (∃σ ∈ Σ) δ(q,σ) ∈ R, or q ∈ Q2 ∧ (∀σ ∈ Σ) δ(q,σ) ∈ R. 2pre: q ∈ 2pre(R) iff q ∈ Q1 ∧ (∀σ ∈ Σ) δ(q,σ) ∈ R, or q ∈ Q2 ∧ (∃σ ∈ Σ) δ(q,σ) ∈ R. Turn-based Game </li>
<li> Slide 48 </li> <li> (figure: states c, a, b) </li>
<li> Slide 49 </li> <li> (figure: states c, a, b) ⟨⟨P1⟩⟩◇c = (μX)(c ∨ 1pre(X)) </li>
<li> Slide 50 </li> <li> Turn-based Game (figure: states c, a, b). ⟨⟨P1⟩⟩◇c = (μX)(c ∨ 1pre(X)) </li>
<li> Slide 51 </li> <li> Turn-based Game (figure: states c, a, b). ⟨⟨P1⟩⟩◇c = (μX)(c ∨ 1pre(X)). ⟨⟨P2⟩⟩◇c = (μX)(c ∨ 2pre(X)) </li>
<li> Slide 52 </li> <li> Turn-based Game (figure: states c, a, b). ⟨⟨P1⟩⟩◇c = (μX)(c ∨ 1pre(X)). ⟨⟨P2⟩⟩◇c = (μX)(c ∨ 2pre(X)) </li>
<li> Slide 53 </li> <li> Turn-based Game (figure: states c, a, b). ⟨⟨P1⟩⟩◇c = (μX)(c ∨ 1pre(X)). ⟨⟨P2⟩⟩◇c = (μX)(c ∨ 2pre(X)) </li>
<li> Slide 54 </li> <li> Reachability Game: ⟨⟨P1⟩⟩◇R. Given R ⊆ Q, find the states from which player 1 has a strategy to force the game to R. </li>
<li> Slide 55 </li> <li> Reachability Game: ⟨⟨P1⟩⟩◇R. Given R ⊆ Q, find the states from which player 1 has a strategy to force the game to R. Iterates: R; R ∪ 1pre(R) </li>
<li> Slide 56 </li> <li> Reachability Game: ⟨⟨P1⟩⟩◇R. Given R ⊆ Q, find the states from which player 1 has a strategy to force the game to R. Iterates: R; R ∪ 1pre(R); R ∪ 1pre(R) ∪ 1pre²(R) </li>
<li> Slide 57 </li> <li> Reachability Game. Given R ⊆ Q, find the states from which player 1 has a strategy to force the game to R. Iterates: R; R ∪ 1pre(R); R ∪ 1pre(R) ∪ 1pre²(R); ... ⟨⟨P1⟩⟩◇R = (μX)(R ∨ 1pre(X)) </li>
<li> Slide 58 </li> <li> Safety Game: ⟨⟨P1⟩⟩□R. Given R ⊆ Q, find the states from which player 1 has a strategy to keep the game in R. </li>
<li> Slide 59 </li> <li> Safety Game: ⟨⟨P1⟩⟩□R. Given R ⊆ Q, find the states from which player 1 has a strategy to keep the game in R. Iterates: R; R ∩ 1pre(R) </li>
<li> Slide 60 </li> <li> Safety Game: ⟨⟨P1⟩⟩□R. Given R ⊆ Q, find the states from which player 1 has a strategy to keep the game in R. Iterates: R; R ∩ 1pre(R); R ∩ 1pre(R) ∩ 1pre²(R) </li>
<li> Slide 61 </li> <li> 
Iterates: R; R ∩ 1pre(R); R ∩ 1pre(R) ∩ 1pre²(R); ... ⟨⟨P1⟩⟩□R = (νX)(R ∧ 1pre(X)). Given R ⊆ Q, find the states from which player 1 has a strategy to keep the game in R. Safety Game </li>
<li> Slide 62 </li> <li> Quantitative Game. Q1, Q2: states (Q = Q1 ∪ Q2). Σ: transition labels. δ: Q × Σ → ℕ × Q: transition function </li>
<li> Slide 63 </li> <li> Quantitative Game. Q1, Q2: states (Q = Q1 ∪ Q2). Σ: transition labels. δ: Q × Σ → ℕ × Q: transition function. ℛ = [Q → ℕ]: regions with V = ℕ. 1pre: 1pre(R)(q) = (max σ ∈ Σ) max(δ1(q,σ), R(δ2(q,σ))) if q ∈ Q1; (min σ ∈ Σ) max(δ1(q,σ), R(δ2(q,σ))) if q ∈ Q2 </li>
<li> Slide 64 </li> <li> Quantitative Game. Q1, Q2: states (Q = Q1 ∪ Q2). Σ: transition labels. δ: Q × Σ → ℕ × Q: transition function. ℛ = [Q → ℕ]: regions with V = ℕ. 1pre: 1pre(R)(q) = (max σ ∈ Σ) max(δ1(q,σ), R(δ2(q,σ))) if q ∈ Q1; (min σ ∈ Σ) max(δ1(q,σ), R(δ2(q,σ))) if q ∈ Q2. 2pre: 2pre(R)(q) = (min σ ∈ Σ) max(δ1(q,σ), R(δ2(q,σ))) if q ∈ Q1; (max σ ∈ Σ) max(δ1(q,σ), R(δ2(q,σ))) if q ∈ Q2 </li>
<li> Slide 65 </li> <li> Maximizing Game (figure: states c, a, b; rewards 0, 1, 2, 5, 3) </li>
<li> Slide 66 </li> <li> (figure: states c, a, b; rewards 0, 1, 2, 5, 3; region values 0, 0, 0) ⟨⟨P1⟩⟩◇0 = (μX) max(0, 1pre(X)) </li>
<li> Slide 67 </li> <li> Maximizing Game (figure: states c, a, b; rewards 0, 1, 2, 5, 3; region values 1, 0, 0). ⟨⟨P1⟩⟩◇0 = (μX) max(0, 1pre(X)) </li>
<li> Slide 68 </li> <li> Maximizing Game (figure: states c, a, b; rewards 0, 1, 2, 5, 3; region values 1, 2, 0). ⟨⟨P1⟩⟩◇0 = (μX) max(0, 1pre(X)) </li>
<li> Slide 69 </li> <li> Maximizing Game (figure: states c, a, b; rewards 0, 1, 2, 5, 3; region values 2, 2, 0). ⟨⟨P1⟩⟩◇0 = (μX) max(0, 1pre(X)) </li>
<li> Slide 70 </li> <li> Büchi Graph: ∃□◇B. Given B ⊆ Q, find the states from which some path visits B infinitely often. </li>
<li> Slide 71 </li> <li> Büchi Graph: ∃□◇B. Given B ⊆ Q, find the states from which some path visits B infinitely often. R1 = ∃◇ pre(B). Iterates: pre(B); pre(B) ∪ pre²(B); ... </li>
<li> Slide 72 </li> <li> Büchi Graph: ∃□◇B. Given B ⊆ Q, find the states from which some path visits B infinitely often. R1 = ∃◇ pre(B) </li>
<li> Slide 73 </li> <li> R1 = ∃◇ pre(B); R2 = ∃◇ pre(B ∩ R1). Given B ⊆ Q, find the states from which some path visits B infinitely often. 
Büchi Graph </li>
<li> Slide 74 </li> <li> Büchi Graph. Given B ⊆ Q, find the states from which some path visits B infinitely often. ∃□◇B = (νY) ∃◇(B ∧ ∃pre(Y)) </li>
<li> Slide 75 </li> <li> Büchi Graph. Given B ⊆ Q, find the states from which some path visits B infinitely often. ∃□◇B = (νY)(μX)((B ∧ ∃pre(Y)) ∨ ∃pre(X)) </li>
<li> Slide 76 </li> <li> Büchi Game: ⟨⟨P1⟩⟩□◇B. Given B ⊆ Q, find the states from which player 1 has a strategy to force the game to B infinitely often. </li>
<li> Slide 77 </li> <li> Büchi Game. Given B ⊆ Q, find the states from which player 1 has a strategy to force the game to B infinitely often. R1 = ⟨⟨P1⟩⟩◇ 1pre(B); R2 = ⟨⟨P1⟩⟩◇ 1pre(B ∩ R1); ... ⟨⟨P1⟩⟩□◇B = (νY)(μX)((B ∧ 1pre(Y)) ∨ 1pre(X)) </li>
<li> Slide 78 </li> <li> From Graphs to Games. Can we use the same value iteration scheme? Yes, iff the fixpoint expression computes correctly on all single-player (player 1 and player 2) structures. Reachability: ∃◇p = (μX)(p ∨ ∃pre(X)); ∀◇p = (μX)(p ∨ ∀pre(X)). Hence: ⟨⟨P1⟩⟩◇p = (μX)(p ∨ 1pre(X)); ⟨⟨P2⟩⟩◇p = (μX)(p ∨ 2pre(X)) </li>
<li> Slide 79 </li> <li> Complexity of Turn-based Games. 1. Reachability, safety: linear time (P-complete). 2. Büchi: quadratic time (optimal ???). 3. Parity: NP ∩ coNP (in P ???) </li>
<li> Slide 80 </li> <li> Complexity of Turn-based Games. 1. Reachability, safety: linear time (P-complete). 2. Büchi: quadratic time (optimal ???); on graphs linear. 3. Parity: NP ∩ coNP (in P ???); on graphs polynomial </li>
<li> Slide 81 </li> <li> Beyond Graphs as Finite Carrier Sets. Graph-based (finite-carrier) systems: Q = 𝔹^m; ℛ = boolean formulas [e.g. BDDs]; ∃pre = (∃x ∈ 𝔹). Timed and hybrid systems: Q = 𝔹^m × ℝ^n; ℛ = formulas of (ℚ, ≤, +) [e.g. polyhedral sets]; ∃pre = (∃x ∈ ℚ) </li>
<li> Slide 82 </li> <li> Concurrent Game. Q: states. Σ1, Σ2: moves of both players. δ: Q × Σ1 × Σ2 → Q: transition function </li>
<li> Slide 83 </li> <li> Q: states. Σ1, Σ2: moves of both players. δ: Q × Σ1 × Σ2 → Q: transition function. ℛ = [Q → 
{0,1}]: regions with V = 𝔹. 1pre: q ∈ 1pre(R) iff (∃σ1 ∈ Σ1)(∀σ2 ∈ Σ2) δ(q,σ1,σ2) ∈ R. Concurrent Game </li>
<li> Slide 84 </li> <li> Concurrent Game. Q: states. Σ1, Σ2: moves of both players. δ: Q × Σ1 × Σ2 → Q: transition function. ℛ = [Q → {0,1}]: regions with V = 𝔹. 1pre: q ∈ 1pre(R) iff (∃σ1 ∈ Σ1)(∀σ2 ∈ Σ2) δ(q,σ1,σ2) ∈ R. 2pre: q ∈ 2pre(R) iff (∃σ2 ∈ Σ2)(∀σ1 ∈ Σ1) δ(q,σ1,σ2) ∈ R </li>
<li> Slide 85 </li> <li> (figure: states a, c, b; edges labeled with move pairs 1,1; 1,2; 2,1; 2,2) </li>
<li> Slide 86 </li> <li> (figure: states a, c, b; edges labeled with move pairs 1,1; 1,2; 2,1; 2,2) ⟨⟨P2⟩⟩◇c = (μX)(c ∨ 2pre(X)) </li>
<li> Slide 87 </li> <li> Concurrent Game (figure: states a, c, b; edges labeled with move pairs 1,1; 1,2; 2,1; 2,2). ⟨⟨P2⟩⟩◇c = (μX)(c ∨ 2pre(X)) </li>
<li> Slide 88 </li> <li> Concurrent Game (figure: states a, c, b; edges labeled with move pairs 1,1; 1,2; 2,1; 2,2). ⟨⟨P2⟩⟩◇c = (μX)(c ∨ 2pre(X)). Pr(1): 0.5; Pr(2): 0.5 </li>
<li> Slide 89 </li> <li> Extended Graph Models (diagram labels): graph; CONTROL: game graph; OBJECTIVE: ω-automaton; PROBABILITIES: Markov decision process; stochastic game; regular game; CLOCKS: timed automaton; stochastic hybrid system </li>
<li> Slide 90 </li> <li> Graph: 1 Player. Nondeterministic closed system. (figure: states q1, q2, q3; labels a, b) </li>
<li> Slide 91 </li> <li> MDP: 1.5 Players. Probabilistic closed system. (figure: states q1 to q5; labels a, b, c; probabilities 0.4, 0.6) </li>
<li> Slide 92 </li> <li> Turn-based Game: 2 Players. Asynchronous open system. (figure: states q1 to q5; labels a, b, c) </li>
<li> Slide 93 </li> <li> Turn-based Stochastic Game: 2.5 Players. Probabilistic asynchronous open system. (figure: states q1 to q7; labels a, b, c; probabilities 0.4, 0.6) </li>
<li> Slide 94 </li> <li> Concurrent Game. Synchronous open system. (figure: states q1 to q5; labels a, b; move pairs 1,1; 1,2; 2,1; 2,2) </li>
<li> Slide 95 </li> <li> (figure: states q1 to q5; labels a, b; moves 1, 2; successor distributions: [q2: 0.3, q3: 0.2, q4: 0.5, q5: blank]; [q2: 0.1, q3: 0.1, q4: 0.5, q5: 0.3]; [q2: blank, q3: 0.2, q4: 0.1, q5: 0.7]; [q2: 1.0, q3, q4, q5: blank]) Matrix game at each ve...</li></ul>
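
The turn-based fixpoint schemes in the transcript (reachability on slide 57, safety on slide 61, Büchi on slide 77) can be tried out on a small example. The following is a minimal Python sketch, not the authors' implementation. It assumes a finite turn-based game given as a successor map; the four-state game and the names `SUCC`, `PLAYER1`, `pre1`, `lfp`, `gfp`, `reach`, `safe`, and `buchi` are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Set

Region = Set[str]

# Hypothetical turn-based game graph (illustrative, not taken from the slides):
# each state maps to the set of successors reachable by some transition label.
SUCC: Dict[str, Region] = {
    "a": {"b", "c"},
    "b": {"a", "d"},
    "c": {"c"},
    "d": {"d"},
}
PLAYER1: Region = {"a", "c"}   # Q1; every other state belongs to player 2
STATES: Region = set(SUCC)     # Q


def pre1(region: Region) -> Region:
    """1pre(R): Q1 states with SOME successor in R, Q2 states with ALL successors in R."""
    return {
        q for q in STATES
        if (q in PLAYER1 and bool(SUCC[q] & region))
        or (q not in PLAYER1 and SUCC[q] <= region)
    }


def lfp(f: Callable[[Region], Region]) -> Region:
    """Least fixpoint (mu): iterate a monotone f upward from the empty region."""
    x: Region = set()
    while True:
        nxt = f(x)
        if nxt == x:
            return x
        x = nxt


def gfp(f: Callable[[Region], Region]) -> Region:
    """Greatest fixpoint (nu): iterate a monotone f downward from the full region Q."""
    x: Region = set(STATES)
    while True:
        nxt = f(x)
        if nxt == x:
            return x
        x = nxt


def reach(target: Region) -> Region:
    """Reachability game: (mu X)(R or 1pre(X))."""
    return lfp(lambda x: target | pre1(x))


def safe(region: Region) -> Region:
    """Safety game: (nu X)(R and 1pre(X))."""
    return gfp(lambda x: region & pre1(x))


def buchi(b: Region) -> Region:
    """Buechi game: (nu Y)(mu X)((B and 1pre(Y)) or 1pre(X))."""
    return gfp(lambda y: lfp(lambda x: (b & pre1(y)) | pre1(x)))
```

In this hypothetical game, `reach({"c"})` contains `a` because player 1 can move directly to `c`, but not `b`, since player 2 can escape to the sink `d`. The local part (`pre1`) and the global part (`lfp`, `gfp`) are kept separate, mirroring the split described on slide 44.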

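
The quantitative 1pre operator of slide 63 and the maximizing iteration of slides 66 to 69 can be sketched in the same style. This is again a hedged Python illustration under stated assumptions: the three-state game, its rewards, and the names `MOVES`, `PLAYER1`, `pre1`, and `max_value` are hypothetical and do not reproduce the figure from the slides. Each move is a pair (reward, successor), standing for the two components of the transition function.

```python
from typing import Dict, List, Tuple

Valuation = Dict[str, int]

# Hypothetical quantitative game (illustrative, not the figure from the slides):
# state -> list of moves, each move being (reward, successor).
MOVES: Dict[str, List[Tuple[int, str]]] = {
    "a": [(1, "b"), (0, "a")],   # player-1 state (maximizer)
    "b": [(2, "a"), (5, "c")],   # player-2 state (minimizer)
    "c": [(3, "c")],             # player-1 state
}
PLAYER1 = {"a", "c"}


def pre1(region: Valuation) -> Valuation:
    """Quantitative 1pre: per move take max(reward, value of successor);
    player 1 maximizes over moves, player 2 minimizes."""
    out: Valuation = {}
    for q, moves in MOVES.items():
        options = [max(reward, region[succ]) for reward, succ in moves]
        out[q] = max(options) if q in PLAYER1 else min(options)
    return out


def max_value() -> Valuation:
    """Iterate X := max(0, 1pre(X)) from the all-zero region until it stabilizes."""
    x: Valuation = {q: 0 for q in MOVES}
    while True:
        step = pre1(x)
        nxt = {q: max(0, step[q]) for q in MOVES}
        if nxt == x:
            return x
        x = nxt
```

On this hypothetical game the iteration passes through the regions (0, 0, 0), (1, 2, 3), and (2, 2, 3) for the states (a, b, c) and then stops. It terminates in general because the values never decrease and are bounded by the largest reward.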