JF Raskin, U.L.B.
Quasimodo Workshop - Oct. 24, 2010
ES Week Tutorial
Controller Synthesis and the Hydac Case Study
Content
❶ Context and motivations
❷ Flavor of the underlying techniques
A. Untimed safety games
B. Timed games - Controllable predecessors
C. Recent progresses - Imperfect info
D. Recent progresses - Optimality
❸ Application: The Hydac case study
Context - Motivations
Embedded systems
System
Hybrid systems mix discrete and continuous components : non trivial interactions
☛ difficult to develop
They are often safety critical
☛ need for formal methods
Verification
[Figure: excerpt (pages 9 and 10) from J.-F. Raskin, "An introduction to hybrid automata", with Fig. 4, "Hybrid automata for the burner and the thermometer", and Fig. 5, "Product of tank and thermometer"]
From the excerpt, Section 3, "Properties of Hybrid Systems": Properties assign values to trajectories of hybrid systems. In this introduction, we restrict ourselves to properties that classify trajectories as good or bad according to whether or not they stay in a given set of (good) states. Those properties are called safety properties [AS85], and are the most important class of properties when considering safety critical systems.
System
Math. model: Hybrid automata
satisfies? □(low ≤ x ≤ high)
Synthesis
System
Hybrid automata
construct ? such that □(low ≤ x ≤ high) is satisfied no matter how Env behaves!
Synthesis implies “correct by construction”
Technical highlight
(Untimed) Safety Games
Two-player game structures
[Figure: game graph over the positions 0000, 0101, 1010, 0100, 1000, 1101, 1110, 1111]
Rounded positions belong to Player 1 (Controller)
Square positions belong to Player 2 (Environment)
A game is played as follows: in each round, the game is in a position. If the game is in a rounded position, Player 1 resolves the choice for the next state; if the game is in a square position, Player 2 resolves the choice. The game is played for an infinite number of rounds.
Play: 0000 0100 0101 1101 ...
Is this a good or a bad play for Player 1?
Who is winning?
A winning condition (for Player 1) is a set of plays W ⊆ (Q1 ∪ Q2)^ω
Game = two-player game structure + winning condition for Player 1
Strategies
Players are playing according to strategies.
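λ1 : S*•S1 → S
A strategy for Player 1 maps a prefix of a play that ends in a position of Player 1 to a choice for the next position.
λ1(0011 1001 1101 0011) = 1110
Here 0011 1001 1101 0011 is the prefix of the play, its last position 0011 is Player 1's position, and 1110 is the choice for the next position.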
Symmetrically for Player 2.
Outcome of strategies
If we fix a strategy for each of the two players and let them apply their strategies, we get a play:
Outcome(λ1,λ2) = 1100 0011 0001 0011 ...
If we fix a strategy only for Player 1, we get a set of plays:
Outcome(λ1) = ∪λ2 Outcome(λ1,λ2)
A strategy for Player 1 is winning for objective W iff
Outcome(λ1) ⊆ W
That is, no matter how Player 2 resolves his choices, when Player 1 plays according to λ1, the resulting play belongs to W.
⟹ Player 1 can force the play to be in W.
Winning strategies = controllers that enforce winning plays
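The notions of game structure, strategy and outcome can be made concrete with a small sketch. The position names are those of the figure, but the ownership of positions and the edges below are assumptions (the graph is not recoverable from the slides); they are chosen so that the play 0000 0100 0101 1101 ... is possible. The helper names `memoryless`, `outcome_prefix` and `is_safe_prefix` are hypothetical.

```python
from typing import Callable, Dict, List, Set, Tuple

Position = str
History = Tuple[Position, ...]
Strategy = Callable[[History], Position]  # maps a play prefix to the next position

# Assumed game structure (ownership and edges are illustrative only).
PLAYER1: Set[Position] = {"0000", "0101", "1010", "1110"}  # rounded positions
EDGES: Dict[Position, List[Position]] = {
    "0000": ["0100", "1000"],
    "0100": ["0101", "0000"],
    "1000": ["1010", "1110"],
    "0101": ["1101", "0000"],
    "1010": ["1111"],
    "1101": ["0101", "0000"],
    "1110": ["1111", "1010"],
    "1111": ["1111"],
}


def memoryless(choice: Dict[Position, Position]) -> Strategy:
    """Build a strategy that only looks at the last position of the prefix."""
    return lambda history: choice[history[-1]]


def outcome_prefix(start: Position, lam1: Strategy, lam2: Strategy,
                   rounds: int) -> List[Position]:
    """Finite prefix of Outcome(lam1, lam2): the owner of each position moves."""
    play = [start]
    for _ in range(rounds):
        current = play[-1]
        chooser = lam1 if current in PLAYER1 else lam2
        nxt = chooser(tuple(play))
        if nxt not in EDGES[current]:
            raise ValueError(f"illegal move {current} -> {nxt}")
        play.append(nxt)
    return play


def is_safe_prefix(play: List[Position], bad: Position = "1111") -> bool:
    """Check the safety objective on a finite prefix: the bad position is avoided."""
    return all(p != bad for p in play)


lam1 = memoryless({"0000": "0100", "0101": "1101", "1010": "1111", "1110": "1010"})
lam2 = memoryless({"0100": "0101", "1000": "1010", "1101": "0101", "1111": "1111"})
prefix = outcome_prefix("0000", lam1, lam2, rounds=5)
print("Play:", " ".join(prefix), "...")
print("Avoids 1111 so far:", is_safe_prefix(prefix))
```

A finite prefix can only refute a safety objective, never establish it; deciding whether λ1 is winning requires reasoning about all strategies of Player 2.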
Safety Games
A Safety Game
Does Player 1 (rounded positions) have a strategy (against any choices of Player 2) to stay within the set of states Q \ {1111}?
W = { s0 s1 s2 ... | si ≠ 1111, for all i ≥ 0 }
Symbolic algorithms to solve games
Symbolic algorithm for safety games
From where can Player 1 avoid Bad?
[Figure: starting from Bad, the sets of states that are 1 step unsafe, 2 steps unsafe and 3 steps unsafe are added in turn]
Iterate until stabilization.
Stabilization! What remains outside the unsafe set is exactly the set of states where Player 1 has a strategy to avoid the bad states.
Similar fixed point algorithms exist for any omega-regular objectives
This extends to rich winning conditions !
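The fixed-point iteration for safety games described above can be sketched as follows. This is a minimal illustration on explicit Python sets, standing in for a symbolic representation; the game graph and the ownership of positions are assumptions, and `solve_safety` is a hypothetical name. A position of Player 1 becomes unsafe when all its successors are unsafe; a position of Player 2 becomes unsafe when some successor is unsafe.

```python
from typing import Dict, List, Set, Tuple

# Assumed game structure (illustrative only, not taken from the slides).
PLAYER1: Set[str] = {"0000", "0101", "1010", "1110"}  # rounded positions
EDGES: Dict[str, List[str]] = {
    "0000": ["0100", "1000"],
    "0100": ["0101", "0000"],
    "1000": ["1010", "1110"],
    "0101": ["1101", "0000"],
    "1010": ["1111"],
    "1101": ["0101", "0000"],
    "1110": ["1111", "1010"],
    "1111": ["1111"],
}


def solve_safety(edges: Dict[str, List[str]], player1: Set[str],
                 bad: Set[str]) -> Tuple[Set[str], Dict[str, str], int]:
    """Grow the unsafe set from `bad` until stabilization.

    Returns the winning set of Player 1, a memoryless winning strategy
    on that set, and the number of iterations that added new states.
    """
    unsafe = set(bad)
    steps = 0
    while True:
        newly_unsafe = {
            q for q, succs in edges.items()
            if q not in unsafe and (
                all(s in unsafe for s in succs) if q in player1
                else any(s in unsafe for s in succs)
            )
        }
        if not newly_unsafe:  # stabilization
            break
        unsafe |= newly_unsafe
        steps += 1
    winning = set(edges) - unsafe
    # A winning Player 1 position always has a successor outside `unsafe`.
    strategy = {q: next(s for s in edges[q] if s in winning)
                for q in winning if q in player1}
    return winning, strategy, steps


winning, strategy, steps = solve_safety(EDGES, PLAYER1, {"1111"})
print("winning positions:", sorted(winning))
print("strategy:", strategy)
print("iterations before stabilization:", steps)
```

The extracted strategy is memoryless: on the winning set it suffices to pick any successor that stays inside the winning set.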
Technical highlight: Timed Game
Controllable Predecessors
Timed Game Structures
• Timed games are not turn-based
• Objective: reach goal (and so avoid L4).
• Strategies: f1 : history → (delay, controllable action)
• Existence of memoryless strategies:
f1 : (l,v) → (delay, controllable action)
f1(L0,0) = (0.5, L0→L1)
Note that this action may not happen if Player 2 plays faster.
Player 1 controls plain arcs
Player 2 controls dashed arcs
⚠ Player 2 can play (L0,L4) only if Player 1 does not play before clock x has reached value 1!
Delays in playing are essential here!
Questions that we want to answer:
• Does there exist a winning strategy for Player 1?
• If yes, get one which is as simple as possible.
Timed Games
Theorem [AMPS98][HK99].
➢ Safety, reachability, (co)-Büchi are EXPTIME-complete.
➢ Region based strategies are sufficient.
➣ Regions are not practical!
In [CDFLL05], Larsen et al. propose an efficient on-the-fly algorithm for the analysis of timed games.
This algorithm is implemented in UppAal-Tiga.
[Figure: Timed Game Structures. For each of the locations L0, L1, L2, L3, an axis of clock values 0, 1, 2, 3, ...; the last step of the animation is annotated with x≤1]
OTFUR [CDFLL05]
Forward exploration, backward propagation
[Figure: forward exploration from Init and backward propagation from Unsafe, over Pl.1 and Pl.2 nodes]
Only reachable states are explored.
Non-trivial use of Difference Bound Matrices.
Implemented in UppAal-Tiga
Recent Progresses in Timed Games
Imperfect information
• Embedded controllers get information through sensors.
• Sensors = finite precision = imperfect information.
• Timed games with imperfect information are undecidable [BDMP03].
• But interesting sub-cases are decidable [CDLLR07]:
• restrict the class of strategies to be regular functions from observation sequences to controllable actions;
• if not sufficient for control then add new observations.
[Figure: slide from the ATVA'07 presentation of [CDLLR07], "Timed Control with Observation Based and Stuttering Invariant Strategies", titled "From Partial to Full Observation: Knowledge Based Subset Construction"; it shows a timed game with locations Sensor, Sensed, Paint, Piston, End and Off]
Imperfect information
Kick!
Regular observation based strategy
Only observation changes are observed!
Optimal strategies
Specifications for embedded controllers:
✓ Regular objectives (safety, liveness, etc.)
+
✓ Optimality criteria (ex: energy consumption, time to reachability, etc.)
Weighted Timed Automata
[Excerpt from the paper shown on the slide]
More recently, the ability to consider more general performance measures has been given. Priced extensions of timed automata have been introduced where a cost c is associated with each location ℓ giving the cost of a unit of time spent in ℓ. In [3] cost-bound reachability has been shown decidable. [8] and [5] independently solve the cost-optimal reachability problem for priced timed automata. Efficient incorporation in UPPAAL is provided by use of so-called priced zones as a main data structure [25]. In [29] the implementation of cost-optimal reachability is improved considerably by exploiting the duality with linear programming problems over zones (min-cost flow problems). More recently [11], the problem of computing optimal infinite schedules (in terms of minimal limit-ratios) is solved for the model of priced timed automata.
The Optimal Cost Control Problem for Timed Games. In this paper we combine the notions of game and price and solve the problem of cost-optimal winning strategies for priced timed game automata. The problem we consider is: “Given a timed game automaton A, a goal location Goal, what is the optimal cost we can achieve to reach Goal in A?”. We refer to this problem as the Optimal Cost Problem (OCP). Consider the example of a priced timed game automaton given in Fig. 1. Here the cost-rates (cost per time unit) in locations ℓ0, ℓ2 and ℓ3 are 5, 10 and 1 respectively. In ℓ1 the environment may choose to move to either ℓ2 or ℓ3 (dashed arrows are uncontrollable). However, due to the invariant y = 0 this choice must be made instantaneously. Obviously, once ℓ2 or ℓ3 has been reached the optimal strategy for the controller is to move to Goal immediately (however there is a discrete cost (resp. 1 and 7) on each discrete transition). The crucial (and only remaining) question is how long the controller should wait in ℓ0 before taking the transition to ℓ1. Obviously, in order for the controller to win this duration must be no more than two time units. However, what is the optimal choice for the duration in the sense that the overall cost of reaching Goal is minimal? Denote by t the chosen delay in ℓ0. Then 5t + 10(2 − t) + 1 is the minimal cost through ℓ2 and 5t + (2 − t) + 7 is the minimal cost through ℓ3. As the environment chooses between these two transitions the best choice for the controller is to delay t ≤ 2 such that max(21 − 5t, 9 + 4t) is minimum, which is t = 4/3 giving a minimal cost of 14 1/3.
[Fig. 1. A Reachability Priced Time Game Automaton A. Locations: ℓ0 with cost(ℓ0) = 5, ℓ1 with invariant [y = 0], ℓ2 with cost(ℓ2) = 10, ℓ3 with cost(ℓ3) = 1, and Goal. Transitions: ℓ0 → ℓ1 with x ≤ 2; c1; y := 0. Uncontrollable transitions u from ℓ1 to ℓ2 and from ℓ1 to ℓ3. ℓ2 → Goal with x ≥ 2; c2; cost = 1. ℓ3 → Goal with x ≥ 2; c2; cost = 7.]
Related Work. Acyclic priced (or weighted) timed games have been studied in [23] and the more general case of non-acyclic games has been recently considered in [1]. In [1], the problem they consider is “compute the optimal cost within k steps” (we refer to this ...
Locations are annotated with a cost (weight) per time unit (derivative).
Transitions are annotated with a cost (weight).
[Alur et al. & Larsen et al., 2001]
Example from [Bouyer et al, 2004]
!!! Costs are observers !!!
(l0,(0,0),0) −0.8→ (l0,(0.8,0.8),4) −c1→ (l1,(0.8,0),4) −u→ (l2,(0.8,0),4) −3.1→ (l2,(3.9,3.1),35) −c2→ (Goal,(3.9,3.1),36)
Costs are accumulated but are not “tested” along the run.
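The accumulation of cost along this run can be replayed with a short sketch. States are (location, x, y, cost) as in the run above; delays add rate × duration and discrete transitions add their own cost. The cost rates of l1 and Goal are assumptions (no time is spent there in this run), and guards and invariants are deliberately not checked, since the cost never influences which steps are possible. The names `RATE`, `STEPS` and `run` are hypothetical.

```python
from fractions import Fraction

# Cost rates per time unit. l0, l2, l3 come from the example;
# l1 and Goal are assumed to be 0 (no delay happens there in this run).
RATE = {"l0": 5, "l1": 0, "l2": 10, "l3": 1, "Goal": 0}

# The run from the slide: ("delay", duration) or
# ("edge", target location, discrete cost, whether clock y is reset).
STEPS = [
    ("delay", "0.8"),
    ("edge", "l1", 0, True),     # c1: y := 0
    ("edge", "l2", 0, False),    # u: uncontrollable move
    ("delay", "3.1"),
    ("edge", "Goal", 1, False),  # c2 with cost = 1
]


def run(steps, start="l0"):
    """Replay a run and return the list of (location, x, y, cost) states."""
    loc, x, y, cost = start, Fraction(0), Fraction(0), Fraction(0)
    trace = [(loc, x, y, cost)]
    for step in steps:
        if step[0] == "delay":
            d = Fraction(step[1])
            x, y = x + d, y + d
            cost += RATE[loc] * d  # cost grows at the rate of the location
        else:
            _, target, edge_cost, reset_y = step
            loc = target
            cost += edge_cost      # discrete cost of the transition
            if reset_y:
                y = Fraction(0)
        trace.append((loc, x, y, cost))
    return trace


for loc, x, y, cost in run(STEPS):
    print(f"({loc},({float(x)},{float(y)}),{float(cost)})")
```

Exact rationals are used instead of floats so that the accumulated values match the run term by term.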
Optimal strategy problem
t = time to wait in l0.
Min_t Max(5t + 10(2 − t) + 1, 5t + (2 − t) + 7)
Min over t: Player 1's choice. Max: Player 2's choice.
The minimum is reached when t = 4/3, and the cost is 14 1/3.
Optimal moves on rational points (not necessarily integer points !)
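The min-max computation of the example can be checked with exact arithmetic. The sketch below assumes only the two cost expressions of the example, 5t + 10(2 − t) + 1 and 5t + (2 − t) + 7, and the constraint 0 ≤ t ≤ 2. Since the maximum of two linear functions is convex and piecewise linear, its minimum over an interval is reached at an endpoint or at the crossing point; the function names are hypothetical.

```python
from fractions import Fraction


def cost_via_l2(t):
    """Cost if the environment moves to l2: wait t in l0, then 2 - t in l2."""
    return 5 * t + 10 * (2 - t) + 1  # = 21 - 5t


def cost_via_l3(t):
    """Cost if the environment moves to l3: wait t in l0, then 2 - t in l3."""
    return 5 * t + (2 - t) + 7       # = 9 + 4t


def worst_case(t):
    """Player 2 picks the more expensive branch."""
    return max(cost_via_l2(t), cost_via_l3(t))


def optimal_delay(lower=Fraction(0), upper=Fraction(2)):
    """Minimize worst_case over [lower, upper].

    Candidates are the two endpoints and the crossing point of the
    two lines 21 - 5t and 9 + 4t, i.e. t = (21 - 9) / (4 + 5).
    """
    crossing = Fraction(21 - 9, 4 + 5)
    candidates = [lower, upper]
    if lower <= crossing <= upper:
        candidates.append(crossing)
    best_t = min(candidates, key=worst_case)
    return best_t, worst_case(best_t)


best_t, best_cost = optimal_delay()
print("optimal delay t =", best_t)      # a rational, non-integer point
print("optimal cost    =", best_cost)
```

The optimum lies where the two branches cost the same, which is why the optimal move falls on a rational point rather than an integer one.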
Optimization problems - Highlights
Theorem [ALP01,BGHRV01,BBBR07]. Optimal reachability in timed automata is PSPACE-complete.
Theorem [BCFL04,ABM04]. Optimal reachability in timed game automata with strictly positive costs is decidable.
Theorem [BBR04]. Cost optimal branching time logic model-checking on timed automata is undecidable.
Theorem [BBR05]. Cost optimal reachability in timed game automata is undecidable.
Theorem [BLMR06]. ε-Cost optimal reachability for 1-clock timed automata is decidable.
Theorem [BLM07]. Cost optimal branching time logic model-checking on 1-clock timed automata is decidable.
And more recently: (multi-)energy games, generalized mean-payoff games, ...
The Hydac Case-Study
In a few words
Case study submitted by the Hydac GmbH Company, within the European Research Project Quasimodo.
Concerns a controller for a Plastic Injection Molding Machine
Robust and optimal control
Tool chain
Synthesis: UppAal TiGa
Verification: PHAVer
Performance: Simulink
The gain is ≈ 45% w.r.t. classical controllers
Problem Description
[Figure: Pump (+2.2 litres/second), Reservoir, Accumulator with levels Vmin and Vmax, Machine/Consumer]
The Machine consumes oil in the Accumulator
The Machine returns oil back into the Reservoir
The Pump can move oil from the Reservoir into the Accumulator
The total amount of oil in the system is constant
Synthesis Objectives
Company’s Objectives:
Safety: the volume must stay within a safe interval [Vmin, Vmax]
Optimization: minimize the average/overall oil volume ∫_{t=0}^{t=T} v(t) dt / T
Realistic Controllers:
Simple design
Implementability (tolerance to imprecise measures)
7/ 30
The Machine and the Pump
The Machine:
Infinite cyclic consumption to be satisfied by our control strategy.
One cycle: 20 seconds → Deterministic behaviour
[Figure: Machine rate (litre/second) against time (second) over one 20-second cycle; the labelled rates are 1.2, 1.2, 2.5, 1.7 and 0.5.]
Fluctuations in consumption of 0.1 litre/second → Non-determinism!
The Pump:
Latency constraint: 2 seconds between state changes of the pump
Noise is controlled by the adversary
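As an illustration, the cyclic consumption can be written as a piecewise-constant function. The switching times below are read from the machine automaton shown later in the deck (t = 2, mr := 1.2, and so on). The names `MACHINE_PROFILE`, `nominal_rate` and `rate_bounds` are hypothetical, and applying the 0.1 litre/second fluctuation only while the machine is consuming is an assumption of this sketch.

```python
CYCLE = 20.0        # seconds, one machine cycle
FLUCTUATION = 0.1   # litre/second, chosen by the adversary

# (start time in s, nominal rate in litre/s); a rate holds until the next start time.
MACHINE_PROFILE = [(0, 0.0), (2, 1.2), (4, 0.0), (8, 1.2), (10, 2.5),
                   (12, 0.0), (14, 1.7), (16, 0.5), (18, 0.0)]


def nominal_rate(t):
    """Deterministic machine rate at time t (the profile repeats every cycle)."""
    t = t % CYCLE
    rate = 0.0
    for start, r in MACHINE_PROFILE:
        if t >= start:
            rate = r
    return rate


def rate_bounds(t):
    """Interval of rates the adversary may pick at time t.

    Assumption: no fluctuation while the machine is idle (rate 0).
    """
    r = nominal_rate(t)
    if r == 0.0:
        return (0.0, 0.0)
    return (r - FLUCTUATION, r + FLUCTUATION)


def nominal_consumption_per_cycle():
    """Litres consumed in one cycle without fluctuation."""
    ends = [start for start, _ in MACHINE_PROFILE[1:]] + [CYCLE]
    return sum(r * (end - start)
               for (start, r), end in zip(MACHINE_PROFILE, ends))


if __name__ == "__main__":
    print(nominal_consumption_per_cycle())  # about 14.2 litres
```

Under these numbers the nominal consumption is about 14.2 litres per cycle, so a 2.2 litres/second pump has to run for roughly 6.5 seconds per cycle just to compensate.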
Two existing controllers
Bang-Bang controller:
Use two min/max bounds on the volume to control the pump:
if V < Vminsafe, start the pump;
if V > Vmaxsafe, stop the pump;
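The two rules above can be sketched as a decision function. This is not the controller used in the case study: the function name `bang_bang` and its parameters are hypothetical, the thresholds in the usage lines are made-up values, and the 2-second latency of the pump is added here as an assumption about how the rule interacts with the pump.

```python
def bang_bang(volume, pump_on, since_switch, v_min_safe, v_max_safe, latency=2.0):
    """Return the next pump state (True = on) for the two-threshold rule.

    since_switch is the time in seconds since the pump last changed state;
    no change is allowed before `latency` seconds have elapsed.
    """
    if since_switch < latency:
        return pump_on            # too early to switch again
    if volume < v_min_safe:
        return True               # start the pump
    if volume > v_max_safe:
        return False              # stop the pump
    return pump_on                # between the bounds: keep the current state


if __name__ == "__main__":
    # Hypothetical thresholds of 5 and 20 litres.
    print(bang_bang(4.0, False, 5.0, 5.0, 20.0))   # True: below the lower bound
    print(bang_bang(21.0, True, 5.0, 5.0, 20.0))   # False: above the upper bound
    print(bang_bang(4.0, False, 1.0, 5.0, 20.0))   # False: latency not yet elapsed
```

Because of the latency, the thresholds must leave enough slack for the volume to drift during 2 seconds without leaving [Vmin, Vmax].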
[Figure: 2-point controller. Volume [l] and pump [on/off] against time [s], 0 to 200 s.]
Company’s controller:
400 lines of C code in Simulink
Roughly, use sampling (measures every 10 ms) of the volume to compute adequate dates of control.
[Figure: Hydac controller. Volume [l] and pump [on/off] against time [s], 0 to 200 s.]
→ this shows less energy used: on average ≈ 15%
A hybrid model
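(a) The Machine
Locations i1, ..., i9 with invariants [t ≤ 2], [t ≤ 4], [t ≤ 8], [t ≤ 10], [t ≤ 12], [t ≤ 14], [t ≤ 16], [t ≤ 18], [t ≤ 20]
Initially: mr := 0, t := 0
Transitions: t = 2, mr := 1.2; t = 4, mr := 0; t = 8, mr := 1.2; t = 10, mr := 2.5; t = 12, mr := 0; t = 14, mr := 1.7; t = 16, mr := 0.5; t = 18, mr := 0; t = 20, t := 0
(b) The Accumulator
Single location with invariant [true]
Dynamics: $dv/dt \in [p_r - m_r^{+}(\epsilon);\ p_r - m_r^{-}(\epsilon)]$ and $dV_{acc}/dt = v$
Initially: v := 10.0, Vacc := 0
(c) The Pump
Locations Off and On, both with invariant [true]
Initially: z := 2, pr := 0
Off to On: z ≥ 2, switch on!, pr := 2.2, z := 0
On to Off: z ≥ 2, switch off!, pr := 0, z := 0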
→ Synthesize a controller for the pump ensuring safety, optimality, with a simple design, and robust to imprecisions in measures of volume and time.
Undecidable problem!
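To make the hybrid model concrete, here is a minimal sketch that integrates the accumulator volume over one cycle for a fixed pump schedule. It is a plain simulation, not a synthesis procedure. All names (`run_cycle`, `respects_latency`, `MACHINE`) are hypothetical, the noise is assumed constant over the cycle and applied only while the machine consumes, and the latency check ignores the wrap-around between consecutive cycles.

```python
PUMP_RATE = 2.2   # litre/second when the pump is on
CYCLE = 20.0      # seconds

# Consuming phases of the machine: (start, end, nominal rate in litre/s).
MACHINE = [(2, 4, 1.2), (8, 10, 1.2), (10, 12, 2.5), (14, 16, 1.7), (16, 18, 0.5)]


def machine_rate(t, noise=0.0):
    """Machine rate at time t; the noise is added only while consuming (assumption)."""
    for start, end, rate in MACHINE:
        if start <= t < end:
            return rate + noise
    return 0.0


def respects_latency(pump_on_intervals, latency=2.0):
    """True if consecutive pump switches are at least `latency` seconds apart."""
    switches = [t for interval in sorted(pump_on_intervals) for t in interval]
    return all(b - a >= latency for a, b in zip(switches, switches[1:]))


def run_cycle(v0, pump_on_intervals, noise=0.0):
    """Integrate dv/dt = pr - mr exactly over one cycle.

    Returns (final volume, minimum volume, maximum volume, integral of v).
    """
    points = {0.0, CYCLE}
    for start, end, _ in MACHINE:
        points.update((float(start), float(end)))
    for start, end in pump_on_intervals:
        points.update((float(start), float(end)))
    pts = sorted(points)
    v, acc, lo, hi = v0, 0.0, v0, v0
    for a, b in zip(pts, pts[1:]):
        mid = 0.5 * (a + b)   # all rates are constant on (a, b)
        pump_is_on = any(s <= mid < e for s, e in pump_on_intervals)
        dv = (PUMP_RATE if pump_is_on else 0.0) - machine_rate(mid, noise)
        v_next = v + dv * (b - a)
        acc += 0.5 * (v + v_next) * (b - a)   # v is linear on (a, b)
        v = v_next
        lo, hi = min(lo, v), max(hi, v)
    return v, lo, hi, acc


if __name__ == "__main__":
    schedule = [(2.0, 5.0), (9.0, 12.5)]   # hypothetical pump activations
    print(respects_latency(schedule))
    print(run_cycle(10.0, schedule, noise=0.1))
```

Since the volume derivative does not depend on the volume itself, running the sketch with the noise fixed at +0.1 and at -0.1 gives lower and upper bounds on the volume for a given schedule.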
Our approach
“Very” abstract UppAal-Tiga model
→ Automatic synthesis
→ Abstract control strategy
→ Embed in PhaVer HA model for robustness verification
→ Embed in Simulink model for performance evaluation
Model of one cycle
✓ The controller gets only the level at the start
✓ The controller can decide to open/close the pump k times during the cycle
  ➡ simple controller
✓ The noise is controlled by the adversary
  ➡ noise is out of control of the system
✓ The noise is not observable during run
  ➡ explicit modeling of imperfect information
✓ Oil level is discretized
  ➡ model is simple enough (but neither an over-approximation nor an under-approximation)
Additional automata
Pump: on/off + delays
Scheduler: models the interaction between cycle || controller || pump
Complete model for one cycle
But we want a solution for any number of cycles
Idea: formulate a control objective that ensures that the strategy is inductive
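Inductive strategies
① Safe with Min-Max
② Wherever we start in the initial interval, we should end in the final interval
③ final interval ⊆ initial interval
④ Find (initial interval, final interval) that minimizes the mean level
+ margin (robustness)
[Figure: oil level against time over 1 cycle, with the pump activations.]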
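The inductive condition can be sketched as a check over a discretized set of initial volumes. This is only a toy illustration: `is_inductive` and `net_change_bounds` are hypothetical names, the strategy is abstracted as a function giving worst-case bounds on the net volume change over one cycle, and the safety condition during the cycle (point ①) is not checked here.

```python
def is_inductive(start, end, net_change_bounds, step=0.1, tol=1e-9):
    """Check that `end` is inside `start` and that every sampled initial volume
    in `start` leads to a final volume inside `end`.

    start, end: (low, high) intervals of volumes in litres.
    net_change_bounds(v0): hypothetical callback returning (d_min, d_max), the
    worst-case net change of volume over one cycle when starting from v0.
    """
    if not (start[0] <= end[0] and end[1] <= start[1]):
        return False                      # end interval not included in start interval
    n = int(round((start[1] - start[0]) / step))
    for k in range(n + 1):
        v0 = start[0] + k * step
        d_min, d_max = net_change_bounds(v0)
        if v0 + d_min < end[0] - tol or v0 + d_max > end[1] + tol:
            return False                  # some behaviour leaves the end interval
    return True


if __name__ == "__main__":
    # Toy strategy: aims at 8 litres at the end of the cycle, within +/- 1 litre.
    def toy(v0):
        return (8.0 - v0 - 1.0, 8.0 - v0 + 1.0)

    print(is_inductive((5.1, 10.0), (6.5, 9.5), toy))   # True
    print(is_inductive((5.1, 10.0), (7.5, 9.5), toy))   # False
```

When the check succeeds, the strategy can be replayed cycle after cycle, since each cycle ends in a volume from which the next cycle is again covered.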
Results - An Example
Margin: m = 0.4 l
Granularity: D = 1
Best stable interval: I = [5.1, 10]
Representation of the strategy: (for any initial volume, it gives two periods of activation)
[Figure: activation periods of the pump over time (s), for each initial volume (l).]
PhaVer model
PhaVer HA model for robustness verification
One of our automatically generated controllers
☛ Robustness OK
☛ Any number of cycles OK
☛ Correctness w.r.t. over-approx. OK
Verification necessary because our game model is “very” abstract!
Performance evaluation
Matlab - Simulink
Controller: Bang-Bang, Hydac solution, or automatically generated controller
Simulation
Stochastic model of noise
Plots of execution
[Figure: three plots of volume [l] and pump [on/off] against time [s], 0 to 200 s: 2-point controller, Hydac controller, and m4g1.]
Numerical results
Performances obtained according to Simulink simulations:
Controller   Acc. volume   Mean volume   Mean volume (Tiga)
Bang-Bang    2689          13.45         -
Hydac        2232          11.16         -
G1M4         1511          7.56          8.45
G1M3         1511          7.56          8.35
G1M2         1518          7.59          8.25
G1M1         1518          7.59          8.2
G2M4         1527          7.64          8.05
G2M3         1513          7.57          7.95
G2M2         1500          7.5           7.95
G2M1         1489          7.44          7.95
This shows a vast reduction of energy used:
≈ 45% w.r.t. Bang-Bang controller
≈ 33% w.r.t. Hydac’s controller
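The percentages can be recomputed from the accumulated volumes in the table. The sketch below copies those figures; the names `ACC_VOLUME`, `reduction` and `best_synthesized` are hypothetical.

```python
# Accumulated volume per controller, copied from the table above.
ACC_VOLUME = {
    "Bang-Bang": 2689, "Hydac": 2232,
    "G1M4": 1511, "G1M3": 1511, "G1M2": 1518, "G1M1": 1518,
    "G2M4": 1527, "G2M3": 1513, "G2M2": 1500, "G2M1": 1489,
}


def reduction(controller, baseline):
    """Percentage reduction of accumulated volume w.r.t. a baseline controller."""
    return 100.0 * (1.0 - ACC_VOLUME[controller] / ACC_VOLUME[baseline])


def best_synthesized():
    """Synthesized controller (names starting with G) with the lowest accumulated volume."""
    candidates = [name for name in ACC_VOLUME if name.startswith("G")]
    return min(candidates, key=ACC_VOLUME.get)


if __name__ == "__main__":
    best = best_synthesized()
    print(best)
    print(round(reduction(best, "Bang-Bang"), 1))   # about 44.6
    print(round(reduction(best, "Hydac"), 1))       # about 33.3
```

The mean volumes in the table equal the accumulated volumes divided by 200, which matches the 200-second horizon of the plots.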
Conclusion
• Automatically generated controllers
• Correct and robust !
• More efficient than BB and Hydac solutions.
• Aggressive abstractions: need for verification.
• Explicit modeling of imperfect information (algorithms for imperfect information not yet available).
• Discretization of costs:
• we need further research efforts !
• new interesting theoretical questions !