QEST'12 Paper Seminar
DESCRIPTION
A quick seminar based on a paper published at QEST'12.
TRANSCRIPT
The PMC Problem
Resolving Non-Determinism
Algorithm
Implementation and Results
Conclusions
Statistical Model Checking for Markov Decision Processes
D. Henriques, J. Martins, P. Zuliani, A. Platzer, E. M. Clarke
Computer Science Department, Carnegie Mellon University
June 6, 2012
Summary
1 The PMC Problem
2 Resolving Non-Determinism
3 Algorithm
4 Implementation and Results
5 Conclusions
Model Checking
Given:
Property ϕ in temporal logic
A transition system M
Does ϕ hold in M, or M |= ϕ?
Example
Is one car always safely behind another, where x1 and x2 are their positions:
G (x1 < x2)
State of the art can handle millions of states. Used in the hardware and software industries.
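An invariant like the one above can be checked by exhaustive exploration of the reachable state space. A minimal sketch of the idea, assuming a toy two-car model that tracks only the gap x2 - x1 (the model, the function names, and the successor relations are all illustrative, not from the paper):

```python
from collections import deque

def check_invariant(initial, successors, holds):
    """Explicit-state check of an invariant G p: visit every reachable
    state and verify the predicate holds in each one."""
    seen = {initial}
    frontier = deque([initial])
    while frontier:
        state = frontier.popleft()
        if not holds(state):
            return False, state        # counterexample state
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return True, None

# Hypothetical two-car model: the state is the gap x2 - x1.
def unsafe_succ(gap):
    if gap > 0:
        yield gap          # both cars advance, gap unchanged
        yield gap - 1      # car 1 closes in, possibly to a collision

def safe_succ(gap):
    if gap > 1:            # controller never closes the last gap
        yield gap
        yield gap - 1
```

Starting from gap 3, the unsafe model yields the collision state 0 as a counterexample, while the safe variant verifies G (x1 < x2).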
Probabilistic Model Checking
Given:
Property ϕ in temporal logic
A probabilistic transition system M
A probability threshold θ
Is the probability that M satisfies ϕ smaller than θ, written P≤θ(ϕ)?
Example
Is it very unlikely that cars collide?
P≤0.00001(F x1 = x2)
Probabilistic Model Checking: exact approach
Exact methods: pros
Can currently handle relatively complex scenarios
Handles systems with non-determinism
Mature tools such as PRISM
They are exact...
Exact methods: cons
State explosion problem greatly reduces applicability
Time-consuming
Possibly hard to parallelise (e.g. PRISM)
Probabilistic Model Checking: statistical approach
Statistical Model Checking: pros
Can currently handle very complex scenarios
Highly parallelisable
Only requires bounded memory
Comes in two flavours: hypothesis testing, interval estimation
Statistical Model Checking: cons
Not exact (but converges to correct solution)
Requires a bounded number of "steps", i.e. bounded properties
Requires fully probabilistic systems
But most interesting systems feature non-determinism!
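For a fully probabilistic system, interval-estimation SMC is essentially Monte Carlo sampling with a Chernoff-Hoeffding bound on the number of samples. A hedged sketch (the random-walk model and all names here are illustrative assumptions, not the paper's implementation):

```python
import math
import random

def smc_estimate(step, initial, reached, horizon, eps, delta, rng):
    """Estimate the probability of the bounded reachability property
    F<=horizon 'reached', within +/-eps at confidence 1 - delta.
    The sample count n comes from the Chernoff-Hoeffding bound."""
    n = math.ceil(math.log(2 / delta) / (2 * eps ** 2))
    hits = 0
    for _ in range(n):
        state = initial
        for _ in range(horizon + 1):
            if reached(state):
                hits += 1
                break
            state = step(state, rng)
    return hits / n

# Toy fully probabilistic system: a fair random walk on {0, ..., 4}.
def step(state, rng):
    return min(state + 1, 4) if rng.random() < 0.5 else max(state - 1, 0)
```

With eps = 0.05 and delta = 0.01 the bound gives 1060 samples; each sample is an independent trace, which is what makes the method highly parallelisable and bounded-memory.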
Summary
1 The PMC Problem
2 Resolving Non-Determinism
3 Algorithm
4 Implementation and Results
5 Conclusions
Markov decision processes (MDPs) & schedulers
[Figure: a small MDP with a state s offering several actions; the actions' target-state distributions have probabilities 0.01/0.99, 0.5/0.5 and 1]
The MDP chooses an action non-deterministically
Each action has a probability distribution over target states
Schedulers σ : States → Actions are used to resolve non-determinism
General schedulers ≠ memoryless schedulers for bounded properties
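Fixing a memoryless scheduler turns the MDP into a Markov chain that can be sampled. A sketch with a hypothetical two-state MDP (state names, actions and probabilities are made up, loosely in the spirit of the figure):

```python
import random

# transitions[state][action] = list of (probability, target_state)
TRANSITIONS = {
    "s0": {"a": [(0.99, "s0"), (0.01, "s1")],
           "b": [(0.5, "s0"), (0.5, "s1")]},
    "s1": {"c": [(1.0, "s1")]},
}

def induced_step(state, scheduler, rng):
    """One step of the Markov chain induced by a memoryless scheduler
    (a dict States -> Actions): the scheduler resolves the
    non-deterministic choice, then the chosen action's distribution
    picks the target state."""
    action = scheduler[state]
    r, acc = rng.random(), 0.0
    for prob, target in TRANSITIONS[state][action]:
        acc += prob
        if r < acc:
            return target
    return target  # guard against floating-point round-off
```

With the scheduler {"s0": "b", "s1": "c"}, roughly half the sampled steps from s0 land in s1, as the 0.5/0.5 distribution dictates.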
Probabilistic Model Checking: resolving non-determinism
The property P≤θ(ϕ) actually asks: is the probability that model M satisfies property ϕ smaller than θ for all schedulers?
Thus, we check only optimal schedulers, i.e. those that maximise P(ϕ)
If an optimal scheduler yields a probability above θ, the property is false.
True otherwise.
How do we find optimal schedulers?
Summary
1 The PMC Problem
2 Resolving Non-Determinism
3 Algorithm
4 Implementation and Results
5 Conclusions
[Flowchart: starting from a uniform σ, alternate scheduler evaluation and scheduler improvement, producing an improved σ; determinise the candidate σ into a deterministic σ and run SMC, which outputs True or False]
Scheduler Evaluation
This step estimates:
How good the current scheduler is
How much each choice contributed to the satisfaction of ϕ
It does this by:
Turning the MDP into a Markov chain using scheduler σ
Sampling from the Markov chain repeatedly
For a satisfying trace, give a positive point to each action taken, and vice-versa for a non-satisfying trace
At the end, for each action, we have an estimate of the probability of satisfaction of ϕ
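The evaluation step can be sketched on a hypothetical one-state toy MDP with two actions, "good" and "bad" (the model, its probabilities and all names are my assumptions for illustration, not the paper's benchmarks):

```python
import random

REACH_PROB = {"good": 0.9, "bad": 0.1}  # hypothetical toy MDP

def sample_path(scheduler, rng):
    """Sample one path from the chain induced by a probabilistic
    scheduler (a dict action -> probability in state s0) and report
    whether the path satisfied phi (here: reaching a goal state)."""
    actions = list(scheduler)
    action = rng.choices(actions, weights=[scheduler[a] for a in actions])[0]
    satisfied = rng.random() < REACH_PROB[action]
    return [("s0", action)], satisfied

def evaluate(scheduler, n_samples, rng):
    """Scheduler evaluation: each (state, action) pair on a satisfying
    trace earns +1, each pair on a non-satisfying trace earns -1."""
    reinforcement = {}
    for _ in range(n_samples):
        path, sat = sample_path(scheduler, rng)
        for sa in path:
            reinforcement[sa] = reinforcement.get(sa, 0) + (1 if sat else -1)
    return reinforcement
```

After enough samples, the accumulated score of each (state, action) pair estimates how much that choice contributes to satisfying ϕ.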
Scheduler Improvement
This step provably improves scheduler σ by:
For each state, choosing the "best" action with probability 1 − ε
Choosing each of the other actions with probability ε/(n − 1), where n is the number of possible actions
This ensures that:
Search efforts are largely directed at the promising regions ofthe state space
All states remain explorable/reachable (p > 0)
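The improvement rule for a single state can be written down directly from the slide (the function and argument names are mine):

```python
def improve(reinforcement, epsilon):
    """Scheduler improvement for one state: the action with the highest
    reinforcement gets probability 1 - epsilon; the remaining epsilon is
    split evenly over the other n - 1 actions, so every action keeps a
    positive probability and stays explorable."""
    actions = list(reinforcement)
    n = len(actions)
    if n == 1:
        return {actions[0]: 1.0}
    best = max(actions, key=lambda a: reinforcement[a])
    return {a: 1 - epsilon if a == best else epsilon / (n - 1)
            for a in actions}
```

For example, with reinforcement {"good": 800, "bad": -750} and ε = 0.1 the new distribution is {"good": 0.9, "bad": 0.1}: search concentrates on the promising action without ever cutting off the other.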
Putting it all together
The entire algorithm is thus very simple:
Start with an uninformed (i.e. uniform) scheduler
Estimate best actions
Improve scheduler with this new information
Rinse & repeat
When the scheduler is "good enough" (or a time limit is reached), determinise it
Run Statistical Model Checking using the determinised scheduler
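The whole loop fits in a few lines. An end-to-end sketch on a hypothetical one-state MDP with two actions (all names and probabilities are illustrative assumptions; the real implementation runs on PRISM models):

```python
import random

REACH_PROB = {"good": 0.9, "bad": 0.1}  # hypothetical toy MDP

def sample(action_probs, rng):
    """Pick an action under the current scheduler; report satisfaction."""
    actions = list(action_probs)
    a = rng.choices(actions, weights=[action_probs[x] for x in actions])[0]
    return a, rng.random() < REACH_PROB[a]

def learn_scheduler(rounds, samples, epsilon, rng):
    sched = {a: 1 / len(REACH_PROB) for a in REACH_PROB}  # uniform start
    for _ in range(rounds):
        score = {a: 0 for a in REACH_PROB}                # evaluate
        for _ in range(samples):
            a, sat = sample(sched, rng)
            score[a] += 1 if sat else -1
        best = max(score, key=score.get)                  # improve
        sched = {a: 1 - epsilon if a == best else epsilon / (len(score) - 1)
                 for a in score}
    return max(sched, key=sched.get)                      # determinise

def smc(action, n, rng):
    """Final statistical model checking run under the fixed scheduler."""
    return sum(rng.random() < REACH_PROB[action] for _ in range(n)) / n
```

On this toy model the loop converges to the "good" action, and the final SMC run estimates the probability it achieves.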
Properties
This algorithm is a False-biased Monte Carlo algorithm.
If it finds a counterexample, it returns false (up to SMC)
If it does not, it returns true with arbitrarily high probability
It has the following nice properties:
Provides counter-examples
Converges
Statistically correct
Highly parallelisable
Summary
1 The PMC Problem
2 Resolving Non-Determinism
3 Algorithm
4 Implementation and Results
5 Conclusions
Implementation
Integrated with the PRISM simulation engine
Works with PRISM MDP benchmarks
Parallel sampling
Synchronisation of data structures during evaluation
Ran on 32-core and 48-core machines
Scheduler improvement
[Figure: fraction of satisfying traces (# satisfied traces / # total traces, 0 to 1) against the number of learning rounds (0 to 90), for 10, 50 and 100 processes]
Parallelisability
[Figure: runtime (s, 0 to 250) against the number of threads (1 to 32), for Mutex Bugged 10 and Mutex Bugged 30]
Correctness
[Figure: runtime in seconds, Time (SMC) vs. Time (PRISM), for models with 10 (8K states), 15 (393K states), 20 (16M states) and 25 (654M states) processes]
Learning Optimal Policies for Model Checking
João Martins and David Henriques
Markov decision processes (MDPs) are expressive models, popular for modeling systems that exhibit both probabilistic and non-deterministic behaviour.
Useful quantitative properties over MDPs can be automatically verified with probabilistic model checking (PMC), a popular formal verification technique.
Unfortunately, PMC suffers from the state explosion problem. Statistical methods can be used to approximate the desired result without need for complete state space exploration.
One well-identified shortcoming is that statistical methods have been limited to fully probabilistic systems.
Bounded LTL
BLTL is an expressive probabilistic logic for reasoning about dynamic systems. Its syntax is given by
ϕ ::= p | ¬ϕ | ϕ ∧ ϕ | F≤n ϕ | G≤n ϕ | ϕ U≤n ϕ | ϕ W≤n ϕ
It allows us to express properties such as "a request is acknowledged within n time units" or "the process enters the critical region until the flag is …".
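The bounded operators can be evaluated directly on a finite trace. An illustrative evaluator (the formula encoding and all names are mine, not PRISM's syntax; W≤n is omitted for brevity):

```python
# A trace is a list of sets of atomic propositions, one set per step.

def holds(formula, trace, i=0):
    """Evaluate a bounded LTL formula at position i of a finite trace."""
    op = formula[0]
    if op == "ap":                      # atomic proposition
        return formula[1] in trace[i]
    if op == "not":
        return not holds(formula[1], trace, i)
    if op == "and":
        return holds(formula[1], trace, i) and holds(formula[2], trace, i)
    if op == "F":                       # F<=n phi: phi within n steps
        n, phi = formula[1], formula[2]
        return any(holds(phi, trace, j)
                   for j in range(i, min(i + n + 1, len(trace))))
    if op == "G":                       # G<=n phi: phi for n steps
        n, phi = formula[1], formula[2]
        return all(holds(phi, trace, j)
                   for j in range(i, min(i + n + 1, len(trace))))
    if op == "U":                       # phi U<=n psi: psi within n, phi until then
        n, phi, psi = formula[1], formula[2], formula[3]
        for j in range(i, min(i + n + 1, len(trace))):
            if holds(psi, trace, j):
                return True
            if not holds(phi, trace, j):
                return False
        return False
    raise ValueError(op)
```

For instance, on the trace [{"req"}, {"req"}, {"ack"}], the formula F≤2 ack holds but G≤2 req does not.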
PRISM
PRISM is the reference state-of-the-art probabilistic model checker.
It answers the question P>p(ϕ) using value iteration: is the probability of satisfying ϕ greater than p for all resolutions of non-determinism? P>p(ϕ) is known as the probabilistic property and ϕ as the temporal formula.
It requires the entire state space in memory.
Objectives
Develop a Reinforcement Learning algorithm to learn optimal policies for model checking PLTL in MDPs that does not require computation over the entire state space.
Using the above technique, apply Statistical Model Checking (SMC) to systems with non-determinism. To the best of our knowledge, this is the first attempt to solve this problem in a general setting.
Integrate the algorithm with the PRISM model checker; in particular, allow the use of its extensive benchmark suite.
Top-Level Algorithm
1. Initialise a policy σ such that actions are chosen uniformly from each state
2. Do K times:
a. Sample a set P of N paths from the MDP using policy σ
b. For each π ∈ P:
If π ⊨ ϕ, positively reinforce the choices along π
If ¬(π ⊨ ϕ), negatively reinforce the choices along π
c. Update policy σ based on reinforcement
3. Determinise the policy σ
4. Use SMC to check the probabilistic property
Leader Election Protocol - Error
Randomized leader election protocol. The graphic shows the probability of electing a leader within x steps.
Results - Efficiency
[Figure: probability of electing a leader (0 to 1) within x steps (25 to 75), showing lower bound, upper bound and real probability]
Mutex Protocol - Error
Several mutual exclusion processes, one having a small probability of entering the critical zone illegally. Checking the worst-case probability of error.
Reinforcement
For each state-action pair (s, a), the reinforcement R is
R(s, a) = |{π ∈ P : (s, a) ∈ π and π ⊨ ϕ}| − |{π ∈ P : (s, a) ∈ π and ¬(π ⊨ ϕ)}|
Policy Update
New probability distributions are Multinomials with parameters given by the MLE from the reinforcement information (R(s, a) / Σ_a′ R(s, a′)).
To avoid having transitions with 0 probability and to minimize harmful runs, actual policy updates are a mixture of the previous distribution and the new probability distribution.
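The update for one state might be sketched as follows. Note two assumptions of mine that the poster does not specify: reinforcement values can be negative, so they are shifted to be positive before serving as Multinomial counts, and the mixture weight is a free parameter:

```python
def update_policy(old, reinforcement, mix=0.5):
    """Policy update for one state: compute the MLE distribution from
    the reinforcement values (shifted to be positive so they can serve
    as counts -- an assumed implementation choice), then mix it with the
    previous distribution so no action's probability ever hits 0."""
    low = min(reinforcement.values())
    shifted = {a: r - low + 1 for a, r in reinforcement.items()}
    total = sum(shifted.values())
    mle = {a: w / total for a, w in shifted.items()}
    return {a: mix * old[a] + (1 - mix) * mle[a] for a in old}
```

Mixing keeps every transition probability strictly positive, so the state space remains fully explorable across rounds.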
Convergence and Stopping Criteria
Since optimal policies are deterministic, every once in a while we determinise the policy and check the probabilistic property using SMC.
Negative answers from SMC are (probabilistically) guaranteed to mean the probabilistic property is false, since there is at least one policy achieving the value in question.
Positive answers from SMC may be false positives. We run the algorithm several times to minimize the probability of always converging to local maxima.
Mutex Protocol - Efficiency
[Figure: probability (0 to 1.2) against # steps (1 to 21) and # processes, comparing Deterministic and Probabilistic]
Mutex Protocol - Convergence
[Figure: convergence over the number of learning rounds K]
Wireless Network - Efficiency
IEEE 802.11 Wireless LAN standard for collision avoidance. Several stations broadcast at the same time and enact a back-off protocol when collisions are detected.
[Figure: estimated probability against the true probability for 10, 15, 20 and 25 processes]
[Figure: runtime in seconds, Time (PRISM) vs. Time (SMC), for 2 (204K states), 3 (616K states), 4 (1.9M states), 5 (6.2M states) and 6 (19.8M states) stations]
Results table (reconstructed from the flattened slide): for each model, the verdict (out) and runtime t (s) of the statistical algorithm at several thresholds θ, with PRISM's computed probability and runtime in the last column.

CSMA 3 4   θ      0.5    0.8    0.85   0.9    0.95   | PRISM
           out    F      F      F      T      T      | 0.86
           t (s)  1.7    11.5   35.9   115.7  111.9  | 136

CSMA 3 6   θ      0.3    0.4    0.45   0.5    0.8    | PRISM
           out    F      F      F      T      T      | 0.48
           t (s)  2.5    9.4    18.8   133.9  119.3  | 2995

CSMA 4 4   θ      0.5    0.7    0.8    0.9    0.95   | PRISM
           out    F      F      F      F      T      | 0.93
           t (s)  3.5    3.7    17.5   69.0   232.8  | 16244

CSMA 4 6   θ      0.5    0.7    0.8    0.9    0.95   | PRISM
           out    F      F      F      F      F*     | memout
           t (s)  3.7    4.1    4.2    26.2   258.9  | memout

WLAN 5     θ      0.1    0.15   0.2    0.25   0.5    | PRISM
           out    F      F      T      T      T      | 0.18
           t (s)  4.9    11.1   124.7  104.7  103.2  | 1.6

WLAN 6     θ      0.1    0.15   0.2    0.25   0.5    | PRISM
           out    F      F      T      T      T      | 0.18
           t (s)  5.0    11.3   127.0  104.9  102.9  | 1.6
Summary
1 The PMC Problem
2 Resolving Non-Determinism
3 Algorithm
4 Implementation and Results
5 Conclusions
Conclusions
Trading absolute correctness for statistical correctness gives us more applicability
Faster than traditional exact approaches for systems that are not completely structured
Statistical correctness
Integration with PRISM
Thank you, questions?