Parallel and Distributed Systems Laboratory
Paradise: A Toolkit for Building Reliable Concurrent Systems
Trace Verification for Parallel Systems
Vijay K. Garg
Department of Electrical and Computer Engineering
The University of Texas at Austin
Austin, TX 78712
email: [email protected]
Talk Outline
Motivation and Overview
Instrumentation
– Clock : Tracking Dependency
Property Checking
– Sensor : Detecting Global Properties
– Slicer : Computation Slicing
Motivation: Reliable System
Concurrent systems are prone to errors.
– Concurrency, nondeterminism, process and channel failures
Techniques to ensure correctness
Modeling: Model Checking and Formal Verification
Bug Hunting: Simulation, Debugging and Verification
Fault-Tolerance
Paradise Environment
Program Monitor
Slicer
Predicate
Observe
Control
Trace Model: Total Order vs Partial Order
Total order: interleaving of events in a trace
Partial order: Lamport’s happened-before model
Figure panels: Partial Order Trace, Successful Trace, Faulty Trace. Processes P1 and P2; events e1, e2, f1, f2; critical sections CS1 and CS2.
Specification: CS1 Λ CS2
Tracking Dependency
Computation: a set of events ordered by the “happened before” relation
Problem: Timestamp events to answer
– e happened before f ?
– e concurrent with f ?
Clocks in a Distributed System
Result: s happened before t if and only if the vector at s is less than the vector at t.
Vector Clocks [Fidge 89, Mattern 89]
P1: (1,0,0) (2,1,0) (3,1,0)
P2: (0,1,0) (0,2,0)
P3: (0,0,1) (0,0,2) (2,1,3)
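The comparison behind this result can be sketched in a few lines of Python. This is a minimal illustration, not code from the Paradise toolkit: the function names `happened_before`, `concurrent`, and `on_receive` are assumptions made for this example, and the timestamps are the ones shown for P1, P2, and P3 above.

```python
from typing import Tuple

Vector = Tuple[int, ...]


def happened_before(s: Vector, t: Vector) -> bool:
    """s happened before t iff s <= t in every component and s != t."""
    return all(a <= b for a, b in zip(s, t)) and s != t


def concurrent(s: Vector, t: Vector) -> bool:
    """Two distinct timestamps are concurrent when neither precedes the other."""
    return s != t and not happened_before(s, t) and not happened_before(t, s)


def on_receive(local: Vector, received: Vector, pid: int) -> Vector:
    """Merge the sender's vector into the local one, then count the receive event."""
    merged = [max(a, b) for a, b in zip(local, received)]
    merged[pid] += 1
    return tuple(merged)


# Timestamps taken from the figure (P1, P2, P3 are indices 0, 1, 2).
print(happened_before((0, 1, 0), (2, 1, 0)))   # True
print(concurrent((1, 0, 0), (0, 1, 0)))        # True
print(on_receive((1, 0, 0), (0, 1, 0), 0))     # (2, 1, 0)
```

Each timestamp carries one component per process, which is the scalability cost that motivates the next slide.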
Dynamic Chain Clocks
Problem with vector clocks: scalability, dynamic process structure
Idea: compute the “chains” for relevant events in an online fashion [Aggarwal and Garg PODC 05]
Figure: a computation with 4 processes (P1, P2, P3, P4) containing the relevant events a to h, and the relevant subcomputation drawn as the two rows a b c d and e f g h.
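A rough sketch of the chain idea follows. It is a simplified greedy scheme written for illustration, not the published dynamic chain clock algorithm: each relevant event joins an existing chain if that chain's last event happened before it, and otherwise starts a new chain, so a timestamp has one component per chain rather than per process. The function name `chain_timestamps`, the event names, and the dependencies in the example are all hypothetical.

```python
def chain_timestamps(events):
    """Timestamp relevant events with one vector component per chain.

    `events` is a list of (name, direct_predecessor_names) pairs, given in an
    order consistent with happened-before.
    """
    ancestors = {}   # name -> set of all events that happened before it
    stamp = {}       # name -> chain-clock timestamp (tuple)
    tails = []       # tails[c] = name of the last event placed on chain c
    for name, preds in events:
        anc = set(preds)
        for p in preds:
            anc |= ancestors[p]
        ancestors[name] = anc

        # Component-wise maximum over the predecessors' timestamps.
        vec = [0] * len(tails)
        for p in preds:
            for c, value in enumerate(stamp[p]):
                vec[c] = max(vec[c], value)

        # Greedy choice: extend the first chain whose tail precedes this event.
        chain = next((c for c, t in enumerate(tails) if t in anc), None)
        if chain is None:
            tails.append(name)
            vec.append(0)
            chain = len(tails) - 1
        else:
            tails[chain] = name
        vec[chain] += 1
        stamp[name] = tuple(vec)
    return stamp


# Hypothetical dependencies among six relevant events.
stamps = chain_timestamps([
    ("a", []), ("e", []), ("b", ["a"]),
    ("f", ["e", "b"]), ("c", ["b"]), ("g", ["f"]),
])
print(stamps["f"], stamps["c"], stamps["g"])
```

Timestamps issued before a chain was created are shorter; when comparing, missing components can be read as zero.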
Experimental Results
Simulation of a computation with 1% relevant events
Measured
– number of components vs number of threads
– total time overhead vs number of threads
Global Property Detection
Predicate: A global condition expressed using variables on processes
– e.g., more than one process is in the critical section, there is no token in the system
Problem: find a global state that satisfies the given predicate
Figure: processes P1 and P2 with their critical sections marked, and two global states G1 and G2.
The Main Difficulty in Partial Order
Algorithm for general predicate [Cooper and Marzullo 91]
Too many global states: a computation may contain as many as O(k^n) global states
• k: maximum number of events on a process
• n: number of processes
Figure: the lattice of global states, from ┴ (bottom) to T (top), for events e1, e2 on P1 and f1, f2 on P2:
{┴}, {e1, ┴}, {f1, ┴}, {e1, f1, ┴}, {e2, e1, ┴}, {e2, e1, f1, ┴}, {e1, f2, f1, ┴}, {e2, e1, f2, f1, ┴}
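To make the blow-up concrete, the brute-force approach can be sketched as below: enumerate every combination of per-process event counts and keep the consistent ones. This is an illustrative sketch, not the Cooper and Marzullo algorithm itself. The vector clocks in `CLOCKS` are an assumption: they encode a dependency from e1 to f2, inferred from the fact that {f2, f1, ┴} is not among the global states listed above.

```python
from itertools import product

# Vector clock of each event, per process.
# Assumption: f2 depends on e1 (for example through a message).
CLOCKS = [
    [(1, 0), (2, 0)],   # P1: e1, e2
    [(0, 1), (1, 2)],   # P2: f1, f2
]


def is_consistent(cut):
    """A cut (events executed per process) is consistent when the last event
    included on each process depends only on events inside the cut."""
    for p, count in enumerate(cut):
        if count == 0:
            continue
        last = CLOCKS[p][count - 1]
        if any(last[q] > cut[q] for q in range(len(cut))):
            return False
    return True


def global_states():
    """Enumerate all consistent cuts; up to (k + 1)^n candidates are checked."""
    ranges = [range(len(events) + 1) for events in CLOCKS]
    return [cut for cut in product(*ranges) if is_consistent(cut)]


def detect(predicate):
    """Return the first consistent global state satisfying the predicate, if any."""
    return next((cut for cut in global_states() if predicate(cut)), None)


print(len(global_states()))                  # 8
print(detect(lambda cut: sum(cut) == 4))     # (2, 2)
```

Under the stated assumption the count matches the eight states in the lattice; the candidate space still grows exponentially in the number of processes.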
Efficient Predicate Detection for Special Cases
stable predicate: [Chandy and Lamport 85]
• once the predicate becomes true, it stays true, e.g., deadlock
unstable predicate:
• observer independent predicate [Charron-Bost et al 95]: if it occurs in one interleaving, it occurs in all interleavings, e.g., any disjunction of local predicates
• linear predicate [Chase and Garg 95], e.g., conjunctive predicates such as "there is no leader in the system"
• relational predicate: x1 + x2 + … + xn ≥ k [Chase and Garg 95], e.g., violation of k-mutual exclusion
Algorithms for Conjunctive Predicates
Centralized Algorithm [Garg and Waldecker 92]
Each non-checker process maintains its local vector and sends the chain clock to the checker process whenever
– the local predicate is true
– at most once in each message interval.
Time complexity: the checker requires at most O(n^2 m) comparisons.
– token based algorithm [Garg and Chase 95]
– completely distributed algorithm [Garg and Chase 95]
– keeping queues shorter [Chiou and Korfhage 95]
– avoiding control messages [Hurfin, Mizuno, Raynal, Singhal 96]
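The checker's side of the centralized scheme can be sketched as follows. This is a simplified illustration, not the published algorithm: it assumes each process has already reported, in order, the timestamps at which its local predicate held, and it treats each timestamp as identifying one local state. The names `detect_conjunction` and `happened_before` are chosen for this example.

```python
from collections import deque


def happened_before(s, t):
    """Vector comparison: s <= t component-wise and s != t."""
    return all(a <= b for a, b in zip(s, t)) and s != t


def detect_conjunction(queues):
    """queues[i] holds, in order, the timestamps at which process i's local
    predicate was true. Returns one timestamp per process that are pairwise
    concurrent, or None when no such global state exists."""
    qs = [deque(q) for q in queues]
    while all(len(q) > 0 for q in qs):
        heads = [q[0] for q in qs]
        # A head that happened before another head cannot belong to any
        # consistent global state built from the remaining candidates.
        stale = {i for i, s in enumerate(heads)
                 for t in heads if happened_before(s, t)}
        if not stale:
            return heads
        for i in stale:
            qs[i].popleft()
    return None


found = detect_conjunction([
    [(1, 0, 0), (3, 1, 0)],   # P1: local predicate true twice
    [(0, 2, 0)],              # P2
    [(2, 1, 3)],              # P3
])
print(found)   # [(3, 1, 0), (0, 2, 0), (2, 1, 3)]
```

Each candidate is discarded at most once, which is why the checker's work stays polynomial instead of growing with the number of global states.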
Other Special Classes of Predicates
Relational Predicates
– Let xi be the number of tokens at Pi
– Σxi < k: loss of tokens
– Algorithms: max-flow techniques [Groselj 93, Chase and Garg 95, Wu and Chen 98]
– Dilworth's partition [Tomlinson and Garg 96]
The Main Idea of Computation Slicing
Partial order trace
slice
state explosion
keep all red global states
slicing
How does Computation Slicing Help?
Partial order trace
slice
retain all global states satisfying b1
slicing for b1
check b1 Λ b2
check b2
satisfy b1
Example
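Detect predicate (x*y + z < 5) Λ (x ≥ 1) Λ (z ≤ 3)
Computation (event = value of the process variable):
P1 (variable x): a = 1, b = 2, c = -1, d = 0
P2 (variable y): e = 0, f = 2, g = 1, h = 3
P3 (variable z): u = 4, v = 1, w = 2, x = 4
Slice with respect to (x ≥ 1) Λ (z ≤ 3): {a,e,f,u,v} {b} {w} {g}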
Computation Slice
computation slice: a sub-computation such that: [Mittal and Garg 01]
1. it contains all global states of the computation satisfying the given predicate, and
2. it contains the least number of global states
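The definition can be explored by brute force on a tiny example. The sketch below rests on two assumptions that are not stated on the slide: it reads "least number of global states" as the smallest set of cuts that contains every satisfying cut and is closed under union and intersection (component-wise max and min on event counts), and it uses a computation with no messages, so every cut is consistent. The names `close_under_meet_and_join` and `slice_states` are hypothetical. Efficient slicing algorithms avoid this enumeration entirely.

```python
from itertools import product


def close_under_meet_and_join(cuts):
    """Smallest superset of `cuts` closed under component-wise min and max."""
    states = set(cuts)
    changed = True
    while changed:
        changed = False
        for a in list(states):
            for b in list(states):
                meet = tuple(map(min, a, b))
                join = tuple(map(max, a, b))
                for c in (meet, join):
                    if c not in states:
                        states.add(c)
                        changed = True
    return states


def slice_states(event_counts, predicate):
    """Global states of the slice for a message-free computation in which
    process i executes event_counts[i] events."""
    all_cuts = product(*(range(k + 1) for k in event_counts))
    return close_under_meet_and_join(c for c in all_cuts if predicate(c))


# Predicate: exactly one event has been executed in total.
result = slice_states((2, 2), lambda cut: sum(cut) == 1)
print(sorted(result))   # [(0, 0), (0, 1), (1, 0), (1, 1)]
```

Here two cuts satisfy the predicate, the slice has four global states, and the full computation has nine, so a slice may keep some states that do not satisfy the predicate while still being smaller than the computation.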
POTA Architecture [Sen Dissertation 04]
Figure: POTA architecture.
Components: Instrumentor, Translator, Execute Program, Execute SPIN, Slicer, Predicate Detector, Analyzer
Artifacts: Program, Instrumented Program, Specification, Predicate (Specification), Trace, Slice, Trace Slice, Promela
Outcomes: yes/witness, no/counterexample
Results
Efficient polynomial-time algorithms for computing the slice for:
– linear predicates: [Garg and Mittal 01]
• time-complexity: O(n^2 m)
– general predicate:
• Theorem: Given a computation, if a predicate b can be detected efficiently then the slice for b can also be computed efficiently. [Mittal, Sen and Garg 03]
– combining slices: Boolean operators
– temporal logic operators: EF, AG, EG
– approximate slice: for arbitrary Boolean expressions
• n: number of processes
• m: number of events
Experiments: Dining Philosophers Trace Verification
POTA: Partial Order Trace Analyzer (based on slicing) [Sen and Garg 03]
SPIN: A widely used model checking tool [Holzmann 97]
– SPIN: 250 seconds for n = 6, runs out of memory for n > 6.
– POTA: handles n = 200 in 400 seconds.
Predicate: Two neighboring dining philosophers do not eat concurrently
Conclusions
Bug-hunting in concurrent systems
Total order vs. Partial Order
Abstractions like slicing combat the state space explosion problem
Questions
?