Context-bounded model checking of concurrent software
Shaz Qadeer, Microsoft Research. Joint work with Jakob Rehof (Microsoft Research) and Dinghao Wu (Princeton University).
![Page 1: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/1.jpg)
Context-bounded model checking of concurrent software
Shaz Qadeer
Microsoft Research
Joint work with:
• Jakob Rehof, Microsoft Research
• Dinghao Wu, Princeton University
![Page 2: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/2.jpg)
Concurrent software
• Operating systems, device drivers
• Databases, web servers, browsers, GUIs, ...
• Modern languages: C#, Java
[Figure: Threads 1-4 running on Processors 1 and 2]
![Page 3: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/3.jpg)
Concurrency is increasingly important
• New classes of concurrent software
  – Web services
  – Workflows
• Single-chip multiprocessors are an architectural inflexion point
  – Software running on these chips will be even more concurrent
![Page 4: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/4.jpg)
Reliable concurrent software?
• Correctness problem
  – Does the program behave correctly for all inputs and all interleavings?
• Bugs due to concurrency are insidious
  – non-deterministic, timing dependent
  – difficult to detect, reproduce, eliminate
  – coverage from testing is very poor
![Page 5: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/5.jpg)
Analysis of concurrent programs is difficult (1)
• Finite-data single-procedure program
  – n lines
  – m states for global data variables
• 1 thread
  – n * m states
• K threads
  – n^K * m states
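The jump from n * m to n^K * m states can be made concrete with a small calculation. The sketch below is illustrative only: the function name `state_count` and the sample sizes (100 lines, 8 global valuations) are assumptions, not figures from the talk.

```python
def state_count(n: int, m: int, k: int = 1) -> int:
    """Upper bound on the number of states: one program counter
    (n choices) per thread, times m valuations of the shared globals."""
    return n ** k * m


# Hypothetical sizes: a 100-line program with 8 global valuations.
for k in (1, 2, 4):
    print(f"{k} thread(s): {state_count(n=100, m=8, k=k)} states")
```

Each extra thread multiplies the bound by n, which is why enumerating interleavings directly does not scale.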
![Page 6: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/6.jpg)
Analysis of concurrent programs is difficult (2)
• Finite-data program with procedures
  – n lines
  – m states for global data variables
• 1 thread
  – Infinite number of states
  – Can still decide assertions in O(n * m^3)
  – SLAM, ESP, BLAST implement this algorithm
• K ≥ 2 threads
  – Undecidable! (Ramalingam 00)
![Page 7: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/7.jpg)
Context-bounded verification of concurrent software
[Figure: an execution timeline split into three contexts separated by two context switches]
Analyze all executions with a small number of context switches!
![Page 8: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/8.jpg)
Why context-bounded analysis?
• Many subtle concurrency errors are manifested in executions with a small number of contexts
• Context-bounded analysis can be performed efficiently
![Page 9: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/9.jpg)
KISS: A static checker for concurrent software
• An implementation of context-bounded analysis
  – Technique to use any sequential checker to perform context-bounded concurrency analysis
• Has found a number of concurrency errors in NT device drivers
![Page 10: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/10.jpg)
KISS: A static checker for concurrent software
[Diagram: Concurrent program P → KISS → Sequential program Q → Sequential checker → "No error found" or "Error in Q indicates error in P"]
![Page 11: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/11.jpg)
KISS: A static checker for concurrent software
[Diagram: Concurrent program P → KISS → Sequential program Q → SDV → "No error found" or "Error in Q indicates error in P"]
![Page 12: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/12.jpg)
KISS: A static checker for concurrent software
[Diagram: Concurrent program P → KISS → Sequential program Q → PREfix → "No error found" or "Error in Q indicates error in P"]
![Page 13: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/13.jpg)
KISS: A static checker for concurrent software
[Diagram: Concurrent program P → KISS → Sequential program Q → ESP → "No error found" or "Error in Q indicates error in P"]
![Page 14: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/14.jpg)
Inside a static checker for sequential programs
int x, y, z;

void foo() {
    if (x > y) { y = x; }
    if (y > z) { z = y; }
    assert(x <= z);
}
• Symbolically analyze all paths
• Check the assertion for each path
• Interprocedural analysis
  – e.g., PREfix, ESP, SLAM, BLAST
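To see what "check the assertion for each path" means for `foo`, the following sketch replaces symbolic analysis with brute force over a small box of integer inputs. It is a bounded stand-in under stated assumptions, not how PREfix, ESP, SLAM, or BLAST work; the range [-3, 3] and the helper names are illustrative.

```python
from itertools import product


def foo(x: int, y: int, z: int) -> bool:
    """Mirror of foo() from the slide; returns whether the assertion holds."""
    if x > y:
        y = x
    if y > z:
        z = y
    return x <= z


def violations(lo: int = -3, hi: int = 3) -> list:
    """Return every (x, y, z) in the box [lo, hi]^3 that breaks the assertion."""
    values = range(lo, hi + 1)
    return [(x, y, z) for x, y, z in product(values, repeat=3)
            if not foo(x, y, z)]
```

All four branch combinations are exercised by the inputs in the box. A real checker reaches the same verdict by reasoning about each path once, without enumerating values.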
![Page 15: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/15.jpg)
KISS strategy
• Q encodes executions of P with a small number of context switches
  – instrumentation introduces many extra paths to mimic context switches
• Leverage all-path analysis of sequential checkers
[Diagram: Concurrent program P → KISS → Sequential program Q → SDV]
![Page 16: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/16.jpg)
PnpStop() {
    int t;
    de->stopping = T;
    t = AtomicDecr(&de->count);
    if (t == 0) SetEvent(&de->stopEvent);
    WaitEvent(&de->stopEvent);
}

DispatchRoutine() {
    int t;
    if (!de->stopping) {
        AtomicIncr(&de->count);
        // do useful work
        // …
        t = AtomicDecr(&de->count);
        if (t == 0) SetEvent(&de->stopEvent);
    }
}
![Page 17: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/17.jpg)
DispatchRoutine() {
    int t;
    if (!de->stopping) {
        AtomicIncr(&de->count);
        // do useful work
        // …
        t = AtomicDecr(&de->count);
        if (t == 0) SetEvent(&de->stopEvent);
    }
}

// $ denotes a nondeterministic choice
PnpStop() {
    int t;
    if ($) return;
    de->stopping = T;
    if ($) return;
    t = AtomicDecr(&de->count);
    if ($) return;
    if (t == 0) SetEvent(&de->stopEvent);
    if ($) return;
    WaitEvent(&de->stopEvent);
}
![Page 18: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/18.jpg)
bool done = F;

CODE stands for:
    if (!done) { if ($) { done = T; PnpStop(); } }

PnpStop() {
    int t;
    if ($) return;
    de->stopping = T;
    if ($) return;
    t = AtomicDecr(&de->count);
    if ($) return;
    if (t == 0) SetEvent(&de->stopEvent);
    if ($) return;
    WaitEvent(&de->stopEvent);
}

DispatchRoutine() {
    int t;
    CODE;
    if (!de->stopping) {
        CODE;
        AtomicIncr(&de->count);
        // do useful work
        // …
        CODE;
        t = AtomicDecr(&de->count);
        CODE;
        if (t == 0) SetEvent(&de->stopEvent);
    }
}
![Page 19: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/19.jpg)
bool done = F;

CODE stands for:
    if (!done) { if ($) { done = T; PnpStop(); } }

PnpStop() {
    int t;
    if ($) return;
    de->stopping = T;
    if ($) return;
    t = AtomicDecr(&de->count);
    if ($) return;
    if (t == 0) SetEvent(&de->stopEvent);
    if ($) return;
    WaitEvent(&de->stopEvent);
}

DispatchRoutine() {
    int t;
    CODE;
    if (!de->stopping) {
        CODE;
        AtomicIncr(&de->count);
        // do useful work
        // …
        CODE;
        t = AtomicDecr(&de->count);
        CODE;
        if (t == 0) SetEvent(&de->stopEvent);
    }
}

main() { DispatchRoutine(); }
![Page 20: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/20.jpg)
bool done = F;

CODE stands for:
    if (!done) { if ($) { done = T; DispatchRoutine(); } }

PnpStop() {
    int t;
    CODE;
    de->stopping = T;
    CODE;
    t = AtomicDecr(&de->count);
    CODE;
    if (t == 0) SetEvent(&de->stopEvent);
    CODE;
    WaitEvent(&de->stopEvent);
}

DispatchRoutine() {
    int t;
    if ($) return;
    if (!de->stopping) {
        if ($) return;
        AtomicIncr(&de->count);
        // do useful work
        // …
        if ($) return;
        t = AtomicDecr(&de->count);
        if ($) return;
        if (t == 0) SetEvent(&de->stopEvent);
    }
}

main() { PnpStop(); }
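Once every `$` is resolved, the instrumented program is an ordinary sequential program. The sketch below simulates the variant whose entry point is `DispatchRoutine` and enumerates all resolutions of `$`. Several details are assumptions added for illustration: `count` starts at 1, `AtomicDecr` returns the new value, a `WaitEvent` on an unset event abandons the path, and the `!driverStopped` assertion is borrowed from the Bluetooth example elsewhere in the talk. The names `run_q` and `find_violation` are hypothetical.

```python
from itertools import product


def run_q(choices):
    """Run the sequential program Q once; `choices` fixes each `$` in order.
    Returns True if the assertion `!driverStopped` is violated."""
    picks = iter(choices)

    def nondet():
        # Plays the role of `$`; defaults to False once choices run out.
        return next(picks, False)

    de = {"stopping": False, "count": 1, "event": False}  # count = 1 assumed
    flags = {"done": False, "driver_stopped": False}

    def pnp_stop():
        if nondet(): return
        de["stopping"] = True
        if nondet(): return
        de["count"] -= 1
        t = de["count"]              # AtomicDecr assumed to return new value
        if nondet(): return
        if t == 0:
            de["event"] = True       # SetEvent
        if nondet(): return
        if not de["event"]:          # WaitEvent would block: abandon the path
            return
        flags["driver_stopped"] = True

    def code():
        # The CODE macro: schedule PnpStop at most once.
        if not flags["done"] and nondet():
            flags["done"] = True
            pnp_stop()

    # DispatchRoutine with CODE inserted before each statement.
    code()
    if not de["stopping"]:
        code()
        de["count"] += 1
        if flags["driver_stopped"]:
            return True              # assert(!driverStopped) fails
        code()
        de["count"] -= 1
        t = de["count"]
        code()
        if t == 0:
            de["event"] = True
    return False


def find_violation(max_choices=8):
    """Try every resolution of `$`; at most 8 are consumed on any path."""
    for choices in product([False, True], repeat=max_choices):
        if run_q(choices):
            return choices
    return None
```

A sequential checker explores these resolutions symbolically as extra paths; the loop in `find_violation` simply tries them one by one.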
![Page 21: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/21.jpg)
KISS features
• KISS trades off soundness for scalability
• Cost of analyzing a concurrent program P = cost of analyzing a sequential program Q
  – Size of Q is asymptotically the same as size of P
• Unsoundness is precisely quantifiable
  – for a 2-thread program, explores all executions with up to two context switches
  – for an n-thread program, explores up to 2n-2 context switches
• Allows any sequential checker to analyze concurrency
![Page 22: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/22.jpg)
![Page 23: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/23.jpg)
Experimental Evaluation of KISS
![Page 24: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/24.jpg)
Driver Stopping Error in Bluetooth Driver (1 KLOC)
DispatchRoutine() {
    int t;
    if (!de->stopping) {
        AtomicIncr(&de->count);
        assert(!driverStopped);
        // do useful work
        // …
        t = AtomicDecr(&de->count);
        if (t == 0) SetEvent(&de->stopEvent);
    }
}

PnpStop() {
    int t;
    de->stopping = T;
    t = AtomicDecr(&de->count);
    if (t == 0) SetEvent(&de->stopEvent);
    WaitEvent(&de->stopEvent);
    driverStopped = T;
}
![Page 25: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/25.jpg)
// Thread 1: DispatchRoutine
int t;
if (!de->stopping) {

    // Context switch: Thread 2 runs PnpStop
    int t;
    de->stopping = T;
    t = AtomicDecr(&de->count);
    if (t == 0) SetEvent(&de->stopEvent);
    WaitEvent(&de->stopEvent);
    driverStopped = T;

    // Context switch: Thread 1 resumes
    AtomicIncr(&de->count);
    assert(!driverStopped);
    // do useful work
    // …
    t = AtomicDecr(&de->count);
    if (t == 0) SetEvent(&de->stopEvent);
}
Assertion fails!
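The failing interleaving uses three contexts (two context switches). A small explicit-state search makes that bound checkable. The sketch below models each routine as a sequence of atomic steps and explores every schedule with at most `max_contexts` contexts. It assumes `count` starts at 1 and that `AtomicDecr` returns the new value; the step numbering and the names `step` and `can_fail` are illustrative, not part of KISS.

```python
DISPATCH, PNPSTOP = 0, 1
END = -1


def step(state, tid):
    """Run one atomic step of thread `tid`.
    Returns ("ok", new_state), ("stuck", None) or ("fail", None)."""
    stopping, count, event, stopped, pcs, ts = state
    pc, t = pcs[tid], ts[tid]
    if pc == END:
        return "stuck", None                 # thread already finished
    if tid == DISPATCH:
        if pc == 0:                          # if (!de->stopping)
            pc = 1 if not stopping else END
        elif pc == 1:                        # AtomicIncr(&de->count)
            count, pc = count + 1, 2
        elif pc == 2:                        # assert(!driverStopped)
            if stopped:
                return "fail", None
            pc = 3
        elif pc == 3:                        # t = AtomicDecr(&de->count)
            count -= 1
            t, pc = count, 4
        else:                                # if (t == 0) SetEvent(...)
            event = event or t == 0
            pc = END
    else:
        if pc == 0:                          # de->stopping = T
            stopping, pc = True, 1
        elif pc == 1:                        # t = AtomicDecr(&de->count)
            count -= 1
            t, pc = count, 2
        elif pc == 2:                        # if (t == 0) SetEvent(...)
            event = event or t == 0
            pc = 3
        elif pc == 3:                        # WaitEvent blocks until set
            if not event:
                return "stuck", None
            pc = 4
        else:                                # driverStopped = T
            stopped, pc = True, END
    pcs = tuple(pc if i == tid else p for i, p in enumerate(pcs))
    ts = tuple(t if i == tid else v for i, v in enumerate(ts))
    return "ok", (stopping, count, event, stopped, pcs, ts)


def can_fail(max_contexts):
    """True if some schedule with at most `max_contexts` contexts (>= 1)
    violates the assertion."""
    init = (False, 1, False, False, (0, 0), (0, 0))   # count = 1 assumed
    stack = [(init, tid, 1) for tid in (DISPATCH, PNPSTOP)]
    seen = set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        state, tid, used = node
        status, nxt = step(state, tid)
        if status == "fail":
            return True
        if status == "ok":
            stack.append((nxt, tid, used))              # stay in this context
            if used < max_contexts:
                stack.append((nxt, 1 - tid, used + 1))  # context switch
    return False
```

Under these assumptions `can_fail(2)` is false and `can_fail(3)` is true, which matches the trace on the slide: DispatchRoutine, then PnpStop, then DispatchRoutine again.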
![Page 26: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/26.jpg)
IRP Cancellation Error in Packet Driver (2.5 KLOC)

DispatchRoutine(IRP *irp) {
    …
    irp->CancelRoutine = PacketCancelRoutine;
    Enqueue(irp);
    IoMarkIrpPending(irp);
    …
}

IoCancelIrp(IRP *irp) {
    IoAcquireCancelSpinLock();
    if (irp->CancelRoutine) {
        (irp->CancelRoutine)(irp);
    }
    …
}

PacketCancelRoutine(IRP *irp) {
    …
    Dequeue(irp);
    IoCompleteRequest(irp);
    IoReleaseCancelSpinLock();
    …
}
![Page 27: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/27.jpg)
// Thread 1: DispatchRoutine
…
irp->CancelRoutine = PacketCancelRoutine;
Enqueue(irp);

    // Context switch: Thread 2 runs IoCancelIrp
    IoAcquireCancelSpinLock();
    if (irp->CancelRoutine) {
        // inline PacketCancelRoutine(irp)
        …
        Dequeue(irp);
        IoCompleteRequest(irp);
        IoReleaseCancelSpinLock();

// Context switch: Thread 1 resumes
IoMarkIrpPending(irp);

Error: an IRP should not be marked pending after it has been completed!
![Page 28: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/28.jpg)
Data-race Conditions in DDK Sample Drivers
• Device extension shared among threads
• Data races on device extension fields
• 18 sample DDK drivers
  – Range 0.5-9.2 KLOC
  – Total 70 KLOC
• Each field checked separately with a resource limit of 20 minutes and 800 MB
• Two threads: each calls a nondeterministically chosen dispatch routine
![Page 29: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/29.jpg)
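| Driver | KLOC | # Fields | # Races |
|---|---|---|---|
| Tracedrv | 0.5 | 3 | 0 |
| Moufiltr | 1.0 | 14 | 0 |
| Kbfiltr | 1.1 | 15 | 0 |
| Imca | 1.1 | 5 | 1 |
| Startio | 1.1 | 9 | 0 |
| Toaster/toastmon | 1.4 | 8 | 1 |
| Diskperf | 2.4 | 16 | 0 |
| 1394diag | 2.7 | 18 | 0 |
| 1394vdev | 2.8 | 18 | 1 |
| Fakemodem | 2.9 | 39 | 6 |
| Toaster/bus | 5.0 | 30 | 0 |
| Serenum | 5.9 | 41 | 2 |
| Toaster/func | 6.6 | 24 | 5 |
| Mouclass | 7.0 | 34 | 1 |
| Kbdclass | 7.4 | 36 | 1 |
| Mouser | 7.6 | 34 | 1 |
| Fdc | 9.2 | 92 | 9 |

Total: 30 races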
![Page 30: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/30.jpg)
DevicePnPState Field in Toaster/toastmon

ToastMon_DispatchPnp(DEVICE_OBJECT *obj, IRP *irp)
{
    …
    IoAcquireRemoveLock();
    …
    case IRP_MN_QUERY_STOP_DEVICE:
        // Race: write access
        deviceExt->DevicePnPState = StopPending;
        …
        break;
    …
    IoReleaseRemoveLock();
    …
}

ToastMon_DispatchPower(DEVICE_OBJECT *obj, IRP *irp)
{
    …
    // Race: read access
    if (deviceExt->DevicePnPState == Deleted) {
        …
    }
    …
}
![Page 31: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/31.jpg)
Acknowledgments
• Tom Ball
• Byron Cook
• John Henry
• Doron Holan
• Vladimir Levin
• Jakob Lichtenberg
• Adrian Oney
• Sriram Rajamani
• Peter Wieland
• …
![Page 32: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/32.jpg)
Keep It Simple and Sequential
• Context-bounded analysis by leveraging existing sequential checkers
• Validates the hypothesis that many concurrency errors require few context switches to show up
![Page 33: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/33.jpg)
However…
• Hard limit on number of explored contexts
  – e.g., two context switches for a concurrent program with two threads
• Case study: concurrent transaction management code written in C# (Naik-Rehof 04)
  – Analyzed by the Zing model checker after automatically translating to the Zing input language
  – Found three bugs, each requiring between three and four context switches
![Page 34: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/34.jpg)
Is a tuning knob possible?
• Given a concurrent boolean program P and a positive integer c, does P go wrong by failing an assertion via an execution with at most c contexts?
  – Decidable
• Given a concurrent boolean program P, does P go wrong by failing an assertion?
  – Undecidable
• Given a concurrent boolean program P with unbounded fork-join parallelism and a positive integer c, does P go wrong by failing an assertion via an execution with at most c contexts?
  – Decidable
![Page 35: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/35.jpg)
[Figure: an execution timeline split into three contexts separated by two context switches]
Problem:
• Unbounded computation possible within each context!
• Unbounded execution depth and reachable state space
• Different from bounded-depth model checking
![Page 36: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/36.jpg)
Sequential pushdown system
• Global store g: valuation to global variables
• Local store l: valuation to local variables
• Stack s: sequence of local stores
• State: (g, s)
Transition relation: (g, s) → (g', s')
![Page 37: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/37.jpg)
Reachability problem for sequential pushdown system
Given (g, s), is there s' such that (g, s) →* (error, s')?
![Page 38: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/38.jpg)
Aggregate state
• Set of stacks: ss
• Aggregate state: (g, ss) = { (g, s) | s ∈ ss }
• Reach(g, ss) = { (g', s') | there exists s ∈ ss such that (g, s) →* (g', s') }
• Reach(g, ss, g') = { s' | (g', s') ∈ Reach(g, ss) }
![Page 39: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/39.jpg)
Theorem (Büchi, Schwoon 00)
• If ss is regular, then Reach(g, ss, g’) is regular.
• If ss is given as a finite automaton A, then a finite automaton A’ for Reach(g, ss, g’) can be constructed from A in polynomial time.
![Page 40: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/40.jpg)
Algorithm
Problem: Given (g, s), is there s' such that (g, s) →* (error, s')?
Solution: Compute an automaton for Reach(g, {s}, error) and report an error if its language is nonempty.
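Regular sets of stacks can be handled with a short saturation loop. The sketch below is a swapped-in variant, not the construction on the slide: instead of the forward set Reach(g, {s}, error), it computes the backward set of configurations that can reach an error configuration (the classic pre* saturation), which answers the same yes/no question. The rule encoding, the example program, and all names (`read`, `pre_star`, `RULES`, `can_reach_error`) are assumptions made for illustration. Stacks are written top first.

```python
def read(trans, start, word):
    """States reachable from `start` by reading `word` in the automaton."""
    current = {start}
    for symbol in word:
        current = {q2 for (q1, a, q2) in trans
                   if q1 in current and a == symbol}
    return current


def pre_star(rules, trans):
    """Saturate `trans` until it accepts every configuration that can reach
    a configuration accepted by the input automaton."""
    trans = set(trans)
    changed = True
    while changed:
        changed = False
        # Rule ((g, top), (g2, pushed)): in global store g with `top` on the
        # stack, move to g2 and replace `top` by the symbols in `pushed`.
        for (g, top), (g2, pushed) in rules:
            for q in read(trans, g2, pushed):
                if (g, top, q) not in trans:
                    trans.add((g, top, q))
                    changed = True
    return trans


# Hypothetical program: main (m0) calls a recursive f, then checks the
# global at m1.
RULES = [
    (("g0", "m0"), ("g0", ("f0", "m1"))),   # main calls f, return address m1
    (("g0", "f0"), ("g0", ("f0", "f1"))),   # f calls itself: unbounded stack
    (("g0", "f0"), ("g1", ())),             # f sets the global and returns
    (("g1", "f1"), ("g1", ())),             # a recursive call returns
    (("g1", "m1"), ("error", ("m1",))),     # the assertion at m1 fails
]
SYMBOLS = ["m0", "m1", "f0", "f1"]
# Automaton accepting (error, s') for every nonempty stack s'.
TARGET = ({("error", a, "F") for a in SYMBOLS}
          | {("F", a, "F") for a in SYMBOLS})
SATURATED = pre_star(RULES, TARGET)


def can_reach_error(g, stack):
    """Is there s' such that (g, stack) ->* (error, s')?"""
    return "F" in read(SATURATED, g, stack)
```

For instance, `can_reach_error("g0", ["m0"])` asks the question from the slide for the initial configuration of the hypothetical program. The loop terminates because it only ever adds transitions over a fixed, finite set of states and symbols.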
![Page 41: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/41.jpg)
Concurrent pushdown system
• Global store g: valuation to global variables
• Local store l: valuation to local variables
• Stack s: sequence of local stores
• State: (g, s1, s2)
Transition relation:
• If (g, s1) → (g', s'1) in thread 1, then (g, s1, s2) →1 (g', s'1, s2)
• If (g, s2) → (g', s'2) in thread 2, then (g, s1, s2) →2 (g', s1, s'2)
![Page 42: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/42.jpg)
Reachability problem for concurrent pushdown system

Given (g, s1, s2), are there s'1 and s'2 such that (g, s1, s2) reaches (error, s'1, s'2) via an execution with at most c contexts?
![Page 43: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/43.jpg)
Aggregate transition relation
• If ss'1 = Reach1(g, ss1, g'), then (g, ss1, ss2) →1 (g', ss'1, ss2)
• If ss'2 = Reach2(g, ss2, g'), then (g, ss1, ss2) →2 (g', ss1, ss'2)
![Page 44: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/44.jpg)
Algorithm: 2 threads, c contexts
[Figure: search tree rooted at (g, {s1}, {s2}), with edges labeled 1 and 2 at each level; depth c]
Compute the set of reachable aggregate states. Report an error if (g, ss1, ss2) is reachable and g = error, ss1 is nonempty, and ss2 is nonempty.
![Page 45: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/45.jpg)
Complexity: 2 threads, c contexts
[Figure: search tree rooted at (g, {s1}, {s2}), with edges labeled 1 and 2 at each level; depth c]
• Depth of tree = context bound c
• Branching factor bounded by G × 2 (G = # of global stores)
• Number of edges bounded by (G × 2)^(c+1)
• Each edge computable in polynomial time
![Page 46: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/46.jpg)
Unbounded fork-join parallelism
• Fork operation: x = fork
• Join operation: join(x)
• Copy thread identifier from one variable to another
![Page 47: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/47.jpg)
Algorithm: unbounded fork-join parallelism, c contexts
• At most c threads may perform a transition
• Reduce to previously solved problem with c threads and c contexts
  – Nondeterministically pick c forked threads for execution
![Page 48: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/48.jpg)
• c statically created threads
• thread i starts execution when start[i] is true
• thread i sets end[i] to true on termination

count : {1, …, c}, initialized to 1
start : {1, …, c} → boolean, initialized to λi. (i == 1)
end   : {1, …, c} → boolean, initialized to λi. false

x = fork translates to

    if ($) {
        assume(count < c);
        count = count + 1;
        x = count;
        start[count] = true;
    } else {
        x = c + 1;
    }

join(x) translates to

    assume(x ≤ c);
    assume(end[x]);
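The translation of fork and join can be exercised directly. The sketch below keeps the `count`, `start`, and `end` bookkeeping from the slide and models `assume` as an exception that discards the current path. The class name `ForkJoin`, the `Infeasible` exception, the `pick` argument standing in for `$`, and the `terminate` helper are assumptions for illustration.

```python
class Infeasible(Exception):
    """Raised when an assume() fails, i.e. the current path is discarded."""


def assume(cond):
    if not cond:
        raise Infeasible


class ForkJoin:
    def __init__(self, c):
        self.c = c
        self.count = 1                                  # thread 1 is running
        self.start = {i: i == 1 for i in range(1, c + 1)}
        self.end = {i: False for i in range(1, c + 1)}

    def fork(self, pick):
        """Translation of `x = fork`; `pick` resolves the `$` choice."""
        if pick:
            assume(self.count < self.c)
            self.count += 1
            x = self.count
            self.start[self.count] = True
        else:
            x = self.c + 1        # identifier of a thread that never runs
        return x

    def join(self, x):
        """Translation of `join(x)`: only tracked, finished threads pass."""
        assume(x <= self.c)
        assume(self.end[x])

    def terminate(self, i):
        """Thread i sets end[i] on termination."""
        self.end[i] = True
```

With `c = 3`, two picked forks use up the statically created threads, so any further picked fork is an infeasible path, and a join on the identifier `c + 1` never succeeds.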
![Page 49: Context-bounded model checking of concurrent software](https://reader035.vdocuments.mx/reader035/viewer/2022062409/568145b7550346895db2be25/html5/thumbnails/49.jpg)
Context-bounded analysis of concurrent software
• Many subtle concurrency errors are manifested in executions with few context switches
  – Experience with KISS on Windows drivers
  – Experience with Zing on transaction manager
• Algorithms for context-bounded analysis are more efficient than those for unbounded analysis
  – Reducibility to sequential checking with KISS
  – Decidability of assertion checking for concurrent boolean programs