on sequentializing concurrent programs (bounded model checking) gennaro parlato university of...
TRANSCRIPT
On Sequentializing Concurrent Programs
(Bounded Model Checking)
Gennaro ParlatoUniversity of Southampton, UK
UPMARC 7th Summer School on Multicore Computing, June 8-10, 2015
Concurrent Programs - Reachability Problem
concurrent C programs• POSIX threads• SC memory model
reachability• assertion failure• out-of-bound array• division-by-zero, …
bounded model checking (BMC)• bug-finding, not complete analysis
SHARED MEMORY
…
T1 T2TN
THREADS
BMC approach – Sequential Programs
Efficient tools for C• BLITZ [ Cho, D'Silva, Song – ASE’13 ]• CBMC [ Clarke, Kroening, Lerda – TACAS’04 ] • LLBMC [ Falke, Merz, Sinz – ASE’13 ]• ESBMC [ Cordeiro, Fischer, Marques-Silva – ASE’09 ]
PROGRAMBOUNDEDPROGRAM
SAT/SMTFORMULA
SOLVER
inliningunrollingSSA form
direct SAT/SMT approach• encode each thread as in the sequential case • add a conjunct for shared memory operations• all possible interleavings in the bounded program
φthreads ∧ φconcurrency
papers:• [ Sinha, Wang – POPL’11 ]• [ Alglave, Kroening, Tautschnig – CAV’13 ]
BMC approach - Concurrent C Programs
CONCPROGRAM
BOUNDEDPROGRAM
SAT/SMTFORMULA
SOLVER
concurrencyhandling
SE
QU
EN
TIA
LIZ
AT
ION
(cod
e-to
-cod
e tr
ansl
atio
n)
Sequentialization
papers
• proposal [ Qadeer, Wu – PLDI’04 ]• eager, bounded context-switch, finite # threads [ Lal, Reps – CAV’08 ]• lazy, finite # threads, parameterized [La Torre, Madhusudan, Parlato – CAV’09, CAV’10]• thread creation [Bouajjani, Emmi, Parlato – SAS’11] [Emmi, Qadeer, Rakamaric – POPL’11]• Lal/Reps for real-time systems [Chaki, Gurfinkel, Strichman – FMCAD’11]• message-passing programs [Bouajjani, Emmi -- TACAS’12]
CONCPROGRAM
BOUNDEDPROGRAM
SAT/SMTFORMULA
SOLVER
SEQPROGRAM
BMCSEQ TOOL
SE
QU
EN
TIA
LIZ
AT
ION
(cod
e-to
-cod
e tr
ansl
atio
n)
Sequentialization
BMC-based tools (Implementations of variants of Lal/Reps schema)
• Corral (microsoft research) [ Lal, Qadeer, Lahiri – CAV’12, FSE’14 ]
• Rek [ Chaki, Gurfinkel, Strichman – FMCAD’11 ]
• STORM [ Lahiri,Qadeer,Rakamaric CAV’09 ]
CONCPROGRAM
BOUNDEDPROGRAM
SAT/SMTFORMULA
SOLVER
SEQPROGRAM
BMCSEQ TOOL
Sequentialization
We have designed new sequentializations targeting BMC scalable analyses + surprisingly simple
•Lazy-CSeq•Memory Unwinding
CONCPROGRAM
BOUNDEDPROGRAM
SAT/SMTFORMULA
SOLVER
SEQPROGRAM
SE
QU
EN
TIA
LIZ
AT
ION
(cod
e-to
-cod
e tr
ansl
atio
n)
BMCSEQ TOOL
Lazy-CSeq: Schema Overview(new sequentialization for BMC)
[ Inverso–Tomasco–Fischer–La Torre–Parlato, CAV’14 ]
Lazy-CSeq Approach
BOUNDEDPROGRAM
BMC SEQUENTIAL
TOOL
SEQPROGRAM
SE
QU
EN
TIA
LIZ
AT
ION
(cod
e-to
-cod
e tr
ansl
atio
n)
CONCPROGRAM
Bounded Concurrent Programs
main()
T0 TNTN-1T1 …
• no loops• no function calls• Control flow only forward• one procedure for each thread
Round Robin Schedule
main()
T0 TNTN-1T1
…
Lazy-Cseq sequentialization:•captures all bounded Round-Robin computations for a given bound•error manifest themselves within very few rounds [ Musuvathi, Qadeer – PLDI’07 ]
round 1
round 2
round k
round 3
Schema Overview
…main()T0 T1
TN
…F0 F1
FN main()
bounded concurrent program
“equivalent”
Sequential program with non determinism
Sequentialization(code-to-code translation)
Sequentialized functions Driver
translatestranslates
…
translatestranslates
translatestranslates
Naïve Lazy Sequentialization: CROSS PRODUCT SIMULATION
pc0=0; ... pcN=0;local0; ... localk;
main() { for (r=0; r<K; r++) for (i=0; i<N; i++) // simulate Ti
if (activei)
Fi();}
main driver• a global pc for each thread • thread locals thread global
Naïve Lazy Sequentialization: CROSS PRODUCT SIMULATION
pc0=0; ... pcN=0;local0; ... localk;
main() { for (r=0; r<K; r++) for (i=0; i<N; i++) // simulate Ti
if (activei)
Fi();}
main driverfor each round
for each thread Ti
simulate Ti
Naïve Lazy Sequentialization: CROSS PRODUCT SIMULATION
pc0=0; ... pcN=0;local0; ... localk;
main() { for (r=0; r<K; r++) for (i=0; i<N; i++) // simulate Ti
if (activei)
Fi();}
switch(pck) { case 0: goto 0; case 1: goto 1; case 2: goto 2;
... case M: goto M;}
1: CS(0); stmt0;2: CS(1); stmt1;3: CS(2); stmt2; . . . E XE . . .M: CS(M); stmtM;
main driver
Fi()
Naïve Lazy Sequentialization: CROSS PRODUCT SIMULATION
pc0=0; ... pcN=0;local0; ... localk;
main() { for (r=0; r<K; r++) for (i=0; i<N; i++) // simulate Ti
if (activei)
Fi();}
switch(pci) { case 0: goto 0; case 1: goto 1; case 2: goto 2;
... case M: goto M;}
1: CS(0); stmt0;2: CS(1); stmt1;3: CS(2); stmt2; . . . E XE . . .M: CS(M); stmtM;
main driver
Fi()
... ...
context-switch resum
e m
echanism
Naïve Lazy Sequentialization: CROSS PRODUCT SIMULATION
pc0=0; ... pcN=0;local0; ... localk;
main() { for (r=0; r<K; r++) for (i=0; i<N; i++) // simulate Ti
if (activei)
Fi();}
switch(pck) { case 0: goto 0; case 1: goto 1; case 2: goto 2;
... case M: goto M;}
1: CS(1); stmt0;2: CS(2); stmt1;3: CS(3); stmt2; . . . E XE . . .M: CS(M); stmtM;
main driver
Fi() ...
... ...
Context-switch simulation: #define CS(j) if (*) { pci=j; return; }
Context-switch simulation: #define CS(j) if (*) { pci=j; return; }
Naïve Lazy Sequentialization: CROSS PRODUCT SIMULATION
pc0=0; ... pcN=0;local0; ... localk;
main() { for (r=0; r<K; r++) for (i=0; i<N; i++) // simulate Ti
if (activei)
Fi();}
switch(pck) { case 0: goto 0; case 1: goto 1; case 2: goto 2;
... case M: goto M;}
1: CS(1); stmt0;2: CS(2); stmt1;3: CS(3); stmt2; . . . E XE . . .M: CS(M); stmtM;
main driver
Fi() ...
... ...
Naïve Lazy Sequentialization: CROSS PRODUCT SIMULATION
pc0=0; pc1=0; ... pcN=0;local0; local1; ... localk;
main() { for (r=0; r<R; r++) for (k=0; k<N; k++) // simulate Tk
Fk();}
switch(pck) { case 0: goto 0; case 1: goto 1; case 2: goto 2;
... case M: goto M;}
1: CS(0); stmt0;2: CS(1); stmt1;3: CS(2); stmt2; . . . E XE . . .M: CS(M); stmtM;
main driver
...
... ...
Formula encoding:
goto statement to formula
add a guard for each crossing control-flow edge
= O(M2) guards
Formula encoding:
goto statement to formula
add a guard for each crossing control-flow edge
= O(M2) guards
pc0=0; ... pcN=0;local0; ... localk;
main() { for (r=0; r<K; r++) for (i=0; i<N; i++) // simulate Ti
if (activei) Fi();}
switch(pck) { case 0: goto 0; case 1: goto 1; case 2: goto 2;
... case M: goto M;}
1: CS(0); stmt0;2: CS(1); stmt1;3: CS(2); stmt2; . . . E XE . . .M: CS(M); stmtM;
main driver
Fi() ...
... ...
CSeq-Lazy Sequentialization: CROSS PRODUCT SIMULATION
pc0=0; ... pcN=0;local0; ... localk;
main() { for (r=0; r<K; r++) for (i=0; i<N; i++) // simulate Ti
if (activei) Fi();}
switch(pck) { case 0: goto 0; case 1: goto 1; case 2: goto 2;
... case M: goto M;}
1: CS(0); stmt0;2: CS(1); stmt1;3: CS(2); stmt2; . . . E XE . . .M: CS(M); stmtM;
main driver
Fi() ...
... ...
CSeq-Lazy Sequentialization: CROSS PRODUCT SIMULATION
pc0=0; ... pcN=0; local0; ... localk;
main() { for (r=0; r<K; r++) for (i=0; i<N; i++) // simulate Ti
if (activei) Fi();}
switch(pck) { case 0: goto 0; case 1: goto 1; case 2: goto 2;
... case M: goto M;}
1: CS(0); stmt0;2: CS(1); stmt1;3: CS(2); stmt2; . . . E XE . . .M: CS(M); stmtM;
main driver
Fi() ...
CSeq-Lazy Sequentialization: CROSS PRODUCT SIMULATION
switch(pck) { case 0: goto 0; case 1: goto 1; case 2: goto 2;
... case M: goto M;}
1: CS(1); stmt0;2: CS(2); stmt1;3: CS(3); stmt2; . . . E XE . . .M: CS(M); stmtM;
main driver
Fi() ...
CSeq-Lazy Sequentialization: CROSS PRODUCT SIMULATION
#define CS(j) NEW if (j<pci || j>=nextCS) goto j+1; #define CS(j) NEW if (j<pci || j>=nextCS) goto j+1;
pc0=0; ... pcN=0; local0; ... localk;nextCS;main() for (r=0; r<K; r++) for (i=0; i<N; i++) // simulate Ti
if (activei) nextCS = nondet; assume(nextCS>=pci) Fi(); pci = nextCS;
switch(pck) { case 0: goto 0; case 1: goto 1; case 2: goto 2;
... case M: goto M;}
1: CS(1); stmt0;2: CS(2); stmt1;3: CS(3); stmt2; . . . E XE . . .M: CS(M); stmtM;
main driver
Fi()
CSeq-Lazy Sequentialization: CROSS PRODUCT SIMULATION
#define CS(j) NEW if (j<pci || j>=nextCS) goto j+1; #define CS(j) NEW if (j<pci || j>=nextCS) goto j+1;
pc0=0; ... pcN=0; local0; ... localk;nextCS;main() for (r=0; r<K; r++) for (i=0; i<N; i++) // simulate Ti
if (activei) nextCS = nondet; assume(nextCS>=pci) Fi(); pci = nextCS;
switch(pck) { case 0: goto 0; case 1: goto 1; case 2: goto 2;
... case M: goto M;}
1: CS(0); stmt0;2: CS(1); stmt1;3: CS(2); stmt2; . . . E XE . . .M: CS(M); stmtM;
main driver
Fi()
CSeq-Lazy Sequentialization: CROSS PRODUCT SIMULATION
#define CS(j) NEW if (j<pci || j>=nextCS) goto j+1; #define CS(j) NEW if (j<pci || j>=nextCS) goto j+1;
...
pc0=0; ... pcN=0; local0; ... localk;nextCS;main() for (r=0; r<K; r++) for (i=0; i<N; i++) // simulate Ti
if (activei) nextCS = nondet; assume(nextCS>=pci) Fi(); pci = nextCS;
resuming + context-switch
1: CS(1); stmt1;2: CS(2); stmt2;3: CS(3); stmt3;
EXECUTE
M: CS(M); stmtM;
nextCS
...
skip
pci
...
skip
#define CS(j) NEW if (j<pci || j>=nextCS) goto j+1; #define CS(j) NEW if (j<pci || j>=nextCS) goto j+1;
CSeq-Lazy Sequentialization: CROSS PRODUCT SIMULATION
Individual Threads Sequentialization
1: CS(1); stmt1;2: CS(2); stmt2;3: CS(3); stmt3;
EXECUTE
M: CS(M); stmtM;
...
...
Formula encoding:
goto statement to formula
add a guard for each crossing control-flow edge
= O(M) guards
Formula encoding:
goto statement to formula
add a guard for each crossing control-flow edge
= O(M) guards
Individual Threads Sequentialization
1: CS(1); stmt1;2: CS(2); stmt2;3: CS(3); stmt3;
EXECUTE
M: CS(M); stmtM;
...
...
#define CS(j) if (j<pci || j>=next_CS) goto pc+1; #define CS(j) if (j<pci || j>=next_CS) goto pc+1;
inject light-weight, non-invasive control code
•no non-determinism•no assignments•no return
inject light-weight, non-invasive control code
•no non-determinism•no assignments•no return
Experiments
Evaluation: bug-hunting SVCOMP’14, Concurrency (UNSAFE instances)
Evaluation: bug-hunting SVCOMP’14, Concurrency (UNSAFE instances)
Remarks
Lazy-CSeq is a new sequentialization targeted to BMC backends• lazy• based on bounded round-robin computations• efficient for bug-hunting• simple to implement (CSeq framework)
Eager vs Lazy• no empirical evidence that lazy is faster • Lazy does not require an implementation of a memory model and handling of error
checks (as opposed to LR sequentialization)
Lazy-CSeq won 2 gold medals in the Concurrency category of the last two editions of the software verification competition SV-COMP
• all verification tasks solved• 30x faster than the best tool with native
concurrency handling
Sequentialization based on
Memory Unwinding
[ Tomasco–Inverso–Fischer–La Torre–Parlato, TACAS’15 ]
T1 T2 T3
Interleaving semantics
rxwx rx wy rxwxrywy ryrx wywxrun
T1 T2 T3
From interleavings to memory unwinding
rxwx rx wy rxwxrywyrun ryrx wy
accesses to shared memory
writes
rxwx rx wy rxwxrywy ryrx wy
wx
⇒ Memory unwinding models thread interaction history by data
MEMORYUNWINDINGwx wy wxwy wywx wy wxwy wy
Memory unwindings as thread interfaces
T1 T2
Assume-guarantee reasoning style decomposition of verification
TASK 1 TASK 2
T1
bounding parameter: # of shared write operations-most concurrency bugs exposed by few interactions [Lu, Park, Seo, Zhou – ASPLOS 2008]
Assume-guarantee reasoning
T1 T2
TASK 1 TASK 2
Thread T2
assumes: MU writes of other threads
guarantees: its own MU writes
Task 1: Task 2:
Thread T1
assumes: MU writes of other threads
guarantees: its own MU writes
Simulating a thread against an MU
y=1 …. a=x+5 ..... x=1 ….
x=2y=1y=1
x=1
x=3y=8
z=2
pcmc
Global variable write – check against current MU entry
void write(uint t, uint v, int val) { mc[t] = th_nxt_wr[t][mc[t]]; assume(var[mc[t]] == v); assume(val[mc[t]] == val);}
pc
void write(uint t, uint v, int val) { mc[t] = th_nxt_wr[t][mc[t]];
}mc
Simulating a thread against an MU
y=1 …. a=x+5 ..... x=1 ….
x=2y=1y=1
x=1
x=3y=8
z=2
pc
mc
Local statement (no global variables) – update pc, keep mc
pc
Simulating a thread against an MU
y=1 …. a=x+5 ..... x=1 ….
x=2y=1y=1
x=1
x=3y=8
z=2
pcmc
Global variable read – “pick write position in range”
int read(uint t, uint v) { if (thr_terminated()) return 0; if (var_fst_wr[v]==0) return 0; uint r_from = *; assume((r_from <= last_wr_pos) && (r_from < th_nxt_wr[t][mc[t]])); assume(var[r_from] == v); if (r_from < mc[t]) assume(var_nxt_wr[r_from] > mc[t]); else { if (r_from < var_fst_wr[v]) return 0; mc[t] = r_from; }; return value[r_from];}
non-deterministic assignment
non-deterministic assignment
Simulating a thread against an MU
y=1 …. a=x+5 ..... x=1 ….
x=2y=1y=1
x=1
x=3y=8
z=2
pcmc
Global variable read – “pick write position in range”
int read(uint t, uint v) { if (thr_terminated()) return 0; if (var_fst_wr[v]==0) return 0; uint r_from = *; assume((r_from <= last_wr_pos) && (r_from < th_nxt_wr[t][mc[t]])); assume(var[r_from] == v); if (r_from < mc[t]) assume(var_nxt_wr[r_from] > mc[t]); else { if (r_from < var_fst_wr[v]) return 0; mc[t] = r_from; }; return value[r_from];}
...before next write of thread...
...before next write of thread...
Check in unwinding...
Check in unwinding...
Simulating a thread against an MU
y=1 …. a=x+5 ..... x=1 ….
x=2y=1y=1
x=1
x=3y=8
z=2
pcmc
Global variable read – “pick write position in range”
int read(uint t, uint v) { if (thr_terminated()) return 0; if (var_fst_wr[v]==0) return 0; uint r_from = *; assume((r_from <= last_wr_pos) && (r_from < th_nxt_wr[t][mc[t]])); assume(var[r_from] == v); if (r_from< mc[t]) assume(var_nxt_wr[nxt_mc] > mc[t]); else { if (r_from< var_fst_wr[v]) return 0; mc[t] = r_from; }; return value[r_from];}
...before next write of thread...
...before next write of thread...
...right variable......right variable...
Simulating a thread against an MU
y=1 …. a=x+5 ..... x=1 ….
x=2y=1y=1
x=1
x=3y=8
z=2
pcmc
Global variable read – “pick write position in range”
int read(uint t, uint v) { if (thr_terminated()) return 0; if (var_fst_wr[v]==0) return 0; uint r_from = *; assume((r_from <= last_wr_pos) && (r_from < th_nxt_wr[t][mc[t]])); assume(var[r_from] == v); if (r_from < mc[t]) assume(var_nxt_wr[r_from] > mc[t]); else { if (r_from < var_fst_wr[v]) return 0; mc[t] = r_from; }; return value[r_from];}
...right variable......right variable......next write of variable in
future...
...next write of variable in
future...
Simulating a thread against an MU
y=1 …. a=x+5 ..... x=1 ….
x=2y=1y=1
x=1
x=3y=8
z=2
pcmc
Global variable read – “pick write position in range”
int read(uint t, uint v) { if (thr_terminated()) return 0; if (var_fst_wr[v]==0) return 0; uint r_from = *; assume((r_from <= last_wr_pos) && (r_from < th_nxt_wr[t][mc[t]])); assume(var[r_from] == v); if (r_from < mc[t]) assume(var_nxt_wr[r_from] > mc[t]); else { if (r_from < var_fst_wr[v]) return 0; mc[t] = r_from; }; return value[r_from];}
...before next write of
variable...
...before next write of
variable...
...else update mc.
...else update mc.
mc
Return value.Return value.
pc
Simulating a thread against an MU
y=1 …. a=x+5 ..... x=1 ….
assert(b);
x=2y=1y=1
x=1
x=3y=8
z=2
Handling errors – “flag that an error occurs”
pc
translates to if (! b) _error = 1;
(computation might not be feasible)
translates to if (! b) _error = 1;
(computation might not be feasible)
Our apporach: sequentialization by MU
…T0 T1
TN
…F0 F1
FN main()
Concurrent program
Sequential program
Sequentialization(code-to-code translation)
Simulation functions
translates
…
translates
translates
Main Sequential Program
void main(void){
memory_init(W);
ct := mem_thread_create();
F0();
thread_terminate(ct);
all_thread_simulated();
assert(_error == 0) }
main
instantiate memory unwinding
error check
all writes of the thread are executed
simulation starts from main thread
all writes of the MU are executed
register main thread
Thread creationThread creation
CSeq framework + Evaluation
CSeq framework
sequentialnon-deterministic
C program
P'
concurrentC program
P
sequentialanalysis
toolcode-to-codetranslation
is a framework that simplifies code-to-code translations- for C programs + Pthread- comprises several code-to-code translation modules- supports several sequential analysis back-end tools
Internal modules-unrolling-function inlining-counter-example
…
Sequentialisations-Memory-Unwinding-Lazy-CSeq-LR-CSeq
…
testingKlee
bounded model-checking-BLITZ-CBMC-ESBMC-LLBM
abstraction-CPA-checker-Frama-C-SATABS
http://users.ecs.soton.ac.uk/gp4/cseq/
Evaluation [ SV–COMP 2014 2015 ]
SV-COMP software verification competition @ TACASConcurrency category benchmarks: 1003 files •Lazy-CSeq won 2 GOLD medals (2014, 2015)•MU-CSeq won 2 SILVER medals (2014, 2015)
Ongoing & Future Work
Sequentializations based on BMC and Abstraction for programs communication through FIFO channels
• weak memory models (TSO) [Atig, Bouajjani, – CAV 2011 ]• message passing interface (MPI)
Symbolic partial order reduction based on interleaving partitioning (CEGAR) combining:
• abstraction interpretation to discard “safe” partitions• bug-finding for partitions with “potential” bugs with few interleavings• swarm verification (on clusters)• implemented using Lazy- and MU-Cseq sequentialization
Foundational work: Reasoning about programs using graph representations. Analysis based on graph decompositions.
[“The Tree Width of Automata with Auxiliary Storage” Madhusudan, Parlato – POPL 2011 ]for multi-stack pushdown automata communicating either through shared variables or queues)
Ongoing work & Future work
Sequentializations based on BMC and Abstraction for programs communication through FIFO channels
• weak memory models (TSO) [Atig, Bouajjani, – CAV 2011 ]• message passing interface (MPI)
Symbolic partial order reduction based on interleaving partitioning (CEGAR) combining:
• abstraction interpretation to discard “safe” partitions• bug-finding for partitions with “potential” bugs with few interleavings• swarm verification (on clusters)• implemented using Lazy- and MU-Cseq sequentialization
Foundational work: Reasoning about programs using graph representations. Analysis based on graph decompositions.
[“The Tree Width of Automata with Auxiliary Storage” Madhusudan, Parlato – POPL 2011 ]for multi-stack pushdown automata communicating either through shared variables or queues)
Ongoing work & Future work
Thank You
users.ecs.soton.ac.uk/gp4/cseq