Post on 20-Dec-2015

Towards Tractability of Dataflow Analysis for Concurrent Programs

Vineet Kahlon, NEC Labs, Princeton, USA

Sequential Dataflow Analysis

• Program Design

• Debugging

• Optimization

• Maintenance

• Documentation

• …

Concurrent Dataflow Analysis

• Same applications as in the sequential case: program design, debugging, optimization, maintenance, documentation, …

• Hardly anything of interest is decidable

Pointer Analysis

int main(){
  int *p, *v;
  rc = fork(foo,&v);
  v = c;
  p = v;
  rc = join();
}

foo(int **t){
  if(…) *t = d;
  else{ *t = e; *t = f; }
}

Pointer Analysis

int main(){ foo(int **t){ int *p, *v; if(…){ rc = fork(foo,&v); send(sig1); v = c; *t = d; wait(sig1); } else{ p = v; send(sig2); rc = join(); *t = e; } wait(sig1); *t=f; } }

Analysis of Concurrent Programs is Inherently Global

foo(…){

}

Pairwise Reachability

• Key Problem: Given a program location c in a thread determine which locations in other threads can contribute to dataflow facts at c

• Technical formulation: Pairwise Reachability

EF (c ∧ d)

Concurrent Dataflow Analysis Framework

• Step 1: Abstractly interpret each thread based on the analysis

• Step 2: For each location c of a given thread T determine which locations in the concurrent program can contribute to dataflow facts at c

• Step 3: Compute dataflow facts using standard fixpoint computations

Inter-procedural Dataflow Analysis for Sequential Programs

• Close relationship between dataflow analysis for sequential programs and the model checking problem for Pushdown Systems (PDS)
– Use abstract interpretation to get a finite representation of the control part of the program
– Recursion is modeled as a stack
– Exploit the fact that the model checking problem for Pushdown Systems is efficiently decidable for very expressive linear and branching-time logics

[Bouajjani et.al., Walukiewicz, Reps, Schwoon, Jha]

From Programs to PDSs

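main(){
l1: sh = 0;
l2: if(…){
l3:   sh = 1;
l4:   foo();
    } else{
l5:   sh = z;
l6:   foo();
    }
l7: …
}

[Figure: control-flow graph of main (nodes l1 to l7, l′4, l′6) and of foo (nodes m0, m1), with call and return edges labelled by the stack symbols foo4 and foo6.]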

From Programs to PDSs

[Figure: the same control-flow graph after abstract interpretation, each node paired with the abstract value of sh: (l1,0), (l2,0), (l3,1), (l4,1), (l′4,1), (l5,0), (l6,⊤), (l′6,⊤), (l7,⊤); the entry of foo appears as (m0,1) and (m0,⊤).]

PDS

A PDS is a tuple (Q, Γ, Δ, q0), where

• Q is a finite set of control states

• Γ is a set of stack symbols

• Δ ⊆ (Q × Γ) × (Q × Γ*) is the transition relation; notation: c1 → c2

Configurations:

⟨c, u⟩, where c is the control state and u is the stack content
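The definition above can be made concrete with a small sketch. It is only an illustration under assumptions that are not in the slides: a single control state named "q", program locations used as stack symbols, and a hypothetical helper `successors` that applies one transition to a configuration whose stack is written top-first.

```python
from typing import Dict, List, Tuple

# A configuration is (control state, stack), with the top of the stack first.
Config = Tuple[str, Tuple[str, ...]]
# Rules map (control state, top symbol) to a list of (new state, pushed word).
Rules = Dict[Tuple[str, str], List[Tuple[str, Tuple[str, ...]]]]


def successors(rules: Rules, config: Config) -> List[Config]:
    """Return all configurations reachable from config in one PDS step."""
    state, stack = config
    if not stack:
        return []  # no transition is enabled on an empty stack
    top, rest = stack[0], stack[1:]
    return [(q, word + rest) for (q, word) in rules.get((state, top), [])]


# Hypothetical rules: a call at l4 pushes foo's entry m0 above the return
# site l4', foo steps from m0 to m1, and returning from m1 pops the stack.
rules: Rules = {
    ("q", "l4"): [("q", ("m0", "l4'"))],
    ("q", "m0"): [("q", ("m1",))],
    ("q", "m1"): [("q", ())],
}

after_call = successors(rules, ("q", ("l4",)))
after_step = successors(rules, after_call[0])
after_return = successors(rules, after_step[0])
```

Here the call replaces the top symbol by two symbols, so the stack grows, and the return replaces it by the empty word, so the return site l4' is exposed again.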

Inter-procedural Dataflow Analysis for Concurrent Programs

• Dataflow analysis for concurrent program reduces to the model checking problem for interacting PDS systems

• Fundamental Problem: To study the decidability of the model checking problem for PDS interacting via the standard synchronization primitives

–Lock

•Non-nested Locks

•Nested Locks

–Pairwise and Asynchronous Rendezvous

–Broadcasts

–Boolean Guards

Undecidability Barrier

Undecidable for PDSs interacting via

–Pairwise Rendezvous [Ramalingam]

–Locks [Kahlon et. al.]

Key Underlying Obstacle: Checking non-emptiness of the intersection of two context-free languages is undecidable

Consequence: If the PDSs are coupled tightly enough, either by making

–the synchronization primitive expressive enough, or

–the temporal property being checked strong enough,

we get undecidability of the model checking problem

Indexed Linear Temporal Logic

• LTL
– Atomic propositions: if the atomic propositions are interpreted over the local states of k threads, then the formula is k-indexed
– Temporal operators:
  • F p : eventually p
  • p U q : p until q
  • X p : next time p
  • G p : always p
– Boolean connectives: ∧, ∨ and ¬

• Example: data race: F(c ∧ d)

• L(Op1,…,Opk): only the operators Op1,…,Opk are allowed
– Example: L(G,U) allows G, U, ∧, ∨ and atomic propositions

LTL Landscape

[Figure: lattice of the LTL fragments L(G,U), L(G,F), L(U), L(F), L(G) and L(F, F^∞), annotated with regions for Nested Locks, Non-Nested Locks, Broadcasts and Pairwise Rendezvous.]

Decidability ⇔ Loose Coupling

• One thread cannot force another thread to execute

• Model Checking is decidable for loosely coupled PDSs

• When model checking is decidable we can reduce the analysis for the program to its constituent threads

Frequently Used Primitives

• Locks

• Rendezvous

Java: wait/notify

Pthreads: pthread_cond_wait()

pthread_cond_signal()

• Broadcasts

Java: wait/notifyAll

Locks

• Locks:
– Nested: things of interest are decidable
– Non-nested: nothing of interest is decidable

• Rendezvous: nothing of interest is decidable
– Solutions:
  • Over-approximation via regular sets
  • Over-approximation via parameterization
  • …

Locks

[Figure: lattice of the LTL fragments L(G,U), L(G,F), L(U), L(F), L(G) and L(F, F^∞), annotated with regions for Nested and Non-Nested locks.]

Nested Locks

A concurrent multi-threaded program uses locks in a nested fashion iff along every computation each thread can only release the lock that it acquired last and that has not yet been released

f() { g(){ h(){ acquire(b) ; release(b); acquire(c); g(); acquire(c); release(b); release(c); } } }

• Programming guidelines typically recommend that programmers use locks in a nested fashion

• Locks are guaranteed to be nested in Java (≤ 1.4) and C#
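The nesting condition amounts to a stack discipline on lock operations. The sketch below checks it for one recorded sequence of operations of a single thread; the function name `uses_locks_nested` and the trace encoding are assumptions made for illustration, and checking one trace is of course weaker than the "along every computation" requirement of the definition.

```python
def uses_locks_nested(ops):
    """Check the nesting discipline on one thread's sequence of lock operations.

    ops is a list of ("acquire", lock) or ("release", lock) pairs.
    Returns True iff every release targets the most recently acquired
    lock that has not yet been released.
    """
    held = []  # locks currently held, in acquisition order
    for kind, lock in ops:
        if kind == "acquire":
            held.append(lock)
        elif kind == "release":
            if not held or held[-1] != lock:
                return False  # releases a lock that was not acquired last
            held.pop()
        else:
            raise ValueError(f"unknown operation: {kind}")
    return True


nested = [("acquire", "a"), ("acquire", "b"), ("release", "b"), ("release", "a")]
non_nested = [("acquire", "a"), ("acquire", "b"), ("release", "a")]
```

The first trace releases b before a and satisfies the discipline; the second releases a while b, acquired later, is still held.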

Nested Locks

f() { g(){ h(){ acquire(b) ; release(b); acquire(c); h(); acquire(c); release(b); release(c); } } }

Coupling via Locks

[Figure: two threads reaching control states c1 and c2 through sequences of operations on locks l1, l2, l3 and l4.]

Nested locks can only enforce mutual exclusion but cannot constrain the order in which transitions can be executed

Locks can, in general, enforce synchronization through chaining

Nested Locks

[Figure: lattice of the LTL fragments L(G,U), L(G,F), L(U), L(F), L(G) and L(F, F^∞), with the region for Nested locks marked.]

Nested Locks

• Pairwise reachability is (efficiently) decidable

• Reasoning about a concurrent program comprised of threads interacting via nested locks can be decomposed into reasoning about its constituent threads

Acquisition History: Motivation

Thread1(){ Thread2(){

c1: acquire(a); g1: acquire(c);

c2: acquire(c); g2: acquire(a);

c3: release(c); g3: release(a);

c4: Error1; g4: Error2;

} }

Observation: c4 and g4 are not simultaneously reachable even though Lock-Set(c4) ∩ Lock-Set(g4) = ∅

Bottom line: Tracking Lock-Sets is not enough

Acquisition History: Cyclic Dependencies

Thread1(){ Thread2(){

c1: acquire(a); g1: acquire(c);

c2: acquire(c); g2: acquire(a);

c3: release(c); g3: release(a);

c4: Error1; g4: Error2;

} }

• acquire(a) must be executed by Thread1 before acquire(c) is executed by Thread2

• acquire(c) must be executed by Thread2 before acquire(a) is executed by Thread1.

Acquisition History: Definition

Thread1(){ Thread2(){

c1: acquire(a); g1: acquire(c);

c2: acquire(c); g2: acquire(a);

c3: release(c); g3: release(a);

c4: Error1; g4: Error2;

} }

The acquisition history of a lock lk at a control location of a thread T is the set of locks that have been acquired (and possibly released) by T since the last acquisition of lk by T

• Acq-Hist(c4,a) = {c}

• Acq-Hist(g4,c) = {a}
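As a minimal sketch of the definition, the following computes acquisition histories along one straight-line sequence of lock operations. The function name `acquisition_histories` and the trace encoding are assumptions for illustration; only locks that are still held at the end of the trace receive a history.

```python
def acquisition_histories(ops):
    """Map each lock held at the end of ops to its acquisition history.

    ops is a list of ("acquire", lock) or ("release", lock) pairs executed
    by one thread. The history of a held lock is the set of locks acquired
    (and possibly released) since that lock was last acquired.
    """
    held = {}
    for kind, lock in ops:
        if kind == "acquire":
            for history in held.values():
                history.add(lock)  # every held lock records the new one
            held[lock] = set()     # the fresh lock starts with an empty history
        else:
            held.pop(lock, None)   # a released lock no longer has a history
    return held


# Operations executed by the two threads of the slide before c4 and g4.
thread1_to_c4 = [("acquire", "a"), ("acquire", "c"), ("release", "c")]
thread2_to_g4 = [("acquire", "c"), ("acquire", "a"), ("release", "a")]

ah_c4 = acquisition_histories(thread1_to_c4)
ah_g4 = acquisition_histories(thread2_to_g4)
```

The keys of the returned dictionary are the lock-set at the end of the trace, so the same pass yields both pieces of information used in the slides.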

Acquisition History: Consistency

Thread1(){ Thread2(){

c1: acquire(a); g1: acquire(c);

c2: acquire(c); g2: acquire(a);

c3: release(c); g3: release(a);

c4: Error1; g4: Error2;

} }

Acq-Hist(c1, l1) is consistent with Acq-Hist(c2, l2) iff the following does not hold: l1 ∈ Acq-Hist(c2, l2) and l2 ∈ Acq-Hist(c1, l1)

Decomposition Result

Control states c1 and c2 of Thread1 and Thread2, respectively, are simultaneously reachable iff

• Lock-Set(c1) ∩ Lock-Set(c2) = ∅

• There do not exist locks l, m such that

– l ∈ Acq-Hist(c1, m)

– m ∈ Acq-Hist(c2, l)

Corollary: By tracking acquisition histories we can reduce the model checking problem from a concurrent program to its individual threads.
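The two conditions can be checked directly once the acquisition histories at the two control states are known. In this sketch each history is passed as a dictionary from held locks to sets of locks, so its keys double as the lock-set; the function names and this encoding are assumptions for illustration.

```python
def consistent(ah1, ah2):
    """True iff there are no locks l, m with l in ah1[m] and m in ah2[l]."""
    for m, history in ah1.items():
        for l in history:
            if m in ah2.get(l, set()):
                return False  # cyclic dependency between l and m
    return True


def satisfies_decomposition(ah1, ah2):
    """Check disjoint lock-sets and consistent acquisition histories."""
    lock_sets_disjoint = not (set(ah1) & set(ah2))
    return lock_sets_disjoint and consistent(ah1, ah2)


# Histories at c4 and g4 from the two-thread example of the slides.
at_c4 = {"a": {"c"}}
at_g4 = {"c": {"a"}}
# Thread1 just after acquiring a, before touching c.
at_c2 = {"a": set()}
```

For c4 and g4 the lock-sets {a} and {c} are disjoint, but c ∈ Acq-Hist(c4, a) and a ∈ Acq-Hist(g4, c), so the check fails, matching the observation that the two error locations are not simultaneously reachable.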

Decomposition Result

(c1, c2) is reachable from the initial state (in1, in2)

⇔ there exist local paths of T1 and T2 along which the acquisition histories are consistent

⇔ there exist consistent acquisition histories AH1 and AH2 such that the augmented local states (c1, AH1) and (c2, AH2) are reachable individually in T1 and T2, respectively

⇔ for each i, ini ∈ pre*({(ci, AHi)})

Bottom line: the pre* closure for a multi-threaded program interacting via nested locks can be reduced to its individual constituent threads.

A decision procedure for EF(c1 ∧ c2)

1. Enumerate the set of all pairs pi of augmented local states (c1, AH1i) and (c2, AH2i), where AH1i and AH2i are consistent

2. For each pair pi, compute the sets pre*(c1, AH1i) and pre*(c2, AH2i) in the individual threads T1 and T2

3. EF(c1 ∧ c2) holds iff for some i,
   a. in1 ∈ pre*(c1, AH1i), and
   b. in2 ∈ pre*(c2, AH2i)
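The procedure can be sketched for a simplified setting. The code below makes several assumptions that go beyond the slides: each thread is a finite control-flow graph without recursion (so no stack is needed), locks are used in a nested fashion and none is held initially, and reachable augmented states are explored forwards from the initial location instead of computing pre*. All names are hypothetical.

```python
from collections import deque


def augmented_states(edges, init):
    """Return all reachable (location, acquisition history) pairs of a thread.

    edges is a list of (src, op, dst), where op is ("acquire", lock),
    ("release", lock) or None. Histories are frozen into sorted tuples so
    that augmented states are hashable.
    """
    def freeze(ah):
        return tuple(sorted((lock, tuple(sorted(h))) for lock, h in ah.items()))

    start = (init, freeze({}))
    seen, queue = {start}, deque([start])
    while queue:
        loc, frozen = queue.popleft()
        for src, op, dst in edges:
            if src != loc:
                continue
            ah = {lock: set(h) for lock, h in frozen}
            if op is not None:
                kind, lock = op
                if kind == "acquire":
                    if lock in ah:
                        continue  # a held lock cannot be acquired again
                    for h in ah.values():
                        h.add(lock)
                    ah[lock] = set()
                else:
                    ah.pop(lock, None)
            nxt = (dst, freeze(ah))
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def pairwise_reachable(edges1, init1, c1, edges2, init2, c2):
    """Decide EF(c1 and c2) by pairing up augmented local states."""
    states1 = [dict(ah) for loc, ah in augmented_states(edges1, init1) if loc == c1]
    states2 = [dict(ah) for loc, ah in augmented_states(edges2, init2) if loc == c2]
    for ah1 in states1:
        for ah2 in states2:
            if set(ah1) & set(ah2):
                continue  # lock-sets overlap
            clash = any(m in ah2.get(l, ()) for m, h in ah1.items() for l in h)
            if not clash:
                return True  # disjoint lock-sets and consistent histories
    return False


# The two threads from the acquisition history slides.
thread1 = [("c1", ("acquire", "a"), "c2"), ("c2", ("acquire", "c"), "c3"),
           ("c3", ("release", "c"), "c4")]
thread2 = [("g1", ("acquire", "c"), "g2"), ("g2", ("acquire", "a"), "g3"),
           ("g3", ("release", "a"), "g4")]
```

Each thread is explored on its own; the product of the two threads is never built, which is the point of the decomposition.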

Model Checking LTL Properties

Main Idea: Reduce the Model Checking Problem to multiple instances of reachability

Model Checking for Finite State Systems

System: S

Temporal Property: f

[Figure: the product S × Bf, searched for an accepting path that consists of a stem followed by a cycle.]

Model Checking a Single PDS for LTL properties

Decide whether the product BP of the given PDS P and the Büchi automaton for the given property f has an accepting lollipop, i.e., there exists a global configuration ⟨c, au⟩ such that

1. There is a path from the initial state ⟨in, ⊥⟩ to ⟨c, au⟩ [Stem]

2. For some v, there is a path from ⟨c, a⟩ to ⟨c, av⟩ containing an accepting state g of BP [Cycle]

Notation: ⟨c, au⟩
• c: control state
• a: top stack symbol
• u: stack content

Pumping diagram

[Figure: the stem leads from ⟨in, ⊥⟩ through g to ⟨c, au⟩, with stack content u below the top symbol a. Since ⟨c, a⟩ → ⟨c, av⟩, the cycle can be repeated on top of u: ⟨c, au⟩ → ⟨c, avu⟩ → ⟨c, av²u⟩ → … → ⟨c, avⁱu⟩.]

Dual Pumping

[Figure: pumping in two PDSs at once: ⟨in1, ⊥⟩ → ⟨c1, a1u1⟩ → ⟨c1, a1v1u1⟩ and ⟨in2, ⊥⟩ → ⟨c2, a2u2⟩ → ⟨c2, a2v2u2⟩. The animation steps carry the labels f, lf1, lf2, c1, d1, c2, d2 and in.]

Reduction to Reachability

• Dual Pumping reduces model checking for F(c1 ∧ c2) to reachability in Dual-PDS systems

• Reachability is decidable for PDSs interacting via nested locks but undecidable for PDSs interacting via non-nested ones

• Reachability can be decided in a compositional manner

LTL Landscape

[Figure: lattice of the LTL fragments L(G,U), L(G,F), L(U), L(F), L(G) and L(F, F^∞), annotated with regions for Nested Locks, Non-Nested Locks, Broadcasts and Pairwise Rendezvous.]

Rendezvous

Over-approximation via Regular Languages

T1(){ T2(){ … …

a1: if(..){ b1: wait(obj);

a2: send(obj); b2: …

a3: counter++; b3: counter++;

a4: } else{ }

a5: counter = 0; }}

Over-approximation via Regular Languages

• Compute the language of sends/waits at each control location

• c1 and c2 are simultaneously reachable only if

L(c1) ∩ L′(c2) ≠ ∅

L′(c2) is the language obtained from L(c2) by replacing each send with a wait, and vice versa

Over-approximation via Regular Languages

main(){
  for (int i = 0; i < 100; i++)
    send(a);
  …
  foo();
  …
}

foo(..){
  if(){
    send(b);
  }else{
    send(c);
    foo();
    wait(d);
  }
}

L(exmain) = (a!)^100 (c!)^n (b!) (d?)^n

Lo(exmain) = (a!)* (c!)* (b!) (d?)*

Parameterized Systems

• Systems comprise many replicated copies of a few basic components

U1^n1 || … || Uk^nk

• Examples

– Multi-core processors

– Protocols

– Drivers

• PMCP: ∃ n1,…,nk: U1^n1 || … || Uk^nk ⊨ f

Example: Huge Tlb

static struct page *alloc_fresh_huge_page(struct page *page)
{
    static int nid = 0;

    page = alloc_pages_node(nid, GFP_HIGHUSER|__GFP_COMP|__GFP_NOWARN,
                            HUGETLB_PAGE_ORDER);
    nid = (nid + 1) % num_online_nodes();
    if (page) {
        // Data Race here !!!
        //++ spin_lock(&hugetlb_lock);
        nr_huge_pages++;
        nr_huge_pages_node[page_to_nid(page)]++;
        //++ spin_unlock(&hugetlb_lock);
    }
    return page;
}

Why Parameterization?

• Parameterized applications:
– Example: device drivers are supposed to be data race free irrespective of how many thread instances running the driver exist

• (Partial) completeness: many data races occur iff they occur in a parameterized setting

• Soundness: data race freedom in a parameterized setting implies data race freedom for any concrete finite instance

Parameterization as Abstraction

T1 || T2 ⊨ EF (c1 ∧ c2)

versus

∃ n, m: T1^n || T2^m ⊨ EF (c1 ∧ c2)

Why Parameterization for Dataflow Analysis ?

• Surprise: parameterized reachability is more tractable

• For threads communicating via locks parameterization does not lead to an over-approximation of pairwise reachable states

• Can be used as a first step to cheaply filter out many interleavings

• Existing tools can be easily adapted to decide parameterized reachability

Tractability via Parameterization

[Figure: lattice of the LTL fragments L(G,U), L(G,F), L(U), L(F), L(G) and L(F, F^∞), with the region for Pairwise Rendezvous marked.]

Parameterization vis-à-vis Ramalingam’s Result

• Avoid reasoning about instances of progressively increasing size

• Given a pair of parameterized pairwise reachable states c1 and c2 computing the smallest instance for which c1 and c2 are pairwise reachable is not possible in general

Unbounded Multiplicity Result

The multiplicity of any reachable state can be made to exceed any given m

U^n ⊨ EF c ⇒ U^(nm) ⊨ EF_{≥m} c

Efficient Parameterized Reachability

[Figure: step-by-step animation of the reachability computation on a template with control states c0, c1, c2 and c3 and rendezvous transitions a!, a?, b!, b?, c!, c? and d!.]

Complexity of Parameterized Reachability

• At most s iterations

• Each iteration costs O(s³) time

• Total time: O(s⁴)

• Can use existing PDS model checking tools

s: number of control states
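The iteration can be illustrated on a finite-state template, which is a simplification: the stacks of the PDSs are ignored here, so this is a sketch of the fixpoint idea, not of the O(s⁴) procedure itself. It relies on the unbounded multiplicity property: once a control state is known to be reachable, arbitrarily many copies can be placed there, so a rendezvous can fire as soon as its send and its matching receive both start in reachable states. The function name and the label encoding ("a!" for send, "a?" for receive, None for internal moves) are assumptions.

```python
def parameterized_reachable(transitions, init):
    """Control states reachable in some instance U^n of a template U.

    transitions is a list of (src, label, dst) with label None (internal),
    "x!" (send on x) or "x?" (receive on x).
    """
    reach = {init}
    changed = True
    while changed:
        changed = False
        for src, label, dst in transitions:
            if src not in reach:
                continue
            if label is None:
                new = {dst}
            elif label.endswith("!"):
                partner = label[:-1] + "?"
                receivers = {d for s, lab, d in transitions
                             if lab == partner and s in reach}
                # the send fires only together with an enabled receive
                new = {dst} | receivers if receivers else set()
            else:
                continue  # receives are handled with their matching send
            if not new <= reach:
                reach |= new
                changed = True
    return reach


# Hypothetical template: c4 needs a receive on c, but nobody ever sends on c.
template = [
    ("c0", "a!", "c1"),
    ("c0", "a?", "c2"),
    ("c2", "b!", "c3"),
    ("c1", "b?", "c3"),
    ("c3", "c?", "c4"),
]
reachable = parameterized_reachable(template, "c0")
```

Every pass of the outer loop that continues adds at least one control state, which is where a bound of s iterations comes from.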

A Concise History

• Undecidable for PDSs interacting via rendezvous [Ramalingam]

• Undecidable for PDSs interacting via locks [Kahlon et al.]

• Decidable for
– PA processes [Esparza et al., Lugiez et al.]
– Constrained Dynamic Pushdown Networks [Bouajjani et al.]
– Asynchronous Dynamic Pushdown Networks [Bouajjani et al.]

• Decidable for asynchronous programs [Sen et al., Jhala et al., Olm et al.]

A Concise History

• Over-approximation techniques for PDSs interacting via rendezvous [Chaki et al.]

• Dataflow analysis from partial order traces [Farzan and Madhusudan]

• Delineation of the decidability boundary for the standard synchronization primitives [Kahlon et al.]

• Parameterization as a form of abstraction [Kahlon]

Concluding Remarks

• Decidability ⇔ threads loosely coupled

• For decidable cases one can reduce dataflow analysis for the given concurrent program to its individual threads

• Undesirables: model checking of L(F) is undecidable for PDSs interacting via primitives other than nested locks

• Exploit program structure to ensure tractability
– Parameterization
– Over-approximation via regular languages
