Ranjit Jhala Rupak Majumdar Interprocedural Analysis of Asynchronous Programs

Post on 22-Dec-2015



Conclusions

Boost your pet Dataflow Analysis to work on Asynchronous Programs

… let's begin at the beginning

Asynchronous Programs

    global request_list *r;

    main(){
      ...
      async reqs();
      ...
    }

    reqs(){
      if (r == NULL) {
        async reqs();
        return;
      }
      rc = malloc(…);
      if (rc == NULL) {
        return NO_MEM;
      }
      async client(rc, r->id);
      r = r->next;
      reqs();
    }

    client(*c, id){
      ...
      c->id = id;
      ...
      return;
    }

[Figure: control-flow graphs of main, reqs and client over nodes v0–v15. Edge labels: reqs, [r==0], [r!=0], rc=malloc(), [rc==0], [rc!=0], client(rc), r=r->next, c->id=id.]


Asynchronous Programs

Dispatch location v3: calls all other functions

[Figure: control-flow graphs of main, reqs and client (nodes v0–v15), with the dispatch location v3 marked.]

Asynchronous Program Execution

• Async calls stored in set
  – Execute at dispatch loop
  – Order is non-deterministic
• Sync calls exec at call site

[Figure: animation of the program counter (PC) stepping through the control-flow graphs of main, reqs and client while the Pending Calls set evolves: {reqs}, {}, {client(…)}, {client(…), client(…)}, {client(…), client(…), reqs}, {client(…), reqs}, {reqs}.]
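The execution model above can be sketched in a few lines of Python. This is only an illustration under assumptions: `run_async`, `post`, `choose` and the dictionary-based `state` are hypothetical names, not taken from the slides, and `malloc` is modelled as always succeeding. Async calls go into a bag of pending calls, the dispatch loop picks any of them, and the synchronous call to `reqs` runs immediately at its call site.

```python
import random

def run_async(handlers, state, entry, choose=random.randrange, max_dispatches=10):
    """Run `entry`, then repeatedly dispatch pending async calls."""
    pending = []                      # pending calls, treated as an unordered bag
    trace = []                        # names of dispatched handlers, in order

    def post(name, *args):            # models `async name(args)`
        pending.append((name, args))

    handlers[entry](state, post)      # main runs first and posts its calls
    for _ in range(max_dispatches):   # the dispatch loop (bounded for the demo)
        if not pending:
            break
        # `choose` picks which pending call runs next: any order is allowed
        name, args = pending.pop(choose(len(pending)))
        trace.append(name)
        handlers[name](state, post, *args)
    return trace

def main(state, post):
    post("reqs")

def reqs(state, post):
    if not state["r"]:                # r == NULL: poll again later
        post("reqs")
        return
    rc = {}                           # stands in for a successful malloc
    post("client", rc, state["r"][0])
    state["r"] = state["r"][1:]       # r = r->next
    reqs(state, post)                 # synchronous call: runs right here

def client(state, post, c, ident):
    c["id"] = ident
    state["served"].append(ident)
```

Passing `choose=lambda n: 0` makes the dispatch order deterministic (oldest pending call first), while the default picks a pending call at random, which mirrors the non-deterministic order on the slide.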

Asynchronous Programs

Why? Latency hiding and parallelism

Domains:
• Distributed Systems
• Web Servers
• Embedded Systems
• Discrete-event simulation

Languages and Libraries:
• Java + Atomic Methods
• LibAsync, LibEvent, …
• NesC

• Async calls stored in set
  – Execute at dispatch loop
• Sync calls execute at call site

Q: How to Analyze Async Programs?

[Figure: control-flow graphs of main, reqs and client (nodes v0–v15).]

Prove: dereference of c is safe, i.e. c is not null at v13

Dataflow Facts: each of r, rc and c is either (Must) Non-null or (May) Null
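The null-ness facts on this slide can be pictured as a two-point lattice with simple transfer functions. The following is a minimal sketch under assumptions: the function names and the dictionary encoding are hypothetical, and only the statements on the path to the `client` call site in `reqs` are modelled.

```python
NONNULL, MAYNULL = "non-null", "may-null"

def join(a, b):
    """Merge facts from two incoming paths: non-null only if both agree."""
    return NONNULL if a == NONNULL and b == NONNULL else MAYNULL

def assume_nonnull(facts, var):
    """Transfer function for a branch guard such as [rc != 0]."""
    return {**facts, var: NONNULL}

def assign_unknown(facts, var):
    """Transfer function for rc = malloc(...) or r = r->next: may be null."""
    return {**facts, var: MAYNULL}

def bind_formal(facts, actual):
    """Fact for a callee's formal, taken from the actual at the call site."""
    return facts[actual]

def facts_at_client_call():
    """Follow reqs from its entry to the `async client(rc, ...)` call site."""
    facts = {"r": MAYNULL, "rc": MAYNULL}
    facts = assume_nonnull(facts, "r")     # [r != 0]
    facts = assign_unknown(facts, "rc")    # rc = malloc()
    facts = assume_nonnull(facts, "rc")    # [rc != 0]
    return facts
```

At the call site `rc` is non-null, so the formal `c` would be non-null if `client` ran immediately. Whether that fact still holds when the call is dispatched later is the question the slides go on to address.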

Verification via Dataflow Analysis

[Figure: control-flow graphs of main, reqs and client (nodes v0–v15).]

Prove: the flow fact "c is (Must) Non-null" holds at v13

Dataflow Facts: each of r, rc and c is either (Must) Non-null or (May) Null

Dataflow Analysis

1st Attempt: Treat asynchronous calls as synchronous [Sharir-Pnueli 80] [Reps-Horwitz-Sagiv 95]

[Figure: the control-flow graphs of main, reqs and client, annotated at each node with the variables among r, rc and c that are found non-null.]

Verification “works” … but unsoundly deduces that global r is non-null!

Dataflow Analysis

1st Attempt: Treat asynchronous calls as synchronous

Unsound: global r may change between call and dispatch

Idea: Separately track local and global facts
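A tiny concrete run shows why treating the async call as synchronous is unsound for globals. The sketch below is an assumption-laden illustration (the names `demo` and `observed` are hypothetical): the global `r` is non-empty when the call is posted, but it has been advanced by the time the call is dispatched.

```python
def demo():
    r = ["req"]                 # global request list, non-null at the call site
    pending = []                # calls waiting for the dispatch loop
    observed = {}

    def client():
        # Runs at dispatch time and sees the *current* value of the global.
        observed["r_nonnull_at_dispatch"] = bool(r)

    observed["r_nonnull_at_call"] = bool(r)
    pending.append(client)      # async client(...): stored, not executed
    r.pop(0)                    # r = r->next, executed before any dispatch

    for call in pending:        # dispatch loop
        call()
    return observed
```

A fact about the local argument (such as `rc` being non-null) survives until dispatch, while a fact about the global `r` does not, which is the reason for tracking the two separately.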

Dataflow Analysis

2nd Attempt: Only execute async calls from dispatch location

Imprecise: initial value of formals? All values (⊤) too coarse

Idea: Track pending calls with formals at call-site

Encoding Pending Calls as Flow Facts

Idea: Counters
– For each kind of async call × input fact: count number of pending calls of that kind
– Expanded DFA facts: Dataflow facts × Counters

[Figure: example counters: reqs: 1; client (c non-null): 5; client (c null): 0.]

Idea: Track pending calls with formals at call-site

Key: Combining two Analyses

Expanded DFA facts: Dataflow facts × Counters

Example: r not null, rc maybe null, and pending calls:
– 1 to reqs
– 5 to client (arg non-null)
– 0 to client (arg null)
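One way to picture an expanded fact is as a pair of an ordinary dataflow fact and a table of counters, one per (function, input fact) kind. The sketch below is a hypothetical encoding, not the authors' data structure: `async_call`, `dispatch` and `example_fact` are illustrative names, and the counters here are exact (unbounded).

```python
from collections import Counter

def async_call(pending, kind):
    """An async call of `kind` adds one pending call of that kind."""
    out = Counter(pending)
    out[kind] += 1
    return out

def dispatch(pending, kind):
    """Dispatching `kind` needs a pending call of that kind; None if there is none."""
    if pending[kind] == 0:
        return None
    out = Counter(pending)
    out[kind] -= 1
    return out

# The example from the slide: r not null, rc maybe null,
# 1 pending reqs, 5 pending client (arg non-null), 0 pending client (arg null).
example_fact = (
    {"r": "non-null", "rc": "may-null"},
    Counter({
        ("reqs", None): 1,
        ("client", "non-null"): 5,
        ("client", "null"): 0,
    }),
)
```

Because a dispatch is only possible when the matching counter is positive, the counter component is what restricts the analysis to feasible sequences of async calls and dispatches.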

Key: Combining two Analyses

Expanded DFA facts: Dataflow facts × Counters

Counters: Restrict analysis to valid inter-procedural paths, i.e. feasible sequences of async calls / dispatches

Dataflow facts: Perform desired analysis over restricted paths

Dataflow Analysis

3rd Attempt: Count # pending calls of each kind

[Figure: the control-flow graphs of main, reqs and client, with example counters: reqs: 1; client (c non-null): 5; client (c null): 0.]

Non-Terminating: # pending calls unbounded due to recursion, loops

Idea: Approximate via abstract counters

Dealing with Unbounded Async Calls

Over-Approximations: k∞-Abstract Counters
– For each async call × input fact: abstractly count number of pending calls of each kind
– Values > k abstracted to infinity (∞)
– Finite counter values = {0,1,…,k,∞}
– Finite DFA facts: Dataflow facts × k∞-abstract counters
– Analysis terminates

[Figure: example counters reqs: 1; client (c non-null): 5; client (c null): 0. E.g. with k = 1, the count 5 is abstracted to ∞.]
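The abstract counters with an infinity value can be sketched as two small operations. This is an illustration under an assumption that is not spelled out on the slide: decrementing ∞ leaves ∞ (so ∞ − 1 = ∞). The names `inc_over` and `dec_over` are hypothetical.

```python
import math

INF = math.inf   # the abstract value "more than k pending calls"

def inc_over(v, k):
    """Abstract increment: values above k collapse to infinity."""
    return v + 1 if v + 1 <= k else INF

def dec_over(v):
    """Abstract decrement at a dispatch; None if nothing is pending.

    Assumption: infinity minus one stays infinity.
    """
    if v == 0:
        return None
    return v - 1
```

With k = 1 the only counter values are 0, 1 and ∞, so the expanded fact space is finite. The price is that once a counter reaches ∞ the analysis can no longer tell when the pending calls of that kind have run out.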

Recall: Combining two Analyses

Expanded DFA facts: Dataflow facts × Counters

Counters: Restrict analysis to valid inter-procedural paths, i.e. feasible sequences of async calls / dispatches

Dataflow facts: Perform desired analysis over restricted paths

Which interprocedural paths do k∞-abstractions consider?

Example: (k=1)∞ Abstraction

[Figure: two interprocedural paths through reqs and client, with steps numbered 1–10 and the abstract pending-call counters shown at the PC. One path is marked Valid. The other is marked Invalid: a dispatch has no matching async call.]

Over-Approx: k∞-Abstraction
– Considers all valid paths
– Plus, some invalid paths
– DFA on superset of valid paths

Over-approximate / Sound
– Works for example … but imprecise in general
– How to do exact DFA over set of valid paths?
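The difference between valid paths and the superset accepted by the over-approximation can be checked on small event sequences. The sketch below rests on assumptions: a path is a hypothetical list of ("call" | "dispatch", kind) events, `accepts`, `exact` and `over` are illustrative names, and decrementing ∞ is assumed to leave ∞.

```python
import math

def accepts(path, inc, dec):
    """True if every dispatch in `path` finds a positive counter for its kind."""
    counts = {}
    for event, kind in path:
        v = counts.get(kind, 0)
        if event == "call":
            counts[kind] = inc(v)
        else:                      # "dispatch"
            if v == 0:
                return False       # no matching async call
            counts[kind] = dec(v)
    return True

def exact():
    """Exact counters: accept precisely the valid paths."""
    return (lambda v: v + 1), (lambda v: v - 1)

def over(k):
    """Counters that jump to infinity above k (and stay there when decremented)."""
    return (lambda v: v + 1 if v + 1 <= k else math.inf), (lambda v: v - 1)

CALL, DISP = ("call", "client"), ("dispatch", "client")
valid_path = [CALL, CALL, DISP, DISP]
invalid_path = valid_path + [DISP]   # third dispatch has no matching call
```

Under exact counters only `valid_path` is accepted. With k = 1 the second call pushes the counter to ∞, so the extra dispatch in `invalid_path` is accepted as well, which is the kind of spurious path the slide marks as invalid.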

Dealing with Unbounded Async Calls

Idea: How bad is over-approximation? Find out using under-approximation!

Over-Approx: k∞-Abstraction
– Considers all valid paths
– Plus, some invalid paths
– DFA on superset of valid paths

Computing Under-Approximate Solutions

Under-Approximations: k-Abstract Counters
– For each async call × input fact: abstractly count number of pending calls of each kind
– Values > k abstracted to k
– Effect: all calls after k are “dropped”
– Finite counter values = {0,1,…,k}
– Finite dataflow facts × k-abstract counters ⇒ termination

[Figure: example counters reqs: 1; client (c non-null): 5; client (c null): 0. E.g. with k = 1, the count 5 is abstracted to 1.]
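The saturating counters can be sketched in the same style. The names `inc_under`, `dec_under` and `replay` are hypothetical, and the path encoding (a list of "call" / "dispatch" strings for a single kind) is an assumption made for the example.

```python
def inc_under(v, k):
    """Abstract increment: calls beyond the k-th are dropped."""
    return min(v + 1, k)

def dec_under(v):
    """Abstract decrement at a dispatch; None if nothing is pending."""
    return None if v == 0 else v - 1

def replay(path, k):
    """True if the k-abstract counter can follow every step of `path`."""
    v = 0
    for event in path:
        if event == "call":
            v = inc_under(v, k)
        else:                       # "dispatch"
            v = dec_under(v)
            if v is None:
                return False        # abstract pending set is already empty
    return True
```

With k = 1, two calls followed by two dispatches is a valid path, yet `replay` rejects it because the second call was dropped. The abstraction therefore explores only a subset of the valid paths.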

Key: Combining two Analyses

Expanded DFA facts: Dataflow facts × Counters

Counters: Restrict analysis to valid inter-procedural paths, i.e. feasible sequences of async calls / dispatches

Dataflow facts: Perform desired analysis over restricted paths

Which interprocedural paths do k-abstractions consider?

Example: (k=1) Abstraction

[Figure: interprocedural paths through reqs and client, with steps numbered 1–9 and the abstract pending-call counters shown at the PC.]

• Already (k=1) pending calls with given input fact
  – Call at step 5 is “dropped”
  – Only one pending call!
• Only one call in (k=1)-abstract pending set
  – None remain after this dispatch
• Exists matching async call
  – But no more calls in (k=1)-abstract pending set!
  – The path is valid but ignored by the k-abstraction

Under-Approx: k-Abstraction
– Ignores all invalid paths
– and some valid paths
– DFA on subset of valid paths
– Under-approx. DFA solution

What we have: For all K …

Require: Exact DFA on Valid Paths

– K∞-Abstract DFA: Over-Approx Paths
– K-Abstract DFA: Under-Approx Paths

Both computable via standard DFA [Sharir-Pnueli 80] [Reps-Horwitz-Sagiv 95]

Increase K to Increase Precision

Require: Exact DFA on Valid Paths

– K-Abstract DFA: Under-Approx Paths (K++ adds paths)
– K∞-Abstract DFA: Over-Approx Paths (K++ removes paths)

But how to compute exact DFA solution?

Theorem: There exists magic K …

Require: Exact DFA on Valid Paths

– K∞-Abstract DFA: Over-Approx Paths
– K-Abstract DFA: Under-Approx Paths

Approximations converge to the exact DFA on valid paths!

Algorithm

Require: Exact DFA on Valid Paths

    AsyncDFA(){
      k := 0
      repeat
        over := DFA(k∞-Counter);
        under := DFA(k-Counter);
        k := k+1;
      until (over = under);
      return over;
    }

DFA = Interprocedural Analysis via Summaries [Sharir-Pnueli 80, Reps-Horwitz-Sagiv 95]
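The loop above can be tried on a toy model. The sketch below is not the summary-based interprocedural analysis the slides refer to: it is a hypothetical explicit-state stand-in in which each handler maps a global fact to a new fact plus a list of posted kinds, and the "DFA solution" is simply the set of global facts reachable at the dispatch loop. The names `reachable` and `async_dfa`, the `max_k` safety bound, and the rule that ∞ − 1 = ∞ are all assumptions made for illustration.

```python
import math

def reachable(handlers, init_fact, init_posts, k, over):
    """Facts reachable at the dispatch loop under abstract counters.

    over=True  : values above k become infinity (over-approximation)
    over=False : values above k stay at k       (under-approximation)
    """
    kinds = sorted(handlers)

    def inc(v):
        if over:
            return v + 1 if v + 1 <= k else math.inf
        return min(v + 1, k)

    def post_all(counts, posts):
        counts = dict(counts)
        for kind in posts:
            counts[kind] = inc(counts[kind])
        return counts

    def freeze(fact, counts):
        return (fact, tuple(sorted(counts.items())))

    start = freeze(init_fact, post_all({n: 0 for n in kinds}, init_posts))
    seen, work = {start}, [start]
    while work:
        fact, frozen = work.pop()
        counts = dict(frozen)
        for kind in kinds:
            if counts[kind] == 0:
                continue                      # nothing of this kind to dispatch
            new_fact, posts = handlers[kind](fact)
            nxt = dict(counts)
            nxt[kind] = counts[kind] - 1      # inf - 1 == inf
            cfg = freeze(new_fact, post_all(nxt, posts))
            if cfg not in seen:
                seen.add(cfg)
                work.append(cfg)
    return {fact for fact, _ in seen}

def async_dfa(handlers, init_fact, init_posts, max_k=10):
    """Increase k until the over- and under-approximations agree."""
    for k in range(max_k + 1):
        over = reachable(handlers, init_fact, init_posts, k, True)
        under = reachable(handlers, init_fact, init_posts, k, False)
        if over == under:
            return over, k
    raise RuntimeError("no agreement up to max_k")

def bump(g):
    """Toy handler: increments a global that saturates at 2; posts nothing."""
    return min(g + 1, 2), []
```

For a program that posts `bump` once, k = 0 gives an over-approximation of {0, 1, 2} and an under-approximation of {0}. At k = 1 both are {0, 1}, so the loop stops there.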

Proof

“Obvious? Finitely many solutions + monotonicity implies computable fixpoint …”

Alas, over- and under-approximations could converge to different fixpoints …

Proof Buzzwords

• Counters are Well Quasi Ordered
• Pre* exists
  – Initial configurations reaching a location
  – Constructible via complex backward algorithm
  – Petri Nets: [Esparza-Finkel-Mayr 98]
  – Async Programs: [Sen-Viswanathan 06]
• Magic k exists due to existence of Pre*
  – Simple forward algorithm
  – [PLDI 04, Geeraerts-Raskin-van Begin 04]

Application: Safety Verification

Ground Dataflow Facts = Predicate Abstraction

Implemented on BLAST framework
– Lazy Interprocedural DFA [POPL 02]
– Predicates automatically found via Counterexample Formula Interpolation [POPL 04]
– Reduced Product of Predicate Abstraction, Counter lattice [FSE 05]

Preliminary Experiments

• C LibEvent Programs
  – Load Balancer
  – Network Simulator
• Properties
  – Buffer Overflow
  – Null Pointer Dereference
  – Protocol State
• Several proved, bugs found

A Few Fun Facts

• For async calls (events) exact solution computable
  – Unlike threads [Ramalingam 00]
• Optimizations directly carry over:
  – Procedure summarization,
  – On-the-fly exploration,
  – Demand-driven, …
• Proof messy but algorithm very simple
• EXPSPACE-Hard
  – but early experiments cause for optimism
  – magic k = 1

Conclusions

Boost your pet Dataflow Analysis to work on Asynchronous Programs

… just add counters

Merci ?