Assume/Guarantee Reasoning using Abstract Interpretation Nurit Dor Tom Reps Greta Yorsh Mooly Sagiv


Post on 11-Jan-2016


Page 1: Assume/Guarantee Reasoning using Abstract Interpretation

Assume/Guarantee Reasoning using Abstract Interpretation

Nurit Dor, Tom Reps, Greta Yorsh, Mooly Sagiv

Page 2: Assume/Guarantee Reasoning using Abstract Interpretation

Limitations of Whole Program Analysis

• Complexity of chaotic iterations

• Not all the source code is available
– Large libraries
– Software components

• No interaction with the client
– Program design

Page 3: Assume/Guarantee Reasoning using Abstract Interpretation

A Motivating Example

List rev(List x) {
    if (x == null) return null;
    return append(rev(x->next), x);
}

List append(List x, List y) {
    List e;
    if (x == null) return y;
    e = malloc(…);
    e->data = x->data;
    e->next = append(x->next, y);
    return e;
}

Contract for rev:

List rev(List x)
    requires acyclic(x)
    ensures $$ = reverse(x)

Contract for append:

List append(List x, List y)
    requires acyclic(x) ∧ acyclic(y)
    ensures $$ = x || y

Contracts can also be used for runtime testing.
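The remark about runtime testing can be illustrated with a minimal sketch in which the contract of rev is checked by assertions. Everything here is an assumption made for illustration: the Node layout, the helper names (acyclic, is_reverse, cons, rev_checked), and the use of a simple iterative copying reverse in place of the recursive definition on the slide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list representation; the slide does not define List. */
typedef struct Node { int data; struct Node *next; } Node;
typedef Node *List;

/* acyclic(x): Floyd's cycle detection; true if the list reaches NULL. */
static bool acyclic(List x) {
    List slow = x, fast = x;
    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast) return false;
    }
    return true;
}

/* Number of cells in an acyclic list. */
static size_t length(List x) {
    size_t n = 0;
    for (; x != NULL; x = x->next) n++;
    return n;
}

/* Allocates one cell in front of next. */
static List cons(int data, List next) {
    List e = malloc(sizeof *e);
    assert(e != NULL);
    e->data = data;
    e->next = next;
    return e;
}

/* $$ == reverse(x): r holds the elements of x in opposite order. */
static bool is_reverse(List r, List x) {
    size_t n = length(x);
    if (length(r) != n) return false;
    for (size_t i = 0; i < n; i++, r = r->next) {
        List p = x;
        for (size_t k = 0; k + 1 < n - i; k++) p = p->next; /* p = x[n-1-i] */
        if (r->data != p->data) return false;
    }
    return true;
}

/* Non-destructive reverse with the contract checked at run time. */
static List rev_checked(List x) {
    assert(acyclic(x));                 /* requires acyclic(x) */
    List result = NULL;
    for (List p = x; p != NULL; p = p->next)
        result = cons(p->data, result);
    assert(is_reverse(result, x));      /* ensures $$ == reverse(x) */
    return result;
}
```

Calling rev_checked on a cyclic list trips the first assertion instead of looping forever, which is the kind of early error detection a checked precondition provides.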

Page 4: Assume/Guarantee Reasoning using Abstract Interpretation

Challenges in A/G Reasoning

• Specifying procedure contracts

• Performing abstract interpretation using contracts

Page 5: Assume/Guarantee Reasoning using Abstract Interpretation

Specifying Contracts

• Executable specifications
– assert
– Can use loops
– Expressive
– Natural
– But what about side effects?

• Declarative specifications
– Types
– First-order logic
– Z

• Hybrid
– Larch
– Java Modeling Language

Page 6: Assume/Guarantee Reasoning using Abstract Interpretation

Procedure Contracts and Modularity

• The postcondition does not reveal the whole story

void foo(List x, List z) {
    List y, t;
    y = rev(x);
    t = rev(z);
}

List rev(List x)
    requires acyclic(x)
    ensures $$ = reverse(x)

void foo(List x, List z)
    requires acyclic(x) ∧ acyclic(z)
    ensures true

Page 7: Assume/Guarantee Reasoning using Abstract Interpretation

Procedure Contracts and Modularity

• Specify parts of the state which may be modified

• But it is difficult to define potential side effects

• Can use abstract interpretation

void foo(List x, List z) {
    List y, t;
    y = rev(x);
    t = rev(z);
}

List rev(List x)
    requires acyclic(x)
    ensures $$ = reverse(x)

void foo(List x, List z)
    requires acyclic(x) ∧ acyclic(z)
    ensures true

Page 8: Assume/Guarantee Reasoning using Abstract Interpretation

Issues in Specifying Contracts

• Expressible

• Conciseness

• Natural

• Reuse

• Cost of dynamic check (model checking)

• Decidability

• Cost of abstract interpretation

Page 9: Assume/Guarantee Reasoning using Abstract Interpretation

Plan

• CSSV: A tool for verifying absence of buffer overruns (N. Dor)

• An algorithm for performing abstract interpretation in the most precise way using specification

Page 10: Assume/Guarantee Reasoning using Abstract Interpretation

CSSV: Towards a Realistic Tool for Statically Detecting All Buffer Overflows in C

Nurit Dor, Michael Rodeh, Mooly Sagiv

DAEDALUS project

Page 11: Assume/Guarantee Reasoning using Abstract Interpretation

Vulnerabilities of C programs

/* from web2c [strpascal.c] */
void foo(char *s)
{
    while (*s != ' ')
        s++;
    *s = 0;
}

Null dereference
Dereference to unallocated storage
Out-of-bound pointer arithmetic
Out-of-bound update

Page 12: Assume/Guarantee Reasoning using Abstract Interpretation

Is it common?

• General belief: yes!

• FUZZ study
– Tests reliability with random input
– Tens of applications on 9 different UNIX systems
– 18% to 23% hang or crash

• CERT advisory
– Up to 50% of attacks are due to buffer overflow

COMMON AND DANGEROUS

Page 13: Assume/Guarantee Reasoning using Abstract Interpretation

CSSV’s Goals

• Efficient conservative static checking algorithm
– Verify the absence of buffer overflow (not just finding bugs)
– All C constructs: pointer arithmetic, casting, dynamic memory, …
– Real programs
– Minimum false alarms

Page 14: Assume/Guarantee Reasoning using Abstract Interpretation

Verifying Absence of Buffer Overflow is non-trivial

void safe_cat(char *dst, int size, char *src)
{
    if (size > strlen(src) + strlen(dst)) {
        dst = dst + strlen(dst);
        strcpy(dst, src);
    }
}

Required before strcpy(dst, src):
{string(src) ∧ alloc(dst) > len(src)}

Required before dst = dst + strlen(dst):
{string(src) ∧ string(dst) ∧ alloc(dst + len(dst)) > len(src)}

Required at procedure entry:
{string(src) ∧ string(dst) ∧ ((size > len(src) + len(dst)) ⇒ alloc(dst + len(dst)) > len(src))}
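The backward reasoning above can be made concrete with a runtime-checked variant of safe_cat. This is only a sketch: the name safe_cat_checked, the extra alloc_dst parameter (the number of bytes allocated starting at dst), and the return value are assumptions added for illustration, since plain C cannot recover alloc(dst) from a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Appends src to dst if the guard holds. Returns 1 if src was appended,
 * 0 if the guard rejected the call. alloc_dst is a hypothetical parameter
 * standing in for alloc(dst). */
static int safe_cat_checked(char *dst, size_t alloc_dst, int size,
                            const char *src) {
    size_t len_dst = strlen(dst);
    size_t len_src = strlen(src);
    if (size > 0 && (size_t)size > len_src + len_dst) {
        /* strcpy's precondition at the advanced pointer:
         * alloc(dst + len(dst)) > len(src) */
        assert(alloc_dst - len_dst > len_src);
        strcpy(dst + len_dst, src);
        return 1;
    }
    return 0;
}
```

The assertion can only fail when the caller passes a size larger than the real allocation, which is exactly the obligation that a contract on safe_cat has to place on its callers.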

Page 15: Assume/Guarantee Reasoning using Abstract Interpretation

Can this be done for real programs?

• Complex linear relationships• Pointer arithmetic• Loops• Procedures

• Use Polyhedra[CH78]• Pointer analysis• Widening• Procedure contracts

Very few false alarms!

Page 16: Assume/Guarantee Reasoning using Abstract Interpretation

Linear Relation Analysis

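Cousot and Halbwachs, 78

Statically analyze linear relations among program variables:

a1*var1 + a2*var2 + … + an*varn ≤ b

Example polyhedron: y ≥ 1, x + y ≥ 3, −x + y ≤ 1

[Figure: the polyhedron plotted in the x-y plane, with both axes marked from 0 to 3]

Generators: vertices V = { (1,2), (2,1) }, rays R = { (1,0), (1,1) }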
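The two descriptions of the example polyhedron, by constraints and by generators, can be compared with a small sketch. The constraint signs used here (y ≥ 1, x + y ≥ 3, −x + y ≤ 1) are the ones consistent with the listed vertices and rays; the type and function names are assumptions for illustration and are not taken from any polyhedra library.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { double x, y; } Point;

/* Constraint form: y >= 1, x + y >= 3, -x + y <= 1. */
static bool in_polyhedron(Point p) {
    return p.y >= 1.0 && p.x + p.y >= 3.0 && -p.x + p.y <= 1.0;
}

/* Generator form: a convex combination of the vertices (1,2) and (2,1)
 * plus a nonnegative combination of the rays (1,0) and (1,1).
 * lambda weights vertex (1,2); mu1 and mu2 scale the two rays. */
static Point from_generators(double lambda, double mu1, double mu2) {
    assert(lambda >= 0.0 && lambda <= 1.0 && mu1 >= 0.0 && mu2 >= 0.0);
    Point p;
    p.x = lambda * 1.0 + (1.0 - lambda) * 2.0 + mu1 * 1.0 + mu2 * 1.0;
    p.y = lambda * 2.0 + (1.0 - lambda) * 1.0 + mu1 * 0.0 + mu2 * 1.0;
    return p;
}
```

Every point produced by from_generators should satisfy in_polyhedron, which is a quick sanity check that the two representations describe the same set.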

Page 17: Assume/Guarantee Reasoning using Abstract Interpretation

C String Static Verifier

• Detects string violations
– Buffer overflow (update beyond bounds)
– Unsafe pointer arithmetic
– References beyond null termination
– Unsafe library calls

• Handles full C
– Multi-level pointers, pointer arithmetic, structures, casting, …

• Applied to real programs
– Public domain software
– C code from Airbus

Page 18: Assume/Guarantee Reasoning using Abstract Interpretation

Plan

• Semantics for C program

• Contract language

• Static analysis algorithm

• Implementation

Page 19: Assume/Guarantee Reasoning using Abstract Interpretation

Standard C Semantics

void safe_cat(char *dst, int size, char *src)
{
    if (size > strlen(src) + strlen(dst)) {
        dst = dst + strlen(dst);
        strcpy(dst, src);
    }
}

[Figure: memory layout for a call to safe_cat: the variables dst, size, and src at addresses 0x480580, 0x480584, and 0x480588, with size holding 125 and dst and src pointing to null-terminated character buffers]

Page 20: Assume/Guarantee Reasoning using Abstract Interpretation

Instrumented C Semantics

[Figure: the same memory layout, with every location annotated with its base address and allocation size (base, asize)]

Page 21: Assume/Guarantee Reasoning using Abstract Interpretation

Instrumented C Semantics

[Figure: the same memory layout annotated with base, asize, and offset; for example, src holds 0x6000009, which is offset 9 from base address 0x6000000]

Page 22: Assume/Guarantee Reasoning using Abstract Interpretation

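The instrumented semantics checks the validity of C expressions (ANSI C cleanness)

Statement: dst = dst + i

Safety condition: offset(dst) + i ≤ asize(base(dst))

[Figure: a buffer starting at base(dst) with size asize(base(dst)); dst points at offset(dst) and is advanced by i]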
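A minimal sketch of the instrumented view of a pointer: each pointer carries its base, its offset, and the allocation size of the base, and pointer arithmetic is checked against them. The struct and function names are hypothetical, and the lower bound (the result must not fall before the base) is an assumption added here beyond the upper bound shown on the slide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Instrumented pointer: base(p), offset(p), and asize(base(p)). */
typedef struct {
    char *base;
    ptrdiff_t offset;
    ptrdiff_t asize;
} InstrPtr;

/* Safety condition for p + i: 0 <= offset(p) + i <= asize(base(p)).
 * Pointing one past the end is allowed, as in ANSI C. */
static bool add_is_safe(InstrPtr p, ptrdiff_t i) {
    ptrdiff_t target = p.offset + i;
    return target >= 0 && target <= p.asize;
}

/* Checked pointer arithmetic: aborts instead of producing a bad pointer. */
static InstrPtr instr_add(InstrPtr p, ptrdiff_t i) {
    assert(add_is_safe(p, i));
    p.offset += i;
    return p;
}
```

The static analysis aims to prove the same condition for all executions, so that no such runtime check is needed.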

Page 23: Assume/Guarantee Reasoning using Abstract Interpretation

Contracts

• Defined in the instrumented semantics

• Specify string behavior of procedures (C expressions)
– Precondition
– Postcondition: may use values at procedure entry
– Side effects: can be approximated from pointer information

• No need to specify pointer information
– Not aiming for modular pointer analysis

Page 24: Assume/Guarantee Reasoning using Abstract Interpretation

Contracts’ Advantages

• Modular analysis
– Use contracts on call statements
– Not all the code is available
– Enable more expensive analyses

• User control of the verification
– Detect errors at point of logical error
– Improve the precision of the analysis

• Check additional properties
– Beyond ANSI C

Page 25: Assume/Guarantee Reasoning using Abstract Interpretation

Example

char* strcpy(char* dst, char* src)
    requires ( string(src) ∧ alloc(dst) > len(src) )
    mod dst
    ensures ( len(dst) == [len(src)]pre ∧ return == [dst]pre )

Page 26: Assume/Guarantee Reasoning using Abstract Interpretation

safe_cat’s contract

void safe_cat(char* dst, int size, char* src)
    requires ( string(src) ∧ string(dst) ∧ alloc(dst) == size )
    mod dst
    ensures ( len(dst) <= [len(src)]pre + [len(dst)]pre ∧ len(dst) >= [len(dst)]pre )

Page 27: Assume/Guarantee Reasoning using Abstract Interpretation

Contracts and Soundness

• All errors are detected
– Violation of a statement's precondition (e.g., …a[i]…)
– Violation of a procedure's precondition (at a call)
– Violation of a procedure's postcondition (at a return)

• Violation messages depend on the contracts

• But may lead to more false alarms (e.g., trivial contracts)

Page 28: Assume/Guarantee Reasoning using Abstract Interpretation

CSSV Static Analysis

1. Inline contracts
• Expose behavior of called procedures

2. Pointer analysis (global)
• Find relationships between base addresses

3. Integer analysis
• Compute offset information

Page 29: Assume/Guarantee Reasoning using Abstract Interpretation

Step 1: Inliner

void safe_cat(char *dst, int size, char *src)
{ …
    strcpy(dst, src);
… }

void safe_cat(char *dst, int size, char *src)
    requires ( string(src) ∧ string(dst) ∧ alloc(dst) == size )
    mod dst
    ensures ( len(dst) == [len(src)]pre + [len(dst)]pre )

char* strcpy(char *dst, char *src)
    requires ( string(src) ∧ alloc(dst) > len(src) )
    mod dst
    ensures ( len(dst) == [len(src)]pre ∧ return == [dst]pre )

Page 30: Assume/Guarantee Reasoning using Abstract Interpretation

Step 1: Inliner

[Figure: the code and contracts of safe_cat and strcpy from the previous slide, with contract clauses labeled "assume" and "assert"]

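The effect of inlining contracts can be sketched as follows, under the usual assume/guarantee reading: inside the analyzed procedure, its own precondition is assumed and its postcondition asserted, while at a call the callee's precondition is asserted and its postcondition assumed. The macros, the is_string helper, and the explicit alloc_dst and alloc_src parameters are all hypothetical; CSSV works on the instrumented semantics rather than on extra parameters. In this runnable sketch the call to strcpy is still executed and assumptions are checked as assertions, whereas a static analysis would take them as given.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for string(s): s is null-terminated within
 * its alloc bytes. */
static bool is_string(const char *s, size_t alloc) {
    return memchr(s, '\0', alloc) != NULL;
}

#define ASSERT_CONTRACT(c) assert(c) /* obligation the analysis must prove */
#define ASSUME_CONTRACT(c) assert(c) /* fact the analysis may rely on      */

static void safe_cat_inlined(char *dst, size_t alloc_dst, int size,
                             const char *src, size_t alloc_src) {
    /* assume: precondition of safe_cat */
    ASSUME_CONTRACT(size >= 0 && is_string(src, alloc_src) &&
                    is_string(dst, alloc_dst) && alloc_dst == (size_t)size);
    size_t pre_len_src = strlen(src);
    size_t pre_len_dst = strlen(dst);

    if ((size_t)size > pre_len_src + pre_len_dst) {
        char *d = dst + pre_len_dst;
        size_t alloc_d = alloc_dst - pre_len_dst;
        /* assert: precondition of strcpy at the call site */
        ASSERT_CONTRACT(is_string(src, alloc_src) && alloc_d > pre_len_src);
        strcpy(d, src);
        /* assume: postcondition of strcpy */
        ASSUME_CONTRACT(strlen(d) == pre_len_src);
    }

    /* assert: postcondition of safe_cat */
    ASSERT_CONTRACT(strlen(dst) <= pre_len_src + pre_len_dst &&
                    strlen(dst) >= pre_len_dst);
}
```

Because every obligation sits next to the statement that creates it, a failed assertion points at the call or return where the contract is violated.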

Page 32: Assume/Guarantee Reasoning using Abstract Interpretation

Step 2: Compute Pointer Information

• Required for reasoning about pointers

• Every base address is abstracted by an abstract location

• Relationships between base addresses are computed (points-to)

• Global analysis
– Scalable
– Imprecise: flow insensitive and (almost) context insensitive

Page 33: Assume/Guarantee Reasoning using Abstract Interpretation

Global Points-To

main() {
    char s[10], t[20], r;
    char *p1, *p2;
    …
    p1 = r + i;
    safe_cat(s, 10, p1);
    p2 = r + j;
    safe_cat(t, 10, p2);
    …
}

safe_cat(char *dst, int size, char *src)
{ … strcpy(dst, src); … }

[Figure: global points-to graph over s, t, r, p1, p2, dst, and src]

Page 34: Assume/Guarantee Reasoning using Abstract Interpretation

Procedural Points-to (PPT)

• “Project” pointer information on visible variables of the procedure

• Introduce abstract locations for formal parameters

• Allow destructive updates through formal parameters (well-behaved programs)

• Can decrease precision in some procedures

Page 35: Assume/Guarantee Reasoning using Abstract Interpretation

PPT

[Figure: procedural points-to graph for safe_cat, with dst and src pointing to the abstract locations Param #1 and Param #2]

safe_cat(char *dst, int size, char *src)
{ … strcpy(dst, src); … }

Page 36: Assume/Guarantee Reasoning using Abstract Interpretation

Step 3: Static Analysis

• Prove linear inequalities on string indices

• Abstract string properties using constraint variables

• Use abstract interpretation to conservatively interpret program statements

• Verify safety preconditions

Page 37: Assume/Guarantee Reasoning using Abstract Interpretation

Back to Semantics

[Figure: the instrumented memory layout shown earlier, annotated with base, asize, and offset]

Page 38: Assume/Guarantee Reasoning using Abstract Interpretation

Abstract Representation

[Figure: the memory layout next to its abstract representation: abstract locations src, dst, size, n1, and n2, connected by the base address relationship]

Page 39: Assume/Guarantee Reasoning using Abstract Interpretation

Constraint Variables

• For every abstract location a: a.offset

Example: src.offset = 9

Page 40: Assume/Guarantee Reasoning using Abstract Interpretation

Constraint Variables

• For every integer abstract location a: a.val

Example: size.val = 125

Page 41: Assume/Guarantee Reasoning using Abstract Interpretation

Constraint Variables

• For every abstract location a: a.is_nullt, a.len, a.asize

[Figure: the buffer n1, with n1.len marking the position of the terminating 0 and n1.asize its allocated size]

Page 42: Assume/Guarantee Reasoning using Abstract Interpretation

Abstract Representation

[Figure: abstract locations src, dst, size, n1, and n2]

dst.offset < n1.len

size.val + dst.offset = n1.asize

n1.is_nullt = true

n2.is_nullt = true

Page 43: Assume/Guarantee Reasoning using Abstract Interpretation

What does it represent?

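[Figure: a concrete state represented by the abstract value: dst points into buffer n1 at dst.offset, before the terminating 0 at n1.len; the other buffer contents and the value of size are unknown (?)]

n1.is_nullt = true

dst.offset < n1.len

size.val + dst.offset = n1.asize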

Page 44: Assume/Guarantee Reasoning using Abstract Interpretation

Abstract Interpretation

dst.offset < n1.len

size.val = n1.asize - dst.offset

dst = dst + strlen(dst);

dst.offset = n1.len

size.val = n1.asize - dst.offset + n1.len

Page 45: Assume/Guarantee Reasoning using Abstract Interpretation

Verify Safety Condition

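dst = dst + i

Concrete semantics: offset(dst) + i ≤ asize(base(dst))

Abstract semantics: dst.offset + i.val ≤ n1.asize

[Figure: on the concrete side, dst points at offset(dst) within the buffer starting at base(dst) of size asize(base(dst)); on the abstract side, dst and i are related to the abstract location n1 through dst.offset and n1.asize]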

Page 46: Assume/Guarantee Reasoning using Abstract Interpretation

The Assume-Operation

• Use two copies of constraint variables

• Set modified values to ⊤

• Meet the post
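The last two steps can be sketched on a much simpler abstract domain. This example swaps in non-relational intervals for the polyhedra used by CSSV, so the "two copies" of constraint variables are handled by the caller: it evaluates the postcondition against the pre-state and passes the result in as post. All names here (Interval, AbsState, assume_post, LEN_DST, LEN_SRC) are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Interval abstract value; lo > hi encodes bottom (no concrete value). */
typedef struct { long lo, hi; } Interval;

static const Interval TOP = { LONG_MIN, LONG_MAX };

static Interval meet(Interval a, Interval b) {
    Interval r = { a.lo > b.lo ? a.lo : b.lo, a.hi < b.hi ? a.hi : b.hi };
    return r;
}

static bool is_bottom(Interval a) { return a.lo > a.hi; }

/* Two constraint variables, in the spirit of n.len for two buffers. */
enum { LEN_DST, LEN_SRC, NVARS };
typedef struct { Interval v[NVARS]; } AbsState;

/* Assume a callee's postcondition:
 *   1. set every modified variable to TOP (its old value is forgotten);
 *   2. meet the result with the postcondition. */
static AbsState assume_post(AbsState pre, const bool modified[NVARS],
                            AbsState post) {
    AbsState out = pre;
    for (int i = 0; i < NVARS; i++) {
        if (modified[i]) out.v[i] = TOP;
        out.v[i] = meet(out.v[i], post.v[i]);
    }
    return out;
}
```

For a strcpy-like contract with mod dst and ensures len(dst) == [len(src)]pre, the caller would set post.v[LEN_DST] to pre.v[LEN_SRC] and leave post.v[LEN_SRC] at TOP.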

Page 47: Assume/Guarantee Reasoning using Abstract Interpretation

CSSV Implementation

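[Figure: CSSV tool pipeline. Inputs: C files, contracts (Pre, Mod, Post), and a procedure name. Pointer Analysis produces the procedure's pointer information; the Inliner produces C' files; C2IP translates them into an integer procedure; Integer Analysis reports potential error messages.]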

Page 48: Assume/Guarantee Reasoning using Abstract Interpretation

Used Software

• ASToolKit [Microsoft]

• Core C [TAU - Greta Yorsh]

• GOLF [Microsoft - Manuvir Das]

• New Polka [Inria - Bertrand Jeannet]

Page 49: Assume/Guarantee Reasoning using Abstract Interpretation

Applications

• Verified string library from Airbus with 6 false alarms
– Could be avoided by analyzing correlated conditions

• Found 8 real errors in another string-intensive application, with 2 false alarms
– In one case safety depends on correctness
– Could be avoided by defensive programming

• 1 to 206 CPU seconds per procedure
– No optimizations

• Very few false alarms

Page 50: Assume/Guarantee Reasoning using Abstract Interpretation

Related Work

Non-Conservative

• Wagner et al. [NDSS’00]

• LCLint’s extension [USENIX’01]

• Eau Claire [IEEE Oakland 02]

Conservative

• Polyspace verifier

• Dor, Rodeh and Sagiv [SAS’01]

Page 51: Assume/Guarantee Reasoning using Abstract Interpretation

Further work

• Derive contracts

• Improve efficiency

• Interprocedural

Page 52: Assume/Guarantee Reasoning using Abstract Interpretation

CSSV: Summary

• Semantics
– Safety checking
– Full C
– Enables abstractions

• Contract language
– String behavior
– Omit pointer aliasing

• Procedural points-to
– Scalable
– Improve precision

• Static analysis
– Tracks important string properties
– Utilizes integer analysis

Page 53: Assume/Guarantee Reasoning using Abstract Interpretation

Foundation of A/G abstract interpretation

Greta Yorsh

www.cs.tau.ac.il/~gretay

Page 54: Assume/Guarantee Reasoning using Abstract Interpretation

Assume-Guarantee Reasoning using AI

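T bar();

void foo() {
    T p;
    ...
    p = bar();
    ...
}

Contracts: {prebar, postbar} for bar and {prefoo, postfoo} for foo

Operations performed while analyzing foo:

assume[prefoo];
assert[prebar];
-----------
assume[postbar];
assert[postfoo];

[Figure: the abstract values ⊤, a1, a2, a3, a4 computed along this sequence]

assert[φ](a): Is γ(a) ⊆ ⟦φ⟧?

assume[φ](a): α(γ(a) ∩ ⟦φ⟧), which is at least as precise as a ⊓ α(⟦φ⟧)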

Page 55: Assume/Guarantee Reasoning using Abstract Interpretation

Goals

• Generic algorithms for assert & assume

• Effective

• Efficient

• Allow natural specifications

• Rather precise verification

Page 56: Assume/Guarantee Reasoning using Abstract Interpretation

Motivation

• New approach to using symbolic techniques in abstract interpretation
– for shape analysis
– for other analyses

• What does it mean to harness a decision procedure for use in static analysis?
– What are the requirements?
– What does it buy us?

Page 57: Assume/Guarantee Reasoning using Abstract Interpretation

What are the requirements ?

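[Figure: three columns, Formulas, Concrete, and Abstract; an abstract value a corresponds to the set of concrete states γ(a) and to the formula γ̂(a)]

S ∈ γ(a) ⇔ S ⊧ γ̂(a)

The question "Is γ(a) empty?" reduces to "Is γ̂(a) satisfiable?"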

Page 58: Assume/Guarantee Reasoning using Abstract Interpretation

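[Figure: three columns, Formulas, Concrete, and Abstract. The concrete states [x↦0, y↦0, z↦0], [x↦0, y↦1, z↦0], [x↦0, y↦2, z↦0], … are represented by the abstract value [x↦0, y↦⊤, z↦0], whose formula is (x=0) ∧ (z=0)]

S ⊧ γ̂(a) ⇔ S ∈ γ(a)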

Page 59: Assume/Guarantee Reasoning using Abstract Interpretation

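[Figure: three columns, Formulas, Concrete Values, and Abstract Values. An abstract shape graph in which x points to node u1, followed by node u, is characterized by a formula of the form ∃v1,v2: node_u1(v1) ∧ node_u(v2) ∧ v1 ≠ v2 ∧ ∀v: node_u1(v) ∨ node_u(v) ∧ …]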

Page 60: Assume/Guarantee Reasoning using Abstract Interpretation

What does it buy us ?

• Guarantee the most precise result w.r.t. the abstraction
– best transformer
– other abstract operations

• Modular reasoning
– assume-guarantee reasoning
– scalability

Page 61: Assume/Guarantee Reasoning using Abstract Interpretation

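The assume[φ](a) Operation

[Figure: three columns, Formulas, Concrete, and Abstract. The abstract value a concretizes to γ(a); the states X of γ(a) that satisfy φ are abstracted again to give assume[φ](a)]

On concrete states: assume[φ](a) = α(⟦φ⟧ ∩ γ(a))

On formulas: assume[φ](a) = α̂(φ ∧ γ̂(a))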
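To make the definition assume[φ](a) = α(⟦φ⟧ ∩ γ(a)) tangible, here is a brute-force sketch over a tiny finite universe. It enumerates concrete states instead of calling a theorem prover, so it is only a stand-in for the symbolic algorithm of the talk; the universe 0..15, the interval domain, and all names are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>

enum { UNIVERSE = 16 };                  /* concrete states are 0..15 */

typedef struct { int lo, hi; } Interval; /* lo > hi encodes bottom    */
typedef bool (*Formula)(int);            /* phi as a predicate        */

/* Most precise assume: abstract exactly those states of gamma(a)
 * that satisfy phi. */
static Interval assume_best(Formula phi, Interval a) {
    Interval r = { 1, 0 };               /* bottom */
    bool empty = true;
    for (int s = 0; s < UNIVERSE; s++) {
        if (s < a.lo || s > a.hi) continue;   /* s is not in gamma(a) */
        if (!phi(s)) continue;                /* s does not satisfy phi */
        if (empty) {
            r.lo = r.hi = s;
            empty = false;
        } else {
            if (s < r.lo) r.lo = s;
            if (s > r.hi) r.hi = s;
        }
    }
    return r;
}

/* Example formula: s is even and at least 5. */
static bool even_and_at_least_5(int s) { return s % 2 == 0 && s >= 5; }
```

For a = [3, 11] and the example formula, the surviving states are 6, 8, and 10, so the most precise interval is [6, 10]; no interval computed without looking at the concrete states could be tighter.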

Page 62: Assume/Guarantee Reasoning using Abstract Interpretation

The abstraction operation α̂(φ)

[Figure: three columns, Formulas, Concrete, and Abstract, with abstract values a1 and a2]

Page 63: Assume/Guarantee Reasoning using Abstract Interpretation

Assume-Guarantee Reasoning using AI

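T bar();

void foo() {
    T p;
    ...
    p = bar();
    ...
}

Contracts: {prebar, postbar} for bar and {prefoo, postfoo} for foo

Operations performed while analyzing foo:

assume[prefoo];
assert[prebar];
-----------
assume[postbar];
assert[postfoo];

[Figure: the abstract values ⊤, a1, a2, a3, a4 computed along this sequence]

assert[φ](a): Is γ̂(a) ⇒ φ valid?

assume[φ](a): α̂(γ̂(a) ∧ φ)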

Page 64: Assume/Guarantee Reasoning using Abstract Interpretation

Computing α̂(φ)

[Figure: three columns, Formulas, Concrete, and Abstract, showing the abstract value a1 and the result ans]

Page 65: Assume/Guarantee Reasoning using Abstract Interpretation

3-Valued Logical Structures

• Relation meaning over {0, 1, ½}

• Kleene
– 1: True
– 0: False
– ½: Unknown

• A join semi-lattice: 0 ⊔ 1 = ½

[Figure: the semi-lattice with ½ above 0 and 1]
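Kleene's 3-valued connectives and the information-order join are small enough to write out. In this sketch the values 0, ½, and 1 are encoded as 0, 1, and 2 so that conjunction is a minimum and disjunction a maximum; the type and function names are hypothetical.

```c
#include <assert.h>

/* 0, 1/2, 1 scaled by two so that the truth order is the integer order. */
typedef enum { K_FALSE = 0, K_UNKNOWN = 1, K_TRUE = 2 } Kleene;

/* Kleene conjunction: minimum in the truth order. */
static Kleene k_and(Kleene a, Kleene b) { return a < b ? a : b; }

/* Kleene disjunction: maximum in the truth order. */
static Kleene k_or(Kleene a, Kleene b) { return a > b ? a : b; }

/* Kleene negation: swaps 0 and 1 and keeps 1/2. */
static Kleene k_not(Kleene a) { return (Kleene)(K_TRUE - a); }

/* Join in the information order: equal values stay, different ones
 * become unknown, so 0 join 1 = 1/2. */
static Kleene k_join(Kleene a, Kleene b) { return a == b ? a : K_UNKNOWN; }
```

Note the two different orders in play: k_and and k_or follow the truth order 0 < ½ < 1, while k_join follows the information order in which ½ sits above both definite values.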

Page 66: Assume/Guarantee Reasoning using Abstract Interpretation

Canonical Abstraction

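[Figure: a concrete list pointed to by x with nodes u1, u2, u3, u4, each labeled with its c and r_x values, and its canonical abstraction with nodes u1 and u2]

γ̂(a) ≜
∃v1,v2: node_u1(v1) ∧ node_u2(v2) ∧ ∀w: node_u1(w) ∨ node_u2(w)
∧ ∀w1,w2: node_u1(w1) ∧ node_u1(w2) ⇒ (w1 = w2) ∧ ¬n(w1,w2)
∧ ∀v: r_x(v) ⇔ ∃v1: x(v1) ∧ n*(v1,v)
∧ ∀v: c(v) ⇔ ∃v1: n(v,v1) ∧ n*(v1,v)
∧ ∀v1,v2: x(v1) ∧ x(v2) ⇒ v1 = v2
∧ ∀v,v1,v2: n(v,v1) ∧ n(v,v2) ⇒ v1 = v2

The conjuncts are labeled FO (first-order) or TC (transitive closure) on the slide.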

Page 67: Assume/Guarantee Reasoning using Abstract Interpretation

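y == x->n

φ ≜ ∀v1: y(v1) ↔ ∃v2: x(v2) ∧ n(v2, v1)

[Figure: three columns, Formulas, Concrete, and Abstract. Starting from ⊤, the computation of α̂(φ) produces abstract structures over the nodes u1, u2, and uy, in which x points to u1 and y may point to u2 or uy; the result is labeled ans]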

Page 68: Assume/Guarantee Reasoning using Abstract Interpretation

Example - Materialization

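y == x->n

[Figure: materialization. In the abstract structure with x pointing to u1 followed by u2, the value of y on u2 is unknown. The cases y(u2) = 0 and y(u2) = 1 are considered, and u2 is split into uy and u2 with y(uy) = 1 and y(u2) = 0]

Each candidate a is checked by asking: Is γ̂(a) satisfiable?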

Page 69: Assume/Guarantee Reasoning using Abstract Interpretation

Abstract Operations

• α̂(φ): best abstract value that represents φ

• What does it buy us?

• assume[φ](a) = α̂(γ̂(a) ∧ φ)
– assume-guarantee reasoning
– pre- and post-conditions specified by logical formulas

• BT(t, a) = α̂(γ̂(extend(a)) ∧ t)
– best abstract transformer
– parametric abstractions

• meet(a1, a2) = α̂(γ̂(a1) ∧ γ̂(a2))

Page 70: Assume/Guarantee Reasoning using Abstract Interpretation

SPASS Experience

• Handles arbitrary FO formulas

• Can diverge
– use timeout

• Converges in our examples
– Captures older shape analysis algorithms

• How to handle FOTC?
– Overapproximations lead to too many structures

Page 71: Assume/Guarantee Reasoning using Abstract Interpretation

Decidable Transitive-closure Logic

• Neil Immerman (UMASS), Alexander Rabinovich (TAU)

• ∃∀(TC,f) is a subset of FOTC
– exist-forall form
– arbitrary unary relations
– single function f

• Decidable for satisfiability
– NEXPTIME-complete

• Any “reasonable” extension is undecidable

• Rather limited

Page 72: Assume/Guarantee Reasoning using Abstract Interpretation

Simulation Technique – CAV’04

• Neil Immerman (UMASS), Alexander Rabinovich (TAU)

• Simulate realistic data structures using decidable logic over tractable structures
– Singly linked list: shared/cyclic/nested
– Doubly linked list
– Trees

• Preserved under mutations

• Abstract interpretation, Hoare-style verification

Page 73: Assume/Guarantee Reasoning using Abstract Interpretation

Further Work

• Implementation

• Decidable logic for shape analysis

• Assume-guarantee of “real” programs
– Case study: Java Collection (B. Livshits, Noam)
– Estimate side effects (A. Skidanov)
– Specification language
– Write procedure specifications

• Extend to other domains
– Infinite-height

• Tune the abstraction based on specification

Page 74: Assume/Guarantee Reasoning using Abstract Interpretation

Summary

• The A/G approach can scale program analysis/verification

• But requires some effort
– Language designers
– Programmers
– Abstract interpretation
– Efficient runtime testing