noam rinetzky lecture 9: abstract interpretation ii

Program Analysis and Verification

0368-4479http://www.cs.tau.ac.il/~maon/teaching/2013-2014/paav/paav1314b.html

Noam Rinetzky

Lecture 9: Abstract Interpretation II

Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav

http://www.cs.tau.ac.il/~maon/teaching/2013-2014/paav/paav1314b.html

From verification to analysis

• Manual program verification– Verifier provides assertions• Loop invariants

• Program analysis– Automatic program verification – Tool automatically synthesize assertions • Finds loop invariants

3

Abstract Interpretation [Cousot’77]

• Mathematical foundation of static analysis

4


• Mathematical framework for approximating semantics (aka abstraction)– Allows designing sound static analysis algorithms• Usually compute by iterating to a fixed-point

– Computes (loop) invariants• Can be interpreted as axiomatic verification assertions• Generalizes Hoare Logic & WP / SP calculus

5


• Mathematical foundation of static analysis– Abstract domains• Abstract states ~ Assertions• Join () ~ Weakening

– Transformer functions• Abstract steps ~ Axioms

– Chaotic iteration • Structured Programs ~ Control-flow graphs• Abstract computation ~ Loop invariants

6

Concrete Semantics

set of statesoperational semantics(concrete semantics)

statement Sset of states

7

Conservative Semantics

set of states set of statesoperational semantics(concrete semantics)


8

Abstract (conservative) interpretation


statement S

abstract representation abstract

representation abstract semanticsstatement S abstract

representation

abstraction abstraction

{P} S {Q} sp(S, P)generalizes axiomatic verification

9




abstract representation abstract semantics

statement S abstract representation

concretization concretization

10




abstract state abstract semantics (transfer function)

statement S abstract state


11


• Mathematical foundation of static analysis– Abstract domains• Abstract states ~ Assertions• Join () ~ Weakening

– Transformer functions• Abstract steps ~ Axioms

– Chaotic iteration • Abstract computation ~ Loop invariants• Structured Programs ~ Control-flow graphs

Lattices(D, , , , , )

Monotonic functions

Fixpoints

12

A taxonomy of semantic domain typesComplete Lattice(D, , , , , )

Lattice(D, , , , , )

Join semilattice(D, , , )

Meet semilattice(D, , , )

Complete partial order (CPO)(D, , )

Partial order (poset)(D, )

Preorder(D, )

13

Preorder• We say that a binary order relation over a

set D is a preorder if the following conditions hold for every d, d’, d’’ D– Reflexive: d d– Transitive: d d’ and d’ d’’ implies d d’’

14

Preorder

d d’

d’’

• We say that a binary order relation over a set D is a preorder if the following conditions hold for every d, d’, d’’ D– Reflexive: d d– Transitive: d d’ and d’ d’’ implies d d’’

Hasse Diagram

15

Preorder• We say that a binary order relation over a

set D is a preorder if the following conditions hold for every d, d’, d’’ D– Reflexive: d d– Transitive: d d’ and d’ d’’ implies d d’’

d d’

d’’

Hasse Diagram

16

Partial order• We say that a binary order relation over a

set D is a preorder if the following conditions hold for every d, d’, d’’ D– Reflexive: d d– Transitive: d d’ and d’ d’’ implies d d’’– Anti-symmetric: d d’ and d’ d implies d = d’

d d’

d’’

Hasse Diagram

17

Chains

• d d’ means d d’ and d d’

• An ascending chain is a sequencex1 x2 … xk …

• A descending chain is a sequencex1 x2 … xk …

• The height of a poset (D, ) is the length of the maximal ascending chain in D

18

poset Hasse diagram (for CP)

{x=0}

{x=-1}{x=-2} {x=1} {x=2} ……

19

Some posets-related terminology

• If x y (alt y ⊒x) we can say– x is lower than y – x is more precise than y – x is more concrete than y – x under-approximates y

– y is greater than x – y is less precise than x – y is more abstract than x – y over-approximates x

Least upper bound (LUB)

• (D, ) is a poset

• b D is an ∊ upper bound of A D if a ⊆ ∀ A: a b

• b D is ∊ the least upper bound of A D if ⊆– b is an upper bound of A – If b’ is an upper bound of A then b b’– Join: X = LUB of X• x y = {x, y}

May not exist

May not exist

21

Join operator

• Properties of a join operator– Commutative: x y = y x– Associative: (x y) z = x (y z)– Idempotent: x x = x

• A kind of abstract union (disjunction) operator

• Top element of (D, ) is = D

22

Join Example

{x=0}

{x=-1}{x=-2} {x=1} {x=2} ……

23

Join Example

{x=0}

{x=-1}{x=-2} {x=1} {x=2} ……

24

Join Example

{x=0}

{x=-1}{x=-2} {x=1} {x=2} ……

Greatest lower bound (GLB)

• (D, ) is a poset

• b D is an ∊ lower bound of A D if a ⊆ ∀ A: b a

• b D is ∊ the greatest lower bound of A D if ⊆– b is an lower bound of A – If b’ is an lower bound of A then b’ b– Meet: X = GLB of X• x y = {x, y}

May not exist

May not exist

26

Meet operator

• Properties of a meet operator– Commutative: x y = y x– Associative: (x y) z = x (y z)– Idempotent: x x = x

• A kind of abstract intersection (conjunction) operator

• Bottom element of (D, ) is = D

27

Complete partial order (CPO)

• A poset (D , ) is a complete partial if every ascending chain x1 x2 … xk … has a LUB

28

Meet Example

x=0

x0

x<0 x>0

x0

29

Meet Example

x=0

x0

x<0 x>0

x0

30

Meet Example

x=0

x0

x<0 x>0

x0

31

Complete partial order (CPO)

• A poset (D , ) is a complete partial if every ascending chain x1 x2 … xk … has a LUB

32

Join semilattices

• (D, , , ) is a join semilattice – (D, ) is a partial order – ∀X FIN D . X is defined – A top element

33

Meet semilattices

• (D, , , ) is a meet semilattice – (D, ) is a partial order – ∀X FIN D . X is defined – A bottom element

34

Lattices

• (D, , , , , ) is a lattice if– (D, , , ) is a join semilattice– (D, , , ) is a meet semilattice

• A lattice (D, , , , , ) is a complete lattice if– X and Y are defined for arbitrary sets

35

Example: Powerset lattices

• (2X, , , , , X) is the powerset lattice of X– A complete lattice

36

Example: Sign lattice

x=0

x0

x<0 x>0

x0

37

A taxonomy of semantic domain typesComplete Lattice(D, , , , , )

Lattice(D, , , , , )

Join semilattice(D, , , )

Meet semilattice(D, , , )

Join/Meet exist for every subset of DJoin/Meet exist for every finite

subset of D (alternatively, binary join/meet)

Join of the empty set Meet of the empty set

Complete partial order (CPO)(D, , )

Partial order (poset)(D, )

Preorder(D, )

reflexivetransitiveanti-symmetric: d d’ and d’ d implies d = d’

reflexive: d dtransitive: d d’, d’ d’’ implies d d’’

poset with LUB for all ascending chains

38

Towards a recipe for static analysis

39

Collecting semantics

• For a set of program states State, we define the collecting lattice

(2State, , , , , State)

• The collecting semantics accumulates the (possibly infinite) sets of states generated during the execution– Not computable in general

40




abstract representation abstract semantics

statement S abstract representation


41


{x 1, x 2, …}↦ ↦ {x 0, x 1, …}↦ ↦operational semantics(concrete semantics)

x=x-1{x 0, x 1, …}↦ ↦

0 < x abstract semanticsx = x -1

0 ≤ x


42


{x 1, x 2, …}↦ ↦ {…, x 0, …}↦operational semantics(concrete semantics)

x=x-1{x 0, x 1, …}↦ ↦



43

Abstract (non-conservative) interpretation

{x 1, x 2, …}↦ ↦ { x 1, …}↦operational semantics(concrete semantics)

x=x-1{x 0, x 1, …}↦ ↦ ⊈


0 < x


44

But …

• what if we have x & y?– Define lattice (semantics) for each variable– Compose lattices• Goal: compositional definition

• What if we have more than 1 statement?– Define semantics for entire program via CFG– Different “abstract states” at every CFG node

45

One lattice per variabletrue

false

x=0

x0

x<0 x>0

x0

true

false

y=0

y0

y<0 y>0

y0

How can we compose them?

46

Cartesian product of complete lattices

• For two complete lattices L1 = (D1, 1, 1, 1, 1, 1) L2 = (D2, 2, 2, 2, 2, 2)

• Define the posetLcart = (D1D2, cart, cart, cart, cart, cart)as follows:– (x1, x2) cart (y1, y2) iff x1 1 y1 x2 2 y2

– cart = ? cart = ? cart = ? cart = ?

• Lemma: L is a complete lattice• Define the Cartesian constructor Lcart = Cart(L1, L2)

47

Cartesian product example=(,)

x<0,y<0 x<0,y=0 x<0,y>0 x=0,y<0 x=0,y=0 x=0,y>0 x>0,y<0 x>0,y=0 x>0,y>0

x0,y<0 x0,y<0 x0,y=0 x0,y=0 x0,y>0 x0,y>0 x>0,y0 x>0,y0……

x0,y0 x0,y0 x0,y0x0,y0

x0 x0 y0 y0

…

(false, false)How does it represent

(x<0y<0) (x>0y>0)?

=(, )

48

Disjunctive completion• For a complete lattice

L = (D, , , , , )• Define the powerset lattice

L = (2D, , , , , ) = ? = ? = ? = ? = ?

• Lemma: L is a complete lattice

• L contains all subsets of D, which can be thought of as disjunctions of the corresponding predicates

• Define the disjunctive completion constructorL = Disj(L)

49

The base lattice CP

{x=0}

{x=-1}{x=-2} {x=1} {x=2} ……

50

The disjunctive completion of CP

{x=0}

true

{x=-1}{x=-2} {x=1} {x=2} ……

false

{x=-2x=-1} {x=-2x=0} {x=-2x=1} {x=1x=2}… … …

{x=0 x=1x=2}{x=-1 x=1x=-2}… ………

What is the height of this lattice?

51

Relational product of lattices

• L1 = (D1, 1, 1, 1, 1, 1)L2 = (D2, 2, 2, 2, 2, 2)

• Lrel = (2D1D2, rel, rel, rel, rel, rel)as follows:– Lrel = ?

52


• L1 = (D1, 1, 1, 1, 1, 1)L2 = (D2, 2, 2, 2, 2, 2)

• Lrel = (2D1D2, rel, rel, rel, rel, rel)as follows:– Lrel = Disj(Cart(L1, L2))

• Lemma: L is a complete lattice• What does it buy us?

53

Cartesian product exampletrue

false

x<0,y<0 x<0,y=0 x<0,y>0 x=0,y<0 x=0,y=0 x=0,y>0 x>0,y<0 x>0,y=0 x>0,y>0

x0,y<0 x0,y<0 x0,y=0 x0,y=0 x0,y>0 x0,y>0 x>0,y0 x>0,y0……

x0,y0 x0,y0 x0,y0x0,y0

x0 x0 y0 y0

…

How does it represent(x<0y<0) (x>0y>0)?


54

Relational product exampletrue

false

(x<0y<0)(x>0y>0)

x0 x0 y0 y0

How does it represent(x<0y<0) (x>0y>0)?

(x<0y<0)(x>0y=0) (x<0y0)(x<0y0)

…


Collecting semantics1 label0: if x <= 0 goto label1 x := x – 1 goto label0

label1:

234

5

if x > 0

x := x - 1

2

3

entry

exit

[x1]

[x1]

[x1]

[x0]

[x0]

[x-1]

[x-1]

[x2]

[x2]

[x2]

[x2]

[x3]

[x3]

[x3]

…

…

…55

[x-2]…

56

Defining the collecting semantics

• How should we represent the set of states at a given control-flow node by a lattice?

• How should we represent the sets of states at all control-flow nodes by a lattice?

57

Finite maps• For a complete lattice

L = (D, , , , , )and finite set V

• Define the posetLVL = (VD, VL, VL, VL, VL, VL)as follows:– f1 VL f2 iff for all vV

f1(v) f2(v)

– VL = ? VL = ? VL = ? VL = ?

• Lemma: L is a complete lattice• Define the map constructor LVL = Map(V, L)

58

The collecting lattice

• Lattice for a given control-flow node v: ?

• Lattice for entire control-flow graph with nodes V:

?• We will use this lattice as a baseline for static

analysis and define abstractions of its elements

59

The collecting lattice

• Lattice for a given control-flow node v: Lv=(2State, , , , , State)


LCFG = Map(V, Lv)• We will use this lattice as a baseline for static


60

Equational definition of the semantics• Define variables of type

set of states for each control-flow node

• Define constraints between them

if x > 0

x := x - 1

2

3

entry

exit

R[entry]

R[2]

R[3]R[exit]

61

Equational definition of the semantics• R[2] = R[entry] x:=x-1 R[3]• R[3] = R[2] {s | s(x) > 0}• R[exit] = R[2] {s | s(x) 0}• A system of recursive equations• How can we approximate it using what

we have learned so far?if x > 0

x := x - 1

2

3

entry

exit

R[entry]

R[2]

R[3]R[exit]

62

An abstract semantics• R[2] = R[entry] x:=x-1# R[3]• R[3] = R[2] {s | s(x) > 0}#

• R[exit] = R[2] {s | s(x) 0}#

• A system of recursive equations

if x > 0

x := x - 1

2

3

entry

exit

R[entry]

R[2]

R[3]R[exit]

Abstract transformer for x:=x-1

Abstract representationof {s | s(x) < 0}

63

Abstract interpretation via abstraction

set of states set of statescollecting semantics

statement S

abstract representationof sets of states

abstract

representationof sets of states abstract semantics

statement Sabstract

representationof sets of states

abstraction abstraction

{P} S {Q} sp(S, P)generalizes axiomatic verification

64

Abstract interpretation via concretization

set of states set of statescollecting semantics


abstract representationof sets of states

abstract semanticsstatement S abstract

representationof sets of states


{P} S {Q}

models(P) models(sp(S, P)) models(Q)

65

Required knowledge

Collecting semanticsAbstract semantics • Connection between collecting semantics and

abstract semantics• Algorithm to compute abstract semantics

66

The collecting lattice (sets of states)

• Lattice for a given control-flow node v: Lv=(2State, , , , , State)


LCFG = Map(V, Lv)• We will use this lattice as a baseline for static


Collecting semantics example1234

5

67

label0: if x <= 0 goto label1 x := x – 1 goto label0

label1:

if x > 0

x := x-1

entry

exit

2

3

[x1][x0][x-1][x2][x2][x3]…[x1][x0][x-1][x2][x2][x3]…

[x1][x3][x4]

[x2]

…[x-1][x-2]… [x0]

[x0][x1][x2][x3] …

Shorthand notation1234

5

68

label0: if x <= 0 goto label1 x := x – 1 goto label0

label1:

if x > 0

x := x-1

entry

exit

{xZ}

{xZ}

{x>0}{x0}

{x0}

2

3

69

Equational definition example• A vector of variables R[0, 1, 2, 3, 4]• R[0] = {xZ} // established input

R[1] = R[0] R[4]R[2] = R[1] {s | s(x) > 0}R[3] = R[1] {s | s(x) 0}R[4] = x:=x-1 R[2]

• A (recursive) system of equations

if x > 0

x := x-1

entry

exit

R[0]

R[1]

R[2]R[4]

R[3]

Semantic function for assume x>0

Semantic function for x:=x-1 lifted to sets of states

70

General definition• A vector of variables R[0, …, k] one per input/output of a node

– R[0] is for entry• For node n with multiple predecessors add equation

R[n] = {R[k] | k is a predecessor of n}• For an atomic operation node R[m] S R[n] add equation

R[n] = S R[m]• Treat true/false branches of conditions as atomic statements

assume bexpr and assume bexpr

if x > 0

x := x-1

entry

exit

R[0]

R[1]

R[2]R[4]

R[3]

71

Equation systems in general• Let L be a complete lattice (D, , , , , )• Let R be a vector of variables R[0, …, n] D … D• Let F be a vector of functions of the type

F[i] : R[0, …, n] R[0, …, n]• A system of equations

R[0] = f[0](R[0], …, R[n])…R[n] = f[n](R[0], …, R[n])

• In vector notation R = F(R)• Questions:

1. Does a solution always exist?

2. If so, is it unique?

3. If so, is it computable?

72

Equation systems in general• Let L be a complete lattice (D, , , , , )• Let R be a vector of variables R[0, …, n] D … D• Let F be a vector of functions of the type

F[i] : R[0, …, n] R[0, …, n]• A system of equations

R[0] = f[0](R[0], …, R[n])…R[n] = f[n](R[0], …, R[n])

• In vector notation R = F(R)• Questions:

1. Does a solution always exist?2. If so, is it unique?3. If so, is it computable?

73

Monotone functions

• Let L1=(D1, ) and L2=(D2, ) be two posets

• A function f : D1 D2 is monotone if for every pair x, y D1

x y implies f(x) f(y)• A special case: L1=L2=(D, )

f : D D

74

Important cases of monotonicity

• Join: f(X, Y) = X YProve it!

• For a set X and any function gF(X) = { g(x) | x X }Prove it!

• Notice that the collecting semantics function is defined in terms of– Join (set union)– Semantic function for atomic statements lifted to sets of

states

75

Extensive/reductive functions

• Let L=(D, ) be a poset• A function f : D D is extensive if for every

pair x D, we have that x f(x)• A function f : D D is reductive if for every

pair x D, we have that x f(x)

76

Fixed-points• L = (D, , , , , )• f : D D monotone• Fix(f) = { d | f(d) = d }• Red(f) = { d | f(d) d }• Ext(f) = { d | d f(d) }• Theorem [Tarski 1955]– lfp(f) = Fix(f) = Red(f) Fix(f)– gfp(f) = Fix(f) = Ext(f) Fix(f)

Red(f)

Ext(f)

Fix(f)

lfp

gfp

fn()

fn()

1.Does a solution always exist? Yes2. If so, is it unique? No, but it has least/greatest solutions3. If so, is it computable? Under some conditions…

77

Fixed point example for program• R[0] = {xZ}


if x>0

x := x-1

2

3

entry

exit

xZ

xZ

{x>0}{x<0}

if x>0

x := x-1

2

3

entry

exit

xZ

xZ

{x>0}{x<0}

F(d) : Fixed-point

=

d

78



if x>0

x := x-1

2

3

entry

exit

xZ

xZ

{x>0}{x<-5}

if x>0

x := x-1

2

3

entry

exit

xZ

xZ

{x>0}{x<0}

F(d) : pre Fixed-point

d

79



if x>0

x := x-1

2

3

entry

exit

xZ

xZ

{x>0}{x<9}

if x>0

x := x-1

2

3

entry

exit

xZ

xZ

{x>0}{x<0}

F(d) : post Fixed-point

d

80

Continuity and ACC condition

• Let L = (D, , , ) be a complete partial order– Every ascending chain has an upper bound

• A function f is continuous if for every increasing chain Y D*,

f(Y) = { f(y) | yY }• L satisfies the ascending chain condition (ACC)

if every ascending chain eventually stabilizes:d0 d1 … dn = dn+1 = …

81

Fixed-point theorem [Kleene]• Let L = (D, , , ) be a complete partial order and a continuous function f: D D

then

lfp(f) = nN fn()

• Lemma: Monotone functions on posets satisfying ACC are continuousProof:

1. Every ascending chain Y eventually stabilizes d0 d1 … dn = dn+1 = … hence dn is the least upper bound of {d0, d1, … , dn},thus f(Y) = f(dn)

2. From monotonicity of f f(d0) f(d1) … f(dn) = f(dn+1) = … Hence f(dn) is the least upper bound of {f(d0), f(d1), … , f(dn)},thus { f(y) | yY } = f(dn)

82

Resulting algorithm • Kleene’s fixed point theorem

gives a constructive method for computing the lfp

lfpfn()

f()f2()

…d := while f(d) d do d := d f(d)return d

Algorithm

lfp(f) = nN fn()Mathematical definition

83

Chaotic iteration• Input:

– A cpo L = (D, , , ) satisfying ACC– Ln = L L … L– A monotone function f : Dn Dn – A system of equations { X[i] | f(X) | 1 i n }

• Output: lfp(f)• A worklist-based algorithm

for i:=1 to n do X[i] := WL = {1,…,n}while WL do j := pop WL // choose index non-deterministically N := F[i](X) if N X[i] then X[i] := N add all the indexes that directly depend on i to WL (X[j] depends on X[i] if F[j] contains X[i])return X

84

Chaotic iteration for static analysis• Specialize chaotic iteration for programs• Create a CFG for program• Choose a cpo of properties for the static analysis to infer: L

= (D, , , )• Define variables R[0,…,n] for input/output of each CFG node

such that R[i] D• For each node v let vout be the variable at the output of that

node:vout = F[v]( u | (u,v) is a CFG edge)– Make sure each F[v] is monotone

• Variable dependence determined by outgoing edges in CFG

85

Constant propagation example

x := 4

if (*)

assume y5

assume y=5

z := x

x := 4

entry

exit

x := 4;while (y5) do z := x; x := 4

86

Constant propagation lattice• For each variable x define L as

• For a set of program variables Var=x1,…,xn

Ln = L L … L

0-1-2 1 2 ......

no information

not-a-constant

87

Write down variables

x := 4

if (*)

assume y5

assume y=5

z := x

x := 4

entry

exit

x := 4;while (y5) do z := x; x := 4

88

Write down equations

x := 4

if (*)

assume y5

assume y=5

z := x

x := 4

entry

exit

R2R2R2

R3

R4

R6

R1

R5

R0x := 4;while (y5) do z := x; x := 4

89

Collecting semantics equations

x := 4

if (*)

assume y5

assume y=5

z := x

x := 4

entry

exit

R2R2R2

R3

R4

R6

R0 = StateR1 = x:=4 R0

R2 = R1 R5

R3 = assume y5 R2

R4 = z:=x R3

R5 = x:=4 R4

R6 = assume y=5 R2

R1

R5

R0

90

Constant propagation equations

x := 4

if (*)

assume y5

assume y=5

z := x

x := 4

entry

exit

R2R2R2

R3

R4

R6

R0 = R1 = x:=4# R0

R2 = R1 R5

R3 = assume y5# R2

R4 = z:=x# R3

R5 = x:=4# R4

R6 = assume y=5# R2

R1

R5

R0

abstract transformer

91

Abstract operations for CPR0 = R1 = x:=4# R0

R2 = R1 R5

R3 = assume y5# R2

R4 = z:=x# R3

R5 = x:=4# R4

R6 = assume y=5# R2

Lattice elements have the form: (vx, vy, vz)x:=4# (vx,vy,vz) = (4, vy, vz)z:=x# (vx,vy,vz) = (vx, vy, vx)assume y5# (vx,vy,vz) = (vx, vy, vx)assume y=5# (vx,vy,vz) = if vy=k5 then (, , ) else (vx, 5, vz)R1 R5 = (a1, b1, c1) (a5, b5, c5) = (a1a5, b1b5, c1c5)

0-1-2 1 2 ......

CP lattice for a single variable

92

Chaotic iteration for CP: initialization

x := 4

if (*)

assume y5

assume y=5

z := x

x := 4

entry

exit

R2=(, , )R2R2

R3=(, , )

R4=(, , )

R6=(, , )

R0 = R1 = x:=4# R0

R2 = R1 R5

R3 = assume y5# R2

R4 = z:=x# R3

R5 = x:=4# R4

R6 = assume y=5# R2

R1=(, , )

R5=(, , )

R0=(, , )

WL = {R0, R1, R2, R3, R4, R5, R6}

93

Chaotic iteration for CP

x := 4

if (*)

assume y5

assume y=5

z := x

x := 4

entry

exit

R2=(, , )R2R2

R3=(, , )

R4=(, , )

R6=(, , )

R0 = R1 = x:=4# R0

R2 = R1 R5

R3 = assume y5# R2

R4 = z:=x# R3

R5 = x:=4# R4

R6 = assume y=5# R2

R1=(, , )

R5=(, , )

R0=(, , )

WL = {R1, R2, R3, R4, R5, R6}

94


x := 4

if (*)

assume y5

assume y=5

z := x

x := 4

entry

exit

R2=(, , )R2R2

R3=(, , )

R4=(, , )

R6=(, , )

R0 = R1 = x:=4# R0

R2 = R1 R5

R3 = assume y5# R2

R4 = z:=x# R3

R5 = x:=4# R4

R6 = assume y=5# R2

R1=(4, , )

R5=(, , )

R0=(, , )

WL = {R2, R3, R4, R5, R6}

95


x := 4

if (*)

assume y5

assume y=5

z := x

x := 4

entry

exit

R2=(, , )R2R2

R3=(, , )

R4=(, , )

R6=(, , )

R0 = R1 = x:=4# R0

R2 = R1 R5

R3 = assume y5# R2

R4 = z:=x# R3

R5 = x:=4# R4

R6 = assume y=5# R2

R1=(4, , )

R5=(, , )

R0=(, , )

0-1-2 1 2 ......

3 4

WL = {R2, R3, R4, R5, R6}

96


x := 4

if (*)

assume y5

assume y=5

z := x

x := 4

entry

exit

R2=(4, , )R2R2

R3=(, , )

R4=(, , )

R6=(, , )

R0 = R1 = x:=4# R0

R2 = R1 R5

R3 = assume y5# R2

R4 = z:=x# R3

R5 = x:=4# R4

R6 = assume y=5# R2

R1=(4, , )

R5=(, , )

R0=(, , )

0-1-2 1 2 ......

3 4

WL = {R2, R3, R4, R5, R6}

97


x := 4

if (*)

assume y5

assume y=5

z := x

x := 4

entry

exit

R2=(4, , )R2R2

R3=(, , )

R4=(, , )

R6=(, , )

R0 = R1 = x:=4# R0

R2 = R1 R5

R3 = assume y5# R2

R4 = z:=x# R3

R5 = x:=4# R4

R6 = assume y=5# R2

R1=(4, , )

R5=(, , )

R0=(, , )

WL = {R3, R4, R5, R6}

98


x := 4

if (*)

assume y5

assume y=5

z := x

x := 4

entry

exit

R2R2

R3=(4, , )

R4=(, , )

R6=(, , )

R0 = R1 = x:=4# R0

R2 = R1 R5

R3 = assume y5# R2

R4 = z:=x# R3

R5 = x:=4# R4

R6 = assume y=5# R2

R5=(, , )

R1=(4, , )

R0=(, , )

R2=(4, , )

WL = {R4, R5, R6}

99


x := 4

if (*)

assume y5

assume y=5

z := x

x := 4

entry

exit

R2R2

R3=(4, , )

R4=(4, , 4)

R6=(, , )

R0 = R1 = x:=4# R0

R2 = R1 R5

R3 = assume y5# R2

R4 = z:=x# R3

R5 = x:=4# R4

R6 = assume y=5# R2

R5=(, , )

R1=(4, , )

R0=(, , )

R2=(4, , )

WL = {R5, R6}

100


x := 4

if (*)

assume y5

assume y=5

z := x

x := 4

entry

exit

R2R2

R6=(, , )

R0 = R1 = x:=4# R0

R2 = R1 R5

R3 = assume y5# R2

R4 = z:=x# R3

R5 = x:=4# R4

R6 = assume y=5# R2

R5=(4, , 4)

R4=(4, , 4)

R3=(4, , )

R1=(4, , )

R0=(, , )

R2=(4, , )R2=(4, , )

WL = {R2, R6}

added R2 back to worklist since it depends on R5

101

Chaotic iteration for CPR0 = R1 = x:=4# R0

R2 = R1 R5

R3 = assume y5# R2

R4 = z:=x# R3

R5 = x:=4# R4

R6 = assume y=5# R2

x := 4

if (*)

assume y5

assume y=5

z := x

x := 4

entry

exit

R2R2

R6=(, , )

R5=(4, , 4)

R4=(4, , 4)

R3=(4, , )

R1=(4, , )

R0=(, , )

R2=(4, , )

WL = {R6}

102

Chaotic iteration for CPR0 = R1 = x:=4# R0

R2 = R1 R5

R3 = assume y5# R2

R4 = z:=x# R3

R5 = x:=4# R4

R6 = assume y=5# R2

x := 4

if (*)

assume y5

assume y=5

z := x

x := 4

entry

exit

R2R2

R6=(4, 5, )

R5=(4, , 4)

R4=(4, , 4)

R3=(4, , )

R1=(4, , )

R0=(, , )

R2=(4, , )

Fixed-point

WL = {}

In practice maintaina worklist of nodes

103

Chaotic iteration for static analysis• Specialize chaotic iteration for programs• Create a CFG for program• Choose a cpo of properties for the static analysis to infer: L

= (D, , , )• Define variables R[0,…,n] for input/output of each CFG node

such that R[i] D• For each node v let vout be the variable at the output of that

node:vout = F[v]( u | (u,v) is a CFG edge)– Make sure each F[v] is monotone

• Variable dependence determined by outgoing edges in CFG

104

Complexity of chaotic iteration• Parameters:

– n the number of CFG nodes– k is the maximum in-degree of edges – Height h of lattice L– c is the maximum cost of

• Applying Fv

• • Checking fixed-point condition for lattice L

• Complexity: O(n h c k)• Incremental (worklist) algorithm reduces the n factor in factor

– Implement worklist by priority queue and order nodes by reversed topological order

105

Required knowledge

Collecting semanticsAbstract semantics (over lattices)Algorithm to compute abstract semantics

(chaotic iteration)• Connection between collecting semantics and

abstract semantics• Abstract transformers

106

Recap• We defined a reference semantics –

the collecting semantics• We defined an abstract semantics for a given lattice and

abstract transformers• We defined an algorithm to compute abstract least fixed-point

when transformers are monotone and lattice obeys ACC• Questions:1. What is the connection between the two least fixed-points?2. Transformer monotonicity is required for termination – what

should we require for correctness?

107

Recap• We defined a reference semantics –

the collecting semantics• We defined an abstract semantics for a given lattice and

abstract transformers• We defined an algorithm to compute abstract least fixed-point

when transformers are monotone and lattice obeys ACC• Questions:1. What is the connection between the two least fixed-points?2. Transformer monotonicity is required for termination – what

should we require for correctness?

108

Galois Connection• Given two complete lattices

C = (DC, C, C, C, C, C) – concrete domainA = (DA, A, A, A, A, A) – abstract domain

• A Galois Connection (GC) is quadruple (C, , , A)that relates C and A via the monotone functions– The abstraction function : DC DA

– The concretization function : DA DC

• for every concrete element c DC

and abstract element a DA

( (a)) a and c ( (c))• Alternatively (c) a iff c (a)

109

Galois Connection: c ((c))

1

c2(c)

3((c))

The most precise (least) element in A representing c

C A

110

Galois Connection: ((a)) a

1

3((a))

2(a) a

C AWhat a represents in C(its meaning)

111

Example: lattice of equalities

• Concrete lattice:C = (2State, , , , , State)

• Abstract lattice:EQ = { x=y | x, y Var}A = (2EQ, , , , EQ , )– Treat elements of A as both formulas and sets of constraints

• Useful for copy propagation – a compiler optimization•

(X) = ?(Y) = ?

112

Example: lattice of equalities

• Concrete lattice:C = (2State, , , , , State)

• Abstract lattice:EQ = { x=y | x, y Var}A = (2EQ, , , , EQ , )– Treat elements of A as both formulas and sets of constraints

• Useful for copy propagation – a compiler optimization• (s) = ({s}) = { x=y | s x = s y} that is s x=y

(X) = {(s) | sX} = A {(s) | sX} (Y) = { s | s Y } = models(Y)

113

Galois Connection: c ((c))

1

[x5, y5, z5]

2

x=x, y=y, z=z,x=y, y=x,x=z, z=x,y=z, z=y

3

…[x6, y6, z6][x5, y5, z5][x4, y4, z4]

…

4

x=x, y=y, z=z

The most precise (least) element in A representing [x5, y5, z5]

C A

114

Most precise abstract representation

1c

5

C A

46

2

7 3 8

9

(c)

(c) = {c’ | c (c’)}

115

Most precise abstract representation

1c

5

C A

46

2

7 3 8

9

(c)= x=x, y=y, z=z,x=y, y=x,x=z, z=x,y=z, z=y

(c) = {c’ | c (c’)}

[x5, y5, z5]

x=y, y=zx=y, z=y

x=y

116

Galois Connection: ((a)) a

1

3x=x, y=y, z=z,x=y, y=x,x=z, z=x,y=z, z=y

2

…[x6, y6, z6][x5, y5, z5][x4, y4, z4]

…

x=y, y=z

What a represents in C(its meaning)

is called a semanticreduction

C A

117

Galois Insertion a: ((a))=a

1x=x, y=y, z=z,x=y, y=x,x=z, z=x,y=z, z=y

2

…[x6, y6, z6][x5, y5, z5][x4, y4, z4]

…

C AHow can we obtain a Galois Insertion

from a Galois Connection?

All elementsare reduced

118

Properties of a Galois Connection

• The abstraction and concretization functions uniquely determine each other:

(a) = {c | (c) a}(c) = {a | c (a)}

119

Abstracting (disjunctive) sets

• It is usually convenient to first define the abstraction of single elements

(s) = ({s}) • Then lift the abstraction to sets of elements

(X) = A {(s) | sX}

120

The case of symbolic domains• An important class of abstract domains are symbolic domains –

domains of formulas• C = (2State, , , , , State)

A = (DA, A, A, A, A, A)• If DA is a set of formulas then the abstraction of a state is defined

as (s) = ({s}) = A{ | s }

the least formula from DA that s satisfies• The abstraction of a set of states is

(X) = A { (s) | s X}• The concretization is

( ) = { s | s } = models( )

121

Inducing along the connections

• Assume the complete latticesC = (DC, C, C, C, C, C) A = (DA, A, A, A, A, A)M = (DM, M, M, M, M, M)andGalois connectionsGCC,A=(C, C,A, A,C, A) and GCA,M=(A, A,M, M,A, M)

• Lemma: both connections induce theGCC,M= (C, C,M, M,C, M)

defined by C,M = C,A A,M and M,C = M,A A,C

122

Inducing along the connections

1 C,A

A,C

c 2C,A(c)

5

C A

3

M

A,M

4

M,A

c’a’

=A,M(C,A(c))

123

Sound abstract transformer

• Given two latticesC = (DC, C, C, C, C, C)A = (DA, A, A, A, A, A)and GCC,A=(C, , , A) with

• A concrete transformer f : DC DC

an abstract transformer f# : DA DA

• We say that f# is a sound transformer (w.r.t. f) if• c: f(c)=c’ (f#(c)) (c’)• For every a and a’ such that

(f((a))) A f#(a)

124

Transformer soundness condition 1

1 2

C A

f 34f#

5

c: f(c)=c’ (f#(c)) (c’)

125

Transformer soundness condition 2

C A

1 2

f#35f

4

a: f#(a)=a’ f((a)) (a’)

126

Best (induced) transformer

C A

23f

f#(a)= (f((a)))

1 f#

4

Problem: incomputable directly

127

Best abstract transformer [CC’77]

• Best in terms of precision– Most precise abstract transformer– May be too expensive to compute

• Constructively defined asf# = f

– Induced by the GC• Not directly computable because first step is

concretization• We often compromise for a “good enough” transformer– Useful tool: partial concretization

128

Transformer example

• C = (2State, , , , , State)• EQ = { x=y | x, y Var}

A = (2EQ, , , , EQ , )• (s) = ({s}) = { x=y | s x = s y} that is s x=y

(X) = {(s) | sX} = A {(s) | sX} () = { s | s } = models()

• Concrete: x:=y X = { s[xs y] | sX}• Abstract: x:=y# X = ?

129

Developing a transformer for EQ - 1

• Input has the form X = {a=b}• sp(x:=expr, ) = v. x=expr[v/x] [v/x]

• sp(x:=y, X) = v. x=y[v/x] {a=b}[v/x] = …• Let’s define helper notations:– EQ(X, y) = {y=a, b=y X}• Subset of equalities containing y

– EQc(X, y) = X \ EQ(X, y)• Subset of equalities not containing y

130


• sp(x:=y, X) = v. x=y[v/x] {a=b}[v/x] = …• Two cases– x is y: sp(x:=y, X) = X – x is different from y:

sp(x:=y, X) = v. x=y EQ)X, x)[v/x] EQc(X, x)[v/x] = x=y EQc(X, x) v. EQ)X, x)[v/x] x=y EQc(X, x)

• Vanilla transformer: x:=y#1 X = x=y EQc(X, x)

• Example: x:=y#1 {x=p, q=x, m=n} = {x=y, m=n}Is this the most precise result?

131


• x:=y#1 {x=p, x=q, m=n} = {x=y, m=n} {x=y, m=n, p=q}– Where does the information p=q come from?

• sp(x:=y, X) =x=y EQc(X, x) v. EQ)X, x)[v/x]

• v. EQ)X, x)[v/x] holds possible equalities between different a’s and b’s – how can we account for that?

132


• Define a reduction operator:Explicate(X) = if exist {a=b, b=c}X

but not {a=c} X then Explicate(X{a=c})elseX

• Define x:=y#2 = x:=y#1 Explicate• x:=y#2){x=p, x=q, m=n}) = {x=y, m=n, p=q}

is this the best transformer?

133


• x:=y#2 ){y=z}) = {x=y, y=z} {x=y, y=z, x=z}• Idea: apply reduction operator again after the vanilla

transformer• x:=y#3 = Explicate x:=y#1 Explicate • Observation : after the first time we apply Explicate, all

subsequent values will be in the image of the abstraction so really we only need to apply it once to the input

• Finally: x:=y#(X) = Explicate x:=y#1

– Best transformer for reduced elements (elements in the image of the abstraction)

134

Negative property of best transformers

• Let f# = f • Best transformer does not compose

(f(f((a)))) f#(f#(a))

135

(f(f((a)))) f#(f#(a))

C A

23f

1 f#

6

54

f

7

f#

8

9

f

136

Soundness theorem 1

1. Given two complete latticesC = (DC, C, C, C, C, C)A = (DA, A, A, A, A, A)and GCC,A=(C, , , A) with

2. Monotone concrete transformer f : DC DC

3. Monotone abstract transformer f# : DA DA

4. a DA : f( (a)) (f#(a))Then

lfp(f) (lfp(f#)) (lfp(f)) lfp(f#)

137

Soundness theorem 1

C A

f

fn …

lpf(f)

f2 f3

f#n

…

lpf(f#)

f#2 f#3

f#

aDA : f((a)) (f#(a)) aDA : fn((a)) (f#n(a)) aDA : lfp(fn)((a)) (lfp(f#n)(a)) lfp(f) lfp(f#)

138

Soundness theorem 2

1. Given two complete latticesC = (DC, C, C, C, C, C)A = (DA, A, A, A, A, A)and GCC,A=(C, , , A) with

2. Monotone concrete transformer f : DC DC

3. Monotone abstract transformer f# : DA DA

4. c DC : (f(c)) f#( (c))Then

(lfp(f)) lfp(f#)lfp(f) (lfp(f#))

139

Soundness theorem 2

C A

f

fn …

lpf(f)

f2 f3

f#n

…

lpf(f#)

f#2 f#3

f#

c DC : (f(c)) f#((c)) c DC : (fn(c)) f#n((c)) c DC : (lfp(f)(c)) lfp(f#)((c)) lfp(f) lfp(f#)

140

A recipe for a sound static analysis

• Define an “appropriate” operational semantics• Define “collecting” structural operational semantics • Establish a Galois connection between collecting

states and abstract states• Local correctness: show that the abstract

interpretation of every atomic statement is soundw.r.t. the collecting semantics

• Global correctness: conclude that the analysis is sound

141

Completeness

• Local property:– forward complete: c: (f#(c)) = (f(c))– backward complete: a: f((a)) = (f#(a))

• A property of domain and the (best) transformer• Global property:– (lfp(f)) = lfp(f#)– lfp(f) = (lfp(f#))

• Very ideal but usually not possible unless we change the program model (apply strong abstraction) and/or aim for very simple properties

142

Forward complete transformer

1 2

C A

f 34

f#

c: (f#(c)) = (f(c))

143

Backward complete transformer

C A

1 2

f#35f

a: f((a)) = (f#(a))

144

Global (backward) completeness

C A

f

fn …

lpf(f)

f2 f3

f#n

…

lpf(f#)

f#2 f#3

f#

a: f((a)) = (f#(a)) a: fn((a)) = (f#n(a)) aDA : lfp(fn)((a)) = (lfp(f#n)(a)) lfp(f) = lfp(f#)

145

Global (forward) completeness

C A

f

fn …

lpf(f)

f2 f3

f#n

…

lpf(f#)

f#2 f#3

f#

c DC : (f(c)) = f#((c)) c DC : (fn(c)) = f#n((c)) c DC : (lfp(f)(c)) = lfp(f#)((c)) lfp(f) = lfp(f#)

146

Three example analyses

• Abstract states are conjunctions of constraints• Variable Equalities– VE-factoids = { x=y | x, y Var} false

VE = (2VE-factoids, , , , false, )• Constant Propagation– CP-factoids = { x=c | x Var, c Z} false

CP = (2CP-factoids, , , , false, )• Available Expressions– AE-factoids = { x=y+z | x Var, y,z VarZ} false

A = (2AE-factoids, , , , false, )

147

Lattice combinators reminder

• Cartesian Product– L1 = (D1, 1, 1, 1, 1, 1)

L2 = (D2, 2, 2, 2, 2, 2)– Cart(L1, L2) = (D1 D2, cart, cart, cart, cart, cart)

• Disjunctive completion– L = (D, , , , , )– Disj(L) = (2D, , , , , )

• Relational Product– Rel(L1, L2) = Disj(Cart(L1, L2))

148

Cartesian product of complete lattices


• Define the posetLcart = (D1 D2, cart, cart, cart, cart, cart)as follows:– (x1, x2) cart (y1, y2) iff

x1 1 y1 andx2 2 y2

– cart = ? cart = ? cart = ? cart = ?

• Lemma: L is a complete lattice• Define the Cartesian constructor Lcart = Cart(L1, L2)

149

Cartesian product of GCs

• GCC,A=(C, C,A, A,C, A)GCC,B=(C, C,B, B,C, B)

• Cartesian ProductGCC,AB = (C, C,AB, AB,C, AB)

– C,AB(X)= ?– AB,C(Y) = ?

150

Cartesian product of GCs



– C,AB(X) = (C,A(X), C,B(X))– AB,C(Y) = A,C(X) B,C(X)

• What about transformers?

151

Cartesian product transformers

• GCC,A=(C, C,A, A,C, A) FA[st] : A AGCC,B=(C, C,B, B,C, B) FB[st] : B B



• How should we define FAB[st] : AB AB

152

Cartesian product transformers

• GCC,A=(C, C,A, A,C, A) FA[st] : A AGCC,B=(C, C,B, B,C, B) FB[st] : B B



• How should we define FAB[st] : AB AB• Idea: FAB[st](a, b) = (FA[st] a, FB[st] b)• Are component-wise transformers precise?

153

Cartesian product analysis example• Abstract interpreter 1: Constant Propagation• Abstract interpreter 2: Variable Equalities• Let’s compare

– Running them separately and combining results– Running the analysis with their Cartesian product

a := 9;b := 9;c := a;

a := 9;b := 9;c := a;

CP analysis VE analysis{a=9}{a=9, b=9}{a=9, b=9, c=9}

{}{}{c=a}

154



CP analysis + VE analysisa := 9;b := 9;c := a;

{a=9}{a=9, b=9}{a=9, b=9, c=9, c=a}

155



CPVE analysisMissing

{a=b, b=c}

a := 9;b := 9;c := a;

{a=9}{a=9, b=9}{a=9, b=9, c=9, c=a}

156

Transformers for Cartesian product

• Naïve (component-wise) transformers do not utilize information from both components– Same as running analyses separately and then

combining results• Can we treat transformers from each analysis

as black box and obtain best transformer for their combination?

157

Can we combine transformer modularly?

• No generic method for any abstract interpretations

158

Reducing values for CPVE

• X = set of CP constraints of the form x=c(e.g., a=9)

• Y = set of VE constraints of the form x=y• ReduceCPVE(X, Y) = (X’, Y’) such that

(X’, Y’) (X’, Y’)• Ideas?

159

Reducing values for CPVE• X = set of CP constraints of the form x=c

(e.g., a=9)• Y = set of VE constraints of the form x=y• ReduceCPVE(X, Y) = (X’, Y’) such that

(X’, Y’) (X’, Y’)• ReduceRight:

– if a=b X and a=c Y then add b=c to Y• ReduceLeft:

– If a=c and b=c Y then add a=b to X• Keep applying ReduceLeft and ReduceRight and reductions

on each domain separately until reaching a fixed-point

160


• Do we get the best transformer by applying component-wise transformer followed by reduction?– Unfortunately, no (what’s the intuition?)– Can we do better?– Logical Product [Gulwani and Tiwari, PLDI 2006]

161

Product vs. reduced productCPVE lattice

{a=9}{c=a} {c=9}{c=a}

{a=9, c=9}{c=a}{[a9, c 9]}

collecting lattice

{}

162

Reduced product


• Define the reduced posetD1D2 = {(d1,d2)D1D2 | (d1,d2) = (d1,d2) } L1L2 = (D1D2, cart, cart, cart, cart, cart)

163


• Do we get the best transformer by applying component-wise transformer followed by reduction?– Unfortunately, no (what’s the intuition?)– Can we do better?– Logical Product [Gulwani and Tiwari, PLDI 2006]

165

Logical product--

• Assume A=(D,…) is an abstract domain that supports two operations: for xD– inferEqualities(x) = { a=b | (x) a=b }

returns a set of equalities between variables that are satisfied in all states given by x

– refineFromEqualities(x, {a=b}) = ysuch that• (x)=(y)• y x

166


• Input has the form X = {a=b}• sp(x:=expr, ) = v. x=expr[v/x] [v/x]

• sp(x:=y, X) = v. x=y[v/x] {a=b}[v/x] = …• Let’s define helper notations:– EQ(X, y) = {y=a, b=y X}• Subset of equalities containing y

– EQc(X, y) = X \ EQ(X, y)• Subset of equalities not containing y

167


• sp(x:=y, X) = v. x=y[v/x] {a=b}[v/x] = …• Two cases

– x is y: sp(x:=y, X) = X – x is different from y:

sp(x:=y, X) = v. x=y EQ)X, x)[v/x] EQc(X, x)[v/x] = x=y EQc(X, x) v. EQ)X, x)[v/x] x=y EQc(X, x)

• Vanilla transformer: x:=y#1 X = x=y EQc(X, x)

• Example: x:=y#1 {x=p, q=x, m=n} = {x=y, m=n}Is this the most precise result?

168


• x:=y#1 {x=p, x=q, m=n} = {x=y, m=n} {x=y, m=n, p=q}– Where does the information p=q come from?

• sp(x:=y, X) =x=y EQc(X, x) v. EQ)X, x)[v/x]

• v. EQ)X, x)[v/x] holds possible equalities between different a’s and b’s – how can we account for that?

169


• Define a reduction operator:Explicate(X) = if exist {a=b, b=c}X

but not {a=c} X then Explicate(X{a=c})elseX

• Define x:=y#2 = x:=y#1 Explicate• x:=y#2){x=p, x=q, m=n}) = {x=y, m=n, p=q}

is this the best transformer?

170


• x:=y#2 ){y=z}) = {x=y, y=z} {x=y, y=z, x=z}• Idea: apply reduction operator again after the

vanilla transformer• x:=y#3 = Explicate x:=y#1 Explicate

171

Logical Product-

basically the strongest postcondition

safely abstracting the existential quantifier

172

Abstracting the existentialReduce the pair

Abstract away existential quantifier for each domain

173

Example

174

Information loss exampleif (…) b := 5else b := -5

if (b>0) b := b-5else b := b+5assert b==0

{}{b=5}

{b=-5}{b=}

{b=}

{b=}can’t prove

175

Disjunctive completion of a lattice• For a complete lattice

L = (D, , , , , )• Define the powerset lattice

L = (2D, , , , , ) = ? = ? = ? = ? = ?

• Lemma: L is a complete lattice

• L contains all subsets of D, which can be thought of as disjunctions of the corresponding predicates

• Define the disjunctive completion constructorL = Disj(L)

176

Disjunctive completion for GCs


• Disjunctive completionGCC,P(A) = (C, P(A), P(A), P(A))

– C,P(A)(X) = ?– P(A),C(Y) = ?

177

Disjunctive completion for GCs


• Disjunctive completionGCC,P(A) = (C, P(A), P(A), P(A))

– C,P(A)(X) = {C,A({x}) | xX}– P(A),C(Y) = {P(A)(y) | yY}


178

Information loss exampleif (…) b := 5else b := -5

if (b>0) b := b-5else b := b+5assert b==0

{}{b=5}

{b=-5}{b=5 b=-5}

{b=0}

{b=0}proved

179

The base lattice CP

{x=0}

true

{x=-1}{x=-2} {x=1} {x=2} ……

false

180

The disjunctive completion of CP

{x=0}

true

{x=-1}{x=-2} {x=1} {x=2} ……

false

{x=-2x=-1} {x=-2x=0} {x=-2x=1} {x=1x=2}… … …

{x=0 x=1x=2}{x=-1 x=1x=-2}… ………


181

Taming disjunctive completion• Disjunctive completion is very precise

– Maintains correlations between states of different analyses– Helps handle conditions precisely– But very expensive – number of abstract states grows

exponentially– May lead to non-termination

• Base analysis (usually product) is less precise– Analysis terminates if the analyses of each component

terminates• How can we combine them to get more precision yet

ensure termination and state explosion?

182

Taming disjunctive completion

• Use different abstractions for different program locations– At loop heads use coarse abstraction (base)– At other points use disjunctive completion

• Termination is guaranteed (by base domain)• Precision increased inside loop body

183

With Disj(CP)while (…) { if (…) b := 5 else b := -5

if (b>0) b := b-5 else b := b+5 assert b==0}

Doesn’t terminate

184

With tamed Disj(CP)while (…) { if (…) b := 5 else b := -5

if (b>0) b := b-5 else b := b+5 assert b==0}

terminates

CP

Disj(CP)

What MultiCartDomain implements

185

Reducing disjunctive elements

• A disjunctive set X may contain within it an ascending chain Y=a b c…

• We only need max(Y) – remove all elements below

186


• L1 = (D1, 1, 1, 1, 1, 1)L2 = (D2, 2, 2, 2, 2, 2)

• Lrel = (2D1D2, rel, rel, rel, rel, rel)as follows:– Lrel = ?

187


• L1 = (D1, 1, 1, 1, 1, 1)L2 = (D2, 2, 2, 2, 2, 2)

• Lrel = (2D1D2, rel, rel, rel, rel, rel)as follows:– Lrel = Disj(Cart(L1, L2))

• Lemma: L is a complete lattice• What does it buy us?– How is it relative to Cart(Disj(L1), Disj(L2))?


188

Relational product of GCs


• Relational ProductGCC,P(AB) = (C, C,P(AB), P(AB),C, P(AB))

– C,P(AB)(X) = ?– P(AB),C(Y) = ?

189

Relational product of GCs


• Relational ProductGCC,P(AB) = (C, C,P(AB), P(AB),C, P(AB))

– C,P(AB)(X) = {(C,A({x}), C,B({x})) | xX}– P(AB),C(Y) = {A,C(yA) B,C(yB) | (yA,yB)Y}

190

Cartesian product example

Correlations preserved

191

Function space• GCC,A=(C, C,A, A,C, A)

GCC,B=(C, C,B, B,C, B)• Denote the set of monotone functions from A to B by AB• Define for elements of AB as follows

(a1, b1) (a2, b2) = if a1=a2 then {(a1, b1B b1)} else {(a1, b1), (a2, b2)}

• Reduced cardinal powerGCC,AB = (C, C,AB, AB,C, AB)

– C,AB(X) = {(C,A({x}), C,B({x})) | xX}– AB,C(Y) = {A,C(yA) B,C(yB) | (yA,yB)Y}

• Useful when A is small and B is much larger– E.g., typestate verification

192

Widening/Narrowing

193

How can we prove this automatically?

RelProd(CP, VE)

194

Intervals domain

• One of the simplest numerical domains• Maintain for each variable x an interval [L,H]– L is either an integer of -– H is either an integer of +

• A (non-relational) numeric domain

195

Intervals lattice for variable x

[0,0][-1,-1][-2,-2] [1,1] [2,2] ......

[-,+]

[0,1] [1,2] [2,3][-1,0][-2,-1]

[-10,10]

[1,+][-,0]

... ...

[2,+][0,+][-,-1][-,-1]... ...

[-20,10]

196


• Dint[x] = { (L,H) | L-,Z and HZ,+ and LH}• • =[-,+]• = ?– [1,2] [3,4] ?– [1,4] [1,3] ?– [1,3] [1,4] ?– [1,3] [-,+] ?

• What is the lattice height?

197


• Dint[x] = { (L,H) | L-,Z and HZ,+ and LH}• • =[-,+]• = ?– [1,2] [3,4] no– [1,4] [1,3] no– [1,3] [1,4] yes– [1,3] [-,+] yes

• What is the lattice height? Infinite

198

Joining/meeting intervals

• [a,b] [c,d] = ?– [1,1] [2,2] = ?– [1,1] [2, +] = ?

• [a,b] [c,d] = ?– [1,2] [3,4] = ?– [1,4] [3,4] = ?– [1,1] [1,+] = ?

• Check that indeed xy if and only if xy=y

199

Joining/meeting intervals

• [a,b] [c,d] = [min(a,c), max(b,d)]– [1,1] [2,2] = [1,2]– [1,1] [2,+] = [1,+]

• [a,b] [c,d] = [max(a,c), min(b,d)] if a proper interval and otherwise – [1,2] [3,4] = – [1,4] [3,4] = [3,4]– [1,1] [1,+] = [1,1]

• Check that indeed xy if and only if xy=y

200

Interval domain for programs

• Dint[x] = { (L,H) | L-,Z and HZ,+ and LH}• For a program with variables Var={x1,…,xk}• Dint[Var] = ?

201


• Dint[x] = { (L,H) | L-,Z and HZ,+ and LH}• For a program with variables Var={x1,…,xk}

• Dint[Var] = Dint[x1] … Dint[xk]• How can we represent it in terms of formulas?

202


• Dint[x] = { (L,H) | L-,Z and HZ,+ and LH}• For a program with variables Var={x1,…,xk}

• Dint[Var] = Dint[x1] … Dint[xk]• How can we represent it in terms of formulas?– Two types of factoids xc and xc– Example: S = {x9, y5, y10}– Helper operations• c + + = +• remove(S, x) = S without any x-constraints• lb(S, x) =

203

Assignment transformers• x := c# S = ?• x := y# S = ?• x := y+c# S = ?• x := y+z# S = ?• x := y*c# S = ?• x := y*z# S = ?

204

Assignment transformers• x := c# S = remove(S,x) {xc, xc}• x := y# S = remove(S,x) {xlb(S,y), xub(S,y)}• x := y+c# S = remove(S,x) {xlb(S,y)+c, xub(S,y)+c}• x := y+z# S = remove(S,x) {xlb(S,y)+lb(S,z),

xub(S,y)+ub(S,z)}• x := y*c# S = remove(S,x) if c>0 {xlb(S,y)*c,

xub(S,y)*c} else {xub(S,y)*-c, xlb(S,y)*-c}

• x := y*z# S = remove(S,x) ?

205

assume transformers• assume x=c# S = ?• assume x<c# S = ?• assume x=y# S = ?• assume xc# S = ?

206

assume transformers• assume x=c# S = S {xc, xc}• assume x<c# S = S {xc-1}• assume x=y# S = S {xlb(S,y), xub(S,y)}• assume xc# S = ?

207

assume transformers• assume x=c# S = S {xc, xc}• assume x<c# S = S {xc-1}• assume x=y# S = S {xlb(S,y), xub(S,y)}• assume xc# S = (S {xc-1}) (S {xc+1})

208

Effect of function f on lattice elements• L = (D, , , , , )• f : D D monotone• Fix(f) = { d | f(d) = d }• Red(f) = { d | f(d) d }• Ext(f) = { d | d f(d) }• Theorem [Tarski 1955]– lfp(f) = Fix(f) = Red(f) Fix(f)– gfp(f) = Fix(f) = Ext(f) Fix(f)

Red(f)

Ext(f)

Fix(f)

lfp

gfp

fn()

fn()

209

Effect of function f on lattice elements• L = (D, , , , , )• f : D D monotone• Fix(f) = { d | f(d) = d }• Red(f) = { d | f(d) d }• Ext(f) = { d | d f(d) }• Theorem [Tarski 1955]– lfp(f) = Fix(f) = Red(f) Fix(f)– gfp(f) = Fix(f) = Ext(f) Fix(f)

Red(f)

Ext(f)

Fix(f)

lfp

gfp

fn()

fn()

210

Continuity and ACC condition

• Let L = (D, , , ) be a complete partial order– Every ascending chain has an upper bound

• A function f is continuous if for every increasing chain Y D*,

f(Y) = { f(y) | yY }• L satisfies the ascending chain condition (ACC)

if every ascending chain eventually stabilizes:d0 d1 … dn = dn+1 = …

211

Fixed-point theorem [Kleene]

• Let L = (D, , , ) be a complete partial order and a continuous function f: D D then

lfp(f) = nN fn()

212

Resulting algorithm • Kleene’s fixed point theorem

gives a constructive method for computing the lfp

lfpfn()

f()f2()

…d := while f(d) d do d := d f(d)return d

Algorithm

lfp(f) = nN fn()Mathematical definition

213

Chaotic iteration• Input:

– A cpo L = (D, , , ) satisfying ACC– Ln = L L … L– A monotone function f : Dn Dn – A system of equations { X[i] | f(X) | 1 i n }

• Output: lfp(f)• A worklist-based algorithm

for i:=1 to n do X[i] := WL = {1,…,n}while WL do j := pop WL // choose index non-deterministically N := F[i](X) if N X[i] then X[i] := N add all the indexes that directly depend on i to WL (X[j] depends on X[i] if F[j] contains X[i])return X

214

Concrete semantics equations

• R[0] = {xZ} R[1] = x:=7R[2] = R[1] R[4]R[3] = R[2] {s | s(x) < 1000}R[4] = x:=x+1 R[3]R[5] = R[2] {s | s(x) 1000} R[6] = R[5] {s | s(x) 1001}

R[0]R[2]R[3] R[4]

R[1]

R[5]R[6]

215

Abstract semantics equations

• R[0] = ({xZ}) R[1] = x:=7#

R[2] = R[1] R[4]R[3] = R[2] ({s | s(x) < 1000})R[4] = x:=x+1# R[3]R[5] = R[2] ({s | s(x) 1000})R[6] = R[5] ({s | s(x) 1001}) R[5] ({s | s(x) 999})

R[0]R[2]R[3] R[4]

R[1]

R[5]R[6]

216

Abstract semantics equations

• R[0] = R[1] = [7,7] R[2] = R[1] R[4]R[3] = R[2] [-,999]R[4] = R[3] + [1,1]R[5] = R[2] [1000,+] R[6] = R[5] [999,+] R[5] [1001,+]

R[0]R[2]R[3] R[4]

R[1]

R[5]R[6]

217

Too many iterations to converge

218

How many iterations for this one?

219

Widening

• Introduce a new binary operator to ensure termination– A kind of extrapolation

• Enables static analysis to use infinite height lattices– Dynamically adapts to given program

• Tricky to design• Precision less predictable then with finite-

height domains (widening non-monotone)

220

Formal definition• For all elements d1 d2 d1 d2 • For all ascending chains d0 d1 d2 …

the following sequence is finite– y0 = d0 – yi+1 = yi di+1

• For a monotone function f : DD define– x0 = – xi+1 = xi f(xi )

• Theorem:– There exits k such that xk+1 = xk

– xkRed(f) = { d | dD and f(d) d }

221

Analysis with finite-height lattice

A

f#n = lpf(f#) …

f#2 f#3

f#

Red(f)

Fix(f)

222

Analysis with widening

A

f#2 f#3

f#2 f#3

f#

Red(f)

Fix(f) lpf(f#)

223

Widening for Intervals Analysis

• [c, d] = [c, d]• [a, b] [c, d] = [

if a cthen aelse -,

if b dthen belse

224

Semantic equations with widening

• R[0] = R[1] = [7,7] R[2] = R[1] R[4]R[2.1] = R[2.1] R[2]R[3] = R[2.1] [-,999]R[4] = R[3] + [1,1]R[5] = R[2] [1001,+] R[6] = R[5] [999,+] R[5] [1001,+]

R[0]R[2]R[3] R[4]

R[1]

R[5]R[6]

225

Choosing analysis with widening

Enable widening

Non monotonicity of widening

• [0,1] [0,2] = ?• [0,2] [0,2] = ?

Non monotonicity of widening

• [0,1] [0,2] = [0, ]• [0,2] [0,2] = [0,2]

228

Analysis results with widening

Did we prove it?

229

Analysis with narrowing

A

f#2 f#3

f#2 f#3

f#

Red(f)

Fix(f) lpf(f#)

Formal definition of narrowing• Improves the result of widening• y x y (x y) x• For all decreasing chains x0 x1 …

the following sequence is finite– y0 = x0

– yi+1 = yi xi+1

• For a monotone function f: DDand xkRed(f) = { d | dD and f(d) d }define– y0 = x– yi+1 = yi f(yi )

• Theorem:– There exits k such that yk+1 =yk

– ykRed(f) = { d | dD and f(d) d }

Narrowing for Interval Analysis • [a, b] = [a, b]• [a, b] [c, d] = [

if a = - then celse a,

if b = then delse b

]

232

Semantic equations with narrowing

• R[0] = R[1] = [7,7] R[2] = R[1] R[4]R[2.1] = R[2.1] R[2]R[3] = R[2.1] [-,999]R[4] = R[3]+[1,1]R[5] = R[2]# [1000,+] R[6] = R[5] [999,+] R[5] [1001,+]

R[0]R[2]R[3] R[4]

R[1]

R[5]R[6]

233

Analysis with widening/narrowing• Two phases– Phase 1: analyze with

widening until converging

– Phase 2: use values to analyze with narrowing

Phase 2:R[0] = R[1] = [7,7] R[2] = R[1] R[4]R[2.1] = R[2.1] R[2]R[3] = R[2.1] [-,999]R[4] = R[3]+[1,1]R[5] = R[2]# [1000,+] R[6] = R[5] [999,+] R[5] [1001,+]

Phase 1:R[0] = R[1] = [7,7] R[2] = R[1] R[4]R[2.1] = R[2.1] R[2]R[3] = R[2.1] [-,999]R[4] = R[3] + [1,1]R[5] = R[2] [1001,+] R[6] = R[5] [999,+] R[5] [1001,+]

234

Analysis with widening/narrowing

235

Analysis results widening/narrowing

Precise invariant

noam rinetzky lecture 9: abstract interpretation ii

Documents

d dtransitive

d dreflexive

thatd d

d disomporphism

d d hasse diagramthere

verification program

binary order relation

partial order posetd