
Automated abstraction refinement II

Heuristic aspects

Ken McMillan, Cadence Berkeley Labs

Copyright 2002 Cadence Design Systems. Permission is granted to reproduce without modification.

Introduction

• Part I
– introduced the framework of abstract interpretation and discussed practical instances, such as predicate abstraction

• Part II
– will discuss heuristic aspects of abstraction, i.e., how do we find useful abstractions in an automated way?

We will locate methods on three axes...


Axis 1

• Embedding abstractions (E)
– Encode the state in an abstract space, e.g.,
  • Predicate abstraction
  • Polyhedral abstractions

• Weakening abstractions (W)
– Throw away information about the system without re-encoding the state, e.g.,
  • Localization
  • Interpolation

Some methods combine these types.


Axis 2

• A priori bias (A)
– Abstraction represents a general bias and is not property-specific, e.g.,
  • Polyhedral abstraction
  • Van Eick's method
  • Invisible invariants

• Property-specific abstractions (P)
– Abstraction is tailored to verification of a specific property, e.g.,
  • Predicate abstraction
  • Localization


Axis 3

• Cartesian lattice (C)
– Invariants are conjunctions of literals, e.g.,
  • Polyhedral abstraction
  • Van Eick's method
  • Predicate abstraction

• Boolean lattice (B)
– Invariants are Boolean combinations of atoms, e.g.,
  • Localization
  • Predicate abstraction

The Boolean case is more expressive but may explode!


Property-specific abstraction

• We must abstract from a system just the information relevant to proving a given property.
– In the end this must take the form of an inductive invariant

But how do we decide what information is "relevant"?


The refinement loop

• All methods are based on the following principle:

The refutation relevance principle (RRP): Facts used to refute a class of potential models are considered relevant.

• This leads to a refinement loop, in which facts used to refute classes of counterexamples are added to the abstraction


Outline

• Applications of the RRP:
– SAT solvers
– Localization abstraction
– Interpolant-based methods
– Predicate abstraction


DPLL-style SAT solvers

• Objective:
– Check satisfiability of a CNF formula
  • literal: v or ¬v
  • clause: disjunction of literals
  • CNF: conjunction of clauses

• Approach:
– Branch: make arbitrary decisions
– Propagate implication graph
– Conflicts (search failures) guide inference steps
– Inferred clauses guide search

Solvers: SATO, GRASP, CHAFF, BERKMIN

Very effective at narrowing to relevant facts


Model refutation in SAT

• Heuristics in the SAT solver are based on the RRP.
– Class of models = partial assignment
– Relevant facts: clauses

• The SAT solver raises the "relevance" measure of a variable used in the refutation of a partial assignment
– increasing the chance that a clause containing that variable is used
– key is the interaction between model search and deduction


Basic SAT algorithm

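A = {} (start from the empty assignment)

1. Empty clause? If yes: UNSAT.
2. Conflict? If yes: deduce conflict clause and backtrack, then go to 1.
3. Is A total? If yes: SAT.
4. Branch: add some literal to A, then go to 1.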
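The loop above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: it uses plain chronological backtracking instead of deducing conflict clauses, it encodes literals as signed integers, and the names `unit_propagate` and `dpll` are hypothetical, not taken from any of the solvers discussed here.

```python
def unit_propagate(clauses, assignment):
    """Extend the assignment using unit clauses; return None on a conflict."""
    assignment = dict(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            unassigned = []
            satisfied = False
            for lit in clause:
                var = abs(lit)
                if var in assignment:
                    if assignment[var] == (lit > 0):
                        satisfied = True
                        break
                else:
                    unassigned.append(lit)
            if satisfied:
                continue
            if not unassigned:
                return None              # every literal is false: conflict
            if len(unassigned) == 1:     # unit clause: the literal is implied
                lit = unassigned[0]
                assignment[abs(lit)] = lit > 0
                changed = True
    return assignment


def dpll(clauses, assignment=None):
    """Return a total satisfying assignment (dict var -> bool) or None."""
    assignment = unit_propagate(clauses, assignment or {})
    if assignment is None:
        return None                      # conflict: backtrack
    variables = {abs(lit) for clause in clauses for lit in clause}
    free = sorted(variables - set(assignment))
    if not free:
        return assignment                # A is total: SAT
    var = free[0]                        # branch: add some literal to A
    for value in (True, False):
        result = dpll(clauses, {**assignment, var: value})
        if result is not None:
            return result
    return None


# Variables a, b, c, d are encoded as 1, 2, 3, 4 (an assumption of this sketch).
example = [[-1, 2], [-2, -3, 4], [-2, -4]]
model = dpll(example)
```

A conflict-driven solver replaces the plain backtracking step with clause learning, which is what makes the relevance heuristics in the following slides possible.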


The Implication Graph (BCP)

(¬a ∨ b) ∧ (¬b ∨ ¬c ∨ d)

Decisions: a, c

Implied by propagation: b (from a), d (from b and c)

Assignment: a ∧ b ∧ c ∧ d


Resolution

(a ∨ b ∨ c)   (¬a ∨ c ∨ d)

resolve on a: (b ∨ c ∨ d)

When a conflict occurs, the implication graph is used to guide the resolution of clauses, so that the same conflict will not occur again.


Conflict Clauses

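(¬a ∨ b) ∧ (¬b ∨ ¬c ∨ d) ∧ (¬b ∨ ¬d)

Decisions: a, c

Implied by propagation: b, d

Assignment: a ∧ b ∧ c ∧ d

Conflict! The clause (¬b ∨ ¬d) is falsified.

Resolve (¬b ∨ ¬d) with (¬b ∨ ¬c ∨ d): (¬b ∨ ¬c). Conflict!

Resolve (¬b ∨ ¬c) with (¬a ∨ b): (¬a ∨ ¬c). Conflict!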
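The two resolution steps can be replayed mechanically. The sketch below assumes the clauses are (¬a ∨ b), (¬b ∨ ¬c ∨ d) and (¬b ∨ ¬d), polarities chosen to be consistent with the decisions a and c implying b and d. The string encoding of literals ("~a" for ¬a) and the function name `resolve` are hypothetical choices for illustration.

```python
def resolve(c1, c2, var):
    """Resolvent of clauses c1 and c2 on variable var.

    Clauses are frozensets of literal strings; "~v" is the negation of "v".
    """
    pos, neg = var, "~" + var
    if pos in c1 and neg in c2:
        return (c1 - {pos}) | (c2 - {neg})
    if neg in c1 and pos in c2:
        return (c1 - {neg}) | (c2 - {pos})
    raise ValueError("clauses do not clash on " + var)


conflicting = frozenset({"~b", "~d"})       # clause falsified by the assignment
reason_d = frozenset({"~b", "~c", "d"})     # clause that implied d
reason_b = frozenset({"~a", "b"})           # clause that implied b

step1 = resolve(conflicting, reason_d, "d")  # (~b | ~c)
step2 = resolve(step1, reason_b, "b")        # (~a | ~c)
```

The final clause mentions only the decision variables a and c, so it rules out every assignment that repeats this pair of decisions.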


Decision heuristics

• Conflict clause is a refutation of a partial assignment.
– Solvers increase the "relevance" score of variables used in this refutation (e.g., the VSIDS heuristic)
– Variables with higher score are decided first
– Thus the solver is biased toward using facts that refuted earlier solution attempts.

• Decision heuristics are thus an instance of the RRP
– Solvers are quite effective in ignoring irrelevant clauses; many abstraction methods are based on this fact.
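A score-based decision heuristic in this spirit might look as follows. This is a simplified sketch, not the exact scheme of any solver named above: the per-variable scores, the decay factor of 0.5 applied at every conflict, and the class name `RelevanceScores` are all assumptions made for illustration.

```python
from collections import defaultdict


class RelevanceScores:
    """Bump variables seen in conflict clauses; decay older evidence."""

    def __init__(self, decay=0.5):
        self.decay = decay
        self.score = defaultdict(float)

    def on_conflict(self, conflict_clause):
        # Older refutations count for less than recent ones.
        for var in self.score:
            self.score[var] *= self.decay
        # Variables used in this refutation become more relevant.
        for lit in conflict_clause:
            self.score[abs(lit)] += 1.0

    def pick(self, unassigned):
        # Decide the highest-scoring variable; break ties by smaller index.
        return max(unassigned, key=lambda v: (self.score[v], -v))


scores = RelevanceScores()
scores.on_conflict([-1, -3])
scores.on_conflict([-3, 4])
```

After these two conflicts, variable 3 has appeared in both refutations and is decided first, followed by variable 4 from the more recent conflict.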


Localization abstraction

• Abstract by removing system components not relevant to a given property.

[Figure: localization keeps the part of the system relevant to the property]

• Think of system components as constraints and localization as removing constraints.

• A weakening abstraction (WPB)

(Kurshan)


Components as constraints

Transition system described by a set of constraints

[Figure: circuit with signals a, b, g, p and state variable c]

g = a ∧ b

p = g ∨ c

c' = p

Model:

T = { g = a ∧ b, p = g ∨ c, c' = p }


Localization abstraction

• Property: G (c ⇒ X c)

[Figure: the same circuit with the driver of g removed]

Model:

T# = { p = g ∨ c, c' = p }

The constraint g = a ∧ b is dropped, so g is a free variable.

Localization does not recode the state. It just weakens the transition relation.
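The effect of dropping constraints can be checked by brute force on this small example. The sketch below assumes the gates are g = a ∧ b and p = g ∨ c, and it checks the one-step implication c ⇒ c' over all valuations allowed by the chosen constraints, which is sufficient for G (c ⇒ X c). The names `CONSTRAINTS` and `property_holds` are hypothetical.

```python
from itertools import product

# Each constraint is a predicate over a valuation of the signals.
CONSTRAINTS = {
    "g = a & b": lambda s: s["g"] == (s["a"] and s["b"]),
    "p = g | c": lambda s: s["p"] == (s["g"] or s["c"]),
    "c' = p": lambda s: s["c'"] == s["p"],
}
VARS = ["a", "b", "c", "g", "p", "c'"]


def property_holds(constraint_names):
    """Check c => c' on every valuation satisfying the chosen constraints."""
    for values in product([False, True], repeat=len(VARS)):
        s = dict(zip(VARS, values))
        if all(CONSTRAINTS[name](s) for name in constraint_names):
            if s["c"] and not s["c'"]:
                return False   # a step where c holds now but not next
    return True


full = property_holds(["g = a & b", "p = g | c", "c' = p"])
localized = property_holds(["p = g | c", "c' = p"])   # g left free
too_weak = property_holds(["c' = p"])                 # p left free as well
```

With g free the property still holds, while dropping the constraint on p as well admits a spurious violation.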


Localization, cont

• T# may refer to fewer state variables than T
– reduction in the state explosion problem

• Key issue: how to choose constraints in T#
– apply the RRP


CEGAR loop

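1. Choose initial T#.
2. Model check abstraction T#. If true: done.
3. Otherwise the model checker returns a Cex. Can we extend the Cex from T# to T? If yes: done, with a Cex.
4. If no: add constraints to T# and go to 2.

(Kurshan)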


CEGAR, cont

• CEGAR is an instance of RRP, where:
– class of models = partial trace, i.e., trace of a subset of variables, or an abstract cex
– relevant facts: system components


Formalizing CEGAR

• Straightforward in terms of Bounded Model Checking [BCCZ99]

• New notation: let Q<t> denote Q with t primes added to each symbol
– variable v with t primes represents the value of v at time t
– thus, Q<t> is Q shifted t time units into the future


Unfolding

• Unfold the model k times: Unf_k(T) = T<0> ∧ T<1> ∧ ... ∧ T<k-1>

[Figure: k copies of the circuit chained in sequence, with I<0> constraining the first frame and F<k> the last]

• Use SAT solver to check satisfiability of I<0> ∧ Unf_k(T) ∧ F<k>

• A satisfying assignment is a k-step cex

• If unsat, refutes a class of models: all traces of length k
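The unfolding check can be illustrated by enumeration. This is a sketch under stated assumptions: it enumerates traces explicitly instead of calling a SAT solver, and the 2-bit counter used as the system, together with the names `bmc`, `init`, `trans` and `bad`, is a hypothetical example rather than anything from the slides.

```python
from itertools import product


def bmc(init, trans, bad, num_vars, k):
    """Search for a trace s0..sk with init(s0), trans(si, si+1), bad(sk).

    Mirrors satisfiability of I<0> & Unf_k(T) & F<k>; returns the trace
    as a list of k+1 states, or None if the formula is unsatisfiable.
    """
    states = list(product([False, True], repeat=num_vars))
    for trace in product(states, repeat=k + 1):
        if (init(trace[0])
                and all(trans(trace[t], trace[t + 1]) for t in range(k))
                and bad(trace[k])):
            return list(trace)
    return None


def value(state):
    """Interpret a state (high bit, low bit) as an integer."""
    return 2 * state[0] + state[1]


init = lambda s: value(s) == 0
trans = lambda s, t: value(t) == (value(s) + 1) % 4
bad = lambda s: value(s) == 3

no_cex = bmc(init, trans, bad, num_vars=2, k=2)
cex = bmc(init, trans, bad, num_vars=2, k=3)
```

At depth 2 the formula is unsatisfiable, which refutes all traces of that length; at depth 3 the satisfying assignment is the counterexample 0, 1, 2, 3.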


Abstract counterexamples

• Abstract TR is a subset of the concrete TR: T# ⊆ T

• Abstract variables: V# = support(T#)

• An abstract counterexample is a truth assignment to:

V#_k = { v<t> | v in V#, t in 0..k }

where k is the number of steps.


Abstract counterexamples, cont.

• Let A# be a solution to

(I<0> ∧ Unf_k(T#) ∧ F<k>) ↓ V#_k

that is, a satisfying assignment projected onto V#_k

• A# can be computed using a model checker

• Note that A# defines a class of models
– all models consistent with A#


Concretization

• Think of A# as a minterm over V#_k

• A concretization A of A# is a model of

A# ∧ I<0> ∧ Unf_k(T) ∧ F<k>

(A is a counterexample consistent with A#)

• If a concretization exists we are done, else we must refine the abstraction.
– Note that existence of a concretization is a SAT problem.

CGJLV 2000


Abstraction refinement

• Refinement = adding constraints to T# sufficient to refute A#.

• An extension is E ⊆ T such that this is unsat:

A# ∧ I<0> ∧ Unf_k(T# ∪ E) ∧ F<k>

• By RRP, constraints in E are relevant:
– They refute the class of models defined by A#.

• How to find E?
– Many complex heuristics are used for this...
– Recall that a SAT solver can produce a resolution-based refutation in the UNSAT case....


Proof-based refinement

• Recall, a concretization satisfies this:

A# ∧ I<0> ∧ Unf_k(T) ∧ F<k>

• If UNSAT, we obtain refutation proof P
– proof that A# cannot be concretized

• Let E be the set of constraints used in proof P:

E = { c ∈ T | some c<i> occurs in P }

• A# cannot be extended to a Cex for E
– P is the proof of this.

Thus, add E to T# and continue...

[CCKSVW02]


In other words...

The refutation of the formula:

A# ∧ I<0> ∧ Unf_k(T) ∧ F<k>

gives us a sufficient set of constraints to refute the class of models defined by A#.

We rely on the SAT solver's ability to focus on relevant facts (using the RRP) to produce a small E.


CCKSVW approach (FMCAD02)

• Find the shortest prefix of Cex A# that cannot be extended.

• That is, A# ∧ I<0> ∧ Unf_k(T) ∧ F<k> is feasible for all k < i, but not for k = i.

States:    s0   s1   s2   ...   si-1   si
Feasible:  OK   OK   OK   ...   OK     NO!


CCKSVW approach cont.

• Let P be a refutation of A# ∧ I<0> ∧ Unf_k(T) ∧ F<k>

• Let E be the set of constraints used in proof P only on state si-1:

E = { c ∈ T | c<i-2> occurs in P }

States:    s0   s1   s2   ...   si-1   si
Feasible:  OK   OK   OK   ...   OK     NO!

Add the constraints used at si-1.


Optimal abstractions?

• Greedily remove constraints as long as all the abstract counterexamples are still refuted.
– Will produce a local minimum

• Optimal abstraction (Gupta et al)
– Samples: failed attempts to concretize A#
– Each sample falsifies a subset of constraints
– Find a minimal cover

But note, these methods are expensive, and may not make model checking faster.
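The greedy removal step can be sketched generically. In the sketch below, `still_refutes` is a hypothetical oracle standing in for the SAT call that checks whether the abstract counterexamples remain refuted; the constraint names and the oracle's behavior are assumptions chosen only to make the example runnable.

```python
def greedy_minimize(constraints, still_refutes):
    """Drop constraints one at a time while the refutation survives.

    The result is a local minimum: no single remaining constraint can be
    removed, but a smaller set reachable in another order may exist.
    """
    kept = list(constraints)
    for candidate in list(constraints):
        trial = [c for c in kept if c != candidate]
        if still_refutes(trial):
            kept = trial
    return kept


# Hypothetical oracle: the refutation needs exactly these two constraints.
needed = {"p = g | c", "c' = p"}
oracle = lambda cs: needed <= set(cs)

minimal = greedy_minimize(["g = a & b", "p = g | c", "c' = p"], oracle)
```

Each candidate costs one oracle call, which is where the expense noted above comes from when the oracle is a full SAT or model checking run.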


Weakness of Cex-based approach

• The CEGAR approach refutes fairly small model classes.

• An arbitrarily chosen abstract Cex may be refutable for many reasons not related to the property.
– Thus, may add irrelevant constraints.
– To remedy, may try to generalize abstract Cex's to represent a larger class of models (e.g., GKM-HFV, TACAS03).

Alternative: don't use counterexamples


Proof-based abstraction

• Also based on RRP:
– class of models = all models of length k
– relevant facts: transition constraints

By refuting larger classes of models, we hope to converge faster, and include fewer irrelevant constraints.


Proof-based abstraction

1. BMC at depth k. Cex? Done.
2. No Cex? Use refutation to choose abstraction.
3. MC abstraction. True? Done.
4. False? Increase k and go to 1.

[MA03]


BMC phase

• Use SAT solver to check satisfiability of I<0> ∧ Unf_k(T) ∧ F<k>

• If unsatisfiable:
– property has no Cex of length k
– produce a refutation proof P


Abstraction phase

• Let T# be the set of constraints used in proof P:

T# = { c ∈ T | some c<i> occurs in P }

• T# admits no counterexample of length k
– P is a refutation of I<0> ∧ Unf_k(T#) ∧ F<k>

• Model check property on T#
– property true for T# implies true for T
– else Cex of length k' > k
– restart for k = k'

Note, "refinement" here is just increasing k


Algorithm

1. BMC T at depth k. Cex? Done.
2. No Cex? Refutation P induces abstraction T#.
3. Model check T#. True? Done.
4. Cex of depth k'? Let k = k' and go to 1.

Notice: MC counterexample is thrown away!


Termination

• Depth k increases at each iteration
• Eventually k > d, the diameter of T#
• If k > d, no counterexample is possible

In practice, termination usually occurs when k ≈ d/2

Usually, diameter of T# << diameter of T


Weakness of proof-based abs

• BMC must refute all counterexamples of length k, while in Cex-based, BMC must refute only one (partial) counterexample.

In practice, PBA converges in fewer iterations than CEGAR, but is sometimes slower because refuting all cex's can be slow. Various compromises between the two are possible.


Also note...

• CEGAR relies heavily on the model checker
– Uses model checker as decision heuristic for SAT solver.
– Note the interaction between model search and refutation.

• PBA only uses model checker to provide the unfolding depth
– Relies more heavily on RRP loop inside SAT solver.


RRP trade-off

• Abstraction methods that refute more general classes of models converge in fewer iterations.

• Refuting more general classes can be more expensive.

Practical tools need to balance these considerations.


Interpolants as abstractions

• Abstraction is extracting sufficient information from a system to prove a given property.

• This notion is in some sense closely related to Craig's interpolation lemma.


Interpolation Lemma

• If A ∧ B = false, there exists an interpolant A' for (A,B) such that:

A ⇒ A'
A' ∧ B = false
A' refers only to common variables of A,B

(Craig,57)

• Example:
– A = p ∧ q, B = ¬q ∧ r, A' = q

• Interpolants from proofs
– given a resolution refutation of A ∧ B, A' can be derived in linear time.

(Pudlak,Krajicek,97)
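The three conditions of the lemma can be checked by enumeration on the example. The sketch assumes A = p ∧ q and B = ¬q ∧ r, the polarity for which A ∧ B is unsatisfiable; it verifies a candidate interpolant rather than deriving one from a proof, and the function name `is_interpolant` is hypothetical.

```python
from itertools import product


def is_interpolant(A, B, I, vars_A, vars_B, vars_I):
    """Brute-force check that I is an interpolant for (A, B)."""
    # I may refer only to variables common to A and B.
    if not set(vars_I) <= set(vars_A) & set(vars_B):
        return False
    all_vars = sorted(set(vars_A) | set(vars_B))
    for values in product([False, True], repeat=len(all_vars)):
        m = dict(zip(all_vars, values))
        if A(m) and not I(m):
            return False       # A => I fails on this assignment
        if I(m) and B(m):
            return False       # I & B is satisfiable
    return True


A = lambda m: m["p"] and m["q"]
B = lambda m: (not m["q"]) and m["r"]

good = is_interpolant(A, B, lambda m: m["q"], {"p", "q"}, {"q", "r"}, {"q"})
wrong_vars = is_interpolant(A, B, lambda m: m["p"], {"p", "q"}, {"q", "r"}, {"p"})
too_weak = is_interpolant(A, B, lambda m: True, {"p", "q"}, {"q", "r"}, set())
```

Here q passes all three checks, p is rejected because it is not a common variable, and the constant true is rejected because it does not contradict B.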


Interpolants from proofs

• An interpolant for (A,B) derived from a refutation...
– is in some sense an abstraction of A relative to B
– captures the information about A that the prover used to refute B

• This can give us a very general method for extracting information about a system to prove a given property.
– with many possible applications in model checking


Applications

• Propositional case
– Finite-state model checking using a SAT solver
– Very robust method for hardware verification

• First-order case
– Infinite-state model checking using a FO prover.
– Verify, for example, parameterized protocols

• Predicate abstraction
– Discover useful predicates for predicate abstraction
– Computation of the abstract transition relation

Here we will consider just the propositional case...


Interpolant-based image

• A property-specific weakening abstraction
– Use interpolants to compute a weakened image operator (abstract transformer)
– Strong enough to refute a class of models

• Applying the RRP:
– class of models = continuations of length k
– relevant fact: any next-state property


k-adequate image operator

• Abstract transformer T# is k-adequate (w.r.t. F) when
– if P cannot reach F, T#(P) cannot reach F within k steps

• Intuition: want T# to avoid adding states that can reach a bad state


Interpolation-based image

• Idea: use unfolding to enforce k-adequacy

A = P<-1> ∧ T<-1>

B = T<0> ∧ T<1> ∧ ... ∧ T<k-1> ∧ F<k>

[Figure: unfolding from P through copies of T to F; A covers the step ending at t=0, B covers the steps from t=0 to t=k]

Let T#(P) = A', where A' is an interpolant for (A,B)...

T# is a k-adequate abstract transformer!


Huh?

• sup(A') ⊆ sup(A) ∩ sup(B)
– sup(A') = V0 (A' is a state predicate)

• A ⇒ A'
– T(P) ⇒ T#(P) (T# is sound)

• A' ∧ B = false
– T#(P) cannot reach F in k steps

[Figure: the unfolding from P to F, split into A (up to t=0) and B (from t=0 to t=k), with A' at the cut t=0]


Intuition

• A' tells us everything the prover deduced about the image of P in proving it can't reach F in k steps.

• Hence, A' is in some sense an abstraction of the image relative to the property.

[Figure: the unfolding from P to F, split into A (up to t=0) and B (from t=0 to t=k), with A' at the cut t=0]


Refinement

• Model checking with T# may fail
– T# may add a state that reaches F in k+1 steps

• Refinement is just increasing k
– Increasing k refutes a larger class of models


Termination

• In the finite-state case: since k increases at every refinement, eventually k > d, the diameter, in which case T# is adequate (adds no bad states), hence we terminate.

• Notes:
– don't need to know when k > d in order to terminate
– often termination occurs with k << d


Performance v. Localization

[Figure: plot of run time for the interpolation method against run time for proof-based abstraction. Source: Nina Amla]


k-bound comparison

[Figure: plot of the last k for interpolation against the last k for proof-based abstraction]


Interpolants

• Interpolant for (A,B) provides an abstraction of A that refutes B.
– Exploits prover's ability to focus on relevant facts.

• Provides an image weakening abstraction
– Strong enough to refute continuations of length k
– Embodies the RRP: facts that refute classes of models are considered relevant.


Predicate abstraction

• Encode the state of a system by the truth values of a finite set of predicates P

• Two aspects of abstraction refinement in predicate abstraction:
– Selection of predicates in P
– Weakening of the abstract transition relation or transformer


Predicate selection

• Static heuristics
– Choose predicates occurring in branch conditions
– Apply weakest precondition to these predicates

• Not very property-specific
– Can use localization to remove irrelevant preds
  • But this could involve unfoldings with 100's of preds

We can apply the RRP directly to predicate selection


Predicate refinement

• The RRP for predicate selection:
– Class of models = program path
– Relevant facts: predicates

• A decision procedure is used to refute program paths. Predicates in the refutation are considered relevant.
– Example: "lazy abstraction" in BLAST [HJMS02]


Example

do {
  lock();
  old = new;
  if (*) {
    unlock();
    new++;
  }
} while (new != old);
unlock();

Let P = {LOCK=0}

Abstract states along the error path:

LOCK = 0
LOCK ≠ 0
LOCK ≠ 0
LOCK = 0
LOCK = 0
LOCK = 0
ERR!

Path statements and the facts they contribute:

lock(); old = new;     LOCK1 = 1, old1 = new0
[T]                    T
unlock(); new++;       LOCK2 = 0, new1 = new0+1
[T]                    T
[old = new]            old1 = new1
unlock();              LOCK2 = 0


Example, cont

Path facts...

LOCK1 = 1, old1 = new0
LOCK2 = 0, new1 = new0+1
old1 = new1
LOCK2 = 0

Refute the path...

new1 = new0+1 and old1 = new0 give new1 = old1+1
new1 = old1+1 and old1 = new1 give 0 = 1

Extract preds

old = new
new = old+1

Principle: facts used to refute a class of models are relevant.


Check with new preds

Path statements and the abstract state reached after each:

(start)                LOCK = 0
lock(); old = new;     LOCK ≠ 0, old = new
[T]                    LOCK ≠ 0, old = new
unlock(); new++;       LOCK = 0, old ≠ new
[T]                    LOCK = 0, old ≠ new
[old = new]            FALSE!
unlock();              (not reached)


Bad example

• Refutation predicates not always adequate to rule out a path, however...

Path statements:

x := ctr;
ctr := ctr+1;
y := ctr; ctr:=*; m:=*;
[x=m]
[y≠m+1]

Path facts:

x1 = ctr0
ctr1 = ctr0+1
y1 = ctr1
x1 = m1
y1 ≠ m1+1

Refutation preds:

x=m, x=ctr, m=ctr, ctr=m+1, y=ctr, y=m+1


Bad example, cont

• Check with new preds...

Path statements:

x := ctr;
ctr := ctr+1;
y := ctr; ctr:=*; m:=*;
[x=m]
[y≠m+1]

Abstract states along the path:

T
x=ctr
ctr=m+1 <-> x=m, x≠ctr
y=ctr
y=ctr, x=m
y=ctr, x=m, y≠m+1, ctr≠m+1

Refutation preds:

x=m, x=ctr, m=ctr, ctr=m+1, y=ctr, y=m+1


What went wrong?

• Prover did not prove facts about particular states
– Rather, predicates in proof span states
– Thus we missed the needed predicate y=x+1

• How do we know what was proved about a given state in refuting the trace?
– Jhala: This is precisely an interpolant!


Predicates from interpolants

• Compute interpolant at each path cut

Path statements:

x := ctr;
ctr := ctr+1;
y := ctr; ctr:=*; m:=*;
[x=m]
[y≠m+1]

Path facts:

x1 = ctr0
ctr1 = ctr0+1
y1 = ctr1
x1 = m1
y1 ≠ m1+1

Interpolants at successive cuts (each interpolant implies the next):

x1 = ctr0
x1+1 = ctr1
x1+1 = y1
m1+1 = y1

Extract preds from interpolants


Predicates from interpolants

• Guaranteed to rule out program path

• Assign predicates to particular locations
– fewer preds per state -> less state explosion

• Provides a way to apply the RRP to predicate selection
– tells us which facts (state predicates) are relevant to refuting a class of models (program path)


Abstract transition relations

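• In PA, the abstract transition relation can be very expensive to compute.

[Figure: concrete transition T between concrete states, abstraction α from concrete to abstract states, and the induced abstract transition T#]

• We can represent T# symbolically...

T# = α⁻¹ ∘ T ∘ α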


Symbolic transition relation

• Let V = { v_p | p ∈ P }, and W be the concrete symbols

• Represent R∘S like this:

(R∘S)(V,V') = R(V,U) ∧ S(U,V')

...where U is a set of fresh symbols

• The abstraction relation is:

α(W,V) = ∧_{p ∈ P} (v_p ⇔ p)

• The symbolic abstract transition relation:

T# = α⁻¹ ∘ T ∘ α

... implicitly projected onto V ∪ V'


What's the problem?

• Projecting T# onto the abstract state is expensive
– Best known solution is to enumerate the minterms over V ∪ V' and test for consistency with T#.
– This can be made more efficient by translating T# to a satisfiability-equivalent Boolean formula and using incremental SAT. [LBC03]
– The alternative is to apply a weakening abstraction to the abstract transition relation.
  • Weakened relation may be easier to compute but still prove the property...


Transition relation weakening

• A priori bias
– Boolean programs abstraction (SLAM)
  • abstracts transition relation
– Cartesian image abstraction (BLAST)
  • abstracts the image computation

• Both methods lose correlation between predicates at next time
– In effect, can only infer conjunctions of predicates
– Avoid enumerating the next states

These weakenings are surprisingly effective, but sometimes fail on trivial problems.


Bad Example

Predicates: x=z, a[z] = y, a[z] = y-1

Statement          Cartesian    Boolean
(start)            T            T
a[x] := y;         T            x=z ⇒ a[z] = y
y := y + 1;        T            x=z ⇒ a[z] = y-1
[z=x]              x=z          a[z] = y-1
[a[z] ≠ y-1]       x=z          F

Array properties almost always require disjunctions!


The Das/Dill approach

• TR weakening for predicate abstraction

• Applies the RRP in this sense:
– Model class = abstract trace
  • i.e., a sequence of predicate assignments
– Relevant fact: clause in the abstract TR
  • allows us to introduce disjunctions


Example

Path statements:

a[x] := y;
y := y + 1;
[z=x]
[a[z] ≠ y-1]

Initially, abstract TR is just "true"...

Abstract trace (one predicate assignment per state):

x≠z, a[z] ≠ y, a[z] ≠ y-1
x=z, a[z] ≠ y, a[z] = y-1
x≠z, a[z] = y, a[z] ≠ y-1
x=z, a[z] ≠ y, a[z] ≠ y-1
x=z, a[z] = y, a[z] = y-1

One transition of this trace is inconsistent with T#.


Refining the TR

• Inconsistent transition is a minterm over V ∪ V'

x≠z, a[z] ≠ y, a[z] ≠ y-1, x'=z', a'[z'] ≠ y', a'[z'] = y'-1

• Greedily drop literals as long as it remains inconsistent with T#...

x'=z', a'[z'] ≠ y'

• Complement is a TR clause implied by T#

x'=z' ⇒ a'[z'] = y'
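The literal-dropping step can be reproduced by enumeration for the statement a[x] := y. This is a sketch under stated assumptions: consistency with T# is decided by enumerating concrete states over small finite ranges (a two-element array with values 0..2) rather than by a decision procedure, and the names `preds`, `consistent` and `generalize` are hypothetical. The literal order is chosen so that the result matches the cube shown above; a different order can yield a different local minimum.

```python
from itertools import product


def preds(a, x, y, z):
    """Truth values of the three predicates in one concrete state."""
    return {"x=z": x == z, "a[z]=y": a[z] == y, "a[z]=y-1": a[z] == y - 1}


def consistent(cube):
    """Is there a concrete step of a[x] := y matching every literal in cube?"""
    ranges = (range(3), range(3), range(2), range(2), range(3))
    for a0, a1, x, z, y in product(*ranges):
        before = [a0, a1]
        after = list(before)
        after[x] = y                           # effect of a[x] := y
        values = {("pre", n): v for n, v in preds(before, x, y, z).items()}
        values.update({("post", n): v for n, v in preds(after, x, y, z).items()})
        if all(values[lit] == want for lit, want in cube.items()):
            return True
    return False


def generalize(minterm):
    """Greedily drop literals while the cube stays inconsistent with the step."""
    cube = dict(minterm)
    for lit in list(minterm):
        trial = {k: v for k, v in cube.items() if k != lit}
        if not consistent(trial):
            cube = trial
    return cube


minterm = {
    ("pre", "x=z"): False, ("pre", "a[z]=y"): False, ("pre", "a[z]=y-1"): False,
    ("post", "x=z"): True, ("post", "a[z]=y-1"): True, ("post", "a[z]=y"): False,
}
cube = generalize(minterm)
```

The surviving cube says x'=z' together with a'[z'] ≠ y' is impossible after the assignment, and its complement is the clause x'=z' ⇒ a'[z'] = y'.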


TR refinement, cont

• The new TR clause is implied by T#, but inconsistent with the abstract trace
– i.e., it is a fact that refutes a class of models, and thus is relevant by the RRP

• By iterating this process, we can guarantee to converge to a weakened abstract TR that proves the property, if T# proves the property.
– else we must add predicates


Summary

• Abstraction is extracting information from a system relevant to proving a property

• We can distinguish abstraction strategies as...
– A priori vs property-specific
– Embedding vs weakening

• Property-specific techniques are based on the refutation relevance principle.
– Distinguished primarily by the class of models that is refuted


Summary, cont.

• SAT solvers (PW)
– class of models = partial assignment

• Localization (PW)
– class of models = partial trace --> "CEGAR"
– class of models = trace of length k --> "PBA"

• Predicate abstraction
– selection: class = program path (PE)
– transition relation weakening:
  • A priori: Boolean progs, Cartesian image (AW)
  • Das/Dill: class of models = abstract trace (PW)


Summary, cont

• All the property specific techniques are based on the RRP.

• This provides a tight coupling between model search and deduction
– Model search provides model classes to be refuted
– Deduction provides relevance information that guides model search

• Applying this principle allows us to automatically extract relevant information from systems in many applications.
