

Carnegie Mellon University

Decision Procedures Customized for Formal Verification

http://www.cs.cmu.edu/~bryant

Randal E. Bryant

Contributions by former graduate students: Sanjit Seshia, Shuvendu Lahiri

– 2 –CADE ‘05

Outline

Context
  Infinite-state models of hardware systems
  Verification techniques

Needs
  Requirements for decision procedures
  Dealing with quantifiers

Our Solution
  SAT-based procedure
  “Eager” Boolean encoding

– 3 –CADE ‘05

Verification Example

[Figure: Alpha 21264 microprocessor. Source: Microprocessor Report, Oct. 28, 1996]

Task
  Verify that microprocessor correctly implements instruction set definition
  Even though heavily pipelined

– 4 –CADE ‘05

Existing Hardware Verification Methods

Simulators, equivalence checkers, model checkers, …

All Operate at Bit Level
  View each register or memory bit as state variable
  Behavior of each state variable defined by Boolean function

Strengths
  Finite-state systems conceptually simple
  BDDs & SAT procedures allow high degrees of automation

Limitations
  State space can be very large
  Only verify fixed instantiation of system
    Specific memory sizes, number of processes, buffer lengths, …

– 5 –CADE ‘05

Verification Challenges

[Figure: Alpha 21264 microprocessor. Source: Microprocessor Report, Oct. 28, 1996]

Sources of Complexity
  Lots of internal state
  Complex control logic

Opportunities
  Most of the logic serves to store, select, and communicate data

– 6 –CADE ‘05

Applying Data Abstraction to Hardware Verification

Idea
  Abstract details of data encodings and operations
  Keep control logic precise

Applications
  Verify overall correctness of system
  Assuming individual functional units correct

Advantages of Abstraction
  Abstract infinite-state system easier to verify than detailed finite-state one
  Parametric representation allows verification of many different system variants
    Arbitrary number of processes, buffer lengths, etc.

– 7 –CADE ‘05

Word Abstraction

Data: Abstract details of form & functions
Control: Keep at bit level
Timing: Keep at cycle level

[Figure: control logic driving a data path built from combinational logic blocks 1 and 2]

– 8 –CADE ‘05

Data Abstraction #1: Bits → Terms

View Data as Symbolic Words
  Arbitrary integers
    No assumptions about size or encoding
    Classic model for reasoning about software
  Can store in memories & registers

[Figure: bits x0, x1, x2, …, xn-1 abstracted to a single term x]

– 10 –CADE ‘05

Abstracting Data Bits

[Figure: control logic and data path with combinational logic blocks, shown before and after the data bits are replaced by terms; the logic blocks in the abstracted version are marked "?"]

What do we do about logic functions?

– 11 –CADE ‘05

Abstraction #2: Uninterpreted Functions

For Any Block that Transforms or Evaluates Data:
  Replace with generic, unspecified function
  Only assumed property is functional consistency:
    a = x ∧ b = y ⇒ f(a, b) = f(x, y)

[Figure: ALU block replaced by function f]
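Functional consistency can be illustrated with a minimal Python sketch. The class `UninterpretedFunction` is a hypothetical helper, not part of any tool named in these slides. It hands out a fresh symbolic result for each new argument tuple, so the only property a caller can rely on is that equal arguments give equal results. That distinct arguments give distinct results here is an artifact of the sketch, not a property of uninterpreted functions.

```python
import itertools


class UninterpretedFunction:
    """Hypothetical helper: a function symbol with no fixed meaning."""

    def __init__(self, name):
        self.name = name
        self._table = {}                  # argument tuple -> symbolic result
        self._fresh = itertools.count(1)  # source of fresh result names

    def __call__(self, *args):
        # Functional consistency: equal arguments always map to the same result.
        if args not in self._table:
            self._table[args] = f"{self.name}_{next(self._fresh)}"
        return self._table[args]


f = UninterpretedFunction("f")
a, b = 3, 7
x, y = 3, 7
consistent = f(a, b) == f(x, y)  # a = x and b = y, so the results must agree
```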

– 12 –CADE ‘05

Abstracting Functions

For Any Block that Transforms Data:
  Replace by uninterpreted function
  Ignore detailed functionality
  Conservative approximation of actual system

[Figure: control logic and data path, with the combinational logic blocks replaced by uninterpreted functions F1 and F2]

– 14 –CADE ‘05

Abstraction #3: Modeling Memories as Mutable Functions

Memory M Modeled as Function
  M(a): Value at location a

Initially
  Arbitrary state
  Modeled by uninterpreted function m0

[Figure: memory M read at address a, modeled as m0 applied to a]

– 15 –CADE ‘05

Effect of Memory Write Operation

Writing Transforms Memory
  M′ = Write(M, wa, wd)
  Reading from updated memory:
    Address wa will get wd
    Otherwise get what’s already in M

Express with Lambda Notation
  M′ = λ a . ITE(a = wa, wd, M(a))

[Figure: read of M′ at address a drawn as a multiplexer that selects wd when a = wa and M(a) otherwise]
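The lambda form of a write maps directly onto closures. The sketch below is illustrative only: `write` and `m0` are names chosen here, and `m0` returns a tagged tuple as a stand-in for the arbitrary initial contents.

```python
def m0(a):
    # Stand-in for the uninterpreted initial-state function: the value at
    # address a is just the symbolic pair ("m0", a).
    return ("m0", a)


def write(mem, wa, wd):
    # M' = lambda a . ITE(a = wa, wd, M(a))
    return lambda a: wd if a == wa else mem(a)


m1 = write(m0, 4, "d1")   # write d1 to address 4
m2 = write(m1, 9, "d2")   # then write d2 to address 9
```

Reading `m2(4)` yields the first written value, `m2(9)` the second, and any other address falls through to `m0`.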

– 16 –CADE ‘05

Systems with Buffers

Modeling Method
  Mutable function to describe buffer contents
  Integers to represent head & tail pointers
  Parameterize buffer capacity with symbolic value Max

[Figure: an unbounded buffer and a circular queue with indices 0 to Max-1; in each, the in-use region is delimited by the head and tail pointers]

– 17 –CADE ‘05

Some History of Term-Level Modeling

Historically
  Standard model used for program verification
    Unbounded integer data types
  Widely used with theorem-proving approaches to hardware verification
    E.g., Hunt ’85

Automated Approaches to Hardware Verification
  Burch & Dill, ’95
    Tool for verifying pipelined microprocessors
    Implemented by form of symbolic simulation
  Continued application to pipelined processor verification

– 18 –CADE ‘05

UCLID

Seshia, Lahiri, Bryant, CAV ’02

Term-Level Verification System
  Language for describing systems
    Inspired by CMU SMV
  Symbolic simulator
    Generates integer expressions describing system state after sequence of steps
  Decision procedure
    Determines validity of formulas
  Support for multiple verification techniques

Available by Download
  http://www.cs.cmu.edu/~uclid

– 19 –CADE ‘05

Required Logic

Scalar Data Types
  Formulas (F): Boolean expressions
    Control signals
  Terms (T): Integer expressions
    Data values

Functional Data Types
  Functions (Fun): Integer → Integer
    Immutable: Functional units
    Mutable: Memories
  Predicates (P): Integer → Boolean
    Immutable: Data-dependent control
    Mutable: Bit-level memories

– 20 –CADE ‘05

CLU Logic

Counter Arithmetic, Lambda Expressions and Uninterpreted Functions

Terms (T): Integer Expressions
  ITE(F, T1, T2)          If-then-else
  Fun(T1, …, Tk)          Function application
  succ(T)                 Increment
  pred(T)                 Decrement

Formulas (F): Boolean Expressions
  ¬F, F1 ∧ F2, F1 ∨ F2    Boolean connectives
  T1 = T2                 Equation
  T1 < T2                 Inequality
  P(T1, …, Tk)            Predicate application

[Callout: To support pointer operations]

– 21 –CADE ‘05

CLU Logic (Cont.)

Functions (Fun): Integer → Integer
  f                       Uninterpreted function symbol
  λ x1, …, xk . T         Function definition

Predicates (P): Integer → Boolean
  p                       Uninterpreted predicate symbol
  λ x1, …, xk . F         Predicate definition
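As a rough illustration of the grammar, the sketch below encodes a subset of CLU terms and formulas as Python dataclasses and evaluates them under a concrete interpretation. The class names, the evaluator, and the `funs` dictionary that supplies meanings for function symbols are all assumptions made for this example. Disjunction, predicate application, and lambda definitions are omitted to keep it short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Succ:
    arg: object

@dataclass(frozen=True)
class Pred:
    arg: object

@dataclass(frozen=True)
class App:            # application of a function symbol to argument terms
    fun: str
    args: tuple

@dataclass(frozen=True)
class Ite:
    cond: object
    then: object
    other: object

@dataclass(frozen=True)
class Eq:
    left: object
    right: object

@dataclass(frozen=True)
class Lt:
    left: object
    right: object

@dataclass(frozen=True)
class Not:
    arg: object

@dataclass(frozen=True)
class And:
    left: object
    right: object


def eval_term(t, env, funs):
    """Evaluate a term to an integer under variable and function interpretations."""
    if isinstance(t, Var):
        return env[t.name]
    if isinstance(t, Succ):
        return eval_term(t.arg, env, funs) + 1
    if isinstance(t, Pred):
        return eval_term(t.arg, env, funs) - 1
    if isinstance(t, App):
        return funs[t.fun](*(eval_term(a, env, funs) for a in t.args))
    if isinstance(t, Ite):
        branch = t.then if eval_formula(t.cond, env, funs) else t.other
        return eval_term(branch, env, funs)
    raise TypeError(f"not a term: {t!r}")


def eval_formula(f, env, funs):
    """Evaluate a formula to a Boolean under the same interpretations."""
    if isinstance(f, Eq):
        return eval_term(f.left, env, funs) == eval_term(f.right, env, funs)
    if isinstance(f, Lt):
        return eval_term(f.left, env, funs) < eval_term(f.right, env, funs)
    if isinstance(f, Not):
        return not eval_formula(f.arg, env, funs)
    if isinstance(f, And):
        return eval_formula(f.left, env, funs) and eval_formula(f.right, env, funs)
    raise TypeError(f"not a formula: {f!r}")


# head < succ(tail): a typical pointer comparison for a buffer
pointer_check = Lt(Var("head"), Succ(Var("tail")))
holds = eval_formula(pointer_check, {"head": 2, "tail": 2}, {})
```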

– 22 –CADE ‘05

Outline

Context
  Infinite-state models of hardware systems
  Verification techniques

Needs
  Requirements for decision procedures
  Dealing with quantifiers

Our Solution
  SAT-based procedure
  “Eager” Boolean encoding

– 23 –CADE ‘05

Verifying Safety Properties

State Machine Model
  State encoded as Booleans, integers, and functions
  Next state function expresses how updated on each step

Prove: System will never reach bad state

[Figure: state machine with reset, arbitrary inputs, present state, and next state; the reachable states grow from the reset states and must not intersect the bad states]

– 24 –CADE ‘05

Bounded Model Checking

Repeatedly Perform Image Computations
  Set of all states reachable by one more state transition

Underapproximation of Reachable State Set
  But, typically catch most bugs with 8–10 steps

[Figure: sets R1, R2, …, Rn growing from the reset states inside the reachable set, with the bad states outside]

– 25 –CADE ‘05

Implementing BMC

Construct verification condition formula for step n by symbolically simulating system for n cycles
Check with decision procedure
Do as many cycles as tractable

[Figure: system S unrolled from Reset over inputs X1, X2, …, Xn; is Bad satisfiable?]
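The unrolling idea can be sketched on a toy explicit-state system. This is not how a term-level tool works: there the n-step formula is built symbolically and handed to a decision procedure. Here brute-force enumeration of input sequences stands in for that check, and the counter system, `bmc`, `step`, and `is_bad` are all invented for illustration.

```python
from itertools import product


def bmc(reset_states, inputs, step, is_bad, max_steps):
    """Search for a bad state reachable in at most max_steps transitions.

    Returns (n, (s0, input_sequence)) for the smallest such n, or None.
    """
    for n in range(max_steps + 1):
        for s0 in reset_states:
            # Every input sequence of length n plays the role of X1, ..., Xn.
            for xs in product(inputs, repeat=n):
                s = s0
                for x in xs:
                    s = step(s, x)
                if is_bad(s):
                    return n, (s0, xs)
    return None


def step(s, x):
    # Toy system: a counter that advances when the input is 1.
    return s + x


def is_bad(s):
    return s == 3


found = bmc([0], (0, 1), step, is_bad, max_steps=5)
missed = bmc([0], (0, 1), step, is_bad, max_steps=2)
```

With a bound of 5 the search returns the three-step counterexample; with a bound of 2 it returns `None`, which shows why bounded checking underapproximates the reachable set.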

– 26 –CADE ‘05

True Model Checking

Reach Fixed-Point
  Rn = Rn+1 = Reachable

Impractical for Term-Level Models
  Many systems never reach fixed point
    Can keep adding elements to buffer
  Convergence test undecidable
    (Bryant, Lahiri, Seshia, CHARME ’03)

[Figure: sets R1, R2, …, Rn growing from the reset states, with the bad states outside]

– 27 –CADE ‘05

Inductive Invariant Checking

Key Properties of System that Make it Operate Correctly
  Formulate as formula I

Prove Inductive
  Holds initially: I(s0)
  Preserved by all state changes: I(s) ⇒ I(δ(i, s))

[Figure: invariant region I containing the reset states and the reachable states, and excluding the bad states]
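The two proof obligations can be checked by brute force on a small finite system. The sketch below is an illustration only: the mod-8 counter, `is_inductive`, and the two candidate invariants are assumptions, and exhaustive enumeration stands in for a decision procedure.

```python
def is_inductive(states, inputs, reset_states, step, inv):
    """Check I(s0) for every reset state and I(s) => I(step(i, s)) for every s, i."""
    holds_initially = all(inv(s0) for s0 in reset_states)
    preserved = all(inv(step(i, s)) for s in states if inv(s) for i in inputs)
    return holds_initially and preserved


def step(i, s):
    # Toy system: a mod-8 counter that advances by 2 when the input is 1.
    return (s + 2 * i) % 8


states = range(8)
inputs = (0, 1)
resets = [0]

even_ok = is_inductive(states, inputs, resets, step, lambda s: s % 2 == 0)
not_three_ok = is_inductive(states, inputs, resets, step, lambda s: s != 3)
```

The property `s != 3` holds on every reachable state (0, 2, 4, 6) but is not inductive, because the unreachable state 1 satisfies it and steps to 3. The stronger invariant "s is even" is inductive and implies it.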

– 28 –CADE ‘05

Inductive Invariants

Formulas I1, …, In
  Ij(s0) holds for any initial state s0, for 1 ≤ j ≤ n
  I1(s) ∧ I2(s) ∧ … ∧ In(s) ⇒ Ij(s′) for any current state s and successor state s′, for 1 ≤ j ≤ n

Overall Correctness
  Follows by induction on time

Restricted Form of Invariants
  ∀x1 ∀x2 … ∀xk Φ(x1…xk)
    Φ(x1…xk) is a CLU formula without quantifiers
    x1…xk are integer variables free in Φ(x1…xk)
  Express properties that hold for all buffer indices, register IDs, etc.

– 29 –CADE ‘05

Proving Invariants

Proving invariants inductive requires quantifiers
  |= [∀x1 ∀x2 … ∀xk Φ(x1…xk)] ⇒ [∀y1 ∀y2 … ∀ym Ψ(y1…ym)]

Prove Unsatisfiability of Formula
  ∀x1 ∀x2 … ∀xk Φ(x1…xk) ∧ ¬Ψ(y1…ym)

Undecidable Problem
  In logic with uninterpreted functions and equality

– 30 –CADE ‘05

Invariant Checking: Out-of-Order Processor Designs

Generating invariants requires considerable human effort
Impractical for realistic designs

                   base     exc      exc/br   exc/br/mem-simp   exc/br/mem
Total invariants   13       34       39       67                71
UCLID time         54 s     236 s    403 s    1594 s            2200 s
Person time        2 days   7 days   9 days   24 days           34 days

– 31 –CADE ‘05

Constructing Invariants from Predicates

Predicates
  reg.valid(r)
  reg.tag(r) = t
  rob.dest(t) = r
  rob.head ≤ reg.tag(r)

Invariant
  ∀r,t. ¬reg.valid(r) ∧ reg.tag(r) = t ⇒ (rob.head ≤ reg.tag(r) < rob.tail ∧ rob.dest(t) = r)

Result: Correctness

– 32 –CADE ‘05

Automatic Predicate Abstraction

Graf & Saïdi, CAV ’97

Idea
  Given set of predicates P1(s), …, Pk(s)
    Boolean formulas describing properties of system state
  View as abstraction mapping: States → {0,1}^k
  Defines abstract FSM over state set {0,1}^k
    Form of abstract interpretation
    Do reachability analysis similar to symbolic model checking

Early Implementations Inefficient
  Guess at possible next abstract states
  Test with call to decision procedure

– 33 –CADE ‘05

P.A. as Invariant Generator

Reach Fixed-Point on Abstract System
  Termination guaranteed, since finite state

Equivalent to Computing Invariant for Concrete System
  Strongest possible invariant that can be expressed by formula over these predicates

[Figure: abstract system A reaching a fixed point R1, R2, …, Rn from its reset states; concretizing the result gives invariant I for concrete system C, containing C's reset states]
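A toy version of the fixed-point computation is sketched below. The concrete system (a mod-16 counter), the two predicates, and the function names are assumptions for illustration. Enumerating all concrete states stands in for the decision-procedure calls a term-level tool would make, so this only works because the toy system is finite.

```python
def alpha(s, preds):
    # Abstraction mapping: concrete state -> tuple of predicate truth values.
    return tuple(p(s) for p in preds)


def abstract_reach(states, inputs, resets, step, preds):
    """Reachable abstract states of the abstract FSM induced by preds."""
    # Abstract transition relation, built here by enumerating concrete states.
    trans = {}
    for s in states:
        for i in inputs:
            trans.setdefault(alpha(s, preds), set()).add(alpha(step(i, s), preds))
    reached = {alpha(s, preds) for s in resets}
    frontier = set(reached)
    while frontier:                      # terminates: at most 2^k abstract states
        successors = set()
        for a in frontier:
            successors |= trans.get(a, set())
        frontier = successors - reached
        reached |= frontier
    return reached


def step(i, s):
    # Toy concrete system: a mod-16 counter that advances by 2 when i is 1.
    return (s + 2 * i) % 16


preds = (lambda s: s % 2 == 0, lambda s: s < 8)
reached = abstract_reach(range(16), (0, 1), [0], step, preds)


def invariant(s):
    # Concretization of the fixed point: s is allowed iff its abstraction was reached.
    return alpha(s, preds) in reached
```

Here the fixed point contains only abstract states whose first predicate is true, so the concretized invariant is "s is even", the strongest statement these two predicates can express about the reachable states.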

– 34 –CADE ‘05

Symbolic Formulation of Predicate Abstraction

Lahiri, Bryant, Cook, CAV ’03

Basic Operation
  Compute set of legal abstract next states ρ′(B′) given current abstract states ρ(B)
    B, B′: Abstract current and next-state state variables
    ρ, ρ′: Boolean formulas
  Create formula of form φ(S, B′)
    Possible combinations of current concrete state S and next abstract state B′

Formulate as Quantifier Elimination Problem
  Generate formula of form ρ′(B′) ≡ ∃S φ(S, B′)
    S: Integer variables
    For interpretation of B′, formula true iff φ(S, B′) satisfiable

– 35 –CADE ‘05

Outline

Context
  Infinite-state models of hardware systems
  Verification techniques

Needs
  Requirements for decision procedures
  Dealing with quantifiers

Our Solution
  SAT-based procedure
  “Eager” Boolean encoding

– 36 –CADE ‘05

Decision Procedure Needs

Bounded Model Checking
  Satisfiability of quantifier-free CLU formula
  Handled by decision procedure

Invariant Checking
  Satisfiability of quantified CLU formula
  Undecidable

Predicate Abstraction
  Eliminate quantifiers from CLU formula

Role of Decision Procedure
  Apply in sound, but incomplete way

– 37 –CADE ‘05

UCLID Decision Procedure Operation

Series of transformations leading to propositional formula
Except for lambda expansion, each has polynomial complexity

CLU Formula → [Lambda Expansion] → λ-free Formula → [Function & Predicate Elimination] → Term Formula → [Finite Instantiation] → Boolean Formula → [Boolean Satisfiability]
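The slide does not say how function applications are eliminated. One known way is to replace each application of a symbol by a nested ITE over fresh variables, so that functional consistency is built into the term itself. The sketch below shows that scheme on argument names given as strings. `eliminate_function` and the `vf1, vf2, …` naming are assumptions for illustration, not a description of UCLID's code.

```python
def eliminate_function(name, args):
    """Replace the i-th application name(args[i]) by a nested ITE term.

    The term returns the fresh variable of the earliest previous application
    with an equal argument, or a new fresh variable if there is none.
    """
    terms = []
    for i, arg in enumerate(args):
        term = f"v{name}{i + 1}"              # fresh variable for this application
        for j in range(i - 1, -1, -1):        # wrap comparisons, earliest outermost
            term = f"ITE({arg} = {args[j]}, v{name}{j + 1}, {term})"
        terms.append(term)
    return terms


replacements = eliminate_function("f", ["a", "b", "c"])
# f(a) -> vf1
# f(b) -> ITE(b = a, vf1, vf2)
# f(c) -> ITE(c = a, vf1, ITE(c = b, vf2, vf3))
```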

– 38 –CADE ‘05

SAT-based Decision Procedures

Eager Encoding
  Input Formula → Satisfiability-preserving Boolean Encoder → Boolean Formula → SAT Solver → satisfiable / unsatisfiable

Lazy Encoding
  Input Formula → Approximate Boolean Encoder → Boolean Formula → SAT Solver
    SAT Solver reports unsatisfiable: input formula is unsatisfiable
    SAT Solver reports satisfying assignment: pass to First-order Conjunctions SAT Checker
      Checker reports satisfiable: input formula is satisfiable
      Checker reports unsatisfiable: add additional clause to Boolean Formula and repeat

– 39 –CADE ‘05

Eager Encoding Characteristics

– Must encode all information about domain properties into Boolean formula
– Some properties can give exponential blowup
+ Lets SAT solver do all of the work

Good Approach for Some Domains
  Modern SAT solvers have remarkable capacity
    Good at extracting relevant portions out of very large formulas
    Learns about formula properties as search proceeds

[Figure: Input Formula → Satisfiability-preserving Boolean Encoder → Boolean Formula → SAT Solver → satisfiable / unsatisfiable]

– 41 –CADE ‘05

Encoding Methods

Difference Logic Formula → Per-Constraint Encoding (PC) or Small Domain Encoding (SD) → Boolean Formula → SAT Solver → satisfiable / unsatisfiable

– 42 –CADE ‘05

Small Domain Encoding (SD)

[Bryant, Lahiri, Seshia, CAV ’02]

Example formula: x ≥ y ∧ y ≥ z ∧ z ≥ x+1

Observation: To check satisfiability, need to consider all possible relative orderings of finitely many expressions
  Here: x, x+1, y, z (values increase along the ordering)

Can use Boolean encoding of finite range of values
  4 values in this case, so 2-bit encoding
  0x1x0 ≥ 0y1y0 ∧ 0y1y0 ≥ 0z1z0 ∧ 0z1z0 ≥ 0x1x0 + 1
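A brute-force sketch of the small-domain idea follows. It takes the three constraints to be x ≥ y, y ≥ z and z ≥ x + 1, which is an assumption about the slide's example, and uses the 2-bit range stated on the slide. Enumerating all bit assignments stands in for the SAT solver, and `sd_solve` is a name invented here.

```python
from itertools import product


def sd_solve(formula, num_vars, bits_per_var):
    """Search all bit-level assignments; return integer values or None."""
    for bits in product((0, 1), repeat=num_vars * bits_per_var):
        values = []
        for v in range(num_vars):
            chunk = bits[v * bits_per_var:(v + 1) * bits_per_var]
            # Decode the chunk as an unsigned binary number, high bit first.
            values.append(int("".join(map(str, chunk)), 2))
        if formula(*values):
            return tuple(values)
    return None


# Assumed example: a cycle of constraints that forces x >= x + 1.
unsat = sd_solve(lambda x, y, z: x >= y and y >= z and z >= x + 1, 3, 2)

# Relaxing the last constraint to z + 1 >= x makes the formula satisfiable.
sat = sd_solve(lambda x, y, z: x >= y and y >= z and z + 1 >= x, 3, 2)
```

Finding no solution in the small range is conclusive only because the range is large enough to realize every relative ordering of the expressions. Choosing that range is the substance of the method and is taken as given here.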

– 43 –CADE ‘05

Per-Constraint Encoding (PC)

[Strichman, Seshia, Bryant, CAV ’02]

Example formula: x ≥ y ∧ y ≥ z ∧ z ≥ x+1

One Boolean variable per constraint
  e1 ≡ x ≥ y
  e2 ≡ y ≥ z
  e3 ≡ z ≥ x+1

New Difference Predicate
  e4 ≡ x ≥ z

Transitivity Constraints
  e1 ∧ e2 ⇒ e4
  e4 ⇒ ¬e3

Overall Boolean Encoding
  e1 ∧ e2 ∧ e3, conjoined with the transitivity constraints
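The effect of the transitivity constraints can be seen in a small sketch. It assumes the example constraints are e1: x ≥ y, e2: y ≥ z, e3: z ≥ x + 1 with the derived predicate e4: x ≥ z, and it enumerates truth assignments in place of a SAT solver. `pc_solve` is a name invented for this illustration.

```python
from itertools import product


def pc_solve(with_transitivity):
    """Return a satisfying assignment (e1, e2, e3, e4) or None."""
    for e1, e2, e3, e4 in product((False, True), repeat=4):
        formula = e1 and e2 and e3                 # Boolean skeleton of the input
        if with_transitivity:
            t1 = (not (e1 and e2)) or e4           # e1 and e2 imply e4
            t2 = (not e4) or (not e3)              # e4 and e3 cannot both hold
            formula = formula and t1 and t2
        if formula:
            return (e1, e2, e3, e4)
    return None


spurious = pc_solve(with_transitivity=False)
result = pc_solve(with_transitivity=True)
```

Without the transitivity constraints the skeleton has a satisfying assignment that no integer values can realize. With them the encoding is unsatisfiable, matching the arithmetic meaning.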

– 44 –CADE ‘05

Size of Boolean Encoding: SD better than PC

Let N be size of original difference logic formula
  Size of a directed acyclic graph representation

SD encoding size is worst-case O(N^2)
PC encoding size is worst-case O(2^N)
  Can generate O(2^N) transitivity constraints

Example: N = 6813

Method   Boolean Encoding Size
SD       54465
PC       > 1000000

– 45 –CADE ‘05

Impact on SAT problem: SD vs PC

Experimentally compared zChaff performance on SD and PC encodings of several unsatisfiable formulas

Sample result: PC better than SD for zChaff

Method   # Boolean variables   # CNF Clauses   # Conflict Clauses   zChaff Time (sec)
PC       57211                 169387          150                  0.56
SD       23112                 67699           15811                21.63

– 46 –CADE ‘05

How to Choose Encoding

Hybrid Strategy
  Partition variables into classes
    Which ones are compared to each other
  For each class, choose encoding method
    PC except SD when PC blows up

How to Determine Whether PC Will Work
  Try to predict based on formula characteristics
    Number of constraints, density, …
    Selection procedure trained by machine learning
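The partitioning step amounts to finding connected components of the "compared to each other" relation, which a union-find structure handles. In the sketch below the per-class choice is a plain constraint-count threshold, a hypothetical stand-in for the trained selection procedure mentioned on the slide. All function names and the `pc_limit` parameter are assumptions.

```python
def variable_classes(constraints):
    """Group variables that are (transitively) compared to each other."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for x, y in constraints:
        parent[find(x)] = find(y)           # union the two classes

    classes = {}
    for v in list(parent):
        classes.setdefault(find(v), set()).add(v)
    return sorted(sorted(c) for c in classes.values())


def choose_encodings(constraints, pc_limit):
    """Pick PC for small classes and SD otherwise (hypothetical threshold rule)."""
    plan = {}
    for cls in variable_classes(constraints):
        count = sum(1 for x, _ in constraints if x in cls)
        plan[tuple(cls)] = "PC" if count <= pc_limit else "SD"
    return plan


# Each pair names two variables that appear together in some constraint.
constraints = [("x", "y"), ("y", "z"), ("u", "v")]
classes = variable_classes(constraints)
plan = choose_encodings(constraints, pc_limit=1)
```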

– 47 –CADE ‘05

Some Lessons We’ve Learned About Decision Procedures

Preserve Boolean Structure
  Other approaches require collapsing to conjunctions of predicates (or extracting them dynamically)

Exploit Problem Characteristics
  Sparseness
  Polarity structure

Let SAT Solver Do the Work
  Eager encoding: provide sufficient set of constraints to prove / disprove formula
  SAT solvers are good at digesting large volumes of information

– 48 –CADE ‘05

Invariant Checking Revisited

Prove Unsatisfiability of Formula
  ∀x1 ∀x2 … ∀xk Φ(x1…xk) ∧ ¬Ψ(y1…ym)
  General form: ∀X Φ(X) ∧ ¬Ψ(Y)

Quantifier Instantiation
  Generate expressions E1(Y), …, En(Y)
    Using terms that appear in Q
  Expand as Φ(E1(Y)) ∧ … ∧ Φ(En(Y)) ∧ ¬Ψ(Y)
    If unsatisfiable, then so is quantified formula
    Sound, but incomplete

Trade-off
  Be clever about instantiation, or
  Instantiate many terms and rely on decision procedure capacity
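Instantiation can be sketched on a propositional toy. The universally quantified part says p(x) implies q(x) for every x, and the ground part says p(y) holds and q(y) does not. Both formulas, the atom naming, and `instantiate_and_check` are assumptions for illustration. Ground atoms are treated as independent Booleans, and enumeration stands in for the decision procedure.

```python
from itertools import product


def instantiate_and_check(quantified, ground, terms, atoms):
    """Return True if the instantiated formula is unsatisfiable.

    True is conclusive: the quantified formula is unsatisfiable as well.
    False is inconclusive: the chosen instances may simply be too weak.
    """
    for bits in product((False, True), repeat=len(atoms)):
        val = dict(zip(atoms, bits))
        if all(quantified(t, val) for t in terms) and ground(val):
            return False
    return True


def quantified(t, val):
    # Body of the universally quantified formula at instance t: p(t) implies q(t).
    return (not val[("p", t)]) or val[("q", t)]


def ground(val):
    # Quantifier-free part over y: p(y) holds and q(y) does not.
    return val[("p", "y")] and not val[("q", "y")]


atoms = [("p", "y"), ("q", "y")]
proved = instantiate_and_check(quantified, ground, ["y"], atoms)
inconclusive = instantiate_and_check(quantified, ground, [], atoms)
```

Instantiating x with the term y is enough to expose the contradiction. With no instances the check finds a model, which says nothing about the quantified formula.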

– 49 –CADE ‘05

Predicate Abstraction Revisited

Formulate as Quantifier Elimination Problem
  Generate formula of form ρ′(B′) ≡ ∃S φ(S, B′)
    S: Integer variables

Use Eager SAT Encoding of φ
  Get formula ∃A P(A, B′)
    A: Boolean variables
    Satisfying solutions for P w.r.t. B′ same as those for φ

Core problem of symbolic model checking

– 50 –CADE ‘05

Quantifier Elimination for P.A.

Formula ∃A P(A, B′)
  A: Boolean variables
  Typically: 200+ variables for A, ~20 for B′

BDD-Based
  Use partitioning techniques developed for symbolic model checking
  Typically too many total Boolean variables

SAT Enumeration
  Find satisfying solution α(A) ∧ β(B′) to P
  Enumerate solution β(B′)
  Reformulate P as P ∧ ¬β(B′)
  Performance: about 1000 solutions / second
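The enumeration loop can be sketched with a brute-force search standing in for the SAT solver. Each solution's values for the B variables are recorded and then blocked, which corresponds to conjoining the negation of that solution onto P. The example formula and the names `solve` and `project` are assumptions for illustration.

```python
from itertools import product


def solve(p, n_a, n_b, blocked):
    """Stand-in for a SAT call: find (a, b) satisfying p with b not yet blocked."""
    for a in product((False, True), repeat=n_a):
        for b in product((False, True), repeat=n_b):
            if b not in blocked and p(a, b):
                return a, b
    return None


def project(p, n_a, n_b):
    """Enumerate all b for which some a satisfies p(a, b)."""
    blocked = set()
    while True:
        solution = solve(p, n_a, n_b, blocked)
        if solution is None:
            return blocked
        # Blocking b plays the role of reformulating P with the negated solution.
        blocked.add(solution[1])


def p(a, b):
    # Toy formula: a0 or a1 holds, b0 equals (a0 and a1), and b1 is false.
    return (a[0] or a[1]) and b[0] == (a[0] and a[1]) and not b[1]


solutions = project(p, n_a=2, n_b=2)
```

The number of solver calls is one more than the number of distinct B solutions, which is why the approach degrades as the number of predicates grows.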

– 51 –CADE ‘05

Why Verification Tasks Feasible

CLU Logic Fairly Simple
  Equality, uninterpreted functions, difference constraints
  Small model property

“Deep” Reasoning Not Required
  Formulas large and messy, but straightforward
  Verifying systems that are designed to have constrained behaviors
  Only checking effect of a few cycles of system operation

– 52 –CADE ‘05

Decision Procedures Revisited

SAT-Based Approaches Effective
  Good performance as decision procedures
  Key to implementing predicate abstraction
    Quantifier elimination

Eager Encoding Gives Good Performance
  Avoids many iterations of theory-specific checkers
  Extends to linear integer arithmetic
    Seshia & Bryant, LICS ’04
    Quantifier-free Presburger
    Small domain encoding exploiting sparseness

– 53 –CADE ‘05

Areas of Research

Bit-Vector Decision Procedures
  True model for hardware & low-level software
    Bit-field extraction
    Bit-wise Boolean operations
    Overflow effects
  Automatically apply abstractions
    Abstract to symbolic terms whenever possible

Boolean Quantifier Elimination
  SAT enumeration still not good enough
    Limits predicate abstraction to ~25 predicates
  Core problem for symbolic model checking

– 54 –CADE ‘05

More Research

Proof Generation
  Hard to see how to generate unsatisfiability proof for CLU formula

Debugging Support
  Bounded model checking: provide counterexample trace
  Invariant checking: hard to determine why invariant fails
    And may be due to weakness in quantifier instantiation
  Predicate abstraction: Gets nowhere without right set of predicates

Proving Liveness
  Current abstractions do not preserve liveness properties
  Can help in proving progress invariant

Questions?