checking validity of quantifier-free formulas in combinations of first-order theories

Checking Validity of Quantifier-Free Formulas in Combinations of First-Order

Theories

Clark W. Barrett

Ph.D. Dissertation Defense

Department of Computer ScienceStanford University

August 2001

The Problem: First-Order Logic First-Order Logic is a mathematical system for

making precise statements. Statements in first-order logic are made up of

the following pieces: Variables x, y Constants 0, John, Functions f (x ), x + y Predicatesp (x ), x > y, x = y Boolean connectives , , , Quantifiers ,

Example: “Every rectangle is a square”x. (Rectangle (x ) Square(x))

The Problem: First-Order Theories A first-order theory is a set of first-order

statements about a related set of constants, functions, and predicates.

A theory of arithmetic might include the following statements about 0 and +:

x. ( x + 0 = x )

x,y. (x + y = y + x )

The Problem: Validity An expression is valid if every possible way

of interpreting it results in a true statement.

x = x p(x ) p(x ) x = y f (x ) = f (y )f (x ) = f (y ) x = y

Valid Valid Valid Invalid

An expression is valid in a theory if every possible way of interpreting it in that theory results in a true statement.

x 0


x 0 Invalid in the theory of real arithmetic


x 0 Valid in positive real arithmetic

The Problem: Validity Checking Suppose T is a first-order theory and is a first-order

formula We write T = as an abbreviation for “ is valid in

T ” A classical result in Computer Science states that in

general, the question of whether T = is undecidable. It is impossible to write a program that can

always figure out whether T =

However, given appropriate restrictions on T and , a program can automatically decide T =

We consider theories T such that T = is decidable when is quantifier-free.

Motivation Many interesting and practical problems can

be solved by checking the validity of a formula in some theory.

As evidence of this claim, consider the following widely-used tools tools which include decision procedures for checking validity PVS [Owre et al. ‘92] STeP [Manna et al. ‘96, Bjørner ‘99] ESC [Detlefs et al. ‘98] Mona [Klarlund and Møller ‘98] SVC [Barrett et al. ‘96]

The SVC Story Roots in processor verification

[Burch and Dill ‘94] [Jones et al. ‘95]

Internal use at Stanford Symbolic simulation [Su et al. ‘98] Software specification checking [Park et al. ‘98] Infinite-state model checking [Das and Dill ‘01]

External use since public release in 1998 Model Checking [Boppana et al. ‘99] Theorem prover proof assistance [Heilmann ‘99] Integration into programming languages [Day et

al. ‘99] Many others

The SVC Story Despite its success, SVC has many limitations

Gaps in theoretical understanding Outgrown its original software architecture Unnecessarily slow performance in some cases

This thesis is the result of ongoing efforts to address these limitations. New contributions to underlying theory A flexible and efficient implementation Techniques for faster and more robust

performance

Outline Validity Checking Overview

The Problem Motivation The SVC Story Top-Level Algorithm

Methods for Combining Theories Implementation Adapting Techniques from Propositional Satisfiability Contributions and Conclusions

Top-Level Algorithm Consider the following formula in the theory of

arithmetic

x > y y > x x = y

Step 1: Choose an atomic formula Step 2: Consider two cases:

Replace the atomic formula with true Replace the atomic formula is with false

Step 3: Simplify

true y > x x = y false y > x x = y

true y > x x = y

Top-Level Algorithm Consider the following formula in the theory of

arithmetic

x > y y > x x = y

true y > x x = yfalse y > x x = y

true y > x x = y

true x = y

true falsex y y x x yThis formula is unsatisfiable

Validity Checking Overview A literal is an atomic formula or its negation

The validity checker is built on top of a core decision procedure for satisfiability in T of a set of literals.

The method for checking satisfiability will vary greatly depending on the theory in question

The most powerful technique for producing a satisfiability procedure is by combining other satisfiability procedures

Outline Validity Checking Overview Methods for Combining Theories

The Problem Shostak’s Method The Nelson-Oppen Method A Combined Method

Implementation Adapting Techniques from Propositional Satisfiability Contributions and Conclusions

The Problem Consider the following theories:

Real linear arithmetic: +,-,0,1,…, Arrays: s[i], update(s,i,v) Uninterpreted functions and predicates: f

(x ), p(x ),…

And the following set of literals in the combined theory:p (y ) s = update (t, i, 0 ) x - y - z = 0

z + s[i ] = f (x - y ) p (x - f (f (z ) ) )

Consider the following theories: Real linear arithmetic: +,-,0,1,…, Arrays: s[i], update(s,i,v) Uninterpreted functions and predicates: f

(x ), p(x ),…


z + s[i ] = f (x - y ) p (x - f (f (z ) ) )


(x ), p(x ),…


z + s[i ] = f (x - y ) p (x - f (f (z ) ) )


(x ), p(x ),…


z + s[i ] = f (x - y ) p (x - f (f (z ) ) )


(x ), p(x ),…


z + s[i ] = f (x - y ) p (x - f (f (z ) ) ) Question: Given a method to decide satisfiability of literals in each theory,

how do we decide the satisfiability of literals in the combined theory? Two main approaches, each with advantages

and disadvantages Shostak [Shostak ‘84] Nelson-Oppen [Nelson and Oppen ‘79]

Shostak’s Method Has formed an ongoing strand of research

Originally published in 1984 [Shostak ‘84] Several clarifying papers since then

[Cyrluk et al. ‘96] [Ruess and Shankar ‘01]

Used in several automated deduction systems PVS, STeP, SVC

Unfortunately, remains difficult to understand Details are nonintuitive Simple proof of correctness has been especially

elusive Contribution : A new presentation of a key subset

of Shostak’s original algorithm.

Shostak’s Method: Canonizer There are two main components in a Shostak

satisfiability procedure: the canonizer and the solver.

The canonizer rewrites terms into a unique form T = a = b canon (a ) = canon (b )

Example: canonizer for linear arithmetic Combines like terms

canon (x + x ) = 2x Imposes an ordering on the variables

canon (y + x ) = x + y

Shostak’s Method: Solver A set of equations E is said to be in solved form if

the left-hand side of each equation is a variable which appears only once in E

in solved form not in solved formx = y + z x = y + zw = z - a w = z + xv = 3y + b 2v = 3y + b

S means replace each left-hand side variable occurring in S with its corresponding right-hand side

E (w + x + y + z ) = z - a + y + z + y + z

Shostak’s Method: Solver The solver transforms an equation into an

equisatisfiable set of equations in solved form If T = a b , then solve (a = b ) = { false } Otherwise:

solve (a = b ) = a set of equations E in solved form

T = (a = b x. E ) x is a set of fresh variables appearing in E,

but not in a or b.

Example: solver for real linear arithmetic solve (x - y - z = 0 ) = { x = y + z } solve (x + 1 = x - 1 ) = { false }

The Simplified Algorithm Given a set of equations and disequations Step 1: Use the solver to convert into an

equisatisfiable set of equations E in solved form Use a generalization of Gaussian

elimination with back substitution

Choose matrix row


equisatisfiable set of equations E in solved form Select an equation from Apply E as a substitution to Solve to get E’ Apply E’ as a substitution to E Add E’ to E

-x - 3y + 2z = -1x - y - 6z = 12x + y - 10z = 3

311

1012611231E



-x - 3y + 2z = -1x - y - 6z = 12x + y - 10z = 3

311

1012611231E

Apply previous rowsChoose matrix row



x = -3y + 2z +1x - y - 6z = 12x + y - 10z = 3

311

1012611231E

Apply previous rowsMake pivot 1

Choose matrix row

Apply to previous rows



x - y - 6z = 12x + y - 10z = 3

311

1012611231E


Choose matrix row


x = -3y + 2z +1



-3y +2z +1-y -6z =12x + y - 10z = 3

E


Choose matrix row


x = -3y + 2z +1

301

1012440231



y = -z2x + y - 10z = 3

E


Choose matrix row


x = -3y + 2z +1

301

1012110231



y = -z2x + y - 10z = 3

E


Choose matrix row


x = -3(-z) +2z +1

301

1012110501



2x + y - 10z = 3E


Choose matrix row


x = 5z +1y = -z

301

1012110501



2(5z +1)+(-z )-10z=3E


Choose matrix row


x = 5z +1y = -z

101

100110501



z = -1E


Choose matrix row


x = 5z +1y = -z

101

100110501



z = -1E


Choose matrix row


x = 5(-1) +1y = -(-1)

114

100010001


equisatisfiable set of equations E in solved form Select an equation from Apply E as a substitution to Solve to get E’ Apply E’ as a substitution to E Add E’ to E E


Choose matrix row


x = -4y = 1z = -1

114

100010001


equisatisfiable set of equations E in solved form

Step 2: Use this set of equations together with the canonizer to check if any disequality is violated For each a b Check if canon (E (a ) ) = canon (E (b ) )E

x = -4y = 1z = -1

2y - 10x 6(z - 2x)2(1)-10(-4)6(-1-2(-4))

42 42


equisatisfiable set of equations E in solved form

Step 2: Use this set of equations together with the canonizer to check if any disequality is violated For each a b Check if canon (E (a ) ) = canon (E (b ) )E

x = 5z +1y = -z

1 - 4y x - z

1-4(-z) (5z +1) -z4z + 1 4z + 1

The Simplified Algorithm Given a set of equations and disequations

Step 1: Use the solver to convert into an equisatisfiable set of equations E in solved form

Step 2: Use this set of equations together with the canonizer to check if any disequality is violated For each a b Check if canon (E (a ) ) = canon (E (b ) )

Technical detail: If there is more than one disequality, the

theory must be convex

Shostak’s Method: Combining Theories In what sense is this algorithm a method for

combining theories?

Two Shostak theories T1 and T2 can often be combined to form a new Shostak theory T = T2 T2 Compose canonizers: canon = canon1 o canon2 Often, solvers can also be combined

Treat terms from other theory as variables Repeatedly apply solvers from each theory

until resulting set of equations is in solved form

Shostak’s Method: Contributions Shostak’s original algorithm is much more complicated

because it includes a decision procedure for the theory of pure equality with uninterpreted functions

Why is the simplified version a contribution? Can be applied directly to produce decision

procedures, even combinations of decision procedures

Much easier to understand and prove correct Provides intuition for understanding the original

algorithm Provides the foundation for a generalization of the

original Shostak method based on a variation of Nelson-Oppen

Nelson-Oppen Developed for the Stanford Pascal Verifier

[Nelson and Oppen ‘79] [Nelson ‘80, Oppen ‘80]

Tinelli and Harandi discovered a new (simpler) proof and an important optimization [Tinelli and Harandi ‘96]

Used in real systems ESC EHDM [von Henke et al. ‘88] Vampyre

[http://www-cad.eecs.berkeley.edu/~rupak/Vampyre]

Nelson-Oppen Unlike Shostak, Nelson-Oppen does not impose a

specific strategy on individual theories Instead of a solver and canonizer, Each theory provides a complete satisfiability

procedure Technical detail: Each theory must be stably infinite

There are two phases in the version of Nelson-Oppen presented by Tinelli and Harandi Purification phase Check phase

Nelson-Oppen: Purification Phase Transform a set of literals in a combined

theory to an equisatisfiable set of literals such that each literal is pure : A pure literal contains symbols from only

a single theory

Consider again the following set of literals in a combined theory of arithmetic, arrays, and uninterpreted functionsp (y ) s = update (t, i, 0 )x - y - z = 0z + s[i ] = f (x - y )p (x - f (f (z ) ) )

j = 0



a single theory

Consider again the following set of literals in a combined theory of arithmetic, arrays, and uninterpreted functionsp (y ) s = update (t, i, j )x - y - z = j z + s[i ] = f (x - y )p (x - f (f (z ) ) )

j = 0j = 0k = s[i ]



a single theory

Consider again the following set of literals in a combined theory of arithmetic, arrays, and uninterpreted functionsp (y ) s = update (t, i, j )x - y - z = jz + k = f (x - y )p (x - f (f (z ) ) )

j = 0k = s[i ]j = 0k = s[i ]l = x - ym= z + k



a single theory

Consider again the following set of literals in a combined theory of arithmetic, arrays, and uninterpreted functionsp (y ) s = update (t, i, j )l - z = jm = f (l )p (x - f (f (z ) ) )

j = 0k = s[i ]l = x - ym= z + k



a single theory

Consider again the following set of literals in a combined theory of arithmetic, arrays, and uninterpreted functions

j = 0k = s[i ]l = x - ym= z + kn = f (f (z ) ) )

v = x - np (y ) s = update (t, i, j )l - z = jm = f (l )p (v )



a single theory

Consider again the following set of literals in a combined theory of arithmetic, arrays, and uninterpreted functions

s = update (t, i, j ) k = s[i ]

p (y ) m = f (l )p (v )n = f (f (z ) ) )

l - z = jj = 0l = x - ym= z + kv = x - n

Nelson-Oppen: Check Phase Definitions Shared variables are variables that appear

in literals from more than one theory Shared: l, z, j, y, m, k, v, n Unshared: x, s, t, i

s = update (t, i, j ) k = s[i ]

p (y ) m = f (l )p (v )n = f (f (z ) ) )

l - z = jj = 0l = x - ym= z + kv = x - n

An arrangement of a set is a set of equalities that partitions the set into equivalence classes Suppose S = { a , b , c } Some arrangements of S

{ a b , a c , b c } { { a } , { b } , { c } } { a = b , a c , b c } { { a , b } , { c } } { a = b , a = c , b = c } { { a , b , c } }

Nelson-Oppen: Check Phase Choose an arrangement A of the shared

variables For each theory, check if the set of literals

pure in that theory together with the arrangement A is satisfiable

If an arrangement exists that is compatible with each set of literals, then the original set of literals is satisfiable in the combined theory

Arrayss = update (t, i, j ) k = s[i ]

Uninterpretedp (y ) m = f (l )p (v )n = f (f (z ) ) )

Arithmeticl - z = jj = 0l = x - ym= z + kv = x - n

A (l, z, j, y, m, k, v, n )

Nelson-Oppen: A Variation Contribution : A Variation of Nelson-Oppen

The purification phase can be eliminated Instead, simply partition the formulas

according to the outer-most symbol

p (y ) s = update (t, i, 0 )x - y - z = 0z + s[i ] = f (x - y )p (x - f (f (z ) ) )

Arithmetic x - y - z = 0z + s[i ] = f (x - y )

Arrays s = update (t, i, 0 )

Uninterpretedp (y ) p (x - f (f (z ) ) )


The purification phase can be eliminated Instead, simply partition the formulas

according to the outer-most symbol Choose an arrangement A of the shared

terms which appear in a term or formula belonging to another theory

For each theory, check if the set of literals assigned to that theory together with the arrangement is satisfiable

Terms with foreign symbols are treated as variables

Arithmetic x - y - z = 0z + s[i ] = f (x - y )

Arrays s = update (t, i, 0 )

Uninterpretedp (y ) p (x - f (f (z ) ) )

A (s[i ], x - y, f (x - y ), 0, y, z, f (f (z ) ), x - f (f (z ) ) )


The purification phase can be eliminated Instead, simply partition the formulas according to

the outer-most symbol Choose an arrangement A of the shared terms which

appear in a term or formula belonging to another theory

For each theory, check if the set of literals assigned to that theory together with the arrangement is satisfiable

Terms with foreign symbols are treated as variables Contributions of this variation

Fewer formulas given to each theory Easier to implement Easier to combine with Shostak

Combining Shostak and Nelson-Oppen Theory requirements

Shostak requires convexity Nelson-Oppen requires stable-infiniteness Contribution : The following theorem relates the two

Every convex first-order theorywith no trivial models is stably-infinite

The proof is based on first-order compactness Note: if a convex theory does admit trivial models, it

can usually be modified to include the non-triviality axiom:

x,y. x y

Combining Shostak and Nelson-Oppen Contribution : An algorithm for combining the two methods Equalities are processed according to the Shostak

algorithm to get a set of equalities E in solved form All literals are partitioned as in the Nelson-Oppen variation The key idea is to consider the partial arrangement induced

on the shared terms S by canon and E : A= : { a = b a,b S canon (E(a )) = canon (E(b )) }

An arrangement A is chosen as in the Nelson-Oppen variation, but this arrangement must include A=

This arrangement is automatically consistent with E The non-Shostak theories are checked for consistency

with the arrangement as before


Methods for Combining Theories

Implementation

Adapting Techniques from Propositional

Satisfiability

Contributions and Conclusions

Implementation: Approach Based on Nelson-Oppen and Shostak combination Online algorithm Optimizations

A Union-Find data structure and an Update List are used to efficiently keep track of both E and A simultaneously

Simplify phase added Each new formula is simplified Enables rewrites that can reduce the number

of shared terms Flexible theory interface

Accommodates Nelson-Oppen theories, Shostak theories, and more

Implementation: Interface Recall the top-level algorithm

x > y y > x x = y

Choose an atomic formula Consider two cases:

Add to the set of choices made and simplify Add to the set of choices made and simplify

Repeat until formula is true or set of choices is unsatisfiable

Interface from top-level : AddFact, Simplify, Satisfiable


true y > x x = y

AddFact Simplify

Theory-specific code

Top-level code

Assert

AssertEqualities RewriteAssert

FormulaSetupTerm

Satisfiable

RewriteSolve UpdateAssertSetupAddSharedTerm

CheckSat

Satisfiable

AddFact Simplify

Top-level code

Assert


FormulaSetupTerm

Satisfiable

p(y), s = update(t, i, 0), x -y -z = 0, z + s[i] = f (x - y), p(x -f (f (z)))

Uninterpreted Arrays Arithmetic (Shostak)E

p(y)

p(y)

p(y)

p(y)p(y)

y

y

p(y)

p(y)

Update List

AddFact Simplify

Top-level code

Assert


FormulaSetupTerm

Satisfiable


Uninterpreted Arrays Arithmetic (Shostak)E

s = update(t, i, 0)

0

y

p(y)0

s = update(t, i, 0)

s = update(t, i, 0)

s = update(t, i, 0)

s = update(t, i, 0)

s = ...

Update List

AddFact Simplify

Top-level code

Assert


FormulaSetupTerm

Satisfiable


Uninterpreted Arrays Arithmetic (Shostak)E Update List

x -y -z = 0

y

p(y)0

s = update(t, i, 0)

x = y + z

s = update(t, i, 0)

x -y -z = 0x = y + z

x = y + z

x = y + zy + z

x = ...y + z

AddFact Simplify

Top-level code

Assert


FormulaSetupTerm

Satisfiable



z + s[i] = f (x - y)

y

p(y)0

s = update(t, i, 0)

z = f (z)

s = update(t, i, 0)

z = f (z)

x = y + zy + z

z=f (z)f (z)

z+s[i]= ...

s[i]

s[i]0

0z = f (z)

z f (z )

z = f (z)

x - y

x - yz

z

z = f (z)

AddFact Simplify

Top-level code

Assert


FormulaSetupTerm

Satisfiable



p(x -f (f (z)))

y

p(y)0

s = update(t, i, 0)s = update(t, i, 0)x = y + f (z)

y + z = y + f (z)

p(x -…)

z

zf (z)

f (z)p (y )

z f (z )

z = f (z)

f (f (z))

f (f (z))f (z)

f (z)

z = f (z)

x -f (z)

x -f (z)y

yp (y )

p (y )

p (y )

Satisfiable

Implementation: Contributions Better implementation of Nelson-Oppen

Online algorithm Each theory only needs to consider a subset

of the shared terms Simplify phase

Can reduce number of shared terms Equality reasoning is only done once

Simple algorithm with detailed proof Flexible theory interface

Combined with Shostak Generalizes original Shostak algorithm Efficient: same data structure for E and A

Outline Validity Checking Overview Methods for Combining Theories Implementation Adapting Techniques from Propositional

Satisfiability The Problem Combining with SAT Results


The Problem Recall the top-level algorithm

x > y y > x x = y

Choose an atomic formula Consider two cases:

Add to the set of choices made and simplify Add to the set of choices made and simplify

Repeat until formula is true or set of choices is unsatisfiable


true y > x x = y

The Problem The choice of which atomic formula to try next can

make a dramatic difference in performance

SVC includes clever heuristics that improve performance significantly

We are convinced that better performance is possible Equivalent formulas can vary significantly in

performance Research in a related area, Boolean satisfiability

(SAT), has advanced significantly

Strategy : Find a way to apply SAT techniques to first-order validity checking

Combining with SAT: Approach Generate SAT problem from validity-checking

problem Negate the formula whose validity is in question Extract Boolean structure from resulting

formula Convert to CNF [Larabee ‘92] Run SAT on converted formula

If SAT reports unsatisfiabile, the formula is valid

The inverse is not true A satisfying assignment must be checked for

first-order consistency

Combining with SAT: Initial Results Implementation

GRASP SAT engine [Silva ‘96] SVC2

Initial results were disappointing Examples of interest could not be proved by

just considering Boolean structure SAT techniques do not compensate for the

loss of information resulting from translation to SAT

Idea : Incrementally give SAT more information

Combining with SAT: Conflict Clauses A conflict clause captures a minimal set of

decisions that lead to a conflict and keeps SAT from ever making the same set of choices

true y > x x y false y > x x y

true x y

f (x ) = f (y ) y > x x y

true false

true y > x x y

Unsatisfiablef (x ) f (y )

y xx = y

Combining with SAT: Conflict Clauses How do we get a conflict clause from the

first-order satisfiability algorithm Using all decisions too slow Black-box minimization methods too slow

Solution : Use proof-production! Aaron Stump has extended several SVC

decision procedures to produce a proof for every result deduced

By looking at what assumptions are used in a proof of inconsistency, a conflict clause can be obtained

ResultsSVC (no heuristics) SVC (current heuristics) SVC2 with SATTest Case

Decisions Time (s) Decisions Time(s) Decisions Time(s)

fb_var_12_11 17484 6.8 14386 6.0 257 0.8

fb_var_5_11 73484 29.0 60236 25.3 279 0.8

fb_var_6_12 25156 8.0 19533 5.9 79 0.1

pp-bloaddata-a 93637 55.4 902 1.9 894 5.8

pp-bloaddata 344893 292.9 35491 18.5 629 4.1

pp-dmem2 361854 293.6 47989 26.3 775 6.0

pp-invariant 3547 2 3484 1.9 174 0.5

dlx-pc 260 0.3 384 0.4 1244 10.0

dlx-dmem 2809 1.8 655 0.8 2149 30.1

dlx-regfile 989 0.9 936 1.1 40999 1132

Results: Preliminary Conclusions Naïve approach does not work well

Adding conflict clauses results in dramatic speed-ups on several examples

Most helpful on formulas with more Boolean structure

Still more work to be done Find out source of performance problems Compare to related work

[Goel et al. ‘98] [Bryant et al. ‘99]


Methods for Combining Theories

Implementation

Adapting Techniques from Propositional

Satisfiability


Thesis Contributions A new presentation of the core of Shostak’s algorithm

Easier to understand and prove correct Can be applied directly to produce decision

procedures Forms the foundation of a generalization

A new variation of Nelson-Oppen Eliminates purification phase Fewer formulas given to each theory Easier to implement Easier to combine with Shostak

A new algorithm combining Shostak and Nelson-Oppen Theoretical result relating convex and stable-infinite Generalization of Shostak’s original method

Thesis Contributions A detailed and provably correct implementation

Online Optimized to eliminate redundant equality

reasoning Optimized to reduce number of shared terms Flexible theory API

Faster search by combining with SAT Methodology and implementation for extracting

CNF Better performance via conflict clauses Conflict clauses from proofs (with Aaron Stump) Dramatic improvements on several examples

Future Work Relaxing restrictions on theories and formulas

Non-disjoint signatures Non-stably-infinite theories Formulas with quantifiers

Individual Theories Efficient implementation for Presburger arithmetic Better techniques for accommodating third-party

decision procedures

SAT Understand cases where combination with SAT fails

Acknowledgements Advisor: David Dill Orals Committee: John Gill, Zohar Manna,

John Mitchell, Natarajan Shankar Stanford Associates: Aaron Stump, Jeremy

Levitt, Satyaki Das, Jeffrey Xsu, Robert Jones, Vijay Ganesh, Kanna Shimizu, Husam Abu-Haimed, Jens Skakkebæk, David Park, Shankar Govindaraju, Madan Musuvathi, Chris Wilson

Others: Cesare Tinelli SVC Users Personal: Friends and family

Validity Checking Overview Top-level Algorithm

CheckValid(h,c) IF c = true THEN RETURN TRUE; IF !Satisfiable(h) THEN RETURN FALSE; IF c = false THEN RETURN FALSE; subgoals := ApplyTactic(h,c); FOREACH (h,c) in subgoals DO IF !CheckValid(h,c) THEN RETURN FALSE; RETURN TRUE;

CheckValid(h,c) IF c = true THEN RETURN TRUE; IF !Satisfiable(h) THEN RETURN FALSE; IF c = false THEN RETURN FALSE; subgoals := ApplyTactic(h,c); FOREACH (h,c) in subgoals DO IF !CheckValid(h,c) THEN RETURN FALSE; RETURN TRUE;

ApplyTactic(h,c) Let e be an atomic formula appearing in c; h1 := AddFact(h,e); c1 := Simplify(h1,c); h2 := AddFact(h,!e); c2 := Simplify(h2,c); RETURN {(h1,c1),(h2,c2)};

If CheckValid(T, ) = TRUE , then T =

Shostak’s Method: Convexity A set of literals S is convex in a theory T if T S does not entail any disjunction of equalities without entailing one of the equalities itself

A theory T is convex if every set of literals in the language of T is convex in T

Shostak’s Method: Requirements on T Shostak Theory T

Signature of T contains no predicate symbols T is convex Canonizer such that a,b. T = a =b iff a = b

Solver such that if T = a b , then a =b { false } Otherwise: a =b = a set of equations E in solved form T = a =b x. E, where x is the set of

variables appearing in E, but not in a or b. The variables in x are guaranteed to be fresh.

The Simplified Algorithm Given a set of equations and disequations

Step 1: Use the solver to convert into an equisatisfiable set of equations E in solved form

Step 2: Use this set of equations together with the canonizer to check if any disequality is violated Suppose a b canon (E (a ) ) = canon (E (b ) )

T = E (a ) = E (b )

T E = a = b T E { a b } is unsatisfiable

Technical detail:The method is complete only for convex theories

Shostak’s Method: The AlgorithmShostak,,, := ; WHILE DO BEGIN Remove some equality a = b from ; Let a’:= a and b’:= b; Let ’:= a’= b’; IF ’ = false THEN RETURN FALSE; Let := ’ U ’; END IF a = b for some a b in THEN RETURN FALSE; ELSE RETURN TRUE;

Shostak(,,,) = TRUE iff is satisfiable in T

Nelson-Oppen: Definitions Theories must be stably-infinite

A theory T is stably-infinite if every quantifier-free formula is satisfiable in T iff it is satisfiable in an infinite model of T

Terminology for combinations of theories Theories T1, T2, … Tn with signatures 1, 2, … n

As with Shostak, signatures must be disjoint Members of i are called i-symbols An expression containing only i-symbols is called

pure An i-term is a constant i-symbol, an application of a

functional i-symbol, or an i-variable Each variable is associated arbitrarily with a

theory

Nelson-Oppen: Definitions Terminology for combinations of theories (continued)

An i-predicate is the application of a predicate i-symbol

An atomic i-formula is an i-predicate or an equation whose left-hand side is an i-term

An i-literal is an atomic i-formula or its negation An occurrence of a term is i-alien if it is a j-term (i j) and all its super-terms are i-terms

If S is a set of terms, then an arrangement of S is a set of equations and disequations induced by a partition of S

S = { a , b , c } Partition P = { { a , b } , { c } } Arrangement : { a = b , a c , b c }

Nelson-Oppen: Purification PhaseNO-Purify() WHILE != DO BEGIN Let be some i-literal in ; IF is pure THEN Remove from ; i := i U {}; ELSE Let t be an i-alien j-term in ; Replace every occurrence of t in with a new j-variable z; := U { j = t }; ENDIF END RETURN 1^…^n;

is satisfiable in T iff 1 ^ 2 ^ … n is satisfiable in T

Nelson-Oppen: Check PhaseNO-Check(1,...n,Sat1,…,Satn) Let S be the set of variables which appear in more than one i; Let A be an arrangement of S; sat := TRUE; FOREACH i DO BEGIN sat := sat ^ Sati(i^A); END RETURN sat;

The second step is non-deterministic1 ^ 2 ^ … n is satisfiable in T iff

it is possible for NO-Check to return TRUE If the theories are convex, the algorithm can

be determinized inexpensively

Nelson-Oppen: A Variation

The purification phase can be eliminated S is a set of terms rather than a set of variables In calls to Sati , i-alien terms are treated as variables

NO-Check(,Sat1,…,Satn) Let S be the set of terms which are i-alien in either an i-literal or an i-term in ; Let A be an arrangement of S; sat := TRUE; FOREACH set of i-literals i in DO BEGIN sat := sat ^ Sati(i^A); END RETURN sat;

Combining Shostak and Nelson-OppenNO-Shostak(,,,SatNO) Let S be the set of shared terms; Let be the 1-equalities, the 1-disequalities, and NO the 2-literals in ; := ; LOOP BEGIN IF !SatNO(NO^A=) THEN RETURN FALSE; ELSE IF !SatNO(NO^A) THEN Choose a,b from S such that T2NOA |= a=b, but a=b A= ELSE IF = THEN BREAK; ELSE Remove some equality a = b from ; Let a’:= (a) and b’:= (b); Let ’:= (a’= b’); IF ’ = {false} THEN RETURN FALSE; Let := ’() U ’; END IF A THEN RETURN TRUE; ELSE RETURN FALSE;

Combining Shostak and Nelson-OppenNO-Shostak(,,) := ;S := ; LOOP BEGIN IF t1=f(x1,…,xn), t2=f(y1,…,yn) with t1,t2 in S and norm(xi)=norm(yi) but norm(t1) != norm(t2) THEN a := t1, b := t2; ELSE IF = THEN RETURN TRUE; ELSE Remove some equality a = b from ; Let a’:= can(a) and b’:= can(b); Add each sub-term of a’,b’ to S; Let ’:= (a’= b’); IF ’ = {false} THEN RETURN FALSE; Let := ’() U ’; END RETURN TRUE;

Individual Theories SVC contains decision procedures for a number of

individual theories Pure equality with uninterpreted functions Real linear arithmetic Arrays Bit-vectors Records

In our efforts to revisit and improve these decision procedures, a number of interesting issues were uncovered Finite domains Strategies for arithmetic

Finite Domains Theoretical technicalitiy

Cannot directly combine a theory with only finite models Not stably-infinite Union of theories likely to actually be

inconsistent Solution: Form an extended theory whose

relativized reduct with respect to a new predicate P is the theory with a finite domain.

Implementation strategy for nonconvexity Keep track of the terms for which P holds Use graph coloring to determine satisfiability

Arithmetic Suppose we want to handle linear arithmetic

formulas with mixed variable types: some real and some integer.

One approach is the following: Split weak inequalities into the disjunction of an

equation and a strong inequality Use Shostak-style solver to eliminate all equations

that can be solved for a real variable Use Fourier-Motzkin techniques to eliminate all

real variables from inequalities Eliminate disequalities which can be solved for a

real variable What’s left can be done with Presburger decision

procedures

Math symbols()

checking validity of quantifier-free formulas in combinations of first-order theories

Documents

x ypredicatesp x

x px px x

functionsf x

order theory

f y f x

theory results

rectangle x squarexthe

order statements