Propositional Satisfiability (SAT)
Toby Walsh
Cork Constraint Computation Centre
University College Cork, Ireland
4c.ucc.ie/~tw/sat/
Outline
What is SAT?
How do we solve SAT?
Why is SAT important?
Propositional satisfiability (SAT)
Given a propositional formula, does it have a “model” (satisfying assignment)?
1st decision problem shown to be NP-complete
Usually focus on formulae in clausal normal form (CNF)
Clausal Normal Form
Formula is a conjunction of clauses
  C1 & C2 & …
Each clause is a disjunction of literals
  L1 v L2 v L3 v …
  Empty clause contains no literals (= False)
  Unit clause contains a single literal
Each literal is a variable or its negation
  P, -P, Q, -Q, …
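The definitions above can be sketched in Python. This is only an illustration: the integer encoding of literals (i for a variable, -i for its negation, as in DIMACS-style files) and the function names are assumptions, not something the slides specify.

```python
# A formula is a list of clauses; a clause is a list of non-zero integers.
# Assumption: literal i means "variable i is True", -i means "variable i is False".

def literal_true(lit, model):
    """Return True if the literal holds under model (dict: variable -> bool)."""
    return model[abs(lit)] == (lit > 0)

def clause_true(clause, model):
    # A clause is a disjunction; the empty clause has no literals, so it is False.
    return any(literal_true(lit, model) for lit in clause)

def formula_true(formula, model):
    # A formula is a conjunction of clauses; the empty formula is trivially True.
    return all(clause_true(clause, model) for clause in formula)

# (P v -Q) & Q, with P = 1 and Q = 2
formula = [[1, -2], [2]]
```

Checking a candidate model this way takes time linear in the size of the formula.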
Clausal Normal Form
k-CNF
  Each clause has k literals
3-CNF
  NP-complete
  Best current complete methods are exponential
2-CNF
  Polynomial (indeed, linear time)
How do we solve SAT?
Systematic methods
  Truth tables
  Davis-Putnam procedure
Local search methods
  GSAT
  WalkSAT
  Tabu search, SA, …
Exotic methods
  DNA, quantum computing, …
Procedure DPLL(C)
(SAT) if C={} then SAT
(Empty) if empty clause in C then UNSAT
(Unit) if unit clause {l} in C then
DPLL(C[l/True])
(Split) choose a literal l in C; if DPLL(C[l/True]) then SAT else
DPLL(C[l/False])
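The four rules above can be sketched as a short recursive function. This is a minimal sketch, not a tuned solver: it assumes clauses are lists of non-zero integers (i for a variable, -i for its negation), and the helper `assign` and the naive "first literal" branching choice are illustrative assumptions.

```python
def assign(clauses, lit):
    """Simplify the clause set under the assumption that literal lit is True."""
    result = []
    for clause in clauses:
        if lit in clause:
            continue  # clause is satisfied, so drop it
        # remove the falsified literal -lit (this may leave the empty clause)
        result.append([x for x in clause if x != -lit])
    return result

def dpll(clauses):
    """Return True if the clause set is satisfiable, False otherwise."""
    if not clauses:                          # (SAT) no clauses left
        return True
    if any(len(c) == 0 for c in clauses):    # (Empty) empty clause present
        return False
    for c in clauses:                        # (Unit) propagate a unit clause
        if len(c) == 1:
            return dpll(assign(clauses, c[0]))
    lit = clauses[0][0]                      # (Split) naive choice of literal
    return dpll(assign(clauses, lit)) or dpll(assign(clauses, -lit))
```

For example, `dpll([[1, -2], [2]])` succeeds by unit propagation alone, while `dpll([[1], [-1]])` derives the empty clause.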
GSAT [Selman, Levesque, Mitchell AAAI 92]
Repeat MAX-TRIES times or until clauses satisfied
  T := random truth assignment
  Repeat MAX-FLIPS times or until clauses satisfied
    v := variable whose flipping maximizes number of SAT clauses
    T := T with v’s value flipped
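A minimal Python sketch of the GSAT loop above. The integer literal encoding, the parameter defaults, and the tie-breaking (the first best variable wins, where a real implementation would typically break ties at random) are assumptions made for illustration.

```python
import random

def num_satisfied(formula, model):
    """Count clauses with at least one true literal under model."""
    return sum(any(model[abs(l)] == (l > 0) for l in c) for c in formula)

def gsat(formula, n_vars, max_tries=10, max_flips=100, rng=random):
    """Return a satisfying model (dict: variable -> bool), or None if none was found."""
    for _ in range(max_tries):
        # T := random truth assignment
        model = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}
        for _ in range(max_flips):
            if num_satisfied(formula, model) == len(formula):
                return model

            def score(v):
                # number of satisfied clauses if v were flipped
                model[v] = not model[v]
                s = num_satisfied(formula, model)
                model[v] = not model[v]  # undo the trial flip
                return s

            # v := variable whose flip maximizes the number of SAT clauses
            best = max(range(1, n_vars + 1), key=score)
            model[best] = not model[best]
        if num_satisfied(formula, model) == len(formula):
            return model
    return None
```

Because GSAT is incomplete, a result of `None` does not show that the formula is unsatisfiable.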
WalkSAT [Selman, Kautz, Cohen AAAI 94]
Repeat MAX-TRIES times or until clauses satisfied
  T := random truth assignment
  Repeat MAX-FLIPS times or until clauses satisfied
    c := unsat clause chosen at random
    v := var in c chosen either greedily or at random
    T := T with v’s value flipped
Focuses on UNSAT clauses
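The WalkSAT loop can be sketched in the same way. The `noise` parameter (the probability of a random rather than greedy choice), its default of 0.5, and the greedy score used here (total satisfied clauses after the flip) are assumptions; published variants differ in how the greedy step is scored. The sketch also assumes the formula contains no empty clause.

```python
import random

def is_sat(clause, model):
    return any(model[abs(l)] == (l > 0) for l in clause)

def walksat(formula, n_vars, max_tries=10, max_flips=1000, noise=0.5, rng=random):
    """Return a satisfying model (dict: variable -> bool), or None if none was found."""
    for _ in range(max_tries):
        model = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}
        for _ in range(max_flips):
            unsat = [c for c in formula if not is_sat(c, model)]
            if not unsat:
                return model
            clause = rng.choice(unsat)           # c := unsat clause chosen at random
            if rng.random() < noise:
                var = abs(rng.choice(clause))    # random walk step
            else:
                def score(v):
                    # satisfied clauses if v were flipped
                    model[v] = not model[v]
                    s = sum(is_sat(c, model) for c in formula)
                    model[v] = not model[v]      # undo the trial flip
                    return s
                var = max((abs(l) for l in clause), key=score)  # greedy step
            model[var] = not model[var]
        if all(is_sat(c, model) for c in formula):
            return model
    return None
```

Unlike GSAT, each flip only considers variables from one unsatisfied clause, which is why the method "focuses on UNSAT clauses".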
Why is SAT important?
Computational complexity
  1st problem shown NP-complete
  Can therefore be used in theory to solve any NP-complete problem
Many direct applications
Some applications of SAT
Hardware design
  Signals
    Hi = True
    Lo = False
  Gates
    AND gate = and connective
    INVERTER gate = not connective
    …
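As an illustration of the gate-to-connective mapping, the sketch below writes an AND gate and an inverter as clauses and checks them against their truth tables. The clause encoding (often called a Tseitin-style encoding) and the variable numbering are assumptions added here; the slides do not say which encoding was used.

```python
from itertools import product

# Assumption: signals are numbered variables, i for Hi/True and -i for Lo/False.
# AND gate with inputs a = 1, b = 2 and output out = 3:  out <-> (a and b)
AND_GATE = [[-3, 1], [-3, 2], [3, -1, -2]]
# Inverter with input a = 1 and output out = 2:  out <-> not a
INVERTER = [[1, 2], [-1, -2]]

def holds(formula, model):
    return all(any(model[abs(l)] == (l > 0) for l in c) for c in formula)

def consistent_rows(formula, n_vars):
    """List the truth-table rows (tuples of bools) that satisfy all clauses."""
    rows = []
    for values in product([False, True], repeat=n_vars):
        model = dict(zip(range(1, n_vars + 1), values))
        if holds(formula, model):
            rows.append(values)
    return rows
```

Each returned row is a consistent setting of the gate's input and output signals, so `consistent_rows(AND_GATE, 3)` lists exactly the four rows of the AND truth table.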
Some applications of SAT
Hardware design
  State of the art
    HP verified 1/7th of the DEC Alpha chip using a DP solver
    100,000s of variables
    1,000,000s of clauses
    Modelling environment is one of the biggest problems
Some applications of SAT
Planning
  But planning is undecidable in general
  Even propositional STRIPS planning is PSPACE-complete!
  How can a SAT solver, which only solves problems in NP, be used then?
Some applications of SAT
Planning as SAT
  Put bound on plan length
  If bound too small, UNSAT
  Introduce new propositional variables for each time step
Some applications of SAT
Diagnosis as SAT
  Otherwise known as “SAT in space”
  Deep Space One spacecraft
  Propositional theory to monitor, diagnose and repair faults
  Runs in LISP!
Computational complexity
Study of “problem hardness”
  Typically worst case
  Big O analysis
    Sorting is easy, O(n log n)
    Chess and Go are hard, EXP-time
      “Can I be sure to win?”
      Need to generalize problem to n by n board
Where do things start getting hard?
Computational complexity
Hierarchy of complexity classes
  Polynomial (P), NP, PSPACE, …
NP-complete problems mark boundary of tractability
  No known polynomial time algorithm
  Though open whether P ≠ NP
NP-complete problems
Non-deterministic Polynomial time
  If I guess a solution, I can check it in polynomial time
  But no known easy way to guess solution correctly!
Complete
  Representative of all problems in this class
  If this problem can be solved in polynomial time, all problems in the class can be solved in polynomial time
  Any NP-complete problem can be mapped into any other
NP-complete problems
Many examples
  Propositional satisfiability (SAT)
  Graph colouring
  Travelling salesperson problem
  Exam timetabling
  …
SAT is NP-complete
Cook (1971) showed that the computation of any non-deterministic polynomial-time Turing machine can be encoded in SAT
=> There is a polynomial reduction of any problem in NP to SAT
But not all SAT problems are equally hard!
Random k-SAT
  sample uniformly from space of all possible k-clauses
  n variables, l clauses
Rapid transition in satisfiability
  2-SAT occurs at l/n = 1 [Chvátal & Reed 92, Goerdt 92]
  3-SAT occurs at 3.26 < l/n < 4.598
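A minimal sketch of a random k-SAT generator for this model. It assumes that each clause uses k distinct variables, that each literal is negated with probability 1/2, and that clauses are drawn independently (so repeats are possible); the function name and the integer encoding of literals are illustrative.

```python
import random

def random_ksat(n, l, k=3, rng=random):
    """Sample l clauses uniformly (with replacement) from all k-clauses on n variables."""
    formula = []
    for _ in range(l):
        variables = rng.sample(range(1, n + 1), k)   # k distinct variables
        clause = [v if rng.random() < 0.5 else -v for v in variables]
        formula.append(clause)
    return formula
```

For example, `random_ksat(20, 86)` gives a 3-SAT instance with l/n = 4.3.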
SAT phase transition [Mitchell, Selman, Levesque AAAI-92]
Random 3-SAT
Which are the hard instances?
  around l/n = 4.3
What happens with larger problems?
Why are some dots red and others blue?
Random 3-SAT
Complexity peak coincides with solubility transition
l/n < 4.3 problems under-constrained and SAT
l/n > 4.3 problems over-constrained and UNSAT
l/n=4.3, problems on “knife-edge” between SAT and UNSAT
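The solubility transition can be observed on very small instances with a sketch like the one below. It is only an illustration under stated assumptions: satisfiability is decided by brute-force enumeration (feasible only for small n), the function names, sample size and seed are arbitrary, and at such small n the transition is far more gradual than the curves in the literature.

```python
import random
from itertools import product

def random_3sat(n, l, rng):
    """l random 3-clauses on n variables (distinct variables, random signs)."""
    return [[v if rng.random() < 0.5 else -v
             for v in rng.sample(range(1, n + 1), 3)] for _ in range(l)]

def satisfiable(formula, n):
    """Brute-force truth-table check; only sensible for small n."""
    for values in product([False, True], repeat=n):
        if all(any(values[abs(x) - 1] == (x > 0) for x in c) for c in formula):
            return True
    return False

def sat_fraction(n, ratio, samples=50, seed=0):
    """Estimate prob(SAT) at clause-to-variable ratio l/n = ratio."""
    rng = random.Random(seed)
    l = round(ratio * n)
    hits = sum(satisfiable(random_3sat(n, l, rng), n) for _ in range(samples))
    return hits / samples
```

Calling `sat_fraction(10, r)` for r from 2 to 8 should show the estimated probability falling from near 1 towards 0 as the ratio grows.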
Random 3-SAT
Varying problem size, n
Complexity peak appears to be largely invariant of algorithm
backtracking algorithms like Davis-Putnam
local search procedures like GSAT
3SAT phase transition
Lower bounds (hard)
  Analyse algorithm that almost always solves problem
  Backtracking hard to reason about, so typically analyse algorithms without backtracking
  Complex branching heuristics needed to ensure success
  But these are complex to reason about
3SAT phase transition
Upper bounds (easier)
  Typically by estimating count of solutions
  E.g. Markov (or 1st moment) method
    For any statistic X
      prob(X>=1) <= E[X]
    No assumptions about the distribution of X except non-negative!
  Let X be the number of satisfying assignments for a 3SAT problem
    The expected value of X can be easily calculated
      E[X] = 2^n * (7/8)^l
    If E[X] < 1, then prob(X>=1) = prob(SAT) < 1
    If E[X] < 1, then 2^n * (7/8)^l < 1
      n + l log2(7/8) < 0
      l/n > 1/log2(8/7) = 5.19…
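The arithmetic of the first moment bound can be checked numerically. A small sketch; the function and variable names are illustrative only.

```python
import math

def expected_models(n, l):
    """E[X] = 2^n * (7/8)^l for random 3-SAT with n variables and l clauses."""
    return 2.0 ** n * (7.0 / 8.0) ** l

# E[X] < 1  <=>  n + l*log2(7/8) < 0  <=>  l/n > 1/log2(8/7)
threshold = 1.0 / math.log2(8.0 / 7.0)
```

With n = 100, `expected_models(100, 519)` is still just above 1 while `expected_models(100, 520)` is below it, which agrees with a threshold of about 5.19.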
3SAT phase transition
Upper bounds (easier)
  Typically by estimating count of solutions
  To get tighter bounds than 5.19, can refine the counting argument
    E.g. not count all solutions but just those maximal under some ordering
SAT phase transition
Shape of transition
“sharp” both for 2-SAT and 3-SAT [Friedgut 99]
Backbone (dis)continuity
  2-SAT transition is "2nd order", continuous
  3-SAT transition is "1st order", discontinuous
  backbone = variables whose truth values are fixed when we satisfy as many clauses as possible [Monasson et al. 1998], …
2+p-SAT
Morph between 2-SAT and 3-SAT
  fraction p of 3-clauses
  fraction (1-p) of 2-clauses
[Monasson et al 1999]
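A minimal sketch of a 2+p-SAT generator. It assumes the mix is exact (round(p*l) clauses of length 3, the rest of length 2) instead of choosing each clause length independently with probability p, and it reuses the distinct-variables, random-signs convention; both choices and the function name are assumptions.

```python
import random

def random_2p_sat(n, l, p, rng=random):
    """l clauses on n variables: a fraction p are 3-clauses, the rest 2-clauses."""
    n3 = round(p * l)                        # number of 3-clauses
    sizes = [3] * n3 + [2] * (l - n3)
    formula = []
    for k in sizes:
        variables = rng.sample(range(1, n + 1), k)
        formula.append([v if rng.random() < 0.5 else -v for v in variables])
    return formula
```

Setting p = 0 gives random 2-SAT and p = 1 gives random 3-SAT.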
2+p-SAT
Maps from P to NP
  NP-complete for any p>0
Insight into change from P to NP, continuous to discontinuous, …?
[Monasson et al 1999]
2+p-SAT
Observed search cost
linear for p<0.4
exponential for p>0.4
But NP-hard for all p>0!
2+p-SAT
Continuous, 2SAT like
Discontinuous, 3SAT like
Simple bound
  Are the 2-clauses UNSAT?
  2-clauses are more constraining than 3-clauses
For p<0.4, transition occurs at lower bound!
  3-clauses are not contributing
2+p-SAT trajectories
The real world isn’t random?
Very true!
Can we identify structural features common in real world problems?
Consider graphs met in real world situations
  social networks
  electricity grids
  neural networks
  ...
Real versus Random
Real graphs tend to be sparse
  dense random graphs contain lots of (rare?) structure
Real graphs tend to have short path lengths
  as do random graphs
Real graphs tend to be clustered
  unlike sparse random graphs
L, average path length
C, clustering coefficient (fraction of neighbours connected to each other, cliqueness measure)
mu, proximity ratio, is C/L normalized by that of a random graph of same size and density
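The two basic measures, L and C, can be computed as in the sketch below. The adjacency-set representation, the function names, and the ring lattice used as a test graph are assumptions; `average_path_length` assumes the graph has at least one connected pair of nodes.

```python
from collections import deque

def ring_lattice(n, k):
    """Graph on n nodes in a ring, each joined to its k nearest neighbours (k even)."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for d in range(1, k // 2 + 1):
            adj[v].add((v + d) % n)
            adj[(v + d) % n].add(v)
    return adj

def average_path_length(adj):
    """L: mean shortest-path length over ordered pairs of distinct, connected nodes."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:                         # breadth-first search from source
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

def clustering_coefficient(adj):
    """C: average over nodes of the fraction of neighbour pairs that are linked."""
    values = []
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            values.append(0.0)               # no neighbour pairs to link
            continue
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        values.append(links / (k * (k - 1) / 2))
    return sum(values) / len(values)
```

The proximity ratio mu would then divide C/L for the graph of interest by the same quantity measured on a random graph of the same size and density.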
Small world graphs
Sparse, clustered, short path lengths
Six degrees of separation
  Stanley Milgram’s famous 1967 postal experiment
  recently revived by Watts & Strogatz
  shown to apply to:
    actors database
    US electricity grid
    neural net of a worm
    ...
An example
1994 exam timetable at Edinburgh University
  59 nodes, 594 edges so relatively sparse
  but contains 10-clique
less than 10^-10 chance in a random graph
  assuming same size and density
clique totally dominated cost to solve problem
Small world graphs
To construct an ensemble of small world graphs
  morph between regular graph (like ring lattice) and random graph
  with prob p include edge from ring lattice, with prob 1-p from random graph
real problems often contain similar structure and stochastic components?
Small world graphs
ring lattice is clustered but has long paths
random edges provide shortcuts without destroying clustering
Small world graphs
Other bad news
  disease spreads more rapidly in a small world
Good news
  cooperation breaks out quicker in iterated Prisoner’s dilemma
Other structural features
It’s not just small world graphs that have been studied
Large degree graphs
  Barabási et al’s power-law model
Ultrametric graphs
  Hogg’s tree based model
Numbers following Benford’s Law
  1 is much more common than 9 as a leading digit!
  prob(leading digit = i) = log10(1 + 1/i)
  such clustering makes number partitioning much easier
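A quick numerical check of the leading-digit formula, assuming the logarithm is base 10 (the standard form of Benford's Law); the names are illustrative.

```python
import math

def benford(i):
    """Probability that the leading decimal digit is i (1..9) under Benford's Law."""
    return math.log10(1 + 1 / i)

probabilities = {i: benford(i) for i in range(1, 10)}
```

The nine probabilities sum to 1, with digit 1 at roughly 30% and digit 9 below 5%.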
Open questions
Prove random 3-SAT transition occurs at l/n = 4.3
  random 2-SAT proved to be at l/n = 1
  random 3-SAT transition proved to be in range 3.26 < l/n < 4.598
  random 3-SAT phase transition proved to be “sharp”
2+p-SAT
  heuristic argument based on replica symmetry predicts discontinuity at p=0.4
  prove it exactly!
Open questions
Impact of structure on phase transition behaviour
  some initial work on quasigroups (alias Latin squares/sports tournaments)
  morphing useful tool (e.g. small worlds, 2-d to 3-d TSP, …)
Optimization v decision
  some initial work by Slaney & Thiebaux
Open questions
Does phase transition behaviour give insights to help answer P=NP?
  it certainly identifies hard problems!
  problems like 2+p-SAT and ideas like backbone also show promise
But problems away from phase boundary can be hard to solve
  over-constrained 3-SAT region has exponential resolution proofs
  under-constrained 3-SAT region can throw up occasional hard problems (early mistakes?)
Conclusions
SAT is a fundamental problem in logic, AI, CS, …
There exist both complete and incomplete methods for solving SAT
We can often solve larger problems than (worst-case) complexity would suggest is possible!