csci 3130: formal languages and automata theory andrej bogdanov andrejb/csc3130 the chinese...
Post on 20-Dec-2015
221 views
TRANSCRIPT
CSCI 3130: Formal languages and automata theory
Andrej Bogdanov
http://www.cse.cuhk.edu.hk/~andrejb/csc3130
The Chinese University of Hong Kong
NP and NP-completeness
Fall 2010
Some more problems
1 2
3 4
Graph G
A clique is a subset of vertices that are all interconnected
{1, 4}, {2, 3, 4}, {1} are cliques
An independent set is a subset of vertices so that no pair is connected
{1, 2}, {1, 3}, {4} are independent sets
there is no independent set of size 3
A vertex cover is a set of vertices that touches (covers) all edges
{2, 4}, {3, 4}, {1, 2, 3} are vertex covers
Boolean formula satisfiability
• A boolean formula is an expression made up of variables, ands, ors, and negations, like
• The formula is satisfiable if one can assign values to the variables so the expression evaluates to true
(x1∨x2 ) ∧ (x2 ∨x3 ∨x4) ∧ (x1)
x1 = F x2 = F x3 = T x4 = TAbove formula is satisfiable because this assignment makes it true:
3SAT
SAT = { 〈〉 : is a satisfiable Boolean formula}
3SAT = { 〈〉 : is a satisfiable Boolean formula in conjunctive normal form with 3 literals per clause}
CNF: AND of ORs of literals
literal: xi or xi
(conjunctive normal form)
3CNF: CNF with 3 literals per clause
(x1∨x2∨x2 ) ∧ (x2∨x3∨x4)
clauseliterals
(repetitions are allowed)
Status of these problems
CLIQUE = { 〈 G, k 〉 : G is a graph with a clique of k vertices}
IS = { 〈 G, k 〉 : G is a graph with an independent set of k vertices}
VC = { 〈 G, k 〉 : G is a graph with a vertex cover of k vertices}
SAT = { 〈 f 〉 : f is a satisfiable Boolean formula}
running timeof best-known algorithm
problem CLIQUE
2O(n)
IS
2O(n)
3SAT
2O(n)
VC
2O(n)
What do these problems have in common?
Checking solutions efficiently
• We don’t know how to solve them efficiently
• But if someone told us the solution, we would be able to verify it very quickly
1
2
3
4
5
6
78
9
10
11
12
13
14
15
Is (G, 5) in CLIQUE?
1,5,9,12,14
Example:
Example: Formula satisfiability
(x1∨x2 ) ∧ (x2 ∨x3 ∨x4) ∧ (x1)=
Verifying a solution:
FFTT
substitutex1 = F x2 = F x3 = T x4 = T
evaluate formula(F ∨T ) ∧ (F∨T∨F) ∧ (T)=
can be done in linear time
Finding a solution:
Try all possible assignments
FFFFFFFTFFTFFFTT
FTFFFTFTFTTFFTTT
TFFFTFFTTFTFTFTT
TTFFTTFTTTTFTTTT
For n variables, there are 2n possible assignments
Takes exponential time
The class NP
• A verifier for L is a TM V such that
– s is a potential solution for x– We say V runs in polynomial time if on every input
x, it runs in time polynomial in |x| (for every s)
NP is the class of all languages that have polynomial-time verifiers
x ∈ L V accepts 〈 x, s 〉 for some s
Examples
3SAT is in NP:
CLIQUE is in NP:
On input a 3CNF and a candidate assignment a, V :=
If a satisfies accept, otherwise reject.
running time = O(m + n)m = number of clauses and n = number of variables
✔
On input 〈 G, k 〉and a set of vertices C, V :=
If C has size k and all edges between vertices ofC are present in G, accept, otherwise reject.
running time = O(n2)✔
P versus NP
because the verifier can ignore the solution
Conceptually, finding solutions can only be harder than checking them
P (efficient)
decidable
L01PATH
CLIQUE
SAT IS
NP (efficiently checkable)
VC
P is contained in NP
Millenium prize problems
• Recall how in 1900, Hilbert gave 23 problems that guided mathematics in the 20th century
• In 2000, the Clay Mathematical Institute gave 7 problems for the 21st century
1 P versus NP2 The Hodge conjecture3 The Poincaré conjecture4 The Riemann hypothesis5 Yang–Mills existence and mass gap6 Navier–Stokes existence and smoothness7 The Birch and Swinnerton-Dyer conjecture
$1,000,000
Hilbert’s 8th problemPerelman 2006 (refused money)
computer science
P versus NP
• The answer to the question
is not known. But one reason it is believed to be negative is because, intuitively, searching is harder than verifying
• For example, solving homework problems (searching for solutions) is harder than grading (verifying the solution is correct)
Is P equal to NP?
$1,000,000
Searching versus verifying
Mathematician: Given a mathematical claim, come up with a proof for it.
Scientist: Given a collection of data on some phenomena, find a theory explaining it.
Engineer: Given a set of constraints (on cost, physical laws, etc.) come up with a design (of an engine, bridge, etc.) which meets them.
Detective: Given the crime scene, find “who’s done it”.
P and NP
languages that can be decided on a TMwith polynomial running time
P =
(problems that admit efficient algorithms)
NP = languages whose solutions can be verifiedon a TM with polynomial running time
(solutions can be checked efficiently)
decidable
NP
P
We believe that NP is bigger than P,but we are not 100% sure
Evidence that NP is bigger than P
• These (and many others) are in NP– Their solutions, once found, are easy to verify
• But no efficient algorithms are known for any of them– The fastest known programs take time ≈2n
CLIQUE = { 〈 G, k 〉 : G is a graph with a clique of k vertices}
IS = { 〈 G, k 〉 : G is a graph with an independent set of k vertices}
VC = { 〈 G, k 〉 : G is a graph with a vertex cover of k vertices}
SAT = { 〈〉 : is a satisfiable Boolean formula}
Equivalence of certain NP languages• We strongly suspect that problems like
CLIQUE, SAT, etc. require time ≈2n to solve
• We do not know how to prove this, but what we can prove is that
If any one of them can be solved on a polynomial-time TM, then all of them can be solved
Equivalence of some NP languages
• All these problems are as hard as one another
• Moreover, they are at the “frontier” of NP– They are at least as hard as any problem in NP
clique
independent set
vertex-cover
satisfiability
NP
P
NP-completeness
NP-completeness
NP-completeness
Polynomial Time Reduction
How to show that a problem R is not easier than a problem Q?
Informally, if R can be solved efficiently, we can solve Q efficiently.
Formally, we say Q polynomially reduces to R if:
1. Given an instance q of problem Q
2. There is a polynomial time transformation to an instance f(q) of R
3. q is a “yes” instance if and only if f(q) is a “yes” instance
Then, if R is polynomial time solvable, then Q is polynomial time solvable.
If Q is not polynomial time solvable, then R is not polynomial time solvable.
Polynomial-time reductions
• What do we mean when we say, for example,
• We mean that
• Or, we can convert any polynomial-time TM for IS into one for CLIQUE
“ IS is at least as hard as CLIQUE”
If CLIQUE has no polynomial-time TM, thenneither does IS
Polynomial-time reductions
• Theorem
If IS has a polynomial-time TM, so does CLIQUE
CLIQUE = { 〈 G, k 〉 : G is a graph with a clique of k vertices}
IS = { 〈 G, k 〉 : G is a graph with an independent set of k vertices}
{1, 4}, {2, 3, 4}, {1} are independent sets
{1, 2}, {1, 3}, {4} are cliques
1 2
3 4
Polynomial-time reductions
• Proof: Suppose IS has an poly-time TM A
• We want to use it to solve CLIQUE
If IS has a polynomial-time TM, so does CLIQUE
〈 G, k 〉
reject if not
accept if G has clique of size kA
for IS
〈 G’, k’ 〉
reject if not
accept if G’ has IS of size k
Reducing CLIQUE to IS
• We look for a polynomial-time TM R that turns the question:
into:RG, k G’, k’“ Does G have a clique of size k?”
“ Does G’ have an IS of size k’?”
1 2
3 4
G
cliques of size k ISs of size k’
1 2
3 4
G’flip all edges
k’ = k
Reducing CLIQUE to IS
RG, k G’, k’On input 〈 G, k 〉 Construct G’ by flipping all edges of G Set k’ = k Output 〈 G’, k’ 〉
cliques in G independent sets in G’
If G’ has an IS of size k, then G has a clique of size k
If G’ does not have an IS of size k, then G has no clique of size k ✓
✓
Reduction recap
• We showed that
by converting an imaginary TM for IS into one for CLIQUE
• To do this, we came up with a reduction that transforms instances of CLIQUE into ones of IS
If IS has a polynomial-time TM, so does CLIQUE
Polynomial-time reductions
• Language L polynomial-time reduces to L’ if
there exists a polynomial-time TM R that takes an instance x of L into instance y of L’ s.t.
x ∈ L if and only if y ∈ L’
L L’(CLIQUE) (IS)
x y= 〈 G, k 〉 = 〈 G’, k’ 〉
x ∈ L y ∈ L’(G has clique of size k) (G’ has IS of size k’)
R
The meaning of reductions
• Saying L reduces to L’ means L is no harder than L’– In other words, if we can solve L’, then we can also solve L
• Therefore
If L reduces to L’ and L’ ∈ P, then L ∈ P
x y
x ∈ L y ∈ L’
Rpoly-time TM
for L’acc
rej
TM accepts
The direction of reductions
• The direction of the reduction is very important– Saying “A is easier than B” and “B is easier than
A” mean different things
• However, it is possible that L reduces to L’ and L’ reduces to L– This means that L and L’ are as hard as one
another– For example, IS and CLIQUE reduce to one
another
The Cook-Levin Theorem
• So every problem in NP is easier than SAT
• But SAT itself is in NP, so SAT must be the “hardest problem” in NP:
Every L ∈ NP reduces to SAT
If SAT ∈ P, then P = NP
SAT = {: is a satisfiable Boolean formula}
(x1∨x2 ) ∧ (x2 ∨x3 ∨x4) ∧ (x1)E.g.
P
SATNP
NP-completeness
• A language C is NP-complete if:
• Cook-Levin Theorem:
1. C is in NP, and
2. For every L in NP, L reduces to C.
P
CNP
3SAT is NP-complete
NP-complete
Our picture of NP
SAT
NPCLIQUEIS
P
0n1n
PATH
any CFL
A BA reduces to B
More NP-complete problems
P
3SAT
NP
CLIQUEIS
0n1n
PATH
NP-complete A BA reduces to B
In practice, most of the NP-problems areeither in P (easy) or NP-complete (probably hard)
Interpretation of Cook-Levin Theorem• Optimistic view:
• Pessimistic view:
If we manage to solve SAT, then we can also solve CLIQUE, scheduling, and almost anything
Since we do not believe P = NP, it is unlikely thatwe will ever have a fast algorithm for SAT
Clique
• Theorem
CLIQUE = {(G, k): G is a graph with a clique of k vertices}
1 2
3 4
A clique is a subset of vertices so that all pairs are connected
{1, 2, 3}, {1, 4}, {4} are cliques
CLIQUE is NP-hard
✓SAT
3SAT
IS
CLIQUE
VC
Reducing 3SAT to CLIQUE
• Proof: We give a reduction from 3SAT to CLIQUE
3SAT = {f: f is a satisfiable Boolean formula in 3CNF}
CLIQUE = {(G, k): G is a graph with a clique of k vertices}
3CNF formula f (G, k)
G has a cliqueof size k
R
f is satisfiable
Reducing 3SAT to CLIQUE
• Example: = (x1∨x1∨x2 ) ∧ (x1∨x2∨x2) ∧ (x1∨x2∨x3)
Put an edge for every consistent pair
x1
x1
x2
x1
x2
x2
x1
x2
x3
Put a vertex for every literal
Reducing 3SAT to CLIQUE
On input f, where f is a 3CNF formula with m clausesConstruct the following graph G:G has 3m vertices, divided into m groups, one for each literal in f
3CNF formula f (G, k)R
R:
If a and b are in different groups and a ≠ b, put an edge (a, b)Output (G, m)
Reducing 3SAT to CLIQUE
3CNF formula f (G, m)
G has a cliqueof size m
R
f is satisfiable
= (x1∨x1∨x2 ) ∧ (x1∨x2∨x2) ∧ (x1∨x2∨x3)
x1
x1
x2
x1
x2
x2
x1
x2
x3
T T F F F T F F T
Reducing 3SAT to CLIQUE
3CNF formula f (G, m)
G has a cliqueof size m
R
f is satisfiable
f = (x1∨x1∨x2 ) ∧ (x1∨x2∨x2) ∧ (x1∨x2∨x3)
x1
x1
x2
x1
x2
x2
x1
x2
x3
F T T F F T T TF
✓SAT
3SAT
IS
CLIQUE
VC
Reducing 3SAT to CLIQUE
• Every satisfying assignment of f gives a clique of size m in G
• Conversely, every clique of size m in G gives a consistent satisfying assignment of f.
3CNF formula f (G, m)
G has a clique of size m
R
f is satisfiable
✓
✓
Vertex cover
✓SAT
3SAT
IS
CLIQUE✓
✓
• Theorem
VC is NP-hard
A vertex cover is a set of vertices that touches (covers) all edges
{2, 4}, {3, 4}, {1, 2, 3} are vertex covers
VC = {(G, k): G is a graph with a vertex cover of size k}
VC
1 2
3 4
Reducing CLIQUE to VC
• Proof: We describe a reduction from IS to VC
• Example
(G’, k’)
G’ has a VC of size k’
R(G, k)
G has an IS of size k
∅ , {1}, {2}, {3}, {4},{1, 2}, {1, 3}
{2, 4}, {3, 4}, {1, 2, 3}, {1, 2, 4},{1, 3, 4}, {2, 3, 4},{1, 2, 3, 4}
independent setsvertex covers
1 2
3 4
Reducing IS to VC
• Claim
• Proof
S is an independent set of G if and only if S is a vertex cover of GS is an independent set of G
no edge has both endpoints in S
every edge has an endpoint in S
S is a vertex cover of G
1 2
3 4
∅{1}{2}{3}{4}
{1, 2}{1, 3}
{2, 4}{3, 4} {1, 2, 3}{1, 2, 4}{1, 3, 4}{2, 3, 4}{1, 2, 3, 4}
IS VC
Reducing IS to VC
(G’, k’)R(G, k)
G has a VC of size n – kG has an IS of size k
On input (G, k),R:Output (G, n – k).
✓SAT
3SAT
IS
CLIQUE
VC
✓
✓
✓
The ubiquity of NP-complete problems• We saw a few examples of NP-complete
problems, but there are many more
• A surprising fact of life is that most CS problems are either in P or NP-complete
• A 1979 book by Garey and Johnsonlists 100+ NP-complete problems
Practicing Reductions
Instance: A set X and a size s(x) for each x in X.
Question: Is there a subset X’ X such that
Instance: A set X and a size s(x) for each x in X, and an integer B.
Question: Is there a subset X’ X such that
PARTITION
SUBSET-SUM
Instance: A graph G=(V,E).
Question: Does G contains a Hamiltonian cycle,
i.e. a cycle which visits every vertex exactly once?
Instance: A graph G=(V,E).
Question: Does G contains a Hamiltonian path,
i.e. a path which visits every vertex exactly once?
HAMILTONIAN CYCLE
HAMILTONIAN PATH
Practicing Reductions
Techniques for Proving NP-completeness
1. Restriction
Show that a special case is already NP-complete.
2. Local replacement
Replace each basic unit by a different structure.
3. Component design
Design “components” with specific functionality.
Subgraph Isomorphism
Instance: Two graphs G = (V1,E1) and H = (V2,E2).
Question: Does G contain a subgraph isomorphic to H?
Two graphs G1 = (V1,E1) and G2 = (V2,E2) are isomorphic if
bijection f: V1 → V2
u —v in E1 iff f (u)—f (v) in E2
Clique <= Subgraph Isomorphism
Bounded Degree Spanning Tree
Instance: A graph G=(V,E) and a positive integer k.
Question: Is there a spanning tree for G in which no vertex has degree > k?
A spanning tree is a connected subgraph with |V|-1 edges.
Hamiltonian path <= Bounded degree spanning tree
Minimum Cover
Instance: Collection C of subsets of a set S, and a positive integer k.
Question: Does C contains a cover for S of size k or less, that is,
a subset C’ C with |C’| <= k and ?
Vertex cover <= Minimum Cover
Dominating Set
Instance: A graph G and a positive integer k.
Question: Does there exist a subset S of at most k vertices such that
every vertex in V-S is adjacent to at least one vertex in S?
Vertex cover <= dominating set
Sequencing within Intervals
Instance: A set T of jobs, each has a release time r(t),
a deadline d(t) and a length l(t).
Question: Does there exist a feasible schedule for T?
Partition <= Sequencing within Intervals
Subset Sum
Instance: A set X and a size s(x) for each x in X, and an integer B.
Question: Is there a subset X’ X such that
Vertex cover <= subset sum
See this proof and many other problems and reductions from Prof. Cai notes:
http://www.cse.cuhk.edu.hk/~csci3160/LectureNotes/11notes.pdf