
Computational Models Spring 09: Lecture 13

More Poly-Time Reductions:

3SAT ≤P DirHamPath

3SAT ≤P IP (not in book)

Bounded Halting is NPC

3SAT ≤P 3-COL (not in book)

NP hardness and coNP hardness

Decision, search, and optimization problems

Coping with NP completeness: Approximation

Sipser, Chapter 7, Sections 7.3, 7.4, 7.5, 10.1, 10.2

Slides modified by Benny Chor, based on original slides by Maurice Herlihy, Brown University.

Traveling Salesperson

Parameters:

set of cities C

set of inter-city distances D

goal k


Traveling Salesperson Problem (TSP)

Define TRAVELING-SALESMAN = {〈C, D, k〉 | (C, D) has a TS tour of total distance ≤ k}

Remark: Can consider two versions: undirected and directed.

Recall HAMCIRCUIT = {〈G〉 | G has a Hamiltonian circuit}

Theorem: Directed HAMCIRCUIT is polynomial-time reducible to directed TRAVELING-SALESMAN,

HAMCIRCUIT ≤P TRAVELING-SALESMAN.


HAMCIRCUIT ≤P TSP

The reduction: Given a directed graph G = (V, E) we construct a directed traveling salesman instance.

The cities are identical to the nodes of the original graph, C = V.

The distance of going from v1 to v2 is 1 if (v1, v2) ∈ E, and 2 otherwise.

The bound on the total distance of a tour is k = |V|.
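The construction above can be sketched in a few lines of Python. This is only an illustration: the names `hamcircuit_to_tsp` and `has_tour_within` are hypothetical, and the brute-force tour check is exponential and meant only for sanity-checking the reduction on tiny instances with at least two cities.

```python
from itertools import permutations

def hamcircuit_to_tsp(vertices, edges):
    """Map a directed graph (V, E) to a directed TSP instance (C, D, k)."""
    cities = list(vertices)
    edge_set = set(edges)
    # Distance 1 for ordered pairs that are edges of G, 2 for all other pairs.
    dist = {(u, v): (1 if (u, v) in edge_set else 2)
            for u in cities for v in cities if u != v}
    return cities, dist, len(cities)

def has_tour_within(cities, dist, k):
    """Brute-force decision check (assumes at least two cities)."""
    first, rest = cities[0], cities[1:]
    for order in permutations(rest):
        tour = [first, *order, first]
        if sum(dist[a, b] for a, b in zip(tour, tour[1:])) <= k:
            return True
    return False

# A directed 3-cycle has a Hamiltonian circuit; a directed path does not.
cycle = hamcircuit_to_tsp([0, 1, 2], [(0, 1), (1, 2), (2, 0)])
path = hamcircuit_to_tsp([0, 1, 2], [(0, 1), (1, 2)])
print(has_tour_within(*cycle), has_tour_within(*path))
```

Filling in the distance table touches every ordered pair of cities once, which is where the quadratic running time of the reduction comes from.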


HAMCIRCUIT ≤P TSP

Validity of Reduction

=⇒ Suppose G has a Hamiltonian circuit. The distance assigned by the reduction to all edges in this circuit is 1. Thus in (C, D) there is a traveling salesman tour of total distance |V| = k, namely 〈C, D, k〉 ∈ TRAVELING-SALESMAN.

⇐= Suppose (C, D) has a traveling salesman tour of total distance at most |V| = k. The tour consists of |V| edges, each of distance at least 1, so it cannot contain any edge of distance 2. Therefore it gives a Hamiltonian circuit in G.

Efficiency: Reduction in quadratic time (filling up distances for all edges of the complete graph). ♣


The Language SAT (reminder)

Definition: A Boolean formula is in conjunctive normal form (CNF) if it consists of clauses, connected with ∧s. Each clause is a disjunction (∨s) of literals.

For example (x1 ∨ x̄2 ∨ x̄3 ∨ x4) ∧ (x3 ∨ x̄5 ∨ x6) ∧ (x3 ∨ x6)

Definition: SAT = {〈φ〉 | φ is satisfiable CNF formula}


3SAT (reminder)

Definition: A Boolean formula is in 3CNF form if it is a CNF formula, and each clause has at most three literals.

(x1 ∨ x2 ∨ x3) ∧ (x3 ∨ x5 ∨ x6) ∧ (x3 ∨ x6 ∨ x4)

Define

3SAT = {〈φ〉 | φ is satisfiable 3CNF formula}

Clearly, if φ is a satisfiable 3CNF formula, then for any satisfying assignment of φ, every clause must contain at least one literal assigned 1.
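The "at least one true literal per clause" condition is easy to check mechanically. The sketch below assumes a clause encoding that is not part of the slides: each clause is a list of nonzero integers, where i stands for xi and −i for its negation; the function name `satisfies` is likewise illustrative.

```python
def satisfies(clauses, assignment):
    """Return True iff every clause contains at least one true literal.

    clauses:    list of clauses, each a list of nonzero ints
                (i means x_i, -i means the negation of x_i).
    assignment: dict mapping variable index to True/False.
    """
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

# A small made-up formula: (x1 or not x2) and (x2 or x3)
phi = [[1, -2], [2, 3]]
print(satisfies(phi, {1: True, 2: False, 3: True}))
```

The check runs in time linear in the size of the formula, which is what makes a satisfying assignment usable as an NP certificate.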


NPC – Reminder

A language B is NP-complete if it satisfies


B ∈ NP, and

Every A in NP is polynomial time reducible to B.


coNP-Completeness (analog notion)

A language C is coNP-complete if it satisfies

C ∈ coNP (namely its complement is in NP), and

For every D in coNP, D ≤P C.


Cook-Levin (early 70s)

Theorem: SAT is NP complete.

Prf.: Membership (SAT ∈ NP) is easy, using satisfying assignment as certificate.

Hard part is showing that every NP problem reduces to SAT in poly-time.

Idea: Suppose L ∈ NP, and M is an NTM that accepts L.

On input w of length n, M runs in time t(n) = n^c.

We consider the n^c-by-n^c tableau that describes the computation of M on input w.


The Tableau

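(figure: the t(n)-by-t(n) tableau, with columns numbered 1, 2, 3, . . . , t(n); the top row runs from cell[1,1] to cell[1,t(n)] and holds the start configuration)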


Saw a Few Reductions So Far

SAT ≤P 3SAT (⇒ 3SAT is NP-complete)

3SAT ≤P Clique (⇒ Clique is NP-complete)

In recitation: Clique ≤P Independent Set (⇒ IS is NP-complete)

In recitation: Clique ≤P Vertex Cover (⇒ VC is NP-complete)

HamPath ≤P HamCircuit

HamCircuit ≤P TSP

Will now show 3SAT ≤P DirHamPath, thus establishing NP-completeness of DirHamPath, DirHamCircuit, and DirTSP.


Directed Hamiltonian Path

For any 3CNF formula φ,

we construct a directed graph G

with vertices s and t

such that φ is satisfiable iff there is a directed Hamiltonian path from s to t.


Directed Hamiltonian Path

Here is a 3CNF formula φ:

(a1 ∨ b1 ∨ c1) ∧ (a2 ∨ b2 ∨ c2) ∧ · · · ∧ (aℓ ∨ bℓ ∨ cℓ)

where

each ai, bi, ci is a literal xj or x̄j,

the ℓ clauses are C1, . . . , Cℓ,

the k variables are x1, . . . , xk.


DirHamPath: NP Completeness Proof

Turn to a separate pdf presentation: http://www.cs.tau.ac.il/∼bchor/CM07/ham-reduction.pdf


Integer Programming (IP)

Definition: A linear inequality has the form

a1x1 + a2x2 + . . . + anxn ≤ b

where a1, . . . , an, b are real numbers, and x1, . . . , xn are real variables.

The Integer Programming (IP) problem:

Input: A set of m linear inequalities with integer coefficients (ai, b) in n variables x1, x2, . . . , xn.

The language IP is the collection of all systems of linear inequalities that have a solution where all xi are integers.


Integer Programming: Example

Consider the following system of linear inequalities

y ≤ 2x (green line)

−2x + 1 ≤ y (red line)

4x − 2 ≤ y (purple line)

0 ≤ x ≤ 1

0 ≤ y ≤ 2

(figure: the three lines plotted with x from 0 to 1.2 and y from −2 to 2; the feasible region is the solid triangle)


This system does have a unique integer solution: the right hand corner of the solid triangle, (1, 2).

But if we change the constraint on y to 0 ≤ y ≤ 1, then we’d have no solution with integer coordinates, even though there are many solutions with rational, or real, coordinates.
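Both claims about this example can be checked by enumeration, since the bounds on x and y leave only a handful of integer points. The sketch below is illustrative only; the function names and the `y_max` parameter (the upper bound on y) are assumptions introduced here.

```python
def feasible(x, y, y_max=2):
    """Check all inequalities of the example system at the point (x, y)."""
    return (y <= 2 * x
            and -2 * x + 1 <= y
            and 4 * x - 2 <= y
            and 0 <= x <= 1
            and 0 <= y <= y_max)

def integer_solutions(y_max=2):
    """Enumerate every integer point allowed by the bounds on x and y."""
    return [(x, y)
            for x in range(0, 2)
            for y in range(0, y_max + 1)
            if feasible(x, y, y_max)]

print(integer_solutions(2))      # original system
print(integer_solutions(1))      # with the tighter bound 0 <= y <= 1
print(feasible(0.5, 0.5, 1))     # a non-integer point of the tighter system
```

Enumeration works here only because the box 0 ≤ x ≤ 1, 0 ≤ y ≤ 2 is tiny; in general the number of candidate integer points grows exponentially with the number of variables.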

Will now show IP is NP complete. Membership in NP easy (why?)


SAT ≤P IP

SAT = {〈ϕ〉 | ϕ is a satisfiable CNF formula}

For example, the following formula is in SAT: (x1 ∨ x̄2 ∨ x̄3 ∨ x4) ∧ (x3 ∨ x̄5 ∨ x6) ∧ (x3 ∨ x6)

Let ϕ be a CNF formula with m clauses and n variables x1, . . . , xn (either xi, x̄i, or both, can appear in ϕ).

Will reduce ϕ to an IP instance with 2n variables x1, y1, . . . , xn, yn and m + 2n linear inequalities, and n linear equalities (why?).


SAT ≤P IP

Each xi in ϕ corresponds to the variable xi in IP.

Each x̄i in ϕ corresponds to the variable yi in IP.

For each i, we add the inequalities xi ≥ 0, yi ≥ 0, and the equality xi + yi = 1 (what do these three express?)

For each clause k, we add the inequality

∑_{zj ∈ Clause k} zj ≥ 1

(what does this inequality express?)

For example, (x1 ∨ x̄2 ∨ x̄3 ∨ x4) is translated to x1 + y2 + y3 + x4 ≥ 1.


SAT ≤P IP: Example

ϕ = (x1 ∨ x̄2 ∨ x̄3 ∨ x4) ∧ (x3 ∨ x̄5 ∨ x6) ∧ (x3 ∨ x6)

translates to

x1 + y2 + y3 + x4 ≥ 1

x3 + y5 + x6 ≥ 1

x3 + x6 ≥ 1

x1 ≥ 0, y1 ≥ 0, x1 + y1 = 1

x2 ≥ 0, y2 ≥ 0, x2 + y2 = 1

x3 ≥ 0, y3 ≥ 0, x3 + y3 = 1

x4 ≥ 0, y4 ≥ 0, x4 + y4 = 1

x5 ≥ 0, y5 ≥ 0, x5 + y5 = 1

x6 ≥ 0, y6 ≥ 0, x6 + y6 = 1
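The translation is mechanical enough to write down as a short program. The sketch below is one possible rendering, not the lecture's own code: clauses are assumed to be lists of nonzero integers (i for xi, −i for x̄i), constraints are represented as hypothetical (coefficients, operator, right-hand side) triples, and the helper names are made up for illustration.

```python
def sat_to_ip(clauses, n):
    """Translate a CNF formula over x1..xn into a list of IP constraints."""
    constraints = []
    for clause in clauses:
        coeffs = {}
        for lit in clause:
            # Positive literal x_i maps to variable xi, negated literal to yi.
            name = f"x{lit}" if lit > 0 else f"y{-lit}"
            coeffs[name] = coeffs.get(name, 0) + 1
        constraints.append((coeffs, ">=", 1))          # clause constraint
    for i in range(1, n + 1):
        constraints.append(({f"x{i}": 1}, ">=", 0))    # xi >= 0
        constraints.append(({f"y{i}": 1}, ">=", 0))    # yi >= 0
        constraints.append(({f"x{i}": 1, f"y{i}": 1}, "==", 1))  # xi + yi = 1
    return constraints

def ip_point(assignment):
    """Turn a truth assignment into values for the 2n IP variables."""
    values = {}
    for i, bit in assignment.items():
        values[f"x{i}"] = int(bit)
        values[f"y{i}"] = 1 - int(bit)
    return values

def holds(constraints, values):
    """Check that the given variable values satisfy every constraint."""
    for coeffs, op, rhs in constraints:
        lhs = sum(c * values[v] for v, c in coeffs.items())
        if (op == ">=" and lhs < rhs) or (op == "==" and lhs != rhs):
            return False
    return True

# The example formula from the slide, with m = 3 clauses and n = 6 variables.
phi = [[1, -2, -3, 4], [3, -5, 6], [3, 6]]
system = sat_to_ip(phi, 6)
point = ip_point({1: True, 2: False, 3: True, 4: False, 5: False, 6: False})
print(len(system), holds(system, point))
```

The output system has m clause constraints plus three constraints per variable, so its size is linear in the size of ϕ.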


SAT ≤P IP: Validity (sketch)

Should show

(a) Reduction g is poly-time computable

(b) ϕ ∈ SAT =⇒ g(ϕ) ∈ IP

(c) g(ϕ) ∈ IP =⇒ ϕ ∈ SAT.

Poly time: easy (verify details!).

Suppose ϕ ∈ SAT. Take a satisfying assignment.
If xi = 1 assign xi = 1, yi = 0 in IP.
If xi = 0 assign xi = 0, yi = 1 in IP.

So "sanity check" constraints are satisfied. "Clause constraints" are satisfied because at least one literal is satisfied in each clause, implying g(ϕ) ∈ IP.

g(ϕ) ∈ IP =⇒ ϕ ∈ SAT is similar. ♣


Bounded ATM

Bounded ATM: Given encoding 〈M〉 of a non-deterministic TM, an input w, and a time bound 1^k in unary, does M have an accepting computation of w in k steps or less?

Bounded ATM is NP complete, via a “generic” reduction.

Finding the reduction is easy (check!).

Proving Bounded ATM is in NP seems less obvious.


Bounded ATM in NP

An instance of the problem has the form 〈M, w, 1^k〉. The universal NTM, U, decides this language:

U does not write anything on input tape.

Copies q0w to second tape.

Simulates M step by step, keeping its configuration on second tape.

Sets up a unary counter on third tape, initializes it to zero and increments it for each simulated step of M.

If counter reaches k + 1, U rejects.

How many steps of U does it take to simulate k steps of M?


Bounded ATM in NP

How many steps does U take to simulate k steps of M?

To simulate one step of M, NTM U has to find an entry in M’s transition function that matches the current simulated state and the letter under the head.

This requires scanning all of M’s transition function (on the input tape), which takes length of 〈M〉 steps of U.

So to simulate k steps of M, NTM U takes k · |〈M〉| steps.

Denote n = |〈M, w, 1^k〉|. What is k · |〈M〉| in terms of n? Both k ≤ n and |〈M〉| ≤ n, so k · |〈M〉| = O(n^2), and Bounded ATM ∈ NTIME(n^2). ♣

Question: What would the NTIME complexity be if k were encoded in binary?


Graph Coloring

A legal coloring of an undirected graph is an assignment of colors to the nodes of the graph, such that if two nodes are connected by an edge, they are colored by different colors.


Graph Coloring

For any specific k, a k-coloring of G is a coloring that uses at most k colors.

Of course, a k-coloring of G may not exist.

This raises the question: Given G and k, how hard is it to decide if a k-coloring of G exists?

Extreme values of k are easy to decide: e.g. k = 1 (does the graph have any edge?), k = 2 (is the graph bipartite? There is a simple algorithm you can come up with easily), or k = number of nodes in graph (simply assign each node a different color).

But what happens beyond trivial cases? For example: “is G 3-colorable?”
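For the k = 2 case, one simple algorithm is breadth-first search that alternates colors and reports failure when an edge joins two nodes of the same color. The sketch below is one such algorithm, offered as an illustration; it assumes nodes are numbered 0 to n − 1, and the function name is made up.

```python
from collections import deque

def two_coloring(n, edges):
    """Return a list of colors (0/1) per node, or None if G is not bipartite."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    color = [None] * n
    for start in range(n):            # handle every connected component
        if color[start] is not None:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if color[v] is None:
                    color[v] = 1 - color[u]   # neighbor gets the other color
                    queue.append(v)
                elif color[v] == color[u]:
                    return None               # edge inside one color class
    return color

print(two_coloring(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))  # even cycle
print(two_coloring(3, [(0, 1), (1, 2), (2, 0)]))          # triangle
```

Each node and edge is handled a constant number of times, so 2-colorability is decidable in linear time, in contrast with what follows for k = 3.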


3-colorability is NP-Complete

Define 3-COL = {G | G is an undirected 3-colorable graph}. 3-COL is NP complete.

Membership of 3-COL in NP is easy: The certificate is the coloring. The verifier checks this is indeed a coloring (each node assigned exactly one color), and it is legal (nodes connected by edges are assigned different colors).

We will show NP completeness of 3-COL using a polynomial time reduction from 3SAT.
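The verifier described above amounts to two linear-time checks. A minimal sketch follows, assuming the certificate is given as a dictionary from nodes to colors 0, 1, 2; this representation and the function name are choices made here for illustration.

```python
def verify_3_coloring(nodes, edges, coloring):
    """Check that `coloring` is a legal 3-coloring of the graph (nodes, edges)."""
    allowed = {0, 1, 2}
    # A dict gives each node at most one color; require that every node
    # has one and that it is among the three allowed colors.
    if any(v not in coloring or coloring[v] not in allowed for v in nodes):
        return False
    # Legality: endpoints of every edge must receive different colors.
    return all(coloring[u] != coloring[v] for u, v in edges)

triangle = [(0, 1), (1, 2), (2, 0)]
print(verify_3_coloring([0, 1, 2], triangle, {0: 0, 1: 1, 2: 2}))
```

The running time is linear in the number of nodes plus edges, so the coloring is a valid polynomial-time checkable certificate.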


3-colorability is NP-Complete

Based on formula φ, reduction builds a graph G. Each literal from φ is assigned a node in G. The graph contains additional nodes.

Reduction “forces” each node to be colored either F or T or red.

Reduction “forces” each literal to be colored legally by either F or T.

Reduction “forces” each xi and x̄i to be assigned different colors in a legal coloring.

For each clause of φ, reduction builds a gadget. Gadget ensures that a 3-coloring of G assigns at least one literal node of the clause the “color” T.

Enough said. Time for whiteboard scratching.


3-COL is NP-Complete: Conclusion

Saw that 3-COL is in NP.

Saw a poly time reduction from 3SAT to 3-COL (reduction preserves membership and is in poly time).

Since 3SAT is NPC, so is 3-COL. ♠.


Yet More Intractable Problems

Subgraph isomorphism is NP complete (easy reduction from clique).

Graph isomorphism is in NP, seems not to be in P, but we have many good reasons (that are out of scope for this course) to believe it is not NP complete.


Chains of Reductions: NPC Problems

(figure: diagram of reduction chains starting from SAT; nodes shown: SAT, IntegerProg, 3SAT, Clique, IndepSet, VertexCover, SetCover, 3ExactCover, Knapsack, Scheduling, 3Color, HamPath, HamCircuit, TRAVELING-SALESMAN)


NP Hardness

A language B is NP-hard if it satisfies

Every A in NP is polynomial time reducible to B

We do not require B ∈ NP (membership).

Example: The language

{〈Φ1, Φ2〉 | Φ1 ∈ SAT, Φ2 ∉ SAT}

is NP-hard but apparently not NP-complete (why?).


coNP Hardness

A language B is coNP-hard if it satisfies

Every A in coNP is polynomial time reducible to B

We do not require B ∈ coNP (membership).

Example: The language

{〈Φ1, Φ2〉 | Φ1 ∈ SAT, Φ2 ∉ SAT}

is coNP-hard but apparently not coNP-complete (why?).


Decision, Search, Optimization Problems

Let R(·, ·) be a poly time computable predicate.

Decision Problem: Given input x, decide if there is some y satisfying R(x, y).

Using the “certificate” characterization of languages in NP, the decision problem is the same as deciding membership x ∈ L for L ∈ NP.

Search Problem: Given input x, find some y satisfying R(x, y), or declare that none exists.

The search problem seems harder to solve than the decision problem.




Turns out that for NP complete languages, search and decision have the “same difficulty”.

Specifically, given access to an oracle for L (the decision problem), we can solve the search problem in poly time.

When oracle is successively accessed with queries of decreasing sizes, this technique is known as self reduction.

Examples: SAT and Clique (on board).

This is applicable to NPC problems but not to all of NP.
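For SAT, self reduction can be sketched as follows: fix the variables one at a time, and after each tentative choice ask the decision oracle whether the simplified formula is still satisfiable. The code below is an illustration under assumptions made here: clauses are lists of nonzero integers (i for xi, −i for its negation), and `brute_force_oracle` is an exponential-time stand-in for the oracle, used only so the example runs.

```python
from itertools import product

def simplify(clauses, var, value):
    """Fix var := value: drop satisfied clauses, delete falsified literals."""
    true_lit = var if value else -var
    result = []
    for clause in clauses:
        if true_lit in clause:
            continue                      # clause already satisfied
        result.append([lit for lit in clause if lit != -true_lit])
    return result

def brute_force_oracle(clauses):
    """Stand-in decision oracle for SAT (exponential time)."""
    variables = sorted({abs(l) for c in clauses for l in c})
    for bits in product([False, True], repeat=len(variables)):
        a = dict(zip(variables, bits))
        if all(any(a[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False

def find_assignment(clauses, n, is_satisfiable=brute_force_oracle):
    """Search via decision: n + 1 oracle calls on shrinking formulas."""
    if not is_satisfiable(clauses):
        return None
    assignment = {}
    for var in range(1, n + 1):
        trial = simplify(clauses, var, True)
        if is_satisfiable(trial):
            assignment[var] = True
            clauses = trial
        else:
            # The current formula is satisfiable, so var = False must work.
            assignment[var] = False
            clauses = simplify(clauses, var, False)
    return assignment

print(find_assignment([[1, -2], [2, 3], [-1, -3]], 3))
```

Apart from the oracle calls, the procedure does only polynomial work, so a polynomial-time decision procedure for SAT would yield a polynomial-time search procedure.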




Optimization Problem: Given input x, find some y satisfying R(x, y) that is the largest among all solutions (or smallest), or declare that none exists.

Example 1: Given a graph G, find a legal coloring that minimizes number of colors used.

Example 2: Given a graph G, find a clique of largest size possible.


Coping with NP-Completeness

Approximation algorithms for hard optimization problems.

Randomized (coin flipping) algorithms.

Fixed parameter algorithms.

Heuristics.

These stand in the forefront of current algorithmic research, and could easily fill up three or four advanced courses.

(figure from http://wwwbrauer.in.tum.de/gruppen/theorie/hard/vc1.png)

Example: Vertex Cover

Given a graph (V,E)

find the smallest set of vertices C

such that for each edge in the graph,

C contains at least one endpoint.

(figure from www.cc.ioc.ee/jus/gtglossary/assets/vertex_cover.gif)


Vertex Cover

The decision version of this problem is NP-complete by a reduction from Independent Set (proved in recitation).

(figure from http://wwwbrauer.in.tum.de/gruppen/theorie/hard/vc1.png)


Approx. Algorithm for VC (Gavril ’74)

C := ∅

while there are edges in G

    choose any edge (u, v) in G

    add u and v to C

    remove u and v from G, together with all edges touching them

Claim: This algorithm is a 2-approximation algorithm for vertex cover.

Meaning C is at most twice as large as a minimum vertex cover.
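The pseudocode translates almost directly into Python. In the sketch below, the graph is assumed to be a simple undirected graph given as a list of edges, and "removing u and v from G" is implemented implicitly: an edge is still present exactly when neither endpoint has been added to the cover. The function name is illustrative.

```python
def approx_vertex_cover(edges):
    """Gavril-style 2-approximation: take both endpoints of uncovered edges."""
    cover = set()
    for u, v in edges:
        # The edge is still "in G" iff neither endpoint was removed earlier.
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover

# Path 0 - 1 - 2 - 3: a minimum cover is {1, 2}; the algorithm may return 4 nodes.
path = [(0, 1), (1, 2), (2, 3)]
print(approx_vertex_cover(path))
```

On the path example the scan order picks edges (0, 1) and (2, 3), giving a cover of size 4 against an optimum of 2, so the factor 2 in the claim can actually be attained.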


Gavril’s Approximation Algorithm

Claim: This is a 2-approximation algorithm.

Cover C constructed from |C|/2 edges of G

no two of these edges share a vertex

any vertex cover, including the optimum,

contains at least one node from each of these edges (otherwise an edge would not be covered).

It follows that OPT(G) ≥ |C|/2 (so |C|/OPT(G) ≤ 2) ♠

Remark: Under some plausible complexity assumption, this factor 2 approximation cannot be improved.

