BBM402-Lecture 20: LP Duality
Lecturer: Lale Özkahya
Resources for the presentation: https://courses.engr.illinois.edu/cs473/fa2016/lectures.html
An easy LP?
max cx subject to Ax = b
which is the compact form of
max c1x1 + c2x2 + . . . + cnxn
subject to ai1x1 + ai2x2 + . . . + ainxn = bi, 1 ≤ i ≤ m
Question: Is this a general LP problem, or is it somehow easy?
Chandra & Ruta (UIUC) CS473 3 Fall 2016 3 / 39
An easy LP?
max cx subject to Ax = b
Basically reduces to linear system solving. Three cases for Ax = b:
1 The system Ax = b is infeasible, that is, it has no solution.
2 The system Ax = b has a unique solution x∗ when rank(A) = rank([A b]) = n (full column rank). The optimum solution value is cx∗.
3 The system Ax = b has infinitely many solutions when rank(A) = rank([A b]) < n. Then all vectors of the form x∗ + y are feasible, where x∗ is any particular solution and y is in null-space(A) = {y | Ay = 0}. Let d be the dimension of null-space(A) and let e1, e2, . . . , ed be an orthonormal basis of it. Then cx = cx∗ + cy = cx∗ + c(λ1e1 + λ2e2 + . . . + λded). If cei ≠ 0 for some i then the optimum solution value is unbounded. Otherwise it is cx∗.
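The third case can be illustrated with a small sketch. The system x1 + x2 = 1, the particular solution and the null-space vector below are hypothetical choices made for illustration, and the helper names are not from the lecture. The test "c · e ≠ 0 for some basis vector" works for any basis of the null space, so the basis here is not normalized.

```python
from fractions import Fraction

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def classify(c, x_star, null_basis):
    """Classify max cx s.t. Ax = b, given one solution and a null-space basis.

    Returns ("unbounded", None) if some basis vector e has c.e != 0,
    otherwise ("bounded", c.x_star).
    """
    if any(dot(c, e) != 0 for e in null_basis):
        return ("unbounded", None)
    return ("bounded", dot(c, x_star))

# Hypothetical system: x1 + x2 = 1, so A = [1 1] and b = [1].
x_star = [Fraction(1), Fraction(0)]         # one particular solution
null_basis = [[Fraction(1), Fraction(-1)]]  # A e = 0 for this e

result_parallel = classify([3, 3], x_star, null_basis)  # c is a multiple of the row of A
result_generic = classify([1, 0], x_star, null_basis)   # c.e = 1, so cx grows along e
```

With c = (3, 3) every solution has the same value 3; with c = (1, 0) moving along e increases the objective without limit.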
LP Canonical Forms
Two basic canonical forms:
max cx, Ax = b, x ≥ 0
max cx, Ax ≤ b, x ≥ 0
What makes LP non-trivial and different from linear system solving is the additional non-negativity constraint on variables.
Part I
Derivation and Definition of Dual LP
Feasible Solutions and Lower Bounds
Consider the program
maximize 4x1 + 2x2
subject to x1 + 3x2 ≤ 5
           2x1 − 4x2 ≤ 10
           x1 + x2 ≤ 7
           x1 ≤ 5
1 (0, 1) satisfies all the constraints and gives value 2 for the objective function.
2 Thus, optimal value σ∗ is at least 2.
3 (2, 0) is also feasible, and gives a better bound of 8.
4 How good is 8 when compared with σ∗?
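The two lower bounds above can be checked mechanically. This is a minimal sketch; the names CONSTRAINTS, is_feasible and lower_bound are illustrative and not part of the lecture.

```python
# Constraints of the example, one (coefficients, bound) pair per row: a.x <= b.
CONSTRAINTS = [
    ((1, 3), 5),
    ((2, -4), 10),
    ((1, 1), 7),
    ((1, 0), 5),
]
OBJECTIVE = (4, 2)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def is_feasible(x):
    """True if x satisfies every constraint a.x <= b."""
    return all(dot(a, x) <= b for a, b in CONSTRAINTS)

def lower_bound(x):
    """Objective value of a feasible x; any such value is a lower bound on the optimum."""
    if not is_feasible(x):
        raise ValueError(f"{x} is not feasible")
    return dot(OBJECTIVE, x)

bounds = [lower_bound((0, 1)), lower_bound((2, 0))]  # values at the two points above
```

Any feasible point gives a lower bound in this way; the open question is how to certify that no feasible point does better.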
Obtaining Upper Bounds
1 Let us multiply the first constraint by 2 and add it to the second constraint:
  2(x1 + 3x2) ≤ 2(5)
+ 1(2x1 − 4x2) ≤ 1(10)
  4x1 + 2x2 ≤ 20
2 Thus, 20 is an upper bound on the optimum value!
Generalizing . . .
1 Multiply the first constraint by y1, the second by y2, the third by y3 and the fourth by y4 (all of y1, y2, y3, y4 being non-negative, so that the direction of each inequality is preserved) and add:
  y1(x1 + 3x2) ≤ y1(5)
+ y2(2x1 − 4x2) ≤ y2(10)
+ y3(x1 + x2) ≤ y3(7)
+ y4(x1) ≤ y4(5)
  (y1 + 2y2 + y3 + y4)x1 + (3y1 − 4y2 + y3)x2 ≤ . . .
2 5y1 + 10y2 + 7y3 + 5y4 is an upper bound, provided the coefficients of xi are the same as in the objective function, i.e.,
  y1 + 2y2 + y3 + y4 = 4
  3y1 − 4y2 + y3 = 2
3 The best upper bound is obtained when 5y1 + 10y2 + 7y3 + 5y4 is minimized!
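The recipe above can be sketched as a small checker: given multipliers y, confirm that they are non-negative and that the combined coefficients equal the objective, then report the bound. The function name upper_bound and the second multiplier vector are illustrative assumptions, not taken from the slides.

```python
from fractions import Fraction

A = [(1, 3), (2, -4), (1, 1), (1, 0)]  # constraint rows
b = (5, 10, 7, 5)                      # right-hand sides
c = (4, 2)                             # objective coefficients

def upper_bound(y):
    """Return y.b if y >= 0 and yA == c, else None (y certifies nothing)."""
    if any(yi < 0 for yi in y):
        return None
    combined = tuple(sum(yi * row[j] for yi, row in zip(y, A)) for j in range(len(c)))
    if combined != c:
        return None
    return sum(yi * bi for yi, bi in zip(y, b))

bound_slide = upper_bound((2, 1, 0, 0))  # the combination used on the earlier slide
bound_other = upper_bound((Fraction(2, 3), 0, 0, Fraction(10, 3)))
bound_bad = upper_bound((1, 1, 1, 1))    # coefficients do not match c
value_at_5_0 = 4 * 5 + 2 * 0             # objective at the feasible point (5, 0)
```

Both valid multiplier vectors give the bound 20. The point (5, 0) satisfies all four constraints and has value 20, so for this example the lower and upper bounds meet and 20 is the optimum.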
Dual LP: Example
Thus, the optimum value of the program
maximize 4x1 + 2x2
subject to x1 + 3x2 ≤ 5
           2x1 − 4x2 ≤ 10
           x1 + x2 ≤ 7
           x1 ≤ 5
is upper bounded by the optimal value of the program
minimize 5y1 + 10y2 + 7y3 + 5y4
subject to y1 + 2y2 + y3 + y4 = 4
           3y1 − 4y2 + y3 = 2
           y1, y2, y3, y4 ≥ 0
Dual Linear Program
Given a linear program Π in canonical form
maximize $\sum_{j=1}^{d} c_j x_j$
subject to $\sum_{j=1}^{d} a_{ij} x_j \le b_i$, i = 1, 2, . . . , n
the dual Dual(Π) is given by
minimize $\sum_{i=1}^{n} b_i y_i$
subject to $\sum_{i=1}^{n} y_i a_{ij} = c_j$, j = 1, 2, . . . , d
           $y_i \ge 0$, i = 1, 2, . . . , n
Proposition
Dual(Dual(Π)) is equivalent to Π
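As a rough sketch, building the dual data for this canonical form is just a transpose plus swapping the roles of b and c. The function name dual_of and its return layout are assumptions made for illustration.

```python
def dual_of(A, b, c):
    """Dual of  max c.x  s.t.  A x <= b  (x unrestricted in sign).

    Returns (A_T, rhs, cost) describing  min cost.y  s.t.  A_T y = rhs,  y >= 0,
    where A_T is the transpose of A: one dual equality per primal variable.
    """
    A_T = [list(col) for col in zip(*A)]
    return A_T, list(c), list(b)

# The running example from the earlier slides.
A = [[1, 3], [2, -4], [1, 1], [1, 0]]
b = [5, 10, 7, 5]
c = [4, 2]

dual_rows, dual_rhs, dual_cost = dual_of(A, b, c)
# dual_rows[j] holds the coefficients of y1..y4 in the j-th dual equality.
```

For the running example this reproduces the two equalities and the objective 5y1 + 10y2 + 7y3 + 5y4 derived by hand.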
Duality Theorems
Theorem (Weak Duality)
If x′ is a feasible solution to Π and y′ is a feasible solution to Dual(Π) then c · x′ ≤ y′ · b.
Theorem (Strong Duality)
If x∗ is an optimal solution to Π and y∗ is an optimal solution to Dual(Π) then c · x∗ = y∗ · b.
Many applications! The Maxflow-Mincut theorem can be deduced from duality.
Proof of Weak Duality
We already saw the proof by the way we derived the dual, but we will do it again formally.
Since y′ is feasible for Dual(Π): y′A = c and y′ ≥ 0.
Therefore c · x′ = y′Ax′.
Since x′ is feasible, Ax′ ≤ b, and since y′ ≥ 0, multiplying by y′ preserves the inequality. Hence,
c · x′ = y′Ax′ ≤ y′ · b
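The chain of (in)equalities in the proof can be traced numerically on the running example. This is only a sanity check under hand-picked points x′ = (2, 0) and y′ = (2, 1, 0, 0); the helper name weak_duality_chain is illustrative.

```python
A = [[1, 3], [2, -4], [1, 1], [1, 0]]
b = [5, 10, 7, 5]
c = [4, 2]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def weak_duality_chain(x, y):
    """Return (c.x, y.(A x), y.b) after checking primal and dual feasibility."""
    Ax = [dot(row, x) for row in A]
    yA = [dot(y, col) for col in zip(*A)]
    assert all(lhs <= rhs for lhs, rhs in zip(Ax, b)), "x is not primal feasible"
    assert yA == c and all(yi >= 0 for yi in y), "y is not dual feasible"
    return dot(c, x), dot(y, Ax), dot(y, b)

cx, yAx, yb = weak_duality_chain(x=[2, 0], y=[2, 1, 0, 0])
```

The first two numbers agree because y′A = c, and the last step can only increase the value because y′ ≥ 0 and Ax′ ≤ b.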
Duality for another canonical form
maximize 4x1 + x2 + 3x3
subject to x1 + 4x2 ≤ 2
           2x1 − x2 + x3 ≤ 4
           x1, x2, x3 ≥ 0
Choose non-negative y1, y2 and multiply inequalities
maximize 4x1 + x2 + 3x3
subject to y1(x1 + 4x2) ≤ 2y1
           y2(2x1 − x2 + x3) ≤ 4y2
           x1, x2, x3 ≥ 0
Adding the inequalities we get an inequality below that is valid for any feasible x and any non-negative y:
(y1 + 2y2)x1 + (4y1 − y2)x2 + y2x3 ≤ 2y1 + 4y2
Suppose we choose y1, y2 such that
y1 + 2y2 ≥ 4 and 4y1 − y2 ≥ 1 and y2 ≥ 3
Then, since x1, x2, x3 ≥ 0, we have 4x1 + x2 + 3x3 ≤ 2y1 + 4y2
Duality for another canonical form
maximize 4x1 + x2 + 3x3
subject to x1 + 4x2 ≤ 2
           2x1 − x2 + x3 ≤ 4
           x1, x2, x3 ≥ 0
is upper bounded by
minimize 2y1 + 4y2
subject to y1 + 2y2 ≥ 4
           4y1 − y2 ≥ 1
           y2 ≥ 3
           y1, y2 ≥ 0
Duality for another canonical form
Compactly,
For the primal LP max cx subject to Ax ≤ b, x ≥ 0
the dual LP is min yb subject to yA ≥ c, y ≥ 0
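A small sketch of this primal/dual pair for the three-variable example, with A, b, c read off from its constraints. The points x and y below are hand-picked for illustration and the function names are assumptions; the code only checks feasibility in each program and compares the two objective values.

```python
from fractions import Fraction

A = [[1, 4, 0], [2, -1, 1]]
b = [2, 4]
c = [4, 1, 3]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def primal_feasible(x):
    """A x <= b and x >= 0."""
    return all(xj >= 0 for xj in x) and all(dot(row, x) <= bi for row, bi in zip(A, b))

def dual_feasible(y):
    """y A >= c and y >= 0."""
    return all(yi >= 0 for yi in y) and all(dot(y, col) >= cj for col, cj in zip(zip(*A), c))

x = [0, Fraction(1, 2), Fraction(9, 2)]  # a hand-picked primal point
y = [1, 3]                               # a hand-picked dual point

primal_value = dot(c, x)
dual_value = dot(y, b)
```

Both values come out to 14, so by weak duality neither point can be improved: x is optimal for the primal and y is optimal for the dual.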
Some Useful Duality Properties
Assume the primal LP is a maximization LP.
For a given LP, the dual is another LP. The variables in the dual correspond to “non-trivial” primal constraints and vice-versa.
The dual of the dual LP gives us back the primal LP.
Weak and strong duality theorems.
If the primal is unbounded (objective achieves infinity) then the dual LP is infeasible. Why? If the dual LP had a feasible solution, it would upper bound the primal LP, which is not possible.
If the primal is infeasible then the dual LP is either infeasible or unbounded.
Primal and dual optimum solutions satisfy complementary slackness conditions (discussed soon).
Part II
Examples of Duality
Max matching in bipartite graph as LP
Input: G = (V = L ∪ R, E)
max $\sum_{uv \in E} x_{uv}$
s.t. $\sum_{u : uv \in E} x_{uv} \le 1 \quad \forall v \in V$
     $x_{uv} \ge 0 \quad \forall uv \in E$
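To make the constraints concrete, here is a sketch on a hypothetical four-vertex bipartite graph. The graph, the two edge vectors and the function names are made up for illustration; the per-vertex constraint sums x over the edges incident to that vertex.

```python
# Hypothetical bipartite graph: L = {a, b}, R = {p, q}.
EDGES = [("a", "p"), ("a", "q"), ("b", "q")]
VERTICES = sorted({v for e in EDGES for v in e})

def degree_load(x, v):
    """Left-hand side of the constraint for vertex v: sum of x over edges touching v."""
    return sum(x[e] for e in EDGES if v in e)

def lp_feasible(x):
    """Check x_uv >= 0 for all edges and degree_load <= 1 for all vertices."""
    return all(x[e] >= 0 for e in EDGES) and all(degree_load(x, v) <= 1 for v in VERTICES)

def lp_value(x):
    """Objective of the LP: total weight placed on edges."""
    return sum(x[e] for e in EDGES)

matching = {("a", "p"): 1, ("a", "q"): 0, ("b", "q"): 1}      # a matching of size 2
not_matching = {("a", "p"): 1, ("a", "q"): 1, ("b", "q"): 0}  # vertex a is used twice
```

A 0/1 vector is feasible for the LP exactly when the chosen edges form a matching, and its objective value is the size of that matching.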
Network flow
s-t flow in directed graph G = (V, E) with capacities c. Assume for simplicity that there are no incoming edges into s.
max $\sum_{(s,v) \in E} x(s, v)$
s.t. $\sum_{(u,v) \in E} x(u, v) - \sum_{(v,w) \in E} x(v, w) = 0 \quad \forall v \in V \setminus \{s, t\}$
     $x(u, v) \le c(u, v) \quad \forall (u, v) \in E$
     $x(u, v) \ge 0 \quad \forall (u, v) \in E$
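The three constraint families can be checked directly on a candidate flow. The network, capacities and flow values below are hypothetical, and the function names are illustrative.

```python
# Hypothetical network: capacities on directed edges, source "s", sink "t".
CAPACITY = {("s", "a"): 2, ("s", "b"): 1, ("a", "b"): 1, ("a", "t"): 1, ("b", "t"): 2}

def is_feasible_flow(x, s="s", t="t"):
    """Check 0 <= x(e) <= c(e) on every edge and conservation at every internal node."""
    if any(not (0 <= x[e] <= CAPACITY[e]) for e in CAPACITY):
        return False
    nodes = {v for e in CAPACITY for v in e} - {s, t}
    for v in nodes:
        inflow = sum(x[(u, w)] for (u, w) in CAPACITY if w == v)
        outflow = sum(x[(u, w)] for (u, w) in CAPACITY if u == v)
        if inflow != outflow:
            return False
    return True

def flow_value(x, s="s"):
    """Objective of the LP: total flow on edges leaving s."""
    return sum(x[(u, w)] for (u, w) in CAPACITY if u == s)

flow = {("s", "a"): 2, ("s", "b"): 1, ("a", "b"): 1, ("a", "t"): 1, ("b", "t"): 2}
```

Here the flow saturates every edge, conserves flow at a and b, and sends 3 units out of s.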
Dual of Network Flow
Part III
Farkas Lemma and Strong Duality
Optimization vs Feasibility
Suppose we want to solve LP of the form:
max cx subject to Ax ≤ b
It is an optimization problem. Can we reduce it to a decision problem?
Yes, via binary search. Find the largest value of σ such that the system of inequalities
Ax ≤ b, cx ≥ σ
is feasible. Feasible means that there is at least one solution.
Caveat: to do binary search we need to know the range of numbers. Skip for now since we need to worry about precision issues etc.
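The reduction can be sketched on the two-variable running example (maximize 4x1 + 2x2). Everything here is an assumption made for illustration: the feasibility oracle is a toy that tests intersections of constraint boundaries and only works for two variables when the region contains no full line, the search range [0, 100] is picked by hand, and a real implementation would call a proper feasibility solver instead.

```python
from fractions import Fraction
from itertools import combinations

# Running example: maximize 4x1 + 2x2 subject to the four constraints below (a.x <= b).
ROWS = [((1, 3), 5), ((2, -4), 10), ((1, 1), 7), ((1, 0), 5)]
C = (4, 2)

def satisfies(point, rows):
    return all(a[0] * point[0] + a[1] * point[1] <= b for a, b in rows)

def feasible(rows):
    """Toy 2-variable oracle: a non-empty region that contains no full line has a
    vertex, so it suffices to test every intersection of two constraint boundaries."""
    for (a1, b1), (a2, b2) in combinations(rows, 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if det == 0:
            continue  # parallel boundaries never meet in a single point
        # Cramer's rule for a1.x = b1, a2.x = b2, kept exact with Fraction.
        x1 = Fraction(b1 * a2[1] - a1[1] * b2, det)
        x2 = Fraction(a1[0] * b2 - b1 * a2[0], det)
        if satisfies((x1, x2), rows):
            return True
    return False

def maximize_by_binary_search(lo, hi, steps=50):
    """Largest sigma in [lo, hi], up to (hi - lo) / 2**steps, with Ax <= b, cx >= sigma feasible."""
    lo, hi = Fraction(lo), Fraction(hi)
    for _ in range(steps):
        mid = (lo + hi) / 2
        # cx >= mid is rewritten as (-c).x <= -mid to match the a.x <= b format.
        if feasible(ROWS + [((-C[0], -C[1]), -mid)]):
            lo = mid
        else:
            hi = mid
    return lo

sigma = maximize_by_binary_search(0, 100)  # assumed search range, chosen by hand
```

The search keeps lo feasible and hi infeasible, so the returned value approaches the optimum 20 from below; exact rationals sidestep the precision caveat only in this toy setting.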
Certificate for (in)feasibility
Suppose we have a system of m inequalities in n variables defined by
Ax ≤ b
How can we convince someone that there is a feasible solution?
How can we convince someone that there is no feasible solution?
Theorem of the Alternatives
Theorem
Let A ∈ R^{m×n} and b ∈ R^m. The system Ax ≤ b is either feasible, or it is infeasible and then there is a y ∈ R^m such that y ≥ 0, yA = 0 and yb < 0.
In other words, if Ax ≤ b is infeasible we can demonstrate it via the following compact contradiction. Find a non-negative combination of the rows of A (given by the certificate y) whose left-hand side is 0 and whose right-hand side is negative, that is, derive 0 ≤ yb < 0.
The preceding theorem can be used to prove strong duality. The proof needs a fair amount of formal detail, though the geometric intuition is reasonable.
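Verifying such a certificate is straightforward, as the sketch below suggests. The one-variable system and the function name certifies_infeasible are hypothetical.

```python
def certifies_infeasible(A, b, y):
    """True if y >= 0, yA = 0 and y.b < 0, which proves that A x <= b has no solution."""
    if any(yi < 0 for yi in y):
        return False
    yA = [sum(yi * aij for yi, aij in zip(y, col)) for col in zip(*A)]
    yb = sum(yi * bi for yi, bi in zip(y, b))
    return all(v == 0 for v in yA) and yb < 0

# Hypothetical infeasible system in one variable: x <= 1 and -x <= -2 (that is, x >= 2).
A = [[1], [-1]]
b = [1, -2]

good_certificate = certifies_infeasible(A, b, [1, 1])  # adds the rows: 0 <= -1
bad_certificate = certifies_infeasible(A, b, [1, 0])   # yA = 1, not 0
```

Checking a certificate only needs a few inner products, while finding one is as hard as deciding feasibility.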
Farkas Lemma
From the theorem of alternatives we can derive a useful version of Farkas lemma.
Theorem
A system Ax = b, x ≥ 0 is either feasible or there is a y such that yA ≥ 0 and yb < 0.
Nice geometric interpretation.
Complementary Slackness
Theorem
Let x∗ be any optimum solution to the primal LP Π in the canonical form max cx, Ax ≤ b, x ≥ 0 and y∗ be an optimum solution to the dual LP Dual(Π), which is min yb, yA ≥ c, y ≥ 0. Then the following hold.
If $y^*_i > 0$ then $\sum_{j=1}^{n} a_{ij} x^*_j = b_i$ (the primal constraint for row i is tight).
If $x^*_j > 0$ then $\sum_{i=1}^{m} y^*_i a_{ij} = c_j$ (the dual constraint for column j is tight).
The converse also holds: if x∗ and y∗ are primal and dual feasible and satisfy the complementary slackness conditions then both must be optimal.
Very useful in various applications. Nice geometric interpretation.
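The two conditions can be checked on the three-variable example from the earlier slides (max 4x1 + x2 + 3x3 with constraints x1 + 4x2 ≤ 2 and 2x1 − x2 + x3 ≤ 4). The points below are hand-picked and assumed to be feasible and non-negative; the function name is illustrative.

```python
from fractions import Fraction

A = [[1, 4, 0], [2, -1, 1]]
b = [2, 4]
c = [4, 1, 3]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def complementary_slack(x, y):
    """For feasible x, y: y_i > 0 forces row i tight, x_j > 0 forces column j tight."""
    rows_ok = all(yi == 0 or dot(row, x) == bi for yi, row, bi in zip(y, A, b))
    cols_ok = all(xj == 0 or dot(y, col) == cj for xj, col, cj in zip(x, zip(*A), c))
    return rows_ok and cols_ok

x_star = [0, Fraction(1, 2), Fraction(9, 2)]
y_star = [1, 3]

holds_at_optimum = complementary_slack(x_star, y_star)
holds_elsewhere = complementary_slack([0, 0, 4], y_star)  # feasible x, but row 1 is slack
```

The pair (x_star, y_star) passes, which matches the fact that both have objective value 14. The point (0, 0, 4) is feasible with value 12 but fails the check, because y1 > 0 while the first primal constraint is not tight.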
Part IV
Integer Linear Programming
Integer Linear Programming
Problem
Find a vector x ∈ Z^d (integer values) that
maximizes $\sum_{j=1}^{d} c_j x_j$
subject to $\sum_{j=1}^{d} a_{ij} x_j \le b_i$ for i = 1, . . . , n
Input is a matrix A = (aij) ∈ R^{n×d}, a column vector b = (bi) ∈ R^n, and a row vector c = (cj) ∈ R^d.
Factory Example
maximize x1 + 6x2
subject to x1 ≤ 200
           x2 ≤ 300
           x1 + x2 ≤ 400
           x1, x2 ≥ 0
Suppose we want x1, x2 to be integer valued.
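Because this instance is tiny, the integer optimum can be found by plain enumeration. This is a brute-force sketch for this one example only, not a general method for integer programs; the function name is illustrative.

```python
def best_integer_point():
    """Enumerate all integer points of the factory example and keep the best one."""
    best_value, best_point = None, None
    for x1 in range(0, 201):          # 0 <= x1 <= 200
        for x2 in range(0, 301):      # 0 <= x2 <= 300
            if x1 + x2 > 400:
                continue
            value = x1 + 6 * x2
            if best_value is None or value > best_value:
                best_value, best_point = value, (x1, x2)
    return best_value, best_point

best_value, best_point = best_integer_point()
```

The enumeration visits about 60,000 points, which is fine here but does not scale: the number of candidate points grows exponentially with the number of variables.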
Factory Example Figure
[Figure: feasible region of the factory example; axes x1 and x2, tick marks at 200 and 300]
1 Feasible values of x1 and x2 are the integer points in the shaded region.
2 The optimization function is a line; moving the line until it just leaves the final integer point in the feasible region gives the optimal values.
Integer Programming
Can model many difficult discrete optimization problems as integer programs!
Therefore integer programming is a hard problem. NP-hard.
Can relax integer program to linear program and approximate.
Practice: integer programs are solved by a variety of methods
1 branch and bound
2 branch and cut
3 adding cutting planes
4 linear programming plays a fundamental role
Example: Maximum Independent Set
Definition
Given an undirected graph G = (V, E), a subset of nodes S ⊆ V is an independent set (also called a stable set) if there are no edges between nodes in S. That is, if u, v ∈ S then (u, v) ∉ E.
Input Graph G = (V ,E)
Goal Find maximum sized independent set in G
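One common way to write this problem as an integer program, not spelled out on the slide and stated here as an assumption, uses a 0/1 variable x_v per node, the constraint x_u + x_v ≤ 1 for every edge (u, v), and the objective of maximizing the sum of the x_v. The sketch below brute-forces that program on a hypothetical four-node graph.

```python
from itertools import product

# Hypothetical graph: a path 0 - 1 - 2 - 3 plus the edge 1 - 3.
VERTICES = [0, 1, 2, 3]
EDGES = [(0, 1), (1, 2), (2, 3), (1, 3)]

def satisfies_edge_constraints(x):
    """x is a 0/1 vector; require x_u + x_v <= 1 for every edge (u, v)."""
    return all(x[u] + x[v] <= 1 for u, v in EDGES)

def max_independent_set():
    """Brute force over all 0/1 vectors (exponential; only sensible for tiny graphs)."""
    best = max(
        (x for x in product((0, 1), repeat=len(VERTICES)) if satisfies_edge_constraints(x)),
        key=sum,
    )
    return {v for v in VERTICES if best[v] == 1}

independent = max_independent_set()
```

On this graph the largest independent sets have two nodes; the enumeration returns one of them.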
Example: Dominating Set
Definition
Given an undirected graph G = (V, E), a subset of nodes S ⊆ V is a dominating set if for all v ∈ V, either v ∈ S or a neighbor of v is in S.
Input Graph G = (V, E), weights w(v) ≥ 0 for v ∈ V
Goal Find minimum weight dominating set in G
Example: s-t minimum cut and implicit constraints
Input Graph G = (V, E), edge capacities c(e), e ∈ E. s, t ∈ V
Goal Find minimum capacity s-t cut in G .
Linear Programs with Integer Vertices
Suppose we know that for a linear program all vertices have integer coordinates.
Then solving the linear program is the same as solving the integer program. We know how to solve linear programs efficiently (polynomial time) and hence we get an integer solution for free!
Luck or Structure:
1 The linear program for flows with integer capacities has integer vertices.
2 The linear program for matchings in bipartite graphs has integer vertices.
3 A complicated linear program for matchings in general graphs has integer vertices.
All of the above problems can hence be solved efficiently.
Linear Programs with Integer Vertices
Meta Theorem: A combinatorial optimization problem can be solved efficiently if and only if there is a linear program for the problem with integer vertices.
Consequence of the Ellipsoid method for solving linear programming.
In a sense linear programming and other geometric generalizations such as convex programming are the most general problems that we can solve efficiently.
Summary
1 Linear Programming is a useful and powerful (modeling)problem.
2 Can be solved in polynomial time. Practical solvers are available commercially as well as in open source. Whether there is a strongly polynomial time algorithm is a major open problem.
3 Geometry and linear algebra are important to understand the structure of LP and in algorithm design. Vertex solutions imply that LPs have poly-sized optimum solutions. This implies that LP is in NP.
4 Duality is a critical tool in the theory of linear programming. Duality implies that Linear Programming is in co-NP. Do you see why?
5 Integer Programming is NP-Complete. LP-based techniques are critical in heuristically solving integer programs.