short course robust optimization and machine …robust optimization and machine learning lecture 2:...

Post on 20-Jun-2020

10 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Short CourseRobust Optimization and Machine Learning

Lecture 2: Convex Optimization

Laurent El Ghaoui

EECS and IEOR DepartmentsUC Berkeley

Spring seminar TRANSP-OR, Zinal, Jan. 16-19, 2012

January 16, 2012

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Outline

Convex problemsConvex setsConvex functionsConvex problems

DualityWeak dualityExamplesStrong duality

References

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Outline

Convex problemsConvex setsConvex functionsConvex problems

DualityWeak dualityExamplesStrong duality

References

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Convex setsDefinition

A set C in Rn is convex if, or any two pair of points, such as x1 and x2,the line segment joining the two points is entirely in the set.Mathematically:

∀ x1, x2 ∈ C, ∀ λ ∈ [0, 1] : λx1 + (1− λx2) ∈ C.

Points x1,x2 are in the set, so the linebetween them is.

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Convex setsIntersection rule

The intersection of convex sets isconvex: if Cα is a family of convexsets indexed by α ∈ A, then∩α∈ACα is convex. (We allow forinfinite sets A.)

Examples:I the second-order cone {(y , t) :: ‖y‖2 ≤ t} is convex, as it is the

intersection of half-spaces of the form uT y ≤ t with u spanningthe unit Euclidean sphere.

I the cone of positive, semi-definite (PSD) matrices is convex,since a n × n symmetric matrix X is PSD iff zT Xz ≥ 0 for everyn-vector z.

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Convex setsAffine transformation rule

Affine transformations of sets are convex. If C ⊆ Rm is convex, thenfor any x ∈ Rn and R ∈ Rn×m, the set {x + Ru : u ∈ C} is convex.

Example: For a given m × n matrix A, m-vector b, n-vector c, andscalar d , the set {

x ∈ Rn : ‖Ax + b‖2 ≤ cT x + d}

is convex.

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Convex setsExample: chance constraint

If a ∈ Rn is a Gaussian random variable with mean a and covariancematrix Σ, then the set of points x ∈ Rn such that

Prob{a : aT x ≤ b} ≥ 0.99

is convex, and can be expressed via the second-order cone condition

aT x + κ√

xT Σx︸ ︷︷ ︸=‖Σ1/2x‖2

≤ b,

where κ = Φ−1(0.99) ≈ 2.33, with Φ the CDF of the standardGaussian.

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Convex functionsExtended value functions

Extended value functions and domain:I The domain of a function f , denoted by domf , is the set of

points where it is finite.Example: f : x → log x , dom f = R++.

I We can extend functions outside the domain using the values±∞.Example: f : x → log x , x > 0, +∞ otherwise.

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Convex functionsDefinition

A (extended value) function f : Rn → R is convex if

∀ x1, x2, ∀λ ∈ [0, 1] : f (λx1 + (1− λx2)) ≤ λf (x1) + (1− λ)f (x2).

(This requires the domain to be convex.)

Examples:I f : x → 1/x if x > 0, +∞ otherwise, is convex.I f : x → 1/x if x 6= 0, +∞ otherwise, is not convex.

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Convex functionsEpigraph

Epigraph of a function:

epi f :={

(x , t) ∈ Rn × R : f (x) ≤ t}.

f convex⇐⇒ epi f is convex.

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Convex functionsAlternate characterizations

f differentiable, convex⇐⇒

∀ x , x0 : f (x) ≥ f (x0) +∇f (x0)T (x − x0).

Affine approximation to f at anypoint is global lower bound onf .

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Convex functionsAlternate characterizations

f twice differentiable, convex if and only if its Hessian

∇2f (x) =

(∂2f∂xi∂xj

(x)

)1≤ij≤n

is positive semidefinite (PSD) for every x .

In general, this condition is hard to check. For quadratic functionsthough, it is easy, as seen next.

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Quadratic functionsRepresentation via symmetric matrices

A quadratic function q : Rn → R can be represented as

q(x) =12

xT Qx + bT x + c,

for appropriate symmetric matrix Q, vector b ∈ Rn, and scalar c.

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Positive semidefinite matrices

Flashback from linear algebra:I A square matrix A has n (possibly non-distinct) eigenvalues,

which are (in general complex) numbers that solvedet(λI − A) = 0.

I Symmetric matrices have real eigenvalues only.I A symmetric matrix Q is said to be positive semi-definite (PSD) if

∀ x : xT Qx ≥ 0.

We write Q � 0.

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Eigenvalue decomposition for symmetric matrices

Theorem (EVD of symmetric matrices)We can decompose any symmetric p × p matrix Q as

Q = UΛUT =

p∑i=1

λiuiuTi ,

where Λ = diag(λ1, . . . , λp), with λ1 ≥ . . . ≥ λn the eigenvalues, andU = [u1, . . . , up] is a p × p orthogonal matrix (UT U = Ip) that containsthe eigenvectors of Q.

Corollary

Q � 0⇐⇒ every eigenvalue is non-negative.

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Convex functionsConvex quadratic functions

A quadratic function q : Rn → R can be represented as

q(x) =12

xT Qx + bT x + c,

for appropriate symmetric matrix Q, vector b ∈ Rn, and scalar c.Since ∇2q(x) = Q:

Theorem (Convex quadratic functions)q convex⇐⇒ Q � 0.

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Convex functionsSome properties

I Pointwise maxima of convex functions are convex: if fα is a familyof convex functions sets indexed by α ∈ A, then f with valuesmaxα∈A

fα(x) is convex.

I Conversely, any convex function can be represented as maximaof affine ones. (Hence, “convex means max-linear”).

I If f is convex and component-wise monotone increasing andg1, . . . , gk are convex, then h = f ◦ g with valuesh(x) = f (g1(x), . . . , gk (x)) is convex.Proof: the epigraph of h can be represented via convexconstraints:

h(x) ≤ t ⇐⇒ ∃ u ∈ Rk : f (u) ≤ t , gi (x) ≤ ui , i = 1, . . . , k .

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Convex functionsExamples

The following functions are convex:I Norms, such as l1, l2 and l∞. Hence, for example,

x → ‖Ax + b‖2, with A a matrix and b a vector, is convex.I The function x → λmax(F (x)), with F (x) = F0 + x1F1 + . . .+ xmFm

and affine combination of given symmetric matrices, and λmax isthe largest eigenvalue, is convex.

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Convex problemsDefinition

The problem in standard form

p∗ := minx

f0(x) subject to fi (x) ≤ 0, i = 1, . . . ,m,Ax = b,

is convex if the functions f0, . . . , fm are all convex. Here A ∈ Rp×n,b ∈ Rp are given.

Note that only affine equality constraints are allowed.

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

DefinitionMaximization problems

The problem in standard form

maxx

f0(x) subject to fi (x) ≤ 0, i = 1, . . . ,m,Ax = b,

is convex ifI The function f0 is concave.I The functions f1, . . . , fm are all convex.

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Problem classesLinear programming

Linear programming (LP) involves the minimization of a linear functionover a polytope:

minx

cT0 x : cT

i x + di ≥ 0, i = 1, . . . ,m.

The problem

minx

3x1 + 1.5x2

s.t. −1 ≤ x1 ≤ 2,0 ≤ x2 ≤ 3

is an LP, since the objective andconstraint functions are all affine.

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Problem classesQuadratic programming

Quadratic programming (QP) involves the minimization of a quadraticconvex function over a polytope.

minx

cT x + xT Qx : cTi x + di ≥ 0, i = 1, . . . ,m.

Here, Q must be PSD.The problem

minx

x21 − x1x2 + 2x2

2−3x1 − 1.5x2

s.t. −1 ≤ x1 ≤ 2,0 ≤ x2 ≤ 3

is a QP, since objective isquadratic convex, and the theconstraint functions are all affine.

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Example of a quadratic programThe slalom problem

The problem seen in Overview lecture can be formulated as a QP:

minv

v1u0 +n∑

i=1

(yi + vi+1 − vi )2

σ2i

: ‖v‖∞ ≤ c.

where σi (resp. yi ) is the horizontal (resp. vertical) distance betweenthe middle of the gates.

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Problem classesSecond-order cone programming

Second-order cone programming (SOCP) generalizes LP and QP viathe inclusion of Euclidean norms in the constraint functions.

minx

cT0 x : ‖Aix + bi‖2 ≤ cT

i x + di , i = 1, . . . ,m.

Application: Chance-constrained linear programming

minx

cT x : Prob{aTi x ≤ b} ≥ 0.99, i = 1, . . . ,m,

where each ai is a Gaussian random variable with mean ai andcovariance matrix Σi , i = 1, . . . ,m.

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Problem classesSemi-definite programming programming

Semi-definite programming (SDP) involves the minimization of a linearfunction over the constraint that a symmetric matrix affine in thedecision variables be positive-semidefinite:

minx

cT0 x : F0 +

n∑i=1

xiFi � 0.

(Here, F0, . . . ,Fn are given symmetric matrices.)

Application: relaxation of Boolean problems, see later.

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Outline

Convex problemsConvex setsConvex functionsConvex problems

DualityWeak dualityExamplesStrong duality

References

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Lagrangian

Consider the (non-necessarily convex) “primal” problem in standardform

p∗ := minx

f0(x) subject to fi (x) ≤ 0, i = 1, . . . ,m,

We can write the problem as an unconstrained one:

p∗ := minx

maxy≥0L(x , y),

where L is the Lagrange function :

L(x , y) := f0(x) +m∑

i=1

yi fi (x).

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Min-max inequality

Theorem (Min-max inequality)For any function L of two variables:

minx

maxyL(x , y) ≥ max

ymin

xL(x , y).

Proof.In the LHS, maximizing Player y has full information on x (y∗ is afunction of x). The RHS is better (smaller) for the minimizing Player x ,as then y has no information on x .

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Weak duality

Weak duality is the process of applying the min-max inequality to theproblem in unconstrained form:

p∗ = minx

maxy≥0L(x , y) ≥ d∗ := max

y≥0min

xL(x , y).

The dual problem is

d∗ = maxy≥0

G(y), where G(y) := minxL(x , y)

is called the dual function .

Since G is concave, the dual problem is convex (but not necessarilyeasy to solve!)

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

ExampleLP duality

Linear programming:

p∗ := min cT x : Ax ≤ b.

Lagrangian:L(x , y) = cT x + yT (Ax − b).

Dual is also an LP

p∗ ≥ d∗ = maxy≥0

minxL(x , y) = max

y≥0, AT y+c=0−bT y .

Turns out to be equivalent (p∗ = d∗), as seen later.

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Example of weak dualityA Boolean problem

Let W = W T be a given n × n matrix.

p∗ := maxx

xT Wx : x2i = 1, i = 1, . . . , n

I Arises in e.g., segmentation problems (with W the Laplacianmatrix of a graph).

I Hard combinatorial problem.I Duality yields a provably good, non-trivial bound.

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Example of weak dualityDual of Boolean problem

We can express the problem in an unconstrained way

p∗ = minx

maxyL(x , y)

where L is the Lagrangian:

L(x , y) := xT Wx +n∑

i=1

yi (1− x2i )

Dual function:

G(y) := maxxL(x , y) =

{ ∑ni=1 yi if diag(y)−W is PSD,

+∞ otherwise.

Dual problem is an SDP:

p∗ ≤ d∗ := maxy

G(y) = maxy

n∑i=1

yi : diag(y) � W .

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Example of weak duality

CVX syntax CVX is a matlab prototyping tool for convexoptimization [1].

Here is a matlab snippet that solves for the dual bound, assumingW , n are in the workspace:

cvx_beginvariable y(n,1);minimize( sum(y) );subject to

diag(y)-W == semidefinite(n) ;cvx_enddstar = sum(y);

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Strong dualityDefinition & main result

Recall weak duality:

p∗ = minx

maxy≥0L(x , y) ≥ d∗ := max

y≥0min

xL(x , y).

We say that strong duality holds when p∗ = d∗.I It holds when the primal problem is convex and strictly feasible.I It holds for LPs that are feasible.I It holds in some very specific non-convex problems. For example,

if m = 1 and both f0, f1 are quadratic.

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Example of strong dualityBidual of Boolean problem

Return to Lagrange relaxation (“dual”) of Boolean problem:

d∗ = maxy

n∑i=1

yi : diag(y) � W .

Express it as max-min problem:

d∗ = maxy

minX�0

n∑i=1

yi + Tr X (diag(y)−W ).

Here we use the fact that for any symmetric matrix Q,

maxX�0

Tr QX =

{0 if Q � 0,

+∞ otherwise.

Exchanging min and max leads to the so-called bidual :

p∗∗ := maxX

Tr WX : X � 0, Xii = 1, i = 1, . . . , n.

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Example of strong dualityBidual of Boolean problem

Bidual:max

XTr WX : X � 0, Xii = 1, i = 1, . . . , n.

I Also an SDP, with objective value equal to d∗ ( strong dualityholds ).

I Can obtain this directly by expressing original problem in terms ofmatrix variable X := xxT , and dropping the rank constraint on X .

I Suggests a way to generate primal feasible points from bidualvariable X , using distribution with mean 0 and covariance matrixgiven by X .

I Allows to prove that (2/π)d∗ ≤ p∗ ≤ d∗ (independent of problemsize!).

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Example of strong dualityCVX syntax

Here is a matlab snippet that solves the bidual, assuming W , n are inthe workspace:

cvx_beginvariable X(n,n) symmetric;maximize( trace(X*W) );subject to

X == semidefinite(n) ;diag(X) == 1;

cvx_enddstar = trace(X*W);

Alternatively, we can use the dual command in the original dualproblem:

cvx_beginvariable y(n,1);dual variable Xminimize( sum(y) );subject to

diag(y)-W == semidefinite(n) : X;cvx_end

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Optimality conditions

If the primal problem is convex, and both primal and dual are strictlyfeasible, then

I Strong duality holds;I both problems are attained;I the “Karush-Kuhn-Tucker” (KKT) conditions

x∗ ∈ arg minxL(x , y∗), y∗i fi (x∗) = 0, i = 1, . . . ,m.

characterize optimality (primal-dual feasible pairs are optimal iffthey satisfy them).

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Applications of dualityI Sensitivity analysis of convex problems.I Stopping criteria in convex optimization algorithms.I Decomposition methods for large-scale convex optimization.I Bounds and approximations for non-convex problems.I For convex problems, leads to often surprising connections.

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

Outline

Convex problemsConvex setsConvex functionsConvex problems

DualityWeak dualityExamplesStrong duality

References

Robust Optimization &Machine Learning

2. ConvexOptimization

Convex problemsConvex sets

Convex functions

Convex problems

DualityWeak duality

Examples

Strong duality

References

References

S. Boyd and M. Grant.

The CVX optimization package, 2010.

S. Boyd and L. Vandenberghe.

Convex Optimization.Cambridge University Press, 2004.

top related