short course robust optimization and machine …elghaoui/teaching/...short course robust...

22
Robust Optimization & Machine Learning 1. Optimization Models What is Optimization? Definition Examples Nomenclature Other standard forms Extensions Convexity Global vs. local optima Convex problems Software Non-convex problems Non-convex problems References Short Course Robust Optimization and Machine Learning Lecture 1: Optimization Models Laurent El Ghaoui EECS and IEOR Departments UC Berkeley Spring seminar TRANSP-OR, Zinal, Jan. 16-19, 2012

Upload: others

Post on 03-Jun-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Short Course Robust Optimization and Machine …elghaoui/Teaching/...Short Course Robust Optimization and Machine Learning Lecture 1: Optimization Models Laurent El Ghaoui EECS and

Robust Optimization &Machine Learning

1. OptimizationModels

What is Optimization?Definition

Examples

Nomenclature

Other standard forms

Extensions

ConvexityGlobal vs. local optima

Convex problems

Software

Non-convex problems

Non-convex problems

References

Short CourseRobust Optimization and Machine Learning

Lecture 1: Optimization Models

Laurent El Ghaoui

EECS and IEOR DepartmentsUC Berkeley

Spring seminar TRANSP-OR, Zinal, Jan. 16-19, 2012

January 16, 2012

Page 2: Short Course Robust Optimization and Machine …elghaoui/Teaching/...Short Course Robust Optimization and Machine Learning Lecture 1: Optimization Models Laurent El Ghaoui EECS and

Robust Optimization &Machine Learning

1. OptimizationModels

What is Optimization?Definition

Examples

Nomenclature

Other standard forms

Extensions

ConvexityGlobal vs. local optima

Convex problems

Software

Non-convex problems

Non-convex problems

References

Outline

What is Optimization?DefinitionExamplesNomenclatureOther standard formsExtensions

The Role of ConvexityGlobal vs. local optimaConvex problemsSoftwareNon-convex problemsNon-convex problems

References

Page 3: Short Course Robust Optimization and Machine …elghaoui/Teaching/...Short Course Robust Optimization and Machine Learning Lecture 1: Optimization Models Laurent El Ghaoui EECS and

Robust Optimization &Machine Learning

1. OptimizationModels

What is Optimization?Definition

Examples

Nomenclature

Other standard forms

Extensions

ConvexityGlobal vs. local optima

Convex problems

Software

Non-convex problems

Non-convex problems

References

Outline

What is Optimization?DefinitionExamplesNomenclatureOther standard formsExtensions

The Role of ConvexityGlobal vs. local optimaConvex problemsSoftwareNon-convex problemsNon-convex problems

References

Page 4: Short Course Robust Optimization and Machine …elghaoui/Teaching/...Short Course Robust Optimization and Machine Learning Lecture 1: Optimization Models Laurent El Ghaoui EECS and

Robust Optimization &Machine Learning

1. OptimizationModels

What is Optimization?Definition

Examples

Nomenclature

Other standard forms

Extensions

ConvexityGlobal vs. local optima

Convex problems

Software

Non-convex problems

Non-convex problems

References

Optimization problemA standard form

An optimization problem is a problem of the form

p∗ := minx

f0(x) subject to fi(x) ≤ 0, i = 1, . . . ,m,

whereI x ∈ Rn is the decision variable ;I f0 : Rn → R is the objective (or, cost ) function;I fi : Rn → R, i = 1, . . . ,m represent the constraints ;I p∗ is the optimal value .

Often the above is referred to as a “mathematical program” (forhistorical reasons).

Page 5: Short Course Robust Optimization and Machine …elghaoui/Teaching/...Short Course Robust Optimization and Machine Learning Lecture 1: Optimization Models Laurent El Ghaoui EECS and

Robust Optimization &Machine Learning

1. OptimizationModels

What is Optimization?Definition

Examples

Nomenclature

Other standard forms

Extensions

ConvexityGlobal vs. local optima

Convex problems

Software

Non-convex problems

Non-convex problems

References

ExampleLeast-squares regression

minw‖X T w − y‖2

whereI X = [x1, . . . , xm] is a n ×m matrix of data points (xi ∈ Rn);I y is a response vector;I ‖ · ‖2 is the l2 (i.e., Euclidean) norm.I Many variants (with e.g., constraints) exist (more on this later).I Perhaps the most popular / useful optimization problem.

Page 6: Short Course Robust Optimization and Machine …elghaoui/Teaching/...Short Course Robust Optimization and Machine Learning Lecture 1: Optimization Models Laurent El Ghaoui EECS and

Robust Optimization &Machine Learning

1. OptimizationModels

What is Optimization?Definition

Examples

Nomenclature

Other standard forms

Extensions

ConvexityGlobal vs. local optima

Convex problems

Software

Non-convex problems

Non-convex problems

References

ExampleLinear classification

minw,b

m∑i=1

max(0, 1− yi(wT xi + b))

whereI X = [x1, . . . , xm] is a n ×m matrix of data points (xi ∈ Rn);I y ∈ {−1, 1} is a binary response vector;I Many variants (with e.g., constraints) exist (more on this later).I Very useful for classifying data (e.g., text documents).

Page 7: Short Course Robust Optimization and Machine …elghaoui/Teaching/...Short Course Robust Optimization and Machine Learning Lecture 1: Optimization Models Laurent El Ghaoui EECS and

Robust Optimization &Machine Learning

1. OptimizationModels

What is Optimization?Definition

Examples

Nomenclature

Other standard forms

Extensions

ConvexityGlobal vs. local optima

Convex problems

Software

Non-convex problems

Non-convex problems

References

NomenclatureA toy optimization problem

minx

0.9x21 − 0.4x1x2 − 0.6x2

2 − 6.4x1 − 0.8x2

s.t. −1 ≤ x1 ≤ 2, 0 ≤ x2 ≤ 3.

I Feasible set in light blue.I 0.1- suboptimal set in darker

blue.I Unconstrained minimizer : x0;

optimal point: x∗.I Level sets of objective function in

red lines.I A sub-level set in red fill.

Page 8: Short Course Robust Optimization and Machine …elghaoui/Teaching/...Short Course Robust Optimization and Machine Learning Lecture 1: Optimization Models Laurent El Ghaoui EECS and

Robust Optimization &Machine Learning

1. OptimizationModels

What is Optimization?Definition

Examples

Nomenclature

Other standard forms

Extensions

ConvexityGlobal vs. local optima

Convex problems

Software

Non-convex problems

Non-convex problems

References

Other standard forms

Equality constraints. We may single out equality constraints, if any:

minx

f0(x) subject to hi(x) = 0, i = 1, . . . , p,fi(x) ≤ 0, i = 1, . . . ,m,

where hi ’s are given. Of course, we may reduce the above problem tothe standard form above, representing each equality constraint by apair of inequalities.

Abstract forms. Sometimes, the constraints are described abstractlyvia a set condition, of the form x ∈ X for some subset X of Rn. Thecorresponding notation is

minx∈X

f0(x).

Page 9: Short Course Robust Optimization and Machine …elghaoui/Teaching/...Short Course Robust Optimization and Machine Learning Lecture 1: Optimization Models Laurent El Ghaoui EECS and

Robust Optimization &Machine Learning

1. OptimizationModels

What is Optimization?Definition

Examples

Nomenclature

Other standard forms

Extensions

ConvexityGlobal vs. local optima

Convex problems

Software

Non-convex problems

Non-convex problems

References

Minimization vs. maximization

Some problems come in the form of maximization problems. Suchproblems are readily cast in standard form via the expression

maxx∈X

f0(x) = −minx∈X

: g0(x),

where g0 := −f0.

I Minimization problems correspond to loss, cost or riskminimization.

I Maximization problems typically correspond to utility or return(e.g., on investment) maximization.

Page 10: Short Course Robust Optimization and Machine …elghaoui/Teaching/...Short Course Robust Optimization and Machine Learning Lecture 1: Optimization Models Laurent El Ghaoui EECS and

Robust Optimization &Machine Learning

1. OptimizationModels

What is Optimization?Definition

Examples

Nomenclature

Other standard forms

Extensions

ConvexityGlobal vs. local optima

Convex problems

Software

Non-convex problems

Non-convex problems

References

Penalization

A trade-off between two objecgives is commonly accomplished via apenalized problem:

maxx

f (x) + λg(x),

where f and g represent loss and risk functions, and λ > 0 is arisk-aversion parameter.

Example: penalized least-squares

minw‖X T w − y‖2

2 + λ‖w‖22

Here, the risk term ‖w‖22 controls the variance associated with noise

in X .

Page 11: Short Course Robust Optimization and Machine …elghaoui/Teaching/...Short Course Robust Optimization and Machine Learning Lecture 1: Optimization Models Laurent El Ghaoui EECS and

Robust Optimization &Machine Learning

1. OptimizationModels

What is Optimization?Definition

Examples

Nomenclature

Other standard forms

Extensions

ConvexityGlobal vs. local optima

Convex problems

Software

Non-convex problems

Non-convex problems

References

Robust optimizationDefinition

In many instances the problem data is not known exactly. Assume thatthe functions fi in the original problem also depend on an “uncertainty”vector u that is unknown, but bounded: u ∈ U , with the set U given.

Robust counterpart:

minx

maxu∈U

f0(x , u)

subject to ∀ u ∈ U , fi(x , u) ≤ 0, i = 1, . . . ,m.

I Robust counterparts are sometimes tractable.I If not, systematic procedures exist to generate approximations.

Page 12: Short Course Robust Optimization and Machine …elghaoui/Teaching/...Short Course Robust Optimization and Machine Learning Lecture 1: Optimization Models Laurent El Ghaoui EECS and

Robust Optimization &Machine Learning

1. OptimizationModels

What is Optimization?Definition

Examples

Nomenclature

Other standard forms

Extensions

ConvexityGlobal vs. local optima

Convex problems

Software

Non-convex problems

Non-convex problems

References

Robust optimizationGeometry

Given a ∈ Rn, b ∈ R, consider the constraint in x ∈ Rn

(a + u)T x ≤ b,

with u’s components are only known within a given set U . The robustcounterpart is:

∀ u ∈ U : (a + u)T x ≤ b.

Robust counterpart when A is a box (left panel) and a sphere (right panel).

Page 13: Short Course Robust Optimization and Machine …elghaoui/Teaching/...Short Course Robust Optimization and Machine Learning Lecture 1: Optimization Models Laurent El Ghaoui EECS and

Robust Optimization &Machine Learning

1. OptimizationModels

What is Optimization?Definition

Examples

Nomenclature

Other standard forms

Extensions

ConvexityGlobal vs. local optima

Convex problems

Software

Non-convex problems

Non-convex problems

References

Stochastic optimizationDefinition

In stochastic programming, the uncertainty is described by a randomvariable, with known distribution.

Two-stage stochastic linear program with recourse:

minx∈X

aT x + f (x) : f (x) = Ew[ miny∈Y(x,w)

c(w)T y ].

I x-variables correspond to decisions taken now.I y -variables correspond to decisions taken when uncertainty w is

revealed.

I Stochastic problems are usually very hard.I Most known approaches are very expensive to solve.

Page 14: Short Course Robust Optimization and Machine …elghaoui/Teaching/...Short Course Robust Optimization and Machine Learning Lecture 1: Optimization Models Laurent El Ghaoui EECS and

Robust Optimization &Machine Learning

1. OptimizationModels

What is Optimization?Definition

Examples

Nomenclature

Other standard forms

Extensions

ConvexityGlobal vs. local optima

Convex problems

Software

Non-convex problems

Non-convex problems

References

Outline

What is Optimization?DefinitionExamplesNomenclatureOther standard formsExtensions

The Role of ConvexityGlobal vs. local optimaConvex problemsSoftwareNon-convex problemsNon-convex problems

References

Page 15: Short Course Robust Optimization and Machine …elghaoui/Teaching/...Short Course Robust Optimization and Machine Learning Lecture 1: Optimization Models Laurent El Ghaoui EECS and

Robust Optimization &Machine Learning

1. OptimizationModels

What is Optimization?Definition

Examples

Nomenclature

Other standard forms

Extensions

ConvexityGlobal vs. local optima

Convex problems

Software

Non-convex problems

Non-convex problems

References

Global vs. local minimaThe curse of optimization

I Point in red is globally optimal(optimal for short).

I Point in green is only locallyoptimal.

I In many applications, we areinterested in global minima.

Curse of optimizationOptimization algorithms for general problems can be trapped in localminima.

Page 16: Short Course Robust Optimization and Machine …elghaoui/Teaching/...Short Course Robust Optimization and Machine Learning Lecture 1: Optimization Models Laurent El Ghaoui EECS and

Robust Optimization &Machine Learning

1. OptimizationModels

What is Optimization?Definition

Examples

Nomenclature

Other standard forms

Extensions

ConvexityGlobal vs. local optima

Convex problems

Software

Non-convex problems

Non-convex problems

References

Convex functionDefinition

A function f : Rn → R is convex if it satisfies the condition

∀ x , y ∈ Rn, λ ∈ [0, 1] : f (λx + (1− λ)y) ≤ λf (x) + (1− λ)f (y)

Geometrically, the graph of the function is “bowl-shaped”.

Convex function. Non-convex function.

Page 17: Short Course Robust Optimization and Machine …elghaoui/Teaching/...Short Course Robust Optimization and Machine Learning Lecture 1: Optimization Models Laurent El Ghaoui EECS and

Robust Optimization &Machine Learning

1. OptimizationModels

What is Optimization?Definition

Examples

Nomenclature

Other standard forms

Extensions

ConvexityGlobal vs. local optima

Convex problems

Software

Non-convex problems

Non-convex problems

References

Convexity and local minima

When trying to minimize convex functions, specialized algorithms willalways converge to a global minimum, irrespective of the startingpoint, provided some (weak) assumptions on the function hold.

The Newton algorithm.

Page 18: Short Course Robust Optimization and Machine …elghaoui/Teaching/...Short Course Robust Optimization and Machine Learning Lecture 1: Optimization Models Laurent El Ghaoui EECS and

Robust Optimization &Machine Learning

1. OptimizationModels

What is Optimization?Definition

Examples

Nomenclature

Other standard forms

Extensions

ConvexityGlobal vs. local optima

Convex problems

Software

Non-convex problems

Non-convex problems

References

Convex optimizationDefinition

The problem in standard form

p∗ := minx

f0(x) subject to fi(x) ≤ 0, i = 1, . . . ,m,

is convex if the functions f0, . . . , fm are all convex.

Examples:I Linear programming (f0, . . . , fm affine).I Quadratic programming (f0 convex quadratic, f1, . . . , fm affine).I Second-order cone programming (f0 linear, fi ’s of the form‖Aix + bi‖2 + cT

i x + di , for appropriate data Ai , bi , ci , di ).

Page 19: Short Course Robust Optimization and Machine …elghaoui/Teaching/...Short Course Robust Optimization and Machine Learning Lecture 1: Optimization Models Laurent El Ghaoui EECS and

Robust Optimization &Machine Learning

1. OptimizationModels

What is Optimization?Definition

Examples

Nomenclature

Other standard forms

Extensions

ConvexityGlobal vs. local optima

Convex problems

Software

Non-convex problems

Non-convex problems

References

Software for convex optimizationI Free (if you have matlab): CVX [3], Yalmip, Mosek’s student

version [1].I Really free: [4] (in development).I Commercial: Mosek, CPLEX, etc.

Page 20: Short Course Robust Optimization and Machine …elghaoui/Teaching/...Short Course Robust Optimization and Machine Learning Lecture 1: Optimization Models Laurent El Ghaoui EECS and

Robust Optimization &Machine Learning

1. OptimizationModels

What is Optimization?Definition

Examples

Nomenclature

Other standard forms

Extensions

ConvexityGlobal vs. local optima

Convex problems

Software

Non-convex problems

Non-convex problems

References

Non-convex problemsExamples

I Boolean/integer optimization: some variables are constrained tobe Boolean or integers. Convex optimization can be used forgetting (sometimes) good approximations.

I Cardinality-constrained problems: we seek to bound the numberof non-zero elements in a vector variable. Convex optimizationcan be used for getting good approximations.

I Non-linear programming: usually non-convex problems withdifferentiable objective and functions. Algorithms provide onlylocal minima.

Not all non-convex problems are hard! e.g., low-rank approximationproblem.

Page 21: Short Course Robust Optimization and Machine …elghaoui/Teaching/...Short Course Robust Optimization and Machine Learning Lecture 1: Optimization Models Laurent El Ghaoui EECS and

Robust Optimization &Machine Learning

1. OptimizationModels

What is Optimization?Definition

Examples

Nomenclature

Other standard forms

Extensions

ConvexityGlobal vs. local optima

Convex problems

Software

Non-convex problems

Non-convex problems

References

Outline

What is Optimization?DefinitionExamplesNomenclatureOther standard formsExtensions

The Role of ConvexityGlobal vs. local optimaConvex problemsSoftwareNon-convex problemsNon-convex problems

References

Page 22: Short Course Robust Optimization and Machine …elghaoui/Teaching/...Short Course Robust Optimization and Machine Learning Lecture 1: Optimization Models Laurent El Ghaoui EECS and

Robust Optimization &Machine Learning

1. OptimizationModels

What is Optimization?Definition

Examples

Nomenclature

Other standard forms

Extensions

ConvexityGlobal vs. local optima

Convex problems

Software

Non-convex problems

Non-convex problems

References

References

E. D. Andersen and K. D. Andersen.

The MOSEK optimization package.

A. Bental, L. El Ghaoui, and A. Nemirovski.

Robust Optimization.Princeton Series in Applied Mathematics. Princeton University Press,October 2009.

S. Boyd and M. Grant.

The CVX optimization package, 2010.

S. Boyd and T. Tinoco.

The CVXPY optimization package, 2010.

S. Boyd and L. Vandenberghe.

Convex Optimization.Cambridge University Press, 2004.

G. Cornuejols and R. Tutuncu.

Optimization Methods in Finance.Cambridge University Press, 2007.