convex functions - usc

55
CS599: Convex and Combinatorial Optimization Fall 2013 Lectures 5-6: Convex Functions Instructor: Shaddin Dughmi

Upload: others

Post on 21-Feb-2022

9 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Convex Functions - Usc

CS599: Convex and Combinatorial OptimizationFall 2013

Lectures 5-6: Convex Functions

Instructor: Shaddin Dughmi

Page 2: Convex Functions - Usc

Announcements

HW1 is out, due Thursday 9/26Make sure you get email from meToday: Convex Functions

Read all of B&V Chapter 3.

Page 3: Convex Functions - Usc

Outline

1 Convex Functions

2 Examples of Convex and Concave Functions

3 Convexity-Preserving Operations

Page 4: Convex Functions - Usc

Convex FunctionsA function f : Rn → R is convex if the line segment between any pointson the graph of f lies above f . i.e. if x, y ∈ Rn and θ ∈ [0, 1], then

f(θx+ (1− θ)y) ≤ θf(x) + (1− θ)f(y)

Inequality called Jensen’s inequality (basic form)f is convex iff its restriction to any line {x+ tv : t ∈ R} is convexf is strictly convex if inequality strict when x 6= y.Analogous definition when the domain of f is a convex subset Dof Rn

Convex Functions 1/24

Page 5: Convex Functions - Usc

Convex FunctionsA function f : Rn → R is convex if the line segment between any pointson the graph of f lies above f . i.e. if x, y ∈ Rn and θ ∈ [0, 1], then

f(θx+ (1− θ)y) ≤ θf(x) + (1− θ)f(y)

Inequality called Jensen’s inequality (basic form)

f is convex iff its restriction to any line {x+ tv : t ∈ R} is convexf is strictly convex if inequality strict when x 6= y.Analogous definition when the domain of f is a convex subset Dof Rn

Convex Functions 1/24

Page 6: Convex Functions - Usc

Convex FunctionsA function f : Rn → R is convex if the line segment between any pointson the graph of f lies above f . i.e. if x, y ∈ Rn and θ ∈ [0, 1], then

f(θx+ (1− θ)y) ≤ θf(x) + (1− θ)f(y)

Inequality called Jensen’s inequality (basic form)f is convex iff its restriction to any line {x+ tv : t ∈ R} is convex

f is strictly convex if inequality strict when x 6= y.Analogous definition when the domain of f is a convex subset Dof Rn

Convex Functions 1/24

Page 7: Convex Functions - Usc

Convex FunctionsA function f : Rn → R is convex if the line segment between any pointson the graph of f lies above f . i.e. if x, y ∈ Rn and θ ∈ [0, 1], then

f(θx+ (1− θ)y) ≤ θf(x) + (1− θ)f(y)

Inequality called Jensen’s inequality (basic form)f is convex iff its restriction to any line {x+ tv : t ∈ R} is convexf is strictly convex if inequality strict when x 6= y.

Analogous definition when the domain of f is a convex subset Dof Rn

Convex Functions 1/24

Page 8: Convex Functions - Usc

Convex FunctionsA function f : Rn → R is convex if the line segment between any pointson the graph of f lies above f . i.e. if x, y ∈ Rn and θ ∈ [0, 1], then

f(θx+ (1− θ)y) ≤ θf(x) + (1− θ)f(y)

Inequality called Jensen’s inequality (basic form)f is convex iff its restriction to any line {x+ tv : t ∈ R} is convexf is strictly convex if inequality strict when x 6= y.Analogous definition when the domain of f is a convex subset Dof Rn

Convex Functions 1/24

Page 9: Convex Functions - Usc

Concave and Affine Functions

A function is f : Rn → R is concave if −f is convex. Equivalently:Line segment between any points on the graph of f lies below f .If x, y ∈ Rn and θ ∈ [0, 1], then

f(θx+ (1− θ)y) ≥ θf(x) + (1− θ)f(y)

f : Rn → R is affine if it is both concave and convex. Equivalently:Line segment between any points on the graph of f lies on thegraph of f .f(x) = aᵀx+ b for some a ∈ Rn and b ∈ R.

Convex Functions 2/24

Page 10: Convex Functions - Usc

Concave and Affine Functions

A function is f : Rn → R is concave if −f is convex. Equivalently:Line segment between any points on the graph of f lies below f .If x, y ∈ Rn and θ ∈ [0, 1], then

f(θx+ (1− θ)y) ≥ θf(x) + (1− θ)f(y)

f : Rn → R is affine if it is both concave and convex. Equivalently:Line segment between any points on the graph of f lies on thegraph of f .f(x) = aᵀx+ b for some a ∈ Rn and b ∈ R.

Convex Functions 2/24

Page 11: Convex Functions - Usc

We will now look at some equivalent definitions of convex functions

First Order DefinitionA differentiable f : Rn → R is convex if and only if the first-orderapproximation centered at any point x underestimates f everywhere.

f(y) ≥ f(x) + (5f(x))ᵀ(y − x)

Local information→ global informationIf 5f(x) = 0 then x is a global minimizer of f

Convex Functions 3/24

Page 12: Convex Functions - Usc

We will now look at some equivalent definitions of convex functions

First Order DefinitionA differentiable f : Rn → R is convex if and only if the first-orderapproximation centered at any point x underestimates f everywhere.

f(y) ≥ f(x) + (5f(x))ᵀ(y − x)

Local information→ global informationIf 5f(x) = 0 then x is a global minimizer of f

Convex Functions 3/24

Page 13: Convex Functions - Usc

Second Order DefinitionA twice differentiable f : Rn → R is convex if and only if its derivative isnondecreasing in all directions. Formally,

52f(x) � 0

for all x.

IntepretationRecall definition of PSD: zᵀ 52 f(x)z > 0 for all z ∈ Rn

At x+ δ~z, infitisimal change in gradient is in roughly in the samedirection as ~zGraph of f curves upwards along any lineWhen n = 1, this is f ′′(x) ≥ 0.

Convex Functions 4/24

Page 14: Convex Functions - Usc

Second Order DefinitionA twice differentiable f : Rn → R is convex if and only if its derivative isnondecreasing in all directions. Formally,

52f(x) � 0

for all x.

IntepretationRecall definition of PSD: zᵀ 52 f(x)z > 0 for all z ∈ Rn

At x+ δ~z, infitisimal change in gradient is in roughly in the samedirection as ~zGraph of f curves upwards along any lineWhen n = 1, this is f ′′(x) ≥ 0.

Convex Functions 4/24

Page 15: Convex Functions - Usc

EpigraphThe epigraph of f is the set of points above the graph of f . Formally,

epi(f) = {(x, t) : t ≥ f(x)}

Epigraph Definitionf is a convex function if and only if its epigraph is a convex set.

Convex Functions 5/24

Page 16: Convex Functions - Usc

EpigraphThe epigraph of f is the set of points above the graph of f . Formally,

epi(f) = {(x, t) : t ≥ f(x)}

Epigraph Definitionf is a convex function if and only if its epigraph is a convex set.

Convex Functions 5/24

Page 17: Convex Functions - Usc

Jensen’s Inequality (General Form)

f : Rn → R is convex if and only ifFor every x1, . . . , xk in the domain of f , and θ1, . . . , θk ≥ 0 suchthat

∑i θi = 1, we have

f(∑i

θixi) ≤∑i

θif(xi)

Given a probability measure D on the domain of f , and x ∼ D,

f(E[x]) ≤ E[f(x)]

Adding noise to x can only increase f(x) in expectation.

Convex Functions 6/24

Page 18: Convex Functions - Usc

Jensen’s Inequality (General Form)

f : Rn → R is convex if and only ifFor every x1, . . . , xk in the domain of f , and θ1, . . . , θk ≥ 0 suchthat

∑i θi = 1, we have

f(∑i

θixi) ≤∑i

θif(xi)

Given a probability measure D on the domain of f , and x ∼ D,

f(E[x]) ≤ E[f(x)]

Adding noise to x can only increase f(x) in expectation.

Convex Functions 6/24

Page 19: Convex Functions - Usc

Local and Global Optimality

Local minimumx is a local minimum of f if there is a an open ball B containing xwhere f(y) ≥ f(x) for all y ∈ B.

Local and Global OptimalityWhen f is convex, x is a local minimum of f if and only if it is a globalminimum.

This fact underlies much of the tractability of convex optimization.

Convex Functions 7/24

Page 20: Convex Functions - Usc

Local and Global Optimality

Local minimumx is a local minimum of f if there is a an open ball B containing xwhere f(y) ≥ f(x) for all y ∈ B.

Local and Global OptimalityWhen f is convex, x is a local minimum of f if and only if it is a globalminimum.

This fact underlies much of the tractability of convex optimization.

Convex Functions 7/24

Page 21: Convex Functions - Usc

Sub-level sets

Level sets of f(x, y) =√x2 + y2

Sublevel setThe α-sublevel set of f is {x ∈ domain(f) : f(x) ≤ α}.

FactEvery sub-level set of a convex function is a convex set.

This fact also underlies tractability of convex optimization

Note: converse false, but nevertheless useful check.

Convex Functions 8/24

Page 22: Convex Functions - Usc

Sub-level sets

Level sets of f(x, y) =√x2 + y2

Sublevel setThe α-sublevel set of f is {x ∈ domain(f) : f(x) ≤ α}.

FactEvery sub-level set of a convex function is a convex set.

This fact also underlies tractability of convex optimization

Note: converse false, but nevertheless useful check.

Convex Functions 8/24

Page 23: Convex Functions - Usc

Sub-level sets

Level sets of f(x, y) =√x2 + y2

Sublevel setThe α-sublevel set of f is {x ∈ domain(f) : f(x) ≤ α}.

FactEvery sub-level set of a convex function is a convex set.

This fact also underlies tractability of convex optimization

Note: converse false, but nevertheless useful check.Convex Functions 8/24

Page 24: Convex Functions - Usc

Other Basic Properties

ContinuityConvex functions are continuous.

Extended-value extensionIf a function f : D → R is convex on its domain, and D is convex, thenit can be extended to a convex function on Rn. by setting f(x) =∞whenever x /∈ D.

This simplifies notation. Resulting function f̃ : D → R⋃∞ is “convex”

with respect to the ordering on R⋃∞

Convex Functions 9/24

Page 25: Convex Functions - Usc

Other Basic Properties

ContinuityConvex functions are continuous.

Extended-value extensionIf a function f : D → R is convex on its domain, and D is convex, thenit can be extended to a convex function on Rn. by setting f(x) =∞whenever x /∈ D.

This simplifies notation. Resulting function f̃ : D → R⋃∞ is “convex”

with respect to the ordering on R⋃∞

Convex Functions 9/24

Page 26: Convex Functions - Usc

Outline

1 Convex Functions

2 Examples of Convex and Concave Functions

3 Convexity-Preserving Operations

Page 27: Convex Functions - Usc

Functions on the reals

Affine: ax+ b

Exponential: eax convex for any a ∈ RPowers: xa convex on R++ when a ≥ 1 or a ≤ 0, and concave for0 ≤ a ≤ 1

Logarithm: log x concave on R++.

Examples of Convex and Concave Functions 10/24

Page 28: Convex Functions - Usc

NormsNorms are convex.

||θx+ (1− θ)y|| ≤ ||θx||+ ||(1− θ)y|| = θ||x||+ (1− θ)||y||

Uses both norm axioms: triangle inequality, and homogeneity.Applies to matrix norms, such as the spectral norm (radius ofinduced ellipsoid)

Maxmaxi xi is convex

maxi

(θx+ (1− θ)y)i = maxi

(θxi + (1− θ)yi)

≤ maxiθxi +max

i(1− θ)yi

= θmaxixi + (1− θ)max

iyi

If i’m allowed to pick the maximum entry of θx and θy independently, Ican do only better.

Examples of Convex and Concave Functions 11/24

Page 29: Convex Functions - Usc

NormsNorms are convex.

||θx+ (1− θ)y|| ≤ ||θx||+ ||(1− θ)y|| = θ||x||+ (1− θ)||y||

Uses both norm axioms: triangle inequality, and homogeneity.Applies to matrix norms, such as the spectral norm (radius ofinduced ellipsoid)

Maxmaxi xi is convex

maxi

(θx+ (1− θ)y)i = maxi

(θxi + (1− θ)yi)

≤ maxiθxi +max

i(1− θ)yi

= θmaxixi + (1− θ)max

iyi

If i’m allowed to pick the maximum entry of θx and θy independently, Ican do only better.

Examples of Convex and Concave Functions 11/24

Page 30: Convex Functions - Usc

Log-sum-exp: log(ex1 + ex2 + . . .+ exn) isconvexGeometric mean: (

∏ni=1 xi)

1n is concave

Log-determinant: log detX is concaveQuadratic form: xᵀAx is convex iff A � 0

Other examples in bookf(x, y) = log(ex + ey)

Proving convexity often comes down to case-by-case reasoning,involving:

Definition: restrict to line and check Jensen’s inequalityWrite down the Hessian and prove PSDExpress as a combination of other convex functions throughconvexity-preserving operations (Next)

Examples of Convex and Concave Functions 12/24

Page 31: Convex Functions - Usc

Log-sum-exp: log(ex1 + ex2 + . . .+ exn) isconvexGeometric mean: (

∏ni=1 xi)

1n is concave

Log-determinant: log detX is concaveQuadratic form: xᵀAx is convex iff A � 0

Other examples in bookf(x, y) = log(ex + ey)

Proving convexity often comes down to case-by-case reasoning,involving:

Definition: restrict to line and check Jensen’s inequalityWrite down the Hessian and prove PSDExpress as a combination of other convex functions throughconvexity-preserving operations (Next)

Examples of Convex and Concave Functions 12/24

Page 32: Convex Functions - Usc

Outline

1 Convex Functions

2 Examples of Convex and Concave Functions

3 Convexity-Preserving Operations

Page 33: Convex Functions - Usc

Nonnegative Weighted CombinationsIf f1, f2, . . . , fk are convex, and w1, w2, . . . , wk ≥ 0, theng = w1f1 + w2f2 . . .+ wkfk is convex.

Extends to integrals g(x) =∫y w(y)fy(x) with w(y) ≥ 0, and therefore

expectations Ey fy(x).

Worth NotingMinimizing the expectation of a random convex cost function is also aconvex optimization problem!

A stochastic convex optimization problem is a convex optimizationproblem.

Convexity-Preserving Operations 13/24

Page 34: Convex Functions - Usc

Nonnegative Weighted CombinationsIf f1, f2, . . . , fk are convex, and w1, w2, . . . , wk ≥ 0, theng = w1f1 + w2f2 . . .+ wkfk is convex.

proof (k = 2)

g

(x+ y

2

)= w1f1

(x+ y

2

)+ w2f2

(x+ y

2

)≤ w1

f1(x) + f1(y)

2+ w2

f2(x) + f2(y)

2

=g(x) + g(y)

2

Extends to integrals g(x) =∫y w(y)fy(x) with w(y) ≥ 0, and therefore

expectations Ey fy(x).

Worth NotingMinimizing the expectation of a random convex cost function is also aconvex optimization problem!

A stochastic convex optimization problem is a convex optimizationproblem.

Convexity-Preserving Operations 13/24

Page 35: Convex Functions - Usc

Nonnegative Weighted CombinationsIf f1, f2, . . . , fk are convex, and w1, w2, . . . , wk ≥ 0, theng = w1f1 + w2f2 . . .+ wkfk is convex.

Extends to integrals g(x) =∫y w(y)fy(x) with w(y) ≥ 0, and therefore

expectations Ey fy(x).

Worth NotingMinimizing the expectation of a random convex cost function is also aconvex optimization problem!

A stochastic convex optimization problem is a convex optimizationproblem.

Convexity-Preserving Operations 13/24

Page 36: Convex Functions - Usc

Nonnegative Weighted CombinationsIf f1, f2, . . . , fk are convex, and w1, w2, . . . , wk ≥ 0, theng = w1f1 + w2f2 . . .+ wkfk is convex.

Extends to integrals g(x) =∫y w(y)fy(x) with w(y) ≥ 0, and therefore

expectations Ey fy(x).

Worth NotingMinimizing the expectation of a random convex cost function is also aconvex optimization problem!

A stochastic convex optimization problem is a convex optimizationproblem.

Convexity-Preserving Operations 13/24

Page 37: Convex Functions - Usc

Example: Stochastic Facility Location

Average Distancek customers located at y1, y2, . . . , yk ∈ Rn

If I place a facility at x ∈ Rn, average distance to a customer isg(x) =

∑i1k ||x− yi||

Since distance to any one customer is convex in x, so is theaverage distance.Extends to probability measure over customers

Convexity-Preserving Operations 14/24

Page 38: Convex Functions - Usc

Example: Stochastic Facility Location

Average Distancek customers located at y1, y2, . . . , yk ∈ Rn

If I place a facility at x ∈ Rn, average distance to a customer isg(x) =

∑i1k ||x− yi||

Since distance to any one customer is convex in x, so is theaverage distance.Extends to probability measure over customers

Convexity-Preserving Operations 14/24

Page 39: Convex Functions - Usc

ImplicationConvex functions are a convex cone in the vector space of functionsfrom Rn to R.

The set of convex functions is the intersection of an infinite set ofhomogeneous linear inequalities indexed by x, y, θ

f(θx+ (1− θ)y)− θf(x) + (1− θ)f(y) ≤ 0

Convexity-Preserving Operations 15/24

Page 40: Convex Functions - Usc

Composition with Affine FunctionIf f : Rn → R is convex, and A ∈ Rn×m, b ∈ Rn, then

g(x) = f(Ax+ b)

is a convex function from Rm to R.

Proof

(x, t) ∈ graph(g) ⇐⇒ t = g(x) = f(Ax+b) ⇐⇒ (Ax+b, t) ∈ graph(f)

(x, t) ∈ epi(g) ⇐⇒ t ≥ g(x) = f(Ax+ b) ⇐⇒ (Ax+ b, t) ∈ epi(f)

epi(g) is the inverse image of epi(f) under the affine mapping(x, t)→ (Ax+ b, t)

Convexity-Preserving Operations 16/24

Page 41: Convex Functions - Usc

Composition with Affine FunctionIf f : Rn → R is convex, and A ∈ Rn×m, b ∈ Rn, then

g(x) = f(Ax+ b)

is a convex function from Rm to R.

Proof

(x, t) ∈ graph(g) ⇐⇒ t = g(x) = f(Ax+b) ⇐⇒ (Ax+b, t) ∈ graph(f)

(x, t) ∈ epi(g) ⇐⇒ t ≥ g(x) = f(Ax+ b) ⇐⇒ (Ax+ b, t) ∈ epi(f)

epi(g) is the inverse image of epi(f) under the affine mapping(x, t)→ (Ax+ b, t)

Convexity-Preserving Operations 16/24

Page 42: Convex Functions - Usc

Composition with Affine FunctionIf f : Rn → R is convex, and A ∈ Rn×m, b ∈ Rn, then

g(x) = f(Ax+ b)

is a convex function from Rm to R.

Proof

(x, t) ∈ graph(g) ⇐⇒ t = g(x) = f(Ax+b) ⇐⇒ (Ax+b, t) ∈ graph(f)

(x, t) ∈ epi(g) ⇐⇒ t ≥ g(x) = f(Ax+ b) ⇐⇒ (Ax+ b, t) ∈ epi(f)

epi(g) is the inverse image of epi(f) under the affine mapping(x, t)→ (Ax+ b, t)

Convexity-Preserving Operations 16/24

Page 43: Convex Functions - Usc

Composition with Affine FunctionIf f : Rn → R is convex, and A ∈ Rn×m, b ∈ Rn, then

g(x) = f(Ax+ b)

is a convex function from Rm to R.

Proof

(x, t) ∈ graph(g) ⇐⇒ t = g(x) = f(Ax+b) ⇐⇒ (Ax+b, t) ∈ graph(f)

(x, t) ∈ epi(g) ⇐⇒ t ≥ g(x) = f(Ax+ b) ⇐⇒ (Ax+ b, t) ∈ epi(f)

epi(g) is the inverse image of epi(f) under the affine mapping(x, t)→ (Ax+ b, t)

Convexity-Preserving Operations 16/24

Page 44: Convex Functions - Usc

Examples||Ax+ b|| is convexmax(Ax+ b) is convexlog(ea

ᵀ1x+b1 + ea

ᵀ2x+b2 + . . .+ ea

ᵀnx+bn) is convex

Convexity-Preserving Operations 17/24

Page 45: Convex Functions - Usc

MaximumIf f1, f2 are convex, then g(x) = max {f1(x), f2(x)} is also convex.

Generalizes to the maximum of any number of functions, maxki=1 fi(x),and also to the supremum of an infinite set of functions supy fy(x).

epi g = epi f1⋂

epi f2

Convexity-Preserving Operations 18/24

Page 46: Convex Functions - Usc

MaximumIf f1, f2 are convex, then g(x) = max {f1(x), f2(x)} is also convex.

Generalizes to the maximum of any number of functions, maxki=1 fi(x),and also to the supremum of an infinite set of functions supy fy(x).

epi g = epi f1⋂

epi f2

Convexity-Preserving Operations 18/24

Page 47: Convex Functions - Usc

Example: Robust Facility Location

Maximum Distancek customers located at y1, y2, . . . , yk ∈ Rn

If I place a facility at x ∈ Rn, maximum distance to a customer isg(x) = maxi ||x− yi||

Convexity-Preserving Operations 19/24

Page 48: Convex Functions - Usc

Example: Robust Facility Location

Maximum Distancek customers located at y1, y2, . . . , yk ∈ Rn

If I place a facility at x ∈ Rn, maximum distance to a customer isg(x) = maxi ||x− yi||

Since distance to any one customer is convex in x, so is theworst-case distance.

Convexity-Preserving Operations 19/24

Page 49: Convex Functions - Usc

Example: Robust Facility Location

Maximum Distancek customers located at y1, y2, . . . , yk ∈ Rn

If I place a facility at x ∈ Rn, maximum distance to a customer isg(x) = maxi ||x− yi||

Worth NotingWhen a convex cost function is uncertain, minimizing the worst-casecost is also a convex optimization problem!

A robust (in the worst-case sense) convex optimization problem isa convex optimization problem.

Convexity-Preserving Operations 19/24

Page 50: Convex Functions - Usc

Other ExamplesMaximum eigenvalue of a symmetric matrix A is convex in A

max {vᵀAv : ||v|| = 1}

Sum of k largest components of a vector x is convex in x

max{~1S · x : |S| = k

}

Convexity-Preserving Operations 20/24

Page 51: Convex Functions - Usc

MinimizationIf f(x, y) is convex and C is convex and nonempty, theng(x) = infy∈C f(x, y) is convex.

Proof (for C = Rk)epi g is the projection of epi f onto hyperplane y = 0.

f(x, y) = x2 + y2 g(x) = x2

Convexity-Preserving Operations 21/24

Page 52: Convex Functions - Usc

MinimizationIf f(x, y) is convex and C is convex and nonempty, theng(x) = infy∈C f(x, y) is convex.

Proof (for C = Rk)epi g is the projection of epi f onto hyperplane y = 0.

f(x, y) = x2 + y2 g(x) = x2

Convexity-Preserving Operations 21/24

Page 53: Convex Functions - Usc

ExampleDistance from a convex set C

f(x, y) = infy∈C||x− y||

Convexity-Preserving Operations 22/24

Page 54: Convex Functions - Usc

Composition RulesIf g : Rn → Rk and h : Rk → R, then f = h ◦ g is convex if

gi are convex, and h is convex and nondecreasing in eachargument.gi are concave, and h is convex and nonincreasing in eachargument.

Proof (n = k = 1)

f ′′(x) = h′′(g(x))g′(x)2 + h′(g(x))g′′(x)

Convexity-Preserving Operations 23/24

Page 55: Convex Functions - Usc

PerspectiveIf f is convex then g(x, t) = tf(x/t) is also convex.

Proofepi g is inverse image of epi f under the perspective function.

Convexity-Preserving Operations 24/24