Introduction to optimization
Olga [email protected], TG412
ELT-53656 Network Analysis and Dimensioning II
Tampere University of Technology, Tampere, Finland
February 10, 2016
Outline
1 Introduction
2 Mathematical optimization
3 Appendix
Content to be discussed during the next 5 weeks...
This is not an Optimization Theory course... and we do not pretend it is.
Our goal is to (re)visit the basics of optimization methods to broaden the horizons of good engineers:
Basics of mathematical optimization (today)
Linear programming
Convex programming
Mixed-integer and integer programming
Shortest path algorithms (coming from Graph Theory)
History...
Leonhard Euler (1707–1783): "nothing at all takes place in the Universe in which some rule of maximum or minimum does not appear"
Where Do Optimization Problems Arise?
Methods of optimization deal with finding the optimum solution for a mathematical model; in most cases, finding maxima and minima of functions subject to some constraints.
Manufacturing and transportation systems
Scheduling of goods for manufacturing
Transportation of goods over transportation networks
Scheduling of fleets of airplanes
Communication Systems
Design of communication systems
Flow of information across networks
Financial systems, and much more...
Examples
Wireless network optimization
variables: power for wireless, investments in infrastructure
constraints: budget, max/min Tx power, distance, rate...
objective: system/user throughput, energy efficiency, etc
Data fitting
variables: model parameters
constraints: prior information, parameter limits
objective: measure of prediction error
Mathematical optimization
Mathematical Model
Consider a (mathematical) optimization problem:
minimize f(x), x ∈ Rn
subject to x ∈ Ω ⊂ Rn.
Definition
The function f(x) : Rn → R is a real-valued function, called the objective function, or cost function.
Definition
The variables x = [x1, ..., xn] are the optimization variables.
Definition
Optimal point x0: f(x0) ≤ f(x), ∀x ∈ Ω.
Constraint Set
Definition
The set Ω ⊂ Rn is the constraint set or feasible set/region.
Definition
The above problem is a general form of a constrained optimization problem. If Ω = Rn, we refer to the problem as unconstrained.
Ω takes the form: Ω = {x : h_i(x) = 0, g_j(x) < 0},
where h_i(x), g_j(x) are constraint functions.
Definition
If Ω = ∅, the problem is infeasible; otherwise it is feasible.
Classification
Problems:
Linear/nonlinear (defined by shape of f (x), hi (x), gj(x))
Convex/non-convex (defined by properties of f (x) and Ω)
Discrete/continuous/integer/mixed-integer/..
One-dimensional/multi-dimensional
One extremum/multiple extrema
Methods:
Direct methods (zero order) use information only on f (x)
allow analyzing functions defined algorithmically
First-order methods use information on f (x),∇f (x)
e.g., gradient methods
Second-order methods use f (x),∇f (x),∇2f (x)
e.g., Newton methods and modifications
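To make the first-order idea concrete, here is a minimal gradient-descent sketch in Python (NumPy assumed); the quadratic objective, starting point, and step size are made up for illustration:

```python
import numpy as np

def grad_descent(grad, x0, lr=0.1, iters=200):
    """Plain gradient descent: repeatedly step x <- x - lr * grad(x)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = x - lr * grad(x)
    return x

# Illustrative convex quadratic f(x) = (x1 - 1)^2 + 2*(x2 + 0.5)^2,
# whose unique minimizer is (1, -0.5); grad_f is its gradient.
grad_f = lambda x: np.array([2.0 * (x[0] - 1.0), 4.0 * (x[1] + 0.5)])
x_star = grad_descent(grad_f, [5.0, 5.0])
```

Note that only ∇f(x) is evaluated, which is exactly what places this in the first-order class above.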
Solving Optimization Problems
General optimization problem
very difficult to solve
methods involve some compromise, e.g., very long computation time, or not always finding the solution
Exceptions: certain problem classes can be solved efficiently and reliably
least-squares problems
linear programming problems
convex optimization problems
Least-squares problem
Example: linear regression b_i = ∑_j a_ij x_j + ε
minimize ||Ax − b||_2^2 = ∑_i (∑_j a_ij x_j − b_i)^2
Solving least-squares problems
analytical solution: x0 = (A^T A)^{-1} A^T b
reliable and efficient algorithms and software
computation time proportional to n^2 k (A ∈ R^{k×n}); less if structured
a mature technology
Using least-squares
least-squares problems are easy to recognize
a few standard techniques increase flexibility (e.g., including weights, adding regularization terms)
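The analytical solution above can be checked numerically; a small sketch (NumPy assumed, data made up for illustration) comparing the normal-equations formula with a library least-squares solver:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 3))   # A in R^{k x n} with k = 20, n = 3
b = rng.standard_normal(20)

# Normal equations: x0 = (A^T A)^{-1} A^T b, solved without forming the inverse
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# Library solver (numerically preferable to forming A^T A explicitly)
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
```

Both routes give the same minimizer here; on ill-conditioned A the library solver is the safer choice.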
Linear programming
minimize c^T x = ∑_j c_j x_j
subject to a_i^T x ≤ b_i, i = 1, ..., m
Solving linear programs
no analytical formula for solution
reliable and efficient algorithms and software
computation time proportional to n^2 m if m ≥ n; less with structure
a mature technology
Using linear programming
easy to recognize, but sometimes not as easy as LS problems
a few standard tricks used to convert problems into linear programs (e.g., problems involving l1- or l∞-norms, piecewise-linear functions)
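A tiny linear program in the form above can be handed to an off-the-shelf solver; a sketch using SciPy's `linprog` (assumed available), with made-up data:

```python
import numpy as np
from scipy.optimize import linprog

# minimize c^T x subject to A_ub x <= b_ub, x >= 0 (illustrative data)
c = np.array([-1.0, -2.0])            # i.e., maximize x1 + 2*x2
A_ub = np.array([[1.0, 1.0],          # x1 + x2 <= 4
                 [1.0, 0.0]])         # x1 <= 2
b_ub = np.array([4.0, 2.0])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
# optimum at x = (0, 4) with objective value -8
```

Consistent with the slide: there is no closed-form formula, but the solver returns the optimal vertex reliably.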
Convex optimization problem
minimize f(x)
subject to g_i(x) ≤ b_i, i = 1, ..., m
objective and constraint functions are convex:
g_i(α1 x + α2 y) ≤ α1 g_i(x) + α2 g_i(y),
if α1 + α2 = 1, α1 ≥ 0, α2 ≥ 0.
includes least-squares problems and linear programs as special cases
Convex optimization problems
Solving convex optimization problems
no analytical solution
reliable and efficient algorithms
computation time (roughly) proportional to max{n^3, n^2 m, F}, where F is the cost of evaluating f(x) and its first and second derivatives
almost a technology
Using convex optimization
often difficult to recognize (requires proof of convexity)
many tricks for transforming problems into convex form
surprisingly many problems can be solved via convex optimization
Nonlinear optimization
Traditional techniques for general nonconvex problems involve compromises
Local optimization methods (nonlinear programming)
find a point that minimizes f among feasible points near it
fast, can handle large problems
require initial guess
provide no information about distance to (global) optimum
Global optimization methods
find the (global) solution
worst-case complexity grows exponentially with problem size
These algorithms are often based on solving convex subproblems
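The local/global trade-off can be illustrated with a simple multistart heuristic: run a local method from several initial guesses and keep the best result. A sketch using SciPy's `minimize` (assumed available); the nonconvex objective is made up:

```python
import numpy as np
from scipy.optimize import minimize

# Nonconvex 1-D objective with several local minima (illustrative)
f = lambda x: 0.1 * x[0] ** 2 + np.sin(3.0 * x[0])

# A single local run depends on the initial guess; multistart keeps the best.
results = [minimize(f, np.array([x0]), method="BFGS")
           for x0 in np.linspace(-3.0, 3.0, 13)]
best = min(results, key=lambda r: r.fun)
```

Multistart improves the odds of hitting the global basin but gives no guarantee, matching the "no information about distance to the global optimum" caveat above.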
Local and global extrema
The following theorem helps determine whether a local solution is a global one:
Theorem (Weierstrass)
A continuous real-valued function f(x) defined on a non-empty compact set S attains a maximum and a minimum: x_min/max ∈ S.
Reminder: a compact set S ⊂ Rn is a closed and bounded set.
Appendix
Affine and Convex Sets
Definition
S ⊂ Rn is affine if [x, y ∈ S, α ∈ R] ⇒ αx + (1 − α)y ∈ S
Definition
S ⊂ Rn is convex if for all [x, y ∈ S, 0 < α < 1] ⇒ z = αx + (1 − α)y ∈ S (z is a convex combination of x and y).
If x1, ..., xm ∈ Rn, ∑_j α_j = 1, α_j > 0, then x = α1 x1 + ... + αm xm is a convex combination of x1, ..., xm.
The intersection of (any number of) convex sets is convex.
Compact Sets
Let Bδ(x0) denote the open ball of radius δ centered at the point x0: Bδ(x0) = {x : ||x − x0|| < δ}.
Definition
A set S ⊂ Rn is said to be open if for each point x0 ∈ S there is δ > 0 such that Bδ(x0) ⊂ S. A set S ⊂ Rn is said to be closed if its complement Rn \ S is open.
Definition
A set S is compact if each of its open covers has a finite subcover: ∀{C_i}_{i∈A} with S ⊂ ∪_{i∈A} C_i, ∃ finite J ⊂ A : S ⊂ ∪_{j∈J} C_j.
Alternative: every sequence in S has a convergent subsequence whose limit lies in S.
Note: If S ⊂ Rn is closed and bounded, then S is compact (Heine–Borel theorem).
Convex functions
Definition
Let C ⊂ Rn be a nonempty convex set. Then f : C → R is convex (on C) if for all x, y ∈ C and all α ∈ (0, 1): f(αx + (1 − α)y) ≤ αf(x) + (1 − α)f(y)
If strict inequality holds whenever x ≠ y, then f is said to be strictly convex. The negative of a (strictly) convex function is called a (strictly) concave function.
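The defining inequality can be spot-checked numerically for a given function; a sketch (NumPy assumed) testing f(x) = ||x||² at random points. A passing check is only evidence, not a proof of convexity:

```python
import numpy as np

f = lambda x: float(np.dot(x, x))   # f(x) = ||x||_2^2, a convex function

rng = np.random.default_rng(1)
ok = True
for _ in range(1000):
    x, y = rng.standard_normal(3), rng.standard_normal(3)
    a = float(rng.uniform(0.0, 1.0))          # alpha in (0, 1)
    # Definition of convexity, allowing a tiny tolerance for rounding error
    if f(a * x + (1 - a) * y) > a * f(x) + (1 - a) * f(y) + 1e-12:
        ok = False
```

A single violating triple (x, y, α) would disprove convexity; no number of passing samples proves it.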
Convex functions
nonnegative multiple: αf is convex if f is convex, α ≥ 0
sum: f1 + f2 is convex if f1, f2 are convex (extends to infinite sums, integrals)
composition with affine function: f(Ax + b) is convex if f is convex
Some univariate convex functions:
1. exponential f(x) = e^{αx} (for all real α)
2. powers f(x) = x^p if x ≥ 0 and 1 ≤ p < ∞
3. powers of absolute value f(x) = |x|^p, if x > 0 and −∞ < p ≤ 0
Concave:
1. powers f(x) = x^p if x ≥ 0 and 0 ≤ p ≤ 1
2. logarithm: f(x) = log x if x > 0.
Differentials
f is differentiable if the domain of f is open and the gradient of f,
∇f(x) ≡ (∂f/∂x_1, ..., ∂f/∂x_n)^T,
exists ∀x ∈ dom(f).
f is twice differentiable if the domain of f is open and the Hessian of f,
H ≡ D²f(x) = [ ∂²f/∂x_1² ... ∂²f/∂x_1∂x_n ; ... ; ∂²f/∂x_n∂x_1 ... ∂²f/∂x_n² ],
exists ∀x ∈ dom(f).
Note: Not all convex functions are differentiable.
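When ∇f is not available in closed form, it can be approximated componentwise by central finite differences; a minimal sketch (NumPy assumed, test function made up) comparing the numerical gradient with the analytic one:

```python
import numpy as np

def num_grad(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h)  # (f(x+h e_i) - f(x-h e_i)) / 2h
    return g

# Illustrative function f(x) = x1^2 + exp(x2), gradient (2*x1, exp(x2))
f = lambda x: x[0] ** 2 + np.exp(x[1])
x = np.array([1.0, 0.5])
g_num = num_grad(f, x)
g_exact = np.array([2.0 * x[0], np.exp(x[1])])
```

This costs 2n function evaluations per gradient, which is one reason zero- and first-order methods are distinguished in the classification above.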
First-order condition
Theorem (gradient inequality)
A differentiable f is convex on a convex set C ⊂ Rn if and only if ∀x, y ∈ C: f(y) ≥ f(x) + (∇f(x))^T (y − x).
Theorem
Minimizing a differentiable convex function f(x) s.t. x ∈ C
⇔
finding x* ∈ C such that (∇f(x*))^T (y − x*) ≥ 0 for all y ∈ C (the variational inequality problem).
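The variational-inequality characterization can be checked on a concrete one-dimensional problem: minimize f(x) = (x − 2)² over C = [0, 1], whose constrained minimizer is x* = 1. A sketch (NumPy assumed):

```python
import numpy as np

grad_f = lambda x: 2.0 * (x - 2.0)   # gradient of f(x) = (x - 2)^2

x_star = 1.0                         # minimizer of f over C = [0, 1]
ys = np.linspace(0.0, 1.0, 101)      # sample points of C
# Variational inequality: grad_f(x*) * (y - x*) >= 0 for all y in C
vi = grad_f(x_star) * (ys - x_star)
```

Here ∇f(x*) = −2 is nonzero (the unconstrained minimum x = 2 lies outside C), yet the inequality holds because every feasible direction from x* points away from the descent direction.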
Second-order condition
Theorem
A twice differentiable f is convex on C ⊂ Rn if and only if the Hessian matrix ∇²f(x) is positive semidefinite for all x ∈ C.
Note: If ∇²f(x) is positive definite for all x ∈ C, then f is strictly convex on C. The converse is false.
Example: consider the function f(x) = x⁴: it is strictly convex on R, yet f''(0) = 0, so its Hessian is not positive definite at x = 0.
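A sketch of the second-order check in practice (NumPy assumed): verify positive semidefiniteness of a constant Hessian via its eigenvalues for a convex quadratic, and note how f(x) = x⁴ has a merely semidefinite second derivative at 0 despite being strictly convex:

```python
import numpy as np

# f(x) = x1^2 + x1*x2 + x2^2 has the constant Hessian below
H = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigs = np.linalg.eigvalsh(H)          # eigenvalues in ascending order
psd = bool(eigs.min() >= 0.0)         # PSD everywhere => f is convex

# f(x) = x^4: second derivative 12*x^2 vanishes at x = 0, so the Hessian
# is only positive semidefinite there, yet f is strictly convex on R.
second_deriv_at_zero = 12.0 * 0.0 ** 2
```

For non-quadratic f the Hessian depends on x, so the eigenvalue test must hold at every point of C, not just one.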