ele539a: optimization of communication systems lecture 18

40
ELE539A: Optimization of Communication Systems Lecture 18: Interior Point Algorithm Professor M. Chiang Electrical Engineering Department, Princeton University March 30, 2007

Upload: others

Post on 03-Feb-2022

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: ELE539A: Optimization of Communication Systems Lecture 18

ELE539A: Optimization of Communication Systems

Lecture 18: Interior Point Algorithm

Professor M. Chiang

Electrical Engineering Department, Princeton University

March 30, 2007

Page 2: ELE539A: Optimization of Communication Systems Lecture 18

Lecture Outline

• Newton’s method

• Equality constrained convex optimization

• Inequality constrained convex optimization

• Barrier function and central path

• Barrier method

• Summary of topics covered in the course

• Final projects

• Conclusions

Page 3: ELE539A: Optimization of Communication Systems Lecture 18

Newton Method

Newton step:

∆xnt = −∇2f(x)−1∇f(x)

Positive definiteness of ∇2f(x) implies that ∆xnt is a descent direction

Interpretation: linearize optimality condition ∇f(x∗) = 0 near x,

∇f(x + v) ≈ ∇f(x) + ∇2f(x)v = 0

Solving this linear equation in v, obtain v = ∆xnt. Newton step is the

addition needed to x to satisfy linearized optimality condition

Page 4: ELE539A: Optimization of Communication Systems Lecture 18

Main Properties

• Affine invariance: given nonsingular T ∈ Rn×n and let f(y) = f(Tx).

Then Newton step for f at y:

∆ynt = T−1∆xnt

and

x + ∆xnt = T (y + ∆ynt)

• Newton decrement:

λ(x) =“

∇f(x)T∇2f(x)−1∇f(x)”1/2

=“

∆xTnt∇

2f(x)∆xnt

”1/2

Let f be second order approximation of f at x. Then

f(x) − p∗ ≈ f(x) − infy

f(y) = f(x) − f(x + ∆xnt) =1

2λ(x)2

Page 5: ELE539A: Optimization of Communication Systems Lecture 18

Newton Method

GIVEN a starting point x ∈ dom f and tolerance ǫ > 0

REPEAT

1. Compute Newton step and decrement: ∆xnt = −∇2f(x)−1∇f(x) and

λ =`

∇f(x)T∇2f(x)−1∇f(x)´1/2

2. Stopping criterion: QUIT if λ2

2≤ ǫ

3. Line search: choose a step size t > 0

4. Update: x := x + t∆x

Advantages of Newton method: Fast, Robust, Scalable

Page 6: ELE539A: Optimization of Communication Systems Lecture 18

Error vs. Iteration for Problems in R100 and R10000

minimize cT x −M

X

i=1

log(bi − aTi x), x ∈ R100and ∈ R10000

0 2 4 6 8 1010

−15

10−10

10−5

100

105

b

x

k

f(x

(k))−

p∗

0 5 10 15 20

10−5

100

105

k

f(x

(k))−

p∗

Page 7: ELE539A: Optimization of Communication Systems Lecture 18

Equality Constrained Problems

Solve a convex optimization with equality constraints:

minimize f(x)

subject to Ax = b

f : Rn → R is twice differentiable

A ∈ Rp×n with rank p < n

Optimality condition: KKT equations with n + p equations in n + p

variables x∗, ν∗:

Ax∗ = b, ∇f(x∗) + AT ν∗ = 0

Approach 1: Can be turned into an unconstrained optimization, after

eliminating the equality constraints

Page 8: ELE539A: Optimization of Communication Systems Lecture 18

Approach 2: Dual Solution

Dual function:

g(ν) = −bT ν − f∗(−AT ν)

Dual problem:

maximize − bT ν − f∗(−AT ν)

Example: solve

minimize −Pn

i=1 log xi

subject to Ax = b

Dual problem:

maximize − bT ν +n

X

i=1

log(AT ν)i

Recover primal variable from dual variable:

xi(ν) = 1/(AT ν)i

Page 9: ELE539A: Optimization of Communication Systems Lecture 18

Example With Analytic Solution

Convex quadratic minimization over equality constraints:

minimize (1/2)xT Px + qT x + r

subject to Ax = b

Optimality condition:

2

4

P AT

A 0

3

5

2

4

x∗

ν∗

3

5 =

2

4

−q

b

3

5

If KKT matrix is nonsingular, there is a unique optimal primal-dual pair

x∗, ν∗

If KKT matrix is singular but solvable, any solution gives optimal x∗, ν∗

If KKT matrix has no solution, primal problem is unbounded below

Page 10: ELE539A: Optimization of Communication Systems Lecture 18

Approach 3: Direct Derivation of Newton Method

Make sure initial point is feasible and A∆xnt = 0

Replace objective with second order Tayler approximation near x:

minimize f(x + v) = f(x) + ∇f(x)T v + (1/2)vT ∇2f(x)v

subject to A(x + v) = b

Find Newton step ∆xnt by solving:

2

4

∇2f(x) AT

A 0

3

5

2

4

∆xnt

w

3

5 =

2

4

−∇f(x)

0

3

5

where w is associated optimal dual variable of Ax = b

Newton’s method (Newton decrement, affine invariance, and stopping

criterion) stay the same

Page 11: ELE539A: Optimization of Communication Systems Lecture 18

Inequality Constrained Minimization

Let f0, . . . , fm : Rn → R be convex and twice continuously differentiable

and A ∈ Rp×n with rank p < n:

minimize f0(x)

subject to fi(x) ≤ 0, i = 1, . . . , m

Ax = b

Assume the problem is strictly feasible and an optimal x∗ exists

Idea: reduce it to a sequence of linear equality constrained problems

and apply Newton’s method

First, need to approximately formulate inequality constrained problem as

an equality constrained problem

Page 12: ELE539A: Optimization of Communication Systems Lecture 18

Barrier Function

Make inequality constraints implicit in the objective:

minimize f0(x) +Pm

i=1 I−(fi(x))

subject to Ax = b

where I− is indicator function:

I−(u) =

8

<

:

0 u ≤ 0

∞ u > 0

No inequality constraints, but objective function not differentiable

Approximate indicator function by a differentiable, closed, and convex

function:

I−(u) = −(1/t) log(−u), dom I− = −R++

where a larger parameter t gives more accurate approximation

I− increases to ∞ as u increases to 0

Page 13: ELE539A: Optimization of Communication Systems Lecture 18

Log Barrier

−3 −2 −1 0 1−5

0

5

10

u

y

Use Newton method to solve approximation:

minimize f0(x) +Pm

i=1 I−(fi(x))

subject to Ax = b

Page 14: ELE539A: Optimization of Communication Systems Lecture 18

Log Barrier

Log barrier function:

φ(x) = −m

X

i=1

log(−fi(x)), domφ = {x ∈ Rn|fi(x) < 0, i = 1, . . . , m}

Approximation better if t is large, but then Hessian of f0 + (1/t)φ varies

rapidly near boundary of feasible set. Accuracy Stability tradeoff

Solve a sequence of approximation with larger t, using Newton method

for each step of the sequence

Gradient and Hessian of log barrier function:

∇φ(x) =

mX

i=1

1

−fi(x)∇fi(x)

∇2φ(x) =

mX

i=1

1

fi(x)2∇fi(x)∇fi(x)T +

mX

i=1

1

−fi(x)∇2fi(x)

Page 15: ELE539A: Optimization of Communication Systems Lecture 18

Central Path

Consider the family of optimization problems parameterized by t > 0:

minimize tf0(x) + φ(x)

subject to Ax = b

Central path: solutions to above problem x∗(t), characterized by:

1. Strict feasibility:

Ax∗(t) = b, fi(x∗(t)) < 0, i = 1, . . . , m

2. Centrality condition: there exists ν ∈ Rp such that

t∇f0(x∗(t)) +m

X

i=1

1

−fi(x∗(t))∇fi(x

∗(t)) + AT ν = 0

Every central point gives a dual feasible point. Let

λ∗

i (t) = −1

tfi(x∗(t)), i = 1, . . . , m ν∗(t) =

ν

t

Page 16: ELE539A: Optimization of Communication Systems Lecture 18

Central Path

Dual function

g(λ∗(t), ν∗(t)) = f0(x∗(t)) − m/t

which implies duality gap is m/t. Therefore, suboptimality gap

f0(x∗(t)) − p∗ ≤ m/t

Interpretation as modified KKT condition:

x is a central point x∗(t) iff there exits λ, ν such that

Ax = b, fi(x) ≤ 0, i = 1, . . . , m

λ � 0

∇f0(x) +m

X

i=1

λi∇fi(x) + AT ν = 0

−λifi(x) = 1/t, i = 1, . . . , m

Complementary slackness is relaxed from 0 to 1/t

Page 17: ELE539A: Optimization of Communication Systems Lecture 18

Example

Inequality form LP:

minimize cT x

subject to Ax � b

Log barrier function:

φ(x) = −m

X

i=1

log(bi − aTi x)

with gradient and Hessian:

∇φ(x) =m

X

i=1

ai

bi − aTi x

= AT d

∇2φ(x) =m

X

i=1

aiaTi

(bi − aTi x)2

= AT diag(d2)A

where di = 1/(bi − aTi x)

Centrality condition becomes:

tc + AT d = 0

Page 18: ELE539A: Optimization of Communication Systems Lecture 18

c is parallel to ∇φ(x).

Therefore, hyperplane cT x∗(t) is tangent to level set of φ

xo xs

c

Page 19: ELE539A: Optimization of Communication Systems Lecture 18

Barrier Method

GIVEN a strictly feasible point x ∈ dom f, t := t(0) > 0, µ > 1 and

tolerance ǫ > 0

REPEAT

1. Centering step: compute x∗(t) by minimizing tf0 + φ subject to

Ax = b, starting at x

2. Update: x := x∗(t)

3. Stopping criterion: QUIT if mt

≤ ǫ

4. Increase t: t := µt

Other names: Sequential Unconstrained Minimization Technique

(SUMT) or path-following method

Usually use Newton method for Centering Step

Page 20: ELE539A: Optimization of Communication Systems Lecture 18

Remarks

• Each centering step does not need to be exact

• Choice of µ: tradeoff number of inner iterations with number of outer

iterations

• Choice of t(0): tradeoff number of inner iterations within the first

outer iteration with number of outer iterations

• Number of centering steps required:

log(m/(ǫt(0)))

log µ

where m is the number of inequality constraints and ǫ is desired accuracy

Page 21: ELE539A: Optimization of Communication Systems Lecture 18

Progress of Barrier Method for an LP Example

0 20 40 60 80

10−6

10−4

10−2

100

102

c1c2 c3

duality

gap

NT. Ite.

Three curves for µ = 2, 50, 150

Page 22: ELE539A: Optimization of Communication Systems Lecture 18

Tradeoff of µ Parameter for a Small LP

0 20 40 60 80 100 120 140 160 180 2000

20

40

60

80

100

120

140

µ

NT

.Ite.

Page 23: ELE539A: Optimization of Communication Systems Lecture 18

Progress of Barrier Method for a GP Example

0 20 40 60 80 100 120

10−6

10−4

10−2

100

102

c1c2c3

duality

gap

NT Ite.

Page 24: ELE539A: Optimization of Communication Systems Lecture 18

Insensitive to Problem Size

0 10 20 30 40 5010

−4

10−2

100

102

104

c1 c2 c3

duality

gap

NT. Ite. 101

102

103

15

20

25

30

35

m

NT

.Ite.

Three curves for m = 50, 500, 1000, n = 2m.

Page 25: ELE539A: Optimization of Communication Systems Lecture 18

Phase I Method

How to compute a strictly feasible point to start barrier method?

Consider a phase I optimization problem in variables x ∈ Rn, s ∈ R:

minimize s

subject to fi(x) ≤ s, i = 1, . . . , m

Ax = b

Strictly feasible point: for any x(0), let s = max fi(x(0))

1. Apply barrier method to solve phase I problem (stop when s < 0)

2. Use the resulted strictly feasible point for the original problem to

start barrier method for the original problem

Page 26: ELE539A: Optimization of Communication Systems Lecture 18

Not Covered

• Self-concordance analysis and complexity analysis (polynomial time)

• Numerical linear algebra (large scale implementation)

• Generalized inequalities (SDP)

Other (sometimes more efficient) algorithms:

• Primal-dual interior point method

• Ellipsoid methods

• Analytic center cutting plane methods

Page 27: ELE539A: Optimization of Communication Systems Lecture 18

Lecture Summary

• Convert equality constrained optimization into unconstrained

optimization

• Solve a general convex optimization with interior point methods

• Turn inequality constrained problem into a sequence of equality

constrained problems that are increasingly accurate approximation of

the original problem

• Polynomial time (in theory) and much faster (in practice): about

25-50 least-squares effort for a wide range of problem sizes

Readings: Chapters 9.1-9.3, 9.5, 10.1-10.2, 11.1-11.4 in Boyd and

Vandenberghe

Page 28: ELE539A: Optimization of Communication Systems Lecture 18

Why Take This Course

• Learn the tools and mentality of optimization (surprisingly useful for

other study you may engage in later on)

• Learn classic and recent results on optimization of communication

systems (over a surprisingly wide range of problems)

• Enhance the ability to do original research in academia or industry

• Have a fun and productive time on the final project

Page 29: ELE539A: Optimization of Communication Systems Lecture 18

Where We Started

Application

Presentation

Session

Transport

Network

Link

Physical

Source SourceEncoder

ChannelEncoder

Modulator

Channel

DemodulatorChannel Decoder

SourceDecoder

Desti.

minimize f(x)

subject to x ∈ C

Page 30: ELE539A: Optimization of Communication Systems Lecture 18

Topics Covered: Methodologies

• Nonlinear optimization (linear, convex, nonconvex), Pareto

optimization, Dynamic programming, Integer programming

• Convex optimization, convex sets and convex functions

• Lagrange duality and KKT optimality condition

• LP, QP, QCQP, SOCP, SDP, GP, SOS

• Gradient method, Newton’s method, interior point method

• Distributed algorithms and decomposition methods

• Nonconvex optimization and relaxations

• Stochastic network optimization and Lyapunov function

• Robust LP and GP

• Network flow problems

• NUM and generalized NUM

Page 31: ELE539A: Optimization of Communication Systems Lecture 18

Topics Covered: Applications

• Information theory problems: channel capacity and rate distortion for

discrete memoryless models

• Detection and estimation problems: MAP and ML detector,

multi-user detection

• Physical layer: DSL spectrum management

• Wireless networks: resource allocation, power control, joint power and

rate allocation

• Network algorithms and protocols: multi-commodity flow problems,

max flow, shortest path routing, network rate allocation, TCP

congestion control, IP routing

• Reverse engineering, heterogeneous protocols

• Congestion, collision, interference management

• Layering as optimization decomposition

• Eigenvalue and norm optimization problems

Page 32: ELE539A: Optimization of Communication Systems Lecture 18

Some of the Topics Not Covered

• Circuit switching and optimal routing

• Packet switch architecture and algorithms

• Multi-user information theory problems

• More digital signal processing like blind equalization

• Large deviations theory and queuing theory applications

• System stability and control theoretic improvement of TCP

• Inverse optimization, robust optimization

• Self concordance analysis of interior point method’s complexity

• Ellipsoid method, cutting plane method, simplex method ...

Page 33: ELE539A: Optimization of Communication Systems Lecture 18

Final Projects: Timeline

March 30 – May 24:

• Talk to me as often as needed

• Start early

April 27: Interim project report due

Mid May: Project Presentation Day

May 24: Final project report due

Many student continue to extend the project

Page 34: ELE539A: Optimization of Communication Systems Lecture 18

Final Projects: Topics

Wenjie Jiang: Content distribution networking

Suhas Mathur: Wireless network secuirty

Hamed Mohsenian: Multi-channel MAC

Martin Suchara: Stochastic simulation for TCP/IP

Guanchun Wang: Google with feedback

Yihong Wu: Practical stepsize

Yiyue Wu: Coding

Yongxin Xi: Image analysis

Minlan Yu: Network embedding

Dan Zhang: To be decided

Page 35: ELE539A: Optimization of Communication Systems Lecture 18

The Optimization Mentality

• What can be varied and what cannot?

• What’re the objectives?

• What’re the constraints?

• What’s the dual?

Build models, analyze models, prove theorems, compute solutions,

design systems, verify with data, refine models ...

Warnings:

• Remember the restrictive assumptions for application needs

• Understand the limitations: discrete, nonconvex, complexity,

robustness ... (still a lot can be done for these difficult issues)

Page 36: ELE539A: Optimization of Communication Systems Lecture 18

Optimization of Communication Systems

Three ways to apply optimization theory to communication systems:

• Formulate the problem as an optimization problem

• Interpret a given solution as an optimizer/algorithm for an

optimization problem

• Extend the underlying theory by optimization theoretic techniques

A remarkably powerful, versatile, widely applicable and not yet fully

recognized framework

Applications in communication systems also stimulate new

developments in optimization theory and algorithms

Page 37: ELE539A: Optimization of Communication Systems Lecture 18

Optimization of Communication Systems

OPT

OPT

OPT

Theories

SolutionsProblems

Page 38: ELE539A: Optimization of Communication Systems Lecture 18

Optimization Beyond Optimality

• Modeling (reverse engineering, fairness, etc.)

• Architecture

• Robustness

• Design for Optimizability

Page 39: ELE539A: Optimization of Communication Systems Lecture 18

This Is The Beginning

Quote from Thomas Kailath:

A course is not about what you’ve covered, but what you’ve uncovered

Page 40: ELE539A: Optimization of Communication Systems Lecture 18

To Everyone In ELE539A

You’ve Been Amazing