optimization methods in machine learning...optimization methods for convex problems • interior...

75
Optimization Methods in Machine Learning Katya Scheinberg Lehigh University [email protected]

Upload: others

Post on 02-Sep-2020

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Optimization Methods in Machine Learning

TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: AAAAAAAAAAAAAAAA

Katya Scheinberg Lehigh University

[email protected]

Page 2: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Primal Semidefinite Programming Problem

Page 3: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Dual Semidefinite Programming Problem

Primal Semidefinite Programming Problem

Page 4: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Duality gap and complementarity

HW: prove the last statement

Page 5: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Complementarity of eignevalues

Page 6: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Complementarity of eigenvalues

Page 7: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Optimality conditions

Convex QP with linear equality constraints.

Closed form solution via solving a linear system

Page 8: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Optimality conditions

Convex QP with linear inequality constraints.

No closed form solution

Page 9: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Nonlinear Constraints, linear objective:

Convex Quadratically Constrained Quadratic Problems

Feasible set can be described as a convex cone Å affine set

Page 10: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Second Order Cone

x= (x0, x1, . . . , xn ), x̄= (x1, . . . , xn )

K2 Rn+1 is a second order cone:

x1

x2

x0

Page 11: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Discovering SOCP cone

A convex quadratic constraint:

Factorize and rewrite:

Norm constraint

More general form

Variable substitution

SOCP:

Page 12: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Second Order Cone Programming

Page 13: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Complementarity Conditions

Page 14: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Formulating SOCPs

Rotated SOCP cone Equivalent to SOCP cone Example:

Page 15: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Unconstrained Optimization

Page 16: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Traditional methods

•  Gradient descent •  Newton method •  Quazi-Newton method •  Conjugate gradient method

Page 17: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Slides from L. Vandenberghe http://www.ee.ucla.edu/~vandenbe/ee236c.html

Page 18: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Slides from L. Vandenberghe http://www.ee.ucla.edu/~vandenbe/ee236c.html

Page 19: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Slides from L. Vandenberghe http://www.ee.ucla.edu/~vandenbe/ee236c.html

Page 20: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Slides from L. Vandenberghe http://www.ee.ucla.edu/~vandenbe/ee236c.html

Page 21: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Slides from L. Vandenberghe http://www.ee.ucla.edu/~vandenbe/ee236c.html

Page 22: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Slides from L. Vandenberghe http://www.ee.ucla.edu/~vandenbe/ee236c.html

Page 23: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Slides from L. Vandenberghe http://www.ee.ucla.edu/~vandenbe/ee236c.html

Page 24: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Slides from L. Vandenberghe http://www.ee.ucla.edu/~vandenbe/ee236c.html

Page 25: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Slides from L. Vandenberghe http://www.ee.ucla.edu/~vandenbe/ee236c.html

Page 26: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Slides from L. Vandenberghe http://www.ee.ucla.edu/~vandenbe/ee236c.html

Page 27: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Slides from L. Vandenberghe http://www.ee.ucla.edu/~vandenbe/ee236c.html

Page 28: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Slides from L. Vandenberghe http://www.ee.ucla.edu/~vandenbe/ee236c.html

Page 29: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Slides from L. Vandenberghe http://www.ee.ucla.edu/~vandenbe/ee236c.html

Page 30: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Slides from L. Vandenberghe http://www.ee.ucla.edu/~vandenbe/ee236c.html

Page 31: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Slides from L. Vandenberghe http://www.ee.ucla.edu/~vandenbe/ee236c.html

Page 32: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Slides from L. Vandenberghe http://www.ee.ucla.edu/~vandenbe/ee236c.html

Page 33: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Slides from L. Vandenberghe http://www.ee.ucla.edu/~vandenbe/ee236c.html

Page 34: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Slides from L. Vandenberghe http://www.ee.ucla.edu/~vandenbe/ee236c.html

Page 35: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Interior Point Methods

Page 36: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Interior Point Methods: a history

² Ellipsoid Method, Nemirovskii, 1970’s. No complexity result.

² Polynomial Ellipsoid Method for LP, Khachian 1979. Not practical.

² Karmarkar’s method, 1984, first “efficient” interior point method.

² Primal-dual path following methods and others late 1980’s. Very efficentpractical methods.

² Extensions to other classes of convex problems. Early 1990’s.

² General theory of interior point methods, self-concordant barriers, Nes-terov and Nemirovskii, 1990’s.

Page 37: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Self-concordant barrier

Page 38: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Log barrier for LP

Page 39: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Log-barrier for SDP

Page 40: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Log barrier for SOCP

Page 41: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Dual Linear Programming Problem

Primal Linear Programming Problem

Page 42: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Optimality (KKT) conditions

Page 43: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Central Path

Apply Newton method to the (self-concordant) barrier problem (i.e. to its optimality conditions)

Apply one or two steps of Newton method for a given µ and then reduce µ

Page 44: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

KKT conditions for primal central path

Page 45: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Central Path

Optimality conditions for the barrier problem

Apply Newton method to the system of nonlinear equations

Page 46: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Central Path

-c

It exists iff there is nonempty interior for the primal and dual problems.

Page 47: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Interior point methods, the main idea

•  Each point on the central path can be approximated by

applying Newton method to the perturbed KKT system.

•  Start at some point near the central path for some value of µ, reduce µ.

•  Make one or more Newton steps toward the solution with the new value of µ.

•  Keep driving µ to 0, always staying close to the solutions

of the central path.

•  This prevents the iterates from getting trapped near the boundary and keeps them nicely central.

Page 48: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

-c

Page 49: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

KKT conditions for dual and primal-dual central paths

Page 50: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Newton step

Primal method

Primal-dual method

Dual method

-c

Page 51: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Predictor-Corrector steps

¾ = 0 for predictor step and ¾ > 0 for corrector step.

Solve the system of linear equations twice with the same matrix

Page 52: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Predictor-Corrector steps

Augmented system

Page 53: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Solving the augmented system

x x

x x

x x

x

x

x

x x

x

x x

x

x

x

x

x

x

x x x x

x x x

x x

x

x x

x

x

Normal equation

x

x x

x x

x x

Page 54: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Cholesky Factorization

x

x x

x x

x

x

x x

x

x x

x

x x x

x x

x x

x x

x x

x x

x

x x

x x

x x

x x

x x

x x

x x

x

x x

=

•  Numerically very stable!

• The sparsity pattern of L remains the same at each iteration

• Depends on sparsity pattern of A and ordering of rows of A

• Can compute the pattern in advance (symbolic factorization)

• The work for each factorization depends on sparsity pattern, can be as little as O(n) if very sparse and as much as O(n^3) (if dense).

Page 55: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Complexity per iteration

Page 56: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Complexity and performance

Page 57: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Optimality conditions

Convex QP with linear inequality constraints.

Page 58: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Interior Point method

Page 59: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Newton Step

Page 60: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Complexity per iteration

Page 61: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Dual Semidefinite Programming Problem

Primal Semidefinite Programming Problem

Page 62: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Duality gap and complementarity

Page 63: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Central Path

Page 64: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Central Path

Page 65: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Central Path

Dual CP

Primal-Dual CP

Symmetric Primal-Dual

Page 66: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Computing a step

Page 67: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Computing a step

Page 68: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Cholesky factorization

Each iteration may require O(n6) operations and O(n4) memory.

Page 69: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Second Order Cone Programming

Page 70: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Complementarity Conditions

Page 71: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Log-barrier formulation

Page 72: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Perturbed optimality conditions

Page 73: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Newton step

Page 74: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Optimization methods for convex problems •  Interior Point methods

–  Best iteration complexity O(log(1/²)), in practice <50. –  Worst per-iteration complexity (sometimes prohibitive)

•  Active set methods –  Exponential complexity in theory, often linear in practice. –  Better per iteration complexity.

•  Gradient based methods –  or O(1/²) iterations –  Matrix/vector multiplication per iteration

•  Nonsmooth gradient based methods –  O(1/²) or O(1/²2) iterations –  Matrix/vector multiplication per iteration

•  Block coordinate descent –  Iteration complexity ranges from unknown to similar to FOMs. –  Per iteration complexity can be constant.

Page 75: Optimization Methods in Machine Learning...Optimization methods for convex problems • Interior Point methods – Best iteration complexity O(log(1/²)), in practice

Homework 1.

2.

3.