Complementarity Formulations for L0-norm Optimization Problems¹
John E. Mitchell
Department of Mathematical Sciences, RPI, Troy, NY 12180 USA
SIAM Conference on Optimization, San Diego, May 2014
¹ Joint work with Mingbin Feng, Jong-Shi Pang, Xin Shen, and Andreas Wächter. Supported by AFOSR and NSF.
Mitchell (RPI) Complementarity for L0 SIOPT 2014 1 / 22
Outline
1 Introduction and Applications
2 L0-norm minimization
3 Conclusions
Introduction and Applications
L0-norm minimization
Want to solve:
\[ \min_x \{ f(x) + \gamma \|x\|_0 : Ax \ge b,\ x \in \mathbb{R}^n \} \]
where $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, and $\|x\|_0 = \mathrm{card}\{x_i : x_i \neq 0\}$.
The parameter γ reflects the relative importance of sparsity and f (x).
L0-norm minimization is of interest in feature selection, compressed sensing, misclassification minimization, and almost any problem where a sparse solution is desired.
Can be modeled as a Mathematical Program with Complementarity Constraints (MPCC).
Eg: Misclassification minimization
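Given $(C, \gamma, \varepsilon) > 0$, and points $x^i$ with observed values $y_i$:
\[
\min_{(w,b)} \;
\underbrace{C \sum_{i=1}^{n} \max\{\, |w^T x^i + b - y_i| - \varepsilon,\ 0 \,\} + \tfrac{1}{2}\, w^T w}_{\text{standard SVM objective}}
\; + \; \gamma\, \underbrace{\sum_{i=1}^{n}
\begin{cases}
1 & \text{if } |w^T x^i + b - y_i| > \varepsilon \\
0 & \text{if } |w^T x^i + b - y_i| \le \varepsilon
\end{cases}}_{\text{number of misclassified points}}
\]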
(Mangasarian)
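The objective above can be evaluated directly for a candidate (w, b). The following is a minimal sketch, not the authors' code; the function name `misclassification_objective` and its argument names are hypothetical, chosen only for illustration.

```python
def misclassification_objective(w, b, points, targets, C, gamma, eps):
    """Evaluate C*sum(max(|w.x + b - y| - eps, 0)) + 0.5*w.w + gamma*(#misclassified).

    Returns the objective value and the number of misclassified points,
    where a point counts as misclassified when its residual exceeds eps.
    """
    hinge = 0.0
    misclassified = 0
    for x, y in zip(points, targets):
        residual = abs(sum(wj * xj for wj, xj in zip(w, x)) + b - y)
        hinge += max(residual - eps, 0.0)
        if residual > eps:
            misclassified += 1
    regularizer = 0.5 * sum(wj * wj for wj in w)
    return C * hinge + regularizer + gamma * misclassified, misclassified


# Small illustrative data set (made up for this sketch).
value, count = misclassification_objective(
    w=[1.0, 0.0], b=0.0,
    points=[[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]],
    targets=[1.0, 2.5, 5.0],
    C=1.0, gamma=10.0, eps=0.5,
)
```

The counting term is what makes the problem an L0-norm problem: it is the number of nonzero entries of the vector of ε-insensitive residual excesses.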
Eg: Minimum portfolio revision
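Given portfolio $x^0$ in the standard unit simplex $\Delta_n \triangleq \{x \in \mathbb{R}^n_+ \mid \mathbf{1}^T x = 1\}$,
consider a portfolio revision problem as follows:
\[
\begin{array}{ll}
\displaystyle \min_{x \in \Delta_n} &
\underbrace{\tfrac{c}{2}\, \underbrace{x^T V x}_{\text{variance}} + c'\, \mathrm{VaR}_\beta(x)}_{\text{composite risk measure}}
+ \underbrace{K\, \| x - x^0 \|_0}_{\text{portfolio revision cost}} \\[1ex]
\text{subject to} & \mu^T x \ge R \quad \text{(expected portfolio return)},
\end{array}
\]
where $\mathrm{VaR}_\beta(x)$ is the $\beta$-value-at-risk of the portfolio $x$ with $\beta \in (0,1)$.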
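As a rough illustration of how the revision cost enters, the sketch below evaluates the objective for a candidate portfolio. The slides do not say how VaR is computed, so it is passed in as a user-supplied callable `var_beta`; that name, `revision_objective`, and the tolerance `tol` are all assumptions of this sketch.

```python
def revision_objective(x, x0, V, c, c_prime, K, var_beta, tol=1e-9):
    """Evaluate (c/2) x'Vx + c' VaR_beta(x) + K * ||x - x0||_0.

    var_beta is a hypothetical callable returning the value-at-risk of x;
    positions that differ from x0 by more than tol count as revised.
    """
    n = len(x)
    variance = sum(x[i] * V[i][j] * x[j] for i in range(n) for j in range(n))
    revised = sum(1 for xi, x0i in zip(x, x0) if abs(xi - x0i) > tol)
    return 0.5 * c * variance + c_prime * var_beta(x) + K * revised


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
objective = revision_objective(
    x=[0.5, 0.5, 0.0], x0=[0.5, 0.25, 0.25], V=identity,
    c=2.0, c_prime=1.0, K=3.0,
    var_beta=lambda x: 0.0,  # placeholder risk term, for illustration only
)
```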
L0-norm minimization Formulation
Complementarity formulation
Want to solve:
\[ \min_x \{ f(x) + \gamma \|x\|_0 : Ax \ge b,\ x \in \mathbb{R}^n \} \]
Equivalently:
\[
\begin{array}{ll}
\displaystyle \min_{x,\,\xi} & f(x) + \gamma \sum_{i=1}^{n} (1 - \xi_i) \\[1ex]
\text{subject to} & Ax \ge b \\
& 0 \le \xi \le 1 \\
& 0 \le \xi \perp x
\end{array}
\]
Note that if $x_i = 0$ then we can choose $\xi_i = 1$, and if $x_i \neq 0$ then we must set $\xi_i = 0$.
Thus, at an optimal $\xi$, the sum $\sum_{i=1}^{n} (1 - \xi_i)$ counts the number of nonzero components.
No need for any big-M terms in this LPCC formulation.
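The counting argument can be checked numerically. The sketch below, with hypothetical helper names, builds the best ξ for a given x and confirms that the sum of (1 − ξ_i) equals the number of nonzero components while ξ_i x_i = 0 holds for every i.

```python
def best_xi(x, tol=0.0):
    """Largest feasible xi for a fixed x: 1 where x_i is zero, 0 elsewhere."""
    return [1.0 if abs(xi) <= tol else 0.0 for xi in x]


def sparsity_term(xi):
    """The sum of (1 - xi_i) that appears in the objective."""
    return sum(1.0 - v for v in xi)


def is_complementary(x, xi, tol=0.0):
    """Check 0 <= xi <= 1 and xi_i * x_i = 0 for every component."""
    return all(0.0 <= v <= 1.0 and abs(v * xj) <= tol for v, xj in zip(xi, x))


x = [0.0, 1.5, 0.0, -2.0]
xi = best_xi(x)
nonzeros = sum(1 for v in x if v != 0.0)
```

Any other feasible ξ has smaller entries on the zero components of x, so it can only increase the sum; this is why minimizing over ξ recovers the L0-norm.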
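The formulation with the constraint $0 \le \xi \perp x$ is a half-complementary formulation; can also get a full-complementary formulation by splitting $x = x^+ - x^-$:
\[
\begin{array}{ll}
\displaystyle \min_{x^{\pm},\,\xi} & f(x) + \gamma \sum_{i=1}^{n} (1 - \xi_i) \\[1ex]
\text{subject to} & Ax \ge b \\
& 0 \le \xi \le 1 \\
& 0 \le \xi \perp x^+ + x^- \ge 0, \quad 0 \le x^+ \perp x^- \ge 0, \quad x = x^+ - x^-
\end{array}
\]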
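A small sketch of the splitting, assuming the usual choice of positive and negative parts; the function names are hypothetical. It checks the two complementarity conditions of the full formulation for a given ξ.

```python
def split(x):
    """Split x into nonnegative parts with x = x_plus - x_minus."""
    x_plus = [max(v, 0.0) for v in x]
    x_minus = [max(-v, 0.0) for v in x]
    return x_plus, x_minus


def check_full_complementarity(x_plus, x_minus, xi, tol=0.0):
    """Check 0 <= xi <= 1, x_plus, x_minus >= 0,
    xi_i * (x_plus_i + x_minus_i) = 0 and x_plus_i * x_minus_i = 0."""
    for p, m, z in zip(x_plus, x_minus, xi):
        if p < 0 or m < 0 or not (0 <= z <= 1):
            return False
        if abs(p * m) > tol or abs(z * (p + m)) > tol:
            return False
    return True


x = [3.0, -2.0, 0.0]
x_plus, x_minus = split(x)
rebuilt = [p - m for p, m in zip(x_plus, x_minus)]
```

With this split, x_plus_i + x_minus_i equals |x_i|, so both sides of every complementarity pair are sign-constrained, which is the form most NLP solvers with complementarity support expect.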
KKT points
The Constant Rank Constraint Qualification holds. It also holds under additional assumptions with nonlinear constraints.
Let x satisfy Ax ≥ b, and assume x minimizes f(x) for some assignment of the complementarities. Set ξ to count the number of nonzero components ($\xi_i = 1$ if $x_i = 0$, and $\xi_i = 0$ otherwise).
This point is a local minimizer and a strongly stationary point of the MPCC formulation.
Relaxed NLP formulation
Let $\varepsilon > 0$. Assume $f(x) \equiv 0$. Can get a relaxed formulation:
\[
\begin{array}{ll}
\displaystyle \min_{x^{\pm},\,\xi} & \displaystyle \sum_{i=1}^{n} (1 - \xi_i) \\[1ex]
\text{subject to} & A x^+ - A x^- \ge b \\
& 0 \le x^+,\ x^- \\
& 0 \le \xi \le 1 \\
& \xi_i \,(x^+_i + x^-_i) \le \varepsilon, \quad i = 1, \ldots, n
\end{array}
\]
CQ holds for this problem.
Not every x is a local minimizer.
A sequence of global minimizers as ε → 0 will have a subsequence that converges to a global minimizer of the L0-norm problem.
Under certain conditions, the solution to the relaxed formulation is a solution to a reweighted version of the L1-norm minimization problem.
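To see how the relaxation behaves for a fixed x, note that each ξ_i can be pushed up to min(1, ε / (x⁺_i + x⁻_i)). The sketch below (hypothetical function names, not the authors' code) evaluates the resulting objective and illustrates that it approaches the number of nonzeros as ε shrinks.

```python
def relaxed_xi(x_plus, x_minus, eps):
    """Best xi for fixed x_plus, x_minus under xi_i * (x_plus_i + x_minus_i) <= eps."""
    xi = []
    for p, m in zip(x_plus, x_minus):
        s = p + m
        # xi_i is capped at 1, and at eps / s once s exceeds eps.
        xi.append(1.0 if s <= eps else eps / s)
    return xi


def relaxed_objective(x_plus, x_minus, eps):
    """Value of sum(1 - xi_i) at the best xi for the given point."""
    return sum(1.0 - v for v in relaxed_xi(x_plus, x_minus, eps))


x_plus = [2.0, 0.0, 0.0]
x_minus = [0.0, 0.5, 0.0]
coarse = relaxed_objective(x_plus, x_minus, 0.1)
fine = relaxed_objective(x_plus, x_minus, 1e-9)
```

For components with |x_i| > ε the contribution is 1 − ε / |x_i|, which is smooth in x; this smoothness is what makes the relaxed problem tractable for a standard NLP solver.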
Penalty method
Can penalize violations of complementarity in many ways:
• Add $(x^+ + x^-)^T \xi$ to the objective function.
• Add $(x^+ + x^-)^T \xi + (x^+)^T x^-$ to the objective function. Corresponds to the AMPL keyword “complements” when using KNITRO.
• Add $\left((x^+ + x^-)^T \xi\right)^2$ to the objective function.
• Add $\left((x^+ + x^-)^T \xi + (x^+)^T x^-\right)^2$ to the objective function.
• Use the Fischer-Burmeister function, or some other NCP function.
Theoretical investigation is ongoing.
Surprisingly, the fourth formulation has worked well in our tests.
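The penalty terms listed above are simple to write down. The sketch below computes the four inner-product penalties for a given point, plus one common form of the Fischer-Burmeister function; the dictionary keys and function names are made up for this illustration, and any penalty weight multiplying these terms is left to the caller.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def penalty_terms(x_plus, x_minus, xi):
    """Return the four complementarity penalties, in the order listed on the slide."""
    s = [p + m for p, m in zip(x_plus, x_minus)]
    first = dot(s, xi)                      # (x+ + x-)' xi
    second = first + dot(x_plus, x_minus)   # ... + (x+)' x-
    return {
        "linear_xi": first,
        "linear_full": second,
        "squared_xi": first ** 2,
        "squared_full": second ** 2,        # the "fourth formulation"
    }


def fischer_burmeister(a, b):
    """phi(a, b) = sqrt(a^2 + b^2) - a - b; zero exactly when a, b >= 0 and a*b = 0."""
    return math.sqrt(a * a + b * b) - a - b


terms = penalty_terms([1.0, 0.0], [0.5, 0.0], [0.5, 1.0])
```

All four inner-product penalties vanish exactly when the point is complementary, given nonnegative x⁺, x⁻ and ξ, so they differ only in how they behave away from feasibility.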
L0-norm minimization Computational Results
Test problems
Examine three classes of problems:
• Minimize L0-norm, without another objective f (x).
• Look at weighted combinations of f (x) and L0-norm.
• Look at signal recovery problems.
In each case, compare solutions obtained by a nonlinear programming package (KNITRO, SNOPT, CONOPT, MINOS) for our complementarity formulation with an L1-norm formulation.
Have no guarantee of global optimality.
L0-norm minimization Minimize L0-norm
Minimize number of nonzeros
\[ \min_{x \in \mathbb{R}^n} \ \|x\|_{0,1} \quad \text{s.t.} \quad Ax \ge b. \]
Look at four formulations:
MILP: integer programming formulation with a big-M
L1: L1-approximation solved as an LP. Objective reweighted iteratively (Candès et al.).
fullcomp: complementarity formulation solved as NLP
halfcomp: modified complementarity formulation, solved as NLP
Test problems: A and b generated randomly, each entry drawn from U[−1,1].
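The slides do not write out the MILP comparison model. One standard big-M model for this problem, given here only as an assumed sketch, introduces binary indicators $z_i$ and a constant $M$ that is presumed to be a valid bound on every $|x_i|$ at some optimal solution:

```latex
\[
\begin{array}{ll}
\displaystyle \min_{x,\,z} & \displaystyle \sum_{i=1}^{n} z_i \\[1ex]
\text{subject to} & Ax \ge b \\
& -M z_i \le x_i \le M z_i, \quad i = 1, \ldots, n \\
& z_i \in \{0, 1\}, \quad i = 1, \ldots, n
\end{array}
\]
```

Here $z_i = 0$ forces $x_i = 0$, so the objective counts the nonzero components. The complementarity formulations avoid having to choose $M$.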
A is 30 × 50: 50 test problems, various solvers and start points
A is 300 × 500: performance profile of sparsity
L0-norm minimization Pareto frontier
Weighted combinations
\[ \min_{x \in \mathbb{R}^n} \ q(x) + \beta \|x\|_{0,1} \quad \text{s.t.} \quad Ax \ge b. \]
Solve for each norm for various different choices of β.
In our tests, A is 30 × 50 and q(x) is a strictly convex quadratic function.
The L0-norm formulations are solved using KNITRO under AMPL.
Pareto frontier
L0-norm minimization Signal recovery
Signal recovery
\[ \min_{x \in \mathbb{R}^n} \ \|Ax - b\|_2^2 + \beta \|x\|_{0,1} \]
Look at different choices of regularizing parameter β.
A is 256 × 1024. In original signal, 40 entries of x are nonzero. Random noise is added to give b.
L1-norm results are improved by debiasing: fix zeroes in L1 solution at zero, then look for best least squares fit. L1-debiased leads to many small nonzero components in our tests.
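The debiasing step can be sketched as a least squares fit restricted to the support of the L1 solution. The version below is an illustrative pure-Python sketch, not the authors' implementation: it solves the normal equations by Gaussian elimination, which assumes the columns of A on the support are linearly independent. The names `debias`, `solve_linear`, and the threshold `tol` are hypothetical.

```python
def solve_linear(M, rhs):
    """Solve M z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular on this support")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - tail) / aug[i][i]
    return z


def debias(A, b, x_l1, tol=1e-8):
    """Keep the zeros of x_l1 fixed and refit the remaining entries by least squares."""
    support = [j for j, v in enumerate(x_l1) if abs(v) > tol]
    m, k = len(A), len(support)
    # Normal equations (A_S' A_S) z = A_S' b on the support S.
    gram = [[sum(A[i][support[p]] * A[i][support[q]] for i in range(m))
             for q in range(k)] for p in range(k)]
    rhs = [sum(A[i][support[p]] * b[i] for i in range(m)) for p in range(k)]
    z = solve_linear(gram, rhs) if k else []
    x = [0.0] * len(x_l1)
    for p, j in enumerate(support):
        x[j] = z[p]
    return x


# Tiny noiseless example: the true signal is [2, 0, -1]; the L1 estimate is shrunk.
A = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 1.0, 0.0]]
b = [1.0, -1.0, 2.0]
x_debiased = debias(A, b, x_l1=[1.5, 0.0, -0.5])
```

For problems of the size used in the talk, a library least squares routine would be the more practical choice than hand-written elimination.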
Signal recovery sparsity
[Figure: number of nonzeros in the optimal solution (log scale, 10^0 to 10^4) against the regularizing parameter (log scale, 10^−6 to 10^0), for the L0, L1, and L1-debiased formulations, with a reference line at y = 40.]
Conclusions
A complementarity formulation can be very effective at finding sparse solutions to L0-norm optimization problems, far more quickly than an MILP approach but more slowly than an L1-norm LP approach.
In the first class of test problems (minimizing the L0-norm alone), it outperformed an L1-norm formulation even with sophisticated iterative schemes for the L1 approach.
For Pareto frontier, get better results than L1 for small instances. So far, these results haven’t extended well to larger instances.
For signal recovery, able to recover signal effectively. Currently investigating different distributions and test instances.
At present, we only have a very partial understanding as to why the L0-norm approach works so well. Theoretical investigation is ongoing!