
Fast algorithms for nonconvex compressive sensing

Rick Chartrand

Los Alamos National Laboratory

New Mexico Consortium

February 25, 2009

Slide 1 of 16


Outline

Motivating example

Nonconvex compressive sensing

Algorithms

Summary

Slide 2 of 16


Motivating example

Sparse tomography

Suppose we want to reconstruct an image from samples of its Fourier transform. How many samples do we need?

Consider radial sampling, such as in MRI or (roughly) CT.

[Figure: the Shepp-Logan phantom, with the sampling set Ω drawn as radial lines in the Fourier domain.]

Slide 3 of 16


Motivating example

Nonconvexity

Fewer measurements are needed with nonconvex minimization:

$\min_u \|\nabla u\|_p^p$, subject to $(\mathcal{F}u)|_\Omega = (\mathcal{F}x)|_\Omega$.

With $p = 1$, the solution is $u = x$ from 18 lines ($|\Omega|/|x| = 6.9\%$).

With $p = 1/2$, 10 lines suffice ($|\Omega|/|x| = 3.8\%$).

[Figure: backprojection from 18 lines; $p = 1$, 18 lines; $p = 1/2$, 10 lines; $p = 1$, 10 lines.]
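As a rough check of these sampling ratios, here is a minimal numpy sketch of a radial-line mask; the exact line geometry used in the talk may differ, so the printed fractions only approximate the percentages quoted above:

```python
import numpy as np

def radial_mask(n, num_lines):
    """Boolean mask of num_lines radial lines through the center of an
    n x n Fourier grid -- a rough stand-in for the sampling set Omega."""
    mask = np.zeros((n, n), dtype=bool)
    c = (n - 1) / 2.0
    r = np.arange(n) - c                       # signed position along each line
    for theta in np.linspace(0.0, np.pi, num_lines, endpoint=False):
        i = np.round(c + r * np.sin(theta)).astype(int)
        j = np.round(c + r * np.cos(theta)).astype(int)
        mask[i, j] = True
    return mask

for lines in (18, 10, 9):
    m = radial_mask(256, lines)
    print(f"{lines} lines: {100 * m.mean():.1f}% of Fourier coefficients")
```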

Slide 4 of 16


Motivating example

New results

These are old results (Mar. 2006); what’s new?

- Reconstruction (to 50 dB) in 13 seconds (versus a literature-best 1–3 minutes).

- Exact reconstruction from 9 lines (3.5% of the Fourier transform).

[Figure: reconstruction from 10 lines (the fastest 10-line recovery) and from 9 lines (recovery from the fewest samples).]

Slide 5 of 16


Nonconvex compressive sensing

Optimization for sparse recovery

- Let $x \in \mathbb{R}^N$ be sparse: $\|\Psi x\|_0 = K$, $K \ll N$.

- Suppose $\Phi$ is an $M \times N$ matrix, $M \ll N$, with $\Phi$ and $\Psi$ incoherent. For example, $\Phi = (\varphi_{ij})$ with i.i.d. $\varphi_{ij} \sim N(0, \sigma^2)$.

$\min_u \|\Psi u\|_0$, s.t. $\Phi u = \Phi x$: the unique solution is $u = x$ with optimally small $M$ ($M \ge 2K$ suffices w.h.p.), but the problem is NP-hard.

$\min_u \|\Psi u\|_1$, s.t. $\Phi u = \Phi x$: can be solved efficiently, but requires more measurements for reconstruction, $M \ge C K \log(N/K)$.

$\min_u \|\Psi u\|_p^p$, s.t. $\Phi u = \Phi x$, where $0 < p < 1$: solvable in practice, and requires fewer measurements than $\ell^1$, $M \ge C_1(p) K + p\, C_2(p) K \log(N/K)$ (with V. Staneva).
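As a concrete toy instance of this setup, a minimal sketch (the sizes, the Gaussian scaling, and the choice $\Psi = I$ are illustrative assumptions, not values from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, K = 200, 60, 10                            # illustrative sizes only

x = np.zeros(N)
x[rng.choice(N, size=K, replace=False)] = rng.standard_normal(K)  # ||Psi x||_0 = K, Psi = I
Phi = rng.standard_normal((M, N)) / np.sqrt(M)   # i.i.d. Gaussian measurement matrix
b = Phi @ x                                      # the M measurements used by all three problems

print("K =", np.count_nonzero(x), " M =", M, " N =", N)
print("2K =", 2 * K, "   K log(N/K) =", round(K * np.log(N / K), 1))  # scales in the bounds
```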

Slide 6 of 16


Nonconvex compressive sensing

The geometry of $\ell^p$

$\min_u \|u\|_p^p$, subject to $\Phi u = \Phi x$. Here $p = 2$:

[Figure: the level sets $|u_1|^p + |u_2|^p + |u_3|^p = c^p$ inflate from $c = 0.1$ to $c = 0.5$ until they reach the feasible set $\Phi u = \Phi x$, which contains the sparse point $x$.]

Slide 7 of 16


Nonconvex compressive sensing

The geometry of $\ell^p$

$\min_u \|u\|_p^p$, subject to $\Phi u = \Phi x$. Here $p = 1$:

[Figure: the level sets $|u_1|^p + |u_2|^p + |u_3|^p = c^p$ inflate from $c = 0.1$ to $c = 0.7$ until they reach the feasible set $\Phi u = \Phi x$.]

Slide 8 of 16


Nonconvex compressive sensing

The geometry of $\ell^p$

$\min_u \|u\|_p^p$, subject to $\Phi u = \Phi x$. Here $p = 1/2$:

[Figure: the level sets $|u_1|^p + |u_2|^p + |u_3|^p = c^p$ inflate from $c = 0.1$ to $c = 1$ until they reach the feasible set $\Phi u = \Phi x$.]
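The same geometry can be checked numerically: along the feasible set, $\|u\|_p^p$ with $p < 1$ tends to be minimized at a sparse point, while $p = 1$ or $p = 2$ can be minimized elsewhere. A minimal sketch with a made-up 1-sparse $x \in \mathbb{R}^3$ and a made-up feasible line (not the configuration drawn in the figures):

```python
import numpy as np

# x is 1-sparse in R^3 and the feasible set {u : Phi u = Phi x} is the line
# u(t) = x + t*d.  x and d are invented so that the effect is visible.
x = np.array([0.0, 0.0, 1.0])
d = np.array([1.0, 1.0, -3.0])
t = np.linspace(-1.0, 1.0, 4001)
U = x[:, None] + d[:, None] * t                  # all points on the feasible line

for p in (2.0, 1.0, 0.5):
    vals = np.sum(np.abs(U)**p, axis=0)          # ||u||_p^p along the line
    u_best = U[:, np.argmin(vals)]
    print(f"p = {p}: minimizer ~ {np.round(u_best, 2)}")
# In this example p = 2 and p = 1 land away from x, while p = 1/2 returns the sparse x.
```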

Slide 9 of 16


Nonconvex compressive sensing

Why might global minimization be possible?

Consider the ε-regularized, constraint-eliminated objective:

$F_\varepsilon(t) = \sum_{i=1}^{N} \big( [\,x_i + (Vt)_i\,]^2 + \varepsilon \big)^{p/2},$

where $\mathcal{R}(V) = \mathcal{N}(\Phi)$. A moderate $\varepsilon$ fills in the local minima.

[Figure: $F_\varepsilon$ plotted along a null-space direction for $\varepsilon = 0,\ 1,\ 0.1,\ 0.01,\ 0.001$.]

Slide 10 of 16


Algorithms

Semiconvex regularization

Now we generalize an approach of J. Yang, W. Yin, Y. Zhang, and Y. Wang. Consider a mollified $\ell^p$ objective on $\mathbb{R}^2$:

$\varphi(t) = \begin{cases} \gamma |t|^2 & \text{if } |t| \le \alpha, \\ |t|^p/p - \delta & \text{if } |t| > \alpha. \end{cases}$

The parameters are chosen to make $\varphi \in C^1$.

[Figure: graph of $\varphi(t)$, with the quadratic-to-$|t|^p$ transition at $|t| = \alpha$.]

Now we seek $\psi$ such that

$\varphi(t) = \min_w \big\{ \psi(w) + (\beta/2)\,|t - w|_2^2 \big\}.$

Such a $\psi$ can be found by convex duality, as $|t|_2^2/2 - \varphi(t)/\beta$ is convex if $\beta = \alpha^{p-2}$.
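A small sketch of $\varphi$; the slide does not spell out $\gamma$ and $\delta$, so the formulas below are my reconstruction from the $C^1$ matching conditions at $|t| = \alpha$:

```python
import numpy as np

def phi(t, p, alpha):
    """Mollified |t|^p/p: quadratic near 0, |t|^p/p - delta beyond |t| = alpha.

    gamma and delta are one consistent choice making phi and phi' match at
    |t| = alpha (derived here, not stated on the slide):
        2*gamma*alpha = alpha**(p-1)         ->  gamma = alpha**(p-2) / 2
        gamma*alpha**2 = alpha**p/p - delta  ->  delta = alpha**p * (2-p) / (2*p)
    Note 2*gamma = alpha**(p-2) = beta, matching the duality condition above.
    """
    t = np.abs(np.asarray(t, dtype=float))   # for the TV case, pass |(Du)_i| here
    gamma = alpha ** (p - 2) / 2.0
    delta = alpha ** p * (2.0 - p) / (2.0 * p)
    return np.where(t <= alpha, gamma * t**2, t**p / p - delta)

# quick check that the two branches agree at the breakpoint
p, alpha = 0.5, 0.1
print(phi(alpha - 1e-9, p, alpha), phi(alpha + 1e-9, p, alpha))   # nearly equal
```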

Slide 11 of 16


Algorithms

A splitting approach

Now we consider an unconstrained $\ell^p$ minimization problem, and replace

$\min_u \sum_{i=1}^{N} \varphi\big((Du)_i\big) + (\mu/2)\,\|\Phi u - b\|_2^2$

with the split version

$\min_{u,w} \sum_{i=1}^{N} \psi(w_i) + (\beta/2)\,\|Du - w\|_2^2 + (\mu/2)\,\|\Phi u - b\|_2^2,$

which we solve by alternating minimization.

Slide 12 of 16


Algorithms

Easy iterations

Holding $u$ fixed, the $w$-subproblem is separable, and its solution comes from the convex duality:

$w_i = \max\Big\{ 0,\ |(Du)_i| - \frac{|(Du)_i|^{p-1}}{\beta} \Big\}\ \frac{(Du)_i}{|(Du)_i|}.$

This generalizes shrinkage (or soft thresholding) to $\ell^p$.

Holding $w$ fixed, the $u$-subproblem is quadratic:

$(\beta D^T D + \mu \Phi^T \Phi)\,u = \beta D^T w + \mu \Phi^T b.$

If $\Phi$ is a Fourier sampling operator, we can solve this in the Fourier domain. This is very fast!
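A minimal numpy sketch of the two subproblem solves and the plain alternating loop, assuming $D$ is a periodic forward-difference (gradient) operator and $\Phi$ is a masked unitary 2-D FFT in numpy's unshifted frequency ordering; the parameter values are illustrative, and continuation/Bregman enforcement (next slide) is omitted:

```python
import numpy as np

def grad(u):
    """Periodic forward differences: D u = (grad_x u, grad_y u)."""
    return np.roll(u, -1, axis=0) - u, np.roll(u, -1, axis=1) - u

def p_shrink(dx, dy, p, beta):
    """w_i = max(0, |d_i| - |d_i|^(p-1)/beta) * d_i/|d_i|, with |d_i| the
    length of the gradient vector at pixel i (isotropic shrinkage)."""
    mag = np.sqrt(dx**2 + dy**2)
    safe = np.where(mag > 0, mag, 1.0)           # avoid 0/0; the result is 0 there anyway
    factor = np.maximum(0.0, mag - safe**(p - 1.0) / beta) / safe
    return factor * dx, factor * dy

def solve_u(wx, wy, mask, b_hat, beta, mu):
    """Solve (beta D^T D + mu Phi^T Phi) u = beta D^T w + mu Phi^T b in the
    Fourier domain; Phi = mask o (unitary 2-D FFT), so everything is diagonal."""
    n1, n2 = mask.shape
    dx_hat = np.exp(2j * np.pi * np.fft.fftfreq(n1))[:, None] - 1.0   # DFT symbol of grad_x
    dy_hat = np.exp(2j * np.pi * np.fft.fftfreq(n2))[None, :] - 1.0   # DFT symbol of grad_y
    denom = beta * (np.abs(dx_hat)**2 + np.abs(dy_hat)**2) + mu * mask
    rhs = beta * (np.conj(dx_hat) * np.fft.fft2(wx, norm='ortho')
                  + np.conj(dy_hat) * np.fft.fft2(wy, norm='ortho')) + mu * b_hat
    denom = np.where(denom == 0, 1.0, denom)     # guard the zero frequency if unsampled
    return np.real(np.fft.ifft2(rhs / denom, norm='ortho'))

def reconstruct(b_hat, mask, p=0.5, beta=10.0, mu=1e6, iters=100):
    """Plain alternating minimization; b_hat holds the measured Fourier
    samples on the mask and zeros elsewhere."""
    u = np.real(np.fft.ifft2(b_hat, norm='ortho'))   # zero-filled backprojection as a start
    for _ in range(iters):
        dx, dy = grad(u)
        wx, wy = p_shrink(dx, dy, p, beta)
        u = solve_u(wx, wy, mask, b_hat, beta, mu)
    return u
```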

Slide 13 of 16


Algorithms

Enforcing equality

$\min_{u,w} \sum_{i=1}^{N} \psi(w_i) + (\beta/2)\,\|Du - w\|_2^2 + (\mu/2)\,\|\Phi u - b\|_2^2$

Typically one enforces $w = Du$ and $\Phi u = b$ by iteratively growing $\beta$ and $\mu$ (continuation).

We get better results from Bregman iteration (generalizing T. Goldstein and S. Osher):

$\min_{u,w} \sum_{i=1}^{N} \psi(w_i) + (\beta/2)\,\|Du - w - B_w\|_2^2 + (\mu/2)\,\|\Phi u - b - B_u\|_2^2,$

with the updates $B_w^{n+1} = B_w^n + w - Du$ (inner loop) and $B_u^{m+1} = B_u^m + b - \Phi u$ (outer loop, if desired).
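A sketch of the Bregman-iteration variant; it reuses grad, p_shrink, and solve_u from the previous sketch, and the loop counts and parameter values are again illustrative rather than the talk's settings:

```python
import numpy as np
# reuses grad, p_shrink, and solve_u from the previous sketch

def reconstruct_bregman(b_hat, mask, p=0.5, beta=10.0, mu=100.0, outer=10, inner=10):
    """Bregman-iteration version of the alternating scheme above."""
    u = np.real(np.fft.ifft2(b_hat, norm='ortho'))
    bwx = np.zeros_like(u)                       # inner Bregman variables B_w
    bwy = np.zeros_like(u)
    bu = np.zeros_like(b_hat)                    # outer Bregman variable B_u
    for _ in range(outer):
        for _ in range(inner):
            dx, dy = grad(u)
            wx, wy = p_shrink(dx - bwx, dy - bwy, p, beta)        # w-step sees Du - B_w
            u = solve_u(wx + bwx, wy + bwy, mask, b_hat + bu, beta, mu)
            dx, dy = grad(u)
            bwx += wx - dx                        # B_w <- B_w + (w - Du)
            bwy += wy - dy
        bu += b_hat - mask * np.fft.fft2(u, norm='ortho')         # B_u <- B_u + (b - Phi u)
    return u
```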

Slide 14 of 16


Algorithms

A surprise

- An early coding mistake led to the use of $p = -1/2$. This is what made the 9-line phantom reconstruction possible.

- Results decay slowly as $p$ is decreased below $-1/2$.

- Negative values of $p$ were used by Rao and Kreutz-Delgado in an iteratively reweighted least squares (IRLS) algorithm. Their negative-$p$ results were worse than their positive-$p$ results, which were hampered by getting stuck in local minima.

- A mollified IRLS approach (with Wotao Yin) apparently avoids local minima, but negative-$p$ results are not better than positive-$p$ results.

Slide 15 of 16


Summary

Summary

- Nonconvex compressive sensing allows sparse signals to be recovered with even fewer measurements than "traditional" compressive sensing.

- Decreasing $p$ also improves robustness to noise and speeds up convergence.

- Regularizing the objective appears to keep algorithms from converging to nonglobal minima.

- For Fourier-sampling measurements, such as MRI, a very fast algorithm is available.

math.lanl.gov/~rick

Slide 16 of 16
