Review Optimization-Based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/OBDA_spring16 Carlos Fernandez-Granda 5/9/2016

Review

Optimization-Based Data Analysis

http://www.cims.nyu.edu/~cfgranda/pages/OBDA_spring16

Carlos Fernandez-Granda

5/9/2016

How to deal with a data-analysis problem

1. Define the problem

2. Establish assumptions on signal structure

3. Design an efficient algorithm

4. Understand under what conditions the problem is well posed

5. Derive theoretical guarantees

Data-analysis problems

Signal structure

Methods

- General techniques
- Denoising
- Signal recovery
- Signal separation
- Regression
- Compression / dimensionality reduction
- Clustering

When is the problem well posed?

Theoretical analysis

Denoising

Aim: Extract information (signal) from data in the presence of uninformative perturbations (noise)

Additive noise model

data = signal + noise

y = x + z

[Figures: denoising examples showing the data and the underlying signal]

Signal recovery

- Compressed sensing

- Deconvolution / super-resolution

- Matrix completion

General model

Aim: Estimate signal x from measurements y

y = A x

Linear underdetermined system where dimension (y) < dimension (x)
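The following is a minimal numerical sketch of why an underdetermined system calls for assumptions on signal structure: many vectors are consistent with the same measurements. The matrix sizes, the random Gaussian matrix and the sparse test signal are illustrative assumptions, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 20, 50                      # assumed sizes, dimension(y) < dimension(x)
A = rng.standard_normal((m, n))    # assumed random measurement matrix

x = np.zeros(n)                    # assumed sparse signal with 3 nonzero entries
x[[3, 17, 41]] = [1.0, -2.0, 0.5]
y = A @ x

# Minimum-norm solution of y = A x, computed with the pseudoinverse.
x_min_norm = np.linalg.pinv(A) @ y

# Both x and x_min_norm reproduce the data, yet they are different vectors.
residual = np.linalg.norm(A @ x_min_norm - y)
gap = np.linalg.norm(x_min_norm - x)
print(f"residual of min-norm solution: {residual:.2e}")
print(f"distance to the true signal:   {gap:.2f}")
```

The minimum-norm solution fits the measurements but is generally not sparse, which is why the recovery methods later in the review add a structural prior.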

Compressed sensing

[Figure panels: signal, spectrum, data]

Super-resolution

The resolving power of lenses, however perfect, is limited (Lord Rayleigh)

Diffraction imposes a fundamental limit on the resolution of optical systems

[Figure: super-resolution example]

Spatial super-resolution

[Figure panels: spectrum, signal, data]

Spectral super-resolution

[Figure panels: spectrum, signal, data]

Seismology

[Figure]

Reflection seismology

[Figure panels: geological section, acoustic impedance, reflection coefficients]

Deconvolution

[Figure panels: sensing, reflection coefficients, pulse, data]

Data ≈ convolution of pulse and reflection coefficients

In the time domain: reflection coefficients ∗ pulse = data

In the frequency domain (spectrum): reflection coefficients × pulse = data
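The relation between convolution in time and multiplication of spectra can be checked numerically. The sketch below assumes circular convolution (so that the discrete Fourier transform applies exactly), a Gaussian pulse and three reflection coefficients; all of these are illustrative choices rather than the data shown on the slides.

```python
import numpy as np

n = 64
reflectivity = np.zeros(n)                      # assumed sparse reflection coefficients
reflectivity[[5, 20, 45]] = [1.0, -0.7, 0.4]
t = np.arange(n)
pulse = np.exp(-0.5 * ((t - 8) / 2.0) ** 2)     # assumed Gaussian pulse

# Circular convolution computed directly in the time domain.
data_time = np.array([
    sum(reflectivity[k] * pulse[(i - k) % n] for k in range(n))
    for i in range(n)
])

# Same data obtained by multiplying the two spectra and inverting the DFT.
data_freq = np.real(np.fft.ifft(np.fft.fft(reflectivity) * np.fft.fft(pulse)))

print(np.max(np.abs(data_time - data_freq)))
```

Deconvolution amounts to undoing this multiplication, which is ill posed at frequencies where the spectrum of the pulse is close to zero.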

Matrix completion

Bob  Molly  Mary  Larry
 1     ?     5     4     The Dark Knight
 ?     1     4     5     Spiderman 3
 4     5     2     ?     Love Actually
 5     4     2     1     Bridget Jones's Diary
 4     5     1     2     Pretty Woman
 1     2     ?     5     Superman 2

Signal separation

Aim: Decompose the data into two (or more) signals

y = x1 + x2

Electrocardiogram

[Figures: electrocardiogram recordings]

Temperature data

[Figures: temperature (Celsius) against year, from 1860 to 2000, with a close-up of 1900 to 1905]

Demixing of sines and spikes

Sines + spikes = data

[Figure panels: sines, spikes and data, each shown with the corresponding spectrum row]

\[ F_c x + s = y \]

Collaborative filtering with outliers

A :=

Bob  Molly  Mary  Larry
 5     1     5     5     The Dark Knight
 1     1     5     5     Spiderman 3
 5     5     1     1     Love Actually
 5     5     1     1     Bridget Jones's Diary
 5     5     1     1     Pretty Woman
 1     1     5     1     Superman 2

Background subtraction

[Figure: three video frames illustrating the decomposition L + S = M]

Regression

Aim: Predict the value of a response $y \in \mathbb{R}$ from p predictors $X_1, X_2, \ldots, X_p \in \mathbb{R}$

Methodology:

1. Fit a model using n training examples $y_1, y_2, \ldots, y_n$

\[ y_i \approx f(X_{i1}, X_{i2}, \ldots, X_{ip}), \quad 1 \le i \le n \]

2. Use the learned model f to predict from new data

Sparse regression

Assumption: The response only depends on a subset S of $s \ll p$ predictors

Model-selection problem: Determine what predictors are relevant

Classification

Aim: Predict the value of a binary response $y \in \{0, 1\}$ from p predictors $X_1, X_2, \ldots, X_p \in \mathbb{R}$

Methodology:

1. Fit a model using n training examples $y_1, y_2, \ldots, y_n$

\[ y_i \approx f(X_{i1}, X_{i2}, \ldots, X_{ip}), \quad 1 \le i \le n \]

2. Use the learned model f to predict from new data

Arrhythmia prediction

Predict whether a patient has arrhythmia from n = 271 examples and p = 182 predictors

- Age, sex, height, weight

- Features obtained from electrocardiogram recordings

Compression

Aim: Map a signal x ∈ Rn to a lower-dimensional space

y ≈ f (x)

such that we can recover x from y with minimal loss of information

[Figures: compression examples]

Dimensionality reduction

Projection of data onto lower-dimensional space

- Decreases the computational cost of processing the data

- Allows the data to be visualized (2D, 3D)

Difference with compression: Not necessarily reversible

Dimensionality reduction

Seeds from three different varieties of wheat: Kama, Rosa and Canadian

Dimensions:

- Area
- Perimeter
- Compactness
- Length of kernel
- Width of kernel
- Asymmetry coefficient
- Length of kernel groove

Clustering

Aim: Separate signals x1, . . . , xn ∈ Rd into different classes

[Figure: clustering example]

Collaborative filtering

A :=

Bob  Molly  Mary  Larry
 1     1     5     4     The Dark Knight
 2     1     4     5     Spiderman 3
 4     5     2     1     Love Actually
 5     4     2     1     Bridget Jones's Diary
 4     5     1     2     Pretty Woman
 1     2     5     5     Superman 2

Topic modeling

A :=

singer  GDP  senate  election  vote  stock  bass  market  band   Articles
  6      1     1        0       0      1     9      0      8     a
  1      0     9        5       8      1     0      1      0     b
  8      1     0        1       0      0     9      1      7     c
  0      7     1        0       0      9     1      7      0     d
  0      5     6        7       5      6     0      7      2     e
  1      0     8        5       9      2     0      0      1     f

Data-analysis problems

Signal structure

Methods

- General techniques
- Denoising
- Signal recovery
- Signal separation
- Regression
- Compression / dimensionality reduction
- Clustering

When is the problem well posed?

Theoretical analysis

Models

- Sparse models

- Group sparse models

- Low-rank models

Sparsity

\[ x = \begin{bmatrix} 0 & 0 & 0 & 2 & 0 & 0 & 0 & 3 & 0 \end{bmatrix}^T \]

Group sparsity

Entries are partitioned into m groups G1, G2, . . . , Gm

\[ x = \begin{bmatrix} x_{G_1} \\ x_{G_2} \\ \vdots \\ x_{G_m} \end{bmatrix} \]

Assumption: Most groups are zero

Sparse models

Let D be a dictionary of atoms

1. Synthesis sparse model:

$x = D c$ where c is sparse

2. Analysis sparse model:

$D^T x$ is sparse

Low-rank model

Signal is structured as a matrix that presents significant correlations

Collaborative filtering

A :=

Bob  Molly  Mary  Larry
 1     1     5     4     The Dark Knight
 2     1     4     5     Spiderman 3
 4     5     2     1     Love Actually
 5     4     2     1     Bridget Jones's Diary
 4     5     1     2     Pretty Woman
 1     2     5     5     Superman 2

SVD

\[ A - \bar{A} = U \Sigma V^T = U \begin{bmatrix} 7.79 & 0 & 0 & 0 \\ 0 & 1.62 & 0 & 0 \\ 0 & 0 & 1.55 & 0 \\ 0 & 0 & 0 & 0.62 \end{bmatrix} V^T \]
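The dominance of the first singular value can be reproduced with a few lines of NumPy. This is a sketch under one assumption that the slide does not spell out: the centering matrix is taken to have every entry equal to the mean of all ratings.

```python
import numpy as np

# Ratings from the collaborative-filtering slide (rows: movies, columns: users).
A = np.array([
    [1, 1, 5, 4],
    [2, 1, 4, 5],
    [4, 5, 2, 1],
    [5, 4, 2, 1],
    [4, 5, 1, 2],
    [1, 2, 5, 5],
], dtype=float)

# Assumption: subtract the mean rating from every entry.
A_centered = A - A.mean()

U, s, Vt = np.linalg.svd(A_centered, full_matrices=False)
print(np.round(s, 2))

# Rank-1 approximation built from the leading singular triplet.
rank_one = s[0] * np.outer(U[:, 0], Vt[0])
relative_error = np.linalg.norm(A_centered - rank_one) / np.linalg.norm(A_centered)
print(round(relative_error, 3))
```

A small relative error for the rank-1 approximation is what justifies modeling the ratings as a low-rank matrix.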

Topic modeling

A :=

singer  GDP  senate  election  vote  stock  bass  market  band   Articles
  6      1     1        0       0      1     9      0      8     a
  1      0     9        5       8      1     0      1      0     b
  8      1     0        1       0      0     9      1      7     c
  0      7     1        0       0      9     1      7      0     d
  0      5     6        7       5      6     0      7      2     e
  1      0     8        5       9      2     0      0      1     f

SVD

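\[ A = U \Sigma V^T = U \begin{bmatrix} 23.64 & 0 & 0 & 0 & 0 & 0 \\ 0 & 18.82 & 0 & 0 & 0 & 0 \\ 0 & 0 & 14.23 & 0 & 0 & 0 \\ 0 & 0 & 0 & 3.63 & 0 & 0 \\ 0 & 0 & 0 & 0 & 2.03 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1.36 \end{bmatrix} V^T \]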

Designing signal representations

- Frequency representation

- Short-time Fourier transform

- Wavelets

- Finite differences

Frequency representation

[Figure: decomposition of a signal x and the spectrum of x]

Discrete cosine transform

[Figure panels: signal, DCT coefficients]

Electrocardiogram

[Figures: electrocardiogram and its spectrum]

Short-time Fourier transform

[Figure panels: real part, imaginary part, spectrum]

Speech signal

[Figure]

Spectrogram (log magnitude of STFT coefficients)

[Figure: frequency against time]

Wavelets

[Figure panels: scaling function, mother wavelet]

Electrocardiogram

[Figure panels: signal, Haar transform]

[Figures: contribution and approximation at scales $2^9, 2^8, \ldots, 2^0$]

2D wavelet transform

[Figures]

Finite differences

[Figure]

Learning signal representations

Aim: Learn representation from a set of n signals

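\[ X := \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix} \]

For each signal

\[ x_j \approx \sum_{i=1}^{k} \Phi_i A_{ij}, \quad 1 \le j \le n, \quad \text{for } k \ll n \]

- $\Phi_1, \ldots, \Phi_k \in \mathbb{R}^d$ are atoms

- $A_1, \ldots, A_n \in \mathbb{R}^k$ are coefficient vectors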

Learning signal representations

Equivalent formulation

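\[ X \approx \begin{bmatrix} \Phi_1 & \Phi_2 & \cdots & \Phi_k \end{bmatrix} \begin{bmatrix} A_1 & A_2 & \cdots & A_n \end{bmatrix} = \Phi A \]

\[ \Phi \in \mathbb{R}^{d \times k}, \quad A \in \mathbb{R}^{k \times n} \]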

Learning signal representations

- k means

- Principal-component analysis

- Nonnegative matrix factorization

- Sparse principal-component analysis

- Dictionary learning

k means

Aim: Divide $x_1, \ldots, x_n$ into k classes

Learn $\Phi_1, \ldots, \Phi_k$ that minimize

\[ \sum_{i=1}^{n} \left\| x_i - \Phi_{c(i)} \right\|_2^2, \qquad c(i) := \arg\min_{1 \le j \le k} \left\| x_i - \Phi_j \right\|_2 \]
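The slide states the k-means objective but not an algorithm. A common heuristic for it is Lloyd's algorithm, which alternates between the assignment c(i) and an update of the centers; the sketch below uses it on assumed synthetic data, with hypothetical names such as `k_means` chosen for illustration.

```python
import numpy as np

def k_means(X, k, n_iter=50, seed=0):
    """Lloyd's algorithm. X has shape (n, d); returns centers (k, d) and labels (n,)."""
    rng = np.random.default_rng(seed)
    # Initialize the centers with k distinct data points.
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Assignment step: c(i) = argmin_j ||x_i - Phi_j||_2
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each center becomes the mean of the points assigned to it.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

# Assumed toy data: two well-separated clusters in the plane.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (30, 2)), rng.normal(5.0, 0.1, (30, 2))])
centers, labels = k_means(X, k=2)
print(centers)
```

Lloyd's algorithm only reaches a local minimum of the objective in general, so in practice it is often run from several initializations.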

[Figure: k-means example]

Principal-component analysis

Best rank-k approximation

\[ \Phi A = U_{1:k} \Sigma_{1:k} V_{1:k}^T = \arg\min_{M \,|\, \operatorname{rank}(M) = k} \left\| X - M \right\|_F^2 \]

The atoms Φ1, . . . , Φk are orthogonal
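The best rank-k approximation can be computed by truncating the SVD. The sketch below does this on assumed synthetic data and checks two facts: the atoms are orthonormal, and the squared Frobenius error equals the sum of the discarded squared singular values.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, k = 5, 200, 2
# Assumed toy data: rank-k structure plus small noise, one signal per column.
X = rng.standard_normal((d, k)) @ rng.standard_normal((k, n))
X += 0.01 * rng.standard_normal((d, n))

U, s, Vt = np.linalg.svd(X, full_matrices=False)
Phi = U[:, :k]                       # atoms: first k left singular vectors
A = np.diag(s[:k]) @ Vt[:k]          # coefficients
M = Phi @ A                          # truncated SVD, the best rank-k approximation

# Squared Frobenius error of the approximation.
err = np.linalg.norm(X - M, "fro") ** 2
print(err, np.sum(s[k:] ** 2))
```

Here the factor A absorbs the singular values; splitting them differently between the two factors gives the same product M.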

Principal-component analysis

[Figure: data with directions $U_1$ and $U_2$; $\sigma_1 / \sqrt{n} = 1.3490$, $\sigma_2 / \sqrt{n} = 0.1438$]

PCA

[Figure]

Nonnegative matrix factorization

Nonnegative atoms/coefficients

\[ X \approx \Phi A, \quad \Phi_{i,j} \ge 0, \quad A_{i,j} \ge 0 \quad \text{for all } i, j \]
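The slides do not say how the nonnegative factorization is computed. One standard option, swapped in here purely for illustration, is the multiplicative update rule of Lee and Seung for the squared Frobenius error; the function name `nmf`, the toy data and the iteration count are assumptions.

```python
import numpy as np

def nmf(X, k, n_iter=500, seed=0, eps=1e-10):
    """Approximate a nonnegative X (d x n) as Phi @ A with Phi >= 0 and A >= 0."""
    rng = np.random.default_rng(seed)
    d, n = X.shape
    Phi = rng.random((d, k)) + eps
    A = rng.random((k, n)) + eps
    for _ in range(n_iter):
        # Multiplicative updates: ratios of nonnegative terms keep entries nonnegative.
        A *= (Phi.T @ X) / (Phi.T @ Phi @ A + eps)
        Phi *= (X @ A.T) / (Phi @ A @ A.T + eps)
    return Phi, A

# Assumed toy data: a product of two nonnegative factors with inner dimension 2.
rng = np.random.default_rng(1)
X = rng.random((6, 2)) @ rng.random((2, 8))
Phi, A = nmf(X, k=2)
rel_err = np.linalg.norm(X - Phi @ A) / np.linalg.norm(X)
print(rel_err)
```

The problem is nonconvex, so the factors depend on the initialization and are not unique.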

Faces dataset

[Figure]

Sparse PCA

Sparse atoms

X ≈ Φ A, Φ sparse

Faces dataset

[Figure]

Dictionary learning

Sparse coefficients

X ≈ Φ A, A sparse

[Figure: dictionary learning example]

Data-analysis problems

Signal structure

Methods

- General techniques
- Denoising
- Signal recovery
- Signal separation
- Regression
- Compression / dimensionality reduction
- Clustering

When is the problem well posed?

Theoretical analysis

Promoting sparsity

Find sparse x such that x ≈ y

Hard thresholding

\[ H_\eta(y)_i := \begin{cases} y_i & \text{if } |y_i| > \eta \\ 0 & \text{otherwise} \end{cases} \]
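Hard thresholding is a one-line operation in NumPy. The sketch below applies the definition above to an assumed vector; the function name `hard_threshold` and the threshold value are illustrative.

```python
import numpy as np

def hard_threshold(y, eta):
    """Keep the entries with |y_i| > eta and set the rest to zero."""
    return np.where(np.abs(y) > eta, y, 0.0)

y = np.array([0.1, -2.0, 0.05, 3.0, -0.2])   # assumed noisy observation
x_hat = hard_threshold(y, eta=0.5)
print(x_hat)
```

Only the two entries whose magnitude exceeds 0.5 survive, and they are kept unchanged rather than shrunk.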

Promoting group sparsity

Find group sparse x such that x ≈ y

Block thresholding

\[ B_\eta(x)_i := \begin{cases} x_i & \text{if } i \in G_j \text{ such that } \left\| x_{G_j} \right\|_2 > \eta \\ 0 & \text{otherwise} \end{cases} \]
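Block thresholding keeps or discards whole groups according to their ℓ2 norm. In the sketch below the partition into groups, the test vector and the function name `block_threshold` are assumptions made for illustration.

```python
import numpy as np

def block_threshold(x, groups, eta):
    """Zero out every group whose l2 norm is at most eta.

    groups is a list of index arrays that partition the entries of x.
    """
    out = np.zeros_like(x, dtype=float)
    for idx in groups:
        if np.linalg.norm(x[idx]) > eta:
            out[idx] = x[idx]
    return out

x = np.array([0.1, -0.2, 3.0, 4.0, 0.3, 0.1])
groups = [np.array([0, 1]), np.array([2, 3]), np.array([4, 5])]   # assumed partition
x_hat = block_threshold(x, groups, eta=1.0)
print(x_hat)
```

Only the middle group, whose norm is 5, exceeds the threshold, so it is the only group that survives.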

Promoting low-rank structure

Find low rank M such that M ≈ Y ∈ Rm×n

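- Truncate the singular-value decomposition $Y = U \Sigma V^T$

\[ M = U_{1:k} \Sigma_{1:k} V_{1:k}^T \]

Solves PCA problem

- Fit $M = A B$, $A \in \mathbb{R}^{m \times k}$, $B \in \mathbb{R}^{k \times n}$, by solving

\[ \text{minimize} \quad \left\| Y - A B \right\|_F \]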

Promoting additional structure in low-rank models

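- Nonnegative factors

\[ \text{minimize} \quad \left\| Y - A B \right\|_F \quad \text{subject to} \quad A_{i,j} \ge 0, \; B_{i,j} \ge 0 \text{ for all } i, j \]

- Sparse factors

\[ \text{minimize} \quad \left\| Y - A B \right\|_F + \lambda \sum_{i=1}^{k} \left\| A_i \right\|_1 \quad \text{subject to} \quad \left\| A_i \right\|_2 = 1, \; 1 \le i \le k \]

\[ \text{minimize} \quad \left\| Y - A B \right\|_F + \lambda \sum_{i=1}^{k} \left\| B_i \right\|_1 \quad \text{subject to} \quad \left\| A_i \right\|_2 = 1, \; 1 \le i \le k \]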

Linear models

Signal representation

x = D c

- Columns of D are designed/learned atoms

Inverse problems

y = A x

- A models the measurement process

Linear regression

y = X β

- X contains the predictors

Least squares

Find x such that Ax ≈ y

\[ \text{minimize} \quad \left\| y - A x \right\|_2 \]

Alternatives: Logistic loss for classification
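A least-squares fit can be obtained with `numpy.linalg.lstsq`. The sketch below uses an assumed overdetermined system with small noise and compares the result with the solution of the normal equations.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 4))          # assumed overdetermined system
x_true = np.array([1.0, -2.0, 0.5, 3.0])
y = A @ x_true + 0.01 * rng.standard_normal(30)

# Least-squares fit: lstsq minimizes ||y - A x||_2.
x_ls, *_ = np.linalg.lstsq(A, y, rcond=None)

# The same minimizer satisfies the normal equations A^T A x = A^T y.
x_normal = np.linalg.solve(A.T @ A, A.T @ y)

print(x_ls)
```

Solving the normal equations directly squares the condition number of A, so a routine such as `lstsq` is usually the safer choice numerically.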

Promoting sparsity

Find sparse x such that Ax ≈ y

- Greedy methods: Choose entries of x sequentially to minimize the residual (matching pursuit, orthogonal matching pursuit, forward stepwise regression)

- Penalize the $\ell_1$ norm of x

\[ \text{minimize} \quad \left\| y - A x \right\|_2^2 + \lambda \left\| x \right\|_1 \]

Implementation:

gradient descent + soft-thresholding / coordinate descent
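"Gradient descent + soft-thresholding" is commonly known as proximal gradient descent or ISTA. The sketch below is one way to write it for the ℓ1-penalized problem above; the step size rule, the iteration count, the value of λ and the synthetic data are all assumptions.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1: shrink every entry towards zero by tau."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(A, y, lam, n_iter=3000):
    """Minimize ||y - A x||_2^2 + lam * ||x||_1 by proximal gradient descent."""
    # Step size 1/L, where L = 2 * sigma_max(A)^2 bounds the gradient's Lipschitz constant.
    step = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * A.T @ (A @ x - y)               # gradient of the quadratic term
        x = soft_threshold(x - step * grad, step * lam)
    return x

def objective(A, y, lam, x):
    return np.sum((y - A @ x) ** 2) + lam * np.sum(np.abs(x))

rng = np.random.default_rng(0)
m, n = 40, 100                                       # assumed sizes, fewer rows than columns
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[[7, 30, 81]] = [2.0, -3.0, 1.5]               # assumed sparse signal
y = A @ x_true
x_hat = ista(A, y, lam=0.05)
print(np.argsort(np.abs(x_hat))[-3:])
```

The soft-thresholding step is what produces exact zeros; plain gradient descent on a smoothed penalty would only make entries small.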

Promoting group sparsity

Find group sparse x such that Ax ≈ y

- Penalize the $\ell_1 / \ell_2$ norm of x

\[ \text{minimize} \quad \left\| y - A x \right\|_2^2 + \lambda \left\| x \right\|_{1,2} \]

Implementation:

gradient descent + block soft-thresholding / coordinate descent
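Block soft-thresholding shrinks each group as a whole, which is what replaces entrywise soft-thresholding when the penalty is the ℓ1/ℓ2 norm. A minimal sketch, with an assumed partition and the hypothetical name `block_soft_threshold`:

```python
import numpy as np

def block_soft_threshold(x, groups, tau):
    """Proximal operator of tau * sum_j ||x_{G_j}||_2 for a partition `groups`."""
    out = np.zeros_like(x, dtype=float)
    for idx in groups:
        norm = np.linalg.norm(x[idx])
        if norm > tau:
            out[idx] = (1.0 - tau / norm) * x[idx]   # shrink the whole group
    return out

x = np.array([3.0, 4.0, 0.3, -0.4])
groups = [np.array([0, 1]), np.array([2, 3])]        # assumed partition
shrunk = block_soft_threshold(x, groups, tau=1.0)
print(shrunk)
```

The first group has norm 5 and is scaled by 0.8; the second has norm 0.5, below the threshold, and is set to zero.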

Promoting low-rank structure

Find low-rank $M \in \mathbb{R}^{m \times n}$ such that $M_\Omega \approx Y_\Omega$ for a set of entries Ω

- Penalize the nuclear norm of M

minimize $\|Y_\Omega - M_\Omega\|_2^2 + \lambda \|M\|_*$

Implementation: gradient descent + soft-thresholding of singular values

- Fit M = A B, $A \in \mathbb{R}^{m \times k}$, $B \in \mathbb{R}^{k \times n}$, by solving

minimize $\|Y_\Omega - (A B)_\Omega\|_F$
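The soft-thresholding of singular values used with the nuclear-norm penalty can be sketched as below. Only this single step is shown, not the full matrix-completion algorithm; the function name `singular_value_threshold` is an assumption of this example.

```python
import numpy as np

def singular_value_threshold(M, t):
    """Soft-threshold the singular values of M by t."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - t, 0.0)) @ Vt

M = np.diag([5.0, 2.0, 0.5])
M_low = singular_value_threshold(M, t=1.0)   # singular values become 4, 1, 0
```

Singular values below the threshold are set to zero, so the step reduces the rank in the same way that entrywise soft-thresholding produces sparsity.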

Data-analysis problems

Signal structure

Methods
- General techniques
- Denoising
- Signal recovery
- Signal separation
- Regression
- Compression / dimensionality reduction
- Clustering

When is the problem well posed?

Theoretical analysis

Denoising

[Figure: data and signal]

Denoising via thresholding

[Figure: estimate and signal]

Denoising

[Figure panels: DCT coefficients (data, signal); data]

Denoising via thresholding in DCT basis

[Figure panels: DCT coefficients (estimate, signal); estimate]

Denoising

2D wavelet coefficients

Original coefficients

Thresholded coefficients

Denoising via thresholding in a wavelet basis

[Figure panels: Original, Noisy, Estimate]

Speech denoising

Time thresholding

Spectrum

Frequency thresholding

[Figure panels: Data, DFT thresholding]
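Frequency thresholding can be illustrated on a synthetic signal. The sketch below assumes NumPy; the two sinusoids, the noise level and the threshold value are hypothetical choices for this example and do not come from the speech data in the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 128
t = np.arange(n)
# Hypothetical signal: two sinusoids, which are sparse in the frequency domain.
signal = np.cos(2 * np.pi * 5 * t / n) + np.sin(2 * np.pi * 12 * t / n)
data = signal + 0.3 * rng.standard_normal(n)

# Hard-threshold the DFT coefficients: keep only those with large magnitude.
coeffs = np.fft.fft(data)
threshold = 15.0  # assumed value, well above the typical noise coefficient
coeffs[np.abs(coeffs) < threshold] = 0.0
estimate = np.real(np.fft.ifft(coeffs))

err_data = np.linalg.norm(data - signal)
err_estimate = np.linalg.norm(estimate - signal)
```

The noise spreads its energy over all frequencies while the signal concentrates on a few, so discarding the small coefficients removes most of the noise and keeps most of the signal.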

Spectrogram (STFT)

[Figure axes: Time, Frequency]

STFT thresholding

[Figure axes: Time, Frequency]

[Figure panels: Data, STFT thresholding]

STFT block thresholding

[Figure axes: Time, Frequency]

[Figure panels: Data, STFT block thresholding]


Sines and spikes

x = Dc c

[Figure panels: DCT subdictionary, Spike subdictionary]

Denoising

Denoising via ℓ1-norm regularized least squares

[Figure: signal and estimate]

Denoising

[Figure: signal and data]

Denoising via TV regularization

[Figure: signal and TV-regularized estimate (small λ)]

[Figure: signal and TV-regularized estimate (medium λ)]

[Figure: signal and TV-regularized estimate (large λ)]
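A one-dimensional version of TV-regularized denoising can be sketched as follows. The solver (projected gradient on the dual problem), the 1/2 factor in the data term, the function name `tv_denoise_1d` and the test signal are all assumptions made for this illustration; the slides do not specify how the estimates were computed.

```python
import numpy as np

def tv_denoise_1d(y, lam, n_iter=2000):
    """Approximately solve minimize 0.5 * ||y - x||_2^2 + lam * sum_i |x[i+1] - x[i]|
    by projected gradient ascent on the dual problem."""
    u = np.zeros(len(y) - 1)          # dual variable, one entry per difference
    for _ in range(n_iter):
        # Primal estimate x = y - D^T u, where D is the finite-difference operator.
        x = y - np.concatenate(([-u[0]], u[:-1] - u[1:], [u[-1]]))
        # Gradient step (step size 1/4 <= 1/||D||^2), then projection onto [-lam, lam].
        u = np.clip(u + 0.25 * np.diff(x), -lam, lam)
    return y - np.concatenate(([-u[0]], u[:-1] - u[1:], [u[-1]]))

def tv(x):
    """Total variation of a 1D signal."""
    return np.sum(np.abs(np.diff(x)))

n = 100
signal = np.where(np.arange(n) < 50, 0.0, 1.0)    # piecewise-constant signal
data = signal + 0.1 * (-1.0) ** np.arange(n)      # deterministic oscillating perturbation
estimate = tv_denoise_1d(data, lam=0.5)
```

Increasing `lam` flattens the estimate further, which matches the progression from small to large λ: small values leave noise in the estimate, large values also shrink the jumps of the signal.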

Denoising via TV regularization

Small λ

Medium λ

Large λ

Denoising via TV regularization

[Figure panels: Original, Noisy, Estimate]

Data-analysis problems

Signal structure

Methods
- General techniques
- Denoising
- Signal recovery
- Signal separation
- Regression
- Compression / dimensionality reduction
- Clustering

When is the problem well posed?

Theoretical analysis

Compressed sensing

[Figure panels: Signal, Spectrum, Data]

ℓ1-norm minimization

[Figure: signal and estimate]

×2 undersampling

ℓ1-norm minimization

[Figure panels: Regular, Random]


Super-resolution

The resolving power of lenses, however perfect, is limited (Lord Rayleigh)

Diffraction imposes a fundamental limit on the resolution of optical systems

Super-resolution: ℓ1-norm regularization

Spectral super-resolution

[Figure panels: Spectrum, Signal, Data]

Spectral super-resolution: Pseudospectrum from low-rank model (MUSIC)

Deconvolution

[Figure: Ref. coeff. ∗ Pulse = Data; in the spectrum the convolution becomes a product (×)]

Deconvolution with the ℓ1 norm (Taylor, Banks, McCoy ’79)

[Figure panels: Data, Fit, Pulse, Estimate]

Matrix completion

                         Bob     Molly   Mary    Larry
The Dark Knight           1       ?       5       4
Spiderman 3               ?       1       4       5
Love Actually             4       5       2       ?
Bridget Jones’s Diary     5       4       2       1
Pretty Woman              4       5       1       2
Superman 2                1       2       ?       5

Matrix completion via nuclear-norm minimization

Estimated entries are followed by the true rating in parentheses:

                         Bob     Molly   Mary    Larry
The Dark Knight           1       2 (1)   5       4
Spiderman 3               2 (2)   1       4       5
Love Actually             4       5       2       2 (1)
Bridget Jones’s Diary     5       4       2       1
Pretty Woman              4       5       1       2
Superman 2                1       2       5 (5)   5

Data-analysis problems

Signal structure

Methods
- General techniques
- Denoising
- Signal recovery
- Signal separation
- Regression
- Compression / dimensionality reduction
- Clustering

When is the problem well posed?

Theoretical analysis

Electrocardiogram

[Figure: Spectrum]

Electrocardiogram: High-frequency noise (power line hum)

[Figure panels: Original spectrum, Low-pass filtered spectrum]

[Figure panels: Original, Low-pass filtered]

Temperature data

[Figure: Temperature (Celsius) vs. year, 1860 to 2000]

[Figure: Temperature (Celsius) vs. year, 1900 to 1905]

Model fitted by least squares

[Figure: Data and Model, Temperature (Celsius) vs. year, 1860 to 2000]

[Figure: Data and Model, Temperature (Celsius) vs. year, 1900 to 1905]

[Figure: Data and Model, Temperature (Celsius) vs. year, 1960 to 1965]

Trend: Increase of 0.75 °C / 100 years (1.35 °F)

[Figure: Data and Trend, Temperature (Celsius) vs. year, 1860 to 2000]

Demixing of sines and spikes

[Figure: Sines + Spikes = Data, shown for the signals and for their spectra]

$$F_c x + s = y$$

[Figure panels: Spikes (s), Sines (spectrum) (x)]

Background subtraction

Low-rank component

Sparse component

Data-analysis problems

Signal structure

Methods
- General techniques
- Denoising
- Signal recovery
- Signal separation
- Regression
- Compression / dimensionality reduction
- Clustering

When is the problem well posed?

Theoretical analysis


Sparse regression

Assumption: Response only depends on a subset S of s ≪ p predictors

Model-selection problem: Determine which predictors are relevant

Lasso

[Figure: Coefficients vs. regularization parameter (10^-4 to 10^1); relevant features highlighted]

Arrhythmia prediction

Predict whether a patient has arrhythmia from n = 271 examples and p = 182 predictors

- Age, sex, height, weight
- Features obtained from electrocardiogram recordings

Best sparse model uses around 60 predictors

Lasso (logistic regression)

[Figure: Misclassification error vs. log(Lambda), from -10 to -2; top axis: 182 172 174 159 107 62 18 5]

Correlated predictors: Lasso path

[Figure: Coefficients vs. regularization parameter (10^-4 to 10^1); Group 1, Group 2]

Correlated predictors: Ridge-regression path

[Figure: Coefficients vs. regularization parameter (10^-4 to 10^5); Group 1, Group 2]

Correlated predictors: Elastic net path

[Figure: Coefficients vs. regularization parameter (10^-4 to 10^1); Group 1, Group 2]


Multi-task learning

Several responses y1, y2, . . . , yk modeled with the same predictors

Assumption: Responses depend on the same subset of predictors

Aim: Learn a group-sparse model

Multitask learning

[Figure panels: Lasso, Multitask lasso, Original; color scale from 0.0 to 3.6]

Data-analysis problems

Signal structure

Methods
- General techniques
- Denoising
- Signal recovery
- Signal separation
- Regression
- Compression / dimensionality reduction
- Clustering

When is the problem well posed?

Theoretical analysis

Compression

[Figure: Original]

Compression via frequency representation

[Figure: 10% largest DCT coeffs]

[Figure: 2% largest DCT coeffs]
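Keeping only the largest DCT coefficients can be sketched in one dimension as below. The slides appear to apply this to an image; the 1D setting, the explicit DCT matrix, the function names `dct_matrix` and `compress`, and the test signal are assumptions made to keep the example self-contained.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix: row k holds the k-th cosine basis vector."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    D = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    D[0, :] /= np.sqrt(2.0)
    return D

def compress(x, fraction):
    """Keep only the given fraction of largest-magnitude DCT coefficients of x."""
    D = dct_matrix(len(x))
    c = D @ x
    n_keep = max(1, int(round(fraction * len(x))))
    small = np.argsort(np.abs(c))[:-n_keep]   # indices of the coefficients to discard
    c[small] = 0.0
    return D.T @ c

x = np.linspace(0.0, 1.0, 100) ** 2           # hypothetical smooth signal
x_10 = compress(x, 0.10)
x_2 = compress(x, 0.02)
err_10 = np.linalg.norm(x - x_10)
err_2 = np.linalg.norm(x - x_2)
```

Because the basis is orthonormal, the squared error equals the energy of the discarded coefficients, so keeping 10% of them can never do worse than keeping 2%.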


Dimensionality reduction

Seeds from three different varieties of wheat: Kama, Rosa and Canadian

Dimensions:
- Area
- Perimeter
- Compactness
- Length of kernel
- Width of kernel
- Asymmetry coefficient
- Length of kernel groove

PCA: Projection onto the first two PCs

PCA: Projection onto the last two PCs
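Projection onto principal components can be sketched with the SVD of the centered data. The wheat-seed measurements are not reproduced in the slides, so the data below is a hypothetical stand-in with seven features; the function name `pca_project` is also an assumption.

```python
import numpy as np

def pca_project(X, n_components=2, last=False):
    """Project the rows of X onto the first (or last) principal components."""
    Xc = X - X.mean(axis=0)                       # center each feature
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Rows of Vt are principal directions, sorted by decreasing singular value.
    directions = Vt[-n_components:] if last else Vt[:n_components]
    return Xc @ directions.T

# Hypothetical data: 30 examples with 7 features of different scales.
rng = np.random.default_rng(0)
X = rng.standard_normal((30, 7)) * np.arange(1, 8)
first = pca_project(X)
last = pca_project(X, last=True)
```

The first components capture the directions of largest variance, which is why they tend to preserve structure such as the separation between classes, while the last ones mostly carry little variation.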


Random projections

Data-analysis problems

Signal structure

Methods
- General techniques
- Denoising
- Signal recovery
- Signal separation
- Regression
- Compression / dimensionality reduction
- Clustering

When is the problem well posed?

Theoretical analysis

Clustering

k-means using Lloyd’s algorithm
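Lloyd's algorithm alternates between assigning points to the nearest center and recomputing the centers. The sketch below is illustrative: the function name `lloyd`, the explicit initial centers and the toy data are assumptions made for this example.

```python
import numpy as np

def lloyd(X, centers, n_iter=100):
    """Lloyd's algorithm: alternate between assigning points and updating centers."""
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        # Update step: each center becomes the mean of its assigned points.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Toy data: two well-separated groups of three points each.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
labels, centers = lloyd(X, centers=X[[0, 3]])
```

The result depends on the initial centers in general; here they are taken from different groups, so the algorithm converges after one update.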

Collaborative filtering

                         Bob     Molly   Mary    Larry
The Dark Knight           1       1       5       4
Spiderman 3               2       1       4       5
Love Actually             4       5       2       1
Bridget Jones’s Diary     5       4       2       1
Pretty Woman              4       5       1       2
Superman 2                1       2       5       5

First left singular vector clusters movies

         D. Knight   Sp. 3    Love Act.   B.J.’s Diary   P. Woman   Sup. 2
U1 = (    −0.45      −0.39     0.39        0.39           0.39      −0.45  )

First right singular vector clusters users

         Bob     Molly    Mary     Larry
V1 = (   0.48    0.52    −0.48    −0.52  )
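The clustering by singular vectors can be reproduced from the collaborative-filtering ratings. The sketch below assumes that the ratings are centered by subtracting their overall mean before computing the SVD; the slides do not state the preprocessing. The overall sign of a singular vector is arbitrary, so only the grouping by sign is meaningful.

```python
import numpy as np

movies = ["The Dark Knight", "Spiderman 3", "Love Actually",
          "Bridget Jones's Diary", "Pretty Woman", "Superman 2"]
users = ["Bob", "Molly", "Mary", "Larry"]
ratings = np.array([[1, 1, 5, 4],
                    [2, 1, 4, 5],
                    [4, 5, 2, 1],
                    [5, 4, 2, 1],
                    [4, 5, 1, 2],
                    [1, 2, 5, 5]], dtype=float)

# Assumption: center the ratings with their overall mean before the SVD.
U, s, Vt = np.linalg.svd(ratings - ratings.mean(), full_matrices=False)
movie_cluster = np.sign(U[:, 0])   # sign of the first left singular vector
user_cluster = np.sign(Vt[0, :])   # sign of the first right singular vector
```

Movies with the same sign in `movie_cluster` are rated similarly by the users, and users with the same sign in `user_cluster` have similar tastes.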

Topic modeling

Articles   singer   GDP   senate   election   vote   stock   bass   market   band
a            6       1      1         0         0      1       9      0       8
b            1       0      9         5         8      1       0      1       0
c            8       1      0         1         0      0       9      1       7
d            0       7      1         0         0      9       1      7       0
e            0       5      6         7         5      6       0      7       2
f            1       0      8         5         9      2       0      0       1

Right nonnegative factors cluster words

         singer   GDP    senate   election   vote   stock   bass   market   band
H1 = (   0.34     0      3.73     2.54       3.67   0.52    0      0.35     0.35  )
H2 = (   0        2.21   0.21     0.45       0      2.64    0.21   2.43     0.22  )
H3 = (   3.22     0.37   0.19     0.2        0      0.12    4.13   0.13     3.43  )

Left nonnegative factors cluster documents

         a      b      c      d      e      f
W1 = (   0.03   2.23   0      0      1.59   2.24  )
W2 = (   0.1    0      0.08   3.13   2.32   0     )
W3 = (   2.13   0      2.22   0      0      0.03  )
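A nonnegative factorization of the word-count table can be computed, for instance, with multiplicative updates. This is one possible method and an assumption of this sketch; the slides do not say which algorithm produced the factors shown, and the result depends on the random initialization, so the factors may come out in a different order or scaling.

```python
import numpy as np

# Word counts from the topic-modeling table (rows: articles a to f).
counts = np.array([[6, 1, 1, 0, 0, 1, 9, 0, 8],
                   [1, 0, 9, 5, 8, 1, 0, 1, 0],
                   [8, 1, 0, 1, 0, 0, 9, 1, 7],
                   [0, 7, 1, 0, 0, 9, 1, 7, 0],
                   [0, 5, 6, 7, 5, 6, 0, 7, 2],
                   [1, 0, 8, 5, 9, 2, 0, 0, 1]], dtype=float)

def nmf(M, rank, n_iter=500, seed=0):
    """Multiplicative updates for minimize ||M - W H||_F with W, H >= 0."""
    rng = np.random.default_rng(seed)
    W = rng.random((M.shape[0], rank)) + 0.1
    H = rng.random((rank, M.shape[1])) + 0.1
    eps = 1e-12  # avoids division by zero
    for _ in range(n_iter):
        H *= (W.T @ M) / (W.T @ W @ H + eps)
        W *= (M @ H.T) / (W @ H @ H.T + eps)
    return W, H

W, H = nmf(counts, rank=3)
error = np.linalg.norm(counts - W @ H)
```

Each row of `H` weights the words of one topic and each column of `W` weights the articles, so large entries indicate which words and documents belong to the same cluster.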

Data-analysis problems

Signal structure

Methods
- General techniques
- Denoising
- Signal recovery
- Signal separation
- Regression
- Compression / dimensionality reduction
- Clustering

When is the problem well posed?

Theoretical analysis


Denoising based on sparsity

Signal is sparse in the chosen representation

Noise is not sparse in the chosen representation

Signal recovery

                      Measurements                        Class of signals
Compressed sensing    Gaussian, random Fourier coeffs.    Sparse
Super-resolution      Low pass                            Signals with min. separation
Matrix completion     Random sampling                     Incoherent low-rank matrices

Compressed sensing

[Figure: measurement equations involving the spectrum of x]

Measurement operator = random frequency samples

Aim: Study effect of measurement operator on sparse vectors

Operator is well conditioned when acting upon any sparse signal (restricted isometry property)
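The conditioning of a random operator on sparse vectors can be probed numerically. The sketch below uses a Gaussian matrix and a random sample of supports; the sizes are hypothetical, and sampling supports only gives evidence of good conditioning. It does not certify the restricted isometry property, which concerns every sparse support.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, s = 200, 400, 5            # hypothetical sizes: measurements, dimension, sparsity
A = rng.standard_normal((m, n)) / np.sqrt(m)

# Singular values of random m x s column submatrices: if they are all close to 1,
# A approximately preserves the norm of vectors supported on those columns.
smallest, largest = np.inf, 0.0
for _ in range(100):
    support = rng.choice(n, size=s, replace=False)
    sv = np.linalg.svd(A[:, support], compute_uv=False)
    smallest = min(smallest, sv.min())
    largest = max(largest, sv.max())
```

When `smallest` and `largest` are both close to 1, distinct sparse vectors on the sampled supports are mapped to clearly distinct measurements, which is what makes recovery well posed.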

Super-resolution

[Figure: x, spectrum of x, operator F_c, data y]

No discretization

Data: Low-pass Fourier coefficients

Problem: If the support is clustered, the problem may be ill posed

In super-resolution sparsity is not enough!

If the support is spread out, there is still hope

We need conditions beyond sparsity

Matrix completion

$$
\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}
\begin{bmatrix} 1 & 1 & 1 & 1 \end{bmatrix}
+
\begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}
\begin{bmatrix} 1 & 2 & 3 & 4 \end{bmatrix}
=
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 2 & 3 & 4 & 5 \end{bmatrix}
$$

Signal separation

The signals are identifiable

Example: For low rank + sparse model, low rank component cannot be sparse and vice versa

Page 225: Review - Courant Institute of Mathematical Sciencescfgranda/pages/OBDA_spring16... · Review Optimization-Based Data Analysis cfgranda/pages/OBDA_spring16 Carlos Fernandez-Granda

Regression

Enough examples to prevent overfitting

For sparse models, enough examples with respect to the number of relevant predictors


Least-squares regression

[Figure: relative error (l2 norm) versus the number of examples n, for n from 50 to 500. Curves: Error (training), Error (test), Noise level (training)]
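An experiment of this kind can be sketched as follows. This is not the code behind the figure: the number of predictors p = 50, the noise level sigma, the Gaussian design, and the function name relative_errors are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup (assumptions): p predictors, Gaussian features, additive noise
p, sigma, n_test = 50, 0.5, 1000
beta = rng.standard_normal(p)

def relative_errors(n):
    """Fit least squares on n examples; return relative l2 errors (train, test)."""
    X = rng.standard_normal((n, p))
    y = X @ beta + sigma * rng.standard_normal(n)
    X_test = rng.standard_normal((n_test, p))
    y_test = X_test @ beta + sigma * rng.standard_normal(n_test)

    beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]

    train = np.linalg.norm(y - X @ beta_hat) / np.linalg.norm(y)
    test = np.linalg.norm(y_test - X_test @ beta_hat) / np.linalg.norm(y_test)
    return train, test

errors = {n: relative_errors(n) for n in (50, 100, 200, 300, 400, 500)}

for n, (train, test) in errors.items():
    print(f"n = {n:3d}  training error = {train:.3f}  test error = {test:.3f}")
```

When n equals p the model fits the training data essentially exactly, noise included, and the test error is large; as n grows the test error decreases.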


Lasso (logistic regression)

n = 271 examples

[Figure: misclassification error (vertical axis, roughly 0.25 to 0.45) versus log(Lambda) (horizontal axis, -10 to -2). Labels along the top of the plot: 182, 172, 174, 159, 107, 62, 18, 5]


Data-analysis problems

Signal structure

Methods
    General techniques
    Denoising
    Signal recovery
    Signal separation
    Regression
    Compression / dimensionality reduction
    Clustering

When is the problem well posed?

Theoretical analysis


Optimization problem

minimize f (x)

subject to Ax = y

f is a nondifferentiable convex function

Examples: ℓ1 norm, nuclear norm

Aim: Show that the original signal x is the solution
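For the ℓ1 norm this problem can be written as a linear program. The following is a minimal sketch under stated assumptions: a random Gaussian A, illustrative sizes m = 40, n = 60, a 3-sparse signal, and SciPy's linprog with the HiGHS solver as one possible solver. It splits x into nonnegative parts u and v with x = u - v, so that the sum of u and v equals the ℓ1 norm at the optimum.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): m measurements, dimension n, sparsity s
m, n, s = 40, 60, 3
A = rng.standard_normal((m, n))

x_true = np.zeros(n)
x_true[rng.choice(n, size=s, replace=False)] = rng.standard_normal(s)
y = A @ x_true

# minimize sum(u) + sum(v)  subject to  A(u - v) = y,  u >= 0,  v >= 0
c = np.ones(2 * n)
A_eq = np.hstack([A, -A])
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")

x_hat = res.x[:n] - res.x[n:]
recovery_error = np.linalg.norm(x_hat - x_true)

print("solver succeeded:", res.success)
print("recovery error:", recovery_error)
```

In this regime, with many more measurements than nonzero entries, the minimizer is expected to coincide with the original signal up to solver tolerance.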


Dual certificate

Subgradient of f at x of the form

\[ q := A^T v \]

For any h such that Ah = 0

\[ \langle q, h \rangle = \langle A^T v, h \rangle = \langle v, Ah \rangle = 0 \]

\[ f(x + h) \geq f(x) + \langle q, h \rangle = f(x) \]
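The argument can be checked numerically for the ℓ1 norm. This is a sketch under assumptions, not the construction used in the course proofs: A is a random Gaussian matrix with illustrative sizes, the signal has two nonzero entries at arbitrarily chosen positions, and the candidate certificate is the least-squares choice of v that makes q equal to sign(x) on the support.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes and signal (assumptions, not from the slides)
m, n = 80, 100
A = rng.standard_normal((m, n))
support = np.array([3, 57])
x = np.zeros(n)
x[support] = [1.5, -2.0]

# Candidate certificate q = A^T v with q = sign(x) on the support
A_T = A[:, support]
v = A_T @ np.linalg.solve(A_T.T @ A_T, np.sign(x[support]))
q = A.T @ v

# q is a subgradient of the l1 norm at x if it matches sign(x) on the
# support and has magnitude at most 1 elsewhere
off_support = np.setdiff1d(np.arange(n), support)
is_subgradient = bool(
    np.allclose(q[support], np.sign(x[support]))
    and np.max(np.abs(q[off_support])) < 1
)

# Basis of the null space of A from the SVD (A has full row rank here)
_, _, Vt = np.linalg.svd(A)
null_basis = Vt[m:].T

# Random directions h with A h = 0, one per column
H = null_basis @ rng.standard_normal((n - m, 1000))

inner_products = q @ H                                     # <q, h> for each h
gap = np.abs(x[:, None] + H).sum(axis=0) - np.abs(x).sum()  # f(x + h) - f(x)

print("q is a subgradient:", is_subgradient)
print("max |<q, h>|:", np.max(np.abs(inner_products)))
print("min f(x + h) - f(x):", gap.min())
```

Every feasible point can be written as x + h with Ah = 0, so a nonnegative gap for all such h is what it means for x to be a solution.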


Certificates

                      Subgradient                 Row space of A
Compressed sensing    sign(x) + z, ||z||∞ < 1     Random sinusoids
Super-resolution      sign(x) + z, ||z||∞ < 1     Low-pass sinusoids
Matrix completion     UV^T + Z, ||Z|| < 1         Observed entries


Certificate for compressed sensing


Certificate for super-resolution

[Figure: two plots, each with a vertical axis running from −1 to 1]