Sketched Learning from Random Features Moments

Nicolas Keriven
École Normale Supérieure (Paris)
CFM-ENS chair in Data Science
(thesis with Rémi Gribonval at Inria Rennes)

ISMP, July 6th, 2018
Compressive learning

• Sketched learning: first compress the data into a linear sketch [Cormode 2011], then learn
• Examples of sketches: hash tables, count sketches, histograms…
• Advantages: one-pass, streaming, distributed compression, data privacy…
• In this talk: unsupervised learning

Nicolas Keriven 1/15
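The advantages above all follow from linearity: a sketch is an average of features of the data, so it can be computed in a single pass over a stream and merged across machines. A minimal illustration (the feature map `phi` here is a stand-in for any feature map, not the specific features used in the talk):

```python
import numpy as np

def sketch(stream, phi, m):
    """Average a feature map phi over a data stream in one pass."""
    s = np.zeros(m, dtype=complex)
    n = 0
    for x in stream:
        s += phi(x)   # linear in the empirical distribution
        n += 1
    return s / n

# Because the sketch is a plain average, sketches of disjoint
# chunks can be merged exactly, enabling distributed computation.
def merge(s1, n1, s2, n2):
    return (n1 * s1 + n2 * s2) / (n1 + n2), n1 + n2
```

Merging the sketches of two halves of a dataset gives exactly the sketch of the whole dataset, which is why compression can be distributed.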
How-to: build a sketch

What is a sketch?
Any linear sketch = empirical moments

What is contained in a sketch?
• : mean
• : moment
• : histogram
• Proposed: kernel random features [Rahimi 2007] (random projection + non-linearity)

Questions:
• What information is preserved by the sketching?
• How to retrieve this information?
• What is a sufficient number of features?

Intuition: sketching as a linear embedding
- Assumption:
- Linear operator:
- "Noisy" linear measurement: (noise small)

Dimensionality-reducing, random, linear embedding: Compressive Sensing?

Nicolas Keriven 2/15
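Concretely, the kernel random-feature sketch of [Rahimi 2007] can be computed as an empirical average of complex exponentials of random projections. A sketch of that construction, under the assumption of Gaussian frequencies (which corresponds to a Gaussian kernel; the exact frequency distribution used in the talk may differ):

```python
import numpy as np

def fourier_sketch(X, m, sigma=1.0, seed=0):
    """Empirical random Fourier moments: (1/n) sum_i exp(i Omega x_i).

    Omega has i.i.d. Gaussian rows of scale 1/sigma, matching a
    Gaussian kernel of bandwidth sigma (illustrative assumption).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    Omega = rng.normal(scale=1.0 / sigma, size=(m, d))  # random projections
    return np.exp(1j * X @ Omega.T).mean(axis=0)        # non-linearity + average
```

Each entry of the sketch is an average of unit-modulus numbers, so the whole dataset is summarized by m bounded complex moments.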
Example of applications [Keriven 2016, 2017]

Retrieving mixtures of Diracs from a sketch = k-means
Application: spectral clustering for MNIST classification
- Twice as fast as k-means
- Four orders of magnitude more memory-efficient

Retrieving GMMs from a sketch
Application: speaker verification [Reynolds 2000]
Error:
• EM on 300,000 samples: 29.53
• 20 kB sketch computed on a 50 GB database: 28.96

Nicolas Keriven 3/15
In this talk

Q: Theoretical guarantees?
Inspired by Compressive Sensing:
• 1: with the Restricted Isometry Property (RIP)
• 2: with dual certificates

Nicolas Keriven 4/15
Outline

Information-preservation guarantees: a RIP analysis
(joint work with R. Gribonval, G. Blanchard, Y. Traonmilin)

Total variation regularization: a dual certificate analysis

Conclusion, outlooks
Recall: Linear inverse problem

True distribution:
Sketch:

• Estimation problem = linear inverse problem on measures
• Extremely ill-posed!
• Feasibility? (information preservation — best algorithm possible)

Nicolas Keriven 5/15
Information preservation guarantees

Model set of "simple" distributions (e.g. GMMs)

Goal
Prove the existence of a decoder robust to noise and stable to modeling error.
• "Instance-optimal" decoder
• Lower Restricted Isometry Property (LRIP)
• Generalized Method of Moments

New goal: find/construct models and operators that satisfy the LRIP (w.h.p.)

Nicolas Keriven 6/15
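The Lower Restricted Isometry Property invoked here can be written schematically as follows, with the sketching operator denoted \(\mathcal{A}\), the model set \(\mathfrak{S}\), and a task-dependent metric \(\|\cdot\|\) (the notation is chosen here for illustration, since the slide's formulas are not in the transcript):

```latex
\|\pi - \pi'\| \;\le\; C_{\mathcal{A}} \, \big\| \mathcal{A}(\pi - \pi') \big\|_2
\qquad \text{for all } \pi, \pi' \in \mathfrak{S}.
```

This lower bound guarantees that no two model distributions are confused by the sketch, which is what makes a decoder robust to noise and stable to modeling error possible.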
Proving the LRIP

Goal: LRIP

Construction of the sketching operator:
- Kernel mean [Gretton 2006, Borgwardt 2006]
- Random features [Rahimi 2007]
Pointwise LRIP

Extension to the LRIP: covering numbers (compactness) of the normalized secant set,
a subset of a unit ball (infinite-dimensional) that only depends on the model set.

Nicolas Keriven 7/15
Main result [Keriven 2016]

Main hypothesis
The normalized secant set has finite covering numbers.
- Classic Compressive Sensing (finite dimension): known
- Here (infinite dimension): technical

Result
For , w.h.p.
• Pointwise concentration; dimensionality of the model
• Modeling error; empirical noise
• Does not depend on !

Nicolas Keriven 8/15
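The labels on the slide (modeling error, empirical noise) suggest an instance-optimality bound of the following schematic form, with \(\Delta\) the decoder, \(\pi_0\) the true distribution and \(\hat{y}\) the empirical sketch (constants and the precise metric are elided on the slide):

```latex
\| \pi_0 - \Delta(\hat{y}) \| \;\lesssim\;
\underbrace{d(\pi_0, \mathfrak{S})}_{\text{modeling error}}
\;+\;
\underbrace{\big\| \mathcal{A}\pi_0 - \hat{y} \big\|_2}_{\text{empirical noise}}
```

The first term vanishes when the true distribution lies in the model set; the second concentrates as the number of samples grows.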
Application
Nicolas Keriven
k-means with mixtures of Diracs
9/15
![Page 50: Sketched Learning from Random Features Moments · Compressive learning Nicolas Keriven Compression Learning Linear sketch • Sketched learning: First compress data in a linear sketch](https://reader036.vdocuments.mx/reader036/viewer/2022071022/5fd68d382a91363a492587a4/html5/thumbnails/50.jpg)
Application
Nicolas Keriven
k-means with mixtures of Diracs
Hypotheses- - separated centroids- - bounded domain for centroids
9/15
![Page 51: Sketched Learning from Random Features Moments · Compressive learning Nicolas Keriven Compression Learning Linear sketch • Sketched learning: First compress data in a linear sketch](https://reader036.vdocuments.mx/reader036/viewer/2022071022/5fd68d382a91363a492587a4/html5/thumbnails/51.jpg)
Application
Nicolas Keriven
k-means with mixtures of Diracs
Hypotheses- - separated centroids- - bounded domain for centroids
(no assumptionon the data)
9/15
![Page 52: Sketched Learning from Random Features Moments · Compressive learning Nicolas Keriven Compression Learning Linear sketch • Sketched learning: First compress data in a linear sketch](https://reader036.vdocuments.mx/reader036/viewer/2022071022/5fd68d382a91363a492587a4/html5/thumbnails/52.jpg)
Application
Nicolas Keriven
k-means with mixtures of Diracs
Hypotheses- - separated centroids- - bounded domain for centroids
Sketch- Adjusted Random Fourier features (for
technical reasons)
(no assumptionon the data)
9/15
![Page 53: Sketched Learning from Random Features Moments · Compressive learning Nicolas Keriven Compression Learning Linear sketch • Sketched learning: First compress data in a linear sketch](https://reader036.vdocuments.mx/reader036/viewer/2022071022/5fd68d382a91363a492587a4/html5/thumbnails/53.jpg)
Application
Nicolas Keriven
k-means with mixtures of Diracs
Hypotheses- - separated centroids- - bounded domain for centroids
Sketch- Adjusted Random Fourier features (for
technical reasons)
Result- W.r.t. k-means usual cost (SSE)
(no assumptionon the data)
9/15
![Page 54: Sketched Learning from Random Features Moments · Compressive learning Nicolas Keriven Compression Learning Linear sketch • Sketched learning: First compress data in a linear sketch](https://reader036.vdocuments.mx/reader036/viewer/2022071022/5fd68d382a91363a492587a4/html5/thumbnails/54.jpg)
Application
Nicolas Keriven
k-means with mixtures of Diracs
Hypotheses- - separated centroids- - bounded domain for centroids
Sketch- Adjusted Random Fourier features (for
technical reasons)
Result- W.r.t. k-means usual cost (SSE)
Sketch size
(no assumptionon the data)
9/15
![Page 55: Sketched Learning from Random Features Moments · Compressive learning Nicolas Keriven Compression Learning Linear sketch • Sketched learning: First compress data in a linear sketch](https://reader036.vdocuments.mx/reader036/viewer/2022071022/5fd68d382a91363a492587a4/html5/thumbnails/55.jpg)
Application
Nicolas Keriven
k-means with mixtures of Diracs GMM with known covariance
Hypotheses- - separated centroids- - bounded domain for centroids
Sketch- Adjusted Random Fourier features (for
technical reasons)
Result- W.r.t. k-means usual cost (SSE)
Sketch size
(no assumptionon the data)
9/15
![Page 56: Sketched Learning from Random Features Moments · Compressive learning Nicolas Keriven Compression Learning Linear sketch • Sketched learning: First compress data in a linear sketch](https://reader036.vdocuments.mx/reader036/viewer/2022071022/5fd68d382a91363a492587a4/html5/thumbnails/56.jpg)
Application
Nicolas Keriven
k-means with mixtures of Diracs GMM with known covariance
Hypotheses- - separated centroids- - bounded domain for centroids
Sketch- Adjusted Random Fourier features (for
technical reasons)
Result- W.r.t. k-means usual cost (SSE)
Sketch size
Hypotheses- Sufficiently separated means- Bounded domain for means
(no assumptionon the data)
9/15
![Page 57: Sketched Learning from Random Features Moments · Compressive learning Nicolas Keriven Compression Learning Linear sketch • Sketched learning: First compress data in a linear sketch](https://reader036.vdocuments.mx/reader036/viewer/2022071022/5fd68d382a91363a492587a4/html5/thumbnails/57.jpg)
Application
Nicolas Keriven
k-means with mixtures of Diracs GMM with known covariance
Hypotheses- - separated centroids- - bounded domain for centroids
Sketch- Adjusted Random Fourier features (for
technical reasons)
Result- W.r.t. k-means usual cost (SSE)
Sketch size
Hypotheses- Sufficiently separated means- Bounded domain for means
Sketch- Fourier features
(no assumptionon the data)
9/15
![Page 58: Sketched Learning from Random Features Moments · Compressive learning Nicolas Keriven Compression Learning Linear sketch • Sketched learning: First compress data in a linear sketch](https://reader036.vdocuments.mx/reader036/viewer/2022071022/5fd68d382a91363a492587a4/html5/thumbnails/58.jpg)
Application
Nicolas Keriven
k-means with mixtures of Diracs GMM with known covariance
Hypotheses- - separated centroids- - bounded domain for centroids
Sketch- Adjusted Random Fourier features (for
technical reasons)
Result- W.r.t. k-means usual cost (SSE)
Sketch size
Hypotheses- Sufficiently separated means- Bounded domain for means
Sketch- Fourier features
Result- With respect to log-likelihood
(no assumptionon the data)
9/15
![Page 59: Sketched Learning from Random Features Moments · Compressive learning Nicolas Keriven Compression Learning Linear sketch • Sketched learning: First compress data in a linear sketch](https://reader036.vdocuments.mx/reader036/viewer/2022071022/5fd68d382a91363a492587a4/html5/thumbnails/59.jpg)
Application
Nicolas Keriven
k-means with mixtures of Diracs GMM with known covariance
Hypotheses- - separated centroids- - bounded domain for centroids
Sketch- Adjusted Random Fourier features (for
technical reasons)
Result- W.r.t. k-means usual cost (SSE)
Sketch size
Hypotheses- Sufficiently separated means- Bounded domain for means
Sketch- Fourier features
Result- With respect to log-likelihood
Sketch size
(no assumptionon the data)
9/15
![Page 60: Sketched Learning from Random Features Moments · Compressive learning Nicolas Keriven Compression Learning Linear sketch • Sketched learning: First compress data in a linear sketch](https://reader036.vdocuments.mx/reader036/viewer/2022071022/5fd68d382a91363a492587a4/html5/thumbnails/60.jpg)
Application

k-means with mixtures of Diracs:
• Hypotheses: sufficiently separated centroids; bounded domain for the centroids
• Sketch: adjusted random Fourier features (for technical reasons)
• Result: w.r.t. the usual k-means cost (SSE)
• Sketch size

GMM with known covariance:
• Hypotheses: sufficiently separated means; bounded domain for the means
• Sketch: Fourier features
• Result: with respect to the log-likelihood (no assumption on the data)
• Sketch size

Compared to the Generalized Method of Moments: different guarantees.

9/15
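The k-means column above can be illustrated numerically. The following is my own toy sketch of the principle, not the talk's algorithm (the Gaussian frequency law, the cluster variance, the seed, and all sizes are assumptions, and the "adjusted" features are not reproduced): the empirical Fourier sketch of well-clustered data is close to the sketch of a mixture of Diracs placed at the true centroids, so matching sketches identifies the centroids.

```python
import numpy as np

def sketch_data(X, Omega):
    """Empirical sketch: average of random Fourier features exp(i * Omega @ x) over the data."""
    return np.exp(1j * X @ Omega.T).mean(axis=0)

def sketch_diracs(C, alpha, Omega):
    """Sketch of the mixture of Diracs sum_k alpha_k * delta_{c_k}."""
    return alpha @ np.exp(1j * C @ Omega.T)

rng = np.random.default_rng(1)
d, m, k = 2, 50, 2
Omega = rng.normal(size=(m, d))                  # random frequencies (assumed Gaussian law)
true_C = np.array([[0.0, 0.0], [6.0, 6.0]])      # well-separated centroids
X = np.vstack([0.1 * rng.normal(size=(200, d)) + c for c in true_C])

z = sketch_data(X, Omega)
alpha = np.full(k, 0.5)
# The moment-matching cost is smaller at the true centroids than at shifted ones.
cost_true = np.linalg.norm(sketch_diracs(true_C, alpha, Omega) - z)
cost_bad = np.linalg.norm(sketch_diracs(true_C + 3.0, alpha, Omega) - z)
```

In the actual pipeline, the decoder minimizes this cost over both centroids and weights; that minimization is non-convex, which is what motivates the convex relaxation of the next section.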
Outline

Information-preservation guarantees: a RIP analysis
Total variation regularization: a dual certificate analysis (joint work with C. Poon, G. Peyré)
Conclusion, outlooks
Total Variation regularization

Previously: RIP analysis, with minimization by moment matching
• Must know the number of components in advance
• Non-convex!

Convex relaxation ("super-resolution"): the Beurling-LASSO (BLASSO) [DeCastro 2015]
• The variable is a Radon measure
• The regularizer is its total variation norm (an "L1 norm" for measures)

Questions:
• Is the recovered measure sparse?
• Does it have the right number of components?
• Does it recover the true parameters?

11/15
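Written out, the BLASSO the slide refers to takes the standard form below. The notation is mine, reconstructed from [DeCastro 2015]-style formulations since the slide's formulas did not survive extraction: the unknown is a signed Radon measure $\mu$ on the parameter set $\Theta$, $\mathcal{A}$ is the linear sketching operator built from the features $\varphi_j$, $\hat z$ is the empirical sketch, and $|\mu|(\Theta)$ is the total variation norm.

```latex
\min_{\mu \in \mathcal{M}(\Theta)} \; \frac{1}{2} \left\| \mathcal{A}\mu - \hat z \right\|_2^2
  + \lambda\, |\mu|(\Theta),
\qquad
(\mathcal{A}\mu)_j = \int_\Theta \varphi_j(\theta)\, \mathrm{d}\mu(\theta).
```

The total variation norm plays the role of the L1 norm: it promotes discrete (sparse) solutions without fixing the number of components in advance, addressing both drawbacks of the moment-matching formulation.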
Dual certificates

Dual certificate analysis: exhibit a function ( = Lagrange multiplier) that saturates on the support of the candidate solution and stays strictly below 1 in magnitude everywhere else.

Step 1: study the full (expected) kernel [Candes 2013], assuming sufficiently separated parameters
Step 2: bound the deviations of the random-feature certificate from it

[Figure: certificate functions for m = 10, 20, 50 random features]

12/15
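In symbols, the conditions sketched above are the standard nondegenerate dual certificate conditions; this is my reconstruction following [Candes 2013] and [Duval, Peyré 2015], as the slide's formulas were lost. For a candidate solution $\mu^\star = \sum_k a_k \delta_{\theta_k}$, one seeks a function $\eta$ in the range of the adjoint of the sketching operator with

```latex
\eta = \mathcal{A}^* p, \qquad
\eta(\theta_k) = \operatorname{sign}(a_k) \ \text{for all } k, \qquad
|\eta(\theta)| < 1 \ \text{otherwise}.
```

The existence of such an $\eta$ certifies that $\mu^\star$ solves the BLASSO. Step 1 constructs it for the full limiting kernel; step 2 shows that with $m$ random features the certificate deviates little from it, as the m = 10, 20, 50 panels illustrate.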
Results for separated GMM

Assumption: the data are actually drawn from a GMM…

1: Ideal scaling in sparsity (in progress…)
• Not necessarily the right number of components, but:
• The mass of the recovered measure concentrates around the true components
• (Weak) robustness to modelling error
• Proof: an infinite-dimensional golfing scheme (new)

2: Minimal norm certificate [Duval, Peyré 2015] (in progress…)
• When n is high enough: a sparse solution, with the right number of components
• Proof: adaptation of [Tang, Recht 2013]

13/15
Sketch learning

• Sketching enables streaming and distributed learning
• An original view on data compression and generalized moments
• Combines random features and kernel mean embeddings with infinite-dimensional compressive sensing

14/15
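The streaming/distributed claim in the first bullet rests on the linearity of the sketch: chunks of data can be sketched independently (on different machines, or as a stream arrives) and merged by a weighted average. A minimal illustration, with toy sizes and a Gaussian frequency law of my choosing:

```python
import numpy as np

def sketch(X, Omega):
    """Empirical linear sketch: average of random Fourier features exp(i * Omega @ x)."""
    return np.exp(1j * X @ Omega.T).mean(axis=0)

rng = np.random.default_rng(0)
d, m = 2, 20
Omega = rng.normal(size=(m, d))        # random frequencies, shared by all workers
X1 = rng.normal(size=(100, d))         # chunk held by machine 1
X2 = rng.normal(size=(300, d)) + 5.0   # chunk held by machine 2

# Sketch of the full data set...
z_full = sketch(np.vstack([X1, X2]), Omega)
# ...equals the sample-size-weighted average of the per-chunk sketches.
z_merged = (100 * sketch(X1, Omega) + 300 * sketch(X2, Omega)) / 400
```

The same merge rule handles a stream: keep a running weighted average of per-batch sketches, so memory cost stays at m numbers regardless of n.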
Summary, outlooks

• Dual certificate analysis
• Convex minimization
• In some cases, automatically recovers the right number of components

• RIP analysis
• Information-preservation guarantees
• Fine control of noise and modeling error (instance-optimal decoder) and of recovery metrics
• Necessary and sufficient conditions

• Outlooks
• Algorithms for TV minimization
• Other features (not necessarily random…)
• Other "sketched" learning tasks
• Multilayer sketches?

15/15
Thank you!
• Gribonval, Blanchard, Keriven, Traonmilin. Compressive Statistical Learning with Random Feature Moments. 2017. <arXiv:1706.07180>
• Keriven. Sketching for Large-Scale Learning of Mixture Models. PhD Thesis. <tel-01620815>
• Poon, Keriven, Peyré. A Dual Certificates Analysis of Compressive Off-the-Grid Recovery. 2018. <arXiv:1802.08464>
• Code, applications: nkeriven.github.io