
Page 1

Convolutional Neural Networks in View of Sparse Coding

Joint work with: Jeremias Sulam, Yaniv Romano, Michael Elad

Page 2

Breiman’s “Two Cultures”

Generative modeling

Gauss, Wiener, Laplace, Bernoulli, Fisher

Predictive modeling

Page 3

Generative Modeling

[Diagram: the forward pass of a CNN set against a generative model, both built from a hierarchy of parts: edge → nose, eye → face]

Page 4

What generative model? DRMM!

Page 5

Properties of model?

For hierarchical compositional functions, deep but not shallow networks avoid the curse of dimensionality because of the locality of the constituent functions.

Page 6

Properties of model? Weights and pre-activations are i.i.d. Gaussian.

Page 7

Properties of model? Overparameterization is good for optimization.

Page 8

Success of inference?

Page 9

Uniqueness of representation?

Page 10

Stability to perturbations?

Page 11

Better inference? Information should propagate both within and between levels of representation, in a bidirectional manner.

Page 12

Better training? EM! Random features, k-means, matrix factorization.

Page 13

Sparse Representation Generative Model

● Receptive fields in visual cortex are spatially localized, oriented and bandpass

● Coding natural images while promoting sparse solutions results in a set of filters satisfying these properties [Olshausen and Field 1996]

● Two decades later…
○ vast theoretical study
○ different inference algorithms
○ different ways to train the model

Page 14

Evolution of Models

[Diagram: Sparse representation → Convolutional sparse representation → Multi-layered convolutional sparse representation, paralleled by: First layer of a neural network → First layer of a convolutional neural network → Multi-layered convolutional neural network]

Page 15

First Layer of a Neural Network

Page 16

Sparse Modeling

Task: model image patches of size 8x8 pixels

We assume a dictionary of such image patches is given, containing 256 atoms

Assumption: every patch can be described as a linear combination of a few atoms

Key properties: sparsity and redundancy
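In the matrix notation of the next slide, and with the dimensions quoted above (8x8 patches, a dictionary of 256 atoms), the model can be written as (a sketch in standard notation; the symbols are mine):

$$ x \approx D\alpha, \qquad D \in \mathbb{R}^{64 \times 256}, \qquad \|\alpha\|_0 \ll 256 $$

i.e., every vectorized patch x is a linear combination of a few columns (atoms) of the redundant dictionary D.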

Page 17

Matrix Notation

Page 18

Sparse Coding

Given a signal, we would like to find its sparse representation

Convexify

Page 19

Sparse Coding

Given a signal, we would like to find its sparse representation

Crude approximation

Convexify
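The two routes sketched on this slide (the formulas were rendered as images in the original) are, in standard notation, the l0 pursuit problem, attacked either by a crude approximation (the thresholding algorithm of the next slide) or by convexifying the l0 into an l1, i.e., basis pursuit:

$$ \min_{\alpha} \|\alpha\|_0 \ \ \text{s.t.}\ \ \|x - D\alpha\|_2 \le \varepsilon \qquad \longrightarrow \qquad \min_{\alpha} \|\alpha\|_1 \ \ \text{s.t.}\ \ \|x - D\alpha\|_2 \le \varepsilon $$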

Page 20

Thresholding Algorithm
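A minimal NumPy sketch of a thresholding pursuit, as referenced on this slide (the hard-thresholding variant with a fixed cardinality k; names are illustrative):

```python
import numpy as np

def thresholding_pursuit(x, D, k):
    """Crude sparse coding: correlate the signal with every atom and keep only
    the k largest correlations (in magnitude), zeroing out the rest."""
    z = D.T @ x                              # inner products with all atoms
    alpha = np.zeros_like(z)
    support = np.argsort(np.abs(z))[-k:]     # indices of the k largest |z_i|
    alpha[support] = z[support]
    return alpha
```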

Page 21

First Layer of a Neural Network

Page 22

ReLU = Soft Nonnegative Thresholding

ReLU is equivalent to soft nonnegative thresholding
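A sketch of the identity behind this slide: applying a ReLU to Wx + b is the same as soft nonnegative thresholding of Wx with threshold beta = -b (the random W, x and the value of beta below are only for the check):

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def soft_nonneg_threshold(z, beta):
    """Shrink by beta and clip negative values to zero."""
    return np.maximum(z - beta, 0.0)

rng = np.random.default_rng(0)
W, x, beta = rng.standard_normal((256, 64)), rng.standard_normal(64), 0.5
assert np.allclose(relu(W @ x - beta), soft_nonneg_threshold(W @ x, beta))
```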

Page 23

First layer of a Convolutional Neural Network

[Figure: convolutional feature maps of size height × width × #filters]

Page 24

Convolutional Sparse Modeling

Page 25

Convolutional Sparse Modeling

[Figure: sparse feature maps of size height × width × #filters]
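In equations, a standard way to write the convolutional sparse model sketched in this figure (notation is mine): the signal is a superposition of small filters convolved with signal-sized sparse feature maps,

$$ x = \sum_{i=1}^{m} d_i * \gamma_i = D\,\Gamma $$

where D is the resulting banded convolutional dictionary, built from all shifts of the m local filters, and Γ stacks the sparse feature maps γ_i.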

Page 26

Convolutional Sparse Modeling

[Figure: the convolutional (“deconv”) dictionary applied to sparse feature maps of size height × width × #filters]

Page 27

Thresholding Algorithm

Page 28

First layer of a Convolutional Neural Network

Page 29

First layer of a Convolutional Neural Network

[Figure: feature maps of size height × width × #filters]

Page 30

Convolutional Neural Network

[Figure: stacked layers of feature maps, each of size height × width × #filters]

Page 31

Multi-layered Convolutional Sparse Modeling

Page 32

[Figure: stacked layers of sparse feature maps, each of size height × width × #filters]

Multi-layered Convolutional Sparse Modeling
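Written out, the multi-layered model composes convolutional sparse models: the representation of one layer is itself assumed sparse over the next convolutional dictionary (a sketch of the standard formulation with K layers):

$$ x = D_1 \Gamma_1, \qquad \Gamma_1 = D_2 \Gamma_2, \qquad \ldots, \qquad \Gamma_{K-1} = D_K \Gamma_K $$

with every Γ_i sparse in the local sense discussed later under “Local Sparsity”.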

Page 33

Layered Thresholding
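A minimal sketch of layered (soft nonnegative) thresholding: estimate Γ1 from x by thresholding D1ᵀx, then Γ2 from that estimate, and so on. With Wi = Diᵀ and bias bi = -βi this is exactly the forward pass of the CNN on the next slides (dense matrices are used here for simplicity):

```python
import numpy as np

def soft_nonneg_threshold(z, beta):
    return np.maximum(z - beta, 0.0)

def layered_thresholding(x, dictionaries, thresholds):
    """Estimate the representations of all layers, one thresholding step per layer."""
    gammas, estimate = [], x
    for D, beta in zip(dictionaries, thresholds):
        estimate = soft_nonneg_threshold(D.T @ estimate, beta)  # = ReLU(W @ estimate + b)
        gammas.append(estimate)
    return gammas
```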

Page 34

Convolutional Neural Network

Page 35

[Figure: stacked layers of feature maps, each of size height × width × #filters]

Convolutional Neural Network

Page 36

Theories of Deep Learning

Page 37

Evolution of Models

[Diagram as on Page 14]

Page 38

Sparse Modeling

Page 39

Classic Sparse Theory

Theorem [Donoho and Elad, 2003]: Basis pursuit is guaranteed to recover the true sparse vector, assuming that
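The condition, shown as an image in the original slide, is the classic coherence bound; in the standard statement of this result, basis pursuit succeeds whenever

$$ \|\alpha\|_0 < \tfrac{1}{2}\left(1 + \tfrac{1}{\mu(D)}\right) $$

where μ(D) is the mutual coherence defined on the next slide.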

Page 40

Mutual Coherence:
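For a dictionary D with l2-normalized columns (atoms) d_i, the quantity defined here is

$$ \mu(D) = \max_{i \neq j} \left| d_i^{\top} d_j \right| $$

the largest absolute inner product between two distinct atoms.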

Page 41

Convolutional Sparse Modeling

Page 42

Classic Sparse Theory for Convolutional Case

Assuming 2 atoms of length 64 [Welch, 1974]

Success guaranteed when

Theorem [Donoho and Elad, 2003]: Basis pursuit is guaranteed to recover the true sparse vector, assuming that

Very pessimistic! The bound caps the total number of nonzeros allowed in the entire signal, no matter how long the signal is.

Page 43

Local Sparsity

the maximal number of non-zeros in a local neighborhood

[Figure: a local neighborhood of the sparse feature maps (height × width × #filters)]
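In the notation of the papers cited here, this local measure is the l0,∞ pseudo-norm: writing γ_i for the i-th stripe of Γ (the group of coefficients that can influence one local patch of the signal),

$$ \|\Gamma\|_{0,\infty} = \max_i \|\gamma_i\|_0 $$

i.e., the maximal number of nonzeros in any local neighborhood, exactly as stated above.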

Page 44

Theorem: [Papyan, Sulam and Elad, 2016]

Assume:

Then:

Success of Basis Pursuit

Theoretical guarantee for:

● [Zeiler et al. 2010]
● [Wohlberg 2013]
● [Bristow et al. 2013]
● [Fowlkes and Kong 2014]
● [Zhou et al. 2014]
● [Kong and Fowlkes 2014]
● [Zhu and Lucey 2015]
● [Heide et al. 2015]
● [Gu et al. 2015]
● [Wohlberg 2016]
● [Šorel and Šroubek 2016]
● [Serrano et al. 2016]
● [Papyan et al. 2017]
● [Garcia-Cardona and Wohlberg 2017]
● [Wohlberg and Rodriguez 2017]
● ...

Page 45

Multi-layered Convolutional Sparse Modeling

Page 46

Deep Coding Problem

Page 47

Deep Coding Problem
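A sketch of the pursuit problem this slide refers to (the formal statement is in the Papyan–Romano–Elad papers): given a signal x and the dictionaries, find representations that satisfy the model at every layer while being locally sparse,

$$ \text{find } \{\Gamma_i\}_{i=1}^{K} \ \ \text{s.t.}\ \ x = D_1\Gamma_1, \quad \Gamma_{i-1} = D_i\Gamma_i, \quad \|\Gamma_i\|_{0,\infty} \le \lambda_i, \quad i = 1,\ldots,K $$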

Page 48

Uniqueness

Page 49

Uniqueness Theorem
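The statement (the formulas were images in the original) is, in the notation above, along the following lines: if the deep coding problem admits a solution whose layers are all sparse enough relative to the mutual coherences of their dictionaries,

$$ \|\Gamma_i\|_{0,\infty} < \tfrac{1}{2}\left(1 + \tfrac{1}{\mu(D_i)}\right) \quad \text{for all } i, $$

then that set of representations is the unique solution.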

Page 50

Success of Forward Pass

Page 51

Success of Forward Pass Theorem

Layered thresholding is guaranteed to:
1. Find the correct places of the nonzeros
2. Approximate the true representations up to a bounded error

Forward pass always fails at recovering representations exactly

Success depends on the ratio between the smallest and the largest nonzero coefficients

Distance increases with layer

Page 52

Generative Model and Crude Inference

Page 53

Layered Lasso (Stats Department)

Page 54

Success of Layered Lasso

The Layered Lasso is guaranteed to:
1. Find only correct places of nonzeros
2. Find all coefficients that are big enough
3. Approximate the true representations up to a bounded error

Forward pass always fails at recovering representations exactly

Success depends on ratio

Distance increases with layer

Page 55

Layered Iterative Thresholding
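A sketch of layered iterative (soft) thresholding: instead of a single thresholding step per layer, each layer runs an iterative soft-thresholding (ISTA) pursuit, and its output feeds the next layer; a single ISTA iteration from a zero initialization is, up to scaling, the layered soft thresholding above, i.e., essentially the forward pass. (Dense matrices and fixed step sizes are used here for simplicity.)

```python
import numpy as np

def soft_threshold(z, beta):
    return np.sign(z) * np.maximum(np.abs(z) - beta, 0.0)

def ista(y, D, beta, n_iters=100):
    """ISTA for the Lasso:  min_g  0.5 * ||y - D g||_2^2 + beta * ||g||_1."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2   # 1 / Lipschitz constant of the gradient
    g = np.zeros(D.shape[1])
    for _ in range(n_iters):
        g = soft_threshold(g + step * D.T @ (y - D @ g), step * beta)
    return g

def layered_ista(x, dictionaries, betas, n_iters=100):
    """Layered pursuit: solve a Lasso per layer, feeding each estimate forward."""
    gammas, estimate = [], x
    for D, beta in zip(dictionaries, betas):
        estimate = ista(estimate, D, beta, n_iters)
        gammas.append(estimate)
    return gammas
```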

Page 56

Supervised Deep Sparse Coding Networks [Sun et al., 2017]

Page 57

Relation to Other Generative Models

Page 58

[Figure: stacked layers of sparse feature maps, each of size height × width × #filters]

Multi-layered Convolutional Sparse Modeling

Page 59

Generator in GANs [Goodfellow et al., 2014]

Sparsification of intermediate feature maps with ReLU

Page 60

DRMM [Patel et al.]

Sparsification of intermediate feature maps with a random mask

Page 61

[Arora et al., 2015]

Sparsification of intermediate feature maps with a random mask and ReLU

Page 62

Evidence

Page 63

Olshausen & Field and AlexNet

[Figure: Olshausen & Field (explicit sparsity) alongside AlexNet (implicit sparsity)]

Page 64

Sparsity in Practice

Page 65

Sparsity in Practice

Page 66

Mutual Coherence in Practice

[Shang 2015] measured the average mutual coherences of the different layers in the “all-conv” network:

Page 67

Regularizing Coherence

[Cisse et al., 2017] proposed the following regularization to improve the robustness of a network to adversarial examples:
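The regularizer itself was an image in the original slide; what [Cisse et al., 2017] (Parseval networks) penalize is, roughly, the deviation of each layer's Gram matrix from the identity, which keeps the filters nearly orthonormal and hence the layer's mutual coherence low. A minimal sketch of that kind of penalty (the exact per-layer form and constants are in the paper; names below are mine):

```python
import numpy as np

def coherence_penalty(D, beta=1e-3):
    """Penalize off-diagonal correlations between filters (columns of D):
    beta/2 * ||D^T D - I||_F^2.  Driving this term down drives the mutual
    coherence of the layer toward zero."""
    gram = D.T @ D - np.eye(D.shape[1])
    return 0.5 * beta * np.sum(gram ** 2)
```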

Page 68

Local Sparsity

Page 69

Summary

Sparsity is covertly exploited in practice: ReLU, dropout, stride, dilation, ...

Sparsity is well established theoretically

Sparsity is the secret sauce behind CNNs

Need to bring sparsity to the surface to better understand CNNs

Andrej Karpathy agrees