How ConvNets See + Applications


Page 1:

Understanding How ConvNets See

CSC321: Intro to Machine Learning and Neural Networks, Winter 2016

Michael Guerzhoy

Slides from Andrej Karpathy

Springenberg et al, Striving for Simplicity: The All Convolutional Net (ICLR 2015 workshops)

Page 2:

What Does a Neuron Do in a ConvNet? (1)

• A neuron in the first hidden layer computes a weighted sum of the pixels in the patch of the image for which it is responsible

K. Fukushima, "Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position" (Biol. Cybernetics 1980)

Page 3:

What Does a Neuron Do in a ConvNet? (2)

• For neurons in the first hidden layer, we can visualize the weights.

Example weights for one neuron in a fully-connected, single-hidden-layer network for faces

Weights for 9 features in the first convolutional layer of a network for classifying ImageNet images

Zeiler and Fergus, "Visualizing and Understanding Convolutional Networks"

Page 4:

What Does a Neuron Do in a ConvNet? (3)

• The neuron would be activated the most if the input looked like the weight matrix

• These are called "Gabor-like filters"

• The colour is due to the input being 3D: we visualize the strength of the weights going from each of the R, G, and B components (a small sketch of plotting such filters follows)
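A minimal sketch of how one might plot these first-layer filters, assuming a pretrained torchvision AlexNet (the model choice and the per-filter rescaling are my own assumptions, not from the slides):

import torch
import torchvision.models as models
import matplotlib.pyplot as plt

# Load a pretrained network and pull out the first convolutional layer's weights.
model = models.alexnet(weights="DEFAULT")
W = model.features[0].weight.data.clone()      # shape (64, 3, 11, 11): 64 RGB filters

# Rescale each filter to [0, 1] so its R, G, B weights can be shown as a colour image.
lo = W.amin(dim=(1, 2, 3), keepdim=True)
hi = W.amax(dim=(1, 2, 3), keepdim=True)
W = (W - lo) / (hi - lo)

fig, axes = plt.subplots(8, 8, figsize=(8, 8))
for i, ax in enumerate(axes.flat):
    ax.imshow(W[i].permute(1, 2, 0))           # (11, 11, 3): one Gabor-like filter
    ax.axis("off")
plt.show()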

Page 5:

What Does a Neuron Do in a ConvNet? (4)

• Another way of figuring out what kinds of images activate the neuron: just try lots of images from a dataset, and see which ones activate the neuron the most

Zeiler and Fergus, "Visualizing and Understanding Convolutional Networks"

For each feature, find the 9 images that produce the highest activations for the neuron, and crop out the relevant patch (see the sketch below)
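A minimal sketch of the "try lots of images" idea, assuming a pretrained torchvision AlexNet and an ImageFolder-style dataset; the layer slice, feature index, and dataset path are placeholders of my own, and normalization is omitted for brevity:

import torch
import torchvision.models as models
import torchvision.transforms as T
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader

model = models.alexnet(weights="DEFAULT").eval()
layer = model.features[:6]           # network up to some intermediate conv layer
feature_idx = 42                     # which feature map to inspect (arbitrary)

transform = T.Compose([T.Resize(256), T.CenterCrop(224), T.ToTensor()])
loader = DataLoader(ImageFolder("path/to/images", transform=transform), batch_size=32)

scores = []
with torch.no_grad():
    for images, _ in loader:
        acts = layer(images)                                   # (B, C, H, W) feature maps
        scores.append(acts[:, feature_idx].amax(dim=(1, 2)))   # strongest response per image
scores = torch.cat(scores)
top9 = scores.topk(9).indices        # the 9 images that activate this feature the most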

Page 6:

Aside: Relevant Patch?

• Each neuron is affected by some small patch in the layer below

• We can recursively figure out which patch of the input each neuron is affected by (see the sketch after this list)

• Neurons in the top layers are affected by (almost) the entire image
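A small sketch of that recursion: working backwards from the top, a layer with kernel size k and stride s expands a region of r output positions into s*(r - 1) + k input positions (padding ignored). The layer list below is a hypothetical AlexNet-like stack, not taken from the slides:

def receptive_field(layers):
    """layers: list of (kernel_size, stride) pairs, ordered from first layer to last."""
    r = 1                                # start from a single neuron in the top layer
    for k, s in reversed(layers):
        r = s * (r - 1) + k              # patch size needed in the layer below
    return r

# Hypothetical conv/pool stack: (kernel, stride)
layers = [(11, 4), (3, 2), (5, 1), (3, 2), (3, 1), (3, 1), (3, 1), (3, 2)]
print(receptive_field(layers))           # top-layer neurons see a large patch of the input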

Page 7:

This allows us to look at layers besides the first one: Layer 3

Page 8:

Layer 4

Page 9:

Layer 5

Page 10:

Which Pixels in the Input Affect the Neuron the Most?

• Rephrased: which pixels would make the neuron not turn on if they had been different?

• In other words, for which inputs $x_i$ is $\frac{\partial\,\mathrm{neuron}}{\partial x_i}$ large?

Page 11:

π‘₯1 π‘₯2 π‘₯3

π‘Š(1) π‘Š(1)π‘Š(2)

π‘Š(2)

𝑠1 =

𝑖=1

2

π‘Š(𝑖)π‘₯𝑖𝑠2 =

𝑖=1

2

π‘Š(𝑖)π‘₯𝑖+1

π‘Ÿπ‘’π‘™π‘’1 π‘Ÿπ‘’π‘™π‘’2

β„Ž1 β„Ž2

π‘šπ‘Žπ‘₯π‘Ÿπ‘’π‘™π‘’ π‘₯ = α‰Šπ‘₯, π‘₯ β‰₯ 00, π‘₯ < 0

β„Ž3Assume that for the particular image π‘₯, β„Ž2 > β„Ž3

πœ•β„Ž3πœ•β„Ž2

= 1 𝑖𝑓 β„Ž1 < β„Ž2

πœ•β„Ž3πœ•π‘Ÿπ‘’π‘™π‘’2

=πœ•β„Ž3πœ•β„Ž2

πœ•β„Ž2πœ•π‘Ÿπ‘’π‘™π‘’2

= α‰Š1, 𝑠2 > 00, π‘œ/𝑀

πœ•β„Ž3πœ•π‘ 2

=πœ•β„Ž3

πœ•π‘Ÿπ‘’π‘™π‘’2

πœ•π‘Ÿπ‘’π‘™π‘’2πœ•π‘ 2

= α‰Š1, 𝑠2 > 00, π‘œ/𝑀

πœ•β„Ž3πœ•π‘₯3

=πœ•β„Ž3πœ•π‘ 2

πœ•π‘ 2πœ•π‘₯3

= ΰ΅π‘Š(2), 𝑠2 > 0

0, π‘œ/π‘€πœ•β„Ž3πœ•π‘₯2

=πœ•β„Ž3πœ•π‘ 2

πœ•π‘ 2πœ•π‘₯3

= ΰ΅π‘Š(1), 𝑠2 > 0

0, π‘œ/𝑀

Page 12:

Typical Gradient of a Neuron

• Visualize the gradient of a particular neuron with respect to the input x

• Do a forward pass

• Compute the gradient of a particular neuron using backprop (sketch below)
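A minimal sketch of those two steps, assuming a pretrained torchvision AlexNet; the input is a stand-in tensor and the class index (281, "tabby cat") is illustrative:

import torch
import torchvision.models as models

model = models.alexnet(weights="DEFAULT").eval()
img = torch.randn(1, 3, 224, 224, requires_grad=True)   # stand-in for a preprocessed image

scores = model(img)                     # forward pass
scores[0, 281].backward()               # backprop from the chosen output neuron

# Gradient of that neuron with respect to every input pixel; collapse the colour
# channels to get a single saliency-style map.
saliency = img.grad.abs().amax(dim=1)   # shape (1, 224, 224)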

Page 13:

Typical Gradient of a Neuron

• Mostly zero away from the object, but the results are not very satisfying

• Every pixel influences the neuron via multiple hidden neurons. The network is trying to detect kittens everywhere, and the same pixel could fit a kitten in one location but not another, so its overall effect on the kitten neuron can be 0 (explanation on the board)

Page 14:

"Guided Backpropagation"

• Idea: neurons act like detectors of particular image features

• We are only interested in what image features the neuron detects, not in what kind of stuff it doesn't detect

• So when propagating the gradient, we set all the negative gradients to 0 (sketch below)

• We don't care if a pixel "suppresses" a neuron somewhere along the path to our neuron
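A minimal sketch of guided backpropagation (my own, assuming a pretrained torchvision network): a backward hook on every ReLU zeroes out the negative gradients as they are propagated; the class index is illustrative:

import torch
import torch.nn as nn
import torchvision.models as models

model = models.alexnet(weights="DEFAULT").eval()

def guide(module, grad_input, grad_output):
    # Keep only positive gradients; the ReLU's own backward already blocks units with s <= 0.
    return (torch.clamp(grad_input[0], min=0.0),)

for m in model.modules():
    if isinstance(m, nn.ReLU):
        m.inplace = False                    # full backward hooks and in-place ops do not mix
        m.register_full_backward_hook(guide)

img = torch.randn(1, 3, 224, 224, requires_grad=True)   # stand-in for a real image
model(img)[0, 281].backward()
guided_grad = img.grad                       # typically much sharper than plain backprop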

Page 15:

π‘₯1 π‘₯2 π‘₯3

π‘Š(1) π‘Š(1)π‘Š(2)

π‘Š(2)

𝑠1 =

𝑖=1

2

π‘Š(𝑖)π‘₯𝑖𝑠2 =

𝑖=1

2

π‘Š(𝑖)π‘₯𝑖+1

π‘Ÿπ‘’π‘™π‘’1 π‘Ÿπ‘’π‘™π‘’2

β„Ž1 β„Ž2

π‘šπ‘Žπ‘₯π‘Ÿπ‘’π‘™π‘’ π‘₯ = α‰Šπ‘₯, π‘₯ β‰₯ 00, π‘₯ < 0

β„Ž3Assume that for the particular image π‘₯, β„Ž2 > β„Ž3

πœ•β„Ž3πœ•β„Ž2

= 1 𝑖𝑓 β„Ž1 < β„Ž2

πœ•β„Ž3πœ•π‘Ÿπ‘’π‘™π‘’2

=πœ•β„Ž3πœ•β„Ž2

πœ•β„Ž2πœ•π‘Ÿπ‘’π‘™π‘’2

= α‰Š1, 𝑠2 > 00, π‘œ/𝑀

πœ•β„Ž3πœ•π‘ 2

=πœ•β„Ž3

πœ•π‘Ÿπ‘’π‘™π‘’2

πœ•π‘Ÿπ‘’π‘™π‘’2πœ•π‘ 2

= α‰Š1, 𝑠2 > 00, π‘œ/𝑀

πœ•β„Ž3πœ•π‘₯3

=πœ•β„Ž3πœ•π‘ 2

πœ•π‘ 2πœ•π‘₯3

= ΰ΅π‘Š(2), 𝑠2 > 0

0, π‘œ/π‘€πœ•β„Ž3πœ•π‘₯2

=πœ•β„Ž3πœ•π‘ 2

πœ•π‘ 2πœ•π‘₯3

= ΰ΅π‘Š(1), 𝑠2 > 0

0, π‘œ/𝑀

set to 0 of negative

set to 0 of negative

Page 16:

Guided Backpropagation

Compute gradient, zero out negatives, backpropagate (repeated at each layer)

Page 17:

Guided Backpropagation

Backprop vs. Guided Backprop

Page 18:

Guided Backpropagation

Springenberg et al, Striving for Simplicity: The All Convolutional Net (ICLR 2015 workshops)

Page 19:

What About Doing Gradient Descent?

• Want to maximize the i-th output of the softmax

• Can compute the gradient of the i-th output of the softmax with respect to the input x (the W's and b's are fixed at the values that make classification as good as possible)

• Perform gradient descent on the input (see the sketch below)
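A minimal sketch of this procedure (my own, assuming a pretrained torchvision AlexNet); here the pre-softmax score is maximized for simplicity, and the class index, step size, and iteration count are arbitrary:

import torch
import torchvision.models as models

model = models.alexnet(weights="DEFAULT").eval()
for p in model.parameters():
    p.requires_grad_(False)                  # the W's and b's stay fixed

x = torch.zeros(1, 3, 224, 224, requires_grad=True)     # start from a blank image
target_class = 130                                       # e.g. "flamingo" (illustrative)

for _ in range(200):
    score = model(x)[0, target_class]        # the i-th output (pre-softmax)
    score.backward()
    with torch.no_grad():
        x += 1.0 * x.grad                    # step the input in the direction that raises the score
        x.grad.zero_()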

Page 20:

Yosinski et al, Understanding Neural Networks Through Deep Visualization (ICML 2015)

Page 21:

Page 22:

Page 23:

(A Small Tweak For the Gradient Descent Algorithm)

• Doing gradient descent can lead to things that don't look like images at all, and yet maximize the output

• To keep the images from looking like white noise, do the following (a sketch follows after this list):

• Update the image x using a gradient descent step

• Blur the image x
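A minimal sketch of this tweak (my own, not the exact procedure from Yosinski et al.): alternate a gradient step on x with a Gaussian blur of x; the blur parameters, step size, and class index are assumptions:

import torch
import torchvision.models as models
from torchvision.transforms.functional import gaussian_blur

model = models.alexnet(weights="DEFAULT").eval()
x = torch.zeros(1, 3, 224, 224, requires_grad=True)
target_class = 130                                       # illustrative class index

for _ in range(200):
    model(x)[0, target_class].backward()
    with torch.no_grad():
        x += 1.0 * x.grad                                      # gradient step on the image x
        x.grad.zero_()
        x.copy_(gaussian_blur(x, kernel_size=5, sigma=1.0))    # blur the image x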