![Page 1: Understanding Deep Image Representations by Inverting Themweb.cs.ucdavis.edu/~yjlee/teaching/ecs289g-fall2015/anthony2.pdf · 14 Representations: SIFT and HOG DSIFT and HOG implemented](https://reader033.vdocuments.mx/reader033/viewer/2022050314/5f763458409b315b763df766/html5/thumbnails/1.jpg)
1
Understanding Deep Image Representations by Inverting Them
Paper by Aravindh Mahendran, Andrea Vedaldi
Presentation by Anthony Chen
2
Background
● Feature extraction methods such as SIFT, HOG, and CNNs are widely used, but they are difficult to understand from an information-preservation standpoint.
3
Contributions
● Novel method to invert representations.
– That is, given a function and its output, recover the original input.
● Analysis of the information preservation of different types of representation (CNN, HOG, SIFT).
4
Related Work
● DeConvNets
Your thoughts on similarities/differences?
5
Related Work (2)
● DeConvNets – My thoughts
– DeConvNet reconstructions are encouraged to look like the original image, while this paper enforces no such constraint.
– Therefore, while both can be thought of as inverses, DeConvNet studies how results are obtained, whereas this paper studies information representation/preservation.
6
Inverting Images
● $\Phi$ is the function representing the CNN.
● Let $x_0$ be the original image, with representation $\Phi(x_0)$.
● Goal: Find an $x$ such that $\Phi(x)$ is close to $\Phi(x_0)$.
7
Inverting Images (2)
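● We want to find an $x$, which we will call $x^*$, s.t.
$$x^* = \operatorname*{argmin}_{x} \; \ell(\Phi(x), \Phi(x_0)) + \lambda R(x)$$
● Here, we add a regularizer $R(x)$, weighted by $\lambda$, to ensure that the optimization only searches over “natural images”.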
8
Inverting Images (3)
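● Given an image reconstruction $x$, the reconstruction error is given by:
$$\ell(\Phi(x), \Phi(x_0)) = \|\Phi(x) - \Phi(x_0)\|_2^2$$
● Additional modification to ensure that the loss near the solution is bounded in a $[0, 1)$ range:
$$\frac{\|\Phi(\sigma x) - \Phi(x_0)\|_2^2}{\|\Phi(x_0)\|_2^2}$$
where $\sigma$ is the average Euclidean norm of natural images in a training set.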
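The normalized loss on this slide (squared feature distance divided by the squared norm of the target features) can be sketched in plain Python. This is an illustration under assumptions, not the authors' code: `normalized_loss` and `toy_phi` are hypothetical names, and `toy_phi` is only a stand-in for a real representation such as a CNN layer.

```python
def normalized_loss(phi, x, phi0, sigma=1.0):
    """||phi(sigma * x) - phi0||^2 / ||phi0||^2 for flat lists of floats."""
    features = phi([sigma * v for v in x])
    num = sum((f - t) ** 2 for f, t in zip(features, phi0))
    den = sum(t ** 2 for t in phi0)
    return num / den


# Toy stand-in for a real representation (assumption): squares each pixel.
def toy_phi(x):
    return [v * v for v in x]


x0 = [1.0, 2.0, 3.0]
phi0 = toy_phi(x0)
loss_at_x0 = normalized_loss(toy_phi, x0, phi0)           # perfect reconstruction
loss_at_zero = normalized_loss(toy_phi, [0.0] * 3, phi0)  # all-zero image
```

With this toy representation, a perfect reconstruction gives a loss of 0 and the all-zero image gives a loss of 1, which illustrates why the normalization keeps values near the solution inside [0, 1).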
9
Regularizers
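● Let $x$ be a mean-subtracted image vector.
● $R_\alpha(x) = \|x\|_\alpha^\alpha$ enforces range.
● Total variation: $R_{V^\beta}(f) = \int_\Omega \left( \left(\frac{\partial f}{\partial u}\right)^2 + \left(\frac{\partial f}{\partial v}\right)^2 \right)^{\beta/2} du\, dv$
– Penalizes images with large total gradients.
– Discrete version: $R_{V^\beta}(x) = \sum_{i,j} \left( (x_{i,j+1} - x_{i,j})^2 + (x_{i+1,j} - x_{i,j})^2 \right)^{\beta/2}$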
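A small sketch of the two regularizers on this slide, written for a grayscale image stored as a list of rows. The function names, the default `alpha=6` and `beta=2`, and the zero-difference boundary handling are assumptions made for illustration, not details taken from the slides.

```python
def alpha_norm(x, alpha=6):
    """Range regularizer: sum of |pixel|^alpha over a 2-D image (list of rows)."""
    return sum(abs(v) ** alpha for row in x for v in row)


def total_variation(x, beta=2):
    """Discrete total variation: sum over pixels of (dx^2 + dy^2)^(beta/2).

    Differences that would step outside the image are treated as zero
    (a boundary convention chosen for this sketch).
    """
    h, w = len(x), len(x[0])
    total = 0.0
    for i in range(h):
        for j in range(w):
            dx = x[i][j + 1] - x[i][j] if j + 1 < w else 0.0
            dy = x[i + 1][j] - x[i][j] if i + 1 < h else 0.0
            total += (dx * dx + dy * dy) ** (beta / 2)
    return total


flat = [[0.5, 0.5], [0.5, 0.5]]  # constant image: no gradients
edge = [[0.0, 1.0], [0.0, 1.0]]  # one vertical edge
tv_flat = total_variation(flat)
tv_edge = total_variation(edge)
range_penalty = alpha_norm(edge)
```

The constant image has zero total variation, while the image with an edge is penalized, which is the sense in which the regularizer discourages large total gradients.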
10
Regularizers (2)
● $\lambda_\alpha$ allows us to set the range of the pixel values. If we want to set the range between $[-B, B]$, then [expression not preserved in the transcript]
● $\lambda_{V^\beta}$ allows us to say how much variability the reconstruction should have.
11
Final objective function
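The equation on this slide is not preserved in the transcript. Assuming it simply combines the normalized reconstruction loss with the range regularizer $R_\alpha$ and the total variation regularizer $R_{V^\beta}$ from the preceding slides, weighted by the constants $\lambda_\alpha$ and $\lambda_{V^\beta}$ named on the Results slide, it would read:

$$
x^* = \operatorname*{argmin}_{x} \; \frac{\|\Phi(\sigma x) - \Phi(x_0)\|_2^2}{\|\Phi(x_0)\|_2^2} + \lambda_\alpha R_\alpha(x) + \lambda_{V^\beta} R_{V^\beta}(x)
$$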
12
Optimization
● Momentum-based gradient descent is used to minimize the objective function.
● The momentum has a decay factor of 0.9.
● Because a CNN is a differentiable function, its gradient is easy to compute, which is not the case for standard HOG and SIFT. Therefore, HOG and SIFT are implemented as CNNs.
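A minimal sketch of gradient descent with momentum and a decay factor of 0.9, as described on this slide. The learning rate, step count, and the toy quadratic objective are assumptions for illustration; in the real setting `grad` would return the gradient of the inversion objective with respect to the image.

```python
def momentum_descent(grad, x, lr=0.1, decay=0.9, steps=300):
    """Minimize a function, given its gradient, using momentum updates."""
    velocity = [0.0] * len(x)
    for _ in range(steps):
        g = grad(x)
        # Decay the previous velocity, then add the new gradient step.
        velocity = [decay * v - lr * gi for v, gi in zip(velocity, g)]
        x = [xi + v for xi, v in zip(x, velocity)]
    return x


# Toy objective (assumption): f(x) = sum((x_i - t_i)^2), gradient 2 * (x - t).
target = [1.0, -2.0]


def toy_grad(x):
    return [2.0 * (xi - ti) for xi, ti in zip(x, target)]


x_star = momentum_descent(toy_grad, [0.0, 0.0])
```

On this toy problem the iterate oscillates around `target` and settles onto it, which is the typical behaviour of momentum on a smooth objective.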
13
Representations: CNN
14
Representations: SIFT and HOG
● DSIFT and HOG are implemented with a CNN architecture, which makes it easy to compute gradients.
● Binning is approximated using a ReLU layer.
● Pooling into cell histograms is done by a linear filter.
● Cell blocks are then normalized by a normalization layer.
● Maximum values are then capped using a ReLU unit.
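To make the "binning with a ReLU" idea concrete, here is a simplified sketch: an image gradient is projected onto a set of unit directions (a linear filter) and negative responses are zeroed by a ReLU. This is a soft-binning stand-in written for illustration; the exact binning function used in the paper may differ, and `soft_orientation_bins` is a hypothetical name.

```python
import math


def relu(v):
    return max(0.0, v)


def soft_orientation_bins(gx, gy, num_bins=8):
    """Project a gradient (gx, gy) onto num_bins unit directions and rectify."""
    responses = []
    for k in range(num_bins):
        theta = 2.0 * math.pi * k / num_bins
        # Linear filter (dot product with the k-th direction) followed by ReLU.
        responses.append(relu(gx * math.cos(theta) + gy * math.sin(theta)))
    return responses


bins = soft_orientation_bins(1.0, 0.0, num_bins=4)  # gradient pointing along +x
```

Because every step is a linear map or a ReLU, the whole operation is differentiable almost everywhere, which is what makes gradient-based inversion of HOG-like features practical.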
15
Results
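● Normalized reconstruction error: $\|\Phi(x^*) - \Phi(x_i)\|_2 / N_\Phi$
● $N_\Phi$ is the normalization constant: the average pairwise Euclidean distance between representations across 100 images.
● $\lambda_\alpha = 2.16 \times 10^8$, $\lambda_{V^\beta} = 5$, $\beta = 2$.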
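A sketch of how the normalized reconstruction error on this slide could be computed: the normalization constant is the average pairwise Euclidean distance between feature vectors, and each error is divided by it. The tiny 2-D feature vectors and all function names are made up for illustration.

```python
import math
from itertools import combinations


def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def normalization_constant(features):
    """Average pairwise Euclidean distance between feature vectors."""
    pairs = list(combinations(features, 2))
    return sum(euclidean(a, b) for a, b in pairs) / len(pairs)


def normalized_error(reconstructed, original, n_phi):
    """Distance between two feature vectors, divided by n_phi."""
    return euclidean(reconstructed, original) / n_phi


# Made-up feature vectors standing in for the representations of test images.
feats = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
n_phi = normalization_constant(feats)
err = normalized_error([0.0, 0.0], [3.0, 4.0], n_phi)
```

An error near 1 would mean the reconstruction's features are about as far from the target as those of an unrelated image, so values well below 1 indicate a good reconstruction.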
16
Results: SIFT and HOG
● Using bilinear orientation assignment (HOGb) greatly improves the HOG reconstructions.
17
Results: SIFT and HOG (2)
18
Results: CNN
● Experiments are run with different levels of total variation regularization.
– $\lambda_1 = 0.5$, $\lambda_2 = 5$, $\lambda_3 = 50$
19
Test Images
20
Results CNN (2)
21
Results: CNN (3)
● Reconstructing from a subset of the network illustrates that subset's purpose.
22
Results (4): Variance in Reconstruction
23
Effects of parameter tuning
Decreasing the regularization constant leads to higher-variance reconstructions. These indistinguishable images still achieve low reconstruction errors.
24
Future Work
● Use this inverse technique to improve CNN architecture.
● Use this technique on other forms of neural networks (LSTM)?
25
Conclusion
● This paper provides a novel method to study and visualize information preservation in a CNN.
● Formalizes relationship between CNN and shallow feature representation.