Image Classification via Deep Learning
Ryan Alexander, Ishwarya Ananthabhotla, Julian Brown, Henry Nassif, Nan Ma, Ali Soylemezoglu
Overview
• What is Deep Learning?
• Image Processing
• CNN Architecture
• Training Process
• Image Classification Results
• Limitations
Deep Learning Refers to...
Machine Learning algorithms designed to
extract high-level abstractions from data
via multi-layered processing architectures
using nonlinear transformations at each layer
Human Visual System
• Distributed hierarchical processing in the primate cerebral cortex (1991)
© AAAS. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.
How To Classify a Face?
• Identify where the face region is
  • Foreground extraction
  • Edge detection
• Classify features of the face
  • Identify and describe the eye, nose, and mouth areas
• Look at the face as a collection of those features
© ACM, Inc. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.
Common Architectures
• Deep Convolutional Neural Networks (CNNs)
• Deep Belief Networks (DBNs)
• Recurrent Neural Networks (RNNs)
ImageNet Competition Through Time
Overview
• What is Deep Learning?
• Image Processing
• CNN Architecture
• Training Process
• Image Classification Results
• Limitations
Classic Classification -- Feature Engineering
What if the techniques could be “learned”?
© ACM, Inc. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.
Step 1: Convolution - Definition
Informal Definition: Procedure where two sources of information are intertwined.
Formal Definition:

Discrete: (f * g)[n] = Σ_m f[m] g[n − m]

Continuous: (f * g)(t) = ∫ f(τ) g(t − τ) dτ
Convolution - Example
Assume the following kernel/filter :
1 0 1
0 1 0
1 0 1
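Applied to an image, the kernel is slid over the input and a product-sum is taken at each position. A minimal NumPy sketch (the toy 5x5 input and the "valid", stride-1 choices are illustrative assumptions, not from the slides):

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and
    take the elementwise product-sum at each position."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

kernel = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]])
image = np.ones((5, 5))             # toy 5x5 input, all ones
print(conv2d_valid(image, kernel))  # each window covers five 1s -> 5.0 everywhere
```

Note that, as in most CNN libraries, this computes cross-correlation (no kernel flip); for a symmetric kernel like the one above the two are identical.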
Convolution
Edge detection (Laplacian):
1 1 1
1 -8 1
1 1 1

Edge detection (Sobel):
1 2 1
0 0 0
-1 -2 -1

Blur (Gaussian approximation):
.0113 .0838 .0113
.0838 .6193 .0838
.0113 .0838 .0113
More Information? Fourier Transform!
An image can be decomposed into a sum of sinusoidal gratings differing in spatial frequency, orientation, amplitude, and phase.
Fourier Transform
● The Fourier transform of an image is hard to visualize directly -- it splits into phase and magnitude!
● Magnitude -- orientation information at all spatial scales
● Phase -- contour information
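A quick NumPy illustration of the idea (the grating and its frequency of 4 cycles are arbitrary choices for the demo):

```python
import numpy as np

# Toy "image": a vertical sinusoidal grating, 4 cycles across 64 pixels
x = np.arange(64)
image = np.sin(2 * np.pi * 4 * x / 64)[None, :] * np.ones((64, 1))

F = np.fft.fft2(image)
magnitude = np.abs(F)    # energy per spatial frequency / orientation
phase = np.angle(F)      # contour / position information

# The grating concentrates its energy at +/- 4 cycles along the x axis
peak = np.unravel_index(np.argmax(magnitude), magnitude.shape)
print(peak)
```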
Overview
• What is Deep Learning?
• Image Processing
• CNN Architecture
• Training Process
• Image Classification Results
• Limitations
Why Neural Nets?
Hubel & Wiesel (1959, 1962)
© source unknown. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.
The Structure of a Neuron
(diagram: inputs input_1 … input_n are each multiplied by a weight, summed (Σ), and passed through an activation function to produce the output)
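The structure above boils down to a weighted sum followed by an activation function. A minimal sketch in Python (the tanh activation and the specific numbers are illustrative, not from the slides):

```python
import math

def neuron(inputs, weights, bias=0.0, activation=math.tanh):
    """One artificial neuron: weighted sum of inputs, then activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)

out = neuron([1.0, 2.0, 3.0], [0.5, -0.25, 0.1])
# z = 0.5 - 0.5 + 0.3 = 0.3, so out = tanh(0.3) ≈ 0.29
```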
Combining Neurons into Nets
Convolution Step
(diagram: the same neuron, with the weighted-sum stage highlighted)
Convolution step: a dot product between the filter and the input
Convolutional Layer
Activation Step
(diagram: the same neuron, with the activation stage highlighted)
Activation Layer
CNN overview
(diagram: the network alternates convolution + activation stages with subsampling stages)
Activation Step
Each neuron adds up its inputs, and then feeds the sum into a function -- the activation function -- to determine the neuron's output.
E.g.: sigmoid, tanh, ReLU
Activation functions - sigmoid
Activation function - tanh
Activation function - ReLU
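For reference, the three activation functions above can be written in a few lines of NumPy (a sketch, not from the slides; the sample inputs are arbitrary):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))    # squashes to (0, 1)

def tanh(z):
    return np.tanh(z)                  # squashes to (-1, 1), zero-centered

def relu(z):
    return np.maximum(0.0, z)          # passes positives, zeroes negatives

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z))   # ~[0.119, 0.5, 0.881]
print(tanh(z))      # ~[-0.964, 0.0, 0.964]
print(relu(z))      # [0. 0. 2.]
```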
Non-linearity Constraint
The activation function introduces non-linearity into the network.
Without a nonlinear activation function, a neural network -- no matter how many layers it has -- behaves like a linear system, and we will not be able to mimic a 'complicated' function.
A neural network may well contain neurons with linear activation functions, such as in the output layer, but these require the company of neurons with a nonlinear activation function in other parts of the network.
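The collapse to a linear system is easy to verify numerically: composing two weight matrices with no nonlinearity in between is exactly one matrix (a small NumPy check; the shapes and random weights are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two "layers" with no nonlinearity: y = W2 @ (W1 @ x)
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(2, 4))
x = rng.normal(size=3)

deep_linear = W2 @ (W1 @ x)

# ...is exactly one linear layer with combined weights W = W2 @ W1
collapsed = (W2 @ W1) @ x
assert np.allclose(deep_linear, collapsed)
```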
Convolution Step
An RGB image is represented by a three-dimensional matrix
The first channel holds the ‘R’ value of each pixel
The second channel holds the ‘G’ value of each pixel
The third channel holds the ‘B’ value of each pixel
E.g.: a 32x32 image is represented by a 32x32x3 matrix
(diagram: a 5x5x3 filter positioned over a 32x32x3 input volume)
(diagram: the filter sliding across the 32x32x3 input volume)
Input Volume vs Output Volume for convolution
Input: W1 x H1 x D1  →  Output: W2 x H2 x D2

W2 = W1 - (filter width) + 1
H2 = H1 - (filter height) + 1
D2 = 1 (filter depth = D1)
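These formulas can be wrapped in a small helper (a hypothetical function name; assumes stride 1 and no padding, as on the slide):

```python
def conv_output_shape(w1, h1, d1, fw, fh, fd, n_filters=1):
    """Output volume of a 'valid' convolution with stride 1.
    Each filter spans the full input depth and yields one 2-D activation map."""
    assert fd == d1, "filter depth must match input depth"
    w2 = w1 - fw + 1
    h2 = h1 - fh + 1
    return (w2, h2, n_filters)

# The running example: 32x32x3 input, 5x5x3 filters, 5 filters
print(conv_output_shape(32, 32, 3, 5, 5, 3, n_filters=5))  # (28, 28, 5)
```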
(diagram: each neuron in a 28x28x1 grid produces one entry of a 28x28x1 activation map)
(diagram: the 32x32x3 input volume produces five 28x28x1 activation maps, one per filter)
Parameters
Input volume: 32x32x3
Filter size : 5x5x3
Size of 1 activation map: 28*28*1
Depth of first layer: 5
Total number of neurons: 28*28*5 = 3,920
Weights per neuron: 5*5*3 = 75
Total number of parameters: 75 * 3,920 = 294,000 (without weight sharing; sharing one filter per activation map reduces this to 75 * 5 = 375)
CNN overview
(diagram: the network alternates convolution + activation stages with subsampling stages)
Subsampling

Objectives:
- Reduce the size of the input/feature space
- Keep the output of the most responsive neuron in each region of interest

Common methods:
- Max pooling
- Average pooling

Both split the matrix of filter outputs into small non-overlapping grids and take the maximum/average of each grid.
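A compact NumPy sketch of both pooling methods (the 4x4 input and 2x2 grid size are illustrative):

```python
import numpy as np

def pool2d(x, size=2, op=np.max):
    """Split x into non-overlapping size x size grids and reduce each with op."""
    h, w = x.shape
    assert h % size == 0 and w % size == 0
    blocks = x.reshape(h // size, size, w // size, size)
    return op(blocks, axis=(1, 3))

x = np.array([[1, 2, 5, 6],
              [3, 4, 7, 8],
              [9, 8, 3, 2],
              [7, 6, 1, 0]])
print(pool2d(x))              # max pooling:     [[4 8] [9 3]]
print(pool2d(x, op=np.mean))  # average pooling: [[2.5 6.5] [7.5 1.5]]
```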
Max Pooling
Input Volume vs Output Volume for Max Pooling
Input: W1 x H1 x D1  →  Output: W2 x H2 x D2

W2 = W1 - (pool width) + 1
H2 = H1 - (pool height) + 1
D2 = D1

(these formulas assume stride 1; pooling is more commonly applied with stride equal to the pool width, giving W2 = W1 / pool width)
CNN overview
(diagram: convolution + activation and subsampling stages, ending in a "?" -- what produces the final class scores?)
Fully Connected Layer
Neurons in fully connected layers have full connections to all activations in the previous layer.
Softmax
Typically, the output layer has one neuron corresponding to each label/class.
The softmax function, or normalized exponential, "squashes" a multi-dimensional vector of arbitrary real values into a vector of values in the range (0, 1) that add up to 1.
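A minimal implementation (subtracting the max is a standard numerical-stability trick not mentioned on the slide; the sample scores are arbitrary):

```python
import numpy as np

def softmax(z):
    """Normalized exponential; subtracting max(z) avoids overflow."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])  # raw class scores ("logits")
p = softmax(scores)
print(p)         # ~[0.659, 0.242, 0.099]
print(p.sum())   # 1.0
```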
Overview
• What is Deep Learning?
• Image Processing
• CNN Architecture
• Training Process
• Image Classification Results
• Limitations
Train the Network (set up the problem)
- Training is in fact finding a set of weights (for the filters) that minimizes the cost function C(w, b).
- Normally, the gradient descent algorithm is used to find the optimum.
- Therefore, we need ∂C/∂w^l_jk and ∂C/∂b^l_j, and we update the weights and biases by:
  w → w − η ∂C/∂w,  b → b − η ∂C/∂b  (η = the learning rate)
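The update rule can be demonstrated on a one-dimensional toy cost (the quadratic C(w) = (w − 3)^2, learning rate, and step count are illustrative assumptions):

```python
# Gradient descent on a toy quadratic cost C(w) = (w - 3)^2,
# whose gradient dC/dw = 2 (w - 3) we can write down directly.
def gradient_descent(w0, grad, eta=0.1, steps=100):
    w = w0
    for _ in range(steps):
        w = w - eta * grad(w)   # w -> w - eta * dC/dw
    return w

w = gradient_descent(0.0, lambda w: 2 * (w - 3))
print(round(w, 4))  # converges to the minimum at w = 3
```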
Train the Network (compute the gradient)
- Traditionally, for one training example, if we used a conventional method (central differences) with a million weights, the cost function C(w, b) would need to be evaluated on the order of a million times!
- How can we get by with calculating C(w, b) (essentially) once? -- the backpropagation algorithm (Rumelhart, Hinton, and Williams, 1986).
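The contrast can be made concrete for a single sigmoid neuron: the chain-rule (backpropagation) gradient comes from one forward/backward pass, and central differences -- two cost evaluations per weight -- merely serve to check it (the neuron, weights, and target below are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(w, x):
    """One sigmoid neuron; cost C = 0.5 * (a - y)^2 with target y = 1."""
    a = sigmoid(w @ x)
    return 0.5 * (a - 1.0) ** 2

def backprop_grad(w, x):
    """Analytic gradient from the chain rule -- one forward/backward pass."""
    a = sigmoid(w @ x)
    delta = (a - 1.0) * a * (1.0 - a)   # dC/dz = (a - y) * sigma'(z)
    return delta * x                    # dC/dw_k = delta * x_k

w = np.array([0.5, -0.3, 0.8])
x = np.array([1.0, 2.0, -1.0])

# Central differences: 2 cost evaluations *per weight*
eps = 1e-6
numeric = np.array([
    (forward(w + eps * e, x) - forward(w - eps * e, x)) / (2 * eps)
    for e in np.eye(len(w))
])
assert np.allclose(backprop_grad(w, x), numeric, atol=1e-8)
```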
Backward Propagation of Errors

With δ^l denoting the error at layer l, the four backpropagation equations are:
δ^L = ∇_a C ⊙ σ'(z^L)
δ^l = ((w^{l+1})^T δ^{l+1}) ⊙ σ'(z^l)
∂C/∂b^l_j = δ^l_j
∂C/∂w^l_jk = a^{l-1}_k δ^l_j
Backward Propagation of Errors (put it together)
- Proof and tutorial: http://neuralnetworksanddeeplearning.com/chap2.html
Train the Network (Initializing Weights)
Initialization is needed for the gradient descent algorithm, and it is critical for learning performance.
Initial Weights
We want to stay away from the saturation region of the activation function.
Suppose there are n weights coming into one neuron. The best strategy is Normal(0, 1/√n), i.e. a Gaussian with mean 0 and standard deviation 1/√n.
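A quick numerical illustration of why scaling the initial weights by 1/√n (a standard choice) matters (the fan-in of 1000 and unit-variance inputs are assumptions for the demo):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000                    # fan-in of one neuron
x = rng.normal(size=n)      # unit-variance inputs

w_naive = rng.normal(0.0, 1.0, size=n)                 # std 1
w_scaled = rng.normal(0.0, 1.0 / np.sqrt(n), size=n)   # std 1/sqrt(n)

# The naive pre-activation is typically ~sqrt(n) ≈ 30 in magnitude,
# deep in the saturated tail of a sigmoid; the scaled one stays ~1.
print(abs(w_naive @ x))
print(abs(w_scaled @ x))
```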
Example architecture
AlexNet, ~61 million weights
© ACM. All rights reserved. This content is excluded from our Creative Commonslicense. For more information, see https://ocw.mit.edu/help/faq-fair-use/.
Preprocessing Tricks and Tips
Suppose we have a dataset X = [N x D], where N is the number of data points and D is their dimensionality.
1. Mean image subtraction: subtract the mean of each individual feature across the dataset
2. Normalization per dimension: divide by the standard deviation
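In NumPy, steps 1 and 2 are one line each (toy random data; rows are assumed to be data points):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(100, 3))  # toy dataset, N=100, D=3

X_centered = X - X.mean(axis=0)                  # 1. mean subtraction per feature
X_normed = X_centered / X_centered.std(axis=0)   # 2. per-dimension normalization

print(X_normed.mean(axis=0))  # ~[0, 0, 0]
print(X_normed.std(axis=0))   # [1, 1, 1]
```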
Preprocessing Tricks and Tips
3. Principal Component Analysis (PCA) for dimensionality reduction
- Generate the covariance matrix of the data
- SVD factorization
- Decorrelation: rotation into the eigenbasis
- Choose the top k eigenvalues: X' = [N x k]
4. Whitening
- Divide by the eigenvalues (square roots of the singular values)
- Result: zero mean, identity covariance
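A sketch of PCA and whitening via SVD of the covariance matrix (the toy correlated data, the choice of top-2 components, and the 1e-8 regularizer are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))  # correlated toy data
X = X - X.mean(axis=0)                                   # must be zero-mean

cov = (X.T @ X) / X.shape[0]        # covariance matrix
U, S, _ = np.linalg.svd(cov)        # eigenvectors U, eigenvalues S

Xrot = X @ U                        # decorrelate (rotate into eigenbasis)
Xpca = Xrot[:, :2]                  # keep top-2 components: X' = [N x 2]
Xwhite = Xrot / np.sqrt(S + 1e-8)   # whiten: identity covariance

print(np.round((Xwhite.T @ Xwhite) / X.shape[0], 2))  # ≈ identity matrix
```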
Data Augmentation
1. Rotations
2. Reflections
3. Scaling
4. Cropping
5. Color space remapping
6. Randomization!
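Several of these augmentations are one-liners in NumPy (toy 3x3 "image"; color-space remapping and randomized combinations are omitted):

```python
import numpy as np

img = np.arange(9).reshape(3, 3)   # toy 3x3 "image"

rotated = np.rot90(img)            # 1. a 90-degree rotation
reflected = np.fliplr(img)         # 2. reflection (horizontal flip)
cropped = img[0:2, 0:2]            # 4. crop a 2x2 patch
```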
Overview
• What is Deep Learning?
• Image Processing
• CNN Architecture
• Training Process
• Image Classification Results
• Limitations
Revisiting the ImageNet Competition (ILSVRC 2010)
| Model | Top-1 error rate | Top-5 error rate |
| --- | --- | --- |
| Sparse coding | 0.47 | 0.28 |
| SIFT + FVs | 0.46 | 0.26 |
| CNNs | 0.37 | 0.17 |
Krizhevsky, Alex, et al. "ImageNet Classification with Deep Convolutional Neural Networks." NIPS 2012.
© ACM. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.
Google Street View House Numbers
© Goodfellow, Ian et al. All rights reserved. This content is excluded from our Creative Commons license. For more information, see http://ocw.mit.edu/help/faq-fair-use/.
“Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks” by Ian J. Goodfellow, Yaroslav Bulatov, Julian Ibarz, Sacha Arnoud, Vinay Shet
Courtesy of Goodfellow, Ian et al. Used with permission.
Recognizing Hand Gestures -- an HCI Application
N. Jawad, D. Frederick, A. Gianni, C. Dan and M. Ueli, “Max-pooling convolutional neural networks for vision-based hand gesture recognition”, IEEE International Conference on Signal and Image Processing Applications, 2011.
© IEEE. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.
Extended Image Classification: Video Classification
Extend image classification by adding a temporal component to classify videos.
Note that this adds additional complexity, but the underlying system is the same: convolutional neural nets.
©IEEE. All rights reserved. This content is excluded from our Creative Commonslicense. For more information, see https://ocw.mit.edu/help/faq-fair-use/.
Karpathy, Andrej, George Toderici, Sanketh Shetty, Thomas Leung, Rahul Sukthankar, and Li Fei-Fei. "Large-Scale Video Classification with Convolutional Neural Networks." 2014 IEEE Conference on Computer Vision and Pattern Recognition (2014)
Overview
• What is Deep Learning?
• Image Processing
• CNN Architecture
• Training Process
• Image Classification Results
• Limitations
Even the Best Have Issues
Microsoft won the most recent ImageNet competition and currently holds the state-of-the-art implementation (ResNet).
It can recognize 1000 categories of images extremely reliably.
However:
• 1000 categories does not cover as many objects as you might expect
• Training uses 1.28 million images
• Training takes weeks on multiple GPUs, even with heavy optimization
http://arxiv.org/abs/1512.03385
Szegedy et al. Intriguing Properties of Neural Networks. 2014.
Courtesy of Szegedy, Christian et al. License: CC-BY.
Nguyen A, Yosinski J, Clune J. Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images. In Computer Vision and Pattern Recognition (CVPR ’15), IEEE, 2015.
© IEEE. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.
Nguyen A, Yosinski J, Clune J. Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images. In Computer Vision and Pattern Recognition (CVPR ’15), IEEE, 2015.
Gradient Ascent
© IEEE. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.
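The gradient-ascent fooling method can be sketched in a few lines: instead of adjusting weights to fit data, adjust the *input pixels* to maximize one class's confidence. For self-containment the sketch below uses a random linear model as a stand-in for a trained CNN (an assumption; Nguyen et al. attack real networks), but the ascent loop is the same idea.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 16))          # stand-in for a trained network (assumption)
x = np.zeros(16)                      # start from a blank 16-pixel "image"
target = 2                            # class whose confidence we maximize

def scores(x):                        # softmax over the toy model's logits
    z = W @ x
    e = np.exp(z - z.max())
    return e / e.sum()

# Gradient ascent on the INPUT, not the weights: each step nudges the
# pixels along d(log p[target])/dx to raise the target class's confidence.
for _ in range(200):
    p = scores(x)
    grad = (np.eye(3)[target] - p) @ W
    x += 0.1 * grad

print(scores(x)[target])              # high confidence for a meaningless "image"
```

The resulting `x` means nothing to a human, yet the model is nearly certain of its label, which is exactly the failure mode shown on this slide.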
Nguyen A, Yosinski J, Clune J. Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images. In Computer Vision and Pattern Recognition (CVPR ’15), IEEE, 2015.
Indirect Encoding
© IEEE. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.
© Stuart McMillen. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.
Takeaways
• Deep Learning is a powerful tool that relies on many iterations of processing
• CNNs outperform other algorithms for image classification because of the image-processing power of convolutional filters
• Backpropagation is used to efficiently train CNNs
• CNNs need large amounts of data and processing power
Getting Started With Deep Learning
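Before reaching for a framework, it helps to see the whole training loop in miniature. The sketch below is a two-layer network learning XOR with plain gradient descent in numpy; libraries such as PyTorch and TensorFlow automate exactly this forward/backward loop at much larger scale. (The architecture and hyperparameters here are illustrative choices, not from the lecture.)

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets
W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)     # 2 inputs -> 8 hidden units
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)     # 8 hidden -> 1 output
sig = lambda z: 1 / (1 + np.exp(-z))

def forward():
    h = sig(X @ W1 + b1)
    return h, sig(h @ W2 + b2)

_, out = forward()
mse_start = np.mean((out - y) ** 2)               # loss before training

for _ in range(10000):
    h, out = forward()
    d2 = (out - y) * out * (1 - out)              # output-layer delta (MSE loss)
    d1 = (d2 @ W2.T) * h * (1 - h)                # hidden-layer delta (chain rule)
    W2 -= 0.5 * h.T @ d2; b2 -= 0.5 * d2.sum(0)   # gradient-descent updates
    W1 -= 0.5 * X.T @ d1; b1 -= 0.5 * d1.sum(0)

print(np.round(out.ravel(), 2))                   # targets: [0, 1, 1, 0]
```

Every framework listed on the following slides wraps this same pattern: a forward pass, a loss, backpropagated deltas, and a parameter update.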
References
Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet Classification with Deep Convolutional Neural Networks." http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf
Cybenko, G. "Approximation by Superpositions of a Sigmoidal Function." https://www.dartmouth.edu/~gvc/Cybenko_MCSS.pdf
Backpropagation tutorial: http://neuralnetworksanddeeplearning.com/chap2.html
"The Softmax Nonlinearity: Derivation Using Statistical Mechanics and Useful Properties as a Multiterminal Analog Circuit Element." https://papers.nips.cc/paper/877-the-softmax-nonlinearity-derivation-using-statistical-mechanics-and-useful-properties-as-a-multiterminal-analog-circuit-element.pdf
Nguyen, A., J. Yosinski, and J. Clune. "Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images." In Computer Vision and Pattern Recognition (CVPR '15), IEEE, 2015.
Goodfellow, Ian J., Yaroslav Bulatov, Julian Ibarz, Sacha Arnoud, and Vinay Shet. "Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks."
He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. "Deep Residual Learning for Image Recognition." http://arxiv.org/abs/1512.03385
Appendix
Backward Propagation of Errors
- The gradients of the weights and biases can be found by back-chaining an auxiliary "error" variable (notation as in the backpropagation tutorial cited in the references), defined as:
  \( \delta^l_j = \frac{\partial C}{\partial z^l_j} \)
- By the chain rule, at the output layer:
  \( \delta^L = \nabla_a C \odot \sigma'(z^L) \)
- Then back-propagate it (chain rule again):
  \( \delta^l = \left( (W^{l+1})^T \delta^{l+1} \right) \odot \sigma'(z^l) \)
- The gradients follow directly:
  \( \frac{\partial C}{\partial b^l_j} = \delta^l_j, \qquad \frac{\partial C}{\partial W^l_{jk}} = a^{l-1}_k \, \delta^l_j \)
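These formulas can be checked numerically. The sketch below computes \( \delta = (a - y) \odot \sigma'(z) \) and the weight gradient \( \partial C / \partial W_{jk} = a^{prev}_k \delta_j \) for a single sigmoid layer with quadratic cost \( C = \tfrac{1}{2}\lVert a - y \rVert^2 \), then compares one entry against a finite-difference estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
W, b = rng.normal(size=(3, 4)), rng.normal(size=(3,))
a_prev, y = rng.normal(size=(4,)), rng.normal(size=(3,))
sig = lambda z: 1 / (1 + np.exp(-z))

z = W @ a_prev + b
a = sig(z)
delta = (a - y) * a * (1 - a)          # delta^L = (a - y) ⊙ σ'(z) for C = ½‖a−y‖²
grad_W = np.outer(delta, a_prev)       # dC/dW_jk = a_prev_k · delta_j

# Finite-difference check on one weight entry:
eps = 1e-6
Wp = W.copy(); Wp[1, 2] += eps
num = (0.5 * np.sum((sig(Wp @ a_prev + b) - y) ** 2)
       - 0.5 * np.sum((a - y) ** 2)) / eps
print(abs(num - grad_W[1, 2]) < 1e-4)  # analytic and numeric gradients agree
```

The same check scales to every layer of a deep network, which is how backprop implementations are typically validated.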
Backward Propagation of Errors (putting it together)
Train the Network (putting it together)
© Tim Dettmers. All rights reserved. This content is excluded from our Creative Commons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.
Courtesy of Tim Dettmers. Used with permission.
Convolution: Filters
An output pixel's value is some function of the corresponding input pixel's neighbors.
Examples:
• Smooth, sharpen, contrast, shift
• Enhance edges
• Detect particular orientations
Approach: kernel convolution
Output pixel = (1/9) × (sum of the 3×3 neighborhood); e.g. the output is 40 where the window covers four 90-valued pixels
Input image (10×10):
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 90 90 90 90 90 90 0
0 0 0 90 90 90 90 90 90 0
0 0 0 90 90 90 90 90 90 0
0 0 0 90 90 90 90 90 90 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
Kernel (3×3, all ones, scaled by 1/9):
1 1 1
1 1 1
1 1 1
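The slide's smoothing example can be reproduced directly. The sketch below slides the 3×3 box filter over the 10×10 image (in "valid" mode, i.e. without padding) and recovers the 40 at the position where the window overlaps four of the 90-valued pixels.

```python
import numpy as np

# The slide's 10x10 input: a block of 90s on a background of 0s.
img = np.zeros((10, 10))
img[3:7, 3:9] = 90
kernel = np.ones((3, 3)) / 9.0         # 3x3 box filter (local average)

def convolve2d(image, k):
    """Naive 'valid'-mode kernel convolution: each output pixel is the
    sum of elementwise products of the kernel with one image window."""
    kh, kw = k.shape
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * k)
    return out

out = convolve2d(img, kernel)
print(round(out[2, 2], 2))             # 40.0: window covers four 90-valued pixels
```

In a CNN the kernel values are not hand-picked like this box filter; they are learned weights, but the sliding window and multiply-sum operation are identical.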
Convolution for 2D matrices
Given two three-by-three matrices, one a kernel and the other an image patch, convolution multiplies corresponding entries and sums the results.
The output of this operation constitutes the input to a single neuron in the following layer.
MIT OpenCourseWare, https://ocw.mit.edu
16.412J / 6.834J Cognitive Robotics, Spring 2016
For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.