Neural Networks: Backpropagation algorithm
Data Mining and Semantic Web
University of Belgrade, School of Electrical Engineering, Chair of Computer Engineering and Information Theory
Miroslav Tišma
You see this:
But the camera sees this:
What is this?
23.12.2011. Miroslav Tišma 2/21
Computer Vision: Car detection
Testing:
What is this?
Cars | Not a car
Raw image → Learning Algorithm
[Figure: cars and “non”-cars plotted by pixel 1 intensity vs. pixel 2 intensity]
50 x 50 pixel images → 2500 pixels (7500 if RGB)
$x = [\text{pixel 1 intensity},\ \text{pixel 2 intensity},\ \dots,\ \text{pixel 2500 intensity}]^T$
Quadratic features ($x_i \times x_j$): ≈ 3 million features
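The feature counts on this slide can be checked with a few lines of Python. This is a minimal sketch; it assumes that "quadratic features" means all products $x_i x_j$ with $i \le j$ (squares included), and the helper name `n_quadratic` is made up for illustration.

```python
from math import comb

n_pixels = 50 * 50        # grayscale 50 x 50 image
n_rgb = 3 * n_pixels      # three colour channels per pixel

def n_quadratic(n):
    """Number of products x_i * x_j with i <= j: n-choose-2 pairs plus n squares."""
    return comb(n, 2) + n

print(n_pixels, n_rgb, n_quadratic(n_pixels))
```

With 2500 inputs the count is a little over 3.1 million, which is why a plain logistic regression on quadratic features does not scale to raw images.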
Neural Networks
• Origins: Algorithms that try to mimic the brain
• Were very widely used in the 80s and early 90s; popularity diminished in the late 90s.
• Recent resurgence: State-of-the-art technique for many applications
Neurons in the brain
Dendr(I)tes: input wires
Ax(O)n: output wire
Neuron model: Logistic unit
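Sigmoid (logistic) activation function:
$g(z) = \frac{1}{1 + e^{-z}}$
$h_\Theta(x) = g(\Theta^T x) = \frac{1}{1 + e^{-\Theta^T x}}$
[Figure: a single logistic unit, labeled with “input wires”, the “bias unit”, the “weights” (parameters), and the “output”]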
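A logistic unit is small enough to write out directly. The sketch below assumes the bias weight is stored first in `theta` and that the bias unit is the constant 1; the weights and inputs are hypothetical values chosen so that $\Theta^T x = 0$.

```python
import math

def sigmoid(z):
    """Logistic activation g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def logistic_unit(theta, x):
    """h_Theta(x) = g(Theta^T x); theta[0] multiplies the bias unit x_0 = 1."""
    z = theta[0] + sum(t * xi for t, xi in zip(theta[1:], x))
    return sigmoid(z)

theta = [-1.0, 2.0, 0.5]   # hypothetical weights (bias first)
x = [1.0, -2.0]            # hypothetical inputs
print(logistic_unit(theta, x))
```

Since the weighted sum is zero here, the unit sits exactly at the midpoint of the sigmoid.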
Neural Network
[Figure: a three-layer network with “bias unit” nodes]
Layer 1: “input layer”, Layer 2: “hidden layer”, Layer 3: “output layer”
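Neural Network
$a_i^{(j)}$ = “activation” of unit $i$ in layer $j$
$\Theta^{(j)}$ = matrix of weights controlling the function mapping from layer $j$ to layer $j+1$
If the network has $s_j$ units in layer $j$ and $s_{j+1}$ units in layer $j+1$, then $\Theta^{(j)}$ will be of dimension $s_{j+1} \times (s_j + 1)$.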
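The dimension rule can be turned into a quick shape check. This sketch assumes the usual convention that the weight matrix between two layers has one row per unit in the next layer and one column per unit in the current layer plus the bias unit; the layer sizes and the helper `weight_shapes` are hypothetical.

```python
def weight_shapes(layer_sizes):
    """Shape of each weight matrix: (units in next layer) x (units in this layer + 1)."""
    return [(s_next, s + 1) for s, s_next in zip(layer_sizes, layer_sizes[1:])]

layer_sizes = [3, 5, 5, 4]    # hypothetical unit counts, bias units not counted
shapes = weight_shapes(layer_sizes)
n_params = sum(rows * cols for rows, cols in shapes)
print(shapes, n_params)
```

The extra column in every matrix holds the weights attached to the bias unit.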
Simple example: AND
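Weights: −30 (bias unit), +20 ($x_1$), +20 ($x_2$)
$h_\Theta(x) = g(-30 + 20x_1 + 20x_2)$

x1  x2  h_Θ(x)
0   0   g(−30) ≈ 0
0   1   g(−10) ≈ 0
1   0   g(−10) ≈ 0
1   1   g(10) ≈ 1

$h_\Theta(x) \approx x_1 \text{ AND } x_2$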
Example: OR function
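Weights: −10 (bias unit), +20 ($x_1$), +20 ($x_2$)
$h_\Theta(x) = g(-10 + 20x_1 + 20x_2)$

x1  x2  h_Θ(x)
0   0   g(−10) ≈ 0
0   1   g(10) ≈ 1
1   0   g(10) ≈ 1
1   1   g(30) ≈ 1

$h_\Theta(x) \approx x_1 \text{ OR } x_2$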
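Both gates can be verified numerically with the weights from the AND and OR slides. A minimal sketch; the function names `g` and `unit` are chosen here for illustration.

```python
import math

def g(z):
    """Sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))

def unit(theta, x1, x2):
    """Single logistic unit with bias weight theta[0]."""
    return g(theta[0] + theta[1] * x1 + theta[2] * x2)

AND_WEIGHTS = [-30, 20, 20]   # weights from the AND slide
OR_WEIGHTS = [-10, 20, 20]    # weights from the OR slide

for x1 in (0, 1):
    for x2 in (0, 1):
        h_and = unit(AND_WEIGHTS, x1, x2)
        h_or = unit(OR_WEIGHTS, x1, x2)
        print(x1, x2, round(h_and), round(h_or))
```

Only the bias weight differs between the two gates: it decides how many inputs must be on before the weighted sum becomes positive.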
Multiple output units: One-vs-all.
Pedestrian Car Motorcycle Truck
Want $h_\Theta(x) \approx [1\ 0\ 0\ 0]^T$ when pedestrian, $h_\Theta(x) \approx [0\ 1\ 0\ 0]^T$ when car, $h_\Theta(x) \approx [0\ 0\ 1\ 0]^T$ when motorcycle, etc.
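One-vs-all targets and predictions might be handled as in the sketch below. It assumes the predicted class is the output unit with the largest activation, which the slide does not state explicitly; the output vector `h` is a made-up example.

```python
CLASSES = ["pedestrian", "car", "motorcycle", "truck"]

def one_hot(label):
    """Target vector y with a 1 in the position of the given class."""
    return [1 if c == label else 0 for c in CLASSES]

def predict(h):
    """Class whose output unit has the largest activation (assumed decision rule)."""
    best = max(range(len(h)), key=lambda k: h[k])
    return CLASSES[best]

h = [0.05, 0.90, 0.10, 0.02]   # hypothetical network output
print(one_hot("car"), predict(h))
```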
Neural Network (Classification)
[Figure: a four-layer network, Layer 1 to Layer 4]
$L$ = total no. of layers in network
$s_l$ = no. of units (not counting bias unit) in layer $l$
Binary classification: 1 output unit
Multi-class classification (K classes): K output units
E.g. $[1\ 0\ 0\ 0]^T$ (pedestrian), $[0\ 1\ 0\ 0]^T$ (car), $[0\ 0\ 1\ 0]^T$ (motorcycle), $[0\ 0\ 0\ 1]^T$ (truck)
Cost function
Logistic regression:
Neural network:
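For reference, a sketch of the two cost functions in the form commonly used with this notation (for example in the ml-class course listed under Literature). The regularization parameter $\lambda$ and the summation bounds are assumptions; they are not recoverable from this transcript.

```latex
% Logistic regression (m examples, n features)
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)})
          + (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right]
          + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2

% Neural network (K output units, L layers, s_l units in layer l)
J(\Theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K} \left[ y_k^{(i)} \log (h_\Theta(x^{(i)}))_k
          + (1 - y_k^{(i)}) \log\left(1 - (h_\Theta(x^{(i)}))_k\right) \right]
          + \frac{\lambda}{2m} \sum_{l=1}^{L-1} \sum_{i=1}^{s_l} \sum_{j=1}^{s_{l+1}} \left(\Theta_{ji}^{(l)}\right)^2
```

The neural network cost is the logistic regression cost summed over all K output units; the bias weights are left out of the regularization sum.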
Gradient computation
Our goal is to minimize the cost function $J(\Theta)$.
Need code to compute:
- $J(\Theta)$
- $\frac{\partial}{\partial \Theta_{ij}^{(l)}} J(\Theta)$
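Given one training example ($x$, $y$):
Forward propagation (Layer 1 to Layer 4):
$a^{(1)} = x$
$z^{(2)} = \Theta^{(1)} a^{(1)}$, $a^{(2)} = g(z^{(2)})$ (add $a_0^{(2)}$)
$z^{(3)} = \Theta^{(2)} a^{(2)}$, $a^{(3)} = g(z^{(3)})$ (add $a_0^{(3)}$)
$z^{(4)} = \Theta^{(3)} a^{(3)}$, $a^{(4)} = h_\Theta(x) = g(z^{(4)})$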
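Forward propagation through a four-layer network can be sketched in plain Python. The weight layout (one row per unit, bias weight first) and the weight values themselves are assumptions made for this example.

```python
import math

def g(z):
    """Sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(thetas, x):
    """Return the list of activations, one entry per layer, for input x.

    thetas[l] is a list of rows; each row holds the bias weight first,
    followed by one weight per unit of the previous layer.
    """
    activations = [list(x)]                      # first layer is the input itself
    for theta in thetas:
        a_prev = [1.0] + activations[-1]         # prepend the bias unit
        z = [sum(w * a for w, a in zip(row, a_prev)) for row in theta]
        activations.append([g(zi) for zi in z])
    return activations

# Hypothetical weights for a 2-3-2-1 network
thetas = [
    [[0.1, 0.2, -0.3], [0.0, -0.1, 0.4], [0.5, 0.3, 0.2]],   # 3 x (2 + 1)
    [[0.2, -0.4, 0.1, 0.3], [-0.1, 0.2, 0.2, -0.3]],         # 2 x (3 + 1)
    [[0.3, 0.5, -0.6]],                                      # 1 x (2 + 1)
]
a = forward(thetas, [1.0, 0.0])
print([len(layer) for layer in a], a[-1])
```

The last entry of the returned list is the network output; the intermediate activations are kept because backpropagation reuses them.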
Backpropagation algorithm
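Intuition: $\delta_j^{(l)}$ = “error” of node $j$ in layer $l$.
[Figure: a four-layer network, Layer 1 to Layer 4]
For each output unit (layer $L = 4$):
$\delta_j^{(4)} = a_j^{(4)} - y_j$, where $a_j^{(4)} = (h_\Theta(x))_j$
$\delta^{(3)} = (\Theta^{(3)})^T \delta^{(4)} \odot g'(z^{(3)})$
$\delta^{(2)} = (\Theta^{(2)})^T \delta^{(3)} \odot g'(z^{(2)})$
where $\odot$ is the element-wise multiplication operator.
The derivative of the activation function can be written as $g'(z^{(l)}) = a^{(l)} \odot (1 - a^{(l)})$.
$\frac{\partial}{\partial \Theta_{ij}^{(l)}} J(\Theta) = a_j^{(l)} \delta_i^{(l+1)}$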
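The error terms and the resulting gradient for one training example can be sketched as follows. This assumes the unregularized cross-entropy cost, a sigmoid in every layer, and weights stored as rows with the bias weight first; the small 2-2-1 network and its weights are hypothetical.

```python
import math

def g(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(thetas, x):
    """Activations of every layer, starting with the input."""
    acts = [list(x)]
    for theta in thetas:
        a_prev = [1.0] + acts[-1]
        acts.append([g(sum(w * a for w, a in zip(row, a_prev))) for row in theta])
    return acts

def cost(thetas, x, y):
    """Unregularized cross-entropy cost for a single example."""
    h = forward(thetas, x)[-1]
    return -sum(t * math.log(o) + (1 - t) * math.log(1 - o) for o, t in zip(h, y))

def backprop(thetas, x, y):
    """Gradient of the cost with respect to every weight, for one example."""
    acts = forward(thetas, x)
    delta = [a - t for a, t in zip(acts[-1], y)]     # output error: activation minus target
    grads = [None] * len(thetas)
    for l in range(len(thetas) - 1, -1, -1):
        a_prev = [1.0] + acts[l]
        # gradient entry (i, j) = activation j of this layer * error i of the next layer
        grads[l] = [[d * a for a in a_prev] for d in delta]
        if l > 0:
            # propagate the error back; the bias column (index 0) is skipped
            delta = [
                sum(thetas[l][i][j + 1] * delta[i] for i in range(len(delta)))
                * acts[l][j] * (1.0 - acts[l][j])
                for j in range(len(acts[l]))
            ]
    return grads

thetas = [[[0.1, 0.2, -0.3], [0.4, -0.5, 0.6]], [[0.3, -0.2, 0.5]]]   # hypothetical
x, y = [1.0, 0.5], [1.0]
grads = backprop(thetas, x, y)
print(grads)
```

A common sanity check is to compare these values with finite differences of `cost`, perturbing one weight at a time.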
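Backpropagation algorithm
Training set $\{(x^{(1)}, y^{(1)}), \dots, (x^{(m)}, y^{(m)})\}$
Set $\Delta_{ij}^{(l)} = 0$ (for all $l, i, j$). These accumulators are used to compute $\frac{\partial}{\partial \Theta_{ij}^{(l)}} J(\Theta)$.
For $i = 1$ to $m$:
- Set $a^{(1)} = x^{(i)}$
- Perform forward propagation to compute $a^{(l)}$ for $l = 2, 3, \dots, L$
- Using $y^{(i)}$, compute $\delta^{(L)} = a^{(L)} - y^{(i)}$
- Compute $\delta^{(L-1)}, \delta^{(L-2)}, \dots, \delta^{(2)}$
- $\Delta_{ij}^{(l)} := \Delta_{ij}^{(l)} + a_j^{(l)} \delta_i^{(l+1)}$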
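The loop over the training set might look like the sketch below. Several things here are assumptions rather than part of the slides: the gradient is taken as the accumulated sum divided by m with no regularization, the weights are updated by plain batch gradient descent, and the OR data set, network size, learning rate `alpha` and iteration count are made up for the demonstration.

```python
import math
import random

def g(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(thetas, x):
    acts = [list(x)]
    for theta in thetas:
        a_prev = [1.0] + acts[-1]
        acts.append([g(sum(w * a for w, a in zip(row, a_prev))) for row in theta])
    return acts

def gradients(thetas, data):
    """Average gradient over the training set (no regularization)."""
    # Accumulators, one per weight, all starting at zero
    Delta = [[[0.0] * len(row) for row in theta] for theta in thetas]
    for x, y in data:
        acts = forward(thetas, x)
        delta = [a - t for a, t in zip(acts[-1], y)]          # output-layer error
        for l in range(len(thetas) - 1, -1, -1):
            a_prev = [1.0] + acts[l]
            for i, d in enumerate(delta):
                for j, a in enumerate(a_prev):
                    Delta[l][i][j] += a * d                   # accumulate activation * error
            if l > 0:
                delta = [sum(thetas[l][i][j + 1] * delta[i] for i in range(len(delta)))
                         * acts[l][j] * (1.0 - acts[l][j]) for j in range(len(acts[l]))]
    m = len(data)
    return [[[v / m for v in row] for row in layer] for layer in Delta]

def cost(thetas, data):
    """Mean unregularized cross-entropy cost."""
    total = 0.0
    for x, y in data:
        h = forward(thetas, x)[-1]
        total -= sum(t * math.log(o) + (1 - t) * math.log(1 - o) for o, t in zip(h, y))
    return total / len(data)

def train(thetas, data, alpha=0.5, iterations=500):
    """Batch gradient descent (assumed optimizer); updates thetas in place."""
    for _ in range(iterations):
        grads = gradients(thetas, data)
        for theta, grad in zip(thetas, grads):
            for row, grad_row in zip(theta, grad):
                for j in range(len(row)):
                    row[j] -= alpha * grad_row[j]
    return thetas

random.seed(0)
OR_DATA = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]), ([1.0, 0.0], [1.0]), ([1.0, 1.0], [1.0])]
thetas = [[[random.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(2)],
          [[random.uniform(-0.5, 0.5) for _ in range(3)]]]
before = cost(thetas, OR_DATA)
train(thetas, OR_DATA)
after = cost(thetas, OR_DATA)
print(before, after)
```

Small random initial weights are used instead of zeros so that the two hidden units do not start out identical.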
Advantages:
- Relatively simple implementation
- Standard method that generally works well
- Many practical applications:
  * handwriting recognition, autonomous driving car
Disadvantages:
- Slow and inefficient
- Can get stuck in local minima, resulting in sub-optimal solutions
Literature:
- http://en.wikipedia.org/wiki/Backpropagation
- http://www.ml-class.org
- http://home.agh.edu.pl/~vlsi/AI/backp_t_en/backprop.html
Thank you for your attention!