Chapter 9: Artificial Neural Network
Introduction to the Back Propagation Neural Network (BPNN)
By KH Wong
Introduction
• Neural Network research is very popular
• A high performance, multi-class classifier
• Successful in optical character recognition (OCR) of handwriting, speech recognition, image noise removal, etc.
• Easy to implement
– Slow in learning
– Fast in classification
http://www.ninds.nih.gov/disorders/brain_basics/ninds_neuron.htm
http://yann.lecun.com/exdb/mnist/
Motivation
• Biological findings inspire the development of neural nets
– Inputs → weights → logic function → output
• Biological relation
– Input: dendrites
– Output
– Humans compute using a net of neurons
[Figure: a neuron model — X = inputs, W = weights, neuron (logic function), output]
Applications
• Microsoft: XiaoIce AI
• ImageNet Large Scale Visual Recognition Challenge: http://image-net.org/challenges/LSVRC/2015/
– 200 categories: accordion, airplane, ant, antelope, ..., dishwasher, dog, domestic cat, dragonfly, drum, dumbbell, etc.
• TensorFlow

ILSVRC 2015 — number of object classes: 200
Training: 456,567 images; 478,807 objects
Validation: 20,121 images; 55,502 objects
Testing: 40,152 images; --- objects
Different types of artificial neural networks
• Autoencoder
• DNN (Deep neural network) and deep learning
• MLP (Multilayer perceptron)
• RNN (Recurrent neural network)
• RBM (Restricted Boltzmann machine)
• SOM (Self-organizing map)
• CNN (Convolutional neural network)
• From https://en.wikipedia.org/wiki/Artificial_neural_network
• The method discussed in these slides can be applied to many of the above nets.
Theory of Back Propagation Neural Net (BPNN)
• Use many samples to train the weights (W) and biases (b), so the network can classify an unknown input into different classes
• We will explain:
– How to use it after training: the forward pass (classification / recognition of the input)
– How to train it: training the weights and biases (using forward and backward passes)
Back propagation is an essential step in many artificial neural network designs
• It is used for training an artificial neural network.
• For each training sample $x_i$, a supervised (teacher) output $t_i$ is given.
• For the i-th training sample $x_i$:
1) Feed-forward propagation: feed $x_i$ to the neural net and obtain the output $y_i$. The error is $e_i = |t_i - y_i|^2$.
2) Back propagation: feed $e_i$ into the net from the output side and adjust the weights w (by finding $\Delta w$) to minimize $e_i$.
• Repeat 1) and 2) for all samples until the overall error E is 0 or very small.
Example: Optical character recognition (OCR)
• Training: train the system first by presenting many samples of known classes to the network; training finds the weights (W) and biases (b).
• Recognition: when an image is input to the trained system, it will tell which character it is.
[Figure: an input character image feeds the neural net; for a correct recognition, Output3 = '1' and all other outputs = '0']
Overview of this document
• Back Propagation Neural Networks (BPNN)
– Part 1: Feed-forward processing (classification or recognition)
– Part 2: Back propagation (training the network), which also includes forward processing, backward processing, and updating the weights
• Appendix: a MATLAB example is explained
• %source : http://www.mathworks.com/matlabcentral/fileexchange/19997-neural-network-for-pattern-recognition-tutorial
Part 1 (classification in action / the recognition process): the forward pass of the Back Propagation Neural Net (BPNN)
Assume the weights (W) and biases (b) have already been found by training (to be discussed in Part 2).
Recognition: assume the weights (W) and biases (b) were found earlier by training
[Figure: an input character image, where each pixel is X(u,v), feeds the network; a correct recognition gives Output0 = 0, Output1 = 0, Output2 = 0, Output3 = 1, ..., Outputn = 0]
[Figure: a neural network — input layer, hidden layers, output layer; layer outputs $X^{l=1}, X^{l=2}, X^{l=3}, \ldots, X^{l=N_l}$ and weights $W^{l=1}, W^{l=2}, \ldots, W^{l=N_l}$]
Exercise 1
• How many input and output neurons? Ans: 4 input and 2 output neurons.
• How many hidden layers does this network have? Ans: 3.
• How many weights in total? Ans: the first hidden layer has 4x4, the second hidden layer 3x4, the third hidden layer 3x3, and the hidden-to-output layer 2x3 weights; total = 16 + 12 + 9 + 6 = 43.
• What is this layer of neurons X called? Ans: $X^{l=4}$.
[Figure: the network of Exercise 1 — input neurons, layers $X^{l=1}, X^{l=2}, X^{l=3}, X^{l=4}$, weights $W^{l=1}, \ldots, W^{l=4}$]
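As a quick check of the count above, a minimal MATLAB sketch (the layer sizes [4 4 3 3 2] follow from this exercise: 4 inputs, hidden layers of 4, 3, 3 neurons, 2 outputs):

sizes = [4 4 3 3 2];                        % input, three hidden layers, output
total = sum(sizes(1:end-1).*sizes(2:end))   % 4*4 + 4*3 + 3*3 + 3*2 = 43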
Multi-layer structure of a BP neural network
• A layer has multiple neurons. Each neuron has weights $w_1, w_2, w_3, \ldots$, one bias $b$, and a transfer function $f()$.
• Each hidden layer $l$ has a set of outputs $Y$, a set of inputs $X$, a set of weights $W$, and a set of biases $b$, such that for each neuron in layer $l$:
$$y^{l} = f\!\left(w^{l}x^{l} + b^{l}\right)$$
[Figure: input layer → other hidden layers → output layer]
Inside each neuron there is a bias (b)
• Between any two neighboring neuron layers, a set of weights is found.
[Figure: one neuron — inputs x(1), x(2), ..., x(I), weights w(1), w(2), ..., w(I), internal signal u, transfer function f(u), output y]
Inside each neuron: x = input, y = output
• $y = f(u)$, with $u = \sum_{i=1}^{I} x(i)\,w(i) + b$, where $b$ = bias, $x(i)$ = inputs, $w(i)$ = weights, $u$ = internal signal.
• Typically $f()$ is a logistic (sigmoid) function, i.e.
$$f(u) = \frac{1}{1+e^{-\beta u}}; \quad \text{assume } \beta = 1 \text{ for simplicity, therefore } f(u) = \frac{1}{1+e^{-u}}$$
so that
$$y = f(u) = \frac{1}{1+e^{-\left(\sum_{i=1}^{I} x(i)w(i) + b\right)}}$$
[Figure: the same neuron — inputs x(1), x(2), ..., x(I), weights w(1), w(2), ..., w(I), internal signal u, transfer function f(u), output y]
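A minimal MATLAB sketch of this computation for one neuron; the inputs, weights, and bias below are assumed values for illustration, not from the slides:

x = [0.5; 0.2; 0.9];     % inputs x(1)..x(I), here I = 3 (assumed)
w = [0.4; -0.1; 0.3];    % weights w(1)..w(I) (assumed)
b = 0.1;                 % bias (assumed)
u = w'*x + b;            % internal signal u = sum_i x(i)w(i) + b
y = 1/(1 + exp(-u))      % output y = f(u) with the sigmoid transfer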
BPNN forward pass
• The forward pass finds the output when an input is given. For example:
• Assume we have used N = 60,000 images (the MNIST database) to train a network to recognize c = 10 numerals.
• When an unknown image is given to the input, the output neuron corresponding to the correct answer will give the highest output level.
[Figure: an input image feeds the network; 10 output neurons for 0,1,2,...,9 give an output pattern such as 000100..., where the single high output marks the recognized numeral]
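A small sketch of reading out the classification; the vector y below is an assumed set of output-neuron levels for one input image:

y = [0.02; 0.01; 0.05; 0.91; 0.03; 0.04; 0.02; 0.01; 0.03; 0.02]; % assumed outputs
[level, idx] = max(y);   % pick the output neuron with the highest level
numeral = idx - 1        % neurons map to numerals 0..9; here the answer is 3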
Our simple demo program
• Training pattern:
– 3 classes (in 3 rows)
– Each class has 3 training samples (the items in each row)
• After training, when an input (say test image #2) is presented to the network, the network should tell you it is class 2.
[Figure: the grid of training patterns for class1, class2, class3; an unknown input produces the result: image of class 2]
Numerical example: architecture of our example
• Input layer: 9x1 pixels
• Hidden layer: $W^{l}$ = 5 neurons x 9 inputs for each neuron; $b^{l}$ = 5 neurons x 1 (one bias for each neuron); weights $W^{l}$, biases $b^{l}$, and a transfer function $f()$ for each neuron
• Output layer: 3x1

The input x
• P2=[50 30 25 215 225 231 31 22 34; ... %class1: 1st training sample. Gray level 0->255
[Figure: the 3x3 input image gives P1=50, P2=30, P3=25, P4=215, P5=225, P6=235, P7=31, P8=22, P9=34; 9 neurons in the input layer, 5 neurons in the hidden layer, 3 neurons in the output layer]
Exercise 2: Feed forward
Input = P1,...,P9; output = Y1, Y2, Y3; teacher (target) = T1, T2, T3
[Figure: inputs P(i=1),...,P(i=9) feed hidden layer A1 (5 neurons indexed by j; $W^{l=1}$ = 9x5, $b^{l=1}$ = 5x1) through weights (i=1,j=1), (i=2,j=1), ...; hidden outputs A1(j=1),...,A1(j=5) feed the output layer (layer l=2) through weights (j=1,k=1), (j=2,k=1), (j=2,k=2), ...; outputs Y1=0.5101 with T1=1, Y2=0.4322 with T2=0, Y3=0.3241 with T3=0]
Class 1: T1,T2,T3 = 1,0,0
• What is the target code for T1,T2,T3 if it is for class 3? Ans: 0,0,1
Exercise 3: find Y1
[Figure: inputs X=1, X=3.1, X=0.5 (layer l=1, neurons i=1,2,3) feed two hidden-layer neurons (layer l=2): neuron i=1 with weights 0.1, 0.35, 0.4 and bias b=0.5, giving A1; neuron i=2 with weights 0.27, 0.73, 0.15 and bias b=0.3, giving A2. The hidden outputs feed the output layer (l=3): neuron i=1 (Y1=?) with weights 0.6, 0.35 and bias b=0.7; neuron i=2 (y2) with weights 0.8, 0.25 and bias b=0.6]
Each neuron computes
$$y = f(u) = \frac{1}{1+e^{-u}}, \qquad u = \sum_i x(i)\,w(i) + b$$

Answer 3
%demo_bpnn_note1 khw ver15
u1=1*0.1+3.1*0.35+0.5*0.4+0.5
A1=1/(1+exp(-1*u1))
u2=1*0.27+3.1*0.73+0.5*0.15+0.3
A2=1/(1+exp(-1*u2))
u_Y1=A1*0.6+A2*0.35+0.7
Y1=1/(1+exp(-1*u_Y1))
%%%%%% result %%%%%%
%>>demo_bpnn_note1
% u1 = 1.8850
% A1 = 0.8682
% u2 = 2.9080
% A2 = 0.9482
% u_Y1 = 1.5528
% Y1 = 0.8253
Part 2: Back propagation processing (training the network)
Back Propagation Neural Net (BPNN) training
Ref: http://en.wikipedia.org/wiki/Backpropagation
Back propagation stage
[Figure: between layer l and layer l+1 — Part 1: feed forward (studied before), $x^{l+1} = f(W^{l}x^{l} + b^{l})$; Part 2: back propagation sends the error from layer l+1 back to layer l]
We will explain why, and prove the necessary equations, in the following slides.
For training we need to find $\partial E/\partial w$ — why?
The criteria to train a network
• Based on the overall error function; there are N samples and c classes to be learned (assume N = 60,000 for the MNIST dataset):
$$E = \frac{1}{2}\sum_{n=1}^{N}\sum_{k=1}^{c}\left(t_{k}^{n}-y_{k}^{n}\right)^{2} \qquad \text{(overall error for all samples and all outputs)}$$
• Error of the $n$-th training sample over all outputs ($k = 1, \ldots, c$):
$$E_{n} = \frac{1}{2}\sum_{k=1}^{c}\left\|t_{k}^{n}-y_{k}^{n}\right\|^{2}$$
where $t_{k}^{n}$ = the given true class of the $n$-th training sample (the teacher) and $y_{k}^{n}$ = the output class of the $n$-th training sample at the output of the feed-forward network.
Example: for the $n=1$ training sample of the $k$-th class, the teacher says it is class $t_{k}$.
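A worked check of $E_n$ in MATLAB for one sample, using the example outputs shown on the next slide (Y = 0.5101, 0.4322, 0.3241 with teacher T = 1, 0, 0):

Y = [0.5101; 0.4322; 0.3241];   % network outputs y_k for one sample
T = [1; 0; 0];                  % teacher t_k (class 1)
En = 0.5*sum((T - Y).^2)        % E_n = 0.266 for this sample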
Before we back propagate data, we first have to find the feed-forward error signal e(n) for training sample x(n). Recall the feed-forward processing: input = P1,...,P9; output = Y1, Y2, Y3; teacher = T1, T2, T3.
[Figure: the same network as in Exercise 2 — inputs P(i=1),...,P(i=9), hidden layer A1 (5 neurons indexed by j; $W^{l=1}$ = 9x5, $b^{l=1}$ = 5x1), output layer (l=2); Y1=0.5101 with T1=1, Y2=0.4322 with T2=0, Y3=0.3241 with T3=0]
I.e. for the first output, $e(n) = \frac{1}{2}|Y1-T1|^{2} = 0.5\,(0.5101-1)^{2} = 0.12$
Exercise 4: The training idea
• Assume this is the n-th training sample, and it belongs to class C.
• In the previous exercise we calculated that this network gives Y1 = 0.8253.
• During training, for this input the teacher says t = 1.
a) What is the error value e?
b) How do we use this e?
• Answer a: $e = \frac{1}{2}|Y1-t|^{2} = 0.5\,(1-0.8253)^{2} = 0.0153$
• Answer b: We feed this e back into the network to find the $\Delta w$ that minimizes the overall error E (E = sum over all n of e(n)). We know that $w_{new} = w_{old} + \Delta w$ gives a new w that decreases E; hence, by applying this formula recursively, we can achieve a set of W that minimizes E.
How to back propagate?
For a neuron j at the output, the squared error at the output is
$$E = \frac{1}{2}\left(t_{j}-y_{j}\right)^{2}$$
where $t_{j}$ = target (teacher) output and $y_{j}$ = actual output. By definition, for a neuron j with $i = 1, 2, \ldots, I$ inputs (weights $w_{1,j}, \ldots, w_{I,j}$), the output is
$$y_{j} = f(u_{j}), \qquad u_{j} = \sum_{i=1}^{I} x_{i}w_{ij} + b_{j}$$
We want to find $\partial E/\partial w_{ij}$, so by the chain rule:
$$\frac{\partial E}{\partial w_{ij}} = \frac{\partial E}{\partial y_{j}}\,\frac{\partial y_{j}}{\partial u_{j}}\,\frac{\partial u_{j}}{\partial w_{ij}} \qquad \text{--------- (1)}$$
But why do we need to find $\partial E/\partial w$?
Because $\partial E/\partial w_{i,j}$ tells you how to change w to minimize $e \in E$. The method is called learning by gradient descent.
• In each learning cycle (epoch), a new w is calculated using $w_{new} = w_{old} + \Delta w$.
• If we want to decrease e (an element of E) in every learning cycle, make $\Delta w = -\eta\,\dfrac{\partial E}{\partial w}$ (learning by gradient descent).
• To do it slowly, use a small +ve learning factor $\eta \approx 0.1$. (The theory of gradient descent is explained in the next slide.)
• That's why we need $\partial E/\partial w$.
• For the same argument, $\Delta b = -\eta\,\dfrac{\partial E}{\partial b}$ and $b_{new} = b_{old} + \Delta b$.
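A minimal illustration of learning by gradient descent, using an assumed 1-D error surface E(w) = (w-2)^2 (not from the slides); each cycle applies $w_{new} = w_{old} - \eta\,\partial E/\partial w$:

eta = 0.1; w = 0;           % small +ve learning factor, initial weight
for cycle = 1:50
    dE_dw = 2*(w - 2);      % gradient of E(w) = (w-2)^2
    w = w - eta*dE_dw;      % w_new = w_old - eta*dE/dw
end
w                           % w has moved close to 2, the minimum of E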
We need to find $\partial E/\partial w$ — why?
• Ans: By the definition of the Taylor series:
$$E(w+\Delta w) = E(w) + \frac{\partial E}{\partial w}\,\Delta w + \ldots \qquad \text{----- (*)}$$
Here we set
$$\Delta w = -\eta\,\frac{\partial E}{\partial w} \qquad \text{(**)}$$
where $\eta$ is a small +ve term set to the learning rate. Put (**) into (*):
$$E(w+\Delta w)-E(w) = -\eta\left(\frac{\partial E}{\partial w}\right)^{2}$$
Since $\left(\partial E/\partial w\right)^{2}$ is always +ve, $E(w+\Delta w) < E(w)$.
Conclusion: setting $\Delta w = -\eta\,\dfrac{\partial E}{\partial w}$ will decrease E.
Using Taylor series:
http://www.fepress.org/files/math_primer_fe_taylor.pdf
http://en.wikipedia.org/wiki/Taylor's_theorem
Back propagation idea
Input = P1,...,P9; outputs = Y(k=1), Y(k=2), Y(k=3); teachers = T(k=1), T(k=2), T(k=3)
[Figure: the same network — inputs P(i=1),...,P(i=9), hidden layer A1 (5 neurons indexed by j; $W^{l=1}$ = 9x5, $b^{l=1}$ = 5x1), output layer (layer l=2); Y(k=1)=0.5101 with T(k=1)=1, Y(k=2)=0.4322 with T(k=2)=0, Y(k=3)=0.3241 with T(k=3)=0. The error $e = \frac{1}{2}|Y1-T1|^{2} = 0.5\,(0.5101-1)^{2} = 0.12$ is back propagated to find a better w to reduce E]
The training algorithm
• Loop over many epochs until E is very small or W is stable
• { For n = 1 : N_all_training_samples
•    { feed forward x(n) to the network to get y(n)
•      e(n) = 0.5*[y(n)-t(n)]^2 // t(n) = teacher of sample x(n)
•      back propagate e(n) into the network
•      // shown earlier: if Δw = -η*∂E/∂w and w_new = w_old + Δw,
•      // the output y(n) will be closer to t(n), hence e(n) will decrease
•      find Δw = -η*∂E/∂w // E will decrease; learning rate η = 0.1
•      update w_new = w_old + Δw = w_old - η*∂E/∂w // for weights
•      similarly update b_new = b_old + Δb = b_old - η*∂E/∂b // for biases
•    }
•    E = sum_all_n(e(n))
• }
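A compact, runnable MATLAB sketch of this loop for a single sigmoid neuron; the 2-input AND-like training data below are an assumption for illustration, not the slides' 9-pixel example:

X = [0 0 1 1; 0 1 0 1];   % 4 training samples x(n), one per column (assumed)
T = [0 0 0 1];            % teacher outputs t(n)
w = [0.1 -0.1]; b = 0;    % initial weights and bias
eta = 0.1;                % learning rate
for epoch = 1:10000
    E = 0;
    for n = 1:4
        y = 1/(1 + exp(-(w*X(:,n) + b)));   % feed forward x(n)
        E = E + 0.5*(y - T(n))^2;           % accumulate e(n)
        delta = (y - T(n))*y*(1 - y);       % sensitivity dE/du
        w = w - eta*delta*X(:,n)';          % w_new = w_old - eta*dE/dw
        b = b - eta*delta;                  % b_new = b_old - eta*dE/db
    end
    if E < 1e-4, break; end                 % stop when E is very small
end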
Theory of how to find ∂E/∂w
• An input $x_{j}$ is connected to an output neuron k through the weight $w_{j,k}$. We want to see how $w_{j,k}$ affects E, so from (1), by the chain rule:
$$\frac{\partial E}{\partial w_{j,k}} = \frac{\partial E}{\partial y_{k}}\,\frac{\partial y_{k}}{\partial u_{k}}\,\frac{\partial u_{k}}{\partial w_{j,k}} = \text{term1}\times\text{term2}\times\text{term3}$$
where, for output neuron k with inputs $x_{j=1},\ldots,x_{j=J}$:
$$u_{k} = \sum_{j=1}^{J} x_{j}w_{j,k} + b_{k}, \qquad y_{k} = f(u_{k})$$
[Figure: input $x_{j}$ → weight $w_{j,k}$ → output neuron k with internal signal $u_{k}$ and output $y_{k}$]
Case 1: neuron j is at the output layer. We want to see how E will change if we change the weight $w_{j,k}$.
$$\frac{\partial E}{\partial w_{j,k}} = \frac{\partial E}{\partial y_{k}}\,\frac{\partial y_{k}}{\partial u_{k}}\,\frac{\partial u_{k}}{\partial w_{j,k}} = \text{term1}\times\text{term2}\times\text{term3}$$
• term1: since $E = 0.5\,(t_{k}-y_{k})^{2}$ is measured at the output,
$$\frac{\partial E}{\partial y_{k}} = -(t_{k}-y_{k})$$
• term2: (see the appendix)
$$\frac{\partial y_{k}}{\partial u_{k}} = f'(u_{k}) = f(u_{k})\big(1-f(u_{k})\big)$$
• term3: since $u_{k} = \sum_{j} x_{j}w_{j,k} + b_{k}$ and b is a constant,
$$\frac{\partial u_{k}}{\partial w_{j,k}} = x_{j}$$
Hence
$$\frac{\partial E}{\partial w_{j,k}} = -(t_{k}-y_{k})\,f(u_{k})\big(1-f(u_{k})\big)\,x_{j} \qquad (2)$$
Note the sensitivity
$$\delta_{k} = \frac{\partial E}{\partial u_{k}} = \text{term1}\times\text{term2} = -(t_{k}-y_{k})\,f(u_{k})\big(1-f(u_{k})\big)$$
[Figure: neuron k as an output neuron — input $x_{j}$, weight $w_{j,k}$, signal $u_{k}$, output $y_{k}$, teacher (target) class $t_{k}$; error $e_{k} = 0.5\,(t_{k}-y_{k})^{2}$, $e_{k} \in E$]
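A short numeric sketch of equation (2) for one output neuron: $y_k$ = 0.5101 and $t_k$ = 1 come from the earlier example, while $x_j$ and the current weight are assumed values:

yk = 0.5101; tk = 1;       % output and teacher from the example
xj = 0.8; w_old = 0.6;     % assumed hidden-layer output and current weight
dfk = yk*(1 - yk);         % f'(u_k) = f(u_k)(1-f(u_k)), since f(u_k) = yk
delta_k = -(tk - yk)*dfk;  % sensitivity delta_k = dE/du_k
dE_dw = delta_k*xj;        % eq. (2): dE/dw_jk = delta_k * x_j
w_new = w_old - 0.1*dE_dw  % gradient descent update with eta = 0.1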
Case 2: neuron j is at a hidden layer. We want to see how E will change if we change the weight $w_{i,j}$. Note: the output $y_{j}$ affects all the neurons connected to it in the next layer.
$$\frac{\partial E}{\partial w_{i,j}} = \frac{\partial E}{\partial y_{j}}\,\frac{\partial y_{j}}{\partial u_{j}}\,\frac{\partial u_{j}}{\partial w_{i,j}} = \text{term1}\times\text{term2}\times\text{term3}$$
• term1: because $y_{j}$ affects all K output neurons (indexed by k),
$$\text{term1} = \frac{\partial E}{\partial y_{j}} = \sum_{k=1}^{K}\frac{\partial E}{\partial u_{k}}\,\frac{\partial u_{k}}{\partial y_{j}} = \sum_{k=1}^{K}\text{part1a}\times\text{part1b}$$
– part1a: $\dfrac{\partial E}{\partial u_{k}} = \delta_{k}$ (see eq. (2) of the last slide)
– part1b: because $u_{k} = \sum_{j} w_{j,k}\,y_{j} + b_{k}$ for each k, $\dfrac{\partial u_{k}}{\partial y_{j}} = w_{j,k}$
So term1 $= \sum_{k=1}^{K}\delta_{k}\,w_{j,k}$.
[Figure: hidden neuron j — inputs $x_{i}$ with weights $w_{i,j}$ (W1 in the program), internal signal $u_{j}$, output $y_{j}$; $y_{j}$ feeds the output neurons k = 1,...,K through weights $w_{j,k=1}, w_{j,k=2}, \ldots, w_{j,k=K}$ (W2 in the program), with signals $u_{k}$ and outputs $y_{k}$; changes here change E]
Case 2: continued
• term2 and term3 are similar to those in the previous slide, hence
$$\frac{\partial E}{\partial w_{i,j}} = \text{term1}\times\text{term2}\times\text{term3} = \left[\sum_{k=1}^{K}\delta_{k}\,w_{j,k}\right] f(u_{j})\big(1-f(u_{j})\big)\,x_{i} \qquad (3)$$
• For this hidden neuron j, $f(u_{j})\big(1-f(u_{j})\big)$ is df1 in the program.
• The input $x_{i}$ to the hidden neuron is P(:,i) in the program.
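A vectorized MATLAB sketch of equation (3), matching the shapes used in the appendix program (W2 is 3x5: 3 output neurons by 5 hidden neurons); all numeric values below are assumptions:

s2 = [-0.12; 0.10; 0.07];             % output-layer sensitivities delta_k (assumed)
W2 = 0.1*ones(3,5);                   % weights w_jk, hidden j -> output k (assumed)
A1 = [0.8; 0.6; 0.7; 0.5; 0.9];       % hidden outputs y_j = f(u_j) (assumed)
df1 = A1.*(1 - A1);                   % f'(u_j) = f(u_j)(1-f(u_j)); df1 in the program
s1 = (W2'*s2).*df1;                   % sum_k delta_k*w_jk, times f'(u_j)
x = [0.2; 0.4; 0.9; 0.8; 0.9; 0.9; 0.1; 0.1; 0.1];  % inputs P(:,i) (assumed)
dE_dW1 = s1*x'                        % eq. (3): a 5x9 matrix of dE/dw_ij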
After all the ∂E/∂w are found (after you have solved Case 1 and Case 2)
• We can use this step to update all w:
$$w_{new} = w_{old} + \Delta w = w_{old} - \eta\,\frac{\partial E}{\partial w}$$
so E is minimized using the gradient descent method (use learning rate $\eta = 0.1$).
Revisit the training algorithm
• Iter = 1 : all_epochs (or break when E is very small)
• { For n = 1 : N_all_training_samples
•    { feed forward x(n) to the network to get y(n)
•      e(n) = 0.5*[y(n)-t(n)]^2; // t(n) = teacher of sample x(n)
•      back propagate e(n) into the network
•      // shown earlier: if Δw = -η*∂E/∂w and w_new = w_old + Δw,
•      // the output y(n) will be closer to t(n), hence e(n) will decrease
•      find Δw = -η*∂E/∂w // E will decrease; learning rate η = 0.1
•      update w_new = w_old + Δw = w_old - η*∂E/∂w; // for weights
•      similarly update b_new = b_old + Δb = b_old - η*∂E/∂b; // for biases
•    }
•    E = sum_all_n(e(n))
• }
Summary
• Learn what a Back Propagation Neural Network (BPNN) is
• Learn the forward pass
• Learn how to back propagate data during training of the BPNN
References
• Wiki
– http://en.wikipedia.org/wiki/Backpropagation
– http://en.wikipedia.org/wiki/Convolutional_neural_network
• MATLAB programs
– Neural Network for pattern recognition - Tutorial: http://www.mathworks.com/matlabcentral/fileexchange/19997-neural-network-for-pattern-recognition-tutorial
– CNN MATLAB example: http://www.mathworks.com/matlabcentral/fileexchange/38310-deep-learning-toolbox
• Open source library
– TensorFlow: http://www.geekwire.com/2015/google-open-sources-tensorflow-machine-learning-system-offering-its-neural-network-to-outside-developers/
Appendices
Appendix 1: Sigmoid function f(u) and its derivative f'(u)
$$f(u) = \frac{1}{1+e^{-\beta u}}; \quad \text{for simplicity set } \beta = 1, \text{ so } f(u) = \frac{1}{1+e^{-u}}$$
Using the chain rule:
$$f'(u) = \frac{df(u)}{du} = \frac{d}{du}\left(\frac{1}{1+e^{-u}}\right) = \frac{e^{-u}}{\left(1+e^{-u}\right)^{2}}$$
$$= \frac{1}{1+e^{-u}}\cdot\frac{e^{-u}}{1+e^{-u}} = \frac{1}{1+e^{-u}}\cdot\frac{(1+e^{-u})-1}{1+e^{-u}} = f(u)\big(1-f(u)\big)$$
Thus,
$$f'(u) = \frac{df(u)}{du} = f(u)\big(1-f(u)\big)$$
http://link.springer.com/chapter/10.1007%2F3-540-59497-3_175#page-1
http://mathworld.wolfram.com/SigmoidFunction.html
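A quick numerical confirmation of this identity in MATLAB (u = 0.7 is an arbitrary test point):

f = @(u) 1./(1 + exp(-u));          % sigmoid with beta = 1
u = 0.7; h = 1e-6;
analytic = f(u)*(1 - f(u))          % f'(u) = f(u)(1 - f(u))
numeric = (f(u+h) - f(u-h))/(2*h)   % central-difference estimate; they agree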
Alternative derivation (for the output layer, in each neuron)
[Figure: layer l → layer l+1; at the output (last) layer, t = target (teacher) and y = output; the error is back propagated to the previous layer]
Because $y_{n} = f(u_{n})$ is the current output and $t_{n}$ is the true target (teacher) of the $n$-th sample,
$$E_{n} = \frac{1}{2}\left(t_{n}-y_{n}\right)^{2} = \frac{1}{2}\left(t_{n}-f(u_{n})\right)^{2} \qquad \text{(iii)}$$
Since at the output layer ($l = L$)
$$u^{l} = \sum_{i} w^{l}\,x^{l} + b^{l}, \qquad \frac{\partial u^{l}}{\partial b^{l}} = 1,$$
from (iii), by the chain rule,
$$\frac{\partial E}{\partial b^{l}} = \frac{\partial E}{\partial y_{n}}\,\frac{\partial y_{n}}{\partial u_{n}}\,\frac{\partial u_{n}}{\partial b^{l}} = -(t_{n}-y_{n})\,f'(u_{n}) = \delta \quad \text{(the sensitivity)}$$
Derivation (continued)
Also from (iii), with $u^{l} = \sum w^{l}x^{l} + b^{l}$ (so $u^{l}$ already contains $w^{l}x^{l}$ inside $f'(u^{l})$):
$$\frac{\partial E}{\partial w^{l}} = -(t_{n}-y_{n})\,f'(u^{l})\,x^{l} \qquad \text{(iv), for each input } x \text{ and weight } w$$
For each learning phase (epoch), a new w is calculated:
$$w_{new}^{l} = w_{old}^{l} + \Delta w^{l} \qquad \text{-------(v)}$$
If we want E to decrease in every learning cycle, make $\Delta w = -\eta\,\partial E/\partial w$ (this is the gradient descent method); to do it slowly, use a small +ve learning factor $\eta$. Hence, using (iv) and (v):
$$w_{new}^{l} = w_{old}^{l} - \eta\,\frac{\partial E}{\partial w^{l}} = w_{old}^{l} + \eta\,(t_{n}-y_{n})\,f'(u^{l})\,x^{l}$$
For the same argument, for the bias:
$$b_{new}^{l} = b_{old}^{l} + \Delta b^{l} = b_{old}^{l} - \eta\,\frac{\partial E}{\partial b^{l}} = b_{old}^{l} - \eta\,\delta \qquad \text{eq(ii)}$$
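A numerical check of $\partial E/\partial b = \delta$ for a single output neuron; all values below are assumed for illustration:

f = @(u) 1./(1 + exp(-u));         % sigmoid transfer
x = [0.3; 0.8]; w = [0.5 -0.2];    % assumed input and weights
b = 0.1; t = 1;                    % assumed bias and teacher
y = f(w*x + b);                    % current output
analytic = -(t - y)*y*(1 - y)      % delta = -(t-y) f'(u), since du/db = 1
E = @(bb) 0.5*(t - f(w*x + bb))^2; % the error as a function of the bias
h = 1e-6;
numeric = (E(b+h) - E(b-h))/(2*h)  % matches the analytic value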
BPNN example in MATLAB
Based on: Neural Network for pattern recognition - Tutorial
http://www.mathworks.com/matlabcentral/fileexchange/19997-neural-network-for-pattern-recognition-tutorial
Example: a simple BPNN
• Number of classes (no. of output neurons) = 3
• Input: 9 pixels; each input is a 3x3 image
• Training samples = 3 for each class
• Number of hidden layers = 1
• Number of neurons in the hidden layer = 5
Display of testing patterns
Architecture
[Figure: input P = 9x1 (indexed by i) feeds hidden layer A1 = 5 neurons (indexed by j) through $W^{l=1}$ = 9x5 with $b^{l=1}$ = 5x1 (layer l=1, where S1 is generated); A1 feeds the output layer A2 = 3 output neurons (indexed by k) through $W^{l=2}$ = 5x3 with $b^{l=2}$ = 3x1 (layer l=2, where S2 is generated)]
Hidden neuron j=1 (bias $b^{1}(j{=}1)$) computes
$$A1(j{=}1) = \frac{1}{1+e^{-\left(W^{1}(i=1,j=1)P(i=1)\,+\,W^{1}(i=2,j=1)P(i=2)\,+\,\ldots\,+\,b^{1}(j=1)\right)}}$$
Output neuron k=1 (bias $b^{2}(k{=}1)$) computes
$$A2(k{=}1) = \frac{1}{1+e^{-\left(W^{2}(j=1,k=1)A1(j=1)\,+\,W^{2}(j=2,k=1)A1(j=2)\,+\,\ldots\,+\,b^{2}(k=1)\right)}}$$
%source : http://www.mathworks.com/matlabcentral/fileexchange/19997-neural-network-for-pattern-recognition-tutorial
clear memory % comments added by kh wong
clear all
clc
nump=3; % number of classes
n=3; % number of images per class
% training images reshaped into columns in P
% image size (3x3) reshaped to (1x9)
% training images
P=[196 35 234 232 59 244 243 57 226; ...
   188 15 236 244 44 228 251 48 230; ... % class 1
   246 48 222 225 40 226 208 35 234; ...
   255 223 224 255 0 255 249 255 235; ...
   234 255 205 251 0 251 238 253 240; ... % class 2
   232 255 231 247 38 246 190 236 250; ...
   25 53 224 255 15 25 249 55 235; ...
   24 25 205 251 10 25 238 53 240; ... % class 3
   22 35 231 247 38 24 190 36 250]';
% testing images
N=[208 16 235 255 44 229 236 34 247; ...
   245 21 213 254 55 252 215 51 249; ... % class 1
   248 22 225 252 30 240 242 27 244; ...
   255 241 208 255 28 255 194 234 188; ...
   237 243 237 237 19 251 227 225 237; ... % class 2
   224 251 215 245 31 222 233 255 254; ...
   25 21 208 255 28 25 194 34 188; ...
   27 23 237 237 19 21 227 25 237; ... % class 3
   24 49 215 245 31 22 233 55 254]';
% Normalization
P=P/256;
N=N/256;
% display the training images
figure(1),
for i=1:n*nump
    im=reshape(P(:,i), [3 3]);
    % remove the line below to reflect the true data input
    % im=imresize(im,20); % resize the image to make it clear
    subplot(nump,n,i),imshow(im);
    title(strcat('Train image/Class #', int2str(ceil(i/n))))
end
% display the testing images
figure,
for i=1:n*nump
    im=reshape(N(:,i), [3 3]);
    % remove the line below to reflect the true data input
    % im=imresize(im,20); % resize the image to make it clear
    subplot(nump,n,i),imshow(im);title(strcat('test image #', int2str(i)))
end
% targets
T=[ 1 1 1 0 0 0 0 0 0
    0 0 0 1 1 1 0 0 0
    0 0 0 0 0 0 1 1 1 ];

S1=5; % number of neurons in the hidden layer
S2=3; % number of output neurons (= number of classes)

[R,Q]=size(P);
epochs = 10000; % number of iterations
goal_err = 10e-5; % goal error
a=0.3; % define the range of random variables
b=-0.3;
W1=a + (b-a) *rand(S1,R); % weights between input and hidden neurons
W2=a + (b-a) *rand(S2,S1); % weights between hidden and output neurons
b1=a + (b-a) *rand(S1,1); % biases of the hidden neurons
b2=a + (b-a) *rand(S2,1); % biases of the output neurons
n1=W1*P;
A1=logsig(n1); % feedforward the first time
n2=W2*A1;
A2=logsig(n2); % feedforward the first time
e=A2-T; % actually e=T-A2 in the main loop
error =0.5* mean(mean(e.*e)); % better to say e=T-A2, but no harm to error here
nntwarn off
for itr =1:epochs
    if error <= goal_err
        break
    else
        for i=1:Q % i is the index to a column in P (9x9); each column P(:,i)
            % is a training sample image, 9 training samples, 3 for each class
            % A1=5x9, A1 = outputs of hidden layer and inputs to output layer
            % A2=3x9, A2 = outputs of the output layer
            % T = true class, each column in T is for 1 training sample
            % hidden_layer = 1, output_layer = 2
            df1=dlogsig(n1,A1(:,i)); % df1 is 5x1 for 5 neurons in hidden layer
            df2=dlogsig(n2,A2(:,i)); % df2 is 3x1 for output neurons
            % s2 is sigma2 = sensitivity2 from the output layer, equation (2)
            s2 = -1*diag(df2) * e(:,i); % e=T-A2; df2=f'=f(1-f) of layer 2
            % s1=5x1
            s1 = diag(df1)* W2'* s2; % eq(3), feedback from s2 to s1
            % dW = -eta*s2*x in the slides; eta=0.1, s2 is found, x is A1
            % W2 is 3x5: each output neuron receives
            % 5 inputs from the 5 hidden neurons in the hidden layer
            % sigma2 = s2 = -1*diag(df2)*e(:,i); e=T-A2; df2=f'=f(1-f) of layer 2
            % delta_W2 = -learning_rate*sigma2*input_to_output_layer
            % delta_W2 = -0.1*sigma2*A1
            W2 = W2-0.1*s2*A1(:,i)'; % learning rate=0.1, equ(2) output case
            % 3x5 = 3x5 - (3x1*1x5)
            % A1 = 5 hidden neuron outputs (5 hidden neurons)
            % A1(:,i)' = 1x5 = outputs of the hidden layer
            b2 = b2-0.1*s2; % bias update
            % 3x1 = 3x1 - 3x1
            % P(:,i)' = 1x9 = input to the hidden layer
            % s1 = 5x1 because each hidden node has 1 sensitivity (sigma)
            W1 = W1-0.1*s1*P(:,i)'; % update W1 in layer 1, see equ(3) hidden case
            % 5x9 = 5x9 - (5x1*1x9), since P is 9x9 and for an i, P(:,i)' = 1x9
            b1 = b1-0.1*s1; % bias update
            % 5x1 = 5x1 - 5x1
            A1(:,i)=logsig(W1*P(:,i)+b1); % forward
            % 5x1 = 5x1
            A2(:,i)=logsig(W2*A1(:,i)+b2); % forward
            % 3x1 = 3x1
        end
        e = T - A2; % for this e, put a -ve sign when finding s2
        error =0.5*mean(mean(e.*e));
        disp(sprintf('Iteration :%5d mse :%12.6f',itr,error));
        mse(itr)=error;
    end
end
threshold=0.9; % threshold of the system (higher threshold = more accuracy)

% training images result
%TrnOutput=real(A2)
TrnOutput=real(A2>threshold)

% applying test images to NN, TESTING BEGINS HERE
n1=W1*N;
A1=logsig(n1);
n2=W2*A1;
A2test=logsig(n2);

% testing images result
%TstOutput=real(A2test)
TstOutput=real(A2test>threshold)

% recognition rate
wrong=size(find(TstOutput-T),1);
recognition_rate=100*(size(N,2)-wrong)/size(N,2)
% end of code
Result of the program: mse error vs. itr (epoch iteration)
[Figure: plot of the mse error against the epoch iteration itr]
Appendix: Architecture of our demo program — Exercise 5: write the formulas for A1(j=4) and A2(k=3). How many inputs, hidden neurons, outputs, and weights are there in each layer?
[Figure: the same architecture — input P = 9x1 (indexed by i), hidden layer A1 = 5 neurons (indexed by j) with $W^{l=1}$ = 9x5 and $b^{l=1}$ = 5x1 (S1 generated); output layer A2 = 3 neurons (indexed by k) with $W^{l=2}$ = 5x3 and $b^{l=2}$ = 3x1 (S2 generated)]
For example, hidden neuron j=1 and output neuron k=1 compute
$$A1(j{=}1) = \frac{1}{1+e^{-\left(W^{1}(i=1,j=1)P(i=1)\,+\,W^{1}(i=2,j=1)P(i=2)\,+\,\ldots\,+\,b^{1}(j=1)\right)}}$$
$$A2(k{=}1) = \frac{1}{1+e^{-\left(W^{2}(j=1,k=1)A1(j=1)\,+\,W^{2}(j=2,k=1)A1(j=2)\,+\,\ldots\,+\,b^{2}(k=1)\right)}}$$
Answer (Exercise 5): write the values for A1(j=4) and A2(k=3)
• P = [0.7656 0.7344 0.9609 0.9961 0.9141 0.9063 0.0977 0.0938 0.0859] % each entry is P(i), i = 1,2,3,...
• $W^{l=1}$ = [0.2112 0.1540 -0.0687 -0.0289 0.0720 -0.1666 0.2938 -0.0169 -0.1127] % the weights into hidden neuron j=4
• $b^{l=1}$ = 0.1441 % the bias of this neuron
• Find A1(j=4):
$$A1(j{=}4) = \frac{1}{1+e^{-\left(W^{1}(i=1,j=4)P(i=1)\,+\,W^{1}(i=2,j=4)P(i=2)\,+\,\ldots\,+\,b^{1}(j=4)\right)}} = 0.49$$
• How many inputs, hidden neurons, outputs, weights, and biases in each layer?
• Answer: inputs = 9, hidden neurons = 5, outputs = 3; weights in the hidden layer (layer 1) = 9x5, weights in the output layer (layer 2) = 5x3; 5 biases in the hidden layer (layer 1), 3 biases in the output layer (layer 2).
• The 4th hidden neuron is A1(j=4).