
Page 1:

Chapter 9: Artificial Neural Network
Introduction to the Back Propagation Neural Network (BPNN)
By KH Wong
(Neural Networks Ch9, ver. 5f2)

Page 2: Introduction

• Neural network research is very popular.
• A high-performance (multi-class) classifier.
• Successful in handwritten optical character recognition (OCR), speech recognition, image noise removal, etc.
• Easy to implement: slow in learning, fast in classification.

http://www.ninds.nih.gov/disorders/brain_basics/ninds_neuron.htm
http://yann.lecun.com/exdb/mnist/

Page 3: Motivation

• Biological findings inspire the development of neural nets: inputs, weights, a logic function, and an output.
• Biological relation: input, dendrites, output. Humans compute using a net.

[Figure: X = inputs, W = weights, neuron (logic function), output.]

Page 4: Applications

• Microsoft: XiaoIce AI
• http://image-net.org/challenges/LSVRC/2015/ (200 categories: accordion, airplane, ant, antelope, ..., dishwasher, dog, domestic cat, dragonfly, drum, dumbbell, etc.)
• TensorFlow

ILSVRC 2015:
  Number of object classes: 200
  Training:   456,567 images, 478,807 objects
  Validation:  20,121 images,  55,502 objects
  Testing:     40,152 images, objects: ---

Page 5: Different types of artificial neural networks

• Autoencoder
• DNN (deep neural network) and deep learning
• MLP (multilayer perceptron)
• RNN (recurrent neural network)
• RBM (restricted Boltzmann machine)
• SOM (self-organizing map)
• Convolutional neural network
(From https://en.wikipedia.org/wiki/Artificial_neural_network)

The method discussed in this PowerPoint can be applied to many of the above nets.

Page 6: Theory of the Back Propagation Neural Net (BPNN)

• Use many samples to train the weights (W) and biases (b), so the network can classify an unknown input into different classes.
• Will explain:
  – How to use it after training: the forward pass (classification / recognition of the input).
  – How to train it: how to train the weights and biases (using forward and backward passes).

Page 7: Back propagation is an essential step in many artificial neural network designs

• Used for training an artificial neural network.
• For each training example xi, a supervised (teacher) output ti is given.
• For the i-th training sample xi:
  1) Feed-forward propagation: feed xi to the neural net and obtain the output yi. Error ei = |ti - yi|^2.
  2) Back propagation: feed ei into the net from the output side and adjust the weights w (by finding Δw) to minimize ei.
• Repeat 1) and 2) for all samples until the overall error E is 0 or very small.

Page 8: Example: optical character recognition (OCR)

• Training: train the system first by presenting many samples with known classes to the network.
• Recognition: when an image is input to the system, it will tell what character it is.

[Figure: an input character image feeds the neural net; Output3 = '1', other outputs = '0'. Training sets up the network's weights (W) and biases (b).]

Page 9: Overview of this document

• Back Propagation Neural Networks (BPNN):
  – Part 1: feed-forward processing (classification or recognition).
  – Part 2: back propagation (training the network); this also includes forward processing, backward processing, and updating the weights.
• Appendix: a MATLAB example is explained.
  Source: http://www.mathworks.com/matlabcentral/fileexchange/19997-neural-network-for-pattern-recognition-tutorial

Page 10: Part 1 (classification in action / the recognition process)

The forward pass of the Back Propagation Neural Net (BPNN). Assume the weights (W) and biases (b) have already been found by training (to be discussed in Part 2).

Page 11: Recognition: assume the weights (W) and biases (b) were found earlier

[Figure: an input image (each pixel is X(u,v)) feeds the network; outputs: Output0 = 0, Output1 = 0, Output2 = 0, Output3 = 1, ..., Outputn = 0. Correct recognition.]

Page 12: A neural network

[Figure: input layer, hidden layers, output layer; layer outputs $x^{l=1}, x^{l=2}, x^{l=3}, \ldots, x^{l=N_l}$ connected by weights $W^{l=1}, W^{l=2}, \ldots, W^{l=N_l}$.]

Page 13: Exercise 1

• How many input and output neurons? Ans: 4 input and 2 output neurons.
• How many hidden layers does this network have? Ans: 3.
• How many weights in total? Ans: the first hidden layer has 4x4, the second hidden layer has 3x4, the third hidden layer has 3x3, and the fourth (hidden-to-output) layer has 2x3 weights; total = 16 + 12 + 9 + 6 = 43.
• What is the last layer of neurons X called? Ans: $x^{l=4}$.

[Figure: the network, with input neurons, layer outputs $x^{l=1}, x^{l=2}, x^{l=3}$, and weights $W^{l=1}, \ldots, W^{l=4}$.]

Page 14: Multi-layer structure of a BP neural network

[Figure: input layer, other hidden layers, output layer.]

Y = set of outputs, X = set of inputs, W = set of weights, b = set of biases, such that for each neuron in a hidden layer $l$, with inputs $x_l$, weights $w_l$, and output $y_l$:

$$y_l = f(w_l x_l + b_l)$$

A layer has multiple neurons. Each neuron has weights $w_1, w_2, w_3, \ldots$, one bias $b$, and a transfer function $f()$.

Page 15: Inside each neuron there is a bias (b)

• Between any two neighboring neuron layers, a set of weights is found.

[Figure: a neuron with inputs x(i=1), x(i=2), ..., x(i=I), weights w(i=1), w(i=2), ..., w(I), internal signal u, transfer function f(u), and output y.]

Page 16: Inside each neuron: x = input, y = output

Typically $f()$ is a logistic (sigmoid) function, i.e.

$$u = \sum_{i=1}^{I} w(i)\,x(i) + b, \qquad y = f(u) = \frac{1}{1+e^{-\beta u}}$$

with $b$ = bias, $x$ = input, $w$ = weight, $u$ = internal signal. Assume $\beta = 1$ for simplicity; therefore

$$y = f(u) = \frac{1}{1+e^{-u}} = f\!\left(\sum_{i=1}^{I} w(i)\,x(i) + b\right)$$

[Figure: the same neuron: inputs x(i=1), ..., x(i=I), weights w(i=1), ..., w(I), internal signal u, f(u), output y.]
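As a concrete illustration of the formulas above, here is a minimal MATLAB/Octave sketch of one neuron's forward computation (the variable names and values are my own, not from the demo program):

    % one neuron: weighted sum plus bias, then logistic (sigmoid) transfer
    x = [0.2; 0.7; 0.1];      % I = 3 inputs
    w = [0.5; -0.3; 0.8];     % one weight per input
    b = 0.1;                  % bias
    u = w' * x + b;           % internal signal u = sum_i w(i)x(i) + b
    y = 1 / (1 + exp(-u))     % output y = f(u), with beta = 1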

Page 17: BPNN forward pass

• The forward pass finds the output when an input is given. For example:
• Assume we have used N = 60,000 images (the MNIST database) to train a network to recognize c = 10 numerals.
• When an unknown image is given to the input, the output neuron that corresponds to the correct answer will give the highest output level.

[Figure: an input image feeds the network; 10 output neurons for 0, 1, 2, ..., 9; output pattern 0 0 0 1 0 0 ...]

Page 18: Our simple demo program

• Training pattern:
  – 3 classes (in 3 rows).
  – Each class has 3 training samples (the items in each row).
• After training, when an unknown input (assume it is test image #2) is presented to the network, the network should tell you it is class 2.

[Figure: training patterns for class1, class2, class3; result: the input image is recognized as class 2.]

Page 19: Numerical example: architecture of our example

• Input layer: 9x1 pixels.
• Hidden layer: 5 neurons; $W^l$ has 9 inputs for each neuron, $b^l$ has 1 bias for each neuron.
• Output layer: 3x1.
• Weights $W^l$, biases $b^l$, and a transfer function $f()$ for each neuron.

Page 20: The input x

P2=[50 30 25 215 225 231 31 22 34; ... %class1: 1st training sample. Gray level 0->255

[Figure: the 9 pixel values P1=50, P2=30, P3=25, P4=215, P5=225, P6=235, P7=31, P8=22, P9=34 feed the 9 neurons in the input layer, then 5 neurons in the hidden layer, then 3 neurons in the output layer.]

Page 21: Exercise 2: feed forward

Input = P1, ..., P9; output = Y1, Y2, Y3; teacher (target) = T1, T2, T3.

[Figure: input layer P(i=1), P(i=2), P(i=3), ..., P(i=9); hidden layer A1 = 5 neurons indexed by j, with $W^{l=1}$ = 9x5 and $b^{l=1}$ = 5x1 and weights such as (i=1,j=1), (i=2,j=1); output layer (layer l=2) indexed by k, with weights such as (j=1,k=1), (j=2,k=1), (j=2,k=2). Outputs: Y1 = 0.5101 (T1 = 1), Y2 = 0.4322 (T2 = 0), Y3 = 0.3241 (T3 = 0). Class 1 target code: T1,T2,T3 = 1,0,0.]

Exercise 2: What is the target code for T1,T2,T3 if it is for class 3? Ans: 0,0,1.

Page 22: Exercise 3: find Y1

[Figure: a small network with 3 inputs (X = 1, X = 3.1, X = 0.5), a hidden layer of two neurons A1 (bias b = 0.5) and A2 (bias b = 0.3), and an output layer with biases b = 0.7 and b = 0.6. Hidden neuron A1 has input weights 0.1, 0.35, 0.4; hidden neuron A2 has input weights 0.27, 0.73, 0.15; output Y1 receives A1 and A2 through weights 0.6 and 0.35, and output y2 through the remaining weights 0.8 and 0.25. Find Y1.]

Each neuron computes

$$y = f(u) = \frac{1}{1+e^{-u}}, \qquad u = \sum_{i=1}^{I} w(i)\,x(i) + b$$

Page 23: Answer 3

%demo_bpnn_note1 khw ver15
u1=1*0.1+3.1*0.35+0.5*0.4+0.5
A1=1/(1+exp(-1*u1))

u2=1*0.27+3.1*0.73+0.5*0.15+0.3
A2=1/(1+exp(-1*u2))

u_Y1=A1*0.6+A2*0.35+0.7
Y1=1/(1+exp(-1*u_Y1))

%%%%%% result %%%%%%
%>>demo_bpnn_note1
u1 = 1.8850
A1 = 0.8682
u2 = 2.9080
A2 = 0.9482
u_Y1 = 1.5528
Y1 = 0.8253
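The same computation in matrix form, as a minimal sketch (grouping the weights into W1 and w2 follows the layer structure above; the variable names are mine):

    % forward pass of Exercise 3 in matrix form
    x  = [1; 3.1; 0.5];                  % inputs
    W1 = [0.1 0.35 0.4; 0.27 0.73 0.15]; % hidden-layer weights (2 neurons x 3 inputs)
    b1 = [0.5; 0.3];                     % hidden-layer biases
    w2 = [0.6 0.35];                     % weights from A1, A2 to output Y1
    b2 = 0.7;                            % output bias
    A  = 1 ./ (1 + exp(-(W1*x + b1)));   % hidden outputs A1, A2
    Y1 = 1 / (1 + exp(-(w2*A + b2)))     % output Y1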

Page 24: Part 2: Back propagation processing (training the network)

Back Propagation Neural Net (BPNN) training.
Ref: http://en.wikipedia.org/wiki/Backpropagation

Page 25: Back propagation stage

[Figure: at layer $l$, Part 1 feed-forward (studied before) computes $x_{l+1} = f(W_l x_l + b_l)$ from $x_l$; Part 2 back propagation runs in the reverse direction.]

For training we need to find $\frac{\partial E}{\partial w}$, why? We will explain why and prove the necessary equations in the following slides.

Page 26: The criteria to train a network

Based on the overall error function, there are N samples and c classes to be learned (assume N = 60,000 in the MNIST dataset).

Error of the $n^{th}$ training sample over all $c$ outputs:

$$E_n = \frac{1}{2}\sum_{k=1}^{c}\left\| t_k^n - y_k^n \right\|^2, \qquad n = 1, \ldots, N \qquad (1)$$

Overall error over all samples and all outputs:

$$E = \frac{1}{2}\sum_{n=1}^{N}\sum_{k=1}^{c}\left\| t_k^n - y_k^n \right\|^2$$

where $t_k^n$ = the given true class (teacher) of the $n^{th}$ training sample, and $y_k^n$ = the output class of the $n^{th}$ training sample at the output of the feed-forward network.

Example: for the $n^{th}$ training sample of the $k^{th}$ class, the teacher says it is class $t_k$.
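A minimal sketch of the per-sample error in MATLAB/Octave, using the example outputs from the earlier feed-forward slide (Y and T as listed there):

    % squared error of one training sample over all c = 3 outputs
    Y  = [0.5101; 0.4322; 0.3241];  % network outputs
    T  = [1; 0; 0];                 % teacher (class 1)
    En = 0.5 * sum((T - Y).^2)      % E_n = (1/2) * sum_k (t_k - y_k)^2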

Page 27: Before we back propagate data, we have to find the feed-forward error signal e(n) for training sample x(n)

Recall the feed-forward processing: input = P1, ..., P9; output = Y1, Y2, Y3; teacher = T1, T2, T3.

[Figure: the same 9-5-3 network as before (hidden layer A1 = 5 neurons indexed by j, $W^{l=1}$ = 9x5, $b^{l=1}$ = 5x1); outputs Y1 = 0.5101 (T1 = 1), Y2 = 0.4322 (T2 = 0), Y3 = 0.3241 (T3 = 0).]

I.e. for the first output, e(n) = (1/2)|Y1 - T1|^2 = 0.5*(0.5101 - 1)^2 = 0.12.

Page 28: Exercise 3: the training idea

• Assume this is the n-th training sample, and it belongs to class C.
• In the previous exercise we calculated that in this network Y1 = 0.8059.
• During training, for this input the teacher says t = 1.

a) What is the error value e?
b) How do we use this e?

• Answer a: e = (1/2)|Y1 - t|^2 = 0.5*(1 - 0.8059)^2 = 0.0188.
• Answer b: We feed this e back into the network to find Δw to minimize the overall E (E = the sum of e(n) over all n). Because we know that w_new = w_old + Δw gives a new w that decreases E, applying this formula recursively achieves a set of W that minimizes E.

Page 29: How to back propagate?

For a neuron j with inputs i = 1, 2, ..., I, the output of neuron j is $y_j$. By definition,

$$u_j = \sum_{i=1}^{I} x_i w_{ij} + b_j, \qquad y_j = f(u_j) = f\!\left(\sum_{i=1}^{I} x_i w_{ij} + b_j\right)$$

The squared error at the output is

$$E = \frac{1}{2}(t_j - y_j)^2$$

with $t$ = target (teacher) and $y$ = actual output.

We want to find $\frac{\partial E}{\partial w_{ij}}$; by the chain rule,

$$\frac{\partial E}{\partial w_{ij}} = \frac{\partial E}{\partial y_j}\,\frac{\partial y_j}{\partial u_j}\,\frac{\partial u_j}{\partial w_{ij}} \qquad (1)$$

But why do we need to find $\frac{\partial E}{\partial w}$?

[Figure: neuron j with inputs i = 1, ..., I through weights $w_{1,j}, \ldots, w_{I,j}$; the output of neuron j is $y_j$.]

Page 30: Because ∂E/∂w_{i,j} tells you how to change w to minimize e (hence E); the method is called learning by gradient descent

In each learning cycle (epoch), a new $w$ is calculated using

$$w_{new} = w_{old} + \Delta w$$

If we want E to decrease in every learning cycle (learning by gradient descent), make

$$\Delta w = -\eta\,\frac{\partial E}{\partial w}, \qquad \text{so} \qquad w_{new} = w_{old} - \eta\,\frac{\partial E}{\partial w}$$

Do it slowly: use a small +ve learning factor $\eta \approx 0.1$. (The theory of gradient descent will be explained in the next slide.) That's why we need $\frac{\partial E}{\partial w}$.

For the same argument, for the bias:

$$b_{new} = b_{old} - \eta\,\frac{\partial E}{\partial b}$$

Page 31: We need to find Δw, why?

Ans: Using the Taylor series (by definition):

$$E(w_{old} + \Delta w) = E(w_{old}) + \frac{\partial E(w_{old})}{\partial w}\,\Delta w + \ldots \qquad (*)$$

Here $E(w_{new}) = E(w_{old} + \Delta w)$. We set

$$\Delta w = -\eta\,\frac{\partial E}{\partial w} \qquad (**)$$

where $\eta$ is a small +ve term to set the learning rate. Putting (**) into (*) gives

$$E(w_{new}) = E(w_{old}) - \eta\left(\frac{\partial E}{\partial w}\right)^2$$

Since $\eta\left(\frac{\partial E}{\partial w}\right)^2$ is always +ve, $E(w_{new}) < E(w_{old})$.

Conclusion: setting $\Delta w = -\eta\,\frac{\partial E}{\partial w}$ will decrease E.

Using Taylor series:
http://www.fepress.org/files/math_primer_fe_taylor.pdf
http://en.wikipedia.org/wiki/Taylor's_theorem
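A minimal MATLAB/Octave sketch of this argument on a toy error function (E(w) = (w - 2)^2 is my own example, not from the slides): each update w = w - η·∂E/∂w lowers E.

    % gradient descent on a toy error function E(w) = (w-2)^2
    E    = @(w) (w - 2).^2;
    dEdw = @(w) 2*(w - 2);      % derivative of E
    eta  = 0.1;                 % small +ve learning rate
    w    = 0;                   % initial weight
    for cycle = 1:5
        w = w - eta * dEdw(w);  % w_new = w_old - eta * dE/dw
        fprintf('cycle %d: w = %.4f, E = %.4f\n', cycle, w, E(w));
    end                         % E decreases every cycle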

Page 32: Back propagation idea

Input = P1, ..., P9; outputs = Y(k=1), Y(k=2), Y(k=3); teachers = T(k=1), T(k=2), T(k=3).

[Figure: the same 9-5-3 network (hidden layer A1 = 5 neurons indexed by j, $W^{l=1}$ = 9x5, $b^{l=1}$ = 5x1); outputs Y(k=1) = 0.5101 (T(k=1) = 1), Y(k=2) = 0.4322 (T(k=2) = 0), Y(k=3) = 0.3241 (T(k=3) = 0). e = (1/2)|Y1 - T1|^2 = 0.5*(0.5101 - 1)^2 = 0.12. Back propagate e to find a better w that reduces E.]

Page 33: The training algorithm

Loop many epochs until E is very small or W is stable
{ For n = 1:N_all_training_samples
  { feed forward x(n) to the network to get y(n)
    e(n) = 0.5*[y(n)-t(n)]^2              // t(n) = teacher of sample x(n)
    back propagate e(n) to the network
    // showed earlier: if Δw = -η*∂E/∂w and w_new = w_old + Δw,
    // the output y(n) will be closer to t(n), hence e(n) will decrease
    find Δw = -η*∂E/∂w                    // E will decrease; learning rate η = 0.1
    update w_new = w_old + Δw = w_old - η*∂E/∂w   // for the weights
    similarly update b_new = b_old + Δb = b_old - η*∂E/∂b   // for the biases
  }
  E = sum_all_n(e(n))
}
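The pseudocode above maps directly onto the MATLAB program in the appendix. As a minimal self-contained sketch, here is the same loop on a tiny 2-2-1 network with a made-up two-sample problem of my own (not the demo's data); eq. (2) and eq. (3) are derived in the next slides:

    % toy BPNN training loop: 2 inputs, 2 hidden neurons, 1 output
    P = [0 1; 1 0];  T = [0 1];           % two training samples (columns) and teachers
    W1 = rand(2,2)-0.5; b1 = rand(2,1)-0.5;
    W2 = rand(1,2)-0.5; b2 = rand(1,1)-0.5;
    f  = @(u) 1./(1+exp(-u));             % sigmoid
    for epoch = 1:5000
        for n = 1:2
            a1 = f(W1*P(:,n)+b1);  y = f(W2*a1+b2);  % forward pass
            s2 = (y - T(n)) .* y .* (1-y);           % output sensitivity, eq.(2)
            s1 = (W2' * s2) .* a1 .* (1-a1);         % hidden sensitivity, eq.(3)
            W2 = W2 - 0.1*s2*a1';   b2 = b2 - 0.1*s2;      % gradient descent updates
            W1 = W1 - 0.1*s1*P(:,n)';  b1 = b1 - 0.1*s1;
        end
    end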

Page 34: Theory of how to find ∂E/∂w

An input $x_j$ is connected to an output neuron k through the weight $w_{j,k}$:

$$u_k = \sum_{j=1}^{J} x_j w_{j,k} + b_k, \qquad y_k = f(u_k) = f\!\left(\sum_{j=1}^{J} x_j w_{j,k} + b_k\right)$$

We want to see how $w_{j,k}$ affects E, so from (1), by the chain rule:

$$\frac{\partial E}{\partial w_{j,k}} = \frac{\partial E}{\partial y_k}\,\frac{\partial y_k}{\partial u_k}\,\frac{\partial u_k}{\partial w_{j,k}} = \text{term1} \cdot \text{term2} \cdot \text{term3}$$

[Figure: inputs $x_{j=1}, \ldots, x_{j=J}$ feed output neuron k through weights $w_{j,k}$; internal signal $u_k$, output $y_k$.]

Page 35: Case 1: the neuron is at the output layer. We want to see how E will change if we change the weight w_{j,k}

[Figure: neuron k as an output neuron: input $x_j$ through weight $w_{j,k}$, internal signal $u_k$, output $y_k$, teacher (target) class $t_k$; $e_k = 0.5(t_k - y_k)^2$.]

$$\frac{\partial E}{\partial w_{j,k}} = \frac{\partial E}{\partial y_k}\,\frac{\partial y_k}{\partial u_k}\,\frac{\partial u_k}{\partial w_{j,k}} = \text{term1} \cdot \text{term2} \cdot \text{term3}$$

term1: since $E = 0.5\,(t_k - y_k)^2$, measured at the output,

$$\frac{\partial E}{\partial y_k} = (y_k - t_k)$$

term2: (see the appendix)

$$\frac{\partial y_k}{\partial u_k} = \frac{\partial f(u_k)}{\partial u_k} = f'(u_k) = f(u_k)\left(1 - f(u_k)\right)$$

term3: since $u_k = \sum_j x_j w_{j,k} + b_k$ and $b_k$ is a constant,

$$\frac{\partial u_k}{\partial w_{j,k}} = x_j$$

Hence

$$\frac{\partial E}{\partial w_{j,k}} = (y_k - t_k)\,f(u_k)\left(1 - f(u_k)\right)x_j$$

Note: define the sensitivity

$$\delta_k = \text{term1} \cdot \text{term2} = \frac{\partial E}{\partial u_k} = (y_k - t_k)\,f(u_k)\left(1 - f(u_k)\right) \qquad (2)$$
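In the appendix program this sensitivity is the variable s2. A minimal sketch of eq. (2) for one output neuron (the numeric values are my own):

    % output-layer sensitivity, eq.(2): delta = (y - t) * f(u) * (1 - f(u))
    t = 1;                        % teacher
    u = 0.8;                      % internal signal of the output neuron
    y = 1/(1+exp(-u));            % f(u)
    delta = (y - t) * y * (1-y);  % dE/du at the output neuron
    dEdw  = delta * 0.6;          % dE/dw_{j,k} = delta * x_j, e.g. x_j = 0.6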

Page 36: Case 2: the neuron j is at a hidden layer. We want to see how E will change if we change the weight w_{i,j}

Note: the output $y_j$ affects all the neurons connected to it in the next layer.

[Figure: hidden neuron j receives input $x_i$ through weight $w_{i,j}$ (W1 in the program), producing internal signal $u_j$ and output $y_j$; $y_j$ feeds the output neurons indexed by k = 1, ..., K through weights $w_{j,k=1}, w_{j,k=2}, \ldots, w_{j,k=K}$ (W2 in the program), producing $u_k$, $y_k$; a change of $w_{i,j}$ here changes E.]

$$\frac{\partial E}{\partial w_{i,j}} = \frac{\partial E}{\partial y_j}\,\frac{\partial y_j}{\partial u_j}\,\frac{\partial u_j}{\partial w_{i,j}} = \text{term1} \cdot \text{term2} \cdot \text{term3}$$

term1:

$$\frac{\partial E}{\partial y_j} = \sum_{k=1}^{K}\frac{\partial E}{\partial u_k}\,\frac{\partial u_k}{\partial y_j} = \text{part1a} \cdot \text{part1b}$$

part1a: $\frac{\partial E}{\partial u_k} = \delta_k$ (see eq. (2) of the last slide).

part1b: because $u_k = \sum_j w_{j,k}\,y_j + b_k$ for each k, and $y_j$ affects all $u_k$ in the next layer,

$$\frac{\partial u_k}{\partial y_j} = w_{j,k}$$

Page 37: Case 2, continued

So term1 = part1a · part1b:

$$\frac{\partial E}{\partial y_j} = \sum_{k=1}^{K}\delta_k\,w_{j,k}$$

term2 and term3 are similar to those in the previous slide: term2 = $f(u_j)\left(1 - f(u_j)\right)$ (for this hidden neuron j, this is df1 in the program), and term3 = $x_i$ (the input $x_i$ to hidden neuron j, P(:,i) in the program). Hence

$$\frac{\partial E}{\partial w_{i,j}} = \left[\sum_{k=1}^{K}\delta_k\,w_{j,k}\right]f(u_j)\left(1 - f(u_j)\right)x_i \qquad (3)$$
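In the appendix program this is the s1 = diag(df1)*W2'*s2 step. A minimal vectorized sketch of eq. (3) (the sizes match the demo network; the random numbers stand in for real values):

    % hidden-layer sensitivities and gradient, eq.(3)
    W2 = rand(3,5)-0.5;               % weights from 5 hidden neurons to 3 outputs
    s2 = rand(3,1);                   % output-layer sensitivities delta_k, from eq.(2)
    a1 = rand(5,1);                   % hidden outputs f(u_j)
    x  = rand(9,1);                   % inputs to the hidden layer
    s1 = (W2' * s2) .* a1 .* (1-a1);  % [sum_k delta_k w_{j,k}] * f(u_j)(1-f(u_j))
    dEdW1 = s1 * x';                  % dE/dw_{i,j} = s1_j * x_i, a 5x9 gradient matrix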

Page 38: After all ∂E/∂w are found (after you solved case 1 and case 2)

We can use this step to update all w so E is minimized using the gradient descent method (learning rate $\eta \approx 0.1$):

$$\Delta w = -\eta\,\frac{\partial E}{\partial w}, \qquad w_{new} = w_{old} + \Delta w = w_{old} - \eta\,\frac{\partial E}{\partial w}$$

Page 39: Revisit the training algorithm

Iter = 1:all_epochs (or break when E is very small)
{ For n = 1:N_all_training_samples
  { feed forward x(n) to the network to get y(n)
    e(n) = 0.5*[y(n)-t(n)]^2;             // t(n) = teacher of sample x(n)
    back propagate e(n) to the network
    // showed earlier: if Δw = -η*∂E/∂w and w_new = w_old + Δw,
    // the output y(n) will be closer to t(n), hence e(n) will decrease
    find Δw = -η*∂E/∂w                    // E will decrease; learning rate η = 0.1
    update w_new = w_old + Δw = w_old - η*∂E/∂w;   // for the weights
    similarly update b_new = b_old + Δb = b_old - η*∂E/∂b;   // for the biases
  }
  E = sum_all_n(e(n))
}

Page 40: Summary

• Learned what Back Propagation Neural Networks (BPNN) are.
• Learned the forward pass.
• Learned how to back propagate data during training of the BPNN network.

Page 41: References

• Wiki
  – http://en.wikipedia.org/wiki/Backpropagation
  – http://en.wikipedia.org/wiki/Convolutional_neural_network
• Matlab programs
  – Neural Network for pattern recognition - Tutorial: http://www.mathworks.com/matlabcentral/fileexchange/19997-neural-network-for-pattern-recognition-tutorial
  – CNN Matlab example: http://www.mathworks.com/matlabcentral/fileexchange/38310-deep-learning-toolbox
• Open source library
  – TensorFlow: http://www.geekwire.com/2015/google-open-sources-tensorflow-machine-learning-system-offering-its-neural-network-to-outside-developers/

Page 42: Appendices

Page 43: Appendix 1: the sigmoid function f(u) and its derivative f'(u)

$$f(u) = \frac{1}{1+e^{-\beta u}}; \quad \text{for simplicity set } \beta = 1, \quad \text{so } f(u) = \frac{1}{1+e^{-u}}$$

Using the chain rule:

$$f'(u) = \frac{df(u)}{du} = \frac{d}{du}\left(\frac{1}{1+e^{-u}}\right) = \frac{e^{-u}}{(1+e^{-u})^2}$$

Hence

$$\frac{e^{-u}}{(1+e^{-u})^2} = \frac{1}{1+e^{-u}}\cdot\frac{e^{-u}}{1+e^{-u}} = \frac{1}{1+e^{-u}}\left(1 - \frac{1}{1+e^{-u}}\right) = f(u)\left(1-f(u)\right)$$

Thus

$$f'(u) = \frac{df(u)}{du} = f(u)\left(1-f(u)\right)$$

http://link.springer.com/chapter/10.1007%2F3-540-59497-3_175#page-1
http://mathworld.wolfram.com/SigmoidFunction.html
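A minimal MATLAB/Octave check of this identity (the dlogsig calls in the appendix program compute this same derivative):

    % verify f'(u) = f(u)(1-f(u)) against a finite difference
    u = 0.7;  h = 1e-6;
    f = @(u) 1./(1+exp(-u));
    analytic = f(u)*(1-f(u));           % f(u)(1 - f(u))
    numeric  = (f(u+h)-f(u-h))/(2*h);   % central-difference approximation
    disp([analytic numeric])            % the two values agree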

Page 44: Alternative derivation (for the output layer, in each neuron)

Since for the $n^{th}$ sample $E_n = \frac{1}{2}(t_n - y_n)^2$, where $y_n = f(u_n)$ is the current output and $t_n$ is the true target (teacher):

$$\frac{\partial E_n}{\partial b} = \frac{1}{2}\,\frac{\partial (t_n - y_n)^2}{\partial b} = (y_n - t_n)\frac{\partial y_n}{\partial b} \qquad \text{(i)}$$

Since $y_n = f(u_n)$ and $u = \sum_i x_i w_i + b$, we have $\frac{\partial u}{\partial b} = 1$, so

$$\frac{\partial y_n}{\partial b} = f'(u_n)\,\frac{\partial u_n}{\partial b} = f'(u_n) \qquad \text{(ii)}$$

(i) and (ii) give, at the output layer,

$$\frac{\partial E_n}{\partial b} = (y_n - t_n)\,f'(u_n) = \delta^{l=L} \quad \text{(the sensitivity)} \qquad \text{(iii)}$$

Output (last layer $l = L$): $\delta^L = f'(u^L)\,(y_n - t_n)$, with $t$ = target (teacher), $y$ = output. Back propagate the error to the previous layer.

Page 45: Derivation (continued)

Also, from (iii) and $E_n = \frac{1}{2}(t_n - y_n)^2$, for each input $x$ and weight $w$:

$$\frac{\partial E_n}{\partial w^l} = (y_n - t_n)\,f'(u^l)\,x^l = \delta^l\,x^l \qquad \text{(iv)}$$

For each learning phase, a new $\Delta w$ is calculated:

$$\Delta w = -\eta\,\frac{\partial E}{\partial w} \qquad \text{(v)}$$

If we want E to decrease in every learning cycle, make $\Delta w = -\eta\,(\partial E/\partial w)$; do it slowly, using a small +ve learning factor $\eta$ (this is the gradient descent method, discussed earlier). Hence, using eq. (iv) and (v):

$$w^l_{new} = w^l_{old} + \Delta w = w^l_{old} - \eta\,\delta^l\,(x^l)^T$$

For the same argument, using eq. (iii):

$$b^l_{new} = b^l_{old} - \eta\,\delta^l$$

Page 46: BPNN example in MATLAB

Based on "Neural Network for pattern recognition - Tutorial":
http://www.mathworks.com/matlabcentral/fileexchange/19997-neural-network-for-pattern-recognition-tutorial

Page 47: Example: a simple BPNN

• Number of classes (no. of output neurons) = 3
• Input: 9 pixels; each input is a 3x3 image
• Training samples = 3 for each class
• Number of hidden layers = 1
• Number of neurons in the hidden layer = 5

Page 48: Display of testing patterns

Page 49: Architecture

Input: P = 9x1, indexed by i.
A1: hidden layer 1 = 5 neurons, indexed by j; $W^{l=1}$ = 9x5, $b^{l=1}$ = 5x1. S1 is generated in layer l=1.
A2: layer 2 = 3 output neurons, indexed by k; $W^{l=2}$ = 5x3, $b^{l=2}$ = 3x1. S2 is generated in layer l=2.

Neuron j=1 of the hidden layer (bias b1(j=1)) receives P(i=1), ..., P(i=9) through weights $W^{l=1}(i=1,j=1), \ldots, W^{l=1}(i=9,j=1)$:

$$A1(j=1) = \frac{1}{1+e^{-\left(W^{l=1}(i=1,j=1)\,P(i=1)\;+\;W^{l=1}(i=2,j=1)\,P(i=2)\;+\;\ldots\;+\;b^{1}(j=1)\right)}}$$

Neuron k=1 of the output layer (bias b2(k=1)) receives A1(j=1), ..., A1(j=5) through weights $W^{l=2}(j=1,k=1), \ldots, W^{l=2}(j=5,k=1)$:

$$A2(k=1) = \frac{1}{1+e^{-\left(W^{l=2}(j=1,k=1)\,A1(j=1)\;+\;W^{l=2}(j=2,k=1)\,A1(j=2)\;+\;\ldots\;+\;b^{2}(k=1)\right)}}$$

Page 50:

% source: http://www.mathworks.com/matlabcentral/fileexchange/19997-neural-network-for-pattern-recognition-tutorial
% comments added by kh wong
clear all
clc
nump=3;  % number of classes
n=3;     % number of images per class
% training images reshaped into columns in P
% image size (3x3) reshaped to (1x9)

% training images
P=[196 35 234 232 59 244 243 57 226; ...
   188 15 236 244 44 228 251 48 230; ... % class 1
   246 48 222 225 40 226 208 35 234; ...
   255 223 224 255 0 255 249 255 235; ...
   234 255 205 251 0 251 238 253 240; ... % class 2
   232 255 231 247 38 246 190 236 250; ...
   25 53 224 255 15 25 249 55 235; ...
   24 25 205 251 10 25 238 53 240; ... % class 3
   22 35 231 247 38 24 190 36 250]';

% testing images
N=[208 16 235 255 44 229 236 34 247; ...
   245 21 213 254 55 252 215 51 249; ... % class 1
   248 22 225 252 30 240 242 27 244; ...
   255 241 208 255 28 255 194 234 188; ...
   237 243 237 237 19 251 227 225 237; ... % class 2
   224 251 215 245 31 222 233 255 254; ...
   25 21 208 255 28 25 194 34 188; ...
   27 23 237 237 19 21 227 25 237; ... % class 3
   24 49 215 245 31 22 233 55 254]';

% Normalization
P=P/256;
N=N/256;

Page 51:

% display the training images
figure(1)
for i=1:n*nump
    im=reshape(P(:,i), [3 3]);
    % remove the line below to reflect the true data input
    % im=imresize(im,20); % resize the image to make it clear
    subplot(nump,n,i), imshow(im);
    title(strcat('Train image/Class #', int2str(ceil(i/n))))
end
% display the testing images
figure
for i=1:n*nump
    im=reshape(N(:,i), [3 3]);
    % remove the line below to reflect the true data input
    % im=imresize(im,20); % resize the image to make it clear
    subplot(nump,n,i), imshow(im); title(strcat('test image #', int2str(i)))
end

Page 52:

% targets
T=[ 1 1 1 0 0 0 0 0 0
    0 0 0 1 1 1 0 0 0
    0 0 0 0 0 0 1 1 1 ];

S1=5; % number of neurons in the hidden layer
S2=3; % number of output neurons (= number of classes)

[R,Q]=size(P);
epochs = 10000;   % number of iterations
goal_err = 10e-5; % goal error
a=0.3;            % define the range of random variables
b=-0.3;
W1=a + (b-a)*rand(S1,R);  % weights between input and hidden neurons
W2=a + (b-a)*rand(S2,S1); % weights between hidden and output neurons
b1=a + (b-a)*rand(S1,1);  % biases of the hidden neurons
b2=a + (b-a)*rand(S2,1);  % biases of the output neurons
n1=W1*P;
A1=logsig(n1); % feedforward the first time
n2=W2*A1;
A2=logsig(n2); % feedforward the first time
e=A2-T;        % actually e=T-A2 in the main loop
error =0.5*mean(mean(e.*e)); % better to say e=T-A2, but no harm to error here
nntwarn off

Page 53:

for itr =1:epochs
    if error <= goal_err
        break
    else
        for i=1:Q % i is the index to a column in P (9x9); each column P(:,i)
            % is a training sample image: 9 training samples, 3 for each class
            % A1=5x9: outputs of the hidden layer and inputs to the output layer
            % A2=3x9: outputs of the output layer
            % T=true class; each column in T is for 1 training sample
            % hidden_layer = 1, output_layer = 2
            df1=dlogsig(n1,A1(:,i)); % df1 is 5x1, for the 5 neurons in the hidden layer
            df2=dlogsig(n2,A2(:,i)); % df2 is 3x1, for the output neurons
            % s2 is sigma2 = sensitivity2 from the output layer, equation (2)
            s2 = -1*diag(df2) * e(:,i); % e=T-A2; df2=f'=f(1-f) of layer 2

Page 54:

            % s1=5x1
            s1 = diag(df1)* W2'* s2; % eq.(3), feedback from s2 to s1
            % dW = -eta*s2*df(u)*x in the ppt; eta=0.1, s2 is found, x is A1
            % W2 is 3x5: each output neuron receives
            % 5 inputs from the 5 hidden neurons in the hidden layer; update W2
            % sigma2 = s2 = -1*diag(df2)*e(:,i); e=T-A2; df2=f'=f(1-f) of layer 2
            % delta_W2 = -learning_rate*sigma2*input_to_output_layer
            % delta_W2 = -0.1*sigma2*A1
            W2 = W2-0.1*s2*A1(:,i)'; % learning rate=0.1, eq.(2), output case
            % 3x5 = 3x5 - (3x1*1x5)
            % A1 = 5 hidden neuron outputs (5 hidden neurons)
            % A1(:,i)' = 1x5 = outputs of the hidden layer
            b2 = b2-0.1*s2; % threshold (bias) update
            % 3x1 = 3x1 - 3x1
            % P(:,i)' = 1x9 = input to the hidden layer
            % s1 = 5x1, because each hidden node has 1 sensitivity (sigma)
            W1 = W1-0.1*s1*P(:,i)'; % update W1 in layer 1, see eq.(3), hidden case
            % 5x9 = 5x9 - (5x1*1x9), since P is 9x9 and, for an i, P(:,i)' = 1x9

Page 55:

            b1 = b1-0.1*s1; % threshold (bias) update
            % 5x1 = 5x1 - 5x1
            A1(:,i)=logsig(W1*P(:,i)+b1); % forward
            % 5x1 = 5x1
            A2(:,i)=logsig(W2*A1(:,i)+b2); % forward
            % 3x1 = 3x1
        end
        e = T - A2; % for this e, put a -ve sign when finding s2
        error =0.5*mean(mean(e.*e));
        disp(sprintf('Iteration :%5d   mse :%12.6f', itr, error));
        mse(itr)=error;
    end
end

Page 56:

threshold=0.9; % threshold of the system (higher threshold = more accuracy)

% training images result
%TrnOutput=real(A2)
TrnOutput=real(A2>threshold)

% applying test images to the NN: TESTING BEGINS HERE
n1=W1*N;
A1=logsig(n1);
n2=W2*A1;
A2test=logsig(n2);

% testing images result
%TstOutput=real(A2test)
TstOutput=real(A2test>threshold)

% recognition rate
wrong=size(find(TstOutput-T),1);
recognition_rate=100*(size(N,2)-wrong)/size(N,2)
% end of code

Page 57: Result of the program

[Figure: mse error vs. itr (epoch iteration).]

Page 58: Appendix: architecture of our demo program (Exercise 3)

Write the formulas for A1(i=4) and A2(k=3). How many inputs, hidden neurons, outputs, and weights are in each layer?

The structure is the same as on Page 49: input P = 9x1; hidden layer A1 = 5 neurons indexed by j, $W^{l=1}$ = 9x5, $b^{l=1}$ = 5x1 (S1 generated in layer l=1); output layer A2 = 3 neurons indexed by k, $W^{l=2}$ = 5x3, $b^{l=2}$ = 3x1 (S2 generated in layer l=2), with

$$A1(j=1) = \frac{1}{1+e^{-\left(W^{l=1}(i=1,j=1)\,P(i=1)\;+\;W^{l=1}(i=2,j=1)\,P(i=2)\;+\;\ldots\;+\;b^{1}(j=1)\right)}}$$

$$A2(k=1) = \frac{1}{1+e^{-\left(W^{l=2}(j=1,k=1)\,A1(j=1)\;+\;W^{l=2}(j=2,k=1)\,A1(j=2)\;+\;\ldots\;+\;b^{2}(k=1)\right)}}$$

Page 59: Answer (Exercise 3): write the values for A1(i=4) and A2(k=3)

• P = [0.7656 0.7344 0.9609 0.9961 0.9141 0.9063 0.0977 0.0938 0.0859] % each entry is P(i), i=1,2,3,...
• W^{l=1} = [0.2112 0.1540 -0.0687 -0.0289 0.0720 -0.1666 0.2938 -0.0169 -0.1127] % each entry is w^{l=1}(i), i=1,2,3,...
• b^{l=1} = 0.1441 % for this neuron

Find A1(i=4):

$$A1(i=4) = \frac{1}{1+e^{-\left(W^{l=1}\cdot P\;+\;b^{l=1}\right)}} = 0.49$$

The 4th hidden neuron is A1(i=4), i.e.

$$A1(j=4) = \frac{1}{1+e^{-\left(W^{l=1}(i=1,j=4)\,P(i=1)\;+\;W^{l=1}(i=2,j=4)\,P(i=2)\;+\;\ldots\;+\;b^{1}(j=4)\right)}}$$

How many inputs, hidden neurons, outputs, weights, and biases in each layer?
Answer: inputs = 9, hidden neurons = 5, outputs = 3; weights in the hidden layer (layer 1) = 9x5, weights in the output layer (layer 2) = 5x3; 5 biases in the hidden layer (layer 1), 3 biases in the output layer (layer 2).