Convolution Neural Network (CNN): A Tutorial, KH Wong, ver. 4.11a
TRANSCRIPT
Convolution Neural Network (CNN)
A tutorial, KH Wong
Introduction
• Very popular
– Toolboxes: cuda-convnet and caffe (the more user-friendly of the two)
• A high-performance, multi-class classifier
• Successful in handwritten optical character recognition (OCR), speech recognition, image noise removal, etc.
• Easy to implement
– Slow in learning
– Fast in classification
Overview of this note
• Part 1: Fully connected Back Propagation Neural Networks (BPNN)
– Part 1A: feed-forward processing
– Part 1B: feed-backward processing
• Part 2: Convolution neural networks (CNN)
– Part 2A: feed forward of CNN
– Part 2B: feed backward of CNN
Part 1
Fully Connected Back Propagation (BP) neural net
Theory: Fully connected Back Propagation Neural Net (BPNN)
• Use many samples to train the weights, so the network can classify an unknown input into different classes
• We will explain:
– How to use it after training: the forward pass
– How to train it: how to train the weights and biases (using forward and backward passes)
Training
• How to train the weights (W) and biases (b), using forward and backward passes
• Initialize W and b randomly
• For iter = 1 : all_epochs (each pass over the training set is called an epoch):
– Forward pass, for each output neuron:
• Use training samples X_class_t: feed forward to find y
• Err = error_function(y - t)
– Backward pass:
• Find ΔW and Δb to reduce Err
• W_new = W_old + ΔW;  b_new = b_old + Δb
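A minimal self-contained sketch of this training loop, shown for a single sigmoid neuron with toy data (all numbers and names here are illustrative, not from the toolbox):

    X = rand(9, 30);  t = rand(1, 30) > 0.5;  % 30 toy samples with binary targets
    W = randn(1, 9);  b = randn;              % initialize W and b randomly
    eta = 0.5;                                % learning factor (rate)
    for epoch = 1:100                         % each pass over the data is an epoch
        for n = 1:30
            u = W*X(:,n) + b;                 % forward pass
            y = 1/(1 + exp(-u));
            d = (y - t(n)) * y * (1 - y);     % backward pass: sensitivity
            W = W - eta * d * X(:,n)';        % W_new = W_old + deltaW
            b = b - eta * d;                  % b_new = b_old + deltab
        end
    end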
Part 1A
Forward pass of Back Propagation Neural Net (BPNN)
Recall the forward pass for each output neuron:
– Use training samples X_class_t: feed forward to find y
– Err = error_function(y - t)
Feed forward of Back Propagation Neural Net (BPNN)
• Inside each neuron:
[Figure: a neuron with inputs $x^l_1, x^l_2, \dots, x^l_N$, weights $w^l_2, w^l_3, \dots, w^l_N$, and bias $b^l$, feeding the output neurons.]

The inputs $x^l$, weights $W^l$, and bias $b^l$ are combined such that

$u^l = W^l x^l + b^l, \qquad x^{l+1} = f(u^l)$

Typically $f$ is a logistic (sigmoid) function, i.e.

$f(u) = \frac{1}{1 + e^{-u}}$
Sigmoid function f(u) and its derivative f'(u)
$f(u) = \frac{1}{1 + e^{-\alpha u}}$, where $\alpha$ is the parameter for the slope.

For simplicity, set the slope parameter $\alpha = 1$:

$f(u) = \frac{1}{1 + e^{-u}}$

$f'(u) = \frac{df(u)}{du} = \frac{d}{du}\left(\frac{1}{1 + e^{-u}}\right) = \frac{e^{-u}}{(1 + e^{-u})^2} = \frac{1}{1 + e^{-u}} \cdot \frac{e^{-u}}{1 + e^{-u}}$

Hence

$f'(u) = f(u)\,(1 - f(u))$
http://link.springer.com/chapter/10.1007%2F3-540-59497-3_175#page-1
http://mathworld.wolfram.com/SigmoidFunction.html
A single neuron
• The neural net can have many layers
• Between any two neighboring layers, a set of neurons can be found
[Figure: one neuron between layer $l$ and layer $l+1$, with inputs $x^l(1), x^l(2), \dots$ at layer $l$, weights $W^l(1), W^l(2), \dots$, and outputs $x^{l+1}(1), x^{l+1}(2), \dots$ at layer $l+1$.]

Each neuron computes

$x^{l+1} = f(u^l), \quad \text{with } u^l = W^l x^l + b^l$

where $x^l$ = inputs at layer $l$, $W^l$ = weights, $b^l$ = bias, and $x^{l+1}$ = inputs at layer $l+1$.
BPNN forward pass
• The forward pass finds the output when an input is given. For example:
• Assume we have used N=60,000 images to train a network to recognize c=10 numerals.
• When an unknown image is input, the output neuron corresponding to the correct answer will give the highest output level.
[Figure: an input image feeding the network, with 10 output neurons for the digits 0,1,2,...,9.]
The criterion used to train a network is based on the overall error function:
Overall error over all $N$ training samples and $c$ classes:

$E = \frac{1}{2} \sum_{n=1}^{N} \sum_{k=1}^{c} \left(t_k^n - y_k^n\right)^2$

Error for each neuron on the $n$-th training sample (using the 2-norm):

$E^n = \frac{1}{2} \sum_{k=1}^{c} \left(t_k^n - y_k^n\right)^2 = \frac{1}{2} \left\| t^n - y^n \right\|_2^2$

where $t_k^n$ = the given true class of the $n$-th training sample, and $y_k^n$ = the output class of the $n$-th training sample at the output of the feed-forward network.
Structure of a BP neural network
[Figure: input layer, hidden layer $l$, hidden layer $l+1$, ..., output layer, with inputs $x^l$ entering layer $l$ and $x^{l+1}$ entering layer $l+1$.]

x = set of inputs, W = set of weights, b = set of biases, such that

$u^l = W^l x^l + b^l, \qquad x^{l+1} = f(u^l)$
Architecture (exercise: write the formulas for A1(i=4) and A2(k=3))
[Figure: a 9-5-3 fully connected network.]
• Input: P = 9x1, indexed by j
• Hidden layer: 5 neurons, indexed by i; W1 = 9x5, b1 = 5x1
• Output layer: 3 neurons, indexed by k; W2 = 5x3, b2 = 3x1

Hidden neuron i=1, with weights W1(j=1,i=1), ..., W1(j=9,i=1) and bias b1(i=1):

$A1(i{=}1) = \frac{1}{1 + e^{-\left(W1(1,1)P(1) + W1(2,1)P(2) + \dots + W1(9,1)P(9) + b1(1)\right)}}$

Output neuron k=1, with weights W2(i=1,k=1), ..., W2(i=5,k=1) and bias b2(k=1):

$A2(k{=}1) = \frac{1}{1 + e^{-\left(W2(1,1)A1(1) + W2(2,1)A1(2) + \dots + W2(5,1)A1(5) + b2(1)\right)}}$
Answer (exercise: write the values for A1(i=4) and A2(k=3))
• P = [0.7656 0.7344 0.9609 0.9961 0.9141 0.9063 0.0977 0.0938 0.0859]
• W1 (the weights into hidden neuron i=4) = [0.2112 0.1540 -0.0687 -0.0289 0.0720 -0.1666 0.2938 -0.0169 -0.1127]
• b1(i=4) = -0.1441
• % Find A1(i=4):
• A1(i=4) = 1/(1 + exp(-(W1*P + b1))) = 0.49
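A quick MATLAB check of this value (w4 and b4 are my names for the weights and bias feeding hidden neuron i=4):

    P  = [0.7656 0.7344 0.9609 0.9961 0.9141 0.9063 0.0977 0.0938 0.0859]'; % 9x1 input
    w4 = [0.2112 0.1540 -0.0687 -0.0289 0.0720 -0.1666 0.2938 -0.0169 -0.1127]; % 1x9
    b4 = -0.1441;                        % bias of hidden neuron i=4
    A1_4 = 1/(1 + exp(-(w4*P + b4)))     % prints approximately 0.49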
Numerical example for the forward path
• Feed forward
• Given numbers for x, W, b, etc.
Example: a simple BPNN
• Number of classes (no. of output neurons) = 3
• Input: 9 pixels (each input is a 3x3 image)
• Training samples = 3 for each class
• Number of hidden layers = 1
• Number of neurons in the hidden layer = 5
Architecture of the example
[Figure: input layer of 9x1 pixels, one hidden layer with weights W = 5x9 and biases b = 5x1, and an output layer of 3x1, with $x^{l+1} = f(W^l x^l + b^l)$ at each layer.]
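A minimal sketch of the full forward pass for this 9-5-3 example (the weights below are random placeholders; a trained network would use learned values):

    P  = rand(9,1);                   % 9x1 input (a 3x3 image, flattened)
    W1 = rand(5,9);  b1 = rand(5,1);  % hidden layer: 5 neurons
    W2 = rand(3,5);  b2 = rand(3,1);  % output layer: 3 neurons (3 classes)
    sigm = @(u) 1./(1 + exp(-u));     % logistic (sigmoid) activation
    A1 = sigm(W1*P  + b1);            % 5x1 hidden activations
    A2 = sigm(W2*A1 + b2);            % 3x1 outputs
    [~, class] = max(A2)              % the largest output gives the class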
Part 1B
Backward pass of Back Propagation Neural Net (BPNN)
Feedback
[Figure: at layer $l$, the signal $x^l$ feeds forward to $x^{l+1}$, while the sensitivity feeds backward from layer $l+1$ to layer $l$.]

Feed-forward: $x^{l+1} = f(W^l x^l + b^l)$

Feed-backward: $\delta^l = \left(W^{l+1}\right)^T \delta^{l+1} \circ f'(u^l)$
Derivation

• Since $u = Wx + b$, we have $\frac{\partial u}{\partial b} = 1$.  (i)

• The sensitivity is defined as

$\delta = \frac{\partial E}{\partial b} = \frac{\partial E}{\partial u}\frac{\partial u}{\partial b} = \frac{\partial E}{\partial u}$  (ii)

• For the $n$-th sample, $E = \frac{1}{2}(t_n - y_n)^2 = \frac{1}{2}\left(t_n - f(u_n)\right)^2$, since $y_n = f(u_n)$ is the current output and $t_n$ is the truth (target).  (iii)

• Hence

$\frac{\partial E}{\partial b} = (y_n - t_n)\frac{\partial y_n}{\partial b} = (y_n - t_n)\, f'(u_n)\frac{\partial u_n}{\partial b} = (y_n - t_n)\, f'(u_n)$  (iv)

• From (ii), (iii) and (iv), at the output layer $L$:

$\delta^L = f'(u^L) \circ (y_n - t_n)$
Derivation (continued)
• Also, from (iii):

$\frac{\partial E}{\partial W} = (y_n - t_n)\frac{\partial y_n}{\partial W} = (y_n - t_n)\, f'(u_n)\frac{\partial u_n}{\partial W} = (y_n - t_n)\, f'(u_n)\, x$

since, as in (iv), $u_n = Wx + b$ gives $\frac{\partial u_n}{\partial W} = x$.

• For each learning phase a new $\Delta W$ is calculated: $W_{new} = W_{old} + \Delta W$. If we want $E$ to decrease on every learning cycle, make $\Delta W$ follow the negative gradient; to do it slowly, use a learning factor $\eta$:

$\Delta W = -\eta \frac{\partial E}{\partial W}$

• Hence $W_{new} = W_{old} - \eta \frac{\partial E}{\partial W}$.
Numerical example for the feed-backward pass
Procedure
• From the last layer (output), find δ using t - y
• Find δ for each layer, then find ΔW for the whole network
• Iterate (forward then backward passes) to generate new sets of W, until ΔW is small
• This takes a long time (a minimal sketch follows below)
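A minimal sketch of one backward pass for the 9-5-3 example, continuing the forward-pass sketch from Part 1A (eta is an assumed learning factor):

    t   = [1 0 0]';                      % target vector for the current sample
    eta = 0.5;                           % learning factor
    d2  = (A2 - t) .* A2 .* (1 - A2);    % output-layer delta; f'(u) = A2.*(1-A2)
    d1  = (W2' * d2) .* A1 .* (1 - A1);  % delta fed backward to the hidden layer
    W2  = W2 - eta * d2 * A1';  b2 = b2 - eta * d2;  % W_new = W_old - eta*dE/dW
    W1  = W1 - eta * d1 * P';   b1 = b1 - eta * d1;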
Part 2: Convolution Neural Networks
Part 2A: Feed-forward part, cnnff( )
Matlab example: http://www.mathworks.com/matlabcentral/fileexchange/38310-deep-learning-toolbox
An example: optical character recognition (OCR)
• Example test_example_CNN.m in http://www.mathworks.com/matlabcentral/fileexchange/38310-deep-learning-toolbox
• Based on a database (mnist_uint8, from http://yann.lecun.com/exdb/mnist/)
• 60,000 training examples (28x28 pixels each)
• 10,000 testing samples (a different dataset)
– After training, given an unknown image, the network will tell whether it is 0, or 1, ..., 9, etc.
– Error rate about 11% using 1 epoch (about 200 seconds of training)
– Error rate about 1.2% using 100 epochs (hours of training)
http://andrew.gibiansky.com/blog/machine-learning/k-nearest-neighbors-simplest-machine-learning/
Overview of test_example_CNN.m
• Read the database
• Part 1: cnnsetup.m
– Layer 1: input layer (do nothing)
– Layer 2: convolution (conv.) layer, output maps = 6, kernel size = 5x5
– Layer 3: sub-sample (subs.) layer, scale = 2
– Layer 4: conv. layer, output maps = 12, kernel size = 5x5
– Layer 5: subs. layer (output layer), scale = 2
• Part 2: cnntrain.m % train weights using the 60,000 samples
– cnnff( ) % CNN feed forward
– cnnbp( ) % CNN feed backward, to train the weights in the kernels
– cnnapplygrads( ) % update the weights
• cnntest.m % test the system using the 10,000 samples and show the error rate
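As a sketch, the five layers above correspond to a definition of roughly this shape (field names follow the toolbox's test_example_CNN.m to the best of my reading; treat them as an assumption and check the file):

    cnn.layers = {
        struct('type', 'i')                                    % layer 1: input
        struct('type', 'c', 'outputmaps', 6,  'kernelsize', 5) % layer 2: convolution
        struct('type', 's', 'scale', 2)                        % layer 3: sub-sampling
        struct('type', 'c', 'outputmaps', 12, 'kernelsize', 5) % layer 4: convolution
        struct('type', 's', 'scale', 2)                        % layer 5: sub-sampling
    };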
Architecture
• Each output neuron corresponds to a character (0, 1, 2, ..., 9, etc.)
• Layer 1: one input image (I), 1x28x28
• Layer 1 to 2: 6 conv. maps (C), 5x5 kernels; InputMaps=6, OutputMaps=6, Fan_in = 5^2 = 25, Fan_out = 6x5^2 = 150; layer 2 is 6x24x24
• Layer 2 to 3: 6 sub-sample maps (S), 2x2 sub-sampling; InputMaps=6, OutputMaps=12; layer 3 is 6x12x12
• Layer 3 to 4: 12 conv. maps (C), 5x5 kernels; InputMaps=6, OutputMaps=12, Fan_in = 6x5^2 = 150, Fan_out = 12x5^2 = 300; layer 4 is 12x8x8
• Layer 4 to 5: 12 sub-sample maps (S), 2x2 sub-sampling; InputMaps=12, OutputMaps=12; layer 5 is 12x4x4
• Layer 5 to output: 10 output neurons
(I = input, C = Conv. = convolution, S = Subs = sub-sampling)
cnnff.m: convolution neural network feed forward
• This is the feed-forward part
• Assuming all the weights are initialized or calculated, we show how to get the output from the inputs.
Layer 1 to 2:
• Convolve layer 1 with different kernels (map_index = 1, 2, ..., 6) to produce 6 output maps
• Inputs:
– input layer 1, a 28x28 image
– 6 different kernels: k(1), ..., k(6), each 5x5; the kernels are the dendrites of the neurons
• Output: 6 output maps, each 24x24
• Algorithm:
– for map_index = 1:6, layer_2(map_index) = I * k(map_index), using 'valid' convolution
• Discussion:
– 'valid' means only fully overlapped areas are considered, so if layer 1 is 28x28 and each kernel is 5x5, each output map is 24x24
– In Matlab, use convn(I, k, 'valid'). For example, with I = rand(28,28) and k = rand(5,5), size(convn(I, k, 'valid')) returns 24 24
– A runnable sketch follows the figure note below
[Figure: the 28x28 input image (I) convolved (Conv.) with kernels K(1), ..., K(6), map_index = 1, 2, ..., 6, producing the 6 maps of layer 2 (C), 6x24x24; i and j index the pixels.]
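A minimal runnable sketch of this stage (random kernels for illustration; note that in the toolbox's cnnff.m each convolution map also passes through a sigmoid with a per-map bias, which is included here as an assumption):

    sigm = @(P) 1./(1 + exp(-P));             % logistic activation
    I = rand(28,28);                          % 28x28 input image
    b = zeros(6,1);                           % one bias per output map
    for map_index = 1:6
        k{map_index} = rand(5,5);             % 5x5 kernel for this map
        layer2{map_index} = sigm(convn(I, k{map_index}, 'valid') + b(map_index)); % 24x24
    end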
Layer 2 to 3:
[Figure: the 6 maps of layer 2 (C), 6x24x24, sub-sampled by 2x2 (Subs) to the 6 maps of layer 3 (S), 6x12x12; map_index = 1, 2, ..., 6.]
• Sub-sample layer 2 to layer 3
• Inputs:
– 6 maps of layer 2, each 24x24
• Output: 6 maps of layer 3, each 12x12
• Algorithm:
– for map_index = 1:6, calculate the average of each 2x2 pixel window of the input map and save the result in the output map (a sketch follows below)
– Hence the resolution is reduced from 24x24 to 12x12
• Discussion
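A minimal sketch of the 2x2 averaging, continuing the previous sketch (this mirrors how cnnff.m does it: a moving-average convolution, then keeping every second row and column):

    for map_index = 1:6
        z = convn(layer2{map_index}, ones(2,2)/4, 'valid'); % 2x2 moving average
        layer3{map_index} = z(1:2:end, 1:2:end);            % 24x24 -> 12x12
    end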
Layer 3 to 4:
[Figure: the 6 maps of layer 3 (S), 6x12x12, convolved with 5x5 kernels to give the 12 maps of layer 4 (C), 12x8x8; Fan_in = 6x5^2 = 150, Fan_out = 12x5^2 = 300.]
• Convolve layer 3 with kernels to produce layer 4
• Inputs:
– 6 maps of layer 3 (L3{i=1:6}), each 12x12
– Kernel set: 6x12 kernels in total, i.e. k{i=1:6}{j=1:12}, each k{i}{j} is 5x5
– 12 biases bias{j=1:12} in this layer, each a scalar
• Output: 12 maps of layer 4 (L4{j=1:12}), each 8x8
• Algorithm:

    for j = 1:12
        z = 0;                                       % clear z
        for i = 1:6
            z = z + convn(L3{i}, k{i}{j}, 'valid');  % z is 8x8
        end
        L4{j} = sigm(z + bias{j});                   % L4{j} is 8x8
    end
    function X = sigm(P)
        X = 1./(1 + exp(-P));
    end

• Discussion:
– Normalization?
(In the toolbox these maps are stored as net.layers{l}.a{j}.)
Layer 4 to 5:
[Figure: the 12 maps of layer 4 (C), 12x8x8, sub-sampled by 2x2 (Subs) to the 12 maps of layer 5 (S), 12x4x4, which feed the 10 output neurons.]
• Sub-sample layer 4 to layer 5
• Inputs:
– 12 maps of layer 4 (L4{i=1:12}), each 8x8
• Output: 12 maps of layer 5 (L5{j=1:12}), each 4x4
• Algorithm:
– Sub-sample each 2x2 pixel window in L4 to one pixel in L5 (as from layer 2 to 3)
• Discussion:
– Normalization?
Layer 5 to output
[Figure: the 12 maps of layer 5 (L5{j=1:12}), 12x4x4 = 192 pixels in total, fully connected to the 10 output neurons net.o{m=1:10}; each output neuron corresponds to a character (0, 1, 2, ..., 9, etc.).]
• Compute the 10 outputs from layer 5
• Inputs:
– 12 maps of layer 5 (L5{j=1:12}), each 4x4, so L5 has 192 pixels in total
– Output-layer weights net.ffW{m=1:10}{p=1:192}: 192 weights for each output neuron (the same structure for each output neuron)
• Output: 10 output neurons (net.o{m=1:10})
• Algorithm (a sketch follows below):
– for m = 1:10 (each output neuron): clear net.fv; net.fv = net.ffW{m} (all 192 weights) .* L5 (all corresponding 192 pixels); net.o{m} = sigm(sum(net.fv) + bias)
• Discussion
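A minimal sketch of this output stage using plain matrices (ffW as 10x192 and ffb as 10x1 are stand-ins for the toolbox's net.ffW and net.ffb; the layer-5 maps are random placeholders):

    sigm = @(P) 1./(1 + exp(-P));
    L5 = arrayfun(@(x) rand(4,4), 1:12, 'UniformOutput', false); % placeholder 4x4 maps
    fv = [];
    for j = 1:12
        fv = [fv; L5{j}(:)];                 % flatten the 12 maps into a 192x1 vector
    end
    ffW = rand(10,192);  ffb = rand(10,1);   % placeholder trained weights and biases
    o = sigm(ffW*fv + ffb)                   % 10x1 outputs; the largest gives the digit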
Part 2B
Back propagation part: cnnbp( ), cnnapplygrads( )
cnnbp( ) overview (output back to layer 5)
For the output layer, with $y$ the current output, $t$ the target, $x$ the input to the neuron, and $w$ the weight:

$\frac{\partial E}{\partial w} = (y - t)\, y\, (1 - y)\, x, \qquad \frac{\partial E}{\partial x} = (y - t)\, y\, (1 - y)\, w$

In the cnnbp.m code this appears as (net.o is the output; here the variable y holds the targets):

    net.e   = net.o - y;                        % error
    net.od  = net.e .* (net.o .* (1 - net.o));  % output delta, dE/du
    net.fvd = (net.ffW' * net.od);              % feature-vector delta, back to layer 5
Ref: See http://en.wikipedia.org/wiki/Backpropagation
Layer 5 to 4
• Expand each 1x1 delta to a 2x2 block (a sketch follows below)
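A minimal sketch of the expansion, assuming the sub-sampling was a 2x2 average (each pooled delta is shared equally over its 2x2 block, hence the division by 4):

    d5 = rand(4,4);                 % delta of one layer-5 map (illustrative values)
    d4 = kron(d5, ones(2,2)) / 4;   % expand 4x4 -> 8x8 delta for the layer-4 map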
Layer 4 to 3
• Rotated convolution ('full' convolution with the 180-degree-rotated kernels)
• Find dE/dx at layer 3 (a sketch follows below)
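A minimal sketch of that rotated convolution for one layer-3 map (k as in the layer 3 to 4 sketch; d4{j} name the per-map layer-4 deltas; the variable names are mine, not the toolbox's):

    rot180 = @(X) rot90(X, 2);          % rotate the kernel by 180 degrees
    i = 1;                              % example layer-3 map index
    d3_i = zeros(12,12);
    for j = 1:12
        d3_i = d3_i + convn(d4{j}, rot180(k{i}{j}), 'full'); % 8x8 'full' 5x5 -> 12x12
    end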
Layer 3 to 2
• Expand each 1x1 delta to a 2x2 block (as from layer 5 to 4)
Calculate gradients
• From layer 2 to layer 3
• From layer 3 to layer 4
• net.ffW and net.ffb are found
Details of calculating the gradients
• % part: reshape feature-vector deltas into output-map style
– L4 (c): run expand only
– L3 (s): run conv (rot180, 'full'), find d
– L2 (c): run expand only
• % part: calc gradients
– L2 (c): run conv ('valid'), find dk and db
– L3 (s): not run here
– L4 (c): run conv ('valid'), find dk and db
• Done; for the output layer L5 these are found:
– net.dffW = net.od * (net.fv)' / size(net.od, 2);
– net.dffb = mean(net.od, 2);
cnnapplygrads(net, opts)
• For the convolution layers, L2 and L4 (a sketch follows below):
– From k and dk, find the new k (weights)
– From b and db, find the new b (bias)
• For the output layer L5:
– net.ffW = net.ffW - opts.alpha * net.dffW;
– net.ffb = net.ffb - opts.alpha * net.dffb;
– opts.alpha adjusts the learning rate
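A minimal sketch of the convolution-layer update, mirroring the output-layer lines above (dk and db are the kernel and bias gradients found by cnnbp( ); the loop bounds match layer 4's 6x12 kernel set):

    for j = 1:12
        for i = 1:6
            k{i}{j} = k{i}{j} - opts.alpha * dk{i}{j};  % update each 5x5 kernel
        end
        bias{j} = bias{j} - opts.alpha * db{j};         % update each map's bias
    end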
Appendix
Architecture
[Figure: the same five-layer architecture as in Part 2A (input 1x28x28; 5x5 conv. to 6x24x24; 2x2 sub-sample to 6x12x12; 5x5 conv. to 12x8x8; 2x2 sub-sample to 12x4x4; 10 output neurons), here additionally annotated with map pixel indices (i, j) and kernel indices (u, v).]
A single neuron
• The neural net has many layers
• Between any two neighboring layers, a set of neurons can be found
[Figure: one neuron between layer $l$ and layer $l+1$, as in Part 1A.]

$x^{l+1} = f(u^l), \quad \text{with } u^l = W^l x^l + b^l$

where $x^l$ = inputs at layer $l$, $W^l$ = weights, and $x^{l+1}$ = inputs at layer $l+1$.
Derivation
• $\partial E/\partial W$ relates changes at layer $l+1$ to changes at layer $l$; the sensitivities propagate backward as

$\delta^l = \left(W^{l+1}\right)^T \delta^{l+1} \circ f'(u^l)$

• At the output layer $L$: since $y = f(Wx + b)$ and $\partial u/\partial b = 1$, we get $\partial E/\partial b = \delta$, and

$\delta^L = f'(u^L) \circ (y_n - t_n)$
References
• Wiki
– http://en.wikipedia.org/wiki/Convolutional_neural_network
– http://en.wikipedia.org/wiki/Backpropagation
• Matlab programs
– Neural Network for pattern recognition: Tutorial, http://www.mathworks.com/matlabcentral/fileexchange/19997-neural-network-for-pattern-recognition-tutorial
– CNN Matlab example, http://www.mathworks.com/matlabcentral/fileexchange/38310-deep-learning-toolbox