Introduction to Neural Networks (undergraduate course) Lecture 4 of 9


Page 1: Introduction to Neural Networks (undergraduate course) Lecture 4 of 9

Neural Networks

Dr. Randa Elanwar

Lecture 4

Page 2: Introduction to Neural Networks (undergraduate course) Lecture 4 of 9

Lecture Content

• Linearly separable functions: logical gate implementation

– Learning laws: Perceptron learning rule

– Pattern mode solution method

– Batch mode solution method


Page 3: Introduction to Neural Networks (undergraduate course) Lecture 4 of 9

Learning Linearly Separable Functions

• The initial network has randomly assigned weights.

• Learning is done by making small adjustments to the weights to reduce the difference between the observed and predicted values.

• The main difference from logical algorithms is the need to repeat the update phase several times in order to achieve convergence.

• The updating process is divided into epochs.

• Each epoch updates all the weights of the network.

• Note that the initial weights and the learning rate value determine the number of iterations needed for convergence.


Page 4: Introduction to Neural Networks (undergraduate course) Lecture 4 of 9

Perceptron learning rule

• Desired: the desired output for a given input

• Network calculates what it thinks the output should be

• Network changes its weights in proportion to the error between the desired & calculated results

• Δwi,j = η * [Desiredi - outputi] * inputj
  – where: η is the learning rate (a given constant); [Desiredi - outputi] is the error term; and inputj is the input activation

• wi,j = wi,j + Δwi,j (delta rule)

• Note: there are other learning rules/laws that will be discussed later

• Learning rate η: (1) used to control the amount of weight adjustment at each step of training, (2) ranges from 0 to 1, (3) determines the rate of learning at each time step
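A minimal Python sketch of this update rule may help; the function name, the example numbers, and the use of plain lists are illustrative choices, not from the lecture.

```python
# Delta rule for one output unit: delta_w = eta * (desired - output) * input
def perceptron_update(weights, inputs, desired, output, eta=0.1):
    """Return the weights after one delta-rule adjustment."""
    return [w + eta * (desired - output) * x for w, x in zip(weights, inputs)]

# Hypothetical step: the unit should have fired (desired = 1) but gave 0,
# so the weights on the active inputs are increased.
new_w = perceptron_update([0.2, -0.4], [1, 1], desired=1, output=0, eta=0.5)
print(new_w)  # approximately [0.7, 0.1]
```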


Page 5: Introduction to Neural Networks (undergraduate course) Lecture 4 of 9

Adjusting perceptron weights

• wi,j = wi,j + Δwi,j

• Δwi,j = η * [Desiredi - outputi] * inputj

• missi is (Desiredi - outputi)

• Adjust each wi,j based on inputj and missi

• If a set of <input, output> pairs are learnable (representable), the delta rule will find the necessary weights (when miss=0)

– in a finite number of steps

– independent of initial weights

Desired < 0, output > 0  →  Δw < 0

Desired = 0, output = 0  →  Δw = 0

Desired > 0, output < 0  →  Δw > 0
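These three cases can be checked with a few lines of code; the snippet assumes a positive input activation and η = 1, which are not stated on the slide.

```python
# Sign of the weight change delta_w = eta * (desired - output) * input
# for a positive input and eta = 1.
eta, x = 1, 1
for desired, output in [(-1, 1), (0, 0), (1, -1)]:
    delta_w = eta * (desired - output) * x
    print(desired, output, delta_w)  # desired < output -> delta_w < 0, etc.
```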


Page 6: Introduction to Neural Networks (undergraduate course) Lecture 4 of 9

Hypothetical example

• Suppose we have 2 glasses: the first is narrow and tall and has water in it; the second is wide and short with no water in it

• The target is to make both glasses contain the same volume of water

• Initially, we pour some water from the tall glass into the short one, then we measure the volumes

• If the volume in the short glass is less than in the tall one, we add more water

• If the volume in the short glass is more than in the tall one, we return some water

• And so on until both volumes are equal, and we are done

• The target = desired output, the water = weights, the difference measure = error


Page 7: Introduction to Neural Networks (undergraduate course) Lecture 4 of 9

Node biases

• A node’s output is a function of the weighted sum of its inputs

• What is a bias?

• How can we learn the bias value?

• Answer: treat them like just another weight


Page 8: Introduction to Neural Networks (undergraduate course) Lecture 4 of 9

Training biases (θ)

• A node’s output is:
  – 1 if w1x1 + w2x2 + … + wnxn >= θ
  – 0 otherwise

• Rewrite:
  – w1x1 + w2x2 + … + wnxn - θ >= 0
  – w1x1 + w2x2 + … + wnxn + θ(-1) >= 0

• Hence, the bias θ is just another weight whose input activation is always -1

• Just add one more input unit to the network topology

[Figure: network with one extra input unit fixed at -1 whose weight is the bias]
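A minimal sketch of this trick in Python; the weight and threshold values are assumed for illustration (they happen to implement AND), and the function name is mine.

```python
# Fold the threshold theta into the weights: append theta to the weight vector
# and a constant -1 to every input vector, so "sum >= theta" becomes "sum >= 0".
def fires(weights, theta, inputs):
    augmented_w = weights + [theta]   # the bias is just another weight
    augmented_x = inputs + [-1]       # its input activation is always -1
    total = sum(w * x for w, x in zip(augmented_w, augmented_x))
    return 1 if total >= 0 else 0

print(fires([1.0, 1.0], 1.5, [1, 1]))  # 1, since 1 + 1 - 1.5 >= 0
print(fires([1.0, 1.0], 1.5, [1, 0]))  # 0, since 1 - 1.5 < 0
```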


Page 9: Introduction to Neural Networks (undergraduate course) Lecture 4 of 9

Linearly Separable Functions

• When solving the logical AND problem we are searching for the straight-line equation separating the +ve (1) and -ve (0) output regions on the graph

• Different values for w1, w2, θ lead to different line slopes. We have more than one solution depending on: the initial weights W, the learning rate η, the activation function f, and the learning mode (pattern vs. batch)


[Figure: decision line w1I1 + w2I2 = θ separating the +ve (1) region from the -ve (0) region for logical AND]

Page 10: Introduction to Neural Networks (undergraduate course) Lecture 4 of 9

Linearly Separable Functions

• Similarly for the logical OR problem

• Different values for w1, w2, θ lead to different line slopes.

• We have more than one solution depending on: the initial weights W, the learning rate η, the activation function f, and the learning mode (pattern vs. batch)


[Figure: decision line w1I1 + w2I2 = θ separating the +ve (1) region from the -ve (0) region for logical OR]
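As a quick illustration of such separating lines, the snippet below tests one hand-picked (w1, w2, θ) triple for each gate; these particular values are assumed, not taken from the slides.

```python
# A point (x1, x2) is on the +ve side if w1*x1 + w2*x2 >= theta, otherwise -ve.
def gate(w1, w2, theta, x1, x2):
    return 1 if w1 * x1 + w2 * x2 >= theta else 0

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
print([gate(1, 1, 1.5, *p) for p in inputs])  # AND: [0, 0, 0, 1]
print([gate(1, 1, 0.5, *p) for p in inputs])  # OR:  [0, 1, 1, 1]
```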

Page 11: Introduction to Neural Networks (undergraduate course) Lecture 4 of 9

Linearly Separable Functions

• Example: logical AND, with initial weights w1 = 0.5, w2 = 0.3, bias = 0.5, and a binary step activation function with threshold t = 0.5. The learning rate η = 1


[Figure: single-layer perceptron with inputs x1, x2, weights w1 = 0.5 and w2 = 0.3, and output y]

y_in = x1w1 + x2w2

Activation function: binary step function with t = 0.5
φ(y_in) = 1 if y_in >= t, otherwise φ(y_in) = 0
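A small Python sketch of this setup; the slide's diagram shows y_in = x1w1 + x2w2, and the sketch additionally includes the bias weight of 0.5 on a constant -1 input, as used in the worked iterations on the following slides. Variable and function names are mine.

```python
# Initial network for the AND example: w1 = 0.5, w2 = 0.3, bias weight 0.5 (input -1),
# binary step activation with threshold t = 0.5.
def step(y_in, t=0.5):
    return 1 if y_in >= t else 0

def forward(x1, x2, w1=0.5, w2=0.3, b=0.5):
    y_in = x1 * w1 + x2 * w2 + b * (-1)  # bias treated as a weight on a fixed -1 input
    return step(y_in)

print([forward(x1, x2) for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]])
# [0, 0, 0, 0] -- the initial weights misclassify (1, 1), so learning is needed
```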

Page 12: Introduction to Neural Networks (undergraduate course) Lecture 4 of 9

Solving Linearly Separable Functions (Pattern mode)

• Given:

• Since we consider the bias as an additional weight, the weight vector is 1x3, so we have to extend the dimensionality of the input vectors x1, x2, x3, and x4 from 2x1 to 3x1 to perform the multiplication.


W(0) = [0.5  0.3  0.5]   (w1, w2, and the bias weight)

Output rule: Y = f(W.X + b), with the bias b folded into W as the third weight, so effectively Y = f(W.X) on the augmented inputs

Input patterns (columns):
X = [x1 x2 x3 x4] = [0  0  1  1]
                    [0  1  0  1]

Truth table (logical AND):
x1 x2 | y
 0  0 | 0
 0  1 | 0
 1  0 | 0
 1  1 | 1

Augmented input patterns (a bias input of -1 appended to each column):
X = [ 0   0   1   1]
    [ 0   1   0   1]
    [-1  -1  -1  -1]
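The same setup in NumPy, as a sketch; the array names are mine, and the printed values are what the augmented product gives for the four patterns (they match the first iteration on the next slide).

```python
import numpy as np

# Bias folded into W as a third weight; inputs get a constant -1 third row.
W0 = np.array([0.5, 0.3, 0.5])        # [w1, w2, bias weight]
X = np.array([[0, 0, 1, 1],           # x1 of the four patterns
              [0, 1, 0, 1],           # x2 of the four patterns
              [-1, -1, -1, -1]])      # bias input, always -1
d = np.array([0, 0, 0, 1])            # desired AND outputs

y_in = W0 @ X                         # one weighted sum per pattern (1x3 times 3x4)
print(y_in)                           # approximately [-0.5 -0.2  0.   0.3]
print((y_in >= 0.5).astype(int))      # [0 0 0 0]: only the last pattern is misclassified
```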

Page 13: Introduction to Neural Networks (undergraduate course) Lecture 4 of 9

Solving Linearly Separable Functions (Pattern mode)

• Update weight vector for iteration 1


(Each weighted sum below is passed through the step activation: y = 1 if the sum >= t = 0.5, otherwise y = 0.)

W(0).X1 = [0.5  0.3  0.5].[0  0  -1]^T = -0.5  →  y = 0   OK
W(0).X2 = [0.5  0.3  0.5].[0  1  -1]^T = -0.2  →  y = 0   OK
W(0).X3 = [0.5  0.3  0.5].[1  0  -1]^T =  0    →  y = 0   OK
W(0).X4 = [0.5  0.3  0.5].[1  1  -1]^T =  0.3  →  y = 0   Wrong (desired y = 1)

W(1) = W(0) + η(y_des - y).X4^T = [0.5  0.3  0.5] + (1)(1 - 0)[1  1  -1] = [1.5  1.3  -0.5]
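This first update can be reproduced in a few lines (a sketch; η = 1 and the -1 bias input follow the conventions above).

```python
import numpy as np

W0 = np.array([0.5, 0.3, 0.5])
X4 = np.array([1, 1, -1])        # the only misclassified pattern in this pass
eta, desired, y = 1.0, 1, 0

W1 = W0 + eta * (desired - y) * X4
print(W1)                        # approximately [1.5  1.3  -0.5]
```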

Page 14: Introduction to Neural Networks (undergraduate course) Lecture 4 of 9

Solving Linearly Separable Functions (Pattern mode)

• Update weight vector for iteration 2

• Update weight vector for iteration 3


W(1).X1 = [1.5  1.3  -0.5].[0  0  -1]^T = 0.5  →  y = 1   Wrong (desired y = 0)

W(2) = W(1) + η(y_des - y).X1^T = [1.5  1.3  0.5]

W(2).X2 = [1.5  1.3  0.5].[0  1  -1]^T = 0.8  →  y = 1   Wrong (desired y = 0)

W(3) = W(2) + η(y_des - y).X2^T = [1.5  0.3  1.5]

Page 15: Introduction to Neural Networks (undergraduate course) Lecture 4 of 9

Solving Linearly Separable Functions (Pattern mode)

• Update weight vector for iteration 4

• Update weight vector for iteration 5


W(3).X3 = [1.5  0.3  1.5].[1  0  -1]^T = 0    →  y = 0   OK
W(3).X4 = [1.5  0.3  1.5].[1  1  -1]^T = 0.3  →  y = 0   Wrong (desired y = 1)

W(4) = W(3) + η(y_des - y).X4^T = [2.5  1.3  0.5]

W(4).X1 = [2.5  1.3  0.5].[0  0  -1]^T = -0.5  →  y = 0   OK
W(4).X2 = [2.5  1.3  0.5].[0  1  -1]^T =  0.8  →  y = 1   Wrong (desired y = 0)

W(5) = W(4) + η(y_des - y).X2^T = [2.5  0.3  1.5]

Page 16: Introduction to Neural Networks (undergraduate course) Lecture 4 of 9

Solving Linearly Separable Functions (Pattern mode)

• Update weight vector for iteration 6

• Update weight vector for iteration 7


W(5).X3 = [2.5  0.3  1.5].[1  0  -1]^T = 1  →  y = 1   Wrong (desired y = 0)

W(6) = W(5) + η(y_des - y).X3^T = [1.5  0.3  2.5]

W(6).X4 = [1.5  0.3  2.5].[1  1  -1]^T = -0.7  →  y = 0   Wrong (desired y = 1)

W(7) = W(6) + η(y_des - y).X4^T = [2.5  1.3  1.5]

W(7).X1 = [2.5  1.3  1.5].[0  0  -1]^T = -1.5  →  y = 0   OK
W(7).X2 = [2.5  1.3  1.5].[0  1  -1]^T = -0.2  →  y = 0   OK
W(7).X3 = [2.5  1.3  1.5].[1  0  -1]^T =  1    →  y = 1   Wrong (desired y = 0)

Page 17: Introduction to Neural Networks (undergraduate course) Lecture 4 of 9

Solving Linearly Separable Functions (Pattern mode)

• Update weight vector for iteration 8

• Update weight vector for iteration 9

• Update weight vector for iteration 10


W(8) = W(7) + η(y_des - y).X3^T = [1.5  1.3  2.5]

W(8).X4 = [1.5  1.3  2.5].[1  1  -1]^T = 0.3  →  y = 0   Wrong (desired y = 1)

W(9) = W(8) + η(y_des - y).X4^T = [2.5  2.3  1.5]

W(9).X1 = [2.5  2.3  1.5].[0  0  -1]^T = -1.5  →  y = 0   OK
W(9).X2 = [2.5  2.3  1.5].[0  1  -1]^T =  0.8  →  y = 1   Wrong (desired y = 0)

W(10) = W(9) + η(y_des - y).X2^T = [2.5  1.3  2.5]

Page 18: Introduction to Neural Networks (undergraduate course) Lecture 4 of 9

Solving Linearly Separable Functions (Pattern mode)

• Weight learning has converged after 10 iterations


W(10).X3 = [2.5  1.3  2.5].[1  0  -1]^T =  0    →  y = 0   OK
W(10).X4 = [2.5  1.3  2.5].[1  1  -1]^T =  1.3  →  y = 1   OK
W(10).X1 = [2.5  1.3  2.5].[0  0  -1]^T = -2.5  →  y = 0   OK
W(10).X2 = [2.5  1.3  2.5].[0  1  -1]^T = -1.2  →  y = 0   OK

All four patterns are now classified correctly, so the weights stop changing.
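The whole pattern-mode run can be reproduced with a short loop; this is a sketch under the conventions used above (bias input -1, step threshold 0.5, η = 1), and the loop structure and names are mine rather than the lecture's.

```python
import numpy as np

def step(y_in, t=0.5):
    return 1 if y_in >= t else 0

# Augmented patterns [x1, x2, -1] as rows, and the desired AND outputs
X = np.array([[0, 0, -1], [0, 1, -1], [1, 0, -1], [1, 1, -1]], dtype=float)
d = np.array([0, 0, 0, 1], dtype=float)

W = np.array([0.5, 0.3, 0.5])  # [w1, w2, bias weight]
eta = 1.0

# Pattern mode: update W immediately after every misclassified pattern
converged = False
while not converged:
    converged = True
    for x, desired in zip(X, d):
        y = step(W @ x)
        if y != desired:
            W = W + eta * (desired - y) * x  # delta rule
            converged = False

print(W)  # approximately [2.5  1.3  2.5], the weights reached on this slide
```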

Page 19: Introduction to Neural Networks (undergraduate course) Lecture 4 of 9

Solving Linearly Separable Functions (Batch mode)

• Update weight vector for iteration 1

• Add Δw for all misclassified inputs together in one step


W(0).X1 = [0.5  0.3  0.5].[0  0  -1]^T = -0.5  →  y = 0   OK
W(0).X2 = [0.5  0.3  0.5].[0  1  -1]^T = -0.2  →  y = 0   OK
W(0).X3 = [0.5  0.3  0.5].[1  0  -1]^T =  0    →  y = 0   OK
W(0).X4 = [0.5  0.3  0.5].[1  1  -1]^T =  0.3  →  y = 0   Wrong (desired y = 1)

W(1) = W(0) + η(y_des - y).X4^T = [1.5  1.3  -0.5]

Page 20: Introduction to Neural Networks (undergraduate course) Lecture 4 of 9

Solving Linearly Separable Functions (Batch mode)

• Update weight vector for iteration 2

• Add Δw for all misclassified inputs together in one step


W(1).X1 = [1.5  1.3  -0.5].[0  0  -1]^T = 0.5  →  y = 1   Wrong (desired y = 0)
W(1).X2 = [1.5  1.3  -0.5].[0  1  -1]^T = 1.8  →  y = 1   Wrong (desired y = 0)
W(1).X3 = [1.5  1.3  -0.5].[1  0  -1]^T = 2    →  y = 1   Wrong (desired y = 0)
W(1).X4 = [1.5  1.3  -0.5].[1  1  -1]^T = 3.3  →  y = 1   OK

W(2) = W(1) + η(y_des - y).X1^T + η(y_des - y).X2^T + η(y_des - y).X3^T
     = [1.5  1.3  -0.5] - [0  0  -1] - [0  1  -1] - [1  0  -1]
     = [0.5  0.3  2.5]

Page 21: Introduction to Neural Networks (undergraduate course) Lecture 4 of 9

Solving Linearly Separable Functions (Batch mode)

• Update weight vector for iteration 3

• Add Δw for all misclassified inputs together in one step


W(2).X1 = [0.5  0.3  2.5].[0  0  -1]^T = -2.5  →  y = 0   OK
W(2).X2 = [0.5  0.3  2.5].[0  1  -1]^T = -2.2  →  y = 0   OK
W(2).X3 = [0.5  0.3  2.5].[1  0  -1]^T = -2    →  y = 0   OK
W(2).X4 = [0.5  0.3  2.5].[1  1  -1]^T = -1.7  →  y = 0   Wrong (desired y = 1)

W(3) = W(2) + η(y_des - y).X4^T = [1.5  1.3  1.5]

Page 22: Introduction to Neural Networks (undergraduate course) Lecture 4 of 9

Solving Linearly Separable Functions (Batch mode)

• Note that:

• The number of iterations in the batch-mode solution is sometimes smaller than in the pattern-mode solution.

• The final weights obtained by the batch-mode solution are different from those obtained by the pattern-mode solution.


W(3).X1 = [1.5  1.3  1.5].[0  0  -1]^T = -1.5  →  y = 0   OK
W(3).X2 = [1.5  1.3  1.5].[0  1  -1]^T = -0.2  →  y = 0   OK
W(3).X3 = [1.5  1.3  1.5].[1  0  -1]^T =  0    →  y = 0   OK
W(3).X4 = [1.5  1.3  1.5].[1  1  -1]^T =  1.3  →  y = 1   OK

All four patterns are classified correctly after 3 batch updates, so learning has converged.
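For comparison, a sketch of the batch-mode procedure under the same assumed conventions (bias input -1, step threshold 0.5, η = 1); here the corrections for all misclassified patterns in a sweep are summed and applied in one step.

```python
import numpy as np

def step(y_in, t=0.5):
    return 1 if y_in >= t else 0

X = np.array([[0, 0, -1], [0, 1, -1], [1, 0, -1], [1, 1, -1]], dtype=float)
d = np.array([0, 0, 0, 1], dtype=float)
W = np.array([0.5, 0.3, 0.5])
eta = 1.0

updates = 0
while True:
    y = np.array([step(W @ x) for x in X])   # classify all patterns with the same W
    if np.array_equal(y, d):
        break
    W = W + eta * ((d - y) @ X)              # summed correction (correct patterns contribute 0)
    updates += 1

print(updates, W)  # 3 updates, weights approximately [1.5  1.3  1.5]
```

With this data the batch run stops after 3 weight updates at [1.5  1.3  1.5], while the pattern-mode run above needed 10 updates and ended at [2.5  1.3  2.5], which is the comparison made in the note on this slide.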