ANN Combined
-
Artificial Neural Networks - Lec 1 & 2
Dr. Aditya Abhyankar
-
Major Topics to be Covered
• ANN basics, neurons, learning algorithms
• Perceptron learning and pattern classification
• Multi-Layer Perceptron (MLP), back-propagation learning, and applications
• Pattern classification, Support Vector Machine (SVM)
• Clustering, Self-Organizing Map
• Radial Basis Network
• Time series prediction, system identification, expert systems
• Fuzzy Set Theory and Fuzzy Logic Control
• Genetic Algorithm and Evolutionary Computing
• Learning Vector Quantization
• Mixture of Experts network
• Recurrent network
-
Class Philosophy
• Questions starting with WHY!!
• No formalities
• No attendance gimmicks
• More inquisitive
-
Applications
• General models of ANN applications:
  - Pattern classification
  - Control, time series modeling, estimation
  - Optimization
• Real-world application examples
-
Applications
• Many memoryless ANN paradigms (e.g., MLP) are modeled mathematically as a nonlinear mapping between the inputs (feature vectors) and the outputs.
  - Discrete output values: classification problem
  - Continuous output values: approximation problem
• ANNs with feedback can be used to model dynamic systems
-
Pattern Classification Applications
• Speech recognition and speech synthesis
• Classification of radar/sonar signals
• Remote sensing and image classification
• Handwritten character/digit recognition
• ECG/EEG/EMG filtering/classification
• Credit card application screening
• Data mining, information retrieval
-
Control, Time Series, Estimation
• Machine control / robot manipulation
• Financial / scientific / engineering time series forecasting
• Inverse modeling of the vocal tract
-
Optimization
• Traveling salesperson
• Multiprocessor scheduling and task assignment
• VLSI placement and routing
-
Real World Applications
• S&P 500 index prediction
• Real estate appraisal
• Credit scoring
• Geochemical modeling
• Hospital patient stay length prediction
• Breast cancer cell image classification
• Jury summoning prediction
• Precision direct mailing
• Natural gas price prediction
-
ANN !!!
• An artificial neural network (ANN) is a massively parallel distributed computing system (algorithm, device, or other) that has a natural propensity for storing experiential knowledge and making it available for use.
• It resembles the brain in two aspects:
  1) Knowledge is acquired by the network through a learning process.
  2) Interneuron connection strengths, known as synaptic weights, are used to store the knowledge.
Aleksander & Morton (1990), Haykin (1994)
-
Biological System
-
Biological System
-
ANN Assumptions
• Information processing happens at many simple elements called neurons
• Signals are passed between neurons over connection links
• Each connection link has an associated weight
• Each neuron applies an activation function
-
ANN Characteristics
• The pattern of connections between neurons is called the architecture
• The method of determining the weights on the connections is called the training algorithm
• The mathematical model for assigning the output is called the activation function
-
Neuron Model
• McCulloch-Pitts (simplistic) neuron model
• The network function of a neuron is a weighted sum of its input signals plus a bias term.
-
Neuron Model
• The net function is a linear or nonlinear mapping from the input data space to an intermediate feature space
• The most common form is a hyper-plane
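Both slides above reduce to one computation: a hyper-plane net function followed by an activation. A minimal sketch (Python; the function names and the 0/1 step convention are illustrative assumptions, not taken from the slides):

```python
def net_function(x, w, b):
    """Hyper-plane net function: weighted sum of the inputs plus a bias term."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def step(u, threshold=0.0):
    """Hard-limit activation: output 1 when the net value exceeds the threshold."""
    return 1 if u > threshold else 0

def neuron(x, w, b):
    """McCulloch-Pitts-style neuron: net function followed by the activation."""
    return step(net_function(x, w, b))

# A neuron with weights (1, 1) and bias -1.5 computes logical AND:
print([neuron((x1, x2), (1, 1), -1.5) for (x1, x2) in [(0, 0), (0, 1), (1, 0), (1, 1)]])
# -> [0, 0, 0, 1]
```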
-
Other Net Forms
• Higher-order net function: the net function is a linear combination of higher-order polynomial terms. For example, a 2nd-order net function has the form
  u_i = Σ_{j,k=1..N} w_ijk y_j y_k + θ_i
• Delta (sigma-pi) net function: instead of a summation, the product of all weighted synaptic inputs is computed:
  u_i = Π_{j=0..N} w_ij y_j
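A sketch of the two alternative net forms, assuming the formulas as reconstructed above (Python; names are illustrative):

```python
from math import prod

def second_order_net(y, W2, bias=0.0):
    """Higher-order net: u = sum over j,k of w[j][k] * y[j] * y[k] + bias."""
    n = len(y)
    return sum(W2[j][k] * y[j] * y[k] for j in range(n) for k in range(n)) + bias

def sigma_pi_net(y, w):
    """Delta (sigma-pi) net: the product of all weighted synaptic inputs."""
    return prod(wj * yj for wj, yj in zip(w, y))
```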
-
Neuron Activation function
-
Neuron Activation function
-
Neuron Activation function
-
ANN Configuration
• Uni-directional communication links are represented by directed arcs; the ANN structure can thus be described by a directed graph.
• Fully connected: a cyclic graph with feedback. There are N×N connections for N neurons.
-
ANN Configuration
• Feed-forward, layered connection: an acyclic directed graph, with no loop or cycle.
-
ANN Configuration
-
Feed-back Dynamic System
• Without delay, feedback causes a causality problem: an unknown variable depends on an unknown variable!
  a2 = g(a1) = g(g(a2)) = ...
• To break the cycle, at least one delay element must be inserted into the feedback loop.
• This effectively creates a nonlinear dynamic system (a sequential machine).
-
Models of Neuron
• McCulloch-Pitts Model (MP)
• Rosenblatt's Perceptron Model
• Adaline Model
-
MP Model
• McCulloch-Pitts (simplistic) neuron model
• The network function of a neuron is a weighted sum of its input signals plus a bias term.
-
MP Model Limitations
• Weights fixed
• Incapable of learning
• Original model allows ONLY:
  - Binary output steps
  - Operations at discrete time steps
-
Perceptron
-
Perceptron
  x = Σ_{i=1}^{M} a_i w_i    (activation)
  s = f(x)                   (output)
  δ = b - s                  (error)
  Δw_i = η δ a_i             (weight change, η = learning rate)
-
Perceptron - Advantages
• The perceptron learning law gives a step-by-step process for adjusting the weights
• Perceptron convergence theorem
-
Widrow's Adaline
-
Adaline (ADAptive LINear Element)
  x = Σ_{i=1}^{M} a_i w_i    (activation)
  s = f(x) = x               (output)
  δ = b - s = b - x          (error)
  Δw_i = η δ a_i             (weight change, η = learning rate)
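One Adaline update implementing the four quantities above (Python; the learning rate η and its value are conventional assumptions, not shown on the slide):

```python
def adaline_step(a, w, b_target, eta=0.1):
    """One LMS update: linear output, error against the target b, weight change."""
    x = sum(ai * wi for ai, wi in zip(a, w))   # activation: weighted sum
    s = x                                      # output: s = f(x) = x (linear)
    delta = b_target - s                       # error
    return [wi + eta * delta * ai for wi, ai in zip(w, a)]   # updated weights
```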
-
Widrow's Adaline
• The analog activation value x is compared with the target output b
  OR
• The output is a linear function of x
• LMS learning law
• Gradient descent algorithm
-
Heat and Cold Example
-
Hebb Rule
-
Heat and Cold Example
-
Artificial Neural Networks - Lec 3 & 4
Dr. Aditya Abhyankar
-
ANN !!!
• An artificial neural network (ANN) is a massively parallel distributed computing system (algorithm, device, or other) that has a natural propensity for storing experiential knowledge and making it available for use.
• It resembles the brain in two aspects:
  1) Knowledge is acquired by the network through a learning process.
  2) Interneuron connection strengths, known as synaptic weights, are used to store the knowledge.
Aleksander & Morton (1990), Haykin (1994)
-
Biological System
-
Biological System
-
Features - Biological NN
• Robustness and fault tolerance
• Flexibility: on-the-fly learning, adjustment of weights
• Adaptability: ability to deal with a variety of data situations (fuzzy, probabilistic, noisy, etc.)
• Efficiency: parallel and distributed computing
-
Neuron Model
• McCulloch-Pitts (simplistic) neuron model
• The network function of a neuron is a weighted sum of its input signals plus a bias term.
-
Performance Comparison

Parameter          BNN                                         ANN
Speed              Slow (a few ms per execution)               Fast (a few ns per execution)
Processing         Massively parallel                          Mostly sequential
Size & complexity  ~10^11 neurons, ~10^15 interconnections,    Difficult to perform complex
                   with contributions from dendrites           pattern-recognition tasks
                   and synapses
-
Performance Comparison

Parameter        BNN                                          ANN
Storage          Adaptable (strengths of interconnections)    Strictly replaceable (memory mapping)
Fault tolerance  Good FT, distributed information             Poor FT, corrupted memories yield
                                                              unretrievable data
-
ANN Terminology
• An ANN is a highly simplified model of a BNN
• An ANN has interconnected processing units
• The summing part receives N inputs, weights each value, and computes the weighted sum
• Weighted sum → activation value
• Positive weight → excitatory input
• Negative weight → inhibitory input
-
Neuron Model
• The net function is a linear or nonlinear mapping from the input data space to an intermediate feature space
• The most common form is a hyper-plane
-
Other Things We Saw
• Other net forms
• Various activation functions
• ANN configurations
• Dynamic systems
• ANN assumptions
• ANN characteristics
-
Models of Neuron - 1 Layer
• McCulloch-Pitts Model (MP)
• Rosenblatt's Perceptron Model
• Adaline Model
-
MP Model
• McCulloch-Pitts (simplistic) neuron model
[Diagram: inputs x1 ... xj ... xN with weights w_i1 ... w_ij ... w_iN and bias b_i produce the net value y_in, which the activation maps to the output y.]
• The network function of a neuron is a weighted sum of its input signals plus a bias term.
-
MP Model
• The net function is a linear or nonlinear mapping from the input data space to an intermediate feature space
• The most common form is a hyper-plane:
  y_in,i = Σ_{j=1}^{N} w_ij x_j + b_i = w_i^T x + b_i
  y = f(y_in,i)
-
MP Model Limitations
• Weights fixed
• Incapable of learning
• Original model allows ONLY:
  - Binary output steps
  - Operations at discrete time steps
-
Perceptron
-
Perceptron
  x = Σ_{i=1}^{M} a_i w_i    (activation)
  s = f(x)                   (output)
  δ = b - s                  (error)
  Δw_i = η δ a_i             (weight change, η = learning rate)
-
Perceptron - Advantages
• The perceptron learning law gives a step-by-step process for adjusting the weights
• Perceptron convergence theorem
-
Widrow's Adaline
-
Adaline (ADAptive LINear Element)
  x = Σ_{i=1}^{M} a_i w_i    (activation)
  s = f(x) = x               (output)
  δ = b - s = b - x          (error)
  Δw_i = η δ a_i             (weight change, η = learning rate)
-
Widrow's Adaline
• The analog activation value x is compared with the target output b
  OR
• The output is a linear function of x
• LMS learning law
• Gradient descent algorithm
-
AND Example using MP model

  x1  x2 | y
   0   0 | 0
   0   1 | 0
   1   0 | 0
   1   1 | 1

  w1 = ?, w2 = ?, TH = ?
-
AND Example using MP model

  x1  x2 | y
   0   0 | 0
   0   1 | 0
   1   0 | 0
   1   1 | 1

  w1 = 1, w2 = 1, TH = 2
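A quick check of the solution on this slide (w1 = w2 = 1, TH = 2), assuming the usual MP convention that the unit fires when the weighted sum reaches the threshold:

```python
def mp_neuron(inputs, weights, threshold):
    """McCulloch-Pitts unit: fires iff the weighted input sum reaches the threshold."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, mp_neuron((x1, x2), (1, 1), 2))   # reproduces the AND truth table
```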
-
AND Example using MP model
• The solution w1 = 1, w2 = 1, TH = 2 gives the equation of a line
• Why is one neuron sufficient???

  x1  x2 | y
   0   0 | 0
   0   1 | 0
   1   0 | 0
   1   1 | 1
-
AND Example using MP model
[Plot: the four AND patterns in the (x1, x2) plane; a single straight line separates the lone class-1 point (1, 1) from the others.]

  x1  x2 | y
   0   0 | 0
   0   1 | 0
   1   0 | 0
   1   1 | 1
-
OR Example using MP model
[Plot: the four OR patterns in the (x1, x2) plane; a single line separates (0, 0) from the three class-1 points.]

  x1  x2 | y
   0   0 | 0
   0   1 | 1
   1   0 | 1
   1   1 | 1
-
OR Example using MP model

  x1  x2 | y
   0   0 | 0
   0   1 | 1
   1   0 | 1
   1   1 | 1

  w1 = 2, w2 = 2, TH = 2
-
AND-NOT Example using MP model

  x1  x2 | y
   0   0 | 0
   0   1 | 0
   1   0 | 1
   1   1 | 0

  w1 = 2, w2 = -1, TH = 2
-
X-OR Example using MP model

  x1  x2 | y
   0   0 | 0
   0   1 | 1
   1   0 | 1
   1   1 | 0

  w1 = ?, w2 = ?, TH = ?
-
X-OR Example using MP model
[Plot: the four XOR patterns in the (x1, x2) plane; no single line separates the two classes.]
NOT linearly separable!!!

  x1  x2 | y
   0   0 | 0
   0   1 | 1
   1   0 | 1
   1   1 | 0
-
X-OR Example using MP model

  x1  x2 | y
   0   0 | 0
   0   1 | 1
   1   0 | 1
   1   1 | 0

[Network: hidden unit z1 computes x1 AND NOT x2 with weights w11 = 2, w21 = -1; hidden unit z2 computes x2 AND NOT x1 with weights w12 = -1, w22 = 2; the output unit computes y = z1 OR z2 with weights w31 = 2, w32 = 2; every unit has threshold TH = 2.]

  x1 XOR x2 = { x1 AND NOT x2 } OR { x2 AND NOT x1 }
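The same decomposition, wired up as a two-layer MP network with the slide's weights (Python; reuses the "fires at or above threshold" convention from the AND example):

```python
def mp_neuron(inputs, weights, threshold=2):
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0

def xor(x1, x2):
    """XOR as on the slide: two AND-NOT hidden units feeding an OR output unit."""
    z1 = mp_neuron((x1, x2), (2, -1))    # z1 = x1 AND NOT x2
    z2 = mp_neuron((x1, x2), (-1, 2))    # z2 = x2 AND NOT x1
    return mp_neuron((z1, z2), (2, 2))   # y  = z1 OR z2

print([xor(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # -> [0, 1, 1, 0]
```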
-
Heat and Cold Example
[Diagram: actual inputs x1 (hot) and x2 (cold) map to perceived outputs y1 (hot) and y2 (cold).]
-
Heat and Cold Example
  y2(t) = x2(t - 1) AND x2(t - 2)
  y1(t) = { x1(t - 1) } OR { x2(t - 3) AND NOT x2(t - 2) }
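A tiny simulation of these temporal logic equations (Python; the equations themselves are reconstructed from a garbled slide, so treat this as a sketch):

```python
def heat_cold(x1_seq, x2_seq):
    """Print perceived (hot, cold) outputs over time from the slide's equations."""
    T = len(x1_seq)
    x1 = lambda t: x1_seq[t] if 0 <= t < T else 0
    x2 = lambda t: x2_seq[t] if 0 <= t < T else 0
    for t in range(T):
        y2 = x2(t - 1) and x2(t - 2)                      # cold perceived
        y1 = x1(t - 1) or (x2(t - 3) and not x2(t - 2))   # hot perceived
        print(t, int(bool(y1)), int(bool(y2)))

# Case study 1: cold applied for one step only -> 'hot' is perceived at t = 3
heat_cold([0, 0, 0, 0], [1, 0, 0, 0])
```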
-
Heat and Cold Example
[Network diagram: inputs x1 (hot) and x2 (cold) feed the perceived outputs y1 (hot) and y2 (cold) through two auxiliary units N1 and N2, with excitatory weights of 2 and 1 and an inhibitory weight of -1.]
-
Heat and Cold Example
Case Study 1: cold stimulus applied for a short duration
[Diagram sequence, t = 0 through t = 3: the cold input is on for a single step (x2 = 1 at t = 0, then 0). The activity propagates through the auxiliary units, and at t = 3 the hot perception unit fires (y1 = 1): a briefly applied cold stimulus is perceived as heat.]
-
Heat and Cold Example
Case Study 2: hot stimulus applied for one time step
[Diagram sequence, t = 0 to t = 1: with x1 = 1 at t = 0, the hot perception unit fires at t = 1 (y1 = 1).]
-
Heat and Cold Example
Case Study 3: cold stimulus applied for a longer duration
[Diagram sequence, t = 0 through t = 2: with x2 held at 1, the cold perception unit fires at t = 2 (y2 = 1), since y2(t) = x2(t - 1) AND x2(t - 2) requires two consecutive cold inputs.]
-
MP Model
• McCulloch-Pitts (simplistic) neuron model
[Diagram: inputs x1 ... xj ... xN with weights w_i1 ... w_ij ... w_iN and bias b_i produce the net value y_in, which the activation maps to the output y.]
• The network function of a neuron is a weighted sum of its input signals plus a bias term.
-
MP Model
• The net function is a linear or nonlinear mapping from the input data space to an intermediate feature space
• The most common form is a hyper-plane:
  y_in,i = Σ_{j=1}^{N} w_ij x_j + b_i = w_i^T x + b_i
  y = f(y_in,i)
-
Significance of Bias
[Two neuron diagrams: the same unit with and without a bias input; inputs x1 ... xN, weights w_i1 ... w_iN, net value y_in, output y.]
With a bias, the decision boundary is
  b + x1 w1 + x2 w2 = 0   →   x2 = -(w1/w2) x1 - b/w2
With a fixed threshold θ instead, it is
  x1 w1 + x2 w2 = θ   →   x2 = -(w1/w2) x1 + θ/w2
Either way, the bias/threshold shifts the separating line away from the origin.
-
MP Model Limitations
• Weights fixed
• Incapable of learning
• Original model allows ONLY:
  - Binary output steps
  - Operations at discrete time steps
-
Hebb Rule (Hebb Net)
  Step 0: Initialize all weights to zero: w_i = 0, i = 1, ..., n
  Step 1: For all input training vector and target output pairs, do steps 2-4
  Step 2: Set activations for the inputs: x_i = s_i
  Step 3: Set activation for the output: y = t
  Step 4: Adjust the weights and bias:
    w_i(new) = w_i(old) + x_i y
    b(new) = b(old) + y
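A direct transcription of these steps (Python); running it on bipolar AND shows the kind of weight trace tabulated in the X-OR slides of the later lectures:

```python
def hebb_train(samples, n_inputs):
    """Hebb net: w_i(new) = w_i(old) + x_i*y, b(new) = b(old) + y, one pass."""
    w, b = [0.0] * n_inputs, 0.0
    for s, t in samples:            # step 1: each training pair
        x, y = s, t                 # steps 2-3: set activations
        w = [wi + xi * y for wi, xi in zip(w, x)]   # step 4: adjust weights
        b += y                      #          ... and the bias
    return w, b

# Bipolar AND: target +1 only for (1, 1)
and_samples = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
print(hebb_train(and_samples, 2))   # -> ([2.0, 2.0], -2.0)
```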
-
Artificial Neural Networks - Lec 5 & 6
Dr. Aditya Abhyankar
-
Recap
• ANN: definition
• Resemblance with BNN
• BNN vs ANN comparison
• ANN terminology
• Neuron models
• Learning!!
-
ANN Terminology
• An ANN is a highly simplified model of a BNN
• An ANN has interconnected processing units
• The summing part receives N inputs, weights each value, and computes the weighted sum
• Weighted sum → activation value
• Positive weight → excitatory input
• Negative weight → inhibitory input
-
Neuron Model
• McCulloch-Pitts (simplistic) neuron model
• The network function of a neuron is a weighted sum of its input signals plus a bias term.
-
Neuron Model
• The net function is a linear or nonlinear mapping from the input data space to an intermediate feature space
• The most common form is a hyper-plane
-
Models of Neuron - 1 Layer
• McCulloch-Pitts Model (MP)
• Rosenblatt's Perceptron Model
• Adaline Model
-
MP Model
• McCulloch-Pitts (simplistic) neuron model
[Diagram: inputs x1 ... xj ... xN with weights w_i1 ... w_ij ... w_iN and bias b_i produce the net value y_in, which the activation maps to the output y.]
• The network function of a neuron is a weighted sum of its input signals plus a bias term.
-
MP Model Limitations
• Weights fixed
• Incapable of learning
• Original model allows ONLY:
  - Binary output steps
  - Operations at discrete time steps
-
Perceptron
[Diagram: association units A1 ... AM produce activations a1 ... aM, which are weighted by w1 ... wM and summed to x; the output unit emits s = f(x).]
  Sensory unit → Association unit → Summing unit → Output unit
-
Perceptron
  x = Σ_{i=1}^{M} a_i w_i    (activation)
  s = f(x)                   (output)
  δ = b - s                  (error)
  Δw_i = η δ a_i             (weight change, η = learning rate)
-
Perceptron - Advantages
• The perceptron learning law gives a step-by-step process for adjusting the weights
• Perceptron convergence theorem
-
Widrow's Adaline
[Diagram: association units A1 ... AM produce activations a1 ... aM, weighted by w1 ... wM and summed to x; the output unit emits s = f(x) = x.]
  Sensory unit → Association unit → Summing unit → Output unit
-
Adaline (ADAptive LINear Element)
  x = Σ_{i=1}^{M} a_i w_i    (activation)
  s = f(x) = x               (output)
  δ = b - s = b - x          (error)
  Δw_i = η δ a_i             (weight change, η = learning rate)
-
Widrow's Adaline
• The analog activation value x is compared with the target output b
  OR
• The output is a linear function of x
• LMS learning law
• Gradient descent algorithm
-
Heat and Cold Example
[Network diagram: inputs x1 (hot) and x2 (cold) feed the perceived outputs y1 (hot) and y2 (cold) through two auxiliary units N1 and N2, with excitatory weights of 2 and 1 and an inhibitory weight of -1.]
-
Hebb Rule (Hebb Net)
  Step 0: Initialize all weights to zero: w_i = 0, i = 1, ..., n
  Step 1: For all input training vector and target output pairs, do steps 2-4
  Step 2: Set activations for the inputs: x_i = s_i
  Step 3: Set activation for the output: y = t
  Step 4: Adjust the weights and bias:
    w_i(new) = w_i(old) + x_i y
    b(new) = b(old) + y
-
Examples
• AND logic
• OR logic
• AND-NOT logic
• 3-D example where Hebb fails
• X-OR!!
-
Concepts - Hebbian Learning
• A discriminating hyper-plane is constituted by the combination of the summing unit and the output unit.
• 0 targets are difficult to learn!
• The bipolar notion is preferred
• The Hebb rule doesn't give the direction of learning!
• Concept of a bias!!
-
Perceptron Learning
• Step 0: Initialize weights and bias (for simplicity, set the w's and b to 0). Set the learning rate 0 < α ≤ 1
• Step 1: While the stopping condition is false
• Step 2: For each training pair s : t
• Step 3: Set activations to the input: x_i = s_i
-
Perceptron Learning
• Step 4: Compute the response of the o/p unit:
  y_in = b + Σ_i x_i w_i
  y = 1 if y_in > θ;  y = 0 if -θ ≤ y_in ≤ θ;  y = -1 if y_in < -θ
-
Perceptron Learning
• Step 5: Update the weights and bias if an error occurred for the given pattern, i.e., if y ≠ t:
  w_i(new) = w_i(old) + α t x_i
  b(new) = b(old) + α t
• Step 6: If there is no weight change for the entire epoch, stop!
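Steps 0-6 collected into one runnable sketch (Python; the θ and α values in the demo call are chosen to match the problem on the following slides):

```python
def perceptron_train(samples, n, alpha=1.0, theta=0.0, max_epochs=100):
    """Perceptron learning rule, steps 0-6 from the slides."""
    w, b = [0.0] * n, 0.0                       # step 0
    for _ in range(max_epochs):                 # step 1
        changed = False
        for s, t in samples:                    # step 2
            x = s                               # step 3
            y_in = b + sum(xi * wi for xi, wi in zip(x, w))        # step 4
            y = 1 if y_in > theta else (-1 if y_in < -theta else 0)
            if y != t:                          # step 5
                w = [wi + alpha * t * xi for wi, xi in zip(w, x)]
                b += alpha * t
                changed = True
        if not changed:                         # step 6
            return w, b
    return w, b

# AND with binary inputs and bipolar targets, alpha = 1, theta = 0.2;
# the weights settle after several epochs (cf. the "tenth epoch" slide below)
samples = [((1, 1), 1), ((1, 0), -1), ((0, 1), -1), ((0, 0), -1)]
print(perceptron_train(samples, 2, theta=0.2))
```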
-
Problem!
• AND function, bipolar inputs, bipolar targets; α = 1, θ = 0, b = w1 = w2 = 0
-
Problem!
• AND function, binary inputs, bipolar targets; α = 1, θ = 0.2, b = w1 = w2 = 0
-
Problem!
• AND function, binary inputs, bipolar targets; α = 1, θ = 0.2, b = w1 = w2 = 0
• Results for the third epoch
-
Problem!
• AND function, binary inputs, bipolar targets; α = 1, θ = 0.2, b = w1 = w2 = 0
• Results of the tenth epoch
-
Artificial Neural Networks - Lec 7 & 8
Dr. Aditya Abhyankar
-
Today
• Conceptual drill from last time
• PLR-CT
• Delta rule
• Adaline learning
-
Concepts - Hebbian Learning
• A discriminating hyper-plane is constituted by the combination of the summing unit and the output unit.
• 0 targets are difficult to learn!
• The bipolar notion is preferred
• The Hebb rule doesn't give the direction of learning!
• Concept of a bias!!
-
Hebb Rule (Hebb Net)
  Step 0: Initialize all weights to zero: w_i = 0, i = 1, ..., n
  Step 1: For all input training vector and target output pairs, do steps 2-4
  Step 2: Set activations for the inputs: x_i = s_i
  Step 3: Set activation for the output: y = t
  Step 4: Adjust the weights and bias:
    w_i(new) = w_i(old) + x_i y
    b(new) = b(old) + y
-
Examples
• AND logic
• OR logic
• AND-NOT logic
• 3-D example where Hebb fails
• X-OR!!
-
X-OR Example using Hebbian learning

  x1  x2 | y
   1   1 | -1
   1  -1 |  1
  -1   1 |  1
  -1  -1 | -1

  w1 = ?, w2 = ?, TH = 0
-
X-OR Example using MP model
[Plot: the four bipolar XOR patterns in the (x1, x2) plane.]
NOT linearly separable!!!

  x1  x2 | y
   1   1 | -1
   1  -1 |  1
  -1   1 |  1
  -1  -1 | -1
-
X-OR Example using MP model

  x1  x2 | y
   1   1 | -1
   1  -1 |  1
  -1   1 |  1
  -1  -1 | -1

[Network: hidden units z1 and z2 with weights w11, w21, w12, w22 and biases b1, b2 (all '?'); output unit with weights w31, w32 and bias b3 (all '?').]

  x1 XOR x2 = { x1 AND NOT x2 } OR { x2 AND NOT x1 }
-
X-OR Example using MP model
Training z1 = x1 AND NOT x2 with the Hebb rule (bipolar):

  x1  x2 |  t | Δw1 Δw2 Δb | w1  w2   b
   1   1 | -1 |  -1  -1 -1 | -1  -1  -1
   1  -1 |  1 |   1  -1  1 |  0  -2   0
  -1   1 | -1 |   1  -1 -1 |  1  -3  -1
  -1  -1 | -1 |   1   1 -1 |  2  -2  -2
-
X-OR Example using MP model
[Plot: the decision boundary for z1 = x1 AND NOT x2 is the line x2 = x1 - 1.]
-
X-OR Example using MP model
Training z2 = x2 AND NOT x1 with the Hebb rule (bipolar):

  x1  x2 |  t | Δw1 Δw2 Δb | w1  w2   b
   1   1 | -1 |  -1  -1 -1 | -1  -1  -1
   1  -1 | -1 |  -1   1 -1 | -2   0  -2
  -1   1 |  1 |  -1   1  1 | -3   1  -1
  -1  -1 | -1 |   1   1 -1 | -2   2  -2
-
X-OR Example using MP model
[Plot: the decision boundary for z2 = x2 AND NOT x1 is the line x2 = x1 + 1.]
-
X-OR Example using MP model
[Plots: the two half-planes side by side. For z1 = x1 AND NOT x2 the boundary is x2 = x1 - 1; for z2 = x2 AND NOT x1 it is x2 = x1 + 1; t marks the target class on each side.]
-
X-OR Example using MP model
[Plot: both boundaries together in the (x1, x2) plane.]
-
X-OR Example using MP model

  x1  x2 | y
   1   1 | -1
   1  -1 |  1
  -1   1 |  1
  -1  -1 | -1

[Network: hidden units z1 and z2 with weights w11, w21, w12, w22 and biases b1, b2 (all '?'); output unit with weights w31, w32 and bias b3 (all '?').]

  x1 XOR x2 = { x1 AND NOT x2 } OR { x2 AND NOT x1 }
-
X-OR Example using MP model
  x1 XOR x2 = { x1 AND NOT x2 } OR { x2 AND NOT x1 }
Training the output unit y = z1 OR z2 with the Hebb rule on the hidden activations (z1, z2):

  z1  z2 |  t | Δw1 Δw2 Δb | w1  w2   b
  -1  -1 | -1 |   1   1 -1 |  1   1  -1
   1  -1 |  1 |   1  -1  1 |  2   0   0
  -1   1 |  1 |  -1   1  1 |  1   1   1
  -1  -1 | -1 |   1   1 -1 |  2   2   0
-
X-OR Example using MP model
[Plot: decision boundary x2 = x1 - 1 for z1 = x1 AND NOT x2.]
-
X-OR Example using MP model
[Plot: decision boundary x2 = x1 - 1 for z1 = x1 AND NOT x2.]
-
Perceptron Learning
• Step 0: Initialize weights and bias (for simplicity, set the w's and b to 0). Set the learning rate 0 < α ≤ 1
• Step 1: While the stopping condition is false
• Step 2: For each training pair s : t
• Step 3: Set activations to the input: x_i = s_i
-
Perceptron Learning
• Step 4: Compute the response of the o/p unit:
  y_in = b + Σ_i x_i w_i
  y = 1 if y_in > θ;  y = 0 if -θ ≤ y_in ≤ θ;  y = -1 if y_in < -θ
-
Perceptron Learning
• Step 5: Update the weights and bias if an error occurred for the given pattern, i.e., if y ≠ t:
  w_i(new) = w_i(old) + α t x_i
  b(new) = b(old) + α t
• Step 6: If there is no weight change for the entire epoch, stop!
-
Problem!
• AND function, bipolar inputs, bipolar targets; α = 1, θ = 0, b = w1 = w2 = 0
• Let's solve this using the advanced notion of bias!
-
Significance of Bias
[Two neuron diagrams: the same unit with and without a bias input; inputs x1 ... xN, weights w_i1 ... w_iN, net value y_in, output y.]
With a bias, the decision boundary is
  b + x1 w1 + x2 w2 = 0   →   x2 = -(w1/w2) x1 - b/w2
With a fixed threshold θ instead, it is
  x1 w1 + x2 w2 = θ   →   x2 = -(w1/w2) x1 + θ/w2
Either way, the bias/threshold shifts the separating line away from the origin.
-
Significance of Bias
[Diagram: the bias b_i treated as a weight on an extra input x0 that is fixed at 1; inputs x0, x1 ... xN feed the net value y_in and the output y.]
-
Problem!
• AND function, binary inputs, bipolar targets; α = 1, θ = 0.2, b = w1 = w2 = 0
-
Problem!
• AND function, binary inputs, bipolar targets; α = 1, θ = 0.2, b = w1 = w2 = 0
• Results for the third epoch
-
Problem!
• AND function, binary inputs, bipolar targets; α = 1, θ = 0.2, b = w1 = w2 = 0
• Results of the tenth epoch
-
Perceptron Convergence Theorem
If there is a weight vector w* such that f(x(p)·w*) = t(p) for all p, then for any starting vector w, the PLR will converge to a weight vector that gives the correct response to all training patterns, in a finite number of steps.
-
Perceptron Convergence Theorem
  w(k)·w* ≥ w(0)·w* + k m,   where m = min_p { x(p)·w* }
Since the cosine of the angle between w(k) and w* is at most 1,
  ||w(k)||² ≥ (w(k)·w*)² / ||w*||² ≥ (w(0)·w* + k m)² / ||w*||²
-
Perceptron Convergence Theorem
  ||w(k)||² ≤ ||w(0)||² + k M²,   where M = max_p { ||x(p)|| }
Combining the two bounds,
  (w(0)·w* + k m)² / ||w*||² ≤ ||w(k)||² ≤ ||w(0)||² + k M²
The left side grows like k² while the right side grows like k, so the inequality can hold only for finitely many k: the number of weight updates is bounded.
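Solving the combined inequality for k makes the finite bound explicit; a short derivation (LaTeX), under the simplifying assumption w(0) = 0:

```latex
% With w(0) = 0 the combined inequality reads
%   (k m)^2 / \|w^*\|^2 \le \|w(k)\|^2 \le k M^2,
% so
\[
\frac{k^2 m^2}{\|w^*\|^2} \;\le\; k M^2
\qquad\Longrightarrow\qquad
k \;\le\; \frac{M^2\,\|w^*\|^2}{m^2},
\]
% i.e., the PLR makes at most M^2 ||w*||^2 / m^2 weight updates.
```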
-
Adaline - Delta Learning Rule
• Step 0: Initialize weights and bias (for simplicity, set the w's and b to 0). Set the learning rate 0 < α ≤ 1
• Step 1: While the stopping condition is false
• Step 2: For each training pair s : t
• Step 3: Set activations to the input: x_i = s_i
-
Adaline - Delta Learning Rule
• Step 4: Compute the response of the o/p unit:
  y_in = b + Σ_i x_i w_i
  y = y_in
-
Adaline - Delta Learning Rule
• Step 5: Update the weights and bias if an error occurred for the given pattern, i.e., if y ≠ t:
  w_i(new) = w_i(old) + α (t - y) x_i
  b(new) = b(old) + α (t - y)
• Step 6: If the largest weight change is smaller than a specified tolerance, stop!
-
Problem!
• AND function, bipolar inputs, bipolar targets; α = 1, b = w1 = w2 = 0
-
Artificial Neural Networks - Lec 9
Dr. Aditya Abhyankar
-
Last Time
• Conceptual drill!!
• PLR-CT
• Delta rule
• Adaline learning
-
Today
• Delta rule
• Adaline learning
• MATLAB demo
• Madaline philosophy
• Extended delta rule - BP
-
Delta Rule
• The delta rule changes the weights on the neuron connections to minimize the difference between y_in and t
• It aims at reducing the error across all the training patterns (exemplars)
• The squared error for a particular training pattern can be given as
  E = (t - y_in)²
-
Delta Rule
• E is a function of all the weights
• The gradient of E is a vector consisting of the partial derivatives of E with respect to each weight
• The gradient gives the direction of most rapid increase in E; we wish to minimize E!
• Hence calculate: ∂E/∂w_I
-
Delta Rule
  ∂E/∂w_I = ∂(t - y_in)²/∂w_I = -2 (t - y_in) x_I
• The local error will be reduced most rapidly by adjusting the weights as per the delta rule:
  Δw_I = α (t - y_in) x_I
-
Adaline - Delta Learning Rule
• Step 0: Initialize weights and bias (for simplicity, set the w's and b to small values). Set the learning rate 0 < α ≤ 1
• Step 1: While the stopping condition is false
• Step 2: For each training pair s : t
• Step 3: Set activations to the input: x_i = s_i
-
Adaline - Delta Learning Rule
• Step 4: Compute the response of the o/p unit:
  y_in = b + Σ_i x_i w_i
  y = y_in
-
Adaline - Delta Learning Rule
• Step 5: Update the weights and bias if an error occurred for the given pattern, i.e., if y ≠ t:
  w_i(new) = w_i(old) + α (t - y) x_i
  b(new) = b(old) + α (t - y)
• Step 6: If the largest weight change is smaller than a specified tolerance, stop!
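The delta learning rule above as a runnable sketch (Python; the tolerance and epoch limit are illustrative choices):

```python
import random

def adaline_train(samples, n, alpha=0.1, tol=1e-4, max_epochs=1000):
    """Adaline delta learning rule, steps 0-6 from the slides."""
    w = [random.uniform(-0.1, 0.1) for _ in range(n)]    # step 0: small values
    b = random.uniform(-0.1, 0.1)
    for _ in range(max_epochs):                          # step 1
        biggest = 0.0
        for x, t in samples:                             # steps 2-3
            y_in = b + sum(xi * wi for xi, wi in zip(x, w))   # step 4
            err = t - y_in
            w = [wi + alpha * err * xi for wi, xi in zip(w, x)]   # step 5
            b += alpha * err
            biggest = max(biggest, abs(alpha * err))     # largest weight change
        if biggest < tol:                                # step 6
            break
    return w, b

# Bipolar AND; the minimum-squared-error solution is w1 = w2 = 0.5, b = -0.5
samples = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
print(adaline_train(samples, 2))
```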
-
Problem!
• AND function, bipolar inputs, bipolar targets; α = 0.1, b = w1 = w2 = 0.1
-
Madaline Architecture
[Diagram: inputs x1 and x2 feed two hidden Adalines Z1 and Z2 through weights w11, w21, w12, w22 with biases b1, b2; the hidden outputs feed a single output unit Y through weights v1, v2 with bias b3.]
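A sketch of the forward pass through this architecture (Python). The MRI training rule itself is on the following slides, so only inference is shown here; the sign activation and the example weights are assumptions on my part, not values from this slide:

```python
def sign(u):
    return 1 if u >= 0 else -1

def madaline_forward(x, W, b_hidden, v, b_out):
    """Forward pass: two Adaline hidden units Z1, Z2, then the output unit Y."""
    z_in = [b_hidden[j] + sum(W[i][j] * x[i] for i in range(len(x)))
            for j in range(len(b_hidden))]       # hidden net values
    z = [sign(u) for u in z_in]                  # hidden activations
    y_in = b_out + sum(vj * zj for vj, zj in zip(v, z))
    return sign(y_in)

# Hypothetical small weights in the slide's layout: W[i][j] connects x_i to Z_j
W = [[0.05, 0.1], [0.2, 0.2]]
print(madaline_forward((1, -1), W, (0.3, 0.15), (0.5, 0.5), 0.5))
```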
-
Madaline Learning (MRI)
-
Madaline Learning (MRI)
-
Madaline Learning (MRI)
-
Madaline Learning (MRI)
-
Problem!
• X-OR function, bipolar inputs, bipolar targets;
  b1 = 0.3, w11 = 0.05, w21 = 0.2
  b2 = 0.5, w12 = 0.1, w22 = 0.2
  b3 = v1 = v2 = 0.5
-
Artificial Neural Networks - Lec 10
Dr. Aditya Abhyankar
-
Back-Propagation (BP)
• Aims at balancing memorization and generalization
• Stage 1: Feedforward of the input training pattern
• Stage 2: Calculation and backpropagation of the associated error
• Stage 3: Adjustment of the weights
-
Architecture
[Diagram: input layer, hidden layer, output layer.]
-
Nomenclature
-
Nomenclature
-
Activation Function
• Characteristics:
  - Continuous
  - Differentiable
  - Monotonically non-decreasing
  - Easily differentiable
-
Binary Sigmoid Function: range (0, 1)
-
Bipolar Sigmoid Function: range (-1, 1)
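The two sigmoids and their derivatives, which the training algorithm below relies on; a sketch (Python) using the standard formulas, since the slides state only the ranges:

```python
import math

def binary_sigmoid(u):
    """f(u) = 1 / (1 + exp(-u)); range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-u))

def binary_sigmoid_prime(u):
    """f'(u) = f(u) * (1 - f(u)) -- easily differentiable, as required."""
    f = binary_sigmoid(u)
    return f * (1.0 - f)

def bipolar_sigmoid(u):
    """f(u) = 2 / (1 + exp(-u)) - 1; range (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-u)) - 1.0

def bipolar_sigmoid_prime(u):
    """f'(u) = 0.5 * (1 + f(u)) * (1 - f(u))."""
    f = bipolar_sigmoid(u)
    return 0.5 * (1.0 + f) * (1.0 - f)
```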
-
Algorithm: Training
-
Algorithm
-
Algorithm
-
Algorithm
-
Algorithm
-
Application
-
Example
• X-or problem (linearly not separable) using a 2-4-1 backprop net
• Initial weights to the hidden layer
• Initial weights to the o/p layer
-
Solution
[Network diagram, built up over several slides: input layer x1, x2 (plus a constant bias input 1); hidden layer z1, z2, z3, z4, with weights v11 ... v24 from the inputs and biases v01 ... v04; output layer y1, with weights w11, w21, w31, w41 from the hidden units and bias w01.]
-
Solution - Training: step 1 (feedforward)
Initial weights:
  v  (input → hidden)  = [  0.1970   0.3191  -0.1448   0.3394
                            0.3099   0.1904  -0.0347  -0.4861 ]
  v0 (hidden biases)   = [ -0.3378   0.2771   0.2859  -0.3329 ]
  w  (hidden → output) = [  0.4919  -0.2913  -0.3979   0.3581 ],  w01 = -0.1401

First training pair x = (1, 1), t = 0:
  z_in = [ 0.1691, 0.7866, 0.1064, -0.4796 ]
  z    = [ 0.5422, 0.6871, 0.5266,  0.3823 ]
  y_in1 = -0.1462,  y1 = 0.4635
-
Solution - Training: step 2 (backpropagation of the error)
  δ_k = (t - y1) f'(y_in1) = -0.1153
  Δw  = [ -0.0012, -0.0016, -0.0012, -0.0009 ],  Δw01 = -0.0023
  δ_in,j = δ_k w_j = [ -0.0567, 0.0336, 0.0459, -0.0413 ]
  δ_j = δ_in,j f'(z_in,j) = [ -0.0141, 0.0072, 0.0114, -0.0097 ]
  Δv  = [ -0.2815   0.1444   0.2287  -0.1950
          -0.2815   0.1444   0.2287  -0.1950 ] × 1.0e-3
  Δv0 = [ -0.2815   0.1444   0.2287  -0.1950 ] × 1.0e-3
-
Solution - Training: step 3 (weight update)
  w  = [ 0.4907  -0.2929  -0.3991   0.3572 ],  w01 = -0.1424
  v  = [ 0.1967   0.3192  -0.1446   0.3392
         0.3096   0.1905  -0.0345  -0.4863 ]
  v0 = [ -0.3381   0.2772   0.2861  -0.3331 ]
-
Solution - Training: steps 1-3 for the second training pair
After processing the second pattern of the truth table (x1 x2 t: 1 1 0; 1 0 1; 0 1 1; 0 0 0), the weights become:
  w  = [ 0.4923  -0.2908  -0.3975   0.3584 ],  w01 = -0.1393
  v  = [ 0.1970   0.3191  -0.1448   0.3394
         0.3100   0.1904  -0.0347  -0.4861 ]
  v0 = [ -0.3377   0.2770   0.2858  -0.3328 ]
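The whole worked example as a runnable sketch (Python). The learning rate α = 0.02 is inferred from the printed Δw values (Δw_j = α δ_k z_j) rather than stated on the slides, so small rounding differences from the slide numbers are expected:

```python
import math

def f(u):                        # binary sigmoid
    return 1.0 / (1.0 + math.exp(-u))

def fprime(u):
    s = f(u)
    return s * (1.0 - s)

# Initial weights as printed on the slides
v  = [[0.1970, 0.3191, -0.1448, 0.3394],      # from x1
      [0.3099, 0.1904, -0.0347, -0.4861]]     # from x2
v0 = [-0.3378, 0.2771, 0.2859, -0.3329]       # hidden biases
w  = [0.4919, -0.2913, -0.3979, 0.3581]       # hidden -> output
w0 = -0.1401                                  # output bias
alpha = 0.02                                  # inferred learning rate

def train_pair(x, t):
    global v, v0, w, w0
    # Step 1: feedforward
    z_in = [v0[j] + v[0][j] * x[0] + v[1][j] * x[1] for j in range(4)]
    z = [f(u) for u in z_in]
    y_in = w0 + sum(wj * zj for wj, zj in zip(w, z))
    y = f(y_in)
    # Step 2: backpropagate the error
    dk = (t - y) * fprime(y_in)                            # output delta
    dj = [dk * w[j] * fprime(z_in[j]) for j in range(4)]   # hidden deltas
    # Step 3: update the weights
    w = [w[j] + alpha * dk * z[j] for j in range(4)]
    w0 += alpha * dk
    for j in range(4):
        v[0][j] += alpha * dj[j] * x[0]
        v[1][j] += alpha * dj[j] * x[1]
        v0[j]   += alpha * dj[j]
    return y

train_pair((1, 1), 0)   # reproduces the step 1-3 numbers above
train_pair((1, 0), 1)   # second training pair
print(w, w0)            # weights after two pairs (compare with the step-3 slides)
```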
-
Application
-
Artificial Neural Networks - Lec 11 & 12
Dr. Aditya Abhyankar
-
Back-Propagation (BP)
• Aims at balancing memorization and generalization
• Stage 1: Feedforward of the input training pattern
• Stage 2: Calculation and backpropagation of the associated error
• Stage 3: Adjustment of the weights
-
Architecture
[Diagram: input layer, hidden layer, output layer.]
-
Nomenclature
  x = (x1, ..., xi, ..., xn)   input training vector
  t = (t1, ..., tk, ..., tm)   output target vector
  δk   portion of the error-correction weight adjustment for w_jk; the error at output unit Y_k, to be propagated back
  δj   portion of the error-correction weight adjustment for v_ij; the error at hidden unit Z_j, to be propagated back
  α    learning rate
  X_i  input unit i
-
Nomenclature
-
Activation Function
• Characteristics:
  - Continuous
  - Differentiable
  - Monotonically non-decreasing
  - Easily differentiable
-
Binary Sigmoid Function: range (0, 1)
-
Bipolar Sigmoid Function: range (-1, 1)
-
Algorithm: Training
-
Algorithm
-
Algorithm
-
Algorithm
-
Algorithm
-
Application
-
Application
-
Example
• X-or problem (linearly not separable) using a 2-4-1 backprop net
• Initial weights to the hidden layer
• Initial weights to the o/p layer
-
ApplicationApplicationApplicationApplication
-
ApplicationApplicationApplicationApplication
-
PerceptronPerceptron Convergence Convergence TheoremTheorem
If there is a weight vector w* such that gf(x(p).w*)=t(p) for all p, then for any starting vector w, PLR will converge to g , ga weight vector that gives correct response to all training patterns in p g pfinite number of loops.
44ANN - Dr. Abhyankar - Lec 7 & 8
-
PerceptronPerceptron Convergence Convergence TheoremTheorem
* * *( ). (0). , min{ . }w k w w w km m x w + =* 2
2* 2
( ( ). )|| ( ) |||| ||w k ww k
w
|| ||w
* 22 ( (0). )|| ( ) || w w kmk +2 * 2( ( ) )|| ( ) || || ||w k w
45ANN - Dr. Abhyankar - Lec 7 & 8
-
PerceptronPerceptron Convergence Convergence TheoremTheorem
2 2 2|| ( ) || || (0) || , max{|| ||}w k w kM M x + =
* 22 2( (0). ) || ( ) || || (0) ||w w km k kM+ 2 2* 2( ( ) ) || ( ) || || (0) |||| || w k w kMw +
46ANN - Dr. Abhyankar - Lec 7 & 8
-
Delta RuleDelta RuleDelta RuleDelta Rule
E i f ti f ll th i hty E is function of all the weightsy Gradient of E will be a vector consisting of
partial derivatives of E with reference to partial derivatives of E with reference to each weighty Gradient gives direction of rapid increase y Gradient gives direction of rapid increase
in E, we wish to minimize E!y Hence calculate: E
Iw
47ANN - Dr. Abhyankar - Lec 7 & 8
-
Delta RuleDelta RuleDelta RuleDelta Rule
2( ) inyE t y = 2( )inI I
t yw w
= 2( )E ( )E
l ll b d d
2( )in II
E t y xw = ( )in II
E t y xw
= y Local error will be reduced most
rapidly by adjusting weights as per the delta rulethe delta rule
48ANN - Dr. Abhyankar - Lec 7 & 8
-
Artificial Neural Networks 13
Dr. Aditya Abhyankar
-
Back-Propagation (BP)
• Aims at balancing memorization and generalization
• Stage 1: Feedforward the input training pattern
• Stage 2: Calculate and back-propagate the associated error
• Stage 3: Adjust the weights
-
Architecture
[Figure: network architecture; the slide shows the hidden layer between the input and output layers]
-
Nomenclature
x = (x1, ..., xi, ..., xn): input training vector
t = (t1, ..., tk, ..., tm): output target vector
δk: portion of the error-correction weight adjustment for w_jk; the error at output unit Y_k, to be propagated back
δj: portion of the error-correction weight adjustment for v_ij; the error at hidden unit Z_j, to be propagated back
α: learning rate
X_i: input unit i
-
Nomenclature (continued)
[remaining symbol definitions not captured in the transcript]
-
Activation Function
Characteristics:
• Continuous
• Differentiable
• Monotonically non-decreasing
• Derivative easy to compute
-
Binary Sigmoid Function: range (0, 1)
f(x) = 1 / (1 + e^(-x)),  f'(x) = f(x) [1 - f(x)]
-
Bipolar Sigmoid Function: range (-1, 1)
g(x) = 2 f(x) - 1 = (1 - e^(-x)) / (1 + e^(-x)),  g'(x) = (1/2) [1 + g(x)] [1 - g(x)]
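Both activation functions are easy to write down directly; a minimal Python sketch (function names are mine, not from the slides):

import math

def binary_sigmoid(x):
    # f(x) = 1 / (1 + e^(-x)), range (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def binary_sigmoid_deriv(x):
    # f'(x) = f(x) * (1 - f(x))
    f = binary_sigmoid(x)
    return f * (1.0 - f)

def bipolar_sigmoid(x):
    # g(x) = 2 f(x) - 1, range (-1, 1)
    return 2.0 * binary_sigmoid(x) - 1.0

def bipolar_sigmoid_deriv(x):
    # g'(x) = 0.5 * (1 + g(x)) * (1 - g(x))
    g = bipolar_sigmoid(x)
    return 0.5 * (1.0 + g) * (1.0 - g)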
-
Algorithm: Training
[The algorithm slides step through the standard back-propagation training procedure: initialize the weights, then for each training pair run the feedforward, error back-propagation, and weight-update phases until a stopping condition is met; the slide text itself was not captured in the transcript.]
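Since the algorithm text was not captured, here is a minimal Python sketch of the three-stage procedure for a single-hidden-layer net with one output unit, applied to the 2-4-1 XOR example; function and variable names are mine, and α = 0.02 matches the learning rate used in the worked example below.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_bp(X, T, n_hidden=4, alpha=0.02, epochs=10000, seed=0):
    """Back-propagation for an input-hidden-output net with one output unit.
    X: (P, n) inputs; T: (P,) targets in (0, 1)."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    V = rng.uniform(-0.5, 0.5, (n + 1, n_hidden))  # hidden weights, row 0 = biases
    W = rng.uniform(-0.5, 0.5, n_hidden + 1)       # output weights, W[0] = bias
    for _ in range(epochs):
        for x, t in zip(X, T):
            # Stage 1: feedforward
            z_in = V[0] + x @ V[1:]
            z = sigmoid(z_in)
            y_in = W[0] + z @ W[1:]
            y = sigmoid(y_in)
            # Stage 2: back-propagate the error
            delta_k = (t - y) * y * (1 - y)          # output error term
            delta_j = delta_k * W[1:] * z * (1 - z)  # hidden error terms
            # Stage 3: adjust the weights
            W[0] += alpha * delta_k
            W[1:] += alpha * delta_k * z
            V[0] += alpha * delta_j
            V[1:] += alpha * np.outer(x, delta_j)
    return V, W

# XOR training data from the worked example
X = np.array([[1, 1], [1, 0], [0, 1], [0, 0]], float)
T = np.array([0, 1, 1, 0], float)
V, W = train_bp(X, T)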
-
Application
[Application slides: figures not captured in the transcript]
-
Example
• XOR problem (not linearly separable) using a 2-4-1 backprop net
• Initial weights to the hidden layer
• Initial weights to the output layer
-
Solution
[Figure: the 2-4-1 network, built up label by label over several slides]
• Input layer: units x1, x2 (plus a bias unit 1)
• Hidden layer: units z1, z2, z3, z4, with weights v11..v14 from x1, v21..v24 from x2, and bias weights v01..v04
• Output layer: unit y, with weights w11, w21, w31, w41 from the hidden units and bias weight w01
-
Solution Training: step 1 (feedforward the first pattern)

Training set (XOR):
x1  x2  t
 1   1  0
 1   0  1
 0   1  1
 0   0  0

Initial hidden-layer weights (bias row v0, then the rows for x1 and x2):
v0 = [-0.3378   0.2771   0.2859  -0.3329]
v  = [ 0.1970   0.3191  -0.1448   0.3394
       0.3099   0.1904  -0.0347  -0.4861]

Initial output-layer weights:
w01 = -0.1401,  w = [0.4919  -0.2913  -0.3979  0.3581]'

Present the first pattern x = (1, 1), t = 0.

Hidden-layer net inputs, z_in_j = v0j + x1 v1j + x2 v2j:
z_in1 = 0.1691,  z_in2 = 0.7866,  z_in3 = 0.1064,  z_in4 = -0.4796

Hidden activations, z_j = f(z_in_j) with the binary sigmoid:
z1 = 0.5422,  z2 = 0.6871,  z3 = 0.5266,  z4 = 0.3823

Output net input and activation:
y_in = w01 + Σ_j z_j w_j1 = -0.1462,  y = f(y_in) = 0.4635
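The step-1 numbers can be checked directly; a minimal sketch using the initial weights above (variable names are mine):

import numpy as np

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# Initial weights from the example (row 0 of V holds the biases v0j)
V = np.array([[-0.3378,  0.2771,  0.2859, -0.3329],
              [ 0.1970,  0.3191, -0.1448,  0.3394],
              [ 0.3099,  0.1904, -0.0347, -0.4861]])
W = np.array([-0.1401, 0.4919, -0.2913, -0.3979, 0.3581])  # W[0] = w01

x = np.array([1.0, 1.0])  # first training pattern, target t = 0

z_in = V[0] + x @ V[1:]
z = sigmoid(z_in)
y_in = W[0] + z @ W[1:]
y = sigmoid(y_in)

print(z_in)     # [ 0.1691  0.7866  0.1064 -0.4796]
print(z)        # [ 0.5422  0.6871  0.5266  0.3823]
print(y_in, y)  # -0.1462  0.4635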
-
Solution Training: step 2 (back-propagate the error and compute the corrections)

Output error term (t = 0, y = 0.4635):
δk = (t - y) f'(y_in) = (t - y) y (1 - y) = -0.1153

Output-layer weight corrections (learning rate α = 0.02):
Δw01 = α δk = -0.0023
Δw = α δk z = [-0.0012  -0.0016  -0.0012  -0.0009]'

Error propagated to each hidden unit, δ_in_j = δk w_j1:
δ_in = [-0.0567   0.0336   0.0459  -0.0413]

Hidden error terms, δ_j = δ_in_j f'(z_in_j) = δ_in_j z_j (1 - z_j):
δ = [-0.0141   0.0072   0.0114  -0.0097]

Hidden-layer weight corrections, Δv_ij = α δ_j x_i (here x1 = x2 = 1, so all rows agree):
Δv0 = Δv1 = Δv2 = [-0.2815   0.1444   0.2287  -0.1950] * 1.0e-003
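Continuing the sketch above, the step-2 error terms and weight corrections follow in a few lines (α = 0.02 is inferred from the printed correction values):

alpha = 0.02
t = 0.0

# Output error term and corrections
delta_k = (t - y) * y * (1 - y)   # -0.1153
dW0 = alpha * delta_k             # -0.0023
dW = alpha * delta_k * z          # [-0.0012 -0.0016 -0.0012 -0.0009]

# Error terms propagated to the hidden layer
delta_in = delta_k * W[1:]        # [-0.0567  0.0336  0.0459 -0.0413]
delta_j = delta_in * z * (1 - z)  # [-0.0141  0.0072  0.0114 -0.0097]

# Hidden-layer corrections (x1 = x2 = 1, so all three rows coincide)
dV0 = alpha * delta_j             # [-0.2815  0.1444  0.2287 -0.1950] * 1.0e-003
dV = alpha * np.outer(x, delta_j)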
-
Solution Training: step 3 (apply the corrections, then train on the next pattern)

Updated output-layer weights:
w01 = -0.1424,  w = [0.4907  -0.2929  -0.3991  0.3572]'

Updated hidden-layer weights:
v0 = [-0.3381   0.2772   0.2861  -0.3331]
v  = [ 0.1967   0.3192  -0.1446   0.3392
       0.3096   0.1905  -0.0345  -0.4863]

Repeating the feedforward/backprop cycle on the second pattern, x = (1, 0), t = 1, yields:
w01 = -0.1393,  w = [0.4923  -0.2908  -0.3975  0.3584]'
v0 = [-0.3377   0.2770   0.2858  -0.3328]
v  = [ 0.1970   0.3191  -0.1448   0.3394
       0.3100   0.1904  -0.0347  -0.4861]
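Applying the step-2 corrections reproduces the updated weights shown above (continuing the earlier sketch):

W[0] += dW0
W[1:] += dW
V[0] += dV0
V[1:] += dV
print(W)  # [-0.1424  0.4907 -0.2929 -0.3991  0.3572]
print(V)  # row 0: [-0.3381  0.2772  0.2861 -0.3331]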
-
Application
[Application slides: figures not captured in the transcript]
-
Perceptron Convergence Theorem
If there is a weight vector w* such that f(x(p) · w*) = t(p) for all p, then for any starting vector w, the PLR will converge to a weight vector that gives the correct response to all training patterns in a finite number of loops.
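A minimal sketch of the perceptron learning rule (PLR) the theorem refers to, assuming bipolar targets and a fixed increment; the function name and the treatment of the zero case are my assumptions:

import numpy as np

def plr(X, T, max_loops=1000):
    """Perceptron learning rule: add t(p) * x(p) whenever pattern p is misclassified.
    X: (P, n) inputs (include a bias component); T: (P,) targets in {-1, +1}."""
    w = np.zeros(X.shape[1])  # any starting vector works, by the theorem
    for _ in range(max_loops):
        errors = 0
        for x, t in zip(X, T):
            if np.sign(x @ w) != t:  # misclassified (sign 0 counts as wrong here)
                w += t * x
                errors += 1
        if errors == 0:
            return w  # converged: correct response to all training patterns
    return w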
-
Perceptron Convergence Theorem
$$w(k)\cdot w^{*} \;\ge\; w(0)\cdot w^{*} + km, \qquad m = \min_{p}\{\,x(p)\cdot w^{*}\,\}$$
Since $\cos^{2}\theta \le 1$,
$$\|w(k)\|^{2} \;\ge\; \frac{(w(k)\cdot w^{*})^{2}}{\|w^{*}\|^{2}} \;\ge\; \frac{(w(0)\cdot w^{*} + km)^{2}}{\|w^{*}\|^{2}}$$
-
Perceptron Convergence Theorem
$$\|w(k)\|^{2} \;\le\; \|w(0)\|^{2} + kM, \qquad M = \max_{p}\{\,\|x(p)\|^{2}\,\}$$
Combining the two bounds:
$$\frac{(w(0)\cdot w^{*} + km)^{2}}{\|w^{*}\|^{2}} \;\le\; \|w(k)\|^{2} \;\le\; \|w(0)\|^{2} + kM$$
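The left side grows quadratically in k while the right side grows only linearly, so k is finite. For the common special case w(0) = 0 the classical bound follows (a sketch, not stated on the slides):
$$\frac{k^{2}m^{2}}{\|w^{*}\|^{2}} \;\le\; kM \quad\Longrightarrow\quad k \;\le\; \frac{M\,\|w^{*}\|^{2}}{m^{2}}$$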
-
Delta Rule
• E is a function of all the weights
• The gradient of E is the vector of partial derivatives of E with respect to each weight
• The gradient gives the direction of most rapid increase in E; we wish to minimize E
• Hence calculate ∂E/∂w_I
-
Delta Rule
$$E = (t - y_{in})^{2}, \qquad \frac{\partial E}{\partial w_{I}} = \frac{\partial}{\partial w_{I}}(t - y_{in})^{2} = -2\,(t - y_{in})\,x_{I}$$
• The local error will be reduced most rapidly by adjusting the weights as per the delta rule:
$$\Delta w_{I} = \alpha\,(t - y_{in})\,x_{I}$$
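A minimal sketch of one delta-rule update for a single linear unit (names are mine; the constant 2 is absorbed into the learning rate α, as is conventional):

import numpy as np

def delta_rule_step(w, x, t, alpha=0.02):
    """One delta-rule update: w <- w + alpha * (t - y_in) * x,
    which moves w against the gradient of E = (t - y_in)^2."""
    y_in = x @ w
    return w + alpha * (t - y_in) * x

w = np.zeros(3)
w = delta_rule_step(w, np.array([1.0, 0.5, -0.5]), t=1.0)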
-
Dr. Aditya Abhyankar
-
Learning: the process of forming associations between related patterns
• Heteroassociative NNs
• Autoassociative NNs
• Hopfield Net
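As a preview, a minimal sketch of a Hebb-rule (outer-product) heteroassociative memory; the storage and recall rules are the standard ones, not taken verbatim from these slides:

import numpy as np

# Store pattern pairs (s, t) with the Hebb rule: W = sum_p s(p)' t(p)
S = np.array([[ 1, -1], [-1,  1]])        # bipolar input patterns
T = np.array([[ 1, -1, 1], [-1, 1, -1]])  # associated bipolar targets
W = S.T @ T

def recall(x, W):
    return np.sign(x @ W)  # bipolar threshold recall

print(recall(S[0], W))  # recovers T[0]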
-
A linear vector space X set