ANN Combined

Artificial Neural Networks - Lec 1 & 2 - Dr. Aditya Abhyankar


TRANSCRIPT

  • Artificial Neural Networks - Lec 1 & 2

    Dr. Aditya Abhyankar

    1ANN - Dr. Abhyankar - Lecture 1&2

  • Major Topics to be Covered
    - ANN basics, neurons, learning algorithms
    - Perceptron learning, and pattern classification
    - Multi-Layer Perceptron (MLP), back-propagation learning, and applications
    - Pattern classification, Support Vector Machine (SVM)
    - Clustering, Self-Organizing Map
    - Radial Basis Network
    - Time series prediction, system identification, expert systems
    - Fuzzy Set Theory and Fuzzy Logic Control
    - Genetic Algorithm and Evolutionary Computing
    - Learning Vector Quantization
    - Mixture of Experts network
    - Recurrent network

    2ANN - Dr. Abhyankar - Lecture 1&2

  • Class Philosophy
    - Questions starting with WHY !!
    - No formalities
    - No attendance gimmicks
    - More inquisitive

    3ANN - Dr. Abhyankar - Lecture 1&2

  • Applications
    - General models of ANN applications: pattern classification; control, time series modeling, estimation; optimization
    - Real-world application examples

    4ANN - Dr. Abhyankar - Lecture 1&2

  • Applications
    - Many memoryless ANN paradigms (e.g., MLP) are modeled mathematically as a nonlinear mapping between the inputs (feature vectors) and outputs.
      Discrete output values: classification problem
      Continuous output values: approximation problem
    - ANNs with feedback can be used to model dynamic systems

    5ANN - Dr. Abhyankar - Lecture 1&2

  • Pattern Classification Applications
    - Speech recognition and speech synthesis
    - Classification of radar/sonar signals
    - Remote sensing and image classification
    - Handwritten character/digit recognition
    - ECG/EEG/EMG filtering/classification
    - Credit card application screening
    - Data mining, information retrieval

    6ANN - Dr. Abhyankar - Lecture 1&2

  • Control, Time Series, Estimation
    - Machine control / robot manipulation
    - Financial / scientific / engineering time series forecasting
    - Inverse modeling of the vocal tract

    7ANN - Dr. Abhyankar - Lecture 1&2

  • Optimization
    - Traveling salesperson
    - Multiprocessor scheduling and task assignment
    - VLSI placement and routing

    8ANN - Dr. Abhyankar - Lecture 1&2

  • Real World Applications
    - S&P 500 index prediction
    - Real estate appraisal
    - Credit scoring
    - Geochemical modeling
    - Hospital patient stay-length prediction
    - Breast cancer cell image classification
    - Jury summoning prediction
    - Precision direct mailing
    - Natural gas price prediction

    9ANN - Dr. Abhyankar - Lecture 1&2

  • ANN !!!
    - An artificial neural network (ANN) is a massively parallel distributed computing system (algorithm, device, or other) that has a natural propensity for storing experiential knowledge and making it available for use.
    - It resembles the brain in two aspects:
      1) Knowledge is acquired by the network through a learning process.
      2) Interneuron connection strengths known as synaptic weights are used to store the knowledge.
    Aleksander & Morton (1990), Haykin (1994)

    10ANN - Dr. Abhyankar - Lecture 1&2

  • Biological System

    11ANN - Dr. Abhyankar - Lecture 1&2

  • Biological System

    12ANN - Dr. Abhyankar - Lecture 1&2

  • ANN Assumptions
    - Information processing happens at many simple elements called neurons
    - Signals are passed between neurons over the connection links
    - Each connection link has an associated weight
    - Each neuron applies an activation function

    13ANN - Dr. Abhyankar - Lecture 1&2

  • ANN Characteristics
    - Pattern of connections between neurons, called the architecture
    - Method of determining the weights on the connections, called the training algorithm
    - Mathematical model for assigning the output, called the activation function

    14ANN - Dr. Abhyankar - Lecture 1&2

  • Neuron Model
    - McCulloch-Pitts (simplistic) neuron model
    - The net function of a neuron is a weighted sum of its input signals plus a bias term.

    15ANN - Dr. Abhyankar - Lecture 1&2
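A minimal Python sketch of the net function just described (weighted sum of the input signals plus a bias, followed by a hard threshold); the function names and sample values are illustrative, not taken from the slides.

```python
# Minimal McCulloch-Pitts-style neuron: weighted sum of inputs plus bias,
# followed by a hard-threshold activation. Names and values are illustrative.

def net_input(x, w, b):
    """Weighted sum of the input signals plus the bias: u = sum_i w_i * x_i + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def step(u, threshold=0.0):
    """Hard-threshold activation: fire (1) only if the net input reaches the threshold."""
    return 1 if u >= threshold else 0

if __name__ == "__main__":
    x = [1, 0, 1]          # input signals
    w = [0.5, -0.3, 0.8]   # connection weights
    b = -0.4               # bias term
    u = net_input(x, w, b)
    print(u, step(u))      # net input and thresholded output
```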

  • Neuron Model
    - The net function is a linear or nonlinear mapping from the input data space to an intermediate feature space
    - The most common form is a hyper-plane

    16ANN - Dr. Abhyankar - Lecture 1&2

  • Other Net Forms
    - Higher-order net function: the net function is a linear combination of higher-order polynomial terms. For example, a 2nd-order net function has the form:

      $u_i = \sum_{j,k=1}^{N} w_{ijk}\, y_j\, y_k + \theta_i$

    - Delta (sigma-pi) net function: instead of a summation, the product of all weighted synaptic inputs is computed:

      $u_i = \prod_{j=0}^{N} w_{ij}\, y_j$

    17ANN - Dr. Abhyankar - Lecture 1&2
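Hedged sketches of the two alternative net forms above: a 2nd-order net summing weighted pairwise products, and a sigma-pi ("delta") net multiplying the weighted inputs instead of summing them. All names and sample values are illustrative.

```python
import itertools

def quadratic_net(x, w2, w1, theta):
    """2nd-order net: u = sum_{j,k} w2[j][k]*x[j]*x[k] + sum_j w1[j]*x[j] + theta."""
    u = theta + sum(wj * xj for wj, xj in zip(w1, x))
    for j, k in itertools.product(range(len(x)), repeat=2):
        u += w2[j][k] * x[j] * x[k]
    return u

def sigma_pi_net(x, w):
    """Sigma-pi ('delta') net: u = product over j of w[j]*x[j]."""
    u = 1.0
    for wj, xj in zip(w, x):
        u *= wj * xj
    return u

if __name__ == "__main__":
    x = [1.0, 2.0]
    print(quadratic_net(x, [[0.1, 0.2], [0.2, 0.3]], [0.5, -0.5], 0.1))
    print(sigma_pi_net(x, [0.5, -0.5]))
```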

  • Neuron Activation Function

    18ANN - Dr. Abhyankar - Lecture 1&2

  • Neuron Activation Function

    19ANN - Dr. Abhyankar - Lecture 1&2

  • Neuron Activation Function

    20ANN - Dr. Abhyankar - Lecture 1&2

  • ANN Configuration
    - Uni-directional communication links are represented by directed arcs. The ANN structure can thus be described by a directed graph.
    - Fully connected: a cyclic graph with feedback. There are N x N connections for N neurons.

    21ANN - Dr. Abhyankar - Lecture 1&2

  • ANN Configuration
    - Feed-forward, layered connection: an acyclic directed graph, with no loop or cycle.

    22ANN - Dr. Abhyankar - Lecture 1&2

  • ANN Configuration

    23ANN - Dr. Abhyankar - Lecture 1&2

  • Feed-back Dynamic System
    - Without delay, feedback causes a causality problem: an unknown variable depends on an unknown variable!  a2 = g(a1) = g(g(a2)) = ...
    - To break the cycle, at least one delay element must be inserted into the feedback loop.
    - This effectively creates a nonlinear dynamic system (a sequential machine).

    24ANN - Dr. Abhyankar - Lecture 1&2
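A small sketch of the point on this slide: inserting one delay element makes the feedback loop computable step by step, i.e. a sequential machine. The activation g and the weights used here are illustrative assumptions.

```python
# Feedback with a unit delay: the current output depends on the *previous*
# output, so the loop can be evaluated step by step (a sequential machine).
import math

def g(a):
    # illustrative squashing activation
    return math.tanh(a)

def run_feedback_loop(x_seq, w_in=1.0, w_fb=0.5, a_prev=0.0):
    """a(t) = g(w_in * x(t) + w_fb * a(t-1)); a(t-1) is held in the delay element."""
    outputs = []
    for x in x_seq:
        a = g(w_in * x + w_fb * a_prev)
        outputs.append(a)
        a_prev = a          # stored in the delay element for the next step
    return outputs

if __name__ == "__main__":
    print(run_feedback_loop([1, 0, 0, 1, 1]))
```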

  • Models of Neuron
    - McCulloch-Pitts model (MP)
    - Rosenblatt's Perceptron model
    - Adaline model

    25ANN - Dr. Abhyankar - Lecture 1&2

  • MP Model
    - McCulloch-Pitts (simplistic) neuron model
    - The net function of a neuron is a weighted sum of its input signals plus a bias term.

    26ANN - Dr. Abhyankar - Lecture 1&2

  • MP Model Limitations
    - Weights fixed
    - Incapable of learning
    - Original model allows ONLY: binary output steps; operations at discrete time steps

    27ANN - Dr. Abhyankar - Lecture 1&2

  • Perceptron
    28ANN - Dr. Abhyankar - Lecture 1&2

  • Perceptron
    Activation:    $x = \sum_{i=1}^{M} a_i w_i$
    Output:        $s = f(x)$
    Error:         $\delta = b - s$
    Weight change: $\Delta w_i = \alpha\,\delta\,a_i$
    29ANN - Dr. Abhyankar - Lecture 1&2
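A sketch of the four quantities listed above for a single perceptron update; the bipolar step output and the learning-rate symbol alpha are assumptions made for illustration.

```python
def perceptron_step(a, w, target_b, alpha=1.0):
    """One perceptron update: activation, output, error, weight change."""
    x = sum(ai * wi for ai, wi in zip(a, w))                    # activation x = sum a_i w_i
    s = 1 if x >= 0 else -1                                     # output s = f(x) (bipolar step assumed)
    error = target_b - s                                        # error = b - s
    new_w = [wi + alpha * error * ai for ai, wi in zip(a, w)]   # weight change = alpha * error * a_i
    return x, s, error, new_w

if __name__ == "__main__":
    print(perceptron_step(a=[1, -1], w=[0.2, 0.4], target_b=1))
```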

  • Perceptron - Advantages
    - Perceptron learning law gives a step-by-step process for adjusting the weights
    - Perceptron convergence theorem
    30ANN - Dr. Abhyankar - Lecture 1&2

  • Widrow's Adaline
    31ANN - Dr. Abhyankar - Lecture 1&2

  • Adaline (ADAptive Linear Element)
    Activation:    $x = \sum_{i=1}^{M} a_i w_i$
    Output:        $s = f(x) = x$
    Error:         $\delta = b - s = b - x$
    Weight change: $\Delta w_i = \alpha\,\delta\,a_i$
    32ANN - Dr. Abhyankar - Lecture 1&2

  • Widrow's Adaline
    - Analog activation value x is compared with the target output b, OR
    - Output is a linear function of x
    - LMS learning law
    - Gradient descent algorithm
    33ANN - Dr. Abhyankar - Lecture 1&2

  • Heat and Cold Example

    34ANN - Dr. Abhyankar - Lecture 1&2

  • Hebb Rule
    35ANN - Dr. Abhyankar - Lecture 1&2

  • Heat and Cold Example

    36ANN - Dr. Abhyankar - Lecture 1&2

  • Artificial Neural Networks - Lec 3 & 4

    Dr. Aditya Abhyankar

  • ANN !!!
    - An artificial neural network (ANN) is a massively parallel distributed computing system (algorithm, device, or other) that has a natural propensity for storing experiential knowledge and making it available for use.
    - It resembles the brain in two aspects:
      1) Knowledge is acquired by the network through a learning process.
      2) Interneuron connection strengths known as synaptic weights are used to store the knowledge.
    Aleksander & Morton (1990), Haykin (1994)

    2July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Biological System

    3July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Biological System

    4July 20, 2010ANN - Dr. Abhyankar - Lec 3 & 4

  • Features - Biological NN
    - Robustness and fault tolerance
    - Flexibility: on-the-fly learning, adjustment of weights
    - Adaptability: ability to deal with a variety of data situations (fuzzy, probabilistic, noisy, etc.)
    - Efficiency: parallel and distributed computing

    5July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Ne on ModelNe on ModelNeuron ModelNeuron Modely McCulloch-Pitts (Simplistic) Neuron Model

    The network function of a neuron is ay The network function of a neuron is aweighted sum of its input signals plus a biasterm.

    6July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Performance Comparison

    Parameter         | BNN                                                                      | ANN
    Speed             | Slow (a few ms per execution)                                            | Fast (a few ns per execution)
    Processing        | Massively parallel                                                       | Mostly sequential
    Size & Complexity | Neurons ~10^11, interconnections ~10^15; contribution from dendrites and synapses allows complex pattern recognition tasks | Difficult to perform complex pattern recognition tasks

    7July 20, 2010ANN - Dr. Abhyankar -Lec 3 & 4

  • Performance Comparison

    Parameter       | BNN                                       | ANN
    Storage         | Adaptable (strengths of interconnections) | Strictly replaceable (memory mapping)
    Fault Tolerance | Good FT, distributed information          | Poor FT, corrupted memories, un-retrievable data

    8July 20, 2010ANN - Dr. Abhyankar -Lec 3 & 4

  • ANN Terminology
    - ANN: a highly simplified model of the BNN
    - ANN has interconnected processing units
    - Summing part receives N inputs, weights each value and computes the weighted sum
    - Weighted sum = activation value
    - Positive weight = excitatory input
    - Negative weight = inhibitory input

    9July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Neuron ModelNeuron ModelNeuron ModelNeuron Modely The net function is a linear or nonlinear

    mapping from the input data space to an pp g p pintermediate feature spacey The most common form is a hyper-plane

    10July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Other Things We Saw
    - Other net forms
    - Various activation functions
    - ANN configurations
    - Dynamic systems
    - ANN assumptions
    - ANN characteristics

    11July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Models of Neuron - 1 layer
    - McCulloch-Pitts model (MP)
    - Rosenblatt's Perceptron model
    - Adaline model

    12July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • MP Model
    - McCulloch-Pitts (simplistic) neuron model
    [Figure: neuron i with inputs x_1 ... x_j ... x_N, weights w_i1 ... w_ij ... w_iN, bias b_i, net input y_in and output y_i]
    - The net function of a neuron is a weighted sum of its input signals plus a bias term.

    13July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • MP Model
    - The net function is a linear or nonlinear mapping from the input data space to an intermediate feature space
    - The most common form is a hyper-plane:

      $y_{in,i} = \sum_{j=1}^{N} w_{ij} x_j + b_i = \mathbf{w}_i^T \mathbf{x} + b_i, \qquad y_i = f(y_{in,i})$

    14July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • MP Model LimitationsMP Model LimitationsMP Model LimitationsMP Model Limitations

    y Weights fixedWeights fixedy Incapable of learningy Original model allows ONLY:g

    Binary output steps Operations at discrete time steps

    15July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Perceptron

    16July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Perceptron
    Activation:    $x = \sum_{i=1}^{M} a_i w_i$
    Output:        $s = f(x)$
    Error:         $\delta = b - s$
    Weight change: $\Delta w_i = \alpha\,\delta\,a_i$

    17July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • PerceptronPerceptron -- AdvantagesAdvantages

    y Perceptron learning law gives step-y Perceptron learning law gives stepby-step process for adjusting weightsy Perceptron convergence theoremPerceptron convergence theorem

    18July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Widrow's Adaline

    19July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Adaline (ADAptive Linear Element)
    Activation:    $x = \sum_{i=1}^{M} a_i w_i$
    Output:        $s = f(x) = x$
    Error:         $\delta = b - s = b - x$
    Weight change: $\Delta w_i = \alpha\,\delta\,a_i$
    20July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • WidrowsWidrows AdalineAdalineWidrowsWidrows AdalineAdaline

    y Analog activation value x compared with y Analog activation value x compared with target output b

    OROy Output is linear function of xy LMS learning lawgy Gradient descent algorithm

    21July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • AND Example using MP model

    x1  x2  y
    0   0   0
    0   1   0
    1   0   0
    1   1   1

    w1 = ?, w2 = ?, TH = ?

    22July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • AND Example using MP model

    x1  x2  y
    0   0   0
    0   1   0
    1   0   0
    1   1   1

    w1 = 1, w2 = 1, TH = 2

    23July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4
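A quick check of the solution on this slide (w1 = w2 = 1, TH = 2): a single MP neuron reproduces the AND truth table.

```python
def mp_neuron(x1, x2, w1=1, w2=1, threshold=2):
    """McCulloch-Pitts unit: output 1 iff the weighted sum reaches the threshold."""
    return 1 if w1 * x1 + w2 * x2 >= threshold else 0

if __name__ == "__main__":
    for x1 in (0, 1):
        for x2 in (0, 1):
            print(x1, x2, mp_neuron(x1, x2))   # matches the AND truth table
```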

  • AND Example using MP model

    x1  x2  y
    0   0   0
    0   1   0
    1   0   0
    1   1   1

    w1 = 1, w2 = 1, TH = 2  (the threshold condition gives the equation of a line)
    Why is one neuron sufficient ???

    24July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • AND Example using MP model
    [Plot: AND truth table points in the x1-x2 plane; the single y = 1 point (1,1) is separable from the y = 0 points by one line]

    x1  x2  y
    0   0   0
    0   1   0
    1   0   0
    1   1   1

    25July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • OR Example using MP model
    [Plot: OR truth table points in the x1-x2 plane; linearly separable]

    x1  x2  y
    0   0   0
    0   1   1
    1   0   1
    1   1   1

    26July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • OR Example using MP model

    x1  x2  y
    0   0   0
    0   1   1
    1   0   1
    1   1   1

    w1 = 2, w2 = 2, TH = 2

    27July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • AND-NOT Example using MP model

    x1  x2  y
    0   0   0
    0   1   0
    1   0   1
    1   1   0

    w1 = 2, w2 = -1, TH = 2

    28July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • X-OR Example using MP model

    x1  x2  y
    0   0   0
    0   1   1
    1   0   1
    1   1   0

    w1 = ?, w2 = ?, TH = ?

    29July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • X-OR Example using MP model
    [Plot: XOR truth table points in the x1-x2 plane]

    x1  x2  y
    0   0   0
    0   1   1
    1   0   1
    1   1   0

    NOT Linearly Separable !!!

    30July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • X-OR Example using MP model

    x1  x2  y
    0   0   0
    0   1   1
    1   0   1
    1   1   0

    $x_1\ \mathrm{XOR}\ x_2 = \{x_1\ \mathrm{ANDNOT}\ x_2\}\ \mathrm{OR}\ \{x_2\ \mathrm{ANDNOT}\ x_1\}$
    [Network: hidden unit z1 = x1 ANDNOT x2 (w11 = 2, w21 = -1, TH = 2), hidden unit z2 = x2 ANDNOT x1 (w12 = -1, w22 = 2, TH = 2), output unit y = z1 OR z2 (w31 = 2, w32 = 2, TH = 2)]

    31July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4
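A sketch of the decomposition x1 XOR x2 = {x1 ANDNOT x2} OR {x2 ANDNOT x1} built from three MP units, using the weights reconstructed above (2 and -1 with threshold 2 for each ANDNOT unit, 2 and 2 with threshold 2 for the OR unit).

```python
def mp_unit(inputs, weights, threshold=2):
    """McCulloch-Pitts unit with a fixed threshold."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0

def xor_net(x1, x2):
    z1 = mp_unit((x1, x2), (2, -1))   # z1 = x1 ANDNOT x2
    z2 = mp_unit((x1, x2), (-1, 2))   # z2 = x2 ANDNOT x1
    return mp_unit((z1, z2), (2, 2))  # y  = z1 OR z2

if __name__ == "__main__":
    for x1 in (0, 1):
        for x2 in (0, 1):
            print(x1, x2, xor_net(x1, x2))   # 0, 1, 1, 0: XOR needs the two-layer net
```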

  • Heat and Cold Example
    [Figure: actual inputs x1 (HOT), x2 (COLD) mapped to perceived outputs y1 (HOT), y2 (COLD)]
    Actual Input -> Perceived Output

    32July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Heat and Cold Example

    $y_2(t) = x_2(t-1)\ \mathrm{AND}\ x_2(t-2)$
    $y_1(t) = \{x_1(t-1)\}\ \mathrm{OR}\ \{x_2(t-3)\ \mathrm{ANDNOT}\ x_2(t-2)\}$

    33July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4
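A hedged simulation of the two perception equations above, coded directly as logic with time delays (rather than as the weighted network drawn on the following slides): heat is perceived if heat was applied one step earlier, or if cold was applied three steps earlier and removed two steps earlier; cold is perceived only if the cold stimulus persisted for two steps.

```python
def perceive(x1_hist, x2_hist, t):
    """x1 = heat stimulus, x2 = cold stimulus; histories are 0/1 sequences indexed by time."""
    def x(hist, i):
        return hist[i] if 0 <= i < len(hist) else 0
    y1 = x(x1_hist, t - 1) or (x(x2_hist, t - 3) and not x(x2_hist, t - 2))  # perceived heat
    y2 = x(x2_hist, t - 1) and x(x2_hist, t - 2)                             # perceived cold
    return int(y1), int(y2)

if __name__ == "__main__":
    # Case study 1 style input: a brief cold stimulus at t = 0, then removed.
    x1 = [0, 0, 0, 0, 0]
    x2 = [1, 0, 0, 0, 0]
    for t in range(5):
        print(t, perceive(x1, x2, t))   # heat is perceived a couple of steps later
```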

  • Heat and Cold Example
    [Figure: McCulloch-Pitts network with delay units implementing the two perception equations above]
    Actual Input -> Perceived Output

    34July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Heat and Cold Example
    Case Study 1: Cold stimulus applied for a small duration [network state at t=0]
    35July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Heat and Cold Example
    Case Study 1: Cold stimulus applied for a small duration [network state at t=1]
    36July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Heat and Cold Example
    Case Study 1: Cold stimulus applied for a small duration [network state at t=2]
    37July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Heat and Cold Example
    Case Study 1: Cold stimulus applied for a small duration [network state at t=3]
    38July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Heat and Cold Example
    Case Study 2: Hot stimulus applied for one time step [network state at t=0]
    39July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Heat and Cold Example
    Case Study 2: Hot stimulus applied for one time step [network state at t=1]
    40July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Heat and Cold Example
    Case Study 3: Cold stimulus applied for a longer duration [network state at t=0]
    41July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Heat and Cold Example
    Case Study 3: Cold stimulus applied for a longer duration [network state at t=1]
    42July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Heat and Cold Example
    Case Study 3: Cold stimulus applied for a longer duration [network state at t=2]
    43July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • MP ModelMP ModelMP ModelMP Modely McCulloch-Pitts (Simplistic) Neuron Model

    x1xxM

    1iwijw

    iny yjx

    xMij

    iNw y

    Th t k f ti f i

    Nxib

    y The network function of a neuron is aweighted sum of its input signals plus a biasterm.term.

    44July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • MP Model
    - The net function is a linear or nonlinear mapping from the input data space to an intermediate feature space
    - The most common form is a hyper-plane:

      $y_{in,i} = \sum_{j=1}^{N} w_{ij} x_j + b_i = \mathbf{w}_i^T \mathbf{x} + b_i, \qquad y_i = f(y_{in,i})$

    45July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Significance of bias
    [Figure: two equivalent neurons - one with an explicit bias term b_i, one with the bias realized as the weight on a constant input x_0 = 1]

    $b + x_1 w_1 + x_2 w_2 = 0 \quad\Rightarrow\quad x_2 = -\frac{w_1}{w_2}\,x_1 - \frac{b}{w_2}$

    46July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • MP Model LimitationsMP Model LimitationsMP Model LimitationsMP Model Limitations

    y Weights fixedWeights fixedy Incapable of learningy Original model allows ONLY:g

    Binary output steps Operations at discrete time steps

    47July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Hebb Rule (Hebb Net)
    Step 0: Initialize all weights to zero: $w_i = 0,\ i = 1 \ldots n$
    Step 1: For all input training vector and target output pairs, do steps 2-4
    Step 2: Set activations for inputs: $x_i = s_i$
    Step 3: Set activation for output: $y = t$
    Step 4: Adjust weights & bias: $w_i(\mathrm{new}) = w_i(\mathrm{old}) + x_i\,y$, $\ b(\mathrm{new}) = b(\mathrm{old}) + y$
    48July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4
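A runnable sketch of the Hebb-rule procedure in steps 0-4 above; the bipolar AND data used in the example is one of the cases listed later in the lecture.

```python
def hebb_train(samples):
    """Hebb net training: samples is a list of (input_vector, target) pairs (bipolar values)."""
    n = len(samples[0][0])
    w = [0.0] * n          # step 0: initialize all weights to zero
    b = 0.0
    for x, t in samples:   # step 1: loop over training pairs
        y = t              # steps 2-3: set activations for inputs and output
        w = [wi + xi * y for wi, xi in zip(w, x)]   # step 4: w_i(new) = w_i(old) + x_i * y
        b = b + y                                   #         b(new)  = b(old)  + y
    return w, b

if __name__ == "__main__":
    # Bipolar AND: the learned weights give a correct discriminating hyper-plane.
    AND = [([1, 1], 1), ([1, -1], -1), ([-1, 1], -1), ([-1, -1], -1)]
    print(hebb_train(AND))   # -> ([2.0, 2.0], -2.0)
```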

  • Artificial Neural Networks - Lec 5 & 6

    Dr. Aditya Abhyankar

  • Recap
    - ANN - definition
    - Resemblance with BNN
    - BNN - ANN comparison
    - ANN terminology
    - Neuron models
    - Learning !!

  • ANN TerminologyANN TerminologyANN highly simplified model of BNNy ANN highly simplified model of BNNy ANN has interconnected processing unitsy Summing part receives N inputs weights y Summing part receives N inputs, weights

    each value and computes weighted sumy Weighted sum activation valueWeighted sum activation valuey Positive weight excitatory inputy Negative weight inhibitory inputg g y p

  • Ne on ModelNe on ModelNeuron ModelNeuron Modely McCulloch-Pitts (Simplistic) Neuron Model

    The network function of a neuron is ay The network function of a neuron is aweighted sum of its input signals plus a biasterm.

    4July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Neuron ModelNeuron ModelNeuron ModelNeuron Modely The net function is a linear or nonlinear

    mapping from the input data space to an pp g p pintermediate feature spacey The most common form is a hyper-plane

    5July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Models of Models of Neuron Neuron 1 layer1 layery McCulloch-Pitts Model (MP)y Rosenblatts Perceptron Modely Adaline Model

    6July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • MP ModelMP ModelMP ModelMP Modely McCulloch-Pitts (Simplistic) Neuron Model

    x1xxM

    1iwijw

    iny yjx

    xMij

    iNw y

    Th t k f ti f i

    Nxib

    y The network function of a neuron is aweighted sum of its input signals plus a biasterm.term.

    7July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • MP Model LimitationsMP Model LimitationsMP Model LimitationsMP Model Limitations

    y Weights fixedWeights fixedy Incapable of learningy Original model allows ONLY:g

    Binary output steps Operations at discrete time steps

  • Perceptron
    [Figure: sensory units A_1 ... A_M with activations a_1 ... a_M and weights w_1 ... w_M feeding a summing unit that computes x; the output unit gives s = f(x)]
    Sensory Unit -> Association Unit -> Summing Unit -> Output Unit

  • PerceptronPerceptron

    M

    1

    M

    i ii

    x w a = Activation1i=( )s f x=Output

    b s = Errori iw a =Weight Change

  • PerceptronPerceptron -- AdvantagesAdvantages

    y Perceptron learning law gives step-y Perceptron learning law gives stepby-step process for adjusting weightsy Perceptron convergence theoremPerceptron convergence theorem

  • Widrow's Adaline
    [Figure: same structure as the perceptron, but with a linear output unit: s = f(x) = x]
    Sensory Unit -> Association Unit -> Summing Unit -> Output Unit

  • AdalineAdaline ((ADAptiveADAptive Linear Element)Linear Element)M

    1

    M

    i ii

    x w a = Activation1i=

    ( )s f x x= =Outputb s b x = = Error

    i iw a =Weight Change

  • Wid Ad liWid Ad liWidrows AdalineWidrows Adaline

    y Analog activation value x compared with y Analog activation value x compared with target output b

    OROy Output is linear function of xy LMS learning lawgy Gradient descent algorithm

  • Heat and Cold Heat and Cold ExampleExampleHeat and Cold Heat and Cold ExampleExample

    HOT HOT2

    1x 1y2

    -1 NN

    COLD COLD22 1

    2x 2y2

    1

    1N N

    Actual Input Perceived Output

    15July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • HebbHebb Rule (Rule (HebbHebb Net)Net)HebbHebb Rule (Rule (HebbHebb Net)Net)

    Initialize all weights to zero0 0, 1iw i n= = for {all input training vectors and target

    t t i d t 2 4 }1 output pairs do steps 2 4 }1

    S t A ti ti f I t2 i ix s= y t=Set Activation for Inputs2

    Set Activation for Outputs3

    i i y t

    ( ) ( )i i iw n w o x y= +Adjust weights & bias4

    ( ) ( )i i i y

    ( ) ( )b n b o y= +16July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Examples
    - AND logic
    - OR logic
    - AND-NOT logic
    - 3-D example where Hebb fails
    - X-OR !!

  • Concepts - Hebbian Learning
    - A discriminating hyper-plane is constituted by the combination of the summing unit and the output unit.
    - 0 targets are difficult to learn!
    - Bi-polar notion preferred
    - Hebb Rule doesn't give a direction of learning!
    - Concept of a bias!!

  • Perceptron Learning
    - Step 0: Initialize weights and bias (for simplicity, set w's and b to 0). Set learning rate $0 < \alpha \le 1$
    - Step 1: While stopping condition is false
    - Step 2: For each training pair s : t
    - Step 3: Set activations of the input units: $x_i = s_i$

  • Perceptron Learning
    - Step 4: Compute the response of the output unit:
      $y_{in} = b + \sum_i x_i w_i$
      $y = 1$ if $y_{in} > \theta$;  $y = 0$ if $-\theta \le y_{in} \le \theta$;  $y = -1$ if $y_{in} < -\theta$

  • Perceptron Learning
    - Step 5: Update weights and bias if an error occurred for the given pattern, i.e. if $y \ne t$:
      $w_i(\mathrm{new}) = w_i(\mathrm{old}) + \alpha\,t\,x_i$
      $b(\mathrm{new}) = b(\mathrm{old}) + \alpha\,t$
    - Step 6: If there is no weight change for the entire epoch, stop!
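A runnable sketch of steps 0-6 of the perceptron learning rule above, with the three-valued output using the dead band theta and the stop-when-no-weight-change condition; theta = 0.2 and alpha = 1 here simply mirror the worked problems that follow.

```python
def perceptron_train(samples, alpha=1.0, theta=0.2, max_epochs=100):
    """Steps 0-6 of the perceptron learning rule (bipolar targets assumed)."""
    n = len(samples[0][0])
    w, b = [0.0] * n, 0.0                       # step 0: weights and bias start at 0
    for epoch in range(max_epochs):             # step 1: loop until the stopping condition
        changed = False
        for s, t in samples:                    # step 2: for each training pair s : t
            x = list(s)                         # step 3: set activations of the input units
            y_in = b + sum(xi * wi for xi, wi in zip(x, w))          # step 4: net input
            y = 1 if y_in > theta else (-1 if y_in < -theta else 0)  # three-valued output
            if y != t:                          # step 5: update only when an error occurred
                w = [wi + alpha * t * xi for wi, xi in zip(w, x)]
                b = b + alpha * t
                changed = True
        if not changed:                         # step 6: no weight change in the epoch -> stop
            return w, b, epoch + 1
    return w, b, max_epochs

if __name__ == "__main__":
    AND = [([1, 1], 1), ([1, 0], -1), ([0, 1], -1), ([0, 0], -1)]  # binary inputs, bipolar targets
    print(perceptron_train(AND))
```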

  • Problem!
    - AND function, bipolar i/ps, bipolar targets, $\alpha = 1$, $b = 0$, $w_1 = w_2 = 0$

  • Problem!
    - AND function, binary i/ps, bipolar targets, $\alpha = 1$, initial weights and bias 0, $\theta = 0.2$

  • Problem!
    - AND function, binary i/ps, bipolar targets, $\alpha = 1$, initial weights and bias 0, $\theta = 0.2$
    Results for the third epoch

  • Problem!
    - AND function, binary i/ps, bipolar targets, $\alpha = 1$, initial weights and bias 0, $\theta = 0.2$
    Results of the tenth epoch

  • Artificial Neural Networks - Lec 7 & 8
    Dr. Aditya Abhyankar

    1ANN - Dr. Abhyankar - Lec 7 & 8

  • Today
    - Conceptual drill from last time
    - PLR-CT
    - Delta Rule
    - Adaline learning

    2ANN - Dr. Abhyankar - Lec 7 & 8

  • Concepts Concepts HebbianHebbian LearningLearningConcepts Concepts HebbianHebbian LearningLearning

    y A discriminating hyper plane is constituted y A discriminating hyper plane is constituted by the combination of summing unit and output unit.y 0 targets are difficult to learn!y Bi-polar notion preferred

    Hebb R le doesnt gi e di ection of y Hebb Rule doesnt give direction of learning!y Concept of a bias!!Concept of a bias!!

    3ANN - Dr. Abhyankar - Lec 7 & 8

  • HebbHebb Rule (Rule (HebbHebb Net)Net)HebbHebb Rule (Rule (HebbHebb Net)Net)

    Initialize all weights to zero0 0, 1iw i n= = for {all input training vectors and target

    t t i d t 2 4 }1 output pairs do steps 2 4 }1

    S t A ti ti f I t2 i ix s= y t=Set Activation for Inputs2

    Set Activation for Outputs3

    i i y t

    ( ) ( )i i iw n w o x y= +Adjust weights & bias4

    ( ) ( )i i i y

    ( ) ( )b n b o y= +4ANN - Dr. Abhyankar - Lec 7 & 8

  • ExamplesExamples

    AND logicy AND logicy OR logicy AND-NOT logicy AND-NOT logicy 3-D example where Hebb failsy X-OR !!y X OR !!

    5ANN - Dr. Abhyankar - Lec 7 & 8

  • X-OR Example using Hebbian

    x1  x2   y
     1   1  -1
     1  -1   1
    -1   1   1
    -1  -1  -1

    w1 = ?, w2 = ?, TH = 0

    6ANN - Dr. Abhyankar - Lec 7 & 8

  • X-OR Example using MP model
    [Plot: the four bipolar XOR patterns in the x1-x2 plane]
    NOT Linearly Separable !!!

    7ANN - Dr. Abhyankar - Lec 7 & 8

  • X-OR Example using MP model

    x1  x2   y
     1   1  -1
     1  -1   1
    -1   1   1
    -1  -1  -1

    $x_1\ \mathrm{XOR}\ x_2 = \{x_1\ \mathrm{ANDNOT}\ x_2\}\ \mathrm{OR}\ \{x_2\ \mathrm{ANDNOT}\ x_1\}$
    [Network: hidden units z1, z2 and output unit y, with weights w11, w21, b1, w12, w22, b2, w31, w32, b3 still to be determined]

    8ANN - Dr. Abhyankar - Lec 7 & 8

  • X-OR Example using MP model - {x1 ANDNOT x2}
    Hebb-rule training table (columns: x1, x2, t | Δw1, Δw2, Δb | w1, w2, b), starting from w1 = w2 = b = 0.
    Final weights after one pass through the four bipolar patterns: w1 = 2, w2 = -2, b = -2.

    9ANN - Dr. Abhyankar - Lec 7 & 8

  • X-OR Example using MP model - {x1 ANDNOT x2}
    [Plot: decision boundary x2 = x1 - 1]

    10ANN - Dr. Abhyankar - Lec 7 & 8

  • X-OR Example using MP model - {x2 ANDNOT x1}
    Hebb-rule training table (columns: x1, x2, t | Δw1, Δw2, Δb | w1, w2, b), starting from w1 = w2 = b = 0.
    Final weights after one pass through the four bipolar patterns: w1 = -2, w2 = 2, b = -2.

    11ANN - Dr. Abhyankar - Lec 7 & 8

  • X-OR Example using MP model - {x2 ANDNOT x1}
    [Plot: decision boundary x2 = x1 + 1]

    12ANN - Dr. Abhyankar - Lec 7 & 8

  • X-OR Example using MP model
    [Plot: the two decision boundaries x2 = x1 - 1 (x1 ANDNOT x2) and x2 = x1 + 1 (x2 ANDNOT x1) shown together with the bipolar XOR targets]

    13ANN - Dr. Abhyankar - Lec 7 & 8

  • XX--OR Example using MP modelOR Example using MP modelXX OR Example using MP modelOR Example using MP model

    2x2

    1x1x

    14ANN - Dr. Abhyankar - Lec 7 & 8

  • XX--OR Example using MP modelOR Example using MP modelXX OR Example using MP modelOR Example using MP model1 2

    1 1 11 1 1

    x x y 1 ?b =

    1x 11 ?w = 31 ?w =1 1 1

    1 1 11 1 1

    y12 ?w = 21 ?w =

    2xy

    22 ?w =

    32 ?w =3 ?b =22

    1 2 1 2 2 1( ) { ( ) } { ( ) }x XOR x x ANDNOT x OR x ANDNOT x=2 ?b =

    15ANN - Dr. Abhyankar - Lec 7 & 8

  • X-OR Example using MP model
    $x_1\ \mathrm{XOR}\ x_2 = \{x_1\ \mathrm{ANDNOT}\ x_2\}\ \mathrm{OR}\ \{x_2\ \mathrm{ANDNOT}\ x_1\}$
    Hebb-rule training table for the output (OR) unit (columns: inputs, t | weight changes | weights).

    16ANN - Dr. Abhyankar - Lec 7 & 8

  • XX--OR Example using MP modelOR Example using MP modelXX OR Example using MP modelOR Example using MP model1 2{ ( ) }x ANDNOT x

    2x2x

    2 1 1x x= x1x

    17ANN - Dr. Abhyankar - Lec 7 & 8

  • XX--OR Example using MP modelOR Example using MP modelXX OR Example using MP modelOR Example using MP model1 2{ ( ) }x ANDNOT x

    2x2x

    2 1 1x x= x1x

    18ANN - Dr. Abhyankar - Lec 7 & 8

  • Pe cept on Lea ningPe cept on Lea ningPerceptron LearningPerceptron Learning

    y Step 0: Initialize weights and bias (for y Step 0: Initialize weights and bias (for simplicity set ws and bs to 0). Set learning rate 0 1< gy Step 1: While stopping condition is falsey Step 2: For each training pair :s ty Step 3: Set activation to Input

    i ix s=

  • Pe cept on Lea ningPe cept on Lea ningPerceptron LearningPerceptron Learningy Step 4: Compute response of o/p unitp p p p

    in i iy b x w= +i

    1 iny >0in

    in

    yy y

    = 1 iny

  • Pe cept on Lea ningPe cept on Lea ningPerceptron LearningPerceptron Learning

    y Step 5: Update error and bias if error p poccurred for the given pattern, if y t

    ( ) ( )w n w o tx+( ) ( )i i iw n w o tx= +( ) ( )b n b o t= +

    y Step 6: For no weight change (for the

    ( ) ( )b n b o t= +p g g (

    entire epoch) stop!

  • Problem!
    - AND function, bipolar i/ps, bipolar targets, $\alpha = 1$, $b = 0$, $w_1 = w_2 = 0$
    - Let's solve this using the advanced notion of bias!

  • Significance of bias
    [Figure: two equivalent neurons - one with an explicit bias term b_i, one with the bias realized as the weight on a constant input x_0 = 1]

    $b + x_1 w_1 + x_2 w_2 = 0 \quad\Rightarrow\quad x_2 = -\frac{w_1}{w_2}\,x_1 - \frac{b}{w_2}$

    23July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Significance of bias
    [Figure: the bias b_i represented as the weight on an extra constant input x_0 = 1]
    24July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Problem!Problem!

    y AND function, binary i/ps, bipolar targets, 1 0 0 2b 1, 0, 0.2b = = =

  • Problem!Problem!

    y AND function, binary i/ps, bipolar targets, 1, 0, 0.2b = = =

    Results for third epoch

  • Problem!Problem!

    y AND function, binary i/ps, bipolar targets, 1, 0, 0.2b = = =

    Results of tenth epoch

  • Perceptron Convergence Theorem
    If there is a weight vector w* such that f(x(p)·w*) = t(p) for all p, then for any starting vector w, the PLR will converge to a weight vector that gives the correct response to all training patterns, in a finite number of steps.

    28ANN - Dr. Abhyankar - Lec 7 & 8

  • Perceptron Convergence Theorem

    $w(k)\cdot w^* \ge w(0)\cdot w^* + k\,m, \qquad m = \min_p\{x(p)\cdot w^*\}$

    $\|w(k)\|^2 \ge \dfrac{(w(k)\cdot w^*)^2}{\|w^*\|^2} \ge \dfrac{(w(0)\cdot w^* + k\,m)^2}{\|w^*\|^2}$

    29ANN - Dr. Abhyankar - Lec 7 & 8

  • Perceptron Convergence Theorem

    $\|w(k)\|^2 \le \|w(0)\|^2 + k\,M^2, \qquad M = \max_p\{\|x(p)\|\}$

    $\dfrac{(w(0)\cdot w^* + k\,m)^2}{\|w^*\|^2} \le \|w(k)\|^2 \le \|w(0)\|^2 + k\,M^2$

    Since the lower bound grows as $k^2$ and the upper bound only as $k$, the number of updates $k$ must be finite.

    30ANN - Dr. Abhyankar - Lec 7 & 8

  • Adaline - Delta Learning Rule
    - Step 0: Initialize weights and bias (for simplicity, set w's and b to 0). Set learning rate $0 < \alpha \le 1$
    - Step 1: While stopping condition is false
    - Step 2: For each training pair s : t
    - Step 3: Set activations of the input units: $x_i = s_i$

  • Adaline - Delta Learning Rule
    - Step 4: Compute the response of the output unit:
      $y_{in} = b + \sum_i x_i w_i, \qquad y = y_{in}$

  • Adaline - Delta Learning Rule
    - Step 5: Update weights and bias if an error occurred for the given pattern, i.e. if $y \ne t$:
      $w_i(\mathrm{new}) = w_i(\mathrm{old}) + \alpha\,(t - y)\,x_i$
      $b(\mathrm{new}) = b(\mathrm{old}) + \alpha\,(t - y)$
    - Step 6: If the largest weight change is smaller than a specified tolerance, stop!

  • Problem!
    - AND function, bipolar i/ps, bipolar targets, $\alpha = 1$, $b = 0$, $w_1 = w_2 = 0$

  • Artificial Neural Networks - Lec 9
    Dr. Aditya Abhyankar

    1ANN - Dr. Abhyankar - Lec 7 & 8

  • Last Time
    - Conceptual drill!!
    - PLR-CT
    - Delta Rule
    - Adaline learning

    2ANN - Dr. Abhyankar - Lec 7 & 8

  • Today
    - Delta Rule
    - Adaline learning
    - MATLAB Demo
    - Madaline philosophy
    - Extended Delta Rule - BP

    3ANN - Dr. Abhyankar - Lec 7 & 8

  • Delta Rule
    - The Delta Rule changes the weights on the neuron connections to minimize the difference between y_in and t
    - Aims at reducing the error across all the training patterns (exemplars)
    - The squared error for a particular training pattern can be given as
      $E = (t - y_{in})^2$

    4ANN - Dr. Abhyankar - Lec 7 & 8

  • Delta Rule
    - E is a function of all the weights
    - The gradient of E is a vector consisting of the partial derivatives of E with respect to each weight
    - The gradient gives the direction of most rapid increase in E; we wish to minimize E!
    - Hence calculate: $\dfrac{\partial E}{\partial w_I}$

    5ANN - Dr. Abhyankar - Lec 7 & 8

  • Delta Rule

    $\dfrac{\partial E}{\partial w_I} = \dfrac{\partial}{\partial w_I}(t - y_{in})^2 = -2\,(t - y_{in})\,\dfrac{\partial y_{in}}{\partial w_I} = -2\,(t - y_{in})\,x_I$

    - The local error will be reduced most rapidly by adjusting the weights as per the delta rule:
      $\Delta w_I = \alpha\,(t - y_{in})\,x_I$

    6ANN - Dr. Abhyankar - Lec 7 & 8
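A sketch of the delta-rule update that follows from the gradient above: since dE/dw_I = -2(t - y_in)x_I, moving against the gradient gives Δw_I = α(t - y_in)x_I. The loop over the bipolar AND patterns is illustrative.

```python
def delta_rule_step(x, w, b, t, alpha=0.1):
    """One delta-rule (LMS) update: move the weights down the gradient of E = (t - y_in)**2."""
    y_in = b + sum(xi * wi for xi, wi in zip(x, w))       # linear (Adaline) response
    err = t - y_in
    w = [wi + alpha * err * xi for wi, xi in zip(w, x)]   # delta w_I = alpha * (t - y_in) * x_I
    b = b + alpha * err
    return w, b, err ** 2                                 # squared error before the update

if __name__ == "__main__":
    w, b = [0.1, 0.1], 0.1
    data = [([1, 1], 1), ([1, -1], -1), ([-1, 1], -1), ([-1, -1], -1)]   # bipolar AND
    for x, t in data * 20:
        w, b, e = delta_rule_step(x, w, b, t)
    print(w, b)   # approaches the least-squares solution w1 = w2 = 0.5, b = -0.5
```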

  • Ad liAd li D lt L i R lD lt L i R lAdalineAdaline Delta Learning RuleDelta Learning Rule

    y Step 0: Initialize weights and bias (for y Step 0: Initialize weights and bias (for simplicity set ws and bs to small value). Set learning rate 0 1< gy Step 1: While stopping condition is falsey Step 2: For each training pair :s ty Step 3: Set activation to Input

    i ix s=

  • Ad liAd li D lt L i R lD lt L i R lAdalineAdaline Delta Learning RuleDelta Learning Rule

    y Step 4: Compute response of o/p unitp p p p

    i i iy b x w= +in i ii

    y b x w+iny y=

  • Ad liAd li D lt L i R lD lt L i R lAdalineAdaline Delta Learning RuleDelta Learning Rule

    y Step 5: Update error and bias if error p poccurred for the given pattern, if y t

    ( ) ( ) ( )w n w o t y x+( ) ( ) ( )i i iw n w o t y x= + ( ) ( ) ( )b n b o t y= +

    y Step 6: If largest weight change is smaller

    ( ) ( ) ( )b n b o t y= + y Step 6: If largest weight change is smaller

    than specified tolerance then stop!

  • Problem!
    - AND function, bipolar i/ps, bipolar targets, $\alpha = 0.1$, $b = w_1 = w_2 = 0.1$

  • Madaline Architecture
    [Figure: inputs x1, x2 feed two Adaline units Z1, Z2 (weights w11, w21, w12, w22; biases b1, b2); their outputs feed the output unit Y through weights v1, v2 with bias b3]

    11July 20, 2010 ANN - Dr. Abhyankar - Lec 3 & 4

  • Madaline Learning (MRI)

  • Madaline Learning (MRI)

  • Madaline Learning (MRI)

  • Madaline Learning (MRI)

  • Problem!
    - X-OR function, bipolar i/ps, bipolar targets
    - Initial values as listed on the slide: 0.5, 0.3, 0.05, 0.2 (b1, w11, w21, ...); 0.5, 0.1, 0.2 (b2, w12, w22); b3 = v1 = v2 = 0.5

  • Artificial Neural Networks - 10

    Dr. Aditya Abhyankar

  • Back-Propagation (BP)

    - Aims at balancing memorization and generalization
    - Stage 1: Feed-forward of the input training pattern
    - Stage 2: Calculation and back-propagation of the associated error
    - Stage 3: Adjustment of the weights

  • Architecture

    [Figure: network with input, hidden, and output layers]

  • Nomenclature

  • Nomenclature

  • Activation Function

    Characteristics: continuous; differentiable; monotonically non-decreasing; easily differentiable

  • Binary Sigmoid Function - range (0,1)

  • Bipolar Sigmoid Function - range (-1,1)
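Hedged implementations of the two activation functions named on these slides, the binary sigmoid with range (0,1) and the bipolar sigmoid with range (-1,1), together with the derivatives that back-propagation needs later.

```python
import math

def binary_sigmoid(x):
    """f(x) = 1 / (1 + exp(-x)), range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def binary_sigmoid_deriv(x):
    f = binary_sigmoid(x)
    return f * (1.0 - f)                 # f'(x) = f(x) * (1 - f(x))

def bipolar_sigmoid(x):
    """f(x) = 2 / (1 + exp(-x)) - 1, range (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-x)) - 1.0

def bipolar_sigmoid_deriv(x):
    f = bipolar_sigmoid(x)
    return 0.5 * (1.0 + f) * (1.0 - f)   # f'(x) = 0.5 * (1 + f(x)) * (1 - f(x))

if __name__ == "__main__":
    for x in (-2.0, 0.0, 2.0):
        print(x, binary_sigmoid(x), bipolar_sigmoid(x))
```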

  • Algorithm: Training

  • Algorithm

  • Algorithm

  • Algorithm

  • Algorithm

  • Application

  • Example

    - X-OR problem (not linearly separable) using a 2-4-1 backprop net
    - Initial weights to the hidden layer
    - Initial weights to the o/p layer
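A compact, hedged sketch of the three back-propagation stages (feed-forward, back-propagated error, weight adjustment) on a 2-4-1 network for the XOR problem. The random initial weights, learning rate, and epoch count are illustrative assumptions, not the specific numbers used in the worked example on the following slides; convergence may need more epochs with a different seed.

```python
import math, random

def f(x):
    # binary sigmoid activation
    return 1.0 / (1.0 + math.exp(-x))

def train_xor(epochs=5000, alpha=0.5, hidden=4, seed=0):
    rng = random.Random(seed)
    # v[j]: weights into hidden unit j (2 inputs + bias); w: weights into the output (hidden + bias)
    v = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(hidden)]
    w = [rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)]
    data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
    for _ in range(epochs):
        for x, t in data:
            # Stage 1: feed-forward of the input training pattern
            z = [f(vj[0] * x[0] + vj[1] * x[1] + vj[2]) for vj in v]
            y = f(sum(wj * zj for wj, zj in zip(w, z)) + w[-1])
            # Stage 2: calculation and back-propagation of the associated error
            dk = (t - y) * y * (1 - y)                                       # output error term
            dj = [dk * w[j] * z[j] * (1 - z[j]) for j in range(hidden)]      # hidden error terms
            # Stage 3: adjustment of the weights
            for j in range(hidden):
                w[j] += alpha * dk * z[j]
                v[j][0] += alpha * dj[j] * x[0]
                v[j][1] += alpha * dj[j] * x[1]
                v[j][2] += alpha * dj[j]                                     # hidden bias
            w[-1] += alpha * dk                                              # output bias
    return v, w

if __name__ == "__main__":
    v, w = train_xor()
    for x, t in [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]:
        z = [f(vj[0] * x[0] + vj[1] * x[1] + vj[2]) for vj in v]
        y = f(sum(wj * zj for wj, zj in zip(w, z)) + w[-1])
        print(x, t, round(y, 3))
```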

  • Solution
    [Figure: 2-4-1 network - input units x1, x2; hidden units z1, z2, z3, z4; output unit y1]

  • Solution
    [Figure: the same network with the layers labeled Input Layer, Hidden Layer, Output Layer]

  • Solution
    [Figure: hidden-layer weights v11 ... v14 added]

  • Solution
    [Figure: remaining hidden-layer weights v21 ... v24 added]

  • Solution
    [Figure: output-layer weights w11, w21, w31, w41 added]

  • Solution
    [Figure: bias weights v01 ... v04 and w01 added]

  • Example

    - X-OR problem (not linearly separable) using a 2-4-1 backprop net
    - Initial weights to the hidden layer
    - Initial weights to the o/p layer

  • Solution Training: step1

    y Output01 0.1401w = 0.49190.2913

    w

    = 1y Layer11w 21w 31w 41w

    1

    010.39790.3581

    1z 2z 3z 4z13v 14

    v21v 22v

    11v12v 13

    v21 22v23v 24v

    01v02v

    03v

    1x 2x103v

    04v

  • Solution Training: step1

    y Output01 0.1401w = 0.49190.2913

    w

    = 1y Layer11w 21w 31w 41w

    1

    010.39790.3581

    1z 2z 3z 4z13v 14

    v21v 22v

    11v12v 13

    v21 22v23v 24v

    01v02v

    03v

    1x 2x103v

    04v

    [ ]0 0.3378 0.2771 0.2859 0.3329v =

  • Solution Training: step1

    y Output01 0.1401w = 0.49190.2913

    w

    = 1y Layer11w 21w 31w 41w

    1

    010.39790.3581

    1z 2z 3z 4z13v 14

    v21v 22v

    11v12v 13

    v21 22v23v 24v

    01v02v

    03v

    1x 2x103v

    04v[ ]0 0 3378 0 2771 0 2859 0 3329v = [ ]0 0.3378 0.2771 0.2859 0.3329v 0.197 0.3191 0.1448 0.3394

    0.3099 0.1904 0.0347 0.4861v

    =

  • Solution Training: step1

    y Output01 0.1401w = 0.49190.2913

    w

    = 1y Layer1

    010.39790.3581

    3zin = i

    1z 2z 3z 4z1 2

    1 1 0x x t

    0 1691i

    2

    0.7866zin =

    3

    0.1064zin

    4

    0.4796zin =

    1 0 10 1 1

    1 0.1691zin =

    1x 2x10 0 0

    [ ]0 0.3378 0.2771 0.2859 0.3329v = 0.197 0.3191 0.1448 0.33940.3099 0.1904 0.0347 0.4861

    v =

  • Solution Training: step1

    y Output01 0.1401w = 0.49190.2913

    w

    = 1y Layer1

    010.39790.3581

    3zin = i

    1z 2z 3z 4z1 2

    1 1 0x x t

    0 1691i

    2

    0.7866zin =

    3

    0.1064zin

    4

    0.4796zin =

    1 0 10 1 1

    1 0.1691zin =

    1x 2x10 0 0

    [ ]0 0.3378 0.2771 0.2859 0.3329v = 0.197 0.3191 0.1448 0.33940.3099 0.1904 0.0347 0.4861

    v =

  • Solution Training: step1

    y Output01 0.1401w = 0.49190.2913

    w

    = 1y Layer1

    010.39790.3581

    3z =

    1z 2z 3z 4z1 2

    1 1 0x x t

    0 5422

    2

    0.6871z =

    3

    0.5266z

    4

    0.3823z =

    1 0 10 1 1

    1 0.5422z =

    1x 2x10 0 0

    [ ]0 0.3378 0.2771 0.2859 0.3329v = 0.197 0.3191 0.1448 0.33940.3099 0.1904 0.0347 0.4861

    v =

  • Solution Training: step1

    y Output01 0.1401w = 0.49190.2913

    w

    =

    1 0.1462yin =

    1y Layer1

    010.39790.3581

    3z =

    1z 2z 3z 4z1 2

    1 1 0x x t

    0 5422

    2

    0.6871z =

    3

    0.5266z

    4

    0.3823z =

    1 0 10 1 1

    1 0.5422z =

    1x 2x10 0 0

    [ ]0 0.3378 0.2771 0.2859 0.3329v = 0.197 0.3191 0.1448 0.33940.3099 0.1904 0.0347 0.4861

    v =

  • Solution Training: step1

    y Output01 0.1401w = 0.49190.2913

    w

    =

    1 0.1462yin = 1 0.4635y =

    1y Layer1

    010.39790.3581

    3z =

    1z 2z 3z 4z1 2

    1 1 0x x t

    0 5422

    2

    0.6871z =

    3

    0.5266z

    4

    0.3823z =

    1 0 10 1 1

    1 0.5422z =

    1x 2x10 0 0

    [ ]0 0.3378 0.2771 0.2859 0.3329v = 0.197 0.3191 0.1448 0.33940.3099 0.1904 0.0347 0.4861

    v =

  • Solution Training: step2

    y Output01 0.1401w = 0.49190.2913

    w

    =

    1 0.1462yin = 1 0.4635y =

    1y Layer1

    010.39790.3581

    3z =

    0.1153k = -0.0012-0.0016

    1z 2z 3z 4z1 2

    1 1 0x x t

    0 5422

    2

    0.6871z =

    3

    0.5266z

    4

    0.3823z =-0.0012

    -0.0009

    w = 1 0 10 1 1

    1 0.5422z =

    1x 2x10 0 0

    [ ]0 0.3378 0.2771 0.2859 0.3329v = 0.197 0.3191 0.1448 0.33940.3099 0.1904 0.0347 0.4861

    v =

  • Solution Training: step2

    y Output01 0.1401w = 0.49190.2913

    w

    =

    1 0.1462yin = 1 0.4635y =

    1y Layer1

    010.39790.3581

    3z =

    0.1153k = -0.0012-0.0016

    1z 2z 3z 4z1 2

    1 1 0x x t

    0 5422

    2

    0.6871z =

    3

    0.5266z

    4

    0.3823z =-0.0012

    -0.0009

    w = 1 0 10 1 1

    1 0.5422z =

    1x 2x10 0 0

    [ ]0 0.3378 0.2771 0.2859 0.3329v = 0.197 0.3191 0.1448 0.33940.3099 0.1904 0.0347 0.4861

    v =

  • Solution Training: step2

    y Output01 0.1401w = 0.49190.2913

    w

    =

    1 0.1462yin = 1 0.4635y =01 0.0023w =

    1y Layer1

    010.39790.3581

    3z =

    0.1153k = -0.0012-0.0016

    1z 2z 3z 4z1 2

    1 1 0x x t

    0 5422

    2

    0.6871z =

    3

    0.5266z

    4

    0.3823z =-0.0012

    -0.0009

    w = 1 0 10 1 1

    1 0.5422z =

    1x 2x10 0 0

    [ ]0 0.3378 0.2771 0.2859 0.3329v = 0.197 0.3191 0.1448 0.33940.3099 0.1904 0.0347 0.4861

    v =

  • Solution Training: step2

    y Output01 0.1401w = 0.49190.2913

    w

    =

    1 0.1462yin = 1 0.4635y =01 0.0023w =

    1y Layer1

    010.39790.3581

    3z =

    0.1153k = -0.0012-0.0016

    1z 2z 3z 4z1 2

    1 1 0x x t

    0 5422

    2

    0.6871z =

    3

    0.5266z

    4

    0.3823z =-0.0012

    -0.0009

    w = 1 0 10 1 1

    1 0.5422z =

    1x 2x10 0 0

    [ ]0 0.3378 0.2771 0.2859 0.3329v = 0.197 0.3191 0.1448 0.33940.3099 0.1904 0.0347 0.4861

    v =

  • Solution Training: step2

    y Output01 0.1401w = 0.49190.2913

    w

    =

    1 0.1462yin = 1 0.4635y =01 0.0023w =

    1y Layer1

    010.39790.3581

    3z =

    0.1153k = -0.0012-0.0016

    1z 2z 3z 4z1 2

    1 1 0x x t

    0 5422

    2

    0.6871z =

    3

    0.5266z

    4

    0.3823z =-0.0012

    -0.0009

    w = 1 0 10 1 1

    1 0.5422z =

    -0.0567 1x 2x1

    0 0 00.05670.03360.0459

    in =

    [ ]0 0.3378 0.2771 0.2859 0.3329v = 0.197 0.3191 0.1448 0.33940.3099 0.1904 0.0347 0.4861

    v =

    -0.0413

  • Solution Training: step2

    y Output01 0.1401w = 0.49190.2913

    w

    =

    1 0.1462yin = 1 0.4635y =01 0.0023w =

    1y Layer1

    010.39790.3581

    3z =

    0.1153k = -0.0012-0.0016

    1z 2z 3z 4z1 2

    1 1 0x x t

    0 5422

    2

    0.6871z =

    3

    0.5266z

    4

    0.3823z =-0.0012

    -0.0009

    w = 1 0 10 1 1

    1 0.5422z =

    -0.0567 -0.0141 1x 2x1

    0 0 00.05670.03360.0459

    in =

    0.00720.01140 0097

    j =

    [ ]0 0.3378 0.2771 0.2859 0.3329v = 0.197 0.3191 0.1448 0.33940.3099 0.1904 0.0347 0.4861

    v =

    -0.0413 -0.0097

  • Solution Training: step2

    y Output01 0.1401w = 0.49190.2913

    w

    =

    1 0.1462yin = 1 0.4635y =01 0.0023w =

    1y Layer1

    010.39790.3581

    1 21 1 0x x t

    3z =0.1153k = -0.0012

    -0.0016

    1z 2z 3z 4z

    1 1 01 0 10 1 10 5422

    2

    0.6871z =

    3

    0.5266z

    4

    0.3823z =-0.0012

    -0.0009

    w = 0 0 0

    1 0.5422z =-0.05670.0336

    i

    1x 2x1 0.197 0.3191 0.1448 0.33940.3099 0.1904 0.0347 0.4861v = 0.0459-0.0413

    in =

    [ ]0 0.3378 0.2771 0.2859 0.3329v = -0.2815 0.1444 0.2287 -0.1950-0.2815 0.1444 0.2287 -0.1950

    v = * 1.0e-003

  • Solution Training: step2

    y Output01 0.1401w = 0.49190.2913

    w

    =

    1 0.1462yin = 1 0.4635y =01 0.0023w =

    1y Layer1

    010.39790.3581

    1 21 1 0x x t

    3z =0.1153k = -0.0012

    -0.0016

    1z 2z 3z 4z

    1 1 01 0 10 1 10 5422

    2

    0.6871z =

    3

    0.5266z

    4

    0.3823z =-0.0012

    -0.0009

    w = 0 0 0

    1 0.5422z =-0.05670.0336

    i

    1x 2x1 0.197 0.3191 0.1448 0.33940.3099 0.1904 0.0347 0.4861v = 0.0459-0.0413

    in = [ ]0 3378 0 2771 0 2859 0 3329[ ]0 -0.2815 0.1444 0.2287 -0.1950v =

    -0.2815 0.1444 0.2287 -0.1950-0.2815 0.1444 0.2287 -0.1950

    v = [ ]0 0.3378 0.2771 0.2859 0.3329v =

    3*10 3*10

  • Solution Training: step3

    y Output01 0.1424w = 0.4907-0.2929

    w

    =

    1 0.1462yin = 1 0.4635y =01 0.0023w =

    1y Layer1

    01-0.39910.3572 1 2x x t3z =

    0.1153k = -0.0012-0.0016

    1z 2z 3z 4z1 2

    1 1 01 0 1

    x x t

    0 5422

    2

    0.6871z =

    3

    0.5266z

    4

    0.3823z =-0.0012

    -0.0009

    w = 0 1 10 0 0

    1 0.5422z =-0.05670.0336

    i

    1x 2x1[ ]-0 2815 0 1444 0 2287 -0 1950v =

    0.0459-0.0413

    in = -0.2815 0.1444 0.2287 -0.1950-0.2815 0.1444 0.2287 -0.1950

    v = * 1.0e-003 [ ]0 0.2815 0.1444 0.2287 0.1950v =

    0.1967 0.3192 -0.1446 0.33920.3096 0.1905 -0.0345 -0.4863

    v = [ ]0 -0.3381 0.2772 0.2861 -0.3331v =

  • Solution Training: step3

    y Output01 0.1393w = 0.4923-0.2908

    w

    = 1y Layer1

    01-0.39750.3584 1 2x x t

    1z 2z 3z 4z1 2

    1 1 01 0 1

    x x t

    0 1 10 0 0

    1x 2x10.1970 0.3191 -0.1448 0.33940.3100 0.1904 -0.0347 -0.4861

    v = [ ]0 -0.3377 0.2770 0.2858 -0.3328v =

  • Application

  • Artificial Neural Networks - Lec 11 & 12
    Dr. Aditya Abhyankar

  • Back-Propagation (BP)
    - Aims at balancing memorization and generalization
    - Stage 1: Feed-forward of the input training pattern
    - Stage 2: Calculation and back-propagation of the associated error
    - Stage 3: Adjustment of the weights

  • Architecture
    [Figure: network with input, hidden, and output layers]

  • Nomenclature
    x = input training vector $(x_1, \ldots, x_i, \ldots, x_n)$
    t = output target vector $(t_1, \ldots, t_k, \ldots, t_m)$
    $\delta_k$ = portion of the error-correction weight adjustment for $w_{jk}$, for the error at output unit $Y_k$, to be propagated back
    $\delta_j$ = portion of the error-correction weight adjustment for $v_{ij}$, for the error at hidden unit $Z_j$, to be propagated back
    $\alpha$ = learning rate
    $X_i$ = input unit i

  • Nomenclature

  • Activation Function
    - Characteristics: continuous; differentiable; monotonically non-decreasing; easily differentiable

  • Binary Sigmoid Function - range (0,1)

  • Bipolar Sigmoid Function - range (-1,1)

  • Algorithm: Training

  • Algorithm

  • Algorithm

  • Algorithm

  • Algorithm

  • Application

  • Application

  • Example
    - X-OR problem (not linearly separable) using a 2-4-1 backprop net
    - Initial weights to the hidden layer
    - Initial weights to the o/p layer

  • SolutionSolutionSolutionSolution

    1y

    1z 2z 3z 4z

    1x 2x

  • SolutionSolutionSolutionSolution

    y Output1y Layer

    1z 2z 3z 4zHiddenLayery

    1x 2x InputLayer

  • SolutionSolutionSolutionSolution

    y Output1y Layer

    1z 2z 3z 4zHiddenLayer

    13v 14vy

    11v12v 13

    v

    1x 2x InputLayer

  • SolutionSolutionSolutionSolution

    y Output1y Layer

    1z 2z 3z 4zHiddenLayer

    13v 14v

    21v 22vy11v

    12v 13v21 22v

    23v 24v

    1x 2x InputLayer

  • SolutionSolutionSolutionSolution

    y Output1y Layer11w 21w 31w 41w

    1z 2z 3z 4zHiddenLayer

    13v 14v

    21v 22vy11v

    12v 13v21 22v

    23v 24v

    1x 2x InputLayer

  • SolutionSolutionSolutionSolution

    y Output01w 1y Layer11w 21w 31w 41w

    101w

    1z 2z 3z 4zHiddenLayer

    13v 14v

    21v 22vy11v

    12v 13v21 22v

    23v 24v01v

    02v03v

    1x 2x InputLayer1

    03v

    04v

  • y X-or Problem (linearly not separable) o ob ( a y o s pa ab )using 2-4-1 backprop Nety Initial Weights to

    hidden layer Initial weights to o/p layero/p layer

    ExampleExample

  • Solution Training: step 1 (feedforward of the first pattern, x = (1,1), t = 0)

    Initial weights:
        Output layer:  w01 = -0.1401,   w = [ 0.4919  -0.2913  -0.3979   0.3581 ]
        Hidden layer:  v  = [ 0.1970  0.3191  -0.1448   0.3394
                              0.3099  0.1904  -0.0347  -0.4861 ]
                       v0 = [ -0.3378  0.2771  0.2859  -0.3329 ]
    Training patterns (x1, x2, t): (1,1,0), (1,0,1), (0,1,1), (0,0,0)

    Net inputs to the hidden units:   z_in = [ 0.1691  0.7866  0.1064  -0.4796 ]
    Hidden activations:               z    = [ 0.5422  0.6871  0.5266   0.3823 ]
    Net input to the output unit:     y_in = -0.1462
    Network output:                   y    = 0.4635
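A short numeric check of the feedforward pass, using the initial weights above (the sign pattern follows the updated matrices printed at step 3). Running it reproduces z_in, z, y_in and y to the four decimals shown on the slides.

```python
import numpy as np

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

v  = np.array([[0.1970, 0.3191, -0.1448,  0.3394],
               [0.3099, 0.1904, -0.0347, -0.4861]])   # input-to-hidden weights
v0 = np.array([-0.3378, 0.2771, 0.2859, -0.3329])     # hidden biases
w  = np.array([0.4919, -0.2913, -0.3979, 0.3581])     # hidden-to-output weights
w01 = -0.1401                                         # output bias

x, t = np.array([1.0, 1.0]), 0.0                      # first XOR pattern

z_in = v0 + x @ v        # ≈ [ 0.1691, 0.7866, 0.1064, -0.4796]
z = sigmoid(z_in)        # ≈ [ 0.5422, 0.6871, 0.5266,  0.3823]
y_in = w01 + z @ w       # ≈ -0.1462
y = sigmoid(y_in)        # ≈  0.4635
print(np.round(z_in, 4), np.round(z, 4), round(float(y_in), 4), round(float(y), 4))
```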

  • Solution Training: step 2 (backpropagation of the associated error)

    Error term at the output unit:        δ1 = (t - y) f'(y_in) = -0.1153
    Output-layer increments:              Δw = [ -0.0012  -0.0016  -0.0012  -0.0009 ],   Δw01 = -0.0023
    Error propagated to the hidden units: δ_in = [ -0.0567  0.0336  0.0459  -0.0413 ]
    Error terms at the hidden units:      δ_j  = [ -0.0141  0.0072  0.0114  -0.0097 ]
    Hidden-layer increments:
        Δv  = 1.0e-3 * [ -0.2815  0.1444  0.2287  -0.1950
                         -0.2815  0.1444  0.2287  -0.1950 ]
        Δv0 = 1.0e-3 * [ -0.2815  0.1444  0.2287  -0.1950 ]
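Continuing the check with the backward pass on the same pattern. The learning rate is not printed on the slides; alpha = 0.02 is an assumption that happens to reproduce the increments above.

```python
import numpy as np

alpha = 0.02   # assumed learning rate (not stated on the slides); it reproduces the printed increments

# values carried over from the feedforward pass (step 1)
z = np.array([0.5422, 0.6871, 0.5266, 0.3823])
y, t = 0.4635, 0.0
w = np.array([0.4919, -0.2913, -0.3979, 0.3581])
x = np.array([1.0, 1.0])

delta_k = (t - y) * y * (1 - y)        # ≈ -0.1153, error term at the output unit
dw  = alpha * delta_k * z              # ≈ [-0.0012, -0.0016, -0.0012, -0.0009]
dw01 = alpha * delta_k                 # ≈ -0.0023
delta_in = delta_k * w                 # ≈ [-0.0567, 0.0336, 0.0459, -0.0413]
delta_j = delta_in * z * (1 - z)       # ≈ [-0.0141, 0.0072, 0.0114, -0.0097]
dv  = alpha * np.outer(x, delta_j)     # both rows ≈ 1e-3 * [-0.2815, 0.1444, 0.2287, -0.1950]
dv0 = alpha * delta_j
print(np.round(dw, 4), round(dw01, 4), np.round(delta_j, 4))
```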

  • Solution Training: step 3 (weight adjustment)

    After applying the step-2 increments:
        w01 = -0.1424,   w = [ 0.4907  -0.2929  -0.3991   0.3572 ]
        v  = [ 0.1967  0.3192  -0.1446   0.3392
               0.3096  0.1905  -0.0345  -0.4863 ]
        v0 = [ -0.3381  0.2772  0.2861  -0.3331 ]

  • Solution Training: step 3 (continued)

    After presenting the next training pattern(s):
        w01 = -0.1393,   w = [ 0.4923  -0.2908  -0.3975   0.3584 ]
        v  = [ 0.1970  0.3191  -0.1448   0.3394
               0.3100  0.1904  -0.0347  -0.4861 ]
        v0 = [ -0.3377  0.2770  0.2858  -0.3328 ]

  • Application

  • Perceptron Convergence Theorem

    If there is a weight vector w* such that f(x(p) · w*) = t(p) for all p, then for any starting vector w, the perceptron learning rule (PLR) will converge to a weight vector that gives the correct response for all training patterns, in a finite number of steps.


  • Perceptron Convergence Theorem

    w(k) · w* ≥ w(0) · w* + k m,   where m = min_p { x(p) · w* }

    By the Cauchy-Schwarz inequality,
        ||w(k)||² ≥ (w(k) · w*)² / ||w*||²
    so
        ||w(k)||² ≥ (w(0) · w* + k m)² / ||w*||²


  • Perceptron Convergence Theorem

    ||w(k)||² ≤ ||w(0)||² + k M,   where M = max_p { ||x(p)||² }

    Combining the two bounds,
        (w(0) · w* + k m)² / ||w*||² ≤ ||w(k)||² ≤ ||w(0)||² + k M

    The left-hand side grows like k² while the right-hand side grows like k, so the number of weight updates k is bounded: learning stops after finitely many corrections.

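The combined bound caps the number of corrections the perceptron learning rule can make on a separable set. As an illustration only (not part of the lecture), a small sketch that trains a perceptron on the bipolar AND function and counts the updates; all names and data are illustrative.

```python
import numpy as np

def train_perceptron(X, t, max_epochs=100):
    """Perceptron learning rule on bipolar targets; returns the weights and the update count."""
    w = np.zeros(X.shape[1])
    updates = 0
    for _ in range(max_epochs):
        changed = False
        for x, target in zip(X, t):
            if np.sign(x @ w) != target:   # misclassified (sign(0) counts as wrong)
                w += target * x            # PLR update
                updates += 1
                changed = True
        if not changed:
            break                          # all patterns correct -> converged
    return w, updates

# AND function with a bias input appended, bipolar targets: linearly separable
X = np.array([[1, 1, 1], [1, -1, 1], [-1, 1, 1], [-1, -1, 1]], dtype=float)
t = np.array([1, -1, -1, -1])
w, n = train_perceptron(X, t)
print(w, n)   # a separating weight vector and the (finite) number of updates
```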

  • Delta Rule

    y E is a function of all the weights
    y The gradient of E is the vector of partial derivatives of E with respect to each weight
    y The gradient gives the direction of most rapid increase in E; we wish to minimize E
    y Hence calculate ∂E/∂w_I


  • Delta Rule

    E = (t - y_in)²

    ∂E/∂w_I = ∂(t - y_in)² / ∂w_I = -2 (t - y_in) x_I

    y The local error will be reduced most rapidly by adjusting the weights according to the delta rule:
        Δw_I = α (t - y_in) x_I

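A minimal sketch of the delta rule applied to a single linear unit, implementing the update Δw_I = α (t - y_in) x_I derived above; the data and step size are illustrative.

```python
import numpy as np

def delta_rule_step(w, x, t, alpha):
    """One delta-rule update for a single linear unit: w_I <- w_I + alpha * (t - y_in) * x_I."""
    y_in = x @ w                        # net input of the linear unit
    return w + alpha * (t - y_in) * x   # move against the gradient of (t - y_in)^2

# fit y = 2*x1 - 1, with the bias folded in as a constant input x2 = 1 (illustrative data)
X = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0]])
t = np.array([-1.0, 1.0, 3.0, 5.0])
w = np.zeros(2)
for _ in range(2000):
    for x, target in zip(X, t):
        w = delta_rule_step(w, x, target, alpha=0.05)
print(np.round(w, 3))   # approaches [2., -1.]
```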

  • Artificial Neural Networks 13

    Dr. Aditya Abhyankar



    Learning: the process of forming associations between related patterns

    Heteroassociative NNs, Autoassociative NNs, Hopfield Net

  • A linear vector space X set