Lecture 8: Artificial neural networks (rahimi/cs437/slides/lec08.pdf · 2010.11.14)
Lecture 8
Artificial neural networks: Unsupervised learning

- Introduction
- Hebbian learning
- Generalised Hebbian learning algorithm
- Competitive learning
- Self-organising computational map: Kohonen network
- Summary

11/14/2010 Intelligent Systems and Soft Computing 1
Introduction

The main property of a neural network is an ability to learn from its environment, and to improve its performance through learning. So far we have considered supervised or active learning: learning with an external "teacher", or a supervisor who presents a training set to the network. But another type of learning also exists: unsupervised learning.
In contrast to supervised learning, unsupervised or self-organised learning does not require an external teacher. During the training session, the neural network receives a number of different input patterns, discovers significant features in these patterns, and learns how to classify input data into appropriate categories. Unsupervised learning tends to follow the neuro-biological organisation of the brain.

Unsupervised learning algorithms aim to learn rapidly and can be used in real-time.
Hebbian learning

In 1949, Donald Hebb proposed one of the key ideas in biological learning, commonly known as Hebb's Law. Hebb's Law states that if neuron i is near enough to excite neuron j and repeatedly participates in its activation, the synaptic connection between these two neurons is strengthened and neuron j becomes more sensitive to stimuli from neuron i.
Hebb's Law can be represented in the form of two rules:

1. If two neurons on either side of a connection are activated synchronously, then the weight of that connection is increased.

2. If two neurons on either side of a connection are activated asynchronously, then the weight of that connection is decreased.

Hebb's Law provides the basis for learning without a teacher. Learning here is a local phenomenon occurring without feedback from the environment.
[Figure: Hebbian learning in a neural network. Input signals enter an input layer whose neurons i connect to output-layer neurons j, which produce the output signals.]
Using Hebb's Law we can express the adjustment applied to the weight w_ij at iteration p in the following form:

    Δw_ij(p) = F[ y_j(p), x_i(p) ]

As a special case, we can represent Hebb's Law as follows:

    Δw_ij(p) = α y_j(p) x_i(p)

where α is the learning rate parameter. This equation is referred to as the activity product rule.
Hebbian learning implies that weights can only increase. To resolve this problem, we might impose a limit on the growth of synaptic weights. It can be done by introducing a non-linear forgetting factor into Hebb's Law:

    Δw_ij(p) = α y_j(p) x_i(p) − φ y_j(p) w_ij(p)

where φ is the forgetting factor.

The forgetting factor usually falls in the interval between 0 and 1, typically between 0.01 and 0.1, to allow only a little "forgetting" while limiting the weight growth.
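As a minimal numerical sketch (not from the slides; the values of α and φ are arbitrary examples), the rule with forgetting keeps a repeatedly reinforced weight bounded at α/φ instead of growing without limit:

```python
import numpy as np

def hebbian_update(w, x, y, alpha=0.1, phi=0.05):
    """One Hebbian step with a forgetting factor:
    delta_w[i][j] = alpha * y[j] * x[i] - phi * y[j] * w[i][j]."""
    return w + alpha * np.outer(x, y) - phi * y * w

# Two co-active units: their connection strengthens but stays bounded.
w = np.zeros((2, 2))
x = np.array([1.0, 0.0])
y = np.array([1.0, 0.0])
for _ in range(500):
    w = hebbian_update(w, x, y)
print(w[0, 0])  # converges towards alpha / phi = 2.0
```

With pure Hebbian learning (φ = 0) the same weight would grow by α at every step; the forgetting term turns the update into a pull towards the equilibrium α/φ.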
Hebbian learning algorithm

Step 1: Initialisation.
Set initial synaptic weights and thresholds to small random values, say in an interval [0, 1].

Step 2: Activation.
Compute the neuron output at iteration p:

    y_j(p) = Σ_{i=1}^{n} x_i(p) w_ij(p) − θ_j

where n is the number of neuron inputs, and θ_j is the threshold value of neuron j.
Step 3: Learning.
Update the weights in the network:

    w_ij(p + 1) = w_ij(p) + Δw_ij(p)

where Δw_ij(p) is the weight correction at iteration p.

The weight correction is determined by the generalised activity product rule:

    Δw_ij(p) = φ y_j(p) [ λ x_i(p) − w_ij(p) ]

where λ = α/φ; this is Hebb's Law with the forgetting factor written in factored form.

Step 4: Iteration.
Increase iteration p by one, go back to Step 2.
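The four steps can be sketched in Python. This is an illustrative implementation, not the slides' own code: the {0, 1} hard-limiter activation, the parameter values, and the λ = α/φ factoring are assumptions.

```python
import numpy as np

def train_hebbian(patterns, alpha=0.1, phi=0.05, epochs=100, seed=0):
    """Generalised Hebbian learning (Steps 1-4) for one layer of n neurons."""
    n = len(patterns[0])
    rng = np.random.default_rng(seed)
    w = rng.uniform(0, 1, size=(n, n))   # Step 1: random weights in [0, 1]
    theta = rng.uniform(0, 1, size=n)    #         and thresholds in [0, 1]
    lam = alpha / phi                    # lambda = alpha / phi
    for _ in range(epochs):              # Step 4: repeat
        for x in patterns:
            # Step 2: activation with a {0, 1} hard limiter.
            y = np.where(x @ w - theta >= 0, 1.0, 0.0)
            # Step 3: generalised activity product rule.
            w += phi * y * (lam * x[:, None] - w)
    return w, theta

patterns = [np.array(p, dtype=float) for p in
            ([0, 1, 0, 0, 1], [0, 0, 0, 1, 0], [0, 0, 1, 0, 0])]
w, theta = train_hebbian(patterns)
# Each update pulls w_ij towards lambda * x_i for active outputs,
# so every weight stays within [0, lambda].
```

Note how the update is a convex pull of w_ij towards λ x_i whenever output j fires, which is what keeps the weights bounded.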
Hebbian learning exampleHebbian learning exampleTo illustrate Hebbian learning, consider a fully To illustrate Hebbian learning, consider a fully connected feedforward network with a single layer connected feedforward network with a single layer of five computation neurons. Each neuron is of five computation neurons. Each neuron is represented by arepresented by a McCulloch and Pitts model with McCulloch and Pitts model with th i ti ti f ti Th t k i t i dth i ti ti f ti Th t k i t i dthe sign activation function. The network is trained the sign activation function. The network is trained on the following set of input vectors:on the following set of input vectors:
00
10
00
00
10
0001X
1002X
0103X
0014X
1005X
11/14/2010 Intelligent Systems and Soft Computing 11
0 1 0 0 1
[Figure: Initial and final states of the network. Before training, the five-neuron network maps the input [1 0 0 0 1]^T to the output [1 0 0 0 1]^T; after training, the same input produces the output [0 1 0 0 1]^T.]
Initial and final weight matrices

Initial weight matrix (the 5 × 5 identity):

           | 1 0 0 0 0 |
           | 0 1 0 0 0 |
    W(0) = | 0 0 1 0 0 |
           | 0 0 0 1 0 |
           | 0 0 0 0 1 |

Final weight matrix:

           | 0      0      0      0      0      |
           | 0      2.0204 0      0      2.0204 |
    W    = | 0      0      1.0200 0      0      |
           | 0      0      0      0.9996 0      |
           | 0      2.0204 0      0      2.0204 |
A test input vector, or probe, is defined as

    X = [1 0 0 0 1]^T

When this probe is presented to the network, we obtain:

    Y = sign( W X − θ )
      = sign( [0  2.0204  0  0  2.0204]^T − [0.4940  0.2661  0.0907  0.9478  0.0737]^T )
      = [0  1  0  0  1]^T

where W is the final weight matrix and θ is the vector of neuron thresholds.
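As a quick sanity check (a sketch added here, not part of the original slides), the probe response can be reproduced numerically from the final weight matrix and thresholds of the example:

```python
import numpy as np

# Final weight matrix and thresholds from the worked example.
W = np.array([[0, 0,      0,      0,      0],
              [0, 2.0204, 0,      0,      2.0204],
              [0, 0,      1.0200, 0,      0],
              [0, 0,      0,      0.9996, 0],
              [0, 2.0204, 0,      0,      2.0204]])
theta = np.array([0.4940, 0.2661, 0.0907, 0.9478, 0.0737])

x = np.array([1, 0, 0, 0, 1])           # the probe
y = np.where(W @ x - theta >= 0, 1, 0)  # hard-limiting activation
print(y)                                # -> [0 1 0 0 1]
```

Input x1 no longer excites neuron 1 (its self-weight decayed to zero), while x5 fires both neurons 2 and 5 through the strengthened associative weights.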
Competitive learning

In competitive learning, neurons compete among themselves to be activated.

While in Hebbian learning several output neurons can be activated simultaneously, in competitive learning only a single output neuron is active at any time.

The output neuron that wins the "competition" is called the winner-takes-all neuron.
The basic idea of competitive learning was introduced in the early 1970s.

In the late 1980s, Teuvo Kohonen introduced a special class of artificial neural networks called self-organizing feature maps. These maps are based on competitive learning.
What is a self-organizing feature map?

Our brain is dominated by the cerebral cortex, a very complex structure of billions of neurons and hundreds of billions of synapses. The cortex includes areas that are responsible for different human activities (motor, visual, auditory, somatosensory, etc.), and associated with different sensory inputs. We can say that each sensory input is mapped into a corresponding area of the cerebral cortex. The cortex is a self-organizing computational map in the human brain.
[Figure: Feature-mapping Kohonen model. Panels (a) and (b) each show an input layer fully connected to a Kohonen layer.]
The Kohonen network

The Kohonen model provides a topological mapping. It places a fixed number of input patterns from the input layer into a higher-dimensional output or Kohonen layer.

Training in the Kohonen network begins with the winner's neighborhood of a fairly large size. Then, as training proceeds, the neighborhood size gradually decreases.
[Figure: Architecture of the Kohonen network. Two input-layer neurons (x1, x2) are fully connected to three output-layer neurons (y1, y2, y3).]
The lateral connections are used to create a competition between neurons. The neuron with the largest activation level among all neurons in the output layer becomes the winner. This neuron is the only neuron that produces an output signal. The activity of all other neurons is suppressed in the competition.

The lateral feedback connections produce excitatory or inhibitory effects, depending on the distance from the winning neuron. This is achieved by the use of a Mexican hat function which describes synaptic weights between neurons in the Kohonen layer.
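A Mexican-hat profile is commonly modelled as a difference of Gaussians; this particular parameterisation is an illustrative assumption, not taken from the slides. Connection strength is excitatory (positive) near the winner and inhibitory (negative) farther away:

```python
import numpy as np

def mexican_hat(d, sigma_e=1.0, sigma_i=3.0, a_e=2.0, a_i=1.0):
    """Lateral connection strength as a difference of Gaussians:
    a narrow excitatory bump minus a wide inhibitory one."""
    d = np.asarray(d, dtype=float)
    return (a_e * np.exp(-d**2 / (2 * sigma_e**2))
            - a_i * np.exp(-d**2 / (2 * sigma_i**2)))

dist = np.array([0.0, 1.0, 4.0])
print(mexican_hat(dist))  # positive at d = 0, negative at large d
```

Plotting this function against distance reproduces the shape in the figure below: a central excitatory peak surrounded by an inhibitory surround.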
[Figure: The Mexican hat function of lateral connection. Connection strength is plotted against distance from the winning neuron: an excitatory effect close to the winner, and an inhibitory effect farther away.]
In the Kohonen network, a neuron learns by shifting its weights from inactive connections to active ones. Only the winning neuron and its neighbourhood are allowed to learn. If a neuron does not respond to a given input pattern, then learning cannot occur in that particular neuron.

The competitive learning rule defines the change Δw_ij applied to synaptic weight w_ij as

    Δw_ij = { α (x_i − w_ij),  if neuron j wins the competition
            { 0,               if neuron j loses the competition

where x_i is the input signal and α is the learning rate parameter.
The overall effect of the competitive learning rule resides in moving the synaptic weight vector W_j of the winning neuron j towards the input pattern X. The matching criterion is equivalent to the minimum Euclidean distance between vectors.

The Euclidean distance between a pair of n-by-1 vectors X and W_j is defined by

    d = ||X − W_j|| = [ Σ_{i=1}^{n} (x_i − w_ij)^2 ]^{1/2}

where x_i and w_ij are the ith elements of the vectors X and W_j, respectively.
To identify the winning neuron, j_X, that best matches the input vector X, we may apply the following condition:

    ||X − W_{j_X}|| = min_j ||X − W_j||,   j = 1, 2, . . ., m

where m is the number of neurons in the Kohonen layer.
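In code, the minimum-distance criterion amounts to an argmin over the rows of the weight matrix. A brief sketch (the function name and layout are illustrative assumptions):

```python
import numpy as np

def winner(X, W):
    """Return the index of the winner-takes-all neuron: the row of W
    (one weight vector per Kohonen neuron) closest to X in Euclidean distance."""
    return int(np.argmin(np.linalg.norm(W - X, axis=1)))

W = np.array([[0.27, 0.81],   # W1
              [0.42, 0.70],   # W2
              [0.43, 0.21]])  # W3
X = np.array([0.52, 0.12])
print(winner(X, W))  # -> 2 (i.e. neuron 3, counting from 1)
```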
Suppose, for instance, that the 2-dimensional input vector X is presented to the three-neuron Kohonen network,

    X = [0.52  0.12]^T

The initial weight vectors, W_j, are given by

    W1 = [0.27  0.81]^T,   W2 = [0.42  0.70]^T,   W3 = [0.43  0.21]^T
We find the winning (best-matching) neuron j_X using the minimum-distance Euclidean criterion:

    d1 = [ (x1 − w11)^2 + (x2 − w21)^2 ]^{1/2} = [ (0.52 − 0.27)^2 + (0.12 − 0.81)^2 ]^{1/2} = 0.73
    d2 = [ (x1 − w12)^2 + (x2 − w22)^2 ]^{1/2} = [ (0.52 − 0.42)^2 + (0.12 − 0.70)^2 ]^{1/2} = 0.59
    d3 = [ (x1 − w13)^2 + (x2 − w23)^2 ]^{1/2} = [ (0.52 − 0.43)^2 + (0.12 − 0.21)^2 ]^{1/2} = 0.13

Neuron 3 is the winner and its weight vector W3 is updated according to the competitive learning rule:

    Δw13 = α (x1 − w13) = 0.1 (0.52 − 0.43) = 0.01
    Δw23 = α (x2 − w23) = 0.1 (0.12 − 0.21) = −0.01
The updated weight vector W3 at iteration (p + 1) is determined as:

    W3(p + 1) = W3(p) + ΔW3(p) = [0.43  0.21]^T + [0.01  −0.01]^T = [0.44  0.20]^T

The weight vector W3 of the winning neuron 3 becomes closer to the input vector X with each iteration.
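The arithmetic of this example can be checked in a few lines (a verification sketch, not part of the slides; the slide rounds the exact correction 0.009 to 0.01):

```python
import numpy as np

X = np.array([0.52, 0.12])
W = np.array([[0.27, 0.81], [0.42, 0.70], [0.43, 0.21]])  # W1, W2, W3
alpha = 0.1

d = np.linalg.norm(W - X, axis=1)
print(np.round(d, 2))       # -> [0.73 0.59 0.13], so neuron 3 wins

j = int(np.argmin(d))       # j == 2 (neuron 3)
W[j] += alpha * (X - W[j])  # competitive learning rule
print(W[j])                 # approximately [0.44, 0.20]
```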
Competitive Learning Algorithm

Step 1: Initialization.
Set initial synaptic weights to small random values, say in an interval [0, 1], and assign a small positive value to the learning rate parameter α.
Step 2: Activation and Similarity Matching.
Activate the Kohonen network by applying the input vector X, and find the winner-takes-all (best-matching) neuron j_X at iteration p, using the minimum-distance Euclidean criterion

    j_X(p) = min_j ||X − W_j(p)|| = [ Σ_{i=1}^{n} (x_i − w_ij(p))^2 ]^{1/2},   j = 1, 2, . . ., m

where n is the number of neurons in the input layer, and m is the number of neurons in the Kohonen layer.
Step 3: Learning.
Update the synaptic weights

    w_ij(p + 1) = w_ij(p) + Δw_ij(p)

where Δw_ij(p) is the weight correction at iteration p. The weight correction is determined by the competitive learning rule:

    Δw_ij(p) = { α [x_i − w_ij(p)],  if j ∈ Λ_j(p)
               { 0,                  if j ∉ Λ_j(p)

where α is the learning rate parameter, and Λ_j(p) is the neighborhood function centered around the winner-takes-all neuron j_X at iteration p.
Step 4: Iteration.
Increase iteration p by one, go back to Step 2 and continue until the minimum-distance Euclidean criterion is satisfied, or no noticeable changes occur in the feature map.
Competitive learning in the Kohonen network

To illustrate competitive learning, consider the Kohonen network with 100 neurons arranged in the form of a two-dimensional lattice with 10 rows and 10 columns. The network is required to classify two-dimensional input vectors: each neuron in the network should respond only to the input vectors occurring in its region.

The network is trained with 1000 two-dimensional input vectors generated randomly in a square region in the interval between −1 and +1. The learning rate parameter α is equal to 0.1.
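A minimal version of this experiment can be sketched as follows. The linearly shrinking neighbourhood schedule and the random seed are assumptions for illustration; the slides do not specify them:

```python
import numpy as np

rng = np.random.default_rng(0)

# 10 x 10 Kohonen lattice with 2-D inputs; weights start as small random values.
rows = cols = 10
W = rng.uniform(-0.1, 0.1, size=(rows * cols, 2))
grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
alpha = 0.1

# 1000 training vectors drawn uniformly from the square [-1, 1] x [-1, 1].
X = rng.uniform(-1, 1, size=(1000, 2))

for p, x in enumerate(X):
    j = int(np.argmin(np.linalg.norm(W - x, axis=1)))  # winner-takes-all
    radius = max(1e-3, 5.0 * (1 - p / len(X)))         # shrinking neighbourhood
    dist = np.linalg.norm(grid - grid[j], axis=1)      # lattice distance to winner
    hood = dist <= radius
    W[hood] += alpha * (x - W[hood])                   # competitive learning rule

# After training, the weight vectors should spread out from the centre
# to cover the input square, as shown in the figures below.
print(W.min(), W.max())
```

Each update moves the winner and its lattice neighbours towards the current input, which is what unfolds the map over the square as training proceeds.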
[Figure: Initial random weights, plotted as W(2,j) versus W(1,j) over the square [−1, 1] × [−1, 1].]
[Figure: Network after 100 iterations, plotted as W(2,j) versus W(1,j).]
[Figure: Network after 1000 iterations, plotted as W(2,j) versus W(1,j).]
[Figure: Network after 10,000 iterations, plotted as W(2,j) versus W(1,j): the weight vectors now form a regular lattice covering the input square.]