# Recurrent Networks

Post on 31-Dec-2015


A recurrent network is characterized by the following properties:

- The connection graph of the network has cycles, i.e. the output of a neuron can influence its input.
- There are no natural input and output nodes.
- Initially each neuron is given an input state.
- Neurons change state using some update rule.
- The network evolves until some stable situation is reached.
- The resulting state is the output of the network.

## Pattern Recognition

Recurrent networks can be used for pattern recognition in the following way:

- The stable states represent the patterns to be recognized.
- The initial state is a noisy or otherwise mutilated version of one of the patterns.
- The recognition process consists of the network evolving from its initial state to a stable state.

## Pattern Recognition Example

*(Slides: a noisy input image and the pattern recognized from it.)*

## Bipolar Data Encoding

In bipolar encoding, firing of a neuron is represented by the value +1 and non-firing by the value −1, and the transfer function of the neurons is the sign function sgn. A bipolar vector x of dimension n satisfies the equations

    sgn(x) = x,    x^T x = n.

## Binary versus Bipolar Encoding

The number of orthogonal vector pairs is much larger in case of bipolar encoding. In an n-dimensional vector space:

- For binary encoding there are 3^n ordered pairs with inner product zero: in each coordinate the pair of values must be one of (0,0), (0,1), (1,0).
- For bipolar encoding (n even) there are 2^n · C(n, n/2) such pairs: x and y are orthogonal exactly when they differ in n/2 of the n positions.

## Hopfield Networks

A recurrent network is a Hopfield network when:

- The neurons have discrete output (for convenience we use bipolar encoding).
- Each neuron has a threshold; together the thresholds form the bias vector b.
- Each pair of neurons is connected by a weighted connection. The weight matrix W is symmetric and has a zero diagonal (no connection from a neuron to itself).

## Network States

If a Hopfield network has n neurons, then the state of the network at time t is the vector x(t) ∈ {−1, 1}^n with components x_i(t) that describe the states of the individual neurons.
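The orthogonal-pair counts claimed a few slides back can be checked by brute force for small n. A minimal sketch (the function name is mine):

```python
from itertools import product

def count_orthogonal_pairs(values, n):
    """Count ordered pairs (x, y) of n-dimensional vectors with
    coordinates drawn from `values` whose inner product is zero."""
    vecs = list(product(values, repeat=n))
    return sum(1 for x in vecs for y in vecs
               if sum(a * b for a, b in zip(x, y)) == 0)

for n in (4, 6):
    print(n, count_orthogonal_pairs((0, 1), n),    # binary:  3^n
             count_orthogonal_pairs((-1, 1), n))   # bipolar: 2^n * C(n, n/2)
```

For n = 4 this counts 81 binary pairs versus 96 bipolar pairs, and for n = 6 already 729 versus 1280; the gap keeps widening with n, matching the slide's claim.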
Time is discrete, so t ∈ ℕ. The state of the network is updated using a so-called update rule: whether neuron i fires at time t+1 depends on the sign of its total input Σ_j w_ij x_j(t) + b_i at time t.

## Update Strategies

- In a *sequential* network only one neuron at a time is allowed to change its state. In the *asynchronous* update rule this neuron is randomly selected.
- In a *parallel* network several neurons are allowed to change their state simultaneously:
  - *Limited parallelism*: only neurons that are not connected can change their state simultaneously.
  - *Unlimited parallelism*: connected neurons may also change their state simultaneously.
  - *Full parallelism*: all neurons change their state simultaneously.

## Asynchronous Update

A randomly selected neuron k changes its state according to

    x_k := sgn( Σ_j w_kj x_j + b_k ),

while all other neurons keep their state.

## Asynchronous Neighborhood

The asynchronous neighborhood of a state x is defined as the set of states

    N_a(x) = { x − 2 x_k e_k | 1 ≤ k ≤ n },

i.e. the states that differ from x in exactly one component. Because w_kk = 0, it follows that for every pair of neighboring states x and x* = x − 2 x_k e_k ∈ N_a(x)

    (x* − x)^T W (x* − x) = 4 x_k^2 w_kk = 0.

## Synchronous Update

All neurons change their state simultaneously:

    x(t+1) = sgn( W x(t) + b ).

This update rule corresponds to full parallelism.

## Sign-Assumption

In order for both update rules to be applicable, we assume that for all neurons i and all states x

    Σ_j w_ij x_j + b_i ≠ 0.

Because the number of states is finite, it is always possible to adjust the thresholds such that the above assumption holds.

## Stable States

A state x is called a stable state when

    x = sgn( W x + b ).

For both the synchronous and the asynchronous update rule we have: a state is a stable state if and only if the update rule does not lead to a different state.

## Cyclic Behavior in Asymmetric RNNs

*(Slide: a small network with an asymmetric weight matrix whose state keeps alternating between −1 and 1 values and never stabilizes.)*

## Basins of Attraction

*(Slide: the state space partitioned into basins of attraction; every initial state evolves to the stable state of its basin.)*

## Consensus and Energy

The consensus C(x) of a state x of a Hopfield network with weight matrix W and bias vector b is defined as

    C(x) = ½ x^T W x + b^T x.

The energy E(x) of a Hopfield network in state x is defined as

    E(x) = −C(x) = −½ x^T W x − b^T x.

## Consensus Difference

For any pair of vectors x and x* we have

    C(x*) − C(x) = (x* − x)^T ( W x + b ) + ½ (x* − x)^T W (x* − x).

## Asynchronous Convergence

If in an asynchronous step the state of the network changes from
x to x − 2 x_k e_k, then the consensus increases: by the consensus difference, and because w_kk = 0,

    C(x − 2 x_k e_k) − C(x) = −2 x_k ( Σ_j w_kj x_j + b_k ) > 0,

where the inequality holds because neuron k only changes state when sgn(Σ_j w_kj x_j + b_k) = −x_k. Since there are only a finite number of states, the consensus serves as a variant function showing that a Hopfield network evolves to a stable state when the asynchronous update rule is used.

## Stable States and Local Maxima

A state x is a local maximum of the consensus function when

    C(x*) ≤ C(x)  for all x* ∈ N_a(x).

Theorem: a state x is a local maximum of the consensus function if and only if it is a stable state.

## Stable Equals Local Maximum

*(Slide: proof of the theorem above.)*

## Modified Consensus

The modified consensus of the state x(t) of a Hopfield network with weight matrix W and bias vector b is defined as

    Ĉ(x(t)) = x(t)^T W x(t−1) + b^T ( x(t) + x(t−1) ).

Let x, x*, and x** be successive states obtained with the synchronous update rule. Then

    Ĉ(x**) − Ĉ(x*) = (x** − x)^T ( W x* + b ).

## Synchronous Convergence

Suppose that x, x*, and x** are successive states obtained with the synchronous update rule. Then, since x** = sgn( W x* + b ),

    Ĉ(x**) − Ĉ(x*) ≥ 0,  with equality only if x** = x.

Hence a Hopfield network that evolves using the synchronous update rule arrives either in a stable state or in a cycle of length 2.

## Storage of a Single Pattern

How does one determine the weights of a Hopfield network, given a set of desired stable states? First we consider the case of a single stable state. Let x be an arbitrary bipolar vector. Choosing the weight matrix W and bias vector b as

    W = x x^T − I,    b = 0

makes x a stable state.

## Proof of Stability

Since x^T x = n, we have W x = x ( x^T x ) − x = (n − 1) x, and therefore sgn( W x + b ) = sgn( (n − 1) x ) = x.

## Example

*(Slide: a worked example of single-pattern storage.)*

## State Encoding

*(Slide: encoding of the network states.)*

## Finite State Machine for Asynchronous Update

*(Slide: the state transitions under the asynchronous update rule, drawn as a finite state machine.)*

## Weights for Multiple Patterns

Let { x(p) | 1 ≤ p ≤ P } be a set of patterns, and let W(p) = x(p) x(p)^T − I be the weight matrix corresponding to pattern number p. Choose the weight matrix W and the bias vector b for a Hopfield network that must recognize all P patterns as

    W = Σ_p W(p),    b = 0.

Question: is x(p) indeed a stable state?

## Remarks

It is not guaranteed that a Hopfield network with the weight matrix defined above indeed has the patterns as its stable states. The disturbance caused by other patterns is called crosstalk.
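The multiple-pattern storage rule and the crosstalk it produces can be explored numerically. A sketch assuming NumPy (the sizes n and P and the seed are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
n, P = 100, 5
patterns = rng.choice([-1, 1], size=(P, n))   # random bipolar patterns

# Storage rule from the slides: W = sum_p ( x(p) x(p)^T - I ), b = 0.
W = sum(np.outer(x, x) - np.eye(n) for x in patterns)

for p in range(P):
    x = patterns[p]
    # The input of neuron i splits into a signal term (n - P) x_i(p)
    # plus the crosstalk caused by the other patterns.
    crosstalk = sum((patterns[q] @ x) * patterns[q]
                    for q in range(P) if q != p)
    stable = np.all(np.sign(W @ x) == x)
    print(p, bool(stable), int(np.max(np.abs(crosstalk))))
```

With n much larger than P the signal term (n − P) x_i(p) typically dominates the crosstalk, so most (usually all) of the stored patterns test stable; shrinking n or growing P makes instabilities appear.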
The closer the patterns are, the larger the crosstalk is. This raises the question of how many patterns can be stored in a network before crosstalk gets the upper hand.

## Input of Neuron i in State x(p)

With W = Σ_q ( x(q) x(q)^T − I ) and b = 0, the input of neuron i in state x(p) is

    Σ_j w_ij x_j(p) = (n − P) x_i(p) + Σ_{q ≠ p} x_i(q) ( x(q)^T x(p) ).

## Crosstalk

The crosstalk term is defined by

    Σ_{q ≠ p} x_i(q) ( x(q)^T x(p) ).

As long as its absolute value stays below n − P, it cannot change the sign of the input of neuron i.

## Spurious States

Besides the desired stable states, the network can have additional undesired (spurious) stable states:

- If x is stable and b = 0, then −x is also stable.
- Some combinations of an odd number of stable states can be stable.
- Moreover, there can be more complicated additional stable states (spin glass states) that bear no relation to the desired states.

## Storage Capacity

Question: how many patterns P can be stored in a network of size n?

Answer: that depends on the probability of instability one is willing to accept. Experimentally, P ≈ 0.15 n has been found (by Hopfield) to be a reasonable value.

## Probabilistic Analysis 1

Assume that all components of the patterns are independent random variables that take the values +1 and −1 with equal probability.

## Probabilistic Analysis 2

From these assumptions it follows that the crosstalk term has mean approximately 0 and variance approximately P n. Application of the central limit theorem yields

    Pr[ neuron i unstable ] ≈ Pr[ y ≥ (n − P) / √(P n) ] ≈ Pr[ y ≥ √(n / P) ],

where y is a standard normal random variable.

## Standard Normal Distribution

*(Slide: the shaded area under the bell-shaped curve gives the probability Pr[ y ≥ 1.5 ].)*

## Probability of Instability

| Pr[ y ≥ z ] | z     | P/n = 1/z² |
|-------------|-------|------------|
| 0.05        | 1.645 | 0.370      |
| 0.01        | 2.326 | 0.185      |
| 0.005       | 2.576 | 0.151      |
| 0.001       | 3.090 | 0.105      |

The row with probability 0.005 gives P ≈ 0.151 n, which matches Hopfield's experimental value P ≈ 0.15 n.

## Topics Not Treated

- Reduction of crosstalk for correlated patterns
- Stability analysis for correlated patterns
- Methods to eliminate spurious states
- Continuous Hopfield models
- Different associative memories:
  - Bidirectional Associative Memory (Kosko)
  - Brain State in a Box (Anderson, Kawamoto)
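As a closing illustration, the storage rule and the asynchronous update rule described above can be combined into a small end-to-end recall demo, assuming NumPy (all names, sizes, and the amount of noise are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(1)

def train(patterns):
    """Storage rule from the slides: W = sum_p ( x(p) x(p)^T - I ), b = 0."""
    n = patterns.shape[1]
    return sum(np.outer(x, x) - np.eye(n) for x in patterns)

def recall(W, x, max_sweeps=100):
    """Asynchronous updates in random order until a stable state is reached."""
    x = x.copy()
    for _ in range(max_sweeps):
        changed = False
        for i in rng.permutation(len(x)):
            # sgn of the total input; ties go to +1 (the slides' sign
            # assumption rules ties out, but code needs a convention)
            new = 1 if W[i] @ x >= 0 else -1
            if new != x[i]:
                x[i], changed = new, True
        if not changed:              # x = sgn(W x): a stable state
            return x
    return x

n = 64
pattern = rng.choice([-1, 1], size=(1, n))
W = train(pattern)

noisy = pattern[0].copy()
flipped = rng.choice(n, size=10, replace=False)   # corrupt 10 of the 64 bits
noisy[flipped] *= -1

print(np.array_equal(recall(W, noisy), pattern[0]))   # prints True
```

With a single stored pattern and 10 of 64 bits flipped, the initial overlap x^T·noisy = 44 is positive and only grows during the sweep, so every update moves a neuron toward the stored pattern and the network settles on it in one sweep.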