Bioinspired Computing, Lecture 12. Artificial Neural Networks: From Multilayer to Recurrent Neural Nets. Netta Cohen.


Slide 1
Bioinspired Computing, Lecture 12. Artificial Neural Networks: From Multilayer to Recurrent Neural Nets. Netta Cohen.

Slide 2: Distributed Representations
In the examples today, neural nets learn to represent the information in a training set by distributing it across a set of simple, connected, neuron-like units. Some useful properties of distributed representations:
- they are robust to noise
- they degrade gracefully
- they are content addressable
- they allow automatic completion or repair of input patterns
- they allow automatic generalisation from input patterns
- they allow automatic clustering of input patterns
In many ways, this form of information processing resembles that carried out by real nervous systems.

Slide 3: Distributed Representations (continued)
However, distributed representations are quite hard for us to understand, visualise or build by hand. To aid our understanding we have developed ideas such as:
- the partitioning of the input space
- the clustering of the input data
- the formation of feature detectors
- the characterisation of hidden-unit subspaces
To build distributed representations automatically, we resorted to learning algorithms such as backprop.

Slide 4: Problems
ANNs often depart from biological reality:
- Supervision: real brains cannot rely on a supervisor to teach them, nor are they free to self-organise.
- Training vs. testing: this distinction is an artificial one.
- Temporality: real brains are continuously engaged with their environment, not exposed to a series of disconnected trials.
- Architecture: real neurons, and the networks that they form, are far more complicated than the artificial neurons and simple connectivity we have discussed so far.
Does this matter? If ANNs are just biologically inspired tools, no; but if they are to model mind or life-like systems, the answer is maybe.

Slide 5: Some Ways Forward
- Learning: eliminate supervision.
- Temporality: introduce dynamics into neural networks.
- Architecture: eliminate layers and feed-forward directionality.

Slide 6: Auto-associative Memory
Auto-associative nets are trained to reproduce their input activation across their output nodes (diagram: input, hidden and output layers). Once trained, the net can automatically repair noisy or damaged images that are presented to it. A net trained on bedrooms and bathrooms, presented with an input including a sink and a bed, might infer the presence of a mirror and a wardrobe: a bedsit (bed + bath + mirror + wardrobe).

Slide 7: Auto-association
The networks still rely on a feed-forward architecture, and training still (typically) relies on back-propagation. But this form of learning represents a move away from conventional supervision, since the desired output is none other than the input, which can be stored internally.

Slide 8: A Vision Application
Hubel and Wiesel's work on cat retinas has inspired a class of ANNs used for sophisticated image analysis. Neurons in the retina are arranged in large arrays, and each has its own associated receptive field. This arrangement, together with lateral inhibition, enables the eye to perform efficient edge detection on our visual input streams. An ANN can do the same thing. http://serendip.brynmawr.edu/bb/latinhib_app.html

Slide 9: Lateral Inhibition
A pattern of light falls across an array of neurons that each inhibit their right-hand neighbour. Lateral inhibition such as this is characteristic of natural retinal networks. Only neurons along the left-hand dark-light boundary escape inhibition. Now let the receptive fields of different neurons be coarse grained, with large overlaps between adjacent neurons. What advantage is gained? http://serendip.brynmawr.edu/bb/latinhib_app.html
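As an illustration of the edge-detection effect described on Slide 9, the short sketch below applies a one-dimensional lateral-inhibition rule to a step-like light pattern. The array, weights and clipping are illustrative assumptions chosen for the sketch, not values from the lecture.

```python
import numpy as np

# A step pattern of light: dark on the left, bright on the right.
light = np.array([0, 0, 0, 1, 1, 1, 1], dtype=float)

# Each neuron is excited by the light falling on it and inhibited by its
# left-hand neighbour (i.e. every neuron inhibits its right-hand neighbour).
excitation = 1.0
inhibition = 0.8

response = excitation * light
response[1:] -= inhibition * light[:-1]   # lateral inhibition from the neighbour
response = np.clip(response, 0.0, None)   # firing rates cannot go negative

print(response)
# Only the neuron just past the dark-to-light boundary keeps a strong response;
# neurons deep inside the bright region are suppressed by their equally bright
# neighbours, so the array effectively signals the edge.
```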
Slide 10: Lateral Inhibition (continued)
The networks still rely on one-way connectivity, but there are no input, hidden and output layers: every neuron serves both for input and for output. Still, the learning is not very interesting: lateral inhibition is hard-wired, up to fine tuning. While some applications suit hard-wired circuitry, others require more flexible functionality.

Slide 11: Unsupervised Learning
- Auto-associative learning: a first but modest move away from supervision.
- Reinforcement learning: in many real-life situations we have no idea what the desired output of the neurons should be, but we can recognise desired behaviour (e.g. riding a bike). By conditioning behaviour with rewards and penalties, desirable neural-net activity is reinforced.
- Hebbian learning and self-organisation: in the complete absence of supervision or conditioning, the network can still self-organise and reach an appropriate and stable solution. In Hebbian learning, only effective connections between neurons are enhanced.

Slide 12: Supervised Training
Artificial nets can be configured to perform set tasks. The training usually involves external learning rules, an external supervisor and off-line training. The result is typically a general-purpose net (one that can be trained to do almost any sort of classification, recognition, fitting, association, and much more). Can we design and train artificial neural nets to exhibit a more natural learning capacity?

Slide 13: Natural Learning
A natural learning experience typically implies internalised learning rules, an ability to learn by one's self in the real world, and no conscious control over neuronal plasticity. The result is typically a restricted learning capacity: we are primed to study a mother tongue, but less so to study math.

Slide 14: The Life of a Squirrel
Animals' tasks routinely require them to combine reactive behaviour (such as reflexes, pattern recognition, etc.), sequential behaviour (such as set routines) and learning. For example, a squirrel foraging for food must:
- move around her territory without injuring herself
- identify dangerous or dull areas, or those rich in food
- learn to travel to and from particular areas
All of these behaviours are carried out by one nervous system: the squirrel's brain. To some degree, different parts of the brain may specialise in different kinds of task. However, there is no sharp boundary between learned behaviours, instinctual behaviours and reflex behaviours.

Slide 15: Recurrent Neural Networks
Adapted from Randy Beer (1995), "A dynamical systems perspective on agent-environment interaction", Artificial Intelligence 72: 173-215. Each neuron is connected to every other neuron and to itself, and connections can be asymmetric. Some neurons receive external input; some neurons produce outputs. Activation flows around the network, rather than feeding forward. Randy Beer describes one scheme for recurrent nets: RNN neurons are virtually the same as in feed-forward nets, but the activity of the network is updated at each time step. Note: inputs from other nodes, as well as from the node itself, must be counted.
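A minimal discrete-time sketch of the kind of recurrent update described on Slide 15: at every time step each node sums weighted activity from all nodes (itself included) plus any external input. The network size, random weights and sigmoid activation are illustrative assumptions; this is not Beer's continuous-time formulation from the cited paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n = 4                                    # a small, fully connected net
W = rng.normal(scale=0.5, size=(n, n))   # recurrent weights, self-connections included
a = np.zeros(n)                          # node activations

# Drive node 0 with a constant external input and let activity reverberate.
external = np.array([1.0, 0.0, 0.0, 0.0])

for t in range(20):
    # Every node receives input from all nodes (including itself) plus external input.
    a = sigmoid(W @ a + external)

print(a)   # the activation pattern the net settles into (or keeps cycling through)
```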
Slide 16: The Net's Dynamic Character
Consider the servile life of a feed-forward net:
- it is dormant until an input is provided
- this input is mapped onto the output nodes via a hidden layer
- weights are changed by an auxiliary learning algorithm
- once again the net is dormant, awaiting input
Contrast the active life of a recurrent net:
- even without input, spontaneous activity may reverberate around the net in any manner
- this spontaneous activity is free to flow around the net
- external input may modify these intrinsic dynamics
- if embedded, the net's activity may affect its environment, which may alter its sensory input, which may perturb its dynamics, and so on

Slide 17: What Does It Do?
Initially, in the absence of input, or in the presence of a steady-state input, a recurrent network will usually approach a stable equilibrium. Other behaviours can be obtained with dynamic inputs and induced by training. For instance, a recurrent net can be trained to oscillate spontaneously (without any input), and in some cases even to generate chaotic behaviour. One of the big challenges in this area is finding the best algorithms and network architectures to induce such diverse forms of dynamics.

Slide 18: Tuning the Dynamics
Once input is included, there is a fear that the abundance of internal stimulation and excitation will result in an explosion or saturation of activity. In fact, by including a balance of excitation and inhibition, and by correctly choosing the activation functions, the activity is usually self-contained within reasonable bounds.

Slide 19: Dynamical Neural Nets: Introducing Time
In all neural-net discussions so far, we have assumed all inputs to be presented simultaneously, and each trial to be separate. Time was somehow deemed irrelevant. Recurrent nets can deal with inputs that are presented sequentially, as they almost always would be in real problems. The ability of the net to reverberate and sustain activity can serve as a working memory. Such nets are called Dynamical (Recurrent) Neural Nets (DNN or DRNN).

Slide 20: Dynamical Neural Nets
Consider an XOR with only one input node. Input: we provide the network with a time series consisting of a pair of high and low values. Output: the output neuron is to become active when the input sequence is 01 or 10, but remain inactive when the input sequence is 00 or 11. (Figure: input and target output traces over time, with high = 1 and low = 0.)

Slide 21: Supervised Learning for RNNs
Backprop through time:
- calculate errors at the output nodes at time t
- backpropagate the error to all nodes at time t-1
- repeat for some fixed number of time steps (usually
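To make Slide 21's recipe concrete, the fragment below unrolls a tiny recurrent net over the two time steps of the single-input temporal XOR task from Slide 20 and backpropagates the output error through the unrolled steps. It is a minimal hand-written sketch of backprop through time; the hidden-layer size, learning rate, epoch count and random initialisation are illustrative assumptions, not settings from the lecture.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
n_hidden = 4
W_in  = rng.normal(size=(n_hidden, 1))         # input -> hidden
W_rec = rng.normal(size=(n_hidden, n_hidden))  # hidden -> hidden (recurrent)
W_out = rng.normal(size=(1, n_hidden))         # hidden -> output
b_h   = rng.normal(size=n_hidden)
b_out = 0.0
lr = 0.5

# Temporal XOR: the two input values arrive one after the other on a single node.
sequences = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
             ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]

for epoch in range(5000):
    for xs, target in sequences:
        # ---- forward pass, unrolled over the time steps ----
        hs = [np.zeros(n_hidden)]
        for x in xs:
            hs.append(sigmoid(W_in[:, 0] * x + W_rec @ hs[-1] + b_h))
        y = sigmoid(W_out @ hs[-1] + b_out)[0]

        # ---- backward pass: backprop through time ----
        dy = (y - target) * y * (1 - y)        # error at the output node (final step)
        dW_out = dy * hs[-1][None, :]
        db_out = dy
        dh = dy * W_out[0]                     # error arriving at the last hidden state
        dW_in, dW_rec, db_h = np.zeros_like(W_in), np.zeros_like(W_rec), np.zeros_like(b_h)
        for t in reversed(range(len(xs))):     # walk back through the unrolled steps
            dpre = dh * hs[t + 1] * (1 - hs[t + 1])   # through the sigmoid at step t
            dW_in[:, 0] += dpre * xs[t]
            dW_rec += np.outer(dpre, hs[t])
            db_h += dpre
            dh = W_rec.T @ dpre                # pass the error back to time t-1

        W_out -= lr * dW_out; b_out -= lr * db_out
        W_in  -= lr * dW_in;  W_rec -= lr * dW_rec; b_h -= lr * db_h

# Check what the trained net does with each input sequence.
for xs, target in sequences:
    h = np.zeros(n_hidden)
    for x in xs:
        h = sigmoid(W_in[:, 0] * x + W_rec @ h + b_h)
    print(xs, "->", float(sigmoid(W_out @ h + b_out)[0]), "(target", target, ")")
```

The key point mirrored from the slide is that the same recurrent weights receive gradient contributions from every unrolled time step, so the error computed at the output is pushed back to time t-1, t-2, and so on for a fixed number of steps.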
