
14th February 2005 17:37 WSPC/Trim Size: 9in x 6in for Proceedings ncpw9

A PROPOSED MODEL OF REPETITION BLINDNESS

COLM G. CONNOLLY

Dept. of Computer Science,

University College Dublin,

Belfield, Dublin 4, Ireland.

E-mail: [email protected]

RONAN G. REILLY

Dept. of Computer Science,

National University of Ireland, Maynooth

Co. Kildare, Ireland.

E-mail: [email protected]

We describe a model of repetition blindness which draws on the dichotomous division of the visual system into two subsystems which process identity and location information. The model is constructed from self-organising networks of spiking neurons which are connected by plastic inhibitory and excitatory synapses. In particular, we describe how these networks are capable of learning translation invariant letter representations and learning to locate a stimulus in the input array using the SOM learning algorithm.

Various psychological phenomena require the establishment of separate episodic representations for each of the objects perceived in the environment. From the point of view of the brain these representations need not be stored in the same place, but may be distributed over many regions. Indeed, there is neurobiological evidence supporting the theory that the episodic context of an object is stored separately from its identity. We describe a model of one such phenomenon which uses separate representations for the identity and location of word information.

1. Repetition Blindness

Repetition blindness (RB) is the failure to detect the repeated occurrence of a stimulus which is presented in a rapid serial visual presentation (RSVP) experimental paradigm. If the exposure of the first occurrence of the stimulus, or critical word (C1), is set so that it is just identifiable (about 100 ms), then identification of the second occurrence (C2) is impeded (Kanwisher, 1987).

RB can be observed under a variety of stimulus conditions, including compounds, homographs, homophones and single letters (Kanwisher & Potter, 1990). RB is not confined to the level of whole words, as overlapping fragments of words can be subject to blindness (Harris & Morris, 2000). Furthermore, it appears not to be restricted to letter clusters. It has been found between numbers in their verbal (nine) and Arabic (9) forms and between homophones, such as I / eye (Bavelier & Potter, 1992).

1.1. Types, Tokens and Cortical Streams

The distinction between types and tokens is well known in language and is critical to the understanding of language and perception in general. Consider the following example: when extracting the different meanings from the sentences “Big fish eat little fish” and “Big fish eat little”, it is necessary to clearly establish that there are two tokens of the type “fish” (Kanwisher, 1987). Normally, we are proficient at this task. However, in RSVP experiments containing repeated stimuli this process can break down.

Kanwisher (1987) proposed that RB occurs when words are recognised as separate types but not tokenised, hence the description type activation without token individuation. Kanwisher also suggested that, as each word is presented, separate type and token nodes are set up, with links established between the two to tokenise the type.

One possible source of the type-token division is the what-where division of the visual system (Ungerleider & Mishkin, 1982; Baylis, Driver, & Rafal, 1993). In this view, the types are processed in the temporal “what” pathway whereas the tokens, or episodic representations, are the domain of the parietal “where” stream. It is assumed that the dorsal pathway is also responsible for the tokens’ temporal order. The hypothesis is that the separate spatiotemporal and identity representations are then bound together, hence tokenising the type. Activation of the type representation and creation of the episodic token can happen in parallel; the problem arises when the two are bound together to form a stable percept. In RB, failure to bind the type-token representations causes the disappearance of the second critical stimulus.

One popular proposal is that the brain uses the synchronisation of neural spikes to solve this binding problem (Singer, 1994). To account for RB within this framework, one could hypothesise that under normal circumstances there are two ensembles of neurons, one in each of the what and where streams. Over a short period of time these entrain, producing the neuronal correlate of a stable percept. Disruption of this entrainment, however, gives rise to RB. In RB experiments, stimuli are presented so quickly that entrainment sometimes fails to occur when the type is repeated.

2. A Model of RB

Drawing on the presence of the functional streams in the visual cortex, the model we propose for repetition blindness, shown in figure 1, follows this roughly Y-shaped division. It consists of three networks. Two of them share a common input source: one is specialised for localising where a letter is presented in the input and the other is specialised for identifying the letter regardless of its location in the input. The third network plays the role of what is commonly called association cortex. It is bidirectionally connected to both the “what” and the “where” networks and allows identity and location information to be bound together even though two different networks subserve these functions. These three networks are laterally connected self-organising maps (SOM) (Kohonen, 1995) of excitatory and inhibitory spiking neurons.

The input layer roughly models the pattern of orientation sensitive line detectors found in V1 (Hubel & Wiesel, 1965). The layer is fully connected to the excitatory neurons of the what layer while the connections between it and the where layer are arranged in overlapping topographic receptive fields. This allows only one side of the where map to activate in response to a letter presentation on one side of the input. However, some of this activity spreads, via the lateral connections, to the other side of the map.

The final part of the model consists of an “association” cortex layer. Its function is to mediate the binding of letter identity and letter location information into a stable percept. This is accomplished by using a third self-organising layer of laterally connected excitatory and inhibitory neurons. This layer receives input from both the “where” and “what” layers. In addition, it sends an extensive network of reciprocal connections to both the “what” and “where” layers. These connections promote inter-areal synchronisation, thus facilitating the binding of the “what” and “where” information (Singer, 1994).

[Figure 1: block diagram in which the Visual Feature Input projects to both the What System and the Where System, each of which is connected to the Association Area.]

Figure 1. Gross outline of the model of RB.

2.1. Visual Input

Input to the model consists of a simplified version of the simple cells described by Hubel and Wiesel (1965). Four different orientations, namely 0°, 45°, 90° and 135°, and 16 possible spatial locations are used. This representation was chosen as it allowed the use of the font described by Rumelhart and Siple (1974). The font, together with the oriented line detectors and their locations, can be seen in figure 2. Each one of these features corresponds to a neuron capable of generating a single spike, and each letter is represented by a spike on a subset of these oriented line detectors.

[Figure 2: panel (a) shows the letter font; panel (b) shows the feature template with its 16 line segments numbered 0 to 15; panel (c) is a spike raster of neuron number (0 to 15) against time (0 to 100 ms).]

Figure 2. (a): The font used for letter presentation to the model is similar to that used by Rumelhart and Siple (1974). (b): The template showing the location and arrangement of all 16 features. After figure 2, Rumelhart and Siple (1974). (c): The encoding of the letter “A” by spike times. In this case, neurons 0, 1, 2, 3, 4, 7, 8 and 12 signal its presence. These spikes are solid lines. The other neurons, which do not take an active part in encoding the feature, have their spikes shown as dashed lines. The neuron numbers correspond to the features numbered in (b).

The visual input to the model consists of two groups of 16 neurons. Each of these groups can represent one letter; consequently, the two groups are capable of registering two simultaneously presented letters.

The presence of a particular feature in the input is encoded in the firing time of the appropriate neurons. In this case, the encoding is a delay or latency code (Gerstner & Kistler, 2002). In the training pattern set, those neurons signalling the presence of a particular feature fire early and all other neurons fire much later. This arrangement can be seen in figure 2(c). The inclusion of this positive and negative information is necessitated by the design of the synapses used in the network. Since they only change efficacy in response to spike events, some means must be included to indicate whether the efficacy should be increased or decreased. This is accomplished by having spikes encoding the presence of a feature fire early and roughly in synchrony. All the others fire randomly, much later than those signalling the presence of a feature. Given an appropriate membrane time constant, $\tau_m$, these early spikes summate and cause the SOM neurons to fire. When the efficacy of the synapses is updated, the early spikes will cause an increase in the synaptic efficacy since they, for the most part, lie in the time period before a SOM neuron spike when afferent spikes can potentiate a synapse. The later firing neurons, on the other hand, lie in the time window after a SOM neuron spike, which causes a decrease in synaptic efficacy.
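The latency code described above can be sketched in a few lines of Python. This is only an illustration: the function name `encode_letter`, the early firing time, the jitter and the late firing window are assumptions chosen to fit the 0 to 100 ms axis of figure 2(c), not values taken from the model.

```python
import random

# Features that signal the letter "A", as listed in the caption of figure 2.
LETTER_A = {0, 1, 2, 3, 4, 7, 8, 12}
N_FEATURES = 16


def encode_letter(active, rng, early=5.0, jitter=1.0, late_range=(60.0, 100.0)):
    """Return one spike time (ms) per feature neuron for a training pattern.

    Neurons in `active` fire early and roughly in synchrony; all other
    neurons fire at a random, much later time. All timing values are
    hypothetical defaults, not parameters from the paper.
    """
    times = []
    for n in range(N_FEATURES):
        if n in active:
            times.append(early + rng.uniform(-jitter, jitter))
        else:
            times.append(rng.uniform(*late_range))
    return times


rng = random.Random(0)
pattern = encode_letter(LETTER_A, rng)
```

A testing pattern, in which the late spikes are absent, could be obtained from the same function by keeping only the entries whose index is in `active`.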

This pattern encoding scheme also has the advantage that a large number of patterns can easily be generated for a single letter by adding spikes to or deleting spikes from a canonical letter representation, or by shifting the times of one or more spikes.
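As a rough sketch of how such variants might be produced, the two helpers below perturb a canonical pattern. The names `make_variant` and `jitter_times` and the probabilities `p_drop` and `p_add` are hypothetical; the source does not say how many spikes were added, deleted or shifted.

```python
import random


def make_variant(active, rng, p_drop=0.1, p_add=0.05, n_features=16):
    """Return a noisy copy of a set of active feature indices.

    Each active feature is deleted with probability p_drop and each
    inactive feature is added with probability p_add (assumed values).
    """
    variant = set()
    for n in range(n_features):
        if n in active:
            if rng.random() >= p_drop:
                variant.add(n)
        elif rng.random() < p_add:
            variant.add(n)
    return variant


def jitter_times(times, rng, max_shift=1.0):
    """Shift every spike time by a random amount in [-max_shift, max_shift] ms."""
    return [t + rng.uniform(-max_shift, max_shift) for t in times]


rng = random.Random(1)
canonical_a = {0, 1, 2, 3, 4, 7, 8, 12}
noisy_a = make_variant(canonical_a, rng)
```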

The testing pattern set is similar to the training pattern set save for the absence of the late spikes, which are shown dashed in figure 2(c).

Having described how patterns are created for the letters, we now proceed to outline the neural model used in the SOM layers which learn these input patterns.

3. The Neural Model

The spiking model used in the network is the Spike Response Model (SRM) (Gerstner & Kistler, 2002). In order to obtain a computationally efficient function which computes the response of a neuron to afferent spikes, several simplifications described by Connolly, Marian, and Reilly (2004) have to be made to the SRM equations. Firstly, between the arrival of any two spikes at a neuron its membrane potential evolves in a deterministic manner. Therefore, if we know what the membrane potential is at time $t_{\text{int}}$ we can compute what it will be at some later time $t_{\text{int}} + \Delta t$. We then need only compute the additional change in the membrane voltage caused by spikes arriving since $t_{\text{int}}$. Secondly, if we assume that the post-synaptic potentials due to the arrival of a pre-synaptic spike contribute instantaneously to the membrane potential of the model neuron, then we can eliminate the need to use a biophysically more realistic function to model the change of the post-synaptic potential. In its place we can simply use a Dirac delta function. The modified equation is shown in (1).

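$$V_i(t) = \sum_{t_i^{(f)} \in F_i} \eta\left(t - t_i^{(f)}\right) + \varepsilon(t_{\text{int}}) \cdot \exp\left(\frac{-(t - t_{\text{int}})}{\tau_m}\right) + \sum_{j \in \Gamma_i} \; \sum_{\substack{t_j^{(f)} \in F_j \\ t_j^{(f)} > t_{\text{int}}}} w_{ij}\, \delta\left(t - t_j^{(f)} - \hat{d}_{ij}\right) \tag{1}$$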

where $\eta$ describes the response of a neuron to its own spikes, $\varepsilon(t_{\text{int}})$ is the post-synaptic potential at time $t_{\text{int}}$, $w_{ij}$ is the efficacy of the synapse between neurons $i$ and $j$, $t_i^{(f)}$ is the most recent spike time of neuron $i$, $\Gamma_i$ is the set of neurons pre-synaptic to neuron $i$, $F_j$ is the set of spike times of neuron $j$, $\hat{d}_{ij}$ is a noisy spike transmission delay between neurons $i$ and $j$, and $\delta$ is the Dirac delta function (here $\delta(x) = 1$ iff $x = 0$, 0 otherwise). This simplification of the neural model, though physiologically unrealistic (Gerstner, 2001), has considerable computational benefits since we no longer consider the time course of synaptic input to the neuron (Marian, 2002).
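To make the three terms of equation (1) concrete, the following Python sketch evaluates the membrane potential at a single time $t$. The exponential form of the refractory kernel $\eta$, its parameters `eta0` and `tau_r`, the value of `tau_m` and the function name itself are assumptions made for illustration; the source does not specify them.

```python
import math


def membrane_potential(t, t_int, eps_int, own_spikes, arrivals,
                       tau_m=10.0, eta0=5.0, tau_r=2.0):
    """Evaluate the simplified SRM of equation (1) at time t (ms).

    t_int      -- time at which the potential was last integrated
    eps_int    -- post-synaptic potential stored at t_int
    own_spikes -- spike times of this neuron (the set F_i)
    arrivals   -- (arrival_time, weight) pairs, where arrival_time is the
                  pre-synaptic spike time plus its transmission delay
    """
    # Refractory term: an assumed exponentially decaying negative kernel.
    refractory = sum(-eta0 * math.exp(-(t - tf) / tau_r)
                     for tf in own_spikes if tf <= t)
    # Deterministic decay of the potential stored at t_int.
    decayed = eps_int * math.exp(-(t - t_int) / tau_m)
    # Delta-function synapses: a spike contributes its full weight only at
    # the instant it arrives (compared with a small tolerance).
    impulses = sum(w for arrival, w in arrivals
                   if arrival > t_int and abs(arrival - t) < 1e-9)
    return refractory + decayed + impulses


v_quiet = membrane_potential(10.0, 0.0, 1.0, [], [])
v_spike = membrane_potential(10.0, 0.0, 1.0, [], [(10.0, 0.5)])
```

In an event-driven simulation the function would only be called at spike arrival times, after which the result becomes the new stored potential and `t_int` is advanced to `t`.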

Learning in the model is accomplished by using plastic excitatory and inhibitory spike time-dependent synapses (Abbott & Nelson, 2000). It is thought that in the brain, plasticity processes at synapses mediate long-term learning. At the excitatory synapses a type of Hebbian learning is modelled. If a pre-synaptic spike precedes that of a post-synaptic neuron, potentiation of the synapse can occur. If the reverse occurs, then depression of the synapse takes place. Equations (2) and (3), below, govern these two processes.

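$$w_{ij}(t + \Delta t) = w_{ij}(t) + \kappa_E\left(\hat{t}_j + \hat{d}_{ij} - \hat{t}_i\right) \tag{2}$$

$$\kappa_E(\Delta) = \begin{cases} A_+ \cdot \exp(\Delta / \tau_P) & \text{if } \Delta < 0 \\ B_- \cdot \exp(-\Delta / \tau_D) & \text{if } \Delta > 0 \\ 0 & \text{otherwise} \end{cases} \tag{3}$$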

where $\hat{t}_j$ and $\hat{t}_i$ are the most recent spike times of the pre- and post-synaptic neuron, respectively. The amplitude and polarity of the change are determined by the function $\kappa$, which implements what is known as the learning window (Gerstner & Kistler, 2002). $\tau_P$ and $\tau_D$ control the time course over which potentiation and depression occur, respectively, and the constants $A_+ > 0$ and $B_- < 0$ control the amplitude of potentiation and depression, respectively. This function differentiates between the case where the pre-synaptic spike, $t_{\text{pre}}$, occurs before the post-synaptic spike, $t_{\text{post}}$, and vice versa, with potentiation occurring in the former instance and depression in the latter.
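The excitatory learning window of equation (3) translates directly into code. In the sketch below the function name `kappa_e` and all default parameter values are placeholders chosen for illustration, not values reported for the model.

```python
import math


def kappa_e(delta, a_plus=0.1, b_minus=-0.1, tau_p=10.0, tau_d=10.0):
    """Excitatory learning window of equation (3).

    delta is the delayed pre-synaptic spike time minus the post-synaptic
    spike time, in ms. Negative values (pre before post) potentiate and
    positive values (pre after post) depress.
    """
    if delta < 0:
        return a_plus * math.exp(delta / tau_p)
    if delta > 0:
        return b_minus * math.exp(-delta / tau_d)
    return 0.0


# A pre-synaptic spike arriving 5 ms before the post-synaptic spike.
dw_potentiation = kappa_e(-5.0)
# A pre-synaptic spike arriving 5 ms after the post-synaptic spike.
dw_depression = kappa_e(5.0)
```

The change shrinks exponentially as the two spikes move further apart in time, so only near-coincident spike pairs alter the weight appreciably.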

A similar process takes place at inhibitory synapses (Roberts, 2000). When spiking of the pre-synaptic inhibitory neuron follows that of the post-synaptic excitatory neuron, potentiation can result. Conversely, when pre-synaptic activity precedes post-synaptic activation, depression can be observed. Two similar equations, (4) and (5), can also be defined, with all parameters as previously mentioned.

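$$w_{ij}(t + \Delta t) = w_{ij}(t) + \kappa_I\left(\hat{t}_j - \hat{t}_i\right) \tag{4}$$

$$\kappa_I(\Delta) = \begin{cases} A_+ \cdot \exp(-\Delta / \tau_P) & \text{if } \Delta > 0 \\ B_- \cdot \exp(\Delta / \tau_D) & \text{if } \Delta < 0 \\ 0 & \text{otherwise} \end{cases} \tag{5}$$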

In both cases, a realistic description of synaptic plasticity requires that the efficacy, $w_{ij}$, be constrained to vary within a certain range of values, namely $[w^{EI}_{\min}, w^{EI}_{\max}]$. The excitatory (inhibitory) synapse described here must maintain a positive (negative) weight which must not be greater (less) than $w^{EI}_{\max}$. These constraints can be easily realised by setting $B_- = w_{ij} b_-$, $b_- < 0$ and $A_+ = (w^{EI}_{\max} - w_{ij}) a_+$, $a_+ > 0$ in equations (3) and (5). Thus each positive (negative) term which increments (decrements) the synaptic efficacy is proportional to $w^{EI}_{\max} - w_{ij}$ whereas each negative (positive) term which leads to a decrement (increment) of the efficacy is proportional to $w_{ij}$. When the efficacy reaches its maximum (minimum) value it is no longer incremented (decremented) (Gerstner & Kistler, 2002).
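The effect of these weight-dependent amplitudes can be seen in a short sketch for the excitatory case. The function name `soft_bounded_update` and the values of `w_max`, `a_plus`, `b_minus` and the time constants are assumptions for illustration only; the inhibitory case would follow the same pattern with the window of equation (5).

```python
import math


def soft_bounded_update(w, delta, w_max=1.0, a_plus=0.1, b_minus=-0.1,
                        tau_p=10.0, tau_d=10.0):
    """One excitatory weight update using equations (2) and (3) with the
    amplitudes A+ = (w_max - w) * a_plus and B- = w * b_minus."""
    if delta < 0:
        # Potentiation shrinks to zero as w approaches w_max.
        return w + (w_max - w) * a_plus * math.exp(delta / tau_p)
    if delta > 0:
        # Depression shrinks to zero as w approaches zero.
        return w + w * b_minus * math.exp(-delta / tau_d)
    return w


# Repeated potentiation drives the weight towards, but never past, w_max.
w = 0.2
for _ in range(200):
    w = soft_bounded_update(w, -1.0)
```

Because each increment is a fixed fraction of the remaining distance to the bound, no explicit clipping step is needed to keep the weight inside its range.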

4. Results

Thus far, eliciting synchronous oscillation from the network has proved very difficult, as the network is very sensitive to small changes in parameter sets. Seemingly innocuous modifications can change the regular output shown in figure 3 into self-propagating epileptiform spiking patterns where all information about letter identity and location is lost. Balancing excitation and inhibition so that the topographic regularity extracted by the SOM algorithm is maintained while regions of the three maps spike together, thereby binding what and where information, is a demanding task. Solving this problem is our current goal.

[Figure 3: two rows of ten map-activity panels, one per presentation, labelled Left [2], Left [6] to Left [9] and Right [0], Right [1], Right [3] to Right [5], with a scale bar running from 0 to 1. (a) Letter identity, with map axis ticks at 5, 10 and 15. (b) Letter location, with map axis ticks at 2, 4, 6 and 8.]

Figure 3. The figures above show the number of spikes generated by 10 presentations of the letter “T” to the model.

Currently the model is capable of correctly learning both letter location and a translation invariant representation of letter identity. This can be seen in figure 3. The letter “T” was presented to a network 10 times. Half of these presentations were to the left side of the input array and the other half to the right. The model learned letter identity, shown in figure 3(a), regardless of presentation location. Though the individual spike patterns are not identical, due to noise in spike transmission, they are substantially similar. Likewise, in figure 3(b), the model easily located the letter in the input array. This is a consequence of the receptive field connection between the input and where layers.

5. Summary and Conclusions

In this paper we have described a model of orthographic repetition blindness, a disruption in the ability to recognise repeated words, or word fragments, which are rapidly visually presented.

Drawing on the division of the visual system into streams which process object identity and location separately, the model consists of two self-organising maps performing these functions. These two maps are reciprocally connected, via a third SOM, which promotes synchronous firing of the neurons in each map.

We have described a model spiking neuron based on Gerstner’s SRM, with networks of these neurons connected by plastic excitatory and inhibitory synapses exhibiting spike time-dependent learning.

So far the network has successfully learned both letter and location information. In attempting to get the model to demonstrate synchronous oscillations in different parts of the three maps, the model has almost consistently become epileptic, a known problem with these types of networks (Huyck, n.d.).

This may be due to a number of factors. Firstly, the balance between slow and fast excitatory and inhibitory synaptic transmission which can stabilise oscillatory persistent activity (Brunel, 2003) may not be present in the network. Secondly, the simple fire and quickly forget refractory function may decay too swiftly, thus depriving the network of its activity history. We are currently investigating whether addressing these two problems will allow the network to enter stable, oscillatory, persistently active states.

Acknowledgements

We wish to thank Harry Erwin for his useful comments and suggestions for solving the simulated epilepsy demonstrated by the model.

References

Abbott, L. F., & Nelson, S. B. (2000). Synaptic plasticity: taming the beast. Nature Neuroscience, 3, 1178–1183.

Bavelier, D., & Potter, M. C. (1992). Visual and phonological codes in repetition blindness. Journal of Experimental Psychology: Human Perception and Performance, 18 (1), 134–147.

Baylis, G., Driver, J., & Rafal, R. D. (1993). Visual extinction and stimulus repetition. Journal of Cognitive Neuroscience, 5, 453–466.

Brunel, N. (2003). Dynamics and plasticity of stimulus-selective persistent activity in cortical network models. Cerebral Cortex, 13 (11), 1151–1161.

Connolly, C. G., Marian, I., & Reilly, R. G. (2004). Approaches to efficient simulation with spiking neural networks. In H. Bowman & C. Labiouse (Eds.), Connectionist models of cognition and perception II (Vol. 15, pp. 231–240). London, UK: World Scientific.

Gerstner, W. (2001). What’s different with spiking neurons. In H. A. K. Mastebroek & J. E. Vos (Eds.), Plausible neural networks for biological modelling (Vol. 13, pp. 23–48). Dordrecht, The Netherlands: Kluwer Academic Publishers.

Gerstner, W., & Kistler, W. M. (2002). Spiking neuron models: Single neurons, populations, plasticity. Cambridge, UK: Cambridge University Press.

Harris, C. L., & Morris, A. L. (2000). Orthographic repetition blindness. The Quarterly Journal of Experimental Psychology, 53A, 1039–1060.

Hubel, D. H., & Wiesel, T. N. (1965). Receptive fields and functional architecture of two non-striate visual areas (18 and 19) of the cat. Journal of Neurophysiology, 28, 229–289.

Huyck, C. (n.d.). Creating hierarchical categories using cell assemblies. (Manuscript under review. Retrieved 14th October 2004 from http://www.cwa.mdx.ac.uk/chris/hebb/hier/hier.ps)

Kanwisher, N. G. (1987). Repetition blindness: Type recognition without token individuation. Cognition, 27, 117–143.

Kanwisher, N. G., & Potter, M. C. (1990). Repetition blindness: Levels of processing. Journal of Experimental Psychology: Human Perception and Performance, 16 (1), 30–47.

Kohonen, T. (1995). Self-organizing maps. Berlin: Springer.

Marian, I. D. (2002). A biologically inspired model of motor control of direction. Unpublished master’s thesis, National University of Ireland, Dublin, Dublin, Ireland.

Roberts, P. D. (2000). Modeling inhibitory plasticity in the electrosensory system of mormyrid electric fish. Journal of Neurophysiology, 84 (1), 2035–2047.

Rumelhart, D. E., & Siple, P. (1974). Process of recognizing tachistoscopically presented words. Psychological Review, 81 (2), 99–118.

Singer, W. (1994). Putative functions of temporal correlations in neocortical processing. In C. Koch & J. L. Davis (Eds.), Large-scale neuronal theories of the brain (pp. 210–237). Cambridge, MA: The MIT Press.

Ungerleider, L. G., & Mishkin, M. (1982). Two cortical visual systems. In D. Ingle, M. A. Goodale, & R. J. W. Mansfield (Eds.), Analysis of visual behavior (pp. 549–586). Cambridge, MA: The MIT Press.