
Post on 21-Dec-2015


Learning Sensorimotor Contingencies

James J. Clark

Centre for Intelligent Machines, McGill University

This work is being done in collaboration with:

J. Kevin O’Regan (CNRS, Univ. Rene Descartes)

and with doctoral students at McGill University:

Fatima Drissi-Smaili, Ziad Hafed, Muhua Li

A mystery: Why do we perceive the same feature value (e.g. color) when viewing the feature foveally or peripherally?

Why is this a mystery?

The signal provided by retinal photoreceptors can be quite different when the image of the feature falls on different places on the retina.

For example: the spectral sensitivity curves of retinal photoreceptors are shifted towards the blue in peripheral cells as compared with the foveal cells.


A related mystery (perhaps…): Why do neurons in areas such as V4 and IT, which have large receptive fields, respond to the same feature value (e.g. color, orientation, complex shape) no matter where the feature lies in the receptive field?

The activity of these neurons is usually reduced when the feature falls in the periphery of the receptive field as compared with the center, but the neuron’s selectivity, or tuning, is the same everywhere.

Perceptual Stability

These mysteries can be more generally considered as related to the mystery of perceptual stability.

Perceptual stability is the constancy of subjective experience across self-actions, even though these self-actions can cause large changes in sensory inputs.

Sensorimotor Contingencies

One theory of perceptual stability, due to O’Regan and Noë, holds that what is perceived is the sensorimotor contingency associated with a given physical stimulus.

A sensorimotor contingency is a law or set of laws that describes the relation between self-actions and resulting changes in sensory input.

Since it is the presence of a lawful relationship between sensory input and motor activity that determines the perception of a physical stimulus, an appropriate change in sensory input is necessary for a perception to be stable!

Conditioning using Temporal-Difference Learning

We propose that Sensorimotor Contingencies associated with sensory changes due to eye movements can be learned using a variety of learning techniques. We propose the use of the Temporal-Difference Learning scheme of Sutton and Barto.

This reinforcement learning technique can be thought of as a form of Conditioning where the Conditioned Stimulus is the sensory activity before the eye movement and the Unconditioned Stimulus is the sensory activity after the eye movement.

After conditioning, presentation of the conditioned stimulus will produce the same behaviour as that produced by the unconditioned stimulus.

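\[
\Delta V_{ij}(t) = \beta \left[ \lambda_i(t) + \gamma\,\lambda_i(t-1) - \sum_k V_{ik}(t)\,X_k(t-1) \right] \bar{X}_j(t)
\]

\[
\bar{X}_j(t) = \delta\,\bar{X}_j(t-1) + X_j(t-1)
\]

where $\beta$, $\gamma$ and $\delta$ are constant weights, and

$X_j(t) = 0$ during the attention shift from fovea to periphery

$V_{ij}(t)$ is constrained through saturation to lie between 0 and 1

$\lambda_i(t)$ is the foveal value of feature i, and is the "unconditioned stimulus"

$X_j(t)$ is the peripheral value of feature j

$\bar{X}_j(t)$ is the eligibility trace, or short-term memory, of $X_j(t)$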

The Sutton-Barto Temporal-Difference Learning Rule

V is a matrix of association strengths between pre- and post- saccadic stimuli.

The pre-motor stimulus X is held in a short-term memory, generating an eligibility trace, which will be used to enhance, in a Hebbian fashion, the association to the post-motor stimulus.

The reinforcement signal, which is multiplied by the eligibility trace to yield the change in the association matrix, is the difference between two different predictions of the foveal response: a weighted sum of the current and previous foveal responses, and the action of the current association matrix on the previous peripheral stimulus.
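The update described above can be sketched in a few lines of plain Python. This is only one reading of the slide text, not the authors' implementation: the function names, the parameter names `beta`, `gamma` and `delta`, their default values, and the exact form of the trace decay are all assumptions made for illustration.

```python
def update_trace(trace, x_prev, delta=0.5):
    """Decay the short-term memory and add the latest peripheral activity.

    The decay form (delta * old + new) is an assumption for this sketch.
    """
    return [delta * e + x for e, x in zip(trace, x_prev)]


def td_update(V, lam_now, lam_prev, x_prev, trace, beta=0.1, gamma=0.5):
    """Return an updated copy of the association matrix V (a list of rows).

    V[i][j]  : association from peripheral feature j to foveal feature i
    lam_now  : current foveal responses
    lam_prev : previous foveal responses
    x_prev   : previous peripheral responses
    trace    : eligibility trace of the peripheral responses
    """
    new_V = []
    for i, row in enumerate(V):
        # First prediction: weighted sum of current and previous foveal responses.
        target = lam_now[i] + gamma * lam_prev[i]
        # Second prediction: current associations acting on the previous
        # peripheral stimulus.
        predicted = sum(v * x for v, x in zip(row, x_prev))
        reinforcement = target - predicted
        # Reinforcement times eligibility trace, saturated to [0, 1].
        new_V.append([
            min(1.0, max(0.0, v + beta * reinforcement * e))
            for v, e in zip(row, trace)
        ])
    return new_V


# Hypothetical single step: peripheral feature 0 was attended, then
# foveal feature 0 responded after the eye movement.
V0 = [[0.0, 0.0], [0.0, 0.0]]
V1 = td_update(V0, lam_now=[1.0, 0.0], lam_prev=[0.0, 0.0],
               x_prev=[1.0, 0.0], trace=[1.0, 0.0], beta=0.5)
```

In this toy step only the entry linking peripheral feature 0 to foveal feature 0 grows, because it is the only entry whose eligibility trace is non-zero.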

Attention selects a peripheral target and enhances feature detector activity at that location.

TRAINING PHASE

A short-term memory (eligibility trace) of this feature activity is generated.


An eye movement is made, foveating the target.


Attention shifts to the fovea, enhancing the feature detector activity there.


The feature detector activity at the fovea is associated with the feature detector activity represented in the short-term memory, using an appropriate learning rule, e.g. the Sutton-Barto Temporal Difference Rule.
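The sequence of training steps above can be sketched as a loop over saccades. To keep the example short, a simple error-correcting (delta-rule) association stands in for the Sutton-Barto rule; that swap, the function name `train_associations`, the learning rate, and the toy one-hot detector responses are all assumptions for illustration.

```python
def train_associations(saccades, n_foveal, n_peripheral, beta=0.5):
    """Learn V[i][j], linking peripheral feature j to foveal feature i.

    saccades: iterable of (peripheral_activity, foveal_activity) pairs,
              one pair per eye movement.
    """
    V = [[0.0] * n_peripheral for _ in range(n_foveal)]
    for peripheral, foveal in saccades:
        # 1. Attention selects the peripheral target (activity used as given).
        attended = list(peripheral)
        # 2. A short-term memory of that activity is generated.
        trace = attended[:]
        # 3-4. The eye movement foveates the target and attention shifts
        #      to the fovea, giving the post-saccadic activity.
        foveal_activity = list(foveal)
        # 5. Associate foveal activity with the remembered peripheral activity.
        for i in range(n_foveal):
            predicted = sum(V[i][k] * trace[k] for k in range(n_peripheral))
            error = foveal_activity[i] - predicted
            for j in range(n_peripheral):
                updated = V[i][j] + beta * error * trace[j]
                V[i][j] = min(1.0, max(0.0, updated))  # saturate to [0, 1]
    return V


# Toy data: the same physical feature excites a different detector index
# in the periphery than it does at the fovea.
stimuli = [
    ([0, 1, 0], [1, 0, 0]),
    ([0, 0, 1], [0, 1, 0]),
    ([1, 0, 0], [0, 0, 1]),
]
V = train_associations(stimuli * 20, n_foveal=3, n_peripheral=3)
```

With these toy inputs, each row of `V` ends up dominated by a single entry, pairing a peripheral detector with the foveal detector that responds to the same physical feature.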


Once associations have been built up, the appearance of an attended-to target in the periphery can produce a response as though the target is actually foveated.

This response can be thought of as a mental image.

This mental image might be represented by activity in neurons in areas with large receptive fields (V4, IT) and hence would be concerned only with feature type, rather than feature location.

This provides an explanation for the continuity in the quality of the subjective experience of a stimulus across the visual field.

RECOGNITION PHASE

We have divided the processing into two separate phases, Training and Recognition.

In practice, however, these can co-occur.

The learning mechanism can be continuous, allowing adaptation to changes in the sensory and motor systems (e.g. aging of the photoreceptors, changes in the projective optics of the eye, …).

STEADY-STATE OPERATION

Creation of “Mental Images”

Once the association weights matrix, V, has been learned, it can be used to generate predictions, M, of what the foveal image or feature detector response will look like, based on the peripheral responses, P.

M = V*P
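As a small illustration of this product, the sketch below applies a hypothetical learned matrix to a peripheral response vector. The function name `mental_image` and the numbers in `V` and `P` are made up for the example.

```python
def mental_image(V, P):
    """Predict the foveal response M = V * P (matrix-vector product)."""
    return [sum(v * p for v, p in zip(row, P)) for row in V]


# Hypothetical association matrix: peripheral detector 1 maps to foveal
# detector 0, peripheral 2 to foveal 1, and peripheral 0 to foveal 2.
V = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
]
P = [0.0, 0.9, 0.1]      # peripheral feature detector responses
M = mental_image(V, P)   # predicted foveal responses
```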

It is expected that the association matrix should map foveal images into themselves, therefore the eigenvectors of this matrix should be (linear combinations of) the foveal images.

F = kV*F
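The fixed-point relation can be checked numerically. The sketch below uses power iteration to find one eigenvector of a small, made-up symmetric matrix and the scale factor k that makes F = kV*F hold. The matrix values and function names are assumptions, and power iteration recovers only the dominant eigenvector, not the full set.

```python
import math


def matvec(V, F):
    """Matrix-vector product V * F for a list-of-rows matrix."""
    return [sum(v * f for v, f in zip(row, F)) for row in V]


def dominant_fixed_image(V, steps=200):
    """Return (F, k) with F a unit-norm eigenvector of V and F = k * V * F."""
    F = [1.0] + [0.0] * (len(V) - 1)
    for _ in range(steps):
        G = matvec(V, F)
        norm = math.sqrt(sum(g * g for g in G))
        F = [g / norm for g in G]
    # Rayleigh quotient gives the eigenvalue because F has unit norm.
    eigenvalue = sum(g * f for g, f in zip(matvec(V, F), F))
    return F, 1.0 / eigenvalue


# Hypothetical symmetric association matrix.
V = [[0.6, 0.2],
     [0.2, 0.6]]
F, k = dominant_fixed_image(V)
residual = max(abs(f - k * g) for f, g in zip(F, matvec(V, F)))
```

Here `residual` measures how far F is from satisfying F = kV*F; for this matrix it is at the level of floating-point rounding.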

AN EXAMPLE: STABILITY OF COLOR PERCEPTION

Many factors, including absorption of light by the lens of the eye, cause a yellowing of the light falling on the fovea as compared with that falling on the periphery.

After training, a presentation of a given color feature in the periphery is associated with the color feature that would be observed after the feature is foveated with an eye movement.

This can be seen in the structure of the association weights matrix, where peripheral and foveal color features map to the same color class.

ANOTHER EXAMPLE: STABILITY OF STRAIGHT LINE PERCEPTION

The retina is hemispherical, and this causes straight lines in space to be projected as 2-D arcs on the retinal surface, with radii of curvature that vary with eccentricity.

[Figure: Images of Straight Lines at Various Eccentricities; Images of Lines Projected onto Receptors]

It can be seen that the “mental images” are all very close to the foveal images, no matter where in the periphery the projection of the physical line falls.

The eigenvectors are not equal to the foveal images, but the foveal images can be obtained from them through a linear sum.

Development of Position Invariance in Neural Responses

Feature detectors with differing preferred stimuli (corresponding to the photoreceptor responses of a stable physical stimulus as the eye moves)

Which feature detectors are connected to the cell must be learned (and continually adapted)

Standard View

It is unclear how the development would proceed without some sort of adaptation signal coming from the need for constancy of response across self-actions (e.g. eye movements).


Alternate View

The weightings of the lower level units are continually updated through the associative learning mechanism. This mechanism requires input from the oculomotor system to know when an eye movement has taken place.

[Figure labels: Association Layer; “mental image” (prediction of foveal response); Eye movement signal]

Conclusions

Perceptual stability and the position invariance of higher-level cortical neurons may arise from a learning of sensorimotor contingencies.

Such learning can be accomplished with a reinforcement learning network, which learns to generate predictions of lower level visual feature detector activity which would occur after foveation of a physical stimulus.

In our view, a projection of a physical stimulus onto any peripheral retinal location will result in the same “mental image” of the feature as projection onto the fovea.

On-going and Future Research

* Recurrent Feedback of predictions back down to low-level feature detectors - will allow small displacements of foveal image

* Interpretation of the Reinforcement Signal - small signal can be used to drive adaptation - large signal can be used to indicate instability of the world or to indicate that a new class should be created

* Psychophysical studies of Pre- and Post-motor attention shifts

* Sensorimotor Basis function representations of the Association weights matrix.