2/11/2007 1

ACQ and the Basal Ganglia

Jimmy Bonaiuto
USC Brain Project

2/12/2007

2/11/2007 2

Outline

• Alstermark’s Cat
• ACQ
• ACQ → Basal Ganglia
• Basal Ganglia Model Implementations (NSL)
• The Search for Executability

2/11/2007 3

Alstermark’s Cat

2/11/2007 4

ACQ

2/11/2007 5

ACQ

2/11/2007 6

ACQ - Executability

- 2D Gaussian kernel populations
- Food location relative to mouth
- Food location relative to paw
- Food location relative to tube opening

[Equations: PF(t), MF(t), and BF(t) are each defined as a 2D Gaussian kernel G(x, y, ...) = exp(...) over population coordinates (x, y). The kernel arguments combine the food position f(t) with the paw position p(t) for PF, the mouth position m(t) for MF, and the tube opening position b(t) for BF, normalized by the constants V_max and P_max.]
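A minimal sketch of a 2D Gaussian kernel population of the kind listed on this slide. The isotropic kernel width `sigma`, the linear mapping from the range [-v_max, v_max] onto [0, p_max], and all function and parameter names are illustrative assumptions; they are not taken from the slide's equations.

```python
import math

def gaussian_kernel(x, y, mu_x, mu_y, sigma):
    """Isotropic 2D Gaussian bump centered on (mu_x, mu_y)."""
    d2 = (x - mu_x) ** 2 + (y - mu_y) ** 2
    return math.exp(-d2 / (2.0 * sigma ** 2))

def to_population_coords(offset, v_max, p_max):
    """Assumed mapping: offsets in [-v_max, v_max] go linearly to [0, p_max]."""
    return (offset + v_max) / (2.0 * v_max) * p_max

def encode_relative_location(food, effector, v_max=1.0, p_max=10, sigma=1.0):
    """Return a (p_max + 1) x (p_max + 1) grid of firing rates, indexed [y][x],
    encoding the food position relative to an effector (paw, mouth, or tube)."""
    mu_x = to_population_coords(food[0] - effector[0], v_max, p_max)
    mu_y = to_population_coords(food[1] - effector[1], v_max, p_max)
    return [[gaussian_kernel(x, y, mu_x, mu_y, sigma)
             for x in range(p_max + 1)]
            for y in range(p_max + 1)]

# Food exactly at the paw: the bump sits at the center of the population.
pf = encode_relative_location(food=(0.0, 0.0), effector=(0.0, 0.0))
```

The same function could serve for the mouth and tube-opening populations by passing a different effector position.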

2/11/2007 7

Learning Executability

- Success or failure is signaled by the match or mismatch between efferent signals and mirror system output

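[Equation: update of the executability weights W_{x,y,a} for population O, involving the success signal s_exec and the population activity O_{x,y}(T)]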
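One possible way to read the success/failure idea on this slide is a simple delta rule: when the attempted action matches the action recognized by the mirror system, executability weights for the currently active population units move toward 1, otherwise toward 0. This is an illustrative assumption, not the slide's update equation; the names `update_executability`, `alpha`, and `initial` are hypothetical.

```python
def update_executability(weights, activity, attempted, recognized,
                         alpha=0.1, initial=0.5):
    """Delta-rule sketch (assumed form).

    weights:   dict mapping (x, y, action) -> executability weight
    activity:  dict mapping (x, y) -> population firing rate in [0, 1]
    attempted: action sent as the efferent signal
    recognized: action reported by the mirror system (None if nothing matched)
    """
    success = attempted == recognized
    target = 1.0 if success else 0.0
    for (x, y), rate in activity.items():
        key = (x, y, attempted)
        w = weights.get(key, initial)
        # Only units that are active for the current affordance are changed.
        weights[key] = w + alpha * rate * (target - w)
    return success

weights = {}
activity = {(0, 0): 1.0, (0, 1): 0.0}
first = update_executability(weights, activity, "grasp", "grasp", alpha=0.5)
```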

2/11/2007 8

Learning Desirability

- Eligibility signal computed from
  - Internal state
  - Mirror system output
  - Efferent signal

[Equations: r̂_ext(T) and r̂_int(T) are computed from r_ext(T), r_int(T), and the weights W^IS; the updates of W^IS_{i,ext} and W^IS_{i,int} involve the internal state IS_i, D(T), and r̂_ext(T) or r̂_int(T), respectively.]
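As a rough illustration of desirability learning, the sketch below uses a textbook TD(0) update on a linear function of the internal state. It is an assumption that this matches the slide's rule: the discount `gamma`, the learning rate `alpha`, and the function names are hypothetical, and the eligibility signal is reduced here to "update only the executed action".

```python
def desirability(weights, internal_state, action):
    """Desirability of an action as a weighted sum of internal-state values."""
    return sum(w * s for w, s in zip(weights[action], internal_state))

def td_update(weights, internal_state, action, reward, next_value,
              alpha=0.1, gamma=0.9):
    """Textbook TD(0) step for the executed action; returns the TD error."""
    value = desirability(weights, internal_state, action)
    td_error = reward + gamma * next_value - value
    weights[action] = [w + alpha * s * td_error
                       for w, s in zip(weights[action], internal_state)]
    return td_error

# One internal-state dimension is active (e.g. hunger); eating is rewarded.
weights = {"eat": [0.0, 0.0]}
state = [1.0, 0.0]
err1 = td_update(weights, state, "eat", reward=1.0, next_value=0.0)
err2 = td_update(weights, state, "eat", reward=1.0, next_value=0.0)
```

With repeated rewarded trials the TD error shrinks as the weight approaches the reward value.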

2/11/2007 9

Priority

Simplified form: priority = executability × desirability

Leaky integrator form:

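[Equation: leaky integrator for the state u_a of each priority unit, driven by the executability E_a(t), the weighted internal state W^IS · IS(t), and a random term rand(...); the priority PP_a(t) is read out from u_a(t).]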
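A minimal sketch of the leaky integrator form, assuming the standard equation tau * du/dt = -u + input with input = executability × desirability plus optional Gaussian noise, integrated with forward Euler. The time constant `tau`, step `dt`, and noise model are assumptions, not values from the slide.

```python
import random

def step_priority(u, executability, desirability, dt=0.01, tau=0.1,
                  noise_sd=0.0, rng=random):
    """One forward-Euler step of tau * du/dt = -u + E * D + noise."""
    drive = executability * desirability
    if noise_sd > 0.0:
        drive += rng.gauss(0.0, noise_sd)
    return u + (dt / tau) * (-u + drive)

def settle(executability, desirability, steps=2000, **kwargs):
    """Integrate from u = 0 and return the final priority value."""
    u = 0.0
    for _ in range(steps):
        u = step_priority(u, executability, desirability, **kwargs)
    return u

# Without noise the unit settles at the simplified form E * D.
steady = settle(executability=0.8, desirability=0.5)
```

In the noise-free case the steady state equals the simplified form, priority = executability × desirability, so the two forms on this slide agree once the integrator has settled.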

2/11/2007 10

Action Selection

- A winner is declared when the maximum CC layer element firing rate is greater than or equal to ε1 (0.9) and all other element firing rates are less than or equal to ε2 (0.1).
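The winner criterion can be sketched as a simple check over the layer's firing rates. The function name and the list representation of the CC layer are illustrative assumptions; the thresholds are the ε1 and ε2 values given on the slide.

```python
def declare_winner(rates, eps1=0.9, eps2=0.1):
    """Return the index of the winning element, or None if there is no winner yet.

    A winner exists when the maximum rate is >= eps1 and every other
    element's rate is <= eps2.
    """
    best = max(range(len(rates)), key=lambda i: rates[i])
    others_quiet = all(r <= eps2 for i, r in enumerate(rates) if i != best)
    if rates[best] >= eps1 and others_quiet:
        return best
    return None

winner = declare_winner([0.95, 0.05, 0.10])
```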

2/11/2007 11

ACQ

2/11/2007 12

ACQ Selection Properties

Contrast-Dependent Latency

2/11/2007 13

ACQ Selection Properties

- Approximation to Boltzmann equation:

$p(x) = \frac{1}{1 + e^{-x/T}}$

T = temperature
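A small sketch of the two-alternative Boltzmann (logistic) form, p(x) = 1 / (1 + exp(-x / T)). Reading x as the priority difference between two competing actions is an assumption made for illustration; the function name is hypothetical.

```python
import math

def p_select(x, temperature):
    """Probability of choosing the option favored by a difference x."""
    return 1.0 / (1.0 + math.exp(-x / temperature))

# Lower temperature makes selection more deterministic for the same contrast.
p_warm = p_select(1.0, temperature=1.0)
p_cold = p_select(1.0, temperature=0.1)
```

At zero contrast the choice is at chance, and lowering the temperature sharpens the preference for the higher-priority action.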

2/11/2007 14

ACQ – TD Learning

[Figure: Effects of Desirability Weight Initialization on Mean Trial Length During TD Learning. Panels: no initialized weights; eat initialized; reach-grasp initialized.]

2/11/2007 15

ACQ – Simulation Results

Final Desirability Weights

Mean Trial Length

Mean Unsuccessful Action Attempts

2/11/2007 16

ACQ – Simulation Results

[Figure panels: MF – Eat; MF – Grasp Jaw; PF – Reach Food; PF – Reach Tube]

2/11/2007 17

Where in the Brain is ACQ?

• Affordances
  – Posterior parietal cortex
• Object-directed motor schemas
  – Premotor cortex
• Winner-Take-All
  – Basal ganglia (Winner-Lose-All)
• Desirability Learning
  – Striatum with TD error signal from midbrain dopaminergic system (SNc, VTA)
• What about Executability?

2/11/2007 18

Basal Ganglia Model Implementations (NSL)

• The following models are implemented in NSL and available for extension or experimentation:
  – GPR
  – Brown, Bullock, & Grossberg
  – RDDR

2/11/2007 19

Gurney, Prescott, Redgrave (GPR)

- Interlayer winner-lose-all
- A control signal, calculated from the sum of the cortical signals, provides a gain signal to the competition

2/11/2007 20

GPR

[Diagram of the GPR model. Labeled nuclei: Cortex, Str-D1, Str-D2, STN, GPe, GPi/SNr]

2/11/2007 21

GPR

• What does a consideration of the GPR model bring to ACQ?
  – Intralayer WTA → Interlayer WTA
  – WTA → WLA
• Do we need a control (gain) signal?
  – We may want to explore the possibility of chunking when two actions are activated to similar levels

2/11/2007 22

Brown, Bullock, & Grossberg

• Ventral striatum → ventral pallidum → PPTN
  – Learns to activate SNc given secondary reinforcer
• Cortex → Striosomes
  – Learns to inhibit SNc response to primary reinforcer
  – Learns timing between primary and secondary reinforcers

2/11/2007 23

Brown, Bullock, & Grossberg

2/11/2007 24

Brown, Bullock, & Grossberg

2/11/2007 25

Brown, Bullock, & Grossberg

2/11/2007 26

Brown, Bullock, & Grossberg

• What does a consideration of the Brown, Bullock, & Grossberg model bring to ACQ?
  – A neural method of computing the TD error signal
  – Can we extend it to have multiple primary reinforcers (dimensions of reinforcement)?

2/11/2007 27

Reinforcement Driven Dimensionality Reduction (RDDR)

• Extension of PCA neural network methods to include reinforcement

• Feedforward connections: normalized multi-Hebbian with reinforcement

• Lateral connections: normalized anti-Hebbian

2/11/2007 28

RDDR - Pretraining

2/11/2007 29

RDDR – Mid-training

2/11/2007 30

RDDR - Trained

2/11/2007 31

RDDR - Retraining

2/11/2007 32

RDDR - Retrained

2/11/2007 33

RDDR

• What does a consideration of the RDDR model bring to ACQ?
  – Maybe nothing, but it may be useful in chunking actions in hACQ

2/11/2007 34

Where is Executability?

• We can map ACQ onto the basic BG architecture by modeling an interlayer WLA network with cortico-striatal connection weights encoding desirability and modified via TD learning

• How does executability fit in?

2/11/2007 35

Parietal / Basal Ganglia Projections

• Petras (1971) – Projections from the inferior and superior parietal lobules to the striatum and thalamus

• Cavada & Goldman-Rakic (1991) – Subregions of parietal area 7 project to portions of the striatum bilaterally

• Flaherty & Graybiel (1991)
  – Somatotopic projections from S1 to the striatum
  – These projections innervate only the matrix, not the striosomes

• Graziano & Gross (1993) – Bimodal somatotopic map in putamen

• Lawrence et al. (2000) – Dorsal stream projects to the anterodorsal striatum

2/11/2007 36

ACQ → Basal Ganglia

• Could executability and desirability be represented in segregated regions of the striatum and be combined in the globus pallidus?

• Or perhaps they are combined in the striatum?

2/11/2007 37

References

• Bar-Gad, I., Morris, G., Bergman, H. (2003) Information processing, dimensionality reduction and reinforcement learning in the basal ganglia. Progress in Neurobiology, 71: 439–473.

• Brown, J., Bullock, D., Grossberg, S. (1999) How the Basal Ganglia Use Parallel Excitatory and Inhibitory Learning Pathways to Selectively Respond to Unexpected Rewarding Cues. J. Neurosci., 19(23): 10502-10511.

• Cavada, C., Goldman-Rakic, P.S. (1991) Topographic Segregation of Corticostriatal Projections from Posterior Parietal Subdivisions in the Macaque Monkey. Neuroscience, 42(3): 683-696.

• Flaherty, A.W., Graybiel, A.M. (1991) Corticostriatal Transformations in the Primate Somatosensory System. Projections from Physiologically Mapped Body-Part Representations. J. Neurophys. 66(4): 1249-1263.

• Graziano, M.S.A., Gross, C.G. (1993) A bimodal map of space: Somatosensory receptive fields in the macaque putamen with corresponding visual receptive fields. Exp Brain Res, 97: 96-109.

• Gurney, K., Prescott, T.J., Redgrave, P. (2001) A computational model of action selection in the basal ganglia. I. A new functional anatomy. Biol. Cybern. 84: 401-410.

• Lawrence, A.D., Watkins, L.H.A., Sahakian, B.J., Hodges, J.R., Robbins, T.W. (2000) Visual object and visuospatial cognition in Huntington’s disease: implications for information processing in corticostriatal circuits. Brain, 123: 1349-1364.

• Petras, J.M. (1971) Connections of the Parietal Lobe. J. Psychiat. Res., 8: 189-201.