neurophysiological correlates of spatial navigation optimization in...

Contact Information: [email protected]. Supported by NSF Grants 1117303 and 1117302 to JMF, PD and AW.

[Figure: optimality ratio across trials T1 to T8.]

Methods

- Rats were exposed to both random configurations (different cup locations on every trial, 10 trials) and standard configurations (same cup locations, until convergence or up to 20 trials) of the Traveling Salesperson Problem (TSP) task, with 5 cups per configuration.

- Bilateral inactivation of the dorsal hippocampus (DH) with bupivacaine (2.5%) or ropivacaine (10 mg/ml) before the TSP task (standard configurations only).

- Multi-tetrode recordings from the DH CA1. Sharp wave ripple complexes (SPWs) were extracted by band-pass filtering (100-300 Hz). Spike cutting was done using Spike2.

- Computational models of the hippocampus-basal ganglia (BG) system (decision making) and of the prefrontal cortex (PFC; efficient path formation) were used to model the experimental data.

Summary

1. The dorsal hippocampus is important for spatial optimization.

2. Optimization in the TSP task may be the result of a transition from local to global strategies, as a subject becomes more familiar with the reward locations.

3. SPWs density increased from trial to trial and stayed high for several minutes after the TSP task, whether or not the task involved optimization.

4. SPWs density between trials was positively correlated with performance irrespective of convergence. However, the number of SPWs at target locations increased from cup to cup if the task involved optimization, but decreased if optimization was not possible.

5. After convergence, while there was a general decrease in place cell firing within SPWs, there was a high correlation between this firing and the extent to which the place fields intercepted the trajectory. The correlation remained high in the last trials and for a few minutes after the task only in converged conditions. This suggests that post-task replay preferentially involves cells whose fields are on the converged path.

6. Modeling studies suggest that the hippocampus-BG system could implement path optimization during active spatial navigation using reinforcement learning, and that the PFC could further optimize the trajectories during post-task rest using reservoir-like computation aimed at producing short elementary paths (snippets).

T. Pelc (1), M. Llofriu (2), N. Cazin (3), P. Scleidorovich Chiodi (2), P. Dominey (3), A. Weitzenfeld (2), J.-M. Fellous (1)

(1) Dept. of Psychology, Arizona Univ., Tucson, AZ; (2) Dept. of Computer Sci. and Engin., Univ. of South Florida, Tampa, FL; (3) Human and Robot Cognitive Systems, INSERM, Bron, France

Neurophysiological correlates of spatial navigation optimization in the rodent

Experimental Protocol

[Protocol timeline figure: 20 min pre-rest; random configuration (10 trials); Configuration 1 (4 to 20 trials); Configuration 2 (4 to 20 trials); 5 min post-configuration rest.]

Results: Inactivation

Inactivation (bupivacaine/ropivacaine) of the dorsal hippocampus significantly increases:

- distance traveled

- movement time

- optimality ratio

[Figure: trial completion time (s) versus SPW density (/s); legend Rand r*, Conv r*, NotConv.]

3. SPWs Between Trials vs Performance

4. SPWs at Reward Locations

[Figure: SPWs density (/sec) for Pre-Rest, Rand, Conv and NotConv.]

[Figure: SPWs density (/sec), first 2 trials vs last 3 trials; legend Rand ***, Conv **, NotConv *.]

[Figure: % change in FR from pre-rest, including first 2 trials vs last 3 trials; Rand n=47, Conv n=106, NotConv n=20.]

5. Place Field Intercept Correlations with FR within SPWs

[Figure: place field intercept, examples labeled 15% and 0%.]

Behavior

SPWs and Firing Rate within SPWs

Sharp wave–ripple complexes occur:

1. In HB, between configurations (5 min rest)

2. In HB, between trials (30 sec rest)

3. At the reward locations
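The densities reported on this poster are event rates. A minimal sketch of how such a rate could be computed from SPW timestamps, assuming the events have already been detected and that density means events per second within a rest window. The function name `spw_density` and the timestamps are hypothetical; this is not the authors' code or data.

```python
def spw_density(event_times, start, end):
    """Return SPW events per second inside the window [start, end).

    event_times: iterable of event timestamps in seconds (assumed format).
    """
    if end <= start:
        raise ValueError("window must have positive duration")
    count = sum(start <= t < end for t in event_times)
    return count / (end - start)


# Made-up timestamps (s) and a 30 s inter-trial rest window.
events = [1.0, 4.5, 12.0, 29.9, 31.0]
density = spw_density(events, start=0.0, end=30.0)  # 4 events in 30 s
```

The same function applies to the 5 min rest between configurations by changing the window bounds.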

Computational modeling results

[Figure: Distance Traveled, Movement Time and Optimality Ratio, each marked *; axis ticks 1.0 to 1.3.]

Behavioral Performance

[Figure axis labels: Configurations Features; Configuration.]

Post-trial SPWs density was correlated with trial completion time and with optimality ratio in the random and converged conditions.
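As an illustration of the kind of correlation described here, the sketch below computes a Pearson coefficient between per-trial SPWs density and trial completion time. The poster does not state which correlation measure was used, so Pearson's r is an assumption, and the numbers are made up for the example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Illustrative values only: density (/s) and completion time (s) per trial.
density = [0.3, 0.5, 0.7, 0.9]
completion_time = [50.0, 40.0, 30.0, 20.0]
r = pearson_r(density, completion_time)  # perfectly linear, so r is -1
```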

The number of SPWs at reward locations during the last 3 trials decreased from the first cups to the last cups in random configurations but increased in the converged condition.

SPWs density was significantly higher after the task in all conditions.

However, place cells fired more in SPWs after the Random and Not Converged sessions than after the Converged condition.

SPWs density increased significantly across trials in all conditions.

Place cells did not fire significantly more in SPWs across trials in any condition. Their contributions to SPWs were, however, greater if the rat did not converge.

The firing of place cells during SPWs was correlated with the intersection of the path with their place field after the first 2 trials in all conditions, but remained significant only if the rat converged.

The correlation after convergence remained significant in the 5 min post-configuration periods.
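The poster does not define how the intersection between a path and a place field was quantified. One possible reading, sketched here under explicit assumptions, models each place field as a circle and measures the fraction of trajectory samples that fall inside it. The circular field, the function name `field_intercept` and the coordinates are all hypothetical.

```python
from math import dist


def field_intercept(trajectory, center, radius):
    """Fraction of trajectory samples inside a circular place field.

    trajectory: list of (x, y) positions sampled along the path.
    center, radius: assumed circular approximation of the place field.
    """
    if not trajectory:
        raise ValueError("trajectory must contain at least one sample")
    inside = sum(dist(point, center) <= radius for point in trajectory)
    return inside / len(trajectory)


# Four samples along a straight path; only (1, 0) lies within the field.
path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
intercept = field_intercept(path, center=(1.0, 0.5), radius=1.0)  # 0.25
```

A per-cell intercept value of this kind could then be correlated with that cell's firing rate within SPWs.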

Rats mostly converged to a near-optimal path in fewer than 10 trials. When rats converged, optimization was reflected in decreases in distance traveled, trial completion time, movement time, number of visited reward locations and optimality ratio.
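The optimality ratio is not defined on the poster. The sketch below assumes it is the distance traveled divided by the length of the shortest tour that starts at the home base, visits every cup and returns to the home base. With 5 cups there are only 120 visit orders, so brute force is enough. The closed-tour assumption, the function names and the coordinates are hypothetical; the example uses 3 cups so the result can be checked by hand.

```python
from itertools import permutations
from math import dist


def path_length(points):
    """Total Euclidean length of a path visiting `points` in order."""
    return sum(dist(a, b) for a, b in zip(points, points[1:]))


def optimal_length(home, cups):
    """Shortest closed tour home -> all cups -> home, by brute force."""
    return min(path_length([home, *order, home]) for order in permutations(cups))


def optimality_ratio(trajectory, home, cups):
    """Distance traveled divided by the optimal tour length (assumed definition)."""
    return path_length(trajectory) / optimal_length(home, cups)


home = (0.0, 0.0)
cups = [(1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
good = [home, cups[0], cups[1], cups[2], home]  # unit-square perimeter, length 4
bad = [home, cups[1], cups[0], cups[2], home]   # uses both diagonals
ratio_good = optimality_ratio(good, home, cups)  # 1.0
ratio_bad = optimality_ratio(bad, home, cups)    # (2 + 2*sqrt(2)) / 4
```

Under this definition a ratio of 1 means the rat took an optimal tour, and larger values mean longer paths.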

The first trials of the converged configurations were mostly associated with local strategies; the last trials were associated with both local and global strategies.

[Figure: scatter plot against SPW density (axis ticks 0.3 to 0.9); legend Rand r*, Conv r*, NotConv.]

[Figure: random configurations and converged configurations, first 2 trials vs last 3 trials.]

Each trial is followed by a 30 sec inter-trial interval (ITI).

Convergence: same path for 3 consecutive trials.
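The convergence criterion (same path for 3 consecutive trials) can be sketched as a simple streak check. The sketch assumes that "same path" means the same order of cup visits, which the poster does not spell out; the function name `convergence_trial` and the example visit orders are hypothetical.

```python
def convergence_trial(paths, run=3):
    """Return the 1-based trial number that completes `run` identical
    consecutive paths, or None if that never happens.

    paths: list of visit orders, one tuple of cup labels per trial.
    """
    streak = 1
    for i in range(1, len(paths)):
        streak = streak + 1 if paths[i] == paths[i - 1] else 1
        if streak >= run:
            return i + 1
    return None


# Made-up visit orders for five trials; trials 3 to 5 repeat the same path.
trials = [
    (1, 3, 2, 5, 4),
    (1, 2, 3, 5, 4),
    (1, 2, 3, 4, 5),
    (1, 2, 3, 4, 5),
    (1, 2, 3, 4, 5),
]
converged_at = convergence_trial(trials)  # 5
```

A session that reaches the 20-trial limit without such a streak would return `None`, which corresponds to the not-converged condition.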

1. SPWs Post Configurations

2. SPWs Between Trials

[Figure: number of SPWs per trial at Cup1 to Cup5 (two panels).]

[Figure: correlation coefficient between trials (first 2 trials vs last 3 trials) and post configuration; Random n=41, Conv n=73, NotConv n=17.]

The Brain Mechanisms of Spatial Navigation Optimization

Introduction

The Traveling Salesperson Problem (TSP) involves the search for an optimal solution among a discrete set of possible solutions. The TSP extends beyond physical distances: imaging celestial objects, airport traffic control, manufacture of microchips, DNA sequencing...

Humans and animals are able to find good solutions to such problems (MacGregor & Ormerod, 1996; Menzel, 1973; Watkins et al., 2011). The neural substrate of spatial navigation optimization is largely unknown.

Spatial learning (Packard & McGaugh, 1996)

Place cells (O'Keefe & Nadel, 1978)

Replay of spatial experience during rest (Skaggs & McNaughton, 1996)

Main General Questions

- Is the dorsal hippocampus (DH) necessary for the TSP?

- What are the behavioral strategies used in the TSP?

- Are the rate and spatial distributions of Sharp Wave Ripple complexes (SPWs) and replay events within and between trials correlated with the convergence onto a path?

This multi-goal task makes it possible to relate the trial-to-trial activity of place cells to the rat’s convergence on a specific trajectory.

[Figure labels: Prefrontal Cortex; Ventral Tegmental Area; Home Base (HB); scale bars 100 ms and 30 cm; y-axis Drug/Saline Ratio; order visited in each trial.]

[Figure: configuration features (1. Optimal Path Turning Angle, 2. Greedy Path Turning Angle, 3. Greedy Path Length, 4. Cvxhull Area, 5. Optimal Path Length) against Distance Traveled, Visits, N of Opt Segments on the Path and Trial Time.]

PFC is modeled as an instantiation of the reservoir computing paradigm called the Temporal Recurrent Network. Reservoir input is the replay of place cell subsequences (snippets) during SPW events, as the animal generates increasingly efficient trajectories in the TSP task across trials. Recurrent dynamics consolidate snippet representations and create efficient sequences by rejecting sequences that are less robustly coded.
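To make the reservoir idea concrete, here is a minimal generic sketch of a fixed random recurrent network driven by a snippet of place cell indices. It is not the authors' Temporal Recurrent Network: the network size, weight scaling, one-hot input coding and all function names are assumptions, and the trained readout that would turn reservoir states into sequences is omitted.

```python
import math
import random


def make_reservoir(n_units, n_inputs, scale=0.5, seed=0):
    """Create fixed random input and recurrent weights (never trained)."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_units)]
    gain = scale / math.sqrt(n_units)  # keeps recurrent drive modest
    w_rec = [[rng.uniform(-1, 1) * gain for _ in range(n_units)] for _ in range(n_units)]
    return w_in, w_rec


def step(state, inputs, w_in, w_rec):
    """One reservoir update: tanh of input drive plus recurrent drive."""
    return [
        math.tanh(
            sum(w * u for w, u in zip(w_in[i], inputs))
            + sum(w * x for w, x in zip(w_rec[i], state))
        )
        for i in range(len(state))
    ]


def run_snippet(snippet, n_cells, w_in, w_rec):
    """Feed a snippet (sequence of place cell indices) as one-hot inputs
    and return the final reservoir state."""
    state = [0.0] * len(w_rec)
    for cell in snippet:
        one_hot = [1.0 if j == cell else 0.0 for j in range(n_cells)]
        state = step(state, one_hot, w_in, w_rec)
    return state


w_in, w_rec = make_reservoir(n_units=20, n_inputs=5)
forward = run_snippet([0, 1, 2], 5, w_in, w_rec)
reverse = run_snippet([2, 1, 0], 5, w_in, w_rec)
```

Because the recurrent state carries a trace of earlier inputs, the same cells replayed in a different order leave the reservoir in a different state, which is the property a readout would exploit to distinguish sequences.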

The model results agree with experimental results showing convergence to a near-optimal path in about 10 trials.

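[Figure: sign (+ or -) of the relation between configuration features and Distance Traveled, Visits, N of Opt Segments on the Path, Trial Time and Optimality Ratio, for the first 2 trials and the last 3 trials; the individual signs are not recoverable from the extracted text.]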

Computational model of the Hippocampus

Role of the prefrontal cortex and ‘Reservoir Computing’

References:

De Jong, L. W., Gereke, B., Martin, G. M., & Fellous, J.-M. (2011). The traveling salesrat: insights into the dynamics of efficient spatial navigation in the rodent. J. Neural Eng., 8, 065010.

MacGregor, J. N., & Ormerod, T. (1996). Human performance on the travelling salesman problem. Perception & Psychophysics, 58, 527–539.

Menzel, E. W. (1973). Chimpanzee spatial memory organization. Science, 182, 943–945.

Packard, M. G., & McGaugh, J. L. (1996). Inactivation of hippocampus or caudate nucleus with lidocaine differentially affects expression of place and response learning. Neurobiology of Learning and Memory, 65, 65–72.

O'Keefe, J., & Nadel, L. (1978). The Hippocampus as a Cognitive Map. Clarendon, Oxford.

Skaggs, W. E., & McNaughton, B. L. (1996). Replay of neuronal firing sequences in rat hippocampus during sleep following spatial experience. Science, 271, 1870–1873.

[Figure: ratio of last 3 to first 2 trials for Distance, Trial Time, Movt Time, Visits and Optimality Ratio; Converged (n=55), Not Converged (n=7).]

[Figure: model schematic with place cells, Hippocampus, PFC, Striatum and readout.]

[Figure: Frechet distance across trials T1 to T10.]

[Figure: distance traveled across trials T1 to T10; example paths at T1, T2, T6 and T10.]

The model learns which feeder to go to next. The decision is solely based on the current location, but the model can perform global optimization using reinforcement learning strategies, implemented through the HPC-BG interactions.
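A generic tabular actor-critic can illustrate how a next-feeder choice based only on the current location might be learned. This is a sketch under stated assumptions, not the authors' hippocampus-BG model: the feeder coordinates, reward of 1 for a not-yet-visited feeder, distance cost, learning rates and the step cap `MAX_STEPS` are all hypothetical.

```python
import math
import random

# Hypothetical layout: home base plus five feeders.
FEEDERS = {"F1": (0.0, 1.0), "F2": (1.0, 2.0), "F3": (2.0, 1.0),
           "F4": (1.5, 0.0), "F5": (0.5, 0.0)}
HOME = "HB"
POSITIONS = {HOME: (1.0, 1.0), **FEEDERS}
MAX_STEPS = 200  # safety cap on visits per trial


def softmax_choice(prefs, rng):
    """Sample a key with probability proportional to exp(preference)."""
    names = list(prefs)
    top = max(prefs.values())
    weights = [math.exp(prefs[n] - top) for n in names]
    return rng.choices(names, weights=weights)[0]


def run_trials(n_trials=50, alpha=0.2, beta=0.2, gamma=0.9, cost=0.1, seed=0):
    """Train over trials; return the number of feeder visits in each trial."""
    rng = random.Random(seed)
    value = {s: 0.0 for s in POSITIONS}  # critic: one value per location
    prefs = {s: {f: 0.0 for f in FEEDERS if f != s} for s in POSITIONS}  # actor
    visits_per_trial = []
    for _ in range(n_trials):
        state, unvisited, steps = HOME, set(FEEDERS), 0
        while unvisited and steps < MAX_STEPS:
            action = softmax_choice(prefs[state], rng)
            travel = math.dist(POSITIONS[state], POSITIONS[action])
            reward = (1.0 if action in unvisited else 0.0) - cost * travel
            unvisited.discard(action)
            future = gamma * value[action] if unvisited else 0.0
            delta = reward + future - value[state]  # TD error from the critic
            value[state] += beta * delta
            prefs[state][action] += alpha * delta   # actor update
            state, steps = action, steps + 1
        visits_per_trial.append(steps)
    return visits_per_trial


visits = run_trials()
```

Comparing `visits[:2]` with `visits[-3:]` mirrors the first 2 versus last 3 trials comparison used on the poster, although the outcome depends on the assumed parameters.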

[Figure: ratio of last 3 to first 2 trials for Visits and Distance Traveled; Converged (n=59), Not Converged (n=5).]

The model results agree with the experimental results, showing a decrease in distance traveled and in the number of visited reward locations across trials only in the converged condition.

[Figure legend: start, reward location, optimal path, actual path; HB and cup1 to cup5; SPW events and rat’s trajectory; SPW density (/s).]

[Figure: actor-critic model. Hippocampus place fields feed actor units (dorso-medial striatum) and critic units (VTA - Nuc. Acc.), which select the next feeder among F1 to F5.]
