neurophysiological correlates of spatial navigation optimization in...
TRANSCRIPT
Contact Information: [email protected] by NSF Grants 1117303 and 1117302 to JMF, PD and AW
T1: T2:
T3: T4: T5:
T6: T7: T8:
T1 T2 T3 T4 T5 T6 T7 T80
1
2
Optim
alit
y r
atio
Methods- Rats were exposed to both, random configurations (different cup locations every trial, 10 trials) and standard configurations (same cup locations, until convergence or up to 20 trials) of the TSP task with 5 cups per configuration.
- Bilateral inactivation of the DH (bupivacaine (2.5%) or ropivacaine (10mg/ml) ) before TSP task (standard configurations only).
- Multi-tetrode recordings from the DH CA1. SPWs extracted by filtering (100-300 Hz). Spike cutting using Spike2.
- Computational models of the hippocampus-BG system (decision making) and PFC (efficient path formation) to model the experimental data.
Summary
1. The dorsal hippocampus is important for spatial optimization.
2. Optimization in the TSP task may be the result of a transition from local to global strategies, as a subject becomes more familiar with the reward locations.
3. SPWs density increased from trial to trial and stays high several minutes after the TSP task, whether the task involves optimization or not.
4. SPWs density between trials was positively correlated with performance irrespective of convergence. However, the number of SPWs at target locations increased from cup to cup if the task involved optimization, but decreased if optimization was not possible.
5. After convergence, while there was a general decrease in place cells firing within SPWs, there was a high correlation between their firing and the extent to which the place fields intercepted the trajectory. The correlation remained high in the last trials and a few minutes after the task only in converged conditions. This suggests that post-task replay preferentially involves cells whose fields are on the converged path.
6. Modeling studies suggest that the hippocampus-BG system could implement path optimization during active spatial navigation using reinforcement learning, and that the PFC could further optimize the trajectories during post-task rest using reservoir-like computation aimed at producing short elementary paths (snippets).
1 2 3 2 3 2 1T. Pelc , M. Llofriu , N. Cazin , P. Scleidorovich Chiodi , P. Dominey , A. Weitzenfeld , J.-M. Fellous
1Dept. of Psychology, Arizona Univ., Tucson, AZ; 2Dept. of Computer Sci. and Engin., Univ. of South Florida, Tampa, FL; 3Human and Robot Cognitive Systems, INSERM, Bron, France1Dept. of Psychology, Arizona Univ., Tucson, AZ; 2Dept. of Computer Sci. and Engin., Univ. of South Florida, Tampa, FL; 3Human and Robot Cognitive Systems, INSERM, Bron, France1Dept. of Psychology, Arizona Univ., Tucson, AZ; 2Dept. of Computer Sci. and Engin., Univ. of South Florida, Tampa, FL; 3Human and Robot Cognitive Systems, INSERM, Bron, France
Neurophysiological correlates of spatial navigationNeurophysiological correlates of spatial navigationoptimization in the rodent optimization in the rodent
Neurophysiological correlates of spatial navigationoptimization in the rodent
Neurophysiological correlates of spatial navigation Neurophysiological correlates of spatial navigation optimization in the rodent optimization in the rodent
Neurophysiological correlates of spatial navigation optimization in the rodent
Experimental Protocol
Random Configuration10 trials =
20 minPre Rest
5 min Post Config Rest
Configuration 14 to 20 trials =
Trial
Configuration 24 to 20 trials =
5 min Post Config Rest
5 min Post Config Rest
Results: Inactivation
Inactivation (bupivacaine/ropivacaine) of the dorsal hippocampus significantly increases: - distance traveled - movement time - optimality ratio
20
30
40
50
0.3 0.5 0.7 0.9Trial c
om
ple
tion tim
e (
s)
SPW Density (/s)
Rand r*Conv r*NotConv
3. SPWs Between Trials vs Performance 4. SPWs at Reward Locations
* **
0.0
0.2
0.4
0.6
0.8
1.0
Pre-Rest Rand Conv NotConv
SP
Ws
De
nsi
ty
(/se
c)
SP
Ws
De
nsi
ty
(/se
c)
0.3
0.4
0.5
0.6
0.7
0.8
First2 trials Last3 trials
Rand ***Conv **NotConv *
0
5
10
15
Rand n=47
Convn=106
NotConv n=20
% C
ha
ng
e F
Rfr
om
Pre
-Re
st
% C
ha
ng
e F
Rfr
om
Pre
-Re
st
0
10
20
30
40
First2 trials Last3 trials
Rand n=47Conv n=106NotConv n=20
* *
** *
5. Place Field Intercept Correlations with FR within SPWs
15% 0%
Place Field Intercept
Behavior
SPWs and Firing Rate within SPWsSharp wave–ripple complexes occur: 1. In HB, between configurations (5 min rest) 2. In HB, between trials (30 sec rest) 3. At the reward locations
Computational modeling results
1.0
1.1
1.2
1.3
DistanceTraveled
MovementTime
Optimality Ratio
*
*
*
Behavioral Performance
Config
ura
tions
Featu
res
Configuration
Post-Trial SPWs density was correlated with trial completion time and optimality ratio in the random and converged conditions.
The number of SPWs at reward locations during the last 3 trials decreased from first cups to last cups in random configurations but increased in converged condition.
SPWs density was significantly higher after the task in all conditions.
However, place cells fired more in SPWs after the Random and Not Converged sessions than after the Converged condition.
SPWs density increased significantly across trials in all conditions.
Place cells did not significantly fire more in SPWs across trials in any condition. Their contributions to SPWs were however greater if the rat did not converge.
The firing of place cells during SPWs was correlated with the intersection of the path with their place field after the first 2 trials in all conditions, but remained significant only if the rat converged.
The correlation after convergence remained significant in the 5 min post configuration periods.
Rats mostly converged to a near-optimal path in less then 10 trials. When rats converged, optimization was reflected in the decrease in distance traveled, trial completion time, movement time, number of visited reward locations and optimality ratio.
First trials of the converged configurations were mostly associated with local strategies, last trials were associated with local and global strategies.
1.2
1.7
2.2
0.3 0.5 0.7 0.9
Rand r*Conv r*NotConv
Random configurations
First2 trialsLast3 trials
Converged configurations
*
+30sec ITl Trial
+30sec ITl Trial
+30sec ITl
Convergencesame path for 3 consecutive trials
1
1. SPWs Post Configurations 2. SPWs Between Trials
0.2
0.4
0.6
0.8
Cup1 Cup2 Cup3 Cup4 Cup5
SP
Ws
# p
er
tria
l
0.2
0.4
0.6
0.8
Cup1 Cup2 Cup3 Cup4 Cup5
-0.2
0.0
0.2
0.4
0.6
0.8
Random n=41
Convn=73
NotConvn=17
Corr
ela
tion
Coeffic
ient
Between TrialsFirst2 trials
Last3 trials
-0.2
0.0
0.2
0.4
0.6
0.8
Post Configuration
SP
Ws
# p
er
tria
l
Corr
ela
tion C
oeffic
ient
****
**
****
Random n=41
Convn=73
NotConvn=17
Optim
alit
y R
atio
The Brain Mechanisms of Spatial Navigation Optimization
Humans and animals are able to find good solutions (Macgregor & Ormerod, 1996; Menzel 1973,
Watkins et al. 2011). The neural substrate of spatial navigation optimization is largely unknown.
Introduction
The Traveling Salesperson Problem (TSP) involves the search for an optimal solution among a discrete set of possible solutions. TSP extends beyond physical distances: imaging celestial objects, airport traffic control, manufacture of microchips, DNA sequencing...
Spatial learning(Packard & McGaugh, 1996)
Place cells(O'Keefe & Nadel, 1978)
Replay of spatial experience during rest
(Skaggs & McNaughton, 1996)
Main General QuestionsŸ Is the dorsal hippocampus (DH) necessary for the TSP?
Ÿ What are the behavioral strategies used in the TSP?
Ÿ Are the rate and spatial distributions of Sharp Wave Ripple complexes (SPWs) and replay events within and between trials correlated with the convergence onto a path?
This multi-goal task makes it possible to relate the activity of place cells from trial to trial as to the rat’s convergence on a specific trajectory.
PrefrontalCotex
Ventral TegmentalArea
100 ms
30 cm
Dru
g/S
alin
e R
atio
Home Base(HB)
Order visited in each trial Order visited in each trial
4. Cvxhull Area
5. Optimal Path Length
1. Optimal Path Turning Angle
3. Greedy Path Length
Distance Traveled
Visits N of Opt Segments
on the Path
Trial Time
2. Greedy Path Turning Angle
2
PFC is modeled as an instantiation of the reservoir computing paradigm called the Temporal Recurrent Network. Reservoir input is the replay of place cell subsequences (snippets) during SPW events, as the animal generates increasingly efficient trajectories in the TSP task across trials. Recurrent dynamics consolidate snippet representations and create efficient sequences by rejecting sequences that are less robustly coded.
The model results agree with experimental results showing convergence to a near-optimal path in about 10 trials.
***
OptimalityRatio
Distance Traveled
Visits N of Opt Segments
on the Path
Trial Time
OptimalityRatio
First 2 trials Last 3 trials
- - - -- -
-
++
+--- + +
-+
Computational model of the Hippocampus
Role of the prefrontal cortex and ‘Reservoir Computing’
References:De Jong, L. W. Gereke, B. Martin, G. M. Fellous, J-M (2011) The traveling salesrat: insights into the dynamics of efficient spatial navigation in the rodent J. Neural Eng. 8 065010MacGregor, J. N., & Ormerod, T. (1996). Human performance on the travelling salesman problem. Perception & Psychophysics, 58, 527–539.Menzel, E. W. (1973). Chimpanzee spatial memory organization. Science, 182, 943–945.Packard, M.G. McGaugh, J.L. (1996).Inactivation of hippocampus or caudate nucleus with lidocaine differentially affects expression of place and response learning. Neurobiology of Learning and Memory, 65 , pp. 65–72O'Keefe, J. Nadel, L. (1978) The Hippocampus as a Cognitive Map (Clarendon, Oxford).Skaggs, W. E. & McNaughton, B. L. (1996). Replay of neuronal firing sequences in rat hippocampus during sleep following spatial experience. Science 271, 1870–1873
0.8
1.0
1.2
Distance TrialTime
MovtTime
Visits OptimalityRatio
La
st 3
/ F
irst
2 tria
ls
Converged (n=55)
Not Converged (n=7)
* ** *
*
4
53
Placecells
Hippocampus PFCStriatum
0.6
1.0
1.4
1.8
T1 T2 T3 T4 T5 T6 T7 T8 T9 T10
Fre
chet D
ista
nce
0
10
20
30
40
50
60
T1 T2 T3 T4 T5 T6 T7 T8 T9 T10
Dis
tance
Tra
vele
d
T1 T2
T6 T10
Readout
The model learns which feeder to go to next. The decision is solely based on the current location, but the model can perform global optimization using reinforcement learning strategies, implemented through the HPC-BG interactions.
0.0
0.5
1.0
1.5
Visits DistanceTraveled
Last
3 /
First
2 t
rials
Converged (n= 59)
Not Converged (n= 5)
* *
The model results agree with experimental results showing a decrease in distance traveled and in the number of visited reward locations across trials only in converged condition.
Start
Reward location
SPW Density (/s)
Hippocampus
Optimal Path
Actual Path
HBcup2
cup1
cup4
cup3
cup5SPW events Rat’s trajectory
Actor Critic
NextFeeder
Dorso-medial StraiatumActor Units
VTA - Nuc. Acc.Critic Units
D D D D
D D D D
HippocampusPlace Fields
F1 F2 F3 F4 F5
F2F4
F1 F5
F3