Procedural Skill Unlearning
Matthew J. Crossley
Department of Psychological and Brain Sciences, University of California, Santa Barbara, 93106
Talk Goals
• Build and test a computational cognitive neuroscience (CCN) model of procedural skill learning and unlearning
• CCN models predict neurobiology and behavior
Outline
• Procedural Skills
• Model Architecture
• Instrumental Conditioning Applications
• Category Learning Applications
• Closing Remarks
Procedural Skills
• Learned incrementally from feedback
• E.g., riding a bike or playing an instrument
• E.g., radiology
Appealing Choices
• Much is known about the relevant neurobiology for each
• Each has investigated unlearning
Procedural Skills Depend on the Basal Ganglia
• Basal ganglia are a collection of subcortical nuclei
• Interconnect with cortex in well-defined circuits
• Striatum is a major input structure
Procedural Learning Depends on the Striatum
• Single-cell recordings (Carelli, Wolske, & West, 1997; Merchant, Zainos, Hernández, Salinas, & Romo, 1997; Romo, Merchant, Ruiz, Crespo, & Zainos, 1995)
• Lesion studies (Eacott & Gaffan, 1991; Gaffan & Eacott, 1995; Gaffan & Harrison, 1987; McDonald & White, 1993, 1994; Packard, Hirsch, & White, 1989; Packard & McGaugh, 1992)
• Neuropsychological patient studies (Filoteo, Maddox, & Davis, 2001; Filoteo, Maddox, Salmon, & Song, 2005; Knowlton, Mangels, & Squire, 1996)
• Neuroimaging (Nomura et al., 2007; Seger & Cincotta, 2002; Waldschmidt & Ashby, 2011)
Striatal Neurons
Medium Spiny Projection Neurons (MSNs) 96%
GABA Interneurons 2%
TANs - Cholinergic Interneurons 2%
The TANs are of Particular Interest
• Tonically active; pause in response to excitatory input
• Presynaptically inhibit cortical input to MSNs
• Receive major input from the CM-Pf (centromedian-parafascicular) complex of the thalamus
• Learn to pause to stimuli that predict reward (requires dopamine)
Learning Occurs at the CTX-MSN Synapse and at Pf-TAN Synapses
Pf-TAN Synapse
CTX-MSN Synapse
Ashby and Crossley (2011)
Response and Feedback
• The model responds when SMA (supplementary motor area) activation crosses a threshold
• Model is given feedback after every trial
CTX-MSN Synaptic Modification Requires a TAN Pause
• Synaptic Strengthening:
- Strong presynaptic activation
- Strong postsynaptic activation
- Elevated DA levels
• Synaptic Weakening:
- Strong presynaptic activation
- Strong postsynaptic activation
- Depressed DA levels
Arbuthnott, Ingham, & Wickens (2000); Calabresi, Pisani, Mercuri, & Bernardi (1996); Reynolds & Wickens (2002)
Synaptic Plasticity in the Striatum Depends on Dopamine (DA)
• Synaptic Strengthening:
- Strong presynaptic activation
- Strong postsynaptic activation
- Elevated DA levels
• Synaptic Weakening:
- Strong presynaptic activation
- Strong postsynaptic activation
- Depressed DA levels
Arbuthnott, Ingham, & Wickens (2000); Calabresi, Pisani, Mercuri, & Bernardi (1996); Reynolds & Wickens (2002)
DA Encodes Reward Prediction Error (RPE)
• Elevated after unexpected reward
• Depressed after unexpected no-reward
• Unchanged from baseline when the outcome is expected
Bayer & Glimcher (2005)
Computing RPE
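Obtained feedback on trial n:
$$R_n = \begin{cases} 1 & \text{if positive feedback} \\ 0 & \text{otherwise} \end{cases}$$
Predicted feedback on trial n:
$$P_n = P_{n-1} + \alpha \left( R_{n-1} - P_{n-1} \right)$$
RPE on trial n:
$$\mathrm{RPE}(n) = R_n - P_n$$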
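The three definitions above can be traced in a few lines of Python. This is a minimal sketch rather than the talk's implementation: the function name `rpe_trace`, the initial prediction of 0, and the learning-rate value of 0.1 are assumptions chosen only for illustration.

```python
def rpe_trace(feedback, learning_rate=0.1, initial_prediction=0.0):
    """Return (predictions, rpes) for a sequence of 0/1 feedback values.

    learning_rate and initial_prediction are illustrative assumptions,
    not values taken from the talk.
    """
    predictions, rpes = [], []
    prediction = initial_prediction
    for reward in feedback:
        predictions.append(prediction)
        rpes.append(reward - prediction)  # RPE(n) = R_n - P_n
        # Prediction for the next trial: P_{n+1} = P_n + rate * (R_n - P_n)
        prediction += learning_rate * (reward - prediction)
    return predictions, rpes


if __name__ == "__main__":
    # Consistent positive feedback: the RPE shrinks as reward becomes predicted.
    preds, rpes = rpe_trace([1] * 50)
    print(rpes[0], rpes[-1])
```

With consistent positive feedback the prediction climbs toward 1, so the RPE starts large and decays toward 0, which is the "expected outcome" case in which DA stays at baseline.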
Updating Synapses in the Model
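$$
\begin{aligned}
w_{K,J}(n+1) ={}& w_{K,J}(n) \\
&+ \alpha_w I_K(n)\,[S_J(n) - \theta_{NMDA}]^+\,[D(n) - D_{base}]^+\,[1 - w_{K,J}(n)] \\
&- \beta_w I_K(n)\,[S_J(n) - \theta_{NMDA}]^+\,[D_{base} - D(n)]^+\,w_{K,J}(n) \\
&- \gamma_w I_K(n)\,[\theta_{NMDA} - S_J(n)]^+\,[S_J(n) - \theta_{AMPA}]^+\,w_{K,J}(n).
\end{aligned}
$$
Presynaptic activity: $I_K(n)$, in both the synaptic strengthening and the synaptic weakening terms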
Updating Synapses in the Model
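$$
\begin{aligned}
w_{K,J}(n+1) ={}& w_{K,J}(n) \\
&+ \alpha_w I_K(n)\,[S_J(n) - \theta_{NMDA}]^+\,[D(n) - D_{base}]^+\,[1 - w_{K,J}(n)] \\
&- \beta_w I_K(n)\,[S_J(n) - \theta_{NMDA}]^+\,[D_{base} - D(n)]^+\,w_{K,J}(n) \\
&- \gamma_w I_K(n)\,[\theta_{NMDA} - S_J(n)]^+\,[S_J(n) - \theta_{AMPA}]^+\,w_{K,J}(n).
\end{aligned}
$$
Postsynaptic activation: $[S_J(n) - \theta_{NMDA}]^+$, in both the synaptic strengthening and the synaptic weakening terms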
Updating Synapses in the Model
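$$
\begin{aligned}
w_{K,J}(n+1) ={}& w_{K,J}(n) \\
&+ \alpha_w I_K(n)\,[S_J(n) - \theta_{NMDA}]^+\,[D(n) - D_{base}]^+\,[1 - w_{K,J}(n)] \\
&- \beta_w I_K(n)\,[S_J(n) - \theta_{NMDA}]^+\,[D_{base} - D(n)]^+\,w_{K,J}(n) \\
&- \gamma_w I_K(n)\,[\theta_{NMDA} - S_J(n)]^+\,[S_J(n) - \theta_{AMPA}]^+\,w_{K,J}(n).
\end{aligned}
$$
Elevated DA: $[D(n) - D_{base}]^+$, in the synaptic strengthening term
Depressed DA: $[D_{base} - D(n)]^+$, in the synaptic weakening term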
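The update rule on these slides can be sketched as a single Python function. This is an illustration under stated assumptions: it reads the rule as one strengthening term and two weakening terms, each a product of presynaptic activity, rectified postsynaptic activation, rectified DA deviation from baseline, and a weight-dependent factor. The names `rectify` and `update_weight` and every numeric default (baseline DA, thresholds, rate constants) are hypothetical placeholders, not the model's fitted parameters.

```python
def rectify(x):
    """[x]^+ : returns x when positive, otherwise 0."""
    return x if x > 0.0 else 0.0


def update_weight(w, pre, post, da, *, da_base=0.2, theta_nmda=0.5,
                  theta_ampa=0.1, alpha=0.1, beta=0.05, gamma=0.01):
    """One trial of the CTX-MSN weight update (all defaults are assumed values).

    w    : current synaptic strength, assumed to lie in [0, 1]
    pre  : presynaptic (cortical) activity
    post : postsynaptic (MSN) activation
    da   : dopamine level on this trial
    """
    # Strengthening: strong pre and post activity with DA above baseline.
    strengthen = (alpha * pre * rectify(post - theta_nmda)
                  * rectify(da - da_base) * (1.0 - w))
    # Weakening: strong pre and post activity with DA below baseline.
    weaken_da = (beta * pre * rectify(post - theta_nmda)
                 * rectify(da_base - da) * w)
    # Weakening: postsynaptic activation between the AMPA and NMDA thresholds.
    weaken_low = (gamma * pre * rectify(theta_nmda - post)
                  * rectify(post - theta_ampa) * w)
    return w + strengthen - weaken_da - weaken_low


if __name__ == "__main__":
    print(update_weight(0.5, pre=1.0, post=1.0, da=0.6))  # DA above baseline
    print(update_weight(0.5, pre=1.0, post=1.0, da=0.0))  # DA below baseline
    print(update_weight(0.5, pre=0.0, post=1.0, da=0.6))  # no presynaptic input
```

Because presynaptic activity multiplies every term, the weight cannot change on a trial where no cortical input reaches the synapse, whatever the DA level.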
Model Accounts for Basic Instrumental Conditioning Behavior
Ashby and Crossley (2011)
Fast reacquisition is evidence that extinction did not erase initial learning
Slowed Reacquisition
Phase / Condition | Ext2             | Ext8             | Prf2          | Prf8
Acquisition       | VI-30 sec        | VI-30 sec        | VI-30 sec     | VI-30 sec
Extinction        | No Reinforcement | No Reinforcement | Lean Schedule | Lean Schedule
Reacquisition     | VI-2 min         | VI-8 min         | VI-2 min      | VI-8 min
Woods and Bouton (2007)
Renewal - Basic Design
Phase / Condition    | ABA           | AAB           | ABC
Acquisition          | Environment A | Environment A | Environment A
Extinction           | Environment B | Environment A | Environment B
Renewal (Extinction) | Environment A | Environment B | Environment C
Bouton et al. (2011)
Instrumental Conditioning Summary
• The TANs protect learning at CTX-MSN synapses.
• Manipulations that keep the TANs paused during extinction leave learning at the CTX-MSN synapse subject to change.
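The protective role described in these two bullets can be illustrated with a toy simulation. It is a deliberately simplified sketch, not the model from the talk: it assumes that cortical input reaches the MSN only while the TANs are paused, and that every extinction trial produces a DA dip that weakens the synapse in proportion to that gated input. The names `gated_input` and `extinction_weight` and the weakening rate are hypothetical.

```python
def gated_input(cortical_input, tan_paused):
    """Cortical drive reaching the MSN; blocked unless the TANs are paused."""
    return cortical_input if tan_paused else 0.0


def extinction_weight(w, trials, tan_paused, cortical_input=1.0,
                      weaken_rate=0.05):
    """Weight after `trials` extinction trials under a toy weakening rule.

    weaken_rate is an assumed constant standing in for the DA-dip term.
    """
    for _ in range(trials):
        pre = gated_input(cortical_input, tan_paused)
        w -= weaken_rate * pre * w  # no change when the gated input is 0
    return w


if __name__ == "__main__":
    print(extinction_weight(0.8, 50, tan_paused=False))  # weight preserved
    print(extinction_weight(0.8, 50, tan_paused=True))   # weight decays
```

When the TANs stop pausing, the gated input is zero and the learned weight survives extinction; when they stay paused, the same extinction trials erode it.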
Many Qualitative Differences Between RB and II
                                 | RB  | II
Unsupervised learning            | Yes | No
Observational learning           | Yes | No
Dual-task interference           | Yes | No
Time needed to process feedback  | Yes | No
Interference from button switch  | No  | Yes
Interference from feedback delay | No  | Yes
II Category Learning is a Procedural Skill
General Experiment Design
Crossley, Maddox & Ashby (in prep)
Phase / Condition | Active Condition             | Meta-Learning Condition
Acquisition       | True Feedback                | True Feedback
Extinction        | Active Feedback Manipulation | Active Feedback Manipulation
Reacquisition     | True Feedback                | True Feedback, Rotated Categories
II Category-Unlearning
Rotation of this kind massively interferes with category learning performance (Maddox, Glass, O’Brien, Filoteo & Ashby, 2010)
Experiment 1
Crossley, Maddox & Ashby (in prep)
Phase / Condition | Random-Feedback Extinction | Random-Feedback Meta-Learning
Acquisition       | True Feedback              | True Feedback
Extinction        | Random Feedback            | Random Feedback
Reacquisition     | True Feedback              | True Feedback, Rotated Categories
Experiment 1 - Reacquisition
Fast Reacquisition: Random feedback does not interfere with initial learning
Importance
The results of Experiment 1 are inconsistent with all existing theories of category learning
Problem with RPE DA Model
• Random feedback prevents reward from becoming predicted
• TANs don’t stop pausing during extinction
• CTX-MSN synapses remain vulnerable to unlearning
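The first bullet can be checked numerically with the standard RPE update. The sketch below is illustrative only: it assumes that random feedback means positive feedback on half the trials, and the function name `mean_abs_rpe`, the learning rate, the seed, and the trial counts are all hypothetical choices.

```python
import random


def mean_abs_rpe(feedback, learning_rate=0.05, initial_prediction=0.0,
                 last=200):
    """Mean |RPE| over the final `last` trials of a 0/1 feedback sequence."""
    prediction = initial_prediction
    abs_rpes = []
    for reward in feedback:
        abs_rpes.append(abs(reward - prediction))
        # Standard delta-rule update of predicted reward.
        prediction += learning_rate * (reward - prediction)
    tail = abs_rpes[-last:]
    return sum(tail) / len(tail)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    true_feedback = [1] * 1000  # reward becomes fully predicted
    random_feedback = [rng.randint(0, 1) for _ in range(1000)]
    print(mean_abs_rpe(true_feedback), mean_abs_rpe(random_feedback))
```

Under consistent feedback the RPE vanishes, whereas under random feedback the predicted reward hovers near 0.5 and the RPE stays large on every trial. In an RPE-driven DA model that is why the TANs keep pausing and the CTX-MSN synapses stay exposed.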
A New Dopamine Model
• DA is scaled by response-feedback contingency
• Contingency is estimated as the correlation between response confidence and feedback history
The DA model suggests that response-feedback contingency is the important factor in keeping the TANs paused
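One way to read the contingency idea on this slide is sketched below. Everything here is an assumption made for illustration: contingency is taken to be the Pearson correlation between response confidence and feedback over a recent window, negative correlations are clamped to zero, and the DA deviation from baseline is the RPE multiplied by that contingency. The names `pearson` and `scaled_da` and the baseline value are hypothetical and should not be taken as the model's published equations.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either sequence has no variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def scaled_da(rpe, confidence_history, feedback_history, da_base=0.2):
    """DA level: baseline plus RPE scaled by (assumed) contingency in [0, 1]."""
    contingency = max(0.0, pearson(confidence_history, feedback_history))
    return da_base + contingency * rpe


if __name__ == "__main__":
    confidence = [0.9, 0.1, 0.8, 0.2]
    print(scaled_da(0.5, confidence, [1, 0, 1, 0]))  # feedback tracks confidence
    print(scaled_da(0.5, confidence, [1, 1, 0, 0]))  # feedback unrelated to it
```

When feedback tracks confidence the DA response stays close to the full RPE; when feedback is unrelated to confidence the same RPE leaves DA at baseline, so in this sketch noisy feedback alone no longer drives synaptic change.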
Using the Model to Develop an Effective Unlearning Protocol
Experiment 2
Crossley, Maddox & Ashby (in prep)
Phase / Condition | Partially-Contingent Extinction | Partially-Contingent Meta-Learning
Acquisition       | True Feedback                   | True Feedback
Extinction        | Partially-Contingent            | Partially-Contingent
Reacquisition     | True Feedback                   | True Feedback, Rotated Categories
Experiment 2 - Reacquisition
Slow Reacquisition: Partially-Contingent Random feedback interferes with initial learning
Experiment 3
Crossley, Maddox & Ashby (in prep)
Phase / Condition | Non-Contingent-40 Extinction | Non-Contingent-40 Meta-Learning
Acquisition       | True Feedback                | True Feedback
Extinction        | Non-Contingent-40 Feedback   | Non-Contingent-40 Feedback
Reacquisition     | True Feedback                | True Feedback, Rotated Categories
Experiment 3 - Reacquisition
Fast Reacquisition: Non-Contingent-40 feedback does not erase initial learning
Summary
• TANs protect cortical-striatal synapses during periods when reward delivery is not contingent on behavior
• RPE may not be sufficient to capture DA behavior under noisy feedback conditions
• Key to unlearning may be to simulate a TAN pause (e.g., with drugs) or to trick the TANs into pausing (e.g., with partially reliable feedback) during the unlearning process