Computational models of cognitive control (II)
Matthew Botvinick
Princeton Neuroscience Institute and Department of Psychology, Princeton University
Banishing the homunculus
Decision-making in control:
Not only, “How does control shape decision-making?”
But also, “How are ‘control states’ selected?”
And, “How are they updated over time?”
[Figure: model schematic. Perceptual input (viewed object, held object) arrives from the environment; actions (manipulative, perceptual) act on it.]
1. Routine sequential action
Botvinick & Plaut, Psychological Review, 2004
Botvinick, Proceedings of the Royal Society, B, 2007
Botvinick, TICS, 2008
‘Routine sequential action’
• Action on familiar objects
• Well-defined sequential structure
• Concrete goals
• Highly routine
• Everyday tasks
Hierarchical structure
MAKE INSTANT COFFEE
  ADD GROUNDS, ADD CREAM, ADD SUGAR
    ADD SUGAR FROM SUGARPACK, ADD SUGAR FROM SUGARBOWL
      PICK-UP, PUT-DOWN, POUR, STIR, TEAR, SCOOP
Hierarchical models of action
• Hierarchical structure of task built directly into architecture
(e.g., Cooper & Shallice, 2000; Estes, 1972; Houghton, 1990; MacKay, 1987; Rumelhart & Norman, 1982)
• Schemas as primitive elements
An alternative approach
[Figure: network unrolled over three time steps, showing p_t, s_t, a_t; p_{t+1}, s_{t+1}, a_{t+1}; and p_{t+2}, s_{t+2}, a_{t+2}.]
• p, s, a = patterns of activation over simple processing units
• Weighted, excitatory/inhibitory connections
• Weights adjusted through gradient-descent learning in target task domains
Recurrent neural networks
• Feedback as well as feedforward connections
• Allow preservation of information over time
• Demonstrated capacity to learn sequential behaviors (e.g., Cleeremans, 1993; Elman, 1990)
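A minimal sketch of one recurrent step, assuming that p, s, and a stand for perceptual input, internal state, and action (a reading suggested by the figure labels), that internal units are logistic, and that action units use a softmax. The layer sizes, weight ranges, and function names are hypothetical, and no learning is shown.

```python
import math
import random

def make_matrix(rows, cols, rng, scale=0.5):
    """Small random weights; positive = excitatory, negative = inhibitory."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def srn_step(p, s_prev, w_ps, w_ss, w_sa):
    """One time step: input p and previous state s_prev give a new state s
    and a probability distribution a over actions."""
    net = [x + y for x, y in zip(matvec(w_ps, p), matvec(w_ss, s_prev))]
    s = [1.0 / (1.0 + math.exp(-n)) for n in net]   # logistic units (assumed)
    logits = matvec(w_sa, s)
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    a = [e / total for e in exps]                   # softmax over actions (assumed)
    return s, a

rng = random.Random(0)
n_p, n_s, n_a = 4, 5, 3                             # hypothetical layer sizes
w_ps = make_matrix(n_s, n_p, rng)                   # feedforward: input -> state
w_ss = make_matrix(n_s, n_s, rng)                   # feedback: state -> state
w_sa = make_matrix(n_a, n_s, rng)                   # feedforward: state -> action

def run(inputs):
    """Feed a sequence of input vectors; return the action distribution per step."""
    s = [0.5] * n_s                                 # assumed neutral starting state
    outputs = []
    for p in inputs:
        s, a = srn_step(p, s, w_ps, w_ss, w_sa)
        outputs.append(a)
    return outputs
```

Because the state feeds back into itself, the same final input can yield different action distributions depending on what came before, which is the sense in which feedback connections preserve information over time.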
The model
[Figure: schematic linking environment, perceptual input, internal representation, and action.]
Fixate(Blue) Fixate(Green) Fixate(Top)
PickUp Fixate(Table) PutDown
Fixate(Green) PickUp
Ballard, Hayhoe, Pook & Rao (1996). BBS.
Model architecture
[Figure: perceptual input (viewed object, held object) from the environment; action output (manipulative, perceptual).]
Routine sequential action: Task domain
• Hierarchically structured
• Actions/subtasks may appear in multiple contexts
• Environmental cues alone sometimes insufficient to guide action selection
• Subtasks that may be executed in variable order
• Subtask disjunctions
[Figure: diagram of the trained task sequences, with nodes Start, grounds, cream, steep tea, drink, and End.]
Representations
VIEWED INPUT         HELD INPUT    ACTION
cup                  cup           pickup
1handle              1handle       putdown
2handles             2handles      pour
lid                  lid           peelopen
water                water         tearopen
brownliquid          brownliquid   pullopen
milk                 milk          pinchlift
carton               carton        scoop
open                 open          sip
closed               closed        stir
packet               packet        locate-cup
foil                 foil          locate-sugar
paper                paper         locate-sugarbowl
torn                 torn          locate-teabag
untorn               untorn        locate-coffeepack
spoon                spoon         locate-spoon
teabag               teabag        locate-carton
sugar                sugar         saydone
coffee-instruction   nothing
tea-instruction
sugar-packet
Manipulative actions
Perceptual actions
STEP  VIEWED (input)            HELD (input)              ACTION (target/output)
1     cup-1handle-clearliquid   nothing                   locate-coffeepack
2     packet-brownfoil-untorn   nothing                   pickup
3     packet-brownfoil-untorn   packet-brownfoil-untorn   pullopen
4     packet-brownfoil-torn     packet-brownfoil-torn     locate-cup
5     cup-1handle-clearliquid   packet-brownfoil-torn     pour
6     cup-1handle-brownliquid   packet-brownfoil-torn     locate-spoon
7
Model behavior
[Figure: diagrams of the task sequences (Start, grounds, cream, drink, steep tea, End), annotated with the percentages 15% / 18%, 12% / 10%, and 20% / 25%.]
Slips of action (after Reason)
• Occur at decision (or fork) points
• Sequence errors involve subtask omissions, repetitions, and lapses
• Lapses show effect of relative task frequency
Sample of behavior:
grounds:
  pick-up coffee-pack
  pull-open coffee-pack
  pour coffee-pack into cup
  put-down coffee-pack
  pick-up spoon
  stir cup
  put-down spoon
sugar (pack):
  pick-up sugar-pack
  tear-open sugar-pack
  pour sugar-pack into cup
  put-down sugar-pack
  pick-up spoon
  stir cup
  put-down spoon
drink (cream omitted at *):
  pick-up cup*
  sip cup
  sip cup
  say-done
[Figure: x-axis "Step in coffee sequence" (subtasks 1 to 4).]
[Figure: percentage of trials error-free (0 to 100) against noise level (variance: 0.02, 0.1, 0.2, 0.3).]
[Figure: percentage of trials showing omissions / anticipations, repetitions / perseverations, and intrusions / lapses; category labels: steep tea, sugar, cream.]
[Figure: odds of lapse into coffee-making (0 to 0.16) against tea : coffee ratio (5:1, 1:1, 1:5).]
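The plots here vary a noise level expressed as a variance. A minimal sketch of such a manipulation, assuming zero-mean Gaussian noise added independently to each unit of the internal representation; the function name and the example vector are hypothetical, and the point at which noise enters the model is an assumption, not something stated on the slides.

```python
import math
import random

def add_noise(state, variance, rng):
    """Return a copy of `state` with zero-mean Gaussian noise of the given
    variance added to every unit."""
    sd = math.sqrt(variance)
    return [x + rng.gauss(0.0, sd) for x in state]

rng = random.Random(1)
state = [0.2, 0.9, 0.5, 0.1]            # hypothetical internal representation
for variance in (0.02, 0.1, 0.2, 0.3):  # noise levels taken from the plot axis
    noisy = add_noise(state, variance, rng)
    print(variance, [round(x, 2) for x in noisy])
```

Larger variances push the internal state further from its learned trajectory on average, which is one way to read the fall in error-free trials as noise increases.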
Action disorganization syndrome (after Schwartz and colleagues)
• Fragmentation of sequential structure (independent actions)
• Specific error types
• Omission effect
Sample of behavior:
pick-up coffee-pack
pull-open coffee-pack
put-down coffee-pack*
pick-up coffee-pack
pour coffee-pack into cup
put-down coffee-pack
pick-up spoon
stir cup
put-down spoon
pick-up sugar-pack
tear-open sugar-pack
pour sugar-pack into cup
put-down sugar-pack
pick-up cup*
put-down cup
pull-off sugarbowl lid*
put-down lid
pick-up spoon
scoop sugarbowl with spoon
put-down spoon*
pick-up cup*
sip cup
sip cup
say-done
Annotations on the errors marked *: sugar repeated; cream omitted; disrupted subtask; subtask fragment (twice).
Error type              Example                                  Percentage
Omission                Sugar not added                          77 (30-40)
Sequence:                                                        15 (20)
  Anticipation          Pour cream without opening
  Perseveration         Add cream, add sugar, add cream again
  Reversal              Stir water then add grounds
Other:                                                           8 (30)
  Object substitution   Stir with coffee-pack
  Gesture substitution  Pour gesture substituted for stir
  Tool omission         Pour sugarbowl into cup
  Action addition       Scoop sugar with, then put down, lid
  Quality               Pour cream four times in a row
Empirical data: Schwartz, et al. Neuropsychology, 1991
[Figure: proportion of independents (0 to 0.7) against noise (variance, 0.5 down to 0).]
From: Schwartz, et al. Neuropsychology, 1998.
[Figure: errors (per opportunity, 0 to 70) against noise (variance: 0.3, 0.2, 0.1, 0.04), plotted separately for sequence errors and omission errors.]
[Figure: standardized error rate (0 to 70) for CHI subjects 1 to 30, by error type (sequence, omission, substitution).]
Internal representations
[Figure: internal representations plotted on two axes (ticks from -1.6 to 1.9 and from -1.2 to 0.8), shown over several builds and labeled by subtask: grounds, cream, drink, steep tea.]
Etiology of a slip
[Figure: the same plot with the tea representation and the coffee representation labeled; two further panels contrast "Coffee more frequent" with "Tea more frequent".]
primary sensory     primary motor
unimodal assn.      premotor
multimodal assn.    prefrontal
[Figure: layered network: Input, Peripheral (input), Intermediate (input), Apex, Intermediate (output), Peripheral (output), Output.]
Store-Ignore-Recall (SIR) task
Input: 9, 8, 4, 7, R
Output: “nine”, “eight”, “four”, “seven”, “eight”
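A minimal sketch of the target responses for a Store-Ignore-Recall sequence, assuming each digit is named when presented, one digit is flagged for storage, and "R" calls for the stored digit. The transcript does not show how the stored item was cued, so the `store` flag and the choice of 8 as the stored digit (inferred from the final response) are assumptions, and the function name is hypothetical.

```python
WORDS = {0: "zero", 1: "one", 2: "two", 3: "three", 4: "four",
         5: "five", 6: "six", 7: "seven", 8: "eight", 9: "nine"}

def sir_targets(trials):
    """trials: list of (symbol, store) pairs; symbol is a digit 0-9 or "R".
    Every digit is named; a digit with store=True is also held in memory.
    "R" is answered with the name of the most recently stored digit."""
    stored = None
    responses = []
    for symbol, store in trials:
        if symbol == "R":
            if stored is None:
                raise ValueError("recall requested before anything was stored")
            responses.append(WORDS[stored])
        else:
            responses.append(WORDS[symbol])
            if store:
                stored = symbol
    return responses

# The sequence from the slide, with 8 assumed to be the stored digit.
trials = [(9, False), (8, True), (4, False), (7, False), ("R", False)]
print(sir_targets(trials))  # ['nine', 'eight', 'four', 'seven', 'eight']
```

The recall response depends on an input several steps back, so a network solving the task has to carry that item across the intervening, ignored digits.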
[Figure: coding ratio (0 to 7) for each layer: Peripheral (input), Intermediate (input), Apex, Intermediate (output), Peripheral (output).]
Conclusions
• Architectural hierarchy is not necessary for hierarchically structured behavior (or to understand action errors). Recurrent connectivity combined with graded, distributed representation is sufficient.
• Nonetheless, if architectural hierarchy is present, it can lead to a graded division of labor, according to which units furthest from sensory and motor peripheries specialize in coding information pertaining to temporal context.
• This may give us a way of explaining why the prefrontal cortex seems to be involved in routine sequential behavior.
2. Hierarchical reinforcement learning
Botvinick, Niv & Barto, Cognition, in press.
Botvinick, TICS, 2008
Reinforcement Learning
1. States
2. Actions
3. Transition function
4. Reward function
Policy?
Action strengths
State values
Prediction error
$\delta = r_{t+1} + \gamma V(s_{t+1}) - V(s_t)$
$V(s_t) \leftarrow V(s_t) + \alpha_C \delta$
$W(s_t, a) \leftarrow W(s_t, a) + \alpha_A \delta$
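A minimal sketch of the prediction error and the two updates in tabular form, assuming each update adjusts the entry for the state just left and the action just taken. The dictionary layout, the default parameter values, and the state and action names are hypothetical.

```python
def td_update(V, W, s, a, r, s_next, gamma=0.9, alpha_c=0.1, alpha_a=0.1):
    """Apply one temporal-difference step and return the prediction error.

    V: dict mapping state -> value (the critic's state values)
    W: dict mapping (state, action) -> strength (the actor's action strengths)
    """
    delta = r + gamma * V[s_next] - V[s]               # prediction error
    V[s] += alpha_c * delta                            # state value update
    W[(s, a)] = W.get((s, a), 0.0) + alpha_a * delta   # action strength update
    return delta

V = {"s0": 0.0, "s1": 0.0}
W = {}
delta = td_update(V, W, "s0", "go", r=1.0, s_next="s1")
print(delta, V["s0"], W[("s0", "go")])  # 1.0 0.1 0.1
```

A positive prediction error raises both the value of the state and the strength of the action that produced it; a negative one lowers both.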
[Figure: task diagram with locations labeled S, G, P, and W.]
Adapted from Sutton et al., AI, 1999
Hierarchical Reinforcement Learning
O: I, π, β (initiation set, policy, termination function)
(After Sutton, Precup & Singh, 1999)
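In the options framework of Sutton, Precup & Singh (1999), an option is specified by an initiation set, a policy, and a termination condition. A minimal sketch of that structure follows, assuming integer states and a deterministic termination test in place of the probabilistic one used in the framework; the corridor environment and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Set

@dataclass
class Option:
    initiation_set: Set[int]            # states in which the option may be selected
    policy: Callable[[int], int]        # maps a state to a primitive action
    terminates: Callable[[int], bool]   # True once the option should stop

def run_option(option, state, step, max_steps=100):
    """Follow the option's policy from `state` until it terminates;
    return the state in which control is handed back."""
    if state not in option.initiation_set:
        raise ValueError("option not available in this state")
    for _ in range(max_steps):
        if option.terminates(state):
            break
        state = step(state, option.policy(state))
    return state

# Hypothetical corridor with states 0..5; action +1 moves one step right.
def step(state, action):
    return max(0, min(5, state + action))

go_to_door = Option(
    initiation_set={0, 1, 2},
    policy=lambda s: +1,
    terminates=lambda s: s == 3,
)
print(run_option(go_to_door, 0, step))  # 3
```

From the point of view of the higher-level policy, selecting `go_to_door` behaves like a single extended action that carries the agent from any state in the initiation set to state 3.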
[Figure: network with color-naming and word-reading task units, inputs GREEN and RED, and responses “green” and “red”.]
Adapted from Cohen et al., Psych. Rev., 1990
“Policy abstraction”
From Humphreys & Forde, Cog. Neuropsych., 2001
[Figure: task diagram with locations labeled S, G, P, and W, repeated over several builds; one build carries the numbers 1 and 2.]
cf. Luchins, Psychol. Monogr., 1942
The Option Discovery Problem
Genetic algorithms (Elfwing, 2003)
Frequently visited states (Pickett & Barto, 2002; Thrun & Schwartz, 1996)
Graph partitioning (Menache et al., 2002; Mannor et al., 2004; Simsek et al., 2005)
Intrinsic motivation (Simsek & Barto, 2005)
Other possibilities: Impasses (Soar); Social transmission
Extension 1: Support for representing option identifiers
White & Wise, Exp Br Res, 1999
(See also: Assad, Rainer & Miller, 2000; Bunge, 2004; Hoshi, Shima & Tanji, 1998; Johnston & Everling, 2006; Wallis, Anderson & Miller, 2001; White, 1999…)
Miller & Cohen, Ann. Rev. Neurosci, 2001
From Curtis & D’Esposito, TICS, 2003, after Funahashi et al., J. Neurophysiol,1989.
Koechlin, Attn & Perf., 2008
Extension 2: Option-specific policies
O’Reilly & Frank, Neural Computation, 2006
Aldridge & Berridge, J Neurosci, 1998
Extension 3: Option-specific state values
Schoenbaum, et al. J Neurosci. 1999
See also: O’Doherty, Critchley, Deichmann, Dolan, 2003
Extension 4: Temporal scope of the prediction error
Schoenbaum, Roesch & Stalnaker, TICS, 2006
Roesch, Taylor & Schoenbaum, Neuron, 2006
Daw, NIPS, 2003
3. Goal-directed behavior
Botvinick & An, submitted.
Niv, Joel & Dayan, TICS (2006)
[Figure: task annotated with a transition function (T) and a reward function (R); outcome values 4, 0, 2, 3; one build adds the values 4 and 3.]
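The outcome values 4, 0, 2, 3, together with the pair 4 and 3 that appears in one build, are consistent with a two-stage choice in which each intermediate state takes the better of its two outcomes. A minimal sketch of that forward search over a transition function T and a reward function R, assuming a binary tree in which (4, 0) and (2, 3) are sibling outcomes; the state and action names are hypothetical.

```python
# T maps (state, action) to the next state; R maps terminal states to reward.
T = {
    ("start", "left"): "A", ("start", "right"): "B",
    ("A", "left"): "A1", ("A", "right"): "A2",
    ("B", "left"): "B1", ("B", "right"): "B2",
}
R = {"A1": 4, "A2": 0, "B1": 2, "B2": 3}
ACTIONS = ("left", "right")

def value(state):
    """Best reward reachable from `state`, found by searching T down to R."""
    if state in R:
        return R[state]
    return max(value(T[(state, a)]) for a in ACTIONS)

def best_action(state):
    """Action whose successor state has the highest backed-up value."""
    return max(ACTIONS, key=lambda a: value(T[(state, a)]))

print(value("A"), value("B"), best_action("start"))  # 4 3 left
```

Because the choice is recomputed from T and R each time, changing a single outcome value changes behavior immediately: setting `R["A1"] = 0` makes `best_action("start")` return `"right"`, which is the pattern expected of goal-directed behavior under devaluation.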
Blodgett, 1929
Latent learning
Tolman & Honzik, 1930
Detour behavior
Niv, Joel & Dayan, TICS (2006)
Devaluation
White & Wise, Exp Br Res, 1999
(See also: Assad, Rainer & Miller, 2000; Bunge, 2004; Hoshi, Shima & Tanji, 1998; Johnston & Everling, 2006; Wallis, Anderson & Miller, 2001; White, 1999; Miller & Cohen, 2001…)
Miller & Cohen, Ann. Rev. Neurosci, 2001
Padoa-Schioppa & Assad, Nature, 2006
Gopnik, et al., Psych Rev, 2004
Redish data…
Johnson & Redish, J. Neurosci., 2007
Botvinick & An, submitted
Cf. Tatman & Shachter, 1990
Cf. Verma & Rao, 2006
[Figures: policy query and reward query panels, with outcome values 4, 0, 2, 3 and 2, 0, 4, 1; later panels add the values -2, +1 / 0, +2 / -3, and +10.]
Collaborators
James An
Andy Barto
Todd Braver
Deanna Barch
Jonathan Cohen
Andrew Ledvina
Joseph McGuire
David Plaut
Yael Niv