Computational models of cognitive control (II)

Matthew Botvinick
Princeton Neuroscience Institute and Department of Psychology, Princeton University

Banishing the homunculus

Decision-making in control:

Not only, “How does control shape decision-making?”

But also, “How are ‘control states’ selected?”

And, “How are they updated over time?”

[Figure: perception-action loop linking environment, perceptual input (viewed object, held object), and action (manipulative, perceptual)]

1. Routine sequential action

Botvinick & Plaut, Psychological Review, 2004

Botvinick, Proceedings of the Royal Society B, 2007

Botvinick, TICS, 2008

‘Routine sequential action’

• Action on familiar objects

• Well-defined sequential structure

• Concrete goals

• Highly routine

• Everyday tasks


Hierarchical structure

[Figure: task hierarchy. Top level: MAKE INSTANT COFFEE. Subtasks: ADD GROUNDS, ADD CREAM, ADD SUGAR. ADD SUGAR variants: ADD SUGAR FROM SUGARPACK, ADD SUGAR FROM SUGARBOWL. Primitive actions: PICK-UP, PUT-DOWN, POUR, STIR, TEAR]

Hierarchical models of action

[Figure: schema hierarchy. Top level: MAKE INSTANT COFFEE. Subtasks: ADD GROUNDS, ADD CREAM, ADD SUGAR (ADD SUGAR FROM SUGARBOWL / PACKET). Primitive actions: PICK-UP, PUT-DOWN, POUR, STIR, TEAR, SCOOP]

• Hierarchical structure of task built directly into architecture

(e.g., Cooper & Shallice, 2000; Estes, 1972; Houghton, 1990; MacKay, 1987; Rumelhart & Norman, 1982)

• Schemas as primitive elements
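As a rough illustration of what "hierarchy built directly into the architecture" means, the sketch below stores schemas as explicit nodes in a tree and expands the top-level schema into primitive actions. It is not taken from any of the cited models: the `SCHEMAS` table, the `expand` function, and the particular action sequences under each subtask are assumptions chosen for illustration.

```python
# Hypothetical schema tree: each schema maps to an ordered list of children.
# Any name that is not a key in this table is treated as a primitive action.
# The action sequences listed under each subtask are illustrative assumptions.
SCHEMAS = {
    "MAKE INSTANT COFFEE": ["ADD GROUNDS", "ADD CREAM", "ADD SUGAR"],
    "ADD GROUNDS": ["PICK-UP", "TEAR", "POUR", "PUT-DOWN",
                    "PICK-UP", "STIR", "PUT-DOWN"],
    "ADD CREAM": ["PICK-UP", "POUR", "PUT-DOWN",
                  "PICK-UP", "STIR", "PUT-DOWN"],
    "ADD SUGAR": ["ADD SUGAR FROM PACKET"],
    "ADD SUGAR FROM PACKET": ["PICK-UP", "TEAR", "POUR", "PUT-DOWN",
                              "PICK-UP", "STIR", "PUT-DOWN"],
}


def expand(schema):
    """Depth-first expansion of a schema into its primitive actions."""
    if schema not in SCHEMAS:
        return [schema]  # primitive action: a leaf of the tree
    actions = []
    for child in SCHEMAS[schema]:
        actions.extend(expand(child))
    return actions


sequence = expand("MAKE INSTANT COFFEE")
print(len(sequence), sequence[:4])
```

In this style of model the sequential structure lives in the table itself, which is the design choice the recurrent-network approach on the following slides does without.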

An alternative approach

• p, s, a = patterns of activation over simple processing units

• Weighted, excitatory/inhibitory connections

• Weights adjusted through gradient-descent learning in target task domains

Recurrent neural networks

• Feedback as well as feedforward connections

• Allow preservation of information over time

• Demonstrated capacity to learn sequential behaviors (e.g., Cleeremans, 1993; Elman, 1990)
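A minimal sketch of one time step in a simple recurrent network may help make the bullets above concrete. It assumes that p is the perceptual input pattern, s the internal (hidden) pattern, and a the action pattern; the layer sizes, random weights, and class name `SimpleRecurrentNet` are illustrative assumptions, and no learning is shown.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(weights, vector):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


class SimpleRecurrentNet:
    """Untrained Elman-style network: input -> hidden (recurrent) -> output."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-1.0, 1.0) for _ in range(cols)]
                    for _ in range(rows)]

        self.w_in = init(n_hidden, n_in)       # feedforward: p -> s
        self.w_rec = init(n_hidden, n_hidden)  # feedback: s(t-1) -> s(t)
        self.w_out = init(n_out, n_hidden)     # feedforward: s -> a
        self.s = [0.0] * n_hidden              # internal representation

    def step(self, p):
        """Update the hidden pattern from input and previous hidden pattern."""
        net = [x + y for x, y in zip(matvec(self.w_in, p),
                                     matvec(self.w_rec, self.s))]
        self.s = [sigmoid(v) for v in net]
        return [sigmoid(v) for v in matvec(self.w_out, self.s)]


net = SimpleRecurrentNet(n_in=3, n_hidden=4, n_out=2)
first = net.step([1.0, 0.0, 0.0])
second = net.step([1.0, 0.0, 0.0])  # same input, different history
print(first != second)
```

Because the hidden pattern feeds back into itself, the same perceptual input can yield different action patterns depending on what came before, which is the property that lets such networks carry temporal context.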

[Figure: recurrent loop linking environment, perceptual input, internal representation, and action]

The model

Fixate(Blue) Fixate(Green) Fixate(Top)

PickUp Fixate(Table) PutDown

Fixate(Green) PickUp

Ballard, Hayhoe, Pook & Rao, (1996). BBS.

[Figure: environment, action, and perceptual input (viewed object, held object)]

Model architecture

Routine sequential action: Task domain

• Hierarchically structured

• Actions/subtasks may appear in multiple contexts

• Environmental cues alone sometimes insufficient to guide action selection

• Subtasks that may be executed in variable order

• Subtask disjunctions

[Figure: task hierarchy (MAKE INSTANT COFFEE; ADD GROUNDS, ADD CREAM, ADD SUGAR FROM SUGARBOWL / PACKET; PICK-UP, PUT-DOWN, POUR, STIR, TEAR, SCOOP) and task graphs with nodes grounds, steep tea, drink]

Representations

VIEWED INPUT          HELD INPUT      ACTION
cup                   cup             pickup
1handle               1handle         putdown
2handles              2handles        pour
lid                   lid             peelopen
water                 water           tearopen
brownliquid           brownliquid     pullopen
milk                  milk            pinchlift
carton                carton          scoop
open                  open            sip
closed                closed          stir
packet                packet          locate-cup
foil                  foil            locate-sugar
paper                 paper           locate-sugarbowl
torn                  torn            locate-teabag
untorn                untorn          locate-coffeepack
spoon                 spoon           locate-spoon
teabag                teabag          locate-carton
sugar                 sugar           saydone
coffee-instruction    nothing
tea-instruction

Action labels: manipulative actions, perceptual actions

        Input                                                 Target/output
STEP    VIEWED                     HELD                       ACTION
1       cup-1handle-clearliquid    nothing                    locate-coffeepack
2       packet-brownfoil-untorn    nothing                    pickup
3       packet-brownfoil-untorn    packet-brownfoil-untorn    pullopen
4       packet-brownfoil-torn      packet-brownfoil-torn      locate-cup
5       cup-1handle-clearliquid    packet-brownfoil-torn      pour
6       cup-1handle-brownliquid    packet-brownfoil-torn      locate-spoon
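The six steps above can be turned into input/target training patterns by treating each feature as one binary unit. The sketch below is an assumption-laden illustration: the feature inventories contain only the features that occur in these six steps (not the model's full unit set), and the function names `encode_object` and `encode_step` are hypothetical.

```python
# Feature inventories are assumptions: only features that occur in the
# six steps shown above are listed, not the model's full set of units.
VIEWED = ["cup", "1handle", "clearliquid", "brownliquid",
          "packet", "brownfoil", "torn", "untorn"]
HELD = VIEWED + ["nothing"]
ACTIONS = ["locate-coffeepack", "pickup", "pullopen",
           "locate-cup", "pour", "locate-spoon"]

STEPS = [
    ("cup-1handle-clearliquid", "nothing", "locate-coffeepack"),
    ("packet-brownfoil-untorn", "nothing", "pickup"),
    ("packet-brownfoil-untorn", "packet-brownfoil-untorn", "pullopen"),
    ("packet-brownfoil-torn", "packet-brownfoil-torn", "locate-cup"),
    ("cup-1handle-clearliquid", "packet-brownfoil-torn", "pour"),
    ("cup-1handle-brownliquid", "packet-brownfoil-torn", "locate-spoon"),
]


def encode_object(description, features):
    """Binary vector with a 1 for every feature named in the description."""
    active = set(description.split("-"))
    unknown = active - set(features)
    if unknown:
        raise ValueError(f"unknown features: {sorted(unknown)}")
    return [1 if f in active else 0 for f in features]


def encode_step(viewed, held, action):
    """Input = viewed units followed by held units; target = one-hot action."""
    inputs = encode_object(viewed, VIEWED) + encode_object(held, HELD)
    target = [1 if a == action else 0 for a in ACTIONS]
    return inputs, target


patterns = [encode_step(*step) for step in STEPS]
print(patterns[0])
```

Note that steps 1 and 5 share the same viewed object but call for different actions, so the held object (and, in longer sequences, internal context) is needed to disambiguate.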

Model behavior

[Figure: task graphs with nodes Start, grounds, steep tea, drink, End; branch percentages 15% / 18%, 12% / 10%, 20% / 25%]

Slips of action (after Reason)

• Occur at decision (or fork) points

• Sequence errors involve subtask omissions, repetitions, and lapses

• Lapses show effect of relative task frequency
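The plots in this part of the talk report noise level as a variance. One way such a manipulation could be implemented is sketched below: zero-mean Gaussian noise is added to each unit of the internal representation at every step. Applying the noise to the hidden units, and the function name `add_noise`, are assumptions made for illustration.

```python
import random


def add_noise(hidden, variance, rng):
    """Return a copy of the hidden pattern with zero-mean Gaussian noise added.

    The standard deviation is the square root of the requested variance.
    """
    sd = variance ** 0.5
    return [h + rng.gauss(0.0, sd) for h in hidden]


def mean_abs_shift(variance, seed=1):
    """Average absolute perturbation of a fixed pattern at a given variance."""
    hidden = [0.2, 0.9, 0.5, 0.1]
    noisy = add_noise(hidden, variance, random.Random(seed))
    return sum(abs(n - h) for n, h in zip(noisy, hidden)) / len(hidden)


for variance in (0.02, 0.1, 0.2, 0.3):
    print(variance, round(mean_abs_shift(variance), 3))
```

Larger variances push the internal representation further from its learned trajectory, which is the sense in which higher noise levels would be expected to yield more errors.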


Sample of behavior (* as marked on the slide):

pick-up coffee-pack
pull-open coffee-pack
pour coffee-pack into cup
put-down coffee-pack
pick-up spoon
stir cup
put-down spoon
pick-up sugar-pack
tear-open sugar-pack
pour sugar-pack into cup
put-down sugar-pack
pick-up spoon
stir cup
put-down spoon
pick-up cup*
sip cup
sip cup
say-done

Annotations: grounds; sugar (pack); cream omitted

[Figure: errors by step in coffee sequence (subtask 1 to subtask 4)]

[Figure: omissions / anticipations, repetitions / perseverations, and intrusions / lapses as a function of noise level (variance: 0.02, 0.1, 0.2, 0.3)]

[Figure: effect of tea : coffee ratio (5:1, 1:1, 1:5); labels: steep tea, sugar, cream; task graph with nodes grounds, steep tea, drink]

Action disorganization syndrome (after Schwartz and colleagues)

• Fragmentation of sequential structure (independent actions)

• Specific error types

• Omission effect


Sample of behavior (* as marked on the slide):

pick-up coffee-pack
pull-open coffee-pack
put-down coffee-pack*
pick-up coffee-pack
pour coffee-pack into cup
put-down coffee-pack
pick-up spoon
stir cup
put-down spoon
pick-up sugar-pack
tear-open sugar-pack
pour sugar-pack into cup
put-down sugar-pack
pick-up cup*
put-down cup
pull-off sugarbowl lid*
put-down lid
pick-up spoon
scoop sugarbowl with spoon
put-down spoon*
pick-up cup*
sip cup
sip cup
say-done

Annotations: sugar repeated; cream omitted; disrupted subtask; subtask fragment

Error type               Example                                    Percentage
Omission                 Sugar not added                            77 (30-40)
Sequence:                                                           15 (20)
  Anticipation           Pour cream without opening
  Perseveration          Add cream, add sugar, add cream again
  Reversal               Stir water then add grounds
Other:                                                              8 (30)
  Object substitution    Stir with coffee-pack
  Gesture substitution   Pour gesture substituted for stir
  Tool omission          Pour sugarbowl into cup
  Action addition        Scoop sugar with, then put down, lid
  Quality                Pour cream four times in a row

Empirical data: Schwartz, et al. Neuropsychology, 1991

[Figure: x-axis noise (variance), from 0.5 to 0]

From: Schwartz, et al. Neuropsychology, 1998.

[Figure: sequence errors and omission errors as a function of noise (variance: 0.3, 0.2, 0.1, 0.04)]

[Figure: standardized error rate (sequence, omission, substitution) by CHI subject, subjects 1 to 30]

Internal representations

[Figure: internal representations for grounds, steep tea, and drink; axis ticks -1.2, -0.2, 0.8]

Etiology of a slip

[Figure: tea representation and coffee representation for grounds, steep tea, and drink (axis ticks -1.2, -0.2, 0.8); panels: coffee more frequent, tea more frequent]

[Figure: environment, action, and perceptual input; cortical hierarchy (primary sensory, unimodal assn., multimodal assn., prefrontal, premotor, primary motor); model layers: Peripheral (input), Intermediate (input), Intermediate (output), Peripheral (output), Output]

Store-Ignore-Recall (SIR) task

“nine”

“eight”

“four”

“seven”

“eight”

[Figure: coding ratio by layer: Peripheral (input), Intermediate (input), Apex, Intermediate (output), Peripheral (output)]

Conclusions

• Architectural hierarchy is not necessary for hierarchically structured behavior (or to understand action errors). Recurrent connectivity combined with graded, distributed representation is sufficient.

• Nonetheless, if architectural hierarchy is present, it can lead to a graded division of labor, according to which units furthest from sensory and motor peripheries specialize in coding information pertaining to temporal context.

• This may give us a way of explaining why the prefrontal cortex seems to be involved in routine sequential behavior.

2. Hierarchical reinforcement learning

Botvinick, Niv & Barto, Cognition, in press.

Botvinick, TICS, 2008

Reinforcement Learning

1. States
2. Actions
3. Transition function
4. Reward function

Policy?

Action strengths

State values

Prediction error

\[ \delta = r_{t+1} + \gamma V(s_{t+1}) - V(s_t) \]

\[ V(s_t) \leftarrow V(s_t) + \alpha_C \delta \]

\[ W(s_t, a) \leftarrow W(s_t, a) + \alpha_A \delta \]
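A minimal sketch of one actor-critic update, in the standard temporal-difference form where each quantity is incremented from its own current estimate: the prediction error δ is computed from the reward and the two state values, then used to adjust both the state value V (critic) and the action strength W (actor). The function name `td_update`, the dictionary representation, and the parameter defaults are assumptions for illustration.

```python
from collections import defaultdict


def td_update(V, W, s, a, r_next, s_next,
              gamma=0.9, alpha_c=0.1, alpha_a=0.1):
    """Apply one temporal-difference update and return the prediction error.

    V maps states to values; W maps (state, action) pairs to action strengths.
    """
    delta = r_next + gamma * V[s_next] - V[s]  # prediction error
    V[s] += alpha_c * delta                    # critic: state value
    W[(s, a)] += alpha_a * delta               # actor: action strength
    return delta


V = defaultdict(float)
W = defaultdict(float)

# Two passes through the same rewarded transition (toy example).
d1 = td_update(V, W, "s0", "go", r_next=1.0, s_next="s1")
d2 = td_update(V, W, "s0", "go", r_next=1.0, s_next="s1")
print(d1, d2, V["s0"], W[("s0", "go")])
```

The prediction error shrinks on the second pass because the value of the starting state has already moved toward the reward it predicts.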

Adapted from Sutton et al., AI, 1999

Hierarchical Reinforcement Learning

O: I, π, β (initiation set, policy, termination condition)

(After Sutton, Precup & Singh, 1999)
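In the options framework of Sutton, Precup & Singh (1999), an option bundles an initiation set, a policy, and a termination condition. The sketch below shows one way to represent and execute such an option; the class layout, the `run_option` helper, and the five-state corridor environment are assumptions for illustration only.

```python
import random
from dataclasses import dataclass
from typing import Callable, Set


@dataclass
class Option:
    name: str
    initiation_set: Set[int]             # states where the option may start
    policy: Callable[[int], str]         # state -> primitive action
    termination: Callable[[int], float]  # state -> probability of stopping


def run_option(option, state, step, rng, max_steps=100):
    """Follow the option's policy until its termination condition fires."""
    if state not in option.initiation_set:
        raise ValueError(f"{option.name} cannot be initiated in state {state}")
    for n in range(1, max_steps + 1):
        state = step(state, option.policy(state))
        if rng.random() < option.termination(state):
            return state, n
    return state, max_steps


def corridor_step(state, action):
    """Toy environment (assumed): states 0..4 along a corridor."""
    return min(state + 1, 4) if action == "right" else max(state - 1, 0)


go_to_door = Option(
    name="go-to-door",
    initiation_set={0, 1, 2},
    policy=lambda s: "right",
    termination=lambda s: 1.0 if s == 3 else 0.0,
)

final_state, n_steps = run_option(go_to_door, 0, corridor_step,
                                  random.Random(0))
print(final_state, n_steps)
```

From the top-level controller's point of view, selecting `go_to_door` is a single choice even though it unfolds over several primitive steps.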

GREEN RED

“green” “red”

Color-naming

Word-reading

Adapted from Cohen et al., Psych. Rev., 1990

“Policy abstraction”

From Humphreys & Forde, Cog. Neuropsych., 2001

cf. Luchins, Psychol. Monogr., 1942

The Option Discovery Problem

Genetic algorithms (Elfwing, 2003)

Frequently visited states (Pickett & Barto, 2002; Thrun & Schwartz, 1996)

Graph partitioning (Menache et al., 2002; Mannor et al., 2004; Simsek et al., 2005)

Intrinsic motivation (Simsek & Barto, 2005)

Other possibilities: Impasses (Soar); Social transmission

Extension 1: Support for representing option identifiers

White & Wise, Exp Br Res, 1999

(See also: Assad, Rainer & Miller, 2000; Bunge, 2004; Hoshi, Shima & Tanji, 1998; Johnston & Everling, 2006; Wallis, Anderson & Miller, 2001; White, 1999…)

Miller & Cohen, Ann. Rev. Neurosci, 2001

From Curtis & D’Esposito, TICS, 2003, after Funahashi et al., J. Neurophysiol., 1989.

Koechlin, Attn & Perf., 2008

Extension 2: Option-specific policies

O’Reilly & Frank, Neural Computation, 2006

Aldridge & Berridge, J Neurosci, 1998

Extension 3: Option-specific state values

Schoenbaum, et al. J Neurosci. 1999

See also: O’Doherty, Critchley, Deichmann, Dolan, 2003

Extension 4: Temporal scope of the prediction error

Schoenbaum, Roesch & Stalnaker, TICS, 2006

Roesch, Taylor & Schoenbaum, Neuron, 2006

Daw, NIPS, 2003
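Extension 4 concerns prediction errors that span a whole option rather than a single step. A sketch of the standard multi-step form from the options literature follows: rewards collected while the option runs are discounted and summed, and the value of the state where the option ends is discounted by the number of elapsed steps. The function name and the numbers in the example are illustrative assumptions.

```python
def option_prediction_error(rewards, v_start, v_end, gamma=0.9):
    """Prediction error computed when an option terminates.

    rewards: rewards received on each step while the option was running
    v_start: value of the state in which the option was initiated
    v_end:   value of the state in which the option terminated
    """
    n = len(rewards)
    cumulative = sum(gamma ** k * r for k, r in enumerate(rewards))
    return cumulative + gamma ** n * v_end - v_start


# Option lasting three steps, rewarded only on the last one.
delta_option = option_prediction_error([0.0, 0.0, 1.0],
                                       v_start=0.5, v_end=0.0)

# With a single step this reduces to the ordinary one-step error.
delta_step = option_prediction_error([1.0], v_start=0.2, v_end=0.5)
print(delta_option, delta_step)
```

The single-step case shows that the option-level error is a generalization of the one-step prediction error rather than a separate quantity.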

3. Goal-directed behavior

Botvinick & An, submitted.

Niv, Joel & Dayan, TICS (2006)

[Figure: outcome values 4, 0, 2, 3]

Blodgett, 1929

Latent learning


Tolman & Honzik, 1930

Detour behavior

Devaluation
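Latent learning, detour behavior, and devaluation are usually taken as signs that behavior is computed from a model of the task rather than from cached action strengths. The sketch below illustrates the devaluation case with a small decision tree whose outcomes take the values 4, 0, 2, 3 shown on the slides. The two-level binary tree, the action names, and the `best_path` function are assumptions for illustration.

```python
def best_path(tree):
    """Search the tree and return (value, actions) for the best outcome.

    Internal nodes are dicts mapping actions to subtrees; leaves are numbers.
    """
    if not isinstance(tree, dict):
        return tree, []
    best = None
    for action, subtree in tree.items():
        value, path = best_path(subtree)
        if best is None or value > best[0]:
            best = (value, [action] + path)
    return best


# Assumed layout: two successive left/right choices, four outcomes.
tree = {
    "left": {"left": 4, "right": 0},
    "right": {"left": 2, "right": 3},
}
value, path = best_path(tree)

# Devaluation: the outcome worth 4 is now worth 0. Because the plan is
# recomputed from the model, the choice changes without further learning.
devalued = {
    "left": {"left": 0, "right": 0},
    "right": {"left": 2, "right": 3},
}
new_value, new_path = best_path(devalued)
print(value, path, new_value, new_path)
```

A purely habit-based learner would keep choosing the old path until its cached values were relearned, which is the contrast the devaluation paradigm is designed to expose.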

White & Wise, Exp Br Res, 1999

(See also: Assad, Rainer & Miller, 2000; Bunge, 2004; Hoshi, Shima & Tanji, 1998; Johnston & Everling, 2006; Wallis, Anderson & Miller, 2001; White, 1999; Miller & Cohen, 2001…)

Miller & Cohen, Ann. Rev. Neurosci, 2001

Padoa-Schioppa & Assad, Nature, 2006

[Figure: outcome values 4, 0, 2, 3]

Gopnik, et al., Psych Rev, 2004


Redish data…

Johnson & Redish, J. Neurosci., 2007

Botvinick & An, submitted

Cf. Tatman & Shachter, 1990

Cf. Verma & Rao, 2006

[Figures: policy query and reward query; outcome values 4, 0, 2, 3 and 2, 0, 4, 1; outcome changes +1 / 0 and +2 / -3]


Collaborators

James An
Andy Barto
Todd Braver
Deanna Barch
Jonathan Cohen
Andrew Ledvina
Joseph McGuire
David Plaut
Yael Niv
