Computational models of cognitive control (II)

Matthew Botvinick
Princeton Neuroscience Institute and Department of Psychology, Princeton University

Banishing the homunculus


Decision-making in control:



Not only, “How does control shape decision-making?”




But also, “How are ‘control states’ selected?”




And, “How are they updated over time?”

[Figure: loop linking the environment, perceptual input (viewed object, held object), and action (manipulative, perceptual).]

1. Routine sequential action

Botvinick & Plaut, Psychological Review, 2004

Botvinick, Proceedings of the Royal Society, B, 2007

Botvinick, TICS, 2008

‘Routine sequential action’

• Action on familiar objects

• Well-defined sequential structure

• Concrete goals

• Highly routine

• Everyday tasks


Hierarchical structure

MAKE INSTANT COFFEE
    ADD GROUNDS, ADD CREAM, ADD SUGAR
        ADD SUGAR FROM SUGARPACK, ADD SUGAR FROM SUGARBOWL
            PICK-UP, PUT-DOWN, POUR, STIR, TEAR, SCOOP

Hierarchical models of action

[Figure: task hierarchy with MAKE INSTANT COFFEE at the top; ADD GROUNDS, ADD CREAM, and ADD SUGAR (FROM SUGARBOWL / PACKET) below it; and PICK-UP, PUT-DOWN, POUR, STIR, TEAR, and SCOOP at the lowest level.]

• Hierarchical structure of task built directly into architecture

(e.g., Cooper & Shallice, 2000; Estes, 1972; Houghton, 1990; MacKay, 1987; Rumelhart & Norman, 1982)

• Schemas as primitive elements

An alternative approach

[Figure: network unrolled over three time steps, with patterns p, s, and a at each step (p_t, s_t, a_t; p_{t+1}, s_{t+1}, a_{t+1}; p_{t+2}, s_{t+2}, a_{t+2}).]

• p, s, a = patterns of activation over simple processing units

• Weighted, excitatory/inhibitory connections

• Weights adjusted through gradient-descent learning in target task domains
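A minimal sketch of the forward pass of such a network, assuming an Elman-style simple recurrent network with logistic units. The layer sizes, the weight range, the initial state of 0.5, and all names are illustrative assumptions, not details taken from the slides, and gradient-descent learning is omitted. The pattern s depends on the current input p and on the previous s; the action pattern a is read out from s.

```python
import math
import random


def logistic(x):
    """Standard logistic activation."""
    return 1.0 / (1.0 + math.exp(-x))


def make_weights(n_out, n_in, rng, scale=1.0):
    """Random weight matrix; positive entries are excitatory, negative inhibitory."""
    return [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]


def weighted_sum(row, inputs):
    return sum(w * x for w, x in zip(row, inputs))


class SimpleRecurrentNet:
    """Hypothetical p -> s -> a network with recurrent s -> s connections."""

    def __init__(self, n_p, n_s, n_a, seed=0):
        rng = random.Random(seed)
        self.w_ps = make_weights(n_s, n_p, rng)  # perceptual input to internal units
        self.w_ss = make_weights(n_s, n_s, rng)  # recurrent (feedback) connections
        self.w_sa = make_weights(n_a, n_s, rng)  # internal units to action units
        self.s = [0.5] * n_s                     # assumed neutral starting state

    def step(self, p):
        """Advance one time step and return the action pattern a_t."""
        self.s = [
            logistic(weighted_sum(row_p, p) + weighted_sum(row_s, self.s))
            for row_p, row_s in zip(self.w_ps, self.w_ss)
        ]
        return [logistic(weighted_sum(row, self.s)) for row in self.w_sa]


net = SimpleRecurrentNet(n_p=3, n_s=4, n_a=2)
first = net.step([1.0, 0.0, 0.0])
second = net.step([1.0, 0.0, 0.0])  # same input, different internal state
```

Because s carries over between steps, presenting the same input twice generally yields different action patterns, which is what lets a network of this kind respond to context rather than to the current input alone.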

Recurrent neural networks

• Feedback as well as feedforward connections

• Allow preservation of information over time

• Demonstrated capacity to learn sequential behaviors (e.g., Cleeremans, 1993; Elman, 1990)

The model

[Figure: loop linking the environment, perceptual input, internal representation, and action.]

Fixate(Blue) Fixate(Green) Fixate(Top)

PickUp Fixate(Table) PutDown

Fixate(Green) PickUp

Ballard, Hayhoe, Pook & Rao (1996), BBS.

Model architecture

[Figure: loop linking the environment, perceptual input (viewed object, held object), and action.]


Routine sequential action: Task domain

• Hierarchically structured

• Actions/subtasks may appear in multiple contexts

• Environmental cues alone sometimes insufficient to guide action selection

• Subtasks that may be executed in variable order

• Subtask disjunctions
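A small sketch of a task domain with variable subtask order and a subtask disjunction. The specific ordering used here (grounds first, then cream and sugar in either order, then drink, with sugar coming from either the sugarpack or the sugarbowl) is an assumption made for illustration, and the function names are hypothetical.

```python
import itertools
import random

# Subtask disjunction: either variant accomplishes ADD SUGAR.
SUGAR_VARIANTS = ["ADD SUGAR FROM SUGARPACK", "ADD SUGAR FROM SUGARBOWL"]


def all_coffee_plans():
    """Enumerate every permitted subtask sequence under the assumed ordering."""
    plans = []
    for sugar in SUGAR_VARIANTS:
        # Variable order: sugar and cream may be added in either order.
        for middle in itertools.permutations([sugar, "ADD CREAM"]):
            plans.append(["ADD GROUNDS", *middle, "DRINK"])
    return plans


def sample_coffee_plan(rng):
    """Draw one permitted sequence at random."""
    return rng.choice(all_coffee_plans())


plans = all_coffee_plans()
example = sample_coffee_plan(random.Random(0))
```

Under these assumptions there are four valid coffee sequences, and the same subtask (for example ADD CREAM) appears in different positions across them, so the correct next step cannot always be read off the environment alone.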

[Figure: task hierarchy for MAKE INSTANT COFFEE (ADD GROUNDS, ADD CREAM, ADD SUGAR FROM SUGARBOWL / PACKET; PICK-UP, PUT-DOWN, POUR, STIR, TEAR, SCOOP), and a diagram of permitted subtask sequences from Start to End with nodes grounds, cream, drink, and steep tea.]

Representations

VIEWED INPUT         HELD INPUT     ACTION
cup                  cup            pickup
1handle              1handle        putdown
2handles             2handles       pour
lid                  lid            peelopen
water                water          tearopen
brownliquid          brownliquid    pullopen
milk                 milk           pinchlift
carton               carton         scoop
open                 open           sip
closed               closed         stir
packet               packet         locate-cup
foil                 foil           locate-sugar
paper                paper          locate-sugarbowl
torn                 torn           locate-teabag
untorn               untorn         locate-coffeepack
spoon                spoon          locate-spoon
teabag               teabag         locate-carton
sugar                sugar          saydone
coffee-instruction   nothing
tea-instruction

sugar-packet

Action groups: manipulative actions (pickup through stir) and perceptual actions (the locate- actions).

STEP  VIEWED (input)             HELD (input)               ACTION (target/output)
1     cup-1handle-clearliquid    nothing                    locate-coffeepack
2     packet-brownfoil-untorn    nothing                    pickup
3     packet-brownfoil-untorn    packet-brownfoil-untorn    pullopen
4     packet-brownfoil-torn      packet-brownfoil-torn      locate-cup
5     cup-1handle-clearliquid    packet-brownfoil-torn      pour
6     cup-1handle-brownliquid    packet-brownfoil-torn      locate-spoon
7

Model behavior

[Figure: permitted subtask sequences from Start to End (nodes grounds, cream, drink, steep tea), with percentage labels 15% / 18%, 12% / 10%, and 20% / 25%.]

Slips of action (after Reason)

• Occur at decision (or fork) points

• Sequence errors involve subtask omissions, repetitions, and lapses

• Lapses show effect of relative task frequency


Sample of behavior:

grounds:
    pick-up coffee-pack
    pull-open coffee-pack
    pour coffee-pack into cup
    put-down coffee-pack
    pick-up spoon
    stir cup
    put-down spoon

sugar (pack):
    pick-up sugar-pack
    tear-open sugar-pack
    pour sugar-pack into cup
    put-down sugar-pack
    pick-up spoon
    stir cup
    put-down spoon

drink (cream omitted):
    pick-up cup*
    sip cup
    sip cup
    say-done

[Figure: percentage of trials error-free (0 to 100) by step in coffee sequence (subtasks 1 to 4).]

[Figure: percentage of trials (0 to 80) with omissions / anticipations, repetitions / perseverations, and intrusions / lapses, as a function of noise level (variance: 0.02, 0.1, 0.2, 0.3).]

[Figure: odds of lapse into coffee-making (0 to 0.16) as a function of the tea : coffee ratio (5:1, 1:1, 1:5); sequence labels steep tea, sugar, cream, with the lapse point marked *.]

Action disorganization syndrome (after Schwartz and colleagues)

• Fragmentation of sequential structure (independent actions)

• Specific error types

• Omission effect


Sample of behavior:

pick-up coffee-pack
pull-open coffee-pack
put-down coffee-pack*
pick-up coffee-pack
pour coffee-pack into cup
put-down coffee-pack
pick-up spoon
stir cup
put-down spoon
pick-up sugar-pack
tear-open sugar-pack
pour sugar-pack into cup
put-down sugar-pack
pick-up cup*
put-down cup
pull-off sugarbowl lid*
put-down lid
pick-up spoon
scoop sugarbowl with spoon
put-down spoon*
pick-up cup*
sip cup
sip cup
say-done

Slide annotations: sugar repeated; cream omitted; disrupted subtask; subtask fragment; subtask fragment.

Error type              Example                                  Percentage
Omission                Sugar not added                          77 (30-40)
Sequence:                                                        15 (20)
  Anticipation          Pour cream without opening
  Perseveration         Add cream, add sugar, add cream again
  Reversal              Stir water then add grounds
Other:                                                           8 (30)
  Object substitution   Stir with coffee-pack
  Gesture substitution  Pour gesture substituted for stir
  Tool omission         Pour sugarbowl into cup
  Action addition       Scoop sugar with, then put down, lid
  Quality               Pour cream four times in a row

Empirical data: Schwartz, et al. Neuropsychology, 1991

[Figure: proportion of independents (0 to 0.7) as a function of noise (variance), with the axis running from 0.5 to 0.]

From: Schwartz, et al. Neuropsychology, 1998.

[Figure: errors per opportunity (0 to 70) for sequence errors and omission errors, as a function of noise (variance: 0.3, 0.2, 0.1, 0.04).]

[Figure: standardized error rate (0 to 70) for sequence, omission, and substitution errors across CHI subjects 1 to 30.]

Internal representations

[Figure: two-dimensional plots of internal representations (axes from -1.6 to 1.9 and from -1.2 to 0.8), with points labeled grounds, cream, drink, and steep tea.]

Etiology of a slip

[Figure: the same two-dimensional plot of internal representations (points labeled grounds, cream, drink, steep tea), marking the tea representation and the coffee representation. Two panels compare the case where coffee is more frequent with the case where tea is more frequent.]


[Figure: cortical areas (primary sensory, primary motor; unimodal association, premotor; prefrontal, multimodal association) shown alongside a layered network: Input, Peripheral (input), Intermediate (input), Apex, Intermediate (output), Peripheral (output), Output.]

Store-Ignore-Recall (SIR) task

Input:    9        8         4        7         R
Output:   “nine”   “eight”   “four”   “seven”   “eight”
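A sketch of how target outputs for a task of this kind could be generated. It assumes that every digit is named when it appears, that a digit flagged for storage replaces the stored item, that unflagged digits are ignored for memory purposes, and that the recall cue R asks for the stored item. The flag-based trial encoding and the function name are hypothetical; the example marks 8 as the stored digit because that is what the final output implies.

```python
WORDS = {"4": "four", "7": "seven", "8": "eight", "9": "nine"}


def sir_targets(trials):
    """Return the target word for each trial.

    trials is a list of (symbol, store) pairs, where symbol is a digit
    or the recall cue "R" and store is True for digits to be remembered.
    """
    stored = None
    targets = []
    for symbol, store in trials:
        if symbol == "R":
            if stored is None:
                raise ValueError("recall cue before any digit was stored")
            targets.append(WORDS[stored])  # recall: report the stored digit
        else:
            targets.append(WORDS[symbol])  # name the current digit
            if store:
                stored = symbol            # store: overwrite memory
            # otherwise ignore: memory is left unchanged
    return targets


sequence = [("9", False), ("8", True), ("4", False), ("7", False), ("R", False)]
targets = sir_targets(sequence)
# targets == ["nine", "eight", "four", "seven", "eight"]
```

The point of the task is that the correct response to R depends on an input seen several steps earlier, so the network must hold that item across intervening, irrelevant inputs.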


[Figure: coding ratio (0 to 7) for each layer: Peripheral (input), Intermediate (input), Apex, Intermediate (output), Peripheral (output).]


Conclusions

• Architectural hierarchy is not necessary for hierarchically structured behavior (or to understand action errors). Recurrent connectivity combined with graded, distributed representation is sufficient.

• Nonetheless, if architectural hierarchy is present, it can lead to a graded division of labor, according to which units furthest from sensory and motor peripheries specialize in coding information pertaining to temporal context.

• This may give us a way of explaining why the prefrontal cortex seems to be involved in routine sequential behavior.

2. Hierarchical reinforcement learning

Botvinick, Niv & Barto, Cognition, in press.

Botvinick, TICS, 2008

Reinforcement Learning

1. States
2. Actions
3. Transition function
4. Reward function

Policy?

Action strengths

State values

Prediction error

\[
\delta = r_{t+1} + \gamma V(s_{t+1}) - V(s_t)
\]

\[
V(s_t) \leftarrow V(s_t) + \alpha_C \delta
\]

\[
W(s_t, a) \leftarrow W(s_t, a) + \alpha_A \delta
\]
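A minimal actor-critic sketch of these updates, written in the common form where the prediction error adjusts the value and the action strength of the state just left. The dictionary representation, the default parameter values, and the softmax rule for turning action strengths into a policy are assumptions for illustration; the slides do not specify them.

```python
import math


def td_actor_critic_step(V, W, s, a, r_next, s_next,
                         gamma=0.9, alpha_c=0.1, alpha_a=0.1):
    """Apply one temporal-difference update and return the prediction error."""
    # Prediction error: reward plus discounted next-state value, minus current value.
    delta = r_next + gamma * V.get(s_next, 0.0) - V.get(s, 0.0)
    # Critic: update the state value with learning rate alpha_c.
    V[s] = V.get(s, 0.0) + alpha_c * delta
    # Actor: update the strength of the chosen action with learning rate alpha_a.
    W[(s, a)] = W.get((s, a), 0.0) + alpha_a * delta
    return delta


def softmax_policy(W, s, actions, temperature=1.0):
    """Action probabilities from action strengths (softmax is an assumed choice)."""
    prefs = [W.get((s, a), 0.0) / temperature for a in actions]
    top = max(prefs)
    exps = [math.exp(p - top) for p in prefs]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


V, W = {}, {}
delta = td_actor_critic_step(V, W, s="s0", a="go", r_next=1.0, s_next="s1")
probs = softmax_policy(W, "s0", ["go", "stay"])
```

Starting from empty tables, a reward of 1 produces a prediction error of 1, which raises both V for the state and W for the chosen action by the respective learning rate, so the rewarded action becomes slightly more probable.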

[Figure: rooms grid world with locations labeled W, S, P, and G, and an option labeled O.]

Adapted from Sutton et al., AI, 1999

Hierarchical Reinforcement Learning

O: I, π, β (initiation set, option policy, termination function)

(After Sutton, Precup & Singh, 1999)
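A sketch of an option as a data structure, following the options framework of Sutton, Precup & Singh (1999), in which an option consists of an initiation set, a policy, and a termination function. The class layout, the `run_option` helper, and the corridor example are hypothetical and only illustrate the idea.

```python
import random
from dataclasses import dataclass
from typing import Callable, FrozenSet, Hashable


@dataclass(frozen=True)
class Option:
    initiation_set: FrozenSet[Hashable]          # states where the option may start
    policy: Callable[[Hashable], Hashable]       # state -> primitive action
    termination: Callable[[Hashable], float]     # state -> probability of stopping


def run_option(option, state, step, rng, max_steps=100):
    """Follow the option's policy until its termination function fires."""
    if state not in option.initiation_set:
        raise ValueError("option cannot be initiated in this state")
    trajectory = [state]
    for _ in range(max_steps):
        state = step(state, option.policy(state))
        trajectory.append(state)
        if rng.random() < option.termination(state):
            break
    return trajectory


# Hypothetical corridor with states 0..4 and a doorway at state 4.
go_to_door = Option(
    initiation_set=frozenset({0, 1, 2, 3}),
    policy=lambda s: "right",
    termination=lambda s: 1.0 if s == 4 else 0.0,
)


def corridor_step(state, action):
    return min(state + 1, 4) if action == "right" else max(state - 1, 0)


path = run_option(go_to_door, 1, corridor_step, random.Random(0))
# path == [1, 2, 3, 4]
```

Once selected, the option controls behavior over several time steps, which is what makes it a temporally extended action rather than a primitive one.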

[Figure: Stroop model with inputs GREEN and RED, responses “green” and “red”, and task units for color-naming and word-reading.]

Adapted from Cohen et al., Psych. Rev., 1990

“Policy abstraction”

From Humphreys & Forde, Cog. Neuropsych., 2001


cf. Luchins, Psychol. Monogr., 1942


The Option Discovery Problem

Genetic algorithms (Elfwing, 2003)

Frequently visited states (Pickett & Barto, 2002; Thrun & Schwartz, 1996)

Graph partitioning (Menache et al., 2002; Mannor et al., 2004; Simsek et al., 2005)

Intrinsic motivation (Simsek & Barto, 2005)

Other possibilities: impasses (Soar); social transmission


Extension 1: Support for representing option identifiers

White & Wise, Exp Br Res, 1999

(See also: Assad, Rainer & Miller, 2000; Bunge, 2004; Hoshi, Shima & Tanji, 1998; Johnston & Everling, 2006; Wallis, Anderson & Miller, 2001; White, 1999…)

Miller & Cohen, Ann. Rev. Neurosci, 2001

From Curtis & D’Esposito, TICS, 2003, after Funahashi et al., J. Neurophysiol., 1989.

Koechlin, Attn & Perf., 2008

Extension 2: Option-specific policies

O’Reilly & Frank, Neural Computation, 2006

Aldridge & Berridge, J Neurosci, 1998

Extension 3: Option-specific state values

Schoenbaum, et al. J Neurosci. 1999

See also: O’Doherty, Critchley, Deichmann, Dolan, 2003

Extension 4: Temporal scope of the prediction error

Schoenbaum, Roesch & Stalnaker, TICS, 2006

Roesch, Taylor & Schoenbaum, Neuron, 2006

Daw, NIPS, 2003

3. Goal-directed behavior

Botvinick & An, submitted.

Niv, Joel & Dayan, TICS (2006)

[Figure: decision tree with labels T and R and terminal values 4, 0, 2, 3; a later panel highlights the values 4 and 3.]

Blodgett, 1929

Latent learning

Tolman & Honzik, 1930

Detour behavior


Devaluation
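The logic of a devaluation test can be sketched by comparing a controller that plans over a model of transitions and rewards with one that relies on action values cached during training. The two-step tree, the outcome names, and the placement of the values 4, 0, 2, 3 at its leaves are assumptions made for illustration, as are all function names.

```python
# Hypothetical two-step tree: each state maps actions to successor states.
TREE = {
    "start": {"left": "L", "right": "R"},
    "L": {"left": "outcome_a", "right": "outcome_b"},
    "R": {"left": "outcome_c", "right": "outcome_d"},
}
REWARD = {"outcome_a": 4.0, "outcome_b": 0.0, "outcome_c": 2.0, "outcome_d": 3.0}


def plan_value(state, tree, reward):
    """Best achievable reward from a state, found by searching the tree."""
    if state not in tree:
        return reward[state]
    return max(plan_value(nxt, tree, reward) for nxt in tree[state].values())


def plan_action(state, tree, reward):
    """Goal-directed choice: evaluate each action using the current reward model."""
    return max(tree[state], key=lambda a: plan_value(tree[state][a], tree, reward))


def cache_values(tree, reward):
    """Habit-like controller: store action values once, as if learned in training."""
    return {(s, a): plan_value(nxt, tree, reward)
            for s, actions in tree.items() for a, nxt in actions.items()}


def cached_action(state, tree, cache):
    return max(tree[state], key=lambda a: cache[(state, a)])


cache = cache_values(TREE, REWARD)
before = plan_action("start", TREE, REWARD)

# Devaluation: the best outcome loses its value after training.
devalued = dict(REWARD, outcome_a=0.0)
after_planning = plan_action("start", TREE, devalued)
after_habit = cached_action("start", TREE, cache)
```

Under these assumptions the planner chooses left before devaluation and switches to right afterwards, while the cached controller keeps choosing left until its values are relearned. Sensitivity to devaluation is the behavioral signature usually taken to indicate goal-directed control.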

White & Wise, Exp Br Res, 1999

(See also: Assad, Rainer & Miller, 2000; Bunge, 2004; Hoshi, Shima & Tanji, 1998; Johnston & Everling, 2006; Wallis, Anderson & Miller, 2001; White, 1999; Miller & Cohen, 2001…)

Miller & Cohen, Ann. Rev. Neurosci, 2001

Padoa-Schioppa & Assad, Nature, 2006


Gopnik, et al., Psych Rev, 2004


Redish data…

Johnson & Redish, J. Neurosci., 2007

Botvinick & An, submitted

Cf. Tatman & Shachter, 1990

Cf. Verma & Rao, 2006

[Figures: policy query and reward query panels, shown with trees whose terminal values are 4, 0, 2, 3 and 2, 0, 4, 1, and with value labels -2, +1 / 0, and +2 / -3.]

Collaborators

James An
Andy Barto
Todd Braver
Deanna Barch
Jonathan Cohen
Andrew Ledvina
Joseph McGuire
David Plaut
Yael Niv
