
Page 1: Human action understanding as Bayesian inverse planning

Human action understanding as Bayesian inverse planning

Josh Tenenbaum, MIT BCS, CSAIL

Collaborators: Chris Baker, Noah Goodman, Rebecca Saxe

Page 2: Human action understanding as Bayesian inverse planning

Everyday inductive leaps

How can human minds infer so much about the world from such limited evidence?
– Visual scene perception
– Object categorization
– Language understanding and acquisition
– Causal reasoning
– Social cognition

• How do we infer the hidden mental states of other agents that give rise to their observed behavior – beliefs, desires, intentions, plans, emotions, values?

Page 3: Human action understanding as Bayesian inverse planning

Why did the man cross the street?

Far more complex than conventional “activity recognition” systems…

Page 4: Human action understanding as Bayesian inverse planning

The solution

Background knowledge (“intuitive theories”, “inductive bias”):
– How does background knowledge guide inferences from sparsely observed data?
– What form does background knowledge take, across different domains and tasks?
– How is background knowledge itself acquired?

The challenge: Can we answer these questions in precise computational terms?


Page 6: Human action understanding as Bayesian inverse planning

The approach

1. How does background knowledge guide inferences from sparsely observed data?
Bayesian inference: P(h | d) = P(d | h) P(h) / Σ_{hᵢ ∈ H} P(d | hᵢ) P(hᵢ)

2. What form does background knowledge take, across different domains and tasks?
Probabilities defined over structured representations: graphs, grammars, predicate logic, schemas, functional processes, programs.

3. How is background knowledge itself acquired?
Hierarchical probabilistic models, with inference at multiple levels of abstraction.

Which does “more work” – Bayesian inference or structured knowledge? Not the right question! Each component needs the other to get the hard work done.
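To make the notation concrete, here is a minimal sketch of the generic Bayesian step (in Python; illustrative, not code from the talk), with the hypothesis space, prior, and likelihood left as placeholders for whatever a domain's intuitive theory supplies.

```python
# Minimal sketch of Bayes' rule over a discrete hypothesis space H.
# The hypotheses, prior P(h), and likelihood P(d | h) are placeholders
# to be filled in by a domain-specific intuitive theory.

def posterior(hypotheses, prior, likelihood, data):
    """P(h | d) = P(d | h) P(h) / sum over h_i in H of P(d | h_i) P(h_i)."""
    unnorm = {h: likelihood(data, h) * prior[h] for h in hypotheses}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

# Toy usage: two hypotheses about a coin, one observed flip.
prior = {"fair": 0.9, "trick": 0.1}
likelihood = lambda d, h: 0.5 if h == "fair" else 0.99
print(posterior(list(prior), prior, likelihood, "heads"))
# -> {'fair': ~0.82, 'trick': ~0.18}: the datum shifts belief, the prior still dominates.
```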


Page 8: Human action understanding as Bayesian inverse planning

Principle of rationality

• People assume that other agents will tend to take sequences of actions that most effectively achieve their goals given their beliefs.

• Model this more formally as inverse planning in a goal-based Markov Decision Process (MDP).

[Diagram: Beliefs (B) and Goals (G) feed into Rational Planning (MDP), which produces Actions (A).]

P(B, G | A) ∝ P(A | B, G) P(B, G)

cf. inverse RL (Ng & Russell); goal-based imitation (Rao et al.)
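As a minimal sketch of this inversion (the names are illustrative, and the forward planner is assumed rather than implemented): given any planning model that assigns P(A | B, G), inferring beliefs and goals is a direct application of Bayes' rule over candidate belief-goal pairs.

```python
# Hedged sketch of the inversion: given any forward planner that
# assigns P(A | B, G), infer P(B, G | A) by Bayes' rule. The planner
# `p_actions` (e.g., an MDP solver) is assumed, not implemented here.

def infer_beliefs_and_goals(actions, candidates, prior, p_actions):
    """P(B, G | A) ∝ P(A | B, G) P(B, G), over candidate (B, G) pairs."""
    unnorm = {bg: p_actions(actions, *bg) * prior[bg] for bg in candidates}
    z = sum(unnorm.values())
    return {bg: p / z for bg, p in unnorm.items()}
```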

Caution: this is not necessarily a computational model of how people plan! It is a computational model of people’s mental models of how people plan. Cf. the simulation vs. theory debate.

Page 9: Human action understanding as Bayesian inverse planning

Rational action understanding in infants (Gergely & Csibra)

[Figure: rational action schema]

Page 10: Human action understanding as Bayesian inverse planning

Inferring relational goals

• Hamlin, Kuhlmeier, Wynn, Bloom: helping, hindering

• Southgate, Csibra: chasing, fleeing

Page 11: Human action understanding as Bayesian inverse planning

Immediate research aims

• Test how accurately and precisely the Bayesian inverse planning framework can explain people’s intuitive psychological judgments.

• Use the inverse planning framework as a tool to test alternative accounts of people’s intuitive theories of psychology.

• Focus on perceptual inferences about goals (cf. Brian Scholl’s talk).

Page 12: Human action understanding as Bayesian inverse planning

Alternative computational models

• Theory-based models
– Objects as goals
  • fixed or changing goals?
  • simple or complex intentions?
– Social goals towards other agents
  • try to achieve some relation to another agent, e.g.,
    – chasing: try to minimize distance to another agent
    – fleeing: try to maximize distance from another agent
  • goals have different content depending on how agents model each other, e.g.,
    – first-order (“simple-minded”) or second-order (“smarter”): does an agent ignore or try to account for other agents’ goals and planning processes?

Page 13: Human action understanding as Bayesian inverse planning

Alternative computational models

• Theory-based models
• Bottom-up heuristics based on motion cues (Todd & Miller; Zacks; Tremoulet & Feldman)
– Object O is the goal of X if O is a salient object and X is moving towards O.
– X is chasing/fleeing Y if X tends to move towards/away from Y as Y tends to move away from/towards X.
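These cues are simple enough to state directly in code. A hedged sketch, assuming equal-length position sequences sampled at equal time steps and an externally supplied salience predicate; the majority-vote scoring of “tends to” is an illustrative choice, not a detail from the cited work.

```python
import math

# Hedged sketch of the two motion-cue heuristics above. Positions are
# (x, y) tuples; `is_salient` and the averaging scheme are assumptions.

def moving_towards(prev_pos, pos, target):
    """Did the last step decrease distance to the target?"""
    return math.dist(pos, target) < math.dist(prev_pos, target)

def heuristic_goals(path, objects, is_salient):
    """Object O is X's goal if O is salient and X is moving towards O.
    Uses only the last step of the path (needs >= 2 positions)."""
    prev_pos, pos = path[-2], path[-1]
    return [o for o in objects
            if is_salient(o) and moving_towards(prev_pos, pos, o)]

def heuristic_chasing_score(x_path, y_path):
    """X chases Y (and Y flees X) to the degree that X moves towards Y
    while Y moves away from X, averaged over steps."""
    steps = range(1, len(x_path))
    x_towards = sum(moving_towards(x_path[t - 1], x_path[t], y_path[t - 1])
                    for t in steps)
    y_away = sum(not moving_towards(y_path[t - 1], y_path[t], x_path[t - 1])
                 for t in steps)
    return (x_towards + y_away) / (2 * len(steps))
```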


Page 15: Human action understanding as Bayesian inverse planning

Experiment 1: objects as goals

• Method
– Subjects (N=16) view animated trajectories in simple maze-like environments.
– Subjects observe partial action sequences with several candidate goals and are asked to rate relative probabilities of goals at different points along each trajectory.

• Set-up
– Cover story: intelligent aliens moving in their natural environment.
– Assume a fully observable world: the agent’s beliefs = the true states and transition functions of the environment.
– 100 total judgments, with 3-6 judgment points along each of 36 different trajectories (= 4 goal positions x 3 kinds of obstacles x 3 goals).

Page 16: Human action understanding as Bayesian inverse planning

Specific inverse planning models

• Model M1(β): fixed goal
– The agent acts to achieve a particular state of the environment.
– This goal state is fixed for a given action sequence.
– Small negative cost for each step that does not reach the goal.

[Graphical model: a fixed goal G and the current state St determine the action At (with rationality parameter β); actions carry the agent from St to St+1.]

Like “Chasing subtlety”, but more abstract.
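A minimal sketch of the forward model that M1 inverts, under assumptions the slide leaves implicit: a deterministic maze encoded by an assumed step(s, a) function (returning the current state when a wall blocks the move), value iteration for the goal-conditional values, and a Boltzmann (“soft rationality”) action distribution with noisiness parameter β. None of the identifiers come from the talk.

```python
import math

# Hedged sketch of M1's forward model: soft-rational planning toward a
# fixed goal in a deterministic grid MDP, with a small negative reward
# per non-goal step (per the slide) and Boltzmann action noise.

ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]
STEP_COST, DISCOUNT = -1.0, 0.95

def values(states, goal, step, iters=200):
    """V(s | G) by value iteration; the goal state is absorbing (V = 0)."""
    V = {s: 0.0 for s in states}
    for _ in range(iters):
        for s in states:
            if s != goal:
                V[s] = max(STEP_COST + DISCOUNT * V[step(s, a)]
                           for a in ACTIONS)
    return V

def policy(s, V, step, beta=2.0):
    """Boltzmann policy: P(a | s, G) ∝ exp(beta * Q(s, a))."""
    q = {a: STEP_COST + DISCOUNT * V[step(s, a)] for a in ACTIONS}
    z = sum(math.exp(beta * qa) for qa in q.values())
    return {a: math.exp(beta * qa) / z for a, qa in q.items()}

def path_likelihood(path, goal, states, step, beta=2.0):
    """P(A | G) under M1: product of per-step action probabilities.
    Assumes each observed step is one of the four unit moves. Feeding
    this into Bayes' rule over candidate goals gives the M1 posterior."""
    V = values(states, goal, step)
    p = 1.0
    for s, s_next in zip(path, path[1:]):
        a = (s_next[0] - s[0], s_next[1] - s[1])
        p *= policy(s, V, step, beta)[a]
    return p
```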

Page 17: Human action understanding as Bayesian inverse planning

Specific inverse planning models

• Model M2(β, γ): switching goals
– Just like M1, but the agent’s goal can change at any time step with probability γ.
– The agent plans greedily, not anticipating its own potential goal changes.
– A special case in which the goal effectively resamples uniformly at every step (γ = (#goals − 1)/#goals, i.e. 0.67 for the three goals used here) is a “one-step rule”, equivalent to a simple motion cue: “Object O is the goal of X if O is a salient object and X is moving towards O.”

[Graphical model: as in M1, but the goal Gt can switch to a new Gt+1 with probability γ at each step.]
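A hedged sketch of goal inference under M2, building on the M1 sketch above: the goal is now a latent Markov chain, and the posterior over the current goal is computed by forward filtering. Here step_likelihood(s, s_next, g) stands in for the per-step action probability P(aₜ | sₜ, g) from the planner; the names are illustrative.

```python
# Hedged sketch of M2's goal inference: forward filtering over a goal
# that persists w.p. (1 - gamma) and otherwise switches uniformly to
# one of the other goals. `step_likelihood(s, s_next, g)` plays the
# role of P(a_t | s_t, g) from the M1-style planner sketched above.

def m2_goal_posterior(path, goals, step_likelihood, gamma):
    """P(G_t | a_1 .. a_t), updated step by step along the path."""
    k = len(goals)
    belief = {g: 1.0 / k for g in goals}            # uniform prior over goals
    for s, s_next in zip(path, path[1:]):
        # Goal transition: keep, or switch to one of the k - 1 others.
        pred = {g: (1 - gamma) * belief[g]
                   + gamma * (1.0 - belief[g]) / (k - 1)
                for g in goals}
        # Condition on the observed movement at this step.
        belief = {g: pred[g] * step_likelihood(s, s_next, g) for g in goals}
        z = sum(belief.values())
        belief = {g: p / z for g, p in belief.items()}
    return belief
```

Note that with γ = (k − 1)/k the predictive distribution `pred` is uniform at every step, which is what makes that setting the “one-step rule” above.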

Page 18: Human action understanding as Bayesian inverse planning

Results

[Figure: example trajectories with people’s goal inferences plotted alongside predictions of the goal-switching model M2(2, 0.33).]

Page 19: Human action understanding as Bayesian inverse planning

Model fits

[Figure: scatter plots of model predictions (x-axis) against people’s mean judgments (y-axis), each on a 0-1 scale. Fixed-goal model M1(0.5): r = 0.83. Goal-switching model M2(β, γ) at β = 2 with γ = 0.33, 0.5, 0.77: r = 0.96-0.97.]

Page 20: Human action understanding as Bayesian inverse planning

Critical stimuli

Analysis of critical trials

Page 21: Human action understanding as Bayesian inverse planning

[Figure: critical trials with candidate goals A, B, C. Human goal inferences compared with M1(2) (fixed goal), M2(2, 0.1) (goal switch), and M2(2, 0.67) (the “one-step” rule).]

Motion-cue heuristic: “Object O is the goal of X if O is a salient object and X is moving towards O.”

Page 22: Human action understanding as Bayesian inverse planning

Experiment 2: social goals toward other agents

• Method
– Subjects view short animated trajectories (2 or 3 steps) of two agents moving in simple maze-like environments.
– One agent is chasing the other, who is fleeing. Subjects are asked to rate the relative probability that each agent is the chasing agent.

Which is more likely? “Green is chasing & Red is fleeing” or “Red is chasing & Green is fleeing”?

Page 23: Human action understanding as Bayesian inverse planning

Experiment 2: social goals toward other agents

• Design aim: distinguish theory-based accounts from simpler models based on motion cues.
– Previous evidence: Southgate & Csibra [demo]
– Critical stimuli in our study, e.g.:

Which is more likely? “Green is chasing & Red is fleeing” or “Red is chasing & Green is fleeing”?

Page 24: Human action understanding as Bayesian inverse planning

[Figure: nine critical stimuli (panels 1-9). For each, subjects judged which is more likely: “Green is chasing & Red is fleeing” or “Red is chasing & Green is fleeing”. Bars compare people’s judgments with two first-order models and two second-order models.]

Page 25: Human action understanding as Bayesian inverse planning

Summary

• Bayesian inverse planning gives a framework for making rich mental-state inferences from sparsely observed behavior, assuming the principle of rationality.

• At least in simple environments, inverse planning models provide a strong fit to people’s goal-attribution judgments – and a better fit than bottom-up accounts based purely on motion cues.

• Comparing inverse planning models with different hypothesis spaces lets us probe the nature and complexity of people’s representations of agents and their goals.

Page 26: Human action understanding as Bayesian inverse planning

Ongoing and future work

• In perceiving social-relational goals, can we distinguish first-order (“simple-minded”) agents from second-order (“smart”) agents?
– A demo:

• Agent profile: 1 Env: 1

• Agent profile: 1 Env: 2

• Agent profile: 4 Env: 1

• Agent profile: 4 Env: 2

Page 27: Human action understanding as Bayesian inverse planning

Ongoing and future work

• In perceiving social-relational goals, can we distinguish first-order (“simple-minded”) agents from second-order (“smart”) agents?

• Scaling up to more complex environments and richer mental-state representations: e.g., hierarchical goal structures, plans, recursive beliefs.

• What is the relation between actual psychology (how people actually think and plan) and intuitive psychology? Intuitive psychology may be more rational… relevance for the “theory vs. simulation” debate?

• How could a theory of mind be learned or modified with experience? Aspects to be learned might include: how complex agents’ goals are, how complex agents’ representations of other agents are, what different types of agents exist, and how agents’ beliefs depend on the state of the world.

Page 30: Human action understanding as Bayesian inverse planning

Alternative computational models

• Theory-based models
– Objects as goals
  • fixed or changing goals?
  • simple or complex intentions?
– Social goals towards other agents
  • first-order (“simple-minded”): try to achieve some relation to another agent without modeling that agent’s own action-planning process.
    – E.g., chasing: try to minimize distance to another agent; fleeing: try to maximize distance from another agent.
  • second-order (“smart”): try to achieve some relation to another agent while modeling that agent’s action-planning process.



Page 33: Human action understanding as Bayesian inverse planning

The present research

• Aims
– To test how accurately and precisely the inverse planning framework can explain people’s intuitive psychological judgments.
– To use the inverse planning framework as a tool to test alternative accounts of people’s intuitive theories of psychology.

• Experiments 1 & 2: goal inference

• Experiments 3 & 4: action prediction

Page 34: Human action understanding as Bayesian inverse planning

An alternative heuristic account?

A thought experiment…

[Figure: thought-experiment stimuli contrasting M2(2, 0.1) (goal switch) with M2(2, 0.67) (one-step heuristic).]

Page 35: Human action understanding as Bayesian inverse planning

An alternative heuristic account?

• Last-step heuristic: infer the goal based on only the last movement (instead of the entire path) – a special case of M2, equivalent to M2(β, 0.67).

• This model correlates highly with people’s judgments in Experiment 1.

• However, there are qualitative differences between this model’s predictions and people’s judgments, suggesting that people use a more sophisticated form of temporal integration.
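Relating this to the M2 filtering sketch given earlier (an observation about that sketch, not a claim from the slides): with k goals and γ = (k − 1)/k, here 0.67, the predictive distribution over goals is uniform before every step, so the posterior depends only on the likelihood of the single most recent movement. That is exactly the last-step heuristic.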

Page 36: Human action understanding as Bayesian inverse planning

Sample behavioral data



Page 39: Human action understanding as Bayesian inverse planning

[Figure: the thought-experiment trials with candidate goals A, B, C. Human goal inferences compared with M1(2) (fixed goal), M2(2, 0.1) (goal switch), and M2(2, 0.67) (the “last-step” heuristic).]


Page 41: Human action understanding as Bayesian inverse planning

Caution!

• We are not proposing this as a computational model of how people plan! It is a computational model of people’s mental model of how people plan.

• Whether planning is “rational” or well described as solving an MDP is an interesting but distinct question.

[Diagram: Beliefs (B) and Goals (G) feed into Rational Planning (MDP), which produces Actions (A).]