Finding Approximate POMDP Solutions through Belief Compression

Based on slides by Nicholas Roy, MIT


Post on 10-Feb-2016


Page 1: Finding Approximate POMDP Solutions through Belief Compression

Based on slides by Nicholas Roy, MIT

Finding Approximate POMDP Solutions through Belief Compression

Page 2: Finding Approximate POMDP Solutions through Belief Compression

Reliable Navigation

Conventional trajectories may not be robust to localisation error

Figure legend: estimated robot position, robot position distribution, true robot position, goal position.

Page 3: Finding Approximate POMDP Solutions through Belief Compression

Perception and Control

Diagram labels: world state, perception, control, control algorithms.

Page 4: Finding Approximate POMDP Solutions through Belief Compression

Perception and Control

Assumed full observability: World state → Probabilistic perception model → P(x) → argmax P(x) → Control. Brittle.

Exact POMDP planning: World state → Probabilistic perception model → P(x) → Control. Intractable.

Page 5: Finding Approximate POMDP Solutions through Belief Compression

Perception and Control

Assume full observability: brittle.

Exact POMDP planning: intractable.

World state → Probabilistic perception model → P(x) → Compressed P(x) → Control

Page 6: Finding Approximate POMDP Solutions through Belief Compression

Main Insight

World state → Probabilistic perception model → P(x) → Low-dimensional P(x) → Control

Good policies for real-world POMDPs can be found by planning over low-dimensional representations of the belief space.

Page 7: Finding Approximate POMDP Solutions through Belief Compression

Belief Space Structure

The controller may be globally uncertain... but not usually.

Page 8: Finding Approximate POMDP Solutions through Belief Compression

Coastal Navigation

Represent beliefs using $\tilde{b} = \langle \arg\max_s b(s);\ H(b) \rangle$

Discretise into low-dimensional belief space MDP
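As a rough illustration of the compressed representation above, the sketch below computes the most likely state and the entropy of a discrete belief. It assumes that H(b) denotes Shannon entropy (here in nats) and that a belief is a NumPy vector over states; the function name `amdp_statistic` is hypothetical and does not come from the slides.

```python
import numpy as np

def amdp_statistic(belief):
    """Return (most likely state, entropy in nats) for a discrete belief."""
    b = np.asarray(belief, dtype=float)
    b = b / b.sum()                      # normalise defensively
    nonzero = b[b > 0]                   # treat 0 * log 0 as 0
    entropy = float(-np.sum(nonzero * np.log(nonzero)))
    return int(np.argmax(b)), entropy

# Two hypothetical beliefs over four states.
peaked = [0.0, 0.05, 0.9, 0.05]
flat = [0.25, 0.25, 0.25, 0.25]
ml_peaked, h_peaked = amdp_statistic(peaked)
ml_flat, h_flat = amdp_statistic(flat)
```

The peaked belief yields a low entropy and the flat belief a high one, so the second coordinate lets a planner distinguish "confident" from "lost" even when the most likely state is the same.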

Page 9: Finding Approximate POMDP Solutions through Belief Compression

Coastal Navigation

Page 10: Finding Approximate POMDP Solutions through Belief Compression

A Hard Navigation Problem

Bar chart: Average Distance to Goal (distance in m, axis 0 to 9) for the Maximum Likelihood and AMDP policies.

Page 11: Finding Approximate POMDP Solutions through Belief Compression

Dimensionality Reduction

Principal Components Analysis

Diagram: original beliefs decomposed into weights and characteristic beliefs.

Page 12: Finding Approximate POMDP Solutions through Belief Compression

Principal Components Analysis

Given belief $b \in \mathbb{R}^n$, we want $\tilde{b} \in \mathbb{R}^m$, $m \ll n$.

Plot: collection of beliefs drawn from a 200-state problem (x-axis: state; y-axis: probability of being in state).

Page 13: Finding Approximate POMDP Solutions through Belief Compression

Principal Components Analysis

Given belief $b \in \mathbb{R}^n$, we want $\tilde{b} \in \mathbb{R}^m$, $m \ll n$.

Plot: one sample distribution and its representation with m = 9 (x-axis: state; y-axis: probability of being in state).
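The PCA compression described above can be sketched with a singular value decomposition. This is a minimal sketch under stated assumptions: the beliefs are synthetic Gaussian bumps over 200 states (the slides' data are not available), and the helper names `make_beliefs`, `pca_compress`, and `pca_reconstruct` are hypothetical.

```python
import numpy as np

def make_beliefs(n_states=200, n_beliefs=500, seed=0):
    """Synthetic beliefs: normalised Gaussian bumps (an illustrative assumption)."""
    rng = np.random.default_rng(seed)
    states = np.arange(n_states)
    centres = rng.uniform(20, n_states - 20, size=n_beliefs)
    widths = rng.uniform(3, 10, size=n_beliefs)
    b = np.exp(-0.5 * ((states[None, :] - centres[:, None]) / widths[:, None]) ** 2)
    return b / b.sum(axis=1, keepdims=True)

def pca_compress(beliefs, m):
    """Return the mean belief, m characteristic beliefs, and per-belief weights."""
    mean = beliefs.mean(axis=0)
    _, _, vt = np.linalg.svd(beliefs - mean, full_matrices=False)
    basis = vt[:m]                          # shape (m, n)
    weights = (beliefs - mean) @ basis.T    # shape (N, m): the low-dimensional beliefs
    return mean, basis, weights

def pca_reconstruct(mean, basis, weights):
    """Map low-dimensional weights back to n-dimensional vectors."""
    return mean + weights @ basis

beliefs = make_beliefs()
mean, basis, weights = pca_compress(beliefs, m=9)
recon = pca_reconstruct(mean, basis, weights)
```

Nothing in this linear reconstruction constrains the entries of `recon` to be non-negative, so a reconstructed "belief" may contain negative values in regions where the true probability is close to zero.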

Page 14: Finding Approximate POMDP Solutions through Belief Compression

Principal Components Analysis

Many real-world POMDP distributions are characterised by large regions of low probability.

Idea: create a fitting criterion that is (exponentially) stronger in low-probability regions (E-PCA).
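One way to make this idea concrete is exponential-family PCA with an exponential link, where each belief is approximated by exp(W U) and the fit minimises sum(exp(W U) - B * (W U)). The slides do not state the loss or the optimiser, so both are assumptions here: the loss is one common formulation, and plain gradient descent with step halving is swapped in purely for brevity. All names (`epca_loss`, `epca_fit`, `epca_reconstruct`) and the synthetic data are hypothetical.

```python
import numpy as np

def epca_loss(B, W, U):
    """Assumed loss: sum(exp(W U) - B * (W U)), with beliefs stored as rows of B."""
    theta = W @ U
    return float(np.sum(np.exp(theta) - B * theta))

def epca_fit(B, m, iters=300, step=1e-2, seed=0):
    """Gradient descent with step halving; returns weights W, bases U, loss history."""
    rng = np.random.default_rng(seed)
    n_beliefs, n_states = B.shape
    W = 0.1 * rng.standard_normal((n_beliefs, m))
    U = 0.1 * rng.standard_normal((m, n_states))
    history = [epca_loss(B, W, U)]
    for _ in range(iters):
        G = np.exp(W @ U) - B               # gradient of the loss w.r.t. W U
        grad_W, grad_U = G @ U.T, W.T @ G
        t = step
        while t > 1e-12:                    # halve the step until the loss drops
            W_new, U_new = W - t * grad_W, U - t * grad_U
            loss = epca_loss(B, W_new, U_new)
            if loss < history[-1]:
                W, U = W_new, U_new
                history.append(loss)
                break
            t *= 0.5
    return W, U, history

def epca_reconstruct(W, U):
    """Exponentiate and normalise, so every entry is strictly positive."""
    recon = np.exp(W @ U)
    return recon / recon.sum(axis=1, keepdims=True)

# Synthetic beliefs over 30 states (illustrative only).
rng = np.random.default_rng(1)
states = np.arange(30)
centres = rng.uniform(5, 25, size=40)
B = np.exp(-0.5 * ((states[None, :] - centres[:, None]) / 2.0) ** 2)
B /= B.sum(axis=1, keepdims=True)

W, U, history = epca_fit(B, m=3)
recon = epca_reconstruct(W, U)
```

Because the reconstruction passes through an exponential, it cannot go negative, and errors in low-probability regions are penalised on a log scale rather than a linear one.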

Page 15: Finding Approximate POMDP Solutions through Belief Compression

Example EPCA

Plot legend: 1 basis, 2 bases, 3 bases, 4 bases (x-axis: state; y-axis: probability of being in state).

Page 16: Finding Approximate POMDP Solutions through Belief Compression

Example Reduction

Page 17: Finding Approximate POMDP Solutions through Belief Compression

Finding Dimensionality

E-PCA will indicate the appropriate number of bases, depending on the beliefs encountered.

Page 18: Finding Approximate POMDP Solutions through Belief Compression

Planning

Original POMDP (states S1, S2, S3) → E-PCA → Low-dimensional belief space $\tilde{B}$ → Discretise → Discrete belief space MDP
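Once a discrete belief space MDP is available, any standard MDP solver can be applied to it. The sketch below uses value iteration, which is an assumption about the solver rather than something the slide states. The array layout (`T[a, i, j]` for transitions between discrete beliefs, `R[i]` for rewards), the discount `gamma`, and the three-belief toy problem are all hypothetical.

```python
import numpy as np

def value_iteration(T, R, gamma=0.95, tol=1e-8, max_iters=10_000):
    """T[a, i, j] = P(belief j | belief i, action a); R[i] = reward at belief i."""
    V = np.zeros(R.shape[0])
    for _ in range(max_iters):
        Q = R[None, :] + gamma * (T @ V)    # shape (num_actions, num_beliefs)
        V_new = Q.max(axis=0)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    Q = R[None, :] + gamma * (T @ V)
    return V, Q.argmax(axis=0)

# Toy problem: three discrete beliefs in a chain, reward only at the last one.
T = np.array([
    np.eye(3),                                            # action 0: stay
    [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 1.0]],  # action 1: move right
])
R = np.array([0.0, 0.0, 1.0])
V, policy = value_iteration(T, R)
```

In the toy problem the greedy policy moves right from the first two beliefs, since only the last belief carries reward.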

Page 19: Finding Approximate POMDP Solutions through Belief Compression

Model Parameters

Reward function $R(\tilde{b})$

Back-project to high dimensional belief p(s) over s1, s2, s3

Compute expected reward from belief: $R(\tilde{b}) = E_b(R(s)) = \sum_{s \in S} p(s)\,R(s)$
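The expected-reward computation can be sketched in a few lines. The back-projection used here, p(s) proportional to exp(U b̃), is an assumption in the spirit of E-PCA; the slide does not spell it out. The basis `U`, the reward vector `R`, and the function names are hypothetical.

```python
import numpy as np

def reconstruct_belief(U, b_tilde):
    """Back-project: p(s) proportional to exp((U @ b_tilde)[s]) (assumed link)."""
    theta = U @ b_tilde
    p = np.exp(theta - theta.max())     # subtract the max for numerical stability
    return p / p.sum()

def expected_reward(U, b_tilde, R):
    """Expected reward: sum over states of p(s) * R(s)."""
    return float(reconstruct_belief(U, b_tilde) @ R)

# Hypothetical three-state problem with a one-dimensional compressed belief.
U = np.array([[0.0], [1.0], [2.0]])
R = np.array([0.0, 0.0, 1.0])
r_uniform = expected_reward(U, np.array([0.0]), R)   # b_tilde = 0 gives a uniform belief
r_peaked = expected_reward(U, np.array([5.0]), R)    # mass concentrates on the last state
```

With `b_tilde = 0` the back-projected belief is uniform, so the expected reward is the plain average of `R`; larger values of `b_tilde` shift mass towards the rewarding state.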

Page 20: Finding Approximate POMDP Solutions through Belief Compression

Model Parameters

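Diagram columns: low dimension, full dimension.

1. For each belief $\tilde{b}_i$ and action a

2. Recover full belief $b_i$

3. Propagate according to action

4. Propagate according to observation

5. Recover $\tilde{b}_j$

6. Set $T(\tilde{b}_i, a, \tilde{b}_j)$ to probability of observation

$T(\tilde{b}_i, a, \tilde{b}_j) = \sum_{k=1}^{|Z|} \sum_{l=1}^{|S|} \sum_{m=1}^{|S|} p(z_k \mid s_l)\, p(s_l \mid s_m, a)\, b_i(s_m)$

Here the sum over k is taken over the observations $z_k$ whose update leads to $\tilde{b}_j$.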
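The six steps can be sketched as a loop over grid points and observations. This is an illustration under stated assumptions, not the authors' implementation: `compress` and `reconstruct` are passed in as functions (identity placeholders in the demo, where E-PCA would be used in practice), each updated belief is snapped to its nearest grid point by Euclidean distance, and the array layouts `P_trans[a, m, l]` and `P_obs[l, k]` are hypothetical.

```python
import numpy as np

def belief_transitions(grid, a, P_trans, P_obs, compress, reconstruct):
    """Return T[i, j] for action a over the low-dimensional grid points.

    P_trans[a, m, l] = p(s_l | s_m, a); P_obs[l, k] = p(z_k | s_l).
    """
    n = len(grid)
    T = np.zeros((n, n))
    for i, b_tilde in enumerate(grid):               # step 1
        b = reconstruct(b_tilde)                     # step 2: recover full belief
        predicted = b @ P_trans[a]                   # step 3: propagate by action
        for k in range(P_obs.shape[1]):
            joint = predicted * P_obs[:, k]          # step 4: propagate by observation
            p_z = joint.sum()                        # probability of observing z_k
            if p_z == 0.0:
                continue
            b_next = compress(joint / p_z)           # step 5: recover low-dim belief
            j = int(np.argmin([np.linalg.norm(b_next - g) for g in grid]))
            T[i, j] += p_z                           # step 6: accumulate probability
    return T

# Toy two-state, two-observation problem with one action.
P_trans = np.array([[[0.9, 0.1], [0.1, 0.9]]])
P_obs = np.array([[0.8, 0.2], [0.2, 0.8]])
grid = np.array([[p, 1.0 - p] for p in np.linspace(0.0, 1.0, 11)])
identity = lambda x: x                               # placeholder for E-PCA
T = belief_transitions(grid, 0, P_trans, P_obs, identity, identity)
```

Each row of `T` sums to one because the observation probabilities for a fixed belief and action sum to one; several observations that land on the same grid point simply add their probabilities.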

Page 21: Finding Approximate POMDP Solutions through Belief Compression

Robot Navigation Example

Figure labels: true (hidden) robot position, goal position, goal state, initial distribution.

Page 22: Finding Approximate POMDP Solutions through Belief Compression

Robot Navigation Example

Figure legend: true robot position, goal position.

Page 23: Finding Approximate POMDP Solutions through Belief Compression

Policy Comparison

Bar chart: Average Distance to Goal (distance in m, axis 0 to 9) for the Maximum Likelihood, AMDP, and E-PCA (6 bases) policies.

Page 24: Finding Approximate POMDP Solutions through Belief Compression

People Finding

Page 25: Finding Approximate POMDP Solutions through Belief Compression

People Finding as a POMDP

Fully observable robot

Position of person unknown

Figure legend: robot position, true person position.

Page 26: Finding Approximate POMDP Solutions through Belief Compression

Finding and Tracking People

Figure legend: robot position, true person position.

Page 27: Finding Approximate POMDP Solutions through Belief Compression

People Finding as a POMDP

Factored belief space

2 dimensions: fully-observable robot position

6 dimensions: distribution over person positions

Regular grid gives ≈ $10^{16}$ states

Page 28: Finding Approximate POMDP Solutions through Belief Compression

Variable Resolution

Non-regular grid using samples $\tilde{b}_1, \tilde{b}_2, \tilde{b}_3, \tilde{b}_4, \tilde{b}_5$

Example transitions: $T(\tilde{b}_1, a_1, \tilde{b}_2)$, $T(\tilde{b}_1, a_2, \tilde{b}_5)$

Compute model parameters using nearest-neighbour

Page 29: Finding Approximate POMDP Solutions through Belief Compression

Refining the Grid

Sample beliefs according to policy

Construct new model

Keep new belief if $V(\tilde{b}'_1) > V(\tilde{b}_1)$

Page 30: Finding Approximate POMDP Solutions through Belief Compression

The Optimal Policy

Figure panels: original distribution; reconstruction using EPCA and 6 bases.

Figure legend: robot position, true person position.

Page 31: Finding Approximate POMDP Solutions through Belief Compression

Policy Comparison

Bar chart: average time to find person (y-axis: average # of actions to find person, 0 to 250) for the Closest, Densest, Maximum Likelihood, E-PCA, and Refined E-PCA policies. Also labelled: fully observable MDP.

E-PCA: 72 states

Refined E-PCA: 260 states

Page 32: Finding Approximate POMDP Solutions through Belief Compression

Nick’s Thesis Contributions

Good policies for real-world POMDPs can be found by planning over a low-dimensional representation of the belief space, using E-PCA.

POMDPs can scale to bigger, more complicated real-world problems.

POMDPs can be used for real deployed robots.