![Page 1: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/1.jpg)
Representing hierarchical POMDPs as DBNs for multi-scale
robot localization
G. Theocharous, K. Murphy, L. Kaelbling
Presented by: Hannaneh Hajishirzi
![Page 2: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/2.jpg)
Outline
• Define H-HMM
  – Flattening H-HMM
• Define H-POMDP
  – Flattening H-POMDP
• Approximate H-POMDP with DBN
• Inference and Learning in H-POMDP
![Page 3: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/3.jpg)
Introduction
• H-POMDPs represent state-space at multiple levels of abstraction
  – Scale much better to larger environments
  – Simplify planning
    • Abstract states are more deterministic
  – Simplify learning
    • Number of free parameters is reduced
![Page 4: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/4.jpg)
Hierarchical HMMs
• A generalization of the HMM for modeling domains with hierarchical structure
  – Application: NLP
• Concrete states: emit single observation
• Abstract states: emit strings of observations
• Emitted strings by abstract states are governed by sub-HMMs
![Page 5: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/5.jpg)
Example
• HHMM representing a(xy)+b | c(xy)+d
When the sub-HHMM is finished, control is returned to wherever it was called from
![Page 6: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/6.jpg)
![Page 7: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/7.jpg)
HHMM to HMM
• Create a state for every leaf in HHMM
• Flat transition probability = Sum( P(path) ), taken over all paths in the HHMM
• Disadvantages:
  – Flattening loses modularity
  – Learning requires more samples
![Page 8: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/8.jpg)
Representing HHMMs as DBNs
$Q_t^d$: state at level $d$
$F_t^d = 1$: if the HMM at level $d$ has finished
![Page 9: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/9.jpg)
H-POMDPs
• HHMMs with inputs and reward function
• Problems:
  – Planning: Find mapping from belief states to actions
  – Filtering: Compute the belief state online
  – Smoothing: Compute $P(x_t \mid y_{1:T}, u_{1:T})$ offline
  – Learning: Find MLE of model parameters
![Page 10: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/10.jpg)
H-POMDP for Robot Navigation
Flat model
* Robot position: $X_t$ (1..10)
Hierarchical model
* Abstract state: $X_t^1$ (1..4)
* Concrete state: $X_t^2$ (1..3)
* Observation: $Y_t$ (4 bits)
This paper ignores the problem of how to choose the actions.
![Page 11: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/11.jpg)
State Transition Diagram for 2-H-POMDP
Sample path:
![Page 12: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/12.jpg)
State Transition Diagram for Corridor Environment
Abstract States
Entry States
Exit States
Concrete States
![Page 13: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/13.jpg)
Flattening H-POMDPs
• Advantages of H-POMDP over corresponding POMDP:
  – Learning is easier: Learn sub-models
  – Planning is easier: Reason in terms of “macro” actions
![Page 14: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/14.jpg)
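Dynamic Bayesian Networks
[Figure: a state POMDP (transition arcs labeled 0.7, 0.08, 0.05, 0.01) next to the equivalent factored DBN POMDP over nodes $U_t$, $L_t$, $\Theta_t$, $L_{t+1}$, $\Theta_{t+1}$]
# of parameters, state POMDP: 12 × 9 = 108
# of parameters, factored DBN POMDP: 3 × 4 × 3 + 4 = 40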
![Page 15: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/15.jpg)
Representing H-POMDPs as DBNs
[Figure: state H-POMDP (EAST/WEST corridor states) beside the factored DBN H-POMDP with nodes $U_t$, $L_t^1$, $L_t^2$, $\Theta_t$, $E_t$, $Y_t$, drawn over two time slices]
![Page 16: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/16.jpg)
![Page 17: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/17.jpg)
![Page 18: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/18.jpg)
![Page 19: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/19.jpg)
![Page 20: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/20.jpg)
H-POMDPs as DBNs
$L_t^1$: Abstract location
$L_t^2$: Concrete location
$\Theta_t$: Orientation
$E_t$: Exit node (5 values), representing no-exit and s-, n-, l-, r-exit
$Y_t$: Observation
$U_t$: Action node
![Page 21: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/21.jpg)
Transition Model
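Abstract horizontal transition matrix $H^2$:

$$P(L_t^2 = j \mid L_{t-1}^2 = i, E_{t-1} = e) = \begin{cases} \delta(i, j) & \text{if } e = \text{no-exit} \\ H^2(i, e, j) & \text{otherwise} \end{cases}$$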
![Page 22: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/22.jpg)
Transition Model
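$$P(L_t^2 = j \mid L_{t-1}^2 = i, E_{t-1} = e) = \begin{cases} \delta(i, j) & \text{if } e = \text{no-exit} \\ H^2(i, e, j) & \text{otherwise} \end{cases}$$

$$P(L_t^1 = j \mid L_{t-1}^1 = i, E_{t-1} = e, \Theta_{t-1} = \theta, L_t^2 = a) = \begin{cases} H^1(i, \theta, a, j) & \text{if } e = \text{no-exit} \\ V(e, a, j) & \text{otherwise} \end{cases}$$

$H^1$: Concrete horizontal transition matrix
$V$: Concrete vertical entry vector

Probability of entering exit state $e$:

$$P(E_t = e \mid L_t^1 = j, L_t^2 = a, \Theta_t = \theta) = X(j, a, \theta, e)$$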
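The case split in the transition model (stay in place while no exit fires, otherwise consult the horizontal matrix or the vertical entry vector) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the dictionary layout of `H2`, `H1` and `V`, and every numeric value are hypothetical, and the dependence on orientation is dropped for brevity.

```python
NO_EXIT = "no-exit"

def abstract_transition(i, e, H2):
    """Distribution over the next abstract state, given abstract state i and exit value e.

    H2 (hypothetical layout): dict mapping an exit type to a row-stochastic matrix.
    """
    n = len(next(iter(H2.values())))
    if e == NO_EXIT:
        # No exit fired: the abstract state cannot change (delta function).
        return [1.0 if j == i else 0.0 for j in range(n)]
    # An exit fired: use the abstract horizontal transition matrix.
    return list(H2[e][i])

def concrete_transition(i, e, a, H1, V):
    """Distribution over the next concrete state inside abstract state a.

    H1 (hypothetical layout): dict abstract state -> row-stochastic matrix.
    V  (hypothetical layout): dict (exit type, abstract state) -> entry vector.
    """
    if e == NO_EXIT:
        # Still inside the same sub-model: horizontal move.
        return list(H1[a][i])
    # The previous sub-model was exited: enter abstract state a vertically.
    return list(V[(e, a)])

# Illustrative numbers only.
H2 = {"r": [[0.1, 0.9], [0.9, 0.1]]}
H1 = {0: [[0.1, 0.9, 0.0], [0.0, 0.1, 0.9], [0.0, 0.0, 1.0]]}
V = {("r", 1): [1.0, 0.0, 0.0]}

print(abstract_transition(0, NO_EXIT, H2))
print(abstract_transition(1, "r", H2))
print(concrete_transition(1, NO_EXIT, 0, H1, V))
print(concrete_transition(2, "r", 1, H1, V))
```

Keeping one table per level is what preserves the modularity that flattening loses: a sub-model's `H1` entry can be re-estimated without touching the abstract matrix.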
![Page 23: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/23.jpg)
Observation Model
• Probability of seeing a wall or opening on each of 4 sides of the robot
• Naïve Bayes assumption: $P(Y_t \mid X_t) = \prod_{i=1}^{4} P(Y_t^i \mid X_t)$, where $X_t = (L_t^1, L_t^2, \Theta_t)$
• Learn the appearance of the cell in all directions: $B(a, j, y) = P(Y_t = y \mid L_t^1 = j, L_t^2 = a)$
• Map global coordinate frame to robot’s local coordinate frame. Then, for a robot rotated by 180°, $P(Y_t = y \mid L_t^1 = j, L_t^2 = a, \Theta_t = 180^\circ) = B(a, j, R_{180}\, y)$
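The naïve Bayes observation model, combined with the global-to-local frame mapping, can be illustrated as follows. This is a sketch under assumptions that are not in the slides: orientation is encoded as a number of clockwise quarter turns (0 = north), each cell stores one wall probability per global direction in the order (N, E, S, W), and the observation is a 4-tuple of bits for (front, right, back, left) with 1 meaning wall. The function name `obs_likelihood` is hypothetical.

```python
def obs_likelihood(y, p_wall_global, orientation):
    """P(y | cell, orientation) as a product over the 4 sides of the robot.

    y              : tuple of 4 bits, local frame (front, right, back, left), 1 = wall
    p_wall_global  : 4 wall probabilities for the cell, global frame (N, E, S, W)
    orientation    : clockwise quarter turns from north (assumed encoding)
    """
    likelihood = 1.0
    for side, bit in enumerate(y):
        # Local side k looks at global direction (k + orientation) mod 4.
        p_wall = p_wall_global[(side + orientation) % 4]
        likelihood *= p_wall if bit == 1 else 1.0 - p_wall
    return likelihood

# Illustrative cell: likely walls to the north and south, openings east and west.
cell = [0.9, 0.1, 0.9, 0.1]
print(obs_likelihood((1, 0, 1, 0), cell, orientation=0))  # robot faces north
print(obs_likelihood((1, 0, 1, 0), cell, orientation=1))  # robot faces east
```

Storing the appearance once in the global frame means a single table per cell serves all four orientations, which is the parameter saving the frame mapping is meant to deliver.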
![Page 24: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/24.jpg)
Example
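$$H^2 = \begin{pmatrix} .1 & .9 \\ .9 & .1 \end{pmatrix}, \qquad H^{1,\cdot} = \begin{pmatrix} .1 & .9 & 0 \\ 0 & .1 & .9 \\ 0 & 0 & 1 \end{pmatrix}$$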
![Page 25: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/25.jpg)
Inference
• Online filtering: $P(X_t \mid y_{1:t}, u_{1:t})$
  – Input of controller: MLE of the abstract and concrete states
• Offline smoothing: $P(X_t \mid y_{1:T}, u_{1:T})$
  – $O(D K^{1.5D} T)$, where D: # of dimensions, K: # of states in each level
  – 1.5D: size of largest clique in DBN = the state nodes at t-1 + half of the state nodes at t
  – Approximation (belief propagation): $O(DKT)$
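Online filtering amounts to a predict-then-correct recursion on the belief state. The sketch below shows one such step for a flat (non-hierarchical) state space, which is a simplification of the factored model discussed here; the function name `filter_step`, the matrix layout, and the numbers are assumptions for illustration only.

```python
def filter_step(belief, trans, obs_lik):
    """One step of forward filtering.

    belief  : list, previous belief b_{t-1}(s)
    trans   : trans[s][s2] = P(s2 | s, u) for the action u that was taken
    obs_lik : obs_lik[s2] = P(y_t | s2) for the observation that was received
    """
    n = len(belief)
    # Predict: push the old belief through the transition model.
    predicted = [sum(belief[s] * trans[s][s2] for s in range(n)) for s2 in range(n)]
    # Correct: weight each state by how well it explains the observation.
    unnormalized = [predicted[s2] * obs_lik[s2] for s2 in range(n)]
    z = sum(unnormalized)
    if z == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [p / z for p in unnormalized]

belief = filter_step([0.5, 0.5], [[0.1, 0.9], [0.9, 0.1]], [0.8, 0.2])
print(belief)
# The controller could then act on the most likely state:
print(max(range(len(belief)), key=lambda s: belief[s]))
```

In the factored DBN the same recursion is run over the joint of the abstract location, concrete location, orientation and exit nodes, which is where the clique-size term in the complexity bound comes from.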
![Page 26: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/26.jpg)
Learning
• Maximum likelihood parameter estimate using EM
• In E step, compute: $P(V_t, \mathrm{Pa}(V_t) \mid y_{1:T}, u_{1:T})$
• In M step, normalize the matrix of expected counts:

$$\hat{H}^2(t, i, e, j) = P(L_t^2 = j, L_{t-1}^2 = i, E_{t-1} = e \mid O)$$

$$H^2(i, e, j) = \frac{\sum_{t=2}^{T} \hat{H}^2(t, i, e, j)}{\sum_{j'} \sum_{t=2}^{T} \hat{H}^2(t, i, e, j')}$$
![Page 27: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/27.jpg)
Learning (Cont.)
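Exit probabilities:

$$\hat{X}(t, j, a, \theta, e) = P(L_t^1 = j, L_t^2 = a, \Theta_t = \theta, E_t = e \mid O)$$

Concrete horizontal transition matrix (with $n$ = no-exit):

$$\hat{H}^1(t, i, \theta, a, j) = P(L_{t-1}^1 = i, \Theta_{t-1} = \theta, E_{t-1} = n, L_t^2 = a, L_t^1 = j \mid O)$$

Vertical transition vector:

$$\hat{V}(t, e, a, j) = P(E_{t-1} = e, L_t^2 = a, L_t^1 = j \mid O), \quad e \neq n$$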
![Page 28: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/28.jpg)
Estimating Observation Model
• Map local observations into world-centered
Probability of observing y, facing North
![Page 29: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/29.jpg)
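Hierarchical model localizes better

$$b_t(s) = P(X_t = s \mid y_{1:t}, u_{1:t})$$

$$\text{localization accuracy} = \sum_{t=1}^{T} b_t(s)$$

[Figure: localization accuracy of the factored DBN H-POMDP, the H-POMDP, and the state POMDP, compared with accuracy before training]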
![Page 30: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/30.jpg)
Conclusions
• Represent H-POMDPs with DBNs
  – Learn large models with less data
• Difference with SLAM:
  – SLAM is harder to generalize
![Page 31: Representing hierarchical POMDPs as DBNs for multi-scale robot localization G. Thocharous, K. Murphy, L. Kaelbling Presented by: Hannaneh Hajishirzi](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf8e1a28abf838c8cb67/html5/thumbnails/31.jpg)
Complexity of Inference
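[Figure: state H-POMDP (EAST/WEST corridor states) beside the factored DBN H-POMDP with nodes $A_t$, $L_t^1$, $L_t^2$, $\Theta_t$, $E_t$, $O_t$, drawn over two time slices]
• State H-POMDP: $O(S T^3)$
• Factored DBN H-POMDP: $O(S^{1.5} T)$
• Number of states: $S = K^D$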
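To get a feel for the gap between the two bounds, the growth terms can be evaluated directly. The sketch below only plugs numbers into the asymptotic expressions, ignoring constant factors, so the results are relative growth rates rather than running times; the function names and the chosen values of K, D and T are illustrative assumptions.

```python
def num_states(K, D):
    """Total number of states S for D levels with K states each."""
    return K ** D

def state_hpomdp_cost(S, T):
    """Growth term for inference in the state H-POMDP: S * T^3."""
    return S * T ** 3

def factored_dbn_cost(S, T):
    """Growth term for inference in the factored DBN H-POMDP: S^1.5 * T."""
    return S ** 1.5 * T

K, D, T = 10, 2, 100  # illustrative values
S = num_states(K, D)
print("S =", S)
print("state H-POMDP term:", state_hpomdp_cost(S, T))
print("factored DBN term :", factored_dbn_cost(S, T))
print("ratio             :", state_hpomdp_cost(S, T) / factored_dbn_cost(S, T))
```

The DBN term is linear in the sequence length T, so its advantage widens as trajectories get longer, while the state H-POMDP term grows cubically in T.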