TEMPORAL PROBABILISTIC MODELS PT 2
AGENDA
Kalman filtering
Dynamic Bayesian Networks
Particle filtering
KALMAN FILTERING
In a nutshell:
Efficient filtering in continuous state spaces
Gaussian transition and observation models
Ubiquitous for tracking with noisy sensors, e.g. radar, GPS, cameras
HIDDEN MARKOV MODEL FOR ROBOT LOCALIZATION
Use observations + transition dynamics to get a better idea of where the robot is at time t
[Diagram: hidden state variables X0, X1, X2, X3 in a chain; observed variables z1, z2, z3]
Predict – observe – predict – observe…
Maintain a belief state $b_t$ over time: $b_t(x) = P(X_t = x \mid z_{1:t})$
BAYESIAN FILTERING WITH BELIEF STATES
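Compute $b_t$, given $z_t$ and the prior belief $b_{t-1}$
Recursive filtering equation:
$b_t(x_t) = \frac{1}{Z} P(z_t \mid x_t) \sum_{x_{t-1}} P(x_t \mid x_{t-1}) \, b_{t-1}(x_{t-1})$
Predict: compute $P(X_t \mid z_{1:t-1})$ using dynamics alone (the sum over $x_{t-1}$)
Update via the observation $z_t$: multiply by $P(z_t \mid x_t)$ and normalize by $Z$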
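The predict and update steps of the recursive filter can be sketched for a small discrete state space as follows. This is only an illustration: the three-cell corridor, the transition table, and the sensor likelihoods are made-up values, not part of the lecture.

```python
def predict(belief, transition):
    """Push the belief through the dynamics: P(X_t | z_{1:t-1})."""
    n = len(belief)
    return [sum(transition[i][j] * belief[i] for i in range(n)) for j in range(n)]


def update(predicted, likelihood):
    """Weight by the observation likelihood P(z_t | X_t) and normalize by Z."""
    unnormalized = [l * p for l, p in zip(likelihood, predicted)]
    z = sum(unnormalized)
    return [u / z for u in unnormalized]


# Hypothetical 3-cell corridor: the robot tends to move one cell to the right.
# transition[i][j] = P(X_t = j | X_{t-1} = i)
transition = [
    [0.2, 0.8, 0.0],
    [0.0, 0.2, 0.8],
    [0.0, 0.0, 1.0],
]
# likelihood[j] = P(z_t | X_t = j) for a sensor reading that favors cell 1
likelihood = [0.1, 0.8, 0.1]

belief = [1.0, 0.0, 0.0]                  # robot starts in cell 0
predicted = predict(belief, transition)   # belief after the dynamics alone
belief = update(predicted, likelihood)    # belief after observing z_t
print(predicted, belief)
```

Repeating `predict` then `update` for each new observation gives the predict, observe, predict, observe cycle.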
IN CONTINUOUS STATE SPACES…
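Compute $b_t$, given $z_t$ and the prior belief $b_{t-1}$
Continuous filtering equation:
$b_t(x_t) = \frac{1}{Z} P(z_t \mid x_t) \int P(x_t \mid x_{t-1}) \, b_{t-1}(x_{t-1}) \, dx_{t-1}$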
GENERAL BAYESIAN FILTERING IN CONTINUOUS STATE SPACES
How to evaluate this integral?
How to calculate Z?
How to even represent a belief state?
KEY REPRESENTATIONAL DECISIONS
Pick a method for representing distributions
  Discrete: tables
  Continuous: fixed parameterized classes vs. particle-based techniques
Devise methods to perform key calculations (marginalization, conditioning) on the representation
  Exact or approximate?
GAUSSIAN DISTRIBUTION
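Mean m, standard deviation s
Distribution is denoted N(m,s)
If X ~ N(m,s), then $P(X = x) = \frac{1}{s\sqrt{2\pi}} \exp\left(-\frac{(x-m)^2}{2s^2}\right)$
The factor $\frac{1}{s\sqrt{2\pi}}$ is the normalization factor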
LINEAR GAUSSIAN TRANSITION MODEL FOR MOVING 1D POINT
Consider position and velocity $x_t$, $v_t$, and time step $h$
Without noise:
$x_{t+1} = x_t + h v_t$
$v_{t+1} = v_t$
With Gaussian noise of std $s_1$:
$P(x_{t+1} \mid x_t) \propto \exp\left(-\frac{(x_{t+1} - (x_t + h v_t))^2}{2 s_1^2}\right)$
i.e. $X_{t+1} \sim N(x_t + h v_t, s_1)$
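LINEAR GAUSSIAN TRANSITION MODEL
If the prior on position is Gaussian, then the posterior is also Gaussian
The mean shifts by $vh$, and the noise variance $s_1^2$ adds to the prior variance $s^2$:
$N(m, s) \rightarrow N(m + vh, \sqrt{s^2 + s_1^2})$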
LINEAR GAUSSIAN OBSERVATION MODEL
Position observation $z_t$
Gaussian noise of std $s_2$
$z_t \sim N(x_t, s_2)$
LINEAR GAUSSIAN OBSERVATION MODEL
If the prior on position is Gaussian, then the posterior is also Gaussian
$m \leftarrow \frac{s^2 z + s_2^2 m}{s^2 + s_2^2}$
$s^2 \leftarrow \frac{s^2 s_2^2}{s^2 + s_2^2}$
[Figure: position prior, observation probability, and the resulting posterior probability]
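As a minimal sketch, the 1D transition and observation updates can be written as two small functions. The function names and the numbers in the usage lines are illustrative assumptions. The prediction step adds variances (not standard deviations), which is the standard rule for a sum of independent Gaussians.

```python
import math


def predict_1d(m, s, v, h, s1):
    """Move the prior N(m, s) by v*h and add Gaussian noise of std s1."""
    return m + v * h, math.sqrt(s ** 2 + s1 ** 2)


def update_1d(m, s, z, s2):
    """Combine the prior N(m, s) with an observation z ~ N(x, s2)."""
    mean = (s ** 2 * z + s2 ** 2 * m) / (s ** 2 + s2 ** 2)
    var = (s ** 2 * s2 ** 2) / (s ** 2 + s2 ** 2)
    return mean, math.sqrt(var)


# Illustrative numbers: start at N(0, 1), move with v = 1 for h = 1,
# then observe z = 2 with sensor std 1.
m, s = 0.0, 1.0
m, s = predict_1d(m, s, v=1.0, h=1.0, s1=1.0)   # mean 1, variance 2
m, s = update_1d(m, s, z=2.0, s2=1.0)           # mean 5/3, variance 2/3
print(m, s)
```

The posterior mean lands between the predicted position and the measurement, closer to whichever has the smaller variance, and the posterior variance is smaller than both.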
MULTIVARIATE GAUSSIANS
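Multivariate analog in N-D space
Mean (vector) m, covariance (matrix) S
X ~ N(m,S)
$P(X = x) = \frac{1}{(2\pi)^{N/2} |S|^{1/2}} \exp\left(-\frac{1}{2}(x-m)^T S^{-1} (x-m)\right)$
The factor $\frac{1}{(2\pi)^{N/2} |S|^{1/2}}$ is the normalization factor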
MULTIVARIATE LINEAR GAUSSIAN PROCESS
A linear transformation + multivariate Gaussian noise:
$y = A x + e$, $e \sim N(m, S)$
If the prior state distribution is Gaussian, then the posterior state distribution is Gaussian
If we observe one component of a Gaussian, then its posterior is also Gaussian
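MULTIVARIATE COMPUTATIONS
Linear transformations of Gaussians
If $x \sim N(m, S)$ and $y = A x + b$, then $y \sim N(Am + b, A S A^T)$
Consequence
If $x \sim N(m_x, S_x)$ and $y \sim N(m_y, S_y)$ are independent and $z = x + y$, then $z \sim N(m_x + m_y, S_x + S_y)$
Conditional of a Gaussian
If $[x_1, x_2] \sim N([m_1, m_2], [S_{11}, S_{12}; S_{21}, S_{22}])$, then on observing $x_2 = z$ we have
$x_1 \sim N(m_1 + S_{12} S_{22}^{-1} (z - m_2), \; S_{11} - S_{12} S_{22}^{-1} S_{21})$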
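For intuition, here is a sketch of the conditioning rule in the simplest case, a two-dimensional Gaussian where both $x_1$ and $x_2$ are scalars, so the matrix inverse becomes a division. The function name and the example numbers are assumptions for illustration. Note the plus sign in the mean: observing $x_2$ above its mean pulls $x_1$ up when the two are positively correlated.

```python
def condition_bivariate(m1, m2, s11, s12, s22, z):
    """Mean and variance of x1 given x2 = z for a 2D Gaussian.

    s11, s22 are the variances of x1 and x2; s12 is their covariance.
    """
    gain = s12 / s22                 # S12 * S22^{-1} in the scalar case
    mean = m1 + gain * (z - m2)
    var = s11 - gain * s12           # S11 - S12 * S22^{-1} * S21
    return mean, var


# Illustrative numbers: zero means, var(x1) = 2, var(x2) = 1, cov = 1.
mean, var = condition_bivariate(m1=0.0, m2=0.0, s11=2.0, s12=1.0, s22=1.0, z=3.0)
print(mean, var)
```

With zero covariance the observation carries no information about $x_1$, and the function returns the prior mean and variance unchanged.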
KALMAN FILTER ASSUMPTIONS
$x_t \sim N(m_x, S_x)$
$x_{t+1} = F x_t + g + v$
$z_{t+1} = H x_{t+1} + w$
$v \sim N(0, S_v)$ (dynamics noise), $w \sim N(0, S_w)$ (observation noise)
TWO STEPS
Maintain $m_t$, $S_t$, the parameters of the Gaussian distribution over state $x_t$
Predict: compute the distribution of $x_{t+1}$ using the dynamics model alone
Update (observe $z_{t+1}$): compute $P(x_{t+1} \mid z_{t+1})$ with Bayes' rule
TWO STEPS
Maintain $m_t$, $S_t$, the parameters of the Gaussian distribution over state $x_t$
Predict: compute the distribution of $x_{t+1}$ using the dynamics model alone
  $x_{t+1} \sim N(F m_t + g, \; F S_t F^T + S_v)$
  Let these be $N(m', S')$
Update: compute $P(x_{t+1} \mid z_{t+1})$ with Bayes' rule
  Parameters of the final distribution, $m_{t+1}$ and $S_{t+1}$, are derived using the conditional distribution formulas
DERIVING THE UPDATE RULE
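(1) Unknowns a, B, C in the joint distribution:
$\begin{bmatrix} x_t \\ z_t \end{bmatrix} \sim N\left(\begin{bmatrix} m' \\ a \end{bmatrix}, \begin{bmatrix} S' & B \\ B^T & C \end{bmatrix}\right)$
(2) Assumption: $x_t \sim N(m', S')$
(3) Assumption: $z_t \mid x_t \sim N(H x_t, S_W)$
(4) Conditioning (1): $z_t \mid x_t \sim N(a + B^T (S')^{-1} (x_t - m'), \; C - B^T (S')^{-1} B)$
(5) Set mean (4)=(3): $H x_t = a + B^T (S')^{-1} (x_t - m') \Rightarrow a = H m', \; B^T = H S'$
(6) Set cov. (4)=(3): $C - B^T (S')^{-1} B = S_W \Rightarrow C = H S' H^T + S_W$
(7) Conditioning (1): $x_t \mid z_t \sim N(m' + B C^{-1} (z_t - a), \; S' - B C^{-1} B^T)$
(8,9) Kalman filter:
$m_t = m' + S' H^T C^{-1} (z_t - H m')$
$S_t = S' - S' H^T C^{-1} H S'$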
PUTTING IT TOGETHER
Transition matrix F, dynamics noise covariance Sx
Observation matrix H, observation noise covariance Sz
$m_{t+1} = F m_t + K_{t+1} (z_{t+1} - H F m_t)$
$S_{t+1} = (I - K_{t+1} H)(F S_t F^T + S_x)$
where
$K_{t+1} = (F S_t F^T + S_x) H^T (H (F S_t F^T + S_x) H^T + S_z)^{-1}$
Got that memorized?
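Rather than memorizing the formulas, it may help to see one predict + update step written out. The sketch below makes several simplifying assumptions that are not from the slides: the observation is a single scalar (so the matrix inverse becomes a division), there is no constant offset g, matrices are plain lists of rows, and all noise values in the usage lines are invented. The covariance update is written as $(I - K H) S'$.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def kalman_step(m, S, z, F, Sx, H, Sz):
    """One predict + update step for a scalar observation z.

    m: state mean (list), S: state covariance, F: transition matrix,
    Sx: dynamics noise covariance, H: observation row vector (list),
    Sz: observation noise variance (float).
    """
    n = len(m)
    # Predict: m' = F m, S' = F S F^T + Sx
    m_pred = [sum(F[i][j] * m[j] for j in range(n)) for i in range(n)]
    S_pred = mat_add(mat_mul(mat_mul(F, S), transpose(F)), Sx)
    # C = H S' H^T + Sz is a scalar because z is a scalar
    SHt = [sum(S_pred[i][j] * H[j] for j in range(n)) for i in range(n)]
    C = sum(H[i] * SHt[i] for i in range(n)) + Sz
    # Kalman gain K = S' H^T C^{-1}
    K = [v / C for v in SHt]
    # Update: m = m' + K (z - H m'), S = (I - K H) S'
    residual = z - sum(H[j] * m_pred[j] for j in range(n))
    m_new = [m_pred[i] + K[i] * residual for i in range(n)]
    HS = [sum(H[k] * S_pred[k][j] for k in range(n)) for j in range(n)]
    S_new = [[S_pred[i][j] - K[i] * HS[j] for j in range(n)] for i in range(n)]
    return m_new, S_new


# Moving 1D point with state (position, velocity); only position is observed.
h = 1.0
F = [[1.0, h], [0.0, 1.0]]            # position += h * velocity
Sx = [[0.01, 0.0], [0.0, 0.01]]       # assumed dynamics noise
H = [1.0, 0.0]
Sz = 0.25                             # assumed observation noise variance
m, S = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]
for z in [1.1, 1.9, 3.2, 4.0]:        # made-up position measurements
    m, S = kalman_step(m, S, z, F, Sx, H, Sz)
print(m)
```

Although only positions are measured, the velocity estimate is corrected as well, because the predicted covariance couples position and velocity through F.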
PROPERTIES OF KALMAN FILTER
Optimal Bayesian estimate for linear Gaussian transition/observation models
Need estimates of covariance… model identification necessary
Extensions to nonlinear transition/observation models work as long as they aren’t too nonlinear
  Extended Kalman Filter
  Unscented Kalman Filter
[Figures: “Tracking the velocity of a braking obstacle” and “Learning that the road is slick”. Plot annotations: velocity initially uninformed; more distance measurements arrive; obstacle slows; braking begins; actual max deceleration; estimated max deceleration; stopping distance (95% confidence interval); braking initiated; gradual stop]
NON-GAUSSIAN DISTRIBUTIONS
Gaussian distributions are a “lump”
[Figure: Kalman filter estimate]
NON-GAUSSIAN DISTRIBUTIONS
Integrating continuous and discrete states
Splitting with a binary choice: “up” or “down”
EXAMPLE: FAILURE DETECTION
Consider a battery meter sensor
  Battery = true level of battery
  BMeter = sensor reading
Transient failures: send garbage at time t
Persistent failures: send garbage forever
Example readings with a transient failure: 5555500555…
Example readings with a persistent failure (sensor is broken): 5555500000…
DYNAMIC BAYESIAN NETWORK
[Diagram: Battery_{t-1} → Battery_t → BMeter_t]
BMeter_t ~ N(Battery_t, s)
(Think of this structure “unrolled” forever…)
Transient failure model: P(BMeter_t = 0 | Battery_t = 5) = 0.03
RESULTS ON TRANSIENT FAILURE
[Plot: E(Battery_t) over time, without and with the transient failure model; the point where the transient failure occurs is marked]
Meter reads 55555005555…
RESULTS ON PERSISTENT FAILURE
[Plot: E(Battery_t) over time with the transient model; the point where the persistent failure occurs is marked]
Meter reads 5555500000…
PERSISTENT FAILURE MODEL
[Diagram: Battery_{t-1} → Battery_t → BMeter_t; Broken_{t-1} → Broken_t → BMeter_t]
BMeter_t ~ N(Battery_t, s)
P(BMeter_t = 0 | Battery_t = 5) = 0.03
P(BMeter_t = 0 | Broken_t) = 1
Example of a Dynamic Bayesian Network (DBN)
RESULTS ON PERSISTENT FAILURE
[Plot: E(Battery_t) over time with the transient model and with the persistent failure model; the point where the persistent failure occurs is marked]
Meter reads 5555500000…
HOW TO PERFORM INFERENCE ON DBN?
Exact inference on “unrolled” BN
  Variable Elimination: eliminate old time steps
  After a few time steps, all variables in the state space become dependent!
  Lost sparsity structure
Approximate inference
  Particle Filtering
PARTICLE FILTERING (AKA SEQUENTIAL MONTE CARLO)
Represent distributions as a set of particles
Applicable to non-Gaussian, high-dimensional distributions
Convenient implementations
Widely used in vision, robotics
PARTICLE REPRESENTATION
$Bel(x_t) = \{(w_k, x_k)\}$
$w_k$ are weights, $x_k$ are state hypotheses
Weights sum to 1
Approximates the underlying distribution
Weighted resampling step
PARTICLE FILTERING
Represent a distribution at time t as a set of N “particles” $S_t^1, \ldots, S_t^N$
Repeat for t = 0, 1, 2, …
  Sample $S[i]$ from $P(X_{t+1} \mid X_t = S_t^i)$ for all i
  Compute weight $w[i] = P(e \mid X_{t+1} = S[i])$ for all i
  Sample $S_{t+1}^i$ from $S[\cdot]$ according to weights $w[\cdot]$
BATTERY EXAMPLE
[Diagram repeated on each step: the DBN over Battery_{t-1}, Battery_t, Broken_{t-1}, Broken_t, and BMeter_t]
Sampling step
Suppose we now observe BMeter = 0
  P(BMeter=0 | sample) = ? 0.03 for samples where the meter is not broken, 1 for samples where it is broken
Compute weights (drawn as particle size)
Weighted resampling
Sampling step
Now observe BMeter_t = 5
Compute weights (slide shows 1 and 0)
Weighted resample
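The sample, weight, resample loop from the battery example can be sketched as below. Only the two probabilities P(BMeter=0 | not broken) = 0.03 and P(BMeter=0 | broken) = 1 come from the slides. The battery drift, the per-step failure rate of 0.01, the sensor std of 0.5, the initial particle spread, and all function names are assumptions chosen to keep the example small.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def sample_transition(particle, rng):
    """Sample (Battery_t, Broken_t) given the previous particle."""
    battery, broken = particle
    battery = battery + rng.gauss(0.0, 0.1)     # assumed slow drift
    broken = broken or rng.random() < 0.01      # assumed failure rate; persistent
    return battery, broken


def observation_weight(reading, particle):
    """w = P(BMeter_t = reading | particle)."""
    battery, broken = particle
    if broken:
        return 1.0 if reading == 0 else 0.0     # a broken meter always reads 0
    if reading == 0:
        return 0.03                             # transient failure
    return 0.97 * gaussian_pdf(reading, battery, 0.5)   # assumed sensor std


def particle_filter_step(particles, reading, rng):
    proposed = [sample_transition(p, rng) for p in particles]
    weights = [observation_weight(reading, p) for p in proposed]
    if sum(weights) == 0.0:                     # every hypothesis ruled out
        weights = [1.0] * len(proposed)
    # Weighted resampling with replacement
    return rng.choices(proposed, weights=weights, k=len(proposed))


def run(readings, n=2000, seed=0):
    rng = random.Random(seed)
    particles = [(rng.gauss(5.0, 0.5), False) for _ in range(n)]
    for reading in readings:
        particles = particle_filter_step(particles, reading, rng)
    p_broken = sum(broken for _, broken in particles) / n
    mean_battery = sum(battery for battery, _ in particles) / n
    return p_broken, mean_battery


p_persistent, battery_persistent = run([5, 5, 5, 5, 5, 0, 0, 0, 0, 0])
p_transient, battery_transient = run([5, 5, 5, 5, 5, 0, 0, 5, 5, 5])
print(p_persistent, p_transient)
```

In the persistent case the broken hypotheses take over within a few zero readings, while in the transient case the first reading of 5 gives every broken particle zero weight, so the battery estimate stays near 5 in both runs.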
APPLICATIONS OF PARTICLE FILTERING IN ROBOTICS
Simultaneous Localization and Mapping (SLAM)
Observations: laser rangefinder
State variables: position, walls
SIMULTANEOUS LOCALIZATION AND MAPPING (SLAM)
Mobile robots
Odometry
  Locally accurate
  Drifts significantly over time
Vision/ladar/sonar
  Inaccurate locally
  Global reference frame
Combine the two
  State: (robot pose, map)
  Observations: (sensor input)
GENERAL PROBLEM
$x_t \sim Bel(x_t)$ (arbitrary p.d.f.)
$x_{t+1} = f(x_t, u, e_p)$
$z_{t+1} = g(x_{t+1}, e_o)$
$e_p \sim$ arbitrary p.d.f. (process noise), $e_o \sim$ arbitrary p.d.f. (observation noise)
SAMPLING IMPORTANCE RESAMPLING (SIR) VARIANT
Predict
Update
Resample
ADVANCED FILTERING TOPICS
Mixing exact and approximate representations (e.g., mixture models)
Multiple hypothesis tracking (assignment problem)
Model calibration
Scaling up (e.g., 3D SLAM, huge maps)
NEXT TIME
Putting it together: intelligent agents
Read R&N 2