The Chopthin Algorithm for Resampling
Din-Houn Lau
Joint work with Axel Gandy
Imperial College London
Sir David Cox Celebration Event, 7th December 2016
Imperial College London Din-Houn Lau The Chopthin Algorithm for Resampling 1
Particle Filter: Example: Hidden Markov Model

[Figure: hidden Markov model with observations Y1, Y2, Y3 above hidden states X1, X2, X3; particle clouds over x1 and x2 illustrate π(x1 | y1) ≈ weighted particles, reweighted by the likelihood g(y1 | x1^(i)), propagated through the transition f(x2 | x1), giving π(x2 | y1, y2) ≈ weighted particles.]

particle system: {x_t^(i), w_t^(i)}_{i=1}^N

resample step: necessary → avoids weight degeneracy.
(The same figure is built up over repeated slides titled: Initial Guesses, Update Weights using Likelihood, Resample, Transition, Update Weight, Resample.)
Available Resampling Methods
- Multinomial: sample with replacement using weights.
- Residual (Liu and Chen, 1998)
- Branching (Bain and Crisan, 2009, p. 278)
- Systematic (Carpenter et al., 1999)
- Stratified (Kitagawa, 1996)

General consensus: it is possible to outperform multinomial resampling; the other resamplers are comparable in terms of their performance in particle filters. (Douc and Cappe, 2005; Hol et al., 2006)
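As a concrete reference point for the schemes above, here is a minimal sketch of systematic resampling (Carpenter et al., 1999), which the chopthin slides later reuse; the function name and count-based interface are my own, not from the slides.

```python
import numpy as np

def systematic_resample(w, rng):
    """Systematic resampling: one uniform offset and N evenly spaced
    points are matched against the cumulative normalised weights;
    returns the offspring count N^i of each particle."""
    w = np.asarray(w, dtype=float)
    N = len(w)
    points = (rng.uniform() + np.arange(N)) / N   # N points in [0, 1)
    cum = np.cumsum(w) / w.sum()
    cum[-1] = 1.0                                 # guard against float drift
    return np.bincount(np.searchsorted(cum, points), minlength=N)

counts = systematic_resample([0.1, 0.2, 0.3, 0.4], np.random.default_rng(42))
```

Because the points are evenly spaced, each count N^i is either floor(N w^(i)) or ceil(N w^(i)), and exactly N offspring are always returned.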
Resampling: Common Constraints

Let N^i = number of offspring from particle i and G a σ-field generated by the particle system {x^(i), w^(i)}_{i=1}^N with ∑_{i=1}^N w^(i) = 1.

1. ∑_{i=1}^N N^i = N ←− returns N particles
2. E(N^i | G) = N w^(i) ←− unbiasedness
3. Equal weights after resampling = 1/N.

Estimator of interest after resampling:

    E( ∑_{i=1}^N (N^i / N) f(x^(i)) | G ) = ∑_{i=1}^N w^(i) f(x^(i))

Objective: construct a resampler that makes

    Var( ∑_{i=1}^N (N^i / N) f(x^(i)) | G )

small for any function of interest f.

Chopthin resampling: returns unequal weights.
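Constraints 1 and 2 are easy to check empirically for multinomial resampling, where the offspring counts are exactly Multinomial(N, w). A small numpy check (my illustration, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5
w = np.array([0.05, 0.10, 0.15, 0.30, 0.40])   # normalised weights

# Multinomial resampling draws offspring counts ~ Multinomial(N, w),
# so constraint 1 (sum_i N^i = N) holds by construction and
# constraint 2 (E(N^i | G) = N w^(i)) holds exactly.
reps = 200_000
counts = rng.multinomial(N, w, size=reps)       # reps x N offspring counts

print((counts.sum(axis=1) == N).all())          # constraint 1: always N offspring
print(counts.mean(axis=0))                      # ~ N*w = [0.25, 0.5, 0.75, 1.5, 2.0]
```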
Motivation for Chopthin

- Resampling introduces variance, but it is necessary. . .
- Particle filter: resampling is not performed at every iteration, e.g.

    if ESS ≤ β: resample
    if ESS > β: no resampling

  where ESS = (∑_{i=1}^n w^(i))² / ∑_{i=1}^n (w^(i))², typically β = 0.5N.
- Can some "in-between" resampling be constructed?
- For particle filters to work one essentially needs to ensure that weights
  • do not get too large and
  • do not get too small.
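The ESS rule above is a one-liner in practice; a minimal sketch (the function name `ess` and the toy weights are mine):

```python
import numpy as np

def ess(w):
    """Effective sample size: (sum w)^2 / sum w^2.
    Equals N for equal weights and 1 when a single particle
    carries all the weight."""
    w = np.asarray(w, dtype=float)
    return w.sum() ** 2 / np.sum(w ** 2)

N = 1000
beta = 0.5 * N                       # the slide's typical threshold
rng = np.random.default_rng(1)
w = rng.exponential(size=N)          # toy unnormalised weights

resample_now = ess(w) <= beta        # resample only on weight degeneracy
```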
Aims for the Chopthin Algorithm

1. ∑_{i=1}^N N^i = N.
2. Unbiasedness: E[ N^i w^(i)_new | G ] = w^(i)_old ∀ i
3. Equal weights after resampling = 1/N.

★ Bound ratio of weights: w^(i)_new / w^(j)_new ≤ η ∀ i, j
★ Conserve total weight: ∑_{i=1}^N w^(i)_old = ∑_{i=1}^N w^(i)_new

Also: efficient implementation.
Chopthin

- New weights: w^(i)_new ∈ [a, ηa] ⟹ ratio bounded ✓ (a > 0 to be determined)
- Weights below a get thinned: 0 or 1 offspring with weight a.
- Weights above a get chopped: replicated, with the weight divided among the offspring.
- To ensure that N particles are returned (in expectation), we need to find a such that

    ∑_{i=1}^N E(N^i | G) := ∑_{i=1}^N h_a^η(w^(i)) = N

- Need to choose h_a^η such that all the aims are satisfied.
Chopthin: Key Steps

1. Find a: solve ∑_{i=1}^N h_a^η(w^(i)) = N.
2. Thinning step: offspring → systematic resampling on small weights (on h_a^η(w_i) s.t. w_i < a).
3. Chopping step: offspring → systematic resampling on large weights (on the fractional parts of h_a^η(w_i) s.t. w_i ≥ a).

Systematic resampling ensures exactly N particles are returned ⟹ ∑_{i=1}^N N^i = N ✓
Our Choice of h_a^η

h_a^η(w) =  w/a       if w < a
            1         if a ≤ w < ηa/2
            2w/(ηa)   if w ≥ ηa/2

[Figure: plot of h_a^η(w) against the weight w, rising from 0 to 1 on [0, a] (THIN), flat at 1 on [a, ηa/2] (Nothing), then rising as 2w/(ηa) beyond ηa/2 (CHOP).]

Choice: ∑_{i=1}^N h_a^η(w^(i)) = N can be solved for a efficiently (expected linear effort).
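In code, h_a^η and the root-find for a can be sketched as follows. The paper achieves expected linear effort with a specialised search; this sketch uses plain bisection instead (my simplification), relying on a ↦ ∑ h_a^η(w_i) being continuous and non-increasing in a.

```python
import numpy as np

def h(w, a, eta):
    """The slide's h_a^eta(w): expected offspring count for weight w."""
    w = np.asarray(w, dtype=float)
    return np.where(w < a, w / a,
                    np.where(w < eta * a / 2, 1.0, 2 * w / (eta * a)))

def find_a(w, eta):
    """Solve sum_i h_a^eta(w_i) = N for a by bisection (the paper's
    algorithm reaches the same root in expected linear effort)."""
    w = np.asarray(w, dtype=float)
    N = len(w)
    lo = w[w > 0].min() * 1e-12      # sum h -> +inf as a -> 0
    hi = w.sum()                     # sum h <= N once a >= sum(w), for eta > 2
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if h(w, mid, eta).sum() > N:
            lo = mid
        else:
            hi = mid
    return hi

rng = np.random.default_rng(2)
w = rng.exponential(size=1000)
a = find_a(w, eta=4.0)
# expected offspring counts now sum to N, as the chop/thin steps require
```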
Example

- Model: for t = 1, . . . , T = 1000,

    X_t ~ N(X_{t−1}, 1), with X_0 ~ N(0, 1)  (hidden)
    Y_t ~ N(X_t, σ_Y²)  (observed)

- Targets: X_t | Y_1, . . . , Y_t.

Algorithm: Bootstrap Particle Filter

    Sample x^(1), . . . , x^(N) ~ N(0, 1); let w^(1) = · · · = w^(N) = 1/N
    for t = 1, . . . , T do
        Sample x^(i) ~ N(x^(i), 1), i = 1, . . . , N
        w^(i) ← w^(i) φ((y_t − x^(i)) / σ_Y), i = 1, . . . , N
        (potentially) resample (x^(i), w^(i))

- The Kalman filter gives the exact posterior distribution in this case, but not in more general filtering problems.
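The pseudocode above translates almost line for line into numpy. A minimal sketch of the bootstrap filter for this model, using plain multinomial resampling at every step for simplicity (the constants T, N and σ_Y here are my choices for a quick run, not the slide's):

```python
import numpy as np

rng = np.random.default_rng(0)
T, N, sigma_Y = 200, 1000, 1.0

# Simulate the model: X_t ~ N(X_{t-1}, 1) with X_0 ~ N(0, 1); Y_t ~ N(X_t, sigma_Y^2)
x_true = rng.normal() + np.cumsum(rng.normal(size=T))
y = x_true + sigma_Y * rng.normal(size=T)

x = rng.normal(size=N)                 # initial particles ~ N(0, 1)
w = np.full(N, 1.0 / N)
post_mean = np.empty(T)
for t in range(T):
    x = x + rng.normal(size=N)                  # transition f(x_t | x_{t-1})
    logg = -0.5 * ((y[t] - x) / sigma_Y) ** 2   # Gaussian log-likelihood, up to a constant
    w = w * np.exp(logg - logg.max())           # numerically stable weight update
    w /= w.sum()
    post_mean[t] = np.sum(w * x)                # estimate of E(X_t | y_1, ..., y_t)
    idx = rng.choice(N, size=N, p=w)            # multinomial resample
    x, w = x[idx], np.full(N, 1.0 / N)
```

The resampling line is exactly where chopthin or any of the other schemes from the comparison would be swapped in (for chopthin, keeping the returned unequal weights instead of resetting to 1/N).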
Simulation Results

- 1000 repetitions; MSE of the posterior mean divided by the MSE using systematic resampling.

                                     N = 10³                  N = 10⁴
method               β     η    σY:  1/3    1     3     9     1/3    1     3     9
chopthin             N     3+√8      1.00  0.89  0.86  0.87   0.95  0.90  0.89  0.90
chopthin             0.5N  3+√8      0.97  0.98  0.96  0.94   0.96  0.98  0.96  0.94
chopthin             N     4         0.98  0.90  0.89  0.92   0.97  0.91  0.90  0.94
chopthin             N     10        0.98  0.91  0.87  0.86   0.94  0.93  0.86  0.85
multinomial          0.5N  -         0.99  1.04  1.15  1.24   0.98  1.04  1.15  1.22
branching            0.5N  -         1.00  0.99  1.00  1.01   0.95  1.01  1.00  1.00
residual             0.5N  -         0.99  1.00  1.00  1.01   1.00  1.00  1.00  1.01
stratified           0.5N  -         0.99  1.01  1.02  1.04   0.96  1.02  1.01  1.01
residual-stratified  0.5N  -         0.97  1.00  1.01  1.01   0.96  1.03  1.00  1.05
systematic           0.5N  -         1.00  1.00  1.00  1.00   1.00  1.00  1.00  1.00

- Chopthin gives consistently lower MSE for the posterior mean.
- For η = 3 + √8 and η = 10, the ESS stays above a theoretical lower bound of 0.5N and 0.33N, respectively.
What about the Variance

Var( N^i w^(i)_new | G ) = variance of the total weight of the offspring from particle i.

[Figure: variance of the total weight of offspring plotted against the weight, for multinomial, residual and systematic resampling; a second build of the slide adds the chopthin curve.]
Summary

- Chopthin is a new resampler that returns unequal weights whose ratio is bounded.
- Chopthin can be performed at every resampling step in a particle filter.
- Simulations: chopthin consistently outperforms other resamplers.
- Chopthin can be implemented efficiently. Implementations for C++, R (on CRAN), Python and Matlab are available.
- Next: a central limit type theorem (as N → ∞).
References

Bain, A. and D. Crisan (2009). Fundamentals of Stochastic Filtering. Springer.

Carpenter, J., P. Clifford, and P. Fearnhead (1999). Improved particle filter for nonlinear problems. IEE Proceedings - Radar, Sonar and Navigation 146, 2–7.

Douc, R. and O. Cappe (2005). Comparison of resampling schemes for particle filtering. In Proceedings of the 4th International Symposium on Image and Signal Processing and Analysis, pp. 64–69. IEEE.

Gandy, A. and F. D.-H. Lau (2016). The chopthin algorithm for resampling. IEEE Transactions on Signal Processing 64(16), 4273–4281.

Hol, J. D., T. B. Schon, and F. Gustafsson (2006). On resampling algorithms for particle filters. In Nonlinear Statistical Signal Processing Workshop, 2006 IEEE, pp. 79–82.

Kitagawa, G. (1996). Monte Carlo filter and smoother for non-Gaussian nonlinear state space models. Journal of Computational and Graphical Statistics 5(1), 1–25.

Liu, J. S. and R. Chen (1998). Sequential Monte Carlo methods for dynamic systems. Journal of the American Statistical Association 93(443), 1032–1044.
Chopthin Variance Explained: Illustration

[Figure: particle weights before and after chopthin, on a weight axis marked 0, a, ηa/2 and ηa.]