mcmc diagnostics - simon fraser university

MCMC Diagnostics: Multiple chain

© Dave Campbell 2009
Friday, June 12, 2009

hist(beta,100)

The distribution of β, the probability of getting cancer without getting vaccinated.


Correction:

In the raftery Matlab function, r is the interval width (precision).

However, in this Matlab software r is a relative width, not an absolute one.

r = .005 is a ½% precision, not an absolute interval width as I suggested last class.


Gelman-Rubin

Multiple chain method


[Figure: ten trace-plot panels, each showing x[k, ] against Index from 0 to 20000.]

Multiple Chain Convergence Diagnostics: Gelman-Rubin method

If we start our Markov chain from m different places, do they all converge to the same place?


Today we have a series of independent (and separate) MCMC runs j = 1, 2, …, m, where for us m = 10. This is the usual value when using the Gelman-Rubin diagnostic.

The sampled values are:

$$\beta_j^{(i)}, \quad i = 1, 2, \ldots, n$$


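% Seed the generator from the clock (newer Matlab releases use RandStream.setGlobalStream)
RandStream.setDefaultStream(RandStream('mt19937ar','seed',sum(clock)))
niter=10000;
y=36; N=5766;
stepvar=.004;
m=10;
betas=zeros(niter,m);        % one column per chain
betas(1,:)=(1:10)/11;        % spread-out starting values
for j=1:m
    iter=1;
    % log of the unnormalized target at the current state
    log_alpha_bot=(y*log(betas(iter,j))+(N-y)*log(1-betas(iter,j))+...
        log(2-2*betas(iter,j)));
    for iter=2:niter
        % uniform random-walk proposal centred on the current state
        X=unifrnd(betas(iter-1,j)-stepvar,betas(iter-1,j)+stepvar);
        log_alpha_top= y*log(X)+(N-y)*log(1-X)+log(2-2*X);
        % reject proposals outside (0,1), where the target is zero and log() is not real
        if(X>0 && X<1 && rand<exp(log_alpha_top-log_alpha_bot))
            betas(iter,j)=X;
            log_alpha_bot=log_alpha_top;
        else
            betas(iter,j)=betas(iter-1,j);
        end
    end
end
plot(betas)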

Note: I’m using the log of the acceptance ratio. It’s numerically more stable. You should always do this too.



Note: I’m filling down columns, since that is the way Matlab indexes a matrix (column by column); it’s faster to do it this way.
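To see why the log scale matters for this particular target, here is a small sketch comparing the acceptance ratio computed on the raw scale and on the log scale. The states b_old and b_new are hypothetical values chosen for illustration, picked far from the mode as some of the starting values are; the target is the same unnormalized expression used in the sampler.

```matlab
% Same unnormalized target as in the sampler: beta^y * (1-beta)^(N-y) * (2-2*beta)
y = 36; N = 5766;
post     = @(b) b.^y .* (1 - b).^(N - y) .* (2 - 2*b);          % raw scale
log_post = @(b) y*log(b) + (N - y)*log(1 - b) + log(2 - 2*b);   % log scale

b_old = 0.50;   % hypothetical current state, far from the mode
b_new = 0.49;   % hypothetical proposal, slightly closer to the mode

% Raw scale: both densities underflow to 0 in double precision, so the ratio is 0/0
ratio_raw = post(b_new) / post(b_old);

% Log scale: the difference of logs is an ordinary finite number
ratio_log = exp(log_post(b_new) - log_post(b_old));

disp(ratio_raw)   % NaN
disp(ratio_log)   % a large finite number, so the move would be accepted
```

Since rand < NaN is always false, a raw-scale sampler started at such a point would reject every proposal and never move.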


plot(βj)

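Multiple Chain Convergence Diagnostics: Gelman-Rubin method

Run MCMC m times
Discard a bunch for burn-in
With what is left, compute:

Average within-chain variance:

$$W = \frac{1}{m} \sum_{j=1}^{m} \left[ \frac{1}{n-1} \sum_{i=1}^{n} \left( \beta_j^{(i)} - \bar{\beta}_j \right)^2 \right]$$

Between-chain variance:

$$B = \frac{n}{m-1} \sum_{j=1}^{m} \left( \bar{\beta}_j - \bar{\beta} \right)^2$$

where the mean of chain j is

$$\bar{\beta}_j = \frac{1}{n} \sum_{i=1}^{n} \beta_j^{(i)}$$

and $\bar{\beta}$ is the mean of the m chain means.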


The total estimated variance:

$$\widehat{\mathrm{Var}}(\beta) = \left( 1 - \frac{1}{n} \right) W + \frac{1}{n} B$$

And the Gelman-Rubin statistic:

$$R = \frac{\widehat{\mathrm{Var}}(\beta)}{W}$$

R should be close to 1 when all is working well

R > 1.05 suggests possible problems
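The formulas for W, B, the total variance and R can be sketched directly in Matlab. This is a minimal illustration under stated assumptions, not the toolbox implementation: the matrix chains is a simulated placeholder (n rows of draws, m columns of chains) standing in for post-burn-in sampler output, and the variable names are chosen here for readability.

```matlab
% Placeholder draws: n iterations (rows) from m chains (columns).
% In practice this would be the sampler output after discarding burn-in.
n = 5000; m = 10;
chains = 0.006 + 0.001*randn(n, m);

chain_means = mean(chains, 1);        % 1-by-m vector of per-chain means
grand_mean  = mean(chain_means);      % mean of the chain means

% Average within-chain variance (var with flag 0 divides by n-1)
W = mean(var(chains, 0, 1));

% Between-chain variance
B = n/(m - 1) * sum((chain_means - grand_mean).^2);

% Total estimated variance and the Gelman-Rubin statistic
V = (1 - 1/n)*W + B/n;
R = V / W;

disp(R)   % close to 1 here, because all placeholder chains share one distribution
```

Some references report the square root of this ratio instead, so values from other software may differ slightly from this sketch.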


Gelman-Rubin is a univariate diagnostic

A multivariate version exists (Brooks, S. and A. Gelman. 1998. General methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics 7: 434-55.)

We could also run one very long chain and divide it into 50 segments to perform Gelman-Rubin

A large Gelman-Rubin statistic might arise from slow mixing or multi-modality
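The single-long-chain variant can be sketched by reshaping one run into 50 consecutive segments and treating each segment as a chain. The vector long_chain below is a simulated placeholder, not real sampler output, and the segment length is simply whatever divides evenly; both are assumptions made for illustration.

```matlab
% Placeholder for one long MCMC run (replace with real draws after burn-in)
n_total = 100000;
long_chain = randn(n_total, 1);

m = 50;                          % number of segments
n = floor(n_total / m);          % draws per segment; any remainder is dropped

% reshape fills column by column, so column j holds the j-th consecutive block
segments = reshape(long_chain(1:n*m), n, m);

% Same quantities as for multiple chains, with segments playing the role of chains
W = mean(var(segments, 0, 1));           % average within-segment variance
B = n * var(mean(segments, 1));          % between-segment variance
V = (1 - 1/n)*W + B/n;
R = V / W;
disp(R)
```

With the toolbox on the path, csgelrub(segments') could be used in place of the last four computations, since that function expects m rows and n columns.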


Gelman-Rubin in Matlab

Download the Computational Statistics Toolbox from the authors of Martinez and Martinez (2008), ‘Computational Statistics Handbook with Matlab’, 2nd ed.

Their book is on reserve

Direct link to the software: http://www.pi-sigma.info/CompStatsToolboxV2.zip

Or visit their webpage: http://www.pi-sigma.info/CS2E.htm




Using the Computational Statistics Toolbox is as simple as making it visible to Matlab:

    addpath('/Volumes/iamdavecampbell/CompStatsToolboxV2')
    open '/Volumes/iamdavecampbell/CompStatsToolboxV2/Contents'

If you instead use

    open 'Contents'

you’ll open the contents file for Matlab.

Check which file you’re opening by using

    which 'Contents'


The Gelman-Rubin diagnostic function takes a matrix of m rows and n columns, so we pass the transpose of betas (which has n rows and m columns)

    >> size(betas)
    >> R=csgelrub(betas')
    R =
        1.0056


To show the diagnostic flagging a problem, let’s contrive multi-modality by flipping the sign of the first five chains

    >> betas(:,1:5)=-betas(:,1:5);
    >> R=csgelrub(betas')
    R =
        1.0711

Our result suggests multi-modality. Because the modes are close together, the value of R is not all that large, but it is above the 1.05 cutoff.

