Hastings paper discussion
DESCRIPTION
Talk given by Donia Skanji at the "Reading Classics" seminar at Paris-Dauphine.
TRANSCRIPT
Monte Carlo Sampling Methods Using Markov Chains and Their Applications
Hastings, University of Toronto
Reading seminar on classics: C. P. Robert
Presented by: Donia Skanji
December 3, 2012
Hastings-University of Toronto Reading Seminar:MCMC
Outline
1 Introduction
2 Monte Carlo Principle
3 Markov Chain Theory
4 MCMC
5 Conclusion
Introduction to MCMC Methods
Introduction:
Many numerical problems, such as computing integrals and evaluating maxima, arise in high-dimensional spaces.
Monte Carlo methods are often applied to solve such integration and optimisation problems.
Markov chain Monte Carlo (MCMC) is one of the best-known families of Monte Carlo methods.
MCMC methods comprise a large class of sampling algorithms that have had a great influence on the development of science.
Study objective
To expose some relevant theory and techniques of application related to MCMC methods.
To present a generalization of the Metropolis sampling method.
Next Steps
Monte Carlo Principle
Markov Chain
To introduce:
- MCMC Methods
- MCMC Algorithms
Monte Carlo Methods
Overview
The idea of Monte Carlo simulation is to draw an i.i.d. set of samples {x^(i)}_{i=1}^N from a target density π.
These N samples can be used to approximate the target density with the following empirical point-mass function:
π_N(x) = (1/N) Σ_{i=1}^N δ_{x^(i)}(x)
For independent samples, by the Law of Large Numbers, one can approximate the integrals I(f) with tractable sums I_N(f) that converge almost surely:
I_N(f) = (1/N) Σ_{i=1}^N f(x^(i)) → I(f) = ∫ f(x) π(x) dx  a.s.
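This convergence is easy to see numerically. The sketch below is my own illustration (not from the talk), taking π as the standard normal and f(x) = x², so that I(f) = Var(X) = 1:

```python
import random

def mc_estimate(f, sampler, n):
    """Plain Monte Carlo: average f over n i.i.d. draws from the target density."""
    return sum(f(sampler()) for _ in range(n)) / n

rng = random.Random(0)
# Target pi = N(0, 1), f(x) = x^2, so the integral I(f) is Var(X) = 1.
est = mc_estimate(lambda x: x * x, lambda: rng.gauss(0.0, 1.0), 100_000)
print(est)  # with N = 100,000 the estimate lands close to the true value 1
```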
[Figure: a cloud of N samples x^(1), …, x^(N) drawn from π]
But independent sampling from π may be difficult, especially in a high-dimensional space.
It turns out that (1/N) Σ_{i=1}^N f(x^(i)) → ∫ f(x) π(x) dx as N → ∞ still applies if we generate the samples using a Markov chain (dependent samples).
The idea of MCMC is to use Markov chain convergence properties to overcome the dimensionality problems met by regular Monte Carlo methods.
But first, some revision of Markov chains on a discrete set χ.
Markov Chain Theory
Definition
Finite Markov Chain
A Markov chain is a mathematical system that undergoes transitions from one state to another, among a finite or countable number of possible states. It is a random process usually characterized as memoryless:
P(X^(t+1) | X^(0), X^(1), …, X^(t)) = P(X^(t+1) | X^(t))
Transition Matrix
Let P = {P_ij} be the transition matrix of a Markov chain with states 0, 1, …, S. If X^(t) denotes the state occupied by the process at time t, we have:
Pr(X^(t+1) = j | X^(t) = i) = P_ij
and, writing X^(t) for the row vector of state probabilities at time t,
X^(t+1) = X^(t) · P
Properties
Stationarity
As t → ∞, the Markov chain converges to its stationary (invariant) distribution: π = π·P.
Irreducibility
Irreducible means that any state can be reached from any other state in a finite number of moves (for instance, p(i, j) > 0 for every i and j suffices).
MCMC
The idea of Markov chain Monte Carlo is to choose the transition matrix P so that π (the target density, which is very difficult to sample from directly) is its unique stationary distribution.
Assume the Markov chain:
- has a stationary distribution π(x)
- is irreducible and aperiodic
Then we have an ergodic theorem:
Theorem (Ergodic Theorem)
If the Markov chain (x_t) is irreducible, aperiodic and stationary, then for any function h with E|h| < ∞,
(1/N) Σ_{i=1}^N h(x_i) → ∫ h(x) dπ(x)  as N → ∞.
Summary
Recall that our goal is to build a Markov chain (X^(t)) using a transition matrix P so that the limiting distribution of (X^(t)) is the target density π, and integrals can then be approximated using the ergodic theorem.
Question
How do we construct a Markov chain whose stationary distribution is the target distribution π?
Metropolis et al. (1953) showed how.
The method was generalized by Hastings (1970).
Construction of the transition matrix
In order to construct a Markov chain with π as its stationary distribution, we consider a transition matrix P that satisfies the reversibility condition: for all i and j,
π_i p(i → j) = π_j p(j → i), i.e. π_i p_ij = π_j p_ji
This property ensures that Σ_i π_i p_ij = π_j (the definition of a stationary distribution), and hence that π is a stationary distribution of P.
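As a quick numerical sanity check of this implication (my own toy example, with an arbitrary 3-state π and a hand-built reversible P):

```python
# Toy 3-state target and a transition matrix satisfying detailed balance.
pi = [0.2, 0.3, 0.5]
P = [[0.40, 0.30, 0.30],
     [0.20, 0.60, 0.20],
     [0.12, 0.12, 0.76]]

# Reversibility: pi_i P_ij = pi_j P_ji for all i, j.
assert all(abs(pi[i] * P[i][j] - pi[j] * P[j][i]) < 1e-9
           for i in range(3) for j in range(3))

# Hence pi is stationary: (pi P)_j = sum_i pi_i P_ij = pi_j.
piP = [sum(pi[i] * P[i][j] for i in range(3)) for j in range(3)]
assert all(abs(piP[j] - pi[j]) < 1e-9 for j in range(3))
```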
How do we choose the transition matrix P so that the reversibility condition π_i P_ij = π_j P_ji is verified?
Overview
Suppose that we have a proposal matrix, denoted Q, where Σ_j q_ij = 1.
If it happens that Q itself satisfies the reversibility condition π_i q_ij = π_j q_ji for all i and j, then our search is over; but most likely it will not.
We might find, for example, that for some i and j: π_i q_ij > π_j q_ji.
A convenient way to correct this is to reduce the number of moves from i to j by introducing a probability α_ij that the move is made.
The choice of the transition matrix
We assume that the transition matrix P has the form:
P_ij = q_ij α_ij             if i ≠ j
P_ii = 1 − Σ_{j≠i} P_ij
where:
- Q = {q_ij} is the proposal matrix (or jumping matrix) of an arbitrary Markov chain on the states 0, 1, …, S, which suggests a new sample value j given a sample value i;
- α_ij is the acceptance probability of the move from state i to state j.
In order to obtain the reversibility condition, we have to verify:
π_i p_ij = π_j p_ji, i.e. π_i α_ij q_ij = π_j α_ji q_ji   (∗)
The probabilities α_ij and α_ji are introduced to ensure that the two sides of (∗) are in balance.
In his paper, Hastings defined a generic form of the acceptance probability:
α_ij = s_ij / (1 + (π_i q_ij)/(π_j q_ji))
where s_ij is a symmetric function of i and j (s_ij = s_ji), chosen so that 0 ≤ α_ij ≤ 1 for all i and j.
With this form of P_ij and α_ij suggested by Hastings, the reversibility condition is readily verified.
The choice of α
Recall that in this paper, Hastings defined the acceptance probability α_ij as follows:
α_ij = s_ij / (1 + (π_i q_ij)/(π_j q_ji))
For specific choices of s_ij, we recognize the acceptance probabilities suggested by both Metropolis et al. (1953) and Barker (1965).
The choice of s_ij
Two choices of s_ij are given, for all i and j, by:
s_ij^(M) = 1 + (π_i q_ij)/(π_j q_ji)   if (π_j q_ji)/(π_i q_ij) ≥ 1
s_ij^(M) = 1 + (π_j q_ji)/(π_i q_ij)   if (π_j q_ji)/(π_i q_ij) ≤ 1
When q_ij = q_ji and s_ij = s_ij^(M), we have the method devised by Metropolis et al., with α_ij^(M) = min(1, π_j/π_i).
When q_ij = q_ji and s_ij = s_ij^(B) = 1, we have the method devised by Barker, with α_ij^(B) = π_j/(π_i + π_j).
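These reductions can be checked mechanically. The snippet below is my own verification (with arbitrary numbers, not from the talk) that plugging each s_ij into Hastings's generic formula recovers the Metropolis and Barker forms for a symmetric proposal:

```python
def alpha_generic(s, pi_i, pi_j, q_ij, q_ji):
    """Hastings's generic form: alpha_ij = s_ij / (1 + pi_i q_ij / (pi_j q_ji))."""
    return s / (1.0 + (pi_i * q_ij) / (pi_j * q_ji))

def s_metropolis(pi_i, pi_j, q_ij, q_ji):
    """The Metropolis choice s_ij^(M), by cases on pi_j q_ji / (pi_i q_ij)."""
    r = (pi_j * q_ji) / (pi_i * q_ij)
    return 1.0 + 1.0 / r if r >= 1.0 else 1.0 + r

q = 0.5  # symmetric proposal: q_ij = q_ji
for pi_i, pi_j in [(0.2, 0.5), (0.5, 0.2)]:
    a_M = alpha_generic(s_metropolis(pi_i, pi_j, q, q), pi_i, pi_j, q, q)
    a_B = alpha_generic(1.0, pi_i, pi_j, q, q)  # Barker: s_ij = 1
    assert abs(a_M - min(1.0, pi_j / pi_i)) < 1e-12   # Metropolis form
    assert abs(a_B - pi_j / (pi_i + pi_j)) < 1e-12    # Barker form
```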
Remark
In this paper, Hastings mentioned that little is known about the merits of these two choices, s_ij^(M) and s_ij^(B).
The choice of Q
It has been recognised that the choice of the proposal matrix/density is crucial to the success (rapid convergence) of an MCMC algorithm.
The proposal matrix can be almost arbitrary; a good choice allows the chain to reach all states frequently and assures a high acceptance rate.
Algorithm
1 First, pick a proposal matrix Q(i, j) of an arbitrary Markov chain on the states 0, 1, …, S, which suggests a new sample value j given a sample value i.
2 Also, start with some arbitrary point i_0 as the first sample.
3 Then, to return a new sample j given the most recent sample i, proceed as follows:
4 Generate a proposed new sample value j from the jumping distribution Q(i → j).
5 Accept the proposal with probability α(i → j):
- if the proposal is accepted, move to j; otherwise stay at i
- repeat from step 4 until a sample of the desired size is obtained
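The steps above can be sketched as code. This is my own minimal sketch for a finite state space, using the Metropolis choice of s_ij (so α = min(1, π_j q_ji / (π_i q_ij))); pi and Q are assumed given as plain lists:

```python
import random

def metropolis_hastings(pi, Q, i0, n_samples, seed=0):
    """Run the chain: propose j ~ Q[i], accept with probability alpha(i -> j)."""
    rng = random.Random(seed)
    states = range(len(pi))
    chain, i = [i0], i0
    for _ in range(n_samples - 1):
        # Step 4: propose j from the jumping distribution Q(i -> .)
        j = rng.choices(states, weights=Q[i])[0]
        # Step 5: Metropolis-Hastings acceptance probability
        alpha = min(1.0, (pi[j] * Q[j][i]) / (pi[i] * Q[i][j]))
        if rng.random() <= alpha:
            i = j                 # accepted: move to j
        chain.append(i)           # rejected: stay at i
    return chain

# Usage: uniform proposal over 3 states; empirical frequencies approach pi.
chain = metropolis_hastings([0.2, 0.3, 0.5], [[1 / 3] * 3] * 3, 0, 50_000)
```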
Remarks
An empirical way of checking convergence is to run two or more different chains in parallel and see whether they concentrate on the same place.
The calculation of α does not require knowledge of the normalizing constant of π, because it appears in both the numerator and the denominator.
Although the Markov chain eventually converges to the desired distribution, the initial samples may follow a very different distribution, especially if the starting point is in a region of low density. As a result, a burn-in period is typically necessary.
Example: Poisson Distribution as the Target Distribution
Consider π as the Poisson distribution with intensity λ > 0:
π_i = e^(−λ) λ^i / i!,  where i = 0, 1, 2, …
Hastings (1970) suggests the following proposal transition matrix:
q_00 = q_01 = 1/2
q_ij = 1/2 if j = i − 1
q_ij = 1/2 if j = i + 1
q_ij = 0 otherwise
so that
Q =
( 1/2  1/2   0    0   … )
( 1/2   0   1/2   0   … )
(  0   1/2   0   1/2  … )
(  0    0   1/2   0   … )
(  ⋮    ⋮    ⋮    ⋮     )
Q is in fact symmetric, and the algorithm reduces to that of Metropolis.
p_ij = q_ij α_ij^(M) =
  (1/2) min(1, i/λ)          if j = i − 1
  (1/2) min(1, λ/(i + 1))    if j = i + 1
  1 − p_{i,i−1} − p_{i,i+1}  if j = i
  0 otherwise
For i = 0:
p_01 = (1/2) min(1, λ)
p_00 = 1 − (1/2) min(1, λ)
p_0j = 0 otherwise
This transition probability is aperiodic and irreducible.
In practice, if λ is small, this choice of Q seems to work fairly well and approximates π quickly.
Algorithm
Given a starting point i, we take:
j = i + 1 with probability 1/2, or j = i − 1 with probability 1/2,
i.e. q_ij = (1/2) δ_{i−1}(j) + (1/2) δ_{i+1}(j)
We calculate the Metropolis–Hastings ratio:
α_ij = min{1, π(j)/π(i)} = min{1, λ^(j−i) · i!/j!}
Let u ∼ U[0, 1]:
if u ≤ α_ij then X_{k+1} = j
else X_{k+1} = X_k = i
R implementation
library(mcsm)

# Factorial via the gamma function: n! = gamma(n + 1)
fact <- function(n) { gamma(n + 1) }

# Metropolis-Hastings sampler for a Poisson(lambda) target,
# with a +/-1 random-walk proposal on the non-negative integers
poissonf <- function(n, lambda, x0) {
  x  <- x0
  xn <- x0
  for (i in 1:n) {
    if (xn != 0) {
      y <- xn + (2 * rbinom(1, 1, 0.5) - 1)   # propose xn - 1 or xn + 1
    } else {
      y <- rbinom(1, 1, 0.5)                  # from 0, propose 0 or 1
    }
    alpha <- min(1, lambda^(y - xn) * fact(xn) / fact(y))
    if (runif(1) < alpha) { xn <- y }         # accept with probability alpha
    x <- c(x, xn)
  }
  x
}
Multivariate Target
If the distribution π is d-dimensional and the simulated process is X^(t) = (X_1^(t), …, X_d^(t)), we may use the following techniques to construct the transition matrix P:
1 In the transition from t to t + 1, all coordinates of X^(t) may be changed.
2 In the transition from t to t + 1, only one coordinate of X^(t) may be changed, the selection being made at random among the d coordinates.
3 In the transition from t to t + 1, only one coordinate may change, the coordinate being selected in a fixed rather than a random sequence.
Hastings' justification
Hastings transformed the d-dimensional problem into a one-dimensional one.
The approach is based on updating one component at a time.
The transition matrix is defined as follows: P = P_1 · P_2 ⋯ P_d.
For each k = 1, …, d, P_k is constructed so that π P_k = π.
π will then be a stationary distribution of P, since π P = π P_1 ⋯ P_d = π P_2 ⋯ P_d = ⋯ = π.
See: Random orthogonal matrices (appendix).
Conclusion
+ In this paper, Hastings gives a generalization of the Metropolis et al. (1953) approach.
+ He also introduced the Gibbs sampling strategy when he presented the multivariate target.
+ Hastings treated the continuous case using a discretization analogy.
− Little information is given about the merits of the Metropolis and Barker acceptance forms.
Thank You For Your Attention
Bibliography
[1] W. K. Hastings (1970). Monte Carlo Sampling Methods Using Markov Chains and Their Applications.
[2] Christian P. Robert (2010). Introducing Monte Carlo Methods with R.
[3] Kenneth Lange (2010). Numerical Analysis for Statisticians.
[4] Siddhartha Chib (1995). Understanding the Metropolis–Hastings Algorithm.
[5] Robert Gray (2001). Advanced Statistical Computing.
Random orthogonal Matrices
Hastings suggests an interesting chain on the space of n × n orthogonal matrices (H′H = I, det(H) = 1).
The proposal stage of Hastings's algorithm consists of choosing at random two indices i and j and an angle θ ∈ [0, 2π].
The proposed replacement for the current rotation matrix H is then E_ij(θ) · H.
E_ij(θ) coincides with the identity matrix except for some entries (those in rows and columns i and j, which implement a rotation by θ).
Since E_ij(θ)^(−1) = E_ij(−θ), the transition density is symmetric and the induced Markov chain is reversible.
Estimating π using Monte Carlo methods (SAS output)
Problem: estimate π using Monte Carlo integration.
Strategy: the equation of a circle with radius 1 is x² + y² = 1, which can be written y = √(1 − x²).
The area of this circle is π, so the area of the circle in the first quadrant is π/4.
Generate U_x ∼ Uniform(0, 1) and U_y ∼ Uniform(0, 1).
Check whether U_y ≤ √(1 − U_x²).
The proportion of generated points for which this condition is true is an estimate of π/4.
Based on 10,000 simulated points using SAS: π̂ (SE) = 3.1056 (0.016).
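The same hit-or-miss strategy is easy to reproduce outside SAS; this is my own sketch in Python, not the talk's SAS code:

```python
import random

def estimate_pi(n, seed=0):
    """Fraction of uniform points (Ux, Uy) in the quarter disc, times 4."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n)
               if rng.random() ** 2 + rng.random() ** 2 <= 1.0)
    return 4.0 * hits / n

# With ~10,000 points the estimate is typically within a few hundredths of pi,
# matching the order of the standard error reported on the slide.
```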