
Stationary Distribution: Application
(eaton.math.rpi.edu/faculty/Kramer/Stoch14/stochnotes022514.pdf)

Homework 2 posted, due Friday, March 7 at 2 PM.

How does the stationary distribution apply to questions of long-run behavior of Markov chains?

Two key results:

1. Stationary distribution as a limit distribution

A finite-state, discrete-time Markov chain which is irreducible and aperiodic has a unique stationary distribution π, and moreover:

lim_{n→∞} P(X_n = j) = π_j for all j ∈ S, for any initial distribution.

That is, the probability distribution for the state of the Markov chain converges (in fact exponentially fast) to the stationary distribution at long times.

2. Law of Large Numbers for Markov Chains

An irreducible (not necessarily aperiodic) finite-state, discrete-time Markov chain has the following property:

For any function f on the state space S,

lim_{n→∞} (1/n) Σ_{k=1}^n f(X_k) = Σ_{i∈S} f(i) π_i, with probability 1.

The left-hand side is a time average of a function of the state of the Markov chain. The right-hand side is a deterministic ensemble average with respect to the stationary distribution π.

This is saying two conceptual things:

• Law of Large Numbers, extended to a Markov chain (which is certainly not iid, though maybe you can think of it that way by looking at large blocks of the time series as being approximately independent of each other; see the dissection principle in Resnick Ch. 2)

○ n.b. some books like Resnick include aperiodicity in the definition of ergodicity of a Markov chain, but that seems inconsistent with how the word is used in scientific practice

• The finite-state Markov chain satisfying the irreducibility condition obeys an ergodic property, meaning that long-time averages are equivalent to ensemble averages, independent of the choice of initial conditions. So it must be the case that ergodic systems somehow "forget" their initial conditions after a long enough time.
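As a quick numerical illustration of the Law of Large Numbers for Markov chains (a sketch; the two-state chain and the function f below are made up for illustration), compare the time average along one long trajectory with the ensemble average under the stationary distribution:

```python
import random

# Made-up two-state chain on S = {0, 1} with transition matrix
#   P = [[0.9, 0.1],
#        [0.5, 0.5]]
# Its stationary distribution solves pi P = pi:  pi = (5/6, 1/6).
P = [[0.9, 0.1], [0.5, 0.5]]
pi = [5 / 6, 1 / 6]

def f(s):
    """An arbitrary function on the state space."""
    return 10.0 if s == 1 else 2.0

random.seed(0)
x, total, n = 0, 0.0, 200_000
for _ in range(n):
    total += f(x)
    x = 0 if random.random() < P[x][0] else 1   # one step of the chain

time_average = total / n
ensemble_average = sum(f(i) * pi[i] for i in range(2))

print(time_average, ensemble_average)   # the two should be close
```

With 200,000 steps the time average should match the ensemble average 20/6 ≈ 3.33 to within a couple of decimal places, regardless of the initial state.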

Stationary Distribution: Application (Tuesday, February 25, 2014, 1:58 PM)

Stoch14 Page 1


Ergodicity is a very useful concept in practice, which is unfortunately very difficult to prove for complex systems. In practice, ergodicity is often assumed without proof.

Ergodicity is useful in practice from two directions:

1. It allows the computation of the long-time properties of a dynamical Markov chain to be reduced to a deterministic calculation.

2. It allows one to sample from awkward probability distributions (typically in high dimensions) by constructing an artificial Markov chain which has the desired probability distribution as its unique stationary distribution. Then one samples from the probability distribution by sampling from the Markov chain. One has to take care to construct the Markov chain so that it has good ergodicity properties (converges quickly to the stationary distribution) and is relatively easy to generate. This is what's called Markov Chain Monte Carlo (MCMC). How does one construct a Markov chain to have a desired stationary distribution? The Metropolis-Hastings procedure is one common way.
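A minimal sketch of the Metropolis-Hastings idea on a finite state space (the target weights below are made up for illustration): propose a symmetric random-walk move and accept it with probability min(1, π(proposed)/π(current)).

```python
import random

# Hypothetical target distribution on states {0, 1, 2, 3}, given as
# unnormalized weights -- MCMC never needs the normalizing constant.
weights = [1.0, 2.0, 4.0, 8.0]
Z = sum(weights)

def mh_step(x):
    """One Metropolis step with a symmetric +/-1 random-walk proposal."""
    y = x + random.choice([-1, 1])
    if y < 0 or y >= len(weights):
        return x                                   # off the state space: reject
    if random.random() < min(1.0, weights[y] / weights[x]):
        return y                                   # accept
    return x                                       # reject, stay put

random.seed(1)
counts = [0] * len(weights)
x, n = 0, 400_000
for _ in range(n):
    x = mh_step(x)
    counts[x] += 1

empirical = [c / n for c in counts]
target = [w / Z for w in weights]
print(empirical, target)   # empirical frequencies approach the target
```

The acceptance rule enforces detailed balance with respect to the target, so the target is the stationary distribution of this artificial chain.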

Now that we see how a stationary distribution can be useful, in principle, for computing long-time properties of Markov chains, how do we compute the stationary distribution? Several ways one can proceed:

• Just solve for the left eigenvector of the probability transition matrix P with eigenvalue 1 (πP = π), and normalize so the entries sum to 1.

• Power method: π = lim_{n→∞} μ^(0) P^n, which follows from the stationary distribution serving as the limit distribution when the Markov chain is irreducible and aperiodic. Here μ^(0) is an arbitrary initial probability distribution vector. (See Resnick Sec. 2.14.)

• Method of Kirchhoff, which involves graphical/diagrammatic analysis of the Markov chain; see Haken, Synergetics, Secs. 4.6-4.8. I've seen it used primarily with small biochemical reaction networks.

The above methods can be somewhat hard to implement for systems with large state spaces because the linear algebra becomes expensive.

An important method that sometimes works for finding stationary distributions, even analytically (sometimes), for large state spaces is to look for a detailed balance solution:

π_i P_ij = π_j P_ji for all i, j ∈ S.


Any detailed balance solution is a stationary distribution but not vice versa. A detailed balance solution has flux balance along each edge, whereas a stationary distribution has flux balance node by node. But detailed balance solutions often do arise as stationary distributions, especially in systems that have time-reversal symmetry (i.e., many physics applications).
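The distinction can be checked numerically. A sketch with two made-up 3-state chains: a lazy random walk on a path satisfies detailed balance, while a directed 3-cycle has a stationary distribution that balances fluxes only node by node.

```python
def is_stationary(pi, P, tol=1e-12):
    """Check pi P = pi (flux balance node by node)."""
    n = len(P)
    return all(abs(sum(pi[i] * P[i][j] for i in range(n)) - pi[j]) < tol
               for j in range(n))

def is_detailed_balance(pi, P, tol=1e-12):
    """Check pi_i P_ij = pi_j P_ji on every edge (i, j)."""
    n = len(P)
    return all(abs(pi[i] * P[i][j] - pi[j] * P[j][i]) < tol
               for i in range(n) for j in range(n))

# Reversible example: lazy random walk on a 3-state path; pi = (1/4, 1/2, 1/4).
P_rev = [[0.5, 0.5, 0.0],
         [0.25, 0.5, 0.25],
         [0.0, 0.5, 0.5]]
pi_rev = [0.25, 0.5, 0.25]

# Non-reversible example: deterministic 3-cycle; pi is uniform, but all
# probability flux circulates one way, so detailed balance fails.
P_cyc = [[0.0, 1.0, 0.0],
         [0.0, 0.0, 1.0],
         [1.0, 0.0, 0.0]]
pi_cyc = [1 / 3, 1 / 3, 1 / 3]

print(is_stationary(pi_rev, P_rev), is_detailed_balance(pi_rev, P_rev))   # True True
print(is_stationary(pi_cyc, P_cyc), is_detailed_balance(pi_cyc, P_cyc))   # True False
```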

Example Application of Stationary Distribution to Markov Chain Model

The simple inspection protocol we'll discuss in lecture (more elaborate ones in the homework) is parameterized by two nonnegative integers M, r:

• Start by inspecting every product until you see M good products consecutively.

• After the inspector has seen M consecutive good products, it switches to inspecting only one in every r products, periodically.

• When the inspector finds a defect, the product is rejected, and the inspector switches back to inspecting every product until it again sees M consecutive good products.

Every uninspected product, as well as every inspected good product, is shipped.

Practical questions for an inspection protocol:

• What fraction of shipped products are defective?

• What fraction of products are inspected?

We can view each of these in terms of long-run averages, to which we can apply the Law of Large Numbers for Markov chains, once we define a Markov chain model.

Part of the model must involve a model for the defects in the products.

• In class, we will assume that each product is defective with probability p, independently of the defect status of any other product.

• We'll also assume the inspector works perfectly.


The homework problem invites you to explore other models for defects and inspection protocols.

Let's define a Markov chain for the present model.

First question is how to choose the epochs. If we choose each inspection to define an epoch, then we can define the state space as

S = {0, 1, ..., M},

where X_n is defined to be the number of consecutive good products seen by the inspector up to and including the nth inspection. (Set X_0 = 0.) And if the inspector has seen more than M consecutive good products, we still define X_n = M because the system operates in the same way. No information about the defects seen needs to be incorporated into the state space under the assumption that the defects are independent in different products.

If one defined epochs based on each item produced, then one would have to define a bigger state space also keeping track of the phase in the cycle when the inspector is skipping products.

Proceeding with our framework, we can define our Markov chain model as follows:

Stochastic update rule:

X_{n+1} = min(X_n + 1, M) with probability 1-p (inspected product is good),
X_{n+1} = 0 with probability p (inspected product is defective).

Probability transition matrix (on states 0, 1, ..., M):

P_{i,0} = p for all i; P_{i,i+1} = 1-p for 0 ≤ i ≤ M-1; P_{M,M} = 1-p; all other entries 0.
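This transition structure can be written down programmatically. A sketch under the stated assumptions (the helper name inspection_P and the parameter values are illustrative):

```python
def inspection_P(M, p):
    """Transition matrix for the inspection chain on states {0, 1, ..., M}.

    From any state: with probability p the inspected product is defective
    and the count of consecutive good products resets to 0; with
    probability 1 - p it is good and the count increments (capped at M).
    """
    n = M + 1
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        P[i][0] += p                    # defect found: reset to state 0
        P[i][min(i + 1, M)] += 1 - p    # good product: increment, cap at M
    return P

P = inspection_P(M=3, p=0.1)
for row in P:
    assert abs(sum(row) - 1.0) < 1e-12   # each row is a probability distribution
print(P[3])   # from state M: reset w.p. p, stay at M w.p. 1-p
```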


Let's do some topological analysis on the Markov chain to see whether it satisfies the conditions of our theorems about stationary distributions.

What are the communication classes of this MC? Irreducible; one communication class. To argue this, one would need to show that it is possible to go from any state i to any other state j. One such pathway is to go from i to 0, then increment up by 1 until state j is reached.

What is the period of the MC? 0 clearly has period 1 because it can return to itself in 1 epoch, and the whole MC has the same period because it's irreducible and period is a class property.

Therefore we have an irreducible, aperiodic FSDT MC so all the theorems about stationary distributions apply. We'll now see how we can use the stationary distribution to answer the two key questions described above. First let's compute the stationary distribution.
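These topological checks can also be done programmatically. A sketch with hypothetical values M = 3, p = 0.1 (the transition matrix of the inspection chain is rebuilt inline):

```python
M, p = 3, 0.1
n = M + 1

# Inspection chain: reset to 0 w.p. p, increment (capped at M) w.p. 1-p.
P = [[0.0] * n for _ in range(n)]
for i in range(n):
    P[i][0] += p
    P[i][min(i + 1, M)] += 1 - p

def reachable(start):
    """Set of states reachable from `start` via positive-probability steps."""
    seen, stack = {start}, [start]
    while stack:
        i = stack.pop()
        for j in range(n):
            if P[i][j] > 0 and j not in seen:
                seen.add(j)
                stack.append(j)
    return seen

# Irreducible iff every state reaches every other state.
irreducible = all(reachable(i) == set(range(n)) for i in range(n))

# State 0 has a self-loop (P[0][0] = p > 0), so its period is 1; since
# period is a class property, the whole irreducible chain is aperiodic.
period_of_0 = 1 if P[0][0] > 0 else None

print(irreducible, period_of_0)   # True 1
```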

Here the model is simple enough (sparse probability transition matrix) that we can solve for the stationary distribution analytically.


Doing the multiplication on the LHS and equating the result component by component, we get:

π_0 = p (π_0 + π_1 + ... + π_M),
π_i = (1-p) π_{i-1} for i = 1, ..., M-1,
π_M = (1-p) π_{M-1} + (1-p) π_M.

As always with calculations involving eigenvectors, one of these equations is redundant with the others.

We are also going to want to impose the normalization condition:

π_0 + π_1 + ... + π_M = 1.


Solving these equations with the normalization condition gives

π_i = p (1-p)^i for i = 0, ..., M-1, and π_M = (1-p)^M.

This looks like a truncated geometric distribution, and could also be derived as the probability distribution for a "success run."

Let's see now how to use the stationary distribution to answer one of the questions of interest.

What fraction of products are inspected in the long run?

fraction inspected = lim_{n→∞} n / (Σ_{k=1}^n N_k) = 1 / (Σ_{i=0}^M N(i) π_i),

where N_n is the number of products that exit the production line from the nth inspection up to but not including the (n+1)st inspection. As a function of the state, N(i) = 1 for i < M (every product inspected) and N(M) = r (one in every r inspected). Substituting the stationary distribution gives

fraction inspected = 1 / (1 - (1-p)^M + r (1-p)^M).
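As a numerical sanity check with hypothetical parameter values (M = 3, r = 10, p = 0.1), using the truncated geometric stationary distribution of the inspection chain:

```python
M, r, p = 3, 10, 0.1     # hypothetical protocol parameters

# Stationary distribution of the inspection chain (truncated geometric):
# pi_i = p (1-p)^i for i < M, and pi_M = (1-p)^M.
pi = [p * (1 - p) ** i for i in range(M)] + [(1 - p) ** M]
assert abs(sum(pi) - 1.0) < 1e-12

# N(i) = products exiting between consecutive inspections:
# 1 while inspecting every product (i < M), r while sampling one in r (i = M).
N = [1] * M + [r]

expected_N = sum(N[i] * pi[i] for i in range(M + 1))
fraction_inspected = 1.0 / expected_N
print(fraction_inspected)
```

With these values the inspector spends most of its time in the sampling regime (π_M = 0.9³ ≈ 0.73), so only about 13% of products get inspected.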
