Андрей Андреевич Марков


  • Markov Chains
    Graduate Seminar in Applied Statistics
    Presented by Matthias Theubert
    "Never look behind you…"

  • Preface
    So-called Markov chains can be used to describe special stochastic processes over a longer period of time and to calculate the probabilities of a future state of the process without too much effort.
    => This makes the method very interesting for predictions.
    Named after the Russian mathematician Andrei Andrejewitsch Markov (1856-1922), who used a comparatively simple way to describe stochastic processes.

  • Basics
    A random process or stochastic process X is a collection of random variables { X_t : t ∈ T } indexed by some set T, which we will usually think of as time.

    If T = { 0, 1, 2, … }, the process is called a discrete-time process.

    If T = ℝ or T = [0, ∞), we call it a continuous-time process.

    In this talk only discrete-time processes will be considered.

  • Basics
    Let {X_0, X_1, …} be a sequence of random variables which take values in some countable set S, called the state space. Thus, each X_t is a discrete random variable.

    We write X_n = i and say the process is in state i at time n.

  • Markov property
    Definition 1: The process X is said to be a Markov Chain if it satisfies the Markov property:

    P(X_n+1 = j | X_0 = i_0, …, X_n-1 = i_n-1, X_n = i) = P(X_n+1 = j | X_n = i)

    for all states i_0, …, i_n-1, i, j and all n ≥ 0.

    Informally, the Markov property says that, given the past history of the process, future behaviour only depends on the current value.

  • Markov Chains
    A Markov chain is specified by:

    the state space S,

    the transition probabilities,

    and the initial distribution

    (a minimal code sketch of these three ingredients follows below).
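
    A minimal sketch of these three ingredients in Python/NumPy, assuming a small four-state chain; the numeric values below are chosen for illustration and are not the values used on the slides.

    import numpy as np

    states = [1, 2, 3, 4]                       # state space S

    # Transition probabilities p_ij = P(X_{n+1} = j | X_n = i), one row per state
    # (illustrative values only)
    P = np.array([
        [0.0, 0.5, 0.5, 0.0],
        [0.5, 0.0, 0.0, 0.5],
        [1/3, 0.0, 1/3, 1/3],
        [0.0, 1/3, 1/3, 1/3],
    ])

    d1 = np.array([0.25, 0.25, 0.25, 0.25])     # initial distribution d(1)

    rng = np.random.default_rng(0)

    def simulate(P, d1, n_steps):
        """Simulate one path; the next state depends only on the current state."""
        x = rng.choice(len(d1), p=d1)            # draw X_1 from d(1)
        path = [states[x]]
        for _ in range(n_steps):
            x = rng.choice(P.shape[1], p=P[x])   # draw X_{n+1} from row x of P
            path.append(states[x])
        return path

    print(simulate(P, d1, 10))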

  • The bug in the maze
    [Maze diagram with cells 1-4]
    Transition probabilities: p_11 = 0, p_12 = …, p_13 = …, p_14 = 0, etc.

  • Transition probabilities
    p_ij = P(X_n+1 = j | X_n = i) is the probability that the system will be in state j at time n+1 if it is in state i at time n.

    These probabilities are called the (one-step) transition probabilities of the chain.

    They can be represented as a transition matrix P = (p_ij).

  • Transition matrix for the example
    [Transition matrix P of the bug-in-the-maze example, shown on the slide]

  • Transition matrix
    The transition matrix P is a stochastic matrix. That is, P has no negative entries: p_ij ≥ 0 for all i, j.

    As each row i describes a probability function, the row sums equal one: Σ_j p_ij = 1 (see the short check below).
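
    These two properties are easy to check numerically; a small sketch (the matrix values are again illustrative assumptions):

    import numpy as np

    def is_stochastic(P, tol=1e-9):
        """True if P has no negative entries and every row sums to one."""
        return bool(np.all(P >= 0)) and np.allclose(P.sum(axis=1), 1.0, atol=tol)

    P = np.array([                # illustrative transition matrix (assumed values)
        [0.0, 0.5, 0.5, 0.0],
        [0.5, 0.0, 0.0, 0.5],
        [1/3, 0.0, 1/3, 1/3],
        [0.0, 1/3, 1/3, 1/3],
    ])
    print(is_stochastic(P))       # True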

  • Homogeneous Markov Chain
    A Markov Chain is called homogeneous (or a Markov Chain with stationary transition probabilities) if the transition probabilities are independent of the time t:

    p_ij = P(X_t+1 = j | X_t = i) = P(X_t = j | X_t-1 = i) for all t.

  • n-step transition probabilities
    It can also be of interest how likely it is that the system is in state j after two, three, …, n steps, given that it is now in state i.

    For describing the long-term behaviour of a Markov chain one can define the n-step transition probabilities

    p_ij(n) = P(X_m+n = j | X_m = i).

    The n-step transition matrix

    P(n) = (p_ij(n))

    is the matrix of n-step transition probabilities.

  • n-step transition probabilities
    Example: for p_14(2) we have

    p_14(2) = p_11 p_14 + p_12 p_24 + p_13 p_34 + p_14 p_44.

    In general:

    p_ik(2) = Σ_j p_ij p_jk = p_i1 p_1k + p_i2 p_2k + … + p_iN p_Nk.

  • Chapman-Kolmogorov Equations
    Theorem: p_ij(m+n) = Σ_k p_ik(m) p_kj(n) for all m, n ≥ 0.

    In matrix form: P(m+n) = P(m) P(n), and hence P(n) = P^n, the n-th power of the one-step transition matrix.
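
    A quick numerical illustration, again assuming the illustrative four-state matrix from above: the n-step transition matrix is the n-th matrix power, and the theorem can be checked directly.

    import numpy as np

    P = np.array([                # illustrative stochastic matrix (assumed values)
        [0.0, 0.5, 0.5, 0.0],
        [0.5, 0.0, 0.0, 0.5],
        [1/3, 0.0, 1/3, 1/3],
        [0.0, 1/3, 1/3, 1/3],
    ])

    def n_step(P, n):
        """n-step transition matrix P(n) = P^n."""
        return np.linalg.matrix_power(P, n)

    # Chapman-Kolmogorov: P(m+n) = P(m) P(n)
    m, n = 2, 3
    print(np.allclose(n_step(P, m + n), n_step(P, m) @ n_step(P, n)))   # True

    # Two-step probability p_14(2) as an explicit sum over intermediate states
    i, k = 0, 3                   # states 1 and 4, 0-based
    p_14_2 = sum(P[i, j] * P[j, k] for j in range(P.shape[0]))
    print(np.isclose(p_14_2, n_step(P, 2)[i, k]))                       # True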

  • Notation
    State i leads to state j (written i → j) if the chain may ever visit state j with positive probability, starting from state i (possibly in more than one step).

    i and j communicate (written i ↔ j) if i → j and j → i.

    It can be proven that i → j and j → k implies i → k.

  • Classes
    We can partition the state space S into equivalence classes w.r.t. ↔. These are called the communicating classes of the chain. If there is a single communicating class, the chain is said to be irreducible.

    A class C is said to be closed if i ∈ C and i → j implies j ∈ C. Once the Markov chain enters a closed class, it will never leave the class.

  • Example
    Suppose S = { 1, 2, 3, 4, 5, 6 } and P is a given transition matrix.

    This can be drawn as a directed graph with vertex set S and a directed edge from i to j if p_ij > 0. This gives:

    [Directed graph on the states 1-6 with the edge probabilities from P]

  • Example
    The classes are {1, 2, 3}, {4} and {5, 6}, with {5, 6} being a closed class. (A small sketch for computing such classes programmatically follows below.)
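
    A sketch of how the communicating classes can be found by mutual reachability. The slide's transition matrix is not reproduced in this transcript, so the matrix below is an assumed stand-in chosen to have exactly the classes {1, 2, 3}, {4} and {5, 6}, with {5, 6} closed.

    import numpy as np

    # Assumed stand-in for the slide's 6-state transition matrix
    P = np.array([
        [0,   1/2, 1/2, 0,   0,   0  ],
        [1/3, 0,   1/3, 1/3, 0,   0  ],
        [1,   0,   0,   0,   0,   0  ],
        [0,   0,   0,   0,   1/2, 1/2],
        [0,   0,   0,   0,   0,   1  ],
        [0,   0,   0,   0,   1/2, 1/2],
    ])

    def communicating_classes(P):
        """Partition the states into classes of mutually reachable states (i <-> j)."""
        n = P.shape[0]
        reach = (P > 0) | np.eye(n, dtype=bool)          # i -> j in one step (or i = j)
        for _ in range(n):                               # build the transitive closure
            reach = reach | ((reach.astype(int) @ reach.astype(int)) > 0)
        classes, seen = [], set()
        for i in range(n):
            if i not in seen:
                cls = {j for j in range(n) if reach[i, j] and reach[j, i]}
                seen |= cls
                classes.append(sorted(s + 1 for s in cls))   # 1-based state labels
        return classes

    print(communicating_classes(P))   # [[1, 2, 3], [4], [5, 6]]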

  • Absorbing state
    If the system reaches a state i that cannot be left, this state is called absorbing. It forms its own closed class { i }.

    Formally: p_ii = 1.

    The Markov chain is called absorbing if it has at least one absorbing state.
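
    Detecting absorbing states is then a one-line check on the diagonal of P; a tiny sketch with assumed values:

    import numpy as np

    def absorbing_states(P):
        """Return the (1-based) states i with p_ii = 1, i.e. states that cannot be left."""
        return [i + 1 for i in range(P.shape[0]) if np.isclose(P[i, i], 1.0)]

    P = np.array([                # illustrative matrix in which state 3 is absorbing
        [0.2, 0.5, 0.3],
        [0.4, 0.4, 0.2],
        [0.0, 0.0, 1.0],
    ])
    print(absorbing_states(P))    # [3]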

  • Recurrence and Transience
    For any states i and j, define f_ij(n) to be the probability that, starting in i, the first transition into j occurs at time n.

  • Recurrence and Transience
    Let f_jj = Σ_n f_jj(n), the probability that the chain ever returns to j when started in j.

    Definition: A state j is said to be recurrent if f_jj = 1 and transient otherwise. Informally, a state j is recurrent if the chain is certain to return to j if it starts in j (the small simulation sketch below illustrates this).
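
    The return probability f_jj can be illustrated by simulation. The sketch below reuses the assumed 6-state matrix from the classes example: state 1 can drift into the closed class {5, 6} and is therefore transient, while state 5 is recurrent.

    import numpy as np

    rng = np.random.default_rng(1)

    P = np.array([                # same assumed 6-state matrix as in the classes sketch
        [0,   1/2, 1/2, 0,   0,   0  ],
        [1/3, 0,   1/3, 1/3, 0,   0  ],
        [1,   0,   0,   0,   0,   0  ],
        [0,   0,   0,   0,   1/2, 1/2],
        [0,   0,   0,   0,   0,   1  ],
        [0,   0,   0,   0,   1/2, 1/2],
    ])

    def estimate_return_prob(P, j, n_runs=2000, max_steps=200):
        """Monte Carlo estimate of f_jj, the probability of ever returning to j from j."""
        returned = 0
        for _ in range(n_runs):
            x = j
            for _ in range(max_steps):
                x = rng.choice(P.shape[1], p=P[x])
                if x == j:
                    returned += 1
                    break
        return returned / n_runs

    print(estimate_return_prob(P, 0))   # state 1: clearly below 1 (transient)
    print(estimate_return_prob(P, 4))   # state 5: close to 1 (recurrent)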

  • Initial distribution
    The initial distribution

    d(1) := [d_1(1), d_2(1), ..., d_m(1)] = [P(X_1 = 1), P(X_1 = 2), ..., P(X_1 = m)]

    gives the probabilities with which the Markov Chain starts in the states j = 1, ..., m.

    For the example it may be: d(1) = (0.25, 0.25, 0.25, 0.25) if we have no better information, or

    d(1) = (1, 0, 0, 0) if we know the chain starts in state 1, for example.

  • Marginal distribution
    The n-step transition probabilities are the conditional probabilities of being in state j at time m+n given that the system is in state i at time m.

    In general one is also interested in the marginal probability P(X_t = i) of being in state i at a given time t, without the condition that the system is in a certain state at a certain time before.

    This question can be answered by the marginal distribution.

  • Marginal distribution
    Given the initial probability distribution d(1) for the first state, the probability function for the state X_t can be computed as

    d(t) = (P(X_t = 1), …, P(X_t = m)) = d(1) P^(t-1)

    (see the short sketch below).
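
    A short sketch of this computation, again with the assumed four-state matrix and a uniform initial distribution:

    import numpy as np

    P = np.array([                              # illustrative transition matrix (assumed values)
        [0.0, 0.5, 0.5, 0.0],
        [0.5, 0.0, 0.0, 0.5],
        [1/3, 0.0, 1/3, 1/3],
        [0.0, 1/3, 1/3, 1/3],
    ])
    d1 = np.array([0.25, 0.25, 0.25, 0.25])     # initial distribution d(1)

    def marginal(d1, P, t):
        """Distribution of X_t: d(t) = d(1) P^(t-1)."""
        return d1 @ np.linalg.matrix_power(P, t - 1)

    for t in (1, 2, 5, 20):
        print(t, np.round(marginal(d1, P, t), 4))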

  • Long term behaviour / stationary distribution
    What happens to the chain for very large t?

    One can show that, in case the Markov chain is homogeneous, irreducible and aperiodic, d(t) converges to a fixed vector, say d, for large t.

    The vector d is called the stationary distribution of P, and the Markov chain is then said to be stationary. (A sketch for computing it numerically follows below.)
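
    A sketch for computing the stationary distribution numerically: it is the left eigenvector of P for eigenvalue 1, normalised to sum to one (matrix values again assumed for illustration).

    import numpy as np

    P = np.array([                # illustrative irreducible, aperiodic chain (assumed values)
        [0.0, 0.5, 0.5, 0.0],
        [0.5, 0.0, 0.0, 0.5],
        [1/3, 0.0, 1/3, 1/3],
        [0.0, 1/3, 1/3, 1/3],
    ])

    # Left eigenvector of P for eigenvalue 1, normalised so the entries sum to one
    eigvals, eigvecs = np.linalg.eig(P.T)
    d = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])
    d = d / d.sum()
    print(np.round(d, 4))

    # Cross-check: d(1) P^(t-1) approaches the same vector for large t
    d1 = np.array([1.0, 0.0, 0.0, 0.0])
    print(np.round(d1 @ np.linalg.matrix_power(P, 50), 4))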

  • Long term behaviour
    If one calculates the transition probabilities in the bug example for large t, one gets:

    P(17) = P(18) = [matrices shown on the slide, with all rows (approximately) equal]

    If we multiply it by any initial distribution we get the stationary distribution d = (0.207, 0.207, 0.276, 0.310).

  • Long term behaviour
    Convergence to equilibrium: the effect is that, in the long run, an irreducible and aperiodic Markov chain forgets where it started. The probability p_ij(n) of being in state j at time n converges to d_j, regardless of the initial state i.

    It is easy to see what can go wrong if the chain is not irreducible. If, for example, there are two closed classes, then the long-term behaviour will be different depending on which class the chain starts in. Similarly, if the chain is periodic, then the value of p_ii(n), for example, will be zero unless n is a multiple of the period of the chain.

  • Thank you for your attention!

    Sometimes we can break a Markov Chain into smaller pieces, each of which is relatively easy to understand.