
Markov Processes for Everybody

Introduction to the theory of continuous-time Markov processes.

Wilhelm Huisinga & Eike Meerbach

Fachbereich Mathematik und Informatik, Freie Universität Berlin & DFG Research Center Matheon, Berlin
[email protected], [email protected]

Berlin, June 15, 2005 — preliminary version —


Contents

1 Markov jump processes
1.1 Setting the scene
1.2 Communication and recurrence
1.3 Infinitesimal generators and the master equation
1.4 Invariant measures and stationary distributions
1.5 Reversibility and the law of large numbers
1.6 Biochemical reaction kinetics


1 Markov jump processes

1.1 Setting the scene

Consider some probability space (Ω, A, P), where Ω is called the sample space, A the set of all possible events (the σ-algebra) and P is some probability measure on Ω. A family X = {X(t) : t ≥ 0} of random variables X(t) : Ω → S is called a continuous-time stochastic process on the state space S. The index t admits the convenient interpretation as time: if X(t) = y, the process is said to be in state y at time t. For some given ω ∈ Ω, the S-valued set {X(t, ω) : t ≥ 0} is called a realization (trajectory, sample path) of the stochastic process X associated with ω.

Definition 1.1 (Markov process) A continuous-time stochastic process {X(t) : t ≥ 0} on a countable state space S is called a Markov process, if for any t_{k+1} > t_k > . . . > t_0 and B ⊂ S the Markov property

P[X(t_{k+1}) ∈ B | X(t_k), . . . , X(t_0)] = P[X(t_{k+1}) ∈ B | X(t_k)]    (1)

holds. If, moreover, the right hand side of (1) depends only on the time increment t_{k+1} − t_k, but not on t_k, then the Markov process is called homogeneous. Given a homogeneous Markov process, the function p : R+ × S × S → R+ defined by

p(t, x, y) = P[X(t) = y | X(0) = x]

is called the stochastic transition function; its values p(t, x, y) are the (conditional) transition probabilities to move from x to y within time t. The probability distribution µ0 satisfying

µ0(x) = P[X(0) = x]

is called the initial distribution. If there is a single x ∈ S such that µ0(x) = 1, then x is called the initial state.

In the following, we will focus on homogeneous Markov processes, and thus the term Markov process will always refer to a homogeneous Markov process, unless otherwise stated.

There are some subtleties in the realm of continuous-time processes that are not present in the discrete-time case. These stem from the fact that an uncountable union of measurable sets need not be measurable anymore. For example, the mapping X(t, ·) : Ω → S is measurable for every t ∈ R+, i.e., {ω ∈ Ω : X(t, ω) ∈ A} ∈ A for every measurable subset A ⊂ S. However,

{ω ∈ Ω : X(t, ω) ∈ A for all t ∈ R+} = ⋂_{t∈R+} {ω ∈ Ω : X(t, ω) ∈ A}


need not be in A in general. This is related to functions like inf_{t∈R+} X(t) or sup_{t∈R+} X(t), since, e.g.,

{sup_{t∈R+} X(t) ≤ x} = ⋂_{t∈R+} {ω ∈ Ω : X(t, ω) ≤ x}.

We will therefore impose some (quite natural) regularity conditions on the Markov process in order to exclude pathological cases (or too technical details). Throughout this chapter, we assume that

p(0, x, y) = δ_{xy},    (2)

where δ_{xy} = 1 if x = y and zero otherwise. This guarantees that no transition can take place at zero time. Moreover, we assume that the transition probabilities are continuous at t = 0:

lim_{t→0+} p(t, x, y) = δ_{xy}    (3)

for every x, y ∈ S. This guarantees (up to stochastic equivalence) that the realizations of {X(t) : t ≥ 0} are right continuous functions (more precisely, it implies that the Markov process is stochastically continuous, separable and measurable on compact intervals. Moreover, there exists a separable version, being stochastically equivalent to {X(t) : t ≥ 0} and all of whose realizations are continuous from the right; for details see reference [5, Chapt. 8.5]). Due to the fact that the state space is discrete, continuity from the right of the sample functions implies that they are step functions, that is, for almost all ω ∈ Ω and all t ≥ 0 there exists ∆t(t, ω) > 0 such that

X(t + τ, ω) = X(t, ω);    τ ∈ [0, ∆t(t, ω)).

This fact motivates the name Markov jump process.

For our further study, we recall two important random variables. A continuous random variable τ : Ω → R+ satisfying

P[τ > s] = exp(−λs)

for every s ≥ 0 is called an exponential random variable with parameter λ ≥ 0. Its probability density f : R+ → R+ is given by

f(s) = λ exp(−λs)

for s ≥ 0 (and zero otherwise). Moreover, the expectation is given by

E[τ] = 1/λ.


One of the most striking features of exponential random variables is their memoryless property, expressed as

P[τ > t + s | τ > t] = P[τ > s]

for all s, t ≥ 0. This is easily proven by noticing that the left hand side is per definition equal to exp(−λ(t + s))/exp(−λt) = exp(−λs), being equal to the right hand side.

A discrete random variable N : Ω → N with probability distribution

P[N = k] = (λ^k/k!) e^{−λ}

for k ∈ N is called a Poisson random variable with parameter λ ∈ R+. Its expectation is given by

E[N] = λ.

We now consider two examples of Markov jump processes that are of prototype nature.

Example 1.2 Consider an iid. sequence {τ_k}_{k∈N} of exponential random variables with parameter λ > 0 and define recursively the sequence of random variables {T_k}_{k∈N} by

T_{k+1} = T_k + τ_k

for k ≥ 0 and T_0 = 0. Here, T_k is called the kth event time and τ_k the inter-event time. Then, the family of random variables {N(t) : t ≥ 0} defined by

N(t) = Σ_{k=1}^∞ 1_{T_k ≤ t} = max{k ≥ 0 : T_k ≤ t}

for t ≥ 0 satisfies N(0) = 0. Its (discrete) distribution is given by the Poisson distribution with parameter λt:

P[N(t) = k] = ((λt)^k/k!) e^{−λt}

for k ≥ 0. That is why {N(t)} is called a homogeneous Poisson process with intensity λ. Per construction, N(t) is counting the number of events up to time t. Therefore, it is also sometimes called the counting process associated with {T_k}.
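The Poisson law of N(t) can also be checked numerically. The following is a minimal sketch in Python (the values λ = 2, t = 3 and the number of runs are illustrative choices, not part of the example above): it simulates the counting process by summing exponential inter-event times and compares the empirical distribution of N(t) with the Poisson distribution with parameter λt.

```python
import numpy as np
from math import exp, factorial

rng = np.random.default_rng(0)
lam, t, n_runs = 2.0, 3.0, 20_000

# Simulate N(t) by accumulating inter-event times tau_k ~ Exp(lam)
counts = np.empty(n_runs, dtype=int)
for i in range(n_runs):
    T, k = 0.0, 0
    while True:
        T += rng.exponential(1.0 / lam)  # numpy parametrizes by the scale 1/lambda
        if T > t:
            break
        k += 1
    counts[i] = k

# Empirical law of N(t) versus the Poisson(lam * t) distribution
for k in range(6):
    emp = np.mean(counts == k)
    theo = (lam * t) ** k / factorial(k) * exp(-lam * t)
    print(f"k={k}: empirical {emp:.4f}, Poisson {theo:.4f}")
```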

Remark: One could also determine the distribution of the event times T_k, i.e., the time at which the kth event happens to occur. Since per definition T_k is the sum of k iid. exponential random variables with parameter λ, the probability density is known as

f_{T_k}(t) = ((λt)^{k−1}/(k−1)!) λ e^{−λt}

for t ≥ 0 and zero otherwise. This is the so-called Erlang distribution (a special type of Gamma distribution) with parameters k and λ.

Example 1.3 Consider some discrete-time Markov chain {E_k}_{k∈N} on a countable state space S with stochastic transition matrix K = (k(x, y))_{x,y∈S}, and furthermore, consider some homogeneous Poisson process {T_k}_{k∈N} on R+ with intensity λ > 0 and associated counting process {N(t) : t ≥ 0}. Then, assuming independence of {E_k} and {N(t)}, the process {X(t) : t ≥ 0} with

X(t) = E_{N(t)}

is called the uniform Markov jump process with clock N(t) and subordinated chain {E_k}. The thus defined process is indeed a Markov jump process (exercise). Note that the jumps of X(t) are events of the clock process N(t); however, not every event of N(t) corresponds to a jump (unless k(x, x) = 0 for all x ∈ S). In order to compute its transition probabilities, note that

P[X(t) = y | X(0) = x] = P[E_{N(t)} = y | E_0 = x]
= Σ_{n=0}^∞ P[E_n = y, N(t) = n | E_0 = x]
= Σ_{n=0}^∞ P[E_n = y | E_0 = x] P[N(t) = n],

where the last equality is due to the assumed independence of {E_n} and {N(t)}. Hence, its transition probabilities are given by

p(t, x, y) = Σ_{n=0}^∞ e^{−λt} ((λt)^n/n!) k_n(x, y),    (4)

for t ≥ 0, x, y ∈ S, where k_n(x, y) is the corresponding entry of K^n, the n-step transition matrix of the subordinated Markov chain.
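Eq. (4) lends itself to numerical evaluation by truncating the series. A minimal sketch in Python (the three-state chain K, the intensity λ and the time t are arbitrary illustrative values):

```python
import numpy as np

K = np.array([[0.0, 0.5, 0.5],   # subordinated chain, here with k(x, x) = 0
              [0.3, 0.0, 0.7],
              [0.6, 0.4, 0.0]])
lam, t = 1.5, 0.8

# Truncate p(t, x, y) = sum_n e^{-lam t} (lam t)^n / n! * k_n(x, y), eq. (4)
P_t = np.zeros_like(K)
Kn = np.eye(3)                 # K^0 = Id
weight = np.exp(-lam * t)      # Poisson weight for n = 0
for n in range(100):
    P_t += weight * Kn
    Kn = Kn @ K                # K^{n+1}
    weight *= lam * t / (n + 1)

print(P_t)                     # the transition matrix P(t)
print(P_t.sum(axis=1))         # each row sums to 1 (stochastic matrix)
```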

Solely based on the Markov property, we will now deduce some properties of Markov jump processes that illuminate the differences, but also the close relations to the realm of discrete-time Markov chains. We will see that the uniform Markov jump process is in some sense the prototype of a Markov jump process. To do so, define for t ∈ R+ the residual life time τ(t) : Ω → [0, ∞] in state X(t) by

τ(t) = inf{s > 0 : X(t + s) ≠ X(t)}.    (5)

Since the sample paths are right-continuous, t + τ(t) is a stopping time. Hence, conditioned on X(t) and τ(t) < ∞, the next jump (state change) will occur at time t + τ(t). Otherwise the Markov process will not leave the state X(t) anymore.

Proposition 1.4 Consider some Markov process {X(t) : t ≥ 0} being in state x ∈ S at time t ∈ R+. Then, there exists λ(x) ≥ 0, independent of the time t, such that

P[τ(t) > s | X(t) = x] = exp(−λ(x)s)    (6)

for every s > 0.

Therefore, λ(x) is called the jump rate associated with the state x ∈ S. Prop. 1.4 states that the residual life time decays exponentially in s.

Proof: Note that P[τ(t) > s | X(t) = x] = P[τ(0) > s | X(0) = x], since the Markov jump process is homogeneous. Define g(s) = P[τ(0) > s | X(0) = x] and compute

g(t + s) = P[τ(0) > t + s | X(0) = x] = P[τ(0) > t, τ(t) > s | X(0) = x]
= P[τ(0) > t | X(0) = x] P[τ(t) > s | τ(0) > t, X(0) = x]
= g(t) P[τ(t) > s | τ(0) > t, X(0) = x, X(t) = x]
= g(t) P[τ(t) > s | X(t) = x] = g(t)g(s).

In addition, g(s) is continuous at s = 0, since the transition probabilities were assumed to be continuous at zero. Moreover, 0 ≤ g(s) ≤ 1, which finally implies that the only solution must be

g(s) = exp(−λ(x)s)

with λ(x) ∈ [0, ∞] given by λ(x) = −ln(P[τ(0) > 1 | X(0) = x]). □

To further illuminate the characteristics of Markov processes on countable state spaces, denote by T_0 = 0 < T_1 < T_2 < . . . the random jump times or event times, at which the Markov process changes its state. Based on the jump times, define the sequence of random life times (τ_k)_{k∈N} via the relation

τ_k = T_{k+1} − T_k

for k ∈ N. Due to Prop. 1.4 we know that

P[τ_k > s | X(T_k) = x] = exp(−λ(x)s)

for s ≥ 0. Moreover, the average life time of a state is

E[τ(t) | X(t) = x] = 1/λ(x).

In terms of the jump times, we have

X(t) = X(T_k);    t ∈ [T_k, T_{k+1}),

hence the Markov process is constant, except for the jumps.

Definition 1.5 Consider a state x ∈ S with associated jump rate λ(x). Then, x is called

1. permanent, if λ(x) = 0,

2. stable, if 0 < λ(x) < ∞,

3. instantaneous, if λ(x) = ∞ (not present for Markov processes with right continuous sample paths).

Assume that X(t) = x at time t; if x is

1. permanent, then P[X(s) = x | X(t) = x] = 1 for every s > t, hence the Markov process stays in x forever,

2. stable, then P[0 < τ(t) < ∞ | X(t) = x] = 1,

3. instantaneous, then P[τ(t) = 0 | X(t) = x] = 1, hence the Markov process exits the state as soon as it enters it.

Due to our general assumption, we know that the Markov process has right continuous sample paths. As a consequence, the state space S does not contain instantaneous states.

Definition 1.6 Consider some Markov process {X(t) : t ≥ 0} with right continuous sample paths. Then, the Markov process is called regular or non-explosive, if

T_∞ := sup_{k∈N} T_k = ∞ (a.s.),

where T_0 < T_1 < . . . denote the jump times of {X(t) : t ≥ 0}. The random variable T_∞ is called the explosion time.


If the Markov jump process is explosive, then P[T_∞ < ∞] > 0. Hence there is a "substantial" set of realizations, for which the Markov process "blows up" in finite time. In such a situation, we assume that the Markov jump process is only defined for times smaller than the explosion time. The following proposition provides a quite general condition for a Markov process to be regular.

Proposition 1.7 A Markov process {X(t) : t ≥ 0} on a countable state space S is regular, if and only if

Σ_{k=1}^∞ 1/λ(X(T_k)) = ∞ (a.s.).

This is particularly the case, if (1) S is finite, or (2) if sup_{x∈S} λ(x) < ∞.

Proof: See Prop. 8.7.2 in [5]. □

Compare also Prop. 1.16. Based on the sequence of event times (T_k)_{k∈N}, we define the discrete-time S-valued stochastic process {E_k} by E_k = X(T_k). As is motivated from the definition and the following results, {E_k}_{k∈N} is called the embedded Markov chain. However, it still remains to prove that {E_k}_{k∈N} is really well-defined and Markov. To do so, we need the following

Definition 1.8 A Markov process {X(t) : t ≥ 0} on a state space S fulfills the strong Markov property if, for any stopping time τ, being finite a.s.,

P[X(s + τ) ∈ A | X(τ) = x, X(t), t < τ] = P_x[X(s) ∈ A]

for every A ⊂ S, whenever both sides are well-defined. Hence, the process {X(s + τ) : s ≥ 0} is Markov and independent of {X(t), t < τ}, given X(τ) = x.

In contrast to the discrete-time case, not every continuous-time Markov process on a countable state space obeys the strong Markov property. However, under suitable regularity conditions it does.

Theorem 1.9 A regular Markov process on a countable state space fulfills the strong Markov property.

Proof: See Thm. 4.1 in Chapter 8 of [1]. □

The next proposition states that the time at which the Markov process jumps next and the state it jumps into are independent.


Proposition 1.10 Consider a regular Markov jump process on S, and assume that T_{k+1} < ∞ a.s. Then, conditioned on X(T_k) = x, the random variables τ_{k+1} and X(T_{k+1}) are independent, i.e.,

P[τ_{k+1} > t, X(T_{k+1}) = y | X(T_k) = x]    (7)
= P[τ_{k+1} > t | X(T_k) = x] · P[X(T_{k+1}) = y | X(T_k) = x].

Proof: Starting with (7) we get, by applying Bayes rule,

P[τ_{k+1} > t, X(T_{k+1}) = y | X(T_k) = x]
= P[τ_{k+1} > t | X(T_k) = x] · P[X(T_{k+1}) = y | X(T_k) = x, τ_{k+1} > t].

Using the Markov property we can rewrite the last factor as

P[X(T_{k+1}) = y | X(T_k) = x, τ_{k+1} > t]
= P[X(T_k + t + τ(T_k + t)) = y | X(s) = x, T_k ≤ s ≤ T_k + t]
= P[X(T_k + τ(T_k)) = y | X(T_k) = x]
= P[X(T_{k+1}) = y | X(T_k) = x],

where we used the homogeneity of the Markov process to proceed from the second to the third line. □

We are now ready to define the embedded Markov chain of the Markov jump process.

Definition 1.11 Define the homogeneous Markov chain {E_k}_{k∈N} on the state space S in terms of the following transition function P = (p(x, y))_{x,y∈S}. If x is permanent, set p(x, x) = 1. Otherwise, if x is stable, set

p(x, y) = P[X(T_1) = y | X(0) = x]    (8)

and consequently p(x, x) = 0.

Summarizing our results we obtain the following theorem.

Theorem 1.12 Consider a regular Markov jump process and assume that the state space consists only of stable states. Then, {X(T_k)}_{k∈N} is a homogeneous Markov chain with transition function defined in (8). In other words, it is

E_k = X(T_k)

for every k ∈ N (in distribution).

So, we obtain the following characterization of a Markov jump process {X(t) : t ≥ 0} on the state space S. Assume that the process is at state x at time t, i.e., X(t) = x. If the state is permanent, then the Markov process will stay in x forever, i.e., X(s) = x for all s > t. If the state is stable, then the Markov process will leave the state x at a (random) time being exponentially distributed with parameter 0 < λ(x) < ∞. It then jumps into some other state y ≠ x in S with probability p(x, y), hence according to the law of the embedded Markov chain {E_k}_{k∈N}. Therefore, knowing the rates (λ(x)) and the embedded Markov chain in terms of its transition function P = (p(x, y)) completely characterizes the Markov jump process.

The above characterization is also the basis for a numerical simulation of the Markov jump process. To do so, one might exploit the following important and well-known relation between an exponential random variable τ with parameter λ and some uniform random variable U on [0, 1], given by

τ = −(1/λ) ln(U).

Hence, a numerical simulation of a Markov jump process can be based on randomly drawing two uniform random numbers for each jump event (one for the time, another one for the state change).
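A minimal sketch of such a simulation in Python (the jump rates λ(x) and embedded chain K below are arbitrary illustrative values, not taken from the text): per jump event, one uniform random number is turned into the exponential holding time via τ = −ln(u_0)/λ(x), and a second one selects the successor state by inverting the cumulative distribution of the corresponding row of K.

```python
import numpy as np

def simulate_mjp(lam, K, x0, t_end, rng):
    """Simulate one sample path of a Markov jump process up to time t_end,
    given jump rates lam[x] and the embedded-chain transition matrix K."""
    times, states = [0.0], [x0]
    t, x = 0.0, x0
    while True:
        if lam[x] == 0.0:                 # permanent state: stay forever
            break
        u0, u1 = rng.random(), rng.random()
        t += -np.log(u0) / lam[x]         # holding time ~ Exp(lam[x])
        if t > t_end:
            break
        x = int(np.searchsorted(np.cumsum(K[x]), u1))  # next state via row K[x]
        times.append(t)
        states.append(x)
    return np.array(times), np.array(states)

rng = np.random.default_rng(1)
lam = np.array([1.0, 2.0, 0.5])           # illustrative jump rates
K = np.array([[0.0, 0.5, 0.5],            # illustrative embedded chain, k(x,x) = 0
              [1.0, 0.0, 0.0],
              [0.5, 0.5, 0.0]])
times, states = simulate_mjp(lam, K, x0=0, t_end=10.0, rng=rng)
print(times)
print(states)
```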

1.2 Communication and recurrence

This section is about the topology of regular Markov jump processes (unlessstated otherwise). As in the case of Markov chains, we start with some

Definition 1.13 Let {X(t) : t ≥ 0} denote a Markov process with transition function P(t), and let x, y ∈ S denote some arbitrary pair of states.

1. The state x has access to the state y, written x → y, if

P[X(t) = y | X(0) = x] > 0

for some t > 0.

2. The states x and y communicate, if x has access to y and y has access to x, denoted by x ↔ y.

3. The Markov process is said to be irreducible, if all pairs of states communicate.

As for Markov chains, it can be proven that the communication relation ↔ is an equivalence relation on the state space. We remark that periodicity plays no role for continuous-time Markov jump processes, since they are always aperiodic. We proceed by introducing the first return time.

Definition 1.14 1. The stopping time E_x : Ω → R+ ∪ {∞} defined by

E_x = inf{t ≥ 0 : X(t) ≠ x, X(0) = x}

is called the first escape time from state x ∈ S.


2. The stopping time R_x : Ω → R+ ∪ {∞} defined by

R_x = inf{t > T_1 : X(t) = x},

with inf ∅ = ∞, is called the first return time to state x.

Note that under P_x the first escape time E_x coincides with the residual life time τ(0) (see eq. (5)). Analogously to Markov chain theory, based on the first return time to a state, we may define recurrence and transience of a state.

Definition 1.15 A state x ∈ S is called recurrent, if it is permanent or

P_x[R_x < ∞] = 1,

and transient otherwise.

Again, recurrence and transience are class properties, i.e., the states of some communication class are either all recurrent or all transient. Interestingly and maybe not surprisingly, some (but not all, as we will see later) properties of states can be determined in terms of the embedded Markov chain.

Proposition 1.16 Consider a regular and stable Markov jump process {X(t) : t ≥ 0} and the associated embedded Markov chain {E_k}_{k∈N}; then the following holds true.

a) The Markov jump process is irreducible, if and only if its embedded Markov chain is irreducible.

b) A state x ∈ S is recurrent (transient) for the embedded Markov chain, if and only if it is recurrent (transient) for the Markov jump process.

c) A state x ∈ S is recurrent for the Markov jump process, if and only if

∫_0^∞ p(t, x, x) dt = ∞.

d) Recurrence and transience of the Markov process carry over to any discretization, i.e., if h > 0 and Z_k := X(kh), then recurrence of x ∈ S for the Markov process is equivalent to recurrence of x ∈ S for the discretization {Z_k}_{k∈N}.

Proof: We leave the proof of the first two statements as an exercise to the reader.

c) Remember the analogous formulation in the time-discrete case: if for some Markov chain, e.g. {E_k}_{k∈N}, and some state, e.g. x ∈ S, the random variable N_x counts the number of visits in x, then x is recurrent if and only if

E_x[N_x] = E_x[Σ_{k=0}^∞ 1_{E_k=x}] = Σ_{k=0}^∞ E_x[1_{E_k=x}] = Σ_{k=0}^∞ p^{(k)}(x, x) = ∞,

where, as usual, p^{(k)}(x, x) denotes the k-step transition probability P_x[E_k = x]. Therefore we can prove the statement by showing that

∫_0^∞ p(t, x, x) dt = (1/λ(x)) Σ_{k=0}^∞ p^{(k)}(x, x).

This is done in the following, where we use Fubini's theorem to exchange integral and expectation and Beppo Levi's theorem to exchange summation and expectation:

∫_0^∞ p(t, x, x) dt = ∫_0^∞ E_x[1_{X(t)=x}] dt = E_x[∫_0^∞ 1_{X(t)=x} dt]
= E_x[Σ_{k=0}^∞ τ_k 1_{E_k=x}] = Σ_{k=0}^∞ E_x[τ_k | E_k = x] P_x[E_k = x]
= Σ_{k=0}^∞ (1/λ(x)) p^{(k)}(x, x).

Be aware that the conditions to use Fubini's theorem are only met because X is a jump process.

d) That transience carries over to any discretization is obvious, so consider x recurrent. If t is constrained by kh ≤ t < (k + 1)h, then

p((k + 1)h, x, x) ≥ p((k + 1)h − t, x, x) p(t, x, x)
≥ exp(−λ(x)((k + 1)h − t)) p(t, x, x)
≥ exp(−λ(x)h) p(t, x, x).

Multiplication with exp(λ(x)h) yields

exp(λ(x)h) p((k + 1)h, x, x) ≥ p(t, x, x),    for kh ≤ t < (k + 1)h.

This enables us to give an upper bound for the integral:

∫_0^∞ p(t, x, x) dt ≤ h Σ_{k=0}^∞ exp(λ(x)h) p((k + 1)h, x, x)
= h exp(λ(x)h) Σ_{k=1}^∞ p(kh, x, x).

Since x is recurrent for the Markov jump process, c) shows that the left hand side is infinite, hence Σ_{k=1}^∞ p(kh, x, x) = ∞, which is the sum over the transition probabilities of the discretized Markov chain. Thus x is recurrent for {Z_k}_{k∈N}. □

Irreducible and recurrent Markov jump processes are regular, as the next theorem states.

Theorem 1.17 An irreducible and recurrent Markov jump process is regular.

Proof: Regularity means that the sequence of event times heads to infinity:

lim_{k→∞} T_k = ∞ ⇔ Σ_{k=1}^∞ τ_k = ∞.

Let x ∈ S be an arbitrary start position. Since the Markov process is irreducible and recurrent, we know that the embedded Markov chain {E_k}_{k∈N} visits x infinitely often. Denote by {N_k(x)}_{k∈N} the sequence of visits in x. Observe that if τ is λ-exponentially distributed, then λτ is 1-exponentially distributed (we pose that as an easy exercise). The random variables λ(E_{N_k(x)}) τ_{N_k(x)} are therefore iid 1-exponentially distributed, so their sum is almost surely infinite, and we have

∞ = Σ_{k=0}^∞ λ(E_{N_k(x)}) τ_{N_k(x)} = λ(x) Σ_{k=0}^∞ τ_{N_k(x)} ≤ λ(x) Σ_{k=0}^∞ τ_k. □

As we will see below, it is also possible to characterize invariant measures in terms of the embedded Markov chain. However, the distinction between positive and null-recurrence and the existence of stationary distributions cannot be examined in terms of the embedded Markov chain. Here, the rates (λ(x)) also have to come into play. That is why we postpone the corresponding analysis and first introduce the concept of infinitesimal generators, which is the more adequate object to study.

1.3 Infinitesimal generators and the master equation

We now come to a characterization of Markov jump processes that is not present for the discrete-time case. It is in terms of infinitesimal changes of the transition probabilities and based on the notion of generators. As in the preceding section, we assume throughout that the Markov jump process satisfies the two regularity conditions (2) and (3).


To start with, we introduce the transition semigroup {P(t) : t ≥ 0} with

P(t) = (p(t, x, y))_{x,y∈S}.

Due to (2), it is P(0) = Id, and due to (3), we have

lim_{t→0+} P(t) = Id.

In terms of the transition semigroup, we can also easily express the Chapman-Kolmogorov equation as

P(s + t) = P(s)P(t)

for t, s ≥ 0 (which justifies the notion of a semigroup). In semigroup theory, one aims at characterizing P(t) in terms of its infinitesimal generator Q. In broad terms, the goal is to prove and justify the notion P(t) = exp(tQ). In the following, we will proceed towards this goal.

Proposition 1.18 Consider the semigroup P(t) of a Markov jump process. Then, the limit

A = lim_{t→0+} (P(t) − Id)/t

exists (entrywise) and defines the infinitesimal generator A = (a(x, y))_{x,y∈S} with −∞ ≤ a(x, x) ≤ 0 ≤ a(x, y) < ∞ for x ≠ y.

Note that we do not claim uniform convergence for all pairs of states x, y ∈ S.

Proof: We first prove the result for the diagonal entries. Consider some state x ∈ S and define h(t, x) = −ln(p(t, x, x)). Then, from the Chapman-Kolmogorov equation we deduce p(t + s, x, x) ≥ p(t, x, x)p(s, x, x). In terms of h, this implies the subadditivity h(t + s, x) ≤ h(t, x) + h(s, x). Due to the general regularity condition (2), it is h(0, x) = 0, implying h(t, x) ≥ 0 for all t ≥ 0. Now, define

sup_{t>0} h(t, x)/t =: c ∈ [0, ∞].

We now prove that c is in fact equal to the limit of h(t, x)/t for t → 0+, being equivalent to the statement that for every b < c it is

b ≤ liminf_{t→0+} h(t, x)/t ≤ limsup_{t→0+} h(t, x)/t = c.    (9)

So, choose b < c arbitrarily. According to the definition of c, there exists s > 0 such that b < h(s, x)/s. Rewriting s = nt + ∆t with t > 0 and 0 ≤ ∆t < t, we obtain by subadditivity

b < h(s, x)/s ≤ (nt/s) · h(t, x)/t + h(∆t, x)/s.

Taking the joint limit t → 0+, ∆t → 0 and n → ∞ such that nt/s → 1 proves the first inequality in (9) and thus completes the proof for the diagonal entries.

To prove the statement for the off-diagonal entries, assume that we are able to prove that for every ε ∈ (1/2, 1) there exists δ > 0 such that

p(ns, x, y) ≥ (2ε² − ε) n p(s, x, y)    (10)

for every n ∈ N and s ≥ 0 such that 0 ≤ ns < δ. Denote by [x] the integer part of x. Then,

p(s, x, y)/s ≤ p([t/s]s, x, y) / ([t/s]s (2ε² − ε))

with t, s < δ. Considering s → 0+, we obtain for all t > 0

limsup_{s→0+} p(s, x, y)/s ≤ p(t, x, y) / (t(2ε² − ε)) < ∞,

since lim_{s→0+} [t/s]s = t. Therefore,

limsup_{s→0+} p(s, x, y)/s ≤ (1/(2ε² − ε)) liminf_{t→0+} p(t, x, y)/t < ∞.

Since ε can be chosen arbitrarily close to 1, we finally obtain the desired result. However, statement (10) still needs to be proven ... □

Sometimes, even for discrete-time Markov chains "generators" are defined; here A = P − Id mimics the properties of an infinitesimal generator (which, of course, it is not). In graph theory, such a matrix is known as a Laplacian matrix.

Example 1.19 Rewriting eq. (4) in matrix form, we obtain for the transition semigroup of the uniform Markov jump process with intensity λ > 0 and subordinated Markov chain K = (k(x, y))

P(t) = Σ_{n=0}^∞ e^{−λt} ((λt)^n/n!) K^n = e^{tλ(K−Id)},    (11)

for t ≥ 0. The infinitesimal generator is thus given by

A = λ(K − Id),

which, entry-wise, corresponds to a(x, x) = −λ(1 − k(x, x)) and a(x, y) = λ k(x, y) for x ≠ y. The result directly follows from eq. (11).
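Eq. (11) also offers a quick numerical sanity check of Prop. 1.18: for small t, the difference quotient (P(t) − Id)/t should be close to A = λ(K − Id). A small sketch in Python with scipy (the chain K and intensity λ are illustrative values):

```python
import numpy as np
from scipy.linalg import expm

K = np.array([[0.2, 0.8],     # illustrative subordinated chain
              [0.6, 0.4]])
lam = 3.0
A = lam * (K - np.eye(2))     # generator of the uniform Markov jump process

t = 1e-4
P_t = expm(t * A)             # P(t) = e^{t lam (K - Id)}, eq. (11)
print((P_t - np.eye(2)) / t)  # difference quotient: entrywise close to A
print(A)
```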

By now we have two different descriptions of a Markov jump process, one in form of sojourn times and the embedded Markov chain, the other by the generator. We saw in the preceding sections that a Markov jump process is fully determined by its sojourn times and the embedded Markov chain. Why did we introduce the notion of a generator then? The answer is that more general Markov processes, e.g. on a continuous state space, cannot be described by an embedded Markov chain anymore, while it is still possible to use the generator concept. But of course, in the case of a Markov jump process, it is possible to convert both description types into each other, like we did in the case of a uniform Markov jump process. This is what we do in the next paragraph. We will construct for a given generator a suitable Markov jump process and use this construction afterwards to gain insight about the entries of the generator of a given Markov jump process. As preparation we need

Proposition 1.20 Consider a sequence of independent exponentially distributed random variables {N_k}_{k∈N} with parameters {λ_k}_{k∈N}. Assume that Σ_{k=0}^∞ λ_k = λ < ∞ and let

T = min{N_0, N_1, N_2, . . .}

be the minimum value of the sequence, and let J be such that N_J = T. Then J and T are independent random variables with

P[J = i, T ≥ t] = P[J = i] P[T ≥ t] = (λ_i/λ) exp(−λt).

Proof: Left as an exercise (use results about the distribution of a minimum of exponentially distributed random variables, show the proposition for a finite number of random variables and then generalize to the case of an infinite number of random variables). □

Given a stable and conservative generator, i.e. a matrix A with

−∞ < a(x, x) ≤ 0,    0 ≤ a(x, y) < ∞ for x ≠ y,

and Σ_{y≠x} a(x, y) = −a(x, x), we construct a jump process based on this matrix: set T_0 = 0, X(T_0) = X(0) = x_0 and define recursively the following process.

1. Assume X(T_k) = x.

2. If a(x, x) = 0, end the construction by setting τ_k = ∞ and X(t) = x for all t ≥ T_k.

3. Otherwise, set τ_k = min{N_{x,0}, N_{x,1}, . . . , N_{x,x−1}, N_{x,x+1}, N_{x,x+2}, . . .}, where N_{x,y} is exponentially distributed with parameter a(x, y), independently for all y ≠ x.

4. Set T_{k+1} = T_k + τ_k and X_{k+1} = X(T_{k+1}) = y, where y is the state such that τ_k = N_{x,y}.


Theorem 1.21 The previously constructed process is a homogeneous Markov jump process with generator A.

Proof: We leave the fact that the constructed process is a homogeneous Markov jump process to the careful reasoning of the reader. It remains to show that a(x, y) = lim_{t→0} (P(t, x, y) − P(0, x, y))/t, i.e. it is necessary to analyze the transition function P. The statement is trivial if a(x, x) = 0 (why?); therefore it is assumed in the following that a(x, x) ≠ 0. The first case to be considered is x ≠ y; then

P(t, x, y) = P_x[T_2 ≤ t, X(t) = y] + P_x[T_2 > t, X(t) = y]
= P_x[T_2 ≤ t, X(t) = y] + P_x[T_2 > t, T_1 ≤ t, X_1 = y]
= P_x[T_2 ≤ t, X(t) = y] + P_x[T_1 ≤ t, X_1 = y] − P_x[T_1 ≤ t, X_1 = y, T_2 ≤ t].

The three terms on the right-hand side are now analyzed separately. For the second term we have, by Prop. 1.20,

P_x[T_1 ≤ t, X_1 = y] = (1 − exp(a(x, x)t)) a(x, y)/(−a(x, x)) =: f(t).

Since f′(t) = a(x, y) exp(a(x, x)t), we have f′(0) = a(x, y) and

lim_{t→0} (f(t) − f(0))/t = lim_{t→0} f(t)/t = a(x, y).

The first and the third term are both upper-bounded by P_x[T_2 ≤ t], which can be bounded further by

P_x[T_2 ≤ t] ≤ P_x[T_1 ≤ t, τ_1 ≤ t]
= Σ_{y≠x} P_x[T_1 ≤ t, X_1 = y, τ_1 ≤ t]
= (1 − exp(a(x, x)t)) Σ_{y≠x} (a(x, y)/(−a(x, x))) (1 − exp(a(y, y)t)) =: u(t) g(t),

where u(t) := 1 − exp(a(x, x)t) and g(t) := Σ_{y≠x} (a(x, y)/(−a(x, x))) (1 − exp(a(y, y)t)). Now observe

lim_{t→0} u(t)/t = −(exp(a(x, x)t))′|_{t=0} = −a(x, x)

and

lim_{t→0} g(t) = Σ_{y≠x} (a(x, y)/(−a(x, x))) lim_{t→0} (1 − exp(a(y, y)t)) = 0

(the exchange of limit and summation is allowed in this case, because all summands are positive, bounded and with an existing limit). This yields lim_{t→0} P_x[T_2 ≤ t]/t = lim_{t→0} u(t)g(t)/t = 0 and, putting it all together,

lim_{t→0} P(t, x, y)/t = a(x, y).


It remains to show that lim_{t→0} (P(t, x, x) − 1)/t = a(x, x). This is very similar to the case x ≠ y, in that P(t, x, x) is decomposed by

P(t, x, x) = P_x[T_2 ≤ t, X(t) = x] + P_x[T_2 > t, X(t) = x]
= P_x[T_2 ≤ t, X(t) = x] + P_x[T_1 > t],

which can be treated similarly to the first case to show the assertion. □

As it is clear by now how to construct a Markov jump process from a given generator, we proceed to the reverse direction. Assume a given Markov jump process {X(t) : t ≥ 0} with jump rates {λ(x)}_{x∈S} and conditional transition probabilities {k(x, y)}_{x,y∈S}, where furthermore k(x, x) = 0 and λ(x) < ∞ for all x ∈ S. Let P be the transition function of this process. A matrix A′ defined by

a′(x, y) = −λ(x) for x = y,    a′(x, y) = λ(x)k(x, y) for x ≠ y

obviously fulfills the conditions we posed on a generator, namely conservative and stable, to construct a Markov process by the previously described procedure. Doing this we obtain another Markov process with transition function P′. By construction it is clear that P′ = P (you should be able to figure that out!) and therefore the derivatives at 0 are equal, that is, the generator of {X(t) : t ≥ 0} fulfills A = A′. We state this important result in

Theorem 1.22 Consider a homogeneous and regular Markov jump process on state space S with jump rates {λ(x)}_{x∈S} and conditional transition probabilities {k(x, y)}_{x,y∈S}, where k(x, x) = 0 for all x ∈ S. Then the generator A is given by

a(x, y) = −λ(x) for x = y,    a(x, y) = λ(x)k(x, y) for x ≠ y,

i.e. A = Λ(K − Id), where the jump rate matrix Λ is given by

Λ = diag(λ(0), λ(1), λ(2), . . .).    (12)

Hence, the negative diagonal entry of the generator corresponds to the life time rate of the corresponding state, while the off-diagonal entries are proportional to the transition probabilities of the embedded Markov chain. We further remark that an infinitesimal generator can be represented in multiple ways, if represented by an intensity λ ∈ R+ and some subordinated Markov chain with transition matrix S. Assume sup_x λ(x) < ∞. Then, for any choice of λ ≥ sup_x λ(x), define

S = λ^{−1} Λ(K − Id) + Id,

which indeed is a stochastic matrix (exercise). As a result, the representation A = Λ(K − Id) of Theorem 1.22 transforms into the uniformized representation

A = λ(S − Id),    (13)

which is the representation of the infinitesimal generator of a uniform Markov jump process with intensity λ and subordinated Markov chain represented by S. In the representation A = Λ(K − Id), every event is associated with a state change (since k(x, x) = 0). In contrast, events in the representation (13) need not necessarily correspond to a state change (since s(x, x) ≥ 0 might be positive).

Example 1.23 A birth and death process with birth rates (α_x)_{x∈S} and death rates (γ_x)_{x∈S} on the state space S = N is a continuous-time Markov jump process with infinitesimal generator A = (a(x, y))_{x,y∈S} defined by

A =
( −α_0     α_0             0               0              0    · · ·
   γ_1   −(γ_1 + α_1)      α_1             0              0    · · ·
   0       γ_2           −(γ_2 + α_2)      α_2            0    · · ·
   0       0               γ_3           −(γ_3 + α_3)     α_3  · · ·
   ...     ...             ...             ...           ...        ).

We assume that α_x, γ_x ∈ (0, ∞) for x ∈ S. The birth and death process is regular, if and only if Reuter's criterion

Σ_{k=1}^∞ ( 1/α_k + γ_k/(α_k α_{k−1}) + · · · + (γ_k · · · γ_1)/(α_k · · · α_0) ) = ∞

is satisfied [1, Chapt. 8, Thm. 4.5].

The next proposition shows how the generator can be used to evolve the semigroup of conditional transitions in time.

Proposition 1.24 Consider a Markov jump process with transition semigroup P(t) and infinitesimal generator A = (a(x, y)), satisfying −a(x, x) < ∞ for all x ∈ S. Then, P(t) is differentiable for all t ≥ 0 and satisfies the Kolmogorov backward equation

dP(t)/dt = A P(t).    (14)

If furthermore

Σ_y p(t, x, y) λ(y) < ∞    (15)

is satisfied for all t ≥ 0 and x ∈ S, then also the Kolmogorov forward equation

dP(t)/dt = P(t) A    (16)

holds.


Remark 1.25 Condition (15) is always satisfied if the state space S is finite or sup_x λ(x) < ∞.

Proof: There is a simple proof in the case of a finite state space. In the general case the proof is considerably harder; we refer to [5], Prop. 8.3.4, p. 210. By definition of a semigroup we have P(t + s) = P(t)P(s) = P(s)P(t); therefore, under the assumption that S is finite,

lim_{h→0} (P(t + h) − P(t))/h = lim_{h→0} P(t) (P(h) − Id)/h = P(t)A
= lim_{h→0} (P(h) − Id)/h · P(t) = AP(t).

This does not work in the infinite case, because it is not clear whether we can exchange the limit with the sum in the matrix-matrix multiplication. □

Remark 1.26 Component-wise the backward, resp. forward, equation reads

d/dt p(t, x, y) = −λ(x) p(t, x, y) + Σ_{z≠x} a(x, z) p(t, z, y), resp.

d/dt p(t, x, y) = −λ(y) p(t, x, y) + Σ_{z≠y} p(t, x, z) a(z, y).

If the state space is finite, the solution to eq. (14) is given by P(t) = exp(tA), where the matrix exponential function is defined via the series

exp(tA) = Σ_{n∈N} (tA)^n/n!,

which is known to converge. The situation is quite easy if A is diagonalizable, i.e. we have A = V^{−1} D V for an invertible matrix V and a diagonal matrix D = diag(d_1, . . . , d_r). Then A^n = V^{−1} D^n V and

exp(tA) = Σ_{n∈N} (t V^{−1} D V)^n/n! = V^{−1} (Σ_{n∈N} (tD)^n/n!) V = V^{−1} diag(exp(td_1), exp(td_2), . . . , exp(td_r)) V.

In the non-diagonalizable case the exponential function can still be used for a reasonable approximation by computing only part of the sum.

From the Kolmogorov forward equation we can easily deduce the evolution equation for an arbitrary initial distribution µ0 = (µ0(x))_{x∈S} of the Markov jump process. As in the discrete-time case, we have

µ(t) = µ0 P(t),


with µ(t) = (µ(t, x))_{x∈S} and µ(t, x) = P_{µ0}[X(t) = x]. Now, multiplying the Kolmogorov forward equation with µ0 from the left, we get the so-called master equation

dµ(t)/dt = µ(t) A

with initial condition µ(0) = µ0. It describes on an infinitesimal scale the evolution of densities w.r.t. the Markov process. An alternative formulation can be given in terms of the jump rates λ(x) and the embedded Markov chain K = (k(x, y)):

dµ(t, z)/dt = Σ_{y∈S} µ(t, y) a(y, z) = Σ_{y≠z} (λ(y) k(y, z) µ(t, y) − λ(z) k(z, y) µ(t, z))

for every z ∈ S.
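For a finite state space, the master equation is a linear ODE and its solution is µ(t) = µ0 exp(tA), so the matrix exponential of Remark 1.26 evolves distributions directly. A minimal sketch in Python with scipy (the generator entries and the times are illustrative assumptions):

```python
import numpy as np
from scipy.linalg import expm

# Illustrative three-state generator: nonnegative off-diagonal rates, rows sum to zero
A = np.array([[-1.0,  1.0,  0.0],
              [ 0.5, -1.5,  1.0],
              [ 0.0,  2.0, -2.0]])
mu0 = np.array([1.0, 0.0, 0.0])      # initial distribution: start in state 0

# mu(t) = mu0 exp(tA) solves d mu/dt = mu A with mu(0) = mu0
for t in (0.1, 1.0, 10.0):
    mu_t = mu0 @ expm(t * A)
    print(t, mu_t, mu_t.sum())       # total mass stays 1
```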

1.4 Invariant measures and stationary distributions

This section studies the existence of invariant measures and stationary (probability) distributions. We will see that the embedded Markov chain is still a very useful object in this regard, but we will also see that not every property of the continuous-time Markov process can be specified in terms of the embedded discrete-time Markov chain. This is particularly true for the property of positive recurrence.

Definition 1.27 A measure µ = (µ(x))_{x∈S} satisfying

µ = µP(t)

for all t ≥ 0 is called an invariant measure of the Markov jump process. If, moreover, µ is a probability measure satisfying µ(S) = 1, it is called a stationary distribution.

We are now able to state for a quite large class of Markov jump processesthe existence of invariant measures, and also to specify them.

Theorem 1.28 Consider an irreducible and recurrent regular Markov jump process on S with transition semigroup P(t). For an arbitrary state x ∈ S define µ = (µ(y))_{y∈S} via

µ(y) = E_x[∫_0^{R_x} 1_{X(s)=y} ds],    (17)

the expected time the process visits y before returning to x. Then


1. 0 < µ(y) < ∞ for all y ∈ S. Moreover, µ(x) = 1/λ(x) for the state x ∈ S chosen in eq. (17).

2. µ = µP(t) for all t ≥ 0.

3. If ν = νP(t) for some measure ν, then ν = αµ for some α ∈ R.

Proof: See Brémaud [1], Thm. 5.1, p. 357. We just prove here µ(x) = 1/λ(x). We have

µ(x) = E_x[∫_0^{R_x} 1_{X(s)=x} ds]    (18)
= E_x[∫_0^{E_x} 1_{X(s)=x} ds] + E_x[∫_{E_x}^{R_x} 1_{X(s)=x} ds]    (19)
= E_x[E_x] + 0 = 1/λ(x).    (20)

□

As one would expect, there is a close relation between the invariant measure of the transition semigroup, the infinitesimal generator and the embedded Markov chain.

Proposition 1.29 Consider an irreducible and recurrent regular Markov jump process on S with transition semigroup P(t), infinitesimal generator A and embedded Markov chain with transition matrix K. Then the following statements are equivalent:

1. There exists a measure µ = (µ(x))_{x∈S} such that µ = µP(t) for all t ≥ 0.

2. There exists a measure µ = (µ(x))_{x∈S} such that 0 = µA.

3. There exists a measure ν = (ν(x))_{x∈S} such that ν = νK.

The relation between µ and ν is given by µ = νΛ^{−1}, which element-wise corresponds to

µ(x) = ν(x)/λ(x)

for every x ∈ S.


Proof: Exercise. □
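For a finite state space the equivalences are easy to verify numerically. A small sketch in Python (the jump rates and the embedded chain are illustrative values): ν is obtained from ν = νK as a left eigenvector, µ = νΛ^{−1} is formed element-wise, and 0 = µA is checked for the generator A = Λ(K − Id) of Theorem 1.22.

```python
import numpy as np

lam = np.array([2.0, 1.0, 4.0])      # illustrative jump rates
K = np.array([[0.0, 0.5, 0.5],       # illustrative embedded chain, k(x,x) = 0
              [1.0, 0.0, 0.0],
              [0.5, 0.5, 0.0]])
A = np.diag(lam) @ (K - np.eye(3))   # generator A = Lambda (K - Id), Thm. 1.22

# nu = nu K: left eigenvector of K for the eigenvalue 1
w, V = np.linalg.eig(K.T)
nu = np.real(V[:, np.argmin(np.abs(w - 1.0))])
nu /= nu.sum()

mu = nu / lam                        # mu = nu Lambda^{-1}, element-wise
print(mu @ A)                        # ~ (0, 0, 0), i.e. 0 = mu A
```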

Consider the expected return time to some state x ∈ S defined by

E_x[R_x] = E_x[∫_0^∞ 1_{s≤R_x} ds].    (21)

Depending on the behavior of E_x[R_x], we further distinguish the two types of recurrent states:

Definition 1.30 A recurrent state x ∈ S is called positive recurrent, if

E_x[R_x] < ∞,

and null recurrent otherwise.

As in the discrete-time case, we have the following result.

Theorem 1.31 An irreducible regular Markov jump process with infinitesimal generator A is positive recurrent, if and only if there exists a probability distribution π on S such that

0 = πA

holds. Under these conditions, the stationary distribution π is unique and positive everywhere, with

π(x) = 1/(λ(x) E_x[R_x]).

Hence π(x) can be interpreted as the mean holding time of state x times the inverse of the expected first return time to state x ∈ S.

Proof: Theorem 1.28 states that an irreducible and recurrent regular Markov jump process admits an invariant measure µ defined through (17) for an arbitrary x ∈ S. Thus

Σ_{y∈S} µ(y) = Σ_{y∈S} E_x[∫_0^{R_x} 1_{X(s)=y} ds]
= E_x[∫_0^∞ Σ_{y∈S} 1_{X(s)=y} 1_{s≤R_x} ds]
= E_x[∫_0^∞ 1_{s≤R_x} ds] = E_x[R_x],

which is by definition finite in the case of positive recurrence. Therefore the stationary distribution can be obtained by normalization of µ with E_x[R_x], yielding

π(x) = µ(x)/E_x[R_x] = 1/(λ(x) E_x[R_x]).


Since the state x was chosen arbitrarily, this is true for all x ∈ S. Uniqueness and positivity of π follow from Theorem 1.28. On the other hand, if there exists a stationary distribution of the Markov process, it satisfies π = πP(t) for all t ≥ 0 due to Prop. 1.29. Moreover, if the Markov process were transient, then

lim_{t→∞} 1_{X(t)=y} = 0, implying lim_{t→∞} p(t, x, y) = 0

for x, y ∈ S by dominated convergence. In particular, πP(t) would tend to zero for t → ∞ component-wise, which would be in contradiction to π = πP(t). Hence, the Markov process is recurrent. Positive recurrence follows from the uniqueness of π and the considerations above. □

Our considerations in the proof of Theorem 1.31 easily lead to a criterion to distinguish positive recurrence from null recurrence.

Corollary 1.32 Consider an irreducible regular Markov jump process with invariant measure µ. Then

1. {X(t) : t ≥ 0} is positive recurrent ⇔ Σ_{x∈S} µ(x) < ∞,

2. {X(t) : t ≥ 0} is null recurrent ⇔ Σ_{x∈S} µ(x) = ∞.

Proof: The proof is left as an exercise. □

It is important to notice that positive recurrence cannot be characterized on the basis of the embedded Markov chain. This is due to the fact that, given an irreducible regular Markov jump process with 0 = µA, rates (λ(x)), and ν = νK, we know by Prop. 1.29 that

Σ_{x=0}^∞ µ(x) = Σ_{x=0}^∞ ν(x)/λ(x).

So, whether the left hand side converges or not depends on both the asymptotic behavior of (ν(x)) and of (λ(x)).

Example 1.33 Consider the birth and death Markov jump process with embedded Markov chain given by

K =
( 0      1
  1−p    0     p
         1−p   0     p
               ...   ...   ... )    (22)

and jump rates (λ(x)), still to be specified. The embedded Markov chain is irreducible and recurrent for 0 < p ≤ 1/2. Hence, so is the thereby defined Markov jump process, which in addition is regular due to Prop. 1.16. The invariant measure of the embedded Markov chain is given by

ν(x) = (1/p) (p/(1−p))^x ν(0)    (23)

for x ≥ 1 and ν(0) ∈ R. Computing the norm results in

Σ_{x=0}^∞ ν(x) = ((2 − 2p)/(1 − 2p)) ν(0).

Hence, the embedded Markov chain is null-recurrent for p = 1/2 and positive recurrent for p < 1/2. We now exemplify four possible settings:

1. Set λ(x) = x for x = 1, 2, . . . while λ(0) = 2, and p = 1/2. Then, we know that the embedded Markov chain is null-recurrent with invariant measure ν = (1/2, 1, 1, . . .), and

Σ_{x=0}^∞ µ(x) = 1/4 + Σ_{x=1}^∞ 1/x = ∞.

Hence, the Markov jump process is null-recurrent, too.

2. Set λ(x) = x² for x = 1, 2, . . ., while λ(0) = 2, and p = 1/2. Again, the embedded Markov chain is null-recurrent, but now

Σ_{x=0}^∞ µ(x) = 1/4 + Σ_{x=1}^∞ 1/x² < ∞.

Hence, now the Markov jump process is positive recurrent.

3. Set λ(x) = (1/3)^x for x = 1, 2, . . ., while λ(0) = 1/4, and p = 1/4. Now, the embedded Markov chain is positive recurrent with stationary distribution ν(x) = 4(1/3)^{x+1} for x ≥ 1 and ν(0) = 1/3. Then

Σ_{x=0}^∞ µ(x) = Σ_{x=0}^∞ 4/3 = ∞.

Hence, the Markov jump process is null-recurrent.

4. Set λ(x) = 4/3 for x = 1, 2, . . ., while λ(0) = 1/3, and p = 1/4. Again, the embedded Markov chain is positive recurrent. Finally, we have

Σ_{x=0}^∞ µ(x) = Σ_{x=0}^∞ (1/3)^x < ∞.

Hence, the Markov jump process is positive recurrent.


In the same spirit, one can show that the existence of a stationary distribution of some irreducible Markov jump process (being not necessarily regular) does not guarantee positive recurrence. In other words, Theorem 1.31 is wrong, if one drops the assumption that the Markov jump process is regular.

Example 1.34 We consider the embedded Markov chain with transition matrix given by eq. (22). If p > 1/2, then the Markov chain is transient. However, it does possess an invariant measure ν defined in eq. (23), where we choose ν(0) = p. Define the jump rates by

λ(x) = (p/(1−p))^{2x}

for x ≥ 1 and λ(0) = p. Then, we get that µ defined by µ(x) = ν(x)/λ(x) is an invariant measure with

Σ_{x=0}^∞ µ(x) = Σ_{x=0}^∞ ((1−p)/p)^x = p/(2p − 1) < ∞,

since p > 1/2 and thus (1−p)/p < 1. Concluding, the Markov jump process is irreducible and possesses a stationary distribution. Due to Prop. 1.16, it moreover is transient (since the embedded Markov chain is). This can only be in accordance with Thm. 1.31 if the Markov jump process is non-regular, hence explosive.

1.5 Reversibility and the law of large numbers

The concept of time reversibility for continuous-time Markov processes is basically the same as for discrete-time Markov chains. Consider some positive recurrent, irreducible regular Markov jump process with infinitesimal generator A and stationary distribution π. Then, define the time-reversed Markov jump process {Y(t) : t ≥ 0} in terms of its transition semigroup {Q(t) : t ≥ 0} according to

q(t, y, x) = π(x) p(t, x, y)/π(y)    (24)

for t ≥ 0 and x, y ∈ S. As can easily be shown, Q(t) fulfills all requirements for a transition semigroup. Defining the diagonal matrix D_π = diag(π(x))_{x∈S} with π on its diagonal, we may rewrite the above eq. (24) as

Q(t) = D_π^{−1} P(t)^T D_π.

Now, let us determine the infinitesimal generator B = (b(x, y))_{x,y∈S} of the semigroup Q(t). It is

B = lim_{t→0} (Q(t) − Id)/t = lim_{t→0} D_π^{−1} ((P(t)^T − Id)/t) D_π = D_π^{−1} A^T D_π,    (25)


hence the infinitesimal generator transforms in the same way as the semigroup does. From the infinitesimal generator, we easily conclude that the jump rates λ(x) are equal for both processes, since

λ_B(x) = −b(x, x) = −a(x, x) = λ_A(x).

Now, denote by K and L the transition matrices of the embedded Markov chains of the Markov jump processes X(t) and Y(t), respectively. Assume that K is positive recurrent with stationary distribution ν. Then, we get

Λ(L − Id) = B = D_π^{−1} A^T D_π    (26)
= D_π^{−1} (K^T − Id) Λ D_π    (27)
= Λ Λ^{−1} D_π^{−1} (K^T − Id) Λ D_π    (28)
= Λ (Λ^{−1} D_π^{−1} K^T Λ D_π − Id)    (29)
= Λ (D_ν^{−1} K^T D_ν − Id),    (30)

since λ(x)π(x) = a ν(x) for all x ∈ S and some normalization constant a > 0, which implies Λ D_π = a D_ν. Hence, we get the relation

L = D_ν^{−1} K^T D_ν

and thus the embedded Markov chain L of the time-reversed Markov jump process Y(t) equals the time-reversed embedded Markov chain of the original Markov jump process X(t).

As in the discrete-time case, we have

Definition 1.35 Consider an irreducible regular Markov jump process {X(t) : t ≥ 0} with infinitesimal generator A and stationary distribution π > 0, and its associated time-reversed Markov jump process with infinitesimal generator B. Then X(t) is called reversible w.r.t. π, if

a(x, y) = b(x, y)

for all x, y ∈ S.

The above definition can be reformulated: a Markov process is reversible w.r.t. π, if and only if the detailed balance condition

π(x) a(x, y) = π(y) a(y, x)    (31)

is satisfied for every x, y ∈ S.
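Birth and death processes (Example 1.23) are the standard reversible example; on a truncated state space the detailed balance condition (31) can be checked numerically. A sketch in Python (the truncation at six states and the constant rates α, γ are illustrative assumptions):

```python
import numpy as np

n, alpha, gamma = 6, 1.0, 2.0        # truncated state space and constant rates
A = np.zeros((n, n))
for x in range(n):
    if x + 1 < n:
        A[x, x + 1] = alpha          # birth x -> x + 1
    if x > 0:
        A[x, x - 1] = gamma          # death x -> x - 1
    A[x, x] = -A[x].sum()            # rows sum to zero

# Stationary distribution from 0 = pi A (left null vector of A)
w, V = np.linalg.eig(A.T)
pi = np.real(V[:, np.argmin(np.abs(w))])
pi /= pi.sum()

# Detailed balance (31): pi(x) a(x,y) = pi(y) a(y,x), i.e. D_pi A is symmetric
D_pi_A = pi[:, None] * A
print(np.allclose(D_pi_A, D_pi_A.T))  # True
```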

A measurable function f : S → R defined on the state space is called an observable. Observables allow one to perform "measurements" on the system that is modelled by the Markov process. The expectation of f is defined as

E_π[f] = Σ_{x∈S} f(x) π(x).


Theorem 1.36 (Strong law of large numbers) Let {X(t) : t ≥ 0} denote an irreducible regular Markov process with stationary distribution π, and let f : S → R be some observable such that

E_π[|f|] = Σ_{x∈S} |f(x)| π(x) < ∞.

Then for any initial state x ∈ S, i.e., X(0) = x,

(1/t) ∫_0^t f(X(s)) ds → E_π[f] (a.s.)

as t → ∞.

Proof: The proof is analogous to the proof for discrete-time Markov chains. □

1.6 Biochemical reaction kinetics

Consider a volume V containing molecules of N chemically active species S_0, . . . , S_{N−1} and possibly molecules of inert species. For k = 0, . . . , N − 1, denote by X_k(t) ∈ N the number of molecules of the chemical species S_k in V at time t ∈ R+, and set X(t) = (X_0(t), . . . , X_{N−1}(t)) ∈ N^N. Furthermore, consider M chemical reactions R_0, . . . , R_{M−1}, each characterized by a reaction constant c_k. The fundamental hypothesis of chemical reaction kinetics is that the rate of each reaction R_k can be specified in terms of a so-called propensity function α_k = α_k(X(t)), depending in general on the current state X(t) and possibly on time t. For the most common reaction types, it is (c = some generic reaction constant, r.p. = reaction products):

1. "spontaneous creation" ∗ → r.p., α(X(t), t) = c,

2. mono-molecular reaction S_j → r.p., α(X(t)) = c X_j(t),

3. bi-molecular reaction S_j + S_k → r.p., α(X(t)) = c X_j(t) X_k(t),

4. bi-molecular reaction S_j + S_j → r.p., α(X(t)) = c X_j(t)(X_j(t) − 1)/2.

The change in the numbers of molecules is described by the state change vectors η_0, . . . , η_{M−1} ∈ Z^N, such that X(t) → X(t) + η_k if reaction R_k occurs. The state change vectors are part of the stoichiometric matrix.

Example 1.37 We consider here a model by Srivastava et al. [4] describing the intracellular growth of a T7 phage. The model comprises three chemical species: the viral nucleic acids, classified into genomic (S_gen) and template (S_tem), and viral structural proteins (S_struc). The interaction network between the bacteriophage and the host is modelled by six reactions.

No.   reaction                              propensity                  state change
R0    S_gen --c0--> S_tem                   α0 = c0 · X_gen             η0 = (1, −1, 0)
R1    S_tem --c1--> ∅                       α1 = c1 · X_tem             η1 = (−1, 0, 0)
R2    S_tem --c2--> S_tem + S_gen           α2 = c2 · X_tem             η2 = (0, 1, 0)
R3    S_gen + S_struc --c3--> "virus"       α3 = c3 · X_gen · X_struc   η3 = (0, −1, −1)
R4    S_tem --c4--> S_tem + S_struc         α4 = c4 · X_tem             η4 = (0, 0, 1)
R5    S_struc --c5--> ∅                     α5 = c5 · X_struc           η5 = (0, 0, −1)

The reaction constants are given by c0 = 0.025, c1 = 0.25, c2 = 1.0, c3 = 7.5 · 10^−6, c4 = 1000, and c5 = 1.99 (day^−1). In the model, the volume of the cell is V = 1. The interesting scenario is the low infection level corresponding to the initial numbers of molecules X_tem = 1, X_gen = X_struc = 0.

We now specify the dynamics of {X(t) : t ≥ 0}, assuming that X(t) is a regular Markov jump process on the state space S = N^N that satisfies the regularity conditions (2) and (3), which seems to be a reasonable assumption for biochemical reaction systems. In terms of X(t), the fundamental hypothesis of chemical reaction kinetics is that

P[X(t + h) = x + η_k | X(t) = x] = α_k(x) h + o(h)

as h → 0 holds for k = 0, . . . , M − 1. This allows us to determine the infinitesimal generator A = (a(x, y))_{x,y∈S}. In view of

p(h, x, x + η_k) = a(x, x + η_k) h + o(h)

for h → 0, we conclude that

a(x, x + η_k) = α_k(x)

for k = 0, . . . , M − 1. As a consequence, the jump rates are given in terms of the propensity functions. Defining α(x) = α_0(x) + . . . + α_{M−1}(x), so that λ(x) = α(x), the embedded Markov chain with transition matrix K = (k(x, y))_{x,y∈S} is given by

k(x, x + η_k) = α_k(x)/α(x)

for k = 0, . . . , M − 1, and zero otherwise. The algorithmic realization of the chemical reaction kinetics is as follows:

1. Set initial time t = t_0 and initial numbers of molecules X(t_0);

2. Generate independently two uniformly distributed random numbers u_0, u_1 ∼ U[0, 1]. Set x = X(t);

3. Compute the next reaction time increment

τ = −ln(u_0)/α(x);

4. Choose the next reaction R_k according to the discrete probability distribution

(α_0(x)/α(x), . . . , α_{M−1}(x)/α(x));

5. Update molecular numbers X(t + τ) ← X(t) + η_k, and time: t ← t + τ. Go to Step 2.

This algorithmic scheme is known as the direct method [2, 3].
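As a concrete illustration, the direct method applied to the T7 model of Example 1.37 might be sketched in Python as follows (the final time of 30 days and the random seed are arbitrary choices; the constants and state change vectors are those of the example, with the state ordered as x = (X_tem, X_gen, X_struc)):

```python
import numpy as np

rng = np.random.default_rng(0)

c = np.array([0.025, 0.25, 1.0, 7.5e-6, 1000.0, 1.99])   # day^-1
eta = np.array([[ 1, -1,  0],    # R0: S_gen -> S_tem
                [-1,  0,  0],    # R1: S_tem -> 0
                [ 0,  1,  0],    # R2: S_tem -> S_tem + S_gen
                [ 0, -1, -1],    # R3: S_gen + S_struc -> "virus"
                [ 0,  0,  1],    # R4: S_tem -> S_tem + S_struc
                [ 0,  0, -1]])   # R5: S_struc -> 0

def propensities(x):
    tem, gen, struc = x
    return np.array([c[0]*gen, c[1]*tem, c[2]*tem,
                     c[3]*gen*struc, c[4]*tem, c[5]*struc])

t, x, t_end = 0.0, np.array([1, 0, 0]), 30.0   # low infection level
while True:
    a = propensities(x)
    a_total = a.sum()
    if a_total == 0.0:                         # no reaction can occur anymore
        break
    u0, u1 = rng.random(), rng.random()
    tau = -np.log(u0) / a_total                # step 3: next reaction time
    if t + tau > t_end:
        break
    t += tau
    k = int(np.searchsorted(np.cumsum(a) / a_total, u1))  # step 4: reaction R_k
    x = x + eta[k]                             # step 5: state update
print(t, x)
```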

Consider some initial distribution u_0 and set u(t, x) = P[X(t) = x | X(0) ∼ u_0]. The evolution equation for u is given by the master equation, which in this context is called the chemical master equation. It takes the form

du(t, x)/dt = Σ_{y≠x} u(t, y) a(y, x) + u(t, x) a(x, x)
= Σ_{k=0}^{M−1} (u(t, x − η_k) a(x − η_k, x) − u(t, x) a(x, x + η_k))
= Σ_{k=0}^{M−1} (α_k(x − η_k) u(t, x − η_k) − α_k(x) u(t, x)).


Acknowledgement. Supported by the DFG Research Center Matheon "Mathematics for key technologies".

References

[1] P. Brémaud. Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues. Springer, New York, 1999.

[2] D. T. Gillespie. A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. J. Comput. Phys., 22:403–434, 1976.

[3] D. T. Gillespie. Exact stochastic simulation of coupled chemical reactions. J. Phys. Chem., 81:2340–2361, 1977.

[4] R. Srivastava, L. You, J. Summers, and J. Yin. Stochastic vs. deterministic modeling of intracellular viral kinetics. J. Theor. Biol., 218:309–321, 2002.

[5] P. Todorovic. An Introduction to Stochastic Processes and Their Applications. Springer Series in Statistics. Springer, New York, 1992.