An Intensive Course in Stochastic Processes


  • 7/30/2019 An Intensive Course In Stochastic Processes

    1/244

An Intensive Course in Stochastic Processes and Stochastic Differential Equations in Mathematical Biology

Part I: Discrete-Time Markov Chains

Linda J. S. Allen
Texas Tech University, Lubbock, Texas, U.S.A.

National Center for Theoretical Sciences, National Tsing Hua University

August 2008

    L. J. S. Allen Texas Tech University


    Acknowledgement

I thank Professor Sze Bi Hsu and Professor Jing Yu for the invitation to present lectures at the National Center for Theoretical Sciences at the National Tsing Hua University.


COURSE OUTLINE

Part I: Discrete-Time Markov Chains - DTMC
  Theory
  Applications to Random Walks, Populations, and Epidemics

Part II: Branching Processes
  Theory
  Applications to Cellular Processes, Network Theory, and Populations

Part III: Continuous-Time Markov Chains - CTMC
  Theory
  Applications to Populations and Epidemics

Part IV: Stochastic Differential Equations - SDE
  Comparisons to Other Stochastic Processes, DTMC and CTMC
  Applications to Populations and Epidemics


Some Basic References for this Course

[1] Allen, L. J. S. 2003. An Introduction to Stochastic Processes with Applications to Biology. Prentice Hall, Upper Saddle River, NJ.

[2] Allen, L. J. S. 2008. Chapter 3: An Introduction to Stochastic Epidemic Models. In: Mathematical Epidemiology, Lecture Notes in Mathematics, Vol. 1945, pp. 81-130, F. Brauer, P. van den Driessche, and J. Wu (Eds.), Springer.

[3] Karlin and Taylor. 1975. A First Course in Stochastic Processes. 2nd Ed. Academic Press, NY.

[4] Kimmel and Axelrod. 2002. Branching Processes in Biology. Springer-Verlag, NY.

Other references will be noted.


How do Stochastic Epidemic Models Differ from Deterministic Models?

A deterministic model is formulated in terms of fixed, not random, variables whose dynamics are solutions of differential or difference equations.

A stochastic model is formulated in terms of random variables whose probabilistic dynamics depend on solutions to differential or difference equations.

A solution of a deterministic model is a function of time or space and is dependent on the initial data. A solution of a stochastic model is a probability distribution or density function, which is a function of time or space and is dependent on the initial distribution or density. One sample path over time or space is one realization from this distribution.

Stochastic models are used to model the variability inherent in the process due to demography or the environment. Stochastic models are particularly important when the variability is large relative to the mean; e.g., small population size may lead to population extinction.


Whether the Random Variables Associated with the Stochastic Process are Discrete or Continuous Distinguishes Some Types of Stochastic Models.

A random variable X(t; s) of a stochastic process assigns a real value to each outcome s ∈ S in the sample space and a probability (or probability density),

Prob{X(t; s) ∈ A} ∈ [0, 1].

The values of the random variable constitute the state space. For example, the number of cases associated with a disease may have the following discrete or continuous set of values for its state space:

{0, 1, 2, . . .} or [0, N].

The state space can be discrete or continuous and, correspondingly, the random variable is discrete or continuous. For simplicity, the sample-space notation is suppressed and X(t) is used to denote a random variable indexed by time t. The stochastic process is completely defined when the set of random variables {X(t)} are related by a set of rules.


The Choice of Discrete or Continuous Random Variables with a Discrete or Continuous Index Set Defines the Type of Stochastic Model.

Discrete-Time Markov Chain (DTMC): t ∈ {0, ∆t, 2∆t, . . .}, X(t) is a discrete random variable. The term chain implies that the random variable is discrete.

X(t) ∈ {0, 1, 2, . . . , N}

Continuous-Time Markov Chain (CTMC): t ∈ [0, ∞), X(t) is a discrete random variable.

X(t) ∈ {0, 1, 2, . . . , N}

Diffusion Process, Stochastic Differential Equation (SDE): t ∈ [0, ∞), X(t) is a continuous random variable.

X(t) ∈ [0, N]

Note: These are three major types of stochastic processes that will be discussed, but they are not the only types of stochastic processes.


The Following Graphs Illustrate the Solution of a Differential Equation versus Sample Paths of a Stochastic Epidemic Model

[Figure: number of infectives I(t) versus time steps]

Figure 1: Solution of number of infectious individuals for a differential equation of a SIR epidemic model versus three sample paths of a discrete-time Markov chain.


    Part I:

    Discrete-Time Markov Chains

    Definitions, Theorems, and Applications

Let X_n = a discrete random variable defined on a finite state space {1, 2, . . . , N} or a countably infinite state space {1, 2, . . .}. The index set {0, 1, 2, . . .} often represents the progression of time. The variable n is used instead of t.

Definition 1. A discrete-time stochastic process {X_n}, n = 0, 1, 2, . . ., is said to have the Markov property if

Prob{X_n = i_n | X_0 = i_0, . . . , X_{n−1} = i_{n−1}} = Prob{X_n = i_n | X_{n−1} = i_{n−1}},

where the values of i_k ∈ {1, 2, . . .} for k = 0, 1, 2, . . . , n. The stochastic process is then called a Markov chain. A Markov stochastic process is a stochastic process in which the future behavior of the system depends only on the present and not on its past history.

Definition 2. The probability mass function associated with the random variable X_n is denoted {p_i(n)}, where

p_i(n) = Prob{X_n = i}. (1)

    Reference for Part I: [1] Chapters 2 and 3.


One-Step Transition Probabilities

Definition 3. The one-step transition probability p_{ji}(n) is the probability that the process is in state j at time n + 1 given that the process was in state i at the previous time n, for i, j = 1, 2, . . ., that is,

p_{ji}(n) = Prob{X_{n+1} = j | X_n = i}.

Definition 4. If the transition probabilities p_{ji}(n) do not depend on time n, they are said to be stationary or homogeneous. In this case, p_{ji}(n) ≡ p_{ji}. If the transition probabilities are time-dependent, p_{ji}(n), they are said to be nonstationary or nonhomogeneous.

The transition matrix is

P = ( p_11  p_12  p_13  · · ·
      p_21  p_22  p_23  · · ·
      p_31  p_32  p_33  · · ·
      ...   ...   ...       ).

The column elements sum to one, Σ_j p_{ji} = 1. A matrix with this property is called a stochastic matrix; P and P^n are stochastic matrices.


n-Step Transition Probabilities

Definition 5. The n-step transition probability, denoted p_{ji}^{(n)}, is the probability of moving or transferring from state i to state j in n time steps,

p_{ji}^{(n)} = Prob{X_n = j | X_0 = i}.

The n-step transition matrix is P^{(n)} = (p_{ji}^{(n)}).

Then P^{(1)} = P, P^{(0)} = I, the identity matrix, and, in general, P^{(n)} = P^n. Let p(n) = (p_1(n), p_2(n), . . .)^T be the probability mass vector, with Σ_{i=1}^∞ p_i(n) = 1. Then p(n + 1) = P p(n). In general,

p(n + m) = P^{n+m} p(0) = P^n (P^m p(0)) = P^n p(m).
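The vector iteration above can be sketched in code. This is a minimal pure-Python illustration using the slides' column-stochastic convention (columns of P sum to one, p(n+1) = P p(n)); the 3-state matrix is a made-up example, not one from the lectures.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_vec(A, v):
    """Apply A to a column vector v, as in p(n+1) = P p(n)."""
    return [sum(A[i][k] * v[k] for k in range(len(v))) for i in range(len(A))]

def n_step(P, n):
    """P^(n) = P^n, the n-step transition matrix (P^(0) = I)."""
    m = len(P)
    Pn = [[float(i == j) for j in range(m)] for i in range(m)]
    for _ in range(n):
        Pn = mat_mul(P, Pn)
    return Pn

P = [[0.9, 0.2, 0.0],
     [0.1, 0.5, 0.3],
     [0.0, 0.3, 0.7]]        # each COLUMN sums to one (stochastic matrix)

p0 = [1.0, 0.0, 0.0]              # start in state 1 with probability one
p2 = mat_vec(P, mat_vec(P, p0))   # p(2) = P p(1) = P^2 p(0)
```

Because p(2) equals the first column of P^2, and every column of P^n is again a probability vector, this also illustrates the statement that P^n is a stochastic matrix.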


Classification of States

Definition 6. The state j can be reached from the state i (or state j is accessible from state i) if there is a nonzero probability, p_{ji}^{(n)} > 0, for some n ≥ 0, denoted i → j. If j → i and i → j, then i and j are said to communicate, or to be in the same class, denoted i ↔ j.

Definition 7. A set of states C is closed if it is impossible to reach any state outside of C from any state inside C by one-step transitions, i.e., p_{ji} = 0 if i ∈ C and j ∉ C.

The relation i → j can be represented in graph theory as a directed edge. The relation i ↔ j is an equivalence relation. The equivalence relation on the states defines a set of equivalence classes. These equivalence classes are known as the communication classes of the Markov chain.

Definition 8. If there is only one communication class, then the Markov chain is said to be irreducible, but if there is more than one communication class, then the Markov chain is said to be reducible.

A sufficient condition for a Markov chain to be irreducible is the existence of a positive integer n such that p_{ji}^{(n)} > 0 for all i and j; that is, P^n > 0 for some positive integer n. For a finite Markov chain, irreducibility can be checked from the directed graph. A finite Markov chain with states {1, 2, . . . , N} is irreducible if there is a directed path from i to j for every i, j ∈ {1, 2, . . . , N}.


Gambler's Ruin Problem or Random Walk

Example 1. The states {0, 1, 2, . . . , N} represent the amount of money of the gambler. The gambler bets $1 per game and either wins or loses each game. The gambler is ruined if he/she reaches state 0. The probability of winning (moving to the right) is p > 0 and the probability of losing (moving to the left) is q > 0, p + q = 1. This model can also be considered a random walk on a grid with N + 1 points. The one-step transition probabilities are p_{i+1,i} = p and p_{i−1,i} = q for i = 1, . . . , N − 1, with p_00 = 1 and p_NN = 1. All other elements are zero. There are three communication classes: {0}, {1, 2, . . . , N − 1}, and {N}. The Markov chain is reducible. The sets {0} and {N} are closed, but the set {1, 2, . . . , N − 1} is not closed. Also, states 0 and N are absorbing; the remaining states are transient.

[Diagram: states 0, 1, 2, . . . , N on a line]

The transition matrix for the gambler's ruin problem is the (N + 1) × (N + 1) tridiagonal matrix

P = ( 1  q  0  · · ·  0  0
      0  0  q  · · ·  0  0
      0  p  0  · · ·  0  0
      0  0  p  · · ·  0  0
      ...  ...  ...  ...
      0  0  0  · · ·  0  0
      0  0  0  · · ·  p  1 ).


Periodic Chains

Example 2. Suppose the states are {1, 2, . . . , N} with transition matrix

P = ( 0  0  · · ·  0  1
      1  0  · · ·  0  0
      0  1  · · ·  0  0
      ...  ...  ...  ...
      0  0  · · ·  1  0 ).

Beginning in state i, it takes exactly N time steps to return to state i; P^N = I. The chain is periodic with period equal to N.

[Diagram: states 1 → 2 → 3 → · · · → N → 1]

Definition 9. The period of state i, d(i), is the greatest common divisor of all integers n ≥ 1 for which p_{ii}^{(n)} > 0; that is, d(i) = g.c.d.{n | p_{ii}^{(n)} > 0 and n ≥ 1}. If a state i has period d(i) > 1, it is said to be periodic of period d(i). If the period of a state equals one, it is said to be aperiodic.

Periodicity is a class property: i ↔ j implies d(i) = d(j). Thus, we speak of a periodic class or chain or an aperiodic class or chain.

In the gambler's ruin problem or random walk model with absorbing boundaries, Example 1, the classes {0} and {N} are aperiodic. The class {1, 2, . . . , N − 1} has period 2.
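Definition 9 can be checked numerically by scanning the return probabilities p_{ii}^{(n)} up to a cutoff. A sketch, adequate for small chains (the matrices are illustrative examples):

```python
from math import gcd

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def period(P, i, max_n=50):
    """gcd of all n in 1..max_n with p_ii^(n) > 0 (returns 0 if no return seen)."""
    d = 0
    Pn = P
    for n in range(1, max_n + 1):
        if Pn[i][i] > 1e-12:
            d = gcd(d, n)
        Pn = mat_mul(Pn, P)
    return d

# Example 2 with N = 4: a 4-cycle, so every state has period 4.
cycle4 = [[0, 0, 0, 1],
          [1, 0, 0, 0],
          [0, 1, 0, 0],
          [0, 0, 1, 0]]

# A chain with a positive holding probability p_11 > 0 is aperiodic.
lazy = [[0.5, 1.0],
        [0.5, 0.0]]
```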


Transient and Recurrent States

Definition 10. Let f_{ii}^{(n)} denote the probability that, starting from state i, X_0 = i, the first return to state i is at the nth time step, n ≥ 1; that is,

f_{ii}^{(n)} = Prob{X_n = i, X_m ≠ i, m = 1, 2, . . . , n − 1 | X_0 = i}.

The probabilities f_{ii}^{(n)} are known as first return probabilities. Define f_{ii}^{(0)} = 0.

Definition 11. State i is said to be transient if Σ_{n=1}^∞ f_{ii}^{(n)} < 1. State i is said to be recurrent if Σ_{n=1}^∞ f_{ii}^{(n)} = 1.

Definition 12. The mean recurrence time for state i is

μ_{ii} = Σ_{n=1}^∞ n f_{ii}^{(n)}.

Definition 13. If a recurrent state i satisfies μ_{ii} < ∞, then it is said to be positive recurrent, and if it satisfies μ_{ii} = ∞, then it is said to be null recurrent.

An example of a positive recurrent state is an absorbing state. The mean recurrence time of an absorbing state is μ_{ii} = 1.


First Passage Time and Recurrent Chains

The probability f_{ji}^{(n)} for j ≠ i is defined similarly.

Definition 14. Let f_{ji}^{(n)} denote the probability that, starting from state i, X_0 = i, the first passage to state j, j ≠ i, is at the nth time step, n ≥ 1,

f_{ji}^{(n)} = Prob{X_n = j, X_m ≠ j, m = 1, 2, . . . , n − 1 | X_0 = i}, j ≠ i.

The probabilities f_{ji}^{(n)} are known as first passage time probabilities. Define f_{ji}^{(0)} = 0.

Definition 15. If X_0 = i, then the mean first passage time to state j is denoted as μ_{ji} = E(T_{ji}) and defined as

μ_{ji} = Σ_{n=1}^∞ n f_{ji}^{(n)}, j ≠ i.

We use these definitions to verify alternative definitions for transient and recurrent states and recurrent and transient communication classes and chains.

Theorem 1. A state i is recurrent (transient) if and only if Σ_{n=0}^∞ p_{ii}^{(n)} diverges (converges); that is,

Σ_{n=0}^∞ p_{ii}^{(n)} = ∞ (< ∞).

Recurrence and transience are class properties; for i ↔ j, state i is recurrent (transient) iff state j is recurrent (transient).


The 1-D, Unrestricted Random Walk is Transient unless it is Symmetric.

Example 3. Consider the 1-D, unrestricted random walk. The chain is irreducible and periodic of period 2.

[Diagram: states . . . , −2, −1, 0, 1, 2, . . . on a line]

Let p be the probability of moving to the right and q be the probability of moving left, p + q = 1. We verify that the state 0 (the origin) is recurrent iff p = 1/2 = q. If the origin is recurrent, then all states are recurrent because the chain is irreducible. Starting from the origin, it is impossible to return in an odd number of steps,

p_{00}^{(2n+1)} = 0 for n = 0, 1, 2, . . . .

In 2n steps, a return to the origin requires exactly n steps to the right and n steps to the left, in any order. There are

C(2n, n) = (2n)!/(n! n!)

different paths (combinations) that begin and end at the origin. The probability of occurrence of each one of these paths is p^n q^n. Thus,

Σ_{n=0}^∞ p_{00}^{(n)} = Σ_{n=0}^∞ p_{00}^{(2n)} = Σ_{n=0}^∞ C(2n, n) p^n q^n.


We need an asymptotic formula for n!, known as Stirling's formula, to verify recurrence:

n! ~ n^n e^{−n} √(2πn).

Stirling's formula gives the following approximation:

p_{00}^{(2n)} = [(2n)!/(n! n!)] p^n q^n ~ [4^n/√(πn)] p^n q^n = (4pq)^n/√(πn). (2)

When p ≠ q, 4pq < 1 and the series Σ_n (4pq)^n/√(πn) converges by comparison with a geometric series. When p = 1/2 = q, 4pq = 1, and since 1/√(πn) > 1/(2√n), there exists a positive integer N such that for n ≥ N,

(4pq)^n/(2√n) < p_{00}^{(2n)},

so that

Σ_{n=N}^∞ (4pq)^n/(2√n) = (1/2) Σ_{n=N}^∞ 1/√n = ∞.

The latter series diverges because it is a multiple of a divergent p-series. Therefore, by Theorem 1, state 0 is recurrent iff p = 1/2 = q, and because the chain is irreducible, all states are then recurrent. The chain is transient iff p ≠ q; there is a positive probability that an object starting from the origin will never return to the origin. The object tends to +∞ or to −∞.
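Theorem 1's criterion can also be illustrated numerically. The sketch below uses the generating-function identity Σ_{n≥0} C(2n, n) x^n = 1/√(1 − 4x), which for x = pq with p + q = 1 gives the value 1/|p − q| when p ≠ q; the terms are built by a recurrence to avoid overflowing the binomial coefficients.

```python
def partial_sum(p, terms):
    """Partial sum of sum_{n>=0} p_00^(2n), with p_00^(2n) = C(2n,n) (pq)^n."""
    q = 1 - p
    total, term = 0.0, 1.0                            # the n = 0 term is 1
    for n in range(terms):
        total += term
        term *= 2 * (2 * n + 1) / (n + 1) * (p * q)   # ratio of consecutive terms
    return total

transient = partial_sum(0.6, 500)   # p != q: converges toward 1/|p - q| = 5
symmetric = partial_sum(0.5, 500)   # p = q = 1/2: partial sums keep growing
```

The bounded sum for p = 0.6 versus the steadily growing sums for p = 0.5 mirrors the convergence/divergence dichotomy in Theorem 1.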


    Summary of Classification Schemes

    Markov chains or classes can be classified as

    Periodic or Aperiodic

    Then further classified as

    Transient or Recurrent

    Then recurrent MC can be classified as

    Null recurrent or Positive recurrent.

The term ergodic refers to a MC that is aperiodic, irreducible, and recurrent; strongly ergodic if it is positive recurrent, and weakly ergodic if it is null recurrent.


Basic Theorems for Markov Chains (MC)

Theorem 2 (Basic Limit Theorem for Aperiodic MC). An ergodic MC has the property

lim_{n→∞} p_{ij}^{(n)} = 1/μ_{ii},

where μ_{ii} is the mean recurrence time for state i; i and j are any states of the chain. [If μ_{ii} = ∞, then lim_{n→∞} p_{ij}^{(n)} = 0.]

Theorem 3 (Basic Limit Theorem for Periodic MC). A recurrent, irreducible, and d-periodic MC has the property

lim_{n→∞} p_{ii}^{(nd)} = d/μ_{ii},

and p_{ii}^{(m)} = 0 if m is not a multiple of d, where μ_{ii} is the mean recurrence time for state i. [If μ_{ii} = ∞, then lim_{n→∞} p_{ii}^{(nd)} = 0.]

Theorem 4. If j is a transient state of a MC, and i is any other state, then

lim_{n→∞} p_{ji}^{(n)} = 0.

The first two proofs apply discrete renewal theory (Karlin and Taylor, 1975). These theorems also apply to classes in a MC.


The 1-D Unrestricted Symmetric Random Walk is Null Recurrent.

Example 4. The unrestricted random walk model is irreducible and periodic with period 2. The chain is recurrent iff it is a symmetric random walk, p = 1/2 = q (Example 3). Recall that the 2n-step transition probability satisfies

p_{00}^{(2n)} ~ 1/√(πn),

and hence lim_{n→∞} p_{00}^{(2n)} = 0. The Basic Limit Theorem for Periodic Markov chains then states that d/μ_{00} = 0. Thus, μ_{00} = ∞. When p = 1/2 = q, the chain is null recurrent.

It can be shown that in a 2-D symmetric lattice random walk (probability 1/4 of moving in each of 4 directions), the chain is null recurrent. But in a 3-D symmetric lattice random walk (probability 1/6 of moving in each of 6 directions), the chain is transient.


Stationary Probability Distribution

Definition 16. A stationary probability distribution of a MC is a probability vector π = (π_1, π_2, . . .)^T, Σ_{i=1}^∞ π_i = 1, that satisfies

P π = π.

Example 5. Let

P = ( a_1  0    0    · · ·
      a_2  a_1  0    · · ·
      a_3  a_2  a_1  · · ·
      ...  ...  ...      ),

where a_i > 0 and Σ_{i=1}^∞ a_i = 1. There exists no stationary probability distribution because P π = π implies π = 0, the zero vector. It is impossible for the sum of the elements of π to equal one.

According to the Basic Limit Theorem for MC, every strongly ergodic MC converges to a stationary probability distribution. For a periodic MC, this is not the case.


A Strongly Ergodic MC Converges to a Stationary Distribution.

Theorem 5. A strongly ergodic MC with states {1, 2, . . .} and transition matrix P has a unique positive stationary probability distribution π = (π_1, π_2, . . .)^T, P π = π, such that

lim_{n→∞} P^n p(0) = π.

Example 6. The following transition matrix is based on a strongly ergodic MC:

P = ( 0  1/4  0
      1  1/2  1
      0  1/4  0 ).

The stationary probability distribution is π = (1/6, 2/3, 1/6)^T with mean recurrence times μ_11 = 6, μ_22 = 3/2, and μ_33 = 6. The columns of P^n approach the stationary probability distribution,

lim_{n→∞} P^n = ( 1/6  1/6  1/6
                  2/3  2/3  2/3
                  1/6  1/6  1/6 ).
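Example 6 can be verified by iterating p(n+1) = P p(n); a pure-Python sketch. Since the columns of lim P^n equal π and Theorem 2 gives lim p_{ij}^{(n)} = 1/μ_{ii}, the mean recurrence times are recovered from μ_{ii} = 1/π_i.

```python
def mat_vec(P, v):
    return [sum(P[i][k] * v[k] for k in range(len(v))) for i in range(len(P))]

P = [[0, 0.25, 0],
     [1, 0.50, 1],
     [0, 0.25, 0]]          # column-stochastic matrix of Example 6

p = [1.0, 0.0, 0.0]         # any initial distribution works
for _ in range(200):        # iterate p(n+1) = P p(n)
    p = mat_vec(P, p)

mu = [1 / pi for pi in p]   # mean recurrence times mu_ii = 1/pi_i
```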


The Basic Theorems Simplify for Finite MC

Facts: In a finite MC, there are NO null recurrent states, and not all states are transient. In addition, in a finite MC, a stationary probability distribution π is an eigenvector of P corresponding to the eigenvalue one.

Theorem 6. An irreducible finite MC is positive recurrent. In addition, an irreducible, aperiodic finite MC has a unique positive stationary distribution π such that

lim_{n→∞} P^n p(0) = π.

Example 7. The transition matrix for an irreducible finite MC is

P = ( 1/2  1/3
      1/2  2/3 ).

The stationary probability distribution satisfies P π = π,

π = (2/5, 3/5)^T.

Mean recurrence times are μ_11 = 5/2 and μ_22 = 5/3.


Biological Applications of DTMC Processes

We will apply these definitions and theorems to some biological examples:

(1) Gambler's Ruin Problem or Random Walk
(2) Birth and Death Process
(3) Logistic Growth Process
(4) SIS Epidemic Process
(5) SIR Epidemic Process
(6) Chain Binomial Epidemic Process


(1) Gambler's Ruin Problem

The gambler's ruin problem is a classical problem in DTMC theory. The model can also be considered a random walk on a spatial grid with N + 1 grid points. If N → ∞, the spatial grid is semi-infinite.

Probability of Absorption

Let a_k be the probability of absorption into state 0 (ruin) beginning with a capital of k, 1 ≤ k ≤ N − 1, and let b_k be the probability of absorption into state N (win and the game stops) beginning with a capital of k. If there is only one absorbing state at 0, as in a population model, then the probability of absorption is the probability of extinction. If a_{kn} and b_{kn} represent the probabilities of absorption into the two states, 0 and N, respectively, at the nth game or step, then

a_k = Σ_{n=0}^∞ a_{kn} and b_k = Σ_{n=0}^∞ b_{kn}.


We solve a boundary value problem (bvp) for a_k (a difference equation) and use the fact that a_k + b_k = 1 to solve for b_k. The bvp is

a_k = p a_{k+1} + q a_{k−1},
a_0 = 1,
a_N = 0.

This is a homogeneous linear difference equation. Assume a_k = λ^k ≠ 0 to obtain the characteristic equation pλ^2 − λ + q = 0. Then

a_k = [(q/p)^N − (q/p)^k] / [(q/p)^N − 1], p ≠ q,
b_k = [(q/p)^k − 1] / [(q/p)^N − 1], p ≠ q.

If p = 1/2 = q, then

a_k = (N − k)/N and b_k = k/N.
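The closed-form solution above is easy to evaluate; a minimal sketch. Evaluating at k = 50 and N = 100 reproduces the entries of Table 1.

```python
def ruin_prob(k, N, q):
    """a_k: probability of ruin (absorption at 0) from initial capital k."""
    p = 1 - q
    if abs(p - q) < 1e-12:
        return (N - k) / N          # symmetric case p = q = 1/2
    r = q / p
    return (r ** N - r ** k) / (r ** N - 1)

def win_prob(k, N, q):
    """b_k = 1 - a_k: probability of absorption at N."""
    return 1 - ruin_prob(k, N, q)
```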


The Probability of Absorption for N = 100 and k = 50

p = probability of winning (moving to the right),
q = 1 − p = probability of losing (moving to the left).

 Prob.      a_50       b_50
 q = 0.50   0.5        0.5
 q = 0.51   0.880825   0.119175
 q = 0.55   0.999956   0.000044
 q = 0.60   1.00000    0.00000

Table 1: Gambler's ruin problem with a beginning capital of k = 50 and a total capital of N = 100.


Expected Time Until Absorption

In terms of the gambler's ruin problem, we will determine the mean time until absorption. In terms of population models, if there is only one absorbing boundary at 0, this represents population extinction, and the mean time until absorption is the mean time until extinction.

Let τ_k = the expected time until absorption in the gambler's ruin problem. We solve the following bvp:

τ_k = p(1 + τ_{k+1}) + q(1 + τ_{k−1}) = 1 + p τ_{k+1} + q τ_{k−1},
τ_0 = 0 = τ_N.

This is a nonhomogeneous linear difference equation. To solve the homogeneous equation, let τ_k = λ^k ≠ 0 to obtain the characteristic equation pλ^2 − λ + q = 0. A particular solution is τ_k = k/(q − p), q ≠ p.


The Expected Time to Absorption for N = 100 and k = 50

τ_k = k/(q − p) − [N/(q − p)] [1 − (q/p)^k] / [1 − (q/p)^N], q ≠ p,
τ_k = k(N − k), q = p.

 Prob.      a_50       b_50       τ_50
 q = 0.50   0.5        0.5        2500
 q = 0.51   0.880825   0.119175   1904
 q = 0.55   0.999956   0.000044   500
 q = 0.60   1.00000    0.00000    250

Table 2: Gambler's ruin problem with a beginning capital of k = 50 and a total capital of N = 100.
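A sketch of the τ_k formula above; evaluating at k = 50, N = 100 reproduces the τ_50 column of Table 2.

```python
def expected_duration(k, N, q):
    """tau_k: expected number of games until absorption at 0 or N."""
    p = 1 - q
    if abs(p - q) < 1e-12:
        return k * (N - k)           # symmetric case q = p
    r = q / p
    return k / (q - p) - (N / (q - p)) * (1 - r ** k) / (1 - r ** N)
```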


Expected Time Until Absorption as a Function of Initial Capital k for N = 100 and q = 0.55

[Figure: expected duration versus initial capital]

Figure 2: Expected duration of the games, τ_k, for k = 0, 1, 2, . . . , 100, when q = 0.55 and N = 100.


Random Walk on a Semi-Infinite Domain, N → ∞

Probability of Extinction

a_k = 1, p < q,
a_k = (q/p)^k, p ≥ q.

Expected Time Until Extinction

τ_k = k/(q − p), p < q,
τ_k = ∞, p ≥ q.
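A sketch relating these limits to the finite-domain formulas: for p > q, the finite-N ruin probability a_k tends to (q/p)^k as N grows.

```python
def ruin_prob_finite(k, N, q):
    """a_k for the finite problem with absorbing states 0 and N (p != q)."""
    p = 1 - q
    r = q / p
    return (r ** N - r ** k) / (r ** N - 1)

def extinction_prob_infinite(k, q):
    """Limit N -> infinity: extinction is certain if p < q, else (q/p)^k."""
    p = 1 - q
    return 1.0 if p < q else (q / p) ** k
```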


(2) Birth and Death Process

A birth and death process is related to the gambler's ruin problem, but the probabilities of a birth (winning) or death (losing) are not constant; they depend on the size of the population, and size N is not absorbing. Let X_n, n = 0, 1, 2, . . ., denote the size of the population. The birth and death probabilities are b_i and d_i, with b_0 = 0 = d_0, b_N = 0, b_i > 0 for i = 1, 2, . . . , N − 1, and d_i > 0 for i = 1, 2, . . . , N. During the time interval n → n + 1, at most one event occurs, either a birth or a death. Assume

p_{ji} = Prob{X_{n+1} = j | X_n = i}
       = b_i,              if j = i + 1,
         d_i,              if j = i − 1,
         1 − (b_i + d_i),  if j = i,
         0,                if j ≠ i − 1, i, i + 1,

for i = 1, 2, . . ., with p_00 = 1, p_{j0} = 0 for j ≠ 0, and p_{N+1,N} = b_N = 0.


The Transition Matrix for a Birth and Death Process

The transition matrix P has the following form:

P = ( 1  d_1            0              0    · · ·  0                       0
      0  1−(b_1+d_1)    d_2            0    · · ·  0                       0
      0  b_1            1−(b_2+d_2)    d_3  · · ·  0                       0
      0  0              b_2            ...  · · ·  ...                     ...
      ...
      0  0              0              0    · · ·  1−(b_{N−1}+d_{N−1})     d_N
      0  0              0              0    · · ·  b_{N−1}                 1−d_N ).

To ensure that P is a stochastic matrix,

sup_{i ∈ {1,2,...}} {b_i + d_i} ≤ 1.

During each time interval, n to n + 1, the population size either increases by one, decreases by one, or stays the same. This is a reasonable assumption if the time interval is sufficiently small.


Eventual Extinction Occurs with Probability One.

There are two communication classes, {0} and {1, . . . , N}. The first one is positive recurrent and the second one is transient. There exists a unique stationary probability distribution π, P π = π, where π_0 = 1 and π_i = 0 for i = 1, 2, . . . , N. Eventually, population extinction occurs from any initial state (Theorem 4):

lim_{n→∞} P^n p(0) = π.

However, the expected time to extinction may be very long!


Expected Time to Extinction in a Birth and Death Process.

Let τ_k = the expected time until extinction for a population with initial size k. Then

τ_k = b_k(1 + τ_{k+1}) + d_k(1 + τ_{k−1}) + (1 − (b_k + d_k))(1 + τ_k)
    = 1 + b_k τ_{k+1} + d_k τ_{k−1} + (1 − b_k − d_k) τ_k,

and τ_N = 1 + d_N τ_{N−1} + (1 − d_N) τ_N. This can be expressed in matrix form: D τ = c, where τ = (τ_0, τ_1, . . . , τ_N)^T, c = (0, 1, . . . , 1)^T, and D is

D = ( 1     |  0         0         0      · · ·  0      0
      −d_1  |  b_1+d_1   −b_1      0      · · ·  0      0
      0     |  −d_2      b_2+d_2   −b_2   · · ·  0      0
      ...   |  ...       ...       ...    · · ·  ...    ...
      0     |  0         0         0      · · ·  −d_N   d_N )
  = ( 1    0
      D_1  D_N ).

Definition 17. An N × N matrix A = (a_{ij}) is diagonally dominant if

|a_{ii}| ≥ Σ_{j=1, j≠i}^N |a_{ij}|.

Matrix A is irreducibly diagonally dominant if A is irreducible and diagonally dominant, with strict inequality for at least one i.


Expected Time until Extinction in a Birth and Death Process.

Matrix D_N is irreducibly diagonally dominant, so det(D_N) ≠ 0, and det(D) = det(D_N). Thus, D is nonsingular and the solution for the expected time until extinction is

τ = D^{−1} c.

Because matrix D is tridiagonal, simple recursion relations can be applied to obtain explicit formulas for the τ_k, k = 1, 2, . . . , N.

Theorem 7. Suppose {X_n}, n = 0, 1, 2, . . ., is a general birth and death process with X_0 = m ≥ 1 satisfying b_0 = 0 = d_0, b_i > 0 for i = 1, 2, . . . , N − 1, and d_i > 0 for i = 1, 2, . . . , N. The expected time until population extinction satisfies

τ_m = 1/d_1 + Σ_{i=2}^N (b_1 · · · b_{i−1})/(d_1 · · · d_i), m = 1,

τ_m = τ_1 + Σ_{s=1}^{m−1} [ (d_1 · · · d_s)/(b_1 · · · b_s) · Σ_{i=s+1}^N (b_1 · · · b_{i−1})/(d_1 · · · d_i) ], m = 2, . . . , N.

Strictly diagonally dominant and irreducibly diagonally dominant matrices are nonsingular.
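Theorem 7 can be checked against a direct solve of the tridiagonal system D τ = c. The sketch below uses the linear birth and death probabilities of Example 8 (b_i = 0.03i, d_i = 0.02i, N = 20) and eliminates the system with the standard forward/backward tridiagonal sweep.

```python
N = 20
b = [0.0] + [0.03 * i for i in range(1, N)] + [0.0]   # b_0 = 0 and b_N = 0
d = [0.0] + [0.02 * i for i in range(1, N + 1)]       # d_0 = 0

def tau_formula(m):
    """Theorem 7: expected time to extinction from initial size m >= 1."""
    B, D = [1.0], [1.0]                  # prefix products b_1..b_i and d_1..d_i
    for i in range(1, N + 1):
        B.append(B[-1] * b[i])
        D.append(D[-1] * d[i])
    def S(s):                            # sum_{i=s+1}^N (b_1..b_{i-1})/(d_1..d_i)
        return sum(B[i - 1] / D[i] for i in range(s + 1, N + 1))
    tau1 = S(0)                          # equals 1/d_1 + sum_{i=2}^N (...)
    return tau1 + sum(D[s] / B[s] * S(s) for s in range(1, m))

def tau_solve():
    """Solve the tridiagonal system D tau = c for (tau_1, ..., tau_N)."""
    lower = [-d[k] for k in range(1, N + 1)]        # coefficient of tau_{k-1}
    diag = [b[k] + d[k] for k in range(1, N + 1)]   # b_N = 0 gives d_N in row N
    upper = [-b[k] for k in range(1, N + 1)]        # coefficient of tau_{k+1}
    rhs = [1.0] * N
    for k in range(1, N):                # forward elimination (tau_0 = 0)
        w = lower[k] / diag[k - 1]
        diag[k] -= w * upper[k - 1]
        rhs[k] -= w * rhs[k - 1]
    tau = [0.0] * N
    tau[-1] = rhs[-1] / diag[-1]
    for k in range(N - 2, -1, -1):       # back substitution
        tau[k] = (rhs[k] - upper[k] * tau[k + 1]) / diag[k]
    return tau                           # tau[k - 1] approximates tau_k
```

The two computations agree to floating-point accuracy, and τ_m increases with the initial population size m, as Figure 3 shows.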


An Example of a Simple Birth and Death Process with N = 20.

Example 8. Suppose the maximal population size is N = 20, where the birth and death probabilities are linear: b_i = bi = 0.03i for i = 1, 2, . . . , 19, and d_i = di = 0.02i for i = 1, 2, . . . , 20, a simple birth and death process. Since b > d, there is population growth.

[Figure: expected duration versus initial population size, b > d]

Figure 3: Expected time until population extinction when the maximal population size is N = 20, b_i = 0.03i, and d_i = 0.02i.


(3) Logistic Growth Process

Assume b_i − d_i = r i (1 − i/K), where r = intrinsic growth rate and K = carrying capacity.

Two cases:

(a) b_i = r i − r i^2/(2K) and d_i = r i^2/(2K), i = 0, 1, 2, . . . , 2K

(b) b_i = r i for i = 0, 1, 2, . . . , N − 1, b_i = 0 for i ≥ N, and d_i = r i^2/K, i = 0, 1, . . . , N


We Plot the Expected Time to Extinction for Two Cases.

Example 9. Let r = 0.015, K = 10, and N = 20. The population persists much longer in case (a).

[Figure: two panels, (a) and (b), expected time to extinction versus initial population size]

Figure 4: Expected time until population extinction when the birth and death rates satisfy (a) and (b) and the parameters are r = 0.015, K = 10, and N = 20.

Quasistationary Probability Distribution

When the expected time to extinction is very long, it is reasonable to examine the dynamics of the process prior to extinction. Define the probability conditioned on nonextinction:

qi(n) = Prob{Xn = i | Xj ≠ 0, j = 0, 1, 2, . . . , n} = pi(n)/(1 − p0(n))

for i = 1, 2, . . . , N. Note q(n) = (q1(n), q2(n), . . . , qN(n))^T defines a probability distribution because

Σ_{i=1}^N qi(n) = [Σ_{i=1}^N pi(n)]/(1 − p0(n)) = (1 − p0(n))/(1 − p0(n)) = 1.

Let Qn = the random variable for the population size at time n conditional on nonextinction; qi(n) = Prob{Qn = i}. This quasistationary process is a finite irreducible MC. The stationary probability distribution for this process is denoted as q; q is referred to as the quasistationary probability distribution.


Quasistationary Probability Distribution

Difference equations for qi(n) can be derived from those for pi(n) [i.e., p(n + 1) = P p(n)]. From these difference equations the quasistationary probability distribution q can be determined. It will be seen that q cannot be calculated by a direct method but by an indirect method, an iterative scheme.

An approximation to the process {Qn} yields a strongly ergodic MC, {Q̃n}, with associated probability distribution q̃(n). For this new process, a transition matrix P̃ and the limiting positive stationary probability distribution q̃ can be defined. The stationary probability distribution q̃ is an approximation for the quasistationary probability distribution q.

Quasistationary Probability Distribution

Difference equations for qi(n + 1) are derived from the identity p(n + 1) = P p(n):

qi(n + 1) = pi(n + 1)/(1 − p0(n + 1))
          = [pi(n + 1)/(1 − p0(n))] · [(1 − p0(n))/(1 − p0(n + 1))]
          = [pi(n + 1)/(1 − p0(n))] · [(1 − p0(n))/(1 − p0(n) − d1 p1(n))],

since p0(n + 1) = p0(n) + d1 p1(n), or

qi(n + 1)(1 − d1 q1(n)) = pi(n + 1)/(1 − p0(n)).

Using the identity for pi(n + 1), the following relation is obtained:

qi(n + 1)[1 − d1 q1(n)] = bi−1 qi−1(n) + (1 − bi − di) qi(n) + di+1 qi+1(n)

for i = 1, 2, . . . , N, b0 = 0, and qi(n) = 0 for i ∉ {1, 2, . . . , N}. It is similar to the difference equation satisfied by pi(n) except for an additional factor multiplying qi(n + 1). An analytical solution for q cannot be found directly from these equations since the coefficients depend on n, but q can be found by an iterative method.
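One concrete version of the iterative scheme: summing the recursion over i shows that 1 − d1 q1(n) is exactly the probability mass remaining after one step, so iterating amounts to applying the substochastic matrix obtained from P by deleting row and column 0 and then renormalizing. A sketch, with the logistic case (b) parameters of Example 9 assumed as a test case:

```python
import numpy as np

# Assumed test case: logistic growth, case (b), r = 0.015, K = 10, N = 20.
r, K, N = 0.015, 10, 20
b = np.array([r * i if i < N else 0.0 for i in range(N + 1)])  # b_i = r*i, b_i = 0 for i >= N
d = np.array([r * i ** 2 / K for i in range(N + 1)])           # d_i = r*i^2/K

# Substochastic matrix on the transient states 1..N (P with row/column 0 deleted).
A = np.zeros((N, N))
for i in range(1, N + 1):
    A[i - 1, i - 1] = 1 - b[i] - d[i]
    if i > 1:
        A[i - 1, i - 2] = b[i - 1]   # transition from state i-1 up to i
    if i < N:
        A[i - 1, i] = d[i + 1]       # transition from state i+1 down to i

q = np.ones(N) / N
for _ in range(20000):
    q = A @ q
    q /= q.sum()   # dividing by 1 - d_1*q_1(n): conditioning on nonextinction
```

After convergence, q approximates the quasistationary distribution; its mode sits near the carrying capacity K.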

Approximate Quasistationary Probability Distribution

To approximate the quasistationary probability distribution q, let d1 = 0. That is, when the population size equals one, the probability of dying is zero. Then

qi(n + 1) = bi−1 qi−1(n) + (1 − bi − di) qi(n) + di+1 qi+1(n), i = 2, . . . , N − 1,

q1(n + 1) = (1 − b1) q1(n) + d2 q2(n), and qN(n + 1) = bN−1 qN−1(n) + (1 − dN) qN(n). The new transition matrix corresponding to this approximation satisfies

      ( 1 − b1   d2              0               · · ·    0      )
      ( b1       1 − (b2 + d2)   d3              · · ·    0      )
P̃ =  ( 0        b2              1 − (b3 + d3)   · · ·    0      )
      ( ...                                               ...    )
      ( 0        · · ·           1 − (bN−1 + dN−1)        dN     )
      ( 0        · · ·           bN−1                     1 − dN )

Note that P̃ is a submatrix of the original transition matrix P, where the first column and first row of P are deleted and d1 = 0. The MC q̃(n + 1) = P̃ q̃(n) is strongly ergodic, and thus q̃(n) converges to a unique stationary probability distribution q̃, where

q̃i+1 = (bi · · · b1)/(di+1 · · · d2) q̃1 and Σ_{i=1}^N q̃i = 1.
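The product formula gives q̃ directly. A sketch under the same assumed case (b) parameters (Example 9):

```python
# Assumed parameters: logistic case (b), r = 0.015, K = 10, N = 20.
r, K, N = 0.015, 10, 20
b = [r * i for i in range(N)]                # b_i = r*i for i = 0..N-1
d = [r * i ** 2 / K for i in range(N + 1)]   # d_i = r*i^2/K

w = [1.0]                                    # unnormalized q~_1, q~_2, ..., q~_N
for i in range(1, N):
    w.append(w[-1] * b[i] / d[i + 1])        # q~_{i+1} = (b_i ... b_1)/(d_{i+1} ... d_2) q~_1
total = sum(w)
q_tilde = [wi / total for wi in w]           # normalize so the q~_i sum to 1
```

The normalized vector q_tilde peaks near the carrying capacity K.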

Approximate Quasistationary Probability Distribution

Example 10. The approximate quasistationary probability distribution q̃ is compared to the quasistationary probability distribution q when r = 0.015, K = 10, and N = 20 in cases (a) and (b). Both distributions show good agreement for N = 20, but when N = 10 and K = 5 the two distributions differ, especially for values near zero.


Figure 5: Quasistationary probability distribution q (solid curve) and the approximate quasistationary probability distribution q̃ (diamond marks) when r = 0.015, K = 10, and N = 20 in cases (a) and (b). In (c), r = 0.015, K = 5, N = 10, where bi = ri and di = ri²/K.

Probability Distribution Associated with Logistic Growth when N = 100, K = 50 and X0 = 5.

p(n) = (p0(n), p1(n), . . . , pN(n))^T, n = 0, 1, . . . , 2000.


Figure 6: The stochastic logistic probability distribution, p(n), r = 0.004, K = 50, N = 100, X0 = 5.

(4) To Understand the Stochastic SIS Epidemic Model, We Review the Dynamics of the Deterministic SIS Epidemic Model.

Deterministic SIS:

S ⇄ I

dS/dt = −(β/N) S I + (b + γ) I
dI/dt = (β/N) S I − (b + γ) I

where β > 0, γ > 0, N > 0, b ≥ 0, and S(t) + I(t) = N.

The Dynamics of the Deterministic SIS Epidemic Model Depend on the Basic Reproduction Number.

β = transmission rate
b = birth rate = death rate
γ = recovery rate
N = total population size = constant.

Basic Reproduction Number:

R0 = β/(b + γ)

If R0 ≤ 1, then lim_{t→∞} I(t) = 0. If R0 > 1, then lim_{t→∞} I(t) = N(1 − 1/R0).


Formulation of the SIS Stochastic MC Epidemic Model

Let In denote the discrete random variable for the number of infected (and infectious) individuals, with associated probability function

pi(n) = Prob{In = i},

where i = 0, 1, 2, . . . , N is the total number infected at time n. The probability distribution is

p(n) = (p0(n), p1(n), . . . , pN(n))^T

for n = 0, 1, 2, . . . . Now we relate the random variables {In}, indexed by time n, by defining the probability of a transition from state i to state j, i → j, at time n + 1 as

pji(n) = Prob{In+1 = j | In = i}.

For the Stochastic Model, Assume that the Time Interval is Sufficiently Small, Such that the Number of Infectives Changes by at Most One.

That is,

i → i + 1, i → i − 1, or i → i.

Either there is a new infection, birth, death, or a recovery. Therefore, the transition probabilities are

pji(n) = { βi(N − i)/N = bi,                              j = i + 1
           (b + γ)i = di,                                 j = i − 1
           1 − [βi(N − i)/N + (b + γ)i] = 1 − [bi + di],  j = i
           0,                                             j ≠ i + 1, i, i − 1.

Then the SIS epidemic process is similar to a birth and death process.
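These transition probabilities can be simulated directly. A sketch, with parameter values matching Figure 7 (β = 0.01, b = γ = 0.0025, N = 100, I0 = 2 assumed):

```python
import random

def sis_path(beta=0.01, b=0.0025, gamma=0.0025, N=100, i0=2, steps=2000, seed=1):
    """Simulate one sample path of the DTMC SIS model."""
    rng = random.Random(seed)
    i, path = i0, [i0]
    for _ in range(steps):
        p_up = beta * i * (N - i) / N   # b_i: new infection
        p_down = (b + gamma) * i        # d_i: birth/death or recovery
        u = rng.random()
        if u < p_up:
            i += 1
        elif u < p_up + p_down:
            i -= 1                      # otherwise the state is unchanged
        path.append(i)
    return path

path = sis_path()
```

Running several seeds reproduces the qualitative picture of Figure 7: some paths die out quickly, others fluctuate around the endemic level.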

Three Sample Paths of the DTMC SIS Model are Compared to the Solution of the Deterministic Model.


Figure 7: R0 = 2, β = 0.01, b = 0.0025 = γ, N = 100, S0 = 98, and I0 = 2.

Even Though R0 > 1, the DTMC SIS Epidemic Model Predicts the Epidemic Ends.

When R0 > 1, the deterministic SIS epidemic model predicts that an endemic equilibrium is reached. This is not the case for the stochastic SIS epidemic model:

lim_{n→∞} p0(n) = 1.

As mentioned earlier, this absorption at zero may take an exponentially long time. But when N is large and I0 = i is small, for large time n,

0 < p0(n) ≈ P0 = constant < 1.

An Estimate for P0 Can be Obtained From the Gambler's Ruin Problem on a Semi-Infinite Domain.

When N is large and i is small:

Probability Movement Right = p = (β/N) i(N − i) ≈ βi
Probability Movement Left = q = (b + γ)i

Based on a random walk model on a semi-infinite domain, an estimate for the probability of no epidemic (probability of ruin) with initial capital k is ak = (q/p)^k. Suppose I0 = k. Then

Probability of no Epidemic = P0 ≈ ((b + γ)/β)^k = (1/R0)^k.
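This estimate can be checked by simulation: count sample paths that hit 0 before an assumed "takeoff" threshold. A sketch (the threshold and trial count are illustrative choices, not from the slides):

```python
import random

beta, b, gamma, N, k = 0.01, 0.0025, 0.0025, 100, 3   # R0 = beta/(b+gamma) = 2
cutoff = 25                                           # assumed epidemic-takeoff level

def goes_extinct(rng, max_steps=200_000):
    """Return True if the SIS chain, started at k, hits 0 before the cutoff."""
    i = k
    for _ in range(max_steps):
        if i == 0:
            return True
        if i >= cutoff:
            return False
        p_up = beta * i * (N - i) / N
        p_down = (b + gamma) * i
        u = rng.random()
        if u < p_up:
            i += 1
        elif u < p_up + p_down:
            i -= 1
    return False

rng = random.Random(0)
trials = 1000
frac = sum(goes_extinct(rng) for _ in range(trials)) / trials
# frac should be close to the estimate (1/R0)**k = 0.125
```

With R0 = 2 and k = 3 the branching-process estimate gives P0 ≈ 1/8; the Monte Carlo fraction lands nearby (slightly above, since the i(N − i)/N factor weakens the upward drift).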

Graphs of the Probability Distribution are Bimodal, Showing the Probability of No Epidemic and the Quasistationary Distribution.

R0 = 2, I(0) = 3, P0 ≈ 1/8

[Figure: surface plot of Prob{In = i} versus state i and time step n; the distribution is bimodal, with mass at i = 0 and mass near the quasistationary mode.]

(5) We Review the Dynamics of the Deterministic SIR Epidemic Model.

Deterministic SIR: Basic Reproduction Number R0 = β/(b + γ)

S → I → R

dS/dt = −(β/N) S I + b(I + R)
dI/dt = (β/N) S I − (b + γ) I
dR/dt = γ I − b R

If R0 > 1 and b > 0, then lim_{t→∞} I(t) = I* > 0. If R0 > 1 and b = 0, then lim_{t→∞} I(t) = 0; an epidemic occurs if R0 S(0)/N > 1. If R0 ≤ 1, then lim_{t→∞} I(t) = 0.

Formulation of a DTMC SIR Epidemic Model Results in a Bivariate Process.

Sn + In + Rn = N = maximum population size. Let Sn and In denote discrete random variables for the number of susceptible and infected individuals, respectively. These two variables have a joint probability function

p(s,i)(n) = Prob{Sn = s, In = i},

where Rn = N − Sn − In. For this stochastic process, we define transition probabilities as follows:

p(s+k,i+j),(s,i) = Prob{(ΔSn, ΔIn) = (k, j) | (Sn, In) = (s, i)}

  = { βsi/N,                        (k, j) = (−1, 1)
      γi,                           (k, j) = (0, −1)
      bi,                           (k, j) = (1, −1)
      b(N − s − i),                 (k, j) = (1, 0)
      1 − [βsi/N + γi + b(N − s)],  (k, j) = (0, 0)
      0,                            otherwise.

In multivariate processes the transition matrix is often too large and complex to write down.
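Even when the transition matrix is too large to write down, a sample path of the bivariate chain can be simulated directly from the transition probabilities; a sketch, with the βsi/N infection term assumed and parameter values following Figure 8:

```python
import random

def sir_path(beta=0.01, b=0.0, gamma=0.005, N=100, s0=98, i0=2, steps=2000, seed=2):
    """Simulate one sample path (S_n, I_n) of the bivariate DTMC SIR model."""
    rng = random.Random(seed)
    s, i = s0, i0
    hist = [(s, i)]
    for _ in range(steps):
        p_inf = beta * s * i / N   # (k, j) = (-1, +1): new infection (assumed form)
        p_rec = gamma * i          # (0, -1): recovery
        p_bi = b * i               # (+1, -1): death of an infective, susceptible birth
        p_br = b * (N - s - i)     # (+1, 0): death of a recovered, susceptible birth
        u = rng.random()
        if u < p_inf:
            s, i = s - 1, i + 1
        elif u < p_inf + p_rec:
            i -= 1
        elif u < p_inf + p_rec + p_bi:
            s, i = s + 1, i - 1
        elif u < p_inf + p_rec + p_bi + p_br:
            s += 1                 # otherwise no change
        hist.append((s, i))
    return hist

hist = sir_path()
```

With b = 0 the susceptible count never increases, so each path traces a single epidemic wave as in Figure 8.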

Three Sample Paths of the DTMC SIR Epidemic Model are Compared to the Solution of the Deterministic Model.


Figure 8: R0 = 2, R0 S0/N = 1.96, β = 0.01, b = 0, γ = 0.005, N = 100, S0 = 98, and I0 = 2.


    (6) Chain Binomial Epidemic Model

There are two basic models, known as the Greenwood and Reed-Frost models, originally developed in 1928 and 1931, respectively. These models apply to small epidemics or to outbreaks within a household.

Both models are DTMC models that depend on the two random variables St and It, i.e., a bivariate MC. The latent period is the time from t to t + 1 and the infectious period is contracted to a point. Therefore, at time t + 1, there are only newly infected individuals from the previous time t:

    St+1 + It+1 = St.

    Given there are i infectives, let pi = probability that a susceptible individual

    does not become infected during the time period t to t + 1.


Greenwood Model: pi = p

The transition probability for (st, it) → (st+1, it+1) depends only on p, st, and st+1 and is based on the binomial probability distribution:

p_{st+1, st} = C(st, st+1) p^(st+1) (1 − p)^(st − st+1),

where C(st, st+1) denotes the binomial coefficient. Sample paths are denoted {s0, s1, . . . , st−1, st}. The epidemic stops when st = st−1.

E(St+1 | St = st) = p st
E(It+1 | St = st) = (1 − p) st.

Sample Paths of the Greenwood Chain-Binomial Model.


Figure 9: Four sample paths for the Greenwood chain binomial model when s0 = 6 and i0 = 1: {6, 6}, {6, 5, 5}, {6, 4, 3, 2, 1, 1}, and {6, 2, 1, 0, 0}.

Reed-Frost Model: pi = p^i

The transition probability for (st, it) → (st+1, it+1) is again based on the binomial probability distribution but depends on p, st, st+1, and it:

p_{(s,i)t+1, (s,i)t} = C(st, st+1) (p^(it))^(st+1) (1 − p^(it))^(st − st+1)

E(St+1 | St = st, It = it) = st p^(it)
E(It+1 | St = st, It = it) = st (1 − p^(it)).

The Duration and Size of the Epidemic Can be Calculated.

Sample Path {s0, . . . , sT}   Duration T   Size W   Greenwood        Reed-Frost
{3, 3}                         1            0        p^3              p^3
{3, 2, 2}                      2            1        3(1−p)p^4        3(1−p)p^4
{3, 2, 1, 1}                   3            2        6(1−p)^2 p^4     6(1−p)^2 p^4
{3, 1, 1}                      2            2        3(1−p)^2 p^2     3(1−p)^2 p^3
{3, 2, 1, 0, 0}                4            3        6(1−p)^3 p^3     6(1−p)^3 p^3
{3, 2, 0, 0}                   3            3        3(1−p)^3 p^2     3(1−p)^3 p^2
{3, 1, 0, 0}                   3            3        3(1−p)^3 p       3(1−p)^3 p(1+p)
{3, 0, 0}                      2            3        (1−p)^3          (1−p)^3

Table 3: All of the sample paths, their duration, and size are computed for the Greenwood and Reed-Frost models when s0 = 3 and i0 = 1.
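The table entries can be verified by exhaustively enumerating the chains. A sketch; chain_probs is a hypothetical helper written for this illustration, not from the slides:

```python
from math import comb

def chain_probs(s0, i0, p, reed_frost):
    """Enumerate all sample paths {s0, s1, ...} and their probabilities."""
    out = {}
    def rec(s, i, path, prob):
        if i == 0:                       # epidemic has stopped
            out[tuple(path)] = out.get(tuple(path), 0.0) + prob
            return
        if s == 0:                       # no susceptibles left; one final step
            key = tuple(path + [0])
            out[key] = out.get(key, 0.0) + prob
            return
        q = p ** i if reed_frost else p  # per-susceptible escape probability
        for s_next in range(s + 1):
            pr = comb(s, s_next) * q ** s_next * (1 - q) ** (s - s_next)
            rec(s_next, s - s_next, path + [s_next], prob * pr)
    rec(s0, i0, [s0], 1.0)
    return out

p = 0.7
gw = chain_probs(3, 1, p, reed_frost=False)
rf = chain_probs(3, 1, p, reed_frost=True)
```

Each dictionary sums to one, and individual entries match Table 3; for instance, the path {3, 1, 1} has probability 3(1−p)²p² under Greenwood and 3(1−p)²p³ under Reed-Frost.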


    This Concludes Part I on DTMC.


An Intensive Course in Stochastic Processes and Stochastic Differential Equations in Mathematical Biology

Part II

Branching Processes

Linda J. S. Allen
Texas Tech University
Lubbock, Texas U.S.A.

National Center for Theoretical Sciences
National Tsing Hua University

August 2008



Part II: Branching Processes

The subject of branching processes began in 1845 with Bienaymé and was advanced in the 1870s with the work of the Reverend Henry William Watson, a clergyman and mathematician, and the biometrician Francis Galton.

Galton in 1873 posed a problem and two questions whose solutions were not resolved until 1930:

Suppose adult males (N in number) in a population each have different surnames. Suppose in each generation, a0 percent of the adult males have no male children who survive to adulthood; a1 have one such child; a2 have two, and so on up to a5, who have five.

1. Find what proportion of the surnames become extinct after r generations.
2. Find how many instances there are of the same surname being held by m persons.

Reference for Part II: [1], Chapter 4.

We will Discuss Single-Type and Multi-Type Branching Processes (BP)

A. Single-Type Galton-Watson BP: Each generation, keep track of only one type of individual, cell, etc.

    Applications:

    (1) Family Names

    (2) Cell Division

    (3) Network Theory

B. Multi-Type Galton-Watson BP: Each generation, keep track of k types of individuals, cells, etc.

    Application:

    (1) k Different Age Groups in a Population.

A. Single-Type Galton-Watson Branching Processes

The type of problem studied by Galton and Watson is appropriately named a Galton-Watson Branching Process.

Discrete-time branching processes are DTMC. Branching processes are frequently studied separately from Markov chains because

(a) there is a wide variety of applications of branching processes: electron multipliers, neutron chain reactions, population growth, survival of mutant genes, changes in DNA and chromosomes, cell cycle, cancer cells, chemotherapy, and network theory;

(b) techniques other than transition matrices are used to study their behavior: probability generating functions.

Assumptions in the Galton-Watson Branching Process

Let X0 = total population size at the zeroth generation and let Xn = total population size at the nth generation.

The process {Xn}, n = 0, 1, 2, . . . , has state space {0, 1, 2, . . .} and will be referred to as a branching process (bp).

Each individual in generation n gives birth to Y offspring of the same type in the next generation (single-type bp), where Y is a random variable that takes values in {0, 1, 2, . . .}. The offspring distribution is

Prob{Y = k} = pk, k = 0, 1, 2, . . . .

Each individual gives birth independently of other individuals.

An Illustration of a BP: A Stochastic Realization or Sample Path

Let X0 = 1 = population size (a married couple), where the family history is followed over time.

Figure 1: A sample path or stochastic realization of a branching process {Xn}. In the first generation, four individuals are born, X1 = 4. The four individuals give birth to three, zero, four, and one individuals, respectively, making a total of eight individuals in generation 2, X2 = 8.

We Digress Here to Talk about Generating Functions

Assume X is a discrete random variable with state space {0, 1, 2, . . .}. Let the probability mass function of X equal

Prob{X = j} = pj, j = 0, 1, 2, . . . , where Σ_{j=0}^∞ pj = 1.

Mean or First Moment: μX = E(X) = Σ_{j=0}^∞ j pj

Variance or Second Moment about the Mean:

σ²X = E[(X − μX)²] = E(X²) − μX² = Σ_{j=0}^∞ j² pj − μX².

nth Moment: E(X^n) = Σ_{j=0}^∞ j^n pj

Definition of Probability Generating Function

Definition 1. The probability generating function (pgf) of X is defined on a subset of the reals,

PX(t) = E(t^X) = Σ_{j=0}^∞ pj t^j, some t ∈ R.

Because Σ_{j=0}^∞ pj = 1, the sum converges absolutely for |t| ≤ 1, implying PX(t) is well defined for |t| ≤ 1 and infinitely differentiable for |t| < 1. As the name implies, the pgf generates the probabilities associated with the distribution:

PX(0) = p0, P′X(0) = p1, P″X(0) = 2! p2.

In general, the kth derivative of the pgf of X satisfies

P(k)X(0) = k! pk.

The PGF can be Used to Calculate the Mean and Variance.

Note that P′X(t) = Σ_{j=1}^∞ j pj t^(j−1) for −1 < t < 1. The mean is

P′X(1) = Σ_{j=1}^∞ j pj = E(X) = μX.

Also, P″X(t) = Σ_{j=1}^∞ j(j − 1) pj t^(j−2) implies

P″X(1) = Σ_{j=1}^∞ j(j − 1) pj = E(X²) − E(X).

The variance is

σ²X = E(X²) − E(X) + E(X) − [E(X)]² = P″X(1) + P′X(1) − [P′X(1)]².

Other Generating Functions

Definition 2. The moment generating function (mgf) is

MX(t) = E(e^(tX)) = Σ_{j=0}^∞ pj e^(jt), some t ∈ R.

The cumulant generating function (cgf) is the natural logarithm of the moment generating function,

KX(t) = ln[MX(t)].

Note MX(t) is always well defined for t ≤ 0. The mgf generates the moments:

MX(0) = 1, M′X(0) = μX = E(X), M″X(0) = E(X²), M(k)X(0) = E(X^k).

An Example Applying PGF and MGF

Poisson: pj = λ^j e^(−λ)/j!, j = 0, 1, 2, . . . , with parameter λ > 0. The pgf and mgf are

PX(t) = e^(λ(t−1)) and MX(t) = e^(λ(e^t − 1)),

so that μX = P′X(1) = λ and σ²X = P″X(1) + P′X(1) − [P′X(1)]² = λ² + λ − λ² = λ.
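For the Poisson distribution with parameter λ, the pgf is PX(t) = e^(λ(t−1)) and the mean and variance both equal λ; this can be checked numerically from a truncated pmf (a sketch, with λ = 3 assumed):

```python
from math import exp, factorial

lam = 3.0
# truncated Poisson pmf; the tail beyond 80 terms is negligible
p = [exp(-lam) * lam ** j / factorial(j) for j in range(80)]

def P(t):
    """pgf P_X(t) = sum_j p_j t^j (should equal exp(lam*(t-1)))."""
    return sum(pj * t ** j for j, pj in enumerate(p))

mean = sum(j * pj for j, pj in enumerate(p))              # = lam
var = sum(j * j * pj for j, pj in enumerate(p)) - mean ** 2   # = lam
```

P(1) returns 1 (total probability), and both moment sums agree with λ to numerical precision.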


h0(t) = t.

Recall Y is the offspring random variable; subscripts on Y index the offspring. In the next generation, each individual gives birth to k individuals with probability pk. The pgf of X1 is

h1(t) = Σ_{k=0}^∞ pk t^k = f(t).

Then X2 = Y1 + · · · + YX1, because each of the X1 individuals gives birth to Y individuals and the sum of all these births is X2. Note for a fixed sum of m iid random variables Yi, Σ_{i=1}^m Yi, the pgf is

P_{ΣYi}(t) = E(t^(Y1) · · · t^(Ym)) = E(t^(Y1)) · · · E(t^(Ym)) = [f(t)]^m.

The PGF of X2 is a Composition.

But X2 is a sum of a random number X1 of iid Yi. The pgf of X2 is

h2(t) = E(t^(Y1 + · · · + YX1)) = Σ_{m=0}^∞ Prob{X1 = m} [f(t)]^m = h1(f(t)) = f(f(t)).

Properties of the offspring pgf used below include:

(4) f′(t) = Σ_{k=1}^∞ k pk t^(k−1) > 0 for t ∈ (0, 1], where m = f′(1).

(5) f″(t) = Σ_{k=2}^∞ k(k − 1) pk t^(k−2) > 0 for t ∈ (0, 1).

Main Theorem for Branching Processes.

Theorem 1. Assume the offspring distribution {pk} and the pgf f(t) satisfy properties (1)–(5). In addition, assume X0 = 1. If m ≤ 1, then

lim_{n→∞} Prob{Xn = 0} = lim_{n→∞} p0(n) = 1,

and if m > 1, then there exists q < 1 such that f(q) = q and

lim_{n→∞} Prob{Xn = 0} = lim_{n→∞} p0(n) = q.

If m ≤ 1, then Theorem 1 states that the probability of ultimate extinction is one. If m > 1, then there is a positive probability 1 − q that the bp does not become extinct (e.g., a family name does not become extinct, a mutant gene becomes established, a population does not die out).

Indication of Proof

(1) The sequence {p0(n)} is monotone increasing and bounded above:

lim_{n→∞} p0(n) = q.

(2) The limit is a fixed point of f:

q = lim_{n→∞} p0(n) = lim_{n→∞} f(p0(n − 1)) = f(q).

(3) m ≤ 1 iff f′(t) < 1 for t ∈ [0, 1).


Figure 3: Two different probability generating functions y = f(t) intersect y = t in either one or two points on [0, 1].

All States are Transient in the Galton-Watson BP, Except State Zero

Note that the zero state is absorbing. In terms of MC theory, the one-step transition probability

p00(n) = 1.

Theorem 2. Assume the offspring distribution {pk} and the pgf f(t) satisfy properties (1)–(5). In addition, assume X0 = 1. Then the states 1, 2, . . . are transient. In addition, if the mean m > 1, then

lim_{n→∞} Prob{Xn = 0} = q = 1 − Prob{ lim_{n→∞} Xn = ∞ },

where 0 < q < 1 is the unique fixed point of the pgf, f(q) = q.

A Corollary to the BP Theorem when X0 = N.

Corollary 1. Assume the offspring distribution {pk} and the pgf f(t) satisfy properties (1)–(5). In addition, assume X0 = N. If m ≤ 1, then

lim_{n→∞} Prob{Xn = 0} = lim_{n→∞} [p0(n)]^N = 1.

If m > 1, then

lim_{n→∞} Prob{Xn = 0} = lim_{n→∞} [p0(n)]^N = q^N < 1.

Applications of Discrete-Time Branching Processes.

Single-Type Process: {Xn}

(1) Family Names

    (2) Cell Cycle

    (3) Network Theory

(1) An Example of a BP due to Lotka

Example 2. Lotka assumed the number of sons a male has in his lifetime has the following geometric probability distribution:

p0 = 1/2 and pk = (3/5)^(k−1) (1/5) for k = 1, 2, . . . .

Note that Σ_{k=1}^∞ pk = 1/2 and

f(t) = 1/2 + (1/5) Σ_{k=1}^∞ (3/5)^(k−1) t^k = 1/2 + (1/5) · t/(1 − 3t/5) = 1/2 + t/(5 − 3t),

m = f′(1) = (1/5)/(1 − 3/5)² = 5/4 > 1.

The fixed points of f(t) are found by solving

1/2 + t/(5 − 3t) = t or 6t² − 11t + 5 = 0,

so that the fixed point is q = 5/6 (the other root is t = 1). A male has a probability of 5/6 that his line of descent becomes extinct and a probability of 1/6 that his descendants will continue forever.
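Both conclusions can be checked numerically: the iteration p0(n) = f(p0(n−1)) converges to the extinction probability q = 5/6, and direct simulation of the branching process agrees. A sketch (the trial count and population cap are illustrative choices):

```python
import random

def f(t):                                  # Lotka's offspring pgf
    return 0.5 + t / (5 - 3 * t)

# Fixed-point iteration p0(n) = f(p0(n-1)) starting from p0(0) = 0.
q = 0.0
for _ in range(500):
    q = f(q)                               # converges to the smallest fixed point 5/6

def extinct(rng, max_gen=80, cap=500):
    """Simulate one line of descent; True if it dies out."""
    x = 1
    for _ in range(max_gen):
        if x == 0:
            return True
        if x > cap:                        # survival is then essentially certain
            return False
        total = 0
        for _ in range(x):
            if rng.random() < 0.5:
                continue                   # no sons, probability 1/2
            k = 1
            while rng.random() < 0.6:      # geometric tail: p_k = (1/5)(3/5)^(k-1)
                k += 1
            total += k
        x = total
    return x == 0

rng = random.Random(1)
frac = sum(extinct(rng) for _ in range(4000)) / 4000   # close to 5/6
```

The Monte Carlo extinction fraction agrees with the fixed point 5/6 to within sampling error.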

(2) An Application to the Cell Cycle

Each cell, after completing its life cycle, doubles in size, then divides into two progeny cells of equal sizes (Kimmel and Axelrod, 2002). After cell division, some cells die, some remain inactive or quiesce, and some keep dividing or proliferating. After cell division:

(1) Cell proliferation, probability p2
(2) Cell death, probability p0
(3) Cell quiescence, probability p1, with p0 + p1 + p2 = 1.

The Cell Cycle is a Galton-Watson Process

Let Xn be the number of proliferating cells at time n. The pgf is

f(t) = (p0 + p1)² + 2p2(p0 + p1) t + p2² t² = (p2 t + p0 + p1)².

The mean of the proliferating cells is

m = f′(1) = 2p2.

Reference: Kimmel and Axelrod (2002)
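With the offspring pgf f(t) = (p2 t + p0 + p1)², the extinction probability of the proliferating cell population is the smallest fixed point of f. A sketch with assumed illustrative values p0 = 0.1, p1 = 0.2, p2 = 0.7 (so m = 2p2 = 1.4, not values from the slides):

```python
p0, p1, p2 = 0.1, 0.2, 0.7       # assumed illustrative probabilities

def f(t):
    """Offspring pgf of proliferating cells: (p2*t + p0 + p1)^2."""
    return (p2 * t + p0 + p1) ** 2

q = 0.0
for _ in range(500):
    q = f(q)                     # converges to the smallest fixed point
# here f(q) = q reduces to 0.49*q**2 - 0.58*q + 0.09 = 0, with roots 9/49 and 1
```

For these values the extinction probability is q = 9/49, so a single proliferating cell founds a surviving clone with probability 40/49.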

(3) An Application of BP to Network Theory

In disease networks, individuals are referred to as vertices or nodes in a network, and their connectedness (by an edge) to other individuals in the network is described by a degree distribution.

Let pk = probability that a node or vertex in the network is connected by an edge to k other vertices. Then {pk}, k = 0, 1, 2, . . . , is known as the degree distribution for the network. In disease transmission, it is important to determine the distribution of degrees for nodes reached from a random node. For a randomly chosen node, the probability of reaching a node of degree k is proportional to k pk (because there are k ways to reach this node). But for calculating the spread of disease, we do not count the edge on which the disease entered; hence the probability associated with spread has degree k − 1, called the excess degree, qk−1 ∝ k pk. Thus,

qk−1 = k pk / Σ_{k=1}^∞ k pk, k = 1, 2, . . . .

References: Newman (2002), Brauer (2008)

PGF for the Degree Distribution and Excess Degree Distributions

f0(t) = Σ_{k=0}^∞ pk t^k and f1(t) = Σ_{k=0}^∞ qk t^k = Σ_{k=1}^∞ k pk t^(k−1) / m0,

where m0 = Σ_{k=0}^∞ k pk is the mean of the degree distribution. Not every connection in the network leads to disease transmission. Thus, we define the mean transmissibility of the disease as T, 0 < T < 1, the probability that the disease is transmitted along an edge. The binomial distribution can be applied to a node of degree k to determine the probability of m ≤ k transmissions:

C(k, m) T^m (1 − T)^(k−m).

The PGF as a Function of Transmissibility

The probability rm that there are m transmissions is

rm = Σ_{k=m}^∞ pk C(k, m) T^m (1 − T)^(k−m).

Thus, the pgf associated with {rm}, m = 0, 1, 2, . . . , is

f0(t, T) = Σ_{m=0}^∞ rm t^m
         = Σ_{m=0}^∞ [ Σ_{k=m}^∞ pk C(k, m) T^m (1 − T)^(k−m) ] t^m
         = Σ_{k=0}^∞ pk Σ_{m=0}^k C(k, m) (tT)^m (1 − T)^(k−m)
         = Σ_{k=0}^∞ pk (1 − T + tT)^k
         = f0(1 − T + tT).

Also, f1(t, T) = f1(1 − T + tT).

    L. J. S. Allen Texas Tech University

The Basic Reproduction Number is Defined for a Network

The mean of the excess degree distribution is defined as the basic reproduction number of the disease, R0:

R0 = d f1(1 − T + tT)/dt |_{t=1} = T f1'(1).

It follows from the identities

f0(t) = Σ_{k=0}^∞ pk t^k  and  f1(t) = Σ_{k=0}^∞ qk t^k = (Σ_{k=1}^∞ kpk t^{k−1}) / m0,

that

f1(t) = f0'(t) / m0.   (2)

Thus,

R0 = T f1'(1) = T f0''(1) / m0.   (3)
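Formula (3) gives R0 directly from the degree distribution, since f0''(1) = Σ_k k(k − 1) pk and m0 = Σ_k k pk; a quick sketch (p and T are illustrative choices, not from the slides):

```python
# R0 = T * f0''(1) / m0, computed from a hypothetical degree distribution.

p = [0.1, 0.3, 0.4, 0.2]   # degrees 0..3
T = 0.5                    # illustrative transmissibility

m0 = sum(k * pk for k, pk in enumerate(p))                # mean degree
f0_dd_1 = sum(k * (k - 1) * pk for k, pk in enumerate(p)) # f0''(1)
R0 = T * f0_dd_1 / m0
print(R0)
```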


An Example where the Network has a Poisson Degree Distribution

Example 3. Assume that the degree distribution has a Poisson distribution with parameter λ. This type of network is known as a Poisson random graph. The generating function for a Poisson distribution is f0(t) = e^{λ(t−1)}, where m0 = λ. Applying the identities (2) and (3): f1(t) = f0(t) and

R0 = T m0.

Applying Theorem 1, the probability the disease dies out (q) is the fixed point of f1(t, T) = f0(t, T) = e^{λ([1−T+tT]−1)} = e^{R0(t−1)}. That is,

q = e^{R0(q−1)}.

For example, if R0 = 2 and initially one infectious individual is introduced into the population, then the probability the disease dies out is q ≈ 0.203. The probability the disease becomes endemic is 1 − q ≈ 0.797.
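The fixed-point equation q = e^{R0(q−1)} can be solved by iterating from q = 0, which converges upward to the minimal fixed point; a sketch reproducing q ≈ 0.203 for R0 = 2:

```python
from math import exp

# Fixed-point iteration for q = exp(R0*(q - 1)) in the Poisson random
# graph example. Starting from 0 converges to the minimal fixed point.

R0 = 2.0
q = 0.0
for _ in range(200):
    q = exp(R0 * (q - 1))

print(round(q, 3))   # 0.203, matching the slides
```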


An Example where the Graph is Complete

Example 4. Suppose the disease network is a complete graph with N ≥ 2 nodes, i.e., each node has exactly N − 1 edges or connections, for a total of N(N − 1)/2 edges. The degree distribution is p_{N−1} = 1 and pk = 0 for k ≠ N − 1. The generating functions are f0(t) = t^{N−1} and f1(t) = t^{N−2}. Thus, the basic reproduction number is

R0 = T(N − 2).


An Example Comparing 2 Connections to N − 1 Connections

Example 5. Suppose in the disease network everyone is connected to only 2 individuals:

p2 = 1, f0(t) = t^2, f1(t) = t, R0 = T.

Suppose there is one individual with N − 1 connections (a small world network):

p2 = (N − 1)/N,  p_{N−1} = 1/N,
f0(t) = ((N − 1)/N) t^2 + (1/N) t^{N−1},
f1(t) = (2/3) t + (1/3) t^{N−2},

and

R0 = T (2/3 + (N − 2)/3).

Comparing the R0 to the complete graph:

2 connections < small world < complete graph
T < T(2/3 + (N − 2)/3) < T(N − 2).
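The three values of R0 can be compared numerically; a small sketch with illustrative values of T and N (valid for N > 3, where the ordering above holds):

```python
# R0 = T * f1'(1) for the three networks of Example 5, with
# illustrative choices T = 0.3 and N = 10.

T, N = 0.3, 10

R0_two_conn = T                              # everyone has 2 connections
R0_small_world = T * (2/3 + (N - 2)/3)       # one node with N-1 connections
R0_complete = T * (N - 2)                    # complete graph

print(R0_two_conn, R0_small_world, R0_complete)
```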


B. Multitype Galton–Watson Branching Processes

In a multitype bp, each individual may give birth to different types or classifications of individuals in the population, say k types. When k = 1, the bp is a single-type bp. There is an offspring distribution corresponding to each of these different types of individuals. For example, the population may be divided according to age or size, and in each generation individuals may age or grow to another age or size class. In addition, in each generation, individuals give birth to new individuals in the youngest age or smallest size class.

Denote the multitype bp as {X(n)}_{n=0}^∞, a vector of random variables,

X(n) = (X1(n), X2(n), . . . , Xk(n))^T,

with k different types of individuals. Each random variable Xi(n) has k associated random variables, {Yji}_{j=1}^k, where Yji is the random variable for the offspring distribution for an individual of type i giving birth to individuals of type j = 1, 2, . . . , k.


We Extend the PGF to Multitype BP

Let pi(s1, s2, . . . , sk) denote the probability an individual of type i gives birth to s1 individuals of type 1, s2 individuals of type 2, . . ., and sk individuals of type k; that is,

pi(s1, s2, . . . , sk) = Prob{Y1i = s1, Y2i = s2, . . . , Yki = sk}.

Define the pgf for Xi, fi : [0, 1]^k → [0, 1], as follows:

fi(t1, t2, . . . , tk) = Σ_{sk=0}^∞ · · · Σ_{s2=0}^∞ Σ_{s1=0}^∞ pi(s1, s2, . . . , sk) t1^{s1} t2^{s2} · · · tk^{sk},

for i = 1, 2, . . . , k.


The PGF for Multitype BP when X(0) = δi

Let δi denote a k-vector with the ith component one and the remaining components zero,

δi = (δ1i, δ2i, . . . , δki)^T,

where δij is the Kronecker delta symbol. Then X(0) = δi means there is initially one individual of type i in the population. The pgf for Xi(0) given X(0) = δi is f_i^0(t1, t2, . . . , tk) = ti, and the pgf for Xi(n) given X(0) = δi is f_i^n(t1, t2, . . . , tk):

Σ_{sk=0}^∞ · · · Σ_{s1=0}^∞ Prob{X1(n) = s1, . . . , Xk(n) = sk | X(0) = δi} t1^{s1} · · · tk^{sk},

with f_i^1 = fi. Let F ≡ F(t1, . . . , tk) = (f1, . . . , fk) denote the vector of pgfs, F : [0, 1]^k → [0, 1]^k. The function F has a fixed point at (1, 1, . . . , 1) since fi(1, 1, . . . , 1) = 1. Ultimate extinction of the population depends on whether F has another fixed point in [0, 1]^k, which depends on the mean.


We Compute the Mean for Multitype BP in Terms of the PGF

Let mji denote the expected number of births of a type j individual by a type i individual; that is,

mji = E(Xj(1) | X(0) = δi) for i, j = 1, 2, . . . , k.

The means can be defined in terms of the pgf:

mji = ∂fi(t1, . . . , tk)/∂tj |_{t1=1,...,tk=1}.

Define the k × k expectation matrix,

M = ( m11  m12  · · ·  m1k
      m21  m22  · · ·  m2k
      ...  ...         ...
      mk1  mk2  · · ·  mkk ).

If matrix M is regular (i.e., M^p > 0 for some p > 0), then M has a simple positive eigenvalue of maximum modulus, which we denote as ρ.


Multitype Branching Process Theorem

Theorem 3. Assume each of the component functions fi of the pgf F(t1, . . . , tk) = (f1(t1, . . . , tk), . . . , fk(t1, . . . , tk)) is a nonlinear function of the variables t1, . . . , tk and the expectation matrix M is regular. If the dominant eigenvalue ρ of M satisfies ρ ≤ 1, then

lim_{n→∞} Prob{X(n) = 0 | X(0) = δi} = 1,

i = 1, 2, . . . , k. If the dominant eigenvalue ρ of M satisfies ρ > 1, then there exists a vector q = (q1, q2, . . . , qk)^T, qi ∈ [0, 1), i = 1, 2, . . . , k, the unique nonnegative solution to F(t1, t2, . . . , tk) = (t1, t2, . . . , tk), such that

lim_{n→∞} Prob{X(n) = 0 | X(0) = δi} = qi,

i = 1, 2, . . . , k.


Corollary to the Multitype Branching Process Theorem when X(0) = (r1, . . . , rk)^T

Corollary 2. Suppose the hypotheses of Theorem 3 hold and X(0) = (r1, . . . , rk)^T. Then if the dominant eigenvalue ρ of matrix M satisfies ρ > 1,

lim_{n→∞} Prob{X(n) = 0 | X(0) = (r1, r2, . . . , rk)^T} = q1^{r1} q2^{r2} · · · qk^{rk}.


(1) Application of Multitype BP to Age-Structured Populations

Suppose there are k age classes. An individual of type i either survives to become a type i + 1 individual with probability p_{i+1,i} > 0 or dies with probability 1 − p_{i+1,i}, i = 1, 2, . . . , k − 1. Probability p_{k+1,k} = 0. An individual of type i gives birth to r individuals of type 1 with probability b_{i,r}. The offspring distribution for an individual of type i satisfies

b_{i,r} ≥ 0 and Σ_{r=0}^∞ b_{i,r} = 1, i = 1, 2, . . . , k.

The mean of the offspring distribution is

b̄i = Σ_{r=1}^∞ r b_{i,r}.


The Expectation Matrix has the Form of a Leslie Matrix Model

The expectation matrix can be calculated from the pgfs fi:

fi(t1, t2, . . . , tk) = [p_{i+1,i} t_{i+1} + (1 − p_{i+1,i})] Σ_{r=0}^∞ b_{i,r} t1^r, i = 1, . . . , k.

e.g., f1 = [p21 t2 + (1 − p21)] Σ_{r=0}^∞ b_{1,r} t1^r,

m11 = ∂f1/∂t1 |_{ti=1} = Σ_r r b_{1,r} = b̄1,  m21 = ∂f1/∂t2 |_{ti=1} = p21,

M = ( b̄1   b̄2   · · ·  b̄_{k−1}    b̄k
      p21   0    · · ·  0           0
      0     p32  · · ·  0           0
      ...               ...         ...
      0     0    · · ·  p_{k,k−1}   0 ).

The form of matrix M is known as a Leslie matrix. Assume matrix M is regular and that the pgfs are nonlinear. Then Theorem 3 can be applied.


An Example of a Stochastic Age-Structured Branching Process

Example 6. Suppose there are two age classes with expectation matrix

M = ( b̄1  b̄2 ; p21  0 ) = ( 3/4  1 ; 1/2  0 ).

The characteristic equation of M is λ² − (3/4)λ − 1/2 = 0, so that the dominant eigenvalue is ρ = (3 + √41)/8 ≈ 1.175 > 1. Suppose the birth probabilities are

b_{1,r} = 1/2 (r = 0), 1/4 (r = 1, 2), 0 (r ≠ 0, 1, 2),
b_{2,r} = 1/4 (r = 0, 2), 1/2 (r = 1), 0 (r ≠ 0, 1, 2).

The mean number of births for each age class is

b̄1 = 3/4 = Σ_{r=1}^∞ r b_{1,r} and b̄2 = 1 = Σ_{r=1}^∞ r b_{2,r}

(the values in the first row of M).


To Find the Probability of Extinction, we Find the Fixed Points of the PGF

The pgfs for the two age classes are

f1(t1, t2) = [(1/2)t2 + 1/2][1/2 + (1/4)t1 + (1/4)t1²]
f2(t1, t2) = 1/4 + (1/2)t1 + (1/4)t1².

Since ρ > 1, the preceding system F = (f1, f2) has a unique fixed point on [0, 1) × [0, 1). The fixed point (q1, q2) is found by solving f1(q1, q2) = q1 and f2(q1, q2) = q2. The solution is

(q1, q2) ≈ (0.6285, 0.6631).

Thus, if there are initially five individuals of age 1 and three individuals of age 2, then the probability of ultimate extinction of the total population is approximately

(0.6285)^5 (0.6631)^3 ≈ 0.0286.
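The fixed point (q1, q2) can be computed by iterating F starting from (0, 0); since ρ > 1, the iterates converge to the minimal fixed point. A sketch reproducing the values of Example 6:

```python
# Fixed-point iteration F(q1, q2) = (q1, q2) for the two-age-class
# branching process of Example 6, starting from (0, 0).

def F(t1, t2):
    f1 = (0.5 * t2 + 0.5) * (0.5 + 0.25 * t1 + 0.25 * t1**2)
    f2 = 0.25 + 0.5 * t1 + 0.25 * t1**2
    return f1, f2

q1, q2 = 0.0, 0.0
for _ in range(500):
    q1, q2 = F(q1, q2)

print(q1, q2)            # compare with (0.6285, 0.6631) from the slides
print(q1**5 * q2**3)     # extinction probability, approximately 0.0286
```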


There are Many Applications of Branching Processes in Biology

Several good references devoted to branching processes, in addition to [1], Chapter 4:

1. Harris, TE. 1963. The Theory of Branching Processes. Prentice Hall, NJ.
2. Jagers, P. 1975. Branching Processes with Biological Applications. Wiley, Chichester.
3. Kimmel, M and D Axelrod. 2002. Branching Processes in Biology. Springer-Verlag, NY.
4. Mode, CJ. 1971. Multitype Branching Processes: Theory and Applications. Elsevier, NY.


This Concludes Part II on Branching Processes.

Part I: Discrete-Time Markov Chains - DTMC
  Theory
  Applications to Random Walks, Populations, and Epidemics

Part II: Branching Processes
  Theory
  Applications to Cellular Processes, Network Theory, and Populations

Part III: Continuous-Time Markov Chains - CTMC
  Theory
  Applications to Populations and Epidemics

Part IV: Stochastic Differential Equations - SDE
  Comparisons to Other Stochastic Processes, DTMC and CTMC
  Applications to Populations and Epidemics


An Intensive Course in Stochastic Processes and Stochastic Differential Equations in Mathematical Biology

Part III
Continuous-Time Markov Chains

Linda J. S. Allen
Texas Tech University
Lubbock, Texas U.S.A.

National Center for Theoretical Sciences
National Tsing Hua University
August 2008




Basic Reference for Part III of this Course

[1] Allen, LJS. 2003. An Introduction to Stochastic Processes with Applications to Biology. Prentice Hall, Upper Saddle River, NJ. Chapters 5, 6, 7.

[2] Other references will be noted.


Part III: Continuous-Time Markov Chains - CTMC

Some Basic Definitions and Notation

Definition 1. Let {X(t)}, t ∈ [0, ∞), be a collection of discrete random variables with values in {0, 1, 2, . . .}. Then the stochastic process {X(t)} is called a continuous-time Markov chain if it satisfies the following condition:

For any sequence of real numbers 0 ≤ t0 < t1 < · · · < tn < tn+1,

Prob{X(tn+1) = in+1 | X(t0) = i0, X(t1) = i1, . . . , X(tn) = in} = Prob{X(tn+1) = in+1 | X(tn) = in}.

The probability distribution {pi(t)}_{i=0}^∞ associated with X(t) is

pi(t) = Prob{X(t) = i},

with probability vector p(t) = (p0(t), p1(t), . . .)^T.


The Transition Matrix for the CTMC has Properties similar to DTMC

Transition probabilities:

pji(t, s) = Prob{X(t) = j | X(s) = i}, s < t,

for i, j = 0, 1, 2, . . . . If the transition probabilities depend only on the length of the time step t − s, they are called stationary or homogeneous transition probabilities; otherwise they are called nonstationary or nonhomogeneous. We shall assume the transition probabilities are stationary, pji(t − s), t > s.

Generally, the transition matrix P(t) = (pji(t)), t > 0, is a stochastic matrix,

Σ_{j=0}^∞ pji(t) = 1,

unless the process is explosive (blow-up in finite time). If the process is nonexplosive, then P(t) is stochastic for all time and satisfies

P(s)P(t) = P(s + t)

for all s, t ∈ [0, ∞).


Waiting Times Between Jumps

The distinction between discrete- and continuous-time Markov chains is that in a DTMC there is a jump to a new state at times 1, 2, . . ., but in a CTMC the jump to a new state may occur at any time t ≥ 0. The collection of random variables {Wi} denote the jump times or waiting times, and the times Ti = Wi+1 − Wi are referred to as the interevent times.

Figure 1: One sample path of a CTMC, illustrating waiting times W1, W2, W3, W4 and interevent times T0, T1, T2, T3. The process is continuous from the right.


An Example of an Explosive Process

If the waiting times approach a finite positive constant, W = sup{Wi}, while the values of the states approach infinity,

lim_{i→∞} X(Wi) = ∞,

then the process is explosive. We will assume the process is nonexplosive, unless noted otherwise. Sample paths are continuous from the right, but for ease in sketching sample paths, they are often drawn as connected rectilinear curves.

Figure 2: One sample path of a CTMC that is explosive (jump times W1, W2, W3, W4, . . . accumulating at W).


The Poisson Process

The Poisson process {X(t)}, t ∈ [0, ∞), is a CTMC with the following properties:

(1) X(0) = 0.

(2) p_{i+1,i}(Δt) = Prob{X(t + Δt) = i + 1 | X(t) = i} = λΔt + o(Δt)
    p_{ii}(Δt) = Prob{X(t + Δt) = i | X(t) = i} = 1 − λΔt + o(Δt)
    p_{ji}(Δt) = Prob{X(t + Δt) = j | X(t) = i} = o(Δt), j ≥ i + 2
    p_{ji}(Δt) = 0, j < i.

These are known as infinitesimal transition probabilities; e.g.,

lim_{Δt→0} [p_{i+1,i}(Δt) − λΔt]/Δt = 0.

The transition probabilities are independent of i and j and depend only on the length of time Δt.


The Transition Matrix for the Poisson Process

P(Δt) = ( p00(Δt)  p01(Δt)  p02(Δt)  · · ·
          p10(Δt)  p11(Δt)  p12(Δt)  · · ·
          p20(Δt)  p21(Δt)  p22(Δt)  · · ·
          p30(Δt)  p31(Δt)  p32(Δt)  · · ·
          ...      ...      ...          )

      = ( 1 − λΔt  0        0        · · ·
          λΔt      1 − λΔt  0        · · ·
          0        λΔt      1 − λΔt  · · ·
          0        0        λΔt      · · ·
          ...      ...      ...          ) + o(Δt).

Note column sums of the matrix are one.


Assumptions (1) and (2) are Used to Derive a System of Differential Equations for the Poisson Process

Because X(0) = 0, it follows that pi0(t) = pi(t). Then

p0(t + Δt) = p0(t)[1 − λΔt + o(Δt)].

Subtracting p0(t), dividing by Δt, and letting Δt → 0,

dp0(t)/dt = −λ p0(t), p0(0) = 1.

The solution is

p0(t) = e^{−λt}.


The Poisson Probabilities are Derived

Similarly,

pi(t + Δt) = pi(t)[1 − λΔt + o(Δt)] + p_{i−1}(t)[λΔt + o(Δt)] + o(Δt),

leads to

dpi(t)/dt = −λ pi(t) + λ p_{i−1}(t), pi(0) = 0, i ≥ 1,

a system of differential-difference equations. The system can be solved sequentially, beginning with p0(t) = e^{−λt}, to show

p1(t) = λt e^{−λt},  p2(t) = (λt)² e^{−λt}/2!,

and in general, a Poisson probability distribution with parameter λt,

pi(t) = (λt)^i e^{−λt}/i!, i = 0, 1, 2, . . . ,

with mean and variance

m(t) = λt = σ²(t).
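The differential-difference system can also be integrated numerically and compared with the closed-form Poisson probabilities; a sketch using a simple Euler scheme on a truncated state space (the step size and truncation level are ad hoc choices):

```python
from math import exp, factorial

# Euler integration of dp_i/dt = -lam*p_i + lam*p_{i-1}, truncated at
# state imax, compared with the Poisson(lam*t) probabilities.

lam, t_end, dt, imax = 1.0, 5.0, 1e-4, 30
p = [1.0] + [0.0] * imax          # p_0(0) = 1, all other states empty

for _ in range(int(t_end / dt)):
    p = [p[i] + dt * (-lam * p[i] + (lam * p[i - 1] if i > 0 else 0.0))
         for i in range(len(p))]

poisson = [(lam * t_end)**i * exp(-lam * t_end) / factorial(i)
           for i in range(5)]
print([round(x, 4) for x in p[:5]])
print([round(x, 4) for x in poisson])
```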


The Interevent Time is Exponentially Distributed

Let W1 be the random variable for the time until the process reaches state 1, the holding time until the first jump. Then

Prob{W1 > t} = p0(t) = e^{−λt} or Prob{W1 ≤ t} = 1 − e^{−λt};

W1 is an exponential random variable with parameter λ. In general, it can be shown that the interevent time has an exponential distribution. We will show that this is true in general for Markov processes.

Figure 3: Sample path for a Poisson process with λ = 1.
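A sample path like the one in Figure 3 can be simulated from exponential interevent times; a sketch that also checks the mean number of events by time t against λt:

```python
import random
from math import log

# Simulate a Poisson process by summing exponential interevent times
# T_i with parameter lam; the jump times are W_i = T_0 + ... + T_{i-1}.

random.seed(0)
lam, t_end, reps = 1.0, 10.0, 20000

def events_by(t_end):
    t, count = 0.0, 0
    while True:
        # exponential interevent time via inverse transform;
        # 1 - random() lies in (0, 1], so log() is safe
        t += -log(1.0 - random.random()) / lam
        if t > t_end:
            return count
        count += 1

mean_count = sum(events_by(t_end) for _ in range(reps)) / reps
print(mean_count)   # close to lam * t_end = 10
```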


Derivation of the Differential Equations by Applying the Transition Matrix Leads to a New Matrix Known as the Generator Matrix

Writing the probabilities in terms of the transition matrix P(Δt):

p(t + Δt) = P(Δt) p(t)

lim_{Δt→0} [p(t + Δt) − p(t)]/Δt = lim_{Δt→0} ([P(Δt) − I]/Δt) p(t),

where I is the identity matrix. Thus,

dp/dt = Q p(t),

where matrix Q is known as the infinitesimal generator matrix,

Q = lim_{Δt→0} [P(Δt) − I]/Δt.
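For a finite-state, nonexplosive chain, dp/dt = Qp is solved by p(t) = e^{Qt} p(0), so P(t) = e^{Qt} and the semigroup property P(s)P(t) = P(s + t) can be checked numerically; a sketch with an illustrative two-state generator (not from the slides):

```python
import numpy as np

# Check P(s)P(t) = P(s+t) and the stochastic-column property of
# P(t) = e^{Qt}, using a truncated Taylor series for the exponential.

def expm_series(A, terms=60):
    """Matrix exponential via truncated Taylor series (fine for small A)."""
    result, term = np.eye(A.shape[0]), np.eye(A.shape[0])
    for n in range(1, terms):
        term = term @ A / n
        result = result + term
    return result

Q = np.array([[-2.0, 1.0],
              [ 2.0, -1.0]])   # illustrative generator: columns sum to 0

s, t = 0.3, 0.5
Ps, Pt, Pst = expm_series(Q * s), expm_series(Q * t), expm_series(Q * (s + t))

print(np.allclose(Ps @ Pt, Pst))            # semigroup property holds
print(np.allclose(Pst.sum(axis=0), 1.0))    # columns of P sum to one
```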


The Generator Matrix has some Nice Properties

The generator matrix

Q = ( q00  q01  q02  · · ·
      q10  q11  q12  · · ·
      q20  q21  q22  · · ·
      ...  ...  ...      )

  = ( −Σ_{i=1}^∞ qi0   q01                  q02                  · · ·
      q10              −Σ_{i=0,i≠1}^∞ qi1   q12                  · · ·
      q20              q21                  −Σ_{i=0,i≠2}^∞ qi2   · · ·
      ...              ...                  ...                      ).

(1) Column sums are zero.
(2) The diagonal elements are negative and the off-diagonal elements are nonnegative.


The Generator Matrix for the Poisson Process

The generator matrix for the Poisson process is

Q = ( −λ   0    0   · · ·
       λ  −λ    0   · · ·
       0   λ   −λ   · · ·
      ...  ...  ...     ).
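Truncating the state space gives a finite version of this Q whose matrix exponential recovers the Poisson probabilities; a numerical sketch (the truncation level n and the value of λ are illustrative choices):

```python
import numpy as np
from math import exp, factorial

# Truncated Poisson-process generator: -lam on the diagonal, lam on the
# subdiagonal. Solving dp/dt = Qp via the matrix exponential, with
# p(0) = e_0, recovers the Poisson(lam*t) probabilities.

lam, n, t = 1.0, 40, 5.0
Q = -lam * np.eye(n) + lam * np.eye(n, k=-1)

# matrix exponential of Q*t via truncated Taylor series
P, term = np.eye(n), np.eye(n)
for j in range(1, 200):
    term = term @ (Q * t) / j
    P = P + term

p = P[:, 0]   # probability vector at time t, given X(0) = 0
poisson = [(lam * t)**i * exp(-lam * t) / factorial(i) for i in range(6)]
print(np.round(p[:6], 4))
print(np.round(poisson, 4))
```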

    The probability distribution p(t) = (p0(t), p1(t), . . . , pi(t),