An Intensive Course in Stochastic Processes


  • 7/30/2019 An Intensive Course In Stochastic Processes

    1/244

An Intensive Course in Stochastic Processes and Stochastic Differential Equations in Mathematical Biology

Part I: Discrete-Time Markov Chains

Linda J. S. Allen
Texas Tech University, Lubbock, Texas, U.S.A.

National Center for Theoretical Sciences, National Tsing Hua University

August 2008

    L. J. S. Allen Texas Tech University


    Acknowledgement

I thank Professor Sze Bi Hsu and Professor Jing Yu for the invitation to present lectures at the National Center for Theoretical Sciences at the National Tsing Hua University.


COURSE OUTLINE

Part I: Discrete-Time Markov Chains - DTMC
  Theory
  Applications to Random Walks, Populations, and Epidemics

Part II: Branching Processes
  Theory
  Applications to Cellular Processes, Network Theory, and Populations

Part III: Continuous-Time Markov Chains - CTMC
  Theory
  Applications to Populations and Epidemics

Part IV: Stochastic Differential Equations - SDE
  Comparisons to Other Stochastic Processes, DTMC and CTMC
  Applications to Populations and Epidemics


Some Basic References for this Course

[1] Allen, L. J. S. 2003. An Introduction to Stochastic Processes with Applications to Biology. Prentice Hall, Upper Saddle River, NJ.

[2] Allen, L. J. S. 2008. Chapter 3: An Introduction to Stochastic Epidemic Models. In: Mathematical Epidemiology, Lecture Notes in Mathematics, Vol. 1945, pp. 81-130, F. Brauer, P. van den Driessche, and J. Wu (Eds.), Springer.

[3] Karlin and Taylor. 1975. A First Course in Stochastic Processes. 2nd Ed. Academic Press, NY.

[4] Kimmel and Axelrod. 2002. Branching Processes in Biology. Springer-Verlag, NY.

Other references will be noted.


How do Stochastic Epidemic Models Differ from Deterministic Models?

A deterministic model is formulated in terms of fixed, not random, variables whose dynamics are solutions of differential or difference equations.

A stochastic model is formulated in terms of random variables whose probabilistic dynamics depend on solutions to differential or difference equations.

A solution of a deterministic model is a function of time or space and is dependent on the initial data. A solution of a stochastic model is a probability distribution or density function, which is a function of time or space and is dependent on the initial distribution or density. One sample path over time or space is one realization from this distribution.

Stochastic models are used to model the variability inherent in the process due to demography or the environment. Stochastic models are particularly important when the variability is large relative to the mean; e.g., small population size may lead to population extinction.


Whether the Random Variables Associated with the Stochastic Process are Discrete or Continuous Distinguishes Some Types of Stochastic Models.

A random variable X(t; s) of a stochastic process assigns a real value to each outcome s ∈ S in the sample space and a probability (or probability density),

Prob{X(t; s) ∈ A} ∈ [0, 1].

The values of the random variable constitute the state space. For example, the number of cases associated with a disease may have the following discrete or continuous set of values for its state space:

{0, 1, 2, . . .} or [0, N].

The state space can be discrete or continuous and, correspondingly, the random variable is discrete or continuous. For simplicity, the sample-space notation is suppressed and X(t) is used to denote a random variable indexed by time t. The stochastic process is completely defined when the set of random variables {X(t)} are related by a set of rules.


The Choice of Discrete or Continuous Random Variables with a Discrete or Continuous Index Set Defines the Type of Stochastic Model.

Discrete-Time Markov Chain (DTMC): t ∈ {0, ∆t, 2∆t, . . .}, X(t) is a discrete random variable. The term chain implies that the random variable is discrete.

X(t) ∈ {0, 1, 2, . . . , N}

Continuous-Time Markov Chain (CTMC): t ∈ [0, ∞), X(t) is a discrete random variable.

X(t) ∈ {0, 1, 2, . . . , N}

Diffusion Process, Stochastic Differential Equation (SDE): t ∈ [0, ∞), X(t) is a continuous random variable.

X(t) ∈ [0, N]

Note: These are three major types of stochastic processes that will be discussed, but they are not the only types of stochastic processes.


The Following Graphs Illustrate the Solution of a Differential Equation versus Sample Paths of a Stochastic Epidemic Model

[Figure: number of infectives I(t) versus time steps]

Figure 1: Solution of number of infectious individuals for a differential equation of a SIR epidemic model versus three sample paths of a discrete-time Markov chain.


    Part I:

    Discrete-Time Markov Chains

    Definitions, Theorems, and Applications

Let X_n = a discrete random variable defined on a finite state space {1, 2, . . . , N} or a countably infinite state space {1, 2, . . .}. The index set {0, 1, 2, . . .} often represents the progression of time. The variable n is used instead of t.

Definition 1. A discrete-time stochastic process {X_n}, n = 0, 1, 2, . . ., is said to have the Markov property if

Prob{X_n = i_n | X_0 = i_0, . . . , X_{n−1} = i_{n−1}} = Prob{X_n = i_n | X_{n−1} = i_{n−1}},

where the values of i_k ∈ {1, 2, . . .} for k = 0, 1, 2, . . . , n. The stochastic process is then called a Markov chain. A Markov stochastic process is a stochastic process in which the future behavior of the system depends only on the present and not on its past history.

Definition 2. The probability mass function associated with the random variable X_n is denoted {p_i(n)}, where

p_i(n) = Prob{X_n = i}. (1)

    Reference for Part I: [1] Chapters 2 and 3.


One-Step Transition Probabilities

Definition 3. The one-step transition probability p_{ji}(n) is the probability that the process is in state j at time n + 1 given that the process was in state i at the previous time n, for i, j = 1, 2, . . ., that is,

p_{ji}(n) = Prob{X_{n+1} = j | X_n = i}.

Definition 4. If the transition probabilities p_{ji}(n) do not depend on time n, they are said to be stationary or homogeneous. In this case, p_{ji}(n) ≡ p_{ji}. If the transition probabilities are time-dependent, p_{ji}(n), they are said to be nonstationary or nonhomogeneous.

The transition matrix is

P = ( p_11  p_12  p_13  · · ·
      p_21  p_22  p_23  · · ·
      p_31  p_32  p_33  · · ·
      ...   ...   ...       ).

The column elements sum to one, Σ_j p_{ji} = 1. A matrix with this property is called a stochastic matrix; P and P^n are stochastic matrices.


n-Step Transition Probabilities

Definition 5. The n-step transition probability, denoted p_{ji}^{(n)}, is the probability of moving or transferring from state i to state j in n time steps,

p_{ji}^{(n)} = Prob{X_n = j | X_0 = i}.

The n-step transition matrix is P^{(n)} = (p_{ji}^{(n)}).

Then P^{(1)} = P, P^{(0)} = I, the identity matrix, and, in general, P^{(n)} = P^n. Let p(n) = (p_1(n), p_2(n), . . .)^T be the probability mass vector, with Σ_{i=1}^∞ p_i(n) = 1. Then p(n + 1) = P p(n). In general,

p(n + m) = P^{n+m} p(0) = P^n (P^m p(0)) = P^n p(m).
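The vector iteration above can be sketched in code. This is a minimal pure-Python illustration using the slides' column-stochastic convention (columns of P sum to one, p(n+1) = P p(n)); the 3-state matrix is a made-up example, not one from the lectures.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_vec(A, v):
    """Apply A to a column vector v, as in p(n+1) = P p(n)."""
    return [sum(A[i][k] * v[k] for k in range(len(v))) for i in range(len(A))]

def n_step(P, n):
    """P^(n) = P^n, the n-step transition matrix (P^(0) = I)."""
    m = len(P)
    Pn = [[float(i == j) for j in range(m)] for i in range(m)]
    for _ in range(n):
        Pn = mat_mul(P, Pn)
    return Pn

P = [[0.9, 0.2, 0.0],
     [0.1, 0.5, 0.3],
     [0.0, 0.3, 0.7]]        # each COLUMN sums to one (stochastic matrix)

p0 = [1.0, 0.0, 0.0]              # start in state 1 with probability one
p2 = mat_vec(P, mat_vec(P, p0))   # p(2) = P p(1) = P^2 p(0)
```

Because p(2) equals the first column of P^2, and every column of P^n is again a probability vector, this also illustrates the statement that P^n is a stochastic matrix.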


Classification of States

Definition 6. The state j can be reached from the state i (or state j is accessible from state i) if there is a nonzero probability, p_{ji}^{(n)} > 0, for some n ≥ 0, denoted i → j. If j → i and i → j, then i and j are said to communicate, or to be in the same class, denoted i ↔ j.

Definition 7. A set of states C is closed if it is impossible to reach any state outside of C from any state inside C by one-step transitions, i.e., p_{ji} = 0 if i ∈ C and j ∉ C.

The relation i → j can be represented in graph theory as a directed edge. The relation i ↔ j is an equivalence relation. The equivalence relation on the states defines a set of equivalence classes. These equivalence classes are known as the communication classes of the Markov chain.

Definition 8. If there is only one communication class, then the Markov chain is said to be irreducible, but if there is more than one communication class, then the Markov chain is said to be reducible.

A sufficient condition for a Markov chain to be irreducible is the existence of a positive integer n such that p_{ji}^{(n)} > 0 for all i and j; that is, P^n > 0 for some positive integer n. For a finite Markov chain, irreducibility can be checked from the directed graph. A finite Markov chain with states {1, 2, . . . , N} is irreducible if there is a directed path from i to j for every i, j ∈ {1, 2, . . . , N}.


Gambler's Ruin Problem or Random Walk

Example 1. The states {0, 1, 2, . . . , N} represent the amount of money of the gambler. The gambler bets $1 per game and either wins or loses each game. The gambler is ruined if he/she reaches state 0. The probability of winning (moving to the right) is p > 0 and the probability of losing (moving to the left) is q > 0, p + q = 1. This model can also be considered a random walk on a grid with N + 1 points. The one-step transition probabilities are p_{i+1,i} = p and p_{i−1,i} = q for i = 1, . . . , N − 1, with p_00 = 1 and p_NN = 1. All other elements are zero. There are three communication classes: {0}, {1, 2, . . . , N − 1}, and {N}. The Markov chain is reducible. The sets {0} and {N} are closed, but the set {1, 2, . . . , N − 1} is not closed. Also, states 0 and N are absorbing; the remaining states are transient.

[Diagram: states 0, 1, 2, . . . , N on a line]

The transition matrix for the gambler's ruin problem is the (N + 1) × (N + 1) tridiagonal matrix

P = ( 1  q  0  · · ·  0  0
      0  0  q  · · ·  0  0
      0  p  0  · · ·  0  0
      0  0  p  · · ·  0  0
      ...  ...  ...  ...
      0  0  0  · · ·  0  0
      0  0  0  · · ·  p  1 ).


Periodic Chains

Example 2. Suppose the states are {1, 2, . . . , N} with transition matrix

P = ( 0  0  · · ·  0  1
      1  0  · · ·  0  0
      0  1  · · ·  0  0
      ...  ...  ...  ...
      0  0  · · ·  1  0 ).

Beginning in state i, it takes exactly N time steps to return to state i; P^N = I. The chain is periodic with period equal to N.

[Diagram: states 1 → 2 → 3 → · · · → N → 1]

Definition 9. The period of state i, d(i), is the greatest common divisor of all integers n ≥ 1 for which p_{ii}^{(n)} > 0; that is, d(i) = g.c.d.{n | p_{ii}^{(n)} > 0 and n ≥ 1}. If a state i has period d(i) > 1, it is said to be periodic of period d(i). If the period of a state equals one, it is said to be aperiodic.

Periodicity is a class property: i ↔ j implies d(i) = d(j). Thus, we speak of a periodic class or chain or an aperiodic class or chain.

In the gambler's ruin problem or random walk model with absorbing boundaries, Example 1, the classes {0} and {N} are aperiodic. The class {1, 2, . . . , N − 1} has period 2.
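Definition 9 can be checked numerically by scanning the return probabilities p_{ii}^{(n)} up to a cutoff. A sketch, adequate for small chains (the matrices are illustrative examples):

```python
from math import gcd

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def period(P, i, max_n=50):
    """gcd of all n in 1..max_n with p_ii^(n) > 0 (returns 0 if no return seen)."""
    d = 0
    Pn = P
    for n in range(1, max_n + 1):
        if Pn[i][i] > 1e-12:
            d = gcd(d, n)
        Pn = mat_mul(Pn, P)
    return d

# Example 2 with N = 4: a 4-cycle, so every state has period 4.
cycle4 = [[0, 0, 0, 1],
          [1, 0, 0, 0],
          [0, 1, 0, 0],
          [0, 0, 1, 0]]

# A chain with a positive holding probability p_11 > 0 is aperiodic.
lazy = [[0.5, 1.0],
        [0.5, 0.0]]
```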


Transient and Recurrent States

Definition 10. Let f_{ii}^{(n)} denote the probability that, starting from state i, X_0 = i, the first return to state i is at the nth time step, n ≥ 1; that is,

f_{ii}^{(n)} = Prob{X_n = i, X_m ≠ i, m = 1, 2, . . . , n − 1 | X_0 = i}.

The probabilities f_{ii}^{(n)} are known as first return probabilities. Define f_{ii}^{(0)} = 0.

Definition 11. State i is said to be transient if Σ_{n=1}^∞ f_{ii}^{(n)} < 1. State i is said to be recurrent if Σ_{n=1}^∞ f_{ii}^{(n)} = 1.

Definition 12. The mean recurrence time for state i is

μ_{ii} = Σ_{n=1}^∞ n f_{ii}^{(n)}.

Definition 13. If a recurrent state i satisfies μ_{ii} < ∞, then it is said to be positive recurrent, and if it satisfies μ_{ii} = ∞, then it is said to be null recurrent.

An example of a positive recurrent state is an absorbing state. The mean recurrence time of an absorbing state is μ_{ii} = 1.


First Passage Time and Recurrent Chains

The probability f_{ji}^{(n)} for j ≠ i is defined similarly.

Definition 14. Let f_{ji}^{(n)} denote the probability that, starting from state i, X_0 = i, the first passage to state j, j ≠ i, is at the nth time step, n ≥ 1,

f_{ji}^{(n)} = Prob{X_n = j, X_m ≠ j, m = 1, 2, . . . , n − 1 | X_0 = i}, j ≠ i.

The probabilities f_{ji}^{(n)} are known as first passage time probabilities. Define f_{ji}^{(0)} = 0.

Definition 15. If X_0 = i, then the mean first passage time to state j is denoted as μ_{ji} = E(T_{ji}) and defined as

μ_{ji} = Σ_{n=1}^∞ n f_{ji}^{(n)}, j ≠ i.

We use these definitions to verify alternative definitions for transient and recurrent states and recurrent and transient communication classes and chains.

Theorem 1. A state i is recurrent (transient) if and only if Σ_{n=0}^∞ p_{ii}^{(n)} diverges (converges); that is,

Σ_{n=0}^∞ p_{ii}^{(n)} = ∞ (< ∞).

Recurrence and transience are class properties; for i ↔ j, state i is recurrent (transient) iff state j is recurrent (transient).


The 1-D, Unrestricted Random Walk is Transient unless it is Symmetric.

Example 3. Consider the 1-D, unrestricted random walk. The chain is irreducible and periodic of period 2.

[Diagram: states . . . , −2, −1, 0, 1, 2, . . . on a line]

Let p be the probability of moving to the right and q be the probability of moving left, p + q = 1. We verify that the state 0 (the origin) is recurrent iff p = 1/2 = q. If the origin is recurrent, then all states are recurrent because the chain is irreducible. Starting from the origin, it is impossible to return in an odd number of steps,

p_{00}^{(2n+1)} = 0 for n = 0, 1, 2, . . . .

In 2n steps, a return to the origin requires exactly n steps to the right and n steps to the left, in any order. There are

C(2n, n) = (2n)!/(n! n!)

different paths (combinations) that begin and end at the origin. The probability of occurrence of each one of these paths is p^n q^n. Thus,

Σ_{n=0}^∞ p_{00}^{(n)} = Σ_{n=0}^∞ p_{00}^{(2n)} = Σ_{n=0}^∞ C(2n, n) p^n q^n.


We need an asymptotic formula for n!, known as Stirling's formula, to verify recurrence:

n! ~ n^n e^{−n} √(2πn).

Stirling's formula gives the following approximation:

p_{00}^{(2n)} = [(2n)!/(n! n!)] p^n q^n ~ [4^n/√(πn)] p^n q^n = (4pq)^n/√(πn). (2)

When p ≠ q, 4pq < 1 and the series Σ_n (4pq)^n/√(πn) converges by comparison with a geometric series. When p = 1/2 = q, 4pq = 1, and since 1/√(πn) > 1/(2√n), there exists a positive integer N such that for n ≥ N,

(4pq)^n/(2√n) < p_{00}^{(2n)},

so that

Σ_{n=N}^∞ (4pq)^n/(2√n) = (1/2) Σ_{n=N}^∞ 1/√n = ∞.

The latter series diverges because it is a multiple of a divergent p-series. Therefore, by Theorem 1, state 0 is recurrent iff p = 1/2 = q, and because the chain is irreducible, all states are then recurrent. The chain is transient iff p ≠ q; there is a positive probability that an object starting from the origin will never return to the origin. The object tends to +∞ or to −∞.
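Theorem 1's criterion can also be illustrated numerically. The sketch below uses the generating-function identity Σ_{n≥0} C(2n, n) x^n = 1/√(1 − 4x), which for x = pq with p + q = 1 gives the value 1/|p − q| when p ≠ q; the terms are built by a recurrence to avoid overflowing the binomial coefficients.

```python
def partial_sum(p, terms):
    """Partial sum of sum_{n>=0} p_00^(2n), with p_00^(2n) = C(2n,n) (pq)^n."""
    q = 1 - p
    total, term = 0.0, 1.0                            # the n = 0 term is 1
    for n in range(terms):
        total += term
        term *= 2 * (2 * n + 1) / (n + 1) * (p * q)   # ratio of consecutive terms
    return total

transient = partial_sum(0.6, 500)   # p != q: converges toward 1/|p - q| = 5
symmetric = partial_sum(0.5, 500)   # p = q = 1/2: partial sums keep growing
```

The bounded sum for p = 0.6 versus the steadily growing sums for p = 0.5 mirrors the convergence/divergence dichotomy in Theorem 1.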


    Summary of Classification Schemes

    Markov chains or classes can be classified as

    Periodic or Aperiodic

    Then further classified as

    Transient or Recurrent

    Then recurrent MC can be classified as

    Null recurrent or Positive recurrent.

The term ergodic refers to a MC that is aperiodic, irreducible, and recurrent; strongly ergodic if it is positive recurrent, and weakly ergodic if it is null recurrent.


Basic Theorems for Markov Chains (MC)

Theorem 2 (Basic Limit Theorem for Aperiodic MC). An ergodic MC has the property

lim_{n→∞} p_{ij}^{(n)} = 1/μ_{ii},

where μ_{ii} is the mean recurrence time for state i; i and j are any states of the chain. [If μ_{ii} = ∞, then lim_{n→∞} p_{ij}^{(n)} = 0.]

Theorem 3 (Basic Limit Theorem for Periodic MC). A recurrent, irreducible, and d-periodic MC has the property

lim_{n→∞} p_{ii}^{(nd)} = d/μ_{ii},

and p_{ii}^{(m)} = 0 if m is not a multiple of d, where μ_{ii} is the mean recurrence time for state i. [If μ_{ii} = ∞, then lim_{n→∞} p_{ii}^{(nd)} = 0.]

Theorem 4. If j is a transient state of a MC, and i is any other state, then

lim_{n→∞} p_{ji}^{(n)} = 0.

The first two proofs apply discrete renewal theory (Karlin and Taylor, 1975). These theorems also apply to classes in a MC.


The 1-D Unrestricted Symmetric Random Walk is Null Recurrent.

Example 4. The unrestricted random walk model is irreducible and periodic with period 2. The chain is recurrent iff it is a symmetric random walk, p = 1/2 = q (Example 3). Recall that the 2n-step transition probability satisfies

p_{00}^{(2n)} ~ 1/√(πn),

and hence lim_{n→∞} p_{00}^{(2n)} = 0. The Basic Limit Theorem for Periodic Markov chains then states that d/μ_{00} = 0. Thus, μ_{00} = ∞. When p = 1/2 = q, the chain is null recurrent.

It can be shown that in a 2-D symmetric lattice random walk (probability 1/4 of moving in each of 4 directions), the chain is null recurrent. But in a 3-D symmetric lattice random walk (probability 1/6 of moving in each of 6 directions), the chain is transient.


Stationary Probability Distribution

Definition 16. A stationary probability distribution of a MC is a probability vector π = (π_1, π_2, . . .)^T, Σ_{i=1}^∞ π_i = 1, that satisfies

P π = π.

Example 5. Let

P = ( a_1  0    0    · · ·
      a_2  a_1  0    · · ·
      a_3  a_2  a_1  · · ·
      ...  ...  ...      ),

where a_i > 0 and Σ_{i=1}^∞ a_i = 1. There exists no stationary probability distribution because P π = π implies π = 0, the zero vector. It is impossible for the sum of the elements of π to equal one.

According to the Basic Limit Theorem for MC, every strongly ergodic MC converges to a stationary probability distribution. For a periodic MC, this is not the case.


A Strongly Ergodic MC Converges to a Stationary Distribution.

Theorem 5. A strongly ergodic MC with states {1, 2, . . .} and transition matrix P has a unique positive stationary probability distribution π = (π_1, π_2, . . .)^T, P π = π, such that

lim_{n→∞} P^n p(0) = π.

Example 6. The following transition matrix is based on a strongly ergodic MC:

P = ( 0  1/4  0
      1  1/2  1
      0  1/4  0 ).

The stationary probability distribution is π = (1/6, 2/3, 1/6)^T with mean recurrence times μ_11 = 6, μ_22 = 3/2, and μ_33 = 6. The columns of P^n approach the stationary probability distribution,

lim_{n→∞} P^n = ( 1/6  1/6  1/6
                  2/3  2/3  2/3
                  1/6  1/6  1/6 ).
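Example 6 can be verified by iterating p(n+1) = P p(n); a pure-Python sketch. Since the columns of lim P^n equal π and Theorem 2 gives lim p_{ij}^{(n)} = 1/μ_{ii}, the mean recurrence times are recovered from μ_{ii} = 1/π_i.

```python
def mat_vec(P, v):
    return [sum(P[i][k] * v[k] for k in range(len(v))) for i in range(len(P))]

P = [[0, 0.25, 0],
     [1, 0.50, 1],
     [0, 0.25, 0]]          # column-stochastic matrix of Example 6

p = [1.0, 0.0, 0.0]         # any initial distribution works
for _ in range(200):        # iterate p(n+1) = P p(n)
    p = mat_vec(P, p)

mu = [1 / pi for pi in p]   # mean recurrence times mu_ii = 1/pi_i
```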


The Basic Theorems Simplify for Finite MC

Facts: In a finite MC, there are NO null recurrent states, and not all states are transient. In addition, in a finite MC, a stationary probability distribution π is an eigenvector of P corresponding to the eigenvalue one.

Theorem 6. An irreducible finite MC is positive recurrent. In addition, an irreducible, aperiodic finite MC has a unique positive stationary distribution π such that

lim_{n→∞} P^n p(0) = π.

Example 7. The transition matrix for an irreducible finite MC is

P = ( 1/2  1/3
      1/2  2/3 ).

The stationary probability distribution satisfies P π = π,

π = (2/5, 3/5)^T.

Mean recurrence times are μ_11 = 5/2 and μ_22 = 5/3.


Biological Applications of DTMC Processes

We will apply these definitions and theorems to some biological examples:

(1) Gambler's Ruin Problem or Random Walk
(2) Birth and Death Process
(3) Logistic Growth Process
(4) SIS Epidemic Process
(5) SIR Epidemic Process
(6) Chain Binomial Epidemic Process


(1) Gambler's Ruin Problem

The gambler's ruin problem is a classical problem in DTMC theory. The model can also be considered a random walk on a spatial grid with N + 1 grid points. If N → ∞, the spatial grid is semi-infinite.

Probability of Absorption

Let a_k be the probability of absorption into state 0 (ruin) beginning with a capital of k, 1 ≤ k ≤ N − 1, and let b_k be the probability of absorption into state N (win and the game stops) beginning with a capital of k. If there is only one absorbing state at 0, as in a population model, then the probability of absorption is the probability of extinction. If a_{kn} and b_{kn} represent the probabilities of absorption into the two states, 0 and N, respectively, at the nth game or step, then

a_k = Σ_{n=0}^∞ a_{kn} and b_k = Σ_{n=0}^∞ b_{kn}.


We solve a boundary value problem (bvp) for a_k (a difference equation) and use the fact that a_k + b_k = 1 to solve for b_k. The bvp is

a_k = p a_{k+1} + q a_{k−1},
a_0 = 1,
a_N = 0.

This is a homogeneous linear difference equation. Assume a_k = λ^k ≠ 0 to obtain the characteristic equation pλ^2 − λ + q = 0. Then

a_k = [(q/p)^N − (q/p)^k] / [(q/p)^N − 1], p ≠ q,
b_k = [(q/p)^k − 1] / [(q/p)^N − 1], p ≠ q.

If p = 1/2 = q, then

a_k = (N − k)/N and b_k = k/N.
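The closed-form solution above is easy to evaluate; a minimal sketch. Evaluating at k = 50 and N = 100 reproduces the entries of Table 1.

```python
def ruin_prob(k, N, q):
    """a_k: probability of ruin (absorption at 0) from initial capital k."""
    p = 1 - q
    if abs(p - q) < 1e-12:
        return (N - k) / N          # symmetric case p = q = 1/2
    r = q / p
    return (r ** N - r ** k) / (r ** N - 1)

def win_prob(k, N, q):
    """b_k = 1 - a_k: probability of absorption at N."""
    return 1 - ruin_prob(k, N, q)
```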


The Probability of Absorption for N = 100 and k = 50

p = probability of winning (moving to the right),
q = 1 − p = probability of losing (moving to the left).

 Prob.      a_50       b_50
 q = 0.50   0.5        0.5
 q = 0.51   0.880825   0.119175
 q = 0.55   0.999956   0.000044
 q = 0.60   1.00000    0.00000

Table 1: Gambler's ruin problem with a beginning capital of k = 50 and a total capital of N = 100.


Expected Time Until Absorption

In terms of the gambler's ruin problem, we will determine the mean time until absorption. In terms of population models, if there is only one absorbing boundary at 0, this represents population extinction, and the mean time until absorption is the mean time until extinction.

Let τ_k = the expected time until absorption in the gambler's ruin problem. We solve the following bvp:

τ_k = p(1 + τ_{k+1}) + q(1 + τ_{k−1}) = 1 + p τ_{k+1} + q τ_{k−1},
τ_0 = 0 = τ_N.

This is a nonhomogeneous linear difference equation. To solve the homogeneous equation, let τ_k = λ^k ≠ 0 to obtain the characteristic equation pλ^2 − λ + q = 0. A particular solution is τ_k = k/(q − p), q ≠ p.


The Expected Time to Absorption for N = 100 and k = 50

τ_k = k/(q − p) − [N/(q − p)] [1 − (q/p)^k] / [1 − (q/p)^N], q ≠ p,
τ_k = k(N − k), q = p.

 Prob.      a_50       b_50       τ_50
 q = 0.50   0.5        0.5        2500
 q = 0.51   0.880825   0.119175   1904
 q = 0.55   0.999956   0.000044   500
 q = 0.60   1.00000    0.00000    250

Table 2: Gambler's ruin problem with a beginning capital of k = 50 and a total capital of N = 100.
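A sketch of the τ_k formula above; evaluating at k = 50, N = 100 reproduces the τ_50 column of Table 2.

```python
def expected_duration(k, N, q):
    """tau_k: expected number of games until absorption at 0 or N."""
    p = 1 - q
    if abs(p - q) < 1e-12:
        return k * (N - k)           # symmetric case q = p
    r = q / p
    return k / (q - p) - (N / (q - p)) * (1 - r ** k) / (1 - r ** N)
```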


Expected Time Until Absorption as a Function of Initial Capital k for N = 100 and q = 0.55

[Figure: expected duration versus initial capital]

Figure 2: Expected duration of the games, τ_k, for k = 0, 1, 2, . . . , 100, when q = 0.55 and N = 100.


Random Walk on a Semi-Infinite Domain, N → ∞

Probability of Extinction

a_k = 1, p < q,
a_k = (q/p)^k, p ≥ q.

Expected Time Until Extinction

τ_k = k/(q − p), p < q,
τ_k = ∞, p ≥ q.
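A sketch relating these limits to the finite-domain formulas: for p > q, the finite-N ruin probability a_k tends to (q/p)^k as N grows.

```python
def ruin_prob_finite(k, N, q):
    """a_k for the finite problem with absorbing states 0 and N (p != q)."""
    p = 1 - q
    r = q / p
    return (r ** N - r ** k) / (r ** N - 1)

def extinction_prob_infinite(k, q):
    """Limit N -> infinity: extinction is certain if p < q, else (q/p)^k."""
    p = 1 - q
    return 1.0 if p < q else (q / p) ** k
```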


(2) Birth and Death Process

A birth and death process is related to the gambler's ruin problem, but the probabilities of a birth (winning) or death (losing) are not constant; they depend on the size of the population, and size N is not absorbing. Let X_n, n = 0, 1, 2, . . ., denote the size of the population. The birth and death probabilities are b_i and d_i, with b_0 = 0 = d_0, b_N = 0, b_i > 0 for i = 1, 2, . . . , N − 1, and d_i > 0 for i = 1, 2, . . . , N. During the time interval n → n + 1, at most one event occurs, either a birth or a death. Assume

p_{ji} = Prob{X_{n+1} = j | X_n = i}
       = b_i,              if j = i + 1,
         d_i,              if j = i − 1,
         1 − (b_i + d_i),  if j = i,
         0,                if j ≠ i − 1, i, i + 1,

for i = 1, 2, . . ., with p_00 = 1, p_{j0} = 0 for j ≠ 0, and p_{N+1,N} = b_N = 0.


The Transition Matrix for a Birth and Death Process

The transition matrix P has the following form:

P = ( 1  d_1            0              0    · · ·  0                       0
      0  1−(b_1+d_1)    d_2            0    · · ·  0                       0
      0  b_1            1−(b_2+d_2)    d_3  · · ·  0                       0
      0  0              b_2            ...  · · ·  ...                     ...
      ...
      0  0              0              0    · · ·  1−(b_{N−1}+d_{N−1})     d_N
      0  0              0              0    · · ·  b_{N−1}                 1−d_N ).

To ensure that P is a stochastic matrix,

sup_{i ∈ {1,2,...}} {b_i + d_i} ≤ 1.

During each time interval, n to n + 1, the population size either increases by one, decreases by one, or stays the same. This is a reasonable assumption if the time interval is sufficiently small.


Eventual Extinction Occurs with Probability One.

There are two communication classes, {0} and {1, . . . , N}. The first one is positive recurrent and the second one is transient. There exists a unique stationary probability distribution π, P π = π, where π_0 = 1 and π_i = 0 for i = 1, 2, . . . , N. Eventually, population extinction occurs from any initial state (Theorem 4):

lim_{n→∞} P^n p(0) = π.

However, the expected time to extinction may be very long!


Expected Time to Extinction in a Birth and Death Process.

Let τ_k = the expected time until extinction for a population with initial size k. Then

τ_k = b_k(1 + τ_{k+1}) + d_k(1 + τ_{k−1}) + (1 − (b_k + d_k))(1 + τ_k)
    = 1 + b_k τ_{k+1} + d_k τ_{k−1} + (1 − b_k − d_k) τ_k,

and τ_N = 1 + d_N τ_{N−1} + (1 − d_N) τ_N. This can be expressed in matrix form: D τ = c, where τ = (τ_0, τ_1, . . . , τ_N)^T, c = (0, 1, . . . , 1)^T, and D is

D = ( 1     |  0         0         0      · · ·  0      0
      −d_1  |  b_1+d_1   −b_1      0      · · ·  0      0
      0     |  −d_2      b_2+d_2   −b_2   · · ·  0      0
      ...   |  ...       ...       ...    · · ·  ...    ...
      0     |  0         0         0      · · ·  −d_N   d_N )
  = ( 1    0
      D_1  D_N ).

Definition 17. An N × N matrix A = (a_{ij}) is diagonally dominant if

|a_{ii}| ≥ Σ_{j=1, j≠i}^N |a_{ij}|.

Matrix A is irreducibly diagonally dominant if A is irreducible and diagonally dominant, with strict inequality for at least one i.


Expected Time until Extinction in a Birth and Death Process.

Matrix D_N is irreducibly diagonally dominant, so det(D_N) ≠ 0, and det(D) = det(D_N). Thus, D is nonsingular and the solution for the expected time until extinction is

τ = D^{−1} c.

Because matrix D is tridiagonal, simple recursion relations can be applied to obtain explicit formulas for the τ_k, k = 1, 2, . . . , N.

Theorem 7. Suppose {X_n}, n = 0, 1, 2, . . ., is a general birth and death process with X_0 = m ≥ 1 satisfying b_0 = 0 = d_0, b_i > 0 for i = 1, 2, . . . , N − 1, and d_i > 0 for i = 1, 2, . . . , N. The expected time until population extinction satisfies

τ_m = 1/d_1 + Σ_{i=2}^N (b_1 · · · b_{i−1})/(d_1 · · · d_i), m = 1,

τ_m = τ_1 + Σ_{s=1}^{m−1} [ (d_1 · · · d_s)/(b_1 · · · b_s) · Σ_{i=s+1}^N (b_1 · · · b_{i−1})/(d_1 · · · d_i) ], m = 2, . . . , N.

Strictly diagonally dominant and irreducibly diagonally dominant matrices are nonsingular.
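Theorem 7 can be checked against a direct solve of the tridiagonal system D τ = c. The sketch below uses the linear birth and death probabilities of Example 8 (b_i = 0.03i, d_i = 0.02i, N = 20) and eliminates the system with the standard forward/backward tridiagonal sweep.

```python
N = 20
b = [0.0] + [0.03 * i for i in range(1, N)] + [0.0]   # b_0 = 0 and b_N = 0
d = [0.0] + [0.02 * i for i in range(1, N + 1)]       # d_0 = 0

def tau_formula(m):
    """Theorem 7: expected time to extinction from initial size m >= 1."""
    B, D = [1.0], [1.0]                  # prefix products b_1..b_i and d_1..d_i
    for i in range(1, N + 1):
        B.append(B[-1] * b[i])
        D.append(D[-1] * d[i])
    def S(s):                            # sum_{i=s+1}^N (b_1..b_{i-1})/(d_1..d_i)
        return sum(B[i - 1] / D[i] for i in range(s + 1, N + 1))
    tau1 = S(0)                          # equals 1/d_1 + sum_{i=2}^N (...)
    return tau1 + sum(D[s] / B[s] * S(s) for s in range(1, m))

def tau_solve():
    """Solve the tridiagonal system D tau = c for (tau_1, ..., tau_N)."""
    lower = [-d[k] for k in range(1, N + 1)]        # coefficient of tau_{k-1}
    diag = [b[k] + d[k] for k in range(1, N + 1)]   # b_N = 0 gives d_N in row N
    upper = [-b[k] for k in range(1, N + 1)]        # coefficient of tau_{k+1}
    rhs = [1.0] * N
    for k in range(1, N):                # forward elimination (tau_0 = 0)
        w = lower[k] / diag[k - 1]
        diag[k] -= w * upper[k - 1]
        rhs[k] -= w * rhs[k - 1]
    tau = [0.0] * N
    tau[-1] = rhs[-1] / diag[-1]
    for k in range(N - 2, -1, -1):       # back substitution
        tau[k] = (rhs[k] - upper[k] * tau[k + 1]) / diag[k]
    return tau                           # tau[k - 1] approximates tau_k
```

The two computations agree to floating-point accuracy, and τ_m increases with the initial population size m, as Figure 3 shows.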


An Example of a Simple Birth and Death Process with N = 20.

Example 8. Suppose the maximal population size is N = 20, where the birth and death probabilities are linear: b_i = bi = 0.03i for i = 1, 2, . . . , 19, and d_i = di = 0.02i for i = 1, 2, . . . , 20, a simple birth and death process. Since b > d, there is population growth.

[Figure: expected duration versus initial population size, b > d]

Figure 3: Expected time until population extinction when the maximal population size is N = 20, b_i = 0.03i, and d_i = 0.02i.


(3) Logistic Growth Process

Assume b_i − d_i = r i (1 − i/K), where r = intrinsic growth rate and K = carrying capacity.

Two cases:

(a) b_i = r i − r i^2/(2K) and d_i = r i^2/(2K), i = 0, 1, 2, . . . , 2K

(b) b_i = r i for i = 0, 1, 2, . . . , N − 1, b_i = 0 for i ≥ N, and d_i = r i^2/K, i = 0, 1, . . . , N


We Plot the Expected Time to Extinction for Two Cases.

Example 9. Let r = 0.015, K = 10, and N = 20. The population persists much longer in case (a).

[Figure: two panels, (a) and (b), expected time to extinction versus initial population size]

Figure 4: Expected time until population extinction when the birth and death rates satisfy (a) and (b) and the parameters are r = 0.015, K = 10, and N = 20.

Quasistationary Probability Distribution

When the expected time to extinction is very long, it is reasonable to examine the dynamics of the process prior to extinction. Define the probability conditioned on nonextinction:

qi(n) = Prob{Xn = i | Xj ≠ 0, j = 0, 1, 2, . . . , n} = pi(n)/(1 − p0(n))

for i = 1, 2, . . . , N. Note q(n) = (q1(n), q2(n), . . . , qN(n))^T defines a probability distribution because

Σ_{i=1}^N qi(n) = [Σ_{i=1}^N pi(n)]/(1 − p0(n)) = (1 − p0(n))/(1 − p0(n)) = 1.

Let Qn = the random variable for the population size at time n conditional on nonextinction; qi(n) = Prob{Qn = i}. This quasistationary process is a finite irreducible MC. The stationary probability distribution for this process is denoted as q; q is referred to as the quasistationary probability distribution.


Quasistationary Probability Distribution

Difference equations for qi(n) can be derived from those for pi(n) [i.e., p(n + 1) = P p(n)]. From these difference equations the quasistationary probability distribution q can be determined. It will be seen that q cannot be calculated by a direct method but by an indirect method, an iterative scheme.

An approximation to the process {Qn} yields a strongly ergodic MC, {Q̃n}, with associated probability distribution q̃(n). For this new process, a transition matrix P̃ and the limiting positive stationary probability distribution q̃ can be defined. The stationary probability distribution q̃ is an approximation for the quasistationary probability distribution q.

Quasistationary Probability Distribution

Difference equations for qi(n + 1) are derived from the identity p(n + 1) = P p(n):

qi(n + 1) = pi(n + 1)/(1 − p0(n + 1))
          = [pi(n + 1)/(1 − p0(n))] · [(1 − p0(n))/(1 − p0(n + 1))]
          = [pi(n + 1)/(1 − p0(n))] · [(1 − p0(n))/(1 − p0(n) − d1 p1(n))],

since p0(n + 1) = p0(n) + d1 p1(n), or

qi(n + 1)(1 − d1 q1(n)) = pi(n + 1)/(1 − p0(n)).

Using the identity for pi(n + 1), the following relation is obtained:

qi(n + 1)[1 − d1 q1(n)] = bi−1 qi−1(n) + (1 − bi − di) qi(n) + di+1 qi+1(n)

for i = 1, 2, . . . , N, b0 = 0, and qi(n) = 0 for i ∉ {1, 2, . . . , N}. It is similar to the difference equation satisfied by pi(n) except for an additional factor multiplying qi(n + 1). An analytical solution for q cannot be found directly from these equations since the coefficients depend on n, but q can be found by an iterative method.
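One concrete version of the iterative scheme: summing the recursion over i shows that 1 − d1 q1(n) is exactly the probability mass remaining after one step, so iterating amounts to applying the substochastic matrix obtained from P by deleting row and column 0 and then renormalizing. A sketch, with the logistic case (b) parameters of Example 9 assumed as a test case:

```python
import numpy as np

# Assumed test case: logistic growth, case (b), r = 0.015, K = 10, N = 20.
r, K, N = 0.015, 10, 20
b = np.array([r * i if i < N else 0.0 for i in range(N + 1)])  # b_i = r*i, b_i = 0 for i >= N
d = np.array([r * i ** 2 / K for i in range(N + 1)])           # d_i = r*i^2/K

# Substochastic matrix on the transient states 1..N (P with row/column 0 deleted).
A = np.zeros((N, N))
for i in range(1, N + 1):
    A[i - 1, i - 1] = 1 - b[i] - d[i]
    if i > 1:
        A[i - 1, i - 2] = b[i - 1]   # transition from state i-1 up to i
    if i < N:
        A[i - 1, i] = d[i + 1]       # transition from state i+1 down to i

q = np.ones(N) / N
for _ in range(20000):
    q = A @ q
    q /= q.sum()   # dividing by 1 - d_1*q_1(n): conditioning on nonextinction
```

After convergence, q approximates the quasistationary distribution; its mode sits near the carrying capacity K.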

Approximate Quasistationary Probability Distribution

To approximate the quasistationary probability distribution q, let d1 = 0. That is, when the population size equals one, the probability of dying is zero. Then

qi(n + 1) = bi−1 qi−1(n) + (1 − bi − di) qi(n) + di+1 qi+1(n), i = 2, . . . , N − 1,

q1(n + 1) = (1 − b1) q1(n) + d2 q2(n), and qN(n + 1) = bN−1 qN−1(n) + (1 − dN) qN(n). The new transition matrix corresponding to this approximation satisfies

      ( 1 − b1   d2              0               · · ·    0      )
      ( b1       1 − (b2 + d2)   d3              · · ·    0      )
P̃ =  ( 0        b2              1 − (b3 + d3)   · · ·    0      )
      ( ...                                               ...    )
      ( 0        · · ·           1 − (bN−1 + dN−1)        dN     )
      ( 0        · · ·           bN−1                     1 − dN )

Note that P̃ is a submatrix of the original transition matrix P, where the first column and first row of P are deleted and d1 = 0. The MC q̃(n + 1) = P̃ q̃(n) is strongly ergodic, and thus q̃(n) converges to a unique stationary probability distribution q̃, where

q̃i+1 = (bi · · · b1)/(di+1 · · · d2) q̃1 and Σ_{i=1}^N q̃i = 1.
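The product formula gives q̃ directly. A sketch under the same assumed case (b) parameters (Example 9):

```python
# Assumed parameters: logistic case (b), r = 0.015, K = 10, N = 20.
r, K, N = 0.015, 10, 20
b = [r * i for i in range(N)]                # b_i = r*i for i = 0..N-1
d = [r * i ** 2 / K for i in range(N + 1)]   # d_i = r*i^2/K

w = [1.0]                                    # unnormalized q~_1, q~_2, ..., q~_N
for i in range(1, N):
    w.append(w[-1] * b[i] / d[i + 1])        # q~_{i+1} = (b_i ... b_1)/(d_{i+1} ... d_2) q~_1
total = sum(w)
q_tilde = [wi / total for wi in w]           # normalize so the q~_i sum to 1
```

The normalized vector q_tilde peaks near the carrying capacity K.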

Approximate Quasistationary Probability Distribution

Example 10. The approximate quasistationary probability distribution q̃ is compared to the quasistationary probability distribution q when r = 0.015, K = 10, and N = 20 in cases (a) and (b). Both distributions show good agreement for N = 20, but when N = 10 and K = 5 the two distributions differ, especially for values near zero.


Figure 5: Quasistationary probability distribution q (solid curve) and the approximate quasistationary probability distribution q̃ (diamond marks) when r = 0.015, K = 10, and N = 20 in cases (a) and (b). In (c), r = 0.015, K = 5, N = 10, where bi = ri and di = ri²/K.

Probability Distribution Associated with Logistic Growth when N = 100, K = 50 and X0 = 5.

p(n) = (p0(n), p1(n), . . . , pN(n))^T, n = 0, 1, . . . , 2000.


Figure 6: The stochastic logistic probability distribution, p(n), r = 0.004, K = 50, N = 100, X0 = 5.

(4) To Understand the Stochastic SIS Epidemic Model, We Review the Dynamics of the Deterministic SIS Epidemic Model.

Deterministic SIS:

S ⇄ I

dS/dt = −(β/N) S I + (b + γ) I
dI/dt = (β/N) S I − (b + γ) I

where β > 0, γ > 0, N > 0, b ≥ 0, and S(t) + I(t) = N.

The Dynamics of the Deterministic SIS Epidemic Model Depend on the Basic Reproduction Number.

β = transmission rate
b = birth rate = death rate
γ = recovery rate
N = total population size = constant.

Basic Reproduction Number:

R0 = β/(b + γ)

If R0 ≤ 1, then lim_{t→∞} I(t) = 0. If R0 > 1, then lim_{t→∞} I(t) = N(1 − 1/R0).


Formulation of the SIS Stochastic MC Epidemic Model

Let In denote the discrete random variable for the number of infected (and infectious) individuals, with associated probability function

pi(n) = Prob{In = i},

where i = 0, 1, 2, . . . , N is the total number infected at time n. The probability distribution is

p(n) = (p0(n), p1(n), . . . , pN(n))^T

for n = 0, 1, 2, . . . . Now we relate the random variables {In}, indexed by time n, by defining the probability of a transition from state i to state j, i → j, at time n + 1 as

pji(n) = Prob{In+1 = j | In = i}.

For the Stochastic Model, Assume that the Time Interval is Sufficiently Small, Such that the Number of Infectives Changes by at Most One.

That is,

i → i + 1, i → i − 1, or i → i.

Either there is a new infection, birth, death, or a recovery. Therefore, the transition probabilities are

pji(n) = { βi(N − i)/N = bi,                              j = i + 1
           (b + γ)i = di,                                 j = i − 1
           1 − [βi(N − i)/N + (b + γ)i] = 1 − [bi + di],  j = i
           0,                                             j ≠ i + 1, i, i − 1.

Then the SIS epidemic process is similar to a birth and death process.
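These transition probabilities can be simulated directly. A sketch, with parameter values matching Figure 7 (β = 0.01, b = γ = 0.0025, N = 100, I0 = 2 assumed):

```python
import random

def sis_path(beta=0.01, b=0.0025, gamma=0.0025, N=100, i0=2, steps=2000, seed=1):
    """Simulate one sample path of the DTMC SIS model."""
    rng = random.Random(seed)
    i, path = i0, [i0]
    for _ in range(steps):
        p_up = beta * i * (N - i) / N   # b_i: new infection
        p_down = (b + gamma) * i        # d_i: birth/death or recovery
        u = rng.random()
        if u < p_up:
            i += 1
        elif u < p_up + p_down:
            i -= 1                      # otherwise the state is unchanged
        path.append(i)
    return path

path = sis_path()
```

Running several seeds reproduces the qualitative picture of Figure 7: some paths die out quickly, others fluctuate around the endemic level.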

Three Sample Paths of the DTMC SIS Model are Compared to the Solution of the Deterministic Model.


Figure 7: R0 = 2, β = 0.01, b = 0.0025 = γ, N = 100, S0 = 98, and I0 = 2.

Even Though R0 > 1, the DTMC SIS Epidemic Model Predicts the Epidemic Ends.

When R0 > 1, the deterministic SIS epidemic model predicts that an endemic equilibrium is reached. This is not the case for the stochastic SIS epidemic model:

lim_{n→∞} p0(n) = 1.

As mentioned earlier, this absorption at zero may take an exponentially long time. But when N is large and I0 = i is small, for large time n,

0 < p0(n) ≈ P0 = constant < 1.

An Estimate for P0 Can be Obtained From the Gambler's Ruin Problem on a Semi-Infinite Domain.

When N is large and i is small:

Probability Movement Right = p = (β/N) i(N − i) ≈ βi
Probability Movement Left = q = (b + γ)i

Based on a random walk model on a semi-infinite domain, an estimate for the probability of no epidemic (probability of ruin) with initial capital k is ak = (q/p)^k. Suppose I0 = k. Then

Probability of no Epidemic = P0 ≈ ((b + γ)/β)^k = (1/R0)^k.
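This estimate can be checked by simulation: count sample paths that hit 0 before an assumed "takeoff" threshold. A sketch (the threshold and trial count are illustrative choices, not from the slides):

```python
import random

beta, b, gamma, N, k = 0.01, 0.0025, 0.0025, 100, 3   # R0 = beta/(b+gamma) = 2
cutoff = 25                                           # assumed epidemic-takeoff level

def goes_extinct(rng, max_steps=200_000):
    """Return True if the SIS chain, started at k, hits 0 before the cutoff."""
    i = k
    for _ in range(max_steps):
        if i == 0:
            return True
        if i >= cutoff:
            return False
        p_up = beta * i * (N - i) / N
        p_down = (b + gamma) * i
        u = rng.random()
        if u < p_up:
            i += 1
        elif u < p_up + p_down:
            i -= 1
    return False

rng = random.Random(0)
trials = 1000
frac = sum(goes_extinct(rng) for _ in range(trials)) / trials
# frac should be close to the estimate (1/R0)**k = 0.125
```

With R0 = 2 and k = 3 the branching-process estimate gives P0 ≈ 1/8; the Monte Carlo fraction lands nearby (slightly above, since the i(N − i)/N factor weakens the upward drift).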

Graphs of the Probability Distribution are Bimodal, Showing the Probability of No Epidemic and the Quasistationary Distribution.

R0 = 2, I(0) = 3, P0 ≈ 1/8

[Figure: surface plot of Prob{In = i} versus state i and time step n; the distribution is bimodal, with mass at i = 0 and mass near the quasistationary mode.]

(5) We Review the Dynamics of the Deterministic SIR Epidemic Model.

Deterministic SIR: Basic Reproduction Number R0 = β/(b + γ)

S → I → R

dS/dt = −(β/N) S I + b(I + R)
dI/dt = (β/N) S I − (b + γ) I
dR/dt = γ I − b R

If R0 > 1 and b > 0, then lim_{t→∞} I(t) = I* > 0. If R0 > 1 and b = 0, then lim_{t→∞} I(t) = 0; an epidemic occurs if R0 S(0)/N > 1. If R0 ≤ 1, then lim_{t→∞} I(t) = 0.

Formulation of a DTMC SIR Epidemic Model Results in a Bivariate Process.

Sn + In + Rn = N = maximum population size. Let Sn and In denote discrete random variables for the number of susceptible and infected individuals, respectively. These two variables have a joint probability function

p(s,i)(n) = Prob{Sn = s, In = i},

where Rn = N − Sn − In. For this stochastic process, we define transition probabilities as follows:

p(s+k,i+j),(s,i) = Prob{(ΔSn, ΔIn) = (k, j) | (Sn, In) = (s, i)}

  = { βsi/N,                        (k, j) = (−1, 1)
      γi,                           (k, j) = (0, −1)
      bi,                           (k, j) = (1, −1)
      b(N − s − i),                 (k, j) = (1, 0)
      1 − [βsi/N + γi + b(N − s)],  (k, j) = (0, 0)
      0,                            otherwise.

In multivariate processes the transition matrix is often too large and complex to write down.
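Even when the transition matrix is too large to write down, a sample path of the bivariate chain can be simulated directly from the transition probabilities; a sketch, with the βsi/N infection term assumed and parameter values following Figure 8:

```python
import random

def sir_path(beta=0.01, b=0.0, gamma=0.005, N=100, s0=98, i0=2, steps=2000, seed=2):
    """Simulate one sample path (S_n, I_n) of the bivariate DTMC SIR model."""
    rng = random.Random(seed)
    s, i = s0, i0
    hist = [(s, i)]
    for _ in range(steps):
        p_inf = beta * s * i / N   # (k, j) = (-1, +1): new infection (assumed form)
        p_rec = gamma * i          # (0, -1): recovery
        p_bi = b * i               # (+1, -1): death of an infective, susceptible birth
        p_br = b * (N - s - i)     # (+1, 0): death of a recovered, susceptible birth
        u = rng.random()
        if u < p_inf:
            s, i = s - 1, i + 1
        elif u < p_inf + p_rec:
            i -= 1
        elif u < p_inf + p_rec + p_bi:
            s, i = s + 1, i - 1
        elif u < p_inf + p_rec + p_bi + p_br:
            s += 1                 # otherwise no change
        hist.append((s, i))
    return hist

hist = sir_path()
```

With b = 0 the susceptible count never increases, so each path traces a single epidemic wave as in Figure 8.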

Three Sample Paths of the DTMC SIR Epidemic Model are Compared to the Solution of the Deterministic Model.


Figure 8: R0 = 2, R0 S0/N = 1.96, β = 0.01, b = 0, γ = 0.005, N = 100, S0 = 98, and I0 = 2.


    (6) Chain Binomial Epidemic Model

There are two basic models, known as the Greenwood and Reed-Frost models, originally developed in 1928 and 1931, respectively. These models apply to small epidemics or to outbreaks within a household.

Both models are DTMC models that depend on the two random variables St and It, i.e., a bivariate MC. The latent period is the time from t to t + 1 and the infectious period is contracted to a point. Therefore, at time t + 1, there are only newly infected individuals from the previous time t:

    St+1 + It+1 = St.

    Given there are i infectives, let pi = probability that a susceptible individual

    does not become infected during the time period t to t + 1.


Greenwood Model: pi = p

The transition probability for (st, it) → (st+1, it+1) depends only on p, st, and st+1 and is based on the binomial probability distribution:

p_{st+1, st} = C(st, st+1) p^(st+1) (1 − p)^(st − st+1),

where C(st, st+1) denotes the binomial coefficient. Sample paths are denoted {s0, s1, . . . , st−1, st}. The epidemic stops when st = st−1.

E(St+1 | St = st) = p st
E(It+1 | St = st) = (1 − p) st.

Sample Paths of the Greenwood Chain-Binomial Model.


Figure 9: Four sample paths for the Greenwood chain binomial model when s0 = 6 and i0 = 1: {6, 6}, {6, 5, 5}, {6, 4, 3, 2, 1, 1}, and {6, 2, 1, 0, 0}.

Reed-Frost Model: pi = p^i

The transition probability for (st, it) → (st+1, it+1) is again based on the binomial probability distribution but depends on p, st, st+1, and it:

p_{(s,i)t+1, (s,i)t} = C(st, st+1) (p^(it))^(st+1) (1 − p^(it))^(st − st+1)

E(St+1 | St = st, It = it) = st p^(it)
E(It+1 | St = st, It = it) = st (1 − p^(it)).

The Duration and Size of the Epidemic Can be Calculated.

Sample Path {s0, . . . , sT}   Duration T   Size W   Greenwood        Reed-Frost
{3, 3}                         1            0        p^3              p^3
{3, 2, 2}                      2            1        3(1−p)p^4        3(1−p)p^4
{3, 2, 1, 1}                   3            2        6(1−p)^2 p^4     6(1−p)^2 p^4
{3, 1, 1}                      2            2        3(1−p)^2 p^2     3(1−p)^2 p^3
{3, 2, 1, 0, 0}                4            3        6(1−p)^3 p^3     6(1−p)^3 p^3
{3, 2, 0, 0}                   3            3        3(1−p)^3 p^2     3(1−p)^3 p^2
{3, 1, 0, 0}                   3            3        3(1−p)^3 p       3(1−p)^3 p(1+p)
{3, 0, 0}                      2            3        (1−p)^3          (1−p)^3

Table 3: All of the sample paths, their duration, and size are computed for the Greenwood and Reed-Frost models when s0 = 3 and i0 = 1.
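The table entries can be verified by exhaustively enumerating the chains. A sketch; chain_probs is a hypothetical helper written for this illustration, not from the slides:

```python
from math import comb

def chain_probs(s0, i0, p, reed_frost):
    """Enumerate all sample paths {s0, s1, ...} and their probabilities."""
    out = {}
    def rec(s, i, path, prob):
        if i == 0:                       # epidemic has stopped
            out[tuple(path)] = out.get(tuple(path), 0.0) + prob
            return
        if s == 0:                       # no susceptibles left; one final step
            key = tuple(path + [0])
            out[key] = out.get(key, 0.0) + prob
            return
        q = p ** i if reed_frost else p  # per-susceptible escape probability
        for s_next in range(s + 1):
            pr = comb(s, s_next) * q ** s_next * (1 - q) ** (s - s_next)
            rec(s_next, s - s_next, path + [s_next], prob * pr)
    rec(s0, i0, [s0], 1.0)
    return out

p = 0.7
gw = chain_probs(3, 1, p, reed_frost=False)
rf = chain_probs(3, 1, p, reed_frost=True)
```

Each dictionary sums to one, and individual entries match Table 3; for instance, the path {3, 1, 1} has probability 3(1−p)²p² under Greenwood and 3(1−p)²p³ under Reed-Frost.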


    This Concludes Part I on DTMC.


An Intensive Course in Stochastic Processes and Stochastic Differential Equations in Mathematical Biology

Part II

Branching Processes

Linda J. S. Allen
Texas Tech University
Lubbock, Texas U.S.A.

National Center for Theoretical Sciences
National Tsing Hua University

August 2008



Part II: Branching Processes

The subject of branching processes began in 1845 with Bienaymé and was advanced in the 1870s with the work of the Reverend Henry William Watson, a clergyman and mathematician, and the biometrician Francis Galton.

Galton in 1873 posed a problem and two questions whose solutions were not resolved until 1930:

Suppose adult males (N in number) in a population each have different surnames. Suppose in each generation, a0 percent of the adult males have no male children who survive to adulthood; a1 have one such child; a2 have two, and so on up to a5, who have five.

1. Find what proportion of the surnames become extinct after r generations.
2. Find how many instances there are of the same surname being held by m persons.

Reference for Part II: [1], Chapter 4.

We will Discuss Single-Type and Multi-Type Branching Processes (BP)

A. Single-Type Galton-Watson BP: Each generation, keep track of only one type of individual, cell, etc.

    Applications:

    (1) Family Names

    (2) Cell Division

    (3) Network Theory

B. Multi-Type Galton-Watson BP: Each generation, keep track of k types of individuals, cells, etc.

    Application:

    (1) k Different Age Groups in a Population.

A. Single-Type Galton-Watson Branching Processes

The type of problem studied by Galton and Watson is appropriately named a Galton-Watson Branching Process.

Discrete-time branching processes are DTMC. Branching processes are frequently studied separately from Markov chains because

(a) there is a wide variety of applications of branching processes: electron multipliers, neutron chain reactions, population growth, survival of mutant genes, changes in DNA and chromosomes, cell cycle, cancer cells, chemotherapy, and network theory;

(b) techniques other than transition matrices are used to study their behavior: probability generating functions.

Assumptions in the Galton-Watson Branching Process

Let X0 = total population size at the zeroth generation and let Xn = total population size at the nth generation.

The process {Xn}, n = 0, 1, 2, . . . , has state space {0, 1, 2, . . .} and will be referred to as a branching process (bp).

Each individual in generation n gives birth to Y offspring of the same type in the next generation (single-type bp), where Y is a random variable that takes values in {0, 1, 2, . . .}. The offspring distribution is

Prob{Y = k} = pk, k = 0, 1, 2, . . . .

Each individual gives birth independently of other individuals.

An Illustration of a BP: A Stochastic Realization or Sample Path

Let X0 = 1 = population size (a married couple), where the family history is followed over time.

Figure 1: A sample path or stochastic realization of a branching process {Xn}. In the first generation, four individuals are born, X1 = 4. The four individuals give birth to three, zero, four, and one individuals, respectively, making a total of eight individuals in generation 2, X2 = 8.

We Digress Here to Talk about Generating Functions

Assume X is a discrete random variable with state space {0, 1, 2, . . .}. Let the probability mass function of X equal

Prob{X = j} = pj, j = 0, 1, 2, . . . , where Σ_{j=0}^∞ pj = 1.

Mean or First Moment: μX = E(X) = Σ_{j=0}^∞ j pj

Variance or Second Moment about the Mean:

σ²X = E[(X − μX)²] = E(X²) − μX² = Σ_{j=0}^∞ j² pj − μX².

nth Moment: E(X^n) = Σ_{j=0}^∞ j^n pj

Definition of Probability Generating Function

Definition 1. The probability generating function (pgf) of X is defined on a subset of the reals,

PX(t) = E(t^X) = Σ_{j=0}^∞ pj t^j, some t ∈ R.

Because Σ_{j=0}^∞ pj = 1, the sum converges absolutely for |t| ≤ 1, implying PX(t) is well defined for |t| ≤ 1 and infinitely differentiable for |t| < 1. As the name implies, the pgf generates the probabilities associated with the distribution:

PX(0) = p0, P′X(0) = p1, P″X(0) = 2! p2.

In general, the kth derivative of the pgf of X satisfies

P(k)X(0) = k! pk.

The PGF can be Used to Calculate the Mean and Variance.

Note that P′X(t) = Σ_{j=1}^∞ j pj t^(j−1) for −1 < t < 1. The mean is

P′X(1) = Σ_{j=1}^∞ j pj = E(X) = μX.

Also, P″X(t) = Σ_{j=1}^∞ j(j − 1) pj t^(j−2) implies

P″X(1) = Σ_{j=1}^∞ j(j − 1) pj = E(X²) − E(X).

The variance is

σ²X = E(X²) − E(X) + E(X) − [E(X)]² = P″X(1) + P′X(1) − [P′X(1)]².

Other Generating Functions

Definition 2. The moment generating function (mgf) is

MX(t) = E(e^(tX)) = Σ_{j=0}^∞ pj e^(jt), some t ∈ R.

The cumulant generating function (cgf) is the natural logarithm of the moment generating function,

KX(t) = ln[MX(t)].

Note MX(t) is always well defined for t ≤ 0. The mgf generates the moments:

MX(0) = 1, M′X(0) = μX = E(X), M″X(0) = E(X²), M(k)X(0) = E(X^k).

An Example Applying PGF and MGF

Poisson: pj = λ^j e^(−λ)/j!, j = 0, 1, 2, . . . , with parameter λ > 0. The pgf and mgf are

PX(t) = e^(λ(t−1)) and MX(t) = e^(λ(e^t − 1)),

so that μX = P′X(1) = λ and σ²X = P″X(1) + P′X(1) − [P′X(1)]² = λ² + λ − λ² = λ.
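For the Poisson distribution with parameter λ, the pgf is PX(t) = e^(λ(t−1)) and the mean and variance both equal λ; this can be checked numerically from a truncated pmf (a sketch, with λ = 3 assumed):

```python
from math import exp, factorial

lam = 3.0
# truncated Poisson pmf; the tail beyond 80 terms is negligible
p = [exp(-lam) * lam ** j / factorial(j) for j in range(80)]

def P(t):
    """pgf P_X(t) = sum_j p_j t^j (should equal exp(lam*(t-1)))."""
    return sum(pj * t ** j for j, pj in enumerate(p))

mean = sum(j * pj for j, pj in enumerate(p))              # = lam
var = sum(j * j * pj for j, pj in enumerate(p)) - mean ** 2   # = lam
```

P(1) returns 1 (total probability), and both moment sums agree with λ to numerical precision.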


h0(t) = t.

Recall Y is the offspring random variable; subscripts on Y index the offspring. In the next generation, each individual gives birth to k individuals with probability pk. The pgf of X1 is

h1(t) = Σ_{k=0}^∞ pk t^k = f(t).

Then X2 = Y1 + · · · + YX1, because each of the X1 individuals gives birth to Y individuals and the sum of all these births is X2. Note for a fixed sum of m iid random variables Yi, Σ_{i=1}^m Yi, the pgf is

P_{ΣYi}(t) = E(t^(Y1) · · · t^(Ym)) = E(t^(Y1)) · · · E(t^(Ym)) = [f(t)]^m.

The PGF of X2 is a Composition.

But X2 is a sum of a random number X1 of iid Yi. The pgf of X2 is

h2(t) = E(t^(Y1 + · · · + YX1)) = Σ_{m=0}^∞ Prob{X1 = m} [f(t)]^m = h1(f(t)) = f(f(t)).

Properties of the offspring pgf used below include:

(4) f′(t) = Σ_{k=1}^∞ k pk t^(k−1) > 0 for t ∈ (0, 1], where m = f′(1).

(5) f″(t) = Σ_{k=2}^∞ k(k − 1) pk t^(k−2) > 0 for t ∈ (0, 1).

Main Theorem for Branching Processes.

Theorem 1. Assume the offspring distribution {pk} and the pgf f(t) satisfy properties (1)–(5). In addition, assume X0 = 1. If m ≤ 1, then

lim_{n→∞} Prob{Xn = 0} = lim_{n→∞} p0(n) = 1,

and if m > 1, then there exists q < 1 such that f(q) = q and

lim_{n→∞} Prob{Xn = 0} = lim_{n→∞} p0(n) = q.

If m ≤ 1, then Theorem 1 states that the probability of ultimate extinction is one. If m > 1, then there is a positive probability 1 − q that the bp does not become extinct (e.g., a family name does not become extinct, a mutant gene becomes established, a population does not die out).

Indication of Proof

(1) The sequence {p0(n)} is monotone increasing and bounded above:

lim_{n→∞} p0(n) = q.

(2) The limit is a fixed point of f:

q = lim_{n→∞} p0(n) = lim_{n→∞} f(p0(n − 1)) = f(q).

(3) m ≤ 1 iff f′(t) < 1 for t ∈ [0, 1).


Figure 3: Two different probability generating functions y = f(t) intersect y = t in either one or two points on [0, 1].

All States are Transient in the Galton-Watson BP, Except State Zero

Note that the zero state is absorbing. In terms of MC theory, the one-step transition probability

p00(n) = 1.

Theorem 2. Assume the offspring distribution {pk} and the pgf f(t) satisfy properties (1)–(5). In addition, assume X0 = 1. Then the states 1, 2, . . . are transient. In addition, if the mean m > 1, then

lim_{n→∞} Prob{Xn = 0} = q = 1 − Prob{ lim_{n→∞} Xn = ∞ },

where 0 < q < 1 is the unique fixed point of the pgf, f(q) = q.

A Corollary to the BP Theorem when X0 = N.

Corollary 1. Assume the offspring distribution {pk} and the pgf f(t) satisfy properties (1)–(5). In addition, assume X0 = N. If m ≤ 1, then

lim_{n→∞} Prob{Xn = 0} = lim_{n→∞} [p0(n)]^N = 1.

If m > 1, then

lim_{n→∞} Prob{Xn = 0} = lim_{n→∞} [p0(n)]^N = q^N < 1.

Applications of Discrete-Time Branching Processes.

Single-Type Process: {Xn}

(1) Family Names

    (2) Cell Cycle

    (3) Network Theory

(1) An Example of a BP due to Lotka

Example 2. Lotka assumed the number of sons a male has in his lifetime has the following geometric probability distribution:

p0 = 1/2 and pk = (3/5)^(k−1) (1/5) for k = 1, 2, . . . .

Note that Σ_{k=1}^∞ pk = 1/2 and

f(t) = 1/2 + (1/5) Σ_{k=1}^∞ (3/5)^(k−1) t^k = 1/2 + (1/5) · t/(1 − 3t/5) = 1/2 + t/(5 − 3t),

m = f′(1) = (1/5)/(1 − 3/5)² = 5/4 > 1.

The fixed points of f(t) are found by solving

1/2 + t/(5 − 3t) = t or 6t² − 11t + 5 = 0,

so that the fixed point is q = 5/6 (the other root is t = 1). A male has a probability of 5/6 that his line of descent becomes extinct and a probability of 1/6 that his descendants will continue forever.
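Both conclusions can be checked numerically: the iteration p0(n) = f(p0(n−1)) converges to the extinction probability q = 5/6, and direct simulation of the branching process agrees. A sketch (the trial count and population cap are illustrative choices):

```python
import random

def f(t):                                  # Lotka's offspring pgf
    return 0.5 + t / (5 - 3 * t)

# Fixed-point iteration p0(n) = f(p0(n-1)) starting from p0(0) = 0.
q = 0.0
for _ in range(500):
    q = f(q)                               # converges to the smallest fixed point 5/6

def extinct(rng, max_gen=80, cap=500):
    """Simulate one line of descent; True if it dies out."""
    x = 1
    for _ in range(max_gen):
        if x == 0:
            return True
        if x > cap:                        # survival is then essentially certain
            return False
        total = 0
        for _ in range(x):
            if rng.random() < 0.5:
                continue                   # no sons, probability 1/2
            k = 1
            while rng.random() < 0.6:      # geometric tail: p_k = (1/5)(3/5)^(k-1)
                k += 1
            total += k
        x = total
    return x == 0

rng = random.Random(1)
frac = sum(extinct(rng) for _ in range(4000)) / 4000   # close to 5/6
```

The Monte Carlo extinction fraction agrees with the fixed point 5/6 to within sampling error.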

(2) An Application to the Cell Cycle

Each cell, after completing its life cycle, doubles in size, then divides into two progeny cells of equal sizes (Kimmel and Axelrod, 2002). After cell division, some cells die, some remain inactive or quiesce, and some keep dividing or proliferating. After cell division:

(1) Cell proliferation, probability p2
(2) Cell death, probability p0
(3) Cell quiescence, probability p1, with p0 + p1 + p2 = 1.

The Cell Cycle is a Galton-Watson Process

Let Xn be the number of proliferating cells at time n. The pgf is

f(t) = (p0 + p1)² + 2p2(p0 + p1) t + p2² t² = (p2 t + p0 + p1)².

The mean of the proliferating cells is

m = f′(1) = 2p2.

Reference: Kimmel and Axelrod (2002)
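With the offspring pgf f(t) = (p2 t + p0 + p1)², the extinction probability of the proliferating cell population is the smallest fixed point of f. A sketch with assumed illustrative values p0 = 0.1, p1 = 0.2, p2 = 0.7 (so m = 2p2 = 1.4, not values from the slides):

```python
p0, p1, p2 = 0.1, 0.2, 0.7       # assumed illustrative probabilities

def f(t):
    """Offspring pgf of proliferating cells: (p2*t + p0 + p1)^2."""
    return (p2 * t + p0 + p1) ** 2

q = 0.0
for _ in range(500):
    q = f(q)                     # converges to the smallest fixed point
# here f(q) = q reduces to 0.49*q**2 - 0.58*q + 0.09 = 0, with roots 9/49 and 1
```

For these values the extinction probability is q = 9/49, so a single proliferating cell founds a surviving clone with probability 40/49.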

(3) An Application of BP to Network Theory

In disease networks, individuals are referred to as vertices or nodes in a network, and their connectedness (by an edge) to other individuals in the network is described by a degree distribution.

Let pk = probability that a node or vertex in the network is connected by an edge to k other vertices. Then {pk}, k = 0, 1, 2, . . . , is known as the degree distribution for the network. In disease transmission, it is important to determine the distribution of degrees for nodes reached from a random node. For a randomly chosen node, the probability of reaching a node of degree k is proportional to k pk (because there are k ways to reach this node). But for calculating the spread of disease, we do not count the edge on which the disease entered; hence the probability associated with spread has degree k − 1, called the excess degree, qk−1 ∝ k pk. Thus,

qk−1 = k pk / Σ_{k=1}^∞ k pk, k = 1, 2, . . . .

References: Newman (2002), Brauer (2008)

PGF for the Degree Distribution and Excess Degree Distributions

f0(t) = Σ_{k=0}^∞ pk t^k and f1(t) = Σ_{k=0}^∞ qk t^k = Σ_{k=1}^∞ k pk t^(k−1) / m0,

where m0 = Σ_{k=0}^∞ k pk is the mean of the degree distribution. Not every connection in the network leads to disease transmission. Thus, we define the mean transmissibility of the disease as T, 0 < T < 1, the probability that the disease is transmitted along an edge. The binomial distribution can be applied to a node of degree k to determine the probability of m ≤ k transmissions:

C(k, m) T^m (1 − T)^(k−m).

The PGF as a Function of Transmissibility

The probability rm that there are m transmissions is

rm = Σ_{k=m}^∞ pk C(k, m) T^m (1 − T)^(k−m).

Thus, the pgf associated with {rm}, m = 0, 1, 2, . . . , is

f0(t, T) = Σ_{m=0}^∞ rm t^m
         = Σ_{m=0}^∞ [ Σ_{k=m}^∞ pk C(k, m) T^m (1 − T)^(k−m) ] t^m
         = Σ_{k=0}^∞ pk Σ_{m=0}^k C(k, m) (tT)^m (1 − T)^(k−m)
         = Σ_{k=0}^∞ pk (1 − T + tT)^k
         = f0(1 − T + tT).

Also, f1(t, T) = f1(1 − T + tT).

    L. J. S. Allen Texas Tech University

The Basic Reproduction Number is Defined for a Network

The mean of the excess degree distribution is defined as the basic reproduction number of the disease, R0:

R0 = d f1(1 − T + tT)/dt |_{t=1} = T f1'(1).

It follows from the identities

f0(t) = Σ_{k=0}^∞ pk t^k  and  f1(t) = Σ_{k=0}^∞ qk t^k = (Σ_{k=1}^∞ kpk t^{k−1}) / m0,

that

f1(t) = f0'(t) / m0.   (2)

Thus,

R0 = T f1'(1) = T f0''(1) / m0.   (3)
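Formula (3) gives R0 directly from the degree distribution, since f0''(1) = Σ_k k(k − 1) pk and m0 = Σ_k k pk; a quick sketch (p and T are illustrative choices, not from the slides):

```python
# R0 = T * f0''(1) / m0, computed from a hypothetical degree distribution.

p = [0.1, 0.3, 0.4, 0.2]   # degrees 0..3
T = 0.5                    # illustrative transmissibility

m0 = sum(k * pk for k, pk in enumerate(p))                # mean degree
f0_dd_1 = sum(k * (k - 1) * pk for k, pk in enumerate(p)) # f0''(1)
R0 = T * f0_dd_1 / m0
print(R0)
```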


An Example where the Network has a Poisson Degree Distribution

Example 3. Assume that the degree distribution has a Poisson distribution with parameter λ. This type of network is known as a Poisson random graph. The generating function for a Poisson distribution is f0(t) = e^{λ(t−1)}, where m0 = λ. Applying the identities (2) and (3): f1(t) = f0(t) and

R0 = T m0.

Applying Theorem 1, the probability the disease dies out (q) is the fixed point of f1(t, T) = f0(t, T) = e^{λ([1−T+tT]−1)} = e^{R0(t−1)}. That is,

q = e^{R0(q−1)}.

For example, if R0 = 2 and initially one infectious individual is introduced into the population, then the probability the disease dies out is q ≈ 0.203. The probability the disease becomes endemic is 1 − q ≈ 0.797.
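The fixed-point equation q = e^{R0(q−1)} can be solved by iterating from q = 0, which converges upward to the minimal fixed point; a sketch reproducing q ≈ 0.203 for R0 = 2:

```python
from math import exp

# Fixed-point iteration for q = exp(R0*(q - 1)) in the Poisson random
# graph example. Starting from 0 converges to the minimal fixed point.

R0 = 2.0
q = 0.0
for _ in range(200):
    q = exp(R0 * (q - 1))

print(round(q, 3))   # 0.203, matching the slides
```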


An Example where the Graph is Complete

Example 4. Suppose the disease network is a complete graph with N ≥ 2 nodes, i.e., each node has exactly N − 1 edges or connections, for a total of N(N − 1)/2 edges. The degree distribution is p_{N−1} = 1 and pk = 0 for k ≠ N − 1. The generating functions are f0(t) = t^{N−1} and f1(t) = t^{N−2}. Thus, the basic reproduction number is

R0 = T(N − 2).


An Example Comparing 2 Connections to N − 1 Connections

Example 5. Suppose in the disease network everyone is connected to only 2 individuals:

p2 = 1, f0(t) = t^2, f1(t) = t, R0 = T.

Suppose there is one individual with N − 1 connections (a small world network):

p2 = (N − 1)/N,  p_{N−1} = 1/N,
f0(t) = ((N − 1)/N) t^2 + (1/N) t^{N−1},
f1(t) = (2/3) t + (1/3) t^{N−2},

and

R0 = T (2/3 + (N − 2)/3).

Comparing the R0 to the complete graph:

2 connections < small world < complete graph
T < T(2/3 + (N − 2)/3) < T(N − 2).
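The three values of R0 can be compared numerically; a small sketch with illustrative values of T and N (valid for N > 3, where the ordering above holds):

```python
# R0 = T * f1'(1) for the three networks of Example 5, with
# illustrative choices T = 0.3 and N = 10.

T, N = 0.3, 10

R0_two_conn = T                              # everyone has 2 connections
R0_small_world = T * (2/3 + (N - 2)/3)       # one node with N-1 connections
R0_complete = T * (N - 2)                    # complete graph

print(R0_two_conn, R0_small_world, R0_complete)
```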


B. Multitype Galton–Watson Branching Processes

In a multitype bp, each individual may give birth to different types or classifications of individuals in the population, say k types. When k = 1, the bp is a single-type bp. There is an offspring distribution corresponding to each of these different types of individuals. For example, the population may be divided according to age or size, and in each generation individuals may age or grow to another age or size class. In addition, in each generation, individuals give birth to new individuals in the youngest age or smallest size class.

Denote the multitype bp as {X(n)}_{n=0}^∞, a vector of random variables,

X(n) = (X1(n), X2(n), . . . , Xk(n))^T,

with k different types of individuals. Each random variable Xi(n) has k associated random variables, {Yji}_{j=1}^k, where Yji is the random variable for the offspring distribution for an individual of type i giving birth to individuals of type j = 1, 2, . . . , k.


We Extend the PGF to Multitype BP

Let pi(s1, s2, . . . , sk) denote the probability an individual of type i gives birth to s1 individuals of type 1, s2 individuals of type 2, . . ., and sk individuals of type k; that is,

pi(s1, s2, . . . , sk) = Prob{Y1i = s1, Y2i = s2, . . . , Yki = sk}.

Define the pgf for Xi, fi : [0, 1]^k → [0, 1], as follows:

fi(t1, t2, . . . , tk) = Σ_{sk=0}^∞ · · · Σ_{s2=0}^∞ Σ_{s1=0}^∞ pi(s1, s2, . . . , sk) t1^{s1} t2^{s2} · · · tk^{sk},

for i = 1, 2, . . . , k.


The PGF for Multitype BP when X(0) = δi

Let δi denote a k-vector with the ith component one and the remaining components zero,

δi = (δ1i, δ2i, . . . , δki)^T,

where δij is the Kronecker delta symbol. Then X(0) = δi means there is initially one individual of type i in the population. The pgf for Xi(0) given X(0) = δi is f_i^0(t1, t2, . . . , tk) = ti, and the pgf for Xi(n) given X(0) = δi is f_i^n(t1, t2, . . . , tk):

Σ_{sk=0}^∞ · · · Σ_{s1=0}^∞ Prob{X1(n) = s1, . . . , Xk(n) = sk | X(0) = δi} t1^{s1} · · · tk^{sk},

with f_i^1 = fi. Let F ≡ F(t1, . . . , tk) = (f1, . . . , fk) denote the vector of pgfs, F : [0, 1]^k → [0, 1]^k. The function F has a fixed point at (1, 1, . . . , 1) since fi(1, 1, . . . , 1) = 1. Ultimate extinction of the population depends on whether F has another fixed point in [0, 1]^k, which depends on the mean.


We Compute the Mean for Multitype BP in Terms of the PGF

Let mji denote the expected number of births of a type j individual by a type i individual; that is,

mji = E(Xj(1) | X(0) = δi) for i, j = 1, 2, . . . , k.

The means can be defined in terms of the pgf:

mji = ∂fi(t1, . . . , tk)/∂tj |_{t1=1,...,tk=1}.

Define the k × k expectation matrix,

M = ( m11  m12  · · ·  m1k
      m21  m22  · · ·  m2k
      ...  ...         ...
      mk1  mk2  · · ·  mkk ).

If matrix M is regular (i.e., M^p > 0 for some p > 0), then M has a simple positive eigenvalue of maximum modulus, which we denote as ρ.


Multitype Branching Process Theorem

Theorem 3. Assume each of the component functions fi of the pgf F(t1, . . . , tk) = (f1(t1, . . . , tk), . . . , fk(t1, . . . , tk)) is a nonlinear function of the variables t1, . . . , tk and the expectation matrix M is regular. If the dominant eigenvalue ρ of M satisfies ρ ≤ 1, then

lim_{n→∞} Prob{X(n) = 0 | X(0) = δi} = 1,

i = 1, 2, . . . , k. If the dominant eigenvalue ρ of M satisfies ρ > 1, then there exists a vector q = (q1, q2, . . . , qk)^T, qi ∈ [0, 1), i = 1, 2, . . . , k, the unique nonnegative solution to F(t1, t2, . . . , tk) = (t1, t2, . . . , tk), such that

lim_{n→∞} Prob{X(n) = 0 | X(0) = δi} = qi,

i = 1, 2, . . . , k.


Corollary to the Multitype Branching Process Theorem when X(0) = (r1, . . . , rk)^T

Corollary 2. Suppose the hypotheses of Theorem 3 hold and X(0) = (r1, . . . , rk)^T. Then if the dominant eigenvalue ρ of matrix M satisfies ρ > 1,

lim_{n→∞} Prob{X(n) = 0 | X(0) = (r1, r2, . . . , rk)^T} = q1^{r1} q2^{r2} · · · qk^{rk}.


(1) Application of Multitype BP to Age-Structured Populations

Suppose there are k age classes. An individual of type i either survives to become a type i + 1 individual with probability p_{i+1,i} > 0 or dies with probability 1 − p_{i+1,i}, i = 1, 2, . . . , k − 1. Probability p_{k+1,k} = 0. An individual of type i gives birth to r individuals of type 1 with probability b_{i,r}. The offspring distribution for an individual of type i satisfies

b_{i,r} ≥ 0 and Σ_{r=0}^∞ b_{i,r} = 1, i = 1, 2, . . . , k.

The mean of the offspring distribution is

b̄i = Σ_{r=1}^∞ r b_{i,r}.


The Expectation Matrix has the Form of a Leslie Matrix Model

The expectation matrix can be calculated from the pgfs fi:

fi(t1, t2, . . . , tk) = [p_{i+1,i} t_{i+1} + (1 − p_{i+1,i})] Σ_{r=0}^∞ b_{i,r} t1^r, i = 1, . . . , k.

e.g., f1 = [p21 t2 + (1 − p21)] Σ_{r=0}^∞ b_{1,r} t1^r,

m11 = ∂f1/∂t1 |_{ti=1} = Σ_r r b_{1,r} = b̄1,  m21 = ∂f1/∂t2 |_{ti=1} = p21,

M = ( b̄1   b̄2   · · ·  b̄_{k−1}    b̄k
      p21   0    · · ·  0           0
      0     p32  · · ·  0           0
      ...               ...         ...
      0     0    · · ·  p_{k,k−1}   0 ).

The form of matrix M is known as a Leslie matrix. Assume matrix M is regular and that the pgfs are nonlinear. Then Theorem 3 can be applied.


An Example of a Stochastic Age-Structured Branching Process

Example 6. Suppose there are two age classes with expectation matrix

M = ( b̄1  b̄2 ; p21  0 ) = ( 3/4  1 ; 1/2  0 ).

The characteristic equation of M is λ² − (3/4)λ − 1/2 = 0, so that the dominant eigenvalue is ρ = (3 + √41)/8 ≈ 1.175 > 1. Suppose the birth probabilities are

b_{1,r} = 1/2 (r = 0), 1/4 (r = 1, 2), 0 (r ≠ 0, 1, 2),
b_{2,r} = 1/4 (r = 0, 2), 1/2 (r = 1), 0 (r ≠ 0, 1, 2).

The mean number of births for each age class is

b̄1 = 3/4 = Σ_{r=1}^∞ r b_{1,r} and b̄2 = 1 = Σ_{r=1}^∞ r b_{2,r}

(the values in the first row of M).


To Find the Probability of Extinction, we Find the Fixed Points of the PGF

The pgfs for the two age classes are

f1(t1, t2) = [(1/2)t2 + 1/2][1/2 + (1/4)t1 + (1/4)t1²]
f2(t1, t2) = 1/4 + (1/2)t1 + (1/4)t1².

Since ρ > 1, the preceding system F = (f1, f2) has a unique fixed point on [0, 1) × [0, 1). The fixed point (q1, q2) is found by solving f1(q1, q2) = q1 and f2(q1, q2) = q2. The solution is

(q1, q2) ≈ (0.6285, 0.6631).

Thus, if there are initially five individuals of age 1 and three individuals of age 2, then the probability of ultimate extinction of the total population is approximately

(0.6285)^5 (0.6631)^3 ≈ 0.0286.
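The fixed point (q1, q2) can be computed by iterating F starting from (0, 0); since ρ > 1, the iterates converge to the minimal fixed point. A sketch reproducing the values of Example 6:

```python
# Fixed-point iteration F(q1, q2) = (q1, q2) for the two-age-class
# branching process of Example 6, starting from (0, 0).

def F(t1, t2):
    f1 = (0.5 * t2 + 0.5) * (0.5 + 0.25 * t1 + 0.25 * t1**2)
    f2 = 0.25 + 0.5 * t1 + 0.25 * t1**2
    return f1, f2

q1, q2 = 0.0, 0.0
for _ in range(500):
    q1, q2 = F(q1, q2)

print(q1, q2)            # compare with (0.6285, 0.6631) from the slides
print(q1**5 * q2**3)     # extinction probability, approximately 0.0286
```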


There are Many Applications of Branching Processes in Biology

Several good references devoted to branching processes, in addition to [1], Chapter 4:

1. Harris, TE. 1963. The Theory of Branching Processes. Prentice Hall, NJ.
2. Jagers, P. 1975. Branching Processes with Biological Applications. Wiley, Chichester.
3. Kimmel, M and D Axelrod. 2002. Branching Processes in Biology. Springer-Verlag, NY.
4. Mode, CJ. 1971. Multitype Branching Processes: Theory and Applications. Elsevier, NY.


This Concludes Part II on Branching Processes.

Part I: Discrete-Time Markov Chains - DTMC
  Theory
  Applications to Random Walks, Populations, and Epidemics

Part II: Branching Processes
  Theory
  Applications to Cellular Processes, Network Theory, and Populations

Part III: Continuous-Time Markov Chains - CTMC
  Theory
  Applications to Populations and Epidemics

Part IV: Stochastic Differential Equations - SDE
  Comparisons to Other Stochastic Processes, DTMC and CTMC
  Applications to Populations and Epidemics


An Intensive Course in Stochastic Processes and Stochastic Differential Equations in Mathematical Biology

Part III
Continuous-Time Markov Chains

Linda J. S. Allen
Texas Tech University
Lubbock, Texas U.S.A.

National Center for Theoretical Sciences
National Tsing Hua University
August 2008




Basic Reference for Part III of this Course

[1] Allen, LJS. 2003. An Introduction to Stochastic Processes with Applications to Biology. Prentice Hall, Upper Saddle River, NJ. Chapters 5, 6, 7.

[2] Other references will be noted.


Part III: Continuous-Time Markov Chains - CTMC

Some Basic Definitions and Notation

Definition 1. Let {X(t)}, t ∈ [0, ∞), be a collection of discrete random variables with values in {0, 1, 2, . . .}. Then the stochastic process {X(t)} is called a continuous-time Markov chain if it satisfies the following condition:

For any sequence of real numbers 0 ≤ t0 < t1 < · · · < tn < tn+1,

Prob{X(tn+1) = in+1 | X(t0) = i0, X(t1) = i1, . . . , X(tn) = in} = Prob{X(tn+1) = in+1 | X(tn) = in}.

The probability distribution {pi(t)}_{i=0}^∞ associated with X(t) is

pi(t) = Prob{X(t) = i},

with probability vector p(t) = (p0(t), p1(t), . . .)^T.


The Transition Matrix for the CTMC has Properties similar to DTMC

Transition probabilities:

pji(t, s) = Prob{X(t) = j | X(s) = i}, s < t,

for i, j = 0, 1, 2, . . . . If the transition probabilities depend only on the length of the time step t − s, they are called stationary or homogeneous transition probabilities; otherwise they are called nonstationary or nonhomogeneous. We shall assume the transition probabilities are stationary, pji(t − s), t > s.

Generally, the transition matrix P(t) = (pji(t)), t > 0, is a stochastic matrix,

Σ_{j=0}^∞ pji(t) = 1,

unless the process is explosive (blow-up in finite time). If the process is nonexplosive, then P(t) is stochastic for all time and satisfies

P(s)P(t) = P(s + t)

for all s, t ∈ [0, ∞).


Waiting Times Between Jumps

The distinction between discrete- and continuous-time Markov chains is that in a DTMC there is a jump to a new state at times 1, 2, . . ., but in a CTMC the jump to a new state may occur at any time t ≥ 0. The collection of random variables {Wi} denote the jump times or waiting times, and the times Ti = Wi+1 − Wi are referred to as the interevent times.

Figure 1: One sample path of a CTMC, illustrating waiting times W1, W2, W3, W4 and interevent times T0, T1, T2, T3. The process is continuous from the right.


An Example of an Explosive Process

If the waiting times approach a finite positive constant, W = sup{Wi}, while the values of the states approach infinity,

lim_{i→∞} X(Wi) = ∞,

then the process is explosive. We will assume the process is nonexplosive, unless noted otherwise. Sample paths are continuous from the right, but for ease in sketching sample paths, they are often drawn as connected rectilinear curves.

Figure 2: One sample path of a CTMC that is explosive (jump times W1, W2, W3, W4, . . . accumulating at W).


The Poisson Process

The Poisson process {X(t)}, t ∈ [0, ∞), is a CTMC with the following properties:

(1) X(0) = 0.

(2) p_{i+1,i}(Δt) = Prob{X(t + Δt) = i + 1 | X(t) = i} = λΔt + o(Δt)
    p_{ii}(Δt) = Prob{X(t + Δt) = i | X(t) = i} = 1 − λΔt + o(Δt)
    p_{ji}(Δt) = Prob{X(t + Δt) = j | X(t) = i} = o(Δt), j ≥ i + 2
    p_{ji}(Δt) = 0, j < i.

These are known as infinitesimal transition probabilities; e.g.,

lim_{Δt→0} [p_{i+1,i}(Δt) − λΔt]/Δt = 0.

The transition probabilities are independent of i and j and depend only on the length of time Δt.


The Transition Matrix for the Poisson Process

P(Δt) = ( p00(Δt)  p01(Δt)  p02(Δt)  · · ·
          p10(Δt)  p11(Δt)  p12(Δt)  · · ·
          p20(Δt)  p21(Δt)  p22(Δt)  · · ·
          p30(Δt)  p31(Δt)  p32(Δt)  · · ·
          ...      ...      ...          )

      = ( 1 − λΔt  0        0        · · ·
          λΔt      1 − λΔt  0        · · ·
          0        λΔt      1 − λΔt  · · ·
          0        0        λΔt      · · ·
          ...      ...      ...          ) + o(Δt).

Note column sums of the matrix are one.


Assumptions (1) and (2) are Used to Derive a System of Differential Equations for the Poisson Process

Because X(0) = 0, it follows that pi0(t) = pi(t). Then

p0(t + Δt) = p0(t)[1 − λΔt + o(Δt)].

Subtracting p0(t), dividing by Δt, and letting Δt → 0,

dp0(t)/dt = −λ p0(t), p0(0) = 1.

The solution is

p0(t) = e^{−λt}.


The Poisson Probabilities are Derived

Similarly,

pi(t + Δt) = pi(t)[1 − λΔt + o(Δt)] + p_{i−1}(t)[λΔt + o(Δt)] + o(Δt),

leads to

dpi(t)/dt = −λ pi(t) + λ p_{i−1}(t), pi(0) = 0, i ≥ 1,

a system of differential-difference equations. The system can be solved sequentially, beginning with p0(t) = e^{−λt}, to show

p1(t) = λt e^{−λt},  p2(t) = (λt)² e^{−λt}/2!,

and in general, a Poisson probability distribution with parameter λt,

pi(t) = (λt)^i e^{−λt}/i!, i = 0, 1, 2, . . . ,

with mean and variance

m(t) = λt = σ²(t).
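The differential-difference system can also be integrated numerically and compared with the closed-form Poisson probabilities; a sketch using a simple Euler scheme on a truncated state space (the step size and truncation level are ad hoc choices):

```python
from math import exp, factorial

# Euler integration of dp_i/dt = -lam*p_i + lam*p_{i-1}, truncated at
# state imax, compared with the Poisson(lam*t) probabilities.

lam, t_end, dt, imax = 1.0, 5.0, 1e-4, 30
p = [1.0] + [0.0] * imax          # p_0(0) = 1, all other states empty

for _ in range(int(t_end / dt)):
    p = [p[i] + dt * (-lam * p[i] + (lam * p[i - 1] if i > 0 else 0.0))
         for i in range(len(p))]

poisson = [(lam * t_end)**i * exp(-lam * t_end) / factorial(i)
           for i in range(5)]
print([round(x, 4) for x in p[:5]])
print([round(x, 4) for x in poisson])
```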


The Interevent Time is Exponentially Distributed

Let W1 be the random variable for the time until the process reaches state 1, the holding time until the first jump. Then

Prob{W1 > t} = p0(t) = e^{−λt} or Prob{W1 ≤ t} = 1 − e^{−λt};

W1 is an exponential random variable with parameter λ. In general, it can be shown that the interevent time has an exponential distribution. We will show that this is true in general for Markov processes.

Figure 3: Sample path for a Poisson process with λ = 1.
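A sample path like the one in Figure 3 can be simulated from exponential interevent times; a sketch that also checks the mean number of events by time t against λt:

```python
import random
from math import log

# Simulate a Poisson process by summing exponential interevent times
# T_i with parameter lam; the jump times are W_i = T_0 + ... + T_{i-1}.

random.seed(0)
lam, t_end, reps = 1.0, 10.0, 20000

def events_by(t_end):
    t, count = 0.0, 0
    while True:
        # exponential interevent time via inverse transform;
        # 1 - random() lies in (0, 1], so log() is safe
        t += -log(1.0 - random.random()) / lam
        if t > t_end:
            return count
        count += 1

mean_count = sum(events_by(t_end) for _ in range(reps)) / reps
print(mean_count)   # close to lam * t_end = 10
```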


Derivation of the Differential Equations by Applying the Transition Matrix Leads to a New Matrix Known as the Generator Matrix

Writing the probabilities in terms of the transition matrix P(Δt):

p(t + Δt) = P(Δt) p(t)

lim_{Δt→0} [p(t + Δt) − p(t)]/Δt = lim_{Δt→0} ([P(Δt) − I]/Δt) p(t),

where I is the identity matrix. Thus,

dp/dt = Q p(t),

where matrix Q is known as the infinitesimal generator matrix,

Q = lim_{Δt→0} [P(Δt) − I]/Δt.
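For a finite-state, nonexplosive chain, dp/dt = Qp is solved by p(t) = e^{Qt} p(0), so P(t) = e^{Qt} and the semigroup property P(s)P(t) = P(s + t) can be checked numerically; a sketch with an illustrative two-state generator (not from the slides):

```python
import numpy as np

# Check P(s)P(t) = P(s+t) and the stochastic-column property of
# P(t) = e^{Qt}, using a truncated Taylor series for the exponential.

def expm_series(A, terms=60):
    """Matrix exponential via truncated Taylor series (fine for small A)."""
    result, term = np.eye(A.shape[0]), np.eye(A.shape[0])
    for n in range(1, terms):
        term = term @ A / n
        result = result + term
    return result

Q = np.array([[-2.0, 1.0],
              [ 2.0, -1.0]])   # illustrative generator: columns sum to 0

s, t = 0.3, 0.5
Ps, Pt, Pst = expm_series(Q * s), expm_series(Q * t), expm_series(Q * (s + t))

print(np.allclose(Ps @ Pt, Pst))            # semigroup property holds
print(np.allclose(Pst.sum(axis=0), 1.0))    # columns of P sum to one
```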


The Generator Matrix has some Nice Properties

The generator matrix

Q = ( q00  q01  q02  · · ·
      q10  q11  q12  · · ·
      q20  q21  q22  · · ·
      ...  ...  ...      )

  = ( −Σ_{i=1}^∞ qi0   q01                  q02                  · · ·
      q10              −Σ_{i=0,i≠1}^∞ qi1   q12                  · · ·
      q20              q21                  −Σ_{i=0,i≠2}^∞ qi2   · · ·
      ...              ...                  ...                      ).

(1) Column sums are zero.
(2) The diagonal elements are negative and the off-diagonal elements are nonnegative.


The Generator Matrix for the Poisson Process

The generator matrix for the Poisson process is

Q = ( −λ   0    0   · · ·
       λ  −λ    0   · · ·
       0   λ   −λ   · · ·
      ...  ...  ...     ).
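Truncating the state space gives a finite version of this Q whose matrix exponential recovers the Poisson probabilities; a numerical sketch (the truncation level n and the value of λ are illustrative choices):

```python
import numpy as np
from math import exp, factorial

# Truncated Poisson-process generator: -lam on the diagonal, lam on the
# subdiagonal. Solving dp/dt = Qp via the matrix exponential, with
# p(0) = e_0, recovers the Poisson(lam*t) probabilities.

lam, n, t = 1.0, 40, 5.0
Q = -lam * np.eye(n) + lam * np.eye(n, k=-1)

# matrix exponential of Q*t via truncated Taylor series
P, term = np.eye(n), np.eye(n)
for j in range(1, 200):
    term = term @ (Q * t) / j
    P = P + term

p = P[:, 0]   # probability vector at time t, given X(0) = 0
poisson = [(lam * t)**i * exp(-lam * t) / factorial(i) for i in range(6)]
print(np.round(p[:6], 4))
print(np.round(poisson, 4))
```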

    The probability distribution p(t) = (p0(t), p1(t), . . . , pi(t),