Chapter 3: Birth and Death processes
Thus far, we have ignored the random element of population behaviour. This
prevents us from finding the relative likelihoods of events that might be of
interest, for example extinction. In this chapter, we focus on demographic stochasticity.
This component arises from the intrinsically stochastic nature of birth and death
processes. Even without external noise, one cannot predict future population
numbers with certainty. Stochastic models often predict population behaviour that is
significantly different from that of their deterministic equivalents. In these cases, the
randomness itself is central to the population dynamics. Under some nonlinearities,
this randomness can have a systematic influence on the population. Such effects cannot
be captured by a deterministic model.
3.1) Introduction to Birth and Death processes
A birth and death process is defined as any continuous-time Markov chain whose state
space is the set of all non-negative integers and whose transition rates from state $i$ to
state $j$, $r_{i,j}$, are equal to zero whenever $|i-j| > 1$. That is, a birth and death process
that is currently in state $i$ can only go to either state $i-1$ or state $i+1$. When the state
increases by one, we say that a birth has occurred and when the state decreases by
one, we say that a death has occurred. We thus have:
$$B(N) = r_{N,N+1}, \qquad D(N) = r_{N,N-1} \tag{3.1.1}$$
The Markov assumption states that only the current population size is of use in
predicting future population behaviour. Thus, by definition, other possible transition rate
predictors – like the environmental condition – are ignored.
A population whose size at time $t+\Delta t$ is $N$ can have experienced only one of four things
in the preceding interval $[t, t+\Delta t]$: a member of the population gave birth; a member
of the population died; no births or deaths occurred; or more than one event (whether birth or
death or both) occurred. This means:
$$p_N(t+\Delta t) = p_{N-1}(t)B(N-1)\Delta t + p_{N+1}(t)D(N+1)\Delta t + p_N(t)\{1 - [B(N)+D(N)]\Delta t\} + o(\Delta t) \quad \text{for } N \geq 1 \tag{3.1.2}$$
where $p_N(t)$ is the probability that the population size at time $t$ is $N$.
The probabilities of the first three events are given by the first three terms on the RHS,
and $o(\Delta t)$ is the (negligible) probability of more than one event occurring within the
interval $[t, t+\Delta t]$, resulting in the final population size at time $t+\Delta t$ being $N$. We need
to consider the case $N = 0$ separately, as the population can only enter this state
through a death when $N = 1$. In this case:
$$p_0(t+\Delta t) = p_1(t)D(1)\Delta t + p_0(t) + o(\Delta t) \tag{3.1.3}$$
By subtracting $p_N(t)$ from both sides of equations (3.1.2) and (3.1.3), dividing by
$\Delta t$ and letting $\Delta t \to 0$, we obtain the so-called Kolmogorov forward equations:
$$\frac{dp_N(t)}{dt} = p_{N-1}(t)B(N-1) + p_{N+1}(t)D(N+1) - p_N(t)[B(N)+D(N)] \quad \text{for } N > 0$$
$$\frac{dp_0(t)}{dt} = p_1(t)D(1) \tag{3.1.4}$$
The Kolmogorov equations are an infinite system of differential equations. They can be
written in matrix form as:
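$$\frac{d\mathbf{p}(t)}{dt} = \mathbf{p}(t)R \tag{3.1.5}$$
where $\mathbf{p}(t) = (p_0(t), p_1(t), \ldots)$ is a row vector and $R$ is the coefficient matrix with entries $R_{N,N+1} = B(N)$, $R_{N,N-1} = D(N)$, $R_{N,N} = -[B(N)+D(N)]$ and zeros elsewhere.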
The Kolmogorov equations are the primary means to define a time-homogeneous
Markov process. By solving these equations, we can find the probability distribution of
N as a function of time.
3.2) Solving the Kolmogorov equations
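The Kolmogorov equations can be solved for a linear “birth-only” model. Let $N_0$ denote
the initial population size. For a “birth-only” model (i.e. a model that assumes that the
members of the population cannot die), the Kolmogorov equations are as follows:
$$\frac{dp_N(t)}{dt} = p_{N-1}(t)B(N-1) - p_N(t)B(N) \quad \text{for } N > N_0 \tag{3.2.1}$$
$$\frac{dp_{N_0}(t)}{dt} = -p_{N_0}(t)B(N_0) \tag{3.2.2}$$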
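The above equations can be solved directly. Suppose that $B(N) = \lambda N$, i.e. the per capita
birth rate is not density-dependent. Upon substituting this transitional form into (3.2.2), we get:
$$\frac{dp_{N_0}(t)}{dt} = -\lambda N_0\,p_{N_0}(t) \;\Rightarrow\; p_{N_0}(t) = Ce^{-\lambda N_0 t}, \quad \text{where } C \text{ is a constant}$$
Using the boundary condition that $p_{N_0}(0) = 1$, we then have:
$$p_{N_0}(t) = e^{-\lambda N_0 t}$$
This expression derived for $p_{N_0}(t)$ can then be substituted into the Kolmogorov
equation (3.2.1) for $N = N_0 + 1$:
$$\frac{dp_{N_0+1}(t)}{dt} = \lambda N_0 e^{-\lambda N_0 t} - \lambda(N_0+1)\,p_{N_0+1}(t) \;\Rightarrow\; p_{N_0+1}(t) = N_0 e^{-\lambda N_0 t} + Ce^{-\lambda(N_0+1)t}, \quad \text{where } C \text{ is a constant}$$
Using the boundary condition that $p_{N_0+1}(0) = 0$, we then have:
$$p_{N_0+1}(t) = N_0 e^{-\lambda N_0 t}\left(1 - e^{-\lambda t}\right)$$
Repeating the procedure with the boundary condition $p_{N_0+2}(0) = 0$, it can be shown that:
$$p_{N_0+2}(t) = \frac{N_0(N_0+1)}{2}\,e^{-\lambda N_0 t}\left(1 - e^{-\lambda t}\right)^2$$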
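The first three terms suggest that the probability mass function for the population
number is:
$$p_N(t) = \binom{N-1}{N_0-1} e^{-\lambda N_0 t}\left(1 - e^{-\lambda t}\right)^{N-N_0} \quad \text{for } N \geq N_0 \tag{3.2.3}$$
This is a negative binomial distribution with parameters $N_0$ and $e^{-\lambda t}$. If $p_N(t)$ does
indeed satisfy equation (3.2.3), then the differential equation for $p_{N+1}(t)$ would be:
$$\frac{dp_{N+1}(t)}{dt} = \lambda N \binom{N-1}{N_0-1} e^{-\lambda N_0 t}\left(1 - e^{-\lambda t}\right)^{N-N_0} - \lambda(N+1)\,p_{N+1}(t) \tag{3.2.4}$$
Equation (3.2.4) is a first order differential equation. We can thus apply a result given by
Jaeger et al. (1974) to equation (3.2.4), with the boundary condition $p_{N+1}(0) = 0$, to get:
$$p_{N+1}(t) = \binom{N}{N_0-1} e^{-\lambda N_0 t}\left(1 - e^{-\lambda t}\right)^{N+1-N_0} \tag{3.2.5}$$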
We have thus shown that if $p_N(t)$ satisfies equation (3.2.3) then $p_{N+1}(t)$ also satisfies
equation (3.2.3). We also know that equation (3.2.3) is true when $N = N_0$. Hence, by the
induction principle, equation (3.2.3) is true for all $N \geq N_0$. We have thus
proved that the probability mass function for a linear “birth-only” process is a negative
binomial distribution with parameters $N_0$ and $e^{-\lambda t}$.
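The induction argument can also be checked numerically. The following is a minimal sketch, not part of the original analysis: it integrates the forward equations (3.2.1) and (3.2.2) with a classical fourth-order Runge-Kutta scheme on a truncated state space and compares the result with the closed form (3.2.3). The parameter values ($N_0 = 3$, $\lambda = 0.5$, $t = 2$), the truncation point `n_max` and the function names are all assumptions chosen for illustration.

```python
import math

def birth_only_pmf(n, n0, lam, t):
    """Closed form (3.2.3): negative binomial with parameters n0 and exp(-lam*t)."""
    if n < n0:
        return 0.0
    p = math.exp(-lam * t)
    return math.comb(n - 1, n0 - 1) * p ** n0 * (1.0 - p) ** (n - n0)

def integrate_birth_only(n0, lam, t_end, n_max, h=1e-3):
    """RK4 integration of the forward equations (3.2.1)-(3.2.2) on states n0..n_max."""
    states = list(range(n0, n_max + 1))

    def rhs(p):
        out = []
        for k, n in enumerate(states):
            # inflow from state n-1 (none for the lowest state n0)
            inflow = lam * (n - 1) * p[k - 1] if k > 0 else 0.0
            out.append(inflow - lam * n * p[k])
        return out

    p = [1.0] + [0.0] * (len(states) - 1)  # boundary condition p_{N0}(0) = 1
    for _ in range(round(t_end / h)):
        k1 = rhs(p)
        k2 = rhs([x + 0.5 * h * d for x, d in zip(p, k1)])
        k3 = rhs([x + 0.5 * h * d for x, d in zip(p, k2)])
        k4 = rhs([x + h * d for x, d in zip(p, k3)])
        p = [x + h * (a + 2 * b + 2 * c + d) / 6.0
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return dict(zip(states, p))

numeric = integrate_birth_only(n0=3, lam=0.5, t_end=2.0, n_max=60)
max_err = max(abs(numeric[n] - birth_only_pmf(n, 3, 0.5, 2.0)) for n in numeric)
print("largest absolute difference:", max_err)
```

Because births only move the population upwards, truncating the state space at `n_max` does not disturb the probabilities of the lower states; it only discards the mass that would have moved above `n_max`.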
Similarly, one can directly derive the probability distribution function for a “death-only”
process. The Kolmogorov equations for the death process are:
$$\frac{dp_N(t)}{dt} = p_{N+1}(t)D(N+1) - p_N(t)D(N) \quad \text{for } N < N_0 \tag{3.2.6}$$
$$\frac{dp_{N_0}(t)}{dt} = -p_{N_0}(t)D(N_0) \tag{3.2.7}$$
As before, we start by solving for $p_{N_0}(t)$ using the boundary condition $p_{N_0}(0) = 1$,
after which we can solve for $p_{N_0-1}(t)$, $p_{N_0-2}(t)$, …, $p_0(t)$ using the boundary condition
$p_N(0) = 0$ for $N \neq N_0$. When $D(N) = bN$ (i.e. the death rate is proportional to the
population size), the resulting probability mass function for the “death-only” process has
the form:
$$p_N(t) = \binom{N_0}{N} e^{-bNt}\left(1 - e^{-bt}\right)^{N_0-N} \quad \text{for } N = 0, 1, \ldots, N_0 \tag{3.2.8}$$
Unlike the “birth-only” process, the “death-only” process has a finite set of outcomes.
Thus, for a linear “death-only” process, the population size is binomially distributed
with parameters $N_0$ and $e^{-bt}$.
3.3) Alternatives to the Kolmogorov Equations
The direct method of solving the differential equations is quite laborious even for the
“birth-only” and “death-only” equations. This is due to the necessity of deriving the
probabilities for the various possible population sizes (or at least the first few
probabilities) separately before one can derive the general expression for $p_N(t)$. This
direct method is even less practical when dealing with a model that allows for both
births and deaths. Due to the dependence of $p_N(t)$ on both $p_{N-1}(t)$ and $p_{N+1}(t)$ in such
models, one must solve the differential equations simultaneously; contrast this with the
successive solution of the equations in the two earlier models. This makes the
Kolmogorov equations unwieldy for large populations as, in such cases, obtaining even
a numerical solution is often difficult. The following alternatives to the Kolmogorov
equations have proved useful.
A. The Continuous Approximation
By its very nature the population size, $N$, is a discrete random variable. By treating $N$
as a continuous random variable and re-interpreting $p_N(t)$ as $N$’s probability density
function (which we shall denote by $\tilde{p}_N(t)$ to avoid confusion), one can derive an
approximate probability distribution for the population. Nisbet et al. (1982) derived a
single, approximate differential equation for $\tilde{p}_N(t)$. By performing a Taylor expansion on
$\tilde{p}_N(t)$ and discarding terms that are of third order and higher, the authors showed that:
$$\frac{\partial \tilde{p}_N(t)}{\partial t} = -\frac{\partial}{\partial N}\left\{[B(N) - D(N)]\,\tilde{p}_N(t)\right\} + \frac{1}{2}\frac{\partial^2}{\partial N^2}\left\{[B(N) + D(N)]\,\tilde{p}_N(t)\right\} \tag{3.3.1}$$
A slightly modified version of the proof given by Nisbet and Gurney (1982) is given in
Appendix A (some of the elements of the proof have been re-ordered in an attempt to
make the proof more comprehensible).
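The boundary condition for equation (3.3.1), when the initial population size is $N_0$,
is:
$$\tilde{p}_N(0) = \delta(N - N_0), \quad \text{where } \delta \text{ is the Dirac delta function} \tag{3.3.2}$$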
Unfortunately, due to its non-linear form, equation (3.3.1) is analytically intractable.
Even a numerical solution to the differential equation cannot be found due to the
discontinuous nature of the boundary condition given in (3.3.2). This implies that the
continuous approximation cannot be used to derive an approximate probability
distribution for N . However, equation (3.3.1) can be used to generate an approximate,
quasi-equilibrium distribution (covered in Section 3.6) which, in turn, can be used to
derive an approximate analytical expression for the mean time to extinction.
B. Stochastic Differential Equations
Both the Kolmogorov equations and their continuous approximation model the
population probability distribution through time. The following model is based on the
population size itself and can thus be readily compared with the deterministic models
covered in Chapter 2. Stochastic differential equations are often also used to model
environmental stochasticity. It can be shown (see Appendix A) that:
$$E\{dN\} = [B(N) - D(N)]\,dt, \qquad E\{dN^2\} = [B(N) + D(N)]\,dt$$
Hence:
$$\mathrm{Var}[dN] = E\{dN^2\} - \left(E\{dN\}\right)^2 = [B(N) + D(N)]\,dt - [B(N) - D(N)]^2\,dt^2 \approx [B(N) + D(N)]\,dt$$
Thus, provided $dt$ is sufficiently small for terms of order $dt^2$ and higher to be ignored,
we have:
$$dN = [B(N) - D(N)]\,dt + \eta(t)\sqrt{[B(N) + D(N)]\,dt} \tag{3.3.4}$$
where $\eta(t)$ is a random variable with zero mean and unit variance.
Equation (3.3.4) is merely a cumbersome restatement of the Kolmogorov equations:
$\eta(t)$ has an unusual, discrete probability distribution to accommodate the fact that $N$
can only take on integer values. However, a tractable approximation to this equation is
possible.
Nisbet et al. (1982) stated that, for all but the smallest populations, any change in
the population size which is large enough to affect the transition probabilities must be
the result of a large number of statistically independent births and deaths. Thus, $dN$
can be taken over a relatively long time increment $dt$; $\eta(t)$ will then have an
approximately normal probability distribution (by the Central Limit Theorem). By
regarding $N$ as a continuous variable (which implies that $dt$ is small since $dt/\varepsilon^2$ must
be constant; see the proof of equation (3.3.1) in Appendix A) and $\eta(t)$ as being normally
distributed (implying $dt$ is large), we have:
$$dN = [B(N) - D(N)]\,dt + \sqrt{B(N) + D(N)}\;d\omega(t) \tag{3.3.5}$$
The Wiener process, $\omega(t)$, is a continuous random process with independent increments
which is also time homogeneous (i.e. $\omega(t)$ and $\omega(t+s) - \omega(s)$ have the same
distribution for $s \geq 0$, and $\omega(0) = 0$). In addition, $\omega(t)$ is normally
distributed with mean 0 and variance $\sigma^2 t$ (where $\sigma > 0$).
Nisbet et al. (1982) did not offer a resolution of the requirement that $dt$ is both small
and large. However, from the empirical evidence in Section 3.5, equation (3.3.5) does seem
to provide a good approximation to the Kolmogorov equations. The stochastic
differential equation can alternatively be written as:
$$\frac{dN}{dt} = B(N) - D(N) + \sqrt{B(N) + D(N)}\;\gamma(t) \tag{3.3.6}$$
where $\gamma(t) = d\omega(t)/dt$ is white noise.
Equation (3.3.6) can be interpreted as the sum of ‘deterministic’ and ‘stochastic’
contributions to $dN/dt$. Care must be taken with this interpretation since white noise is
not well behaved (e.g. $E\{\gamma(t)^2\} = \infty$). Stochastic differential equations prove to be
especially useful for deriving the gross fluctuation characteristics of a population.
C. Generating Functions
We now derive a single differential equation for the population’s moment generating
function. The moment generating function of any random variable characterises that
random variable’s probability distribution. Hence such an equation implicitly models the
population’s probability distribution through time.
Consider the random variables $N(t)$ and $N(t+\Delta t) - N(t)$. The variables represent
the population size at time $t$ and the net change in the population size over the interval
$[t, t+\Delta t]$ respectively. If we let $f_j(N)$ represent the continuous transition rate from
population size $N$ to size $N+j$, then:
$$P\{N(t+\Delta t) - N(t) = j \mid N(t)\} = \begin{cases} f_j(N)\,\Delta t + o(\Delta t) & \text{for } j \neq 0 \\ 1 - \sum_{j \neq 0} f_j(N)\,\Delta t + o(\Delta t) & \text{for } j = 0 \end{cases} \tag{3.3.7}$$
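For the birth and death model, $f_1(N) = B(N)$, $f_{-1}(N) = D(N)$ and $f_j(N) = 0$ for
$j \neq -1, 0, 1$. Also, $f_j(N) = r_{N,N+j}$ using the notation for the transition rates introduced in
Section 3.1. Bailey (1964) showed that the following differential equation for the
moment generating function, $M(\theta,t)$, corresponds to the set of probability differential
equations (as shown in (3.1.5)):
$$\frac{\partial M(\theta,t)}{\partial t} = \sum_{j \neq 0}\left(e^{j\theta} - 1\right) f_j\!\left(\frac{\partial}{\partial\theta}\right) M(\theta,t) \tag{3.3.8}$$
Note that the $\partial/\partial\theta$ operator acts only on $M(\theta,t)$. So, for example, if $f_1(N) = aN - bN^2$,
then $f_1(\partial/\partial\theta)\,M(\theta,t) = a\,\partial M/\partial\theta - b\,\partial^2 M/\partial\theta^2$. A derivation of
equation (3.3.8) is given in Appendix B.
The birth and death process assumes that the population size cannot change by
more than one unit in the interval $\Delta t$. Hence we have:
$$\frac{\partial M(\theta,t)}{\partial t} = \left(e^{\theta} - 1\right) B\!\left(\frac{\partial}{\partial\theta}\right) M(\theta,t) + \left(e^{-\theta} - 1\right) D\!\left(\frac{\partial}{\partial\theta}\right) M(\theta,t) \tag{3.3.9}$$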
The advantage of the above equation is easy to see: instead of having a possibly infinite
set of differential equations to solve simultaneously, we only need to solve a single
differential equation. The moment generating function characterises the probability
distribution of $N$, so the solution of equation (3.3.8) helps to identify the correct
probability distribution of $N$. One can derive corresponding differential equations
for the probability generating function, $P(z,t)$, and the cumulant generating function,
$K(\theta,t)$. For $K(\theta,t)$, we use the relationship $K(\theta,t) = \log M(\theta,t)$ (equation (3.4.3) below
gives the differential equation of $K(\theta,t)$ for a birth and death process). If we substitute
$e^{\theta} = z$ and $\partial/\partial\theta = z\,\partial/\partial z$ in (3.3.9), we then have the following differential equation for
the probability generating function, $P(z,t)$:
$$\frac{\partial P(z,t)}{\partial t} = (z - 1)\,B\!\left(z\frac{\partial}{\partial z}\right) P(z,t) + \left(z^{-1} - 1\right) D\!\left(z\frac{\partial}{\partial z}\right) P(z,t) \tag{3.3.10}$$
Consider the simple case where the transition rates are proportional to the population
size:
(3.3.11)
In this case, the differential equation for $M(\theta,t)$ becomes:
(3.3.12)
Equation (3.3.12) is a linear differential equation. Hence an analytical solution for
$M(\theta,t)$ is easily obtainable. Bailey (1964) showed that the solution of equation (3.3.12)
with boundary condition $M(\theta,0) = e^{N_0\theta}$ (i.e. the initial population size is $N_0$) is:
(3.3.13)
Since the birth and death rates were simply proportional to the population size, it was
easy to derive this analytical solution for $M(\theta,t)$. If the birth and death rates are
nonlinear, differential equations (3.3.9) and (3.3.10) can become intractable. In such
cases, we unfortunately cannot derive the exact probability distribution of $N$ and thus
we need to look at ways to approximate the true probability distribution or to solve the
equation numerically.
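As a hedged illustration of why the linear case is tractable, suppose the rates are written $B(N) = \lambda N$ and $D(N) = \mu N$. The symbols $\lambda$ and $\mu$ are assumptions made for this sketch and need not match the notation used in (3.3.11). Substituting these rates into (3.3.9) gives a first-order linear equation, and differentiating it with respect to $\theta$ at $\theta = 0$ yields the mean population size:

```latex
\frac{\partial M(\theta,t)}{\partial t}
  = \left[\lambda\,(e^{\theta}-1) + \mu\,(e^{-\theta}-1)\right]
    \frac{\partial M(\theta,t)}{\partial \theta},
\qquad M(\theta,0) = e^{N_0\theta}
% differentiate with respect to theta and set theta = 0:
\frac{d}{dt}E[N(t)] = (\lambda-\mu)\,E[N(t)]
\quad\Longrightarrow\quad
E[N(t)] = N_0\,e^{(\lambda-\mu)t}
```

The mean therefore follows the deterministic exponential growth law, which is a feature of linear rates; with nonlinear rates the equation for the mean involves higher moments.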
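The difficulties encountered in deriving an analytical solution to the Kolmogorov
equations, particularly when the birth and death rates are nonlinear, force one to
look at various, more solvable, models that approximate a population’s probability
distribution. This section is based on the work of Matis et al. (2000). Consider transition
rates with the following mathematical form:
$$B(N) = f_1(N) = \begin{cases} a_1 N - b_1 N^2 & \text{for } N \leq a_1/b_1 \\ 0 & \text{otherwise} \end{cases}$$
$$D(N) = f_{-1}(N) = a_2 N + b_2 N^2 \tag{3.4.1}$$
The birth rate shown above only holds for $N \leq a_1/b_1$. This suggests that $a_1 \gg b_1$.
Consequently, we expect the term $a_1 N$ to dominate when $N$ is small and the
term $b_1 N^2$ to dominate when $N$ is large. $b_1 N^2$ can thus be interpreted as the effect of
crowding on the population. A similar interpretation also holds for the death rates. By
applying equation (3.3.9) to the above transition rates, we obtain:
$$\frac{\partial M}{\partial t} = \left(e^{\theta} - 1\right)\left[a_1\frac{\partial M}{\partial\theta} - b_1\frac{\partial^2 M}{\partial\theta^2}\right] + \left(e^{-\theta} - 1\right)\left[a_2\frac{\partial M}{\partial\theta} + b_2\frac{\partial^2 M}{\partial\theta^2}\right] \tag{3.4.2}$$
This implies that the differential equation for the cumulant generating function, $K(\theta,t)$,
is:
$$\frac{\partial K}{\partial t} = \left[\left(e^{\theta} - 1\right)a_1 + \left(e^{-\theta} - 1\right)a_2\right]\frac{\partial K}{\partial\theta} + \left[\left(e^{-\theta} - 1\right)b_2 - \left(e^{\theta} - 1\right)b_1\right]\left[\frac{\partial^2 K}{\partial\theta^2} + \left(\frac{\partial K}{\partial\theta}\right)^2\right] \tag{3.4.3}$$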
A derivation of equation (3.4.3) is given in Appendix E. Neither of the above two
differential equations is analytically tractable, so we are unable to find an exact
analytical expression for the probability mass function. We thus need to look at ways to
derive an approximate expression for the probability mass function as it evolves through
time.
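The step from the moment generating function to the cumulant generating function rests only on the chain rule applied to $M = e^{K}$. The following sketch shows that step; it assumes that (3.4.2) is the result of applying (3.3.9) to the rates $a_1 N - b_1 N^2$ and $a_2 N + b_2 N^2$, and it is not a substitute for the derivation in Appendix E.

```latex
M = e^{K}
\;\Longrightarrow\;
\frac{\partial M}{\partial t} = M\,\frac{\partial K}{\partial t},\qquad
\frac{\partial M}{\partial\theta} = M\,\frac{\partial K}{\partial\theta},\qquad
\frac{\partial^{2} M}{\partial\theta^{2}}
  = M\left[\frac{\partial^{2} K}{\partial\theta^{2}}
  + \left(\frac{\partial K}{\partial\theta}\right)^{2}\right]
% substitute into the equation for M and divide through by M:
\frac{\partial K}{\partial t}
  = (e^{\theta}-1)\left\{a_1 K_\theta - b_1\left[K_{\theta\theta} + K_\theta^{2}\right]\right\}
  + (e^{-\theta}-1)\left\{a_2 K_\theta + b_2\left[K_{\theta\theta} + K_\theta^{2}\right]\right\}
```

Here $K_\theta$ and $K_{\theta\theta}$ are shorthand for the first and second partial derivatives of $K$ with respect to $\theta$. The squared term $K_\theta^2$ is what makes the equation for $K$ nonlinear.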
One way to generate an approximate probability mass function is to
derive expressions for the first few cumulants of $N$ from equation (3.4.3). In deriving
such expressions, we make use of the following relationship:
$$K(\theta,t) = \sum_{i=1}^{\infty} \kappa_i(t)\,\frac{\theta^i}{i!} \tag{3.4.4}$$
where $\kappa_i(t)$ is the $i$-th cumulant of $N$ at time $t$.
Differentiating the series in (3.4.4) term by term gives the derivatives that appear in equation (3.4.3):
$$\frac{\partial K}{\partial t} = \sum_{i \geq 1} \dot{\kappa}_i(t)\,\frac{\theta^i}{i!}, \qquad \frac{\partial K}{\partial\theta} = \sum_{i \geq 1} \kappa_i(t)\,\frac{\theta^{i-1}}{(i-1)!}, \qquad \frac{\partial^2 K}{\partial\theta^2} = \sum_{i \geq 2} \kappa_i(t)\,\frac{\theta^{i-2}}{(i-2)!}$$
(3.4.5) to (3.4.7)
We also need to use the series expansion of $e^{\theta}$:
$$e^{\theta} = 1 + \theta + \frac{\theta^2}{2!} + \frac{\theta^3}{3!} + \cdots \tag{3.4.8}$$
Substituting equations (3.4.5) – (3.4.8) into equation (3.4.3), we thus have:
(3.4.9)
(3.4.10)
By equating the coefficients of the various powers of $\theta$ in equation (3.4.10), we obtain
the following system of differential equations for the cumulants:
$$\dot{\kappa}_1 = (a_1 - a_2)\kappa_1 - (b_1 + b_2)\kappa_1^2 - (b_1 + b_2)\kappa_2$$
$$\dot{\kappa}_2 = (a_1 + a_2)\kappa_1 - (b_1 - b_2)\kappa_1^2 + [2(a_1 - a_2) - (b_1 - b_2)]\kappa_2 - 4(b_1 + b_2)\kappa_1\kappa_2 - 2(b_1 + b_2)\kappa_3$$
$$\dot{\kappa}_3 = (a_1 - a_2)\kappa_1 - (b_1 + b_2)\kappa_1^2 + [3(a_1 + a_2) - (b_1 + b_2)]\kappa_2 - 6(b_1 - b_2)\kappa_1\kappa_2 - 6(b_1 + b_2)\kappa_2^2 + 3[(a_1 - a_2) - (b_1 - b_2)]\kappa_3 - 6(b_1 + b_2)\kappa_1\kappa_3 - 3(b_1 + b_2)\kappa_4$$
etc. (3.4.11)
It can be seen that the differential equation for the $i$-th cumulant has terms up
to the $(i+1)$-th cumulant; this is due to the non-linear birth and death rates. The
presence of these higher-order cumulants prevents us from solving the differential
equations in (3.4.11) directly. Matis et al. (2000) proposed using a cumulant truncation
procedure. Here, one approximates the first $i$ cumulants by setting all the cumulants of
order $i+1$ or higher to zero. If we set all the cumulants of order 4 and above to zero, we
can then solve the resulting three differential equations in (3.4.11) numerically to find
values for $\kappa_1(t)$, $\kappa_2(t)$ and $\kappa_3(t)$ at various values of $t$. The cumulant values obtained
can then be used to create a saddle-point approximation of the probability distribution.
The probability distribution of N may be approximated by a saddle-point probability
distribution. The saddle-point is a density function that will take as its parameters the
values of N ’s cumulants up to a specified order and force the values of the density
function’s cumulants to match those of N . It is this matching of the cumulants that
makes the saddle-point an approximation of the true distribution of N : one would
expect the approximation to be more accurate if more cumulants are being matched
(however, in some cases, this is not true). The values for the first three cumulants
(derived from equation (3.4.11)) can be substituted into the following saddle-point
approximation derived by Renshaw (1998):
(3.4.12)
Matis et al. (2000) have stated that the investigations which they have performed into
the accuracy of the above approximation have yielded results that were “very
encouraging”.
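To make the role of the first three cumulants concrete, the following is a sketch of the standard saddle-point construction applied to a cumulant generating function truncated after the third cumulant. It is an assumption that this corresponds to the form of (3.4.12); the expression given by Renshaw (1998) may be arranged or normalised differently. The symbol $s$ (the saddle point) is introduced here for illustration only, and $n$ is the population size at which the density is evaluated.

```latex
K(\theta) \approx \kappa_1\theta + \tfrac{1}{2}\kappa_2\theta^{2} + \tfrac{1}{6}\kappa_3\theta^{3}
% the saddle point s solves K'(s) = n:
\kappa_1 + \kappa_2 s + \tfrac{1}{2}\kappa_3 s^{2} = n
\;\Longrightarrow\;
s = \frac{-\kappa_2 + \sqrt{\psi}}{\kappa_3},
\qquad \psi = \kappa_2^{2} + 2\kappa_3\,(n-\kappa_1)
% saddle-point density, using K''(s) = \kappa_2 + \kappa_3 s = \sqrt{\psi}:
p(n) \approx \frac{\exp\{K(s) - s\,n\}}{\sqrt{2\pi K''(s)}}
      = \left(2\pi\sqrt{\psi}\right)^{-1/2}
        \exp\left\{-\frac{\kappa_2^{3} - 3\kappa_2\psi + 2\psi^{3/2}}{6\kappa_3^{2}}\right\}
```

Under this construction the density is real only where $\psi \geq 0$. When $\kappa_3 < 0$ this restricts the approximation to $n \leq \kappa_1 + \kappa_2^2 / (2|\kappa_3|)$.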
We again consider the transition rates introduced in Chapter 2:
$$B(N) = \begin{cases} 0.3N - 0.015N^2 & \text{for } N \leq 20 \\ 0 & \text{otherwise} \end{cases} \qquad D(N) = 0.02N + 0.001N^2 \tag{3.5.1}$$
By solving the equation $B(N) - D(N) = 0$, we can see that the equilibrium population
size is 17.5. Since this is a stable equilibrium state, more births tend to occur when
$N < 17.5$ and more deaths tend to occur when $N > 17.5$.
Three simulations of the continuous-time Markov model were performed using the
above transition rates so as to get a feel for what a typical population trajectory might
look like. In order to execute the simulations, we needed to divide the timeline into
intervals of sufficiently small length (in this case, the interval length was set to 0.01) so
as to make the probability of more than one birth or death occurring within an interval
negligible. For each interval, we thus know (using $\Delta t = 0.01$) that the probability that a
birth occurs is $B(N)\Delta t = 0.003N - 0.00015N^2$. The probability that a death occurs is
$D(N)\Delta t = 0.0002N + 0.00001N^2$ and the probability that nothing happens is
$1 - [B(N) + D(N)]\Delta t = 1 - 0.0032N + 0.00014N^2$. One can then simulate which of the three
possible events occurs in each interval and hence replicate the population trajectory
over the period of interest. In the simulations performed, the initial population size was
set to 10. The three simulated population trajectories are shown below:
[Chart: Population Trajectories; horizontal axis: Time; vertical axis: N]
Figure 3.1 Simulated runs of the Population when $N_0 = 10$
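The interval-by-interval procedure described above can be sketched in a few lines. This is an illustrative reimplementation in Python, not the spreadsheet code used for the original runs; the function names, the seed and the use of a single uniform random number per interval are assumptions.

```python
import random

def birth_rate(n):
    # B(N) = 0.3N - 0.015N^2 for N <= 20 and zero otherwise, as in (3.5.1)
    return max(0.3 * n - 0.015 * n * n, 0.0) if n <= 20 else 0.0

def death_rate(n):
    # D(N) = 0.02N + 0.001N^2, as in (3.5.1)
    return 0.02 * n + 0.001 * n * n

def simulate_path(n0=10, t_end=10.0, dt=0.01, seed=None):
    """Simulate one trajectory, allowing at most one event per interval dt."""
    rng = random.Random(seed)
    n = n0
    path = [n]
    for _ in range(round(t_end / dt)):
        u = rng.random()
        p_birth = birth_rate(n) * dt
        p_death = death_rate(n) * dt
        if u < p_birth:
            n += 1                      # a birth occurs
        elif u < p_birth + p_death:
            n -= 1                      # a death occurs
        path.append(n)                  # otherwise nothing happens
    return path

path = simulate_path(seed=1)
print("population at time 10:", path[-1])
```

Each call with a different seed gives a different trajectory. Extinction is absorbing in this sketch because both rates are zero when $N = 0$.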
Similarly, one can simulate various trajectories for the stochastic differential equation
(SDE) approximation of the birth and death process.
The SDE for the transition rates given in (3.5.1) is:
$$dN = \left(0.28N - 0.016N^2\right)dt + \sqrt{0.32N - 0.014N^2}\;d\omega(t), \quad \text{where } \omega(t) \text{ is a Wiener process} \tag{3.5.2}$$
As with the earlier model, we assume that the initial population size is 10 and we use
time increments of size 0.01. Thus, by simulating the values that the normal random
variable $d\omega(t)$ takes over each time increment, one can derive the population
increments through time. By adding these increments to the initial population size, one
can derive the population trajectory.
[Chart: Population Trajectories; horizontal axis: Time; vertical axis: N]
Figure 3.2 Simulated SDE runs of the Population when $N_0 = 10$
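The SDE trajectories can be generated with a simple Euler-Maruyama step, sketched below. This is an assumption about how the increments might be simulated, not a record of the original spreadsheet implementation. The sketch takes the Wiener process to have unit variance parameter ($\sigma = 1$) and clamps the population at zero so that the square root stays defined; both choices are assumptions.

```python
import math
import random

def birth_rate(n):
    # B(N) = 0.3N - 0.015N^2 for N <= 20 and zero otherwise, as in (3.5.1)
    return max(0.3 * n - 0.015 * n * n, 0.0) if n <= 20 else 0.0

def death_rate(n):
    # D(N) = 0.02N + 0.001N^2, as in (3.5.1)
    return 0.02 * n + 0.001 * n * n

def simulate_sde(n0=10.0, t_end=10.0, dt=0.01, seed=None):
    """Euler-Maruyama path of dN = (B - D) dt + sqrt(B + D) d(omega)."""
    rng = random.Random(seed)
    n = n0
    path = [n]
    for _ in range(round(t_end / dt)):
        drift = birth_rate(n) - death_rate(n)
        diffusion = math.sqrt(birth_rate(n) + death_rate(n))
        d_omega = rng.gauss(0.0, math.sqrt(dt))   # Wiener increment over dt
        # clamp at zero: an assumption, treating extinction as absorbing
        n = max(n + drift * dt + diffusion * d_omega, 0.0)
        path.append(n)
    return path

sde_path = simulate_sde(seed=1)
print("population at time 10:", round(sde_path[-1], 2))
```

Unlike the discrete simulation, the values along `sde_path` are real numbers, so they would need to be rounded before being compared with a frequency distribution of integer population sizes.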
In Figure 3.1, one can clearly see that the population size never moves by more than
one unit in any instant (since dt is made sufficiently small to exclude the possibility of
multiple births and deaths within any time increment). This serves to highlight that the
birth-and-death process is a discrete-state process in continuous time. However in
Figure 3.2 the population size can change to any value within an instant. This is to be
expected as the SDE treats the population size as a continuous variable.
Of course, if one were to repeat the simulations, one would, in all likelihood, obtain
appreciably different population trajectories from the ones shown in Figures 3.1 and 3.2
(since the population movements depend on the occurrence of random events). The
initial population size is 10. From Figure 2.1, we can see that the birth rates are higher
than the death rates when N = 10 and hence we would expect an upward trend in the
population numbers initially. Such a trend is clearly evident at the outset of all three
population simulations in both Figures 3.1 and 3.2. Once the equilibrium state is
reached, one can see from the above figures that the population then tends to vacillate
around this point. This is to be expected as this is a stable equilibrium state.
A million simulations were then run for the birth-and-death process and these were
compared with ten thousand simulations for the SDE. Both sets of simulations took
roughly an hour to run in Microsoft Excel on a Pentium 4, 2800 MHz, 512MB RAM. The
SDE simulations are relatively more time-consuming because random numbers from a
Normal distribution must be generated to carry them out. This takes
considerably longer than the Uniform random number
generation required for the birth-and-death process, as the programming language used
to execute the simulations required one of Microsoft Excel’s statistical functions to
generate the Normal random numbers.
For both types of simulation the initial population size was set to 10. The resulting
population size after each simulation run was recorded at time 10. These results were
then grouped into a frequency distribution (values for the SDE were rounded to the
nearest integer). Consequently the probability distribution at time 10, $p_N(10)$, for both
the birth-and-death process and the stochastic differential equation could be estimated
as the number of times that a particular population value occurred, divided by the
number of simulations undertaken. The estimated probabilities for the two models are
shown below:
[Chart: estimated probabilities for the two processes; horizontal axis: Population Size (1 to 21); vertical axis: probability (0 to 0.25)]
Figure 3.3 Simulated Probability Distributions for the two processes
One can clearly see that the probability distribution for both processes is negatively
skewed at time 10. This is due, in part, to the population starting below the equilibrium
size. The probability distribution derived using the SDE is a good approximation to the
true distribution obtained by simulating the Birth and Death process directly. This is
despite the fact that the population size is quite small and so the normality assumption
implicit in the SDE model is contentious.
The above transition rates are nonlinear and so we need to use the cumulant
truncation method in order to obtain approximate values for the cumulants. If we choose
to truncate all cumulants of order four and higher, we will obtain the following system of
differential equations (see equations in (3.4.11)):
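$$\dot{\kappa}_1 = 0.28\kappa_1 - 0.016\kappa_1^2 - 0.016\kappa_2$$
$$\dot{\kappa}_2 = 0.32\kappa_1 - 0.014\kappa_1^2 + 0.546\kappa_2 - 0.064\kappa_1\kappa_2 - 0.032\kappa_3$$
$$\dot{\kappa}_3 = 0.28\kappa_1 - 0.016\kappa_1^2 + 0.944\kappa_2 - 0.084\kappa_1\kappa_2 - 0.096\kappa_2^2 + 0.798\kappa_3 - 0.096\kappa_1\kappa_3$$
with boundary conditions $\kappa_1(0) = N_0 = 10$ and $\kappa_2(0) = \kappa_3(0) = 0$. (3.5.3)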
The first three cumulants at time 10 were obtained by solving this system numerically. The
following values were obtained:
$$\kappa_1(10) = 16.49, \quad \kappa_2(10) = 3.53, \quad \kappa_3(10) = -3.45$$
These estimates compare favourably with the first three cumulant values observed for
the one million simulations of the birth-and-death process:
$$\kappa_1(10) = 16.50, \quad \kappa_2(10) = 3.52, \quad \kappa_3(10) = -3.61$$
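The numerical solution of the truncated cumulant equations can be sketched as follows. The original text does not say which numerical method was used; a classical fourth-order Runge-Kutta scheme with step 0.01 is assumed here, and the function names are hypothetical. The right-hand side is the system obtained from (3.4.11) with $a_1 = 0.3$, $a_2 = 0.02$, $b_1 = 0.015$, $b_2 = 0.001$ and $\kappa_4 = 0$.

```python
def cumulant_rhs(k, a1=0.3, a2=0.02, b1=0.015, b2=0.001):
    """Right-hand side of the cumulant equations with kappa_4 set to zero."""
    k1, k2, k3 = k
    c, e = a1 - a2, a1 + a2          # difference and sum of the linear coefficients
    d, f = b1 + b2, b1 - b2          # sum and difference of the quadratic coefficients
    dk1 = c * k1 - d * (k1 * k1 + k2)
    dk2 = e * k1 - f * k1 * k1 + (2 * c - f) * k2 - 4 * d * k1 * k2 - 2 * d * k3
    dk3 = (c * k1 - d * k1 * k1 + (3 * e - d) * k2 - 6 * f * k1 * k2
           - 6 * d * k2 * k2 + 3 * (c - f) * k3 - 6 * d * k1 * k3)
    return (dk1, dk2, dk3)

def solve_cumulants(t_end=10.0, n0=10.0, h=0.01):
    """Integrate from kappa_1(0) = n0, kappa_2(0) = kappa_3(0) = 0 with RK4."""
    k = (n0, 0.0, 0.0)
    for _ in range(round(t_end / h)):
        s1 = cumulant_rhs(k)
        s2 = cumulant_rhs(tuple(x + 0.5 * h * d for x, d in zip(k, s1)))
        s3 = cumulant_rhs(tuple(x + 0.5 * h * d for x, d in zip(k, s2)))
        s4 = cumulant_rhs(tuple(x + h * d for x, d in zip(k, s3)))
        k = tuple(x + h * (p + 2 * q + 2 * r + s) / 6.0
                  for x, p, q, r, s in zip(k, s1, s2, s3, s4))
    return k

k1, k2, k3 = solve_cumulants()
print("kappa_1(10), kappa_2(10), kappa_3(10):", k1, k2, k3)
```

The printed values can be compared with the truncation estimates and the simulated cumulants quoted above.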
The estimates of the cumulants can then be substituted into the saddle-point
approximation given in (3.4.12) so as to get an approximate probability mass function
for N . The diagram below compares the saddle-point probabilities with the simulated
relative frequencies from the Birth and Death process:
[Chart: Accuracy of the Saddlepoint approximation; horizontal axis: Population Size; series: Simulated Probabilities, Saddlepoint approximations]
Figure 3.4 Comparison of the Saddle-point and Simulated probabilities
From the above figure, one can see that the saddle-point approximation deviates
substantially from the true Birth and Death probabilities. The saddle-point approximation
does not seem to be valid for the above three values obtained for the cumulants: the
probability density function will be a complex number when $n > 17.52$, as $\psi$ (as defined
in (3.4.12)) will be negative over this range. This means that the saddle-point
approximation, $\hat{p}_N(10)$, is not defined for $N > 17$. Matis et al. (2000) applied the
saddle-point approximation successfully to various other transitional forms. However, they did
not apply the saddle-point approximation to the transition rates given
in (2.1.1). Further research needs to be done to ascertain the reason for the failure of
the saddle-point approximation for the transition rates in (2.1.1).
To see whether the saddle-point approximation worked better after a longer time
interval, the population size at time 20 was also studied. A hundred thousand
simulations were run. The values observed for the first three cumulants were:
$$\kappa_1(20) = 17.29, \quad \kappa_2(20) = 2.58, \quad \kappa_3(20) = -2.30$$
Using the formula for $\psi$ (given in (3.4.12)), we now find that the density function is
complex over the range $n > 18.74$. Compare the cumulant values at time 20 with those
observed at time 50:
$$\kappa_1(50) = 17.35, \quad \kappa_2(50) = 2.49, \quad \kappa_3(50) = -2.20$$
Here, the density function becomes complex over the range $n > 18.79$. There are
minimal changes in the values of the cumulants. This seems to suggest that the
population is close to equilibrium at time 20 (the concept of a population being in
equilibrium is considered in more detail in Section 3.6).
A process $X(t)$ is said to be ergodic if all its cumulants (e.g. $\mu = E\{X(t)\}$) are equal
to the matching time averages of the process (e.g. $\bar{X}_T = \frac{1}{T}\int_0^T X(t)\,dt$ as $T \to \infty$). One of the
conditions of ergodicity – which is satisfied by all the birth-and-death models considered
in this research report – is that the population should ‘forget’ its initial population size
after a suitably long period of time (see Nisbet et al. (1982) and section 3.7). Thus we
expect the values of the cumulants by time 20 to be independent of the starting
population size. Thus, for the transition rates considered in this example, the saddle-
point approximation is inappropriate; irrespective of the initial value of the population.
3.6) Quasi-Equilibrium Distribution
Nisbet et al. (1982) stated that a population with a true equilibrium state, $p_N^*$, has:
$$B(N)\,p_N^* = D(N+1)\,p_{N+1}^* \quad \text{for } N = 0, 1, 2, \ldots \tag{3.6.1}$$
Intuitively, equation (3.6.1) signifies that a population at equilibrium has an equal
probability of increasing from size $N$ to size $N+1$ as it has of decreasing from size
$N+1$ to size $N$. By repeatedly applying the above recurrence relationship, one can
show that:
$$p_N^* = \frac{B(0)B(1)\cdots B(N-1)}{D(1)D(2)\cdots D(N)}\,p_0^* \quad \text{for } N > 0 \tag{3.6.2}$$
Since we are ignoring migration, $B(0)$ is zero. This implies that $p_N^* = 0 \;\; \forall N > 0$. Since
the probabilities must sum to one, $p_0^* = 1$. This is to be expected since extinction is an
absorbing state when migration is ignored. However this distribution is of limited
interest. Consequently, the concept of the quasi-equilibrium distribution is considered
instead. Quasi-equilibrium is defined as the equilibrium probability distribution that the
population would ultimately be subject to if it were never to become extinct. We now
look at two possible methods of deriving the quasi-equilibrium distribution.
A. The modified Markov Process
Matis et al. (2000) modified the original birth and death process to create a new Markov
process whose probability distribution did not degenerate at equilibrium to an extinction
probability of one. The coefficient matrix for this new process, $R^*$, was based on the
coefficient matrix for the birth and death process, $R$ (see equation (3.1.5)). By deleting
the first row and column of $R$ (thus excluding the state $N = 0$) and by assuming that
$D(1) = 0$ (which removes the only transition to the state $N = 0$), one obtains the
coefficient matrix for the new process, $R^*$.
Let $p_N^m(t)$ be the probability that this modified process is equal to $N$ at time $t$. Also
let $\mathbf{p}^m(t) = (p_1^m(t), p_2^m(t), \ldots)$. Then:
$$\dot{\mathbf{p}}^m(t) = \mathbf{p}^m(t)\,R^* \tag{3.6.3}$$
At equilibrium, we would expect $\dot{\mathbf{p}}^m(t) = 0$. So if we let $\Pi = \lim_{t \to \infty} \mathbf{p}^m(t)$ be the equilibrium
distribution for the modified process (and the quasi-equilibrium distribution for the birth
and death process) we then have:
$$0 = \Pi R^* \tag{3.6.4}$$
By solving equation (3.6.4) for $\Pi$, we get the quasi-equilibrium distribution. However,
the algebra may become tedious when the population is large. Note that $R^*$ must be a
singular matrix, otherwise $\Pi = 0$.
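For a birth and death process the equilibrium equations of the modified process need not be solved as a full linear system. A minimal sketch, assuming the transition rates of Section 3.5 and using the fact that the stationary distribution of a birth and death chain satisfies the balance relation $\Pi_N B(N) = \Pi_{N+1} D(N+1)$ for $N \geq 1$, is given below. The function names are hypothetical, and the state space is capped at $N = 20$ because $B(20) = 0$ for these rates.

```python
def birth_rate(n):
    # B(N) = 0.3N - 0.015N^2 for N <= 20 and zero otherwise
    return max(0.3 * n - 0.015 * n * n, 0.0) if n <= 20 else 0.0

def death_rate(n):
    # D(N) = 0.02N + 0.001N^2 (D(1) is never used, matching the modified process)
    return 0.02 * n + 0.001 * n * n

def quasi_equilibrium(n_max=20):
    """Stationary distribution of the modified process on states 1..n_max."""
    weights = [1.0]                      # unnormalised probability of N = 1
    for n in range(1, n_max):
        # balance relation: weight(n+1) = weight(n) * B(n) / D(n+1)
        weights.append(weights[-1] * birth_rate(n) / death_rate(n + 1))
    total = sum(weights)
    return {n: w / total for n, w in enumerate(weights, start=1)}

quasi = quasi_equilibrium()
mean = sum(n * p for n, p in quasi.items())
var = sum((n - mean) ** 2 * p for n, p in quasi.items())
print("quasi-equilibrium mean and variance:", mean, var)
```

The recursion avoids the tedious algebra mentioned above, at the cost of relying on the tridiagonal structure of $R^*$; a process with larger jumps would need a general linear solver.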
B. Locally Linear Approximations
In Chapter 2, a locally linear approximation in the vicinity of the population’s equilibrium
state was used to derive an approximate population model. An approximation around
the population’s deterministic equilibrium state, $N^*$, can also be used to derive an
approximate quasi-equilibrium distribution. The deterministic equilibrium
state satisfies the following relationship:
$$B(N^*) = D(N^*) \tag{3.6.5}$$
Nisbet et al. (1982) defined the three functions, $f(N)$, $g(N)$ and $n(t)$:
$$f(N) = B(N) - D(N), \qquad g(N) = B(N) + D(N), \qquad n(t) = N(t) - N^* \tag{3.6.6}$$
By performing a Taylor expansion of $f(N)$ and $g(N)$ around $N^*$ and retaining only the leading
term in each expansion, we have:
$$f(N) \approx \lambda n, \quad \text{where } \lambda = \left[\frac{dB}{dN} - \frac{dD}{dN}\right]_{N = N^*} \tag{3.6.7}$$
$$g(N) \approx Q, \quad \text{where } Q = B(N^*) + D(N^*) \tag{3.6.8}$$
Equation (3.6.7) approximates $f(N)$ to the first order whilst equation (3.6.8)
approximates $g(N)$ to the zeroth order. These expressions can then be substituted into
the continuous approximation given by equation (3.3.1) to obtain:
$$\frac{\partial \tilde{p}}{\partial t} = -\lambda\,\frac{\partial (n\tilde{p})}{\partial n} + \frac{Q}{2}\,\frac{\partial^2 \tilde{p}}{\partial n^2} \tag{3.6.9}$$
By setting $\partial \tilde{p}/\partial t = 0$, one can solve the resulting differential equation to derive an
approximate expression for the quasi-equilibrium distribution:
$$\tilde{p}^*(n) = \sqrt{\frac{-\lambda}{\pi Q}}\,\exp\!\left(\frac{\lambda n^2}{Q}\right) \tag{3.6.10}$$
The function is clearly Gaussian in form. Nisbet et al. stated that, with this locally linear
approximation, the population size has the following (approximate) distribution:
$$N \sim \text{Normal}\!\left(N^*,\; -\frac{Q}{2\lambda}\right) \tag{3.6.11}$$
Note that $\lambda < 0$ for a stable equilibrium state (see equation (2.2.6)) and thus the
variance ($-Q/2\lambda$) of $N$ is positive.
3.6.A) Example (continued)
For the transition rates considered in Section 3.5, we have $N^* = 17.5$. Furthermore, we
know that:
$$Q = B(17.5) + D(17.5) = 1.3125$$
$$\lambda = \left[\frac{dB}{dN} - \frac{dD}{dN}\right]_{N = 17.5} = -0.28$$
Thus, we have that $N \approx \text{Normal}(17.5,\ 2.34375)$.
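The figures in this example can be checked numerically. The sketch below assumes the rates $B(N) = 0.3N - 0.015N^2$ and $D(N) = 0.02N + 0.001N^2$; this functional form is an assumption chosen to agree with the values quoted for the Section 3.5 example, and the helper names are illustrative. It locates $N^*$ by bisection on $B(N) - D(N)$, estimates $\lambda$ by a central difference, and evaluates the variance $-Q/(2\lambda)$ of the Normal approximation.

```python
def birth(n):
    # Assumed birth rate B(N) (hypothetical form, see the text above).
    return 0.3 * n - 0.015 * n * n


def death(n):
    # Assumed death rate D(N) (hypothetical form, see the text above).
    return 0.02 * n + 0.001 * n * n


def f(n):
    # Net growth rate, f(N) = B(N) - D(N).
    return birth(n) - death(n)


def g(n):
    # Total event rate, g(N) = B(N) + D(N).
    return birth(n) + death(n)


def find_equilibrium(lo, hi, iterations=200):
    """Bisection for the root of f, assuming f(lo) > 0 > f(hi)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


n_star = find_equilibrium(1.0, 20.0)
h = 1e-4
lam = (f(n_star + h) - f(n_star - h)) / (2 * h)  # slope of f at N*
q = g(n_star)
variance = -q / (2 * lam)
print(f"N* = {n_star:.4f}, lambda = {lam:.4f}, Q = {q:.4f}, variance = {variance:.5f}")
```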
The corresponding modified transition matrix is obtained using the transition rates in
(3.5.1):

population, N0 . The diagram below shows the quasi-equilibrium distribution; the
probability distribution at time 100 (obtained by simulation); and the locally linear Normal
approximation:
[Figure: "The equilibrium probability distribution". Horizontal axis: Population Size; vertical axis: Probability.]
Figure 3.5 The population at equilibrium
From the diagram, one can see that the locally linear approximation provides a relatively
good fit to both the population’s probability distribution at time 100 and the quasi-
equilibrium distribution. Thus, after a long enough time interval, the population assumes
an approximately normal distribution.
3.7) Gross Fluctuation Characteristics of a Population
The population’s probability distribution is usually a means to an end rather than the end
itself since it is almost impossible to estimate for any natural population. This is due to
both the unreliability of most ecological population data and the difficulty involved in
setting up ‘replicate’ populations so that various probabilities may be estimated.
However, gross fluctuation characteristics of a population, such as the mean, the
variance and the autocovariance function, can usually be observed over time for a
population. Thus such characteristics prove to be invaluable in calibrating any
population model.
The gross fluctuation characteristics should describe the properties of a population
at equilibrium. However, since extinction is an absorbing state, the equilibrium state of a
population is extinction. The characteristics of such a state are not very interesting and
so we would rather base the gross fluctuation characteristics on a population in
quasi-equilibrium. This section is based on the work of Nisbet et al. (1982). As before we let
$p_N^*$ be the probability that a population in quasi-equilibrium has size $N$. We then have:
$$\mu = E[N] = \sum_{N=1}^{\infty} N\, p_N^*$$
(3.7.1)
$$\sigma^2 = \mathrm{Var}[N] = \sum_{N=1}^{\infty} (N - \mu)^2\, p_N^*$$
(3.7.2)
Unfortunately, the above equations cannot be used to calculate the mean and the
variance as the quasi-equilibrium distribution of a population is seldom estimable. A
good way to relate the gross fluctuation characteristics to a measurable quantity is to
equate the above statistical expectations to the corresponding time averages of a single
population. That is, we assume the population is ergodic (see Section 3.5 for a
definition of ergodicity). In order for such a procedure to be valid, the following
conditions for ergodicity must hold:
i. After a suitably long period of time the population should ‘forget’ its initial value.
ii. A population starting from a particular value should, in principle, be able to reach
any other value.
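Nisbet et al. stated that the above conditions are satisfied by all birth and death models.
The time averages of a single population (which we denote by $\langle \cdot \rangle$) are defined as:
$$\mu_{tim} \equiv \langle N \rangle = \lim_{T \to \infty} \frac{1}{T} \int_0^T N(t)\, dt$$
(3.7.3)
$$\sigma_{tim}^2 \equiv \langle (N - \mu_{tim})^2 \rangle = \lim_{T \to \infty} \frac{1}{T} \int_0^T \left[N(t) - \mu_{tim}\right]^2 dt$$
(3.7.4)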
The time averages, $\mu_{tim}$ and $\sigma_{tim}^2$, should equal the mean and the variance of the
quasi-equilibrium distribution as the population should spend most of its time in the
quasi-equilibrium state. Thus, from equation (3.6.11), we get:
$$\mu_{tim} \approx N^*, \qquad \sigma_{tim}^2 \approx -\frac{Q}{2\lambda}$$
(3.7.5)
One can also define the autocovariance function, $C(\tau)$, using time averages (with
$\langle \cdot \rangle$ denoting a time average):
$$C(\tau) \equiv \left\langle \left[N(t) - \langle N \rangle\right]\left[N(t - \tau) - \langle N \rangle\right] \right\rangle$$
(3.7.6)
The autocovariance function gives one an indication of the time it takes for a population
to 'forget' its initial value.
An alternative method of deriving the gross fluctuation characteristics is to use the
SDE formulation. The stochastic differential equation used to model the population (see
equation (3.3.5)) was:
$$dN = \left[B(N) - D(N)\right] dt + \sqrt{B(N) + D(N)}\; d\omega(t)$$
(3.7.7)
Retaining the usage of the functions $f(N)$ and $g(N)$, as defined in Section 3.6, we
have:
$$dN = f(N)\, dt + \sqrt{g(N)}\; d\omega(t)$$
(3.7.8)
If one were then to approximate the functions $f(N)$ and $g(N)$ around $N^*$ by
the equations (3.6.7) and (3.6.8), we would obtain the following linear SDE:
$$dn = \lambda n\, dt + \sqrt{Q}\; d\omega(t)$$
(3.7.9)
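One can easily derive the gross fluctuation characteristics for a linear SDE using
Fourier methods. Consequently, a brief description of Fourier analysis is given below. (It
is also advisable to consult Appendix C as it gives proofs of some key Fourier
theorems.) For any function $x(t)$, its Fourier transform, $\tilde{x}(\omega)$, is defined over the interval
$[-T/2,\ T/2]$ as:
$$\tilde{x}(\omega) = \int_{-T/2}^{T/2} x(t)\, e^{-i\omega t}\, dt$$
(3.7.10)
The Fourier transform of the derivative of $x(t)$ is then (neglecting boundary terms):
$$\widetilde{\left(\frac{dx}{dt}\right)}(\omega) = i\omega\, \tilde{x}(\omega)$$
(3.7.11)
It is this result in particular which makes a linear SDE amenable to Fourier analysis.
The spectral density, $S_x(\omega)$, of the function $x(t)$ is defined as:
$$S_x(\omega) = \lim_{T \to \infty} \frac{|\tilde{x}(\omega)|^2}{T}$$
(3.7.12)
The solution of a linear SDE consists of a transient term that depends on the initial
condition and a persisting term driven by the noise. If the population's equilibrium state
is stable, the transient initial condition-dependent term will decay to zero and the
persisting term becomes dominant. Writing the noise increment in equation (3.7.9) as
$d\omega(t) = \gamma(t)\, dt$, where $\gamma(t)$ is white noise, and since equation (3.7.9) is linear, we know
by Fourier transforming equation (3.7.9) (and applying equation (3.7.11)) that:
$$i\omega\, \tilde{n}(\omega) = \lambda\, \tilde{n}(\omega) + \sqrt{Q}\; \tilde{\gamma}(\omega) \quad \Longrightarrow \quad \tilde{n}(\omega) = \frac{\sqrt{Q}\; \tilde{\gamma}(\omega)}{i\omega - \lambda}$$
(3.7.14)
The spectral density of the population is, upon substituting equation (3.7.14) in
equation (3.7.12), given by:
$$S_n(\omega) = \frac{Q\, S_\gamma(\omega)}{\omega^2 + \lambda^2}$$
(3.7.15)
The above relationships are useful as we know that white noise has the following
properties:
$$E[\tilde{\gamma}(\omega)] = 0; \qquad E\left[|\tilde{\gamma}(\omega)|^2\right] = T; \qquad S_\gamma(\omega) = 1$$
(3.7.16)
By applying some results proved in Appendix C to the population, we obtain:
$$\langle n(t) \rangle = \lim_{T \to \infty} \frac{\tilde{n}(0)}{T}$$
(3.7.17)
$$\langle n^2(t) \rangle = \frac{1}{2\pi} \int_{-\infty}^{\infty} S_n(\omega)\, d\omega$$
(3.7.18)
By substituting equation (3.7.14) into (3.7.17) and taking expectations, we have:
$$\mu_{tim} = N^* + \langle n(t) \rangle = N^*$$
(3.7.19)
In addition, by substituting (3.7.15) into (3.7.18) and applying the spectral property of
white noise (given in (3.7.16)), we have:
$$\sigma_{tim}^2 = \frac{Q}{2\pi} \int_{-\infty}^{\infty} \frac{d\omega}{\omega^2 + \lambda^2} = \frac{Q}{2|\lambda|} = -\frac{Q}{2\lambda}$$
(3.7.20)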
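These time averages can be checked by simulating the linear SDE for the fluctuation $n(t) = N(t) - N^*$, which has drift $\lambda n$ and noise intensity $Q$. The sketch below is a minimal Euler-Maruyama discretisation, not the method used in the text. It takes $\lambda = -0.28$ and $N^* = 17.5$ from the running example, and $Q = 1.3125$, the value implied by the quoted variance $2.34375 = -Q/(2\lambda)$. The step size, run length and seed are arbitrary illustrative choices.

```python
import math
import random

LAMBDA = -0.28   # slope of f at N*, from the running example
Q = 1.3125       # noise intensity implied by the variance -Q/(2*lambda) = 2.34375
N_STAR = 17.5    # deterministic equilibrium of the running example


def simulate_linear_sde(steps, dt, seed):
    """Euler-Maruyama for dn = lambda*n*dt + sqrt(Q)*dw; returns time-averaged mean and variance of N."""
    rng = random.Random(seed)
    noise_scale = math.sqrt(Q * dt)  # standard deviation of sqrt(Q)*dw over one step
    n = 0.0                          # start at the equilibrium, n = N - N*
    total = 0.0
    total_sq = 0.0
    for _ in range(steps):
        n += LAMBDA * n * dt + noise_scale * rng.gauss(0.0, 1.0)
        total += n
        total_sq += n * n
    mean_n = total / steps
    var_n = total_sq / steps - mean_n ** 2
    return N_STAR + mean_n, var_n


mu_tim, var_tim = simulate_linear_sde(steps=400_000, dt=0.05, seed=1)
print(f"time-averaged mean = {mu_tim:.3f} (theory {N_STAR})")
print(f"time-averaged variance = {var_tim:.3f} (theory {-Q / (2 * LAMBDA):.5f})")
```

Because successive values are correlated over a time of order $1/|\lambda|$, the run must be long relative to that scale for the time averages to settle.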
The time averages given above agree with the gross fluctuation characteristics derived
via the continuous approximation (see equation (3.7.5)). Unlike the continuous
approximation however, the stochastic differential equation formulation allows us to
easily derive the autocovariance function, $C(\tau)$, for the population. It can be proved
(see Appendix C) that:
$$C(\tau) = \frac{1}{2\pi} \int_{-\infty}^{\infty} S_n(\omega)\, e^{i\omega\tau}\, d\omega = -\frac{Q}{2\lambda}\, e^{\lambda\tau}, \qquad \tau \ge 0$$
(3.7.22)
so that the autocorrelation function, $\rho(\tau) = C(\tau)/C(0)$, is:
$$\rho(\tau) = e^{\lambda\tau}$$
(3.7.23)
The functional form of (3.7.23) implies that the population sizes at two distinct points in
time can never be negatively correlated.
3.7.A) Example (continued)
Using equation (3.7.23), we find that the autocorrelation function for the example in
Section 3.5 (where $\lambda = -0.28$) is:
[Figure: "Autocorrelation function". Horizontal axis: Time lag; vertical axis: autocorrelation, from 0 to 1.]
Figure 3.6 Correlation between two points a distance $\tau$ apart
One can see that the population size in the near future is closely correlated to the
population size now. This is to be expected as the time interval is too small to allow for
any more than a few births and deaths to occur. One can also see that the population sizes at
times a distance 20 apart are virtually uncorrelated. Thus we would expect the initial
population size of 10 to have virtually no impact on the population size at time 20.
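The decay of the correlation can be tabulated directly from the exponential form $\rho(\tau) = e^{\lambda\tau}$ with $\lambda = -0.28$. The sketch below is a small illustration. The function names and the chosen lags are not from the text, and the lag is taken in absolute value because the autocorrelation of a stationary process is symmetric in $\tau$.

```python
import math

LAMBDA = -0.28  # from the running example


def autocorrelation(tau):
    # rho(tau) = exp(lambda * |tau|); symmetric in the lag.
    return math.exp(LAMBDA * abs(tau))


def lag_for_correlation(level):
    # Lag at which the autocorrelation has fallen to the given level (0 < level <= 1).
    return math.log(level) / LAMBDA


for tau in (1, 5, 10, 20):
    print(f"rho({tau}) = {autocorrelation(tau):.4f}")
print(f"correlation halves after a lag of {lag_for_correlation(0.5):.2f} time units")
```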
3.8) Extinction
The extinction of a species is of particular interest in all population studies. Society is
rarely indifferent to the prospect of extinction of a species (whether it be the rhino… or
smallpox!). Extinction is an absorbing state. The finality of the extinction state is, in large
part, the justification for any interest in this state. The probability of a population being
extinct at time $t$ is $p_0(t)$. Since extinction is an absorbing state, $p_0(t)$ is always a
non-decreasing function of time.
Let $T_{N_0}$ denote the time to extinction when the current population size is $N_0$. Also let
$f_{N_0}(t)$ be the density function and $F_{N_0}(t)$ the cumulative distribution function of $T_{N_0}$.
We thus have:
$$F_{N_0}(t) = \Pr\left[N(t) = 0 \text{ given } N(0) = N_0\right] = p_0(t) \quad \text{with } N(0) = N_0$$
(3.8.1)
$$f_{N_0}(t) = \dot{p}_0(t) \quad \text{with } N(0) = N_0$$
(3.8.2)
Let $E_{N_0}$ denote the mean time to extinction. Using the functional form for $\dot{p}_0(t)$ (shown in equation
(3.1.4)), namely $\dot{p}_0(t) = D(1)\, p_1(t)$, we then have:
$$E_{N_0} = \int_0^{\infty} t\, \dot{p}_0(t)\, dt = D(1) \int_0^{\infty} t\, p_1(t)\, dt$$
(3.8.3)
So for the "deaths-only" process in Section 3.2 (where $D(N) = bN$), we have:
$$E_{N_0} = b N_0 \int_0^{\infty} t\, e^{-bt} \left(1 - e^{-bt}\right)^{N_0 - 1} dt = \frac{1}{b} \sum_{N=1}^{N_0} \frac{1}{N}$$
(3.8.4)
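The deaths-only case can be checked numerically. With $D(N) = bN$ each individual survives to time $t$ independently with probability $e^{-bt}$, so $p_1(t) = N_0 e^{-bt}(1 - e^{-bt})^{N_0 - 1}$. The mean time to extinction, $D(1)\int_0^\infty t\, p_1(t)\, dt$, is then the sum of the mean waiting times $1/(bN)$ for $N = N_0, \ldots, 1$. The sketch below compares a trapezoidal evaluation of the integral with that harmonic sum. The values $b = 0.5$ and $N_0 = 10$, the integration cut-off and the function names are illustrative assumptions.

```python
import math


def mean_extinction_closed_form(n0, b):
    # Sum of the mean waiting times 1/(b*N) for N = n0, ..., 1.
    return sum(1.0 / (b * k) for k in range(1, n0 + 1))


def mean_extinction_by_quadrature(n0, b, t_max, steps):
    """Trapezoidal rule for D(1) * integral of t * p1(t) dt on [0, t_max], with D(1) = b."""
    h = t_max / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        survive = math.exp(-b * t)  # probability that one individual is still alive at time t
        p1 = n0 * survive * (1.0 - survive) ** (n0 - 1)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * b * t * p1
    return total * h


b, n0 = 0.5, 10  # illustrative values, not taken from the text
closed_form = mean_extinction_closed_form(n0, b)
numeric = mean_extinction_by_quadrature(n0, b, t_max=200.0, steps=200_000)
print(f"closed form = {closed_form:.6f}, quadrature = {numeric:.6f}")
```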
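For the example in Section 3.5, one cannot obtain an analytical expression for $p_1(t)$
and hence we will only be able to derive the mean time to extinction numerically using
the above method. Alternatively, one could try to derive an approximate analytical
expression. The mean time to extinction can also be written in terms of the survival function:
$$E_{N_0} = \int_0^{\infty} S_{N_0}(t)\, dt = \int_0^{\infty} \left[1 - p_0(t)\right] dt, \quad \text{where } S_{N_0}(t) = P\left[T_{N_0} > t\right]$$
(3.8.5)
Nisbet et al. (1982) showed that when a population is close to its quasi-equilibrium
state, then:
$$p_1(t) \approx p_1^* \left[1 - p_0(t)\right], \quad \text{where } p_1^* \text{ is the quasi-equilibrium probability that } N = 1$$
(3.8.6)
The Kolmogorov equation for the extinction probability (see equation (3.1.4)) is:
$$\frac{dp_0(t)}{dt} = D(1)\, p_1(t)$$
(3.8.7)
Substituting the approximation (3.8.6) gives:
$$\frac{dp_0(t)}{dt} \approx D(1)\, p_1^* \left[1 - p_0(t)\right]$$
(3.8.8)
which, taking $p_0(0) = 0$, has the solution:
$$p_0(t) \approx 1 - \exp\left\{-D(1)\, p_1^*\, t\right\}$$
(3.8.9)
Substituting (3.8.9) into (3.8.5), we obtain:
$$E_{N_0} \approx \int_0^{\infty} \exp\left\{-D(1)\, p_1^*\, t\right\} dt = \left[D(1)\, p_1^*\right]^{-1}$$
(3.8.10)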
This expression is independent of the initial population size, $N_0$. This is because we are
assuming that the population has reached the quasi-equilibrium state, which is
independent of the initial population size. For the example in Section 3.5, $D(1) = 0.021$
and $p_1^*$ is known (this probability was calculated in Section 3.6.A by solving equation
(3.6.4)). Substituting these numbers into equation (3.8.10), the model estimates the
mean time to extinction to be $6.1 \times 10^{12}$ time units, for any initial population size.
One can obtain an exact result for the example in Section 3.5 using the fact that we
are modelling the population as a Markov process. Let $R^+$ be a modified coefficient
matrix of $R$ (see equation (3.1.5)), obtained by deleting the first row and the first column
of $R$. ($R^+$ is not quite the same as $R^*$, which was used in equation (3.6.3), as we do
not make $D(1) = 0$. As such, $R^+$ is the coefficient matrix of the Kolmogorov equations
amongst the transient states.) Let $M = (m_{ij})$ be the matrix of so-called mean residence
times. The element $m_{ij}$ is defined as the expected value of the total elapsed time that a
population, which starts at size $N(0) = i$, will be of size $j$ prior to the population
becoming extinct. Matis et al. (2000) stated that:
$$M = -(R^+)^{-1}$$
(3.8.11)
The mean time to extinction given that $N(0) = N_0$, denoted by $E_{N_0}$, is:
$$E_{N_0} = \sum_{j} m_{N_0 j}$$
(3.8.12)
For the example in Section 3.5, $R$ is a $21 \times 21$ matrix, as the population size cannot
increase above 20. The matrix $R^+$ is thus invertible and consequently, one can derive
the expected time to extinction using equation (3.8.12) for any initial population size.
The matrix $R^+$ is the same as the matrix shown in Section 3.6.A, except instead of
having the top left entry of the matrix, $r_{1,1} = -0.285$, we have $r_{1,1} = -0.306$. By applying
equations (3.8.11) and (3.8.12) in turn, we thus find, for $N_0 = 1, 2, \ldots, 20$:
$$E_{N_0} = 10^{12} \times (6.124,\ 6.575,\ 6.612,\ 6.615,\ 6.616,\ \ldots,\ 6.616)$$
where the value 6.616 applies to every initial size from $N_0 = 5$ to $N_0 = 20$.
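The exact calculation can be sketched as follows. Taking the rows of $R^+$ to index the starting size, summing the rows of $M = -(R^+)^{-1}$ is the same as solving the linear system $R^+ E = -\mathbf{1}$ for the vector $E$ of mean extinction times. The rates $B(N) = 0.3N - 0.015N^2$ and $D(N) = 0.02N + 0.001N^2$ are an assumption, chosen to agree with the quoted entries $r_{1,1} = -0.306$ and $D(1) = 0.021$. The row convention and the helper names are also assumptions. Exact rational arithmetic is used because $R^+$ is very nearly singular, which is what makes the extinction times so large.

```python
from fractions import Fraction


def birth(n):
    # Assumed birth rate B(N) as an exact fraction (hypothetical form).
    return Fraction(3, 10) * n - Fraction(15, 1000) * n * n


def death(n):
    # Assumed death rate D(N) as an exact fraction (hypothetical form).
    return Fraction(2, 100) * n + Fraction(1, 1000) * n * n


def transient_generator(k):
    """R+ for sizes 1..k; row i holds the rates out of size i (row convention assumed)."""
    r = [[Fraction(0)] * k for _ in range(k)]
    for i in range(1, k + 1):
        r[i - 1][i - 1] = -(birth(i) + death(i))
        if i < k:
            r[i - 1][i] = birth(i)      # birth: size i -> i + 1
        if i > 1:
            r[i - 1][i - 2] = death(i)  # death: size i -> i - 1 (a death from size 1 leaves the block)
    return r


def solve(a, rhs):
    """Gauss-Jordan elimination in exact arithmetic; returns x with a x = rhs."""
    n = len(a)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = next(row for row in range(col, n) if aug[row][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for row in range(n):
            if row != col and aug[row][col] != 0:
                factor = aug[row][col]
                aug[row] = [x - factor * y for x, y in zip(aug[row], aug[col])]
    return [aug[i][n] for i in range(n)]


K = 20  # largest attainable population size in the example
E = solve(transient_generator(K), [Fraction(-1)] * K)
for n0, value in enumerate(E, start=1):
    print(f"N0 = {n0:2d}: mean time to extinction = {float(value):.4e}")
```

For much larger state spaces this dense elimination becomes slow, which is the practical limitation that motivates the approximate formula.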
The mean time to extinction increases as the initial population size increases. This is fairly
intuitive as the population has to suffer the loss of an additional member of the
population in order to become extinct. However, the mean time to extinction, $E_{N_0}$, is
effectively the same for initial population sizes of four and above. It thus seems that the
fact that the approximate expression for $E_{N_0}$ (equation (3.8.10)) is independent of the
initial population size is not all that unreasonable. Indeed, the estimated time to
extinction using equation (3.8.10) is not far from any of the true mean times to
extinction. Unfortunately, equations (3.8.11) and (3.8.12) cannot be used when the
population sizes are very large as the resulting matrices are also large and hence
difficult to manipulate. This is when the approximate analytical expression derived by
Nisbet et al. (1982) becomes especially useful.
The Birth-and-Death model is one of the most widely-used stochastic representations of
a population. In addition to its flexibility, the Birth-and-Death model is also appealing
due to the simplicity of its underlying principle: the population number can only change
when a member of the population gives birth or dies. In the following Chapter, we look
at an alternative method of modelling demographic stochasticity.
