A STATISTICAL MODEL FOR THE TRANSMISSION
OF INFECTIOUS DISEASES
WANG WEI
NATIONAL UNIVERSITY OF SINGAPORE
2007
A STATISTICAL MODEL FOR THE TRANSMISSION
OF INFECTIOUS DISEASES
WANG WEI
(B.Sc. University of Auckland, New Zealand)
A THESIS SUBMITTED
FOR THE DEGREE OF MASTER OF SCIENCE
DEPARTMENT OF STATISTICS AND
APPLIED PROBABILITY
NATIONAL UNIVERSITY OF SINGAPORE
2007
ACKNOWLEDGEMENTS
I would like to express my deep and sincere gratitude to Associate Prof. Xia
Yingcun, my supervisor, for his valuable advice and guidance, endless patience,
kindness and encouragement. I do appreciate all the time and effort he has spent in
helping me to solve the problems I encountered. I have learned many things from him,
especially regarding academic research and character building.
I would also like to give my special thanks to my husband Mi Yabing for his love
and patience during my graduate period. I feel a deep sense of gratitude to my
parents, who taught me the things that really matter in life.
Furthermore, I would like to attribute the completion of this thesis to other
members of the department for their help in various ways and for providing such a
pleasant working environment, especially to Ms. Yvonne Chow and Mr. Zhang Rong.
Finally, it is a great pleasure to record my thanks to my dear friends: to Mr. Loke
Chok Kang, Mr. Khang Tsung Fei, Ms. Zhang Rongli, Ms. Zhao Wanting, Ms. Huang
Xiaoying, Ms. Zhang Xiaoe, Mr. Li Mengxin, Mr. Jia Junfei and Mr. Wang Daqing,
who have given me much help in my study. Sincere thanks to all my friends who helped
me in one way or another for their friendship and encouragement.
Wang Wei
July 2007
CONTENTS
Acknowledgements
Summary
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Epidemiological background
1.2 Main objectives of this thesis
1.3 Organization of this thesis
Chapter 2 Classical Epidemic Models
2.1 Susceptible-infective-removed models (SIR)
2.2 The assumptions for epidemic models
2.2.1 Assumptions about the population of hosts
2.2.2 Assumptions about the disease mechanism
2.3 Deterministic models
2.4 Stochastic models
2.5 Some terminology
2.5.1 Basic reproductive rate
2.5.2 Important time scales in epidemiology
Chapter 3 Statistical Epidemic Models
3.1 The time series – susceptible – infected – recovered model (TSIR)
3.1.1 Check the validation of TSIR model
3.1.2 The relationship between the parameters in TSIR model and the deterministic SIR model
3.1.3 Impact of aggregated data on the model fit
3.2 The cumulative alertness infection model (CAIM)
3.2.1 Development of CAIM
3.2.2 Extension of CAIM
3.2.2.1 Change of alertness
3.2.2.2 Long term epidemics
3.2.3 Data denoising and model estimation
Chapter 4 Application of CAIM to Real Epidemiological Data
4.1 Foot and Mouth Disease (FMD)
4.2 Severe Acute Respiratory Syndrome (SARS)
4.2.1 SARS in Hong Kong
4.2.2 SARS in Singapore and Ontario, Canada
4.3 Measles
4.4 Concluding remarks
BIBLIOGRAPHY
Appendix Programme Codes
1 Programme to produce the realization of deterministic SIR model with different parameters in (2.3)
2 Programme to produce the realization of deterministic SIR model with different R0 in (2.5.1)
3 Programme to check the validation of TSIR model in (3.1.1)
4 Programme to investigate the relationship between the parameters in TSIR model and R0 in (3.12)
5 Programme to check the impact of aggregated data on the TSIR model fit in (3.1.3)
6 Programme to investigate the relationship between the three parameters in CAIM and R0 in (3.2.1)
7 Programme to apply CAIM to fit 2001 FMD data in UK in (4.1)
8 Programme to apply CAIM to fit 2003 SARS data in HK in (4.2.1)
9 Programme to apply CAIM to fit 2003 SARS data in Singapore in (4.2.2)
10 Programme to apply CAIM to fit 2003 SARS data in Ontario in (4.2.2)
11 Programme to apply CAIM to fit measles data in China in (4.3)
SUMMARY
Infectious diseases are diseases that can be transmitted from one human or animal
host to others. The impact of an outbreak of an infectious disease on humans and
animals is enormous, both in terms of suffering and in terms of social and economic
consequences. In order to make predictions about disease dynamics and to determine
and evaluate control strategies, it is essential to study the spread of such diseases,
both in time and in space. To this end, mathematical modeling is a useful tool for
gaining a better understanding of transmission mechanisms.
Two classical types of mathematical model, deterministic and stochastic, have been
applied to the study of infectious diseases, and the concepts derived from such
models are now widely used in the design of infection control programmes.
Although the mathematical models are elegant, there are still reasons to develop more
practical statistical models. During the outbreak of an infectious
disease, what we can observe in the first place is a time series of the number of cases.
It is very important to analyze the available time series promptly and to
provide useful suggestions. However, most existing mathematical models are based
on a system of differential equations with many unknown parameters, which are
difficult to estimate statistically. Furthermore, these models need the effective number
of susceptibles, which is also difficult to define and calculate. In this thesis, we first
propose the time series-susceptible-infected-recovered (TSIR) model based on the
compartmental SIR mechanism. The validity of the TSIR model is checked by
simulation, and the results show that the TSIR model performs well. Then we develop a
more practical statistical model, the cumulative alertness infection model (CAIM),
based on the TSIR model, which requires only the reported number of cases. The
parameters in the CAIM are interpreted and some extensions of the CAIM are
discussed.
We also apply the CAIM to real data on several infectious diseases: foot-and-mouth
disease in the UK in 2001, SARS in Hong Kong, Singapore and Ontario, Canada,
in 2003, and measles in China from 1994 to 2005. Our findings show that the CAIM
can mimic the dynamics of these diseases reasonably well. The results indicate that
the CAIM may be helpful in making predictions about infectious disease dynamics.
List of Tables
Table 2.1 Estimated values of R0 for various infectious diseases
Table 2.2 Incubation, latent and infectious periods (in days) for some infectious diseases
List of Figures
Figure 2.1 Plot of a typical deterministic realization of an epidemic SIR model with N=100, β=0.005, γ=0.1 for change in the number of infectives
Figure 2.2 Plot of a typical deterministic realization of an epidemic SIR model with N=100, β=0.005, γ=0.1 for change in the number of susceptibles
Figure 2.3 Plot of a typical deterministic realization of an epidemic SIR model with N=100, β=0.015, γ=0.1 for change in the number of infectives
Figure 2.4 Plot of a typical deterministic realization of an epidemic SIR model with N=100, β=0.015, γ=0.1 for change in the number of susceptibles
Figure 2.5 Plot of a realization from the differential SIR model with R0 = 0.8, N=100 and γ=0.1
Figure 2.6 Plot of a realization from the differential SIR model with R0 = 1.5, N=100 and γ=0.1
Figure 2.7 Plot of a realization from the differential SIR model with R0 = 5, N=100 and γ=0.1
Figure 2.8 Diagrammatic illustration of the relationship between the incubation, latent and infectious periods for a hypothetical microparasitic infection
Figure 3.1 Plot of a deterministic realization of the estimated TSIR model and the SIR model with R0 = 2.0, γ = 1/5, N = 5000000
Figure 3.2 Plot of a deterministic realization of the estimated TSIR model and the SIR model with R0 = 2.0, γ = 1/10, N = 5000000
Figure 3.3 Plot of a deterministic realization of the estimated TSIR model and the SIR model with R0 = 7.0, γ = 1/5, N = 5000000
Figure 3.4 Plot of a deterministic realization of the estimated TSIR model and the SIR model with R0 = 7.0, γ = 1/10, N = 5000000
Figure 3.5 Plot of the relationship between c in TSIR model and R0
Figure 3.6 Plot of the relationship between r in TSIR model and R0
Figure 3.7 Plot of the relationship between α in TSIR and R0
Figure 3.8 Plot of the deterministic realization of the estimated TSIR model before aggregating data based on SIR model with R0 = 2.0, γ = 1/5 and N=5000000
Figure 3.9 Plot of the deterministic realization of the estimated TSIR model by aggregating data into 3 time units based on SIR model with R0 = 2.0, γ = 1/5 and N=5000000
Figure 3.10 Plot of the deterministic realization of the estimated TSIR model by aggregating data into 5 time units based on SIR model with R0 = 2.0, γ = 1/5 and N=5000000
Figure 3.11 Plot of the deterministic realization of the estimated TSIR model by aggregating data into 7 time units based on SIR model with R0 = 2.0, γ = 1/5 and N=5000000
Figure 3.12 Plot of the deterministic realization of the estimated TSIR model before aggregating data based on SIR model with R0 = 2.0, γ = 1/10 and N=5000000
Figure 3.13 Plot of the deterministic realization of the estimated TSIR model by aggregated data into 3 time units based on SIR model with R0 = 2.0, γ = 1/10 and N=5000000
Figure 3.14 Plot of the deterministic realization of the estimated TSIR model by aggregated data into 5 time units based on SIR model with R0 = 2.0, γ = 1/10 and N=5000000
Figure 3.15 Plot of the deterministic realization of the estimated TSIR model by aggregated data into 7 time units based on SIR model with R0 = 2.0, γ = 1/10 and N=5000000
Figure 3.16 Plot of the deterministic realization of the estimated TSIR model by aggregated data into 10 time units based on SIR model with R0 = 2.0, γ = 1/10 and N=5000000
Figure 3.17 Plot of the deterministic realization of the estimated TSIR model by aggregated data into 12 time units based on SIR model with R0 = 2.0, γ = 1/10 and N=5000000
Figure 3.18 Plot of the relationship between R in CAIM model and R0
Figure 3.19 Plot of the relationship between r in CAIM model and R0
Figure 3.20 Plot of the relationship between α in CAIM model and R0
Figure 4.1 Plot of the observed daily cases of the 2001 FMD epidemic in the UK
Figure 4.2 Plot of the deterministic and stochastic realization of the estimated model
Figure 4.3 Plot of the deterministic and stochastic realization of one-step ahead prediction based on the estimated model
Figure 4.4 Plot of the observed daily cases of the 2003 SARS epidemic in Hong Kong
Figure 4.5 Plot of the deterministic and stochastic realization of the estimated model for the data of HK
Figure 4.6 Plot of the deterministic and stochastic realization of one-step ahead prediction based on the estimated model for the data of HK
Figure 4.7 Plot of the deterministic and stochastic realization of the estimated model for the data of HK
Figure 4.8 Plot of the deterministic and stochastic realization of one-step ahead prediction based on the estimated model for the data of HK
Figure 4.9 Plot of the observed daily cases of the 2003 SARS epidemic in Singapore
Figure 4.10 Plot of the observed daily cases of the 2003 SARS epidemic in Ontario, Canada
Figure 4.11 Plot of the deterministic and stochastic realization of the estimated model for the data of Singapore
Figure 4.12 Plot of the deterministic and stochastic realization of one-step ahead prediction based on the estimated model for the data of Singapore
Figure 4.13 Plot of the deterministic and stochastic realization of the estimated model for the data of Ontario
Figure 4.14 Plot of the deterministic and stochastic realization of one-step ahead prediction based on the estimated model for the data of Ontario
Figure 4.15 Plot of the observed monthly measles cases from January 1994 to December 2005 in China
Figure 4.16 Plot of the deterministic realization of the estimated model for the data of Measles in China
Figure 4.17 Plot of the deterministic realization of one-step ahead prediction based on the estimated model for the data of measles in China
CHAPTER 1
Introduction
1.1 Epidemiological background
In recent years, as terms such as the Ebola virus, avian influenza and SARS
have frequently dominated news headlines, infectious diseases have become a source of
continuous concern. Owing to improved sanitation, antibiotics and vaccination
programs, it was believed in the 1960s that infectious diseases would soon be eliminated.
However, as new infectious diseases such as AIDS (1981), hepatitis C (1989),
hepatitis E (1990) and hantavirus (1993) have emerged, and some existing diseases
such as malaria and dengue fever have reemerged and are spreading into new regions
as the climate changes (Hethcote [2000]), infectious diseases have gained increasing
recognition as a key concern for human communities.
By definition, an infectious disease is a clinically evident disease of humans or
animals that damages or injures the host so as to impair host function, and results
from the presence and activity of one or more pathogenic microbial agents, including
viruses, bacteria, fungi, protozoa, multi-cellular parasites, and recently identified
proteins known as prions.
An infectious disease is transmitted from some source. The infectiousness of a
disease indicates the comparative ease with which the disease can be transmitted
from one host to others. The transmissible nature of infectious diseases makes them
fundamentally different from non-infectious diseases. Transmission of an infectious
disease may occur through several pathways, including contact with infected
individuals, water, food, airborne inhalation, or vector-borne spread.
When there is an infectious disease epidemic (an unusually high number of cases
in a region), or pandemic (a global epidemic), public health professionals and policy
makers will be interested in such questions as how many people will be affected
altogether and thus require treatment? What is the maximum number of people
needing care at any particular time? How long will the epidemic last? How much
good would different interventions do in reducing the severity of the epidemic? To
answer these questions, mathematical models can be useful tools for understanding
the patterns of disease spread and assessing the effects of different interventions.
The application of mathematics to the study of infectious disease appears to have
been initiated by Daniel Bernoulli in 1760. He used a mathematical method to
evaluate the effectiveness of variolation against smallpox, with the
aim of influencing public health policy. Epidemiological modeling proper, however,
seems to have started in the 20th century. In 1906, Hamer formulated a discrete-time model
to analyze the regular recurrence of measles epidemics and put forward the so-called
‘mass action principle’, one of the most important concepts in mathematical
epidemiology, which states that the number of new cases per unit time depends on the
product of the density of susceptible individuals and the density of infectious individuals. In
1908, Ronald Ross translated this ‘mass action principle’ into a continuous-time
framework in his pioneering work on the dynamics of malaria (Ross [1911], [1916],
[1917]). The first complete mathematical model for the spread of an infectious disease
was proposed by Kermack and McKendrick [1927], known as the deterministic
general epidemic model. Based on this model, Kermack and McKendrick established
the celebrated Threshold Theorem, according to which the introduction of a few
infectious individuals into a community of susceptibles will not give rise to an
epidemic outbreak unless the density or number of susceptibles is above a certain
critical value. Thereafter, as cornerstones of modern theoretical epidemiology, the
threshold theorem and the mass action principle began to provide a firm theoretical
framework for the investigation of observed patterns.
As epidemiological data became more extensive, especially when small family or
household groups were considered, variation and elements of chance became more
important determinants of the spread and persistence of infection, and this led to the
development of stochastic theories and probabilistic models (Bartlett [1955], Bailey
[1975]). McKendrick [1926] was the first to propose a stochastic model. While the
deterministic model takes the actual number of new cases in a short time interval
to be proportional to the numbers of both susceptibles and infectious cases, as well as
to the length of the interval, the stochastic model assumes that the probability of one new
case in a short interval is proportional to the same quantity. Unfortunately, this
stochastic continuous-time version of the deterministic model of Kermack and
McKendrick [1927] did not receive much attention. It was not until the late 1940s,
when Bartlett [1949] studied the stochastic version of the Kermack-McKendrick
model and developed a partial differential equation for the probability-generating
function of the numbers of susceptibles and infectious cases at any instant, that
stochastic continuous-time epidemic models began to be analyzed more extensively.
Since then, the effort put into modeling infectious diseases has more or less exploded.
There is a vast literature on deterministic and stochastic epidemic modeling. Here
only a few central books on epidemic modeling will be mentioned. Most of the work
on modeling disease transmission prior to 1975 is contained in Bailey [1975]. The
author presents a comprehensive account of both deterministic and stochastic models,
illustrates the use of a variety of the models using real outbreak data and provides us
with a complete bibliography of the area. The book that has received most attention
recently is Anderson and May [1991]. The authors model the spread of disease for
several different situations and give many practical applications, but only focus on
deterministic models. A thematic semester at the Isaac Newton Institute, Cambridge,
resulted in three collections of papers (Mollison [1995], Isham and Medley [1996],
Grenfell and Dobson [1996]), covering topics in stochastic modelling and statistical
analysis of epidemics, human infectious diseases and animal diseases, respectively.
Very recently, the book by Daley and Gani [1999] focuses on stochastic modeling but
also contains statistical inference and deterministic modeling, as well as some
historical remarks. Another new monograph by Diekmann and Heesterbeek [2000] is
concerned with mathematical epidemiology of infectious diseases and their methods
are also applied to real data.
Although the mathematical models are elegant, there are still reasons to develop
more practical statistical models. During the outbreak of an
infectious disease, what we can observe in the first place is a time series of the
number of cases. It is very important to analyze the available
time series promptly and to provide useful suggestions. However, most existing mathematical
models are based on a system of differential equations with many unknown
parameters, which are difficult to estimate statistically. Furthermore, these models
need the effective number of susceptibles, which is also difficult to define and
calculate. In this thesis, a simple and practical statistical model is proposed that
considers only the reported number of cases, and its relationship with the classical
mathematical models is analyzed.
1.2 Main objectives of this thesis
We start with an introduction to two classical mathematical models for infectious
diseases. For simplicity, in this thesis we assume that the population is a single group
of homogeneous individuals who mix uniformly and that its size remains roughly
constant throughout the epidemic. At any given time, an individual in the population is
either susceptible to the disease, or infectious with it, or a removed case by acquired
immunity or isolation or death. Furthermore, the infectious disease discussed here is a
directly transmitted viral or bacterial disease. First, we propose the time
series-susceptible-infected-recovered (TSIR) model. Then we check the validity of the
TSIR model and investigate the relationship between the parameters in the TSIR model
and those of the classical mathematical epidemic model. Next, based on the TSIR model,
we develop a completely case-driven model, the cumulative alertness infection model
(CAIM). The parameters in the CAIM are interpreted and some extensions of the CAIM
are discussed. Finally, we apply the CAIM to real data on some infectious diseases
to demonstrate that the CAIM is useful in making predictions about disease
dynamics.
1.3 Organization of this thesis
We organize this thesis into four chapters. Chapter 1 is a historical introduction to
epidemic models for infectious diseases. In Chapter 2, we review
two classical mathematical models for infectious diseases and some important
terminology in epidemiological study. In Chapter 3, we introduce two new statistical
models, TSIR and CAIM. The validity of the models is investigated and the
parameters in the new models are interpreted. In the last chapter, Chapter 4, we apply
the CAIM to real data on some infectious diseases and demonstrate that the
new model can be very useful in making predictions about disease dynamics.
CHAPTER 2
Classical Epidemic Models
Infectious disease data have two features that distinguish them from other data:
they are inherently highly dependent, and the infection process cannot be
observed entirely. Therefore, the analysis of such data is usually most effective when it is
based on a model that describes a number of aspects of the underlying infection
pathway, i.e. on an epidemic model. The main purpose of an epidemic model is to
take facts about the disease as inputs and to make predictions about the numbers of
infected and uninfected people over time as outputs.
The application of mathematical models to the dynamics of infectious diseases such as
measles, influenza, rubella and chicken pox has been a real success story of 20th
century science (see Hethcote [1976], Dietz [1979], Anderson and May [1982], Dietz
and Schenzle [1985], Hethcote and Van Ark [1987], Castillo-Chavez et al. [1989],
Feng and Thieme [1995]). Even though the dynamics of a disease may appear to be very
complex, surprisingly simple mathematical models can be used to understand the features
governing the outbreak and persistence of infectious diseases.
Epidemic modeling is expected to attain three aims. The first is to understand
better the mechanisms by which diseases spread. The second is to predict the
future course of the epidemic. The third is to understand how we may control the
spread of the epidemic. Of the several methods for achieving this control, education,
immunization and isolation are those most often used.
2.1 Susceptible-infective-removed models (SIR)
A population comprises a large number of individuals, all of whom differ in
various respects. In order to model the progress of an epidemic in such a population, this
diversity must be reduced to a few key characteristics which are relevant to the
infection under consideration. For most common infectious diseases, all individuals
are initially susceptible. On infection they become infectious for a period, after which
they stop being infectious, recover and become immune or die. They are then said to
be removed. Any individual who is infectious is called an infective. Hence, the
population can be divided into those who are susceptible to the disease (S), those who
are infected (I) and those who have been removed (R). These subdivisions of the
population are called compartments. Models which assume that individuals pass
through the susceptible (S), infective (I) and removed (R) states in turn are called SIR
models.
The SIR model is dynamic in two senses. First, the model is dynamic in that the
numbers in each compartment may fluctuate over time. During an epidemic, the
number of susceptibles falls more rapidly as more of them are infected and thus enter
the infectious and recovered compartments. The disease cannot break out again until
the number of susceptibles has built back up as a result of babies being born into the
compartment. Second, the SIR model is dynamic in the sense that individuals are born
susceptible, then may acquire the infection (move into the infectious compartment)
and finally recover (move into the recovered compartment). Thus each member of the
population typically progresses from susceptible to infectious to recovered. This can be
shown as a flow diagram in which the boxes represent the different compartments and
the arrows the transitions between compartments.
For the full specification of the model, the arrows should be labeled with the
transition rates between compartments. Between S and I, the transition rate is λ, the
force of infection, which is simply the rate at which susceptible individuals become
infected by an infectious disease. Between I and R, the transition rate is γ (simply the
rate of recovery). If the mean duration of the infection is denoted D, then D = 1/ γ,
since an individual experiences one recovery in D units of time.
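The relation D = 1/γ can also be checked with a short calculation. The sketch below assumes, beyond what is stated above, that each infective leaves the infectious compartment at a constant per-capita rate γ, so that the time T spent infectious is exponentially distributed; the symbol T is introduced here only for illustration.

\[
P(T > t) = e^{-\gamma t}, \qquad
D = E[T] = \int_0^{\infty} P(T > t)\,dt = \int_0^{\infty} e^{-\gamma t}\,dt = \frac{1}{\gamma}.
\]

For instance, a recovery rate of γ = 0.1 per unit time corresponds under this assumption to a mean duration of infection of D = 10 time units.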
2.2 The assumptions for epidemic models
Any model rests on a set of assumptions. For simplicity
and convenience, the assumptions for the models considered in this thesis are
introduced first.
An epidemic process of an infectious disease can be thought of as the evolution of
the disease phenomenon within a given population of individuals. Naturally, the
assumptions for the epidemic models involve two aspects: assumptions about the
population of hosts and assumptions about the disease mechanism.
2.2.1 Assumptions about the population of hosts
In general, populations of hosts show demographic turnover: old individuals
disappear by death and new individuals appear by birth. Such a demographic process
has its characteristic time scale (for humans on the order of 1-10 years). The time
scale at which an infectious disease sweeps through a population is often much
shorter (e.g. for influenza it is on the order of weeks). In such a case we choose to
ignore the demographic turnover and consider the population as ‘closed’ (which also
means that we do not pay any attention to emigration and immigration).
Consider such a closed population and assume that it is naïve, in the sense that it is
completely free from the disease-causing organism in which we are interested.
Furthermore, this thesis considers only the simplest situation, in which the
disease-causing organism is introduced by a single host.
In summary, we make assumptions about the population as follows:
(a) the population structure: the population is a single group of homogeneous
individuals who mix uniformly;
(b) the population dynamics: the population is closed so that it is a constant
collection of the same set of individuals for all time;
(c) a mutually exclusive and exhaustive classification of individuals according
to their disease status: at any given time, an individual is either susceptible
to the disease, or infectious with it, or a removed case by acquired
immunity or isolation or death.
2.2.2 Assumptions about the disease mechanism
As far as the disease mechanism is concerned, we assume that we deal with
micro-parasites, which are characterized by the fact that a single infection triggers an
autonomous process in the host. Micro-parasites may be thought of as those parasites
which have direct reproduction within the host. They tend to have small size and a
short generation time. Hosts that recover from infection usually acquire immunity
against re-infection for some time, and often for life so that no individual can be
infected twice. The duration of micro-parasite infection is usually short relative to the
expected life span of the host. Most viral and bacterial parasites, and many protozoan
and fungal parasites fall into the micro-parasitic category.
In addition, we assume that the disease is spread by a contagious mechanism so
that contact between an infectious individual and a susceptible is necessary. After an
infectious contact, the infectious individual succeeds in changing the susceptible
individual’s disease status.
2.3 Deterministic models
To indicate that the numbers might vary over time (even if the total population
size remains constant), we make the precise numbers a function of t (time): S(t), I(t)
and R(t). For a specific disease in a specific population, these functions may be
worked out in order to predict possible outbreaks and bring them under control.
In the classical model for a general epidemic, the size of the population N is
assumed to be fixed, and individuals in the population are counted according to their
disease status, numbering S (t) susceptibles, I (t) infectives and R (t) removals (dead,
isolated or immune), so that S (t) is non-increasing, R (t) is non-decreasing and the
sum S (t) + I (t) + R (t) = N, for all t >0. Then, the deterministic form of the SIR
model is defined as:
$$\frac{dS(t)}{dt} = -\beta S(t) I(t) \qquad (2.1)$$
$$\frac{dI(t)}{dt} = \beta S(t) I(t) - \gamma I(t) \qquad (2.2)$$
$$\frac{dR(t)}{dt} = \gamma I(t) \qquad (2.3)$$
The evolution of this epidemic process is deterministic in the sense that no
randomness is allowed for. The results of a deterministic process are regarded as
giving an approximation to the mean of a random process.
Here β > 0 is the pairwise rate of infection (i.e. the infection parameter): the
number of infectives increases through infection at the same rate, βS(t)I(t), at
which the number of susceptibles decreases, and it decreases through removal at
rate γI(t), where γ > 0 is the removal rate at which infectives become removed.
The results derived from the equations are stated formally as follows, which
constitute a benchmark for a range of epidemic models.
Theorem 2.1 (Kermack-McKendrick). Subject to the initial conditions (S(0), I(0),
R(0)) = (S0, I0, R0) with I0 ≥ 1, R0 = 0 and S0 + I0 = N, let a general epidemic
evolve according to the differential equations (2.1)-(2.3) (Daley and Gani [1999]).
(i) (Survival and Total size). When infection ultimately ceases spreading,
a positive number S∞ of susceptibles remain uninfected, and the total
number R∞ of individuals ultimately infected and removed equals S0 +
I0 - S∞ and is the unique root of the equation
N - R∞ = S0 + I0 - R∞ = S0 exp(-R∞ /ρ),
where I0 < R∞ < S0 + I0 and ρ = γ / β is the relative removal rate.
(ii) (Threshold Theorem). A major outbreak occurs if and only if the initial
number of susceptibles S0 > ρ.
(iii) (Second Threshold Theorem). If S0 exceeds ρ by a small quantity v,
and if the initial number of infectives I0 is small relative to v, then the
final number of susceptibles left in the population is approximately ρ -
v, and R∞ ≈ 2v.
The major significance of these statements at the time of their first publication was
a mathematical demonstration that not all susceptibles would necessarily be infected.
Conditions were given for a major outbreak to occur, namely that the number of
susceptibles at the start of the epidemic should be sufficiently high; this would
happen, for example, in a city with a large population.
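The final-size equation in Theorem 2.1 (i) has no closed-form solution, but it is easy to solve numerically. The following is a minimal sketch, assuming a simple bisection search and an initial state of S0 = 99 and I0 = 1 (the function name `final_size` and these initial values are illustrative choices, not taken from the thesis):

```python
import math

def final_size(s0, i0, beta, gamma, tol=1e-10):
    """Solve N - R = S0 * exp(-R / rho) for R in (0, N) by bisection."""
    n = s0 + i0
    rho = gamma / beta  # relative removal rate

    def f(r):
        return n - r - s0 * math.exp(-r / rho)

    lo, hi = 0.0, float(n)  # f(lo) = i0 > 0 and f(hi) < 0, so a root lies between
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative values: N = 100, beta = 0.005, gamma = 0.1, so rho = 20 < S0.
r_inf = final_size(s0=99, i0=1, beta=0.005, gamma=0.1)
s_inf = 100 - r_inf
print(f"R_inf = {r_inf:.2f}, S_inf = {s_inf:.2f}")
```

Since the left-hand side minus the right-hand side is concave in R∞, positive at 0 and negative at N, the bisection converges to the unique root, and the remaining S∞ is strictly positive, as the theorem states.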
[Plot for Figure 2.1: number of cases (y-axis) against time (x-axis).]
Figure 2.1 Plot of a typical deterministic realization of an epidemic SIR model with N=100, β=0.005, γ=0.1 for change in the number of infectives.
[Plot for Figure 2.2: number of susceptibles (y-axis) against time (x-axis).]
Figure 2.2 Plot of a typical deterministic realization of an epidemic SIR model with N=100, β=0.005, γ=0.1 for change in the number of susceptibles.
[Plot for Figure 2.3: number of cases (y-axis) against time (x-axis).]
Figure 2.3 Plot of a typical deterministic realization of an epidemic SIR model with N=100, β=0.015, γ=0.1 for change in the number of infectives.
[Plot for Figure 2.4: number of susceptibles (y-axis) against time (x-axis).]
Figure 2.4 Plot of a typical deterministic realization of an epidemic SIR model with N=100, β=0.015, γ=0.1 for change in the number of susceptibles.
Deterministic modeling thus works within a structured mathematical framework, in
which the number of new cases in a short interval of time is taken to be proportional
to the product of the numbers of susceptible and infectious individuals, as well as
to the length of the time interval.
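As an illustration of the deterministic model, equations (2.1)-(2.3) can be integrated numerically. The sketch below uses a classical fourth-order Runge-Kutta scheme with the parameter values quoted for Figure 2.1; the initial state (99 susceptibles, 1 infective), the step size and the function name `sir_rk4` are assumptions made for this example only.

```python
def sir_rk4(s0, i0, beta, gamma, t_end, dt=0.01):
    """Integrate equations (2.1)-(2.3) with a classical Runge-Kutta scheme."""
    def deriv(s, i):
        new_inf = beta * s * i
        return -new_inf, new_inf - gamma * i, gamma * i

    s, i, r = float(s0), float(i0), 0.0
    path = [(0.0, s, i, r)]
    steps = int(round(t_end / dt))
    for k in range(1, steps + 1):
        k1 = deriv(s, i)
        k2 = deriv(s + 0.5 * dt * k1[0], i + 0.5 * dt * k1[1])
        k3 = deriv(s + 0.5 * dt * k2[0], i + 0.5 * dt * k2[1])
        k4 = deriv(s + dt * k3[0], i + dt * k3[1])
        s += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        i += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        r += dt * (k1[2] + 2 * k2[2] + 2 * k3[2] + k4[2]) / 6
        path.append((k * dt, s, i, r))
    return path

# N = 100, beta = 0.005, gamma = 0.1 as in Figure 2.1; S(0) = 99, I(0) = 1 assumed.
path = sir_rk4(s0=99, i0=1, beta=0.005, gamma=0.1, t_end=100)
peak_t, peak_s, peak_i, _ = max(path, key=lambda row: row[2])
print(f"peak of about {peak_i:.1f} infectives near t = {peak_t:.1f}")
```

By equation (2.2), the number of infectives stops growing when S(t) falls to ρ = γ/β, so the peak of I(t) in the computed path should occur where S(t) is close to 20.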
2.4 Stochastic Models
While deterministic methods may be adequate to characterize the spread of an
epidemic in a large population, they are not satisfactory for smaller populations, in
particular those of family size. Hence, when the number of members of the
population subject to the epidemic is relatively small, we describe the evolution of an
epidemic as a stochastic process.
We consider an SIR model in which the total population is subdivided into S (t)
susceptibles, I (t) infectives and R (t) immunes or removals, with (S, I, R) (0) = (N, I0,
0) , where I0 ≥ 1 and
S (t) + I (t) + R (t) = N +I0 (all t ≥ 0). (2.4)
We assume that {(S, I)(t) : t ≥ 0} is a homogeneous Markov chain in continuous
time, with state space the non-negative integers in Z² satisfying (2.4) and N ≥ S(t) ≥
0. The non-zero transition probabilities occur for 0 ≤ i ≤ N, 0 ≤ j ≤ N + I0 - i and are
Pr{(S, I)(t + Δt) = (i-1, j+1) | (S, I)(t) = (i, j)} = βijΔt + o(Δt) (2.5)
Pr{(S, I)(t + Δt) = (i, j-1) | (S, I)(t) = (i, j)} = γjΔt + o(Δt) (2.6)
Pr{(S, I)(t + Δt) = (i, j) | (S, I)(t) = (i, j)} = 1 - (βi + γ)jΔt + o(Δt) (2.7)
Here β is the pairwise infection parameter and γ the removal parameter. Homogeneous
mixing is assumed, so that in the time interval (t, t + Δt) an infection occurs with
probability βijΔt + o(Δt) and a removal with probability γjΔt + o(Δt).
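One realization of this Markov chain can be simulated by drawing an exponential holding time with rate (βi + γ)j and then choosing between an infection and a removal in proportion to their rates. The sketch below is one common way to do this (often called the Gillespie algorithm); the function name `gillespie_sir`, the seed and the parameter values are illustrative assumptions rather than part of the thesis.

```python
import random

def gillespie_sir(n, i0, beta, gamma, rng):
    """Simulate one path of the general stochastic epidemic (2.5)-(2.7)."""
    t, s, i = 0.0, n, i0
    path = [(t, s, i)]
    while i > 0:
        rate_inf = beta * s * i
        rate_rem = gamma * i
        total = rate_inf + rate_rem
        t += rng.expovariate(total)          # exponential holding time in state (s, i)
        if rng.random() < rate_inf / total:  # infection: (s, i) -> (s - 1, i + 1)
            s, i = s - 1, i + 1
        else:                                # removal: (s, i) -> (s, i - 1)
            i -= 1
        path.append((t, s, i))
    return path

rng = random.Random(1)  # fixed seed so that the run is reproducible
path = gillespie_sir(n=99, i0=1, beta=0.005, gamma=0.1, rng=rng)
final_t, final_s, _ = path[-1]
print(f"epidemic ended at t = {final_t:.1f} with {99 - final_s} susceptibles infected")
```

Repeating the simulation with different seeds shows the variability between realizations: some end after a handful of cases, while others produce a major outbreak.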
As in the deterministic case, there is a stochastic threshold theorem, similar in
nature to the Kermack-McKendrick theorem for the deterministic general epidemic.
Theorem 2.2 (Whittle's Threshold Theorem). Consider a general epidemic process
with initial numbers of susceptibles N and infectives I, and relative removal rate ρ.
For any ζ in (0, 1), let π(ζ) denote the probability that at most [Nζ] of the susceptibles
are ultimately infected, i.e. that the intensity of the epidemic does not exceed ζ (Daley
and Gani [1999]).
(i) If ρ < N(1 - ζ), then (ρ/N)^I ≤ π(ζ) ≤ (ρ/(N(1 - ζ)))^I.
(ii) If N(1 - ζ) ≤ ρ < N, then (ρ/N)^I ≤ π(ζ) ≤ 1.
(iii) If ρ ≥ N, then π(ζ) = 1, and the probability that the epidemic achieves
an intensity greater than any predetermined ζ in (0, 1) is zero.
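The three cases of Theorem 2.2 translate directly into a small function. This is only a sketch for exploring the bounds; the name `whittle_bounds` and the example values (N = 100, I = 1, ρ = 20, ζ = 0.5) are hypothetical.

```python
def whittle_bounds(n, i0, rho, zeta):
    """Return (lower, upper) bounds on pi(zeta) from Theorem 2.2."""
    if not 0 < zeta < 1:
        raise ValueError("zeta must lie in (0, 1)")
    if rho >= n:
        return 1.0, 1.0                          # case (iii)
    lower = (rho / n) ** i0
    if rho < n * (1 - zeta):
        upper = (rho / (n * (1 - zeta))) ** i0   # case (i)
    else:
        upper = 1.0                              # case (ii)
    return lower, upper

lower, upper = whittle_bounds(n=100, i0=1, rho=20, zeta=0.5)
print(f"{lower:.2f} <= pi(0.5) <= {upper:.2f}")
```

With a single initial infective, the lower bound ρ/N is the familiar approximation to the probability that the outbreak dies out early.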
In short, stochastic modeling works with a conditional probability structure, in which
the probability of a new case in a short interval of time is assumed to be proportional
to the numbers of susceptibles and infectives, as well as to the length of the time interval.
2.5 Some terminology
2.5.1 Basic reproductive rate
The key value governing the time evolution of these equations is the threshold
parameter, which is denoted by R0, the basic reproductive rate. R0 can be calculated
by the following formula:
R0 = β S (0) / γ
Here β is the infection parameter and γ the removal rate in Equations (2.1)-(2.3), and
S(0) is the number of susceptibles at t = 0.
R0 is defined as the number of secondary infectious individuals generated by a
“typical” infectious individual when introduced into a fully susceptible population. In
other words, it determines the number of people infected by contact with a single
infected person before his death or recovery.
There are two qualitatively different situations which may occur with an epidemic
in a large population, where R0 plays an important role.
If R0 < 1, each person who contracts the disease will infect fewer than one person
on average before dying or recovering, so the epidemic will die out quickly, as shown
in Figure 2.5.
If R0 > 1, each person who contracts the disease will infect more than one person on
average, so the epidemic will spread through the population, as shown in Figures 2.6 and 2.7.
[Plot for Figure 2.5: number of cases (y-axis) against time (x-axis).]
Figure 2.5 Plot of a realization from the differential SIR model with R0 = 0.8, N=100 and γ =0.1.
[Plot for Figure 2.6: number of cases (y-axis) against time (x-axis).]
Figure 2.6 Plot of a realization from the differential SIR model with R0 = 1.5, N=100 and γ =0.1.
[Plot for Figure 2.7: number of cases (y-axis) against time (x-axis).]
Figure 2.7 Plot of a realization from the differential SIR model with R0 = 5, N=100 and γ =0.1.
The estimated values of the basic reproductive rate, R0, for some common
infectious diseases are listed in Table 2.1 (data from Anderson and May [1991] and
Wallinga and Teunis [2004]).
Table 2.1 Estimated values of R0 for various infectious diseases.
Disease         Location             Time         R0
Measles         England and Wales    1950-1968    16-18
                Kansas, USA          1918-1921    5-6
                Ontario, Canada      1912-1913    11-12
Mumps           Baltimore, USA       1943         7-8
                England and Wales    1960-1980    11-14
                Netherlands          1970-1980    11-14
Rubella         West Germany         1970-1977    6-7
                Poland               1970-1977    11-12
Poliomyelitis   USA                  1955         5-6
                Netherlands          1960         6-7
HIV/AIDS        England and Wales    1981-1985    2-5
                Nairobi, Kenya       1981-1985    11-12
                Kampala, Uganda      1985-1987    10-11
SARS            Hong Kong            2003         3.1-4.2
                Ontario, Canada      2003         1.8-3.6
                Singapore            2003         2.3-4.0
                Vietnam              2003         1.8-3.1
Table 2.1 shows that, in practice, R0 is estimated to lie between about 2 and 20 for
most infectious diseases.
Generally, the larger the value of R0, the harder it is to control the epidemic. In
particular, the proportion of the population that needs to be vaccinated to provide herd
immunity and prevent sustained spread of the infection is given by 1-1/R0. The basic
reproductive rate is affected by several factors including the duration of infectivity of
affected patients, the infectiousness of the organism, and the number of susceptible
people in the population that the affected patients are in contact with.
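As a small worked example of the formula 1 - 1/R0, the sketch below evaluates the critical vaccination coverage for two values of R0 taken from Table 2.1. The function name `herd_immunity_threshold` is hypothetical, and the truncation at zero for R0 ≤ 1 is an added convention for this example.

```python
def herd_immunity_threshold(r0):
    """Critical vaccination coverage 1 - 1/R0 (taken as zero when R0 <= 1)."""
    return max(0.0, 1.0 - 1.0 / r0)

examples = [
    ("SARS, Singapore (lower estimate)", 2.3),
    ("Measles, England and Wales (upper estimate)", 18),
]
for disease, r0 in examples:
    coverage = herd_immunity_threshold(r0)
    print(f"{disease}: R0 = {r0}, coverage needed = {coverage:.0%}")
```

The contrast between the two cases illustrates why diseases with large R0, such as measles, are much harder to control by vaccination than diseases with R0 close to 2.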
2.5.2 Important time scales in epidemiology
Several time scales in epidemic theory need to be clarified before further
discussion.
Assume that there is an instant at which infection occurs for an individual. Then:
1. the latent period indicates the period from the point of infection to the
beginning of the state of infectiousness.
2. the incubation period indicates the period from the point of infection to the
appearance of symptoms of disease.
3. the infectious period indicates the period during which an infected host is able
to infect susceptible individuals, i.e. infectious to susceptible individuals.
Hence a host may be infected but not yet infectious. The relationship between
these time scales is illustrated in Figure 2.8.
[Diagram for Figure 2.8: a timeline in time units (0, 7, 14) starting at the point of infection, showing the latent, incubation and infectious periods, the period with symptoms and the recovered/immune state.]
Figure 2.8 Diagrammatic illustration of the relationship between the incubation, latent and infectious periods for a hypothetical microparasitic infection.
The lengths of incubation, latent and infectious periods for some infectious
diseases are listed in Table 2.2 (Data from Fenner and White [1970], Christie [1974],
Benenson [1975]).
Table 2.2 Incubation, latent and infectious periods (in days) for some infectious diseases.
Infectious disease   Incubation period   Latent period   Infectious period
Measles              8-13                6-9             6-7
Mumps                12-26               12-18           4-8
Rubella              14-21               7-14            11-12
Poliomyelitis        7-12                1-3             14-20
Influenza            1-3                 1-3             2-3
Scarlet fever        2-3                 1-2             14-21
CHAPTER 3
Statistical Epidemic Models
3.1 The time series – susceptible – infected – recovered model (TSIR)
To begin with, the population is assumed to be a single group of homogeneous
individuals who mix uniformly, and its size is assumed to remain roughly constant
through the epidemic. At any given time, an individual in the population is either
susceptible to the disease, or infectious with it, or a removed case by acquired
immunity or isolation or death. Furthermore, the infectious disease discussed here is
a directly transmitted viral disease.
We consider the simplest situation, in which only one infective is introduced into a
closed population in which all the individuals are susceptible to the disease.
Let It be the number of cases at time period t and St the number of susceptibles.
Each period is the average duration between an individual acquiring infection and
passing it on to the next infectee.
According to the compartmental SIR mechanism, the number of new cases in the
next time period is
It+1 = β St It (t = 0, 1, 2, …)
Here the constant β is such that β St It ≤ St for all t (Bailey [1975], Finkenstädt and
Grenfell [2000]). However, this model was found to be unattractive in terms of its
dynamical pattern (Anderson and May [1991]).
Therefore, another model with nonlinear incidence rates, which can show a wider
range of dynamical behavior, has been proposed:
$$I_{t+1} = c\, S_t^{r}\, I_t^{\alpha} \qquad (3.1)$$
$$S_t = S_{t-1} - I_t \qquad (3.2)$$
This model is called the time series susceptible-infected-recovered (TSIR) model
in Finkenstädt and Grenfell [2000].
3.1.1 Checking the validity of the TSIR model
Since the deterministic SIR model has been widely used to study the spread of
infectious diseases, we take it as the true model for an epidemic of an infectious
disease. Once the parameters in Equations (2.1)-(2.3) are specified, it generates the
numbers of susceptibles, infectives and recovered for each time t. In this section, we
use these generated numbers to check the validity of the TSIR model. The following
steps are taken:
Step 1 The Kermack-McKendrick differential equations with specified R0 (hence β =
R0 γ / N), γ and N are used to generate the numbers of susceptibles, infectives
and removals in an epidemic for each time t.
Step 2 The number of susceptibles (St) is extracted and, based on Equation
(3.2), the number of new infectives (It) at time t is obtained.
Step 3 Logarithms are taken on both sides of Equation (3.1) and the least squares
method is used to estimate the three parameters in the model.
Step 4 The deterministic realizations from the deterministic SIR model and from the
TSIR model are compared in the figures.
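Steps 1-3 can be sketched in a few lines of code. The version below is an illustration under stated assumptions, not the implementation used for the figures: the SIR equations are integrated with a Runge-Kutta scheme, the epidemic starts from a single infective, S(t) is read off once per time unit, and the regression log I(t+1) = log c + r log S(t) + α log I(t) is solved through the normal equations. All function names are hypothetical.

```python
import math

def simulate_susceptibles(r0, gamma, n, days, i0=1.0, dt=0.05):
    """Step 1: daily S(t) from equations (2.1)-(2.3), integrated with RK4."""
    beta = r0 * gamma / n

    def deriv(s, i):
        return -beta * s * i, beta * s * i - gamma * i

    s, i = n - i0, i0
    daily = [s]
    per_day = int(round(1 / dt))
    for _ in range(days):
        for _ in range(per_day):
            a1, b1 = deriv(s, i)
            a2, b2 = deriv(s + 0.5 * dt * a1, i + 0.5 * dt * b1)
            a3, b3 = deriv(s + 0.5 * dt * a2, i + 0.5 * dt * b2)
            a4, b4 = deriv(s + dt * a3, i + dt * b3)
            s += dt * (a1 + 2 * a2 + 2 * a3 + a4) / 6
            i += dt * (b1 + 2 * b2 + 2 * b3 + b4) / 6
        daily.append(s)
    return daily

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    k = len(m)
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, k):
            factor = m[row][col] / m[col][col]
            for c in range(col, k + 1):
                m[row][c] -= factor * m[col][c]
    x = [0.0] * k
    for row in reversed(range(k)):
        tail = sum(m[row][c] * x[c] for c in range(row + 1, k))
        x[row] = (m[row][k] - tail) / m[row][row]
    return x

def ols(rows, y):
    """Least squares coefficients via the normal equations."""
    k = len(rows[0])
    xtx = [[sum(row[p] * row[q] for row in rows) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * yi for row, yi in zip(rows, y)) for p in range(k)]
    return solve(xtx, xty)

def fit_tsir(s):
    """Steps 2-3: recover I_t from (3.2), then fit (3.1) on the log scale."""
    cases = [s[t - 1] - s[t] for t in range(1, len(s))]   # cases[t - 1] is I_t
    rows, y = [], []
    for t in range(1, len(cases)):
        i_now, i_next = cases[t - 1], cases[t]            # I_t and I_{t+1}
        if i_now > 0 and i_next > 0:
            rows.append([1.0, math.log(s[t]), math.log(i_now)])
            y.append(math.log(i_next))
    log_c, r, alpha = ols(rows, y)
    return math.exp(log_c), r, alpha

s = simulate_susceptibles(r0=2.0, gamma=1 / 5, n=5_000_000, days=200)
c, r, alpha = fit_tsir(s)
print(f"c = {c:.3g}, r = {r:.3g}, alpha = {alpha:.3g}")
```

The estimates depend on the integration step and on the length of the simulated series, so they need not reproduce the values behind the figures exactly.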
[Plot for Figure 3.1: It (y-axis) against time (x-axis).]
Figure 3.1 Plot of a deterministic realization of the estimated TSIR model and the SIR model with R0 = 2.0, γ= 1/5, N = 5000000. ‘*’ is the number of new cases from the SIR model and ‘—’ the fitted TSIR model.
[Plot for Figure 3.2: It (y-axis) against time (x-axis).]
Figure 3.2 Plot of a deterministic realization of the estimated TSIR model and the SIR model with R0 = 2.0, γ= 1/10, N = 5000000. ‘*’ is the number of new cases from the SIR model and ‘—’ the fitted TSIR model.
[Plot for Figure 3.3: It (y-axis) against time (x-axis).]
Figure 3.3 Plot of a deterministic realization of the estimated TSIR model and the SIR model with R0 = 7.0, γ= 1/5, N = 5000000. ‘*’ is the number of new cases from the SIR model and ‘—’ the fitted TSIR model.
[Plot for Figure 3.4: It (y-axis) against time (x-axis).]
Figure 3.4 Plot of a deterministic realization of the estimated TSIR model and the SIR model with R0 = 7.0, γ= 1/10, N = 5000000. ‘*’ is the number of new cases from the SIR model and ‘—’ the fitted TSIR model.
As shown in the figures above, the TSIR model can mimic the dynamic pattern of
an epidemic reasonably well.
3.1.2 The relationship between the parameters in TSIR model and the
deterministic SIR model
To investigate the relationship between the parameters in the TSIR model and R0,
Steps 1-3 in Section 3.1.1 are repeated to obtain a series of estimated parameters
corresponding to different values of R0. The range of R0 is fixed between 2 and 20,
as observed in practice (Table 2.1). Similarly, as the actual mean infectious period (D)
is usually 3-10 days, the values of γ are chosen to be 1/3, 1/5, 1/7 and 1/10 (D = 1/γ).
The population of susceptibles is fixed at N = 5,000,000. The results are shown in the
following figures.
[Plot for Figure 3.5: c (y-axis) against R0 (x-axis).]
Figure 3.5 Plot of the relationship between c in TSIR model and R0.The blue* for γ = 1/3; the green* for γ = 1/5; the red* for γ = 1/7 and the turquoise * for γ = 1/10.
As shown in Figure 3.5, the parameter c in the TSIR model tends to increase with R0
for any mean infectious period, and the relationship between them is non-linear. The
parameter c also tends to be higher for longer infectious periods. In reality, since the
infectious period of a specific infectious disease is usually stable, being one of the
biological properties of the disease, the estimated value of c in the TSIR model
can be thought to correspond to the basic reproductive number R0 of the infectious
disease during an epidemic.
[Plot for Figure 3.6: r (y-axis) against R0 (x-axis).]
Figure 3.6 Plot of the relationship between r in TSIR model and R0. The blue* for γ = 1/3; the green* for γ = 1/5; the red* for γ = 1/7 and the turquoise * for γ = 1/10.
[Plot for Figure 3.7: alpha (y-axis) against R0 (x-axis).]
Figure 3.7 Plot of the relationship between α in TSIR and R0. The blue* for γ = 1/3; the green* for γ = 1/5; the red* for γ = 1/7 and the turquoise * for γ = 1/10.
The parameters r and α can be thought of as mixing parameters of the contact process
(Liu et al [1987]). Figures 3.6 and 3.7 show that both of them have a negative
relationship with R0. As R0 determines the number of people infected by contact with
a single infected person before his death or recovery, the negative relationship
indicates that the contact rate between the infectious hosts and susceptibles tends to
be lower as R0 tends to be higher.
3.1.3 Impact of aggregated data on the model fit
For an infectious disease, the most important time scale is the infectious period,
during which an infected host is able to infect susceptible individuals. Assume that
there is an instant at which an individual is infected and immediately becomes
infectious to other susceptibles. If the infectious period is, e.g., 7 days, this infected
individual will generate new cases in the following 7 days through contact with other
susceptibles. To put it another way, if the case data are aggregated into one-week
time steps, an infected person in week t can be related to a contact between an
infected individual in week t-1 and a susceptible individual (Finkenstädt and Grenfell
[2000]). Therefore, it would be reasonable to aggregate the cases based on the
infectious period and then fit the TSIR model to the aggregated data.
To check the impact of aggregated data on the model fit, we take the following
steps:
Step 1 The Kermack-McKendrick differential equations (2.1)-(2.3) with specified R0
(hence β = R0 γ / N), γ and N are used to generate the numbers of susceptibles,
infectives and removals in an epidemic for each time t.
Step 2 The number of susceptibles (St) is extracted and aggregated into different
time units to get the aggregated number of susceptibles. Based on Equation (3.2),
the aggregated number of infectives (It) for the different time units can be obtained.
These aggregated data are then substituted into the TSIR model.
Step 3 Logarithms are taken on both sides of Equation (3.1) and the least squares
method is used to estimate the three parameters in the model.
Step 4 The deterministic realizations from the deterministic SIR model and from the
TSIR model are compared in the figures.
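The aggregation in Step 2 can be illustrated as follows. In this sketch, aggregating the susceptibles means keeping S at the end of every block of a given width, which by Equation (3.2) is equivalent to summing the new cases within each block. The function names and the short daily series are hypothetical and serve only to show the bookkeeping.

```python
def aggregate_susceptibles(daily_s, width):
    """Keep the susceptible count at the end of every block of `width` time units."""
    return daily_s[::width]

def aggregate_cases(daily_cases, width):
    """Sum consecutive counts into blocks of `width` time units (drop a short tail)."""
    n_blocks = len(daily_cases) // width
    return [sum(daily_cases[k * width:(k + 1) * width]) for k in range(n_blocks)]

# Hypothetical daily susceptible counts S_0, ..., S_8.
daily_s = [100, 98, 95, 90, 84, 79, 76, 74, 73]
daily_cases = [daily_s[t - 1] - daily_s[t] for t in range(1, len(daily_s))]

s_agg = aggregate_susceptibles(daily_s, 4)
i_agg = aggregate_cases(daily_cases, 4)
print(s_agg, i_agg)
```

Both routes give the same aggregated case series: the differences of the aggregated susceptibles equal the block sums of the daily cases.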
Here we only display the results from the deterministic SIR model with R0 = 2.0, γ
= 1/5 and N = 5000000 and with R0 = 2.0, γ = 1/10 and N = 5000000. The data from
the former model were aggregated into 3, 5 and 7 time units respectively (Figures
3.8-3.11), while the data from the latter model were aggregated into 3, 5, 7, 10 and 12
time units respectively (Figures 3.12-3.17). Note that ‘*’ is the number of new cases from the
SIR model and ‘—’ the fitted TSIR model.
[Plot for Figure 3.8: It (y-axis) against time (x-axis).]
Figure 3.8 Plot of the deterministic realization of the estimated TSIR model before aggregating data based on SIR model with R0 = 2.0, γ = 1/5 and N=5000000.
[Plot for Figure 3.9: It (y-axis) against time (x-axis).]
Figure 3.9 Plot of the deterministic realization of the estimated TSIR model by aggregating data into 3 time units based on SIR model with R0 = 2.0, γ = 1/5 and N=5000000.
[Plot for Figure 3.10: It (y-axis) against time (x-axis).]
Figure 3.10 Plot of the deterministic realization of the estimated TSIR model by aggregating data into 5 time units based on SIR model with R0 = 2.0, γ = 1/5 and N=5000000.
[Plot for Figure 3.11: It (y-axis) against time (x-axis).]
Figure 3.11 Plot of the deterministic realization of the estimated TSIR model by aggregating data into 7 time units respectively based on SIR model with R0 = 2.0, γ = 1/5 and N=5000000.
As shown in Figures 3.8-3.11, the estimated TSIR model for the data aggregated
into 5 time units appears to have the best fit.
[Plot for Figure 3.12: It (y-axis) against time (x-axis).]
Figure 3.12 Plot of the deterministic realization of the estimated TSIR model before aggregating data based on SIR model with R0 = 2.0, γ = 1/10 and N=5000000.
[Plot for Figure 3.13: It (y-axis) against time (x-axis).]
Figure 3.13 Plot of the deterministic realization of the estimated TSIR model by aggregated data into 3 time units based on SIR model with R0 = 2.0, γ = 1/10 and N=5000000.
[Plot for Figure 3.14: It (y-axis) against time (x-axis).]
Figure 3.14 Plot of the deterministic realization of the estimated TSIR model by aggregated data into 5 time units based on SIR model with R0 = 2.0, γ = 1/10 and N=5000000.
[Plot for Figure 3.15: It (y-axis) against time (x-axis).]
Figure 3.15 Plot of the deterministic realization of the estimated TSIR model by aggregated data into 7 time units based on SIR model with R0 = 2.0, γ = 1/10 and N=5000000.
[Plot for Figure 3.16: It (y-axis) against time (x-axis).]
Figure 3.16 Plot of the deterministic realization of the estimated TSIR model by aggregated data into 10 time units based on SIR model with R0 = 2.0, γ = 1/10 and N=5000000.
[Plot for Figure 3.17: It (y-axis) against time (x-axis).]
Figure 3.17 Plot of the deterministic realization of the estimated TSIR model by aggregated data into 12 time units based on SIR model with R0 = 2.0, γ = 1/10 and N=5000000.
As shown in Figures 3.12-3.17, the estimated TSIR model for the data aggregated
into 10 time units appears to have the best fit.
The same method was also used to check the impact of aggregated data on the
model fit for other deterministic SIR models. The results showed a pattern similar to
the two examples above: the estimated TSIR model tends to fit better when the data
are aggregated into exactly or close to D time units, where D = 1/γ is the mean
infectious period, indicating that aggregating data into D time units may
improve the TSIR model fit.
3.2 The cumulative alertness infection model (CAIM)
3.2.1 Development of CAIM
For a nonrecurring SIR disease such as measles or SARS, the recovered will never
be infected again under the SIR mechanism, so the number of effective susceptibles
is the total number in the population minus the cumulative sum of all previous cases
reported, i.e.
$$S_t = N - \sum_{\tau=1}^{t} I_\tau \qquad (3.3)$$
Therefore, a general TSIR model can be proposed as
$$I_{t+1} \propto f\Big(N - \sum_{\tau=1}^{t} I_\tau\Big)\, I_t^{\alpha} \qquad (3.4)$$
Assuming that N is roughly constant but unknown, the general TSIR model can be
further simplified as
$$I_{t+1} = g\Big(\sum_{\tau=1}^{t} I_\tau\Big)\, I_t^{\alpha} \qquad (3.5)$$
Here g(·) is the recovered (immunized)-based reproductive rate.
By the three equations above, the function g should be assumed to be a
monotonically decreasing function. Another reason for the monotonicity assumption
is that people's alertness to the disease increases as more and more cases are
reported. For convenience, $g(x) = R\exp(-rx)$ was chosen, where R and
r are both positive constant parameters (Xia [2007]).
Given Iτ, τ ≤ t, a deterministic dynamical infection system is
$$I_{t+1} = R\exp\Big(-r\sum_{\tau=1}^{t} I_\tau\Big)\, I_t^{\alpha} \qquad (3.6)$$
Randomizing the dynamical system by the Poisson distribution, a stochastic
dynamical cumulative alertness infection model (CAIM) is
$$I_{t+1} \sim \mathrm{Poi}\Big\{R\exp\Big(-r\sum_{\tau=1}^{t} I_\tau\Big)\, I_t^{\alpha}\Big\} \qquad (3.7)$$
There are three parameters in the model. The parameter R corresponds to the basic
reproductive number R0. The parameter r is the coefficient of cumulative alertness,
which depends on several factors including people's awareness of the disease and
the efficiency of control measures. Because r enters (3.6) through
$\exp(-r\sum_{\tau=1}^{t} I_\tau)$, the higher the value of r, the faster the reproductive
rate falls as cases accumulate, i.e. the higher the level of alertness and the more
efficient the control measures. The parameter α corresponds to the contact intensity
between the infectious hosts and the susceptibles. Since the alertness is assumed to
depend on the cumulative number of previous cases $\sum_{\tau=1}^{t} I_\tau$, the model
above is called the cumulative alertness infection model (CAIM).
The relationship between the three parameters in the CAIM and R0 was investigated
by a simulation method similar to that in Section 3.1.2. The results are shown in
Figures 3.18-3.20, which confirm that the interpretations of the three parameters are
reasonable.
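To make models (3.6) and (3.7) concrete, the following sketch iterates the deterministic system and draws one stochastic realization. The parameter values (R = 3, r = 0.002, α = 0.95, five initial cases) are hypothetical, and the Poisson sampler (Knuth's multiplication method for small means, a rounded normal approximation for large ones) is an assumption made to keep the example free of external libraries.

```python
import math
import random

def caim_mean(R, r, alpha, cum_cases, current):
    """Conditional mean of I[t+1] in models (3.6)-(3.7)."""
    return R * math.exp(-r * cum_cases) * current ** alpha

def poisson(lam, rng):
    """Poisson draw: Knuth's method for small means, rounded normal otherwise."""
    if lam <= 0:
        return 0
    if lam > 50:
        return max(0, int(round(rng.gauss(lam, math.sqrt(lam)))))
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_caim(R, r, alpha, i1, steps, rng=None):
    """Deterministic path (3.6) if rng is None, else one realization of (3.7)."""
    path, cum = [i1], i1   # cum holds the running sum of I_1, ..., I_t
    for _ in range(steps):
        mean = caim_mean(R, r, alpha, cum, path[-1])
        nxt = mean if rng is None else poisson(mean, rng)
        path.append(nxt)
        cum += nxt
    return path

det = simulate_caim(R=3.0, r=0.002, alpha=0.95, i1=5, steps=30)
sto = simulate_caim(R=3.0, r=0.002, alpha=0.95, i1=5, steps=30, rng=random.Random(7))
print("deterministic peak:", round(max(det), 1))
print("stochastic total cases:", sum(sto))
```

The deterministic path rises while R exp(-r Σ Iτ) It^(α-1) exceeds one and declines afterwards, which is the single-wave pattern the cumulative alertness term is meant to produce.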
[Plot for Figure 3.18: R (y-axis) against R0 (x-axis).]
Figure 3.18 Plot of the relationship between R in CAIM model and R0. The blue* for γ = 1/3; the green* for γ = 1/5; the red* for γ = 1/7 and the turquoise * for γ = 1/10.
[Plot for Figure 3.19: r (y-axis) against R0 (x-axis).]
Figure 3.19 Plot of the relationship between r in CAIM model and R0. The blue* for γ = 1/3; the green* for γ = 1/5; the red* for γ = 1/7 and the turquoise * for γ = 1/10.
[Plot for Figure 3.20: alpha (y-axis) against R0 (x-axis).]
Figure 3.20 Plot of the relationship between α in CAIM model and R0. The blue* for γ = 1/3; the green* for γ = 1/5; the red* for γ = 1/7 and the turquoise * for γ = 1/10.
Both the TSIR model and the CAIM can be directly applied to fit data that are
discrete-time measurements. As the number of susceptibles is usually difficult to
obtain, some approaches have been developed to reconstruct the
unobserved susceptible class in order to make use of the TSIR model (Finkenstädt and
Grenfell [2000]). Compared to the TSIR model, the CAIM uses only the reported
infections Iτ, τ ≤ t, to model and predict the infection dynamics, and requires no
unobtainable data such as the number of susceptibles (St). Hence the CAIM is
data driven and might be preferred by statisticians.
3.2.2 Extension of CAIM
In order to investigate more complicated dynamics and other issues of infections,
two typical extensions of CAIM will be discussed in this section.
3.2.2.1 Change of alertness
Models (3.6) and (3.7) assume that the more cases there are in a community during
an epidemic of an infectious disease, the more alert people in that community become
to the disease. As a consequence, the reproductive rate
$R_t = R\exp(-r\sum_{\tau=1}^{t} I_\tau)$ will decrease as the number of cases
increases. The extent of alertness may not be the same in other
communities or in another period during the epidemic. If the disease breaks out in
another community, the extent of alertness may start at a different level. Good
examples supporting this understanding are the SARS outbreaks in Hong Kong and
Singapore (Chowell et al [2003]). Based on this understanding, a deterministic
multi-community CAIM (Xia [2007]) is proposed as
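$$I_{t+1} = R\exp\Big(-r_1\sum_{\tau=1}^{T_1} I_\tau - r_2\sum_{\tau=T_1+1}^{T_2} I_\tau - \dots - r_p\sum_{\tau=T_p+1}^{t} I_\tau\Big)\, I_t^{\alpha} \qquad (3.8)$$
And the stochastic version of the model is
$$I_{t+1} \sim \mathrm{Poi}\Big\{R\exp\Big(-r_1\sum_{\tau=1}^{T_1} I_\tau - r_2\sum_{\tau=T_1+1}^{T_2} I_\tau - \dots - r_p\sum_{\tau=T_p+1}^{t} I_\tau\Big)\, I_t^{\alpha}\Big\} \qquad (3.9)$$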
3.2.2.2 Long term epidemics
The models discussed in the previous sections contain the assumption that the
population is constant through an epidemic of an infectious disease. Essentially all
conventional mathematical models for endemic or epidemic infections contain this
assumption. Given that epidemiological time scales such as the infectious period
are typically shorter than the time scales characterizing demographic change, such as
the population doubling time, the assumption of a constant population is often a useful
approximation. However, for a nonrecurring infectious disease like measles, it is the
newborns who maintain a sufficient number of susceptibles to sustain long-term
epidemic or endemic disease. Hence the CAIM should be modified to include
births and seasonality as follows:
$$I_{t+1} = R(t)\exp\Big(-r\sum_{\tau=1}^{t} I_\tau + \beta\sum_{\tau=1}^{t} b_\tau\Big)\, I_t^{\alpha} \qquad (3.10)$$
Here bt is the number of births during time period t and R(t) is a function of time t in
a growing population (Xia [2007]).
3.2.3 Data denoising and model estimation
Suppose It' is the reported number of cases in each time period t, t =1, 2, …, T. In
practice, the reported time series It' is usually contaminated by observation errors and
underreporting (Finkenstädt and Grenfell [2000]). Thus, it can be written as It' = It +
εt. The reported time series need to be de-noised before we use them to estimate a
model. Using kernel smoothing method, a de-noised time series is
$$\tilde{I}_t = \frac{\sum_{\tau=1}^{T} K_h(t-\tau)\, I'_\tau}{\sum_{\tau=1}^{T} K_h(t-\tau)} \qquad (3.11)$$
where K(·) is a symmetric smooth density function, h is a bandwidth used to
balance the noise and the variation of the dynamic pattern, and $K_h(x) = K(x/h)/h$.
The use of kernel smoothing to refine time series data in epidemiological modeling
can be found in the literature; see, for example, Becker and Hasofer [1998],
Finkenstädt and Grenfell [2000] and Ellner et al [2002].
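A minimal sketch of the kernel smoother (3.11) is given below. The Gaussian kernel, the bandwidth h = 1.5 and the reported series are assumptions chosen for illustration; the text only requires K to be a symmetric smooth density.

```python
import math

def kernel_smooth(reported, h):
    """Kernel-weighted average (3.11) with a Gaussian kernel of bandwidth h."""
    def k_h(x):
        # K_h(x) = K(x / h) / h with K the standard normal density
        return math.exp(-0.5 * (x / h) ** 2) / (h * math.sqrt(2 * math.pi))

    times = range(1, len(reported) + 1)
    smoothed = []
    for t in times:
        weights = [k_h(t - tau) for tau in times]
        total = sum(w * y for w, y in zip(weights, reported))
        smoothed.append(total / sum(weights))
    return smoothed

# Hypothetical noisy case counts for 16 time periods.
reported = [3, 9, 4, 14, 10, 22, 15, 30, 21, 19, 25, 12, 14, 6, 8, 2]
smooth = kernel_smooth(reported, h=1.5)
print([round(v, 1) for v in smooth])
```

A larger bandwidth h removes more noise but also flattens the epidemic peak, which is the trade-off that the choice of h has to balance.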
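The series $\{\tilde{I}_t\}$ obtained from (3.11) is called the denoised time series. Instead
of the original time series, the denoised time series will be used to estimate the
model. Taking logarithms on both sides of the model
$$I_{t+1} \sim \mathrm{Poi}\Big\{R\exp\Big(-r\sum_{\tau=1}^{t} I_\tau\Big)\, I_t^{\alpha}\Big\}$$
gives
$$\log(I_{t+1}) = \log(R) - r\sum_{\tau=1}^{t} I_\tau + \alpha\log(I_t) + \xi_t$$
where
$$\xi_t = \log\Big[\mathrm{Poi}\Big\{R\exp\Big(-r\sum_{\tau=1}^{t} I_\tau\Big)\, I_t^{\alpha}\Big\}\Big] - \log\Big[R\exp\Big(-r\sum_{\tau=1}^{t} I_\tau\Big)\, I_t^{\alpha}\Big].$$
The iterative weighted least squares method is used to estimate the model.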
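One possible form of the iterative weighted least squares procedure is sketched below. The thesis does not state which weights are used, so the choice here is an assumption: each observation is weighted by its current fitted mean, because the variance of the logarithm of a Poisson count is roughly the reciprocal of its mean. The function names and the noise-free test series are hypothetical.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    k = len(m)
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, k):
            factor = m[row][col] / m[col][col]
            for c in range(col, k + 1):
                m[row][c] -= factor * m[col][c]
    x = [0.0] * k
    for row in reversed(range(k)):
        tail = sum(m[row][c] * x[c] for c in range(row + 1, k))
        x[row] = (m[row][k] - tail) / m[row][row]
    return x

def wls(rows, y, w):
    """Weighted least squares coefficients via the weighted normal equations."""
    k = len(rows[0])
    xtwx = [[sum(wi * row[p] * row[q] for row, wi in zip(rows, w)) for q in range(k)]
            for p in range(k)]
    xtwy = [sum(wi * row[p] * yi for row, yi, wi in zip(rows, y, w)) for p in range(k)]
    return solve(xtwx, xtwy)

def fit_caim(cases, n_iter=10):
    """Estimate (R, r, alpha) from a positive series I_1, ..., I_T."""
    rows, y, cum = [], [], 0.0
    for t in range(len(cases) - 1):
        cum += cases[t]                       # running sum up to and including I_t
        rows.append([1.0, -cum, math.log(cases[t])])
        y.append(math.log(cases[t + 1]))
    w = [1.0] * len(y)                        # first pass is ordinary least squares
    for _ in range(n_iter):
        coef = wls(rows, y, w)
        # assumed weights: fitted Poisson mean, since Var(log Y) is roughly 1/mean
        w = [math.exp(sum(c * x for c, x in zip(coef, row))) for row in rows]
    log_R, r, alpha = coef
    return math.exp(log_R), r, alpha

# Hypothetical noise-free series generated from (3.6) with known parameters.
true_R, true_r, true_alpha = 3.0, 0.002, 0.95
cases, cum = [5.0], 5.0
for _ in range(15):
    nxt = true_R * math.exp(-true_r * cum) * cases[-1] ** true_alpha
    cases.append(nxt)
    cum += nxt

R_hat, r_hat, alpha_hat = fit_caim(cases)
print(f"R = {R_hat:.3f}, r = {r_hat:.5f}, alpha = {alpha_hat:.3f}")
```

On a noise-free series the procedure should return the generating parameters up to rounding error; with real data the series must be strictly positive (for example after denoising) for the logarithms to be defined.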
40
CHAPTER 4
Application of CAIM to Real Epidemiological Data
Being able to make predictions about infectious disease dynamics is very helpful
for public health. If we know there will be an outbreak, we can prepare for it.
Alternatively, if we have a reliable model, we can study how to prevent an outbreak
and save lives by changing the factors we can control through public health means.
These factors include education, immunization, quarantine regulations and health
treatment strategies.
In order to make reasonable predictions and develop methods of control, we must
be confident that our model captures the essential features of the course of an
epidemic. Thus, it becomes important to validate models, whether deterministic or
stochastic, by checking whether they fit the observed data. If the model is validated, it
can then be used to predict the course of the epidemic in time.
In this chapter, the CAIM will be checked by fitting it to the observed data
of some nonrecurring infectious diseases. In contrast to statistical multi-step time
series prediction, which may fail to characterize the system, we will check the model
by asking what would happen if the epidemic broke out again and whether the fitted
dynamical system gives a similar pattern.
Two kinds of realization of the estimated dynamical system will be carried out:
(i) deterministic realization, assuming there is no random effect;
(ii) stochastic realization, assuming the actual number of cases follows a
Poisson distribution. As the stochastic realizations differ from one
another, a number of realizations are needed to show the statistical dynamical
properties.
Logarithms will be taken on both sides of the model and the iterative weighted least
squares method (IWLSE) will be used to estimate the parameters in the model.
4.1 Foot and Mouth Disease (FMD)
Foot and Mouth Disease (FMD) is one of the most contagious animal diseases and
causes important economic losses. It is caused by a virus of the family Picornaviridae,
genus Aphthovirus, and can spread very rapidly among livestock such as cattle and
sheep by direct or indirect contact (droplets). The mortality rate is low in adult
animals but often high in young animals due to myocarditis. The incubation period is
2-14 days, with a mean of 7 days (Keeling et al [2001]).
The following data come from the 2001 FMD epidemic in the United Kingdom.
The analysis centers on the daily reported FMD cases for the aggregate of all UK
farms from February 22 to August 11, 2001. These are official notifications,
collected by the Department for Environment, Food and Rural Affairs.
Figure 4.1 Plot of the observed daily cases of the 2001 FMD epidemic in the UK.
As shown in Figure 4.1, the daily case reports were highly variable but still
followed a classical epidemic pattern. Because the mean incubation period of
FMD is 7 days, an animal infected in week t can be related to a contact, while
it was still susceptible, with an individual infected in week t-1. Therefore, the
original data are aggregated into weekly time steps to form the denoised data.
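The aggregation step can be illustrated as follows. This is a small sketch with a hypothetical helper name, `aggregate_counts`; dropping an incomplete final block is an assumption made here, since the text does not say how a partial last week is handled.

```python
def aggregate_counts(daily, step):
    """Sum consecutive blocks of `step` daily counts.

    An incomplete final block is dropped (an assumption of this sketch).
    """
    full = len(daily) - len(daily) % step
    return [sum(daily[i:i + step]) for i in range(0, full, step)]


# Fourteen made-up daily counts collapse into two weekly totals.
weekly = aggregate_counts(list(range(1, 15)), 7)
```

The same helper with `step=3` gives the 3-day aggregation used for the SARS data.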
Next, the basic CAIM below was applied to the weekly data. The deterministic
and stochastic realizations of the estimated model and the one-step-ahead
prediction based on the estimated model are shown in Figure 4.2 and Figure 4.3
respectively.
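$$I_{t+1} = R \exp\Big(-r \sum_{\tau=1}^{t} I_\tau\Big) I_t^{\alpha} \tag{4.1}$$
$$I_{t+1} \sim \mathrm{Poi}\Big\{ R \exp\Big(-r \sum_{\tau=1}^{t} I_\tau\Big) I_t^{\alpha} \Big\} \tag{4.2}$$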
Figure 4.2 Plot of the deterministic and stochastic realization of the estimated model. ‘*’ is the observed number of cases, ‘—’ is the deterministic realization of the estimated model and the shade is the region of 1000 stochastic realizations of the estimated model.
Figure 4.3 Plot of the deterministic and stochastic realization of one-step ahead prediction based on the estimated model. ‘*’ is the observed number of cases, ‘—’ is the deterministic realization of one-step ahead prediction and the shade is the region of 1000 stochastic realizations of one-step ahead prediction.
As shown in Figure 4.2, Models (4.1) and (4.2) fit the 2001 UK FMD data
well. Figure 4.3 shows that, given the first week of data, the CAIM can produce
similar dynamics from the estimated model if the FMD epidemic breaks out again,
indicating that the basic CAIM is useful for prediction in a typical FMD
epidemic.
4.2 Severe Acute Respiratory Syndrome (SARS)
Severe Acute Respiratory Syndrome (SARS) is a new respiratory disease which
was first identified in November 2002 in Guangdong Province of China. On February
26, 2003, the first case of SARS in Hong Kong was reported officially. Thereafter,
the disease spread rapidly to other countries and regions such as Canada, Singapore,
Taiwan and Vietnam, resulting in local outbreaks of SARS. SARS is caused by a
coronavirus and can be transmitted by droplets, by airborne spread or by
person-to-person contact. Most infected individuals recover, typically after 7-10
days; mortality is 4% or higher worldwide. The incubation period is 2-7 days, with
3-5 days being most common.
In the following sections, the CAIM is fitted to the SARS data observed in
Hong Kong, Singapore and Ontario, Canada. The original data are aggregated into
3-day time steps to form the denoised data.
4.2.1 SARS in Hong Kong
Figure 4.4 Plot of the observed daily cases of the 2003 SARS epidemic in Hong Kong.
The basic CAIM below was first applied to the denoised data. The
deterministic and stochastic realizations of the estimated model and the
one-step-ahead prediction based on the estimated model are shown in Figure 4.5
and Figure 4.6 respectively.
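$$I_{t+1} = R \exp\Big(-r \sum_{\tau=1}^{t} I_\tau\Big) I_t^{\alpha} \tag{4.1}$$
$$I_{t+1} \sim \mathrm{Poi}\Big\{ R \exp\Big(-r \sum_{\tau=1}^{t} I_\tau\Big) I_t^{\alpha} \Big\} \tag{4.2}$$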
Figure 4.5 Plot of the deterministic and stochastic realization of the estimated model for the data of HK. ‘*’ is the observed number of cases, ‘—’ is the deterministic realization of the estimated model and the shade is the region of 1000 stochastic realizations of the estimated model.
Figure 4.6 Plot of the deterministic and stochastic realization of one-step ahead prediction based on the estimated model for the data of HK. ‘*’ is the observed number of cases, ‘—’ is the deterministic realization of one-step ahead prediction and the shade is the region of 1000 stochastic realizations of one-step ahead prediction.
As shown in Figure 4.5, Models (4.1) and (4.2) fit the 2003 Hong Kong SARS
data reasonably well, but Figure 4.6 shows that, given the first 3-day data,
the basic CAIM cannot produce satisfactory dynamics from the estimated model if
the SARS epidemic breaks out again, indicating that the basic CAIM is not suitable
for predicting a SARS epidemic such as the one in Hong Kong.
It can be seen from Figure 4.5 that there were three possible outbreaks of SARS in
Hong Kong. The first occurred from Day 1 to Day 13, the second from Day
14 to Day 22 and the third from Day 23 until the end of the epidemic. Hence
the CAIM with three different levels of alertness is considered:
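$$I_{t+1} = R \exp\Big(-r_1 \sum_{\tau=1}^{T_1} I_\tau - r_2 \sum_{\tau=T_1+1}^{T_2} I_\tau - r_3 \sum_{\tau=T_2+1}^{t} I_\tau\Big) I_t^{\alpha} \tag{4.3}$$
$$I_{t+1} \sim \mathrm{Poi}\Big\{ R \exp\Big(-r_1 \sum_{\tau=1}^{T_1} I_\tau - r_2 \sum_{\tau=T_1+1}^{T_2} I_\tau - r_3 \sum_{\tau=T_2+1}^{t} I_\tau\Big) I_t^{\alpha} \Big\} \tag{4.4}$$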
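To fit a model with several alertness levels, the cumulative case count has to be split into one regressor per period (the appendix code refers to such columns as `cul1`, `cul2`). The sketch below shows one possible way to build them; the function name `piecewise_cumulative` is hypothetical, the breakpoints are given in time steps, and letting a period's sum stay at zero until that period begins is an assumption about how the model is applied before the breakpoints.

```python
def piecewise_cumulative(cases, breakpoints):
    """Split the running total of cases into one column per alertness period.

    Row t (1-based) holds, for each period, the number of cases reported in
    that period up to and including time t.
    """
    edges = list(breakpoints) + [len(cases)]
    totals = [0.0] * len(edges)
    rows = []
    for t, count in enumerate(cases, start=1):
        # First period whose upper edge has not yet been passed.
        period = next(k for k, edge in enumerate(edges) if t <= edge)
        totals[period] += count
        rows.append(list(totals))
    return rows


# Made-up counts with breakpoints T1 = 2 and T2 = 4.
columns = piecewise_cumulative([1, 2, 3, 4, 5, 6], (2, 4))
```

Each row of `columns` can then be used alongside log I_t as the regressors for r1, r2 and r3 in the log-linear fit.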
Models (4.3) and (4.4) were applied to the denoised data. The deterministic
and stochastic realizations of the estimated model and the one-step-ahead prediction
based on the estimated model are shown in Figure 4.7 and Figure 4.8 respectively.
Figure 4.7 Plot of the deterministic and stochastic realization of the estimated model for the data of HK. ‘*’ is the observed number of cases, ‘—’ is the deterministic realization of the estimated model and the shade is the region of 1000 stochastic realizations of the estimated model.
Figure 4.8 Plot of the deterministic and stochastic realization of one-step ahead prediction based on the estimated model for the data of HK. ‘*’ is the observed number of cases, ‘—’ is the deterministic realization of one-step ahead prediction and the shade is the region of 1000 stochastic realizations of one-step ahead prediction.
As shown in Figure 4.7, Models (4.3) and (4.4) fit the 2003 Hong Kong SARS
data well, and Figure 4.8 shows that, given the first 3-day data, the adjusted
CAIM with three levels of alertness can produce satisfactory dynamics from
the estimated model if the SARS epidemic breaks out again, indicating that the CAIM
with different levels of alertness is useful for predicting a SARS epidemic such as
the one in Hong Kong.
As mentioned before, the parameter r is the coefficient of cumulative alertness,
which depends on people’s awareness of the disease and the efficiency of control
measures. The lower the value of r, the higher the level of alertness and the more
efficient the control measures. For the SARS epidemic in Hong Kong, the
estimated values of r1, r2 and r3 were 0.0031, 0.0013 and 0.0014 respectively. The
epidemic occurred mainly in three different regions of Hong Kong: SARS
first broke out in one community and then spread to two others.
During the first outbreak, people’s alertness to the disease was not very high and
the control measures were not carried out very efficiently. However, as the
cumulative number of reported cases increased, people’s alertness rose
and the control measures became correspondingly more efficient, which
resulted in lower values of r for the second and third outbreaks. Hence the change in
the values of r reflected the change in people’s alertness to SARS and
in the efficiency of control measures.
4.2.2 SARS in Singapore and Ontario, Canada
Figure 4.9 Plot of the observed daily cases of the 2003 SARS epidemic in Singapore.
Figure 4.10 Plot of the observed daily cases of the 2003 SARS epidemic in Ontario, Canada.
It can be seen from Figures 4.9 and 4.10 that there were two outbreaks of SARS in
both Singapore and Ontario, Canada. In Singapore, the first outbreak
lasted from Day 1 to Day 19 and the second from Day 20 until the end of the
epidemic; in Ontario, the first lasted from Day 1 to Day 71 and the
second from Day 72 until the end of the epidemic. Hence the CAIM with two different
levels of alertness is considered:
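$$I_{t+1} = R \exp\Big(-r_1 \sum_{\tau=1}^{T_1} I_\tau - r_2 \sum_{\tau=T_1+1}^{t} I_\tau\Big) I_t^{\alpha} \tag{4.5}$$
$$I_{t+1} \sim \mathrm{Poi}\Big\{ R \exp\Big(-r_1 \sum_{\tau=1}^{T_1} I_\tau - r_2 \sum_{\tau=T_1+1}^{t} I_\tau\Big) I_t^{\alpha} \Big\} \tag{4.6}$$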
Models (4.5) and (4.6) were applied to the denoised data for Singapore and
Ontario. The results for Singapore are shown in Figure 4.11 and Figure
4.12, and those for Ontario in Figure 4.13 and Figure 4.14.
Figure 4.11 Plot of the deterministic and stochastic realization of the estimated model for the data of Singapore. ‘*’ is the observed number of cases, ‘—’ is the deterministic realization of the estimated model and the shade is the region of 1000 stochastic realizations of the estimated model.
Figure 4.12 Plot of the deterministic and stochastic realization of one-step ahead prediction based on the estimated model for the data of Singapore. ‘*’ is the observed number of cases, ‘—’ is the deterministic realization of one-step ahead prediction and the shade is the region of 1000 stochastic realizations of one-step ahead prediction.
Figure 4.13 Plot of the deterministic and stochastic realization of the estimated model for the data of Ontario. ‘*’ is the observed number of cases, ‘—’ is the deterministic realization of the estimated model and the shade is the region of 1000 stochastic realizations of the estimated model.
Figure 4.14 Plot of the deterministic and stochastic realization of one-step ahead prediction based on the estimated model for the data of Ontario. ‘*’ is the observed number of cases, ‘—’ is the deterministic realization of one-step ahead prediction and the shade is the region of 1000 stochastic realizations of one-step ahead prediction.
As shown in Figures 4.11 and 4.13, Models (4.5) and (4.6) fit the 2003 SARS data
for Singapore and Ontario reasonably well. Figures 4.12 and 4.14 show that, given
the first 3-day data, the CAIM with two levels of alertness can produce
satisfactory dynamics from the estimated model if the SARS epidemic breaks out
again, indicating that the CAIM with different levels of alertness is useful for
predicting SARS epidemics such as those in Singapore and Ontario, Canada.
The estimated values of r1 and r2 were 0.0099 and 0.012 for the SARS epidemic in
Singapore and 0.018 and 0.034 for the epidemic in Ontario.
Unlike in Hong Kong, the value of r for the second
outbreak was higher than that for the first in both
Singapore and Ontario, which can be explained by the difference in
epidemic pattern. Figures 4.9 and 4.10 show two clearly separated
outbreaks of SARS in Singapore and in Ontario. More importantly, at the end of the
first outbreak very few new cases were being reported (1 new case in Singapore
and none in Ontario), unlike the situation in Hong Kong
shown in Figure 4.4. So few new cases might have given people the false
impression that the epidemic had died out, so that they were less alert to the disease
when the second outbreak occurred, which resulted in a higher value of r for the second
outbreak.
4.3 Measles
Measles is caused by a highly infective single-stranded RNA virus belonging to the
morbillivirus group. It may infect other primates, but is largely specialized to its
human host. Measles can be transmitted from one host to the next by direct contact or
in vapor droplets. The disease causes considerable mortality in malnourished young
children in developing countries, while in developed countries case fatality is very low
but epidemics are still a significant public health problem (Anderson and May [1991],
Bjørnstad et al. [2002]).
The measles cases analysed here were registered on a monthly basis by the Centre
for Disease Control (CDC) of China from January 1994 to December 2005.
Figure 4.15 Plot of the observed monthly measles cases from January 1994 to December 2005 in China.
As the data show a long-term epidemic of measles, the CAIM for long-term
epidemics is fitted to the data.
$$I_{t+1} = R(t) \exp\Big(-r \sum_{\tau=1}^{t} I_\tau + \beta \sum_{\tau=1}^{t} b_\tau\Big) I_t^{\alpha} \tag{4.7}$$
Here b_t is the monthly number of births and R(t) = exp(-a1D1 - a2D2 - … - a12D12), where
a1, a2, …, a12 are the parameters for each month and D1, D2, …, D12 are dummy
variables whose values are either 0 or 1. The results are shown in Figures 4.16 and
4.17.
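Taking logarithms of Model (4.7) with the seasonal factor written out gives a linear regression form, which may help when reading the estimation step. This is a sketch derived from (4.7) and the definition of R(t); the notation D_m(t) for the value of the m-th dummy variable at time t is an assumption introduced here.

```latex
\log I_{t+1} = -\sum_{m=1}^{12} a_m D_m(t)
               - r \sum_{\tau=1}^{t} I_\tau
               + \beta \sum_{\tau=1}^{t} b_\tau
               + \alpha \log I_t
```

If exactly one dummy equals 1 in each month, the first sum reduces to a single month-specific intercept, so the seasonal effect enters the regression as twelve intercepts.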
Figure 4.16 Plot of the deterministic realization of the estimated model for the data of Measles in China. The red circle is the observed cases, the blue line is the fitted model and the green shade is the region of 1000 stochastic realizations of the estimated model.
Figure 4.17 Plot of the deterministic realization of one-step ahead prediction based on the estimated model for the data of measles in China. The red circle is the observed number of cases, the blue line is the deterministic realization of one-step ahead prediction and the green shade is the region of 1000 stochastic realizations of one-step ahead prediction.
It can be seen from Figure 4.16 that Model (4.7) and its Poisson counterpart (4.8)
fit the measles data for China well. Figure 4.17 shows that, given the first month of
data, the CAIM for long-term epidemics can produce satisfactory dynamics from the
estimated model if measles persists for a long time, indicating that the CAIM for
long-term epidemics is useful for predicting a long-term measles epidemic such as
the one in China.
4.4 Concluding remarks
In this thesis, we first proposed the time series-susceptible-infected-recovered
(TSIR) model based on the compartmental SIR mechanism. The validity of the TSIR
model was checked by applying it to data generated by the classic deterministic
model. The results showed that the TSIR model performed well.
Next, considering that the number of susceptibles is not obtainable in most cases,
we developed a more practical statistical model based on the TSIR model, the
cumulative alertness infection model (CAIM), which requires only the reported number
of cases. The parameters in the CAIM were interpreted and some extensions of
the CAIM were discussed.
Finally, the CAIM was applied to real data for several infectious diseases:
foot-and-mouth disease in the UK in 2001, SARS in Hong Kong, Singapore and Ontario,
Canada in 2003 and measles in China from 1994 to 2005. Our findings showed that
the CAIM could mimic the dynamics of these diseases reasonably well. The results
indicate that the CAIM may be helpful in making predictions about infectious disease
dynamics.
BIBLIOGRAPHY
Andersson, H., Britton, T. (2000) Stochastic Epidemic Models and Their Statistical
Analysis. Springer-Verlag, New York.
Anderson, R. M., May, R. M. (1991) Infectious Diseases of Humans: Dynamics and
Control. Oxford University Press, New York.
Bailey, N. T. J. (1975) The Mathematical Theory of Infectious Diseases and its
Applications (2nd edn). Charles Griffin & Company Ltd. London.
Bartlett, M. S. (1949) Some evolutionary stochastic processes. Journal of Statistical
Society, 11: 211-229.
Bartlett, M. S. (1955) Stochastic Processes. Cambridge University Press.
Becker, N. G., Britton, T. (1999) Statistical studies of infectious disease incidence.
Journal of Statistical Society, 61: 287-307.
Becker, N. G., Hasofer, A. M. (1998) Estimating the transmission rate for a highly
infectious disease. Biometrics, 54: 730-738.
Benenson, A. S. (1975) Control of communicable diseases in man (12th edn).
American Public Health Association, Washington, D.C.
Bjørnstad, O. N., Finkenstädt, B. F., Grenfell, B. T. (2002) Dynamics of measles
epidemics: estimating scaling of transmission rates using a time series SIR model.
Ecological Monographs, 72(2): 169-184.
Castillo-Chavez, C., Hethcote, H. W., Anderson, V., Levin, S. A., Liu, W. (1989)
Epidemiological models with age structure, proportionate mixing and cross-
immunity. Journal of Mathematical Biology, 27: 233-258.
Castillo-Chavez, C., Blower, S., Driessche, P. V. D., Kirschner, D., Yakubu, A.
(2002) Mathematical Approaches for Emerging and Reemerging Infectious Diseases:
An Introduction. Springer-Verlag, New York.
Chowell, G., Fenimore, P. W., Castillo-Garsow, M. A., Castillo-Chavez, C. (2002)
SARS outbreaks in Ontario, Hong Kong and Singapore: the role of diagnosis and
isolation as a control mechanism. Journal of Theoretical Biology, 24: 1-8.
Chowell, G., Hengartner, N. W., Castillo-Chavez, C., Fenimore, P. W., Hyman,
J. M. (2004) The basic reproductive number of Ebola and the effects of public health
measures: the cases of Congo and Uganda. Journal of Theoretical Biology, 229:
119-126.
Christie, A. B. (1974) Infectious diseases: epidemiology and clinical practice.
Churchill Livingstone, London.
Daley, D. J., Gani, J. (1999) Epidemic Modelling: An Introduction. Cambridge
University Press.
Diekmann, O., Heesterbeek, J. A. P. (2000) Mathematical Epidemiology of Infectious
Diseases: Model Building, Analysis and Interpretation. John Wiley & Sons,
Chichester.
Dietz, K. (1979) Epidemiologic interference of virus populations. Journal of
Mathematical Biology, 8: 291-300.
Dietz, K., Schenzle, D. (1985) Proportionate mixing models for age-dependent
infection transmission. Journal of Mathematical Biology, 22: 117-120.
Ellner, S. P., Seifu, Y., Smith, R. H. (2002) Fitting population dynamic models to
time-series data by gradient matching. Ecology, 83: 2256-2270.
Farrington, C. P., Kanaan, M. N. (2001) Estimation of the basic reproduction number
for infectious diseases from age-stratified serological survey data. Journal of Applied
Statistics, 50: 251-292.
Feng, Z., Thieme, R. H. (1995) Multi-annual outbreaks of childhood disease revisited:
The impact of isolation. Journal of Mathematical Bioscience, 128: 93-130.
Fenner, F., White, D. O. (1970) Medical virology. Academic, New York.
Finkenstädt, B. F., Grenfell, B. T. (2000) Time series modeling of childhood diseases:
a dynamical systems approach. Journal of Applied Statistics, 49: 187-205.
Grenfell, B., Dobson, A. (1995) Ecology of infectious diseases in natural populations.
Cambridge University Press.
Hamer, W. H. (1906) Epidemic disease in England. The Lancet, i: 733-739.
Hethcote, H. W. (1976) Qualitative analysis for communicable disease models.
Journal of Mathematical Bioscience, 28: 335-356.
Hethcote, H. W., Van Ark, J. W. (1987) Epidemiological models for heterogeneous
populations: Proportionate mixing, parameter estimation and immunization programs.
Journal of Mathematical Bioscience, 84: 85-118.
Hethcote, H.W. (2000) The mathematics of infectious diseases. Siam Review, 42:
599-653.
Isham, V., Medley, G. (1996) Models for infectious human diseases: Their structure
and relation to data. Cambridge University Press.
Keeling, M. J., Woolhouse, M. E. J., Shaw, D. J., Matthews, L., Chase-Topping, M.,
Haydon, D. T., Cornell, S. J., Kappey, J., Wilesmith, J., Grenfell, B.T. (2001)
Dynamics of the 2001 UK foot and mouth epidemic: stochastic dispersal in a
heterogeneous landscape. Science, 294: 813-817.
Kermack, W. O., McKendrick, A. G. (1927) A contribution to the mathematical
theory of epidemics. Proceedings of the Royal Society, A115: 700-721.
Liu, W., Hethcote, H.W., Levin, S. (1987) Dynamic behavior of epidemiological
models with nonlinear incidence rates. Journal of Mathematical Bioscience, 25: 359-
380.
McKendrick, A. G. (1926) Applications of mathematics to medical problems.
Proceedings of Edinburgh Mathematical Society, 14: 98-130.
Mollison, D. (1995) Epidemic Models: Their structure and relation to data.
Cambridge University Press.
Naug, D., Smith, B. (2007) Experimentally induced change in infectious period affects
transmission dynamics in social group. Proceedings of the Royal Society, 274: 61-65.
Ross, R. (1908) Report on the prevention of malaria in Mauritius. London.
Ross, R. (1911) The prevention of malaria (2nd edn). Murray, London.
Ross, R. (1916) An application of the theory of probabilities to the study of a priori
pathometry, I. Proceedings of the Royal Society, A92: 204-230.
Ross, R. (1917) An application of the theory of probabilities to the study of a priori
pathometry, II. Proceedings of the Royal Society, A93: 212-225.
Wallinga, J., Teunis, P. (2004) Different epidemic curves for severe acute respiratory
syndrome reveal similar impacts of control measures. American Journal of
Epidemiology, 160: 509-516.
Xia, Y. (2007) A statistical model for the transmission of infectious diseases.
Manuscript.
APPENDIX
Programme codes
1 Programme to produce the realization of deterministic SIR model with different
parameters in (2.3)
% Define the function of the deterministic model
% (named SIR to match the case-sensitive SIR(...) calls below)
function A = SIR(beta, v, N)
options = odeset ('RelTol',1e-5,'AbsTol',1e-5);
[T,Y] = ode45(@sirsys1,[0:1:100],[N 1 0],options,beta,v);
A=[T,Y];
function dy = sirsys1(t,y,beta,v)
dy = zeros (3,1);
dy(1)=-beta*y(1)*y(2);
dy(2)=beta*y(1)*y(2)-v*y(2);
dy(3)=v*y(2);
% Produce the plots
A=SIR(0.005, 0.1, 100);
plot(A(:,1),A(:,3))
xlabel('time')
ylabel('SIR: number of cases')
A=SIR(0.005, 0.1, 100);
plot(A(:,1),A(:,2))
xlabel('time')
ylabel('SIR: number of susceptibles')
A=SIR(0.015, 0.1, 100);
plot(A(:,1),A(:,3))
xlabel('time')
ylabel('SIR: number of cases')
A=SIR(0.015, 0.1, 100);
plot(A(:,1),A(:,2))
xlabel('time')
ylabel('SIR: number of susceptibles')
2 Programme to produce the realization of deterministic SIR model with
different R0 in (2.5.1)
% Define the function of the deterministic model
function A = SIR(beta, v, N)
options = odeset ('RelTol',1e-5,'AbsTol',1e-5);
[T,Y] = ode45(@sirsys1,[0:1:200],[N 1 0],options,beta,v);
A=[T,Y];
function dy = sirsys1(t,y,beta,v)
dy = zeros (3,1);
dy(1)=-beta*y(1)*y(2);
dy(2)=beta*y(1)*y(2)-v*y(2);
dy(3)=v*y(2);
% Produce the plots
R0 = 0.8;
v = 0.1;
N =100;
beta = R0 * v /N;
A=SIR(beta, v, N);
plot(A(:,1),A(:,3))
xlabel('time')
ylabel('SIR: number of cases')
R0 = 1.5;
v = 0.1;
N =100;
beta = R0 * v /N;
A=SIR(beta, v, N);
plot(A(:,1),A(:,3))
xlabel('time')
ylabel('SIR: number of cases')
R0 = 5;
v = 0.1;
N =100;
beta = R0 * v /N;
A=SIR(beta, v, N);
plot(A(:,1),A(:,3))
xlabel('time')
ylabel('SIR: number of cases')
3 Programme to check the validation of TSIR model in (3.1.1)
% Define the function of the deterministic model
function A = SIR(beta, v, N)
options = odeset ('RelTol',1e-5,'AbsTol',1e-5);
[T,Y] = ode45(@sirsys1,[0:1:200],[N 1 0],options,beta,v);
A=[T,Y];
function dy = sirsys1(t,y,beta,v)
dy = zeros (3,1);
dy(1)=-beta*y(1)*y(2);
dy(2)=beta*y(1)*y(2)-v*y(2);
dy(3)=v*y(2);
% Estimate the coefficients of TSIR model
R0=2.0; % or R0 = 7
N=5000000;
gamma=1/5; % or gamma = 1/10
l1=size(gamma);
l=l1(2);
beta=gamma*R0/N;
k1=size(beta);
k=k1(2);
c=zeros(l,k);
r=zeros(l,k);
alpha=zeros(l,k);
tol = 1e-35;
maxit = 25;
[beta_final,gamma_final]=meshgrid(beta,gamma);
for i = 1:l
for j = 1:k
A=SIR(beta_final(i,j),gamma_final(i,j),N);
a=size(A);
n=a(1);
check=[A(1:(n-1),2)-A(2:n,2) (1:1:(n-1))'];
check1=check(check(:,1)<0,:);
if(sum(check1(:,1))==0)
new.n=n;
else
new.n=min(check1(:,2));
end
B=A(1:new.n,2);
n=length(B);
split=1;
n1=floor(n/split);
output=sum(B(1:split));
for i1 = 2:n1
out=sum(B((split*(i1-1)+1):(split*i1)));
output=[output;out];
end
if((n/split)~=n1)
out=mean(B((split*n1+1):n))*split;
output=[output;out];
end
n=length(output);
b=log(output(2:(n-1))-output(3:n));
tmp_A=ones(n-2,1);
tmp_A1=log(output(1:(n-2)));
tmp_A2=log(output(1:(n-2))-output(2:(n-1)));
A1=[tmp_A tmp_A1 tmp_A2];
[x,flag,relres,iter,resvec,lsvec]=lsqr(A1,b,tol,maxit);
c(i,j)=exp(x(1));
r(i,j)=x(2);
alpha(i,j)=x(3);
end
end
% Produce the plot to compare TSIR model and SIR model
simu_n1=size(b);
simu_n=simu_n1(1);
simu_A1=tmp_A1;
simu_A2=tmp_A2;
for i = 1:simu_n
simu_b(i)=log(c)+simu_A1(i)*r+simu_A2(i)*alpha;
if(i~=simu_n)
simu_A2(i+1)=log(exp(simu_b(i)));
simu_A1(i+1)=log(exp(simu_A1(i))-exp(simu_A2(i)));
end
end
LOG_INFECTIVENESS_T1=simu_b;
F=(LOG_INFECTIVENESS_T1)';
INFECTIVENESS=exp(F);
INFECTIVENESS=INFECTIVENESS';
T1=length(INFECTIVENESS);
T=(1:1:(T1-1));
T=T';
INFECTIVENESS2=exp(tmp_A2);
t1=length(INFECTIVENESS2);
t2=(1:1:(t1-1));
t2=t2';
Q=length(INFECTIVENESS2)-1;
plot(T(1:Q),INFECTIVENESS(1:Q)',t2,INFECTIVENESS2(2:t1),'*')
xlabel('time')
ylabel('It')
4 Programme to investigate the relationship between the parameters in TSIR
model and the deterministic SIR model in (3.1.2)
% Define the function of the deterministic model
function A = SIR(beta, v, N)
options = odeset ('RelTol',1e-5,'AbsTol',1e-5);
[T,Y] = ode45(@sirsys1,[0:1:200],[N 1 0],options,beta,v);
A=[T,Y];
function dy = sirsys1(t,y,beta,v)
dy = zeros (3,1);
dy(1)=-beta*y(1)*y(2);
dy(2)=beta*y(1)*y(2)-v*y(2);
dy(3)=v*y(2);
% Estimate the coefficients of TSIR model
R0=2:0.1:20;
N=5000000;
gamma=1/3; % or gamma =1/5, 1/7, 1/10
l1=size(gamma);
l=l1(2);
beta=gamma*R0/N;
k1=size(beta);
k=k1(2);
c3=zeros(l,k); % or c5, c7, c10 corresponding to gamma =1/5, 1/7, 1/10
r3=zeros(l,k); % or r5, r7, r10 corresponding to gamma =1/5, 1/7, 1/10
alpha3=zeros(l,k); % alpha5, alpha7, alpha10 corresponding to gamma =1/5,1/7, 1/10
tol = 1e-35;
maxit = 25;
[beta_final,gamma_final]=meshgrid(beta,gamma);
for i = 1:l
for j = 1:k
A=SIR(beta_final(i,j),gamma_final(i,j),N);
a=size(A);
n=a(1);
check=[A(1:(n-1),2)-A(2:n,2) (1:1:(n-1))'];
check1=check(check(:,1)<0,:);
if(sum(check1(:,1))==0)
new.n=n;
else
new.n=min(check1(:,2));
end
B=A(1:new.n,2);
n=length(B);
split=1;
n1=floor(n/split);
output=sum(B(1:split));
for i1 = 2:n1
out=sum(B((split*(i1-1)+1):(split*i1)));
output=[output;out];
end
if((n/split)~=n1)
out=mean(B((split*n1+1):n))*split;
output=[output;out];
end
n=length(output);
b=log(output(2:(n-1))-output(3:n));
tmp_A=ones(n-2,1);
tmp_A1=log(output(1:(n-2)));
tmp_A2=log(output(1:(n-2))-output(2:(n-1)));
A1=[tmp_A tmp_A1 tmp_A2];
[x,flag,relres,iter,resvec,lsvec]=lsqr(A1,b,tol,maxit);
c3(i,j)=exp(x(1)); % c5, c7, c10 corresponding to gamma =1/5, 1/7, 1/10
r3(i,j)=x(2); % r5, r7, r10 corresponding to gamma =1/5, 1/7, 1/10
alpha3(i,j)=x(3); % alpha5, alpha7, alpha10 corresponding to gamma =1/5, 1/7, 1/10
end
end
% Produce the plots to show the relationship between R0 and the three parameters in TSIR models
plot(R0, c3, '*', R0, c5, '*', R0, c7, '*', R0, c10, '*')
xlabel('R0')
ylabel('c')
plot(R0, r3, '*', R0, r5, '*', R0, r7, '*', R0, r10, '*')
xlabel('R0')
ylabel('r')
plot(R0, alpha3, '*', R0, alpha5, '*', R0, alpha7, '*', R0, alpha10, '*')
xlabel('R0')
ylabel('alpha')
5 Programme to check the impact of aggregated data on the TSIR model fit in
(3.1.3)
% Define the function of the deterministic model
function A = SIR(beta, v, N)
options = odeset ('RelTol',1e-5,'AbsTol',1e-5);
[T,Y] = ode45(@sirsys1,[0:1:200],[N 1 0],options,beta,v);
A=[T,Y];
function dy = sirsys1(t,y,beta,v)
dy = zeros (3,1);
dy(1)=-beta*y(1)*y(2);
dy(2)=beta*y(1)*y(2)-v*y(2);
dy(3)=v*y(2);
% Estimate the coefficients of TSIR model
R0=2.0;
N=5000000;
gamma=1/5; % or gamma = 1/10
l1=size(gamma);
l=l1(2);
beta=gamma*R0/N;
k1=size(beta);
k=k1(2);
c=zeros(l,k);
r=zeros(l,k);
alpha=zeros(l,k);
tol = 1e-35;
maxit = 25;
[beta_final,gamma_final]=meshgrid(beta,gamma);
for i = 1:l
for j = 1:k
A=SIR(beta_final(i,j),gamma_final(i,j),N);
a=size(A);
n=a(1);
check=[A(1:(n-1),2)-A(2:n,2) (1:1:(n-1))'];
check1=check(check(:,1)<0,:);
if(sum(check1(:,1))==0)
new.n=n;
else
new.n=min(check1(:,2));
end
B=A(1:new.n,2);
n=length(B);
split=1; % or split = 3, 5, 7, 10, 12 corresponding to the different aggregated data
n1=floor(n/split);
output=sum(B(1:split));
for i1 = 2:n1
out=sum(B((split*(i1-1)+1):(split*i1)));
output=[output;out];
end
if((n/split)~=n1)
out=mean(B((split*n1+1):n))*split;
output=[output;out];
end
n=length(output);
b=log(output(2:(n-1))-output(3:n));
tmp_A=ones(n-2,1);
tmp_A1=log(output(1:(n-2)));
tmp_A2=log(output(1:(n-2))-output(2:(n-1)));
A1=[tmp_A tmp_A1 tmp_A2];
[x,flag,relres,iter,resvec,lsvec]=lsqr(A1,b,tol,maxit);
c(i,j)=exp(x(1));
r(i,j)=x(2);
alpha(i,j)=x(3);
end
end
% Produce the plot to compare TSIR model and SIR model
simu_n1=size(b);
simu_n=simu_n1(1);
simu_A1=tmp_A1;
simu_A2=tmp_A2;
for i = 1:simu_n
simu_b(i)=log(c)+simu_A1(i)*r+simu_A2(i)*alpha;
if(i~=simu_n)
simu_A2(i+1)=log(exp(simu_b(i)));
simu_A1(i+1)=log(exp(simu_A1(i))-exp(simu_A2(i)));
end
end
LOG_INFECTIVENESS_T1=simu_b;
F=(LOG_INFECTIVENESS_T1)';
INFECTIVENESS=exp(F);
INFECTIVENESS=INFECTIVENESS';
T1=length(INFECTIVENESS);
T=(1:1:(T1-1));
T=T';
INFECTIVENESS2=exp(tmp_A2);
t1=length(INFECTIVENESS2);
t2=(1:1:(t1-1));
t2=t2';
Q=length(INFECTIVENESS2)-1;
plot(T(1:Q),INFECTIVENESS(1:Q)',t2,INFECTIVENESS2(2:t1),'*')
xlabel('time')
ylabel('It')
6 Programme to investigate the relationship between the three parameters in
CAIM and R0 in (3.2.1)
% Define the function of the deterministic model
function A = SIR(beta, v, N)
options = odeset ('RelTol',1e-5,'AbsTol',1e-5);
[T,Y] = ode45(@sirsys1,[0:1:200],[N 1 0],options,beta,v);
A=[T,Y];
function dy = sirsys1(t,y,beta,v)
dy = zeros (3,1);
dy(1)=-beta*y(1)*y(2);
dy(2)=beta*y(1)*y(2)-v*y(2);
dy(3)=v*y(2);
% Estimate the coefficients of TSIR model
R0=2:0.1:20;
N=5000000;
gamma=1/3; % or gamma =1/5, 1/7, 1/10
l1=size(gamma);
l=l1(2);
beta=gamma*R0/N;
k1=size(beta);
k=k1(2);
R3=zeros(l,k); % or R5, R7, R10 corresponding to gamma =1/5, 1/7, 1/10
r3=zeros(l,k); % or r5, r7, r10 corresponding to gamma =1/5, 1/7, 1/10
alpha3=zeros(l,k); %alpha5, alpha7, alpha10 corresponding to gamma =1/5, 1/7, 1/10
tol = 1e-35;
maxit = 25;
[beta_final,gamma_final]=meshgrid(beta,gamma);
for i = 1:l
for j = 1:k
A=SIR(beta_final(i,j),gamma_final(i,j),N);
a=size(A);
n=a(1);
check=[A(1:(n-1),2)-A(2:n,2) (1:1:(n-1))'];
check1=check(check(:,1)<0,:);
if(sum(check1(:,1))==0)
new.n=n;
else
new.n=min(check1(:,2));
end
B=A(1:new.n,2);
n=length(B);
split=1;
n1=floor(n/split);
output=sum(B(1:split));
for i1 = 2:n1
out=sum(B((split*(i1-1)+1):(split*i1)));
output=[output;out];
end
if((n/split)~=n1)
out=mean(B((split*n1+1):n))*split;
output=[output;out];
end
n=length(output);
b=log(output(2:(n-1))-output(3:n));
tmp_A=ones(n-2,1);
tmp_A1=N-output(1:(n-2));
tmp_A2=log(output(1:(n-2))-output(2:(n-1)));
A1=[tmp_A tmp_A1 tmp_A2];
[x,flag,relres,iter,resvec,lsvec]=lsqr(A1,b,tol,maxit);
R3(i,j)=exp(x(1)); % R5, R7, R10 corresponding to gamma =1/5, 1/7, 1/10
r3(i,j)=x(2); % r5, r7, r10 corresponding to gamma =1/5, 1/7, 1/10
alpha3(i,j)=x(3); % alpha5, alpha7, alpha10 corresponding to gamma =1/5, 1/7, 1/10
end
end
% Produce the plots to show the relationship between R0 and the three parameters in CAIM
plot(R0, R3, '*', R0, R5, '*', R0, R7, '*', R0, R10, '*')
xlabel('R0')
ylabel('R')
plot(R0, r3, '*', R0, r5, '*', R0, r7, '*', R0, r10, '*')
xlabel('R0')
ylabel('r')
plot(R0, alpha3, '*', R0, alpha5, '*', R0, alpha7, '*', R0, alpha10, '*')
xlabel('R0')
ylabel('alpha')
7 Programme to apply CAIM to fit 2001 FMD data in UK in (4.1)
# Produce the plot for original FMD data
orig_data1=read.csv("D:\\FMD_orig.csv")
orig_data=orig_data1[,1]
x=seq(7,170,by=7)
x1=1:170
plot(x1,orig_data,type="p",col="Red",pch="*",xlab="Day",ylab="Number of FMD Case")
# Produce the deterministic and stochastic realizations of the estimated CAIM for FMD data
data=read.csv("D:\\footmouth.csv")
data1=data
data[,2]=log(data[,2])
data[,3]=log(data[,3])
data2=data
model=lm(It1~It+cul1,data)
coefficients=model$coefficients
predicted.A=predict.lm(model,data)
predicted.A1=predicted.A
n=length(predicted.A)
plot(x1,orig_data,type="p",col="Red",pch="*",xlab="Day",ylab="Number of FMD Case")
for(j in 1:1000)
{
predicted.A=predict.lm(model,data)
for(i in 1:n)
{
predicted.A[i]=rpois(1,exp(predicted.A[i]))
}
lines(x,predicted.A/7,col="Green")
}
lines(x,exp(predicted.A1)/7,col="Blue")
points(x1,orig_data,col="Red",pch="*")
# Produce the deterministic and stochastic realizations of one-step ahead prediction based on the estimated CAIM for FMD data
predicted.B=data$It1
predicted.B1=predicted.B
plot(x1,orig_data,type="p",col="Red",pch="*",xlab="Day",ylab="Number of FMD Case")
for(j in 1:1000)
{
data=data2
for(i in 1:n)
{
data[i,3]=predict.lm(model,data[i,])
predicted.B1[i]=data[i,3]
predicted.B[i]=rpois(1,exp(predicted.B1[i]))
if(i!=n)
{
data[i+1,2]=data[i,3]
data[i+1,4]=data[i,4]+exp(data[i,3])
}
}
lines(x,predicted.B/7,col="Green")
}
points(x1,orig_data,col="Red",pch="*")
lines(x,exp(predicted.B1)/7,col="Blue")
8 Programme to apply CAIM to fit 2003 SARS data in Hong Kong in (4.2.1)
# Produce the plot for original SARS data in Hong Kong
orig_data1=read.csv("D:\\hk_orig.csv")
orig_data=c(10,orig_data1[,1])
x=seq(3,92,by=3)
x1=1:92
plot(x1,orig_data,type="p",col="Red",pch="*",xlab="Day",ylab="Number of SARS Case")
# Produce the deterministic and stochastic realizations of the estimated CAIM for SARS data in Hong Kong
data=read.csv("D:\\HK.csv")
data1=data
data[,2]=log(data[,2])
data[,3]=log(data[,3])
# Keep a copy of the log-transformed data; the one-step ahead loop below resets data to data2
data2=data
model=lm(It1~It+cul,data)
coefficients=model$coefficients
predicted.A=predict.lm(model,data)
predicted.A1=predicted.A
n=length(predicted.A)
plot(x1,orig_data,type="p",col="Red",pch="*",xlab="Day",ylab="Number of SARS
Case")
for(j in 1:1000)
{
  predicted.A=predict.lm(model,data)
  for(i in 1:n)
  {
    predicted.A[i]=rpois(1,exp(predicted.A[i]))
  }
  lines(x,predicted.A/3,col="Green")
}
lines(x,exp(predicted.A1)/3,col="Blue")
points(x1,orig_data,col="Red",pch="*")
# Produce the deterministic and stochastic realizations of one-step ahead prediction
# based on the estimated CAIM for SARS data in Hong Kong
predicted.B=data$It1
predicted.B1=predicted.B
plot(x1,orig_data,type="p",col="Red",pch="*",xlab="Day",ylab="Number of SARS Case")
for(j in 1:1000)
{
  data=data2
  for(i in 1:n)
  {
    data[i,3]=predict.lm(model,data[i,])
    predicted.B1[i]=data[i,3]
    predicted.B[i]=rpois(1,exp(predicted.B1[i]))
    if(i!=n)
    {
      data[i+1,2]=data[i,3]
      data[i+1,7]=data[i,7]+exp(data[i,3])
    }
  }
  lines(x,predicted.B/3,col="Green")
}
points(x1,orig_data,col="Red",pch="*")
lines(x,exp(predicted.B1)/3,col="Blue")
9 Programme to apply CAIM to fit 2003 SARS data in Singapore in (4.2.2)
# Produce the plot for original SARS data in Singapore
orig_data1=read.csv("D:\\singapore_orig.csv")
orig_data=orig_data1[,1]
# x: three-day time points (day 3, 6, ...); x1: daily time points
x=seq(3,67,by=3)
x1=1:67
plot(x1,orig_data,type="p",col="Red",pch="*",xlab="Day",ylab="Number of SARS Case")
# Produce the deterministic and stochastic realizations of the estimated CAIM for
# SARS data in Singapore
data=read.csv("D:\\singapore.csv")
data1=data
data[,2]=log(data[,2])
data[,3]=log(data[,3])
# log-scale copy, restored at the start of each one-step ahead run
data2=data
model=lm(It1~It+cul1+cul2,data)
coefficients=model$coefficients
predicted.A=predict.lm(model,data)
predicted.A1=predicted.A
n=length(predicted.A)
plot(x1,orig_data,type="p",col="Red",pch="*",xlab="Day",ylab="Number of SARS Case")
for(j in 1:1000)
{
  predicted.A=predict.lm(model,data)
  for(i in 1:n)
  {
    predicted.A[i]=rpois(1,exp(predicted.A[i]))
  }
  lines(x,predicted.A/3,col="Green")
}
lines(x,exp(predicted.A1)/3,col="Blue")
points(x1,orig_data,col="Red",pch="*")
# Produce the deterministic and stochastic realizations of one-step ahead prediction
# based on the estimated CAIM for SARS data in Singapore
predicted.B=data$It1
predicted.B1=predicted.B
plot(x1,orig_data,type="p",col="Red",pch="*",xlab="Day",ylab="Number of SARS Case")
for(j in 1:1000)
{
  data=data2
  for(i in 1:n)
  {
    data[i,3]=predict.lm(model,data[i,])
    predicted.B1[i]=data[i,3]
    predicted.B[i]=rpois(1,exp(predicted.B1[i]))
    if(i!=n)
    {
      data[i+1,2]=data[i,3]
      if(i<6)
      {
        data[i+1,4]=data[i,4]+exp(data[i,3])
      }else{
        # data[i+1,4]=data[i,4]
        data[i+1,4]=0
        data[i+1,5]=data[i,5]+exp(data[i,3])
      }
    }
  }
  lines(x,predicted.B/3,col="Green")
}
points(x1,orig_data,col="Red",pch="*")
lines(x,exp(predicted.B1)/3,col="Blue")
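The recursion in the programme above feeds each predicted log count forward as the next period's It, accumulating cases in cul1 before the break point and in cul2 afterwards. The following sketch restates it as a function that addresses columns by name rather than by position. The function name simulate_path, the argument brk, the column names and the toy numbers are all assumptions made for illustration; they are not taken from the Singapore data file.

```r
# Hypothetical helper: deterministic forward recursion with one break point.
# Assumes 'data' has columns It, cul1, cul2 and that 'fit' was estimated on them.
simulate_path <- function(fit, data, brk) {
  n <- nrow(data)
  log_pred <- numeric(n)
  for (i in seq_len(n)) {
    log_pred[i] <- predict(fit, data[i, ])
    if (i < n) {
      data$It[i + 1] <- log_pred[i]            # prediction becomes next period's It
      if (i < brk) {
        data$cul1[i + 1] <- data$cul1[i] + exp(log_pred[i])
      } else {
        data$cul1[i + 1] <- 0                  # first cumulative term switched off
        data$cul2[i + 1] <- data$cul2[i] + exp(log_pred[i])
      }
    }
  }
  log_pred
}

# Made-up counts for illustration only; NOT the SARS data
counts <- c(5, 9, 14, 20, 24, 22, 17, 12, 8, 5, 3)
cul <- cumsum(counts[1:10])
toy <- data.frame(It   = log(counts[1:10]),
                  It1  = log(counts[2:11]),
                  cul1 = ifelse(1:10 <= 5, cul, 0),
                  cul2 = ifelse(1:10 > 5, cul - cul[5], 0))
fit <- lm(It1 ~ It + cul1 + cul2, data = toy)
path <- simulate_path(fit, toy, brk = 5)
```

Using column names means the recursion does not depend on the column order of the CSV file, which differs between the programmes in this appendix.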
10 Programme to apply CAIM to fit 2003 SARS data in Ontario in (4.2.2)
# Produce the plot for original SARS data in Ontario
orig_data1=read.csv("D:\\ontario_orig.csv")
orig_data=orig_data1[,1]
# x: three-day time points (day 3, 6, ...); x1: daily time points
x=seq(3,121,by=3)
x1=1:121
plot(x1,orig_data,type="p",col="Red",pch="*",xlab="Day",ylab="Number of SARS Case")
# Produce the deterministic and stochastic realizations of the estimated CAIM for
# SARS data in Ontario
data=read.csv("D:\\ontario.csv")
data1=data
data[,2]=log(data[,2])
data[,3]=log(data[,3])
# log-scale copy, restored at the start of each one-step ahead run
data2=data
model=lm(It1~It+cul1+cul2,data)
coefficients=model$coefficients
predicted.A=predict.lm(model,data)
predicted.A1=predicted.A
n=length(predicted.A)
plot(x1,orig_data,type="p",col="Red",pch="*",xlab="Day",ylab="Number of SARS Case")
for(j in 1:1000)
{
  predicted.A=predict.lm(model,data)
  for(i in 1:n)
  {
    predicted.A[i]=rpois(1,exp(predicted.A[i]))
  }
  lines(x,predicted.A/3,col="Green")
}
lines(x,exp(predicted.A1)/3,col="Blue")
points(x1,orig_data,col="Red",pch="*")
# Produce the deterministic and stochastic realizations of one-step ahead prediction
# based on the estimated CAIM for SARS data in Ontario
predicted.B=data$It1
predicted.B1=predicted.B
plot(x1,orig_data,type="p",col="Red",pch="*",xlab="Day",ylab="Number of SARS Case")
for(j in 1:1000)
{
  data=data2
  for(i in 1:n)
  {
    data[i,3]=predict.lm(model,data[i,])
    predicted.B1[i]=data[i,3]
    predicted.B[i]=rpois(1,exp(predicted.B1[i]))
    if(i!=n)
    {
      data[i+1,2]=data[i,3]
      if(i<24)
      {
        data[i+1,4]=data[i,4]+exp(data[i,3])
      }else{
        # data[i+1,4]=data[i,4]
        data[i+1,4]=0
        data[i+1,5]=data[i,5]+exp(data[i,3])
      }
    }
  }
  lines(x,predicted.B/3,col="Green")
}
points(x1,orig_data,col="Red",pch="*")
lines(x,exp(predicted.B1)/3,col="Blue")
11 Programme to apply CAIM to fit measles data in China in (4.3)
# Produce the plot for original measles data in China
A = read.csv("D:\\measles.csv",header=TRUE)
data = A[,1:17]
# It1 has not been log-transformed yet at this point, so it is plotted without exp()
plot(data$It1,type="b",col="Red",xlab="Month",ylab="Number of Measles case")
# Produce the deterministic and stochastic realizations of the estimated CAIM for
# measles data in China
A = read.csv("D:\\measles.csv",header=TRUE)
data = A[,1:17]
data[,1]=log(data[,1])
data[,2]=log(data[,2])
# log-scale copy of the observed series, used when re-plotting the data points
data1=data
model=lm(It1~c1+c2+c3+c4+c5+c6+c7+c8+c9+c10+c11+c12+culIt+culbt+It-1,data)
coefficients=model$coefficients
predicted.A=predict.lm(model,data)
# storage for one simulated path over the 144 months
predicted.B=numeric(144)
plot(exp(data$It1),type="b",col="Red",xlab="Month",ylab="Number of Measles case")
lines(exp(predicted.A),col="Blue")
for(j in 1:1000)
{
  for(i in 1:144)
  {
    data[i,1]=predict.lm(model,data[i,])
    temp=exp(data[i,1])
    data[i,1]=data[i,1]+log(rpois(1,temp))-log(temp)
    predicted.B[i]=data[i,1]
  }
  lines(exp(predicted.B),col="Green")
}
points(exp(data1$It1),col="Red")
lines(exp(predicted.A),col="Blue")
# Produce the deterministic and stochastic realizations of one-step ahead prediction
# based on the estimated CAIM for measles data in China
plot(exp(data1$It1),type="p",col="Red",xlab="Month",ylab="Number of Measles case")
for(j in 1:1000)
{
  for(i in 1:144)
  {
    data[i,1]=predict.lm(model,data[i,])
    temp=exp(data[i,1])
    data[i,1]=data[i,1]+log(rpois(1,temp))-log(temp)
    predicted.B[i]=data[i,1]
    if(i!=144)
    {
      data[i+1,2]=data[i,1]
      data[i+1,3]=data[i,3]+exp(data[i,1])
    }
  }
  lines(exp(predicted.B),col="Green")
}
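# Deterministic path (no Poisson noise); the loop runs over all 144 months,
# since n is not redefined in this programme
for(i in 1:144)
{
  data[i,1]=predict.lm(model,data[i,])
  predicted.B[i]=data[i,1]
  if(i!=144)
  {
    data[i+1,2]=data[i,1]
    data[i+1,3]=data[i,3]+exp(data[i,1])
  }
}
points(exp(data1$It1),col="Red")
lines(exp(predicted.B),col="Blue")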
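In the stochastic loops of the programme above, the update data[i,1]+log(rpois(1,temp))-log(temp) reduces algebraically to the log of the Poisson draw, because data[i,1] equals log(temp) at that point. A draw of zero therefore gives log(0) = -Inf, which would then be carried forward by the recursion. The sketch below shows one possible guard. Flooring the draw at 1 is an assumption of this sketch and not part of the thesis, and the function name noisy_log is hypothetical.

```r
# Hypothetical helper: Poisson noise on the log scale with a floor at one case
noisy_log <- function(log_mean) {
  mean_count <- exp(log_mean)
  draw <- rpois(length(mean_count), mean_count)
  # Assumption of this sketch: zero draws are floored at 1 so the log stays finite
  log(pmax(draw, 1))
}

set.seed(2)
out <- noisy_log(log(c(0.1, 5, 500)))   # small, moderate and large means
```

For the large monthly means typical of a national measles series a zero draw is very unlikely, so the guard mainly matters when the predicted counts are small.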