Download - A STATISTICAL MODEL FOR THE TRANSMISSION OF INFECTIOUS DISEASES · 2018-01-09 · Infectious diseases are the diseases which can be transmitted from one host of humans or animals

A STATISTICAL MODEL FOR THE TRANSMISSION

OF INFECTIOUS DISEASES

WANG WEI

NATIONAL UNIVERSITY OF SINGAPORE

2007

A STATISTICAL MODEL FOR THE TRANSMISSION

OF INFECTIOUS DISEASES

WANG WEI

(B.Sc. University of Auckland, New Zealand)

A THESIS SUBMITTED

FOR THE DEGREE OF MASTER OF SCIENCE

DEPARTMENT OF STATISTICS AND

APPLIED PROBABILITY

NATIONAL UNIVERSITY OF SINGAPORE

2007

ACKNOWLEDGEMENTS

I would like to express my deep and sincere gratitude to Associate Prof. Xia

Yingcun, my supervisor, for his valuable advices and guidance, endless patience,

kindness and encouragements. I do appreciate all the time and efforts he has spent in

helping me to solve the problems I encountered. I have learned many things from him,

especially regarding academic research and character building.

I also would like to give my special thanks to my husband Mi Yabing for his love

and patience during my graduate period. I feel a deep sense of gratitude for my

parents who teach me the things that really matter in life.

Furthermore, I would like to attribute the completion of this thesis to other

members of the department for their help in various ways and providing such a

pleasant working environment, especially to Ms. Yvonne Chow and Mr. Zhang Rong.

Finally, it is a great pleasure to record my thanks to my dear friends: to Mr. Loke

Chok Kang, Mr. Khang Tsung Fei, Ms. Zhang Rongli, Ms. Zhao Wanting, Ms. Huang

Xiaoying, Ms. Zhang Xiaoe, Mr. Li Mengxin, Mr. Jia Junfei and Mr. Wang Daqing,

who have given me much help in my study. Sincere thanks to all my friends who help

me in one way or another for their friendship and encouragement.

Wang Wei

July 2007

ii

CONTENTS

Acknowledgements ii

Summary vi

List of Tables viii

List of Figures ix

Chapter 1 Introduction 1

1.1 Epidemiological background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Main objectives of this thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

1.3 Organization of this thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

Chapter 2 Classical Epidemic Models 7

2.1 Susceptible-infective-removed models (SIR) . . . . . . . . . . . . . . . . . . . . . . 8

2.2 The assumptions for epidemic models . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

2.2.1 Assumptions about the population of hosts . . . . . . . . . . . . . . . . . . . 9

2.2.2 Assumptions about the disease mechanism . . . . . . . . . . . . . . . . . . . 10

2.3 Deterministic models. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

2.4 Stochastic models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

2.5 Some terminology. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

2.5.1 Basic reproductive rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

2.5.2 Important time scales in epidemiology . . . . . . . . . . . . . . . . . . . . . . 19

iii

Chapter 3 Statistical Epidemic Models 21

3.1 The time series – susceptible – infected – recovered model (TSIR) . . . . 21

3.1.1 Check the validation of TSIR model . . . . . . . . . . . . . . . . . . . . . . . 22

3.1.2 The relationship between the parameters in TSIR model and the

3.1.2 deterministic SIR model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

3.1.3 Impact of aggregated data on the model fit . . . . . . . . . . . . . . . . . . 27

3.2 The cumulative alertness infection model (CAIM) . . . . . . . . . . . . . . . . . 33

3.2.1 Development of CAIM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

3.2.2 Extension of CAIM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

3.2.2.1 Change of alertness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

3.2.2.2 Long term epidemics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

3.2.3 Data denoising and model estimation . . . . . . . . . . . . . . . . . . . . . . . . 39

Chapter 4 Application of CAIM to Real Epidemiological Data 41

4.1 Foot and Mouth Disease (FMD) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

4.2 Severe Acute Respiratory Syndrome (SARS) . . . . . . . . . . . . . . . . . . . . . . 45

4.2.1 SARS in Hong Kong . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

4.2.2 SARS in Singapore and Ontario, Canada . . . . . . . . . . . . . . . . . . . . . 51

4.3 Measles. . . . . . . . . . . . . . . . . . . . . . .. . . . . . . .. . . . . . . . . . . . . . . . . . . . . . 55

4.4 Concluding remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

BIBLIOGRAPHY 59

Appendix Programme Codes 65

1 Programme to produce the realization of deterministic SIR model with

4.4 different parameters in (2.3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

iv


4.4 different R0 in (2.5.1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

3 Programme to check the validation of TSIR model in (3.1.1) . . . . . . . . . . . 68

4 Programme to investigate the relationship between the parameters in TSIR

4.4 model and R0 in (3.12) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

5 Programme to check the impact of aggregated data on the TSIR model fit in

4.4 (3.1.3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

6 Programme to investigate the relationship between the three parameters in

4.4 CAIM and R0 in (3.2.1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

7 Programme to apply CAIM to fit 2001 FMD data in UK in (4.1) . . . . . . . . . 82

8 Programme to apply CAIM to fit 2003 SARS data in HK in (4.2.1) . . . . . . . 84

9 Programme to apply CAIM to fit 2003 SARS data in Singapore in (4.2.2) . . 86

10 Programme to apply CAIM to fit 2003 SARS data in Ontario in (4.2.2) . . . 89

11 Programme to apply CAIM to fit measles data in China in (4.3) . . . . . . . . . 92

v

SUMMARY

Infectious diseases are the diseases which can be transmitted from one host of

humans or animals to other hosts. The impact of the outbreak of an infectious disease

on human and animal is enormous, both in terms of suffering and in terms of social

and economic consequences. In order to make predictions about disease dynamics and

to determine and evaluate control strategies, it is essential to study their spread, both

in time and in space. To serve this purpose, mathematical modeling is a useful tool in

gaining a better understanding of transmission mechanism.

There are two classic mathematical models: deterministic model and stochastic

model that have been applied to study infectious diseases and the concepts derived

from such models are now widely used in the design of infection control programmes.

Although the mathematical models are elegant, there are still some reasons that more

practical statistical models should be developed. During the outbreak of an infectious

disease, what we can observe in the first place is a time series of the number of cases.

It is very important to do an instantaneous analysis of the available time series and

provide useful suggestions. However, most existing mathematical models are based

on a system of differential equations with lots of unknown parameters which are

difficult to estimate statistically. Furthermore, these models need the effective number

of susceptibles, which is also difficult to calculate and define. In this thesis, we first

propose the time series-susceptible-infected-recovered (TSIR) model based on the

compartmental SIR mechanism. The validation of TSIR model was checked by

simulation. The result showed that TSIR model performed well. Then we develop a

vi

more practical statistical model, the cumulative alertness infection model (CAIM)

based on the TSIR model, which only requires the reported number of cases. The

parameters in the CAIM have been interpreted and some extensions of CAIM have

been discussed.

We also apply the CAIM to fit the real data of several infectious diseases: foot-and-

mouth disease in UK in 2001, SARS in Hong Kong, Singapore and Ontario, Canada

in 2003 and measles in China from 1994 to 2005. Our findings showed that the CAIM

could mimic the dynamics of these diseases reasonably well. The results indicate that

the CAIM may be helpful in making predictions about infectious disease dynamics.

vii

List of Tables

Table 2.1 Estimated values of R0 for various infectious diseases ………………. 19

Table 2.2 Incubation, latent and infectious periods (in days) for some infectious diseases . . . . . . . . . . . . . . . . . . . . . . . . ………………………………… 20

viii

List of Figures

Figure 2.1 Plot of a typical deterministic realization of an epidemic SIR model

with N=100, β=0.005, γ=0.1 for change in the number of infectives ……. 13

Figure 2.2 Plot of a typical deterministic realization of an epidemic SIR model with N=100, β=0.005, γ=0.1 for change in the number of susceptibles … 13

Figure 2.3 Plot of a typical deterministic realization of an epidemic SIR model with N=100, β=0.015, γ=0.1 for change in the number of infectives ……. 14

Figure 2.4 Plot of a typical deterministic realization of an epidemic SIR model with N=100, β=0.015, γ=0.1 for change in the number of susceptibles …. 14

Figure 2.5 Plot of a realization from the differential SIR model with R0 = 0.8, N=100 and γ =0.1 ………………………………………………………… 17

Figure 2.6 Plot of a realization from the differential SIR model with R0 = 1.5, N=100 and γ =0.1 ………………………………………………………… 18

Figure 2.7 Plot of a realization from the differential SIR model with R0 = 5, N=100 and γ =0.1 ………………………………………………………… 18

Figure 2.8 Diagrammatic illustration of the relationship between the incubation, latent and infectious periods for a hypothetical microparastic infection …. 20

Figure 3.1 Plot of a deterministic realization of the estimated TSIR model and the SIR model with R0 = 2.0, γ= 1/5, N = 5000000 ……………………… 23

Figure 3.2 Plot of a deterministic realization of the estimated TSIR model and the SIR model with R0 = 2.0, γ= 1/10, N = 5000000 …………………….. 23

Figure 3.3 Plot of a deterministic realization of the estimated TSIR model and the SIR model with R0 = 7.0, γ= 1/5, N = 5000000 ……………………… 24

ix

Figure 3.4 Plot of a deterministic realization of the estimated TSIR model and the SIR model with R0 = 7.0, γ= 1/10, N = 5000000 ……………………. 24

Figure 3.5 Plot of the relationship between c in TSIR model and R0 ……………. 25

Figure 3.6 Plot of the relationship between r in TSIR model and R0 …………..... 26

Figure 3.7 Plot of the relationship between α in TSIR and R0 …………………... 26

Figure 3.8 Plot of the deterministic realization of the estimated TSIR model before aggregating data based on SIR model with R0 = 2.0, γ = 1/5 and N=5000000 ………………………………………………………………. 28

Figure 3.9 Plot of the deterministic realization of the estimated TSIR model by aggregating data into 3 time units based on SIR model with R0 = 2.0, γ = 1/5 and N=5000000 …………………………………………………. 29

Figure 3.10 Plot of the deterministic realization of the estimated TSIR model by aggregating data into 5 time units based on SIR model with R0 = 2.0, γ = 1/5 and N=5000000 …………………………………………………. 29

Figure 3.11 Plot of the deterministic realization of the estimated TSIR model by aggregating data into 7 time units based on SIR model with R0 = 2.0,

γ =1/5 and N=5000000 ………………………………….................……. 30

Figure 3.12 Plot of the deterministic realization of the estimated TSIR model before aggregating data based on SIR model with R0 = 2.0, γ = 1/10 and

N=5000000 ……………………………………………………………… 30

Figure 3.13 Plot of the deterministic realization of the estimated TSIR model by aggregated data into 3 time units based on SIR model with R0 = 2.0,

γ = 1/10 and N=5000000 ………………………………………………… 31


γ = 1/10 and N=5000000 ………………………………………………… 31

x


γ = 1/10 and N=5000000 ………………………………………………… 32


γ = 1/10 and N=5000000 ………………………………………………… 32


γ = 1/10 and N=5000000 ………………………………………………… 33

Figure 3.18 Plot of the relationship between R in CAIM model and R0 ………... 36

Figure 3.19 Plot of the relationship between r in CAIM model and R0 …………. 36

Figure 3.20 Plot of the relationship between α in CAIM model and R0 …………. 37

Figure 4.1 Plot of the observed daily cases of the 2001 FMD epidemic in the UK. 43

Figure 4.2 Plot of the deterministic and stochastic realization of the estimated model……………………………………………………………………… 44

Figure 4.3 Plot of the deterministic and stochastic realization of one-step ahead prediction based on the estimated model …………………………………. 45

Figure 4.4 Plot of the observed daily cases of the 2003 SARS epidemic in Hong Kong………………………………………………………………………. 46

Figure 4.5 Plot of the deterministic and stochastic realization of the estimated model for the data of HK…………………………………………………. 47

Figure 4.6 Plot of the deterministic and stochastic realization of one-step ahead prediction based on the estimated model for the data of HK …………….. 48

Figure 4.7 Plot of the deterministic and stochastic realization of the estimated model for the data of HK …………………………………………………. 50

xi

Figure 4.8 Plot of the deterministic and stochastic realization of one-step ahead prediction based on the estimated model for the data of HK ……………... 50

Figure 4.9 Plot of the observed daily cases of the 2003 SARS epidemic in Singapore………………………………………………………………….. 51

Figure 4.10 Plot of the observed daily cases of the 2003 SARS epidemic in Ontario, Canada …………………………………………………………... 51

Figure 4.11 Plot of the deterministic and stochastic realization of the estimated model for the data of Singapore …………………………………………... 53

Figure 4.12 Plot of the deterministic and stochastic realization of one-step ahead prediction based on the estimated model for the data of Singapore ……… 53

Figure 4.13 Plot of the deterministic and stochastic realization of the estimated model for the data of Ontario …………………………………………….. 54

Figure 4.14 Plot of the deterministic and stochastic realization of one-step ahead prediction based on the estimated model for the data of Ontario ………… 54

Figure 4.15 Plot of the observed monthly measles cases from January 1994 to December 2005 in China …………………………………………………. 56

Figure 4.16 Plot of the deterministic realization of the estimated model for the data of Measles in China ………………………………………………….. 57

Figure 4.17 Plot of the deterministic realization of one-step ahead prediction based on the estimated model for the data of measles in China ………… 57

xii

CHAPTER 1

Introduction

1.1 Epidemiological background

In recent years, as the terms such as the Ebola virus, avian influenza and SARS

frequently dominate news head lines, infectious diseases have become a source of

continuous concern. Owing to improved sanitation, antibiotics and vaccination

programs, it was believed that infectious diseases would be eliminated soon in 1960s.

However, as new infectious diseases such as AIDS (1981), hepatitis C (1989),

hepatitis E (1990) and hanta virus (1993) have emerged and some existing diseases

such as malaria and dengue fever have reemerged and are spreading into new regions

as climate changes (Hethcote [2000]), infectious diseases have gained increasing

recognition as a key component in the human communities.

By definition, an infectious disease is a clinically evident disease of humans or

animals that damages or injures the host so as to impair host function, and results

from the presence and activity of one or more pathogenic microbial agents, including

viruses, bacteria, fungi, protozoa, multi-cellular parasites, and recently identified

proteins known as prisons.

An infectious disease is transmitted from some source. The infectiousness of a

disease indicates the comparative ease with which the disease can be transmitted to

from one host to others. The transmissible nature of infectious diseases makes them

fundamentally different from non-infectious diseases. Transmission of an infectious

1

disease may occur through several pathways, including through contact with infected

individuals, by water, food, airborne inhalation, or through vector-borne spread.

When there is an infectious disease epidemic (an unusually high number of cases

in a region), or pandemic (a global epidemic), public health professionals and policy

makers will be interested in such questions as how many people will be affected

altogether and thus require treatment? What is the maximum number of people

needing care at any particular time? How long will the epidemic last? How much

good would different interventions do in reducing the severity of the epidemic? To

answer these questions, the mathematical models can be useful tools in understanding

the patterns of disease spread and assessing the effects of different interventions.

The application of mathematics to the study of infectious disease appears to have

been initiated by Daniel Bernoulli in 1760. He used a mathematical method to

evaluate the effectiveness of the techniques of variolation against smallpox, with the

aim of influencing public health policy. But the first epidemiology modeling seemed

to have started in the 20th century. In 1906, Hamer formulated a discrete time model

to analyze the regular recurrence of measles epidemics and put forward so called

‘mass action principle’, one of the most important concepts in mathematical

epidemiology, indicating that the number of new cases per unit time depended on the

product of density of susceptible people times the density of infectious individuals. In

1908, Ronald Ross translated this ‘mass action principle’ into a continuous-time

framework in his pioneering work on dynamics of malaria (Ross [1911], [1916],

[1917]). The first complete mathematical model for the spread of an infectious disease

was proposed by Kermack and MacKendrick [1927], known as the deterministic

general epidemic model. Based on this model, Kermack and MacKendrick established

2

the celebrated Threshold Theorem, according to which the introduction of a few

infectious individuals into a community of susceptibles will not give rise to an

epidemic outbreak unless the density or number of susceptibles is above a certain

critical value. Thereafter, as cornerstones of modern theoretical epidemiology, the

threshold theorem and the mass action principle began to provide a firm theoretical

framework for the investigation of observed patterns.

As epidemiological data became more extensive, especially when small family or

household groups were considered, variation and elements of chance became more

important determinants of spread and persistence of infection and this led to the

development of stochastic theories and probabilistic models (Bartlett [1955], Bailey

[1975]). McKendrick [1926] was the first to propose a stochastic model. While the

deterministic model considers the actual number of new cases on a short interval time

to be proportional to the numbers of both susceptibles and infectious cases, as well as

to the length of the interval, the stochastic model assumed the probability of one new

case in a short interval to be proportional to the same quantity. Unfortunately, this

stochastic continuous–time version of the deterministic model of Kermack and

McKendrick [1927], did not received much attention. It was not until the late 1940’s,

when Bartlett [1949] studied the stochastic version of the Kermack-McKendrick

model and developed a partial differential equation for the probability-generating

function of the numbers of susceptibles and infectious cases at any instant, that

stochastic continuous-time epidemic models began to be analyzed more extensively.

Since then, the effort put into modeling infectious diseases has more or less exploded.

There is a vast literature on deterministic and stochastic epidemic modeling. Here

only a few central books on epidemic modeling will be mentioned. Most of the work

3

on modeling disease transmission prior to 1975 is contained in Bailey [1975]. The

author presents a comprehensive account of both deterministic and stochastic models,

illustrates the use of a variety of the models using real outbreak data and provides us

with a complete bibliography of the area. The book that has received most attention

recently is Anderson and May [1991]. The authors model the spread of disease for

several different situations and give many practical applications, but only focus on

deterministic models. A thematic semester at Isaac Newton Institute, Cambridge,

resulted in three collections of papers (Mollison [1995], Isham and Medley [1996],

Grenfell and Dobson [1996]), covering topics in stochastic modelling and statistical

analysis of epidemics, human infectious diseases and animal diseases, respectively.

Very recently, the book by Daley and Gani [1999] focuses on stochastic modeling but

also contains statistical inference and deterministic modeling, as well as some

historical remarks. Another new monograph by Diekmann and Heesterbeek [2000] is

concerned with mathematical epidemiology of infectious diseases and their methods

are also applied to real data.

Although the mathematical models are elegant, there are still some reasons that

more practical statistical models should be developed. During the outbreak of an

infectious disease, what we can observe in the first place is a time series of the

number of cases. It is very important to do an instantaneous analysis of the available

time series and provide useful suggestions. However, most existing mathematical

models are based on a system of differential equations with lots of unknown

parameters which are difficult to estimate statistically. Furthermore, these models

need the effective number of susceptibles, which is also difficult to calculate and

define. In this thesis, a simple and practical statistical model is proposed that only

4

considers the reported number of cases and the relationship with the classical

mathematical models will be analyzed.

1.2 Main objectives of this thesis

We start with an introduction of two classical mathematical models for infectious

diseases. For simplicity, in this thesis we assume that the population is a single group

of homogeneous individuals who mix uniformly and roughly constant through the

epidemic. At any given time, an individual in the population is either susceptible to

the disease, or infectious with it, or a removed case by acquired immunity or isolation

or death. Furthermore, the infectious disease discussed here is s directly transmitted

viral or bacterial disease. First, we propose the time series-susceptible-infected-

recovered (TSIR) model. Then we check the validation of the TSIR model and

investigate the relationship between the parameters in the TSIR model and the

classical mathematical epidemic model. Next, based on the TSIR model, we develop

a complete case-driven model, the cumulative alertness infection model (CAIM). The

parameters in the CAIM have been interpreted and some extensions of CAIM have

been discussed. Finally, we apply the CAIM to real data of some infectious diseases

to demonstrate that CAIM is useful in making predictions about the disease

dynamics.

1.3 Organization of this thesis

We organize this thesis into four chapters. Chapter 1 is a historical introduction to

the epidemic models for infectious diseases. In the next chapter, Chapter 2, we review

two classical mathematical models for infectious diseases and some important

terminology in epidemiological study. In Chapter 3, we introduce two new statistical

5

models: TSIR and CAIM. The validation of the models will be investigated and the

parameters in the new models interpreted. In the last chapter, Chapter 4, we apply the

model, CAIM to the real data of some infectious diseases and demonstrate that the

new model can be very useful in making predictions about disease dynamics.

6

CHAPTER 2

Classical Epidemic Models

Infectious disease data have two features that distinguish them from other data.

They are high dependence that inherently present and the infection process cannot be

observed entirely. Therefore, the analysis of data is usually most effective when it is

based on a model that describes a number of aspects of the underlying infection

pathway, i.e. on an epidemic model. The main purpose of the epidemic model is to

take facts about the disease as inputs and to make predictions about the numbers of

infected and uninfected people over time as outputs.

The application of mathematical models to dynamics of infectious disease such as

measles, influenza, rubella and chicken pox has been a real success story in the 20th

century science (see Hethcote [1976], Dietz [1979], Anderson and May [1982], Dietz

and Schenzle [1985], Hethcote and Van Ark [1987], Castill-Chavez et al. [1989],

Feng and Thieme [1995]). Even the dynamics of disease appear to be very complex,

surprisingly simple mathematical models can be used to understand the features

governing the outbreak and persistence of infectious disease.

Epidemic modeling is expected to attain the three aims. The first is to understand

better the mechanisms by which diseases spread. The second aim is to predict the

future course of the epidemic. The third aim is to understand how we may control the

spread of the epidemic. Of the several methods for achieving this, education,

immunization and isolation are those most often used.

7

2.1 Susceptible-infective-removed models (SIR)

A population comprises a large number of individuals, all of whom are different in

various fields. In order to model the progress of an epidemic in such a population, this

diversity must be reduced to a few key characteristics which are relevant to the

infection under consideration. For most common infectious diseases, all individuals

are initially susceptible. On infection they become infectious for a period, after which

they stop being infectious, recover and become immune or die. They are then said to

be removed. Any individual who is infectious is called infective. Hence, the

population can be divided into those who are susceptible to the disease (S), those who

are infected (I) and those who have been removed (R).These subdivisions of the

population are called compartments. The models which assume that individuals pass

through the susceptible (S), infective (I) and removed (R) states in turn are call SIR

models.

The SIR model is dynamic in two senses. At first, the model is dynamic in that the

numbers in each compartment may fluctuate over time. During an epidemic, the

number of susceptibles falls more rapidly as more of them are infected and thus enter

the infectious and recovered compartments. The disease cannot break out again until

the number of susceptibles has built back up as a result of babies being born into the

compartment. The SIR is also dynamic in the sense that individuals are born

susceptible, then may acquire the infection (move into the infectious compartment)

and finally recover (move into the recovered compartment). Thus each member of the

population typically progresses from susceptible to infectious to recover. This can be

shown as a flow diagram in which the boxes represent the different compartments and

the arrows the transition between compartments.

8

For the full specification of the model, the arrows should be labeled with the

transition rates between compartments. Between S and I, the transition rate is λ, the

force of infection, which is simply the rate at which susceptible individuals become

infected by an infectious disease. Between I and R, the transition rate is γ (simply the

rate of recovery). If the mean duration of the infection is denoted D, then D = 1/ γ,

since an individual experiences one recovery in D units of time.

2.2 The assumptions for epidemic models

For any given model, they usually encompass a set of assumptions. For simplicity

and convenience, some assumptions for the models considered in this thesis should be

introduced first.

An epidemic process of an infectious disease can be thought as the evolution of

the disease phenomenon within a given population of individuals. Naturally, the

assumptions for the epidemic models involve two aspects: assumptions about the

population of hosts and the disease mechanism.

2.2.1 Assumptions about the population of hosts

In general, populations of hosts show demographic turnover: old individual

disappear by death and new individuals appear by birth. Such a demographic process

has its characteristic time scale (for humans on the order of 1-10 years). The time

scale at which an infectious diseases sweeps through a population is often much

shorter (e.g. for influenza it is on the order of weeks). In such a case we choose to

9

http://en.wikipedia.org/wiki/Image:SIR.PNG

ignore the demographic turnover and consider the population as ‘closed’ (which also

means that we do not pay any attention to emigration and immigrations).

Consider such a closed population and assume that it is naïve, in the sense that it is

completely free from a certain disease-causing organism in which we are interested.

Furthermore, the simplest situation where the disease-causing organism is introduced

in by only one host is under consideration in this thesis.

In summary, we make assumptions about the population as follows:

(a) the population structure: the population is a single group of homogeneous

individuals who mix uniformly ;

(b) the population dynamics: the population is closed so that it is a constant

collection of the same set of individuals for all time;

(c) a mutually exclusive and exhaustive classification of individuals according

to their disease status: at any given time, an individual is either susceptible

to the disease, or infectious with it, or a removed case by acquired

immunity or isolation or death.

2.2.2. Assumptions about the disease mechanism

As far as the disease mechanism is concerned, we assume that we deal with micro-

parasites, which are characterized by the fact that a single infection triggers an

autonomous process in the host. Micro-parasites may be thought of as those parasites

which have direct reproduction within the host. They tend to have small size and a

short generation time. Hosts that recover from infection usually acquire immunity

against re-infection for some time, and often for life so that no individual can be

infected twice. The duration of micro-parasite infection is usually short relative to the

10

expected life span of the host. Most viral and bacterial parasites, and many protozoan

and fungal parasites fall into the micro-parasitic category.

In addition, we assume that the disease is spread by a contagious mechanism so

that contact between an infectious individual and a susceptible is necessary. After an

infectious contact, the infectious individual succeeds in changing the susceptible

individual’s disease status.

2.3 Deterministic models

To indicate that the numbers might vary over time (even if the total population

size remains constant), we make the precise numbers a function of t (time): S(t), I(t)

and R(t). For a specific disease in a specific population, these functions may be

worked out in order to predict possible outbreaks and bring them under control.

In the classical model for a general epidemic, the size of the population N is

assumed to be fixed, and individuals in the population are counted according to their

disease status, numbering S (t) susceptibles, I (t) infectives and R (t) removals (dead,

isolated or immune), so that S (t) is non-increasing, R (t) is non-decreasing and the

sum S (t) + I (t) + R (t) = N, for all t >0. Then, the deterministic form of the SIR

model is defined as:

( )dS tdt = - βS(t)I(t) (2.1)

( )dI tdt

= βS(t)I(t) – γI(t) (2.2)

( )dR tdt

= γI(t) (2.3)

11

The evolution of this epidemic process is deterministic in the sense that no

randomness is allowed for. The results of a deterministic process are regarded as

giving an approximation to the mean of a random process.

Here β > 0 is the pair-wise rate of infection (i.e. infection parameter) at which the

number of infectives simultaneously increases at the same rate as the number of

susceptibles decreases, and deceases through removal at a rate; γ > 0 is the removal

rate at which infectives become removed.

The results derived from the equations are stated formally as follows, which

constitute a benchmark for a range of epidemic models.

Theorem 2.1 (Kermack-McKendrick). Subject to the initial conditions (S (0), I (0), R

(0)) = (S0, I0, R0) with I0 ≥ 1, R0 =0 and S0 +I0 =N, a general epidemic evolves

according to the differential equations (2.1-2.3) (Daley and Gani [1999]).

(i) (Survival and Total size). When infection ultimately ceases spreading,

a positive number S∞ of susceptibles remain uninfected, and the total

number R∞ of individuals ultimately infected and removed equals S0 +

I0 – R∞ and is the unique root of the equation

N – R∞ = S0 + I0 – R∞ = S0 exp(-R∞ /ρ),

Here I0 < R∞ < S0 + I0, ρ = γ / β being the relative removal rate.

(ii) (Threshold Theorem). A major outbreak occurs if and only if the initial

number of susceptibles S0 > ρ.

(iii) (Second Threshold Theorem). If S0 exceeds ρ by a small quantity v,

and if the initial number of infectives I0 is small relative to v, then the

12

final number of susceptibles left in the population is approximately ρ –

v, and R∞ ≈ 2v.

The major significance of these statements at the time of their first publication was

a mathematical demonstration that not all susceptibles would necessarily be infected.

Conditions were given for a major outbreak to occur, namely that the number of

susceptibles at the start of the epidemic should be sufficiently high; this would

happen, for example, in a city with a large population.

0 10 20 30 40 50 60 70 80 90 1000

5

10

15

20

25

30

35

40

45

50

time

SIR

:num

ber o

f cas

es

Figure 2.1 Plot of a typical deterministic realization of an epidemic SIR model with N=100, β=0.005, γ=0.1 for change in the number of infectives.

0 10 20 30 40 50 60 70 80 90 1000

10

20

30

40

50

60

70

80

90

100

time

SIR

:num

ber o

f sus

cept

ible

s

Figure 2.2 Plot of a typical deterministic realization of an epidemic SIR model with N=100, β=0.005, γ=0.1 for change in the number of susceptibles.

13

0 10 20 30 40 50 60 70 80 90 1000

10

20

30

40

50

60

70

80

time

SIR

:num

ber o

f cas

es

Figure 2.3 Plot of a typical deterministic realization of an epidemic SIR model with N=100, β=0.015, γ=0.1 for change in the number of infectives.

0 10 20 30 40 50 60 70 80 90 1000

10

20

30

40

50

60

70

80

90

100

time

SIR

:num

ber o

f sus

cept

ible

s

Figure 2.4 Plot of a typical deterministic realization of an epidemic SIR model with N=100, β=0.015, γ=0.1 for change in the number of susceptibles.

Deterministic modeling considers a structured mathematical framework, where

one takes the actual number of new cases in a short interval of time to be proportional

to the product of the number of both susceptible and infectious individuals, as well as

the length of the time interval.

14

2.4 Stochastic Models

While deterministic methods may be adequate to characterize the spread of an

epidemic in a large population, they are not satisfactory for smaller populations, in

particular those of family size. Hence, when the number of members of the

population subject to the epidemic is relatively small, we describe the evolution of an

epidemic as a stochastic process.

We consider an SIR model in which the total population is subdivided into S (t)

susceptibles, I (t) infectives and R (t) immunes or removals, with (S, I, R) (0) = (N, I0,

0) , where I0 ≥ 1 and

S (t) + I (t) + R (t) = N +I0 (all t ≥ 0). (2.4)

We assume that { ( S, I ) (t) : t ≥ 0} is a homogeneous Markov chain in continuous

time, with state space the non-negative integers in 2Z satisfying (2.4) and N ≥ S(t) ≥

0. The non-zero transition probabilities occur for 0 ≤ i ≤ N, j = N + I – i and are 0

Pr{ (S, I) ( t + Δt) = ( i-1, j+1) | ( S, I ) (t) = (i,j)} = βijΔt + o (Δt) (2.5)

Pr{ (S, I) ( t + Δt) = ( i, j-1) | ( S, I ) (t) = (i,j)} = γjΔt + o (Δt) (2.6)

Pr{ (S, I) ( t + Δt) = ( i, j) | ( S, I ) (t) = (i,j)} =1- ( βi + γ) jΔt + o (Δt) (2.7)

Here β is the pairwise infection parameter and γ the removal parameter. Since there is

assumed to be a homogeneous mixing so that in the time interval (t, t + Δt) infections

occur at rate βijΔt and removals at rate γjΔt.

15

Similarly, there is a stochastic threshold theorem similar in nature to the Kermack-

McKendrick theorem for the deterministic general epidemic.

Theorem 2.2 (Whittle’s Threshold Theorem). Consider a general epidemic process

with initial numbers of susceptibles N and infectives I, and relative removal rate ρ.

For any ζ in (0, 1), let π (ζ) denote the probability that at most [Nζ] of the susceptibles

are ultimately infected i.e. that the intensity of the epidemic does not exceed ζ (Daley

and Gani [1999]).

(i) If ρ < N(1- ζ), then (ρ/N) I ≤ π (ζ) ≤ (ρ/N(1- ζ)) I

(ii) If N (1- ζ) ≤ ρ < N, then (ρ/N) I ≤ π (ζ) ≤ 1.

(iii) If ρ ≥ N, then π (ζ) =1, and the probability that the epidemic achieves

an intensity greater than any predetermined ζ in (0, 1), is zero.

In short, stochastic modeling works on conditional probability structure, where

one assumes that a new case in a short interval of time is proportional to both

susceptibles and infectives, as well as the length of the time interval.

2.5 Some terminology

2.5.1 Basic reproductive rate

The key value governing the time evolution of these equations is the threshold

parameter, which is denoted by R0, the basic reproductive rate. R0 can be calculated

by the following formula:

R0 = β S (0) / γ

16

Here β is the infection parameter and γ the removal rate in Equation (2.1-2.3) and

S (0) is the number of susceptibles at t =0.

R0 is defined as the number of secondary infectious individuals generated by a

“typical” infectious individual when introduced into a fully susceptible population. In

other words, it determines the number of people infected by contact with a single

infected person before his death or recovery.

There are two qualitatively different situations which may occur with an epidemic

in a large population, where R0 plays an important role.

If R0 < 1, each person who contracts the disease will infect fewer than one person

before dying or recovering, so the epidemic will die out quickly, as shown in Figure

2.5.

If R0 > 1, each person who contracts the disease, will infect more than one person,

thus the epidemic will spread through the population, as shown in Figure 2.6 and 2.7.

0 20 40 60 80 100 120 140 160 180 2000

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

time

SIR

:num

ber o

f cas

es

Figure 2.5 Plot of a realization from the differential SIR model with R0 = 0.8, N=100 and γ =0.1.

17

0 20 40 60 80 100 120 140 160 180 2000

1

2

3

4

5

6

7

8

SIR

:num

ber o

f cas

es

time

Figure 2.6 Plot of a realization from the differential SIR model with R0 = 1.5, N=100 and γ =0.1.

0 20 40 60 80 100 120 140 160 180 2000

5

10

15

20

25

30

35

40

45

50

time

SIR

:num

ber o

f cas

es

Figure 2.7 Plot of a realization from the differential SIR model with R0 = 5, N=100 and γ =0.1.

The estimated values of the basic reproductive rate, R0, for some common

infectious diseases are listed in Table 2.1(Data from Anderson and May [1991],

Wallinga and Teunis [2004]).

18

Table 2.1 Estimated values of R0 for various infectious diseases.

Disease Location Time R0Measles England and Wales 1950-1968 16-18 Kansas, USA 1918-1921 5-6 Ontario, Canada 1912-1913 11-12 Mumps Baltimore, USA 1943 7-8 England and Wales 1960-1980 11-14 Netherlands 1970-1980 11-14 Rubella West Germany 1970-1977 6-7 Poland 1970-1977 11-12 poliomyelitis USA 1955 5-6 Netherlands 1960 6-7 HIV/AIDS England and Wales 1981-1985 2-5 Nairobi, Kenya 1981-1985 11-12 Kampala, Uganda 1985-1987 10-11 SARS Hong Kong 2003 3.1-4.2 Ontario, Canada 2003 1.8-3.6 Singapore 2003 2.3-4.0 Vietnam 2003 1.8-3.1

It shows that the range of R0 is estimated to be between 2 and 20 for most

infectious diseases in practice.

Generally, the larger the value of R0, the harder it is to control the epidemic. In

particular, the proportion of the population that needs to be vaccinated to provide herd

immunity and prevent sustained spread of the infection is given by 1-1/R0. The basic

reproductive rate is affected by several factors including the duration of infectivity of

affected patients, the infectiousness of the organism, and the number of susceptible

people in the population that the affected patients are in contact with.

2.5.2 Important time scales in epidemiology

There are several time scales in epidemic theory needed to be clarified before our

further discussion.

Assuming that there is an instant at which infection occurs for an individual:

19

1. the latent period indicates the period from the point of infection to the

beginning of the state of infectiousness.

2. the incubation period indicates the period from the point of infection to the

appearance of symptoms of disease.

3. the infectious period indicates the period during which an infected host is able

to infect susceptible individuals, i.e. infectious to susceptible individuals.

Hence a host may be infected but not yet infectious. The relationship between

these time scales is illustrated in Figure 2.8.

0 7 14

Time Units

Latent

Incubation

Infectious

Symptoms

Recovered / immune

Point of infection

Figure 2.8 Diagrammatic illustration of the relationship between the incubation, latent and infectious periods for a hypothetical microparastic infection.

The lengths of incubation, latent and infectious periods for some infectious

diseases are listed in Table 2.2 (Data from Fenner and White [1970], Christie [1974],

Benenson [1975]).

Table 2.2 Incubation, latent and infectious periods (in days) for some infectious diseases.

Infectious disease Incubation period Latent period Infectious period Measles 8-13 6-9 6-7 Mumps 12-26 12-18 4-8 Rubella 14-21 7-14 11-12 Poliomyelitis 7-12 1-3 14-20 Influenza 1-3 1-3 2-3 Scarlet fever 2-3 1-2 14-21

20

CHAPTER 3

Statistical Epidemic Models

3.1 The time series – susceptible – infected – recovered model (TSIR)

To begin with, the population is assumed to be a single group of homogeneous

individuals who mix uniformly and roughly constant through the epidemic. At any

given time, an individual in the population is either susceptible to the disease, or

infectious with it, or a removed case by acquired immunity or isolation or death.

Furthermore, the infectious disease discussed here is a directly transmitted viral

disease.

The simplest situation is under consideration where only one infective is

introduced into the close population in which all the individuals are susceptible to the

disease.

Let It be the number of cases at time period t and St the number of susceptibles.

Each period is the average duration between an individual acquiring infection and

passing it on to the next infectee.

According to the compartmental SIR mechanism, the number of new cases in the

next time period is

It+1 = β St It (t = 0, 1, 2, …)

21

Here the constant β is such that β St It ≤ St for all t (Bailey [1975], Finkenstädt and

Grenfell [2000]). However, this model was found to be not attractive in the terms of

dynamical pattern (Anderson and May [1991]).

Therefore, another model with nonlinear incidence rates which can show a wider

range of dynamical behavior has been proposed.

I t+1 = c S tr Itα (3.1)

St = S t-1 - It (3.2)

This model is called the time series- susceptible-infected-recovered model (TSIR)

in Finkenstädt and Grenfell [2000].

3.1.1 Check the validation of TSIR model

Since the deterministic SIR model has been widely used to study the spread of

infectious diseases, we take it as a true model for an epidemic of an infectious disease

after specifying the parameters in Equations (2.1-2.3), which can generate the number

of susceptibles, infectives and recovered for each time t. In this section, we will make

use of the generated numbers to check the validation of the model. The following

steps will be taken:

Step 1 The Kermack-McKendrick differential equations with specific R0 (hence β =

R0* γ / N), γ and N were used to generate the number of the susceptibles, infectives

and removals in an epidemic for each time t.

Step 2 The number of the susceptibles (St) was extracted and based on the equation

(3.2), the number of new infectives (It) at time t obtained.

22

Step 3 Take logarithm to both sides of the equation (3.1) and the least-squared

method was used to estimate the three parameters in the model.

Step 4 The deterministic realizations from the deterministic SIR model and from the

TSIR model are compared in the figures.

0 20 40 60 80 100 120 140 160 180 2000

2

4

6

8

10

12

14

16

18x 104

time

It

Figure 3.1 Plot of a deterministic realization of the estimated TSIR model and the SIR model with R0 = 2.0, γ= 1/5, N = 5000000. ‘*’ is the number of new cases from the SIR model and ‘—’ the fitted TSIR model.

0 50 100 150 200 250 3000

1

2

3

4

5

6

7

8

9x 104

time

It


23

0 10 20 30 40 50 60 70 80 90 1000

0.5

1

1.5

2

2.5

3

3.5

4

4.5x 105

time

It


0 20 40 60 80 100 120 140 160 1800

0.5

1

1.5

2

2.5x 10

5

time

It


As shown in the figures above, the TSIR model can mimic the dynamic pattern of

an epidemic reasonably well.

24

3.1.2 The relationship between the parameters in TSIR model and the

deterministic SIR model

To investigate the relationship between the parameters in TSIR model and R0, Step

1-3 in Section 3.1.1 are repeated to get a series of the estimated parameters

corresponding to the different values of R0. The range of R0 is fixed between 2 and 20

as indicated in the real world. Similarly, as the actual mean infectious period (D) is

usually 3 – 10 days, the values of γ are chosen to be 1/3, 1/5, 1/7 and 1/10 (D = 1/ γ).

The population of susceptibles is fixed at N = 5,000,000. The results are shown in the

following figures.

2 4 6 8 10 12 14 16 18 200

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

R0

c

Figure 3.5 Plot of the relationship between c in TSIR model and R0.The blue* for γ = 1/3; the green* for γ = 1/5; the red* for γ = 1/7 and the turquoise * for γ = 1/10.

As shown in Figure 3.5, as the value of R0 increases, the parameter c in TSIR

model tends to increase for any mean infectious period. The relationship between

them is non-linear. The parameter c tends to be higher for longer infectious period. In

reality, for a specific infectious disease, since the infectious period is usually stable as

one of biological properties of the disease, the estimated value of c in TSIR model

25

can be thought to correspond to the basic reproductive number R0 of the infectious

disease during an epidemic.

2 4 6 8 10 12 14 16 18 200.02

0.04

0.06

0.08

0.1

0.12

0.14

0.16

0.18

0.2

R0

r

Figure 3.6 Plot of the relationship between r in TSIR model and R0. The blue* for γ = 1/3; the green* for γ = 1/5; the red* for γ = 1/7 and the turquoise * for γ = 1/10.

2 4 6 8 10 12 14 16 18 200.98

0.985

0.99

0.995

1

1.005

alph

a

R0

Figure 3.7 Plot of the relationship between α in TSIR and R0. The blue* for γ = 1/3; the green* for γ = 1/5; the red* for γ = 1/7 and the turquoise * for γ = 1/10.

The parameter r and α thought as mixing parameters of contact process (Liu et al

[1987]). Figure 3.6 and 3.7 showed that both of them have negative relationship with

R0. As R0 determines the number of people infected by contact with a single infected

26

person before his death or recovery, the negative relationship indicates that the

contact rate between the infectious hosts and susceptibles tend to be lower as R0 tend

to be higher.

3.1.3 Impact of aggregated data on the model fit

For an infectious disease, the most important timescale of the disease is the

infectious period, during which an infected host is able to infect susceptible

individuals, i.e. infectious to susceptible individuals. Assuming that there is an instant

at which an individual is infected and become infectious immediately to other

susceptibles. If the infectious period is, e.g. 7 days, this infected individual will

generate new cases in the following 7 days through contacting other susceptibles. To

put it in another way, if the data of the cases are aggregated into one-week time steps,

an infected person in week t can hence be related to a contact between an infected

individual in week t-1 and an susceptible individual (Finkenstädt and Grenfell

[2000]). Therefore, it would be reasonable to aggregate the cases based on the

infectious period and then use the TSIR model to fit it.

To check the impact of aggregated data on the model fit, we take the following

steps:

Step 1 The Kermack-McKendrick differential equations (2.1-2.3) with specific R0

(hence β = R0* γ / N), γ and N are used to generate the number of the susceptibles,

infectives and removals in an epidemic for each time t.

Step 2 The number of the susceptibles (St) is extracted and aggregated into different

time units to get the aggregated number of susceptibles. Based on the equation (3.2),

27

the aggregated number of infectives (It) for different time units can be obtained.

These aggregated data will be substitute into the TSIR model.

Step 3 Take logarithm to both sides of Equation (3.1) and the least-squared method

was used to estimate the three parameters in the model.

Step 4 The deterministic realizations from the deterministic SIR model and from the

TSIR model are compared in the figures.

Here we only display the results from the deterministic SIR model with R0 = 2.0, γ

= 1/5 and N=5000000 and with R0 = 2.0, γ = 1/10 and N=5000000. The data from the

former model were aggregated into 3, 5, 7 time units respectively (Figure 3.8-3.11)

while the data from the latter model were aggregated into 3, 5, 7, 10, 12 time

respectively (Figure 3.12-3.17). Note that ‘*’ is the number of new cases from the

SIR model and ‘—’ the fitted TSIR model.

0 20 40 60 80 100 120 140 160 180 2000

2

4

6

8

10

12

14

16

18x 10

4

time

It

Figure 3.8 Plot of the deterministic realization of the estimated TSIR model before aggregating data based on SIR model with R0 = 2.0, γ = 1/5 and N=5000000.

28

0 50 100 150 200 250 3000

1

2

3

4

5

6x 105

time

It

Figure 3.9 Plot of the deterministic realization of the estimated TSIR model by aggregating data into 3 time units based on SIR model with R0 = 2.0, γ = 1/5 and N=5000000.

0 50 100 150 200 250 3000

1

2

3

4

5

6

7

8

9x 105

time

It

Figure 3.10 Plot of the deterministic realization of the estimated TSIR model by aggregating data into 5 time units based on SIR model with R0 = 2.0, γ = 1/5 and N=5000000.

29

0 50 100 150 200 250 3000

2

4

6

8

10

12x 105

time

It

Figure 3.11 Plot of the deterministic realization of the estimated TSIR model by aggregating data into 7 time units respectively based on SIR model with R0 = 2.0, γ = 1/5 and N=5000000.

As shown in Figure 3.8-3.11, it appears that the estimated TSIR model for the

aggregated data of 5 time units have the best fit.

0 50 100 150 200 250 3000

1

2

3

4

5

6

7

8

9x 104

time

It

Figure 3.12 Plot of the deterministic realization of the estimated TSIR model before aggregating data based on SIR model with R0 = 2.0, γ = 1/10 and N=5000000.

30

0 50 100 150 200 250 3000

0.5

1

1.5

2

2.5

3x 10

5

time

It

Figure 3.13 Plot of the deterministic realization of the estimated TSIR model by aggregated data into 3 time units based on SIR model with R0 = 2.0, γ = 1/10 and N=5000000.

0 50 100 150 200 250 3000

0.5

1

1.5

2

2.5

3

3.5

4

4.5x 10

5

time

It


31

0 50 100 150 200 250 3000

1

2

3

4

5

6

7x 105

time

It


0 50 100 150 200 250 3000

1

2

3

4

5

6

7

8

9x 105

time

It


32

0 50 100 150 200 250 3000

2

4

6

8

10

12x 105

time

It


As shown in Figure 3.12-3.17, it appears that the estimated TSIR model for the

aggregated data of 10 time units have the best fit.

The same method were also used to check the impact of aggregated data on the

model fit based on different deterministic SIR model, the results showed similar

pattern to the two examples above: the estimated TSIR model for the aggregated data

into exact or close to D time units tend to have the better fit, where D is the mean

infectious period and D = 1/ γ, indicating that aggregating data into D time units may

improve the TSIR model fit.

3.2 The cumulative alertness infection model (CAIM)

3.2.1 Development of CAIM

For a nonrecurring SIR disease such as measles or SARS, under the SIR

mechanism the recovered will never be infected, the number of effective susceptibles

33

is the total number in the population minus the cumulative sum of all previous cases

reported, i.e.

St = N - 1

t

Iτ

τ=∑ (3.3)

Therefore, a general TSIR model can be proposed as

It+1 ∝ f (N - 1

t

Iτ

τ=∑ ) It

α (3.4)

Assuming that N is roughly constant but unknown, the general TSIR model can be

further simplify as

It+1 = g (1

t

Iτ

τ=∑ ) It

α (3.5)

Here g (·) is the recovered (immunized) – based reproductive rate.

By the three equations above, the function g should be assumed to be a decreasing

monotonic function. Another reason for the monotonicity assumption is that people’s

alertness to the diseases increases as more and more cases are reported. For

convenience, g(x) = Rexp(-r1

t

Iτ

τ=∑ ) It

α was chosen as the function of g, where R and

r are both positive constant parameters ( Xia [2007]).

Given Iτ, τ ≤ t, a deterministic dynamical infection system is

It+1 = Rexp(-r1

t

Iτ

τ=∑ ) It

α (3.6)

34

Randomizing the dynamical system by the Poisson distribution, a stochastic

dynamical cumulative alertness infection model (CAIM) is

It+1 ~ Poi {Rexp(-r1

t

Iτ

τ=∑ ) It

α } (3.7)

There are three parameters in the model. The parameter R corresponds to the basic

reproductive number R0. The parameter r is the coefficient of cumulative alertness,

which depends on several factors including people’s awareness of the diseases and

the efficiency of control measures. The lower the value of r is, the higher is the level

of alertness and the more efficient are the control measures. The parameter α

corresponds to the contact intensity between the infectious hosts and the susceptibles.

Since the alertness is assumed to depend on the cumulative number of previous cases

1

t

Iτ

τ=∑ , the model above is called the cumulative alertness infection model (CAIM).

The relationship between the three parameters in CAIM and R0 were investigated by

the simulation method similar to that in Section 3.1.2. The results are shown in Figure

3.18-3.20, which confirm that the interpretations for the three parameters are

reasonable.

35

2 4 6 8 10 12 14 16 18 200

5

10

15

20

25

R0

R

Figure 3.18 Plot of the relationship between R in CAIM model and R0. The blue* for γ = 1/3; the green* for γ = 1/5; the red* for γ = 1/7 and the turquoise * for γ = 1/10.

2 4 6 8 10 12 14 16 18 200

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1x 10-6

R0

r

Figure 3.19 Plot of the relationship between r in CAIM model and R0. The blue* for γ = 1/3; the green* for γ = 1/5; the red* for γ = 1/7 and the turquoise * for γ = 1/10.

36

2 4 6 8 10 12 14 16 18 200.95

0.96

0.97

0.98

0.99

1

1.01

R0

alph

a

Figure 3.20 Plot of the relationship between α in CAIM model and R0. The blue* for γ = 1/3; the green* for γ = 1/5; the red* for γ = 1/7 and the turquoise * for γ = 1/10.

Both TSIR model and CAIM can be directly applied to fit the data which are

discrete time measurements. As the number of susceptible is usually difficult to be

obtained in most cases, some approaches have been developed to reconstruct the

unobserved susceptible class in order to make use of TSIR model (Finkenstädt and

Grenfell [2000]). Compared to TSIR model, the CAIM only uses the reported

infections Iτ, τ ≤ t to model and predict the infection dynamics and no other

unobtainable data like the number of susceptibles (St) is required. Hence the CAIM is

data driven and might be preferred by statisticians.

3.2.2 Extension of CAIM

In order to investigate more complicated dynamics and other issues of infections,

two typical extensions of CAIM will be discussed in this section.

37

3.2.2.1 Change of alertness

Model (3.6) and (3.7) assumes that more cases in a community during an epidemic

of an infectious disease, people in the same community are more alerted to this

disease. As a consequence, the reproductive rate R t = Rexp(-r1

t

Iτ

τ=∑ ) will decrease as

the cases increases. The extent of alertness may not be the same in the other

communities or in another period during the epidemic. If the disease breaks out in

another community, the extent of alertness may start at a different level. The good

examples to support this understanding were the SARS outbreaks in Hong Kong and

Singapore (Chowell et al [2003]). Based on this understanding, a deterministic multi-

community CAIM model (Xia [2007]) is proposed as

It+1 = Rexp(-r1

1

1

T

Iτ

τ=∑ - r2

2

1 1

T

TI

τ

τ= +∑ - … - rp

1

t

TpI

τ

τ= +∑ ) It

α (3.8)

And the stochastic version of the model is


1

1

T

Iτ

τ=∑ - r2

2

1 1

T

T

Iτ

τ= +∑ - … - rp

1

t

Tp

Iτ

τ= +∑ ) It

α } (3.9)

3.2.2.2 Long term epidemics

The models discussed in the previous parts contain the assumption that the

population is constant through an epidemic of an infectious disease. Essentially all

conventional mathematical models for endemic or epidemic infections contain this

assumption. Given that the epidemiological time-scales such as the infectious period

38

are typically shorter than the time-scales characterizing demographic change such as

population doubling time, the assumption of a constant population is often a useful

approximation. However, for a nonrecurring infectious disease like measles, it is the

newborns who maintain the enough number of susceptibles to induce the long term

epidemic or endemic of the disease. Hence the CAIM should be modified to include

births and seasonality as follows:

It+1 = R(t)exp(-r1

t

Iτ

τ=∑ + β

1

t

bτ

τ=∑ ) It

α (3.10)

Here bt is the births during time period t and R(t) is a function of time t in a growing

population (Xia [2007]).

3.2.3 Data denoising and model estimation

Suppose It' is the reported number of cases in each time period t, t =1, 2, …, T. In

practice, the reported time series It' is usually contaminated by observation errors and

underreporting (Finkenstädt and Grenfell [2000]). Thus, it can be written as It' = It +

εt. The reported time series need to be de-noised before we use them to estimate a

model. Using kernel smoothing method, a de-noised time series is

= It1

( )T

Kh t Iτ

τ τ=

′−∑ / 1

(T

Kh tτ

τ=

)−∑ (3.11)

where K(.) is a symmetric smooth density function and h is a bandwidth used to

balance the noise and variation of the dynamic pattern, and K h (x) = K h (x / h) /h.

Using kernel smooth method to refine time series data in epidemiological modeling

can be found in the literature. See for example Becker and Hasofer [1998],

Finkenstädt & Grenfell [2000] and Ellner et al [2002].

39

{ } is called the denoised time series. In stead of the original time series, the de-

noised time series will be used to estimate the model. Taking logarithm to both sides

of the model

It


t

Iτ

τ=∑ ) It

α }

Hence log (It+1) = log (R) -r1

t

Iτ

τ=∑ + α log (It) + ξ t

where ξ t = log [Poi {Rexp(-r1

t

Iτ

τ=∑ ) It

α }] – log [Rexp(-r1

t

Iτ

τ=∑ ) It

α].

The iterative weighted least squared method is used to estimate the model.

40

CHAPTER 4

Application of CAIM to Real Epidemiological Data

Being able to make predictions about infectious disease dynamics is really helpful

for public health. If we know there will be an outbreak, we can prepare for it.

Alternatively, if we have a reliable model, we can study how to prevent an outbreak

and save lives by change the factors we can control using public health means. These

factors include education, immunization, quarantine regulations and health treatment

strategies.

In order to make reasonable predictions and develop methods of control, we must

be confident that our model captures the essential features of the course of an

epidemic. Thus, it becomes important to validate models, whether deterministic or

stochastic, by checking whether they fit the observed data. If the model is validated, it

can then be used to predict the course of the epidemic in time.

In this section, the model CAIM will be checked by fitting it to the observed data

of some nonrecurring infectious disease. Different from the statistical time series

multi-step prediction, which may fail to signify the system, we will check the model

by what will happen if the epidemic breakouts again and whether the fitted dynamic

system can give similar pattern.

Two kinds of realization of the estimated dynamical system will be carried out:

(i) deterministic realization assuming there is no random effect.

41

(ii) stochastic realization assuming the actual number of cases follows a

Poissson distribution. As the stochastic realizations are different from one

another, a number of realizations are needed to show the statistical dynamical

properties.

Both sides of the model will be taken logarithm and the iterative weighted least

square method (IWLSE) be used to estimated the parameters in the model.

4.1 Foot and Mouth Disease (FMD)

Foot and Mouth Disease (FMD) is one of the most contagious animal diseases,

with important economic losses. It is caused by a virus of the family Picornaviridae,

genus Aphthovirus and can spread very rapidly among livestock such as cattle and

sheep by direct or indirect contact (droplets). Low mortality rate in adult animals, but

often high mortality in young due to myocarditis. The incubation period is 2-14 days

and the mean is 7 days (Keeling et al [2001]).

The following data sources from the 2001 FMD epidemic in the United Kingdom.

The analysis will centers on the daily reported FMD cases for the aggregate of all UK

farms from February 22 to August 11 in 2001. These are official notifications,

collected by the Department of the Environment, Food and Rural Affairs.

42

Figure 4.1 Plot of the observed daily cases of the 2001 FMD epidemic in the UK.

As shown in the Figure 4.1, there was obvious variability in the daily case reports

though still a classical epidemic pattern. The fact that the mean incubation period of

FMD is 7 days indicates that an infected animal in week t can hence be related to a

contact between that individual as a susceptible and an infected individual in week t-

1. Therefore, the original data are aggregated into weekly time steps to form the

denoised data.

Next, the following basic CAIM was applied to model the weekly data. The

deterministic & stochastic realizations of the estimated model and one –step ahead

prediction based on the estimated model are shown in Figure 4.2 and Figure 4.3

respectively.

43

It+1 = Rexp(-r1

t

Iτ

τ=∑ ) It

α (4.1)


t

Iτ

τ=∑ ) It

α } (4.2)

Figure 4.2 Plot of the deterministic and stochastic realization of the estimated model. ‘*’ is the observed number of cases, ‘—’ is the deterministic realization of the estimated model and the shade is the region of 1000 stochastic realizations of the estimated model.

44

Figure 4.3 Plot of the deterministic and stochastic realization of one-step ahead prediction based on the estimated model. ‘*’ is the observed number of cases, ‘—’ is the deterministic realization of one-step ahead prediction and the shade is the region of 1000 stochastic realizations of one-step ahead prediction.

As shown in Figure 4.2, Model (4.1) and (4.2) can fit the data of FMD in UK in

2001 well. Figure 4.3 shows that given the first week data, the CAIM can produce

similar dynamic based on the estimated model if the FMD epidemic break outs again,

indicating that the basic CAIM is useful in making prediction for a typical epidemic

of FMD.

4.2 Severe Acute Respiratory Syndrome (SARS)

Severe Acute Respiratory Syndrome (SARS) is a new respiratory disease which

was first identified in November 2002 in Guangdong Province of China. On February

45

26, 2003, the first case of SARS in Hong Kong was reported officially. Thereafter,

this disease spread rapidly to the other countries such as Canada, Singapore, Taiwan

and Vietnam, which resulted in local outbreak of SARS. SARS is caused by

coronavirus and can be transmitted by droplet or airborne or person-to person contact.

Most infected individuals either recover, typically after 7-10 days, or suffer 4%

mortality or higher world-wide. The incubation period is 2-7 days, with 3-5 days

being most common.

In the following sections, the CAIM will be fitted to the SARS data observed in

Hong Kong, Singapore and Ontario, Canada. The original data are aggregated into 3-

day time steps to form the denoised data.

4.2.1 SARS in Hong Kong

Figure 4.4 Plot of the observed daily cases of the 2003 SARS epidemic in Hong Kong.

46

The following basic CAIM was first applied to model the denoised data. The

deterministic & stochastic realizations of the estimated model and one-step ahead

prediction based on the estimated model are shown in Figure 4.5 and Figure 4.6

respectively.

It+1 = Rexp(-r1

t

Iτ

τ=∑ ) It

α (4.1)


t

Iτ

τ=∑ ) It

α } (4.2)

Figure 4.5 Plot of the deterministic and stochastic realization of the estimated model for the data of HK. ‘*’ is the observed number of cases, ‘—’ is the deterministic realization of the estimated model and the shade is the region of 1000 stochastic realizations of the estimated model.

.

47

Figure 4.6 Plot of the deterministic and stochastic realization of one-step ahead prediction based on the estimated model for the data of HK. ‘*’ is the observed number of cases, ‘—’ is the deterministic realization of one-step ahead prediction and the shade is the region of 1000 stochastic realizations of one-step ahead prediction.

As shown in Figure 4.5, Model (4.1) and (4.2) can fit the data of SARS in Hong

Kong in 2003 reasonably well, but Figure 4.6 shows that given the first 3-day data,

the basic CAIM cannot produce satisfactory dynamic based on the estimated model if

the SARS epidemic break outs again, indicating that the basic CAIM is not suitable to

make prediction for such an epidemic of SARS as in Hong Kong.

It can be seen from Figure 4.5 that there were three possible outbreaks of SARS in

Hong Kong. The first outbreak occurred from Day1 to Day 13, the second from Day

14 to Day 22 and the third outbreak from Day 23 till the end of the epidemic. Hence

the CAIM with three different levels of alertness are considered:

48

It+1 = Rexp(-r1

1

1

T

Iτ

τ=∑ - r2

2

1 1

T

T

Iτ

τ= +∑ - r3

2 1

t

T

Iτ

τ= +∑ ) It

α (4.3)

It+1 ~ Poi { Rexp(-r1

1

1

T

Iτ

τ=∑ - r2

2

1 1

T

T

Iτ

τ= +∑ - r3

2 1

t

T

Iτ

τ= +∑ ) It

α } (4.4)

Model (4.3) and (4.4) were applied to model the denoised data. The deterministic

& stochastic realizations of the estimated model and one–step ahead prediction based

on the estimated model are shown in Figure 4.7 and Figure 4.8 respectively.

Figure 4.7 Plot of the deterministic and stochastic realization of the estimated model for the data of HK. ‘*’ is the observed number of cases, ‘—’ is the deterministic realization of the estimated model and the shade is the region of 1000 stochastic realizations of the estimated model.

49

Figure 4.8 Plot of the deterministic and stochastic realization of one-step ahead prediction based on the estimated model for the data of HK. ‘*’ is the observed number of cases, ‘—’ is the deterministic realization of one-step ahead prediction and the shade is the region of 1000 stochastic realizations of one-step ahead prediction.

As shown in Figure 4.7, Model (4.3) and (4.4) can fit the data of SARS in Hong

Kong in 2003 well, but Figure 4.8 shows that given the first 3-day data, the adjusted

CAIM with 3 different level of alertness can produce satisfactory dynamic based on

the estimated model if the SARS epidemic break outs again, indicating that the CAIM

with different level of alertness is useful in making prediction for such an epidemic of

SARS as in Hong Kong.

As mentioned before, the parameter r is the coefficient of cumulative alertness,

which depends on people’s awareness of the disease and the efficiency of control

50

measures. The lower the value of r is, the higher is the level of alertness and the more

efficient are the control measures. For the epidemic of SARS in Hong Kong, the

estimated values of r1, r2 and r3 were 0.0031, 0.0013 and 0.0014. The epidemic of

SARS in Hong Kong occurred mainly in three different regions in Hong Kong. SARS

first broke out in one community and then spread to the other two communities.

During the first outbreak, the people’s alertness to the disease was not very high and

the control measures were not very efficiently carried out. However, as the

cumulative number of reported cases increased, the level of people’s alertness

increased and the control measures were more efficient correspondingly, which

resulted in lower values of r for the second and third outbreak. Hence the change in

the values of r did reflect the change in the level of people’s alertness to SARS and

the efficiency of control measures.

4.2.2 SARS in Singapore and Ontario, Canada

Figure 4.9 Plot of the observed daily cases of the 2003 SARS epidemic in Singapore.

51

Figure 4.10 Plot of the observed daily cases of the 2003 SARS epidemic in Ontario, Canada.

It can be seen from Figure 4.9 and 4.10 that there were two outbreaks of SARS in

Singapore and in Ontario, Canada respectively. In Singapore, the first outbreak

occurred from Day1 to Day 19 and the second from Day 20 till the end of the

epidemic while in Ontario, the first outbreak occurred from Day 1 to Day 71 and the

second from Day 72 till the end of the epidemic. Hence the CAIM with two different

levels of alertness are considered:

It+1 = Rexp(-r1

1

1

T

Iτ

τ=∑ - r2

1 1

t

T

Iτ

τ= +∑ ) It

α (4.5)

It+1 ~ Poi { Rexp(-r1

1

1

T

Iτ

τ=∑ - r2

1 1

t

T

Iτ

τ= +∑ ) It

α } (4.6)

52

Model (4.5) and (4.6) were applied to model the denoised data of Singapore and

Ontario. The results for the data of Singapore are showed in Figure 4.11 and Figure

4.12 and those for the data of Ontario in Figure 4.13 and Figure 4.14 respectively.

Figure 4.11 Plot of the deterministic and stochastic realization of the estimated model for the data of Singapore. ‘*’ is the observed number of cases, ‘—’ is the deterministic realization of the estimated model and the shade is the region of 1000 stochastic realizations of the estimated model.

53

Figure 4.12 Plot of the deterministic and stochastic realization of one-step ahead prediction based on the estimated model for the data of Singapore. ‘*’ is the observed number of cases, ‘—’ is the deterministic realization of one-step ahead prediction and the shade is the region of 1000 stochastic realizations of one-step ahead prediction.

Figure 4.13 Plot of the deterministic and stochastic realization of the estimated model for the data of Ontario. ‘*’ is the observed number of cases, ‘—’ is the deterministic realization of the estimated model and the shade is the region of 1000 stochastic realizations of the estimated model.

54

Figure 4.14 Plot of the deterministic and stochastic realization of one-step ahead prediction based on the estimated model for the data of Ontario. ‘*’ is the observed number of cases, ‘—’ is the deterministic realization of one-step ahead prediction and the shade is the region of 1000 stochastic realizations of one-step ahead prediction.

As shown in Figure 4.11 and 4.13, Model (4.5) and (4.6) can fit the data of SARS in

Singapore and Ontario in 2003 reasonably well. Figure 4.12 and 4.14 show that given

the first 3-day data, the CAIM with 2 different level of alertness can produce

satisfactory dynamics based on the estimated model if the SARS epidemic break outs

again, indicating that the CAIM with different level of alertness is useful in making

prediction for such an epidemic of SARS as in Singapore and Ontario, Canada.

The estimated values of r1 and r2 for the epidemic of SARS in Singapore were

0.0099 and 0.012 and those for the epidemic in Ontario were 0.018 and 0.034.

Different from the SARS epidemic in Hong Kong, the value of r for the second

outbreak was higher than that for the first outbreak for the SARS epidemic in

Singapore and Ontario, which can be explained by the difference in the epidemic

pattern between them. It was obvious from Figure 4.9 and 4.10 that there were two

55

outbreaks of SARS in Singapore and Ontario. More importantly, at the end of the first

outbreak, the number of the reported new cases was few, e.g. 1 new case in Singapore

and no new case in Ontario, which was different from the situation in Hong Kong

shown in Figure 4.4. Few new reported cases might make people get such a false

impression that this epidemic had died out that they became less alert to the disease

when the second outbreak occurred, which resulted in higher value of r for the second

outbreak.

4.3 Measles

Measles is caused by a highly infective single-stranded RNA virus belonging to the

morbillivirus group. It may infect other primates, but is largely specialized on its

human host. Measles can be transmitted from one host to the next by direct contact or

in vapor droplets. The disease causes considerable mortality in malnourished young

children in developing countries while in developed countries case fatality is very low

but epidemics are still a significant public health problem (Anderson and May [1991],

Bjørnstad et al [2002]).

The following cases of measles have been registered on a monthly basis by Centre

for Disease Control (CDC) of China from January 1994 to December 2005.

56

Figure 4.15 Plot of the observed monthly measles cases from January 1994 to December 2005 in China.

As the data shows long term epidemic of measles, the CAIM for long term

epidemic will be fitted to the data.

It+1 = R(t)exp(-r1

t

Iτ

τ=∑ + β

1

t

bτ

τ=∑ ) It

α (4.7)

Here bt is the monthly birth number and R(t) = exp ( -a1D1-a2D2-…-a12D12)where a1,

a2 … a12 are the parameters for each month and D1, D2, …D12 are dummy

variables whose values are either 0 or 1. The results are showed in Figure 4.15 and

4.16.

57

Figure 4.16 Plot of the deterministic realization of the estimated model for the data of Measles in China. The red circle is the observed cases, the blue line is the fitted model and the green shade is the region of 1000 stochastic realizations of the estimated model.

.

Figure 4.17 Plot of the deterministic realization of one-step ahead prediction based on the estimated model for the data of measles in China. The red circle is the observed number of cases, the blue line is the deterministic realization of one-step ahead prediction and the green shade is the region of 1000 stochastic realizations of one-step ahead prediction.

58

It can be seen from Figure 4.16 that Model (4.7) and (4.8) can fit the data of

measles in China well. Figure 4.17 shows that given the first month data, the CAIM

for long term epidemic can produce satisfactory dynamics based on the estimated

model if measles is persistent for a long time, indicating that the CAIM for long term

epidemic is useful in making prediction for such long term epidemic of measles as in

China.

4.4 Concluding remarks

In this thesis, we first proposed the time series-susceptible-infected-recovered

(TSIR) model based on the compartmental SIR mechanism. The validation of TSIR

model was checked by applying it to the data generated by the classic deterministic

model. The result showed that TSIR model performed well.

Next, considering that the number of susceptibles is not obtainable in most cases,

we develop a more practical statistical model, the cumulative alertness infection

model (CAIM) based on the TSIR model, which only requires the reported number of

cases. The parameters in the CAIM have been interpreted and some extensions of

CAIM have been discussed.

Finally, the CAIM was applied to model the real data of several infectious diseases:

foot-and-mouth disease in UK in 2001, SARS in Hong Kong, Singapore and Ontario,

Canada in 2003 and measles in China from 1994 to 2005. Our findings showed that

the CAIM could mimic the dynamics of these diseases reasonably well. The results

indicate that the CAIM may be helpful in making predictions about infectious disease

dynamics.

59

BIBLIOGRAPHY

Anderson, H., Britton, T. (2000) Stochastic Epidemic Models amd Their Statistical

Analysis. Springer-Verlag, New York.

Anderson, R. M., May, R. M. (1991) Infectious Diseases of Humans: Dynamics and

Control. Oxford University Press, New York.

Bailey, N. T. J. (1975) The Mathematical Theory of Infectious Diseases and its

Applications (2nd edn). Charles Griffin & Company Ltd. London.

Bartllet, M. S. (1949) Some evolutionary stochastic processes. Journal of Statistical

Society, 11: 211-229.

Bartllett, M. S.(1955) Stochastic Processes. Cambridge University Press.

Becker, N. G., Britton,T.(1999) Statistical studies of infectious disease incidence.

Journal of Statistical Society, 61: 287-307.

Becker, N. G., Hasofer, A. M. (1998) Estimating the transmission rate for a highly

infectious disease. Biometrics, 54: 730-738.

Benenson, A. S. (1975) Control of communicable diseases in man, (12 th edn).

American Public Health Association, Washington, D.C.

60

Bjørnstad, O. N., Finkenstädt, B. F., Grenfell, B. T. (2002) Dynamics of measles

epidemics: estimating scaling of transmission rates using a time series sir model.

Ecological Monographs, 72(2): 169-184.

Castillo- Chavez, C., Hetchcote, H. W., Anderson, V., Levin, S. A., Liu, W. (1989)

Epidemiological models with age structure, proportionate mixing and cross-

immunity. Journal of Mathematical Biology, 27: 233-258.

Castillo-Chavez, C., Blower, S., Driessche, P. V. D., Kirschner, D., Yakubu, A.

(2002) Mathmatical Approaches for Emerging and Reemerging Infectious Diseases:

An Introduction. Springer-Verlag, New York.

Chowell, G., Fenimore, P. W., Catillo-Garsow, M. A., Castillo-Chavez, C. (2002)

SARS outbreaks in Ontario, Hong Kong and Singapore: the role of diagnosis and

isolation as a control mechanism. Journal of Theoretical Biology, 24: 1-8.

Chowell, G., Hengartner, N. W., Castillo-Chavez, C., P. W., Catillo-Garsow, Hyman.

J. M. (2004) The basic reproductive number of Ebola and the effects of public health

measures: the cases of Congo and Uganda. Journal of Theoretical Biology, 229: 119-

126.

Christie, A. B. (1974) Infectious diseases: epidemiology and clinical practice.

Chuchill Livingstone, London.

61

Daley, D. J., Gani, J. (1999) Epidemic Modelling: An Introduction. Cambridge

University Press.

Diekmann, O., Heesterbeek, J. A. P. (2000) Mathematical Epidemiology of Infectious

Diseases: Model Building, Analysis and Interpretation. John Wiley & Sons,

Chichester.

Dietz, K. (1979) Epidemiologic interference of virus populations. Journal of

Mathematical Biology, 8: 291-300.

Dietz, K., Schenzle, D. (1985) Proportionate mixing models for age-dependent

infection transmission. Journal of Mathematical Biology, 22: 117-120.

Ellner, S. P., Seifu, Y., Smith, R. H.(2002) Fitting population dynamic models to

time-series data by gradient matching. Ecology, 83: 2256-2270.

Farrington, C. P., Kanaan, M. N. (2001) Estimation of the basic reproduction number

for infectious diseases from age-stratified serological survey data. Journal of Applied

Statistics, 50: 251-292.

Feng, Z, Thieme, R. H. (1995) Multi-annual outbraks of childhood disease revisited:

The impact of isolation. Journal of Mathematical Bioscience, 128: 93-130.

Fenner, F., White, D. O. (1970) Medical virology. Academic, New York.

62

Finkenstädt, B. F., Grenfell, B. T. (2000) Time series modeling of childhood diseases:

a dynamical systems approach. Journal of Applied Statistics, 49: 187-205.

Grenfell, B., Dobson, A. (1995) Ecology of infectious diseases in natural populations.

Cambridge University Press.

Hamer, W. H.(1906) Epidemic disease in England. The Lancet, i: 733-739.

Hethcote, H. W. (1976) Qualitative analysis for communicable disease models.

Journal of Mathematical Bioscience, 28: 335-356.

Hethcote, H. W., Van Ark, J. W. (1987) Epidemiological models for heterogeneous

populations: Proportionate mixing, parameter estimation and immunization programs.

Journal of Mathematical Bioscience, 84: 85-118.

Hethcote, H.W. (2000) The mathematics of infectious diseases. Siam Review, 42:

599-653.

Isham, V., Medley, G. (1996) Models for infectious human diseases: Their structure

and relation to data. Cambridge University Press.

Keeling, M. J., Woolhouse, M. E. J., Shaw, D. J., Matthews, L., Chase-Topping, M.,

Haydon, D. T., Cornell, S. J., Kappey, J., Wilesmith, J., Grenfell, B.T. (2001)

Dynamics of the 2001 UK foot and mouth epidemic: stochastic dispersal in a

heterogeneous landscape. Science, 294: 813-817.

63

Kermack, W. O., McKendrick, A. G. (1927) A contribution to the mathematical

theory of epidemics. Proceedings of the Royal Society, A115: 700-721.

Liu, W., Hethcote, H.W., Levin, S. (1987) Dynamic behavior of epidemiological

models with nonlinear incidence rates. Journal of Mathematical Bioscience, 25: 359-

380.

McKendrick, A. G. (1926) Applications of mathematics to medical problems.

Proceedings of Edinburg Mathematical Society, 14: 98-130.

Mollison, D. (1995) Epidemic Models: Their structure and relation to data.

Cambridge University Press.

Naug, D., Smith, B. (2007) Experimentally indiced change in infectious period affects

transmission dynamics in social group. Proceedings of the Royal Society, 274: 61-65.

Ross, R. (1908) Report on the prevention of malaria in Mauritius. London.

Ross, R. (1911) The prevention of malaria(2nd edn). Murray, London.

Ross, R. (1916) An application of the theory of probabilities to the study of a priori

pathometry, I. Proceedings of the Royal Society, A92: 204-230.

64

Ross, R. (1917) An application of the theory of probabilities to the study of a priori

pathometry, II. Proceedings of the Royal Society, A93: 212-225.

Wallinga, J., Teunis, P. (2004) Different epidemic curves for severe acute respiratory

syndrome reveal similar impacts of control measures. American Journal of

Epidemiology, 160: 509-516.

Xia, Y. (2007) A statistical model for the transmission of infectious diseases.

Manustript.

65

APPENDIX

Programme codes

1 Prgramme to produce the realization of deterministic SIR model with different

parameters in (2.3)

% Define the function of the deterministic model

function A = sir(beta, v, N)

options = odeset ('RelTol',1e-5,'AbsTol',1e-5);

[T,Y] = ode45(@sirsys1,[0:1:100],[N 1 0],options,beta,v);

A=[T,Y];

function dy = sirsys1(t,y,beta,v)

dy = zeros (3,1);

dy(1)=-beta*y(1)*y(2);

dy(2)=beta*y(1)*y(2)-v*y(2);

dy(3)=v*y(2);

% Produce the plots

A=SIR(0.005, 0.1, 100);

plot(A(:,1),A(:,3))

xlabel('time')

ylabel('SIR: number of cases')

A=SIR(0.005, 0.1, 100);

66

plot(A(:,1),A(:,2))

xlabel('time')

ylabel('SIR: number of susceptibless')

A=SIR(0.015, 0.1, 100);

plot(A(:,1),A(:,3))

xlabel('time')


A=SIR(0.015, 0.1, 100);

plot(A(:,1),A(:,2))

xlabel('time')



different R0 in (2.5.1)





A=[T,Y];


dy = zeros (3,1);

dy(1)=-beta*y(1)*y(2);

dy(2)=beta*y(1)*y(2)-v*y(2);

67

dy(3)=v*y(2);

% Produce the plots

R0 = 0.8;

v = 0.1;

N =100;

beta = R0 * v /N;

A=SIR(beta, v, N);

plot(A(:,1),A(:,3))

xlabel('time')


R0 = 1.5;

v = 0.1;

N =100;

beta = R0 * v /N;

A=SIR(beta, v, N);

plot(A(:,1),A(:,3))

xlabel('time')


R0 = 5;

v = 0.1;

N =100;

beta = R0 * v /N;

68

A=SIR(beta, v, N);

plot(A(:,1),A(:,3))

xlabel('time')


3 Programme to check the validation of TSIR model in (3.1.1)





A=[T,Y];


dy = zeros (3,1);

dy(1)=-beta*y(1)*y(2);

dy(2)=beta*y(1)*y(2)-v*y(2);

dy(3)=v*y(2);

% Estimate the coefficients of TSIR model

R0=2.0; % or R0 = 7

N=5000000;

gamma=1/5; % or gamma = 1/10

l1=size(gamma);

l=l1(2);

beta=gamma*R0/N;

k1=size(beta);

69

k=k1(2);

c=zeros(l,k);

r=zeros(l,k);

alpha=zeros(l,k);

tol = 1e-35;

maxit = 25;

[beta_final,gamma_final]=meshgrid(beta,gamma);

for i = 1:l

for j = 1:k

A=SIR(beta_final(i,j),gamma_final(i,j),N);

a=size(A);

n=a(1);

check=[A(1:(n-1),2)-A(2:n,2) (1:1:(n-1))'];

check1=check(check(:,1)<0,:);

if(sum(check1(:,1))==0)

new.n=n;

else

new.n=min(check1(:,2));

end

B=A(1:new.n,2);

n=length(B);

split=1;

n1=floor(n/split);

output=sum(B(1:split));

for i1 = 2:n1

70

out=sum(B((split*(i1-1)+1):(split*i1)));

output=[output;out];

end

if((n/split)~=n1)

out=mean(B((split*n1+1):n))*split;


end

n=length(output);

b=log(output(2:(n-1))-output(3:n));

tmp_A=ones(n-2,1);

tmp_A1=log(output(1:(n-2)));

tmp_A2=log(output(1:(n-2))-output(2:(n-1)));

A1=[tmp_A tmp_A1 tmp_A2];

[x,flag,relres,iter,resvec,lsvec]=lsqr(A1,b,tol,maxit);

c(i,j)=exp(x(1));

r(i,j)=x(2);

alpha(i,j)=x(3);

end

end

% Produce the plot to compare TSIR model and SIR model

simu_n1=size(b);

simu_n=simu_n1(1);

simu_A1=tmp_A1;

simu_A2=tmp_A2;

71

for i = 1:simu_n

simu_b(i)=log(c)+simu_A1(i)*r+simu_A2(i)*alpha;

if(i~=simu_n)

simu_A2(i+1)=log(exp(simu_b(i)));

simu_A1(i+1)=log(exp(simu_A1(i))-exp(simu_A2(i)));

end

end

LOG_INFECTIVENESS_T1=simu_b;

F=(LOG_INFECTIVENESS_T1)';

INFECTIVENESS=exp(F);

INFECTIVENESS=INFECTIVENESS';

T1=length(INFECTIVENESS);

T=(1:1:(T1-1));

T=T';

INFECTIVENESS2=exp(tmp_A2);

t1=length(INFECTIVENESS2);

t2=(1:1:(t1-1));

t2=t2';

Q=length(INFECTIVENESS2)-1;

plot(T(1:Q),INFECTIVENESS(1:Q)',t2,INFECTIVENESS2(2:t1),'*')

xlabel(‘time’)

ylabel(‘It’)

72

4 Programme to investigate the relationship between the parameters in TSIR

model and the deterministic SIR model in (3.12)





A=[T,Y];


dy = zeros (3,1);

dy(1)=-beta*y(1)*y(2);

dy(2)=beta*y(1)*y(2)-v*y(2);

dy(3)=v*y(2);


R0=2:0.1:20;

N=5000000;

gamma=1/3; % or gamma =1/5, 1/7, 1/10

l1=size(gamma);

l=l1(2);

beta=gamma*R0/N;

k1=size(beta);

k=k1(2);

c3=zeros(l,k); % or c5, c7, c10 corresponding to gamma =1/5, 1/7, 1/10

r3=zeros(l,k); % or r5, r7, r10 corresponding to gamma =1/5, 1/7, 1/10

alpha3=zeros(l,k); % alpha5, alpha7, alpha10 corresponding to gamma =1/5,1/7, 1/10

73

tol = 1e-35;

maxit = 25;


for i = 1:l

for j = 1:k


a=size(A);

n=a(1);

check=[A(1:(n-1),2)-A(2:n,2) (1:1:(n-1))'];



new.n=n;

else


end

B=A(1:new.n,2);

n=length(B);

split=1;

n1=floor(n/split);


for i1 = 2:n1



end

74

if((n/split)~=n1)



end

n=length(output);


tmp_A=ones(n-2,1);





c3(i,j)=exp(x(1)); % c5, c7, c10 corresponding to gamma =1/5, 1/7, 1/10

r3(i,j)=x(2); % r5, r7, r10 corresponding to gamma =1/5, 1/7, 1/10

alpha3(i,j)=x(3); % alpha5, alpha7, alpha10 corresponding to gamma =1/5, 1/7, 1/10

end

end

% produce the plots to show the relationship between R0 and the three parameters in

TSIR models

plot (R0, c3, ‘*’, R0, c5, ‘*’, R0, c7, ‘*’, R0, c10, ‘*’)

xlabel(‘R0’)

ylabel(‘c’)

plot (R0, r3, ‘*’, R0, r5, ‘*’, R0, r7, ‘*’, R0, r10, ‘*’)

xlabel(‘R0’)

ylabel(‘r’)

75

plot (R0, alpha3, ‘*’, R0, alpha5, ‘*’, R0, alpha7, ‘*’, R0, alpha10, ‘*’)

xlabel(‘R0’)

ylabel(‘alpha’)

5 Programme to check the impact of aggregated data on the TSIR model fit in

(3.1.3)





A=[T,Y];


dy = zeros (3,1);

dy(1)=-beta*y(1)*y(2);

dy(2)=beta*y(1)*y(2)-v*y(2);

dy(3)=v*y(2);


R0=2.0;

N=5000000;

gamma=1/5; % or gamma = 1/10

l1=size(gamma);

l=l1(2);

beta=gamma*R0/N;

k1=size(beta);

76

k=k1(2);

c=zeros(l,k);

r=zeros(l,k);

alpha=zeros(l,k);

tol = 1e-35;

maxit = 25;


for i = 1:l

for j = 1:k


a=size(A);

n=a(1);

check=[A(1:(n-1),2)-A(2:n,2) (1:1:(n-1))'];



new.n=n;

else


end

B=A(1:new.n,2);

n=length(B);

split=1; % or split = 3, 5, 7, 10,12 corresponding to the diferent aggregated data

n1=floor(n/split);


for i1 = 2:n1

77



end

if((n/split)~=n1)



end

n=length(output);


tmp_A=ones(n-2,1);





c(i,j)=exp(x(1));

r(i,j)=x(2);

alpha(i,j)=x(3);

end

end

% Produce the plot to compare TSIR model and SIR model

simu_n1=size(b);

simu_n=simu_n1(1);

simu_A1=tmp_A1;

simu_A2=tmp_A2;

78

for i = 1:simu_n

simu_b(i)=log(c)+simu_A1(i)*r+simu_A2(i)*alpha;

if(i~=simu_n)

simu_A2(i+1)=log(exp(simu_b(i)));

simu_A1(i+1)=log(exp(simu_A1(i))-exp(simu_A2(i)));

end

end

LOG_INFECTIVENESS_T1=simu_b;

F=(LOG_INFECTIVENESS_T1)';

INFECTIVENESS=exp(F);

INFECTIVENESS=INFECTIVENESS';

T1=length(INFECTIVENESS);

T=(1:1:(T1-1));

T=T';

INFECTIVENESS2=exp(tmp_A2);

t1=length(INFECTIVENESS2);

t2=(1:1:(t1-1));

t2=t2';

Q=length(INFECTIVENESS2)-1;

plot(T(1:Q),INFECTIVENESS(1:Q)',t2,INFECTIVENESS2(2:t1),'*')

xlabel(‘time’)

ylabel(‘It’)

6 Programme to investigate the relationship between the three parameters in

CAIM and R0 in (3.2.1)

79





A=[T,Y];


dy = zeros (3,1);

dy(1)=-beta*y(1)*y(2);

dy(2)=beta*y(1)*y(2)-v*y(2);

dy(3)=v*y(2);


R0=2:0.1:20;

N=5000000;

gamma=1/3; % or gamma =1/5, 1/7, 1/10

l1=size(gamma);

l=l1(2);

beta=gamma*R0/N;

k1=size(beta);

k=k1(2);

R3=zeros(l,k); % or R5, R7, R10 corresponding to gamma =1/5, 1/7, 1/10

r3=zeros(l,k); % or r5, r7, r10 corresponding to gamma =1/5, 1/7, 1/10

alpha3=zeros(l,k); %alpha5, alpha7, alpha10 corresponding to gamma =1/5, 1/7, 1/10

tol = 1e-35;

maxit = 25;

80


for i = 1:l

for j = 1:k


a=size(A);

n=a(1);

check=[A(1:(n-1),2)-A(2:n,2) (1:1:(n-1))'];



new.n=n;

else


end

B=A(1:new.n,2);

n=length(B);

split=1;

n1=floor(n/split);


for i1 = 2:n1



end

if((n/split)~=n1)



81

end

n=length(output);


tmp_A=ones(n-2,1);

tmp_A1=N-output(1:(n-2));




R3(i,j)=exp(x(1)); % R5, R7, R10 corresponding to gamma =1/5, 1/7, 1/10

r3(i,j)=x(2); % r5, r7, r10 corresponding to gamma =1/5, 1/7, 1/10

alpha3(i,j)=x(3); % alpha5, alpha7, alpha10 corresponding to gamma =1/5, 1/7, 1/10

end

end

% produce the plots to show the relationship between R0 and the three parameters in

TSIR models

plot (R0, R3, ‘*’, R0, R5, ‘*’, R0, R7, ‘*’, R0, R10, ‘*’)

xlabel(‘R0’)

ylabel(‘c’)

plot (R0, r3, ‘*’, R0, r5, ‘*’, R0, r7, ‘*’, R0, r10, ‘*’)

xlabel(‘R0’)

ylabel(‘r’)

plot (R0, alpha3, ‘*’, R0, alpha5, ‘*’, R0, alpha7, ‘*’, R0, alpha10, ‘*’)

xlabel(‘R0’)

ylabel(‘alpha’)

82

7 Programme to apply CAIM to fit 2001 FMD data in UK in (4.1)

% Produce the plot for original FMD data

orig_data1=read.csv("D:\\FMD_orig.csv")

orig_data=orig_data1[,1]

x=seq(7,170,by=7)

x1=1:170

plot(x1,orig_data,type="p",col="Red",pch="*",xlab="Day",ylab="Number of FMD

Case")

% Produce the deterministic and stochastic realizations of the estimated CAIM for

FMD data

data=read.csv("D:\\footmouth.csv")

data1=data

data[,2]=log(data[,2])


data2=data

model=lm(It1~It+cul1,data)

coefficients=model$coefficients

predicted.A=predict.lm(model,data)

predicted.A1=predicted.A

n=length(predicted.A)


Case")

for(j in 1:1000)

{

83


for(i in 1:n)

{

predicted.A[i]=rpois(1,exp(predicted.A[i]))

}

lines(x,predicted.A/7,col="Green")

}

lines(x,exp(predicted.A1)/7,col="Blue")

points(x1,orig_data,col="Red",pch="*")

% Produce the deterministic and stochastic realizations of one-step ahead prediction

based on the estimated CAIM for FMD data

predicted.B=data$It1

predicted.B1=predicted.B


Case")

for(j in 1:1000)

{

data=data2

for(i in 1:n)

{

data[i,3]=predict.lm(model,data[i,])

predicted.B1[i]=data[i,3]

predicted.B[i]=rpois(1,exp(predicted.B1[i]))

if(i!=n)

84

{

data[i+1,2]=data[i,3]

data[i+1,4]=data[i,4]+exp(data[i,3])

}

}

lines(x,predicted.B/7,col="Green")

}


lines(x,exp(predicted.B1)/7,col="Blue")

8 Programme to apply CAIM to fit 2003 SARS data in Hong Kong in (4.2.1)

% Produce the plot for original SARS data in Hong Kong

orig_data1=read.csv("D:\\hk_orig.csv")

orig_data=c(10,orig_data1[,1])

x=seq(3,92,by=3)

x1=1:92

plot(x1,orig_data,type="p",col="Red",pch="*",xlab="Day",ylab="Number of SARS

Case")


FMD data

data=read.csv("D:\\HK.csv")

data1=data



model=lm(It1~It+cul,data)

85






Case")

for(j in 1:1000)

{


for(i in 1:n)

{


}


}




based on the estimated CAIM for FMD data




Case")

for(j in 1:1000)

86

{

data=data2

for(i in 1:n)

{




if(i!=n)

{



}

}


}



9 Programme to apply CAIM to fit 2003 SARS data in Singapore in (4.2.2)

% Produce the plot for original SARS data in Singapore

orig_data1=read.csv("D:\\singapore_orig.csv")


x=seq(3,67,by=3)

x1=1:67

87


Case")


SARS data in Singapore

data=read.csv("D:\\singapore.csv")

data1=data



data2=data

model=lm(It1~It+cul1+cul2,data)






Case")

for(j in 1:1000)

{


for(i in 1:n)

{


}


88

}




based on the estimated CAIM for SARS data in Singapore




Case")

for(j in 1:1000)

{

data=data2

for(i in 1:n)

{




if(i!=n)

{


if(i<6)

{


}else{

89

# data[i+1,4]=data[i,4]

data[i+1,4]=0


}

}

}


}



10 Programme to apply CAIM to fit 2003 SARS data in Ontario in (4.2.2)

% Produce the plot for original SARS data in Ontario

orig_data1=read.csv("D:\\ontario_orig.csv")


x=seq(3,121,by=3)

x1=1:121


Case")


SARS data in Ontario

data=read.csv("D:\\ontario.csv")

data1=data


90


data2=data

model=lm(It1~It+cul1+cul2,data)






Case")

for(j in 1:1000)

{


for(i in 1:n)

{


}


}




based on the estimated CAIM for SARS data in Ontario



91


Case")

for(j in 1:1000)

{

data=data2

for(i in 1:n)

{




if(i!=n)

{


if(i<24)

{


}else{

# data[i+1,4]=data[i,4]

data[i+1,4]=0


}

}

}


}

92



11 Programme to apply CAIM to fit measles data in China in (4.3)

% Produce the plot for original measles data in China

A = read.csv("D:\\measles.csv",header=TRUE)

data = A[,1:17]

plot(exp(data$It1),type="b",col="Red",xlab="Month",ylab="Number of Measles

case")


measles data in China

A = read.csv("D:\\measles.csv",header=TRUE)

data = A[,1:17]



model=lm(It1~c1+c2+c3+c4+c5+c6+c7+c8+c9+c10+c11+c12+culIt+culbt+It-1,data)



plot(exp(data$It1),type="b",col="Red",xlab="Month",ylab="Number of Measles

case")

lines(exp(predicted.A),col="Blue")

for(j in 1:1000)

{

for(i in 1:144)

93

{


temp=exp(data[i,1])

data[i,1]=data[i,1]+log(rpois(1,temp))-log(temp)

predicted.B[i]=data[i,1]

}

lines(exp(predicted.B),col="Green")

}

points(exp(data1$It1),col="Red")

lines(exp(predicted.A),col="Blue")


based on the estimated CAIM for measles data in China

plot(exp(data1$It1),type="p",col="Red",xlab="Month",ylab="Number of Measles

case")

for(j in 1:1000)

{

for(i in 1:144)

{


temp=exp(data[i,1])

data[i,1]=data[i,1]+log(rpois(1,temp))-log(temp)


if(i!=144)

{

94



}

}

lines(exp(predicted.B),col="Green")

}

for(i in 1:n)

{



if(i!=144)

{



}

}

points(exp(data1$It1),col="Red")

lines(exp(predicted.B),col="Blue")

95

Download - A STATISTICAL MODEL FOR THE TRANSMISSION OF INFECTIOUS DISEASES · 2018-01-09 · Infectious diseases are the diseases which can be transmitted from one host of humans or animals

Top Related