
Time series analysis in road safety research using state space methods

Frits Bijleveld


SWOV Dissertation Series (SWOV-Dissertatiereeks), Leidschendam, the Netherlands.

Previously published in this series:

Jolieke Mesken (2006). Determinants and consequences of drivers emotions.

Ragnhild Davidse (2007). Assisting the older driver: Intersection design and in car devices to improve the safety of the older driver.

Maura Houtenbos (2008). Expecting the unexpected. A study of interactive driving behaviour at intersections.

This dissertation was produced partly with the support of the Stichting Wetenschappelijk Onderzoek Verkeersveiligheid SWOV (SWOV Institute for Road Safety Research).

Publisher:

Stichting Wetenschappelijk Onderzoek Verkeersveiligheid SWOV

Postbus 1090

2262 AR Leidschendam

E: [email protected]

I: www.swov.nl

ISBN: 978-90-73946-04-0

© 2008 Frits Bijleveld

All rights reserved. No part of this publication may be reproduced, stored or made public in any form whatsoever without the prior written permission of the author.

VRIJE UNIVERSITEIT

Time series analysis in road safety research using state space methods

ACADEMIC DISSERTATION

for the degree of Doctor at the Vrije Universiteit Amsterdam, by authority of the rector magnificus prof.dr. L.M. Bouter, to be defended in public before the doctoral committee of the Faculty of Economics and Business Administration on Tuesday 4 November 2008 at 15:45 in the auditorium of the university, De Boelelaan 1105.

by

Frederik Deodaat Bijleveld

born in Voorburg

supervisor (promotor): prof.dr. S.J. Koopman

co-supervisor (copromotor): prof.dr.ir. C.A.G.M. van Montfort

Contents

1. Introduction
1.1. The main ideas of this research
1.2. Important issues in time series analysis of road safety data
1.2.1. Time dependence
1.2.2. Multiple road safety outcomes
1.2.3. Exposure data
1.2.4. Explanatory variables
1.2.5. Conclusions
1.3. Structure of this thesis

2. Safety, exposure and risk
2.1. Introduction
2.2. Risk exposure in road safety analysis
2.2.1. Statistical distributions
2.2.2. The distribution of accident counts
2.2.3. Over-dispersion
2.2.4. Gaussian approximations
2.2.5. The distribution of victim counts
2.2.6. The relation between trials and exposure
2.3. Traffic volume and accident occurrence
2.3.1. The relation between ‘traffic volume’ and the number of accidents
2.3.2. A remark on traffic volume and multiparty accident occurrence
2.4. Summary and discussion

3. Multivariate structural time series models
3.1. Introduction
3.2. The concept of state and its observation
3.3. The latent risk time series model
3.3.1. A basic latent risk observation model
3.3.2. The role of the dynamic relation among states
3.3.3. Specification by means of linear structural models
3.3.4. Linear measurement equations
3.3.5. General state space model specification
3.3.6. Estimation of parameters and latent factors, missing data
3.3.7. Kalman smoother, auxiliary residuals
3.3.8. Diagnostic checking
3.4. Applications
3.4.1. State space DRAG-similar models
3.4.2. Estimating the registration level of accidents involving hospitalised victims
3.5. Non linear extensions
3.5.1. Introduction
3.5.2. Mixing additive and multiplicative models
3.5.3. Further generalisations

4. The covariance between the number of accidents and victims
4.1. Introduction
4.1.1. The need for multivariate modelling of influences on road safety
4.1.2. The issue of dependence among outcomes
4.1.3. An approximating solution
4.1.4. Overview of the paper
4.2. The covariance structure of road safety related outcomes
4.2.1. Introduction
4.2.2. Results
4.3. Simulation studies
4.4. Examples
4.4.1. The mortality ratio
4.4.2. Multivariate state space modelling and the Kalman filter
4.4.3. The relative error of the variance estimate of the logarithm of a Poisson distributed random variable
4.5. Conclusions

5. Model-based measurement of latent risk in time series
5.1. Introduction
5.2. The statistical framework
5.3. Case I: a two-dimensional insurance LRT model
5.4. Case II: a three-dimensional credit card LRT model
5.5. Case III: a multiple exposure LRT model
5.6. Conclusions

6. Multivariate nonlinear time series modelling of exposure and risk in road safety research
6.1. Introduction
6.2. Data description
6.3. The multivariate nonlinear time series model
6.3.1. Specification of model and assumptions
6.3.2. Unobserved stochastic local linear trend factors
6.3.3. Observation equation
6.3.4. Nonlinear state space model formulation
6.4. Estimation of parameters and latent factors
6.5. Empirical results: estimation and model selection
6.5.1. Parameter estimation results
6.5.2. Signal extraction: trends for exposure and risk
6.5.3. Model fit
6.5.4. External validation
6.6. Implications for road safety research
6.7. Conclusions

7. The likelihood filter: estimation and testing
7.1. Introduction
7.2. Maximum likelihood approach to filtering
7.2.1. Gaussian maximum likelihood approach to filtering
7.2.2. General maximum likelihood approach to filtering
7.3. Laplace approximation of the likelihood
7.4. Simulation studies
7.5. Applications
7.5.1. Volatility: pound/dollar daily exchange rates
7.5.2. The effects of precipitation on road safety
7.5.3. Conclusions
7.6. Discussion and conclusions

8. Conclusions

References
Author index
Appendix A.
Appendix B.
Appendix C.
Summary in Dutch (Samenvatting)
Acknowledgements (Dankwoord)

1. Introduction

1.1. A short description of the main ideas of this research

In this thesis we present a comprehensive study of novel time series models for aggregated road safety data. The models are mainly intended for analysing the development of indicators relevant to road safety, with a particular focus on how to measure them. Such developments may need to be related to or explained by external influences. It is also possible to make forecasts using the models. Relevant indicators include the number of persons killed per month or year. These statistics are closely watched by government agencies and the public, and their relevance to society is not disputed. A large body of research is devoted to the improvement of road safety. To that end, researchers often attempt to explain changes in the number of accidents or victims by (changes in) factors such as exposure, policy, driving under the influence of alcohol, and speeding.

Some factors such as policy changes can be directly observed (although com-

pliance with policy and law may not). Other factors can be observed in theory

but in practice their measurement is either difficult or very expensive. Exam-

ples of such factors are exposure, which is measured using surveys and vehicle

counting systems, and percentage of drivers exceeding the legal blood alcohol

concentration limit, which is measured using road side surveys. Finally, some

factors are even harder to observe such as driver skill or experience.

The methodology introduced in this thesis is designed to address potential inaccuracies in the data, both in dependent variables and in explanatory variables. The methodology also addresses the potentially multivariate nature of road safety analysis problems, which arises from multiple dependent road safety outcomes such as the number of accidents and the number of victims. The first aspect results in non-homogeneous observation error variances and the need for a multivariate approach to modelling. The second aspect introduces structural but time-varying covariance among (multivariate) observation errors.

Both issues are accounted for by readily available statistical techniques derived from the Kalman filter (Kalman, 1960). In this thesis a special form of the model of Kalman (1960), referred to as a structural time series model, is further developed. Structural time series models originate from Muth (1960), were made popular by Harvey (1983), and were applied in multivariate form by Harvey and Koopman (1997). A special form of the latter model, designed for road safety risk analysis, is developed in this thesis and was published as Bijleveld, Commandeur, Gould, and Koopman (2008). This model is combined with an approach to estimating the structural covariance among accident related data in Chapter 4, which was published in Bijleveld (2005).

Structural time series models were first applied in road safety analysis by Harvey and Durbin (1986), who evaluated the consequences of the introduction of the seat belt law in the United Kingdom in 1983. The same methodology was later applied to a change in seat belt use in West Germany by Ernst and Bruning (1990) and to a re-analysis of the introduction of the seat belt law in the Netherlands by Bos and Bijleveld (1991). Other applications in road safety analysis based on this method are by Lassarre (2001), Scuffham and Langley (2002), and COST329 (2004). In recent PhD theses the method is applied by Scuffham (1998), Christens (2003), Gould (2005) and Van den Bossche (2006).

Given the fact that time series are analysed, the choice for structural time series

models was mainly made because the time series can then be decomposed into

interpretable components. This allows for the interpretation of risk and other

developments while such developments are not directly observed.

In addition, estimating interpretable components allows for limited validation of their development, as the interpretable parts should at least show a reasonably plausible development. If additional information pertaining to the development of interpretable components is available, it can be included in the model, which allows the researcher to use as much of the available information as possible. The possibility of a limited form of validation of the results is a substantial advantage of the structural time series approach over more black-box-like alternatives. One such alternative is the class of ARIMA models as applied in Box and Tiao (1975); see also Box and Jenkins (1976) and many textbooks.

The structural approach presented in this thesis allows the researcher to distinguish factors that affect road safety from factors that affect the way road safety is observed. A change in a travel survey, for example, is not likely to change travel patterns; it is more likely to change travel data. Furthermore, it is possible to specify on which component (or components) a particular factor should have an effect according to theory or hypothesis, which can then be verified.

The multivariate approach introduced in this thesis treats traditional dependent variables and variables traditionally regarded as explanatory variables simultaneously as dependent variables. A side effect of this approach is an additional benefit: a regression coefficient associated with the relation between an explanatory variable and a dependent variable can be absorbed in the model.

The special case where exposure is the explanatory variable is given prominent attention in this thesis. In the log-linear context, as used in Chapter 3 and Chapter 5, a regression coefficient as described in the handbook by Elvik and Vaa (2004, p. 49) and many other studies is absorbed in the model. The approach of Elvik and Vaa (2004) has the advantage of (approximately) accounting for a nonlinear relation between traffic volume and the number of accidents, as suggested by, for instance, Hauer (1995). However, it has the disadvantage of limiting the comparability of results between models that have different coefficients. The model developed in Chapter 3 and Chapter 5 estimates the development of risk as the number of accidents per vehicle kilometre, which should be comparable between models. Other models, described in Chapter 3, are inspired by and share properties of the DRAG (demande routière, accidents et leur gravité) framework by Gaudry (1984) and Gaudry and Lassarre (2000).
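The remark on comparability can be illustrated with a generic log-linear relation. This is only a sketch: the symbols A_t (number of accidents), V_t (traffic volume), ρ_t (risk) and β (the regression coefficient) are notation assumed here, and the form below is the common power-law specification, not necessarily the exact one used in the cited handbook.

```latex
\[
  \log A_t = \log \rho_t + \beta \log V_t
  \quad\Longleftrightarrow\quad
  \rho_t = \frac{A_t}{V_t^{\beta}} .
\]
% With \beta = 1, \rho_t = A_t / V_t is the number of accidents per vehicle kilometre.
% With \beta \neq 1, the unit of \rho_t depends on \beta, so values of \rho_t from
% models with different \beta are not expressed on the same scale.
```

Under this reading, fixing the coefficient at one is what makes the estimated risk directly interpretable as accidents per vehicle kilometre.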

The first study within the context of this thesis was Bijleveld (1999). The objec-

tive of Bijleveld (1999) was to improve the reliability of short-term prognosis

of general road safety outcomes, to be used as part of an annual review of the

development of road safety in the Netherlands. Specifically, such prognoses

were intended to be used to determine whether or not road safety outcomes in

the reviewed year were in line with what could be expected from road safety

developments just before that year. A comprehensive analysis of changes in

the development of road safety related indicators could help the road safety

researcher detect recent general changes in road safety conditions, if any. After

Bijleveld (1999), the objective was extended to the analysis of the development

of aspects of road safety in general, resulting in this thesis.

A primitive form of the model was developed during work on the COST329 (2004) report in the second half of the 1990s. Also of importance was the simplicity of the implementation of the EM algorithm (Expectation Maximisation, e.g. Dempster, Laird, and Rubin, 1977; McLachlan and Krishnan, 1997) for state space estimation found in Fahrmeir and Tutz (1994) and others, which easily allowed for a general multivariate implementation of the approach taken by Harvey and Durbin (1986). The final publication of COST329 (2004) was delayed, and as a result the approach was first published in Bijleveld (1999).

The results presented in this thesis are aimed at providing better and statis-

tically more reliable options for time series analysis of road safety data. The


analyses performed in this thesis are not intended to answer specific road

safety questions, but are intended to illustrate the application of the methods

introduced in this thesis.

1.2. Important issues in time series analysis of road safety data

In this section four central issues involved in time series analysis of road safety

data are presented: time dependence, multiple road safety outcomes, exposure

data, and other explanatory variables.

1.2.1. Time dependence

When a specific condition in road traffic suddenly changes at a certain time

point, it is often to be determined whether (or not) a relevant road safety indi-

cator changed at about the same time point. The opposite also occurs: when

a specific road safety indicator changed at a certain time point, it is often to

be determined whether (or not) a relevant road traffic condition changed at

about the same time point. A classical approach to statistical analysis in this

situation would be to select a type of accident that should be affected by the

change (which is called the experimental group), and a type of accident that

should not be affected by the change (which is called the control group). Then

both accident counts for a period before and after the change are compared in

a 2×2 table:

Count                 before   after

experimental group    e_b      e_a

control group         c_b      c_a

In a typical before/after study, the rate before, e_b/c_b, and the rate after, e_a/c_a, are compared. It has to be assumed that the rate would have remained constant if the condition in road traffic had not changed. If the rate e/c was steadily decreasing, e_b/c_b would be larger than e_a/c_a for that reason alone, and the drop could be falsely attributed to the sudden change in road traffic conditions. There are numerous reasons why the rate could change with time. For instance, when the experimental group is moped victims and the control group is bicycle victims, the rate will change when bicycles become preferred over mopeds for travel. It is therefore wise to determine the rate e/c for a number of periods before and after the change, and to verify that the rate is reasonably constant within the before period and within the after period, before a change in the rate is attributed to the change in road traffic conditions. If this analysis is performed, and a series of rates e/c is available for a period of time, it is also wise to determine whether the drop in the rate occurred at about the time of the change in road traffic conditions. If this is not the case, some other influence may have caused the drop (a possibility that can never be excluded). Furthermore, it can be determined whether or not the change in the rate is exceptional. If similar drops in the rate occur regularly and cannot be explained, there is no reason to assume that this particular drop was caused by the change in road traffic conditions while the others are considered coincidental.
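The pitfall described above can be made concrete with a small numerical sketch. All counts are hypothetical and chosen only for illustration: the control group is constant, the rate e/c declines steadily, and no intervention takes place at any time.

```python
import numpy as np

# Hypothetical yearly counts; NO intervention occurs in this example.
years = np.arange(10)                 # 5 "before" years and 5 "after" years
control = np.full(10, 1000.0)         # c: control-group accidents per year
rate = 0.50 - 0.02 * years            # e/c declines by 0.02 every year
experimental = rate * control         # e: experimental-group accidents per year

before, after = years < 5, years >= 5

# The 2x2 table: one total per group and per period.
e_b, e_a = experimental[before].sum(), experimental[after].sum()
c_b, c_a = control[before].sum(), control[after].sum()

rate_before = e_b / c_b               # 0.46
rate_after = e_a / c_a                # 0.36
ratio = rate_after / rate_before      # below 1, which looks like an "effect"

# The yearly rates tell a different story: every year-to-year change is the
# same, so there is no jump at the supposed moment of the intervention.
yearly_steps = np.diff(experimental / control)
print(rate_before, rate_after, ratio, yearly_steps)
```

The 2×2 comparison alone suggests a drop of more than 20 percent, while the series of yearly rates shows nothing but the pre-existing trend.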

The analysis steps described above are regularly performed in time series analysis. In the first step a trend is determined; in the second and third steps a so-called structural break is identified, both its location (where and when it occurred) and whether it is significant.

For this reason alone it can be suggested to perform a more elaborate time

series analysis than a before/after study, which itself is a rudimentary analysis

of time ordered data, with just two time points. More reasons can be suggested

to make this choice.

When a specific condition in road traffic does not change suddenly but changes

gradually it is not trivial to use a before/after study. In such situations (time

series) regression analysis is currently most often applied.

There are other ways in which time dependence may affect the analysis of road

safety data. For example, time dependence implies some structure among ob-

servations. There is sufficient reason to at least consider time dependence in

road safety analysis. If data collected over a longer period of time are con-

sidered, the general road safety situation is likely to have changed as, among

other conditions, road and vehicle design may have improved. If this is the

case, observations close in time will resemble each other more than observa-

tions further apart in time. This phenomenon is reflected in the development

of many road safety related features like the number of fatally injured victims

in road accidents in Figure 1.1. The road safety situation in 1970 will say lit-

tle about the road safety situation in 2000, while the road safety situation in

2007 may give a rather accurate idea of what the road safety situation in 2008

probably will be.

[Figure 1.1. The development of the number of police recorded fatally injured victims in road accidents (1950–2000) in the Netherlands. Source: CBS (2000). Horizontal axis: year, 1950 to 2000; vertical axis: number of victims, 1000 to 3000.]

Most statistical models require that the difference between the model and data is purely coincidental and that no two differences are related [1]. Technically this means that the so-called disturbances (the differences between the observed values and the predictions by the true model, which are not observed) are required to be independent of each other. Failure to satisfy this requirement may lead to over- or underestimation of model uncertainty, which in turn may lead to statistical tests being too conservative or, worse, not conservative enough. The latter may lead to false positive identification of relationships or interventions in road safety analysis. See, for instance, Scheffé (1967, Chapter 10) for a discussion of violations of assumptions on the disturbances in a linear model, which also covers uniformity of the variance of the disturbances. This potential problem cannot be ignored, and accounting for it is the second way time dependence affects the analysis of road safety data.

Model residuals are differences between observed values and the predicted

values from the estimated model, as depicted at time point “4” on the left

hand side of Figure 1.2. Model residuals are observed in contrast to the distur-

bances. The residuals in this figure are positive for the first two time points, the

next residual is approximately zero, then three residuals are negative, the next

three residuals are positive, the following three residuals are again negative,

etc. Most models require that disturbances are independent of one another,

which roughly speaking means that knowing one residual (which estimates a

disturbance) should not help in predicting the next. In the example shown in

Figure 1.2 the requirement of independence of the disturbances is most likely

violated.

[1] Models exist which require an independent source of error, not necessarily describing the difference between the model and data.

[Figure 1.2. Theoretical development of the number of accidents (hypothetical development is 20 − t/5 + sin(t) for t = 1, …, 20) and a linear regression over the first 16 observations (to the left of the vertical reference line) plus a forecast (to the right of the vertical reference line). The differences between the dots and the (straight) line (to the left of the vertical reference line) are called the model residuals. The differences to the right of the vertical reference line are technically not model residuals, as they were not included in the regression. It can be seen that consecutive residuals tend to share the same sign. Horizontal axis: t, 0 to 20; vertical axis: number of accidents, 16 to 20.]

An example of the first way in which time dependence may affect the analysis of road safety data is correcting for time dependencies in model residuals for (short-term) prognosis. This can be understood from the example development presented in Figure 1.2. It is not uncommon to have a development of the number of accidents similar to Figure 1.2, where there is a linear trend (in this case fixed at 20 − t/5) and some fluctuation around it (for example sin(t)), yielding the function 20 − t/5 + sin(t) for t = 1, …, 20. From Figure 1.2 it is clear that the forecast for t = 17, …, 20 obtained by extending the linear regression line (the straight line in Figure 1.2) can be substantially improved by using the knowledge that the observations follow a pattern of being positioned above and below the regression line. Roughly, this is what considering ‘time dependence’ of model residuals amounts to: accounting for an empirically revealed structure in residuals. In general, the dynamic structure is unknown and, much like in this example, an attempt is made to build a description of it. First a linear trend (or another structure suggested by theory) is fitted. Then the residuals are studied. If the residuals do not reveal a structure, the model may be adequate. If they do, the dynamic structure is adapted. There are a number of approaches to adapting the dynamic structure; one of them is chosen later in this thesis.
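The example of Figure 1.2 can be reproduced in a few lines. The sketch below uses NumPy and two simple diagnostics, the lag-1 autocorrelation and the share of consecutive residuals with equal sign; these diagnostics are chosen here for illustration and are not the diagnostic tools developed later in the thesis.

```python
import numpy as np

t = np.arange(1, 21)                      # t = 1, ..., 20
y = 20 - t / 5 + np.sin(t)                # hypothetical development of Figure 1.2

# Linear regression over the first 16 observations only.
slope, intercept = np.polyfit(t[:16], y[:16], deg=1)
fitted = intercept + slope * t

residuals = (y - fitted)[:16]             # in-sample model residuals
forecast_errors = (y - fitted)[16:]       # t = 17, ..., 20: not residuals in the strict sense

# Lag-1 autocorrelation of the residuals: close to 0 for independent disturbances.
r = residuals - residuals.mean()
lag1 = np.sum(r[1:] * r[:-1]) / np.sum(r * r)

# Share of consecutive residual pairs with the same sign: about 0.5 if independent.
same_sign = np.mean(np.sign(residuals[1:]) == np.sign(residuals[:-1]))

print(f"lag-1 autocorrelation: {lag1:.2f}, same-sign share: {same_sign:.2f}")
print("forecast errors:", np.round(forecast_errors, 2))
```

Both diagnostics lie clearly above the values expected under independence, which is the numerical counterpart of the visual pattern of runs above and below the regression line.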


1.2.2. Multiple road safety outcomes

One important aspect of road safety (time series) analysis is that road safety

cannot be measured unambiguously. There is no unique measure of road

safety. Usually, road safety is measured in terms of the amount of ‘lack of road

safety’, for instance the number of accidents occurring per time unit. Even if

the number of accidents is selected as the measure of road safety, it could still

be all accidents, injury accidents, serious accidents or fatal accidents, or other

types of accidents. But even then the number of victims per accident may be

of interest, as well as the number of fatalities per accident.

It should further be considered that influences on road safety may primarily

affect certain parts of the road safety process. For instance, it is sometimes

claimed (and disputed) that the use of seat belts primarily has an effect on ac-

cident consequences, not on accident occurrence. If it is true that the use of seat

belts primarily has an effect on accident consequences, it would be sufficient

to study the number of victims. Risk adaptation theories (such as Wilde (1994) and Summala and Näätänen (1988)) state that developments that could be expected from theory may be counteracted by behavioural adaptation, in this case possibly increased speeding by drivers. If this is true,

not only the accident consequences in terms of the number of injuries need to

be considered, but also the number of accidents. Even if the original theory is

assumed to be true, it is sensible to study both the development of the number

of accidents and the number of victims.

Assume a study into the effect of the introduction of a seat belt law on road safety is to be conducted. It is possible that in the period in which the seat belt law was introduced, other influences had an effect on road safety. Such influences may have had an effect on the indicators that are considered to be relevant to the safety effect of seat belts. If the effect of the seat belt law is to be determined, one may need to correct for these other influences. Therefore the modelling approach should be able to disentangle multiple effects. These effects may have had an impact on the number of accidents, on the number of victims of a certain type, or on both, and they are best disentangled by modelling accidents and victims simultaneously. However, the number of accidents and the number of victims resulting from these accidents are correlated, and this correlation should be accounted for in the analysis. In summary, the modelling approach should be capable of simultaneously treating at least two dependent variables (in the example above, the number of accidents and the number of victims) and their covariance.
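The correlation between accident and victim counts can be made visible with a small simulation. The distributional choices below (Poisson accident counts, one victim per accident plus a Poisson number of additional victims) are assumptions made only for this sketch; they are not the covariance approach of Chapter 4.

```python
import numpy as np

rng = np.random.default_rng(0)
n_periods = 20_000
mean_accidents = 50.0                      # hypothetical mean number of accidents per period

accidents = rng.poisson(mean_accidents, size=n_periods)

# Assumption: every accident has one victim plus Poisson(0.3) additional victims.
# The sum of n independent Poisson(0.3) draws is Poisson(0.3 * n).
extra_victims = rng.poisson(0.3 * accidents)
victims = accidents + extra_victims

corr = np.corrcoef(accidents, victims)[0, 1]
print(f"correlation between accidents and victims: {corr:.2f}")
```

Under these assumptions the two series are strongly positively correlated, so treating their observation errors as independent would misstate the uncertainty of any joint analysis.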


Another reason to consider multiple road safety outcomes is that although

road safety interventions may be introduced to reduce certain accident out-

comes, they may also – hopefully to a lesser extent – increase certain other

accident outcomes. In general, the accumulated effect of road safety interven-

tions is considered most important as it indicates the net effect to society. In

specific applications the differentiated effect of road safety interventions needs

to be studied, for instance to test hypotheses on theories.

1.2.3. Exposure data

In France more accidents occur in road traffic than in the Netherlands, but

does that necessarily mean that road traffic is safer in the Netherlands than in

France? Is it not the case that France is a much larger country than the Neth-

erlands, and thus has more potential to have accidents in road traffic than the

Netherlands? One would expect an imaginary country twice the size of the Netherlands in every respect, and otherwise completely equal, to have twice as many accidents as the Netherlands. This reasoning is often used to justify using accident

rates in terms of the number of accidents per unit of scale when comparing dif-

ferent entities such as road sections or countries. In this example, the number

of accidents for the imaginary country would be divided by two as the coun-

try potentially has twice as many accidents. The potential to have accidents

(or victims) is generally referred to as exposure in road safety analysis.

Accounting for differences in exposure is not always straightforward. For instance, when comparing the number of fatal accidents in France to that in the Netherlands (a ratio of about 6.5 to 1), the difference in country size (about 552,000 km² for France and about 42,000 km² for the Netherlands, including water surface) could be used to account for differences between the two countries. This would make the Netherlands in this respect less safe than France. Such a figure would ignore differences in land use (notably population density), which could be considered a disadvantage. Alternatively, population size could be used, which was about 61 million for France and about 16 million for the Netherlands in 2007. Using population figures as a measure of exposure would present France as less safe than the Netherlands. A drawback of population figures is that they may not sufficiently account for differences in road use: in a large country like France, the population may have to travel longer distances. To improve on such figures, traffic volume (the number of kilometres or miles driven on the road by vehicles) or travel volume (the number of kilometres or miles travelled on the road by persons) is often used when available. Such figures may better represent the exposure of a country than its size or number of inhabitants.
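The reversal described above follows directly from the approximate figures quoted in the text. The helper `rate_ratio` is a name introduced here for illustration.

```python
# Approximate figures quoted in the text.
fatal_accidents = {"France": 6.5, "Netherlands": 1.0}    # relative numbers (ratio 6.5 to 1)
area_km2 = {"France": 552_000, "Netherlands": 42_000}
population_million = {"France": 61, "Netherlands": 16}   # 2007


def rate_ratio(exposure):
    """Fatal accidents per unit of exposure in France, relative to the Netherlands."""
    france = fatal_accidents["France"] / exposure["France"]
    netherlands = fatal_accidents["Netherlands"] / exposure["Netherlands"]
    return france / netherlands


per_area = rate_ratio(area_km2)               # about 0.49: France appears safer
per_person = rate_ratio(population_million)   # about 1.70: the Netherlands appears safer

print(f"per km2: {per_area:.2f}, per inhabitant: {per_person:.2f}")
```

The same accident counts thus support opposite conclusions depending on the exposure measure, which is why the choice of measure has to follow from the research question.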


It should be noted that, although traffic volume is mostly preferred as a measure of exposure, it is the research question that determines the optimal exposure measure. In practice the researcher does not simply pick any available exposure measure; rather, exposure measures are selected for a specific purpose. The number of fatalities per unit of population per year is sometimes used specifically for comparison with other mortality rates. For similar reasons, the number of victims per unit of population per year can be used for comparison with other incidence rates. When road accidents are compared with work accidents, the time spent in travel is probably the preferred choice.

It should further be noted that, no matter how accurate traffic or travel volume

appears to be measured, such measurements cannot be considered an exact es-

timate of exposure. As some information on traffic or travel volume is obtained

through travel surveys, these data are by nature subject to random error. An-

other reason is that it is not only the amount of travel that is important to road

safety, but also the conditions under which the travel took place.

As an example of the uncertainties concerning traffic volume data, consider the Dutch travel data for mopeds presented in Figure 1.3. In the left hand panel of this figure the number of person kilometres [2] is presented for mopeds, together with the number of police registered accidents with killed or hospitalised victims between mopeds and cars. The grey area depicts the pointwise 95% confidence intervals for the person kilometres. These intervals are based on an estimate of the error due to sampling only (an estimate of the error due to respondents providing erroneous data is not available), so the actual error is likely to be larger. In the right hand panel of Figure 1.3 the relative error based on Slootbeek (1993) and CBS (2003) (solid line, left hand axis) is presented together with a plot of $1/\sqrt{\text{number of trips}}$ (dashed line). This plot reveals that the relative sampling error for total moped travel in 2005 is about 16% (left hand scale). The relative error of moped travel for separate age groups will be substantially larger. In 1994 and 1995 the survey was substantially extended. In 1999/2000, the survey structure changed. Over the last few years the survey size has been reduced while the use of mopeds has also decreased. As a result, the relative accuracy of moped data is at about the same level as it was near the end of the 1980s.
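The link between the number of sampled trips and the relative error can be sketched as follows. The proportionality constant, the normal approximation for the interval, and the estimate of 1.0 billion person kilometres are assumptions made for illustration; only the 16% figure is taken from the text.

```python
import math


def relative_error(n_trips, scale=1.0):
    """Relative sampling error, assumed proportional to 1 / sqrt(number of trips)."""
    return scale / math.sqrt(n_trips)


def interval_95(estimate, rel_err):
    """Pointwise 95% interval under a normal approximation (an assumption)."""
    half_width = 1.96 * rel_err * estimate
    return estimate - half_width, estimate + half_width


# Halving the number of sampled trips inflates the relative error by sqrt(2).
inflation = relative_error(500) / relative_error(1000)

# A 16% relative error applied to a hypothetical estimate of 1.0 billion person km.
low, high = interval_95(1.0, 0.16)

print(f"inflation: {inflation:.3f}, interval: ({low:.3f}, {high:.3f})")
```

With a relative error of 16%, the interval spans roughly 0.69 to 1.31 times the estimate, which shows how little a single year's moped travel figure pins down the underlying exposure.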

[2] Driver kilometre data are not available, but the development of driver kilometres should be similar to the development of passenger kilometres. Moped occupancy appears to be relatively constant based on a moped helmet survey (Ermens and van Vliet, 2006) for 2002–2005, where it was found that on about 11% of the mopeds evaluated a passenger was present.


Figure 1.3. Traffic volume and accident data for mopeds in the Netherlands 1985–2006. Left hand panel, left hand axis, dots: the number of police registered accidents with killed or hospitalised victims between mopeds and cars. Left hand panel, right hand axis, solid line: the number of person kilometres (billion) using mopeds in the Netherlands. The grey area depicts the pointwise 95% sampling confidence intervals for the person kilometres based on Slootbeek (1993). Right hand panel, left hand axis, solid line: relative sampling error in person kilometres based on Slootbeek (1993). Right hand axis, dashed line: $1/\sqrt{\text{number of trips}}$.

In the left hand panel of Figure 1.3, the traffic volume appears to go up and

down by a substantial amount near the end of the 1980s, while the accident

counts seem relatively stable. Ignoring the fact that the traffic volume data

in this case are not accurate, one may conclude that both the traffic volume

and the risk (being the ratio of the number of accidents to the traffic volume)

fluctuated substantially in this period, which was probably not the case.
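The effect described above can be mimicked in a small simulation. The sketch below is only an illustration under stated assumptions: the true risk and the true traffic volume are held constant (both values are hypothetical), the accident count is kept free of noise to isolate the effect, and only the observed traffic volume carries a relative error of 16%, the figure quoted earlier for moped travel in 2005.

```python
import random
import statistics

random.seed(1)

TRUE_RISK = 1000.0      # accidents per billion person kilometres (hypothetical)
TRUE_VOLUME = 1.0       # billion person kilometres (hypothetical, constant)
REL_ERROR = 0.16        # relative sampling error of the volume estimate

accidents = TRUE_RISK * TRUE_VOLUME   # noise-free, to isolate the exposure effect

naive_risks = []
for year in range(1985, 2007):
    # Observed volume: true volume times (1 + zero-mean sampling error).
    observed_volume = TRUE_VOLUME * (1.0 + random.gauss(0.0, REL_ERROR))
    naive_risks.append(accidents / observed_volume)

spread = statistics.stdev(naive_risks) / statistics.mean(naive_risks)
print(f"relative spread of the naive risk: {spread:.2f}")
```

Although nothing changes in the simulated traffic system, the ratio of accidents to observed volume fluctuates from year to year, which is the kind of spurious risk variation the text warns against.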

The topic of exposure is further discussed in Chapter 2, which also discusses

whether exposure affects road safety linearly or nonlinearly, as for instance

argued by Hauer (1995).

1.2.4. Explanatory variables

Besides exposure, the development of road safety can be influenced by devel-

opments in many areas such as road design, vehicle technology, education, de-

mography, weather, economy, etc. Quantitative information on such develop-

ments is regularly obtained from separate research results. The studies which

provide such results can be regularly and consistently conducted surveys, as

is the case with the travel survey in the Netherlands, or population figures ob-

tained from censuses or registers. However, studies may come from different

disciplines, may have different viewpoints, and are often limited by design to

some subsection of the complete road safety field. As road safety time series

analysis typically considers a longer period of time, it is likely that study de-

sign and purpose have changed over time, although such studies generally


still measure the same phenomenon. It is possible that such changes could

influence analysis results, the impact of which should be minimised.

Example: drink driving data

One example of a case where data collection may potentially affect analysis

results is data on the percentage of drivers exceeding the legal blood alcohol

concentration limit (drink driving). It is commonly assumed that drink driv-

ing is a risk increasing factor. When the consequences of drink driving for road

safety are to be determined, it is important to know how many drivers are ac-

tually exceeding the legal blood alcohol concentration limit. In Figure 1.4 the

percentage of car drivers tested to have a Blood Alcohol Concentration (BAC)

larger than 0.5 g/l (0.5 grammes per litre) in the Netherlands is given. The re-

sults are obtained from a number of surveys intermittently conducted during

autumn weekend nights, starting in 1970. This example demonstrates another

case of an important explanatory variable that in general should measure the

same phenomenon (the percentage of drivers exceeding the legal blood alcohol

concentration limit). Due to changes in measurement and scale of the survey,

the series of data is not fully consistent and not systematic in its accuracy. Fur-

thermore, the measurement for one year is distorted, possibly as a result of

the fact that the focus of the survey that year was directed at the introduction

of a new drink driving law. Finally, the studies are justifiably focused on

assessing the worst extent of the problem by measuring drink driving in a pe-

riod, weekend nights, where the percentage of drivers under the influence of

alcohol is expected to be largest. The measure is therefore unlikely to represent

drink driving in general road traffic.

On the first of November 1974 a new law introducing the 0.5 g/l BAC legal

limit became effective in the Netherlands. At the same time, chemical test

tubes for road side testing were introduced. This time point is marked by

the first vertical reference line in Figure 1.4. SWOV (1978) reports that the

measurement for that year (1.5%) was based on the average of observations

specifically taken one weekend immediately before the introduction of the law

(the weekend of 25–27 October 1974, 12% violations) and two larger surveys in

weekends immediately after the introduction of the law (the weekends of 8–10

November and 22–24 November 1974, 1% violating the law). Given the ob-

servation in 1975 and the fact that 12% violations were recorded the weekend

before the introduction of the law (and 15% in 1973), it may not be realistic to

consider the observation of about 1.5% for 1974 as being representative for the

percentage of drivers exceeding the 0.5 g/l BAC limit in the whole of 1974.


Figure 1.4. Percentages of car drivers having a Blood Alcohol Concentration (BAC) exceeding 0.5 g/l in the Netherlands based on surveys taken in the autumn during weekend nights (see Mathijssen, 2004). The survey was not conducted every year. Dots mark available data points.

In 1984 (marked by the second vertical reference line in Figure 1.4) electronic

alcohol breath test devices for selection purposes were introduced (blood tests

were still needed for legal confirmation). Starting in 1985 a gradual change

from selective to random police alcohol controls took place, which changed the

population sampled. As of the first of January 1987 (marked by the third ver-

tical reference line), results of alcohol breath tests could be used for evidential

purposes (in addition to blood sample tests). As of the first of November 1992,

heavier fines for drink-driving were introduced. The survey initially consisted

of about 3,000 observations; by the early 1990s this number had increased to about

15,000, and at the end of the series there are about 30,000 observations. Moreover,

the survey has not been conducted each year. Missing data are interpolated

in Figure 1.4. However, the percentages did not necessarily drop linearly

starting in 1984, the first of three years in which no surveys were conducted

(as is noted by Mathijssen (2004)). If accident occurrence is indeed related to

alcohol use by drivers, a drop in alcohol use by drivers could be reflected by a

drop in accident occurrence. A drop in accident occurrence at a later year may

indicate that alcohol use could have dropped later, but may not be conclusive.

An estimate of the missing values based on the accident development is likely

more reliable than the linear interpolation.

The example concerning drink-driving data as well as the discussion on expo-

sure data suggest that explanatory variables should not be considered at face

value. Each explanatory variable should be carefully considered and weighed.


In both examples the survey size varies over time, effectively meaning that

the accuracy of the data is not the same for all time points. As a result, het-

eroscedasticity among observation errors should be considered.
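To make the consequence of a varying survey size concrete, the following sketch computes the standard error of an observed proportion for the three survey sizes mentioned for the drink driving surveys. It assumes simple random sampling (a simplification) and a fixed, hypothetical share of offenders of 5%. Neither assumption is taken from the surveys themselves.

```python
import math

def proportion_standard_error(p, n):
    """Binomial standard error of an observed proportion p from n observations."""
    return math.sqrt(p * (1.0 - p) / n)

ASSUMED_SHARE = 0.05   # hypothetical share of offenders, held fixed for comparison

for n in (3_000, 15_000, 30_000):   # survey sizes mentioned in the text
    se = proportion_standard_error(ASSUMED_SHARE, n)
    print(f"n = {n:6d}: standard error = {100 * se:.2f} percentage points")
```

A tenfold increase in survey size reduces the standard error by a factor of $\sqrt{10}$, so the observation error variance cannot reasonably be taken as constant over the series.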

Further issues

Apart from the reliability of an explanatory variable, another important issue

to consider is its validity, that is, whether or not it actually represents what it

is supposed to represent. For instance, in the drink driving example, the data

actually refer to autumn weekend nights, not full days. This means that the

scope of the data should be considered. In road safety research one quite often

is forced either not to use an explanatory variable or to assume that the ‘true’

explanatory variable (in this case drink driving on average days) has a ‘similar’

development to the one actually available, or to try and find confirmation of

this assumption from other studies. Exposure data are subject to similar problems.

The exposure data are obtained from household surveys (CBS, 2003; AVV, 2005).

As the sampling units are households3, the persons in the survey

are almost exclusively residents of the Netherlands (but not necessarily Dutch

nationals). This implies that travel data for non-residents of the Netherlands is

not included in the survey, thus the survey does not represent all travel in the

Netherlands.

The scale of studies providing explanatory variables may vary between the mi-

croscopic level – at the level of individual accidents – and the (supra) national

macroscopic level of aggregated data. Generalisations of many such ‘pieces’

of information may be necessary to complete the ‘puzzle’ of road safety. A

microscopic level study may reveal the effect of seat belts on victims, while

macroscopic level studies may establish the effect a law on seat belt use has on

society.

As the type of analysis targeted in this research tends towards macroscopic

(aggregated) level analysis rather than microscopic level analysis, the consequences

of using results from less aggregated studies should be considered. For

instance, while a microscopic level study into the influence of weather on road

safety may reveal that the average temperature explains some variation in acci-

dent counts, the average temperature over a year may not. As a second exam-

ple, Eisenberg (2004, p. 637) finds that “in a typical state-month pair in the US

from 1975 to 2000, increased precipitation is associated with reduced fatal road

traffic crashes. More precisely, an additional 10 cm of rain in a state-month is

associated with a 3.7% decrease in the fatal crash rate”. Later he states: “First,

when the regression analysis is conducted with the state-day, rather than the

3 Actually, addresses are sampled. Some addresses may contain more than one household.


state-month, as the unit of observation, the association between precipitation

and fatal crashes is estimated to be positive and significant, as in the literature.”

(Eisenberg, 2004, p. 637). Eisenberg (2004) continues to explain the importance

of lagged precipitation data in his (daily) model, effectively introducing a time

series model. This shows that different aggregation levels may yield opposite

results.

1.2.5. Conclusions

In this chapter it is demonstrated that travel volume data and data on the per-

centage of drivers exceeding the legal blood alcohol concentration limit (both

derived from surveys) have to be considered as observed under error. How-

ever, it is not just travel or alcohol surveys that are observed under error. Sim-

ilar arguments would hold for data derived from surveys like crash helmet

use on mopeds (Ermens and van Vliet, 2006), and many others. If a variable

is measured under error this means that instead of the true value, by coinci-

dence a different value is used, which can be considered random fluctuation

from the true value. In case of traffic volume data, the true value would be

the number of kilometres driven, while the value actually used would be the

number of kilometres driven based on the randomly selected respondents of

a survey, instead of the entire population. In general, the fluctuations are on

average (expected to be) nil. However, their variance, which is a measure of the

statistical accuracy of the data, is larger than nil.

The issue of the potential random fluctuations in exposure and other explanatory

data is mostly ignored in road safety analysis, probably because often no information

with respect to the statistical accuracy of the data is available. In many

cases, however, the consequences of ignoring statistical inaccuracy of expo-

sure or explanatory data may be negligible compared to other inaccuracies.

Nevertheless, neglecting the statistical accuracy of the data is not always warranted. For

instance, disaggregate traffic volume data (traffic volume data for subgroups)

may be subject to substantially larger sampling errors than aggregate data (as

described in the example on moped travel), up to more than 100% sampling

error. Furthermore, there is no reason not to account for the inaccuracy of the

data when it is possible to do so.

Therefore it is important to consider the possibility of random fluctuations

in the explanatory data as well as random fluctuations in the accident data.

Considering possible random error in explanatory variables as well as in dependent

variables implies an ‘errors-in-variables’ approach (see Seber and

Wild, 1988, Chapter 10). This approach essentially treats explanatory varia-

bles (which are assumed to have error) as dependent variables alongside the


original dependent variables. As a result, models are multivariate in the sense

of multiple dependent variables. Besides the ‘errors-in-variables’ argument,

there are further reasons to consider road safety analysis problems multivar-

iate. It is argued that road safety cannot be measured unambiguously as no

unique measure of road safety is available. Depending on the research ques-

tion road safety can be measured in terms of the number of accidents or vic-

tims, and combinations of these.
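Why random error in an explanatory variable matters can be illustrated with the well-known attenuation effect in simple regression. The sketch below is not the approach developed in this thesis. It only shows, under assumed values for the true effect and the error variances (all hypothetical), what happens when the measurement error is ignored.

```python
import random

random.seed(42)

def ols_slope(x, y):
    """Ordinary least squares slope of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

N = 20_000
BETA = 1.0                   # true effect (hypothetical)
x_true = [random.gauss(0.0, 1.0) for _ in range(N)]
y = [BETA * x + random.gauss(0.0, 0.5) for x in x_true]
# Observed explanatory variable: true value plus zero-mean measurement error.
x_obs = [x + random.gauss(0.0, 1.0) for x in x_true]

slope_true = ols_slope(x_true, y)
slope_obs = ols_slope(x_obs, y)
print(f"slope using true x:     {slope_true:.2f}")
print(f"slope using observed x: {slope_obs:.2f}")
```

With equal variances for the true variable and its measurement error, the estimated slope is expected to shrink to about half the true effect, even though the error itself averages out to nil.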

In this thesis road safety is therefore considered an inherently multivariate problem

that should preferably be analysed accordingly. Furthermore, the consequences

of time dependence should be considered, not only in view of the reliability

of statistical tests, but also in view of making forecasts of future road

safety indicators. It will be demonstrated in Chapter 3 that considering time

dependence allows for an intuitive treatment of missing data as well.

A sufficiently flexible general framework for statistical time series analysis is

already available, based on (derivations of) the Kalman filter (Kalman, 1960).

This framework also handles non-homogeneous observation error variances

in a straightforward manner. In this thesis a special form of the model of Kalman (1960),

called a structural time series model, is further developed into a multivariate

form, specifically designed for road safety risk analysis.
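How a Kalman filter accommodates observation error variances that differ per time point, and how it deals with missing observations, can be sketched for the simplest structural model, the local level model. This is a minimal univariate illustration, not the multivariate model developed in this thesis. The function name, the data, the variances and the large initial variance (used here as a crude stand-in for a diffuse initialisation) are all assumptions.

```python
import math

def local_level_filter(y, obs_var, level_var, a0=0.0, p0=1e7):
    """Kalman filter for y[t] = mu[t] + eps[t], mu[t+1] = mu[t] + xi[t].

    obs_var[t] may differ per time point; y[t] = None marks a missing value.
    Returns the filtered level estimates and their variances."""
    a, p = a0, p0                      # predicted state mean and variance
    levels, variances = [], []
    for obs, h in zip(y, obs_var):
        if obs is not None:
            f = p + h                  # prediction error variance
            k = p / f                  # Kalman gain
            a = a + k * (obs - a)      # update with the new observation
            p = p * (1.0 - k)
        # A missing observation skips the update: the prediction is kept.
        levels.append(a)
        variances.append(p)
        p = p + level_var              # predict the next time point
    return levels, variances

# Hypothetical series with a gap and with less accurate early observations.
y = [10.0, 11.0, None, None, 9.0, 9.5]
obs_var = [4.0, 4.0, 4.0, 4.0, 1.0, 1.0]
levels, variances = local_level_filter(y, obs_var, level_var=0.5)
for t, (m, v) in enumerate(zip(levels, variances)):
    print(f"t={t}: level = {m:.2f}, sd = {math.sqrt(v):.2f}")
```

During the gap the level estimate is carried forward while its variance grows, and the more accurate later observations receive a larger weight than the early ones.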

Given the fact that time series are analysed, the choice for structural time series

models was mainly made because the time series can then be decomposed into

interpretable parts. This allows for the interpretation of risk developments

(see Chapter 2 for further details), even though risk itself is not actually observed.

This applicability becomes even more important when, as in Section 3.4, the risk

relates to multiple dependent road safety outcomes. The resultant model is a

multivariate unobserved components model, which is a special case of Harvey

and Koopman (1997).

1.3. Structure of this thesis

This introductory chapter provides the background of the research presented

in this thesis, including how it originated and the main issues that require close

attention when analysing developments in road safety: time dependence, the

multivariate nature of road safety, and the problems associated with exposure

data and other explanatory variables.

In Chapter 2, background definitions and statistical properties known in road

safety research are provided for the three central concepts in the analysis of


road safety: safety, exposure and risk. In practical terms, Chapter 2 is about

how road safety is observed at each time point.

Chapter 3 first introduces the novel multivariate structural time series framework.

By using this framework, developments in accident and victim counts,

exposure and other explanatory variables can be analysed simultaneously, thus

considering the multivariate nature of road safety. Their developments are

modelled using structural components for exposure, risk and other factors.

This approach not only allows time dependencies to be considered, but also allows

the researcher to interpret the development of these structural components.

The latter can lead to new insights, for instance by assessing the significance

of changes in risk. It can also be used for validation purposes, which may be

important in limited data situations. By using the combined framework of ad-

vanced state space and Kalman filter techniques, traffic volume data and other

data can be treated stochastically, thus taking care of measurement errors in

explanatory data and allowing the covariance between accident

related outcomes to be considered. Chapter 3 starts with the concept of ‘state’ in Section 3.2.

The state is an unobserved vector containing parameters of the important parts

(aspects) of road safety. For instance the state can be assumed to contain the

parameters that define traffic volume and risk, as well as other parts consid-

ered important to the particular road safety analysis. The modelling frame-

work can then be used to estimate these parameters and thereby quantify these

important aspects. In Section 3.3.1 the basic form of the measurement of the

state of the linear models in this thesis is explained, which is used as a starting-

point for the time development of the models. Thereafter the approach of how

the dynamics are treated in this thesis is outlined, which coincides with the

structural time series approach. In Section 3.3 the main linear multivariate

structural time series model framework developed in this thesis is described.

The framework allows the risk to be treated as a latent variable, and the asso-

ciated model is therefore called the latent risk time series model. In Section 3.4

two applications are discussed, which extend the models discussed in Chap-

ter 5 by integrating results from Chapter 4 and by including alternative source

victim data. In the first example an extended LRT model is used to compare

the development of two accident severity indices, the number of killed or hos-

pitalised victims per serious accident and the number of fatalities per victim

for rear-end accidents to the same indices for all accident types. These two

appear to have different developments. In particular it is noted that the num-

ber of killed or hospitalised victims per serious accident is not constant over

time. This result is used in the next example, where the registration level of ac-

cidents involving hospitalised victims is used as a common factor to estimate

the number of accidents corrected for incomplete registration. In this example,


two sources of accident victim data are used: police records, which include detailed

accident information, and hospital records, which have detailed information

on individuals admitted to hospital, but do not include detailed

accident information. Both sources are used to estimate the ‘true’ number of

hospitalised victims. Under the hypothesis that the police either register all

hospitalised victims or none, the ‘true’ number of accidents with hospitalised

victims can be estimated using the LRT model by assuming all accidents with

hospitalised victims and police recorded accidents with hospitalised victims

share the same latent factor describing the number of hospitalised victims per

accident. The advantage of the LRT approach over averaging is its acknowl-

edgement that registration rates and the number of hospitalised victims per

accident change with time. These figures are also estimates and are thus not

accurately measured. This approach should yield more reliable results than

calculations based on averages.

In Chapter 4, a variance-covariance structure for accident related outcomes is

established, thus allowing for a proper treatment of their inter-dependencies in

a multivariate time series analysis. The approach describes a straightforward

way of estimating the covariance matrix of the number of accidents, victims

and killed, and possibly other accident outcomes. These results are important

when more than one such variable is used in the model; see also Bijleveld

(2005).

In Chapter 5, a comprehensive and technically detailed overview is presented

of the main linear multivariate structural time series model framework devel-

oped in this thesis. Estimation details are given, and example applications are

given based on Australian and Dutch data. The examples demonstrate that

the applicability of the model is not limited to road safety time series analysis.

This chapter was published as Bijleveld et al. (2008).

Chapter 6 presents a nonlinear extension of the multivariate structural time se-

ries model framework, based on Gaussian error distributions. The estimation

procedure applies the extended Kalman filter instead of the classical Kalman

filter used in the linear models discussed in Chapter 3 and Chapter 5. The

model is applied to the analysis of the development of road safety disaggre-

gated into inside and outside urban areas. This example is typical for disag-

gregated data where not all relevant data is available in disaggregated form.

In this case disaggregated traffic volume is not available for all observations.

However, the total traffic volume (traffic volume for inside urban areas plus

traffic volume for outside urban areas) is available for all observations. Structural

components are estimated for risk inside and outside urban areas, which

are compared, and for exposure inside and outside urban areas. As one

example of how the structural nature of the framework can be used to validate

a model, the last of these structural components is further compared to an es-

timate of traffic volume outside urban areas based on road length and traffic

intensity measurements. The result of this comparison appears to support the

validity of the model.

Chapter 7 discusses a further generalisation of Chapter 6 which allows for the

specification of non-Gaussian error distributions. The estimation procedure

in this chapter can be regarded as a generalisation of the iterated extended

Kalman filter using Laplace approximations. Apart from an example appli-

cation on well known data, a simulation study is reported in Chapter 7. The

approach is applied to road safety in an example. In this application, precipita-

tion duration is used to estimate the relative contribution to risk of fatal single

car accidents due to precipitation. The example model is based on two daily

accident counts (with and without precipitation according to the police), traffic

volume data derived from the travel survey (thus small samples, which should

be accounted for) and individual precipitation duration data of 10 weather

stations distributed over the Netherlands, acknowledging the consistency of

weather patterns.


2. Safety, exposure and risk: definitions and some statistical properties

2.1. Introduction

A philosophical discussion covering the topic of “unsafety” or the lack of safety

is beyond the scope of this thesis. This thesis is focused on practical time series

modelling aspects of aggregate road safety data. It is assumed that the results

of “unsafety” are accident consequences such as accident or victim counts, or

combinations of both. The precise type of accident to be considered is deter-

mined by the research question of a study. Other accident consequences such

as monetary consequences of road accidents may also be considered.

A primary assumption in road safety analysis is that accident related road

safety outcomes are non-predictable, non-deliberate consequences of entities

(vehicles, persons) taking part in traffic. The precise definition of what a road

accident (sometimes called a crash) is has, for example, no relevance for the

research presented in this thesis. In short, this thesis is concerned with the

analysis of collected outcomes of non-predictable, non-deliberate accident-like

events in road traffic.

Inspection of basic road safety data for the Netherlands (see Figure 2.1) re-

veals that the number of police recorded fatal accidents increased from 969 in

the year 1950 to a maximum of 2984 fatal accidents (which resulted in 3264

fatalities) in the year 1972. It then started to decrease to 1006 fatal accidents

in the year 2000. As the number of fatal accidents in the year 1950 is approx-

imately equal to the number of fatal accidents in the year 2000, the question

Figure 2.1. The development of the number of police recorded fatal road accidents (1950–2000) in the Netherlands. Source: CBS (2000).

Figure 2.2. Left hand panel: the number of inhabitants in the Netherlands (by 1 January, in millions) for 1950–2000. Right hand panel: the number of motor vehicle kilometres (in billions) in the Netherlands for 1950–2000. Source: CBS (2007) and CBS (2003).

Figure 2.3. Left hand panel: the number of police registered road accident fatalities per million inhabitants (as of 1 January) for 1950–2000. Right hand panel: the number of police registered fatal road accidents per motor vehicle kilometre (in billions) for 1950–2000. Source: DVS (2003) and CBS (2007).

arises whether all efforts to improve road safety in the period 1950–2000 only

resulted in returning road safety to the level of 1950. The answer to this question

depends on how one assesses the scale of the road safety problem.

In Figure 2.2 the development of the number of inhabitants and the development

of (motorised) traffic volume is given for the same period as in Figure 2.1.

It is shown in Figure 2.2 that the population in the Netherlands increased by

about 60% in that period. Traffic volume, on the other hand, was 20 times

larger in 2000 than it was in 1950 (this refers to motorised traffic only, but non-

motorised traffic volume, which consists of pedestrian, bicycle and (light-)

moped travel is minor compared to motorised traffic volume in this demon-

stration).

From the perspective of increased population and traffic volume, it is inter-

esting to consider the relative ‘unsafety’ in terms of the rate of the number of

fatalities per inhabitant (a public health perspective) and fatal accidents per


motor-vehicle kilometre (a traffic performance perspective). These develop-

ments are displayed in Figure 2.3. The huge increase in motorised traffic vol-

ume resulted in a (continuing) decrease in the number of fatal road accidents

per motor vehicle kilometre, similar to Appel (1982). Even by looking at the

number of inhabitants, the number of fatalities per inhabitant is lower (at about

67%) in 2000 than it was in 1950. Given the fact that road traffic substantially

increased over that period, this may be considered as a remarkable result.
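The order of magnitude of these rates can be checked with the figures quoted in the text. The sketch below is a rough calculation only: it uses the fatal accident counts for 1950 and 2000 together with the rounded growth factors for population (about 60%) and motorised traffic volume (about 20 times). The variable names are illustrative. Because it uses fatal accidents rather than fatalities, the per-inhabitant ratio need not coincide exactly with the figure of about 67% quoted above.

```python
# Figures quoted in the text.
fatal_accidents_1950 = 969
fatal_accidents_2000 = 1006
population_growth = 1.60      # population increased by about 60%
traffic_growth = 20.0         # motorised traffic volume about 20 times larger

count_ratio = fatal_accidents_2000 / fatal_accidents_1950
per_inhabitant_ratio = count_ratio / population_growth
per_kilometre_ratio = count_ratio / traffic_growth

print(f"fatal accidents, 2000 relative to 1950:      {count_ratio:.2f}")
print(f"per inhabitant, 2000 relative to 1950:       {per_inhabitant_ratio:.2f}")
print(f"per motor vehicle km, 2000 relative to 1950: {per_kilometre_ratio:.3f}")
```

The same count thus corresponds to roughly two thirds of the 1950 level per inhabitant and to roughly one twentieth of the 1950 level per motor vehicle kilometre, which shows how strongly the assessment depends on the chosen measure of exposure.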

Which kind of exposure can best be used, however, is not clear from these figures.

The following quotes by Hauer (1995): “Thus the question is not ‘what is

exposure?’, but ‘What is the accident rate good for when VMT, ADT and the

like serve as exposure?’ ”4 and by Hakkert and Braimaister (2002, p. 7): “It will

be shown that there is no general definition of exposure and of risk and that

these terms should be defined within the context of the issue studied.” seem

to position this issue in road safety analysis. When the probability of a person

dying in a road accident is compared with the probability of a person dying of

cancer, then the number of inhabitants is an appropriate measure of exposure.

When road accidents are compared with work accidents, then the time (hours)

spent in travel is probably the preferred choice, while comparisons between

different transport modes (e.g., car, train, aeroplane) often involve the use of

kilometres travelled.

In aggregate models, road safety is often studied in terms of failures per unit

performance. Because of the numerous possibilities for a sensible choice of the

combination of the road safety indicator and the exposure measure (Yannis

et al., 2005) this thesis is not focused on one particular type of combination. As

stated in Yannis et al. (2005), traffic volume is usually the preferred measure

for exposure, and the examples in this thesis are therefore mainly oriented at

the use of vehicle kilometres as scale factor for the road safety problem.

2.2. Risk exposure in road safety analysis5

As the basic distributional properties of road accident statistics play a central

role in road safety analysis this section first discusses this topic. A textbook

level derivation of the statistical distribution of accidents is described, which

is further used as a starting point for a discussion of the nature of exposure.

4 VMT is vehicle-miles travelled, ADT is average daily traffic.

5 This section is adapted from section 2.1 of the SafetyNet WP2 state-of-the-art report Yannis et al. (2005), a section co-authored by myself.


2.2.1. Statistical distributions

This section is devoted to a discussion of the statistical distribution of aggre-

gated accident counts, with some reference to the distribution of victim counts.

Accident distributions refer to the distribution of the number of accidents and

not to the spatial distribution of the accidents over an area or temporal distri-

bution over time.

An introduction to a discussion of the basic concepts of road accident statistics

is the work by the French mathematician Poisson (see Feller, 1968, page 153).

Poisson investigated the properties of Bernoulli trials. A Bernoulli trial is an

experiment that has two possible outcomes: success or failure. This type of

experiment seems to be a useful building block for modelling road safety. For

instance, the crossing of a road by a pedestrian can be conceived of as an exper-

iment with a (fortunately) minimal probability of a ‘success’ (i.e., an accident

occurring). A similar argument could be used for a vehicle passing through

a road section, a vehicle driving past a road side obstacle, or two vehicles en-

countering each other on the road. Many other examples could be considered.

The concept of a trial in this chapter is different from the concept of a conflict in

Hauer (1982), which is at a much later – almost final – stage of the development

of an accident.

The original work of Poisson assumed the probability of success to be the same

at each trial. Poisson could then prove that the distribution of the sum of all

successes would tend to a Poisson distribution. The restriction Poisson used

that the probability of success has to be the same value, say p, at each trial has

since been relaxed (see Feller, 1968, page 282). Let N denote the number of

trials. It is not necessary that all probabilities of success $p_i$ are equal to each

other for $i = 1, \ldots, N$. Rather, the sum of all $N$ probabilities should tend to a

finite $\lambda$ (which serves as the expected number of accidents), and their maximum

(e.g. Feller, 1968, page 282) or sum of squares (e.g. Shorack, 2000, page 367)

should tend to nil:

$$\lim_{N\to\infty} \sum_{i=1}^{N} p_i = \lambda, \qquad \lim_{N\to\infty} \max_{1\le i\le N} p_i = 0, \qquad \lim_{N\to\infty} \sum_{i=1}^{N} p_i^2 = 0, \tag{2.1}$$

where $N$ is the number of trials, and $p_i$ is the probability of an accident in trial $i$.
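The quality of the Poisson approximation for unequal probabilities can be checked numerically. The sketch below computes the exact distribution of the number of successes in independent trials with unequal (hypothetical) probabilities and compares it with the Poisson distribution with the same mean. As a reference it uses the bound usually attributed to Le Cam, which is not part of the derivation above: the total variation distance between the two distributions is smaller than the sum of the squared probabilities.

```python
import math

def poisson_binomial_pmf(probs):
    """Exact distribution of the number of successes in independent
    Bernoulli trials with (possibly unequal) success probabilities."""
    pmf = [1.0]                               # zero trials: P(0 successes) = 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)        # trial fails
            nxt[k + 1] += mass * p            # trial 'succeeds' (an accident)
        pmf = nxt
    return pmf

def poisson_pmf(lam, k):
    """Poisson probability of k events, computed on the log scale."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

# Hypothetical unequal accident probabilities for N = 1000 trials.
N = 1000
probs = [0.004 * (1 + (i % 5)) / 3 for i in range(N)]
lam = sum(probs)
sum_squares = sum(p * p for p in probs)

exact = poisson_binomial_pmf(probs)
poisson = [poisson_pmf(lam, k) for k in range(N + 1)]
# Total variation distance, including the Poisson mass beyond N events.
tail = max(0.0, 1.0 - sum(poisson))
tv = 0.5 * (sum(abs(e - q) for e, q in zip(exact, poisson)) + tail)
print(f"lambda = {lam:.2f}, sum of p_i^2 = {sum_squares:.5f}, TV distance = {tv:.5f}")
```

Reducing the individual probabilities while keeping their sum fixed makes the sum of squares, and with it the distance to the Poisson distribution, smaller, in line with condition (2.1).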

For the practice of road safety analysis this result has the following conse-

quence: if the number of accidents can be regarded as the sum of the outcomes

of many independent conceptual events, each having a small probability pi of


turning into an accident, then the distribution of the sum of those events that

turned into accidents – thus the number of accidents – tends to the Poisson

distribution with parameter equal to the sum of the probabilities of events re-

sulting in an accident. Therefore the expected number of accidents is equal to

the sum of all probabilities, which is λ in the limiting case.
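The limiting result around (2.1) can be checked numerically. The following is a minimal sketch, not part of the thesis: it computes the exact distribution of the number of accidents for independent trials with unequal probabilities and compares it with the Poisson distribution whose parameter is the sum of those probabilities. The function names, the value λ = 2 and the way the unequal probabilities are generated are all assumptions made for illustration. A classical bound usually attributed to Le Cam states that the total variation distance computed here does not exceed the sum of the squared probabilities, which is why the third condition in (2.1) matters.

```python
import math
import random


def poisson_binomial_pmf(probs):
    """Exact distribution of the number of 'successes' (accidents) in
    independent trials with unequal success probabilities p_i."""
    pmf = [1.0]  # before any trial: zero accidents with probability 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # this trial is not an accident
            nxt[k + 1] += mass * p       # this trial is an accident
        pmf = nxt
    return pmf


def poisson_pmf(lam, k_max):
    """Poisson probabilities for counts 0..k_max."""
    pmf = [math.exp(-lam)]
    for k in range(1, k_max + 1):
        pmf.append(pmf[-1] * lam / k)
    return pmf


def total_variation(probs):
    """Total variation distance between the exact count distribution
    and the Poisson distribution with parameter sum(p_i)."""
    exact = poisson_binomial_pmf(probs)
    approx = poisson_pmf(sum(probs), len(probs))
    tail = max(0.0, 1.0 - sum(approx))  # Poisson mass above N trials
    return 0.5 * (sum(abs(e - a) for e, a in zip(exact, approx)) + tail)


def unequal_probs(n_trials, lam, rng):
    """Hypothetical unequal p_i, rescaled so that they sum to lam."""
    weights = [rng.uniform(0.5, 1.5) for _ in range(n_trials)]
    scale = lam / sum(weights)
    return [w * scale for w in weights]


rng = random.Random(0)
results = {}
for n_trials in (10, 100, 1000):
    probs = unequal_probs(n_trials, lam=2.0, rng=rng)
    # Store (distance to Poisson, sum of squared probabilities).
    results[n_trials] = (total_variation(probs), sum(p * p for p in probs))
    print(n_trials, results[n_trials])
```

With the sum of the probabilities held fixed, increasing the number of trials makes each individual probability smaller, and the distance to the Poisson distribution shrinks accordingly.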

It should be noted that:

1. This result applies to the distribution of the number of accidents, not to

the distribution of the number of victims (unless there happens to be at

most one victim per accident) or of other outcomes of accidents.

2. The role of independence is important in this result. It should be quite

reasonable to assume that the outcomes of the different events are inde-

pendent, otherwise the result may not hold.6

3. When accident registration problems are to be considered, the concept

of ‘a small probability of resulting in an accident’ can be replaced by ‘a

small probability of resulting in an accident and being registered’. The reg-

istration should not be selective.

4. A different but no less important accident registration issue is that usu-

ally only accidents exceeding a certain level of severity are considered. In

that case ‘a small probability of resulting in an accident’ can be replaced

by ‘a small probability of resulting in an accident with a certain severity

and being registered’. Even if these probabilities are different for each

trial, the distribution of the resulting number of accidents still tends to

the Poisson distribution.

5. An alternative approach to deriving the Poisson distribution for counts,

based on counting processes (in real-time), requires that the (real-time)

registration system cannot be saturated by the accident process. Although

this is mostly relevant to systems like Geiger-Müller counters, its potential

effects should not be ignored in road safety analysis. For instance, police

districts may allocate limited resources to less severe accidents, and may

simply stop registering them once a certain threshold is exceeded, thus

truncating distributions.

6 Outcomes resulting from the same event, such as the number of persons killed, seriously injured, lightly injured, and unharmed in one accident, are likely to be dependent (see Chapter 4 in this thesis, or Bijleveld, 2005). Furthermore, it should be noted that it is the events that should be independent, not the probabilities, which may depend on N. Accidents that are caused by other accidents are in most cases considered part of the initial accident.

2.2.2. The distribution of accident counts

The statistical properties of accident counts mentioned in the previous section

only apply for large numbers of trials. For road safety analysis this means that

the distribution of accident counts will become indistinguishable from a Pois-

son distribution only in the limiting case. Thus, in practice accident counts will

never be precisely Poisson distributed. The limit character of the properties of

accident counts is due to the large number of trials on which it is based. If a

count is based on many, many trials, it is likely that its distribution is indistin-

guishable from a Poisson distribution. For instance annual, national counts of

a general type of accidents will practically be Poisson distributed. However, a

problem arises when the actual number of trials is not so large. This is the case, for example, when a rare accident type is studied, or when road sections with small traffic volumes are studied. For more discussion of the situation in which the number of trials is not very large, see in particular Lord, Washington, and Ivan (2005).

2.2.3. Over-dispersion

As mentioned in Hauer (2001) over-dispersion is commonly encountered in

road safety analysis: “After the unknown model parameters are estimated,

one usually finds that the accident counts are ‘overdispersed’. That is, that

the differences between the accident counts and model predictions, are larger

than what would be consistent with the assumption that accident counts are

Poisson distributed” (Hauer, 2001, p. 799). This phenomenon also occurs in

settings where one would consider the distribution to be practically identical

to the Poisson distribution. The problem is with the replications used in the

generic model as described by Hauer (2001). Even if the accident distribution

would be indistinguishable from the Poisson distribution, replications would

never be under identical conditions. In other words: replications will be drawn

from a different Poisson distribution each time and the replications will there-

fore vary more than would be expected when the replications are sampled

from the same (Poisson) distribution. A more extensive discussion from the

viewpoint of different probabilities can be found in Lord et al. (2005). See e.g.

Hauer (2001) and the references therein for more on how overdispersion can

be estimated. The methods applied in this thesis never assume the prediction

to be fixed, rather the methods assume the predictions to be subject to error.

This situation is comparable to assuming that “replications would never be

under identical conditions.” as remarked just above. In a general context, this

approach is called a mixture approach to generalised count models, of which

the negative binomial model (a Poisson-Gamma mixture) is one example. In

all cases in this thesis it appears that no overdispersion parameter in addition

to the mixture needs to be estimated. The approach where the amount of dis-

persion in addition to the prediction error is estimated is taken in this thesis.

More general forms and other distributions can be considered in Chapter 7.
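The argument that replications drawn from different Poisson distributions vary more than replications drawn from one Poisson distribution can be illustrated by simulation. This is a minimal sketch under stated assumptions, not the estimation method of this thesis: the mean level of 10, the Gamma shape of 5 and the number of replications are hypothetical, and the Poisson sampler is a simple textbook method that is adequate only for moderate parameter values.

```python
import math
import random
import statistics


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) count (Knuth's multiplication method;
    adequate for the moderate lam used here)."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


rng = random.Random(1)
mean_level = 10.0   # hypothetical expected accident count
shape = 5.0         # hypothetical Gamma shape; smaller = more heterogeneity
n_rep = 20000

# Replications under identical conditions: one fixed Poisson parameter.
fixed = [poisson_draw(mean_level, rng) for _ in range(n_rep)]

# Replications under varying conditions: the Poisson parameter itself is
# drawn from a Gamma distribution with the same mean (Poisson-Gamma mixture).
mixed = [poisson_draw(rng.gammavariate(shape, mean_level / shape), rng)
         for _ in range(n_rep)]

for name, counts in (("fixed", fixed), ("mixed", mixed)):
    m = statistics.fmean(counts)
    v = statistics.variance(counts)
    print(f"{name}: mean={m:.2f} variance={v:.2f} ratio={v / m:.2f}")
```

For the fixed case the variance-to-mean ratio should be close to one. For the mixture, the law of total variance gives a variance equal to the mean plus the variance of the Poisson parameter, so the ratio is expected to be clearly larger than one.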

2.2.4. Gaussian approximations

The distribution of the number of accidents is often approximated by the Gaus-

sian distribution. This approximation is also used in the models presented in

this thesis, except for those in Chapter 7. The common procedure is to assume

(first approximation) a Poisson distribution with parameter λ, and then to ap-

proximate (second approximation) the Poisson distribution with a Gaussian

distribution with mean parameter and variance parameter equal to λ. In mod-

elling situations, the expected value λ is often estimated by the model predic-

tion of the observed count. When no statistical model is available, the expected

value λ is usually estimated by the observed count. Sometimes an amount of

‘overdispersion’ is added to the variance parameter, that is a constant value is

added to λ.

It should be noted that the approximation of the Poisson distribution by a

Gaussian distribution deteriorates when the accident counts are getting smaller.

There is no general rule as to what value the counts should exceed in order for

the approximation to be sufficiently reliable since that depends on the applica-

tion and the required accuracy. It should also be noted that for many types of

statistical models count data versions are available. Therefore in many cases a

Gaussian approximation is no longer needed.
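How quickly the Gaussian approximation deteriorates for small counts can be made concrete. The sketch below, which is illustrative only, computes the largest gap between the Poisson distribution function with parameter λ and the Gaussian distribution function with mean and variance λ. The three values of λ are arbitrary choices, and no continuity correction is applied.

```python
import math


def normal_cdf(x, mean, variance):
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * variance)))


def max_cdf_gap(lam):
    """Largest absolute difference between the Poisson(lam) distribution
    function and the Gaussian one with mean and variance both equal to lam."""
    k_max = int(lam + 12.0 * math.sqrt(lam) + 12.0)
    pmf = math.exp(-lam)
    cdf_before, gap = 0.0, 0.0
    for k in range(k_max + 1):
        cdf_at = cdf_before + pmf
        gauss = normal_cdf(k, lam, lam)
        # The Poisson distribution function jumps at k: check both sides.
        gap = max(gap, abs(cdf_at - gauss), abs(cdf_before - gauss))
        cdf_before = cdf_at
        pmf *= lam / (k + 1)
    return gap


gaps = {lam: max_cdf_gap(lam) for lam in (2.0, 20.0, 200.0)}
for lam, gap in gaps.items():
    print(f"lambda={lam:6.1f}  max CDF gap={gap:.4f}")
```

The gap shrinks as λ grows, but whether a given gap is acceptable still depends on the application, as noted above.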

2.2.5. The distribution of victim counts

Given that an accident occurs, determining the distribution of the number of

victims resulting from that accident is difficult. Obviously the distribution is

dependent on the number of persons involved in that accident7. When done

at all, approximations can be made based on compound distributions. It can

however be assumed that the victim counts are overdispersed, more so than

accident counts. The amount of overdispersion depends on the variation of the

number of victims per accident (see Chapter 4 in this thesis, or Bijleveld, 2005).

This means that victim counts from accidents that rarely involve more than one

victim, will be less ‘extra’ overdispersed than victim counts from accidents that

(more) often involve more than one victim, as compared to the overdispersion

of the number of accidents.
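A short worked equation may clarify why the extra overdispersion depends on the variation in the number of victims per accident. It is a sketch under simplifying assumptions that are not made in the thesis: the number of accidents $A$ is Poisson distributed with parameter $\lambda$, and the numbers of victims per accident $V_1, V_2, \ldots$ are independent, identically distributed and independent of $A$. The total victim count is then $S = V_1 + \cdots + V_A$.

```latex
\begin{align*}
  \mathrm{E}[S]   &= \lambda\,\mathrm{E}[V], \\
  \mathrm{Var}[S] &= \mathrm{E}[A]\,\mathrm{Var}[V] + \mathrm{Var}[A]\,\mathrm{E}[V]^2
                   = \lambda\,\mathrm{E}[V^2], \\
  \frac{\mathrm{Var}[S]}{\mathrm{E}[S]}
                  &= \frac{\mathrm{E}[V^2]}{\mathrm{E}[V]}
                   = \mathrm{E}[V] + \frac{\mathrm{Var}[V]}{\mathrm{E}[V]}.
\end{align*}
```

If every accident has exactly one victim, the ratio equals one and the victim count is as dispersed as the accident count. The more often accidents involve several victims, the larger the ratio becomes.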

7 Which is unfortunately not known in the Netherlands, since unharmed participants in an accident are not registered unless they are drivers.

Generally, the distribution of victim counts has no influence on the distribution of accident counts. In practice, however, often only accidents exceeding a certain severity level are registered or used in an analysis. If the distribution of

victim counts changes in a way that the probability of exceeding the severity

level decreases, the expected number of accidents will decrease, and thus the

accident count distribution will change.

2.2.6. The relation between trials and exposure

As discussed above, the number of trials N plays a dominant role in the ex-

pected number of accidents. Assuming the pi values to be sufficiently regular,

the expected number of accidents is proportional to the number of trials since

$\lambda_N = \sum_{i=1}^{N} p_i$. The number of trials is therefore probably the closest we can get to the true exposure. Unfortunately, the value of N is generally unknown. Since N and the $p_i$ are unknown, all need to be estimated. Given that estimation of each individual $p_i$ is impractical, we assume a homogeneous distribution of the $p_i$. In addition, the data are used in aggregate models, which means that aggregate counts of accidents are available as well as aggregate estimates of exposure. This means that, given an estimate of N, only the average R of the $p_i$ can be determined.

No general guidelines are available on how to estimate either N or R. As N is

obviously somehow dependent on the scale of road traffic, and the number of

accidents is dependent on both N and R, the approach taken in this thesis is to

estimate both N and R by means of two (approximate, effectively stochastic)

equations:

$$\begin{cases} \text{Scale of road traffic} \approx N \\ \text{Number of accidents} \approx N \times R. \end{cases} \tag{2.2}$$

See Chapter 3 for further details on how N and R are estimated in this the-

sis, an approach which allows for nonlinear relations. The nonlinear nature of

the relations is suggested by the discussion in the next section. Note that (2.2)

implies that any alternative estimate of N proportional to N cannot be distin-

guished from N.

The research question determines for which kind of accident the ‘Number of

accidents’ needs to be analysed. The research question also determines, given

available data, the optimal choice of what quantity can best be used to measure

the ‘Scale of road traffic’ (see also Hauer (1995) and Hakkert and Braimaister

(2002)). All methodology presented in this thesis is independent of choices for

‘Number of accidents’ and ‘Scale of road traffic’. However, given the practi-

cal importance of the traffic volume for ‘Scale of road traffic’, the use of traffic

volume as a measure for the scale of road traffic is elaborated upon in the next

section, which discusses the relation between traffic volume and accident occurrence from a general perspective and at a less aggregated level.

2.3. Traffic volume and accident occurrence

Traffic volume is a commonly used measure of exposure, and is used in all

examples in this thesis. Therefore some discussion on the relation between

traffic volume and the number of accidents is given in this section.

2.3.1. The relation between ‘traffic volume’ and the number of accidents

In road safety research, many results are obtained from studies using road

sections as observational units. Even when all other variables are held constant

as much as possible, such studies typically reveal a nonlinear relation between

the number of vehicles passing per time unit and the number of accidents or

victims occurring in the same period of time. The number of vehicles passing

a road section per unit of time is linearly related to the traffic volume of a road

section, as its length is fixed. In a formula this can be written as:

number of accidents = f (traffic volume), (2.3)

where f is a (nonlinear) function. As in Figure 2.4, this function is often increasing, as described in the handbook by Elvik and Vaa (2004, p. 49). Depending on the type of accident being studied, however, this function can also be concave (Hauer, 1995, p. 135), or even decreasing (see e.g. Hiselius, 2004). Note

that (2.3) could equivalently be defined explaining the number of accidents per

unit road length as a function of traffic intensity (the number of vehicles pass-

ing), rather than traffic volume.
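A small numerical example shows what a nonlinear relation of the form (2.3) implies for the number of accidents per unit of traffic volume. It is a minimal sketch using $f(x) = \sqrt{x}$, the form used for illustration in Figure 2.4. The traffic volumes are hypothetical and the function name is an assumption.

```python
import math


def predicted_accidents(traffic_volume):
    """Illustrative safety performance function f(x) = sqrt(x),
    the same functional form used for Figure 2.4."""
    return math.sqrt(traffic_volume)


volumes = [100.0, 400.0, 1600.0]          # hypothetical traffic volumes
accidents = [predicted_accidents(v) for v in volumes]
risks = [a / v for a, v in zip(accidents, volumes)]  # accidents per unit volume

for v, a, r in zip(volumes, accidents, risks):
    print(f"volume={v:7.0f}  accidents={a:5.1f}  accidents per unit volume={r:.4f}")
```

Under this particular function, each fourfold increase in traffic volume doubles the predicted number of accidents and halves the number of accidents per unit of traffic volume.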

The nonlinear relation is frequently found to be similar to the curved line in

Figure 2.4. A suggested explanation for the general shape of this relation is

that average speed decreases as traffic intensity increases, and the traffic flow

then becomes relatively safer (as expressed by lower pi values in (2.1)). An-

other explanation given is that when road sections are more intensively used,

they get more attention and are therefore designed somewhat safer than other

sections, thus yielding smaller $p_i$ values (see for example Reurings and Janssen, 2006, p. 11). The latter reasoning only holds when physically different road

sections are compared. Both suggested explanations underline the idea that

Figure 2.4. Theoretical nonlinear relation between the number of accidents and traffic volume for a road section, region or country. All influences other than traffic volume are held constant. See e.g. Elvik and Vaa (2004, p. 49). In this illustration the relation $f(x) = \sqrt{x}$ is used. This function is often called the Safety Performance Function, SPF (Hauer, 1995). [Plot: number of accidents_t (vertical axis) against traffic volume_t (horizontal axis), with the curve predicted number of accidents_t = f(traffic volume_t).]

there is an interaction between traffic volume (and thus traffic intensity8) and

the relative safety performance of road sections given a certain amount of traf-

fic, and thus aggregations of the latter.

Furthermore, the relation between traffic volume and the number of accidents

may be quite different for different levels of accident severity and for different

accident types. Due to data reliability and availability issues (mainly regis-

tration issues), only more severe accidents tend to be analysed as their regis-

tration is more reliable. In practice this means that only accidents involving

persons being killed or seriously injured are used. Thus the many damage-

only and light injury accidents are excluded from analysis due to the much

smaller reliability of their registration. In many studies only fatal accidents are

considered. A possible consequence of analysing only fatal accidents is that

as traffic intensity increases, accident severity actually decreases and therefore

the total number of accidents may increase while at the same time the number

of fatal accidents may decrease. Another complication could be that during

rush hour, for instance, the number of occupants per vehicle may on average

be lower than during evenings/weekends. This will also decrease the proba-

bility of at least one person getting hurt in an accident, solely due to the fact

that fewer persons are at risk in such accidents. An otherwise ‘equal’ accident

may then be less likely to end up as a serious accident.

8 Since road length is usually constant in studies that use road sections as observational units.

2.3.2. A remark on traffic volume and multiparty accident occurrence

This section briefly addresses the relation between traffic volume and the num-

ber of accidents at a microscopic level in view of multiparty accidents. In 2006,

about two thirds of the fatal accidents in the Netherlands involved more than

one pedestrian or vehicle. This means that it is likely that for a substantial part of the fatal accidents, the presence of another vehicle taking part in traffic was somehow important. The ‘presence’ of another vehicle is probably best understood as one vehicle or pedestrian being, at one point in time, in close enough proximity to be involved in an accident with another vehicle, which may be considered a ‘trial’. This would mean that for the number of multiparty accidents, the number of such encounters between vehicles could be more important than the mere traffic volume itself. If, for instance, the number of accidents between mopeds and cars is to be studied, this means that the traffic volumes of both mopeds and cars are needed.

This result is plausible because of the fact that an increase in traffic volume

of mopeds is likely to result in an increase in the number of encounters be-

tween mopeds and cars, while an increase in the traffic volume of cars may do

so as well. There is a caveat in that an increase in the traffic volume of cars

on motorways is not likely to have such an effect. To a lesser extent, this issue

is relevant to mopeds as well. Road safety measures that tend to separate traffic flows (separate carriageways for traffic in opposite directions, grade-separated junctions) have similar effects. The ongoing implementation of such measures

systematically tends to alter the relation between traffic volume(s) and acci-

dent occurrence.

It should be noted that the use of two (or more) measures of exposure results

in a different risk concept than when only one measure of exposure is used.

This case is treated in the modelling approach in this thesis in Chapter 5. That

approach maintains the multiplicative interpretation of the risk and exposure

variables in the models.

2.4. Summary and discussion

This chapter discusses two main aspects of road safety: lack of safety and ex-

posure. First lack of safety is discussed. It is assumed that the results of lack

of safety to be used in time series analysis are accident consequences. These

include accident counts and victim counts. It is further noted that the precise

research question determines what type of accidents is to be considered, as is

well known in road safety research. It is important, however, that accidents are non-predictable, non-deliberate consequences of entities (vehicles, persons) taking part in traffic.

The well-known importance of exposure (or scale) is demonstrated by presenting an example that shows the development of the raw count of fatal accidents in the Netherlands from 1950 to 2000. That development alone may not sufficiently explain the development of road safety in the Netherlands from 1950 to 2000, as it should then be concluded that little improvement in road safety was made in 50 years. The development could also suggest that the road safety situation deteriorated up to the early 1970s, and then improved afterwards. However, when the number of vehicle kilometres is considered (as an approximate measure of exposure or scale), a different picture emerges. The use of an exposure measure in combination with a road safety indicator

induces a measure of risk, defined as the road safety indicator per unit of ex-

posure. This chapter discusses established literature considering road safety in

terms of exposure, usually assuming a nonlinear relation, and compares these

approaches to a decomposition of the road safety indicator into exposure and

risk as used further in this thesis.

First, a model for accident occurrence and registration is described based on

encounters between vehicles (including pedestrians). The number of regis-

tered accidents is the number of encounters that resulted in an accident that

is registered. In advanced textbooks it is proven, using the limiting behaviour of sums of Bernoulli trials with unequal probabilities, that under general conditions the distribution of this sum tends to the Poisson distribution. The underlying number

of encounters is of interest in the discussion considering the relation between

exposure and traffic volume.

It is argued that the expected number of accidents is proportional to exposure.

However, in many road safety studies using road sections as observational

units but also others, nonlinear relations between exposure measures like traf-

fic volume and the number of accidents are found. In such studies however

it is not only traffic volume that changes, but also traffic conditions. Notably,

when traffic volume increases, traffic intensity is likely to increase as well, in

particular when a set of road sections is considered. Therefore, referring to

the accident occurrence model based on Bernoulli trials, it is likely that the

accident probabilities decrease at the same time, as in most but not all cases a

concave function is found. To complicate matters further, it can be argued that

approximate exposure measures like traffic volume are nonlinearly related to

the number of encounters.

It is noted that such results are found when it is attempted to hold all other var-

iables constant as much as is possible. When time series models are considered

– in an analysis of road safety data over a longer period of time – this condition

cannot be preserved, because the road safety situation tends to change with

time. Therefore, even when a nonlinear relation is known at one time point, it

may have changed in the next. As generally only one observation is available

per time point in time series analysis, it may be impossible to determine the

shape of the nonlinear relation from the data.

This suggests adapting (2.2) to acknowledge the nonlinear monotone relation between the exposure measure and exposure. Accordingly:

$$\begin{cases} \text{Exposure measure} \approx k(N) \\ \text{Number of accidents} \approx N \times R, \end{cases} \tag{2.4}$$

where k is a nonlinear function, while acknowledging that R may be affected by exposure. In practical cases k will be strictly monotone, which allows for an alternative specification:

$$\begin{cases} \text{Scale of road traffic} \approx N' \\ \text{Number of accidents} \approx k^{-1}(N') \times R, \end{cases} \tag{2.5}$$

where $N' \equiv k(N)$. The approach (2.5) is taken in this thesis, where $N'$ is considered ‘exposure’. Further details, in which time series aspects are considered, can be found in Chapter 3. There it will be demonstrated that in the log-linear modelling context, the important case where $k^{-1}(x)$ is a power function is absorbed in the model.
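The reason a power function can be absorbed is easiest to see by taking logarithms of the second line of (2.5). The following is a sketch of the idea only, not the derivation given in Chapter 3. The constants $c$ and $\beta$ are introduced here purely for illustration.

```latex
\begin{align*}
  k^{-1}(x) &= c\,x^{\beta}
    \quad\Longrightarrow\quad
    \text{Number of accidents} \approx c\,(N')^{\beta} \times R, \\
  \log(\text{Number of accidents})
    &\approx \beta \log N' + \bigl(\log c + \log R\bigr).
\end{align*}
```

On the log scale the exponent $\beta$ becomes an ordinary coefficient on log exposure, and the constant $\log c$ cannot be separated from the log of the risk, so no separate nonlinear function needs to be specified.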

3. Multivariate structural time series models

3.1. Introduction

This chapter introduces the multivariate structural time series model as a gen-

eral model that should satisfy many of the requirements related to aggregate

time series modelling in road safety. For a thorough discussion on multivariate

structural time series models see Harvey (1989), Harvey and Koopman (1997)

and Durbin and Koopman (2001).

The elementary concepts of structural time series models are the state, how the state is ‘observed’ and how the state evolves over time. In an exemplary structural time series model, we observe a time series $y_t$ ($t = 1, 2, \ldots, n$), which is assumed to be measured with (possibly minimal) error:

$$y_t = a_t + \varepsilon_t, \tag{3.1}$$

where $\varepsilon_t$ is assumed to follow an independent Gaussian distribution with mean zero and variance $\sigma^2_\varepsilon$. This means that $a_t$ serves as the underlying value of $y_t$, free of noise; it is the expected value in this example. The variable $y_t$ may represent a phenomenon whose value “tomorrow” is expected to be the same as “today”. Such a phenomenon could be described by a so-called local level model (Harvey, 1989; Durbin and Koopman, 2001; Commandeur and Koopman, 2007):

$$a_{t+1} = a_t + \eta_t, \tag{3.2}$$

where $\eta_t$ is usually assumed to follow an independent Gaussian distribution with mean zero and variance $\sigma^2_\eta$. This means that, although the expected value of $a_{t+1}$ given $a_t$ is equal to $a_t$, in practice it may be somewhat larger or smaller than $a_t$. The amount of variation is determined by the variance $\sigma^2_\eta$. Note that a meteorologist will probably be able to improve (3.1) and (3.2). The equations (3.1) and (3.2) are usually called the measurement equation and the system equation respectively, and $a_t$ is called the state. In the models presented in this thesis, the state is represented by an $m \times 1$ dimensional vector of real-valued elements.
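The local level model (3.1)–(3.2) can be simulated and filtered in a few lines. The sketch below is illustrative and is not the software used in this thesis: it generates a series from the two equations and then applies the standard Kalman filter recursions for this model to recover the state. The variances, the series length, the starting level and the function names are hypothetical, and the variances are treated as known, whereas in practice they would be estimated, typically by maximum likelihood.

```python
import math
import random


def simulate_local_level(n, var_eps, var_eta, a1, rng):
    """Simulate y_t = a_t + eps_t and a_{t+1} = a_t + eta_t."""
    states, observations = [], []
    a = a1
    for _ in range(n):
        states.append(a)
        observations.append(a + rng.gauss(0.0, math.sqrt(var_eps)))
        a = a + rng.gauss(0.0, math.sqrt(var_eta))
    return states, observations


def kalman_filter_local_level(observations, var_eps, var_eta,
                              a1=0.0, p1=1e7):
    """Filtered estimates of a_t given y_1..y_t for the local level model.
    a1 and p1 are the prior mean and (large, vague) prior variance of a_1."""
    a_pred, p_pred = a1, p1
    filtered = []
    for y in observations:
        v = y - a_pred                 # one-step prediction error
        f = p_pred + var_eps           # its variance
        k = p_pred / f                 # Kalman gain
        a_filt = a_pred + k * v        # update with the new observation
        p_filt = p_pred * (1.0 - k)
        filtered.append(a_filt)
        a_pred = a_filt                # system equation: level carries over
        p_pred = p_filt + var_eta      # plus the variance of eta_t
    return filtered


def rmse(xs, ys):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


rng = random.Random(42)
var_eps, var_eta = 4.0, 0.25           # hypothetical variances
states, obs = simulate_local_level(500, var_eps, var_eta, a1=10.0, rng=rng)
filtered = kalman_filter_local_level(obs, var_eps, var_eta)

print("RMSE of raw observations:", round(rmse(obs, states), 3))
print("RMSE of filtered state:  ", round(rmse(filtered, states), 3))
```

Because the measurement error variance is large relative to the level variance in this example, the filtered estimates should track the simulated state more closely than the raw observations do.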

In the univariate example (3.1) a basic means to observing the state is specified.

It is assumed that the state contains the parameters that determine the ‘true’

values of the observations, which can only be observed distorted with error. In

a multivariate case, which is further developed in this chapter, it is assumed

that apart from error, linear combinations of the state are observed:

$$y_t = Z a_t + \varepsilon_t, \tag{3.3}$$

where $y_t$ and $\varepsilon_t$ are $p \times 1$ dimensional vectors, and $Z$ is a $p \times m$ dimensional matrix. In Chapter 6 and Chapter 7 a more general form of (3.3) is used. The

concept and measurement of the state is the topic of Section 3.2. Assuming

that the state of road safety in a country evolves over time is empirically sup-

ported, as the road safety situation appears to improve with time. There are

however a number of issues in this respect to be discussed in Section 3.3. This

is done only after the basic measurement structure of the models developed in

this thesis (3.3) is formalised. In that section the basic measurement equations

are formulated using the relation between trials and exposure as described in

Chapter 2. In Section 3.3, the latent risk time series model is first formulated,

which is further detailed in Chapter 5, with some example applications in Sec-

tion 3.4. In Section 3.5 some nonlinear extensions are introduced, which are

detailed in Chapter 6 and Chapter 7.

3.2. The concept of state and its observation

When (linear) regression models are applied in road safety analysis, it is as-

sumed that the explanatory variables used in the model (plus all regression

coefficients and possibly an intercept term) sufficiently describe the road safety

outcome. This is what the state in the modelling approach in this thesis is all

about: the state should contain everything that is needed to estimate all the

road safety outcomes. This means that the state may contain both explanatory

variables and coefficients. If some of the explanatory variables are measured

subject to error, it is assumed that the state contains the true values of these

explanatory variables. For instance if it is assumed in a study that seat belt

use is relevant 9 to road safety, then this should be the actual seat belt use, not

the observed percentage of seat belt use obtained from some small (or larger)

survey, which is subject to sampling error. The state may further contain in-

formation on how road safety develops, and thus effectively contains all the

relevant coefficients of a model.

To be more precise, the fundamental assumption of the modelling approach

presented in this thesis is that the ‘state of road safety’ (e.g. in an area, in a

period, in the whole country, for a year) can be described by a finite dimen-

sional vector of data. Two restrictions apply: the first restriction is that the

9 A remark on aggregation issues is presented later in this section.

state vector is assumed to be finite dimensional10 and the second restriction

is that all the state variables are assumed to be real-valued.11 Any observable

quantity, whether being a road safety outcome or an explanatory variable, is

observed through a (potentially trivial) function of the state. This observation

is in general assumed to be subject to (not necessarily independent) random

error.

In a schematic formulation, we have:

$$\text{road safety outcome} \longleftarrow g(\text{state}) + \text{some random error}, \tag{3.4}$$

while at the same time, for possible explanatory variables we have

$$\text{explanatory variables} \longleftarrow h(\text{state}) + \text{another random error}. \tag{3.5}$$

Note that the distinction between the multivariate functions g and h is more

conceptual (distinguishing road safety outcomes from explanatory variables)

than practical (g and h together map the unobserved state to the observed

quantities). To stress this distinction between variables, the observed quan-

tities are often called manifest variables, while the unobserved states are often

called latent variables. A schematic representation of these relations between

the observations and the state is also given in Figure 3.1.

In case the full state vector is observable and accurately measured, then ‘an-

other random error’ in (3.5) is zero (which means no error at ‘A’ in Figure 3.1).

Moreover assuming an identity relation for h (meaning that, apart from coef-

ficients, the state simply equals the explanatory variables), (3.5) can be substi-

tuted in (3.4):

$$\text{road safety outcome} \longleftarrow g(\text{coefficients}, \text{explanatory variables}) + \text{some random error},$$

which is effectively a classical regression specification. In summary we assume

an observed road safety quantity somehow to be the observation of a particular state through some function, the observation of which is possibly distorted

10 The existence of infinite dimensional models is acknowledged, but such models are not discussed. See e.g. Chan and Palma (1998) for an approximation of infinite dimensional models by finite state space models.

11 Models using discrete valued state variables, which are used to model processes that may jump from one state into another (for instance a person changing travel mode in a microscopic model), are not discussed.

Figure 3.1. Schematic description of the relations between state, true and observed values of road safety and explanatory variables in (3.4)–(3.5). Presence of error at location “A” introduces the errors-in-variables properties of the model. [Diagram elements: State; potential system error; functions g and h; true and observed values of road safety; true and observed values of explanatory variables; observation error; ‘true’ regression relation; classical regression model; location A.]

by noise (random error). Apart from random error, the observed road safety

quantity is assumed to depend on the state only, that is, the expected road

safety quantity (or even its statistical distribution) can be derived from the

state values. Furthermore, individual variables of the state do not necessarily have a direct manifest counterpart in the way that the true percentage of seat belt use is related to the observed seat belt use.

Many relevant variables in road safety can in practice not be observed directly

and are therefore called latent. Risk (see also (3.7) below), for instance, is

never directly observed, no matter how it is defined (Hauer, 1995; Hakkert

and Braimaister, 2002). It is commonly defined as the ratio of e.g. the number

of accidents to some measure of scale called exposure, the latter often being

operationalised as the total traffic volume (the amount of distance travelled,

usually by vehicles, sometimes by persons). The examples in the beginning of

Section 1.2.3 also mention the size of a country and the number of its inhabi-

tants as alternative measures of scale.

As the development of risk (and changes in it) is often to be interpreted and

sometimes even attributed to (the introduction of) road safety measures, it is

important to consider the possibility of random fluctuations in the traffic vol-

ume data as well as in the accident data. The issue of the potential random

fluctuations in exposure data is mostly ignored in road safety analysis, proba-

bly because information pertaining to the statistical accuracy of the exposure

data is often unavailable, see also Yannis et al. (2005) and Yannis et al. (2008).

In many cases the consequences of ignoring statistical inaccuracy of traffic vol-

ume data may be negligible compared to other inaccuracies, but such a de-

cision is not always warranted. One example may be the moped travel data

discussed in Section 1.2.3. Further, disaggregated traffic volume data (data for

subgroups) may be subject to substantially larger sampling errors than aggre-

gate data. Other explanatory variables than exposure may also be subject to

random fluctuations. Considering random fluctuations in explanatory varia-

bles as well as in dependent variables implies an ‘errors-in-variables’ approach

(see Seber and Wild, 1988, Chapter 10).
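One well-known consequence of ignoring random error in an explanatory variable is that an ordinary regression slope is biased towards zero. The simulation below is a minimal sketch of that effect in a simple linear setting and is not a model from this thesis. The true slope, the error variances, the sample size and the helper name `ols_slope` are all hypothetical choices made for illustration.

```python
import random
import statistics


def ols_slope(x, y):
    """Ordinary least squares slope of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(7)
n = 20000
true_slope = 1.0

x_true = [rng.gauss(0.0, 1.0) for _ in range(n)]   # true explanatory variable
y = [true_slope * x + rng.gauss(0.0, 0.5) for x in x_true]
x_obs = [x + rng.gauss(0.0, 1.0) for x in x_true]  # observed with error

slope_true_x = ols_slope(x_true, y)
slope_obs_x = ols_slope(x_obs, y)
print("slope using true x:    ", round(slope_true_x, 3))
print("slope using observed x:", round(slope_obs_x, 3))
```

With equal variances for the true variable and its measurement error, as assumed here, the slope estimated from the observed variable is expected to be about half the true slope. This is the kind of distortion an errors-in-variables approach is meant to avoid.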

There is no formal requirement with respect to how road safety depends on

the state. In Section 3.3 a log-linear relation is assumed, which appears to be

suitable for many applications. However, some applications require a more

complex relation, as for example discussed in Section 3.5 and Chapter 6 and

Chapter 7.

The state can only define an aggregate estimate of ‘the state of’ road safety, as

it is to represent road safety in a period of time (often a year), for a particular

area (often a country), not the exact conditions of the accidents. This problem is

shared with other aggregate models (almost all other models), like regression

models. This problem is particularly relevant when results from studies at

the accident level are to be incorporated into aggregate (e.g. national) level

models. For instance the temperature at the time of the accident is not likely to

be the average, minimum or maximum temperature of the day, week, month

or year, or whatever period of time is used in an analysis. This means that the

level of aggregation has to be considered when results are compared.

As mentioned above, a restriction taken in this thesis is that the state is as-

sumed to be finite dimensional, which excludes certain long range dependency

models. There is evidence of long range dependency in empirical science, al-

though no studies pertaining to this in road safety appear to have been pub-

lished. Koornstra (1992) and Commandeur and Koornstra (2001) have encoun-

tered this phenomenon.

3.3. The latent risk time series model

3.3.1. A basic latent risk observation model

A number of authors have proposed (Hauer, 1995; Hakkert and Braimaister,

2002; see also Section 2.1 in this thesis), that risk can be defined in close re-

lation with the measure of exposure selected as being the most suitable for


the purpose of the analysis. Consequently, “risk” is mostly defined to satisfy

“number of accidents” equals “exposure” × “risk”. Therefore, to start a typical

example in a risk analysis setting, a minimal state would consist of an “expo-

sure” variable – in this example assumed to be related to traffic volume – and

a “risk” variable (defined to match this measure of exposure). In this way we

arrive at a situation similar to equation (2.2), which is a simple but common

case of (3.4) and (3.5):

$$
\begin{cases}
\text{Traffic volume} = \text{exposure},\\
\text{Number of accidents} = \text{exposure} \times \text{risk}.
\end{cases}
\tag{3.6}
$$

Having an estimate or observation of the “Traffic volume” and the “Number of

accidents” would yield an estimate of both “exposure” and “risk” by solving

(3.6) for “exposure” and “risk”. If the relations are assumed to be as exact as

in (3.6), then the “risk” estimate is identical to the accident rate “Number of

accidents” divided by “Traffic volume”, as has been done in many studies (e.g.,

Oppe, 1991a,c). This means that, although risk cannot be observed, it can still

be measured indirectly.
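As a minimal illustration of solving (3.6), the following Python sketch computes the implied exposure and risk from one observation of traffic volume and one of the number of accidents. The function name `solve_exact_model` and the figures are hypothetical and serve the illustration only; the sketch assumes that the relations in (3.6) hold exactly.

```python
def solve_exact_model(traffic_volume, accidents):
    """Solve (3.6) for exposure and risk, assuming the relations hold exactly."""
    exposure = traffic_volume        # first line of (3.6)
    risk = accidents / exposure      # second line of (3.6), solved for risk
    return exposure, risk


# Hypothetical figures: 100 units of distance travelled and 1200 accidents.
exposure, risk = solve_exact_model(100.0, 1200.0)
print(exposure, risk)
```

Under these assumptions the risk estimate is simply the accident rate, which is the quantity the latent risk model later treats as uncertain.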

In the latent risk time series model the approach taken in (3.6) is modified by

applying logarithms on both sides of the equations and subsequently intro-

ducing error terms:

$$
\begin{cases}
\log(\text{Traffic volume}) = \log(\text{exposure}) + \text{random error in traffic volume},\\
\log(\text{Number of accidents}) = \log(\text{exposure}) + \log(\text{risk}) + \text{random error in accidents},
\end{cases}
\tag{3.7}
$$

where ‘random error in traffic volume’ is introduced because we assume ‘ex-

posure’ to be only approximately equal to ‘traffic volume’, if only due to ob-

servation error.12

12 In addition, (3.7) could be modified further by taking the approach suggested in the handbook by Elvik and Vaa (2004, p. 49) by adding a coefficient $b$:

$$
\begin{cases}
\log(\text{Traffic volume}) = \log(\text{exposure}) + \text{random error in traffic volume},\\
\log(\text{Number of accidents}) = b \times \log(\text{exposure}) + \log(\text{risk}) + \text{random error in accidents},
\end{cases}
\tag{3.8}
$$

an approach which is nested in the modelling approach in this thesis. This is explained in Appendix A.2. Elvik and Vaa (2004, p. 49) define $\text{Number of accidents} = \alpha Q^{b}$, where $Q$ is traffic volume and $\alpha$ is a constant, similar but not equal to risk.


The system of equations (3.6) and (3.7) defines a means to estimate the latent

exposure and risk variables from observations of traffic volume and the num-

ber of accidents, where, due to the explicit introduction of error terms, the

estimates have a simultaneous distribution, rather than a fixed point estimate.

It is thus not assumed in advance that traffic volume is identical to exposure

(although it may turn out to be almost similar), hence the presence of the error

terms in the first equation of (3.7). As a result, it is also not assumed in advance

that risk is equal to the number of accidents divided by the traffic volume.

Acknowledging that the actual values of the latent exposure and risk varia-

bles are not accurately known, but that only their statistical distributions are

known, allows a statistical model to improve these estimates by incorporat-

ing estimates of the latent exposure and risk variables based on previous (and

later) time points. How this can be achieved is demonstrated in the following

section.
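The consequence of the error terms in (3.7) for the empirical accident rate can be sketched with a small simulation. The latent values, the error standard deviations and the function name `simulate_observations` below are hypothetical, and independent Gaussian errors are assumed purely for the illustration. The empirical log-rate then equals the latent log-risk plus the difference of the two errors, so its error variance is the sum of both error variances.

```python
import math
import random


def simulate_observations(log_exposure, log_risk, sd_volume, sd_accidents, rng):
    """Draw one pair of log observations according to (3.7)."""
    log_volume = log_exposure + rng.gauss(0.0, sd_volume)
    log_accidents = log_exposure + log_risk + rng.gauss(0.0, sd_accidents)
    return log_volume, log_accidents


rng = random.Random(1)                 # fixed seed, for reproducibility only
true_log_exposure = math.log(100.0)    # hypothetical latent values
true_log_risk = math.log(12.0)

errors = []
for _ in range(20000):
    lv, la = simulate_observations(true_log_exposure, true_log_risk, 0.05, 0.10, rng)
    empirical_log_rate = la - lv       # log(accidents / traffic volume)
    errors.append(empirical_log_rate - true_log_risk)

mean_error = sum(errors) / len(errors)
var_error = sum((e - mean_error) ** 2 for e in errors) / (len(errors) - 1)
print(mean_error, var_error)           # variance should be close to 0.05**2 + 0.10**2
```

The sketch only shows why the accident rate is a noisy measurement of risk; it does not estimate the latent variables, which is the task of the model described next.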

3.3.2. The role of the dynamic relation among states

Introduction

There are two ways in which time dependence may affect the analysis of road

safety data: the fact that observations are time related, and the fact that de-

pendence between disturbances may affect statistical inference. This section is

devoted to the former aspect. It is generally found that data collected over a

period of time tend to exhibit some form of time dependency (see, e.g. Ham-

pel, Rousseeuw, Ronchetti, and Stahel, 1986, Chapter 8, and many time series

monographs). Ignoring the dependencies may adversely affect the reliability

of statistical tests, and a model should therefore be able to correct for the time

dependencies (see for instance Harvey (1989), Durbin and Koopman (2001) or

Commandeur and Koopman (2007), and Section 3.3.8 for an overview).

The fact that important road safety factors may develop over time implies that

observations close together in time are often more similar than observations

further apart in time. An observation of road safety from 1960 will be less

indicative of the state of road safety in 2000 than an observation of road safety

from 1990.

It should be considered that some and possibly all aspects of a model could

be time-evolving. Conditions in the early days of motorisation are likely to be

different from the current conditions. The introduction of air bags into cars and

the improvement of vehicle construction in general is likely to have influenced

the effectiveness of seat belts in cars, for example. Therefore, the effect of seat

belt use shortly after it became obligatory will not necessarily be the same as

today. This means that some aspect of a model that describes the effectiveness


of seat belts is likely to evolve over time. The evolution of model aspects may

result in slowly changing values of (regression) coefficients. The importance

of the time-evolving nature of a model is of course dependent on the problem

at hand, in particular the length of the analysis period considered. Therefore,

a model should at least allow components to be time evolving. This is one of

the features of the latent risk time series model (LRT) presented in this thesis,

as will be demonstrated below.

For practical purposes, and apart from interventions, it can be assumed that

all aspects of the road safety system develop more or less smoothly (that is:

not completely erratically) over time. The negation of this assumption would

mean that consecutive observations bear no information at all on each other,

similar to assumptions for cross-sectional models. Although the case where

observations bear no information at all on each other is implausible in time

ordered data (see, e.g. Hampel et al., 1986, Chapter 8, and many time se-

ries monographs), it is important that when little information exists between

consecutive observations, models should not be adversely affected by it. See

Appendix A.3 for a further discussion of this issue in reference to multivariate

structural time series models. If absolutely no dynamic relation is assumed

between the time-ordered observations of an explanatory variable, the LRT

allows such variables to be treated similarly to ordinary regression models.

The discussion so far suggests that consecutive observations may be related

and that tendencies may be smooth, but it does not mention how the develop-

ments should be specified. As might be concluded from Figure 3.2, at least as

of the 1970’s, the developments of the log-vehicle kilometres (left hand panel)

and of the log-number of fatal accidents per vehicle kilometre (right hand

panel) may be reasonably well represented by just a straight line. Given the

fact that the points in the figure are the logarithms of the actual observed data,

which are observed subject to random error, one may wonder whether the true

log-exposure and log-risk may not have an even smoother development, pos-

sibly closer to the straight lines in the figure.

Trying to answer this question is the point where the dynamic relation between the states (exposure and risk) comes into play. If the current year is on average similar to the previous year and, more generally, if a year can be moderately well predicted13 from its recent past, how does knowledge of the state of the previous year balance with the knowledge of the state based on the observation of the current year if one wants to determine what the state of road safety is in the current year?

13 It is important that it would on average predict the next value. The magnitude of the statistical uncertainty is not important to be allowed to do this, although when the statistical uncertainty is high compared to the observation uncertainty, using previous observations will not help much.

Figure 3.2. Left hand panel: log-vehicle kilometres in the Netherlands (dots) and two linear regression lines. Right hand panel: log-number of fatal accidents per vehicle kilometre for the Netherlands. (Horizontal axes: years 1950 to 2000.)

This balancing of information in a time series setting can be achieved with the

Kalman filter (Kalman, 1960) and its derivations, as will be illustrated with an

example below. The Kalman filter can roughly be described as the time se-

ries equivalent of the ‘Empirical Bayes’ approach used in cross-sectional road

safety analysis, as discussed for instance in Hauer (1992). The basic assump-

tions of the Kalman filter and the Empirical Bayes method as described in

Hauer (1992) are similar.

While the relation between the state and the observed values based on road

safety theory is relatively straightforward (as discussed in Section 3.3.1), the

dynamic relation between states may be more complicated, and sometimes

more difficult to identify. However in many applications simple (local) linear

trends appear to be sufficient, as suggested by Figure 3.2. The next subsection

is devoted to a brief introduction of the concepts, assumptions and benefits of

dynamic relations.

Example of a basic dynamic relation: Constancy

Although this may seem a trivial dynamic relation, constancy is an often used

(assumed) dynamic relation which in real life may not always completely hold.

In the natural sciences this assumption is clearly reflected in the use of ‘con-

stants’. In reality, physical constants are measurements, often taken at irregular

intervals. For instance, the gravitational constant is currently estimated to be $G = 6.693 \times 10^{-11}$ cubic meters per kilogram second squared, with a standard error of the mean of $\pm 0.027 \times 10^{-11}$ and a systematic error of $\pm 0.021 \times 10^{-11}$ cubic meters per kilogram second squared (taken from the abstract of Fixler, Foster, McGuirk, and Kasevich, 2007). This value of the gravitational constant is

commonly substituted in equations. Because their development is practically

constant and their measurement sufficiently accurate for most purposes, their

‘state estimate’ can be assumed to be a constant value too for most applica-

tions. This substituting of values in equations is roughly how the LRT treats

missing values in data, as will be detailed later below.

Constancy as a dynamic relation can be specified as follows:

$$
\begin{cases}
y_t = a + \varepsilon_t,\\
a_t = a_{t-1},
\end{cases}
\tag{3.9}
$$

for $t = 1, \ldots, n$, where $y_t$ are the observations, $n$ is the total number of observations, $a_0 = a$ is the constant state, and $\varepsilon_t$ are the (possibly small) observation errors.

One application is the computation of the average of a series of data. As an example, assume that the long run average weight $w$ of sugar lumps produced in a factory is constant (stated more precisely: the weight of sugar lumps is independently sampled from a distribution with fixed mean and finite variance). Every minute one lump is sampled from the production line and the average weight $w_t$ of the lumps up to that time $t$ is determined. As can be concluded from limit theorems, the average weight $w_t$ will tend to the true average weight $w$, but each individual average weight $w_t$, $t = 1, 2, \ldots$, will be slightly different. In the next section it will be demonstrated how this average can be calculated using the Kalman filter, and how the $w_t$ are the so-called filtered estimates of the Kalman filter. The case where the true average weight $w$ never changes is discussed under the caption ‘Accurate dynamics’ below. The more realistic case where, due to temperature changes, machine maintenance and many other reasons, the true average weight $w$ actually changes with time is discussed under the caption ‘Approximate dynamics’. In the latter case, a constant value is no longer assumed for the latent state.

The Kalman filter as an algorithm for computing the average of a series of

observations

Accurate dynamics

A simple example of the Kalman filter is the calculation of the average of a series of identically and independently Gaussian distributed data $y_1, \ldots, y_n$. This is a common example where a constant, accurate dynamic relation is assumed. It is acknowledged that $\bar{y} = (1/n) \sum_{i=1}^{n} y_i$ is a more familiar and in many cases, but not always, easier way to compute the average of $n$ observations than the approach described here.

Using the Kalman filter (Kalman, 1960), at the first time point we have $\bar{y}_1 = y_1$ (our first sugar lump). This is what we know about the average if we only have one observation. At the second time point we have

$$
\bar{y}_2 = \frac{1}{2}\bar{y}_1 + \frac{1}{2}y_2 \equiv \frac{\bar{y}_1 + y_2}{2} = \frac{y_1 + y_2}{2},
$$

because $\bar{y}_1$ and $y_2$ are equally precise estimates of the average. The third time point yields

$$
\bar{y}_3 = \frac{2}{3}\bar{y}_2 + \frac{1}{3}y_3 \equiv \frac{2 \times \bar{y}_2 + y_3}{3} = \frac{2 \times \frac{y_1 + y_2}{2} + y_3}{3} = \frac{y_1 + y_2 + y_3}{3},
$$

and for time point $k$ we have:

$$
\bar{y}_k = \frac{k - 1}{k}\bar{y}_{k-1} + \frac{1}{k}y_k. \tag{3.10}
$$

It should be remarked that:

• Although (3.10) may appear unnecessarily complicated, it is efficient in

real-time processing because when a new observation becomes available,

only the weighted sum has to be computed instead of the average of all

(potentially many) observations.

• The approach $\bar{y} = (1/n) \sum_{i=1}^{n} y_i$ (and the precise implementation in (3.10)

as well) is only optimal when the $y_i$ are identically, and to a lesser extent,

Gaussian distributed. Both conditions are not always met. In particular

the condition that the observations should be identically distributed may

often be violated.
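The recursion (3.10) can be sketched in a few lines of Python. The function name `running_means` and the sample weights are hypothetical; the sketch assumes equally precise observations, as in the text above.

```python
def running_means(observations):
    """Return the filtered estimates of the average using recursion (3.10)."""
    estimates = []
    mean = 0.0
    for k, y in enumerate(observations, start=1):
        # Weight (k - 1)/k on the previous estimate and 1/k on the new observation.
        mean = (k - 1) / k * mean + y / k
        estimates.append(mean)
    return estimates


weights = [4.0, 6.0, 5.0, 9.0]     # hypothetical sugar lump weights
filtered = running_means(weights)
print(filtered)
```

Each new observation requires one weighted sum only, and the last filtered estimate coincides with the ordinary average of all observations.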

As concerns the latter issue, should the observation variances not all be the same, then the solution is to calculate a weighted average of the sample $y_1, \ldots, y_n$, considering their variances $\sigma^2(y_1), \ldots, \sigma^2(y_n)$ as well. The first (weighted) average is now (assuming independence):

$$
\bar{y}_2 = \frac{\sigma^2(y_2)}{\sigma^2(y_1) + \sigma^2(y_2)}\,y_1 + \frac{\sigma^2(y_1)}{\sigma^2(y_1) + \sigma^2(y_2)}\,y_2, \qquad \sigma^2(\bar{y}_2) = \frac{\sigma^2(y_1)\,\sigma^2(y_2)}{\sigma^2(y_1) + \sigma^2(y_2)}. \tag{3.11}
$$
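Applying (3.11) recursively can be sketched as follows. The function names `combine` and `weighted_running_means` and the data are hypothetical; the observation variances are assumed to be known and the observations independent.

```python
def combine(estimate, variance, y, y_variance):
    """Combine an estimate with a new observation using the weights of (3.11)."""
    total = variance + y_variance
    new_estimate = (y_variance / total) * estimate + (variance / total) * y
    new_variance = variance * y_variance / total
    return new_estimate, new_variance


def weighted_running_means(observations, variances):
    """Return (estimate, variance) pairs after each observation."""
    estimate, variance = observations[0], variances[0]
    results = [(estimate, variance)]
    for y, v in zip(observations[1:], variances[1:]):
        estimate, variance = combine(estimate, variance, y, v)
        results.append((estimate, variance))
    return results


# Hypothetical observations with unequal (known) variances.
results = weighted_running_means([10.0, 20.0, 30.0], [1.0, 4.0, 4.0])
print(results)
```

The more precise observation receives the larger weight, and the variance of the combined estimate is smaller than either of the variances that enter it.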


Recursively applying this approach thus amounts to applying the Kalman filter (Kalman, 1960; Harvey, 1981, 1989; Durbin and Koopman, 2001) for the estimation of the average of a time series. The estimate $\bar{y}_k$ is then called the filtered estimate of the average. This approach is almost equivalent to the ‘Empirical Bayes’ approach taken in many road safety studies (e.g. Hauer, 1992, Equation (1), page 460) except that there the variance for accident counts is estimated differently from the approach used in Bijleveld (2005) (see also Chapter 4), Bijleveld et al. (2008) (see also Chapter 5) and Chapter 6.

The filtered state estimate $\bar{y}_k$ is thus an estimate of the state at time $k$ based on all data up to time $k$. An alternative estimate of the state is one based on all available data (not only up to time $k$), called the smoothed state (see Durbin and Koopman (2001, Chapter 4) for more details). In the simple average example (3.10), this estimate is equal to $\bar{y}$ for all time points. In the weighted average example (3.11) the estimate of the smoothed state also has one and the same value (i.e., the weighted average) for all time points.

Approximate dynamics

The example above assumes constancy of the estimated quantity. This assump-

tion may be too strict in road safety analysis, particularly when a longer period

of time is considered. Therefore, we have to be careful not to assume a priori too

many values to be constant, although values may well turn out to be constant

in practice. For instance in the drink driving example in Figure 1.4, the alcohol

percentages of drivers exceeding the legal blood alcohol concentration limit

are probably not identical over the whole period, but they may be relatively

constant for some periods.

The LRT approach to this situation is to adapt the estimate (3.10) to a time varying value by adding a second random component $\eta_t$ in (3.9),

$$
\begin{cases}
y_t = a_t + \varepsilon_t,\\
a_t = a_{t-1} + \eta_t,
\end{cases}
$$

which amounts to modifying the weights according to the now adapted variances (compare with Hauer, 1992, Equation (1), page 460):

$$
\bar{y}_k = \alpha_k \bar{y}_{k-1} + (1 - \alpha_k)\,y_k, \tag{3.12}
$$

where $0 \le \alpha_k \le 1$ and $\alpha_k$ is appropriately chosen smaller than $(k - 1)/k$. The consequence of this is that the filtered estimate of the state ‘relies’ more on the actual observation $y_k$ than on its prediction from the past $\bar{y}_{k-1}$. Hauer (1992, page 460) uses a similar argument. Because of its symmetry, the Kalman filter approach could be compared to Hauer (1992)’s Empirical Bayes approach

by, in Hauer (1992)’s terminology, considering the observation as the reference

population and the prediction from the previous observation as the ‘accident

count’ or by considering the prediction from the previous observation as the

reference population and the new observation as the ‘accident count’. Both ap-

proaches rely on the fact that if consecutive observations are indeed related (in

Hauer (1992)’s terminology, if we do have a representative reference popula-

tion) then using that information will provide a better estimate. One difference

between the two approaches is that in the LRT, this assumption is extended to

selected explanatory variables as well.

The optimal choice of $\alpha_k$ in (3.12) and its multivariate analogue is based on

likelihood inference from the Kalman filter (Kalman, 1960; Harvey, 1981, 1989;

Durbin and Koopman, 2001) in Bijleveld et al. (2008) (see Chapter 5) and on

the Extended Kalman Filter (e.g., Harvey, 1989) in Chapter 6 while Chapter 7

relies on a further generalisation of the Kalman filter.
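The effect of the second random component on the weights in (3.12) can be sketched as follows. The sketch assumes a scalar model with a time-varying level, known variances (the names `var_obs` and `var_state` are hypothetical) and a filter that starts from the first observation. In the thesis the variances are estimated by likelihood inference rather than fixed in advance, so this is an illustration of the mechanism only.

```python
def local_level_weights(n, var_obs, var_state):
    """Weights alpha_k on the previous filtered estimate, for k = 2..n.

    Assumes y_t = a_t + eps_t and a_t = a_{t-1} + eta_t with known variances,
    and a filter initialised at the first observation.
    """
    p = var_obs                              # estimate variance after one observation
    alphas = []
    for _ in range(2, n + 1):
        p_pred = p + var_state               # uncertainty grows when the state may move
        gain = p_pred / (p_pred + var_obs)   # weight on the new observation
        alphas.append(1.0 - gain)            # alpha_k in (3.12)
        p = (1.0 - gain) * p_pred            # variance of the updated estimate
    return alphas


constant = local_level_weights(5, var_obs=1.0, var_state=0.0)
drifting = local_level_weights(5, var_obs=1.0, var_state=0.5)
print(constant)
print(drifting)
```

With `var_state` equal to zero the weights reduce to $(k - 1)/k$ as in (3.10); with a positive `var_state` every weight on the past is smaller, so the filtered estimate leans more on the newest observation.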

Example of a commonly used dynamic relation: Trend

As mentioned above, it might be concluded from Figure 3.2, at least as of the 1970’s, that the developments of the log-vehicle kilometres (left hand panel) and of the log-number of fatal accidents per vehicle kilometre (right hand panel) may be reasonably well represented by just a straight line14, which is often called a linear trend in time series analysis. The linear regression techniques used to determine the straight lines in the left hand panel of Figure 3.2 assume relations like

$$
\log(\text{Traffic volume}_t) = p \times t + q + (\text{some form of random error})_t,
$$

where $t = 1, 2, \ldots, n$, $q$ is the intercept or level of the regression line, and $p$ is the regression coefficient or slope of the regression line. If $p \times t + q$ is assumed to be the true value of $\log(\text{Traffic volume}_t)$ (as is usually done in such models, where, if the model is correct, the prediction is assumed to be a better estimate of the observed value than the observed value itself, as under the model assumptions the observed value is the predicted value plus some form of random error15), the state variable $a_t$ representing the ‘true value of exposure’ increments every new time point by the value ‘$p$’. The trend is specified by $p$ and $q$:

$$
a_t = a_{t-1} + p, \qquad a_0 = q,
$$

or more generally

$$
\begin{cases}
a_t = a_{t-1} + p_t,\\
p_t = p_{t-1},
\end{cases}
\qquad a_0 = q,\; p_0 = p,\; t = 1, \ldots, n.
$$

14 In this case, where logarithmic transforms of the data are displayed, it appears that the actual data are represented by an exponential line.

15 Some multistage analysis techniques, like the DRAG framework by Gaudry (1984) (and Johansson (1996)), use the estimated traffic volume instead of the observed traffic volume. Our approach takes this one step further, as it acknowledges that additional equations similarly improve estimates of the true exposure, as well as of other variables.

When required the LRT approach allows the value of the slope $p$ to change with time (which would be useful for the modelling of the series in the left hand panel of Figure 3.2) and/or the value of the level $q$ to change with time (which would be useful for the modelling of the series in the right hand panel of Figure 3.2):

$$
\begin{cases}
a_t = a_{t-1} + p_t + \eta^{(a)}_t,\\
p_t = p_{t-1} + \eta^{(p)}_t,
\end{cases}
\qquad a_0 = q,\; p_0 = p,\; t = 1, \ldots, n. \tag{3.13}
$$

Negative values of $\eta^{(a)}_t$ result in a drop in the level $q$ of the trend, which could be the case in the early 1970’s in the right hand panel of Figure 3.2. Negative values of $\eta^{(p)}_t$ result in a decrease in the slope $p$ of the trend, with the effect that the trend increases less fast than before if $p$ is positive, and decreases faster than before if $p$ is negative. The variances of $\eta^{(a)}_t$ and $\eta^{(p)}_t$ determine ‘how easily’ the trend changes. The smaller these variances are, the more the trend approximates a classical linear trend.

Extreme fluctuations in $\eta^{(a)}_t$ or $\eta^{(p)}_t$ (as may well occur for $\eta^{(p)}_t$ around the 1970’s in the left hand panel of Figure 3.2) indicate structural breaks, which are further discussed in Section 3.3.8. Note that constancy is a special case of trend.
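The behaviour of the trend specification (3.13) can be sketched by simulation. The function name `simulate_trend`, the starting values and the standard deviations are hypothetical, and Gaussian disturbances are assumed for the illustration.

```python
import random


def simulate_trend(n, q, p, sd_level, sd_slope, rng):
    """Simulate the level a_t of (3.13) for t = 1..n, starting from a_0 = q, p_0 = p."""
    level, slope = q, p
    path = []
    for _ in range(n):
        slope = slope + rng.gauss(0.0, sd_slope)            # p_t = p_{t-1} + eta^(p)_t
        level = level + slope + rng.gauss(0.0, sd_level)    # a_t = a_{t-1} + p_t + eta^(a)_t
        path.append(level)
    return path


rng = random.Random(0)   # fixed seed, for reproducibility only
straight = simulate_trend(5, q=2.0, p=0.5, sd_level=0.0, sd_slope=0.0, rng=rng)
flat = simulate_trend(5, q=2.0, p=0.0, sd_level=0.0, sd_slope=0.0, rng=rng)
wobbly = simulate_trend(5, q=2.0, p=0.5, sd_level=0.1, sd_slope=0.02, rng=rng)
print(straight, flat, wobbly)
```

With both standard deviations at zero the path is the straight line $q + p \times t$, and with a zero slope as well it is constant, which illustrates that constancy and the classical linear trend are special cases of (3.13).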

3.3.3. Specification by means of linear structural models

Introduction

Looking at the observations shown in Figure 3.2, one may be tempted to con-

clude that the log-exposure and log-risk components are simple linear func-

tions of time. However, a closer inspection reveals that this is not precisely

the case, i.e., the developments do not precisely follow a straight line. The


log-risk16 in the right hand panel of Figure 3.2, for example, shows a few ‘ups

and downs’ that seem to last longer than just a few years (or observations in

general). The development of the log-risk may therefore well be approximated

by just a linear trend plus some noise in (3.13). The development of the log-

traffic volume in the left hand panel of Figure 3.2 may also reasonably well be

approximated by (two) linear trends, but then some structural break around

1970 will probably be left in the residuals. In addition, it appears that some

cyclic fluctuation around the trend remains. This may (or may not) be an eco-

nomic cycle. The LRT (which is a type of structural time series model) allows

for a kind of building-block approach, where the latent variable for exposure

is decomposed into a linear trend and a cycle component, each having their

own dynamic relation.

The basic idea of the treatment of latent variables in structural models is to de-

compose them into components, sometimes called unobserved components,

that serve special purposes. In this section we follow the introduction of Har-

vey and Shephard (1993) (see also Harvey (1989) and Commandeur and Koop-

man (2007)) and their formulation. These authors propose to decompose a

(univariate) time series $y_t$ into a trend ($\mu_t$), cycle ($\psi_t$), seasonal ($\gamma_t$) and irregular

($\varepsilon_t$) component:

$$
y_t = \mu_t + \psi_t + \gamma_t + \varepsilon_t. \tag{3.14}
$$

As mentioned in Harvey and Shephard (1993), all components are in general

stochastic and their disturbances (see below) are assumed to be mutually un-

correlated (the multivariate case is different in this respect). In order to be

identifiable, the components ψt, γt and εt should on average be nil. In the LRT

approach, which is a multivariate extension of the structural time series mod-

elling approach, there are multiple dependent variables ‘yt’ (traffic volume,

number of accidents) and multiple latent variables ‘µt + ψt + γt’ (exposure,

risk, percentage alcohol abusive drivers, percentage seat belt users, etc). Each

latent variable is represented by the sum of a set of unobserved components.

Such sets of unobserved components may even consist of higher order trends

than linear, multiple seasonal patterns, and multiple cycles, if that should be

required for the proper modelling of the dynamic properties of a latent variable.

The trend, seasonal and irregular components are further discussed in the fol-

lowing sections. Cycle components are not used in this thesis, but details can

be found in Harvey (1989, Section 2.3.3) and Harvey and Shephard (1993).

16 Note that the log-risk is the logarithm of the empirical risk, thus $\log(\text{number of fatal accidents}) - \log(\text{vehicle kilometres})$. In the latent risk model for time series the developments will be smoother.


The trend component (local linear trend) µt

As already given in (3.13), the trend specification is:

$$
\begin{cases}
a_t = a_{t-1} + p_t + \eta^{(a)}_t,\\
p_t = p_{t-1} + \eta^{(p)}_t,
\end{cases}
\qquad a_0 = q,\; p_0 = p,\; t = 1, \ldots, n. \tag{3.13}
$$

The component $a_t$ in (3.13) is called the level component of the trend, and $p_t$ is called the slope component. As the other components (the seasonal and the irregular discussed below) should be nil on average, the component $a_t$ serves as the expected value for $y_t$ (compare (3.14)).

If (3.13) describes the trend of exposure, then this will henceforth be denoted by replacing $a_t$ by $\mu^{(el)}_t$ (with ‘l’ for level) and $p_t$ by $\mu^{(es)}_t$ (with ‘s’ for slope) as in (3.32). We now have:

$$
\begin{pmatrix} \mu^{(el)}_t \\ \mu^{(es)}_t \end{pmatrix}
=
\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} \mu^{(el)}_{t-1} \\ \mu^{(es)}_{t-1} \end{pmatrix}
+
\begin{pmatrix} \eta^{(el)}_t \\ \eta^{(es)}_t \end{pmatrix},
\tag{3.15}
$$

which can be rewritten as

$$
\alpha_t = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \alpha_{t-1} + \eta_t. \tag{3.16}
$$

When the variances of the error terms $\eta^{(el)}_t$ and $\eta^{(es)}_t$ are nil, (3.15) collapses to a straight line (see Commandeur and Koopman (2007) for a discussion on how the linear regression model can be derived from (3.15)). When the variances of the error terms $\eta^{(el)}_t$ and $\eta^{(es)}_t$ are not equal to nil, this model is called a local linear trend model. It can be considered a linear regression model $p_t \times t + q_t$, where, if the $p_t$ and $q_t$ only mildly vary in value over time, the trend is approximately linear over a short period of time. A trend model where $\mu^{(es)}_{t-1} \equiv 0$ is a so-called local level model, an often used special case of a trend.

The seasonal component γt

A second important class of dynamic relation is the seasonal pattern. One

example in Chapter 5 and the example in Chapter 7 include such patterns. The

common approach is to have a – seasonally corrected – trend and an additional

seasonal effect.

In many applications, seasonal patterns are modelled using dummy variables. When a monthly pattern is to be modelled, effects for each month $\gamma^{(\mathrm{jan})}$ to $\gamma^{(\mathrm{dec})}$ are included in a model:

$$
y_{\text{year},\text{month}} = p \times \text{year} + q + \gamma^{(\text{month})} + (\text{some form of random error})_{\text{year},\text{month}} \tag{3.17}
$$

where the seasonal pattern also includes the within-year trend, or

$$
y_t = p \times t + q + \gamma^{(\text{month}(t))} + (\text{some form of random error})_t,
$$

where $t$ increments per month. Neither of these two specifications is fully determined. To achieve that, many approaches either fix one month at nil, or enforce $\gamma^{(\mathrm{jan})} + \gamma^{(\mathrm{feb})} + \cdots + \gamma^{(\mathrm{dec})} = 0$. The latter approach has the benefit that the trend component (here simplified to $p \times t + q$) represents the average trend, while in the former approach the trend component represents the trend of the month fixed at nil. The approach enforcing $\gamma^{(\mathrm{jan})} + \gamma^{(\mathrm{feb})} + \cdots + \gamma^{(\mathrm{dec})} = 0$ is taken17 in structural time series models and the LRT, as it is easily implemented in a dynamic specification: it amounts to enforcing e.g. $\gamma^{(\mathrm{jan})} = -(\gamma^{(\mathrm{feb})} + \cdots + \gamma^{(\mathrm{dec})})$. Thus, the value of the seasonal component for one month is effectively equal to minus the sum of the last eleven seasonal components (where $m_t(0)$ is the current month at time $t$, for instance July, and $m_t(-1)$ is June, which also is $m_{t-1}(0)$):

$$
\begin{aligned}
\gamma^{(m_t(0))}_t &= -\gamma^{(m_{t-1}(0))}_{t-1} - \cdots - \gamma^{(m_{t-1}(-10))}_{t-1} + \eta^{(e\gamma)}_t,\\
\gamma^{(m_t(-1))}_t &= \gamma^{(m_{t-1}(0))}_{t-1},\\
\gamma^{(m_t(-2))}_t &= \gamma^{(m_{t-1}(-1))}_{t-1},\\
\gamma^{(m_t(-3))}_t &= \gamma^{(m_{t-1}(-2))}_{t-1},\\
\gamma^{(m_t(-4))}_t &= \gamma^{(m_{t-1}(-3))}_{t-1},\\
\gamma^{(m_t(-5))}_t &= \gamma^{(m_{t-1}(-4))}_{t-1},\\
\gamma^{(m_t(-6))}_t &= \gamma^{(m_{t-1}(-5))}_{t-1},\\
\gamma^{(m_t(-7))}_t &= \gamma^{(m_{t-1}(-6))}_{t-1},\\
\gamma^{(m_t(-8))}_t &= \gamma^{(m_{t-1}(-7))}_{t-1},\\
\gamma^{(m_t(-9))}_t &= \gamma^{(m_{t-1}(-8))}_{t-1},\\
\gamma^{(m_t(-10))}_t &= \gamma^{(m_{t-1}(-9))}_{t-1},
\end{aligned}
\tag{3.18}
$$

where, when $t$ is January, $\gamma^{(m_t(0))}_t$ is the effect of January, $\gamma^{(m_{t-1}(0))}_{t-1} \equiv \gamma^{(m_t(-1))}_t$ is the effect of December last year and $\gamma^{(m_{t-1}(-1))}_{t-1}$ is the effect of November last year. If the variance of $\eta^{(e\gamma)}_t$ is nil, (3.18) collapses to the classical dummy variable approach, as is often used in (generalised) linear models. When the variance of $\eta^{(e\gamma)}_t$ is not equal to nil, the seasonal pattern may slowly change over time.

17 For alternative specifications, see e.g. Harvey (1989).

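The system of linear equations (3.18) can also be put into matrix form:

$$
\begin{pmatrix}
\gamma^{(m_t(0))}_t \\ \gamma^{(m_t(-1))}_t \\ \gamma^{(m_t(-2))}_t \\ \gamma^{(m_t(-3))}_t \\ \gamma^{(m_t(-4))}_t \\ \gamma^{(m_t(-5))}_t \\ \gamma^{(m_t(-6))}_t \\ \gamma^{(m_t(-7))}_t \\ \gamma^{(m_t(-8))}_t \\ \gamma^{(m_t(-9))}_t \\ \gamma^{(m_t(-10))}_t
\end{pmatrix}
=
\begin{pmatrix}
\eta^{(e\gamma)}_t \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0
\end{pmatrix}
+
\begin{pmatrix}
-1 & -1 & -1 & -1 & -1 & -1 & -1 & -1 & -1 & -1 & -1 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0
\end{pmatrix}
\begin{pmatrix}
\gamma^{(m_{t-1}(0))}_{t-1} \\ \gamma^{(m_{t-1}(-1))}_{t-1} \\ \gamma^{(m_{t-1}(-2))}_{t-1} \\ \gamma^{(m_{t-1}(-3))}_{t-1} \\ \gamma^{(m_{t-1}(-4))}_{t-1} \\ \gamma^{(m_{t-1}(-5))}_{t-1} \\ \gamma^{(m_{t-1}(-6))}_{t-1} \\ \gamma^{(m_{t-1}(-7))}_{t-1} \\ \gamma^{(m_{t-1}(-8))}_{t-1} \\ \gamma^{(m_{t-1}(-9))}_{t-1} \\ \gamma^{(m_{t-1}(-10))}_{t-1}
\end{pmatrix}.
\tag{3.19}
$$

The vector $\{\gamma^{(m_t(0))}_t, \ldots, \gamma^{(m_t(-10))}_t\}'$ thus contains the eleven most recent seasonal effects.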
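The seasonal recursion of (3.18) and (3.19) can be sketched without writing out the matrix, since the first row takes minus the sum of the state and the remaining rows only shift the older effects one position down. The function name `seasonal_step` and the monthly effects below are hypothetical; the disturbance is set to zero so that the classical dummy variable pattern is reproduced.

```python
def seasonal_step(state, eta=0.0):
    """One step of (3.18) for a state holding the eleven most recent seasonal effects."""
    newest = -sum(state) + eta        # first row of (3.19): minus the sum, plus disturbance
    return [newest] + state[:-1]      # other rows: shift the older effects down by one


# Hypothetical monthly effects for January to December, summing to zero.
pattern = [3.0, 2.0, 1.0, 0.0, -1.0, -2.0, -3.0, -2.0, -1.0, 0.0, 1.0, 2.0]

state = pattern[:0:-1]                # state in December: Dec, Nov, ..., Feb
effects = []
for _ in range(24):                   # two further years without disturbances
    state = seasonal_step(state)
    effects.append(state[0])
print(effects)
```

Without disturbances the twelve effects repeat every year; a non-zero `eta` would let the pattern change gradually, as described above.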

The irregular component εt

The irregular component is similar to the error component in classical regres-

sion models. The irregular components of different time points are assumed

to be independent. However, within each time point, error components are

not necessarily independent. For instance, the number of accidents and the

number of victims are likely to be correlated, as more accidents usually im-

plies more victims as well. As regards traffic volume data, trip data for differ-


ent travel modes are obtained from the same sample units, and therefore the

sampling error is also likely to be correlated. More details on the observation

covariance of accident data can be found in Bijleveld (2005) or Chapter 4 of this

thesis, and for travel survey data we refer to Slootbeek (1993).

For more details on structural components, and alternatives, see for instance

Harvey (1989), Harvey and Shephard (1993) and Commandeur and Koopman

(2007).

Linear Dynamic specification

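All structural components discussed so far in this section are linear, and can be generally specified as

$$
\alpha_t = T\alpha_{t-1} + \eta_t.
$$

A latent variable that develops with a local linear trend and a quarterly seasonal can be represented by five state elements: $\{\text{level}_t, \text{slope}_t, \text{seasonal}_t, \text{seasonal dummy1}_t, \text{seasonal dummy2}_t\}'$:

$$
\begin{pmatrix}
\text{level}_t \\ \text{slope}_t \\ \text{seasonal}_t \\ \text{seasonal dummy1}_t \\ \text{seasonal dummy2}_t
\end{pmatrix}
=
\begin{pmatrix}
1 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & -1 & -1 & -1 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0
\end{pmatrix}
\begin{pmatrix}
\text{level}_{t-1} \\ \text{slope}_{t-1} \\ \text{seasonal}_{t-1} \\ \text{seasonal dummy1}_{t-1} \\ \text{seasonal dummy2}_{t-1}
\end{pmatrix}
+
\begin{pmatrix}
\eta^{\text{level}}_t \\ \eta^{\text{slope}}_t \\ \eta^{\text{seasonal}}_t \\ 0 \\ 0
\end{pmatrix},
$$

see also (3.13) and the seasonal specification of (3.19). Note that the value of the latent variable is $\text{level}_t + \text{seasonal}_t$, which is equal to

$$
\begin{pmatrix} 1 & 0 & 1 & 0 & 0 \end{pmatrix}
\begin{pmatrix}
\text{level}_t \\ \text{slope}_t \\ \text{seasonal}_t \\ \text{seasonal dummy1}_t \\ \text{seasonal dummy2}_t
\end{pmatrix}.
\tag{3.20}
$$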
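The five-element specification and the selection in (3.20) can be sketched as follows. The helper `mat_vec`, the names `T` and `Z` and the starting values are hypothetical, and all disturbances are set to zero so that the deterministic part of the dynamics is visible.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


T = [
    [1, 1,  0,  0,  0],   # level_t = level_{t-1} + slope_{t-1}
    [0, 1,  0,  0,  0],   # slope_t = slope_{t-1}
    [0, 0, -1, -1, -1],   # seasonal_t = minus the sum of the three previous effects
    [0, 0,  1,  0,  0],   # seasonal dummy1_t = seasonal_{t-1}
    [0, 0,  0,  1,  0],   # seasonal dummy2_t = seasonal dummy1_{t-1}
]
Z = [1, 0, 1, 0, 0]       # selects level_t + seasonal_t, as in (3.20)

# Hypothetical start: level 10, slope 1, quarterly effects (newest first) 2, -1, -3.
state = [10.0, 1.0, 2.0, -1.0, -3.0]
values = []
for _ in range(4):
    state = mat_vec(T, state)                              # disturbances set to zero
    values.append(sum(z * s for z, s in zip(Z, state)))    # value of the latent variable
print(values)
```

Stacking several latent variables amounts to placing blocks like `T` on the diagonal of a larger transition matrix, with one selection row per latent variable.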

As a model can be defined using more than one latent component, for instance one for latent exposure and one for latent risk, in practice the model is defined by stacking the latent components (see also Commandeur and Koopman, 2007, Chapter 9). In summary, the dynamic specification of the LRT can be written as

$$
\alpha_t = T_t\,\alpha_{t-1} + c_t + R_t\,\eta_t, \tag{3.21}
$$

where the vector $c_t$ can be used to implement effects of explanatory (exogenous) variables, and structural breaks. This vector can also be used to allow

traffic volume to have a direct effect on risk. In the macroscopic models (Oppe,

1989; Oppe and Koornstra, 1990; Oppe, 1991a,c) non-linear (or log-linear) de-

velopments are assumed. Although the methods based on the extended Kalman

filter presented in Chapter 6 could also handle nonlinear dynamic relations,

these are not developed in this thesis, and have not yet been implemented in

the approach presented in Chapter 7.

3.3.4. Linear measurement equations

The latent risk time series model (LRT) is a (multivariate) combination of mea-

surement equations as shown in (3.7), where the components on the right hand

side of (3.7) (the (log-)risk and (log-)exposure components) are assumed to de-

velop over time roughly as discussed in Section 3.3.2, and illustrated with Fig-

ure 3.2.

In this section the measurement model is formalised, and formulated as a set

of linear equations. To this end, more linear equations are added to the set

of linear equations (3.7) discussed in Section 3.4 (and Chapter 5). As a simple

example, the measurement equations in (3.7) can be extended to include victim

counts, as follows:

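$$
\begin{aligned}
\log(\text{Traffic volume}) &= \log(\text{exposure}) + \text{random term}_1,\\
\log(\text{Number of accidents}) &= \log(\text{exposure}) + \log(\text{risk}) + \text{random term}_2,\\
\log(\text{Number of victims}) &= \log(\text{exposure}) + \log(\text{risk}) + \log(\text{injury}) + \text{random term}_3.
\end{aligned}
\tag{3.22}
$$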

Here ‘log (injury)’ is an additional latent variable describing the (logarithm of the) average number of victims per accident, which may be of interest as a measure of accident severity. In the remainder of this chapter, a symbolic notation will be used for the terms in (3.22), including the time index $t$. From now on, log (exposure) is denoted by $\mu_t^{(e)}$ (where $\mu_t$ stands for a latent variable, and ‘e’ for exposure) with random term $\varepsilon_t^{(e)}$, log (risk) is denoted by $\mu_t^{(a)}$ (with ‘a’ for accidents) with random term $\varepsilon_t^{(a)}$, and log (injury) is denoted by $\mu_t^{(v)}$ (‘v’ for victims) with random term $\varepsilon_t^{(v)}$. This yields:

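$$
\begin{aligned}
\log(\text{Traffic volume}_t) &= \mu_t^{(e)} + \varepsilon_t^{(e)},\\
\log(\text{Number of accidents}_t) &= \mu_t^{(e)} + \mu_t^{(a)} + \varepsilon_t^{(a)},\\
\log(\text{Number of victims}_t) &= \mu_t^{(e)} + \mu_t^{(a)} + \mu_t^{(v)} + \varepsilon_t^{(v)}.
\end{aligned}
\tag{3.23}
$$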

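In matrix notation (3.23) can be written as:

$$
\begin{pmatrix}
\log(\text{Traffic volume}_t) \\ \log(\text{Number of accidents}_t) \\ \log(\text{Number of victims}_t)
\end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 \\
1 & 1 & 0 \\
1 & 1 & 1
\end{pmatrix}
\begin{pmatrix}
\mu_t^{(e)} \\ \mu_t^{(a)} \\ \mu_t^{(v)}
\end{pmatrix}
+
\begin{pmatrix}
\varepsilon_t^{(e)} \\ \varepsilon_t^{(a)} \\ \varepsilon_t^{(v)}
\end{pmatrix}.
$$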

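Note that a latent variable like $\mu_t^{(e)}$ may in practice consist of several components (such as a level, a slope and a seasonal), and in that case needs to be represented by a vector. Further letting

$$
y_t =
\begin{pmatrix}
\log(\text{Traffic volume}_t) \\ \log(\text{Number of accidents}_t) \\ \log(\text{Number of victims}_t)
\end{pmatrix},
\quad
Z_t =
\begin{pmatrix}
1 & 0 & 0 \\
1 & 1 & 0 \\
1 & 1 & 1
\end{pmatrix},
\quad
\alpha_t =
\begin{pmatrix}
\mu_t^{(e)} \\ \mu_t^{(a)} \\ \mu_t^{(v)}
\end{pmatrix},
\quad
\varepsilon_t =
\begin{pmatrix}
\varepsilon_t^{(e)} \\ \varepsilon_t^{(a)} \\ \varepsilon_t^{(v)}
\end{pmatrix},
$$

(3.23) can be written as

$$
y_t = Z_t \alpha_t + d_t + \varepsilon_t, \tag{3.24}
$$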

which will be referred to in Section 3.3.5. Just as in the dynamic relation (3.21), a vector $d_t$ is added to allow for explanatory (exogenous) variables. Note that effects included in $d_t$ affect how the latent variables are observed. For instance, in 1994 the structure of the travel survey was changed: from then on, inhabitants under 12 years of age were included in the survey. This may have had an effect on some travel indicators. However, there is no reason to assume that this change in the travel survey actually changed travel itself. Therefore the development of the latent exposure component $\mu_t^{(e)}$ should not have been affected. The way it is observed (the survey) did change, however, so this intervention should be placed in the measurement equations.
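The role of $Z_t$ and $d_t$ in (3.24) can be illustrated with a small numerical sketch. The latent values and the size of the survey effect (0.05 on the log scale) are hypothetical and not taken from the data; the helper `observe` only evaluates the noise-free part $Z_t \alpha_t + d_t$.

```python
import math

# Observation matrix Z of (3.23): rows are log traffic volume,
# log accidents and log victims; columns are the latent
# log-exposure, log-risk and log-injury components.
Z = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 1],
]


def observe(Z, alpha, d):
    """Noise-free part of (3.24): Z @ alpha + d."""
    return [sum(z * a for z, a in zip(row, alpha)) + d_i
            for row, d_i in zip(Z, d)]


# Hypothetical latent values (not taken from the thesis).
exposure, risk, injury = 100.0, 0.02, 1.2
alpha = [math.log(exposure), math.log(risk), math.log(injury)]

# Before the survey change: no intervention effect.
y_before = observe(Z, alpha, d=[0.0, 0.0, 0.0])

# After the survey change: a hypothetical level shift of 0.05 in the
# observed log traffic volume only; the latent state is unchanged.
y_after = observe(Z, alpha, d=[0.05, 0.0, 0.0])

accidents = math.exp(y_before[1])                    # exposure * risk
victims_per_accident = math.exp(y_before[2] - y_before[1])
```

Because the shift enters through $d_t$, only the first observed series moves; the accident and victim equations, which share the same latent exposure, are unaffected.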


3.3.5. General state space model specification

Combining (3.21) and (3.24) we obtain the full and general specification of the

LRT. The LRT is a linear Gaussian state space model (see e.g., Harvey, 1989;

Durbin and Koopman, 2001) where it is assumed that p× 1 observation vectors

yt (t = 1, . . . , n) are generated by the process:

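$$
\begin{cases}
\alpha_t = T_t \alpha_{t-1} + c_t + R_t \eta_t, & \eta_t \sim \text{NID}(0, Q_t),\\
y_t = Z_t \alpha_t + d_t + \varepsilon_t, & \varepsilon_t \sim \text{NID}(0, H_t),
\end{cases}
\tag{3.25}
$$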

where the error terms εt and ηt are assumed to be zero mean, independent and

identically multivariate Gaussian distributed. The unobserved state at time t

is represented by the m × 1 vector αt. Rt is a selection matrix composed of

r ≤ m columns of the m-dimensional identity matrix Im. The variance matrices

Qt and Ht are assumed to be non-singular (the variance matrix for Rtηt need

not be non-singular). The vectors ct and dt can be used to model effects of

explanatory variables on both the state and the measurement18. In general, the

matrices Zt, Tt, Ht and Qt and vectors ct and dt are assumed to be known or

otherwise to depend on an unknown parameter vector θ. The first equation of

(3.25) is called the state equation, the second equation is called the observation

equation.
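To make the roles of the state equation and the observation equation concrete, the following sketch simulates data from a special case of (3.25): a single local linear trend with $T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$, $Z = (1 \;\; 0)$ and $c_t = d_t = 0$. The function name `simulate_llt` and all parameter values are assumptions made for illustration only.

```python
import random


def simulate_llt(n, level0, slope0, sd_level, sd_slope, sd_obs, seed=0):
    """Simulate n observations from a local linear trend model written as
    (3.25): state (level, slope), T = [[1, 1], [0, 1]], Z = [1, 0]."""
    rng = random.Random(seed)
    level, slope = level0, slope0      # alpha_0, the state before the first step
    ys, levels = [], []
    for _ in range(n):
        # State equation: alpha_t = T alpha_{t-1} + eta_t.
        level, slope = (level + slope + rng.gauss(0.0, sd_level),
                        slope + rng.gauss(0.0, sd_slope))
        # Observation equation: y_t = Z alpha_t + eps_t.
        ys.append(level + rng.gauss(0.0, sd_obs))
        levels.append(level)
    return ys, levels


# With all standard deviations set to zero the model reduces to a straight line.
line, _ = simulate_llt(5, level0=1.0, slope0=0.5,
                       sd_level=0.0, sd_slope=0.0, sd_obs=0.0)
```

Setting `sd_slope` to zero gives a fixed slope with a stochastic level, and setting `sd_level` to zero gives a smooth trend; both are restrictions on $Q_t$ rather than different models.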

In the current context $H_t$ is the sum of a time-invariant matrix and a time-dependent observation covariance matrix; see Bijleveld (2005) or Chapter 4 of this thesis for how this applies to accident data, and Slootbeek (1993) for how this applies to travel survey data.

3.3.6. Estimation of parameters and latent factors, missing data

This section discusses the estimation procedure for the linear LRT (see Sec-

tion 5.2 for more details, Section 6.4 for the almost equivalent variant for the

extended Kalman filter, and Section 7.2 for the generalised version). The el-

ements of the unknown parameter vector θ are estimated by the method of

maximum likelihood. For a fully linear model, the Gaussian log-likelihood

function is evaluated by the Kalman filter and numerically maximised with

respect to the unknown parameters, see Harvey (1989) and Durbin and Koopman (2001). Consider a state space model with a linear Gaussian observation equation

$$
y_t = d_t + Z_t \alpha_t + \varepsilon_t, \tag{3.26}
$$

where $d_t$ is a known vector and $Z_t$ is a known matrix (both may depend on $\theta$). Both can be time-varying and may depend on past observations. Further, we assume that the disturbances $\varepsilon_t$ are Gaussian distributed.

18 In general, when an explanatory (exogenous) variable affects how road safety is observed, it should be included in $d_t$. If it affects road safety itself, for instance risk, it should be included in $c_t$.

The Kalman filter recursively evaluates the estimator of the state vector conditional on past observations $Y_{t-1} = \{y_1, \ldots, y_{t-1}\}$. The conditional estimator of the state vector is denoted by $a_{t|t-1} = E(\alpha_t | Y_{t-1})$ and its conditional variance matrix by $P_{t|t-1} = \mathrm{var}(\alpha_t | Y_{t-1})$. The Kalman filter is given by the set of vector and matrix equations

$$
\begin{aligned}
v_t &= y_t - d_t - Z_t a_{t|t-1}, & F_t &= Z_t P_{t|t-1} Z_t' + H_t,\\
K_t &= T P_{t|t-1} Z_t' F_t^{-1}, & &\\
a_{t+1|t} &= T a_{t|t-1} + K_t v_t, & P_{t+1|t} &= T P_{t|t-1} T' - K_t F_t K_t' + R_t Q_t R_t',
\end{aligned}
\tag{3.27}
$$

for t = 1, . . . , n, where a1|0 and P1|0 are the unconditional mean and variance

of the initial state vector, respectively. When an initial state element is taken as

a realisation from a diffuse density, we can take its mean as zero and its vari-

ance as a large value. Exact treatments of diffuse initialisations are discussed in

Durbin and Koopman (2001), and are implemented using Ox (Doornik (2001))

and SsfPack (Koopman, Shephard, and Doornik (1998)). The vector vt is the

one-step ahead prediction error with variance matrix Ft. The optimal weight-

ing for filtering is determined by the Kalman gain matrix Kt. The joint density

of the observations can be expressed as a product of predictive densities via the

prediction error decomposition. As a result, the log-likelihood function can be

constructed via the Kalman filter and is given by

$$
\ell = -\frac{np}{2} \log 2\pi - \frac{1}{2} \sum_{t=1}^{n} \log |F_t| - \frac{1}{2} \sum_{t=1}^{n} v_t' F_t^{-1} v_t. \tag{3.28}
$$

With diffuse state elements, the log-likelihood function requires some modifi-

cations. For a linear Gaussian state space model, the log-likelihood function ℓ

is exact.
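The recursions (3.27) and the log-likelihood (3.28) are easiest to follow in the simplest univariate case. The sketch below assumes a local level model ($T = Z = R = 1$, $d_t = 0$, so $p = 1$) and approximates a diffuse initial state by a large variance `p1`; it is an illustration of the prediction error decomposition, not the Ox/SsfPack implementation used in this thesis. The function name `local_level_loglik` is hypothetical.

```python
import math


def local_level_loglik(y, var_eps, var_eta, a1=0.0, p1=1e7):
    """Gaussian log-likelihood (3.28) of a local level model via the Kalman
    filter (3.27) with T = Z = R = 1, d_t = 0, H = var_eps, Q = var_eta."""
    a, p = a1, p1                      # a_{1|0} and P_{1|0}
    loglik = 0.0
    for y_t in y:
        v = y_t - a                    # prediction error v_t
        f = p + var_eps                # its variance F_t
        k = p / f                      # Kalman gain K_t
        loglik += -0.5 * (math.log(2.0 * math.pi) + math.log(f) + v * v / f)
        a = a + k * v                  # a_{t+1|t}
        p = p - k * f * k + var_eta    # P_{t+1|t} = P - K F K' + Q
    return loglik
```

In an estimation setting this function would be handed to a numerical optimiser, typically with the variances parameterised on a log scale so that they remain positive. With `var_eta = 0` and `p1 = 0` the model reduces to independent observations around a known mean, which gives a simple check of the code.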

When a value for a particular element of vector yt is not available, it is treated

as a missing value. The Kalman filter can handle missing values in a straight-

forward way. A direct consequence of a missing entry is that the associated

element of the innovation vector $v_t$ cannot be computed and is unknown. Assuming that all entries in $y_t$ are missing, we treat $v_t$ as unknown by taking $v_t = 0$ and its variance matrix $F_t \to \infty I$, such that $F_t^{-1} \to 0$ and $K_t \to 0$. It follows that the state update equations can now be written as

$$
a_{t+1|t} = T a_{t|t-1}, \qquad P_{t+1|t} = T P_{t|t-1} T' + R_t Q_t R_t'.
$$

These computations are repeated when a number of (consecutive) observa-

tions are missing. This solution also serves as the basis for out-of-sample fore-

casting (where future values are missing) or back-casting (where past values

are missing; this step involves the Kalman smoother as detailed in the next sec-

tion). A missing value does not enter the log-likelihood expression of (3.28).

If only some elements of yt are missing, then the corresponding elements of vt

are taken as zero and the associated rows and columns of F−1t and Kt are taken

as zero vectors (effectively removed).
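The treatment of missing values can be sketched for the same univariate local level model ($T = Z = R = 1$). In this hypothetical example a missing observation is coded as `None`; trailing missing values then produce out-of-sample forecasts with growing prediction variance. The function name and the data are assumptions made for illustration.

```python
def filter_with_missing(y, var_eps, var_eta, a1=0.0, p1=1e7):
    """Local level Kalman filter; entries of y that are None are missing.
    Returns the one-step-ahead predictions a_{t|t-1} and variances P_{t|t-1}
    for t = 1, ..., n + 1."""
    a, p = a1, p1
    preds, variances = [a], [p]
    for y_t in y:
        if y_t is None:
            # Missing: v_t = 0 and K_t = 0, so only the prediction step remains.
            p = p + var_eta
        else:
            v = y_t - a
            f = p + var_eps
            k = p / f
            a = a + k * v
            p = p - k * f * k + var_eta
        preds.append(a)
        variances.append(p)
    return preds, variances


# Two observations followed by three missing values (a three-step forecast).
preds, variances = filter_with_missing([5.0, 5.2, None, None, None],
                                       var_eps=1.0, var_eta=0.5)
```

For the local level model the forecast itself stays constant over the missing stretch, while its variance increases by `var_eta` at every step.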

3.3.7. Kalman smoother, auxiliary residuals

The smoothed estimate of a latent factor is the conditional mean given all available observations in the sample. The smoothed estimate of the state vector is denoted by $\hat{\alpha}_t = E(\alpha_t | Y_n)$ with variance matrix $V_t = \mathrm{var}(\alpha_t | Y_n)$. Once the Kalman filter has been applied, the smoothed estimates can be computed via the backward recursions

$$
\begin{aligned}
r_{t-1} &= Z_t' F_t^{-1} v_t + L_t' r_t, & N_{t-1} &= Z_t' F_t^{-1} Z_t + L_t' N_t L_t,\\
\hat{\alpha}_t &= a_{t|t-1} + P_{t|t-1} r_{t-1}, & V_t &= P_{t|t-1} - P_{t|t-1} N_{t-1} P_{t|t-1},
\end{aligned}
\tag{3.29}
$$

for $t = n, \ldots, 1$, where $L_t = T - K_t Z_t$ and with initialisations $r_n = 0$ and $N_n = 0$. The algorithm

is a variation of the fixed interval smoothing method of Anderson and Moore

(1979) and was developed by de Jong (1989) and Kohn and Ansley (1989), see

also Durbin and Koopman (2001, Chapter 4).
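The backward recursions (3.29), with $r_t$ and $N_t$ on the right-hand side, can be sketched for the univariate local level model ($T = Z = R = 1$, $d_t = 0$). The code below is a minimal illustration under that assumption, with a large initial variance standing in for a diffuse prior; the function name `local_level_smoother` is hypothetical.

```python
def local_level_smoother(y, var_eps, var_eta, a1=0.0, p1=1e7):
    """Kalman filter (3.27) followed by the backward recursions (3.29)
    for a local level model (T = Z = R = 1, d_t = 0)."""
    n = len(y)
    a, p = [a1], [p1]                  # a_{t|t-1}, P_{t|t-1}
    v, f, k = [], [], []
    for t in range(n):                 # forward pass
        v.append(y[t] - a[t])
        f.append(p[t] + var_eps)
        k.append(p[t] / f[t])
        a.append(a[t] + k[t] * v[t])
        p.append(p[t] - k[t] * f[t] * k[t] + var_eta)

    r, big_n = 0.0, 0.0                # r_n = 0, N_n = 0
    alpha_hat, var_hat = [0.0] * n, [0.0] * n
    for t in reversed(range(n)):       # backward pass
        l = 1.0 - k[t]                 # L_t = T - K_t Z_t
        r = v[t] / f[t] + l * r        # r_{t-1}
        big_n = 1.0 / f[t] + l * big_n * l     # N_{t-1}
        alpha_hat[t] = a[t] + p[t] * r
        var_hat[t] = p[t] - p[t] * big_n * p[t]
    return alpha_hat, var_hat
```

When `var_eta` is zero the level is constant, and the smoothed estimate at every time point should be (up to the diffuse approximation) the sample mean with variance `var_eps / n`, which provides a convenient check.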

3.3.8. Diagnostic checking

Given that the model is well specified, it can be shown that the one-step ahead

prediction error series vt is a Gaussian white noise sequence with variance

matrix Ft, for t = 1, . . . , n. For a set of observations Yn and a given model,

this proposition can be tested via the diagnostic checking of normality, het-

eroscedasticity and serial correlation, see Harvey (1989, Chapter 5).

A particular concern is the existence of outliers and breaks in a time series since

they can distort the estimation of parameters and can be influential in the em-

pirical analysis. Specific diagnostic procedures are developed for the detection

of breaks and outliers in a time series. In the context of state space time series analysis, Harvey and Koopman (1992) and de Jong and Penzer (1998) have used smoothing errors or so-called auxiliary residuals for this purpose. The auxiliary residuals are based on the smoothed estimates of the disturbances. Now

$$
e_t = F_t^{-1} v_t - K_t' r_t, \qquad D_t = F_t^{-1} + K_t' N_t K_t,
$$

for t = 1, . . . , n and with rt and Nt computed by (3.29). Note that Dt = var(et)

and Nt = var(rt). A relatively large observation error εt indicates the presence

of an outlier while a relatively large value in the level noise ηt indicates a struc-

tural break, see Harvey and Koopman (1992) for a more detailed discussion.

It is argued by de Jong and Penzer (1998) that such auxiliary residuals can be

computed for any element of the state vector. After standardisation, they can

be considered as t-tests for the hypotheses

H0 : yt − Ztαt − dt − εt = 0, H0 : αt+1 − Tαt − ct − ηt = 0,

element by element, for a particular time point t. The actual statistics for these

hypotheses are given by

$$
e_{it}^{*} = e_{it} / \sqrt{D_{ii,t}}, \qquad r_{jt}^{*} = r_{jt} / \sqrt{N_{jj,t}}, \tag{3.30}
$$

respectively, for i = 1, . . . , p and j = 1, . . . , m, where eit is the ith element

of et, Dii,t is the ith diagonal element of Dt, rjt is the jth element of rt and

Njj,t is the jth diagonal element of Nt. In practice, the diagnostic auxiliary

residual checking procedures are carried out using a conservative significance

level since the interest is limited to serious outliers and breaks, and because

diagnostic checking of the auxiliary residuals involves performing a lot of t-

tests.
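The statistics in (3.30) can be illustrated with a univariate local level model ($T = Z = R = 1$), for which $e_t$, $D_t$, $r_t$ and $N_t$ are scalars. The series below is artificial: a constant level with one hypothetical outlier at position 10. The function name `auxiliary_residuals` and the variance values are assumptions; in an application the variances would first be estimated by maximum likelihood.

```python
import math


def auxiliary_residuals(y, var_eps, var_eta, a1=0.0, p1=1e7):
    """Standardised auxiliary residuals (3.30) for a local level model.
    Returns (e_star, r_star); r_star[-1] is None because N_n = 0."""
    n = len(y)
    a, p = a1, p1
    v, f, k = [], [], []
    for y_t in y:                      # Kalman filter (3.27)
        v.append(y_t - a)
        f.append(p + var_eps)
        k.append(p / f[-1])
        a = a + k[-1] * v[-1]
        p = p - k[-1] * f[-1] * k[-1] + var_eta

    r, big_n = 0.0, 0.0                # r_n and N_n
    e_star, r_star = [0.0] * n, [None] * n
    for t in reversed(range(n)):
        e = v[t] / f[t] - k[t] * r             # e_t uses r_t
        d = 1.0 / f[t] + k[t] * big_n * k[t]   # D_t uses N_t
        e_star[t] = e / math.sqrt(d)
        if big_n > 0.0:
            r_star[t] = r / math.sqrt(big_n)
        l = 1.0 - k[t]                 # L_t = T - K_t Z_t
        r = v[t] / f[t] + l * r        # r_{t-1}
        big_n = 1.0 / f[t] + l * big_n * l     # N_{t-1}
    return e_star, r_star


y = [10.0] * 20
y[10] = 20.0                           # hypothetical outlier
e_star, r_star = auxiliary_residuals(y, var_eps=1.0, var_eta=0.1)
suspect = max(range(len(y)), key=lambda t: abs(e_star[t]))
```

Here `suspect` is the index with the largest standardised observation residual; comparing `e_star` with a conservative critical value, as described above, would then decide whether an outlier intervention is warranted.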

3.4. Applications

3.4.1. State space DRAG-similar models

In road safety analysis the demande routière, accidents et leur gravité (DRAG) framework of Gaudry (1984) and Gaudry and Lassarre (2000) has been applied and is still being applied today. The basic form of the DRAG framework models the dimensions of exposure, risk and severity sequentially. The DRAG modelling approach consists of three stages. First, traffic volume is modelled (‘demande routière’). In some applications this is an important step, as often no true traffic volume data are available (at least not for all data points). In the next step accident frequency (‘des accidents’) is modelled, which is similar to the risk component of the LRT, and finally severity (‘leur gravité’) is modelled (the number of fatalities per accident). The LRT approach makes it possible to fit such models19 using a multivariate unobserved components structure.

In Section 2.2.5 and 2.3 of this thesis we mention that, if accidents are analysed

that are required to exceed a certain severity level, the accident severity level

itself will influence the recorded number of accidents and thus its accident fre-

quency, which implies that the accident risk and accident severity components

could be correlated. Furthermore, as discussed in Bijleveld (2005) and Chap-

ter 4 of this thesis, accident and victim counts (as well as their logarithms) are

also correlated. In this section we therefore propose a DRAG-like model using

a multivariate state space approach, which acknowledges and accommodates

correlations between dependent variables and innovations. The model also

acknowledges the uncertainty in traffic volume data. It may be noted that the

LRT can be regarded as an offspring of the DRAG model. However, it is not

intended to routinely implement Box-Cox transformations as is usual in the

DRAG approach (Box and Cox, 1964; Bickel and Doksum, 1981; Box and Cox,

1982), so the LRT effectively can be considered to be a sub-model of the DRAG

approach. Furthermore no explicit regression coefficients for traffic volume

(exposure) are estimated in the standard LRT, as is discussed in Appendix A.2.

In this section a ‘DRAG’-type version of the LRT is demonstrated using annual

data from the Netherlands. All analyses were also performed on the quar-

terly level (see Commandeur, Bijleveld, and Bergel, 2007, where the results on

quarterly data are presented, also for French networks), but these will not be

discussed here.

The following four dependent variables for t = 1987, . . . , 2005 are used in the

analysis

• Traffic volumet: traffic volume data derived from CBS (2003) and AVV

(2005). The actual data are travel kilometres by drivers in the survey, thus

imitating vehicle kilometres. Note that all drivers in the survey are in-

cluded, also drivers of non-motorised vehicles.

• Accidentst: the number of ‘killed and seriously injured’ (KSI) accidents (according to the police, see Section 3.4.2 for a further example). KSI accidents are road accidents that have resulted in at least one victim being killed or seriously injured; in practice seriously injured implies being admitted to a hospital.20

• KSI victimst: the corresponding number of KSI victims.

• Fatalitiest: the number of fatalities, which is a subset of the KSI victimst.

19 Box-Cox transformations (Box and Cox, 1964; Bickel and Doksum, 1981; Box and Cox, 1982) are not implemented in the linear models.

The dependent variables are assumed to depend on a latent log-exposure variable $\mu_t^{(e)}$, a latent log-risk variable $\mu_t^{(a)}$, a latent log-severity variable $\mu_t^{(k)}$ serving as the latent development of the logarithm of the expected number of KSI victims per KSI accident, and a latent log-lethality variable $\mu_t^{(f)}$ serving as the latent development of the logarithm of the expected number of fatalities per KSI victim. All dependent variables have a random observation error component. Each of the latent variables is assumed to develop according to a local linear trend (see Section 3.3.3). The observation equations are defined as follows:

$$
\begin{aligned}
\log \text{Traffic volume}_t &= \mu_t^{(e)} + \varepsilon_t^{(e)}\\
\log \text{Accidents}_t &= \mu_t^{(e)} + \mu_t^{(a)} + \varepsilon_t^{(a)}\\
\log \text{KSI victims}_t &= \mu_t^{(e)} + \mu_t^{(a)} + \mu_t^{(k)} + \varepsilon_t^{(i)}\\
\log \text{Fatalities}_t &= \mu_t^{(e)} + \mu_t^{(a)} + \mu_t^{(k)} + \mu_t^{(f)} + \varepsilon_t^{(f)}
\end{aligned}
\tag{3.31}
$$

The state vector contains the components $\mu_t^{(e)}$, $\mu_t^{(a)}$, $\mu_t^{(k)}$ and $\mu_t^{(f)}$. Each of these latent variables follows a local linear trend consisting of a level and a slope sub-component (there is no seasonal sub-component because we have annual data). For the latent variable exposure, for example, we have:

$$
\mu_t^{(e)} \equiv
\begin{pmatrix}
\mu_t^{(e_l)} \\ \mu_t^{(e_s)}
\end{pmatrix},
\tag{3.32}
$$

where $(e_l)$ (sub-scripted $l$) denotes the level sub-component associated with the $(e)$ (for exposure) component. Similarly, $(e_s)$ (sub-scripted $s$) denotes the slope sub-component associated with the $(e)$ component. The same procedure is followed for the log-risk component $\mu_t^{(a)}$, the latent log-severity variable $\mu_t^{(k)}$, and the latent log-lethality variable $\mu_t^{(f)}$, yielding (see also Section 3.3.3):

$$
\text{state}_t \equiv \alpha_t \equiv
\begin{pmatrix}
\mu_t^{(e)} \\ \mu_t^{(a)} \\ \mu_t^{(k)} \\ \mu_t^{(f)}
\end{pmatrix}
\equiv
\begin{pmatrix}
\mu_t^{(e_l)} \\ \mu_t^{(e_s)} \\ \mu_t^{(a_l)} \\ \mu_t^{(a_s)} \\ \mu_t^{(k_l)} \\ \mu_t^{(k_s)} \\ \mu_t^{(f_l)} \\ \mu_t^{(f_s)}
\end{pmatrix},
\qquad
T =
\begin{pmatrix}
1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}.
\tag{3.33}
$$

20 This category however included victims admitted to hospital for further observation, and released the next day considered unharmed.

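The following dynamic covariance components are to be estimated:

$$
Q_0 =
\begin{pmatrix}
q_{11} & 0 & q_{13} & 0 & q_{15} & 0 & q_{17} & 0 \\
0 & q_{22} & 0 & q_{24} & 0 & q_{26} & 0 & q_{28} \\
q_{13} & 0 & q_{33} & 0 & q_{35} & 0 & q_{37} & 0 \\
0 & q_{24} & 0 & q_{44} & 0 & q_{46} & 0 & q_{48} \\
q_{15} & 0 & q_{35} & 0 & q_{55} & 0 & q_{57} & 0 \\
0 & q_{26} & 0 & q_{46} & 0 & q_{66} & 0 & q_{68} \\
q_{17} & 0 & q_{37} & 0 & q_{57} & 0 & q_{77} & 0 \\
0 & q_{28} & 0 & q_{48} & 0 & q_{68} & 0 & q_{88}
\end{pmatrix},
\qquad Q = Q_0' Q_0. \tag{3.34}
$$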
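The parameterisation $Q = Q_0' Q_0$ in (3.34) guarantees a symmetric, positive semi-definite $Q$ for any real values of the $q_{ij}$, while the zero pattern keeps level disturbances uncorrelated with slope disturbances. The sketch below checks this numerically; the parameter values and the helper names `make_q0` and `gram` are hypothetical.

```python
def make_q0(params):
    """Fill the 8 x 8 matrix Q0 of (3.34): entry (i, j) is free when i and j
    are both level positions or both slope positions, and zero otherwise.
    `params` maps 1-based index pairs (i, j) with i <= j to values."""
    q0 = [[0.0] * 8 for _ in range(8)]
    for (i, j), value in params.items():
        assert i <= j and (i - j) % 2 == 0, "not a free element of Q0"
        q0[i - 1][j - 1] = q0[j - 1][i - 1] = value
    return q0


def gram(q0):
    """Q = Q0' Q0, which is symmetric and positive semi-definite."""
    m = len(q0)
    return [[sum(q0[k][i] * q0[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]


# Hypothetical parameter values, for illustration only.
params = {(i, i): 0.1 * i for i in range(1, 9)}
params.update({(1, 3): 0.05, (2, 4): -0.02, (5, 7): 0.03, (6, 8): 0.01})
Q = gram(make_q0(params))
```

The product preserves the pattern because an element of $Q$ linking a level position to a slope position only sums terms in which at least one factor of $Q_0$ is zero.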

The observation vector in (3.25) is

$$
y_t \equiv
\begin{pmatrix}
\log \text{Traffic volume}_t \\ \log \text{Accidents}_t \\ \log \text{KSI victims}_t \\ \log \text{Fatalities}_t
\end{pmatrix}.
\tag{3.35}
$$

The observation error covariance matrix $H_t$ is time-variant, and consists of two parts: a time-variant part based on the results presented in Bijleveld (2005) and Chapter 4 of this thesis, and a time-invariant part which is the full covariance matrix for the error due to the model. The observation matrix $Z$ is set up as:

$$
Z =
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 1 & 0 & 1 & 0
\end{pmatrix},
\tag{3.36}
$$

which completes the definition of the vectors and matrices in the general state space formulation

$$
\begin{cases}
\alpha_{t+1} = T \alpha_t + \eta_t,\\
y_t = Z \alpha_t + \varepsilon_t,
\end{cases}
\qquad \eta_t \sim \text{NID}(0, Q), \quad \varepsilon_t \sim \text{NID}(0, H_t). \tag{3.37}
$$
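The system matrices (3.33) and (3.36) follow a regular pattern and can be generated rather than typed in. The sketch below builds both for four components and evaluates the noise-free observation equation for hypothetical levels (exposure 100, risk 0.05, 1.15 KSI victims per KSI accident, 0.08 fatalities per KSI victim); the helper names are assumptions made for illustration.

```python
import math


def block_diag_llt(n_components):
    """Transition matrix T of (3.33): one [[1, 1], [0, 1]] block per component."""
    m = 2 * n_components
    T = [[0] * m for _ in range(m)]
    for c in range(n_components):
        lvl, slp = 2 * c, 2 * c + 1
        T[lvl][lvl] = T[lvl][slp] = T[slp][slp] = 1
    return T


def cumulative_z(n_components):
    """Observation matrix Z of (3.36): row i adds the levels of components 0..i."""
    return [[1 if (j % 2 == 0 and j // 2 <= i) else 0
             for j in range(2 * n_components)]
            for i in range(n_components)]


T = block_diag_llt(4)
Z = cumulative_z(4)

# Hypothetical state: levels on the log scale, all slopes zero.
alpha = [math.log(100.0), 0.0, math.log(0.05), 0.0,
         math.log(1.15), 0.0, math.log(0.08), 0.0]
log_y = [sum(z * a for z, a in zip(row, alpha)) for row in Z]

fatalities = math.exp(log_y[3])                      # 100 * 0.05 * 1.15 * 0.08
fatalities_per_victim = math.exp(log_y[3] - log_y[2])
```

The cumulative structure of `Z` is what makes the ratio of two adjacent observed series an estimate of a single latent component, here the lethality $\exp \mu_t^{(f)}$.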

The model is applied to two types of data: all accidents and rear-end accidents.21 For both analyses, exponential transforms of the latent log-severity variable $\mu_t^{(k)}$ and the latent log-lethality variable $\mu_t^{(f)}$ are depicted in Figures 3.3 and 3.4 respectively. From Figures 3.3 and 3.4 it can be inferred that the

confidence intervals of the rear-end accidents are substantially larger than the

confidence intervals for all accidents. This is simply a data issue, as rear-end

accidents are much less frequent than all accidents combined (of which rear-

end accidents are a subset). As a result of this the developments of both com-

ponents associated with the rear-end accidents are much smoother than the

developments of the components associated with all accidents, as the observa-

tion error is relatively larger22. One can argue what can be inferred from these

developments other than some general tendencies. For instance, in Figure 3.3

the development of the number of victims per rear-end accident may appear

to have suddenly increased in the years 1996 and 1997, but it simply cannot

be distinguished from a more gradual increase in the same period (there is

no reason to assume it did either, but that is a different matter). What can be

inferred is that the number of victims per rear-end accident did increase com-

paring the period up to 1995 with the end of the series, although the difference

between the beginning and the end of the series may not be that significant.

It is interesting to see that a similar pattern, but opposite in nature, occurs in the number of fatalities per victim (Figure 3.4). The rate dropped substantially starting approximately 1994–1995, to a much larger extent than the increase in the number of victims per accident, resulting in an effective decrease in the number of fatalities per rear-end accident. Similar results are found for all accidents, but the decrease in the number of fatalities per accident (for quarterly data see also Commandeur et al., 2007) is much slower. Similar results are also found in the next example, see Figure 3.8 in particular.

Figure 3.3. Smoothed development of the ‘expected number of KSI victims per KSI accident’ (latent severity, $\exp \mu_t^{(k)}$) variable for all accidents (solid line, dark grey point-wise 95% confidence intervals), and rear-end accidents (dashed line, light grey point-wise 95% confidence intervals). The larger, light grey dots denote the empirical ratios of the number of KSI victims to KSI accidents for rear-end accidents, while the smaller, dark grey dots denote the empirical estimates of the number of KSI victims per KSI accident for all accidents. Note that the confidence intervals are for the expected value, not for the empirical ratios.

Figure 3.4. Smoothed development of the expected number of fatalities per KSI victim (latent lethality, $\exp \mu_t^{(f)}$) variable for all accidents (solid line, dark grey point-wise 95% confidence intervals), and rear-end accidents (dashed line, light grey point-wise 95% confidence intervals). The larger, light grey dots denote the empirical ratios of the number of fatalities to KSI victims for rear-end accidents, while the smaller, dark grey dots denote the empirical estimates of the number of fatalities per KSI victim for all accidents. Note that the confidence intervals are for the expected value, not for the empirical ratios.

21 In the latter case no distinction is made between victims travelling in the vehicle impacted in the front (head), in the rear (tail) or in any other vehicle colliding later; all these victims are considered simultaneously. However, a model where these distinctions are made could easily be created as follows:

$$
\begin{aligned}
\log \text{Traffic volume}_t &= \mu_t^{(e)} + \varepsilon_t^{(e)}\\
\log \text{Accidents}_t &= \mu_t^{(e)} + \mu_t^{(a)} + \varepsilon_t^{(a)}\\
\log \text{KSI victims (head)}_t &= \mu_t^{(e)} + \mu_t^{(a)} + \mu_t^{(kh)} + \varepsilon_t^{(ih)}\\
\log \text{Fatalities (head)}_t &= \mu_t^{(e)} + \mu_t^{(a)} + \mu_t^{(kh)} + \mu_t^{(fh)} + \varepsilon_t^{(fh)}\\
\log \text{KSI victims (tail)}_t &= \mu_t^{(e)} + \mu_t^{(a)} + \mu_t^{(kt)} + \varepsilon_t^{(it)}\\
\log \text{Fatalities (tail)}_t &= \mu_t^{(e)} + \mu_t^{(a)} + \mu_t^{(kt)} + \mu_t^{(ft)} + \varepsilon_t^{(ft)},
\end{aligned}
$$

sharing accident occurrence (through $\mu_t^{(e)} + \mu_t^{(a)}$), but not its consequences.

22 The model tends to ‘follow’ accurate data more closely than (relatively) less accurate data.

3.4.2. Estimating the registration level of accidents involving hospital-

ised victims

In this section an application is presented demonstrating the possibilities of

analysing accident data from different sources. In the Netherlands, the prime

source for accident data is the police registration. However, other sources are

available. The main alternative sources are hospital data (for hospitalised victims), official mortality statistics and surveys (self-reporting). The

police registrations contain the best information on accident circumstances,

while hospital data contain the best information on victim consequences, in

particular in the long-term. However, for a number of reasons, it is not easy to

match hospital data and police data in the Netherlands: it is difficult to match

an individual victim in police records with an individual hospital admission

in the hospital records. Since hospitalised road accident victims are sometimes

missing from police records and because hospitalised victims are sometimes

not correctly identified as road accident victims in hospital records, the ‘true’

figure of hospitalised road accident victims is likely to be larger than both the

number based on hospital records and the number based on police records (see

for more information, Polak, 2000; Reurings, Bos, and van Kampen, 2007).

One particular distinction between police records on the one hand and the hos-

pital records on the other hand is that the former is accident oriented while the

latter is victim oriented. In practice therefore it is almost impossible to de-

termine whether two victims in the hospital records resulted from the same

accident or not. This means that, although studies like Blokpoel and Polak

(1991) and Polak (2000) and Reurings et al. (2007) probably give a good esti-

mate of the actual number of hospitalised road accident victims, without fur-

ther assumptions or information, those studies cannot determine how many

accidents resulted in this number of victims.

The current example demonstrates an approach for estimating this figure. This

is an example intended to demonstrate how this figure could be determined,

and will not yield definitive figures. Currently, studies have started using ambulance data (Remmerswaal, 2007), which may result not only in a better linking between police and hospital data but also in information on when and where victims were picked up, thus facilitating the task of matching hospital data with accidents.

Figure 3.5. Left panel: Police and estimated ‘true’ number of hospitalised road accident victims. The number of police registered hospitalised victims is depicted by the solid line. The dots above this line indicate the ‘true’ number of hospitalised road accident victims based on studies by Polak (2000) and Reurings et al. (2007). The grey dots indicate data based on extrapolation of previous registration results studies, the black dots indicate actual measurements. Right panel: Road accident victim registration level percentage (scale is percentage). The uninterrupted grey line denotes the registration rate based on 100 × the number of hospitalised victims derived from police data divided by a weighted estimate of the true number of victims. The dots denote the registration rate based on 100 × the number of hospitalised victims derived from police data divided by the actual estimate of the true number of victims.

Acknowledging foreseeable future improvements, an LRT model is developed

to estimate the registration level of road accidents involving hospitalised vic-

tims. Note that these accidents do not include fatal victim only accidents, so

they are not equivalent to the KSI accidents of the previous example. In the

left hand panel of Figure 3.5 the number of – police registered – hospitalised

victims is depicted by the solid line. The dots above this line indicate the ‘true’

(actually weighted) number of hospitalised victims based on Polak (2000) and

Reurings et al. (2007). In reality police and hospital data have been matched

only on data from 1985–1986 and 1992–2003 (Polak, 1997; Polak and Blokpoel,

1998; Polak, 2000), resulting in the black dots shown in the figure. Data for

1987–1991 as well as after 2003 are indicated using the grey dots. The data

for these periods were obtained by applying fixed weight factors to subsets of

victims from hospital records and then summing these groups. These weight

factors were based on the 1992–1993 data. This series is commonly used as the

series of the ‘true’ number of hospitalised victims.

Note the substantial difference between the levels of the police registered and of the ‘true’ number of victims. It is assumed that this difference is mainly caused by the police not registering accidents, rather than by the police registering accidents but missing some of the victims (which does happen, as does mixing up victims).

Apart from the periods with missing data (1987–1991 and 2004–2005), an obvious way to estimate the number of accidents would be to divide the true number of hospitalised victims by the average number of hospitalised victims per hospitalised victim-accident (a non-standard accident type). However, the latter figure is similar to the number of serious victims per KSI accident as expressed by component $\mu_t^{(k)}$ (Figure 3.3, Section 3.4.1), and that component is not time-invariant. Therefore, analogous to the example in Section 3.4.1, this average number of hospitalised victims per hospitalised victim-accident is treated as a latent variable.

We use traffic volume data in terms of vehicle kilometres (which include non-

motorised vehicles) in ‘Traffic volumet’, police recorded accident counts of

accidents involving at least one hospitalised victim (according to the police)

in ‘Accidents (police)t’, police recorded hospitalised victims in ‘Hospitalised

(police)t’ and the true number of hospitalised victims in ‘Hospitalised (true)t’,

all for 1985–2005. The LRT consists of the following observation equations:

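$$
\begin{aligned}
\log \text{Traffic volume}_t &= \mu_t^{(e)} + \varepsilon_t^{(e)}\\
\log \text{Accidents (police)}_t &= \mu_t^{(e)} + \mu_t^{(a)} + \mu_t^{(r)} + \varepsilon_t^{(ap)}\\
\log \text{Hospitalised (police)}_t &= \mu_t^{(e)} + \mu_t^{(a)} + \mu_t^{(h)} + \mu_t^{(r)} + \varepsilon_t^{(hp)}\\
\log \text{Hospitalised (true)}_t &= \mu_t^{(e)} + \mu_t^{(a)} + \mu_t^{(h)} + \varepsilon_t^{(ht)},
\end{aligned}
\tag{3.38}
$$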

where $\mu_t^{(e)}$ serves as the latent log-exposure variable, and $\varepsilon_t^{(e)}$ is the observation error component related to traffic volume. Further, $\mu_t^{(a)}$ serves as the latent log-accident risk variable, $\mu_t^{(r)}$ as the latent log-registration rate variable, and $\varepsilon_t^{(ap)}$ is the observation error component in the police accident data. Finally, $\mu_t^{(h)}$ is the latent log-‘average number of hospitalised victims per hospitalised victim-accident’ variable, which is shared by both police registered accidents and ‘hospital registered’ accidents23, $\varepsilon_t^{(hp)}$ is the observation error component in the police hospitalised victim data, and $\varepsilon_t^{(ht)}$ is the observation error component in the true number of hospitalised victims. All components $\mu_t^{(e)}$, $\mu_t^{(a)}$, $\mu_t^{(r)}$ and $\mu_t^{(h)}$ are supposed to develop as local linear trends, as is commonly assumed in the LRT. Thus the traffic volume depends on exposure, the number of police recorded accidents depends on exposure, risk and the registration rate, the number of police recorded hospitalised victims depends on exposure, risk, the registration rate and the number of victims per accident, and finally the true number of hospitalised victims depends on exposure, risk and the number of victims per accident.

23 Because it is assumed that the police not registering the accident is the most important cause of not registering a victim.

Almost identically to the previous example, the state vector is now made up of the components $\mu_t^{(e)}$, $\mu_t^{(a)}$, $\mu_t^{(h)}$ and $\mu_t^{(r)}$ for log-exposure, log-accident risk, log-‘average number of hospitalised victims per hospitalised victim-accident’ and log-registration rate, the last of which replaces the log-lethality component $\mu_t^{(f)}$ used in (3.31). As already mentioned, each of these components is assumed to follow a local linear trend model, and the state vector can therefore be written as:

$$
\text{state}_t \equiv \alpha_t \equiv
\begin{pmatrix}
\mu_t^{(e)} \\ \mu_t^{(a)} \\ \mu_t^{(h)} \\ \mu_t^{(r)}
\end{pmatrix}
\equiv
\begin{pmatrix}
\mu_t^{(e_l)} \\ \mu_t^{(e_s)} \\ \mu_t^{(a_l)} \\ \mu_t^{(a_s)} \\ \mu_t^{(h_l)} \\ \mu_t^{(h_s)} \\ \mu_t^{(r_l)} \\ \mu_t^{(r_s)}
\end{pmatrix}.
\tag{3.39}
$$

Matrix T is the same as in (3.33) and matrix Q is the same as in (3.34). The

observation vector in (3.38) is:

$$
y_t \equiv
\begin{pmatrix}
\log \text{Traffic volume}_t \\
\log \text{Accidents (police)}_t \\
\log \text{Hospitalised (police)}_t \\
\log \text{Hospitalised (true)}_t
\end{pmatrix}.
\tag{3.40}
$$

Just as in Section 3.4.1, the observation error covariance matrix $H_t$ is time-variant, and again consists of two parts: a time-variant part based on the results presented in Bijleveld (2005) and Chapter 4 of this thesis (the ‘hospitalised (true)’ victims are however assumed independent), and a time-invariant part, which is the full covariance matrix for the error due to the model. The observation matrix $Z$ is now defined as:

$$
Z =
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\
1 & 0 & 1 & 0 & 1 & 0 & 0 & 0
\end{pmatrix},
\tag{3.41}
$$

which completes the definition of the vectors and matrices in the general state space formulation (3.37).

Figure 3.6. The smoothed development of the registration rate for accidents ($100 \times \exp(\mu^{(r)}_t)$) for all data (dashed line) and data based on actual measurements (ignoring ‘true’ hospitalised victims for 1987–1991, 2004 and 2005) (solid line). The dark grey area defines the point-wise 95% confidence interval for the registration rate (without 1987–1991, 2004 and 2005), and the light grey area defines the point-wise 95% confidence interval for the registration rate (including 1987–1991, 2004 and 2005). The dots denote the registration rate for hospitalised victims rather than accidents. Note that the two lines are not predictors for the dots in this plot.
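The structure of the observation matrix can be made explicit with a small sketch. It copies $Z$ from (3.41) and lists, for each observed series, the state elements it loads on. The names `STATE`, `SERIES`, `loadings` and `signal` are hypothetical, and the ordering of the state elements (level and slope for exposure, risk, victims per accident and registration rate) is an assumption read off from (3.39).

```python
# Assumed ordering of the state vector: level and slope per component.
STATE = ["e_level", "e_slope", "a_level", "a_slope",
         "h_level", "h_slope", "r_level", "r_slope"]
SERIES = ["log traffic volume", "log accidents (police)",
          "log hospitalised (police)", "log hospitalised (true)"]

# Observation matrix Z as given in (3.41).
Z = [
    [1, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 1, 0],
    [1, 0, 1, 0, 1, 0, 1, 0],
    [1, 0, 1, 0, 1, 0, 0, 0],
]


def loadings(row):
    """Names of the state elements with a non-zero entry in one row of Z."""
    return [name for name, z in zip(STATE, row) if z != 0]


def signal(Z, alpha):
    """Noise-free observation vector Z * alpha, written out with plain lists."""
    return [sum(z * a for z, a in zip(row, alpha)) for row in Z]


for name, row in zip(SERIES, Z):
    print(f"{name:28s} <- {' + '.join(loadings(row))}")
```

Under this assumed ordering only the level elements enter the observations; the slope elements act through the transition equation alone.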

Two models are fitted: one using all data, based on the weighted sample, including the ‘true’ hospitalised victims for the periods 1987–1991 and 2004–2005, and one based on the actual measurements from the registration studies. The estimates of the registration rate components ($100 \times \exp(\mu^{(r)}_t)$, for accidents) obtained in the two analyses are depicted in Figure 3.6, which also contains the registration rate figures, both for weighted and actual data, as presented in the right hand panel of Figure 3.5. Note that these registration rate figures are with respect to victims, not accidents.

Although the differences may not be significant, it is clearly visible in Figure 3.6 that the estimate of the registration rate based only on actual observations (ignoring ‘true’ hospitalised victims for 1987–1991, 2004 and 2005) was lower at the beginning of the series than the estimate that includes the ‘true’ hospitalised victims for those years, and was already decaying, probably even before 1985. This difference in the level of the registration rates is caused by the fact that in 1985–1986 the ‘true’ number of victims was larger than the weighted number, and thus the registration rate was lower (at the end of the series, on the other hand, this situation is reversed, but the difference there is not significant by any reasonable standard). Further differences between the two developments may be explained by the fact that for the 1987–1991 period only the weighted number of victims is available. The fact that this is also the case for 2004–2005 does not appear to have a large impact. It is also visible that the observations for the registration rates of victims in 1996–1997 are at odds with the other observations. Assuming similarity between the registration rate for accidents and victims, this difference may well be significant. Reurings et al. (2007) also found that this observation is different from others in other respects. In the end, the differences with respect to the registration rates between the two approaches are not significant.

Figure 3.7. Dots are the ‘true’ (weighted) number of hospitalised road accident victims (including 1987–1991, 2004 and 2005, depicted in grey, where no real matching is performed). The solid line near the bottom denotes the police registered hospitalised victim-accidents, the dashed line with light-grey 95% point-wise confidence intervals denotes the true number of hospitalised victim-accidents based on weighted data, the solid line with dark-grey 95% point-wise confidence intervals denotes the true number of hospitalised victim-accidents based on true data from matching studies.

In Figure 3.7 the smoothed prediction of the ‘true’ number of hospitalised victim-accidents is displayed within the context of the police recorded number of hospitalised victim-accidents and the true number of hospitalised victims, also displayed in Figure 3.5. Obviously, the differences between the two approaches are marginal. One concern however is that the differences are systematic, in that the difference between the weighted and true estimates of the number of hospitalised victims has opposite signs at the beginning and the end of the series.

Figure 3.8. Plot of the number of hospitalised victims per hospitalised victim-accident (based on weighted data). The dots are the ‘true’ (weighted) number of hospitalised victims divided by the estimated true number of hospitalised victim-accidents (based on weighted data model). The dashed line with the light-grey 95% point-wise confidence intervals denotes $\exp \mu^{(h)}_t$, the latent ‘average number of hospitalised victims per hospitalised victim-accident’ variable, which is shared by both police registered accidents and ‘hospital registered’ accidents, and the solid line with the dark-grey 95% point-wise confidence intervals denotes the latent risk variable $\exp \mu^{(a)}_t$ from (3.31) in the example of Section 3.4.1. Note that the latter figure also includes fatalities, and is therefore somewhat larger than the number of hospitalised victims.

In Figure 3.8 the smoothed ‘average number of hospitalised victims per hospitalised victim-accident’ component is compared to an ‘empirical’ estimate. The number of hospitalised victims per hospitalised victim-accident increased up to around the year 2000, and then started to drop again. A similar pattern (including fatalities in addition to hospitalised victims) can be observed from police data (solid line, see also (3.31) from the example of Section 3.4.1).

3.5. Non linear extensions

3.5.1. Introduction

The latent risk time series model (LRT) is an additive linear or multiplicative log-linear model, assuming (at least approximately) Gaussian additive or multiplicative error structures. These assumptions may not always hold. Models may for instance have both additive and multiplicative components and, in the case of accident counts, may need error distributions other than the Gaussian distribution.

3.5.2. Mixing additive and multiplicative models

In Chapter 6 a model is presented combining multiplicative and additive mea-

surement equations. The analysis aims to model the disaggregate develop-

ment of road safety inside and outside urban areas. As sometimes happens in

road safety research, disaggregated data are not available for the entire period.

Specifically, although disaggregated accident data are available for the entire

1961 to 2000 period, separate data for traffic volume inside and outside urban

areas are only available in the years 1984 to 1996. This problem is not isolated.

For example, travel data with and without rainfall is generally not available

while this condition is distinguished in accident data (see also Section 3.5.3).

A simplification of the travel survey of the OVG (CBS, 2003) around the year

2000 resulted in certain travel modes becoming more difficult to distinguish

in the survey. A proposed change in the collection of exposure data at Statis-

tics Netherlands may have the opposite effect: certain distinctions that were

previously unavailable may then become available.

In the analysis discussed in Chapter 6 we have the following situation. For

some observations two sets of equations like (3.6) are available,

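$$
\begin{aligned}
\text{traffic volume outside urban areas} &\approx \text{exposure outside urban areas}, \\
\text{accidents outside urban areas} &\approx \text{exposure outside urban areas} \times \text{risk outside urban areas}, \\
\text{traffic volume inside urban areas} &\approx \text{exposure inside urban areas}, \\
\text{accidents inside urban areas} &\approx \text{exposure inside urban areas} \times \text{risk inside urban areas},
\end{aligned}
$$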

while for other data points only the total traffic volume is known,

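$$
\begin{aligned}
\text{national traffic volume} &\approx \text{exposure outside urban areas} + \text{exposure inside urban areas}, \\
\text{accidents outside urban areas} &\approx \text{exposure outside urban areas} \times \text{risk outside urban areas}, \\
\text{accidents inside urban areas} &\approx \text{exposure inside urban areas} \times \text{risk inside urban areas}.
\end{aligned}
$$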


In Chapter 6 of this thesis it is shown how this model can be fitted using the

extended Kalman filter, and by applying local linear trends to both risk and

exposure components of inside and outside urban areas.
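As a rough illustration of why an extended Kalman filter is needed here, the sketch below writes the two measurement situations as one observation function together with its Jacobian. It is not the implementation of Chapter 6: it assumes that the latent components are kept on the log scale and that all observations, including the national traffic volume, are modelled on the log scale. The function names and the state ordering are hypothetical.

```python
import math


def observe(state, disaggregated):
    """Expected log-observations for one period.

    state = (e_out, r_out, e_in, r_in): log-exposure and log-risk outside
    and inside urban areas (hypothetical ordering).
    """
    e_out, r_out, e_in, r_in = state
    # exposure x risk is additive on the log scale
    accidents = [e_out + r_out, e_in + r_in]
    if disaggregated:
        return [e_out, e_in] + accidents
    # Only the national total is known: exposures add on the original
    # scale, which makes this equation non-linear in the log-state.
    return [math.log(math.exp(e_out) + math.exp(e_in))] + accidents


def jacobian(state, disaggregated):
    """Derivative of observe() with respect to the state (rows = observations)."""
    e_out, _, e_in, _ = state
    acc_rows = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
    if disaggregated:
        return [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]] + acc_rows
    share_out = math.exp(e_out) / (math.exp(e_out) + math.exp(e_in))
    return [[share_out, 0.0, 1.0 - share_out, 0.0]] + acc_rows


state = (math.log(60.0), -3.0, math.log(40.0), -2.5)
print(observe(state, disaggregated=False))
print(jacobian(state, disaggregated=False))
```

In the aggregated case the first row of the Jacobian contains the current exposure shares, so the linearisation changes with the state. That dependence is what the extended Kalman filter re-evaluates at each step.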

3.5.3. Further generalisations

The extended Kalman filter has been improved in some applications. In Chap-

ter 7 of this thesis a generalisation of the Gauss-Newton interpretation of Bell

and Cathey (1993) of the iterated extended Kalman filter (Wishner, Tabaczyn-

ski, and Athans, 1969) is presented. The method developed in Chapter 7 ef-

fectively implements the iterated extended Kalman filter when Gaussian er-

rors are assumed, but allows for other statistical distributions, and is capable

of processing large problems. The method is applied in an analysis of daily

data on accidents with and without precipitation and traffic volume. Like in

the previous example (Section 3.5.2) no disaggregated traffic volume with and

without precipitation is available. Unlike the previous example, not even a few

observations with traffic volume with and without precipitation are available.

On the other hand, the traffic related to the accident type analysed in Chapter 7 (single car accidents) appears moderately affected by rainfall (some results are available in the literature, but further research would help), so the actual relative duration of rainfall is, for now, used as a fraction to divide the total exposure over both weather conditions. Obviously, this indicator cannot be fully relied upon, so a latent component is used instead.


4. The covariance between the number of accidents and the number of victims in multivariate analysis of accident related outcomes$^{24}$

In this study some statistical issues involved in the simultaneous analysis of accident related outcomes of the road traffic process are investigated. Since accident related outcomes like the number of victims, fatalities or accidents show interdependencies, their simultaneous analysis requires that these interdependencies are taken into account. One particular interdependency is that the number of fatal accidents can never be larger than the number of fatalities, as at least one fatality results from a fatal accident. More generally, when the number of accidents increases, the number of people injured as a result of these accidents will also increase. Since dependencies between accident related outcomes are reflected in the variance-covariance structure of the outcomes, the main focus of the present study is on establishing this structure. As this study shows, it is possible to derive relatively simple expressions for estimates of the variances and covariances of (logarithms of) accident and victim counts. One example reveals a substantial effect of the inclusion of covariance terms in the estimation of a confidence region of a mortality rate. The accuracy of the estimated variance-covariance structure of the accident related outcomes is evaluated using samples of real life accident data from the Netherlands. Additionally, the effect of small expected counts on the variance estimate of the logarithm of the counts is investigated.

4.1. Introduction

4.1.1. The need for multivariate modelling of influences on road safety

The development of (road) traffic safety is very often analysed studying the de-

velopment of only one single accident related outcome. However, in practice

there are always several accident related outcomes: the number of accidents

themselves, the number of fatalities, the number of people injured, the cost of

the material damage, and so on. Road safety measures usually do not have

the same (quantitative) effect on each of these accident outcomes. For exam-

ple, it is likely that the compulsory use of seat belts mainly has an effect on

the consequences of an accident, whereas measures aiming to reduce the oc-

currence of drink-driving mainly have an effect on the number of accidents

(and therefore on the number of victims). Speed reducing measures are sup-

posed to have an effect on both the number of accidents and the consequences

of an accident. Particular theories, such as for example the risk homeostasis

theory (Wilde, 1994) and the zero risk theory (Summala and Näätänen, 1988)

state that theoretically likely developments may be counteracted because of

behavioural adaptation. An example of this is the use of seat belts, which may, according to some theories, result in higher speeds and other more dangerous behaviour, so that the expected reduction in the number of injuries is (at least partly) undone by the fact that the number of accidents increases.

$^{24}$ This chapter appeared as Bijleveld (2005).

In order to get a better understanding of the (quantitative) effect of road safety

measures, their potentially differentiated effect on each of the accident related

outcomes should be investigated. This becomes more important when the ef-

fects of different road safety measures introduced in a brief period of time are

to be decomposed. In principle this can often be done by careful definition

of dependent variables using separate univariate models, but in many cases a multivariate framework, in which the dependence between the outcome variables is acknowledged, is likely to be preferable.

The multivariate approach allows for the simultaneous estimation of unknown

quantities based on all relevant data, rather than one estimate per dependent

variable. This feature will become more important as new statistical techniques become practical that allow for unobserved (latent) components. One example is a multivariate extension of Harvey and Durbin (1986). Other applications include methods like that of Schafer (1987), which implements an errors-in-variables approach to generalized linear models by means of the EM-algorithm (Dempster et al., 1977), “casting the true covariates as ‘missing data’ ”. Alternative methods exist that, for instance, estimate a sufficient statistic for the explanatory variables. One example of an application in road safety is Johansson (1996), in which exposure is modelled by means of a latent variable that is estimated by means of one dependent variable, not all dependent variables simultaneously. This subject is discussed in more detail in Section 4.4.2.

4.1.2. The issue of dependence among outcomes

One important issue is that multivariate road safety outcomes may not be in-

dependent. A notable example is the fact that no more fatal accidents can occur

in a period of time than the total number of fatalities in that period of time, as

at least one fatality occurs in a fatal accident. A more general example is that

when more road accidents occur in a year than usual, it is likely that more peo-

ple get injured in road traffic as well. The former restriction is not imposed

in this study, rather an approximation is made based on the latter aspect by

developing an expression for the covariance (matrix) of the counts of accident

related outcomes (and logarithms thereof). In some cases it will be possible

to redefine the problem to a problem with independent road safety outcomes.

This will however only simplify matters as far as the off-diagonal elements of the covariance matrix are concerned. The diagonal elements still have to be estimated, for which the covariance matrix is implicitly used. In some cases the use of multivariate models will be inevitable.

Ignoring the covariance may have serious consequences in inference. Suppose

the differentiated effect of a safety measure is to be evaluated on two different

outcome variables: the annual number of accidents and the annual number of

injured people. Also suppose that two models (A and B, say) are fitted on the

data. For a certain observation (year) it is found that model A overestimates

the observed number of accidents as well as the observed number of injured

by say ten. On the other hand, for the same year model B overestimates the

observed number of accidents also by an amount of ten, but underestimates the

number of injured by an amount of ten. Expressed in terms of ‘fit’ (based on

the assumption that the errors follow a symmetrical but not necessarily normal

distribution) both models are equally likely when the covariance between the

two outcomes is ignored. However, once the positive covariance between the

two outcome variables is taken into account, model A yields a better fit than

model B, as it should. Even worse, if model B overestimates the observed

number of accidents by an amount of ten, but underestimates the number of

injured by an amount of five, then it would yield a better fit than model A if the

covariance between the two outcome variables is ignored. This could possibly

result in false conclusions concerning the differentiated effectiveness of safety

measures.
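The argument can be checked numerically with a small sketch. It scores the residual pairs of the example by the quadratic form $r' \Sigma^{-1} r$, a Gaussian-type measure of fit in which smaller values mean a better fit. The error variance of 100 and the correlation of 0.8 are assumptions chosen only for illustration; they do not come from the data.

```python
def quad_form(residual, cov):
    """r' * inverse(cov) * r for a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    r1, r2 = residual
    return (d * r1 * r1 - (b + c) * r1 * r2 + a * r2 * r2) / det


VAR = 100.0   # assumed error variance of both outcomes
CORR = 0.8    # assumed positive correlation between the two outcomes
with_cov = [[VAR, CORR * VAR], [CORR * VAR, VAR]]
no_cov = [[VAR, 0.0], [0.0, VAR]]

# Residuals (accidents, injured) for model A and the two variants of model B.
residuals = {
    "A  (+10, +10)": (10, 10),
    "B  (+10, -10)": (10, -10),
    "B' (+10,  -5)": (10, -5),
}
for label, r in residuals.items():
    print(label,
          "ignoring covariance:", round(quad_form(r, no_cov), 2),
          "with covariance:", round(quad_form(r, with_cov), 2))
```

With these assumed values the two models in the first comparison tie when the covariance is ignored, and the second variant of model B even scores better than model A. Once the positive covariance is included, model A scores best in both comparisons.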

Thus, in the multivariate analysis (in the sense that multiple outcome variables

are analysed) of road safety the dependencies between the dependent variables

should be taken into account.

4.1.3. An approximating solution

In the following sections an analytical procedure is proposed for the estimation

of the variance-covariance matrix from accident data that can be used in mod-

els based on these moments, such as normal approximations, which includes

the vast majority of available multivariate models in which multiple outcome

variables are analysed.

Some multivariate models for multiple count variables do exist, see

Cameron and Trivedi (1998, Chapter 8). Cameron and Trivedi (1998, p. 252)

however state that “Applications of multivariate count models are relatively

uncommon. Practical experience has been restricted to some special computa-

tionally tractable cases.”


For this reason and the fact that accident related outcomes are not restricted

to count data, it is attempted in this paper to define a generally applicable, al-

beit approximate, method for analysing multivariate accident related data that

may work in less tractable cases. To that end, estimates of the mean and co-

variance of those data are developed. It is intended that (possibly derivatives

of) these estimates are used in weighted models based on the normal distri-

bution. The methods will therefore not be suitable for observations based on

a small number of accidents. The estimates derived in this paper are more extensive than those developed in Evans (2003, par. 3 and appendices): this paper adds expressions for the covariance between outcomes as well as a framework for developing estimates of higher moments, when needed.

It should be noted that this method differs from methods, for instance in time

series analysis, in which (auto)covariances (of errors) are estimated from within

a model, using aggregated data. The proposed method is different in that it es-

timates covariances from within the accident data, using individual accident

information on the relevant outcomes, for instance the number of victims and

fatalities. Note that in road safety it is rare to have information on non-injured persons, except for vehicle drivers. In principle it is possible to infer that a driver is probably a non-injured person when the driver is not among the victims and the vehicle was not parked. Due to the possible complications, this possibility is ignored in this study.

Furthermore, only covariances within an observation are estimated in contrast

to for instance time series analysis in which covariances between observations

are estimated.

Recently Hutchings, Knight, and Reading (2003) published a method based on

generalized estimating equations that allows for the estimation of covariances.

The estimation of the covariances is in this case from within a model, but uses

individual accident records. This approach used information on non-injured

car occupants that is not available in many cases including the current study,

where only victims and drivers are registered.

All theory in this study is based on the assumption that accident counts fol-

low a Poisson distribution. At first this seems to be in conflict with modern

theories on generalized Poisson modelling. However, this is not the case. The

Poisson assumption is supported by limit theorems such as Feller (1968, p.

282), of which a less general version can be found in McCullagh and Nelder

(1989, p. 105), concerning the asymptotic distribution of the sum of n (n large) independent Bernoulli trials$^{25}$ with variable (but quite small) probabilities of

success. If each Bernoulli trial is equivalent to an encounter in road traffic

with a unique but small probability of an accident, the distribution of the total

number of accidents will tend to the Poisson distribution. This result how-

ever neglects the impact the small number of accidents can have on the large

number of encounters.
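The limit result can be illustrated by simulation. The sketch below draws a separate small accident probability for each encounter, simulates the total number of accidents many times, and compares the mean and variance of the totals, which coincide for a Poisson distribution. The number of encounters, the range of the probabilities and the number of replications are hypothetical values chosen to keep the run short.

```python
import random
import statistics

random.seed(1)
N_ENCOUNTERS = 500    # hypothetical number of encounters per period
REPLICATIONS = 2000   # hypothetical number of simulated periods

# Every encounter has its own small probability of resulting in an accident.
probs = [random.uniform(0.0, 0.02) for _ in range(N_ENCOUNTERS)]


def simulate_count(probs):
    """Number of 'successes' in one set of independent Bernoulli trials."""
    return sum(1 for p in probs if random.random() < p)


counts = [simulate_count(probs) for _ in range(REPLICATIONS)]
mean = statistics.fmean(counts)
variance = statistics.variance(counts)
print(f"sum of p_i = {sum(probs):.2f}, mean = {mean:.2f}, variance = {variance:.2f}")
```

The exact variance of such a sum is $\sum p_i (1 - p_i)$, slightly below the mean $\sum p_i$; the gap vanishes as the individual probabilities become smaller.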

In practice it is impossible to perform true replications, so the assumption that

accident counts follow a Poisson distribution cannot be verified by means of

an experiment. Observations that are assumed to be replications in practice pro-

duce larger variation than can be explained by the Poisson distribution. This

is by no means a disproof of the assumption. More on the causes of this overdispersion phenomenon can be found in Hauer (2001). In these cases, however, it is assumed that overdispersion is mostly a modelling issue rather than a data issue. As described in Section 4.4.2, this approach to overdispersion can also be taken using the results of this study.

In contrast to the strict assumptions on the accident counts, the assumptions on the distribution of the outcomes are rather relaxed. It is assumed that the moments of the variables that are a consequence of the accident (e.g. the number of victims, cost of damage) are finite. This assumption will in practice always be met, as damage is limited. Additionally it is assumed that the

outcomes are independently and identically distributed for all accidents. It is

likely that the results can be extended to not identically distributed outcomes.

The fact that the proposed method uses information on individual accidents

will prohibit its use in cases where only aggregated accident information is

available.

4.1.4. Overview of the paper

Section 4.2 describes and discusses the results of an analytical derivation of

the covariance between the total number of accidents (not necessarily injury

accidents) and the total number of victims in a time period. The latter could

also have been the total cost of damages or any other measurable quantity

resulting as a consequence of an accident. The analytical results can be used to

derive higher order moments than covariances as well.

$^{25}$ A Bernoulli trial with parameter $p$ is an experiment with probability $p$ of ‘success’ (unfortunately an accident in this case) and probability $1 - p$ of ‘failure’ (no accident).


The analytical results are compared with results based on simulation in Sec-

tion 4.3. This is done using samples from real accident data from the Nether-

lands in the period 1980–1999. The purpose of this is to assess the accuracy of

the variance-covariance estimates based on the analytical derivations. Simula-

tion studies have been performed on four sets of accidents: the first set consists

of fatal car-only accidents, in the second set injury only accidents are included

as well, the third set consists of fatal accidents (not just car-only accidents) and

in the fourth injury accidents are included.

In Section 4.4 some examples are discussed of (possible) applications of the

methods.

4.2. The covariance structure of road safety related

outcomes

4.2.1. Introduction

This section describes how an estimate of the covariance matrix of the number

of (injury) accidents and victims can be computed. In the following ‘number

of victims’ may be read as ‘the cost of damage’ or any other consequence of an

accident, as long as this consequence is independently and identically distributed

with finite moments for all accidents. Details on the derivations can be found

in the appendix.

4.2.2. Results

Using the results of the derivations reported in the appendix, variance-covari-

ance estimates are formulated between either the count variables or the loga-

rithms of those count variables. As stated above, these results are applicable

to any variable with finite moments that is the consequence of an accident.

Two cases are developed in this study: the total number of accidents, victims

and fatalities and the logarithm of the total number of accidents, victims and

fatalities.

Table 4.2 provides an overview of all results while Table 4.1 contains an expla-

nation of abbreviations used in Table 4.2.

| Number of: | Realisation | Abbreviation |
|---|---|---|
| accidents (acc) | $n$ | $n$ |
| victims in accident $i$ | $v_i$ | |
| fatalities in accident $i$ | $f_i$ * | |

| Sum over all accidents of number of: | Estimate | Abbreviation |
|---|---|---|
| victims (vic) | $\sum_{i=1}^{n} v_i$ | $\Sigma v$ |
| fatalities (fat) | $\sum_{i=1}^{n} f_i$ | $\Sigma f$ |

| Sum over all accidents of the square of the number of: | Estimate | Abbreviation |
|---|---|---|
| victims | $\sum_{i=1}^{n} v_i^2$ * | $\Sigma v^2$ |
| fatalities | $\sum_{i=1}^{n} f_i^2$ * | $\Sigma f^2$ |

| Sum over all accidents of the cross product of the numbers of: | Estimate | Abbreviation |
|---|---|---|
| victims and fatalities | $\sum_{i=1}^{n} v_i f_i$ * | $\Sigma f v$ |

Table 4.1. Abbreviations used in the derived equations for variances and covariances and estimates. The quantities marked * are usually not available in aggregated accident data.

Table 4.1 shows that not all information is available in standard publications on accidents. This is indicated by a “*”. Detailed sources on individual accidents are needed to get more precise estimates. In that case the individual fatality counts $f_i$ and victim counts $v_i$ per accident will probably be available, and for instance the variance of the total number of victims can be computed as the sum of the squared victim counts, as indicated in Table 4.2. It can be seen that the variance of such victim counts is generally larger than the variance of similar accident counts. The amount of ‘extra’ variance depends on the distribution of the number of victims per accident. When more victims tend to occur in certain types of accident, the variance of the number of victims tends to be higher.
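As a minimal sketch, the estimates listed in Table 4.2 can be computed directly from per-accident records. The function name `covariance_estimates` and the five example records are hypothetical; the formulas are the sums defined in Table 4.1.

```python
def covariance_estimates(records):
    """Estimates of Table 4.2 from per-accident (victims, fatalities) pairs."""
    n = len(records)
    sum_v = sum(v for v, _ in records)
    sum_f = sum(f for _, f in records)
    sum_v2 = sum(v * v for v, _ in records)
    sum_f2 = sum(f * f for _, f in records)
    sum_fv = sum(v * f for v, f in records)
    counts = {
        "var(acc)": n,
        "var(vic)": sum_v2,
        "var(fat)": sum_f2,
        "cov(acc,vic)": sum_v,
        "cov(acc,fat)": sum_f,
        "cov(vic,fat)": sum_fv,
    }
    logs = {
        "var(log acc)": 1 / n,
        "var(log vic)": sum_v2 / sum_v ** 2,
        "var(log fat)": sum_f2 / sum_f ** 2,
        "cov(log acc,log vic)": 1 / n,
        "cov(log acc,log fat)": 1 / n,
        "cov(log vic,log fat)": sum_fv / (sum_v * sum_f),
    }
    return counts, logs


# Hypothetical period with five accidents: (victims, fatalities) per accident.
records = [(1, 0), (2, 1), (1, 1), (3, 0), (1, 0)]
counts, logs = covariance_estimates(records)
for name, value in {**counts, **logs}.items():
    print(f"{name:22s} {value:.4f}")
```

The logarithmic estimates divide by the totals, so the sketch only applies to periods with at least one accident, one victim and one fatality. With so few accidents the normal approximation would not be appropriate; the small data set only serves to make the arithmetic easy to follow.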

4.3. Simulation studies

To assess the accuracy of the variance-covariance estimates in Section 4.2 a number of simulation studies were performed. From injury accidents that occurred in the Netherlands in the years 1980 through 1999 the number of victims and the number of fatalities were recorded for each individual accident as well as the month and year in which the accident occurred. Additional simulations were performed using accidents that only involved cars, using only fatal accidents, and using exclusively fatal car-only accidents. All simulation studies were performed by selecting a random number of accident records with replacement from a specific month. The number of accidents to be selected was a random number sampled from a Poisson distribution with expected value equal to the number of accidents that actually occurred that particular month.

Results based on counts

| Variance of: | Estimate | Equation |
|---|---|---|
| the total number of accidents | $n$ | |
| the total number of victims (a) | $\Sigma v^2$ | B.7 |
| the total number of fatalities | $\Sigma f^2$ | B.7 |

| Covariance of: | Estimate | Equation |
|---|---|---|
| the total number of accidents and victims | $\Sigma v$ | B.13 |
| the total number of accidents and fatalities | $\Sigma f$ | B.13 |
| the total number of victims and fatalities | $\Sigma f v$ | B.14 |

Results based on logarithms of counts

| Variance of: | Estimate | Equation |
|---|---|---|
| the total number of accidents | $1/n$ | B.16 |
| the total number of victims | $\Sigma v^2 / (\Sigma v)^2$ | B.18 |
| the total number of fatalities | $\Sigma f^2 / (\Sigma f)^2$ | B.18 |

| Covariance of: | Estimate | Equation |
|---|---|---|
| the total number of accidents and victims | $1/n$ | B.19 |
| the total number of accidents and fatalities | $1/n$ | B.19 |
| the total number of victims and fatalities | $\Sigma f v / (\Sigma v \times \Sigma f)$ | B.20 |

(a) The variance of the total number of victims is up to about 50% higher than the total number of victims based on data used in the simulation study (Section 4.3).

Table 4.2. Derived equations for variances and covariances and estimates.

This scheme should produce a selection of accidents that could have occurred

almost as likely as the selection of accidents that actually occurred. For each

thus created sample the total number of accidents, victims and fatalities were

computed, as well as the logarithms thereof. Covariances were computed us-

ing a large number of such samples.

Table 4.3 compares results of estimates based on simulations with the estimates in Table 4.2. The estimates based on simulations were computed as the sample variance-covariance matrix. Each sample consisted of 50000 simulated months. One sample of 50000 was drawn for each of the 240 months in the range starting January 1980 through December 1999. For each month, sample estimates ($e_{\text{sample}}$) and computed estimates ($e$) were compared by means of the difference measure $d = (e_{\text{sample}} - e)/e_{\text{sample}}$.
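The sampling scheme and the difference measure can be sketched for a single month as follows. The records are hypothetical rather than real accident data, far fewer replications are used than the 50000 in the study, and the Poisson sampler (Knuth's multiplication method) is a simple stand-in because the Python standard library does not provide one.

```python
import math
import random


def poisson(lam, rng):
    """Poisson draw by Knuth's multiplication method; adequate for moderate lam."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def sample_cov(xs, ys):
    """Unbiased sample covariance of two equally long sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def simulate_month(records, rng):
    """Draw a Poisson number of records with replacement; return the totals."""
    n = poisson(len(records), rng)
    drawn = [rng.choice(records) for _ in range(n)]
    return n, sum(v for v, _ in drawn), sum(f for _, f in drawn)


rng = random.Random(42)
# Hypothetical month with 40 accidents, stored as (victims, fatalities) pairs.
records = [(1, 0)] * 25 + [(2, 0)] * 8 + [(1, 1)] * 4 + [(3, 1)] * 3
sims = [simulate_month(records, rng) for _ in range(20000)]
acc, vic, fat = zip(*sims)

e_sample = sample_cov(vic, fat)            # estimate based on simulation
e = sum(v * f for v, f in records)         # computed estimate: sum of v_i * f_i
d = (e_sample - e) / e_sample              # difference measure
print(f"cov(vic,fat): simulated {e_sample:.2f}, computed {e}, d = {d:.4f}")
```

Repeating this for every month and every statistic, and then averaging the resulting values of $d$, gives summaries of the kind reported in Table 4.3.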

For each statistic in the first column of Table 4.3 this resulted in 240 difference

measures. The mean values and standard deviations of those 240 difference

measures are listed horizontally in Table 4.3 for each of the four accident selec-

tions. The mean values and standard deviations reflect the amount of similar-

ity between the sample estimates and the computed estimates. A large depar-

ture from zero of a mean value indicates a systematic difference (bias) between

both estimates whereas a relatively large standard deviation is an indication of

inaccuracy of the estimate.

It should be noted that the sampling scheme implies a Poisson distribution of

the number of accidents and therefore simulation checks on the estimation of

the variance of the number of accidents cannot be used to check the variance

estimate. As discussed in the introduction, without true replicates it is im-

possible to validate the Poisson assumption. True replicates would mean for

instance months of traffic sites with exactly the same accident distribution. The

entry ‘var(acc)’ in Table 4.3 is thus for reference only.

| Measure | all accidents: Mean | all accidents: Std. | all car accidents: Mean | all car accidents: Std. | fatal accidents: Mean | fatal accidents: Std. | fatal car accidents: Mean | fatal car accidents: Std. |
|---|---|---|---|---|---|---|---|---|
| var(acc) | −0.0000 | 0.0064 | −0.0004 | 0.0065 | 0.0004 | 0.0065 | 0.0003 | 0.0063 |
| var(vic) | −0.0006 | 0.0062 | −0.0002 | 0.0062 | −0.0002 | 0.0065 | 0.0004 | 0.0063 |
| var(fat) | −0.0001 | 0.0064 | 0.0004 | 0.0066 | 0.0003 | 0.0063 | 0.0004 | 0.0065 |
| cov(acc,vic) | −0.0005 | 0.0067 | −0.0005 | 0.0069 | 0.0000 | 0.0069 | 0.0004 | 0.0067 |
| cov(acc,fat) | 0.0006 | 0.0259 | −0.0004 | 0.0352 | 0.0003 | 0.0066 | 0.0004 | 0.0066 |
| cov(vic,fat) | 0.0004 | 0.0201 | 0.0006 | 0.0208 | 0.0000 | 0.0068 | 0.0004 | 0.0068 |
| var(log(acc)) | 0.0004 | 0.0064 | 0.0021 | 0.0066 | 0.0126 | 0.0078 | 0.0513 | 0.0292 |
| var(log(vic)) | −0.0003 | 0.0063 | 0.0023 | 0.0063 | 0.0116 | 0.0099 | 0.1516 | 0.1194 |
| var(log(fat)) | 0.0046 | 0.0068 | 0.0687 | 0.0411 | 0.0118 | 0.0076 | 0.0684 | 0.0400 |
| cov(log(acc),log(vic)) | 0.0000 | 0.0068 | 0.0026 | 0.0070 | 0.0165 | 0.0087 | 0.1068 | 0.0724 |
| cov(log(acc),log(fat)) | 0.0018 | 0.0262 | 0.0164 | 0.0382 | 0.0134 | 0.0079 | 0.0669 | 0.0367 |
| cov(log(vic),log(fat)) | 0.0005 | 0.0203 | 0.0123 | 0.0234 | 0.0149 | 0.0083 | 0.1165 | 0.0766 |

Table 4.3. Means and standard deviations of the relative differences $(e_{\text{sample}} - e)/e_{\text{sample}}$ between simulation sample estimates of the measures ‘$e_{\text{sample}}$’ and computed estimates ‘$e$’, 50000 simulations, based on accident data from 1980–1999 in the Netherlands. Abbreviations: var(*) = variance of *, cov(*,#) = covariance of * and #, acc = number of accidents, vic = number of victims, fat = number of fatalities, log(*) = logarithm of *.

Table 4.3 shows that the two types of estimates are quite similar except in the case of log-fatalities, indicating no important difference between the sample statistics and the estimates proposed in this study in the other cases. Particularly the logarithmic case with a smaller number of accidents (near the bottom of Table 4.3) appears to be biased. This bias may be the result of the approximation of the logarithm used to obtain an estimate of the variance in these cases, as it does not occur for instance in the case of var(fat) in combination with relatively few fatalities. Apparently estimates of the variance may not be that accurate, and care must be taken in case of logarithms in combination with small counts that these inaccuracies do not influence inferences too much. The relative error of the variance estimate of the logarithm of a Poisson distributed random variable is the subject of Section 4.4.3.

4.4. Examples

4.4.1. The mortality ratio

As a first application one can use derived statistics such as the mortality ratio. The

mortality ratio is the number of fatalities divided by the number of accidents

and as such is an application of multivariate accident related outcomes. In this

example the number of hospitalized victims as well as the fatalities are used.

The ratios are obtained by dividing the number of victims by the number of

accidents resulting in hospitalized victims or worse. Obviously, each ratio is a

nonlinear function of the respective number of victims and the number of
accidents, which themselves are not independent. The textbook approximation
method for this ratio (the delta method, see for instance Rice (1995, Chapter 4.6))
is used to obtain the expected value and the variance of the ratio Z = Y/X.
With n the number of accidents and v_i the number of victims in accident i, the
estimates µ_X = σ²_X = n, µ_Y = σ_XY = ∑ v_i and σ²_Y = ∑ v_i² are substituted.
The expected value is approximated as in Rice (1995, p. 153):

\[
E(Z) \approx \frac{\mu_Y}{\mu_X} + \sigma_X^2 \frac{\mu_Y}{\mu_X^3} - \frac{\sigma_{XY}}{\mu_X^2}
     = \frac{\sum v_i}{n} + n \frac{\sum v_i}{n^3} - \frac{\sum v_i}{n^2}
     = \frac{\sum v_i}{n}.
\]

Not surprisingly, although the number of victims and the number of accidents
are correlated, the correction terms cancel in the expected value. The
variance is approximated as in Rice (1995, p. 153):

\[
\mathrm{Var}(Z) \approx \sigma_X^2 \frac{\mu_Y^2}{\mu_X^4} + \frac{\sigma_Y^2}{\mu_X^2} - 2 \sigma_{XY} \frac{\mu_Y}{\mu_X^3}
     = n \frac{(\sum v_i)^2}{n^4} + \frac{\sum v_i^2}{n^2} - 2 \Bigl(\sum v_i\Bigr) \frac{\sum v_i}{n^3}
     = \frac{\sum v_i^2}{n^2} - \frac{(\sum v_i)^2}{n^3}.
\]

[Figure 4.3: two panels, (a) Hospitalized victims and (b) Fatalities; horizontal axes show the years 1980–2000, vertical axes the ratio (about 1.00–1.08 in (a) and 0.09–0.12 in (b)).]

Figure 4.3. The ratio of the number of hospitalized victims (a) and the
number of fatalities (b) to the number of accidents with hospitalized victims
or fatalities. Dark gray areas denote the approximated 95% confidence regions.
Light gray areas (only visible in (a)) are 95% confidence regions ignoring
covariance and estimating the variance of the number of victims by the number
of victims itself, as is done for the number of accidents.

In Figure 4.3 the annual ratios for the Netherlands are plotted for the years

1976 through 2002. The dark gray areas denote the approximate 95% confidence
regions based on the estimates from this study. The light gray areas
(only visible in Figure 4.3(a)) are 95% confidence regions that ignore covariance
and use the observed count itself as the variance estimate, as is done for the
number of accidents. The differences between the confidence regions are evident. The panels

roughly resemble El-Sadig, Norman, Lloyd, and Bener (2002, Fig. 6 and Fig. 7)

except that accidents with more serious outcomes as well as more seriously

injured are counted in this case. El-Sadig et al. (2002, p. 472 and Discussion)

and others also notice an increase in the severity of traffic accidents. Although
out of the scope of this study, it appears that the ratios changed suddenly
in the late 1990s.
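The delta-method expressions of this section can be sketched in a few lines of Python. This is a minimal illustration, not the code used for Figure 4.3: the function names and the example data are hypothetical, and the second function reproduces the "light gray" variant only as it is described in the figure caption (covariance ignored, variance of the victim count estimated by the count itself).

```python
def ratio_with_delta_variance(victims_per_accident):
    """Ratio victims/accidents and its delta-method variance.

    Uses the estimates mu_X = var_X = n, mu_Y = cov_XY = sum(v_i)
    and var_Y = sum(v_i^2), as in the formulas of Section 4.4.1.
    """
    n = len(victims_per_accident)
    total = sum(victims_per_accident)
    total_sq = sum(v * v for v in victims_per_accident)
    ratio = total / n
    variance = total_sq / n**2 - total**2 / n**3
    return ratio, variance


def naive_ratio_variance(victims_per_accident):
    """Variant that ignores the covariance and sets var_Y = sum(v_i)."""
    n = len(victims_per_accident)
    total = sum(victims_per_accident)
    return n * total**2 / n**4 + total / n**2


# Hypothetical year with eight accidents and eleven victims.
example = [1, 1, 1, 2, 1, 3, 1, 1]
ratio, variance = ratio_with_delta_variance(example)
naive = naive_ratio_variance(example)
```

When every accident has exactly one victim the ratio is constant, and the delta-method variance is zero, whereas the naive variant still reports a positive variance.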


4.4.2. Multivariate state space modelling and the Kalman filter

One possible field of application of the proposed method is multivariate time
series analysis, for instance using state space models (Durbin and Koopman,
2001). As stated in the introduction, Johansson (1996, p. 75) acknowledged
that the level of exposure is an unobservable variable and in that study it is
modelled as a latent variable. This approach is similar to that taken in
Bijleveld, Commandeur, Koopman, and van Montfort (Prep), in which the
(extended) Kalman filter (Harvey, 1989, p. 160) is used to model the development
of the number of fatal accidents inside and outside urban areas in the
Netherlands together with vehicle kilometres inside and outside urban areas. In
that approach it appeared possible to reconstruct the vehicle kilometres inside
and outside urban areas in years in which only the total number of vehicle
kilometres (the sum of inside and outside urban areas) is available. The
multivariate approach allows for estimation of the latent exposure in
Johansson (1996, p. 75) for all dependent variables jointly, not just on a per
dependent variable basis. The latter is likely to yield different exposure
developments per dependent variable, which may not be optimal.

If counts are modelled linearly in state space models, an additive approach to
overdispersion will be taken. This is due to the way variances are handled in
the Kalman filtering approach. This can be seen from Durbin and Koopman
(2001, p. 66, (4.7)), where the prediction error covariance is decomposed into a
part based on the system error (the model part) and an observation part, the
latter composed of the (co)variances of the observation errors, which this
study provides.
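The decomposition referred to above can be illustrated with a small sketch. It assumes the usual state space notation, in which the prediction error covariance is F = Z P Z' + H, with Z P Z' the model part and H the observation error covariance. All numbers are hypothetical: H is filled with count-based estimates for a year with 10 accidents and 12 victims (var(acc) = n, cov(acc,vic) = ∑ v_i, var(vic) = ∑ v_i²), and Z and P are arbitrary.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def prediction_error_cov(Z, P, H):
    """F = Z P Z' + H: model part plus observation error part."""
    model_part = matmul(matmul(Z, P), transpose(Z))
    return [[m + h for m, h in zip(row_m, row_h)]
            for row_m, row_h in zip(model_part, H)]


Z = [[1.0, 0.0], [1.0, 1.0]]      # hypothetical observation matrix
P = [[4.0, 1.0], [1.0, 2.0]]      # hypothetical state prediction variance
H = [[10.0, 12.0], [12.0, 16.0]]  # n, sum(v_i); sum(v_i), sum(v_i^2)
F = prediction_error_cov(Z, P, H)
```

The observation part enters additively, which is why overdispersion is treated additively when counts are modelled linearly.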

4.4.3. The relative error of the variance estimate of the logarithm of a

Poisson distributed random variable

In Section 4.3 it was found that the approximation of the variance of the log-

arithm of counts performs poorly when counts are small. In order to obtain

a better insight into this matter this section is devoted to estimating the error

in approximation. To this end, the theoretical variance of the logarithm of a

Poisson distributed number as a function of its expected value λ is computed

numerically and compared to the estimate 1/λ. Table 4.4 lists the numerical

approximation (s2) of the theoretical variance of the logarithm of a Poisson

distributed number as well as the estimate 1/λ for that number over a range

of small values of the expected value λ of the Poisson distributed number. In

Figure 4.4 the relative difference (approximation-estimate)/approximation is

graphed as a function of the expected value λ. Obviously, both results for
the logarithmic case are approximations, but the numerical approximation is
much more precise, so all differences are attributed to the estimate. As
Figure 4.4 shows, the relative error of the variance estimate for the logarithm
of the number of accidents can be substantial if λ is less than about 20–30.
Similar results will hold for victims and fatalities, which do not obey a
Poisson law but are dependent on one.

[Figure 4.4: horizontal axis λ from 0 to 30; vertical axis relative error from −0.2 to 0.2; marked values 3.29 and 6.00 on the horizontal axis, 0.231 and 0.05 on the vertical axis.]

Figure 4.4. The relative error of the variance estimate 1/λ of the logarithm
of a Poisson distributed random variable with respect to a numerical
approximation as a function of the expected value λ of the Poisson distributed
number (horizontal axis). The relative error is computed as
(approximation − estimate)/approximation.
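A numerical approximation of this kind can be sketched as follows. The text does not state how a zero count (whose logarithm is undefined) is handled, so the sketch conditions on at least one accident; that convention is an assumption and need not reproduce the small-λ entries of Table 4.4. For λ of about 20 and larger the probability of a zero count is negligible and the convention hardly matters. Function names are hypothetical.

```python
import math


def var_log_poisson(lam):
    """Variance of log(N) for N ~ Poisson(lam), conditional on N >= 1.

    Conditioning on N >= 1 is an assumption of this sketch.
    """
    k_max = int(lam + 20.0 * math.sqrt(lam) + 50.0)  # far into the upper tail
    weights, logs = [], []
    for k in range(1, k_max + 1):
        log_p = -lam + k * math.log(lam) - math.lgamma(k + 1)
        weights.append(math.exp(log_p))
        logs.append(math.log(k))
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, logs)) / total
    return sum(w * (x - mean) ** 2 for w, x in zip(weights, logs)) / total


def relative_error(lam):
    """(approximation - estimate) / approximation, with estimate 1/lam."""
    s2 = var_log_poisson(lam)
    return (s2 - 1.0 / lam) / s2
```

Evaluating `relative_error` over a grid of λ values gives a curve of the type shown in Figure 4.4.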

 λ   1/λ     s2(Nλ)     λ   1/λ     s2(Nλ)     λ   1/λ     s2(Nλ)     λ   1/λ     s2(Nλ)
 1   1.0000  0.1343    11   0.0909  0.1072    21   0.0476  0.0515    31   0.0323  0.0340
 2   0.5000  0.2631    12   0.0833  0.0967    22   0.0455  0.0490    32   0.0313  0.0328
 3   0.3333  0.3037    13   0.0769  0.0881    23   0.0435  0.0467    33   0.0303  0.0318
 4   0.2500  0.2898    14   0.0714  0.0808    24   0.0417  0.0446    34   0.0294  0.0308
 5   0.2000  0.2546    15   0.0667  0.0747    25   0.0400  0.0427    35   0.0286  0.0299
 6   0.1667  0.2168    16   0.0625  0.0695    26   0.0385  0.0409    36   0.0278  0.0290
 7   0.1429  0.1840    17   0.0588  0.0649    27   0.0370  0.0393    37   0.0270  0.0282
 8   0.1250  0.1574    18   0.0556  0.0610    28   0.0357  0.0378    38   0.0263  0.0274
 9   0.1111  0.1366    19   0.0526  0.0574    29   0.0345  0.0364    39   0.0256  0.0267
10   0.1000  0.1202    20   0.0500  0.0543    30   0.0333  0.0351    40   0.0250  0.0260

Table 4.4. Estimates and approximations of variances of the logarithm of the
number of accidents when the expected number of accidents is small. See
also Figure 4.4.

4.5. Conclusions

In this study some statistical issues involved in the simultaneous analysis of
accident related outcomes (such as the number of victims, fatalities or
accidents and costs) of the road traffic process were investigated. The main focus
of this study was on the estimation of the variance-covariance structure of such
outcomes. Correction for covariance is needed in order to enhance the
statistical reliability of techniques applied to the simultaneous analysis of
accident related outcomes. It turns out to be possible to derive relatively
simple expressions for the variances and covariances of (logarithms of)
accidents and victim counts.

It is argued that when multiple accident outcomes are modelled, their
covariance should be taken into account. One example reveals a substantial effect
of the inclusion of covariance terms in the estimation of a confidence region of
a mortality ratio.

The variances and covariances were compared with estimates obtained in a

simulation study. Not surprisingly, it was found in the logarithmic case that

bias increases as the number of accidents decreases. In general, estimates will

deteriorate when the number of accidents decreases.

As a special case, it is recommended not to apply the Normal approximation to
the Poisson distribution (Feller, 1968, Chapter VII) to, for instance, the
number of victims in a year with its variance estimated by the observed number
of victims. The actual variance may be substantially larger.
The amount of ‘extra’ variance depends on the distribution of the number of
victims per accident. When more victims tend to occur in certain types of
accident, the variance of the number of victims tends to be higher. As a result,
this study confirms that it is better to approximate this variance by the sum of
the squared numbers of victims per accident rather than by the sum of the
numbers of victims per accident.
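The recommendation can be checked with a small simulation, sketched here under stated assumptions: accidents follow a Poisson law with mean 50, and the number of victims per accident takes the values 1, 2 and 3 with probabilities 0.7, 0.2 and 0.1. These numbers, the function names and the sampler (Knuth's multiplication method) are illustrative choices and are not taken from this study. Under this set-up the variance of the yearly number of victims is 50 × E(V²) = 120, while the expected number of victims is only 70.

```python
import math
import random


def simulate_victim_totals(lam, values, probs, replications, seed=0):
    """Simulate yearly victim totals under a compound Poisson scheme."""
    rng = random.Random(seed)
    threshold = math.exp(-lam)
    totals = []
    for _ in range(replications):
        # Knuth's multiplication method for one Poisson draw.
        n_accidents, product = 0, rng.random()
        while product > threshold:
            n_accidents += 1
            product *= rng.random()
        victims = rng.choices(values, probs, k=n_accidents)
        totals.append(sum(victims))
    return totals


def mean_and_variance(xs):
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


totals = simulate_victim_totals(50.0, [1, 2, 3], [0.7, 0.2, 0.1], 4000)
mean_total, var_total = mean_and_variance(totals)
```

The sample variance of the totals should lie near 120 rather than near the mean of 70, in line with estimating the variance by the sum of squared victim counts per accident.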

In order to compute the statistics, in some cases information at the level of

individual accidents is needed. For instance the estimate of the variance of

the number of fatalities is computed by summing the squares of the number

of fatalities for each individual accident. This information may not always be

available.


5. Model-based measurement of latent risk in time series with applications [26]

Risk is at the center of many policy decisions in companies, governments and other institu-

tions. The risk of road fatalities concerns local governments in planning countermeasures, the

risk and severity of counterparty default concerns bank risk managers on a daily basis and

the risk of infection has actuarial and epidemiological consequences. However, risk cannot be

observed directly and it usually varies over time. In this paper we introduce a general multi-

variate time series model for the analysis of risk based on latent processes for (i) the exposure

to an event, (ii) the risk of that event occurring and (iii) the severity of the event. Linear state

space methods can be used for the statistical treatment of the model. The new framework is il-

lustrated for time series of insurance claims, credit card purchases and road safety. It is shown

that the general methodology can be effectively used in the assessment of risk.

5.1. Introduction

In the statistics and econometrics literature the term “risk” can take many

meanings. Here we focus on event or operational risk: given a certain level

of exposure, what is the expected severity of loss due to certain events? Exam-

ples of exposure are the number (or value) of buildings owned by a corporate

firm or the size of agricultural land with a certain crop. The event can be fire

(relevant to buildings) and flooding (relevant to crops). This risk definition

contrasts with, for example, Value-at-Risk where the focus is on the maximum

loss with probability of, say, 1% in a prespecified period. These two approaches
to risk can be regarded as complements. Value-at-Risk focuses on the extreme

and total risk while operational risk is concerned with expected and more spe-

cific risk. Government and industry are concerned with a large variety of op-

erational risk in relation to many different events. For example, road safety is

of concern to the general public and therefore most governments take an ac-

tive role in this. Also, insurance companies focus on the risk of a certain claim

while epidemiologic research is usually concerned with medical risk of infec-

tion. There is growing pressure to develop risk models in a range of fields.

International regulations (from the Basel Committee on Banking Supervision)
require banks to be able to model and forecast risk. Road safety researchers have

considerable pressure from governments to evaluate past safety measures and

forecast future accidents and injuries.

[26] This chapter appeared as Bijleveld, Commandeur, Gould, and Koopman (2008).

Event or operational risk is generally concerned with (i) exposure to an event,
(ii) the probability of the event occurring and (iii) the severity of the event.

The time series modelling of event risk offers new insights into data and can

confirm or reject the validity of constant risk assumptions. There is substan-

tial evidence that simple deterministic models fail to adequately explain the

dynamics of risk. Recently a number of articles have examined stochastically

time-varying structures to model risk in epidemiological applications. For ex-

ample, Dominici, McDermott, and Hastie (2004) find evidence of time vary-

ing risk factors within a generalised additive model framework used to deter-

mine the interaction between mortality rates and air pollution concentrations.

Finkenstadt and Grenfell (2000) find evidence of seasonal time variations in a

model for measles epidemics. An illustration of modelling disease incidences

on the basis of latent processes is given by Morton and Finkenstadt (2005). In

actuarial research, there is a surprising lack of time series models for the risk

and severity of insurance claims. Among the few articles is de Jong and Boyle

(1983), in which Bayesian methods are applied to a state space model which

produces stochastically time-varying mortality rates. Harvey and Fernandes

(1989) also develop a model for insurance claims using latent factors where

both the size of claims and the number of claims are modelled. Automobile

insurance claims for multiple cohorts are analysed by Ledolter, Klugman, and

Lee (1991) who test for common latent factors across cohorts. In bank risk man-

agement there have been some articles examining the use of time varying pa-

rameters to model the risk of counterparty default. Allen and Saunders (2003)

highlight the need for dynamic approaches to modelling company default. A

time-varying logistic model for unemployment durations is developed for this

purpose by Fahrmeir and Wagenpfeil (1996). It assesses the probability of sub-

jects entering or leaving a state of unemployment. The results suggest there

is a need for time-variation in model parameters. In road safety research, the

framework of Oppe (1989) assumes that exposure follows a logistic-S curve

and log-risk evolves deterministically. The demande routière, des accidents et leur
gravité (DRAG) approach of Gaudry (1984) and Gaudry and Lassarre (2000)

uses regression and Box-Jenkins methods for separating the effects of crash

risk and exposure. Li and Kim (2000) use cross-sectional methods for this pur-

pose. Levitt and Porter (2001) show the importance of sample selection in a

micro-economic framework for analysing effects of seatbelts and airbags on

accident survival rates. Time series approaches are regarded as complementary
to cross-sectional methods since they also account for serial correlation
and can be used when only aggregated time series are available.

In this paper we introduce a general multivariate model for event risk anal-

ysis that can consider exposure, risk and severity simultaneously. The latent

risk time series (LRT) model can be applied to a range of problems involving

event risk and is not specifically limited to particular applications. The LRT


model is general and allows for the stochastic evolution of exposure, risk and

severity over time. It extends previous work by treating exposure and severity

as an integral part of the risk problem. In existing approaches some or all of

these variables (particularly exposure) are treated as known, when in reality

they are measured with error and are subject to stochastic variation. The LRT

model has a multivariate structure and therefore correlations between latent

processes and errors can be estimated. The multivariate decomposition can

include latent factors for trend, seasonal and cyclical dynamics together with

regression and intervention effects. It further allows for the forecasting of fu-

ture exposures, events and losses together with prediction confidence bounds,

which are of particular interest to risk managers. Finally, our multivariate
framework can also handle data with multiple cohorts.

The statistical framework, including state space forms and estimation methods,
is presented in Section 5.2. The exposure-risk motor vehicle insurance model is
the first example of an LRT analysis and is discussed in Section 5.3. The
exposure-risk-severity model for credit card use is treated in Section 5.4. The
multiple exposure-single risk model for bicycle and moped road traffic accidents
is presented in Section 5.5. The empirical illustrations include parameter
estimation, signal extraction of latent factors and some discussion of results.

5.2. The statistical framework

The latent risk time series (LRT) model includes latent factors for exposure
$E_{it}$, risk $R_{jt}$ and severity $S_{kt}$ which are associated with the
observed variables exposure $X_{it}$, outcome $Y_{jt}$ and loss $Z_{kt}$ for
subject indices $i = 1, \ldots, I$, $j = 1, \ldots, J$, $k = 1, \ldots, K$ and
time index $t = 1, \ldots, n$. The basic form of the model is for $I = J = K$
and links the observables with the latent factors via the multiplicative
relations

\[
X_{it} = E_{it} \times U^{(X)}_{it}, \qquad
Y_{it} = E_{it} \times R_{it} \times U^{(Y)}_{it}, \qquad
Z_{it} = E_{it} \times R_{it} \times S_{it} \times U^{(Z)}_{it},
\]

where $U^{(a)}_{it}$ are random error terms with unit mean for
$i = 1, \ldots, I$, $t = 1, \ldots, n$ and $a = X, Y, Z$. The exposure variable
$X_{it}$ can be the number of vehicle (type $i$) registrations or distance
travelled, the number or value of loans (type $i$) or population in region $i$.
The outcome variable $Y_{it}$ is typically the number of times a certain event
occurs for a group $i$ such as claims, accidents and successful treatments. The
loss variable $Z_{it}$ measures the severity of the outcome such as the dollar
value of claims or defaults (type $i$). The multiplicative error terms reflect
that observed variables are measured under uncertainty due to inaccurate
reporting and use of proxy variables. It is not necessary to set $I = J = K$
because multiple outcomes for only a single exposure variable can occur and
multiple types of severity can exist for a single outcome. For example, we can
have multiple types of accidents with cars so that $I = 1$ and $J > 1$.

Variables in logs are denoted by the lower-case version of the corresponding
capital letter used for the original variable, e.g. $e_{it} = \log E_{it}$.
Further, for any $t$, we denote $v_t = (v_{1t}, \ldots, v_{It})'$ where $v_{it}$
represents any variable with two indices $i$ and $t$ and with the first index
$i$ used as stacking argument for $i = 1, \ldots, I$ and $t = 1, \ldots, n$.
After taking logs (element by element) and stacking variables in vectors, the
multiplicative LRT equations become the linear system

\[
x_t = e_t + u^{(x)}_t, \qquad
y_t = e_t + r_t + u^{(y)}_t, \qquad
z_t = e_t + r_t + s_t + u^{(z)}_t,
\tag{5.1}
\]

where $u^{(a)}_t$ is a serially independent disturbance vector with zero mean
and variance matrix $\Sigma^{(aa)}_u$ for $a = x, y, z$. The disturbances can
also be mutually but instantaneously correlated and the corresponding
covariance matrix is denoted by $\Sigma^{(ab)}_u$ for $a, b = x, y, z$ and
$a \neq b$. In case the dimensions $I$, $J$ and $K$ do not match, the different
series for $x$, $y$ and $z$ can be distributed generally via

\[
x_t = e_t + u^{(x)}_t, \qquad
y_t = H_{yx} e_t + r_t + u^{(y)}_t, \qquad
z_t = H_{zy}(H_{yx} e_t + r_t) + s_t + u^{(z)}_t,
\tag{5.2}
\]

where the $J \times I$ matrix $H_{yx}$ and the $K \times J$ matrix $H_{zy}$ are
typically selection matrices consisting of ones and zeroes. It is assumed that
the dimensions of observed exposure (proxy) $x$ and latent exposure $e$, of
observed outcome $y$ and latent risk $r$ and of observed loss $z$ and latent
severity $s$ match and are equal to $I$, $J$ and $K$, respectively. It is
straightforward to modify (5.2) further to account for cases where the
dimensions of an observed and the corresponding latent variable do not match.
However, identifiability of the system becomes an issue in such cases while
this is not the case for system (5.2) since any latent variable is uniquely
linked with an observed variable.
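As a hedged illustration of the role of the selection matrix in (5.2), the sketch below takes one exposure series and two outcome series (I = 1, J = 2), so that both outcomes load on the same latent exposure. The numerical values and function names are hypothetical, and disturbances are left out so that only the signal part is shown.

```python
import math


def matvec(h, v):
    """Product of a matrix (nested lists) and a vector (list)."""
    return [sum(h_ij * v_j for h_ij, v_j in zip(row, v)) for row in h]


def outcome_signal(H_yx, e_t, r_t):
    """Signal of the outcome equation in (5.2): H_yx e_t + r_t."""
    return [he + r for he, r in zip(matvec(H_yx, e_t), r_t)]


H_yx = [[1.0], [1.0]]                      # J x I selection matrix, J = 2, I = 1
e_t = [math.log(2000.0)]                   # log-exposure (hypothetical)
r_t = [math.log(0.010), math.log(0.002)]   # log-risk per outcome type

log_y = outcome_signal(H_yx, e_t, r_t)
y_level = [math.exp(v) for v in log_y]     # back to exposure times risk
```

Exponentiating the additive signal recovers the multiplicative form, here an expected 20 and 4 events for the two outcome types.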

The additive system (5.1) is the observation equation where log-exposure
$e_{it}$, log-risk $r_{it}$ and log-severity $s_{it}$ are treated as latent
factors which can be modelled separately. The latent factors can be specified
as vector autoregressive integrated moving average (ARIMA) processes. A more
flexible approach is to let these factors depend on a sum of ARIMA processes
and fixed effects as advocated by Harvey (1989), known as structural time
series models, and Bell (2004), known as RegComponent models. For example,
latent factor $c$ may partly depend on a trend (long-term) component that is
modelled by

\[
\mu^{(c)}_{t+1} = \mu^{(c)}_t + \beta^{(c)}_t + \eta^{(c)}_t
\]

with $\beta^{(c)}_t = 0$ (local level model) or with

\[
\beta^{(c)}_{t+1} = \beta^{(c)}_t + \zeta^{(c)}_t
\]

(local linear trend model), where $\eta^{(c)}_t$ and $\zeta^{(c)}_t$ are
disturbance vectors with zero mean and variance matrices $\Sigma^{(cc)}_\eta$
and $\Sigma^{(cc)}_\zeta$, respectively, for $c = e, r, s$. The disturbance
vectors $\eta^{(c)}_t$ and $\zeta^{(c)}_t$ for latent factor $c$ are mutually
independent. However, the contemporaneous covariance matrix between
disturbance vectors $\eta^{(c)}_t$ and $\eta^{(d)}_t$, denoted by
$\Sigma^{(cd)}_\eta$, can be nonzero for $c, d = e, r, s$ and $c \neq d$. This
may also apply to $\zeta^{(c)}_t$ and $\zeta^{(d)}_t$.
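The trend recursions above can be sketched for a single scalar factor as follows. The function name, the Gaussian disturbances and the parameter values are assumptions made for illustration; the model in the text is multivariate and allows correlated disturbances across factors, which this sketch does not cover.

```python
import random


def simulate_local_linear_trend(n, level0, slope0, sd_level, sd_slope, seed=0):
    """Simulate mu_{t+1} = mu_t + beta_t + eta_t, beta_{t+1} = beta_t + zeta_t."""
    rng = random.Random(seed)
    level, slope = level0, slope0
    levels = []
    for _ in range(n):
        levels.append(level)
        # Both updates use the slope of the current period.
        level, slope = (level + slope + rng.gauss(0.0, sd_level),
                        slope + rng.gauss(0.0, sd_slope))
    return levels


# Local level model: zero slope and no slope disturbance.
local_level_path = simulate_local_linear_trend(50, 7.0, 0.0, 0.05, 0.0)

# With both standard deviations zero the trend is a deterministic line.
deterministic_path = simulate_local_linear_trend(5, 1.0, 0.5, 0.0, 0.0)
```

Setting the slope disturbance to zero while keeping a non-zero initial slope gives a random walk with drift, an intermediate case between the two models named in the text.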

In case the time series is observed at quarterly or monthly frequencies, the
series may be subject to seasonal effects. The latent factor $c$ may then
depend on a periodic (seasonal) process that can be modelled by
$\sum_{j=0}^{p-1} \gamma^{(c)}_{t+1-j} = \omega^{(c)}_t$ (stochastic seasonal
dummy) where $\gamma^{(c)}_t$ is the seasonal effect at time $t$ with seasonal
length $p$ for $c = e, r, s$. The disturbance vector $\omega^{(c)}_t$ has
similar properties to the disturbance vectors $\eta^{(c)}_t$ and
$\zeta^{(c)}_t$, but the three are mutually independent. Apart from trend and
seasonal dynamics, latent factors can be further composed of, possibly
stationary, ARIMA processes.

Regression effects can be added to the latent factor as is common within the
frameworks of structural time series models and RegComponent models. Fixed
regression effects can also include intervention effects for outlying
observations $D^{(c)}_t(\tau; 0)$, level breaks in trend $D^{(c)}_t(\tau; 1)$
and slope breaks in trend $D^{(c)}_t(\tau; 2)$, where $1 < \tau < n$ is a fixed
time point at which the intervention occurs for factors $c = e, r, s$. We can
formally define the interventions by $D^{(c)}_t(\tau; 0) = 1$,
$\Delta D^{(c)}_t(\tau; 1) = 1$ and $\Delta^2 D^{(c)}_t(\tau; 2) = 1$ for
$t = \tau$, all being zero otherwise, with difference operator
$\Delta = 1 - B$ and backshift operator $B$ so that
$\Delta y_t = (1 - B) y_t = y_t - y_{t-1}$. An illustration of intervention
analysis in this framework is presented by Harvey and Durbin (1986) for a
univariate time series of road accidents. A general model-based methodology
for identifying interventions from a given time series is developed by de Jong
and Penzer (1998).
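The three intervention variables can be generated by inverting the difference operator, that is, by taking cumulative sums of a pulse. The sketch below assumes that all variables are zero before time τ; the function name is hypothetical.

```python
def intervention_dummy(n, tau, order):
    """Intervention variable D_t(tau; order) for t = 1, ..., n.

    order 0: pulse (outlier), order 1: step (level break),
    order 2: ramp (slope break). Values before tau are taken as zero.
    """
    series = [1.0 if t == tau else 0.0 for t in range(1, n + 1)]
    for _ in range(order):
        # A cumulative sum undoes one application of the difference operator.
        running, summed = 0.0, []
        for value in series:
            running += value
            summed.append(running)
        series = summed
    return series


pulse = intervention_dummy(6, 3, 0)
step = intervention_dummy(6, 3, 1)
ramp = intervention_dummy(6, 3, 2)
```

For n = 6 and τ = 3 the pulse is non-zero only at t = 3, the step equals one from t = 3 onwards, and the ramp increases by one per period from t = 3.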

Components and fixed effects are assumed to be part of the latent factors et,

rt and st and therefore part of the LRT system. This implies that a seasonal or

an intervention effect in observed exposure also enters the equations for ob-

served outcome and loss. However, the modeller may decide that some effects

need to appear exclusively in one equation. We therefore need to introduce
the idiosyncratic latent factors $e^{(x)}_t$ and $r^{(y)}_t$ for the
observation equations for exposure and outcome, respectively. We obtain

\[
\begin{aligned}
x_t &= e_t + e^{(x)}_t + u^{(x)}_t, \\
y_t &= H_{yx} e_t + r_t + r^{(y)}_t + u^{(y)}_t, \\
z_t &= H_{zy}(H_{yx} e_t + r_t) + s_t + u^{(z)}_t.
\end{aligned}
\tag{5.3}
\]

The compositions of the idiosyncratic factors $e^{(x)}_t$ and $r^{(y)}_t$ can
be specified in the same way as the factors $e_t$, $r_t$ and $s_t$ as described
above. Some care needs to be taken with respect to the identification of the
factors. For example, in case $I = J = K = 1$, a seasonal component can appear
in both $e_t$ and $e^{(x)}_t$ but for the remaining two equations only one
additional seasonal component is available since only three observed series
are given to identify the seasonal effects. This also applies to other effects
that are part of the model.

It is well documented in the literature (see earlier references in this section)

that different linear dynamic processes can be formulated in state space form

jointly. The state equation as formulated in Durbin and Koopman (2001) is
given by

\[
\alpha_{t+1} = T_t \alpha_t + G_t \xi_t, \qquad \xi_t \sim N(0, Q_t), \qquad t = 1, \ldots, n,
\tag{5.4}
\]

where the initial state vector $\alpha_1$ is specified separately. For
example, the local level model defined above is obtained from (5.4) by having
$T_t$ and $G_t$ as identity matrices and setting $Q_t = \Sigma^{(cc)}_\eta$.
Regression effects can also be considered as a part of the state vector. In
this framework we define a component as a linear function of the state vector
containing latent processes and regression effects. While matrix $G_t$ is
typically a known selection matrix, elements of the matrices $T_t$ and $Q_t$
may be unknown as is apparent from the example given above. The unknown
elements are collected in the parameter vector $\psi$ and are estimated as
described below.

The state space formulation is completed with the observation equation

\[
\begin{pmatrix} x_t \\ y_t \\ z_t \end{pmatrix}
=
\begin{pmatrix}
F_I & 0 & 0 & F_I & 0 \\
H_{yx} & F_J & 0 & 0 & F_J \\
H_{zy} H_{yx} & H_{zy} & F_K & 0 & 0
\end{pmatrix}
\theta_t + u_t, \qquad \theta_t = W_t \alpha_t, \qquad t = 1, \ldots, n,
\tag{5.5}
\]

with $i \times i$ identity matrix $F_i$ for $i = I, J, K$, signal vector
$\theta_t = (e_t', r_t', s_t', e^{(x)\prime}_t, r^{(y)\prime}_t)'$ and
disturbance vector $u_t = (u^{(x)\prime}_t, u^{(y)\prime}_t, u^{(z)\prime}_t)'$.
Matrix $W_t$ links the signal $\theta_t$ with the state $\alpha_t$ by selecting
the appropriate elements of the state vector that contains the components and
fixed regression effects required for modelling the dependent time series
$x_t$, $y_t$ and $z_t$.

The state equation (5.4) and the observation equation (5.5) define the state
space model and enable the application of the Kalman filter for the filtering
of the state vector. Filtering refers to the estimation of $\alpha_t$
conditional on observations up to and including time $t$. Smoothing is similar
but the estimation is conditional on all observations (up to and including
time $n$). A related method carries out the computations for smoothing. Both
methods also compute mean squared errors for the estimators. In case all
disturbances in the model are normally distributed, we obtain minimum mean
squared error estimators. When normality is not assumed, they are minimum mean
squared error linear estimators.

A textbook treatment of state space methods is given by Durbin and Koop-

man (2001) while a non-technical introduction is given by Commandeur and

Koopman (2007).
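For the simplest special case, a univariate local level model, the filtering recursions can be sketched as below. This is an illustration under stated assumptions and not the multivariate LRT implementation: the function name is hypothetical, the model is y_t = α_t + ε_t with α_{t+1} = α_t + ξ_t, and the unknown initial state is handled with a large initial variance as a simple stand-in for a diffuse prior.

```python
def local_level_filter(y, var_obs, var_state, a1=0.0, p1=1e9):
    """Kalman filter for y_t = alpha_t + eps_t, alpha_{t+1} = alpha_t + xi_t.

    Returns the filtered state means and their mean squared errors.
    """
    a, p = a1, p1                       # predicted state mean and variance
    filtered, filtered_var = [], []
    for obs in y:
        v = obs - a                     # one-step prediction error
        f = p + var_obs                 # prediction error variance
        k = p / f                       # Kalman gain
        a_filt = a + k * v              # estimate given data up to time t
        p_filt = p * var_obs / f        # its mean squared error
        filtered.append(a_filt)
        filtered_var.append(p_filt)
        a, p = a_filt, p_filt + var_state   # predict the next state
    return filtered, filtered_var


means, mses = local_level_filter([1.0, 2.0, 3.0], var_obs=1.0, var_state=0.0)
```

With a zero state variance the level is constant, and the filtered estimate reduces (up to the influence of the initial variance) to the running sample mean of the observations.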

The Kalman filter carries out the prediction error decomposition for a given

state space model and a particular value of ψ. This implies that the likelihood

function can be evaluated by the Kalman filter for a given ψ. Maximum like-

lihood estimation of ψ then becomes a standard exercise of numerically max-

imising the likelihood function with respect to ψ. In the empirical applications

of the LRT model below, parameters in $\psi$ are limited to the elements of
variance matrices such as $\Sigma^{(cc)}_\eta$ given above for the local level
model. Regression coefficients can be placed in the state vector. To ensure
positive semi-definite variance matrices, a variance matrix is decomposed as
$\Sigma^{(cc)}_\eta = M'M$ where $M$ is a symmetric matrix.
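The effect of this decomposition can be sketched numerically. The product M'M is positive semi-definite for any real square matrix M, so the elements of M can be varied freely during numerical optimisation. The sketch uses upper triangular 2 × 2 factors with hypothetical values, which differs from the symmetric M mentioned in the text; the function name is also hypothetical.

```python
import math


def variance_from_factor(m):
    """Return M'M for a square matrix M given as nested lists."""
    size = len(m)
    return [[sum(m[k][i] * m[k][j] for k in range(size))
             for j in range(size)] for i in range(size)]


# Full-rank factor: a proper variance matrix with correlation above -1.
sigma = variance_from_factor([[0.3, -0.2], [0.0, 0.1]])
determinant = sigma[0][0] * sigma[1][1] - sigma[0][1] * sigma[1][0]

# Rank-one factor: the two disturbances are perfectly correlated.
sigma_singular = variance_from_factor([[0.3, -0.2], [0.0, 0.0]])
correlation = sigma_singular[0][1] / math.sqrt(
    sigma_singular[0][0] * sigma_singular[1][1])
```

A rank-deficient factor yields a singular variance matrix, which is how a perfect correlation between two disturbances can arise as a boundary solution of the estimation.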

5.3. Case I: a two-dimensional insurance LRT model

The first illustration of the latent risk model concerns insurance policies and

claims related to motor vehicle fatalities in Victoria, Australia. We analyse

annual time series consisting of the number of vehicle registrations (in thou-

sands, exposure xt) and the number of claims (in units, outcome yt) for the

years 1950–2001. Registrations are a measure of the total stock of vehicles on

Victorian roads. The two time series are presented in the upper panels of
Figure 5.1. The registrations series displays a smooth upward trend while the
fatal claims series has a “hump” shape, with a peak in the early 1970s. Since

registrations have increased monotonically over the past 50 years, the reduc-

tion in fatal claims must have been caused by a decrease in risk. Risk reduc-


tions have been driven by gradual improvements in vehicle and road design

together with increased public awareness. Demographic factors have also been
important as a new generation of road users (“baby boomers”) started
driving. Public horror at a road casualty toll of 1034 for Victoria in 1970 led to

newspaper declarations of “war on 1034”. This has been indicative of chang-

ing attitudes towards road safety. The effects on attitude have proved to be

long-term. Other important relevant events in the sample are the introduc-

tion of seat belt laws in 1971 and the increased enforcement and mass media

advertising campaigns on road safety in the early 1990s.

[Figure 5.1: six panels in three rows and two columns; horizontal axes show the years 1950–2000.]

Figure 5.1. Time series of registered vehicles (in thousands) and crash
fatalities (in units) in Victoria, Australia (row 1). Smooth estimates of
exposure (column 1) and risk (column 2) factors modelled as stochastic trends
(row 2) with stochastic slopes (row 3), including interventions.

The policy exposure series xt and claim outcome series yt are both univariate

(data is not disaggregated into groups or cohorts). A time series for loss zt

(e.g., the dollar value of payouts on claims) is not available and therefore we

consider a two-dimensional LRT model that consists of the first two equations

in (5.3) with Hyx = 1 and with dimensions I = J = 1. The latent factors et and

rt are modelled as local linear trends. The following special events are consid-

ered as intervention variables: (i) in 1970, a publicity campaign was launched

to increase public and governmental awareness of road safety issues (“war on

1034”); (ii) in 1971, introduction of seat belt laws; (iii) in 1980, change in data

collection on vehicle registrations; (iv) in 1990, introduction of advertising and

enforcement initiatives aimed at reducing accident risk; (v) in 1992, another

change in data collection on vehicle registrations. The changes in data collec-


tion should only affect exposure and are therefore part of the latent factor et

while the other events should have an effect on risk rt. The intervention (i) is a

long-term effect and therefore captured by a change in the slope term of risk.

The events (ii) and (iv) are taken as immediate step changes in the level of risk.

These interventions are confirmed by applying the methods of de Jong and

Penzer (1998) to this data set. The interventions (i), (ii) and (iv) are assumed

to only have an impact on accident risk, as none of the measures are aimed at

reducing road use.

Estimates for a selection of parameters are displayed in Table 5.2. Standard er-

rors are computed but space considerations prevent us from presenting them.

The estimated (co)variances for trend and slope disturbances for the two latent

factors reveal that exposure and risk are perfectly negatively correlated:

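\[
\frac{\Sigma^{(er)}_{\eta}}{\sqrt{\Sigma^{(ee)}_{\eta} \cdot \Sigma^{(rr)}_{\eta}}}
= \frac{\Sigma^{(er)}_{\zeta}}{\sqrt{\Sigma^{(ee)}_{\zeta} \cdot \Sigma^{(rr)}_{\zeta}}}
= -1.
\]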

The perfect negative correlations mean that both exposure and risk factors are

subject to the same stochastic shocks that determine their time-varying be-

haviour. This finding is in agreement with most road crash research, which

finds a strong negative relationship between risk and exposure. There are a

number of reasons for this relationship, including the fact that roads become

more congested as exposure increases, which slows vehicle speeds such that

fatal or serious injury accidents are less likely. In developed countries, there

has been a period of increased road use and decreasing fatal accident risk over

the past 35 years. Over this period, technology and safety awareness have im-

proved, which is also an indirect cause of the negative correlation. The perfect

correlation of shocks implies that the components can be interpreted as com-

mon factors. Nevertheless, the estimated components are distinct from each

other since they are also subject to different interventions.
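As a rough numerical check, the implied correlations can be recomputed from the rounded Case I entries of Table 5.2. The sketch below is illustrative only: the function name `correlation` is hypothetical, and because the published (co)variances are rounded, the results need not equal −1 exactly. The common scale factor of the Case I column cancels in the ratio.

```python
import math


def correlation(cov, var_a, var_b):
    """Correlation implied by a covariance and the two matching variances."""
    return cov / math.sqrt(var_a * var_b)


# Rounded Case I values as printed in Table 5.2 (common scale factor cancels).
trend_corr = correlation(-0.640, 0.31, 1.30)    # trend disturbances
slope_corr = correlation(-0.070, 0.040, 0.130)  # slope disturbances

print(f"trend correlation: {trend_corr:.3f}")
print(f"slope correlation: {slope_corr:.3f}")
```

With the rounded inputs the two values come out at roughly −1.01 and −0.97, which is consistent with a perfect negative correlation up to rounding of the reported estimates.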

The estimates of the intervention coefficients are presented in Table 5.2. The

estimated intervention for the anticipated break in the level of exposure due

to a change in the data collection of policies (registrations) is clearly signifi-

cant for 1992 but less significant for 1980. The level interventions for risk in

1971 (seat belt laws) and 1990 (advertising initiatives) are very significant. The

magnitude of the 1990 intervention is roughly three and a half times that of the seat belt law introduced in 1971. However, the 1971 seat belt effect may partly be confounded with the highly significant “war on 1034” effect on the slope of log-risk. This estimated effect of −0.079 implies that each year a reduction of 0.079 is achieved in log-risk. The combined effects of 1970 and 1971 therefore have more impact than the advertising campaign in 1990. Since these events occur close together at the beginning of the 1970s, it is difficult to disentangle their effects.
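To make the comparison concrete, the intervention coefficients on log-risk can be translated into approximate percentage changes in risk, and the slope effect can be accumulated over the years. The following sketch assumes, for illustration only, that the slope change accrues deterministically at −0.079 per year; it ignores the stochastic disturbances and the exact timing convention of the intervention variables. The function names are hypothetical.

```python
import math

# Intervention estimates on log-risk as reported in Table 5.2 (Case I).
SLOPE_1970 = -0.079   # "war on 1034": change in the slope of log-risk
LEVEL_1971 = -0.108   # seat belt law: step change in the level of log-risk
LEVEL_1990 = -0.376   # advertising initiative: step change in the level


def pct_change(log_effect):
    """Percentage change in risk implied by an additive effect on log-risk."""
    return 100.0 * (math.exp(log_effect) - 1.0)


def combined_log_effect(years_of_slope_effect):
    """1971 step plus the accumulated 1970 slope effect (deterministic sketch)."""
    return LEVEL_1971 + SLOPE_1970 * years_of_slope_effect


def years_until_exceeds(target):
    """Smallest number of years after which the combined effect passes target."""
    years = 0
    while combined_log_effect(years) > target:
        years += 1
    return years


print(f"1971 step: {pct_change(LEVEL_1971):.1f}% change in risk")
print(f"1990 step: {pct_change(LEVEL_1990):.1f}% change in risk")
print("years of slope effect needed:", years_until_exceeds(LEVEL_1990))
```

Under these simplifying assumptions the 1971 step corresponds to a fall in risk of about 10%, the 1990 step to a fall of about 31%, and the combined early-1970s effects overtake the 1990 step after four years of accumulated slope effect.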

Figure 5.1 presents the estimated level and slope components of exposure and

risk (in logs). The estimated components are subject to both random shocks

and interventions. The salient features of the analysis are the increasing expo-

sure with a significant slope term throughout the sample, and the decreasing

risk with a significant negative slope term that is mainly caused by the pub-

licity intervention. Risk displays relatively more stochastic variation in both

the estimated level and slope terms. Apart from the intervention shocks, level

and slope components of risk are perfectly and negatively correlated with level

and slope components of exposure, respectively. The estimated slopes of risk

and exposure are of opposite sign but both evolve towards zero. This suggests

a long-term flattening of risk and exposure, which is evident in the data. The

level terms are also perfectly and negatively correlated. As exposure increases

around its slope, risk decreases. Exposure evolves relatively smoothly, with

the slope term driving much of the variation.

5.4. Case II: a three-dimensional credit card LRT model

In this section we study the developments in the usage of credit cards in Aus-

tralia. The dataset consists of monthly observations, from May 1994 through

to August 2004 (124 observations), with the number of credit card accounts

(exposure xt), the number of purchases made by credit cards (outcome yt) and

the total dollar value of purchases by credit cards (loss zt), as presented in the

upper row of Figure 5.3. The analysis is crucial for marketing credit cards but

is also of concern to bank risk managers who have an interest in Australian

consumers’ reliance on credit card debt. Since the observed time series for xt,

yt and zt have (rapid) increasing patterns, we model the latent factors et, rt and

st as local linear trends. The monthly series yt and zt have also seasonal fluctu-

ations around the trend due to changing consumer behaviour within the year

due to, for example, Christmas and Easter. The seasonal factors should not

necessarily affect risk and severity and therefore we adopt model (5.3) with

r(y)t and st as stochastic seasonal dummy processes. The data is in nominal

terms so that severity includes inflationary effects. Furthermore, we examine

the event of January 2002 when the Reserve Bank of Australia started to in-

clude credit card accounts from commercial banks and other financial institu-

tions in the sample. The inclusion of data from other credit card issuers means

that the number of credit cards has increased but the unobserved factors risk and severity may also change since the new issuers in the sample of credit card users may represent customers with different spending patterns. The change in the composition of the sample in January 2002 is permanent and therefore level interventions for this month are appropriate and are included for the latent trend factors et, rt and st.

| parameter | description | Case I (×10⁻³) | Case II (×10⁻⁵) | Case III (×10⁻⁴) |
|---|---|---|---|---|
| Σ(ee)η | variance trend exposure | 0.31 | 1.33 | |
| Σ(rr)η | variance trend risk | 1.30 | 8.91 | |
| Σ(ss)η | variance trend severity | | 1.18 | |
| Σ(er)η | covariance trend exposure-risk | −0.640 | | |
| Σ(ee)ζ | variance slope exposure | 0.040 | 0.0261 | 0.27 |
| Σ(rr)ζ | variance slope risk | 0.130 | 0.1590 | 20.0 |
| Σ(ss)ζ | variance slope severity | | 0.0014 | |
| Σ(er)ζ | covariance slope exposure-risk | −0.070 | −0.0371 | |
| Σ(es)ζ | covariance slope exposure-severity | | −0.0058 | |
| Σ(rs)ζ | covariance slope risk-severity | | 0.0056 | |
| Σ(yy)ω | variance seasonal outcome | | 220 | |
| Σ(zz)ω | variance seasonal loss | | 181 | |
| Σ(yz)ω | covariance seasonal outcome-loss | | 194 | |
| Σ(xx)u | variance disturbance exposure | 0.16 | 0.31 | 9.70 |
| Σ(yy)u | variance disturbance outcome | 4.21 | 1.07 | 0.47 |
| Σ(zz)u | variance disturbance loss | | 4.13 | |

| intervention | description | Case I | Case II | Case III |
|---|---|---|---|---|
| D(r)t (1970; 2) | “war on 1034” | −0.079∗∗∗ | | |
| D(r)t (1971; 1) | seat belt law introduction | −0.108∗∗∗ | | |
| D(e)t (1980; 1) | data collection change | −0.086∗∗ | | |
| D(r)t (1990; 1) | advertising initiative | −0.376∗∗∗ | | |
| D(e)t (1992; 1) | data collection change | −0.066∗∗∗ | | |
| D(x)t (2002.1; 1) | data collection change | | 0.062∗∗∗ | |
| D(y)t (2002.1; 1) | data collection change | | 0.066∗∗ | |
| D(s)t (2002.1; 1) | data collection change | | 0.083∗∗∗ | |
| D(e)t (1991; 1) | free travel pass introduction | | | −0.180∗∗ |
| D(r)t (2000; 1) | start of law moped on main road | | | −0.310∗∗∗ |

Table 5.2. Parameter estimates for disturbance (co)variances and interventions. In case of interventions, ∗∗ and ∗∗∗ indicate significance at 90% and 95% levels, respectively. The last three columns are for the three models described in the sections for Case I, II and III.

Figure 5.3. Monthly time series related to credit cards data from Australia (row 1): number of cards (xt), number of purchases (yt) and their value (zt). Smooth estimates of exposure (column 1), risk (column 2) and severity (column 3) factors modelled as stochastic trends (row 2) with stochastic slopes (row 3). Intervention estimates are added to the trends.

The parameter estimates are given in Table 5.2. The variance matrices for trend

and observation noises are taken as diagonal. This is strongly supported by

the fact that maximum likelihood estimation produces almost equal likelihood

values for models with and without this restriction. The estimated variances

of the seasonal disturbances are relatively large compared to the observation

noise. Further, the estimate for the seasonal covariance Σ(yz)ω implies a high

correlation and it may therefore be sufficient to consider model (5.2) with the

inclusion of a seasonal component for rt only. The estimated trends presented

in Figure 5.3 are smooth and their slopes are varying over time. The log-risk

growth is decreasing from 1999 onwards while severity growth is more con-

stant over time. Exposure growth is hump-shaped. The three intervention

estimates are highly significant and are added to the estimated trends in Figure 5.3 even though they are part of the observation equations. Although the

risk factor is significantly affected by the intervention for the change in sur-

vey composition, the severity of credit card purchases increased the most. It

can therefore be concluded that the new account holders in the survey from

January 2002 onwards are making more expensive purchases with their credit

cards. The new customers have had a smaller effect on the risk (intensity) of

making a purchase.

5.5. Case III: a multiple exposure LRT model

The yearly number of persons killed and seriously injured (KSI) in collisions

between mopeds and bicycles in the Netherlands is closely watched since they

involve mostly young persons. Further, mopeds and bicycles are widely and

intensively used in the Netherlands. An official study was carried out to in-

vestigate the risk of this category of KSI accidents. For this purpose, a dataset

has been constructed with two exposure variables (I = 2) and one outcome

variable (J = 1). The two exposure variables consist of the numbers of kilo-

metres driven by mopeds and by bicycles. The outcome variable is the yearly

number of accidents where the primary collision partners are one moped user

and one bicycle user, and where the victims are either killed or hospitalised.

The yearly observations range from 1985 to 2003. Given the short sample, the

model used was parsimonious to preserve a sufficient number of degrees of

freedom.

The three time series are presented in the upper panels of Figure 5.4. For the

two exposure series, the 95% confidence intervals are also presented. These

are based on the published survey error variances. The number of kilometres driven by bicycles is subject to stepwise increases in the late 1980s and in 1994 while that driven by mopeds shows a gradual decrease over the years. The increase in 1994 for the bicycle kilometres driven may be explained by the extension of the sample with persons under 12 years of age. The narrowing of the 95% confidence intervals for the two exposure series from 1994 onwards is due to the increase of the survey sample size by a factor of two. The yearly number of accidents shows stepwise decreases in 1991 and in 2000. It is anticipated

that the decrease in 1991 coincides with the introduction of a free travelpass

for students (typically between 17 and 21 years of age). The travelpass gave

free access to the national and local public transport systems (mainly buses

and trains). The usage of the free travelpass became more and more restricted

over the years from 1995 onwards. This may partly explain the slow increase

of KSI accidents in the late 1990s. It is reasonable to argue that the decrease in

2000 may have been caused in part by the introduction of a law that moved

all mopeds from the special bicycle roads (or tracks) to the main roads in use

by other motorized vehicles (motors, cars, trucks). This law only applies to

situations where special bicycle roads or tracks exist and where the traffic con-

ditions are sufficiently safe. Therefore many exceptions to this law exist and

the “mopeds on the roadway” law can only partly explain the 2000 drop.

The first two equations of the LRT model (5.2) are considered with I = 2,

J = 1 and Hyx = (1 1). The two latent factors in et and the latent factor rt

are modelled as local linear trends. All variance matrices are diagonal. The

two variances of u(x)t depend on a parameter plus a known time-varying value

that is implied by the different precisions of the surveys. This also applies

to the variance of u(y)t but the time-varying value is implied by the normal

approximation of the Poisson counts of accidents. The estimated parameters

are reported in Table 5.2. Given the short time-span of the sample, the time-

variations in the level and slope components are limited. The variances of the

level disturbances are estimated as zero. In the case of kilometres driven by

mopeds, the slope variation is also estimated as zero and therefore we obtain a

fixed time trend that is only interrupted by the estimated intervention in 1991.

The constant variance of the observation noise for moped volume is estimated

as zero so that the random noise is only due to the variation in the different

sample sizes over the years.

Figure 5.4. Yearly time series of traffic volume (in billion kms) of bicycles and mopeds together with counts of accidents between them in The Netherlands (row 1). Smooth estimates of bicycle (column 1) and mopeds (column 2) exposure and risk (column 3) factors modelled as stochastic trends (row 2) with stochastic slopes (row 3).

Two significant intervention estimates are reported in Table 5.2. The first esti-

mate is for the effect of the variable representing the introduction of the free

travel pass in 1991 on kilometres driven by mopeds. The second estimate is

for the variable representing the effect of the law of “mopeds on the roadway”

on risk. The extension of the sample for bicycle volume with children under

12 years of age did not affect the analysis. We have also experimented with

other possible interventions but their inclusion had little or no impact on the

value of the likelihood function. The estimated smooth trends for exposure

and risk are displayed in Figure 5.4. Risk is decreasing until the early 1990s,

but has been increasing since 1993. The estimated slope pattern for risk may

be explained by the popularity of light mopeds for which it is not obligatory

to wear a crash helmet. It is evident that accidents are likely to be more severe

when the concerned moped drivers do not wear helmets. This may explain the

increasing trend in KSI accidents.

5.6. Conclusions

In this paper we propose a latent risk time series model for measuring event

risk. The multivariate modelling framework includes latent dynamic factors

for exposure, risk and severity. The multivariate nature of the model means

that common factors can be identified through the correlation structure of la-

tent dynamic processes. The magnitude and sign of correlations may provide

interesting interpretations for researchers. The stochastic trend and seasonal

factors are time-varying by nature and arbitrary re-calibrations of model pa-

rameters are not needed. This is an advantage inherent in the unobserved

components time series modelling approach.

The application to credit cards data showed that stochastic variation is im-

portant in measuring the risk and severity of credit card purchases. For the

car insurance data, stochastic variation seems less important. It appears that

structural breaks explain most of the changes in risk and exposure over the

past 50 years. The illustration of accidents between mopeds and bicycles has

shown that the model can also include multiple categories of exposure varia-

bles. When more data is available, more detailed categories of exposure, risk

and severity can be considered. For example, different risk factors can be in-

cluded for males/females, different age groups and different regions. Future

research is directed towards extending the modelling framework further for

handling multiple categories or panel (longitudinal) structures in data.

6. Multivariate nonlinear time series modelling of exposure and risk in road safety research²⁷

In this paper we consider a multivariate nonlinear time series model for the analysis of traffic

volumes and road casualties inside and outside urban areas. The model consists of dynamic

latent (unobserved) factors for exposure and risk that are related in a nonlinear way. The

multivariate dimension of the model is due to its inclusion of multiple time series for inside

and outside urban areas. The analysis is based on the extended Kalman filter. Approximate

maximum likelihood methods are utilised for the estimation of unknown parameters. The

latent factors are estimated by extended smoothing methods. We present a case study of yearly

time series of numbers of fatal accidents (inside and outside urban areas) and numbers of kilometres driven by motor vehicles in the Netherlands between 1961 and 2000. The analysis

accounts for missing entries in the disaggregated numbers of driven kilometres although the

aggregated numbers are observed throughout. It is concluded that the salient features of the

observed time series are captured by the model in a satisfactory way.

6.1. Introduction

This paper considers a multivariate nonlinear time series model for the analy-

sis of traffic volume and road accident data. The model is based on the class

of multivariate unobserved components time series models and is modified to

allow for nonlinear relationships between components. The analysis relies on

disaggregated and aggregated data and can account for missing entries in the

data set. Missing observations are quite usual in road safety analysis where

disaggregated data are not available throughout the sample period but data at

the aggregated level are available for a longer period. The nonlinear nature of

the model arises from the fact that locally the expected number of fatal acci-

dents for a year equals risk times exposure estimates for that year. This multi-

plicative relationship can be made additive by taking logarithms in the usual

way. However since the analysis is based on aggregated and disaggregated

data, summing constraints need to be considered as well. This mixture of mul-

tiplicative and additive relations in the model calls for a nonlinear analysis.

Furthermore, the analysis is for a vector of time series and the model consists

of multiple latent variables. Therefore, we adopt multivariate nonlinear state

space methods for the analysis of road accidents.

The empirical motivation is to analyse the development of road safety inside and outside urban areas in the Netherlands between 1961 and 2000. The expected annual number of fatal accidents is defined by a risk factor times exposure. Risk and exposure are simultaneously treated as latent or unobserved components. The expected number of vehicle kilometres driven (traffic volume) is set equal to the latent exposure component. The observed traffic volume and the observed number of fatal accidents are available for inside and outside urban areas in the Netherlands. However, for some periods only the total number of vehicle kilometres driven (the sum of numbers for inside and outside urban areas) is available. For these periods, the expected total number of vehicle kilometres is set equal to the sum of the latent exposure components for inside and outside urban areas.

²⁷ Co-authored with Jacques Commandeur, SWOV Institute for Road Safety Research, Leidschendam, Netherlands, and Siem Jan Koopman and Kees van Montfort, Department of Econometrics, Vrije Universiteit Amsterdam, Netherlands.

Since the seminal paper of Smeed (1949), time-ordered accident data have been analysed in many studies in road safety. In Smeed (1949) it is argued that the

annual number of fatalities per registered motor vehicle can be explained by

means of the motorisation, measured by the number of registered motor ve-

hicles per capita. The availability of more detailed time series data has led

to advanced and interesting statistical studies on road safety. An example is

the introduction of the use of traffic volume data. Traffic volume (e.g. ve-

hicle kilometres driven, sometimes travel kilometres) is currently assumed to

be one of the most important factors available for the explanation of accident

counts. Appel (1982) found an exponentially decaying risk when he decomposed the (expected) number of accidents into a risk component (accidents per kilometre driven) and exposure (kilometres driven). Similar approaches have

been adopted by Broughton (1991) and Oppe (1989, 1991b). These models are

univariate (one dependent variable) and some consist of just one explanatory

variable measuring traffic volume. Time-dependencies in the error structure

are ignored and estimation is based on classical methods.

Various time series analysis techniques, on the other hand, do take time-depen-

dencies in the error structure into account. For example, autoregressive inte-

grated moving average (ARIMA) techniques with explanatory variables (ARI-

MAX) as developed by Box and Jenkins (1976) are used in the DRAG (De-

mand for Road use, Accidents and their Gravity) analyses of Gaudry (1984)

and Gaudry and Lassarre (2000). A DRAG analysis consists of three stages:

first the traffic volume is modelled, next the accidents using the estimated traf-

fic volume, and then the number of victims per accident (severity). Such a

DRAG analysis is focussed on explaining the underlying factors of road safety

while earlier studies were more focussed on forecasting. The DRAG approach

allows for a non-linear transformation of the data by means of Box-Cox trans-

forms. The time series structure however is linear. The model in this paper

disentangles exposure and risk by unobserved components that are estimated

simultaneously rather than estimated by separate stages.

An alternative method for analysing road safety data was proposed by Harvey

and Durbin (1986) and is based on a structural time series model with inter-

ventions. This approach has been applied in road safety analysis by a number

of authors. Ernst and Bruning (1990), for example, used a structural time series

model to assess the effect of a German seat belt law while Lassarre (2001) ap-

plied structural time series models to compare the road safety developments

in a number of countries. The method of Harvey and Durbin (1986) can also

be extended to the simultaneous modelling of traffic volume, road safety and

severity, see Bijleveld et al. (2008). In these approaches linear Gaussian time se-

ries techniques such as the Kalman filter are used for estimation, analysis and

forecasting. In the present paper we need to adopt a nonlinear equivalent of a

structural time series model. Linear estimation techniques cannot be used as a

result and therefore we rely on extended (nonlinear) Kalman filter techniques.

Related approaches based on counts and with latent factors were discussed by

Johansson (1996).

In road safety analysis, the use of disaggregated data is useful when the sepa-

rate series can be modelled more effectively than the original aggregated time

series. For instance, the composition of transport modes inside urban areas is

usually different from that outside urban areas. Therefore, traffic volume and

safety are different in these two parts of the traffic system. The present paper

implements a model-based simultaneous treatment of traffic volume and fa-

tal accidents for inside and outside urban areas. An important feature of the

method is that it can handle the temporal unavailability of traffic volume data

at the disaggregated level, while still providing estimates of the disaggregated

exposure and risk for the full sample.

The paper is organised as follows. Section 6.2 presents the data used in the

application part of the paper. The relation between observed and unobserved

factors within a multivariate nonlinear time series model is described in detail

in Section 6.3, by first introducing the model and then providing a state space

formulation of the model. A description of the estimation methods is given

in Section 6.4. The main empirical results are presented in Section 6.5, and

in Section 6.6 implications for road safety research are discussed. Section 6.7

concludes.

Figure 6.1. Traffic volume in billions of motor vehicle kilometres (left panel) and the number of fatal accidents (right panel) for inside urban areas (solid line) and outside urban areas (dashed line). The total traffic volume in the left panel is marked by a dashed line over the whole period. The vertical lines inside the graphs mark the period in which disaggregated traffic volume data are available. This data-set will be modelled.

6.2. Data description

In the empirical study we analyse annual road traffic statistics from the Neth-

erlands consisting of numbers of fatal accidents and traffic volume, defined as

kilometres driven by motor vehicles, in the period 1961 up to and including

2000, both separated into inside and outside urban areas. This yields y1t as the

traffic volume inside urban areas, y2t as the traffic volume outside urban areas,

y3t as the total traffic volume, x1t as the number of fatal accidents inside urban

areas and x2t as the number of fatal accidents outside urban areas where time

index t = 1, . . . , n represents the index range for the years from 1961 up to and

including 2000. The total number of time points is therefore n = 40. All data

were obtained from the Dutch Ministry of Transport and Statistics Netherlands

while the accident information originated from police records.

The five time series are presented in Figure 6.1 with two displays. The left

hand display shows the development of the motor vehicle kilometres in the

Netherlands. Disaggregated figures of traffic volume y1t and y2t are missing

for the periods 1961 up to and including 1983 and 1997 up to and including

2000. For these years only the total traffic volume y3t is available. Only modest

deviations from an almost linear increase can be noticed from the traffic vol-

ume figures. These deviations are potentially caused by economic factors. The

right hand display in Figure 6.1 shows the development of the number of fa-

tal accidents in the Netherlands, both for inside and outside urban areas. The

total number of fatal accidents increased until the early 1970s. Since then the

trend has reversed, although the rate of decrease seems to slow down near the

end of the series.

Figure 6.2. Traffic intensity index (left panel) and total length in kilometers of main roads outside urban areas (right panel). The vertical lines inside the graphs mark the period in which disaggregated traffic volume data are available. This data-set is used for external model validation.

The results of the empirical analysis in Section 6.5 will be validated against an

alternative estimate of the traffic volume outside urban areas. This estimate is

composed of indexed figures on traffic intensity on main roads multiplied by

the length of the road system outside urban areas as obtained from a survey of

municipalities, which is not available for all time points. The data for these two

time series are presented in Figure 6.2. The data of the last years are considered

to be inconsistent due to changes in registration. The product of the latter two

series should be roughly equal to the development of motor vehicle kilometres

outside urban areas when it is assumed that the development of the traffic

intensity outside urban areas is approximately proportional to the intensity on

the main roads.

6.3. The multivariate nonlinear time series model

6.3.1. Specification of model and assumptions

The multivariate nonlinear time series model is based on two unobserved com-

ponents: a component for exposure (traffic volume) and a component for risk.

Each component is bivariate to disentangle the effects for inside and outside

urban areas. The statistical specification of the components is based on linear

dynamic processes. The observed time series of fatal accidents and driven mo-

tor vehicle kilometres depend on these factors. In particular, (i) the expected

number of fatal accidents is the product of risk and exposure, (ii) the expected

number of driven motor vehicle kilometres is proportional to the unobserved

factor exposure and (iii) the expected total number of driven motor vehicle

kilometres is proportional to the sum of the unobserved factors of exposure

inside and outside urban areas. The dynamic specifications of the unobserved

components are based on flexible time-varying trend functions. The level of smoothness of these trends can be estimated. Note that it will be found that the unobserved exposure components effectively increase with time while the unobserved risk components steadily decrease with time. As a result, because the increase in exposure coincides with a decrease in risk, the relation between exposure and the number of accidents flattens off, as it often does when nonlinear relations are assumed (Hauer, 1995). In addition, the general risk is likely to have decreased. It remains a question whether these effects will ever be distinguishable, as relations like those proposed in Hauer (1995) are likely to have changed as well over the 40-year period considered in this study. Disentangling them has not been attempted in this study.

Figure 6.3. The total number of fatal accidents per billion vehicle kilometres (left panel) and in logs (right panel). The vertical lines inside the graphs mark the period in which disaggregated traffic volume data are available.

Disaggregated time series data for inside and outside urban areas are available

for fatal accidents and driven kilometres although for the latter series the data

are not available for the full sample. However the yearly series of total number

of driven kilometres is available for the full sample. The five time series (par-

tially consisting of missing values) are modelled simultaneously. A log-linear

model can be considered to treat the multiplicative dependency (product of

risk and exposure) but it cannot at the same time handle the sum restrictions

for the missing disaggregated data. For this reason we need to adopt a multivariate and partially nonlinear time series model. Figure 6.1 confirms that the number of driven kilometres is trending linearly throughout. Another

nonlinear aspect of the model is introduced by the assumption of an exponen-

tial decay over time of risk factors. Figure 6.3 displays the number of fatal

accidents per billion vehicle kilometres (in levels and logs); this is a rough in-

dication of risk. It shows that risk may decay exponentially over time and it

may therefore be reasonable to assume a smooth linear trend function for log-

risk. This exponential trend specification introduces a further nonlinear aspect

in the model. Possible breaks in trends can be accounted for by including in-

tervention regression effects in the model.
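The mixture of additive and multiplicative relations described in this section can be illustrated with a small sketch of the expected values of the five observed series as a function of the latent factors. This is not the authors' implementation: the function names are hypothetical, exposure is taken in levels and risk in logs as described above, and the proportionality constants in (ii) and (iii) are set to one for simplicity. The Jacobian is the kind of linearisation an extended Kalman filter would use, and the last helper shows one simple way to skip missing entries.

```python
import math


def observation_mean(mu1, mu2, delta1, delta2):
    """Expected (y1, y2, y3, x1, x2) given exposure mu_i and log-risk delta_i."""
    return [
        mu1,                      # traffic volume inside urban areas
        mu2,                      # traffic volume outside urban areas
        mu1 + mu2,                # total traffic volume (sum restriction)
        mu1 * math.exp(delta1),   # fatal accidents inside: exposure times risk
        mu2 * math.exp(delta2),   # fatal accidents outside: exposure times risk
    ]


def observation_jacobian(mu1, mu2, delta1, delta2):
    """Partial derivatives of the means w.r.t. (mu1, mu2, delta1, delta2)."""
    r1, r2 = math.exp(delta1), math.exp(delta2)
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [1.0, 1.0, 0.0, 0.0],
        [r1, 0.0, mu1 * r1, 0.0],
        [0.0, r2, 0.0, mu2 * r2],
    ]


def keep_observed(observed):
    """Indices of the series that are actually observed (None means missing)."""
    return [i for i, value in enumerate(observed) if value is not None]


# Hypothetical state: 30 and 70 billion km, 20 and 10 fatal accidents per billion km.
state = (30.0, 70.0, math.log(20.0), math.log(10.0))
print(observation_mean(*state))
# A year with only the total traffic volume and the two accident counts observed.
print(keep_observed([None, None, 100.0, 600.0, 700.0]))
```

In a year where the disaggregated volumes are missing, only the rows selected by `keep_observed` would enter the filtering step, so the sum restriction alone carries the information on exposure.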

6.3.2. Unobserved stochastic local linear trend factors

The deterministic trend specifications for exposure and in particular log-risk

are too rigid in practice because trends will not be constant over time in a

long period of forty years. A time-varying trend is more flexible. A possible

stochastic specification for a time-varying trend µt is the local linear trend with

increment or slope term βt and given by

$$\mu_{t+1} = \mu_t + \beta_t + \eta_t, \qquad \beta_{t+1} = \beta_t + \zeta_t, \qquad t = 1, \ldots, n, \tag{6.1}$$

where the disturbances $\eta_t$ and $\zeta_t$ are normally distributed with mean zero and variances $\sigma^2_\eta$ and $\sigma^2_\zeta$, respectively. The disturbances $\eta_t$ and $\zeta_s$ are mutually and serially independent of each other at all time points $t, s = 1, \ldots, n$. The initial values of $\mu_1$ and $\beta_1$ can be regarded as realisations from a diffuse distribution or as fixed unknown coefficients, see the discussion in Durbin and Koopman (2001). The special case of $\sigma^2_\eta = \sigma^2_\zeta = 0$ reduces (6.1) to $\beta_{t+1} = \beta_t = \beta_1$ and $\mu_{t+1} = \mu_1 + \beta_1 + \ldots + \beta_t = \mu_1 + \beta_1 \cdot t$ for $t = 1, \ldots, n$. This is the deterministic linear trend. In a similar way we can show that for the case of $\sigma^2_\eta > 0$ and $\sigma^2_\zeta = 0$, we obtain the random walk plus fixed drift $\Delta\mu_{t+1} = \beta_1 + \eta_t$ for $t = 1, \ldots, n$. Further, it can be established that a smooth trend specification can be obtained by $\sigma^2_\eta = 0$ and $\sigma^2_\zeta > 0$, see also Harvey (1989) or Commandeur and Koopman (2007).
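The special cases of (6.1) can be illustrated with a small simulation. The Python sketch below is not part of the original analysis: the function name `simulate_llt`, the starting values and the variances are illustrative assumptions. It generates a deterministic linear trend, a random walk with fixed drift and a smooth trend from the same recursion.

```python
import numpy as np

def simulate_llt(n, mu1, beta1, var_eta, var_zeta, seed=0):
    """Simulate mu_1, ..., mu_n and beta_1, ..., beta_n from the recursion (6.1)."""
    rng = np.random.default_rng(seed)
    mu = np.empty(n)
    beta = np.empty(n)
    mu[0], beta[0] = mu1, beta1
    for t in range(n - 1):
        eta = rng.normal(0.0, np.sqrt(var_eta))    # level disturbance
        zeta = rng.normal(0.0, np.sqrt(var_zeta))  # slope disturbance
        mu[t + 1] = mu[t] + beta[t] + eta
        beta[t + 1] = beta[t] + zeta
    return mu, beta

# Illustrative values only (assumed, not taken from the study).
deterministic, _ = simulate_llt(40, 10.0, 0.5, var_eta=0.0, var_zeta=0.0)
rw_drift, rw_slope = simulate_llt(40, 10.0, 0.5, var_eta=1.0, var_zeta=0.0)
smooth, smooth_slope = simulate_llt(40, 10.0, 0.5, var_eta=0.0, var_zeta=0.1)
```

With both variances set to zero the simulated level is a straight line, and with only the slope variance set to zero the slope stays fixed at its initial value, in line with the special cases described above.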

The unobserved exposure factors for inside and outside urban areas are indi-

cated by µ1t and µ2t, respectively. The log-risk factors for inside and outside

urban areas are indicated by δ1t and δ2t, respectively. Given the discussion

in the previous section and to gain flexibility in modelling, we consider local

linear trend specifications for the unobserved factors, that is

µit ∼ LLT, δit ∼ LLT, i = 1, 2, t = 1, . . . , n,

where LLT refers to the trend specification (6.1). The disturbance sequences

driving the unobserved factors are mutually independent of each other except

those within the pairs of (µ1t, µ2t) and (δ1t, δ2t). In other words, correlation

between log-risk factors inside and outside urban areas can be estimated. This

also applies to exposure factors.

6.3.3. Observation equation

The dynamic mutual dependencies of the five observed time series are speci-

fied solely through the four unobserved and independent factors. This leads to

a relatively simple model specification for the observed time series. Given the

discussion of the model in Section 6.3.1, the model equations for the observed


traffic volume for inside and outside urban areas are given by

$$y_{it} = \mu_{it} + \varepsilon_{it}, \qquad \varepsilon_{it} \sim \mathrm{WN}, \qquad i = 1, 2, \quad t = 1, \ldots, n, \tag{6.2}$$

where WN refers to a Gaussian white noise sequence. The disturbances $\varepsilon_{it}$ have mean zero and variance $\sigma^2_{\varepsilon,i}$ for $i = 1, 2$. Further, they are mutually independent of all other disturbances in the model. The variances $\sigma^2_{\varepsilon,1}$ and $\sigma^2_{\varepsilon,2}$ need to be estimated. The total traffic volume is modelled by the relation

$$y_{3t} = y_{1t} + y_{2t} = \mu_{1t} + \mu_{2t} + \varepsilon_{1t} + \varepsilon_{2t}. \tag{6.3}$$

These observation equations are linear and can be regarded as a special trivariate common trends model with two independent stochastic trends. It should be noted that when no observations are available for $y_{1t}$ and $y_{2t}$, the disturbances $\varepsilon_{1t}$ and $\varepsilon_{2t}$ cannot be identified separately. The sum $\varepsilon_{1t} + \varepsilon_{2t}$ can be identified when only $y_{3t}$ is observed. Therefore, in this case we take $\varepsilon_{1t} + \varepsilon_{2t}$ as a Gaussian white noise sequence with mean zero and variance $\sigma^2_{\varepsilon,1} + \sigma^2_{\varepsilon,2}$.

The statistical model specification for the number of fatal accidents inside and outside urban areas is given by

$$g(\mu_{it}, \delta_{it}) = \mu_{it} \cdot \exp(\delta_{it}), \tag{6.4}$$

which implies an exponential trend for the log-risk factor that is proportional to exposure. The observed number of fatal accidents is $x_{it} = g(\mu_{it}, \delta_{it}) + \xi_{it}$, compare (6.7), where the Gaussian disturbances $\xi_{it}$ have mean zero and a non-negative variance, for $i = 1, 2$. They are also mutually independent of all other disturbances. The counts of fatal accidents are approximated by a normal distribution, as the counts vary within the wide interval between 250 and 1750. We account for possible overdispersion by setting the variance of the disturbances $\xi_{it}$ equal to $x_{it}(1 + \sigma^2_{\xi,i})$, where we regard the fatal accidents $x_{it}$ as a proxy for the mean of $x_{it}$.

6.3.4. Nonlinear state space model formulation

The general state space model with a (possibly) nonlinear observation equation

is given by

$$\begin{pmatrix} y_t \\ x_t \end{pmatrix} = Z(\alpha_t) + G u_t, \qquad \alpha_{t+1} = T \alpha_t + H u_t, \qquad u_t \sim \mathrm{WN}, \qquad t = 1, \ldots, n, \tag{6.5}$$

where αt is the state vector and ut is the disturbance vector with mean zero

and variance matrix V. The system matrices G, T and H together with the

function Z(·) are fixed and known although they may partially depend on

a parameter vector of fixed unknown coefficients. The initial state vector is

taken as a realisation from a diffuse density but can also be regarded as fixed

unknown coefficients, see Durbin and Koopman (2001, Chapter 6).

The local linear trend models for exposure and log-risk, inside and outside ur-

ban areas, can be simultaneously put in state space form by placing the trends

µit and δit with their associated slope terms, for i = 1, 2, in the state vector αt.

The disturbance terms are put in ut. We obtain,

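$$\alpha_t = (\mu_{1t}, \beta^{\mu}_{1t}, \mu_{2t}, \beta^{\mu}_{2t}, \delta_{1t}, \beta^{\delta}_{1t}, \delta_{2t}, \beta^{\delta}_{2t})',$$

$$u_t = (\eta^{\mu}_{1t}, \zeta^{\mu}_{1t}, \varepsilon_{1t}, \eta^{\mu}_{2t}, \zeta^{\mu}_{2t}, \varepsilon_{2t}, \eta^{\delta}_{1t}, \zeta^{\delta}_{1t}, \xi_{1t}, \eta^{\delta}_{2t}, \zeta^{\delta}_{2t}, \xi_{2t})',$$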

where xµ indicates the association of variable x with exposure µ and xδ with

log-risk factor δ for x = β, η, ζ. The stochastic processes for the four trend

functions are implied by the system matrices given by

$$T = I_4 \otimes \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \qquad H = I_4 \otimes \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix},$$

where I4 is the 4 × 4 identity matrix and ⊗ is the Kronecker matrix product.
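The Kronecker structure of the system matrices can be checked numerically. The following Python sketch uses NumPy's `kron`; the state and disturbance values are made-up numbers chosen only to show which elements are picked up by $T$ and $H$.

```python
import numpy as np

# 2 x 2 transition block of one local linear trend: (level, slope).
T_block = np.array([[1.0, 1.0],
                    [0.0, 1.0]])
# 2 x 3 selection block: level and slope disturbances enter the state,
# the third disturbance (observation noise) does not.
H_block = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0]])

T = np.kron(np.eye(4), T_block)   # 8 x 8
H = np.kron(np.eye(4), H_block)   # 8 x 12

# Illustrative (assumed) state and disturbance vectors.
alpha = np.array([30.0, 1.0, 90.0, 2.0, 2.4, -0.03, 1.9, -0.04])
u = np.arange(12, dtype=float)    # u[2], u[5], u[8], u[11] are observation noises
alpha_next = T @ alpha + H @ u    # state equation in (6.5)
```

Each trend occupies two adjacent positions in the state vector and three adjacent positions in the disturbance vector, which is why a single 2 x 2 and a single 2 x 3 block suffice.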

The multivariate observation equation for traffic volume is linear. In terms of

the observation vector yt = (y1t, y2t, y3t)′ and the state vector αt it follows from

(6.2) and (6.3) that

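$$y_t = \left( \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \end{bmatrix} \otimes \begin{pmatrix} 1 & 0 \end{pmatrix} \right) \alpha_t + \left( \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \end{bmatrix} \otimes \begin{pmatrix} 0 & 0 & 1 \end{pmatrix} \right) u_t. \tag{6.6}$$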

This observation equation is rank deficient due to its inclusion of identity (6.3).

However, during the estimation process, the equation for y3t is only considered

when y1t and y2t are missing and y3t is not considered when y1t and y2t are not

missing. The estimation method does this implicitly through its treatment of

missing values.

The observation equation for the number of fatal accidents, inside and outside

urban areas, can also be formulated in terms of the state vector αt but it requires

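a nonlinear specification. Define $x_t = (x_{1t}, x_{2t})'$ and consider (6.5) where $Z(\alpha_t)$ and $G$ are chosen such that

$$x_t = \begin{pmatrix} \mu_{1t} \cdot \exp \delta_{1t} \\ \mu_{2t} \cdot \exp \delta_{2t} \end{pmatrix} + \left( \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \otimes \begin{pmatrix} 0 & 0 & 1 \end{pmatrix} \right) u_t, \tag{6.7}$$

with

$$\begin{pmatrix} \mu_{1t} \\ \mu_{2t} \end{pmatrix} = \left( \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} \otimes \begin{pmatrix} 1 & 0 \end{pmatrix} \right) \alpha_t, \qquad \begin{pmatrix} \delta_{1t} \\ \delta_{2t} \end{pmatrix} = \left( \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \otimes \begin{pmatrix} 1 & 0 \end{pmatrix} \right) \alpha_t.$$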

This completes the state space formulation of the multivariate nonlinear model

that is the basis of the empirical study discussed in Section 6.5.
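The complete observation structure of (6.6) and (6.7) can be written down in a few lines. The Python sketch below is an illustration under assumptions: the names `Z_y`, `S_mu`, `S_delta` and `observation_mean` are hypothetical, the levels in `alpha` are the final trend estimates quoted in the caption of Figure 6.5, and the slopes are simply set to zero.

```python
import numpy as np

A_y = [[1, 0, 0, 0], [0, 1, 0, 0], [1, 1, 0, 0]]
A_mu = [[1, 0, 0, 0], [0, 1, 0, 0]]
A_delta = [[0, 0, 1, 0], [0, 0, 0, 1]]

Z_y = np.kron(A_y, [[1.0, 0.0]])            # 3 x 8, linear part of (6.6)
G_y = np.kron(A_y, [[0.0, 0.0, 1.0]])       # 3 x 12, selects eps_1t and eps_2t
S_mu = np.kron(A_mu, [[1.0, 0.0]])          # 2 x 8, selects mu_1t and mu_2t
S_delta = np.kron(A_delta, [[1.0, 0.0]])    # 2 x 8, selects delta_1t and delta_2t
G_x = np.kron(A_delta, [[0.0, 0.0, 1.0]])   # 2 x 12, selects xi_1t and xi_2t

def observation_mean(alpha):
    """Return Z(alpha): three traffic volumes followed by two accident counts."""
    y_mean = Z_y @ alpha
    x_mean = (S_mu @ alpha) * np.exp(S_delta @ alpha)   # mu * exp(delta), as in (6.4)
    return np.concatenate([y_mean, x_mean])

# Levels from the caption of Figure 6.5; slopes set to zero (assumption).
alpha = np.array([31.82, 0.0, 94.24, 0.0, 2.39, 0.0, 1.93, 0.0])
signal = observation_mean(alpha)
```

Only the last two elements of the signal are nonlinear in the state, which is what confines the linearisation in the estimation step to the accident equations.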

6.4. Estimation of parameters and latent factors

The variances of the disturbances in vector ut of the state space model in the

previous section are treated as unknown parameters. They will be estimated

by the method of maximum likelihood. For a fully linear model, the Gaussian

log-likelihood function is evaluated by the Kalman filter and numerically max-

imised with respect to the unknown parameters, see Harvey (1989) and Durbin

and Koopman (2001). Consider a state space model with a linear Gaussian ob-

servation equation

$$\begin{pmatrix} y_t \\ x_t \end{pmatrix} = c_t + Z_t \cdot \alpha_t + G u_t, \tag{6.8}$$

where ct is a known vector and Zt is a known matrix. Both can be time-varying

and may depend on past observations. Note that the linear Gaussian

model (6.8) is equivalent to (6.5) with Z(αt) = ct + Zt · αt. Further, we assume

that the disturbances ut are normally distributed.

The Kalman filter recursively evaluates the estimator of the state vector con-

ditional on past observations Yt−1 = {y1, x1, . . . , yt−1, xt−1}. The conditional

estimator of the state vector is denoted by at|t−1 = E(αt|Yt−1) and its condi-

tional variance matrix Pt|t−1 = var(αt|Yt−1). The Kalman filter is given by the

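set of vector and matrix equations

$$\begin{aligned}
v_t &= \begin{pmatrix} y_t \\ x_t \end{pmatrix} - c_t - Z_t a_{t|t-1}, & F_t &= Z_t P_{t|t-1} Z_t' + G G', \\
K_t &= (T P_{t|t-1} Z_t' + H G') F_t^{-1}, \\
a_{t+1|t} &= T a_{t|t-1} + K_t v_t, & P_{t+1|t} &= T P_{t|t-1} T' - K_t F_t K_t' + H H',
\end{aligned} \tag{6.9}$$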

for t = 1, . . . , n and where a1|0 and P1|0 are the unconditional mean and var-

iance of the initial state vector, respectively. When an initial state element is

taken as a realisation from a diffuse density, we can take its mean as zero and

its variance as a very large value. Exact treatments of diffuse initialisations

are discussed in Durbin and Koopman (2001). The vector vt is the one-step

ahead prediction error with variance matrix Ft. The optimal weighting for fil-

tering is determined by the Kalman gain matrix Kt. The joint density of the

observations can be expressed as a product of predictive densities via the pre-

diction error decomposition. As a result, the log-likelihood function can be

constructed via the Kalman filter and is given by

$$\ell = -\frac{n}{2} \log 2\pi - \frac{1}{2} \sum_{t=1}^{n} \log |F_t| - \frac{1}{2} \sum_{t=1}^{n} v_t' F_t^{-1} v_t. \tag{6.10}$$

With diffuse state elements, the log-likelihood function requires some modifi-

cations. For a linear Gaussian state space model, the log-likelihood function ℓ

is exact.
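A compact implementation may help to see how the filter recursions and the prediction error decomposition fit together. The Python sketch below is a simplification under stated assumptions: $c_t = 0$, a time-invariant $Z$, disturbances scaled to unit variance so that standard deviations sit in $G$ and $H$, and a known (non-diffuse) initial state. The log-likelihood constant counts the $p$ observed values per period, which coincides with the constant in (6.10) for $p = 1$. The covariance update uses the form $K_t F_t K_t'$, which follows from the definition of $K_t$. The local level example is hypothetical.

```python
import numpy as np

def kalman_filter(obs, Z, G, T, H, a1, P1):
    """Return one-step-ahead state predictions and the Gaussian log-likelihood."""
    n, p = obs.shape
    a, P = np.asarray(a1, float), np.asarray(P1, float)
    predictions, loglik = [], 0.0
    for t in range(n):
        v = obs[t] - Z @ a                               # prediction error v_t
        F = Z @ P @ Z.T + G @ G.T                        # prediction error variance F_t
        K = (T @ P @ Z.T + H @ G.T) @ np.linalg.inv(F)   # Kalman gain K_t
        _, logdet = np.linalg.slogdet(F)
        loglik -= 0.5 * (p * np.log(2 * np.pi) + logdet + v @ np.linalg.solve(F, v))
        predictions.append(a)
        a = T @ a + K @ v                                # a_{t+1|t}
        P = T @ P @ T.T - K @ F @ K.T + H @ H.T          # P_{t+1|t}
    return np.array(predictions), loglik

# Local level example (assumed): u_t = (eps_t, eta_t)' with unit variances.
Z = np.array([[1.0]])
T = np.array([[1.0]])
G = np.array([[1.0, 0.0]])
H = np.array([[0.0, 1.0]])
obs = np.array([[1.0], [2.0]])
preds, loglik = kalman_filter(obs, Z, G, T, H, a1=[0.0], P1=[[1.0]])
```

In practice the log-likelihood returned by such a routine would be passed to a numerical optimiser over the unknown variances.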

When a value for a particular element of vector $(y_t', x_t')'$ is not available, it is treated as a missing value. The Kalman filter can handle missing values in a straightforward way. Effectively, its measurement is removed. An alternative approach is to let the variance of the missing measurement tend to infinity. A direct consequence of a missing entry is that the associated element of the innovation vector $v_t$ cannot be computed and is unknown. When all entries in $(y_t', x_t')'$ are missing, we can treat $v_t$ as unknown by taking $v_t = 0$ and its variance matrix $F_t \to \infty I$, such that $F_t^{-1} \to 0$ and $K_t \to 0$. It follows that the state update equations become

$$a_{t+1|t} = T a_{t|t-1}, \qquad P_{t+1|t} = T P_{t|t-1} T' + H H'.$$

These computations are repeated when a number of (consecutive) observations are missing. This solution also serves as the basis for out-of-sample forecasting (future values are missing) or back-casting (past values are missing). A missing value does not enter the log-likelihood expression of (6.10). When some elements of $(y_t', x_t')'$ are missing, the corresponding elements of $v_t$ are taken as zero and the associated rows and columns of $F_t^{-1}$ and $K_t$ are taken as zero vectors.
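One way to implement this treatment is to delete the rows of the observation equation that correspond to missing entries, which has the same effect as setting the matching elements of $v_t$, $F_t^{-1}$ and $K_t$ to zero. The Python sketch below assumes that missing values are coded as NaN, that $c_t = 0$ and that the disturbances have unit variance; `filter_step` is a hypothetical name and the numbers are illustrative.

```python
import numpy as np

def filter_step(y, a, P, Z, G, T, H):
    """One prediction step of the Kalman filter; NaN entries in y are missing."""
    seen = ~np.isnan(y)
    if not seen.any():
        # Nothing observed: pure prediction, no update.
        return T @ a, T @ P @ T.T + H @ H.T
    Zs, Gs = Z[seen], G[seen]                 # keep rows of observed entries only
    v = y[seen] - Zs @ a
    F = Zs @ P @ Zs.T + Gs @ Gs.T
    K = (T @ P @ Zs.T + H @ Gs.T) @ np.linalg.inv(F)
    return T @ a + K @ v, T @ P @ T.T - K @ F @ K.T + H @ H.T

# Two noisy measurements of one random-walk state (assumed example).
Z = np.array([[1.0], [1.0]])
G = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
H = np.array([[0.0, 0.0, 1.0]])
T = np.array([[1.0]])
a0, P0 = np.array([0.0]), np.array([[1.0]])

a_part, P_part = filter_step(np.array([1.0, np.nan]), a0, P0, Z, G, T, H)
a_none, P_none = filter_step(np.array([np.nan, np.nan]), a0, P0, Z, G, T, H)
```

The same routine produces forecasts when it is applied to periods beyond the sample with all entries set to NaN.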

The nonlinearities in the multivariate model of Section 6.3 are treated by the

extended Kalman filter that is based on a first-order Taylor expansion of the

nonlinear relation. Since the nonlinearity is limited to the observation vec-

tor, we only require the linearisation of µit exp δit around some known values

(µ∗it, δ∗it), that is

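$$\begin{aligned}
\mu_{it} \exp \delta_{it} &\approx \mu^*_{it} \exp \delta^*_{it} + \exp \delta^*_{it} (\mu_{it} - \mu^*_{it}) + \mu^*_{it} \exp \delta^*_{it} (\delta_{it} - \delta^*_{it}) \\
&= \exp \delta^*_{it} \times (-\mu^*_{it} \delta^*_{it} + \mu_{it} + \mu^*_{it} \delta_{it}),
\end{aligned}$$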

for i = 1, 2 and t = 1, . . . , n. The linearisation is more accurate when the value

of (µ∗it, δ∗it) is close to (µit, δit).
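The quality of the first-order expansion is easy to inspect numerically. In the Python sketch below the expansion point and the evaluation point are arbitrary illustrative values, and the function names are hypothetical.

```python
import numpy as np

def g(mu, delta):
    """Nonlinear relation mu * exp(delta)."""
    return mu * np.exp(delta)

def g_linear(mu, delta, mu_star, delta_star):
    """First-order Taylor expansion of g around (mu_star, delta_star)."""
    return np.exp(delta_star) * (-mu_star * delta_star + mu + mu_star * delta)

mu_star, delta_star = 30.0, 2.4             # expansion point (assumed values)
exact = g(30.5, 2.41)
approx = g_linear(30.5, 2.41, mu_star, delta_star)
rel_error = abs(exact - approx) / exact
```

At the expansion point itself the approximation is exact, and the error grows with the distance from it.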

Within the Kalman filter, the nonlinear function Z(·) is linearised in this way

with an expansion at the location of the predicted state vector at|t−1. It implies

that µ∗it and δ∗it are taken from the appropriate elements in at|t−1, the conditional

estimator of αt given Yt−1. The necessary amendments of the Kalman filter lead

to vector function Z(·) becoming time-varying with vector ct and matrix Zt. In

particular, we have

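$$c_t = \begin{pmatrix} 0 \\ 0 \\ 0 \\ -\mu^*_{1t} \delta^*_{1t} \exp \delta^*_{1t} \\ -\mu^*_{2t} \delta^*_{2t} \exp \delta^*_{2t} \end{pmatrix}, \qquad Z_t = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ \exp \delta^*_{1t} & 0 & \mu^*_{1t} \exp \delta^*_{1t} & 0 \\ 0 & \exp \delta^*_{2t} & 0 & \mu^*_{2t} \exp \delta^*_{2t} \end{bmatrix} \otimes \begin{pmatrix} 1 & 0 \end{pmatrix},$$

and with

$$\begin{pmatrix} \mu^*_{1t} \\ \mu^*_{2t} \end{pmatrix} = \left( \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} \otimes \begin{pmatrix} 1 & 0 \end{pmatrix} \right) a_{t|t-1}, \qquad \begin{pmatrix} \delta^*_{1t} \\ \delta^*_{2t} \end{pmatrix} = \left( \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \otimes \begin{pmatrix} 1 & 0 \end{pmatrix} \right) a_{t|t-1}.$$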

The extended Kalman filter approximates the nonlinear features of the model.

The prediction error is therefore not evaluated exactly and the log-likelihood

function (6.10) is an approximation.
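The time-varying quantities $c_t$ and $Z_t$ can be built directly from the predicted state. The Python sketch below assumes the ordering of the state vector given in Section 6.3.4; the function name `linearise_observation` and the numerical state are illustrative assumptions.

```python
import numpy as np

def linearise_observation(a_pred):
    """Return c_t and Z_t of the linearised observation equation at a_pred."""
    mu1, mu2 = a_pred[0], a_pred[2]          # predicted exposure levels
    d1, d2 = a_pred[4], a_pred[6]            # predicted log-risk levels
    e1, e2 = np.exp(d1), np.exp(d2)
    c = np.array([0.0, 0.0, 0.0, -mu1 * d1 * e1, -mu2 * d2 * e2])
    A = np.array([[1.0, 0.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0, 0.0],
                  [1.0, 1.0, 0.0, 0.0],
                  [e1, 0.0, mu1 * e1, 0.0],
                  [0.0, e2, 0.0, mu2 * e2]])
    Z = np.kron(A, [[1.0, 0.0]])             # 5 x 8; slope columns are zero
    return c, Z

# Illustrative predicted state (assumed values).
a_pred = np.array([30.0, 1.0, 90.0, 2.0, 2.4, -0.03, 1.9, -0.04])
c_t, Z_t = linearise_observation(a_pred)
signal_at_prediction = c_t + Z_t @ a_pred
```

Evaluated at the predicted state, the linearised signal reproduces the nonlinear signal exactly, so the prediction error of the extended filter is computed from $\mu^*_{it} \exp \delta^*_{it}$ itself.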

The smoothed estimate of a latent factor is the conditional mean given all avail-

able observations in the sample. The smoothed estimate of the state vector is

denoted by $\hat{\alpha}_t = E(\alpha_t|Y_n)$ with its variance matrix $V_t = \mathrm{var}(\alpha_t|Y_n)$. Once the Kalman filter is carried out, the smoothed estimates can be computed via the backward recursions

$$\begin{aligned}
r_{t-1} &= Z_t' F_t^{-1} v_t + L_t' r_t, & N_{t-1} &= Z_t' F_t^{-1} Z_t + L_t' N_t L_t, \\
\hat{\alpha}_t &= a_{t|t-1} + P_{t|t-1} r_{t-1}, & V_t &= P_{t|t-1} - P_{t|t-1} N_{t-1} P_{t|t-1},
\end{aligned} \tag{6.11}$$

where Lt = T −KtZt and with initialisations rn = 0 and Nn = 0. The algorithm

is a variation of the fixed interval smoothing method of Anderson and Moore

(1979) and is developed by de Jong (1989) and Kohn and Ansley (1989). The

smoothing recursions apply to the linear Gaussian state space model. How-

ever, since we have explicitly used a time-varying Zt, the computations can

also be carried out in conjunction with the extended Kalman filter. We note

that smoothing requires the storage of all Kalman filter quantities, including

the time-varying values of ct and Zt, for t = 1, . . . , n.
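The forward and backward passes can be combined in one routine. The Python sketch below is a simplified illustration, not the implementation used in the study: it assumes $c_t = 0$, a time-invariant $Z$, unit-variance disturbances and a known initial state, and it runs the backward recursions with $r_n = 0$ and $N_n = 0$. The local level example is hypothetical and small enough to verify by hand.

```python
import numpy as np

def filter_and_smooth(obs, Z, G, T, H, a1, P1):
    """Kalman filter followed by backward smoothing recursions."""
    n = len(obs)
    a, P = np.asarray(a1, float), np.asarray(P1, float)
    stored = []
    for t in range(n):                       # forward pass, storing filter output
        v = obs[t] - Z @ a
        F = Z @ P @ Z.T + G @ G.T
        K = (T @ P @ Z.T + H @ G.T) @ np.linalg.inv(F)
        stored.append((a, P, v, F, K))
        a = T @ a + K @ v
        P = T @ P @ T.T - K @ F @ K.T + H @ H.T
    m = T.shape[0]
    r, N = np.zeros(m), np.zeros((m, m))     # r_n = 0, N_n = 0
    smoothed, variances = [None] * n, [None] * n
    for t in reversed(range(n)):             # backward pass
        a, P, v, F, K = stored[t]
        F_inv = np.linalg.inv(F)
        L = T - K @ Z
        r = Z.T @ F_inv @ v + L.T @ r        # r_{t-1}
        N = Z.T @ F_inv @ Z + L.T @ N @ L    # N_{t-1}
        smoothed[t] = a + P @ r
        variances[t] = P - P @ N @ P
    return np.array(smoothed), np.array(variances)

# Local level example (assumed): unit variances, two observations.
Z = np.array([[1.0]])
T = np.array([[1.0]])
G = np.array([[1.0, 0.0]])
H = np.array([[0.0, 1.0]])
obs = np.array([[1.0], [2.0]])
alpha_hat, V = filter_and_smooth(obs, Z, G, T, H, a1=[0.0], P1=[[1.0]])
```

For the extended Kalman filter the same backward pass would use the stored time-varying $Z_t$ in place of the fixed `Z`.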

6.5. Empirical results: estimation and model selection

We consider the five time series described in Section 6.2 for the years 1961–

2000. The disaggregated time series of traffic volume is only observed for the

sample 1984–1996 and therefore we are dealing with many missing values in

the data set. The traffic volume series yit are modelled by (6.2) while the number of fatal accidents series xit are modelled by (6.4). Note that i = 1 refers to inside urban areas and i = 2 refers to outside urban areas. The total traffic

volume y3t is considered when disaggregated data are not available and model

(6.3) applies. The full model consists of two sets (for outside and inside urban

areas) of two unobservable trend functions (for exposure and for log-risk). For

each trend we need to estimate two variances while for each observation equation we need to estimate an additional variance. The total number of parameters is therefore 12. Finally, in our nonlinear multivariate model of Section 6.3,

the counts of yearly accidents xit are approximated by a normal distribution.

To account for possible overdispersion we have set the disturbance variances

of $\xi_{it}$, for $i = 1, 2$, equal to $(1 + \sigma^2_{\xi,i}) x_{it}$. The implication is that matrix G in (6.5)

is effectively time-varying. Note that the prediction of the number of accidents

(see (6.4)) is a function of stochastic processes rather than a function of fixed


explanatory variables as is assumed in many classical models. Therefore the

model assumes more dispersion than ξit, for i = 1, 2.

On closer inspection of the number of fatal accidents series in the right panel

of Figure 6.1, it is clear that trend-breaks occur in the years of 1974 and 1975.

They can (partly) be attributed to the global ‘oil crisis’ in 1974 and the intro-

duction of alcohol legislation in the Netherlands (November 1974). In the next

year, legislation on wearing moped helmets (February 1975) and seat belt legis-

lation (June 1975) were introduced. We therefore have included dummy effects

for trend breaks in 1974 and 1975 in the equations for yearly accidents xit, for

i = 1, 2. These dummy effects are estimated simultaneously with the other

parameters in the model.

6.5.1. Parameter estimation results

Table 6.4 presents the estimates of the parameters in the model including the

variances and dummy effects of trend breaks. The table reports the estimates

together with 95% lower and upper limits of the confidence intervals. The con-

fidence intervals are based on the approximation discussed in Harvey (1989,

§3.4.5 and §3.4.6). Since variance parameters are restricted to be non-negative,

the logged variances are estimated and related confidence intervals are therefore asymmetric. We do not report parameters that have been estimated very closely to zero. These are (i) the variances $\sigma^2_{\xi,i}$ for both $i = 1, 2$, (ii) the variances of level disturbances $\eta^{\mu}_{it}$ for both $i = 1, 2$, (iii) the variances of slope disturbances $\zeta^{\delta}_{it}$ for both $i = 1, 2$ and (iv) the dummy effect for the 1975 trend break in log-risk for outside urban areas ($i = 2$).
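The asymmetry of the intervals follows from constructing a symmetric interval for the logged variance and transforming it back. The Python sketch below illustrates this with the variance estimates and limits reported in Table 6.4. The normal quantile 1.96 and the back-calculated standard errors on the log scale are assumptions of this illustration, not values reported in the study; the reported limits appear consistent with this construction up to rounding.

```python
import math

def log_scale_interval(estimate, se_log, z=1.96):
    """Back-transform a symmetric interval for log(variance) to the variance scale."""
    centre = math.log(estimate)
    return math.exp(centre - z * se_log), math.exp(centre + z * se_log)

def implied_se_log(lci, uci, z=1.96):
    """Standard error on the log scale implied by reported limits (assumed z)."""
    return (math.log(uci) - math.log(lci)) / (2 * z)

# (estimate, lci, uci) as reported in Table 6.4.
table_6_4 = {
    "slope variance, exposure, inside": (0.0312, 0.0191, 0.0509),
    "slope variance, exposure, outside": (1.6999, 1.1974, 2.4132),
    "observation variance, inside": (0.7492, 0.5177, 1.0844),
    "observation variance, outside": (0.0794, 0.0183, 0.3451),
}

rebuilt = {
    name: log_scale_interval(est, implied_se_log(lci, uci))
    for name, (est, lci, uci) in table_6_4.items()
}
```

On the variance scale the upper limit lies further from the estimate than the lower limit, while the estimate is the geometric mean of the two limits.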

From these results we learn that overdispersion of counts in the Gaussian approximation is not significant. This is probably due to the additional variance already assumed by the model. Further, the estimated zero level variances in the exposure components µit, for i = 1, 2, lead to so-called smooth trends for µit. The

estimates of the slope variances in exposure, as reported in Table 6.4, reveal

that the variation in exposure growth is larger for outside (≈ 1.70) than for

inside urban areas (≈ 0.03). These estimates rely on the limited sample period

1984–1996. The time series plots in Figure 6.1 confirm that the traffic volume

inside urban areas is almost constant over these years while the growth of traf-

fic volume outside urban areas has increased more rapidly in the period before

1990 than the period after 1990. Also, traffic volume inside urban areas appears

to be noisier than traffic volume outside urban areas (compare the observation

variances for outside, ≈ 0.08, and inside, ≈ 0.75, in Table 6.4). The slope variances of the log-risk components are estimated to be zero. This reduces the log-risk

Figure 6.5. Left panel: Estimated trends of log-risk δit for inside (with final estimate 2.39, s.e. 0.05) and outside (with final estimate 1.93, s.e. 0.03) urban areas. Right panel: Estimated trends of exposure µit for inside (final estimate 31.82, s.e. 1.36) and outside (final estimate 94.24, s.e. 1.48) urban areas. The shaded areas indicate 95% confidence intervals.

trends to random walk processes with fixed growth terms. Both level vari-

ances are estimated to be relatively small since log-risk trends have a much

smaller scale than exposure trends. Although these estimated level variances

are small they deviate from zero significantly.

                     Inside urban areas (i = 1)       Outside urban areas (i = 2)
Parameter            Est       lci       uci          Est       lci       uci
Var(ζ^µ_it)          0.0312    0.0191    0.0509       1.6999    1.1974    2.4132
Var(η^δ_it)          0.0012    0.0007    0.0020       0.0012    0.0005    0.0025
Var(ε_it)            0.7492    0.5177    1.0844       0.0794    0.0183    0.3451
Break δ_it 1974     −0.1822   −0.2850   −0.0794      −0.1705   −0.2567   −0.0842
Break δ_it 1975     −0.1387   −0.2464   −0.0310

Table 6.4. Estimation results of parameters in equations for inside and outside urban areas. Estimates are reported, when they significantly deviate from zero, together with lower (lci) and upper (uci) limits of the 95% confidence interval (those for variances are asymmetric).

The estimation results for the dummy effects, as reported in Table 6.4, show

significant breaks in the log-risk trends for the years 1974 and 1975. Only the

break in 1975 for log-risk outside urban areas is not significant. Through anticipation, the seat belt law of 1975 may already have had its main effect in 1974.

To further disentangle the effects of these events, more detailed accident and

mobility data are required.

6.5.2. Signal extraction: trends for exposure and risk

The left hand panel of Figure 6.5 presents the estimated trends for the log-risk

of inside and outside urban areas. The apparent accelerated decrease in the

trend of the risk for inside urban areas is the result of the interventions in 1974

and 1975 whereas for outside urban areas it is the result of the effect of the in-

tervention in 1974. It appears that the risk outside urban areas decreases more


rapidly than the risk inside urban areas. Whether this is due to the increased

implementation of motorways and the separation of long distance traffic from

local traffic requires further investigations.

The right hand panel of Figure 6.5 displays the trends of exposure inside and

outside urban areas. The exposure inside urban areas increases steadily from

the 1960s onwards until it levels off at the end of the 1970s. It starts slowly

increasing again from the 1990s onwards. It may be noted that the stabili-

sation of the exposure inside urban areas in the 1970s takes place before the

period for which disaggregated traffic volume data are available. This shows

that the methodology enables the recognition of such changes before disag-

gregated data are available. In comparison with the trend of exposure inside

urban areas, the confidence margin in the trend of exposure outside urban ar-

eas is small. Moreover, the outside trend is growing more consistently over

the years although some minor temporary fluctuations of trend increases can

be observed. Such fluctuations are detected even at time points where traf-

fic volume data outside urban areas are not available. This can be explained

by the fact that the estimated exposure trends also rely on the observed time

series of number of fatal accidents. Since more fatal accidents occur outside

urban areas, it is apparently more likely that the fluctuations in the number of

accidents affect outside exposure more than inside exposure.

6.5.3. Model fit

This section concentrates on the ability of the multivariate nonlinear model

to fit the time series of motor vehicle kilometres and fatal accidents, inside

and outside urban areas. In Figures 6.6 and 6.7 the model predictions, both

based on only previous observations (one-ahead predictions) and based on

all observations (smoothed predictions) are represented as solid lines, with

approximate 95% confidence intervals represented by shaded areas, and the

observed data are represented as enlarged dots. The confidence intervals are

based on the estimated variances of the disturbances.

The estimated values for the motor vehicle kilometres in Figure 6.6 are equal

to the trends µit for exposure discussed in the previous section. The fit of the

estimated model is quite satisfactory. The estimated number of fatal accidents

in Figure 6.7 is based on the nonlinear function µit exp δit. The effectiveness

of this simple nonlinear relationship is convincing given the good fit of the

estimated number of accidents to the data. Apart from some small differences,

the estimates for inside and outside urban areas show similar patterns. It is

encouraging that the model has identified the sudden increase in the number

of fatal accidents outside urban areas in 1975–1977 – and mainly attributed

Figure 6.6. Estimated (solid line) versus observed (dots) motor vehicle kilometres for inside (lower line), outside (middle line) urban areas and total (upper line). Left panel: estimates based on past observations only (prediction). Right panel: estimates based on all observations (smoothing). Disaggregated traffic volume data are available in the period within the vertical lines. The shaded areas are 95% confidence intervals.

Figure 6.7. Estimated (solid line) versus observed (dots) number of fatal accidents for inside (lower line) and outside (upper line) urban areas. Left panel: estimates based on past observations only (prediction). Right panel: estimates based on all observations (smoothing). Solid lines represent the model estimates and dots are the observed values. Disaggregated traffic volume data are available in the period within the vertical lines. The shaded areas indicate 95% confidence intervals.

Figure 6.8. Standardised prediction residuals for (a) total traffic volume, (b) traffic volume inside urban areas, (c) traffic volume outside urban areas, (d) fatal accidents inside urban areas, (e) fatal accidents outside urban areas. Disaggregated traffic volume data are available in the period within the vertical lines.

this to an increase in traffic volume (see Figure 6.5) – whereas the number of

accidents inside urban areas almost continues to decrease in this period.

In Figure 6.8 the standardised residuals are displayed. Although these resid-

uals appear not to violate standard univariate tests, it should be noted that

this case is not very standard. The series for traffic volume inside and outside

urban areas are very short, so the power and reliability of tests on such series, which are moreover asymptotic, are limited. Secondly, the series for traffic volume are

interrelated, and as the relatively large deviances in both inside and outside ur-

ban area series are not reflected in the series for the whole of the Netherlands,

one can suspect some correlation here, which is not considered in univariate

tests. Altogether these results should not be considered decisive, rather, the

results can be considered encouraging.

6.5.4. External validation

To further validate the estimates obtained by the model, we consider the esti-

mated trend for the exposure outside urban areas displayed in the right panel

of Figure 6.5. These estimates are also presented as the solid line in Figure 6.9.

Since traffic volume data outside urban areas are only available for the years

1984 up to 1996, the fit between the observed volume data outside urban areas

and the estimated trend can only be evaluated for this 13 year period. How-

ever, as mentioned in Section 6.2, an alternative indicator for exposure outside

Figure 6.9. The fit of traffic volume outside urban areas: extrapolation and validation of the model. The fit implied by the multivariate nonlinear model is represented by a solid line. Traffic volume outside urban areas is only available in the period within vertical lines. The alternative indicator observations of volume are represented by the dots. The dashed line reflects the linear extrapolation of the traffic volume data outside urban areas.

urban areas is available which extends beyond the 13 year period. This alterna-

tive indicator is obtained by multiplying the indexed traffic intensity on main

roads in the Netherlands with the total length of roads outside urban areas.

Since this alternative indicator is measured on a different scale from the motor

vehicle kilometres driven outside urban areas, the values of the latter observations were regressed on the alternative indicator observations for the years

1984 up to 2000. The predicted values of this simple regression without inter-

cept yield properly re-scaled alternative indicator observations and are plotted

as dots in Figure 6.9. As the figure shows, the estimated trend for exposure out-

side urban areas is quite consistent with the alternative indicator values, even

in the eleven year period from 1973 through 1983 for which no motor vehicle

kilometres driven specific for outside urban areas were available.
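The re-scaling step amounts to a least squares regression through the origin on the overlapping years. The Python sketch below uses hypothetical numbers only; the function name `rescale_through_origin` and the coding of unavailable years as NaN are assumptions of this illustration.

```python
import numpy as np

def rescale_through_origin(indicator, target):
    """Regress target on indicator without intercept and re-scale the indicator."""
    both = ~np.isnan(indicator) & ~np.isnan(target)   # years with both series observed
    x, y = indicator[both], target[both]
    slope = (x @ y) / (x @ x)                         # least squares slope, no intercept
    return slope * indicator, slope

# Hypothetical values, for illustration only.
indicator = np.array([10.0, 12.0, 15.0, 18.0, 20.0])   # alternative indicator
target = np.array([np.nan, np.nan, 45.0, 54.0, 60.0])  # observed volume, partly missing
rescaled, slope = rescale_through_origin(indicator, target)
```

The re-scaled indicator is then available for all years, including those in which the target series is not observed.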

Finally, alternative back-casts and forecasts can be produced by the linear ex-

trapolation of traffic volume outside urban areas. These back- and forecasts are

shown in Figure 6.9 as a dashed line. Especially the back-casts of the nonlinear

state space model are clearly superior to a simple extrapolation of the traffic

volume data.

6.6. Implications for road safety research

The current results offer the possibility to interpret the disaggregated develop-

ments of road safety over a much longer period of time than the 13 year period

of 1984 up to and including 1996 for which all disaggregated data are available.

Moreover, it may be that the methodology used in this paper is applicable in

more cases in road safety analysis where some important exposure feature is not available in a disaggregation of interest to road safety research, for

instance when a distinction is not yet made in a survey, while already being

available in accident data. In this example a brief period of data availability

was used to extend the disaggregation of traffic volume data to other periods.

In practice however, it is more likely that such a gap in information is filled

using specialist surveys or third party data. Although such data are likely to


be only intermittently available in practice, it may be possible that they could

still be used for analysis.

With respect to interpretation of the road safety development in the Netherlands, from Figure 6.6 we learn that Dutch traffic volume

has increased since the 1960s. Disaggregating the traffic volume for inside and

outside urban areas shows that the traffic volume inside urban areas contin-

ued to increase until the end of the 1970s. It started to increase again from

the 1990s. On the other hand, the traffic volume outside urban areas kept on

growing more consistently and strongly with the largest acceleration between

about 1983 and 1992. Although the increase of traffic mobility outside urban

areas was limited in the early 1960s, it has increased more dramatically from

the end of the 1960s when comparing it to mobility inside urban areas. It can

therefore be concluded that this development was a dominant factor in the

total traffic volume long before the beginning of the new century. It would

be interesting to further disaggregate the developments, both in terms of road

type and accident type, where the impact of the separation of vulnerable road

users from motorised traffic could be of interest.

6.7. Conclusions

The model-based treatment of exponential and multiplicative relationships be-

tween number of accidents and factors such as exposure and risk has proven

to be effective. A multivariate nonlinear time series model is estimated using

a partially disaggregated data set of traffic volume and number of accidents.

The estimation methods are based on extended versions of the standard multi-

variate Kalman filter and related algorithms. We have shown that a state space

methodology in a multivariate and nonlinear setting with many missing ob-

servations is feasible and that it can lead to interesting empirical results. The

empirical study consists of the analysis of road safety in the Netherlands by

simultaneous consideration of two sections of the total traffic system: inside

and outside urban areas. It is assumed that the development of road safety

inside urban areas is different from the development of road safety outside ur-

ban areas due to differences in road infrastructure and changes in the use of

road transport inside and outside urban areas over the years.

The empirical results show that developments of exposure inside and out-

side urban areas have roughly kept up with each other up to 1980. After this

period, a decline of the growth in exposure inside urban areas occurred and

lasted until approximately 1990. Then exposure inside urban areas started to

increase again. In contrast, the exposure outside urban areas has steadily in-


creased since 1980. The model has successfully reconstructed the development

of traffic volume outside urban areas for a long time period. This is confirmed

by considering an alternative estimate of traffic volume outside urban areas,

based on the product of the index of traffic intensity and an estimate of the

total road length, both outside urban areas. The similarity between these alter-

native data-driven estimates and the model estimates is convincing.

Although the empirical results are satisfactory, the methodology of this paper

can be improved further. For example, the model may need to allow for covar-

iances between the disaggregated values. Furthermore, introducing common

components in the model may lead to statistically more significant dynamic

relations between the series. Finally, the consideration of non-Gaussian fea-

tures in the model may enhance the applicability of the current methodology

in cases where small counts are observed.


7. The likelihood filter: estimation and testing [footnote 28]

7.1. Introduction

Based on an idea in the paper of Bell and Cathey (1993) on the iterated extended

Kalman filter as a Gauss-Newton method, this paper considers an approach to

filtering and likelihood estimation for state space models where the dependent

variables can be non-Gaussian distributed, not necessarily by an exponential

family distribution. Furthermore, dependence on the state vector is not re-

stricted to location parameters.

In classical linear Gaussian state space models (see e.g., Harvey, 1989) it is

assumed that p × 1 observation vectors yt (t = 1, . . . , n) are generated by the

process:

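\[ y_t = Z_t \alpha_t + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{NID}(0, H_t), \tag{7.1} \]

\[ \alpha_t = T_t \alpha_{t-1} + R_t \eta_t, \qquad \eta_t \sim \mathrm{NID}(0, Q_t), \tag{7.2} \]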

where the error terms εt and ηt are assumed to be zero mean, independent and

identically Gaussian distributed. The unobserved state at time t is represented

by the m × 1 vector αt. Rt is a selection matrix composed of r ≤ m columns of

the identity matrix Im. The variance matrices Qt and Ht are assumed to be non-

singular. In general, the matrices Zt, Tt, Ht and Qt are assumed to be known

or otherwise to depend on an unknown parameter vector ψ.

Following Durbin and Koopman (1997), this paper considers the case where

the linear Gaussian observation model (7.1) is replaced by a general observa-

tion conditional density

\[ p(y_t \mid \alpha_1, \ldots, \alpha_t, y_1, \ldots, y_{t-1}, \psi) = p(y_t \mid \alpha_t, \psi), \tag{7.3} \]

while the linear state transition equation (7.2) and its Gaussian assumptions

are retained. The likelihood of the state space model is

\[ p(y_1, \ldots, y_n, \psi) = p(y_1, \psi) \prod_{t=2}^{n} p(y_t \mid y_1, \ldots, y_{t-1}, \psi), \tag{7.4} \]

²⁸ Co-authored with Siem Jan Koopman, Department of Econometrics, Vrije Universiteit Amsterdam, Netherlands.

which can be reformulated by conditioning on the state, explicitly expressing

the likelihood in terms of the general observation conditional density p(yt|αt, ψ):

\[ p(y_1, \ldots, y_n, \psi) = \prod_{t=1}^{n} \int p(y_t \mid \alpha_t, \psi)\, p(\alpha_t \mid y_1, \ldots, y_{t-1}, \psi)\, d\alpha_t, \tag{7.5} \]

where y1, . . . , y0 in p(α1|y1, . . . , y0, ψ) represents, possibly diffuse, prior infor-

mation. Evaluation of (7.5), more specifically the (numerical) evaluation of the

integrals

\[ \int p(y_t \mid \alpha_t, \psi)\, p(\alpha_t \mid y_1, \ldots, y_{t-1}, \psi)\, d\alpha_t, \tag{7.6} \]

can be prohibitively slow when the dimension m of the state is larger than 1, as

it requires the evaluation of multidimensional integrals, in particular, when

these integrals cannot be evaluated analytically. Durbin and Koopman (1997)

and others address the evaluation of such integrals by means of Monte Carlo

techniques. This paper however considers the use of an analytical evalua-

tion of such integrals based on Laplace approximations, as applied by e.g.

Wolfinger (1993); Vonesh (1996); Huber, Ronchetti, and Victoria-Feser (2004),

see also de Bruijn (1981, Chapter 4). It will be demonstrated in Section 7.3 that

evaluation of the Laplace approximations coincides with “the maximum likeli-

hood/least squares approach to the (nonlinear) update problem” as discussed

by Bell and Cathey (1993, p. 295). Thus, in the case of a Gaussian observation

conditional density, the approach presented in this paper is equivalent to the

iterated extended Kalman filter. This approach is obviously an approximation,

but is quite computationally efficient, as most computations needed for the

Laplace approximation can be used to determine a Gaussian approximation to

the state distribution. The performance of the approach is assessed by means of

a number of simulations of a typical road safety analysis, similar to the model

of Bijleveld et al. (2008). The approach is applied to the pound/dollar exchange

rate, where the results are compared to those of Durbin and Koopman (2001)

and of Lee and Nelder (2006), the latter being almost equal to the former.

A second application concerns a large model for the effects

of precipitation on road safety, where 19 years of daily traffic volume, road

accident data and precipitation data of 10 weather stations are used in a nonlinear

model. The observation conditional density (7.3) contains both Gaussian and

Poisson distributed observations, dependent on a state that contains several

latent risk and exposure components, as well as a latent fraction of traffic with

precipitation component. Conclusions are given in Section 7.6.


7.2. Maximum likelihood approach to filtering

7.2.1. Gaussian maximum likelihood approach to filtering

The maximum likelihood approach to filtering is described for the Gaussian

case by Bell and Cathey (1993, p. 295) as follows (using our notation):

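\[ y_t \sim N(Z a, H_t), \qquad a_{t|t-1} \sim N(a, P_{t|t-1}), \tag{7.7} \]

next

\[ A = \begin{pmatrix} y_t \\ a_{t|t-1} \end{pmatrix}, \qquad g(a) = \begin{pmatrix} Z a \\ a \end{pmatrix}, \tag{7.8} \]

hence

\[ A \sim N(g(a), B), \quad \text{where } B = \begin{pmatrix} H_t & 0 \\ 0 & P_{t|t-1} \end{pmatrix}. \tag{7.9} \]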

Bell and Cathey (1993) argue that the maximum likelihood estimate of the fil-

tered state at|t is

\[ a_{t|t} = \arg\max_{a}\, \phi_{(g(a),B)}(A), \tag{7.10} \]

where φ denotes the standard multivariate Gaussian distribution.

We now proceed by noting that the density implied by (7.9) can be rearranged

as the product of two independent densities, namely the ones implied by (7.7):

\[ \phi_{(g(a),B)}(A) = \phi_{(Z a, H_t)}(y_t) \times \phi_{(a, P_{t|t-1})}(a_{t|t-1}). \tag{7.11} \]

The next section considers an approach where the generality of (7.3) is intro-

duced in the approach by Bell and Cathey (1993).
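For a scalar state and a scalar observation, the equivalence between maximising (7.11) and the classical Kalman update can be checked numerically. The following is a minimal sketch only; the function names and the numerical values are illustrative assumptions and do not come from Bell and Cathey (1993). Because the log of (7.11) is quadratic in a, a single Newton step reaches the maximiser, and minus the inverse Hessian gives the filtered variance.

```python
def kalman_update(y, Z, H, a_pred, P_pred):
    """Classical scalar Kalman update of the predicted state."""
    F = Z * P_pred * Z + H                  # variance of the prediction error
    K = P_pred * Z / F                      # Kalman gain
    a_filt = a_pred + K * (y - Z * a_pred)
    P_filt = P_pred - K * Z * P_pred
    return a_filt, P_filt


def ml_update(y, Z, H, a_pred, P_pred):
    """Maximise log phi(y; Z a, H) + log phi(a_pred; a, P_pred) over a."""
    a = a_pred                              # starting value (illustrative choice)
    grad = Z * (y - Z * a) / H + (a_pred - a) / P_pred
    hess = -(Z * Z / H + 1.0 / P_pred)      # constant: the objective is quadratic
    a_filt = a - grad / hess                # one Newton step reaches the maximum
    P_filt = -1.0 / hess                    # minus the inverse Hessian
    return a_filt, P_filt


# Illustrative numbers, not taken from the text.
kf = kalman_update(y=3.2, Z=1.5, H=0.8, a_pred=1.0, P_pred=2.0)
ml = ml_update(y=3.2, Z=1.5, H=0.8, a_pred=1.0, P_pred=2.0)
print(kf, ml)
```

Both routes should return the same filtered mean and variance up to rounding error.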

7.2.2. General maximum likelihood approach to filtering

We can generalise (7.11) to match the terms of (7.6) and define the function

\[ L_t(\alpha_t, \psi) = p(y_t \mid \alpha_t, \psi)\, p(\alpha_t \mid y_1, \ldots, y_{t-1}, \psi). \tag{7.12} \]

By Bell and Cathey’s argument (7.10), we assume for now that the maximum

likelihood estimate for the filtered state αt|t can be obtained as the

maximiser of (7.12):

\[ \alpha_{t|t} = \arg\max_{\alpha}\, L_t(\alpha, \psi). \tag{7.13} \]

The error covariance (matrix) Pt|t of the estimate αt|t obtained from (7.13) can

be estimated by means of minus the inverse of the Hessian of log (Lt(α, ψ)) (ψ

held fixed), evaluated at α = αt|t. This approach implies a Gaussian

approximation to the filtered state distribution, and is further explored in this paper. The

approximation could in principle be extended to higher order approximations.

Using a Gaussian approximation to the filtered state distribution means that

(7.12) is replaced by:

\[ L_t(\alpha_t, \psi) = p(y_t \mid \alpha_t, \psi)\, g(\alpha_t \mid y_1, \ldots, y_{t-1}, \psi), \tag{7.14} \]

where g(•|•, ψ) denotes a Gaussian distribution.

Due to the Markov property of the state space model, (7.14) can be reformu-

lated using the standard multivariate Gaussian density:

\[ L_t(\alpha_t, \psi) = p(y_t \mid \alpha_t, \psi)\, \phi_{(\alpha_{t|t-1}, P_{t|t-1})}(\alpha_t), \tag{7.15} \]

where, due to the assumption that the state is Gaussian distributed

\[ \alpha_{t|t-1} = T_t\, \alpha_{t-1|t-1}, \tag{7.16} \]

\[ P_{t|t-1} = T_t\, P_{t-1|t-1}\, T_t' + R_t\, Q_t\, R_t'. \tag{7.17} \]

Note that in general Qt and both αt+1|t and Pt+1|t will depend on ψ. The initial values α0|0 and P0|0 may also depend on ψ.
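The recursion defined by (7.13) and (7.15)–(7.17) can be sketched for the simplest non-Gaussian case. The example below is an illustration under stated assumptions, not the implementation used in this paper: it assumes a scalar state, a Poisson observation with mean exp(α), a random walk transition (T = 1, R = 1) and arbitrary values for Q and the initial prediction. The mode is found by Newton's method, and the filtered variance is taken as minus the inverse Hessian of log Lt at the mode.

```python
import math


def log_joint_derivs(alpha, y, a_pred, P_pred):
    """Gradient and Hessian in alpha of log[Poisson(y; exp(alpha)) * N(alpha; a_pred, P_pred)]."""
    grad = y - math.exp(alpha) - (alpha - a_pred) / P_pred
    hess = -math.exp(alpha) - 1.0 / P_pred   # always negative: log L_t is concave here
    return grad, hess


def filter_step(y, a_pred, P_pred, tol=1e-12, max_iter=100):
    """Mode of (7.15) by Newton's method; variance from minus the inverse Hessian."""
    alpha = a_pred
    for _ in range(max_iter):
        grad, hess = log_joint_derivs(alpha, y, a_pred, P_pred)
        step = grad / hess
        alpha -= step
        if abs(step) < tol:
            break
    _, hess = log_joint_derivs(alpha, y, a_pred, P_pred)
    return alpha, -1.0 / hess


def predict(a_filt, P_filt, T=1.0, Q=0.05):
    """Prediction step (7.16)-(7.17) for a scalar state with R = 1 (Q is an assumed value)."""
    return T * a_filt, T * P_filt * T + Q


counts = [7, 9, 4, 6, 8]          # made-up counts, for illustration only
a_pred, P_pred = 1.5, 0.5         # assumed initial prediction
filtered = []
for y in counts:
    a_filt, P_filt = filter_step(y, a_pred, P_pred)
    filtered.append((a_filt, P_filt))
    a_pred, P_pred = predict(a_filt, P_filt)
print(filtered)
```

In a multivariate state the same steps apply with a gradient vector and a Hessian matrix in place of the scalar derivatives.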

7.3. Laplace approximation of the likelihood

The evaluation of the likelihood (7.5) requires (usually multidimensional) in-

tegration of terms (7.12) or, in case of Gaussian approximation pursued in this

paper, (7.14). In this paper a Laplace approximation (de Bruijn, 1981, Chapter

4) to such integrals is proposed.

Encouraging results have been published using Laplace’s approximation in this

manner in fields such as nonlinear mixed models (Wolfinger, 1993) and generalised

linear latent models (Huber et al., 2004), a model class that has much in common

with filtering. Following Wolfinger (1993, p. 791), and using our notation, we have:

\[ \int \exp\bigl(k\, l(\alpha, \psi)\bigr)\, d\alpha \approx (2\pi/k)^{m/2} \det\bigl(-l''(\hat{\alpha}, \psi)\bigr)^{-1/2} \exp\bigl(k\, l(\hat{\alpha}, \psi)\bigr), \tag{7.18} \]

where α̂ is the unique maximiser of l(α, ψ), k is a constant, and the approximation

improves as k → ∞. m is the dimension of the integral, which in our case is the

dimension of the state. In the current context, it is thus assumed that Lt(αt, ψ)

(from (7.15)) can be reformulated as

Lt(αt, ψ) = exp (lt(αt, ψ)), (7.19)

which implies that the set of densities is restricted to non-negative densities.

(At this point, k = 1 is assumed; further discussion of this issue is deferred to

Section C.5.)

An interesting result is that from the viewpoint of the Laplace approximation

procedure, α̂ is defined as the maximiser of Lt(α, ψ) (and lt(α, ψ)), which in the

filtering viewpoint is the maximum likelihood estimate of the filtered state ac-

cording to Bell and Cathey’s (7.10) approach. Also, the Hessian of the log like-

lihood is required in the evaluation of the Laplace approximated likelihood.
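The quality of the approximation (7.18) with k = 1 and m = 1 can be inspected on a toy case by comparing it with brute-force numerical integration. This is a sketch under assumptions that are not part of the text: a Poisson observation with mean exp(α), a Gaussian predicted state, and made-up numerical values. The trapezoidal rule serves only as a reference and is not part of the proposed method.

```python
import math


def log_joint(alpha, y, a_pred, P_pred):
    """log of p(y | alpha) * N(alpha; a_pred, P_pred), with y ~ Poisson(exp(alpha))."""
    log_obs = y * alpha - math.exp(alpha) - math.lgamma(y + 1)
    log_state = -0.5 * math.log(2 * math.pi * P_pred) - (alpha - a_pred) ** 2 / (2 * P_pred)
    return log_obs + log_state


def laplace_likelihood(y, a_pred, P_pred):
    """Laplace approximation (7.18) with k = 1 and m = 1."""
    alpha = a_pred
    for _ in range(100):                     # Newton search for the mode
        grad = y - math.exp(alpha) - (alpha - a_pred) / P_pred
        hess = -math.exp(alpha) - 1.0 / P_pred
        alpha -= grad / hess
    hess = -math.exp(alpha) - 1.0 / P_pred   # second derivative at the mode
    return math.sqrt(2 * math.pi / -hess) * math.exp(log_joint(alpha, y, a_pred, P_pred))


def quadrature_likelihood(y, a_pred, P_pred, half_width=8.0, n=20000):
    """Trapezoidal rule on a wide grid around a_pred, for comparison only."""
    lo, h = a_pred - half_width, 2 * half_width / n
    values = [math.exp(log_joint(lo + i * h, y, a_pred, P_pred)) for i in range(n + 1)]
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))


approx = laplace_likelihood(7, 1.5, 0.5)
reference = quadrature_likelihood(7, 1.5, 0.5)
print(approx, reference, abs(approx - reference) / reference)
```

The same mode and Hessian that enter the likelihood contribution here are the quantities that define the Gaussian approximation to the filtered state, which is why the filter and the likelihood evaluation share most of their computations.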

7.4. Simulation studies

Simulation studies are performed to assess the performance of the method in

a typical road safety situation, which is the small count alternative to the sim-

plest Latent Risk Model described in Bijleveld et al. (2008). In the Bijleveld

et al. (2008) model it is assumed that we have observed a series of accidents (in

general counts) that is governed by two possibly dependent latent processes.

One process performs the role of risk and the other the role of exposure to

that risk while the product of the two processes is the expected number of ac-

cidents (counts). It is further assumed that the development of the exposure

to the risk process can be indirectly observed through a real world ‘volume’

phenomenon (e.g. the number of vehicle kilometres, the number of vehicles,

population size): (t = 1, . . . , 20)

\[ \text{volume} = \exp(\text{exposure}) + \mathrm{NID}(0, \sigma_v^2), \tag{7.20} \]

\[ \text{count} \sim \mathrm{Poisson}\bigl(\exp(\text{exposure}) \times \exp(\text{risk})\bigr). \tag{7.21} \]

In road safety analysis it is important to know whether changes in (accident)

counts can be attributed to changes in exposure or changes in risk, or

combinations of both. If the risk changes, it may be of interest to know whether the

risk process is affected by exogenous influences or structural breaks.

In order to get an indication of the performance of the likelihoodfilter approach

in reconstructing the risk and exposure processes in typical road safety ap-

plications, four combinations of log-risk and log-exposure developments are

selected, each one consisting of a fixed straight line:

1. exposure(t) = exp (2 + 0.3t), risk(t) = exp (0.5 − 0.01t),

2. exposure(t) = exp (2 + 0.5t), risk(t) = exp (0.5 − 1.5t),

3. exposure(t) = exp (2 + 0.5t), risk(t) = exp (0.5 − 3t),

4. exposure(t) = exp (2), risk(t) = exp (0.5).

For each of these combinations a set of 1000 samples is created, each of just 20

observations, where ‘volume’ observations are simulated using Gaussian random

numbers with a variance of 1, and matching accident counts are simulated

using Poisson distributed random numbers, resembling a basic road safety

analysis of 20 years of data. See Figure 7.3 for sample developments.

The first of each of the 1000 samples started with the same random seed.
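The data generation described above can be sketched as follows. This is an illustrative reconstruction, not the code used for the study. One loudly stated assumption: the time index is rescaled to tau = t/20, because the ranges visible in Figure 7.3 seem compatible with that scaling, but the text does not say so. The Poisson sampler (Knuth's multiplication method) and all names are likewise choices made for this sketch.

```python
import math
import random

# (a0, a1, b0, b1): log-exposure = a0 + a1 * tau, log-risk = b0 + b1 * tau
SCENARIOS = {
    1: (2.0, 0.3, 0.5, -0.01),
    2: (2.0, 0.5, 0.5, -1.5),
    3: (2.0, 0.5, 0.5, -3.0),
    4: (2.0, 0.0, 0.5, 0.0),
}


def poisson_draw(rng, mean):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_sample(rng, a0, a1, b0, b1, n=20, sigma_v=1.0):
    """One sample of n 'volume' and 'count' observations following (7.20)-(7.21)."""
    volumes, counts = [], []
    for t in range(1, n + 1):
        tau = t / n                          # ASSUMPTION: time rescaled to (0, 1]
        log_exposure = a0 + a1 * tau
        log_risk = b0 + b1 * tau
        volumes.append(math.exp(log_exposure) + rng.gauss(0.0, sigma_v))
        counts.append(poisson_draw(rng, math.exp(log_exposure + log_risk)))
    return volumes, counts


rng = random.Random(2008)                    # arbitrary seed, for reproducibility
volumes, counts = simulate_sample(rng, *SCENARIOS[1])
print(volumes)
print(counts)
```

Repeating the call 1000 times per scenario would give sample sets of the kind analysed below.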

Subsequently, a four dimensional state space model is fitted based on two Lo-

cal Linear Trend models, one for exposure and one for risk (Bijleveld et al.,

2008). A dynamic covariance structure as in Bijleveld et al. (2008, Section 3)

is used. Using the likelihoodfilter and a classical linear smoother, the smoothed

states and their covariances are estimated. For each time point t in the range

11, . . . , 20 the smoothed estimate of the state is compared with the simulated

state: in case 1) the levels of log-exposure and log-risk are fixed at 2 + 0.3t and

0.5 − 0.01t, while the slope of the exposure is fixed at 0.3 and the slope of the

risk is fixed at −0.01, for all t. For each time point t in the range 11, . . . , 20

the standardised difference between the simulated state and its smoothed esti-

mate is computed. The cumulative distribution of the 1000 standardised errors

is compared to the standard Gaussian cumulative distribution, and displayed

for each set of simulations in Figures 7.4–7.7.
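The comparison of standardised errors with the standard Gaussian distribution is done visually in this paper. As a complement, the sketch below computes the standardised errors and the largest vertical gap between their empirical cumulative distribution and the Gaussian one. That gap is a Kolmogorov-Smirnov type summary introduced here purely for illustration; it is not used in the text, and the function names and the tiny input are hypothetical.

```python
import math


def normal_cdf(x):
    """Cumulative distribution function of the standard Gaussian distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def standardised_errors(true_values, estimates, variances):
    """(estimate - true value) / standard deviation, element by element."""
    return [(est - true) / math.sqrt(var)
            for true, est, var in zip(true_values, estimates, variances)]


def max_cdf_gap(errors):
    """Largest vertical distance between the empirical CDF and the Gaussian CDF."""
    xs = sorted(errors)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs):
        cdf = normal_cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        gap = max(gap, abs(cdf - i / n), abs((i + 1) / n - cdf))
    return gap


# Tiny hypothetical input: three smoothed estimates of a state fixed at 2.0.
errors = standardised_errors([2.0, 2.0, 2.0], [1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
gap = max_cdf_gap(errors)
print(errors, gap)
```

With 1000 standardised errors per time point and state element, a small gap would correspond to an empirical curve that lies close to the dashed line in the figures.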

Although the series of 20 observations is short and the observation error in the

‘volume’ is large (roughly comparable to moped travel survey data), the state

estimates of both the exposure and risk components appear quite accurate to

the eye: little bias appears in the level components (columns 1 and 3), which

respectively represent the log-exposure estimate and log-risk estimate, while the

variance appears to be sufficiently correct. The slopes did not fare so well. In

particular in the second column, which represents the distribution of the slope

of the log-exposure component, the distributions appear skewed (the dashed line

lies over the empirical cumulative distribution in the negative horizontal range,

while less difference is visible in the positive range), and it appears that the

distributions for the slope of the log-risk component (fourth column) have less

variance than is expected. This could be the result of the estimate of the

variance of the component being generally too large. In road safety analysis,

inference most often has to be drawn from the log-exposure and log-risk estimates

(the first and third columns). Based on this particular simulation experiment,

the performance of the likelihoodfilter approach in reconstructing the risk and

exposure processes in a typical road safety application appears to be acceptable.

Figure 7.3. Simulated data of one sample of each of the 4 simulation types (rows, top to bottom). In each row, ‘volume’ is shown in the left-hand panel and ‘counts’ in the right-hand panel. (a) Simulation 1: exposure(t) = exp (2 + 0.3t), risk(t) = exp (0.5 − 0.01t). (b) Simulation 2: exposure(t) = exp (2 + 0.5t), risk(t) = exp (0.5 − 1.5t). (c) Simulation 3: exposure(t) = exp (2 + 0.5t), risk(t) = exp (0.5 − 3t). (d) Simulation 4: exposure(t) = exp (2), risk(t) = exp (0.5).

Figure 7.4. Cumulative plots of standardised errors over all samples of the last 10 observations for simulation set 1. The dashed line denotes the cumulative function of the standard Gaussian distribution. The results for the “level of log-exposure” (left), “slope of log-exposure”, “level of log-risk” and “slope of log-risk” (right) are arranged horizontally, while observations 11 through 20 are arranged downward.

Figure 7.5. Cumulative plots of standardised errors over all samples of the last 10 observations for simulation set 2. The dashed line denotes the cumulative function of the standard Gaussian distribution. The results for the “level of log-exposure” (left), “slope of log-exposure”, “level of log-risk” and “slope of log-risk” (right) are arranged horizontally, while observations 11 through 20 are arranged downward.

Figure 7.6. Cumulative plots of standardised errors over all samples of the last 10 observations for simulation set 3. The dashed line denotes the cumulative function of the standard Gaussian distribution. The results for the “level of log-exposure” (left), “slope of log-exposure”, “level of log-risk” and “slope of log-risk” (right) are arranged horizontally, while observations 11 through 20 are arranged downward.

Figure 7.7. Cumulative plots of standardised errors over all samples of the last 10 observations for simulation set 4. The dashed line denotes the cumulative function of the standard Gaussian distribution. The results for the “level of log-exposure” (left), “slope of log-exposure”, “level of log-risk” and “slope of log-risk” (right) are arranged horizontally, while observations 11 through 20 are arranged downward.

7.5. Applications

7.5.1. Volatility: pound/dollar daily exchange rates

This application re-estimates a zero-mean stochastic volatility model as de-

scribed in Durbin and Koopman (2001), see also Harvey, Ruiz, and Shephard

(1994). Following Durbin and Koopman (2001), denoting the daily exchange

rate by xt, the observations considered are given by yt = Δ log(xt), for t =

1, . . . , n. The following model is considered (t = 1, . . . , n, 0 < φ < 1):

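\[ \begin{aligned} y_t &= \sigma \exp\left(\tfrac{1}{2}\theta_t\right) u_t, & u_t &\sim N(0, 1), \\ \theta_{t+1} &= \phi\, \theta_t + \eta_t, & \eta_t &\sim N(0, \sigma_\eta^2). \end{aligned} \tag{7.22} \]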

In order to allow for unconstrained optimisation techniques, the parameters

σ, ση and φ are internally represented as σ = exp(ψ1), ση = exp(ψ2), and

φ = exp(ψ3)/(1 + exp(ψ3)). The method used by Durbin and Koopman

(2001) resulted in the estimates given on the left hand side of Table 7.8. The

likelihoodfilter results are given on the right hand side. Given the fact that the

results reported by Lee and Nelder (2006) are quite similar to the results re-

ported by Durbin and Koopman (2001), and that the results derived from the

likelihoodfilter differ from these (although not significantly), we conclude that

the performance of the likelihoodfilter is acceptable, although a more detailed

study is needed to assess the method.
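The internal representation of the parameters can be illustrated with a short sketch. The function name is hypothetical; the input values are the internal estimates ψ1, ψ2 and ψ3 reported for the two methods, and the transformation is the one stated in the text.

```python
import math


def to_model_parameters(psi1, psi2, psi3):
    """Map unconstrained psi to (sigma, sigma_eta, phi) with sigma, sigma_eta > 0 and 0 < phi < 1."""
    sigma = math.exp(psi1)
    sigma_eta = math.exp(psi2)
    phi = math.exp(psi3) / (1.0 + math.exp(psi3))
    return sigma, sigma_eta, phi


# Internal estimates as reported for Durbin and Koopman (2001) and the likelihoodfilter.
dk = to_model_parameters(-0.4561, -1.7569, 3.5876)
lf = to_model_parameters(-0.3491, -1.65991, 3.44713)
print(dk)
print(lf)
```

Any real-valued ψ maps to admissible parameter values, which is what permits the use of unconstrained optimisation.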


               Durbin and Koopman (2001)                                        Likelihoodfilter
estimate       internal representation        internal estimate        estimate       internal estimate
σ = 0.6338     σ = exp(ψ1)                    ψ1 = -0.4561 (0.1033)    σ = 0.7053     ψ1 = -0.3491 (0.1203)
ση = 0.1726    ση = exp(ψ2)                   ψ2 = -1.7569 (0.2170)    ση = 0.1902    ψ2 = -1.65991 (0.2253)
φ = 0.9731     φ = exp(ψ3)/(1 + exp(ψ3))      ψ3 = 3.5876 (0.5007)     φ = 0.9691     ψ3 = 3.44713 (0.4815)

Figure 7.8. Parameter estimates and standard deviations (in parentheses) for the stochastic volatility model compared. Both the real-world and the internal representation of the parameter estimates are given, with those due to Durbin and Koopman (2001) on the left-hand side and those of the likelihood filter on the right-hand side. The internal representations are used to constrain σ, ση and φ to σ > 0, ση > 0 and 0 < φ < 1.

7.5.2. The effects of precipitation on road safety

Introduction

It is commonly assumed that weather conditions influence the occurrence of

accidents in road traffic. The nature of the established and suspected influences

is diverse, including direct effects like poor visibility and deteriorated road

surface conditions impairing the capabilities of road users to avoid accidents,

and indirect effects like the decision to travel, choice of travel time and means

of transport influencing the number of road users at risk.

A common approach taken in road safety analysis is to compare the number of

accidents corrected for the amount of traffic – often called risk – under the dif-

ferent weather conditions. This approach however is not as straightforward to

implement as it may seem to be. First of all, data availability may induce lim-

itations, and often it will be difficult to determine the amount of traffic under

the different weather conditions. In previous studies, several approaches are

taken to obtain comparable accident counts, recognising differences in traffic

volume. Often days (or parts of days, as in Andrey and Yagar (1993)) with precipitation

are matched with otherwise comparable days without precipitation, separated

by a short period of time, usually a week. This approach is more likely to se-

lect observations in periods with volatile weather conditions than from stable,

summer weather periods. This may cause bias issues. For an extensive discus-

sion of the influence of precipitation on road safety refer to for instance Brod-

sky and Hakkert (1988); Eisenberg (2004); Keay and Simmonds (2005, 2006)

and the references therein.


The following application describes a model currently under development,

designed to identify a (potentially) increased ratio of the number of accidents

per traffic kilometre (risk) under wet weather conditions compared to dry

weather conditions in the Netherlands. For reasons beyond the scope of an

example in this article, the accident type is restricted to single fatal car acci-

dents. This type of accident is relatively well registered, and travel volume of

cars appears mildly influenced by immediate weather conditions, compared

to for instance bicyclists who may take shelter from a rain storm. This latter

aspect is important as we want to aggregate observations to the daily level

and compare the number of accidents with and without precipitation to an es-

timate of travel with and without precipitation, similar to the corrected ‘wet

pavement index’ as described by Brodsky and Hakkert (1988).

The following data is available:

• weather data provided by the Royal Netherlands Meteorological Institute

(KNMI, 2006) for a number of weather stations distributed over the Neth-

erlands. On the KNMI (2006) website, highly detailed data is available for

10 weather stations spanning a sufficiently long time. The observations

are on a daily (in universal time) basis (daily sums, averages, maxima

and minima) on a number of indicators. To start building a model in this

study, the duration of precipitation is used, as measured in the number

of units of one tenth of an hour, see Figure 7.9.

• detailed traffic volume data for most modes of transport (including cars)

is provided in a consistent manner since 1985 from the Dutch national

travel survey (CBS, 2003; AVV, 2005). Although starting and ending lo-

cation and time of trips as reported by interviewed Dutch inhabitants

are often known, it appears impossible to determine the parts of the trip

undertaken under the respective weather conditions. Data on travel kilo-

metres by car drivers is used, see Figure 7.9.

• detailed accident data is provided in a consistent dataset as of 1987 (older

data is available) (AVV, 2006) from which the weather conditions of the

accidents are determined through the police records rather than accident

location and time. Weather conditions in the accident record are un-

known for about 3% of the fatal accidents, which are further discarded

from this analysis. Counts of fatal single car accidents are used, see Fig-

ure 7.9.

Figure 7.9. Available data, 1987/1/1 to 2005/12/30. Top left: number of dry weather accidents; top right: number of wet weather accidents (both fatal single car accidents); bottom left: traffic volume (car driver kilometres). The survey was substantially extended in 1994, but later somewhat reduced; bottom right: precipitation duration in ‘De Kooy’ 6-minute units.

Although more detailed data are available, it is decided to restrict the analysis

to the daily level at this point of model development, both to limit the size of

the model and to avoid data compatibility issues: accident times as recorded

by the police tend to be rounded to the (probably) nearest full hour, half hour,

quarter of an hour, ten and five minute points. The extent of this issue is under

investigation in a separate study matching accident data to ambulance data

that do not exhibit this phenomenon. Published weather data, however, are

aggregated over minutes 0–59. The travel survey data exhibit this time rounding

phenomenon even more strongly. Therefore, acknowledging the

locality in space and time of for instance rainfall (Brodsky and Hakkert, 1988),

we feel the advantages of using hourly instead of daily data may be limited,

and start development using daily data.

In this study, an approach to estimating the traffic volume – and subsequently

risk – under ‘dry’ and ‘wet’ weather conditions similar to Bijleveld et al. (2008)

is developed. To that end, a latent component for general risk as well as a latent

component for the relative risk under wet weather conditions is defined. In

addition, a latent component for the total travel volume is defined. This total

travel volume is distributed over ‘dry’ and ‘wet’ weather conditions by means

of an estimate of the fraction of travel with precipitation. In order for this

assumption to be reasonable, it is necessary to be allowed to assume that travel

is not influenced too much by precipitation. This appears to be the case with

travel by cars. The topic of the impact of weather conditions on travel habits

in general (for other means of transport) is the subject of further research. This

is one of the reasons to restrict the analysis to the safety of cars.


The fraction of travel with precipitation is derived as follows. First it is as-

sumed that for a small country like the Netherlands the daily fraction of travel

with precipitation represents the average fraction of time with precipitation

well. It is assumed that no considerable structural differences in the weather

pattern exist,²⁹ although regional variation may apply. The daily fraction of

travel with precipitation is not directly observed and is therefore further con-

sidered a latent factor. The precipitation duration for the 10 weather stations

however is measured and used in this analysis. The mechanism by which the

data is collected is quite accurate, but the data is rounded. Each of these obser-

vations is considered a random draw from a Gaussian distribution with mean

equal to a given fraction of travel with precipitation, and a variance considering

the direct measurement error, the fact that universal time days are used while

other data is in middle European time (a one or two hour lag, depending on

winter or summer time) which introduces another error, and the error due to

regional differences in weather conditions. The magnitude of the combined

variance is assumed the same for the whole period. We will consider the dy-

namic aspects of this latent factor as common sense suggests that the weather

today has some predictive value for the weather tomorrow. An optimal pre-

diction is likely to depend on conditions like air-pressure, direction of wind

(wind from sea or not) and (average) temperature. Variables measuring such

influences are not included in the model. In our initial development efforts,

without such explanatory variables, a local level model is considered for an

inverse logistic transformed version of this series.

Similar to Bijleveld et al. (2008), it is not assumed that the volume of the traffic itself is what causes accidents. Rather, we assume that a latent, unobservable traffic process exists, which is on average proportional to the exposure to accidents and on average proportional to (but not identical to) the traffic volume. We further assume the measurement of the daily traffic volume to be subject to (substantial survey) error (Slootbeek, 1993). Using an exponential transform to maintain positiveness, we have:

\[
\text{Total traffic volume} = \exp(\text{exposure}) + \text{error} \tag{7.23}
\]

29 We can however not be certain about this. Small, at first sight unimportant differences may turn out to be relevant to road safety, but at this point we do not know.

The following general accident model for dry and wet weather conditions is under consideration, where f is the fraction of travel under ‘wet’ weather conditions (suppressing the time index for clarity):

\[
\text{Number of accidents}_{\text{dry}} = \exp(\text{exposure}) \times (1 - f) \times \text{Risk}_{\text{general}} \tag{7.24}
\]

\[
\text{Number of accidents}_{\text{wet}} = \exp(\text{exposure}) \times f \times \text{Risk}_{\text{general}} \times \text{Risk}_{\text{wet}}, \tag{7.25}
\]

where ultimately Risk_wet is of most interest.
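As a small illustration of how (7.24) and (7.25) split the expected accidents over dry and wet conditions, the following sketch evaluates both equations. The function name and all numeric inputs are hypothetical and chosen only for illustration; they are not estimates from this thesis.

```python
import math

def expected_accidents(log_exposure, wet_fraction, risk_general, risk_wet):
    """Return (dry, wet) expected accident counts following (7.24)-(7.25)."""
    exposure = math.exp(log_exposure)
    dry = exposure * (1.0 - wet_fraction) * risk_general
    wet = exposure * wet_fraction * risk_general * risk_wet
    return dry, wet

# Hypothetical inputs (assumptions for illustration only).
wet_fraction = 0.1
dry, wet = expected_accidents(log_exposure=0.0, wet_fraction=wet_fraction,
                              risk_general=20.0, risk_wet=2.0)

# Accidents per unit of exposure under each condition; their ratio
# recovers the relative risk under wet conditions.
rate_dry = dry / (1.0 - wet_fraction)
rate_wet = wet / wet_fraction
relative_risk = rate_wet / rate_dry
```

Dividing each count by its share of the exposure shows why Risk_wet acts as a relative risk: it is the ratio of the wet accident rate to the dry accident rate.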

Details: observation model

In more detail, the following observation model will now be considered. We first define, using Latin capitals for observed values, the number of dry weather accidents at day $t$ by $D_t$, the number of wet weather accidents at day $t$ by $W_t$, the total traffic volume (car driver kilometres) by $V_t$ (error variance estimate $\sigma^2(V_t)$) and the number of tenth-hours of the day with precipitation for each weather station $i$, recoded into the fraction of time with precipitation $F^{(i)}_t$, where $0 \le F^{(i)}_t < 1$ (it never rains all day) and $i = 1, \ldots, 10$. We assume, using lower case Latin symbols for latent factors, a latent log-general risk factor ($r^{(g)}_t$) and a log-relative risk factor for wet weather conditions ($r^{(w)}_t$), as well as a latent factor ($v_t$) for log-exposure. Finally we have an unobserved component $f_t$, the logistic transformation of which determines the fraction of the traffic with precipitation that day. Given $f_t$ (which is stochastic), the fraction of time with precipitation for each weather station is assumed to follow an independent Gaussian distribution with mean $1/(1 + \exp(-f_t))$ and variance equal to a constant $\sigma^2(F_t)$ plus an estimated parameter $\sigma^2_F$. Technically, the independence assumption is not likely to hold exactly, as spatial relations should be considered. This effect is ignored at this point, but when knowledge about its structure becomes available, it could be incorporated in the model without too much effort.

Further we assume the accident counts to be independently Poisson distributed

given the latent factors while, given the latent factors, the traffic volume is inde-

pendently Gaussian distributed with variance determined by the travel survey

plus potential extra variance.

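The log-likelihood of an observation $Y_t = \{D_t, W_t, V_t, F^{(1)}_t, \ldots, F^{(10)}_t\}$, given the latent factors $a_t = \{v_t, f_t, r^{(g)}_t, r^{(w)}_t\}$, is now:

\[
\begin{aligned}
l(Y_t \mid a_t) ={} & \log\mathrm{Poisson}\Bigl(D_t,\ \lambda := \frac{1}{1+\exp(f_t)}\,\exp\bigl(v_t + r^{(g)}_t\bigr)\Bigr) && (7.26)\\
{}+{} & \log\mathrm{Poisson}\Bigl(W_t,\ \lambda := \frac{1}{1+\exp(-f_t)}\,\exp\bigl(v_t + r^{(g)}_t + r^{(w)}_t\bigr)\Bigr) && (7.27)\\
{}+{} & \log\mathrm{Gaussian}\Bigl(V_t,\ \mu := \exp(v_t),\ \sigma^2 := \sigma^2(V_t) + \sigma^2_V\Bigr) && (7.28)\\
{}+{} & \sum_{i=1}^{10} \log\mathrm{Gaussian}\Bigl(F^{(i)}_t,\ \mu := \frac{1}{1+\exp(-f_t)},\ \sigma^2 := \sigma^2(F_t) + \sigma^2_F\Bigr). && (7.29)
\end{aligned}
\]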

This approach, in which latent variables and both continuous and discrete dependent variables are modelled jointly, can also be found in Sammel, Ryan, and Legler (1997).
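The log-likelihood contribution (7.26)–(7.29) of a single day can be sketched in a few lines. This is a minimal illustration, not the implementation used for the analysis: the function and argument names are hypothetical, and `var_V` and `var_F` are assumed to already hold the combined variances (the known part plus the estimated extra variance).

```python
import math

def log_poisson(k, lam):
    """Log of the Poisson probability of count k with mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def log_gaussian(x, mu, var):
    """Log of the Gaussian density with mean mu and variance var."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)

def log_likelihood(D, W, V, F, v, f, r_g, r_w, var_V, var_F):
    """Sum of the four terms (7.26)-(7.29) for one day.

    D, W   : dry and wet weather accident counts
    V      : observed traffic volume
    F      : list of observed precipitation fractions (one per station)
    v, f   : latent log-exposure and precipitation component
    r_g    : latent log general risk
    r_w    : latent log relative risk under wet conditions
    """
    p_wet = 1.0 / (1.0 + math.exp(-f))   # fraction of traffic with precipitation
    p_dry = 1.0 / (1.0 + math.exp(f))    # equals 1 - p_wet
    ll = log_poisson(D, p_dry * math.exp(v + r_g))          # (7.26)
    ll += log_poisson(W, p_wet * math.exp(v + r_g + r_w))   # (7.27)
    ll += log_gaussian(V, math.exp(v), var_V)               # (7.28)
    ll += sum(log_gaussian(F_i, p_wet, var_F) for F_i in F) # (7.29)
    return ll

# Hypothetical values, for illustration only.
example = log_likelihood(D=0, W=0, V=1.0, F=[0.5] * 10,
                         v=0.0, f=0.0, r_g=0.0, r_w=0.0,
                         var_V=1.0, var_F=1.0)
```

Note that the dry weather term uses $1/(1+\exp(f_t))$, which is exactly one minus the wet fraction $1/(1+\exp(-f_t))$.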

Details: dynamics model

At this stage of model development, the latent factors $\{v_t, f_t, r^{(g)}_t, r^{(w)}_t\}$ are assumed to be related over time, acknowledging that in the future, at least in theory, this time variation may be (in part) explained by yet unidentified explanatory variables. For now, all latent factors except the fraction of time with precipitation $f_t$ are modelled using a basic structural model with a ‘seasonal’ component (Harvey, 1989, Chapter 6–7). The ‘seasonal’ component, however, is used to model the weekly patterns in traffic volume, as well as the apparent weekly pattern in the risk components. The component $f_t$, related to the fraction of traffic with precipitation, is for now modelled using a local level model. Obviously, explanatory variables like average air pressure, humidity and temperature as well as travel variables are likely to improve the explanation of the development of the component $f_t$. However, such improvements are not likely to change the general model structure, which is studied here.

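The dynamical specification of a basic structural model with a ‘seasonal’ of length seven (for a week) is in general defined as follows (Harvey, 1989, Chapter 6–7):

\[
\begin{pmatrix}
a^{(1)}_{t+1}\\ a^{(2)}_{t+1}\\ a^{(3)}_{t+1}\\ a^{(4)}_{t+1}\\ a^{(5)}_{t+1}\\ a^{(6)}_{t+1}\\ a^{(7)}_{t+1}\\ a^{(8)}_{t+1}
\end{pmatrix}
=
\begin{pmatrix}
1 & 1 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & -1 & -1 & -1 & -1 & -1 & -1\\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0
\end{pmatrix}
\begin{pmatrix}
a^{(1)}_{t}\\ a^{(2)}_{t}\\ a^{(3)}_{t}\\ a^{(4)}_{t}\\ a^{(5)}_{t}\\ a^{(6)}_{t}\\ a^{(7)}_{t}\\ a^{(8)}_{t}
\end{pmatrix}
+
\begin{pmatrix}
\eta^{(1)}_{t}\\ \eta^{(2)}_{t}\\ \eta^{(3)}_{t}\\ 0\\ 0\\ 0\\ 0\\ 0
\end{pmatrix},
\tag{7.30}
\]

where the dynamic noise components $\eta^{(1)}_t$, $\eta^{(2)}_t$ and $\eta^{(3)}_t$ are assumed to be mutually independent zero mean Gaussian with variances $\sigma^2_{\eta^{(1)}}$, $\sigma^2_{\eta^{(2)}}$ and $\sigma^2_{\eta^{(3)}}$. The components $a^{(1)}_t$, $t = 1, 2, \ldots$ are called level components, the $a^{(2)}_t$, $t = 1, 2, \ldots$ are called slope components and the $a^{(3)}_t$, $t = 1, 2, \ldots$ are called seasonal components. The other components can be regarded as dummy components. See Harvey (1989, Chapter 6–7).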
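To make the structure of the transition matrix in (7.30) concrete, the sketch below builds it and iterates the noise-free recursion. The helper names and the starting state are hypothetical. With all noise terms set to zero, the seasonal component repeats every seven days and its seven values sum to zero, while the level increases by the slope each day.

```python
def bsm_transition(n_seasons=7):
    """Transition matrix of a basic structural model: level, slope and
    n_seasons - 1 seasonal states, laid out as in (7.30)."""
    n = 2 + (n_seasons - 1)
    T = [[0.0] * n for _ in range(n)]
    T[0][0] = T[0][1] = 1.0      # level_{t+1} = level_t + slope_t
    T[1][1] = 1.0                # slope_{t+1} = slope_t
    for j in range(2, n):
        T[2][j] = -1.0           # new seasonal = -(sum of the stored seasonals)
    for i in range(3, n):
        T[i][i - 1] = 1.0        # shift older seasonal values one place down
    return T

def step(T, state):
    """One noise-free transition: returns T times state."""
    return [sum(T[i][j] * state[j] for j in range(len(state)))
            for i in range(len(T))]

T = bsm_transition()

# Hypothetical starting state: level 10, slope 0.5, six seasonal values.
state = [10.0, 0.5, 3.0, -1.0, 2.0, -2.0, 1.0, -4.0]
seasonals = []
for _ in range(14):
    state = step(T, state)
    seasonals.append(state[2])   # current seasonal effect
final_level = state[0]
```

In the full model the level and the seasonal effect are added to give the value of the latent factor on a given day.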

For each of the components $\{v_t, r^{(g)}_t, r^{(w)}_t\}$ the dynamic part of a basic structural model with a seven day seasonal is included. This means that for these three components three dynamic models (7.30) are combined, one for $v_t$, one for $r^{(g)}_t$ and one for $r^{(w)}_t$, where each individual component is associated with the sum of its respective level ($a^{(1)}_t$, $t = 1, 2, \ldots$) and seasonal ($a^{(3)}_t$, $t = 1, 2, \ldots$) components. The noise components for the levels may not be independent. In particular, a correlation between risk levels and traffic volume levels has to be considered, as it is often assumed that an increase in traffic volume does not lead to a proportional increase in accidents. Therefore the covariance matrix of the level noise has to be estimated. Similar arguments can be given for not assuming independence of the slope and seasonal noise components. Thus, for each type of component (level, slope and seasonal) a full dynamic covariance matrix is considered. The results are in Table 7.10.

For now, neither explanatory variables nor interventions are considered. Following Harvey (1989, Chapter 6–7), such variables can be implemented acting on the unobserved components in the dynamic equations above as well as on the observed variables in equations (7.26)–(7.29).

Type                  Component   Risk dry        Risk wet        Exposure
Level covariance      Risk dry     0.000283        0.000277       −0.000197
                      Risk wet     0.000277        0.000294       −0.000166
                      Exposure    −0.000197       −0.000166        0.000169
Seasonal covariance   Risk dry     1.35 × 10^−6    4.41 × 10^−6   −2.65 × 10^−7
                      Risk wet     4.41 × 10^−6    0.0000144      −8.66 × 10^−7
                      Exposure    −2.65 × 10^−7   −8.66 × 10^−7    5.20 × 10^−8

Table 7.10. Components of the dynamic covariance matrix. The variance of the noise of the ‘fraction of time with precipitation’ local level model is estimated at 2.318.

Results

Daily data starting 1 January 1987 up to 31 December 2005 are analysed, to-

talling 6940 observations. Only seven-day seasonal components are consid-

ered and a local level model for the fraction of the travel with precipitation,

resulting in a 25 (3 × (2 + 6) + 1) dimensional state space. The analysis re-

vealed effectively zero (co)variance for the slope error components, which are

further assumed to be zero.
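Both numbers in this paragraph can be checked with a few lines of arithmetic. The sketch below only restates the counts given in the text; the variable names are illustrative.

```python
from datetime import date

# Number of daily observations from 1 January 1987 up to and including
# 31 December 2005.
n_days = (date(2005, 12, 31) - date(1987, 1, 1)).days + 1

# Each basic structural model has a level, a slope and six seasonal states;
# three such models plus one local level state give the state dimension.
bsm_states = 2 + (7 - 1)
state_dim = 3 * bsm_states + 1
```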

Estimates of the non-zero components of the dynamic covariance matrix are in Table 7.10. The results are in line with what was expected at the outset: exposure noise is negatively correlated with noise in both risk components (an increase in traffic volume tends to coincide with a reduction in risk, see Hauer (1995)) and the noise terms of the risk components are (highly) correlated (the seasonal noise terms show a correlation approximating one).
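The correlations mentioned here follow directly from the covariances in Table 7.10. The sketch below converts both covariance blocks to correlation matrices. It assumes the component order (risk dry, risk wet, exposure) and takes the third row of the level block to be the exposure row, which the symmetry of the matrix suggests. Because the published values are rounded, a computed correlation can come out marginally above one in absolute value.

```python
import math

def correlations(cov):
    """Convert a covariance matrix (list of lists) to a correlation matrix."""
    n = len(cov)
    return [[cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(n)]
            for i in range(n)]

# Values copied from Table 7.10; order: risk dry, risk wet, exposure.
level = [[0.000283, 0.000277, -0.000197],
         [0.000277, 0.000294, -0.000166],
         [-0.000197, -0.000166, 0.000169]]
seasonal = [[1.35e-6, 4.41e-6, -2.65e-7],
            [4.41e-6, 1.44e-5, -8.66e-7],
            [-2.65e-7, -8.66e-7, 5.20e-8]]

level_corr = correlations(level)
seasonal_corr = correlations(seasonal)
```

For the level noise this gives a correlation of roughly 0.96 between the two risk components and negative correlations of both with exposure; for the seasonal noise all correlations are close to one in absolute value.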

In Figure 7.13 the development of the level components of the risk and the relative risk under wet weather conditions, exposure and the transformed fraction of time with precipitation are displayed. The first remarkable phenomenon is the increase in risk in the initial part of the series for both the wet and dry risk developments. This increase took place over the course of a few months, and is not understood. Another remarkable phenomenon is the increase in traffic volume that took place around the period the survey was extended. The increase is rather gradual compared to the structural break that would be expected from a sudden change in the survey structure. The change is not reflected by a decrease in risk in either risk component (see footnote 30). Therefore it is assumed that this increase in traffic volume is genuine. Near the end of the series, the traffic volume is suspiciously low while the risk components are estimated rather high. In general, it appears that traffic volumes drop on New Year's Eve, and a calendar effect should be considered to accommodate this in a further development of the model but, without further information, this datum appears to be suspect. It is not an artifact of the method, as it does not occur with other selections of the data.

30 When only the travel data change due to a change in the survey, and not the actual travel by the population, this is usually reflected in a change in the rate of accidents per travel unit, as the number of accidents may not have changed.

Based on the relative log-risk development (Figure 7.14(b)) it can be concluded that risk under wet weather conditions is approximately twice as high as it is under dry weather conditions. This result is reflected in the literature, for instance Brodsky and Hakkert (1988). A particularly interesting result from this analysis is that the risk varies a lot over time. Another remarkable result is that the general risk and the relative risk are strongly correlated. Comparing Figure 7.14(a) and Figure 7.14(b) shows strong local resemblance: while the general risk steadily decreased, the relative risk remained at about the same level. The short term fluctuations in both developments, however, appear strongly related.
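Since the model works with the logarithm of the relative risk, a level on the log scale has to be exponentiated to read it as a risk factor. The short sketch below shows the conversion. The tick values are read from the vertical axis of the log relative risk panel (0.4 to 1.2) and serve only as illustrative inputs; the function name is hypothetical.

```python
import math

def relative_risk(log_relative_risk):
    """Convert a log relative risk into a multiplicative risk factor."""
    return math.exp(log_relative_risk)

# Log-scale level at which risk under wet conditions is exactly twice the
# risk under dry conditions.
doubling_level = math.log(2.0)

# Illustrative log-scale values spanning the plotted range.
axis_ticks = [0.4, 0.6, 0.8, 1.0, 1.2]
factors = [relative_risk(x) for x in axis_ticks]
```

A log relative risk near 0.69 thus corresponds to a doubling of risk, and the plotted range of 0.4 to 1.2 corresponds to factors between roughly 1.5 and 3.3.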

Figure 7.14(d) shows the development of the level of the transformed proportion of traffic with precipitation. The level is transformed through a logistic transform $f_t \to 1/(1 + \exp(-f_t))$ to constrain the result to (0, 1). This means that in dry spells the level should be quite negative, while on rainy days it may exceed 0. This results in substantial dynamic variation, which may not be well captured by the local level model currently implemented: extended periods with little variation in time with precipitation (particularly periods without any precipitation) will occur, while in other periods the weather is quite volatile in this respect. How best to improve on this is the subject of further investigation. Although a model based on meteorological theory seems appropriate for this process and should be considered, it is in the end the fraction of traffic with precipitation which is important, and which may require a somewhat different model.
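The effect of the logistic transform on the latent level can be illustrated as follows. The function names and the chosen levels are hypothetical; the levels roughly span the range shown for the latent component (about −12 to 2).

```python
import math

def logistic(f):
    """Map a latent level on the real line to a fraction in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-f))

def logit(p):
    """Inverse logistic transform: map a fraction in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))

# Illustrative latent levels: very negative in dry spells, above 0 on rainy days.
levels = [-12.0, -6.0, -2.0, 0.0, 2.0]
fractions = [logistic(f) for f in levels]
```

A level of 0 corresponds to precipitation during half of the traffic, while a level of −12 corresponds to a fraction of only a few millionths. The large distance on the latent scale between dry and rainy days is one reason why a local level model with a constant noise variance may fit this component poorly.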

One-ahead standardised residuals and smoothed predictions

Calculating standardised residuals and predictions is not straightforward. Here

the following approach is taken. First two subsets of latent factors (state com-

ponents) are distinguished: the ‘fraction of traffic with precipitation’ and the

others. Except for traffic volume, all dependent variables somehow depend

on ‘fraction of traffic with precipitation’, and the expected values of all depen-

dent variables except the precipitation duration variables depend on a linear

combination of the second set of state variables.

[Figure 7.13: four panels over the period 1987/1/1 to 2005/12/30. (a) Level log Risk (vertical axis 2.4 to 3.6); (b) Level log Relative Risk Wet (vertical axis 0.4 to 1.2); (c) Level log Exposure (vertical axis −2.0 to −1.3); (d) Inverse logistic ‘fraction wet’ (vertical axis −12 to 2).]

Figure 7.13. Development of selected latent factors, including pointwise 95% margins.

For instance, the expected number of wet weather accidents is by (7.27)

\[
\frac{1}{1 + \exp(-f_t)} \exp\bigl(v_t + r^{(g)}_t + r^{(w)}_t\bigr),
\]

which can be further simplified into

\[
\frac{1}{1 + \exp(-x)} \exp(y), \tag{7.31}
\]

where $(x, y)$ is assumed bivariate Gaussian, with its expected value and covariance determined from the one-ahead predicted state or the smoothed state, depending on purpose. The predictions are determined by evaluating the expected value of (7.31) under the bivariate Gaussian law. The variance is calculated by adding the prediction to the variance of (7.31): the first term is the Poisson variance, which equals the mean, and the second term reflects the uncertainty in the state. Smoothed predictions and one-ahead predictions of traffic volume data and precipitation duration data are determined in a similar, less complicated manner.
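One way to evaluate the expected value and variance of (7.31) numerically is sketched below. It is an assumption of this sketch, not a description of the method used in the thesis, that the two-dimensional expectation is reduced to a one-dimensional integral using the Gaussian identity $E[\exp(cy)\,g(x)] = \exp(c\mu_y + c^2\sigma_y^2/2)\,E[g(x')]$ with $x' \sim N(\mu_x + c\,\sigma_{xy}, \sigma_x^2)$, which is then approximated with the trapezoidal rule. All function names are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def normal_expectation(g, mean, var, n=2001, width=8.0):
    """Approximate E[g(X)] for X ~ N(mean, var) with the trapezoidal rule
    on a grid spanning `width` standard deviations on either side."""
    sd = math.sqrt(var)
    h = 2.0 * width / (n - 1)
    total = 0.0
    for k in range(n):
        z = -width + k * h
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * g(mean + sd * z) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)

def poisson_prediction(mx, my, vx, vy, cxy):
    """Mean and variance of a Poisson count with intensity
    sigmoid(x) * exp(y), where (x, y) is bivariate Gaussian with means
    (mx, my), variances (vx, vy) and covariance cxy."""
    # First moment of the intensity (c = 1 in the identity above).
    m1 = math.exp(my + 0.5 * vy) * normal_expectation(sigmoid, mx + cxy, vx)
    # Second moment of the intensity (c = 2 in the identity above).
    m2 = math.exp(2.0 * my + 2.0 * vy) * normal_expectation(
        lambda x: sigmoid(x) ** 2, mx + 2.0 * cxy, vx)
    var_intensity = m2 - m1 * m1
    # Poisson variance (equal to the mean) plus variance of the intensity.
    return m1, m1 + var_intensity

# Hypothetical state moments, for illustration only.
mean_count, var_count = poisson_prediction(mx=0.0, my=1.0, vx=1.0,
                                           vy=0.25, cxy=0.0)
```

A standardised residual would then be the observed count minus `mean_count`, divided by the square root of `var_count`.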

The residuals (omitted for the ten weather stations), depicted in Figure 7.16, do not appear to be Gaussian. The turning points test is non-significant for the dry weather accidents and almost significant for the traffic volume. The test is significant for the wet weather accidents. These residuals also show a significant negative first-order autocorrelation. This result appears to be in line with findings of Eisenberg (2004), where 1 cm of precipitation is found to be associated with a decrease of 3.06% in accidents the next day, while 1 cm of rain is associated with an increase of 1.83% in accidents on the same day. So the fact that it rained the previous day reduces the number of accidents today. However, these results are for the total number of fatal accidents (irrespective of weather conditions) and exposure is based on the annual motor vehicle kilometres.
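For readers unfamiliar with the turning points test, a minimal sketch is given below. It uses the standard result that a series of $n$ independent, identically distributed continuous observations has an expected number of turning points of $2(n-2)/3$ with variance $(16n-29)/90$, and a normal approximation for the test statistic. Whether exactly this variant was used for the residuals above is an assumption. The alternating example series is artificial.

```python
import math

def turning_points_test(x):
    """Return (number of turning points, z statistic, two-sided p-value)."""
    n = len(x)
    count = sum(
        1 for i in range(1, n - 1)
        if (x[i] > x[i - 1] and x[i] > x[i + 1])
        or (x[i] < x[i - 1] and x[i] < x[i + 1])
    )
    expected = 2.0 * (n - 2) / 3.0
    variance = (16.0 * n - 29.0) / 90.0
    z = (count - expected) / math.sqrt(variance)
    # Two-sided p-value under the standard normal approximation.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return count, z, p_value

# Artificial series that alternates in sign: every interior point is a
# turning point, as with strong negative first-order autocorrelation.
alternating = [(-1.0) ** i for i in range(100)]
count, z, p_value = turning_points_test(alternating)
```

An excess of turning points (positive `z`) is what one would expect for residuals with negative first-order autocorrelation, while too few turning points (negative `z`) point to positive autocorrelation or a remaining trend.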

Relative risk

Probably the most interesting results can be obtained by studying the smoothed log risk developments in Figure 7.14(a) (general risk) and Figure 7.14(b) (relative risk under precipitation). Their joint development could be studied in order to identify general influences on road safety; here, however, the focus is on precipitation effects. One obvious choice is then to calculate the relative risk under wet conditions. In Figure 7.20 the relative risk (the exponential transform of the relative log-risk under wet weather conditions) and its margins are given.

It is clear from Figure 7.20 that the risk in traffic increases by a factor of about two when precipitation is present. However, it is also clear that substantial differences exist. In Figure 7.20, fluctuations in the level of the relative risk are smoothed, but it is obvious that the level is sometimes lower and sometimes higher than average. A next step in model development should be to relate these variations to external influences, like precipitation intensity and temperature, but also non-meteorological factors. A similar effort should be made with respect to the joint risk development, as well as the traffic volume distribution.

[Figure 7.16: three panels over the period 1987/1/1 to 2005/12/30. (a) Residuals fatal dry weather single car accidents; (b) Residuals fatal wet weather single car accidents; (c) Residuals car traffic volume.]

Figure 7.16. One ahead standardised residuals.

[Figure 7.19: four panels over the period 2001/10/13 to 2002/1/19. (a) Fatal dry weather single car accidents; (b) Fatal wet weather single car accidents; (c) Car traffic volume; (d) Fraction of traffic with precipitation.]

Figure 7.19. Smoothed predictions.

[Figure 7.20: one panel over the period 1987/1/1 to 2005/12/30, vertical axis from 1.5 to 3.0.]

Figure 7.20. Smoothed log relative risk wet weather conditions versus dry weather conditions.

7.5.3. Conclusions

Based on the relative risk development it can be concluded that risk under wet weather conditions is approximately twice as high as it is under dry weather conditions. This result is reflected in the literature. A particularly interesting result from this analysis is that the risk varies a lot over time. Another remarkable result is that the general risk and the relative risk are strongly correlated. Comparing Figure 7.14(a) and Figure 7.14(b) shows strong local resemblance: while the general risk steadily decreased, the relative risk basically remained at the same level. That aside, the short term fluctuations in both developments appear strongly related. If the results are correct, this suggests that one process may govern both variations, although the relative risk appears to be more sensitive to this process.

7.6. Discussion and conclusions

An approach to filtering and likelihood estimation for state space models where the dependent variables can have a non-Gaussian distribution is considered. The state space, however, is still assumed to be ruled by Gaussian disturbances. The approach considered can be regarded as a generalisation of the iterated extended Kalman filter, and therefore it shares its properties. It appears to function well in the Gaussian-Poisson combination described in the simulation studies in Section 7.4, but did not fare so well in the stochastic volatility model in Section 7.5.1. Although it is not possible to assess the results of the road safety precipitation application in Section 7.5.2 other than by comparing them with general results based on other methods, the results appear sound. One interesting aspect of the scale of the model is that it may be the case that one influence affects both the general risk and the relative risk under wet weather conditions, which may not be easy to find with other methodologies. The method, however, is not fully matured. One possible improvement is extending the approach to non-normal state distributions. This may reduce the differences found in the stochastic volatility model. One approach, which has not been attempted, would be to specify a multivariate skew normal distribution, for instance as in Azzalini and Capitanio (1999). It is probably best to look for opportunities related to a further (higher order) improvement of the Laplace approximation, which may induce a class of distributions for the state. Another possibility is that the state variance approximation can be improved, as a small sample is currently used to estimate it.

8. Conclusions

This thesis presents a set of comprehensive studies into time series analysis

for aggregated road safety data, such as accident counts and victim counts. In

particular, the number of fatalities and serious injuries is closely monitored by

government agencies and the public, and its relevance to society is not disputed. Much research is conducted into how road safety can be improved. To that end, attempts are often made to explain changes in road safety statistics by factors (or changes in factors) such as exposure, policy, driving under the influence of alcohol, speeding by drivers and infrastructural measures. Some factors such

as regulations, traffic law and policy can be directly observed (although com-

pliance with regulations, traffic law and policy may not). Other factors can be

observed in theory but in practice their measurement is either difficult or very

expensive. Examples of such factors are exposure, which can be measured us-

ing surveys and vehicle counting systems, and driving under the influence of

alcohol, which can be measured using road side surveys. Finally, some factors

are even harder to observe such as driver skill or experience. Data obtained

from diverse sources as described above are likely to differ in accuracy, which

may complicate statistical analysis.

Another complicating factor in road safety time series analysis is that no unique

measure of road safety is available. Usually road safety is measured in terms

of the number of accidents or the number of victims. Although in practice the

situation is more complicated, some road safety measures may affect either ac-

cident occurrence or accident severity. When a study is performed measuring

the effect of a policy intervention on road safety that should mainly affect acci-

dent occurrence, the development of the number of accidents of a relevant kind

would be studied. On the other hand, if the policy intervention should mainly

affect accident severity, the development of the number of victims would be

studied. If possible, an analysis is performed on both a type of victim for which

the count should be affected by the policy intervention, and a type of victim for

which the count is not affected by the policy intervention. The development

of both victim types should otherwise be as similar as possible. If a reduction

in the number of victims is indeed identified it is important to confirm that the

number of victims is reduced because accident severity is reduced, not because

the number of accidents is reduced. Ideally, changes should not be found for

the type of victim of which the count should not be affected by the policy inter-

vention. Unfortunately, it is not likely that both conditions are met when the

number of victims is studied for a longer period of time. The possibility has

to be considered that other influences have affected road safety. These influ-

ences themselves may need to be modelled and need not be fully independent

of the policy intervention originally considered. To accommodate this case it

is important to jointly model influences on joint dependent variables, notably

accident counts and victim counts.

In this thesis a novel approach to time series analysis of developments in aggregated road safety data is presented, intended to broaden the options for, and improve the reliability of, statistical analysis compared to commonly used alternatives. The approach is based on multivariate heteroscedastic structural time series models and addresses many of the issues in road safety time series analysis, including all of the issues described above. Currently applied models may treat these issues only partly.

The combination of three basic aspects of the approach presented in this thesis allows for the improvements to road safety time series models. These three aspects are:

• The use of structural components. The time series models are constructed

using interpretable structural components, such as exposure, risk and

severity components. Also registration rate, seat belt use percentage, per-

centage of drivers exceeding the legal blood alcohol concentration limit,

speeding behaviour and other important aspects of road safety may be

represented using structural components. The structural components

may follow trends and seasonal patterns. The benefits of using inter-

pretable components become apparent when a single component can be

related to more than one dependent variable. Furthermore, these models

allow the researcher to distinguish external effects into effects that affect

road safety or its important components (for instance interventions) or

effects that affect how road safety is observed (for instance changes in

registration rate). Furthermore explanatory variables can be modelled to

have an effect on specific components of road safety.

• Multivariate dependent variables: both accident and victim counts can

be included in one model. Explanatory variables can be included in a

traditional way. In addition, explanatory variables measured with obser-

vation error can be included as well, where a structural component can

be considered an estimate of the true value (without observation error).

For instance if the explanatory variable is seat belt use, data may be ob-

tained from relatively small road side surveys, not necessarily available

for all observations. If seat belt use can be considered relatively constant,

the structural component can be the average of all observations. This

structural component could then be used for all observations, including

observations where no survey result is available. If seat belt use cannot be

considered constant, a trend can be considered, still allowing the model

to use all observations.

• Heteroscedastic structure of errors: the models can treat observations whose accuracy may vary over time. The observations may also be unavailable at a few time points. The models can treat cases where variables differ in accuracy. For instance, traffic volume data for cars may be more accurate than traffic volume data for motorcycles (in the survey of 2003, the relative error for the total traffic volume of cars is about 2–3%, while the relative error for the total traffic volume of motorcycles is almost 30%). Yet both variables may be included in the same model.
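The handling of missing observations and of observation errors that differ in accuracy can be illustrated with a deliberately small example: a Kalman filter for a univariate local level model. This is a sketch under simplifying assumptions (one series, Gaussian errors, known variances), not the multivariate models used in this thesis. The function name and the survey figures are hypothetical.

```python
def local_level_filter(observations, obs_variances, level_variance,
                       initial_mean=0.0, initial_variance=1e6):
    """Kalman filter for a local level model.

    observations   : list of values, with None where no survey is available
    obs_variances  : observation error variance per time point
    level_variance : variance of the level disturbance (0 for a constant level)
    Returns a list of (filtered mean, filtered variance) per time point.
    """
    mean, variance = initial_mean, initial_variance
    filtered = []
    for y, r in zip(observations, obs_variances):
        if y is not None:
            # Update step: accurate observations (small r) get a larger gain.
            gain = variance / (variance + r)
            mean = mean + gain * (y - mean)
            variance = (1.0 - gain) * variance
        # With a missing observation the update step is simply skipped.
        filtered.append((mean, variance))
        # Prediction step for the next time point.
        variance = variance + level_variance
    return filtered

# Hypothetical seat belt use percentages from road side surveys;
# None marks periods without a survey, and the fourth survey is less accurate.
surveys = [70.0, None, None, 74.0, None, 73.0]
variances = [4.0, 4.0, 4.0, 25.0, 4.0, 4.0]

constant = local_level_filter(surveys, variances, level_variance=0.0)
trending = local_level_filter(surveys, variances, level_variance=0.5)
```

With `level_variance=0.0` the final filtered mean is, up to the influence of the vague initial state, the precision-weighted average of the three available surveys, which corresponds to the constant seat belt use case described above. With a positive `level_variance` the uncertainty about the level grows during periods without a survey.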

It is the combination of these properties in one model and the use of shared

structural components that makes the model particularly attractive to road

safety time series analysis. Considering that a longer period of time is stud-

ied, where conditions change over time, the following practical benefits can be

mentioned:

• The multivariate nature and the heteroscedastic nature of the models

are utilised to account for covariance among dependent variables. It is

known that the number of victims is dependent on the number of ac-

cidents. Therefore models including both accident counts and victim

counts should account for their covariance. Similarly, the error in travel

data from surveys can be correlated. The covariance between accident

counts and victim counts is successfully used in both applications of

Chapter 3. The error in travel data from surveys is used in the appli-

cations of Chapter 3, the Dutch application in Chapter 5 and the road

safety application in Chapter 7.

• Structural components allow for a limited level of verification. The de-

velopment of structural components can be compared with secondary

information. However, if sufficient secondary information is available, it

is probably better to include that information in the model. A successful

application of this approach is in Chapter 6 where the development of ex-

posure outside urban areas is compared to an estimate of traffic volume

constructed from road length data and traffic intensity data.

• Structural components can be shared among many dependent variables. In the road safety application in Chapter 7, weather conditions measured by ten weather stations are related to one single structural component. If the weather stations agree, the value of the structural component is accurately known. On the other hand, if the weather stations disagree, that is, the weather pattern differs among the weather stations, the value of the structural component is not accurately known. The possibility of sharing components is also utilised in Chapter 3, where a structural component representing the number of victims per accident is shared among police recorded victims and an estimate of the so-called true number of victims. This component is used to improve the estimate of the true number of accidents, which cannot be inferred from hospital records.

• The approach allows for other observation error distributions than Gaus-

sian. Even combinations of other observation error distributions are pro-

vided for by the modelling approach presented in this thesis. Notably,

an example is given using both multiple Gaussian distributed observa-

tion errors and multiple Poisson distributed dependent variables. This

feature is used in the road safety application in Chapter 7, where Poisson

distributed accident counts are analysed for both dry and wet weather

conditions as well as Gaussian distributed traffic volume data and weath-

er station data.

• The approach allows correlation among so-called innovations of structural components. When innovations of structural components are allowed to correlate (and are correlated), the developments of structural

components mutually affect each other. This is relevant when compo-

nents represent phenomena that may affect each other, such as exposure

and risk. Also the development of accident severity may affect the occur-

rence of accidents exceeding a certain severity level. The fact that expo-

sure can affect risk is used in almost all examples in this thesis (eliminat-

ing the need for a coefficient for exposure), while the fact that accident

severity can affect occurrence of severe accidents is used in the first ap-

plication of Chapter 3.

In the model applications presented in this thesis the definition of exposure used is always proportional to traffic volume. This restriction is not fundamental, however. The approach presented in this thesis can also be applied to models assuming a nonlinear relation, and using other exposure measures than traffic volume. Furthermore, the dynamic relations assumed in this thesis are derived from local linear trend models. Although smooth developments may be approximated satisfactorily in practice using local linear trend models, this need not always be the case. In particular the local level approximation to the development of the fraction of traffic with precipitation in Chapter 7 may be improved upon. As this thesis is aimed at providing better and statistically more reliable options for time series analysis of road safety data rather than generating empirical results, the improvement of such models is considered outside the scope of this thesis.

The models considered in this thesis all assume a linear dynamic relation, and

Gaussian state disturbances. These assumptions may not hold in all cases, al-

though the approximately linear local linear trend models appear to function

quite well in practice. Sufficient empirical evidence is given in this thesis to show the effectiveness of the newly proposed methodology of time series analysis for traffic safety data. It is planned to develop the methodology further

into higher dimensions and into more realistic models for traffic safety. In this

thesis the key contributions of the new approach are reported.

Bibliography

Allen, L. and A. Saunders (2003). A survey of cyclical effects in credit risk

measurement models. BIS Working Paper 126, Bank for International Settle-

ments, Basel, Switzerland.

Anderson, B. D. O. and J. B. Moore (1979). Optimal Filtering. Englewood Cliffs:

Prentice-Hall.

Andrey, J. and S. Yagar (1993). A temporal analysis of rain-related crash risk.

Accident Analysis and Prevention 25(4), 465–472.

Appel, H. (1982). Strategische aspekten zur erhohung der sicherheit im

Straßenverkehr. Automobil-Industrie 3, 347–356.

AVV (2004–2005). Mobiliteitsonderzoek Nederland (mobility research in the

Netherlands). Electronic publication.

AVV (2006). Bestand geRegistreerde Ongevallen Nederland (database of reg-

istered accidents in the Netherlands). Electronic publication.

Azzalini, A. and A. Capitanio (1999). Statistical applications of the multivariate

skew normal distribution. Journal of the Royal Statistical Society B 61(3), 579–

602.

Bell, B. M. and F. W. Cathey (1993). The iterated Kalman filter update as a

Gauss-Newton method. IEEE Transactions on Automatic Control 38(2), 294–

297.

Bell, W. R. (2004). On RegComponent time series models and their applica-

tions. In A. C. Harvey, S. J. Koopman, and N. Shephard (Eds.), State space

and unobserved components models: theory and applications. Cambridge Univer-

sity Press, Cambridge.

Bickel, P. J. and K. A. Doksum (1981). An analysis of transformations revisited.

Journal of the American Statistical Association 76(374), 296–311.

Bijleveld, F. D. (1999). Monitoring van verkeersveiligheid : beschrijving van

een rekeninstrument voor het volgen van ontwikkelingen in de verkeersvei-

ligheid. Technical Report R-99-20, SWOV, Leidschendam, the Netherlands.

Bijleveld, F. D. (2005). The covariance between the number of accidents and

the number of victims in multivariate analysis of accident related outcomes.

Accident Analysis and Prevention 37, 591–600.

Bijleveld, F. D., J. J. F. Commandeur, P. G. Gould, and S. J. Koopman (2008).

Model-based measurement of latent risk in time series with applications.

Journal of the Royal Statistical Society A 171(1), 265–277.

Bijleveld, F. D., J. J. F. Commandeur, S. J. Koopman, and K. van Montfort (In

Prep.). Non-linear interpolation of disaggregated time series with an appli-

cation to traffic safety.

Blokpoel, A. and P. H. Polak (1991). Koppeling tussen de landelijke medis-

che registratie (lmr) en de verkeersongevallenregistratie (vor) van in zieken-

huis opgenomen verkeersgewonden. Technical Report R-91-79, SWOV, Lei-

dschendam, the Netherlands. in Dutch.

Bos, J. M. J. and F. D. Bijleveld (1991). Tijdreeksanalyse van het gordeleffect.

Technical Report R-91-92, SWOV, Leidschendam, the Netherlands. in Dutch.

Box, G. E. P. and D. R. Cox (1964). An analysis of transformations. Journal of

the Royal Statistical Society B 26, 211–246.

Box, G. E. P. and D. R. Cox (1982). An analysis of transformations revisited,

rebutted. Journal of the American Statistical Association 77(377), 209–210.

Box, G. E. P. and G. M. Jenkins (1976). Time series analysis. San Francisco:

Holden-Day.

Box, G. E. P. and G. C. Tiao (1975, March). Intervention analysis with appli-

cations to economic and environmental problems. Journal of the American

Statistical Association 70(349), 70–79.

Brodsky, H. and A. S. Hakkert (1988). Risk of a road accident in rainy weather.

Accident Analysis and Prevention 20(3), 161–176.

Broughton, J. (1991). Forecasting road accident casualties in Great Britain. Ac-

cident Analysis and Prevention 23(5), 353–362.

Cameron, A. C. and P. K. Trivedi (1998). Regression analysis of count data. Cam-

bridge: Cambridge University Press.

CBS (1950–2000). Traffic fatalities in the Netherlands (various titles). Annual

(now internet).

CBS (1985–2003). Onderzoek verplaatsings gedrag.

CBS (2007). History population. Yearly publication.

Chan, N. H. and W. Palma (1998). State space modeling of long-memory pro-

cesses. Annals of Statistics 26(2), 719–740.

Christens, P. F. (2003). Statistical modelling of traffic safety development. Ph. D. the-

sis, Informatics and Mathematical Modelling, Technical University of Den-

mark, DTU, and Danish Transport Research Center, Richard Petersens Plads,

Building 321, DK-2800 Kgs. Lyngby. Supervisor: Poul Thyregod.

Commandeur, J. J. F., F. D. Bijleveld, and R. Bergel (2007). Multivariate time

series analysis of SafetyNet data. Deliverable D7.7, SafetyNet. in Press.

Commandeur, J. J. F. and S. J. Koopman (2007). An Introduction to State Space

Time Series Analysis. Practical Econometrics Series. Oxford: Oxford Univer-

sity Press.

Commandeur, J. J. F. and M. J. Koornstra (2001). Prognoses voor de ver-

keersveiligheid in 2010. Technical Report R-2001-9, SWOV, Leidschendam,

the Netherlands. in Dutch.

COST329 (2004). Models for traffic and safety development and interventions.

final report. Technical Report EUR 20913 – COST 329, Office for Official

Publications of the European Communities, Luxembourg.

de Bruijn, N. G. (1981). Asymptotic methods in analysis. Dover Publications, Inc.

de Jong, P. (1989). Smoothing and interpolation with the state-space model.

Journal of the American Statistical Association 84(408), 1085–1088.

de Jong, P. and P. P. Boyle (1983). Monitoring mortality a state-space approach.

Journal of Econometrics 23, 131–146.

de Jong, P. and J. Penzer (1998). Diagnosing shocks in time series. Journal of the

American Statistical Association 93(442), 796–806.

Dempster, A. P., N. M. Laird, and D. B. Rubin (1977). Maximum likelihood

from incomplete data via the EM algorithm. Journal of the Royal Statistical

Society B B(34), 183–202.

Dominici, F., A. McDermott, and T. J. Hastie (2004). Improved semiparametric

time series models of air pollution and mortality. Journal of the American

Statistical Association 99, 938–948.

Doornik, J. A. (2001). Object-Oriented Matrix Programming using Ox 3.0. London:

Timberlake Consultants Press.

Durbin, J. and S. J. Koopman (1997). Monte Carlo maximum likelihood esti-

mation for non-Gaussian state space models. Biometrika 84(3), 669–684.

Durbin, J. and S. J. Koopman (2001). Time Series Analysis by State Space Methods.

Oxford: Oxford University Press.

DVS (2003). Dutch road accident data. http://www.rijkswaterstaat.nl/dvs/.

Eisenberg, D. (2004). The mixed effects of precipitation on traffic crashes. Ac-

cident Analysis and Prevention 36, 637–647.

El-Sadig, M., J. N. Norman, O. L. Lloyd, and A. Bener (2002). Road traffic accidents

in the United Arab Emirates: trends of morbidity and mortality during

1977–1998. Accident Analysis and Prevention 34, 465–467.

Elvik, R. and T. Vaa (Eds.) (2004). Handbook of road safety measures. Oxford:

Elsevier Ltd.

Ermens, R. J. L. and J. S. N. van Vliet (2006). Monitoring bromfietshelmen 2006.

Technical Report I&M-99380179-BvV, Grontmij Verkeer en Infrastructuur, De

Bilt.

Ernst, G. and E. Bruning (1990). Funf Jahre danach:, Wirksamkeit der ‘Gur-

tanlegepflicht fur Pkw Insassen ab 1. 8. 1984’. Zeitschrift fur Verkehrssicher-

heit 36(1), 2–13.

Evans, A. W. (2003). Estimating transport fatality risk from past accident data.

Accident Analysis and Prevention 35, 459–472.

Fahrmeir, L. and G. Tutz (1994). Multivariate Statistical Modelling Based on Gen-

eralized Linear Models. New York: Springer-Verlag.

Fahrmeir, L. and S. Wagenpfeil (1996). Smoothing hazard functions and time-

varying effects in discrete duration and competing risk models. Journal of the

American Statistical Association 91, 1584–1594.

Feller, W. (1968). An introduction to probability theory and its applications (Third

ed.), Volume I. New York: John Wiley & Sons, Inc.

Finkenstadt, B. F. and B. T. Grenfell (2000). Time series modelling of childhood

diseases: A dynamical systems approach. Applied Statistics 49, 187–205.

Fixler, J. B., G. T. Foster, J. M. McGuirk, and M. A. Kasevich (2007, January).

Atom interferometer measurement of the Newtonian constant of gravity.

Science 315(5808), 74–77.

Gaudry, M. (1984). DRAG, un modele de la demande routiere, des accidents

et leur gravite, applique au Quebec de 1956–1986. Technical Report Pub-

lication CRT-359, Centre de recherche sur les Transports, et Cahier #8432,

Departement de sciences economiques, Universite de Montreal.

Gaudry, M. and S. Lassarre (Eds.) (2000). Structural Road Accident Models: The

International DRAG Family. Oxford: Elsevier Science Ltd.

Gould, P. G. (2005). Econometric modelling of road crashes. Ph. D. thesis, Monash

University Australia.

Hakkert, A. S. and L. Braimaister (2002). The uses of exposure and risk in

road safety studies. Technical Report R-2002-12, SWOV, Leidschendam, the

Netherlands.

Hampel, F. R., P. J. Rousseeuw, E. M. Ronchetti, and W. A. Stahel (1986). Robust

statistics. New York: John Wiley & Sons, Inc.

Harvey, A. C. (1981). Time series models. London: Philip Allan.

Harvey, A. C. (1983). The formulation of structural time series models in dis-

crete and continuous time. Questiio 7, 563–575.

Harvey, A. C. (1989). Forecasting, structural time series models and the Kalman

filter. Cambridge: Cambridge University Press.

Harvey, A. C. and J. Durbin (1986). The effects of seat belt legislation on British

road casualties: A case study in structural time series modelling. Journal of

the Royal Statistical Society A 149(3), 187–227.

Harvey, A. C. and C. Fernandes (1989). Time series models for insurance

claims. Journal of the Institute of Actuaries 116, 513–528.

Harvey, A. C. and S. J. Koopman (1992). Diagnostic checking of

unobserved-components time series models. Journal of Business & Economic

Statistics 10(4), 377–389.

Harvey, A. C. and S. J. Koopman (1997). Multivariate structural time series

models. In C. Heij, Schumacher, H., B. Hanzon, and K. Praagman (Eds.),

System Dynamics in Economic and Financial Models, pp. 269–298. Chichester,

England: John Wiley & Sons.

Harvey, A. C., E. Ruiz, and N. Shephard (1994). Multivariate stochastic vari-

ance models. Rev. Econ. Stud. 61, 247–264.

Harvey, A. C. and N. Shephard (1993). Structural time series models. In G. S.

Maddala, C. R. Rao, and H. D. Vinod (Eds.), Handbook of statistics, Volume 11,

Chapter 10, pp. 261–302. Amsterdam: Elsevier Science Publishers B.V.

Hauer, E. (1982). Traffic conflicts and exposure. Accident Analysis and Preven-

tion 14(5), 359–364.

Hauer, E. (1992). Empirical Bayes approach to the estimation of “Unsafety”:

the multivariate regression method. Accident Analysis and Prevention 24(5),

457–477.

Hauer, E. (1995). On exposure and accident rate. Traffic Engineering + Con-

trol 36(3), 134–138.

Hauer, E. (2001). Overdispersion in modelling accidents on road sections and

in Empirical Bayes estimation. Accident Analysis and Prevention 33, 799–808.

Hiselius, L. W. (2004). Estimating the relationship between accident frequency

and homogeneous and inhomogeneous traffic flows. Accident Analysis and

Prevention 36, 985–992.

Huber, P., E. M. Ronchetti, and M.-P. Victoria-Feser (2004). Estimation of gen-

eralized linear latent variable models. Journal of the Royal Statistical Society

B 66, 893–908.

Hutchings, C. B., S. Knight, and J. C. Reading (2003). The use of generalized

estimating equations in the analysis of motor vehicle crash data. Accident

Analysis and Prevention 35, 3–8.

Johansson, P. (1996). Speed limitation motorway casualties: a time series count

data regression approach. Accident Analysis and Prevention 28(1), 73–87.

Kalman, R. E. (1960). A new approach to linear filtering and prediction prob-

lems. Journal of Basic Engineering (Series D) 82, 35–45.

Keay, K. and I. Simmonds (2005). The association of rainfall and other weather

variables with road traffic volume in Melbourne, Australia. Accident Analysis

and Prevention 37, 109–124.

Keay, K. and I. Simmonds (2006). Road accidents and rainfall in a large

Australian city. Accident Analysis and Prevention 38, 445–454.

KNMI (2006). Dutch precipitation data by the Royal Netherlands Meteorolog-

ical Institute. Internet www.knmi.nl.

Kohn, R. and C. F. Ansley (1989). A fast algorithm for signal extraction, influ-

ence and cross-validation in state space models. Biometrika 76(1), 65–79.

Koopman, S. J., N. Shephard, and J. A. Doornik (1998). Statistical algorithms

for models in state space using SsfPack 2.2. Econometrics Journal 13, 1–55.

Koornstra, M. J. (1992). The evolution of road safety and mobility. IATSS

Research 16(2), 129–147.

Lassarre, S. (2001). Analysis of progress in road safety in ten European

countries. Accident Analysis and Prevention 33, 743–751.

Ledolter, J., S. Klugman, and C. Lee (1991). Credibility models with time-

varying trend components. ASTIN Bulletin 21, 73–91.

Lee, Y. and J. A. Nelder (2006). Double hierarchical generalized linear models.

Applied Statistics 55(2), 139–185.

Levitt, S. D. and J. Porter (2001). Sample selection in the estimation of air bag

and seat belt effectiveness. The Review of Economics and Statistics 83(4), 603–

615.

Li, L. and K. Kim (2000). Estimating driver crash risks based on the extended

Bradley-Terry model: an induced exposure method. Journal of the Royal Sta-

tistical Society A 163(2), 227–240.

Lord, D., S. P. Washington, and J. N. Ivan (2005). Poisson, Poisson-gamma and

zero-inflated regression models of motor vehicle crashes: balancing statisti-

cal fit and theory. Accident Analysis and Prevention 37, 35–46.

Magnus, J. R. and H. Neudecker (1999). Matrix Differential Calculus. London:

John Wiley & Sons, Inc.

Mathijssen, R. (2004). Three decades of drink-driving policy in the Netherlands;

an evaluation. In P. M. Williams and A. B. Clayton (Eds.), Proceedings of the

17th Meeting of the International Council on Alcohol, Drugs and Traffic Safety,

Glasgow, Scotland, United Kingdom, 8 — 13 August 2004. ICADTS.

McCullagh, P. and J. A. Nelder (1989). Generalized Linear Models (Second Edi-

tion ed.). Chapman & Hall.

McLachlan, G. J. and T. Krishnan (1997). The EM Algorithm and Extensions.

Wiley series in probability and statistics. New York: John Wiley & Sons, Inc.

Morton, A. and B. Finkenstadt (2005). Discrete time modelling of disease

incidence time series by using Markov chain Monte Carlo methods. Applied

Statistics 54, 575–594.

Muth, J. F. (1960). Optimal properties of exponentially weighted forecasts (corr:

V57 p919-20). Journal of the American Statistical Association 55, 299–305.

Oppe, S. (1989). Macroscopic models for traffic and traffic safety. Accident

Analysis and Prevention 21, 225–232.

Oppe, S. (1991a). Development of traffic and traffic safety: Global trends and

incidental fluctuations. Accident Analysis and Prevention 23(5), 413–422.

Oppe, S. (1991b). Development of traffic and traffic safety: Global trends and

incidental fluctuations. Accident Analysis and Prevention 23(5), 413–422.

Oppe, S. (1991c). Development of traffic and traffic safety in six developed

countries. Accident Analysis and Prevention 23(5), 401–412.

Oppe, S. and M. J. Koornstra (1990). A mathematical theory for related long

term developments of road traffic and safety. In M. Koshi (Ed.), Proceedings

of the Eleventh International Symposium on Transportation and Traffic Theory, July

18-20, 1990 in Yokohama, Japan, New York, pp. 113–132. Elsevier.

Polak, P. H. (1997). Registratiegraad van in ziekenhuizen opgenomen ver-

keersslachtoffers. Technical Report R-97-15, SWOV, Leidschendam, the

Netherlands. in Dutch.

Polak, P. H. (2000). De aantallen in ziekenhuizen opgenomen verkeersge-

wonden, 1985–1997. Technical Report R-2000-26, SWOV, Leidschendam, the

Netherlands. in Dutch.

Polak, P. H. and A. Blokpoel (1998). Schatting van de werkelijke omvang van

de verkeersonveiligheid 1997 (methodiek en resultaten voor ziekenhuisop-

namen). Technical Report R-98-51, SWOV, Leidschendam, the Netherlands.

In Dutch.

Remmerswaal, M. (2007). Het koppelen van verkeersongeval gerelateerde be-

standen met behulp van een afstandsfunctie. Master’s thesis, THRijswijk,

Rijswijk.

Reurings, M. C. B., N. M. Bos, and L. T. B. van Kampen (2007). Berekening van

het werkelijk aantal ziekenhuisgewonden; methodiek en resultaten van kop-

peling en ophoging van bestanden. Technical report, SWOV, Leidschendam,

the Netherlands. (in Dutch, in. prep.).

Reurings, M. C. B. and T. Janssen (2006). Accident prediction models for urban

and rural carriageways. Technical Report R-2006-14, SWOV, Leidschendam,

the Netherlands.

Rice, J. A. (1995). Mathematical statistics and data analysis. Belmont, CA:

Duxbury Press.

Sammel, M. D., L. M. Ryan, and J. M. Legler (1997). Latent variable models

for mixed discrete and continuous outcomes. Journal of the Royal Statistical

Society B 59(3), 667–678.

Schafer, D. W. (1987). Covariate measurement error in generalized linear mod-

els. Biometrika 74(2), 385–391.

Scheffe, H. (1967). The Analysis of Variance (Fifth ed.). London: John Wiley &

Sons, Inc.

Scuffham, P. A. (1998). An Econometric Analysis of Motor Vehicle Traffic Crashes

and Macroeconomic Factors. Ph. D. thesis, Department of Economics. Univer-

sity of Otago., Dunedin.

Scuffham, P. A. and J. D. Langley (2002). A model of traffic crashes in New

Zealand. Accident Analysis and Prevention 34(5), 673–687.

Seber, G. A. F. and C. J. Wild (1988). Nonlinear Regression. New York: John

Wiley & Sons, Inc.

Shorack, G. R. (2000). Probability for Statisticians. New York: Springer-Verlag.

Slootbeek, G. T. (1993). Een vergelijking tussen de BHS-methode en analytis-

che benaderingsformules voor het schatten van relatieve marges van cijfers

uit het OVG. Technical Report BPA no.: 2646-93-M1/INTERN, CBS, Voor-

burg/Heerlen.

Smeed, R. J. (1949). Some statistical aspects of road safety research. Journal of

the Royal Statistical Society A 112(1), 1–34.

Summala, H. and R. Naataanen (1988). The zero-risk theory and overtaking de-

cisions. In J. A. Rothengatter and R. A. de Bruin (Eds.), Road User Behaviour;

Theory and research, pp. 82–92. Assen/Maastricht: van Gorkum.

SWOV (1978). Alcoholgebruik onder automobilisten. verslag en resultaten van

het onderzoek rij- en drinkgewoonten van nederlandse automobilisten in

weekeindnachten in het najaar van de jaren 1970, 1971, 1973, 1974, 1975 en

1977. Technical Report R-78-19, SWOV, Leidschendam, the Netherlands. in

Dutch.

Van den Bossche, F. A. M. (2006). Road Safety, Risk and exposure in Belgium. Ph.

D. thesis, Universiteit Hasselt.

Vonesh, E. F. (1996). A note on the use of Laplace’s approximation for non-

linear mixed-effects models. Biometrika 83(2), 447–452.

Wilde, G. J. S. (1994). Target risk. Toronto: PDE Publications.

Wishner, R. P., J. A. Tabaczynski, and M. Athans (1969). A comparison of three

non-linear filters. Automatica 5, 487–496.

Wolfinger, R. (1993). Laplace’s approximation for nonlinear mixed models.

Biometrika 80(4), 791–795.

Yannis, G., E. Papadimitriou, A. Chaziris, G. Duchamp, P. Lejeune, V. Treny,

S. Hemdorff, M. Haddak, E. Lenguerrand, P. Hollo, A. Angermann,

S. Hoeglinger, J. Cardoso, F. D. Bijleveld, S. Houwing, and T. Bjørnskau

(2008). RED common framework. SafetyNet Deliverable 2.3, NTUA - Na-

tional Technical University of Athens.

Yannis, G., E. Papadimitriou, P. Lejeune, V. Treny, S. Hemdorff, R. Bergel,

M. Haddak, P. Hollo, J. Cardoso, F. D. Bijleveld, S. Houwing, and

T. Bjørnskau (2005). State of the art report on risk and exposure data. Safe-

tyNet Deliverable 2.1, NTUA - National Technical University of Athens.

Author index

Allen, L. 95

Anderson, B. D. O. 64, 121

Andrey, J. 142

Angermann, A. 44

Ansley, C. F. 64, 121

Appel, H. 30, 110

Athans, M. 79

AVV 22, 66, 143

Azzalini, A. 156

Bell, B. M. 79, 130–132, 134

Bell, William R. 97

Bener, A. 90

Bergel, R. 30, 44, 66, 71

Bickel, P. J. 66

Bijleveld, F. D. 9–11, 26, 30, 32, 34, 44,

52, 53, 59, 62, 66, 68, 71, 74, 80, 91,

94, 111, 131, 134, 135, 144, 145

Bjørnskau, T. 30, 44

Blokpoel, A. 71, 72

Bos, J. M. J. 10

Bos, N. M. 71, 72, 76

Box, G. E. P. 10, 66, 110

Boyle, P. P. 95

Braimaister, L. 30, 35, 44, 45

Brodsky, H. 142–144, 150

Broughton, J. 110

Bruning, E. 10, 111

Cameron, A. C. 82

Capitanio, A. 156

Cardoso, J. 30, 44

Cathey, F. W. 79, 130–132, 134

CBS 14, 18, 22, 28, 29, 66, 78, 143

Chan, N. H. 43

Chaziris, A. 44

Christens, P. F. 10

Commandeur, J. J. F. 9, 26, 41, 45, 47,

52, 53, 55, 56, 59, 60, 66, 71, 91, 94,

100, 111, 115, 131, 134, 135, 144, 145

COST329 10, 11

Cox, D. R. 66

de Bruijn, N. G. 131, 133

de Jong, P. 64, 65, 95, 98, 102, 121

Dempster, A. P. 11, 81

Doksum, K. A. 66

Dominici, F. 95

Doornik, J. A. 63

Duchamp, G. 44

Durbin, J. 10, 11, 41, 47, 52, 53, 62–64,

81, 91, 98–100, 111, 115, 117–119,

130, 131, 141, 142, 184, 197, 198

DVS 29

Eisenberg, D. 22, 23, 142, 152

El-Sadig, M. 90

Elvik, R. 11, 36, 37, 46, 177, 180

Ermens, R. J. L. 18, 23

Ernst, G. 10, 111

Evans, A. W. 83

Fahrmeir, L. 11, 95

Feller, W. 31, 83, 93, 187, 188

Fernandes, C. 95

Finkenstadt, B. 95

Finkenstadt, B. F. 95

Fixler, J. B. 50

Foster, G. T. 50

Gaudry, M. 11, 53, 65, 95, 110

Gould, P. G. 9, 10, 26, 52, 53, 94, 111,

131, 134, 135, 144, 145

Grenfell, B. T. 95

Haddak, M. 30, 44

Hakkert, A. S. 30, 35, 44, 45, 142–144,

150

Hampel, F. R. 47, 48

Harvey, A. C. 9–11, 24, 41, 47, 52, 53,

55, 57, 59, 62, 64, 65, 81, 91, 95, 97,

98, 111, 115, 118, 122, 130, 141, 147,

148, 198

Hastie, T. J. 95

Hauer, E. 11, 19, 30, 31, 33, 35–37, 44,

45, 49, 52, 53, 84, 114, 149, 177, 178

Hemdorff, S. 30, 44

Hiselius, L. W. 36

Hoeglinger, S. 44

Hollo, P. 30, 44

Houwing, S. 30, 44

Huber, P 131, 133, 198

Hutchings, C. B. 83

Ivan, J. N. 33

Janssen, T. 36

Jenkins, G. M. 10, 110

Johansson, P. 53, 81, 91, 111

Kalman, R. E. 9, 24, 49, 51–53

Kasevich, M. A. 50

Keay, K. 142

Kim, K. 95

Klugman, S. 95

Knight, S. 83

KNMI 143

Kohn, R. 64, 121

Koopman, S. J. 9, 24, 26, 41, 47, 52, 53,

55, 56, 59, 60, 62–65, 91, 94, 99, 100,

111, 115, 117–119, 130, 131, 134, 135,

141, 142, 144, 145, 184, 197, 198

Koornstra, M. J. 45, 60

Krishnan, T. 11

Langley, J. D. 10

Lassarre, S. 10, 11, 65, 95, 110, 111,

180, 182

Ledolter, J. 95

Lee, C. 95

Lee, Y 131, 141

Legler, J. M. 147

Lejeune, P. 30, 44

Lenguerrand, E. 44

Levitt, S. D. 95

Li, L. 95

Laird, N. M. 11, 81

Lloyd, O. L. 90

Lord, D. 33

Magnus, J. R. 195

Mathijssen, R. 21

McCullagh, P. 83

McDermott, A. 95

McGuirk, J. M. 50

McLachlan, G. J. 11

Moore, J. B. 64, 121

Morton, A. 95

Muth, J. F. 9

Naataanen, R. 16, 80

Nelder, J. A. 83, 131, 141

Neudecker, H. 195

Norman, J. N. 90

Oppe, S. 46, 60, 95, 110

Palma, W. 43

Papadimitriou, E. 30, 44

Penzer, J. 65, 98, 102

Polak, P. H. 71, 72

Porter, J. 95

Reading, J. C. 83

Remmerswaal, M. 71

Reurings, M. C. B. 36, 71, 72, 76

Rice, J. A. 89, 90

Ronchetti, E. M. 47, 48, 131, 133, 198

Rousseeuw, P. J. 47, 48

Rubin, D. B. 11, 81

Ruiz, E. 141

Ryan, L. M. 147

Sammel, M. D. 147

Saunders, A. 95

Schafer, D. W. 81

Scheffe, H. 14

Scuffham, P. A. 10

Seber, G. A. F. 23, 45

Shephard, N. 55, 59, 63, 141

Shorack, G. R. 31

Simmonds, I. 142

Slootbeek, G. T. 18, 19, 59, 62, 145

Smeed, R. J. 110

Stahel, W. A. 47, 48

Summala, H. 16, 80

SWOV 20

Tabaczynski, J. A. 79

Tiao, G. C. 10

Treny, V. 30, 44

Trivedi, P. K. 82

Tutz, G. 11

Vaa, T. 11, 36, 37, 46, 177, 180

Van den Bossche, F. A. M. 10

van Kampen, L. T. B. 71, 72, 76

van Montfort, K. 91

van Vliet, J. S. N. 18, 23

Victoria-Feser, M.-P. 131, 133, 198

Vonesh, E. F. 131

Wagenpfeil, S. 95

Washington, S. P. 33

Wild, C. J. 23, 45

Wilde, G. J. S. 16, 80

Wishner, R. P. 79

Wolfinger, R. 131, 133, 134

Yagar, S. 142

Yannis, G. 30, 44

Appendix A. Some robustness aspects of the latent risk time series model

As already discussed in Chapter 2, in many road safety studies a non-linear

relation between traffic volume and the number of accidents is assumed. One

specific non-linear relation is often used in practice: the power function dis-

cussed in Elvik and Vaa (2004, p 49). This appendix details how this non-linear

relation is actually nested within the LRT model. The appendix also shows

that the explicit estimation of a regression coefficient for stochastically treated

explanatory variables like traffic volume becomes redundant in the LRT frame-

work. Finally, Appendix A.3 explores what happens when an explanatory var-

iable is treated stochastically in the LRT by including it in the state, and when

the chosen model predicts a next observation from its predecessor with a large

prediction error. It is shown that the explanatory variable then tends to be

treated as a fixed and known variable not subject to measurement error.

A.1. The non-linear relation between exposure and the number of accidents revisited

In the latent risk time series model (LRT) presented in this thesis, a log-linear

relation is assumed between the latent exposure and the number of accidents.

It results from taking the logarithm of both sides of the second equation of

(2.2) or (3.6), which yields (3.7); with error terms suppressed, this is:

\[
\log(\text{Number of accidents}) = \log(\text{exposure}) + \log(\text{risk}). \tag{A.1}
\]

This log-linear relation contrasts with the non-linear relation presented in Hauer

(1995) (see also (2.3)):

\[
\text{Number of accidents} = f(\text{exposure}), \tag{A.2}
\]

and with the relation described in the handbook of Elvik and Vaa (2004, p 49):

\[
\text{Number of accidents} = \alpha\, Q^{b}, \tag{A.3}
\]

where Q is a measure of traffic volume, and α is a general risk coefficient (note

that Q and α have a different meaning elsewhere in this thesis). In (A.3) the

function f of (A.2) is thus assumed to be a power function. Taking the loga-

rithm of (A.3) we obtain

\[
\log(\text{Number of accidents}) = b \log(\text{exposure}) + \log(\text{risk}), \tag{A.4}
\]

where α is replaced by risk, and Q is replaced by exposure (see also (3.8)). The

latter relation between the number of accidents and exposure is assumed in

many road safety studies.
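The step from (A.3) to (A.4) can be checked numerically. The following is a minimal sketch, not part of the thesis: the values of `alpha`, `b` and the traffic volumes `Q` are hypothetical and the data are noise-free. It shows that a straight-line fit on the log-log scale recovers the exponent of the power function as the slope and the log of the risk coefficient as the intercept.

```python
import numpy as np

# Hypothetical values, chosen only for illustration (not taken from the thesis).
alpha, b = 0.8, 0.7
Q = np.array([50.0, 100.0, 200.0, 400.0, 800.0])   # traffic volume
accidents = alpha * Q ** b                          # power function (A.3), no noise

# Straight-line fit on the log-log scale, i.e. relation (A.4):
# the slope estimates b and the intercept estimates log(alpha).
b_hat, log_alpha_hat = np.polyfit(np.log(Q), np.log(accidents), 1)

print(b_hat, np.exp(log_alpha_hat))
```

With noisy counts the fit would of course only approximate these values.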

In this section it will be demonstrated that the LRT approach based on (A.1)

is a reasonable choice, and at least as good as the functional relation shown

in (A.4). Moreover, if required the non-linear and non-Gaussian extensions of

the LRT discussed in Chapter 6 and Chapter 7 can be used to fit any of the

relations expressed in (A.1)–(A.4) as well as (2.2) or (3.6), and the general form

(A.2) in particular.

That the LRT approach is a reasonable choice can be based on the following

arguments:

• In his paper on the non-linearity of the functional form f in (A.2) (which

he calls a safety performance function) Hauer (1995, p. 134) states “It tells

how the average number of accidents in a specified period of time would

be changing if exposure changed while all other conditions affecting ac-

cident occurrence remained fixed”.

However, when road safety time series data span a long period of time

it cannot always be assumed that all other conditions affecting accident

occurrence remain fixed. This means that even if the functional form of

the relation between number of accidents and exposure f is known for

one point in time, it will not necessarily remain constant over the whole

period of time, if only due to the introduction of effective road safety

measures. In practice, a series of functions $f_t$ may therefore be considered

where $t = 1, \ldots, n$ and $n$ is the number of time points.

• In many applications in road safety time series analysis only one observa-

tion per time point per safety performance function is available. For in-

stance in the application in Chapter 6, for each time point there is data for

inside and outside urban areas, which both should have separate safety

performance functions. This means that we cannot infer the shape of f

from data available at each time point.

• The exact shape of f in (A.2) is not generally known. In road safety re-

search the power function (A.3) is assumed in most cases. This approach

is particularly convenient in log-linear modelling, where (A.4) can be fit-

ted. This approach is often considered appropriate.

• As will be demonstrated in Appendix A.2, (A.4) is nested within (a spe-

cial case of) the LRT approach. Therefore, the latter functional form is at

least as good as the former.

As a consequence of these arguments, but mainly the first two, in most time

series applications there will be no possibility to estimate the shape of $f_t$ for

each time point.

If exposure is accurately measured and relevant, one possible way to consider

the time varying nature of f is to use a time varying regression approach as

follows:

\[
\log(\text{Number of accidents}_t) = \mu_t \times \log(\text{exposure}_t) + \log(\text{risk}_t), \tag{A.5}
\]

where both $\mu_t$ and $\log(\text{risk}_t)$ can be treated as local linear trend models (or

other more appropriate models if needed). This solution implies a time-varying

extension of (A.3):

\[
\text{Number of accidents}_t = \text{risk}_t \times \text{exposure}_t^{\mu_t}.
\]

In (A.5), $\log(\text{risk}_t)$ and $\mu_t$ may be interpreted as representing the parameters of

an approximation to $f_t$ at $\text{exposure}_t$.
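One way to read this interpretation, assumed here rather than stated in the thesis, is as a first-order expansion of the log of the safety performance function in the log of exposure: $\mu_t$ is then the local slope on the log-log scale and $\log(\text{risk}_t)$ the matching intercept. The sketch below illustrates this for a hypothetical saturating function `f`; the function, its constants and the helper `local_power_approx` are illustrative only.

```python
import numpy as np

def local_power_approx(f, x0, h=1e-6):
    """Return (mu, log_risk) such that exp(log_risk) * x**mu is tangent to f at x0
    on the log-log scale (first-order approximation)."""
    lx0 = np.log(x0)
    # Central difference of log f with respect to log x: the local exponent.
    mu = (np.log(f(np.exp(lx0 + h))) - np.log(f(np.exp(lx0 - h)))) / (2 * h)
    # Intercept chosen so that the power function passes through (x0, f(x0)).
    log_risk = np.log(f(x0)) - mu * lx0
    return mu, log_risk

def f(x):
    """Hypothetical safety performance function that flattens at high exposure."""
    return 5.0 * x / (1.0 + 0.02 * x)

x0 = 100.0
mu, log_risk = local_power_approx(f, x0)
approx = np.exp(log_risk) * x0 ** mu   # value of the local power function at x0

print(mu, approx, f(x0))
```

For this particular `f` the exact local exponent is $1/(1 + 0.02\,x_0)$, so the exponent changes whenever exposure changes, which is the situation that a time-varying $\mu_t$ is meant to accommodate.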

Note that in case traffic volume is used to estimate exposure, in (3.7)

$\log(\text{exposure}_t)$ is equal to $\log(\text{Traffic volume}_t)$ plus additive noise, while in

(A.5), $\mu_t \times \log(\text{exposure}_t)$ is equal to $\log(\text{Traffic volume}_t)$ times multiplicative

noise (not necessarily with expected value nil). In such cases, for example in

shorter time series, even (3.7) and (A.5) may be hard to distinguish, in particular

when the innovations of $\mu_t$ correlate with the innovations of $\log(\text{risk}_t)$.

Usually $\text{exposure}_t$ is measured with error, and may also be subject to fluctuations

in its relevance for road safety over time. In that case, $\text{exposure}_t$ in (A.5)

can be treated stochastically, just like $\text{risk}_t$ and $\mu_t$. However, (A.5) will then

probably be even more difficult to distinguish from a time series version of

(3.7) for shorter time series. Moreover, although the developments of $\log(\text{risk}_t)$

and $\log(\text{exposure}_t)$ are quite linear for the example shown in Figure 3.2, this

will not always be true in practice, in which case more elaborate models are

required.

Summarising, the possibility has to be considered that the functional form

of the non-linear relation between the number of accidents and exposure as

well as the effect of other influences on this relation change over time. Unlike

cross-sectional studies where observations are obtained over a short period of

time, in time series analysis the assumption of ‘all other variables as much as

is possible held constant’ is not very realistic, and one would therefore need to

add an additional ‘relative risk due to other influences’ latent variable to the

model. This turns the reliable identification of parameters in a time-evolving

non-linear function $f_t$ into a near impossible task. In the LRT, the ‘relative risk

due to other influences’ variable is included in the already mentioned latent

risk variable. Moreover, in the LRT the development of exposure, or even traf-

fic volume itself, is allowed to influence the development of risk directly by

choosing a suitably defined dynamic covariance matrix (see Appendix A.2).

A.2. Elimination of the b coefficient.

As already mentioned in Appendix A.1, Elvik and Vaa (2004, p 49) discussed

the non-linear relation between the number of accidents and exposure ex-

pressed in (A.3). In Lassarre (2001) accident counts are analysed for a number

of countries and the same relation is implemented in a log-linear state space

model, as follows:

\[
\begin{aligned}
\log(y_t) &= m_t + b \log(v_t), \\
m_t &= m_{t-1} + b_{t-1}, \\
b_t &= b_{t-1}.
\end{aligned}
\tag{A.6}
\]

In (A.6), $y_t$ is the number of fatalities, $v_t$ is the observed traffic volume (not the

latent exposure), $m_t$ is the level of log-risk, $b_t$ is the slope of that level, and $b$

(without subscript) is the regression coefficient for traffic volume. Equation (A.6)

is a simplified version of the model discussed

by Lassarre (2001, p.745) since stochastic components are suppressed in (A.6),

and Lassarre also considers intervention variables and autoregressive compo-

nents. Moreover, where Lassarre (2001) used the symbol η for the regression

coefficient of traffic volume, here we follow Elvik and Vaa (2004)’s use of b for

this parameter.

In the latent risk time series model it could be considered to estimate the same

parameter b but now for the latent exposure:

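\[
\begin{cases}
\log \text{Traffic volume}_t = \mu^{(e)}_t + \varepsilon^{(e)}_t \\
\log \text{Fatalities}_t = b\,\mu^{(e)}_t + \mu^{(f)}_t + \varepsilon^{(f)}_t,
\end{cases}
\tag{A.7}
\]

where $\mu^{(e)}_t$ and $\mu^{(f)}_t$, the latent variables exposure and risk, are treated as local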

linear trend models. Here it will be shown that parameter b in (A.7) becomes

unidentifiable when the disturbances of the latent variables exposure and risk

are allowed to covary. Letting

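\[
T = \begin{pmatrix}
1 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 \\
0 & 0 & 0 & 1
\end{pmatrix}
\quad \text{and} \quad
Z = \begin{pmatrix}
1 & 0 & 0 & 0 \\
1 & 0 & 1 & 0
\end{pmatrix}
\]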

the LRT can be written as

\[
\begin{aligned}
\alpha_{t+1} &= T \alpha_t + \eta_t, \\
y_t &= Z \alpha_t + \varepsilon_t,
\end{aligned}
\]

where $Q = \mathrm{var}(\eta_t)$ and $H = \mathrm{var}(\varepsilon_t)$ as before. We can estimate the non-zero

elements of Q and H by maximum likelihood. For given matrices T, Z, Q

and H and an invertible linear transformation matrix F, we can rearrange this

model into

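\[
\begin{aligned}
F\alpha_{t+1} &= \left(F\,T\,F^{-1}\right) F\alpha_t + F\eta_t,\\
y_t &= \left(Z F^{-1}\right) F\alpha_t + \varepsilon_t,
\end{aligned}
\]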

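without affecting the value of the likelihood function. Defining $\alpha^{(b)}_t = F\alpha_t$,
$\eta^{(b)}_t = F\eta_t$, $Z^{(b)} = Z F^{-1}$ and $Q^{(b)} = F Q F'$, the latter model can be written as

\[
\begin{aligned}
\alpha^{(b)}_{t+1} &= T\alpha^{(b)}_t + \eta^{(b)}_t,\\
y_t &= Z^{(b)}\alpha^{(b)}_t + \varepsilon_t,
\end{aligned}
\]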


which still gives the same value of the likelihood function. We can prove the

assertion by solving the system of equations

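\[
\begin{aligned}
T &= F\,T\,F^{-1},\\
Z F^{-1} &= Z^{(b)} = \begin{pmatrix}
1 & 0 & 0 & 0\\
b & 0 & 1 & 0
\end{pmatrix}
\end{aligned}
\tag{A.8}
\]

for $F$, yielding

\[
F = \begin{pmatrix}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
1-b & 0 & 1 & 0\\
0 & 1-b & 0 & 1
\end{pmatrix}.
\]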

These state space representations are equivalent in terms of likelihood only if

Q(b) = F Q F′. If, for some reason, the representation of Q in the estimation

process is restricted (for instance assuming the innovations of the slopes or

levels being uncorrelated), there may be a unique and under these restrictions

optimal likelihood solution of F, and thus b may then be identifiable. Oth-

erwise, the likelihood will not improve by changing the value of b, and b is

therefore not identifiable by maximum likelihood.
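As a sanity check on the solution of (A.8), the following minimal Python sketch verifies with exact rational arithmetic that the matrix F above commutes with T (which is equivalent to T = F T F⁻¹) and that Z^(b) F = Z (which is equivalent to Z F⁻¹ = Z^(b)). The function names `matmul`, `transformation` and `satisfies_a8` are illustrative and not part of the thesis.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


# T and Z of the LRT as defined in the text.
T = [[1, 1, 0, 0], [0, 1, 0, 0], [0, 0, 1, 1], [0, 0, 0, 1]]
Z = [[1, 0, 0, 0], [1, 0, 1, 0]]


def transformation(b):
    """The matrix F that solves (A.8) for a given value of b."""
    return [[1, 0, 0, 0],
            [0, 1, 0, 0],
            [1 - b, 0, 1, 0],
            [0, 1 - b, 0, 1]]


def satisfies_a8(b):
    """Check both conditions of (A.8) without inverting F."""
    F = transformation(b)
    Z_b = [[1, 0, 0, 0], [b, 0, 1, 0]]
    commutes = matmul(F, T) == matmul(T, F)   # T = F T F^{-1}
    maps_z = matmul(Z_b, F) == Z              # Z F^{-1} = Z^(b)
    return commutes and maps_z


# Try a grid of values of b between -2 and 2.
print(all(satisfies_a8(Fraction(k, 10)) for k in range(-20, 21)))
```

Because both conditions hold for every b tried, the likelihood cannot discriminate between values of b unless Q is restricted, in line with the argument above.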

One consequence of this result is that the solution based on b = 0 is also equiv-

alent in likelihood. Specifically,

\[
Z^{*} = \begin{pmatrix}
1 & 0 & 1 & 0\\
0 & 0 & 1 & 0
\end{pmatrix}
\]

is also equivalent, where the component on the position of the original expo-

sure component plays the role of ‘kilometres driven per accident’.

Compared to the state space model (A.6) discussed in Lassarre (2001), the LRT

requires the estimation of four extra parameters: one for the potential covari-

ance between the observation disturbances (which should probably be present

anyway), two for the variances of the disturbances of the level and slope com-

ponents of the latent exposure, and two for the covariances of the disturbances

between the slope and level components. However, the estimation of param-

eter b is not required, and some of the just mentioned extra parameters may

not be significant. Moreover, if traffic volume is measured under observation

error, the LRT should be more appropriate.


A heuristic explanation of this result is that when

\[ \log(y_t) = b \log(\text{exposure}) + \log(\text{risk}) \]

holds,

\[ \log(y_t) = \log(\text{exposure}) + \log(\text{risk}') \]

is modelled instead, where

\[ \log(\text{risk}') = \log(\text{risk}) + (b-1) \log(\text{exposure}). \tag{A.9} \]

Note that the derivation of the matrix $F$ in (A.8) transforms a model based on
$\log(y_t) = \log(\text{exposure}) + \log(\text{risk})$ into a model based on
$\log(y_t) = b \log(\text{exposure}) + \log(\text{risk}')$, whereas (A.9) transforms a model
based on $\log(y_t) = b \log(\text{exposure}) + \log(\text{risk})$ into
$\log(y_t) = \log(\text{exposure}) + \log(\text{risk}')$; hence the difference between the
coefficient $1-b$ in $F$ and $(b-1)$ in (A.9). From (A.9) it can be seen that (A.8)
amounts to adding a coefficient times $\log(\text{exposure})$ to the $\log(\text{risk})$
component. If this operation maps one model onto another model within the same
class, it does not affect the likelihood. If $\log(\text{exposure})$ is assumed to be a
local linear trend model and $\log(\text{risk})$ a local level model, this is not the case,
as $\log(\text{risk}) + (b-1)\log(\text{exposure})$ is then a local linear trend model. Also,
if the innovations of $\log(\text{exposure})$ and $\log(\text{risk})$ are uncorrelated, the
innovations of $\log(\text{exposure})$ and $\log(\text{risk}) + (b-1)\log(\text{exposure})$ will in
general be correlated. If the model class assumes the innovations to be
uncorrelated, the operation (A.8) does not map the model onto a model within
the same model class. Assumptions of non-Gaussian innovation distributions may
also result in a mapping onto a model outside the model class, and thus result
in an identifiable $b$.

A.3. Consequences of large dynamic prediction error for explanatory variables

The modelling strategy introduced in this thesis suggests including explanatory
variables in the state when their observation, or their effect on the road
safety system, may be subject to error. In its simplest form this means that
one latent variable is added to the model for each such explanatory variable,
now to be called a latent explanatory variable. However, not all explanatory
variables necessarily have a well specified dynamic relation. More generally,
the dynamic prediction may be subject to large random error. This section
briefly explores the consequences of including an explanatory variable as a
latent explanatory variable in the model when its dynamic prediction is poorly
described by, for example, a local level model. When the prediction is subject
to large random error, this results in a relatively large element for this
component in the dynamic variance matrix $Q_t$, as happens for the component
representing the fraction of time with rain in the weather application discussed
in Chapter 7.

Using a simple example it will now be demonstrated that the smoothed predictions
for exposure based on the LRT tend to the observed explanatory values used in a
linear regression model if the prediction variance tends to infinity and other
conditions match those of classical linear regression. This means that the
co-variation between the innovations of the state components remains finite (and
their correlation tends to nil). It is further assumed that the observation
error variance of the explanatory variable tends to nil, as is the case in the
classical linear regression model.

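The filter and smoothing equations can be written as (Durbin and Koopman,
2001, equations (4.14) and (4.27))

\[
\begin{aligned}
v_t &= y_t - Z_t a_t, & F_t &= Z_t P_t Z_t' + H_t,\\
M_t &= P_t Z_t', & &\\
a_{t|t} &= a_t + M_t F_t^{-1} v_t, & P_{t|t} &= P_t - M_t F_t^{-1} M_t',\\
a_{t+1} &= T_t a_{t|t}, & P_{t+1} &= T_t P_{t|t} T_t' + R_t Q_t R_t',
\end{aligned}
\qquad t = 1, \ldots, n,
\]

and

\[ \alpha_t = a_{t|t} + P_{t|t} T_t' P_{t+1}^{-1} \left(\alpha_{t+1} - a_{t+1}\right), \]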

where $\alpha_t$ is the smoothed state at time $t$, and $F_t$ is the variance matrix of the
one-step-ahead prediction errors (and not a linear transformation matrix as in
the previous section). We consider a simple LRT where the latent variables
exposure and risk are both treated as a local level model. In this case, $T_t$ and
$R_t$ are identity matrices of order $2 \times 2$. Letting the first element of the
observation vector $y_t$ and the first element of the state vector ($a_{\bullet}$) consist of
the observed traffic volume and of the level component of the latent exposure,
respectively, we have that:

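\[
Z_t = \begin{pmatrix} 1 & 0\\ 1 & 1 \end{pmatrix},
\qquad
Q_t = \begin{pmatrix} q_{11} & q_{12}\\ q_{12} & q_{22} \end{pmatrix},
\]

\[
H_t = \begin{pmatrix} h_{11} & h_{12}\\ h_{12} & h_{22} \end{pmatrix},
\qquad
P_{t-1|t-1} = \begin{pmatrix} p_{11} & p_{12}\\ p_{12} & p_{22} \end{pmatrix},
\]

\[
P_t = \begin{pmatrix} p_{11} + q_{11} & p_{12} + q_{12}\\ p_{12} + q_{12} & p_{22} + q_{22} \end{pmatrix}.
\]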

In classical linear regression the elements $h_{11}$ and $h_{12}$ of the observation error
variance matrix $H_t$ are assumed to be nil. For the situation where $q_{11} \to \infty$ and
$h_{11} = h_{12} = 0$ it will be demonstrated that the first element of $a_{t|t}$ tends to the
first element of $y_t$ (i.e., the observed traffic volume), as in that case
$P_{t|t} T_t' P_{t+1}^{-1}$ and $P_{t|t}$ both tend to a matrix with only one non-zero element:

\[ \begin{pmatrix} 0 & 0\\ 0 & * \end{pmatrix}. \]

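We start by noting that $F_t$ and $M_t F_t^{-1}$ (where $h_{11} = h_{12} = 0$) can now be written
as:

\[
F_t = \begin{pmatrix}
p_{11} + q_{11} & p_{11} + p_{12} + q_{11} + q_{12}\\
p_{11} + p_{12} + q_{11} + q_{12} & h_{22} + p_{11} + 2p_{12} + p_{22} + q_{11} + 2q_{12} + q_{22}
\end{pmatrix},
\]

and

\[
\left[ M_t F_t^{-1} \right]_{h_{11}=0,\; h_{12}=0} =
\begin{pmatrix}
1 & 0\\[4pt]
\dfrac{(p_{11}+q_{11})(p_{22}+q_{22}) - (p_{12}+q_{12})(h_{22}+p_{12}+q_{12})}{(p_{12}+q_{12})^2 - (p_{11}+q_{11})(h_{22}+p_{22}+q_{22})}
&
\dfrac{(p_{12}+q_{12})^2 - (p_{11}+q_{11})(p_{22}+q_{22})}{(p_{12}+q_{12})^2 - (p_{11}+q_{11})(h_{22}+p_{22}+q_{22})}
\end{pmatrix}.
\]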

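Substitution of

\[
A := \left[ \lim_{q_{11} \to \infty} M_t F_t^{-1} \right]_{h_{11}=0,\; h_{12}=0} =
\begin{pmatrix}
1 & 0\\[4pt]
-\dfrac{p_{22}+q_{22}}{h_{22}+p_{22}+q_{22}} & \dfrac{p_{22}+q_{22}}{h_{22}+p_{22}+q_{22}}
\end{pmatrix},
\]

and

\[
B := \left[ \lim_{q_{11} \to \infty} M_t F_t^{-1} Z_t \right]_{h_{11}=0,\; h_{12}=0} =
\begin{pmatrix}
1 & 0\\[4pt]
0 & \dfrac{p_{22}+q_{22}}{h_{22}+p_{22}+q_{22}}
\end{pmatrix},
\]

in $a_{t|t} = a_t + M_t F_t^{-1} v_t$ yields

\[ a_{t|t} = (I - B) a_t + A y_t, \]

since $v_t = y_t - Z_t a_t$. It follows that

\[
a_{t|t} = \begin{pmatrix}
y_{t1}\\[4pt]
\dfrac{h_{22} a_2 - (p_{22}+q_{22})(y_{t1} - y_{t2})}{h_{22}+p_{22}+q_{22}}
\end{pmatrix}.
\]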

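Similarly, it can be proven that

\[
P_{t|t} = \begin{pmatrix}
0 & 0\\[4pt]
0 & \dfrac{h_{22}(p_{22}+q_{22})}{h_{22}+p_{22}+q_{22}}
\end{pmatrix},
\]

and

\[
P_{t|t} T_t' P_{t+1}^{-1} \to \begin{pmatrix}
0 & 0\\[4pt]
0 & \dfrac{h_{22}(p_{22}+q_{22})}{q_{22}(p_{22}+q_{22}) + h_{22}(p_{22}+2q_{22})}
\end{pmatrix}.
\]

Therefore in this situation only the second element of the state vector is
updated in the smoothing equation

\[ \alpha_t = a_{t|t} + P_{t|t} T_t' P_{t+1}^{-1} \left(\alpha_{t+1} - a_{t+1}\right). \]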

This again means that the smoothed level of the latent exposure is equal to

the filtered level of the latent exposure, which in turn is equal to the observed

value for traffic volume.

Concluding, the more the LRT is unable to capture the dynamic structure of

a latent explanatory variable like exposure (as indicated by a very large value

of its dynamic variance), the more the latent explanatory exposure tends to

become equal to the observed traffic volume. The same applies to any other

explanatory variable that is included in the state of the model.
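The limiting behaviour of the filtered state can also be checked numerically. The sketch below is an illustration only: it carries out one filtering step of the bivariate local level LRT (with T_t = R_t = I and h11 = h12 = 0) for a very large q11 and compares the result with the limit derived above. The function names and all numerical values are arbitrary choices made here, not taken from the thesis.

```python
def filtered_state(a, p, q, h22, y):
    """One filtering step a_{t|t} for the bivariate local level LRT.

    a   : predicted state (a1, a2)
    p   : (p11, p12, p22), elements of P_{t-1|t-1}
    q   : (q11, q12, q22), elements of Q_t
    h22 : observation variance of the second series (h11 = h12 = 0)
    y   : observations (y1, y2)
    """
    P11, P12, P22 = p[0] + q[0], p[1] + q[1], p[2] + q[2]
    # F_t = Z P Z' + H with Z = [[1, 0], [1, 1]]
    f11, f12 = P11, P11 + P12
    f22 = h22 + P11 + 2 * P12 + P22
    det = f11 * f22 - f12 * f12
    # M_t = P Z'
    m11, m12 = P11, P11 + P12
    m21, m22 = P12, P12 + P22
    # K = M_t F_t^{-1}, using the explicit inverse of a 2 x 2 matrix
    k11 = (m11 * f22 - m12 * f12) / det
    k12 = (-m11 * f12 + m12 * f11) / det
    k21 = (m21 * f22 - m22 * f12) / det
    k22 = (-m21 * f12 + m22 * f11) / det
    # Prediction errors v_t = y_t - Z a_t
    v1 = y[0] - a[0]
    v2 = y[1] - (a[0] + a[1])
    return (a[0] + k11 * v1 + k12 * v2, a[1] + k21 * v1 + k22 * v2)


def limit_second_element(a2, p22, q22, h22, y):
    """Second element of a_{t|t} in the limit q11 -> infinity."""
    return (h22 * a2 - (p22 + q22) * (y[0] - y[1])) / (h22 + p22 + q22)


a = (2.0, -1.0)
y = (2.7, 1.5)
a1, a2 = filtered_state(a, (0.5, 0.1, 0.4), (1e8, 0.05, 0.2), 0.3, y)
print(a1, y[0])                                          # filtered exposure vs observed volume
print(a2, limit_second_element(a[1], 0.4, 0.2, 0.3, y))  # filtered risk vs its limit
```

With q11 this large, the filtered exposure level is numerically indistinguishable from the observed traffic volume, which is the behaviour described in the conclusion of this section.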


Appendix B. The covariance structure of accident related outcomes

Here it is shown how to derive an expression for the variance-covariance matrix
of the number of (injury) accidents and victims. It is assumed that a basic
simplification can be used: the number of victims per accident is identically
distributed with finite variance for all accidents, although it may be possible
to relax this assumption. No further assumptions on the shape of the
distribution of the accident outcomes are made. The derivation extends the
result in Feller (1968, page 286).

B.1. The expected value and variance of the number of victims

First define $N$ as the number of accidents in a certain period of time. $N$ is
assumed to be Poisson distributed with parameter $\lambda$.

Let the stochastic variables $V_i$ ($i = 1, \ldots, N$) denote the number of victims in
accident $i$. The $V_i$ are assumed to be independent and identically distributed.
The distribution of the $V_i$ has characteristic function $\varphi(t)$ and expected value $\mu$,
and all moments are finite. The symbol $v_i$ is used to denote a realisation of the
number of victims in accident $i$.

Let the total number of victims be $V$, defined as $V = \sum_{i=1}^{N} V_i$; thus $V$ is a sum
over a Poisson distributed random number ($N$) of accidents. Defining $\Phi(t)$ as
the characteristic function of the distribution of $V$, we have that

\[ \Phi(t) = E\left(e^{\mathrm{i}tV}\right) = E\left(E\left(e^{\mathrm{i}tV} \mid N\right)\right), \tag{B.1} \]

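where $\mathrm{i}$ is the imaginary unit ($\mathrm{i}^2 = -1$). Since

\[
E\left(e^{\mathrm{i}tV} \mid N = n\right)
= E\left(e^{\mathrm{i}t \sum_{i=1}^{n} V_i} \mid N = n\right)
= E\left(\prod_{i=1}^{n} e^{\mathrm{i}tV_i}\right)
= \prod_{i=1}^{n} \varphi(t) = \varphi^n(t)
\tag{B.2}
\]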


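then substituting (B.2) in (B.1) and because $N$ follows a Poisson distribution,
we get

\[
\Phi(t) = E\left(\varphi^N(t)\right)
= e^{-\lambda} \sum_{n=0}^{\infty} \frac{\varphi^n(t)\lambda^n}{n!}
= e^{-\lambda} \sum_{n=0}^{\infty} \frac{(\lambda\varphi(t))^n}{n!}.
\]

Since $e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}$ (see Feller, 1968, page 286) we obtain

\[ \Phi(t) = e^{-\lambda + \lambda\varphi(t)} = e^{\lambda(\varphi(t) - 1)}. \tag{B.3} \]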

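As $E(|V|^3)$ exists and is finite, $E(V) = \mathrm{i}^{-1}\Phi'(0)$ and $E(V^2) = -\Phi''(0)$. Because
$\varphi(0) = 1$, $\Phi(0) = 1$, $\varphi'(0) = \mathrm{i}\,E(V_k) = \mathrm{i}\mu$ and $\varphi''(0) = -E(V_k^2)$, we get the
following expected value for the total number of victims $V$:

\[ E(V) = \mathrm{i}^{-1} \left[ \lambda\varphi'(t)\Phi(t) \right]_{t=0} = \mathrm{i}^{-1}\lambda\varphi'(0) = \lambda\mu. \tag{B.4} \]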

This quantity can be estimated using:

\[ m(V) = \sum_{i=1}^{N} v_i. \tag{B.5} \]

This estimator is unbiased since

\[ E(m(V)) = E\left(E\left(\sum_{i=1}^{N} v_i \mid N\right)\right) = E(N\mu) = \lambda\mu. \]

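The variance of the total number of victims $V$ is $\sigma^2(V) = E(V^2) - E^2(V)$. Because

\[
E(V^2) = -\left[ \lambda\varphi''(t)\Phi(t) + (\lambda\varphi'(t))^2\Phi(t) \right]_{t=0}
= -\lambda\varphi''(0) - (\lambda\varphi'(0))^2
= \lambda E(V_k^2) + (\lambda\mu)^2
\]

we obtain

\[ \sigma^2(V) = \lambda E(V_k^2). \tag{B.6} \]

This can be estimated using

\[ s^2(V) = \sum_{i=1}^{N} v_i^2. \tag{B.7} \]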


Again, this estimator is unbiased since

\[
E(s^2(V)) = E\left(E\left(\sum_{i=1}^{N} v_i^2 \mid N\right)\right)
= E\left(N\,E(V_k^2)\right) = \lambda E(V_k^2).
\]

The expected value and the variance of the number of fatalities can be derived

in the same way.

B.2. The covariance between the number of accidents and

the number of victims

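The covariance between the number of injury accidents and the number of
victims is more complicated. Its derivation is based on the same characteristic
function argument as used above. The characteristic function of the random
vector $(N, V)$ is defined as

\[ \Phi(s, t) = E\left(e^{\mathrm{i}sN + \mathrm{i}tV}\right) \equiv E\left(f(N) \times g(V)\right). \]

Using the same property of conditional expectations

\[ E\left(E\left(f(N) \times g(V) \mid N\right)\right) = E\left(f(N)\,E(g(V) \mid N)\right) \]

then using (B.2) we obtain

\[
\Phi(s, t) = E\left(e^{\mathrm{i}sN}\varphi^N(t)\right)
= e^{-\lambda} \sum_{k=0}^{\infty} \frac{(\lambda\varphi(t)e^{\mathrm{i}s})^k}{k!}
= e^{-\lambda} e^{\lambda\varphi(t)e^{\mathrm{i}s}}
= e^{\lambda(\varphi(t)e^{\mathrm{i}s} - 1)}.
\tag{B.8}
\]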

In order to derive the covariance, we have $E(N) = \lambda$ by the Poisson law of $N$
and $E(V) = \lambda\mu$ is already available in (B.4). In order to complete the derivation
of the covariance, we need to evaluate

\[ E(N V) = -\left[ \frac{\partial^2 \Phi(s, t)}{\partial s\,\partial t} \right]_{s=t=0}. \tag{B.9} \]

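The derivative of $\Phi$ with respect to $s$ is

\[ \frac{\partial \Phi(s, t)}{\partial s} = \mathrm{i}\lambda\varphi(t)e^{\mathrm{i}s}\Phi(s, t) \]

and the derivative of the latter with respect to $t$ is

\[
\begin{aligned}
\frac{\partial^2 \Phi(s, t)}{\partial s\,\partial t}
&= \mathrm{i}\lambda\varphi(t)e^{\mathrm{i}s}\,\lambda e^{\mathrm{i}s}\varphi'(t)\Phi(s, t) + \mathrm{i}\lambda\varphi'(t)e^{\mathrm{i}s}\Phi(s, t)\\
&= \mathrm{i}\lambda e^{\mathrm{i}s}\Phi(s, t)\varphi'(t)\left(\varphi(t)\lambda e^{\mathrm{i}s} + 1\right).
\end{aligned}
\]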

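Because $\Phi(0, 0) = \varphi(0) = 1$, and $\varphi'(0) = \mathrm{i}\mu$ it follows that

\[ \left[ \frac{\partial^2 \Phi(s, t)}{\partial s\,\partial t} \right]_{s=t=0} = \mathrm{i}^2\lambda\mu(\lambda + 1). \]

Therefore

\[ E(N V) = \lambda\mu(\lambda + 1) \tag{B.10} \]

and thus

\[ \mathrm{Cov}(N, V) = \lambda^2\mu + \lambda\mu - \lambda\,\lambda\mu = \lambda\mu. \tag{B.11} \]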

This quantity can be estimated using

\[ s(N, V) = \sum_{i=1}^{N} v_i. \tag{B.12} \]

Again, this estimator is unbiased because

\[ E(s(N, V)) = E\left(E\left(\sum_{i=1}^{N} v_i \mid N\right)\right) = E(N\mu) = \lambda\mu. \tag{B.13} \]

The covariance between the number of accidents and the number of fatalities

can be derived in the same manner.
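The results (B.4), (B.6) and (B.11) can be cross-checked without characteristic functions by conditioning on N and summing over the Poisson distribution directly. The sketch below does this for an arbitrary, purely illustrative distribution of the number of victims per accident; the function names and the truncation point `n_max` are assumptions made here, not part of the thesis.

```python
import math


def poisson_pmf(n, lam):
    """P(N = n) for N ~ Poisson(lam), computed on the log scale."""
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))


def compound_moments(lam, victims_pmf, n_max=200):
    """E(V), Var(V) and Cov(N, V) by conditioning on N.

    victims_pmf maps a number of victims per accident to its probability.
    The sum over N is truncated at n_max, which should be far in the tail.
    """
    mu = sum(k * p for k, p in victims_pmf.items())       # E(V_k)
    m2 = sum(k * k * p for k, p in victims_pmf.items())   # E(V_k^2)
    var_k = m2 - mu * mu
    e_v = e_v2 = e_nv = 0.0
    for n in range(n_max + 1):
        w = poisson_pmf(n, lam)
        e_v += w * n * mu                                 # E(V | N = n) = n mu
        e_v2 += w * (n * var_k + (n * mu) ** 2)           # E(V^2 | N = n)
        e_nv += w * n * n * mu                            # E(N V | N = n)
    return e_v, e_v2 - e_v ** 2, e_nv - lam * e_v


lam = 12.0
pmf = {1: 0.7, 2: 0.2, 3: 0.1}   # illustrative: mu = 1.4, E(V_k^2) = 2.4
e_v, var_v, cov_nv = compound_moments(lam, pmf)
print(e_v, lam * 1.4)      # compare with (B.4)
print(var_v, lam * 2.4)    # compare with (B.6)
print(cov_nv, lam * 1.4)   # compare with (B.11)
```

Each pair of printed values should agree up to rounding and truncation error.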

B.3. The covariance between the number of victims and the

number of fatalities

Let the random variable $F_i$ be the number of fatalities in accident $i$. Let
$F = \sum_{i=1}^{N} F_i$. Define $\Psi(s, t)$ as the characteristic function of the random vector $(V, F)$
and $\psi(s, t)$ as the characteristic function of the random vector $(V_i, F_i)$. Then,
for each $i \neq j$, $(V_i, F_i)$ is independent of $(V_j, F_j)$. However, the $V_i$ and $F_i$ are not
independent because $V_i \geq F_i$ a.s. Now

\[ \Psi(s, t) = E\left(e^{\mathrm{i}tV + \mathrm{i}sF}\right) = E\left(E\left(e^{\mathrm{i}tV + \mathrm{i}sF} \mid N\right)\right). \]

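Following a derivation similar to (B.2) we obtain:

\[
E\left(e^{\mathrm{i}tV + \mathrm{i}sF} \mid N = n\right)
= E\left(e^{\sum_{i=1}^{n}(\mathrm{i}tV_i + \mathrm{i}sF_i)} \mid N = n\right)
= E\left(\prod_{i=1}^{n} e^{\mathrm{i}tV_i + \mathrm{i}sF_i}\right)
= \prod_{i=1}^{n} \psi(s, t) = \psi^n(s, t).
\]

Analogous to (B.8) it is found that

\[
\Psi(s, t) = E\left(\psi^N(s, t)\right)
= e^{-\lambda} \sum_{n=0}^{\infty} \frac{\psi^n(s, t)\lambda^n}{n!}
= e^{-\lambda} \sum_{n=0}^{\infty} \frac{(\lambda\psi(s, t))^n}{n!},
\]

from which it follows that

\[ \Psi(s, t) = e^{-\lambda + \lambda\psi(s, t)} = e^{\lambda(\psi(s, t) - 1)}. \]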

Using the same argument as in the derivation of (B.10) we obtain

\[ E(V F) = \lambda^2 E(F_i)E(V_i) + \lambda E(F_i V_i), \]

and therefore

\[ \mathrm{Cov}(V, F) = \lambda E(F_i V_i). \]

This can be estimated with

\[ s(V, F) = \sum_{i=1}^{N} f_i v_i. \tag{B.14} \]

Again, this estimator is unbiased.
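Taken together, the estimators (B.7), (B.12) and (B.14), and their analogues for fatalities, give an estimate of the full covariance matrix of (N, V, F) from per-accident records. A minimal sketch follows. The function name, the record layout and the use of the observed number of accidents as the estimate of Var(N) = λ are assumptions made for illustration.

```python
def count_covariance(accidents):
    """Estimated covariance matrix of (N, V, F).

    accidents: list of (victims, fatalities) pairs, one per recorded accident.
    Var(N) = lambda is estimated here by the observed number of accidents,
    which is an assumption of this sketch.
    """
    n = len(accidents)
    sum_v = sum(v for v, _ in accidents)        # s(N, V), as in (B.12)
    sum_f = sum(f for _, f in accidents)        # s(N, F), by analogy
    sum_vv = sum(v * v for v, _ in accidents)   # s^2(V), as in (B.7)
    sum_ff = sum(f * f for _, f in accidents)   # s^2(F), by analogy
    sum_vf = sum(v * f for v, f in accidents)   # s(V, F), as in (B.14)
    return [[n, sum_v, sum_f],
            [sum_v, sum_vv, sum_vf],
            [sum_f, sum_vf, sum_ff]]


# Four hypothetical accidents: (victims, fatalities)
records = [(1, 0), (2, 1), (1, 0), (3, 1)]
for row in count_covariance(records):
    print(row)
```

Only sums over individual accidents are needed, so the matrix can be accumulated in a single pass over the accident records of a period.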

B.4. Derivation for the logarithm of counts

B.4.1. The expected value and variance of the logarithms of number of

accidents and victims

Unfortunately, it is not possible to derive an explicit characteristic function as

simple as the one in equation (B.3) in the case of the logarithm of the number

of accidents and victims. For that reason, approximations need to be made in

order to get a useful expression for the covariance between the logarithm of


the number of accidents and victims. This is done using the ‘delta’ method.

The basic idea is that the logarithms of N and V are approximated by a series

expansion of order k (usually order one) about their expected values. This

results in log(N) being approximated by a polynomial in N of order k, that is,

\[ \log(N) \overset{k}{\approx} a_0 + a_1 \times (N - \lambda) + \cdots + a_k \times (N - \lambda)^k. \]

In the present case, a first order approximation about the expected value ($\lambda$) of
the number of accidents is:

\[ \log(N) \overset{1}{\approx} \log(\lambda) + \frac{N - \lambda}{\lambda} \tag{B.15} \]

where $\overset{1}{\approx}$ denotes a first order approximation. Thus, the expected
value of this first order approximation is equal to $\log(\lambda)$: $E(\log(N)) \overset{1}{\approx} \log(\lambda)$.

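Similarly, the square of the linear approximation is

\[ \frac{(N - \lambda)^2}{\lambda^2} + \log^2(\lambda) + 2\left(\frac{N - \lambda}{\lambda}\right)\log(\lambda). \]

The latter part has expected value 0, so its expected value is

\[ \frac{\sigma^2(N)}{\lambda^2} + \log^2(\lambda) = \frac{1}{\lambda} + \log^2(\lambda), \]

so combining we have

\[ \sigma^2(\log(N)) \approx \frac{\sigma^2(N)}{\lambda^2} = \frac{1}{\lambda}, \qquad s^2(\log(N)) \approx \frac{1}{N}. \tag{B.16} \]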

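In the case of $\log(V)$, approximations are about the expected value $\lambda\mu$ of $V$:

\[ \log(V) \overset{1}{\approx} \log(\lambda\mu) + \frac{V - \lambda\mu}{\lambda\mu}. \tag{B.17} \]

For that reason $E(\log(V)) \overset{1}{\approx} \log(\lambda\mu)$ and using first (B.16) and then (B.5) we
obtain

\[
\sigma^2(\log(V)) \approx \frac{\sigma^2(V)}{(\lambda\mu)^2} = \frac{\sigma^2(V)}{(E(V))^2},
\qquad
s^2(\log(V)) \approx \frac{s^2(V)}{m(V)^2}.
\tag{B.18}
\]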

Results for fatalities are derived in a similar way.
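To get a feeling for the quality of the first order approximation (B.16), the variance of log(N) can be computed directly by summing over the Poisson distribution. The sketch below does so conditional on N ≥ 1, since log(0) is undefined. This conditioning, the truncation point and the function name are choices made here for illustration.

```python
import math


def log_count_moments(lam, n_max=None):
    """Mean and variance of log(N) for N ~ Poisson(lam), given N >= 1."""
    if n_max is None:
        # Truncate the sum far in the upper tail of the distribution.
        n_max = int(lam + 20 * math.sqrt(lam) + 20)
    total = mean = second = 0.0
    for n in range(1, n_max + 1):
        w = math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))
        total += w
        mean += w * math.log(n)
        second += w * math.log(n) ** 2
    mean /= total
    second /= total
    return mean, second - mean ** 2


for lam in (20.0, 200.0, 2000.0):
    mean, var = log_count_moments(lam)
    # Compare the directly computed moments with log(lam) and 1/lam.
    print(lam, mean, math.log(lam), var, 1.0 / lam)
```

For counts of the size usually encountered in annual national accident data the two variances should be close. For small λ the first order approximation can be expected to be less accurate.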


B.4.2. The covariance between the logarithm of the number of accidents

and the logarithm of the number of victims and fatalities

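Extending the first order approximations of both log-accident counts and
log-victims, it can be seen that, using (B.15) and (B.17)

\[
\begin{aligned}
\mathrm{Cov}(\log(N), \log(V)) &\approx \mathrm{Cov}\left(\log(\lambda) + \frac{N - \lambda}{\lambda},\; \log(\lambda\mu) + \frac{V - \lambda\mu}{\lambda\mu}\right)\\
&= E\left(\frac{N - \lambda}{\lambda}\,\frac{V - \lambda\mu}{\lambda\mu}\right) = \frac{\mathrm{Cov}(N, V)}{\lambda^2\mu}\\
\text{using (B.11):}\quad &= \frac{1}{\lambda}, \qquad s(\log(N), \log(V)) = 1/n.
\end{aligned}
\tag{B.19}
\]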

Again, results for fatalities are derived in a similar way.

B.4.3. The covariance between the logarithm of the number of victims

and the logarithm of the number of fatalities

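In this case a similar approach can be taken:

\[
\begin{aligned}
\mathrm{Cov}(\log(V), \log(F)) &\approx \mathrm{Cov}\left(\log(\lambda\mu_V) + \frac{V - \lambda\mu_V}{\lambda\mu_V},\; \log(\lambda\mu_F) + \frac{F - \lambda\mu_F}{\lambda\mu_F}\right)\\
&= E\left(\frac{V - \lambda\mu_V}{\lambda\mu_V}\,\frac{F - \lambda\mu_F}{\lambda\mu_F}\right) = \frac{\mathrm{Cov}(V, F)}{\lambda^2\mu_V\mu_F}\\
\text{using (B.14):}\quad & s(\log(V), \log(F)) = \frac{\sum_{i=1}^{n} v_i f_i}{\sum_{i=1}^{n} v_i \sum_{i=1}^{n} f_i}.
\end{aligned}
\tag{B.20}
\]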
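The estimators (B.16), (B.18), (B.19) and (B.20) share one pattern: each element of the covariance matrix of the counts is divided by the product of the two corresponding estimated expected values. A minimal sketch of that first order ('delta' method) transformation is given below. The function name and the example figures are hypothetical.

```python
def delta_log_covariance(means, cov):
    """First order approximation of the covariance matrix of log-counts.

    means: estimated expected values of the counts, e.g. (n, sum v_i, sum f_i)
    cov  : estimated covariance matrix of the counts (list of rows)
    Element (i, j) of the result is cov[i][j] / (means[i] * means[j]).
    """
    if any(m <= 0 for m in means):
        raise ValueError("all expected counts must be positive")
    k = len(means)
    return [[cov[i][j] / (means[i] * means[j]) for j in range(k)]
            for i in range(k)]


# Hypothetical period with 4 accidents, 7 victims and 2 fatalities.
means = (4, 7, 2)
cov = [[4, 7, 2],
       [7, 15, 5],
       [2, 5, 2]]
for row in delta_log_covariance(means, cov):
    print(row)
```

In this example the element for (log N, log V) reduces to 1/n, as in (B.19), and the element for (log V, log F) to the ratio of sums in (B.20).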


Appendix C. Score of the Laplace approximated log-likelihood

C.1. The derivative of at|t(ψ) with respect to a parameter ψ.

Define $\xi_t$ as the vector of parameters describing the distribution of the
predicted state at time $t$. In the Gaussian case, $\xi_t$ would comprise both $a_{t|t-1}$
and $P_{t|t-1}$. Also assume that the derivatives of the components of $\xi_t$ with
respect to the parameter (vector) of interest $\psi$ are available. Obviously, $\xi_t$ can
be assumed to be a function of $\psi$.

Because $a_{t|t}(\psi)$ maximises $l_t(\alpha, \xi_t(\psi), y_t, \psi)$, and $l_t$ can be written as
the sum of an observation part and a dynamic part:

\[ l_t(\alpha, \xi_t(\psi), y_t, \psi) = q_t(\alpha, y_t, \psi) + r_t(\alpha, \xi_t(\psi), \psi), \tag{C.1} \]

we have for $i = 1, \ldots, m \equiv \dim(\alpha)$:

\[
\left[ \frac{\partial}{\partial\alpha_i}\left(q_t(\alpha, y_t, \psi) + r_t(\alpha, \xi_t(\psi), \psi)\right) \right]_{\alpha = a_{t|t}(\psi)} = 0, \quad \forall \psi \in \mathbb{R}.
\tag{C.2}
\]

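Define the following entities:

\[
\begin{aligned}
g^{(t)}_o(\alpha, y_t, \psi) &= \frac{\partial}{\partial\alpha} q_t(\alpha, y_t, \psi), &
g^{(t)}_d(\alpha, \xi_t, \psi) &= \frac{\partial}{\partial\alpha} r_t(\alpha, \xi_t, \psi),\\
H^{(t)}_o(\alpha, y_t, \psi) &= \frac{\partial^2}{\partial\alpha\,\partial\alpha'} q_t(\alpha, y_t, \psi), &
H^{(t)}_d(\alpha, \xi_t, \psi) &= \frac{\partial^2}{\partial\alpha\,\partial\alpha'} r_t(\alpha, \xi_t, \psi),
\end{aligned}
\]

and their derivatives with respect to $\psi$ (further suppressing the time index $t$):

\[
\begin{aligned}
dg_o(\alpha, y_t, \psi) &= \frac{\partial}{\partial\psi} g_o(\alpha, y_t, \psi),\\
dH_o(\alpha, y_t, \psi) &= \frac{\partial}{\partial\psi} H_o(\alpha, y_t, \psi),\\
dg_{d1}(\alpha, \xi_t, \psi) &= \frac{\partial}{\partial\psi} g_d(\alpha, \xi_t, \psi),\\
dg_{d2}(\alpha, \xi_t(\psi), \psi) &= \frac{\partial}{\partial\xi_t(\psi)} g_d(\alpha, \xi_t(\psi), \psi).
\end{aligned}
\]

Because (C.2) holds, we have:

\[ g_o(a_{t|t}(\psi), y_t, \psi) + g_d(a_{t|t}(\psi), \xi_t(\psi), \psi) = 0, \quad \forall \psi. \]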


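Taking derivatives on both sides with respect to $\psi$ gives:

\[
\begin{aligned}
0 \approx \delta ={}& H_o(a_{t|t}(\psi), y_t, \psi)' \left(\frac{\partial a_{t|t}(\psi)}{\partial\psi}\right) + dg_o(a_{t|t}(\psi), y_t, \psi)\\
&+ H_d(a_{t|t}(\psi), \xi_t(\psi), \psi)' \left(\frac{\partial a_{t|t}(\psi)}{\partial\psi}\right) + dg_{d1}(a_{t|t}(\psi), \xi_t(\psi), \psi)\\
&+ dg_{d2}(a_{t|t}(\psi), \xi_t(\psi), \psi)' \left(\frac{\partial \xi_t(\psi)}{\partial\psi}\right),
\end{aligned}
\tag{C.3}
\]

as by the chain rule (Magnus and Neudecker, 1999, p. 91)

\[
\frac{\partial f(g(\psi))}{\partial\psi} = \left(\frac{\partial f}{\partial g}\right)(g(\psi))' \left(\frac{\partial g}{\partial\psi}\right)(\psi).
\tag{C.4}
\]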

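Thus, if (C.2) holds (see footnote 31) we have:

\[
\frac{\partial a_{t|t}(\psi)}{\partial\psi} = -H^{-1} \left( dg_o(a_{t|t}(\psi), y_t, \psi) + dg_{d1}(a_{t|t}(\psi), \xi_t(\psi), \psi) + dg_{d2}(a_{t|t}, \xi_t(\psi), \psi)' \left(\frac{\partial \xi_t(\psi)}{\partial\psi}\right) + \delta \right),
\tag{C.5}
\]

where $-H^{-1}$ is the ‘observed covariance matrix’ of the state.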
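A scalar example may help to see (C.5) at work. The sketch below assumes a univariate Gaussian observation part q = −(α − y)²/(2ψ) with observation variance ψ, and a dynamic part r = −(α − a0)²/(2p) whose parameters ξ = (a0, p) do not depend on ψ, so that only the dg_o term remains. These modelling choices, the function names and the numerical values are illustrative assumptions. The analytical derivative is compared with a central finite difference.

```python
def mode(psi, a0, p, y):
    """Maximiser a_{t|t}(psi) of q + r in the scalar Gaussian example."""
    return (a0 * psi + p * y) / (p + psi)


def mode_derivative(psi, a0, p, y):
    """d a_{t|t} / d psi from (C.5) with delta = 0 (scalar case)."""
    a = mode(psi, a0, p, y)
    hessian = -1.0 / psi - 1.0 / p   # H = H_o + H_d
    dg_o = (a - y) / psi ** 2        # d/dpsi of g_o = -(alpha - y) / psi
    # g_d = -(alpha - a0) / p does not involve psi, and xi is held fixed,
    # so the dg_d1 and dg_d2 terms of (C.5) vanish in this example.
    return -dg_o / hessian


def finite_difference(psi, a0, p, y, eps=1e-6):
    """Central difference approximation of d a_{t|t} / d psi."""
    return (mode(psi + eps, a0, p, y) - mode(psi - eps, a0, p, y)) / (2 * eps)


args = (0.5, 1.2, 0.8, 2.0)   # psi, a0, p, y
print(mode_derivative(*args), finite_difference(*args))
```

In a recursive application ξ_t itself depends on ψ through earlier time points, in which case the dg_d2 term of (C.5) has to be carried along as well.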

C.2. The derivative of the Hessian with respect to a parameter ψ.

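The Hessian $H_t(\alpha, \xi_t(\psi), y_t, \psi)$ of $l_t(\alpha, \xi_t(\psi), y_t, \psi)$ at $\alpha = a_{t|t}(\psi)$ is:

\[ H_t(\psi) = \left[ H_o(\alpha, y_t, \psi) + H_d(\alpha, \xi_t(\psi), \psi) \right]_{\alpha = a_{t|t}(\psi)}, \]

thus its derivative is

\[
\begin{aligned}
\frac{\partial H_t(\psi)}{\partial\psi}
={}& \frac{\partial}{\partial\psi} H_o(a_{t|t}(\psi), y_t, \psi) + \frac{\partial}{\partial\psi} H_d(a_{t|t}(\psi), \xi_t(\psi), \psi)\\
={}& \left[ \frac{\partial}{\partial\psi} H_o(\alpha, y_t, \psi) \right]_{\alpha = a_{t|t}(\psi)}
+ \left[ \frac{\partial}{\partial\alpha} H_o(\alpha, y_t, \psi) \right]'_{\alpha = a_{t|t}(\psi)} \left(\frac{\partial a_{t|t}(\psi)}{\partial\psi}\right)\\
&+ \left[ \frac{\partial}{\partial\psi} H_d(\alpha, \xi, \psi) \right]_{\xi = \xi_t(\psi),\; \alpha = a_{t|t}(\psi)}
+ \left[ \frac{\partial}{\partial\alpha} H_d(\alpha, \xi_t(\psi), \psi) \right]'_{\alpha = a_{t|t}(\psi)} \left(\frac{\partial a_{t|t}(\psi)}{\partial\psi}\right)\\
&+ \left[ \frac{\partial}{\partial\xi} H_d(a_{t|t}(\psi), \xi, \psi) \right]'_{\xi = \xi_t(\psi)} \left(\frac{\partial \xi_t(\psi)}{\partial\psi}\right).
\end{aligned}
\]

The state error covariance matrix $P_{t|t}(\psi)$ is estimated by minus the inverse of
the Hessian.

Footnote 31: This is only achieved numerically, so δ ≈ 0. The algorithm should verify whether δ does not get too far from zero.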

C.3. The derivative of the Laplace approximated log-likelihood at time t

By (7.18) the integrals of the individual terms in (7.14) can be approximated:

\[
I_t(a, \xi_t, y_t, \psi) = \frac{m}{2} \log(2\pi/k) - \frac{1}{2} \log\det\left(-H_t(a, \xi_t, y_t, \psi)\right) + l_t(a, \xi_t, y_t, \psi).
\tag{C.6}
\]

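Using the chain rule (C.4) we need to obtain the derivative with respect to $\psi$ of
$I_t(a_{t|t}(\psi), \xi_t(\psi), y_t, \psi)$. Using the standard results:

\[
\frac{\partial}{\partial\psi} \log\det F = \operatorname{trace}\left[ F^{-1} \frac{\partial}{\partial\psi} F \right]
\quad\text{and}\quad
\frac{\partial}{\partial\psi} F^{-1} = -F^{-1} \left(\frac{\partial}{\partial\psi} F\right) F^{-1}
\tag{C.7}
\]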

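Ignoring constant terms, we have

\[
\begin{aligned}
\frac{\partial}{\partial\psi} I_t\left(a_{t|t}(\psi), \xi_t(\psi), y_t, \psi\right)
={}& -\frac{1}{2} \frac{\partial}{\partial\psi} \log\det\left(-H_t\left(a_{t|t}(\psi), \xi_t(\psi), y_t, \psi\right)\right)
+ \frac{\partial}{\partial\psi} l_t\left(a_{t|t}(\psi), \xi_t(\psi), y_t, \psi\right)\\
={}& \frac{1}{2} \operatorname{trace}\left[ P_{t|t} \left(\frac{\partial}{\partial\psi} H_t(\psi)\right) \right]\\
&+ \left[ \frac{\partial}{\partial\psi} q_t(\alpha, y_t, \psi) \right]_{\alpha = a_{t|t}(\psi)}
+ g^{(t)}_o(a_{t|t}(\psi), y_t, \psi)' \left(\frac{\partial a_{t|t}(\psi)}{\partial\psi}\right)\\
&+ \left[ \frac{\partial}{\partial\psi} r_t(\alpha, \xi, \psi) \right]_{\xi = \xi_t(\psi),\; \alpha = a_{t|t}(\psi)}
+ g^{(t)}_d(a_{t|t}(\psi), \xi_t(\psi), \psi)' \left(\frac{\partial a_{t|t}(\psi)}{\partial\psi}\right)\\
&+ \left[ \frac{\partial}{\partial\xi} r_t(a_{t|t}(\psi), \xi, \psi) \right]'_{\xi = \xi_t(\psi)} \left(\frac{\partial \xi_t(\psi)}{\partial\psi}\right),
\end{aligned}
\tag{C.8}
\]

where the sign of the trace term follows from (C.7) with $P_{t|t} = -H_t^{-1}$, since then
$\frac{\partial}{\partial\psi} \log\det(-H_t) = \operatorname{trace}\left[ H_t^{-1} \frac{\partial}{\partial\psi} H_t \right] = -\operatorname{trace}\left[ P_{t|t} \frac{\partial}{\partial\psi} H_t \right]$.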


C.4. The classical univariate Linear Gaussian case

Applying the Laplace approximated Gaussian state likelihood approach to the

classical univariate Linear Gaussian state space model, as described in (7.1)

and (7.2) yields results identical to the classical prediction error decomposition

(see Durbin and Koopman, 2001, p 138).

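This is demonstrated using the univariate case, where $T_t, R_t, Z_t \equiv 1$. Ignoring
constants, from (7.15) the log-likelihood is:

\[
l_t(\alpha, \psi) = \frac{1}{2} \left( -\frac{(\alpha - \alpha_{t|t-1})^2}{P_{t|t-1}} - \frac{(\alpha - y_t)^2}{R_t} - \log(P_{t|t-1}) - \log(R_t) \right).
\]

Its derivative with respect to $\alpha$ is:

\[
l'_t(\alpha, \psi) = \frac{\partial}{\partial\alpha} l_t(\alpha, \psi) = \frac{1}{2} \left( -\frac{2(\alpha - \alpha_{t|t-1})}{P_{t|t-1}} - \frac{2(\alpha - y_t)}{R_t} \right).
\]

Solving the equation $l'_t(\alpha, \psi) = 0$ for $\alpha$ yields:

\[ \alpha_{t|t} = \frac{\alpha_{t|t-1} R_t + P_{t|t-1} y_t}{P_{t|t-1} + R_t}, \]

so $l''_t(\alpha, \psi) = \frac{\partial}{\partial\alpha} l'_t(\alpha, \psi)$ is:

\[ l''_t(\alpha, \psi) = -\frac{P_{t|t-1} + R_t}{P_{t|t-1} R_t}. \]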

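Substituting $\alpha_{t|t}$ into $l_t(\alpha, \psi)$ yields (ignoring constants):

\[
l_t(\alpha_{t|t}, \psi) = -\frac{1}{2} \log(P_{t|t-1} R_t) - \frac{1}{2} \frac{\left(\alpha_{t|t-1} - y_t\right)^2}{P_{t|t-1} + R_t}.
\]

Adding $(-1/2) \log\det\left(-l''(\alpha, \psi)\right)$:

\[ -\frac{1}{2} \log\left(\frac{P_{t|t-1} + R_t}{P_{t|t-1} R_t}\right) \]

yields

\[
-\frac{1}{2} \log(P_{t|t-1} + R_t) - \frac{1}{2} \frac{\left(\alpha_{t|t-1} - y_t\right)^2}{P_{t|t-1} + R_t},
\]

which is equivalent to the prediction error decomposition (see also Harvey
(1989), or Durbin and Koopman (2001, p 138)).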
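The equivalence can be confirmed numerically. The following minimal sketch evaluates the Laplace approximated contribution step by step (mode, log-likelihood at the mode, second derivative) and compares it with the prediction error decomposition, in both cases ignoring constants. The function names and the numerical values are illustrative only.

```python
import math


def laplace_loglik(a_pred, p_pred, r, y):
    """Laplace approximated log-likelihood contribution (constants dropped)."""
    # Mode of l_t, i.e. the filtered state alpha_{t|t}
    alpha_hat = (a_pred * r + p_pred * y) / (p_pred + r)
    # l_t evaluated at the mode
    l_hat = 0.5 * (-(alpha_hat - a_pred) ** 2 / p_pred
                   - (alpha_hat - y) ** 2 / r
                   - math.log(p_pred) - math.log(r))
    # Second derivative of l_t with respect to alpha
    second = -(p_pred + r) / (p_pred * r)
    return l_hat - 0.5 * math.log(-second)


def prediction_error_loglik(a_pred, p_pred, r, y):
    """Prediction error decomposition contribution (constants dropped)."""
    f = p_pred + r   # variance of the one-step-ahead prediction error
    return -0.5 * math.log(f) - 0.5 * (y - a_pred) ** 2 / f


args = (1.2, 0.8, 0.5, 2.0)   # alpha_{t|t-1}, P_{t|t-1}, R_t, y_t
print(laplace_loglik(*args), prediction_error_loglik(*args))
```

The two values agree up to rounding error, because the log-likelihood is exactly quadratic in α in the linear Gaussian case.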

C.5. Issues with respect to the use of the Laplace approximation

The following assumptions (univariate case) on $l_t(\alpha, \psi)$ are necessary for the
Laplace approximation to be applicable (for fixed $\psi$):

1. $l_t(\alpha, \psi)$ is real and continuous. This means that the observation density
must be positive for all possible outcomes and parameter values combined.

2. the integral $\int \exp l_t(\alpha, \psi)\,d\alpha$ should converge,

3. $l_t(\alpha, \psi)$ has an absolute maximum at $\hat{\alpha}$,

4. $l'_t(\alpha, \psi)$ exists in a neighbourhood of $\hat{\alpha}$,

5. $l''_t(\hat{\alpha}, \psi) < 0$.

Note that the Laplace approximation relies on a second order approximation
in $\alpha$ of $l_t(\alpha, \psi)$ at $\hat{\alpha}$. Consequently, in the full Gaussian case the Laplace
approximation in (7.18) is exact. Therefore the argument that Huber et al.
(2004, p. 896) use to state that “the approximation improves as the number
of latent variables grows (because with more latent variables we need more
manifest variables)” can be extended to the linear Gaussian state space
applications of structural time series that involve unobserved state components
like slopes and seasonal components. Such components contribute second order
polynomial components to $l(\alpha, \psi)$, which should not deteriorate the
Laplace approximation.


Summary

Time series analysis in road safety research using state space methodology

This thesis presents a number of studies in which time series analysis is
applied to aggregated road safety data, including numbers of road accidents,
numbers of road fatalities and numbers of seriously injured.

Much research has been, and is being, carried out into how road safety can be
improved. This research often tries to establish relations between changes in
the numbers of road accidents or road casualties on the one hand and, on the
other hand, factors such as exposure (a measure of the amount of traffic),
policy, driving under the influence of alcohol, speed behaviour and
infrastructural measures. An important aim is to identify which factors can be
controlled and have a positive effect on road safety, so that road safety can
be improved along that route.

Some of these factors, such as regulations, legislation and policy, can be
observed directly: the date of implementation is fixed, although this need not
hold for compliance. Other factors can be observed directly only in theory. In
practice, measuring them directly would be very difficult or very costly. An
example of such a factor is the number of passenger kilometres. In theory, the
number of passenger kilometres could be established for every individual on
every day. In practice this quantity is estimated by means of surveys, at the
expense of accuracy. Another example is the percentage of drivers who take part
in traffic under the influence of alcohol. This is estimated by means of random
roadside alcohol checks. Finally, a number of factors, such as driver
experience, are even harder to observe. Such differences in observation are
reflected in differences in the accuracy of the data. If the data used contain
(large) differences in accuracy, ignoring this can adversely affect the
statistical analysis.

Another complicating factor for (time series) analysis in road safety research
is that no unique measure of road safety is available. Road safety is usually
measured in terms of the number of accidents or the number of victims. Although
the situation is more complicated in practice, it can be stated that some road
safety measures will mainly affect the occurrence of accidents, and other
measures mainly their outcome (but it is not that simple). When a study is
carried out into the effect of a measure that should influence the occurrence
of accidents, the development of the number of accidents of a relevant type
should be studied. On the other hand, if the effect of a measure is studied
that would mainly affect the severity of accidents, the development of the
number of victims could be studied. It is preferable to analyse both the
development of the number of victims of a type that the measure is expected to
affect and, for comparison, the development of the number of victims of a type
that the measure is not expected to affect. Apart from the effect of the
measure, both developments should be comparable. If a reduction in the number
of victims has been demonstrated, it is then important to establish that the
number of victims was reduced because the severity of accidents decreased, and
not because the number of accidents decreased. In addition, ideally no changes
are found in the numbers of the type of victim that is not expected to be
affected by the measure.

However, it is unlikely that both conditions can be met in practice when the
number of victims is studied over a longer period. It is possible that other
factors influenced road safety during the observed period. The influence of
these factors may have to be modelled in its own right. Moreover, the influence
of these factors need not be completely independent of the measure. To carry
out a sound statistical analysis in such cases, it can be useful to specify a
simultaneous model that comprises the joint dependent variables, in particular
accidents and victims.

This thesis presents a new approach to road safety oriented time series
analysis of the developments in aggregated road safety data, with the aim of
improving the possibilities and reliability compared with commonly used
alternatives. The approach is based on so-called multivariate heteroscedastic
structural time series models and reduces many of the problems of the usual
time series models used for road safety analysis, including all of the problems
mentioned above. The models currently in wide use can only partly reduce these
problems.

The simultaneous combination of three fundamental aspects of the approach
described in this thesis makes improvements to the time series models in road
safety research possible. These three aspects are:

• The use of structural components. The time series models are built from
interpretable structural components, which in principle represent exposure,
risk and possibly severity, and where necessary represent other influences such
as the registration rate, the percentage of seat belt use, the percentage of
drivers exceeding the legal limit for blood alcohol concentration, and the
speed behaviour of drivers. The structural components can also have seasonal
trends and patterns. The advantages of using interpretable components become
clear when an individual component can be related to more than one dependent
variable. For example, the seat belt wearing rate may be the same for several
types of accidents (although this does not always hold). Moreover, the
structural components in these models enable the researcher to distinguish
between effects that influence road safety, or important parts of it, and
effects that influence how we observe road safety. One can think here of the
decreasing completeness of accident registration: the accidents still happen,
but we no longer see them. In addition, this approach enables the researcher,
in a relatively transparent manner, to study the effects of explanatory
variables on the relevant components rather than on the number of accidents or
victims, for example for research questions such as “does police enforcement
affect the percentage of seat belt use, and thereby road safety, or is it
different after all?”

• Multiple dependent variables: both numbers of accidents and numbers of
victims can be included in a model, if necessary separately for several types
of accidents. Explanatory variables can be included in the traditional way.
Explanatory variables that are measured with uncertainty, such as data obtained
from samples or results of other research, can be included in a model in which
a structural component is used as an estimate of the true value (the value
without observation error, which can be useful if it can be assumed that the
true value, and not the observed value, influences road safety or other
components). An example is the percentage of seat belt use. Data on this may
have been obtained from relatively small studies, and are not necessarily
available for all time points. If it can be assumed that the percentage of seat
belt use is approximately constant, a structural component can be used that is
the mean of all observations. This structural component can then be used for
all time points, including time points for which no study result of their own
is available. If it cannot be assumed that the percentage of seat belt use is
constant, the structural component can be adapted accordingly, so that the
observations of all time points can still be used (of course the uncertainty
about the percentage of seat belt use at a particular time point then becomes
(much) larger).

• Non-identical structure of observation errors: the accuracy may vary over
time. An example is the OVG/MON mobility survey, which has changed considerably
in size. An extreme example is the complete absence of data. Not all surveys or
studies are carried out every year.

The models based on the approach described in this thesis can handle situations
in which variables differ in accuracy. For example, the amount of car traffic
can be determined more accurately than the amount of motorcycle traffic (the
relative error for the total amount of car traffic is about 2 to 3% in the 2003
mobility survey, whereas the relative error for total motorcycle traffic is
almost 30% in the same 2003 survey). It is the combination of these properties
in one model and the use of shared structural components that makes the model
attractive for time series analysis in road safety research. Given that road
safety research generally studies a longer period, during which circumstances
may change over time, the following practical advantages can be mentioned:

• Because several dependent variables can be analysed jointly and an identical structure of observation errors is not required, the models can be used to account for covariance between the dependent variables. It is known that the number of victims in a period depends (in part) on the number of accidents. It is not in itself remarkable that a year with more accidents also has more victims.

It would be remarkable if a year had more accidents and fewer victims, or fewer accidents and more victims. Although the absolute differences are equally large (the same number of accidents or victims more or fewer), a year with more accidents and more victims, or with fewer accidents and fewer victims, is less improbable than a year with more accidents and fewer victims, or with fewer accidents and more victims.

In other words: if more accidents occurred in a year, that need not be remarkable. If there were fewer victims in a year, that need not be remarkable either. But if more accidents occurred in that same year while there were fewer victims, that may well be remarkable.

This dependence must be taken into account in statistical applications. The errors in the traffic volumes estimated from the surveys may also be correlated. The relationship between numbers of accidents and victims is used in both applications of Chapter 3. The error in the traffic volume from the surveys is likewise used in the applications of Chapter 3, the Dutch application in Chapter 5 and the road-safety-related application in Chapter 7.

• The use of structural components enables the researcher to carry out a limited degree of verification of the results. The development of structural components over time can be compared with secondary information, although when much secondary information is available it is probably better to include that information in the model. A successful application of such a verification can be found in Chapter 6, where the shape of the development of the amount of motorised traffic outside urban areas is compared with an estimate of it based on the length of the road network and the traffic intensity.

• Structural components can be shared in the specification of many dependent variables. In the road safety application of Chapter 7, weather conditions (duration of precipitation) measured at ten weather stations are related to a structural component that represents the average duration of precipitation in the Netherlands. If the weather observed by the stations agrees, the value of the structural component will be known accurately. If, on the other hand, the weather stations give differing pictures of the weather, the value of the structural component will be known less accurately. In this way, observations for which the average amount of precipitation is less well known can have less influence on the final result than observations for which it is well known, so that any errors will hopefully have fewer consequences. The possibility of sharing components is also used in Chapter 3, where a structural component representing the number of victims per accident is shared by the dependent variables representing "the number of victims registered by the police" and "an estimate of the true number of victims". This component is used to improve the estimate of the true number of accidents, which cannot be derived from hospital data.

• The approach makes it possible to use distributions other than the normal distribution for the observation errors in the models. Moreover, these distributions can be combined and can differ per time point. In the road safety application of Chapter 7, a model for weather and road safety is developed using both several normally distributed observation errors and Poisson distributed dependent variables.

• The approach makes it possible to relate the so-called innovations of structural components to each other. Innovations are shocks in the development of the components. When the innovations of two components are correlated, their developments have something in common. This matters when structural components describe phenomena that can influence each other, such as exposure and risk. Another example is the development of accident severity and accident risk. The development of accident severity affects the occurrence of accidents that exceed a certain severity. If we were able to reduce average accident severity so much that there were almost no road deaths left, there would also be almost no fatal accidents. A similar effect arises from changes in car occupancy. Fewer occupants means a smaller chance that someone is injured so seriously that he or she has to be admitted to hospital or even dies. On average it will therefore appear as if fewer accidents occur that are serious enough to result in hospitalised casualties or deaths.

In practice, it is always more or less serious accidents that are studied. In this thesis these are fatal accidents and accidents with fatalities and/or hospitalised casualties. In practical terms this means that a decrease in average accident severity can be expected to be accompanied by a decrease in the number of accidents.

The fact that exposure can affect risks is used in almost all applications in this thesis, while the fact that accident severity can affect the occurrence of serious accidents is used in the first application of Chapter 3.
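
The covariance argument in the first item of the list above can be made concrete with a short calculation. The sketch below assumes that the yearly deviations in accidents and victims are jointly normal with a positive correlation; the function name, the standard deviations and the correlation of 0.8 are hypothetical values chosen for illustration only. It computes the squared Mahalanobis distance, which is larger for more improbable pairs of deviations.

```python
def mahalanobis_sq(d_acc: float, d_vic: float,
                   sd_acc: float, sd_vic: float, rho: float) -> float:
    """Squared Mahalanobis distance of a pair of deviations under a
    bivariate normal distribution with correlation rho (|rho| < 1)."""
    z1 = d_acc / sd_acc   # standardised deviation in accidents
    z2 = d_vic / sd_vic   # standardised deviation in victims
    return (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / (1.0 - rho * rho)


# Hypothetical year-to-year deviations of one standard deviation each.
same_direction = mahalanobis_sq(+1.0, +1.0, sd_acc=1.0, sd_vic=1.0, rho=0.8)
opposite_direction = mahalanobis_sq(+1.0, -1.0, sd_acc=1.0, sd_vic=1.0, rho=0.8)
```

Under these assumed values the distance is roughly 1.1 when both series move in the same direction and 10 when they move in opposite directions, although the absolute deviations are identical. A model that ignores the covariance would treat both years as equally unremarkable.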

In the road safety applications in this thesis, exposure is always assumed to be in a one-to-one relation with traffic volume, measured in vehicle kilometres. This restriction is not fundamental, however. The approach can also be applied to models that assume a non-linear relation between exposure and vehicle kilometres, or that use a measure of traffic volume other than vehicle kilometres. Furthermore, the dynamic relations used in this thesis are derived from local linear trend models. Although gradual developments generally appear to be approximated satisfactorily by local linear trend models, this need not always be the case. In particular, the model for the development of the fraction of traffic volume that takes place during precipitation in Chapter 7 could be improved.
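
For reference, the local linear trend model mentioned above can be written in its standard textbook form as follows. The symbols are an assumption made here for illustration and may differ from the notation used in the chapters of the thesis: $y_t$ is the observation, $\mu_t$ the level, $\nu_t$ the slope, and $\varepsilon_t$, $\xi_t$, $\zeta_t$ are mutually independent normal disturbances.

```latex
\begin{align}
  y_t       &= \mu_t + \varepsilon_t,      & \varepsilon_t &\sim N(0, \sigma^2_\varepsilon), \\
  \mu_{t+1} &= \mu_t + \nu_t + \xi_t,      & \xi_t         &\sim N(0, \sigma^2_\xi), \\
  \nu_{t+1} &= \nu_t + \zeta_t,            & \zeta_t       &\sim N(0, \sigma^2_\zeta).
\end{align}
```

Setting $\sigma^2_\xi = \sigma^2_\zeta = 0$ reduces the model to a deterministic linear trend, which shows why it suits gradual developments: the level and slope change only through small disturbances.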

To describe developments over time, the models in this thesis assume in all cases linear dynamic relations with normally distributed random fluctuations. This assumption need not hold in all cases, although the approximation by means of local linear trend models appears to work well in practice. This thesis demonstrates empirically the effectiveness of that approximation, as well as of the proposed new method of time series analysis for road safety research. The intention is to continue developing the methodology in higher dimensions and in more detailed models for road safety. This thesis reports the main contributions of the new approach.


Acknowledgements

I would like to thank everyone who has helped me or with whom I have worked. I want to begin with my wife and children, who gave me the space I needed. Next, but no less, I want to thank the people with whom I worked directly, in particular my colleague Jacques Commandeur, my co-author Phillip Gould, my supervisor Siem Jan Koopman and co-supervisor Kees van Montfort. For their comments on the road safety aspects of my research I would especially like to thank Siem Oppe, Fred Wegman and Shalom Hakkert.
