elementary statistics on trial (the case of lucia de berk)pietg/lucia.pdf · a nurse experiences by...

10
Elementary Statistics on Trial (the case of Lucia de Berk) Richard D. Gill, Piet Groeneboom and Peter de Jong The trial of the Dutch nurse Lucia de Berk, suspected of sev- eral murders and attempts of murder was a very high profile case in the Netherlands. The ini- tial suspicion rested mainly on quasi-statistical considerations, which produced (based partly on incorrect calculations) extremely small probabilities. Since the out- comes proved controversial, the court claimed to have dropped the statistical calculations from the verdict. But the verdict still rested on intuitive notions as “very improbable”. So statistics were at center stage. In the conviction of Lucia de Berk an important role was played by a simple (so-called) hy- pergeometric model, used by the law psychologist (H. Elffers) con- sulted as statistician by the court, which produced very small prob- abilities of occurrences of certain numbers of incidents. In this article we want to draw attention to the fact that, if we take into account the variation among nurses in incidents they experience during their shifts, these probabilities become con- siderably larger. This points to the danger of using an oversim- plified discrete probability model in these circumstances. The outcomes of applying our alternative model to this case are in striking contrast with those of the first calculations which led to the initial suspicions and were instrumental in determin- ing the atmosphere surrounding the trial and subsequent hyste- ria. The main result is that un- der the assumption of heterogene- ity, the probability of experienc- ing a number of incidents (14) that led to Lucia’s conviction is about 0.0206161 or one in 49 if the calculations are based on the same data as used by the law psy- chologist of the court. In his cal- culation, however, this probabil- ity was equal to one in 342 mil- lion. Figure 1: Lucia de Berk before her imprisonment 1

Upload: others

Post on 22-Oct-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

  • Elementary Statistics on Trial

    (the case of Lucia de Berk)

    Richard D. Gill, Piet Groeneboom and Peter de Jong

    The trial of the Dutch nurseLucia de Berk, suspected of sev-eral murders and attempts ofmurder was a very high profilecase in the Netherlands. The ini-tial suspicion rested mainly onquasi-statistical considerations,which produced (based partly onincorrect calculations) extremelysmall probabilities. Since the out-comes proved controversial, thecourt claimed to have droppedthe statistical calculations fromthe verdict. But the verdict stillrested on intuitive notions as“very improbable”. So statisticswere at center stage.

    In the conviction of Luciade Berk an important role was

    played by a simple (so-called) hy-pergeometric model, used by thelaw psychologist (H. Elffers) con-sulted as statistician by the court,which produced very small prob-abilities of occurrences of certainnumbers of incidents.

    In this article we want to drawattention to the fact that, if wetake into account the variationamong nurses in incidents theyexperience during their shifts,these probabilities become con-siderably larger. This points tothe danger of using an oversim-plified discrete probability modelin these circumstances.

    The outcomes of applying ouralternative model to this case are

    in striking contrast with thoseof the first calculations whichled to the initial suspicions andwere instrumental in determin-ing the atmosphere surroundingthe trial and subsequent hyste-ria. The main result is that un-der the assumption of heterogene-ity, the probability of experienc-ing a number of incidents (14)that led to Lucia’s conviction isabout 0.0206161 or one in 49 ifthe calculations are based on thesame data as used by the law psy-chologist of the court. In his cal-culation, however, this probabil-ity was equal to one in 342 mil-lion.

    Figure 1: Lucia de Berk before her imprisonment

    1

  • The data

    We use data from the unpub-lished reports of Elffers [1] and[2]. But before going into this,we want to make some general re-marks on the data collection.

    One of the key features ofthe data was the flawed datacollection. Here different disci-plines came into conflict: crim-inal investigation and scientificdata gathering are very differ-ent. Their objectives, meth-ods and results are not compat-ible. Criminal investigation isstarted when there is (suspicion

    of) a crime, hence one is look-ing for or hunting down a sus-pect. If there is need for meaning-ful statistics another methodol-ogy is needed, guaranteeing cleardefinitions and uniformity of thedata collection. In the case of Lu-cia de Berk this clash of culturesproved disastrous. Incidents out-side shifts of Lucia were discardedand some initially reported inci-dents were later relabeled withoutclear reasons. Extra shifts with-out incidents and incidents out-side shifts of Lucia were subse-quently brought to light. More-over, the data collection rested fora large part on memory.

    Clearly, the context of a crim-inal investigation produces a spe-cific mindset: on the one handthe witnesses know what is lookedfor (and some of them may al-ready be convinced of the guilt ofthe suspect), on the other handfear of implicating one’s self andfriends can considerably distortmemory. The data on shifts andincidents for the period whichwas singled out in Elffers’ reportsare given in the following table(our Table 1 corrects an error in[6], where the number of shift inthe ward RCH-41 was erroneouslygiven by 336 instead of 366 intheir table on top of p. 235).

    Table 1: Data on shifts and incidents

    Hospital name (and ward number) JCH RCH-41 RCH-42 TotalTotal number of shifts 1029 366 339 1734Lucia’s number of shifts 142 1 58 201Total number of incidents 8 5 14 27Number of incidents during Lucia’s shifts 8 1 5 14

    JCH and RCH denote the “Juliana Children’s Hospital” and “Red Cross Hospital”, respectively, and41 and 42 were different ward numbers of the Red Cross hospital.

    Elffers’ method

    We first discuss the analysis ofthe law psychologist H. Elffers,the statistician consulted by thecourt. This analysis was basedon Table 1. As was noticed later,Lucia de Berk had actually donethree shifts in RCH-41 instead ofjust one, but we will argue fromthe data used by H. Elffers. Asexplained in [6], Elffers arguedby conditioning on part of thedata and used two fundamental

    assumptions:

    1. There is a fixed probabilityp for the occurrence of anincident during a shift (forexample, p does not dependon whether the shift is a dayshift or a night shift or onthe nurse involved, etc.),

    2. Incidents occur indepen-dently of one another.

    On the basis of these assump-tions, one can compute the prob-

    ability that L incidents occur dur-ing Lucia’s shifts, given the totalnumber I of incidents and the to-tal number N of shifts consideredin the period of study. This isa hypergeometric probability givenby (

    SL

    )(N−SI−L

    )(NI

    ) (1)where S is the number of shifts ofLucia and I is the total number ofincidents, and where

    (SL

    ), etc. de-

    note binomial coefficients. If we

    2

  • just take all the data of Table 1together, we have a total num-ber of N = 1734 shifts, Lucia hadS = 201 shifts, there was a total

    number I = 27 of incidents, andL = 14 incidents during shifts ofLucia. If we evaluate (1) withthese values for N,S, I and L, we

    get the very small probability ofabout one in 4.2 million.

    Figure 2: A Fokke & Sukke cartoon from 10-30-2007 in the Dutch newspaper NRC Next. The text waskindly translated into English for us by the creators of the cartoon: Reid, Geleijnse and Van Tol. Luciade Berk was still in prison at that time. The canary and the duck are defending a family guardian,accused of being responsible for the death of the girl Savanna, who died by suffocation. The accusedwoman was in fact acquitted (by another defense!). What counselor Sukke is saying corresponds to whatthe law psychologist H. Elffers told the court: “Honored court, this is no coincidence. The rest is up toyou.”.

    If we want to compute theprobability (p-value) that a nurseis present with 14 or more inci-dents in Elffers’ method of test-ing a null hypothesis of no sys-tematic effects on these combineddata (but he actually did nottest it in this way on the com-bined data, see below), we haveto sum the probabilities for L =14, 15, . . . , 27, and then we get the

    probability of about one in 3.8million. This is a very small prob-ability, although still about 100times larger than the probabilityElffers arrived at as described be-low. For the model we introducein the next section, however, weget, using the same data, a prob-ability of one in 49.

    However, Elffers proceededsomewhat differently, not com-

    bining the data of the differenthospitals. The details of whathe actually did are described in[6]. The most important mistakehe made in his calculation wasto take the three hospitals sepa-rately, and multiplying the prob-abilities he got for these sepa-rately. This has the absurd con-sequence that a nurse working inseveral different hospitals gets a

    3

  • higher chance of being accused ofinexplicably being present at in-cidences than a nurse working injust one hospital. In this way hearrived at his estimate that theprobability that Lucia de Berkwas present at the given numbersof incidents at the Juliana Chil-dren’s hospital and the Red Crosshospital was equal to one in 342million. We refer the interestedreader to [6] and to Chapter 7“Math error number 7: the in-credible coincidence” of the book[7].

    Post-hoc testing

    A reviewer of this paper has askedus to comment on the issue of thedanger of post-hoc testing: test-ing a hypothesis using the samedata which suggested that hy-pothesis. Elffers actually triedto take account of this prob-

    lem in the following way. Hestarted from the assumption thatthe number of incidents in thedata from JCH was much largerthan expected, and that the pur-pose of his analysis was to dis-cover whether there was an as-sociation with any of the nurseswho worked on the ward. Hemultiplied his p-value for the as-sociation with Lucia’s shifts by27, the number of nurses in thatperiod who worked on the sameward. By the time he came tolook at the data from RCH, Lu-cia was a prime suspect and hejudged that no further Bonferronitype correction was required. Fi-nally, he proposed to take a verysmall probability for the signifi-cance level of his test.

    In fact, his starting assump-tion was false: in the previousyear there had been no incidentsin the ward, but the year before

    that, an even larger number. Thehospital director had not revealedthe information from two yearsago to the investigators since theward previously had had a differ-ent name (he had changed it).

    One could try to use aBayesian approach to deal withthe post-hoc problem. Therewould be good arguments for arather low prior probability ofan arbitrary nurse being a serialkiller. The difficult task for theBayesian would be determining areasonable model for number ofincidents if Lucia is a murderer,since one has to take into accountthat some proportion of the in-cidents are not murders at all.Heterogeneity would also remainan issue for a Bayesian analysis.Explaining the methodology in acourt of law could well be thebiggest barrier.

    Alternative model

    We can model the incidents thata nurse experiences by a so-calledPoisson process, with a nurse-dependent intensity A, where weuse A for “accident proneness”. APoisson process is used to modelincoming phone calls during non-busy hours, fires in a big city,etc. Since we believe the inci-dents to be rare, a Poisson pro-cess is an obvious choice for mod-eling the incidents that a nurseexperiences.

    This approach models twoseparate phenomena. Firstly, theintensity of nurses seeing or re-porting incidents is modeled byintroducing the random variable

    A. We assume that A has an ex-ponential distribution, but otherchoices are also possible.

    Note that we move away herefrom a simple discrete model,as used by Elffers, but use in-stead a continuous distributionfor the “accident proneness” Aof the nurse. Statistical mod-els with continuously varying ran-dom variables are perhaps moredifficult to explain to the judges,but are often much more realistic,which should be the only impor-tant consideration here.

    Secondly, the number of in-cidents happening to a nurse onduty depends on A and the timeinterval she is working, and fol-

    lows (conditionally on A) a Pois-son distribution. The time inter-val is measured by the number ofshifts the nurse has had.

    Assuming that A is exponen-tially distributed implies, amongother things, that it can easilyhappen that one nurse has twicethe incident rate of another nurse.The probability of this event is2/3; in fact the probability of anincidence rate of a factor k timesthat of another nurse is 2/(k+1).

    The statistical problem boilsdown to the estimation of the pa-rameter, characterizing the mix-ture of Poisson processes for thedifferent nurses. Combining theJuliana Children’s Hospital and

    4

  • the two wards of the Red CrossHospital, Lucia had 201 shifts and14 incidents.

    A major flaw in the investi-gation is that the data collectionis irreproducible and lacks rigor-ous methods and definitions. Itcrucially depended on the mem-ory of people who knew what wassought after. But we will arguefrom the data in Table 1 above,which also was used in the com-putations of Elffers.

    We’ll take the overall proba-bility of an incident per shift tobe the ratio of total number of in-cidents to total number of shifts,µ = 27/1734. If we take a shiftto be our unit time interval, thenthis would be a so-called momentestimate of the mean intensity ofincidents.

    This means, that, condition-ally on the time interval T =201, the number of incidents fol-lows a mixture of Poisson randomvariables with parameter 201A,where the intensity A has an ex-ponential distribution with firstmoment µ. Thus on average, an

    innocent Lucia would experience201 ·µ = 201 ·27/1734 ≈ 3.12976incidents. A picture of the prob-abilities that the number of inci-dents is greater or equal to k =1, 2, . . . is shown in Figure 3,which is based on the calculationsgiven at the end of this section.

    Heterogeneity of any kind in-creases the variation in the num-ber of incidents experienced bya randomly chosen nurse over agiven period of time (given num-ber of shifts). From the well-known relations for conditionalexpectation (E) and variances(var)

    E(X) = E(E(X|Y )),

    var(X) = E(var(X|Y ))+ var(E(X|Y )),

    it follows that whereas for a Pois-son distributed random variablevariance and mean are equal, for amixture of Poisson’s (with differ-ent conditional means), the vari-ance is larger than the mean. Soif some nurses experience more or

    less incidents than other, in allcases the end-result is overdisper-sion caused by heterogeneity.

    Applied to the current modelwhich is geometric with parame-ter (1 + tµ)−1 (see the computa-tion at the end of this section):

    var(N) = (1 + tµ)2 − (1 + tµ)= tµ+ (tµ)2,

    where the latter term neatly splitsover the expected variance of thePoisson process plus the varianceof the conditional parameter ofthe Poisson process which we as-sumed to be exponential.

    The fact that a modestamount of heterogeneity turns analmost impossible occurrence intosomething merely mildly unusual,is strong support for further em-pirical research whether and ifso in what forms heterogeneityplays a role in healthcare. Itcan have major implications indifferent areas, such as medicalresearch (representing an extrasource of variation) and trainingof medical staff.

    Computation of the probabilities in the mixed Poisson model

    If N is a Poisson random variable with parameter λ, the probability that the number of incidents isbigger than k, k = 1, 2, . . . , is given by an integral, namely

    1

    (k − 1)!

    ∫ λ0

    e−xxk−1 dx, (2)

    see, e.g., [3], Exercise 46, p. 173. This means that if we assume that the “accident proneness” of thenurses has an exponential distribution with expectation µ (in our case estimated by 27/1734) and theparameter of the Poisson distribution for the nurse is given by ta, where t is the time interval (in ourcase t = 201) and a the accident proneness, we have to integrate (2) with respect to the density of theexponential distribution with expectation µ, taking λ = ta So we get for the probability that a nurse

    5

  • 1 2 3 4 5 6 7 8 9 10 11 12 13 14

    0.0

    0.1

    0.2

    0.3

    0.4

    0.5

    0.6

    0.7

    Figure 3: Probabilities (in the Poisson model) that the number of incidents in 201 shifts for one nurseis at least 1,2,3,. . . , if µ = 27/1704. The probabilities are given by the heights of the columns above1, 2, 3, . . . , respectively.

    experiences k or more incidents:∫ ∞0

    P {I ≥ k|A = a, T = t} e−a/µ

    µda =

    ∫ ∞0

    {1

    (k − 1)!

    ∫ ta0

    e−xxk−1 dx

    }e−a/µ

    µda

    =

    (tµ

    1 + tµ

    )k.

    This is the geometric distribu-tion with parameter 1/(1 + tµ).With k = 14 and tµ = 3.12976this yields 0.0206161 or about onein 49.

    An early version of this paperused a revision of Elffers’ data-set proposed by Professor TonDerksen, philosopher of science,who together with his sister, med-ical doctor Metta de Noo, wasthe first to actively contest thecourt’s reasoning in the case ofLucia de Berk.

    Our model then led us to aright tail probability of one in

    nine. We later noticed that Derk-sen had also removed all incidentswhich the court finally decidednot to count as provenly causedby Lucia; he used the legal argu-ment that Elffers had previouslybeen instructed by the judges todo the same for the data fromthe Juliana Children’s Hospital.This does not make any statisti-cal sense.

    Going back to original medi-cal records, Derksen and de Nooalso found inconsistencies in theclassification and timing of sev-eral incidents, which underlines

    the unreliability of the data. Cor-recting the data for apparent er-rors would also improve the re-sults from the defence point ofview.

    We decided in the present pa-per to stick with Elffers’ num-bers in order to focus on our mainpoint concerning the impact ofheterogeneity.

    6

  • Extended discussionof heterogeneity

    We showed that a modest amountof heterogeneity leads to very dif-ferent orders of magnitude in theoutcomes of crucial calculations.Here we address some of the un-derlying mechanisms which maylead to the postulated hetero-geneity.

    Clearly, the data in this caseshow heterogeneity. The datastem from two hospitals with verydifferent patients, young childrenin the JCH and elder adult pa-tients in the RCH. The data comefrom three wards and the rates ofincidents per shift vary consider-ably for each ward.

    We describe two generalmechanisms causing heterogene-ity. The first one concerns prop-erties of subjects directly relatedto the intensity of the rate ofincidents. The other mechanismis more indirect and results from“spurious correlations”, in whichproperties not related to the un-derlying intensity influence themeasurement via unexpected de-pendencies and systematic varia-tions in variables assumed to beindependent and uniform.

    Related to this is another as-pect of the data: the degree towhich a specific model or null-hypothesis is susceptible to smallvariations in the data. We willshow this to be the case in theoriginal calculations. Althoughour example is tuned to this veryspecific case, it refers to a muchmore general caveat. It shouldbe established how stable certain

    models are under small perturba-tions of the data.

    Are nurses interchange-able?

    According to medical specialistswe have spoken to, nurses arecompletely interchangeable withrespect to the occurrence of med-ical emergencies among their pa-tients. However, according tonursing staff we have consulted,this is not the case at all. Dif-ferent nurses have different stylesand different personalities, andthis can and does have a medicalimpact on the state of their pa-tients. Especially regarding careof the dying, it is folk knowledgethat terminally ill persons tend todie preferentially on the shifts ofthose nurses with whom they feelmore comfortable. As far as weknow there has been no statisticalresearch on this phenomenon.

    There is another obvious wayin which the intensity of incidentsdepends on characteristics thatvary over the population. Anyevent that can turn out to be an“incident” starts with the call of adoctor. And in all cases it is thenurse who decides to call a doc-tor. This decision is influencedby professional and personal at-titude, past experience and per-sonality traits like self-confidence.It seems obvious to us that thesecharacteristics vary greatly in anypopulation. Hence we assumethat the intensity of experiencingincidents varies accordingly.

    Inadequacy of the hyper-geometric distribution asa model and spuriouscorrelations

    The model underlying the null-hypothesis (which led to thehypergeometric distribution) de-pends on two assumptions: Boththe incidents and the nurses areassigned to shifts uniformly andindependently of each other.

    Above we have establishedtwo ways in which characteristicsof individual subjects may lead tovariation in the intensity of expe-riencing an incident. This vari-ation is in contrast with one ofthe assumptions underlying thehypergeometric distribution: uni-formity.

    Next we discuss sources of cor-relation which correspond to in-direct rather than direct causa-tion: we speak then of spurious-correlation, correlation which canbe explained by confounding fac-tors, by common causes.

    There are serious reasons todoubt the uniformity of incidentsover shifts. There may occur pe-riodical differences. The popula-tion of a hospital ward may varyover the seasons. The patientsmay differ in character and sever-ity of illness due to seasonal influ-ences. There are differences be-tween day and night shifts andbetween weekend shifts and shiftson weekdays. An extensive studyof Dutch Intensive Care Units ad-missions shows a marked increasein deaths when the admission fallsoutside “office hours”[5]. Recallthat there have to be nurses on

    7

  • duty throughout the night andthroughout the weekends, whilethe medical specialists tend tohave “normal working hours”. Fi-nally there is the periodical cycleof the circadian rhythm, influenc-ing the condition of the patientsand the attention of the medicalstaff [4].

    Notice that circadian varia-tion in e.g. mortality and the re-sulting variation of incident ratebetween different shifts over theday interacts with the variation inthe number of nurses on a shift,with more personnel on the dayshifts. This can result in a highernumber of nurses with an incidenton their shift if the incident rate ishigher during day time shifts andconversely, a lower number in theopposite case.

    There may be other, non-periodical variations that affectthe uniformity of incidents. Inthe case of the Juliana Chil-dren’s hospital there has been arather sensitive matter of policy:whether very ill children, who arenot going to live for very long,should die at home or in the hos-pital wards. We understand thatthis policy did change at leastonce at the JCH in the period ofinterest. Presumably a change inpolicy concerning where the hos-pital wants children to die, willhave impact on the rate of inci-dents. Further, incidents may beclustered, since one patient cangive rise to several incidents.

    On the other hand the waynurses are assigned to shifts iscertainly not uniform and ‘ran-

    dom’. Nurses take shifts in pat-terns, for example several nightshift on a row, alternated byrows of evening or day shifts.Nurses are assigned to shifts ac-cording to skills, qualificationand other characteristics. Maybesome nurses take relatively moreweekend shifts than others, be-cause of personal circumstances.

    Although both the assignmentof nurses to shifts and the as-signment of incidents to shifts arenot uniform processes, one couldhope that there might be some‘mixing’ condition that makesthe ultimate result indistinguish-able from the postulated indepen-dence and uniformity. Certainlyone may hope, but this magi-cal mechanism should at least bemade plausible.

    Taken together, even if weconsider both the shifts of agiven nurse as a random pro-cess, and the incidents on a wardas a random process, and evenif we consider the two processesas stochastically independent ofone another, the assumption ofconstant intensities of either is aguess, not based on any evidenceor argument. There may be pat-terns in the risk of incidents andthere are certainly patterns in theshifts of nurses. These patternsmay be correlated, through theprocess by which shifts are sharedover the different nurses accord-ing to their different personal sit-uations, their different wishes forparticular kinds of shifts, theirdifferent qualifications, and thechanging situation on the ward.

    How stable are the hy-pergeometric probabili-ties under small changesin the data?

    Consider the data of the ward atJCH. These numbers and theirinterpretation are at the root ofwhat turned out to be one of thegravest miscarriages of justice ofthe Dutch Juridical system. Un-der the assumption of the hyper-geometric distribution the proba-bility of this configuration is verysmall, less than one in nine mil-lion. The configuration is in somerespects extreme: eight out ofeight incidents occur in the shiftof one nurse. However the dataare in another respect also con-spicuous: no incidents occur inthe 887 shifts where this nursewas not present (see Table 1).The data collection had been farfrom flawless, with no formal def-inition of incident, no or incom-plete documentation, and restedat least in part on recollectionof witnesses who were aware ofwhich facts were looked for.

    Assuming the possibility oftiny flaws in the process of dataacquisition, it is legitimate to in-vestigate the effect of 1, 2, . . . , 8incidents that could have beenforgotten, or overlooked. Thisamounts to allowing a maximalerror of less than one percent.The results are quite remarkable;in table 2 we give the probabili-ties..

    8

  • Table 2: The effect of perturbations on the probabilities

    Shifts with incidentsoutside Lucia’s (postulated) 0 1 2 3 4Probability 1/9043864 1/1137586 1/257538 1/79497 1/29989

    Shifts (continued) 5 6 7 8Probability 1/13051 1/6329 1/3341 1/1889

    The very small numbers van-ish easily. Six or more inci-dents not remembered, not re-ported, or just defined away makethe difference between astronom-ically small on the one hand andvery unusual on the other. Thisshows that the probabilities arevery sensitive to small errors inthe data.

    A judgement on data qual-ity is not only the concern ofa statistician. Judges are usedto inconsistent and incompletedata (statements), psychologistsare very well aware of the possiblefallacies of memory. Both groupshave their own professional stan-dards of how to deal with thesephenomena. A statistician, how-ever, should point out what theeffects of these phenomena can beon the outcome of his models.

    If this model is used to cor-roborate evidence this sensitivityshould be made explicit, just asadverse workings of a medicineare mentioned explicitly for theusers.

    Concluding remarks

    In the body of this paper we haveshown the considerable effect thata modest amount of heterogene-ity can have on tail probabili-ties. The broader impact of al-lowing heterogeneity in the analy-sis of (medical) research has inter-esting consequences outside thecase of Lucia de Berk. What re-mains is a very short descriptionof how the case ended in acquit-tal. Lucia was arrested in de-cember 2001. As indicated inthe introduction, the court (of ap-peal) stated that it did not in-clude statistical considerations asbasis for its verdict. This maybe true for formal statistical con-siderations, but the essential stepin the construction of the guiltyverdict was that only one or twocases of murder had to be provenconvincingly, the rest of the mur-ders could be considered provenbased on the ”very improbable”occurrence of incidents during theshifts of Lucia. In this way sta-tistical considerations were cru-

    cial, but the verdict was immu-nized against formal statistics. Inthis way Lucia was convicted in2004 for seven murders and threeattempts of murder. What fol-lowed was a long legal strugglewhere the emphasis was on thevalidity of the medical argumentsand increasingly intricate juridi-cal matters. The Lucia case wasfiercely debated in public and thestatistical notions remained herean important issue. Statisticians,now banned from the courtrooms,continued to play a role, for ex-ample by mobilizing the scientificcommunity. Gradually the no-tion emerged that a gross miscar-riage of justice had taken place.A complicating factor remainedthat, since the juridical path hadbeen followed till the end, a new“fact”, a so-called novum had tobe found. In 2008, Lucia was al-lowed to wait for the end of thelegal proceedings outside prison,and two years later she was fi-nally acquitted of all murder ac-cusations.

    9

  • References

    [1] H. Elffers. Distribution of incidents of resuscitation and death in the Juliana Kinderziekenhuis [JulianaChildren’s Hospital] and the Rode Kruisziekenhuis [Red Cross Hospital]. Unpublished report to theCourt, May 29 2002. URL http://www.math.leidenuniv.nl/~gill/Elffers2eng.pdf.

    [2] H. Elffers. Distribution of incidents of resuscitation and death in the Juliana Kinderziekenhuis [JulianaChildren’s Hospital] and the Rode Kruisziekenhuis [Red Cross Hospital]. Unpublished report to theCourt, May 8 2002. URL http://www.math.leidenuniv.nl/~gill/Elffers1eng.pdf.

    [3] William Feller. An introduction to probability theory and its applications. Vol. I. Third edition. JohnWiley & Sons, Inc., New York-London-Sydney, 1968.

    [4] Kuhn G. Circadian rhythm, shift work, and emergency medicine. Ann Emerg Med, 37:88–98, 2000.doi: 10.1067/mem.2001.111571.

    [5] Hans A. J. M. Kuijsten, Sylvia Brinkman, Iwan A. Meynaar, Peter E. Spronk, Johan I. van derSpoel, Rob J. Bosman, Nicolette F. de Keizer, Ameen Abu-Hanna, and Dylan W. de Lange. Hospitalmortality is associated with icu admission time. Intensive Care Med, online first, 2010. doi: 10.1007/s00134-010-1918-1.

    [6] R. Meester, M. Collins, R.D. Gill, and M. van Lambalgen. On the (ab)use of statistics in the legalcase against the nurse Lucia de B. Probability and Risk, 5:233–250, 2007. With discussion by DavidLucy.

    [7] Leila Schneps and Coralie Colmez. Math on trial. Basic Books, New York, 2013. ISBN 978-0-465-03292-1. How numbers get used and abused in the courtroom.

    10