the effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · the...

70
The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan M. Brauner MD @,*, 1,2 , Sören Mindermann *, 1 , Mrinank Sharma *, 3 , David Johnston 4,5 , John Salvatier 5 , Tomáš Gavenˇ ciak PhD 6 , Anna B. Stephenson 7 , Gavin Leech 8 , George Altman MBChB 9 , Vladimir Mikulik 6 , Alexander John Norman 10 , Joshua Teperowski Monrad 2 , Tamay Besiroglu 11 , Hong Ge PhD 12 , Meghan A. Hartwick PhD 13 , Prof Yee Whye Teh PhD 14 , Prof Leonid Chindelevitch PhD +, 15 , Prof Yarin Gal PhD +, 1 , Jan Kulveit PhD +, 2 1 OATML Group, Department of Computer Science, University of Oxford, Oxford, United Kingdom. 2 Future of Humanity Institute, University of Oxford, Oxford, United Kingdom. 3 Department of Engineering Science, University of Oxford, Oxford, United Kingdom. 4 College of Engineering and Computer Science, Australian National University, Australia. 5 Quantified Uncertainty Research Institute, California, USA. 6 No affiliation. 7 Harvard John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts, USA. 8 School of Computer Science, University of Bristol, Bristol, United Kingdom. 9 School of Medical Sciences, University of Manchester, Manchester, United Kingdom. 10 MPLS Doctoral Training Centre, University of Oxford, Oxford, United Kingdom 11 Faculty of Economics, University of Cambridge, Cambridge, United Kingdom. 12 Engineering Department, University of Cambridge, Cambridge, United Kingdom. 13 Tufts Initiative for the Forecasting and Modeling of Infectious Diseases, Tufts University, Boston, USA. 14 Department of Statistics, University of Oxford, Oxford, United Kingdom. 15 Department of Infectious Disease Epidemiology, Imperial College, London, United Kingdom. Abstract Governments are attempting to control the COVID-19 pandemic with nonpharmaceutical interventions (NPIs). However, it is still largely unknown how effective different NPIs are at reducing transmission. Data-driven studies can estimate the effectiveness of NPIs while minimising assumptions, but existing analyses lack sufficient data and validation to robustly distinguish the effects of individual NPIs. We gather chronological data on NPIs in 41 countries between January and the end of May 2020, creating the largest public NPI dataset collected with independent double entry. Wethen estimate the effectiveness of 8 NPIs with a Bayesian hierarchical model by linking NPI implementation dates to national case and death counts. The results are supported by extensive empirical validation, including 11 sensitivity analyses with over 200 experimental conditions. We find that closing schools and universities was highly effective; that banning gatherings and closing high-risk businesses @ Correspondence to [email protected] * Equal contribution + Contributed equally to senior authorship . CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted October 14, 2020. ; https://doi.org/10.1101/2020.05.28.20116129 doi: medRxiv preprint NOTE: This preprint reports new research that has not been certified by peer review and should not be used to guide clinical practice.

Upload: others

Post on 23-Jan-2021

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan

The effectiveness of eight nonpharmaceutical interventions againstCOVID-19 in 41 countries

Jan M Brauner MDlowast 12 Soumlren Mindermannlowast1 Mrinank Sharmalowast3 David Johnston45John Salvatier5 Tomaacuteš Gavenciak PhD6 Anna B Stephenson7 Gavin Leech8 GeorgeAltman MBChB9 Vladimir Mikulik6 Alexander John Norman10 Joshua TeperowskiMonrad2 Tamay Besiroglu11 Hong Ge PhD12 Meghan A Hartwick PhD13 Prof Yee WhyeTeh PhD14 Prof Leonid Chindelevitch PhD+15 Prof Yarin Gal PhD+1 Jan Kulveit PhD+2

1OATML Group Department of Computer Science University of Oxford Oxford United Kingdom2Future of Humanity Institute University of Oxford Oxford United Kingdom3Department of Engineering Science University of Oxford Oxford United Kingdom4College of Engineering and Computer Science Australian National University Australia5Quantified Uncertainty Research Institute California USA6No affiliation7Harvard John A Paulson School of Engineering and Applied Sciences Harvard University CambridgeMassachusetts USA8School of Computer Science University of Bristol Bristol United Kingdom9School of Medical Sciences University of Manchester Manchester United Kingdom10MPLS Doctoral Training Centre University of Oxford Oxford United Kingdom11Faculty of Economics University of Cambridge Cambridge United Kingdom12Engineering Department University of Cambridge Cambridge United Kingdom13Tufts Initiative for the Forecasting and Modeling of Infectious Diseases Tufts University Boston USA14Department of Statistics University of Oxford Oxford United Kingdom15Department of Infectious Disease Epidemiology Imperial College London United Kingdom

Abstract

Governments are attempting to control the COVID-19 pandemic with nonpharmaceuticalinterventions (NPIs) However it is still largely unknown how effective different NPIs areat reducing transmission Data-driven studies can estimate the effectiveness of NPIs whileminimising assumptions but existing analyses lack sufficient data and validation to robustlydistinguish the effects of individual NPIs We gather chronological data on NPIs in 41countries between January and the end of May 2020 creating the largest public NPI datasetcollected with independent double entry We then estimate the effectiveness of 8 NPIs witha Bayesian hierarchical model by linking NPI implementation dates to national case anddeath counts The results are supported by extensive empirical validation including 11sensitivity analyses with over 200 experimental conditions We find that closing schools anduniversities was highly effective that banning gatherings and closing high-risk businesses

Correspondence to janbraunerengoxacuklowastEqual contribution+Contributed equally to senior authorship

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

NOTE This preprint reports new research that has not been certified by peer review and should not be used to guide clinical practice

was effective but closing most other businesses had limited further benefit and that manycountries may have been able to reduce R below 1 without issuing a stay-at-home order

2

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Introduction

Worldwide governments have mobilised vast resources to fight the COVID-19 pandemic Awide range of nonpharmaceutical interventions (NPIs) has been deployed including dras-tic measures like stay-at-home orders and the closure of all nonessential businesses Recentanalyses show that these large-scale NPIs were jointly effective at reducing the virusrsquo ef-fective reproduction number1 but it is still largely unknown how effective individual NPIswere As time progresses and more data become available we can move beyond estimatingthe combined effect of a bundle of NPIs and begin to understand the effects of individualinterventions This can help governments efficiently control the epidemic by focusing onthe most effective NPIs to ease the burden put on the population

A promising way to estimate NPI effectiveness is data-driven cross-country modelling in-ferring effectiveness by relating the NPIs implemented in different countries to the courseof the epidemic in these countries To disentangle the effects of individual NPIs we need toleverage data from multiple countries with diverse sets of interventions in place Previousdata-driven studies (Table F8) estimate effectiveness for individual countries2ndash4 or NPIsalthough some exceptions exist15ndash8 (summarised in Table F7) In contrast we evaluatethe impact of 8 NPIs on the epidemicrsquos growth in 34 European and 7 non-European coun-tries To isolate the effect of individual NPIs we also require sufficiently diverse data Ifall countries implemented the same set of NPIs on the same day the individual effect ofeach NPI would be unidentifiable However the COVID-19 response was far less coordi-nated countries implemented different sets of NPIs at different times in different orders(Figure 1)

Even with diverse data from many countries estimating NPI effects remains a challengingtask First models are based on epidemiological parameters that are only known with un-certainty our NPI effectiveness study to the best of our knowledge is the first to includesome of this uncertainty in the model Second the data are retrospective and observationalmeaning that unobserved factors could confound the results Third NPI effectiveness es-timates can be highly sensitive to arbitrary modelling decisions as demonstrated by tworecent replication studies910 Fourth large-scale public NPI datasets suffer from frequentinconsistencies11 and missing data12 For these reasons the data and the model must becarefully validated and insufficiently validated results should not be used to guide pol-icy decisions We collect the largest public dataset on NPI implementation dates that wasvalidated by independent double entry and perform to our knowledge the most extensivevalidation of any COVID-19 NPI effectiveness estimates to datemdasha crucial but largely absentor incomplete element of NPI effectiveness studies10

Even with extensive validation we need to be careful when interpreting this studyrsquos resultsWe only study the impact NPIs had between January and the end of May 2020 and NPIeffectiveness may change over time as circumstances change In particular lifting an NPIdoes not imply that transmission will return to its original level These and other limitationsare detailed in the Discussion

3

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask wearing Gatherings lt1000 Gatherings lt 100 Gatherings lt 10 Some businessesclosed

Universities closedSchools closed Stay-at-home orderMost businessesclosed

Figure 1 Timing of NPI implementations in early 2020 Crossed-out symbols signify when an NPI waslifted Detailed definitions of the NPIs are given in Table 1

4

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Methods

Dataset

We analyse the effects of NPIs (Table 1) in 41 countriesa (see Figure 1) We recorded NPIimplementations when the measures were implemented nationally or in most regions of acountry (affecting at least three fourths of the population) For each country the windowof analysis starts on the 22nd of January and ends after the first NPI was lifted or on the30th of May 2020 whichever was earlier The reason to end the analysis after the firstmajor reopeningb was to avoid a distribution shift For example when schools reopenedit was often with safety measures such as smaller class sizes and distancing rules It istherefore expected that contact patterns in schools will have been different before schoolclosure compared to after reopening Modelling this difference explicitly is left for futurework Data on confirmed COVID-19 cases and deaths were taken from the Johns HopkinsCSSE COVID-19 Dataset13 The data used in this study including sources are availableonline here

aThe countries were selected for the availability of reliable NPI data at the time when we started data collectionand modelling (April 2020) and for their presence in at least one of the public datasets that we used tocross-validate our collected data We excluded countries with fewer than 100 cases (or 10 deaths) by March31 as our model neglects new cases and deaths below these thresholds We also excluded a small numberof countries if there were credible media reports casting doubt on the trustworthiness of their reporting ofcases and deaths Finally we excluded very large countries like China the US and Canada for ease of datacollection as these would require more locally fine-grained data 33 of the 41 included countries are in EuropeAs a result the NPI effectiveness estimates may be biased towards effects in Europe and NPI effectiveness mayhave been different in other parts of the world

bConcretely the window of analysis extended until three days after the first reopening for confirmed cases and13 days after the first reopening for deaths These values correspond to the 5 quantile of the infection-to-confirmationdeath distributions ensuring that less than 5 of the new infections on the reopening day werestill observed in the window of analysis

5

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table 1 NPIs included in the study Appendix G details how edge cases in the data collection werehandled

NPI DescriptionMask-wearingmandatory in(some) publicspaces

A country has mandated mask usage in the public sometimes limitedto just some public spaces (which the government deems to have a highrisk of infection) For example some countries mandated mask-wearingin most or all indoor public spaces but not outdoors

Gatheringslimited to 1000people or less

A country has set a size limit on gatherings The limit is at most 1000people (often less) and gatherings above the maximum size are disal-lowed For example a ban on gatherings of 500 people or more wouldbe classified as ldquogatherings limited to 1000 or lessrdquo but a ban on gath-erings of 2000 people or more would not

Gatheringslimited to 100people or less

A country has set a size limit on gatherings The limit is at most 100people (often less)

Gatheringslimited to 10people or less

A country has set a size limit on gatherings The limit is at most 10people (often less)

Some businessesclosed

A country has specified a few kinds of customer-facing businesses thatare considered ldquohigh riskrdquo and need to suspend operations (black-list) Common examples are restaurants bars nightclubs cinemasand gyms By default businesses are not suspended

Most nonessentialbusinesses closed

A country has suspended the operations of many customer-facing busi-nesses By default customer-facing businesses are suspended unlessthey are designated as essential (whitelist)

Schools closed A country has closed most or all schoolsUniversitiesclosed

A country has closed most or all universities and higher education fa-cilities

Stay-at-homeorder (withexemptions)

An order for the general public to stay at home has been issued This ismandatory not just a recommendation Exemptions are usually grantedfor certain purposes (such as shopping exercise or going to work) ormore rarely for certain times of the day In practice a stay-at-home or-der was often accompanied or preceded by other NPIs such as businessclosures We account for these other NPIs separately eg when a coun-try issued a law that both closed businesses and ordered the populationto stay at home this is encoded as both a business closure and a stay-at-home order (with exemptions) in the data In our results we thusestimate the additional effect of the order to generally stay at home

6

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Data collection

We collected data on the start and end date of NPI implementations from the start of thepandemic until the 30th of May 2020 Before collecting the data we experimented withseveral public NPI datasets finding that they were not complete enough for our modellingand contained incorrect datesc By focusing on a smaller set of countries and NPIs than thesedatasets we were able to enforce strong quality controls We used independent doubleentry and manually compared our data to public datasets for cross-checking

First two authors independently researched each country and entered the NPI data into sep-arate spreadsheets The researchers manually researched the dates using internet searchesthere was no automatic component in the data gathering process The average time spentresearching each country per researcher was 15 hours

Second the researchers independently compared their entries to the following public datasetsand if there were conflicts visited all primary sources to resolve the conflict the EFGNPIdatabase14 the Oxford COVID-19 Government Response Tracker15 and the mask4all dataset16

Third each country and NPI was again independently entered by one to three paid con-tractors who were provided with a detailed description of the NPIs and asked to includeprimary sources with their data A researcher then resolved any conflicts between this dataand one (but not both) of the spreadsheets

Finally the two independent spreadsheets were combined and all conflicts resolved by aresearcher The final dataset contains primary sources (government websites andor mediaarticles) for each entry

Data Preprocessing

When the case count is small a large fraction of cases may be imported from other countriesand the testing regime may change rapidly To prevent this from biasing our model weneglect case numbers before a country has reached 100 confirmed cases and death numbers

cWe evaluated the following datasets

bull Epidemic Forecasting Global NPI Database14

bull Oxford COVID-19 Government Response Tracker (OxCGRT)15

bull ACAPS COVID19 Government Measures Dataset

Note that these datasets are under continuous development Many of the mistakes found will already havebeen corrected We know from our own experience that data collection can be very challenging We havethe fullest respect for the people behind these datasets In this paper we focus on a more limited set ofcountries and NPIs than these datasets contain allowing us to ensure higher data quality in this subset Givenour experience with public datasets and our data collection we encourage fellow COVID-19 researchers toindependently verify the quality of public data they use if feasible

7

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

before a country has reached 10 deaths We include these thresholds in our sensitivityanalysis (Appendix C3)

Short model description

For each country c

For each day t

New infections N (sdot)tc

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Mask wearing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closedSchools closed

Stay-at-home order

Growth reductions from interventions αi i

Daily growth rate gtc

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Uncertain delay from infection to

confirmation death

Uncertain generation interval

Universities closed

Figure 2 Model Overview Purple nodes are observed We describe the diagram from bottom to topThe effectiveness of NPI i is represented by αi which is independent of the country On each day t a countryrsquos daily reproduction number Rt c depends on the countryrsquos basic reproduction number R0cand the active NPIs The active NPIs are encoded by Φi t c which is 1 if NPI i is active in country c attime t and 0 otherwise Rt c is transformed into the daily growth rate gt c using the generation intervalparameters and subsequently is used to compute the new infections N (C )

t c and N (D)t c that will turn into

confirmed cases and deaths respectively Finally the number of new confirmed cases Ct c and deathsDt c is computed by a discrete convolution of N (middot)

t c with the respective delay distributions Our modeluses both death and case data it splits all nodes above the daily growth rate gt c into separate branchesfor deaths and confirmed cases We account for uncertainty in the generation interval infection-to-case-confirmation delay and the infection-to-death delay by placing priors over the parameters of thesedistributions

In this section we give a short summary of the model (Figure 2) The detailed modeldescription is given in Appendix A Code is available online here

Our model uses case and death data from each country to lsquobackwardsrsquo infer the number ofnew infections at each point in time which is itself used to infer the reproduction numbersNPI effects are then estimated by relating the daily reproduction numbers to the activeNPIs across all days and countries This relatively simple data-driven approach allows usto sidestep assumptions about contact patterns and intensity infectiousness of different age

8

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

groups and so forth that are typically required in modelling studies We make several addi-tions to the semi-mechanistic Bayesian hierarchical model of Flaxman et al1 allowing ourmodel to observe both cases and death data This increases the amount of data from whichwe can extract NPI effects reduces distinct biases in case and death reporting and reducesthe bias from including only countries with many deaths Since epidemiological parametersare only known with uncertainty we place priors over them following recent recommendedpractice17 Additionally as we do not aim to infer the total number of COVID-19 infectionswe can avoid assuming a specific infection fatality rate (IFR) or ascertainment rate (rate oftesting)

The growth of the epidemic is determined by the time- and country-specific reproductionnumber Rt c which depends on a) the (unobserved) basic reproduction number R0c givenno active NPIs and b) the active NPIs at time t R0c accounts for all time-invariant factorsthat affect transmission in country c such as differences in demographics population den-sity culture and health systems18 We assume that the effect of each NPI on Rt c is stableacross countries and time The effectiveness of NPI i is represented by a parameter αi over which we place an Asymmetric Laplace prior that allows for both positive and negativeeffects but places 80 of its mass on positive effects reflecting that NPIs are more likely toreduce Rt c than to increase it Following Flaxman et al and others168 each NPIrsquos effecton Rt c is assumed to independently affect Rt c as a multiplicative factor

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (1)

where φi ct = 1 indicates that NPI i is active in country c on day t (φi ct = 0 otherwise) andI is the number of NPIs The multiplicative effect encodes the plausible assumption thatNPIs have a smaller absolute effect when Rt c is already low We discuss the meaning ofeffectiveness estimates given NPI interactions in the Results section

In the early phase of an epidemic the number of new daily infections grows exponentiallyDuring exponential growth there is a one-to-one correspondence between the daily growthrate and Rt c

19 The correspondence depends on the generation interval (the time betweensuccessive infections in a chain of transmission) which we assume to have a Gamma dis-tribution The prior on the mean generation interval has mean 506 days derived from ameta-analysis20

We model the daily new infection count separately for confirmed cases and deaths repre-senting those infections which are subsequently reported and those which are subsequentlyfatal However both infection numbers are assumed to grow at the same daily rate in ex-pectation allowing the use of both data sources to estimate each αi The infection numberstranslate into reported confirmed cases and deaths after a stochastic delay The delay is thesum of two independent distributions assumed to be equal across countries the incuba-tion period and the delay from onset of symptoms to confirmation We put priors over themeans of both distributions resulting in a prior over the mean infection-to-confirmationdelay with a mean of 1092 days20 see Appendix A3 Similarly the infection-to-death

9

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

delay is the sum of the incubation period and the delay from onset of symptoms to deathand the prior over its mean has a mean of 218 days20 Finally as in related NPI models16both the reported cases and deaths follow a negative binomial noise distribution with aninferred noise dispersion

Using a Markov chain Monte Carlo (MCMC) sampling algorithm21 this model infers pos-terior distributions of each NPIrsquos effectiveness while accounting for cross-country variationsin testing reporting and fatality rates as well as uncertainty in the generation interval anddelay distributions To analyse the extent to which modelling assumptions affect the re-sults our sensitivity analysis includes all epidemiological parameters prior distributionsand many of the structural assumptions introduced above (Appendix B2 and AppendixC) MCMC convergence statistics are given in Appendix C7

Results

NPI Effectiveness

Our model enables us to estimate the individual effectiveness of each NPI expressed as apercentage reduction in R As in related work168 this percentage reduction is modelledas constant over countries and time and independent of the other implemented NPIs Inpractice however NPI effectiveness may depend on other implemented NPIs and localcircumstances Thus our effectiveness estimates ought to be interpreted as the effectivenessaveraged over the contexts in which the NPI was implemented in our data10 Our results thusgive the average NPI effectiveness across typical situations that the NPIs were implementedin Figure 3 (bottom left) visualises which NPIs typically co-occurred aiding interpretation

Under the default model settings the mean percentage reduction in R (with 95 credibleinterval) associated with each NPI is as follows (Figure 3) mandating mask-wearing in(some) public spaces -1 (-13ndash8) limiting gatherings to 1000 people or less 13(-3ndash31) to 100 people or less 28 (9ndash44) to 10 people or less 36 (17ndash53) closing some high-risk businesses 20 (0ndash40) closing most nonessential busi-nesses 29 (8ndash47) closing schools and universities 41 (23ndash56) and issuingstay-at-home orders (with exemptions) 10 (-2ndash22) Note that we cannot robustlydisentangle the individual effects of closing schools and closing universities since the im-plementation dates of these NPIs coincided nearly perfectly in all countries except Icelandand Sweden (Appendix D21) We thus show the joint effect of closing both schools anduniversities and treat schools and universities closed as one single NPI going forward

Some NPIs frequently co-occurred ie were partly collinear However we are able to iso-late the effects of individual NPIs since the collinearity is imperfect and our dataset is largeFor every pair of NPIs we observe one of them without the other for 748 country-days onaverage (Appendix D22) The minimum number of country-days for any NPI pair is 143(for limiting gatherings to 1000 or 100 attendees) Additionally under excessive collinear-ity and insufficient data to overcome it individual effectiveness estimates are highly sensi-

10

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75 100Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spaces

Gatherings limited to1000 people or less

Gatherings limited to100 people or less

Gatherings limited to10 people or less

Some businessesclosed

Most nonessentialbusinesses closed

Schools and universitiesclosed

Stay-at-home order(with exemptions)

EndsTransmission

北 便

i

j

100

100

100 81

10

0 64

100

100 45

27

100 94

78

89

65

88

93

44

29

100

100 82

92

68

91

96

46

28

100

100

100 99

76

97

98

55

30

99

97

86

100 72

95

98

49

27

99

98

91

100

100 99

99

64

30

97

95

84

95

72

100 99

49

29

98

95

81

92

68

93

100 46

28

100

100 98

10

0 94

99

99

100

Frequency i Active Given j Active

0 500 1000 1500 2000 2500 3000

Days

Total Days Active

Figure 3 Top NPI effects under default model settings The Figure shows the average percentagereductions in R as observed in our data (or in terms of the model the posterior marginal distributionsof 1minusexp(minusαi )) with median 50 and 95 credible intervals A negative 1 reduction refers to a 1increase in R Cumulative effects are shown for hierarchical NPIs (gathering bans and business closures)ie the result for Most nonessential businesses closed shows the cumulative effect of two NPIs withseparate parameters and symbols - closing some (high-risk) businesses and additionally closing mostremaining (non-high-risk but nonessential) businesses given that some businesses are already closedBottom Left Conditional activation matrix Cell values indicate the frequency that NPI i (x-axis) wasactive given that NPI j (y-axis) was active Eg schools were always closed whenever a stay-at-homeorder was active (bottom row third column from the right) but not vice versa Bottom Right Totalnumber of days each NPI was active across all countries

11

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Most businessesclosed

Stay-at-homeorder

Mask wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businessesclosed

Schools anduniversities closed

Figure 4 Combined NPI effectiveness for the most common sets of NPIs in our data by size of the NPI setShaded regions denote 50 and 95 credible intervals Left Maximum R0 that could be reduced to be-low 1 for each set of NPIs Right Predicted R after implementation of each set of NPIs assuming R0 = 38Readers can interactively explore the effects of all sets of NPIs at httpepidemicforecastingorgcalc

tive to variations in the data and model parameters22 High sensitivity prevented Flaxmanet al1 who had a smaller dataset from disentangling NPI effects9 Our estimates are sub-stantially less sensitive (see next section) Finally the posterior correlations between theeffectiveness estimates are weak suggesting manageable collinearity (Appendix D23)

Although the correlations between the individual estimates are weak we should take theminto account when evaluating combined NPI effects For example if two NPIs frequentlyco-occurred there may be more certainty about the combined effect than about the twoindividual effects Figure 4 shows the combined effectiveness of the sets of NPIs thatare most common in our data Together our set of NPIs reduced R by 77 (74ndash79)on average Across countries the mean R without any NPIs (ie the R0) was 33 (Ta-ble D5 reports R0 for all countries) Starting from this number the estimated R likely couldhave been reduced below 1 by closing schools and universities high-risk businesses andlimiting gathering sizes Readers can interactively explore the effects of sets of NPIs athttpepidemicforecastingorgcalc A CSV file containing the joint effectiveness of all NPIcombinations is available online here

Sensitivity and validation

We perform a range of validation and sensitivity experiments (Appendix B with furtherexperiments in Appendix C) First we analyse how the model extrapolates to unseen coun-tries and find that it makes calibrated forecasts over periods of up to 2 months with un-certainty increasing over time Further we perform multiple sensitivity analyses studyinghow results change if we modify the priors over epidemiological parameters exclude coun-tries from the dataset use only deaths or confirmed cases as observations vary the datapreprocessing and more Finally we investigate our key assumptions by showing results

12

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

for several alternative models (structural sensitivity10) and examine possible confoundingof our estimates by unobserved factors influencing R In total we consider 201 alternativeexperimental conditions

Figure 5 (left) shows the median NPI effectiveness across these 201 experimental condi-tions Compared to the results under our default settings (Figures 3 and 4) median NPIeffects vary under alternative plausible experimental conditions However the trends in theresults are robust and some NPIs outperform others under all tested conditions While wetest over large ranges of plausible values our experiments do not include every possiblesource of uncertainty and the results might change more substantially under experimentalconditions we have not tested

We categorise NPI effects into small moderate and large which we define as a medianreduction in R of less than 175 between 175 and 35 and more than 35 (verticallines in Figure 5) Six of the NPIs fall into a single category in a large fraction of experi-mental conditions school and university closures are associated with a large effect in 99of experimental conditions limiting gatherings to 10 people or less in 94 Closing mostnonessential businesses has a moderate effect in 96 of conditions limiting gatherings to100 people or less in 97 Making mask-wearing mandatory in (some) public spaces fallsinto the ldquosmall effectrdquo category in 100 of experimental conditions issuing stay-at-homeorders (with exemptions) in 99 Two NPIs fall less clearly into one category Closing some(high-risk) businesses has a moderate effect in 86 of conditions and limiting gatherings to1000 people or less has a small effect in 79 However both NPIs have small-to-moderateeffects in more than 99 of experimental conditions The effect of limiting gatherings to1000 people or less is the least stable across the sensitivity analyses which may reflect itsaforementioned partial collinearity with limiting gatherings to 100 people or less

Aggregating all sensitivity analyses can hide sensitivity to specific assumptions We displaythe median NPI effects in four categories of sensitivity analyses (Figure 5 right) and eachindividual sensitivity analysis is shown in the Appendix The trends in the results are alsostable within the categories

Discussion

We use a data-driven approach to estimate the effects that eight nonpharmaceutical in-terventions had on COVID-19 transmission in 41 countries between January and the endof May 2020 We find that several NPIs were associated with a clear reduction in R inline with the mounting evidence that NPIs can be effective at mitigating and suppressingoutbreaks of COVID-19 Furthermore our results suggest that some NPIs outperformedothers While the exact effectiveness estimates vary with modelling assumptions the broadconclusions discussed below are largely robust across 201 experimental conditions in 11sensitivity analyses

13

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

0 25 50Median reduction in Rt

Mask-wearing mandatory in(some) public spacesGatherings limited to1000 people or lessGatherings limited to100 people or lessGatherings limited to10 people or lessSome businessesclosedMost nonessentialbusinesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

All Sensitivity Analyses (201 conditions)北便

Model Structure(5 conditions)

北便

Data and Preprocessing(51 conditions)

0 25 50Median reduction in Rt

北便

Epidemiological Priors(131 conditions)

0 25 50Median reduction in Rt

北便

Unobserved Factors(14 conditions)

Figure 5 Median NPI effects across the sensitivity analyses Left Median NPI effects (reduction in R)when varying different components of the model or the data in 201 experimental conditions Results aredisplayed as violin plots using kernel density estimation to create the distributions Inside the violinsthe box plots show median and interquartile-range The vertical lines mark 0 175 and 35 (seetext) Right Categorised sensitivity analyses Structural Using only cases or only deaths as observations(2 experimental conditions Figure B12) varying the model structure (3 conditions Figure B13 left)Data Leaving out one country at a time (41 conditions Figure B10) varying the threshold below whichcases and deaths are masked (8 conditions Figure C17) sensitivity to correcting for undocumentedcases and to country-level differences in case ascertainment (2 conditions Figure B11) Epidemiologicalpriors Jointly varying the means of the priors over the means of the generation interval the infection-to-case-confirmation delay and the infection-to-death delay (125 conditions Figure C15) varying theprior over R0 (4 conditions Figure C16 left) varying the prior over NPI effectiveness (2 conditionsFigure C16 right) Unobserved factors Excluding observed NPIs one at a time (9 conditions Figure B14left) controlling for additional NPIs from a different dataset (5 conditions Figure B14 right)

14

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Business closures and gathering bans both seem to have been effective at reducing COVID-19 transmission Closing only high-risk businesses appears to have been only somewhat lesseffective than closing most nonessential businesses the median reduction in R differs onlyby 9 (6ndash12 mean and 95 interval of median estimates across the experiments set-tings in Figure 5) Closing only high-risk businesses may thus have been the more promisingpolicy option in some circumstances Limiting gatherings to 10 people or less was more ef-fective than limits up to 100 or 1000 people This may reflect the fact that small gatheringsare common

As previously discussed we estimate the average effect each NPI had in the contexts inwhich it was implemented When countries introduced stay-at-home orders they nearly al-ways also banned gatherings and closed schools universities and some or most businessesif they had not done so already (Figure 3 bottom left) Flaxman et al1 and Hsiang et al3

add the effect of these distinct NPIs to the effectiveness of stay-at-home orders and accord-ingly find a large effect In contrast we account for these other NPIs separately and isolatethe additional effect of ordering the population to stay at home (when large gatheringsare banned and educational institutions and some businesses closed)d In accordance withother studies that took this approach we find a small effect26 A typical country could havereduced R to below 1 without a stay-at-home order (Figure 4) provided other NPIs wereimplemented

Mandating mask-wearing in various public spaces had no clear effect on average in thecountries we studied This does not rule out mask-wearing mandates having a larger ef-fect in other contexts In our data mask-wearing was only mandated when other NPIshad already reduced public interactions When most transmission occurs in private spaceswearing masks in public is expected to be less effective This might explain why a largereffect was found in studies that included China and South Korea where mask-wearingwas introduced earlier823 While there is an emerging body of literature indicating thatmask-wearing can be effective in reducing transmission the bulk of evidence comes fromhealthcare settings24 In non-healthcare settings risk compensation25 may play a largerrole potentially reducing effectiveness While our results cast doubt on reports that mask-wearing is the main determinant shaping a countryrsquos epidemic23 the policy still seemspromising given all available evidence due to its comparatively low economic and socialcosts Its effectiveness may have increased as other NPIs have been lifted and public inter-actions have recommenced

We find a large effect for school and university closures This finding is remarkably robustacross different model structures variations in the data and epidemiological assumptions(Figure 5) It remains robust when controlling for NPIs excluded from our study (Fig-ure B14) Our approach cannot distinguish direct and indirect effects such as forcing par-

dIn principle a stay-at-home order does not imply these other NPIs It is possible for a country to simultaneouslyhave allowed schools to be open and have a stay-at-home order (with exemptions for attending school) inplace However this is not how stay-at-home orders were implemented by the countries in our dataset andwe cannot estimate the effect of such a hypothetical stay-at-home order from the available data

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ents to stay at home or causing broader behaviour changes by increasing public concernAdditionally since school and university closures almost perfectly coincided in the coun-tries we study our approach cannot distinguish their individual effects (Appendix D21)This limitation likely also holds for other observational studies which do not include dataon university closures and estimate only the effect of school closures1ndash35ndash8 Previous evi-dence on school and university closures is mixed1626 Early data suggest that children andyoung adults are equally susceptible to infection but have a notably lower observed inci-dence rate than older adultsmdashwhether this is due to school and university closures remainsunknown27ndash30 Although infected young people are often asymptomatic they appear toshed similar amounts of virus as older people3132 and might therefore transmit the infec-tion to higher-risk demographics unknowingly As the role of children in transmission isstill unclear33 while outbreaks detected in schools are rising33ndash35 this topic merits carefulattention

Our study has several limitations First NPI effectiveness may depend on the context ofimplementation such as the presence of other NPIs and country-specific factors Our esti-mates must be interpreted as the average effectiveness over the contexts in our dataset10and expert judgement is required to adjust them to local circumstances Second R mayhave been reduced by unobserved NPIs or spontaneous behaviour changes To investigatewhether these reductions could be falsely attributed to the observed NPIs we perform sev-eral additional analyses and find that our results are stable to a range of unobserved effects(Appendix B3) However this sensitivity check cannot provide certainty Investigating therole of unobserved effects is an important topic to explore further Third our results can-not be used without qualification to predict the effect of lifting NPIs For example closingschools and universities seems to have greatly reduced transmission but this does not meanthat reopening them will cause infections to soar Educational institutions can implementsafety measures such as reduced class sizes as they reopen Further work is needed to anal-yse the effects of reopenings we hope that our collected data aids this effort Fourth whilewe included more NPIs than most previous work (Table F7) several promising NPIs wereexcluded For example testing tracing and case isolation may be an important part of acost-effective epidemic response36 but were not included because it is difficult to obtaincomprehensive data We discuss further limitations in Appendix E

Although our work focuses on estimating the impact of NPIs on the reproduction numberR the ultimate goal of governments may be to reduce the incidence prevalence and excessmortality of COVID-19 Controlling R is essential but the contribution of NPIs towards thesegoals may also be mediated by other factors such as their duration and timing37 periodicityand adherence3839 and successful containment40 While each of these factors addressestransmission within individual countries it can be crucial to additionally synchronise NPIsbetween countries since cases can be imported41

In conclusion as governments around the world seek to keep R below 1 while minimisingthe social and economic costs of their interventions we hope that our results can informpolicy decisions on which NPIs to implement in any potential further wave of infectionsAdditionally our work may provide insight on which areas of public life are most in need

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

of restructuring so that they can continue despite the pandemic However our estimatesshould not be seen as the final word on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies simulation studiesexperimental trials and clinical experience

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

The paid contractor work helping with the data collection the development of the interac-tive website and the costs for cloud compute were funded by the Berkeley Existential RiskInitiative

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier V Mikulik MAHartwick L Chindelevitch designed the model and modelling experiments M Sharma ABStephenson T Gavenciak J Salvatier performed and analysed the modelling experimentsJ Kulveit T Gavenciak JM Brauner conceived the research S Mindermann M SharmaJM Brauner L Chindelevitch J Kulveit T Besiroglu did the literature search JM BraunerS Mindermann M Sharma G Leech L Chindelevitch T Besiroglu V Mikulik wrote themanuscript All authors read and gave feedback on the manuscript and approved the finalmanuscript JM Brauner S Mindermann and M Sharma contributed equally L Chindele-vitch Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

3 Solomon Hsiang et al ldquoThe effect of large-scale anti-contagion policies on the COVID-19 pandemicrdquo In Nature 5847820 (2020) pp 262ndash267

4 Shengjie Lai et al ldquoEffect of non-pharmaceutical interventions to contain COVID-19 inChinardquo en In Nature 5857825 (Sept 2020) pp 410ndash413 DOI 101038s41586-020-2293-x

5 Yang Liu et al ldquoThe impact of non-pharmaceutical interventions on SARS-CoV-2 trans-mission across 130 countries and territoriesrdquo In medRxiv (2020)

6 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

7 Nazrul Islam et al ldquoPhysical distancing interventions and incidence of coronavirusdisease 2019 natural experiment in 149 countriesrdquo In bmj 370 (2020)

8 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo In Covid Economics 7 (Apr 2020) pp 46ndash67

9 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

10 Mrinank Sharma et al ldquoOn the robustness of effectiveness estimation of nonpharma-ceutical interventions against COVID-19 transmissionrdquo In arXiv preprint arXiv200713454(2020) URL httpsarxivorgabs200713454

11 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

12 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

13 Ensheng Dong Hongru Du and Lauren Gardner ldquoAn interactive web-based dashboardto track COVID-19 in real timerdquo In The Lancet Infectious Diseases 205 (May 2020)pp 533ndash534 DOI 101016s1473-3099(20)30120-1

14 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

15 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

16 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

17 Sam Abbott et al ldquoEstimating the time-varying reproduction number of SARS-CoV-2using national and subnational case countsrdquo In Wellcome Open Research 5112 (2020)p 112

18 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

19 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

20 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Young Joon Park et al ldquoContact tracing during coronavirus disease outbreak SouthKorea 2020rdquo In Emerg Infect Dis 2610 (2020)

28 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

29 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

30 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

31 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

32 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

33 Chen Stein-Zamir et al ldquoA large COVID-19 outbreak in a high school 10 days afterschoolsrsquo reopening Israel May 2020rdquo In Eurosurveillance 2529 (2020) p 2001352

34 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 38 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile919676Weekly_COVID19_Surveillance_Report_week_38_FINAL_UPDATEDpdf 2020

35 Meira Levinson Muge Cevik and Marc Lipsitch Reopening Primary Schools during thePandemic 2020

36 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

37 K Prem et al ldquoThe effect of control strategies to reduce social mixing on outcomes ofthe COVID-19 epidemic in Wuhan China a modelling studyrdquo In Lancet Public Health5 (5 May 2020) e261ndashe270

38 Patrick G T Walker et al ldquoThe impact of COVID-19 and strategies for mitigationand suppression in low- and middle-income countriesrdquo In Science 3696502 (2020)pp 413ndash422

39 Nicholas G Davies et al ldquoEffects of non-pharmaceutical interventions on COVID-19cases deaths and demand for hospital services in the UK a modelling studyrdquo In TheLancet Public Health 57 (2020) e375ndashe385

40 Xingjie Hao et al ldquoReconstruction of the full transmission dynamics of COVID-19 inWuhanrdquo In Nature 5847821 (Aug 2020)

41 N W Ruktanonchai et al ldquoAssessing the impact of coordinated COVID-19 exit strate-gies across Europerdquo In Science (2020) DOI 101126scienceabc5096

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 23

Appendix A1 Detailed model description 23

Appendix A2 Testing reporting and infection fatality rates 26

Appendix A3 Method for uncertainty over key epidemiological parameters 27

Appendix B Validation 30

Appendix B1 Unseen data 30

Appendix B2 Sensitivity Analysis 30

Appendix B3 Robustness to unobserved effects 35

Appendix C Additional sensitivity analyses and validation 38

Appendix C1 Multivariate Sensitivity Analysis - Expanded Results 38

Appendix C2 Sensitivity to additional epidemiological assumptions 40

Appendix C3 Sensitivity to data preprocessing 40

Appendix C4 Validation by predicting unseen data - all countries 42

Appendix C5 Posterior predictive distributions 47

Appendix C6 Calibration without countries used for hyperparameter selection 48

Appendix C7 MCMC stability results 49

Appendix D Additional results 50

Appendix D1 Estimated R0 by country 50

Appendix D2 Collinearity 51

Appendix D3 Posterior Epidemiological Parameter Distributions 53

Appendix E Additional discussion of assumptions and limitations 55

Appendix E1 Limitations of the data 55

Appendix E2 Model limitations 55

Appendix F Overview of previous work 57

Appendix G Handling edge cases in the data collection 60

Appendix H All model equations 64

Appendix I References 67

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections N (sdot)tc

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Mask wearing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closedSchools closed

Stay-at-home order

Growth reductions from interventions αi i

Daily growth rate gtc

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Uncertain delay from infection to

confirmation death

Uncertain generation interval

Universities closed

Figure A6 Model Overview Purple nodes are observed We describe the diagram from bottom to topThe effectiveness of NPI i is represented by αi which is independent of the country On each day t a countryrsquos daily reproduction number Rt c depends on the countryrsquos basic reproduction number R0cand the active NPIs The active NPIs are encoded by Φi t c which is 1 if NPI i is active in country c attime t and 0 otherwise Rt c is transformed into the daily growth rate gt c using the generation intervalparameters and subsequently is used to compute the new infections N (C )

t c and N (D)t c that will turn into

confirmed cases and deaths respectively Finally the number of new confirmed cases Ct c and deathsDt c is computed by a discrete convolution of N (middot)

t c with the respective delay distributions Our modeluses both death and case data it splits all nodes above the daily growth rate gt c into separate branchesfor deaths and confirmed cases We account for uncertainty in the generation interval infection-to-case-confirmation delay and the infection-to-death delay by placing priors over the parameters of thesedistributions

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1but with several key extensions First we model both confirmed cases and deaths al-lowing us to leverage significantly more data Furthermore we do not assume a specificinfection fatality rate (IFR) since we do not aim to infer the total number of COVID-19 in-fections Additionally since epidemiological parameters are only known with uncertaintywe place priors over them following recent best practice2 Concretely we place priors onthe means and standard deviationsdispersions of the generation interval the infection-to-confirmation delay and the infection-to-death delay These prior choices are detailedin Appendix A3 The end of this section details further adaptations which allow us to

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

minimise assumptions about testing reporting and the IFR Code is available online hereFor readers who wish to re-implement the model a concise list of all equations is given inAppendix H

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior distribution over R0c reflecting the wide disagreement of regionalestimates of R0

3 The mean R0c is 328 based on a meta-analysis4 We parameterize theeffectiveness of NPI i assumed to be same across countries and time with αi Each NPI isassumed to have an independent multiplicative effect on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We place an Asymmetric Laplace prior over αi with scale parameter10 asymmetry parameter 05 and location parameter 0 Our prior allows for (unbounded)positive and negative effects as we cannot a priori exclude the possibility that the introduc-tion of an NPI increases R However our prior places 80 of its mass on positive effectsreflecting a belief that NPIs are much more likely to reduce Rt than not This is a shrinkageprior placing 50 of its mass on lsquosmallrsquo effectiveness (less than 10 change in Rt ) All priordistributions are independent

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c and Rt c

(Eq 29 in5)

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We account for uncertainty inthe generation interval (GI) assumed to be a Gamma distribution by placing prior distri-butions over its mean and standard deviation (Appendix A3) Using (A2) we can writeg t c as a function g t c (Rt c microGIσGI) (see Appendix H) where microGI is the GI mean and σGI isthe GI standard deviation

aMany epidemiological models define growth rates as the exponent r in an exponential growth function Herewe use daily growth rates instead for ease of exposition These choices are mathematically equivalent Notethat we adapted equation (29) in Wallinga amp Lipsitch5 to account for our choice

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Note that the GI can change over time due to NPIs In particular the effective contact tracingand case isolation in China substantially shortened the mean GI to only 26 days among theidentified cases6 Since most cases were likely not identified even in Wuhan China7 andour study omits most of the countries with highly effective contact tracing programs suchas China or South Korea we model the GI as constant (but uncertain) over time A possibleextension for further work is to model the change in GI explicitly

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are inferred from the observation models for cases and deathsWe assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)] (A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)] (A4)

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross-validation as no reference is available for it We evaluate five dif-ferent values (σN isin 00501020304) with a fixed randomly chosen validation set of6 countries σN = 02 maximises the predictive log-likelihood on the validation set Thisensures a more calibrated model which is less likely to produce overconfident or unstableestimates10 We did not tune any other aspect of the model

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation In realitythis delay distribution is the sum of two independent distributions the incubation period(lognormal distribution) and the delay from onset of symptoms to confirmation (negativebinomial distribution) These distributions are uncertain but modelling (and placing pri-ors over) them individually leads to unidentifiability Therefore we model the total delay

bSince we treat new infections as a continuous number its initial value can be between 0 and 1

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

between infection and confirmation assuming that this follows a negative binomial dis-tribution We convert priors over the individual delays to a prior over the total delay bybootstrapping mdash please see Appendix A3

As in Flaxman et al1 the observed cases Ct c follow a negative binomial noise distribu-tion with mean C t c and an inferred dispersion parameter Ψ(C ) This distribution encodesthat small case numbers are more noisy and should therefore receive less weight Hav-ing separate dispersion parameters for cases and deaths ensures that they can be weighteddifferently if there is a difference in their noise distributions

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ) (A6)

where PD (delay) is the distribution of the delay from infection to death As for cases wemodel the total delay between infection and death as negative binomial which is in realitythe sum of two independent distributions the incubation period and the delay from onsetof symptoms to death (gamma distribution)

Finally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and an inferred dispersion parameter Ψ(D)

The model was implemented in PyMC311 We infer the unobserved variables in our modelusing the No-U-Turn Sampler (NUTS)12 a standard Markov chain Monte Carlo samplingalgorithm

Appendix A2 Testing reporting and infection fatality rates

Scaling all values of a time series by a constant does not change its growth rates Themodel is therefore invariant to the scale of the observations and consequently to country-level differences in the IFR and the ascertainment rate (the proportion of infected peoplewho are subsequently reported positive) For example assume countries A and B differ onlyin their ascertainment rates Then our model will infer a difference in N (C )

0c (Eq (A3)) butnot in the growth rates g t c across A and B Accordingly the inferred NPI effectiveness willbe identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas the

cThis is only approximately true The negative binomial output distribution has a coefficient of variation dimin-ishing with its mean ie smaller observations are relatively more noisy and carry less weight Furthermorewhilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligible effect

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

introduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimics changes in the ascertainment ratemdashnoise at time

τ affects all future casesmdashand allows for gradual multiplicative changes in ascertainment

Appendix A3 Method for uncertainty over key epidemiological parameters

We account for uncertainty in the following delay distributions the generation interval thedelay from infection to symptom onset (incubation period) the delay from symptom onsetto case confirmation and the delay from symptom onset to death

Each distribution has an uncertain mean (Table A3) whose prior distribution we takefrom a meta-analysis13 published shortly after our window of analysis This helps accountfor possible heterogeneity between different populations and therefore often finds higheruncertainty in the means than estimated in primary studies while using more data Themeta-analysis reports mean values with 95 confidence intervals We set normal priors overthe mean delays with mean equal to the reported mean in the meta-analysis and standarddeviation set such that the prior 95 credible interval includes the 95 confidence intervalsfrom the meta-analysis For example the meta-analysis reports an expected mean delayfrom symptoms to death of 1671 days with a 95 credible interval ranging from 1537 to1817 We thus assume that the prior is normal with micro= 1671 and σ= 075 which ensuresthat its 95 credible interval ranges from 1525 to 1817 strictly including the intervalreported in the meta analysis to be conservative Priors are truncated at zero

Furthermore we account for the uncertainty in the standard deviation of these delay dis-tributions by placing normal priors based on primary studies following the same methodas above (Table A4) For the standard deviation of the generation interval we use patientdata from Feretti et al14 and reproduce their method fitting a Gamma distribution to thegeneration time

Finally the distributional forms of the delay distributions are taken from the above primarystudies

The delay from infection to case confirmation is the sum of the incubation period and thesymptom-onset-to-case-confirmation delay Similarly the delay from infection to death isthe sum of the incubation period and the symptom-onset-to-death delay However forcomputational tractability we model the total delay distributions converting the priors de-scribed above to priors over the total delays We assume that the total delays follow negativebinomial distributions which are computationally tractable and empirically provide a closefit to both sum distributions We place normal priors over the mean and dispersion parame-ters of these total delay distributions by bootstrapping For example we compute priors forthe total infection-to-case-confirmation delay distribution by sampling Nbootstrap = 250 in-

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

cubation periods and symptom-onset-to-case-confirmation distributions For each sampledpair of distributions we draw 106 samples from their sum and fit a negative binomial usingmoment matching We compute the mean and standard deviation of the negative binomialdistribution parameters across the bootstrap and use this for our prior The resulting priorsare given in Table A2

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table A2 Epidemiological parameter distributions used in the model The details on sources and priorsused to generate this Table are given in Tables A3 and A4

Delay type Sources Prior on mean Prior on standard Distributionaldeviation or formdispersion

Infection to case Fonfria et al13 N (1092σ= 094) Dispersion Negativeconfirmation Lauer et al15 N (541σ= 027) binomial

Cereda et al16

Infection to death Fonfria et al13 N (2182σ= 101) Dispersion NegativeLauer et al15 N (1426σ= 518) binomialLinton et al17

Generation interval Fonfria et al13 N (506σ= 033) SD N (211σ= 05) GammaFeretti et al14

Table A3 Mean of epidemiological parameters all from meta-analysis13 and distributional forms

Delay type Reported mean Prior on mean Distributionaland 95 CI form of delay

Incubation period 579 (548ndash611) logmicrosimN (153σ= 0051) Lognormal15

Onset to death 1671 (1537ndash1817) N (1671σ= 075) Gamma17

Onset to case confirmation 582 (474ndash715) N (582σ= 068) Negative binomial16

Generation interval 506 (450ndash570) N (506σ= 033) Gamma14

Table A4 Standard deviation or dispersion of epidemiological parameters

Delay type Source Reported standard deviation or Prior on standarddispersion with uncertainty deviation or

dispersionIncubation period Lauer et al15 SD of log delay SD of log delay

048 (95 CI 027ndash054) N (048σ= 00759)Onset to death Linton et al17 SD 69 (95 CI 52ndash91) N (69σ= 1122)Onset to case confirmation Cereda et al16 Dispersion 157 plusmn 0054 N (157σ= 0054)Generation interval Feretti et al14 SD 211 plusmn 05 (reproduced by us) N (211σ= 05)

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking its predictions on unseendata even if prediction is not the purpose of the model1018 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries we have strong reason to doubt itseffectiveness estimates However we do not expect NPI effectiveness models to extrapolateperfectly Almost always unobserved factors such as changes in the ascertainment rate orIFR spontaneous behaviour changes or unobserved NPIs will affect the observed numberof cases and deaths Our models ought to treat these factors as noise and not attribute theireffects on Rt to the observed NPIs

We use Leave-One-Out Cross-Validation (LOO-CV) fitting the model on 40 countries andextrapolating to the excluded country We repeat this process for all 41 countries In theexcluded country the first 14 days of case and death data are observed to estimate R0c aswell as the initial outbreak sizes N (C )

0c and N (D)0c All later days are unobserved (masked)

The model uses the effectiveness estimates inferred from the 40 included countries and NPIactivation dates to predict the course of the pandemic in the excluded country

Figure B7 visually shows extrapolations obtained from LOO-CV for a randomly chosensubset of six countries (all countries are shown in Figures C18 to C21) The predictiveaccuracy of the model as expected degrades with an increasing prediction horizon sug-gesting the presence of unobserved factors However the predictive credible intervals arewide and increase over time as noise in the growth rate accumulates Importantly ourmodel makes calibrated forecasts in excluded countries (Fig B8) over periods of up to 2months

These are challenging predictions to the best of our knowledge no related study analyseshow their estimated NPI effects generalise to unseen countries Most related studies donot validate predictions on unseen data at all19ndash23 reviewed in9 Flaxman et al1 hold outthe last 14 days from 20 April to 4th of May for all countries in aggregate However thecountries they study implemented all NPIs in March or earlier (Supplementary Table 2 in1)meaning that in fact all countries were used to estimate NPI effects

Note that Fig B8 includes the 6 countries that were used to select the hyperparameter(Appendix A) However the calibration is similar if these 6 countries are excluded (Fig-ure C23)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals the extent to which results depend on uncertain parametersand modelling choices and can diagnose model misspecification and excessive collinearity

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

100

101

102

103

104

105

106

便便

Slovenia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

便 便

Greece

Figure B7 Model extrapolations on a randomly chosen subset of 6 excluded countries produced usingleave-one-out cross-validation In each excluded country the first 14 days of death and case data (soliddots) are observed the model then extrapolates to all future days within the period of analysis (emptydots) For each country we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier see Methods) The shaded areas representthe 95 credible intervals

in the data24 We vary many of the components of our model and recompute the NPIeffectiveness estimates summarised here Further analysis can be found in Appendix C

For each sensitivity analysis we quantify sensitivity using an easily interpretable measureshown above the legend of each plot the average standard deviation denoted σ Firstfor each NPI we compute the standard deviation of its median effect estimates across allexperimental conditions in the given sensitivity analysis We then average these acrossthe NPIs Since effectiveness is expressed in percentage reductions of Rt the unit of σ ispercent Zero percent corresponds to no sensitivity

We perform a multivariate sensitivity analysis on the key epidemiological priors For com-putational reasons all other sensitivity analyses are univariate

Multivariate sensitivity to main epidemiological priors The key epidemiological param-eters in our model describe the delay from infection to case-confirmation the delay from

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

0 20 40 60 80 100Credible Interval

0

20

40

60

80

100

Em

piric

al C

umul

ativ

e Fr

eque

ncy

Calibration Country Holdouts

Figure B8 Calibration in held-out countries with leave-one-out cross validation In each excludedcountry the first 14 days of death and case data are observed to estimate R0c as well as the initialoutbreak sizes the model then extrapolates to all future days within the period of analysis The plotshows the percentage of observed daily case and death counts that lie within the X credible intervalacross all countries and days The identity line represents ideal calibration

infection to death and the generation interval Our model places prior distributions overthe means of these delay distributions Since the priors already account for uncertainty inthe delays this analysis considers what would happen if our prior beliefs changed Sincethe epidemiological parameters may interact in unexpected ways we perform a multivari-ate sensitivity analysis jointly changing the mean of the prior over the mean of each delaydistribution (referred to as the prior mean of the mean below) We vary the prior means ofthe mean delays across a wide but epidemiologically plausible range of 5 values resultingin 125 different experimental conditions We present these results in two ways

First Figure B9 shows the global sensitivity of our results to the prior mean of the mean ofeach distribution individually For example for each setting of the prior mean of the meangeneration interval we show the average over the 5times5 = 25 runs with different prior meansplaced over the mean delays between infection and case confirmationdeath This is equiv-alent to placing a uniform prior across the 125 experiment conditions and marginalisingand is standard methodology for computing global sensitivity to an individual parameter

The prior means seem to mostly influence the relative effectiveness of business closuresand gathering bans Increasing the prior means of the infection-to-case-confirmationdeathmean delays results in larger effectiveness estimates for gatherings bans and smaller esti-

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

mates for business closures while increasing the prior mean of the generation interval hasthe opposite effect

Second we assess the sensitivity of our results to jointly varying the prior means This yieldsσ= 246 We present this result graphically in Appendix C1

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Infection to Case-Confirmation Delay= 112Mean 792 daysMean 942 daysMean 1092 daysMean 1242 daysMean 1392 days

-25 0 25 50 75Average reduction in Rtin the context of our data

Infection to Death Delay= 080Mean 1782 daysMean 1982 daysMean 2182 daysMean 2382 daysMean 2582 days

-25 0 25 50 75Average reduction in Rtin the context of our data

Generation Interval= 149Mean 306 daysMean 406 daysMean 506 daysMean 606 daysMean 706 days

Figure B9 Sensitivity to key epidemiological parameter choices evaluated using a multivariate analysisWe jointly vary the prior means over the mean of the generation interval the delay between infectionand case confirmation and the delay between infection and death Each panel displays sensitivity to theprior mean of one distribution and the NPI effects are computed by averaging over the 25 runs withdifferent prior means of the two other distributions (see text)

Sensitivity to country exclusions Figure B10 shows results if one country at a time isexcluded from the data As there is no strong justification for including or excluding anyparticular country results ought to be stable if a country is excluded

-25 0 25 50 75

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

= 151DefaultAlbaniaAndorraAustriaBelgiumBosnia andHerzegovinaBulgariaCroatia

-25 0 25 50 75

= 151DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-25 0 25 50 75

= 151DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

= 151DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-25 0 25 50 75Average reduction in Rtin the context of our data

= 151DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-25 0 25 50 75Average reduction in Rtin the context of our data

= 151DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure B10 Sensitivity to excluding individual countries

Sensitivity to case-undercounting Figure B11 shows results when scaling the recordednumber of cases in each country First we correct the number of reported cases on each day

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

using estimates of the time-varying ascertainment rate from Golding et al25 For example ifthe estimated ascertainment rate in country c at time t is 50 we double the correspondingnumber of reported cases With this altered case data the model estimates a larger effectfor closing schools and universities and a smaller effect for limiting gatherings to 1000people or less There is little change in the effects estimated for the other NPIs

Second we randomly either double or halve the reported number of cases in each countryThis is intended to demonstrate that the model is invariant to constant differences in ascer-tainment between countries As expected there are negligible differences compared to thedefault results

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Case Undercounting= 163

Time-Varying ScalingRandom Constant ScalingDefault

Figure B11 Sensitivity to case undercounting considering both constant undercounting and time-varying undercounting

Sensitivity to structurally different models Figure B12 shows the NPI effectivenessestimates from models that use only cases or deaths as observations in contrast to ourmain model which uses both As expected there are some differences in the NPI effectestimates between the models which may be caused by the unique biases present in casedata and in death data Differences could also occur if NPIs differentially affected casesand deaths (eg if they affect different age groups) Nevertheless the studyrsquos qualitativeconclusions as outlined in the Discussione hold across the different data types

A number of implicit structural assumptions are made by the choice of model structureWe test sensitivity to these assumptions by evaluating NPI effectiveness estimates from al-ternative models reproducing the structural sensitivity analysis from our concurrent workwhere these models are described and compared in detail9 While these alternative modelstructures are also plausible we chose the model described in the main text as it is relativelysimple whilst also being highly robust9

eThe main qualitative conclusions as outlined in the Discussion are Stay-at-home orders had a small effectMandating mask-wearing had a small effect School and university closures had a large effect Gatherings banswere effective More strict gathering bans were more effective than less strict ones Business closures wereeffective Closing most nonessential businesses had limited benefit over closing just high-risk businesses

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Data Type

= 347Only Case DataOnly Death DataDefault

Figure B12 Sensitivity to different data types ie models using only confirmed cases only deaths orboth

The models are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting Rt into a daily growth rate a renewalprocess is used as the infection model as in earlier work1826ndash28 For computationalreasons we do not place a prior distribution over the generation interval parametersfor this model

3 Noisy-R Model The noise terms ε(middot)ct affect Rt rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on Rt The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

As Figure B13 (left) shows the multiplicative effect models support almost all conclusionsdrawn in the Discussione The only exception is that the stay-at-home order NPI is cate-gorised as moderately effective under the Different Effects Model (median reduction in Rt

of 18)

The results of the Additive Effects Model cannot be directly compared to the other modelssince it expresses results on a different scale (percentage reduction in R0 instead of Rt ) Itis therefore shown in a separate panel However the trend in the results is visually similarto the default results

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs which were implemented nor directly measures broaderbehavioural changes Since these factors influence Rt we must be wary of their effectbeing attributed to observed NPIs We investigate this further by assessing how much effec-

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

Structural Sensitivity Multiplicative Models= 311

Noisy-RDiscrete Renewal

Different EffectsDefault

-25 0 25 50 75Average reduction in Rtas a percentage of R0

in the context of our data

Structural Sensitivity Additive Model

Figure B13 Structural sensitivity analysis NPI effectiveness estimates under different structural as-sumptions Left Effectiveness estimates for multiplicative effect models Note that for computationalreasons the discrete renewal model does not have a prior over the generation interval Right Effective-ness estimates for an additive effect models For this model the effectiveness of each NPI is expressedas a reduction of R0 rather than Rt

tiveness estimates change when previously unobserved factors are included and also whenobserved factors are excluded This is best practice for addressing unobserved factors suchas confounders2930

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs31 The timings of our observed NPIsrsquo implementation dates are indeed mutuallycorrelated prompting the question of how much results change when we make previouslyobserved NPIs unobserved Figure B14 (left) shows NPI effectiveness estimates when pre-viously observed NPIs are excluded in turn The studyrsquos main qualitative conclusionse holdacross all experimental conditions and the variations in NPI effectiveness are rarely largeenough to cause a change in effectiveness category (Figure 5) except for the Gatheringslimited to 1000 people or less NPI Considering that some of the excluded NPIs have strongestimated effects when included and are correlated with other NPIs this degree of robust-ness is encouragingly high It suggests that unobserved factors will not significantly biasresults as long as their effects and their correlations with the studied NPIs do not exceedthose of the studied NPIs We hypothesise that this robustness to unobserved factors is dueto the noise on the daily growth rate in our model9

In addition Figure B14 (right) shows NPI effectiveness estimates when we observe (iecontrol for) additional NPIs taken from the OxCGRT dataset32

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Inclusion of OxCGRT NPIs= 153

Travel ScreenQuarantineTravel BansSymptomatic TestingPublic Information CampaignsInternal Movement LimitedPublic Transport LimitedDefault

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

Exclusion of Collected NPIsUncombined Effects Shown

= 188Mask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay Home OrderDefault

Figure B14 Robustness to unobserved effects Left Results when excluding previously observed NPIsWe exclude one of the NPIs in turn and show the estimates for the other NPIs Note that this subplotshows the marginal effect for hierarchical NPIs rather than the total effect different to other figuresthat show cumulative effects for gathering bans and businesses closures For example Figure 3 displaysthe total effect of closing most nonessential businesses while here we show the additional effect ofclosing most nonessential businesses over just closing some high-risk businesses We show the additionaleffects here because the effect of a cumulative intervention would become undefined when part of it isexcluded from the analysis Right Results when controlling for previously unobserved NPIs We includeone additional NPI in turn and show the estimates for the NPIs in our study (the additional NPI is notshown)

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Appendix C1 Multivariate Sensitivity Analysis - Expanded Results

We now present the results of our joint multivariate sensitivity analysis (previously sum-marised in Figure B9) in which we vary the mean of the prior (ldquoprior meanrdquo) over themean of the generation interval the delay between infection and death and the delaybetween infection and case confirmation

Figure C15 shows for each NPI how NPI effects change when all prior means are variedsimultaneously

We find that the most sensitive NPIs are Gatherings limited to 1000 people or fewer Gath-erings limited to 100 people or fewer and Most nonessential businesses closed However thelargest differences appear when all of the parameters are set to the most extreme valuesthat jointly produce a change in a particular direction We also see systematic trends asthe parameters are varied eg increasing the prior mean of the generation interval meantends to decrease the effectiveness of the Gatherings limited to 1000 or fewer NPI while in-creasing prior means of the infection to deathcase confirmation distribution means tendsto increase the effectiveness of this NPI

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

792

942

1092

1242

1392

Mas

kWea

ring

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

GI = 306

-150

-75

00

75

150GI = 406

-150

-75

00

75

150GI = 506

-150

-75

00

75

150GI = 606

-150

-75

00

75

150GI = 706

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

00

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

0

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Som

eBus

s

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Mos

tBus

s

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Educ

at

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

792

942

1092

1242

1392

Stay

Hom

e

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

Figure C15 Global sensitivity analysis Each cell shows the difference between the median effectivenessestimate under a specific experimental condition and the default condition where the estimates areexpressed as percentage reduction in Rt Each subplot represents a particular prior mean of the meangeneration interval and an NPI The NPIs vary top to bottom and the generation interval prior meanincreases left to right Each cell with a subplot represents a particular choice of the prior mean of themean infection to death delay (increasing left to right) and the prior mean of the mean infection to caseconfirmation delay (increasing bottom to top) Positive cell values indicate that the median NPI estimateis larger than with default settings and vice versa

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C2 Sensitivity to additional epidemiological assumptions

For completeness Figure C16 shows sensitivity to two further epidemiological assumptionsOn the left we show the dependence of NPI effectiveness estimates on R0 We place aNormal prior over R0 and a hyperprior over its variance reflecting the wide disagreementof regional estimates of R0 In the sensitivity analysis we vary the mean of the prior froma low value (238) to a high one (428) As expected higher mean R0 values result in largerNPI effects This trend is apparent in all NPIs but stronger for NPIs that tended to beimplemented early on (like school closures and gathering bans)

On the right we vary the prior over NPI effectiveness Our default prior on NPI effectivenessis assymmetric reflecting a belief that NPIs are more likely to reduce Rt than increase itHere we additionally test a symmetric Normal prior and a one-sided Half-Normal priorwhich does not allow for NPIs to increase Rt

Appendix C3 Sensitivity to data preprocessing

When the case count is small a large fraction of cases may be imported from other coun-tries and the testing regime may change rapidly To prevent this from biasing our modelwe neglect case numbers before a country has reached 100 confirmed cases Similarly weneglect death numbers before a country has reached 10 deaths Here we vary these thresh-olds Intuitively the minimum cases threshold should be higher than the minimum deathsthreshold

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

R0 prior= 333

[R] = 238[R] = 278

Default[R] = 328[R] = 378[R] = 428

-25 0 25 50 75Average reduction in Rtin the context of our data

NPI Effectiveness Prior= 137

i Normal(0 022)i Half Normal(022)

Default

Figure C16 Sensitivity to additional epidemiological priors Left Sensitivity to the prior mean on R0Right Sensitivity to the NPI effectiveness prior

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Minimum Cases Threshold= 137

3050Default (100)200300

-25 0 25 50 75Average reduction in Rtin the context of our data

Minimum Deaths Threshold= 102

15Default (10)3050

Figure C17 Data preprocessing sensitivity Left Variations in the minimum number of cases RightVariations in the minimum number of deaths

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Validation by predicting unseen data - all countries

Here we show the country-level plots of the single-country holdout experiments from Ap-pendix B1 We use leave-one-out cross-validation fitting the model on 40 countries andshowing its predictions on the excluded country We repeat this process for all 41 countriesIn the excluded country the first 14 days (filled dots) of death and case data are observedto allow roughly inferring R0 These days are not used to infer NPI effectiveness All laterdays are masked (unfilled dots) The activation dates of NPIs are also given and the modeluses the effectiveness estimates inferred from the 40 other countries

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

北便

France

Figure C18 Predictions on excluded countries

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

北 便

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

Malta

Figure C19 Predictions on excluded countries

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

Slovenia

Figure C20 Predictions on excluded countries

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Switzerland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

United Kingdom

Figure C21 Predictions on excluded countries Vertical lines show the activation (or inactivation) datesof NPIs Shaded areas are 95 credible intervals Blue and red dots show the observed confirmed casesand deaths while blue and red lines show the median model estimates of cases (Ct ) and deaths (Dt )Empty dots are not observed by the model For each country we show the full window of analysis (fromthe start of the epidemic until the first NPI was lifted or the 30th of May 2020 whichever was earliersee Methods)

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Posterior predictive distributions

The posterior predictive distribution (Figure C22) shows the predicted number of cases anddeaths after observing the data Although these curves can be called lsquofitsrsquo the degree of fitto the data must be interpreted with great care The fit is generally tight but this is partlydue to the inferred latent noise variables ε(C )

t and ε(D)t Inferring this latent noise allows

the posterior predictive distribution to closely match the data without overfitting the effec-tiveness parameters to the data Such behavior is common in Bayesian models which oftenperfectly interpolate the data without overfitting33 The noise terms can account for peri-ods where infections grew faster or slower than predicted based solely on the active NPIsIn such periods the noise may account for changes in testing reporting and unobservedinterventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Daily DeathsDaily Confirmed Cases

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C22 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 Calibration without countries used for hyperparameter selec-tion

0 20 40 60 80 100Credible Interval

0

20

40

60

80

100E

mpi

rical

Cum

ulat

ive

Freq

uenc

y

Calibration Country Holdouts

Figure C23 Calibration in held-out countries with leave-one-out cross-validation Here we include onlythose 35 countries that were not used to select the hyperparameter We show the first 14 days of casesand deaths in each country and then extrapolate to future days within the period of analysis The plotshows the percentage of observed daily case and death counts that lie within the X credible intervalacross all countries and days

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 MCMC stability results

1000 1002 10040

1000

2000

3000

4000

R

0 1 2 30

500

1000

1500

2000

Relative ESS

Figure C24 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size A value of 1 indicates perfect decorrelation between samplesValues above (below) 1 indicate that the effective number of samples is higher (lower) than the actualnumber of samples due to negative (positive) correlation respectively

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D5 Estimated values for R0 by country The parenthesis give the 95 credible interval which isoften wide The mean R0 across countries is 33

Country Estimated R0 Country Estimated R0

Albania 348 (275431) Lithuania 323 (248403)Andorra 275 (21434) Malaysia 296 (236361)Austria 31 (25377) Malta 327 (248414)Belgium 36 (302427) Mexico 399 (32748)Bosnia and Herzegovina 331 (259406) Morocco 359 (299427)Bulgaria 375 (304457) Netherlands 316 (26375)Croatia 342 (27418) New Zealand 241 (17731)Czech Republic 346 (276425) Norway 273 (215338)Denmark 28 (22347) Poland 397 (32482)Estonia 291 (229361) Portugal 355 (292425)Finland 279 (221343) Romania 401 (335475)France 331 (28389) Serbia 38 (308461)Georgia 356 (281439) Singapore 295 (238356)Germany 285 (231346) Slovakia 353 (271445)Greece 321 (261388) Slovenia 299 (23373)Hungary 403 (332482) South Africa 447 (377527)Iceland 171 (113242) Spain 36 (306423)Ireland 368 (309433) Sweden 229 (176288)Israel 379 (309459) Switzerland 289 (234351)Italy 336 (288391) United Kingdom 32 (267378)Latvia 301 (233376)

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 Collinearity

Appendix D21 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the inclusion or exclusion ofthese countries in the dataset (Figure D25) If we included all countries we would con-clude that university closures were more effective than school closures (black markers inFigure D25) However if we excluded Iceland or Sweden we would conclude that theywere roughly equally effective As there is no strong justification for including or excludingone particular country we cannot meaningfully disentangle the effects of school and uni-versity closures However there is much more data to determine the joint effect and it isindeed much more stable (Figure D25)

-25 0 25 50 75 100Average reduction in Rtin the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D25 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown

Appendix D22 Co-occurrence of NPIs

Table D6 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or neither) Note thatwe do not show the traditional collinearity statistics (variance inflation factors and datacorrelations) since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D6 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D23 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations24 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus042 between closing schools anduniversities and closing some businesses (Figure D26) The weak correlations are oneindicator that collinearity is manageable with our dataset

Effect on NPI combinations To better understand posterior correlations we visualizetheir effect in hosted video files As we condition on different values for one NPI we cansee that the estimates of other NPIs change only slightly always staying well within thecredible intervals in Figure 3 The significance of posterior correlations is small enough thatit is possible to calculate a reasonable approximation to the mean effect of a set of NPIs bysimply combining the mean percentage reductions for each individual NPI (eg two 50reductions lead to a 75 reduction) For example this approximation leads to a joint effectof 75 for all NPIs together which closely matches the exact mean joint effect of 77

Videos are available online here

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D26 Posterior correlations between effectiveness parameters αi

Appendix D3 Posterior Epidemiological Parameter Distributions

Figure D27 shows the posterior distributions for variables describing the key delay distribu-tions in our model the generation interval the delay between infection and case confirma-tion and the delay between infection and death We find that the posterior distributions ofthese parameters are somewhat tighter than their priors but they still explore a wide rangeof values This suggests that the data provides evidence albeit weak about these delaydistributions (for example from visual inspection the data would show that the delay islonger for deaths than for cases) An exception is the dispersion of the infection to deathdelay distribution which has a very wide prior

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

5 6

0

1

dens

ity

Generation Interval Mean

posteriorprior

200 225

00

05

Infection to Death Delay Mean

100 125

00

05

Infection to Case-Confirmation Delay Mean

0 2 4

value

00

05

dens

ity

Generation Interval std

10 20

value

00

01

Infection to Death Delay disp

5 6

value

0

1

Infection to Case-Confirmation Delay disp

Figure D27 Posterior distributions over key epidemiological parameters describing the generation in-terval and the delays between infection and case confirmationdeath

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect morethan three-quarters of the population) We thus miss NPIs which were only implementedregionally For example a few regions in Germany implemented stay-at-home orders butmost did not Thus Germany is listed as no stay-at-home order in our data Additionallyour NPI definitions were not perfectly granular For example a gathering ban on gatheringsof gt15 people and a ban on gatherings of gt60 people would both fall under the NPIGatherings limited to 100 people or less despite likely having different effects on Rt Finally while we included more NPIs than previous work (Table F7) there are many NPIsfor which we were not able to collect enough high-quality data for our modeling such aspublic cleaning or changes to public transportation

Of the 41 countries in our dataset 33 are in Europe As a result the NPI effectivenessestimates may be biased towards effects in Europe and NPI effectiveness may have beendifferent in other parts of the world

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on Rt is constantacross countries and time However the exact implementation and adherence of each NPIis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa limited degree Additionally different countries have different cultural norms and ageprofiles affecting the degree to which a particular intervention is effective For example acountry where a higher proportion of the population is in education will likely experience alarger effect from a government order to close schools and universities Our estimates thusshould be adjusted to local circumstances To address differences between countries ourstructural sensitivity analysis includes a model where each NPI can have a different effectper country (Appendix B2) The average effectiveness estimates across countries in thismodel match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponential de-cline would have the same effect on observations as an unobserved NPI Consequently wecannot quantify its effect on our results (though the sensitivity analyses look reassuring)

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expectthat closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g t and the reproductive number Rt holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso report results from a renewal process model8 that does not depend on this assumptionand we find similar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix F Overview of previous work

Table F7 Data-driven multi-country multi-NPI studies of the effectiveness of observed (as opposed tohypothetical) NPIs in reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Banholzer etal 202023

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Chen andQiu 202021

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Islam et al202020

School closuresworkplace closuresrestrictions on massgatherings publictransport closure

lockdown

149 countries or regionsInterrupted timeseries regression

Liu et al202022

13 NPIs from theOXCGRT dataset

130 countries and regions panel regression

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F8 Some data-driven studies of the effectiveness of observed NPIs in reducing the transmissionof COVID-19 which study a single-country andor a single-NPI

Study NPIs studiedRegionscountries

studiedMethod

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dandekarand

Barbastathis202035

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Dehning etal 202036

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Gatto et al202037

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Hsiang et al202019

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US (each

country modelledindividually)

Linear regression onestimated growth

rates

Jarvis et al202038

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelKraemer etal 202039

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202040 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) model

Lai et al202041

Case detection andisolation travel

restrictions contactreductions

ChinaTravel-network-based

SEIR model

Continued on next page

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F8 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Lorch et al202042

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Maier andBrockmann

202043

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Orea andAacutelvarez202044

Lockdown SpainSpatial econometric

analysis

Quilty et al202045

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

Sears et al202046

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Siedner etal 202047

General socialdistancing

USInterruptedtime-series

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F7) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

61

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

62

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all ldquotrade activitiesrdquo (except within shopping malls) service jobsthat donrsquot involve physical contact museums libraries and galleries opened Sincethe criteria regarding ldquoall retail stores being openrdquo was met we counted April 27 asthe end date for ldquoMost nonessential businesses closedrdquo

63

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H All model equations

This section will mainly be of interest to readers that wish to re-implement the model

Variables are indexed by NPI i country c and day t All prior distributions are independent

Data

1 NPI Activations φi t c isin 012 Observed (Daily) Cases Ct c 3 Observed (Daily) Deaths D t c

Prior Distributions

1 Country-specific R0 R0c sim Normal(325κ) κsim Half Normal(micro= 0σ= 05)2 NPI effectiveness αi sim Asymmetric Laplace(m = 0κ= 05λ= 10) m is the location

parameter κgt 0 is the asymmetry parameter and λgt 0 is the scale parameter3 Infection Initial Counts

N (C )0c = exp(ζ(C )

c )

N (D)0c = exp(ζ(D)

c )

ζ(C )c sim Normal(micro= 0σ= 50)

ζ(D)c sim Normal(micro= 0σ= 50)

4 Observation Noise Dispersion Parameters

Ψcases sim Half Normal(micro= 0σ= 5) (H1)

Ψdeaths sim Half Normal(micro= 0σ= 5) (H2)

Hyperparameters

1 Growth Noise Scale σg = 02

Delay Distributions

1 Generation interval distribution1314

microGI sim Normal(micro= 506σ= 03265)

σGI sim Normal(micro= 211σ= 05)

64

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

2 Time from infection to case confirmation T (C )131516f

microinfrarrconf sim Normal(micro= 1092σ= 094)

Ψinfrarrconf sim Normal(micro= 541σ= 027)

This distribution is converted into a forward-delay vector

T (C )[t ] =

1ZC

Negative Binomial(t micro=microinfrarrconfα=Ψinfrarrconf) t lt 32

0 otherwise

with ZC =31sum

t prime=0Negative Binomial(t primemicro=microinfrarrconfα=Ψinfrarrconf)

ie the delay follows a truncated and normalised negative binomial distribution3 Time from infection to death T (D)f131517

microinfrarrdeath sim Normal(micro= 2182σ= 101)

Ψinfrarrdeath sim Normal(micro= 1426σ= 518)

This distribution is converted into a forward-delay vector

T (D)[t ] =

1ZD

Negative Binomial(t micro=microinfrarrdeathα=Ψinfrarrdeath) t lt 48

0 otherwise

with ZD =47sum

t prime=0Negative Binomial(t primemicro=microinfrarrdeathα=Ψinfrarrdeath)

ie the delay follows a truncated and normalised negative binomial distribution

Infection Model

Rt c = R0c middotexp

(minus

Isumi=1

αi φi t c

) where I is the number of NPIs

αGI =microGI

σ2GI

βGI =micro2

GI

σ2GI

g t c = exp

(βGI(R

1αGIct minus1)

)minus1

fα in the definition of the Negative Binomial distribution is the dispersion parameter Larger values of αcorrespond to a smaller variance and less dispersion With our parameterisation the variance of the Negative

Binomial distribution is micro+ micro2

α

65

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

]

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise

ε(C )τc sim Normal(micro= 0σ=σg )

ε(D)τc sim Normal(micro= 0σ=σg )

Observation Modelf

Ct c =31sumτ=0

N (C )tminusτcT

(C )[τ]

D t c =47sumτ=0

N (D)tminusτcT

(D)[τ]

Ct c sim Negative Binomial(micro= Ct c α=Ψcases)

D t c sim Negative Binomial(micro= D t c α=Ψdeaths)

66

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix I References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Sam Abbott et al ldquoEstimating the time-varying reproduction number of SARS-CoV-2using national and subnational case countsrdquo In Wellcome Open Research 5112 (2020)p 112

3 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

4 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

5 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

6 Sheikh Taslim Ali et al ldquoSerial interval of SARS-CoV-2 was shortened over time bynonpharmaceutical interventionsrdquo In Science 3696507 (2020) pp 1106ndash1109

7 Xingjie Hao et al ldquoReconstruction of the full transmission dynamics of COVID-19 inWuhanrdquo In Nature 5847821 (Aug 2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the robustness of effectiveness estimation of nonpharma-ceutical interventions against COVID-19 transmissionrdquo In arXiv preprint arXiv200713454(2020) URL httpsarxivorgabs200713454

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

12 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

13 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

14 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

67

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

15 Stephen A Lauer et al ldquoThe incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases estimation and applicationrdquo In Annals ofinternal medicine 1729 (2020) pp 577ndash582

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

19 Solomon Hsiang et al ldquoThe effect of large-scale anti-contagion policies on the COVID-19 pandemicrdquo In Nature 5847820 (2020) pp 262ndash267

20 Nazrul Islam et al ldquoPhysical distancing interventions and incidence of coronavirusdisease 2019 natural experiment in 149 countriesrdquo In bmj 370 (2020)

21 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo In Covid Economics 7 (Apr 2020) pp 46ndash67

22 Yang Liu et al ldquoThe impact of non-pharmaceutical interventions on SARS-CoV-2 trans-mission across 130 countries and territoriesrdquo In medRxiv (2020)

23 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

24 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

25 Nick Golding et al ldquoReconstructing the global dynamics of under-ascertained COVID-19 cases and infectionsrdquo In medRxiv (2020) DOI 1011012020070720148460eprint httpswwwmedrxivorgcontentearly202007082020070720148460fullpdf URL httpswwwmedrxivorgcontentearly202007082020070720148460

26 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

27 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

28 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

68

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

29 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

30 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

31 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

32 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

33 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

36 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

37 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

38 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

39 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

40 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

41 Shengjie Lai et al ldquoEffect of non-pharmaceutical interventions to contain COVID-19 inChinardquo en In Nature 5857825 (Sept 2020) pp 410ndash413 DOI 101038s41586-020-2293-x

69

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

42 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

43 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

44 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

45 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

46 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

47 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

70

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Testing reporting and infection fatality rates
      • Method for uncertainty over key epidemiological parameters
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Multivariate Sensitivity Analysis - Expanded Results
              • Sensitivity to additional epidemiological assumptions
              • Sensitivity to data preprocessing
              • Validation by predicting unseen data - all countries
              • Posterior predictive distributions
              • Calibration without countries used for hyperparameter selection
              • MCMC stability results
                • Additional results
                  • Estimated R0 by country
                  • Collinearity
                    • The individual effects of school and university closures
                    • Co-occurrence of NPIs
                    • Correlations between effectiveness estimates
                      • Posterior Epidemiological Parameter Distributions
                        • Additional discussion of assumptions and limitations
                          • Limitations of the data
                          • Model limitations
                            • Overview of previous work
                            • Handling edge cases in the data collection
                            • All model equations
                            • References
Page 2: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan

was effective but closing most other businesses had limited further benefit and that manycountries may have been able to reduce R below 1 without issuing a stay-at-home order

2

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Introduction

Worldwide governments have mobilised vast resources to fight the COVID-19 pandemic Awide range of nonpharmaceutical interventions (NPIs) has been deployed including dras-tic measures like stay-at-home orders and the closure of all nonessential businesses Recentanalyses show that these large-scale NPIs were jointly effective at reducing the virusrsquo ef-fective reproduction number1 but it is still largely unknown how effective individual NPIswere As time progresses and more data become available we can move beyond estimatingthe combined effect of a bundle of NPIs and begin to understand the effects of individualinterventions This can help governments efficiently control the epidemic by focusing onthe most effective NPIs to ease the burden put on the population

A promising way to estimate NPI effectiveness is data-driven cross-country modelling in-ferring effectiveness by relating the NPIs implemented in different countries to the courseof the epidemic in these countries To disentangle the effects of individual NPIs we need toleverage data from multiple countries with diverse sets of interventions in place Previousdata-driven studies (Table F8) estimate effectiveness for individual countries2ndash4 or NPIsalthough some exceptions exist15ndash8 (summarised in Table F7) In contrast we evaluatethe impact of 8 NPIs on the epidemicrsquos growth in 34 European and 7 non-European coun-tries To isolate the effect of individual NPIs we also require sufficiently diverse data Ifall countries implemented the same set of NPIs on the same day the individual effect ofeach NPI would be unidentifiable However the COVID-19 response was far less coordi-nated countries implemented different sets of NPIs at different times in different orders(Figure 1)

Even with diverse data from many countries estimating NPI effects remains a challengingtask First models are based on epidemiological parameters that are only known with un-certainty our NPI effectiveness study to the best of our knowledge is the first to includesome of this uncertainty in the model Second the data are retrospective and observationalmeaning that unobserved factors could confound the results Third NPI effectiveness es-timates can be highly sensitive to arbitrary modelling decisions as demonstrated by tworecent replication studies910 Fourth large-scale public NPI datasets suffer from frequentinconsistencies11 and missing data12 For these reasons the data and the model must becarefully validated and insufficiently validated results should not be used to guide pol-icy decisions We collect the largest public dataset on NPI implementation dates that wasvalidated by independent double entry and perform to our knowledge the most extensivevalidation of any COVID-19 NPI effectiveness estimates to datemdasha crucial but largely absentor incomplete element of NPI effectiveness studies10

Even with extensive validation we need to be careful when interpreting this studyrsquos resultsWe only study the impact NPIs had between January and the end of May 2020 and NPIeffectiveness may change over time as circumstances change In particular lifting an NPIdoes not imply that transmission will return to its original level These and other limitationsare detailed in the Discussion

3

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask wearing Gatherings lt1000 Gatherings lt 100 Gatherings lt 10 Some businessesclosed

Universities closedSchools closed Stay-at-home orderMost businessesclosed

Figure 1 Timing of NPI implementations in early 2020 Crossed-out symbols signify when an NPI waslifted Detailed definitions of the NPIs are given in Table 1

4

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Methods

Dataset

We analyse the effects of NPIs (Table 1) in 41 countriesa (see Figure 1) We recorded NPIimplementations when the measures were implemented nationally or in most regions of acountry (affecting at least three fourths of the population) For each country the windowof analysis starts on the 22nd of January and ends after the first NPI was lifted or on the30th of May 2020 whichever was earlier The reason to end the analysis after the firstmajor reopeningb was to avoid a distribution shift For example when schools reopenedit was often with safety measures such as smaller class sizes and distancing rules It istherefore expected that contact patterns in schools will have been different before schoolclosure compared to after reopening Modelling this difference explicitly is left for futurework Data on confirmed COVID-19 cases and deaths were taken from the Johns HopkinsCSSE COVID-19 Dataset13 The data used in this study including sources are availableonline here

aThe countries were selected for the availability of reliable NPI data at the time when we started data collectionand modelling (April 2020) and for their presence in at least one of the public datasets that we used tocross-validate our collected data We excluded countries with fewer than 100 cases (or 10 deaths) by March31 as our model neglects new cases and deaths below these thresholds We also excluded a small numberof countries if there were credible media reports casting doubt on the trustworthiness of their reporting ofcases and deaths Finally we excluded very large countries like China the US and Canada for ease of datacollection as these would require more locally fine-grained data 33 of the 41 included countries are in EuropeAs a result the NPI effectiveness estimates may be biased towards effects in Europe and NPI effectiveness mayhave been different in other parts of the world

bConcretely the window of analysis extended until three days after the first reopening for confirmed cases and13 days after the first reopening for deaths These values correspond to the 5 quantile of the infection-to-confirmationdeath distributions ensuring that less than 5 of the new infections on the reopening day werestill observed in the window of analysis

5

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table 1 NPIs included in the study Appendix G details how edge cases in the data collection werehandled

NPI DescriptionMask-wearingmandatory in(some) publicspaces

A country has mandated mask usage in the public sometimes limitedto just some public spaces (which the government deems to have a highrisk of infection) For example some countries mandated mask-wearingin most or all indoor public spaces but not outdoors

Gatheringslimited to 1000people or less

A country has set a size limit on gatherings The limit is at most 1000people (often less) and gatherings above the maximum size are disal-lowed For example a ban on gatherings of 500 people or more wouldbe classified as ldquogatherings limited to 1000 or lessrdquo but a ban on gath-erings of 2000 people or more would not

Gatheringslimited to 100people or less

A country has set a size limit on gatherings The limit is at most 100people (often less)

Gatheringslimited to 10people or less

A country has set a size limit on gatherings The limit is at most 10people (often less)

Some businessesclosed

A country has specified a few kinds of customer-facing businesses thatare considered ldquohigh riskrdquo and need to suspend operations (black-list) Common examples are restaurants bars nightclubs cinemasand gyms By default businesses are not suspended

Most nonessentialbusinesses closed

A country has suspended the operations of many customer-facing busi-nesses By default customer-facing businesses are suspended unlessthey are designated as essential (whitelist)

Schools closed A country has closed most or all schoolsUniversitiesclosed

A country has closed most or all universities and higher education fa-cilities

Stay-at-homeorder (withexemptions)

An order for the general public to stay at home has been issued This ismandatory not just a recommendation Exemptions are usually grantedfor certain purposes (such as shopping exercise or going to work) ormore rarely for certain times of the day In practice a stay-at-home or-der was often accompanied or preceded by other NPIs such as businessclosures We account for these other NPIs separately eg when a coun-try issued a law that both closed businesses and ordered the populationto stay at home this is encoded as both a business closure and a stay-at-home order (with exemptions) in the data In our results we thusestimate the additional effect of the order to generally stay at home

6

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Data collection

We collected data on the start and end date of NPI implementations from the start of thepandemic until the 30th of May 2020 Before collecting the data we experimented withseveral public NPI datasets finding that they were not complete enough for our modellingand contained incorrect datesc By focusing on a smaller set of countries and NPIs than thesedatasets we were able to enforce strong quality controls We used independent doubleentry and manually compared our data to public datasets for cross-checking

First two authors independently researched each country and entered the NPI data into sep-arate spreadsheets The researchers manually researched the dates using internet searchesthere was no automatic component in the data gathering process The average time spentresearching each country per researcher was 15 hours

Second the researchers independently compared their entries to the following public datasetsand if there were conflicts visited all primary sources to resolve the conflict the EFGNPIdatabase14 the Oxford COVID-19 Government Response Tracker15 and the mask4all dataset16

Third each country and NPI was again independently entered by one to three paid con-tractors who were provided with a detailed description of the NPIs and asked to includeprimary sources with their data A researcher then resolved any conflicts between this dataand one (but not both) of the spreadsheets

Finally the two independent spreadsheets were combined and all conflicts resolved by aresearcher The final dataset contains primary sources (government websites andor mediaarticles) for each entry

Data Preprocessing

When the case count is small a large fraction of cases may be imported from other countriesand the testing regime may change rapidly To prevent this from biasing our model weneglect case numbers before a country has reached 100 confirmed cases and death numbers

cWe evaluated the following datasets

bull Epidemic Forecasting Global NPI Database14

bull Oxford COVID-19 Government Response Tracker (OxCGRT)15

bull ACAPS COVID19 Government Measures Dataset

Note that these datasets are under continuous development Many of the mistakes found will already havebeen corrected We know from our own experience that data collection can be very challenging We havethe fullest respect for the people behind these datasets In this paper we focus on a more limited set ofcountries and NPIs than these datasets contain allowing us to ensure higher data quality in this subset Givenour experience with public datasets and our data collection we encourage fellow COVID-19 researchers toindependently verify the quality of public data they use if feasible

7

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

before a country has reached 10 deaths We include these thresholds in our sensitivityanalysis (Appendix C3)

Short model description

For each country c

For each day t

New infections N (sdot)tc

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Mask wearing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closedSchools closed

Stay-at-home order

Growth reductions from interventions αi i

Daily growth rate gtc

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Uncertain delay from infection to

confirmation death

Uncertain generation interval

Universities closed

Figure 2 Model Overview Purple nodes are observed We describe the diagram from bottom to topThe effectiveness of NPI i is represented by αi which is independent of the country On each day t a countryrsquos daily reproduction number Rt c depends on the countryrsquos basic reproduction number R0cand the active NPIs The active NPIs are encoded by Φi t c which is 1 if NPI i is active in country c attime t and 0 otherwise Rt c is transformed into the daily growth rate gt c using the generation intervalparameters and subsequently is used to compute the new infections N (C )

t c and N (D)t c that will turn into

confirmed cases and deaths respectively Finally the number of new confirmed cases Ct c and deathsDt c is computed by a discrete convolution of N (middot)

t c with the respective delay distributions Our modeluses both death and case data it splits all nodes above the daily growth rate gt c into separate branchesfor deaths and confirmed cases We account for uncertainty in the generation interval infection-to-case-confirmation delay and the infection-to-death delay by placing priors over the parameters of thesedistributions

In this section we give a short summary of the model (Figure 2) The detailed modeldescription is given in Appendix A Code is available online here

Our model uses case and death data from each country to lsquobackwardsrsquo infer the number ofnew infections at each point in time which is itself used to infer the reproduction numbersNPI effects are then estimated by relating the daily reproduction numbers to the activeNPIs across all days and countries This relatively simple data-driven approach allows usto sidestep assumptions about contact patterns and intensity infectiousness of different age

8

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

groups and so forth that are typically required in modelling studies We make several addi-tions to the semi-mechanistic Bayesian hierarchical model of Flaxman et al1 allowing ourmodel to observe both cases and death data This increases the amount of data from whichwe can extract NPI effects reduces distinct biases in case and death reporting and reducesthe bias from including only countries with many deaths Since epidemiological parametersare only known with uncertainty we place priors over them following recent recommendedpractice17 Additionally as we do not aim to infer the total number of COVID-19 infectionswe can avoid assuming a specific infection fatality rate (IFR) or ascertainment rate (rate oftesting)

The growth of the epidemic is determined by the time- and country-specific reproductionnumber Rt c which depends on a) the (unobserved) basic reproduction number R0c givenno active NPIs and b) the active NPIs at time t R0c accounts for all time-invariant factorsthat affect transmission in country c such as differences in demographics population den-sity culture and health systems18 We assume that the effect of each NPI on Rt c is stableacross countries and time The effectiveness of NPI i is represented by a parameter αi over which we place an Asymmetric Laplace prior that allows for both positive and negativeeffects but places 80 of its mass on positive effects reflecting that NPIs are more likely toreduce Rt c than to increase it Following Flaxman et al and others168 each NPIrsquos effecton Rt c is assumed to independently affect Rt c as a multiplicative factor

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (1)

where φi ct = 1 indicates that NPI i is active in country c on day t (φi ct = 0 otherwise) andI is the number of NPIs The multiplicative effect encodes the plausible assumption thatNPIs have a smaller absolute effect when Rt c is already low We discuss the meaning ofeffectiveness estimates given NPI interactions in the Results section

In the early phase of an epidemic the number of new daily infections grows exponentiallyDuring exponential growth there is a one-to-one correspondence between the daily growthrate and Rt c

19 The correspondence depends on the generation interval (the time betweensuccessive infections in a chain of transmission) which we assume to have a Gamma dis-tribution The prior on the mean generation interval has mean 506 days derived from ameta-analysis20

We model the daily new infection count separately for confirmed cases and deaths repre-senting those infections which are subsequently reported and those which are subsequentlyfatal However both infection numbers are assumed to grow at the same daily rate in ex-pectation allowing the use of both data sources to estimate each αi The infection numberstranslate into reported confirmed cases and deaths after a stochastic delay The delay is thesum of two independent distributions assumed to be equal across countries the incuba-tion period and the delay from onset of symptoms to confirmation We put priors over themeans of both distributions resulting in a prior over the mean infection-to-confirmationdelay with a mean of 1092 days20 see Appendix A3 Similarly the infection-to-death

9

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

delay is the sum of the incubation period and the delay from onset of symptoms to deathand the prior over its mean has a mean of 218 days20 Finally as in related NPI models16both the reported cases and deaths follow a negative binomial noise distribution with aninferred noise dispersion

Using a Markov chain Monte Carlo (MCMC) sampling algorithm21 this model infers pos-terior distributions of each NPIrsquos effectiveness while accounting for cross-country variationsin testing reporting and fatality rates as well as uncertainty in the generation interval anddelay distributions To analyse the extent to which modelling assumptions affect the re-sults our sensitivity analysis includes all epidemiological parameters prior distributionsand many of the structural assumptions introduced above (Appendix B2 and AppendixC) MCMC convergence statistics are given in Appendix C7

Results

NPI Effectiveness

Our model enables us to estimate the individual effectiveness of each NPI expressed as apercentage reduction in R As in related work168 this percentage reduction is modelledas constant over countries and time and independent of the other implemented NPIs Inpractice however NPI effectiveness may depend on other implemented NPIs and localcircumstances Thus our effectiveness estimates ought to be interpreted as the effectivenessaveraged over the contexts in which the NPI was implemented in our data10 Our results thusgive the average NPI effectiveness across typical situations that the NPIs were implementedin Figure 3 (bottom left) visualises which NPIs typically co-occurred aiding interpretation

Under the default model settings the mean percentage reduction in R (with 95 credibleinterval) associated with each NPI is as follows (Figure 3) mandating mask-wearing in(some) public spaces -1 (-13ndash8) limiting gatherings to 1000 people or less 13(-3ndash31) to 100 people or less 28 (9ndash44) to 10 people or less 36 (17ndash53) closing some high-risk businesses 20 (0ndash40) closing most nonessential busi-nesses 29 (8ndash47) closing schools and universities 41 (23ndash56) and issuingstay-at-home orders (with exemptions) 10 (-2ndash22) Note that we cannot robustlydisentangle the individual effects of closing schools and closing universities since the im-plementation dates of these NPIs coincided nearly perfectly in all countries except Icelandand Sweden (Appendix D21) We thus show the joint effect of closing both schools anduniversities and treat schools and universities closed as one single NPI going forward

Some NPIs frequently co-occurred ie were partly collinear However we are able to iso-late the effects of individual NPIs since the collinearity is imperfect and our dataset is largeFor every pair of NPIs we observe one of them without the other for 748 country-days onaverage (Appendix D22) The minimum number of country-days for any NPI pair is 143(for limiting gatherings to 1000 or 100 attendees) Additionally under excessive collinear-ity and insufficient data to overcome it individual effectiveness estimates are highly sensi-

10

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75 100Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spaces

Gatherings limited to1000 people or less

Gatherings limited to100 people or less

Gatherings limited to10 people or less

Some businessesclosed

Most nonessentialbusinesses closed

Schools and universitiesclosed

Stay-at-home order(with exemptions)

EndsTransmission

北 便

i

j

100

100

100 81

10

0 64

100

100 45

27

100 94

78

89

65

88

93

44

29

100

100 82

92

68

91

96

46

28

100

100

100 99

76

97

98

55

30

99

97

86

100 72

95

98

49

27

99

98

91

100

100 99

99

64

30

97

95

84

95

72

100 99

49

29

98

95

81

92

68

93

100 46

28

100

100 98

10

0 94

99

99

100

Frequency i Active Given j Active

0 500 1000 1500 2000 2500 3000

Days

Total Days Active

Figure 3 Top NPI effects under default model settings The Figure shows the average percentagereductions in R as observed in our data (or in terms of the model the posterior marginal distributionsof 1minusexp(minusαi )) with median 50 and 95 credible intervals A negative 1 reduction refers to a 1increase in R Cumulative effects are shown for hierarchical NPIs (gathering bans and business closures)ie the result for Most nonessential businesses closed shows the cumulative effect of two NPIs withseparate parameters and symbols - closing some (high-risk) businesses and additionally closing mostremaining (non-high-risk but nonessential) businesses given that some businesses are already closedBottom Left Conditional activation matrix Cell values indicate the frequency that NPI i (x-axis) wasactive given that NPI j (y-axis) was active Eg schools were always closed whenever a stay-at-homeorder was active (bottom row third column from the right) but not vice versa Bottom Right Totalnumber of days each NPI was active across all countries

11

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Most businessesclosed

Stay-at-homeorder

Mask wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businessesclosed

Schools anduniversities closed

Figure 4 Combined NPI effectiveness for the most common sets of NPIs in our data by size of the NPI setShaded regions denote 50 and 95 credible intervals Left Maximum R0 that could be reduced to be-low 1 for each set of NPIs Right Predicted R after implementation of each set of NPIs assuming R0 = 38Readers can interactively explore the effects of all sets of NPIs at httpepidemicforecastingorgcalc

tive to variations in the data and model parameters22 High sensitivity prevented Flaxmanet al1 who had a smaller dataset from disentangling NPI effects9 Our estimates are sub-stantially less sensitive (see next section) Finally the posterior correlations between theeffectiveness estimates are weak suggesting manageable collinearity (Appendix D23)

Although the correlations between the individual estimates are weak we should take theminto account when evaluating combined NPI effects For example if two NPIs frequentlyco-occurred there may be more certainty about the combined effect than about the twoindividual effects Figure 4 shows the combined effectiveness of the sets of NPIs thatare most common in our data Together our set of NPIs reduced R by 77 (74ndash79)on average Across countries the mean R without any NPIs (ie the R0) was 33 (Ta-ble D5 reports R0 for all countries) Starting from this number the estimated R likely couldhave been reduced below 1 by closing schools and universities high-risk businesses andlimiting gathering sizes Readers can interactively explore the effects of sets of NPIs athttpepidemicforecastingorgcalc A CSV file containing the joint effectiveness of all NPIcombinations is available online here

Sensitivity and validation

We perform a range of validation and sensitivity experiments (Appendix B with furtherexperiments in Appendix C) First we analyse how the model extrapolates to unseen coun-tries and find that it makes calibrated forecasts over periods of up to 2 months with un-certainty increasing over time Further we perform multiple sensitivity analyses studyinghow results change if we modify the priors over epidemiological parameters exclude coun-tries from the dataset use only deaths or confirmed cases as observations vary the datapreprocessing and more Finally we investigate our key assumptions by showing results

12

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

for several alternative models (structural sensitivity10) and examine possible confoundingof our estimates by unobserved factors influencing R In total we consider 201 alternativeexperimental conditions

Figure 5 (left) shows the median NPI effectiveness across these 201 experimental condi-tions Compared to the results under our default settings (Figures 3 and 4) median NPIeffects vary under alternative plausible experimental conditions However the trends in theresults are robust and some NPIs outperform others under all tested conditions While wetest over large ranges of plausible values our experiments do not include every possiblesource of uncertainty and the results might change more substantially under experimentalconditions we have not tested

We categorise NPI effects into small moderate and large which we define as a medianreduction in R of less than 175 between 175 and 35 and more than 35 (verticallines in Figure 5) Six of the NPIs fall into a single category in a large fraction of experi-mental conditions school and university closures are associated with a large effect in 99of experimental conditions limiting gatherings to 10 people or less in 94 Closing mostnonessential businesses has a moderate effect in 96 of conditions limiting gatherings to100 people or less in 97 Making mask-wearing mandatory in (some) public spaces fallsinto the ldquosmall effectrdquo category in 100 of experimental conditions issuing stay-at-homeorders (with exemptions) in 99 Two NPIs fall less clearly into one category Closing some(high-risk) businesses has a moderate effect in 86 of conditions and limiting gatherings to1000 people or less has a small effect in 79 However both NPIs have small-to-moderateeffects in more than 99 of experimental conditions The effect of limiting gatherings to1000 people or less is the least stable across the sensitivity analyses which may reflect itsaforementioned partial collinearity with limiting gatherings to 100 people or less

Aggregating all sensitivity analyses can hide sensitivity to specific assumptions We displaythe median NPI effects in four categories of sensitivity analyses (Figure 5 right) and eachindividual sensitivity analysis is shown in the Appendix The trends in the results are alsostable within the categories

Discussion

We use a data-driven approach to estimate the effects that eight nonpharmaceutical in-terventions had on COVID-19 transmission in 41 countries between January and the endof May 2020 We find that several NPIs were associated with a clear reduction in R inline with the mounting evidence that NPIs can be effective at mitigating and suppressingoutbreaks of COVID-19 Furthermore our results suggest that some NPIs outperformedothers While the exact effectiveness estimates vary with modelling assumptions the broadconclusions discussed below are largely robust across 201 experimental conditions in 11sensitivity analyses

13

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

0 25 50Median reduction in Rt

Mask-wearing mandatory in(some) public spacesGatherings limited to1000 people or lessGatherings limited to100 people or lessGatherings limited to10 people or lessSome businessesclosedMost nonessentialbusinesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

All Sensitivity Analyses (201 conditions)北便

Model Structure(5 conditions)

北便

Data and Preprocessing(51 conditions)

0 25 50Median reduction in Rt

北便

Epidemiological Priors(131 conditions)

0 25 50Median reduction in Rt

北便

Unobserved Factors(14 conditions)

Figure 5 Median NPI effects across the sensitivity analyses Left Median NPI effects (reduction in R)when varying different components of the model or the data in 201 experimental conditions Results aredisplayed as violin plots using kernel density estimation to create the distributions Inside the violinsthe box plots show median and interquartile-range The vertical lines mark 0 175 and 35 (seetext) Right Categorised sensitivity analyses Structural Using only cases or only deaths as observations(2 experimental conditions Figure B12) varying the model structure (3 conditions Figure B13 left)Data Leaving out one country at a time (41 conditions Figure B10) varying the threshold below whichcases and deaths are masked (8 conditions Figure C17) sensitivity to correcting for undocumentedcases and to country-level differences in case ascertainment (2 conditions Figure B11) Epidemiologicalpriors Jointly varying the means of the priors over the means of the generation interval the infection-to-case-confirmation delay and the infection-to-death delay (125 conditions Figure C15) varying theprior over R0 (4 conditions Figure C16 left) varying the prior over NPI effectiveness (2 conditionsFigure C16 right) Unobserved factors Excluding observed NPIs one at a time (9 conditions Figure B14left) controlling for additional NPIs from a different dataset (5 conditions Figure B14 right)

14

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Business closures and gathering bans both seem to have been effective at reducing COVID-19 transmission Closing only high-risk businesses appears to have been only somewhat lesseffective than closing most nonessential businesses the median reduction in R differs onlyby 9 (6ndash12 mean and 95 interval of median estimates across the experiments set-tings in Figure 5) Closing only high-risk businesses may thus have been the more promisingpolicy option in some circumstances Limiting gatherings to 10 people or less was more ef-fective than limits up to 100 or 1000 people This may reflect the fact that small gatheringsare common

As previously discussed we estimate the average effect each NPI had in the contexts inwhich it was implemented When countries introduced stay-at-home orders they nearly al-ways also banned gatherings and closed schools universities and some or most businessesif they had not done so already (Figure 3 bottom left) Flaxman et al1 and Hsiang et al3

add the effect of these distinct NPIs to the effectiveness of stay-at-home orders and accord-ingly find a large effect In contrast we account for these other NPIs separately and isolatethe additional effect of ordering the population to stay at home (when large gatheringsare banned and educational institutions and some businesses closed)d In accordance withother studies that took this approach we find a small effect26 A typical country could havereduced R to below 1 without a stay-at-home order (Figure 4) provided other NPIs wereimplemented

Mandating mask-wearing in various public spaces had no clear effect on average in thecountries we studied This does not rule out mask-wearing mandates having a larger ef-fect in other contexts In our data mask-wearing was only mandated when other NPIshad already reduced public interactions When most transmission occurs in private spaceswearing masks in public is expected to be less effective This might explain why a largereffect was found in studies that included China and South Korea where mask-wearingwas introduced earlier823 While there is an emerging body of literature indicating thatmask-wearing can be effective in reducing transmission the bulk of evidence comes fromhealthcare settings24 In non-healthcare settings risk compensation25 may play a largerrole potentially reducing effectiveness While our results cast doubt on reports that mask-wearing is the main determinant shaping a countryrsquos epidemic23 the policy still seemspromising given all available evidence due to its comparatively low economic and socialcosts Its effectiveness may have increased as other NPIs have been lifted and public inter-actions have recommenced

We find a large effect for school and university closures This finding is remarkably robustacross different model structures variations in the data and epidemiological assumptions(Figure 5) It remains robust when controlling for NPIs excluded from our study (Fig-ure B14) Our approach cannot distinguish direct and indirect effects such as forcing par-

dIn principle a stay-at-home order does not imply these other NPIs It is possible for a country to simultaneouslyhave allowed schools to be open and have a stay-at-home order (with exemptions for attending school) inplace However this is not how stay-at-home orders were implemented by the countries in our dataset andwe cannot estimate the effect of such a hypothetical stay-at-home order from the available data

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ents to stay at home or causing broader behaviour changes by increasing public concernAdditionally since school and university closures almost perfectly coincided in the coun-tries we study our approach cannot distinguish their individual effects (Appendix D21)This limitation likely also holds for other observational studies which do not include dataon university closures and estimate only the effect of school closures1ndash35ndash8 Previous evi-dence on school and university closures is mixed1626 Early data suggest that children andyoung adults are equally susceptible to infection but have a notably lower observed inci-dence rate than older adultsmdashwhether this is due to school and university closures remainsunknown27ndash30 Although infected young people are often asymptomatic they appear toshed similar amounts of virus as older people3132 and might therefore transmit the infec-tion to higher-risk demographics unknowingly As the role of children in transmission isstill unclear33 while outbreaks detected in schools are rising33ndash35 this topic merits carefulattention

Our study has several limitations First NPI effectiveness may depend on the context ofimplementation such as the presence of other NPIs and country-specific factors Our esti-mates must be interpreted as the average effectiveness over the contexts in our dataset10and expert judgement is required to adjust them to local circumstances Second R mayhave been reduced by unobserved NPIs or spontaneous behaviour changes To investigatewhether these reductions could be falsely attributed to the observed NPIs we perform sev-eral additional analyses and find that our results are stable to a range of unobserved effects(Appendix B3) However this sensitivity check cannot provide certainty Investigating therole of unobserved effects is an important topic to explore further Third our results can-not be used without qualification to predict the effect of lifting NPIs For example closingschools and universities seems to have greatly reduced transmission but this does not meanthat reopening them will cause infections to soar Educational institutions can implementsafety measures such as reduced class sizes as they reopen Further work is needed to anal-yse the effects of reopenings we hope that our collected data aids this effort Fourth whilewe included more NPIs than most previous work (Table F7) several promising NPIs wereexcluded For example testing tracing and case isolation may be an important part of acost-effective epidemic response36 but were not included because it is difficult to obtaincomprehensive data We discuss further limitations in Appendix E

Although our work focuses on estimating the impact of NPIs on the reproduction numberR the ultimate goal of governments may be to reduce the incidence prevalence and excessmortality of COVID-19 Controlling R is essential but the contribution of NPIs towards thesegoals may also be mediated by other factors such as their duration and timing37 periodicityand adherence3839 and successful containment40 While each of these factors addressestransmission within individual countries it can be crucial to additionally synchronise NPIsbetween countries since cases can be imported41

In conclusion as governments around the world seek to keep R below 1 while minimisingthe social and economic costs of their interventions we hope that our results can informpolicy decisions on which NPIs to implement in any potential further wave of infectionsAdditionally our work may provide insight on which areas of public life are most in need

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

of restructuring so that they can continue despite the pandemic However our estimatesshould not be seen as the final word on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies simulation studiesexperimental trials and clinical experience

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

The paid contractor work helping with the data collection the development of the interac-tive website and the costs for cloud compute were funded by the Berkeley Existential RiskInitiative

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier V Mikulik MAHartwick L Chindelevitch designed the model and modelling experiments M Sharma ABStephenson T Gavenciak J Salvatier performed and analysed the modelling experimentsJ Kulveit T Gavenciak JM Brauner conceived the research S Mindermann M SharmaJM Brauner L Chindelevitch J Kulveit T Besiroglu did the literature search JM BraunerS Mindermann M Sharma G Leech L Chindelevitch T Besiroglu V Mikulik wrote themanuscript All authors read and gave feedback on the manuscript and approved the finalmanuscript JM Brauner S Mindermann and M Sharma contributed equally L Chindele-vitch Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

3 Solomon Hsiang et al ldquoThe effect of large-scale anti-contagion policies on the COVID-19 pandemicrdquo In Nature 5847820 (2020) pp 262ndash267

4 Shengjie Lai et al ldquoEffect of non-pharmaceutical interventions to contain COVID-19 inChinardquo en In Nature 5857825 (Sept 2020) pp 410ndash413 DOI 101038s41586-020-2293-x

5 Yang Liu et al ldquoThe impact of non-pharmaceutical interventions on SARS-CoV-2 trans-mission across 130 countries and territoriesrdquo In medRxiv (2020)

6 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

7 Nazrul Islam et al ldquoPhysical distancing interventions and incidence of coronavirusdisease 2019 natural experiment in 149 countriesrdquo In bmj 370 (2020)

8 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo In Covid Economics 7 (Apr 2020) pp 46ndash67

9 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

10 Mrinank Sharma et al ldquoOn the robustness of effectiveness estimation of nonpharma-ceutical interventions against COVID-19 transmissionrdquo In arXiv preprint arXiv200713454(2020) URL httpsarxivorgabs200713454

11 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

12 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

13 Ensheng Dong Hongru Du and Lauren Gardner ldquoAn interactive web-based dashboardto track COVID-19 in real timerdquo In The Lancet Infectious Diseases 205 (May 2020)pp 533ndash534 DOI 101016s1473-3099(20)30120-1

14 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

15 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

16 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

17 Sam Abbott et al ldquoEstimating the time-varying reproduction number of SARS-CoV-2using national and subnational case countsrdquo In Wellcome Open Research 5112 (2020)p 112

18 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

19 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

20 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Young Joon Park et al ldquoContact tracing during coronavirus disease outbreak SouthKorea 2020rdquo In Emerg Infect Dis 2610 (2020)

28 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

29 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

30 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

31 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

32 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

33 Chen Stein-Zamir et al ldquoA large COVID-19 outbreak in a high school 10 days afterschoolsrsquo reopening Israel May 2020rdquo In Eurosurveillance 2529 (2020) p 2001352

34 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 38 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile919676Weekly_COVID19_Surveillance_Report_week_38_FINAL_UPDATEDpdf 2020

35 Meira Levinson Muge Cevik and Marc Lipsitch Reopening Primary Schools during thePandemic 2020

36 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

37 K Prem et al ldquoThe effect of control strategies to reduce social mixing on outcomes ofthe COVID-19 epidemic in Wuhan China a modelling studyrdquo In Lancet Public Health5 (5 May 2020) e261ndashe270

38 Patrick G T Walker et al ldquoThe impact of COVID-19 and strategies for mitigationand suppression in low- and middle-income countriesrdquo In Science 3696502 (2020)pp 413ndash422

39 Nicholas G Davies et al ldquoEffects of non-pharmaceutical interventions on COVID-19cases deaths and demand for hospital services in the UK a modelling studyrdquo In TheLancet Public Health 57 (2020) e375ndashe385

40 Xingjie Hao et al ldquoReconstruction of the full transmission dynamics of COVID-19 inWuhanrdquo In Nature 5847821 (Aug 2020)

41 N W Ruktanonchai et al ldquoAssessing the impact of coordinated COVID-19 exit strate-gies across Europerdquo In Science (2020) DOI 101126scienceabc5096

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 23

Appendix A1 Detailed model description 23

Appendix A2 Testing reporting and infection fatality rates 26

Appendix A3 Method for uncertainty over key epidemiological parameters 27

Appendix B Validation 30

Appendix B1 Unseen data 30

Appendix B2 Sensitivity Analysis 30

Appendix B3 Robustness to unobserved effects 35

Appendix C Additional sensitivity analyses and validation 38

Appendix C1 Multivariate Sensitivity Analysis - Expanded Results 38

Appendix C2 Sensitivity to additional epidemiological assumptions 40

Appendix C3 Sensitivity to data preprocessing 40

Appendix C4 Validation by predicting unseen data - all countries 42

Appendix C5 Posterior predictive distributions 47

Appendix C6 Calibration without countries used for hyperparameter selection 48

Appendix C7 MCMC stability results 49

Appendix D Additional results 50

Appendix D1 Estimated R0 by country 50

Appendix D2 Collinearity 51

Appendix D3 Posterior Epidemiological Parameter Distributions 53

Appendix E Additional discussion of assumptions and limitations 55

Appendix E1 Limitations of the data 55

Appendix E2 Model limitations 55

Appendix F Overview of previous work 57

Appendix G Handling edge cases in the data collection 60

Appendix H All model equations 64

Appendix I References 67

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections N (sdot)tc

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Mask wearing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closedSchools closed

Stay-at-home order

Growth reductions from interventions αi i

Daily growth rate gtc

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Uncertain delay from infection to

confirmation death

Uncertain generation interval

Universities closed

Figure A6 Model Overview Purple nodes are observed We describe the diagram from bottom to topThe effectiveness of NPI i is represented by αi which is independent of the country On each day t a countryrsquos daily reproduction number Rt c depends on the countryrsquos basic reproduction number R0cand the active NPIs The active NPIs are encoded by Φi t c which is 1 if NPI i is active in country c attime t and 0 otherwise Rt c is transformed into the daily growth rate gt c using the generation intervalparameters and subsequently is used to compute the new infections N (C )

t c and N (D)t c that will turn into

confirmed cases and deaths respectively Finally the number of new confirmed cases Ct c and deathsDt c is computed by a discrete convolution of N (middot)

t c with the respective delay distributions Our modeluses both death and case data it splits all nodes above the daily growth rate gt c into separate branchesfor deaths and confirmed cases We account for uncertainty in the generation interval infection-to-case-confirmation delay and the infection-to-death delay by placing priors over the parameters of thesedistributions

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1but with several key extensions First we model both confirmed cases and deaths al-lowing us to leverage significantly more data Furthermore we do not assume a specificinfection fatality rate (IFR) since we do not aim to infer the total number of COVID-19 in-fections Additionally since epidemiological parameters are only known with uncertaintywe place priors over them following recent best practice2 Concretely we place priors onthe means and standard deviationsdispersions of the generation interval the infection-to-confirmation delay and the infection-to-death delay These prior choices are detailedin Appendix A3 The end of this section details further adaptations which allow us to

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

minimise assumptions about testing reporting and the IFR Code is available online hereFor readers who wish to re-implement the model a concise list of all equations is given inAppendix H

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior distribution over R0c reflecting the wide disagreement of regionalestimates of R0

3 The mean R0c is 328 based on a meta-analysis4 We parameterize theeffectiveness of NPI i assumed to be same across countries and time with αi Each NPI isassumed to have an independent multiplicative effect on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We place an Asymmetric Laplace prior over αi with scale parameter10 asymmetry parameter 05 and location parameter 0 Our prior allows for (unbounded)positive and negative effects as we cannot a priori exclude the possibility that the introduc-tion of an NPI increases R However our prior places 80 of its mass on positive effectsreflecting a belief that NPIs are much more likely to reduce Rt than not This is a shrinkageprior placing 50 of its mass on lsquosmallrsquo effectiveness (less than 10 change in Rt ) All priordistributions are independent

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c and Rt c

(Eq 29 in5)

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We account for uncertainty inthe generation interval (GI) assumed to be a Gamma distribution by placing prior distri-butions over its mean and standard deviation (Appendix A3) Using (A2) we can writeg t c as a function g t c (Rt c microGIσGI) (see Appendix H) where microGI is the GI mean and σGI isthe GI standard deviation

aMany epidemiological models define growth rates as the exponent r in an exponential growth function Herewe use daily growth rates instead for ease of exposition These choices are mathematically equivalent Notethat we adapted equation (29) in Wallinga amp Lipsitch5 to account for our choice

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Note that the GI can change over time due to NPIs In particular the effective contact tracingand case isolation in China substantially shortened the mean GI to only 26 days among theidentified cases6 Since most cases were likely not identified even in Wuhan China7 andour study omits most of the countries with highly effective contact tracing programs suchas China or South Korea we model the GI as constant (but uncertain) over time A possibleextension for further work is to model the change in GI explicitly

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are inferred from the observation models for cases and deathsWe assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)] (A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)] (A4)

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross-validation as no reference is available for it We evaluate five dif-ferent values (σN isin 00501020304) with a fixed randomly chosen validation set of6 countries σN = 02 maximises the predictive log-likelihood on the validation set Thisensures a more calibrated model which is less likely to produce overconfident or unstableestimates10 We did not tune any other aspect of the model

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation In realitythis delay distribution is the sum of two independent distributions the incubation period(lognormal distribution) and the delay from onset of symptoms to confirmation (negativebinomial distribution) These distributions are uncertain but modelling (and placing pri-ors over) them individually leads to unidentifiability Therefore we model the total delay

bSince we treat new infections as a continuous number its initial value can be between 0 and 1

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

between infection and confirmation assuming that this follows a negative binomial dis-tribution We convert priors over the individual delays to a prior over the total delay bybootstrapping mdash please see Appendix A3

As in Flaxman et al1 the observed cases Ct c follow a negative binomial noise distribu-tion with mean C t c and an inferred dispersion parameter Ψ(C ) This distribution encodesthat small case numbers are more noisy and should therefore receive less weight Hav-ing separate dispersion parameters for cases and deaths ensures that they can be weighteddifferently if there is a difference in their noise distributions

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ) (A6)

where PD (delay) is the distribution of the delay from infection to death As for cases wemodel the total delay between infection and death as negative binomial which is in realitythe sum of two independent distributions the incubation period and the delay from onsetof symptoms to death (gamma distribution)

Finally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and an inferred dispersion parameter Ψ(D)

The model was implemented in PyMC311 We infer the unobserved variables in our modelusing the No-U-Turn Sampler (NUTS)12 a standard Markov chain Monte Carlo samplingalgorithm

Appendix A2 Testing reporting and infection fatality rates

Scaling all values of a time series by a constant does not change its growth rates Themodel is therefore invariant to the scale of the observations and consequently to country-level differences in the IFR and the ascertainment rate (the proportion of infected peoplewho are subsequently reported positive) For example assume countries A and B differ onlyin their ascertainment rates Then our model will infer a difference in N (C )

0c (Eq (A3)) butnot in the growth rates g t c across A and B Accordingly the inferred NPI effectiveness willbe identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas the

cThis is only approximately true The negative binomial output distribution has a coefficient of variation dimin-ishing with its mean ie smaller observations are relatively more noisy and carry less weight Furthermorewhilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligible effect

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

introduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimics changes in the ascertainment ratemdashnoise at time

τ affects all future casesmdashand allows for gradual multiplicative changes in ascertainment

Appendix A3 Method for uncertainty over key epidemiological parameters

We account for uncertainty in the following delay distributions the generation interval thedelay from infection to symptom onset (incubation period) the delay from symptom onsetto case confirmation and the delay from symptom onset to death

Each distribution has an uncertain mean (Table A3) whose prior distribution we takefrom a meta-analysis13 published shortly after our window of analysis This helps accountfor possible heterogeneity between different populations and therefore often finds higheruncertainty in the means than estimated in primary studies while using more data Themeta-analysis reports mean values with 95 confidence intervals We set normal priors overthe mean delays with mean equal to the reported mean in the meta-analysis and standarddeviation set such that the prior 95 credible interval includes the 95 confidence intervalsfrom the meta-analysis For example the meta-analysis reports an expected mean delayfrom symptoms to death of 1671 days with a 95 credible interval ranging from 1537 to1817 We thus assume that the prior is normal with micro= 1671 and σ= 075 which ensuresthat its 95 credible interval ranges from 1525 to 1817 strictly including the intervalreported in the meta analysis to be conservative Priors are truncated at zero

Furthermore we account for the uncertainty in the standard deviation of these delay dis-tributions by placing normal priors based on primary studies following the same methodas above (Table A4) For the standard deviation of the generation interval we use patientdata from Feretti et al14 and reproduce their method fitting a Gamma distribution to thegeneration time

Finally the distributional forms of the delay distributions are taken from the above primarystudies

The delay from infection to case confirmation is the sum of the incubation period and thesymptom-onset-to-case-confirmation delay Similarly the delay from infection to death isthe sum of the incubation period and the symptom-onset-to-death delay However forcomputational tractability we model the total delay distributions converting the priors de-scribed above to priors over the total delays We assume that the total delays follow negativebinomial distributions which are computationally tractable and empirically provide a closefit to both sum distributions We place normal priors over the mean and dispersion parame-ters of these total delay distributions by bootstrapping For example we compute priors forthe total infection-to-case-confirmation delay distribution by sampling Nbootstrap = 250 in-

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

cubation periods and symptom-onset-to-case-confirmation distributions For each sampledpair of distributions we draw 106 samples from their sum and fit a negative binomial usingmoment matching We compute the mean and standard deviation of the negative binomialdistribution parameters across the bootstrap and use this for our prior The resulting priorsare given in Table A2

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table A2 Epidemiological parameter distributions used in the model The details on sources and priorsused to generate this Table are given in Tables A3 and A4

Delay type Sources Prior on mean Prior on standard Distributionaldeviation or formdispersion

Infection to case Fonfria et al13 N (1092σ= 094) Dispersion Negativeconfirmation Lauer et al15 N (541σ= 027) binomial

Cereda et al16

Infection to death Fonfria et al13 N (2182σ= 101) Dispersion NegativeLauer et al15 N (1426σ= 518) binomialLinton et al17

Generation interval Fonfria et al13 N (506σ= 033) SD N (211σ= 05) GammaFeretti et al14

Table A3 Mean of epidemiological parameters all from meta-analysis13 and distributional forms

Delay type Reported mean Prior on mean Distributionaland 95 CI form of delay

Incubation period 579 (548ndash611) logmicrosimN (153σ= 0051) Lognormal15

Onset to death 1671 (1537ndash1817) N (1671σ= 075) Gamma17

Onset to case confirmation 582 (474ndash715) N (582σ= 068) Negative binomial16

Generation interval 506 (450ndash570) N (506σ= 033) Gamma14

Table A4 Standard deviation or dispersion of epidemiological parameters

Delay type Source Reported standard deviation or Prior on standarddispersion with uncertainty deviation or

dispersionIncubation period Lauer et al15 SD of log delay SD of log delay

048 (95 CI 027ndash054) N (048σ= 00759)Onset to death Linton et al17 SD 69 (95 CI 52ndash91) N (69σ= 1122)Onset to case confirmation Cereda et al16 Dispersion 157 plusmn 0054 N (157σ= 0054)Generation interval Feretti et al14 SD 211 plusmn 05 (reproduced by us) N (211σ= 05)

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking its predictions on unseendata even if prediction is not the purpose of the model1018 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries we have strong reason to doubt itseffectiveness estimates However we do not expect NPI effectiveness models to extrapolateperfectly Almost always unobserved factors such as changes in the ascertainment rate orIFR spontaneous behaviour changes or unobserved NPIs will affect the observed numberof cases and deaths Our models ought to treat these factors as noise and not attribute theireffects on Rt to the observed NPIs

We use Leave-One-Out Cross-Validation (LOO-CV) fitting the model on 40 countries andextrapolating to the excluded country We repeat this process for all 41 countries In theexcluded country the first 14 days of case and death data are observed to estimate R0c aswell as the initial outbreak sizes N (C )

0c and N (D)0c All later days are unobserved (masked)

The model uses the effectiveness estimates inferred from the 40 included countries and NPIactivation dates to predict the course of the pandemic in the excluded country

Figure B7 visually shows extrapolations obtained from LOO-CV for a randomly chosensubset of six countries (all countries are shown in Figures C18 to C21) The predictiveaccuracy of the model as expected degrades with an increasing prediction horizon sug-gesting the presence of unobserved factors However the predictive credible intervals arewide and increase over time as noise in the growth rate accumulates Importantly ourmodel makes calibrated forecasts in excluded countries (Fig B8) over periods of up to 2months

These are challenging predictions to the best of our knowledge no related study analyseshow their estimated NPI effects generalise to unseen countries Most related studies donot validate predictions on unseen data at all19ndash23 reviewed in9 Flaxman et al1 hold outthe last 14 days from 20 April to 4th of May for all countries in aggregate However thecountries they study implemented all NPIs in March or earlier (Supplementary Table 2 in1)meaning that in fact all countries were used to estimate NPI effects

Note that Fig B8 includes the 6 countries that were used to select the hyperparameter(Appendix A) However the calibration is similar if these 6 countries are excluded (Fig-ure C23)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals the extent to which results depend on uncertain parametersand modelling choices and can diagnose model misspecification and excessive collinearity

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

100

101

102

103

104

105

106

便便

Slovenia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

便 便

Greece

Figure B7 Model extrapolations on a randomly chosen subset of 6 excluded countries produced usingleave-one-out cross-validation In each excluded country the first 14 days of death and case data (soliddots) are observed the model then extrapolates to all future days within the period of analysis (emptydots) For each country we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier see Methods) The shaded areas representthe 95 credible intervals

in the data24 We vary many of the components of our model and recompute the NPIeffectiveness estimates summarised here Further analysis can be found in Appendix C

For each sensitivity analysis we quantify sensitivity using an easily interpretable measureshown above the legend of each plot the average standard deviation denoted σ Firstfor each NPI we compute the standard deviation of its median effect estimates across allexperimental conditions in the given sensitivity analysis We then average these acrossthe NPIs Since effectiveness is expressed in percentage reductions of Rt the unit of σ ispercent Zero percent corresponds to no sensitivity

We perform a multivariate sensitivity analysis on the key epidemiological priors For com-putational reasons all other sensitivity analyses are univariate

Multivariate sensitivity to main epidemiological priors The key epidemiological param-eters in our model describe the delay from infection to case-confirmation the delay from

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

0 20 40 60 80 100Credible Interval

0

20

40

60

80

100

Em

piric

al C

umul

ativ

e Fr

eque

ncy

Calibration Country Holdouts

Figure B8 Calibration in held-out countries with leave-one-out cross validation In each excludedcountry the first 14 days of death and case data are observed to estimate R0c as well as the initialoutbreak sizes the model then extrapolates to all future days within the period of analysis The plotshows the percentage of observed daily case and death counts that lie within the X credible intervalacross all countries and days The identity line represents ideal calibration

infection to death and the generation interval Our model places prior distributions overthe means of these delay distributions Since the priors already account for uncertainty inthe delays this analysis considers what would happen if our prior beliefs changed Sincethe epidemiological parameters may interact in unexpected ways we perform a multivari-ate sensitivity analysis jointly changing the mean of the prior over the mean of each delaydistribution (referred to as the prior mean of the mean below) We vary the prior means ofthe mean delays across a wide but epidemiologically plausible range of 5 values resultingin 125 different experimental conditions We present these results in two ways

First Figure B9 shows the global sensitivity of our results to the prior mean of the mean ofeach distribution individually For example for each setting of the prior mean of the meangeneration interval we show the average over the 5times5 = 25 runs with different prior meansplaced over the mean delays between infection and case confirmationdeath This is equiv-alent to placing a uniform prior across the 125 experiment conditions and marginalisingand is standard methodology for computing global sensitivity to an individual parameter

The prior means seem to mostly influence the relative effectiveness of business closuresand gathering bans Increasing the prior means of the infection-to-case-confirmationdeathmean delays results in larger effectiveness estimates for gatherings bans and smaller esti-

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

mates for business closures while increasing the prior mean of the generation interval hasthe opposite effect

Second we assess the sensitivity of our results to jointly varying the prior means This yieldsσ= 246 We present this result graphically in Appendix C1

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Infection to Case-Confirmation Delay= 112Mean 792 daysMean 942 daysMean 1092 daysMean 1242 daysMean 1392 days

-25 0 25 50 75Average reduction in Rtin the context of our data

Infection to Death Delay= 080Mean 1782 daysMean 1982 daysMean 2182 daysMean 2382 daysMean 2582 days

-25 0 25 50 75Average reduction in Rtin the context of our data

Generation Interval= 149Mean 306 daysMean 406 daysMean 506 daysMean 606 daysMean 706 days

Figure B9 Sensitivity to key epidemiological parameter choices evaluated using a multivariate analysisWe jointly vary the prior means over the mean of the generation interval the delay between infectionand case confirmation and the delay between infection and death Each panel displays sensitivity to theprior mean of one distribution and the NPI effects are computed by averaging over the 25 runs withdifferent prior means of the two other distributions (see text)

Sensitivity to country exclusions Figure B10 shows results if one country at a time isexcluded from the data As there is no strong justification for including or excluding anyparticular country results ought to be stable if a country is excluded

-25 0 25 50 75

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

= 151DefaultAlbaniaAndorraAustriaBelgiumBosnia andHerzegovinaBulgariaCroatia

-25 0 25 50 75

= 151DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-25 0 25 50 75

= 151DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

= 151DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-25 0 25 50 75Average reduction in Rtin the context of our data

= 151DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-25 0 25 50 75Average reduction in Rtin the context of our data

= 151DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure B10 Sensitivity to excluding individual countries

Sensitivity to case-undercounting Figure B11 shows results when scaling the recordednumber of cases in each country First we correct the number of reported cases on each day

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

using estimates of the time-varying ascertainment rate from Golding et al25 For example ifthe estimated ascertainment rate in country c at time t is 50 we double the correspondingnumber of reported cases With this altered case data the model estimates a larger effectfor closing schools and universities and a smaller effect for limiting gatherings to 1000people or less There is little change in the effects estimated for the other NPIs

Second we randomly either double or halve the reported number of cases in each countryThis is intended to demonstrate that the model is invariant to constant differences in ascer-tainment between countries As expected there are negligible differences compared to thedefault results

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Case Undercounting= 163

Time-Varying ScalingRandom Constant ScalingDefault

Figure B11 Sensitivity to case undercounting considering both constant undercounting and time-varying undercounting

Sensitivity to structurally different models Figure B12 shows the NPI effectivenessestimates from models that use only cases or deaths as observations in contrast to ourmain model which uses both As expected there are some differences in the NPI effectestimates between the models which may be caused by the unique biases present in casedata and in death data Differences could also occur if NPIs differentially affected casesand deaths (eg if they affect different age groups) Nevertheless the studyrsquos qualitativeconclusions as outlined in the Discussione hold across the different data types

A number of implicit structural assumptions are made by the choice of model structureWe test sensitivity to these assumptions by evaluating NPI effectiveness estimates from al-ternative models reproducing the structural sensitivity analysis from our concurrent workwhere these models are described and compared in detail9 While these alternative modelstructures are also plausible we chose the model described in the main text as it is relativelysimple whilst also being highly robust9

eThe main qualitative conclusions as outlined in the Discussion are Stay-at-home orders had a small effectMandating mask-wearing had a small effect School and university closures had a large effect Gatherings banswere effective More strict gathering bans were more effective than less strict ones Business closures wereeffective Closing most nonessential businesses had limited benefit over closing just high-risk businesses

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Data Type

= 347Only Case DataOnly Death DataDefault

Figure B12 Sensitivity to different data types ie models using only confirmed cases only deaths orboth

The models are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting Rt into a daily growth rate a renewalprocess is used as the infection model as in earlier work1826ndash28 For computationalreasons we do not place a prior distribution over the generation interval parametersfor this model

3 Noisy-R Model The noise terms ε(middot)ct affect Rt rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on Rt The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

As Figure B13 (left) shows the multiplicative effect models support almost all conclusionsdrawn in the Discussione The only exception is that the stay-at-home order NPI is cate-gorised as moderately effective under the Different Effects Model (median reduction in Rt

of 18)

The results of the Additive Effects Model cannot be directly compared to the other modelssince it expresses results on a different scale (percentage reduction in R0 instead of Rt ) Itis therefore shown in a separate panel However the trend in the results is visually similarto the default results

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs which were implemented nor directly measures broaderbehavioural changes Since these factors influence Rt we must be wary of their effectbeing attributed to observed NPIs We investigate this further by assessing how much effec-

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

Structural Sensitivity Multiplicative Models= 311

Noisy-RDiscrete Renewal

Different EffectsDefault

-25 0 25 50 75Average reduction in Rtas a percentage of R0

in the context of our data

Structural Sensitivity Additive Model

Figure B13 Structural sensitivity analysis NPI effectiveness estimates under different structural as-sumptions Left Effectiveness estimates for multiplicative effect models Note that for computationalreasons the discrete renewal model does not have a prior over the generation interval Right Effective-ness estimates for an additive effect models For this model the effectiveness of each NPI is expressedas a reduction of R0 rather than Rt

tiveness estimates change when previously unobserved factors are included and also whenobserved factors are excluded This is best practice for addressing unobserved factors suchas confounders2930

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs31 The timings of our observed NPIsrsquo implementation dates are indeed mutuallycorrelated prompting the question of how much results change when we make previouslyobserved NPIs unobserved Figure B14 (left) shows NPI effectiveness estimates when pre-viously observed NPIs are excluded in turn The studyrsquos main qualitative conclusionse holdacross all experimental conditions and the variations in NPI effectiveness are rarely largeenough to cause a change in effectiveness category (Figure 5) except for the Gatheringslimited to 1000 people or less NPI Considering that some of the excluded NPIs have strongestimated effects when included and are correlated with other NPIs this degree of robust-ness is encouragingly high It suggests that unobserved factors will not significantly biasresults as long as their effects and their correlations with the studied NPIs do not exceedthose of the studied NPIs We hypothesise that this robustness to unobserved factors is dueto the noise on the daily growth rate in our model9

In addition Figure B14 (right) shows NPI effectiveness estimates when we observe (iecontrol for) additional NPIs taken from the OxCGRT dataset32

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Inclusion of OxCGRT NPIs= 153

Travel ScreenQuarantineTravel BansSymptomatic TestingPublic Information CampaignsInternal Movement LimitedPublic Transport LimitedDefault

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

Exclusion of Collected NPIsUncombined Effects Shown

= 188Mask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay Home OrderDefault

Figure B14 Robustness to unobserved effects Left Results when excluding previously observed NPIsWe exclude one of the NPIs in turn and show the estimates for the other NPIs Note that this subplotshows the marginal effect for hierarchical NPIs rather than the total effect different to other figuresthat show cumulative effects for gathering bans and businesses closures For example Figure 3 displaysthe total effect of closing most nonessential businesses while here we show the additional effect ofclosing most nonessential businesses over just closing some high-risk businesses We show the additionaleffects here because the effect of a cumulative intervention would become undefined when part of it isexcluded from the analysis Right Results when controlling for previously unobserved NPIs We includeone additional NPI in turn and show the estimates for the NPIs in our study (the additional NPI is notshown)

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Appendix C1 Multivariate Sensitivity Analysis - Expanded Results

We now present the results of our joint multivariate sensitivity analysis (previously sum-marised in Figure B9) in which we vary the mean of the prior (ldquoprior meanrdquo) over themean of the generation interval the delay between infection and death and the delaybetween infection and case confirmation

Figure C15 shows for each NPI how NPI effects change when all prior means are variedsimultaneously

We find that the most sensitive NPIs are Gatherings limited to 1000 people or fewer Gath-erings limited to 100 people or fewer and Most nonessential businesses closed However thelargest differences appear when all of the parameters are set to the most extreme valuesthat jointly produce a change in a particular direction We also see systematic trends asthe parameters are varied eg increasing the prior mean of the generation interval meantends to decrease the effectiveness of the Gatherings limited to 1000 or fewer NPI while in-creasing prior means of the infection to deathcase confirmation distribution means tendsto increase the effectiveness of this NPI

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

792

942

1092

1242

1392

Mas

kWea

ring

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

GI = 306

-150

-75

00

75

150GI = 406

-150

-75

00

75

150GI = 506

-150

-75

00

75

150GI = 606

-150

-75

00

75

150GI = 706

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

00

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

0

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Som

eBus

s

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Mos

tBus

s

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Educ

at

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

792

942

1092

1242

1392

Stay

Hom

e

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

Figure C15 Global sensitivity analysis Each cell shows the difference between the median effectivenessestimate under a specific experimental condition and the default condition where the estimates areexpressed as percentage reduction in Rt Each subplot represents a particular prior mean of the meangeneration interval and an NPI The NPIs vary top to bottom and the generation interval prior meanincreases left to right Each cell with a subplot represents a particular choice of the prior mean of themean infection to death delay (increasing left to right) and the prior mean of the mean infection to caseconfirmation delay (increasing bottom to top) Positive cell values indicate that the median NPI estimateis larger than with default settings and vice versa

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C2 Sensitivity to additional epidemiological assumptions

For completeness Figure C16 shows sensitivity to two further epidemiological assumptionsOn the left we show the dependence of NPI effectiveness estimates on R0 We place aNormal prior over R0 and a hyperprior over its variance reflecting the wide disagreementof regional estimates of R0 In the sensitivity analysis we vary the mean of the prior froma low value (238) to a high one (428) As expected higher mean R0 values result in largerNPI effects This trend is apparent in all NPIs but stronger for NPIs that tended to beimplemented early on (like school closures and gathering bans)

On the right we vary the prior over NPI effectiveness Our default prior on NPI effectivenessis assymmetric reflecting a belief that NPIs are more likely to reduce Rt than increase itHere we additionally test a symmetric Normal prior and a one-sided Half-Normal priorwhich does not allow for NPIs to increase Rt

Appendix C3 Sensitivity to data preprocessing

When the case count is small a large fraction of cases may be imported from other coun-tries and the testing regime may change rapidly To prevent this from biasing our modelwe neglect case numbers before a country has reached 100 confirmed cases Similarly weneglect death numbers before a country has reached 10 deaths Here we vary these thresh-olds Intuitively the minimum cases threshold should be higher than the minimum deathsthreshold

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

R0 prior= 333

[R] = 238[R] = 278

Default[R] = 328[R] = 378[R] = 428

-25 0 25 50 75Average reduction in Rtin the context of our data

NPI Effectiveness Prior= 137

i Normal(0 022)i Half Normal(022)

Default

Figure C16 Sensitivity to additional epidemiological priors Left Sensitivity to the prior mean on R0Right Sensitivity to the NPI effectiveness prior

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Minimum Cases Threshold= 137

3050Default (100)200300

-25 0 25 50 75Average reduction in Rtin the context of our data

Minimum Deaths Threshold= 102

15Default (10)3050

Figure C17 Data preprocessing sensitivity Left Variations in the minimum number of cases RightVariations in the minimum number of deaths

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Validation by predicting unseen data - all countries

Here we show the country-level plots of the single-country holdout experiments from Ap-pendix B1 We use leave-one-out cross-validation fitting the model on 40 countries andshowing its predictions on the excluded country We repeat this process for all 41 countriesIn the excluded country the first 14 days (filled dots) of death and case data are observedto allow roughly inferring R0 These days are not used to infer NPI effectiveness All laterdays are masked (unfilled dots) The activation dates of NPIs are also given and the modeluses the effectiveness estimates inferred from the 40 other countries

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

北便

France

Figure C18 Predictions on excluded countries

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

北 便

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

Malta

Figure C19 Predictions on excluded countries

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

Slovenia

Figure C20 Predictions on excluded countries

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Switzerland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

United Kingdom

Figure C21 Predictions on excluded countries Vertical lines show the activation (or inactivation) datesof NPIs Shaded areas are 95 credible intervals Blue and red dots show the observed confirmed casesand deaths while blue and red lines show the median model estimates of cases (Ct ) and deaths (Dt )Empty dots are not observed by the model For each country we show the full window of analysis (fromthe start of the epidemic until the first NPI was lifted or the 30th of May 2020 whichever was earliersee Methods)

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Posterior predictive distributions

The posterior predictive distribution (Figure C22) shows the predicted number of cases anddeaths after observing the data Although these curves can be called lsquofitsrsquo the degree of fitto the data must be interpreted with great care The fit is generally tight but this is partlydue to the inferred latent noise variables ε(C )

t and ε(D)t Inferring this latent noise allows

the posterior predictive distribution to closely match the data without overfitting the effec-tiveness parameters to the data Such behavior is common in Bayesian models which oftenperfectly interpolate the data without overfitting33 The noise terms can account for peri-ods where infections grew faster or slower than predicted based solely on the active NPIsIn such periods the noise may account for changes in testing reporting and unobservedinterventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Daily DeathsDaily Confirmed Cases

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C22 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 Calibration without countries used for hyperparameter selec-tion

0 20 40 60 80 100Credible Interval

0

20

40

60

80

100E

mpi

rical

Cum

ulat

ive

Freq

uenc

y

Calibration Country Holdouts

Figure C23 Calibration in held-out countries with leave-one-out cross-validation Here we include onlythose 35 countries that were not used to select the hyperparameter We show the first 14 days of casesand deaths in each country and then extrapolate to future days within the period of analysis The plotshows the percentage of observed daily case and death counts that lie within the X credible intervalacross all countries and days

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 MCMC stability results

1000 1002 10040

1000

2000

3000

4000

R

0 1 2 30

500

1000

1500

2000

Relative ESS

Figure C24 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size A value of 1 indicates perfect decorrelation between samplesValues above (below) 1 indicate that the effective number of samples is higher (lower) than the actualnumber of samples due to negative (positive) correlation respectively

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D5 Estimated values for R0 by country The parenthesis give the 95 credible interval which isoften wide The mean R0 across countries is 33

Country Estimated R0 Country Estimated R0

Albania 348 (275431) Lithuania 323 (248403)Andorra 275 (21434) Malaysia 296 (236361)Austria 31 (25377) Malta 327 (248414)Belgium 36 (302427) Mexico 399 (32748)Bosnia and Herzegovina 331 (259406) Morocco 359 (299427)Bulgaria 375 (304457) Netherlands 316 (26375)Croatia 342 (27418) New Zealand 241 (17731)Czech Republic 346 (276425) Norway 273 (215338)Denmark 28 (22347) Poland 397 (32482)Estonia 291 (229361) Portugal 355 (292425)Finland 279 (221343) Romania 401 (335475)France 331 (28389) Serbia 38 (308461)Georgia 356 (281439) Singapore 295 (238356)Germany 285 (231346) Slovakia 353 (271445)Greece 321 (261388) Slovenia 299 (23373)Hungary 403 (332482) South Africa 447 (377527)Iceland 171 (113242) Spain 36 (306423)Ireland 368 (309433) Sweden 229 (176288)Israel 379 (309459) Switzerland 289 (234351)Italy 336 (288391) United Kingdom 32 (267378)Latvia 301 (233376)

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 Collinearity

Appendix D21 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the inclusion or exclusion ofthese countries in the dataset (Figure D25) If we included all countries we would con-clude that university closures were more effective than school closures (black markers inFigure D25) However if we excluded Iceland or Sweden we would conclude that theywere roughly equally effective As there is no strong justification for including or excludingone particular country we cannot meaningfully disentangle the effects of school and uni-versity closures However there is much more data to determine the joint effect and it isindeed much more stable (Figure D25)

-25 0 25 50 75 100Average reduction in Rtin the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D25 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown

Appendix D22 Co-occurrence of NPIs

Table D6 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or neither) Note thatwe do not show the traditional collinearity statistics (variance inflation factors and datacorrelations) since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D6 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D23 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations24 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus042 between closing schools anduniversities and closing some businesses (Figure D26) The weak correlations are oneindicator that collinearity is manageable with our dataset

Effect on NPI combinations To better understand posterior correlations we visualizetheir effect in hosted video files As we condition on different values for one NPI we cansee that the estimates of other NPIs change only slightly always staying well within thecredible intervals in Figure 3 The significance of posterior correlations is small enough thatit is possible to calculate a reasonable approximation to the mean effect of a set of NPIs bysimply combining the mean percentage reductions for each individual NPI (eg two 50reductions lead to a 75 reduction) For example this approximation leads to a joint effectof 75 for all NPIs together which closely matches the exact mean joint effect of 77

Videos are available online here

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D26 Posterior correlations between effectiveness parameters αi

Appendix D3 Posterior Epidemiological Parameter Distributions

Figure D27 shows the posterior distributions for variables describing the key delay distribu-tions in our model the generation interval the delay between infection and case confirma-tion and the delay between infection and death We find that the posterior distributions ofthese parameters are somewhat tighter than their priors but they still explore a wide rangeof values This suggests that the data provides evidence albeit weak about these delaydistributions (for example from visual inspection the data would show that the delay islonger for deaths than for cases) An exception is the dispersion of the infection to deathdelay distribution which has a very wide prior

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

5 6

0

1

dens

ity

Generation Interval Mean

posteriorprior

200 225

00

05

Infection to Death Delay Mean

100 125

00

05

Infection to Case-Confirmation Delay Mean

0 2 4

value

00

05

dens

ity

Generation Interval std

10 20

value

00

01

Infection to Death Delay disp

5 6

value

0

1

Infection to Case-Confirmation Delay disp

Figure D27 Posterior distributions over key epidemiological parameters describing the generation in-terval and the delays between infection and case confirmationdeath

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect morethan three-quarters of the population) We thus miss NPIs which were only implementedregionally For example a few regions in Germany implemented stay-at-home orders butmost did not Thus Germany is listed as no stay-at-home order in our data Additionallyour NPI definitions were not perfectly granular For example a gathering ban on gatheringsof gt15 people and a ban on gatherings of gt60 people would both fall under the NPIGatherings limited to 100 people or less despite likely having different effects on Rt Finally while we included more NPIs than previous work (Table F7) there are many NPIsfor which we were not able to collect enough high-quality data for our modeling such aspublic cleaning or changes to public transportation

Of the 41 countries in our dataset 33 are in Europe As a result the NPI effectivenessestimates may be biased towards effects in Europe and NPI effectiveness may have beendifferent in other parts of the world

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on Rt is constantacross countries and time However the exact implementation and adherence of each NPIis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa limited degree Additionally different countries have different cultural norms and ageprofiles affecting the degree to which a particular intervention is effective For example acountry where a higher proportion of the population is in education will likely experience alarger effect from a government order to close schools and universities Our estimates thusshould be adjusted to local circumstances To address differences between countries ourstructural sensitivity analysis includes a model where each NPI can have a different effectper country (Appendix B2) The average effectiveness estimates across countries in thismodel match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponential de-cline would have the same effect on observations as an unobserved NPI Consequently wecannot quantify its effect on our results (though the sensitivity analyses look reassuring)

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expectthat closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g t and the reproductive number Rt holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso report results from a renewal process model8 that does not depend on this assumptionand we find similar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix F Overview of previous work

Table F7 Data-driven multi-country multi-NPI studies of the effectiveness of observed (as opposed tohypothetical) NPIs in reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Banholzer etal 202023

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Chen andQiu 202021

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Islam et al202020

School closuresworkplace closuresrestrictions on massgatherings publictransport closure

lockdown

149 countries or regionsInterrupted timeseries regression

Liu et al202022

13 NPIs from theOXCGRT dataset

130 countries and regions panel regression

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F8 Some data-driven studies of the effectiveness of observed NPIs in reducing the transmissionof COVID-19 which study a single-country andor a single-NPI

Study NPIs studiedRegionscountries

studiedMethod

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dandekarand

Barbastathis202035

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Dehning etal 202036

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Gatto et al202037

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Hsiang et al202019

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US (each

country modelledindividually)

Linear regression onestimated growth

rates

Jarvis et al202038

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelKraemer etal 202039

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202040 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) model

Lai et al202041

Case detection andisolation travel

restrictions contactreductions

ChinaTravel-network-based

SEIR model

Continued on next page

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F8 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Lorch et al202042

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Maier andBrockmann

202043

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Orea andAacutelvarez202044

Lockdown SpainSpatial econometric

analysis

Quilty et al202045

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

Sears et al202046

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Siedner etal 202047

General socialdistancing

USInterruptedtime-series

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F7) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

61

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

62

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all ldquotrade activitiesrdquo (except within shopping malls) service jobsthat donrsquot involve physical contact museums libraries and galleries opened Sincethe criteria regarding ldquoall retail stores being openrdquo was met we counted April 27 asthe end date for ldquoMost nonessential businesses closedrdquo

63

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H All model equations

This section will mainly be of interest to readers that wish to re-implement the model

Variables are indexed by NPI i country c and day t All prior distributions are independent

Data

1 NPI Activations φi t c isin 012 Observed (Daily) Cases Ct c 3 Observed (Daily) Deaths D t c

Prior Distributions

1 Country-specific R0 R0c sim Normal(325κ) κsim Half Normal(micro= 0σ= 05)2 NPI effectiveness αi sim Asymmetric Laplace(m = 0κ= 05λ= 10) m is the location

parameter κgt 0 is the asymmetry parameter and λgt 0 is the scale parameter3 Infection Initial Counts

N (C )0c = exp(ζ(C )

c )

N (D)0c = exp(ζ(D)

c )

ζ(C )c sim Normal(micro= 0σ= 50)

ζ(D)c sim Normal(micro= 0σ= 50)

4 Observation Noise Dispersion Parameters

Ψcases sim Half Normal(micro= 0σ= 5) (H1)

Ψdeaths sim Half Normal(micro= 0σ= 5) (H2)

Hyperparameters

1 Growth Noise Scale σg = 02

Delay Distributions

1 Generation interval distribution1314

microGI sim Normal(micro= 506σ= 03265)

σGI sim Normal(micro= 211σ= 05)

64

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

2 Time from infection to case confirmation T (C )131516f

microinfrarrconf sim Normal(micro= 1092σ= 094)

Ψinfrarrconf sim Normal(micro= 541σ= 027)

This distribution is converted into a forward-delay vector

T (C )[t ] =

1ZC

Negative Binomial(t micro=microinfrarrconfα=Ψinfrarrconf) t lt 32

0 otherwise

with ZC =31sum

t prime=0Negative Binomial(t primemicro=microinfrarrconfα=Ψinfrarrconf)

ie the delay follows a truncated and normalised negative binomial distribution3 Time from infection to death T (D)f131517

microinfrarrdeath sim Normal(micro= 2182σ= 101)

Ψinfrarrdeath sim Normal(micro= 1426σ= 518)

This distribution is converted into a forward-delay vector

T (D)[t ] =

1ZD

Negative Binomial(t micro=microinfrarrdeathα=Ψinfrarrdeath) t lt 48

0 otherwise

with ZD =47sum

t prime=0Negative Binomial(t primemicro=microinfrarrdeathα=Ψinfrarrdeath)

ie the delay follows a truncated and normalised negative binomial distribution

Infection Model

Rt c = R0c middotexp

(minus

Isumi=1

αi φi t c

) where I is the number of NPIs

αGI =microGI

σ2GI

βGI =micro2

GI

σ2GI

g t c = exp

(βGI(R

1αGIct minus1)

)minus1

fα in the definition of the Negative Binomial distribution is the dispersion parameter Larger values of αcorrespond to a smaller variance and less dispersion With our parameterisation the variance of the Negative

Binomial distribution is micro+ micro2

α

65

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

]

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise

ε(C )τc sim Normal(micro= 0σ=σg )

ε(D)τc sim Normal(micro= 0σ=σg )

Observation Modelf

Ct c =31sumτ=0

N (C )tminusτcT

(C )[τ]

D t c =47sumτ=0

N (D)tminusτcT

(D)[τ]

Ct c sim Negative Binomial(micro= Ct c α=Ψcases)

D t c sim Negative Binomial(micro= D t c α=Ψdeaths)

66

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix I References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Sam Abbott et al ldquoEstimating the time-varying reproduction number of SARS-CoV-2using national and subnational case countsrdquo In Wellcome Open Research 5112 (2020)p 112

3 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

4 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

5 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

6 Sheikh Taslim Ali et al ldquoSerial interval of SARS-CoV-2 was shortened over time bynonpharmaceutical interventionsrdquo In Science 3696507 (2020) pp 1106ndash1109

7 Xingjie Hao et al ldquoReconstruction of the full transmission dynamics of COVID-19 inWuhanrdquo In Nature 5847821 (Aug 2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the robustness of effectiveness estimation of nonpharma-ceutical interventions against COVID-19 transmissionrdquo In arXiv preprint arXiv200713454(2020) URL httpsarxivorgabs200713454

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

12 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

13 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

14 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

67

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

15 Stephen A Lauer et al ldquoThe incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases estimation and applicationrdquo In Annals ofinternal medicine 1729 (2020) pp 577ndash582

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

19 Solomon Hsiang et al ldquoThe effect of large-scale anti-contagion policies on the COVID-19 pandemicrdquo In Nature 5847820 (2020) pp 262ndash267

20 Nazrul Islam et al ldquoPhysical distancing interventions and incidence of coronavirusdisease 2019 natural experiment in 149 countriesrdquo In bmj 370 (2020)

21 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo In Covid Economics 7 (Apr 2020) pp 46ndash67

22 Yang Liu et al ldquoThe impact of non-pharmaceutical interventions on SARS-CoV-2 trans-mission across 130 countries and territoriesrdquo In medRxiv (2020)

23 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

24 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

25 Nick Golding et al ldquoReconstructing the global dynamics of under-ascertained COVID-19 cases and infectionsrdquo In medRxiv (2020) DOI 1011012020070720148460eprint httpswwwmedrxivorgcontentearly202007082020070720148460fullpdf URL httpswwwmedrxivorgcontentearly202007082020070720148460

26 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

27 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

28 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

68

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

29 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

30 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

31 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

32 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

33 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

36 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

37 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

38 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

39 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

40 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

41 Shengjie Lai et al ldquoEffect of non-pharmaceutical interventions to contain COVID-19 inChinardquo en In Nature 5857825 (Sept 2020) pp 410ndash413 DOI 101038s41586-020-2293-x

69

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

42 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

43 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

44 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

45 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

46 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

47 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

70

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Testing reporting and infection fatality rates
      • Method for uncertainty over key epidemiological parameters
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Multivariate Sensitivity Analysis - Expanded Results
              • Sensitivity to additional epidemiological assumptions
              • Sensitivity to data preprocessing
              • Validation by predicting unseen data - all countries
              • Posterior predictive distributions
              • Calibration without countries used for hyperparameter selection
              • MCMC stability results
                • Additional results
                  • Estimated R0 by country
                  • Collinearity
                    • The individual effects of school and university closures
                    • Co-occurrence of NPIs
                    • Correlations between effectiveness estimates
                      • Posterior Epidemiological Parameter Distributions
                        • Additional discussion of assumptions and limitations
                          • Limitations of the data
                          • Model limitations
                            • Overview of previous work
                            • Handling edge cases in the data collection
                            • All model equations
                            • References
Page 3: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan

Introduction

Worldwide governments have mobilised vast resources to fight the COVID-19 pandemic Awide range of nonpharmaceutical interventions (NPIs) has been deployed including dras-tic measures like stay-at-home orders and the closure of all nonessential businesses Recentanalyses show that these large-scale NPIs were jointly effective at reducing the virusrsquo ef-fective reproduction number1 but it is still largely unknown how effective individual NPIswere As time progresses and more data become available we can move beyond estimatingthe combined effect of a bundle of NPIs and begin to understand the effects of individualinterventions This can help governments efficiently control the epidemic by focusing onthe most effective NPIs to ease the burden put on the population

A promising way to estimate NPI effectiveness is data-driven cross-country modelling in-ferring effectiveness by relating the NPIs implemented in different countries to the courseof the epidemic in these countries To disentangle the effects of individual NPIs we need toleverage data from multiple countries with diverse sets of interventions in place Previousdata-driven studies (Table F8) estimate effectiveness for individual countries2ndash4 or NPIsalthough some exceptions exist15ndash8 (summarised in Table F7) In contrast we evaluatethe impact of 8 NPIs on the epidemicrsquos growth in 34 European and 7 non-European coun-tries To isolate the effect of individual NPIs we also require sufficiently diverse data Ifall countries implemented the same set of NPIs on the same day the individual effect ofeach NPI would be unidentifiable However the COVID-19 response was far less coordi-nated countries implemented different sets of NPIs at different times in different orders(Figure 1)

Even with diverse data from many countries estimating NPI effects remains a challengingtask First models are based on epidemiological parameters that are only known with un-certainty our NPI effectiveness study to the best of our knowledge is the first to includesome of this uncertainty in the model Second the data are retrospective and observationalmeaning that unobserved factors could confound the results Third NPI effectiveness es-timates can be highly sensitive to arbitrary modelling decisions as demonstrated by tworecent replication studies910 Fourth large-scale public NPI datasets suffer from frequentinconsistencies11 and missing data12 For these reasons the data and the model must becarefully validated and insufficiently validated results should not be used to guide pol-icy decisions We collect the largest public dataset on NPI implementation dates that wasvalidated by independent double entry and perform to our knowledge the most extensivevalidation of any COVID-19 NPI effectiveness estimates to datemdasha crucial but largely absentor incomplete element of NPI effectiveness studies10

Even with extensive validation we need to be careful when interpreting this studyrsquos resultsWe only study the impact NPIs had between January and the end of May 2020 and NPIeffectiveness may change over time as circumstances change In particular lifting an NPIdoes not imply that transmission will return to its original level These and other limitationsare detailed in the Discussion

3

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask wearing Gatherings lt1000 Gatherings lt 100 Gatherings lt 10 Some businessesclosed

Universities closedSchools closed Stay-at-home orderMost businessesclosed

Figure 1 Timing of NPI implementations in early 2020 Crossed-out symbols signify when an NPI waslifted Detailed definitions of the NPIs are given in Table 1

4

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Methods

Dataset

We analyse the effects of NPIs (Table 1) in 41 countriesa (see Figure 1) We recorded NPIimplementations when the measures were implemented nationally or in most regions of acountry (affecting at least three fourths of the population) For each country the windowof analysis starts on the 22nd of January and ends after the first NPI was lifted or on the30th of May 2020 whichever was earlier The reason to end the analysis after the firstmajor reopeningb was to avoid a distribution shift For example when schools reopenedit was often with safety measures such as smaller class sizes and distancing rules It istherefore expected that contact patterns in schools will have been different before schoolclosure compared to after reopening Modelling this difference explicitly is left for futurework Data on confirmed COVID-19 cases and deaths were taken from the Johns HopkinsCSSE COVID-19 Dataset13 The data used in this study including sources are availableonline here

aThe countries were selected for the availability of reliable NPI data at the time when we started data collectionand modelling (April 2020) and for their presence in at least one of the public datasets that we used tocross-validate our collected data We excluded countries with fewer than 100 cases (or 10 deaths) by March31 as our model neglects new cases and deaths below these thresholds We also excluded a small numberof countries if there were credible media reports casting doubt on the trustworthiness of their reporting ofcases and deaths Finally we excluded very large countries like China the US and Canada for ease of datacollection as these would require more locally fine-grained data 33 of the 41 included countries are in EuropeAs a result the NPI effectiveness estimates may be biased towards effects in Europe and NPI effectiveness mayhave been different in other parts of the world

bConcretely the window of analysis extended until three days after the first reopening for confirmed cases and13 days after the first reopening for deaths These values correspond to the 5 quantile of the infection-to-confirmationdeath distributions ensuring that less than 5 of the new infections on the reopening day werestill observed in the window of analysis

5

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table 1 NPIs included in the study Appendix G details how edge cases in the data collection werehandled

NPI DescriptionMask-wearingmandatory in(some) publicspaces

A country has mandated mask usage in the public sometimes limitedto just some public spaces (which the government deems to have a highrisk of infection) For example some countries mandated mask-wearingin most or all indoor public spaces but not outdoors

Gatheringslimited to 1000people or less

A country has set a size limit on gatherings The limit is at most 1000people (often less) and gatherings above the maximum size are disal-lowed For example a ban on gatherings of 500 people or more wouldbe classified as ldquogatherings limited to 1000 or lessrdquo but a ban on gath-erings of 2000 people or more would not

Gatheringslimited to 100people or less

A country has set a size limit on gatherings The limit is at most 100people (often less)

Gatheringslimited to 10people or less

A country has set a size limit on gatherings The limit is at most 10people (often less)

Some businessesclosed

A country has specified a few kinds of customer-facing businesses thatare considered ldquohigh riskrdquo and need to suspend operations (black-list) Common examples are restaurants bars nightclubs cinemasand gyms By default businesses are not suspended

Most nonessentialbusinesses closed

A country has suspended the operations of many customer-facing busi-nesses By default customer-facing businesses are suspended unlessthey are designated as essential (whitelist)

Schools closed A country has closed most or all schoolsUniversitiesclosed

A country has closed most or all universities and higher education fa-cilities

Stay-at-homeorder (withexemptions)

An order for the general public to stay at home has been issued This ismandatory not just a recommendation Exemptions are usually grantedfor certain purposes (such as shopping exercise or going to work) ormore rarely for certain times of the day In practice a stay-at-home or-der was often accompanied or preceded by other NPIs such as businessclosures We account for these other NPIs separately eg when a coun-try issued a law that both closed businesses and ordered the populationto stay at home this is encoded as both a business closure and a stay-at-home order (with exemptions) in the data In our results we thusestimate the additional effect of the order to generally stay at home

6

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Data collection

We collected data on the start and end date of NPI implementations from the start of thepandemic until the 30th of May 2020 Before collecting the data we experimented withseveral public NPI datasets finding that they were not complete enough for our modellingand contained incorrect datesc By focusing on a smaller set of countries and NPIs than thesedatasets we were able to enforce strong quality controls We used independent doubleentry and manually compared our data to public datasets for cross-checking

First two authors independently researched each country and entered the NPI data into sep-arate spreadsheets The researchers manually researched the dates using internet searchesthere was no automatic component in the data gathering process The average time spentresearching each country per researcher was 15 hours

Second the researchers independently compared their entries to the following public datasetsand if there were conflicts visited all primary sources to resolve the conflict the EFGNPIdatabase14 the Oxford COVID-19 Government Response Tracker15 and the mask4all dataset16

Third each country and NPI was again independently entered by one to three paid con-tractors who were provided with a detailed description of the NPIs and asked to includeprimary sources with their data A researcher then resolved any conflicts between this dataand one (but not both) of the spreadsheets

Finally the two independent spreadsheets were combined and all conflicts resolved by aresearcher The final dataset contains primary sources (government websites andor mediaarticles) for each entry

Data Preprocessing

When the case count is small a large fraction of cases may be imported from other countriesand the testing regime may change rapidly To prevent this from biasing our model weneglect case numbers before a country has reached 100 confirmed cases and death numbers

cWe evaluated the following datasets

bull Epidemic Forecasting Global NPI Database14

bull Oxford COVID-19 Government Response Tracker (OxCGRT)15

bull ACAPS COVID19 Government Measures Dataset

Note that these datasets are under continuous development Many of the mistakes found will already havebeen corrected We know from our own experience that data collection can be very challenging We havethe fullest respect for the people behind these datasets In this paper we focus on a more limited set ofcountries and NPIs than these datasets contain allowing us to ensure higher data quality in this subset Givenour experience with public datasets and our data collection we encourage fellow COVID-19 researchers toindependently verify the quality of public data they use if feasible

7

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

before a country has reached 10 deaths We include these thresholds in our sensitivityanalysis (Appendix C3)

Short model description

For each country c

For each day t

New infections N (sdot)tc

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Mask wearing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closedSchools closed

Stay-at-home order

Growth reductions from interventions αi i

Daily growth rate gtc

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Uncertain delay from infection to

confirmation death

Uncertain generation interval

Universities closed

Figure 2 Model Overview Purple nodes are observed We describe the diagram from bottom to topThe effectiveness of NPI i is represented by αi which is independent of the country On each day t a countryrsquos daily reproduction number Rt c depends on the countryrsquos basic reproduction number R0cand the active NPIs The active NPIs are encoded by Φi t c which is 1 if NPI i is active in country c attime t and 0 otherwise Rt c is transformed into the daily growth rate gt c using the generation intervalparameters and subsequently is used to compute the new infections N (C )

t c and N (D)t c that will turn into

confirmed cases and deaths respectively Finally the number of new confirmed cases Ct c and deathsDt c is computed by a discrete convolution of N (middot)

t c with the respective delay distributions Our modeluses both death and case data it splits all nodes above the daily growth rate gt c into separate branchesfor deaths and confirmed cases We account for uncertainty in the generation interval infection-to-case-confirmation delay and the infection-to-death delay by placing priors over the parameters of thesedistributions

In this section we give a short summary of the model (Figure 2) The detailed modeldescription is given in Appendix A Code is available online here

Our model uses case and death data from each country to lsquobackwardsrsquo infer the number ofnew infections at each point in time which is itself used to infer the reproduction numbersNPI effects are then estimated by relating the daily reproduction numbers to the activeNPIs across all days and countries This relatively simple data-driven approach allows usto sidestep assumptions about contact patterns and intensity infectiousness of different age

8

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

groups and so forth that are typically required in modelling studies We make several addi-tions to the semi-mechanistic Bayesian hierarchical model of Flaxman et al1 allowing ourmodel to observe both cases and death data This increases the amount of data from whichwe can extract NPI effects reduces distinct biases in case and death reporting and reducesthe bias from including only countries with many deaths Since epidemiological parametersare only known with uncertainty we place priors over them following recent recommendedpractice17 Additionally as we do not aim to infer the total number of COVID-19 infectionswe can avoid assuming a specific infection fatality rate (IFR) or ascertainment rate (rate oftesting)

The growth of the epidemic is determined by the time- and country-specific reproductionnumber Rt c which depends on a) the (unobserved) basic reproduction number R0c givenno active NPIs and b) the active NPIs at time t R0c accounts for all time-invariant factorsthat affect transmission in country c such as differences in demographics population den-sity culture and health systems18 We assume that the effect of each NPI on Rt c is stableacross countries and time The effectiveness of NPI i is represented by a parameter αi over which we place an Asymmetric Laplace prior that allows for both positive and negativeeffects but places 80 of its mass on positive effects reflecting that NPIs are more likely toreduce Rt c than to increase it Following Flaxman et al and others168 each NPIrsquos effecton Rt c is assumed to independently affect Rt c as a multiplicative factor

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (1)

where φi ct = 1 indicates that NPI i is active in country c on day t (φi ct = 0 otherwise) andI is the number of NPIs The multiplicative effect encodes the plausible assumption thatNPIs have a smaller absolute effect when Rt c is already low We discuss the meaning ofeffectiveness estimates given NPI interactions in the Results section

In the early phase of an epidemic the number of new daily infections grows exponentiallyDuring exponential growth there is a one-to-one correspondence between the daily growthrate and Rt c

19 The correspondence depends on the generation interval (the time betweensuccessive infections in a chain of transmission) which we assume to have a Gamma dis-tribution The prior on the mean generation interval has mean 506 days derived from ameta-analysis20

We model the daily new infection count separately for confirmed cases and deaths repre-senting those infections which are subsequently reported and those which are subsequentlyfatal However both infection numbers are assumed to grow at the same daily rate in ex-pectation allowing the use of both data sources to estimate each αi The infection numberstranslate into reported confirmed cases and deaths after a stochastic delay The delay is thesum of two independent distributions assumed to be equal across countries the incuba-tion period and the delay from onset of symptoms to confirmation We put priors over themeans of both distributions resulting in a prior over the mean infection-to-confirmationdelay with a mean of 1092 days20 see Appendix A3 Similarly the infection-to-death

9

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

delay is the sum of the incubation period and the delay from onset of symptoms to deathand the prior over its mean has a mean of 218 days20 Finally as in related NPI models16both the reported cases and deaths follow a negative binomial noise distribution with aninferred noise dispersion

Using a Markov chain Monte Carlo (MCMC) sampling algorithm21 this model infers pos-terior distributions of each NPIrsquos effectiveness while accounting for cross-country variationsin testing reporting and fatality rates as well as uncertainty in the generation interval anddelay distributions To analyse the extent to which modelling assumptions affect the re-sults our sensitivity analysis includes all epidemiological parameters prior distributionsand many of the structural assumptions introduced above (Appendix B2 and AppendixC) MCMC convergence statistics are given in Appendix C7

Results

NPI Effectiveness

Our model enables us to estimate the individual effectiveness of each NPI expressed as apercentage reduction in R As in related work168 this percentage reduction is modelledas constant over countries and time and independent of the other implemented NPIs Inpractice however NPI effectiveness may depend on other implemented NPIs and localcircumstances Thus our effectiveness estimates ought to be interpreted as the effectivenessaveraged over the contexts in which the NPI was implemented in our data10 Our results thusgive the average NPI effectiveness across typical situations that the NPIs were implementedin Figure 3 (bottom left) visualises which NPIs typically co-occurred aiding interpretation

Under the default model settings the mean percentage reduction in R (with 95 credibleinterval) associated with each NPI is as follows (Figure 3) mandating mask-wearing in(some) public spaces -1 (-13ndash8) limiting gatherings to 1000 people or less 13(-3ndash31) to 100 people or less 28 (9ndash44) to 10 people or less 36 (17ndash53) closing some high-risk businesses 20 (0ndash40) closing most nonessential busi-nesses 29 (8ndash47) closing schools and universities 41 (23ndash56) and issuingstay-at-home orders (with exemptions) 10 (-2ndash22) Note that we cannot robustlydisentangle the individual effects of closing schools and closing universities since the im-plementation dates of these NPIs coincided nearly perfectly in all countries except Icelandand Sweden (Appendix D21) We thus show the joint effect of closing both schools anduniversities and treat schools and universities closed as one single NPI going forward

Some NPIs frequently co-occurred ie were partly collinear However we are able to iso-late the effects of individual NPIs since the collinearity is imperfect and our dataset is largeFor every pair of NPIs we observe one of them without the other for 748 country-days onaverage (Appendix D22) The minimum number of country-days for any NPI pair is 143(for limiting gatherings to 1000 or 100 attendees) Additionally under excessive collinear-ity and insufficient data to overcome it individual effectiveness estimates are highly sensi-

10

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75 100Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spaces

Gatherings limited to1000 people or less

Gatherings limited to100 people or less

Gatherings limited to10 people or less

Some businessesclosed

Most nonessentialbusinesses closed

Schools and universitiesclosed

Stay-at-home order(with exemptions)

EndsTransmission

北 便

i

j

100

100

100 81

10

0 64

100

100 45

27

100 94

78

89

65

88

93

44

29

100

100 82

92

68

91

96

46

28

100

100

100 99

76

97

98

55

30

99

97

86

100 72

95

98

49

27

99

98

91

100

100 99

99

64

30

97

95

84

95

72

100 99

49

29

98

95

81

92

68

93

100 46

28

100

100 98

10

0 94

99

99

100

Frequency i Active Given j Active

0 500 1000 1500 2000 2500 3000

Days

Total Days Active

Figure 3 Top NPI effects under default model settings The Figure shows the average percentagereductions in R as observed in our data (or in terms of the model the posterior marginal distributionsof 1minusexp(minusαi )) with median 50 and 95 credible intervals A negative 1 reduction refers to a 1increase in R Cumulative effects are shown for hierarchical NPIs (gathering bans and business closures)ie the result for Most nonessential businesses closed shows the cumulative effect of two NPIs withseparate parameters and symbols - closing some (high-risk) businesses and additionally closing mostremaining (non-high-risk but nonessential) businesses given that some businesses are already closedBottom Left Conditional activation matrix Cell values indicate the frequency that NPI i (x-axis) wasactive given that NPI j (y-axis) was active Eg schools were always closed whenever a stay-at-homeorder was active (bottom row third column from the right) but not vice versa Bottom Right Totalnumber of days each NPI was active across all countries

11

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Most businessesclosed

Stay-at-homeorder

Mask wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businessesclosed

Schools anduniversities closed

Figure 4 Combined NPI effectiveness for the most common sets of NPIs in our data by size of the NPI setShaded regions denote 50 and 95 credible intervals Left Maximum R0 that could be reduced to be-low 1 for each set of NPIs Right Predicted R after implementation of each set of NPIs assuming R0 = 38Readers can interactively explore the effects of all sets of NPIs at httpepidemicforecastingorgcalc

tive to variations in the data and model parameters22 High sensitivity prevented Flaxmanet al1 who had a smaller dataset from disentangling NPI effects9 Our estimates are sub-stantially less sensitive (see next section) Finally the posterior correlations between theeffectiveness estimates are weak suggesting manageable collinearity (Appendix D23)

Although the correlations between the individual estimates are weak we should take theminto account when evaluating combined NPI effects For example if two NPIs frequentlyco-occurred there may be more certainty about the combined effect than about the twoindividual effects Figure 4 shows the combined effectiveness of the sets of NPIs thatare most common in our data Together our set of NPIs reduced R by 77 (74ndash79)on average Across countries the mean R without any NPIs (ie the R0) was 33 (Ta-ble D5 reports R0 for all countries) Starting from this number the estimated R likely couldhave been reduced below 1 by closing schools and universities high-risk businesses andlimiting gathering sizes Readers can interactively explore the effects of sets of NPIs athttpepidemicforecastingorgcalc A CSV file containing the joint effectiveness of all NPIcombinations is available online here

Sensitivity and validation

We perform a range of validation and sensitivity experiments (Appendix B with furtherexperiments in Appendix C) First we analyse how the model extrapolates to unseen coun-tries and find that it makes calibrated forecasts over periods of up to 2 months with un-certainty increasing over time Further we perform multiple sensitivity analyses studyinghow results change if we modify the priors over epidemiological parameters exclude coun-tries from the dataset use only deaths or confirmed cases as observations vary the datapreprocessing and more Finally we investigate our key assumptions by showing results

12

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

for several alternative models (structural sensitivity10) and examine possible confoundingof our estimates by unobserved factors influencing R In total we consider 201 alternativeexperimental conditions

Figure 5 (left) shows the median NPI effectiveness across these 201 experimental condi-tions Compared to the results under our default settings (Figures 3 and 4) median NPIeffects vary under alternative plausible experimental conditions However the trends in theresults are robust and some NPIs outperform others under all tested conditions While wetest over large ranges of plausible values our experiments do not include every possiblesource of uncertainty and the results might change more substantially under experimentalconditions we have not tested

We categorise NPI effects into small moderate and large which we define as a medianreduction in R of less than 175 between 175 and 35 and more than 35 (verticallines in Figure 5) Six of the NPIs fall into a single category in a large fraction of experi-mental conditions school and university closures are associated with a large effect in 99of experimental conditions limiting gatherings to 10 people or less in 94 Closing mostnonessential businesses has a moderate effect in 96 of conditions limiting gatherings to100 people or less in 97 Making mask-wearing mandatory in (some) public spaces fallsinto the ldquosmall effectrdquo category in 100 of experimental conditions issuing stay-at-homeorders (with exemptions) in 99 Two NPIs fall less clearly into one category Closing some(high-risk) businesses has a moderate effect in 86 of conditions and limiting gatherings to1000 people or less has a small effect in 79 However both NPIs have small-to-moderateeffects in more than 99 of experimental conditions The effect of limiting gatherings to1000 people or less is the least stable across the sensitivity analyses which may reflect itsaforementioned partial collinearity with limiting gatherings to 100 people or less

Aggregating all sensitivity analyses can hide sensitivity to specific assumptions We displaythe median NPI effects in four categories of sensitivity analyses (Figure 5 right) and eachindividual sensitivity analysis is shown in the Appendix The trends in the results are alsostable within the categories

Discussion

We use a data-driven approach to estimate the effects that eight nonpharmaceutical in-terventions had on COVID-19 transmission in 41 countries between January and the endof May 2020 We find that several NPIs were associated with a clear reduction in R inline with the mounting evidence that NPIs can be effective at mitigating and suppressingoutbreaks of COVID-19 Furthermore our results suggest that some NPIs outperformedothers While the exact effectiveness estimates vary with modelling assumptions the broadconclusions discussed below are largely robust across 201 experimental conditions in 11sensitivity analyses

13

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

0 25 50Median reduction in Rt

Mask-wearing mandatory in(some) public spacesGatherings limited to1000 people or lessGatherings limited to100 people or lessGatherings limited to10 people or lessSome businessesclosedMost nonessentialbusinesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

All Sensitivity Analyses (201 conditions)北便

Model Structure(5 conditions)

北便

Data and Preprocessing(51 conditions)

0 25 50Median reduction in Rt

北便

Epidemiological Priors(131 conditions)

0 25 50Median reduction in Rt

北便

Unobserved Factors(14 conditions)

Figure 5 Median NPI effects across the sensitivity analyses Left Median NPI effects (reduction in R)when varying different components of the model or the data in 201 experimental conditions Results aredisplayed as violin plots using kernel density estimation to create the distributions Inside the violinsthe box plots show median and interquartile-range The vertical lines mark 0 175 and 35 (seetext) Right Categorised sensitivity analyses Structural Using only cases or only deaths as observations(2 experimental conditions Figure B12) varying the model structure (3 conditions Figure B13 left)Data Leaving out one country at a time (41 conditions Figure B10) varying the threshold below whichcases and deaths are masked (8 conditions Figure C17) sensitivity to correcting for undocumentedcases and to country-level differences in case ascertainment (2 conditions Figure B11) Epidemiologicalpriors Jointly varying the means of the priors over the means of the generation interval the infection-to-case-confirmation delay and the infection-to-death delay (125 conditions Figure C15) varying theprior over R0 (4 conditions Figure C16 left) varying the prior over NPI effectiveness (2 conditionsFigure C16 right) Unobserved factors Excluding observed NPIs one at a time (9 conditions Figure B14left) controlling for additional NPIs from a different dataset (5 conditions Figure B14 right)

14

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Business closures and gathering bans both seem to have been effective at reducing COVID-19 transmission Closing only high-risk businesses appears to have been only somewhat lesseffective than closing most nonessential businesses the median reduction in R differs onlyby 9 (6ndash12 mean and 95 interval of median estimates across the experiments set-tings in Figure 5) Closing only high-risk businesses may thus have been the more promisingpolicy option in some circumstances Limiting gatherings to 10 people or less was more ef-fective than limits up to 100 or 1000 people This may reflect the fact that small gatheringsare common

As previously discussed we estimate the average effect each NPI had in the contexts inwhich it was implemented When countries introduced stay-at-home orders they nearly al-ways also banned gatherings and closed schools universities and some or most businessesif they had not done so already (Figure 3 bottom left) Flaxman et al1 and Hsiang et al3

add the effect of these distinct NPIs to the effectiveness of stay-at-home orders and accord-ingly find a large effect In contrast we account for these other NPIs separately and isolatethe additional effect of ordering the population to stay at home (when large gatheringsare banned and educational institutions and some businesses closed)d In accordance withother studies that took this approach we find a small effect26 A typical country could havereduced R to below 1 without a stay-at-home order (Figure 4) provided other NPIs wereimplemented

Mandating mask-wearing in various public spaces had no clear effect on average in thecountries we studied This does not rule out mask-wearing mandates having a larger ef-fect in other contexts In our data mask-wearing was only mandated when other NPIshad already reduced public interactions When most transmission occurs in private spaceswearing masks in public is expected to be less effective This might explain why a largereffect was found in studies that included China and South Korea where mask-wearingwas introduced earlier823 While there is an emerging body of literature indicating thatmask-wearing can be effective in reducing transmission the bulk of evidence comes fromhealthcare settings24 In non-healthcare settings risk compensation25 may play a largerrole potentially reducing effectiveness While our results cast doubt on reports that mask-wearing is the main determinant shaping a countryrsquos epidemic23 the policy still seemspromising given all available evidence due to its comparatively low economic and socialcosts Its effectiveness may have increased as other NPIs have been lifted and public inter-actions have recommenced

We find a large effect for school and university closures This finding is remarkably robustacross different model structures variations in the data and epidemiological assumptions(Figure 5) It remains robust when controlling for NPIs excluded from our study (Fig-ure B14) Our approach cannot distinguish direct and indirect effects such as forcing par-

dIn principle a stay-at-home order does not imply these other NPIs It is possible for a country to simultaneouslyhave allowed schools to be open and have a stay-at-home order (with exemptions for attending school) inplace However this is not how stay-at-home orders were implemented by the countries in our dataset andwe cannot estimate the effect of such a hypothetical stay-at-home order from the available data

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ents to stay at home or causing broader behaviour changes by increasing public concernAdditionally since school and university closures almost perfectly coincided in the coun-tries we study our approach cannot distinguish their individual effects (Appendix D21)This limitation likely also holds for other observational studies which do not include dataon university closures and estimate only the effect of school closures1ndash35ndash8 Previous evi-dence on school and university closures is mixed1626 Early data suggest that children andyoung adults are equally susceptible to infection but have a notably lower observed inci-dence rate than older adultsmdashwhether this is due to school and university closures remainsunknown27ndash30 Although infected young people are often asymptomatic they appear toshed similar amounts of virus as older people3132 and might therefore transmit the infec-tion to higher-risk demographics unknowingly As the role of children in transmission isstill unclear33 while outbreaks detected in schools are rising33ndash35 this topic merits carefulattention

Our study has several limitations First NPI effectiveness may depend on the context ofimplementation such as the presence of other NPIs and country-specific factors Our esti-mates must be interpreted as the average effectiveness over the contexts in our dataset10and expert judgement is required to adjust them to local circumstances Second R mayhave been reduced by unobserved NPIs or spontaneous behaviour changes To investigatewhether these reductions could be falsely attributed to the observed NPIs we perform sev-eral additional analyses and find that our results are stable to a range of unobserved effects(Appendix B3) However this sensitivity check cannot provide certainty Investigating therole of unobserved effects is an important topic to explore further Third our results can-not be used without qualification to predict the effect of lifting NPIs For example closingschools and universities seems to have greatly reduced transmission but this does not meanthat reopening them will cause infections to soar Educational institutions can implementsafety measures such as reduced class sizes as they reopen Further work is needed to anal-yse the effects of reopenings we hope that our collected data aids this effort Fourth whilewe included more NPIs than most previous work (Table F7) several promising NPIs wereexcluded For example testing tracing and case isolation may be an important part of acost-effective epidemic response36 but were not included because it is difficult to obtaincomprehensive data We discuss further limitations in Appendix E

Although our work focuses on estimating the impact of NPIs on the reproduction numberR the ultimate goal of governments may be to reduce the incidence prevalence and excessmortality of COVID-19 Controlling R is essential but the contribution of NPIs towards thesegoals may also be mediated by other factors such as their duration and timing37 periodicityand adherence3839 and successful containment40 While each of these factors addressestransmission within individual countries it can be crucial to additionally synchronise NPIsbetween countries since cases can be imported41

In conclusion as governments around the world seek to keep R below 1 while minimisingthe social and economic costs of their interventions we hope that our results can informpolicy decisions on which NPIs to implement in any potential further wave of infectionsAdditionally our work may provide insight on which areas of public life are most in need

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

of restructuring so that they can continue despite the pandemic However our estimatesshould not be seen as the final word on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies simulation studiesexperimental trials and clinical experience

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

The paid contractor work helping with the data collection the development of the interac-tive website and the costs for cloud compute were funded by the Berkeley Existential RiskInitiative

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier V Mikulik MAHartwick L Chindelevitch designed the model and modelling experiments M Sharma ABStephenson T Gavenciak J Salvatier performed and analysed the modelling experimentsJ Kulveit T Gavenciak JM Brauner conceived the research S Mindermann M SharmaJM Brauner L Chindelevitch J Kulveit T Besiroglu did the literature search JM BraunerS Mindermann M Sharma G Leech L Chindelevitch T Besiroglu V Mikulik wrote themanuscript All authors read and gave feedback on the manuscript and approved the finalmanuscript JM Brauner S Mindermann and M Sharma contributed equally L Chindele-vitch Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

3 Solomon Hsiang et al ldquoThe effect of large-scale anti-contagion policies on the COVID-19 pandemicrdquo In Nature 5847820 (2020) pp 262ndash267

4 Shengjie Lai et al ldquoEffect of non-pharmaceutical interventions to contain COVID-19 inChinardquo en In Nature 5857825 (Sept 2020) pp 410ndash413 DOI 101038s41586-020-2293-x

5 Yang Liu et al ldquoThe impact of non-pharmaceutical interventions on SARS-CoV-2 trans-mission across 130 countries and territoriesrdquo In medRxiv (2020)

6 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

7 Nazrul Islam et al ldquoPhysical distancing interventions and incidence of coronavirusdisease 2019 natural experiment in 149 countriesrdquo In bmj 370 (2020)

8 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo In Covid Economics 7 (Apr 2020) pp 46ndash67

9 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

10 Mrinank Sharma et al ldquoOn the robustness of effectiveness estimation of nonpharma-ceutical interventions against COVID-19 transmissionrdquo In arXiv preprint arXiv200713454(2020) URL httpsarxivorgabs200713454

11 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

12 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

13 Ensheng Dong Hongru Du and Lauren Gardner ldquoAn interactive web-based dashboardto track COVID-19 in real timerdquo In The Lancet Infectious Diseases 205 (May 2020)pp 533ndash534 DOI 101016s1473-3099(20)30120-1

14 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

15 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

16 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

17 Sam Abbott et al ldquoEstimating the time-varying reproduction number of SARS-CoV-2using national and subnational case countsrdquo In Wellcome Open Research 5112 (2020)p 112

18 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

19 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

20 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Young Joon Park et al ldquoContact tracing during coronavirus disease outbreak SouthKorea 2020rdquo In Emerg Infect Dis 2610 (2020)

28 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

29 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

30 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

31 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

32 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

33 Chen Stein-Zamir et al ldquoA large COVID-19 outbreak in a high school 10 days afterschoolsrsquo reopening Israel May 2020rdquo In Eurosurveillance 2529 (2020) p 2001352

34 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 38 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile919676Weekly_COVID19_Surveillance_Report_week_38_FINAL_UPDATEDpdf 2020

35 Meira Levinson Muge Cevik and Marc Lipsitch Reopening Primary Schools during thePandemic 2020

36 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

37 K Prem et al ldquoThe effect of control strategies to reduce social mixing on outcomes ofthe COVID-19 epidemic in Wuhan China a modelling studyrdquo In Lancet Public Health5 (5 May 2020) e261ndashe270

38 Patrick G T Walker et al ldquoThe impact of COVID-19 and strategies for mitigationand suppression in low- and middle-income countriesrdquo In Science 3696502 (2020)pp 413ndash422

39 Nicholas G Davies et al ldquoEffects of non-pharmaceutical interventions on COVID-19cases deaths and demand for hospital services in the UK a modelling studyrdquo In TheLancet Public Health 57 (2020) e375ndashe385

40 Xingjie Hao et al ldquoReconstruction of the full transmission dynamics of COVID-19 inWuhanrdquo In Nature 5847821 (Aug 2020)

41 N W Ruktanonchai et al ldquoAssessing the impact of coordinated COVID-19 exit strate-gies across Europerdquo In Science (2020) DOI 101126scienceabc5096

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 23

Appendix A1 Detailed model description 23

Appendix A2 Testing reporting and infection fatality rates 26

Appendix A3 Method for uncertainty over key epidemiological parameters 27

Appendix B Validation 30

Appendix B1 Unseen data 30

Appendix B2 Sensitivity Analysis 30

Appendix B3 Robustness to unobserved effects 35

Appendix C Additional sensitivity analyses and validation 38

Appendix C1 Multivariate Sensitivity Analysis - Expanded Results 38

Appendix C2 Sensitivity to additional epidemiological assumptions 40

Appendix C3 Sensitivity to data preprocessing 40

Appendix C4 Validation by predicting unseen data - all countries 42

Appendix C5 Posterior predictive distributions 47

Appendix C6 Calibration without countries used for hyperparameter selection 48

Appendix C7 MCMC stability results 49

Appendix D Additional results 50

Appendix D1 Estimated R0 by country 50

Appendix D2 Collinearity 51

Appendix D3 Posterior Epidemiological Parameter Distributions 53

Appendix E Additional discussion of assumptions and limitations 55

Appendix E1 Limitations of the data 55

Appendix E2 Model limitations 55

Appendix F Overview of previous work 57

Appendix G Handling edge cases in the data collection 60

Appendix H All model equations 64

Appendix I References 67

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections N (sdot)tc

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Mask wearing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closedSchools closed

Stay-at-home order

Growth reductions from interventions αi i

Daily growth rate gtc

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Uncertain delay from infection to

confirmation death

Uncertain generation interval

Universities closed

Figure A6 Model Overview Purple nodes are observed We describe the diagram from bottom to topThe effectiveness of NPI i is represented by αi which is independent of the country On each day t a countryrsquos daily reproduction number Rt c depends on the countryrsquos basic reproduction number R0cand the active NPIs The active NPIs are encoded by Φi t c which is 1 if NPI i is active in country c attime t and 0 otherwise Rt c is transformed into the daily growth rate gt c using the generation intervalparameters and subsequently is used to compute the new infections N (C )

t c and N (D)t c that will turn into

confirmed cases and deaths respectively Finally the number of new confirmed cases Ct c and deathsDt c is computed by a discrete convolution of N (middot)

t c with the respective delay distributions Our modeluses both death and case data it splits all nodes above the daily growth rate gt c into separate branchesfor deaths and confirmed cases We account for uncertainty in the generation interval infection-to-case-confirmation delay and the infection-to-death delay by placing priors over the parameters of thesedistributions

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1but with several key extensions First we model both confirmed cases and deaths al-lowing us to leverage significantly more data Furthermore we do not assume a specificinfection fatality rate (IFR) since we do not aim to infer the total number of COVID-19 in-fections Additionally since epidemiological parameters are only known with uncertaintywe place priors over them following recent best practice2 Concretely we place priors onthe means and standard deviationsdispersions of the generation interval the infection-to-confirmation delay and the infection-to-death delay These prior choices are detailedin Appendix A3 The end of this section details further adaptations which allow us to

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

minimise assumptions about testing reporting and the IFR Code is available online hereFor readers who wish to re-implement the model a concise list of all equations is given inAppendix H

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior distribution over R0c reflecting the wide disagreement of regionalestimates of R0

3 The mean R0c is 328 based on a meta-analysis4 We parameterize theeffectiveness of NPI i assumed to be same across countries and time with αi Each NPI isassumed to have an independent multiplicative effect on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We place an Asymmetric Laplace prior over αi with scale parameter10 asymmetry parameter 05 and location parameter 0 Our prior allows for (unbounded)positive and negative effects as we cannot a priori exclude the possibility that the introduc-tion of an NPI increases R However our prior places 80 of its mass on positive effectsreflecting a belief that NPIs are much more likely to reduce Rt than not This is a shrinkageprior placing 50 of its mass on lsquosmallrsquo effectiveness (less than 10 change in Rt ) All priordistributions are independent

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c and Rt c

(Eq 29 in5)

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We account for uncertainty inthe generation interval (GI) assumed to be a Gamma distribution by placing prior distri-butions over its mean and standard deviation (Appendix A3) Using (A2) we can writeg t c as a function g t c (Rt c microGIσGI) (see Appendix H) where microGI is the GI mean and σGI isthe GI standard deviation

aMany epidemiological models define growth rates as the exponent r in an exponential growth function Herewe use daily growth rates instead for ease of exposition These choices are mathematically equivalent Notethat we adapted equation (29) in Wallinga amp Lipsitch5 to account for our choice

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Note that the GI can change over time due to NPIs In particular the effective contact tracingand case isolation in China substantially shortened the mean GI to only 26 days among theidentified cases6 Since most cases were likely not identified even in Wuhan China7 andour study omits most of the countries with highly effective contact tracing programs suchas China or South Korea we model the GI as constant (but uncertain) over time A possibleextension for further work is to model the change in GI explicitly

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are inferred from the observation models for cases and deathsWe assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)] (A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)] (A4)

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross-validation as no reference is available for it We evaluate five dif-ferent values (σN isin 00501020304) with a fixed randomly chosen validation set of6 countries σN = 02 maximises the predictive log-likelihood on the validation set Thisensures a more calibrated model which is less likely to produce overconfident or unstableestimates10 We did not tune any other aspect of the model

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation In realitythis delay distribution is the sum of two independent distributions the incubation period(lognormal distribution) and the delay from onset of symptoms to confirmation (negativebinomial distribution) These distributions are uncertain but modelling (and placing pri-ors over) them individually leads to unidentifiability Therefore we model the total delay

bSince we treat new infections as a continuous number its initial value can be between 0 and 1

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

between infection and confirmation assuming that this follows a negative binomial dis-tribution We convert priors over the individual delays to a prior over the total delay bybootstrapping mdash please see Appendix A3

As in Flaxman et al1 the observed cases Ct c follow a negative binomial noise distribu-tion with mean C t c and an inferred dispersion parameter Ψ(C ) This distribution encodesthat small case numbers are more noisy and should therefore receive less weight Hav-ing separate dispersion parameters for cases and deaths ensures that they can be weighteddifferently if there is a difference in their noise distributions

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ) (A6)

where PD (delay) is the distribution of the delay from infection to death As for cases wemodel the total delay between infection and death as negative binomial which is in realitythe sum of two independent distributions the incubation period and the delay from onsetof symptoms to death (gamma distribution)

Finally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and an inferred dispersion parameter Ψ(D)

The model was implemented in PyMC311 We infer the unobserved variables in our modelusing the No-U-Turn Sampler (NUTS)12 a standard Markov chain Monte Carlo samplingalgorithm

Appendix A2 Testing reporting and infection fatality rates

Scaling all values of a time series by a constant does not change its growth rates Themodel is therefore invariant to the scale of the observations and consequently to country-level differences in the IFR and the ascertainment rate (the proportion of infected peoplewho are subsequently reported positive) For example assume countries A and B differ onlyin their ascertainment rates Then our model will infer a difference in N (C )

0c (Eq (A3)) butnot in the growth rates g t c across A and B Accordingly the inferred NPI effectiveness willbe identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas the

cThis is only approximately true The negative binomial output distribution has a coefficient of variation dimin-ishing with its mean ie smaller observations are relatively more noisy and carry less weight Furthermorewhilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligible effect

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

introduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimics changes in the ascertainment ratemdashnoise at time

τ affects all future casesmdashand allows for gradual multiplicative changes in ascertainment

Appendix A3 Method for uncertainty over key epidemiological parameters

We account for uncertainty in the following delay distributions the generation interval thedelay from infection to symptom onset (incubation period) the delay from symptom onsetto case confirmation and the delay from symptom onset to death

Each distribution has an uncertain mean (Table A3) whose prior distribution we takefrom a meta-analysis13 published shortly after our window of analysis This helps accountfor possible heterogeneity between different populations and therefore often finds higheruncertainty in the means than estimated in primary studies while using more data Themeta-analysis reports mean values with 95 confidence intervals We set normal priors overthe mean delays with mean equal to the reported mean in the meta-analysis and standarddeviation set such that the prior 95 credible interval includes the 95 confidence intervalsfrom the meta-analysis For example the meta-analysis reports an expected mean delayfrom symptoms to death of 1671 days with a 95 credible interval ranging from 1537 to1817 We thus assume that the prior is normal with micro= 1671 and σ= 075 which ensuresthat its 95 credible interval ranges from 1525 to 1817 strictly including the intervalreported in the meta analysis to be conservative Priors are truncated at zero

Furthermore we account for the uncertainty in the standard deviation of these delay dis-tributions by placing normal priors based on primary studies following the same methodas above (Table A4) For the standard deviation of the generation interval we use patientdata from Feretti et al14 and reproduce their method fitting a Gamma distribution to thegeneration time

Finally the distributional forms of the delay distributions are taken from the above primarystudies

The delay from infection to case confirmation is the sum of the incubation period and thesymptom-onset-to-case-confirmation delay Similarly the delay from infection to death isthe sum of the incubation period and the symptom-onset-to-death delay However forcomputational tractability we model the total delay distributions converting the priors de-scribed above to priors over the total delays We assume that the total delays follow negativebinomial distributions which are computationally tractable and empirically provide a closefit to both sum distributions We place normal priors over the mean and dispersion parame-ters of these total delay distributions by bootstrapping For example we compute priors forthe total infection-to-case-confirmation delay distribution by sampling Nbootstrap = 250 in-

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

cubation periods and symptom-onset-to-case-confirmation distributions For each sampledpair of distributions we draw 106 samples from their sum and fit a negative binomial usingmoment matching We compute the mean and standard deviation of the negative binomialdistribution parameters across the bootstrap and use this for our prior The resulting priorsare given in Table A2

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table A2 Epidemiological parameter distributions used in the model The details on sources and priorsused to generate this Table are given in Tables A3 and A4

Delay type Sources Prior on mean Prior on standard Distributionaldeviation or formdispersion

Infection to case Fonfria et al13 N (1092σ= 094) Dispersion Negativeconfirmation Lauer et al15 N (541σ= 027) binomial

Cereda et al16

Infection to death Fonfria et al13 N (2182σ= 101) Dispersion NegativeLauer et al15 N (1426σ= 518) binomialLinton et al17

Generation interval Fonfria et al13 N (506σ= 033) SD N (211σ= 05) GammaFeretti et al14

Table A3 Mean of epidemiological parameters all from meta-analysis13 and distributional forms

Delay type Reported mean Prior on mean Distributionaland 95 CI form of delay

Incubation period 579 (548ndash611) logmicrosimN (153σ= 0051) Lognormal15

Onset to death 1671 (1537ndash1817) N (1671σ= 075) Gamma17

Onset to case confirmation 582 (474ndash715) N (582σ= 068) Negative binomial16

Generation interval 506 (450ndash570) N (506σ= 033) Gamma14

Table A4 Standard deviation or dispersion of epidemiological parameters

Delay type Source Reported standard deviation or Prior on standarddispersion with uncertainty deviation or

dispersionIncubation period Lauer et al15 SD of log delay SD of log delay

048 (95 CI 027ndash054) N (048σ= 00759)Onset to death Linton et al17 SD 69 (95 CI 52ndash91) N (69σ= 1122)Onset to case confirmation Cereda et al16 Dispersion 157 plusmn 0054 N (157σ= 0054)Generation interval Feretti et al14 SD 211 plusmn 05 (reproduced by us) N (211σ= 05)

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking its predictions on unseendata even if prediction is not the purpose of the model1018 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries we have strong reason to doubt itseffectiveness estimates However we do not expect NPI effectiveness models to extrapolateperfectly Almost always unobserved factors such as changes in the ascertainment rate orIFR spontaneous behaviour changes or unobserved NPIs will affect the observed numberof cases and deaths Our models ought to treat these factors as noise and not attribute theireffects on Rt to the observed NPIs

We use Leave-One-Out Cross-Validation (LOO-CV) fitting the model on 40 countries andextrapolating to the excluded country We repeat this process for all 41 countries In theexcluded country the first 14 days of case and death data are observed to estimate R0c aswell as the initial outbreak sizes N (C )

0c and N (D)0c All later days are unobserved (masked)

The model uses the effectiveness estimates inferred from the 40 included countries and NPIactivation dates to predict the course of the pandemic in the excluded country

Figure B7 visually shows extrapolations obtained from LOO-CV for a randomly chosensubset of six countries (all countries are shown in Figures C18 to C21) The predictiveaccuracy of the model as expected degrades with an increasing prediction horizon sug-gesting the presence of unobserved factors However the predictive credible intervals arewide and increase over time as noise in the growth rate accumulates Importantly ourmodel makes calibrated forecasts in excluded countries (Fig B8) over periods of up to 2months

These are challenging predictions to the best of our knowledge no related study analyseshow their estimated NPI effects generalise to unseen countries Most related studies donot validate predictions on unseen data at all19ndash23 reviewed in9 Flaxman et al1 hold outthe last 14 days from 20 April to 4th of May for all countries in aggregate However thecountries they study implemented all NPIs in March or earlier (Supplementary Table 2 in1)meaning that in fact all countries were used to estimate NPI effects

Note that Fig B8 includes the 6 countries that were used to select the hyperparameter(Appendix A) However the calibration is similar if these 6 countries are excluded (Fig-ure C23)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals the extent to which results depend on uncertain parametersand modelling choices and can diagnose model misspecification and excessive collinearity

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

100

101

102

103

104

105

106

便便

Slovenia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

便 便

Greece

Figure B7 Model extrapolations on a randomly chosen subset of 6 excluded countries produced usingleave-one-out cross-validation In each excluded country the first 14 days of death and case data (soliddots) are observed the model then extrapolates to all future days within the period of analysis (emptydots) For each country we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier see Methods) The shaded areas representthe 95 credible intervals

in the data24 We vary many of the components of our model and recompute the NPIeffectiveness estimates summarised here Further analysis can be found in Appendix C

For each sensitivity analysis we quantify sensitivity using an easily interpretable measureshown above the legend of each plot the average standard deviation denoted σ Firstfor each NPI we compute the standard deviation of its median effect estimates across allexperimental conditions in the given sensitivity analysis We then average these acrossthe NPIs Since effectiveness is expressed in percentage reductions of Rt the unit of σ ispercent Zero percent corresponds to no sensitivity

We perform a multivariate sensitivity analysis on the key epidemiological priors For com-putational reasons all other sensitivity analyses are univariate

Multivariate sensitivity to main epidemiological priors The key epidemiological param-eters in our model describe the delay from infection to case-confirmation the delay from

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

0 20 40 60 80 100Credible Interval

0

20

40

60

80

100

Em

piric

al C

umul

ativ

e Fr

eque

ncy

Calibration Country Holdouts

Figure B8 Calibration in held-out countries with leave-one-out cross validation In each excludedcountry the first 14 days of death and case data are observed to estimate R0c as well as the initialoutbreak sizes the model then extrapolates to all future days within the period of analysis The plotshows the percentage of observed daily case and death counts that lie within the X credible intervalacross all countries and days The identity line represents ideal calibration

infection to death and the generation interval Our model places prior distributions overthe means of these delay distributions Since the priors already account for uncertainty inthe delays this analysis considers what would happen if our prior beliefs changed Sincethe epidemiological parameters may interact in unexpected ways we perform a multivari-ate sensitivity analysis jointly changing the mean of the prior over the mean of each delaydistribution (referred to as the prior mean of the mean below) We vary the prior means ofthe mean delays across a wide but epidemiologically plausible range of 5 values resultingin 125 different experimental conditions We present these results in two ways

First Figure B9 shows the global sensitivity of our results to the prior mean of the mean ofeach distribution individually For example for each setting of the prior mean of the meangeneration interval we show the average over the 5times5 = 25 runs with different prior meansplaced over the mean delays between infection and case confirmationdeath This is equiv-alent to placing a uniform prior across the 125 experiment conditions and marginalisingand is standard methodology for computing global sensitivity to an individual parameter

The prior means seem to mostly influence the relative effectiveness of business closuresand gathering bans Increasing the prior means of the infection-to-case-confirmationdeathmean delays results in larger effectiveness estimates for gatherings bans and smaller esti-

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

mates for business closures while increasing the prior mean of the generation interval hasthe opposite effect

Second we assess the sensitivity of our results to jointly varying the prior means This yieldsσ= 246 We present this result graphically in Appendix C1

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Infection to Case-Confirmation Delay= 112Mean 792 daysMean 942 daysMean 1092 daysMean 1242 daysMean 1392 days

-25 0 25 50 75Average reduction in Rtin the context of our data

Infection to Death Delay= 080Mean 1782 daysMean 1982 daysMean 2182 daysMean 2382 daysMean 2582 days

-25 0 25 50 75Average reduction in Rtin the context of our data

Generation Interval= 149Mean 306 daysMean 406 daysMean 506 daysMean 606 daysMean 706 days

Figure B9 Sensitivity to key epidemiological parameter choices evaluated using a multivariate analysisWe jointly vary the prior means over the mean of the generation interval the delay between infectionand case confirmation and the delay between infection and death Each panel displays sensitivity to theprior mean of one distribution and the NPI effects are computed by averaging over the 25 runs withdifferent prior means of the two other distributions (see text)

Sensitivity to country exclusions Figure B10 shows results if one country at a time isexcluded from the data As there is no strong justification for including or excluding anyparticular country results ought to be stable if a country is excluded

-25 0 25 50 75

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

= 151DefaultAlbaniaAndorraAustriaBelgiumBosnia andHerzegovinaBulgariaCroatia

-25 0 25 50 75

= 151DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-25 0 25 50 75

= 151DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

= 151DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-25 0 25 50 75Average reduction in Rtin the context of our data

= 151DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-25 0 25 50 75Average reduction in Rtin the context of our data

= 151DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure B10 Sensitivity to excluding individual countries

Sensitivity to case-undercounting Figure B11 shows results when scaling the recordednumber of cases in each country First we correct the number of reported cases on each day

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

using estimates of the time-varying ascertainment rate from Golding et al25 For example ifthe estimated ascertainment rate in country c at time t is 50 we double the correspondingnumber of reported cases With this altered case data the model estimates a larger effectfor closing schools and universities and a smaller effect for limiting gatherings to 1000people or less There is little change in the effects estimated for the other NPIs

Second we randomly either double or halve the reported number of cases in each countryThis is intended to demonstrate that the model is invariant to constant differences in ascer-tainment between countries As expected there are negligible differences compared to thedefault results

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Case Undercounting= 163

Time-Varying ScalingRandom Constant ScalingDefault

Figure B11 Sensitivity to case undercounting considering both constant undercounting and time-varying undercounting

Sensitivity to structurally different models Figure B12 shows the NPI effectivenessestimates from models that use only cases or deaths as observations in contrast to ourmain model which uses both As expected there are some differences in the NPI effectestimates between the models which may be caused by the unique biases present in casedata and in death data Differences could also occur if NPIs differentially affected casesand deaths (eg if they affect different age groups) Nevertheless the studyrsquos qualitativeconclusions as outlined in the Discussione hold across the different data types

A number of implicit structural assumptions are made by the choice of model structureWe test sensitivity to these assumptions by evaluating NPI effectiveness estimates from al-ternative models reproducing the structural sensitivity analysis from our concurrent workwhere these models are described and compared in detail9 While these alternative modelstructures are also plausible we chose the model described in the main text as it is relativelysimple whilst also being highly robust9

eThe main qualitative conclusions as outlined in the Discussion are Stay-at-home orders had a small effectMandating mask-wearing had a small effect School and university closures had a large effect Gatherings banswere effective More strict gathering bans were more effective than less strict ones Business closures wereeffective Closing most nonessential businesses had limited benefit over closing just high-risk businesses

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Data Type

= 347Only Case DataOnly Death DataDefault

Figure B12 Sensitivity to different data types ie models using only confirmed cases only deaths orboth

The models are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting Rt into a daily growth rate a renewalprocess is used as the infection model as in earlier work1826ndash28 For computationalreasons we do not place a prior distribution over the generation interval parametersfor this model

3 Noisy-R Model The noise terms ε(middot)ct affect Rt rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on Rt The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

As Figure B13 (left) shows the multiplicative effect models support almost all conclusionsdrawn in the Discussione The only exception is that the stay-at-home order NPI is cate-gorised as moderately effective under the Different Effects Model (median reduction in Rt

of 18)

The results of the Additive Effects Model cannot be directly compared to the other modelssince it expresses results on a different scale (percentage reduction in R0 instead of Rt ) Itis therefore shown in a separate panel However the trend in the results is visually similarto the default results

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs which were implemented nor directly measures broaderbehavioural changes Since these factors influence Rt we must be wary of their effectbeing attributed to observed NPIs We investigate this further by assessing how much effec-

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

Structural Sensitivity Multiplicative Models= 311

Noisy-RDiscrete Renewal

Different EffectsDefault

-25 0 25 50 75Average reduction in Rtas a percentage of R0

in the context of our data

Structural Sensitivity Additive Model

Figure B13 Structural sensitivity analysis NPI effectiveness estimates under different structural as-sumptions Left Effectiveness estimates for multiplicative effect models Note that for computationalreasons the discrete renewal model does not have a prior over the generation interval Right Effective-ness estimates for an additive effect models For this model the effectiveness of each NPI is expressedas a reduction of R0 rather than Rt

tiveness estimates change when previously unobserved factors are included and also whenobserved factors are excluded This is best practice for addressing unobserved factors suchas confounders2930

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs31 The timings of our observed NPIsrsquo implementation dates are indeed mutuallycorrelated prompting the question of how much results change when we make previouslyobserved NPIs unobserved Figure B14 (left) shows NPI effectiveness estimates when pre-viously observed NPIs are excluded in turn The studyrsquos main qualitative conclusionse holdacross all experimental conditions and the variations in NPI effectiveness are rarely largeenough to cause a change in effectiveness category (Figure 5) except for the Gatheringslimited to 1000 people or less NPI Considering that some of the excluded NPIs have strongestimated effects when included and are correlated with other NPIs this degree of robust-ness is encouragingly high It suggests that unobserved factors will not significantly biasresults as long as their effects and their correlations with the studied NPIs do not exceedthose of the studied NPIs We hypothesise that this robustness to unobserved factors is dueto the noise on the daily growth rate in our model9

In addition Figure B14 (right) shows NPI effectiveness estimates when we observe (iecontrol for) additional NPIs taken from the OxCGRT dataset32

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Inclusion of OxCGRT NPIs= 153

Travel ScreenQuarantineTravel BansSymptomatic TestingPublic Information CampaignsInternal Movement LimitedPublic Transport LimitedDefault

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

Exclusion of Collected NPIsUncombined Effects Shown

= 188Mask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay Home OrderDefault

Figure B14 Robustness to unobserved effects Left Results when excluding previously observed NPIsWe exclude one of the NPIs in turn and show the estimates for the other NPIs Note that this subplotshows the marginal effect for hierarchical NPIs rather than the total effect different to other figuresthat show cumulative effects for gathering bans and businesses closures For example Figure 3 displaysthe total effect of closing most nonessential businesses while here we show the additional effect ofclosing most nonessential businesses over just closing some high-risk businesses We show the additionaleffects here because the effect of a cumulative intervention would become undefined when part of it isexcluded from the analysis Right Results when controlling for previously unobserved NPIs We includeone additional NPI in turn and show the estimates for the NPIs in our study (the additional NPI is notshown)

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Appendix C1 Multivariate Sensitivity Analysis - Expanded Results

We now present the results of our joint multivariate sensitivity analysis (previously sum-marised in Figure B9) in which we vary the mean of the prior (ldquoprior meanrdquo) over themean of the generation interval the delay between infection and death and the delaybetween infection and case confirmation

Figure C15 shows for each NPI how NPI effects change when all prior means are variedsimultaneously

We find that the most sensitive NPIs are Gatherings limited to 1000 people or fewer Gath-erings limited to 100 people or fewer and Most nonessential businesses closed However thelargest differences appear when all of the parameters are set to the most extreme valuesthat jointly produce a change in a particular direction We also see systematic trends asthe parameters are varied eg increasing the prior mean of the generation interval meantends to decrease the effectiveness of the Gatherings limited to 1000 or fewer NPI while in-creasing prior means of the infection to deathcase confirmation distribution means tendsto increase the effectiveness of this NPI

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

792

942

1092

1242

1392

Mas

kWea

ring

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

GI = 306

-150

-75

00

75

150GI = 406

-150

-75

00

75

150GI = 506

-150

-75

00

75

150GI = 606

-150

-75

00

75

150GI = 706

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

00

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

0

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Som

eBus

s

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Mos

tBus

s

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Educ

at

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

792

942

1092

1242

1392

Stay

Hom

e

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

Figure C15 Global sensitivity analysis Each cell shows the difference between the median effectivenessestimate under a specific experimental condition and the default condition where the estimates areexpressed as percentage reduction in Rt Each subplot represents a particular prior mean of the meangeneration interval and an NPI The NPIs vary top to bottom and the generation interval prior meanincreases left to right Each cell with a subplot represents a particular choice of the prior mean of themean infection to death delay (increasing left to right) and the prior mean of the mean infection to caseconfirmation delay (increasing bottom to top) Positive cell values indicate that the median NPI estimateis larger than with default settings and vice versa

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C2 Sensitivity to additional epidemiological assumptions

For completeness Figure C16 shows sensitivity to two further epidemiological assumptionsOn the left we show the dependence of NPI effectiveness estimates on R0 We place aNormal prior over R0 and a hyperprior over its variance reflecting the wide disagreementof regional estimates of R0 In the sensitivity analysis we vary the mean of the prior froma low value (238) to a high one (428) As expected higher mean R0 values result in largerNPI effects This trend is apparent in all NPIs but stronger for NPIs that tended to beimplemented early on (like school closures and gathering bans)

On the right we vary the prior over NPI effectiveness Our default prior on NPI effectivenessis assymmetric reflecting a belief that NPIs are more likely to reduce Rt than increase itHere we additionally test a symmetric Normal prior and a one-sided Half-Normal priorwhich does not allow for NPIs to increase Rt

Appendix C3 Sensitivity to data preprocessing

When the case count is small a large fraction of cases may be imported from other coun-tries and the testing regime may change rapidly To prevent this from biasing our modelwe neglect case numbers before a country has reached 100 confirmed cases Similarly weneglect death numbers before a country has reached 10 deaths Here we vary these thresh-olds Intuitively the minimum cases threshold should be higher than the minimum deathsthreshold

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

R0 prior= 333

[R] = 238[R] = 278

Default[R] = 328[R] = 378[R] = 428

-25 0 25 50 75Average reduction in Rtin the context of our data

NPI Effectiveness Prior= 137

i Normal(0 022)i Half Normal(022)

Default

Figure C16 Sensitivity to additional epidemiological priors Left Sensitivity to the prior mean on R0Right Sensitivity to the NPI effectiveness prior

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Minimum Cases Threshold= 137

3050Default (100)200300

-25 0 25 50 75Average reduction in Rtin the context of our data

Minimum Deaths Threshold= 102

15Default (10)3050

Figure C17 Data preprocessing sensitivity Left Variations in the minimum number of cases RightVariations in the minimum number of deaths

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Validation by predicting unseen data - all countries

Here we show the country-level plots of the single-country holdout experiments from Ap-pendix B1 We use leave-one-out cross-validation fitting the model on 40 countries andshowing its predictions on the excluded country We repeat this process for all 41 countriesIn the excluded country the first 14 days (filled dots) of death and case data are observedto allow roughly inferring R0 These days are not used to infer NPI effectiveness All laterdays are masked (unfilled dots) The activation dates of NPIs are also given and the modeluses the effectiveness estimates inferred from the 40 other countries

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

北便

France

Figure C18 Predictions on excluded countries

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

北 便

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

Malta

Figure C19 Predictions on excluded countries

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

Slovenia

Figure C20 Predictions on excluded countries

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Switzerland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

United Kingdom

Figure C21 Predictions on excluded countries Vertical lines show the activation (or inactivation) datesof NPIs Shaded areas are 95 credible intervals Blue and red dots show the observed confirmed casesand deaths while blue and red lines show the median model estimates of cases (Ct ) and deaths (Dt )Empty dots are not observed by the model For each country we show the full window of analysis (fromthe start of the epidemic until the first NPI was lifted or the 30th of May 2020 whichever was earliersee Methods)

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Posterior predictive distributions

The posterior predictive distribution (Figure C22) shows the predicted number of cases anddeaths after observing the data Although these curves can be called lsquofitsrsquo the degree of fitto the data must be interpreted with great care The fit is generally tight but this is partlydue to the inferred latent noise variables ε(C )

t and ε(D)t Inferring this latent noise allows

the posterior predictive distribution to closely match the data without overfitting the effec-tiveness parameters to the data Such behavior is common in Bayesian models which oftenperfectly interpolate the data without overfitting33 The noise terms can account for peri-ods where infections grew faster or slower than predicted based solely on the active NPIsIn such periods the noise may account for changes in testing reporting and unobservedinterventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Daily DeathsDaily Confirmed Cases

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C22 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 Calibration without countries used for hyperparameter selec-tion

0 20 40 60 80 100Credible Interval

0

20

40

60

80

100E

mpi

rical

Cum

ulat

ive

Freq

uenc

y

Calibration Country Holdouts

Figure C23 Calibration in held-out countries with leave-one-out cross-validation Here we include onlythose 35 countries that were not used to select the hyperparameter We show the first 14 days of casesand deaths in each country and then extrapolate to future days within the period of analysis The plotshows the percentage of observed daily case and death counts that lie within the X credible intervalacross all countries and days

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 MCMC stability results

1000 1002 10040

1000

2000

3000

4000

R

0 1 2 30

500

1000

1500

2000

Relative ESS

Figure C24 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size A value of 1 indicates perfect decorrelation between samplesValues above (below) 1 indicate that the effective number of samples is higher (lower) than the actualnumber of samples due to negative (positive) correlation respectively

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D5 Estimated values for R0 by country The parenthesis give the 95 credible interval which isoften wide The mean R0 across countries is 33

Country Estimated R0 Country Estimated R0

Albania 348 (275431) Lithuania 323 (248403)Andorra 275 (21434) Malaysia 296 (236361)Austria 31 (25377) Malta 327 (248414)Belgium 36 (302427) Mexico 399 (32748)Bosnia and Herzegovina 331 (259406) Morocco 359 (299427)Bulgaria 375 (304457) Netherlands 316 (26375)Croatia 342 (27418) New Zealand 241 (17731)Czech Republic 346 (276425) Norway 273 (215338)Denmark 28 (22347) Poland 397 (32482)Estonia 291 (229361) Portugal 355 (292425)Finland 279 (221343) Romania 401 (335475)France 331 (28389) Serbia 38 (308461)Georgia 356 (281439) Singapore 295 (238356)Germany 285 (231346) Slovakia 353 (271445)Greece 321 (261388) Slovenia 299 (23373)Hungary 403 (332482) South Africa 447 (377527)Iceland 171 (113242) Spain 36 (306423)Ireland 368 (309433) Sweden 229 (176288)Israel 379 (309459) Switzerland 289 (234351)Italy 336 (288391) United Kingdom 32 (267378)Latvia 301 (233376)

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 Collinearity

Appendix D21 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the inclusion or exclusion ofthese countries in the dataset (Figure D25) If we included all countries we would con-clude that university closures were more effective than school closures (black markers inFigure D25) However if we excluded Iceland or Sweden we would conclude that theywere roughly equally effective As there is no strong justification for including or excludingone particular country we cannot meaningfully disentangle the effects of school and uni-versity closures However there is much more data to determine the joint effect and it isindeed much more stable (Figure D25)

-25 0 25 50 75 100Average reduction in Rtin the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D25 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown

Appendix D22 Co-occurrence of NPIs

Table D6 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or neither) Note thatwe do not show the traditional collinearity statistics (variance inflation factors and datacorrelations) since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D6 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D23 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations24 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus042 between closing schools anduniversities and closing some businesses (Figure D26) The weak correlations are oneindicator that collinearity is manageable with our dataset

Effect on NPI combinations To better understand posterior correlations we visualizetheir effect in hosted video files As we condition on different values for one NPI we cansee that the estimates of other NPIs change only slightly always staying well within thecredible intervals in Figure 3 The significance of posterior correlations is small enough thatit is possible to calculate a reasonable approximation to the mean effect of a set of NPIs bysimply combining the mean percentage reductions for each individual NPI (eg two 50reductions lead to a 75 reduction) For example this approximation leads to a joint effectof 75 for all NPIs together which closely matches the exact mean joint effect of 77

Videos are available online here

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D26 Posterior correlations between effectiveness parameters αi

Appendix D3 Posterior Epidemiological Parameter Distributions

Figure D27 shows the posterior distributions for variables describing the key delay distribu-tions in our model the generation interval the delay between infection and case confirma-tion and the delay between infection and death We find that the posterior distributions ofthese parameters are somewhat tighter than their priors but they still explore a wide rangeof values This suggests that the data provides evidence albeit weak about these delaydistributions (for example from visual inspection the data would show that the delay islonger for deaths than for cases) An exception is the dispersion of the infection to deathdelay distribution which has a very wide prior

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

5 6

0

1

dens

ity

Generation Interval Mean

posteriorprior

200 225

00

05

Infection to Death Delay Mean

100 125

00

05

Infection to Case-Confirmation Delay Mean

0 2 4

value

00

05

dens

ity

Generation Interval std

10 20

value

00

01

Infection to Death Delay disp

5 6

value

0

1

Infection to Case-Confirmation Delay disp

Figure D27 Posterior distributions over key epidemiological parameters describing the generation in-terval and the delays between infection and case confirmationdeath

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect morethan three-quarters of the population) We thus miss NPIs which were only implementedregionally For example a few regions in Germany implemented stay-at-home orders butmost did not Thus Germany is listed as no stay-at-home order in our data Additionallyour NPI definitions were not perfectly granular For example a gathering ban on gatheringsof gt15 people and a ban on gatherings of gt60 people would both fall under the NPIGatherings limited to 100 people or less despite likely having different effects on Rt Finally while we included more NPIs than previous work (Table F7) there are many NPIsfor which we were not able to collect enough high-quality data for our modeling such aspublic cleaning or changes to public transportation

Of the 41 countries in our dataset 33 are in Europe As a result the NPI effectivenessestimates may be biased towards effects in Europe and NPI effectiveness may have beendifferent in other parts of the world

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on Rt is constantacross countries and time However the exact implementation and adherence of each NPIis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa limited degree Additionally different countries have different cultural norms and ageprofiles affecting the degree to which a particular intervention is effective For example acountry where a higher proportion of the population is in education will likely experience alarger effect from a government order to close schools and universities Our estimates thusshould be adjusted to local circumstances To address differences between countries ourstructural sensitivity analysis includes a model where each NPI can have a different effectper country (Appendix B2) The average effectiveness estimates across countries in thismodel match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponential de-cline would have the same effect on observations as an unobserved NPI Consequently wecannot quantify its effect on our results (though the sensitivity analyses look reassuring)

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expectthat closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g t and the reproductive number Rt holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso report results from a renewal process model8 that does not depend on this assumptionand we find similar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix F Overview of previous work

Table F7 Data-driven multi-country multi-NPI studies of the effectiveness of observed (as opposed tohypothetical) NPIs in reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Banholzer etal 202023

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Chen andQiu 202021

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Islam et al202020

School closuresworkplace closuresrestrictions on massgatherings publictransport closure

lockdown

149 countries or regionsInterrupted timeseries regression

Liu et al202022

13 NPIs from theOXCGRT dataset

130 countries and regions panel regression

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F8 Some data-driven studies of the effectiveness of observed NPIs in reducing the transmissionof COVID-19 which study a single-country andor a single-NPI

Study NPIs studiedRegionscountries

studiedMethod

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dandekarand

Barbastathis202035

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Dehning etal 202036

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Gatto et al202037

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Hsiang et al202019

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US (each

country modelledindividually)

Linear regression onestimated growth

rates

Jarvis et al202038

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelKraemer etal 202039

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202040 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) model

Lai et al202041

Case detection andisolation travel

restrictions contactreductions

ChinaTravel-network-based

SEIR model

Continued on next page

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F8 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Lorch et al202042

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Maier andBrockmann

202043

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Orea andAacutelvarez202044

Lockdown SpainSpatial econometric

analysis

Quilty et al202045

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

Sears et al202046

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Siedner etal 202047

General socialdistancing

USInterruptedtime-series

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F7) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

61

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

62

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all ldquotrade activitiesrdquo (except within shopping malls) service jobsthat donrsquot involve physical contact museums libraries and galleries opened Sincethe criteria regarding ldquoall retail stores being openrdquo was met we counted April 27 asthe end date for ldquoMost nonessential businesses closedrdquo

63

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H All model equations

This section will mainly be of interest to readers that wish to re-implement the model

Variables are indexed by NPI i country c and day t All prior distributions are independent

Data

1 NPI Activations φi t c isin 012 Observed (Daily) Cases Ct c 3 Observed (Daily) Deaths D t c

Prior Distributions

1 Country-specific R0 R0c sim Normal(325κ) κsim Half Normal(micro= 0σ= 05)2 NPI effectiveness αi sim Asymmetric Laplace(m = 0κ= 05λ= 10) m is the location

parameter κgt 0 is the asymmetry parameter and λgt 0 is the scale parameter3 Infection Initial Counts

N (C )0c = exp(ζ(C )

c )

N (D)0c = exp(ζ(D)

c )

ζ(C )c sim Normal(micro= 0σ= 50)

ζ(D)c sim Normal(micro= 0σ= 50)

4 Observation Noise Dispersion Parameters

Ψcases sim Half Normal(micro= 0σ= 5) (H1)

Ψdeaths sim Half Normal(micro= 0σ= 5) (H2)

Hyperparameters

1 Growth Noise Scale σg = 02

Delay Distributions

1 Generation interval distribution1314

microGI sim Normal(micro= 506σ= 03265)

σGI sim Normal(micro= 211σ= 05)

64

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

2 Time from infection to case confirmation T (C )131516f

microinfrarrconf sim Normal(micro= 1092σ= 094)

Ψinfrarrconf sim Normal(micro= 541σ= 027)

This distribution is converted into a forward-delay vector

T (C )[t ] =

1ZC

Negative Binomial(t micro=microinfrarrconfα=Ψinfrarrconf) t lt 32

0 otherwise

with ZC =31sum

t prime=0Negative Binomial(t primemicro=microinfrarrconfα=Ψinfrarrconf)

ie the delay follows a truncated and normalised negative binomial distribution3 Time from infection to death T (D)f131517

microinfrarrdeath sim Normal(micro= 2182σ= 101)

Ψinfrarrdeath sim Normal(micro= 1426σ= 518)

This distribution is converted into a forward-delay vector

T (D)[t ] =

1ZD

Negative Binomial(t micro=microinfrarrdeathα=Ψinfrarrdeath) t lt 48

0 otherwise

with ZD =47sum

t prime=0Negative Binomial(t primemicro=microinfrarrdeathα=Ψinfrarrdeath)

ie the delay follows a truncated and normalised negative binomial distribution

Infection Model

Rt c = R0c middotexp

(minus

Isumi=1

αi φi t c

) where I is the number of NPIs

αGI =microGI

σ2GI

βGI =micro2

GI

σ2GI

g t c = exp

(βGI(R

1αGIct minus1)

)minus1

fα in the definition of the Negative Binomial distribution is the dispersion parameter Larger values of αcorrespond to a smaller variance and less dispersion With our parameterisation the variance of the Negative

Binomial distribution is micro+ micro2

α

65

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

]

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise

ε(C )τc sim Normal(micro= 0σ=σg )

ε(D)τc sim Normal(micro= 0σ=σg )

Observation Modelf

Ct c =31sumτ=0

N (C )tminusτcT

(C )[τ]

D t c =47sumτ=0

N (D)tminusτcT

(D)[τ]

Ct c sim Negative Binomial(micro= Ct c α=Ψcases)

D t c sim Negative Binomial(micro= D t c α=Ψdeaths)

66

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix I References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Sam Abbott et al ldquoEstimating the time-varying reproduction number of SARS-CoV-2using national and subnational case countsrdquo In Wellcome Open Research 5112 (2020)p 112

3 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

4 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

5 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

6 Sheikh Taslim Ali et al ldquoSerial interval of SARS-CoV-2 was shortened over time bynonpharmaceutical interventionsrdquo In Science 3696507 (2020) pp 1106ndash1109

7 Xingjie Hao et al ldquoReconstruction of the full transmission dynamics of COVID-19 inWuhanrdquo In Nature 5847821 (Aug 2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the robustness of effectiveness estimation of nonpharma-ceutical interventions against COVID-19 transmissionrdquo In arXiv preprint arXiv200713454(2020) URL httpsarxivorgabs200713454

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

12 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

13 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

14 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

67

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

15 Stephen A Lauer et al ldquoThe incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases estimation and applicationrdquo In Annals ofinternal medicine 1729 (2020) pp 577ndash582

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

19 Solomon Hsiang et al ldquoThe effect of large-scale anti-contagion policies on the COVID-19 pandemicrdquo In Nature 5847820 (2020) pp 262ndash267

20 Nazrul Islam et al ldquoPhysical distancing interventions and incidence of coronavirusdisease 2019 natural experiment in 149 countriesrdquo In bmj 370 (2020)

21 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo In Covid Economics 7 (Apr 2020) pp 46ndash67

22 Yang Liu et al ldquoThe impact of non-pharmaceutical interventions on SARS-CoV-2 trans-mission across 130 countries and territoriesrdquo In medRxiv (2020)

23 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

24 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

25 Nick Golding et al ldquoReconstructing the global dynamics of under-ascertained COVID-19 cases and infectionsrdquo In medRxiv (2020) DOI 1011012020070720148460eprint httpswwwmedrxivorgcontentearly202007082020070720148460fullpdf URL httpswwwmedrxivorgcontentearly202007082020070720148460

26 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

27 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

28 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

68

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

29 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

30 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

31 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

32 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

33 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

36 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

37 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

38 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

39 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

40 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

41 Shengjie Lai et al ldquoEffect of non-pharmaceutical interventions to contain COVID-19 inChinardquo en In Nature 5857825 (Sept 2020) pp 410ndash413 DOI 101038s41586-020-2293-x

69

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

42 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

43 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

44 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

45 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

46 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

47 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

70

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Testing reporting and infection fatality rates
      • Method for uncertainty over key epidemiological parameters
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Multivariate Sensitivity Analysis - Expanded Results
              • Sensitivity to additional epidemiological assumptions
              • Sensitivity to data preprocessing
              • Validation by predicting unseen data - all countries
              • Posterior predictive distributions
              • Calibration without countries used for hyperparameter selection
              • MCMC stability results
                • Additional results
                  • Estimated R0 by country
                  • Collinearity
                    • The individual effects of school and university closures
                    • Co-occurrence of NPIs
                    • Correlations between effectiveness estimates
                      • Posterior Epidemiological Parameter Distributions
                        • Additional discussion of assumptions and limitations
                          • Limitations of the data
                          • Model limitations
                            • Overview of previous work
                            • Handling edge cases in the data collection
                            • All model equations
                            • References
Page 4: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan

Mask wearing Gatherings lt1000 Gatherings lt 100 Gatherings lt 10 Some businessesclosed

Universities closedSchools closed Stay-at-home orderMost businessesclosed

Figure 1 Timing of NPI implementations in early 2020 Crossed-out symbols signify when an NPI waslifted Detailed definitions of the NPIs are given in Table 1

4

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Methods

Dataset

We analyse the effects of NPIs (Table 1) in 41 countriesa (see Figure 1) We recorded NPIimplementations when the measures were implemented nationally or in most regions of acountry (affecting at least three fourths of the population) For each country the windowof analysis starts on the 22nd of January and ends after the first NPI was lifted or on the30th of May 2020 whichever was earlier The reason to end the analysis after the firstmajor reopeningb was to avoid a distribution shift For example when schools reopenedit was often with safety measures such as smaller class sizes and distancing rules It istherefore expected that contact patterns in schools will have been different before schoolclosure compared to after reopening Modelling this difference explicitly is left for futurework Data on confirmed COVID-19 cases and deaths were taken from the Johns HopkinsCSSE COVID-19 Dataset13 The data used in this study including sources are availableonline here

aThe countries were selected for the availability of reliable NPI data at the time when we started data collectionand modelling (April 2020) and for their presence in at least one of the public datasets that we used tocross-validate our collected data We excluded countries with fewer than 100 cases (or 10 deaths) by March31 as our model neglects new cases and deaths below these thresholds We also excluded a small numberof countries if there were credible media reports casting doubt on the trustworthiness of their reporting ofcases and deaths Finally we excluded very large countries like China the US and Canada for ease of datacollection as these would require more locally fine-grained data 33 of the 41 included countries are in EuropeAs a result the NPI effectiveness estimates may be biased towards effects in Europe and NPI effectiveness mayhave been different in other parts of the world

bConcretely the window of analysis extended until three days after the first reopening for confirmed cases and13 days after the first reopening for deaths These values correspond to the 5 quantile of the infection-to-confirmationdeath distributions ensuring that less than 5 of the new infections on the reopening day werestill observed in the window of analysis

5

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table 1 NPIs included in the study Appendix G details how edge cases in the data collection werehandled

NPI DescriptionMask-wearingmandatory in(some) publicspaces

A country has mandated mask usage in the public sometimes limitedto just some public spaces (which the government deems to have a highrisk of infection) For example some countries mandated mask-wearingin most or all indoor public spaces but not outdoors

Gatheringslimited to 1000people or less

A country has set a size limit on gatherings The limit is at most 1000people (often less) and gatherings above the maximum size are disal-lowed For example a ban on gatherings of 500 people or more wouldbe classified as ldquogatherings limited to 1000 or lessrdquo but a ban on gath-erings of 2000 people or more would not

Gatheringslimited to 100people or less

A country has set a size limit on gatherings The limit is at most 100people (often less)

Gatheringslimited to 10people or less

A country has set a size limit on gatherings The limit is at most 10people (often less)

Some businessesclosed

A country has specified a few kinds of customer-facing businesses thatare considered ldquohigh riskrdquo and need to suspend operations (black-list) Common examples are restaurants bars nightclubs cinemasand gyms By default businesses are not suspended

Most nonessentialbusinesses closed

A country has suspended the operations of many customer-facing busi-nesses By default customer-facing businesses are suspended unlessthey are designated as essential (whitelist)

Schools closed A country has closed most or all schoolsUniversitiesclosed

A country has closed most or all universities and higher education fa-cilities

Stay-at-homeorder (withexemptions)

An order for the general public to stay at home has been issued This ismandatory not just a recommendation Exemptions are usually grantedfor certain purposes (such as shopping exercise or going to work) ormore rarely for certain times of the day In practice a stay-at-home or-der was often accompanied or preceded by other NPIs such as businessclosures We account for these other NPIs separately eg when a coun-try issued a law that both closed businesses and ordered the populationto stay at home this is encoded as both a business closure and a stay-at-home order (with exemptions) in the data In our results we thusestimate the additional effect of the order to generally stay at home

6

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Data collection

We collected data on the start and end date of NPI implementations from the start of thepandemic until the 30th of May 2020 Before collecting the data we experimented withseveral public NPI datasets finding that they were not complete enough for our modellingand contained incorrect datesc By focusing on a smaller set of countries and NPIs than thesedatasets we were able to enforce strong quality controls We used independent doubleentry and manually compared our data to public datasets for cross-checking

First two authors independently researched each country and entered the NPI data into sep-arate spreadsheets The researchers manually researched the dates using internet searchesthere was no automatic component in the data gathering process The average time spentresearching each country per researcher was 15 hours

Second the researchers independently compared their entries to the following public datasetsand if there were conflicts visited all primary sources to resolve the conflict the EFGNPIdatabase14 the Oxford COVID-19 Government Response Tracker15 and the mask4all dataset16

Third each country and NPI was again independently entered by one to three paid con-tractors who were provided with a detailed description of the NPIs and asked to includeprimary sources with their data A researcher then resolved any conflicts between this dataand one (but not both) of the spreadsheets

Finally the two independent spreadsheets were combined and all conflicts resolved by aresearcher The final dataset contains primary sources (government websites andor mediaarticles) for each entry

Data Preprocessing

When the case count is small a large fraction of cases may be imported from other countriesand the testing regime may change rapidly To prevent this from biasing our model weneglect case numbers before a country has reached 100 confirmed cases and death numbers

cWe evaluated the following datasets

bull Epidemic Forecasting Global NPI Database14

bull Oxford COVID-19 Government Response Tracker (OxCGRT)15

bull ACAPS COVID19 Government Measures Dataset

Note that these datasets are under continuous development Many of the mistakes found will already havebeen corrected We know from our own experience that data collection can be very challenging We havethe fullest respect for the people behind these datasets In this paper we focus on a more limited set ofcountries and NPIs than these datasets contain allowing us to ensure higher data quality in this subset Givenour experience with public datasets and our data collection we encourage fellow COVID-19 researchers toindependently verify the quality of public data they use if feasible

7

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

before a country has reached 10 deaths We include these thresholds in our sensitivityanalysis (Appendix C3)

Short model description

For each country c

For each day t

New infections N (sdot)tc

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Mask wearing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closedSchools closed

Stay-at-home order

Growth reductions from interventions αi i

Daily growth rate gtc

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Uncertain delay from infection to

confirmation death

Uncertain generation interval

Universities closed

Figure 2 Model Overview Purple nodes are observed We describe the diagram from bottom to topThe effectiveness of NPI i is represented by αi which is independent of the country On each day t a countryrsquos daily reproduction number Rt c depends on the countryrsquos basic reproduction number R0cand the active NPIs The active NPIs are encoded by Φi t c which is 1 if NPI i is active in country c attime t and 0 otherwise Rt c is transformed into the daily growth rate gt c using the generation intervalparameters and subsequently is used to compute the new infections N (C )

t c and N (D)t c that will turn into

confirmed cases and deaths respectively Finally the number of new confirmed cases Ct c and deathsDt c is computed by a discrete convolution of N (middot)

t c with the respective delay distributions Our modeluses both death and case data it splits all nodes above the daily growth rate gt c into separate branchesfor deaths and confirmed cases We account for uncertainty in the generation interval infection-to-case-confirmation delay and the infection-to-death delay by placing priors over the parameters of thesedistributions

In this section we give a short summary of the model (Figure 2) The detailed modeldescription is given in Appendix A Code is available online here

Our model uses case and death data from each country to lsquobackwardsrsquo infer the number ofnew infections at each point in time which is itself used to infer the reproduction numbersNPI effects are then estimated by relating the daily reproduction numbers to the activeNPIs across all days and countries This relatively simple data-driven approach allows usto sidestep assumptions about contact patterns and intensity infectiousness of different age

8

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

groups and so forth that are typically required in modelling studies We make several addi-tions to the semi-mechanistic Bayesian hierarchical model of Flaxman et al1 allowing ourmodel to observe both cases and death data This increases the amount of data from whichwe can extract NPI effects reduces distinct biases in case and death reporting and reducesthe bias from including only countries with many deaths Since epidemiological parametersare only known with uncertainty we place priors over them following recent recommendedpractice17 Additionally as we do not aim to infer the total number of COVID-19 infectionswe can avoid assuming a specific infection fatality rate (IFR) or ascertainment rate (rate oftesting)

The growth of the epidemic is determined by the time- and country-specific reproductionnumber Rt c which depends on a) the (unobserved) basic reproduction number R0c givenno active NPIs and b) the active NPIs at time t R0c accounts for all time-invariant factorsthat affect transmission in country c such as differences in demographics population den-sity culture and health systems18 We assume that the effect of each NPI on Rt c is stableacross countries and time The effectiveness of NPI i is represented by a parameter αi over which we place an Asymmetric Laplace prior that allows for both positive and negativeeffects but places 80 of its mass on positive effects reflecting that NPIs are more likely toreduce Rt c than to increase it Following Flaxman et al and others168 each NPIrsquos effecton Rt c is assumed to independently affect Rt c as a multiplicative factor

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (1)

where φi ct = 1 indicates that NPI i is active in country c on day t (φi ct = 0 otherwise) andI is the number of NPIs The multiplicative effect encodes the plausible assumption thatNPIs have a smaller absolute effect when Rt c is already low We discuss the meaning ofeffectiveness estimates given NPI interactions in the Results section

In the early phase of an epidemic the number of new daily infections grows exponentiallyDuring exponential growth there is a one-to-one correspondence between the daily growthrate and Rt c

19 The correspondence depends on the generation interval (the time betweensuccessive infections in a chain of transmission) which we assume to have a Gamma dis-tribution The prior on the mean generation interval has mean 506 days derived from ameta-analysis20

We model the daily new infection count separately for confirmed cases and deaths repre-senting those infections which are subsequently reported and those which are subsequentlyfatal However both infection numbers are assumed to grow at the same daily rate in ex-pectation allowing the use of both data sources to estimate each αi The infection numberstranslate into reported confirmed cases and deaths after a stochastic delay The delay is thesum of two independent distributions assumed to be equal across countries the incuba-tion period and the delay from onset of symptoms to confirmation We put priors over themeans of both distributions resulting in a prior over the mean infection-to-confirmationdelay with a mean of 1092 days20 see Appendix A3 Similarly the infection-to-death

9

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

delay is the sum of the incubation period and the delay from onset of symptoms to deathand the prior over its mean has a mean of 218 days20 Finally as in related NPI models16both the reported cases and deaths follow a negative binomial noise distribution with aninferred noise dispersion

Using a Markov chain Monte Carlo (MCMC) sampling algorithm21 this model infers pos-terior distributions of each NPIrsquos effectiveness while accounting for cross-country variationsin testing reporting and fatality rates as well as uncertainty in the generation interval anddelay distributions To analyse the extent to which modelling assumptions affect the re-sults our sensitivity analysis includes all epidemiological parameters prior distributionsand many of the structural assumptions introduced above (Appendix B2 and AppendixC) MCMC convergence statistics are given in Appendix C7

Results

NPI Effectiveness

Our model enables us to estimate the individual effectiveness of each NPI expressed as apercentage reduction in R As in related work168 this percentage reduction is modelledas constant over countries and time and independent of the other implemented NPIs Inpractice however NPI effectiveness may depend on other implemented NPIs and localcircumstances Thus our effectiveness estimates ought to be interpreted as the effectivenessaveraged over the contexts in which the NPI was implemented in our data10 Our results thusgive the average NPI effectiveness across typical situations that the NPIs were implementedin Figure 3 (bottom left) visualises which NPIs typically co-occurred aiding interpretation

Under the default model settings the mean percentage reduction in R (with 95 credibleinterval) associated with each NPI is as follows (Figure 3) mandating mask-wearing in(some) public spaces -1 (-13ndash8) limiting gatherings to 1000 people or less 13(-3ndash31) to 100 people or less 28 (9ndash44) to 10 people or less 36 (17ndash53) closing some high-risk businesses 20 (0ndash40) closing most nonessential busi-nesses 29 (8ndash47) closing schools and universities 41 (23ndash56) and issuingstay-at-home orders (with exemptions) 10 (-2ndash22) Note that we cannot robustlydisentangle the individual effects of closing schools and closing universities since the im-plementation dates of these NPIs coincided nearly perfectly in all countries except Icelandand Sweden (Appendix D21) We thus show the joint effect of closing both schools anduniversities and treat schools and universities closed as one single NPI going forward

Some NPIs frequently co-occurred ie were partly collinear However we are able to iso-late the effects of individual NPIs since the collinearity is imperfect and our dataset is largeFor every pair of NPIs we observe one of them without the other for 748 country-days onaverage (Appendix D22) The minimum number of country-days for any NPI pair is 143(for limiting gatherings to 1000 or 100 attendees) Additionally under excessive collinear-ity and insufficient data to overcome it individual effectiveness estimates are highly sensi-

10

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75 100Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spaces

Gatherings limited to1000 people or less

Gatherings limited to100 people or less

Gatherings limited to10 people or less

Some businessesclosed

Most nonessentialbusinesses closed

Schools and universitiesclosed

Stay-at-home order(with exemptions)

EndsTransmission

北 便

i

j

100

100

100 81

10

0 64

100

100 45

27

100 94

78

89

65

88

93

44

29

100

100 82

92

68

91

96

46

28

100

100

100 99

76

97

98

55

30

99

97

86

100 72

95

98

49

27

99

98

91

100

100 99

99

64

30

97

95

84

95

72

100 99

49

29

98

95

81

92

68

93

100 46

28

100

100 98

10

0 94

99

99

100

Frequency i Active Given j Active

0 500 1000 1500 2000 2500 3000

Days

Total Days Active

Figure 3 Top NPI effects under default model settings The Figure shows the average percentagereductions in R as observed in our data (or in terms of the model the posterior marginal distributionsof 1minusexp(minusαi )) with median 50 and 95 credible intervals A negative 1 reduction refers to a 1increase in R Cumulative effects are shown for hierarchical NPIs (gathering bans and business closures)ie the result for Most nonessential businesses closed shows the cumulative effect of two NPIs withseparate parameters and symbols - closing some (high-risk) businesses and additionally closing mostremaining (non-high-risk but nonessential) businesses given that some businesses are already closedBottom Left Conditional activation matrix Cell values indicate the frequency that NPI i (x-axis) wasactive given that NPI j (y-axis) was active Eg schools were always closed whenever a stay-at-homeorder was active (bottom row third column from the right) but not vice versa Bottom Right Totalnumber of days each NPI was active across all countries

11

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Most businessesclosed

Stay-at-homeorder

Mask wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businessesclosed

Schools anduniversities closed

Figure 4 Combined NPI effectiveness for the most common sets of NPIs in our data by size of the NPI setShaded regions denote 50 and 95 credible intervals Left Maximum R0 that could be reduced to be-low 1 for each set of NPIs Right Predicted R after implementation of each set of NPIs assuming R0 = 38Readers can interactively explore the effects of all sets of NPIs at httpepidemicforecastingorgcalc

tive to variations in the data and model parameters22 High sensitivity prevented Flaxmanet al1 who had a smaller dataset from disentangling NPI effects9 Our estimates are sub-stantially less sensitive (see next section) Finally the posterior correlations between theeffectiveness estimates are weak suggesting manageable collinearity (Appendix D23)

Although the correlations between the individual estimates are weak we should take theminto account when evaluating combined NPI effects For example if two NPIs frequentlyco-occurred there may be more certainty about the combined effect than about the twoindividual effects Figure 4 shows the combined effectiveness of the sets of NPIs thatare most common in our data Together our set of NPIs reduced R by 77 (74ndash79)on average Across countries the mean R without any NPIs (ie the R0) was 33 (Ta-ble D5 reports R0 for all countries) Starting from this number the estimated R likely couldhave been reduced below 1 by closing schools and universities high-risk businesses andlimiting gathering sizes Readers can interactively explore the effects of sets of NPIs athttpepidemicforecastingorgcalc A CSV file containing the joint effectiveness of all NPIcombinations is available online here

Sensitivity and validation

We perform a range of validation and sensitivity experiments (Appendix B with furtherexperiments in Appendix C) First we analyse how the model extrapolates to unseen coun-tries and find that it makes calibrated forecasts over periods of up to 2 months with un-certainty increasing over time Further we perform multiple sensitivity analyses studyinghow results change if we modify the priors over epidemiological parameters exclude coun-tries from the dataset use only deaths or confirmed cases as observations vary the datapreprocessing and more Finally we investigate our key assumptions by showing results

12

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

for several alternative models (structural sensitivity10) and examine possible confoundingof our estimates by unobserved factors influencing R In total we consider 201 alternativeexperimental conditions

Figure 5 (left) shows the median NPI effectiveness across these 201 experimental condi-tions Compared to the results under our default settings (Figures 3 and 4) median NPIeffects vary under alternative plausible experimental conditions However the trends in theresults are robust and some NPIs outperform others under all tested conditions While wetest over large ranges of plausible values our experiments do not include every possiblesource of uncertainty and the results might change more substantially under experimentalconditions we have not tested

We categorise NPI effects into small moderate and large which we define as a medianreduction in R of less than 175 between 175 and 35 and more than 35 (verticallines in Figure 5) Six of the NPIs fall into a single category in a large fraction of experi-mental conditions school and university closures are associated with a large effect in 99of experimental conditions limiting gatherings to 10 people or less in 94 Closing mostnonessential businesses has a moderate effect in 96 of conditions limiting gatherings to100 people or less in 97 Making mask-wearing mandatory in (some) public spaces fallsinto the ldquosmall effectrdquo category in 100 of experimental conditions issuing stay-at-homeorders (with exemptions) in 99 Two NPIs fall less clearly into one category Closing some(high-risk) businesses has a moderate effect in 86 of conditions and limiting gatherings to1000 people or less has a small effect in 79 However both NPIs have small-to-moderateeffects in more than 99 of experimental conditions The effect of limiting gatherings to1000 people or less is the least stable across the sensitivity analyses which may reflect itsaforementioned partial collinearity with limiting gatherings to 100 people or less

Aggregating all sensitivity analyses can hide sensitivity to specific assumptions We displaythe median NPI effects in four categories of sensitivity analyses (Figure 5 right) and eachindividual sensitivity analysis is shown in the Appendix The trends in the results are alsostable within the categories

Discussion

We use a data-driven approach to estimate the effects that eight nonpharmaceutical in-terventions had on COVID-19 transmission in 41 countries between January and the endof May 2020 We find that several NPIs were associated with a clear reduction in R inline with the mounting evidence that NPIs can be effective at mitigating and suppressingoutbreaks of COVID-19 Furthermore our results suggest that some NPIs outperformedothers While the exact effectiveness estimates vary with modelling assumptions the broadconclusions discussed below are largely robust across 201 experimental conditions in 11sensitivity analyses

13

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

0 25 50Median reduction in Rt

Mask-wearing mandatory in(some) public spacesGatherings limited to1000 people or lessGatherings limited to100 people or lessGatherings limited to10 people or lessSome businessesclosedMost nonessentialbusinesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

All Sensitivity Analyses (201 conditions)北便

Model Structure(5 conditions)

北便

Data and Preprocessing(51 conditions)

0 25 50Median reduction in Rt

北便

Epidemiological Priors(131 conditions)

0 25 50Median reduction in Rt

北便

Unobserved Factors(14 conditions)

Figure 5 Median NPI effects across the sensitivity analyses Left Median NPI effects (reduction in R)when varying different components of the model or the data in 201 experimental conditions Results aredisplayed as violin plots using kernel density estimation to create the distributions Inside the violinsthe box plots show median and interquartile-range The vertical lines mark 0 175 and 35 (seetext) Right Categorised sensitivity analyses Structural Using only cases or only deaths as observations(2 experimental conditions Figure B12) varying the model structure (3 conditions Figure B13 left)Data Leaving out one country at a time (41 conditions Figure B10) varying the threshold below whichcases and deaths are masked (8 conditions Figure C17) sensitivity to correcting for undocumentedcases and to country-level differences in case ascertainment (2 conditions Figure B11) Epidemiologicalpriors Jointly varying the means of the priors over the means of the generation interval the infection-to-case-confirmation delay and the infection-to-death delay (125 conditions Figure C15) varying theprior over R0 (4 conditions Figure C16 left) varying the prior over NPI effectiveness (2 conditionsFigure C16 right) Unobserved factors Excluding observed NPIs one at a time (9 conditions Figure B14left) controlling for additional NPIs from a different dataset (5 conditions Figure B14 right)

14

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Business closures and gathering bans both seem to have been effective at reducing COVID-19 transmission Closing only high-risk businesses appears to have been only somewhat lesseffective than closing most nonessential businesses the median reduction in R differs onlyby 9 (6ndash12 mean and 95 interval of median estimates across the experiments set-tings in Figure 5) Closing only high-risk businesses may thus have been the more promisingpolicy option in some circumstances Limiting gatherings to 10 people or less was more ef-fective than limits up to 100 or 1000 people This may reflect the fact that small gatheringsare common

As previously discussed we estimate the average effect each NPI had in the contexts inwhich it was implemented When countries introduced stay-at-home orders they nearly al-ways also banned gatherings and closed schools universities and some or most businessesif they had not done so already (Figure 3 bottom left) Flaxman et al1 and Hsiang et al3

add the effect of these distinct NPIs to the effectiveness of stay-at-home orders and accord-ingly find a large effect In contrast we account for these other NPIs separately and isolatethe additional effect of ordering the population to stay at home (when large gatheringsare banned and educational institutions and some businesses closed)d In accordance withother studies that took this approach we find a small effect26 A typical country could havereduced R to below 1 without a stay-at-home order (Figure 4) provided other NPIs wereimplemented

Mandating mask-wearing in various public spaces had no clear effect on average in thecountries we studied This does not rule out mask-wearing mandates having a larger ef-fect in other contexts In our data mask-wearing was only mandated when other NPIshad already reduced public interactions When most transmission occurs in private spaceswearing masks in public is expected to be less effective This might explain why a largereffect was found in studies that included China and South Korea where mask-wearingwas introduced earlier823 While there is an emerging body of literature indicating thatmask-wearing can be effective in reducing transmission the bulk of evidence comes fromhealthcare settings24 In non-healthcare settings risk compensation25 may play a largerrole potentially reducing effectiveness While our results cast doubt on reports that mask-wearing is the main determinant shaping a countryrsquos epidemic23 the policy still seemspromising given all available evidence due to its comparatively low economic and socialcosts Its effectiveness may have increased as other NPIs have been lifted and public inter-actions have recommenced

We find a large effect for school and university closures This finding is remarkably robustacross different model structures variations in the data and epidemiological assumptions(Figure 5) It remains robust when controlling for NPIs excluded from our study (Fig-ure B14) Our approach cannot distinguish direct and indirect effects such as forcing par-

dIn principle a stay-at-home order does not imply these other NPIs It is possible for a country to simultaneouslyhave allowed schools to be open and have a stay-at-home order (with exemptions for attending school) inplace However this is not how stay-at-home orders were implemented by the countries in our dataset andwe cannot estimate the effect of such a hypothetical stay-at-home order from the available data

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ents to stay at home or causing broader behaviour changes by increasing public concernAdditionally since school and university closures almost perfectly coincided in the coun-tries we study our approach cannot distinguish their individual effects (Appendix D21)This limitation likely also holds for other observational studies which do not include dataon university closures and estimate only the effect of school closures1ndash35ndash8 Previous evi-dence on school and university closures is mixed1626 Early data suggest that children andyoung adults are equally susceptible to infection but have a notably lower observed inci-dence rate than older adultsmdashwhether this is due to school and university closures remainsunknown27ndash30 Although infected young people are often asymptomatic they appear toshed similar amounts of virus as older people3132 and might therefore transmit the infec-tion to higher-risk demographics unknowingly As the role of children in transmission isstill unclear33 while outbreaks detected in schools are rising33ndash35 this topic merits carefulattention

Our study has several limitations First NPI effectiveness may depend on the context ofimplementation such as the presence of other NPIs and country-specific factors Our esti-mates must be interpreted as the average effectiveness over the contexts in our dataset10and expert judgement is required to adjust them to local circumstances Second R mayhave been reduced by unobserved NPIs or spontaneous behaviour changes To investigatewhether these reductions could be falsely attributed to the observed NPIs we perform sev-eral additional analyses and find that our results are stable to a range of unobserved effects(Appendix B3) However this sensitivity check cannot provide certainty Investigating therole of unobserved effects is an important topic to explore further Third our results can-not be used without qualification to predict the effect of lifting NPIs For example closingschools and universities seems to have greatly reduced transmission but this does not meanthat reopening them will cause infections to soar Educational institutions can implementsafety measures such as reduced class sizes as they reopen Further work is needed to anal-yse the effects of reopenings we hope that our collected data aids this effort Fourth whilewe included more NPIs than most previous work (Table F7) several promising NPIs wereexcluded For example testing tracing and case isolation may be an important part of acost-effective epidemic response36 but were not included because it is difficult to obtaincomprehensive data We discuss further limitations in Appendix E

Although our work focuses on estimating the impact of NPIs on the reproduction numberR the ultimate goal of governments may be to reduce the incidence prevalence and excessmortality of COVID-19 Controlling R is essential but the contribution of NPIs towards thesegoals may also be mediated by other factors such as their duration and timing37 periodicityand adherence3839 and successful containment40 While each of these factors addressestransmission within individual countries it can be crucial to additionally synchronise NPIsbetween countries since cases can be imported41

In conclusion as governments around the world seek to keep R below 1 while minimisingthe social and economic costs of their interventions we hope that our results can informpolicy decisions on which NPIs to implement in any potential further wave of infectionsAdditionally our work may provide insight on which areas of public life are most in need

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

of restructuring so that they can continue despite the pandemic However our estimatesshould not be seen as the final word on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies simulation studiesexperimental trials and clinical experience

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

The paid contractor work helping with the data collection the development of the interac-tive website and the costs for cloud compute were funded by the Berkeley Existential RiskInitiative

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier V Mikulik MAHartwick L Chindelevitch designed the model and modelling experiments M Sharma ABStephenson T Gavenciak J Salvatier performed and analysed the modelling experimentsJ Kulveit T Gavenciak JM Brauner conceived the research S Mindermann M SharmaJM Brauner L Chindelevitch J Kulveit T Besiroglu did the literature search JM BraunerS Mindermann M Sharma G Leech L Chindelevitch T Besiroglu V Mikulik wrote themanuscript All authors read and gave feedback on the manuscript and approved the finalmanuscript JM Brauner S Mindermann and M Sharma contributed equally L Chindele-vitch Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

3 Solomon Hsiang et al ldquoThe effect of large-scale anti-contagion policies on the COVID-19 pandemicrdquo In Nature 5847820 (2020) pp 262ndash267

4 Shengjie Lai et al ldquoEffect of non-pharmaceutical interventions to contain COVID-19 inChinardquo en In Nature 5857825 (Sept 2020) pp 410ndash413 DOI 101038s41586-020-2293-x

5 Yang Liu et al ldquoThe impact of non-pharmaceutical interventions on SARS-CoV-2 trans-mission across 130 countries and territoriesrdquo In medRxiv (2020)

6 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

7 Nazrul Islam et al ldquoPhysical distancing interventions and incidence of coronavirusdisease 2019 natural experiment in 149 countriesrdquo In bmj 370 (2020)

8 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo In Covid Economics 7 (Apr 2020) pp 46ndash67

9 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

10 Mrinank Sharma et al ldquoOn the robustness of effectiveness estimation of nonpharma-ceutical interventions against COVID-19 transmissionrdquo In arXiv preprint arXiv200713454(2020) URL httpsarxivorgabs200713454

11 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

12 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

13 Ensheng Dong Hongru Du and Lauren Gardner ldquoAn interactive web-based dashboardto track COVID-19 in real timerdquo In The Lancet Infectious Diseases 205 (May 2020)pp 533ndash534 DOI 101016s1473-3099(20)30120-1

14 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

15 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

16 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

17 Sam Abbott et al ldquoEstimating the time-varying reproduction number of SARS-CoV-2using national and subnational case countsrdquo In Wellcome Open Research 5112 (2020)p 112

18 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

19 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

20 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Young Joon Park et al ldquoContact tracing during coronavirus disease outbreak SouthKorea 2020rdquo In Emerg Infect Dis 2610 (2020)

28 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

29 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

30 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

31 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

32 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

33 Chen Stein-Zamir et al ldquoA large COVID-19 outbreak in a high school 10 days afterschoolsrsquo reopening Israel May 2020rdquo In Eurosurveillance 2529 (2020) p 2001352

34 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 38 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile919676Weekly_COVID19_Surveillance_Report_week_38_FINAL_UPDATEDpdf 2020

35 Meira Levinson Muge Cevik and Marc Lipsitch Reopening Primary Schools during thePandemic 2020

36 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

37 K Prem et al ldquoThe effect of control strategies to reduce social mixing on outcomes ofthe COVID-19 epidemic in Wuhan China a modelling studyrdquo In Lancet Public Health5 (5 May 2020) e261ndashe270

38 Patrick G T Walker et al ldquoThe impact of COVID-19 and strategies for mitigationand suppression in low- and middle-income countriesrdquo In Science 3696502 (2020)pp 413ndash422

39 Nicholas G Davies et al ldquoEffects of non-pharmaceutical interventions on COVID-19cases deaths and demand for hospital services in the UK a modelling studyrdquo In TheLancet Public Health 57 (2020) e375ndashe385

40 Xingjie Hao et al ldquoReconstruction of the full transmission dynamics of COVID-19 inWuhanrdquo In Nature 5847821 (Aug 2020)

41 N W Ruktanonchai et al ldquoAssessing the impact of coordinated COVID-19 exit strate-gies across Europerdquo In Science (2020) DOI 101126scienceabc5096

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 23

Appendix A1 Detailed model description 23

Appendix A2 Testing reporting and infection fatality rates 26

Appendix A3 Method for uncertainty over key epidemiological parameters 27

Appendix B Validation 30

Appendix B1 Unseen data 30

Appendix B2 Sensitivity Analysis 30

Appendix B3 Robustness to unobserved effects 35

Appendix C Additional sensitivity analyses and validation 38

Appendix C1 Multivariate Sensitivity Analysis - Expanded Results 38

Appendix C2 Sensitivity to additional epidemiological assumptions 40

Appendix C3 Sensitivity to data preprocessing 40

Appendix C4 Validation by predicting unseen data - all countries 42

Appendix C5 Posterior predictive distributions 47

Appendix C6 Calibration without countries used for hyperparameter selection 48

Appendix C7 MCMC stability results 49

Appendix D Additional results 50

Appendix D1 Estimated R0 by country 50

Appendix D2 Collinearity 51

Appendix D3 Posterior Epidemiological Parameter Distributions 53

Appendix E Additional discussion of assumptions and limitations 55

Appendix E1 Limitations of the data 55

Appendix E2 Model limitations 55

Appendix F Overview of previous work 57

Appendix G Handling edge cases in the data collection 60

Appendix H All model equations 64

Appendix I References 67

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections N (sdot)tc

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Mask wearing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closedSchools closed

Stay-at-home order

Growth reductions from interventions αi i

Daily growth rate gtc

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Uncertain delay from infection to

confirmation death

Uncertain generation interval

Universities closed

Figure A6 Model Overview Purple nodes are observed We describe the diagram from bottom to topThe effectiveness of NPI i is represented by αi which is independent of the country On each day t a countryrsquos daily reproduction number Rt c depends on the countryrsquos basic reproduction number R0cand the active NPIs The active NPIs are encoded by Φi t c which is 1 if NPI i is active in country c attime t and 0 otherwise Rt c is transformed into the daily growth rate gt c using the generation intervalparameters and subsequently is used to compute the new infections N (C )

t c and N (D)t c that will turn into

confirmed cases and deaths respectively Finally the number of new confirmed cases Ct c and deathsDt c is computed by a discrete convolution of N (middot)

t c with the respective delay distributions Our modeluses both death and case data it splits all nodes above the daily growth rate gt c into separate branchesfor deaths and confirmed cases We account for uncertainty in the generation interval infection-to-case-confirmation delay and the infection-to-death delay by placing priors over the parameters of thesedistributions

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1but with several key extensions First we model both confirmed cases and deaths al-lowing us to leverage significantly more data Furthermore we do not assume a specificinfection fatality rate (IFR) since we do not aim to infer the total number of COVID-19 in-fections Additionally since epidemiological parameters are only known with uncertaintywe place priors over them following recent best practice2 Concretely we place priors onthe means and standard deviationsdispersions of the generation interval the infection-to-confirmation delay and the infection-to-death delay These prior choices are detailedin Appendix A3 The end of this section details further adaptations which allow us to

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

minimise assumptions about testing reporting and the IFR Code is available online hereFor readers who wish to re-implement the model a concise list of all equations is given inAppendix H

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior distribution over R0c reflecting the wide disagreement of regionalestimates of R0

3 The mean R0c is 328 based on a meta-analysis4 We parameterize theeffectiveness of NPI i assumed to be same across countries and time with αi Each NPI isassumed to have an independent multiplicative effect on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We place an Asymmetric Laplace prior over αi with scale parameter10 asymmetry parameter 05 and location parameter 0 Our prior allows for (unbounded)positive and negative effects as we cannot a priori exclude the possibility that the introduc-tion of an NPI increases R However our prior places 80 of its mass on positive effectsreflecting a belief that NPIs are much more likely to reduce Rt than not This is a shrinkageprior placing 50 of its mass on lsquosmallrsquo effectiveness (less than 10 change in Rt ) All priordistributions are independent

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c and Rt c

(Eq 29 in5)

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We account for uncertainty inthe generation interval (GI) assumed to be a Gamma distribution by placing prior distri-butions over its mean and standard deviation (Appendix A3) Using (A2) we can writeg t c as a function g t c (Rt c microGIσGI) (see Appendix H) where microGI is the GI mean and σGI isthe GI standard deviation

aMany epidemiological models define growth rates as the exponent r in an exponential growth function Herewe use daily growth rates instead for ease of exposition These choices are mathematically equivalent Notethat we adapted equation (29) in Wallinga amp Lipsitch5 to account for our choice

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Note that the GI can change over time due to NPIs In particular the effective contact tracingand case isolation in China substantially shortened the mean GI to only 26 days among theidentified cases6 Since most cases were likely not identified even in Wuhan China7 andour study omits most of the countries with highly effective contact tracing programs suchas China or South Korea we model the GI as constant (but uncertain) over time A possibleextension for further work is to model the change in GI explicitly

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are inferred from the observation models for cases and deathsWe assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)] (A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)] (A4)

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross-validation as no reference is available for it We evaluate five dif-ferent values (σN isin 00501020304) with a fixed randomly chosen validation set of6 countries σN = 02 maximises the predictive log-likelihood on the validation set Thisensures a more calibrated model which is less likely to produce overconfident or unstableestimates10 We did not tune any other aspect of the model

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation In realitythis delay distribution is the sum of two independent distributions the incubation period(lognormal distribution) and the delay from onset of symptoms to confirmation (negativebinomial distribution) These distributions are uncertain but modelling (and placing pri-ors over) them individually leads to unidentifiability Therefore we model the total delay

bSince we treat new infections as a continuous number its initial value can be between 0 and 1

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

between infection and confirmation assuming that this follows a negative binomial dis-tribution We convert priors over the individual delays to a prior over the total delay bybootstrapping mdash please see Appendix A3

As in Flaxman et al1 the observed cases Ct c follow a negative binomial noise distribu-tion with mean C t c and an inferred dispersion parameter Ψ(C ) This distribution encodesthat small case numbers are more noisy and should therefore receive less weight Hav-ing separate dispersion parameters for cases and deaths ensures that they can be weighteddifferently if there is a difference in their noise distributions

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ) (A6)

where PD (delay) is the distribution of the delay from infection to death As for cases wemodel the total delay between infection and death as negative binomial which is in realitythe sum of two independent distributions the incubation period and the delay from onsetof symptoms to death (gamma distribution)

Finally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and an inferred dispersion parameter Ψ(D)

The model was implemented in PyMC311 We infer the unobserved variables in our modelusing the No-U-Turn Sampler (NUTS)12 a standard Markov chain Monte Carlo samplingalgorithm

Appendix A2 Testing reporting and infection fatality rates

Scaling all values of a time series by a constant does not change its growth rates Themodel is therefore invariant to the scale of the observations and consequently to country-level differences in the IFR and the ascertainment rate (the proportion of infected peoplewho are subsequently reported positive) For example assume countries A and B differ onlyin their ascertainment rates Then our model will infer a difference in N (C )

0c (Eq (A3)) butnot in the growth rates g t c across A and B Accordingly the inferred NPI effectiveness willbe identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas the

cThis is only approximately true The negative binomial output distribution has a coefficient of variation dimin-ishing with its mean ie smaller observations are relatively more noisy and carry less weight Furthermorewhilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligible effect

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

introduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimics changes in the ascertainment ratemdashnoise at time

τ affects all future casesmdashand allows for gradual multiplicative changes in ascertainment

Appendix A3 Method for uncertainty over key epidemiological parameters

We account for uncertainty in the following delay distributions the generation interval thedelay from infection to symptom onset (incubation period) the delay from symptom onsetto case confirmation and the delay from symptom onset to death

Each distribution has an uncertain mean (Table A3) whose prior distribution we takefrom a meta-analysis13 published shortly after our window of analysis This helps accountfor possible heterogeneity between different populations and therefore often finds higheruncertainty in the means than estimated in primary studies while using more data Themeta-analysis reports mean values with 95 confidence intervals We set normal priors overthe mean delays with mean equal to the reported mean in the meta-analysis and standarddeviation set such that the prior 95 credible interval includes the 95 confidence intervalsfrom the meta-analysis For example the meta-analysis reports an expected mean delayfrom symptoms to death of 1671 days with a 95 credible interval ranging from 1537 to1817 We thus assume that the prior is normal with micro= 1671 and σ= 075 which ensuresthat its 95 credible interval ranges from 1525 to 1817 strictly including the intervalreported in the meta analysis to be conservative Priors are truncated at zero

Furthermore we account for the uncertainty in the standard deviation of these delay dis-tributions by placing normal priors based on primary studies following the same methodas above (Table A4) For the standard deviation of the generation interval we use patientdata from Feretti et al14 and reproduce their method fitting a Gamma distribution to thegeneration time

Finally the distributional forms of the delay distributions are taken from the above primarystudies

The delay from infection to case confirmation is the sum of the incubation period and thesymptom-onset-to-case-confirmation delay Similarly the delay from infection to death isthe sum of the incubation period and the symptom-onset-to-death delay However forcomputational tractability we model the total delay distributions converting the priors de-scribed above to priors over the total delays We assume that the total delays follow negativebinomial distributions which are computationally tractable and empirically provide a closefit to both sum distributions We place normal priors over the mean and dispersion parame-ters of these total delay distributions by bootstrapping For example we compute priors forthe total infection-to-case-confirmation delay distribution by sampling Nbootstrap = 250 in-

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

cubation periods and symptom-onset-to-case-confirmation distributions For each sampledpair of distributions we draw 106 samples from their sum and fit a negative binomial usingmoment matching We compute the mean and standard deviation of the negative binomialdistribution parameters across the bootstrap and use this for our prior The resulting priorsare given in Table A2

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table A2 Epidemiological parameter distributions used in the model The details on sources and priorsused to generate this Table are given in Tables A3 and A4

Delay type Sources Prior on mean Prior on standard Distributionaldeviation or formdispersion

Infection to case Fonfria et al13 N (1092σ= 094) Dispersion Negativeconfirmation Lauer et al15 N (541σ= 027) binomial

Cereda et al16

Infection to death Fonfria et al13 N (2182σ= 101) Dispersion NegativeLauer et al15 N (1426σ= 518) binomialLinton et al17

Generation interval Fonfria et al13 N (506σ= 033) SD N (211σ= 05) GammaFeretti et al14

Table A3 Mean of epidemiological parameters all from meta-analysis13 and distributional forms

Delay type Reported mean Prior on mean Distributionaland 95 CI form of delay

Incubation period 579 (548ndash611) logmicrosimN (153σ= 0051) Lognormal15

Onset to death 1671 (1537ndash1817) N (1671σ= 075) Gamma17

Onset to case confirmation 582 (474ndash715) N (582σ= 068) Negative binomial16

Generation interval 506 (450ndash570) N (506σ= 033) Gamma14

Table A4 Standard deviation or dispersion of epidemiological parameters

Delay type Source Reported standard deviation or Prior on standarddispersion with uncertainty deviation or

dispersionIncubation period Lauer et al15 SD of log delay SD of log delay

048 (95 CI 027ndash054) N (048σ= 00759)Onset to death Linton et al17 SD 69 (95 CI 52ndash91) N (69σ= 1122)Onset to case confirmation Cereda et al16 Dispersion 157 plusmn 0054 N (157σ= 0054)Generation interval Feretti et al14 SD 211 plusmn 05 (reproduced by us) N (211σ= 05)

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking its predictions on unseendata even if prediction is not the purpose of the model1018 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries we have strong reason to doubt itseffectiveness estimates However we do not expect NPI effectiveness models to extrapolateperfectly Almost always unobserved factors such as changes in the ascertainment rate orIFR spontaneous behaviour changes or unobserved NPIs will affect the observed numberof cases and deaths Our models ought to treat these factors as noise and not attribute theireffects on Rt to the observed NPIs

We use Leave-One-Out Cross-Validation (LOO-CV) fitting the model on 40 countries andextrapolating to the excluded country We repeat this process for all 41 countries In theexcluded country the first 14 days of case and death data are observed to estimate R0c aswell as the initial outbreak sizes N (C )

0c and N (D)0c All later days are unobserved (masked)

The model uses the effectiveness estimates inferred from the 40 included countries and NPIactivation dates to predict the course of the pandemic in the excluded country

Figure B7 visually shows extrapolations obtained from LOO-CV for a randomly chosensubset of six countries (all countries are shown in Figures C18 to C21) The predictiveaccuracy of the model as expected degrades with an increasing prediction horizon sug-gesting the presence of unobserved factors However the predictive credible intervals arewide and increase over time as noise in the growth rate accumulates Importantly ourmodel makes calibrated forecasts in excluded countries (Fig B8) over periods of up to 2months

These are challenging predictions to the best of our knowledge no related study analyseshow their estimated NPI effects generalise to unseen countries Most related studies donot validate predictions on unseen data at all19ndash23 reviewed in9 Flaxman et al1 hold outthe last 14 days from 20 April to 4th of May for all countries in aggregate However thecountries they study implemented all NPIs in March or earlier (Supplementary Table 2 in1)meaning that in fact all countries were used to estimate NPI effects

Note that Fig B8 includes the 6 countries that were used to select the hyperparameter(Appendix A) However the calibration is similar if these 6 countries are excluded (Fig-ure C23)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals the extent to which results depend on uncertain parametersand modelling choices and can diagnose model misspecification and excessive collinearity

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

100

101

102

103

104

105

106

便便

Slovenia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

便 便

Greece

Figure B7 Model extrapolations on a randomly chosen subset of 6 excluded countries produced usingleave-one-out cross-validation In each excluded country the first 14 days of death and case data (soliddots) are observed the model then extrapolates to all future days within the period of analysis (emptydots) For each country we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier see Methods) The shaded areas representthe 95 credible intervals

in the data24 We vary many of the components of our model and recompute the NPIeffectiveness estimates summarised here Further analysis can be found in Appendix C

For each sensitivity analysis we quantify sensitivity using an easily interpretable measureshown above the legend of each plot the average standard deviation denoted σ Firstfor each NPI we compute the standard deviation of its median effect estimates across allexperimental conditions in the given sensitivity analysis We then average these acrossthe NPIs Since effectiveness is expressed in percentage reductions of Rt the unit of σ ispercent Zero percent corresponds to no sensitivity

We perform a multivariate sensitivity analysis on the key epidemiological priors For com-putational reasons all other sensitivity analyses are univariate

Multivariate sensitivity to main epidemiological priors The key epidemiological param-eters in our model describe the delay from infection to case-confirmation the delay from

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

0 20 40 60 80 100Credible Interval

0

20

40

60

80

100

Em

piric

al C

umul

ativ

e Fr

eque

ncy

Calibration Country Holdouts

Figure B8 Calibration in held-out countries with leave-one-out cross validation In each excludedcountry the first 14 days of death and case data are observed to estimate R0c as well as the initialoutbreak sizes the model then extrapolates to all future days within the period of analysis The plotshows the percentage of observed daily case and death counts that lie within the X credible intervalacross all countries and days The identity line represents ideal calibration

infection to death and the generation interval Our model places prior distributions overthe means of these delay distributions Since the priors already account for uncertainty inthe delays this analysis considers what would happen if our prior beliefs changed Sincethe epidemiological parameters may interact in unexpected ways we perform a multivari-ate sensitivity analysis jointly changing the mean of the prior over the mean of each delaydistribution (referred to as the prior mean of the mean below) We vary the prior means ofthe mean delays across a wide but epidemiologically plausible range of 5 values resultingin 125 different experimental conditions We present these results in two ways

First Figure B9 shows the global sensitivity of our results to the prior mean of the mean ofeach distribution individually For example for each setting of the prior mean of the meangeneration interval we show the average over the 5times5 = 25 runs with different prior meansplaced over the mean delays between infection and case confirmationdeath This is equiv-alent to placing a uniform prior across the 125 experiment conditions and marginalisingand is standard methodology for computing global sensitivity to an individual parameter

The prior means seem to mostly influence the relative effectiveness of business closuresand gathering bans Increasing the prior means of the infection-to-case-confirmationdeathmean delays results in larger effectiveness estimates for gatherings bans and smaller esti-

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

mates for business closures while increasing the prior mean of the generation interval hasthe opposite effect

Second we assess the sensitivity of our results to jointly varying the prior means This yieldsσ= 246 We present this result graphically in Appendix C1

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Infection to Case-Confirmation Delay= 112Mean 792 daysMean 942 daysMean 1092 daysMean 1242 daysMean 1392 days

-25 0 25 50 75Average reduction in Rtin the context of our data

Infection to Death Delay= 080Mean 1782 daysMean 1982 daysMean 2182 daysMean 2382 daysMean 2582 days

-25 0 25 50 75Average reduction in Rtin the context of our data

Generation Interval= 149Mean 306 daysMean 406 daysMean 506 daysMean 606 daysMean 706 days

Figure B9 Sensitivity to key epidemiological parameter choices evaluated using a multivariate analysisWe jointly vary the prior means over the mean of the generation interval the delay between infectionand case confirmation and the delay between infection and death Each panel displays sensitivity to theprior mean of one distribution and the NPI effects are computed by averaging over the 25 runs withdifferent prior means of the two other distributions (see text)

Sensitivity to country exclusions Figure B10 shows results if one country at a time isexcluded from the data As there is no strong justification for including or excluding anyparticular country results ought to be stable if a country is excluded

-25 0 25 50 75

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

= 151DefaultAlbaniaAndorraAustriaBelgiumBosnia andHerzegovinaBulgariaCroatia

-25 0 25 50 75

= 151DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-25 0 25 50 75

= 151DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

= 151DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-25 0 25 50 75Average reduction in Rtin the context of our data

= 151DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-25 0 25 50 75Average reduction in Rtin the context of our data

= 151DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure B10 Sensitivity to excluding individual countries

Sensitivity to case-undercounting Figure B11 shows results when scaling the recordednumber of cases in each country First we correct the number of reported cases on each day

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

using estimates of the time-varying ascertainment rate from Golding et al25 For example ifthe estimated ascertainment rate in country c at time t is 50 we double the correspondingnumber of reported cases With this altered case data the model estimates a larger effectfor closing schools and universities and a smaller effect for limiting gatherings to 1000people or less There is little change in the effects estimated for the other NPIs

Second we randomly either double or halve the reported number of cases in each countryThis is intended to demonstrate that the model is invariant to constant differences in ascer-tainment between countries As expected there are negligible differences compared to thedefault results

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Case Undercounting= 163

Time-Varying ScalingRandom Constant ScalingDefault

Figure B11 Sensitivity to case undercounting considering both constant undercounting and time-varying undercounting

Sensitivity to structurally different models Figure B12 shows the NPI effectivenessestimates from models that use only cases or deaths as observations in contrast to ourmain model which uses both As expected there are some differences in the NPI effectestimates between the models which may be caused by the unique biases present in casedata and in death data Differences could also occur if NPIs differentially affected casesand deaths (eg if they affect different age groups) Nevertheless the studyrsquos qualitativeconclusions as outlined in the Discussione hold across the different data types

A number of implicit structural assumptions are made by the choice of model structureWe test sensitivity to these assumptions by evaluating NPI effectiveness estimates from al-ternative models reproducing the structural sensitivity analysis from our concurrent workwhere these models are described and compared in detail9 While these alternative modelstructures are also plausible we chose the model described in the main text as it is relativelysimple whilst also being highly robust9

eThe main qualitative conclusions as outlined in the Discussion are Stay-at-home orders had a small effectMandating mask-wearing had a small effect School and university closures had a large effect Gatherings banswere effective More strict gathering bans were more effective than less strict ones Business closures wereeffective Closing most nonessential businesses had limited benefit over closing just high-risk businesses

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Data Type

= 347Only Case DataOnly Death DataDefault

Figure B12 Sensitivity to different data types ie models using only confirmed cases only deaths orboth

The models are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting Rt into a daily growth rate a renewalprocess is used as the infection model as in earlier work1826ndash28 For computationalreasons we do not place a prior distribution over the generation interval parametersfor this model

3 Noisy-R Model The noise terms ε(middot)ct affect Rt rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on Rt The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

As Figure B13 (left) shows the multiplicative effect models support almost all conclusionsdrawn in the Discussione The only exception is that the stay-at-home order NPI is cate-gorised as moderately effective under the Different Effects Model (median reduction in Rt

of 18)

The results of the Additive Effects Model cannot be directly compared to the other modelssince it expresses results on a different scale (percentage reduction in R0 instead of Rt ) Itis therefore shown in a separate panel However the trend in the results is visually similarto the default results

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs which were implemented nor directly measures broaderbehavioural changes Since these factors influence Rt we must be wary of their effectbeing attributed to observed NPIs We investigate this further by assessing how much effec-

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

Structural Sensitivity Multiplicative Models= 311

Noisy-RDiscrete Renewal

Different EffectsDefault

-25 0 25 50 75Average reduction in Rtas a percentage of R0

in the context of our data

Structural Sensitivity Additive Model

Figure B13 Structural sensitivity analysis NPI effectiveness estimates under different structural as-sumptions Left Effectiveness estimates for multiplicative effect models Note that for computationalreasons the discrete renewal model does not have a prior over the generation interval Right Effective-ness estimates for an additive effect models For this model the effectiveness of each NPI is expressedas a reduction of R0 rather than Rt

tiveness estimates change when previously unobserved factors are included and also whenobserved factors are excluded This is best practice for addressing unobserved factors suchas confounders2930

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs31 The timings of our observed NPIsrsquo implementation dates are indeed mutuallycorrelated prompting the question of how much results change when we make previouslyobserved NPIs unobserved Figure B14 (left) shows NPI effectiveness estimates when pre-viously observed NPIs are excluded in turn The studyrsquos main qualitative conclusionse holdacross all experimental conditions and the variations in NPI effectiveness are rarely largeenough to cause a change in effectiveness category (Figure 5) except for the Gatheringslimited to 1000 people or less NPI Considering that some of the excluded NPIs have strongestimated effects when included and are correlated with other NPIs this degree of robust-ness is encouragingly high It suggests that unobserved factors will not significantly biasresults as long as their effects and their correlations with the studied NPIs do not exceedthose of the studied NPIs We hypothesise that this robustness to unobserved factors is dueto the noise on the daily growth rate in our model9

In addition Figure B14 (right) shows NPI effectiveness estimates when we observe (iecontrol for) additional NPIs taken from the OxCGRT dataset32

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Inclusion of OxCGRT NPIs= 153

Travel ScreenQuarantineTravel BansSymptomatic TestingPublic Information CampaignsInternal Movement LimitedPublic Transport LimitedDefault

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

Exclusion of Collected NPIsUncombined Effects Shown

= 188Mask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay Home OrderDefault

Figure B14 Robustness to unobserved effects Left Results when excluding previously observed NPIsWe exclude one of the NPIs in turn and show the estimates for the other NPIs Note that this subplotshows the marginal effect for hierarchical NPIs rather than the total effect different to other figuresthat show cumulative effects for gathering bans and businesses closures For example Figure 3 displaysthe total effect of closing most nonessential businesses while here we show the additional effect ofclosing most nonessential businesses over just closing some high-risk businesses We show the additionaleffects here because the effect of a cumulative intervention would become undefined when part of it isexcluded from the analysis Right Results when controlling for previously unobserved NPIs We includeone additional NPI in turn and show the estimates for the NPIs in our study (the additional NPI is notshown)

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Appendix C1 Multivariate Sensitivity Analysis - Expanded Results

We now present the results of our joint multivariate sensitivity analysis (previously sum-marised in Figure B9) in which we vary the mean of the prior (ldquoprior meanrdquo) over themean of the generation interval the delay between infection and death and the delaybetween infection and case confirmation

Figure C15 shows for each NPI how NPI effects change when all prior means are variedsimultaneously

We find that the most sensitive NPIs are Gatherings limited to 1000 people or fewer Gath-erings limited to 100 people or fewer and Most nonessential businesses closed However thelargest differences appear when all of the parameters are set to the most extreme valuesthat jointly produce a change in a particular direction We also see systematic trends asthe parameters are varied eg increasing the prior mean of the generation interval meantends to decrease the effectiveness of the Gatherings limited to 1000 or fewer NPI while in-creasing prior means of the infection to deathcase confirmation distribution means tendsto increase the effectiveness of this NPI

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

792

942

1092

1242

1392

Mas

kWea

ring

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

GI = 306

-150

-75

00

75

150GI = 406

-150

-75

00

75

150GI = 506

-150

-75

00

75

150GI = 606

-150

-75

00

75

150GI = 706

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

00

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

0

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Som

eBus

s

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Mos

tBus

s

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Educ

at

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

792

942

1092

1242

1392

Stay

Hom

e

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

Figure C15 Global sensitivity analysis Each cell shows the difference between the median effectivenessestimate under a specific experimental condition and the default condition where the estimates areexpressed as percentage reduction in Rt Each subplot represents a particular prior mean of the meangeneration interval and an NPI The NPIs vary top to bottom and the generation interval prior meanincreases left to right Each cell with a subplot represents a particular choice of the prior mean of themean infection to death delay (increasing left to right) and the prior mean of the mean infection to caseconfirmation delay (increasing bottom to top) Positive cell values indicate that the median NPI estimateis larger than with default settings and vice versa

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C2 Sensitivity to additional epidemiological assumptions

For completeness Figure C16 shows sensitivity to two further epidemiological assumptionsOn the left we show the dependence of NPI effectiveness estimates on R0 We place aNormal prior over R0 and a hyperprior over its variance reflecting the wide disagreementof regional estimates of R0 In the sensitivity analysis we vary the mean of the prior froma low value (238) to a high one (428) As expected higher mean R0 values result in largerNPI effects This trend is apparent in all NPIs but stronger for NPIs that tended to beimplemented early on (like school closures and gathering bans)

On the right we vary the prior over NPI effectiveness Our default prior on NPI effectivenessis assymmetric reflecting a belief that NPIs are more likely to reduce Rt than increase itHere we additionally test a symmetric Normal prior and a one-sided Half-Normal priorwhich does not allow for NPIs to increase Rt

Appendix C3 Sensitivity to data preprocessing

When the case count is small a large fraction of cases may be imported from other coun-tries and the testing regime may change rapidly To prevent this from biasing our modelwe neglect case numbers before a country has reached 100 confirmed cases Similarly weneglect death numbers before a country has reached 10 deaths Here we vary these thresh-olds Intuitively the minimum cases threshold should be higher than the minimum deathsthreshold

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

R0 prior= 333

[R] = 238[R] = 278

Default[R] = 328[R] = 378[R] = 428

-25 0 25 50 75Average reduction in Rtin the context of our data

NPI Effectiveness Prior= 137

i Normal(0 022)i Half Normal(022)

Default

Figure C16 Sensitivity to additional epidemiological priors Left Sensitivity to the prior mean on R0Right Sensitivity to the NPI effectiveness prior

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Minimum Cases Threshold= 137

3050Default (100)200300

-25 0 25 50 75Average reduction in Rtin the context of our data

Minimum Deaths Threshold= 102

15Default (10)3050

Figure C17 Data preprocessing sensitivity Left Variations in the minimum number of cases RightVariations in the minimum number of deaths

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Validation by predicting unseen data - all countries

Here we show the country-level plots of the single-country holdout experiments from Ap-pendix B1 We use leave-one-out cross-validation fitting the model on 40 countries andshowing its predictions on the excluded country We repeat this process for all 41 countriesIn the excluded country the first 14 days (filled dots) of death and case data are observedto allow roughly inferring R0 These days are not used to infer NPI effectiveness All laterdays are masked (unfilled dots) The activation dates of NPIs are also given and the modeluses the effectiveness estimates inferred from the 40 other countries

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

北便

France

Figure C18 Predictions on excluded countries

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

北 便

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

Malta

Figure C19 Predictions on excluded countries

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

Slovenia

Figure C20 Predictions on excluded countries

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Switzerland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

United Kingdom

Figure C21 Predictions on excluded countries Vertical lines show the activation (or inactivation) datesof NPIs Shaded areas are 95 credible intervals Blue and red dots show the observed confirmed casesand deaths while blue and red lines show the median model estimates of cases (Ct ) and deaths (Dt )Empty dots are not observed by the model For each country we show the full window of analysis (fromthe start of the epidemic until the first NPI was lifted or the 30th of May 2020 whichever was earliersee Methods)

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Posterior predictive distributions

The posterior predictive distribution (Figure C22) shows the predicted number of cases anddeaths after observing the data Although these curves can be called lsquofitsrsquo the degree of fitto the data must be interpreted with great care The fit is generally tight but this is partlydue to the inferred latent noise variables ε(C )

t and ε(D)t Inferring this latent noise allows

the posterior predictive distribution to closely match the data without overfitting the effec-tiveness parameters to the data Such behavior is common in Bayesian models which oftenperfectly interpolate the data without overfitting33 The noise terms can account for peri-ods where infections grew faster or slower than predicted based solely on the active NPIsIn such periods the noise may account for changes in testing reporting and unobservedinterventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Daily DeathsDaily Confirmed Cases

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C22 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 Calibration without countries used for hyperparameter selec-tion

0 20 40 60 80 100Credible Interval

0

20

40

60

80

100E

mpi

rical

Cum

ulat

ive

Freq

uenc

y

Calibration Country Holdouts

Figure C23 Calibration in held-out countries with leave-one-out cross-validation Here we include onlythose 35 countries that were not used to select the hyperparameter We show the first 14 days of casesand deaths in each country and then extrapolate to future days within the period of analysis The plotshows the percentage of observed daily case and death counts that lie within the X credible intervalacross all countries and days

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 MCMC stability results

1000 1002 10040

1000

2000

3000

4000

R

0 1 2 30

500

1000

1500

2000

Relative ESS

Figure C24 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size A value of 1 indicates perfect decorrelation between samplesValues above (below) 1 indicate that the effective number of samples is higher (lower) than the actualnumber of samples due to negative (positive) correlation respectively

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D5 Estimated values for R0 by country The parenthesis give the 95 credible interval which isoften wide The mean R0 across countries is 33

Country Estimated R0 Country Estimated R0

Albania 348 (275431) Lithuania 323 (248403)Andorra 275 (21434) Malaysia 296 (236361)Austria 31 (25377) Malta 327 (248414)Belgium 36 (302427) Mexico 399 (32748)Bosnia and Herzegovina 331 (259406) Morocco 359 (299427)Bulgaria 375 (304457) Netherlands 316 (26375)Croatia 342 (27418) New Zealand 241 (17731)Czech Republic 346 (276425) Norway 273 (215338)Denmark 28 (22347) Poland 397 (32482)Estonia 291 (229361) Portugal 355 (292425)Finland 279 (221343) Romania 401 (335475)France 331 (28389) Serbia 38 (308461)Georgia 356 (281439) Singapore 295 (238356)Germany 285 (231346) Slovakia 353 (271445)Greece 321 (261388) Slovenia 299 (23373)Hungary 403 (332482) South Africa 447 (377527)Iceland 171 (113242) Spain 36 (306423)Ireland 368 (309433) Sweden 229 (176288)Israel 379 (309459) Switzerland 289 (234351)Italy 336 (288391) United Kingdom 32 (267378)Latvia 301 (233376)

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 Collinearity

Appendix D21 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the inclusion or exclusion ofthese countries in the dataset (Figure D25) If we included all countries we would con-clude that university closures were more effective than school closures (black markers inFigure D25) However if we excluded Iceland or Sweden we would conclude that theywere roughly equally effective As there is no strong justification for including or excludingone particular country we cannot meaningfully disentangle the effects of school and uni-versity closures However there is much more data to determine the joint effect and it isindeed much more stable (Figure D25)

-25 0 25 50 75 100Average reduction in Rtin the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D25 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown

Appendix D22 Co-occurrence of NPIs

Table D6 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or neither) Note thatwe do not show the traditional collinearity statistics (variance inflation factors and datacorrelations) since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D6 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D23 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations24 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus042 between closing schools anduniversities and closing some businesses (Figure D26) The weak correlations are oneindicator that collinearity is manageable with our dataset

Effect on NPI combinations To better understand posterior correlations we visualizetheir effect in hosted video files As we condition on different values for one NPI we cansee that the estimates of other NPIs change only slightly always staying well within thecredible intervals in Figure 3 The significance of posterior correlations is small enough thatit is possible to calculate a reasonable approximation to the mean effect of a set of NPIs bysimply combining the mean percentage reductions for each individual NPI (eg two 50reductions lead to a 75 reduction) For example this approximation leads to a joint effectof 75 for all NPIs together which closely matches the exact mean joint effect of 77

Videos are available online here

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D26 Posterior correlations between effectiveness parameters αi

Appendix D3 Posterior Epidemiological Parameter Distributions

Figure D27 shows the posterior distributions for variables describing the key delay distribu-tions in our model the generation interval the delay between infection and case confirma-tion and the delay between infection and death We find that the posterior distributions ofthese parameters are somewhat tighter than their priors but they still explore a wide rangeof values This suggests that the data provides evidence albeit weak about these delaydistributions (for example from visual inspection the data would show that the delay islonger for deaths than for cases) An exception is the dispersion of the infection to deathdelay distribution which has a very wide prior

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

5 6

0

1

dens

ity

Generation Interval Mean

posteriorprior

200 225

00

05

Infection to Death Delay Mean

100 125

00

05

Infection to Case-Confirmation Delay Mean

0 2 4

value

00

05

dens

ity

Generation Interval std

10 20

value

00

01

Infection to Death Delay disp

5 6

value

0

1

Infection to Case-Confirmation Delay disp

Figure D27 Posterior distributions over key epidemiological parameters describing the generation in-terval and the delays between infection and case confirmationdeath

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect morethan three-quarters of the population) We thus miss NPIs which were only implementedregionally For example a few regions in Germany implemented stay-at-home orders butmost did not Thus Germany is listed as no stay-at-home order in our data Additionallyour NPI definitions were not perfectly granular For example a gathering ban on gatheringsof gt15 people and a ban on gatherings of gt60 people would both fall under the NPIGatherings limited to 100 people or less despite likely having different effects on Rt Finally while we included more NPIs than previous work (Table F7) there are many NPIsfor which we were not able to collect enough high-quality data for our modeling such aspublic cleaning or changes to public transportation

Of the 41 countries in our dataset 33 are in Europe As a result the NPI effectivenessestimates may be biased towards effects in Europe and NPI effectiveness may have beendifferent in other parts of the world

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on Rt is constantacross countries and time However the exact implementation and adherence of each NPIis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa limited degree Additionally different countries have different cultural norms and ageprofiles affecting the degree to which a particular intervention is effective For example acountry where a higher proportion of the population is in education will likely experience alarger effect from a government order to close schools and universities Our estimates thusshould be adjusted to local circumstances To address differences between countries ourstructural sensitivity analysis includes a model where each NPI can have a different effectper country (Appendix B2) The average effectiveness estimates across countries in thismodel match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponential de-cline would have the same effect on observations as an unobserved NPI Consequently wecannot quantify its effect on our results (though the sensitivity analyses look reassuring)

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expectthat closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g t and the reproductive number Rt holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso report results from a renewal process model8 that does not depend on this assumptionand we find similar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix F Overview of previous work

Table F7 Data-driven multi-country multi-NPI studies of the effectiveness of observed (as opposed tohypothetical) NPIs in reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Banholzer etal 202023

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Chen andQiu 202021

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Islam et al202020

School closuresworkplace closuresrestrictions on massgatherings publictransport closure

lockdown

149 countries or regionsInterrupted timeseries regression

Liu et al202022

13 NPIs from theOXCGRT dataset

130 countries and regions panel regression

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F8 Some data-driven studies of the effectiveness of observed NPIs in reducing the transmissionof COVID-19 which study a single-country andor a single-NPI

Study NPIs studiedRegionscountries

studiedMethod

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dandekarand

Barbastathis202035

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Dehning etal 202036

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Gatto et al202037

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Hsiang et al202019

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US (each

country modelledindividually)

Linear regression onestimated growth

rates

Jarvis et al202038

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelKraemer etal 202039

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202040 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) model

Lai et al202041

Case detection andisolation travel

restrictions contactreductions

ChinaTravel-network-based

SEIR model

Continued on next page

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F8 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Lorch et al202042

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Maier andBrockmann

202043

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Orea andAacutelvarez202044

Lockdown SpainSpatial econometric

analysis

Quilty et al202045

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

Sears et al202046

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Siedner etal 202047

General socialdistancing

USInterruptedtime-series

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F7) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

61

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

62

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all ldquotrade activitiesrdquo (except within shopping malls) service jobsthat donrsquot involve physical contact museums libraries and galleries opened Sincethe criteria regarding ldquoall retail stores being openrdquo was met we counted April 27 asthe end date for ldquoMost nonessential businesses closedrdquo

63

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H All model equations

This section will mainly be of interest to readers that wish to re-implement the model

Variables are indexed by NPI i country c and day t All prior distributions are independent

Data

1 NPI Activations φi t c isin 012 Observed (Daily) Cases Ct c 3 Observed (Daily) Deaths D t c

Prior Distributions

1 Country-specific R0 R0c sim Normal(325κ) κsim Half Normal(micro= 0σ= 05)2 NPI effectiveness αi sim Asymmetric Laplace(m = 0κ= 05λ= 10) m is the location

parameter κgt 0 is the asymmetry parameter and λgt 0 is the scale parameter3 Infection Initial Counts

N (C )0c = exp(ζ(C )

c )

N (D)0c = exp(ζ(D)

c )

ζ(C )c sim Normal(micro= 0σ= 50)

ζ(D)c sim Normal(micro= 0σ= 50)

4 Observation Noise Dispersion Parameters

Ψcases sim Half Normal(micro= 0σ= 5) (H1)

Ψdeaths sim Half Normal(micro= 0σ= 5) (H2)

Hyperparameters

1 Growth Noise Scale σg = 02

Delay Distributions

1 Generation interval distribution1314

microGI sim Normal(micro= 506σ= 03265)

σGI sim Normal(micro= 211σ= 05)

64

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

2 Time from infection to case confirmation T (C )131516f

microinfrarrconf sim Normal(micro= 1092σ= 094)

Ψinfrarrconf sim Normal(micro= 541σ= 027)

This distribution is converted into a forward-delay vector

T (C )[t ] =

1ZC

Negative Binomial(t micro=microinfrarrconfα=Ψinfrarrconf) t lt 32

0 otherwise

with ZC =31sum

t prime=0Negative Binomial(t primemicro=microinfrarrconfα=Ψinfrarrconf)

ie the delay follows a truncated and normalised negative binomial distribution3 Time from infection to death T (D)f131517

microinfrarrdeath sim Normal(micro= 2182σ= 101)

Ψinfrarrdeath sim Normal(micro= 1426σ= 518)

This distribution is converted into a forward-delay vector

T (D)[t ] =

1ZD

Negative Binomial(t micro=microinfrarrdeathα=Ψinfrarrdeath) t lt 48

0 otherwise

with ZD =47sum

t prime=0Negative Binomial(t primemicro=microinfrarrdeathα=Ψinfrarrdeath)

ie the delay follows a truncated and normalised negative binomial distribution

Infection Model

Rt c = R0c middotexp

(minus

Isumi=1

αi φi t c

) where I is the number of NPIs

αGI =microGI

σ2GI

βGI =micro2

GI

σ2GI

g t c = exp

(βGI(R

1αGIct minus1)

)minus1

fα in the definition of the Negative Binomial distribution is the dispersion parameter Larger values of αcorrespond to a smaller variance and less dispersion With our parameterisation the variance of the Negative

Binomial distribution is micro+ micro2

α

65

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

]

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise

ε(C )τc sim Normal(micro= 0σ=σg )

ε(D)τc sim Normal(micro= 0σ=σg )

Observation Modelf

Ct c =31sumτ=0

N (C )tminusτcT

(C )[τ]

D t c =47sumτ=0

N (D)tminusτcT

(D)[τ]

Ct c sim Negative Binomial(micro= Ct c α=Ψcases)

D t c sim Negative Binomial(micro= D t c α=Ψdeaths)

66

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix I References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Sam Abbott et al ldquoEstimating the time-varying reproduction number of SARS-CoV-2using national and subnational case countsrdquo In Wellcome Open Research 5112 (2020)p 112

3 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

4 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

5 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

6 Sheikh Taslim Ali et al ldquoSerial interval of SARS-CoV-2 was shortened over time bynonpharmaceutical interventionsrdquo In Science 3696507 (2020) pp 1106ndash1109

7 Xingjie Hao et al ldquoReconstruction of the full transmission dynamics of COVID-19 inWuhanrdquo In Nature 5847821 (Aug 2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the robustness of effectiveness estimation of nonpharma-ceutical interventions against COVID-19 transmissionrdquo In arXiv preprint arXiv200713454(2020) URL httpsarxivorgabs200713454

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

12 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

13 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

14 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

67

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

15 Stephen A Lauer et al ldquoThe incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases estimation and applicationrdquo In Annals ofinternal medicine 1729 (2020) pp 577ndash582

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

19 Solomon Hsiang et al ldquoThe effect of large-scale anti-contagion policies on the COVID-19 pandemicrdquo In Nature 5847820 (2020) pp 262ndash267

20 Nazrul Islam et al ldquoPhysical distancing interventions and incidence of coronavirusdisease 2019 natural experiment in 149 countriesrdquo In bmj 370 (2020)

21 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo In Covid Economics 7 (Apr 2020) pp 46ndash67

22 Yang Liu et al ldquoThe impact of non-pharmaceutical interventions on SARS-CoV-2 trans-mission across 130 countries and territoriesrdquo In medRxiv (2020)

23 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

24 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

25 Nick Golding et al ldquoReconstructing the global dynamics of under-ascertained COVID-19 cases and infectionsrdquo In medRxiv (2020) DOI 1011012020070720148460eprint httpswwwmedrxivorgcontentearly202007082020070720148460fullpdf URL httpswwwmedrxivorgcontentearly202007082020070720148460

26 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

27 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

28 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

68

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

29 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

30 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

31 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

32 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

33 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

36 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

37 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

38 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

39 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

40 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

41 Shengjie Lai et al ldquoEffect of non-pharmaceutical interventions to contain COVID-19 inChinardquo en In Nature 5857825 (Sept 2020) pp 410ndash413 DOI 101038s41586-020-2293-x

69

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

42 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

43 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

44 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

45 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

46 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

47 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

70

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Testing reporting and infection fatality rates
      • Method for uncertainty over key epidemiological parameters
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Multivariate Sensitivity Analysis - Expanded Results
              • Sensitivity to additional epidemiological assumptions
              • Sensitivity to data preprocessing
              • Validation by predicting unseen data - all countries
              • Posterior predictive distributions
              • Calibration without countries used for hyperparameter selection
              • MCMC stability results
                • Additional results
                  • Estimated R0 by country
                  • Collinearity
                    • The individual effects of school and university closures
                    • Co-occurrence of NPIs
                    • Correlations between effectiveness estimates
                      • Posterior Epidemiological Parameter Distributions
                        • Additional discussion of assumptions and limitations
                          • Limitations of the data
                          • Model limitations
                            • Overview of previous work
                            • Handling edge cases in the data collection
                            • All model equations
                            • References
Page 5: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan

Methods

Dataset

We analyse the effects of NPIs (Table 1) in 41 countriesa (see Figure 1) We recorded NPIimplementations when the measures were implemented nationally or in most regions of acountry (affecting at least three fourths of the population) For each country the windowof analysis starts on the 22nd of January and ends after the first NPI was lifted or on the30th of May 2020 whichever was earlier The reason to end the analysis after the firstmajor reopeningb was to avoid a distribution shift For example when schools reopenedit was often with safety measures such as smaller class sizes and distancing rules It istherefore expected that contact patterns in schools will have been different before schoolclosure compared to after reopening Modelling this difference explicitly is left for futurework Data on confirmed COVID-19 cases and deaths were taken from the Johns HopkinsCSSE COVID-19 Dataset13 The data used in this study including sources are availableonline here

aThe countries were selected for the availability of reliable NPI data at the time when we started data collectionand modelling (April 2020) and for their presence in at least one of the public datasets that we used tocross-validate our collected data We excluded countries with fewer than 100 cases (or 10 deaths) by March31 as our model neglects new cases and deaths below these thresholds We also excluded a small numberof countries if there were credible media reports casting doubt on the trustworthiness of their reporting ofcases and deaths Finally we excluded very large countries like China the US and Canada for ease of datacollection as these would require more locally fine-grained data 33 of the 41 included countries are in EuropeAs a result the NPI effectiveness estimates may be biased towards effects in Europe and NPI effectiveness mayhave been different in other parts of the world

bConcretely the window of analysis extended until three days after the first reopening for confirmed cases and13 days after the first reopening for deaths These values correspond to the 5 quantile of the infection-to-confirmationdeath distributions ensuring that less than 5 of the new infections on the reopening day werestill observed in the window of analysis

5

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table 1 NPIs included in the study Appendix G details how edge cases in the data collection werehandled

NPI DescriptionMask-wearingmandatory in(some) publicspaces

A country has mandated mask usage in the public sometimes limitedto just some public spaces (which the government deems to have a highrisk of infection) For example some countries mandated mask-wearingin most or all indoor public spaces but not outdoors

Gatheringslimited to 1000people or less

A country has set a size limit on gatherings The limit is at most 1000people (often less) and gatherings above the maximum size are disal-lowed For example a ban on gatherings of 500 people or more wouldbe classified as ldquogatherings limited to 1000 or lessrdquo but a ban on gath-erings of 2000 people or more would not

Gatheringslimited to 100people or less

A country has set a size limit on gatherings The limit is at most 100people (often less)

Gatheringslimited to 10people or less

A country has set a size limit on gatherings The limit is at most 10people (often less)

Some businessesclosed

A country has specified a few kinds of customer-facing businesses thatare considered ldquohigh riskrdquo and need to suspend operations (black-list) Common examples are restaurants bars nightclubs cinemasand gyms By default businesses are not suspended

Most nonessentialbusinesses closed

A country has suspended the operations of many customer-facing busi-nesses By default customer-facing businesses are suspended unlessthey are designated as essential (whitelist)

Schools closed A country has closed most or all schoolsUniversitiesclosed

A country has closed most or all universities and higher education fa-cilities

Stay-at-homeorder (withexemptions)

An order for the general public to stay at home has been issued This ismandatory not just a recommendation Exemptions are usually grantedfor certain purposes (such as shopping exercise or going to work) ormore rarely for certain times of the day In practice a stay-at-home or-der was often accompanied or preceded by other NPIs such as businessclosures We account for these other NPIs separately eg when a coun-try issued a law that both closed businesses and ordered the populationto stay at home this is encoded as both a business closure and a stay-at-home order (with exemptions) in the data In our results we thusestimate the additional effect of the order to generally stay at home

6

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Data collection

We collected data on the start and end date of NPI implementations from the start of thepandemic until the 30th of May 2020 Before collecting the data we experimented withseveral public NPI datasets finding that they were not complete enough for our modellingand contained incorrect datesc By focusing on a smaller set of countries and NPIs than thesedatasets we were able to enforce strong quality controls We used independent doubleentry and manually compared our data to public datasets for cross-checking

First two authors independently researched each country and entered the NPI data into sep-arate spreadsheets The researchers manually researched the dates using internet searchesthere was no automatic component in the data gathering process The average time spentresearching each country per researcher was 15 hours

Second the researchers independently compared their entries to the following public datasetsand if there were conflicts visited all primary sources to resolve the conflict the EFGNPIdatabase14 the Oxford COVID-19 Government Response Tracker15 and the mask4all dataset16

Third each country and NPI was again independently entered by one to three paid con-tractors who were provided with a detailed description of the NPIs and asked to includeprimary sources with their data A researcher then resolved any conflicts between this dataand one (but not both) of the spreadsheets

Finally the two independent spreadsheets were combined and all conflicts resolved by aresearcher The final dataset contains primary sources (government websites andor mediaarticles) for each entry

Data Preprocessing

When the case count is small a large fraction of cases may be imported from other countriesand the testing regime may change rapidly To prevent this from biasing our model weneglect case numbers before a country has reached 100 confirmed cases and death numbers

cWe evaluated the following datasets

bull Epidemic Forecasting Global NPI Database14

bull Oxford COVID-19 Government Response Tracker (OxCGRT)15

bull ACAPS COVID19 Government Measures Dataset

Note that these datasets are under continuous development Many of the mistakes found will already havebeen corrected We know from our own experience that data collection can be very challenging We havethe fullest respect for the people behind these datasets In this paper we focus on a more limited set ofcountries and NPIs than these datasets contain allowing us to ensure higher data quality in this subset Givenour experience with public datasets and our data collection we encourage fellow COVID-19 researchers toindependently verify the quality of public data they use if feasible

7

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

before a country has reached 10 deaths We include these thresholds in our sensitivityanalysis (Appendix C3)

Short model description

For each country c

For each day t

New infections N (sdot)tc

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Mask wearing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closedSchools closed

Stay-at-home order

Growth reductions from interventions αi i

Daily growth rate gtc

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Uncertain delay from infection to

confirmation death

Uncertain generation interval

Universities closed

Figure 2 Model Overview Purple nodes are observed We describe the diagram from bottom to topThe effectiveness of NPI i is represented by αi which is independent of the country On each day t a countryrsquos daily reproduction number Rt c depends on the countryrsquos basic reproduction number R0cand the active NPIs The active NPIs are encoded by Φi t c which is 1 if NPI i is active in country c attime t and 0 otherwise Rt c is transformed into the daily growth rate gt c using the generation intervalparameters and subsequently is used to compute the new infections N (C )

t c and N (D)t c that will turn into

confirmed cases and deaths respectively Finally the number of new confirmed cases Ct c and deathsDt c is computed by a discrete convolution of N (middot)

t c with the respective delay distributions Our modeluses both death and case data it splits all nodes above the daily growth rate gt c into separate branchesfor deaths and confirmed cases We account for uncertainty in the generation interval infection-to-case-confirmation delay and the infection-to-death delay by placing priors over the parameters of thesedistributions

In this section we give a short summary of the model (Figure 2) The detailed modeldescription is given in Appendix A Code is available online here

Our model uses case and death data from each country to lsquobackwardsrsquo infer the number ofnew infections at each point in time which is itself used to infer the reproduction numbersNPI effects are then estimated by relating the daily reproduction numbers to the activeNPIs across all days and countries This relatively simple data-driven approach allows usto sidestep assumptions about contact patterns and intensity infectiousness of different age

8

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

groups and so forth that are typically required in modelling studies We make several addi-tions to the semi-mechanistic Bayesian hierarchical model of Flaxman et al1 allowing ourmodel to observe both cases and death data This increases the amount of data from whichwe can extract NPI effects reduces distinct biases in case and death reporting and reducesthe bias from including only countries with many deaths Since epidemiological parametersare only known with uncertainty we place priors over them following recent recommendedpractice17 Additionally as we do not aim to infer the total number of COVID-19 infectionswe can avoid assuming a specific infection fatality rate (IFR) or ascertainment rate (rate oftesting)

The growth of the epidemic is determined by the time- and country-specific reproductionnumber Rt c which depends on a) the (unobserved) basic reproduction number R0c givenno active NPIs and b) the active NPIs at time t R0c accounts for all time-invariant factorsthat affect transmission in country c such as differences in demographics population den-sity culture and health systems18 We assume that the effect of each NPI on Rt c is stableacross countries and time The effectiveness of NPI i is represented by a parameter αi over which we place an Asymmetric Laplace prior that allows for both positive and negativeeffects but places 80 of its mass on positive effects reflecting that NPIs are more likely toreduce Rt c than to increase it Following Flaxman et al and others168 each NPIrsquos effecton Rt c is assumed to independently affect Rt c as a multiplicative factor

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (1)

where φi ct = 1 indicates that NPI i is active in country c on day t (φi ct = 0 otherwise) andI is the number of NPIs The multiplicative effect encodes the plausible assumption thatNPIs have a smaller absolute effect when Rt c is already low We discuss the meaning ofeffectiveness estimates given NPI interactions in the Results section

In the early phase of an epidemic the number of new daily infections grows exponentiallyDuring exponential growth there is a one-to-one correspondence between the daily growthrate and Rt c

19 The correspondence depends on the generation interval (the time betweensuccessive infections in a chain of transmission) which we assume to have a Gamma dis-tribution The prior on the mean generation interval has mean 506 days derived from ameta-analysis20

We model the daily new infection count separately for confirmed cases and deaths repre-senting those infections which are subsequently reported and those which are subsequentlyfatal However both infection numbers are assumed to grow at the same daily rate in ex-pectation allowing the use of both data sources to estimate each αi The infection numberstranslate into reported confirmed cases and deaths after a stochastic delay The delay is thesum of two independent distributions assumed to be equal across countries the incuba-tion period and the delay from onset of symptoms to confirmation We put priors over themeans of both distributions resulting in a prior over the mean infection-to-confirmationdelay with a mean of 1092 days20 see Appendix A3 Similarly the infection-to-death

9

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

delay is the sum of the incubation period and the delay from onset of symptoms to deathand the prior over its mean has a mean of 218 days20 Finally as in related NPI models16both the reported cases and deaths follow a negative binomial noise distribution with aninferred noise dispersion

Using a Markov chain Monte Carlo (MCMC) sampling algorithm21 this model infers pos-terior distributions of each NPIrsquos effectiveness while accounting for cross-country variationsin testing reporting and fatality rates as well as uncertainty in the generation interval anddelay distributions To analyse the extent to which modelling assumptions affect the re-sults our sensitivity analysis includes all epidemiological parameters prior distributionsand many of the structural assumptions introduced above (Appendix B2 and AppendixC) MCMC convergence statistics are given in Appendix C7

Results

NPI Effectiveness

Our model enables us to estimate the individual effectiveness of each NPI expressed as apercentage reduction in R As in related work168 this percentage reduction is modelledas constant over countries and time and independent of the other implemented NPIs Inpractice however NPI effectiveness may depend on other implemented NPIs and localcircumstances Thus our effectiveness estimates ought to be interpreted as the effectivenessaveraged over the contexts in which the NPI was implemented in our data10 Our results thusgive the average NPI effectiveness across typical situations that the NPIs were implementedin Figure 3 (bottom left) visualises which NPIs typically co-occurred aiding interpretation

Under the default model settings the mean percentage reduction in R (with 95 credibleinterval) associated with each NPI is as follows (Figure 3) mandating mask-wearing in(some) public spaces -1 (-13ndash8) limiting gatherings to 1000 people or less 13(-3ndash31) to 100 people or less 28 (9ndash44) to 10 people or less 36 (17ndash53) closing some high-risk businesses 20 (0ndash40) closing most nonessential busi-nesses 29 (8ndash47) closing schools and universities 41 (23ndash56) and issuingstay-at-home orders (with exemptions) 10 (-2ndash22) Note that we cannot robustlydisentangle the individual effects of closing schools and closing universities since the im-plementation dates of these NPIs coincided nearly perfectly in all countries except Icelandand Sweden (Appendix D21) We thus show the joint effect of closing both schools anduniversities and treat schools and universities closed as one single NPI going forward

Some NPIs frequently co-occurred ie were partly collinear However we are able to iso-late the effects of individual NPIs since the collinearity is imperfect and our dataset is largeFor every pair of NPIs we observe one of them without the other for 748 country-days onaverage (Appendix D22) The minimum number of country-days for any NPI pair is 143(for limiting gatherings to 1000 or 100 attendees) Additionally under excessive collinear-ity and insufficient data to overcome it individual effectiveness estimates are highly sensi-

10

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75 100Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spaces

Gatherings limited to1000 people or less

Gatherings limited to100 people or less

Gatherings limited to10 people or less

Some businessesclosed

Most nonessentialbusinesses closed

Schools and universitiesclosed

Stay-at-home order(with exemptions)

EndsTransmission

北 便

i

j

100

100

100 81

10

0 64

100

100 45

27

100 94

78

89

65

88

93

44

29

100

100 82

92

68

91

96

46

28

100

100

100 99

76

97

98

55

30

99

97

86

100 72

95

98

49

27

99

98

91

100

100 99

99

64

30

97

95

84

95

72

100 99

49

29

98

95

81

92

68

93

100 46

28

100

100 98

10

0 94

99

99

100

Frequency i Active Given j Active

0 500 1000 1500 2000 2500 3000

Days

Total Days Active

Figure 3 Top NPI effects under default model settings The Figure shows the average percentagereductions in R as observed in our data (or in terms of the model the posterior marginal distributionsof 1minusexp(minusαi )) with median 50 and 95 credible intervals A negative 1 reduction refers to a 1increase in R Cumulative effects are shown for hierarchical NPIs (gathering bans and business closures)ie the result for Most nonessential businesses closed shows the cumulative effect of two NPIs withseparate parameters and symbols - closing some (high-risk) businesses and additionally closing mostremaining (non-high-risk but nonessential) businesses given that some businesses are already closedBottom Left Conditional activation matrix Cell values indicate the frequency that NPI i (x-axis) wasactive given that NPI j (y-axis) was active Eg schools were always closed whenever a stay-at-homeorder was active (bottom row third column from the right) but not vice versa Bottom Right Totalnumber of days each NPI was active across all countries

11

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Most businessesclosed

Stay-at-homeorder

Mask wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businessesclosed

Schools anduniversities closed

Figure 4 Combined NPI effectiveness for the most common sets of NPIs in our data by size of the NPI setShaded regions denote 50 and 95 credible intervals Left Maximum R0 that could be reduced to be-low 1 for each set of NPIs Right Predicted R after implementation of each set of NPIs assuming R0 = 38Readers can interactively explore the effects of all sets of NPIs at httpepidemicforecastingorgcalc

tive to variations in the data and model parameters22 High sensitivity prevented Flaxmanet al1 who had a smaller dataset from disentangling NPI effects9 Our estimates are sub-stantially less sensitive (see next section) Finally the posterior correlations between theeffectiveness estimates are weak suggesting manageable collinearity (Appendix D23)

Although the correlations between the individual estimates are weak we should take theminto account when evaluating combined NPI effects For example if two NPIs frequentlyco-occurred there may be more certainty about the combined effect than about the twoindividual effects Figure 4 shows the combined effectiveness of the sets of NPIs thatare most common in our data Together our set of NPIs reduced R by 77 (74ndash79)on average Across countries the mean R without any NPIs (ie the R0) was 33 (Ta-ble D5 reports R0 for all countries) Starting from this number the estimated R likely couldhave been reduced below 1 by closing schools and universities high-risk businesses andlimiting gathering sizes Readers can interactively explore the effects of sets of NPIs athttpepidemicforecastingorgcalc A CSV file containing the joint effectiveness of all NPIcombinations is available online here

Sensitivity and validation

We perform a range of validation and sensitivity experiments (Appendix B with furtherexperiments in Appendix C) First we analyse how the model extrapolates to unseen coun-tries and find that it makes calibrated forecasts over periods of up to 2 months with un-certainty increasing over time Further we perform multiple sensitivity analyses studyinghow results change if we modify the priors over epidemiological parameters exclude coun-tries from the dataset use only deaths or confirmed cases as observations vary the datapreprocessing and more Finally we investigate our key assumptions by showing results

12

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

for several alternative models (structural sensitivity10) and examine possible confoundingof our estimates by unobserved factors influencing R In total we consider 201 alternativeexperimental conditions

Figure 5 (left) shows the median NPI effectiveness across these 201 experimental condi-tions Compared to the results under our default settings (Figures 3 and 4) median NPIeffects vary under alternative plausible experimental conditions However the trends in theresults are robust and some NPIs outperform others under all tested conditions While wetest over large ranges of plausible values our experiments do not include every possiblesource of uncertainty and the results might change more substantially under experimentalconditions we have not tested

We categorise NPI effects into small moderate and large which we define as a medianreduction in R of less than 175 between 175 and 35 and more than 35 (verticallines in Figure 5) Six of the NPIs fall into a single category in a large fraction of experi-mental conditions school and university closures are associated with a large effect in 99of experimental conditions limiting gatherings to 10 people or less in 94 Closing mostnonessential businesses has a moderate effect in 96 of conditions limiting gatherings to100 people or less in 97 Making mask-wearing mandatory in (some) public spaces fallsinto the ldquosmall effectrdquo category in 100 of experimental conditions issuing stay-at-homeorders (with exemptions) in 99 Two NPIs fall less clearly into one category Closing some(high-risk) businesses has a moderate effect in 86 of conditions and limiting gatherings to1000 people or less has a small effect in 79 However both NPIs have small-to-moderateeffects in more than 99 of experimental conditions The effect of limiting gatherings to1000 people or less is the least stable across the sensitivity analyses which may reflect itsaforementioned partial collinearity with limiting gatherings to 100 people or less

Aggregating all sensitivity analyses can hide sensitivity to specific assumptions We displaythe median NPI effects in four categories of sensitivity analyses (Figure 5 right) and eachindividual sensitivity analysis is shown in the Appendix The trends in the results are alsostable within the categories

Discussion

We use a data-driven approach to estimate the effects that eight nonpharmaceutical in-terventions had on COVID-19 transmission in 41 countries between January and the endof May 2020 We find that several NPIs were associated with a clear reduction in R inline with the mounting evidence that NPIs can be effective at mitigating and suppressingoutbreaks of COVID-19 Furthermore our results suggest that some NPIs outperformedothers While the exact effectiveness estimates vary with modelling assumptions the broadconclusions discussed below are largely robust across 201 experimental conditions in 11sensitivity analyses

13

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

0 25 50Median reduction in Rt

Mask-wearing mandatory in(some) public spacesGatherings limited to1000 people or lessGatherings limited to100 people or lessGatherings limited to10 people or lessSome businessesclosedMost nonessentialbusinesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

All Sensitivity Analyses (201 conditions)北便

Model Structure(5 conditions)

北便

Data and Preprocessing(51 conditions)

0 25 50Median reduction in Rt

北便

Epidemiological Priors(131 conditions)

0 25 50Median reduction in Rt

北便

Unobserved Factors(14 conditions)

Figure 5 Median NPI effects across the sensitivity analyses Left Median NPI effects (reduction in R)when varying different components of the model or the data in 201 experimental conditions Results aredisplayed as violin plots using kernel density estimation to create the distributions Inside the violinsthe box plots show median and interquartile-range The vertical lines mark 0 175 and 35 (seetext) Right Categorised sensitivity analyses Structural Using only cases or only deaths as observations(2 experimental conditions Figure B12) varying the model structure (3 conditions Figure B13 left)Data Leaving out one country at a time (41 conditions Figure B10) varying the threshold below whichcases and deaths are masked (8 conditions Figure C17) sensitivity to correcting for undocumentedcases and to country-level differences in case ascertainment (2 conditions Figure B11) Epidemiologicalpriors Jointly varying the means of the priors over the means of the generation interval the infection-to-case-confirmation delay and the infection-to-death delay (125 conditions Figure C15) varying theprior over R0 (4 conditions Figure C16 left) varying the prior over NPI effectiveness (2 conditionsFigure C16 right) Unobserved factors Excluding observed NPIs one at a time (9 conditions Figure B14left) controlling for additional NPIs from a different dataset (5 conditions Figure B14 right)

14

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Business closures and gathering bans both seem to have been effective at reducing COVID-19 transmission Closing only high-risk businesses appears to have been only somewhat lesseffective than closing most nonessential businesses the median reduction in R differs onlyby 9 (6ndash12 mean and 95 interval of median estimates across the experiments set-tings in Figure 5) Closing only high-risk businesses may thus have been the more promisingpolicy option in some circumstances Limiting gatherings to 10 people or less was more ef-fective than limits up to 100 or 1000 people This may reflect the fact that small gatheringsare common

As previously discussed we estimate the average effect each NPI had in the contexts inwhich it was implemented When countries introduced stay-at-home orders they nearly al-ways also banned gatherings and closed schools universities and some or most businessesif they had not done so already (Figure 3 bottom left) Flaxman et al1 and Hsiang et al3

add the effect of these distinct NPIs to the effectiveness of stay-at-home orders and accord-ingly find a large effect In contrast we account for these other NPIs separately and isolatethe additional effect of ordering the population to stay at home (when large gatheringsare banned and educational institutions and some businesses closed)d In accordance withother studies that took this approach we find a small effect26 A typical country could havereduced R to below 1 without a stay-at-home order (Figure 4) provided other NPIs wereimplemented

Mandating mask-wearing in various public spaces had no clear effect on average in thecountries we studied This does not rule out mask-wearing mandates having a larger ef-fect in other contexts In our data mask-wearing was only mandated when other NPIshad already reduced public interactions When most transmission occurs in private spaceswearing masks in public is expected to be less effective This might explain why a largereffect was found in studies that included China and South Korea where mask-wearingwas introduced earlier823 While there is an emerging body of literature indicating thatmask-wearing can be effective in reducing transmission the bulk of evidence comes fromhealthcare settings24 In non-healthcare settings risk compensation25 may play a largerrole potentially reducing effectiveness While our results cast doubt on reports that mask-wearing is the main determinant shaping a countryrsquos epidemic23 the policy still seemspromising given all available evidence due to its comparatively low economic and socialcosts Its effectiveness may have increased as other NPIs have been lifted and public inter-actions have recommenced

We find a large effect for school and university closures This finding is remarkably robustacross different model structures variations in the data and epidemiological assumptions(Figure 5) It remains robust when controlling for NPIs excluded from our study (Fig-ure B14) Our approach cannot distinguish direct and indirect effects such as forcing par-

dIn principle a stay-at-home order does not imply these other NPIs It is possible for a country to simultaneouslyhave allowed schools to be open and have a stay-at-home order (with exemptions for attending school) inplace However this is not how stay-at-home orders were implemented by the countries in our dataset andwe cannot estimate the effect of such a hypothetical stay-at-home order from the available data

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ents to stay at home or causing broader behaviour changes by increasing public concernAdditionally since school and university closures almost perfectly coincided in the coun-tries we study our approach cannot distinguish their individual effects (Appendix D21)This limitation likely also holds for other observational studies which do not include dataon university closures and estimate only the effect of school closures1ndash35ndash8 Previous evi-dence on school and university closures is mixed1626 Early data suggest that children andyoung adults are equally susceptible to infection but have a notably lower observed inci-dence rate than older adultsmdashwhether this is due to school and university closures remainsunknown27ndash30 Although infected young people are often asymptomatic they appear toshed similar amounts of virus as older people3132 and might therefore transmit the infec-tion to higher-risk demographics unknowingly As the role of children in transmission isstill unclear33 while outbreaks detected in schools are rising33ndash35 this topic merits carefulattention

Our study has several limitations First NPI effectiveness may depend on the context ofimplementation such as the presence of other NPIs and country-specific factors Our esti-mates must be interpreted as the average effectiveness over the contexts in our dataset10and expert judgement is required to adjust them to local circumstances Second R mayhave been reduced by unobserved NPIs or spontaneous behaviour changes To investigatewhether these reductions could be falsely attributed to the observed NPIs we perform sev-eral additional analyses and find that our results are stable to a range of unobserved effects(Appendix B3) However this sensitivity check cannot provide certainty Investigating therole of unobserved effects is an important topic to explore further Third our results can-not be used without qualification to predict the effect of lifting NPIs For example closingschools and universities seems to have greatly reduced transmission but this does not meanthat reopening them will cause infections to soar Educational institutions can implementsafety measures such as reduced class sizes as they reopen Further work is needed to anal-yse the effects of reopenings we hope that our collected data aids this effort Fourth whilewe included more NPIs than most previous work (Table F7) several promising NPIs wereexcluded For example testing tracing and case isolation may be an important part of acost-effective epidemic response36 but were not included because it is difficult to obtaincomprehensive data We discuss further limitations in Appendix E

Although our work focuses on estimating the impact of NPIs on the reproduction numberR the ultimate goal of governments may be to reduce the incidence prevalence and excessmortality of COVID-19 Controlling R is essential but the contribution of NPIs towards thesegoals may also be mediated by other factors such as their duration and timing37 periodicityand adherence3839 and successful containment40 While each of these factors addressestransmission within individual countries it can be crucial to additionally synchronise NPIsbetween countries since cases can be imported41

In conclusion as governments around the world seek to keep R below 1 while minimisingthe social and economic costs of their interventions we hope that our results can informpolicy decisions on which NPIs to implement in any potential further wave of infectionsAdditionally our work may provide insight on which areas of public life are most in need

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

of restructuring so that they can continue despite the pandemic However our estimatesshould not be seen as the final word on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies simulation studiesexperimental trials and clinical experience

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

The paid contractor work helping with the data collection the development of the interac-tive website and the costs for cloud compute were funded by the Berkeley Existential RiskInitiative

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier V Mikulik MAHartwick L Chindelevitch designed the model and modelling experiments M Sharma ABStephenson T Gavenciak J Salvatier performed and analysed the modelling experimentsJ Kulveit T Gavenciak JM Brauner conceived the research S Mindermann M SharmaJM Brauner L Chindelevitch J Kulveit T Besiroglu did the literature search JM BraunerS Mindermann M Sharma G Leech L Chindelevitch T Besiroglu V Mikulik wrote themanuscript All authors read and gave feedback on the manuscript and approved the finalmanuscript JM Brauner S Mindermann and M Sharma contributed equally L Chindele-vitch Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

3 Solomon Hsiang et al ldquoThe effect of large-scale anti-contagion policies on the COVID-19 pandemicrdquo In Nature 5847820 (2020) pp 262ndash267

4 Shengjie Lai et al ldquoEffect of non-pharmaceutical interventions to contain COVID-19 inChinardquo en In Nature 5857825 (Sept 2020) pp 410ndash413 DOI 101038s41586-020-2293-x

5 Yang Liu et al ldquoThe impact of non-pharmaceutical interventions on SARS-CoV-2 trans-mission across 130 countries and territoriesrdquo In medRxiv (2020)

6 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

7 Nazrul Islam et al ldquoPhysical distancing interventions and incidence of coronavirusdisease 2019 natural experiment in 149 countriesrdquo In bmj 370 (2020)

8 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo In Covid Economics 7 (Apr 2020) pp 46ndash67

9 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

10 Mrinank Sharma et al ldquoOn the robustness of effectiveness estimation of nonpharma-ceutical interventions against COVID-19 transmissionrdquo In arXiv preprint arXiv200713454(2020) URL httpsarxivorgabs200713454

11 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

12 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

13 Ensheng Dong Hongru Du and Lauren Gardner ldquoAn interactive web-based dashboardto track COVID-19 in real timerdquo In The Lancet Infectious Diseases 205 (May 2020)pp 533ndash534 DOI 101016s1473-3099(20)30120-1

14 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

15 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

16 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

17 Sam Abbott et al ldquoEstimating the time-varying reproduction number of SARS-CoV-2using national and subnational case countsrdquo In Wellcome Open Research 5112 (2020)p 112

18 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

19 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

20 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Young Joon Park et al ldquoContact tracing during coronavirus disease outbreak SouthKorea 2020rdquo In Emerg Infect Dis 2610 (2020)

28 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

29 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

30 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

31 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

32 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

33 Chen Stein-Zamir et al ldquoA large COVID-19 outbreak in a high school 10 days afterschoolsrsquo reopening Israel May 2020rdquo In Eurosurveillance 2529 (2020) p 2001352

34 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 38 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile919676Weekly_COVID19_Surveillance_Report_week_38_FINAL_UPDATEDpdf 2020

35 Meira Levinson Muge Cevik and Marc Lipsitch Reopening Primary Schools during thePandemic 2020

36 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

37 K Prem et al ldquoThe effect of control strategies to reduce social mixing on outcomes ofthe COVID-19 epidemic in Wuhan China a modelling studyrdquo In Lancet Public Health5 (5 May 2020) e261ndashe270

38 Patrick G T Walker et al ldquoThe impact of COVID-19 and strategies for mitigationand suppression in low- and middle-income countriesrdquo In Science 3696502 (2020)pp 413ndash422

39 Nicholas G Davies et al ldquoEffects of non-pharmaceutical interventions on COVID-19cases deaths and demand for hospital services in the UK a modelling studyrdquo In TheLancet Public Health 57 (2020) e375ndashe385

40 Xingjie Hao et al ldquoReconstruction of the full transmission dynamics of COVID-19 inWuhanrdquo In Nature 5847821 (Aug 2020)

41 N W Ruktanonchai et al ldquoAssessing the impact of coordinated COVID-19 exit strate-gies across Europerdquo In Science (2020) DOI 101126scienceabc5096

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 23

Appendix A1 Detailed model description 23

Appendix A2 Testing reporting and infection fatality rates 26

Appendix A3 Method for uncertainty over key epidemiological parameters 27

Appendix B Validation 30

Appendix B1 Unseen data 30

Appendix B2 Sensitivity Analysis 30

Appendix B3 Robustness to unobserved effects 35

Appendix C Additional sensitivity analyses and validation 38

Appendix C1 Multivariate Sensitivity Analysis - Expanded Results 38

Appendix C2 Sensitivity to additional epidemiological assumptions 40

Appendix C3 Sensitivity to data preprocessing 40

Appendix C4 Validation by predicting unseen data - all countries 42

Appendix C5 Posterior predictive distributions 47

Appendix C6 Calibration without countries used for hyperparameter selection 48

Appendix C7 MCMC stability results 49

Appendix D Additional results 50

Appendix D1 Estimated R0 by country 50

Appendix D2 Collinearity 51

Appendix D3 Posterior Epidemiological Parameter Distributions 53

Appendix E Additional discussion of assumptions and limitations 55

Appendix E1 Limitations of the data 55

Appendix E2 Model limitations 55

Appendix F Overview of previous work 57

Appendix G Handling edge cases in the data collection 60

Appendix H All model equations 64

Appendix I References 67

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections N (sdot)tc

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Mask wearing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closedSchools closed

Stay-at-home order

Growth reductions from interventions αi i

Daily growth rate gtc

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Uncertain delay from infection to

confirmation death

Uncertain generation interval

Universities closed

Figure A6 Model Overview Purple nodes are observed We describe the diagram from bottom to topThe effectiveness of NPI i is represented by αi which is independent of the country On each day t a countryrsquos daily reproduction number Rt c depends on the countryrsquos basic reproduction number R0cand the active NPIs The active NPIs are encoded by Φi t c which is 1 if NPI i is active in country c attime t and 0 otherwise Rt c is transformed into the daily growth rate gt c using the generation intervalparameters and subsequently is used to compute the new infections N (C )

t c and N (D)t c that will turn into

confirmed cases and deaths respectively Finally the number of new confirmed cases Ct c and deathsDt c is computed by a discrete convolution of N (middot)

t c with the respective delay distributions Our modeluses both death and case data it splits all nodes above the daily growth rate gt c into separate branchesfor deaths and confirmed cases We account for uncertainty in the generation interval infection-to-case-confirmation delay and the infection-to-death delay by placing priors over the parameters of thesedistributions

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1but with several key extensions First we model both confirmed cases and deaths al-lowing us to leverage significantly more data Furthermore we do not assume a specificinfection fatality rate (IFR) since we do not aim to infer the total number of COVID-19 in-fections Additionally since epidemiological parameters are only known with uncertaintywe place priors over them following recent best practice2 Concretely we place priors onthe means and standard deviationsdispersions of the generation interval the infection-to-confirmation delay and the infection-to-death delay These prior choices are detailedin Appendix A3 The end of this section details further adaptations which allow us to

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

minimise assumptions about testing reporting and the IFR Code is available online hereFor readers who wish to re-implement the model a concise list of all equations is given inAppendix H

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior distribution over R0c reflecting the wide disagreement of regionalestimates of R0

3 The mean R0c is 328 based on a meta-analysis4 We parameterize theeffectiveness of NPI i assumed to be same across countries and time with αi Each NPI isassumed to have an independent multiplicative effect on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We place an Asymmetric Laplace prior over αi with scale parameter10 asymmetry parameter 05 and location parameter 0 Our prior allows for (unbounded)positive and negative effects as we cannot a priori exclude the possibility that the introduc-tion of an NPI increases R However our prior places 80 of its mass on positive effectsreflecting a belief that NPIs are much more likely to reduce Rt than not This is a shrinkageprior placing 50 of its mass on lsquosmallrsquo effectiveness (less than 10 change in Rt ) All priordistributions are independent

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c and Rt c

(Eq 29 in5)

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We account for uncertainty inthe generation interval (GI) assumed to be a Gamma distribution by placing prior distri-butions over its mean and standard deviation (Appendix A3) Using (A2) we can writeg t c as a function g t c (Rt c microGIσGI) (see Appendix H) where microGI is the GI mean and σGI isthe GI standard deviation

aMany epidemiological models define growth rates as the exponent r in an exponential growth function Herewe use daily growth rates instead for ease of exposition These choices are mathematically equivalent Notethat we adapted equation (29) in Wallinga amp Lipsitch5 to account for our choice

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Note that the GI can change over time due to NPIs In particular the effective contact tracingand case isolation in China substantially shortened the mean GI to only 26 days among theidentified cases6 Since most cases were likely not identified even in Wuhan China7 andour study omits most of the countries with highly effective contact tracing programs suchas China or South Korea we model the GI as constant (but uncertain) over time A possibleextension for further work is to model the change in GI explicitly

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are inferred from the observation models for cases and deathsWe assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)] (A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)] (A4)

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross-validation as no reference is available for it We evaluate five dif-ferent values (σN isin 00501020304) with a fixed randomly chosen validation set of6 countries σN = 02 maximises the predictive log-likelihood on the validation set Thisensures a more calibrated model which is less likely to produce overconfident or unstableestimates10 We did not tune any other aspect of the model

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation In realitythis delay distribution is the sum of two independent distributions the incubation period(lognormal distribution) and the delay from onset of symptoms to confirmation (negativebinomial distribution) These distributions are uncertain but modelling (and placing pri-ors over) them individually leads to unidentifiability Therefore we model the total delay

bSince we treat new infections as a continuous number its initial value can be between 0 and 1

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

between infection and confirmation assuming that this follows a negative binomial dis-tribution We convert priors over the individual delays to a prior over the total delay bybootstrapping mdash please see Appendix A3

As in Flaxman et al1 the observed cases Ct c follow a negative binomial noise distribu-tion with mean C t c and an inferred dispersion parameter Ψ(C ) This distribution encodesthat small case numbers are more noisy and should therefore receive less weight Hav-ing separate dispersion parameters for cases and deaths ensures that they can be weighteddifferently if there is a difference in their noise distributions

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ) (A6)

where PD (delay) is the distribution of the delay from infection to death As for cases wemodel the total delay between infection and death as negative binomial which is in realitythe sum of two independent distributions the incubation period and the delay from onsetof symptoms to death (gamma distribution)

Finally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and an inferred dispersion parameter Ψ(D)

The model was implemented in PyMC311 We infer the unobserved variables in our modelusing the No-U-Turn Sampler (NUTS)12 a standard Markov chain Monte Carlo samplingalgorithm

Appendix A2 Testing reporting and infection fatality rates

Scaling all values of a time series by a constant does not change its growth rates Themodel is therefore invariant to the scale of the observations and consequently to country-level differences in the IFR and the ascertainment rate (the proportion of infected peoplewho are subsequently reported positive) For example assume countries A and B differ onlyin their ascertainment rates Then our model will infer a difference in N (C )

0c (Eq (A3)) butnot in the growth rates g t c across A and B Accordingly the inferred NPI effectiveness willbe identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas the

cThis is only approximately true The negative binomial output distribution has a coefficient of variation dimin-ishing with its mean ie smaller observations are relatively more noisy and carry less weight Furthermorewhilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligible effect

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

introduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimics changes in the ascertainment ratemdashnoise at time

τ affects all future casesmdashand allows for gradual multiplicative changes in ascertainment

Appendix A3 Method for uncertainty over key epidemiological parameters

We account for uncertainty in the following delay distributions the generation interval thedelay from infection to symptom onset (incubation period) the delay from symptom onsetto case confirmation and the delay from symptom onset to death

Each distribution has an uncertain mean (Table A3) whose prior distribution we takefrom a meta-analysis13 published shortly after our window of analysis This helps accountfor possible heterogeneity between different populations and therefore often finds higheruncertainty in the means than estimated in primary studies while using more data Themeta-analysis reports mean values with 95 confidence intervals We set normal priors overthe mean delays with mean equal to the reported mean in the meta-analysis and standarddeviation set such that the prior 95 credible interval includes the 95 confidence intervalsfrom the meta-analysis For example the meta-analysis reports an expected mean delayfrom symptoms to death of 1671 days with a 95 credible interval ranging from 1537 to1817 We thus assume that the prior is normal with micro= 1671 and σ= 075 which ensuresthat its 95 credible interval ranges from 1525 to 1817 strictly including the intervalreported in the meta analysis to be conservative Priors are truncated at zero

Furthermore we account for the uncertainty in the standard deviation of these delay dis-tributions by placing normal priors based on primary studies following the same methodas above (Table A4) For the standard deviation of the generation interval we use patientdata from Feretti et al14 and reproduce their method fitting a Gamma distribution to thegeneration time

Finally the distributional forms of the delay distributions are taken from the above primarystudies

The delay from infection to case confirmation is the sum of the incubation period and thesymptom-onset-to-case-confirmation delay Similarly the delay from infection to death isthe sum of the incubation period and the symptom-onset-to-death delay However forcomputational tractability we model the total delay distributions converting the priors de-scribed above to priors over the total delays We assume that the total delays follow negativebinomial distributions which are computationally tractable and empirically provide a closefit to both sum distributions We place normal priors over the mean and dispersion parame-ters of these total delay distributions by bootstrapping For example we compute priors forthe total infection-to-case-confirmation delay distribution by sampling Nbootstrap = 250 in-

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

cubation periods and symptom-onset-to-case-confirmation distributions For each sampledpair of distributions we draw 106 samples from their sum and fit a negative binomial usingmoment matching We compute the mean and standard deviation of the negative binomialdistribution parameters across the bootstrap and use this for our prior The resulting priorsare given in Table A2

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table A2 Epidemiological parameter distributions used in the model The details on sources and priorsused to generate this Table are given in Tables A3 and A4

Delay type Sources Prior on mean Prior on standard Distributionaldeviation or formdispersion

Infection to case Fonfria et al13 N (1092σ= 094) Dispersion Negativeconfirmation Lauer et al15 N (541σ= 027) binomial

Cereda et al16

Infection to death Fonfria et al13 N (2182σ= 101) Dispersion NegativeLauer et al15 N (1426σ= 518) binomialLinton et al17

Generation interval Fonfria et al13 N (506σ= 033) SD N (211σ= 05) GammaFeretti et al14

Table A3 Mean of epidemiological parameters all from meta-analysis13 and distributional forms

Delay type Reported mean Prior on mean Distributionaland 95 CI form of delay

Incubation period 579 (548ndash611) logmicrosimN (153σ= 0051) Lognormal15

Onset to death 1671 (1537ndash1817) N (1671σ= 075) Gamma17

Onset to case confirmation 582 (474ndash715) N (582σ= 068) Negative binomial16

Generation interval 506 (450ndash570) N (506σ= 033) Gamma14

Table A4 Standard deviation or dispersion of epidemiological parameters

Delay type Source Reported standard deviation or Prior on standarddispersion with uncertainty deviation or

dispersionIncubation period Lauer et al15 SD of log delay SD of log delay

048 (95 CI 027ndash054) N (048σ= 00759)Onset to death Linton et al17 SD 69 (95 CI 52ndash91) N (69σ= 1122)Onset to case confirmation Cereda et al16 Dispersion 157 plusmn 0054 N (157σ= 0054)Generation interval Feretti et al14 SD 211 plusmn 05 (reproduced by us) N (211σ= 05)

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking its predictions on unseendata even if prediction is not the purpose of the model1018 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries we have strong reason to doubt itseffectiveness estimates However we do not expect NPI effectiveness models to extrapolateperfectly Almost always unobserved factors such as changes in the ascertainment rate orIFR spontaneous behaviour changes or unobserved NPIs will affect the observed numberof cases and deaths Our models ought to treat these factors as noise and not attribute theireffects on Rt to the observed NPIs

We use Leave-One-Out Cross-Validation (LOO-CV) fitting the model on 40 countries andextrapolating to the excluded country We repeat this process for all 41 countries In theexcluded country the first 14 days of case and death data are observed to estimate R0c aswell as the initial outbreak sizes N (C )

0c and N (D)0c All later days are unobserved (masked)

The model uses the effectiveness estimates inferred from the 40 included countries and NPIactivation dates to predict the course of the pandemic in the excluded country

Figure B7 visually shows extrapolations obtained from LOO-CV for a randomly chosensubset of six countries (all countries are shown in Figures C18 to C21) The predictiveaccuracy of the model as expected degrades with an increasing prediction horizon sug-gesting the presence of unobserved factors However the predictive credible intervals arewide and increase over time as noise in the growth rate accumulates Importantly ourmodel makes calibrated forecasts in excluded countries (Fig B8) over periods of up to 2months

These are challenging predictions to the best of our knowledge no related study analyseshow their estimated NPI effects generalise to unseen countries Most related studies donot validate predictions on unseen data at all19ndash23 reviewed in9 Flaxman et al1 hold outthe last 14 days from 20 April to 4th of May for all countries in aggregate However thecountries they study implemented all NPIs in March or earlier (Supplementary Table 2 in1)meaning that in fact all countries were used to estimate NPI effects

Note that Fig B8 includes the 6 countries that were used to select the hyperparameter(Appendix A) However the calibration is similar if these 6 countries are excluded (Fig-ure C23)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals the extent to which results depend on uncertain parametersand modelling choices and can diagnose model misspecification and excessive collinearity

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

100

101

102

103

104

105

106

便便

Slovenia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

便 便

Greece

Figure B7 Model extrapolations on a randomly chosen subset of 6 excluded countries produced usingleave-one-out cross-validation In each excluded country the first 14 days of death and case data (soliddots) are observed the model then extrapolates to all future days within the period of analysis (emptydots) For each country we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier see Methods) The shaded areas representthe 95 credible intervals

in the data24 We vary many of the components of our model and recompute the NPIeffectiveness estimates summarised here Further analysis can be found in Appendix C

For each sensitivity analysis we quantify sensitivity using an easily interpretable measureshown above the legend of each plot the average standard deviation denoted σ Firstfor each NPI we compute the standard deviation of its median effect estimates across allexperimental conditions in the given sensitivity analysis We then average these acrossthe NPIs Since effectiveness is expressed in percentage reductions of Rt the unit of σ ispercent Zero percent corresponds to no sensitivity

We perform a multivariate sensitivity analysis on the key epidemiological priors For com-putational reasons all other sensitivity analyses are univariate

Multivariate sensitivity to main epidemiological priors The key epidemiological param-eters in our model describe the delay from infection to case-confirmation the delay from

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

0 20 40 60 80 100Credible Interval

0

20

40

60

80

100

Em

piric

al C

umul

ativ

e Fr

eque

ncy

Calibration Country Holdouts

Figure B8 Calibration in held-out countries with leave-one-out cross validation In each excludedcountry the first 14 days of death and case data are observed to estimate R0c as well as the initialoutbreak sizes the model then extrapolates to all future days within the period of analysis The plotshows the percentage of observed daily case and death counts that lie within the X credible intervalacross all countries and days The identity line represents ideal calibration

infection to death and the generation interval Our model places prior distributions overthe means of these delay distributions Since the priors already account for uncertainty inthe delays this analysis considers what would happen if our prior beliefs changed Sincethe epidemiological parameters may interact in unexpected ways we perform a multivari-ate sensitivity analysis jointly changing the mean of the prior over the mean of each delaydistribution (referred to as the prior mean of the mean below) We vary the prior means ofthe mean delays across a wide but epidemiologically plausible range of 5 values resultingin 125 different experimental conditions We present these results in two ways

First Figure B9 shows the global sensitivity of our results to the prior mean of the mean ofeach distribution individually For example for each setting of the prior mean of the meangeneration interval we show the average over the 5times5 = 25 runs with different prior meansplaced over the mean delays between infection and case confirmationdeath This is equiv-alent to placing a uniform prior across the 125 experiment conditions and marginalisingand is standard methodology for computing global sensitivity to an individual parameter

The prior means seem to mostly influence the relative effectiveness of business closuresand gathering bans Increasing the prior means of the infection-to-case-confirmationdeathmean delays results in larger effectiveness estimates for gatherings bans and smaller esti-

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

mates for business closures while increasing the prior mean of the generation interval hasthe opposite effect

Second we assess the sensitivity of our results to jointly varying the prior means This yieldsσ= 246 We present this result graphically in Appendix C1

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Infection to Case-Confirmation Delay= 112Mean 792 daysMean 942 daysMean 1092 daysMean 1242 daysMean 1392 days

-25 0 25 50 75Average reduction in Rtin the context of our data

Infection to Death Delay= 080Mean 1782 daysMean 1982 daysMean 2182 daysMean 2382 daysMean 2582 days

-25 0 25 50 75Average reduction in Rtin the context of our data

Generation Interval= 149Mean 306 daysMean 406 daysMean 506 daysMean 606 daysMean 706 days

Figure B9 Sensitivity to key epidemiological parameter choices evaluated using a multivariate analysisWe jointly vary the prior means over the mean of the generation interval the delay between infectionand case confirmation and the delay between infection and death Each panel displays sensitivity to theprior mean of one distribution and the NPI effects are computed by averaging over the 25 runs withdifferent prior means of the two other distributions (see text)

Sensitivity to country exclusions Figure B10 shows results if one country at a time isexcluded from the data As there is no strong justification for including or excluding anyparticular country results ought to be stable if a country is excluded

-25 0 25 50 75

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

= 151DefaultAlbaniaAndorraAustriaBelgiumBosnia andHerzegovinaBulgariaCroatia

-25 0 25 50 75

= 151DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-25 0 25 50 75

= 151DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

= 151DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-25 0 25 50 75Average reduction in Rtin the context of our data

= 151DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-25 0 25 50 75Average reduction in Rtin the context of our data

= 151DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure B10 Sensitivity to excluding individual countries

Sensitivity to case-undercounting Figure B11 shows results when scaling the recordednumber of cases in each country First we correct the number of reported cases on each day

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

using estimates of the time-varying ascertainment rate from Golding et al25 For example ifthe estimated ascertainment rate in country c at time t is 50 we double the correspondingnumber of reported cases With this altered case data the model estimates a larger effectfor closing schools and universities and a smaller effect for limiting gatherings to 1000people or less There is little change in the effects estimated for the other NPIs

Second we randomly either double or halve the reported number of cases in each countryThis is intended to demonstrate that the model is invariant to constant differences in ascer-tainment between countries As expected there are negligible differences compared to thedefault results

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Case Undercounting= 163

Time-Varying ScalingRandom Constant ScalingDefault

Figure B11 Sensitivity to case undercounting considering both constant undercounting and time-varying undercounting

Sensitivity to structurally different models Figure B12 shows the NPI effectivenessestimates from models that use only cases or deaths as observations in contrast to ourmain model which uses both As expected there are some differences in the NPI effectestimates between the models which may be caused by the unique biases present in casedata and in death data Differences could also occur if NPIs differentially affected casesand deaths (eg if they affect different age groups) Nevertheless the studyrsquos qualitativeconclusions as outlined in the Discussione hold across the different data types

A number of implicit structural assumptions are made by the choice of model structureWe test sensitivity to these assumptions by evaluating NPI effectiveness estimates from al-ternative models reproducing the structural sensitivity analysis from our concurrent workwhere these models are described and compared in detail9 While these alternative modelstructures are also plausible we chose the model described in the main text as it is relativelysimple whilst also being highly robust9

eThe main qualitative conclusions as outlined in the Discussion are Stay-at-home orders had a small effectMandating mask-wearing had a small effect School and university closures had a large effect Gatherings banswere effective More strict gathering bans were more effective than less strict ones Business closures wereeffective Closing most nonessential businesses had limited benefit over closing just high-risk businesses

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Data Type

= 347Only Case DataOnly Death DataDefault

Figure B12 Sensitivity to different data types ie models using only confirmed cases only deaths orboth

The models are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting Rt into a daily growth rate a renewalprocess is used as the infection model as in earlier work1826ndash28 For computationalreasons we do not place a prior distribution over the generation interval parametersfor this model

3 Noisy-R Model The noise terms ε(middot)ct affect Rt rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on Rt The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

As Figure B13 (left) shows the multiplicative effect models support almost all conclusionsdrawn in the Discussione The only exception is that the stay-at-home order NPI is cate-gorised as moderately effective under the Different Effects Model (median reduction in Rt

of 18)

The results of the Additive Effects Model cannot be directly compared to the other modelssince it expresses results on a different scale (percentage reduction in R0 instead of Rt ) Itis therefore shown in a separate panel However the trend in the results is visually similarto the default results

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs which were implemented nor directly measures broaderbehavioural changes Since these factors influence Rt we must be wary of their effectbeing attributed to observed NPIs We investigate this further by assessing how much effec-

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

Structural Sensitivity Multiplicative Models= 311

Noisy-RDiscrete Renewal

Different EffectsDefault

-25 0 25 50 75Average reduction in Rtas a percentage of R0

in the context of our data

Structural Sensitivity Additive Model

Figure B13 Structural sensitivity analysis NPI effectiveness estimates under different structural as-sumptions Left Effectiveness estimates for multiplicative effect models Note that for computationalreasons the discrete renewal model does not have a prior over the generation interval Right Effective-ness estimates for an additive effect models For this model the effectiveness of each NPI is expressedas a reduction of R0 rather than Rt

tiveness estimates change when previously unobserved factors are included and also whenobserved factors are excluded This is best practice for addressing unobserved factors suchas confounders2930

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs31 The timings of our observed NPIsrsquo implementation dates are indeed mutuallycorrelated prompting the question of how much results change when we make previouslyobserved NPIs unobserved Figure B14 (left) shows NPI effectiveness estimates when pre-viously observed NPIs are excluded in turn The studyrsquos main qualitative conclusionse holdacross all experimental conditions and the variations in NPI effectiveness are rarely largeenough to cause a change in effectiveness category (Figure 5) except for the Gatheringslimited to 1000 people or less NPI Considering that some of the excluded NPIs have strongestimated effects when included and are correlated with other NPIs this degree of robust-ness is encouragingly high It suggests that unobserved factors will not significantly biasresults as long as their effects and their correlations with the studied NPIs do not exceedthose of the studied NPIs We hypothesise that this robustness to unobserved factors is dueto the noise on the daily growth rate in our model9

In addition Figure B14 (right) shows NPI effectiveness estimates when we observe (iecontrol for) additional NPIs taken from the OxCGRT dataset32

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Inclusion of OxCGRT NPIs= 153

Travel ScreenQuarantineTravel BansSymptomatic TestingPublic Information CampaignsInternal Movement LimitedPublic Transport LimitedDefault

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

Exclusion of Collected NPIsUncombined Effects Shown

= 188Mask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay Home OrderDefault

Figure B14 Robustness to unobserved effects Left Results when excluding previously observed NPIsWe exclude one of the NPIs in turn and show the estimates for the other NPIs Note that this subplotshows the marginal effect for hierarchical NPIs rather than the total effect different to other figuresthat show cumulative effects for gathering bans and businesses closures For example Figure 3 displaysthe total effect of closing most nonessential businesses while here we show the additional effect ofclosing most nonessential businesses over just closing some high-risk businesses We show the additionaleffects here because the effect of a cumulative intervention would become undefined when part of it isexcluded from the analysis Right Results when controlling for previously unobserved NPIs We includeone additional NPI in turn and show the estimates for the NPIs in our study (the additional NPI is notshown)

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Appendix C1 Multivariate Sensitivity Analysis - Expanded Results

We now present the results of our joint multivariate sensitivity analysis (previously sum-marised in Figure B9) in which we vary the mean of the prior (ldquoprior meanrdquo) over themean of the generation interval the delay between infection and death and the delaybetween infection and case confirmation

Figure C15 shows for each NPI how NPI effects change when all prior means are variedsimultaneously

We find that the most sensitive NPIs are Gatherings limited to 1000 people or fewer Gath-erings limited to 100 people or fewer and Most nonessential businesses closed However thelargest differences appear when all of the parameters are set to the most extreme valuesthat jointly produce a change in a particular direction We also see systematic trends asthe parameters are varied eg increasing the prior mean of the generation interval meantends to decrease the effectiveness of the Gatherings limited to 1000 or fewer NPI while in-creasing prior means of the infection to deathcase confirmation distribution means tendsto increase the effectiveness of this NPI

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

792

942

1092

1242

1392

Mas

kWea

ring

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

GI = 306

-150

-75

00

75

150GI = 406

-150

-75

00

75

150GI = 506

-150

-75

00

75

150GI = 606

-150

-75

00

75

150GI = 706

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

00

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

0

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Som

eBus

s

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Mos

tBus

s

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Educ

at

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

792

942

1092

1242

1392

Stay

Hom

e

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

Figure C15 Global sensitivity analysis Each cell shows the difference between the median effectivenessestimate under a specific experimental condition and the default condition where the estimates areexpressed as percentage reduction in Rt Each subplot represents a particular prior mean of the meangeneration interval and an NPI The NPIs vary top to bottom and the generation interval prior meanincreases left to right Each cell with a subplot represents a particular choice of the prior mean of themean infection to death delay (increasing left to right) and the prior mean of the mean infection to caseconfirmation delay (increasing bottom to top) Positive cell values indicate that the median NPI estimateis larger than with default settings and vice versa

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C2 Sensitivity to additional epidemiological assumptions

For completeness Figure C16 shows sensitivity to two further epidemiological assumptionsOn the left we show the dependence of NPI effectiveness estimates on R0 We place aNormal prior over R0 and a hyperprior over its variance reflecting the wide disagreementof regional estimates of R0 In the sensitivity analysis we vary the mean of the prior froma low value (238) to a high one (428) As expected higher mean R0 values result in largerNPI effects This trend is apparent in all NPIs but stronger for NPIs that tended to beimplemented early on (like school closures and gathering bans)

On the right we vary the prior over NPI effectiveness Our default prior on NPI effectivenessis assymmetric reflecting a belief that NPIs are more likely to reduce Rt than increase itHere we additionally test a symmetric Normal prior and a one-sided Half-Normal priorwhich does not allow for NPIs to increase Rt

Appendix C3 Sensitivity to data preprocessing

When the case count is small a large fraction of cases may be imported from other coun-tries and the testing regime may change rapidly To prevent this from biasing our modelwe neglect case numbers before a country has reached 100 confirmed cases Similarly weneglect death numbers before a country has reached 10 deaths Here we vary these thresh-olds Intuitively the minimum cases threshold should be higher than the minimum deathsthreshold

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

R0 prior= 333

[R] = 238[R] = 278

Default[R] = 328[R] = 378[R] = 428

-25 0 25 50 75Average reduction in Rtin the context of our data

NPI Effectiveness Prior= 137

i Normal(0 022)i Half Normal(022)

Default

Figure C16 Sensitivity to additional epidemiological priors Left Sensitivity to the prior mean on R0Right Sensitivity to the NPI effectiveness prior

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Minimum Cases Threshold= 137

3050Default (100)200300

-25 0 25 50 75Average reduction in Rtin the context of our data

Minimum Deaths Threshold= 102

15Default (10)3050

Figure C17 Data preprocessing sensitivity Left Variations in the minimum number of cases RightVariations in the minimum number of deaths

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Validation by predicting unseen data - all countries

Here we show the country-level plots of the single-country holdout experiments from Ap-pendix B1 We use leave-one-out cross-validation fitting the model on 40 countries andshowing its predictions on the excluded country We repeat this process for all 41 countriesIn the excluded country the first 14 days (filled dots) of death and case data are observedto allow roughly inferring R0 These days are not used to infer NPI effectiveness All laterdays are masked (unfilled dots) The activation dates of NPIs are also given and the modeluses the effectiveness estimates inferred from the 40 other countries

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

北便

France

Figure C18 Predictions on excluded countries

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

北 便

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

Malta

Figure C19 Predictions on excluded countries

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

Slovenia

Figure C20 Predictions on excluded countries

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Switzerland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

United Kingdom

Figure C21 Predictions on excluded countries Vertical lines show the activation (or inactivation) datesof NPIs Shaded areas are 95 credible intervals Blue and red dots show the observed confirmed casesand deaths while blue and red lines show the median model estimates of cases (Ct ) and deaths (Dt )Empty dots are not observed by the model For each country we show the full window of analysis (fromthe start of the epidemic until the first NPI was lifted or the 30th of May 2020 whichever was earliersee Methods)

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Posterior predictive distributions

The posterior predictive distribution (Figure C22) shows the predicted number of cases anddeaths after observing the data Although these curves can be called lsquofitsrsquo the degree of fitto the data must be interpreted with great care The fit is generally tight but this is partlydue to the inferred latent noise variables ε(C )

t and ε(D)t Inferring this latent noise allows

the posterior predictive distribution to closely match the data without overfitting the effec-tiveness parameters to the data Such behavior is common in Bayesian models which oftenperfectly interpolate the data without overfitting33 The noise terms can account for peri-ods where infections grew faster or slower than predicted based solely on the active NPIsIn such periods the noise may account for changes in testing reporting and unobservedinterventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Daily DeathsDaily Confirmed Cases

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C22 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 Calibration without countries used for hyperparameter selec-tion

0 20 40 60 80 100Credible Interval

0

20

40

60

80

100E

mpi

rical

Cum

ulat

ive

Freq

uenc

y

Calibration Country Holdouts

Figure C23 Calibration in held-out countries with leave-one-out cross-validation Here we include onlythose 35 countries that were not used to select the hyperparameter We show the first 14 days of casesand deaths in each country and then extrapolate to future days within the period of analysis The plotshows the percentage of observed daily case and death counts that lie within the X credible intervalacross all countries and days

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 MCMC stability results

1000 1002 10040

1000

2000

3000

4000

R

0 1 2 30

500

1000

1500

2000

Relative ESS

Figure C24 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size A value of 1 indicates perfect decorrelation between samplesValues above (below) 1 indicate that the effective number of samples is higher (lower) than the actualnumber of samples due to negative (positive) correlation respectively

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D5 Estimated values for R0 by country The parenthesis give the 95 credible interval which isoften wide The mean R0 across countries is 33

Country Estimated R0 Country Estimated R0

Albania 348 (275431) Lithuania 323 (248403)Andorra 275 (21434) Malaysia 296 (236361)Austria 31 (25377) Malta 327 (248414)Belgium 36 (302427) Mexico 399 (32748)Bosnia and Herzegovina 331 (259406) Morocco 359 (299427)Bulgaria 375 (304457) Netherlands 316 (26375)Croatia 342 (27418) New Zealand 241 (17731)Czech Republic 346 (276425) Norway 273 (215338)Denmark 28 (22347) Poland 397 (32482)Estonia 291 (229361) Portugal 355 (292425)Finland 279 (221343) Romania 401 (335475)France 331 (28389) Serbia 38 (308461)Georgia 356 (281439) Singapore 295 (238356)Germany 285 (231346) Slovakia 353 (271445)Greece 321 (261388) Slovenia 299 (23373)Hungary 403 (332482) South Africa 447 (377527)Iceland 171 (113242) Spain 36 (306423)Ireland 368 (309433) Sweden 229 (176288)Israel 379 (309459) Switzerland 289 (234351)Italy 336 (288391) United Kingdom 32 (267378)Latvia 301 (233376)

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 Collinearity

Appendix D21 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the inclusion or exclusion ofthese countries in the dataset (Figure D25) If we included all countries we would con-clude that university closures were more effective than school closures (black markers inFigure D25) However if we excluded Iceland or Sweden we would conclude that theywere roughly equally effective As there is no strong justification for including or excludingone particular country we cannot meaningfully disentangle the effects of school and uni-versity closures However there is much more data to determine the joint effect and it isindeed much more stable (Figure D25)

-25 0 25 50 75 100Average reduction in Rtin the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D25 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown

Appendix D22 Co-occurrence of NPIs

Table D6 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or neither) Note thatwe do not show the traditional collinearity statistics (variance inflation factors and datacorrelations) since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D6 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D23 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations24 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus042 between closing schools anduniversities and closing some businesses (Figure D26) The weak correlations are oneindicator that collinearity is manageable with our dataset

Effect on NPI combinations To better understand posterior correlations we visualizetheir effect in hosted video files As we condition on different values for one NPI we cansee that the estimates of other NPIs change only slightly always staying well within thecredible intervals in Figure 3 The significance of posterior correlations is small enough thatit is possible to calculate a reasonable approximation to the mean effect of a set of NPIs bysimply combining the mean percentage reductions for each individual NPI (eg two 50reductions lead to a 75 reduction) For example this approximation leads to a joint effectof 75 for all NPIs together which closely matches the exact mean joint effect of 77

Videos are available online here

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D26 Posterior correlations between effectiveness parameters αi

Appendix D3 Posterior Epidemiological Parameter Distributions

Figure D27 shows the posterior distributions for variables describing the key delay distribu-tions in our model the generation interval the delay between infection and case confirma-tion and the delay between infection and death We find that the posterior distributions ofthese parameters are somewhat tighter than their priors but they still explore a wide rangeof values This suggests that the data provides evidence albeit weak about these delaydistributions (for example from visual inspection the data would show that the delay islonger for deaths than for cases) An exception is the dispersion of the infection to deathdelay distribution which has a very wide prior

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

5 6

0

1

dens

ity

Generation Interval Mean

posteriorprior

200 225

00

05

Infection to Death Delay Mean

100 125

00

05

Infection to Case-Confirmation Delay Mean

0 2 4

value

00

05

dens

ity

Generation Interval std

10 20

value

00

01

Infection to Death Delay disp

5 6

value

0

1

Infection to Case-Confirmation Delay disp

Figure D27 Posterior distributions over key epidemiological parameters describing the generation in-terval and the delays between infection and case confirmationdeath

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect morethan three-quarters of the population) We thus miss NPIs which were only implementedregionally For example a few regions in Germany implemented stay-at-home orders butmost did not Thus Germany is listed as no stay-at-home order in our data Additionallyour NPI definitions were not perfectly granular For example a gathering ban on gatheringsof gt15 people and a ban on gatherings of gt60 people would both fall under the NPIGatherings limited to 100 people or less despite likely having different effects on Rt Finally while we included more NPIs than previous work (Table F7) there are many NPIsfor which we were not able to collect enough high-quality data for our modeling such aspublic cleaning or changes to public transportation

Of the 41 countries in our dataset 33 are in Europe As a result the NPI effectivenessestimates may be biased towards effects in Europe and NPI effectiveness may have beendifferent in other parts of the world

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on Rt is constantacross countries and time However the exact implementation and adherence of each NPIis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa limited degree Additionally different countries have different cultural norms and ageprofiles affecting the degree to which a particular intervention is effective For example acountry where a higher proportion of the population is in education will likely experience alarger effect from a government order to close schools and universities Our estimates thusshould be adjusted to local circumstances To address differences between countries ourstructural sensitivity analysis includes a model where each NPI can have a different effectper country (Appendix B2) The average effectiveness estimates across countries in thismodel match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponential de-cline would have the same effect on observations as an unobserved NPI Consequently wecannot quantify its effect on our results (though the sensitivity analyses look reassuring)

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expectthat closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g t and the reproductive number Rt holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso report results from a renewal process model8 that does not depend on this assumptionand we find similar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix F Overview of previous work

Table F7 Data-driven multi-country multi-NPI studies of the effectiveness of observed (as opposed tohypothetical) NPIs in reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Banholzer etal 202023

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Chen andQiu 202021

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Islam et al202020

School closuresworkplace closuresrestrictions on massgatherings publictransport closure

lockdown

149 countries or regionsInterrupted timeseries regression

Liu et al202022

13 NPIs from theOXCGRT dataset

130 countries and regions panel regression

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F8 Some data-driven studies of the effectiveness of observed NPIs in reducing the transmissionof COVID-19 which study a single-country andor a single-NPI

Study NPIs studiedRegionscountries

studiedMethod

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dandekarand

Barbastathis202035

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Dehning etal 202036

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Gatto et al202037

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Hsiang et al202019

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US (each

country modelledindividually)

Linear regression onestimated growth

rates

Jarvis et al202038

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelKraemer etal 202039

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202040 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) model

Lai et al202041

Case detection andisolation travel

restrictions contactreductions

ChinaTravel-network-based

SEIR model

Continued on next page

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F8 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Lorch et al202042

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Maier andBrockmann

202043

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Orea andAacutelvarez202044

Lockdown SpainSpatial econometric

analysis

Quilty et al202045

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

Sears et al202046

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Siedner etal 202047

General socialdistancing

USInterruptedtime-series

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F7) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

61

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

62

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all ldquotrade activitiesrdquo (except within shopping malls) service jobsthat donrsquot involve physical contact museums libraries and galleries opened Sincethe criteria regarding ldquoall retail stores being openrdquo was met we counted April 27 asthe end date for ldquoMost nonessential businesses closedrdquo

63

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H All model equations

This section will mainly be of interest to readers that wish to re-implement the model

Variables are indexed by NPI i country c and day t All prior distributions are independent

Data

1 NPI Activations φi t c isin 012 Observed (Daily) Cases Ct c 3 Observed (Daily) Deaths D t c

Prior Distributions

1 Country-specific R0 R0c sim Normal(325κ) κsim Half Normal(micro= 0σ= 05)2 NPI effectiveness αi sim Asymmetric Laplace(m = 0κ= 05λ= 10) m is the location

parameter κgt 0 is the asymmetry parameter and λgt 0 is the scale parameter3 Infection Initial Counts

N (C )0c = exp(ζ(C )

c )

N (D)0c = exp(ζ(D)

c )

ζ(C )c sim Normal(micro= 0σ= 50)

ζ(D)c sim Normal(micro= 0σ= 50)

4 Observation Noise Dispersion Parameters

Ψcases sim Half Normal(micro= 0σ= 5) (H1)

Ψdeaths sim Half Normal(micro= 0σ= 5) (H2)

Hyperparameters

1 Growth Noise Scale σg = 02

Delay Distributions

1 Generation interval distribution1314

microGI sim Normal(micro= 506σ= 03265)

σGI sim Normal(micro= 211σ= 05)

64

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

2 Time from infection to case confirmation T (C )131516f

microinfrarrconf sim Normal(micro= 1092σ= 094)

Ψinfrarrconf sim Normal(micro= 541σ= 027)

This distribution is converted into a forward-delay vector

T (C )[t ] =

1ZC

Negative Binomial(t micro=microinfrarrconfα=Ψinfrarrconf) t lt 32

0 otherwise

with ZC =31sum

t prime=0Negative Binomial(t primemicro=microinfrarrconfα=Ψinfrarrconf)

ie the delay follows a truncated and normalised negative binomial distribution3 Time from infection to death T (D)f131517

microinfrarrdeath sim Normal(micro= 2182σ= 101)

Ψinfrarrdeath sim Normal(micro= 1426σ= 518)

This distribution is converted into a forward-delay vector

T (D)[t ] =

1ZD

Negative Binomial(t micro=microinfrarrdeathα=Ψinfrarrdeath) t lt 48

0 otherwise

with ZD =47sum

t prime=0Negative Binomial(t primemicro=microinfrarrdeathα=Ψinfrarrdeath)

ie the delay follows a truncated and normalised negative binomial distribution

Infection Model

Rt c = R0c middotexp

(minus

Isumi=1

αi φi t c

) where I is the number of NPIs

αGI =microGI

σ2GI

βGI =micro2

GI

σ2GI

g t c = exp

(βGI(R

1αGIct minus1)

)minus1

fα in the definition of the Negative Binomial distribution is the dispersion parameter Larger values of αcorrespond to a smaller variance and less dispersion With our parameterisation the variance of the Negative

Binomial distribution is micro+ micro2

α

65

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

]

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise

ε(C )τc sim Normal(micro= 0σ=σg )

ε(D)τc sim Normal(micro= 0σ=σg )

Observation Modelf

Ct c =31sumτ=0

N (C )tminusτcT

(C )[τ]

D t c =47sumτ=0

N (D)tminusτcT

(D)[τ]

Ct c sim Negative Binomial(micro= Ct c α=Ψcases)

D t c sim Negative Binomial(micro= D t c α=Ψdeaths)

66

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix I References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Sam Abbott et al ldquoEstimating the time-varying reproduction number of SARS-CoV-2using national and subnational case countsrdquo In Wellcome Open Research 5112 (2020)p 112

3 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

4 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

5 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

6 Sheikh Taslim Ali et al ldquoSerial interval of SARS-CoV-2 was shortened over time bynonpharmaceutical interventionsrdquo In Science 3696507 (2020) pp 1106ndash1109

7 Xingjie Hao et al ldquoReconstruction of the full transmission dynamics of COVID-19 inWuhanrdquo In Nature 5847821 (Aug 2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the robustness of effectiveness estimation of nonpharma-ceutical interventions against COVID-19 transmissionrdquo In arXiv preprint arXiv200713454(2020) URL httpsarxivorgabs200713454

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

12 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

13 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

14 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

67

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

15 Stephen A Lauer et al ldquoThe incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases estimation and applicationrdquo In Annals ofinternal medicine 1729 (2020) pp 577ndash582

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

19 Solomon Hsiang et al ldquoThe effect of large-scale anti-contagion policies on the COVID-19 pandemicrdquo In Nature 5847820 (2020) pp 262ndash267

20 Nazrul Islam et al ldquoPhysical distancing interventions and incidence of coronavirusdisease 2019 natural experiment in 149 countriesrdquo In bmj 370 (2020)

21 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo In Covid Economics 7 (Apr 2020) pp 46ndash67

22 Yang Liu et al ldquoThe impact of non-pharmaceutical interventions on SARS-CoV-2 trans-mission across 130 countries and territoriesrdquo In medRxiv (2020)

23 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

24 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

25 Nick Golding et al ldquoReconstructing the global dynamics of under-ascertained COVID-19 cases and infectionsrdquo In medRxiv (2020) DOI 1011012020070720148460eprint httpswwwmedrxivorgcontentearly202007082020070720148460fullpdf URL httpswwwmedrxivorgcontentearly202007082020070720148460

26 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

27 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

28 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

68

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

29 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

30 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

31 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

32 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

33 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

36 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

37 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

38 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

39 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

40 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

41 Shengjie Lai et al ldquoEffect of non-pharmaceutical interventions to contain COVID-19 inChinardquo en In Nature 5857825 (Sept 2020) pp 410ndash413 DOI 101038s41586-020-2293-x

69

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

42 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

43 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

44 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

45 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

46 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

47 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

70

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Testing reporting and infection fatality rates
      • Method for uncertainty over key epidemiological parameters
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Multivariate Sensitivity Analysis - Expanded Results
              • Sensitivity to additional epidemiological assumptions
              • Sensitivity to data preprocessing
              • Validation by predicting unseen data - all countries
              • Posterior predictive distributions
              • Calibration without countries used for hyperparameter selection
              • MCMC stability results
                • Additional results
                  • Estimated R0 by country
                  • Collinearity
                    • The individual effects of school and university closures
                    • Co-occurrence of NPIs
                    • Correlations between effectiveness estimates
                      • Posterior Epidemiological Parameter Distributions
                        • Additional discussion of assumptions and limitations
                          • Limitations of the data
                          • Model limitations
                            • Overview of previous work
                            • Handling edge cases in the data collection
                            • All model equations
                            • References
Page 6: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan

Table 1 NPIs included in the study Appendix G details how edge cases in the data collection werehandled

NPI DescriptionMask-wearingmandatory in(some) publicspaces

A country has mandated mask usage in the public sometimes limitedto just some public spaces (which the government deems to have a highrisk of infection) For example some countries mandated mask-wearingin most or all indoor public spaces but not outdoors

Gatheringslimited to 1000people or less

A country has set a size limit on gatherings The limit is at most 1000people (often less) and gatherings above the maximum size are disal-lowed For example a ban on gatherings of 500 people or more wouldbe classified as ldquogatherings limited to 1000 or lessrdquo but a ban on gath-erings of 2000 people or more would not

Gatheringslimited to 100people or less

A country has set a size limit on gatherings The limit is at most 100people (often less)

Gatheringslimited to 10people or less

A country has set a size limit on gatherings The limit is at most 10people (often less)

Some businessesclosed

A country has specified a few kinds of customer-facing businesses thatare considered ldquohigh riskrdquo and need to suspend operations (black-list) Common examples are restaurants bars nightclubs cinemasand gyms By default businesses are not suspended

Most nonessentialbusinesses closed

A country has suspended the operations of many customer-facing busi-nesses By default customer-facing businesses are suspended unlessthey are designated as essential (whitelist)

Schools closed A country has closed most or all schoolsUniversitiesclosed

A country has closed most or all universities and higher education fa-cilities

Stay-at-homeorder (withexemptions)

An order for the general public to stay at home has been issued This ismandatory not just a recommendation Exemptions are usually grantedfor certain purposes (such as shopping exercise or going to work) ormore rarely for certain times of the day In practice a stay-at-home or-der was often accompanied or preceded by other NPIs such as businessclosures We account for these other NPIs separately eg when a coun-try issued a law that both closed businesses and ordered the populationto stay at home this is encoded as both a business closure and a stay-at-home order (with exemptions) in the data In our results we thusestimate the additional effect of the order to generally stay at home

6

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Data collection

We collected data on the start and end date of NPI implementations from the start of thepandemic until the 30th of May 2020 Before collecting the data we experimented withseveral public NPI datasets finding that they were not complete enough for our modellingand contained incorrect datesc By focusing on a smaller set of countries and NPIs than thesedatasets we were able to enforce strong quality controls We used independent doubleentry and manually compared our data to public datasets for cross-checking

First two authors independently researched each country and entered the NPI data into sep-arate spreadsheets The researchers manually researched the dates using internet searchesthere was no automatic component in the data gathering process The average time spentresearching each country per researcher was 15 hours

Second the researchers independently compared their entries to the following public datasetsand if there were conflicts visited all primary sources to resolve the conflict the EFGNPIdatabase14 the Oxford COVID-19 Government Response Tracker15 and the mask4all dataset16

Third each country and NPI was again independently entered by one to three paid con-tractors who were provided with a detailed description of the NPIs and asked to includeprimary sources with their data A researcher then resolved any conflicts between this dataand one (but not both) of the spreadsheets

Finally the two independent spreadsheets were combined and all conflicts resolved by aresearcher The final dataset contains primary sources (government websites andor mediaarticles) for each entry

Data Preprocessing

When the case count is small a large fraction of cases may be imported from other countriesand the testing regime may change rapidly To prevent this from biasing our model weneglect case numbers before a country has reached 100 confirmed cases and death numbers

cWe evaluated the following datasets

bull Epidemic Forecasting Global NPI Database14

bull Oxford COVID-19 Government Response Tracker (OxCGRT)15

bull ACAPS COVID19 Government Measures Dataset

Note that these datasets are under continuous development Many of the mistakes found will already havebeen corrected We know from our own experience that data collection can be very challenging We havethe fullest respect for the people behind these datasets In this paper we focus on a more limited set ofcountries and NPIs than these datasets contain allowing us to ensure higher data quality in this subset Givenour experience with public datasets and our data collection we encourage fellow COVID-19 researchers toindependently verify the quality of public data they use if feasible

7

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

before a country has reached 10 deaths We include these thresholds in our sensitivityanalysis (Appendix C3)

Short model description

For each country c

For each day t

New infections N (sdot)tc

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Mask wearing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closedSchools closed

Stay-at-home order

Growth reductions from interventions αi i

Daily growth rate gtc

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Uncertain delay from infection to

confirmation death

Uncertain generation interval

Universities closed

Figure 2 Model Overview Purple nodes are observed We describe the diagram from bottom to topThe effectiveness of NPI i is represented by αi which is independent of the country On each day t a countryrsquos daily reproduction number Rt c depends on the countryrsquos basic reproduction number R0cand the active NPIs The active NPIs are encoded by Φi t c which is 1 if NPI i is active in country c attime t and 0 otherwise Rt c is transformed into the daily growth rate gt c using the generation intervalparameters and subsequently is used to compute the new infections N (C )

t c and N (D)t c that will turn into

confirmed cases and deaths respectively Finally the number of new confirmed cases Ct c and deathsDt c is computed by a discrete convolution of N (middot)

t c with the respective delay distributions Our modeluses both death and case data it splits all nodes above the daily growth rate gt c into separate branchesfor deaths and confirmed cases We account for uncertainty in the generation interval infection-to-case-confirmation delay and the infection-to-death delay by placing priors over the parameters of thesedistributions

In this section we give a short summary of the model (Figure 2) The detailed modeldescription is given in Appendix A Code is available online here

Our model uses case and death data from each country to lsquobackwardsrsquo infer the number ofnew infections at each point in time which is itself used to infer the reproduction numbersNPI effects are then estimated by relating the daily reproduction numbers to the activeNPIs across all days and countries This relatively simple data-driven approach allows usto sidestep assumptions about contact patterns and intensity infectiousness of different age

8

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

groups and so forth that are typically required in modelling studies We make several addi-tions to the semi-mechanistic Bayesian hierarchical model of Flaxman et al1 allowing ourmodel to observe both cases and death data This increases the amount of data from whichwe can extract NPI effects reduces distinct biases in case and death reporting and reducesthe bias from including only countries with many deaths Since epidemiological parametersare only known with uncertainty we place priors over them following recent recommendedpractice17 Additionally as we do not aim to infer the total number of COVID-19 infectionswe can avoid assuming a specific infection fatality rate (IFR) or ascertainment rate (rate oftesting)

The growth of the epidemic is determined by the time- and country-specific reproductionnumber Rt c which depends on a) the (unobserved) basic reproduction number R0c givenno active NPIs and b) the active NPIs at time t R0c accounts for all time-invariant factorsthat affect transmission in country c such as differences in demographics population den-sity culture and health systems18 We assume that the effect of each NPI on Rt c is stableacross countries and time The effectiveness of NPI i is represented by a parameter αi over which we place an Asymmetric Laplace prior that allows for both positive and negativeeffects but places 80 of its mass on positive effects reflecting that NPIs are more likely toreduce Rt c than to increase it Following Flaxman et al and others168 each NPIrsquos effecton Rt c is assumed to independently affect Rt c as a multiplicative factor

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (1)

where φi ct = 1 indicates that NPI i is active in country c on day t (φi ct = 0 otherwise) andI is the number of NPIs The multiplicative effect encodes the plausible assumption thatNPIs have a smaller absolute effect when Rt c is already low We discuss the meaning ofeffectiveness estimates given NPI interactions in the Results section

In the early phase of an epidemic the number of new daily infections grows exponentiallyDuring exponential growth there is a one-to-one correspondence between the daily growthrate and Rt c

19 The correspondence depends on the generation interval (the time betweensuccessive infections in a chain of transmission) which we assume to have a Gamma dis-tribution The prior on the mean generation interval has mean 506 days derived from ameta-analysis20

We model the daily new infection count separately for confirmed cases and deaths repre-senting those infections which are subsequently reported and those which are subsequentlyfatal However both infection numbers are assumed to grow at the same daily rate in ex-pectation allowing the use of both data sources to estimate each αi The infection numberstranslate into reported confirmed cases and deaths after a stochastic delay The delay is thesum of two independent distributions assumed to be equal across countries the incuba-tion period and the delay from onset of symptoms to confirmation We put priors over themeans of both distributions resulting in a prior over the mean infection-to-confirmationdelay with a mean of 1092 days20 see Appendix A3 Similarly the infection-to-death

9

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

delay is the sum of the incubation period and the delay from onset of symptoms to deathand the prior over its mean has a mean of 218 days20 Finally as in related NPI models16both the reported cases and deaths follow a negative binomial noise distribution with aninferred noise dispersion

Using a Markov chain Monte Carlo (MCMC) sampling algorithm21 this model infers pos-terior distributions of each NPIrsquos effectiveness while accounting for cross-country variationsin testing reporting and fatality rates as well as uncertainty in the generation interval anddelay distributions To analyse the extent to which modelling assumptions affect the re-sults our sensitivity analysis includes all epidemiological parameters prior distributionsand many of the structural assumptions introduced above (Appendix B2 and AppendixC) MCMC convergence statistics are given in Appendix C7

Results

NPI Effectiveness

Our model enables us to estimate the individual effectiveness of each NPI expressed as apercentage reduction in R As in related work168 this percentage reduction is modelledas constant over countries and time and independent of the other implemented NPIs Inpractice however NPI effectiveness may depend on other implemented NPIs and localcircumstances Thus our effectiveness estimates ought to be interpreted as the effectivenessaveraged over the contexts in which the NPI was implemented in our data10 Our results thusgive the average NPI effectiveness across typical situations that the NPIs were implementedin Figure 3 (bottom left) visualises which NPIs typically co-occurred aiding interpretation

Under the default model settings the mean percentage reduction in R (with 95 credibleinterval) associated with each NPI is as follows (Figure 3) mandating mask-wearing in(some) public spaces -1 (-13ndash8) limiting gatherings to 1000 people or less 13(-3ndash31) to 100 people or less 28 (9ndash44) to 10 people or less 36 (17ndash53) closing some high-risk businesses 20 (0ndash40) closing most nonessential busi-nesses 29 (8ndash47) closing schools and universities 41 (23ndash56) and issuingstay-at-home orders (with exemptions) 10 (-2ndash22) Note that we cannot robustlydisentangle the individual effects of closing schools and closing universities since the im-plementation dates of these NPIs coincided nearly perfectly in all countries except Icelandand Sweden (Appendix D21) We thus show the joint effect of closing both schools anduniversities and treat schools and universities closed as one single NPI going forward

Some NPIs frequently co-occurred ie were partly collinear However we are able to iso-late the effects of individual NPIs since the collinearity is imperfect and our dataset is largeFor every pair of NPIs we observe one of them without the other for 748 country-days onaverage (Appendix D22) The minimum number of country-days for any NPI pair is 143(for limiting gatherings to 1000 or 100 attendees) Additionally under excessive collinear-ity and insufficient data to overcome it individual effectiveness estimates are highly sensi-

10

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75 100Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spaces

Gatherings limited to1000 people or less

Gatherings limited to100 people or less

Gatherings limited to10 people or less

Some businessesclosed

Most nonessentialbusinesses closed

Schools and universitiesclosed

Stay-at-home order(with exemptions)

EndsTransmission

北 便

i

j

100

100

100 81

10

0 64

100

100 45

27

100 94

78

89

65

88

93

44

29

100

100 82

92

68

91

96

46

28

100

100

100 99

76

97

98

55

30

99

97

86

100 72

95

98

49

27

99

98

91

100

100 99

99

64

30

97

95

84

95

72

100 99

49

29

98

95

81

92

68

93

100 46

28

100

100 98

10

0 94

99

99

100

Frequency i Active Given j Active

0 500 1000 1500 2000 2500 3000

Days

Total Days Active

Figure 3 Top NPI effects under default model settings The Figure shows the average percentagereductions in R as observed in our data (or in terms of the model the posterior marginal distributionsof 1minusexp(minusαi )) with median 50 and 95 credible intervals A negative 1 reduction refers to a 1increase in R Cumulative effects are shown for hierarchical NPIs (gathering bans and business closures)ie the result for Most nonessential businesses closed shows the cumulative effect of two NPIs withseparate parameters and symbols - closing some (high-risk) businesses and additionally closing mostremaining (non-high-risk but nonessential) businesses given that some businesses are already closedBottom Left Conditional activation matrix Cell values indicate the frequency that NPI i (x-axis) wasactive given that NPI j (y-axis) was active Eg schools were always closed whenever a stay-at-homeorder was active (bottom row third column from the right) but not vice versa Bottom Right Totalnumber of days each NPI was active across all countries

11

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Most businessesclosed

Stay-at-homeorder

Mask wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businessesclosed

Schools anduniversities closed

Figure 4 Combined NPI effectiveness for the most common sets of NPIs in our data by size of the NPI setShaded regions denote 50 and 95 credible intervals Left Maximum R0 that could be reduced to be-low 1 for each set of NPIs Right Predicted R after implementation of each set of NPIs assuming R0 = 38Readers can interactively explore the effects of all sets of NPIs at httpepidemicforecastingorgcalc

tive to variations in the data and model parameters22 High sensitivity prevented Flaxmanet al1 who had a smaller dataset from disentangling NPI effects9 Our estimates are sub-stantially less sensitive (see next section) Finally the posterior correlations between theeffectiveness estimates are weak suggesting manageable collinearity (Appendix D23)

Although the correlations between the individual estimates are weak we should take theminto account when evaluating combined NPI effects For example if two NPIs frequentlyco-occurred there may be more certainty about the combined effect than about the twoindividual effects Figure 4 shows the combined effectiveness of the sets of NPIs thatare most common in our data Together our set of NPIs reduced R by 77 (74ndash79)on average Across countries the mean R without any NPIs (ie the R0) was 33 (Ta-ble D5 reports R0 for all countries) Starting from this number the estimated R likely couldhave been reduced below 1 by closing schools and universities high-risk businesses andlimiting gathering sizes Readers can interactively explore the effects of sets of NPIs athttpepidemicforecastingorgcalc A CSV file containing the joint effectiveness of all NPIcombinations is available online here

Sensitivity and validation

We perform a range of validation and sensitivity experiments (Appendix B with furtherexperiments in Appendix C) First we analyse how the model extrapolates to unseen coun-tries and find that it makes calibrated forecasts over periods of up to 2 months with un-certainty increasing over time Further we perform multiple sensitivity analyses studyinghow results change if we modify the priors over epidemiological parameters exclude coun-tries from the dataset use only deaths or confirmed cases as observations vary the datapreprocessing and more Finally we investigate our key assumptions by showing results

12

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

for several alternative models (structural sensitivity10) and examine possible confoundingof our estimates by unobserved factors influencing R In total we consider 201 alternativeexperimental conditions

Figure 5 (left) shows the median NPI effectiveness across these 201 experimental condi-tions Compared to the results under our default settings (Figures 3 and 4) median NPIeffects vary under alternative plausible experimental conditions However the trends in theresults are robust and some NPIs outperform others under all tested conditions While wetest over large ranges of plausible values our experiments do not include every possiblesource of uncertainty and the results might change more substantially under experimentalconditions we have not tested

We categorise NPI effects into small moderate and large which we define as a medianreduction in R of less than 175 between 175 and 35 and more than 35 (verticallines in Figure 5) Six of the NPIs fall into a single category in a large fraction of experi-mental conditions school and university closures are associated with a large effect in 99of experimental conditions limiting gatherings to 10 people or less in 94 Closing mostnonessential businesses has a moderate effect in 96 of conditions limiting gatherings to100 people or less in 97 Making mask-wearing mandatory in (some) public spaces fallsinto the ldquosmall effectrdquo category in 100 of experimental conditions issuing stay-at-homeorders (with exemptions) in 99 Two NPIs fall less clearly into one category Closing some(high-risk) businesses has a moderate effect in 86 of conditions and limiting gatherings to1000 people or less has a small effect in 79 However both NPIs have small-to-moderateeffects in more than 99 of experimental conditions The effect of limiting gatherings to1000 people or less is the least stable across the sensitivity analyses which may reflect itsaforementioned partial collinearity with limiting gatherings to 100 people or less

Aggregating all sensitivity analyses can hide sensitivity to specific assumptions We displaythe median NPI effects in four categories of sensitivity analyses (Figure 5 right) and eachindividual sensitivity analysis is shown in the Appendix The trends in the results are alsostable within the categories

Discussion

We use a data-driven approach to estimate the effects that eight nonpharmaceutical in-terventions had on COVID-19 transmission in 41 countries between January and the endof May 2020 We find that several NPIs were associated with a clear reduction in R inline with the mounting evidence that NPIs can be effective at mitigating and suppressingoutbreaks of COVID-19 Furthermore our results suggest that some NPIs outperformedothers While the exact effectiveness estimates vary with modelling assumptions the broadconclusions discussed below are largely robust across 201 experimental conditions in 11sensitivity analyses

13

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

0 25 50Median reduction in Rt

Mask-wearing mandatory in(some) public spacesGatherings limited to1000 people or lessGatherings limited to100 people or lessGatherings limited to10 people or lessSome businessesclosedMost nonessentialbusinesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

All Sensitivity Analyses (201 conditions)北便

Model Structure(5 conditions)

北便

Data and Preprocessing(51 conditions)

0 25 50Median reduction in Rt

北便

Epidemiological Priors(131 conditions)

0 25 50Median reduction in Rt

北便

Unobserved Factors(14 conditions)

Figure 5 Median NPI effects across the sensitivity analyses Left Median NPI effects (reduction in R)when varying different components of the model or the data in 201 experimental conditions Results aredisplayed as violin plots using kernel density estimation to create the distributions Inside the violinsthe box plots show median and interquartile-range The vertical lines mark 0 175 and 35 (seetext) Right Categorised sensitivity analyses Structural Using only cases or only deaths as observations(2 experimental conditions Figure B12) varying the model structure (3 conditions Figure B13 left)Data Leaving out one country at a time (41 conditions Figure B10) varying the threshold below whichcases and deaths are masked (8 conditions Figure C17) sensitivity to correcting for undocumentedcases and to country-level differences in case ascertainment (2 conditions Figure B11) Epidemiologicalpriors Jointly varying the means of the priors over the means of the generation interval the infection-to-case-confirmation delay and the infection-to-death delay (125 conditions Figure C15) varying theprior over R0 (4 conditions Figure C16 left) varying the prior over NPI effectiveness (2 conditionsFigure C16 right) Unobserved factors Excluding observed NPIs one at a time (9 conditions Figure B14left) controlling for additional NPIs from a different dataset (5 conditions Figure B14 right)

14

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Business closures and gathering bans both seem to have been effective at reducing COVID-19 transmission Closing only high-risk businesses appears to have been only somewhat lesseffective than closing most nonessential businesses the median reduction in R differs onlyby 9 (6ndash12 mean and 95 interval of median estimates across the experiments set-tings in Figure 5) Closing only high-risk businesses may thus have been the more promisingpolicy option in some circumstances Limiting gatherings to 10 people or less was more ef-fective than limits up to 100 or 1000 people This may reflect the fact that small gatheringsare common

As previously discussed we estimate the average effect each NPI had in the contexts inwhich it was implemented When countries introduced stay-at-home orders they nearly al-ways also banned gatherings and closed schools universities and some or most businessesif they had not done so already (Figure 3 bottom left) Flaxman et al1 and Hsiang et al3

add the effect of these distinct NPIs to the effectiveness of stay-at-home orders and accord-ingly find a large effect In contrast we account for these other NPIs separately and isolatethe additional effect of ordering the population to stay at home (when large gatheringsare banned and educational institutions and some businesses closed)d In accordance withother studies that took this approach we find a small effect26 A typical country could havereduced R to below 1 without a stay-at-home order (Figure 4) provided other NPIs wereimplemented

Mandating mask-wearing in various public spaces had no clear effect on average in thecountries we studied This does not rule out mask-wearing mandates having a larger ef-fect in other contexts In our data mask-wearing was only mandated when other NPIshad already reduced public interactions When most transmission occurs in private spaceswearing masks in public is expected to be less effective This might explain why a largereffect was found in studies that included China and South Korea where mask-wearingwas introduced earlier823 While there is an emerging body of literature indicating thatmask-wearing can be effective in reducing transmission the bulk of evidence comes fromhealthcare settings24 In non-healthcare settings risk compensation25 may play a largerrole potentially reducing effectiveness While our results cast doubt on reports that mask-wearing is the main determinant shaping a countryrsquos epidemic23 the policy still seemspromising given all available evidence due to its comparatively low economic and socialcosts Its effectiveness may have increased as other NPIs have been lifted and public inter-actions have recommenced

We find a large effect for school and university closures This finding is remarkably robustacross different model structures variations in the data and epidemiological assumptions(Figure 5) It remains robust when controlling for NPIs excluded from our study (Fig-ure B14) Our approach cannot distinguish direct and indirect effects such as forcing par-

dIn principle a stay-at-home order does not imply these other NPIs It is possible for a country to simultaneouslyhave allowed schools to be open and have a stay-at-home order (with exemptions for attending school) inplace However this is not how stay-at-home orders were implemented by the countries in our dataset andwe cannot estimate the effect of such a hypothetical stay-at-home order from the available data

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ents to stay at home or causing broader behaviour changes by increasing public concernAdditionally since school and university closures almost perfectly coincided in the coun-tries we study our approach cannot distinguish their individual effects (Appendix D21)This limitation likely also holds for other observational studies which do not include dataon university closures and estimate only the effect of school closures1ndash35ndash8 Previous evi-dence on school and university closures is mixed1626 Early data suggest that children andyoung adults are equally susceptible to infection but have a notably lower observed inci-dence rate than older adultsmdashwhether this is due to school and university closures remainsunknown27ndash30 Although infected young people are often asymptomatic they appear toshed similar amounts of virus as older people3132 and might therefore transmit the infec-tion to higher-risk demographics unknowingly As the role of children in transmission isstill unclear33 while outbreaks detected in schools are rising33ndash35 this topic merits carefulattention

Our study has several limitations First NPI effectiveness may depend on the context ofimplementation such as the presence of other NPIs and country-specific factors Our esti-mates must be interpreted as the average effectiveness over the contexts in our dataset10and expert judgement is required to adjust them to local circumstances Second R mayhave been reduced by unobserved NPIs or spontaneous behaviour changes To investigatewhether these reductions could be falsely attributed to the observed NPIs we perform sev-eral additional analyses and find that our results are stable to a range of unobserved effects(Appendix B3) However this sensitivity check cannot provide certainty Investigating therole of unobserved effects is an important topic to explore further Third our results can-not be used without qualification to predict the effect of lifting NPIs For example closingschools and universities seems to have greatly reduced transmission but this does not meanthat reopening them will cause infections to soar Educational institutions can implementsafety measures such as reduced class sizes as they reopen Further work is needed to anal-yse the effects of reopenings we hope that our collected data aids this effort Fourth whilewe included more NPIs than most previous work (Table F7) several promising NPIs wereexcluded For example testing tracing and case isolation may be an important part of acost-effective epidemic response36 but were not included because it is difficult to obtaincomprehensive data We discuss further limitations in Appendix E

Although our work focuses on estimating the impact of NPIs on the reproduction numberR the ultimate goal of governments may be to reduce the incidence prevalence and excessmortality of COVID-19 Controlling R is essential but the contribution of NPIs towards thesegoals may also be mediated by other factors such as their duration and timing37 periodicityand adherence3839 and successful containment40 While each of these factors addressestransmission within individual countries it can be crucial to additionally synchronise NPIsbetween countries since cases can be imported41

In conclusion as governments around the world seek to keep R below 1 while minimisingthe social and economic costs of their interventions we hope that our results can informpolicy decisions on which NPIs to implement in any potential further wave of infectionsAdditionally our work may provide insight on which areas of public life are most in need

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

of restructuring so that they can continue despite the pandemic However our estimatesshould not be seen as the final word on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies simulation studiesexperimental trials and clinical experience

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

The paid contractor work helping with the data collection the development of the interac-tive website and the costs for cloud compute were funded by the Berkeley Existential RiskInitiative

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier V Mikulik MAHartwick L Chindelevitch designed the model and modelling experiments M Sharma ABStephenson T Gavenciak J Salvatier performed and analysed the modelling experimentsJ Kulveit T Gavenciak JM Brauner conceived the research S Mindermann M SharmaJM Brauner L Chindelevitch J Kulveit T Besiroglu did the literature search JM BraunerS Mindermann M Sharma G Leech L Chindelevitch T Besiroglu V Mikulik wrote themanuscript All authors read and gave feedback on the manuscript and approved the finalmanuscript JM Brauner S Mindermann and M Sharma contributed equally L Chindele-vitch Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

3 Solomon Hsiang et al ldquoThe effect of large-scale anti-contagion policies on the COVID-19 pandemicrdquo In Nature 5847820 (2020) pp 262ndash267

4 Shengjie Lai et al ldquoEffect of non-pharmaceutical interventions to contain COVID-19 inChinardquo en In Nature 5857825 (Sept 2020) pp 410ndash413 DOI 101038s41586-020-2293-x

5 Yang Liu et al ldquoThe impact of non-pharmaceutical interventions on SARS-CoV-2 trans-mission across 130 countries and territoriesrdquo In medRxiv (2020)

6 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

7 Nazrul Islam et al ldquoPhysical distancing interventions and incidence of coronavirusdisease 2019 natural experiment in 149 countriesrdquo In bmj 370 (2020)

8 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo In Covid Economics 7 (Apr 2020) pp 46ndash67

9 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

10 Mrinank Sharma et al ldquoOn the robustness of effectiveness estimation of nonpharma-ceutical interventions against COVID-19 transmissionrdquo In arXiv preprint arXiv200713454(2020) URL httpsarxivorgabs200713454

11 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

12 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

13 Ensheng Dong Hongru Du and Lauren Gardner ldquoAn interactive web-based dashboardto track COVID-19 in real timerdquo In The Lancet Infectious Diseases 205 (May 2020)pp 533ndash534 DOI 101016s1473-3099(20)30120-1

14 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

15 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

16 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

17 Sam Abbott et al ldquoEstimating the time-varying reproduction number of SARS-CoV-2using national and subnational case countsrdquo In Wellcome Open Research 5112 (2020)p 112

18 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

19 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

20 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Young Joon Park et al ldquoContact tracing during coronavirus disease outbreak SouthKorea 2020rdquo In Emerg Infect Dis 2610 (2020)

28 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

29 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

30 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

31 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

32 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

33 Chen Stein-Zamir et al ldquoA large COVID-19 outbreak in a high school 10 days afterschoolsrsquo reopening Israel May 2020rdquo In Eurosurveillance 2529 (2020) p 2001352

34 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 38 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile919676Weekly_COVID19_Surveillance_Report_week_38_FINAL_UPDATEDpdf 2020

35 Meira Levinson Muge Cevik and Marc Lipsitch Reopening Primary Schools during thePandemic 2020

36 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

37 K Prem et al ldquoThe effect of control strategies to reduce social mixing on outcomes ofthe COVID-19 epidemic in Wuhan China a modelling studyrdquo In Lancet Public Health5 (5 May 2020) e261ndashe270

38 Patrick G T Walker et al ldquoThe impact of COVID-19 and strategies for mitigationand suppression in low- and middle-income countriesrdquo In Science 3696502 (2020)pp 413ndash422

39 Nicholas G Davies et al ldquoEffects of non-pharmaceutical interventions on COVID-19cases deaths and demand for hospital services in the UK a modelling studyrdquo In TheLancet Public Health 57 (2020) e375ndashe385

40 Xingjie Hao et al ldquoReconstruction of the full transmission dynamics of COVID-19 inWuhanrdquo In Nature 5847821 (Aug 2020)

41 N W Ruktanonchai et al ldquoAssessing the impact of coordinated COVID-19 exit strate-gies across Europerdquo In Science (2020) DOI 101126scienceabc5096

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 23

Appendix A1 Detailed model description 23

Appendix A2 Testing reporting and infection fatality rates 26

Appendix A3 Method for uncertainty over key epidemiological parameters 27

Appendix B Validation 30

Appendix B1 Unseen data 30

Appendix B2 Sensitivity Analysis 30

Appendix B3 Robustness to unobserved effects 35

Appendix C Additional sensitivity analyses and validation 38

Appendix C1 Multivariate Sensitivity Analysis - Expanded Results 38

Appendix C2 Sensitivity to additional epidemiological assumptions 40

Appendix C3 Sensitivity to data preprocessing 40

Appendix C4 Validation by predicting unseen data - all countries 42

Appendix C5 Posterior predictive distributions 47

Appendix C6 Calibration without countries used for hyperparameter selection 48

Appendix C7 MCMC stability results 49

Appendix D Additional results 50

Appendix D1 Estimated R0 by country 50

Appendix D2 Collinearity 51

Appendix D3 Posterior Epidemiological Parameter Distributions 53

Appendix E Additional discussion of assumptions and limitations 55

Appendix E1 Limitations of the data 55

Appendix E2 Model limitations 55

Appendix F Overview of previous work 57

Appendix G Handling edge cases in the data collection 60

Appendix H All model equations 64

Appendix I References 67

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections N (sdot)tc

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Mask wearing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closedSchools closed

Stay-at-home order

Growth reductions from interventions αi i

Daily growth rate gtc

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Uncertain delay from infection to

confirmation death

Uncertain generation interval

Universities closed

Figure A6 Model Overview Purple nodes are observed We describe the diagram from bottom to topThe effectiveness of NPI i is represented by αi which is independent of the country On each day t a countryrsquos daily reproduction number Rt c depends on the countryrsquos basic reproduction number R0cand the active NPIs The active NPIs are encoded by Φi t c which is 1 if NPI i is active in country c attime t and 0 otherwise Rt c is transformed into the daily growth rate gt c using the generation intervalparameters and subsequently is used to compute the new infections N (C )

t c and N (D)t c that will turn into

confirmed cases and deaths respectively Finally the number of new confirmed cases Ct c and deathsDt c is computed by a discrete convolution of N (middot)

t c with the respective delay distributions Our modeluses both death and case data it splits all nodes above the daily growth rate gt c into separate branchesfor deaths and confirmed cases We account for uncertainty in the generation interval infection-to-case-confirmation delay and the infection-to-death delay by placing priors over the parameters of thesedistributions

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1but with several key extensions First we model both confirmed cases and deaths al-lowing us to leverage significantly more data Furthermore we do not assume a specificinfection fatality rate (IFR) since we do not aim to infer the total number of COVID-19 in-fections Additionally since epidemiological parameters are only known with uncertaintywe place priors over them following recent best practice2 Concretely we place priors onthe means and standard deviationsdispersions of the generation interval the infection-to-confirmation delay and the infection-to-death delay These prior choices are detailedin Appendix A3 The end of this section details further adaptations which allow us to

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

minimise assumptions about testing reporting and the IFR Code is available online hereFor readers who wish to re-implement the model a concise list of all equations is given inAppendix H

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior distribution over R0c reflecting the wide disagreement of regionalestimates of R0

3 The mean R0c is 328 based on a meta-analysis4 We parameterize theeffectiveness of NPI i assumed to be same across countries and time with αi Each NPI isassumed to have an independent multiplicative effect on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We place an Asymmetric Laplace prior over αi with scale parameter10 asymmetry parameter 05 and location parameter 0 Our prior allows for (unbounded)positive and negative effects as we cannot a priori exclude the possibility that the introduc-tion of an NPI increases R However our prior places 80 of its mass on positive effectsreflecting a belief that NPIs are much more likely to reduce Rt than not This is a shrinkageprior placing 50 of its mass on lsquosmallrsquo effectiveness (less than 10 change in Rt ) All priordistributions are independent

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c and Rt c

(Eq 29 in5)

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We account for uncertainty inthe generation interval (GI) assumed to be a Gamma distribution by placing prior distri-butions over its mean and standard deviation (Appendix A3) Using (A2) we can writeg t c as a function g t c (Rt c microGIσGI) (see Appendix H) where microGI is the GI mean and σGI isthe GI standard deviation

aMany epidemiological models define growth rates as the exponent r in an exponential growth function Herewe use daily growth rates instead for ease of exposition These choices are mathematically equivalent Notethat we adapted equation (29) in Wallinga amp Lipsitch5 to account for our choice

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Note that the GI can change over time due to NPIs In particular the effective contact tracingand case isolation in China substantially shortened the mean GI to only 26 days among theidentified cases6 Since most cases were likely not identified even in Wuhan China7 andour study omits most of the countries with highly effective contact tracing programs suchas China or South Korea we model the GI as constant (but uncertain) over time A possibleextension for further work is to model the change in GI explicitly

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are inferred from the observation models for cases and deathsWe assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)] (A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)] (A4)

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross-validation as no reference is available for it We evaluate five dif-ferent values (σN isin 00501020304) with a fixed randomly chosen validation set of6 countries σN = 02 maximises the predictive log-likelihood on the validation set Thisensures a more calibrated model which is less likely to produce overconfident or unstableestimates10 We did not tune any other aspect of the model

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation In realitythis delay distribution is the sum of two independent distributions the incubation period(lognormal distribution) and the delay from onset of symptoms to confirmation (negativebinomial distribution) These distributions are uncertain but modelling (and placing pri-ors over) them individually leads to unidentifiability Therefore we model the total delay

bSince we treat new infections as a continuous number its initial value can be between 0 and 1

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

between infection and confirmation assuming that this follows a negative binomial dis-tribution We convert priors over the individual delays to a prior over the total delay bybootstrapping mdash please see Appendix A3

As in Flaxman et al1 the observed cases Ct c follow a negative binomial noise distribu-tion with mean C t c and an inferred dispersion parameter Ψ(C ) This distribution encodesthat small case numbers are more noisy and should therefore receive less weight Hav-ing separate dispersion parameters for cases and deaths ensures that they can be weighteddifferently if there is a difference in their noise distributions

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ) (A6)

where PD (delay) is the distribution of the delay from infection to death As for cases wemodel the total delay between infection and death as negative binomial which is in realitythe sum of two independent distributions the incubation period and the delay from onsetof symptoms to death (gamma distribution)

Finally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and an inferred dispersion parameter Ψ(D)

The model was implemented in PyMC311 We infer the unobserved variables in our modelusing the No-U-Turn Sampler (NUTS)12 a standard Markov chain Monte Carlo samplingalgorithm

Appendix A2 Testing reporting and infection fatality rates

Scaling all values of a time series by a constant does not change its growth rates Themodel is therefore invariant to the scale of the observations and consequently to country-level differences in the IFR and the ascertainment rate (the proportion of infected peoplewho are subsequently reported positive) For example assume countries A and B differ onlyin their ascertainment rates Then our model will infer a difference in N (C )

0c (Eq (A3)) butnot in the growth rates g t c across A and B Accordingly the inferred NPI effectiveness willbe identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas the

cThis is only approximately true The negative binomial output distribution has a coefficient of variation dimin-ishing with its mean ie smaller observations are relatively more noisy and carry less weight Furthermorewhilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligible effect

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

introduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimics changes in the ascertainment ratemdashnoise at time

τ affects all future casesmdashand allows for gradual multiplicative changes in ascertainment

Appendix A3 Method for uncertainty over key epidemiological parameters

We account for uncertainty in the following delay distributions the generation interval thedelay from infection to symptom onset (incubation period) the delay from symptom onsetto case confirmation and the delay from symptom onset to death

Each distribution has an uncertain mean (Table A3) whose prior distribution we takefrom a meta-analysis13 published shortly after our window of analysis This helps accountfor possible heterogeneity between different populations and therefore often finds higheruncertainty in the means than estimated in primary studies while using more data Themeta-analysis reports mean values with 95 confidence intervals We set normal priors overthe mean delays with mean equal to the reported mean in the meta-analysis and standarddeviation set such that the prior 95 credible interval includes the 95 confidence intervalsfrom the meta-analysis For example the meta-analysis reports an expected mean delayfrom symptoms to death of 1671 days with a 95 credible interval ranging from 1537 to1817 We thus assume that the prior is normal with micro= 1671 and σ= 075 which ensuresthat its 95 credible interval ranges from 1525 to 1817 strictly including the intervalreported in the meta analysis to be conservative Priors are truncated at zero

Furthermore we account for the uncertainty in the standard deviation of these delay dis-tributions by placing normal priors based on primary studies following the same methodas above (Table A4) For the standard deviation of the generation interval we use patientdata from Feretti et al14 and reproduce their method fitting a Gamma distribution to thegeneration time

Finally the distributional forms of the delay distributions are taken from the above primarystudies

The delay from infection to case confirmation is the sum of the incubation period and thesymptom-onset-to-case-confirmation delay Similarly the delay from infection to death isthe sum of the incubation period and the symptom-onset-to-death delay However forcomputational tractability we model the total delay distributions converting the priors de-scribed above to priors over the total delays We assume that the total delays follow negativebinomial distributions which are computationally tractable and empirically provide a closefit to both sum distributions We place normal priors over the mean and dispersion parame-ters of these total delay distributions by bootstrapping For example we compute priors forthe total infection-to-case-confirmation delay distribution by sampling Nbootstrap = 250 in-

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

cubation periods and symptom-onset-to-case-confirmation distributions For each sampledpair of distributions we draw 106 samples from their sum and fit a negative binomial usingmoment matching We compute the mean and standard deviation of the negative binomialdistribution parameters across the bootstrap and use this for our prior The resulting priorsare given in Table A2

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table A2 Epidemiological parameter distributions used in the model The details on sources and priorsused to generate this Table are given in Tables A3 and A4

Delay type Sources Prior on mean Prior on standard Distributionaldeviation or formdispersion

Infection to case Fonfria et al13 N (1092σ= 094) Dispersion Negativeconfirmation Lauer et al15 N (541σ= 027) binomial

Cereda et al16

Infection to death Fonfria et al13 N (2182σ= 101) Dispersion NegativeLauer et al15 N (1426σ= 518) binomialLinton et al17

Generation interval Fonfria et al13 N (506σ= 033) SD N (211σ= 05) GammaFeretti et al14

Table A3 Mean of epidemiological parameters all from meta-analysis13 and distributional forms

Delay type Reported mean Prior on mean Distributionaland 95 CI form of delay

Incubation period 579 (548ndash611) logmicrosimN (153σ= 0051) Lognormal15

Onset to death 1671 (1537ndash1817) N (1671σ= 075) Gamma17

Onset to case confirmation 582 (474ndash715) N (582σ= 068) Negative binomial16

Generation interval 506 (450ndash570) N (506σ= 033) Gamma14

Table A4 Standard deviation or dispersion of epidemiological parameters

Delay type Source Reported standard deviation or Prior on standarddispersion with uncertainty deviation or

dispersionIncubation period Lauer et al15 SD of log delay SD of log delay

048 (95 CI 027ndash054) N (048σ= 00759)Onset to death Linton et al17 SD 69 (95 CI 52ndash91) N (69σ= 1122)Onset to case confirmation Cereda et al16 Dispersion 157 plusmn 0054 N (157σ= 0054)Generation interval Feretti et al14 SD 211 plusmn 05 (reproduced by us) N (211σ= 05)

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking its predictions on unseendata even if prediction is not the purpose of the model1018 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries we have strong reason to doubt itseffectiveness estimates However we do not expect NPI effectiveness models to extrapolateperfectly Almost always unobserved factors such as changes in the ascertainment rate orIFR spontaneous behaviour changes or unobserved NPIs will affect the observed numberof cases and deaths Our models ought to treat these factors as noise and not attribute theireffects on Rt to the observed NPIs

We use Leave-One-Out Cross-Validation (LOO-CV) fitting the model on 40 countries andextrapolating to the excluded country We repeat this process for all 41 countries In theexcluded country the first 14 days of case and death data are observed to estimate R0c aswell as the initial outbreak sizes N (C )

0c and N (D)0c All later days are unobserved (masked)

The model uses the effectiveness estimates inferred from the 40 included countries and NPIactivation dates to predict the course of the pandemic in the excluded country

Figure B7 visually shows extrapolations obtained from LOO-CV for a randomly chosensubset of six countries (all countries are shown in Figures C18 to C21) The predictiveaccuracy of the model as expected degrades with an increasing prediction horizon sug-gesting the presence of unobserved factors However the predictive credible intervals arewide and increase over time as noise in the growth rate accumulates Importantly ourmodel makes calibrated forecasts in excluded countries (Fig B8) over periods of up to 2months

These are challenging predictions to the best of our knowledge no related study analyseshow their estimated NPI effects generalise to unseen countries Most related studies donot validate predictions on unseen data at all19ndash23 reviewed in9 Flaxman et al1 hold outthe last 14 days from 20 April to 4th of May for all countries in aggregate However thecountries they study implemented all NPIs in March or earlier (Supplementary Table 2 in1)meaning that in fact all countries were used to estimate NPI effects

Note that Fig B8 includes the 6 countries that were used to select the hyperparameter(Appendix A) However the calibration is similar if these 6 countries are excluded (Fig-ure C23)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals the extent to which results depend on uncertain parametersand modelling choices and can diagnose model misspecification and excessive collinearity

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

100

101

102

103

104

105

106

便便

Slovenia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

便 便

Greece

Figure B7 Model extrapolations on a randomly chosen subset of 6 excluded countries produced usingleave-one-out cross-validation In each excluded country the first 14 days of death and case data (soliddots) are observed the model then extrapolates to all future days within the period of analysis (emptydots) For each country we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier see Methods) The shaded areas representthe 95 credible intervals

in the data24 We vary many of the components of our model and recompute the NPIeffectiveness estimates summarised here Further analysis can be found in Appendix C

For each sensitivity analysis we quantify sensitivity using an easily interpretable measureshown above the legend of each plot the average standard deviation denoted σ Firstfor each NPI we compute the standard deviation of its median effect estimates across allexperimental conditions in the given sensitivity analysis We then average these acrossthe NPIs Since effectiveness is expressed in percentage reductions of Rt the unit of σ ispercent Zero percent corresponds to no sensitivity

We perform a multivariate sensitivity analysis on the key epidemiological priors For com-putational reasons all other sensitivity analyses are univariate

Multivariate sensitivity to main epidemiological priors The key epidemiological param-eters in our model describe the delay from infection to case-confirmation the delay from

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

0 20 40 60 80 100Credible Interval

0

20

40

60

80

100

Em

piric

al C

umul

ativ

e Fr

eque

ncy

Calibration Country Holdouts

Figure B8 Calibration in held-out countries with leave-one-out cross validation In each excludedcountry the first 14 days of death and case data are observed to estimate R0c as well as the initialoutbreak sizes the model then extrapolates to all future days within the period of analysis The plotshows the percentage of observed daily case and death counts that lie within the X credible intervalacross all countries and days The identity line represents ideal calibration

infection to death and the generation interval Our model places prior distributions overthe means of these delay distributions Since the priors already account for uncertainty inthe delays this analysis considers what would happen if our prior beliefs changed Sincethe epidemiological parameters may interact in unexpected ways we perform a multivari-ate sensitivity analysis jointly changing the mean of the prior over the mean of each delaydistribution (referred to as the prior mean of the mean below) We vary the prior means ofthe mean delays across a wide but epidemiologically plausible range of 5 values resultingin 125 different experimental conditions We present these results in two ways

First Figure B9 shows the global sensitivity of our results to the prior mean of the mean ofeach distribution individually For example for each setting of the prior mean of the meangeneration interval we show the average over the 5times5 = 25 runs with different prior meansplaced over the mean delays between infection and case confirmationdeath This is equiv-alent to placing a uniform prior across the 125 experiment conditions and marginalisingand is standard methodology for computing global sensitivity to an individual parameter

The prior means seem to mostly influence the relative effectiveness of business closuresand gathering bans Increasing the prior means of the infection-to-case-confirmationdeathmean delays results in larger effectiveness estimates for gatherings bans and smaller esti-

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

mates for business closures while increasing the prior mean of the generation interval hasthe opposite effect

Second we assess the sensitivity of our results to jointly varying the prior means This yieldsσ= 246 We present this result graphically in Appendix C1

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Infection to Case-Confirmation Delay= 112Mean 792 daysMean 942 daysMean 1092 daysMean 1242 daysMean 1392 days

-25 0 25 50 75Average reduction in Rtin the context of our data

Infection to Death Delay= 080Mean 1782 daysMean 1982 daysMean 2182 daysMean 2382 daysMean 2582 days

-25 0 25 50 75Average reduction in Rtin the context of our data

Generation Interval= 149Mean 306 daysMean 406 daysMean 506 daysMean 606 daysMean 706 days

Figure B9 Sensitivity to key epidemiological parameter choices evaluated using a multivariate analysisWe jointly vary the prior means over the mean of the generation interval the delay between infectionand case confirmation and the delay between infection and death Each panel displays sensitivity to theprior mean of one distribution and the NPI effects are computed by averaging over the 25 runs withdifferent prior means of the two other distributions (see text)

Sensitivity to country exclusions Figure B10 shows results if one country at a time isexcluded from the data As there is no strong justification for including or excluding anyparticular country results ought to be stable if a country is excluded

-25 0 25 50 75

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

= 151DefaultAlbaniaAndorraAustriaBelgiumBosnia andHerzegovinaBulgariaCroatia

-25 0 25 50 75

= 151DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-25 0 25 50 75

= 151DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

= 151DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-25 0 25 50 75Average reduction in Rtin the context of our data

= 151DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-25 0 25 50 75Average reduction in Rtin the context of our data

= 151DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure B10 Sensitivity to excluding individual countries

Sensitivity to case-undercounting Figure B11 shows results when scaling the recordednumber of cases in each country First we correct the number of reported cases on each day

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

using estimates of the time-varying ascertainment rate from Golding et al25 For example ifthe estimated ascertainment rate in country c at time t is 50 we double the correspondingnumber of reported cases With this altered case data the model estimates a larger effectfor closing schools and universities and a smaller effect for limiting gatherings to 1000people or less There is little change in the effects estimated for the other NPIs

Second we randomly either double or halve the reported number of cases in each countryThis is intended to demonstrate that the model is invariant to constant differences in ascer-tainment between countries As expected there are negligible differences compared to thedefault results

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Case Undercounting= 163

Time-Varying ScalingRandom Constant ScalingDefault

Figure B11 Sensitivity to case undercounting considering both constant undercounting and time-varying undercounting

Sensitivity to structurally different models Figure B12 shows the NPI effectivenessestimates from models that use only cases or deaths as observations in contrast to ourmain model which uses both As expected there are some differences in the NPI effectestimates between the models which may be caused by the unique biases present in casedata and in death data Differences could also occur if NPIs differentially affected casesand deaths (eg if they affect different age groups) Nevertheless the studyrsquos qualitativeconclusions as outlined in the Discussione hold across the different data types

A number of implicit structural assumptions are made by the choice of model structureWe test sensitivity to these assumptions by evaluating NPI effectiveness estimates from al-ternative models reproducing the structural sensitivity analysis from our concurrent workwhere these models are described and compared in detail9 While these alternative modelstructures are also plausible we chose the model described in the main text as it is relativelysimple whilst also being highly robust9

eThe main qualitative conclusions as outlined in the Discussion are Stay-at-home orders had a small effectMandating mask-wearing had a small effect School and university closures had a large effect Gatherings banswere effective More strict gathering bans were more effective than less strict ones Business closures wereeffective Closing most nonessential businesses had limited benefit over closing just high-risk businesses

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Data Type

= 347Only Case DataOnly Death DataDefault

Figure B12 Sensitivity to different data types ie models using only confirmed cases only deaths orboth

The models are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting Rt into a daily growth rate a renewalprocess is used as the infection model as in earlier work1826ndash28 For computationalreasons we do not place a prior distribution over the generation interval parametersfor this model

3 Noisy-R Model The noise terms ε(middot)ct affect Rt rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on Rt The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

As Figure B13 (left) shows the multiplicative effect models support almost all conclusionsdrawn in the Discussione The only exception is that the stay-at-home order NPI is cate-gorised as moderately effective under the Different Effects Model (median reduction in Rt

of 18)

The results of the Additive Effects Model cannot be directly compared to the other modelssince it expresses results on a different scale (percentage reduction in R0 instead of Rt ) Itis therefore shown in a separate panel However the trend in the results is visually similarto the default results

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs which were implemented nor directly measures broaderbehavioural changes Since these factors influence Rt we must be wary of their effectbeing attributed to observed NPIs We investigate this further by assessing how much effec-

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

Structural Sensitivity Multiplicative Models= 311

Noisy-RDiscrete Renewal

Different EffectsDefault

-25 0 25 50 75Average reduction in Rtas a percentage of R0

in the context of our data

Structural Sensitivity Additive Model

Figure B13 Structural sensitivity analysis NPI effectiveness estimates under different structural as-sumptions Left Effectiveness estimates for multiplicative effect models Note that for computationalreasons the discrete renewal model does not have a prior over the generation interval Right Effective-ness estimates for an additive effect models For this model the effectiveness of each NPI is expressedas a reduction of R0 rather than Rt

tiveness estimates change when previously unobserved factors are included and also whenobserved factors are excluded This is best practice for addressing unobserved factors suchas confounders2930

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs31 The timings of our observed NPIsrsquo implementation dates are indeed mutuallycorrelated prompting the question of how much results change when we make previouslyobserved NPIs unobserved Figure B14 (left) shows NPI effectiveness estimates when pre-viously observed NPIs are excluded in turn The studyrsquos main qualitative conclusionse holdacross all experimental conditions and the variations in NPI effectiveness are rarely largeenough to cause a change in effectiveness category (Figure 5) except for the Gatheringslimited to 1000 people or less NPI Considering that some of the excluded NPIs have strongestimated effects when included and are correlated with other NPIs this degree of robust-ness is encouragingly high It suggests that unobserved factors will not significantly biasresults as long as their effects and their correlations with the studied NPIs do not exceedthose of the studied NPIs We hypothesise that this robustness to unobserved factors is dueto the noise on the daily growth rate in our model9

In addition Figure B14 (right) shows NPI effectiveness estimates when we observe (iecontrol for) additional NPIs taken from the OxCGRT dataset32

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Inclusion of OxCGRT NPIs= 153

Travel ScreenQuarantineTravel BansSymptomatic TestingPublic Information CampaignsInternal Movement LimitedPublic Transport LimitedDefault

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

Exclusion of Collected NPIsUncombined Effects Shown

= 188Mask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay Home OrderDefault

Figure B14 Robustness to unobserved effects Left Results when excluding previously observed NPIsWe exclude one of the NPIs in turn and show the estimates for the other NPIs Note that this subplotshows the marginal effect for hierarchical NPIs rather than the total effect different to other figuresthat show cumulative effects for gathering bans and businesses closures For example Figure 3 displaysthe total effect of closing most nonessential businesses while here we show the additional effect ofclosing most nonessential businesses over just closing some high-risk businesses We show the additionaleffects here because the effect of a cumulative intervention would become undefined when part of it isexcluded from the analysis Right Results when controlling for previously unobserved NPIs We includeone additional NPI in turn and show the estimates for the NPIs in our study (the additional NPI is notshown)

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Appendix C1 Multivariate Sensitivity Analysis - Expanded Results

We now present the results of our joint multivariate sensitivity analysis (previously sum-marised in Figure B9) in which we vary the mean of the prior (ldquoprior meanrdquo) over themean of the generation interval the delay between infection and death and the delaybetween infection and case confirmation

Figure C15 shows for each NPI how NPI effects change when all prior means are variedsimultaneously

We find that the most sensitive NPIs are Gatherings limited to 1000 people or fewer Gath-erings limited to 100 people or fewer and Most nonessential businesses closed However thelargest differences appear when all of the parameters are set to the most extreme valuesthat jointly produce a change in a particular direction We also see systematic trends asthe parameters are varied eg increasing the prior mean of the generation interval meantends to decrease the effectiveness of the Gatherings limited to 1000 or fewer NPI while in-creasing prior means of the infection to deathcase confirmation distribution means tendsto increase the effectiveness of this NPI

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

792

942

1092

1242

1392

Mas

kWea

ring

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

GI = 306

-150

-75

00

75

150GI = 406

-150

-75

00

75

150GI = 506

-150

-75

00

75

150GI = 606

-150

-75

00

75

150GI = 706

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

00

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

0

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Som

eBus

s

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Mos

tBus

s

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Educ

at

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

792

942

1092

1242

1392

Stay

Hom

e

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

Figure C15 Global sensitivity analysis Each cell shows the difference between the median effectivenessestimate under a specific experimental condition and the default condition where the estimates areexpressed as percentage reduction in Rt Each subplot represents a particular prior mean of the meangeneration interval and an NPI The NPIs vary top to bottom and the generation interval prior meanincreases left to right Each cell with a subplot represents a particular choice of the prior mean of themean infection to death delay (increasing left to right) and the prior mean of the mean infection to caseconfirmation delay (increasing bottom to top) Positive cell values indicate that the median NPI estimateis larger than with default settings and vice versa

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C2 Sensitivity to additional epidemiological assumptions

For completeness Figure C16 shows sensitivity to two further epidemiological assumptionsOn the left we show the dependence of NPI effectiveness estimates on R0 We place aNormal prior over R0 and a hyperprior over its variance reflecting the wide disagreementof regional estimates of R0 In the sensitivity analysis we vary the mean of the prior froma low value (238) to a high one (428) As expected higher mean R0 values result in largerNPI effects This trend is apparent in all NPIs but stronger for NPIs that tended to beimplemented early on (like school closures and gathering bans)

On the right we vary the prior over NPI effectiveness Our default prior on NPI effectivenessis assymmetric reflecting a belief that NPIs are more likely to reduce Rt than increase itHere we additionally test a symmetric Normal prior and a one-sided Half-Normal priorwhich does not allow for NPIs to increase Rt

Appendix C3 Sensitivity to data preprocessing

When the case count is small a large fraction of cases may be imported from other coun-tries and the testing regime may change rapidly To prevent this from biasing our modelwe neglect case numbers before a country has reached 100 confirmed cases Similarly weneglect death numbers before a country has reached 10 deaths Here we vary these thresh-olds Intuitively the minimum cases threshold should be higher than the minimum deathsthreshold

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

R0 prior= 333

[R] = 238[R] = 278

Default[R] = 328[R] = 378[R] = 428

-25 0 25 50 75Average reduction in Rtin the context of our data

NPI Effectiveness Prior= 137

i Normal(0 022)i Half Normal(022)

Default

Figure C16 Sensitivity to additional epidemiological priors Left Sensitivity to the prior mean on R0Right Sensitivity to the NPI effectiveness prior

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Minimum Cases Threshold= 137

3050Default (100)200300

-25 0 25 50 75Average reduction in Rtin the context of our data

Minimum Deaths Threshold= 102

15Default (10)3050

Figure C17 Data preprocessing sensitivity Left Variations in the minimum number of cases RightVariations in the minimum number of deaths

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Validation by predicting unseen data - all countries

Here we show the country-level plots of the single-country holdout experiments from Ap-pendix B1 We use leave-one-out cross-validation fitting the model on 40 countries andshowing its predictions on the excluded country We repeat this process for all 41 countriesIn the excluded country the first 14 days (filled dots) of death and case data are observedto allow roughly inferring R0 These days are not used to infer NPI effectiveness All laterdays are masked (unfilled dots) The activation dates of NPIs are also given and the modeluses the effectiveness estimates inferred from the 40 other countries

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

北便

France

Figure C18 Predictions on excluded countries

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

北 便

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

Malta

Figure C19 Predictions on excluded countries

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

Slovenia

Figure C20 Predictions on excluded countries

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Switzerland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

United Kingdom

Figure C21 Predictions on excluded countries Vertical lines show the activation (or inactivation) datesof NPIs Shaded areas are 95 credible intervals Blue and red dots show the observed confirmed casesand deaths while blue and red lines show the median model estimates of cases (Ct ) and deaths (Dt )Empty dots are not observed by the model For each country we show the full window of analysis (fromthe start of the epidemic until the first NPI was lifted or the 30th of May 2020 whichever was earliersee Methods)

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Posterior predictive distributions

The posterior predictive distribution (Figure C22) shows the predicted number of cases anddeaths after observing the data Although these curves can be called lsquofitsrsquo the degree of fitto the data must be interpreted with great care The fit is generally tight but this is partlydue to the inferred latent noise variables ε(C )

t and ε(D)t Inferring this latent noise allows

the posterior predictive distribution to closely match the data without overfitting the effec-tiveness parameters to the data Such behavior is common in Bayesian models which oftenperfectly interpolate the data without overfitting33 The noise terms can account for peri-ods where infections grew faster or slower than predicted based solely on the active NPIsIn such periods the noise may account for changes in testing reporting and unobservedinterventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Daily DeathsDaily Confirmed Cases

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C22 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 Calibration without countries used for hyperparameter selec-tion

0 20 40 60 80 100Credible Interval

0

20

40

60

80

100E

mpi

rical

Cum

ulat

ive

Freq

uenc

y

Calibration Country Holdouts

Figure C23 Calibration in held-out countries with leave-one-out cross-validation Here we include onlythose 35 countries that were not used to select the hyperparameter We show the first 14 days of casesand deaths in each country and then extrapolate to future days within the period of analysis The plotshows the percentage of observed daily case and death counts that lie within the X credible intervalacross all countries and days

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 MCMC stability results

1000 1002 10040

1000

2000

3000

4000

R

0 1 2 30

500

1000

1500

2000

Relative ESS

Figure C24 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size A value of 1 indicates perfect decorrelation between samplesValues above (below) 1 indicate that the effective number of samples is higher (lower) than the actualnumber of samples due to negative (positive) correlation respectively

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D5 Estimated values for R0 by country The parenthesis give the 95 credible interval which isoften wide The mean R0 across countries is 33

Country Estimated R0 Country Estimated R0

Albania 348 (275431) Lithuania 323 (248403)Andorra 275 (21434) Malaysia 296 (236361)Austria 31 (25377) Malta 327 (248414)Belgium 36 (302427) Mexico 399 (32748)Bosnia and Herzegovina 331 (259406) Morocco 359 (299427)Bulgaria 375 (304457) Netherlands 316 (26375)Croatia 342 (27418) New Zealand 241 (17731)Czech Republic 346 (276425) Norway 273 (215338)Denmark 28 (22347) Poland 397 (32482)Estonia 291 (229361) Portugal 355 (292425)Finland 279 (221343) Romania 401 (335475)France 331 (28389) Serbia 38 (308461)Georgia 356 (281439) Singapore 295 (238356)Germany 285 (231346) Slovakia 353 (271445)Greece 321 (261388) Slovenia 299 (23373)Hungary 403 (332482) South Africa 447 (377527)Iceland 171 (113242) Spain 36 (306423)Ireland 368 (309433) Sweden 229 (176288)Israel 379 (309459) Switzerland 289 (234351)Italy 336 (288391) United Kingdom 32 (267378)Latvia 301 (233376)

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 Collinearity

Appendix D21 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the inclusion or exclusion ofthese countries in the dataset (Figure D25) If we included all countries we would con-clude that university closures were more effective than school closures (black markers inFigure D25) However if we excluded Iceland or Sweden we would conclude that theywere roughly equally effective As there is no strong justification for including or excludingone particular country we cannot meaningfully disentangle the effects of school and uni-versity closures However there is much more data to determine the joint effect and it isindeed much more stable (Figure D25)

-25 0 25 50 75 100Average reduction in Rtin the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D25 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown

Appendix D22 Co-occurrence of NPIs

Table D6 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or neither) Note thatwe do not show the traditional collinearity statistics (variance inflation factors and datacorrelations) since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D6 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D23 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations24 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus042 between closing schools anduniversities and closing some businesses (Figure D26) The weak correlations are oneindicator that collinearity is manageable with our dataset

Effect on NPI combinations To better understand posterior correlations we visualizetheir effect in hosted video files As we condition on different values for one NPI we cansee that the estimates of other NPIs change only slightly always staying well within thecredible intervals in Figure 3 The significance of posterior correlations is small enough thatit is possible to calculate a reasonable approximation to the mean effect of a set of NPIs bysimply combining the mean percentage reductions for each individual NPI (eg two 50reductions lead to a 75 reduction) For example this approximation leads to a joint effectof 75 for all NPIs together which closely matches the exact mean joint effect of 77

Videos are available online here

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D26 Posterior correlations between effectiveness parameters αi

Appendix D3 Posterior Epidemiological Parameter Distributions

Figure D27 shows the posterior distributions for variables describing the key delay distribu-tions in our model the generation interval the delay between infection and case confirma-tion and the delay between infection and death We find that the posterior distributions ofthese parameters are somewhat tighter than their priors but they still explore a wide rangeof values This suggests that the data provides evidence albeit weak about these delaydistributions (for example from visual inspection the data would show that the delay islonger for deaths than for cases) An exception is the dispersion of the infection to deathdelay distribution which has a very wide prior

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

5 6

0

1

dens

ity

Generation Interval Mean

posteriorprior

200 225

00

05

Infection to Death Delay Mean

100 125

00

05

Infection to Case-Confirmation Delay Mean

0 2 4

value

00

05

dens

ity

Generation Interval std

10 20

value

00

01

Infection to Death Delay disp

5 6

value

0

1

Infection to Case-Confirmation Delay disp

Figure D27 Posterior distributions over key epidemiological parameters describing the generation in-terval and the delays between infection and case confirmationdeath

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect morethan three-quarters of the population) We thus miss NPIs which were only implementedregionally For example a few regions in Germany implemented stay-at-home orders butmost did not Thus Germany is listed as no stay-at-home order in our data Additionallyour NPI definitions were not perfectly granular For example a gathering ban on gatheringsof gt15 people and a ban on gatherings of gt60 people would both fall under the NPIGatherings limited to 100 people or less despite likely having different effects on Rt Finally while we included more NPIs than previous work (Table F7) there are many NPIsfor which we were not able to collect enough high-quality data for our modeling such aspublic cleaning or changes to public transportation

Of the 41 countries in our dataset 33 are in Europe As a result the NPI effectivenessestimates may be biased towards effects in Europe and NPI effectiveness may have beendifferent in other parts of the world

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on Rt is constantacross countries and time However the exact implementation and adherence of each NPIis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa limited degree Additionally different countries have different cultural norms and ageprofiles affecting the degree to which a particular intervention is effective For example acountry where a higher proportion of the population is in education will likely experience alarger effect from a government order to close schools and universities Our estimates thusshould be adjusted to local circumstances To address differences between countries ourstructural sensitivity analysis includes a model where each NPI can have a different effectper country (Appendix B2) The average effectiveness estimates across countries in thismodel match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponential de-cline would have the same effect on observations as an unobserved NPI Consequently wecannot quantify its effect on our results (though the sensitivity analyses look reassuring)

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expectthat closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g t and the reproductive number Rt holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso report results from a renewal process model8 that does not depend on this assumptionand we find similar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix F Overview of previous work

Table F7 Data-driven multi-country multi-NPI studies of the effectiveness of observed (as opposed tohypothetical) NPIs in reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Banholzer etal 202023

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Chen andQiu 202021

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Islam et al202020

School closuresworkplace closuresrestrictions on massgatherings publictransport closure

lockdown

149 countries or regionsInterrupted timeseries regression

Liu et al202022

13 NPIs from theOXCGRT dataset

130 countries and regions panel regression

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F8 Some data-driven studies of the effectiveness of observed NPIs in reducing the transmissionof COVID-19 which study a single-country andor a single-NPI

Study NPIs studiedRegionscountries

studiedMethod

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dandekarand

Barbastathis202035

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Dehning etal 202036

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Gatto et al202037

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Hsiang et al202019

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US (each

country modelledindividually)

Linear regression onestimated growth

rates

Jarvis et al202038

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelKraemer etal 202039

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202040 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) model

Lai et al202041

Case detection andisolation travel

restrictions contactreductions

ChinaTravel-network-based

SEIR model

Continued on next page

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F8 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Lorch et al202042

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Maier andBrockmann

202043

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Orea andAacutelvarez202044

Lockdown SpainSpatial econometric

analysis

Quilty et al202045

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

Sears et al202046

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Siedner etal 202047

General socialdistancing

USInterruptedtime-series

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F7) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

61

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

62

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all ldquotrade activitiesrdquo (except within shopping malls) service jobsthat donrsquot involve physical contact museums libraries and galleries opened Sincethe criteria regarding ldquoall retail stores being openrdquo was met we counted April 27 asthe end date for ldquoMost nonessential businesses closedrdquo

63

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H All model equations

This section will mainly be of interest to readers that wish to re-implement the model

Variables are indexed by NPI i country c and day t All prior distributions are independent

Data

1 NPI Activations φi t c isin 012 Observed (Daily) Cases Ct c 3 Observed (Daily) Deaths D t c

Prior Distributions

1 Country-specific R0 R0c sim Normal(325κ) κsim Half Normal(micro= 0σ= 05)2 NPI effectiveness αi sim Asymmetric Laplace(m = 0κ= 05λ= 10) m is the location

parameter κgt 0 is the asymmetry parameter and λgt 0 is the scale parameter3 Infection Initial Counts

N (C )0c = exp(ζ(C )

c )

N (D)0c = exp(ζ(D)

c )

ζ(C )c sim Normal(micro= 0σ= 50)

ζ(D)c sim Normal(micro= 0σ= 50)

4 Observation Noise Dispersion Parameters

Ψcases sim Half Normal(micro= 0σ= 5) (H1)

Ψdeaths sim Half Normal(micro= 0σ= 5) (H2)

Hyperparameters

1 Growth Noise Scale σg = 02

Delay Distributions

1 Generation interval distribution1314

microGI sim Normal(micro= 506σ= 03265)

σGI sim Normal(micro= 211σ= 05)

64

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

2 Time from infection to case confirmation T (C )131516f

microinfrarrconf sim Normal(micro= 1092σ= 094)

Ψinfrarrconf sim Normal(micro= 541σ= 027)

This distribution is converted into a forward-delay vector

T (C )[t ] =

1ZC

Negative Binomial(t micro=microinfrarrconfα=Ψinfrarrconf) t lt 32

0 otherwise

with ZC =31sum

t prime=0Negative Binomial(t primemicro=microinfrarrconfα=Ψinfrarrconf)

ie the delay follows a truncated and normalised negative binomial distribution3 Time from infection to death T (D)f131517

microinfrarrdeath sim Normal(micro= 2182σ= 101)

Ψinfrarrdeath sim Normal(micro= 1426σ= 518)

This distribution is converted into a forward-delay vector

T (D)[t ] =

1ZD

Negative Binomial(t micro=microinfrarrdeathα=Ψinfrarrdeath) t lt 48

0 otherwise

with ZD =47sum

t prime=0Negative Binomial(t primemicro=microinfrarrdeathα=Ψinfrarrdeath)

ie the delay follows a truncated and normalised negative binomial distribution

Infection Model

Rt c = R0c middotexp

(minus

Isumi=1

αi φi t c

) where I is the number of NPIs

αGI =microGI

σ2GI

βGI =micro2

GI

σ2GI

g t c = exp

(βGI(R

1αGIct minus1)

)minus1

fα in the definition of the Negative Binomial distribution is the dispersion parameter Larger values of αcorrespond to a smaller variance and less dispersion With our parameterisation the variance of the Negative

Binomial distribution is micro+ micro2

α

65

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

]

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise

ε(C )τc sim Normal(micro= 0σ=σg )

ε(D)τc sim Normal(micro= 0σ=σg )

Observation Modelf

Ct c =31sumτ=0

N (C )tminusτcT

(C )[τ]

D t c =47sumτ=0

N (D)tminusτcT

(D)[τ]

Ct c sim Negative Binomial(micro= Ct c α=Ψcases)

D t c sim Negative Binomial(micro= D t c α=Ψdeaths)

66

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix I References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Sam Abbott et al ldquoEstimating the time-varying reproduction number of SARS-CoV-2using national and subnational case countsrdquo In Wellcome Open Research 5112 (2020)p 112

3 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

4 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

5 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

6 Sheikh Taslim Ali et al ldquoSerial interval of SARS-CoV-2 was shortened over time bynonpharmaceutical interventionsrdquo In Science 3696507 (2020) pp 1106ndash1109

7 Xingjie Hao et al ldquoReconstruction of the full transmission dynamics of COVID-19 inWuhanrdquo In Nature 5847821 (Aug 2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the robustness of effectiveness estimation of nonpharma-ceutical interventions against COVID-19 transmissionrdquo In arXiv preprint arXiv200713454(2020) URL httpsarxivorgabs200713454

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

12 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

13 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

14 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

67

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

15 Stephen A Lauer et al ldquoThe incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases estimation and applicationrdquo In Annals ofinternal medicine 1729 (2020) pp 577ndash582

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

19 Solomon Hsiang et al ldquoThe effect of large-scale anti-contagion policies on the COVID-19 pandemicrdquo In Nature 5847820 (2020) pp 262ndash267

20 Nazrul Islam et al ldquoPhysical distancing interventions and incidence of coronavirusdisease 2019 natural experiment in 149 countriesrdquo In bmj 370 (2020)

21 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo In Covid Economics 7 (Apr 2020) pp 46ndash67

22 Yang Liu et al ldquoThe impact of non-pharmaceutical interventions on SARS-CoV-2 trans-mission across 130 countries and territoriesrdquo In medRxiv (2020)

23 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

24 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

25 Nick Golding et al ldquoReconstructing the global dynamics of under-ascertained COVID-19 cases and infectionsrdquo In medRxiv (2020) DOI 1011012020070720148460eprint httpswwwmedrxivorgcontentearly202007082020070720148460fullpdf URL httpswwwmedrxivorgcontentearly202007082020070720148460

26 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

27 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

28 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

68

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

29 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

30 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

31 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

32 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

33 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

36 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

37 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

38 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

39 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

40 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

41 Shengjie Lai et al ldquoEffect of non-pharmaceutical interventions to contain COVID-19 inChinardquo en In Nature 5857825 (Sept 2020) pp 410ndash413 DOI 101038s41586-020-2293-x

69

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

42 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

43 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

44 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

45 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

46 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

47 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

70

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Testing reporting and infection fatality rates
      • Method for uncertainty over key epidemiological parameters
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Multivariate Sensitivity Analysis - Expanded Results
              • Sensitivity to additional epidemiological assumptions
              • Sensitivity to data preprocessing
              • Validation by predicting unseen data - all countries
              • Posterior predictive distributions
              • Calibration without countries used for hyperparameter selection
              • MCMC stability results
                • Additional results
                  • Estimated R0 by country
                  • Collinearity
                    • The individual effects of school and university closures
                    • Co-occurrence of NPIs
                    • Correlations between effectiveness estimates
                      • Posterior Epidemiological Parameter Distributions
                        • Additional discussion of assumptions and limitations
                          • Limitations of the data
                          • Model limitations
                            • Overview of previous work
                            • Handling edge cases in the data collection
                            • All model equations
                            • References
Page 7: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan

Data collection

We collected data on the start and end date of NPI implementations from the start of thepandemic until the 30th of May 2020 Before collecting the data we experimented withseveral public NPI datasets finding that they were not complete enough for our modellingand contained incorrect datesc By focusing on a smaller set of countries and NPIs than thesedatasets we were able to enforce strong quality controls We used independent doubleentry and manually compared our data to public datasets for cross-checking

First two authors independently researched each country and entered the NPI data into sep-arate spreadsheets The researchers manually researched the dates using internet searchesthere was no automatic component in the data gathering process The average time spentresearching each country per researcher was 15 hours

Second the researchers independently compared their entries to the following public datasetsand if there were conflicts visited all primary sources to resolve the conflict the EFGNPIdatabase14 the Oxford COVID-19 Government Response Tracker15 and the mask4all dataset16

Third each country and NPI was again independently entered by one to three paid con-tractors who were provided with a detailed description of the NPIs and asked to includeprimary sources with their data A researcher then resolved any conflicts between this dataand one (but not both) of the spreadsheets

Finally the two independent spreadsheets were combined and all conflicts resolved by aresearcher The final dataset contains primary sources (government websites andor mediaarticles) for each entry

Data Preprocessing

When the case count is small a large fraction of cases may be imported from other countriesand the testing regime may change rapidly To prevent this from biasing our model weneglect case numbers before a country has reached 100 confirmed cases and death numbers

cWe evaluated the following datasets

bull Epidemic Forecasting Global NPI Database14

bull Oxford COVID-19 Government Response Tracker (OxCGRT)15

bull ACAPS COVID19 Government Measures Dataset

Note that these datasets are under continuous development Many of the mistakes found will already havebeen corrected We know from our own experience that data collection can be very challenging We havethe fullest respect for the people behind these datasets In this paper we focus on a more limited set ofcountries and NPIs than these datasets contain allowing us to ensure higher data quality in this subset Givenour experience with public datasets and our data collection we encourage fellow COVID-19 researchers toindependently verify the quality of public data they use if feasible

7

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

before a country has reached 10 deaths We include these thresholds in our sensitivityanalysis (Appendix C3)

Short model description

For each country c

For each day t

New infections N (sdot)tc

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Mask wearing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closedSchools closed

Stay-at-home order

Growth reductions from interventions αi i

Daily growth rate gtc

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Uncertain delay from infection to

confirmation death

Uncertain generation interval

Universities closed

Figure 2 Model Overview Purple nodes are observed We describe the diagram from bottom to topThe effectiveness of NPI i is represented by αi which is independent of the country On each day t a countryrsquos daily reproduction number Rt c depends on the countryrsquos basic reproduction number R0cand the active NPIs The active NPIs are encoded by Φi t c which is 1 if NPI i is active in country c attime t and 0 otherwise Rt c is transformed into the daily growth rate gt c using the generation intervalparameters and subsequently is used to compute the new infections N (C )

t c and N (D)t c that will turn into

confirmed cases and deaths respectively Finally the number of new confirmed cases Ct c and deathsDt c is computed by a discrete convolution of N (middot)

t c with the respective delay distributions Our modeluses both death and case data it splits all nodes above the daily growth rate gt c into separate branchesfor deaths and confirmed cases We account for uncertainty in the generation interval infection-to-case-confirmation delay and the infection-to-death delay by placing priors over the parameters of thesedistributions

In this section we give a short summary of the model (Figure 2) The detailed modeldescription is given in Appendix A Code is available online here

Our model uses case and death data from each country to lsquobackwardsrsquo infer the number ofnew infections at each point in time which is itself used to infer the reproduction numbersNPI effects are then estimated by relating the daily reproduction numbers to the activeNPIs across all days and countries This relatively simple data-driven approach allows usto sidestep assumptions about contact patterns and intensity infectiousness of different age

8

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

groups and so forth that are typically required in modelling studies We make several addi-tions to the semi-mechanistic Bayesian hierarchical model of Flaxman et al1 allowing ourmodel to observe both cases and death data This increases the amount of data from whichwe can extract NPI effects reduces distinct biases in case and death reporting and reducesthe bias from including only countries with many deaths Since epidemiological parametersare only known with uncertainty we place priors over them following recent recommendedpractice17 Additionally as we do not aim to infer the total number of COVID-19 infectionswe can avoid assuming a specific infection fatality rate (IFR) or ascertainment rate (rate oftesting)

The growth of the epidemic is determined by the time- and country-specific reproductionnumber Rt c which depends on a) the (unobserved) basic reproduction number R0c givenno active NPIs and b) the active NPIs at time t R0c accounts for all time-invariant factorsthat affect transmission in country c such as differences in demographics population den-sity culture and health systems18 We assume that the effect of each NPI on Rt c is stableacross countries and time The effectiveness of NPI i is represented by a parameter αi over which we place an Asymmetric Laplace prior that allows for both positive and negativeeffects but places 80 of its mass on positive effects reflecting that NPIs are more likely toreduce Rt c than to increase it Following Flaxman et al and others168 each NPIrsquos effecton Rt c is assumed to independently affect Rt c as a multiplicative factor

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (1)

where φi ct = 1 indicates that NPI i is active in country c on day t (φi ct = 0 otherwise) andI is the number of NPIs The multiplicative effect encodes the plausible assumption thatNPIs have a smaller absolute effect when Rt c is already low We discuss the meaning ofeffectiveness estimates given NPI interactions in the Results section

In the early phase of an epidemic the number of new daily infections grows exponentiallyDuring exponential growth there is a one-to-one correspondence between the daily growthrate and Rt c

19 The correspondence depends on the generation interval (the time betweensuccessive infections in a chain of transmission) which we assume to have a Gamma dis-tribution The prior on the mean generation interval has mean 506 days derived from ameta-analysis20

We model the daily new infection count separately for confirmed cases and deaths repre-senting those infections which are subsequently reported and those which are subsequentlyfatal However both infection numbers are assumed to grow at the same daily rate in ex-pectation allowing the use of both data sources to estimate each αi The infection numberstranslate into reported confirmed cases and deaths after a stochastic delay The delay is thesum of two independent distributions assumed to be equal across countries the incuba-tion period and the delay from onset of symptoms to confirmation We put priors over themeans of both distributions resulting in a prior over the mean infection-to-confirmationdelay with a mean of 1092 days20 see Appendix A3 Similarly the infection-to-death

9

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

delay is the sum of the incubation period and the delay from onset of symptoms to deathand the prior over its mean has a mean of 218 days20 Finally as in related NPI models16both the reported cases and deaths follow a negative binomial noise distribution with aninferred noise dispersion

Using a Markov chain Monte Carlo (MCMC) sampling algorithm21 this model infers pos-terior distributions of each NPIrsquos effectiveness while accounting for cross-country variationsin testing reporting and fatality rates as well as uncertainty in the generation interval anddelay distributions To analyse the extent to which modelling assumptions affect the re-sults our sensitivity analysis includes all epidemiological parameters prior distributionsand many of the structural assumptions introduced above (Appendix B2 and AppendixC) MCMC convergence statistics are given in Appendix C7

Results

NPI Effectiveness

Our model enables us to estimate the individual effectiveness of each NPI expressed as apercentage reduction in R As in related work168 this percentage reduction is modelledas constant over countries and time and independent of the other implemented NPIs Inpractice however NPI effectiveness may depend on other implemented NPIs and localcircumstances Thus our effectiveness estimates ought to be interpreted as the effectivenessaveraged over the contexts in which the NPI was implemented in our data10 Our results thusgive the average NPI effectiveness across typical situations that the NPIs were implementedin Figure 3 (bottom left) visualises which NPIs typically co-occurred aiding interpretation

Under the default model settings the mean percentage reduction in R (with 95 credibleinterval) associated with each NPI is as follows (Figure 3) mandating mask-wearing in(some) public spaces -1 (-13ndash8) limiting gatherings to 1000 people or less 13(-3ndash31) to 100 people or less 28 (9ndash44) to 10 people or less 36 (17ndash53) closing some high-risk businesses 20 (0ndash40) closing most nonessential busi-nesses 29 (8ndash47) closing schools and universities 41 (23ndash56) and issuingstay-at-home orders (with exemptions) 10 (-2ndash22) Note that we cannot robustlydisentangle the individual effects of closing schools and closing universities since the im-plementation dates of these NPIs coincided nearly perfectly in all countries except Icelandand Sweden (Appendix D21) We thus show the joint effect of closing both schools anduniversities and treat schools and universities closed as one single NPI going forward

Some NPIs frequently co-occurred ie were partly collinear However we are able to iso-late the effects of individual NPIs since the collinearity is imperfect and our dataset is largeFor every pair of NPIs we observe one of them without the other for 748 country-days onaverage (Appendix D22) The minimum number of country-days for any NPI pair is 143(for limiting gatherings to 1000 or 100 attendees) Additionally under excessive collinear-ity and insufficient data to overcome it individual effectiveness estimates are highly sensi-

10

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75 100Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spaces

Gatherings limited to1000 people or less

Gatherings limited to100 people or less

Gatherings limited to10 people or less

Some businessesclosed

Most nonessentialbusinesses closed

Schools and universitiesclosed

Stay-at-home order(with exemptions)

EndsTransmission

北 便

i

j

100

100

100 81

10

0 64

100

100 45

27

100 94

78

89

65

88

93

44

29

100

100 82

92

68

91

96

46

28

100

100

100 99

76

97

98

55

30

99

97

86

100 72

95

98

49

27

99

98

91

100

100 99

99

64

30

97

95

84

95

72

100 99

49

29

98

95

81

92

68

93

100 46

28

100

100 98

10

0 94

99

99

100

Frequency i Active Given j Active

0 500 1000 1500 2000 2500 3000

Days

Total Days Active

Figure 3 Top NPI effects under default model settings The Figure shows the average percentagereductions in R as observed in our data (or in terms of the model the posterior marginal distributionsof 1minusexp(minusαi )) with median 50 and 95 credible intervals A negative 1 reduction refers to a 1increase in R Cumulative effects are shown for hierarchical NPIs (gathering bans and business closures)ie the result for Most nonessential businesses closed shows the cumulative effect of two NPIs withseparate parameters and symbols - closing some (high-risk) businesses and additionally closing mostremaining (non-high-risk but nonessential) businesses given that some businesses are already closedBottom Left Conditional activation matrix Cell values indicate the frequency that NPI i (x-axis) wasactive given that NPI j (y-axis) was active Eg schools were always closed whenever a stay-at-homeorder was active (bottom row third column from the right) but not vice versa Bottom Right Totalnumber of days each NPI was active across all countries

11

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Most businessesclosed

Stay-at-homeorder

Mask wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businessesclosed

Schools anduniversities closed

Figure 4 Combined NPI effectiveness for the most common sets of NPIs in our data by size of the NPI setShaded regions denote 50 and 95 credible intervals Left Maximum R0 that could be reduced to be-low 1 for each set of NPIs Right Predicted R after implementation of each set of NPIs assuming R0 = 38Readers can interactively explore the effects of all sets of NPIs at httpepidemicforecastingorgcalc

tive to variations in the data and model parameters22 High sensitivity prevented Flaxmanet al1 who had a smaller dataset from disentangling NPI effects9 Our estimates are sub-stantially less sensitive (see next section) Finally the posterior correlations between theeffectiveness estimates are weak suggesting manageable collinearity (Appendix D23)

Although the correlations between the individual estimates are weak we should take theminto account when evaluating combined NPI effects For example if two NPIs frequentlyco-occurred there may be more certainty about the combined effect than about the twoindividual effects Figure 4 shows the combined effectiveness of the sets of NPIs thatare most common in our data Together our set of NPIs reduced R by 77 (74ndash79)on average Across countries the mean R without any NPIs (ie the R0) was 33 (Ta-ble D5 reports R0 for all countries) Starting from this number the estimated R likely couldhave been reduced below 1 by closing schools and universities high-risk businesses andlimiting gathering sizes Readers can interactively explore the effects of sets of NPIs athttpepidemicforecastingorgcalc A CSV file containing the joint effectiveness of all NPIcombinations is available online here

Sensitivity and validation

We perform a range of validation and sensitivity experiments (Appendix B with furtherexperiments in Appendix C) First we analyse how the model extrapolates to unseen coun-tries and find that it makes calibrated forecasts over periods of up to 2 months with un-certainty increasing over time Further we perform multiple sensitivity analyses studyinghow results change if we modify the priors over epidemiological parameters exclude coun-tries from the dataset use only deaths or confirmed cases as observations vary the datapreprocessing and more Finally we investigate our key assumptions by showing results

12

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

for several alternative models (structural sensitivity10) and examine possible confoundingof our estimates by unobserved factors influencing R In total we consider 201 alternativeexperimental conditions

Figure 5 (left) shows the median NPI effectiveness across these 201 experimental condi-tions Compared to the results under our default settings (Figures 3 and 4) median NPIeffects vary under alternative plausible experimental conditions However the trends in theresults are robust and some NPIs outperform others under all tested conditions While wetest over large ranges of plausible values our experiments do not include every possiblesource of uncertainty and the results might change more substantially under experimentalconditions we have not tested

We categorise NPI effects into small moderate and large which we define as a medianreduction in R of less than 175 between 175 and 35 and more than 35 (verticallines in Figure 5) Six of the NPIs fall into a single category in a large fraction of experi-mental conditions school and university closures are associated with a large effect in 99of experimental conditions limiting gatherings to 10 people or less in 94 Closing mostnonessential businesses has a moderate effect in 96 of conditions limiting gatherings to100 people or less in 97 Making mask-wearing mandatory in (some) public spaces fallsinto the ldquosmall effectrdquo category in 100 of experimental conditions issuing stay-at-homeorders (with exemptions) in 99 Two NPIs fall less clearly into one category Closing some(high-risk) businesses has a moderate effect in 86 of conditions and limiting gatherings to1000 people or less has a small effect in 79 However both NPIs have small-to-moderateeffects in more than 99 of experimental conditions The effect of limiting gatherings to1000 people or less is the least stable across the sensitivity analyses which may reflect itsaforementioned partial collinearity with limiting gatherings to 100 people or less

Aggregating all sensitivity analyses can hide sensitivity to specific assumptions We displaythe median NPI effects in four categories of sensitivity analyses (Figure 5 right) and eachindividual sensitivity analysis is shown in the Appendix The trends in the results are alsostable within the categories

Discussion

We use a data-driven approach to estimate the effects that eight nonpharmaceutical in-terventions had on COVID-19 transmission in 41 countries between January and the endof May 2020 We find that several NPIs were associated with a clear reduction in R inline with the mounting evidence that NPIs can be effective at mitigating and suppressingoutbreaks of COVID-19 Furthermore our results suggest that some NPIs outperformedothers While the exact effectiveness estimates vary with modelling assumptions the broadconclusions discussed below are largely robust across 201 experimental conditions in 11sensitivity analyses

13

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

0 25 50Median reduction in Rt

Mask-wearing mandatory in(some) public spacesGatherings limited to1000 people or lessGatherings limited to100 people or lessGatherings limited to10 people or lessSome businessesclosedMost nonessentialbusinesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

All Sensitivity Analyses (201 conditions)北便

Model Structure(5 conditions)

北便

Data and Preprocessing(51 conditions)

0 25 50Median reduction in Rt

北便

Epidemiological Priors(131 conditions)

0 25 50Median reduction in Rt

北便

Unobserved Factors(14 conditions)

Figure 5 Median NPI effects across the sensitivity analyses Left Median NPI effects (reduction in R)when varying different components of the model or the data in 201 experimental conditions Results aredisplayed as violin plots using kernel density estimation to create the distributions Inside the violinsthe box plots show median and interquartile-range The vertical lines mark 0 175 and 35 (seetext) Right Categorised sensitivity analyses Structural Using only cases or only deaths as observations(2 experimental conditions Figure B12) varying the model structure (3 conditions Figure B13 left)Data Leaving out one country at a time (41 conditions Figure B10) varying the threshold below whichcases and deaths are masked (8 conditions Figure C17) sensitivity to correcting for undocumentedcases and to country-level differences in case ascertainment (2 conditions Figure B11) Epidemiologicalpriors Jointly varying the means of the priors over the means of the generation interval the infection-to-case-confirmation delay and the infection-to-death delay (125 conditions Figure C15) varying theprior over R0 (4 conditions Figure C16 left) varying the prior over NPI effectiveness (2 conditionsFigure C16 right) Unobserved factors Excluding observed NPIs one at a time (9 conditions Figure B14left) controlling for additional NPIs from a different dataset (5 conditions Figure B14 right)

14

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Business closures and gathering bans both seem to have been effective at reducing COVID-19 transmission Closing only high-risk businesses appears to have been only somewhat lesseffective than closing most nonessential businesses the median reduction in R differs onlyby 9 (6ndash12 mean and 95 interval of median estimates across the experiments set-tings in Figure 5) Closing only high-risk businesses may thus have been the more promisingpolicy option in some circumstances Limiting gatherings to 10 people or less was more ef-fective than limits up to 100 or 1000 people This may reflect the fact that small gatheringsare common

As previously discussed we estimate the average effect each NPI had in the contexts inwhich it was implemented When countries introduced stay-at-home orders they nearly al-ways also banned gatherings and closed schools universities and some or most businessesif they had not done so already (Figure 3 bottom left) Flaxman et al1 and Hsiang et al3

add the effect of these distinct NPIs to the effectiveness of stay-at-home orders and accord-ingly find a large effect In contrast we account for these other NPIs separately and isolatethe additional effect of ordering the population to stay at home (when large gatheringsare banned and educational institutions and some businesses closed)d In accordance withother studies that took this approach we find a small effect26 A typical country could havereduced R to below 1 without a stay-at-home order (Figure 4) provided other NPIs wereimplemented

Mandating mask-wearing in various public spaces had no clear effect on average in thecountries we studied This does not rule out mask-wearing mandates having a larger ef-fect in other contexts In our data mask-wearing was only mandated when other NPIshad already reduced public interactions When most transmission occurs in private spaceswearing masks in public is expected to be less effective This might explain why a largereffect was found in studies that included China and South Korea where mask-wearingwas introduced earlier823 While there is an emerging body of literature indicating thatmask-wearing can be effective in reducing transmission the bulk of evidence comes fromhealthcare settings24 In non-healthcare settings risk compensation25 may play a largerrole potentially reducing effectiveness While our results cast doubt on reports that mask-wearing is the main determinant shaping a countryrsquos epidemic23 the policy still seemspromising given all available evidence due to its comparatively low economic and socialcosts Its effectiveness may have increased as other NPIs have been lifted and public inter-actions have recommenced

We find a large effect for school and university closures This finding is remarkably robustacross different model structures variations in the data and epidemiological assumptions(Figure 5) It remains robust when controlling for NPIs excluded from our study (Fig-ure B14) Our approach cannot distinguish direct and indirect effects such as forcing par-

dIn principle a stay-at-home order does not imply these other NPIs It is possible for a country to simultaneouslyhave allowed schools to be open and have a stay-at-home order (with exemptions for attending school) inplace However this is not how stay-at-home orders were implemented by the countries in our dataset andwe cannot estimate the effect of such a hypothetical stay-at-home order from the available data

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ents to stay at home or causing broader behaviour changes by increasing public concernAdditionally since school and university closures almost perfectly coincided in the coun-tries we study our approach cannot distinguish their individual effects (Appendix D21)This limitation likely also holds for other observational studies which do not include dataon university closures and estimate only the effect of school closures1ndash35ndash8 Previous evi-dence on school and university closures is mixed1626 Early data suggest that children andyoung adults are equally susceptible to infection but have a notably lower observed inci-dence rate than older adultsmdashwhether this is due to school and university closures remainsunknown27ndash30 Although infected young people are often asymptomatic they appear toshed similar amounts of virus as older people3132 and might therefore transmit the infec-tion to higher-risk demographics unknowingly As the role of children in transmission isstill unclear33 while outbreaks detected in schools are rising33ndash35 this topic merits carefulattention

Our study has several limitations First NPI effectiveness may depend on the context ofimplementation such as the presence of other NPIs and country-specific factors Our esti-mates must be interpreted as the average effectiveness over the contexts in our dataset10and expert judgement is required to adjust them to local circumstances Second R mayhave been reduced by unobserved NPIs or spontaneous behaviour changes To investigatewhether these reductions could be falsely attributed to the observed NPIs we perform sev-eral additional analyses and find that our results are stable to a range of unobserved effects(Appendix B3) However this sensitivity check cannot provide certainty Investigating therole of unobserved effects is an important topic to explore further Third our results can-not be used without qualification to predict the effect of lifting NPIs For example closingschools and universities seems to have greatly reduced transmission but this does not meanthat reopening them will cause infections to soar Educational institutions can implementsafety measures such as reduced class sizes as they reopen Further work is needed to anal-yse the effects of reopenings we hope that our collected data aids this effort Fourth whilewe included more NPIs than most previous work (Table F7) several promising NPIs wereexcluded For example testing tracing and case isolation may be an important part of acost-effective epidemic response36 but were not included because it is difficult to obtaincomprehensive data We discuss further limitations in Appendix E

Although our work focuses on estimating the impact of NPIs on the reproduction numberR the ultimate goal of governments may be to reduce the incidence prevalence and excessmortality of COVID-19 Controlling R is essential but the contribution of NPIs towards thesegoals may also be mediated by other factors such as their duration and timing37 periodicityand adherence3839 and successful containment40 While each of these factors addressestransmission within individual countries it can be crucial to additionally synchronise NPIsbetween countries since cases can be imported41

In conclusion as governments around the world seek to keep R below 1 while minimisingthe social and economic costs of their interventions we hope that our results can informpolicy decisions on which NPIs to implement in any potential further wave of infectionsAdditionally our work may provide insight on which areas of public life are most in need

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

of restructuring so that they can continue despite the pandemic However our estimatesshould not be seen as the final word on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies simulation studiesexperimental trials and clinical experience

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

The paid contractor work helping with the data collection the development of the interac-tive website and the costs for cloud compute were funded by the Berkeley Existential RiskInitiative

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier V Mikulik MAHartwick L Chindelevitch designed the model and modelling experiments M Sharma ABStephenson T Gavenciak J Salvatier performed and analysed the modelling experimentsJ Kulveit T Gavenciak JM Brauner conceived the research S Mindermann M SharmaJM Brauner L Chindelevitch J Kulveit T Besiroglu did the literature search JM BraunerS Mindermann M Sharma G Leech L Chindelevitch T Besiroglu V Mikulik wrote themanuscript All authors read and gave feedback on the manuscript and approved the finalmanuscript JM Brauner S Mindermann and M Sharma contributed equally L Chindele-vitch Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

3 Solomon Hsiang et al ldquoThe effect of large-scale anti-contagion policies on the COVID-19 pandemicrdquo In Nature 5847820 (2020) pp 262ndash267

4 Shengjie Lai et al ldquoEffect of non-pharmaceutical interventions to contain COVID-19 inChinardquo en In Nature 5857825 (Sept 2020) pp 410ndash413 DOI 101038s41586-020-2293-x

5 Yang Liu et al ldquoThe impact of non-pharmaceutical interventions on SARS-CoV-2 trans-mission across 130 countries and territoriesrdquo In medRxiv (2020)

6 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

7 Nazrul Islam et al ldquoPhysical distancing interventions and incidence of coronavirusdisease 2019 natural experiment in 149 countriesrdquo In bmj 370 (2020)

8 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo In Covid Economics 7 (Apr 2020) pp 46ndash67

9 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

10 Mrinank Sharma et al ldquoOn the robustness of effectiveness estimation of nonpharma-ceutical interventions against COVID-19 transmissionrdquo In arXiv preprint arXiv200713454(2020) URL httpsarxivorgabs200713454

11 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

12 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

13 Ensheng Dong Hongru Du and Lauren Gardner ldquoAn interactive web-based dashboardto track COVID-19 in real timerdquo In The Lancet Infectious Diseases 205 (May 2020)pp 533ndash534 DOI 101016s1473-3099(20)30120-1

14 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

15 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

16 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

17 Sam Abbott et al ldquoEstimating the time-varying reproduction number of SARS-CoV-2using national and subnational case countsrdquo In Wellcome Open Research 5112 (2020)p 112

18 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

19 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

20 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Young Joon Park et al ldquoContact tracing during coronavirus disease outbreak SouthKorea 2020rdquo In Emerg Infect Dis 2610 (2020)

28 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

29 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

30 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

31 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

32 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

33 Chen Stein-Zamir et al ldquoA large COVID-19 outbreak in a high school 10 days afterschoolsrsquo reopening Israel May 2020rdquo In Eurosurveillance 2529 (2020) p 2001352

34 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 38 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile919676Weekly_COVID19_Surveillance_Report_week_38_FINAL_UPDATEDpdf 2020

35 Meira Levinson Muge Cevik and Marc Lipsitch Reopening Primary Schools during thePandemic 2020

36 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

37 K Prem et al ldquoThe effect of control strategies to reduce social mixing on outcomes ofthe COVID-19 epidemic in Wuhan China a modelling studyrdquo In Lancet Public Health5 (5 May 2020) e261ndashe270

38 Patrick G T Walker et al ldquoThe impact of COVID-19 and strategies for mitigationand suppression in low- and middle-income countriesrdquo In Science 3696502 (2020)pp 413ndash422

39 Nicholas G Davies et al ldquoEffects of non-pharmaceutical interventions on COVID-19cases deaths and demand for hospital services in the UK a modelling studyrdquo In TheLancet Public Health 57 (2020) e375ndashe385

40 Xingjie Hao et al ldquoReconstruction of the full transmission dynamics of COVID-19 inWuhanrdquo In Nature 5847821 (Aug 2020)

41 N W Ruktanonchai et al ldquoAssessing the impact of coordinated COVID-19 exit strate-gies across Europerdquo In Science (2020) DOI 101126scienceabc5096

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 23

Appendix A1 Detailed model description 23

Appendix A2 Testing reporting and infection fatality rates 26

Appendix A3 Method for uncertainty over key epidemiological parameters 27

Appendix B Validation 30

Appendix B1 Unseen data 30

Appendix B2 Sensitivity Analysis 30

Appendix B3 Robustness to unobserved effects 35

Appendix C Additional sensitivity analyses and validation 38

Appendix C1 Multivariate Sensitivity Analysis - Expanded Results 38

Appendix C2 Sensitivity to additional epidemiological assumptions 40

Appendix C3 Sensitivity to data preprocessing 40

Appendix C4 Validation by predicting unseen data - all countries 42

Appendix C5 Posterior predictive distributions 47

Appendix C6 Calibration without countries used for hyperparameter selection 48

Appendix C7 MCMC stability results 49

Appendix D Additional results 50

Appendix D1 Estimated R0 by country 50

Appendix D2 Collinearity 51

Appendix D3 Posterior Epidemiological Parameter Distributions 53

Appendix E Additional discussion of assumptions and limitations 55

Appendix E1 Limitations of the data 55

Appendix E2 Model limitations 55

Appendix F Overview of previous work 57

Appendix G Handling edge cases in the data collection 60

Appendix H All model equations 64

Appendix I References 67

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections N (sdot)tc

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Mask wearing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closedSchools closed

Stay-at-home order

Growth reductions from interventions αi i

Daily growth rate gtc

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Uncertain delay from infection to

confirmation death

Uncertain generation interval

Universities closed

Figure A6 Model Overview Purple nodes are observed We describe the diagram from bottom to topThe effectiveness of NPI i is represented by αi which is independent of the country On each day t a countryrsquos daily reproduction number Rt c depends on the countryrsquos basic reproduction number R0cand the active NPIs The active NPIs are encoded by Φi t c which is 1 if NPI i is active in country c attime t and 0 otherwise Rt c is transformed into the daily growth rate gt c using the generation intervalparameters and subsequently is used to compute the new infections N (C )

t c and N (D)t c that will turn into

confirmed cases and deaths respectively Finally the number of new confirmed cases Ct c and deathsDt c is computed by a discrete convolution of N (middot)

t c with the respective delay distributions Our modeluses both death and case data it splits all nodes above the daily growth rate gt c into separate branchesfor deaths and confirmed cases We account for uncertainty in the generation interval infection-to-case-confirmation delay and the infection-to-death delay by placing priors over the parameters of thesedistributions

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1but with several key extensions First we model both confirmed cases and deaths al-lowing us to leverage significantly more data Furthermore we do not assume a specificinfection fatality rate (IFR) since we do not aim to infer the total number of COVID-19 in-fections Additionally since epidemiological parameters are only known with uncertaintywe place priors over them following recent best practice2 Concretely we place priors onthe means and standard deviationsdispersions of the generation interval the infection-to-confirmation delay and the infection-to-death delay These prior choices are detailedin Appendix A3 The end of this section details further adaptations which allow us to

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

minimise assumptions about testing reporting and the IFR Code is available online hereFor readers who wish to re-implement the model a concise list of all equations is given inAppendix H

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior distribution over R0c reflecting the wide disagreement of regionalestimates of R0

3 The mean R0c is 328 based on a meta-analysis4 We parameterize theeffectiveness of NPI i assumed to be same across countries and time with αi Each NPI isassumed to have an independent multiplicative effect on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We place an Asymmetric Laplace prior over αi with scale parameter10 asymmetry parameter 05 and location parameter 0 Our prior allows for (unbounded)positive and negative effects as we cannot a priori exclude the possibility that the introduc-tion of an NPI increases R However our prior places 80 of its mass on positive effectsreflecting a belief that NPIs are much more likely to reduce Rt than not This is a shrinkageprior placing 50 of its mass on lsquosmallrsquo effectiveness (less than 10 change in Rt ) All priordistributions are independent

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c and Rt c

(Eq 29 in5)

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We account for uncertainty inthe generation interval (GI) assumed to be a Gamma distribution by placing prior distri-butions over its mean and standard deviation (Appendix A3) Using (A2) we can writeg t c as a function g t c (Rt c microGIσGI) (see Appendix H) where microGI is the GI mean and σGI isthe GI standard deviation

aMany epidemiological models define growth rates as the exponent r in an exponential growth function Herewe use daily growth rates instead for ease of exposition These choices are mathematically equivalent Notethat we adapted equation (29) in Wallinga amp Lipsitch5 to account for our choice

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Note that the GI can change over time due to NPIs In particular the effective contact tracingand case isolation in China substantially shortened the mean GI to only 26 days among theidentified cases6 Since most cases were likely not identified even in Wuhan China7 andour study omits most of the countries with highly effective contact tracing programs suchas China or South Korea we model the GI as constant (but uncertain) over time A possibleextension for further work is to model the change in GI explicitly

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are inferred from the observation models for cases and deathsWe assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)] (A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)] (A4)

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross-validation as no reference is available for it We evaluate five dif-ferent values (σN isin 00501020304) with a fixed randomly chosen validation set of6 countries σN = 02 maximises the predictive log-likelihood on the validation set Thisensures a more calibrated model which is less likely to produce overconfident or unstableestimates10 We did not tune any other aspect of the model

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation In realitythis delay distribution is the sum of two independent distributions the incubation period(lognormal distribution) and the delay from onset of symptoms to confirmation (negativebinomial distribution) These distributions are uncertain but modelling (and placing pri-ors over) them individually leads to unidentifiability Therefore we model the total delay

bSince we treat new infections as a continuous number its initial value can be between 0 and 1

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

between infection and confirmation assuming that this follows a negative binomial dis-tribution We convert priors over the individual delays to a prior over the total delay bybootstrapping mdash please see Appendix A3

As in Flaxman et al1 the observed cases Ct c follow a negative binomial noise distribu-tion with mean C t c and an inferred dispersion parameter Ψ(C ) This distribution encodesthat small case numbers are more noisy and should therefore receive less weight Hav-ing separate dispersion parameters for cases and deaths ensures that they can be weighteddifferently if there is a difference in their noise distributions

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ) (A6)

where PD (delay) is the distribution of the delay from infection to death As for cases wemodel the total delay between infection and death as negative binomial which is in realitythe sum of two independent distributions the incubation period and the delay from onsetof symptoms to death (gamma distribution)

Finally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and an inferred dispersion parameter Ψ(D)

The model was implemented in PyMC311 We infer the unobserved variables in our modelusing the No-U-Turn Sampler (NUTS)12 a standard Markov chain Monte Carlo samplingalgorithm

Appendix A2 Testing reporting and infection fatality rates

Scaling all values of a time series by a constant does not change its growth rates Themodel is therefore invariant to the scale of the observations and consequently to country-level differences in the IFR and the ascertainment rate (the proportion of infected peoplewho are subsequently reported positive) For example assume countries A and B differ onlyin their ascertainment rates Then our model will infer a difference in N (C )

0c (Eq (A3)) butnot in the growth rates g t c across A and B Accordingly the inferred NPI effectiveness willbe identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas the

cThis is only approximately true The negative binomial output distribution has a coefficient of variation dimin-ishing with its mean ie smaller observations are relatively more noisy and carry less weight Furthermorewhilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligible effect

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

introduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimics changes in the ascertainment ratemdashnoise at time

τ affects all future casesmdashand allows for gradual multiplicative changes in ascertainment

Appendix A3 Method for uncertainty over key epidemiological parameters

We account for uncertainty in the following delay distributions the generation interval thedelay from infection to symptom onset (incubation period) the delay from symptom onsetto case confirmation and the delay from symptom onset to death

Each distribution has an uncertain mean (Table A3) whose prior distribution we takefrom a meta-analysis13 published shortly after our window of analysis This helps accountfor possible heterogeneity between different populations and therefore often finds higheruncertainty in the means than estimated in primary studies while using more data Themeta-analysis reports mean values with 95 confidence intervals We set normal priors overthe mean delays with mean equal to the reported mean in the meta-analysis and standarddeviation set such that the prior 95 credible interval includes the 95 confidence intervalsfrom the meta-analysis For example the meta-analysis reports an expected mean delayfrom symptoms to death of 1671 days with a 95 credible interval ranging from 1537 to1817 We thus assume that the prior is normal with micro= 1671 and σ= 075 which ensuresthat its 95 credible interval ranges from 1525 to 1817 strictly including the intervalreported in the meta analysis to be conservative Priors are truncated at zero

Furthermore we account for the uncertainty in the standard deviation of these delay dis-tributions by placing normal priors based on primary studies following the same methodas above (Table A4) For the standard deviation of the generation interval we use patientdata from Feretti et al14 and reproduce their method fitting a Gamma distribution to thegeneration time

Finally the distributional forms of the delay distributions are taken from the above primarystudies

The delay from infection to case confirmation is the sum of the incubation period and thesymptom-onset-to-case-confirmation delay Similarly the delay from infection to death isthe sum of the incubation period and the symptom-onset-to-death delay However forcomputational tractability we model the total delay distributions converting the priors de-scribed above to priors over the total delays We assume that the total delays follow negativebinomial distributions which are computationally tractable and empirically provide a closefit to both sum distributions We place normal priors over the mean and dispersion parame-ters of these total delay distributions by bootstrapping For example we compute priors forthe total infection-to-case-confirmation delay distribution by sampling Nbootstrap = 250 in-

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

cubation periods and symptom-onset-to-case-confirmation distributions For each sampledpair of distributions we draw 106 samples from their sum and fit a negative binomial usingmoment matching We compute the mean and standard deviation of the negative binomialdistribution parameters across the bootstrap and use this for our prior The resulting priorsare given in Table A2

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table A2 Epidemiological parameter distributions used in the model The details on sources and priorsused to generate this Table are given in Tables A3 and A4

Delay type Sources Prior on mean Prior on standard Distributionaldeviation or formdispersion

Infection to case Fonfria et al13 N (1092σ= 094) Dispersion Negativeconfirmation Lauer et al15 N (541σ= 027) binomial

Cereda et al16

Infection to death Fonfria et al13 N (2182σ= 101) Dispersion NegativeLauer et al15 N (1426σ= 518) binomialLinton et al17

Generation interval Fonfria et al13 N (506σ= 033) SD N (211σ= 05) GammaFeretti et al14

Table A3 Mean of epidemiological parameters all from meta-analysis13 and distributional forms

Delay type Reported mean Prior on mean Distributionaland 95 CI form of delay

Incubation period 579 (548ndash611) logmicrosimN (153σ= 0051) Lognormal15

Onset to death 1671 (1537ndash1817) N (1671σ= 075) Gamma17

Onset to case confirmation 582 (474ndash715) N (582σ= 068) Negative binomial16

Generation interval 506 (450ndash570) N (506σ= 033) Gamma14

Table A4 Standard deviation or dispersion of epidemiological parameters

Delay type Source Reported standard deviation or Prior on standarddispersion with uncertainty deviation or

dispersionIncubation period Lauer et al15 SD of log delay SD of log delay

048 (95 CI 027ndash054) N (048σ= 00759)Onset to death Linton et al17 SD 69 (95 CI 52ndash91) N (69σ= 1122)Onset to case confirmation Cereda et al16 Dispersion 157 plusmn 0054 N (157σ= 0054)Generation interval Feretti et al14 SD 211 plusmn 05 (reproduced by us) N (211σ= 05)

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking its predictions on unseendata even if prediction is not the purpose of the model1018 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries we have strong reason to doubt itseffectiveness estimates However we do not expect NPI effectiveness models to extrapolateperfectly Almost always unobserved factors such as changes in the ascertainment rate orIFR spontaneous behaviour changes or unobserved NPIs will affect the observed numberof cases and deaths Our models ought to treat these factors as noise and not attribute theireffects on Rt to the observed NPIs

We use Leave-One-Out Cross-Validation (LOO-CV) fitting the model on 40 countries andextrapolating to the excluded country We repeat this process for all 41 countries In theexcluded country the first 14 days of case and death data are observed to estimate R0c aswell as the initial outbreak sizes N (C )

0c and N (D)0c All later days are unobserved (masked)

The model uses the effectiveness estimates inferred from the 40 included countries and NPIactivation dates to predict the course of the pandemic in the excluded country

Figure B7 visually shows extrapolations obtained from LOO-CV for a randomly chosensubset of six countries (all countries are shown in Figures C18 to C21) The predictiveaccuracy of the model as expected degrades with an increasing prediction horizon sug-gesting the presence of unobserved factors However the predictive credible intervals arewide and increase over time as noise in the growth rate accumulates Importantly ourmodel makes calibrated forecasts in excluded countries (Fig B8) over periods of up to 2months

These are challenging predictions to the best of our knowledge no related study analyseshow their estimated NPI effects generalise to unseen countries Most related studies donot validate predictions on unseen data at all19ndash23 reviewed in9 Flaxman et al1 hold outthe last 14 days from 20 April to 4th of May for all countries in aggregate However thecountries they study implemented all NPIs in March or earlier (Supplementary Table 2 in1)meaning that in fact all countries were used to estimate NPI effects

Note that Fig B8 includes the 6 countries that were used to select the hyperparameter(Appendix A) However the calibration is similar if these 6 countries are excluded (Fig-ure C23)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals the extent to which results depend on uncertain parametersand modelling choices and can diagnose model misspecification and excessive collinearity

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

100

101

102

103

104

105

106

便便

Slovenia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

便 便

Greece

Figure B7 Model extrapolations on a randomly chosen subset of 6 excluded countries produced usingleave-one-out cross-validation In each excluded country the first 14 days of death and case data (soliddots) are observed the model then extrapolates to all future days within the period of analysis (emptydots) For each country we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier see Methods) The shaded areas representthe 95 credible intervals

in the data24 We vary many of the components of our model and recompute the NPIeffectiveness estimates summarised here Further analysis can be found in Appendix C

For each sensitivity analysis we quantify sensitivity using an easily interpretable measureshown above the legend of each plot the average standard deviation denoted σ Firstfor each NPI we compute the standard deviation of its median effect estimates across allexperimental conditions in the given sensitivity analysis We then average these acrossthe NPIs Since effectiveness is expressed in percentage reductions of Rt the unit of σ ispercent Zero percent corresponds to no sensitivity

We perform a multivariate sensitivity analysis on the key epidemiological priors For com-putational reasons all other sensitivity analyses are univariate

Multivariate sensitivity to main epidemiological priors The key epidemiological param-eters in our model describe the delay from infection to case-confirmation the delay from

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

0 20 40 60 80 100Credible Interval

0

20

40

60

80

100

Em

piric

al C

umul

ativ

e Fr

eque

ncy

Calibration Country Holdouts

Figure B8 Calibration in held-out countries with leave-one-out cross validation In each excludedcountry the first 14 days of death and case data are observed to estimate R0c as well as the initialoutbreak sizes the model then extrapolates to all future days within the period of analysis The plotshows the percentage of observed daily case and death counts that lie within the X credible intervalacross all countries and days The identity line represents ideal calibration

infection to death and the generation interval Our model places prior distributions overthe means of these delay distributions Since the priors already account for uncertainty inthe delays this analysis considers what would happen if our prior beliefs changed Sincethe epidemiological parameters may interact in unexpected ways we perform a multivari-ate sensitivity analysis jointly changing the mean of the prior over the mean of each delaydistribution (referred to as the prior mean of the mean below) We vary the prior means ofthe mean delays across a wide but epidemiologically plausible range of 5 values resultingin 125 different experimental conditions We present these results in two ways

First Figure B9 shows the global sensitivity of our results to the prior mean of the mean ofeach distribution individually For example for each setting of the prior mean of the meangeneration interval we show the average over the 5times5 = 25 runs with different prior meansplaced over the mean delays between infection and case confirmationdeath This is equiv-alent to placing a uniform prior across the 125 experiment conditions and marginalisingand is standard methodology for computing global sensitivity to an individual parameter

The prior means seem to mostly influence the relative effectiveness of business closuresand gathering bans Increasing the prior means of the infection-to-case-confirmationdeathmean delays results in larger effectiveness estimates for gatherings bans and smaller esti-

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

mates for business closures while increasing the prior mean of the generation interval hasthe opposite effect

Second we assess the sensitivity of our results to jointly varying the prior means This yieldsσ= 246 We present this result graphically in Appendix C1

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Infection to Case-Confirmation Delay= 112Mean 792 daysMean 942 daysMean 1092 daysMean 1242 daysMean 1392 days

-25 0 25 50 75Average reduction in Rtin the context of our data

Infection to Death Delay= 080Mean 1782 daysMean 1982 daysMean 2182 daysMean 2382 daysMean 2582 days

-25 0 25 50 75Average reduction in Rtin the context of our data

Generation Interval= 149Mean 306 daysMean 406 daysMean 506 daysMean 606 daysMean 706 days

Figure B9 Sensitivity to key epidemiological parameter choices evaluated using a multivariate analysisWe jointly vary the prior means over the mean of the generation interval the delay between infectionand case confirmation and the delay between infection and death Each panel displays sensitivity to theprior mean of one distribution and the NPI effects are computed by averaging over the 25 runs withdifferent prior means of the two other distributions (see text)

Sensitivity to country exclusions Figure B10 shows results if one country at a time isexcluded from the data As there is no strong justification for including or excluding anyparticular country results ought to be stable if a country is excluded

-25 0 25 50 75

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

= 151DefaultAlbaniaAndorraAustriaBelgiumBosnia andHerzegovinaBulgariaCroatia

-25 0 25 50 75

= 151DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-25 0 25 50 75

= 151DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

= 151DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-25 0 25 50 75Average reduction in Rtin the context of our data

= 151DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-25 0 25 50 75Average reduction in Rtin the context of our data

= 151DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure B10 Sensitivity to excluding individual countries

Sensitivity to case-undercounting Figure B11 shows results when scaling the recordednumber of cases in each country First we correct the number of reported cases on each day

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

using estimates of the time-varying ascertainment rate from Golding et al25 For example ifthe estimated ascertainment rate in country c at time t is 50 we double the correspondingnumber of reported cases With this altered case data the model estimates a larger effectfor closing schools and universities and a smaller effect for limiting gatherings to 1000people or less There is little change in the effects estimated for the other NPIs

Second we randomly either double or halve the reported number of cases in each countryThis is intended to demonstrate that the model is invariant to constant differences in ascer-tainment between countries As expected there are negligible differences compared to thedefault results

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Case Undercounting= 163

Time-Varying ScalingRandom Constant ScalingDefault

Figure B11 Sensitivity to case undercounting considering both constant undercounting and time-varying undercounting

Sensitivity to structurally different models Figure B12 shows the NPI effectivenessestimates from models that use only cases or deaths as observations in contrast to ourmain model which uses both As expected there are some differences in the NPI effectestimates between the models which may be caused by the unique biases present in casedata and in death data Differences could also occur if NPIs differentially affected casesand deaths (eg if they affect different age groups) Nevertheless the studyrsquos qualitativeconclusions as outlined in the Discussione hold across the different data types

A number of implicit structural assumptions are made by the choice of model structureWe test sensitivity to these assumptions by evaluating NPI effectiveness estimates from al-ternative models reproducing the structural sensitivity analysis from our concurrent workwhere these models are described and compared in detail9 While these alternative modelstructures are also plausible we chose the model described in the main text as it is relativelysimple whilst also being highly robust9

eThe main qualitative conclusions as outlined in the Discussion are Stay-at-home orders had a small effectMandating mask-wearing had a small effect School and university closures had a large effect Gatherings banswere effective More strict gathering bans were more effective than less strict ones Business closures wereeffective Closing most nonessential businesses had limited benefit over closing just high-risk businesses

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Data Type

= 347Only Case DataOnly Death DataDefault

Figure B12 Sensitivity to different data types ie models using only confirmed cases only deaths orboth

The models are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting Rt into a daily growth rate a renewalprocess is used as the infection model as in earlier work1826ndash28 For computationalreasons we do not place a prior distribution over the generation interval parametersfor this model

3 Noisy-R Model The noise terms ε(middot)ct affect Rt rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on Rt The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

As Figure B13 (left) shows the multiplicative effect models support almost all conclusionsdrawn in the Discussione The only exception is that the stay-at-home order NPI is cate-gorised as moderately effective under the Different Effects Model (median reduction in Rt

of 18)

The results of the Additive Effects Model cannot be directly compared to the other modelssince it expresses results on a different scale (percentage reduction in R0 instead of Rt ) Itis therefore shown in a separate panel However the trend in the results is visually similarto the default results

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs which were implemented nor directly measures broaderbehavioural changes Since these factors influence Rt we must be wary of their effectbeing attributed to observed NPIs We investigate this further by assessing how much effec-

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

Structural Sensitivity Multiplicative Models= 311

Noisy-RDiscrete Renewal

Different EffectsDefault

-25 0 25 50 75Average reduction in Rtas a percentage of R0

in the context of our data

Structural Sensitivity Additive Model

Figure B13 Structural sensitivity analysis NPI effectiveness estimates under different structural as-sumptions Left Effectiveness estimates for multiplicative effect models Note that for computationalreasons the discrete renewal model does not have a prior over the generation interval Right Effective-ness estimates for an additive effect models For this model the effectiveness of each NPI is expressedas a reduction of R0 rather than Rt

tiveness estimates change when previously unobserved factors are included and also whenobserved factors are excluded This is best practice for addressing unobserved factors suchas confounders2930

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs31 The timings of our observed NPIsrsquo implementation dates are indeed mutuallycorrelated prompting the question of how much results change when we make previouslyobserved NPIs unobserved Figure B14 (left) shows NPI effectiveness estimates when pre-viously observed NPIs are excluded in turn The studyrsquos main qualitative conclusionse holdacross all experimental conditions and the variations in NPI effectiveness are rarely largeenough to cause a change in effectiveness category (Figure 5) except for the Gatheringslimited to 1000 people or less NPI Considering that some of the excluded NPIs have strongestimated effects when included and are correlated with other NPIs this degree of robust-ness is encouragingly high It suggests that unobserved factors will not significantly biasresults as long as their effects and their correlations with the studied NPIs do not exceedthose of the studied NPIs We hypothesise that this robustness to unobserved factors is dueto the noise on the daily growth rate in our model9

In addition Figure B14 (right) shows NPI effectiveness estimates when we observe (iecontrol for) additional NPIs taken from the OxCGRT dataset32

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Inclusion of OxCGRT NPIs= 153

Travel ScreenQuarantineTravel BansSymptomatic TestingPublic Information CampaignsInternal Movement LimitedPublic Transport LimitedDefault

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

Exclusion of Collected NPIsUncombined Effects Shown

= 188Mask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay Home OrderDefault

Figure B14 Robustness to unobserved effects Left Results when excluding previously observed NPIsWe exclude one of the NPIs in turn and show the estimates for the other NPIs Note that this subplotshows the marginal effect for hierarchical NPIs rather than the total effect different to other figuresthat show cumulative effects for gathering bans and businesses closures For example Figure 3 displaysthe total effect of closing most nonessential businesses while here we show the additional effect ofclosing most nonessential businesses over just closing some high-risk businesses We show the additionaleffects here because the effect of a cumulative intervention would become undefined when part of it isexcluded from the analysis Right Results when controlling for previously unobserved NPIs We includeone additional NPI in turn and show the estimates for the NPIs in our study (the additional NPI is notshown)

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Appendix C1 Multivariate Sensitivity Analysis - Expanded Results

We now present the results of our joint multivariate sensitivity analysis (previously sum-marised in Figure B9) in which we vary the mean of the prior (ldquoprior meanrdquo) over themean of the generation interval the delay between infection and death and the delaybetween infection and case confirmation

Figure C15 shows for each NPI how NPI effects change when all prior means are variedsimultaneously

We find that the most sensitive NPIs are Gatherings limited to 1000 people or fewer Gath-erings limited to 100 people or fewer and Most nonessential businesses closed However thelargest differences appear when all of the parameters are set to the most extreme valuesthat jointly produce a change in a particular direction We also see systematic trends asthe parameters are varied eg increasing the prior mean of the generation interval meantends to decrease the effectiveness of the Gatherings limited to 1000 or fewer NPI while in-creasing prior means of the infection to deathcase confirmation distribution means tendsto increase the effectiveness of this NPI

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

792

942

1092

1242

1392

Mas

kWea

ring

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

GI = 306

-150

-75

00

75

150GI = 406

-150

-75

00

75

150GI = 506

-150

-75

00

75

150GI = 606

-150

-75

00

75

150GI = 706

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

00

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

0

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Som

eBus

s

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Mos

tBus

s

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Educ

at

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

792

942

1092

1242

1392

Stay

Hom

e

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

Figure C15 Global sensitivity analysis Each cell shows the difference between the median effectivenessestimate under a specific experimental condition and the default condition where the estimates areexpressed as percentage reduction in Rt Each subplot represents a particular prior mean of the meangeneration interval and an NPI The NPIs vary top to bottom and the generation interval prior meanincreases left to right Each cell with a subplot represents a particular choice of the prior mean of themean infection to death delay (increasing left to right) and the prior mean of the mean infection to caseconfirmation delay (increasing bottom to top) Positive cell values indicate that the median NPI estimateis larger than with default settings and vice versa

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C2 Sensitivity to additional epidemiological assumptions

For completeness Figure C16 shows sensitivity to two further epidemiological assumptionsOn the left we show the dependence of NPI effectiveness estimates on R0 We place aNormal prior over R0 and a hyperprior over its variance reflecting the wide disagreementof regional estimates of R0 In the sensitivity analysis we vary the mean of the prior froma low value (238) to a high one (428) As expected higher mean R0 values result in largerNPI effects This trend is apparent in all NPIs but stronger for NPIs that tended to beimplemented early on (like school closures and gathering bans)

On the right we vary the prior over NPI effectiveness Our default prior on NPI effectivenessis assymmetric reflecting a belief that NPIs are more likely to reduce Rt than increase itHere we additionally test a symmetric Normal prior and a one-sided Half-Normal priorwhich does not allow for NPIs to increase Rt

Appendix C3 Sensitivity to data preprocessing

When the case count is small a large fraction of cases may be imported from other coun-tries and the testing regime may change rapidly To prevent this from biasing our modelwe neglect case numbers before a country has reached 100 confirmed cases Similarly weneglect death numbers before a country has reached 10 deaths Here we vary these thresh-olds Intuitively the minimum cases threshold should be higher than the minimum deathsthreshold

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

R0 prior= 333

[R] = 238[R] = 278

Default[R] = 328[R] = 378[R] = 428

-25 0 25 50 75Average reduction in Rtin the context of our data

NPI Effectiveness Prior= 137

i Normal(0 022)i Half Normal(022)

Default

Figure C16 Sensitivity to additional epidemiological priors Left Sensitivity to the prior mean on R0Right Sensitivity to the NPI effectiveness prior

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Minimum Cases Threshold= 137

3050Default (100)200300

-25 0 25 50 75Average reduction in Rtin the context of our data

Minimum Deaths Threshold= 102

15Default (10)3050

Figure C17 Data preprocessing sensitivity Left Variations in the minimum number of cases RightVariations in the minimum number of deaths

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Validation by predicting unseen data - all countries

Here we show the country-level plots of the single-country holdout experiments from Ap-pendix B1 We use leave-one-out cross-validation fitting the model on 40 countries andshowing its predictions on the excluded country We repeat this process for all 41 countriesIn the excluded country the first 14 days (filled dots) of death and case data are observedto allow roughly inferring R0 These days are not used to infer NPI effectiveness All laterdays are masked (unfilled dots) The activation dates of NPIs are also given and the modeluses the effectiveness estimates inferred from the 40 other countries

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

北便

France

Figure C18 Predictions on excluded countries

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

北 便

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

Malta

Figure C19 Predictions on excluded countries

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

Slovenia

Figure C20 Predictions on excluded countries

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Switzerland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

United Kingdom

Figure C21 Predictions on excluded countries Vertical lines show the activation (or inactivation) datesof NPIs Shaded areas are 95 credible intervals Blue and red dots show the observed confirmed casesand deaths while blue and red lines show the median model estimates of cases (Ct ) and deaths (Dt )Empty dots are not observed by the model For each country we show the full window of analysis (fromthe start of the epidemic until the first NPI was lifted or the 30th of May 2020 whichever was earliersee Methods)

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Posterior predictive distributions

The posterior predictive distribution (Figure C22) shows the predicted number of cases anddeaths after observing the data Although these curves can be called lsquofitsrsquo the degree of fitto the data must be interpreted with great care The fit is generally tight but this is partlydue to the inferred latent noise variables ε(C )

t and ε(D)t Inferring this latent noise allows

the posterior predictive distribution to closely match the data without overfitting the effec-tiveness parameters to the data Such behavior is common in Bayesian models which oftenperfectly interpolate the data without overfitting33 The noise terms can account for peri-ods where infections grew faster or slower than predicted based solely on the active NPIsIn such periods the noise may account for changes in testing reporting and unobservedinterventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Daily DeathsDaily Confirmed Cases

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C22 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 Calibration without countries used for hyperparameter selec-tion

0 20 40 60 80 100Credible Interval

0

20

40

60

80

100E

mpi

rical

Cum

ulat

ive

Freq

uenc

y

Calibration Country Holdouts

Figure C23 Calibration in held-out countries with leave-one-out cross-validation Here we include onlythose 35 countries that were not used to select the hyperparameter We show the first 14 days of casesand deaths in each country and then extrapolate to future days within the period of analysis The plotshows the percentage of observed daily case and death counts that lie within the X credible intervalacross all countries and days

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 MCMC stability results

1000 1002 10040

1000

2000

3000

4000

R

0 1 2 30

500

1000

1500

2000

Relative ESS

Figure C24 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size A value of 1 indicates perfect decorrelation between samplesValues above (below) 1 indicate that the effective number of samples is higher (lower) than the actualnumber of samples due to negative (positive) correlation respectively

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D5 Estimated values for R0 by country The parenthesis give the 95 credible interval which isoften wide The mean R0 across countries is 33

Country Estimated R0 Country Estimated R0

Albania 348 (275431) Lithuania 323 (248403)Andorra 275 (21434) Malaysia 296 (236361)Austria 31 (25377) Malta 327 (248414)Belgium 36 (302427) Mexico 399 (32748)Bosnia and Herzegovina 331 (259406) Morocco 359 (299427)Bulgaria 375 (304457) Netherlands 316 (26375)Croatia 342 (27418) New Zealand 241 (17731)Czech Republic 346 (276425) Norway 273 (215338)Denmark 28 (22347) Poland 397 (32482)Estonia 291 (229361) Portugal 355 (292425)Finland 279 (221343) Romania 401 (335475)France 331 (28389) Serbia 38 (308461)Georgia 356 (281439) Singapore 295 (238356)Germany 285 (231346) Slovakia 353 (271445)Greece 321 (261388) Slovenia 299 (23373)Hungary 403 (332482) South Africa 447 (377527)Iceland 171 (113242) Spain 36 (306423)Ireland 368 (309433) Sweden 229 (176288)Israel 379 (309459) Switzerland 289 (234351)Italy 336 (288391) United Kingdom 32 (267378)Latvia 301 (233376)

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 Collinearity

Appendix D21 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the inclusion or exclusion ofthese countries in the dataset (Figure D25) If we included all countries we would con-clude that university closures were more effective than school closures (black markers inFigure D25) However if we excluded Iceland or Sweden we would conclude that theywere roughly equally effective As there is no strong justification for including or excludingone particular country we cannot meaningfully disentangle the effects of school and uni-versity closures However there is much more data to determine the joint effect and it isindeed much more stable (Figure D25)

-25 0 25 50 75 100Average reduction in Rtin the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D25 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown

Appendix D22 Co-occurrence of NPIs

Table D6 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or neither) Note thatwe do not show the traditional collinearity statistics (variance inflation factors and datacorrelations) since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D6 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D23 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations24 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus042 between closing schools anduniversities and closing some businesses (Figure D26) The weak correlations are oneindicator that collinearity is manageable with our dataset

Effect on NPI combinations To better understand posterior correlations we visualizetheir effect in hosted video files As we condition on different values for one NPI we cansee that the estimates of other NPIs change only slightly always staying well within thecredible intervals in Figure 3 The significance of posterior correlations is small enough thatit is possible to calculate a reasonable approximation to the mean effect of a set of NPIs bysimply combining the mean percentage reductions for each individual NPI (eg two 50reductions lead to a 75 reduction) For example this approximation leads to a joint effectof 75 for all NPIs together which closely matches the exact mean joint effect of 77

Videos are available online here

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D26 Posterior correlations between effectiveness parameters αi

Appendix D3 Posterior Epidemiological Parameter Distributions

Figure D27 shows the posterior distributions for variables describing the key delay distribu-tions in our model the generation interval the delay between infection and case confirma-tion and the delay between infection and death We find that the posterior distributions ofthese parameters are somewhat tighter than their priors but they still explore a wide rangeof values This suggests that the data provides evidence albeit weak about these delaydistributions (for example from visual inspection the data would show that the delay islonger for deaths than for cases) An exception is the dispersion of the infection to deathdelay distribution which has a very wide prior

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

5 6

0

1

dens

ity

Generation Interval Mean

posteriorprior

200 225

00

05

Infection to Death Delay Mean

100 125

00

05

Infection to Case-Confirmation Delay Mean

0 2 4

value

00

05

dens

ity

Generation Interval std

10 20

value

00

01

Infection to Death Delay disp

5 6

value

0

1

Infection to Case-Confirmation Delay disp

Figure D27 Posterior distributions over key epidemiological parameters describing the generation in-terval and the delays between infection and case confirmationdeath

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect morethan three-quarters of the population) We thus miss NPIs which were only implementedregionally For example a few regions in Germany implemented stay-at-home orders butmost did not Thus Germany is listed as no stay-at-home order in our data Additionallyour NPI definitions were not perfectly granular For example a gathering ban on gatheringsof gt15 people and a ban on gatherings of gt60 people would both fall under the NPIGatherings limited to 100 people or less despite likely having different effects on Rt Finally while we included more NPIs than previous work (Table F7) there are many NPIsfor which we were not able to collect enough high-quality data for our modeling such aspublic cleaning or changes to public transportation

Of the 41 countries in our dataset 33 are in Europe As a result the NPI effectivenessestimates may be biased towards effects in Europe and NPI effectiveness may have beendifferent in other parts of the world

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on Rt is constantacross countries and time However the exact implementation and adherence of each NPIis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa limited degree Additionally different countries have different cultural norms and ageprofiles affecting the degree to which a particular intervention is effective For example acountry where a higher proportion of the population is in education will likely experience alarger effect from a government order to close schools and universities Our estimates thusshould be adjusted to local circumstances To address differences between countries ourstructural sensitivity analysis includes a model where each NPI can have a different effectper country (Appendix B2) The average effectiveness estimates across countries in thismodel match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponential de-cline would have the same effect on observations as an unobserved NPI Consequently wecannot quantify its effect on our results (though the sensitivity analyses look reassuring)

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expectthat closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g t and the reproductive number Rt holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso report results from a renewal process model8 that does not depend on this assumptionand we find similar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix F Overview of previous work

Table F7 Data-driven multi-country multi-NPI studies of the effectiveness of observed (as opposed tohypothetical) NPIs in reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Banholzer etal 202023

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Chen andQiu 202021

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Islam et al202020

School closuresworkplace closuresrestrictions on massgatherings publictransport closure

lockdown

149 countries or regionsInterrupted timeseries regression

Liu et al202022

13 NPIs from theOXCGRT dataset

130 countries and regions panel regression

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F8 Some data-driven studies of the effectiveness of observed NPIs in reducing the transmissionof COVID-19 which study a single-country andor a single-NPI

Study NPIs studiedRegionscountries

studiedMethod

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dandekarand

Barbastathis202035

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Dehning etal 202036

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Gatto et al202037

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Hsiang et al202019

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US (each

country modelledindividually)

Linear regression onestimated growth

rates

Jarvis et al202038

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelKraemer etal 202039

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202040 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) model

Lai et al202041

Case detection andisolation travel

restrictions contactreductions

ChinaTravel-network-based

SEIR model

Continued on next page

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F8 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Lorch et al202042

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Maier andBrockmann

202043

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Orea andAacutelvarez202044

Lockdown SpainSpatial econometric

analysis

Quilty et al202045

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

Sears et al202046

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Siedner etal 202047

General socialdistancing

USInterruptedtime-series

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F7) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

61

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

62

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all ldquotrade activitiesrdquo (except within shopping malls) service jobsthat donrsquot involve physical contact museums libraries and galleries opened Sincethe criteria regarding ldquoall retail stores being openrdquo was met we counted April 27 asthe end date for ldquoMost nonessential businesses closedrdquo

63

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H All model equations

This section will mainly be of interest to readers that wish to re-implement the model

Variables are indexed by NPI i country c and day t All prior distributions are independent

Data

1 NPI Activations φi t c isin 012 Observed (Daily) Cases Ct c 3 Observed (Daily) Deaths D t c

Prior Distributions

1 Country-specific R0 R0c sim Normal(325κ) κsim Half Normal(micro= 0σ= 05)2 NPI effectiveness αi sim Asymmetric Laplace(m = 0κ= 05λ= 10) m is the location

parameter κgt 0 is the asymmetry parameter and λgt 0 is the scale parameter3 Infection Initial Counts

N (C )0c = exp(ζ(C )

c )

N (D)0c = exp(ζ(D)

c )

ζ(C )c sim Normal(micro= 0σ= 50)

ζ(D)c sim Normal(micro= 0σ= 50)

4 Observation Noise Dispersion Parameters

Ψcases sim Half Normal(micro= 0σ= 5) (H1)

Ψdeaths sim Half Normal(micro= 0σ= 5) (H2)

Hyperparameters

1 Growth Noise Scale σg = 02

Delay Distributions

1 Generation interval distribution1314

microGI sim Normal(micro= 506σ= 03265)

σGI sim Normal(micro= 211σ= 05)

64

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

2 Time from infection to case confirmation T (C )131516f

microinfrarrconf sim Normal(micro= 1092σ= 094)

Ψinfrarrconf sim Normal(micro= 541σ= 027)

This distribution is converted into a forward-delay vector

T (C )[t ] =

1ZC

Negative Binomial(t micro=microinfrarrconfα=Ψinfrarrconf) t lt 32

0 otherwise

with ZC =31sum

t prime=0Negative Binomial(t primemicro=microinfrarrconfα=Ψinfrarrconf)

ie the delay follows a truncated and normalised negative binomial distribution3 Time from infection to death T (D)f131517

microinfrarrdeath sim Normal(micro= 2182σ= 101)

Ψinfrarrdeath sim Normal(micro= 1426σ= 518)

This distribution is converted into a forward-delay vector

T (D)[t ] =

1ZD

Negative Binomial(t micro=microinfrarrdeathα=Ψinfrarrdeath) t lt 48

0 otherwise

with ZD =47sum

t prime=0Negative Binomial(t primemicro=microinfrarrdeathα=Ψinfrarrdeath)

ie the delay follows a truncated and normalised negative binomial distribution

Infection Model

Rt c = R0c middotexp

(minus

Isumi=1

αi φi t c

) where I is the number of NPIs

αGI =microGI

σ2GI

βGI =micro2

GI

σ2GI

g t c = exp

(βGI(R

1αGIct minus1)

)minus1

fα in the definition of the Negative Binomial distribution is the dispersion parameter Larger values of αcorrespond to a smaller variance and less dispersion With our parameterisation the variance of the Negative

Binomial distribution is micro+ micro2

α

65

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

]

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise

ε(C )τc sim Normal(micro= 0σ=σg )

ε(D)τc sim Normal(micro= 0σ=σg )

Observation Modelf

Ct c =31sumτ=0

N (C )tminusτcT

(C )[τ]

D t c =47sumτ=0

N (D)tminusτcT

(D)[τ]

Ct c sim Negative Binomial(micro= Ct c α=Ψcases)

D t c sim Negative Binomial(micro= D t c α=Ψdeaths)

66

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix I References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Sam Abbott et al ldquoEstimating the time-varying reproduction number of SARS-CoV-2using national and subnational case countsrdquo In Wellcome Open Research 5112 (2020)p 112

3 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

4 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

5 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

6 Sheikh Taslim Ali et al ldquoSerial interval of SARS-CoV-2 was shortened over time bynonpharmaceutical interventionsrdquo In Science 3696507 (2020) pp 1106ndash1109

7 Xingjie Hao et al ldquoReconstruction of the full transmission dynamics of COVID-19 inWuhanrdquo In Nature 5847821 (Aug 2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the robustness of effectiveness estimation of nonpharma-ceutical interventions against COVID-19 transmissionrdquo In arXiv preprint arXiv200713454(2020) URL httpsarxivorgabs200713454

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

12 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

13 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

14 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

67

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

15 Stephen A Lauer et al ldquoThe incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases estimation and applicationrdquo In Annals ofinternal medicine 1729 (2020) pp 577ndash582

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

19 Solomon Hsiang et al ldquoThe effect of large-scale anti-contagion policies on the COVID-19 pandemicrdquo In Nature 5847820 (2020) pp 262ndash267

20 Nazrul Islam et al ldquoPhysical distancing interventions and incidence of coronavirusdisease 2019 natural experiment in 149 countriesrdquo In bmj 370 (2020)

21 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo In Covid Economics 7 (Apr 2020) pp 46ndash67

22 Yang Liu et al ldquoThe impact of non-pharmaceutical interventions on SARS-CoV-2 trans-mission across 130 countries and territoriesrdquo In medRxiv (2020)

23 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

24 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

25 Nick Golding et al ldquoReconstructing the global dynamics of under-ascertained COVID-19 cases and infectionsrdquo In medRxiv (2020) DOI 1011012020070720148460eprint httpswwwmedrxivorgcontentearly202007082020070720148460fullpdf URL httpswwwmedrxivorgcontentearly202007082020070720148460

26 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

27 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

28 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

68

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

29 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

30 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

31 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

32 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

33 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

36 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

37 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

38 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

39 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

40 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

41 Shengjie Lai et al ldquoEffect of non-pharmaceutical interventions to contain COVID-19 inChinardquo en In Nature 5857825 (Sept 2020) pp 410ndash413 DOI 101038s41586-020-2293-x

69

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

42 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

43 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

44 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

45 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

46 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

47 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

70

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Testing reporting and infection fatality rates
      • Method for uncertainty over key epidemiological parameters
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Multivariate Sensitivity Analysis - Expanded Results
              • Sensitivity to additional epidemiological assumptions
              • Sensitivity to data preprocessing
              • Validation by predicting unseen data - all countries
              • Posterior predictive distributions
              • Calibration without countries used for hyperparameter selection
              • MCMC stability results
                • Additional results
                  • Estimated R0 by country
                  • Collinearity
                    • The individual effects of school and university closures
                    • Co-occurrence of NPIs
                    • Correlations between effectiveness estimates
                      • Posterior Epidemiological Parameter Distributions
                        • Additional discussion of assumptions and limitations
                          • Limitations of the data
                          • Model limitations
                            • Overview of previous work
                            • Handling edge cases in the data collection
                            • All model equations
                            • References
Page 8: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan

before a country has reached 10 deaths We include these thresholds in our sensitivityanalysis (Appendix C3)

Short model description

For each country c

For each day t

New infections N (sdot)tc

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Mask wearing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closedSchools closed

Stay-at-home order

Growth reductions from interventions αi i

Daily growth rate gtc

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Uncertain delay from infection to

confirmation death

Uncertain generation interval

Universities closed

Figure 2 Model Overview Purple nodes are observed We describe the diagram from bottom to topThe effectiveness of NPI i is represented by αi which is independent of the country On each day t a countryrsquos daily reproduction number Rt c depends on the countryrsquos basic reproduction number R0cand the active NPIs The active NPIs are encoded by Φi t c which is 1 if NPI i is active in country c attime t and 0 otherwise Rt c is transformed into the daily growth rate gt c using the generation intervalparameters and subsequently is used to compute the new infections N (C )

t c and N (D)t c that will turn into

confirmed cases and deaths respectively Finally the number of new confirmed cases Ct c and deathsDt c is computed by a discrete convolution of N (middot)

t c with the respective delay distributions Our modeluses both death and case data it splits all nodes above the daily growth rate gt c into separate branchesfor deaths and confirmed cases We account for uncertainty in the generation interval infection-to-case-confirmation delay and the infection-to-death delay by placing priors over the parameters of thesedistributions

In this section we give a short summary of the model (Figure 2) The detailed modeldescription is given in Appendix A Code is available online here

Our model uses case and death data from each country to lsquobackwardsrsquo infer the number ofnew infections at each point in time which is itself used to infer the reproduction numbersNPI effects are then estimated by relating the daily reproduction numbers to the activeNPIs across all days and countries This relatively simple data-driven approach allows usto sidestep assumptions about contact patterns and intensity infectiousness of different age

8

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

groups and so forth that are typically required in modelling studies We make several addi-tions to the semi-mechanistic Bayesian hierarchical model of Flaxman et al1 allowing ourmodel to observe both cases and death data This increases the amount of data from whichwe can extract NPI effects reduces distinct biases in case and death reporting and reducesthe bias from including only countries with many deaths Since epidemiological parametersare only known with uncertainty we place priors over them following recent recommendedpractice17 Additionally as we do not aim to infer the total number of COVID-19 infectionswe can avoid assuming a specific infection fatality rate (IFR) or ascertainment rate (rate oftesting)

The growth of the epidemic is determined by the time- and country-specific reproductionnumber Rt c which depends on a) the (unobserved) basic reproduction number R0c givenno active NPIs and b) the active NPIs at time t R0c accounts for all time-invariant factorsthat affect transmission in country c such as differences in demographics population den-sity culture and health systems18 We assume that the effect of each NPI on Rt c is stableacross countries and time The effectiveness of NPI i is represented by a parameter αi over which we place an Asymmetric Laplace prior that allows for both positive and negativeeffects but places 80 of its mass on positive effects reflecting that NPIs are more likely toreduce Rt c than to increase it Following Flaxman et al and others168 each NPIrsquos effecton Rt c is assumed to independently affect Rt c as a multiplicative factor

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (1)

where φi ct = 1 indicates that NPI i is active in country c on day t (φi ct = 0 otherwise) andI is the number of NPIs The multiplicative effect encodes the plausible assumption thatNPIs have a smaller absolute effect when Rt c is already low We discuss the meaning ofeffectiveness estimates given NPI interactions in the Results section

In the early phase of an epidemic the number of new daily infections grows exponentiallyDuring exponential growth there is a one-to-one correspondence between the daily growthrate and Rt c

19 The correspondence depends on the generation interval (the time betweensuccessive infections in a chain of transmission) which we assume to have a Gamma dis-tribution The prior on the mean generation interval has mean 506 days derived from ameta-analysis20

We model the daily new infection count separately for confirmed cases and deaths repre-senting those infections which are subsequently reported and those which are subsequentlyfatal However both infection numbers are assumed to grow at the same daily rate in ex-pectation allowing the use of both data sources to estimate each αi The infection numberstranslate into reported confirmed cases and deaths after a stochastic delay The delay is thesum of two independent distributions assumed to be equal across countries the incuba-tion period and the delay from onset of symptoms to confirmation We put priors over themeans of both distributions resulting in a prior over the mean infection-to-confirmationdelay with a mean of 1092 days20 see Appendix A3 Similarly the infection-to-death

9

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

delay is the sum of the incubation period and the delay from onset of symptoms to deathand the prior over its mean has a mean of 218 days20 Finally as in related NPI models16both the reported cases and deaths follow a negative binomial noise distribution with aninferred noise dispersion

Using a Markov chain Monte Carlo (MCMC) sampling algorithm21 this model infers pos-terior distributions of each NPIrsquos effectiveness while accounting for cross-country variationsin testing reporting and fatality rates as well as uncertainty in the generation interval anddelay distributions To analyse the extent to which modelling assumptions affect the re-sults our sensitivity analysis includes all epidemiological parameters prior distributionsand many of the structural assumptions introduced above (Appendix B2 and AppendixC) MCMC convergence statistics are given in Appendix C7

Results

NPI Effectiveness

Our model enables us to estimate the individual effectiveness of each NPI expressed as apercentage reduction in R As in related work168 this percentage reduction is modelledas constant over countries and time and independent of the other implemented NPIs Inpractice however NPI effectiveness may depend on other implemented NPIs and localcircumstances Thus our effectiveness estimates ought to be interpreted as the effectivenessaveraged over the contexts in which the NPI was implemented in our data10 Our results thusgive the average NPI effectiveness across typical situations that the NPIs were implementedin Figure 3 (bottom left) visualises which NPIs typically co-occurred aiding interpretation

Under the default model settings the mean percentage reduction in R (with 95 credibleinterval) associated with each NPI is as follows (Figure 3) mandating mask-wearing in(some) public spaces -1 (-13ndash8) limiting gatherings to 1000 people or less 13(-3ndash31) to 100 people or less 28 (9ndash44) to 10 people or less 36 (17ndash53) closing some high-risk businesses 20 (0ndash40) closing most nonessential busi-nesses 29 (8ndash47) closing schools and universities 41 (23ndash56) and issuingstay-at-home orders (with exemptions) 10 (-2ndash22) Note that we cannot robustlydisentangle the individual effects of closing schools and closing universities since the im-plementation dates of these NPIs coincided nearly perfectly in all countries except Icelandand Sweden (Appendix D21) We thus show the joint effect of closing both schools anduniversities and treat schools and universities closed as one single NPI going forward

Some NPIs frequently co-occurred ie were partly collinear However we are able to iso-late the effects of individual NPIs since the collinearity is imperfect and our dataset is largeFor every pair of NPIs we observe one of them without the other for 748 country-days onaverage (Appendix D22) The minimum number of country-days for any NPI pair is 143(for limiting gatherings to 1000 or 100 attendees) Additionally under excessive collinear-ity and insufficient data to overcome it individual effectiveness estimates are highly sensi-

10

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75 100Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spaces

Gatherings limited to1000 people or less

Gatherings limited to100 people or less

Gatherings limited to10 people or less

Some businessesclosed

Most nonessentialbusinesses closed

Schools and universitiesclosed

Stay-at-home order(with exemptions)

EndsTransmission

北 便

i

j

100

100

100 81

10

0 64

100

100 45

27

100 94

78

89

65

88

93

44

29

100

100 82

92

68

91

96

46

28

100

100

100 99

76

97

98

55

30

99

97

86

100 72

95

98

49

27

99

98

91

100

100 99

99

64

30

97

95

84

95

72

100 99

49

29

98

95

81

92

68

93

100 46

28

100

100 98

10

0 94

99

99

100

Frequency i Active Given j Active

0 500 1000 1500 2000 2500 3000

Days

Total Days Active

Figure 3 Top NPI effects under default model settings The Figure shows the average percentagereductions in R as observed in our data (or in terms of the model the posterior marginal distributionsof 1minusexp(minusαi )) with median 50 and 95 credible intervals A negative 1 reduction refers to a 1increase in R Cumulative effects are shown for hierarchical NPIs (gathering bans and business closures)ie the result for Most nonessential businesses closed shows the cumulative effect of two NPIs withseparate parameters and symbols - closing some (high-risk) businesses and additionally closing mostremaining (non-high-risk but nonessential) businesses given that some businesses are already closedBottom Left Conditional activation matrix Cell values indicate the frequency that NPI i (x-axis) wasactive given that NPI j (y-axis) was active Eg schools were always closed whenever a stay-at-homeorder was active (bottom row third column from the right) but not vice versa Bottom Right Totalnumber of days each NPI was active across all countries

11

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Most businessesclosed

Stay-at-homeorder

Mask wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businessesclosed

Schools anduniversities closed

Figure 4 Combined NPI effectiveness for the most common sets of NPIs in our data by size of the NPI setShaded regions denote 50 and 95 credible intervals Left Maximum R0 that could be reduced to be-low 1 for each set of NPIs Right Predicted R after implementation of each set of NPIs assuming R0 = 38Readers can interactively explore the effects of all sets of NPIs at httpepidemicforecastingorgcalc

tive to variations in the data and model parameters22 High sensitivity prevented Flaxmanet al1 who had a smaller dataset from disentangling NPI effects9 Our estimates are sub-stantially less sensitive (see next section) Finally the posterior correlations between theeffectiveness estimates are weak suggesting manageable collinearity (Appendix D23)

Although the correlations between the individual estimates are weak we should take theminto account when evaluating combined NPI effects For example if two NPIs frequentlyco-occurred there may be more certainty about the combined effect than about the twoindividual effects Figure 4 shows the combined effectiveness of the sets of NPIs thatare most common in our data Together our set of NPIs reduced R by 77 (74ndash79)on average Across countries the mean R without any NPIs (ie the R0) was 33 (Ta-ble D5 reports R0 for all countries) Starting from this number the estimated R likely couldhave been reduced below 1 by closing schools and universities high-risk businesses andlimiting gathering sizes Readers can interactively explore the effects of sets of NPIs athttpepidemicforecastingorgcalc A CSV file containing the joint effectiveness of all NPIcombinations is available online here

Sensitivity and validation

We perform a range of validation and sensitivity experiments (Appendix B with furtherexperiments in Appendix C) First we analyse how the model extrapolates to unseen coun-tries and find that it makes calibrated forecasts over periods of up to 2 months with un-certainty increasing over time Further we perform multiple sensitivity analyses studyinghow results change if we modify the priors over epidemiological parameters exclude coun-tries from the dataset use only deaths or confirmed cases as observations vary the datapreprocessing and more Finally we investigate our key assumptions by showing results

12

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

for several alternative models (structural sensitivity10) and examine possible confoundingof our estimates by unobserved factors influencing R In total we consider 201 alternativeexperimental conditions

Figure 5 (left) shows the median NPI effectiveness across these 201 experimental condi-tions Compared to the results under our default settings (Figures 3 and 4) median NPIeffects vary under alternative plausible experimental conditions However the trends in theresults are robust and some NPIs outperform others under all tested conditions While wetest over large ranges of plausible values our experiments do not include every possiblesource of uncertainty and the results might change more substantially under experimentalconditions we have not tested

We categorise NPI effects into small moderate and large which we define as a medianreduction in R of less than 175 between 175 and 35 and more than 35 (verticallines in Figure 5) Six of the NPIs fall into a single category in a large fraction of experi-mental conditions school and university closures are associated with a large effect in 99of experimental conditions limiting gatherings to 10 people or less in 94 Closing mostnonessential businesses has a moderate effect in 96 of conditions limiting gatherings to100 people or less in 97 Making mask-wearing mandatory in (some) public spaces fallsinto the ldquosmall effectrdquo category in 100 of experimental conditions issuing stay-at-homeorders (with exemptions) in 99 Two NPIs fall less clearly into one category Closing some(high-risk) businesses has a moderate effect in 86 of conditions and limiting gatherings to1000 people or less has a small effect in 79 However both NPIs have small-to-moderateeffects in more than 99 of experimental conditions The effect of limiting gatherings to1000 people or less is the least stable across the sensitivity analyses which may reflect itsaforementioned partial collinearity with limiting gatherings to 100 people or less

Aggregating all sensitivity analyses can hide sensitivity to specific assumptions We displaythe median NPI effects in four categories of sensitivity analyses (Figure 5 right) and eachindividual sensitivity analysis is shown in the Appendix The trends in the results are alsostable within the categories

Discussion

We use a data-driven approach to estimate the effects that eight nonpharmaceutical in-terventions had on COVID-19 transmission in 41 countries between January and the endof May 2020 We find that several NPIs were associated with a clear reduction in R inline with the mounting evidence that NPIs can be effective at mitigating and suppressingoutbreaks of COVID-19 Furthermore our results suggest that some NPIs outperformedothers While the exact effectiveness estimates vary with modelling assumptions the broadconclusions discussed below are largely robust across 201 experimental conditions in 11sensitivity analyses

13

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

0 25 50Median reduction in Rt

Mask-wearing mandatory in(some) public spacesGatherings limited to1000 people or lessGatherings limited to100 people or lessGatherings limited to10 people or lessSome businessesclosedMost nonessentialbusinesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

All Sensitivity Analyses (201 conditions)北便

Model Structure(5 conditions)

北便

Data and Preprocessing(51 conditions)

0 25 50Median reduction in Rt

北便

Epidemiological Priors(131 conditions)

0 25 50Median reduction in Rt

北便

Unobserved Factors(14 conditions)

Figure 5 Median NPI effects across the sensitivity analyses Left Median NPI effects (reduction in R)when varying different components of the model or the data in 201 experimental conditions Results aredisplayed as violin plots using kernel density estimation to create the distributions Inside the violinsthe box plots show median and interquartile-range The vertical lines mark 0 175 and 35 (seetext) Right Categorised sensitivity analyses Structural Using only cases or only deaths as observations(2 experimental conditions Figure B12) varying the model structure (3 conditions Figure B13 left)Data Leaving out one country at a time (41 conditions Figure B10) varying the threshold below whichcases and deaths are masked (8 conditions Figure C17) sensitivity to correcting for undocumentedcases and to country-level differences in case ascertainment (2 conditions Figure B11) Epidemiologicalpriors Jointly varying the means of the priors over the means of the generation interval the infection-to-case-confirmation delay and the infection-to-death delay (125 conditions Figure C15) varying theprior over R0 (4 conditions Figure C16 left) varying the prior over NPI effectiveness (2 conditionsFigure C16 right) Unobserved factors Excluding observed NPIs one at a time (9 conditions Figure B14left) controlling for additional NPIs from a different dataset (5 conditions Figure B14 right)

14

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Business closures and gathering bans both seem to have been effective at reducing COVID-19 transmission Closing only high-risk businesses appears to have been only somewhat lesseffective than closing most nonessential businesses the median reduction in R differs onlyby 9 (6ndash12 mean and 95 interval of median estimates across the experiments set-tings in Figure 5) Closing only high-risk businesses may thus have been the more promisingpolicy option in some circumstances Limiting gatherings to 10 people or less was more ef-fective than limits up to 100 or 1000 people This may reflect the fact that small gatheringsare common

As previously discussed we estimate the average effect each NPI had in the contexts inwhich it was implemented When countries introduced stay-at-home orders they nearly al-ways also banned gatherings and closed schools universities and some or most businessesif they had not done so already (Figure 3 bottom left) Flaxman et al1 and Hsiang et al3

add the effect of these distinct NPIs to the effectiveness of stay-at-home orders and accord-ingly find a large effect In contrast we account for these other NPIs separately and isolatethe additional effect of ordering the population to stay at home (when large gatheringsare banned and educational institutions and some businesses closed)d In accordance withother studies that took this approach we find a small effect26 A typical country could havereduced R to below 1 without a stay-at-home order (Figure 4) provided other NPIs wereimplemented

Mandating mask-wearing in various public spaces had no clear effect on average in thecountries we studied This does not rule out mask-wearing mandates having a larger ef-fect in other contexts In our data mask-wearing was only mandated when other NPIshad already reduced public interactions When most transmission occurs in private spaceswearing masks in public is expected to be less effective This might explain why a largereffect was found in studies that included China and South Korea where mask-wearingwas introduced earlier823 While there is an emerging body of literature indicating thatmask-wearing can be effective in reducing transmission the bulk of evidence comes fromhealthcare settings24 In non-healthcare settings risk compensation25 may play a largerrole potentially reducing effectiveness While our results cast doubt on reports that mask-wearing is the main determinant shaping a countryrsquos epidemic23 the policy still seemspromising given all available evidence due to its comparatively low economic and socialcosts Its effectiveness may have increased as other NPIs have been lifted and public inter-actions have recommenced

We find a large effect for school and university closures This finding is remarkably robustacross different model structures variations in the data and epidemiological assumptions(Figure 5) It remains robust when controlling for NPIs excluded from our study (Fig-ure B14) Our approach cannot distinguish direct and indirect effects such as forcing par-

dIn principle a stay-at-home order does not imply these other NPIs It is possible for a country to simultaneouslyhave allowed schools to be open and have a stay-at-home order (with exemptions for attending school) inplace However this is not how stay-at-home orders were implemented by the countries in our dataset andwe cannot estimate the effect of such a hypothetical stay-at-home order from the available data

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ents to stay at home or causing broader behaviour changes by increasing public concernAdditionally since school and university closures almost perfectly coincided in the coun-tries we study our approach cannot distinguish their individual effects (Appendix D21)This limitation likely also holds for other observational studies which do not include dataon university closures and estimate only the effect of school closures1ndash35ndash8 Previous evi-dence on school and university closures is mixed1626 Early data suggest that children andyoung adults are equally susceptible to infection but have a notably lower observed inci-dence rate than older adultsmdashwhether this is due to school and university closures remainsunknown27ndash30 Although infected young people are often asymptomatic they appear toshed similar amounts of virus as older people3132 and might therefore transmit the infec-tion to higher-risk demographics unknowingly As the role of children in transmission isstill unclear33 while outbreaks detected in schools are rising33ndash35 this topic merits carefulattention

Our study has several limitations First NPI effectiveness may depend on the context ofimplementation such as the presence of other NPIs and country-specific factors Our esti-mates must be interpreted as the average effectiveness over the contexts in our dataset10and expert judgement is required to adjust them to local circumstances Second R mayhave been reduced by unobserved NPIs or spontaneous behaviour changes To investigatewhether these reductions could be falsely attributed to the observed NPIs we perform sev-eral additional analyses and find that our results are stable to a range of unobserved effects(Appendix B3) However this sensitivity check cannot provide certainty Investigating therole of unobserved effects is an important topic to explore further Third our results can-not be used without qualification to predict the effect of lifting NPIs For example closingschools and universities seems to have greatly reduced transmission but this does not meanthat reopening them will cause infections to soar Educational institutions can implementsafety measures such as reduced class sizes as they reopen Further work is needed to anal-yse the effects of reopenings we hope that our collected data aids this effort Fourth whilewe included more NPIs than most previous work (Table F7) several promising NPIs wereexcluded For example testing tracing and case isolation may be an important part of acost-effective epidemic response36 but were not included because it is difficult to obtaincomprehensive data We discuss further limitations in Appendix E

Although our work focuses on estimating the impact of NPIs on the reproduction numberR the ultimate goal of governments may be to reduce the incidence prevalence and excessmortality of COVID-19 Controlling R is essential but the contribution of NPIs towards thesegoals may also be mediated by other factors such as their duration and timing37 periodicityand adherence3839 and successful containment40 While each of these factors addressestransmission within individual countries it can be crucial to additionally synchronise NPIsbetween countries since cases can be imported41

In conclusion as governments around the world seek to keep R below 1 while minimisingthe social and economic costs of their interventions we hope that our results can informpolicy decisions on which NPIs to implement in any potential further wave of infectionsAdditionally our work may provide insight on which areas of public life are most in need

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

of restructuring so that they can continue despite the pandemic However our estimatesshould not be seen as the final word on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies simulation studiesexperimental trials and clinical experience

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

The paid contractor work helping with the data collection the development of the interac-tive website and the costs for cloud compute were funded by the Berkeley Existential RiskInitiative

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier V Mikulik MAHartwick L Chindelevitch designed the model and modelling experiments M Sharma ABStephenson T Gavenciak J Salvatier performed and analysed the modelling experimentsJ Kulveit T Gavenciak JM Brauner conceived the research S Mindermann M SharmaJM Brauner L Chindelevitch J Kulveit T Besiroglu did the literature search JM BraunerS Mindermann M Sharma G Leech L Chindelevitch T Besiroglu V Mikulik wrote themanuscript All authors read and gave feedback on the manuscript and approved the finalmanuscript JM Brauner S Mindermann and M Sharma contributed equally L Chindele-vitch Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

3 Solomon Hsiang et al ldquoThe effect of large-scale anti-contagion policies on the COVID-19 pandemicrdquo In Nature 5847820 (2020) pp 262ndash267

4 Shengjie Lai et al ldquoEffect of non-pharmaceutical interventions to contain COVID-19 inChinardquo en In Nature 5857825 (Sept 2020) pp 410ndash413 DOI 101038s41586-020-2293-x

5 Yang Liu et al ldquoThe impact of non-pharmaceutical interventions on SARS-CoV-2 trans-mission across 130 countries and territoriesrdquo In medRxiv (2020)

6 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

7 Nazrul Islam et al ldquoPhysical distancing interventions and incidence of coronavirusdisease 2019 natural experiment in 149 countriesrdquo In bmj 370 (2020)

8 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo In Covid Economics 7 (Apr 2020) pp 46ndash67

9 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

10 Mrinank Sharma et al ldquoOn the robustness of effectiveness estimation of nonpharma-ceutical interventions against COVID-19 transmissionrdquo In arXiv preprint arXiv200713454(2020) URL httpsarxivorgabs200713454

11 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

12 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

13 Ensheng Dong Hongru Du and Lauren Gardner ldquoAn interactive web-based dashboardto track COVID-19 in real timerdquo In The Lancet Infectious Diseases 205 (May 2020)pp 533ndash534 DOI 101016s1473-3099(20)30120-1

14 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

15 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

16 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

17 Sam Abbott et al ldquoEstimating the time-varying reproduction number of SARS-CoV-2using national and subnational case countsrdquo In Wellcome Open Research 5112 (2020)p 112

18 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

19 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

20 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Young Joon Park et al ldquoContact tracing during coronavirus disease outbreak SouthKorea 2020rdquo In Emerg Infect Dis 2610 (2020)

28 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

29 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

30 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

31 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

32 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

33 Chen Stein-Zamir et al ldquoA large COVID-19 outbreak in a high school 10 days afterschoolsrsquo reopening Israel May 2020rdquo In Eurosurveillance 2529 (2020) p 2001352

34 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 38 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile919676Weekly_COVID19_Surveillance_Report_week_38_FINAL_UPDATEDpdf 2020

35 Meira Levinson Muge Cevik and Marc Lipsitch Reopening Primary Schools during thePandemic 2020

36 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

37 K Prem et al ldquoThe effect of control strategies to reduce social mixing on outcomes ofthe COVID-19 epidemic in Wuhan China a modelling studyrdquo In Lancet Public Health5 (5 May 2020) e261ndashe270

38 Patrick G T Walker et al ldquoThe impact of COVID-19 and strategies for mitigationand suppression in low- and middle-income countriesrdquo In Science 3696502 (2020)pp 413ndash422

39 Nicholas G Davies et al ldquoEffects of non-pharmaceutical interventions on COVID-19cases deaths and demand for hospital services in the UK a modelling studyrdquo In TheLancet Public Health 57 (2020) e375ndashe385

40 Xingjie Hao et al ldquoReconstruction of the full transmission dynamics of COVID-19 inWuhanrdquo In Nature 5847821 (Aug 2020)

41 N W Ruktanonchai et al ldquoAssessing the impact of coordinated COVID-19 exit strate-gies across Europerdquo In Science (2020) DOI 101126scienceabc5096

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 23

Appendix A1 Detailed model description 23

Appendix A2 Testing reporting and infection fatality rates 26

Appendix A3 Method for uncertainty over key epidemiological parameters 27

Appendix B Validation 30

Appendix B1 Unseen data 30

Appendix B2 Sensitivity Analysis 30

Appendix B3 Robustness to unobserved effects 35

Appendix C Additional sensitivity analyses and validation 38

Appendix C1 Multivariate Sensitivity Analysis - Expanded Results 38

Appendix C2 Sensitivity to additional epidemiological assumptions 40

Appendix C3 Sensitivity to data preprocessing 40

Appendix C4 Validation by predicting unseen data - all countries 42

Appendix C5 Posterior predictive distributions 47

Appendix C6 Calibration without countries used for hyperparameter selection 48

Appendix C7 MCMC stability results 49

Appendix D Additional results 50

Appendix D1 Estimated R0 by country 50

Appendix D2 Collinearity 51

Appendix D3 Posterior Epidemiological Parameter Distributions 53

Appendix E Additional discussion of assumptions and limitations 55

Appendix E1 Limitations of the data 55

Appendix E2 Model limitations 55

Appendix F Overview of previous work 57

Appendix G Handling edge cases in the data collection 60

Appendix H All model equations 64

Appendix I References 67

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections N (sdot)tc

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Mask wearing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closedSchools closed

Stay-at-home order

Growth reductions from interventions αi i

Daily growth rate gtc

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Uncertain delay from infection to

confirmation death

Uncertain generation interval

Universities closed

Figure A6 Model Overview Purple nodes are observed We describe the diagram from bottom to topThe effectiveness of NPI i is represented by αi which is independent of the country On each day t a countryrsquos daily reproduction number Rt c depends on the countryrsquos basic reproduction number R0cand the active NPIs The active NPIs are encoded by Φi t c which is 1 if NPI i is active in country c attime t and 0 otherwise Rt c is transformed into the daily growth rate gt c using the generation intervalparameters and subsequently is used to compute the new infections N (C )

t c and N (D)t c that will turn into

confirmed cases and deaths respectively Finally the number of new confirmed cases Ct c and deathsDt c is computed by a discrete convolution of N (middot)

t c with the respective delay distributions Our modeluses both death and case data it splits all nodes above the daily growth rate gt c into separate branchesfor deaths and confirmed cases We account for uncertainty in the generation interval infection-to-case-confirmation delay and the infection-to-death delay by placing priors over the parameters of thesedistributions

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1but with several key extensions First we model both confirmed cases and deaths al-lowing us to leverage significantly more data Furthermore we do not assume a specificinfection fatality rate (IFR) since we do not aim to infer the total number of COVID-19 in-fections Additionally since epidemiological parameters are only known with uncertaintywe place priors over them following recent best practice2 Concretely we place priors onthe means and standard deviationsdispersions of the generation interval the infection-to-confirmation delay and the infection-to-death delay These prior choices are detailedin Appendix A3 The end of this section details further adaptations which allow us to

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

minimise assumptions about testing reporting and the IFR Code is available online hereFor readers who wish to re-implement the model a concise list of all equations is given inAppendix H

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior distribution over R0c reflecting the wide disagreement of regionalestimates of R0

3 The mean R0c is 328 based on a meta-analysis4 We parameterize theeffectiveness of NPI i assumed to be same across countries and time with αi Each NPI isassumed to have an independent multiplicative effect on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We place an Asymmetric Laplace prior over αi with scale parameter10 asymmetry parameter 05 and location parameter 0 Our prior allows for (unbounded)positive and negative effects as we cannot a priori exclude the possibility that the introduc-tion of an NPI increases R However our prior places 80 of its mass on positive effectsreflecting a belief that NPIs are much more likely to reduce Rt than not This is a shrinkageprior placing 50 of its mass on lsquosmallrsquo effectiveness (less than 10 change in Rt ) All priordistributions are independent

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c and Rt c

(Eq 29 in5)

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We account for uncertainty inthe generation interval (GI) assumed to be a Gamma distribution by placing prior distri-butions over its mean and standard deviation (Appendix A3) Using (A2) we can writeg t c as a function g t c (Rt c microGIσGI) (see Appendix H) where microGI is the GI mean and σGI isthe GI standard deviation

aMany epidemiological models define growth rates as the exponent r in an exponential growth function Herewe use daily growth rates instead for ease of exposition These choices are mathematically equivalent Notethat we adapted equation (29) in Wallinga amp Lipsitch5 to account for our choice

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Note that the GI can change over time due to NPIs In particular the effective contact tracingand case isolation in China substantially shortened the mean GI to only 26 days among theidentified cases6 Since most cases were likely not identified even in Wuhan China7 andour study omits most of the countries with highly effective contact tracing programs suchas China or South Korea we model the GI as constant (but uncertain) over time A possibleextension for further work is to model the change in GI explicitly

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are inferred from the observation models for cases and deathsWe assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)] (A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)] (A4)

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross-validation as no reference is available for it We evaluate five dif-ferent values (σN isin 00501020304) with a fixed randomly chosen validation set of6 countries σN = 02 maximises the predictive log-likelihood on the validation set Thisensures a more calibrated model which is less likely to produce overconfident or unstableestimates10 We did not tune any other aspect of the model

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation In realitythis delay distribution is the sum of two independent distributions the incubation period(lognormal distribution) and the delay from onset of symptoms to confirmation (negativebinomial distribution) These distributions are uncertain but modelling (and placing pri-ors over) them individually leads to unidentifiability Therefore we model the total delay

bSince we treat new infections as a continuous number its initial value can be between 0 and 1

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

between infection and confirmation assuming that this follows a negative binomial dis-tribution We convert priors over the individual delays to a prior over the total delay bybootstrapping mdash please see Appendix A3

As in Flaxman et al1 the observed cases Ct c follow a negative binomial noise distribu-tion with mean C t c and an inferred dispersion parameter Ψ(C ) This distribution encodesthat small case numbers are more noisy and should therefore receive less weight Hav-ing separate dispersion parameters for cases and deaths ensures that they can be weighteddifferently if there is a difference in their noise distributions

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ) (A6)

where PD (delay) is the distribution of the delay from infection to death As for cases wemodel the total delay between infection and death as negative binomial which is in realitythe sum of two independent distributions the incubation period and the delay from onsetof symptoms to death (gamma distribution)

Finally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and an inferred dispersion parameter Ψ(D)

The model was implemented in PyMC311 We infer the unobserved variables in our modelusing the No-U-Turn Sampler (NUTS)12 a standard Markov chain Monte Carlo samplingalgorithm

Appendix A2 Testing reporting and infection fatality rates

Scaling all values of a time series by a constant does not change its growth rates Themodel is therefore invariant to the scale of the observations and consequently to country-level differences in the IFR and the ascertainment rate (the proportion of infected peoplewho are subsequently reported positive) For example assume countries A and B differ onlyin their ascertainment rates Then our model will infer a difference in N (C )

0c (Eq (A3)) butnot in the growth rates g t c across A and B Accordingly the inferred NPI effectiveness willbe identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas the

cThis is only approximately true The negative binomial output distribution has a coefficient of variation dimin-ishing with its mean ie smaller observations are relatively more noisy and carry less weight Furthermorewhilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligible effect

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

introduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimics changes in the ascertainment ratemdashnoise at time

τ affects all future casesmdashand allows for gradual multiplicative changes in ascertainment

Appendix A3 Method for uncertainty over key epidemiological parameters

We account for uncertainty in the following delay distributions the generation interval thedelay from infection to symptom onset (incubation period) the delay from symptom onsetto case confirmation and the delay from symptom onset to death

Each distribution has an uncertain mean (Table A3) whose prior distribution we takefrom a meta-analysis13 published shortly after our window of analysis This helps accountfor possible heterogeneity between different populations and therefore often finds higheruncertainty in the means than estimated in primary studies while using more data Themeta-analysis reports mean values with 95 confidence intervals We set normal priors overthe mean delays with mean equal to the reported mean in the meta-analysis and standarddeviation set such that the prior 95 credible interval includes the 95 confidence intervalsfrom the meta-analysis For example the meta-analysis reports an expected mean delayfrom symptoms to death of 1671 days with a 95 credible interval ranging from 1537 to1817 We thus assume that the prior is normal with micro= 1671 and σ= 075 which ensuresthat its 95 credible interval ranges from 1525 to 1817 strictly including the intervalreported in the meta analysis to be conservative Priors are truncated at zero

Furthermore we account for the uncertainty in the standard deviation of these delay dis-tributions by placing normal priors based on primary studies following the same methodas above (Table A4) For the standard deviation of the generation interval we use patientdata from Feretti et al14 and reproduce their method fitting a Gamma distribution to thegeneration time

Finally the distributional forms of the delay distributions are taken from the above primarystudies

The delay from infection to case confirmation is the sum of the incubation period and thesymptom-onset-to-case-confirmation delay Similarly the delay from infection to death isthe sum of the incubation period and the symptom-onset-to-death delay However forcomputational tractability we model the total delay distributions converting the priors de-scribed above to priors over the total delays We assume that the total delays follow negativebinomial distributions which are computationally tractable and empirically provide a closefit to both sum distributions We place normal priors over the mean and dispersion parame-ters of these total delay distributions by bootstrapping For example we compute priors forthe total infection-to-case-confirmation delay distribution by sampling Nbootstrap = 250 in-

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

cubation periods and symptom-onset-to-case-confirmation distributions For each sampledpair of distributions we draw 106 samples from their sum and fit a negative binomial usingmoment matching We compute the mean and standard deviation of the negative binomialdistribution parameters across the bootstrap and use this for our prior The resulting priorsare given in Table A2

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table A2 Epidemiological parameter distributions used in the model The details on sources and priorsused to generate this Table are given in Tables A3 and A4

Delay type Sources Prior on mean Prior on standard Distributionaldeviation or formdispersion

Infection to case Fonfria et al13 N (1092σ= 094) Dispersion Negativeconfirmation Lauer et al15 N (541σ= 027) binomial

Cereda et al16

Infection to death Fonfria et al13 N (2182σ= 101) Dispersion NegativeLauer et al15 N (1426σ= 518) binomialLinton et al17

Generation interval Fonfria et al13 N (506σ= 033) SD N (211σ= 05) GammaFeretti et al14

Table A3 Mean of epidemiological parameters all from meta-analysis13 and distributional forms

Delay type Reported mean Prior on mean Distributionaland 95 CI form of delay

Incubation period 579 (548ndash611) logmicrosimN (153σ= 0051) Lognormal15

Onset to death 1671 (1537ndash1817) N (1671σ= 075) Gamma17

Onset to case confirmation 582 (474ndash715) N (582σ= 068) Negative binomial16

Generation interval 506 (450ndash570) N (506σ= 033) Gamma14

Table A4 Standard deviation or dispersion of epidemiological parameters

Delay type Source Reported standard deviation or Prior on standarddispersion with uncertainty deviation or

dispersionIncubation period Lauer et al15 SD of log delay SD of log delay

048 (95 CI 027ndash054) N (048σ= 00759)Onset to death Linton et al17 SD 69 (95 CI 52ndash91) N (69σ= 1122)Onset to case confirmation Cereda et al16 Dispersion 157 plusmn 0054 N (157σ= 0054)Generation interval Feretti et al14 SD 211 plusmn 05 (reproduced by us) N (211σ= 05)

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking its predictions on unseendata even if prediction is not the purpose of the model1018 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries we have strong reason to doubt itseffectiveness estimates However we do not expect NPI effectiveness models to extrapolateperfectly Almost always unobserved factors such as changes in the ascertainment rate orIFR spontaneous behaviour changes or unobserved NPIs will affect the observed numberof cases and deaths Our models ought to treat these factors as noise and not attribute theireffects on Rt to the observed NPIs

We use Leave-One-Out Cross-Validation (LOO-CV) fitting the model on 40 countries andextrapolating to the excluded country We repeat this process for all 41 countries In theexcluded country the first 14 days of case and death data are observed to estimate R0c aswell as the initial outbreak sizes N (C )

0c and N (D)0c All later days are unobserved (masked)

The model uses the effectiveness estimates inferred from the 40 included countries and NPIactivation dates to predict the course of the pandemic in the excluded country

Figure B7 visually shows extrapolations obtained from LOO-CV for a randomly chosensubset of six countries (all countries are shown in Figures C18 to C21) The predictiveaccuracy of the model as expected degrades with an increasing prediction horizon sug-gesting the presence of unobserved factors However the predictive credible intervals arewide and increase over time as noise in the growth rate accumulates Importantly ourmodel makes calibrated forecasts in excluded countries (Fig B8) over periods of up to 2months

These are challenging predictions to the best of our knowledge no related study analyseshow their estimated NPI effects generalise to unseen countries Most related studies donot validate predictions on unseen data at all19ndash23 reviewed in9 Flaxman et al1 hold outthe last 14 days from 20 April to 4th of May for all countries in aggregate However thecountries they study implemented all NPIs in March or earlier (Supplementary Table 2 in1)meaning that in fact all countries were used to estimate NPI effects

Note that Fig B8 includes the 6 countries that were used to select the hyperparameter(Appendix A) However the calibration is similar if these 6 countries are excluded (Fig-ure C23)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals the extent to which results depend on uncertain parametersand modelling choices and can diagnose model misspecification and excessive collinearity

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

100

101

102

103

104

105

106

便便

Slovenia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

便 便

Greece

Figure B7 Model extrapolations on a randomly chosen subset of 6 excluded countries produced usingleave-one-out cross-validation In each excluded country the first 14 days of death and case data (soliddots) are observed the model then extrapolates to all future days within the period of analysis (emptydots) For each country we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier see Methods) The shaded areas representthe 95 credible intervals

in the data24 We vary many of the components of our model and recompute the NPIeffectiveness estimates summarised here Further analysis can be found in Appendix C

For each sensitivity analysis we quantify sensitivity using an easily interpretable measureshown above the legend of each plot the average standard deviation denoted σ Firstfor each NPI we compute the standard deviation of its median effect estimates across allexperimental conditions in the given sensitivity analysis We then average these acrossthe NPIs Since effectiveness is expressed in percentage reductions of Rt the unit of σ ispercent Zero percent corresponds to no sensitivity

We perform a multivariate sensitivity analysis on the key epidemiological priors For com-putational reasons all other sensitivity analyses are univariate

Multivariate sensitivity to main epidemiological priors The key epidemiological param-eters in our model describe the delay from infection to case-confirmation the delay from

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

0 20 40 60 80 100Credible Interval

0

20

40

60

80

100

Em

piric

al C

umul

ativ

e Fr

eque

ncy

Calibration Country Holdouts

Figure B8 Calibration in held-out countries with leave-one-out cross validation In each excludedcountry the first 14 days of death and case data are observed to estimate R0c as well as the initialoutbreak sizes the model then extrapolates to all future days within the period of analysis The plotshows the percentage of observed daily case and death counts that lie within the X credible intervalacross all countries and days The identity line represents ideal calibration

infection to death and the generation interval Our model places prior distributions overthe means of these delay distributions Since the priors already account for uncertainty inthe delays this analysis considers what would happen if our prior beliefs changed Sincethe epidemiological parameters may interact in unexpected ways we perform a multivari-ate sensitivity analysis jointly changing the mean of the prior over the mean of each delaydistribution (referred to as the prior mean of the mean below) We vary the prior means ofthe mean delays across a wide but epidemiologically plausible range of 5 values resultingin 125 different experimental conditions We present these results in two ways

First Figure B9 shows the global sensitivity of our results to the prior mean of the mean ofeach distribution individually For example for each setting of the prior mean of the meangeneration interval we show the average over the 5times5 = 25 runs with different prior meansplaced over the mean delays between infection and case confirmationdeath This is equiv-alent to placing a uniform prior across the 125 experiment conditions and marginalisingand is standard methodology for computing global sensitivity to an individual parameter

The prior means seem to mostly influence the relative effectiveness of business closuresand gathering bans Increasing the prior means of the infection-to-case-confirmationdeathmean delays results in larger effectiveness estimates for gatherings bans and smaller esti-

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

mates for business closures while increasing the prior mean of the generation interval hasthe opposite effect

Second we assess the sensitivity of our results to jointly varying the prior means This yieldsσ= 246 We present this result graphically in Appendix C1

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Infection to Case-Confirmation Delay= 112Mean 792 daysMean 942 daysMean 1092 daysMean 1242 daysMean 1392 days

-25 0 25 50 75Average reduction in Rtin the context of our data

Infection to Death Delay= 080Mean 1782 daysMean 1982 daysMean 2182 daysMean 2382 daysMean 2582 days

-25 0 25 50 75Average reduction in Rtin the context of our data

Generation Interval= 149Mean 306 daysMean 406 daysMean 506 daysMean 606 daysMean 706 days

Figure B9 Sensitivity to key epidemiological parameter choices evaluated using a multivariate analysisWe jointly vary the prior means over the mean of the generation interval the delay between infectionand case confirmation and the delay between infection and death Each panel displays sensitivity to theprior mean of one distribution and the NPI effects are computed by averaging over the 25 runs withdifferent prior means of the two other distributions (see text)

Sensitivity to country exclusions Figure B10 shows results if one country at a time isexcluded from the data As there is no strong justification for including or excluding anyparticular country results ought to be stable if a country is excluded

-25 0 25 50 75

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

= 151DefaultAlbaniaAndorraAustriaBelgiumBosnia andHerzegovinaBulgariaCroatia

-25 0 25 50 75

= 151DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-25 0 25 50 75

= 151DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

= 151DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-25 0 25 50 75Average reduction in Rtin the context of our data

= 151DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-25 0 25 50 75Average reduction in Rtin the context of our data

= 151DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure B10 Sensitivity to excluding individual countries

Sensitivity to case-undercounting Figure B11 shows results when scaling the recordednumber of cases in each country First we correct the number of reported cases on each day

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

using estimates of the time-varying ascertainment rate from Golding et al25 For example ifthe estimated ascertainment rate in country c at time t is 50 we double the correspondingnumber of reported cases With this altered case data the model estimates a larger effectfor closing schools and universities and a smaller effect for limiting gatherings to 1000people or less There is little change in the effects estimated for the other NPIs

Second we randomly either double or halve the reported number of cases in each countryThis is intended to demonstrate that the model is invariant to constant differences in ascer-tainment between countries As expected there are negligible differences compared to thedefault results

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Case Undercounting= 163

Time-Varying ScalingRandom Constant ScalingDefault

Figure B11 Sensitivity to case undercounting considering both constant undercounting and time-varying undercounting

Sensitivity to structurally different models Figure B12 shows the NPI effectivenessestimates from models that use only cases or deaths as observations in contrast to ourmain model which uses both As expected there are some differences in the NPI effectestimates between the models which may be caused by the unique biases present in casedata and in death data Differences could also occur if NPIs differentially affected casesand deaths (eg if they affect different age groups) Nevertheless the studyrsquos qualitativeconclusions as outlined in the Discussione hold across the different data types

A number of implicit structural assumptions are made by the choice of model structureWe test sensitivity to these assumptions by evaluating NPI effectiveness estimates from al-ternative models reproducing the structural sensitivity analysis from our concurrent workwhere these models are described and compared in detail9 While these alternative modelstructures are also plausible we chose the model described in the main text as it is relativelysimple whilst also being highly robust9

eThe main qualitative conclusions as outlined in the Discussion are Stay-at-home orders had a small effectMandating mask-wearing had a small effect School and university closures had a large effect Gatherings banswere effective More strict gathering bans were more effective than less strict ones Business closures wereeffective Closing most nonessential businesses had limited benefit over closing just high-risk businesses

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Data Type

= 347Only Case DataOnly Death DataDefault

Figure B12 Sensitivity to different data types ie models using only confirmed cases only deaths orboth

The models are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting Rt into a daily growth rate a renewalprocess is used as the infection model as in earlier work1826ndash28 For computationalreasons we do not place a prior distribution over the generation interval parametersfor this model

3 Noisy-R Model The noise terms ε(middot)ct affect Rt rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on Rt The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

As Figure B13 (left) shows the multiplicative effect models support almost all conclusionsdrawn in the Discussione The only exception is that the stay-at-home order NPI is cate-gorised as moderately effective under the Different Effects Model (median reduction in Rt

of 18)

The results of the Additive Effects Model cannot be directly compared to the other modelssince it expresses results on a different scale (percentage reduction in R0 instead of Rt ) Itis therefore shown in a separate panel However the trend in the results is visually similarto the default results

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs which were implemented nor directly measures broaderbehavioural changes Since these factors influence Rt we must be wary of their effectbeing attributed to observed NPIs We investigate this further by assessing how much effec-

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

Structural Sensitivity Multiplicative Models= 311

Noisy-RDiscrete Renewal

Different EffectsDefault

-25 0 25 50 75Average reduction in Rtas a percentage of R0

in the context of our data

Structural Sensitivity Additive Model

Figure B13 Structural sensitivity analysis NPI effectiveness estimates under different structural as-sumptions Left Effectiveness estimates for multiplicative effect models Note that for computationalreasons the discrete renewal model does not have a prior over the generation interval Right Effective-ness estimates for an additive effect models For this model the effectiveness of each NPI is expressedas a reduction of R0 rather than Rt

tiveness estimates change when previously unobserved factors are included and also whenobserved factors are excluded This is best practice for addressing unobserved factors suchas confounders2930

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs31 The timings of our observed NPIsrsquo implementation dates are indeed mutuallycorrelated prompting the question of how much results change when we make previouslyobserved NPIs unobserved Figure B14 (left) shows NPI effectiveness estimates when pre-viously observed NPIs are excluded in turn The studyrsquos main qualitative conclusionse holdacross all experimental conditions and the variations in NPI effectiveness are rarely largeenough to cause a change in effectiveness category (Figure 5) except for the Gatheringslimited to 1000 people or less NPI Considering that some of the excluded NPIs have strongestimated effects when included and are correlated with other NPIs this degree of robust-ness is encouragingly high It suggests that unobserved factors will not significantly biasresults as long as their effects and their correlations with the studied NPIs do not exceedthose of the studied NPIs We hypothesise that this robustness to unobserved factors is dueto the noise on the daily growth rate in our model9

In addition Figure B14 (right) shows NPI effectiveness estimates when we observe (iecontrol for) additional NPIs taken from the OxCGRT dataset32

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Inclusion of OxCGRT NPIs= 153

Travel ScreenQuarantineTravel BansSymptomatic TestingPublic Information CampaignsInternal Movement LimitedPublic Transport LimitedDefault

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

Exclusion of Collected NPIsUncombined Effects Shown

= 188Mask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay Home OrderDefault

Figure B14 Robustness to unobserved effects Left Results when excluding previously observed NPIsWe exclude one of the NPIs in turn and show the estimates for the other NPIs Note that this subplotshows the marginal effect for hierarchical NPIs rather than the total effect different to other figuresthat show cumulative effects for gathering bans and businesses closures For example Figure 3 displaysthe total effect of closing most nonessential businesses while here we show the additional effect ofclosing most nonessential businesses over just closing some high-risk businesses We show the additionaleffects here because the effect of a cumulative intervention would become undefined when part of it isexcluded from the analysis Right Results when controlling for previously unobserved NPIs We includeone additional NPI in turn and show the estimates for the NPIs in our study (the additional NPI is notshown)

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Appendix C1 Multivariate Sensitivity Analysis - Expanded Results

We now present the results of our joint multivariate sensitivity analysis (previously sum-marised in Figure B9) in which we vary the mean of the prior (ldquoprior meanrdquo) over themean of the generation interval the delay between infection and death and the delaybetween infection and case confirmation

Figure C15 shows for each NPI how NPI effects change when all prior means are variedsimultaneously

We find that the most sensitive NPIs are Gatherings limited to 1000 people or fewer Gath-erings limited to 100 people or fewer and Most nonessential businesses closed However thelargest differences appear when all of the parameters are set to the most extreme valuesthat jointly produce a change in a particular direction We also see systematic trends asthe parameters are varied eg increasing the prior mean of the generation interval meantends to decrease the effectiveness of the Gatherings limited to 1000 or fewer NPI while in-creasing prior means of the infection to deathcase confirmation distribution means tendsto increase the effectiveness of this NPI

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

792

942

1092

1242

1392

Mas

kWea

ring

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

GI = 306

-150

-75

00

75

150GI = 406

-150

-75

00

75

150GI = 506

-150

-75

00

75

150GI = 606

-150

-75

00

75

150GI = 706

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

00

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

0

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Som

eBus

s

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Mos

tBus

s

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Educ

at

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

792

942

1092

1242

1392

Stay

Hom

e

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

Figure C15 Global sensitivity analysis Each cell shows the difference between the median effectivenessestimate under a specific experimental condition and the default condition where the estimates areexpressed as percentage reduction in Rt Each subplot represents a particular prior mean of the meangeneration interval and an NPI The NPIs vary top to bottom and the generation interval prior meanincreases left to right Each cell with a subplot represents a particular choice of the prior mean of themean infection to death delay (increasing left to right) and the prior mean of the mean infection to caseconfirmation delay (increasing bottom to top) Positive cell values indicate that the median NPI estimateis larger than with default settings and vice versa

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C2 Sensitivity to additional epidemiological assumptions

For completeness Figure C16 shows sensitivity to two further epidemiological assumptionsOn the left we show the dependence of NPI effectiveness estimates on R0 We place aNormal prior over R0 and a hyperprior over its variance reflecting the wide disagreementof regional estimates of R0 In the sensitivity analysis we vary the mean of the prior froma low value (238) to a high one (428) As expected higher mean R0 values result in largerNPI effects This trend is apparent in all NPIs but stronger for NPIs that tended to beimplemented early on (like school closures and gathering bans)

On the right we vary the prior over NPI effectiveness Our default prior on NPI effectivenessis assymmetric reflecting a belief that NPIs are more likely to reduce Rt than increase itHere we additionally test a symmetric Normal prior and a one-sided Half-Normal priorwhich does not allow for NPIs to increase Rt

Appendix C3 Sensitivity to data preprocessing

When the case count is small a large fraction of cases may be imported from other coun-tries and the testing regime may change rapidly To prevent this from biasing our modelwe neglect case numbers before a country has reached 100 confirmed cases Similarly weneglect death numbers before a country has reached 10 deaths Here we vary these thresh-olds Intuitively the minimum cases threshold should be higher than the minimum deathsthreshold

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

R0 prior= 333

[R] = 238[R] = 278

Default[R] = 328[R] = 378[R] = 428

-25 0 25 50 75Average reduction in Rtin the context of our data

NPI Effectiveness Prior= 137

i Normal(0 022)i Half Normal(022)

Default

Figure C16 Sensitivity to additional epidemiological priors Left Sensitivity to the prior mean on R0Right Sensitivity to the NPI effectiveness prior

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Minimum Cases Threshold= 137

3050Default (100)200300

-25 0 25 50 75Average reduction in Rtin the context of our data

Minimum Deaths Threshold= 102

15Default (10)3050

Figure C17 Data preprocessing sensitivity Left Variations in the minimum number of cases RightVariations in the minimum number of deaths

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Validation by predicting unseen data - all countries

Here we show the country-level plots of the single-country holdout experiments from Ap-pendix B1 We use leave-one-out cross-validation fitting the model on 40 countries andshowing its predictions on the excluded country We repeat this process for all 41 countriesIn the excluded country the first 14 days (filled dots) of death and case data are observedto allow roughly inferring R0 These days are not used to infer NPI effectiveness All laterdays are masked (unfilled dots) The activation dates of NPIs are also given and the modeluses the effectiveness estimates inferred from the 40 other countries

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

北便

France

Figure C18 Predictions on excluded countries

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

北 便

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

Malta

Figure C19 Predictions on excluded countries

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

Slovenia

Figure C20 Predictions on excluded countries

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Switzerland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

United Kingdom

Figure C21 Predictions on excluded countries Vertical lines show the activation (or inactivation) datesof NPIs Shaded areas are 95 credible intervals Blue and red dots show the observed confirmed casesand deaths while blue and red lines show the median model estimates of cases (Ct ) and deaths (Dt )Empty dots are not observed by the model For each country we show the full window of analysis (fromthe start of the epidemic until the first NPI was lifted or the 30th of May 2020 whichever was earliersee Methods)

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Posterior predictive distributions

The posterior predictive distribution (Figure C22) shows the predicted number of cases anddeaths after observing the data Although these curves can be called lsquofitsrsquo the degree of fitto the data must be interpreted with great care The fit is generally tight but this is partlydue to the inferred latent noise variables ε(C )

t and ε(D)t Inferring this latent noise allows

the posterior predictive distribution to closely match the data without overfitting the effec-tiveness parameters to the data Such behavior is common in Bayesian models which oftenperfectly interpolate the data without overfitting33 The noise terms can account for peri-ods where infections grew faster or slower than predicted based solely on the active NPIsIn such periods the noise may account for changes in testing reporting and unobservedinterventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Daily DeathsDaily Confirmed Cases

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C22 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 Calibration without countries used for hyperparameter selec-tion

0 20 40 60 80 100Credible Interval

0

20

40

60

80

100E

mpi

rical

Cum

ulat

ive

Freq

uenc

y

Calibration Country Holdouts

Figure C23 Calibration in held-out countries with leave-one-out cross-validation Here we include onlythose 35 countries that were not used to select the hyperparameter We show the first 14 days of casesand deaths in each country and then extrapolate to future days within the period of analysis The plotshows the percentage of observed daily case and death counts that lie within the X credible intervalacross all countries and days

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 MCMC stability results

1000 1002 10040

1000

2000

3000

4000

R

0 1 2 30

500

1000

1500

2000

Relative ESS

Figure C24 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size A value of 1 indicates perfect decorrelation between samplesValues above (below) 1 indicate that the effective number of samples is higher (lower) than the actualnumber of samples due to negative (positive) correlation respectively

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D5 Estimated values for R0 by country The parenthesis give the 95 credible interval which isoften wide The mean R0 across countries is 33

Country Estimated R0 Country Estimated R0

Albania 348 (275431) Lithuania 323 (248403)Andorra 275 (21434) Malaysia 296 (236361)Austria 31 (25377) Malta 327 (248414)Belgium 36 (302427) Mexico 399 (32748)Bosnia and Herzegovina 331 (259406) Morocco 359 (299427)Bulgaria 375 (304457) Netherlands 316 (26375)Croatia 342 (27418) New Zealand 241 (17731)Czech Republic 346 (276425) Norway 273 (215338)Denmark 28 (22347) Poland 397 (32482)Estonia 291 (229361) Portugal 355 (292425)Finland 279 (221343) Romania 401 (335475)France 331 (28389) Serbia 38 (308461)Georgia 356 (281439) Singapore 295 (238356)Germany 285 (231346) Slovakia 353 (271445)Greece 321 (261388) Slovenia 299 (23373)Hungary 403 (332482) South Africa 447 (377527)Iceland 171 (113242) Spain 36 (306423)Ireland 368 (309433) Sweden 229 (176288)Israel 379 (309459) Switzerland 289 (234351)Italy 336 (288391) United Kingdom 32 (267378)Latvia 301 (233376)

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 Collinearity

Appendix D21 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the inclusion or exclusion ofthese countries in the dataset (Figure D25) If we included all countries we would con-clude that university closures were more effective than school closures (black markers inFigure D25) However if we excluded Iceland or Sweden we would conclude that theywere roughly equally effective As there is no strong justification for including or excludingone particular country we cannot meaningfully disentangle the effects of school and uni-versity closures However there is much more data to determine the joint effect and it isindeed much more stable (Figure D25)

-25 0 25 50 75 100Average reduction in Rtin the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D25 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown

Appendix D22 Co-occurrence of NPIs

Table D6 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or neither) Note thatwe do not show the traditional collinearity statistics (variance inflation factors and datacorrelations) since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D6 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D23 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations24 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus042 between closing schools anduniversities and closing some businesses (Figure D26) The weak correlations are oneindicator that collinearity is manageable with our dataset

Effect on NPI combinations To better understand posterior correlations we visualizetheir effect in hosted video files As we condition on different values for one NPI we cansee that the estimates of other NPIs change only slightly always staying well within thecredible intervals in Figure 3 The significance of posterior correlations is small enough thatit is possible to calculate a reasonable approximation to the mean effect of a set of NPIs bysimply combining the mean percentage reductions for each individual NPI (eg two 50reductions lead to a 75 reduction) For example this approximation leads to a joint effectof 75 for all NPIs together which closely matches the exact mean joint effect of 77

Videos are available online here

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D26 Posterior correlations between effectiveness parameters αi

Appendix D3 Posterior Epidemiological Parameter Distributions

Figure D27 shows the posterior distributions for variables describing the key delay distribu-tions in our model the generation interval the delay between infection and case confirma-tion and the delay between infection and death We find that the posterior distributions ofthese parameters are somewhat tighter than their priors but they still explore a wide rangeof values This suggests that the data provides evidence albeit weak about these delaydistributions (for example from visual inspection the data would show that the delay islonger for deaths than for cases) An exception is the dispersion of the infection to deathdelay distribution which has a very wide prior

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

5 6

0

1

dens

ity

Generation Interval Mean

posteriorprior

200 225

00

05

Infection to Death Delay Mean

100 125

00

05

Infection to Case-Confirmation Delay Mean

0 2 4

value

00

05

dens

ity

Generation Interval std

10 20

value

00

01

Infection to Death Delay disp

5 6

value

0

1

Infection to Case-Confirmation Delay disp

Figure D27 Posterior distributions over key epidemiological parameters describing the generation in-terval and the delays between infection and case confirmationdeath

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect morethan three-quarters of the population) We thus miss NPIs which were only implementedregionally For example a few regions in Germany implemented stay-at-home orders butmost did not Thus Germany is listed as no stay-at-home order in our data Additionallyour NPI definitions were not perfectly granular For example a gathering ban on gatheringsof gt15 people and a ban on gatherings of gt60 people would both fall under the NPIGatherings limited to 100 people or less despite likely having different effects on Rt Finally while we included more NPIs than previous work (Table F7) there are many NPIsfor which we were not able to collect enough high-quality data for our modeling such aspublic cleaning or changes to public transportation

Of the 41 countries in our dataset 33 are in Europe As a result the NPI effectivenessestimates may be biased towards effects in Europe and NPI effectiveness may have beendifferent in other parts of the world

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on Rt is constantacross countries and time However the exact implementation and adherence of each NPIis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa limited degree Additionally different countries have different cultural norms and ageprofiles affecting the degree to which a particular intervention is effective For example acountry where a higher proportion of the population is in education will likely experience alarger effect from a government order to close schools and universities Our estimates thusshould be adjusted to local circumstances To address differences between countries ourstructural sensitivity analysis includes a model where each NPI can have a different effectper country (Appendix B2) The average effectiveness estimates across countries in thismodel match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponential de-cline would have the same effect on observations as an unobserved NPI Consequently wecannot quantify its effect on our results (though the sensitivity analyses look reassuring)

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expectthat closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g t and the reproductive number Rt holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso report results from a renewal process model8 that does not depend on this assumptionand we find similar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix F Overview of previous work

Table F7 Data-driven multi-country multi-NPI studies of the effectiveness of observed (as opposed tohypothetical) NPIs in reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Banholzer etal 202023

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Chen andQiu 202021

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Islam et al202020

School closuresworkplace closuresrestrictions on massgatherings publictransport closure

lockdown

149 countries or regionsInterrupted timeseries regression

Liu et al202022

13 NPIs from theOXCGRT dataset

130 countries and regions panel regression

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F8 Some data-driven studies of the effectiveness of observed NPIs in reducing the transmissionof COVID-19 which study a single-country andor a single-NPI

Study NPIs studiedRegionscountries

studiedMethod

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dandekarand

Barbastathis202035

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Dehning etal 202036

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Gatto et al202037

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Hsiang et al202019

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US (each

country modelledindividually)

Linear regression onestimated growth

rates

Jarvis et al202038

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelKraemer etal 202039

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202040 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) model

Lai et al202041

Case detection andisolation travel

restrictions contactreductions

ChinaTravel-network-based

SEIR model

Continued on next page

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F8 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Lorch et al202042

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Maier andBrockmann

202043

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Orea andAacutelvarez202044

Lockdown SpainSpatial econometric

analysis

Quilty et al202045

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

Sears et al202046

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Siedner etal 202047

General socialdistancing

USInterruptedtime-series

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F7) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

61

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

62

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all ldquotrade activitiesrdquo (except within shopping malls) service jobsthat donrsquot involve physical contact museums libraries and galleries opened Sincethe criteria regarding ldquoall retail stores being openrdquo was met we counted April 27 asthe end date for ldquoMost nonessential businesses closedrdquo

63

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H All model equations

This section will mainly be of interest to readers that wish to re-implement the model

Variables are indexed by NPI i country c and day t All prior distributions are independent

Data

1 NPI Activations φi t c isin 012 Observed (Daily) Cases Ct c 3 Observed (Daily) Deaths D t c

Prior Distributions

1 Country-specific R0 R0c sim Normal(325κ) κsim Half Normal(micro= 0σ= 05)2 NPI effectiveness αi sim Asymmetric Laplace(m = 0κ= 05λ= 10) m is the location

parameter κgt 0 is the asymmetry parameter and λgt 0 is the scale parameter3 Infection Initial Counts

N (C )0c = exp(ζ(C )

c )

N (D)0c = exp(ζ(D)

c )

ζ(C )c sim Normal(micro= 0σ= 50)

ζ(D)c sim Normal(micro= 0σ= 50)

4 Observation Noise Dispersion Parameters

Ψcases sim Half Normal(micro= 0σ= 5) (H1)

Ψdeaths sim Half Normal(micro= 0σ= 5) (H2)

Hyperparameters

1 Growth Noise Scale σg = 02

Delay Distributions

1 Generation interval distribution1314

microGI sim Normal(micro= 506σ= 03265)

σGI sim Normal(micro= 211σ= 05)

64

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

2 Time from infection to case confirmation T (C )131516f

microinfrarrconf sim Normal(micro= 1092σ= 094)

Ψinfrarrconf sim Normal(micro= 541σ= 027)

This distribution is converted into a forward-delay vector

T (C )[t ] =

1ZC

Negative Binomial(t micro=microinfrarrconfα=Ψinfrarrconf) t lt 32

0 otherwise

with ZC =31sum

t prime=0Negative Binomial(t primemicro=microinfrarrconfα=Ψinfrarrconf)

ie the delay follows a truncated and normalised negative binomial distribution3 Time from infection to death T (D)f131517

microinfrarrdeath sim Normal(micro= 2182σ= 101)

Ψinfrarrdeath sim Normal(micro= 1426σ= 518)

This distribution is converted into a forward-delay vector

T (D)[t ] =

1ZD

Negative Binomial(t micro=microinfrarrdeathα=Ψinfrarrdeath) t lt 48

0 otherwise

with ZD =47sum

t prime=0Negative Binomial(t primemicro=microinfrarrdeathα=Ψinfrarrdeath)

ie the delay follows a truncated and normalised negative binomial distribution

Infection Model

Rt c = R0c middotexp

(minus

Isumi=1

αi φi t c

) where I is the number of NPIs

αGI =microGI

σ2GI

βGI =micro2

GI

σ2GI

g t c = exp

(βGI(R

1αGIct minus1)

)minus1

fα in the definition of the Negative Binomial distribution is the dispersion parameter Larger values of αcorrespond to a smaller variance and less dispersion With our parameterisation the variance of the Negative

Binomial distribution is micro+ micro2

α

65

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

]

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise

ε(C )τc sim Normal(micro= 0σ=σg )

ε(D)τc sim Normal(micro= 0σ=σg )

Observation Modelf

Ct c =31sumτ=0

N (C )tminusτcT

(C )[τ]

D t c =47sumτ=0

N (D)tminusτcT

(D)[τ]

Ct c sim Negative Binomial(micro= Ct c α=Ψcases)

D t c sim Negative Binomial(micro= D t c α=Ψdeaths)

66

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix I References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Sam Abbott et al ldquoEstimating the time-varying reproduction number of SARS-CoV-2using national and subnational case countsrdquo In Wellcome Open Research 5112 (2020)p 112

3 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

4 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

5 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

6 Sheikh Taslim Ali et al ldquoSerial interval of SARS-CoV-2 was shortened over time bynonpharmaceutical interventionsrdquo In Science 3696507 (2020) pp 1106ndash1109

7 Xingjie Hao et al ldquoReconstruction of the full transmission dynamics of COVID-19 inWuhanrdquo In Nature 5847821 (Aug 2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the robustness of effectiveness estimation of nonpharma-ceutical interventions against COVID-19 transmissionrdquo In arXiv preprint arXiv200713454(2020) URL httpsarxivorgabs200713454

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

12 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

13 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

14 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

67

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

15 Stephen A Lauer et al ldquoThe incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases estimation and applicationrdquo In Annals ofinternal medicine 1729 (2020) pp 577ndash582

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

19 Solomon Hsiang et al ldquoThe effect of large-scale anti-contagion policies on the COVID-19 pandemicrdquo In Nature 5847820 (2020) pp 262ndash267

20 Nazrul Islam et al ldquoPhysical distancing interventions and incidence of coronavirusdisease 2019 natural experiment in 149 countriesrdquo In bmj 370 (2020)

21 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo In Covid Economics 7 (Apr 2020) pp 46ndash67

22 Yang Liu et al ldquoThe impact of non-pharmaceutical interventions on SARS-CoV-2 trans-mission across 130 countries and territoriesrdquo In medRxiv (2020)

23 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

24 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

25 Nick Golding et al ldquoReconstructing the global dynamics of under-ascertained COVID-19 cases and infectionsrdquo In medRxiv (2020) DOI 1011012020070720148460eprint httpswwwmedrxivorgcontentearly202007082020070720148460fullpdf URL httpswwwmedrxivorgcontentearly202007082020070720148460

26 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

27 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

28 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

68

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

29 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

30 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

31 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

32 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

33 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

36 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

37 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

38 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

39 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

40 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

41 Shengjie Lai et al ldquoEffect of non-pharmaceutical interventions to contain COVID-19 inChinardquo en In Nature 5857825 (Sept 2020) pp 410ndash413 DOI 101038s41586-020-2293-x

69

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

42 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

43 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

44 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

45 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

46 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

47 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

70

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Testing reporting and infection fatality rates
      • Method for uncertainty over key epidemiological parameters
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Multivariate Sensitivity Analysis - Expanded Results
              • Sensitivity to additional epidemiological assumptions
              • Sensitivity to data preprocessing
              • Validation by predicting unseen data - all countries
              • Posterior predictive distributions
              • Calibration without countries used for hyperparameter selection
              • MCMC stability results
                • Additional results
                  • Estimated R0 by country
                  • Collinearity
                    • The individual effects of school and university closures
                    • Co-occurrence of NPIs
                    • Correlations between effectiveness estimates
                      • Posterior Epidemiological Parameter Distributions
                        • Additional discussion of assumptions and limitations
                          • Limitations of the data
                          • Model limitations
                            • Overview of previous work
                            • Handling edge cases in the data collection
                            • All model equations
                            • References
Page 9: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan

groups and so forth that are typically required in modelling studies We make several addi-tions to the semi-mechanistic Bayesian hierarchical model of Flaxman et al1 allowing ourmodel to observe both cases and death data This increases the amount of data from whichwe can extract NPI effects reduces distinct biases in case and death reporting and reducesthe bias from including only countries with many deaths Since epidemiological parametersare only known with uncertainty we place priors over them following recent recommendedpractice17 Additionally as we do not aim to infer the total number of COVID-19 infectionswe can avoid assuming a specific infection fatality rate (IFR) or ascertainment rate (rate oftesting)

The growth of the epidemic is determined by the time- and country-specific reproductionnumber Rt c which depends on a) the (unobserved) basic reproduction number R0c givenno active NPIs and b) the active NPIs at time t R0c accounts for all time-invariant factorsthat affect transmission in country c such as differences in demographics population den-sity culture and health systems18 We assume that the effect of each NPI on Rt c is stableacross countries and time The effectiveness of NPI i is represented by a parameter αi over which we place an Asymmetric Laplace prior that allows for both positive and negativeeffects but places 80 of its mass on positive effects reflecting that NPIs are more likely toreduce Rt c than to increase it Following Flaxman et al and others168 each NPIrsquos effecton Rt c is assumed to independently affect Rt c as a multiplicative factor

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (1)

where φi ct = 1 indicates that NPI i is active in country c on day t (φi ct = 0 otherwise) andI is the number of NPIs The multiplicative effect encodes the plausible assumption thatNPIs have a smaller absolute effect when Rt c is already low We discuss the meaning ofeffectiveness estimates given NPI interactions in the Results section

In the early phase of an epidemic the number of new daily infections grows exponentiallyDuring exponential growth there is a one-to-one correspondence between the daily growthrate and Rt c

19 The correspondence depends on the generation interval (the time betweensuccessive infections in a chain of transmission) which we assume to have a Gamma dis-tribution The prior on the mean generation interval has mean 506 days derived from ameta-analysis20

We model the daily new infection count separately for confirmed cases and deaths repre-senting those infections which are subsequently reported and those which are subsequentlyfatal However both infection numbers are assumed to grow at the same daily rate in ex-pectation allowing the use of both data sources to estimate each αi The infection numberstranslate into reported confirmed cases and deaths after a stochastic delay The delay is thesum of two independent distributions assumed to be equal across countries the incuba-tion period and the delay from onset of symptoms to confirmation We put priors over themeans of both distributions resulting in a prior over the mean infection-to-confirmationdelay with a mean of 1092 days20 see Appendix A3 Similarly the infection-to-death

9

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

delay is the sum of the incubation period and the delay from onset of symptoms to deathand the prior over its mean has a mean of 218 days20 Finally as in related NPI models16both the reported cases and deaths follow a negative binomial noise distribution with aninferred noise dispersion

Using a Markov chain Monte Carlo (MCMC) sampling algorithm21 this model infers pos-terior distributions of each NPIrsquos effectiveness while accounting for cross-country variationsin testing reporting and fatality rates as well as uncertainty in the generation interval anddelay distributions To analyse the extent to which modelling assumptions affect the re-sults our sensitivity analysis includes all epidemiological parameters prior distributionsand many of the structural assumptions introduced above (Appendix B2 and AppendixC) MCMC convergence statistics are given in Appendix C7

Results

NPI Effectiveness

Our model enables us to estimate the individual effectiveness of each NPI expressed as apercentage reduction in R As in related work168 this percentage reduction is modelledas constant over countries and time and independent of the other implemented NPIs Inpractice however NPI effectiveness may depend on other implemented NPIs and localcircumstances Thus our effectiveness estimates ought to be interpreted as the effectivenessaveraged over the contexts in which the NPI was implemented in our data10 Our results thusgive the average NPI effectiveness across typical situations that the NPIs were implementedin Figure 3 (bottom left) visualises which NPIs typically co-occurred aiding interpretation

Under the default model settings the mean percentage reduction in R (with 95 credibleinterval) associated with each NPI is as follows (Figure 3) mandating mask-wearing in(some) public spaces -1 (-13ndash8) limiting gatherings to 1000 people or less 13(-3ndash31) to 100 people or less 28 (9ndash44) to 10 people or less 36 (17ndash53) closing some high-risk businesses 20 (0ndash40) closing most nonessential busi-nesses 29 (8ndash47) closing schools and universities 41 (23ndash56) and issuingstay-at-home orders (with exemptions) 10 (-2ndash22) Note that we cannot robustlydisentangle the individual effects of closing schools and closing universities since the im-plementation dates of these NPIs coincided nearly perfectly in all countries except Icelandand Sweden (Appendix D21) We thus show the joint effect of closing both schools anduniversities and treat schools and universities closed as one single NPI going forward

Some NPIs frequently co-occurred ie were partly collinear However we are able to iso-late the effects of individual NPIs since the collinearity is imperfect and our dataset is largeFor every pair of NPIs we observe one of them without the other for 748 country-days onaverage (Appendix D22) The minimum number of country-days for any NPI pair is 143(for limiting gatherings to 1000 or 100 attendees) Additionally under excessive collinear-ity and insufficient data to overcome it individual effectiveness estimates are highly sensi-

10

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75 100Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spaces

Gatherings limited to1000 people or less

Gatherings limited to100 people or less

Gatherings limited to10 people or less

Some businessesclosed

Most nonessentialbusinesses closed

Schools and universitiesclosed

Stay-at-home order(with exemptions)

EndsTransmission

北 便

i

j

100

100

100 81

10

0 64

100

100 45

27

100 94

78

89

65

88

93

44

29

100

100 82

92

68

91

96

46

28

100

100

100 99

76

97

98

55

30

99

97

86

100 72

95

98

49

27

99

98

91

100

100 99

99

64

30

97

95

84

95

72

100 99

49

29

98

95

81

92

68

93

100 46

28

100

100 98

10

0 94

99

99

100

Frequency i Active Given j Active

0 500 1000 1500 2000 2500 3000

Days

Total Days Active

Figure 3 Top NPI effects under default model settings The Figure shows the average percentagereductions in R as observed in our data (or in terms of the model the posterior marginal distributionsof 1minusexp(minusαi )) with median 50 and 95 credible intervals A negative 1 reduction refers to a 1increase in R Cumulative effects are shown for hierarchical NPIs (gathering bans and business closures)ie the result for Most nonessential businesses closed shows the cumulative effect of two NPIs withseparate parameters and symbols - closing some (high-risk) businesses and additionally closing mostremaining (non-high-risk but nonessential) businesses given that some businesses are already closedBottom Left Conditional activation matrix Cell values indicate the frequency that NPI i (x-axis) wasactive given that NPI j (y-axis) was active Eg schools were always closed whenever a stay-at-homeorder was active (bottom row third column from the right) but not vice versa Bottom Right Totalnumber of days each NPI was active across all countries

11

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Most businessesclosed

Stay-at-homeorder

Mask wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businessesclosed

Schools anduniversities closed

Figure 4 Combined NPI effectiveness for the most common sets of NPIs in our data by size of the NPI setShaded regions denote 50 and 95 credible intervals Left Maximum R0 that could be reduced to be-low 1 for each set of NPIs Right Predicted R after implementation of each set of NPIs assuming R0 = 38Readers can interactively explore the effects of all sets of NPIs at httpepidemicforecastingorgcalc

tive to variations in the data and model parameters22 High sensitivity prevented Flaxmanet al1 who had a smaller dataset from disentangling NPI effects9 Our estimates are sub-stantially less sensitive (see next section) Finally the posterior correlations between theeffectiveness estimates are weak suggesting manageable collinearity (Appendix D23)

Although the correlations between the individual estimates are weak we should take theminto account when evaluating combined NPI effects For example if two NPIs frequentlyco-occurred there may be more certainty about the combined effect than about the twoindividual effects Figure 4 shows the combined effectiveness of the sets of NPIs thatare most common in our data Together our set of NPIs reduced R by 77 (74ndash79)on average Across countries the mean R without any NPIs (ie the R0) was 33 (Ta-ble D5 reports R0 for all countries) Starting from this number the estimated R likely couldhave been reduced below 1 by closing schools and universities high-risk businesses andlimiting gathering sizes Readers can interactively explore the effects of sets of NPIs athttpepidemicforecastingorgcalc A CSV file containing the joint effectiveness of all NPIcombinations is available online here

Sensitivity and validation

We perform a range of validation and sensitivity experiments (Appendix B with furtherexperiments in Appendix C) First we analyse how the model extrapolates to unseen coun-tries and find that it makes calibrated forecasts over periods of up to 2 months with un-certainty increasing over time Further we perform multiple sensitivity analyses studyinghow results change if we modify the priors over epidemiological parameters exclude coun-tries from the dataset use only deaths or confirmed cases as observations vary the datapreprocessing and more Finally we investigate our key assumptions by showing results

12

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

for several alternative models (structural sensitivity10) and examine possible confoundingof our estimates by unobserved factors influencing R In total we consider 201 alternativeexperimental conditions

Figure 5 (left) shows the median NPI effectiveness across these 201 experimental condi-tions Compared to the results under our default settings (Figures 3 and 4) median NPIeffects vary under alternative plausible experimental conditions However the trends in theresults are robust and some NPIs outperform others under all tested conditions While wetest over large ranges of plausible values our experiments do not include every possiblesource of uncertainty and the results might change more substantially under experimentalconditions we have not tested

We categorise NPI effects into small moderate and large which we define as a medianreduction in R of less than 175 between 175 and 35 and more than 35 (verticallines in Figure 5) Six of the NPIs fall into a single category in a large fraction of experi-mental conditions school and university closures are associated with a large effect in 99of experimental conditions limiting gatherings to 10 people or less in 94 Closing mostnonessential businesses has a moderate effect in 96 of conditions limiting gatherings to100 people or less in 97 Making mask-wearing mandatory in (some) public spaces fallsinto the ldquosmall effectrdquo category in 100 of experimental conditions issuing stay-at-homeorders (with exemptions) in 99 Two NPIs fall less clearly into one category Closing some(high-risk) businesses has a moderate effect in 86 of conditions and limiting gatherings to1000 people or less has a small effect in 79 However both NPIs have small-to-moderateeffects in more than 99 of experimental conditions The effect of limiting gatherings to1000 people or less is the least stable across the sensitivity analyses which may reflect itsaforementioned partial collinearity with limiting gatherings to 100 people or less

Aggregating all sensitivity analyses can hide sensitivity to specific assumptions We displaythe median NPI effects in four categories of sensitivity analyses (Figure 5 right) and eachindividual sensitivity analysis is shown in the Appendix The trends in the results are alsostable within the categories

Discussion

We use a data-driven approach to estimate the effects that eight nonpharmaceutical in-terventions had on COVID-19 transmission in 41 countries between January and the endof May 2020 We find that several NPIs were associated with a clear reduction in R inline with the mounting evidence that NPIs can be effective at mitigating and suppressingoutbreaks of COVID-19 Furthermore our results suggest that some NPIs outperformedothers While the exact effectiveness estimates vary with modelling assumptions the broadconclusions discussed below are largely robust across 201 experimental conditions in 11sensitivity analyses

13

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

0 25 50Median reduction in Rt

Mask-wearing mandatory in(some) public spacesGatherings limited to1000 people or lessGatherings limited to100 people or lessGatherings limited to10 people or lessSome businessesclosedMost nonessentialbusinesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

All Sensitivity Analyses (201 conditions)北便

Model Structure(5 conditions)

北便

Data and Preprocessing(51 conditions)

0 25 50Median reduction in Rt

北便

Epidemiological Priors(131 conditions)

0 25 50Median reduction in Rt

北便

Unobserved Factors(14 conditions)

Figure 5 Median NPI effects across the sensitivity analyses Left Median NPI effects (reduction in R)when varying different components of the model or the data in 201 experimental conditions Results aredisplayed as violin plots using kernel density estimation to create the distributions Inside the violinsthe box plots show median and interquartile-range The vertical lines mark 0 175 and 35 (seetext) Right Categorised sensitivity analyses Structural Using only cases or only deaths as observations(2 experimental conditions Figure B12) varying the model structure (3 conditions Figure B13 left)Data Leaving out one country at a time (41 conditions Figure B10) varying the threshold below whichcases and deaths are masked (8 conditions Figure C17) sensitivity to correcting for undocumentedcases and to country-level differences in case ascertainment (2 conditions Figure B11) Epidemiologicalpriors Jointly varying the means of the priors over the means of the generation interval the infection-to-case-confirmation delay and the infection-to-death delay (125 conditions Figure C15) varying theprior over R0 (4 conditions Figure C16 left) varying the prior over NPI effectiveness (2 conditionsFigure C16 right) Unobserved factors Excluding observed NPIs one at a time (9 conditions Figure B14left) controlling for additional NPIs from a different dataset (5 conditions Figure B14 right)

14

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Business closures and gathering bans both seem to have been effective at reducing COVID-19 transmission Closing only high-risk businesses appears to have been only somewhat lesseffective than closing most nonessential businesses the median reduction in R differs onlyby 9 (6ndash12 mean and 95 interval of median estimates across the experiments set-tings in Figure 5) Closing only high-risk businesses may thus have been the more promisingpolicy option in some circumstances Limiting gatherings to 10 people or less was more ef-fective than limits up to 100 or 1000 people This may reflect the fact that small gatheringsare common

As previously discussed we estimate the average effect each NPI had in the contexts inwhich it was implemented When countries introduced stay-at-home orders they nearly al-ways also banned gatherings and closed schools universities and some or most businessesif they had not done so already (Figure 3 bottom left) Flaxman et al1 and Hsiang et al3

add the effect of these distinct NPIs to the effectiveness of stay-at-home orders and accord-ingly find a large effect In contrast we account for these other NPIs separately and isolatethe additional effect of ordering the population to stay at home (when large gatheringsare banned and educational institutions and some businesses closed)d In accordance withother studies that took this approach we find a small effect26 A typical country could havereduced R to below 1 without a stay-at-home order (Figure 4) provided other NPIs wereimplemented

Mandating mask-wearing in various public spaces had no clear effect on average in thecountries we studied This does not rule out mask-wearing mandates having a larger ef-fect in other contexts In our data mask-wearing was only mandated when other NPIshad already reduced public interactions When most transmission occurs in private spaceswearing masks in public is expected to be less effective This might explain why a largereffect was found in studies that included China and South Korea where mask-wearingwas introduced earlier823 While there is an emerging body of literature indicating thatmask-wearing can be effective in reducing transmission the bulk of evidence comes fromhealthcare settings24 In non-healthcare settings risk compensation25 may play a largerrole potentially reducing effectiveness While our results cast doubt on reports that mask-wearing is the main determinant shaping a countryrsquos epidemic23 the policy still seemspromising given all available evidence due to its comparatively low economic and socialcosts Its effectiveness may have increased as other NPIs have been lifted and public inter-actions have recommenced

We find a large effect for school and university closures This finding is remarkably robustacross different model structures variations in the data and epidemiological assumptions(Figure 5) It remains robust when controlling for NPIs excluded from our study (Fig-ure B14) Our approach cannot distinguish direct and indirect effects such as forcing par-

dIn principle a stay-at-home order does not imply these other NPIs It is possible for a country to simultaneouslyhave allowed schools to be open and have a stay-at-home order (with exemptions for attending school) inplace However this is not how stay-at-home orders were implemented by the countries in our dataset andwe cannot estimate the effect of such a hypothetical stay-at-home order from the available data

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ents to stay at home or causing broader behaviour changes by increasing public concernAdditionally since school and university closures almost perfectly coincided in the coun-tries we study our approach cannot distinguish their individual effects (Appendix D21)This limitation likely also holds for other observational studies which do not include dataon university closures and estimate only the effect of school closures1ndash35ndash8 Previous evi-dence on school and university closures is mixed1626 Early data suggest that children andyoung adults are equally susceptible to infection but have a notably lower observed inci-dence rate than older adultsmdashwhether this is due to school and university closures remainsunknown27ndash30 Although infected young people are often asymptomatic they appear toshed similar amounts of virus as older people3132 and might therefore transmit the infec-tion to higher-risk demographics unknowingly As the role of children in transmission isstill unclear33 while outbreaks detected in schools are rising33ndash35 this topic merits carefulattention

Our study has several limitations First NPI effectiveness may depend on the context ofimplementation such as the presence of other NPIs and country-specific factors Our esti-mates must be interpreted as the average effectiveness over the contexts in our dataset10and expert judgement is required to adjust them to local circumstances Second R mayhave been reduced by unobserved NPIs or spontaneous behaviour changes To investigatewhether these reductions could be falsely attributed to the observed NPIs we perform sev-eral additional analyses and find that our results are stable to a range of unobserved effects(Appendix B3) However this sensitivity check cannot provide certainty Investigating therole of unobserved effects is an important topic to explore further Third our results can-not be used without qualification to predict the effect of lifting NPIs For example closingschools and universities seems to have greatly reduced transmission but this does not meanthat reopening them will cause infections to soar Educational institutions can implementsafety measures such as reduced class sizes as they reopen Further work is needed to anal-yse the effects of reopenings we hope that our collected data aids this effort Fourth whilewe included more NPIs than most previous work (Table F7) several promising NPIs wereexcluded For example testing tracing and case isolation may be an important part of acost-effective epidemic response36 but were not included because it is difficult to obtaincomprehensive data We discuss further limitations in Appendix E

Although our work focuses on estimating the impact of NPIs on the reproduction numberR the ultimate goal of governments may be to reduce the incidence prevalence and excessmortality of COVID-19 Controlling R is essential but the contribution of NPIs towards thesegoals may also be mediated by other factors such as their duration and timing37 periodicityand adherence3839 and successful containment40 While each of these factors addressestransmission within individual countries it can be crucial to additionally synchronise NPIsbetween countries since cases can be imported41

In conclusion as governments around the world seek to keep R below 1 while minimisingthe social and economic costs of their interventions we hope that our results can informpolicy decisions on which NPIs to implement in any potential further wave of infectionsAdditionally our work may provide insight on which areas of public life are most in need

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

of restructuring so that they can continue despite the pandemic However our estimatesshould not be seen as the final word on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies simulation studiesexperimental trials and clinical experience

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

The paid contractor work helping with the data collection the development of the interac-tive website and the costs for cloud compute were funded by the Berkeley Existential RiskInitiative

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier V Mikulik MAHartwick L Chindelevitch designed the model and modelling experiments M Sharma ABStephenson T Gavenciak J Salvatier performed and analysed the modelling experimentsJ Kulveit T Gavenciak JM Brauner conceived the research S Mindermann M SharmaJM Brauner L Chindelevitch J Kulveit T Besiroglu did the literature search JM BraunerS Mindermann M Sharma G Leech L Chindelevitch T Besiroglu V Mikulik wrote themanuscript All authors read and gave feedback on the manuscript and approved the finalmanuscript JM Brauner S Mindermann and M Sharma contributed equally L Chindele-vitch Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

3 Solomon Hsiang et al ldquoThe effect of large-scale anti-contagion policies on the COVID-19 pandemicrdquo In Nature 5847820 (2020) pp 262ndash267

4 Shengjie Lai et al ldquoEffect of non-pharmaceutical interventions to contain COVID-19 inChinardquo en In Nature 5857825 (Sept 2020) pp 410ndash413 DOI 101038s41586-020-2293-x

5 Yang Liu et al ldquoThe impact of non-pharmaceutical interventions on SARS-CoV-2 trans-mission across 130 countries and territoriesrdquo In medRxiv (2020)

6 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

7 Nazrul Islam et al ldquoPhysical distancing interventions and incidence of coronavirusdisease 2019 natural experiment in 149 countriesrdquo In bmj 370 (2020)

8 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo In Covid Economics 7 (Apr 2020) pp 46ndash67

9 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

10 Mrinank Sharma et al ldquoOn the robustness of effectiveness estimation of nonpharma-ceutical interventions against COVID-19 transmissionrdquo In arXiv preprint arXiv200713454(2020) URL httpsarxivorgabs200713454

11 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

12 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

13 Ensheng Dong Hongru Du and Lauren Gardner ldquoAn interactive web-based dashboardto track COVID-19 in real timerdquo In The Lancet Infectious Diseases 205 (May 2020)pp 533ndash534 DOI 101016s1473-3099(20)30120-1

14 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

15 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

16 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

17 Sam Abbott et al ldquoEstimating the time-varying reproduction number of SARS-CoV-2using national and subnational case countsrdquo In Wellcome Open Research 5112 (2020)p 112

18 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

19 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

20 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Young Joon Park et al ldquoContact tracing during coronavirus disease outbreak SouthKorea 2020rdquo In Emerg Infect Dis 2610 (2020)

28 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

29 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

30 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

31 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

32 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

33 Chen Stein-Zamir et al ldquoA large COVID-19 outbreak in a high school 10 days afterschoolsrsquo reopening Israel May 2020rdquo In Eurosurveillance 2529 (2020) p 2001352

34 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 38 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile919676Weekly_COVID19_Surveillance_Report_week_38_FINAL_UPDATEDpdf 2020

35 Meira Levinson Muge Cevik and Marc Lipsitch Reopening Primary Schools during thePandemic 2020

36 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

37 K Prem et al ldquoThe effect of control strategies to reduce social mixing on outcomes ofthe COVID-19 epidemic in Wuhan China a modelling studyrdquo In Lancet Public Health5 (5 May 2020) e261ndashe270

38 Patrick G T Walker et al ldquoThe impact of COVID-19 and strategies for mitigationand suppression in low- and middle-income countriesrdquo In Science 3696502 (2020)pp 413ndash422

39 Nicholas G Davies et al ldquoEffects of non-pharmaceutical interventions on COVID-19cases deaths and demand for hospital services in the UK a modelling studyrdquo In TheLancet Public Health 57 (2020) e375ndashe385

40 Xingjie Hao et al ldquoReconstruction of the full transmission dynamics of COVID-19 inWuhanrdquo In Nature 5847821 (Aug 2020)

41 N W Ruktanonchai et al ldquoAssessing the impact of coordinated COVID-19 exit strate-gies across Europerdquo In Science (2020) DOI 101126scienceabc5096

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 23

Appendix A1 Detailed model description 23

Appendix A2 Testing reporting and infection fatality rates 26

Appendix A3 Method for uncertainty over key epidemiological parameters 27

Appendix B Validation 30

Appendix B1 Unseen data 30

Appendix B2 Sensitivity Analysis 30

Appendix B3 Robustness to unobserved effects 35

Appendix C Additional sensitivity analyses and validation 38

Appendix C1 Multivariate Sensitivity Analysis - Expanded Results 38

Appendix C2 Sensitivity to additional epidemiological assumptions 40

Appendix C3 Sensitivity to data preprocessing 40

Appendix C4 Validation by predicting unseen data - all countries 42

Appendix C5 Posterior predictive distributions 47

Appendix C6 Calibration without countries used for hyperparameter selection 48

Appendix C7 MCMC stability results 49

Appendix D Additional results 50

Appendix D1 Estimated R0 by country 50

Appendix D2 Collinearity 51

Appendix D3 Posterior Epidemiological Parameter Distributions 53

Appendix E Additional discussion of assumptions and limitations 55

Appendix E1 Limitations of the data 55

Appendix E2 Model limitations 55

Appendix F Overview of previous work 57

Appendix G Handling edge cases in the data collection 60

Appendix H All model equations 64

Appendix I References 67

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections N (sdot)tc

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Mask wearing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closedSchools closed

Stay-at-home order

Growth reductions from interventions αi i

Daily growth rate gtc

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Uncertain delay from infection to

confirmation death

Uncertain generation interval

Universities closed

Figure A6 Model Overview Purple nodes are observed We describe the diagram from bottom to topThe effectiveness of NPI i is represented by αi which is independent of the country On each day t a countryrsquos daily reproduction number Rt c depends on the countryrsquos basic reproduction number R0cand the active NPIs The active NPIs are encoded by Φi t c which is 1 if NPI i is active in country c attime t and 0 otherwise Rt c is transformed into the daily growth rate gt c using the generation intervalparameters and subsequently is used to compute the new infections N (C )

t c and N (D)t c that will turn into

confirmed cases and deaths respectively Finally the number of new confirmed cases Ct c and deathsDt c is computed by a discrete convolution of N (middot)

t c with the respective delay distributions Our modeluses both death and case data it splits all nodes above the daily growth rate gt c into separate branchesfor deaths and confirmed cases We account for uncertainty in the generation interval infection-to-case-confirmation delay and the infection-to-death delay by placing priors over the parameters of thesedistributions

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1but with several key extensions First we model both confirmed cases and deaths al-lowing us to leverage significantly more data Furthermore we do not assume a specificinfection fatality rate (IFR) since we do not aim to infer the total number of COVID-19 in-fections Additionally since epidemiological parameters are only known with uncertaintywe place priors over them following recent best practice2 Concretely we place priors onthe means and standard deviationsdispersions of the generation interval the infection-to-confirmation delay and the infection-to-death delay These prior choices are detailedin Appendix A3 The end of this section details further adaptations which allow us to

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

minimise assumptions about testing reporting and the IFR Code is available online hereFor readers who wish to re-implement the model a concise list of all equations is given inAppendix H

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior distribution over R0c reflecting the wide disagreement of regionalestimates of R0

3 The mean R0c is 328 based on a meta-analysis4 We parameterize theeffectiveness of NPI i assumed to be same across countries and time with αi Each NPI isassumed to have an independent multiplicative effect on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We place an Asymmetric Laplace prior over αi with scale parameter10 asymmetry parameter 05 and location parameter 0 Our prior allows for (unbounded)positive and negative effects as we cannot a priori exclude the possibility that the introduc-tion of an NPI increases R However our prior places 80 of its mass on positive effectsreflecting a belief that NPIs are much more likely to reduce Rt than not This is a shrinkageprior placing 50 of its mass on lsquosmallrsquo effectiveness (less than 10 change in Rt ) All priordistributions are independent

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c and Rt c

(Eq 29 in5)

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We account for uncertainty inthe generation interval (GI) assumed to be a Gamma distribution by placing prior distri-butions over its mean and standard deviation (Appendix A3) Using (A2) we can writeg t c as a function g t c (Rt c microGIσGI) (see Appendix H) where microGI is the GI mean and σGI isthe GI standard deviation

aMany epidemiological models define growth rates as the exponent r in an exponential growth function Herewe use daily growth rates instead for ease of exposition These choices are mathematically equivalent Notethat we adapted equation (29) in Wallinga amp Lipsitch5 to account for our choice

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Note that the GI can change over time due to NPIs In particular the effective contact tracingand case isolation in China substantially shortened the mean GI to only 26 days among theidentified cases6 Since most cases were likely not identified even in Wuhan China7 andour study omits most of the countries with highly effective contact tracing programs suchas China or South Korea we model the GI as constant (but uncertain) over time A possibleextension for further work is to model the change in GI explicitly

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are inferred from the observation models for cases and deathsWe assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)] (A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)] (A4)

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross-validation as no reference is available for it We evaluate five dif-ferent values (σN isin 00501020304) with a fixed randomly chosen validation set of6 countries σN = 02 maximises the predictive log-likelihood on the validation set Thisensures a more calibrated model which is less likely to produce overconfident or unstableestimates10 We did not tune any other aspect of the model

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation In realitythis delay distribution is the sum of two independent distributions the incubation period(lognormal distribution) and the delay from onset of symptoms to confirmation (negativebinomial distribution) These distributions are uncertain but modelling (and placing pri-ors over) them individually leads to unidentifiability Therefore we model the total delay

bSince we treat new infections as a continuous number its initial value can be between 0 and 1

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

between infection and confirmation assuming that this follows a negative binomial dis-tribution We convert priors over the individual delays to a prior over the total delay bybootstrapping mdash please see Appendix A3

As in Flaxman et al1 the observed cases Ct c follow a negative binomial noise distribu-tion with mean C t c and an inferred dispersion parameter Ψ(C ) This distribution encodesthat small case numbers are more noisy and should therefore receive less weight Hav-ing separate dispersion parameters for cases and deaths ensures that they can be weighteddifferently if there is a difference in their noise distributions

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ) (A6)

where PD (delay) is the distribution of the delay from infection to death As for cases wemodel the total delay between infection and death as negative binomial which is in realitythe sum of two independent distributions the incubation period and the delay from onsetof symptoms to death (gamma distribution)

Finally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and an inferred dispersion parameter Ψ(D)

The model was implemented in PyMC311 We infer the unobserved variables in our modelusing the No-U-Turn Sampler (NUTS)12 a standard Markov chain Monte Carlo samplingalgorithm

Appendix A2 Testing reporting and infection fatality rates

Scaling all values of a time series by a constant does not change its growth rates Themodel is therefore invariant to the scale of the observations and consequently to country-level differences in the IFR and the ascertainment rate (the proportion of infected peoplewho are subsequently reported positive) For example assume countries A and B differ onlyin their ascertainment rates Then our model will infer a difference in N (C )

0c (Eq (A3)) butnot in the growth rates g t c across A and B Accordingly the inferred NPI effectiveness willbe identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas the

cThis is only approximately true The negative binomial output distribution has a coefficient of variation dimin-ishing with its mean ie smaller observations are relatively more noisy and carry less weight Furthermorewhilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligible effect

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

introduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimics changes in the ascertainment ratemdashnoise at time

τ affects all future casesmdashand allows for gradual multiplicative changes in ascertainment

Appendix A3 Method for uncertainty over key epidemiological parameters

We account for uncertainty in the following delay distributions the generation interval thedelay from infection to symptom onset (incubation period) the delay from symptom onsetto case confirmation and the delay from symptom onset to death

Each distribution has an uncertain mean (Table A3) whose prior distribution we takefrom a meta-analysis13 published shortly after our window of analysis This helps accountfor possible heterogeneity between different populations and therefore often finds higheruncertainty in the means than estimated in primary studies while using more data Themeta-analysis reports mean values with 95 confidence intervals We set normal priors overthe mean delays with mean equal to the reported mean in the meta-analysis and standarddeviation set such that the prior 95 credible interval includes the 95 confidence intervalsfrom the meta-analysis For example the meta-analysis reports an expected mean delayfrom symptoms to death of 1671 days with a 95 credible interval ranging from 1537 to1817 We thus assume that the prior is normal with micro= 1671 and σ= 075 which ensuresthat its 95 credible interval ranges from 1525 to 1817 strictly including the intervalreported in the meta analysis to be conservative Priors are truncated at zero

Furthermore we account for the uncertainty in the standard deviation of these delay dis-tributions by placing normal priors based on primary studies following the same methodas above (Table A4) For the standard deviation of the generation interval we use patientdata from Feretti et al14 and reproduce their method fitting a Gamma distribution to thegeneration time

Finally the distributional forms of the delay distributions are taken from the above primarystudies

The delay from infection to case confirmation is the sum of the incubation period and thesymptom-onset-to-case-confirmation delay Similarly the delay from infection to death isthe sum of the incubation period and the symptom-onset-to-death delay However forcomputational tractability we model the total delay distributions converting the priors de-scribed above to priors over the total delays We assume that the total delays follow negativebinomial distributions which are computationally tractable and empirically provide a closefit to both sum distributions We place normal priors over the mean and dispersion parame-ters of these total delay distributions by bootstrapping For example we compute priors forthe total infection-to-case-confirmation delay distribution by sampling Nbootstrap = 250 in-

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

cubation periods and symptom-onset-to-case-confirmation distributions For each sampledpair of distributions we draw 106 samples from their sum and fit a negative binomial usingmoment matching We compute the mean and standard deviation of the negative binomialdistribution parameters across the bootstrap and use this for our prior The resulting priorsare given in Table A2

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table A2 Epidemiological parameter distributions used in the model The details on sources and priorsused to generate this Table are given in Tables A3 and A4

Delay type Sources Prior on mean Prior on standard Distributionaldeviation or formdispersion

Infection to case Fonfria et al13 N (1092σ= 094) Dispersion Negativeconfirmation Lauer et al15 N (541σ= 027) binomial

Cereda et al16

Infection to death Fonfria et al13 N (2182σ= 101) Dispersion NegativeLauer et al15 N (1426σ= 518) binomialLinton et al17

Generation interval Fonfria et al13 N (506σ= 033) SD N (211σ= 05) GammaFeretti et al14

Table A3 Mean of epidemiological parameters all from meta-analysis13 and distributional forms

Delay type Reported mean Prior on mean Distributionaland 95 CI form of delay

Incubation period 579 (548ndash611) logmicrosimN (153σ= 0051) Lognormal15

Onset to death 1671 (1537ndash1817) N (1671σ= 075) Gamma17

Onset to case confirmation 582 (474ndash715) N (582σ= 068) Negative binomial16

Generation interval 506 (450ndash570) N (506σ= 033) Gamma14

Table A4 Standard deviation or dispersion of epidemiological parameters

Delay type Source Reported standard deviation or Prior on standarddispersion with uncertainty deviation or

dispersionIncubation period Lauer et al15 SD of log delay SD of log delay

048 (95 CI 027ndash054) N (048σ= 00759)Onset to death Linton et al17 SD 69 (95 CI 52ndash91) N (69σ= 1122)Onset to case confirmation Cereda et al16 Dispersion 157 plusmn 0054 N (157σ= 0054)Generation interval Feretti et al14 SD 211 plusmn 05 (reproduced by us) N (211σ= 05)

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking its predictions on unseendata even if prediction is not the purpose of the model1018 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries we have strong reason to doubt itseffectiveness estimates However we do not expect NPI effectiveness models to extrapolateperfectly Almost always unobserved factors such as changes in the ascertainment rate orIFR spontaneous behaviour changes or unobserved NPIs will affect the observed numberof cases and deaths Our models ought to treat these factors as noise and not attribute theireffects on Rt to the observed NPIs

We use Leave-One-Out Cross-Validation (LOO-CV) fitting the model on 40 countries andextrapolating to the excluded country We repeat this process for all 41 countries In theexcluded country the first 14 days of case and death data are observed to estimate R0c aswell as the initial outbreak sizes N (C )

0c and N (D)0c All later days are unobserved (masked)

The model uses the effectiveness estimates inferred from the 40 included countries and NPIactivation dates to predict the course of the pandemic in the excluded country

Figure B7 visually shows extrapolations obtained from LOO-CV for a randomly chosensubset of six countries (all countries are shown in Figures C18 to C21) The predictiveaccuracy of the model as expected degrades with an increasing prediction horizon sug-gesting the presence of unobserved factors However the predictive credible intervals arewide and increase over time as noise in the growth rate accumulates Importantly ourmodel makes calibrated forecasts in excluded countries (Fig B8) over periods of up to 2months

These are challenging predictions to the best of our knowledge no related study analyseshow their estimated NPI effects generalise to unseen countries Most related studies donot validate predictions on unseen data at all19ndash23 reviewed in9 Flaxman et al1 hold outthe last 14 days from 20 April to 4th of May for all countries in aggregate However thecountries they study implemented all NPIs in March or earlier (Supplementary Table 2 in1)meaning that in fact all countries were used to estimate NPI effects

Note that Fig B8 includes the 6 countries that were used to select the hyperparameter(Appendix A) However the calibration is similar if these 6 countries are excluded (Fig-ure C23)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals the extent to which results depend on uncertain parametersand modelling choices and can diagnose model misspecification and excessive collinearity

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

100

101

102

103

104

105

106

便便

Slovenia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

便 便

Greece

Figure B7 Model extrapolations on a randomly chosen subset of 6 excluded countries produced usingleave-one-out cross-validation In each excluded country the first 14 days of death and case data (soliddots) are observed the model then extrapolates to all future days within the period of analysis (emptydots) For each country we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier see Methods) The shaded areas representthe 95 credible intervals

in the data24 We vary many of the components of our model and recompute the NPIeffectiveness estimates summarised here Further analysis can be found in Appendix C

For each sensitivity analysis we quantify sensitivity using an easily interpretable measureshown above the legend of each plot the average standard deviation denoted σ Firstfor each NPI we compute the standard deviation of its median effect estimates across allexperimental conditions in the given sensitivity analysis We then average these acrossthe NPIs Since effectiveness is expressed in percentage reductions of Rt the unit of σ ispercent Zero percent corresponds to no sensitivity

We perform a multivariate sensitivity analysis on the key epidemiological priors For com-putational reasons all other sensitivity analyses are univariate

Multivariate sensitivity to main epidemiological priors The key epidemiological param-eters in our model describe the delay from infection to case-confirmation the delay from

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

0 20 40 60 80 100Credible Interval

0

20

40

60

80

100

Em

piric

al C

umul

ativ

e Fr

eque

ncy

Calibration Country Holdouts

Figure B8 Calibration in held-out countries with leave-one-out cross validation In each excludedcountry the first 14 days of death and case data are observed to estimate R0c as well as the initialoutbreak sizes the model then extrapolates to all future days within the period of analysis The plotshows the percentage of observed daily case and death counts that lie within the X credible intervalacross all countries and days The identity line represents ideal calibration

infection to death and the generation interval Our model places prior distributions overthe means of these delay distributions Since the priors already account for uncertainty inthe delays this analysis considers what would happen if our prior beliefs changed Sincethe epidemiological parameters may interact in unexpected ways we perform a multivari-ate sensitivity analysis jointly changing the mean of the prior over the mean of each delaydistribution (referred to as the prior mean of the mean below) We vary the prior means ofthe mean delays across a wide but epidemiologically plausible range of 5 values resultingin 125 different experimental conditions We present these results in two ways

First Figure B9 shows the global sensitivity of our results to the prior mean of the mean ofeach distribution individually For example for each setting of the prior mean of the meangeneration interval we show the average over the 5times5 = 25 runs with different prior meansplaced over the mean delays between infection and case confirmationdeath This is equiv-alent to placing a uniform prior across the 125 experiment conditions and marginalisingand is standard methodology for computing global sensitivity to an individual parameter

The prior means seem to mostly influence the relative effectiveness of business closuresand gathering bans Increasing the prior means of the infection-to-case-confirmationdeathmean delays results in larger effectiveness estimates for gatherings bans and smaller esti-

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

mates for business closures while increasing the prior mean of the generation interval hasthe opposite effect

Second we assess the sensitivity of our results to jointly varying the prior means This yieldsσ= 246 We present this result graphically in Appendix C1

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Infection to Case-Confirmation Delay= 112Mean 792 daysMean 942 daysMean 1092 daysMean 1242 daysMean 1392 days

-25 0 25 50 75Average reduction in Rtin the context of our data

Infection to Death Delay= 080Mean 1782 daysMean 1982 daysMean 2182 daysMean 2382 daysMean 2582 days

-25 0 25 50 75Average reduction in Rtin the context of our data

Generation Interval= 149Mean 306 daysMean 406 daysMean 506 daysMean 606 daysMean 706 days

Figure B9 Sensitivity to key epidemiological parameter choices evaluated using a multivariate analysisWe jointly vary the prior means over the mean of the generation interval the delay between infectionand case confirmation and the delay between infection and death Each panel displays sensitivity to theprior mean of one distribution and the NPI effects are computed by averaging over the 25 runs withdifferent prior means of the two other distributions (see text)

Sensitivity to country exclusions Figure B10 shows results if one country at a time isexcluded from the data As there is no strong justification for including or excluding anyparticular country results ought to be stable if a country is excluded

-25 0 25 50 75

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

= 151DefaultAlbaniaAndorraAustriaBelgiumBosnia andHerzegovinaBulgariaCroatia

-25 0 25 50 75

= 151DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-25 0 25 50 75

= 151DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

= 151DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-25 0 25 50 75Average reduction in Rtin the context of our data

= 151DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-25 0 25 50 75Average reduction in Rtin the context of our data

= 151DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure B10 Sensitivity to excluding individual countries

Sensitivity to case-undercounting Figure B11 shows results when scaling the recordednumber of cases in each country First we correct the number of reported cases on each day

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

using estimates of the time-varying ascertainment rate from Golding et al25 For example ifthe estimated ascertainment rate in country c at time t is 50 we double the correspondingnumber of reported cases With this altered case data the model estimates a larger effectfor closing schools and universities and a smaller effect for limiting gatherings to 1000people or less There is little change in the effects estimated for the other NPIs

Second we randomly either double or halve the reported number of cases in each countryThis is intended to demonstrate that the model is invariant to constant differences in ascer-tainment between countries As expected there are negligible differences compared to thedefault results

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Case Undercounting= 163

Time-Varying ScalingRandom Constant ScalingDefault

Figure B11 Sensitivity to case undercounting considering both constant undercounting and time-varying undercounting

Sensitivity to structurally different models Figure B12 shows the NPI effectivenessestimates from models that use only cases or deaths as observations in contrast to ourmain model which uses both As expected there are some differences in the NPI effectestimates between the models which may be caused by the unique biases present in casedata and in death data Differences could also occur if NPIs differentially affected casesand deaths (eg if they affect different age groups) Nevertheless the studyrsquos qualitativeconclusions as outlined in the Discussione hold across the different data types

A number of implicit structural assumptions are made by the choice of model structureWe test sensitivity to these assumptions by evaluating NPI effectiveness estimates from al-ternative models reproducing the structural sensitivity analysis from our concurrent workwhere these models are described and compared in detail9 While these alternative modelstructures are also plausible we chose the model described in the main text as it is relativelysimple whilst also being highly robust9

eThe main qualitative conclusions as outlined in the Discussion are Stay-at-home orders had a small effectMandating mask-wearing had a small effect School and university closures had a large effect Gatherings banswere effective More strict gathering bans were more effective than less strict ones Business closures wereeffective Closing most nonessential businesses had limited benefit over closing just high-risk businesses

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Data Type

= 347Only Case DataOnly Death DataDefault

Figure B12 Sensitivity to different data types ie models using only confirmed cases only deaths orboth

The models are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting Rt into a daily growth rate a renewalprocess is used as the infection model as in earlier work1826ndash28 For computationalreasons we do not place a prior distribution over the generation interval parametersfor this model

3 Noisy-R Model The noise terms ε(middot)ct affect Rt rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on Rt The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

As Figure B13 (left) shows the multiplicative effect models support almost all conclusionsdrawn in the Discussione The only exception is that the stay-at-home order NPI is cate-gorised as moderately effective under the Different Effects Model (median reduction in Rt

of 18)

The results of the Additive Effects Model cannot be directly compared to the other modelssince it expresses results on a different scale (percentage reduction in R0 instead of Rt ) Itis therefore shown in a separate panel However the trend in the results is visually similarto the default results

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs which were implemented nor directly measures broaderbehavioural changes Since these factors influence Rt we must be wary of their effectbeing attributed to observed NPIs We investigate this further by assessing how much effec-

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

Structural Sensitivity Multiplicative Models= 311

Noisy-RDiscrete Renewal

Different EffectsDefault

-25 0 25 50 75Average reduction in Rtas a percentage of R0

in the context of our data

Structural Sensitivity Additive Model

Figure B13 Structural sensitivity analysis NPI effectiveness estimates under different structural as-sumptions Left Effectiveness estimates for multiplicative effect models Note that for computationalreasons the discrete renewal model does not have a prior over the generation interval Right Effective-ness estimates for an additive effect models For this model the effectiveness of each NPI is expressedas a reduction of R0 rather than Rt

tiveness estimates change when previously unobserved factors are included and also whenobserved factors are excluded This is best practice for addressing unobserved factors suchas confounders2930

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs31 The timings of our observed NPIsrsquo implementation dates are indeed mutuallycorrelated prompting the question of how much results change when we make previouslyobserved NPIs unobserved Figure B14 (left) shows NPI effectiveness estimates when pre-viously observed NPIs are excluded in turn The studyrsquos main qualitative conclusionse holdacross all experimental conditions and the variations in NPI effectiveness are rarely largeenough to cause a change in effectiveness category (Figure 5) except for the Gatheringslimited to 1000 people or less NPI Considering that some of the excluded NPIs have strongestimated effects when included and are correlated with other NPIs this degree of robust-ness is encouragingly high It suggests that unobserved factors will not significantly biasresults as long as their effects and their correlations with the studied NPIs do not exceedthose of the studied NPIs We hypothesise that this robustness to unobserved factors is dueto the noise on the daily growth rate in our model9

In addition Figure B14 (right) shows NPI effectiveness estimates when we observe (iecontrol for) additional NPIs taken from the OxCGRT dataset32

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Inclusion of OxCGRT NPIs= 153

Travel ScreenQuarantineTravel BansSymptomatic TestingPublic Information CampaignsInternal Movement LimitedPublic Transport LimitedDefault

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

Exclusion of Collected NPIsUncombined Effects Shown

= 188Mask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay Home OrderDefault

Figure B14 Robustness to unobserved effects Left Results when excluding previously observed NPIsWe exclude one of the NPIs in turn and show the estimates for the other NPIs Note that this subplotshows the marginal effect for hierarchical NPIs rather than the total effect different to other figuresthat show cumulative effects for gathering bans and businesses closures For example Figure 3 displaysthe total effect of closing most nonessential businesses while here we show the additional effect ofclosing most nonessential businesses over just closing some high-risk businesses We show the additionaleffects here because the effect of a cumulative intervention would become undefined when part of it isexcluded from the analysis Right Results when controlling for previously unobserved NPIs We includeone additional NPI in turn and show the estimates for the NPIs in our study (the additional NPI is notshown)

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Appendix C1 Multivariate Sensitivity Analysis - Expanded Results

We now present the results of our joint multivariate sensitivity analysis (previously sum-marised in Figure B9) in which we vary the mean of the prior (ldquoprior meanrdquo) over themean of the generation interval the delay between infection and death and the delaybetween infection and case confirmation

Figure C15 shows for each NPI how NPI effects change when all prior means are variedsimultaneously

We find that the most sensitive NPIs are Gatherings limited to 1000 people or fewer Gath-erings limited to 100 people or fewer and Most nonessential businesses closed However thelargest differences appear when all of the parameters are set to the most extreme valuesthat jointly produce a change in a particular direction We also see systematic trends asthe parameters are varied eg increasing the prior mean of the generation interval meantends to decrease the effectiveness of the Gatherings limited to 1000 or fewer NPI while in-creasing prior means of the infection to deathcase confirmation distribution means tendsto increase the effectiveness of this NPI

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

792

942

1092

1242

1392

Mas

kWea

ring

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

GI = 306

-150

-75

00

75

150GI = 406

-150

-75

00

75

150GI = 506

-150

-75

00

75

150GI = 606

-150

-75

00

75

150GI = 706

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

00

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

0

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Som

eBus

s

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Mos

tBus

s

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Educ

at

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

792

942

1092

1242

1392

Stay

Hom

e

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

Figure C15 Global sensitivity analysis Each cell shows the difference between the median effectivenessestimate under a specific experimental condition and the default condition where the estimates areexpressed as percentage reduction in Rt Each subplot represents a particular prior mean of the meangeneration interval and an NPI The NPIs vary top to bottom and the generation interval prior meanincreases left to right Each cell with a subplot represents a particular choice of the prior mean of themean infection to death delay (increasing left to right) and the prior mean of the mean infection to caseconfirmation delay (increasing bottom to top) Positive cell values indicate that the median NPI estimateis larger than with default settings and vice versa

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C2 Sensitivity to additional epidemiological assumptions

For completeness Figure C16 shows sensitivity to two further epidemiological assumptionsOn the left we show the dependence of NPI effectiveness estimates on R0 We place aNormal prior over R0 and a hyperprior over its variance reflecting the wide disagreementof regional estimates of R0 In the sensitivity analysis we vary the mean of the prior froma low value (238) to a high one (428) As expected higher mean R0 values result in largerNPI effects This trend is apparent in all NPIs but stronger for NPIs that tended to beimplemented early on (like school closures and gathering bans)

On the right we vary the prior over NPI effectiveness Our default prior on NPI effectivenessis assymmetric reflecting a belief that NPIs are more likely to reduce Rt than increase itHere we additionally test a symmetric Normal prior and a one-sided Half-Normal priorwhich does not allow for NPIs to increase Rt

Appendix C3 Sensitivity to data preprocessing

When the case count is small a large fraction of cases may be imported from other coun-tries and the testing regime may change rapidly To prevent this from biasing our modelwe neglect case numbers before a country has reached 100 confirmed cases Similarly weneglect death numbers before a country has reached 10 deaths Here we vary these thresh-olds Intuitively the minimum cases threshold should be higher than the minimum deathsthreshold

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

R0 prior= 333

[R] = 238[R] = 278

Default[R] = 328[R] = 378[R] = 428

-25 0 25 50 75Average reduction in Rtin the context of our data

NPI Effectiveness Prior= 137

i Normal(0 022)i Half Normal(022)

Default

Figure C16 Sensitivity to additional epidemiological priors Left Sensitivity to the prior mean on R0Right Sensitivity to the NPI effectiveness prior

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Minimum Cases Threshold= 137

3050Default (100)200300

-25 0 25 50 75Average reduction in Rtin the context of our data

Minimum Deaths Threshold= 102

15Default (10)3050

Figure C17 Data preprocessing sensitivity Left Variations in the minimum number of cases RightVariations in the minimum number of deaths

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Validation by predicting unseen data - all countries

Here we show the country-level plots of the single-country holdout experiments from Ap-pendix B1 We use leave-one-out cross-validation fitting the model on 40 countries andshowing its predictions on the excluded country We repeat this process for all 41 countriesIn the excluded country the first 14 days (filled dots) of death and case data are observedto allow roughly inferring R0 These days are not used to infer NPI effectiveness All laterdays are masked (unfilled dots) The activation dates of NPIs are also given and the modeluses the effectiveness estimates inferred from the 40 other countries

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

北便

France

Figure C18 Predictions on excluded countries

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

北 便

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

Malta

Figure C19 Predictions on excluded countries

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

Slovenia

Figure C20 Predictions on excluded countries

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Switzerland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

United Kingdom

Figure C21 Predictions on excluded countries Vertical lines show the activation (or inactivation) datesof NPIs Shaded areas are 95 credible intervals Blue and red dots show the observed confirmed casesand deaths while blue and red lines show the median model estimates of cases (Ct ) and deaths (Dt )Empty dots are not observed by the model For each country we show the full window of analysis (fromthe start of the epidemic until the first NPI was lifted or the 30th of May 2020 whichever was earliersee Methods)

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Posterior predictive distributions

The posterior predictive distribution (Figure C22) shows the predicted number of cases anddeaths after observing the data Although these curves can be called lsquofitsrsquo the degree of fitto the data must be interpreted with great care The fit is generally tight but this is partlydue to the inferred latent noise variables ε(C )

t and ε(D)t Inferring this latent noise allows

the posterior predictive distribution to closely match the data without overfitting the effec-tiveness parameters to the data Such behavior is common in Bayesian models which oftenperfectly interpolate the data without overfitting33 The noise terms can account for peri-ods where infections grew faster or slower than predicted based solely on the active NPIsIn such periods the noise may account for changes in testing reporting and unobservedinterventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Daily DeathsDaily Confirmed Cases

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C22 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 Calibration without countries used for hyperparameter selec-tion

0 20 40 60 80 100Credible Interval

0

20

40

60

80

100E

mpi

rical

Cum

ulat

ive

Freq

uenc

y

Calibration Country Holdouts

Figure C23 Calibration in held-out countries with leave-one-out cross-validation Here we include onlythose 35 countries that were not used to select the hyperparameter We show the first 14 days of casesand deaths in each country and then extrapolate to future days within the period of analysis The plotshows the percentage of observed daily case and death counts that lie within the X credible intervalacross all countries and days

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 MCMC stability results

1000 1002 10040

1000

2000

3000

4000

R

0 1 2 30

500

1000

1500

2000

Relative ESS

Figure C24 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size A value of 1 indicates perfect decorrelation between samplesValues above (below) 1 indicate that the effective number of samples is higher (lower) than the actualnumber of samples due to negative (positive) correlation respectively

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D5 Estimated values for R0 by country The parenthesis give the 95 credible interval which isoften wide The mean R0 across countries is 33

Country Estimated R0 Country Estimated R0

Albania 348 (275431) Lithuania 323 (248403)Andorra 275 (21434) Malaysia 296 (236361)Austria 31 (25377) Malta 327 (248414)Belgium 36 (302427) Mexico 399 (32748)Bosnia and Herzegovina 331 (259406) Morocco 359 (299427)Bulgaria 375 (304457) Netherlands 316 (26375)Croatia 342 (27418) New Zealand 241 (17731)Czech Republic 346 (276425) Norway 273 (215338)Denmark 28 (22347) Poland 397 (32482)Estonia 291 (229361) Portugal 355 (292425)Finland 279 (221343) Romania 401 (335475)France 331 (28389) Serbia 38 (308461)Georgia 356 (281439) Singapore 295 (238356)Germany 285 (231346) Slovakia 353 (271445)Greece 321 (261388) Slovenia 299 (23373)Hungary 403 (332482) South Africa 447 (377527)Iceland 171 (113242) Spain 36 (306423)Ireland 368 (309433) Sweden 229 (176288)Israel 379 (309459) Switzerland 289 (234351)Italy 336 (288391) United Kingdom 32 (267378)Latvia 301 (233376)

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 Collinearity

Appendix D21 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the inclusion or exclusion ofthese countries in the dataset (Figure D25) If we included all countries we would con-clude that university closures were more effective than school closures (black markers inFigure D25) However if we excluded Iceland or Sweden we would conclude that theywere roughly equally effective As there is no strong justification for including or excludingone particular country we cannot meaningfully disentangle the effects of school and uni-versity closures However there is much more data to determine the joint effect and it isindeed much more stable (Figure D25)

-25 0 25 50 75 100Average reduction in Rtin the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D25 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown

Appendix D22 Co-occurrence of NPIs

Table D6 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or neither) Note thatwe do not show the traditional collinearity statistics (variance inflation factors and datacorrelations) since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D6 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D23 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations24 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus042 between closing schools anduniversities and closing some businesses (Figure D26) The weak correlations are oneindicator that collinearity is manageable with our dataset

Effect on NPI combinations To better understand posterior correlations we visualizetheir effect in hosted video files As we condition on different values for one NPI we cansee that the estimates of other NPIs change only slightly always staying well within thecredible intervals in Figure 3 The significance of posterior correlations is small enough thatit is possible to calculate a reasonable approximation to the mean effect of a set of NPIs bysimply combining the mean percentage reductions for each individual NPI (eg two 50reductions lead to a 75 reduction) For example this approximation leads to a joint effectof 75 for all NPIs together which closely matches the exact mean joint effect of 77

Videos are available online here

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D26 Posterior correlations between effectiveness parameters αi

Appendix D3 Posterior Epidemiological Parameter Distributions

Figure D27 shows the posterior distributions for variables describing the key delay distribu-tions in our model the generation interval the delay between infection and case confirma-tion and the delay between infection and death We find that the posterior distributions ofthese parameters are somewhat tighter than their priors but they still explore a wide rangeof values This suggests that the data provides evidence albeit weak about these delaydistributions (for example from visual inspection the data would show that the delay islonger for deaths than for cases) An exception is the dispersion of the infection to deathdelay distribution which has a very wide prior

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

5 6

0

1

dens

ity

Generation Interval Mean

posteriorprior

200 225

00

05

Infection to Death Delay Mean

100 125

00

05

Infection to Case-Confirmation Delay Mean

0 2 4

value

00

05

dens

ity

Generation Interval std

10 20

value

00

01

Infection to Death Delay disp

5 6

value

0

1

Infection to Case-Confirmation Delay disp

Figure D27 Posterior distributions over key epidemiological parameters describing the generation in-terval and the delays between infection and case confirmationdeath

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect morethan three-quarters of the population) We thus miss NPIs which were only implementedregionally For example a few regions in Germany implemented stay-at-home orders butmost did not Thus Germany is listed as no stay-at-home order in our data Additionallyour NPI definitions were not perfectly granular For example a gathering ban on gatheringsof gt15 people and a ban on gatherings of gt60 people would both fall under the NPIGatherings limited to 100 people or less despite likely having different effects on Rt Finally while we included more NPIs than previous work (Table F7) there are many NPIsfor which we were not able to collect enough high-quality data for our modeling such aspublic cleaning or changes to public transportation

Of the 41 countries in our dataset 33 are in Europe As a result the NPI effectivenessestimates may be biased towards effects in Europe and NPI effectiveness may have beendifferent in other parts of the world

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on Rt is constantacross countries and time However the exact implementation and adherence of each NPIis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa limited degree Additionally different countries have different cultural norms and ageprofiles affecting the degree to which a particular intervention is effective For example acountry where a higher proportion of the population is in education will likely experience alarger effect from a government order to close schools and universities Our estimates thusshould be adjusted to local circumstances To address differences between countries ourstructural sensitivity analysis includes a model where each NPI can have a different effectper country (Appendix B2) The average effectiveness estimates across countries in thismodel match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponential de-cline would have the same effect on observations as an unobserved NPI Consequently wecannot quantify its effect on our results (though the sensitivity analyses look reassuring)

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expectthat closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g t and the reproductive number Rt holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso report results from a renewal process model8 that does not depend on this assumptionand we find similar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix F Overview of previous work

Table F7 Data-driven multi-country multi-NPI studies of the effectiveness of observed (as opposed tohypothetical) NPIs in reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Banholzer etal 202023

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Chen andQiu 202021

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Islam et al202020

School closuresworkplace closuresrestrictions on massgatherings publictransport closure

lockdown

149 countries or regionsInterrupted timeseries regression

Liu et al202022

13 NPIs from theOXCGRT dataset

130 countries and regions panel regression

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F8 Some data-driven studies of the effectiveness of observed NPIs in reducing the transmissionof COVID-19 which study a single-country andor a single-NPI

Study NPIs studiedRegionscountries

studiedMethod

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dandekarand

Barbastathis202035

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Dehning etal 202036

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Gatto et al202037

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Hsiang et al202019

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US (each

country modelledindividually)

Linear regression onestimated growth

rates

Jarvis et al202038

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelKraemer etal 202039

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202040 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) model

Lai et al202041

Case detection andisolation travel

restrictions contactreductions

ChinaTravel-network-based

SEIR model

Continued on next page

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F8 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Lorch et al202042

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Maier andBrockmann

202043

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Orea andAacutelvarez202044

Lockdown SpainSpatial econometric

analysis

Quilty et al202045

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

Sears et al202046

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Siedner etal 202047

General socialdistancing

USInterruptedtime-series

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F7) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

61

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

62

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all ldquotrade activitiesrdquo (except within shopping malls) service jobsthat donrsquot involve physical contact museums libraries and galleries opened Sincethe criteria regarding ldquoall retail stores being openrdquo was met we counted April 27 asthe end date for ldquoMost nonessential businesses closedrdquo

63

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H All model equations

This section will mainly be of interest to readers that wish to re-implement the model

Variables are indexed by NPI i country c and day t All prior distributions are independent

Data

1 NPI Activations φi t c isin 012 Observed (Daily) Cases Ct c 3 Observed (Daily) Deaths D t c

Prior Distributions

1 Country-specific R0 R0c sim Normal(325κ) κsim Half Normal(micro= 0σ= 05)2 NPI effectiveness αi sim Asymmetric Laplace(m = 0κ= 05λ= 10) m is the location

parameter κgt 0 is the asymmetry parameter and λgt 0 is the scale parameter3 Infection Initial Counts

N (C )0c = exp(ζ(C )

c )

N (D)0c = exp(ζ(D)

c )

ζ(C )c sim Normal(micro= 0σ= 50)

ζ(D)c sim Normal(micro= 0σ= 50)

4 Observation Noise Dispersion Parameters

Ψcases sim Half Normal(micro= 0σ= 5) (H1)

Ψdeaths sim Half Normal(micro= 0σ= 5) (H2)

Hyperparameters

1 Growth Noise Scale σg = 02

Delay Distributions

1 Generation interval distribution1314

microGI sim Normal(micro= 506σ= 03265)

σGI sim Normal(micro= 211σ= 05)

64

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

2 Time from infection to case confirmation T (C )131516f

microinfrarrconf sim Normal(micro= 1092σ= 094)

Ψinfrarrconf sim Normal(micro= 541σ= 027)

This distribution is converted into a forward-delay vector

T (C )[t ] =

1ZC

Negative Binomial(t micro=microinfrarrconfα=Ψinfrarrconf) t lt 32

0 otherwise

with ZC =31sum

t prime=0Negative Binomial(t primemicro=microinfrarrconfα=Ψinfrarrconf)

ie the delay follows a truncated and normalised negative binomial distribution3 Time from infection to death T (D)f131517

microinfrarrdeath sim Normal(micro= 2182σ= 101)

Ψinfrarrdeath sim Normal(micro= 1426σ= 518)

This distribution is converted into a forward-delay vector

T (D)[t ] =

1ZD

Negative Binomial(t micro=microinfrarrdeathα=Ψinfrarrdeath) t lt 48

0 otherwise

with ZD =47sum

t prime=0Negative Binomial(t primemicro=microinfrarrdeathα=Ψinfrarrdeath)

ie the delay follows a truncated and normalised negative binomial distribution

Infection Model

Rt c = R0c middotexp

(minus

Isumi=1

αi φi t c

) where I is the number of NPIs

αGI =microGI

σ2GI

βGI =micro2

GI

σ2GI

g t c = exp

(βGI(R

1αGIct minus1)

)minus1

fα in the definition of the Negative Binomial distribution is the dispersion parameter Larger values of αcorrespond to a smaller variance and less dispersion With our parameterisation the variance of the Negative

Binomial distribution is micro+ micro2

α

65

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

]

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise

ε(C )τc sim Normal(micro= 0σ=σg )

ε(D)τc sim Normal(micro= 0σ=σg )

Observation Modelf

Ct c =31sumτ=0

N (C )tminusτcT

(C )[τ]

D t c =47sumτ=0

N (D)tminusτcT

(D)[τ]

Ct c sim Negative Binomial(micro= Ct c α=Ψcases)

D t c sim Negative Binomial(micro= D t c α=Ψdeaths)

66

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix I References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Sam Abbott et al ldquoEstimating the time-varying reproduction number of SARS-CoV-2using national and subnational case countsrdquo In Wellcome Open Research 5112 (2020)p 112

3 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

4 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

5 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

6 Sheikh Taslim Ali et al ldquoSerial interval of SARS-CoV-2 was shortened over time bynonpharmaceutical interventionsrdquo In Science 3696507 (2020) pp 1106ndash1109

7 Xingjie Hao et al ldquoReconstruction of the full transmission dynamics of COVID-19 inWuhanrdquo In Nature 5847821 (Aug 2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the robustness of effectiveness estimation of nonpharma-ceutical interventions against COVID-19 transmissionrdquo In arXiv preprint arXiv200713454(2020) URL httpsarxivorgabs200713454

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

12 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

13 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

14 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

67

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

15 Stephen A Lauer et al ldquoThe incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases estimation and applicationrdquo In Annals ofinternal medicine 1729 (2020) pp 577ndash582

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

19 Solomon Hsiang et al ldquoThe effect of large-scale anti-contagion policies on the COVID-19 pandemicrdquo In Nature 5847820 (2020) pp 262ndash267

20 Nazrul Islam et al ldquoPhysical distancing interventions and incidence of coronavirusdisease 2019 natural experiment in 149 countriesrdquo In bmj 370 (2020)

21 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo In Covid Economics 7 (Apr 2020) pp 46ndash67

22 Yang Liu et al ldquoThe impact of non-pharmaceutical interventions on SARS-CoV-2 trans-mission across 130 countries and territoriesrdquo In medRxiv (2020)

23 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

24 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

25 Nick Golding et al ldquoReconstructing the global dynamics of under-ascertained COVID-19 cases and infectionsrdquo In medRxiv (2020) DOI 1011012020070720148460eprint httpswwwmedrxivorgcontentearly202007082020070720148460fullpdf URL httpswwwmedrxivorgcontentearly202007082020070720148460

26 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

27 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

28 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

68

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

29 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

30 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

31 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

32 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

33 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

36 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

37 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

38 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

39 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

40 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

41 Shengjie Lai et al ldquoEffect of non-pharmaceutical interventions to contain COVID-19 inChinardquo en In Nature 5857825 (Sept 2020) pp 410ndash413 DOI 101038s41586-020-2293-x

69

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

42 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

43 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

44 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

45 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

46 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

47 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

70

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Testing reporting and infection fatality rates
      • Method for uncertainty over key epidemiological parameters
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Multivariate Sensitivity Analysis - Expanded Results
              • Sensitivity to additional epidemiological assumptions
              • Sensitivity to data preprocessing
              • Validation by predicting unseen data - all countries
              • Posterior predictive distributions
              • Calibration without countries used for hyperparameter selection
              • MCMC stability results
                • Additional results
                  • Estimated R0 by country
                  • Collinearity
                    • The individual effects of school and university closures
                    • Co-occurrence of NPIs
                    • Correlations between effectiveness estimates
                      • Posterior Epidemiological Parameter Distributions
                        • Additional discussion of assumptions and limitations
                          • Limitations of the data
                          • Model limitations
                            • Overview of previous work
                            • Handling edge cases in the data collection
                            • All model equations
                            • References
Page 10: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan

delay is the sum of the incubation period and the delay from onset of symptoms to deathand the prior over its mean has a mean of 218 days20 Finally as in related NPI models16both the reported cases and deaths follow a negative binomial noise distribution with aninferred noise dispersion

Using a Markov chain Monte Carlo (MCMC) sampling algorithm21 this model infers pos-terior distributions of each NPIrsquos effectiveness while accounting for cross-country variationsin testing reporting and fatality rates as well as uncertainty in the generation interval anddelay distributions To analyse the extent to which modelling assumptions affect the re-sults our sensitivity analysis includes all epidemiological parameters prior distributionsand many of the structural assumptions introduced above (Appendix B2 and AppendixC) MCMC convergence statistics are given in Appendix C7

Results

NPI Effectiveness

Our model enables us to estimate the individual effectiveness of each NPI expressed as apercentage reduction in R As in related work168 this percentage reduction is modelledas constant over countries and time and independent of the other implemented NPIs Inpractice however NPI effectiveness may depend on other implemented NPIs and localcircumstances Thus our effectiveness estimates ought to be interpreted as the effectivenessaveraged over the contexts in which the NPI was implemented in our data10 Our results thusgive the average NPI effectiveness across typical situations that the NPIs were implementedin Figure 3 (bottom left) visualises which NPIs typically co-occurred aiding interpretation

Under the default model settings the mean percentage reduction in R (with 95 credibleinterval) associated with each NPI is as follows (Figure 3) mandating mask-wearing in(some) public spaces -1 (-13ndash8) limiting gatherings to 1000 people or less 13(-3ndash31) to 100 people or less 28 (9ndash44) to 10 people or less 36 (17ndash53) closing some high-risk businesses 20 (0ndash40) closing most nonessential busi-nesses 29 (8ndash47) closing schools and universities 41 (23ndash56) and issuingstay-at-home orders (with exemptions) 10 (-2ndash22) Note that we cannot robustlydisentangle the individual effects of closing schools and closing universities since the im-plementation dates of these NPIs coincided nearly perfectly in all countries except Icelandand Sweden (Appendix D21) We thus show the joint effect of closing both schools anduniversities and treat schools and universities closed as one single NPI going forward

Some NPIs frequently co-occurred ie were partly collinear However we are able to iso-late the effects of individual NPIs since the collinearity is imperfect and our dataset is largeFor every pair of NPIs we observe one of them without the other for 748 country-days onaverage (Appendix D22) The minimum number of country-days for any NPI pair is 143(for limiting gatherings to 1000 or 100 attendees) Additionally under excessive collinear-ity and insufficient data to overcome it individual effectiveness estimates are highly sensi-

10

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75 100Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spaces

Gatherings limited to1000 people or less

Gatherings limited to100 people or less

Gatherings limited to10 people or less

Some businessesclosed

Most nonessentialbusinesses closed

Schools and universitiesclosed

Stay-at-home order(with exemptions)

EndsTransmission

北 便

i

j

100

100

100 81

10

0 64

100

100 45

27

100 94

78

89

65

88

93

44

29

100

100 82

92

68

91

96

46

28

100

100

100 99

76

97

98

55

30

99

97

86

100 72

95

98

49

27

99

98

91

100

100 99

99

64

30

97

95

84

95

72

100 99

49

29

98

95

81

92

68

93

100 46

28

100

100 98

10

0 94

99

99

100

Frequency i Active Given j Active

0 500 1000 1500 2000 2500 3000

Days

Total Days Active

Figure 3 Top NPI effects under default model settings The Figure shows the average percentagereductions in R as observed in our data (or in terms of the model the posterior marginal distributionsof 1minusexp(minusαi )) with median 50 and 95 credible intervals A negative 1 reduction refers to a 1increase in R Cumulative effects are shown for hierarchical NPIs (gathering bans and business closures)ie the result for Most nonessential businesses closed shows the cumulative effect of two NPIs withseparate parameters and symbols - closing some (high-risk) businesses and additionally closing mostremaining (non-high-risk but nonessential) businesses given that some businesses are already closedBottom Left Conditional activation matrix Cell values indicate the frequency that NPI i (x-axis) wasactive given that NPI j (y-axis) was active Eg schools were always closed whenever a stay-at-homeorder was active (bottom row third column from the right) but not vice versa Bottom Right Totalnumber of days each NPI was active across all countries

11

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Most businessesclosed

Stay-at-homeorder

Mask wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businessesclosed

Schools anduniversities closed

Figure 4 Combined NPI effectiveness for the most common sets of NPIs in our data by size of the NPI setShaded regions denote 50 and 95 credible intervals Left Maximum R0 that could be reduced to be-low 1 for each set of NPIs Right Predicted R after implementation of each set of NPIs assuming R0 = 38Readers can interactively explore the effects of all sets of NPIs at httpepidemicforecastingorgcalc

tive to variations in the data and model parameters22 High sensitivity prevented Flaxmanet al1 who had a smaller dataset from disentangling NPI effects9 Our estimates are sub-stantially less sensitive (see next section) Finally the posterior correlations between theeffectiveness estimates are weak suggesting manageable collinearity (Appendix D23)

Although the correlations between the individual estimates are weak we should take theminto account when evaluating combined NPI effects For example if two NPIs frequentlyco-occurred there may be more certainty about the combined effect than about the twoindividual effects Figure 4 shows the combined effectiveness of the sets of NPIs thatare most common in our data Together our set of NPIs reduced R by 77 (74ndash79)on average Across countries the mean R without any NPIs (ie the R0) was 33 (Ta-ble D5 reports R0 for all countries) Starting from this number the estimated R likely couldhave been reduced below 1 by closing schools and universities high-risk businesses andlimiting gathering sizes Readers can interactively explore the effects of sets of NPIs athttpepidemicforecastingorgcalc A CSV file containing the joint effectiveness of all NPIcombinations is available online here

Sensitivity and validation

We perform a range of validation and sensitivity experiments (Appendix B with furtherexperiments in Appendix C) First we analyse how the model extrapolates to unseen coun-tries and find that it makes calibrated forecasts over periods of up to 2 months with un-certainty increasing over time Further we perform multiple sensitivity analyses studyinghow results change if we modify the priors over epidemiological parameters exclude coun-tries from the dataset use only deaths or confirmed cases as observations vary the datapreprocessing and more Finally we investigate our key assumptions by showing results

12

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

for several alternative models (structural sensitivity10) and examine possible confoundingof our estimates by unobserved factors influencing R In total we consider 201 alternativeexperimental conditions

Figure 5 (left) shows the median NPI effectiveness across these 201 experimental condi-tions Compared to the results under our default settings (Figures 3 and 4) median NPIeffects vary under alternative plausible experimental conditions However the trends in theresults are robust and some NPIs outperform others under all tested conditions While wetest over large ranges of plausible values our experiments do not include every possiblesource of uncertainty and the results might change more substantially under experimentalconditions we have not tested

We categorise NPI effects into small moderate and large which we define as a medianreduction in R of less than 175 between 175 and 35 and more than 35 (verticallines in Figure 5) Six of the NPIs fall into a single category in a large fraction of experi-mental conditions school and university closures are associated with a large effect in 99of experimental conditions limiting gatherings to 10 people or less in 94 Closing mostnonessential businesses has a moderate effect in 96 of conditions limiting gatherings to100 people or less in 97 Making mask-wearing mandatory in (some) public spaces fallsinto the ldquosmall effectrdquo category in 100 of experimental conditions issuing stay-at-homeorders (with exemptions) in 99 Two NPIs fall less clearly into one category Closing some(high-risk) businesses has a moderate effect in 86 of conditions and limiting gatherings to1000 people or less has a small effect in 79 However both NPIs have small-to-moderateeffects in more than 99 of experimental conditions The effect of limiting gatherings to1000 people or less is the least stable across the sensitivity analyses which may reflect itsaforementioned partial collinearity with limiting gatherings to 100 people or less

Aggregating all sensitivity analyses can hide sensitivity to specific assumptions We displaythe median NPI effects in four categories of sensitivity analyses (Figure 5 right) and eachindividual sensitivity analysis is shown in the Appendix The trends in the results are alsostable within the categories

Discussion

We use a data-driven approach to estimate the effects that eight nonpharmaceutical in-terventions had on COVID-19 transmission in 41 countries between January and the endof May 2020 We find that several NPIs were associated with a clear reduction in R inline with the mounting evidence that NPIs can be effective at mitigating and suppressingoutbreaks of COVID-19 Furthermore our results suggest that some NPIs outperformedothers While the exact effectiveness estimates vary with modelling assumptions the broadconclusions discussed below are largely robust across 201 experimental conditions in 11sensitivity analyses

13

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

0 25 50Median reduction in Rt

Mask-wearing mandatory in(some) public spacesGatherings limited to1000 people or lessGatherings limited to100 people or lessGatherings limited to10 people or lessSome businessesclosedMost nonessentialbusinesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

All Sensitivity Analyses (201 conditions)北便

Model Structure(5 conditions)

北便

Data and Preprocessing(51 conditions)

0 25 50Median reduction in Rt

北便

Epidemiological Priors(131 conditions)

0 25 50Median reduction in Rt

北便

Unobserved Factors(14 conditions)

Figure 5 Median NPI effects across the sensitivity analyses Left Median NPI effects (reduction in R)when varying different components of the model or the data in 201 experimental conditions Results aredisplayed as violin plots using kernel density estimation to create the distributions Inside the violinsthe box plots show median and interquartile-range The vertical lines mark 0 175 and 35 (seetext) Right Categorised sensitivity analyses Structural Using only cases or only deaths as observations(2 experimental conditions Figure B12) varying the model structure (3 conditions Figure B13 left)Data Leaving out one country at a time (41 conditions Figure B10) varying the threshold below whichcases and deaths are masked (8 conditions Figure C17) sensitivity to correcting for undocumentedcases and to country-level differences in case ascertainment (2 conditions Figure B11) Epidemiologicalpriors Jointly varying the means of the priors over the means of the generation interval the infection-to-case-confirmation delay and the infection-to-death delay (125 conditions Figure C15) varying theprior over R0 (4 conditions Figure C16 left) varying the prior over NPI effectiveness (2 conditionsFigure C16 right) Unobserved factors Excluding observed NPIs one at a time (9 conditions Figure B14left) controlling for additional NPIs from a different dataset (5 conditions Figure B14 right)

14

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Business closures and gathering bans both seem to have been effective at reducing COVID-19 transmission Closing only high-risk businesses appears to have been only somewhat lesseffective than closing most nonessential businesses the median reduction in R differs onlyby 9 (6ndash12 mean and 95 interval of median estimates across the experiments set-tings in Figure 5) Closing only high-risk businesses may thus have been the more promisingpolicy option in some circumstances Limiting gatherings to 10 people or less was more ef-fective than limits up to 100 or 1000 people This may reflect the fact that small gatheringsare common

As previously discussed we estimate the average effect each NPI had in the contexts inwhich it was implemented When countries introduced stay-at-home orders they nearly al-ways also banned gatherings and closed schools universities and some or most businessesif they had not done so already (Figure 3 bottom left) Flaxman et al1 and Hsiang et al3

add the effect of these distinct NPIs to the effectiveness of stay-at-home orders and accord-ingly find a large effect In contrast we account for these other NPIs separately and isolatethe additional effect of ordering the population to stay at home (when large gatheringsare banned and educational institutions and some businesses closed)d In accordance withother studies that took this approach we find a small effect26 A typical country could havereduced R to below 1 without a stay-at-home order (Figure 4) provided other NPIs wereimplemented

Mandating mask-wearing in various public spaces had no clear effect on average in thecountries we studied This does not rule out mask-wearing mandates having a larger ef-fect in other contexts In our data mask-wearing was only mandated when other NPIshad already reduced public interactions When most transmission occurs in private spaceswearing masks in public is expected to be less effective This might explain why a largereffect was found in studies that included China and South Korea where mask-wearingwas introduced earlier823 While there is an emerging body of literature indicating thatmask-wearing can be effective in reducing transmission the bulk of evidence comes fromhealthcare settings24 In non-healthcare settings risk compensation25 may play a largerrole potentially reducing effectiveness While our results cast doubt on reports that mask-wearing is the main determinant shaping a countryrsquos epidemic23 the policy still seemspromising given all available evidence due to its comparatively low economic and socialcosts Its effectiveness may have increased as other NPIs have been lifted and public inter-actions have recommenced

We find a large effect for school and university closures This finding is remarkably robustacross different model structures variations in the data and epidemiological assumptions(Figure 5) It remains robust when controlling for NPIs excluded from our study (Fig-ure B14) Our approach cannot distinguish direct and indirect effects such as forcing par-

dIn principle a stay-at-home order does not imply these other NPIs It is possible for a country to simultaneouslyhave allowed schools to be open and have a stay-at-home order (with exemptions for attending school) inplace However this is not how stay-at-home orders were implemented by the countries in our dataset andwe cannot estimate the effect of such a hypothetical stay-at-home order from the available data

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ents to stay at home or causing broader behaviour changes by increasing public concernAdditionally since school and university closures almost perfectly coincided in the coun-tries we study our approach cannot distinguish their individual effects (Appendix D21)This limitation likely also holds for other observational studies which do not include dataon university closures and estimate only the effect of school closures1ndash35ndash8 Previous evi-dence on school and university closures is mixed1626 Early data suggest that children andyoung adults are equally susceptible to infection but have a notably lower observed inci-dence rate than older adultsmdashwhether this is due to school and university closures remainsunknown27ndash30 Although infected young people are often asymptomatic they appear toshed similar amounts of virus as older people3132 and might therefore transmit the infec-tion to higher-risk demographics unknowingly As the role of children in transmission isstill unclear33 while outbreaks detected in schools are rising33ndash35 this topic merits carefulattention

Our study has several limitations First NPI effectiveness may depend on the context ofimplementation such as the presence of other NPIs and country-specific factors Our esti-mates must be interpreted as the average effectiveness over the contexts in our dataset10and expert judgement is required to adjust them to local circumstances Second R mayhave been reduced by unobserved NPIs or spontaneous behaviour changes To investigatewhether these reductions could be falsely attributed to the observed NPIs we perform sev-eral additional analyses and find that our results are stable to a range of unobserved effects(Appendix B3) However this sensitivity check cannot provide certainty Investigating therole of unobserved effects is an important topic to explore further Third our results can-not be used without qualification to predict the effect of lifting NPIs For example closingschools and universities seems to have greatly reduced transmission but this does not meanthat reopening them will cause infections to soar Educational institutions can implementsafety measures such as reduced class sizes as they reopen Further work is needed to anal-yse the effects of reopenings we hope that our collected data aids this effort Fourth whilewe included more NPIs than most previous work (Table F7) several promising NPIs wereexcluded For example testing tracing and case isolation may be an important part of acost-effective epidemic response36 but were not included because it is difficult to obtaincomprehensive data We discuss further limitations in Appendix E

Although our work focuses on estimating the impact of NPIs on the reproduction numberR the ultimate goal of governments may be to reduce the incidence prevalence and excessmortality of COVID-19 Controlling R is essential but the contribution of NPIs towards thesegoals may also be mediated by other factors such as their duration and timing37 periodicityand adherence3839 and successful containment40 While each of these factors addressestransmission within individual countries it can be crucial to additionally synchronise NPIsbetween countries since cases can be imported41

In conclusion as governments around the world seek to keep R below 1 while minimisingthe social and economic costs of their interventions we hope that our results can informpolicy decisions on which NPIs to implement in any potential further wave of infectionsAdditionally our work may provide insight on which areas of public life are most in need

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

of restructuring so that they can continue despite the pandemic However our estimatesshould not be seen as the final word on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies simulation studiesexperimental trials and clinical experience

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

The paid contractor work helping with the data collection the development of the interac-tive website and the costs for cloud compute were funded by the Berkeley Existential RiskInitiative

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier V Mikulik MAHartwick L Chindelevitch designed the model and modelling experiments M Sharma ABStephenson T Gavenciak J Salvatier performed and analysed the modelling experimentsJ Kulveit T Gavenciak JM Brauner conceived the research S Mindermann M SharmaJM Brauner L Chindelevitch J Kulveit T Besiroglu did the literature search JM BraunerS Mindermann M Sharma G Leech L Chindelevitch T Besiroglu V Mikulik wrote themanuscript All authors read and gave feedback on the manuscript and approved the finalmanuscript JM Brauner S Mindermann and M Sharma contributed equally L Chindele-vitch Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

3 Solomon Hsiang et al ldquoThe effect of large-scale anti-contagion policies on the COVID-19 pandemicrdquo In Nature 5847820 (2020) pp 262ndash267

4 Shengjie Lai et al ldquoEffect of non-pharmaceutical interventions to contain COVID-19 inChinardquo en In Nature 5857825 (Sept 2020) pp 410ndash413 DOI 101038s41586-020-2293-x

5 Yang Liu et al ldquoThe impact of non-pharmaceutical interventions on SARS-CoV-2 trans-mission across 130 countries and territoriesrdquo In medRxiv (2020)

6 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

7 Nazrul Islam et al ldquoPhysical distancing interventions and incidence of coronavirusdisease 2019 natural experiment in 149 countriesrdquo In bmj 370 (2020)

8 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo In Covid Economics 7 (Apr 2020) pp 46ndash67

9 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

10 Mrinank Sharma et al ldquoOn the robustness of effectiveness estimation of nonpharma-ceutical interventions against COVID-19 transmissionrdquo In arXiv preprint arXiv200713454(2020) URL httpsarxivorgabs200713454

11 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

12 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

13 Ensheng Dong Hongru Du and Lauren Gardner ldquoAn interactive web-based dashboardto track COVID-19 in real timerdquo In The Lancet Infectious Diseases 205 (May 2020)pp 533ndash534 DOI 101016s1473-3099(20)30120-1

14 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

15 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

16 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

17 Sam Abbott et al ldquoEstimating the time-varying reproduction number of SARS-CoV-2using national and subnational case countsrdquo In Wellcome Open Research 5112 (2020)p 112

18 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

19 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

20 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Young Joon Park et al ldquoContact tracing during coronavirus disease outbreak SouthKorea 2020rdquo In Emerg Infect Dis 2610 (2020)

28 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

29 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

30 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

31 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

32 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

33 Chen Stein-Zamir et al ldquoA large COVID-19 outbreak in a high school 10 days afterschoolsrsquo reopening Israel May 2020rdquo In Eurosurveillance 2529 (2020) p 2001352

34 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 38 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile919676Weekly_COVID19_Surveillance_Report_week_38_FINAL_UPDATEDpdf 2020

35 Meira Levinson Muge Cevik and Marc Lipsitch Reopening Primary Schools during thePandemic 2020

36 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

37 K Prem et al ldquoThe effect of control strategies to reduce social mixing on outcomes ofthe COVID-19 epidemic in Wuhan China a modelling studyrdquo In Lancet Public Health5 (5 May 2020) e261ndashe270

38 Patrick G T Walker et al ldquoThe impact of COVID-19 and strategies for mitigationand suppression in low- and middle-income countriesrdquo In Science 3696502 (2020)pp 413ndash422

39 Nicholas G Davies et al ldquoEffects of non-pharmaceutical interventions on COVID-19cases deaths and demand for hospital services in the UK a modelling studyrdquo In TheLancet Public Health 57 (2020) e375ndashe385

40 Xingjie Hao et al ldquoReconstruction of the full transmission dynamics of COVID-19 inWuhanrdquo In Nature 5847821 (Aug 2020)

41 N W Ruktanonchai et al ldquoAssessing the impact of coordinated COVID-19 exit strate-gies across Europerdquo In Science (2020) DOI 101126scienceabc5096

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 23

Appendix A1 Detailed model description 23

Appendix A2 Testing reporting and infection fatality rates 26

Appendix A3 Method for uncertainty over key epidemiological parameters 27

Appendix B Validation 30

Appendix B1 Unseen data 30

Appendix B2 Sensitivity Analysis 30

Appendix B3 Robustness to unobserved effects 35

Appendix C Additional sensitivity analyses and validation 38

Appendix C1 Multivariate Sensitivity Analysis - Expanded Results 38

Appendix C2 Sensitivity to additional epidemiological assumptions 40

Appendix C3 Sensitivity to data preprocessing 40

Appendix C4 Validation by predicting unseen data - all countries 42

Appendix C5 Posterior predictive distributions 47

Appendix C6 Calibration without countries used for hyperparameter selection 48

Appendix C7 MCMC stability results 49

Appendix D Additional results 50

Appendix D1 Estimated R0 by country 50

Appendix D2 Collinearity 51

Appendix D3 Posterior Epidemiological Parameter Distributions 53

Appendix E Additional discussion of assumptions and limitations 55

Appendix E1 Limitations of the data 55

Appendix E2 Model limitations 55

Appendix F Overview of previous work 57

Appendix G Handling edge cases in the data collection 60

Appendix H All model equations 64

Appendix I References 67

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections N (sdot)tc

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Mask wearing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closedSchools closed

Stay-at-home order

Growth reductions from interventions αi i

Daily growth rate gtc

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Uncertain delay from infection to

confirmation death

Uncertain generation interval

Universities closed

Figure A6 Model Overview Purple nodes are observed We describe the diagram from bottom to topThe effectiveness of NPI i is represented by αi which is independent of the country On each day t a countryrsquos daily reproduction number Rt c depends on the countryrsquos basic reproduction number R0cand the active NPIs The active NPIs are encoded by Φi t c which is 1 if NPI i is active in country c attime t and 0 otherwise Rt c is transformed into the daily growth rate gt c using the generation intervalparameters and subsequently is used to compute the new infections N (C )

t c and N (D)t c that will turn into

confirmed cases and deaths respectively Finally the number of new confirmed cases Ct c and deathsDt c is computed by a discrete convolution of N (middot)

t c with the respective delay distributions Our modeluses both death and case data it splits all nodes above the daily growth rate gt c into separate branchesfor deaths and confirmed cases We account for uncertainty in the generation interval infection-to-case-confirmation delay and the infection-to-death delay by placing priors over the parameters of thesedistributions

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1but with several key extensions First we model both confirmed cases and deaths al-lowing us to leverage significantly more data Furthermore we do not assume a specificinfection fatality rate (IFR) since we do not aim to infer the total number of COVID-19 in-fections Additionally since epidemiological parameters are only known with uncertaintywe place priors over them following recent best practice2 Concretely we place priors onthe means and standard deviationsdispersions of the generation interval the infection-to-confirmation delay and the infection-to-death delay These prior choices are detailedin Appendix A3 The end of this section details further adaptations which allow us to

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

minimise assumptions about testing reporting and the IFR Code is available online hereFor readers who wish to re-implement the model a concise list of all equations is given inAppendix H

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior distribution over R0c reflecting the wide disagreement of regionalestimates of R0

3 The mean R0c is 328 based on a meta-analysis4 We parameterize theeffectiveness of NPI i assumed to be same across countries and time with αi Each NPI isassumed to have an independent multiplicative effect on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We place an Asymmetric Laplace prior over αi with scale parameter10 asymmetry parameter 05 and location parameter 0 Our prior allows for (unbounded)positive and negative effects as we cannot a priori exclude the possibility that the introduc-tion of an NPI increases R However our prior places 80 of its mass on positive effectsreflecting a belief that NPIs are much more likely to reduce Rt than not This is a shrinkageprior placing 50 of its mass on lsquosmallrsquo effectiveness (less than 10 change in Rt ) All priordistributions are independent

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c and Rt c

(Eq 29 in5)

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We account for uncertainty inthe generation interval (GI) assumed to be a Gamma distribution by placing prior distri-butions over its mean and standard deviation (Appendix A3) Using (A2) we can writeg t c as a function g t c (Rt c microGIσGI) (see Appendix H) where microGI is the GI mean and σGI isthe GI standard deviation

aMany epidemiological models define growth rates as the exponent r in an exponential growth function Herewe use daily growth rates instead for ease of exposition These choices are mathematically equivalent Notethat we adapted equation (29) in Wallinga amp Lipsitch5 to account for our choice

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Note that the GI can change over time due to NPIs In particular the effective contact tracingand case isolation in China substantially shortened the mean GI to only 26 days among theidentified cases6 Since most cases were likely not identified even in Wuhan China7 andour study omits most of the countries with highly effective contact tracing programs suchas China or South Korea we model the GI as constant (but uncertain) over time A possibleextension for further work is to model the change in GI explicitly

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are inferred from the observation models for cases and deathsWe assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)] (A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)] (A4)

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross-validation as no reference is available for it We evaluate five dif-ferent values (σN isin 00501020304) with a fixed randomly chosen validation set of6 countries σN = 02 maximises the predictive log-likelihood on the validation set Thisensures a more calibrated model which is less likely to produce overconfident or unstableestimates10 We did not tune any other aspect of the model

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation In realitythis delay distribution is the sum of two independent distributions the incubation period(lognormal distribution) and the delay from onset of symptoms to confirmation (negativebinomial distribution) These distributions are uncertain but modelling (and placing pri-ors over) them individually leads to unidentifiability Therefore we model the total delay

bSince we treat new infections as a continuous number its initial value can be between 0 and 1

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

between infection and confirmation assuming that this follows a negative binomial dis-tribution We convert priors over the individual delays to a prior over the total delay bybootstrapping mdash please see Appendix A3

As in Flaxman et al1 the observed cases Ct c follow a negative binomial noise distribu-tion with mean C t c and an inferred dispersion parameter Ψ(C ) This distribution encodesthat small case numbers are more noisy and should therefore receive less weight Hav-ing separate dispersion parameters for cases and deaths ensures that they can be weighteddifferently if there is a difference in their noise distributions

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ) (A6)

where PD (delay) is the distribution of the delay from infection to death As for cases wemodel the total delay between infection and death as negative binomial which is in realitythe sum of two independent distributions the incubation period and the delay from onsetof symptoms to death (gamma distribution)

Finally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and an inferred dispersion parameter Ψ(D)

The model was implemented in PyMC311 We infer the unobserved variables in our modelusing the No-U-Turn Sampler (NUTS)12 a standard Markov chain Monte Carlo samplingalgorithm

Appendix A2 Testing reporting and infection fatality rates

Scaling all values of a time series by a constant does not change its growth rates Themodel is therefore invariant to the scale of the observations and consequently to country-level differences in the IFR and the ascertainment rate (the proportion of infected peoplewho are subsequently reported positive) For example assume countries A and B differ onlyin their ascertainment rates Then our model will infer a difference in N (C )

0c (Eq (A3)) butnot in the growth rates g t c across A and B Accordingly the inferred NPI effectiveness willbe identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas the

cThis is only approximately true The negative binomial output distribution has a coefficient of variation dimin-ishing with its mean ie smaller observations are relatively more noisy and carry less weight Furthermorewhilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligible effect

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

introduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimics changes in the ascertainment ratemdashnoise at time

τ affects all future casesmdashand allows for gradual multiplicative changes in ascertainment

Appendix A3 Method for uncertainty over key epidemiological parameters

We account for uncertainty in the following delay distributions the generation interval thedelay from infection to symptom onset (incubation period) the delay from symptom onsetto case confirmation and the delay from symptom onset to death

Each distribution has an uncertain mean (Table A3) whose prior distribution we takefrom a meta-analysis13 published shortly after our window of analysis This helps accountfor possible heterogeneity between different populations and therefore often finds higheruncertainty in the means than estimated in primary studies while using more data Themeta-analysis reports mean values with 95 confidence intervals We set normal priors overthe mean delays with mean equal to the reported mean in the meta-analysis and standarddeviation set such that the prior 95 credible interval includes the 95 confidence intervalsfrom the meta-analysis For example the meta-analysis reports an expected mean delayfrom symptoms to death of 1671 days with a 95 credible interval ranging from 1537 to1817 We thus assume that the prior is normal with micro= 1671 and σ= 075 which ensuresthat its 95 credible interval ranges from 1525 to 1817 strictly including the intervalreported in the meta analysis to be conservative Priors are truncated at zero

Furthermore we account for the uncertainty in the standard deviation of these delay dis-tributions by placing normal priors based on primary studies following the same methodas above (Table A4) For the standard deviation of the generation interval we use patientdata from Feretti et al14 and reproduce their method fitting a Gamma distribution to thegeneration time

Finally the distributional forms of the delay distributions are taken from the above primarystudies

The delay from infection to case confirmation is the sum of the incubation period and thesymptom-onset-to-case-confirmation delay Similarly the delay from infection to death isthe sum of the incubation period and the symptom-onset-to-death delay However forcomputational tractability we model the total delay distributions converting the priors de-scribed above to priors over the total delays We assume that the total delays follow negativebinomial distributions which are computationally tractable and empirically provide a closefit to both sum distributions We place normal priors over the mean and dispersion parame-ters of these total delay distributions by bootstrapping For example we compute priors forthe total infection-to-case-confirmation delay distribution by sampling Nbootstrap = 250 in-

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

cubation periods and symptom-onset-to-case-confirmation distributions For each sampledpair of distributions we draw 106 samples from their sum and fit a negative binomial usingmoment matching We compute the mean and standard deviation of the negative binomialdistribution parameters across the bootstrap and use this for our prior The resulting priorsare given in Table A2

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table A2 Epidemiological parameter distributions used in the model The details on sources and priorsused to generate this Table are given in Tables A3 and A4

Delay type Sources Prior on mean Prior on standard Distributionaldeviation or formdispersion

Infection to case Fonfria et al13 N (1092σ= 094) Dispersion Negativeconfirmation Lauer et al15 N (541σ= 027) binomial

Cereda et al16

Infection to death Fonfria et al13 N (2182σ= 101) Dispersion NegativeLauer et al15 N (1426σ= 518) binomialLinton et al17

Generation interval Fonfria et al13 N (506σ= 033) SD N (211σ= 05) GammaFeretti et al14

Table A3 Mean of epidemiological parameters all from meta-analysis13 and distributional forms

Delay type Reported mean Prior on mean Distributionaland 95 CI form of delay

Incubation period 579 (548ndash611) logmicrosimN (153σ= 0051) Lognormal15

Onset to death 1671 (1537ndash1817) N (1671σ= 075) Gamma17

Onset to case confirmation 582 (474ndash715) N (582σ= 068) Negative binomial16

Generation interval 506 (450ndash570) N (506σ= 033) Gamma14

Table A4 Standard deviation or dispersion of epidemiological parameters

Delay type Source Reported standard deviation or Prior on standarddispersion with uncertainty deviation or

dispersionIncubation period Lauer et al15 SD of log delay SD of log delay

048 (95 CI 027ndash054) N (048σ= 00759)Onset to death Linton et al17 SD 69 (95 CI 52ndash91) N (69σ= 1122)Onset to case confirmation Cereda et al16 Dispersion 157 plusmn 0054 N (157σ= 0054)Generation interval Feretti et al14 SD 211 plusmn 05 (reproduced by us) N (211σ= 05)

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking its predictions on unseendata even if prediction is not the purpose of the model1018 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries we have strong reason to doubt itseffectiveness estimates However we do not expect NPI effectiveness models to extrapolateperfectly Almost always unobserved factors such as changes in the ascertainment rate orIFR spontaneous behaviour changes or unobserved NPIs will affect the observed numberof cases and deaths Our models ought to treat these factors as noise and not attribute theireffects on Rt to the observed NPIs

We use Leave-One-Out Cross-Validation (LOO-CV) fitting the model on 40 countries andextrapolating to the excluded country We repeat this process for all 41 countries In theexcluded country the first 14 days of case and death data are observed to estimate R0c aswell as the initial outbreak sizes N (C )

0c and N (D)0c All later days are unobserved (masked)

The model uses the effectiveness estimates inferred from the 40 included countries and NPIactivation dates to predict the course of the pandemic in the excluded country

Figure B7 visually shows extrapolations obtained from LOO-CV for a randomly chosensubset of six countries (all countries are shown in Figures C18 to C21) The predictiveaccuracy of the model as expected degrades with an increasing prediction horizon sug-gesting the presence of unobserved factors However the predictive credible intervals arewide and increase over time as noise in the growth rate accumulates Importantly ourmodel makes calibrated forecasts in excluded countries (Fig B8) over periods of up to 2months

These are challenging predictions to the best of our knowledge no related study analyseshow their estimated NPI effects generalise to unseen countries Most related studies donot validate predictions on unseen data at all19ndash23 reviewed in9 Flaxman et al1 hold outthe last 14 days from 20 April to 4th of May for all countries in aggregate However thecountries they study implemented all NPIs in March or earlier (Supplementary Table 2 in1)meaning that in fact all countries were used to estimate NPI effects

Note that Fig B8 includes the 6 countries that were used to select the hyperparameter(Appendix A) However the calibration is similar if these 6 countries are excluded (Fig-ure C23)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals the extent to which results depend on uncertain parametersand modelling choices and can diagnose model misspecification and excessive collinearity

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

100

101

102

103

104

105

106

便便

Slovenia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

便 便

Greece

Figure B7 Model extrapolations on a randomly chosen subset of 6 excluded countries produced usingleave-one-out cross-validation In each excluded country the first 14 days of death and case data (soliddots) are observed the model then extrapolates to all future days within the period of analysis (emptydots) For each country we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier see Methods) The shaded areas representthe 95 credible intervals

in the data24 We vary many of the components of our model and recompute the NPIeffectiveness estimates summarised here Further analysis can be found in Appendix C

For each sensitivity analysis we quantify sensitivity using an easily interpretable measureshown above the legend of each plot the average standard deviation denoted σ Firstfor each NPI we compute the standard deviation of its median effect estimates across allexperimental conditions in the given sensitivity analysis We then average these acrossthe NPIs Since effectiveness is expressed in percentage reductions of Rt the unit of σ ispercent Zero percent corresponds to no sensitivity

We perform a multivariate sensitivity analysis on the key epidemiological priors For com-putational reasons all other sensitivity analyses are univariate

Multivariate sensitivity to main epidemiological priors The key epidemiological param-eters in our model describe the delay from infection to case-confirmation the delay from

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

0 20 40 60 80 100Credible Interval

0

20

40

60

80

100

Em

piric

al C

umul

ativ

e Fr

eque

ncy

Calibration Country Holdouts

Figure B8 Calibration in held-out countries with leave-one-out cross validation In each excludedcountry the first 14 days of death and case data are observed to estimate R0c as well as the initialoutbreak sizes the model then extrapolates to all future days within the period of analysis The plotshows the percentage of observed daily case and death counts that lie within the X credible intervalacross all countries and days The identity line represents ideal calibration

infection to death and the generation interval Our model places prior distributions overthe means of these delay distributions Since the priors already account for uncertainty inthe delays this analysis considers what would happen if our prior beliefs changed Sincethe epidemiological parameters may interact in unexpected ways we perform a multivari-ate sensitivity analysis jointly changing the mean of the prior over the mean of each delaydistribution (referred to as the prior mean of the mean below) We vary the prior means ofthe mean delays across a wide but epidemiologically plausible range of 5 values resultingin 125 different experimental conditions We present these results in two ways

First Figure B9 shows the global sensitivity of our results to the prior mean of the mean ofeach distribution individually For example for each setting of the prior mean of the meangeneration interval we show the average over the 5times5 = 25 runs with different prior meansplaced over the mean delays between infection and case confirmationdeath This is equiv-alent to placing a uniform prior across the 125 experiment conditions and marginalisingand is standard methodology for computing global sensitivity to an individual parameter

The prior means seem to mostly influence the relative effectiveness of business closuresand gathering bans Increasing the prior means of the infection-to-case-confirmationdeathmean delays results in larger effectiveness estimates for gatherings bans and smaller esti-

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

mates for business closures while increasing the prior mean of the generation interval hasthe opposite effect

Second we assess the sensitivity of our results to jointly varying the prior means This yieldsσ= 246 We present this result graphically in Appendix C1

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Infection to Case-Confirmation Delay= 112Mean 792 daysMean 942 daysMean 1092 daysMean 1242 daysMean 1392 days

-25 0 25 50 75Average reduction in Rtin the context of our data

Infection to Death Delay= 080Mean 1782 daysMean 1982 daysMean 2182 daysMean 2382 daysMean 2582 days

-25 0 25 50 75Average reduction in Rtin the context of our data

Generation Interval= 149Mean 306 daysMean 406 daysMean 506 daysMean 606 daysMean 706 days

Figure B9 Sensitivity to key epidemiological parameter choices evaluated using a multivariate analysisWe jointly vary the prior means over the mean of the generation interval the delay between infectionand case confirmation and the delay between infection and death Each panel displays sensitivity to theprior mean of one distribution and the NPI effects are computed by averaging over the 25 runs withdifferent prior means of the two other distributions (see text)

Sensitivity to country exclusions Figure B10 shows results if one country at a time isexcluded from the data As there is no strong justification for including or excluding anyparticular country results ought to be stable if a country is excluded

-25 0 25 50 75

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

= 151DefaultAlbaniaAndorraAustriaBelgiumBosnia andHerzegovinaBulgariaCroatia

-25 0 25 50 75

= 151DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-25 0 25 50 75

= 151DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

= 151DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-25 0 25 50 75Average reduction in Rtin the context of our data

= 151DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-25 0 25 50 75Average reduction in Rtin the context of our data

= 151DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure B10 Sensitivity to excluding individual countries

Sensitivity to case-undercounting Figure B11 shows results when scaling the recordednumber of cases in each country First we correct the number of reported cases on each day

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

using estimates of the time-varying ascertainment rate from Golding et al25 For example ifthe estimated ascertainment rate in country c at time t is 50 we double the correspondingnumber of reported cases With this altered case data the model estimates a larger effectfor closing schools and universities and a smaller effect for limiting gatherings to 1000people or less There is little change in the effects estimated for the other NPIs

Second we randomly either double or halve the reported number of cases in each countryThis is intended to demonstrate that the model is invariant to constant differences in ascer-tainment between countries As expected there are negligible differences compared to thedefault results

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Case Undercounting= 163

Time-Varying ScalingRandom Constant ScalingDefault

Figure B11 Sensitivity to case undercounting considering both constant undercounting and time-varying undercounting

Sensitivity to structurally different models Figure B12 shows the NPI effectivenessestimates from models that use only cases or deaths as observations in contrast to ourmain model which uses both As expected there are some differences in the NPI effectestimates between the models which may be caused by the unique biases present in casedata and in death data Differences could also occur if NPIs differentially affected casesand deaths (eg if they affect different age groups) Nevertheless the studyrsquos qualitativeconclusions as outlined in the Discussione hold across the different data types

A number of implicit structural assumptions are made by the choice of model structureWe test sensitivity to these assumptions by evaluating NPI effectiveness estimates from al-ternative models reproducing the structural sensitivity analysis from our concurrent workwhere these models are described and compared in detail9 While these alternative modelstructures are also plausible we chose the model described in the main text as it is relativelysimple whilst also being highly robust9

eThe main qualitative conclusions as outlined in the Discussion are Stay-at-home orders had a small effectMandating mask-wearing had a small effect School and university closures had a large effect Gatherings banswere effective More strict gathering bans were more effective than less strict ones Business closures wereeffective Closing most nonessential businesses had limited benefit over closing just high-risk businesses

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Data Type

= 347Only Case DataOnly Death DataDefault

Figure B12 Sensitivity to different data types ie models using only confirmed cases only deaths orboth

The models are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting Rt into a daily growth rate a renewalprocess is used as the infection model as in earlier work1826ndash28 For computationalreasons we do not place a prior distribution over the generation interval parametersfor this model

3 Noisy-R Model The noise terms ε(middot)ct affect Rt rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on Rt The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

As Figure B13 (left) shows the multiplicative effect models support almost all conclusionsdrawn in the Discussione The only exception is that the stay-at-home order NPI is cate-gorised as moderately effective under the Different Effects Model (median reduction in Rt

of 18)

The results of the Additive Effects Model cannot be directly compared to the other modelssince it expresses results on a different scale (percentage reduction in R0 instead of Rt ) Itis therefore shown in a separate panel However the trend in the results is visually similarto the default results

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs which were implemented nor directly measures broaderbehavioural changes Since these factors influence Rt we must be wary of their effectbeing attributed to observed NPIs We investigate this further by assessing how much effec-

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

Structural Sensitivity Multiplicative Models= 311

Noisy-RDiscrete Renewal

Different EffectsDefault

-25 0 25 50 75Average reduction in Rtas a percentage of R0

in the context of our data

Structural Sensitivity Additive Model

Figure B13 Structural sensitivity analysis NPI effectiveness estimates under different structural as-sumptions Left Effectiveness estimates for multiplicative effect models Note that for computationalreasons the discrete renewal model does not have a prior over the generation interval Right Effective-ness estimates for an additive effect models For this model the effectiveness of each NPI is expressedas a reduction of R0 rather than Rt

tiveness estimates change when previously unobserved factors are included and also whenobserved factors are excluded This is best practice for addressing unobserved factors suchas confounders2930

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs31 The timings of our observed NPIsrsquo implementation dates are indeed mutuallycorrelated prompting the question of how much results change when we make previouslyobserved NPIs unobserved Figure B14 (left) shows NPI effectiveness estimates when pre-viously observed NPIs are excluded in turn The studyrsquos main qualitative conclusionse holdacross all experimental conditions and the variations in NPI effectiveness are rarely largeenough to cause a change in effectiveness category (Figure 5) except for the Gatheringslimited to 1000 people or less NPI Considering that some of the excluded NPIs have strongestimated effects when included and are correlated with other NPIs this degree of robust-ness is encouragingly high It suggests that unobserved factors will not significantly biasresults as long as their effects and their correlations with the studied NPIs do not exceedthose of the studied NPIs We hypothesise that this robustness to unobserved factors is dueto the noise on the daily growth rate in our model9

In addition Figure B14 (right) shows NPI effectiveness estimates when we observe (iecontrol for) additional NPIs taken from the OxCGRT dataset32

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Inclusion of OxCGRT NPIs= 153

Travel ScreenQuarantineTravel BansSymptomatic TestingPublic Information CampaignsInternal Movement LimitedPublic Transport LimitedDefault

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

Exclusion of Collected NPIsUncombined Effects Shown

= 188Mask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay Home OrderDefault

Figure B14 Robustness to unobserved effects Left Results when excluding previously observed NPIsWe exclude one of the NPIs in turn and show the estimates for the other NPIs Note that this subplotshows the marginal effect for hierarchical NPIs rather than the total effect different to other figuresthat show cumulative effects for gathering bans and businesses closures For example Figure 3 displaysthe total effect of closing most nonessential businesses while here we show the additional effect ofclosing most nonessential businesses over just closing some high-risk businesses We show the additionaleffects here because the effect of a cumulative intervention would become undefined when part of it isexcluded from the analysis Right Results when controlling for previously unobserved NPIs We includeone additional NPI in turn and show the estimates for the NPIs in our study (the additional NPI is notshown)

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Appendix C1 Multivariate Sensitivity Analysis - Expanded Results

We now present the results of our joint multivariate sensitivity analysis (previously sum-marised in Figure B9) in which we vary the mean of the prior (ldquoprior meanrdquo) over themean of the generation interval the delay between infection and death and the delaybetween infection and case confirmation

Figure C15 shows for each NPI how NPI effects change when all prior means are variedsimultaneously

We find that the most sensitive NPIs are Gatherings limited to 1000 people or fewer Gath-erings limited to 100 people or fewer and Most nonessential businesses closed However thelargest differences appear when all of the parameters are set to the most extreme valuesthat jointly produce a change in a particular direction We also see systematic trends asthe parameters are varied eg increasing the prior mean of the generation interval meantends to decrease the effectiveness of the Gatherings limited to 1000 or fewer NPI while in-creasing prior means of the infection to deathcase confirmation distribution means tendsto increase the effectiveness of this NPI

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

792

942

1092

1242

1392

Mas

kWea

ring

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

GI = 306

-150

-75

00

75

150GI = 406

-150

-75

00

75

150GI = 506

-150

-75

00

75

150GI = 606

-150

-75

00

75

150GI = 706

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

00

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

0

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Som

eBus

s

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Mos

tBus

s

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Educ

at

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

792

942

1092

1242

1392

Stay

Hom

e

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

Figure C15 Global sensitivity analysis Each cell shows the difference between the median effectivenessestimate under a specific experimental condition and the default condition where the estimates areexpressed as percentage reduction in Rt Each subplot represents a particular prior mean of the meangeneration interval and an NPI The NPIs vary top to bottom and the generation interval prior meanincreases left to right Each cell with a subplot represents a particular choice of the prior mean of themean infection to death delay (increasing left to right) and the prior mean of the mean infection to caseconfirmation delay (increasing bottom to top) Positive cell values indicate that the median NPI estimateis larger than with default settings and vice versa

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C2 Sensitivity to additional epidemiological assumptions

For completeness Figure C16 shows sensitivity to two further epidemiological assumptionsOn the left we show the dependence of NPI effectiveness estimates on R0 We place aNormal prior over R0 and a hyperprior over its variance reflecting the wide disagreementof regional estimates of R0 In the sensitivity analysis we vary the mean of the prior froma low value (238) to a high one (428) As expected higher mean R0 values result in largerNPI effects This trend is apparent in all NPIs but stronger for NPIs that tended to beimplemented early on (like school closures and gathering bans)

On the right we vary the prior over NPI effectiveness Our default prior on NPI effectivenessis assymmetric reflecting a belief that NPIs are more likely to reduce Rt than increase itHere we additionally test a symmetric Normal prior and a one-sided Half-Normal priorwhich does not allow for NPIs to increase Rt

Appendix C3 Sensitivity to data preprocessing

When the case count is small a large fraction of cases may be imported from other coun-tries and the testing regime may change rapidly To prevent this from biasing our modelwe neglect case numbers before a country has reached 100 confirmed cases Similarly weneglect death numbers before a country has reached 10 deaths Here we vary these thresh-olds Intuitively the minimum cases threshold should be higher than the minimum deathsthreshold

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

R0 prior= 333

[R] = 238[R] = 278

Default[R] = 328[R] = 378[R] = 428

-25 0 25 50 75Average reduction in Rtin the context of our data

NPI Effectiveness Prior= 137

i Normal(0 022)i Half Normal(022)

Default

Figure C16 Sensitivity to additional epidemiological priors Left Sensitivity to the prior mean on R0Right Sensitivity to the NPI effectiveness prior

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Minimum Cases Threshold= 137

3050Default (100)200300

-25 0 25 50 75Average reduction in Rtin the context of our data

Minimum Deaths Threshold= 102

15Default (10)3050

Figure C17 Data preprocessing sensitivity Left Variations in the minimum number of cases RightVariations in the minimum number of deaths

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Validation by predicting unseen data - all countries

Here we show the country-level plots of the single-country holdout experiments from Ap-pendix B1 We use leave-one-out cross-validation fitting the model on 40 countries andshowing its predictions on the excluded country We repeat this process for all 41 countriesIn the excluded country the first 14 days (filled dots) of death and case data are observedto allow roughly inferring R0 These days are not used to infer NPI effectiveness All laterdays are masked (unfilled dots) The activation dates of NPIs are also given and the modeluses the effectiveness estimates inferred from the 40 other countries

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

北便

France

Figure C18 Predictions on excluded countries

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

北 便

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

Malta

Figure C19 Predictions on excluded countries

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

Slovenia

Figure C20 Predictions on excluded countries

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Switzerland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

United Kingdom

Figure C21 Predictions on excluded countries Vertical lines show the activation (or inactivation) datesof NPIs Shaded areas are 95 credible intervals Blue and red dots show the observed confirmed casesand deaths while blue and red lines show the median model estimates of cases (Ct ) and deaths (Dt )Empty dots are not observed by the model For each country we show the full window of analysis (fromthe start of the epidemic until the first NPI was lifted or the 30th of May 2020 whichever was earliersee Methods)

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Posterior predictive distributions

The posterior predictive distribution (Figure C22) shows the predicted number of cases anddeaths after observing the data Although these curves can be called lsquofitsrsquo the degree of fitto the data must be interpreted with great care The fit is generally tight but this is partlydue to the inferred latent noise variables ε(C )

t and ε(D)t Inferring this latent noise allows

the posterior predictive distribution to closely match the data without overfitting the effec-tiveness parameters to the data Such behavior is common in Bayesian models which oftenperfectly interpolate the data without overfitting33 The noise terms can account for peri-ods where infections grew faster or slower than predicted based solely on the active NPIsIn such periods the noise may account for changes in testing reporting and unobservedinterventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Daily DeathsDaily Confirmed Cases

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C22 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 Calibration without countries used for hyperparameter selec-tion

0 20 40 60 80 100Credible Interval

0

20

40

60

80

100E

mpi

rical

Cum

ulat

ive

Freq

uenc

y

Calibration Country Holdouts

Figure C23 Calibration in held-out countries with leave-one-out cross-validation Here we include onlythose 35 countries that were not used to select the hyperparameter We show the first 14 days of casesand deaths in each country and then extrapolate to future days within the period of analysis The plotshows the percentage of observed daily case and death counts that lie within the X credible intervalacross all countries and days

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 MCMC stability results

1000 1002 10040

1000

2000

3000

4000

R

0 1 2 30

500

1000

1500

2000

Relative ESS

Figure C24 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size A value of 1 indicates perfect decorrelation between samplesValues above (below) 1 indicate that the effective number of samples is higher (lower) than the actualnumber of samples due to negative (positive) correlation respectively

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D5 Estimated values for R0 by country The parenthesis give the 95 credible interval which isoften wide The mean R0 across countries is 33

Country Estimated R0 Country Estimated R0

Albania 348 (275431) Lithuania 323 (248403)Andorra 275 (21434) Malaysia 296 (236361)Austria 31 (25377) Malta 327 (248414)Belgium 36 (302427) Mexico 399 (32748)Bosnia and Herzegovina 331 (259406) Morocco 359 (299427)Bulgaria 375 (304457) Netherlands 316 (26375)Croatia 342 (27418) New Zealand 241 (17731)Czech Republic 346 (276425) Norway 273 (215338)Denmark 28 (22347) Poland 397 (32482)Estonia 291 (229361) Portugal 355 (292425)Finland 279 (221343) Romania 401 (335475)France 331 (28389) Serbia 38 (308461)Georgia 356 (281439) Singapore 295 (238356)Germany 285 (231346) Slovakia 353 (271445)Greece 321 (261388) Slovenia 299 (23373)Hungary 403 (332482) South Africa 447 (377527)Iceland 171 (113242) Spain 36 (306423)Ireland 368 (309433) Sweden 229 (176288)Israel 379 (309459) Switzerland 289 (234351)Italy 336 (288391) United Kingdom 32 (267378)Latvia 301 (233376)

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 Collinearity

Appendix D21 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the inclusion or exclusion ofthese countries in the dataset (Figure D25) If we included all countries we would con-clude that university closures were more effective than school closures (black markers inFigure D25) However if we excluded Iceland or Sweden we would conclude that theywere roughly equally effective As there is no strong justification for including or excludingone particular country we cannot meaningfully disentangle the effects of school and uni-versity closures However there is much more data to determine the joint effect and it isindeed much more stable (Figure D25)

-25 0 25 50 75 100Average reduction in Rtin the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D25 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown

Appendix D22 Co-occurrence of NPIs

Table D6 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or neither) Note thatwe do not show the traditional collinearity statistics (variance inflation factors and datacorrelations) since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D6 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D23 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations24 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus042 between closing schools anduniversities and closing some businesses (Figure D26) The weak correlations are oneindicator that collinearity is manageable with our dataset

Effect on NPI combinations To better understand posterior correlations we visualizetheir effect in hosted video files As we condition on different values for one NPI we cansee that the estimates of other NPIs change only slightly always staying well within thecredible intervals in Figure 3 The significance of posterior correlations is small enough thatit is possible to calculate a reasonable approximation to the mean effect of a set of NPIs bysimply combining the mean percentage reductions for each individual NPI (eg two 50reductions lead to a 75 reduction) For example this approximation leads to a joint effectof 75 for all NPIs together which closely matches the exact mean joint effect of 77

Videos are available online here

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D26 Posterior correlations between effectiveness parameters αi

Appendix D3 Posterior Epidemiological Parameter Distributions

Figure D27 shows the posterior distributions for variables describing the key delay distribu-tions in our model the generation interval the delay between infection and case confirma-tion and the delay between infection and death We find that the posterior distributions ofthese parameters are somewhat tighter than their priors but they still explore a wide rangeof values This suggests that the data provides evidence albeit weak about these delaydistributions (for example from visual inspection the data would show that the delay islonger for deaths than for cases) An exception is the dispersion of the infection to deathdelay distribution which has a very wide prior

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

5 6

0

1

dens

ity

Generation Interval Mean

posteriorprior

200 225

00

05

Infection to Death Delay Mean

100 125

00

05

Infection to Case-Confirmation Delay Mean

0 2 4

value

00

05

dens

ity

Generation Interval std

10 20

value

00

01

Infection to Death Delay disp

5 6

value

0

1

Infection to Case-Confirmation Delay disp

Figure D27 Posterior distributions over key epidemiological parameters describing the generation in-terval and the delays between infection and case confirmationdeath

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect morethan three-quarters of the population) We thus miss NPIs which were only implementedregionally For example a few regions in Germany implemented stay-at-home orders butmost did not Thus Germany is listed as no stay-at-home order in our data Additionallyour NPI definitions were not perfectly granular For example a gathering ban on gatheringsof gt15 people and a ban on gatherings of gt60 people would both fall under the NPIGatherings limited to 100 people or less despite likely having different effects on Rt Finally while we included more NPIs than previous work (Table F7) there are many NPIsfor which we were not able to collect enough high-quality data for our modeling such aspublic cleaning or changes to public transportation

Of the 41 countries in our dataset 33 are in Europe As a result the NPI effectivenessestimates may be biased towards effects in Europe and NPI effectiveness may have beendifferent in other parts of the world

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on Rt is constantacross countries and time However the exact implementation and adherence of each NPIis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa limited degree Additionally different countries have different cultural norms and ageprofiles affecting the degree to which a particular intervention is effective For example acountry where a higher proportion of the population is in education will likely experience alarger effect from a government order to close schools and universities Our estimates thusshould be adjusted to local circumstances To address differences between countries ourstructural sensitivity analysis includes a model where each NPI can have a different effectper country (Appendix B2) The average effectiveness estimates across countries in thismodel match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponential de-cline would have the same effect on observations as an unobserved NPI Consequently wecannot quantify its effect on our results (though the sensitivity analyses look reassuring)

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expectthat closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g t and the reproductive number Rt holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso report results from a renewal process model8 that does not depend on this assumptionand we find similar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix F Overview of previous work

Table F7 Data-driven multi-country multi-NPI studies of the effectiveness of observed (as opposed tohypothetical) NPIs in reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Banholzer etal 202023

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Chen andQiu 202021

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Islam et al202020

School closuresworkplace closuresrestrictions on massgatherings publictransport closure

lockdown

149 countries or regionsInterrupted timeseries regression

Liu et al202022

13 NPIs from theOXCGRT dataset

130 countries and regions panel regression

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F8 Some data-driven studies of the effectiveness of observed NPIs in reducing the transmissionof COVID-19 which study a single-country andor a single-NPI

Study NPIs studiedRegionscountries

studiedMethod

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dandekarand

Barbastathis202035

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Dehning etal 202036

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Gatto et al202037

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Hsiang et al202019

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US (each

country modelledindividually)

Linear regression onestimated growth

rates

Jarvis et al202038

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelKraemer etal 202039

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202040 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) model

Lai et al202041

Case detection andisolation travel

restrictions contactreductions

ChinaTravel-network-based

SEIR model

Continued on next page

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F8 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Lorch et al202042

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Maier andBrockmann

202043

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Orea andAacutelvarez202044

Lockdown SpainSpatial econometric

analysis

Quilty et al202045

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

Sears et al202046

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Siedner etal 202047

General socialdistancing

USInterruptedtime-series

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F7) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

61

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

62

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all ldquotrade activitiesrdquo (except within shopping malls) service jobsthat donrsquot involve physical contact museums libraries and galleries opened Sincethe criteria regarding ldquoall retail stores being openrdquo was met we counted April 27 asthe end date for ldquoMost nonessential businesses closedrdquo

63

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H All model equations

This section will mainly be of interest to readers that wish to re-implement the model

Variables are indexed by NPI i country c and day t All prior distributions are independent

Data

1 NPI Activations φi t c isin 012 Observed (Daily) Cases Ct c 3 Observed (Daily) Deaths D t c

Prior Distributions

1 Country-specific R0 R0c sim Normal(325κ) κsim Half Normal(micro= 0σ= 05)2 NPI effectiveness αi sim Asymmetric Laplace(m = 0κ= 05λ= 10) m is the location

parameter κgt 0 is the asymmetry parameter and λgt 0 is the scale parameter3 Infection Initial Counts

N (C )0c = exp(ζ(C )

c )

N (D)0c = exp(ζ(D)

c )

ζ(C )c sim Normal(micro= 0σ= 50)

ζ(D)c sim Normal(micro= 0σ= 50)

4 Observation Noise Dispersion Parameters

Ψcases sim Half Normal(micro= 0σ= 5) (H1)

Ψdeaths sim Half Normal(micro= 0σ= 5) (H2)

Hyperparameters

1 Growth Noise Scale σg = 02

Delay Distributions

1 Generation interval distribution1314

microGI sim Normal(micro= 506σ= 03265)

σGI sim Normal(micro= 211σ= 05)

64

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

2 Time from infection to case confirmation T (C )131516f

microinfrarrconf sim Normal(micro= 1092σ= 094)

Ψinfrarrconf sim Normal(micro= 541σ= 027)

This distribution is converted into a forward-delay vector

T (C )[t ] =

1ZC

Negative Binomial(t micro=microinfrarrconfα=Ψinfrarrconf) t lt 32

0 otherwise

with ZC =31sum

t prime=0Negative Binomial(t primemicro=microinfrarrconfα=Ψinfrarrconf)

ie the delay follows a truncated and normalised negative binomial distribution3 Time from infection to death T (D)f131517

microinfrarrdeath sim Normal(micro= 2182σ= 101)

Ψinfrarrdeath sim Normal(micro= 1426σ= 518)

This distribution is converted into a forward-delay vector

T (D)[t ] =

1ZD

Negative Binomial(t micro=microinfrarrdeathα=Ψinfrarrdeath) t lt 48

0 otherwise

with ZD =47sum

t prime=0Negative Binomial(t primemicro=microinfrarrdeathα=Ψinfrarrdeath)

ie the delay follows a truncated and normalised negative binomial distribution

Infection Model

Rt c = R0c middotexp

(minus

Isumi=1

αi φi t c

) where I is the number of NPIs

αGI =microGI

σ2GI

βGI =micro2

GI

σ2GI

g t c = exp

(βGI(R

1αGIct minus1)

)minus1

fα in the definition of the Negative Binomial distribution is the dispersion parameter Larger values of αcorrespond to a smaller variance and less dispersion With our parameterisation the variance of the Negative

Binomial distribution is micro+ micro2

α

65

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

]

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise

ε(C )τc sim Normal(micro= 0σ=σg )

ε(D)τc sim Normal(micro= 0σ=σg )

Observation Modelf

Ct c =31sumτ=0

N (C )tminusτcT

(C )[τ]

D t c =47sumτ=0

N (D)tminusτcT

(D)[τ]

Ct c sim Negative Binomial(micro= Ct c α=Ψcases)

D t c sim Negative Binomial(micro= D t c α=Ψdeaths)

66

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix I References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Sam Abbott et al ldquoEstimating the time-varying reproduction number of SARS-CoV-2using national and subnational case countsrdquo In Wellcome Open Research 5112 (2020)p 112

3 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

4 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

5 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

6 Sheikh Taslim Ali et al ldquoSerial interval of SARS-CoV-2 was shortened over time bynonpharmaceutical interventionsrdquo In Science 3696507 (2020) pp 1106ndash1109

7 Xingjie Hao et al ldquoReconstruction of the full transmission dynamics of COVID-19 inWuhanrdquo In Nature 5847821 (Aug 2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the robustness of effectiveness estimation of nonpharma-ceutical interventions against COVID-19 transmissionrdquo In arXiv preprint arXiv200713454(2020) URL httpsarxivorgabs200713454

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

12 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

13 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

14 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

67

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

15 Stephen A Lauer et al ldquoThe incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases estimation and applicationrdquo In Annals ofinternal medicine 1729 (2020) pp 577ndash582

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

19 Solomon Hsiang et al ldquoThe effect of large-scale anti-contagion policies on the COVID-19 pandemicrdquo In Nature 5847820 (2020) pp 262ndash267

20 Nazrul Islam et al ldquoPhysical distancing interventions and incidence of coronavirusdisease 2019 natural experiment in 149 countriesrdquo In bmj 370 (2020)

21 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo In Covid Economics 7 (Apr 2020) pp 46ndash67

22 Yang Liu et al ldquoThe impact of non-pharmaceutical interventions on SARS-CoV-2 trans-mission across 130 countries and territoriesrdquo In medRxiv (2020)

23 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

24 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

25 Nick Golding et al ldquoReconstructing the global dynamics of under-ascertained COVID-19 cases and infectionsrdquo In medRxiv (2020) DOI 1011012020070720148460eprint httpswwwmedrxivorgcontentearly202007082020070720148460fullpdf URL httpswwwmedrxivorgcontentearly202007082020070720148460

26 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

27 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

28 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

68

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

29 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

30 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

31 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

32 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

33 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

36 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

37 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

38 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

39 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

40 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

41 Shengjie Lai et al ldquoEffect of non-pharmaceutical interventions to contain COVID-19 inChinardquo en In Nature 5857825 (Sept 2020) pp 410ndash413 DOI 101038s41586-020-2293-x

69

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

42 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

43 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

44 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

45 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

46 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

47 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

70

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Testing reporting and infection fatality rates
      • Method for uncertainty over key epidemiological parameters
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Multivariate Sensitivity Analysis - Expanded Results
              • Sensitivity to additional epidemiological assumptions
              • Sensitivity to data preprocessing
              • Validation by predicting unseen data - all countries
              • Posterior predictive distributions
              • Calibration without countries used for hyperparameter selection
              • MCMC stability results
                • Additional results
                  • Estimated R0 by country
                  • Collinearity
                    • The individual effects of school and university closures
                    • Co-occurrence of NPIs
                    • Correlations between effectiveness estimates
                      • Posterior Epidemiological Parameter Distributions
                        • Additional discussion of assumptions and limitations
                          • Limitations of the data
                          • Model limitations
                            • Overview of previous work
                            • Handling edge cases in the data collection
                            • All model equations
                            • References
Page 11: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan

-25 0 25 50 75 100Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spaces

Gatherings limited to1000 people or less

Gatherings limited to100 people or less

Gatherings limited to10 people or less

Some businessesclosed

Most nonessentialbusinesses closed

Schools and universitiesclosed

Stay-at-home order(with exemptions)

EndsTransmission

北 便

i

j

100

100

100 81

10

0 64

100

100 45

27

100 94

78

89

65

88

93

44

29

100

100 82

92

68

91

96

46

28

100

100

100 99

76

97

98

55

30

99

97

86

100 72

95

98

49

27

99

98

91

100

100 99

99

64

30

97

95

84

95

72

100 99

49

29

98

95

81

92

68

93

100 46

28

100

100 98

10

0 94

99

99

100

Frequency i Active Given j Active

0 500 1000 1500 2000 2500 3000

Days

Total Days Active

Figure 3 Top NPI effects under default model settings The Figure shows the average percentagereductions in R as observed in our data (or in terms of the model the posterior marginal distributionsof 1minusexp(minusαi )) with median 50 and 95 credible intervals A negative 1 reduction refers to a 1increase in R Cumulative effects are shown for hierarchical NPIs (gathering bans and business closures)ie the result for Most nonessential businesses closed shows the cumulative effect of two NPIs withseparate parameters and symbols - closing some (high-risk) businesses and additionally closing mostremaining (non-high-risk but nonessential) businesses given that some businesses are already closedBottom Left Conditional activation matrix Cell values indicate the frequency that NPI i (x-axis) wasactive given that NPI j (y-axis) was active Eg schools were always closed whenever a stay-at-homeorder was active (bottom row third column from the right) but not vice versa Bottom Right Totalnumber of days each NPI was active across all countries

11

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Most businessesclosed

Stay-at-homeorder

Mask wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businessesclosed

Schools anduniversities closed

Figure 4 Combined NPI effectiveness for the most common sets of NPIs in our data by size of the NPI setShaded regions denote 50 and 95 credible intervals Left Maximum R0 that could be reduced to be-low 1 for each set of NPIs Right Predicted R after implementation of each set of NPIs assuming R0 = 38Readers can interactively explore the effects of all sets of NPIs at httpepidemicforecastingorgcalc

tive to variations in the data and model parameters22 High sensitivity prevented Flaxmanet al1 who had a smaller dataset from disentangling NPI effects9 Our estimates are sub-stantially less sensitive (see next section) Finally the posterior correlations between theeffectiveness estimates are weak suggesting manageable collinearity (Appendix D23)

Although the correlations between the individual estimates are weak we should take theminto account when evaluating combined NPI effects For example if two NPIs frequentlyco-occurred there may be more certainty about the combined effect than about the twoindividual effects Figure 4 shows the combined effectiveness of the sets of NPIs thatare most common in our data Together our set of NPIs reduced R by 77 (74ndash79)on average Across countries the mean R without any NPIs (ie the R0) was 33 (Ta-ble D5 reports R0 for all countries) Starting from this number the estimated R likely couldhave been reduced below 1 by closing schools and universities high-risk businesses andlimiting gathering sizes Readers can interactively explore the effects of sets of NPIs athttpepidemicforecastingorgcalc A CSV file containing the joint effectiveness of all NPIcombinations is available online here

Sensitivity and validation

We perform a range of validation and sensitivity experiments (Appendix B with furtherexperiments in Appendix C) First we analyse how the model extrapolates to unseen coun-tries and find that it makes calibrated forecasts over periods of up to 2 months with un-certainty increasing over time Further we perform multiple sensitivity analyses studyinghow results change if we modify the priors over epidemiological parameters exclude coun-tries from the dataset use only deaths or confirmed cases as observations vary the datapreprocessing and more Finally we investigate our key assumptions by showing results

12

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

for several alternative models (structural sensitivity10) and examine possible confoundingof our estimates by unobserved factors influencing R In total we consider 201 alternativeexperimental conditions

Figure 5 (left) shows the median NPI effectiveness across these 201 experimental condi-tions Compared to the results under our default settings (Figures 3 and 4) median NPIeffects vary under alternative plausible experimental conditions However the trends in theresults are robust and some NPIs outperform others under all tested conditions While wetest over large ranges of plausible values our experiments do not include every possiblesource of uncertainty and the results might change more substantially under experimentalconditions we have not tested

We categorise NPI effects into small moderate and large which we define as a medianreduction in R of less than 175 between 175 and 35 and more than 35 (verticallines in Figure 5) Six of the NPIs fall into a single category in a large fraction of experi-mental conditions school and university closures are associated with a large effect in 99of experimental conditions limiting gatherings to 10 people or less in 94 Closing mostnonessential businesses has a moderate effect in 96 of conditions limiting gatherings to100 people or less in 97 Making mask-wearing mandatory in (some) public spaces fallsinto the ldquosmall effectrdquo category in 100 of experimental conditions issuing stay-at-homeorders (with exemptions) in 99 Two NPIs fall less clearly into one category Closing some(high-risk) businesses has a moderate effect in 86 of conditions and limiting gatherings to1000 people or less has a small effect in 79 However both NPIs have small-to-moderateeffects in more than 99 of experimental conditions The effect of limiting gatherings to1000 people or less is the least stable across the sensitivity analyses which may reflect itsaforementioned partial collinearity with limiting gatherings to 100 people or less

Aggregating all sensitivity analyses can hide sensitivity to specific assumptions We displaythe median NPI effects in four categories of sensitivity analyses (Figure 5 right) and eachindividual sensitivity analysis is shown in the Appendix The trends in the results are alsostable within the categories

Discussion

We use a data-driven approach to estimate the effects that eight nonpharmaceutical in-terventions had on COVID-19 transmission in 41 countries between January and the endof May 2020 We find that several NPIs were associated with a clear reduction in R inline with the mounting evidence that NPIs can be effective at mitigating and suppressingoutbreaks of COVID-19 Furthermore our results suggest that some NPIs outperformedothers While the exact effectiveness estimates vary with modelling assumptions the broadconclusions discussed below are largely robust across 201 experimental conditions in 11sensitivity analyses

13

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

0 25 50Median reduction in Rt

Mask-wearing mandatory in(some) public spacesGatherings limited to1000 people or lessGatherings limited to100 people or lessGatherings limited to10 people or lessSome businessesclosedMost nonessentialbusinesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

All Sensitivity Analyses (201 conditions)北便

Model Structure(5 conditions)

北便

Data and Preprocessing(51 conditions)

0 25 50Median reduction in Rt

北便

Epidemiological Priors(131 conditions)

0 25 50Median reduction in Rt

北便

Unobserved Factors(14 conditions)

Figure 5 Median NPI effects across the sensitivity analyses Left Median NPI effects (reduction in R)when varying different components of the model or the data in 201 experimental conditions Results aredisplayed as violin plots using kernel density estimation to create the distributions Inside the violinsthe box plots show median and interquartile-range The vertical lines mark 0 175 and 35 (seetext) Right Categorised sensitivity analyses Structural Using only cases or only deaths as observations(2 experimental conditions Figure B12) varying the model structure (3 conditions Figure B13 left)Data Leaving out one country at a time (41 conditions Figure B10) varying the threshold below whichcases and deaths are masked (8 conditions Figure C17) sensitivity to correcting for undocumentedcases and to country-level differences in case ascertainment (2 conditions Figure B11) Epidemiologicalpriors Jointly varying the means of the priors over the means of the generation interval the infection-to-case-confirmation delay and the infection-to-death delay (125 conditions Figure C15) varying theprior over R0 (4 conditions Figure C16 left) varying the prior over NPI effectiveness (2 conditionsFigure C16 right) Unobserved factors Excluding observed NPIs one at a time (9 conditions Figure B14left) controlling for additional NPIs from a different dataset (5 conditions Figure B14 right)

14

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Business closures and gathering bans both seem to have been effective at reducing COVID-19 transmission Closing only high-risk businesses appears to have been only somewhat lesseffective than closing most nonessential businesses the median reduction in R differs onlyby 9 (6ndash12 mean and 95 interval of median estimates across the experiments set-tings in Figure 5) Closing only high-risk businesses may thus have been the more promisingpolicy option in some circumstances Limiting gatherings to 10 people or less was more ef-fective than limits up to 100 or 1000 people This may reflect the fact that small gatheringsare common

As previously discussed we estimate the average effect each NPI had in the contexts inwhich it was implemented When countries introduced stay-at-home orders they nearly al-ways also banned gatherings and closed schools universities and some or most businessesif they had not done so already (Figure 3 bottom left) Flaxman et al1 and Hsiang et al3

add the effect of these distinct NPIs to the effectiveness of stay-at-home orders and accord-ingly find a large effect In contrast we account for these other NPIs separately and isolatethe additional effect of ordering the population to stay at home (when large gatheringsare banned and educational institutions and some businesses closed)d In accordance withother studies that took this approach we find a small effect26 A typical country could havereduced R to below 1 without a stay-at-home order (Figure 4) provided other NPIs wereimplemented

Mandating mask-wearing in various public spaces had no clear effect on average in thecountries we studied This does not rule out mask-wearing mandates having a larger ef-fect in other contexts In our data mask-wearing was only mandated when other NPIshad already reduced public interactions When most transmission occurs in private spaceswearing masks in public is expected to be less effective This might explain why a largereffect was found in studies that included China and South Korea where mask-wearingwas introduced earlier823 While there is an emerging body of literature indicating thatmask-wearing can be effective in reducing transmission the bulk of evidence comes fromhealthcare settings24 In non-healthcare settings risk compensation25 may play a largerrole potentially reducing effectiveness While our results cast doubt on reports that mask-wearing is the main determinant shaping a countryrsquos epidemic23 the policy still seemspromising given all available evidence due to its comparatively low economic and socialcosts Its effectiveness may have increased as other NPIs have been lifted and public inter-actions have recommenced

We find a large effect for school and university closures This finding is remarkably robustacross different model structures variations in the data and epidemiological assumptions(Figure 5) It remains robust when controlling for NPIs excluded from our study (Fig-ure B14) Our approach cannot distinguish direct and indirect effects such as forcing par-

dIn principle a stay-at-home order does not imply these other NPIs It is possible for a country to simultaneouslyhave allowed schools to be open and have a stay-at-home order (with exemptions for attending school) inplace However this is not how stay-at-home orders were implemented by the countries in our dataset andwe cannot estimate the effect of such a hypothetical stay-at-home order from the available data

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ents to stay at home or causing broader behaviour changes by increasing public concernAdditionally since school and university closures almost perfectly coincided in the coun-tries we study our approach cannot distinguish their individual effects (Appendix D21)This limitation likely also holds for other observational studies which do not include dataon university closures and estimate only the effect of school closures1ndash35ndash8 Previous evi-dence on school and university closures is mixed1626 Early data suggest that children andyoung adults are equally susceptible to infection but have a notably lower observed inci-dence rate than older adultsmdashwhether this is due to school and university closures remainsunknown27ndash30 Although infected young people are often asymptomatic they appear toshed similar amounts of virus as older people3132 and might therefore transmit the infec-tion to higher-risk demographics unknowingly As the role of children in transmission isstill unclear33 while outbreaks detected in schools are rising33ndash35 this topic merits carefulattention

Our study has several limitations First NPI effectiveness may depend on the context ofimplementation such as the presence of other NPIs and country-specific factors Our esti-mates must be interpreted as the average effectiveness over the contexts in our dataset10and expert judgement is required to adjust them to local circumstances Second R mayhave been reduced by unobserved NPIs or spontaneous behaviour changes To investigatewhether these reductions could be falsely attributed to the observed NPIs we perform sev-eral additional analyses and find that our results are stable to a range of unobserved effects(Appendix B3) However this sensitivity check cannot provide certainty Investigating therole of unobserved effects is an important topic to explore further Third our results can-not be used without qualification to predict the effect of lifting NPIs For example closingschools and universities seems to have greatly reduced transmission but this does not meanthat reopening them will cause infections to soar Educational institutions can implementsafety measures such as reduced class sizes as they reopen Further work is needed to anal-yse the effects of reopenings we hope that our collected data aids this effort Fourth whilewe included more NPIs than most previous work (Table F7) several promising NPIs wereexcluded For example testing tracing and case isolation may be an important part of acost-effective epidemic response36 but were not included because it is difficult to obtaincomprehensive data We discuss further limitations in Appendix E

Although our work focuses on estimating the impact of NPIs on the reproduction numberR the ultimate goal of governments may be to reduce the incidence prevalence and excessmortality of COVID-19 Controlling R is essential but the contribution of NPIs towards thesegoals may also be mediated by other factors such as their duration and timing37 periodicityand adherence3839 and successful containment40 While each of these factors addressestransmission within individual countries it can be crucial to additionally synchronise NPIsbetween countries since cases can be imported41

In conclusion as governments around the world seek to keep R below 1 while minimisingthe social and economic costs of their interventions we hope that our results can informpolicy decisions on which NPIs to implement in any potential further wave of infectionsAdditionally our work may provide insight on which areas of public life are most in need

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

of restructuring so that they can continue despite the pandemic However our estimatesshould not be seen as the final word on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies simulation studiesexperimental trials and clinical experience

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

The paid contractor work helping with the data collection the development of the interac-tive website and the costs for cloud compute were funded by the Berkeley Existential RiskInitiative

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier V Mikulik MAHartwick L Chindelevitch designed the model and modelling experiments M Sharma ABStephenson T Gavenciak J Salvatier performed and analysed the modelling experimentsJ Kulveit T Gavenciak JM Brauner conceived the research S Mindermann M SharmaJM Brauner L Chindelevitch J Kulveit T Besiroglu did the literature search JM BraunerS Mindermann M Sharma G Leech L Chindelevitch T Besiroglu V Mikulik wrote themanuscript All authors read and gave feedback on the manuscript and approved the finalmanuscript JM Brauner S Mindermann and M Sharma contributed equally L Chindele-vitch Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

3 Solomon Hsiang et al ldquoThe effect of large-scale anti-contagion policies on the COVID-19 pandemicrdquo In Nature 5847820 (2020) pp 262ndash267

4 Shengjie Lai et al ldquoEffect of non-pharmaceutical interventions to contain COVID-19 inChinardquo en In Nature 5857825 (Sept 2020) pp 410ndash413 DOI 101038s41586-020-2293-x

5 Yang Liu et al ldquoThe impact of non-pharmaceutical interventions on SARS-CoV-2 trans-mission across 130 countries and territoriesrdquo In medRxiv (2020)

6 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

7 Nazrul Islam et al ldquoPhysical distancing interventions and incidence of coronavirusdisease 2019 natural experiment in 149 countriesrdquo In bmj 370 (2020)

8 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo In Covid Economics 7 (Apr 2020) pp 46ndash67

9 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

10 Mrinank Sharma et al ldquoOn the robustness of effectiveness estimation of nonpharma-ceutical interventions against COVID-19 transmissionrdquo In arXiv preprint arXiv200713454(2020) URL httpsarxivorgabs200713454

11 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

12 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

13 Ensheng Dong Hongru Du and Lauren Gardner ldquoAn interactive web-based dashboardto track COVID-19 in real timerdquo In The Lancet Infectious Diseases 205 (May 2020)pp 533ndash534 DOI 101016s1473-3099(20)30120-1

14 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

15 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

16 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

17 Sam Abbott et al ldquoEstimating the time-varying reproduction number of SARS-CoV-2using national and subnational case countsrdquo In Wellcome Open Research 5112 (2020)p 112

18 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

19 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

20 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Young Joon Park et al ldquoContact tracing during coronavirus disease outbreak SouthKorea 2020rdquo In Emerg Infect Dis 2610 (2020)

28 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

29 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

30 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

31 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

32 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

33 Chen Stein-Zamir et al ldquoA large COVID-19 outbreak in a high school 10 days afterschoolsrsquo reopening Israel May 2020rdquo In Eurosurveillance 2529 (2020) p 2001352

34 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 38 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile919676Weekly_COVID19_Surveillance_Report_week_38_FINAL_UPDATEDpdf 2020

35 Meira Levinson Muge Cevik and Marc Lipsitch Reopening Primary Schools during thePandemic 2020

36 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

37 K Prem et al ldquoThe effect of control strategies to reduce social mixing on outcomes ofthe COVID-19 epidemic in Wuhan China a modelling studyrdquo In Lancet Public Health5 (5 May 2020) e261ndashe270

38 Patrick G T Walker et al ldquoThe impact of COVID-19 and strategies for mitigationand suppression in low- and middle-income countriesrdquo In Science 3696502 (2020)pp 413ndash422

39 Nicholas G Davies et al ldquoEffects of non-pharmaceutical interventions on COVID-19cases deaths and demand for hospital services in the UK a modelling studyrdquo In TheLancet Public Health 57 (2020) e375ndashe385

40 Xingjie Hao et al ldquoReconstruction of the full transmission dynamics of COVID-19 inWuhanrdquo In Nature 5847821 (Aug 2020)

41 N W Ruktanonchai et al ldquoAssessing the impact of coordinated COVID-19 exit strate-gies across Europerdquo In Science (2020) DOI 101126scienceabc5096

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 23

Appendix A1 Detailed model description 23

Appendix A2 Testing reporting and infection fatality rates 26

Appendix A3 Method for uncertainty over key epidemiological parameters 27

Appendix B Validation 30

Appendix B1 Unseen data 30

Appendix B2 Sensitivity Analysis 30

Appendix B3 Robustness to unobserved effects 35

Appendix C Additional sensitivity analyses and validation 38

Appendix C1 Multivariate Sensitivity Analysis - Expanded Results 38

Appendix C2 Sensitivity to additional epidemiological assumptions 40

Appendix C3 Sensitivity to data preprocessing 40

Appendix C4 Validation by predicting unseen data - all countries 42

Appendix C5 Posterior predictive distributions 47

Appendix C6 Calibration without countries used for hyperparameter selection 48

Appendix C7 MCMC stability results 49

Appendix D Additional results 50

Appendix D1 Estimated R0 by country 50

Appendix D2 Collinearity 51

Appendix D3 Posterior Epidemiological Parameter Distributions 53

Appendix E Additional discussion of assumptions and limitations 55

Appendix E1 Limitations of the data 55

Appendix E2 Model limitations 55

Appendix F Overview of previous work 57

Appendix G Handling edge cases in the data collection 60

Appendix H All model equations 64

Appendix I References 67

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections N (sdot)tc

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Mask wearing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closedSchools closed

Stay-at-home order

Growth reductions from interventions αi i

Daily growth rate gtc

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Uncertain delay from infection to

confirmation death

Uncertain generation interval

Universities closed

Figure A6 Model Overview Purple nodes are observed We describe the diagram from bottom to topThe effectiveness of NPI i is represented by αi which is independent of the country On each day t a countryrsquos daily reproduction number Rt c depends on the countryrsquos basic reproduction number R0cand the active NPIs The active NPIs are encoded by Φi t c which is 1 if NPI i is active in country c attime t and 0 otherwise Rt c is transformed into the daily growth rate gt c using the generation intervalparameters and subsequently is used to compute the new infections N (C )

t c and N (D)t c that will turn into

confirmed cases and deaths respectively Finally the number of new confirmed cases Ct c and deathsDt c is computed by a discrete convolution of N (middot)

t c with the respective delay distributions Our modeluses both death and case data it splits all nodes above the daily growth rate gt c into separate branchesfor deaths and confirmed cases We account for uncertainty in the generation interval infection-to-case-confirmation delay and the infection-to-death delay by placing priors over the parameters of thesedistributions

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1but with several key extensions First we model both confirmed cases and deaths al-lowing us to leverage significantly more data Furthermore we do not assume a specificinfection fatality rate (IFR) since we do not aim to infer the total number of COVID-19 in-fections Additionally since epidemiological parameters are only known with uncertaintywe place priors over them following recent best practice2 Concretely we place priors onthe means and standard deviationsdispersions of the generation interval the infection-to-confirmation delay and the infection-to-death delay These prior choices are detailedin Appendix A3 The end of this section details further adaptations which allow us to

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

minimise assumptions about testing reporting and the IFR Code is available online hereFor readers who wish to re-implement the model a concise list of all equations is given inAppendix H

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior distribution over R0c reflecting the wide disagreement of regionalestimates of R0

3 The mean R0c is 328 based on a meta-analysis4 We parameterize theeffectiveness of NPI i assumed to be same across countries and time with αi Each NPI isassumed to have an independent multiplicative effect on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We place an Asymmetric Laplace prior over αi with scale parameter10 asymmetry parameter 05 and location parameter 0 Our prior allows for (unbounded)positive and negative effects as we cannot a priori exclude the possibility that the introduc-tion of an NPI increases R However our prior places 80 of its mass on positive effectsreflecting a belief that NPIs are much more likely to reduce Rt than not This is a shrinkageprior placing 50 of its mass on lsquosmallrsquo effectiveness (less than 10 change in Rt ) All priordistributions are independent

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c and Rt c

(Eq 29 in5)

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We account for uncertainty inthe generation interval (GI) assumed to be a Gamma distribution by placing prior distri-butions over its mean and standard deviation (Appendix A3) Using (A2) we can writeg t c as a function g t c (Rt c microGIσGI) (see Appendix H) where microGI is the GI mean and σGI isthe GI standard deviation

aMany epidemiological models define growth rates as the exponent r in an exponential growth function Herewe use daily growth rates instead for ease of exposition These choices are mathematically equivalent Notethat we adapted equation (29) in Wallinga amp Lipsitch5 to account for our choice

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Note that the GI can change over time due to NPIs In particular the effective contact tracingand case isolation in China substantially shortened the mean GI to only 26 days among theidentified cases6 Since most cases were likely not identified even in Wuhan China7 andour study omits most of the countries with highly effective contact tracing programs suchas China or South Korea we model the GI as constant (but uncertain) over time A possibleextension for further work is to model the change in GI explicitly

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are inferred from the observation models for cases and deathsWe assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)] (A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)] (A4)

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross-validation as no reference is available for it We evaluate five dif-ferent values (σN isin 00501020304) with a fixed randomly chosen validation set of6 countries σN = 02 maximises the predictive log-likelihood on the validation set Thisensures a more calibrated model which is less likely to produce overconfident or unstableestimates10 We did not tune any other aspect of the model

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation In realitythis delay distribution is the sum of two independent distributions the incubation period(lognormal distribution) and the delay from onset of symptoms to confirmation (negativebinomial distribution) These distributions are uncertain but modelling (and placing pri-ors over) them individually leads to unidentifiability Therefore we model the total delay

bSince we treat new infections as a continuous number its initial value can be between 0 and 1

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

between infection and confirmation assuming that this follows a negative binomial dis-tribution We convert priors over the individual delays to a prior over the total delay bybootstrapping mdash please see Appendix A3

As in Flaxman et al1 the observed cases Ct c follow a negative binomial noise distribu-tion with mean C t c and an inferred dispersion parameter Ψ(C ) This distribution encodesthat small case numbers are more noisy and should therefore receive less weight Hav-ing separate dispersion parameters for cases and deaths ensures that they can be weighteddifferently if there is a difference in their noise distributions

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ) (A6)

where PD (delay) is the distribution of the delay from infection to death As for cases wemodel the total delay between infection and death as negative binomial which is in realitythe sum of two independent distributions the incubation period and the delay from onsetof symptoms to death (gamma distribution)

Finally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and an inferred dispersion parameter Ψ(D)

The model was implemented in PyMC311 We infer the unobserved variables in our modelusing the No-U-Turn Sampler (NUTS)12 a standard Markov chain Monte Carlo samplingalgorithm

Appendix A2 Testing reporting and infection fatality rates

Scaling all values of a time series by a constant does not change its growth rates Themodel is therefore invariant to the scale of the observations and consequently to country-level differences in the IFR and the ascertainment rate (the proportion of infected peoplewho are subsequently reported positive) For example assume countries A and B differ onlyin their ascertainment rates Then our model will infer a difference in N (C )

0c (Eq (A3)) butnot in the growth rates g t c across A and B Accordingly the inferred NPI effectiveness willbe identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas the

cThis is only approximately true The negative binomial output distribution has a coefficient of variation dimin-ishing with its mean ie smaller observations are relatively more noisy and carry less weight Furthermorewhilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligible effect

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

introduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimics changes in the ascertainment ratemdashnoise at time

τ affects all future casesmdashand allows for gradual multiplicative changes in ascertainment

Appendix A3 Method for uncertainty over key epidemiological parameters

We account for uncertainty in the following delay distributions the generation interval thedelay from infection to symptom onset (incubation period) the delay from symptom onsetto case confirmation and the delay from symptom onset to death

Each distribution has an uncertain mean (Table A3) whose prior distribution we takefrom a meta-analysis13 published shortly after our window of analysis This helps accountfor possible heterogeneity between different populations and therefore often finds higheruncertainty in the means than estimated in primary studies while using more data Themeta-analysis reports mean values with 95 confidence intervals We set normal priors overthe mean delays with mean equal to the reported mean in the meta-analysis and standarddeviation set such that the prior 95 credible interval includes the 95 confidence intervalsfrom the meta-analysis For example the meta-analysis reports an expected mean delayfrom symptoms to death of 1671 days with a 95 credible interval ranging from 1537 to1817 We thus assume that the prior is normal with micro= 1671 and σ= 075 which ensuresthat its 95 credible interval ranges from 1525 to 1817 strictly including the intervalreported in the meta analysis to be conservative Priors are truncated at zero

Furthermore we account for the uncertainty in the standard deviation of these delay dis-tributions by placing normal priors based on primary studies following the same methodas above (Table A4) For the standard deviation of the generation interval we use patientdata from Feretti et al14 and reproduce their method fitting a Gamma distribution to thegeneration time

Finally the distributional forms of the delay distributions are taken from the above primarystudies

The delay from infection to case confirmation is the sum of the incubation period and thesymptom-onset-to-case-confirmation delay Similarly the delay from infection to death isthe sum of the incubation period and the symptom-onset-to-death delay However forcomputational tractability we model the total delay distributions converting the priors de-scribed above to priors over the total delays We assume that the total delays follow negativebinomial distributions which are computationally tractable and empirically provide a closefit to both sum distributions We place normal priors over the mean and dispersion parame-ters of these total delay distributions by bootstrapping For example we compute priors forthe total infection-to-case-confirmation delay distribution by sampling Nbootstrap = 250 in-

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

cubation periods and symptom-onset-to-case-confirmation distributions For each sampledpair of distributions we draw 106 samples from their sum and fit a negative binomial usingmoment matching We compute the mean and standard deviation of the negative binomialdistribution parameters across the bootstrap and use this for our prior The resulting priorsare given in Table A2

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table A2 Epidemiological parameter distributions used in the model The details on sources and priorsused to generate this Table are given in Tables A3 and A4

Delay type Sources Prior on mean Prior on standard Distributionaldeviation or formdispersion

Infection to case Fonfria et al13 N (1092σ= 094) Dispersion Negativeconfirmation Lauer et al15 N (541σ= 027) binomial

Cereda et al16

Infection to death Fonfria et al13 N (2182σ= 101) Dispersion NegativeLauer et al15 N (1426σ= 518) binomialLinton et al17

Generation interval Fonfria et al13 N (506σ= 033) SD N (211σ= 05) GammaFeretti et al14

Table A3 Mean of epidemiological parameters all from meta-analysis13 and distributional forms

Delay type Reported mean Prior on mean Distributionaland 95 CI form of delay

Incubation period 579 (548ndash611) logmicrosimN (153σ= 0051) Lognormal15

Onset to death 1671 (1537ndash1817) N (1671σ= 075) Gamma17

Onset to case confirmation 582 (474ndash715) N (582σ= 068) Negative binomial16

Generation interval 506 (450ndash570) N (506σ= 033) Gamma14

Table A4 Standard deviation or dispersion of epidemiological parameters

Delay type Source Reported standard deviation or Prior on standarddispersion with uncertainty deviation or

dispersionIncubation period Lauer et al15 SD of log delay SD of log delay

048 (95 CI 027ndash054) N (048σ= 00759)Onset to death Linton et al17 SD 69 (95 CI 52ndash91) N (69σ= 1122)Onset to case confirmation Cereda et al16 Dispersion 157 plusmn 0054 N (157σ= 0054)Generation interval Feretti et al14 SD 211 plusmn 05 (reproduced by us) N (211σ= 05)

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking its predictions on unseendata even if prediction is not the purpose of the model1018 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries we have strong reason to doubt itseffectiveness estimates However we do not expect NPI effectiveness models to extrapolateperfectly Almost always unobserved factors such as changes in the ascertainment rate orIFR spontaneous behaviour changes or unobserved NPIs will affect the observed numberof cases and deaths Our models ought to treat these factors as noise and not attribute theireffects on Rt to the observed NPIs

We use Leave-One-Out Cross-Validation (LOO-CV) fitting the model on 40 countries andextrapolating to the excluded country We repeat this process for all 41 countries In theexcluded country the first 14 days of case and death data are observed to estimate R0c aswell as the initial outbreak sizes N (C )

0c and N (D)0c All later days are unobserved (masked)

The model uses the effectiveness estimates inferred from the 40 included countries and NPIactivation dates to predict the course of the pandemic in the excluded country

Figure B7 visually shows extrapolations obtained from LOO-CV for a randomly chosensubset of six countries (all countries are shown in Figures C18 to C21) The predictiveaccuracy of the model as expected degrades with an increasing prediction horizon sug-gesting the presence of unobserved factors However the predictive credible intervals arewide and increase over time as noise in the growth rate accumulates Importantly ourmodel makes calibrated forecasts in excluded countries (Fig B8) over periods of up to 2months

These are challenging predictions to the best of our knowledge no related study analyseshow their estimated NPI effects generalise to unseen countries Most related studies donot validate predictions on unseen data at all19ndash23 reviewed in9 Flaxman et al1 hold outthe last 14 days from 20 April to 4th of May for all countries in aggregate However thecountries they study implemented all NPIs in March or earlier (Supplementary Table 2 in1)meaning that in fact all countries were used to estimate NPI effects

Note that Fig B8 includes the 6 countries that were used to select the hyperparameter(Appendix A) However the calibration is similar if these 6 countries are excluded (Fig-ure C23)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals the extent to which results depend on uncertain parametersand modelling choices and can diagnose model misspecification and excessive collinearity

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

100

101

102

103

104

105

106

便便

Slovenia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

便 便

Greece

Figure B7 Model extrapolations on a randomly chosen subset of 6 excluded countries produced usingleave-one-out cross-validation In each excluded country the first 14 days of death and case data (soliddots) are observed the model then extrapolates to all future days within the period of analysis (emptydots) For each country we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier see Methods) The shaded areas representthe 95 credible intervals

in the data24 We vary many of the components of our model and recompute the NPIeffectiveness estimates summarised here Further analysis can be found in Appendix C

For each sensitivity analysis we quantify sensitivity using an easily interpretable measureshown above the legend of each plot the average standard deviation denoted σ Firstfor each NPI we compute the standard deviation of its median effect estimates across allexperimental conditions in the given sensitivity analysis We then average these acrossthe NPIs Since effectiveness is expressed in percentage reductions of Rt the unit of σ ispercent Zero percent corresponds to no sensitivity

We perform a multivariate sensitivity analysis on the key epidemiological priors For com-putational reasons all other sensitivity analyses are univariate

Multivariate sensitivity to main epidemiological priors The key epidemiological param-eters in our model describe the delay from infection to case-confirmation the delay from

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

0 20 40 60 80 100Credible Interval

0

20

40

60

80

100

Em

piric

al C

umul

ativ

e Fr

eque

ncy

Calibration Country Holdouts

Figure B8 Calibration in held-out countries with leave-one-out cross validation In each excludedcountry the first 14 days of death and case data are observed to estimate R0c as well as the initialoutbreak sizes the model then extrapolates to all future days within the period of analysis The plotshows the percentage of observed daily case and death counts that lie within the X credible intervalacross all countries and days The identity line represents ideal calibration

infection to death and the generation interval Our model places prior distributions overthe means of these delay distributions Since the priors already account for uncertainty inthe delays this analysis considers what would happen if our prior beliefs changed Sincethe epidemiological parameters may interact in unexpected ways we perform a multivari-ate sensitivity analysis jointly changing the mean of the prior over the mean of each delaydistribution (referred to as the prior mean of the mean below) We vary the prior means ofthe mean delays across a wide but epidemiologically plausible range of 5 values resultingin 125 different experimental conditions We present these results in two ways

First Figure B9 shows the global sensitivity of our results to the prior mean of the mean ofeach distribution individually For example for each setting of the prior mean of the meangeneration interval we show the average over the 5times5 = 25 runs with different prior meansplaced over the mean delays between infection and case confirmationdeath This is equiv-alent to placing a uniform prior across the 125 experiment conditions and marginalisingand is standard methodology for computing global sensitivity to an individual parameter

The prior means seem to mostly influence the relative effectiveness of business closuresand gathering bans Increasing the prior means of the infection-to-case-confirmationdeathmean delays results in larger effectiveness estimates for gatherings bans and smaller esti-

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

mates for business closures while increasing the prior mean of the generation interval hasthe opposite effect

Second we assess the sensitivity of our results to jointly varying the prior means This yieldsσ= 246 We present this result graphically in Appendix C1

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Infection to Case-Confirmation Delay= 112Mean 792 daysMean 942 daysMean 1092 daysMean 1242 daysMean 1392 days

-25 0 25 50 75Average reduction in Rtin the context of our data

Infection to Death Delay= 080Mean 1782 daysMean 1982 daysMean 2182 daysMean 2382 daysMean 2582 days

-25 0 25 50 75Average reduction in Rtin the context of our data

Generation Interval= 149Mean 306 daysMean 406 daysMean 506 daysMean 606 daysMean 706 days

Figure B9 Sensitivity to key epidemiological parameter choices evaluated using a multivariate analysisWe jointly vary the prior means over the mean of the generation interval the delay between infectionand case confirmation and the delay between infection and death Each panel displays sensitivity to theprior mean of one distribution and the NPI effects are computed by averaging over the 25 runs withdifferent prior means of the two other distributions (see text)

Sensitivity to country exclusions Figure B10 shows results if one country at a time isexcluded from the data As there is no strong justification for including or excluding anyparticular country results ought to be stable if a country is excluded

-25 0 25 50 75

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

= 151DefaultAlbaniaAndorraAustriaBelgiumBosnia andHerzegovinaBulgariaCroatia

-25 0 25 50 75

= 151DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-25 0 25 50 75

= 151DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

= 151DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-25 0 25 50 75Average reduction in Rtin the context of our data

= 151DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-25 0 25 50 75Average reduction in Rtin the context of our data

= 151DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure B10 Sensitivity to excluding individual countries

Sensitivity to case-undercounting Figure B11 shows results when scaling the recordednumber of cases in each country First we correct the number of reported cases on each day

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

using estimates of the time-varying ascertainment rate from Golding et al25 For example ifthe estimated ascertainment rate in country c at time t is 50 we double the correspondingnumber of reported cases With this altered case data the model estimates a larger effectfor closing schools and universities and a smaller effect for limiting gatherings to 1000people or less There is little change in the effects estimated for the other NPIs

Second we randomly either double or halve the reported number of cases in each countryThis is intended to demonstrate that the model is invariant to constant differences in ascer-tainment between countries As expected there are negligible differences compared to thedefault results

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Case Undercounting= 163

Time-Varying ScalingRandom Constant ScalingDefault

Figure B11 Sensitivity to case undercounting considering both constant undercounting and time-varying undercounting

Sensitivity to structurally different models Figure B12 shows the NPI effectivenessestimates from models that use only cases or deaths as observations in contrast to ourmain model which uses both As expected there are some differences in the NPI effectestimates between the models which may be caused by the unique biases present in casedata and in death data Differences could also occur if NPIs differentially affected casesand deaths (eg if they affect different age groups) Nevertheless the studyrsquos qualitativeconclusions as outlined in the Discussione hold across the different data types

A number of implicit structural assumptions are made by the choice of model structureWe test sensitivity to these assumptions by evaluating NPI effectiveness estimates from al-ternative models reproducing the structural sensitivity analysis from our concurrent workwhere these models are described and compared in detail9 While these alternative modelstructures are also plausible we chose the model described in the main text as it is relativelysimple whilst also being highly robust9

eThe main qualitative conclusions as outlined in the Discussion are Stay-at-home orders had a small effectMandating mask-wearing had a small effect School and university closures had a large effect Gatherings banswere effective More strict gathering bans were more effective than less strict ones Business closures wereeffective Closing most nonessential businesses had limited benefit over closing just high-risk businesses

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Data Type

= 347Only Case DataOnly Death DataDefault

Figure B12 Sensitivity to different data types ie models using only confirmed cases only deaths orboth

The models are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting Rt into a daily growth rate a renewalprocess is used as the infection model as in earlier work1826ndash28 For computationalreasons we do not place a prior distribution over the generation interval parametersfor this model

3 Noisy-R Model The noise terms ε(middot)ct affect Rt rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on Rt The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

As Figure B13 (left) shows the multiplicative effect models support almost all conclusionsdrawn in the Discussione The only exception is that the stay-at-home order NPI is cate-gorised as moderately effective under the Different Effects Model (median reduction in Rt

of 18)

The results of the Additive Effects Model cannot be directly compared to the other modelssince it expresses results on a different scale (percentage reduction in R0 instead of Rt ) Itis therefore shown in a separate panel However the trend in the results is visually similarto the default results

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs which were implemented nor directly measures broaderbehavioural changes Since these factors influence Rt we must be wary of their effectbeing attributed to observed NPIs We investigate this further by assessing how much effec-

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

Structural Sensitivity Multiplicative Models= 311

Noisy-RDiscrete Renewal

Different EffectsDefault

-25 0 25 50 75Average reduction in Rtas a percentage of R0

in the context of our data

Structural Sensitivity Additive Model

Figure B13 Structural sensitivity analysis NPI effectiveness estimates under different structural as-sumptions Left Effectiveness estimates for multiplicative effect models Note that for computationalreasons the discrete renewal model does not have a prior over the generation interval Right Effective-ness estimates for an additive effect models For this model the effectiveness of each NPI is expressedas a reduction of R0 rather than Rt

tiveness estimates change when previously unobserved factors are included and also whenobserved factors are excluded This is best practice for addressing unobserved factors suchas confounders2930

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs31 The timings of our observed NPIsrsquo implementation dates are indeed mutuallycorrelated prompting the question of how much results change when we make previouslyobserved NPIs unobserved Figure B14 (left) shows NPI effectiveness estimates when pre-viously observed NPIs are excluded in turn The studyrsquos main qualitative conclusionse holdacross all experimental conditions and the variations in NPI effectiveness are rarely largeenough to cause a change in effectiveness category (Figure 5) except for the Gatheringslimited to 1000 people or less NPI Considering that some of the excluded NPIs have strongestimated effects when included and are correlated with other NPIs this degree of robust-ness is encouragingly high It suggests that unobserved factors will not significantly biasresults as long as their effects and their correlations with the studied NPIs do not exceedthose of the studied NPIs We hypothesise that this robustness to unobserved factors is dueto the noise on the daily growth rate in our model9

In addition Figure B14 (right) shows NPI effectiveness estimates when we observe (iecontrol for) additional NPIs taken from the OxCGRT dataset32

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Inclusion of OxCGRT NPIs= 153

Travel ScreenQuarantineTravel BansSymptomatic TestingPublic Information CampaignsInternal Movement LimitedPublic Transport LimitedDefault

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

Exclusion of Collected NPIsUncombined Effects Shown

= 188Mask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay Home OrderDefault

Figure B14 Robustness to unobserved effects Left Results when excluding previously observed NPIsWe exclude one of the NPIs in turn and show the estimates for the other NPIs Note that this subplotshows the marginal effect for hierarchical NPIs rather than the total effect different to other figuresthat show cumulative effects for gathering bans and businesses closures For example Figure 3 displaysthe total effect of closing most nonessential businesses while here we show the additional effect ofclosing most nonessential businesses over just closing some high-risk businesses We show the additionaleffects here because the effect of a cumulative intervention would become undefined when part of it isexcluded from the analysis Right Results when controlling for previously unobserved NPIs We includeone additional NPI in turn and show the estimates for the NPIs in our study (the additional NPI is notshown)

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Appendix C1 Multivariate Sensitivity Analysis - Expanded Results

We now present the results of our joint multivariate sensitivity analysis (previously sum-marised in Figure B9) in which we vary the mean of the prior (ldquoprior meanrdquo) over themean of the generation interval the delay between infection and death and the delaybetween infection and case confirmation

Figure C15 shows for each NPI how NPI effects change when all prior means are variedsimultaneously

We find that the most sensitive NPIs are Gatherings limited to 1000 people or fewer Gath-erings limited to 100 people or fewer and Most nonessential businesses closed However thelargest differences appear when all of the parameters are set to the most extreme valuesthat jointly produce a change in a particular direction We also see systematic trends asthe parameters are varied eg increasing the prior mean of the generation interval meantends to decrease the effectiveness of the Gatherings limited to 1000 or fewer NPI while in-creasing prior means of the infection to deathcase confirmation distribution means tendsto increase the effectiveness of this NPI

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

792

942

1092

1242

1392

Mas

kWea

ring

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

GI = 306

-150

-75

00

75

150GI = 406

-150

-75

00

75

150GI = 506

-150

-75

00

75

150GI = 606

-150

-75

00

75

150GI = 706

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

00

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

0

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Som

eBus

s

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Mos

tBus

s

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Educ

at

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

792

942

1092

1242

1392

Stay

Hom

e

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

Figure C15 Global sensitivity analysis Each cell shows the difference between the median effectivenessestimate under a specific experimental condition and the default condition where the estimates areexpressed as percentage reduction in Rt Each subplot represents a particular prior mean of the meangeneration interval and an NPI The NPIs vary top to bottom and the generation interval prior meanincreases left to right Each cell with a subplot represents a particular choice of the prior mean of themean infection to death delay (increasing left to right) and the prior mean of the mean infection to caseconfirmation delay (increasing bottom to top) Positive cell values indicate that the median NPI estimateis larger than with default settings and vice versa

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C2 Sensitivity to additional epidemiological assumptions

For completeness Figure C16 shows sensitivity to two further epidemiological assumptionsOn the left we show the dependence of NPI effectiveness estimates on R0 We place aNormal prior over R0 and a hyperprior over its variance reflecting the wide disagreementof regional estimates of R0 In the sensitivity analysis we vary the mean of the prior froma low value (238) to a high one (428) As expected higher mean R0 values result in largerNPI effects This trend is apparent in all NPIs but stronger for NPIs that tended to beimplemented early on (like school closures and gathering bans)

On the right we vary the prior over NPI effectiveness Our default prior on NPI effectivenessis assymmetric reflecting a belief that NPIs are more likely to reduce Rt than increase itHere we additionally test a symmetric Normal prior and a one-sided Half-Normal priorwhich does not allow for NPIs to increase Rt

Appendix C3 Sensitivity to data preprocessing

When the case count is small a large fraction of cases may be imported from other coun-tries and the testing regime may change rapidly To prevent this from biasing our modelwe neglect case numbers before a country has reached 100 confirmed cases Similarly weneglect death numbers before a country has reached 10 deaths Here we vary these thresh-olds Intuitively the minimum cases threshold should be higher than the minimum deathsthreshold

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

R0 prior= 333

[R] = 238[R] = 278

Default[R] = 328[R] = 378[R] = 428

-25 0 25 50 75Average reduction in Rtin the context of our data

NPI Effectiveness Prior= 137

i Normal(0 022)i Half Normal(022)

Default

Figure C16 Sensitivity to additional epidemiological priors Left Sensitivity to the prior mean on R0Right Sensitivity to the NPI effectiveness prior

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Minimum Cases Threshold= 137

3050Default (100)200300

-25 0 25 50 75Average reduction in Rtin the context of our data

Minimum Deaths Threshold= 102

15Default (10)3050

Figure C17 Data preprocessing sensitivity Left Variations in the minimum number of cases RightVariations in the minimum number of deaths

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Validation by predicting unseen data - all countries

Here we show the country-level plots of the single-country holdout experiments from Ap-pendix B1 We use leave-one-out cross-validation fitting the model on 40 countries andshowing its predictions on the excluded country We repeat this process for all 41 countriesIn the excluded country the first 14 days (filled dots) of death and case data are observedto allow roughly inferring R0 These days are not used to infer NPI effectiveness All laterdays are masked (unfilled dots) The activation dates of NPIs are also given and the modeluses the effectiveness estimates inferred from the 40 other countries

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

北便

France

Figure C18 Predictions on excluded countries

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

北 便

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

Malta

Figure C19 Predictions on excluded countries

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

Slovenia

Figure C20 Predictions on excluded countries

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Switzerland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

United Kingdom

Figure C21 Predictions on excluded countries Vertical lines show the activation (or inactivation) datesof NPIs Shaded areas are 95 credible intervals Blue and red dots show the observed confirmed casesand deaths while blue and red lines show the median model estimates of cases (Ct ) and deaths (Dt )Empty dots are not observed by the model For each country we show the full window of analysis (fromthe start of the epidemic until the first NPI was lifted or the 30th of May 2020 whichever was earliersee Methods)

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Posterior predictive distributions

The posterior predictive distribution (Figure C22) shows the predicted number of cases anddeaths after observing the data Although these curves can be called lsquofitsrsquo the degree of fitto the data must be interpreted with great care The fit is generally tight but this is partlydue to the inferred latent noise variables ε(C )

t and ε(D)t Inferring this latent noise allows

the posterior predictive distribution to closely match the data without overfitting the effec-tiveness parameters to the data Such behavior is common in Bayesian models which oftenperfectly interpolate the data without overfitting33 The noise terms can account for peri-ods where infections grew faster or slower than predicted based solely on the active NPIsIn such periods the noise may account for changes in testing reporting and unobservedinterventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Daily DeathsDaily Confirmed Cases

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C22 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 Calibration without countries used for hyperparameter selec-tion

0 20 40 60 80 100Credible Interval

0

20

40

60

80

100E

mpi

rical

Cum

ulat

ive

Freq

uenc

y

Calibration Country Holdouts

Figure C23 Calibration in held-out countries with leave-one-out cross-validation Here we include onlythose 35 countries that were not used to select the hyperparameter We show the first 14 days of casesand deaths in each country and then extrapolate to future days within the period of analysis The plotshows the percentage of observed daily case and death counts that lie within the X credible intervalacross all countries and days

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 MCMC stability results

1000 1002 10040

1000

2000

3000

4000

R

0 1 2 30

500

1000

1500

2000

Relative ESS

Figure C24 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size A value of 1 indicates perfect decorrelation between samplesValues above (below) 1 indicate that the effective number of samples is higher (lower) than the actualnumber of samples due to negative (positive) correlation respectively

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D5 Estimated values for R0 by country The parenthesis give the 95 credible interval which isoften wide The mean R0 across countries is 33

Country Estimated R0 Country Estimated R0

Albania 348 (275431) Lithuania 323 (248403)Andorra 275 (21434) Malaysia 296 (236361)Austria 31 (25377) Malta 327 (248414)Belgium 36 (302427) Mexico 399 (32748)Bosnia and Herzegovina 331 (259406) Morocco 359 (299427)Bulgaria 375 (304457) Netherlands 316 (26375)Croatia 342 (27418) New Zealand 241 (17731)Czech Republic 346 (276425) Norway 273 (215338)Denmark 28 (22347) Poland 397 (32482)Estonia 291 (229361) Portugal 355 (292425)Finland 279 (221343) Romania 401 (335475)France 331 (28389) Serbia 38 (308461)Georgia 356 (281439) Singapore 295 (238356)Germany 285 (231346) Slovakia 353 (271445)Greece 321 (261388) Slovenia 299 (23373)Hungary 403 (332482) South Africa 447 (377527)Iceland 171 (113242) Spain 36 (306423)Ireland 368 (309433) Sweden 229 (176288)Israel 379 (309459) Switzerland 289 (234351)Italy 336 (288391) United Kingdom 32 (267378)Latvia 301 (233376)

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 Collinearity

Appendix D21 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the inclusion or exclusion ofthese countries in the dataset (Figure D25) If we included all countries we would con-clude that university closures were more effective than school closures (black markers inFigure D25) However if we excluded Iceland or Sweden we would conclude that theywere roughly equally effective As there is no strong justification for including or excludingone particular country we cannot meaningfully disentangle the effects of school and uni-versity closures However there is much more data to determine the joint effect and it isindeed much more stable (Figure D25)

-25 0 25 50 75 100Average reduction in Rtin the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D25 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown

Appendix D22 Co-occurrence of NPIs

Table D6 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or neither) Note thatwe do not show the traditional collinearity statistics (variance inflation factors and datacorrelations) since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D6 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D23 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations24 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus042 between closing schools anduniversities and closing some businesses (Figure D26) The weak correlations are oneindicator that collinearity is manageable with our dataset

Effect on NPI combinations To better understand posterior correlations we visualizetheir effect in hosted video files As we condition on different values for one NPI we cansee that the estimates of other NPIs change only slightly always staying well within thecredible intervals in Figure 3 The significance of posterior correlations is small enough thatit is possible to calculate a reasonable approximation to the mean effect of a set of NPIs bysimply combining the mean percentage reductions for each individual NPI (eg two 50reductions lead to a 75 reduction) For example this approximation leads to a joint effectof 75 for all NPIs together which closely matches the exact mean joint effect of 77

Videos are available online here

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D26 Posterior correlations between effectiveness parameters αi

Appendix D3 Posterior Epidemiological Parameter Distributions

Figure D27 shows the posterior distributions for variables describing the key delay distribu-tions in our model the generation interval the delay between infection and case confirma-tion and the delay between infection and death We find that the posterior distributions ofthese parameters are somewhat tighter than their priors but they still explore a wide rangeof values This suggests that the data provides evidence albeit weak about these delaydistributions (for example from visual inspection the data would show that the delay islonger for deaths than for cases) An exception is the dispersion of the infection to deathdelay distribution which has a very wide prior

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

5 6

0

1

dens

ity

Generation Interval Mean

posteriorprior

200 225

00

05

Infection to Death Delay Mean

100 125

00

05

Infection to Case-Confirmation Delay Mean

0 2 4

value

00

05

dens

ity

Generation Interval std

10 20

value

00

01

Infection to Death Delay disp

5 6

value

0

1

Infection to Case-Confirmation Delay disp

Figure D27 Posterior distributions over key epidemiological parameters describing the generation in-terval and the delays between infection and case confirmationdeath

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect morethan three-quarters of the population) We thus miss NPIs which were only implementedregionally For example a few regions in Germany implemented stay-at-home orders butmost did not Thus Germany is listed as no stay-at-home order in our data Additionallyour NPI definitions were not perfectly granular For example a gathering ban on gatheringsof gt15 people and a ban on gatherings of gt60 people would both fall under the NPIGatherings limited to 100 people or less despite likely having different effects on Rt Finally while we included more NPIs than previous work (Table F7) there are many NPIsfor which we were not able to collect enough high-quality data for our modeling such aspublic cleaning or changes to public transportation

Of the 41 countries in our dataset 33 are in Europe As a result the NPI effectivenessestimates may be biased towards effects in Europe and NPI effectiveness may have beendifferent in other parts of the world

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on Rt is constantacross countries and time However the exact implementation and adherence of each NPIis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa limited degree Additionally different countries have different cultural norms and ageprofiles affecting the degree to which a particular intervention is effective For example acountry where a higher proportion of the population is in education will likely experience alarger effect from a government order to close schools and universities Our estimates thusshould be adjusted to local circumstances To address differences between countries ourstructural sensitivity analysis includes a model where each NPI can have a different effectper country (Appendix B2) The average effectiveness estimates across countries in thismodel match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponential de-cline would have the same effect on observations as an unobserved NPI Consequently wecannot quantify its effect on our results (though the sensitivity analyses look reassuring)

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expectthat closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g t and the reproductive number Rt holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso report results from a renewal process model8 that does not depend on this assumptionand we find similar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix F Overview of previous work

Table F7 Data-driven multi-country multi-NPI studies of the effectiveness of observed (as opposed tohypothetical) NPIs in reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Banholzer etal 202023

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Chen andQiu 202021

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Islam et al202020

School closuresworkplace closuresrestrictions on massgatherings publictransport closure

lockdown

149 countries or regionsInterrupted timeseries regression

Liu et al202022

13 NPIs from theOXCGRT dataset

130 countries and regions panel regression

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F8 Some data-driven studies of the effectiveness of observed NPIs in reducing the transmissionof COVID-19 which study a single-country andor a single-NPI

Study NPIs studiedRegionscountries

studiedMethod

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dandekarand

Barbastathis202035

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Dehning etal 202036

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Gatto et al202037

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Hsiang et al202019

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US (each

country modelledindividually)

Linear regression onestimated growth

rates

Jarvis et al202038

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelKraemer etal 202039

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202040 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) model

Lai et al202041

Case detection andisolation travel

restrictions contactreductions

ChinaTravel-network-based

SEIR model

Continued on next page

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F8 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Lorch et al202042

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Maier andBrockmann

202043

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Orea andAacutelvarez202044

Lockdown SpainSpatial econometric

analysis

Quilty et al202045

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

Sears et al202046

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Siedner etal 202047

General socialdistancing

USInterruptedtime-series

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F7) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

61

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

62

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all ldquotrade activitiesrdquo (except within shopping malls) service jobsthat donrsquot involve physical contact museums libraries and galleries opened Sincethe criteria regarding ldquoall retail stores being openrdquo was met we counted April 27 asthe end date for ldquoMost nonessential businesses closedrdquo

63

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H All model equations

This section will mainly be of interest to readers that wish to re-implement the model

Variables are indexed by NPI i country c and day t All prior distributions are independent

Data

1 NPI Activations φi t c isin 012 Observed (Daily) Cases Ct c 3 Observed (Daily) Deaths D t c

Prior Distributions

1 Country-specific R0 R0c sim Normal(325κ) κsim Half Normal(micro= 0σ= 05)2 NPI effectiveness αi sim Asymmetric Laplace(m = 0κ= 05λ= 10) m is the location

parameter κgt 0 is the asymmetry parameter and λgt 0 is the scale parameter3 Infection Initial Counts

N (C )0c = exp(ζ(C )

c )

N (D)0c = exp(ζ(D)

c )

ζ(C )c sim Normal(micro= 0σ= 50)

ζ(D)c sim Normal(micro= 0σ= 50)

4 Observation Noise Dispersion Parameters

Ψcases sim Half Normal(micro= 0σ= 5) (H1)

Ψdeaths sim Half Normal(micro= 0σ= 5) (H2)

Hyperparameters

1 Growth Noise Scale σg = 02

Delay Distributions

1 Generation interval distribution1314

microGI sim Normal(micro= 506σ= 03265)

σGI sim Normal(micro= 211σ= 05)

64

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

2 Time from infection to case confirmation T (C )131516f

microinfrarrconf sim Normal(micro= 1092σ= 094)

Ψinfrarrconf sim Normal(micro= 541σ= 027)

This distribution is converted into a forward-delay vector

T (C )[t ] =

1ZC

Negative Binomial(t micro=microinfrarrconfα=Ψinfrarrconf) t lt 32

0 otherwise

with ZC =31sum

t prime=0Negative Binomial(t primemicro=microinfrarrconfα=Ψinfrarrconf)

ie the delay follows a truncated and normalised negative binomial distribution3 Time from infection to death T (D)f131517

microinfrarrdeath sim Normal(micro= 2182σ= 101)

Ψinfrarrdeath sim Normal(micro= 1426σ= 518)

This distribution is converted into a forward-delay vector

T (D)[t ] =

1ZD

Negative Binomial(t micro=microinfrarrdeathα=Ψinfrarrdeath) t lt 48

0 otherwise

with ZD =47sum

t prime=0Negative Binomial(t primemicro=microinfrarrdeathα=Ψinfrarrdeath)

ie the delay follows a truncated and normalised negative binomial distribution

Infection Model

Rt c = R0c middotexp

(minus

Isumi=1

αi φi t c

) where I is the number of NPIs

αGI =microGI

σ2GI

βGI =micro2

GI

σ2GI

g t c = exp

(βGI(R

1αGIct minus1)

)minus1

fα in the definition of the Negative Binomial distribution is the dispersion parameter Larger values of αcorrespond to a smaller variance and less dispersion With our parameterisation the variance of the Negative

Binomial distribution is micro+ micro2

α

65

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

]

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise

ε(C )τc sim Normal(micro= 0σ=σg )

ε(D)τc sim Normal(micro= 0σ=σg )

Observation Modelf

Ct c =31sumτ=0

N (C )tminusτcT

(C )[τ]

D t c =47sumτ=0

N (D)tminusτcT

(D)[τ]

Ct c sim Negative Binomial(micro= Ct c α=Ψcases)

D t c sim Negative Binomial(micro= D t c α=Ψdeaths)

66

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix I References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Sam Abbott et al ldquoEstimating the time-varying reproduction number of SARS-CoV-2using national and subnational case countsrdquo In Wellcome Open Research 5112 (2020)p 112

3 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

4 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

5 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

6 Sheikh Taslim Ali et al ldquoSerial interval of SARS-CoV-2 was shortened over time bynonpharmaceutical interventionsrdquo In Science 3696507 (2020) pp 1106ndash1109

7 Xingjie Hao et al ldquoReconstruction of the full transmission dynamics of COVID-19 inWuhanrdquo In Nature 5847821 (Aug 2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the robustness of effectiveness estimation of nonpharma-ceutical interventions against COVID-19 transmissionrdquo In arXiv preprint arXiv200713454(2020) URL httpsarxivorgabs200713454

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

12 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

13 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

14 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

67

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

15 Stephen A Lauer et al ldquoThe incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases estimation and applicationrdquo In Annals ofinternal medicine 1729 (2020) pp 577ndash582

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

19 Solomon Hsiang et al ldquoThe effect of large-scale anti-contagion policies on the COVID-19 pandemicrdquo In Nature 5847820 (2020) pp 262ndash267

20 Nazrul Islam et al ldquoPhysical distancing interventions and incidence of coronavirusdisease 2019 natural experiment in 149 countriesrdquo In bmj 370 (2020)

21 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo In Covid Economics 7 (Apr 2020) pp 46ndash67

22 Yang Liu et al ldquoThe impact of non-pharmaceutical interventions on SARS-CoV-2 trans-mission across 130 countries and territoriesrdquo In medRxiv (2020)

23 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

24 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

25 Nick Golding et al ldquoReconstructing the global dynamics of under-ascertained COVID-19 cases and infectionsrdquo In medRxiv (2020) DOI 1011012020070720148460eprint httpswwwmedrxivorgcontentearly202007082020070720148460fullpdf URL httpswwwmedrxivorgcontentearly202007082020070720148460

26 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

27 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

28 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

68

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

29 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

30 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

31 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

32 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

33 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

36 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

37 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

38 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

39 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

40 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

41 Shengjie Lai et al ldquoEffect of non-pharmaceutical interventions to contain COVID-19 inChinardquo en In Nature 5857825 (Sept 2020) pp 410ndash413 DOI 101038s41586-020-2293-x

69

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

42 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

43 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

44 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

45 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

46 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

47 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

70

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Testing reporting and infection fatality rates
      • Method for uncertainty over key epidemiological parameters
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Multivariate Sensitivity Analysis - Expanded Results
              • Sensitivity to additional epidemiological assumptions
              • Sensitivity to data preprocessing
              • Validation by predicting unseen data - all countries
              • Posterior predictive distributions
              • Calibration without countries used for hyperparameter selection
              • MCMC stability results
                • Additional results
                  • Estimated R0 by country
                  • Collinearity
                    • The individual effects of school and university closures
                    • Co-occurrence of NPIs
                    • Correlations between effectiveness estimates
                      • Posterior Epidemiological Parameter Distributions
                        • Additional discussion of assumptions and limitations
                          • Limitations of the data
                          • Model limitations
                            • Overview of previous work
                            • Handling edge cases in the data collection
                            • All model equations
                            • References
Page 12: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan

Most businessesclosed

Stay-at-homeorder

Mask wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businessesclosed

Schools anduniversities closed

Figure 4 Combined NPI effectiveness for the most common sets of NPIs in our data by size of the NPI setShaded regions denote 50 and 95 credible intervals Left Maximum R0 that could be reduced to be-low 1 for each set of NPIs Right Predicted R after implementation of each set of NPIs assuming R0 = 38Readers can interactively explore the effects of all sets of NPIs at httpepidemicforecastingorgcalc

tive to variations in the data and model parameters22 High sensitivity prevented Flaxmanet al1 who had a smaller dataset from disentangling NPI effects9 Our estimates are sub-stantially less sensitive (see next section) Finally the posterior correlations between theeffectiveness estimates are weak suggesting manageable collinearity (Appendix D23)

Although the correlations between the individual estimates are weak we should take theminto account when evaluating combined NPI effects For example if two NPIs frequentlyco-occurred there may be more certainty about the combined effect than about the twoindividual effects Figure 4 shows the combined effectiveness of the sets of NPIs thatare most common in our data Together our set of NPIs reduced R by 77 (74ndash79)on average Across countries the mean R without any NPIs (ie the R0) was 33 (Ta-ble D5 reports R0 for all countries) Starting from this number the estimated R likely couldhave been reduced below 1 by closing schools and universities high-risk businesses andlimiting gathering sizes Readers can interactively explore the effects of sets of NPIs athttpepidemicforecastingorgcalc A CSV file containing the joint effectiveness of all NPIcombinations is available online here

Sensitivity and validation

We perform a range of validation and sensitivity experiments (Appendix B with furtherexperiments in Appendix C) First we analyse how the model extrapolates to unseen coun-tries and find that it makes calibrated forecasts over periods of up to 2 months with un-certainty increasing over time Further we perform multiple sensitivity analyses studyinghow results change if we modify the priors over epidemiological parameters exclude coun-tries from the dataset use only deaths or confirmed cases as observations vary the datapreprocessing and more Finally we investigate our key assumptions by showing results

12

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

for several alternative models (structural sensitivity10) and examine possible confoundingof our estimates by unobserved factors influencing R In total we consider 201 alternativeexperimental conditions

Figure 5 (left) shows the median NPI effectiveness across these 201 experimental condi-tions Compared to the results under our default settings (Figures 3 and 4) median NPIeffects vary under alternative plausible experimental conditions However the trends in theresults are robust and some NPIs outperform others under all tested conditions While wetest over large ranges of plausible values our experiments do not include every possiblesource of uncertainty and the results might change more substantially under experimentalconditions we have not tested

We categorise NPI effects into small moderate and large which we define as a medianreduction in R of less than 175 between 175 and 35 and more than 35 (verticallines in Figure 5) Six of the NPIs fall into a single category in a large fraction of experi-mental conditions school and university closures are associated with a large effect in 99of experimental conditions limiting gatherings to 10 people or less in 94 Closing mostnonessential businesses has a moderate effect in 96 of conditions limiting gatherings to100 people or less in 97 Making mask-wearing mandatory in (some) public spaces fallsinto the ldquosmall effectrdquo category in 100 of experimental conditions issuing stay-at-homeorders (with exemptions) in 99 Two NPIs fall less clearly into one category Closing some(high-risk) businesses has a moderate effect in 86 of conditions and limiting gatherings to1000 people or less has a small effect in 79 However both NPIs have small-to-moderateeffects in more than 99 of experimental conditions The effect of limiting gatherings to1000 people or less is the least stable across the sensitivity analyses which may reflect itsaforementioned partial collinearity with limiting gatherings to 100 people or less

Aggregating all sensitivity analyses can hide sensitivity to specific assumptions We displaythe median NPI effects in four categories of sensitivity analyses (Figure 5 right) and eachindividual sensitivity analysis is shown in the Appendix The trends in the results are alsostable within the categories

Discussion

We use a data-driven approach to estimate the effects that eight nonpharmaceutical in-terventions had on COVID-19 transmission in 41 countries between January and the endof May 2020 We find that several NPIs were associated with a clear reduction in R inline with the mounting evidence that NPIs can be effective at mitigating and suppressingoutbreaks of COVID-19 Furthermore our results suggest that some NPIs outperformedothers While the exact effectiveness estimates vary with modelling assumptions the broadconclusions discussed below are largely robust across 201 experimental conditions in 11sensitivity analyses

13

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

0 25 50Median reduction in Rt

Mask-wearing mandatory in(some) public spacesGatherings limited to1000 people or lessGatherings limited to100 people or lessGatherings limited to10 people or lessSome businessesclosedMost nonessentialbusinesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

All Sensitivity Analyses (201 conditions)北便

Model Structure(5 conditions)

北便

Data and Preprocessing(51 conditions)

0 25 50Median reduction in Rt

北便

Epidemiological Priors(131 conditions)

0 25 50Median reduction in Rt

北便

Unobserved Factors(14 conditions)

Figure 5 Median NPI effects across the sensitivity analyses Left Median NPI effects (reduction in R)when varying different components of the model or the data in 201 experimental conditions Results aredisplayed as violin plots using kernel density estimation to create the distributions Inside the violinsthe box plots show median and interquartile-range The vertical lines mark 0 175 and 35 (seetext) Right Categorised sensitivity analyses Structural Using only cases or only deaths as observations(2 experimental conditions Figure B12) varying the model structure (3 conditions Figure B13 left)Data Leaving out one country at a time (41 conditions Figure B10) varying the threshold below whichcases and deaths are masked (8 conditions Figure C17) sensitivity to correcting for undocumentedcases and to country-level differences in case ascertainment (2 conditions Figure B11) Epidemiologicalpriors Jointly varying the means of the priors over the means of the generation interval the infection-to-case-confirmation delay and the infection-to-death delay (125 conditions Figure C15) varying theprior over R0 (4 conditions Figure C16 left) varying the prior over NPI effectiveness (2 conditionsFigure C16 right) Unobserved factors Excluding observed NPIs one at a time (9 conditions Figure B14left) controlling for additional NPIs from a different dataset (5 conditions Figure B14 right)

14

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Business closures and gathering bans both seem to have been effective at reducing COVID-19 transmission Closing only high-risk businesses appears to have been only somewhat lesseffective than closing most nonessential businesses the median reduction in R differs onlyby 9 (6ndash12 mean and 95 interval of median estimates across the experiments set-tings in Figure 5) Closing only high-risk businesses may thus have been the more promisingpolicy option in some circumstances Limiting gatherings to 10 people or less was more ef-fective than limits up to 100 or 1000 people This may reflect the fact that small gatheringsare common

As previously discussed we estimate the average effect each NPI had in the contexts inwhich it was implemented When countries introduced stay-at-home orders they nearly al-ways also banned gatherings and closed schools universities and some or most businessesif they had not done so already (Figure 3 bottom left) Flaxman et al1 and Hsiang et al3

add the effect of these distinct NPIs to the effectiveness of stay-at-home orders and accord-ingly find a large effect In contrast we account for these other NPIs separately and isolatethe additional effect of ordering the population to stay at home (when large gatheringsare banned and educational institutions and some businesses closed)d In accordance withother studies that took this approach we find a small effect26 A typical country could havereduced R to below 1 without a stay-at-home order (Figure 4) provided other NPIs wereimplemented

Mandating mask-wearing in various public spaces had no clear effect on average in thecountries we studied This does not rule out mask-wearing mandates having a larger ef-fect in other contexts In our data mask-wearing was only mandated when other NPIshad already reduced public interactions When most transmission occurs in private spaceswearing masks in public is expected to be less effective This might explain why a largereffect was found in studies that included China and South Korea where mask-wearingwas introduced earlier823 While there is an emerging body of literature indicating thatmask-wearing can be effective in reducing transmission the bulk of evidence comes fromhealthcare settings24 In non-healthcare settings risk compensation25 may play a largerrole potentially reducing effectiveness While our results cast doubt on reports that mask-wearing is the main determinant shaping a countryrsquos epidemic23 the policy still seemspromising given all available evidence due to its comparatively low economic and socialcosts Its effectiveness may have increased as other NPIs have been lifted and public inter-actions have recommenced

We find a large effect for school and university closures This finding is remarkably robustacross different model structures variations in the data and epidemiological assumptions(Figure 5) It remains robust when controlling for NPIs excluded from our study (Fig-ure B14) Our approach cannot distinguish direct and indirect effects such as forcing par-

dIn principle a stay-at-home order does not imply these other NPIs It is possible for a country to simultaneouslyhave allowed schools to be open and have a stay-at-home order (with exemptions for attending school) inplace However this is not how stay-at-home orders were implemented by the countries in our dataset andwe cannot estimate the effect of such a hypothetical stay-at-home order from the available data

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ents to stay at home or causing broader behaviour changes by increasing public concernAdditionally since school and university closures almost perfectly coincided in the coun-tries we study our approach cannot distinguish their individual effects (Appendix D21)This limitation likely also holds for other observational studies which do not include dataon university closures and estimate only the effect of school closures1ndash35ndash8 Previous evi-dence on school and university closures is mixed1626 Early data suggest that children andyoung adults are equally susceptible to infection but have a notably lower observed inci-dence rate than older adultsmdashwhether this is due to school and university closures remainsunknown27ndash30 Although infected young people are often asymptomatic they appear toshed similar amounts of virus as older people3132 and might therefore transmit the infec-tion to higher-risk demographics unknowingly As the role of children in transmission isstill unclear33 while outbreaks detected in schools are rising33ndash35 this topic merits carefulattention

Our study has several limitations First NPI effectiveness may depend on the context ofimplementation such as the presence of other NPIs and country-specific factors Our esti-mates must be interpreted as the average effectiveness over the contexts in our dataset10and expert judgement is required to adjust them to local circumstances Second R mayhave been reduced by unobserved NPIs or spontaneous behaviour changes To investigatewhether these reductions could be falsely attributed to the observed NPIs we perform sev-eral additional analyses and find that our results are stable to a range of unobserved effects(Appendix B3) However this sensitivity check cannot provide certainty Investigating therole of unobserved effects is an important topic to explore further Third our results can-not be used without qualification to predict the effect of lifting NPIs For example closingschools and universities seems to have greatly reduced transmission but this does not meanthat reopening them will cause infections to soar Educational institutions can implementsafety measures such as reduced class sizes as they reopen Further work is needed to anal-yse the effects of reopenings we hope that our collected data aids this effort Fourth whilewe included more NPIs than most previous work (Table F7) several promising NPIs wereexcluded For example testing tracing and case isolation may be an important part of acost-effective epidemic response36 but were not included because it is difficult to obtaincomprehensive data We discuss further limitations in Appendix E

Although our work focuses on estimating the impact of NPIs on the reproduction numberR the ultimate goal of governments may be to reduce the incidence prevalence and excessmortality of COVID-19 Controlling R is essential but the contribution of NPIs towards thesegoals may also be mediated by other factors such as their duration and timing37 periodicityand adherence3839 and successful containment40 While each of these factors addressestransmission within individual countries it can be crucial to additionally synchronise NPIsbetween countries since cases can be imported41

In conclusion as governments around the world seek to keep R below 1 while minimisingthe social and economic costs of their interventions we hope that our results can informpolicy decisions on which NPIs to implement in any potential further wave of infectionsAdditionally our work may provide insight on which areas of public life are most in need

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

of restructuring so that they can continue despite the pandemic However our estimatesshould not be seen as the final word on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies simulation studiesexperimental trials and clinical experience

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

The paid contractor work helping with the data collection the development of the interac-tive website and the costs for cloud compute were funded by the Berkeley Existential RiskInitiative

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier V Mikulik MAHartwick L Chindelevitch designed the model and modelling experiments M Sharma ABStephenson T Gavenciak J Salvatier performed and analysed the modelling experimentsJ Kulveit T Gavenciak JM Brauner conceived the research S Mindermann M SharmaJM Brauner L Chindelevitch J Kulveit T Besiroglu did the literature search JM BraunerS Mindermann M Sharma G Leech L Chindelevitch T Besiroglu V Mikulik wrote themanuscript All authors read and gave feedback on the manuscript and approved the finalmanuscript JM Brauner S Mindermann and M Sharma contributed equally L Chindele-vitch Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

3 Solomon Hsiang et al ldquoThe effect of large-scale anti-contagion policies on the COVID-19 pandemicrdquo In Nature 5847820 (2020) pp 262ndash267

4 Shengjie Lai et al ldquoEffect of non-pharmaceutical interventions to contain COVID-19 inChinardquo en In Nature 5857825 (Sept 2020) pp 410ndash413 DOI 101038s41586-020-2293-x

5 Yang Liu et al ldquoThe impact of non-pharmaceutical interventions on SARS-CoV-2 trans-mission across 130 countries and territoriesrdquo In medRxiv (2020)

6 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

7 Nazrul Islam et al ldquoPhysical distancing interventions and incidence of coronavirusdisease 2019 natural experiment in 149 countriesrdquo In bmj 370 (2020)

8 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo In Covid Economics 7 (Apr 2020) pp 46ndash67

9 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

10 Mrinank Sharma et al ldquoOn the robustness of effectiveness estimation of nonpharma-ceutical interventions against COVID-19 transmissionrdquo In arXiv preprint arXiv200713454(2020) URL httpsarxivorgabs200713454

11 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

12 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

13 Ensheng Dong Hongru Du and Lauren Gardner ldquoAn interactive web-based dashboardto track COVID-19 in real timerdquo In The Lancet Infectious Diseases 205 (May 2020)pp 533ndash534 DOI 101016s1473-3099(20)30120-1

14 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

15 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

16 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

17 Sam Abbott et al ldquoEstimating the time-varying reproduction number of SARS-CoV-2using national and subnational case countsrdquo In Wellcome Open Research 5112 (2020)p 112

18 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

19 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

20 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Young Joon Park et al ldquoContact tracing during coronavirus disease outbreak SouthKorea 2020rdquo In Emerg Infect Dis 2610 (2020)

28 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

29 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

30 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

31 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

32 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

33 Chen Stein-Zamir et al ldquoA large COVID-19 outbreak in a high school 10 days afterschoolsrsquo reopening Israel May 2020rdquo In Eurosurveillance 2529 (2020) p 2001352

34 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 38 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile919676Weekly_COVID19_Surveillance_Report_week_38_FINAL_UPDATEDpdf 2020

35 Meira Levinson Muge Cevik and Marc Lipsitch Reopening Primary Schools during thePandemic 2020

36 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

37 K Prem et al ldquoThe effect of control strategies to reduce social mixing on outcomes ofthe COVID-19 epidemic in Wuhan China a modelling studyrdquo In Lancet Public Health5 (5 May 2020) e261ndashe270

38 Patrick G T Walker et al ldquoThe impact of COVID-19 and strategies for mitigationand suppression in low- and middle-income countriesrdquo In Science 3696502 (2020)pp 413ndash422

39 Nicholas G Davies et al ldquoEffects of non-pharmaceutical interventions on COVID-19cases deaths and demand for hospital services in the UK a modelling studyrdquo In TheLancet Public Health 57 (2020) e375ndashe385

40 Xingjie Hao et al ldquoReconstruction of the full transmission dynamics of COVID-19 inWuhanrdquo In Nature 5847821 (Aug 2020)

41 N W Ruktanonchai et al ldquoAssessing the impact of coordinated COVID-19 exit strate-gies across Europerdquo In Science (2020) DOI 101126scienceabc5096

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 23

Appendix A1 Detailed model description 23

Appendix A2 Testing reporting and infection fatality rates 26

Appendix A3 Method for uncertainty over key epidemiological parameters 27

Appendix B Validation 30

Appendix B1 Unseen data 30

Appendix B2 Sensitivity Analysis 30

Appendix B3 Robustness to unobserved effects 35

Appendix C Additional sensitivity analyses and validation 38

Appendix C1 Multivariate Sensitivity Analysis - Expanded Results 38

Appendix C2 Sensitivity to additional epidemiological assumptions 40

Appendix C3 Sensitivity to data preprocessing 40

Appendix C4 Validation by predicting unseen data - all countries 42

Appendix C5 Posterior predictive distributions 47

Appendix C6 Calibration without countries used for hyperparameter selection 48

Appendix C7 MCMC stability results 49

Appendix D Additional results 50

Appendix D1 Estimated R0 by country 50

Appendix D2 Collinearity 51

Appendix D3 Posterior Epidemiological Parameter Distributions 53

Appendix E Additional discussion of assumptions and limitations 55

Appendix E1 Limitations of the data 55

Appendix E2 Model limitations 55

Appendix F Overview of previous work 57

Appendix G Handling edge cases in the data collection 60

Appendix H All model equations 64

Appendix I References 67

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections N (sdot)tc

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Mask wearing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closedSchools closed

Stay-at-home order

Growth reductions from interventions αi i

Daily growth rate gtc

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Uncertain delay from infection to

confirmation death

Uncertain generation interval

Universities closed

Figure A6 Model Overview Purple nodes are observed We describe the diagram from bottom to topThe effectiveness of NPI i is represented by αi which is independent of the country On each day t a countryrsquos daily reproduction number Rt c depends on the countryrsquos basic reproduction number R0cand the active NPIs The active NPIs are encoded by Φi t c which is 1 if NPI i is active in country c attime t and 0 otherwise Rt c is transformed into the daily growth rate gt c using the generation intervalparameters and subsequently is used to compute the new infections N (C )

t c and N (D)t c that will turn into

confirmed cases and deaths respectively Finally the number of new confirmed cases Ct c and deathsDt c is computed by a discrete convolution of N (middot)

t c with the respective delay distributions Our modeluses both death and case data it splits all nodes above the daily growth rate gt c into separate branchesfor deaths and confirmed cases We account for uncertainty in the generation interval infection-to-case-confirmation delay and the infection-to-death delay by placing priors over the parameters of thesedistributions

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1but with several key extensions First we model both confirmed cases and deaths al-lowing us to leverage significantly more data Furthermore we do not assume a specificinfection fatality rate (IFR) since we do not aim to infer the total number of COVID-19 in-fections Additionally since epidemiological parameters are only known with uncertaintywe place priors over them following recent best practice2 Concretely we place priors onthe means and standard deviationsdispersions of the generation interval the infection-to-confirmation delay and the infection-to-death delay These prior choices are detailedin Appendix A3 The end of this section details further adaptations which allow us to

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

minimise assumptions about testing reporting and the IFR Code is available online hereFor readers who wish to re-implement the model a concise list of all equations is given inAppendix H

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior distribution over R0c reflecting the wide disagreement of regionalestimates of R0

3 The mean R0c is 328 based on a meta-analysis4 We parameterize theeffectiveness of NPI i assumed to be same across countries and time with αi Each NPI isassumed to have an independent multiplicative effect on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We place an Asymmetric Laplace prior over αi with scale parameter10 asymmetry parameter 05 and location parameter 0 Our prior allows for (unbounded)positive and negative effects as we cannot a priori exclude the possibility that the introduc-tion of an NPI increases R However our prior places 80 of its mass on positive effectsreflecting a belief that NPIs are much more likely to reduce Rt than not This is a shrinkageprior placing 50 of its mass on lsquosmallrsquo effectiveness (less than 10 change in Rt ) All priordistributions are independent

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c and Rt c

(Eq 29 in5)

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We account for uncertainty inthe generation interval (GI) assumed to be a Gamma distribution by placing prior distri-butions over its mean and standard deviation (Appendix A3) Using (A2) we can writeg t c as a function g t c (Rt c microGIσGI) (see Appendix H) where microGI is the GI mean and σGI isthe GI standard deviation

aMany epidemiological models define growth rates as the exponent r in an exponential growth function Herewe use daily growth rates instead for ease of exposition These choices are mathematically equivalent Notethat we adapted equation (29) in Wallinga amp Lipsitch5 to account for our choice

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Note that the GI can change over time due to NPIs In particular the effective contact tracingand case isolation in China substantially shortened the mean GI to only 26 days among theidentified cases6 Since most cases were likely not identified even in Wuhan China7 andour study omits most of the countries with highly effective contact tracing programs suchas China or South Korea we model the GI as constant (but uncertain) over time A possibleextension for further work is to model the change in GI explicitly

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are inferred from the observation models for cases and deathsWe assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)] (A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)] (A4)

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross-validation as no reference is available for it We evaluate five dif-ferent values (σN isin 00501020304) with a fixed randomly chosen validation set of6 countries σN = 02 maximises the predictive log-likelihood on the validation set Thisensures a more calibrated model which is less likely to produce overconfident or unstableestimates10 We did not tune any other aspect of the model

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation In realitythis delay distribution is the sum of two independent distributions the incubation period(lognormal distribution) and the delay from onset of symptoms to confirmation (negativebinomial distribution) These distributions are uncertain but modelling (and placing pri-ors over) them individually leads to unidentifiability Therefore we model the total delay

bSince we treat new infections as a continuous number its initial value can be between 0 and 1

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

between infection and confirmation assuming that this follows a negative binomial dis-tribution We convert priors over the individual delays to a prior over the total delay bybootstrapping mdash please see Appendix A3

As in Flaxman et al1 the observed cases Ct c follow a negative binomial noise distribu-tion with mean C t c and an inferred dispersion parameter Ψ(C ) This distribution encodesthat small case numbers are more noisy and should therefore receive less weight Hav-ing separate dispersion parameters for cases and deaths ensures that they can be weighteddifferently if there is a difference in their noise distributions

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ) (A6)

where PD (delay) is the distribution of the delay from infection to death As for cases wemodel the total delay between infection and death as negative binomial which is in realitythe sum of two independent distributions the incubation period and the delay from onsetof symptoms to death (gamma distribution)

Finally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and an inferred dispersion parameter Ψ(D)

The model was implemented in PyMC311 We infer the unobserved variables in our modelusing the No-U-Turn Sampler (NUTS)12 a standard Markov chain Monte Carlo samplingalgorithm

Appendix A2 Testing reporting and infection fatality rates

Scaling all values of a time series by a constant does not change its growth rates Themodel is therefore invariant to the scale of the observations and consequently to country-level differences in the IFR and the ascertainment rate (the proportion of infected peoplewho are subsequently reported positive) For example assume countries A and B differ onlyin their ascertainment rates Then our model will infer a difference in N (C )

0c (Eq (A3)) butnot in the growth rates g t c across A and B Accordingly the inferred NPI effectiveness willbe identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas the

cThis is only approximately true The negative binomial output distribution has a coefficient of variation dimin-ishing with its mean ie smaller observations are relatively more noisy and carry less weight Furthermorewhilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligible effect

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

introduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimics changes in the ascertainment ratemdashnoise at time

τ affects all future casesmdashand allows for gradual multiplicative changes in ascertainment

Appendix A3 Method for uncertainty over key epidemiological parameters

We account for uncertainty in the following delay distributions the generation interval thedelay from infection to symptom onset (incubation period) the delay from symptom onsetto case confirmation and the delay from symptom onset to death

Each distribution has an uncertain mean (Table A3) whose prior distribution we takefrom a meta-analysis13 published shortly after our window of analysis This helps accountfor possible heterogeneity between different populations and therefore often finds higheruncertainty in the means than estimated in primary studies while using more data Themeta-analysis reports mean values with 95 confidence intervals We set normal priors overthe mean delays with mean equal to the reported mean in the meta-analysis and standarddeviation set such that the prior 95 credible interval includes the 95 confidence intervalsfrom the meta-analysis For example the meta-analysis reports an expected mean delayfrom symptoms to death of 1671 days with a 95 credible interval ranging from 1537 to1817 We thus assume that the prior is normal with micro= 1671 and σ= 075 which ensuresthat its 95 credible interval ranges from 1525 to 1817 strictly including the intervalreported in the meta analysis to be conservative Priors are truncated at zero

Furthermore we account for the uncertainty in the standard deviation of these delay dis-tributions by placing normal priors based on primary studies following the same methodas above (Table A4) For the standard deviation of the generation interval we use patientdata from Feretti et al14 and reproduce their method fitting a Gamma distribution to thegeneration time

Finally the distributional forms of the delay distributions are taken from the above primarystudies

The delay from infection to case confirmation is the sum of the incubation period and thesymptom-onset-to-case-confirmation delay Similarly the delay from infection to death isthe sum of the incubation period and the symptom-onset-to-death delay However forcomputational tractability we model the total delay distributions converting the priors de-scribed above to priors over the total delays We assume that the total delays follow negativebinomial distributions which are computationally tractable and empirically provide a closefit to both sum distributions We place normal priors over the mean and dispersion parame-ters of these total delay distributions by bootstrapping For example we compute priors forthe total infection-to-case-confirmation delay distribution by sampling Nbootstrap = 250 in-

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

cubation periods and symptom-onset-to-case-confirmation distributions For each sampledpair of distributions we draw 106 samples from their sum and fit a negative binomial usingmoment matching We compute the mean and standard deviation of the negative binomialdistribution parameters across the bootstrap and use this for our prior The resulting priorsare given in Table A2

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table A2 Epidemiological parameter distributions used in the model The details on sources and priorsused to generate this Table are given in Tables A3 and A4

Delay type Sources Prior on mean Prior on standard Distributionaldeviation or formdispersion

Infection to case Fonfria et al13 N (1092σ= 094) Dispersion Negativeconfirmation Lauer et al15 N (541σ= 027) binomial

Cereda et al16

Infection to death Fonfria et al13 N (2182σ= 101) Dispersion NegativeLauer et al15 N (1426σ= 518) binomialLinton et al17

Generation interval Fonfria et al13 N (506σ= 033) SD N (211σ= 05) GammaFeretti et al14

Table A3 Mean of epidemiological parameters all from meta-analysis13 and distributional forms

Delay type Reported mean Prior on mean Distributionaland 95 CI form of delay

Incubation period 579 (548ndash611) logmicrosimN (153σ= 0051) Lognormal15

Onset to death 1671 (1537ndash1817) N (1671σ= 075) Gamma17

Onset to case confirmation 582 (474ndash715) N (582σ= 068) Negative binomial16

Generation interval 506 (450ndash570) N (506σ= 033) Gamma14

Table A4 Standard deviation or dispersion of epidemiological parameters

Delay type Source Reported standard deviation or Prior on standarddispersion with uncertainty deviation or

dispersionIncubation period Lauer et al15 SD of log delay SD of log delay

048 (95 CI 027ndash054) N (048σ= 00759)Onset to death Linton et al17 SD 69 (95 CI 52ndash91) N (69σ= 1122)Onset to case confirmation Cereda et al16 Dispersion 157 plusmn 0054 N (157σ= 0054)Generation interval Feretti et al14 SD 211 plusmn 05 (reproduced by us) N (211σ= 05)

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking its predictions on unseendata even if prediction is not the purpose of the model1018 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries we have strong reason to doubt itseffectiveness estimates However we do not expect NPI effectiveness models to extrapolateperfectly Almost always unobserved factors such as changes in the ascertainment rate orIFR spontaneous behaviour changes or unobserved NPIs will affect the observed numberof cases and deaths Our models ought to treat these factors as noise and not attribute theireffects on Rt to the observed NPIs

We use Leave-One-Out Cross-Validation (LOO-CV) fitting the model on 40 countries andextrapolating to the excluded country We repeat this process for all 41 countries In theexcluded country the first 14 days of case and death data are observed to estimate R0c aswell as the initial outbreak sizes N (C )

0c and N (D)0c All later days are unobserved (masked)

The model uses the effectiveness estimates inferred from the 40 included countries and NPIactivation dates to predict the course of the pandemic in the excluded country

Figure B7 visually shows extrapolations obtained from LOO-CV for a randomly chosensubset of six countries (all countries are shown in Figures C18 to C21) The predictiveaccuracy of the model as expected degrades with an increasing prediction horizon sug-gesting the presence of unobserved factors However the predictive credible intervals arewide and increase over time as noise in the growth rate accumulates Importantly ourmodel makes calibrated forecasts in excluded countries (Fig B8) over periods of up to 2months

These are challenging predictions to the best of our knowledge no related study analyseshow their estimated NPI effects generalise to unseen countries Most related studies donot validate predictions on unseen data at all19ndash23 reviewed in9 Flaxman et al1 hold outthe last 14 days from 20 April to 4th of May for all countries in aggregate However thecountries they study implemented all NPIs in March or earlier (Supplementary Table 2 in1)meaning that in fact all countries were used to estimate NPI effects

Note that Fig B8 includes the 6 countries that were used to select the hyperparameter(Appendix A) However the calibration is similar if these 6 countries are excluded (Fig-ure C23)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals the extent to which results depend on uncertain parametersand modelling choices and can diagnose model misspecification and excessive collinearity

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

100

101

102

103

104

105

106

便便

Slovenia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

便 便

Greece

Figure B7 Model extrapolations on a randomly chosen subset of 6 excluded countries produced usingleave-one-out cross-validation In each excluded country the first 14 days of death and case data (soliddots) are observed the model then extrapolates to all future days within the period of analysis (emptydots) For each country we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier see Methods) The shaded areas representthe 95 credible intervals

in the data24 We vary many of the components of our model and recompute the NPIeffectiveness estimates summarised here Further analysis can be found in Appendix C

For each sensitivity analysis we quantify sensitivity using an easily interpretable measureshown above the legend of each plot the average standard deviation denoted σ Firstfor each NPI we compute the standard deviation of its median effect estimates across allexperimental conditions in the given sensitivity analysis We then average these acrossthe NPIs Since effectiveness is expressed in percentage reductions of Rt the unit of σ ispercent Zero percent corresponds to no sensitivity

We perform a multivariate sensitivity analysis on the key epidemiological priors For com-putational reasons all other sensitivity analyses are univariate

Multivariate sensitivity to main epidemiological priors The key epidemiological param-eters in our model describe the delay from infection to case-confirmation the delay from

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

0 20 40 60 80 100Credible Interval

0

20

40

60

80

100

Em

piric

al C

umul

ativ

e Fr

eque

ncy

Calibration Country Holdouts

Figure B8 Calibration in held-out countries with leave-one-out cross validation In each excludedcountry the first 14 days of death and case data are observed to estimate R0c as well as the initialoutbreak sizes the model then extrapolates to all future days within the period of analysis The plotshows the percentage of observed daily case and death counts that lie within the X credible intervalacross all countries and days The identity line represents ideal calibration

infection to death and the generation interval Our model places prior distributions overthe means of these delay distributions Since the priors already account for uncertainty inthe delays this analysis considers what would happen if our prior beliefs changed Sincethe epidemiological parameters may interact in unexpected ways we perform a multivari-ate sensitivity analysis jointly changing the mean of the prior over the mean of each delaydistribution (referred to as the prior mean of the mean below) We vary the prior means ofthe mean delays across a wide but epidemiologically plausible range of 5 values resultingin 125 different experimental conditions We present these results in two ways

First Figure B9 shows the global sensitivity of our results to the prior mean of the mean ofeach distribution individually For example for each setting of the prior mean of the meangeneration interval we show the average over the 5times5 = 25 runs with different prior meansplaced over the mean delays between infection and case confirmationdeath This is equiv-alent to placing a uniform prior across the 125 experiment conditions and marginalisingand is standard methodology for computing global sensitivity to an individual parameter

The prior means seem to mostly influence the relative effectiveness of business closuresand gathering bans Increasing the prior means of the infection-to-case-confirmationdeathmean delays results in larger effectiveness estimates for gatherings bans and smaller esti-

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

mates for business closures while increasing the prior mean of the generation interval hasthe opposite effect

Second we assess the sensitivity of our results to jointly varying the prior means This yieldsσ= 246 We present this result graphically in Appendix C1

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Infection to Case-Confirmation Delay= 112Mean 792 daysMean 942 daysMean 1092 daysMean 1242 daysMean 1392 days

-25 0 25 50 75Average reduction in Rtin the context of our data

Infection to Death Delay= 080Mean 1782 daysMean 1982 daysMean 2182 daysMean 2382 daysMean 2582 days

-25 0 25 50 75Average reduction in Rtin the context of our data

Generation Interval= 149Mean 306 daysMean 406 daysMean 506 daysMean 606 daysMean 706 days

Figure B9 Sensitivity to key epidemiological parameter choices evaluated using a multivariate analysisWe jointly vary the prior means over the mean of the generation interval the delay between infectionand case confirmation and the delay between infection and death Each panel displays sensitivity to theprior mean of one distribution and the NPI effects are computed by averaging over the 25 runs withdifferent prior means of the two other distributions (see text)

Sensitivity to country exclusions Figure B10 shows results if one country at a time isexcluded from the data As there is no strong justification for including or excluding anyparticular country results ought to be stable if a country is excluded

-25 0 25 50 75

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

= 151DefaultAlbaniaAndorraAustriaBelgiumBosnia andHerzegovinaBulgariaCroatia

-25 0 25 50 75

= 151DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-25 0 25 50 75

= 151DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

= 151DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-25 0 25 50 75Average reduction in Rtin the context of our data

= 151DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-25 0 25 50 75Average reduction in Rtin the context of our data

= 151DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure B10 Sensitivity to excluding individual countries

Sensitivity to case-undercounting Figure B11 shows results when scaling the recordednumber of cases in each country First we correct the number of reported cases on each day

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

using estimates of the time-varying ascertainment rate from Golding et al25 For example ifthe estimated ascertainment rate in country c at time t is 50 we double the correspondingnumber of reported cases With this altered case data the model estimates a larger effectfor closing schools and universities and a smaller effect for limiting gatherings to 1000people or less There is little change in the effects estimated for the other NPIs

Second we randomly either double or halve the reported number of cases in each countryThis is intended to demonstrate that the model is invariant to constant differences in ascer-tainment between countries As expected there are negligible differences compared to thedefault results

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Case Undercounting= 163

Time-Varying ScalingRandom Constant ScalingDefault

Figure B11 Sensitivity to case undercounting considering both constant undercounting and time-varying undercounting

Sensitivity to structurally different models Figure B12 shows the NPI effectivenessestimates from models that use only cases or deaths as observations in contrast to ourmain model which uses both As expected there are some differences in the NPI effectestimates between the models which may be caused by the unique biases present in casedata and in death data Differences could also occur if NPIs differentially affected casesand deaths (eg if they affect different age groups) Nevertheless the studyrsquos qualitativeconclusions as outlined in the Discussione hold across the different data types

A number of implicit structural assumptions are made by the choice of model structureWe test sensitivity to these assumptions by evaluating NPI effectiveness estimates from al-ternative models reproducing the structural sensitivity analysis from our concurrent workwhere these models are described and compared in detail9 While these alternative modelstructures are also plausible we chose the model described in the main text as it is relativelysimple whilst also being highly robust9

eThe main qualitative conclusions as outlined in the Discussion are Stay-at-home orders had a small effectMandating mask-wearing had a small effect School and university closures had a large effect Gatherings banswere effective More strict gathering bans were more effective than less strict ones Business closures wereeffective Closing most nonessential businesses had limited benefit over closing just high-risk businesses

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Data Type

= 347Only Case DataOnly Death DataDefault

Figure B12 Sensitivity to different data types ie models using only confirmed cases only deaths orboth

The models are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting Rt into a daily growth rate a renewalprocess is used as the infection model as in earlier work1826ndash28 For computationalreasons we do not place a prior distribution over the generation interval parametersfor this model

3 Noisy-R Model The noise terms ε(middot)ct affect Rt rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on Rt The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

As Figure B13 (left) shows the multiplicative effect models support almost all conclusionsdrawn in the Discussione The only exception is that the stay-at-home order NPI is cate-gorised as moderately effective under the Different Effects Model (median reduction in Rt

of 18)

The results of the Additive Effects Model cannot be directly compared to the other modelssince it expresses results on a different scale (percentage reduction in R0 instead of Rt ) Itis therefore shown in a separate panel However the trend in the results is visually similarto the default results

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs which were implemented nor directly measures broaderbehavioural changes Since these factors influence Rt we must be wary of their effectbeing attributed to observed NPIs We investigate this further by assessing how much effec-

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

Structural Sensitivity Multiplicative Models= 311

Noisy-RDiscrete Renewal

Different EffectsDefault

-25 0 25 50 75Average reduction in Rtas a percentage of R0

in the context of our data

Structural Sensitivity Additive Model

Figure B13 Structural sensitivity analysis NPI effectiveness estimates under different structural as-sumptions Left Effectiveness estimates for multiplicative effect models Note that for computationalreasons the discrete renewal model does not have a prior over the generation interval Right Effective-ness estimates for an additive effect models For this model the effectiveness of each NPI is expressedas a reduction of R0 rather than Rt

tiveness estimates change when previously unobserved factors are included and also whenobserved factors are excluded This is best practice for addressing unobserved factors suchas confounders2930

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs31 The timings of our observed NPIsrsquo implementation dates are indeed mutuallycorrelated prompting the question of how much results change when we make previouslyobserved NPIs unobserved Figure B14 (left) shows NPI effectiveness estimates when pre-viously observed NPIs are excluded in turn The studyrsquos main qualitative conclusionse holdacross all experimental conditions and the variations in NPI effectiveness are rarely largeenough to cause a change in effectiveness category (Figure 5) except for the Gatheringslimited to 1000 people or less NPI Considering that some of the excluded NPIs have strongestimated effects when included and are correlated with other NPIs this degree of robust-ness is encouragingly high It suggests that unobserved factors will not significantly biasresults as long as their effects and their correlations with the studied NPIs do not exceedthose of the studied NPIs We hypothesise that this robustness to unobserved factors is dueto the noise on the daily growth rate in our model9

In addition Figure B14 (right) shows NPI effectiveness estimates when we observe (iecontrol for) additional NPIs taken from the OxCGRT dataset32

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Inclusion of OxCGRT NPIs= 153

Travel ScreenQuarantineTravel BansSymptomatic TestingPublic Information CampaignsInternal Movement LimitedPublic Transport LimitedDefault

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

Exclusion of Collected NPIsUncombined Effects Shown

= 188Mask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay Home OrderDefault

Figure B14 Robustness to unobserved effects Left Results when excluding previously observed NPIsWe exclude one of the NPIs in turn and show the estimates for the other NPIs Note that this subplotshows the marginal effect for hierarchical NPIs rather than the total effect different to other figuresthat show cumulative effects for gathering bans and businesses closures For example Figure 3 displaysthe total effect of closing most nonessential businesses while here we show the additional effect ofclosing most nonessential businesses over just closing some high-risk businesses We show the additionaleffects here because the effect of a cumulative intervention would become undefined when part of it isexcluded from the analysis Right Results when controlling for previously unobserved NPIs We includeone additional NPI in turn and show the estimates for the NPIs in our study (the additional NPI is notshown)

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Appendix C1 Multivariate Sensitivity Analysis - Expanded Results

We now present the results of our joint multivariate sensitivity analysis (previously sum-marised in Figure B9) in which we vary the mean of the prior (ldquoprior meanrdquo) over themean of the generation interval the delay between infection and death and the delaybetween infection and case confirmation

Figure C15 shows for each NPI how NPI effects change when all prior means are variedsimultaneously

We find that the most sensitive NPIs are Gatherings limited to 1000 people or fewer Gath-erings limited to 100 people or fewer and Most nonessential businesses closed However thelargest differences appear when all of the parameters are set to the most extreme valuesthat jointly produce a change in a particular direction We also see systematic trends asthe parameters are varied eg increasing the prior mean of the generation interval meantends to decrease the effectiveness of the Gatherings limited to 1000 or fewer NPI while in-creasing prior means of the infection to deathcase confirmation distribution means tendsto increase the effectiveness of this NPI

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

792

942

1092

1242

1392

Mas

kWea

ring

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

GI = 306

-150

-75

00

75

150GI = 406

-150

-75

00

75

150GI = 506

-150

-75

00

75

150GI = 606

-150

-75

00

75

150GI = 706

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

00

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

0

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Som

eBus

s

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Mos

tBus

s

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Educ

at

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

792

942

1092

1242

1392

Stay

Hom

e

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

Figure C15 Global sensitivity analysis Each cell shows the difference between the median effectivenessestimate under a specific experimental condition and the default condition where the estimates areexpressed as percentage reduction in Rt Each subplot represents a particular prior mean of the meangeneration interval and an NPI The NPIs vary top to bottom and the generation interval prior meanincreases left to right Each cell with a subplot represents a particular choice of the prior mean of themean infection to death delay (increasing left to right) and the prior mean of the mean infection to caseconfirmation delay (increasing bottom to top) Positive cell values indicate that the median NPI estimateis larger than with default settings and vice versa

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C2 Sensitivity to additional epidemiological assumptions

For completeness Figure C16 shows sensitivity to two further epidemiological assumptionsOn the left we show the dependence of NPI effectiveness estimates on R0 We place aNormal prior over R0 and a hyperprior over its variance reflecting the wide disagreementof regional estimates of R0 In the sensitivity analysis we vary the mean of the prior froma low value (238) to a high one (428) As expected higher mean R0 values result in largerNPI effects This trend is apparent in all NPIs but stronger for NPIs that tended to beimplemented early on (like school closures and gathering bans)

On the right we vary the prior over NPI effectiveness Our default prior on NPI effectivenessis assymmetric reflecting a belief that NPIs are more likely to reduce Rt than increase itHere we additionally test a symmetric Normal prior and a one-sided Half-Normal priorwhich does not allow for NPIs to increase Rt

Appendix C3 Sensitivity to data preprocessing

When the case count is small a large fraction of cases may be imported from other coun-tries and the testing regime may change rapidly To prevent this from biasing our modelwe neglect case numbers before a country has reached 100 confirmed cases Similarly weneglect death numbers before a country has reached 10 deaths Here we vary these thresh-olds Intuitively the minimum cases threshold should be higher than the minimum deathsthreshold

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

R0 prior= 333

[R] = 238[R] = 278

Default[R] = 328[R] = 378[R] = 428

-25 0 25 50 75Average reduction in Rtin the context of our data

NPI Effectiveness Prior= 137

i Normal(0 022)i Half Normal(022)

Default

Figure C16 Sensitivity to additional epidemiological priors Left Sensitivity to the prior mean on R0Right Sensitivity to the NPI effectiveness prior

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Minimum Cases Threshold= 137

3050Default (100)200300

-25 0 25 50 75Average reduction in Rtin the context of our data

Minimum Deaths Threshold= 102

15Default (10)3050

Figure C17 Data preprocessing sensitivity Left Variations in the minimum number of cases RightVariations in the minimum number of deaths

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Validation by predicting unseen data - all countries

Here we show the country-level plots of the single-country holdout experiments from Ap-pendix B1 We use leave-one-out cross-validation fitting the model on 40 countries andshowing its predictions on the excluded country We repeat this process for all 41 countriesIn the excluded country the first 14 days (filled dots) of death and case data are observedto allow roughly inferring R0 These days are not used to infer NPI effectiveness All laterdays are masked (unfilled dots) The activation dates of NPIs are also given and the modeluses the effectiveness estimates inferred from the 40 other countries

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

北便

France

Figure C18 Predictions on excluded countries

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

北 便

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

Malta

Figure C19 Predictions on excluded countries

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

Slovenia

Figure C20 Predictions on excluded countries

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Switzerland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

United Kingdom

Figure C21 Predictions on excluded countries Vertical lines show the activation (or inactivation) datesof NPIs Shaded areas are 95 credible intervals Blue and red dots show the observed confirmed casesand deaths while blue and red lines show the median model estimates of cases (Ct ) and deaths (Dt )Empty dots are not observed by the model For each country we show the full window of analysis (fromthe start of the epidemic until the first NPI was lifted or the 30th of May 2020 whichever was earliersee Methods)

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Posterior predictive distributions

The posterior predictive distribution (Figure C22) shows the predicted number of cases anddeaths after observing the data Although these curves can be called lsquofitsrsquo the degree of fitto the data must be interpreted with great care The fit is generally tight but this is partlydue to the inferred latent noise variables ε(C )

t and ε(D)t Inferring this latent noise allows

the posterior predictive distribution to closely match the data without overfitting the effec-tiveness parameters to the data Such behavior is common in Bayesian models which oftenperfectly interpolate the data without overfitting33 The noise terms can account for peri-ods where infections grew faster or slower than predicted based solely on the active NPIsIn such periods the noise may account for changes in testing reporting and unobservedinterventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Daily DeathsDaily Confirmed Cases

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C22 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 Calibration without countries used for hyperparameter selec-tion

0 20 40 60 80 100Credible Interval

0

20

40

60

80

100E

mpi

rical

Cum

ulat

ive

Freq

uenc

y

Calibration Country Holdouts

Figure C23 Calibration in held-out countries with leave-one-out cross-validation Here we include onlythose 35 countries that were not used to select the hyperparameter We show the first 14 days of casesand deaths in each country and then extrapolate to future days within the period of analysis The plotshows the percentage of observed daily case and death counts that lie within the X credible intervalacross all countries and days

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 MCMC stability results

1000 1002 10040

1000

2000

3000

4000

R

0 1 2 30

500

1000

1500

2000

Relative ESS

Figure C24 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size A value of 1 indicates perfect decorrelation between samplesValues above (below) 1 indicate that the effective number of samples is higher (lower) than the actualnumber of samples due to negative (positive) correlation respectively

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D5 Estimated values for R0 by country The parenthesis give the 95 credible interval which isoften wide The mean R0 across countries is 33

Country Estimated R0 Country Estimated R0

Albania 348 (275431) Lithuania 323 (248403)Andorra 275 (21434) Malaysia 296 (236361)Austria 31 (25377) Malta 327 (248414)Belgium 36 (302427) Mexico 399 (32748)Bosnia and Herzegovina 331 (259406) Morocco 359 (299427)Bulgaria 375 (304457) Netherlands 316 (26375)Croatia 342 (27418) New Zealand 241 (17731)Czech Republic 346 (276425) Norway 273 (215338)Denmark 28 (22347) Poland 397 (32482)Estonia 291 (229361) Portugal 355 (292425)Finland 279 (221343) Romania 401 (335475)France 331 (28389) Serbia 38 (308461)Georgia 356 (281439) Singapore 295 (238356)Germany 285 (231346) Slovakia 353 (271445)Greece 321 (261388) Slovenia 299 (23373)Hungary 403 (332482) South Africa 447 (377527)Iceland 171 (113242) Spain 36 (306423)Ireland 368 (309433) Sweden 229 (176288)Israel 379 (309459) Switzerland 289 (234351)Italy 336 (288391) United Kingdom 32 (267378)Latvia 301 (233376)

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 Collinearity

Appendix D21 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the inclusion or exclusion ofthese countries in the dataset (Figure D25) If we included all countries we would con-clude that university closures were more effective than school closures (black markers inFigure D25) However if we excluded Iceland or Sweden we would conclude that theywere roughly equally effective As there is no strong justification for including or excludingone particular country we cannot meaningfully disentangle the effects of school and uni-versity closures However there is much more data to determine the joint effect and it isindeed much more stable (Figure D25)

-25 0 25 50 75 100Average reduction in Rtin the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D25 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown

Appendix D22 Co-occurrence of NPIs

Table D6 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or neither) Note thatwe do not show the traditional collinearity statistics (variance inflation factors and datacorrelations) since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D6 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D23 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations24 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus042 between closing schools anduniversities and closing some businesses (Figure D26) The weak correlations are oneindicator that collinearity is manageable with our dataset

Effect on NPI combinations To better understand posterior correlations we visualizetheir effect in hosted video files As we condition on different values for one NPI we cansee that the estimates of other NPIs change only slightly always staying well within thecredible intervals in Figure 3 The significance of posterior correlations is small enough thatit is possible to calculate a reasonable approximation to the mean effect of a set of NPIs bysimply combining the mean percentage reductions for each individual NPI (eg two 50reductions lead to a 75 reduction) For example this approximation leads to a joint effectof 75 for all NPIs together which closely matches the exact mean joint effect of 77

Videos are available online here

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D26 Posterior correlations between effectiveness parameters αi

Appendix D3 Posterior Epidemiological Parameter Distributions

Figure D27 shows the posterior distributions for variables describing the key delay distribu-tions in our model the generation interval the delay between infection and case confirma-tion and the delay between infection and death We find that the posterior distributions ofthese parameters are somewhat tighter than their priors but they still explore a wide rangeof values This suggests that the data provides evidence albeit weak about these delaydistributions (for example from visual inspection the data would show that the delay islonger for deaths than for cases) An exception is the dispersion of the infection to deathdelay distribution which has a very wide prior

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

5 6

0

1

dens

ity

Generation Interval Mean

posteriorprior

200 225

00

05

Infection to Death Delay Mean

100 125

00

05

Infection to Case-Confirmation Delay Mean

0 2 4

value

00

05

dens

ity

Generation Interval std

10 20

value

00

01

Infection to Death Delay disp

5 6

value

0

1

Infection to Case-Confirmation Delay disp

Figure D27 Posterior distributions over key epidemiological parameters describing the generation in-terval and the delays between infection and case confirmationdeath

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect morethan three-quarters of the population) We thus miss NPIs which were only implementedregionally For example a few regions in Germany implemented stay-at-home orders butmost did not Thus Germany is listed as no stay-at-home order in our data Additionallyour NPI definitions were not perfectly granular For example a gathering ban on gatheringsof gt15 people and a ban on gatherings of gt60 people would both fall under the NPIGatherings limited to 100 people or less despite likely having different effects on Rt Finally while we included more NPIs than previous work (Table F7) there are many NPIsfor which we were not able to collect enough high-quality data for our modeling such aspublic cleaning or changes to public transportation

Of the 41 countries in our dataset 33 are in Europe As a result the NPI effectivenessestimates may be biased towards effects in Europe and NPI effectiveness may have beendifferent in other parts of the world

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on Rt is constantacross countries and time However the exact implementation and adherence of each NPIis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa limited degree Additionally different countries have different cultural norms and ageprofiles affecting the degree to which a particular intervention is effective For example acountry where a higher proportion of the population is in education will likely experience alarger effect from a government order to close schools and universities Our estimates thusshould be adjusted to local circumstances To address differences between countries ourstructural sensitivity analysis includes a model where each NPI can have a different effectper country (Appendix B2) The average effectiveness estimates across countries in thismodel match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponential de-cline would have the same effect on observations as an unobserved NPI Consequently wecannot quantify its effect on our results (though the sensitivity analyses look reassuring)

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expectthat closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g t and the reproductive number Rt holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso report results from a renewal process model8 that does not depend on this assumptionand we find similar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix F Overview of previous work

Table F7 Data-driven multi-country multi-NPI studies of the effectiveness of observed (as opposed tohypothetical) NPIs in reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Banholzer etal 202023

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Chen andQiu 202021

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Islam et al202020

School closuresworkplace closuresrestrictions on massgatherings publictransport closure

lockdown

149 countries or regionsInterrupted timeseries regression

Liu et al202022

13 NPIs from theOXCGRT dataset

130 countries and regions panel regression

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F8 Some data-driven studies of the effectiveness of observed NPIs in reducing the transmissionof COVID-19 which study a single-country andor a single-NPI

Study NPIs studiedRegionscountries

studiedMethod

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dandekarand

Barbastathis202035

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Dehning etal 202036

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Gatto et al202037

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Hsiang et al202019

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US (each

country modelledindividually)

Linear regression onestimated growth

rates

Jarvis et al202038

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelKraemer etal 202039

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202040 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) model

Lai et al202041

Case detection andisolation travel

restrictions contactreductions

ChinaTravel-network-based

SEIR model

Continued on next page

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F8 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Lorch et al202042

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Maier andBrockmann

202043

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Orea andAacutelvarez202044

Lockdown SpainSpatial econometric

analysis

Quilty et al202045

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

Sears et al202046

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Siedner etal 202047

General socialdistancing

USInterruptedtime-series

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F7) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

61

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

62

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all ldquotrade activitiesrdquo (except within shopping malls) service jobsthat donrsquot involve physical contact museums libraries and galleries opened Sincethe criteria regarding ldquoall retail stores being openrdquo was met we counted April 27 asthe end date for ldquoMost nonessential businesses closedrdquo

63

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H All model equations

This section will mainly be of interest to readers that wish to re-implement the model

Variables are indexed by NPI i country c and day t All prior distributions are independent

Data

1 NPI Activations φi t c isin 012 Observed (Daily) Cases Ct c 3 Observed (Daily) Deaths D t c

Prior Distributions

1 Country-specific R0 R0c sim Normal(325κ) κsim Half Normal(micro= 0σ= 05)2 NPI effectiveness αi sim Asymmetric Laplace(m = 0κ= 05λ= 10) m is the location

parameter κgt 0 is the asymmetry parameter and λgt 0 is the scale parameter3 Infection Initial Counts

N (C )0c = exp(ζ(C )

c )

N (D)0c = exp(ζ(D)

c )

ζ(C )c sim Normal(micro= 0σ= 50)

ζ(D)c sim Normal(micro= 0σ= 50)

4 Observation Noise Dispersion Parameters

Ψcases sim Half Normal(micro= 0σ= 5) (H1)

Ψdeaths sim Half Normal(micro= 0σ= 5) (H2)

Hyperparameters

1 Growth Noise Scale σg = 02

Delay Distributions

1 Generation interval distribution1314

microGI sim Normal(micro= 506σ= 03265)

σGI sim Normal(micro= 211σ= 05)

64

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

2 Time from infection to case confirmation T (C )131516f

microinfrarrconf sim Normal(micro= 1092σ= 094)

Ψinfrarrconf sim Normal(micro= 541σ= 027)

This distribution is converted into a forward-delay vector

T (C )[t ] =

1ZC

Negative Binomial(t micro=microinfrarrconfα=Ψinfrarrconf) t lt 32

0 otherwise

with ZC =31sum

t prime=0Negative Binomial(t primemicro=microinfrarrconfα=Ψinfrarrconf)

ie the delay follows a truncated and normalised negative binomial distribution3 Time from infection to death T (D)f131517

microinfrarrdeath sim Normal(micro= 2182σ= 101)

Ψinfrarrdeath sim Normal(micro= 1426σ= 518)

This distribution is converted into a forward-delay vector

T (D)[t ] =

1ZD

Negative Binomial(t micro=microinfrarrdeathα=Ψinfrarrdeath) t lt 48

0 otherwise

with ZD =47sum

t prime=0Negative Binomial(t primemicro=microinfrarrdeathα=Ψinfrarrdeath)

ie the delay follows a truncated and normalised negative binomial distribution

Infection Model

Rt c = R0c middotexp

(minus

Isumi=1

αi φi t c

) where I is the number of NPIs

αGI =microGI

σ2GI

βGI =micro2

GI

σ2GI

g t c = exp

(βGI(R

1αGIct minus1)

)minus1

fα in the definition of the Negative Binomial distribution is the dispersion parameter Larger values of αcorrespond to a smaller variance and less dispersion With our parameterisation the variance of the Negative

Binomial distribution is micro+ micro2

α

65

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

]

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise

ε(C )τc sim Normal(micro= 0σ=σg )

ε(D)τc sim Normal(micro= 0σ=σg )

Observation Modelf

Ct c =31sumτ=0

N (C )tminusτcT

(C )[τ]

D t c =47sumτ=0

N (D)tminusτcT

(D)[τ]

Ct c sim Negative Binomial(micro= Ct c α=Ψcases)

D t c sim Negative Binomial(micro= D t c α=Ψdeaths)

66

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix I References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Sam Abbott et al ldquoEstimating the time-varying reproduction number of SARS-CoV-2using national and subnational case countsrdquo In Wellcome Open Research 5112 (2020)p 112

3 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

4 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

5 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

6 Sheikh Taslim Ali et al ldquoSerial interval of SARS-CoV-2 was shortened over time bynonpharmaceutical interventionsrdquo In Science 3696507 (2020) pp 1106ndash1109

7 Xingjie Hao et al ldquoReconstruction of the full transmission dynamics of COVID-19 inWuhanrdquo In Nature 5847821 (Aug 2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the robustness of effectiveness estimation of nonpharma-ceutical interventions against COVID-19 transmissionrdquo In arXiv preprint arXiv200713454(2020) URL httpsarxivorgabs200713454

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

12 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

13 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

14 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

67

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

15 Stephen A Lauer et al ldquoThe incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases estimation and applicationrdquo In Annals ofinternal medicine 1729 (2020) pp 577ndash582

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

19 Solomon Hsiang et al ldquoThe effect of large-scale anti-contagion policies on the COVID-19 pandemicrdquo In Nature 5847820 (2020) pp 262ndash267

20 Nazrul Islam et al ldquoPhysical distancing interventions and incidence of coronavirusdisease 2019 natural experiment in 149 countriesrdquo In bmj 370 (2020)

21 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo In Covid Economics 7 (Apr 2020) pp 46ndash67

22 Yang Liu et al ldquoThe impact of non-pharmaceutical interventions on SARS-CoV-2 trans-mission across 130 countries and territoriesrdquo In medRxiv (2020)

23 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

24 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

25 Nick Golding et al ldquoReconstructing the global dynamics of under-ascertained COVID-19 cases and infectionsrdquo In medRxiv (2020) DOI 1011012020070720148460eprint httpswwwmedrxivorgcontentearly202007082020070720148460fullpdf URL httpswwwmedrxivorgcontentearly202007082020070720148460

26 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

27 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

28 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

68

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

29 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

30 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

31 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

32 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

33 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

36 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

37 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

38 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

39 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

40 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

41 Shengjie Lai et al ldquoEffect of non-pharmaceutical interventions to contain COVID-19 inChinardquo en In Nature 5857825 (Sept 2020) pp 410ndash413 DOI 101038s41586-020-2293-x

69

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

42 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

43 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

44 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

45 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

46 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

47 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

70

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Testing reporting and infection fatality rates
      • Method for uncertainty over key epidemiological parameters
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Multivariate Sensitivity Analysis - Expanded Results
              • Sensitivity to additional epidemiological assumptions
              • Sensitivity to data preprocessing
              • Validation by predicting unseen data - all countries
              • Posterior predictive distributions
              • Calibration without countries used for hyperparameter selection
              • MCMC stability results
                • Additional results
                  • Estimated R0 by country
                  • Collinearity
                    • The individual effects of school and university closures
                    • Co-occurrence of NPIs
                    • Correlations between effectiveness estimates
                      • Posterior Epidemiological Parameter Distributions
                        • Additional discussion of assumptions and limitations
                          • Limitations of the data
                          • Model limitations
                            • Overview of previous work
                            • Handling edge cases in the data collection
                            • All model equations
                            • References
Page 13: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan

for several alternative models (structural sensitivity10) and examine possible confoundingof our estimates by unobserved factors influencing R In total we consider 201 alternativeexperimental conditions

Figure 5 (left) shows the median NPI effectiveness across these 201 experimental condi-tions Compared to the results under our default settings (Figures 3 and 4) median NPIeffects vary under alternative plausible experimental conditions However the trends in theresults are robust and some NPIs outperform others under all tested conditions While wetest over large ranges of plausible values our experiments do not include every possiblesource of uncertainty and the results might change more substantially under experimentalconditions we have not tested

We categorise NPI effects into small moderate and large which we define as a medianreduction in R of less than 175 between 175 and 35 and more than 35 (verticallines in Figure 5) Six of the NPIs fall into a single category in a large fraction of experi-mental conditions school and university closures are associated with a large effect in 99of experimental conditions limiting gatherings to 10 people or less in 94 Closing mostnonessential businesses has a moderate effect in 96 of conditions limiting gatherings to100 people or less in 97 Making mask-wearing mandatory in (some) public spaces fallsinto the ldquosmall effectrdquo category in 100 of experimental conditions issuing stay-at-homeorders (with exemptions) in 99 Two NPIs fall less clearly into one category Closing some(high-risk) businesses has a moderate effect in 86 of conditions and limiting gatherings to1000 people or less has a small effect in 79 However both NPIs have small-to-moderateeffects in more than 99 of experimental conditions The effect of limiting gatherings to1000 people or less is the least stable across the sensitivity analyses which may reflect itsaforementioned partial collinearity with limiting gatherings to 100 people or less

Aggregating all sensitivity analyses can hide sensitivity to specific assumptions We displaythe median NPI effects in four categories of sensitivity analyses (Figure 5 right) and eachindividual sensitivity analysis is shown in the Appendix The trends in the results are alsostable within the categories

Discussion

We use a data-driven approach to estimate the effects that eight nonpharmaceutical in-terventions had on COVID-19 transmission in 41 countries between January and the endof May 2020 We find that several NPIs were associated with a clear reduction in R inline with the mounting evidence that NPIs can be effective at mitigating and suppressingoutbreaks of COVID-19 Furthermore our results suggest that some NPIs outperformedothers While the exact effectiveness estimates vary with modelling assumptions the broadconclusions discussed below are largely robust across 201 experimental conditions in 11sensitivity analyses

13

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

0 25 50Median reduction in Rt

Mask-wearing mandatory in(some) public spacesGatherings limited to1000 people or lessGatherings limited to100 people or lessGatherings limited to10 people or lessSome businessesclosedMost nonessentialbusinesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

All Sensitivity Analyses (201 conditions)北便

Model Structure(5 conditions)

北便

Data and Preprocessing(51 conditions)

0 25 50Median reduction in Rt

北便

Epidemiological Priors(131 conditions)

0 25 50Median reduction in Rt

北便

Unobserved Factors(14 conditions)

Figure 5 Median NPI effects across the sensitivity analyses Left Median NPI effects (reduction in R)when varying different components of the model or the data in 201 experimental conditions Results aredisplayed as violin plots using kernel density estimation to create the distributions Inside the violinsthe box plots show median and interquartile-range The vertical lines mark 0 175 and 35 (seetext) Right Categorised sensitivity analyses Structural Using only cases or only deaths as observations(2 experimental conditions Figure B12) varying the model structure (3 conditions Figure B13 left)Data Leaving out one country at a time (41 conditions Figure B10) varying the threshold below whichcases and deaths are masked (8 conditions Figure C17) sensitivity to correcting for undocumentedcases and to country-level differences in case ascertainment (2 conditions Figure B11) Epidemiologicalpriors Jointly varying the means of the priors over the means of the generation interval the infection-to-case-confirmation delay and the infection-to-death delay (125 conditions Figure C15) varying theprior over R0 (4 conditions Figure C16 left) varying the prior over NPI effectiveness (2 conditionsFigure C16 right) Unobserved factors Excluding observed NPIs one at a time (9 conditions Figure B14left) controlling for additional NPIs from a different dataset (5 conditions Figure B14 right)

14

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Business closures and gathering bans both seem to have been effective at reducing COVID-19 transmission Closing only high-risk businesses appears to have been only somewhat lesseffective than closing most nonessential businesses the median reduction in R differs onlyby 9 (6ndash12 mean and 95 interval of median estimates across the experiments set-tings in Figure 5) Closing only high-risk businesses may thus have been the more promisingpolicy option in some circumstances Limiting gatherings to 10 people or less was more ef-fective than limits up to 100 or 1000 people This may reflect the fact that small gatheringsare common

As previously discussed we estimate the average effect each NPI had in the contexts inwhich it was implemented When countries introduced stay-at-home orders they nearly al-ways also banned gatherings and closed schools universities and some or most businessesif they had not done so already (Figure 3 bottom left) Flaxman et al1 and Hsiang et al3

add the effect of these distinct NPIs to the effectiveness of stay-at-home orders and accord-ingly find a large effect In contrast we account for these other NPIs separately and isolatethe additional effect of ordering the population to stay at home (when large gatheringsare banned and educational institutions and some businesses closed)d In accordance withother studies that took this approach we find a small effect26 A typical country could havereduced R to below 1 without a stay-at-home order (Figure 4) provided other NPIs wereimplemented

Mandating mask-wearing in various public spaces had no clear effect on average in thecountries we studied This does not rule out mask-wearing mandates having a larger ef-fect in other contexts In our data mask-wearing was only mandated when other NPIshad already reduced public interactions When most transmission occurs in private spaceswearing masks in public is expected to be less effective This might explain why a largereffect was found in studies that included China and South Korea where mask-wearingwas introduced earlier823 While there is an emerging body of literature indicating thatmask-wearing can be effective in reducing transmission the bulk of evidence comes fromhealthcare settings24 In non-healthcare settings risk compensation25 may play a largerrole potentially reducing effectiveness While our results cast doubt on reports that mask-wearing is the main determinant shaping a countryrsquos epidemic23 the policy still seemspromising given all available evidence due to its comparatively low economic and socialcosts Its effectiveness may have increased as other NPIs have been lifted and public inter-actions have recommenced

We find a large effect for school and university closures This finding is remarkably robustacross different model structures variations in the data and epidemiological assumptions(Figure 5) It remains robust when controlling for NPIs excluded from our study (Fig-ure B14) Our approach cannot distinguish direct and indirect effects such as forcing par-

dIn principle a stay-at-home order does not imply these other NPIs It is possible for a country to simultaneouslyhave allowed schools to be open and have a stay-at-home order (with exemptions for attending school) inplace However this is not how stay-at-home orders were implemented by the countries in our dataset andwe cannot estimate the effect of such a hypothetical stay-at-home order from the available data

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ents to stay at home or causing broader behaviour changes by increasing public concernAdditionally since school and university closures almost perfectly coincided in the coun-tries we study our approach cannot distinguish their individual effects (Appendix D21)This limitation likely also holds for other observational studies which do not include dataon university closures and estimate only the effect of school closures1ndash35ndash8 Previous evi-dence on school and university closures is mixed1626 Early data suggest that children andyoung adults are equally susceptible to infection but have a notably lower observed inci-dence rate than older adultsmdashwhether this is due to school and university closures remainsunknown27ndash30 Although infected young people are often asymptomatic they appear toshed similar amounts of virus as older people3132 and might therefore transmit the infec-tion to higher-risk demographics unknowingly As the role of children in transmission isstill unclear33 while outbreaks detected in schools are rising33ndash35 this topic merits carefulattention

Our study has several limitations First NPI effectiveness may depend on the context ofimplementation such as the presence of other NPIs and country-specific factors Our esti-mates must be interpreted as the average effectiveness over the contexts in our dataset10and expert judgement is required to adjust them to local circumstances Second R mayhave been reduced by unobserved NPIs or spontaneous behaviour changes To investigatewhether these reductions could be falsely attributed to the observed NPIs we perform sev-eral additional analyses and find that our results are stable to a range of unobserved effects(Appendix B3) However this sensitivity check cannot provide certainty Investigating therole of unobserved effects is an important topic to explore further Third our results can-not be used without qualification to predict the effect of lifting NPIs For example closingschools and universities seems to have greatly reduced transmission but this does not meanthat reopening them will cause infections to soar Educational institutions can implementsafety measures such as reduced class sizes as they reopen Further work is needed to anal-yse the effects of reopenings we hope that our collected data aids this effort Fourth whilewe included more NPIs than most previous work (Table F7) several promising NPIs wereexcluded For example testing tracing and case isolation may be an important part of acost-effective epidemic response36 but were not included because it is difficult to obtaincomprehensive data We discuss further limitations in Appendix E

Although our work focuses on estimating the impact of NPIs on the reproduction numberR the ultimate goal of governments may be to reduce the incidence prevalence and excessmortality of COVID-19 Controlling R is essential but the contribution of NPIs towards thesegoals may also be mediated by other factors such as their duration and timing37 periodicityand adherence3839 and successful containment40 While each of these factors addressestransmission within individual countries it can be crucial to additionally synchronise NPIsbetween countries since cases can be imported41

In conclusion as governments around the world seek to keep R below 1 while minimisingthe social and economic costs of their interventions we hope that our results can informpolicy decisions on which NPIs to implement in any potential further wave of infectionsAdditionally our work may provide insight on which areas of public life are most in need

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

of restructuring so that they can continue despite the pandemic However our estimatesshould not be seen as the final word on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies simulation studiesexperimental trials and clinical experience

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

The paid contractor work helping with the data collection the development of the interac-tive website and the costs for cloud compute were funded by the Berkeley Existential RiskInitiative

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier V Mikulik MAHartwick L Chindelevitch designed the model and modelling experiments M Sharma ABStephenson T Gavenciak J Salvatier performed and analysed the modelling experimentsJ Kulveit T Gavenciak JM Brauner conceived the research S Mindermann M SharmaJM Brauner L Chindelevitch J Kulveit T Besiroglu did the literature search JM BraunerS Mindermann M Sharma G Leech L Chindelevitch T Besiroglu V Mikulik wrote themanuscript All authors read and gave feedback on the manuscript and approved the finalmanuscript JM Brauner S Mindermann and M Sharma contributed equally L Chindele-vitch Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

3 Solomon Hsiang et al ldquoThe effect of large-scale anti-contagion policies on the COVID-19 pandemicrdquo In Nature 5847820 (2020) pp 262ndash267

4 Shengjie Lai et al ldquoEffect of non-pharmaceutical interventions to contain COVID-19 inChinardquo en In Nature 5857825 (Sept 2020) pp 410ndash413 DOI 101038s41586-020-2293-x

5 Yang Liu et al ldquoThe impact of non-pharmaceutical interventions on SARS-CoV-2 trans-mission across 130 countries and territoriesrdquo In medRxiv (2020)

6 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

7 Nazrul Islam et al ldquoPhysical distancing interventions and incidence of coronavirusdisease 2019 natural experiment in 149 countriesrdquo In bmj 370 (2020)

8 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo In Covid Economics 7 (Apr 2020) pp 46ndash67

9 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

10 Mrinank Sharma et al ldquoOn the robustness of effectiveness estimation of nonpharma-ceutical interventions against COVID-19 transmissionrdquo In arXiv preprint arXiv200713454(2020) URL httpsarxivorgabs200713454

11 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

12 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

13 Ensheng Dong Hongru Du and Lauren Gardner ldquoAn interactive web-based dashboardto track COVID-19 in real timerdquo In The Lancet Infectious Diseases 205 (May 2020)pp 533ndash534 DOI 101016s1473-3099(20)30120-1

14 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

15 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

16 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

17 Sam Abbott et al ldquoEstimating the time-varying reproduction number of SARS-CoV-2using national and subnational case countsrdquo In Wellcome Open Research 5112 (2020)p 112

18 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

19 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

20 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Young Joon Park et al ldquoContact tracing during coronavirus disease outbreak SouthKorea 2020rdquo In Emerg Infect Dis 2610 (2020)

28 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

29 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

30 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

31 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

32 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

33 Chen Stein-Zamir et al ldquoA large COVID-19 outbreak in a high school 10 days afterschoolsrsquo reopening Israel May 2020rdquo In Eurosurveillance 2529 (2020) p 2001352

34 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 38 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile919676Weekly_COVID19_Surveillance_Report_week_38_FINAL_UPDATEDpdf 2020

35 Meira Levinson Muge Cevik and Marc Lipsitch Reopening Primary Schools during thePandemic 2020

36 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

37 K Prem et al ldquoThe effect of control strategies to reduce social mixing on outcomes ofthe COVID-19 epidemic in Wuhan China a modelling studyrdquo In Lancet Public Health5 (5 May 2020) e261ndashe270

38 Patrick G T Walker et al ldquoThe impact of COVID-19 and strategies for mitigationand suppression in low- and middle-income countriesrdquo In Science 3696502 (2020)pp 413ndash422

39 Nicholas G Davies et al ldquoEffects of non-pharmaceutical interventions on COVID-19cases deaths and demand for hospital services in the UK a modelling studyrdquo In TheLancet Public Health 57 (2020) e375ndashe385

40 Xingjie Hao et al ldquoReconstruction of the full transmission dynamics of COVID-19 inWuhanrdquo In Nature 5847821 (Aug 2020)

41 N W Ruktanonchai et al ldquoAssessing the impact of coordinated COVID-19 exit strate-gies across Europerdquo In Science (2020) DOI 101126scienceabc5096

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 23

Appendix A1 Detailed model description 23

Appendix A2 Testing reporting and infection fatality rates 26

Appendix A3 Method for uncertainty over key epidemiological parameters 27

Appendix B Validation 30

Appendix B1 Unseen data 30

Appendix B2 Sensitivity Analysis 30

Appendix B3 Robustness to unobserved effects 35

Appendix C Additional sensitivity analyses and validation 38

Appendix C1 Multivariate Sensitivity Analysis - Expanded Results 38

Appendix C2 Sensitivity to additional epidemiological assumptions 40

Appendix C3 Sensitivity to data preprocessing 40

Appendix C4 Validation by predicting unseen data - all countries 42

Appendix C5 Posterior predictive distributions 47

Appendix C6 Calibration without countries used for hyperparameter selection 48

Appendix C7 MCMC stability results 49

Appendix D Additional results 50

Appendix D1 Estimated R0 by country 50

Appendix D2 Collinearity 51

Appendix D3 Posterior Epidemiological Parameter Distributions 53

Appendix E Additional discussion of assumptions and limitations 55

Appendix E1 Limitations of the data 55

Appendix E2 Model limitations 55

Appendix F Overview of previous work 57

Appendix G Handling edge cases in the data collection 60

Appendix H All model equations 64

Appendix I References 67

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections N (sdot)tc

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Mask wearing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closedSchools closed

Stay-at-home order

Growth reductions from interventions αi i

Daily growth rate gtc

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Uncertain delay from infection to

confirmation death

Uncertain generation interval

Universities closed

Figure A6 Model Overview Purple nodes are observed We describe the diagram from bottom to topThe effectiveness of NPI i is represented by αi which is independent of the country On each day t a countryrsquos daily reproduction number Rt c depends on the countryrsquos basic reproduction number R0cand the active NPIs The active NPIs are encoded by Φi t c which is 1 if NPI i is active in country c attime t and 0 otherwise Rt c is transformed into the daily growth rate gt c using the generation intervalparameters and subsequently is used to compute the new infections N (C )

t c and N (D)t c that will turn into

confirmed cases and deaths respectively Finally the number of new confirmed cases Ct c and deathsDt c is computed by a discrete convolution of N (middot)

t c with the respective delay distributions Our modeluses both death and case data it splits all nodes above the daily growth rate gt c into separate branchesfor deaths and confirmed cases We account for uncertainty in the generation interval infection-to-case-confirmation delay and the infection-to-death delay by placing priors over the parameters of thesedistributions

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1but with several key extensions First we model both confirmed cases and deaths al-lowing us to leverage significantly more data Furthermore we do not assume a specificinfection fatality rate (IFR) since we do not aim to infer the total number of COVID-19 in-fections Additionally since epidemiological parameters are only known with uncertaintywe place priors over them following recent best practice2 Concretely we place priors onthe means and standard deviationsdispersions of the generation interval the infection-to-confirmation delay and the infection-to-death delay These prior choices are detailedin Appendix A3 The end of this section details further adaptations which allow us to

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

minimise assumptions about testing reporting and the IFR Code is available online hereFor readers who wish to re-implement the model a concise list of all equations is given inAppendix H

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior distribution over R0c reflecting the wide disagreement of regionalestimates of R0

3 The mean R0c is 328 based on a meta-analysis4 We parameterize theeffectiveness of NPI i assumed to be same across countries and time with αi Each NPI isassumed to have an independent multiplicative effect on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We place an Asymmetric Laplace prior over αi with scale parameter10 asymmetry parameter 05 and location parameter 0 Our prior allows for (unbounded)positive and negative effects as we cannot a priori exclude the possibility that the introduc-tion of an NPI increases R However our prior places 80 of its mass on positive effectsreflecting a belief that NPIs are much more likely to reduce Rt than not This is a shrinkageprior placing 50 of its mass on lsquosmallrsquo effectiveness (less than 10 change in Rt ) All priordistributions are independent

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c and Rt c

(Eq 29 in5)

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We account for uncertainty inthe generation interval (GI) assumed to be a Gamma distribution by placing prior distri-butions over its mean and standard deviation (Appendix A3) Using (A2) we can writeg t c as a function g t c (Rt c microGIσGI) (see Appendix H) where microGI is the GI mean and σGI isthe GI standard deviation

aMany epidemiological models define growth rates as the exponent r in an exponential growth function Herewe use daily growth rates instead for ease of exposition These choices are mathematically equivalent Notethat we adapted equation (29) in Wallinga amp Lipsitch5 to account for our choice

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Note that the GI can change over time due to NPIs In particular the effective contact tracingand case isolation in China substantially shortened the mean GI to only 26 days among theidentified cases6 Since most cases were likely not identified even in Wuhan China7 andour study omits most of the countries with highly effective contact tracing programs suchas China or South Korea we model the GI as constant (but uncertain) over time A possibleextension for further work is to model the change in GI explicitly

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are inferred from the observation models for cases and deathsWe assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)] (A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)] (A4)

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross-validation as no reference is available for it We evaluate five dif-ferent values (σN isin 00501020304) with a fixed randomly chosen validation set of6 countries σN = 02 maximises the predictive log-likelihood on the validation set Thisensures a more calibrated model which is less likely to produce overconfident or unstableestimates10 We did not tune any other aspect of the model

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation In realitythis delay distribution is the sum of two independent distributions the incubation period(lognormal distribution) and the delay from onset of symptoms to confirmation (negativebinomial distribution) These distributions are uncertain but modelling (and placing pri-ors over) them individually leads to unidentifiability Therefore we model the total delay

bSince we treat new infections as a continuous number its initial value can be between 0 and 1

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

between infection and confirmation assuming that this follows a negative binomial dis-tribution We convert priors over the individual delays to a prior over the total delay bybootstrapping mdash please see Appendix A3

As in Flaxman et al1 the observed cases Ct c follow a negative binomial noise distribu-tion with mean C t c and an inferred dispersion parameter Ψ(C ) This distribution encodesthat small case numbers are more noisy and should therefore receive less weight Hav-ing separate dispersion parameters for cases and deaths ensures that they can be weighteddifferently if there is a difference in their noise distributions

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ) (A6)

where PD (delay) is the distribution of the delay from infection to death As for cases wemodel the total delay between infection and death as negative binomial which is in realitythe sum of two independent distributions the incubation period and the delay from onsetof symptoms to death (gamma distribution)

Finally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and an inferred dispersion parameter Ψ(D)

The model was implemented in PyMC311 We infer the unobserved variables in our modelusing the No-U-Turn Sampler (NUTS)12 a standard Markov chain Monte Carlo samplingalgorithm

Appendix A2 Testing reporting and infection fatality rates

Scaling all values of a time series by a constant does not change its growth rates Themodel is therefore invariant to the scale of the observations and consequently to country-level differences in the IFR and the ascertainment rate (the proportion of infected peoplewho are subsequently reported positive) For example assume countries A and B differ onlyin their ascertainment rates Then our model will infer a difference in N (C )

0c (Eq (A3)) butnot in the growth rates g t c across A and B Accordingly the inferred NPI effectiveness willbe identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas the

cThis is only approximately true The negative binomial output distribution has a coefficient of variation dimin-ishing with its mean ie smaller observations are relatively more noisy and carry less weight Furthermorewhilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligible effect

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

introduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimics changes in the ascertainment ratemdashnoise at time

τ affects all future casesmdashand allows for gradual multiplicative changes in ascertainment

Appendix A3 Method for uncertainty over key epidemiological parameters

We account for uncertainty in the following delay distributions the generation interval thedelay from infection to symptom onset (incubation period) the delay from symptom onsetto case confirmation and the delay from symptom onset to death

Each distribution has an uncertain mean (Table A3) whose prior distribution we takefrom a meta-analysis13 published shortly after our window of analysis This helps accountfor possible heterogeneity between different populations and therefore often finds higheruncertainty in the means than estimated in primary studies while using more data Themeta-analysis reports mean values with 95 confidence intervals We set normal priors overthe mean delays with mean equal to the reported mean in the meta-analysis and standarddeviation set such that the prior 95 credible interval includes the 95 confidence intervalsfrom the meta-analysis For example the meta-analysis reports an expected mean delayfrom symptoms to death of 1671 days with a 95 credible interval ranging from 1537 to1817 We thus assume that the prior is normal with micro= 1671 and σ= 075 which ensuresthat its 95 credible interval ranges from 1525 to 1817 strictly including the intervalreported in the meta analysis to be conservative Priors are truncated at zero

Furthermore we account for the uncertainty in the standard deviation of these delay dis-tributions by placing normal priors based on primary studies following the same methodas above (Table A4) For the standard deviation of the generation interval we use patientdata from Feretti et al14 and reproduce their method fitting a Gamma distribution to thegeneration time

Finally the distributional forms of the delay distributions are taken from the above primarystudies

The delay from infection to case confirmation is the sum of the incubation period and thesymptom-onset-to-case-confirmation delay Similarly the delay from infection to death isthe sum of the incubation period and the symptom-onset-to-death delay However forcomputational tractability we model the total delay distributions converting the priors de-scribed above to priors over the total delays We assume that the total delays follow negativebinomial distributions which are computationally tractable and empirically provide a closefit to both sum distributions We place normal priors over the mean and dispersion parame-ters of these total delay distributions by bootstrapping For example we compute priors forthe total infection-to-case-confirmation delay distribution by sampling Nbootstrap = 250 in-

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

cubation periods and symptom-onset-to-case-confirmation distributions For each sampledpair of distributions we draw 106 samples from their sum and fit a negative binomial usingmoment matching We compute the mean and standard deviation of the negative binomialdistribution parameters across the bootstrap and use this for our prior The resulting priorsare given in Table A2

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table A2 Epidemiological parameter distributions used in the model The details on sources and priorsused to generate this Table are given in Tables A3 and A4

Delay type Sources Prior on mean Prior on standard Distributionaldeviation or formdispersion

Infection to case Fonfria et al13 N (1092σ= 094) Dispersion Negativeconfirmation Lauer et al15 N (541σ= 027) binomial

Cereda et al16

Infection to death Fonfria et al13 N (2182σ= 101) Dispersion NegativeLauer et al15 N (1426σ= 518) binomialLinton et al17

Generation interval Fonfria et al13 N (506σ= 033) SD N (211σ= 05) GammaFeretti et al14

Table A3 Mean of epidemiological parameters all from meta-analysis13 and distributional forms

Delay type Reported mean Prior on mean Distributionaland 95 CI form of delay

Incubation period 579 (548ndash611) logmicrosimN (153σ= 0051) Lognormal15

Onset to death 1671 (1537ndash1817) N (1671σ= 075) Gamma17

Onset to case confirmation 582 (474ndash715) N (582σ= 068) Negative binomial16

Generation interval 506 (450ndash570) N (506σ= 033) Gamma14

Table A4 Standard deviation or dispersion of epidemiological parameters

Delay type Source Reported standard deviation or Prior on standarddispersion with uncertainty deviation or

dispersionIncubation period Lauer et al15 SD of log delay SD of log delay

048 (95 CI 027ndash054) N (048σ= 00759)Onset to death Linton et al17 SD 69 (95 CI 52ndash91) N (69σ= 1122)Onset to case confirmation Cereda et al16 Dispersion 157 plusmn 0054 N (157σ= 0054)Generation interval Feretti et al14 SD 211 plusmn 05 (reproduced by us) N (211σ= 05)

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking its predictions on unseendata even if prediction is not the purpose of the model1018 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries we have strong reason to doubt itseffectiveness estimates However we do not expect NPI effectiveness models to extrapolateperfectly Almost always unobserved factors such as changes in the ascertainment rate orIFR spontaneous behaviour changes or unobserved NPIs will affect the observed numberof cases and deaths Our models ought to treat these factors as noise and not attribute theireffects on Rt to the observed NPIs

We use Leave-One-Out Cross-Validation (LOO-CV) fitting the model on 40 countries andextrapolating to the excluded country We repeat this process for all 41 countries In theexcluded country the first 14 days of case and death data are observed to estimate R0c aswell as the initial outbreak sizes N (C )

0c and N (D)0c All later days are unobserved (masked)

The model uses the effectiveness estimates inferred from the 40 included countries and NPIactivation dates to predict the course of the pandemic in the excluded country

Figure B7 visually shows extrapolations obtained from LOO-CV for a randomly chosensubset of six countries (all countries are shown in Figures C18 to C21) The predictiveaccuracy of the model as expected degrades with an increasing prediction horizon sug-gesting the presence of unobserved factors However the predictive credible intervals arewide and increase over time as noise in the growth rate accumulates Importantly ourmodel makes calibrated forecasts in excluded countries (Fig B8) over periods of up to 2months

These are challenging predictions to the best of our knowledge no related study analyseshow their estimated NPI effects generalise to unseen countries Most related studies donot validate predictions on unseen data at all19ndash23 reviewed in9 Flaxman et al1 hold outthe last 14 days from 20 April to 4th of May for all countries in aggregate However thecountries they study implemented all NPIs in March or earlier (Supplementary Table 2 in1)meaning that in fact all countries were used to estimate NPI effects

Note that Fig B8 includes the 6 countries that were used to select the hyperparameter(Appendix A) However the calibration is similar if these 6 countries are excluded (Fig-ure C23)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals the extent to which results depend on uncertain parametersand modelling choices and can diagnose model misspecification and excessive collinearity

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

100

101

102

103

104

105

106

便便

Slovenia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

便 便

Greece

Figure B7 Model extrapolations on a randomly chosen subset of 6 excluded countries produced usingleave-one-out cross-validation In each excluded country the first 14 days of death and case data (soliddots) are observed the model then extrapolates to all future days within the period of analysis (emptydots) For each country we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier see Methods) The shaded areas representthe 95 credible intervals

in the data24 We vary many of the components of our model and recompute the NPIeffectiveness estimates summarised here Further analysis can be found in Appendix C

For each sensitivity analysis we quantify sensitivity using an easily interpretable measureshown above the legend of each plot the average standard deviation denoted σ Firstfor each NPI we compute the standard deviation of its median effect estimates across allexperimental conditions in the given sensitivity analysis We then average these acrossthe NPIs Since effectiveness is expressed in percentage reductions of Rt the unit of σ ispercent Zero percent corresponds to no sensitivity

We perform a multivariate sensitivity analysis on the key epidemiological priors For com-putational reasons all other sensitivity analyses are univariate

Multivariate sensitivity to main epidemiological priors The key epidemiological param-eters in our model describe the delay from infection to case-confirmation the delay from

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

0 20 40 60 80 100Credible Interval

0

20

40

60

80

100

Em

piric

al C

umul

ativ

e Fr

eque

ncy

Calibration Country Holdouts

Figure B8 Calibration in held-out countries with leave-one-out cross validation In each excludedcountry the first 14 days of death and case data are observed to estimate R0c as well as the initialoutbreak sizes the model then extrapolates to all future days within the period of analysis The plotshows the percentage of observed daily case and death counts that lie within the X credible intervalacross all countries and days The identity line represents ideal calibration

infection to death and the generation interval Our model places prior distributions overthe means of these delay distributions Since the priors already account for uncertainty inthe delays this analysis considers what would happen if our prior beliefs changed Sincethe epidemiological parameters may interact in unexpected ways we perform a multivari-ate sensitivity analysis jointly changing the mean of the prior over the mean of each delaydistribution (referred to as the prior mean of the mean below) We vary the prior means ofthe mean delays across a wide but epidemiologically plausible range of 5 values resultingin 125 different experimental conditions We present these results in two ways

First Figure B9 shows the global sensitivity of our results to the prior mean of the mean ofeach distribution individually For example for each setting of the prior mean of the meangeneration interval we show the average over the 5times5 = 25 runs with different prior meansplaced over the mean delays between infection and case confirmationdeath This is equiv-alent to placing a uniform prior across the 125 experiment conditions and marginalisingand is standard methodology for computing global sensitivity to an individual parameter

The prior means seem to mostly influence the relative effectiveness of business closuresand gathering bans Increasing the prior means of the infection-to-case-confirmationdeathmean delays results in larger effectiveness estimates for gatherings bans and smaller esti-

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

mates for business closures while increasing the prior mean of the generation interval hasthe opposite effect

Second we assess the sensitivity of our results to jointly varying the prior means This yieldsσ= 246 We present this result graphically in Appendix C1

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Infection to Case-Confirmation Delay= 112Mean 792 daysMean 942 daysMean 1092 daysMean 1242 daysMean 1392 days

-25 0 25 50 75Average reduction in Rtin the context of our data

Infection to Death Delay= 080Mean 1782 daysMean 1982 daysMean 2182 daysMean 2382 daysMean 2582 days

-25 0 25 50 75Average reduction in Rtin the context of our data

Generation Interval= 149Mean 306 daysMean 406 daysMean 506 daysMean 606 daysMean 706 days

Figure B9 Sensitivity to key epidemiological parameter choices evaluated using a multivariate analysisWe jointly vary the prior means over the mean of the generation interval the delay between infectionand case confirmation and the delay between infection and death Each panel displays sensitivity to theprior mean of one distribution and the NPI effects are computed by averaging over the 25 runs withdifferent prior means of the two other distributions (see text)

Sensitivity to country exclusions Figure B10 shows results if one country at a time isexcluded from the data As there is no strong justification for including or excluding anyparticular country results ought to be stable if a country is excluded

-25 0 25 50 75

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

= 151DefaultAlbaniaAndorraAustriaBelgiumBosnia andHerzegovinaBulgariaCroatia

-25 0 25 50 75

= 151DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-25 0 25 50 75

= 151DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

= 151DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-25 0 25 50 75Average reduction in Rtin the context of our data

= 151DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-25 0 25 50 75Average reduction in Rtin the context of our data

= 151DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure B10 Sensitivity to excluding individual countries

Sensitivity to case-undercounting Figure B11 shows results when scaling the recordednumber of cases in each country First we correct the number of reported cases on each day

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

using estimates of the time-varying ascertainment rate from Golding et al25 For example ifthe estimated ascertainment rate in country c at time t is 50 we double the correspondingnumber of reported cases With this altered case data the model estimates a larger effectfor closing schools and universities and a smaller effect for limiting gatherings to 1000people or less There is little change in the effects estimated for the other NPIs

Second we randomly either double or halve the reported number of cases in each countryThis is intended to demonstrate that the model is invariant to constant differences in ascer-tainment between countries As expected there are negligible differences compared to thedefault results

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Case Undercounting= 163

Time-Varying ScalingRandom Constant ScalingDefault

Figure B11 Sensitivity to case undercounting considering both constant undercounting and time-varying undercounting

Sensitivity to structurally different models Figure B12 shows the NPI effectivenessestimates from models that use only cases or deaths as observations in contrast to ourmain model which uses both As expected there are some differences in the NPI effectestimates between the models which may be caused by the unique biases present in casedata and in death data Differences could also occur if NPIs differentially affected casesand deaths (eg if they affect different age groups) Nevertheless the studyrsquos qualitativeconclusions as outlined in the Discussione hold across the different data types

A number of implicit structural assumptions are made by the choice of model structureWe test sensitivity to these assumptions by evaluating NPI effectiveness estimates from al-ternative models reproducing the structural sensitivity analysis from our concurrent workwhere these models are described and compared in detail9 While these alternative modelstructures are also plausible we chose the model described in the main text as it is relativelysimple whilst also being highly robust9

eThe main qualitative conclusions as outlined in the Discussion are Stay-at-home orders had a small effectMandating mask-wearing had a small effect School and university closures had a large effect Gatherings banswere effective More strict gathering bans were more effective than less strict ones Business closures wereeffective Closing most nonessential businesses had limited benefit over closing just high-risk businesses

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Data Type

= 347Only Case DataOnly Death DataDefault

Figure B12 Sensitivity to different data types ie models using only confirmed cases only deaths orboth

The models are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting Rt into a daily growth rate a renewalprocess is used as the infection model as in earlier work1826ndash28 For computationalreasons we do not place a prior distribution over the generation interval parametersfor this model

3 Noisy-R Model The noise terms ε(middot)ct affect Rt rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on Rt The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

As Figure B13 (left) shows the multiplicative effect models support almost all conclusionsdrawn in the Discussione The only exception is that the stay-at-home order NPI is cate-gorised as moderately effective under the Different Effects Model (median reduction in Rt

of 18)

The results of the Additive Effects Model cannot be directly compared to the other modelssince it expresses results on a different scale (percentage reduction in R0 instead of Rt ) Itis therefore shown in a separate panel However the trend in the results is visually similarto the default results

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs which were implemented nor directly measures broaderbehavioural changes Since these factors influence Rt we must be wary of their effectbeing attributed to observed NPIs We investigate this further by assessing how much effec-

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

Structural Sensitivity Multiplicative Models= 311

Noisy-RDiscrete Renewal

Different EffectsDefault

-25 0 25 50 75Average reduction in Rtas a percentage of R0

in the context of our data

Structural Sensitivity Additive Model

Figure B13 Structural sensitivity analysis NPI effectiveness estimates under different structural as-sumptions Left Effectiveness estimates for multiplicative effect models Note that for computationalreasons the discrete renewal model does not have a prior over the generation interval Right Effective-ness estimates for an additive effect models For this model the effectiveness of each NPI is expressedas a reduction of R0 rather than Rt

tiveness estimates change when previously unobserved factors are included and also whenobserved factors are excluded This is best practice for addressing unobserved factors suchas confounders2930

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs31 The timings of our observed NPIsrsquo implementation dates are indeed mutuallycorrelated prompting the question of how much results change when we make previouslyobserved NPIs unobserved Figure B14 (left) shows NPI effectiveness estimates when pre-viously observed NPIs are excluded in turn The studyrsquos main qualitative conclusionse holdacross all experimental conditions and the variations in NPI effectiveness are rarely largeenough to cause a change in effectiveness category (Figure 5) except for the Gatheringslimited to 1000 people or less NPI Considering that some of the excluded NPIs have strongestimated effects when included and are correlated with other NPIs this degree of robust-ness is encouragingly high It suggests that unobserved factors will not significantly biasresults as long as their effects and their correlations with the studied NPIs do not exceedthose of the studied NPIs We hypothesise that this robustness to unobserved factors is dueto the noise on the daily growth rate in our model9

In addition Figure B14 (right) shows NPI effectiveness estimates when we observe (iecontrol for) additional NPIs taken from the OxCGRT dataset32

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Inclusion of OxCGRT NPIs= 153

Travel ScreenQuarantineTravel BansSymptomatic TestingPublic Information CampaignsInternal Movement LimitedPublic Transport LimitedDefault

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order(with exemptions)

Exclusion of Collected NPIsUncombined Effects Shown

= 188Mask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay Home OrderDefault

Figure B14 Robustness to unobserved effects Left Results when excluding previously observed NPIsWe exclude one of the NPIs in turn and show the estimates for the other NPIs Note that this subplotshows the marginal effect for hierarchical NPIs rather than the total effect different to other figuresthat show cumulative effects for gathering bans and businesses closures For example Figure 3 displaysthe total effect of closing most nonessential businesses while here we show the additional effect ofclosing most nonessential businesses over just closing some high-risk businesses We show the additionaleffects here because the effect of a cumulative intervention would become undefined when part of it isexcluded from the analysis Right Results when controlling for previously unobserved NPIs We includeone additional NPI in turn and show the estimates for the NPIs in our study (the additional NPI is notshown)

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Appendix C1 Multivariate Sensitivity Analysis - Expanded Results

We now present the results of our joint multivariate sensitivity analysis (previously sum-marised in Figure B9) in which we vary the mean of the prior (ldquoprior meanrdquo) over themean of the generation interval the delay between infection and death and the delaybetween infection and case confirmation

Figure C15 shows for each NPI how NPI effects change when all prior means are variedsimultaneously

We find that the most sensitive NPIs are Gatherings limited to 1000 people or fewer Gath-erings limited to 100 people or fewer and Most nonessential businesses closed However thelargest differences appear when all of the parameters are set to the most extreme valuesthat jointly produce a change in a particular direction We also see systematic trends asthe parameters are varied eg increasing the prior mean of the generation interval meantends to decrease the effectiveness of the Gatherings limited to 1000 or fewer NPI while in-creasing prior means of the infection to deathcase confirmation distribution means tendsto increase the effectiveness of this NPI

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

792

942

1092

1242

1392

Mas

kWea

ring

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

GI = 306

-150

-75

00

75

150GI = 406

-150

-75

00

75

150GI = 506

-150

-75

00

75

150GI = 606

-150

-75

00

75

150GI = 706

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

00

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

0

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Gat

heri

ngs

lt10

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Som

eBus

s

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Mos

tBus

s

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

792

942

1092

1242

1392

Educ

at

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

792

942

1092

1242

1392

Stay

Hom

e

Infe

ctio

n to

Cas

eC

onfir

mat

ion

Del

ay (D

ays)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

1782 1982 2182 2382 2582

Infection to Death Delay (Days)

-150

-75

00

75

150

Figure C15 Global sensitivity analysis Each cell shows the difference between the median effectivenessestimate under a specific experimental condition and the default condition where the estimates areexpressed as percentage reduction in Rt Each subplot represents a particular prior mean of the meangeneration interval and an NPI The NPIs vary top to bottom and the generation interval prior meanincreases left to right Each cell with a subplot represents a particular choice of the prior mean of themean infection to death delay (increasing left to right) and the prior mean of the mean infection to caseconfirmation delay (increasing bottom to top) Positive cell values indicate that the median NPI estimateis larger than with default settings and vice versa

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C2 Sensitivity to additional epidemiological assumptions

For completeness Figure C16 shows sensitivity to two further epidemiological assumptionsOn the left we show the dependence of NPI effectiveness estimates on R0 We place aNormal prior over R0 and a hyperprior over its variance reflecting the wide disagreementof regional estimates of R0 In the sensitivity analysis we vary the mean of the prior froma low value (238) to a high one (428) As expected higher mean R0 values result in largerNPI effects This trend is apparent in all NPIs but stronger for NPIs that tended to beimplemented early on (like school closures and gathering bans)

On the right we vary the prior over NPI effectiveness Our default prior on NPI effectivenessis assymmetric reflecting a belief that NPIs are more likely to reduce Rt than increase itHere we additionally test a symmetric Normal prior and a one-sided Half-Normal priorwhich does not allow for NPIs to increase Rt

Appendix C3 Sensitivity to data preprocessing

When the case count is small a large fraction of cases may be imported from other coun-tries and the testing regime may change rapidly To prevent this from biasing our modelwe neglect case numbers before a country has reached 100 confirmed cases Similarly weneglect death numbers before a country has reached 10 deaths Here we vary these thresh-olds Intuitively the minimum cases threshold should be higher than the minimum deathsthreshold

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

R0 prior= 333

[R] = 238[R] = 278

Default[R] = 328[R] = 378[R] = 428

-25 0 25 50 75Average reduction in Rtin the context of our data

NPI Effectiveness Prior= 137

i Normal(0 022)i Half Normal(022)

Default

Figure C16 Sensitivity to additional epidemiological priors Left Sensitivity to the prior mean on R0Right Sensitivity to the NPI effectiveness prior

-25 0 25 50 75Average reduction in Rtin the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universitiesclosedStay-at-home order(with exemptions)

Minimum Cases Threshold= 137

3050Default (100)200300

-25 0 25 50 75Average reduction in Rtin the context of our data

Minimum Deaths Threshold= 102

15Default (10)3050

Figure C17 Data preprocessing sensitivity Left Variations in the minimum number of cases RightVariations in the minimum number of deaths

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Validation by predicting unseen data - all countries

Here we show the country-level plots of the single-country holdout experiments from Ap-pendix B1 We use leave-one-out cross-validation fitting the model on 40 countries andshowing its predictions on the excluded country We repeat this process for all 41 countriesIn the excluded country the first 14 days (filled dots) of death and case data are observedto allow roughly inferring R0 These days are not used to infer NPI effectiveness All laterdays are masked (unfilled dots) The activation dates of NPIs are also given and the modeluses the effectiveness estimates inferred from the 40 other countries

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

北便

France

Figure C18 Predictions on excluded countries

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

北 便

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

Malta

Figure C19 Predictions on excluded countries

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

Slovenia

Figure C20 Predictions on excluded countries

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Switzerland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Observed Daily DeathsHeldout Daily Deaths

Observed Daily Confirmed CasesHeldout Daily Confirmed Cases

United Kingdom

Figure C21 Predictions on excluded countries Vertical lines show the activation (or inactivation) datesof NPIs Shaded areas are 95 credible intervals Blue and red dots show the observed confirmed casesand deaths while blue and red lines show the median model estimates of cases (Ct ) and deaths (Dt )Empty dots are not observed by the model For each country we show the full window of analysis (fromthe start of the epidemic until the first NPI was lifted or the 30th of May 2020 whichever was earliersee Methods)

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Posterior predictive distributions

The posterior predictive distribution (Figure C22) shows the predicted number of cases anddeaths after observing the data Although these curves can be called lsquofitsrsquo the degree of fitto the data must be interpreted with great care The fit is generally tight but this is partlydue to the inferred latent noise variables ε(C )

t and ε(D)t Inferring this latent noise allows

the posterior predictive distribution to closely match the data without overfitting the effec-tiveness parameters to the data Such behavior is common in Bayesian models which oftenperfectly interpolate the data without overfitting33 The noise terms can account for peri-ods where infections grew faster or slower than predicted based solely on the active NPIsIn such periods the noise may account for changes in testing reporting and unobservedinterventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Predicted Daily DeathsPredicted Daily Confirmed Cases

Daily DeathsDaily Confirmed Cases

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C22 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 Calibration without countries used for hyperparameter selec-tion

0 20 40 60 80 100Credible Interval

0

20

40

60

80

100E

mpi

rical

Cum

ulat

ive

Freq

uenc

y

Calibration Country Holdouts

Figure C23 Calibration in held-out countries with leave-one-out cross-validation Here we include onlythose 35 countries that were not used to select the hyperparameter We show the first 14 days of casesand deaths in each country and then extrapolate to future days within the period of analysis The plotshows the percentage of observed daily case and death counts that lie within the X credible intervalacross all countries and days

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 MCMC stability results

1000 1002 10040

1000

2000

3000

4000

R

0 1 2 30

500

1000

1500

2000

Relative ESS

Figure C24 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size A value of 1 indicates perfect decorrelation between samplesValues above (below) 1 indicate that the effective number of samples is higher (lower) than the actualnumber of samples due to negative (positive) correlation respectively

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D5 Estimated values for R0 by country The parenthesis give the 95 credible interval which isoften wide The mean R0 across countries is 33

Country Estimated R0 Country Estimated R0

Albania 348 (275431) Lithuania 323 (248403)Andorra 275 (21434) Malaysia 296 (236361)Austria 31 (25377) Malta 327 (248414)Belgium 36 (302427) Mexico 399 (32748)Bosnia and Herzegovina 331 (259406) Morocco 359 (299427)Bulgaria 375 (304457) Netherlands 316 (26375)Croatia 342 (27418) New Zealand 241 (17731)Czech Republic 346 (276425) Norway 273 (215338)Denmark 28 (22347) Poland 397 (32482)Estonia 291 (229361) Portugal 355 (292425)Finland 279 (221343) Romania 401 (335475)France 331 (28389) Serbia 38 (308461)Georgia 356 (281439) Singapore 295 (238356)Germany 285 (231346) Slovakia 353 (271445)Greece 321 (261388) Slovenia 299 (23373)Hungary 403 (332482) South Africa 447 (377527)Iceland 171 (113242) Spain 36 (306423)Ireland 368 (309433) Sweden 229 (176288)Israel 379 (309459) Switzerland 289 (234351)Italy 336 (288391) United Kingdom 32 (267378)Latvia 301 (233376)

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 Collinearity

Appendix D21 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the inclusion or exclusion ofthese countries in the dataset (Figure D25) If we included all countries we would con-clude that university closures were more effective than school closures (black markers inFigure D25) However if we excluded Iceland or Sweden we would conclude that theywere roughly equally effective As there is no strong justification for including or excludingone particular country we cannot meaningfully disentangle the effects of school and uni-versity closures However there is much more data to determine the joint effect and it isindeed much more stable (Figure D25)

-25 0 25 50 75 100Average reduction in Rtin the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D25 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown

Appendix D22 Co-occurrence of NPIs

Table D6 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or neither) Note thatwe do not show the traditional collinearity statistics (variance inflation factors and datacorrelations) since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D6 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D23 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations24 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus042 between closing schools anduniversities and closing some businesses (Figure D26) The weak correlations are oneindicator that collinearity is manageable with our dataset

Effect on NPI combinations To better understand posterior correlations we visualizetheir effect in hosted video files As we condition on different values for one NPI we cansee that the estimates of other NPIs change only slightly always staying well within thecredible intervals in Figure 3 The significance of posterior correlations is small enough thatit is possible to calculate a reasonable approximation to the mean effect of a set of NPIs bysimply combining the mean percentage reductions for each individual NPI (eg two 50reductions lead to a 75 reduction) For example this approximation leads to a joint effectof 75 for all NPIs together which closely matches the exact mean joint effect of 77

Videos are available online here

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D26 Posterior correlations between effectiveness parameters αi

Appendix D3 Posterior Epidemiological Parameter Distributions

Figure D27 shows the posterior distributions for variables describing the key delay distribu-tions in our model the generation interval the delay between infection and case confirma-tion and the delay between infection and death We find that the posterior distributions ofthese parameters are somewhat tighter than their priors but they still explore a wide rangeof values This suggests that the data provides evidence albeit weak about these delaydistributions (for example from visual inspection the data would show that the delay islonger for deaths than for cases) An exception is the dispersion of the infection to deathdelay distribution which has a very wide prior

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

5 6

0

1

dens

ity

Generation Interval Mean

posteriorprior

200 225

00

05

Infection to Death Delay Mean

100 125

00

05

Infection to Case-Confirmation Delay Mean

0 2 4

value

00

05

dens

ity

Generation Interval std

10 20

value

00

01

Infection to Death Delay disp

5 6

value

0

1

Infection to Case-Confirmation Delay disp

Figure D27 Posterior distributions over key epidemiological parameters describing the generation in-terval and the delays between infection and case confirmationdeath

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect morethan three-quarters of the population) We thus miss NPIs which were only implementedregionally For example a few regions in Germany implemented stay-at-home orders butmost did not Thus Germany is listed as no stay-at-home order in our data Additionallyour NPI definitions were not perfectly granular For example a gathering ban on gatheringsof gt15 people and a ban on gatherings of gt60 people would both fall under the NPIGatherings limited to 100 people or less despite likely having different effects on Rt Finally while we included more NPIs than previous work (Table F7) there are many NPIsfor which we were not able to collect enough high-quality data for our modeling such aspublic cleaning or changes to public transportation

Of the 41 countries in our dataset 33 are in Europe As a result the NPI effectivenessestimates may be biased towards effects in Europe and NPI effectiveness may have beendifferent in other parts of the world

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on Rt is constantacross countries and time However the exact implementation and adherence of each NPIis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa limited degree Additionally different countries have different cultural norms and ageprofiles affecting the degree to which a particular intervention is effective For example acountry where a higher proportion of the population is in education will likely experience alarger effect from a government order to close schools and universities Our estimates thusshould be adjusted to local circumstances To address differences between countries ourstructural sensitivity analysis includes a model where each NPI can have a different effectper country (Appendix B2) The average effectiveness estimates across countries in thismodel match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponential de-cline would have the same effect on observations as an unobserved NPI Consequently wecannot quantify its effect on our results (though the sensitivity analyses look reassuring)

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expectthat closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g t and the reproductive number Rt holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso report results from a renewal process model8 that does not depend on this assumptionand we find similar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix F Overview of previous work

Table F7 Data-driven multi-country multi-NPI studies of the effectiveness of observed (as opposed tohypothetical) NPIs in reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Banholzer etal 202023

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Chen andQiu 202021

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Islam et al202020

School closuresworkplace closuresrestrictions on massgatherings publictransport closure

lockdown

149 countries or regionsInterrupted timeseries regression

Liu et al202022

13 NPIs from theOXCGRT dataset

130 countries and regions panel regression

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F8 Some data-driven studies of the effectiveness of observed NPIs in reducing the transmissionof COVID-19 which study a single-country andor a single-NPI

Study NPIs studiedRegionscountries

studiedMethod

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dandekarand

Barbastathis202035

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Dehning etal 202036

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Gatto et al202037

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Hsiang et al202019

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US (each

country modelledindividually)

Linear regression onestimated growth

rates

Jarvis et al202038

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelKraemer etal 202039

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202040 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) model

Lai et al202041

Case detection andisolation travel

restrictions contactreductions

ChinaTravel-network-based

SEIR model

Continued on next page

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F8 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Lorch et al202042

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Maier andBrockmann

202043

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Orea andAacutelvarez202044

Lockdown SpainSpatial econometric

analysis

Quilty et al202045

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

Sears et al202046

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Siedner etal 202047

General socialdistancing

USInterruptedtime-series

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F7) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

61

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

62

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all ldquotrade activitiesrdquo (except within shopping malls) service jobsthat donrsquot involve physical contact museums libraries and galleries opened Sincethe criteria regarding ldquoall retail stores being openrdquo was met we counted April 27 asthe end date for ldquoMost nonessential businesses closedrdquo

63

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H All model equations

This section will mainly be of interest to readers that wish to re-implement the model

Variables are indexed by NPI i country c and day t All prior distributions are independent

Data

1 NPI Activations φi t c isin 012 Observed (Daily) Cases Ct c 3 Observed (Daily) Deaths D t c

Prior Distributions

1 Country-specific R0 R0c sim Normal(325κ) κsim Half Normal(micro= 0σ= 05)2 NPI effectiveness αi sim Asymmetric Laplace(m = 0κ= 05λ= 10) m is the location

parameter κgt 0 is the asymmetry parameter and λgt 0 is the scale parameter3 Infection Initial Counts

N (C )0c = exp(ζ(C )

c )

N (D)0c = exp(ζ(D)

c )

ζ(C )c sim Normal(micro= 0σ= 50)

ζ(D)c sim Normal(micro= 0σ= 50)

4 Observation Noise Dispersion Parameters

Ψcases sim Half Normal(micro= 0σ= 5) (H1)

Ψdeaths sim Half Normal(micro= 0σ= 5) (H2)

Hyperparameters

1 Growth Noise Scale σg = 02

Delay Distributions

1 Generation interval distribution1314

microGI sim Normal(micro= 506σ= 03265)

σGI sim Normal(micro= 211σ= 05)

64

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

2 Time from infection to case confirmation T (C )131516f

microinfrarrconf sim Normal(micro= 1092σ= 094)

Ψinfrarrconf sim Normal(micro= 541σ= 027)

This distribution is converted into a forward-delay vector

T (C )[t ] =

1ZC

Negative Binomial(t micro=microinfrarrconfα=Ψinfrarrconf) t lt 32

0 otherwise

with ZC =31sum

t prime=0Negative Binomial(t primemicro=microinfrarrconfα=Ψinfrarrconf)

ie the delay follows a truncated and normalised negative binomial distribution3 Time from infection to death T (D)f131517

microinfrarrdeath sim Normal(micro= 2182σ= 101)

Ψinfrarrdeath sim Normal(micro= 1426σ= 518)

This distribution is converted into a forward-delay vector

T (D)[t ] =

1ZD

Negative Binomial(t micro=microinfrarrdeathα=Ψinfrarrdeath) t lt 48

0 otherwise

with ZD =47sum

t prime=0Negative Binomial(t primemicro=microinfrarrdeathα=Ψinfrarrdeath)

ie the delay follows a truncated and normalised negative binomial distribution

Infection Model

Rt c = R0c middotexp

(minus

Isumi=1

αi φi t c

) where I is the number of NPIs

αGI =microGI

σ2GI

βGI =micro2

GI

σ2GI

g t c = exp

(βGI(R

1αGIct minus1)

)minus1

fα in the definition of the Negative Binomial distribution is the dispersion parameter Larger values of αcorrespond to a smaller variance and less dispersion With our parameterisation the variance of the Negative

Binomial distribution is micro+ micro2

α

65

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

]

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise

ε(C )τc sim Normal(micro= 0σ=σg )

ε(D)τc sim Normal(micro= 0σ=σg )

Observation Modelf

Ct c =31sumτ=0

N (C )tminusτcT

(C )[τ]

D t c =47sumτ=0

N (D)tminusτcT

(D)[τ]

Ct c sim Negative Binomial(micro= Ct c α=Ψcases)

D t c sim Negative Binomial(micro= D t c α=Ψdeaths)

66

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix I References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Sam Abbott et al ldquoEstimating the time-varying reproduction number of SARS-CoV-2using national and subnational case countsrdquo In Wellcome Open Research 5112 (2020)p 112

3 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

4 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

5 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

6 Sheikh Taslim Ali et al ldquoSerial interval of SARS-CoV-2 was shortened over time bynonpharmaceutical interventionsrdquo In Science 3696507 (2020) pp 1106ndash1109

7 Xingjie Hao et al ldquoReconstruction of the full transmission dynamics of COVID-19 inWuhanrdquo In Nature 5847821 (Aug 2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the robustness of effectiveness estimation of nonpharma-ceutical interventions against COVID-19 transmissionrdquo In arXiv preprint arXiv200713454(2020) URL httpsarxivorgabs200713454

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

12 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

13 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

14 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

67

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

15 Stephen A Lauer et al ldquoThe incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases estimation and applicationrdquo In Annals ofinternal medicine 1729 (2020) pp 577ndash582

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

19 Solomon Hsiang et al ldquoThe effect of large-scale anti-contagion policies on the COVID-19 pandemicrdquo In Nature 5847820 (2020) pp 262ndash267

20 Nazrul Islam et al ldquoPhysical distancing interventions and incidence of coronavirusdisease 2019 natural experiment in 149 countriesrdquo In bmj 370 (2020)

21 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo In Covid Economics 7 (Apr 2020) pp 46ndash67

22 Yang Liu et al ldquoThe impact of non-pharmaceutical interventions on SARS-CoV-2 trans-mission across 130 countries and territoriesrdquo In medRxiv (2020)

23 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

24 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

25 Nick Golding et al ldquoReconstructing the global dynamics of under-ascertained COVID-19 cases and infectionsrdquo In medRxiv (2020) DOI 1011012020070720148460eprint httpswwwmedrxivorgcontentearly202007082020070720148460fullpdf URL httpswwwmedrxivorgcontentearly202007082020070720148460

26 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

27 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

28 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

68

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

29 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

30 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

31 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

32 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

33 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

36 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

37 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

38 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

39 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

40 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

41 Shengjie Lai et al ldquoEffect of non-pharmaceutical interventions to contain COVID-19 inChinardquo en In Nature 5857825 (Sept 2020) pp 410ndash413 DOI 101038s41586-020-2293-x

69

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

42 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

43 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

44 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

45 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

46 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

47 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

70

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted October 14 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Testing reporting and infection fatality rates
      • Method for uncertainty over key epidemiological parameters
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Multivariate Sensitivity Analysis - Expanded Results
              • Sensitivity to additional epidemiological assumptions
              • Sensitivity to data preprocessing
              • Validation by predicting unseen data - all countries
              • Posterior predictive distributions
              • Calibration without countries used for hyperparameter selection
              • MCMC stability results
                • Additional results
                  • Estimated R0 by country
                  • Collinearity
                    • The individual effects of school and university closures
                    • Co-occurrence of NPIs
                    • Correlations between effectiveness estimates
                      • Posterior Epidemiological Parameter Distributions
                        • Additional discussion of assumptions and limitations
                          • Limitations of the data
                          • Model limitations
                            • Overview of previous work
                            • Handling edge cases in the data collection
                            • All model equations
                            • References
Page 14: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 15: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 16: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 17: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 18: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 19: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 20: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 21: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 22: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 23: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 24: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 25: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 26: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 27: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 28: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 29: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 30: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 31: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 32: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 33: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 34: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 35: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 36: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 37: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 38: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 39: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 40: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 41: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 42: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 43: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 44: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 45: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 46: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 47: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 48: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 49: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 50: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 51: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 52: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 53: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 54: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 55: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 56: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 57: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 58: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 59: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 60: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 61: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 62: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 63: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 64: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 65: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 66: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 67: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 68: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 69: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan
Page 70: The effectiveness of eight nonpharmaceutical interventions … · 2020. 5. 28. · The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan