analysis of time-stratified case- crossover studies in environmental epidemiology using stata...

27
Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC), Barcelona, Spain Ben Armstrong and Antonio Gasparrini London School of Hygiene and Tropical Medicine, UK 2014 UK Stata Users Group meeting London, 12th September 2014

Upload: brittany-fisher

Post on 02-Jan-2016

296 views

Category:

Documents


8 download

TRANSCRIPT

Page 1: Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC),

Analysis of time-stratified case-crossover studies in environmental epidemiology

using Stata

Aurelio TobíasSpanish Council for Scientific Research (CSIC), Barcelona, Spain

Ben Armstrong and Antonio Gasparrini

London School of Hygiene and Tropical Medicine, UK

2014 UK Stata Users Group meetingLondon, 12th September 2014

Page 2: Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC),

Background

• The time stratified case crossover design is a popular alternative to conventional time series regression for analysing associations between time series of environmental exposures (air pollution, weather) and counts of health outcomes

Page 3: Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC),

Healthevent

Have you made any unusual activity …

The case-crossover design

• Proposed by Maclure (1991) to study transient effects on the risk of acute health events

… compared to your usual routine?

Page 4: Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC),

The case-crossover design

• Proposed by Maclure (1991) to study transient effects on the risk of acute health events

HealtheventRisk periodControl period

Exposed? Exposed?

Page 5: Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC),

The case-crossover design

• Proposed by Maclure (1991) to study transient effects on the risk of acute health events

• Analysis likewise a matched case-control

HealtheventRisk periodControl period

Exposed? Exposed?

ControlExp. No

Exp.No

Riskperiod

ac

bd

OR = b/c

Page 6: Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC),

Application to environmental studies

• Firstly adapted to air pollution studies – Philadelphia (Neas et al. 1999) and Barcelona (Sunyer et

al. 2000)

• comparing exposure levels for a given day (t) when health event occurs vs. levels before (t-7) and after (t+7) the health event

• Allows to control for time-trend and seasonality by design, since compares exposure levels between same weekdays within each month of each year

Page 7: Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC),

Time-stratified case-crossover

(Figueiras et al. 2010)

Page 8: Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC),

Option 1: conditional logistic regression

• Dataset needs to be reshaped from time-series format to individual matched case-control by using the new command,

• . mkcco vary, date(vardate)– This also creates three new variables,– case case days (=1)– wn weight observations by n events on case day – ccset set num. to match case-control days

• Then, use Stata’s clogit command

Page 9: Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC),

. use madrid, clear

. list date y pm10 temp in 1/31, noobs clean

date y pm10 temp 01jan2003 64 14 9.6 02jan2003 56 12 11 03jan2003 73 16 9.8 04jan2003 69 14 8.8 05jan2003 71 17 4.3 06jan2003 68 9 7 07jan2003 63 19 3 08jan2003 85 13 6.8 09jan2003 67 13 4.6 10jan2003 80 22 2.7 11jan2003 65 23 1 12jan2003 95 17 .85 13jan2003 60 45 .5 14jan2003 76 60 .45 15jan2003 77 72 1.3 16jan2003 75 54 2.6 17jan2003 76 65 2 18jan2003 75 43 .8 19jan2003 74 12 6.3 20jan2003 79 19 5.8 21jan2003 73 16 8 22jan2003 72 21 6.7 23jan2003 72 25 7 24jan2003 74 46 6.2 25jan2003 59 42 8.1 26jan2003 77 26 10 27jan2003 64 22 15 28jan2003 65 41 9.7 29jan2003 74 18 5.7 30jan2003 77 17 4.8 31jan2003 55 17 2.3

Page 10: Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC),

. generate dow = dow(date)

. list date y pm10 temp in 1/31 if dow==1, noobs clean

date y pm10 temp 06jan2003 68 9 7 13jan2003 60 45 .5 20jan2003 79 19 5.8 27jan2003 64 22 15

. mkcco y, date(date)

Dataset reshaped from 1096 to 5480 obs. (1096 case days and 4384 control days).New variables case, wn and ccset have been added to the resaphed dataset.

Sun Mon Tue Wed Thu Fri Sat

1 2 3 4

5 6 7 8 9 10 11

12 13 14 15 16 17 18

19 20 21 22 23 24 25

26 27 28 29 30 31

Page 11: Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC),

. list date y pm10 temp case ccset wn in 1/36 if dow==1, noobs clean

date y pm10 temp case ccset wn 06jan2003 68 9 7 1 2003021 68 13jan2003 60 45 .5 0 2003021 68 20jan2003 79 19 5.8 0 2003021 68 27jan2003 64 22 14.8 0 2003021 68

...

Sun Mon Tue Wed Thu Fri Sat

1 2 3 4

5 6 7 8 9 10 11

12 13 14 15 16 17 18

19 20 21 22 23 24 25

26 27 28 29 30 31

Page 12: Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC),

...

date y pm10 temp case ccset wn 06jan2003 68 9 7 0 2003022 60 13jan2003 60 45 .5 1 2003022 60 20jan2003 79 19 5.8 0 2003022 60 27jan2003 64 22 14.8 0 2003022 60

Sun Mon Tue Wed Thu Fri Sat

1 2 3 4

5 6 7 8 9 10 11

12 13 14 15 16 17 18

19 20 21 22 23 24 25

26 27 28 29 30 31

Page 13: Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC),

...

date y pm10 temp case ccset wn 06jan2003 68 9 7 0 2003023 79 13jan2003 60 45 .5 0 2003023 79 20jan2003 79 19 5.8 1 2003023 79 27jan2003 64 22 14.8 0 2003023 79

Sun Mon Tue Wed Thu Fri Sat

1 2 3 4

5 6 7 8 9 10 11

12 13 14 15 16 17 18

19 20 21 22 23 24 25

26 27 28 29 30 31

Page 14: Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC),

...

date y pm10 temp case ccset wn 06jan2003 68 9 7 0 2003024 64 13jan2003 60 45 .5 0 2003024 64 20jan2003 79 19 5.8 0 2003024 64 27jan2003 64 22 14.8 1 2003024 64

Sun Mon Tue Wed Thu Fri Sat

1 2 3 4

5 6 7 8 9 10 11

12 13 14 15 16 17 18

19 20 21 22 23 24 25

26 27 28 29 30 31

Page 15: Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC),

. clogit case pm10 [w=wn], group(ccset)(frequency weights assumed)...

Conditional (fixed-effects) logistic regression Number of obs = 295025 LR chi2(1) = 8.39 Prob > chi2 = 0.0038Log likelihood = -98913.41 Pseudo R2 = 0.0000

------------------------------------------------------------------------------ case | Coef. Std. Err. z P>|z| [95% Conf. Interval]-------------+---------------------------------------------------------------- pm10 | .0007406 .0002552 2.90 0.004 .0002404 .0012408------------------------------------------------------------------------------

Page 16: Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC),

Option 1: conditional logistic regression

• Advantage– Similar approach to a matched

case-control study, being familiar to the epidemiologist

• Limitations– Unfriendly data management– Large (huge) data-sets at

individual level– Slow convergence when fitting

interaction terms for individual factors (sex, age)

– Cannot account for overdispersion and residual autocorrelation

Page 17: Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC),

Option 2: Poisson regression

• Lu and Zeger (2007) – “ … the time-stratified case-crossover design leads to the

same estimate as obtained from a Poisson regression with dummy variables indicating the strata”

• Strata defined by the 3-way interaction between year, month and day of week

• Either poisson or glm (jointly with the scale option) Stata’s commands can be used

Page 18: Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC),

. use madrid, clear

. generate yy = year(date)

. generate mm = month(date)

. generate dow = dow(date)

. poisson y pm10 i.yy##i.mm#i.dow

...

Poisson regression Number of obs = 1096 LR chi2(252) = 1374.12 Prob > chi2 = 0.0000Log likelihood = -3787.589 Pseudo R2 = 0.1535

------------------------------------------------------------------------------ y | Coef. Std. Err. z P>|z| [95% Conf. Interval]-------------+---------------------------------------------------------------- pm10 | .0007406 .0002552 2.90 0.004 .0002404 .0012408 | ... 252 parameters! | _cons | 4.35927 .0563535 77.36 0.000 4.248819 4.469721------------------------------------------------------------------------------

Page 19: Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC),

. glm y pm10 i.yy##i.mm#i.dow, family(poisson) link(log) scale(x2)

...

Generalized linear models No. of obs = 1096Optimization : ML Residual df = 843 Scale parameter = 1Deviance = 1069.73306 (1/df) Deviance = 1.26896Pearson = 1065.616046 (1/df) Pearson = 1.264076

Variance function: V(u) = u [Poisson]Link function : g(u) = ln(u) [Log]

AIC = 7.373338Log likelihood = -3787.588997 BIC = -4830.78------------------------------------------------------------------------------ | OIM y | Coef. Std. Err. z P>|z| [95% Conf. Interval]-------------+---------------------------------------------------------------- pm10 | .0007406 .0002869 2.58 0.010 .0001782 .001303 | ... | _cons | 4.35927 .0633589 68.80 0.000 4.235088 4.483451------------------------------------------------------------------------------(Standard errors scaled using square root of Pearson X2-based dispersion.)

Page 20: Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC),

Option 2: Poisson regression

• Advantages– Avoids reshape of dataset

keeping original time-series format

– Can account for overdispersion and residual autocorrelation

• Limitations– Large number of interaction

parameters, slowing the convergence time (even making difficult it with missing data)

– Even slower when fitting interaction terms for individual factors (sex, age)

Page 21: Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC),

Option 3: Conditional Poisson regression

• Armstrong and Gasparrini (ISEE 2011) – “ … Poisson models with large numbers of indicator variables

can alternatively be fit with conditional Poisson models, conditioning on numbers of events in the time stratum”

– Firstly proposed by Farrington (1995) for the self-controlled case-series design for vaccine safety studies

• Strata defined by grouping same weekday within each month of each year, and then use Stata’s xtpoisson command jointly with the fe option

• But overdispersion needs to accounted for with the new xtpois_ovdp command

Page 22: Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC),

. egen set = group(yy mm dow)

. xtset set date

...

. xtpoisson y pm10, fe

...

Conditional fixed-effects Poisson regression Number of obs = 1096Group variable: set Number of groups = 252

Obs per group: min = 4 avg = 4.3 max = 5

Wald chi2(1) = 8.42Log likelihood = -2854.4979 Prob > chi2 = 0.0037

------------------------------------------------------------------------------ y | Coef. Std. Err. z P>|z| [95% Conf. Interval]-------------+---------------------------------------------------------------- pm10 | .0007406 .0002552 2.90 0.004 .0002404 .0012408------------------------------------------------------------------------------

. xtpois_ovdp

Estimate and standard errors corrected for over-dipersiondf: 843 ; pearson x2: 1065.6 ; dispersion: 1.26------------------------------------------------------------------------------ y | Coef. Std. Err. z P>|z| [95% Conf. Interval]-------------+---------------------------------------------------------------- pm10 | .0007406 .0002869 2.58 0.010 .0001782 .001303------------------------------------------------------------------------------

Page 23: Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC),

CPU time

• My example (3 years data, 60 events/day)– clogit 0.181 sec.– poisson 2.111 sec.– xtpoisson 0.066 sec.

Page 24: Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC),

Option 3: Conditional Poisson regression

• Advantages– Avoids reshape of dataset

keeping original time-series format

– Can account for overdispersion and residual autocorrelation

– Easy fit of interaction terms for individual factors (sex, age) with faster convergence

• Limitations– ?

Page 25: Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC),

Summary

• Case-crossover studies can easily be analysed using Stata 1) Conditional logistic regression requires unfriendly data

management, and large (huge) data-sets at individual level

2) Poisson, 3-way interaction, regression requires to fit large number of interaction parameters, making it difficult convergence

3) Conditional Poisson seems to solve both

Page 26: Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC),

Thanks for your attention!

Page 27: Analysis of time-stratified case- crossover studies in environmental epidemiology using Stata Aurelio Tobías Spanish Council for Scientific Research (CSIC),

Thanks for your attention!