Applications of G-estimation using a new Stata command

Jonathan Sterne
jonathan.sterne@bristol.ac.uk

Kate Tilling
kate.tilling@bristol.ac.uk

Department of Social Medicine, University of Bristol, UK


Outline

• Time varying confounding and G-estimation

• G-estimation in Stata

• Applications

• Discussion and future plans


A covariate is a time-varying confounder for the effect of exposure on outcome if:

1. past covariate values predict current exposure
2. current covariate value predicts outcome

Example:
1. people with low CD4 are more likely to get HAART
2. low CD4 is a risk factor for AIDS and death

If, in addition, past exposure predicts current covariate value then standard survival analyses with time-updated exposure effects will give biased exposure effect estimates

For example, CD4 count predicts HAART and HAART raises CD4 counts


G-estimation (1)

• Assume that subject i has an underlying counterfactual failure time Ui: the time to failure had they never been exposed. This is unobservable for subjects who were exposed at any time

• Assume that exposure accelerates failure time by a factor exp(-ψ), the causal survival time ratio. So if ψ<0 exposure increases survival, if ψ>0 exposure decreases survival

• If we knew ψ, then for any subject who experienced the outcome event at time Ti, the counterfactual failure time could be derived by:

$$U_i = \int_0^{T_i} \exp\bigl(\psi E_i(t)\bigr)\,dt$$

where E_i(t) is 1 if subject i is exposed at time t and 0 otherwise

• Example: if subject i experienced the outcome event at 5 years and was exposed for 3 years then Ui = 3exp(ψ) + 2
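The worked example can be checked numerically. The following Python sketch is an illustration outside Stata; the function name and the interval representation are assumptions made here and are not part of the stgest command. It evaluates the integral for an exposure history that is constant within each interval.

```python
import math

def counterfactual_time(intervals, psi):
    """Return U = integral of exp(psi * E(t)) dt over the observed follow-up.

    intervals: list of (start, stop, exposed) with exposed equal to 0 or 1,
    assumed constant within each interval (a hypothetical representation).
    """
    total = 0.0
    for start, stop, exposed in intervals:
        # Exposed time is rescaled by exp(psi); unexposed time counts as is.
        total += (stop - start) * math.exp(psi * exposed)
    return total

# Subject exposed for 3 years, then unexposed for 2 years, failing at 5 years.
history = [(0.0, 3.0, 1), (3.0, 5.0, 0)]
psi = 0.239  # illustrative value only
u = counterfactual_time(history, psi)
print(u)  # same quantity as 3*exp(psi) + 2
```

With psi = 0 the function returns the observed failure time, since exposure then has no effect.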


G-estimation (2)

Assume that there are no unmeasured confounders

• conditional on measured history (past and present confounders and past exposure) subjects’ present exposure is independent of their counterfactual failure time Ui

• e.g. for 2 individuals with identical histories, the decision to quit smoking does not depend on underlying survival time

Use logistic regression to search for a value of ψ that satisfies this condition
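A toy illustration of the search idea, written in Python rather than Stata. It is a sketch under strong assumptions: a single binary exposure fixed from time zero, no covariates and no censoring. It also swaps the logistic regression for a simpler association measure (the difference in mean log U(ψ) between exposed and unexposed subjects). The data and all function names are invented for illustration.

```python
import math

# Invented data: unexposed failure times, and exposed subjects whose
# observed times are the same values shortened by exp(-PSI_TRUE).
PSI_TRUE = 0.3
unexposed_times = [2.0, 4.0, 6.0, 8.0]
exposed_times = [u * math.exp(-PSI_TRUE) for u in unexposed_times]

def association(psi):
    """Mean log U(psi) among exposed minus mean log U(psi) among unexposed.

    With exposure fixed from time zero, U(psi) = T * exp(psi) for the
    exposed and U(psi) = T for the unexposed.
    """
    log_u_exposed = [math.log(t * math.exp(psi)) for t in exposed_times]
    log_u_unexposed = [math.log(t) for t in unexposed_times]
    mean_exposed = sum(log_u_exposed) / len(log_u_exposed)
    mean_unexposed = sum(log_u_unexposed) / len(log_u_unexposed)
    return mean_exposed - mean_unexposed

def grid_search(lo, hi, step):
    """Return the grid value of psi at which |association| is smallest."""
    n_steps = int(round((hi - lo) / step))
    grid = [lo + k * step for k in range(n_steps + 1)]
    return min(grid, key=lambda psi: abs(association(psi)))

psi_hat = grid_search(-2.0, 2.0, 0.01)
print(round(psi_hat, 2))
```

The value of ψ at which exposure carries no information about U(ψ) is taken as the estimate; in this toy data set that is the value used to generate the exposed times.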


Censoring

• No competing risks
Replace U(ψ) with a variable indicating whether the individual would have been observed to fail both if they were exposed and if they were unexposed.

• Competing risks
Assume that, conditional on known covariates, censoring due to competing risks is independent of failure time.

Estimate the cumulative probability of being free from competing risks until end of follow up, and weight by the inverse of this probability.
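The weighting step for competing risks can be sketched as follows. This is a minimal Python illustration, not the stgest implementation; the per-visit probabilities are hypothetical inputs standing in for fitted values from a censoring model.

```python
def censoring_weights(p_uncensored_by_visit):
    """Return (cumulative probability of remaining uncensored, weight).

    p_uncensored_by_visit: per-visit probabilities of NOT being censored by
    a competing risk (hypothetical values here).
    """
    cumulative = 1.0
    for p in p_uncensored_by_visit:
        # Remaining uncensored overall requires remaining uncensored each visit.
        cumulative *= p
    return cumulative, 1.0 / cumulative

cum_p, weight = censoring_weights([0.95, 0.90, 0.80])
print(round(cum_p, 3), round(weight, 3))
```

Subjects who were unlikely to remain uncensored receive larger weights, so that they also stand in for similar subjects who were lost to competing risks.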


The stgest command

• Written for Stata

• User specifies exposure, covariates (including baseline and lagged covariates) and any censoring variables

• Data set up in Stata survival analysis format (i.e. start time, end time and failure indicator for each interval for each individual)

• Uses interval bisection method to search for G-estimate and 95% CI (or user can specify range and ‘step’ for grid search)
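Interval bisection itself can be sketched in a few lines of Python. This is a generic illustration and not the code used inside stgest; the function `toy_z` is a hypothetical stand-in for the test statistic evaluated at a given ψ.

```python
def bisect_root(f, lo, hi, tol=1e-6, max_iter=200):
    """Find psi in [lo, hi] with f(psi) close to 0 by interval bisection.

    Assumes f is continuous and f(lo), f(hi) have opposite signs.
    """
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0.0:
        return lo
    if f_hi == 0.0:
        return hi
    if f_lo * f_hi > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        f_mid = f(mid)
        if f_mid == 0.0 or (hi - lo) / 2.0 < tol:
            return mid
        # Keep the half-interval whose endpoints still bracket the root.
        if f_lo * f_mid < 0:
            hi, f_hi = mid, f_mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2.0

# Hypothetical stand-in for the z statistic as a function of psi.
def toy_z(psi):
    return 5.0 * (psi - 0.239)

psi_hat = bisect_root(toy_z, -2.0, 2.0)
print(round(psi_hat, 3))
```

The same routine could be pointed at z = 1.96 and z = -1.96 instead of z = 0 to locate confidence limits, which is one way to read the three runs of values in a bisection trace.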


Caerphilly study

– 2512 men first examined 1979 to 1983, mean age at baseline 52 years

– Three further follow up surveys with ascertainment of MI and deaths to August 2000

– Data from the first examination is used to provide baseline exposure measures, so follow-up starts from the second examination

– 1756 men included in analyses

– 244 had a first MI or died from CHD between the second examination and the end of follow up


Data

• Baseline

smoking history, age, self-reported CHD, gout, diabetes, high blood pressure

• Every visit

BP, BMI, smoking status, total cholesterol, CHD, gout, diabetes, fibrinogen


Censoring

• Four possibilities:
– Not censored 1175 (66.9%)
– MI or MI death 244 (13.9%)
– Death from other cause 231 (13.2%)
– Lost 106 (6.0%)

• Multinomial logistic regression is used to estimate, for each individual, the probability of being censored (last two categories) at each examination; the cumulative probability of remaining uncensored is the product of the per-examination probabilities of not being censored


. list id visit examdat exitdate mi examdat2 cursmok if touse

         id   visit     examdat    exitdate   mi    examdat2   cursmok
 16.   1021       1   10sep1979   31jul1984    0   31jul1984         0
 17.   1021       2   31jul1984   17mar1992    0   31jul1984         0
 18.   1021       3   17mar1992   18jun1996    1   31jul1984         0
 19.   1022       1   10sep1979   19sep1984    0   19sep1984         1
 20.   1022       2   19sep1984   20nov1989    0   19sep1984         1
 21.   1022       3   20nov1989   28oct1993    0   19sep1984         1
 22.   1022       4   28oct1993   31dec1998    0   19sep1984         0
 23.   1023       1   10sep1979   03oct1984    0   03oct1984         1
 24.   1023       2   03oct1984   20nov1989    0   03oct1984         1
 25.   1023       3   20nov1989   08nov1993    0   03oct1984         1
 26.   1023       4   08nov1993   31dec1998    0   03oct1984         1


. stset exitdate, id(id) failure(mi) origin(time examdat2) scale(365.25)

                id:  id
     failure event:  mi ~= 0 & mi ~= .
obs. time interval:  (exitdate[_n-1], exitdate]
 exit on or before:  failure
    t for analysis:  (time-origin)/365.25
            origin:  time examdat2

-----------------------------------------------------------------------
     6377  total obs.
     1756  obs. end on or before enter()
-----------------------------------------------------------------------
     4621  obs. remaining, representing
     1756  subjects
      244  failures in single failure-per-subject data
 18547.87  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =  14.47502


. list id visit examdat exitdate mi _t0 _t _d _st if touse, noobs nodisp

     id   visit     examdat    exitdate   mi    _t0      _t   _d   _st
   1021       1   10sep1979   31jul1984    0      .       .    .     0
   1021       2   31jul1984   17mar1992    0   0.00    7.63    0     1
   1021       3   17mar1992   18jun1996    1   7.63   11.88    1     1
   1022       1   10sep1979   19sep1984    0      .       .    .     0
   1022       2   19sep1984   20nov1989    0   0.00    5.17    0     1
   1022       3   20nov1989   28oct1993    0   5.17    9.11    0     1
   1022       4   28oct1993   31dec1998    0   9.11   14.28    0     1
   1023       1   10sep1979   03oct1984    0      .       .    .     0
   1023       2   03oct1984   20nov1989    0   0.00    5.13    0     1
   1023       3   20nov1989   08nov1993    0   5.13    9.10    0     1
   1023       4   08nov1993   31dec1998    0   9.10   14.24    0     1
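The analysis times in this listing follow from the stset options: time is measured in days from the origin (the second examination date, examdat2) and divided by the scale, 365.25. A short Python check for subject 1021, using only the dates shown in the listing (the helper name is an assumption made here):

```python
from datetime import date

def years_since_origin(d, origin, scale=365.25):
    """Analysis time: days between origin and d, divided by scale."""
    return (d - origin).days / scale

# Subject 1021: origin is the second examination date (examdat2).
origin = date(1984, 7, 31)
t0 = years_since_origin(date(1984, 7, 31), origin)  # start of visit-2 interval
t1 = years_since_origin(date(1992, 3, 17), origin)  # end of visit-2 interval
t2 = years_since_origin(date(1996, 6, 18), origin)  # failure (mi = 1)
print(f"{t0:.2f} {t1:.2f} {t2:.2f}")  # 0.00 7.63 11.88
```

The visit-1 record ends at the origin, so it contributes no time at risk and is excluded (_st = 0).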


. makebase cursmok hearta gout highbp diabet fibrin chol cholsq /*
> */ bpsyst bpdias obese thin, firstvis(1) visit(visit)

Baseline confounders

              storage  display     value
variable name   type   format      label      variable label
---------------------------------------------------------------------
Bcursmok        byte   %9.0g
Bhearta         byte   %9.0g
Bgout           byte   %9.0g
Bhighbp         byte   %9.0g
Bdiabet         byte   %9.0g
Bfibrin         float  %9.0g
Bchol           float  %9.0g
Bcholsq         float  %9.0g
Bbpsyst         int    %9.0g
Bbpdias         int    %9.0g
Bobese          byte   %9.0g
Bthin           byte   %9.0g


. makelag cursmok hearta gout highbp diabet fibrin chol cholsq /*
> */ bpsyst bpdias obese thin, firstvis(1) visit(visit)

Lagged confounders

              storage  display     value
variable name   type   format      label      variable label
----------------------------------------------------------------------
Lcursmok        byte   %9.0g
Lhearta         byte   %9.0g
Lgout           byte   %9.0g
Lhighbp         byte   %9.0g
Ldiabet         byte   %9.0g
Lfibrin         float  %9.0g
Lchol           float  %9.0g
Lcholsq         float  %9.0g
Lbpsyst         int    %9.0g
Lbpdias         int    %9.0g
Lobese          byte   %9.0g
Lthin           byte   %9.0g
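The slides do not spell out how the B* and L* variables are constructed. One plausible reading, inferred only from the variable names and the firstvis() and visit() options, is that the B-prefixed variable holds the value at the first visit and the L-prefixed variable holds the value at the previous visit. The Python sketch below implements that reading; it is an assumption, not a description of the makebase and makelag commands.

```python
def add_baseline_and_lag(rows, var, first_visit=1):
    """Add 'B'+var (value at the first visit) and 'L'+var (value at the
    previous visit) to each row. rows: list of dicts with 'id', 'visit', var.
    """
    by_id = {}
    for row in sorted(rows, key=lambda r: (r["id"], r["visit"])):
        by_id.setdefault(row["id"], []).append(row)
    out = []
    for id_rows in by_id.values():
        baseline = next(
            (r[var] for r in id_rows if r["visit"] == first_visit), None
        )
        previous = None
        for r in id_rows:
            new = dict(r)
            new["B" + var] = baseline
            new["L" + var] = previous  # None (missing) at the first visit
            previous = r[var]
            out.append(new)
    return out

# Smoking status for subject 1022, taken from the data listing.
rows = [
    {"id": 1022, "visit": 1, "cursmok": 1},
    {"id": 1022, "visit": 2, "cursmok": 1},
    {"id": 1022, "visit": 3, "cursmok": 1},
    {"id": 1022, "visit": 4, "cursmok": 0},
]
result = add_baseline_and_lag(rows, "cursmok")
print([(r["visit"], r["Bcursmok"], r["Lcursmok"]) for r in result])
```

Under this reading the lagged value is missing at the first visit, which fits an analysis that starts follow-up at the second examination.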


. stcox cursmok Agegrp* hearta gout highbp diabet fibrin chol cholsq bpsyst bpdias obese thin B* L*

         failure _d:  mi
   analysis time _t:  (exitdate-origin)/365.25
             origin:  time examdat2
                 id:  id

No. of subjects =         1756               Number of obs   =      4621
No. of failures =          244
Time at risk    =  18547.87132
                                             LR chi2(41)     =    178.92
Log likelihood  =   -1662.3478               Prob > chi2     =    0.0000

----------------------------------------------------------------------
      _t |
      _d | Haz. Ratio   Std. Err.      z    P>|z|   [95% Conf. Interval]
---------+------------------------------------------------------------
 cursmok |   1.014992   .2085446    0.07   0.942    .6785331   1.518288

(remaining output omitted)


. stgest cursmok Agegrp* fibrin hearta gout highbp diabet chol cholsq bpsyst bpdias obese thin, visit(visit) firstvis(2) lagconf(cursmok fibrin hearta gout highbp diabet chol cholsq bpsyst bpdias obese thin) baseconf(fibrin hearta gout highbp cursmok chol cholsq diabet bpsyst bpdias obese thin) lasttime(mienddat) range(-2 2) saveres(caergestsmoknocens) replace

causvar: cursmok
visit: visit
Range: -2 2, rnum: 2
Search method: interval bisection

-2.00 2.00 0.00 1.00 0.50 0.25 0.13 0.19 0.22 0.23 0.24 0.24 0.24 0.24 0.38 0.31 0.34 0.36 0.37 0.37 0.37 0.37 -1.00 -0.50 -0.25 -0.13 -0.06 -0.03 -0.02 -0.01 -0.00 -0.00 -0.00

savres: caergestsmoknocens

G estimate of psi for cursmok: 0.239 (95% CI -0.001 to 0.368)

Causal survival time ratio for cursmok: 0.787 (95% CI 0.692 to 1.001)
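The causal survival time ratio is exp(-ψ), so the reported ratio and its interval can be reproduced from the G-estimate. A short Python check (the function name is an assumption made here):

```python
import math

def survival_time_ratio(psi, ci_low, ci_high):
    """Convert psi and its CI to the causal survival time ratio exp(-psi).

    Because exp(-psi) is decreasing in psi, the CI limits swap places.
    """
    return math.exp(-psi), math.exp(-ci_high), math.exp(-ci_low)

ratio, low, high = survival_time_ratio(0.239, -0.001, 0.368)
print(f"{ratio:.3f} ({low:.3f} to {high:.3f})")  # 0.787 (0.692 to 1.001)
```

A ratio below 1 means that current smoking shortens survival time, here by an estimated 21%.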


[Figure: test statistic z (vertical axis, -2 to 2) plotted against psi (horizontal axis, -0.2 to 0.4)]


. weibull _t cursmok Agegrp* hearta gout highbp diabet fibrin chol cholsq bpsyst bpdias obese thin B* L* if visit>=2, dead(_d) t0(_t0) hr

      _t | Haz. Ratio    Std Err       z    P>|z|   [95% Conf. Interval]
 --------+---------------------------------------------------------
 cursmok |    1.01690   .2083929    0.08   0.935    .6805221   1.519549

(rest of output omitted)

. gesttowb

g-estimated hazard ratio 1.28 ( 1.00 to 1.47)
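Under a Weibull model with shape parameter p, a survival time ratio of exp(-ψ) corresponds to a hazard ratio of exp(ψ·p). The Python sketch below assumes that this is the conversion gesttowb applies after the Weibull fit; the slides do not confirm it. The shape value in the example is a placeholder, since the Weibull ancillary output is omitted from the slide.

```python
import math

def weibull_hazard_ratio(psi, shape):
    """Hazard ratio implied by psi under a Weibull model with given shape.

    A survival time ratio exp(-psi) corresponds to a hazard ratio
    exp(psi * shape) when survival times are Weibull distributed.
    """
    return math.exp(psi * shape)

# shape = 1.0 (exponential survival) is a placeholder, not the fitted value.
hr = weibull_hazard_ratio(0.239, shape=1.0)
print(round(hr, 2))
```

A fitted shape slightly above 1 would give a hazard ratio slightly above exp(ψ).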


. * allowing for censoring due to competing risks;

. stgest cursmok Agegrp* fibrin hearta gout highbp diabet chol cholsq bpsyst bpdias obese thin, visit(visit) firstvis(2) lagconf(fibrin hearta gout highbp diabet cursmok chol cholsq bpsyst bpdias obese thin) baseconf(fibrin hearta gout highbp cursmok chol cholsq diabet bpsyst bpdias obese thin) lasttime(mienddat) saveres(caergestsmok) replace idcens(idcrcens) range(-2 2) pnotcens(pnotcens)

G estimate of psi for cursmok: 0.290 (95% CI -0.190 to 0.773)

Causal survival time ratio for cursmok: 0.748 (95% CI 0.462 to 1.210)

. gesttowb

g-estimated hazard ratio 1.34 ( 0.82 to 2.19)


Atherosclerosis Risk in Communities (ARIC) study

• 15,792 members of 4 communities in the USA

• baseline exam between 1987 and 1989

• 3 follow-up exams at 3 year intervals

• followed up for death, CHD and stroke


ARIC data

• Baseline

smoking history, education level, age, sex, ethnicity, self-reported stroke/CHD

• Every visit

BP, BMI, smoking status, total, HDL and LDL cholesterol, diabetes status, use of anti-hypertensive medication


ARIC data

13898 persons with data on visits 1 and 2

7699 (55%) female

Mean age =54 (min=45, max=65).

CHD present in 625 (5%)

9754 (70%) not on anti-hypertensive medication at visits 1 or 2.


Methods

Weibull analysis and G-estimation

Outcomes - death, incident CHD.
CHD as outcome - exclude those with CHD at baseline/1st visit, censor if they die of other causes

Exposures - BP, smoking, BMI, HDL, LDL
BP - exclude those on anti-hypertensives at baseline, censor at anti-hypertensive use.


Results

Published in the American Journal of Epidemiology, April 15th 2002.

Tilling K, Sterne JAC, Szklo M. G-estimation of the effects of cardiovascular risk factors on all-cause mortality and CHD: the ARIC study. AJE 2002; 155: 710-718

Summary: effects tended to be under-estimated by Weibull compared to g-estimation.


Discussion - model specification

Model specified that exposure at a given visit multiplies survival from that moment by a given amount.

Alternatives:

• effect on survival only lasts for a given period (e.g. use of anti-hypertensives)

• effect on survival starts after a given period (e.g. possible lagged effect of smoking)


Future work and (we hope) collaboration

• Implement MSMs in Stata

• Effect of cardiovascular risk factors (e.g. smoking, fibrinogen) and anti-hypertensives in Caerphilly study

• Effect of treatments (e.g. anti-hypertensives, anti-platelet agents) on stroke recurrence using South London Stroke Register


Future work and (we hope) collaboration

• Causal effect of HAART
– When to start
– Effect of different drug combinations
– Will require large collaborations between cohorts
– Aim to build on an existing collaboration between 13 cohorts involving 12500 patients starting HAART
