Cox Regression II

Upload: gaye

Post on 13-Jan-2016

Page 1: Cox Regression II

Cox Regression II

Page 2: Cox Regression II

Monday “Gut Check” Problem…

Write out the likelihood for the following data, with weight as a time-dependent variable:

| Time-to-event (months) | Survival (1=died/0=censored) | Weight at baseline | Weight at 3 months | Weight at 9 months | Weight at 12 months |
|---|---|---|---|---|---|
| 10 | 0 | 140 | 145 | 155 | . |
| 2 | 1 | 240 | . | . | . |
| 4 | 0 | 130 | 130 | . | . |
| 8 | 1 | 200 | 210 | 250 | . |
| 12 | 0 | 150 | 145 | 145 | 140 |
| 14 | 0 | 180 | 180 | 180 | 175 |
| 10 | 1 | 180 | 190 | 240 | . |
| 1 | 0 | 230 | . | . | . |
| 3 | 0 | 110 | 110 | . | . |
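One way to set up the answer is sketched below, numbering the subjects 1 to 9 in table order. The notation is an assumption, not taken from the slides: w_i(t) stands for the most recent weight recorded for subject i at time t, and R(t) is the set of subjects still under observation at t. A subject censored at an event time is counted as still at risk at that time, which is the usual convention. The three deaths (subject 2 at month 2, subject 4 at month 8, subject 7 at month 10) each contribute one factor:

```latex
L_p(\beta)=
\frac{e^{\beta w_2(2)}}{\sum_{i\in R(2)} e^{\beta w_i(2)}}
\times
\frac{e^{\beta w_4(8)}}{\sum_{i\in R(8)} e^{\beta w_i(8)}}
\times
\frac{e^{\beta w_7(10)}}{\sum_{i\in R(10)} e^{\beta w_i(10)}},
\qquad
R(2)=\{1,2,3,4,5,6,7,9\},\quad
R(8)=\{1,4,5,6,7\},\quad
R(10)=\{1,5,6,7\}
```

The point of the exercise is that every weight in a factor is evaluated at that factor's event time, so the same subject can enter different risk sets with different weights.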

Page 3: Cox Regression II

SAS code for a time-dependent variable…

proc phreg data=example;
  model time*censor(0) = weight;
  /* weight is re-evaluated at each event time from the measurement in effect */
  if time<3 then weight=w0;
  if time>=3 and time<6 then weight=w3;
  if time>=6 and time<9 then weight=w6;
  if time>=9 then weight=w9;
run;
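The if-statements in the PROC PHREG step amount to a lookup of the measurement in effect at each event time. A minimal Python sketch of that lookup follows; the function name `weight_at` and the example weights are hypothetical, chosen only for illustration.

```python
def weight_at(event_time, w0, w3, w6, w9):
    """Return the weight in effect at `event_time` (months),
    mirroring the four SAS if-statements."""
    if event_time < 3:
        return w0
    if event_time < 6:
        return w3
    if event_time < 9:
        return w6
    return w9


# Hypothetical subject measured at 0, 3, 6 and 9 months.
for t in (2, 8, 10):
    print(t, weight_at(t, w0=200, w3=210, w6=250, w9=255))
```

At each event time the model calls this kind of lookup for every subject still in the risk set, which is why the same subject can contribute different weights to different terms of the likelihood.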

Page 4: Cox Regression II

Model results

• Using baseline weight: HR=2.8

• Using weight as time-changing variable: HR=9.3

Page 5: Cox Regression II

1. Stratification

Violations of PH assumption can be resolved by:

• Adding time*covariate interaction

• Adding another time-dependent version of the covariate

• Stratification

Page 6: Cox Regression II

Stratification

• Different strata are allowed to have different baseline hazard functions.

• Hazard functions do not need to be parallel between different strata.

• Essentially results in a “weighted” hazard ratio being estimated: weighted over the different strata.

• Useful for “nuisance” confounders (where you do not care to estimate the effect).

• Assumes no interaction between the stratification variable and the main predictors.

Page 7: Cox Regression II

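Example: stratify on gender

Males: 1, 3, 4, 10+, 12, 18 (subjects 1-6)

Females: 1, 4, 5, 9+ (subjects 7-10)

$$
L_p(\beta)=\prod_{i=1}^{m}L_i
=\frac{h_1(1)}{h_1(1)+h_2(1)+h_3(1)+h_4(1)+h_5(1)+h_6(1)}
\times\frac{h_7(1)}{h_7(1)+h_8(1)+h_9(1)+h_{10}(1)}
\times\frac{h_2(3)}{h_2(3)+h_3(3)+h_4(3)+h_5(3)+h_6(3)}
\times\frac{h_3(4)}{h_3(4)+h_4(4)+h_5(4)+h_6(4)}
\times\frac{h_8(4)}{h_8(4)+h_9(4)+h_{10}(4)}
\times\frac{h_9(5)}{h_9(5)+h_{10}(5)}\times\cdots
$$

Each denominator contains only subjects from the failing subject's own stratum: subjects 1-6 for males, subjects 7-10 for females.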

Page 8: Cox Regression II

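The PL

With a separate baseline hazard for males, h_{0m}(t), and for females, h_{0f}(t):

$$
L_p(\beta)=\prod_{i=1}^{m}L_i
=\frac{h_{0m}(t=1)\,e^{\beta x_1}}{h_{0m}(1)e^{\beta x_1}+h_{0m}(1)e^{\beta x_2}+h_{0m}(1)e^{\beta x_3}+h_{0m}(1)e^{\beta x_4}+h_{0m}(1)e^{\beta x_5}+h_{0m}(1)e^{\beta x_6}}
\times\frac{h_{0f}(1)\,e^{\beta x_7}}{h_{0f}(1)e^{\beta x_7}+h_{0f}(1)e^{\beta x_8}+h_{0f}(1)e^{\beta x_9}+h_{0f}(1)e^{\beta x_{10}}}
\times\cdots
$$

The baseline hazards cancel within each stratum:

$$
L_p(\beta)=\prod_{i=1}^{m}L_i
=\frac{e^{\beta x_1}}{e^{\beta x_1}+e^{\beta x_2}+e^{\beta x_3}+e^{\beta x_4}+e^{\beta x_5}+e^{\beta x_6}}
\times\frac{e^{\beta x_7}}{e^{\beta x_7}+e^{\beta x_8}+e^{\beta x_9}+e^{\beta x_{10}}}
\times\cdots
$$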
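The stratified partial likelihood for the gender example can be sketched in a few lines of Python. The event times and strata are taken from the slide. The covariate values `x` are hypothetical, since the slide does not give them, and the function assumes no tied event times within a stratum.

```python
import math

# (time, event, stratum) from the slide; event=1 for failure, 0 for censored ("+").
subjects = [
    (1, 1, "M"), (3, 1, "M"), (4, 1, "M"), (10, 0, "M"), (12, 1, "M"), (18, 1, "M"),
    (1, 1, "F"), (4, 1, "F"), (5, 1, "F"), (9, 0, "F"),
]
# Hypothetical covariate values x_1..x_10 (not given on the slide).
x = [0, 1, 0, 1, 1, 0, 1, 0, 1, 0]


def stratified_log_partial_likelihood(beta, subjects, x):
    """Sum of log partial-likelihood terms, with each risk set restricted to
    the failing subject's own stratum (assumes no ties within a stratum)."""
    loglik = 0.0
    for i, (t_i, event_i, stratum_i) in enumerate(subjects):
        if not event_i:
            continue  # censored subjects contribute only through risk sets
        denom = sum(
            math.exp(beta * x[j])
            for j, (t_j, _, stratum_j) in enumerate(subjects)
            if stratum_j == stratum_i and t_j >= t_i
        )
        loglik += beta * x[i] - math.log(denom)
    return loglik


# At beta = 0 each term is 1 / (size of the within-stratum risk set).
print(math.exp(stratified_log_partial_likelihood(0.0, subjects, x)))
```

At beta = 0 the result is the product of the reciprocal risk-set sizes (6, 5, 4, 2, 1 for the male events and 4, 3, 2 for the female events), which makes a convenient check that the risk sets are built per stratum.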

Page 9: Cox Regression II

2. Using age as the time-scale in Cox Regression

Age is a common confounder in Cox Regression, since age is strongly related to death and disease.

You may control for age by adding baseline age as a covariate to the Cox model.

A better strategy for large-scale longitudinal surveys, such as NHANES, is to use age as your time-scale (rather than time-in-study).

You may additionally stratify on birth cohort to control for cohort effects.

Page 10: Cox Regression II

Age as time-scale

• The risk set becomes everyone who was at risk at a certain age rather than at a certain event time.

• The risk set contains everyone who was still event-free at the age of the person who had the event.

• Requires enough people at risk at all ages (such as in a large-scale, longitudinal survey).

Page 11: Cox Regression II

The likelihood with age as time

Event times: 3, 5, 7+, 12, 13+ (years-in-study)

Baseline ages: 28, 25, 40, 29, 30 (years)

Age at event or censoring: 31, 30, 47+, 41, 43+

$$
L_p(\beta)=\prod_{i=1}^{m}L_i
=\frac{h_2(30)}{h_1(30)+h_2(30)+h_4(30)+h_5(30)}
\times\frac{h_1(31)}{h_1(31)+h_4(31)+h_5(31)}
\times\frac{h_4(41)}{h_3(41)+h_4(41)+h_5(41)}
$$
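The risk sets in the age-as-time likelihood can be reproduced with a short Python sketch. The data are the five subjects above; the function name is hypothetical. It follows the slide's convention that a subject is at risk from the entry age itself, whereas some software treats the entry age as excluded.

```python
# Entry age, exit age (event or censoring), and event indicator for the five subjects.
entry_age = [28, 25, 40, 29, 30]
exit_age = [31, 30, 47, 41, 43]
event = [1, 1, 0, 1, 0]


def risk_sets_by_age(entry_age, exit_age, event):
    """Map each event age to the (1-based) subjects under observation at that age."""
    sets = {}
    for i, had_event in enumerate(event):
        if not had_event:
            continue
        age = exit_age[i]
        # Slide's convention: at risk from the entry age itself (entry <= age)
        # through the exit age.
        sets[age] = [
            j + 1
            for j in range(len(event))
            if entry_age[j] <= age <= exit_age[j]
        ]
    return dict(sorted(sets.items()))


print(risk_sets_by_age(entry_age, exit_age, event))
```

Subject 3, who enters the study at age 40, is absent from the risk sets at ages 30 and 31 even though that subject is still event-free; this delayed entry is what distinguishes the age time-scale from time-in-study.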

Page 12: Cox Regression II

3. Residuals

• Residuals are used to investigate the lack of fit of a model to a given subject.

• For Cox regression, there’s no easy analog to the usual “observed minus predicted” residual of linear regression.

Page 13: Cox Regression II

Martingale residual

• ci (1 if event, 0 if censored) minus the estimated cumulative hazard to ti (as a function of the fitted model) for individual i: ci - H(ti, Xi, β)

• E.g., for a subject who was censored at 2 months, and whose predicted cumulative hazard to 2 months was 20%: Martingale = 0 - .20 = -.20

• E.g., for a subject who had an event at 13 months, and whose predicted cumulative hazard to 13 months was 50%: Martingale = 1 - .50 = +.50

• Gives excess failures (observed events minus events expected under the model).

• Martingale residuals are not symmetrically distributed, even when the fitted model is correct, so transform to deviance residuals...

Page 14: Cox Regression II

Deviance Residuals

The deviance residual is a normalized transform of the martingale residual. These residuals are much more symmetrically distributed about zero.

Observations with large deviance residuals are poorly predicted by the model.
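A small Python sketch of both residuals for the two subjects in the martingale example follows. The slides do not give the deviance formula; the one used here is the standard transform, sign(m) * sqrt(-2 * (m + c * log(c - m))), and is included as an assumption about what "normalized transform" refers to. Function names are hypothetical.

```python
import math


def martingale_residual(event, cumulative_hazard):
    """Event indicator (1/0) minus the fitted cumulative hazard at the subject's time."""
    return event - cumulative_hazard


def deviance_residual(event, cumulative_hazard):
    """Standard deviance transform of the martingale residual m:
    sign(m) * sqrt(-2 * (m + event * log(event - m)))."""
    m = martingale_residual(event, cumulative_hazard)
    # event - m equals the cumulative hazard; the log term vanishes when censored.
    log_term = event * math.log(event - m) if event else 0.0
    return math.copysign(math.sqrt(-2.0 * (m + log_term)), m)


# Censored at 2 months with cumulative hazard 0.20; event at 13 months with 0.50.
for event, hazard in ((0, 0.20), (1, 0.50)):
    print(event, hazard,
          round(martingale_residual(event, hazard), 2),
          round(deviance_residual(event, hazard), 3))
```

The martingale residuals are -0.20 and +0.50, while the deviance residuals come out close to -0.63 and +0.62, illustrating how the transform pulls the two kinds of subject onto a more symmetric scale.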

Page 15: Cox Regression II

Deviance Residuals

• Behave like residuals from ordinary linear regression

• Should be symmetrically distributed around 0 and have standard deviation of 1.0.

• Negative for observations with longer than expected observed survival times.

• Plot deviance residuals against covariates to look for unusual patterns.

Page 16: Cox Regression II

Deviance Residuals

In SAS, option on the output statement:

Output out=outdata resdev=Varname

**Cannot get diagnostics in SAS if there is a time-dependent covariate in the model

Page 17: Cox Regression II

Example: uis data

Out of 628 observations, a few residuals in the range of 3 SD are not unexpected.

Pattern looks fairly symmetric around 0.

Page 18: Cox Regression II

Example: uis data

What do you think this cluster represents?

Page 19: Cox Regression II

Example: censored only

Page 20: Cox Regression II

Example: had event only

Page 21: Cox Regression II

Schoenfeld residuals

• Schoenfeld (1982) proposed the first set of residuals for use with Cox regression packages.
  Schoenfeld D. Residuals for the proportional hazards regression model. Biometrika, 1982, 69(1):239-241.

• Instead of a single residual for each individual, there is a separate residual for each individual for each covariate.

• Note: Schoenfeld residuals are not defined for censored individuals.

Page 22: Cox Regression II

Schoenfeld residuals

• The Schoenfeld residual is defined as the covariate value for the individual that failed minus its expected value. (Yields residuals for each individual who failed, for each covariate.)

• Expected value of the covariate at time ti = a weighted average of the covariate, weighted by the likelihood of failure for each individual in the risk set at ti.

$$
\text{residual}_{ik} = x_{ik} - \sum_{j \in R(t_i)} x_{jk}\, p_j
$$

where x_{ik} is the value of covariate k for person i, who had the event (e.g., age 56 years), and p_j is the probability of the event now for the j-th person in the risk set.

The person who died was 56; based on the fitted model, how likely is it that the person who died was 56 rather than older?

Page 23: Cox Regression II

Example

5 people left in our risk set at event time=7 months:

• Female 55-year old smoker
• Male 45-year old non-smoker
• Female 67-year old smoker
• Male 58-year old smoker
• Male 70-year old non-smoker

The 55-year old female smoker is the one who has the event…

Page 24: Cox Regression II

Example

Based on our model, we can calculate a predicted probability of death by time 7 for each person (call it “p-hat”):

• Female 55-year old smoker: p-hat=.10
• Male 45-year old non-smoker: p-hat=.05
• Female 67-year old smoker: p-hat=.30
• Male 58-year old smoker: p-hat=.20
• Male 70-year old non-smoker: p-hat=.30

Thus, the expected value for the AGE of the person who failed is:

55(.10) + 45(.05) + 67(.30) + 58(.20) + 70(.30) = 60.45 ≈ 60

And, the Schoenfeld residual is: 55-60 = -5

Page 25: Cox Regression II

Example

Based on our model, we can calculate a predicted probability of death by time 7 for each person (call it “p-hat”):

• Female 55-year old smoker: p-hat=.10
• Male 45-year old non-smoker: p-hat=.05
• Female 67-year old smoker: p-hat=.30
• Male 58-year old smoker: p-hat=.20
• Male 70-year old non-smoker: p-hat=.30

The expected value for the GENDER (1=male, 0=female) of the person who failed is:

0(.10) + 1(.05) + 0(.30) + 1(.20) + 1(.30) = .55

And, the Schoenfeld residual is: 0-.55 = -.55
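The two residual calculations in this example can be checked with a short Python sketch. The ages, genders and p-hat values are those listed on the slides; the dictionary layout and function name are hypothetical. Note that the slide's p-hats sum to 0.95 rather than 1, so the example is illustrative: in a fitted model the weights over a risk set sum to 1.

```python
# Risk set at event time 7 months, as listed on the slides.
# gender is coded 1 = male, 0 = female, matching the slide's arithmetic.
risk_set = [
    {"age": 55, "gender": 0, "p_hat": 0.10},  # the subject who had the event
    {"age": 45, "gender": 1, "p_hat": 0.05},
    {"age": 67, "gender": 0, "p_hat": 0.30},
    {"age": 58, "gender": 1, "p_hat": 0.20},
    {"age": 70, "gender": 1, "p_hat": 0.30},
]
failed = risk_set[0]


def schoenfeld_residual(covariate, failed, risk_set):
    """Observed covariate of the failing subject minus its p_hat-weighted
    sum over the risk set."""
    expected = sum(person[covariate] * person["p_hat"] for person in risk_set)
    return failed[covariate] - expected


for covariate in ("age", "gender"):
    print(covariate, round(schoenfeld_residual(covariate, failed, risk_set), 2))
```

The unrounded expected age is 60.45, so the age residual is -5.45; the slide rounds the expected age to 60 and reports -5. The gender residual is -0.55, as on the slide.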

Page 26: Cox Regression II

Schoenfeld residuals

• Since the Schoenfeld residuals are, in principle, independent of time, a plot that shows a non-random pattern against time is evidence of violation of the PH assumption.

• Plot Schoenfeld residuals against time to evaluate PH assumption.

• Regress Schoenfeld residuals against time to test for independence between residuals and time.

Page 27: Cox Regression II

Example: no pattern with time

Page 28: Cox Regression II

Example: violation of PH

Page 29: Cox Regression II

Schoenfeld residuals

In SAS: option on the output statement:

Output out=outdata ressch=Covariate1 Covariate2 Covariate3

Page 30: Cox Regression II

Summary of the many ways to evaluate PH assumption…

1. Examine log(-log(S(t))) plots
• PH assumption is supported by parallel lines and refuted by lines that cross or nearly cross
• Must use categorical predictors or categories of a continuous predictor

2. Include interaction with time in the model
• PH assumption is supported by non-significant interaction coefficient and refuted by significant interaction coefficient
• Retaining the interaction term in the model corrects for the violation of PH
• Don’t complicate your model in this way unless it’s absolutely necessary!

3. Plot Schoenfeld residuals
• PH assumption is supported by a random pattern with time and refuted by a non-random pattern

4. Regress Schoenfeld residuals against time to test for independence between residuals and time.
• PH assumption is supported by a non-significant relationship between residuals and time, and refuted by a significant relationship

Page 31: Cox Regression II

4. Repeated events

Death (presumably) can only happen once, but many outcomes could happen twice…

• Fractures
• Heart attacks
• Pregnancy
• Etc…

Page 32: Cox Regression II

Repeated events: Strategy 1

• Run a second Cox regression (among those who had a first event), starting with first event time as the origin.

• Repeat for third, fourth, fifth events, etc.

• Problem: smaller and smaller sample sizes.

Page 33: Cox Regression II

Repeated events: Strategy 2

• Treat each interval as a distinct observation, such that someone who had 3 events, for example, gives 3 observations to the dataset.

• Major problem: dependence among observations from the same individual.

Page 34: Cox Regression II

Repeated events: Strategy 3

Stratify by individual (“fixed effects partial likelihood”)

In PROC PHREG: strata id;

Problems:
• Does not work well with RCT data
• Requires that most individuals have at least 2 events
• Can only estimate coefficients for those covariates that vary across successive spells for each individual; this excludes constant personal characteristics such as age, education, gender, ethnicity, genotype

Page 35: Cox Regression II

5. Competing Risks

Page 36: Cox Regression II

BMT: Related vs. Unrelated Donor

Page 37: Cox Regression II

SAS Output

Patients with related donors survive longer.

Page 38: Cox Regression II

Related/Unrelated Donor is significant.

• Can you say definitively to a patient: “If you find a related donor, you will have longer survival time”?

• What variables could be confounders?

Page 39: Cox Regression II

Survival Analysis categorizes subjects

1. Event of interest was observed
2. Censored
3. Competing risk was observed

Page 40: Cox Regression II

Competing Risk: an event that either precludes the event of interest or alters its probability

| Event of Interest | Competing Risk |
|---|---|
| Death from the disease | Death from other causes |
| Relapse | Non-relapse mortality |
| Relapse | Treatment complications |
| Local progression | Metastasis |

Page 41: Cox Regression II

BMT Example

• Interested in Time to Relapse

• Competing Risks (preclude or alter probability of relapse):
  - Non-relapse mortality
  - Graft-vs-host disease (GVHD)

Page 42: Cox Regression II

Who failed from the event of interest?

1. Event of interest was observed: Yes
2. Censored: Maybe
3. Competing risk was observed: No

Common Pitfall: treating competing risks as censoring

• Treats nos as maybes
• Puts them partially in the numerator of occurrence when they shouldn’t be there
• Thus overestimates risk (underestimates S)

Page 43: Cox Regression II

What to do instead

KM estimate of event free survival (EFS)

Cumulative Incidence Analysis

Page 44: Cox Regression II

Event-Free Survival

• In cancer, often Progression-Free Survival (PFS)
• Treats competing risks as events
• Can use KM
• For each subject, the first event to occur
• “Survival” implies death is considered an event
• BMT: first of relapse, GVHD or death
• Is this of interest? May not be, e.g., local progression and metastasis

Page 45: Cox Regression II

Cumulative Incidence Analysis

• Separates competing risks from event of interest
• If no competing risks, equivalent to KM
• Estimates occurrence probability: F(t) = 1 - S(t)
• Each event goes into one bin (event type)
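The difference between cumulative incidence and the "censor the competing risk" pitfall can be seen on a toy dataset. The sketch below is an illustration under stated assumptions, not the slides' own method: the five observations are hypothetical, the function name is invented, and the cumulative incidence is computed in the usual way as the sum over event times of (overall event-free survival just before t) × (events of interest at t) / (number at risk at t).

```python
def cumulative_incidence(times, causes, cause_of_interest):
    """Return (cumulative incidence, naive 1-KM) at the end of follow-up.

    causes: 0 = censored, 1, 2, ... = event types. The naive estimate treats
    every other event type as censoring.
    """
    data = sorted(zip(times, causes))
    n_at_risk = len(data)
    event_free = 1.0   # overall event-free survival just before the current time
    naive_surv = 1.0   # KM survival when competing events are censored
    cif = 0.0
    i = 0
    while i < len(data):
        t = data[i][0]
        at_t = [c for (tt, c) in data if tt == t]
        d_interest = sum(1 for c in at_t if c == cause_of_interest)
        d_any = sum(1 for c in at_t if c != 0)
        cif += event_free * d_interest / n_at_risk
        naive_surv *= 1 - d_interest / n_at_risk
        event_free *= 1 - d_any / n_at_risk
        n_at_risk -= len(at_t)
        i += len(at_t)
    return cif, 1 - naive_surv


# Hypothetical data: cause 1 = relapse, cause 2 = competing event, 0 = censored.
times = [1, 2, 3, 4, 5]
causes = [2, 1, 0, 1, 2]
print(cumulative_incidence(times, causes, cause_of_interest=1))
```

On these data the cumulative incidence of cause 1 is 0.5, while the naive 1-KM estimate is 0.625, which shows the overestimation described under the common pitfall. The cumulative incidences of the two causes add up to one minus the event-free survival, so each event lands in exactly one bin.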

Page 46: Cox Regression II

BMT Cumulative Incidence Curves

[Figure: cumulative incidence curves for Death, Relapse, and GVHD]

Page 47: Cox Regression II

6. Considerations when analyzing data from an RCT…

Page 48: Cox Regression II

Intention-to-Treat Analysis

Intention-to-treat analysis: compare outcomes according to the groups to which subjects were initially assigned, regardless of which intervention they actually received.

Evaluates treatment effectiveness rather than treatment efficacy

Page 49: Cox Regression II

Why intention to treat?

• Non-intention-to-treat analyses lose the benefits of randomization, as the groups may no longer be balanced with regard to factors that influence the outcome.

• Intention-to-treat analysis simulates “real life,” where patients often don’t adhere perfectly to treatment or may discontinue treatment altogether.

Page 50: Cox Regression II

Drop-ins and Drop-outs: example, WHI

• Both women on placebo and women on active treatment discontinued study medications.

• Women on placebo “dropped in” to treatment because their regular doctors put them on hormones (dogma = “hormones are good”).

• Women on treatment “dropped in” to treatment because their doctors took them off study drugs and put them on hormones to ensure they were on hormones and not placebo.

Page 51: Cox Regression II

Effect of Intention to treat on the statistical analysis

• Intention-to-treat analyses tend to underestimate treatment effects; increased variability due to switching “waters down” results.

Page 52: Cox Regression II

Example

Take the following hypothetical RCT:

• Treated subjects have a 25% chance of dying during the 2-year study vs. placebo subjects have a 50% chance of dying.

• TRUE RR = 25%/50% = .50 (treated have 50% less chance of dying)

• You do a 2-yr RCT of 100 treated and 100 placebo subjects.

• If nobody switched, you would see about 25 deaths in the treated group and about 50 deaths in the placebo group (give or take a few due to random chance).

• Observed RR ≈ .50

Page 53: Cox Regression II

Example, continued

BUT, suppose that early in the study, 25 treated subjects switch to placebo and 25 placebo subjects switch to treatment.

• You would see about 25*.25 + 75*.50 = 43-44 deaths in the placebo group

• And about 25*.50 + 75*.25 = 31 deaths in the treated group

• Observed RR = 31/44 ≈ .70

Diluted effect!
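The dilution arithmetic can be reproduced with a few lines of Python. The risks, arm sizes and number of switchers are those of the hypothetical RCT above; the function name is invented for illustration, and the sketch assumes that switchers take on the full risk of the other arm for the whole study.

```python
def expected_deaths(n_assigned, n_switched, risk_assigned, risk_other):
    """Expected deaths in an arm when `n_switched` of its subjects actually
    experience the other arm's risk."""
    return (n_assigned - n_switched) * risk_assigned + n_switched * risk_other


risk_treated, risk_placebo = 0.25, 0.50
n_per_arm, n_switched = 100, 25

treated_deaths = expected_deaths(n_per_arm, n_switched, risk_treated, risk_placebo)
placebo_deaths = expected_deaths(n_per_arm, n_switched, risk_placebo, risk_treated)
observed_rr = (treated_deaths / n_per_arm) / (placebo_deaths / n_per_arm)

print(treated_deaths, placebo_deaths, round(observed_rr, 2))
```

The unrounded expectations are 31.25 and 43.75 deaths, giving an intention-to-treat RR of about 0.71, compared with the true RR of 0.50. Changing `n_switched` shows how quickly the observed effect moves toward 1 as crossover increases.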

Page 54: Cox Regression II

7. Example analysis: stress fracture study

• Women runners may have reduced levels of estrogen, which puts them at risk of bone loss and stress fractures

• This was a randomized trial of hormones (oral contraceptives) to prevent stress fractures in women runners

• Two groups: treatment and control (no placebo)

Page 55: Cox Regression II

Baseline Description and Comparability of Groups

Baseline descriptors are summarized as:
• means and standard deviations for continuous variables
• frequencies and percentages for categorical variables

How good was the randomization? I.e., are the groups indeed balanced with regard to variables known to be prognostically related to the outcome?

For a cohort study, what factors are related to exposure, and thus might be confounders?

Who is in the population?

Page 56: Cox Regression II

Stress fracture study: Baseline characteristics by randomization assignment

| Characteristic | Control | Treatment |
|---|---|---|
| Age (yrs) | 21.9 | 22.4 |
| Stress fracture (%) | 40.0 | 39.1 |
| Menses in past year | 9.5 | 9.4 |
| No. of lifetime menses | 67.4 | 68.9 |
| Oligo/amenorrhea (%) | 35.8 | 30.0 |
| Amenorrhea (%) | 6.2 | 11.4 |
| Oligomenorrhea (%) | 29.6 | 18.6 |
| Elevated EDI score (%) | 21.0 | 30.0 |
| Whole body BMD (g/cm²) | 1.10 | 1.11 |
| Total hip BMD (g/cm²) | .97 | .99 |
| Spine BMD (g/cm²) | .99 | .98 |
| Total bone mineral content (g) | 2146 | 2179 |
| Height (inches) | 65.2 | 65.4 |
| Weight (lbs) | 128.0 | 128.7 |
| BMI (kg/m²) | 21.2 | 21.1 |
| Percent body fat | 23.3 | 22.7 |
| Calcium per day (mg) | 1412 | 1401 |

Page 57: Cox Regression II

Summary of events

• Might be presented as overall incidence rates.

• If events are heterogeneous (as with stress fractures), tabulate results.

Page 58: Cox Regression II

| Stress Fracture 1 | Diagnostic test | Stress fracture 2 | Study Area |
|---|---|---|---|
| right tibial bone | bone scan | | Boston |
| right tibial bone | x-ray | | Boston |
| right tibial bone | bone scan | | Boston |
| right tibial bone | bone scan | | Boston |
| right tibial bone | bone scan | | Stanford |
| right tibial bone | bone scan | right tibial bone | Michigan |
| left tibial bone | bone scan | right tibial bone | Boston |
| left tibial bone | bone scan | right femur | Michigan |
| left tibial bone | bone scan | | Los Angeles |
| left tibial bone | bone scan | | Michigan |
| right foot | bone scan | left foot | Los Angeles |
| right foot | x-ray | | New York |
| left third metatarsal | x-ray | | Boston |
| right 4th metatarsal | x-ray | | Stanford |
| left cuboid | MRI | | Stanford |
| navicular bone | bone scan | | Stanford |
| upper right femur | MRI | | Stanford |
| right femoral neck | MRI | | Los Angeles |
| Total: 18 | | Total: 4 | |

Page 59: Cox Regression II

Evaluation of primary hypothesis

• Intention-to-treat analysis for RCT

• Primary exposure-event hypothesis for cohort study, adjusted for confounding

Page 60: Cox Regression II

Corresponding Kaplan-Meier curve

[Figure: Kaplan-Meier curves by randomization group. Treatment (n=52): 6 fractures. Control (n=70): 12 fractures.]

Page 61: Cox Regression II

Corresponding HR

| | Hazard Ratio (95% CI) |
|---|---|
| Randomized to treatment | .82 (0.30, 2.27) |

Page 62: Cox Regression II

Secondary analyses

• For RCT: any non-intention to treat analyses

• For RCT and cohort: evaluate other predictors; effect modification; subgroups

Page 63: Cox Regression II

Hazard ratios for treatment variables

| | Hazard Ratio (95% CI) |
|---|---|
| Randomized to treatment | .82 (0.30, 2.27) |
| Randomized to treatment, on-protocol only (n=82) | .63 (0.21, 1.92) |
| Actually took OCs at least 1-month | .41 (0.15, 1.08) |
| Per month on OCs | .92 (0.85, 0.98) |
| Time-dependent treatment variable, when on treatment | .50 (0.18, 1.40) |

**All analyses are stratified on site and menstrual status at baseline (amenorrheic, oligomenorrheic, or eumenorrheic), and adjusted for age and spine Z-score at baseline using Cox Regression.

Page 64: Cox Regression II

<1800 g (n=15)

1800-2199 g (n=55)

≥2200 g (n=52)

Kaplan-Meier estimates of stress fracture-free survivorship by BMC at baseline

Page 65: Cox Regression II

<800 mg/day (n=22)

800-1499 mg/day (n=63)

1500+mg/day (n=36)

Kaplan-Meier estimates of stress fracture-free survivorship by levels of daily calcium intake at baseline

Page 66: Cox Regression II

Previous fracture (n=39)

No previous fracture(n=83)

Kaplan-Meier estimates of stress fracture-free survivorship by previous stress fracture

Page 67: Cox Regression II

Lowest quartile of lean mass

Highest quartile of lean mass

Middle two quartiles

Page 68: Cox Regression II
Page 69: Cox Regression II

Risk Factors

| | Hazard Ratio (95% CI) |
|---|---|
| History of menstrual irregularity prior to baseline | 2.91 (0.81, 10.43) |
| BMC <1800 g | 3.70 (1.31, 10.46) |
| Low calcium (<800 mg/d) | 3.60 (1.12, 11.59) |
| Stress fracture prior to baseline | 5.45 (1.48, 20.08) |
| Fat mass (per kg) | 1.05 (0.91, 1.21) |

**All analyses are stratified on site and menstrual status at baseline, and adjusted for age and spine Z-score at baseline using Cox Regression.

Page 70: Cox Regression II

Other protective factors

| | Hazard Ratio (95% CI) |
|---|---|
| Spine BMD (per 1-standard deviation increase) | .54 (0.30, 0.96) |
| Every 100-mg/d calcium (continuous) | .90 (0.81, 0.99) |
| Lean mass (per kg), time-dependent | .91 (0.81, 1.02) |
| Change in lean mass (per kg) | .83 (0.56, 1.24) |
| Menarche (per 1-year older) | .55 (0.34, 0.90) |

**All analyses are stratified on site and menstrual status at baseline, and adjusted for age and spine Z-score at baseline (except spine Z-score) using Cox Regression.

Page 71: Cox Regression II

References

Paul Allison. Survival Analysis Using SAS. SAS Institute Inc., Cary, NC: 2003.