Lecture 2: Key Functions and Parametric Distributions Survival Function Hazard Function Median Survival Common Parametric Distributions

Upload: lindsay-cook

Post on 28-Dec-2015


Lecture 2: Key Functions and Parametric Distributions

• Survival Function
• Hazard Function
• Median Survival
• Common Parametric Distributions

But First

• Let’s think a little more about censoring and truncation using an example…

• An investigator is interested in determining whether treatment with amoxetine leads to recovery of cognitive function in rats with brain lesions that mimic Parkinson’s disease.

• The outcome of interest is time to complete recovery of cognitive function
– i.e. the time it takes to return to baseline cognitive function after treatment with amoxetine.

Amoxetine and Cognitive Function

• Collect baseline measure of cognitive function
– Time to correctly perform water radial arm maze (WARM) task

• Induce cognitive impairment
– Treat 4-week-old rats with N-(2-chloroethyl)-N-ethyl-bromobenzylamine (DSP-4)
– Causes noradrenergic lesions in the locus coeruleus

• Treat lesioned animals with amoxetine
– Daily dose for 4 weeks (ages 4 to 8 weeks)
– 0, 0.3, 1.0, or 3.0 mg/kg

• Measure cognitive performance post-treatment
– Weekly for 16 weeks (ages 8 to 24 weeks)
– Endpoint: time it takes to return to 100% cognitive function

Describe the type of censoring

• Rat has not achieved complete cognitive recovery at 12 weeks but has by 13 weeks.

• Rat dies at 82 days without having achieved complete cognitive recovery.

• Rat survives to 24 weeks but never achieves complete cognitive recovery.

Describe the type of censoring

• Rat doesn’t develop brain lesions due to misplaced DSP-4 treatment and shows complete cognitive recovery at 8 weeks

• Rat shows complete cognitive recovery at 8 weeks

Time to Event Outcomes

• Modeled using “survival analysis”
• Define X = time to event
– X is a random variable
– Realizations of X are denoted x
– X > 0
• Key characterizing functions
– Survival function
– Hazard rate (or function)
– Probability density function
– Mean residual life


PDF, survival function, hazard rate, and mean residual life

• f(x)

• S(x)


PDF, survival function, hazard rate, and mean residual life

• h(x)

• mrl(x)

Survival Function

• S(x) = the probability of an individual surviving beyond time x, i.e. S(x) = P(X > x)

• Basic properties
– Monotonic non-increasing
– S(0) = 1
– S(∞) = 0*

*Debatable: cure-rate distributions allow a plateau at some other value

Types of time to event data

• Continuous t
– Observe actual time

$S(x) = P(X > x) = \int_x^\infty f(t)\,dt$

• Discrete t
– Interval censoring
– Grouping into intervals

$S(x) = P(X > x) = \sum_{x_j > x} p(x_j)$

where p(x_j) is the probability mass function, P(X = x_j)

Example of Discrete Time to Event

• Discrete Uniform (3 times possible)

pmf: $p(x_j) = P(X = j) = \tfrac{1}{3}, \quad j = 1, 2, 3$

$S(0) = 1, \quad S(1) = \tfrac{2}{3}, \quad S(2) = \tfrac{1}{3}, \quad S(3) = 0$
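The survival values for the discrete uniform example follow directly from S(x) = P(X > x). A minimal R sketch of that calculation (the helper name `surv_discrete` is hypothetical and introduced only for illustration):

```r
# Discrete uniform lifetime on {1, 2, 3}: p(x_j) = P(X = j) = 1/3
xj  <- 1:3
pmf <- rep(1/3, 3)

# S(x) = P(X > x): add up the pmf over all support points x_j > x
# (surv_discrete is a hypothetical helper, not part of any package)
surv_discrete <- function(x, xj, pmf) {
  sapply(x, function(v) sum(pmf[xj > v]))
}

S <- surv_discrete(0:3, xj, pmf)
print(data.frame(x = 0:3, S = S))
```

The result is a step function: S drops by 1/3 at each of the three possible event times.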

Hazard Rate

• A little harder to conceptualize
• Instantaneous failure rate or conditional failure rate

$h(x) = \lim_{\Delta x \to 0} \frac{P(x \le X < x + \Delta x \mid X \ge x)}{\Delta x}$

• Interpretation: h(x)Δx is approximately the probability that a person experiences the event in the interval (x, x+Δx), given survival to time x.

Hazard Rate

• Only constraint: $h(x) \ge 0$
• Relationship between h(x), S(x) and pdf (continuous):

$h(x) = \lim_{\Delta x \to 0} \frac{P(x \le X < x + \Delta x \mid X \ge x)}{\Delta x} = \frac{f(x)}{S(x)} = -\frac{d}{dx} \ln S(x)$
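The relationship h(x) = f(x)/S(x) = -d ln S(x)/dx can be checked numerically. The sketch below is illustrative only: it assumes a Weibull lifetime with S(x) = exp(-λx^α) and arbitrarily chosen α = 1.5, λ = 0.25, and uses base R's `dweibull`/`pweibull`, whose scale parameter corresponds to λ^(-1/α) in this parameterization.

```r
# Illustrative parameters (assumptions, not estimates from any data)
alp <- 1.5
lam <- 0.25
scl <- lam^(-1 / alp)   # base R scale implied by S(x) = exp(-lam * x^alp)

x <- c(0.5, 1, 2, 4)

# Hazard as the ratio of the density to the survival function
f <- dweibull(x, shape = alp, scale = scl)
S <- pweibull(x, shape = alp, scale = scl, lower.tail = FALSE)
h_ratio <- f / S

# Hazard as minus the derivative of log S(x), via a central difference
logS <- function(x) pweibull(x, shape = alp, scale = scl,
                             lower.tail = FALSE, log.p = TRUE)
eps <- 1e-6
h_deriv <- -(logS(x + eps) - logS(x - eps)) / (2 * eps)

# Closed-form hazard for this parameterization
h_closed <- alp * lam * x^(alp - 1)

print(cbind(x, h_ratio, h_deriv, h_closed))
```

All three columns should agree up to numerical error, which is the point of the identity: any one of f, S, or h determines the other two.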

Hazard Function

• Useful for conceptualizing how the chance of an event changes over time
• i.e. consider hazard ‘relative’ over time
• Examples:
– Treatment related mortality
  • Early on, high risk of death
  • Later on, risk of death decreases
– Aging
  • Early on, low risk of death
  • Later on, high risk of death

Shapes of Hazard Functions

• Increasing
– Natural aging and wear
• Decreasing
– Early failures due to device or transplant failures
• Bathtub
– Populations followed from birth
• Hump Shaped
– Initial risk of event, followed by decreasing chance of event

Examples

[Figure: example hazard function shapes. x-axis: Time (0 to 6); y-axis: Hazard Function (0 to 1)]

R Code for Hazard Function Shapes

# Examples of hazard function shapes
# Weibull hazard: h(x) = alp * lam * x^(alp - 1)
weibull.hazard <- function(x, alp, lam) {
  h <- alp * lam * x^(alp - 1)
  return(h)
}
# Log-logistic hazard: h(x) = alp * lam * x^(alp - 1) / (1 + lam * x^alp)
loglogistic.hazard <- function(x, alp, lam) {
  h <- alp * lam * x^(alp - 1) / (1 + lam * x^alp)
  return(h)
}
x <- seq(0, 6, 0.05)
h1 <- weibull.hazard(x, 1.5, 0.25)       # increasing
plot(x, h1, type="l", lwd=2, ylab="Hazard Function", xlab="Time", ylim=c(0,1))
h2 <- loglogistic.hazard(x, 0.5, 0.25)   # decreasing
lines(x, h2, lwd=2, col=2)
h3 <- loglogistic.hazard(x, 2, 1)        # hump-shaped
lines(x, h3, lwd=2, col=3)
h4 <- 0.01 * (x - 3)^4                   # bathtub
lines(x, h4, lwd=2, col=4)

Cumulative Hazard Function

• Often used instead of the hazard function

$H(x) = \int_0^x h(u)\,du$

– Relationship between H(x) and S(x): $S(x) = e^{-H(x)}$, i.e. $H(x) = -\ln S(x)$

• More on this later for model checking…
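As a sanity check on the cumulative hazard, one can integrate a hazard numerically and confirm that exp(-H(x)) reproduces the survival function. This is a sketch under stated assumptions: a Weibull hazard h(x) = αλx^(α-1) with illustrative values α = 1.5 and λ = 0.25, and the function name `weibull_hazard` is chosen here for the example.

```r
# Weibull hazard, h(x) = alp * lam * x^(alp - 1) (illustrative choice of hazard)
weibull_hazard <- function(x, alp, lam) alp * lam * x^(alp - 1)

alp <- 1.5   # assumed shape, for illustration only
lam <- 0.25  # assumed scale, for illustration only
x <- c(0.5, 1, 2, 4)

# H(x) = integral of h(u) from 0 to x, by numerical quadrature
H_num <- sapply(x, function(u) {
  integrate(weibull_hazard, lower = 0, upper = u, alp = alp, lam = lam)$value
})

H_closed <- lam * x^alp          # closed-form cumulative hazard for this hazard
S_from_H <- exp(-H_num)          # survival recovered from the cumulative hazard
S_closed <- exp(-lam * x^alp)    # survival function written directly

print(cbind(x, H_num, H_closed, S_from_H, S_closed))
```

Working on the H(x) scale is convenient because H is unbounded and accumulates additively, which is what makes it useful for graphical model checks.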

What if T is discrete?

• So far we’ve focused on T as a continuous r.v.
• Discrete x
– Interval censoring
– Grouping into intervals
• Depending on level of discreteness, use discrete data approach

$S(x) = P(X > x) = \sum_{x_j > x} p(x_j)$

where p(x_j) is a pmf (P(X = x_j)).

Complications

• How can we use this to define our “discrete” hazard function?

Consider: $h(x_j) = P(X = x_j \mid X \ge x_j) = \frac{P(X = x_j)}{P(X \ge x_j)} = \frac{p(x_j)}{S(x_{j-1})}$, with $S(x_0) = 1$

Note: $p(x_j) = S(x_{j-1}) - S(x_j)$, so $h(x_j) = \frac{S(x_{j-1}) - S(x_j)}{S(x_{j-1})} = 1 - \frac{S(x_j)}{S(x_{j-1})}$

Implying: $S(x) = \prod_{x_j \le x} \frac{S(x_j)}{S(x_{j-1})} = \prod_{x_j \le x} \left[1 - h(x_j)\right]$

And: $H(x) = \sum_{x_j \le x} h(x_j)$, but $S(x) \ne e^{-H(x)}$

So redefine as: $H(x) = -\sum_{x_j \le x} \ln\left[1 - h(x_j)\right]$, so $S(x) = e^{-H(x)}$ holds
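The discrete-hazard relationships can be illustrated with a lifetime that is uniform on {1, 2, 3}. This is a minimal sketch; the variable names are chosen here for the example, and S(x_0) = 1 is used as the starting value.

```r
# Discrete uniform lifetime on {1, 2, 3}
xj  <- 1:3
pmf <- rep(1/3, 3)

# S(x_j) = P(X > x_j) and the lagged values S(x_{j-1}), with S(x_0) = 1
S_at   <- sapply(xj, function(v) sum(pmf[xj > v]))
S_prev <- c(1, head(S_at, -1))

# Discrete hazard: h(x_j) = p(x_j) / S(x_{j-1})
h <- pmf / S_prev

# Product form: S(x_j) = prod over x_i <= x_j of (1 - h(x_i))
S_prod <- cumprod(1 - h)

# Two candidate cumulative hazards
H_sum <- cumsum(h)              # simple sum of hazards
H_log <- -cumsum(log(1 - h))    # redefined version; Inf once h reaches 1

print(cbind(xj, h, S_at, S_prod, exp_neg_H_sum = exp(-H_sum),
            exp_neg_H_log = exp(-H_log)))
```

Only the log-based definition reproduces S(x) through exp(-H(x)); the simple sum does not, although the two are close when every h(x_j) is small.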

Mean Residual Life

• Biomedical applications
– Median is very common
– MRL is not common

• MRL = the expected residual life

$mrl(x) = E[X - x \mid X > x] = \frac{\int_x^\infty (t - x) f(t)\,dt}{S(x)} = \frac{\int_x^\infty S(t)\,dt}{S(x)}$

• Theoretically, could be useful to predict survival times given survival to a certain point in time.
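The integral form of the mean residual life is easy to evaluate numerically. The sketch below assumes an exponential survival function with an arbitrary rate λ = 0.05; the helper name `mrl` is hypothetical. Because mrl(0) is the integral of S(t) over (0, ∞), it also equals the mean.

```r
# mrl(x) = integral_x^Inf S(t) dt / S(x), evaluated by numerical integration
# (mrl is a hypothetical helper defined for this example)
mrl <- function(x, S) integrate(S, lower = x, upper = Inf)$value / S(x)

lam <- 0.05                          # assumed rate, for illustration only
S_exp <- function(t) exp(-lam * t)   # exponential survival function

mrl_vals <- sapply(c(0, 10, 50), function(x) mrl(x, S_exp))
print(mrl_vals)   # each value is close to 1 / lam = 20
```

For the exponential the residual life does not depend on how long the subject has already survived, so all three values coincide.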

Mean

• We do not see the mean quantified very often in biomedical applications

• Why?
– Recall our censoring issue
– Empirical means depend on parametric model
– Means can only be ‘model-based’
– Somewhat counterintuitive, especially when alternatives exist

• More common: median

$E(X) = \int_0^\infty t f(t)\,dt = \int_0^\infty S(t)\,dt$

Median

• Very/Most common way to express the ‘center’ of the distribution
• Rarely see another quantile expressed
• Find x such that $S(x) = 0.5$
• Complication: in some applications, median is not reached empirically
• Reported median based on model seems like an extrapolation
• Often just state ‘median not reached’ and give alternative point estimates

X-Year Survival Rate

• Many applications have ‘landmark’ times that have historically been used to quantify survival
• Examples:
– Breast cancer: 5 year relapse-free survival
– Pancreatic cancer: 6 month survival
– Acute myeloid leukemia (AML): 12 month relapse-free survival
• Solve for S(x) given x

Common Parametric Distributions

• Course will focus on non-parametric and semi-parametric methods
• But… some parametrics can be useful
• Especially for trial design
• Note that power and precision are improved under parametric approaches versus others

Example 1: Exponential

• Recall the exponential distribution
– $f(t) = \lambda e^{-\lambda t}$
– $F(t) = 1 - e^{-\lambda t}$

• What is S(t) based on F(t) and f(t)?
– $S(t) = 1 - F(t) = e^{-\lambda t}$

• λ represents the failure rate per unit of time
– Large λ, rapid decay
– Small λ, slow decay

Example 1: Exponential

[Figure: exponential survival functions for λ = 0.1, λ = 0.05, and λ = 0.01. x-axis: Time (0 to 60); y-axis: Survival Function (0 to 1)]

R Code for the Plot

# Exponential survival curves S(t) = exp(-lambda * t) for three rates
time <- seq(0, 60, 0.1)
S1 <- exp(-0.1*time)
S2 <- exp(-0.05*time)
S3 <- exp(-0.01*time)
plot(time, S1, xlab="Time", ylab="Survival Function", col=3, lwd=2, type="l")
lines(time, S2, col=2, lwd=2)
lines(time, S3, col=4, lwd=2)
# Legend labels with the Greek letter lambda
labs <- c(expression(paste(lambda, " = ", 0.1, sep="")),
          expression(paste(lambda, " = ", 0.05, sep="")),
          expression(paste(lambda, " = ", 0.01, sep="")))
legend(x=45, y=.95, labs, col=c(3,2,4), lty=c(1,1,1), lwd=c(2,2,2), cex=0.9)

Example: Kidney Infection after Catheterization

• Kidney infection after catheter insertion in patients using portable dialysis equipment
• Time to event was time to catheter removal, BUT it should be noted that the catheter can be removed for reasons other than infection (right censored)
• Only 76 observations (!)
• Time to infection is the outcome of interest
• Question: can we describe it using a parametric approach?

Kidney Infection Example: Survival curve and 95% confidence intervals

Exponential

• Overly used due to simplicity
• One parameter
• Recall: $S(t) = e^{-\lambda t}$
• Hazard function: $h(t) = \frac{f(t)}{S(t)} = \lambda$
• Note: constant hazard (huge assumption)

Exponential

• Mean = $\frac{1}{\lambda}$

• Median = $\frac{\ln 2}{\lambda}$

Exponential

• MRL = $\frac{1}{\lambda}$

• “Lack of memory”: $P(T > t + z \mid T > t) = P(T > z)$

• Realistic?

Exponential

• Recall the cumulative hazard function H(t)
• For exponential: $H(t) = \lambda t$
• Plot of ln(H(t)) vs. ln(t) should be a straight line with:
– Slope = ?
– Intercept = ?
• Used for model checking with a non-parametric estimate of H(t)


Does Exponential Fit the Kidney Data?

R Code

### Kidney infection example
library(survival)
surv.kid <- Surv(kidney$time, kidney$status)
fit.kid <- survfit(surv.kid ~ 1)
plot(fit.kid, xlab="Time", ylab="Survival Fraction")
# summarize KM estimator to get median survival
summary(fit.kid)
# define log cumulative hazard and log time
logHt <- log(-log(fit.kid$surv))
logt <- log(fit.kid$time)
# Plot log cumulative hazard vs. log time
plot(logt, logHt, lwd=2, type="l", xlab="log(t)", ylab="log(H(t))")
points(logt, logHt, pch=16)
# Add a reference line with slope 1. If exponential fits, the points should be parallel to it.
# Note intercepts may be different
abline(-4.89, 1, lwd=2, col="red")

Exponential

• Another alternative model check
• Note that H(t) = λt for exponential
• Can simply plot –ln(S(t)) versus t
• Should be a straight line with
– Slope = ?
– Intercept = ?
• Why would the previous be preferred?
• Because it can accommodate Weibull as we will see….


Another Exponential Check

More Model Checking

• We will build the likelihood later
• For now, accept that the MLE of λ is

$\hat{\lambda} = \frac{\sum_i \delta_i}{\sum_i t_i}$

• Where δ_i indicates whether the event is observed or censored for patient i, and t_i is the event or censoring time
• Here: $\hat{\lambda} = \frac{58}{7724} = 0.0075$
• This implies a model such that $S(t) = e^{-0.0075t}$
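The exponential MLE is just the number of observed events divided by the total follow-up time. A minimal sketch: the function name `exp_mle` and the small toy data set are hypothetical and only illustrate the arithmetic, while the last two lines reuse the totals quoted for the kidney data (58 events over 7724 time units).

```r
# MLE of the exponential rate: observed events / total time at risk
# (exp_mle is a hypothetical helper; status = 1 for an event, 0 for censored)
exp_mle <- function(time, status) sum(status) / sum(time)

# Toy data, made up purely to show the calculation
toy_time   <- c(5, 8, 12, 20, 30)
toy_status <- c(1, 0, 1, 1, 0)
toy_rate <- exp_mle(toy_time, toy_status)   # 3 events / 75 time units
print(toy_rate)

# Totals quoted for the kidney example
lam_hat <- 58 / 7724
print(round(lam_hat, 4))
```

Censored subjects contribute to the denominator (time at risk) but not to the numerator, which is how the estimator accounts for right censoring.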


Compare Fitted and Observed S(t)

What about specific survival time? Median survival? Mean survival?

• Empirical:
– 200 day survival = 21.0%
– Median survival = 66 days
– Mean survival = ?

• Exponential Model:
– 200 day survival = S(200) = ?
– Median survival = ?
– Mean survival = ?
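The exponential-model quantities can be worked out from the fitted rate using the standard exponential formulas S(t) = exp(-λt), median = ln(2)/λ, and mean = 1/λ. A short sketch, assuming the rounded estimate λ = 0.0075 quoted for the kidney data:

```r
lam <- 0.0075   # rounded exponential rate quoted for the kidney example

S200     <- exp(-lam * 200)   # model-based 200 day survival
med_surv <- log(2) / lam      # model-based median survival (days)
mean_surv <- 1 / lam          # model-based mean survival (days)

print(round(c(S200 = S200, median = med_surv, mean = mean_surv), 3))
```

Comparing these with the empirical values (21.0% at 200 days, median 66 days) gives an informal sense of how well the one-parameter model describes the data.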

Weibull

• Generalization of the Exponential
• VERY common for survival, but not always perfect
• Shape and Scale parameters: α and λ
• Variable hazard
– Increasing
– Decreasing
– Constant (α = 1)

Weibull: Generalization of Exponential

• Shape Parameter: α
• Scale Parameter: λ

$f(t; \alpha, \lambda) = \alpha \lambda t^{\alpha - 1} e^{-\lambda t^{\alpha}}$

$S(t) = e^{-\lambda t^{\alpha}}$

$h(t) = \alpha \lambda t^{\alpha - 1}$

• Equivalent to the exponential when α = 1
• Note: There are different parameterizations for the Weibull

Weibull Example

[Figure: Weibull survival functions for (λ = 0.05, α = 0.5), (λ = 0.05, α = 1), (λ = 0.01, α = 0.5), and (λ = 0.01, α = 1). x-axis: Time (0 to 60); y-axis: Survival Function (0 to 1)]

R Code for the Weibull Plot

# Weibull survival curves S(t) = exp(-lambda * t^alpha)
time <- seq(0, 60, 0.1)
S1 <- exp(-0.05*time^0.5)
S2 <- exp(-0.05*time^1)
S3 <- exp(-0.01*time^0.5)
S4 <- exp(-0.01*time^1)
plot(time, S1, xlab="Time", ylab="Survival Function", col=2, lwd=2, type="l", ylim=c(0,1))
lines(time, S2, col=1, lwd=2)
lines(time, S3, col=3, lwd=2)
lines(time, S4, col=4, lwd=2)
# Legend labels with the Greek letters lambda and alpha
labs <- c(expression(paste(lambda, " = ", 0.05, ", ", alpha, " = ", 0.5, sep="")),
          expression(paste(lambda, " = ", 0.05, ", ", alpha, " = ", 1, sep="")),
          expression(paste(lambda, " = ", 0.01, ", ", alpha, " = ", 0.5, sep="")),
          expression(paste(lambda, " = ", 0.01, ", ", alpha, " = ", 1, sep="")))
legend(x=0, y=.25, labs, col=c(2,1,3,4), lty=1, lwd=2, cex=0.9)


Effect of Shape Parameter

Weibull

• Mean is ugly (involves the gamma function, see pg. 38)
• Median: $t_{0.5} = \left(\frac{\ln 2}{\lambda}\right)^{1/\alpha}$

• Model checking:
– Can do the same log(-log(S(t))) plot:

log(-log(S(t))) = log(λ) + α log(t)

– Here, linearity is required, but the slope need not equal 1
• More later when we discuss likelihoods
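Both the Weibull median and the log-log model check can be verified numerically. The sketch below assumes the parameterization S(t) = exp(-λt^α) with illustrative values α = 0.5 and λ = 0.05; it solves S(t) = 0.5 with `uniroot` and recovers the slope and intercept of the log(-log S) line with `lm`.

```r
alp <- 0.5    # assumed shape, for illustration only
lam <- 0.05   # assumed scale, for illustration only
S_weib <- function(t) exp(-lam * t^alp)

# Median: closed form versus a numerical root of S(t) - 0.5
med_closed <- (log(2) / lam)^(1 / alp)
med_root <- uniroot(function(t) S_weib(t) - 0.5,
                    interval = c(1e-8, 1e4), tol = 1e-10)$root
print(c(closed = med_closed, root = med_root))

# Model check: log(-log S(t)) is linear in log(t),
# with intercept log(lam) and slope alp
tt <- c(1, 5, 10, 50)
y <- log(-log(S_weib(tt)))
fit <- lm(y ~ log(tt))
print(coef(fit))
```

With real data the left-hand side would come from a non-parametric estimate of S(t), so the points would scatter around a line rather than fall exactly on it.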

Log-normal

• Just like it sounds
• If log(T) ~ normal, then T ~ log-normal
• Two parameters: μ and σ
• Survival function

$S(t) = 1 - \Phi\left(\frac{\ln t - \mu}{\sigma}\right)$

• Median

$t_{0.50} = \exp(\mu)$
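The log-normal survival function and median map directly onto base R's `plnorm` and `qlnorm`. A minimal sketch with arbitrarily chosen μ = 4 and σ = 1.2 (assumptions for illustration); it also evaluates the hazard f(t)/S(t) on a grid to show where it peaks.

```r
mu <- 4        # assumed mean of log(T), for illustration only
sigma <- 1.2   # assumed sd of log(T), for illustration only
tt <- c(10, 50, 100, 500)

# Survival function: formula versus the built-in log-normal cdf
S_formula <- 1 - pnorm((log(tt) - mu) / sigma)
S_builtin <- plnorm(tt, meanlog = mu, sdlog = sigma, lower.tail = FALSE)
print(cbind(tt, S_formula, S_builtin))

# Median: exp(mu)
med <- qlnorm(0.5, meanlog = mu, sdlog = sigma)
print(c(median = med, exp_mu = exp(mu)))

# Hazard h(t) = f(t) / S(t) on a grid, and the time at which it is largest
grid <- seq(1, 2000, by = 1)
h <- dlnorm(grid, meanlog = mu, sdlog = sigma) /
  plnorm(grid, meanlog = mu, sdlog = sigma, lower.tail = FALSE)
peak_time <- grid[which.max(h)]
print(peak_time)
```

The hazard rises to an interior maximum and then declines, which is the hump shape associated with this distribution.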


Log-normal

• Log-normal can work well in medical applications (e.g. age of disease onset)

• Hazard is hump-shaped

• Critics think that decreasing hazard at later times is unrealistic

Log-logistic

• If log(T) follows a logistic distribution, then T ~ log-logistic
• Logistic is similar to normal, but the survival function is easier to work with
• Hazard similar to Weibull, but allows more varied shapes
– Monotone decreasing
– Hump-shaped

Log-logistic

• Survival Function:

$S(t) = \frac{1}{1 + \lambda t^{\alpha}}$

• Hazard function:

$h(t) = \frac{\alpha \lambda t^{\alpha - 1}}{1 + \lambda t^{\alpha}}$

• Median:

$t_{0.50} = \left(\frac{1}{\lambda}\right)^{1/\alpha}$
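The log-logistic formulas can be checked against the logistic cdf in base R (`plogis`), since S(t) = 1/(1 + λt^α) is a logistic function of log(t). This is a sketch with illustrative values α = 2 and λ = 1; the function names `S_ll` and `h_ll` are chosen here for the example.

```r
alp <- 2   # assumed shape, for illustration only
lam <- 1   # assumed scale, for illustration only

S_ll <- function(t) 1 / (1 + lam * t^alp)
h_ll <- function(t) alp * lam * t^(alp - 1) / (1 + lam * t^alp)

tt <- c(0.25, 0.5, 1, 2, 4)

# Same survival function written through the logistic cdf of log(t)
S_via_logistic <- plogis(-(alp * log(tt) + log(lam)))
print(cbind(tt, S = S_ll(tt), S_via_logistic))

# Median: (1 / lam)^(1 / alp)
med <- (1 / lam)^(1 / alp)
print(c(median = med, S_at_median = S_ll(med)))

# With alp > 1 the hazard is hump-shaped; locate its maximum numerically
peak <- optimize(h_ll, interval = c(0, 6), maximum = TRUE)$maximum
print(peak)
```

For α ≤ 1 the same hazard formula is monotone decreasing, which is why this family covers both of the shapes listed for it.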

Gamma

• Generalization of exponential

• Not easy to work with

$f(t) = \frac{\lambda^{k} t^{k-1} e^{-\lambda t}}{\Gamma(k)}, \quad k > 0,\ \lambda > 0$

$I_k(s) = \frac{\int_0^s x^{k-1} e^{-x}\,dx}{\Gamma(k)}$

$S(t) = 1 - I_k(\lambda t)$

$h(t) = \frac{\lambda^{k} t^{k-1} e^{-\lambda t}}{\Gamma(k)\left[1 - I_k(\lambda t)\right]}$
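Although the gamma survival function has no closed form, the regularized incomplete gamma function is available in base R as `pgamma`, so S(t) and h(t) are simple to compute. A minimal sketch, assuming shape k = 2 and rate λ = 0.1 purely for illustration; the wrapper name `I_k` is hypothetical.

```r
k <- 2       # assumed shape, for illustration only
lam <- 0.1   # assumed rate, for illustration only
tt <- c(1, 5, 10, 30)

# Regularized lower incomplete gamma function I_k(s)
# (I_k is a hypothetical wrapper around pgamma with rate 1)
I_k <- function(s, k) pgamma(s, shape = k)

# Survival function: 1 - I_k(lam * t), versus the built-in upper tail
S_gam     <- 1 - I_k(lam * tt, k)
S_builtin <- pgamma(tt, shape = k, rate = lam, lower.tail = FALSE)

# Hazard: density over survival, versus the written-out formula
h_gam     <- dgamma(tt, shape = k, rate = lam) / S_gam
h_formula <- lam^k * tt^(k - 1) * exp(-lam * tt) / (gamma(k) * S_gam)

# With k = 1 the gamma reduces to the exponential, so the hazard is constant
h_k1 <- dgamma(tt, shape = 1, rate = lam) /
  pgamma(tt, shape = 1, rate = lam, lower.tail = FALSE)

print(cbind(tt, S_gam, S_builtin, h_gam, h_formula, h_k1))
```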

Cure Rate Distribution

• Not in K & M
• Assumption: fraction of individuals never fail
• Violates assumption that S(∞) = 0
• Useful for clinical trials in which
– A fraction of the patients are cured
– Event may never occur (e.g. cancer relapse)

Cure Rate Example

• 75% of women with early stage breast cancer are cured by treatment

• Remaining 25% of women relapse
– Assume exponential
– λ = 0.05

Cure Rate Distribution

• Mixture model:

$S(t) = (1 - p)\,S^{*}(t) + p$

• S(t) = survival function for the whole population
• p = fraction of individuals who are cured
• S*(t) = survival function for the individuals who are not cured


Cure Rate: Breast cancer example

R Code

# Cure rate mixture: 75% cured, 25% exponential with rate 0.05
t <- seq(0, 1000, 0.1)
St <- 0.25*exp(-0.05*t) + 0.75
# Two panels: short follow-up (0 to 60) and long follow-up (0 to 1000)
par(mfrow=c(1,2))
plot(t, St, xlim=c(0,60), ylim=c(0,1), type="l", lwd=2,
     xlab="Time(months)", ylab="Survival Fraction")
plot(t, St, xlim=c(0,1000), ylim=c(0,1), type="l", lwd=2,
     xlab="Time(months)", ylab="Survival Fraction")

Competing Risks

• Used to be somewhat ignored
• Not so much anymore
• Idea:
– Each subject can fail due to one of K causes (K > 1)
– Occurrence of one event precludes us from observing the other event
– Usually, quantity of interest is the cause specific hazard
• Overall hazard equals sum of each hazard

$h_T(t) = \sum_{k=1}^{K} h_k(t)$
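The additivity of cause-specific hazards can be illustrated with a small simulation. This is a sketch under strong, clearly artificial assumptions: two causes, constant hazards of 0.02 and 0.05 (made-up values), latent failure times generated independently, and no censoring. Each hazard is estimated as the number of events divided by the total time at risk.

```r
set.seed(1)
n <- 1e5
lam <- c(0.02, 0.05)   # hypothetical constant cause-specific hazards

# Latent failure times for each cause (independence assumed for the simulation)
T1 <- rexp(n, rate = lam[1])
T2 <- rexp(n, rate = lam[2])

# Only the first failure and its cause are observed
time  <- pmin(T1, T2)
cause <- ifelse(T1 < T2, 1, 2)

# Events / total time at risk, by cause and overall
h_k <- c(sum(cause == 1), sum(cause == 2)) / sum(time)
h_T <- n / sum(time)

print(rbind(estimate = c(h_k, total = h_T), truth = c(lam, sum(lam))))
```

The cause-specific estimates add up exactly to the overall estimate because every observed failure is attributed to exactly one cause.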

Example

• An investigator is looking at graft rejection in kidney transplant patients

• However… patients can also experience graft failure and death

• Treat graft failure and death as censored observations

• Why is this a problem?


Assumptions

• Dependence structure between the ‘potential’ failure times

• Identifiability dilemma: Can only observe one time per person so not testable

• We can not distinguish between independent and dependent competing risks

Useful Approaches

• Want to account for other causes
– Adjust the denominator

• Compare rates of events
– Use measures of probabilities

• Crude: probability of event k allowing for all other risks
• Net: probability of event k if it is the ONLY risk
• Partial: probability of event k if only a subset of the risks is acting in the population

• See K & M for more details