duration models: parametric models - loginpsfaculty.ucdavis.edu/bsjjones/slide3_parm.pdf ·...


Bradford S. Jones, UC-Davis, Dept. of Political Science

Duration Models: Parametric Models

Brad Jones

Department of Political Science, University of California, Davis

January 28, 2011

Jones POL 290G


Parametric Survival Models


Parametric Models

I Some Motivation for Parametrics

I Consider the hazard rate:

$$\frac{dh(t)}{dt} > 0$$

Hazard increasing wrt time.

$$\frac{dh(t)}{dt} < 0$$

Hazard decreasing wrt time.

$$\frac{dh(t)}{dt} = 0$$

Hazard “flat” wrt time.


Parametric Models

I Parametric models give structure (shape) to the hazard function.

I N.B.: the structure is a function of the c.d.f., not necessarily of the “real world.”

I . . . though some c.d.f.s do a good job of approximating some failure-time processes.

I Any c.d.f. with positive support on the real number line will work.

I Lots of choices: exponential, Weibull, gamma, Gompertz, log-normal, log-logistic . . . etc.


Parametric Models

I For parametrics, we work with standard likelihood methods.

I Specify a distribution function and write out the log-likelihood for the data.

I The question is, which distribution function?

I In all software programs/computing environments, you're given a menu.

I Stata: streg; R: survreg, eha


Parametric Models

I “Advantages” of parametric models?

I If S(t) is known to follow, or closely approximate, a known distribution, then estimates will be consistent with the theoretical survivor function.

I Unlike K-M or Cox (discussed later), the hazard may be used for forecasting (under K-M or Cox, the hazard is only defined up until the last observed failure).

I Will return smooth functions of h(t) or S(t).


Parametric Models

I As noted, there are a wide variety of choices.

I I sometimes refer to these choices as “plug and play” estimators.

I Why? Consider the survivor function:

$$S(t) = \Pr(T > t) = \int_t^{\infty} f(u)\,du = 1 - \int_0^t f(u)\,du = 1 - F(t) \qquad (1)$$

I If we know this function follows some distribution, then we write a likelihood function in terms of this distribution . . .

I If it follows a different distribution, just replace the previous likelihood with another pdf.


Parametric Models

I Most texts, including ours, typically begin with the exponential distribution.

I The reason is easy: it’s an easy distribution to work with and visualize.

I It also may be unrealistic in many settings.

I The basic feature: the hazard rate is flat wrt time.

I That is:

$$h(t) = \lambda \qquad (2)$$


Parametric Models

I Recall from the first week:

$$S(t) = \exp\{-H(t)\} \qquad (3)$$

where

$$H(t) = \int_0^t h(u)\,du$$

I Substituting h(u) = λ into (3),

$$S(t) = \exp\left\{-\int_0^t \lambda\,du\right\}$$

and so

$$S(t) = \exp(-\lambda t)$$

I This is the survivor function for the exponential distribution.
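The substitution above can be checked numerically. The following is a minimal sketch, not part of the original slides: it integrates a flat hazard with a simple midpoint rule and compares the result with exp(−λt). The rate `lam = 0.1` and the function names are hypothetical, chosen only for illustration.

```python
import math

def cumulative_hazard(hazard, t, steps=10_000):
    """Approximate H(t), the integral of h(u) over [0, t], with the midpoint rule."""
    width = t / steps
    return sum(hazard((i + 0.5) * width) for i in range(steps)) * width

def survivor(hazard, t):
    """S(t) = exp{-H(t)}, equation (3)."""
    return math.exp(-cumulative_hazard(hazard, t))

lam = 0.1  # hypothetical constant hazard rate

def flat_hazard(u):
    """Exponential model: the hazard does not depend on time."""
    return lam

for t in (1.0, 5.0, 20.0):
    numeric = survivor(flat_hazard, t)
    closed_form = math.exp(-lam * t)
    print(f"t={t:5.1f}  numeric S(t)={numeric:.6f}  exp(-lambda*t)={closed_form:.6f}")
```

Passing a different hazard function to `survivor` gives the survivor function for other distributions, which is the “plug and play” idea in miniature.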


Parametric Models

I Since we know f(t) = h(t)S(t) then

$$f(t) = \lambda \exp(-\lambda t)$$

I This is the pdf of a random variable T that is exponentially distributed.

I Note how the unconditional probability of failure, f(t), handles censored cases.

I Consider the hazard function:


Parametric Models

I What is λ?

I Or put differently, where are the predictor variables?

I Typically λ will be parameterized in terms of regression coefficients and covariates, X.

I A model:

$$h(t) = \lambda = \exp(\beta_0 + \beta_1 T)$$

I Suppose T is a treatment indicator and we’re interested in the hazard of failure for the treated and untreated.


Parametric Models

I Two hazards:

$$h(t \mid T=1) = \exp(\beta_0 + \beta_1)$$

$$h(t \mid T=0) = \exp(\beta_0)$$

I If we plotted the hazards, we would have two parallel lines separated by a factor of exp(β1).

I Or analogously, if we want to compare hazards:

$$\frac{h(t \mid T=1)}{h(t \mid T=0)} = \frac{\exp(\beta_0 + \beta_1)}{\exp(\beta_0)}$$

I This expression simplifies to exp(β1).


Parametric Models

I In words (sort of)... the ratio of the treated hazard to the untreated hazard simplifies to exp(β1).

I So all we need to know to know the differences in the hazards is the coefficient for the treated.

I This is an important result because it shows the hazards are proportional hazards.

I Some simulated data.

I h(t) = exp(−4.59 + .96 Z)

I Let Z denote whether or not a subject was exposed to some condition.


Parametric Models

I Since β1 is positive, this implies exposure increases the risk.

I The hazard is higher for the exposed than for the unexposed.

I A treatment estimate of .96 implies a hazard ratio of exp(.96) ≈ 2.6.

I Risk for the exposed is about 2.6 times the risk for the unexposed.

I Consider the hazards:
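As a minimal sketch, the two hazard levels implied by the simulated-data estimates can be computed directly. This assumes the estimates are on the log-hazard scale, i.e. h(t) = exp(−4.59 + .96 Z), which is what the exp(.96) interpretation implies; the variable names are illustrative.

```python
import math

beta0, beta1 = -4.59, 0.96  # estimates quoted in the slides

hazard_unexposed = math.exp(beta0)          # Z = 0
hazard_exposed = math.exp(beta0 + beta1)    # Z = 1
hazard_ratio = hazard_exposed / hazard_unexposed

print(f"h(t | Z=0) = {hazard_unexposed:.4f}")
print(f"h(t | Z=1) = {hazard_exposed:.4f}")
print(f"ratio      = {hazard_ratio:.2f}")  # equals exp(beta1)
```

Because neither hazard depends on t, both are flat lines, and their ratio is the same at every point in time.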


Parametric Models

I PH property is important to understand.

I By way of analogy, think about what odds ratios are in a logit-type setting or recall the ordered logit model: the ORs are invariant to the scale scores.

I The proportional difference in hazards is invariant to time.

I So under the exponential we are making two assumptions:

1. The hazards are flat wrt time.

2. The difference in hazards across levels of a covariate is a fixed proportion.

I Which is the stronger assumption?


Parametric Models

I Note that even with the PH assumption, we are not saying (in general) the hazards are invariant to time (though in the exponential case, we are).

I The hazards may change but the proportional difference between (say) two groups does not change.

I That’s the basic result of proportionality.

I Suppose it does not hold. Then what?

I Consider another model that relaxes the assumption of flat hazards (but not the PH assumption).


Parametric Models: Weibull

I A more flexible distribution function is given by the Weibull.

I Named for Waloddi Weibull, who derived it (1939, 1951)

I Why more general than the exponential?

I It is a two-parameter distribution:

$$h(t) = \lambda p t^{p-1} \qquad (4)$$

where λ is a positive scale parameter and p is a shape parameter.

I Note:

p > 1, the hazard rate is monotonically increasing with time.

p < 1, the hazard rate is monotonically decreasing with time.

p = 1, the hazard is flat.


Parametric Models: Weibull

I Thus if p = 1 then

$$h(t) = \lambda \cdot 1 \cdot t^{1-1} = \lambda. \qquad (5)$$

I Thus demonstrating that the exponential model is nested within the Weibull.

I For this reason (and for many other reasons), the Weibull is the most commonly applied parametric model in survival analysis.

I As with the exponential, the scale parameter λ is usually expressed in terms of covariates, exp(β_k x_i).

I Hazard functions plotted for different p:
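A minimal sketch of how the shape parameter governs the Weibull hazard, evaluating equation (4) at a few time points for p below, equal to, and above 1. The scale `lam = 0.05` and the time grid are hypothetical values chosen for illustration, not taken from the slides.

```python
def weibull_hazard(t, lam, p):
    """Weibull hazard h(t) = lam * p * t**(p - 1), equation (4)."""
    return lam * p * t ** (p - 1)

lam = 0.05                      # hypothetical scale parameter
times = [1.0, 2.0, 5.0, 10.0]   # hypothetical evaluation times

for p in (0.5, 1.0, 1.5):
    values = [weibull_hazard(t, lam, p) for t in times]
    print(f"p={p}: " + "  ".join(f"{v:.4f}" for v in values))
```

Reading across each printed row shows the decreasing (p < 1), flat (p = 1), and increasing (p > 1) patterns.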


Parametric Models: Weibull

I Using the connection between S(t) and the cumulative hazard (see eq. [3]), the Weibull survivor function is given by

$$S(t) = \exp\left\{-\int_0^t \lambda p u^{p-1}\,du\right\} = \exp(-\lambda t^p).$$

I And since the pdf is h(t)S(t), the density for a random variable T distributed as a Weibull is

$$f(t) = \lambda p t^{p-1} \exp(-\lambda t^p).$$

I Suppose we estimate a Weibull hazard using the data from before.


Parametric Models: Weibull

I Note:

$$\frac{h(t \mid E)}{h(t \mid NE)} = \frac{\exp(\beta_0 + \beta_1)\,p\,t^{p-1}}{\exp(\beta_0)\,p\,t^{p-1}} = \exp(\beta_1)$$

I In other words, the Weibull model is a proportional hazards model.

I So unlike the exponential, the hazards can change wrt time but like the exponential, the ratio of the hazards is a constant.

I They are offset by a proportionality factor of exp(β1).


Parametric Models: Weibull

I The Weibull (and therefore the exponential) is an interesting model.

I They are both proportional hazards models as well as accelerated failure time models.

I In other words, one can estimate the model in terms of the hazards or in terms of the survival times and reproduce equivalent results from different parameterizations.

I Under the PH model, the covariates have a multiplicative effect with respect to the baseline hazard function (see previous slide).

I Under the AFT, the covariates are multiplicative wrt the survival time.


Parametric Models: Weibull

I Proportional Hazards:

$$h(t \mid x) = h_0(t)\exp(\beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_j x_j)$$

I Accelerated Failure Time:

$$\log(T) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_j x_j + \sigma\varepsilon$$

where ε is a stochastic disturbance term with type-1 extreme-value distribution scaled by σ.

I Note: σ = 1/p.

I Extreme-value has a close connection to Weibull: the distribution of the log of a Weibull distributed random variable yields a type-1 extreme value distribution.
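The link between the two formulations can be sketched in a few lines. This derivation is an added illustration, not taken from the slides; the superscripts PH and AFT are labels introduced here to keep the two coefficient vectors apart, and xβ is taken to include the intercept.

```latex
\begin{align*}
S(t \mid x) &= \exp(-\lambda t^{p}), \qquad \lambda = \exp(x\beta^{PH}) \\
W \equiv \lambda T^{p} &\sim \text{Exponential}(1)
  \quad\text{since } \Pr(W > w) = \Pr\!\big(T > (w/\lambda)^{1/p}\big) = e^{-w} \\
\log T &= -\tfrac{1}{p}\log\lambda + \tfrac{1}{p}\log W
        = x\!\left(-\tfrac{\beta^{PH}}{p}\right) + \sigma\varepsilon,
  \qquad \sigma = \tfrac{1}{p},\ \varepsilon = \log W \\
\beta^{AFT} &= -\beta^{PH}/p
\end{align*}
```

The log of a unit-exponential variable has the type-1 extreme-value (minimum) distribution, which is where the disturbance term in the AFT form comes from. With p = 1 the relation reduces to a pure sign change, as in the exponential case.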


Parametric Models: Weibull

I In the AFT formulation, the coefficients are sometimes referred to as “acceleration” factors.

I They give information about how the survival times are differentially accelerated for different levels of a covariate.

I Suppose we estimate a treatment effect for two groups: D and H.

I Imagine the estimated treatment effect yields a coefficient of “7.”

I That is, group H is estimated to survive 7 times longer than group D.

I $S_D(t) = S_H(7t)$

I If D are dogs and H are humans, the acceleration factor suggests human lifespans are “stretched out” 7 times longer than dogs. (Example from K and K, p. 266.)


Parametric Models: Weibull

I Important to be aware of what your software is doing!

I The PH coefficients inform us about the hazard (i.e. risk).

I The AFT coefficients inform us about survival.

I Therefore, the coefficients will be signed differently.


Parametric Models: Weibull

I Weibull hazard is monotonic.

I Log-logistic and log-normal allow for nonmonotonic hazards.

I Both estimated only as AFT models:

log(T ) = βx + σε.

I The AFT for each of these models has two parameters.


Parametric Models: Log-Logistic

I The log-logistic is one choice for non-monotonic hazards:

$$h(t) = \frac{\lambda p t^{p-1}}{1 + \lambda t^p}$$

I h(t) increases and then decreases if p > 1; monotonically decreasing when p ≤ 1.
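A minimal sketch of the non-monotonic shape, evaluating the log-logistic hazard for one p below 1 and one above 1. The scale `lam = 0.05` and the time grid are hypothetical. The peak location for p > 1 is an added step, obtained by setting the derivative of log h(t) to zero, which gives λt^p = p − 1.

```python
def loglogistic_hazard(t, lam, p):
    """Log-logistic hazard h(t) = lam * p * t**(p - 1) / (1 + lam * t**p)."""
    return lam * p * t ** (p - 1) / (1 + lam * t ** p)

lam = 0.05  # hypothetical scale parameter
times = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 40.0]  # hypothetical evaluation times

for p in (0.8, 2.0):
    values = [loglogistic_hazard(t, lam, p) for t in times]
    print(f"p={p}: " + "  ".join(f"{v:.4f}" for v in values))

# For p > 1 the hazard has a single peak where lam * t**p = p - 1.
p = 2.0
t_peak = ((p - 1) / lam) ** (1 / p)
print(f"peak for p={p}: t* = {t_peak:.3f}")
```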


Parametric Models: Log-Logistic

I Again, λ gives information on the covariates (i.e. here is where the regression coefficients are).

I While the log-logistic is not a PH model, it is a proportional odds model.

I Recall what this is from your previous course on MLE.

I Survivor function:

$$S(t) = \frac{1}{1 + \lambda t^p}, \qquad 1 - S(t) = \frac{\lambda t^p}{1 + \lambda t^p}$$

I Substitute exp(β) in for λ and you can see the connection back to the logistic cdf.


Parametric Models: Log-Logistic

I The odds of failure:

$$\frac{1 - S(t)}{S(t)} = \frac{\lambda t^p / (1 + \lambda t^p)}{1 / (1 + \lambda t^p)} = \lambda t^p$$

I In terms of parameters, exponentiating β will give the acceleration factor.

I Interpretation is really quite similar to a logit model (but it is not exactly the same!).

I Other models?


Parametric Models: Estimation

I The previous models can be estimated through MLE.

I Imagine n observations upon which t1, t2, . . . tn duration times are measured.

I Assume conditional independence of ti (may be a herculean assumption; more later).

I Specify a PDF (or CDF); if f(t) is derived, S(t) easily follows.

I Write out the likelihood function and maximize (standard algorithm is Newton-Raphson).


Parametric Models: Estimation

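I Generic Likelihood:

$$L = \prod_{i=1}^{n} \{f(t_i)\}^{\delta_i} \{S(t_i)\}^{1-\delta_i}$$

where δ_i is the censoring (failure) indicator: δ_i = 1 if the failure is observed and δ_i = 0 if the observation is censored.

I Example: Weibull

$$f(t) = \lambda p t^{p-1} \exp(-\lambda t^p)$$

I Survivor function

$$S(t) = \exp(-\lambda t^p)$$

I The likelihood of the t duration times:

$$L = \prod_{i=1}^{n} \{\lambda p t_i^{p-1} \exp(-\lambda t_i^p)\}^{\delta_i} \{\exp(-\lambda t_i^p)\}^{1-\delta_i}$$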
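The Weibull likelihood above translates almost line for line into code. The following is a minimal sketch on the log scale, without covariates; the six durations and censoring flags are made-up illustration data, and the function name is hypothetical. For p = 1 the maximizer has a closed form (number of failures divided by total time at risk), which gives a quick check that the function behaves as expected.

```python
import math

def weibull_loglik(times, failed, lam, p):
    """Log of L = prod f(t_i)^d_i * S(t_i)^(1 - d_i) for the Weibull.

    failed[i] is 1 if the failure was observed and 0 if the spell is censored.
    """
    total = 0.0
    for t, d in zip(times, failed):
        log_density = math.log(lam) + math.log(p) + (p - 1) * math.log(t) - lam * t ** p
        log_survivor = -lam * t ** p
        total += d * log_density + (1 - d) * log_survivor
    return total

# Hypothetical durations and censoring indicators, for illustration only.
times = [2.0, 5.0, 7.5, 11.0, 14.0, 20.0]
failed = [1, 1, 0, 1, 0, 1]

# With p = 1 (exponential), the MLE is failures / total time at risk.
lam_hat = sum(failed) / sum(times)
print(f"exponential MLE of lambda: {lam_hat:.4f}")
print(f"log-likelihood at the MLE: {weibull_loglik(times, failed, lam_hat, 1.0):.4f}")
print(f"log-likelihood nearby:     {weibull_loglik(times, failed, 1.2 * lam_hat, 1.0):.4f}")
```

Censored spells contribute only through S(t), so they add information about survival up to the censoring point without being treated as failures. In practice the function would be handed to a numerical optimizer rather than evaluated by hand.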


Getting Our Hands Dirty

I The only way to learn is to do.

I Useful to consider estimation and interpretation of some parametric models.

I Examples are based on cabinet duration data and most of the code is in Stata.

I Stata do file is accessible on SmartSite and website.


Exponential

I Cabinet duration as a function of post-election negotiations indicator and formation attempts.

Table: Estimation results: PH Exponential

Variable     Coefficient   (Std. Err.)
format           0.146       (0.039)
postelec        -1.036       (0.124)
Intercept       -2.762       (0.106)


Exponential

I Coefficients are in PH scale so a positively signed coefficient implies the hazard is increasing as a function of x.

I Post-election negotiations lower the hazard; an increased number of formation attempts increases the hazard.

I Graphical display of two covariate profiles.


Exponential

I Turn attention to the Stata examples (we will do this in class).

I Consider the AFT model.

I Recall the AFT model:

$$\log(T) = \beta_k x_i + \sigma\varepsilon$$

I If ε is type-1 extreme value (aka Gumbel) then the Weibull is obtained. If σ = p = 1 then the exponential is obtained.

I The exponentiated coefficients act as multipliers of the survival time.


Exponential

Table: Estimation results: AFT Exponential

Variable     Coefficient   (Std. Err.)
format          -0.146       (0.039)
postelec         1.036       (0.124)
Intercept        2.762       (0.106)


Exponential

I Contrast the PH and AFT models.

I Under the exponential, the signs shift but the coefficients are unchanged in value.

I Sign shift makes sense: AFT formulation tells us about survivorship.

I AFT hazard: $h_0(t)\exp(-x\beta) = \exp\{-(\beta_0 + x\beta_k)\}$

I Solving for t: $t = [-\log S(t)] \times \exp(\beta_0 + \beta_1 \times \text{postelec})$

I If S(t) = .5, we solve for the median survival time.

I Turn back to the Stata examples.
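The median calculation can be sketched with the AFT exponential estimates reported in the table above. This is an illustration rather than the course's Stata code: it also includes the `format` term in the linear predictor, and it fixes the number of formation attempts at 1, a hypothetical value chosen only so the example runs.

```python
import math

# AFT exponential estimates from the table above.
b0, b_format, b_postelec = 2.762, -0.146, 1.036

def survival_time(q, format_attempts, postelec):
    """Time t at which S(t) = q: t = [-log q] * exp(x * beta)."""
    linear_predictor = b0 + b_format * format_attempts + b_postelec * postelec
    return -math.log(q) * math.exp(linear_predictor)

format_attempts = 1  # hypothetical covariate value, chosen for illustration
median_no_postelec = survival_time(0.5, format_attempts, 0)
median_postelec = survival_time(0.5, format_attempts, 1)

print(f"median, postelec=0: {median_no_postelec:.2f}")
print(f"median, postelec=1: {median_postelec:.2f}")
print(f"ratio: {median_postelec / median_no_postelec:.3f}")  # equals exp(1.036)
```

The ratio of the two medians does not depend on q or on the value chosen for `format_attempts`, which is the AFT counterpart of proportionality.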


Exponential

I From the application, note the equivalency of the two models.

I Note also that the ratio of two survival times for two covariate profiles (i.e. X = 1 vs. X = 0) will be constant and proportional wrt S(t).

I Hence either parameterization exhibits proportionality.

I Weibull example.


Weibull

I Under the exponential the hazard is flat.

I Under the Weibull:

$$h(t) = \lambda p t^{p-1} \qquad (6)$$

λ is a positive scale parameter; p is the shape parameter.

I p > 1, the hazard rate is monotonically increasing with time.

I p < 1, the hazard rate is monotonically decreasing with time.

I p = 1, the hazard is flat, i.e. exponential.

I Note that λ corresponds to covariates: exp(β_k x_i)


Weibull hazards

I Consider application again.

Table: Estimation results: PH Weibull

Variable     Coefficient   (Std. Err.)

Equation 1: t
format           0.156       (0.039)
postelec        -1.109       (0.129)
Intercept       -3.094       (0.199)

Equation 2: ln p
Intercept        0.106       (0.050)


Weibull

I Coefficients are interpreted as before though now we have an additional parameter.

I ln p = 0.106, so p = exp(0.106) ≈ 1.11 > 1, implying rising hazards for this model.

I Consider the hazard rates for two covariate profiles.


Weibull

I Observations?

I Note the shape is governed by p . . .

I But the two hazards are proportional.

I Looks may be deceiving; perhaps you think the lines show nonproportionality.

I Back to the application.


Weibull

I Consider the AFT formulation

Table: Estimation results: AFT Weibull

Variable     Coefficient   (Std. Err.)

Equation 1: t
format          -0.140       (0.035)
postelec         0.998       (0.113)
Intercept        2.784       (0.096)

Equation 2: ln p
Intercept        0.106       (0.050)
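The two Weibull tables are the same model in different metrics. As a minimal sketch, the PH coefficients can be recovered from the AFT ones using the standard Weibull relation β(PH) = −p × β(AFT), with p taken from the reported ln p. The relation is stated here as an added step; the dictionary names are illustrative, and small discrepancies reflect rounding in the reported tables.

```python
import math

p = math.exp(0.106)  # shape parameter recovered from the reported ln p

aft = {"format": -0.140, "postelec": 0.998, "Intercept": 2.784}
ph_reported = {"format": 0.156, "postelec": -1.109, "Intercept": -3.094}

for name, b_aft in aft.items():
    b_ph = -p * b_aft  # Weibull relation: beta_PH = -p * beta_AFT
    print(f"{name:10s} implied PH = {b_ph:7.3f}   reported PH = {ph_reported[name]:7.3f}")
```

When p = 1 the conversion is a pure sign flip, which is why the exponential PH and AFT tables differ only in sign.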


Weibull

I Similar interpretation is afforded this model as was the case with the exponential AFT.

I Note under the AFT: $S(t) = \exp(-\lambda t^p)$

I Therefore, $t = [-\log S(t)]^{1/p} \times \frac{1}{\lambda^{1/p}}$.

I Expressing $\frac{1}{\lambda^{1/p}}$ in terms of the model parameters, we obtain

$$t = [-\log S(t)]^{1/p} \times \exp(\beta_0 + \beta_k x)$$

I As with the exponential, let q denote some value of S(t); then we can solve for the time t at which S(t) = q:

$$t = [-\log q]^{1/p} \times \exp(\beta_0 + \beta_k x)$$

I So for the median, q = .5.

I Go to example.
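A minimal sketch of the quantile formula, using the AFT Weibull estimates from the table (format −0.140, postelec 0.998, intercept 2.784, ln p = 0.106). The covariate value `format_attempts = 1` is hypothetical, and the survivor function is included only to confirm that the computed time really satisfies S(t) = q.

```python
import math

p = math.exp(0.106)  # shape parameter from the reported ln p
b0, b_format, b_postelec = 2.784, -0.140, 0.998  # AFT Weibull estimates

def linear_predictor(format_attempts, postelec):
    return b0 + b_format * format_attempts + b_postelec * postelec

def weibull_quantile_time(q, format_attempts, postelec):
    """Time t at which S(t) = q: t = [-log q]^(1/p) * exp(x * beta)."""
    return (-math.log(q)) ** (1 / p) * math.exp(linear_predictor(format_attempts, postelec))

def weibull_survivor(t, format_attempts, postelec):
    """S(t) = exp(-lam * t^p), with lam^(-1/p) = exp(x * beta)."""
    lam = math.exp(-p * linear_predictor(format_attempts, postelec))
    return math.exp(-lam * t ** p)

format_attempts = 1  # hypothetical covariate value
for postelec in (0, 1):
    median = weibull_quantile_time(0.5, format_attempts, postelec)
    check = weibull_survivor(median, format_attempts, postelec)
    print(f"postelec={postelec}: median = {median:.2f}, S(median) = {check:.3f}")
```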


Log-Logistic

I Consider now the log-logistic.

I The log-logistic is only an AFT model.

Table: Estimation results: AFT Log-Logistic

Variable     Coefficient   (Std. Err.)

Equation 1: t
format          -0.200       (0.050)
postelec         0.995       (0.130)
Intercept        2.474       (0.126)

Equation 2: ln gam
Intercept       -0.419       (0.051)


Log-Logistic

I Consider now the log-logistic.

I The log-logistic is only an AFT model.

I Note that Stata reports γ as the shape parameter.

I This is the inverse of p.

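I Consider the survivor function: $S(t) = \frac{1}{1 + \lambda t^p} = \frac{1}{1 + (\lambda^{1/p} t)^p}$

I Suppose we solve for t: $t = \left[\frac{1}{S(t)} - 1\right]^{1/p} \times \frac{1}{\lambda^{1/p}}$

I Expressing the second term in terms of covariates, we obtain: $t = \left[\frac{1}{S(t)} - 1\right]^{1/p} \times \exp(\beta_0 + \beta_k x)$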

I To the example.
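A minimal sketch of the log-logistic quantile formula, using the AFT estimates from the log-logistic table (format −0.200, postelec 0.995, intercept 2.474, ln gam = −0.419) and converting Stata's γ to the slides' p = 1/γ. The covariate value `format_attempts = 1` is hypothetical. Note that at S(t) = .5 the bracketed term equals 1, so the median is simply exp(xβ).

```python
import math

gamma = math.exp(-0.419)   # Stata's reported shape, from ln gam
p = 1 / gamma              # the slides' p is the inverse of gamma
b0, b_format, b_postelec = 2.474, -0.200, 0.995  # AFT log-logistic estimates

def loglogistic_quantile_time(q, format_attempts, postelec):
    """Time t at which S(t) = q: t = [1/q - 1]^(1/p) * exp(x * beta)."""
    xb = b0 + b_format * format_attempts + b_postelec * postelec
    return (1 / q - 1) ** (1 / p) * math.exp(xb)

format_attempts = 1  # hypothetical covariate value
for postelec in (0, 1):
    median = loglogistic_quantile_time(0.5, format_attempts, postelec)
    lower_quartile = loglogistic_quantile_time(0.75, format_attempts, postelec)
    print(f"postelec={postelec}: t at S=.75 is {lower_quartile:.2f}, median is {median:.2f}")
```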


Log-Logistic

I Because the log-logistic is AFT and proportional odds, this ratio should be equivalent to the acceleration factor (i.e. the odds ratio exp(β1)).

I So this too is a proportional model . . . in the odds ratios.

I This assumption may not hold.


Many Applications

I These are “plug and play” estimators.

I They are easy to do.

I Let’s run through some illustrations, first in Stata and then in R.

I I use the cabinet duration data.


Weibull

. streg invest polar numst format postelec caretakr, dist(weib) time nolog

failure _d: censor

analysis time _t: durat

Weibull regression -- accelerated failure-time form

No. of subjects = 314 Number of obs = 314

No. of failures = 271

Time at risk = 5789.5

LR chi2(6) = 171.94

Log likelihood = -414.07496 Prob > chi2 = 0.0000

------------------------------------------------------------------------------

_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]

-------------+----------------------------------------------------------------

invest | -.2958188 .1059024 -2.79 0.005 -.5033838 -.0882538

polar | -.017943 .0042784 -4.19 0.000 -.0263285 -.0095575

numst | .4648894 .1005815 4.62 0.000 .2677533 .6620255

format | -.1023747 .0335853 -3.05 0.002 -.1682006 -.0365487

postelec | .6796125 .104382 6.51 0.000 .4750276 .8841974

caretakr | -1.33401 .2017528 -6.61 0.000 -1.729438 -.9385818

_cons | 2.985428 .1281146 23.30 0.000 2.734328 3.236528

-------------+----------------------------------------------------------------

/ln_p | .257624 .0500578 5.15 0.000 .1595126 .3557353

-------------+----------------------------------------------------------------

p | 1.293852 .0647673 1.172939 1.42723

1/p | .7728858 .0386889 .700658 .8525593


Exponential

. streg invest polar numst format postelec caretakr, dist(exp) time nolog

failure _d: censor

analysis time _t: durat

Exponential regression -- accelerated failure-time form

No. of subjects = 314 Number of obs = 314

No. of failures = 271

Time at risk = 5789.5

LR chi2(6) = 148.53

Log likelihood = -425.90641 Prob > chi2 = 0.0000

------------------------------------------------------------------------------

_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]

-------------+----------------------------------------------------------------

invest | -.3322088 .1376729 -2.41 0.016 -.6020426 -.0623749

polar | -.0193017 .0055465 -3.48 0.001 -.0301725 -.0084308

numst | .515435 .1291486 3.99 0.000 .2623084 .7685616

format | -.1079432 .0435233 -2.48 0.013 -.1932474 -.022639

postelec | .7403427 .134558 5.50 0.000 .4766138 1.004072

caretakr | -1.319272 .2595422 -5.08 0.000 -1.827965 -.8105783

_cons | 2.944518 .1663401 17.70 0.000 2.618498 3.270539

------------------------------------------------------------------------------


Log-logistic

. streg invest polar numst format postelec caretakr, dist(loglog) time nolog

failure _d: censor

analysis time _t: durat

Log-logistic regression -- accelerated failure-time form

No. of subjects = 314 Number of obs = 314

No. of failures = 271

Time at risk = 5789.5

LR chi2(6) = 148.72

Log likelihood = -424.10921 Prob > chi2 = 0.0000

------------------------------------------------------------------------------

_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]

-------------+----------------------------------------------------------------

invest | -.3367541 .1278083 -2.63 0.008 -.5872538 -.0862544

polar | -.0221958 .0052638 -4.22 0.000 -.0325127 -.0118789

numst | .4830709 .1212506 3.98 0.000 .2454241 .7207177

format | -.1093453 .0419715 -2.61 0.009 -.1916078 -.0270827

postelec | .6408808 .1240329 5.17 0.000 .3977807 .8839808

caretakr | -1.26921 .2310272 -5.49 0.000 -1.722015 -.8164046

_cons | 2.728818 .1595866 17.10 0.000 2.416034 3.041602

-------------+----------------------------------------------------------------

/ln_gam | -.5657686 .0511353 -11.06 0.000 -.665992 -.4655451

-------------+----------------------------------------------------------------

gamma | .5679235 .029041 .5137636 .6277928

------------------------------------------------------------------------------


Log-normal

. streg invest polar numst format postelec caretakr, dist(lognorm) time nolog

failure _d: censor

analysis time _t: durat

Log-normal regression -- accelerated failure-time form

No. of subjects = 314 Number of obs = 314

No. of failures = 271

Time at risk = 5789.5

LR chi2(6) = 150.66

Log likelihood = -425.30621 Prob > chi2 = 0.0000

------------------------------------------------------------------------------

_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]

-------------+----------------------------------------------------------------

invest | -.3738013 .1327055 -2.82 0.005 -.6338993 -.1137032

polar | -.021988 .0054825 -4.01 0.000 -.0327336 -.0112424

numst | .5717579 .1232281 4.64 0.000 .3302353 .8132805

format | -.1194982 .0432516 -2.76 0.006 -.2042698 -.0347266

postelec | .6668079 .1292366 5.16 0.000 .4135088 .920107

caretakr | -1.126047 .2576962 -4.37 0.000 -1.631122 -.6209713

_cons | 2.632497 .164494 16.00 0.000 2.310095 2.954899

-------------+----------------------------------------------------------------

/ln_sig | .0078719 .0439881 0.18 0.858 -.0783432 .0940871

-------------+----------------------------------------------------------------

sigma | 1.007903 .0443358 .924647 1.098655


Weibull

> cab.weib<-survreg(Surv(durat,censor)~invest + polar + numst +

+ format + postelec + caretakr,data=cabinet,

+ dist='weibull')

>

> summary(cab.weib)

Call:

survreg(formula = Surv(durat, censor) ~ invest + polar + numst +

format + postelec + caretakr, data = cabinet, dist = "weibull")

Value Std. Error z p

(Intercept) 2.9854 0.12811 23.30 4.15e-120

invest -0.2958 0.10590 -2.79 5.22e-03

polar -0.0179 0.00428 -4.19 2.74e-05

numst 0.4649 0.10058 4.62 3.80e-06

format -0.1024 0.03359 -3.05 2.30e-03

postelec 0.6796 0.10438 6.51 7.47e-11

caretakr -1.3340 0.20175 -6.61 3.79e-11

Log(scale) -0.2576 0.05006 -5.15 2.65e-07

Scale= 0.773

Weibull distribution

Loglik(model)= -1014.6 Loglik(intercept only)= -1100.6

Chisq= 171.94 on 6 degrees of freedom, p= 0

Number of Newton-Raphson Iterations: 5

n= 314


Log-Logistic

> cab.ll<-survreg(Surv(durat,censor)~invest + polar + numst +

+ format + postelec + caretakr,data=cabinet,

+ dist='loglogistic')

>

> summary(cab.ll)

Call:

survreg(formula = Surv(durat, censor) ~ invest + polar + numst +

format + postelec + caretakr, data = cabinet, dist = "loglogistic")

Value Std. Error z p

(Intercept) 2.7288 0.15959 17.10 1.50e-65

invest -0.3368 0.12781 -2.63 8.42e-03

polar -0.0222 0.00526 -4.22 2.48e-05

numst 0.4831 0.12125 3.98 6.77e-05

format -0.1093 0.04197 -2.61 9.18e-03

postelec 0.6409 0.12403 5.17 2.38e-07

caretakr -1.2692 0.23103 -5.49 3.93e-08

Log(scale) -0.5658 0.05114 -11.06 1.87e-28

Scale= 0.568

Log logistic distribution

Loglik(model)= -1024.7 Loglik(intercept only)= -1099

Chisq= 148.72 on 6 degrees of freedom, p= 0

Number of Newton-Raphson Iterations: 4

n= 314


> ##Log-Normal can be fit using survreg:

>

> cab.ln<-survreg(Surv(durat,censor)~invest + polar + numst +

+ format + postelec + caretakr,data=cabinet,

+ dist='lognormal')

>

> summary(cab.ln)

Call:

survreg(formula = Surv(durat, censor) ~ invest + polar + numst +

format + postelec + caretakr, data = cabinet, dist = "lognormal")

Value Std. Error z p

(Intercept) 2.63250 0.16449 16.004 1.21e-57

invest -0.37380 0.13271 -2.817 4.85e-03

polar -0.02199 0.00548 -4.011 6.06e-05

numst 0.57176 0.12323 4.640 3.49e-06

format -0.11950 0.04325 -2.763 5.73e-03

postelec 0.66681 0.12924 5.160 2.47e-07

caretakr -1.12605 0.25770 -4.370 1.24e-05

Log(scale) 0.00787 0.04399 0.179 8.58e-01

Scale= 1.01

Log Normal distribution

Loglik(model)= -1025.9 Loglik(intercept only)= -1101.2

Chisq= 150.66 on 6 degrees of freedom, p= 0

Number of Newton-Raphson Iterations: 4

n= 314


Comparing Log-Likelihoods (note: non-nested models). I did this in R:

anova(cab.weib, cab.ln, cab.ll)

1 invest + polar + numst + format + postelec + caretakr

2 invest + polar + numst + format + postelec + caretakr

3 invest + polar + numst + format + postelec + caretakr

Resid. Df -2*LL Test Df Deviance P(>|Chi|)

1 306 2029.238 NA NA NA

2 306 2051.701 = 0 -22.462507 NA

3 306 2049.307 = 0 2.394004 NA


Back to Stata: Generalized Gamma

. streg invest polar numst format postelec caretakr, dist(gamma) nolog

failure _d: censor

analysis time _t: durat

Gamma regression -- accelerated failure-time form

No. of subjects = 314 Number of obs = 314

No. of failures = 271

Time at risk = 5789.5

LR chi2(6) = 165.78

Log likelihood = -414.00944 Prob > chi2 = 0.0000

------------------------------------------------------------------------------

_t | Coef. Std. Err. z P>|z| [95% Conf. Interval]

-------------+----------------------------------------------------------------

invest | -.3005269 .108745 -2.76 0.006 -.5136633 -.0873906

polar | -.0182998 .0044674 -4.10 0.000 -.0270559 -.0095438

numst | .4692142 .1030895 4.55 0.000 .2671626 .6712659

format | -.1031368 .0342637 -3.01 0.003 -.1702925 -.0359811

postelec | .6807161 .1061356 6.41 0.000 .4726942 .888738

caretakr | -1.328476 .2066422 -6.43 0.000 -1.733487 -.9234647

_cons | 2.963114 .1447075 20.48 0.000 2.679492 3.246735

-------------+----------------------------------------------------------------

/ln_sig | -.234325 .0802121 -2.92 0.003 -.3915378 -.0771122

/kappa | .9241712 .2065399 4.47 0.000 .5193605 1.328982

-------------+----------------------------------------------------------------

sigma | .7911047 .0634561 .6760165 .9257859

------------------------------------------------------------------------------


Adjudication

I Lots of choices.

I Selection can be arbitrary.

I If parametrically nested, standard LR tests apply.

I Encompassing distribution: generalized gamma:

$$f(t) = \frac{\lambda p (\lambda t)^{p\kappa - 1} \exp[-(\lambda t)^p]}{\Gamma(\kappa)} \qquad (7)$$

I When κ = 1, the Weibull is implied; when κ = p = 1, the exponential distribution is implied; when κ = 0, the log-normal distribution is implied; and when p = 1, the gamma distribution is implied.

I In illustrations above, verify that the Weibull would be the preferred model among the choices.

I AIC (−2(log L) + 2(c + p + 1), where c is the number of covariates and p the number of ancillary shape parameters) also confirms the Weibull is the preferred model among the choices.
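For the nested pairs, the LR tests can be sketched from the log-likelihoods in the Stata output shown earlier (exponential −425.90641, Weibull −414.07496, generalized gamma −414.00944). This is an added illustration, not the course's code; each comparison involves one restriction, so the chi-square(1) upper tail is computed from the standard identity P(X > x) = erfc(sqrt(x/2)).

```python
import math

def lr_test_1df(loglik_restricted, loglik_general):
    """LR statistic and p-value for one restriction (chi-square, 1 df)."""
    stat = 2 * (loglik_general - loglik_restricted)
    p_value = math.erfc(math.sqrt(stat / 2))  # chi-square(1) upper tail
    return stat, p_value

# Log-likelihoods reported in the Stata output.
ll_exponential = -425.90641
ll_weibull = -414.07496
ll_gen_gamma = -414.00944

comparisons = [
    ("exponential vs Weibull (p = 1)", ll_exponential, ll_weibull),
    ("Weibull vs generalized gamma (kappa = 1)", ll_weibull, ll_gen_gamma),
]
for label, restricted, general in comparisons:
    stat, p_value = lr_test_1df(restricted, general)
    print(f"{label}: LR = {stat:.3f}, p = {p_value:.4f}")
```

A large statistic in the first comparison and a small one in the second is the pattern that would favor the Weibull: the exponential restriction is rejected, while the extra generalized gamma parameter adds little.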


Survivor Functions

Figure: The figure graphs the generalized gamma and Weibull survivor functions for the cabinet duration data. The Weibull estimates are denoted by the “O” symbol and the generalized gamma estimates are denoted by the line. (Axis label: Cabinet Duration; ticks run from 0 to 60 and from 0 to 1.)


Table: AIC and Log-Likelihoods for Cabinet Models

Model                Log-Likelihood     AIC
Exponential              −425.91       865.82
Weibull                  −414.07       844.14
Log-Logistic             −424.11       864.22
Log-Normal               −425.31       866.62
Gompertz                 −418.98       853.96
Generalized Gamma        −414.01       846.02
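The AIC column can be reproduced from the log-likelihoods with the formula −2(log L) + 2(c + p + 1). This is a minimal sketch that assumes c = 6 covariates (as in the Stata runs) and takes p to be the number of ancillary shape parameters in each model; the dictionary layout and function name are illustrative.

```python
models = {
    # name: (log-likelihood, number of ancillary shape parameters p)
    "Exponential": (-425.91, 0),
    "Weibull": (-414.07, 1),
    "Log-Logistic": (-424.11, 1),
    "Log-Normal": (-425.31, 1),
    "Gompertz": (-418.98, 1),
    "Generalized Gamma": (-414.01, 2),
}
n_covariates = 6  # c: invest, polar, numst, format, postelec, caretakr

def aic(loglik, c, p):
    """AIC = -2 log L + 2 (c + p + 1)."""
    return -2 * loglik + 2 * (c + p + 1)

scores = {name: aic(ll, n_covariates, p) for name, (ll, p) in models.items()}
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name:18s} AIC = {score:.2f}")
```

Sorting by AIC puts the Weibull first: the generalized gamma has a marginally higher log-likelihood but pays for its extra parameter.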
