INVESTIGATING THE USE OF THE ACCELERATED HAZARDS MODEL FOR
SURVIVAL ANALYSIS
by Caroll Anne Co
B.Sc., Simon Fraser University, 2007
PROJECT SUBMITTED IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE
in the
Department of Statistics & Actuarial Science
Faculty of Science
Caroll Anne Co 2010
SIMON FRASER UNIVERSITY Fall 2010
All rights reserved. However, in accordance with the Copyright Act of Canada, this work may
be reproduced, without authorization, under the conditions for “Fair Dealing.” Therefore, limited reproduction of this work for the
purposes of private study, research, criticism, review, and news reporting is likely to be in accordance with the law, particularly if cited appropriately.
APPROVAL
Name: Caroll Anne Co
Degree: Master of Science
Title of Thesis: Investigating the Use of the Accelerated Hazards Model for
Survival Analysis
Examining Committee: Dr. Derek Bingham (Chair)
Dr. Charmaine Dean, Senior Supervisor
Dr. Leilei Zeng, Supervisor
Dr. Joan Hu, External Examiner
Date Approved: December 9, 2010
Last revision: Spring 09
Declaration of Partial Copyright Licence The author, whose copyright is declared on the title page of this work, has granted to Simon Fraser University the right to lend this thesis, project or extended essay to users of the Simon Fraser University Library, and to make partial or single copies only for such users or in response to a request from the library of any other university, or other educational institution, on its own behalf or for one of its users.
The author has further granted permission to Simon Fraser University to keep or make a digital copy for use in its circulating collection (currently available to the public at the “Institutional Repository” link of the SFU Library website <www.lib.sfu.ca> at: <http://ir.lib.sfu.ca/handle/1892/112>) and, without changing the content, to translate the thesis/project or extended essays, if technically possible, to any medium or format for the purpose of preservation of the digital work.
The author has further agreed that permission for multiple copying of this work for scholarly purposes may be granted by either the author or the Dean of Graduate Studies.
It is understood that copying or publication of this work for financial gain shall not be allowed without the author’s written permission.
Permission for public performance, or limited permission for private scholarly use, of any multimedia materials forming part of this work, may have been granted by the author. This information may be found on the separately catalogued multimedia material and in the signed Partial Copyright Licence.
While licensing SFU to permit the above uses, the author retains copyright in the thesis, project or extended essays, including the right to change the work for subsequent purposes, including editing and publishing the work in whole or in part, and licensing other parties, as the author may desire.
The original Partial Copyright Licence attesting to these terms, and signed by this author, may be found in the original bound copy of this work, retained in the Simon Fraser University Archive.
Simon Fraser University Library Burnaby, BC, Canada
Abstract
This project contrasts the Proportional Hazards, Accelerated Failure Time and Accelerated
Hazards (AH) models in the analysis of time to event data. The AH model handles data that
exhibit crossing of the survival and hazard curves, unlike the other two models considered.
The three models are illustrated on five contrasting data sets. A simulation study is con-
ducted to assess the small sample performance of the AH model by quantifying the mean
squared error of the predicted survivor curves under scenarios of crossing and non-crossing
survivor curves. The results show that the AH model can perform poorly under model
misspecification for models with a crossing hazard. Problems with variance estimation of
parameters in the AH model are observed for small sample sizes and a bootstrap approach
is offered as an alternate method of quantifying precision of estimates.
Acknowledgments
This manuscript would not have gotten this far without the guidance and supervision from
Drs. Charmaine Dean and Leilei Zeng. I thank you both for your mindful insights in giv-
ing me direction in my research and course-work throughout the program. Many thanks
to Dr. Rachel Altman for providing me with helpful advice during the early stages of the
research project. I am grateful to Drs. Joan Hu and Derek Bingham for serving as the
external examiner and chair in my graduate committee. To my fellow graduate students in
the Department of Statistics & Actuarial Science, your friendships have truly made learn-
ing and research enjoyable. I have learned as much from the class lectures as I have from
the discussions with you on both academic and non-academic subjects. To my friends who
have kept me motivated in my academic pursuit, I would not be able to reach the finish line
without you.
I am indebted to my manager and co-workers from BC Mental Health & Addiction
Services for their understanding and encouragement in allowing me to pursue higher edu-
cation while working part-time; and to the staff at Cardiac Services BC for their patience
and support as I finish the last few steps in the program. I thank my family for their un-
wavering love and care throughout my education; and to PB, my best friend and confidant,
thank you for everything.
Contents
Approval
Abstract
Acknowledgments
Contents
List of Tables
List of Figures
1 Introduction
2 Models for Survival Analysis
2.1 The Proportional Hazards Model
2.1.1 Inference
2.2 The Accelerated Failure Time Model
2.2.1 Inference
2.3 The Accelerated Hazards Model
2.3.1 Inference
2.4 Comparison of the functional forms for the PH, AFT, and AH Models
2.5 Small sample investigation of the performance of the AH estimator
3 Data Analysis
3.1 Breast Cancer Clinical Trial
3.2 Coronary Artery Bypass Graft Surgery
3.3 Veteran Lung Cancer
3.4 Kidney Catheter Data
3.5 Brain Tumor Data
3.6 Summary AH Parameter Interpretation
4 Exploring the fit of the PH and AH survivor curves
4.1 Case I
4.2 Case II
4.3 Conclusion
5 Discussion
A Appendix
Bibliography
List of Tables
2.1 Parameter interpretation for PH, AFT and AH models.
2.2 Summary of bias, variance, and variance ratios for the PH and AH models fitted to a Weibull distribution for the 0% censoring case. Note that $\beta_{PH} = \beta_{AH}(k-1)$.
2.3 Summary of bias, variance, and variance ratios for the PH and AH models fitted to a Weibull distribution for the 27% censoring case. Note that $\beta_{PH} = \beta_{AH}(k-1)$.
2.4 Summary of bias, variance, and variance ratios for the PH and AH models fitted to a Weibull distribution for the 53% censoring case. Note that $\beta_{PH} = \beta_{AH}(k-1)$.
3.1 Estimated treatment effects in the analysis of the breast cancer data using PH, AFT, and AH models. Values with * in the AH model represent bootstrapped estimates. The p-values correspond to Wald tests of a hypothesis of no treatment effect.
3.2 Estimated gender effects on the coronary artery bypass graft data using PH, AFT and AH models. Values with * in the AH model represent bootstrapped estimates. The p-values correspond to Wald tests of a hypothesis of no treatment effect.
3.3 Estimated treatment effects on the veteran lung cancer data using PH, AFT, and AH models. Values with * in the AH model represent bootstrapped estimates. The p-values correspond to Wald tests of a hypothesis of no treatment effect.
3.4 Estimated treatment effects on the percutaneous catheter placement using PH, AFT, and AH models. Values with * in the AH model represent bootstrapped estimates. The p-values correspond to Wald tests of a hypothesis of no treatment effect.
3.5 Estimated treatment effects on the carmustine (BCNU) polymer disc data using PH, AFT, and AH models. Values with * in the AH model represent bootstrapped estimates. The p-values correspond to Wald tests of a hypothesis of no treatment effect.
3.6 Summary of features of the five datasets considered in the chapter, and the best model fitted in each dataset. The column headers reflect features seen from the estimated Kaplan-Meier curves, without regard for significance of these effects.
List of Figures
1.1 Hypothetical example of a scenario exhibiting crossing survivor curves, with the control group representing individuals on an oral medication therapy (black), and the treatment group representing individuals who were surgically treated (red).
2.1 Hazard (top row) and survivor (bottom row) functions of PH, AFT, and AH models. Non-crossing hazard functions for the AFT and AH models are not shown.
2.2 Levelplot of variance ratios of the empirical and model-based accelerated hazards model variance estimates, for treatment effects (te) 0, 0.5, and 2; shape parameter, k=0.5, 1.5 and 3; scale parameter, λ=0.25, 1, and 1.3; sample sizes, n=100, 500, 1000, 5000, and 10,000. A negative ratio implies an underestimation of the model-based variance denoted by a warm orange color, while a positive ratio implies an overestimation of the model-based variance denoted by a dark green color.
2.3 Levelplot of variance ratios of empirical and model-based proportional hazards model variance estimates, for treatment effects (te) 0, 0.5, and 2; shape parameter, k=0.5, 1.5 and 3; scale parameter, λ=0.25, 1, and 1.3; sample sizes, n=100, 500, 1000, 5000, and 10,000. A negative ratio implies an underestimation of the model-based variance denoted by a yellow color, while a positive ratio implies an overestimation of the model-based variance denoted by green.
2.4 Dotplot of three variance estimates - empirical variance (blue), model-based variance (pink), and non-parametric bootstrapped variance (green) for data taken from a Weibull distribution with n=100, shape (k)=0.5, 1.5, and 3, scale (λ)=0.25, 1, and 1.3, and treatment effect, $\beta_{AH}$ (denoted te) = 0, 0.5, and 2.
2.5 Dotplot of three variance estimates - empirical variance (blue), model-based variance (pink), and non-parametric bootstrapped variance (green) for data taken from a Weibull distribution with n=300, shape (k)=0.5, 1.5, and 3, scale (λ)=0.25, 1, and 1.3, and treatment effect, $\beta_{AH}$ (denoted te) = 0, 0.5, and 2.
2.6 Dotplot of three variance estimates - empirical variance (blue), model-based variance (pink), and non-parametric bootstrapped variance (green) for data taken from a Weibull distribution with n=500, shape (k)=0.5, 1.5, and 3, scale (λ)=0.25, 1, and 1.3, and treatment effect, $\beta_{AH}$ (denoted te) = 0, 0.5, and 2.
2.7 Dotplot of two coverage probabilities using the model-based variance (blue), and non-parametric bootstrapped variance (pink) for data taken from a Weibull distribution with n=100, shape (k)=0.5, 1.5, and 3, scale (λ)=0.25, 1, and 1.3, and treatment effect, $\beta_{AH}$ (denoted te) = 0, 0.5, and 2.
2.8 Dotplot of two coverage probabilities using the model-based variance (blue), and non-parametric bootstrapped variance (pink) for data taken from a Weibull distribution with n=300, shape (k)=0.5, 1.5, and 3, scale (λ)=0.25, 1, and 1.3, and treatment effect, $\beta_{AH}$ (denoted te) = 0, 0.5, and 2.
2.9 Dotplot of two coverage probabilities using the model-based variance (blue), and non-parametric bootstrapped variance (pink) for data taken from a Weibull distribution with n=500, shape (k)=0.5, 1.5, and 3, scale (λ)=0.25, 1, and 1.3, and treatment effect, $\beta_{AH}$ (denoted te) = 0, 0.5, and 2.
3.1 Left: Kaplan-Meier survivor curves for the control group (black) and treatment group (red). Right: Cumulative hazard curves for the control group and treatment group.
3.2 Smoothed hazard curves for the breast cancer data.
3.3 Top: Cumulative hazard curves for the PH, Weibull AFT, and AH models. The fitted cumulative hazard curves are shown in solid lines vs the non-parametric hazard curves in dashed lines. The baseline (control) hazard curves are shown in black. Bottom: Kaplan-Meier survivor curves for the control group (black dashed) and treatment group (red dashed), with the fitted survivor curves for treatment group shown in red solid lines.
3.4 Residual plot from the fit of the proportional hazards model to the breast cancer data.
3.5 Left: Kaplan-Meier coronary artery bypass graft surgery 2-year survivor curves for males (black) and females (red). Right: Two-year cumulative hazard curve for coronary artery bypass graft surgeries.
3.6 Smoothed hazard curves for males (black solid) and females (red dashed) who underwent a coronary artery bypass graft surgery.
3.7 Top: Cumulative hazard curves for the 2-year coronary artery bypass data using the PH, AFT, and AH models. The non-parametric estimates are shown in dashed lines, while the estimated cumulative hazard curves are shown in solid lines for males (black) and females (red). Bottom: Kaplan-Meier 2-year survivor curves for males (black dashed) and females (red dashed), with the fitted survivor curves shown in solid lines.
3.8 Residual plot from the fit of the proportional hazards model to the coronary artery bypass graft surgery data during the first two years post-surgery.
3.9 Left: Kaplan-Meier survivor curves for the standard (black) and test (red) chemotherapy groups. Right: Cumulative hazard curves for the two groups.
3.10 Smoothed hazard curves for the standard and test chemotherapy for the treatment of lung cancer.
3.11 Top: Cumulative hazard curves for the veteran lung cancer data using the PH, AFT, and AH models. The non-parametric estimates are shown in dashed lines, while the estimated cumulative hazard curves are shown in solid lines for standard treatment (black) and test treatment (red). Bottom: Kaplan-Meier survivor curves for standard treatment (black dashed) and test treatment (red dashed), with the fitted survivor curves shown in solid lines.
3.12 Residual plot from the fit of the proportional hazards model to the veteran lung cancer data.
3.13 Left: Kaplan-Meier survivor curves for surgical (black) and percutaneous placement (red) of the catheter for patients undergoing kidney dialysis. Right: Cumulative hazard curves for the two groups.
3.14 Smoothed hazard curves for surgically and percutaneously-placed catheters for patients undergoing kidney dialysis.
3.15 Top: Cumulative hazard curves for the kidney catheter placement data using the PH, AFT, and AH models. The non-parametric estimates are shown in dashed lines, while the estimated cumulative hazard curves are shown in solid lines for the surgical placement group (black) and the percutaneous placement group (red). Bottom: Kaplan-Meier survivor curves for the surgical placement group (black dashed) and the percutaneous placement group (red dashed), with the fitted survivor curves shown in solid lines.
3.16 Residual plot from the fit of the proportional hazards model to the kidney catheter placement data.
3.17 Left: Kaplan-Meier survivor curves for the placebo (black) and BCNU polymer (red) groups. Right: Cumulative hazard curves for the two groups.
3.18 Smoothed hazard curves for the placebo (black solid) and BCNU polymer (red dashed) for the treatment of brain tumor.
3.19 Top: Cumulative hazard curves for the BCNU polymer disc data using the PH, AFT, and AH models. The non-parametric estimates are shown in dashed lines, while the estimated cumulative hazard curves are shown in solid lines for the placebo group (black) and the BCNU polymer group (red). Bottom: Kaplan-Meier survivor curves for the placebo group (black dashed) and the BCNU polymer group (red dashed), with the fitted survivor curves shown in solid lines.
3.20 Residual plot from the fit of the proportional hazards model to the brain tumor data.
4.1 Hazard curves for the loglogistic distribution with shape parameters, a=0.5, 1, and 1.5 and scale parameter, s=1.
4.2 Hazard (left) and survivor (right) curves for a loglogistic distribution with shape=1.5 and scale=4 for a treatment effect size of -1.
4.3 Comparison of the PH and AH fits for effect sizes of -0.5 (top), -1 (middle), and -1.5 (bottom) when the hazards do not start at the same point at t=0. The dashed curves display true survivor functions corresponding to the baseline and treatment groups.
4.4 Comparison of the fitted PH and AH curves with the true survivor curves (shown with dashed lines) for three effect sizes (-0.5, -1, -1.5), with the fitted baseline curves on the left panels, and the fitted treatment curves on the right panels.
4.5 Boxplots of mean squared errors for the AH and PH models for β = -0.5, -1, -1.5 in Case I, for a sample size of 100, for 1000 simulation runs.
4.6 Hazard (left) and survivor (right) curves for a loglogistic distribution with shape parameter, a=4 and scale parameter, s=50 for a treatment effect size of β=-1.
4.7 Comparison of the PH and AH fits for effect sizes of -0.5 (top), -1 (middle), and -1.5 (bottom) when the hazards start at the same point at t=0. The dashed curves display true survivor functions corresponding to the baseline and treatment groups.
4.8 Comparison of the fitted PH and AH curves with the true survivor curves (shown with dashed lines) for three effect sizes (-0.5, -1, -1.5), with the fitted baseline curves on the left panels, and the fitted treatment curves on the right panels. Fixed censoring was done at t=100.
4.9 Boxplots of mean squared errors for the AH and PH models for β = -0.5, -1, -1.5 in Case II, for a sample size of 100, for 1000 simulation runs.
Chapter 1
Introduction
Survival analysis is concerned with the analysis of time to event data, such as time to
recurrence of a disease, or time to death, and determining the effects of treatments and
other factors on times to such events. In the modeling of survival time data, it is common
to use the hazard function and the survivor function to describe the distribution of lifetime
and the effect of the predictors or covariates on the cohorts’ risk and survival time. A hazard
function is defined as the instantaneous rate of failure (or having an event occur) at time t,
given survival up to time t. It is defined as:
\[
h(t) = \lim_{\Delta t \to 0^+} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t} = \frac{f(t)}{S(t)} \tag{1.1}
\]
where $f(t)$ is the probability density function of the lifetime, and $S(t) = P(T \ge t) = 1 - F(t)$
is the survivor function, the probability of surviving beyond time $t$. In contrast,
$F(t) = P(T < t)$ is the probability of failing before time $t$. The hazard and survivor functions
are interrelated by the following equation:
\[
S(t) = e^{-\int_0^t h(s)\,ds} = e^{-H(t)} \tag{1.2}
\]
where $H(t)$ is defined as the cumulative hazard function.
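As a minimal numerical sketch of relation (1.2), and not part of the original analysis, the Python code below integrates a Weibull hazard and compares the result with the closed-form survivor function. The parameterization $h(t) = (k/\lambda)(t/\lambda)^{k-1}$, the parameter values, and all function names are assumptions made for illustration only.

```python
import math

def weibull_hazard(t, k, lam):
    """Weibull hazard under the assumed parameterization h(t) = (k/lam) * (t/lam)**(k-1)."""
    return (k / lam) * (t / lam) ** (k - 1)

def weibull_survivor(t, k, lam):
    """Closed-form Weibull survivor function S(t) = exp(-(t/lam)**k)."""
    return math.exp(-((t / lam) ** k))

def cumulative_hazard(hazard, t, steps=20000):
    """Approximate H(t), the integral of h(s) over [0, t], with the midpoint rule."""
    width = t / steps
    return sum(hazard((i + 0.5) * width) for i in range(steps)) * width

# Illustrative (hypothetical) shape, scale, and evaluation time.
k, lam, t = 1.5, 1.0, 2.0
H = cumulative_hazard(lambda s: weibull_hazard(s, k, lam), t)
s_from_hazard = math.exp(-H)                 # S(t) recovered through (1.2)
s_closed_form = weibull_survivor(t, k, lam)  # S(t) from the closed form
```

The two values `s_from_hazard` and `s_closed_form` should agree up to the error of the numerical integration.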
The more commonly used models for survival analysis are the so-called Proportional
Hazards model (PH) and the Accelerated Failure Time model (AFT). Both of these models
assume an immediate treatment effect at the start of the study/trial. However, in random-
ized clinical trials, which compare a treatment with a placebo, it may be more reasonable to
assume that the risks of failure for the treatment and control (placebo) groups are equivalent
at the start of the trial and change as the trial proceeds. The Accelerated Hazards model
(AH), proposed by Chen & Wang (2000), has this property. It is also able to handle data
that exhibit crossing of the survivor and hazard functions for treatment and control groups,
a feature that the PH and AFT models cannot accommodate. The model’s ability to capture
cross-overs of either or both of the hazard and survivor curves is often useful in practice.
An example of a situation where a crossing in the survivor curves may occur is in a ran-
domized controlled trial where one group receives oral medication, while another receives
a riskier treatment involving surgery. It may be that the operative mortality of individuals
undergoing surgery is initially high post-surgery but the rate of failure tapers off after a
certain time, indicating a cure. In the oral medication group, however, survival may
decline far more slowly at the start of the trial, since the treatment is less invasive, but may
continue to decline over time, with lower survival rates exhibited after some period than
those for individuals undergoing the surgery. Figure 1.1 illustrates this hypothetical
scenario. This project contrasts the PH, AFT and AH models, develops methods for analysis
using the AH model, and considers the application of these models to several data sets.
In the proportional hazards model, the effect of the treatment is quantified on the haz-
ard scale. The effect of the treatment over a placebo, for example, is modeled in terms of
the hazard ratio of the placebo and treatment groups. This ratio can be thought of as the
relative risk, which quantifies how much more (less) risk the treatment group has over the
placebo group. In the accelerated failure time model, the effect of the treatment is quan-
tified by how fast (or slow) the treatment group ages (along the survivor curve) relative
to the control group. The treatment effect acts multiplicatively on time when calculating
the survivor function, and can be described as a survivor time ratio of the two groups. In
the accelerated hazards model, the treatment effect is quantified on the hazard scale and
describes how much faster (or slower) the risk progression is for the treatment group when
compared to the placebo group.
[Figure 1.1: Hypothetical example of a scenario exhibiting crossing survivor curves, with
the control group representing individuals on an oral medication therapy (black), and the
treatment group representing individuals who were surgically treated (red). Horizontal
axis: time in months (0 to 10); vertical axis: P(Survival).]
The precise formulation and inference for these three models are described and discussed
in Chapter 2. A common feature of survival data is that observations may be censored;
censoring occurs when individuals do not fail before the termination of the trial
and what is recorded is the censoring time, or time at which failure had not yet occurred.
For individuals who fail before the termination of the trial we have available their lifetime
or failure time. Understanding the censoring process is fundamental to correct statistical
inference; here, we assume that failure times are independent from censoring times and
develop inferential procedures based on this assumption in Chapter 2. Chapter 3 presents
data analyses of five datasets to illustrate the performance of the models, and we use these
examples to illustrate how parameter interpretation varies among the models under study.
Chapter 4 describes the results of a simulation study that considers the goodness of fit
of the proportional hazards and accelerated hazards models when the underlying hazard
curves for treatment and control groups cross. The goodness of fit is quantified by calculat-
ing the mean squared error of the predicted survivor curves for both the control group and
the treatment group. Some suggestions for future work and a discussion of the findings in
this project are presented in Chapter 5.
Chapter 2
Models for Survival Analysis
This chapter discusses inference for the proportional hazards, accelerated failure time, and
accelerated hazards models. We discuss semiparametric methods of inference for the pro-
portional hazards and the accelerated hazards models, and likelihood inference for the
Weibull model, a commonly used accelerated failure time model. We illustrate the vari-
ous functional forms these models exhibit, contrasting the shapes they may attain. We also
investigate small sample properties of estimators of treatment effects in these models, and
describe resampling approaches for improving the performance of the small sample vari-
ance estimator in the accelerated hazards model.
We first introduce some basic notation that is used throughout the chapter. We denote
the observed event time by $X_i = \min(T_i, C_i)$, where $T_i$ is the actual event time and $C_i$ is the
censoring time for individual $i$, $i = 1, \ldots, n$. Throughout the analyses in this project, we
assume independence between $T_i$ and $C_i$ conditional on a given $p \times 1$ vector of covariates,
$\mathbf{Z}_i = (Z_{i1}, \ldots, Z_{ip})'$. We define the lifetime indicator as $\Delta_i = I(T_i < C_i)$, where the function
$I(\cdot)$ takes a value of 1 if the condition is satisfied, and 0 otherwise. The observed data are
written as triples, $(X_i, \Delta_i, \mathbf{Z}_i)$, for $i = 1, \ldots, n$. The realization of $(X_i, \Delta_i, \mathbf{Z}_i)$ is denoted by
$(x_i, \delta_i, \mathbf{z}_i)$.
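To make the notation concrete, the following Python sketch generates right-censored triples $(x_i, \delta_i, z_i)$. It is an illustration only: the exponential event and censoring distributions, their rates, the binary covariate, and the function name `simulate_censored` are all assumptions, not the simulation design used in this project.

```python
import random

def simulate_censored(n, event_rate, censor_rate, seed=0):
    """Return n triples (x_i, delta_i, z_i) with x_i = min(t_i, c_i), delta_i = I(t_i < c_i).

    Event and censoring times are drawn independently, matching the
    independent-censoring assumption; both are exponential by assumption.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        z = rng.randint(0, 1)             # hypothetical binary treatment indicator
        t = rng.expovariate(event_rate)   # actual event time T_i
        c = rng.expovariate(censor_rate)  # censoring time C_i
        x = min(t, c)                     # observed time X_i
        delta = 1 if t < c else 0         # lifetime indicator Delta_i
        data.append((x, delta, z))
    return data

sample = simulate_censored(n=200, event_rate=1.0, censor_rate=0.5)
censored_fraction = 1 - sum(delta for _, delta, _ in sample) / len(sample)
```

With these assumed rates the expected censoring proportion is 0.5 / (1.0 + 0.5), about one third; `censored_fraction` gives the proportion realized in the simulated sample.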
2.1 The Proportional Hazards Model
The proportional hazards model is among the most commonly used models in survival
analysis, due to its simple conceptual framework, the fact that the model is semiparametric,
its excellent small sample performance, and the widespread availability of inferential tech-
niques for this model in statistical computing packages. In fact, this model, commonly
called the Cox model (Cox, 1972), is the standard model for survival analysis. This model
assumes hazard curves are proportional for individuals with different covariates.
The hazard function for an individual $i$ with a covariate vector $\mathbf{z}_i$ is formulated as
\[
h_{PHi}(t) = h_{PH0}(t)\, g(\mathbf{z}_i; \boldsymbol{\beta}_{PH}) \tag{2.1}
\]
where $h_{PH0}(t)$ is the so-called baseline hazard function, $g(\cdot)$, the relative risk function,
describes the effect of the covariates, $\mathbf{Z}_i$, on the baseline hazard, and $\boldsymbol{\beta}_{PH}$ is a $p \times 1$ vector
of regression parameters associated with covariates $\mathbf{Z}_i$. A typical form for $g(\cdot)$ is
$e^{\mathbf{z}_i'\boldsymbol{\beta}_{PH}}$. In this case, the survivor function for individual $i$ is
\[
S_{PHi}(t) = e^{-\int_0^t h_{PHi}(s)\,ds}
= \left[e^{-\int_0^t h_{PH0}(s)\,ds}\right]^{e^{\mathbf{z}_i'\boldsymbol{\beta}_{PH}}}
= \left[S_{PH0}(t)\right]^{e^{\mathbf{z}_i'\boldsymbol{\beta}_{PH}}}
\]
where $S_{PH0}(t)$ is the survivor function corresponding to the baseline hazard; i.e. the
survivor function corresponding to an individual with $\mathbf{Z}_i = \mathbf{0}$. With semiparametric methods,
the model consists of a mixture of a non-parametric and parametric form. Here, the
baseline hazard function is non-parametric, while the relative risk function, $e^{\mathbf{z}_i'\boldsymbol{\beta}_{PH}}$, is
parametric. Although the covariates in the formulation above are expressed as fixed over time,
extensions of the Cox model to accommodate time-dependent covariates are handled in a
straightforward manner with simple adjustments to inference for the basic framework.
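The two defining features of the model, a hazard ratio that is constant in time and a survivor function equal to the baseline survivor function raised to the power $e^{z\beta}$, can be checked numerically. The sketch below is illustrative only; it assumes a scalar covariate, a Weibull baseline hazard with hypothetical parameter values, and function names chosen for this example.

```python
import math

def baseline_hazard(t, k=1.5, lam=1.0):
    """Assumed Weibull baseline hazard h_PH0(t)."""
    return (k / lam) * (t / lam) ** (k - 1)

def baseline_survivor(t, k=1.5, lam=1.0):
    """Baseline survivor function S_PH0(t) matching the hazard above."""
    return math.exp(-((t / lam) ** k))

def ph_hazard(t, z, beta):
    """h_PHi(t) = h_PH0(t) * exp(z * beta), as in (2.1) with g = exp."""
    return baseline_hazard(t) * math.exp(z * beta)

def ph_survivor(t, z, beta):
    """S_PHi(t) = S_PH0(t) ** exp(z * beta)."""
    return baseline_survivor(t) ** math.exp(z * beta)

def survivor_by_integration(t, z, beta, steps=20000):
    """exp(-integral of h_PHi over [0, t]), using the midpoint rule."""
    width = t / steps
    total = sum(ph_hazard((i + 0.5) * width, z, beta) for i in range(steps)) * width
    return math.exp(-total)

beta = 0.5  # hypothetical treatment effect
# The hazard ratio between z = 1 and z = 0 is the same at every time point.
hazard_ratios = [ph_hazard(t, 1, beta) / ph_hazard(t, 0, beta) for t in (0.5, 1.0, 2.0)]
```

Each entry of `hazard_ratios` equals `math.exp(beta)` up to rounding, and `ph_survivor` agrees with `survivor_by_integration` up to integration error.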
2.1.1 Inference
The partial likelihood method was proposed by Cox (1972) for estimation of the regression
parameter $\boldsymbol{\beta}_{PH}$. For simplicity, we assume there are no ties in the observed event times.
Techniques in Breslow (1974) can be used for tied event times. Let $0 < t_1 < \cdots < t_m$ denote
the $m$ distinct ordered event times. The risk set $R_l$ is defined as the set of individuals who
are still alive (i.e. subjects who have not had the event) just prior to time $t_l$. Let $(l)$ denote
the subject with event at time $t_l$, $l = 1, \ldots, m$. Under the proportional hazards model (2.1),
the conditional probability that individual $(l)$ fails at $t_l$, given the risk set (a summarized
history of the process), is
\[
\frac{h(t_l; \mathbf{z}_{(l)})}{\sum_{j \in R_l} h(t_l; \mathbf{z}_{(j)})}
= \frac{h_{PH0}(t_l)\, e^{\mathbf{z}_{(l)}'\boldsymbol{\beta}_{PH}}}{\sum_{j \in R_l} h_{PH0}(t_l)\, e^{\mathbf{z}_{(j)}'\boldsymbol{\beta}_{PH}}}
= \frac{e^{\mathbf{z}_{(l)}'\boldsymbol{\beta}_{PH}}}{\sum_{j \in R_l} e^{\mathbf{z}_{(j)}'\boldsymbol{\beta}_{PH}}}, \tag{2.2}
\]
where $l = 1, \ldots, m$.
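The numerator on the right-hand side of (2.2) pertains to the hazard/risk that person $(l)$
fails over a small interval. The denominator quantifies the combined hazards of everyone
who is at risk just prior to $t_l$. The partial likelihood is formed by taking the product over all
$m$ distinct failure points (or events) to give:
\[
L_{PH}(\boldsymbol{\beta}_{PH}) = \prod_{l=1}^{m} \left[ \frac{e^{\mathbf{z}_{(l)}'\boldsymbol{\beta}_{PH}}}{\sum_{j \in R_l} e^{\mathbf{z}_{(j)}'\boldsymbol{\beta}_{PH}}} \right]. \tag{2.3}
\]
It can be shown that the baseline hazard function, $h_{PH0}(t)$, is uninformative for the
estimation of $\boldsymbol{\beta}_{PH}$. An alternative way of writing the partial likelihood function (2.3) is
\[
L_{PH}(\boldsymbol{\beta}_{PH}) = \prod_{i=1}^{n} \left[ \frac{e^{\mathbf{z}_i'\boldsymbol{\beta}_{PH}}}{\sum_{j=1}^{n} I(x_j \ge x_i)\, e^{\mathbf{z}_j'\boldsymbol{\beta}_{PH}}} \right]^{\delta_i}. \tag{2.4}
\]
The log (partial) likelihood of $\boldsymbol{\beta}_{PH}$ based on (2.3) is
\[
\log L_{PH}(\boldsymbol{\beta}_{PH}) = \sum_{l=1}^{m} \mathbf{z}_{(l)}'\boldsymbol{\beta}_{PH} - \sum_{l=1}^{m} \left[ \log\Big( \sum_{j \in R_l} e^{\mathbf{z}_{(j)}'\boldsymbol{\beta}_{PH}} \Big) \right]. \tag{2.5}
\]
Taking the first derivative of (2.5) with respect to the $r$th element of $\boldsymbol{\beta}_{PH}$ yields:
\[
\frac{\partial \log L_{PH}}{\partial \beta_{PHr}} = \sum_{l=1}^{m} \left[ z_{(l)r} - \frac{\sum_{j \in R_l} z_{(j)r}\, e^{\mathbf{z}_{(j)}'\boldsymbol{\beta}_{PH}}}{\sum_{j \in R_l} e^{\mathbf{z}_{(j)}'\boldsymbol{\beta}_{PH}}} \right]. \tag{2.6}
\]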
CHAPTER 2. MODELS FOR SURVIVAL ANALYSIS 7
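The maximum likelihood estimate (MLE) of $\boldsymbol{\beta}_{PH}$ is easily obtained by using a Newton-Raphson iterative procedure to solve $\mathbf{U}(\boldsymbol{\beta}_{PH}) = \mathbf{0}$, where $\mathbf{U}(\boldsymbol{\beta}_{PH})$ is the score vector with
elements $\partial \log L_{PH} / \partial \beta_{PHr}$, $r = 1, \dots, p$. This estimation procedure is shown to provide consistent and
asymptotically normally distributed estimates for $\boldsymbol{\beta}_{PH}$. The asymptotic variance of the estimate $\hat{\boldsymbol{\beta}}_{PH}$ is the inverse of the information matrix, $I(\boldsymbol{\beta}_{PH})$, where
$$I(\boldsymbol{\beta}_{PH}) = E\left( -\frac{\partial^2 \log L_{PH}}{\partial \boldsymbol{\beta}_{PH}\, \partial \boldsymbol{\beta}'_{PH}} \right).$$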
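Usual maximum likelihood theory holds for Cox regression analysis. In particular, asymptotically it can be shown that:
• The likelihood ratio test statistic, $-2 \log[L(\boldsymbol{\beta}_{PH}) / L(\hat{\boldsymbol{\beta}}_{PH})]$, has a $\chi^2$ distribution with
$p$ degrees of freedom.
• The score function, $\mathbf{U}(\boldsymbol{\beta}_{PH})$, has a $N(\mathbf{0}, I(\boldsymbol{\beta}_{PH}))$ distribution.
• The Wald test statistic, $(\hat{\boldsymbol{\beta}}_{PH} - \boldsymbol{\beta}_{PH})' I(\hat{\boldsymbol{\beta}}_{PH}) (\hat{\boldsymbol{\beta}}_{PH} - \boldsymbol{\beta}_{PH})$, has a $\chi^2$ distribution with
$p$ degrees of freedom.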
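The Newton-Raphson step and the Wald statistic can be sketched as follows for a single covariate. This is an illustration only: it uses the observed information (the negative second derivative of the log partial likelihood) in place of the expected information, assumes no tied event times, and the function names and toy data are hypothetical.

```python
import math

def score_and_information(times, events, z, beta):
    """Score U(beta) and observed information for one covariate."""
    score, info = 0.0, 0.0
    for t_i, d_i, z_i in zip(times, events, z):
        if not d_i:
            continue
        risk = [j for j, t_j in enumerate(times) if t_j >= t_i]
        s0 = sum(math.exp(z[j] * beta) for j in risk)
        s1 = sum(z[j] * math.exp(z[j] * beta) for j in risk)
        s2 = sum(z[j] ** 2 * math.exp(z[j] * beta) for j in risk)
        score += z_i - s1 / s0
        info += s2 / s0 - (s1 / s0) ** 2  # weighted variance of z in the risk set
    return score, info

def newton_raphson(times, events, z, beta=0.0, tol=1e-10, max_iter=50):
    """Solve U(beta) = 0 by iterating beta <- beta + U(beta) / I(beta)."""
    for _ in range(max_iter):
        score, info = score_and_information(times, events, z, beta)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta

# Toy data (illustrative only), all events observed.
times = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
events = [1, 1, 1, 1, 1, 1]
z = [1, 0, 0, 1, 0, 1]
beta_hat = newton_raphson(times, events, z)
score_hat, info_hat = score_and_information(times, events, z, beta_hat)
wald = beta_hat ** 2 * info_hat  # Wald statistic for H0: beta = 0
```

Under the null hypothesis the Wald statistic would be compared with a chi-squared distribution with one degree of freedom, although with six observations the asymptotic approximation should not be trusted.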
2.2 The Accelerated Failure Time Model
The accelerated failure time model is commonly used in parametric estimation, where specification of the probability distribution function is required. When the failure time, $T_i$,
arises from a log-linear family (that is, $Y_i = \log T_i$ corresponds to a linear model, with the
covariates $\mathbf{Z}_i$ having an additive effect on the log of failure time, $T_i$) then the family is an
accelerated failure time model. This model is written as:
$$Y_i = \log T_i = \mathbf{z}'_i \boldsymbol{\beta}_{AFT} + \sigma \varepsilon_i, \qquad \varepsilon_i = \frac{Y_i - \mathbf{z}'_i \boldsymbol{\beta}_{AFT}}{\sigma}, \tag{2.7}$$
where $\sigma$ is the scale parameter, and $\varepsilon_i$ is a random error term assumed to have a particular
density function, $f(\cdot)$. The parameter $\boldsymbol{\beta}_{AFT}$ describes the effect of the covariates, $\mathbf{Z}_i$, on
the log failure time, $Y_i$. A positive value for $\boldsymbol{\beta}_{AFT}$ signifies a deceleration of failure (i.e.,
survival time is lengthened) for an increasing value of $\mathbf{Z}_i$. Similarly, a negative value for
$\boldsymbol{\beta}_{AFT}$ implies an acceleration in failure (i.e., survival time is shortened) for an increasing
value of $\mathbf{Z}_i$. The distribution of failure time $T_i$ depends on the distribution assumed for $\varepsilon_i$.
Exponentiating both sides of (2.7) yields:
$$T_i = e^{\mathbf{z}'_i \boldsymbol{\beta}_{AFT}}\, e^{\sigma \varepsilon_i} = e^{\mathbf{z}'_i \boldsymbol{\beta}_{AFT}}\, T_i^*, \tag{2.8}$$
where $T_i^* = e^{\sigma \varepsilon_i}$. Suppose $T_i^*$ has a known hazard function, $h_{AFT0}(t)$; then the corresponding hazard function for $T_i$ is
$$h_{AFTi}(t) = h_{AFT0}(t e^{-\mathbf{z}'_i \boldsymbol{\beta}_{AFT}})\, e^{-\mathbf{z}'_i \boldsymbol{\beta}_{AFT}}, \tag{2.9}$$
where $h_{AFT0}(t)$ is also referred to as the baseline hazard function of failure time $T_i$.
The survivor function of failure time $T_i$ takes the form:
$$S_{AFTi}(t) = S_{AFT0}(t e^{-\mathbf{z}'_i \boldsymbol{\beta}_{AFT}}), \tag{2.10}$$
where $S_{AFT0}(t)$ is the survival function of $T_i^*$, also termed the baseline survivor function
of $T_i$. In this formulation, the model assumes that the covariate effect, $\boldsymbol{\beta}_{AFT}$, acts multiplicatively on the time scale for the survivor function. It implies that the survivor curve
for an individual in a treatment group is a time-scale change of that for the control group
with all other covariates fixed: the survivor functions for the treatment and control groups
exhibit the same shape, but with one group showing either a delay or an advancement of
failure time. Equation (2.9) shows the model formulation on the hazard scale. Note that
this is different from the PH model, where the parameter $\boldsymbol{\beta}_{PH}$ describes the multiplicative
effect of the covariates on the hazard.
The AFT model can handle crossing of hazard functions (but not of survival functions)
and, like the PH model, assumes an immediate treatment effect at $t = 0$. Any distribution
for which $\log T_i$ belongs to the location-scale family is a member of the accelerated failure time family.
The more commonly used distributions are the Exponential, Weibull, Lognormal and Log-logistic.
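The time-scale interpretation can be checked numerically. The sketch below assumes a Weibull baseline written as $S_0(t) = \exp(-(\lambda t)^k)$; this parameterization, the function names, and the numerical values are assumptions made for illustration. Here `zb` stands for the linear predictor $\mathbf{z}'_i \boldsymbol{\beta}_{AFT}$.

```python
import math

def weibull_surv(t, k, lam):
    """Assumed baseline survivor function S0(t) = exp(-(lam * t)^k)."""
    return math.exp(-((lam * t) ** k))

def weibull_haz(t, k, lam):
    """Baseline hazard matching weibull_surv."""
    return k * lam * (lam * t) ** (k - 1)

def aft_surv(t, zb, k, lam):
    """Survivor function of T = exp(zb) * T*, where T* has the baseline law."""
    return weibull_surv(t * math.exp(-zb), k, lam)

def aft_haz(t, zb, k, lam):
    """Hazard of T as in (2.9)."""
    return weibull_haz(t * math.exp(-zb), k, lam) * math.exp(-zb)

k, lam, zb = 1.5, 1.0, 0.4
median_control = math.log(2.0) ** (1.0 / k) / lam
median_treated = median_control * math.exp(zb)  # median is stretched by exp(zb)
surv_at_treated_median = aft_surv(median_treated, zb, k, lam)
```

With a positive linear predictor the treated median is the control median multiplied by exp(zb), which is the lengthening of survival time described above.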
2.2.1 Inference
We present maximum likelihood estimation for the Weibull case. The Weibull is a very popular choice of survival distribution and we use this distribution throughout the project for
exemplifying the AFT model. Since the Weibull distribution is a member of the location-scale family on the log scale, the model can be written as in (2.7): if $T_i \sim$ Weibull, then $\log(T_i)$ follows an extreme value distribution, and
$$Y_i = \log T_i = \mathbf{z}'_i \boldsymbol{\beta}_{AFT} + \sigma \varepsilon_i, \tag{2.11}$$
where $\varepsilon_i$ follows an extreme value distribution with density $f(\varepsilon_i) = \exp(\varepsilon_i - \exp(\varepsilon_i))$, $-\infty < \varepsilon_i < \infty$.
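The likelihood function is:
$$L_{AFT}(\boldsymbol{\beta}_{AFT}, \sigma) = \prod_{i \in D} \frac{1}{\sigma} \exp\left[ \frac{y_i - \mathbf{z}'_i \boldsymbol{\beta}_{AFT}}{\sigma} - \exp\left( \frac{y_i - \mathbf{z}'_i \boldsymbol{\beta}_{AFT}}{\sigma} \right) \right] \prod_{i \in C} \exp\left[ -\exp\left( \frac{y_i - \mathbf{z}'_i \boldsymbol{\beta}_{AFT}}{\sigma} \right) \right], \tag{2.12}$$
where the first product is over the probability density functions of all observed events, $D$, and the
second product is over the survivor functions of all censored observations, $C$.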
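The log-likelihood function is:
$$\log L_{AFT}(\boldsymbol{\beta}_{AFT}, \sigma) = -k \log \sigma + \sum_{i \in D} \frac{y_i - \mathbf{z}'_i \boldsymbol{\beta}_{AFT}}{\sigma} - \sum_{i=1}^{n} \exp\left( \frac{y_i - \mathbf{z}'_i \boldsymbol{\beta}_{AFT}}{\sigma} \right), \tag{2.13}$$
where $k$ is the number of events or deaths.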
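Taking the first and second derivatives of the log-likelihood function (2.13) with respect
to the $r$th element of $\boldsymbol{\beta}_{AFT}$ and $\sigma$ yields, writing $w_i = (y_i - \mathbf{z}'_i \boldsymbol{\beta}_{AFT})/\sigma$ for brevity,
$$\frac{\partial \log L_{AFT}}{\partial \beta_{AFTr}} = -\frac{1}{\sigma} \sum_{i \in D} z_{ir} + \frac{1}{\sigma} \sum_{i=1}^{n} z_{ir}\, e^{w_i} \tag{2.14a}$$
$$\frac{\partial \log L_{AFT}}{\partial \sigma} = -\frac{k}{\sigma} - \frac{1}{\sigma} \sum_{i \in D} w_i + \frac{1}{\sigma} \sum_{i=1}^{n} w_i\, e^{w_i} \tag{2.14b}$$
$$\frac{\partial^2 \log L_{AFT}}{\partial \beta_{AFTr}\, \partial \beta_{AFTs}} = -\frac{1}{\sigma^2} \sum_{i=1}^{n} z_{ir} z_{is}\, e^{w_i} \tag{2.14c}$$
$$\frac{\partial^2 \log L_{AFT}}{\partial \sigma^2} = \frac{k}{\sigma^2} + \frac{2}{\sigma^2} \sum_{i \in D} w_i - \frac{2}{\sigma^2} \sum_{i=1}^{n} w_i\, e^{w_i} - \frac{1}{\sigma^2} \sum_{i=1}^{n} w_i^2\, e^{w_i} \tag{2.14d}$$
$$\frac{\partial^2 \log L_{AFT}}{\partial \beta_{AFTr}\, \partial \sigma} = \frac{1}{\sigma^2} \sum_{i \in D} z_{ir} - \frac{1}{\sigma^2} \sum_{i=1}^{n} z_{ir}\, e^{w_i} - \frac{1}{\sigma^2} \sum_{i=1}^{n} z_{ir}\, w_i\, e^{w_i} \tag{2.14e}$$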
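A quick way to guard against sign errors in derivatives of this kind is to compare them with finite differences of the log-likelihood. The sketch below does this for (2.13), (2.14a) and (2.14b) with a single covariate; the function names and the toy data (log times `y`, event indicators `delta`, covariate `z`) are hypothetical.

```python
import math

def residuals(y, z, beta, sigma):
    """Standardized residuals (y_i - z_i * beta) / sigma."""
    return [(yi - zi * beta) / sigma for yi, zi in zip(y, z)]

def loglik(y, delta, z, beta, sigma):
    """Weibull AFT log-likelihood (2.13) for one covariate."""
    w = residuals(y, z, beta, sigma)
    k = sum(delta)  # number of events
    return (-k * math.log(sigma)
            + sum(wi for wi, di in zip(w, delta) if di)
            - sum(math.exp(wi) for wi in w))

def score(y, delta, z, beta, sigma):
    """First derivatives (2.14a) and (2.14b)."""
    w = residuals(y, z, beta, sigma)
    k = sum(delta)
    u_beta = (-sum(zi for zi, di in zip(z, delta) if di)
              + sum(zi * math.exp(wi) for zi, wi in zip(z, w))) / sigma
    u_sigma = (-k - sum(wi for wi, di in zip(w, delta) if di)
               + sum(wi * math.exp(wi) for wi in w)) / sigma
    return u_beta, u_sigma

# Toy data (illustrative only).
y = [0.1, 0.7, 1.2, 1.5, 2.0]
delta = [1, 1, 0, 1, 0]
z = [0, 1, 0, 1, 1]
beta, sigma, h = 0.3, 0.8, 1e-6

u_beta, u_sigma = score(y, delta, z, beta, sigma)
fd_beta = (loglik(y, delta, z, beta + h, sigma) - loglik(y, delta, z, beta - h, sigma)) / (2 * h)
fd_sigma = (loglik(y, delta, z, beta, sigma + h) - loglik(y, delta, z, beta, sigma - h)) / (2 * h)
```

The analytic and finite-difference values should agree to several decimal places; the same check can be applied to the second derivatives by differencing the score.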
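The maximum likelihood estimates of the parameters $\boldsymbol{\beta}_{AFT}$ and $\sigma$ are obtained by
solving $\mathbf{U}(\boldsymbol{\beta}_{AFT}) = \mathbf{0}$ and $U(\sigma) = 0$, where $\mathbf{U}(\boldsymbol{\beta}_{AFT})$ is the score vector with elements
$\partial \log L_{AFT} / \partial \beta_{AFTr}$, for $r = 1, \dots, p$, and $U(\sigma) = \partial \log L_{AFT} / \partial \sigma$. This is easily handled using a Newton-Raphson algorithm. The asymptotic variance of the estimates of $\boldsymbol{\beta}^*_{AFT} = [\boldsymbol{\beta}_{AFT}, \sigma]$, a $(p+1) \times 1$ vector, is
the inverse of the information matrix, $I(\boldsymbol{\beta}^*_{AFT})$, where
$$I(\boldsymbol{\beta}^*_{AFT}) = E\left( -\frac{\partial^2 \log L_{AFT}}{\partial \boldsymbol{\beta}^*_{AFT}\, \partial \boldsymbol{\beta}^{*T}_{AFT}} \right),$$
a matrix with dimensions $(p+1) \times (p+1)$. Similar to the asymptotic properties of the Cox PH
model, it can be shown that:
• The likelihood ratio test statistic, $-2 \log[L(\boldsymbol{\beta}^*_{AFT}) / L(\hat{\boldsymbol{\beta}}^*_{AFT})]$, has a $\chi^2$ distribution
with $p+1$ degrees of freedom.
• The score function, $\mathbf{U}(\boldsymbol{\beta}^*_{AFT})$, has a $N(\mathbf{0}, I(\boldsymbol{\beta}^*_{AFT}))$ distribution.
• The Wald test statistic, $(\hat{\boldsymbol{\beta}}^*_{AFT} - \boldsymbol{\beta}^*_{AFT})' I(\hat{\boldsymbol{\beta}}^*_{AFT}) (\hat{\boldsymbol{\beta}}^*_{AFT} - \boldsymbol{\beta}^*_{AFT})$, has a $\chi^2$ distribution
with $p+1$ degrees of freedom.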
2.3 The Accelerated Hazards Model
The Accelerated Hazards (AH) model, proposed by Chen & Wang (2000), allows for
greater flexibility in the modeling of survival data. The model for the hazard function
of an individual $i$ with failure time $T_i$ is written as follows:
$$h_{AHi}(t) = h_{AH0}(t e^{\mathbf{z}'_i \boldsymbol{\beta}_{AH}}), \tag{2.15}$$
where $h_{AH0}$ is the baseline hazard function. In this model, $e^{\mathbf{z}'_i \boldsymbol{\beta}_{AH}}$ characterizes how the
covariates $\mathbf{Z}_i$ alter the time scale of the underlying hazard function. For instance, $\boldsymbol{\beta}_{AH} > 0$
or $\boldsymbol{\beta}_{AH} < 0$ imply acceleration or deceleration of the time scale for the hazard, respectively.
As an example, if there exists one covariate, $Z_i$, that takes a value of 0 for a control group
and 1 for a treatment group, then $e^{\beta_{AH}} = 1/2$ means that the hazard of the treatment group
progresses at half the speed of that of the control group, since the treatment hazard at time $t$ equals the control hazard at time $t/2$. Similarly, $e^{\beta_{AH}} = 2$ means that the
hazard of the treatment group progresses at twice the speed of that of the control group;
$e^{\beta_{AH}} = 1$ implies no difference between the two groups.
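Alternatively, this model can be written in terms of the survival function. Integrating (2.15) gives the cumulative hazard $H_{AHi}(t) = e^{-\mathbf{z}'_i \boldsymbol{\beta}_{AH}} H_{AH0}(t e^{\mathbf{z}'_i \boldsymbol{\beta}_{AH}})$, so that
$$S_{AHi}(t) = \left[ S_{AH0}\left( t e^{\mathbf{z}'_i \boldsymbol{\beta}_{AH}} \right) \right]^{\exp(-\mathbf{z}'_i \boldsymbol{\beta}_{AH})}, \tag{2.16}$$
where $S_{AH0}$ is the survivor function for the baseline group for which all covariates take a
value of 0.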
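The link between the hazard form (2.15) and the survivor function can be verified numerically by integrating the hazard. The sketch below assumes a Weibull baseline with hazard $k\lambda(\lambda t)^{k-1}$ and default values $k = 1.5$, $\lambda = 1$; the parameterization, function names and values are assumptions for illustration. The closed form used is the baseline survivor function evaluated at $t e^{\mathbf{z}'\boldsymbol{\beta}}$ and raised to the power $e^{-\mathbf{z}'\boldsymbol{\beta}}$, which is what integrating (2.15) yields.

```python
import math

def baseline_hazard(t, k=1.5, lam=1.0):
    """Assumed Weibull baseline hazard."""
    return k * lam * (lam * t) ** (k - 1)

def baseline_surv(t, k=1.5, lam=1.0):
    """Baseline survivor function matching baseline_hazard."""
    return math.exp(-((lam * t) ** k))

def ah_surv_closed_form(t, zb):
    """Survivor function implied by h_i(t) = h_0(t * exp(zb))."""
    return baseline_surv(t * math.exp(zb)) ** math.exp(-zb)

def ah_surv_numeric(t, zb, steps=20000):
    """Same quantity via a midpoint-rule integral of the AH hazard."""
    width = t / steps
    cum_hazard = sum(
        baseline_hazard((i + 0.5) * width * math.exp(zb)) for i in range(steps)
    ) * width
    return math.exp(-cum_hazard)

closed = ah_surv_closed_form(2.0, 0.7)
numeric = ah_surv_numeric(2.0, 0.7)
```

The two values should agree up to the discretization error of the midpoint rule.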
Unlike the PH and the AFT models, the AH model can accommodate crossing of hazard
and survivor curves. Furthermore, the AH model allows the hazard curves of both the
treatment and control groups to start at the same time point. This is particularly useful
in randomized controlled trials where it is more reasonable to assume a comparable risk
or hazard between groups at $t = 0$. A restriction in the AH model that is not found in
the PH and AFT models is its inability to handle situations where the hazard function is
constant over time (e.g., the exponential distribution), because rescaling the time axis of a constant hazard leaves it unchanged. Therefore, it is imperative to check for
non-constancy of the baseline hazard function before implementing this model.
2.3.1 Inference
When the baseline hazard function, $h_{AH0}(t)$, is fully parameterized, estimation of the parameters in the AH model can be performed by maximum likelihood. When $h_{AH0}(t)$ is unspecified, the usual semiparametric estimation procedure for the PH model can be adopted
for estimation of the AH model, as will be described in this section. Chen & Wang (2000)
proposed an estimation procedure for the AH model motivated by the fact that the only difference between the hazard functions $h_{AHi}(t)$ and $h_{AH0}(t)$ in (2.15) is a time scale change.
Specifically, notice that for a random event time $T$ with a hazard function $h(t)$, its transformation $T^* = T e^{\mathbf{z}'\boldsymbol{\beta}_a}$ will have a hazard function of the form $h^*(t) = h(t e^{-\mathbf{z}'\boldsymbol{\beta}_a})\, e^{-\mathbf{z}'\boldsymbol{\beta}_a}$,
where $\boldsymbol{\beta}_a$ is a vector of arbitrary real numbers. If the AH model is true for $T$,
with the true parameter vector $\boldsymbol{\beta}_0$, such that $h(t) = h_0(t e^{\mathbf{z}'\boldsymbol{\beta}_0})$, the hazard function for the
transformed event time $T^*$ then takes the form:
$$h^*(t) = h(t e^{-\mathbf{z}'\boldsymbol{\beta}_a})\, e^{-\mathbf{z}'\boldsymbol{\beta}_a} = h_0(t e^{\mathbf{z}'(\boldsymbol{\beta}_0 - \boldsymbol{\beta}_a)})\, e^{-\mathbf{z}'\boldsymbol{\beta}_a}. \tag{2.17}$$
Notice that when $\boldsymbol{\beta}_0 = \boldsymbol{\beta}_a$, the above equation becomes
$$h^*(t) = h_0(t)\, e^{-\mathbf{z}'\boldsymbol{\beta}_0}, \tag{2.18}$$
which implies that the transformed time, $T^*$, recovers the proportionality between hazard
functions with a ratio of $e^{-\mathbf{z}'\boldsymbol{\beta}_0}$ when the true value of the parameter $\boldsymbol{\beta}_0$ is used for the transformation. Hence, a partial likelihood method using (2.18) as a working model on the
transformed times may be used for the estimation of the AH model.
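The algorithm developed by Chen & Wang (2000) is as follows:
1. Multiply the observed event times $x_i$ by a positive number $e^{\mathbf{z}'_i \boldsymbol{\beta}_a}$, where $\boldsymbol{\beta}_a$ is a $p \times 1$
vector of arbitrary real numbers. This transformation allows for a rescaling of the
time axis for individual $i$, while keeping the individuals in the baseline group on the
original time scale.
2. Consider the working PH model (2.18) on the transformed times. The naive partial
likelihood function takes the form
$$L_{PH}(\boldsymbol{\beta}_a, \boldsymbol{\beta}) = \prod_{i=1}^{n} \left[ \frac{e^{-\mathbf{z}'_i \boldsymbol{\beta}}}{\sum_{j=1}^{n} I(x_j e^{\mathbf{z}'_j \boldsymbol{\beta}_a} \ge x_i e^{\mathbf{z}'_i \boldsymbol{\beta}_a})\, e^{-\mathbf{z}'_j \boldsymbol{\beta}}} \right]^{\delta_i}. \tag{2.19}$$
The naive partial likelihood estimator of $\boldsymbol{\beta}$ can be defined as the value solving the following score equations, obtained by taking the first derivative of the logarithm of (2.19) with
respect to the $r$th element of $\boldsymbol{\beta}$:
$$-\frac{\partial \log L_{PH}(\boldsymbol{\beta}_a, \boldsymbol{\beta})}{\partial \beta_r} = \sum_{i=1}^{n} \delta_i \left[ z_{ir} - \frac{\sum_{j=1}^{n} I(x_j e^{\mathbf{z}'_j \boldsymbol{\beta}_a} \ge x_i e^{\mathbf{z}'_i \boldsymbol{\beta}_a})\, e^{-\mathbf{z}'_j \boldsymbol{\beta}}\, z_{jr}}{\sum_{j=1}^{n} I(x_j e^{\mathbf{z}'_j \boldsymbol{\beta}_a} \ge x_i e^{\mathbf{z}'_i \boldsymbol{\beta}_a})\, e^{-\mathbf{z}'_j \boldsymbol{\beta}}} \right] = 0, \tag{2.20}$$
where $\beta_r$ indicates the parameter associated with the $r$th covariate, $z_{ir}$, $r = 1, \dots, p$.
3. Update $\boldsymbol{\beta}_a$ in Step 1 to take the value of the solution of (2.20) in
Step 2. Repeat Steps 1 and 2 until convergence to obtain the estimate $\hat{\boldsymbol{\beta}}_{AH}$.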
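The above algorithm is equivalent to solving a set of equations $\mathbf{U}(\boldsymbol{\beta}_{AH}) = \mathbf{0}$, where
$\mathbf{U}(\boldsymbol{\beta}_{AH}) = (U_1(\boldsymbol{\beta}_{AH}), \dots, U_p(\boldsymbol{\beta}_{AH}))'$ and
$$U_r(\boldsymbol{\beta}_{AH}) = \sum_{i=1}^{n} \delta_i \left[ z_{ir} - \frac{\sum_{j=1}^{n} I(x_j e^{\mathbf{z}'_j \boldsymbol{\beta}_{AH}} \ge x_i e^{\mathbf{z}'_i \boldsymbol{\beta}_{AH}})\, e^{-\mathbf{z}'_j \boldsymbol{\beta}_{AH}}\, z_{jr}}{\sum_{j=1}^{n} I(x_j e^{\mathbf{z}'_j \boldsymbol{\beta}_{AH}} \ge x_i e^{\mathbf{z}'_i \boldsymbol{\beta}_{AH}})\, e^{-\mathbf{z}'_j \boldsymbol{\beta}_{AH}}} \right], \quad r = 1, \dots, p. \tag{2.21}$$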
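In the two-sample case where there is only one covariate $Z_i$ taking a value of 0 or 1 to
indicate two treatment groups, the estimating equation $U(\beta_{AH})$ can be written as:
$$\begin{aligned}
U(\beta_{AH}) &= \sum_{i_1 \in D^{(1)}} \delta_{i_1} \left[ 1 - \frac{\sum_{j=1}^{n} I(x_j e^{z_j \beta_{AH}} \ge x_{i_1} e^{\beta_{AH}})\, e^{-z_j \beta_{AH}}\, z_j}{\sum_{j=1}^{n} I(x_j e^{z_j \beta_{AH}} \ge x_{i_1} e^{\beta_{AH}})\, e^{-z_j \beta_{AH}}} \right] - \sum_{i_0 \in D^{(0)}} \delta_{i_0} \left[ \frac{\sum_{j=1}^{n} I(x_j e^{z_j \beta_{AH}} \ge x_{i_0})\, e^{-z_j \beta_{AH}}\, z_j}{\sum_{j=1}^{n} I(x_j e^{z_j \beta_{AH}} \ge x_{i_0})\, e^{-z_j \beta_{AH}}} \right] \\
&= \sum_{i_1 \in D^{(1)}} \delta_{i_1} \left[ \frac{\sum_{j_0 \in D^{(0)}} I(x_{j_0} \ge x_{i_1} e^{\beta_{AH}})}{\sum_{j_0 \in D^{(0)}} I(x_{j_0} \ge x_{i_1} e^{\beta_{AH}}) + \sum_{j_1 \in D^{(1)}} I(x_{j_1} e^{\beta_{AH}} \ge x_{i_1} e^{\beta_{AH}})\, e^{-\beta_{AH}}} \right] \\
&\quad - \sum_{i_0 \in D^{(0)}} \delta_{i_0} \left[ \frac{\sum_{j_1 \in D^{(1)}} I(x_{j_1} e^{\beta_{AH}} \ge x_{i_0})\, e^{-\beta_{AH}}}{\sum_{j_0 \in D^{(0)}} I(x_{j_0} \ge x_{i_0}) + \sum_{j_1 \in D^{(1)}} I(x_{j_1} e^{\beta_{AH}} \ge x_{i_0})\, e^{-\beta_{AH}}} \right],
\end{aligned} \tag{2.22}$$
where $D^{(0)}$ and $D^{(1)}$ indicate the subset of individuals in the control and treatment groups,
respectively.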
Define
$$Y^{(q)}(t) = \sum_{i=1}^{n} I(X_i \ge t,\ Z_i = q)$$
and
$$N^{(q)}(t) = \sum_{i=1}^{n} I(X_i \le t,\ \Delta_i = 1,\ Z_i = q),$$
where $q = 0, 1$, representing the control and treatment groups respectively. $N^{(q)}(t)$ is the
number of observed deaths or failures up to time $t$ in group $q$, and $Y^{(q)}(t)$ is the number of
individuals still at risk at time $t$ in group $q$. The estimating equation (2.22) can be rewritten
in the counting process framework as:
$$U(\beta_{AH}) = \int \frac{Y^{(0)}(t)}{Y^{(0)}(t) + Y^{(1)}(t/e^{\beta_{AH}})/e^{\beta_{AH}}}\, dN^{(1)}(t/e^{\beta_{AH}}) - \int \frac{Y^{(1)}(t/e^{\beta_{AH}})/e^{\beta_{AH}}}{Y^{(0)}(t) + Y^{(1)}(t/e^{\beta_{AH}})/e^{\beta_{AH}}}\, dN^{(0)}(t). \tag{2.23}$$
Notice that in (2.23), the number of individuals at risk in group 1 at time $t/e^{\beta_{AH}}$,
$Y^{(1)}(t/e^{\beta_{AH}})$, is weighted by a factor of $1/e^{\beta_{AH}}$ to put the treatment group at a comparable hazard as the control group at $t = 0$.
Asymptotically, it can be shown that solving the estimating equation $U(\beta_{AH})$ presented
in (2.23) yields estimates for $\beta_{AH}$ that are normally distributed. A caution in solving the
proposed estimating equation is its tendency to yield multiple solutions. This issue arises
because $U(\beta_{AH})$ is a discontinuous function of $\beta_{AH}$. One way of resolving this issue is to take the zero-crossing of $U(\beta_{AH})$ as the estimate for $\beta_{AH}$. A zero-crossing is defined as the $\beta_{AH}$ that
satisfies $U(\beta_{AH}+)\,U(\beta_{AH}-) \le 0$ (Tsiatis 1990).
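The two-sample estimating function and a crude zero-crossing search can be sketched as below. This is an illustration under stated assumptions, not the implementation used in the project: risk sets on the transformed time scale are taken to include ties, the search is a plain grid scan, and the function names and toy data are hypothetical.

```python
import math

def ah_score_two_sample(x, delta, z, beta):
    """Evaluate U(beta) for a single 0/1 covariate z."""
    xt = [xi * math.exp(zi * beta) for xi, zi in zip(x, z)]  # transformed times
    w = [math.exp(-zi * beta) for zi in z]                   # working PH weights
    u = 0.0
    for i in range(len(x)):
        if not delta[i]:
            continue  # only observed events contribute a term
        at_risk = [j for j in range(len(x)) if xt[j] >= xt[i]]
        s0 = sum(w[j] for j in at_risk)
        s1 = sum(w[j] * z[j] for j in at_risk)
        u += z[i] - s1 / s0
    return u

def zero_crossings(x, delta, z, grid):
    """Adjacent grid points between which U changes sign or hits zero."""
    vals = [ah_score_two_sample(x, delta, z, b) for b in grid]
    return [(grid[i], grid[i + 1])
            for i in range(len(grid) - 1)
            if vals[i] * vals[i + 1] <= 0]

# Toy data (illustrative only): identical event times in both groups.
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
delta = [1, 1, 1, 1, 1, 1]
z = [0, 0, 0, 1, 1, 1]
u_at_zero = ah_score_two_sample(x, delta, z, 0.0)
brackets = zero_crossings(x, delta, z, [-0.5, 0.0, 0.5])
```

Because U is a step function, several brackets may be returned on a fine grid; a rule for choosing among them (for example the bracket closest to an initial estimate) would be an additional design decision.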
If a solution to (2.23), $\hat{\beta}_{AH}$, is found, the cumulative baseline hazard function, $H_{AH0}(t)$,
in the two-sample case can be estimated by using the Breslow estimator,
$$\hat{H}_{AH0}(\cdot\,; \hat{\beta}_{AH}) = \int_0^{\cdot} \frac{dN^{(0)}(t) + dN^{(1)}(t/e^{\hat{\beta}_{AH}})}{Y^{(0)}(t) + Y^{(1)}(t/e^{\hat{\beta}_{AH}})/e^{\hat{\beta}_{AH}}}, \tag{2.24}$$
where $H_{AH0}(\cdot) = \int_0^{\cdot} h_{AH0}(t)\, dt$.
Throughout this project, estimation for βAH is done by solving the estimating equation
shown in (2.23). The corresponding variance estimation procedure for the algorithm above
was also derived by Chen & Wang (2000). Chen & Jewell (2001) presented an alterna-
tive method to estimate the variance without estimating the baseline hazard function using
asymptotic linear theory. The method used in this project to estimate the variance for βAH
from the AH model is based on the development by Chen & Jewell (2001). The theory un-
derpinning variance estimation is challenging and is omitted here. However, an algorithm
for computing the variance estimator, reproduced from Chen & Jewell (2001), is provided
in the Appendix.
2.4 Comparison of the functional forms for the PH, AFT, and AH Models
To compare these models, we focus on a two-sample situation as above, where the scalar
covariate $Z_i$ indicates the grouping (control or treatment). When the underlying failure
time distribution is Weibull, the PH, AFT and AH models are equivalent, and the treatment
effects in these models have the relationship
$$\beta_{PH} = -k\beta_{AFT} = (k-1)\beta_{AH}, \tag{2.25}$$
where $k$ is the shape parameter of the Weibull distribution ($k = 1/\sigma$), and $k \ne 1$. When the
underlying distribution is not a Weibull distribution, the three models are not equivalent:
each parameter has its own interpretation, and the choice of model may depend on the
question of interest. Table 2.1 outlines the differences in interpretation of the parameters
of the three models.
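The relationship between the three treatment effects can be verified directly from the hazard forms of the three models. The derivation below assumes the Weibull baseline hazard is parameterized as $h_0(t) = k\lambda(\lambda t)^{k-1}$; this parameterization is an assumption made for illustration.

```latex
% Assumed Weibull baseline hazard: h_0(t) = k \lambda (\lambda t)^{k-1}.
\begin{align*}
\text{PH:}  \quad & h_0(t)\, e^{\beta_{PH}} \\
\text{AFT:} \quad & h_0(t e^{-\beta_{AFT}})\, e^{-\beta_{AFT}}
      = k\lambda(\lambda t)^{k-1} e^{-(k-1)\beta_{AFT}}\, e^{-\beta_{AFT}}
      = h_0(t)\, e^{-k\beta_{AFT}} \\
\text{AH:}  \quad & h_0(t e^{\beta_{AH}})
      = k\lambda(\lambda t)^{k-1} e^{(k-1)\beta_{AH}}
      = h_0(t)\, e^{(k-1)\beta_{AH}}
\end{align*}
% Equating the three multipliers of h_0(t) gives
% \beta_{PH} = -k\,\beta_{AFT} = (k-1)\,\beta_{AH}.
```

When $k = 1$ the AH multiplier equals 1 for every value of $\beta_{AH}$, which is why the constant-hazard case is excluded.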
Model   Effect                     Interpretation
PH      β_PH > 0                   Treatment proportionally increases risk/hazard by a factor of exp(β_PH).
        β_PH < 0                   Treatment proportionally decreases risk/hazard by a factor of exp(β_PH).
AFT     β_AFT > 0                  Treatment decelerates failure time of the survivor function by a factor of exp(β_AFT).
        β_AFT < 0                  Treatment accelerates failure time of the survivor function by a factor of exp(β_AFT).
AH      β_AH > 0                   Treatment accelerates the risk/hazard by a factor of exp(β_AH).
        β_AH < 0                   Treatment decelerates the risk/hazard by a factor of exp(β_AH).
All     β_PH = β_AFT = β_AH = 0    Treatment does not have an effect.

Table 2.1: Parameter interpretation for PH, AFT and AH models
In the proportional hazards model, the relative risk, $e^{\beta_{PH}}$, quantifies the magnitude
of the risk ratio between the treatment and control groups. A positive value for $\beta_{PH}$ (or
$e^{\beta_{PH}} > 1$) suggests that the treatment group has a greater risk of failure than the baseline. A
negative value for $\beta_{PH}$ (or $e^{\beta_{PH}} < 1$) suggests a proportionately smaller risk in the treatment
group compared to the baseline. In the accelerated failure time model, a positive value for
$\beta_{AFT}$ (or $e^{\beta_{AFT}} > 1$) can be interpreted as a deceleration of failure time (or a lengthening
of survival time), while a negative value for $\beta_{AFT}$ (or $e^{\beta_{AFT}} < 1$) implies an acceleration of
failure time (or a shortening of the survival time) in the treatment group compared to the
baseline, with respect to the survivor curve. In the accelerated hazards model, interpretation of $\beta_{AH}$ depends on the shape of the baseline hazard function. If the hazard function is
increasing over time, a positive value for $\beta_{AH}$ (or $e^{\beta_{AH}} > 1$) implies that the treatment group
has a greater hazard than the control. On the other hand, if the hazard function is decreasing over time, a positive value for $\beta_{AH}$ (or $e^{\beta_{AH}} > 1$) implies that the treatment group has
a reduced hazard compared to the control. The effect of $\beta_{AH}$ is multiplicative on the time
scale operating on the hazard function.
Figure 2.1 shows the differences in hazard and survivor curves for the three models. The
proportional hazard model is restricted to scenarios with non-crossing hazard and survivor
curves. The accelerated failure time model can handle both crossing and non-crossing of
the hazard curves, but not of the survivor curves. The accelerated hazards model is the only
model that handles crossings of either or both of the hazard and survivor curves.
[Figure 2.1: Hazard (top row) and survivor (bottom row) functions of PH, AFT, and AH models. Non-crossing hazard functions for the AFT and AH models are not shown. Panels, left to right: Proportional Hazard, Accelerated Failure Time, Accelerated Hazards. Each panel plots Control and Treatment curves against Time in Days (0 to 5); the vertical axis is Risk in the top row and P(survival) in the bottom row, each from 0.0 to 1.0.]

2.5 Small sample investigation of the performance of the AH estimator
Simulation studies using different baseline hazard functions were conducted to investigate the small sample properties of the accelerated hazards model estimator. Failure times
were generated from a Weibull distribution with shape parameters $k$ = 0.5, 1.5, and 3, and
scale parameters $\lambda$ = 0.25, 1, and 1.3. The censoring time was simulated from a Uniform(0, $\tau$) distribution, where the value of $\tau$ was chosen to produce 27% and 53% censored observations; we also considered the case of no censoring. The covariate (scalar)
was an indicator for a control or a treatment group. Both control and treatment groups
had equal sample sizes, with the total sample size, $n$, taking values of 100, 500, 1000,
5000, and 10,000. Three magnitudes of treatment effects in the accelerated hazards model
scale, $\beta_{AH}$ = 0, 0.5, and 2, were investigated. A full factorial design was implemented with
the factors $k$ (3 levels), $\lambda$ (3 levels), censoring percent (3 levels), $n$ (5 levels), and $\beta_{AH}$
(3 levels), for 1000 runs of each factor combination.
Tables 2.2, 2.3, and 2.4 provide the
mean bias of the 1000 estimates of the treatment effect from analyses using PH and AH
models, the empirical variance of the estimators, as well as the ratio of the model-based
variance to the empirical variance for $n$ = 100, 500, and 1000. Note that the true values of
the treatment effect are provided in terms of the effect in the AH model. Bias is defined
as the difference between the mean estimate and the true value, $\beta_{AH}$. The bias for the PH
model is calculated similarly with the true value, $\beta_{PH} = (k-1)\beta_{AH}$. The empirical variance is the standard deviation of the 1000 estimates. The Variance Ratio compares the
model-based variance with the empirical variance. Let $s^*$ denote the sign of the difference
between the model-based variance and the empirical variance estimate, $s^*$ = sign(model-based variance − empirical variance estimate). Then the quantity Variance Ratio is defined
as $s^*$(model-based variance / empirical variance)$^{s^*}$. With this formulation, the range of possible values of the Variance Ratio is the union of the disjoint sets $(1, \infty)$ and $(-\infty, -1)$: when the ratio of the model-based to empirical variance is less than 1, the
Variance Ratio is the negative of its inverse. A negative ratio implies that the model-based
variance underestimates the empirical variance, while a positive ratio implies that the model-based variance overestimates it. A ratio of 1 implies that both variance estimates are
equivalent.
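The signed Variance Ratio can be computed as in the following sketch. The function name is hypothetical, and returning 1 when the two variances are exactly equal is an assumption chosen to match the statement that a ratio of 1 indicates equivalence.

```python
def variance_ratio(model_var, empirical_var):
    """Signed ratio: positive when the model-based variance is the larger one."""
    ratio = model_var / empirical_var
    if ratio >= 1.0:
        return ratio        # overestimation (or exact agreement when ratio == 1)
    return -1.0 / ratio     # underestimation: negative of the inverse ratio

over = variance_ratio(0.02, 0.01)
under = variance_ratio(0.01, 0.02)
equal = variance_ratio(0.01, 0.01)
```

A value such as -3 therefore reads as "the empirical variance is three times the model-based variance".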
The results show that the PH and AH estimators are generally unbiased for all scenarios
considered in the study. While the variance ratio is close to 1 in absolute value for the PH estimator, the AH
variance estimator behaves poorly when $n$ = 100, with the model-based variance severely
underestimating the empirical variance, as indicated by the negative variance ratios in Tables
2.2, 2.3 and 2.4.
Figure 2.2 shows a level plot of the variance ratios in the accelerated hazards model.
Warm colors (red, darker shades of orange), signifying underestimation of the variance,
are mostly identified in the $n$ = 100 case. Figure 2.3 displays the corresponding variance
ratios for the proportional hazards model; unlike the ratios based on the AH model,
the PH ratios fall close to −1 and 1 (shown in light green and yellow). As the sample
size grows to 500 and 1000, the asymptotic variance estimator for the AH model improves
greatly, showing comparable magnitude to that of the empirical variance. It seems that
model-based variance estimates for the AH model are only reliable when sample sizes are quite large.
An alternative strategy for estimating the variance for small sample sizes is to use a
bootstrap approach. Here, we propose a non-parametric bootstrap variance estimation procedure and compare it with the asymptotic variance estimator of Chen & Jewell (2001).
The following outlines the steps in performing a non-parametric bootstrap procedure, with
1000 bootstrap resamples, to obtain an estimate of the variance:
1. Sample with replacement from the dataset to obtain a new (resampled) dataset of the
same size.
2. Fit the AH model as outlined in Section 2.3 to obtain an estimate of the treatment
effect, $\hat{\beta}_{AHboot}$, based on these data.
3. Repeat steps 1-2 1000 times to obtain 1000 estimates $\hat{\beta}_{AHboot}$, and hence an empirical distribution of the estimator. Estimate the variance of $\hat{\beta}_{AH}$ by calculating the
variance of the 1000 $\hat{\beta}_{AHboot}$'s. The 95% confidence interval for $\beta_{AH}$ is given by the 2.5%
and 97.5% quantiles of the $\hat{\beta}_{AHboot}$'s.
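The resampling steps can be sketched generically as follows. The estimator passed in here, `log_mean_ratio`, is a hypothetical stand-in used only to keep the example self-contained; it is not the AH estimator, which would take its place in practice. Resamples for which the estimator returns no value are skipped, and the percentile limits are taken as simple order statistics of the bootstrap estimates.

```python
import math
import random
import statistics

def log_mean_ratio(sample):
    """Hypothetical stand-in estimator: log ratio of mean observed times."""
    treated = [x for x, _, z in sample if z == 1]
    control = [x for x, _, z in sample if z == 0]
    if not treated or not control:
        return None  # no estimate available for this resample
    return math.log(statistics.mean(treated) / statistics.mean(control))

def bootstrap_variance(data, estimator, n_boot=1000, seed=2010):
    """Non-parametric bootstrap variance and percentile interval."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_boot):
        # Step 1: resample subjects with replacement, keeping the sample size.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        # Step 2: re-estimate the treatment effect on the resampled data.
        est = estimator(resample)
        if est is not None:
            estimates.append(est)
    # Step 3: summarize the empirical distribution of the estimates.
    estimates.sort()
    m = len(estimates)
    var = statistics.variance(estimates)
    lower = estimates[int(0.025 * m)]
    upper = estimates[min(m - 1, int(0.975 * m))]
    return var, (lower, upper)

# Toy data (illustrative only): tuples of (observed time, event indicator, group).
rng = random.Random(1)
data = [(rng.expovariate(1.0 + 0.5 * z), 1, z) for z in [0, 1] * 20]
var_hat, (lower, upper) = bootstrap_variance(data, log_mean_ratio)
```

Fixing the seed makes the bootstrap reproducible, which helps when comparing the bootstrap variance with a model-based one on the same dataset.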
Figures 2.4, 2.5, and 2.6 compare the performance of the bootstrap and model-based
variance estimates based on 1000 runs from the same Weibull distributions as in the previous
simulation study, but focusing on small sample sizes of 100, 300 and 500. (Note again that
for each run of this study, 1000 bootstrap resamples are generated. Note also that in a
few of the resampled scenarios the estimating procedure did not converge, and these scenarios were omitted.) The bootstrapped variance estimates (shown in green circles) when
$n$ = 100 are close to the empirical variance estimates (blue circles) in general, except for a slight
deviation when the shape parameter, $k$, is 3. As expected, the variance increases as the percentage of censoring
increases. As the sample size increases from 100 to 300 and 500,
the model-based variance estimates approach the empirical and bootstrapped estimates. At
$n$ = 500, the only scenario where the model-based variance still performs poorly is when the
censoring percentage is high (at 53%), the treatment effect is large (te = 2, that is, $\beta_{AH}$ = 2), and the shape
parameter, $k$, is 1.5.
Figures 2.7, 2.8 and 2.9 display the coverage probabilities of nominal 95% confidence intervals based on the two variance estimators (model-based and bootstrap) from the AH model, for sample sizes $n$ = 100,
300, and 500, respectively. The bootstrapped estimate of the variance attained coverage
probabilities that are close to 95% in all cases. The AH model-based variance, on the other
hand, had low coverage probabilities in the small sample size case ($n$ = 100) but performed
better when the sample size grew to 300 and 500.
                        | Accelerated Hazards Model                                                     | Proportional Hazards Model (β_PH = β_AH(k−1))
                        | Bias (Empirical Variance)                      | Variance Ratio               | Bias (Empirical Variance)                      | Variance Ratio
Cen.% Size  Scale Shape |         β_AH=0        β_AH=0.5          β_AH=2 |   β_AH=0  β_AH=0.5    β_AH=2 |         β_AH=0        β_AH=0.5          β_AH=2 |   β_AH=0  β_AH=0.5    β_AH=2
0%    100   0.25  0.5   |  0.011 (0.183)   0.021 (0.187)   0.105 (0.360) |    -1.89     -1.90     -3.00 | -0.006 (0.041)  -0.002 (0.041)  -0.024 (0.053) |     1.01      1.04     -1.04
0%    100   0.25  1.5   | -0.005 (0.192)  -0.002 (0.194)   0.054 (0.335) |    -3.51     -3.57     -7.29 | -0.002 (0.045)  -0.001 (0.044)   0.029 (0.049) |    -1.09     -1.05      1.04
0%    100   0.25  3     |  0.009 (0.010)   0.001 (0.011)   0.000 (0.021) |     1.39      1.38      1.15 |  0.017 (0.044)   0.015 (0.055)   0.010 (0.395) |    -1.05     -1.09      1.13
0%    100   1     0.5   |  0.011 (0.197)   0.038 (0.203)   0.048 (0.341) |    -2.02     -2.14     -3.00 | -0.003 (0.045)  -0.011 (0.045)  -0.007 (0.054) |    -1.09     -1.06     -1.06
0%    100   1     1.5   |  0.008 (0.180)   0.019 (0.211)   0.026 (0.332) |    -3.43     -4.16     -7.09 |  0.003 (0.043)   0.003 (0.043)   0.018 (0.052) |    -1.04     -1.03     -1.03
0%    100   1     3     |  0.001 (0.011)  -0.004 (0.011)  -0.007 (0.019) |     1.23      1.29      1.25 |  0.004 (0.046)   0.006 (0.053)   0.091 (0.318) |    -1.10     -1.05      1.23
0%    100   1.3   0.5   | -0.016 (0.191)   0.024 (0.187)   0.091 (0.361) |    -1.99     -1.91     -3.07 |  0.009 (0.043)  -0.002 (0.042)  -0.016 (0.053) |    -1.04      1.01     -1.04
0%    100   1.3   1.5   | -0.003 (0.177)  -0.032 (0.177)   0.015 (0.333) |    -3.49     -3.57     -7.21 | -0.002 (0.043)  -0.017 (0.042)   0.017 (0.050) |    -1.03     -1.00      1.01
0%    100   1.3   3     | -0.003 (0.010)  -0.003 (0.010)   0.008 (0.020) |     1.49      1.33      1.26 | -0.009 (0.043)   0.016 (0.050)   0.125 (0.353) |    -1.04      1.01      1.16
0%    500   0.25  0.5   |  0.003 (0.035)   0.005 (0.036)   0.035 (0.065) |     1.06      1.06      1.10 | -0.002 (0.009)  -0.001 (0.008)  -0.007 (0.010) |    -1.06     -1.03      1.03
0%    500   0.25  1.5   |  0.002 (0.034)  -0.006 (0.035)   0.006 (0.061) |     1.27      1.24      1.31 |  0.000 (0.008)  -0.001 (0.008)   0.006 (0.009) |    -1.05     -1.02      1.06
0%    500   0.25  3     |  0.001 (0.002)  -0.001 (0.002)  -0.001 (0.004) |     1.05      1.10      1.04 |  0.003 (0.008)  -0.002 (0.010)   0.021 (0.061) |    -1.00      1.00     -1.02
0%    500   1     0.5   |  0.000 (0.033)   0.004 (0.036)   0.012 (0.065) |     1.15      1.11      1.11 |  0.000 (0.008)   0.000 (0.008)   0.001 (0.010) |     1.01     -1.02     -1.03
0%    500   1     1.5   |  0.000 (0.033)  -0.007 (0.037)   0.004 (0.067) |     1.33      1.20      1.15 | -0.001 (0.008)  -0.005 (0.008)   0.003 (0.010) |     1.00     -1.03     -1.04
0%    500   1     3     | -0.002 (0.002)   0.002 (0.002)  -0.001 (0.004) |     1.07      1.07      1.07 | -0.003 (0.008)   0.006 (0.009)   0.016 (0.058) |    -1.04      1.07      1.02
0%    500   1.3   0.5   |  0.000 (0.032)   0.012 (0.033)   0.042 (0.074) |     1.23      1.17     -1.02 |  0.000 (0.008)  -0.004 (0.008)  -0.009 (0.010) |     1.02      1.06     -1.06
0%    500   1.3   1.5   | -0.002 (0.033)  -0.008 (0.035)   0.021 (0.065) |     1.27      1.22      1.25 | -0.001 (0.008)  -0.002 (0.009)   0.005 (0.009) |    -1.03     -1.05      1.07
0%    500   1.3   3     |  0.003 (0.002)  -0.002 (0.002)  -0.001 (0.004) |     1.10      1.10      1.10 |  0.005 (0.009)  -0.003 (0.009)   0.025 (0.056) |    -1.06      1.05      1.07
0%    1000  0.25  0.5   | -0.005 (0.015)   0.000 (0.017)   0.012 (0.029) |     1.17      1.05      1.15 |  0.003 (0.004)   0.001 (0.004)  -0.001 (0.005) |     1.09      1.01      1.04
0%    1000  0.25  1.5   | -0.001 (0.017)   0.004 (0.017)   0.006 (0.033) |     1.08      1.12      1.12 | -0.001 (0.004)   0.001 (0.004)   0.000 (0.005) |    -1.06      1.01     -1.06
0%    1000  0.25  3     |  0.001 (0.001)  -0.003 (0.001)   0.001 (0.002) |     1.09      1.07     -1.04 |  0.002 (0.004)  -0.003 (0.005)   0.010 (0.029) |     1.02      1.03     -1.01
0%    1000  1     0.5   |  0.000 (0.015)   0.002 (0.018)   0.012 (0.033) |     1.14      1.02     -1.00 |  0.000 (0.004)   0.000 (0.004)  -0.002 (0.005) |     1.05     -1.03     -1.09
0%    1000  1     1.5   |  0.002 (0.016)   0.001 (0.016)   0.009 (0.034) |     1.14      1.13      1.15 |  0.001 (0.004)   0.001 (0.004)   0.002 (0.005) |    -1.01      1.05      1.06
0%    1000  1     3     | -0.002 (0.001)   0.000 (0.001)  -0.002 (0.002) |     1.07      1.09     -1.05 | -0.003 (0.004)   0.002 (0.005)   0.012 (0.030) |     1.01     -1.01     -1.05
0%    1000  1.3   0.5   |  0.001 (0.017)   0.006 (0.017)   0.010 (0.032) |     1.05      1.06      1.03 |  0.000 (0.004)  -0.002 (0.004)   0.000 (0.005) |    -1.02      1.04     -1.05
0%    1000  1.3   1.5   |  0.002 (0.016)   0.006 (0.018)   0.000 (0.032) |     1.17      1.06      1.15 |  0.001 (0.004)   0.002 (0.004)  -0.002 (0.005) |     1.02     -1.02      1.00
0%    1000  1.3   3     |  0.000 (0.001)   0.000 (0.001)   0.001 (0.002) |    -1.05      1.03      1.10 |  0.000 (0.004)   0.002 (0.005)   0.015 (0.030) |    -1.11     -1.01     -1.03

Table 2.2: Summary of bias, variance, and variance ratios for the PH and AH models fitted to a Weibull distribution for the 0% censoring case. Note that β_PH = β_AH(k−1).
                        | Accelerated Hazards Model                                                     | Proportional Hazards Model (β_PH = β_AH(k−1))
                        | Bias (Empirical Variance)                      | Variance Ratio               | Bias (Empirical Variance)                      | Variance Ratio
Cen.% Size  Scale Shape |         β_AH=0        β_AH=0.5          β_AH=2 |   β_AH=0  β_AH=0.5    β_AH=2 |         β_AH=0        β_AH=0.5          β_AH=2 |   β_AH=0  β_AH=0.5    β_AH=2
27%   100   0.25  0.5   |  0.007 (0.260)   0.028 (0.269)   0.056 (0.388) |    -2.13     -2.23     -2.91 | -0.001 (0.056)  -0.002 (0.057)  -0.008 (0.058) |    -1.01     -1.00      1.05
27%   100   0.25  1.5   | -0.009 (0.255)  -0.017 (0.270)  -0.006 (0.426) |    -4.65     -4.90     -8.88 | -0.003 (0.058)  -0.010 (0.061)   0.011 (0.071) |    -1.01     -1.05     -1.08
27%   100   0.25  3     |  0.001 (0.015)   0.005 (0.016)   0.001 (0.032) |     1.30      1.27     -1.06 |  0.001 (0.064)   0.028 (0.071)   0.092 (0.366) |    -1.09     -1.01      1.29
27%   100   1     0.5   |  0.006 (0.258)   0.041 (0.276)   0.049 (0.398) |    -2.15     -2.27     -2.92 | -0.004 (0.058)  -0.011 (0.058)  -0.004 (0.060) |    -1.02     -1.02      1.04
27%   100   1     1.5   | -0.012 (0.266)   0.001 (0.262)   0.068 (0.462) |    -4.66     -4.77     -9.19 | -0.003 (0.059)  -0.003 (0.055)   0.026 (0.066) |    -1.01      1.04      1.01
27%   100   1     3     |  0.000 (0.015)   0.004 (0.016)   0.014 (0.032) |     1.29      1.30     -1.03 |  0.001 (0.064)   0.020 (0.068)   0.085 (0.354) |    -1.10      1.02      1.34
27%   100   1.3   0.5   |  0.018 (0.242)   0.039 (0.272)   0.033 (0.392) |    -1.98     -2.22     -2.96 | -0.011 (0.055)  -0.006 (0.057)  -0.007 (0.065) |     1.03     -1.02     -1.05
27%   100   1.3   1.5   | -0.020 (0.284)   0.006 (0.271)   0.050 (0.494) |    -5.41     -5.21    -10.17 | -0.008 (0.064)   0.000 (0.056)   0.007 (0.067) |    -1.09      1.04      1.01
27%   100   1.3   3     | -0.008 (0.016)   0.001 (0.016)   0.001 (0.032) |     1.25      1.33     -1.11 | -0.015 (0.064)   0.023 (0.070)   0.081 (0.371) |    -1.09     -1.00      1.28
27%   500   0.25  0.5   |  0.004 (0.044)   0.017 (0.048)   0.038 (0.083) |     1.09      1.05      1.07 | -0.002 (0.011)  -0.006 (0.011)  -0.010 (0.012) |     1.04     -1.02     -1.02
27%   500   0.25  1.5   |  0.004 (0.048)   0.013 (0.053)   0.037 (0.116) |     1.23      1.14     -1.07 |  0.002 (0.012)   0.005 (0.012)   0.005 (0.012) |    -1.05     -1.10      1.02
27%   500   0.25  3     | -0.002 (0.003)  -0.003 (0.003)   0.004 (0.005) |     1.20      1.03      1.11 | -0.004 (0.011)   0.001 (0.014)   0.042 (0.075) |     1.04     -1.08     -1.05
27%   500   1     0.5   |  0.000 (0.046)   0.014 (0.050)   0.036 (0.085) |     1.07      1.00      1.02 |  0.000 (0.011)  -0.004 (0.012)  -0.008 (0.012) |     1.01     -1.04     -1.02
27%   500   1     1.5   |  0.010 (0.046)   0.016 (0.049)   0.019 (0.107) |     1.28      1.24      1.11 |  0.004 (0.011)   0.006 (0.011)   0.001 (0.013) |     1.00      1.05     -1.00
27%   500   1     3     |  0.001 (0.003)   0.002 (0.003)   0.004 (0.006) |     1.06      1.08      1.12 |  0.002 (0.011)   0.005 (0.013)   0.041 (0.079) |    -1.01      1.03     -1.08
27%   500   1.3   0.5   |  0.005 (0.042)   0.003 (0.047)   0.030 (0.075) |     1.13      1.07      1.20 | -0.003 (0.010)   0.000 (0.011)  -0.004 (0.011) |     1.10      1.01      1.05
27%   500   1.3   1.5   |  0.006 (0.048)   0.009 (0.049)   0.024 (0.112) |     1.23      1.27      1.06 |  0.002 (0.012)   0.002 (0.011)  -0.004 (0.013) |    -1.02      1.02     -1.03
27%   500   1.3   3     |  0.002 (0.003)   0.000 (0.003)  -0.003 (0.006) |     1.07      1.14      1.03 |  0.004 (0.012)   0.004 (0.013)   0.024 (0.074) |    -1.07      1.03     -1.04
27%   1000  0.25  0.5   | -0.002 (0.021)   0.000 (0.023)   0.007 (0.041) |     1.05      1.05      1.00 |  0.001 (0.005)   0.001 (0.005)   0.001 (0.006) |     1.03      1.01     -1.03
27%   1000  0.25  1.5   |  0.002 (0.022)   0.005 (0.025)   0.018 (0.049) |     1.18      1.09      1.28 |  0.001 (0.005)   0.000 (0.006)   0.004 (0.007) |     1.01      1.01     -1.08
27%   1000  0.25  3     |  0.000 (0.001)   0.002 (0.001)   0.000 (0.003) |     1.16      1.04      1.07 |  0.001 (0.005)   0.004 (0.006)   0.016 (0.035) |     1.03      1.04     -1.01
27%   1000  1     0.5   | -0.001 (0.023)  -0.002 (0.024)   0.006 (0.039) |     1.06     -1.00      1.07 |  0.000 (0.006)   0.003 (0.006)   0.002 (0.006) |     1.00     -1.03     -1.02
27%   1000  1     1.5   |  0.000 (0.023)  -0.001 (0.024)   0.026 (0.051) |     1.16      1.08      1.24 |  0.000 (0.006)  -0.001 (0.006)   0.004 (0.007) |    -1.05     -1.01     -1.04
27%   1000  1     3     |  0.000 (0.001)  -0.001 (0.002)   0.003 (0.003) |     1.08     -1.01     -1.04 | -0.001 (0.006)   0.001 (0.007)   0.027 (0.038) |    -1.03     -1.02     -1.08
27%   1000  1.3   0.5   | -0.005 (0.023)   0.009 (0.023)   0.014 (0.041) |    -1.01      1.06     -1.00 |  0.003 (0.006)  -0.003 (0.005)  -0.001 (0.006) |    -1.02      1.02     -1.01
27%   1000  1.3   1.5   |  0.002 (0.023)   0.010 (0.024)   0.009 (0.054) |     1.13      1.13      1.23 |  0.001 (0.006)   0.004 (0.005)   0.002 (0.007) |    -1.04      1.04     -1.01
27%   1000  1.3   3     |  0.000 (0.001)  -0.001 (0.001)   0.002 (0.003) |     1.01      1.06      1.01 |  0.001 (0.006)   0.003 (0.007)   0.010 (0.038) |    -1.08     -1.02     -1.11

Table 2.3: Summary of bias, variance, and variance ratios for the PH and AH models fitted to a Weibull distribution for the 27% censoring case. Note that β_PH = β_AH(k−1).
                                     Accelerated Hazards Model                                                  Proportional Hazards Model (βPH = βAH(k−1))
                                     Bias (Empirical Variance)                         Variance Ratio           Bias (Empirical Variance)                         Variance Ratio
Cen.%  Size   Scale (λ)  Shape (k)   βAH=0           βAH=0.5         βAH=2            βAH=0  βAH=0.5    βAH=2   βAH=0           βAH=0.5         βAH=2            βAH=0  βAH=0.5    βAH=2
53%    100    0.25       0.5         0.024 (0.398)   0.054 (0.391)   0.040 (0.543)    -2.63    -2.66    -3.18  -0.017 (0.087)  -0.012 (0.080)  -0.023 (0.095)    -1.00     1.07     1.03
                         1.5         0.036 (0.437)   0.055 (0.433)   0.019 (0.662)    -7.31    -7.93   -12.05   0.019 (0.084)   0.016 (0.085)   0.003 (0.109)     1.05     1.02    -1.07
                         3           0.001 (0.025)   0.011 (0.029)   0.086 (0.191)     1.24     1.01    -4.65   0.002 (0.100)   0.036 (0.111)   0.037 (0.328)    -1.02    -1.01     1.81
              1          0.5        -0.004 (0.393)   0.034 (0.421)   0.018 (0.506)    -2.65    -2.75    -2.87   0.002 (0.090)  -0.010 (0.089)  -0.030 (0.093)    -1.03    -1.04     1.09
                         1.5        -0.030 (0.466)   0.031 (0.491)   0.023 (0.704)    -7.57    -8.48   -12.77  -0.010 (0.091)   0.004 (0.092)   0.013 (0.099)     1.01     1.01     1.01
                         3           0.004 (0.023)   0.006 (0.025)   0.081 (0.176)     1.32     1.19    -4.44   0.008 (0.093)   0.027 (0.106)   0.038 (0.308)    -1.03    -1.01     1.84
              1.3        0.5        -0.020 (0.422)   0.043 (0.397)  -0.032 (0.564)    -2.69    -2.56    -3.22   0.006 (0.092)  -0.017 (0.085)   0.003 (0.110)    -1.05     1.03    -1.09
                         1.5        -0.001 (0.468)  -0.020 (0.490)  -0.010 (0.594)    -7.72    -9.01   -11.44   0.018 (0.093)   0.003 (0.097)   0.037 (0.108)    -1.02    -1.08    -1.08
                         3           0.006 (0.021)   0.000 (0.027)   0.060 (0.157)     1.34     1.16    -4.08   0.012 (0.085)   0.012 (0.106)   0.041 (0.358)     1.04    -1.00     1.63
       500    0.25       0.5         0.020 (0.071)   0.017 (0.071)   0.043 (0.145)     1.04     1.06    -1.06  -0.010 (0.017)  -0.006 (0.016)  -0.008 (0.019)    -1.02     1.03     1.00
                         1.5         0.016 (0.077)   0.023 (0.082)   0.066 (0.235)     1.15     1.12    -1.74   0.007 (0.018)   0.005 (0.017)   0.001 (0.019)    -1.05    -1.01     1.02
                         3          -0.002 (0.004)  -0.003 (0.005)   0.020 (0.022)     1.26     1.09     1.30  -0.004 (0.017)  -0.004 (0.020)   0.054 (0.107)     1.09    -1.01    -1.11
              1          0.5        -0.005 (0.076)   0.022 (0.077)   0.057 (0.152)     1.03     1.02    -1.09   0.002 (0.018)  -0.006 (0.017)  -0.008 (0.021)    -1.06    -1.03    -1.07
                         1.5         0.011 (0.074)   0.023 (0.085)   0.052 (0.235)     1.21     1.24    -1.53   0.005 (0.017)   0.005 (0.017)   0.000 (0.019)     1.02     1.03     1.01
                         3           0.000 (0.004)   0.003 (0.005)   0.010 (0.017)     1.12     1.02     1.47   0.001 (0.017)   0.005 (0.020)   0.046 (0.102)    -1.02    -1.00    -1.09
              1.3        0.5         0.013 (0.068)   0.003 (0.082)   0.052 (0.159)     1.14    -1.01    -1.10  -0.006 (0.016)   0.003 (0.018)  -0.005 (0.019)     1.05    -1.06     1.04
                         1.5         0.007 (0.080)   0.007 (0.080)   0.024 (0.201)     1.11     1.20    -1.49   0.002 (0.019)  -0.004 (0.016)  -0.001 (0.019)    -1.06     1.11     1.01
                         3           0.001 (0.004)   0.001 (0.005)   0.010 (0.019)     1.19     1.25     1.30   0.002 (0.017)   0.007 (0.018)   0.030 (0.095)    -1.01     1.09    -1.03
       1000   0.25       0.5        -0.001 (0.034)   0.002 (0.034)   0.013 (0.065)     1.04     1.10     1.05   0.000 (0.008)   0.001 (0.008)  -0.002 (0.009)     1.01     1.04    -1.02
                         1.5         0.002 (0.036)   0.007 (0.040)   0.052 (0.117)     1.15     1.02     1.09   0.000 (0.008)   0.001 (0.009)   0.006 (0.010)     1.00    -1.04    -1.01
                         3           0.000 (0.002)   0.001 (0.003)   0.008 (0.010)     1.16     1.02     1.15   0.000 (0.009)   0.002 (0.010)   0.026 (0.050)     1.05    -1.00    -1.11
              1          0.5         0.000 (0.034)   0.005 (0.037)   0.015 (0.069)     1.07     1.00     1.08   0.000 (0.008)  -0.001 (0.009)   0.001 (0.010)     1.04    -1.05    -1.01
                         1.5         0.005 (0.037)   0.002 (0.039)   0.058 (0.120)     1.14     1.18     1.00   0.002 (0.009)  -0.003 (0.009)   0.006 (0.009)     1.01     1.02     1.01
                         3           0.000 (0.002)   0.000 (0.003)   0.009 (0.009)     1.06     1.02     1.09   0.000 (0.009)   0.003 (0.010)   0.031 (0.048)    -1.04    -1.06    -1.10
              1.3        0.5        -0.007 (0.036)   0.012 (0.034)   0.015 (0.070)     1.03     1.09     1.02   0.003 (0.009)  -0.004 (0.008)  -0.001 (0.010)    -1.04     1.05    -1.02
                         1.5         0.010 (0.036)   0.015 (0.040)   0.046 (0.123)     1.16     1.14    -1.11   0.005 (0.009)   0.003 (0.008)   0.004 (0.009)     1.01     1.06     1.03
                         3           0.001 (0.002)   0.001 (0.002)   0.004 (0.009)     1.02     1.08     1.13   0.003 (0.009)   0.006 (0.010)   0.015 (0.046)    -1.05     1.04    -1.05

Table 2.4: Summary of bias, variance, and variance ratio for the PH and AH models fitted to
a Weibull distribution for the 53% censoring case. Note that βPH = βAH(k−1).
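The relation βPH = βAH(k−1) quoted in the table captions can be checked directly from the
Weibull hazard. The short derivation below is a sketch that assumes the AH model is written
as λ(t | x) = λ0(t e^{βAH x}) for a binary covariate x and that the baseline is Weibull with
shape k and scale λ; the notation used in Chapter 2 may differ.

```latex
\begin{align*}
\lambda_0(t) &= \frac{k}{\lambda}\left(\frac{t}{\lambda}\right)^{k-1}, \\
\lambda(t \mid x) &= \lambda_0\!\left(t\,e^{\beta_{AH} x}\right)
  = \frac{k}{\lambda}\left(\frac{t\,e^{\beta_{AH} x}}{\lambda}\right)^{k-1}
  = \lambda_0(t)\,e^{\beta_{AH}(k-1)\,x}.
\end{align*}
```

The last expression has the proportional hazards form with βPH = βAH(k−1). Under this
assumption the two coefficients have opposite signs when k = 0.5 (βPH = −0.5 βAH), and
βPH = 2 βAH when k = 3.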
Figure 2.2: Level plot of variance ratios of the empirical and model-based accelerated
hazards model variance estimates, for treatment effects (te) 0, 0.5, and 2; shape parameter,
k = 0.5, 1.5, and 3; scale parameter, λ = 0.25, 1, and 1.3; sample sizes, n = 100, 500, 1000,
5000, and 10,000. A negative ratio implies an underestimation of the model-based variance,
denoted by a warm orange color, while a positive ratio implies an overestimation of the
model-based variance, denoted by a dark green color.
(Panels: No Censoring, 27% Censoring, 53% Censoring; axes λ and k; one grid per sample size
and te; color key from Underest (−5) through Even (−1) and Even (1) to 2.)
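The signed variance ratios plotted above and tabulated in Tables 2.3 and 2.4 never fall
strictly between −1 and 1, which is consistent with a "larger over smaller" ratio that
carries a sign for the direction of the discrepancy. The sketch below illustrates that
reading; the convention is an assumption inferred from the captions, the function name is
hypothetical, and the inputs are made up for illustration.

```python
def signed_variance_ratio(empirical_var: float, model_var: float) -> float:
    """Signed ratio of two variance estimates (ASSUMED convention).

    Positive value: model-based variance >= empirical variance
    (overestimation), reported as model_var / empirical_var.
    Negative value: model-based variance < empirical variance
    (underestimation), reported as -(empirical_var / model_var).
    """
    if empirical_var <= 0 or model_var <= 0:
        raise ValueError("variances must be positive")
    if model_var >= empirical_var:
        return model_var / empirical_var
    return -(empirical_var / model_var)


# Hypothetical inputs, not taken from the simulation study.
print(signed_variance_ratio(0.040, 0.044))  # mild overestimation
print(signed_variance_ratio(0.600, 0.050))  # strong underestimation
```

Under this reading a value such as −12 would mean the model-based variance is roughly one
twelfth of the empirical variance.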
Figure 2.3: Level plot of variance ratios of empirical and model-based proportional hazard
model variance estimates, for treatment effects (te) 0, 0.5, and 2; shape parameter,
k = 0.5, 1.5, and 3; scale parameter, λ = 0.25, 1, and 1.3; sample sizes, n = 100, 500, 1000,
5000, and 10,000. A negative ratio implies an underestimation of the model-based variance,
denoted by a yellow color, while a positive ratio implies an overestimation of the
model-based variance, denoted by green.
(Panels: No Censoring, 27% Censoring, 53% Censoring; axes λ and k; one grid per sample size
and te; color key: Underest (−2), Even (−1), Even (1), Overest (2).)
Figure 2.4: Dot plot of three variance estimates: empirical variance (blue), model-based
variance (pink), and non-parametric bootstrapped variance (green), for data taken from a
Weibull distribution with n = 100, shape (k) = 0.5, 1.5, and 3, scale (λ) = 0.25, 1, and 1.3,
and treatment effect, βAH, denoted (te) = 0, 0.5, and 2.
(Panels: No Censoring, 27% Censoring, 53% Censoring; Variance Estimates against λ, one
sub-panel per combination of k and te.)
Figure 2.5: Dot plot of three variance estimates: empirical variance (blue), model-based
variance (pink), and non-parametric bootstrapped variance (green), for data taken from a
Weibull distribution with n = 300, shape (k) = 0.5, 1.5, and 3, scale (λ) = 0.25, 1, and 1.3,
and treatment effect, βAH, denoted (te) = 0, 0.5, and 2.
(Panels: No Censoring, 27% Censoring, 53% Censoring; Variance Estimates against λ, one
sub-panel per combination of k and te.)
Figure 2.6: Dot plot of three variance estimates: empirical variance (blue), model-based
variance (pink), and non-parametric bootstrapped variance (green), for data taken from a
Weibull distribution with n = 500, shape (k) = 0.5, 1.5, and 3, scale (λ) = 0.25, 1, and 1.3,
and treatment effect, βAH, denoted (te) = 0, 0.5, and 2.
(Panels: No Censoring, 27% Censoring, 53% Censoring; Variance Estimates against λ, one
sub-panel per combination of k and te.)
Figure 2.7: Dot plot of two coverage probabilities using the model-based variance (blue),
and non-parametric bootstrapped variance (pink) for data taken from a Weibull distribution
with n = 100, shape (k) = 0.5, 1.5, and 3, scale (λ) = 0.25, 1, and 1.3, and treatment effect,
βAH, denoted (te) = 0, 0.5, and 2.
(Panels: No Censoring, 27% Censoring, 53% Censoring; 95% Coverage Probability against λ, one
sub-panel per combination of k and te.)
Figure 2.8: Dot plot of two coverage probabilities using the model-based variance (blue),
and non-parametric bootstrapped variance (pink) for data taken from a Weibull distribution
with n = 300, shape (k) = 0.5, 1.5, and 3, scale (λ) = 0.25, 1, and 1.3, and treatment effect,
βAH, denoted (te) = 0, 0.5, and 2.
(Panels: No Censoring, 27% Censoring, 53% Censoring; 95% Coverage Probability against λ, one
sub-panel per combination of k and te.)
Figure 2.9: Dot plot of two coverage probabilities using the model-based variance (blue),
and non-parametric bootstrapped variance (pink) for data taken from a Weibull distribution
with n = 500, shape (k) = 0.5, 1.5, and 3, scale (λ) = 0.25, 1, and 1.3, and treatment effect,
βAH, denoted (te) = 0, 0.5, and 2.
(Panels: No Censoring, 27% Censoring, 53% Censoring; 95% Coverage Probability against λ, one
sub-panel per combination of k and te.)
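The coverage probabilities in Figures 2.7 to 2.9 can be thought of as the fraction of
simulated data sets whose 95% interval contains the true treatment effect. The sketch below
assumes a Wald-type interval (estimate ± 1.96 standard errors); the function name and the
replicate values are hypothetical and are not taken from the simulation study.

```python
def wald_coverage(estimates, std_errors, true_beta, z=1.96):
    """Fraction of replicates whose Wald interval covers true_beta."""
    if len(estimates) != len(std_errors) or not estimates:
        raise ValueError("need equally long, non-empty inputs")
    covered = sum(
        1
        for b, se in zip(estimates, std_errors)
        if b - z * se <= true_beta <= b + z * se
    )
    return covered / len(estimates)


# Hypothetical replicate results for a true effect of 0.5.
beta_hats = [0.42, 0.55, 0.61, 0.13, 0.49]
ses = [0.10, 0.12, 0.09, 0.11, 0.10]
print(wald_coverage(beta_hats, ses, true_beta=0.5))
```

Passing model-based standard errors and bootstrapped standard errors for the same replicates
gives the two quantities that the figures compare.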
Chapter 3
Data Analysis
We illustrate the use of the three models presented in Chapter 2 on five data sets: a
randomized trial for the treatment of breast cancer; an observational study on coronary
artery bypass graft (CABG) surgery in British Columbia, taken from the Cardiac Services BC
Registry; a Veterans Administration lung cancer study; a study on catheter placement for
kidney dialysis patients; and a study of the administration of the drug carmustine (BCNU)
to patients with malignant brain tumours. For simplicity, in all studies we focus on
comparing the treatment and control groups, and use these examples to contrast the PH and
AH models.
3.1 Breast Cancer Clinical Trial
A prospective randomized trial of premenopausal women diagnosed with node-positive
stage I or II breast cancer in British Columbia between 1979 and 1986 was undertaken to
study the effects of combining radiotherapy with chemotherapy in the treatment of breast
cancer (Ragaz et al., 1997). All subjects were referred to the British Columbia Cancer
Agency for group randomization and treatment planning. There were 318 subjects in total,
154 of whom were assigned to the control group (chemotherapy only), while the remaining 164
were assigned to the proposed treatment group (chemotherapy and radiotherapy). Subjects were
followed for 20 years. Failure in this study is defined as either the recurrence of cancer
or death from cancer or other causes; 36% of the control group were right censored, while
about 52% of individuals in the chemotherapy and radiotherapy group were censored. Overall,
177 subjects had failed by the end of the study.
Figure 3.1: Left: Kaplan-Meier survivor curves for the control group (black) and treatment
group (red). Right: Cumulative hazard curves for the control group and treatment group.
(Panels: Kaplan-Meier Plot, P(Survival) against Time in Years; Cumulative Hazard Plot, Hazard
against Time in Years; groups: Chemotherapy only, Chemotherapy + Radiotherapy.)
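For readers who want to see how curves of this kind are computed, the sketch below
implements the Kaplan-Meier product-limit estimator together with the Nelson-Aalen
cumulative hazard for right-censored data. It is an illustration only: the function name and
toy data are hypothetical, and the cumulative hazard in the figure may have been obtained
differently (for example as −log of the Kaplan-Meier estimate).

```python
from collections import Counter


def km_and_nelson_aalen(times, events):
    """Return [(t, S(t), H(t))] at each distinct observed failure time.

    times  : follow-up time for each subject
    events : 1 if the failure was observed, 0 if right censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # everyone leaving the risk set at time t
    at_risk = len(times)
    surv, cum_haz, out = 1.0, 0.0, []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            surv *= 1.0 - d / at_risk  # Kaplan-Meier product-limit step
            cum_haz += d / at_risk     # Nelson-Aalen increment
            out.append((t, surv, cum_haz))
        at_risk -= exits[t]            # drop failures and censored subjects
    return out


# Hypothetical toy data (years), not taken from the trial.
times = [2, 3, 3, 5, 8, 8, 12]
events = [1, 1, 0, 1, 0, 1, 0]
for t, s, h in km_and_nelson_aalen(times, events):
    print(f"t = {t:>2}  S(t) = {s:.3f}  H(t) = {h:.3f}")
```

Subjects censored at a failure time are treated as still at risk at that time, which is the
usual convention.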
The Kaplan-Meier survivor curves, displayed in Figure 3.1, suggest that a proportional
hazards model may fit the data well. We fit the three models discussed in Chapter 2 to
compare and contrast model fit, parameter estimates, and their interpretation. Table 3.1
lists the estimated effects of the proposed treatment (chemotherapy and radiotherapy) and
their corresponding 95% confidence intervals under the PH, AFT, and AH models. The PH,
Weibull AFT, and bootstrapped AH analyses all give p-values < 0.05 in a test of no
difference between the treatment and control groups. The bootstrap variance estimates in
the AH model are denoted with an asterisk (*). Note that when the model-based variance was
used in the Wald test for the treatment effect in the AH model, the resulting p-value was
much larger (0.08) and only marginally significant. Using a non-parametric bootstrap
approach for variance estimation in the AH model led to a smaller standard error, a Wald
statistic that is significant at α = 5%, and a CI whose lower bound is well above 1.
Furthermore, all models agree on the direction of the effect, identifying a favorable
outcome for the chemotherapy plus radiotherapy treatment over the standard
chemotherapy-only treatment.
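A non-parametric bootstrap standard error of the kind referred to above can be sketched as
follows. Whole subject records are resampled with replacement and the estimator is
recomputed on each resample. Two things here are assumptions made for the sketch rather than
details from the thesis: resampling is done within each treatment arm so that every resample
contains both groups, and `log_mean_ratio` is only a stand-in estimator (fitting the AH
model is beyond a short example). All names and data are hypothetical.

```python
import math
import random
import statistics


def bootstrap_se(records, estimator, n_boot=500, seed=0):
    """Bootstrap standard error of estimator(records).

    records are (time, event, arm) tuples; each record is kept intact
    and resampled with replacement within its arm (assumed design).
    """
    rng = random.Random(seed)
    arms = {}
    for rec in records:
        arms.setdefault(rec[2], []).append(rec)
    reps = []
    for _ in range(n_boot):
        resample = []
        for group in arms.values():
            resample.extend(rng.choices(group, k=len(group)))
        reps.append(estimator(resample))
    return statistics.stdev(reps)


def log_mean_ratio(records):
    """Stand-in estimator: log ratio of mean follow-up, treated vs control."""
    treated = [t for t, _, arm in records if arm == 1]
    control = [t for t, _, arm in records if arm == 0]
    return math.log(statistics.mean(treated) / statistics.mean(control))


# Hypothetical records: (time in years, event indicator, treatment arm).
data = [(2.1, 1, 0), (3.4, 1, 0), (5.0, 0, 0), (1.2, 1, 0), (4.4, 1, 0),
        (6.3, 1, 1), (7.9, 0, 1), (3.8, 1, 1), (9.5, 0, 1), (5.1, 1, 1)]
print(bootstrap_se(data, log_mean_ratio))
```

Replacing `log_mean_ratio` with a function that fits the model of interest and returns the
estimated coefficient gives a bootstrap standard error for that coefficient.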
In the proportional hazards model, the proposed treatment has a hazard rate that is
proportionately 35% lower than that of the stand-alone chemotherapy treatment. In the
Weibull accelerated failure time model, individuals in the control group age (along the
survivor curve) approximately twice as fast as those in the treatment group. The estimated
Extreme Value scale parameter is 1.38 (SE=0.09); because it exceeds 1, the corresponding
Weibull shape is below 1 and the hazard function decreases with time. In the accelerated
hazards model framework, the time scale for risk or hazard progression for the chemotherapy
plus radiotherapy group is about 2.5 times faster than that for the chemotherapy only
group. This acceleration is beneficial because the hazard function is generally decreasing.
A plot of the smoothed hazard function, obtained using a kernel approximation, in
Figure 3.2 shows this trajectory.
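The three interpretations above correspond to three ways in which a binary covariate x
(here x = 1 for chemotherapy plus radiotherapy) can enter the model. The forms below are a
sketch using common sign conventions, chosen to agree with the interpretations given in the
text; the exact notation of Chapter 2 is not repeated here, so the signs should be read as
an assumption.

```latex
\begin{align*}
\text{PH:}  \quad & \lambda(t \mid x) = \lambda_0(t)\,e^{\beta_{PH}\,x}, \\
\text{AFT:} \quad & S(t \mid x) = S_0\!\left(t\,e^{-\beta_{AFT}\,x}\right), \\
\text{AH:}  \quad & \lambda(t \mid x) = \lambda_0\!\left(t\,e^{\beta_{AH}\,x}\right).
\end{align*}
```

In the PH form the exponentiated coefficient multiplies the hazard; in the AFT form it
rescales the time axis of the survivor function; in the AH form it rescales the time axis
of the hazard itself. Under the AH form both groups share the hazard λ0(0) at t = 0.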
Model   β        SE(β)            p-value          e^β     95% CI for e^β
PH      -0.437   0.152            0.004            0.646   (0.48, 0.87)
AFT      0.641   0.211            0.002            1.898   (1.25, 2.87)
AH       0.928   0.531 (0.332*)   0.081 (0.005*)   2.529   (0.89, 7.16), (1.32, 4.85)*

Table 3.1: Estimated treatment effects in the analysis of the breast cancer data using PH,
AFT, and AH models. Values with * in the AH model represent bootstrapped estimates.
The p-values correspond to Wald tests of a hypothesis of no treatment effect.
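The exponentiated effects, confidence intervals, and p-values in Table 3.1 are linked by the
usual Wald calculations, and can be approximately recovered from the reported β and SE(β).
The sketch below assumes a normal reference distribution and the multiplier 1.96; because
its inputs are the rounded table values, results can differ from the table in the last
digit. The function name is hypothetical.

```python
import math


def wald_summary(beta, se, z=1.96):
    """Return (exp(beta), CI lower, CI upper, two-sided Wald p-value)."""
    # Two-sided normal p-value: 2 * (1 - Phi(|beta / se|)).
    p_value = math.erfc(abs(beta / se) / math.sqrt(2.0))
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se),
            p_value)


# Estimates and standard errors as reported in Table 3.1.
rows = {
    "PH": (-0.437, 0.152),
    "AFT": (0.641, 0.211),
    "AH (model-based)": (0.928, 0.531),
    "AH (bootstrap)": (0.928, 0.332),
}
for name, (beta, se) in rows.items():
    est, lo, hi, p = wald_summary(beta, se)
    print(f"{name:18s} {est:.3f} ({lo:.2f}, {hi:.2f})  p = {p:.3f}")
```

The two AH rows share the same point estimate; only the standard error changes, which is
why the bootstrap interval is narrower and excludes 1.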
Figure 3.3 shows the fitted cumulative hazard curves (top panels) and survivor curves
(bottom panels) for the three models, overlaid on non-parametric estimates of these
quantities. The estimated Cox proportional hazards survivor curves seem to fit the data
quite well at all time points for both the control and treatment groups. In the accelerated
failure time model, the fitted curves underestimate the cumulative hazard for both groups
up to about year 10 and overestimate it in later years. (The converse can be seen in the
estimated survivor curves.) The accelerated hazards model sets the risk for both groups to
be equal at time 0. However, this feature led to an undesired overestimation of the
treatment group's hazard at the beginning of the trial, causing the fitted survivor curve
of the treatment group to lie below that of the control group at 2 years. This is not an
accurate description of the data, as the treatment group had consistently higher survival
rates than the control group throughout the entire duration of the trial. Overall, the PH
model fit the breast cancer data best. A plot of the residuals in Figure 3.4 provides a
visual check for any deviation from proportional hazards. A test that the slope is 0 gives
a p-value of 0.72, so there is no evidence that the proportional hazards assumption is
violated.
Figure 3.2: Smoothed hazard curves for the breast cancer data.
(Plot: Smoothed Hazard Curves; Hazard Rate against Follow-up Time in Years; groups:
Chemotherapy only, Chemotherapy + Radiotherapy.)
Figure 3.3: Top: Cumulative hazard curves for the PH, Weibull AFT, and AH models. The fitted
cumulative hazard curves are shown in solid lines vs the non-parametric hazard curves in
dashed lines. The baseline (control) hazard curves are shown in black. Bottom: Kaplan-Meier
survivor curves for the control group (black dashed) and treatment group (red dashed), with
the fitted survivor curves for the treatment group shown in red solid lines.
(Panels: PH Model, AFT Weibull Model, AH Model; Cumulative Hazard and P(Survival) against
Time in Years; groups: Chemotherapy, Chemotherapy + Radiation.)
Figure 3.4: Residual plot from the fit of the proportional hazards model to the breast cancer
data.
(Plot: Cox Proportional Hazards Residual Plot; Beta(t) for trt against Time.)
3.2 Coronary Artery Bypass Graft Surgery
This study considers the two-year post-surgery survival outcomes of a random sample of
individuals who underwent coronary artery bypass graft (CABG) surgery between January 1991
and February 2006 in British Columbia. The Cardiac Services BC Registry maintains a
database on all heart-related surgeries performed in BC. Mortality data were taken from the
British Columbia Vital Statistics Agency. Our sample consists of 2,644 individuals, about
80% of whom are men. Among the individuals sampled, 171 (6.5%) died within 2 years
post-surgery. Furthermore, among those who died, 88 (58 males; 30 females) failed during
the first month after CABG surgery. This short-term mortality is well known with CABG
surgery, and the first 30 days after surgery are defined as the operative mortality period.
Shown in Figure 3.5 are the 2-year survivor and cumulative hazard curves for both male and
female patients who underwent CABG surgery. There is a steep decline in the survival rate,
and a corresponding sharp rise in the cumulative hazard, during the first month
post-surgery. However, survival rates after the 30-day period are generally good. The
Kaplan-Meier curves suggest that females have a higher mortality rate than males. Indeed,
Ghahramani et al. (2001) identified significant gender differences in the modeling of CABG
survival time data. Previous studies (Humphries et al., 2007) found that short-term gender
mortality differences may be attributed to intrinsic factors such as body surface area
(BSA), which serves as a proxy for the size of coronary vessels, with women having lower
BSA and thus smaller vessels. (Body surface area in m² is defined as
√(height (cm) × weight (kg) / 3600).) In this study, we use gender as the binary covariate
in our two-sample comparison.
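As a small worked example of the body surface area formula quoted above, the sketch below
evaluates it for two made-up patients. The function name and the heights and weights are
hypothetical and are not taken from the registry.

```python
import math


def body_surface_area(height_cm: float, weight_kg: float) -> float:
    """BSA in square metres: sqrt(height (cm) * weight (kg) / 3600)."""
    return math.sqrt(height_cm * weight_kg / 3600.0)


# Hypothetical patients, for illustration only.
print(round(body_surface_area(160, 60), 2))
print(round(body_surface_area(178, 85), 2))
```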
We analysed gender effects in the CABG data using the proportional hazards, Weibull
accelerated failure time, and accelerated hazards models. Table 3.2 shows the estimates of
the female gender effect, β, for the three models considered. Although the estimates from
the three models have different interpretations, all models show evidence of a significant
difference between males and females, with all p-values < 0.01 under the Wald-type test of
no effect. Furthermore, all models find that females fare worse than males.
Figure 3.5: Left: Kaplan-Meier coronary artery bypass graft surgery 2-year survivor curves
for males (black) and females (red). Right: Two-year cumulative hazard curve for coronary
artery bypass graft surgeries.
(Panels: Kaplan-Meier Plot, P(Survival) against Time in months; Cumulative Hazard Plot,
Hazard against Time in months; groups: Males, Females.)
In the proportional hazards model, women have a 1.6 times higher risk of mortality than men
during the first 2 years after the surgery. In the accelerated failure time model, females
age faster along the survival scale: their failure times are shortened by about 83%
(1 − 0.169) compared to males undergoing the same type of surgery. It is estimated that 5%
of females who undergo CABG surgery will die within about 5 months (162 days), whereas it
would take about 2.6 years (960 days) for males to reach an equivalent mortality rate. We
note that the standard error for this estimate is relatively high, resulting in a wide
confidence interval. In the accelerated hazards model, the risk or hazard progression for
women undergoing CABG surgery is about 40% slower than that of men undergoing the same type
of surgery during the first 2 years. This deceleration is interpreted as a harmful effect
since the hazard function is decreasing over time, as shown in Figure 3.6. A non-parametric
bootstrap approach was used to obtain a reliable estimate of the variance of the gender
effect. Bootstrapped variance estimates are denoted with an asterisk (*). The bootstrapped
standard error for βAH is fairly close to the model-based standard error, as are the
corresponding 95% confidence interval estimates for e^βAH. This close agreement between the
model-based variance and the bootstrapped variance reflects the large sample size used in
this analysis.
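The AFT figures quoted above are tied together by the time ratio e^β: under an accelerated
failure time model every quantile of the failure time distribution for females is the
corresponding male quantile multiplied by e^β. The sketch below checks this with the
reported AFT estimate of β = −1.776 and the 960-day figure for males; it is a consistency
check on the reported numbers, not a refit of the model.

```python
import math

beta_aft = -1.776                  # female effect from the Weibull AFT fit
time_ratio = math.exp(beta_aft)    # multiplies every failure-time quantile

male_days = 960.0                  # time by which 5% of males are estimated to die
female_days = male_days * time_ratio

print(f"time ratio  = {time_ratio:.3f}")
print(f"female time = {female_days:.1f} days")
print(f"shortening  = {1 - time_ratio:.0%}")
```

The result agrees with the reported 162 days and the 83% shortening up to rounding of β.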
Time Period   Model   β        SE(β)           P-value          e^β     95% CI for e^β
2 years       PH       0.459   0.1672          0.006            1.583   (1.14, 2.20)
              AFT     -1.776   0.6881          0.0098           0.169   (0.04, 0.65)
              AH      -0.490   0.18 (0.184*)   0.007 (0.007*)   0.613   (0.43, 0.87), (0.43, 0.88)*

Table 3.2: Estimated gender effects on the coronary artery bypass graft data using PH,
AFT and AH models. Values with * in the AH model represent bootstrapped estimates.
The p-values correspond to Wald tests of a hypothesis of no gender effect.
Plots of the cumulative hazard and survivor curves during the 2-year period are shown in
Figure 3.7. Although the PH model estimates the baseline hazard and survivor curves quite
well, it fails to capture the wide gap in hazards between males and females before month 8
(shown in Figure 3.6), which results in an overestimation of the survival rate for females
up to month 8. In the succeeding months, however, the PH model seems to fit the data well.
The Weibull accelerated failure time model does not describe the data well. Although the
estimated gender effect, βAFT, looks reasonable, the estimated baseline survivor function
does not follow the same trajectory as the Kaplan-Meier curve, and the fit consistently
undershoots the baseline cumulative hazard curve. The estimated scale in the location-scale
framework for the Weibull model is 4.03 (SE=0.30), which implies that the hazard function
decreases with time. In the AH model, the fitted curves overestimate the survival rates for
females. Visually, the PH model provides the best fit to the data, despite the slight
deviation from the proportional hazards assumption shown in the residual plot, Figure 3.8.
The deviation appears as a slight downward slope from time 0 to 240 days followed by a
positive slope, suggesting that the sharp early decline in survival violates the
proportionality assumption. However, a test of whether the slope of the trend in this plot
is 0 is non-significant, with a p-value of 0.26.
Figure 3.6: Smoothed hazard curves for males (black solid) and females (red dashed) who
underwent a coronary artery bypass graft surgery.
(Plot: Smoothed Hazard Curve; Hazard Rate against Follow-up Time in months; groups: Males,
Females.)
Figure 3.7: Top: Cumulative hazard curves for the 2-year coronary artery bypass data using
the PH, AFT, and AH models. The non-parametric estimates are shown in dashed lines, while
the estimated cumulative hazard curves are shown in solid lines for males (black) and
females (red). Bottom: Kaplan-Meier 2-year survivor curves for males (black dashed) and
females (red dashed), with the fitted survivor curves shown in solid lines.
(Panels: PH Model, AFT Model, AH Model; Cumulative Hazard and P(Survival) against Time in
months; groups: Males, Females.)
[Figure 3.8 plot omitted. Panel title: Cox Proportional Hazards Residual Plot; x-axis: Time (0.92 to 680); y-axis: Beta(t) for sex.]
Figure 3.8: Residual plot from the fit of the proportional hazards model to the coronary
artery bypass graft surgery data during the first two years post-surgery.
3.3 Veteran Lung Cancer
Kalbfleisch & Prentice (1980, Appendix I, p.223) present data taken from a randomized
clinical trial on the treatment of inoperable advanced lung cancer. A total of 137 male pa-
tients were randomized to either a standard chemotherapy or a new test chemotherapy for
the treatment of the disease. Several covariates were included in the randomization process
at the start of the study, including histology of the cancer cells, Karnofsky score, number
of months from diagnosis, age, and prior therapy. The trial had a high mortality rate, with
about 95% of all patients dying within a period of a year and a half. Figure 3.9 shows
the survivor and cumulative hazard curves for the standard and test groups. The estimated
survivor curves for both groups cross at around month 6, with individuals on the standard
treatment performing better earlier in the study, but poorer after 6 months when compared
to the group receiving the test treatment, though these may not be significant differences.
[Figure 3.9 plots omitted. Left panel: Kaplan-Meier Plot, P(Survival) against Follow-up Time in Months (0 to 20). Right panel: Cumulative Hazard Plot, Hazard against Follow-up Time in Months (0 to 20). Legend: Standard, Test.]
Figure 3.9: Left: Kaplan-Meier survivor curves for the standard (black) and test (red)
chemotherapy groups. Right: Cumulative hazard curves for the two groups.
Table 3.3 lists the parameter estimates of the treatment effect for the three models con-
sidered. All models report a relatively small treatment effect. Furthermore, none of the
models show a significant treatment effect, and there is no agreement in the direction of
this non-significant treatment effect. The estimated scale parameter from the Weibull AFT
model was 1.17 (SE=0.08), which is close to 1 and suggests that a simpler exponential model
may fit the data. However, a test of whether the scale parameter can be set to 1 produced a
p-value of 0.02. The accelerated hazards model estimates that the test treatment has about a
7% (non-significant) slower risk progression than the standard therapy. Figure 3.10 shows the smoothed hazard
curves for the standard and test treatment groups. The hazard curve for the standard group
looks relatively flat, while the hazard curve for the test group decreases with time. Recall
that one of the assumptions of the accelerated hazards model is the non-constancy of the
hazard function. Even if the treatment were significant, it would be difficult for the AH
model to capture differences between the two groups.
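The link between the Weibull scale parameter and the shape of the hazard can be checked numerically. The sketch below is illustrative only: it assumes the usual location-scale convention in which the Weibull shape equals 1/scale, and it assumes (the text does not say) that the test of scale = 1 is a Wald test on the log of the scale with a delta-method standard error. The function names are hypothetical.

```python
import math

def normal_two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def weibull_shape_from_scale(scale: float) -> float:
    """Location-scale (log-time) convention: Weibull shape = 1 / scale."""
    return 1.0 / scale

def wald_test_scale_equals_one(scale: float, se_scale: float):
    """Wald test of log(scale) = 0.

    Assumption: the test is carried out on the log scale, with
    SE(log scale) approximated by SE(scale) / scale (delta method).
    """
    log_scale = math.log(scale)
    se_log_scale = se_scale / scale
    z = log_scale / se_log_scale
    return z, normal_two_sided_p(z)

# Values reported for the veteran lung cancer Weibull AFT fit.
shape = weibull_shape_from_scale(1.17)
z, p = wald_test_scale_equals_one(1.17, 0.08)
print(f"shape = {shape:.3f}, z = {z:.2f}, p = {p:.3f}")
```

A shape below 1 corresponds to a decreasing Weibull hazard, and a shape of exactly 1 to the constant hazard of the exponential model. Under the stated assumptions the p-value comes out close to the reported 0.02, but other test constructions (for example a likelihood ratio test) would give somewhat different values.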
Model   β        SE(β)           P-value          e^β     95% CI for e^β
PH       0.018   0.181           0.920            1.018   (0.71, 1.45)
AFT      0.048   0.208           0.082            1.049   (0.70, 1.58)
AH      -0.069   0.105 (1.33*)   0.514 (0.959*)   0.934   (0.76, 1.15) (0.07, 12.75)*
Table 3.3: Estimated treatment effects on the veteran lung cancer data using PH, AFT, and
AH models. Values with * in the AH model represent bootstrapped estimates. The p-values
correspond to Wald tests of a hypothesis of no treatment effect.
Figure 3.11 displays the fitted cumulative hazard and survivor curves for the three mod-
els. The PH and AFT models are restricted to fitting survivor curves that do not cross.
Therefore, both models fail to capture the (perhaps non-significant) crossing at about 6
months displayed in the Kaplan-Meier estimates. Figure 3.12 shows the residual plot from
fitting a proportional hazards model to the data. The curvature in the residual plot suggests
a violation of the proportional hazards assumption. However, the hypothesis that the slope of
the trend in this plot is 0 is not rejected (p-value of 0.74); this may be due to the
non-significant difference between the two groups and/or a lack of sensitivity of the test to
quadratic effects.
[Figure 3.10 plot omitted. Panel title: Smoothed Hazard Curve; x-axis: Follow-up Time in Days (0 to 200); y-axis: Hazard Rate; legend: Standard, Test.]
Figure 3.10: Smoothed hazard curves for the standard and test chemotherapy for the treat-
ment of lung cancer.
[Figure 3.11 plots omitted. Six panels against Follow-up Time in Months (0 to 20). Top row: Cumulative Hazard for the PH Model, AFT Weibull Model, and AH Model. Bottom row: P(Survival) for the same three models. Legend: Kaplan-Meier, Fitted; Standard, Test.]
Figure 3.11: Top: Cumulative hazard curves for the veteran lung cancer data using the PH,
AFT, and AH models. The non-parametric estimates are shown in dashed lines, while the
estimated cumulative hazard curves are shown in solid lines for standard treatment (black)
and test treatment (red). Bottom: Kaplan-Meier survivor curves for standard treatment
(black dashed) and test treatment (red dashed), with the fitted survivor curves shown in
solid lines.
[Figure 3.12 plot omitted. Panel title: Cox Proportional Hazards Residual Plot; x-axis: Time (8.2 to 390); y-axis: Beta(t) for trt.]
Figure 3.12: Residual plot from the fit of the proportional hazards model to the veteran
lung cancer data.
3.4 Kidney Catheter Data
McGilchrist & Aisbett (1991) discuss a clinical trial to assess the time to infection, at the
catheter insertion site, of patients receiving kidney dialysis treatment. A total of 119 pa-
tients were grouped according to either a surgical or percutaneous insertion of the catheter.
Patients were followed until an infection was observed; individuals who needed removal of
the catheter for reasons other than an infection were treated as censored events. About 22%
(26) developed an infection over the 30-month course of the study. Figure 3.13 displays
the survivor and cumulative hazard curves for the two groups we consider here, namely
those undergoing surgical or percutaneous catheter placement. Notice that the estimated
survivor curves for the two groups cross early on, with the percutaneous group having a
higher infection rate (low survival) than the surgical group before 5 months but remaining
relatively flat after.
[Figure 3.13 plots omitted. Left panel: Survivor Curves, P(infection) against Time in months (0 to 25). Right panel: Cumulative Hazard Curves, Hazard against Time in months (0 to 25). Legend: Surgically, Percutaneously.]
Figure 3.13: Left: Kaplan-Meier survivor curves for surgical (black) and percutaneous
placement (red) of the catheter for patients undergoing kidney dialysis. Right: Cumulative
hazard curves for the two groups.
Table 3.4 shows estimates of the effects of catheters placed percutaneously over surgi-
cal placement for the three models. All models agree in the direction of the effect, with
the percutaneous catheter placement preferred over the surgical placement of catheters,
although none of the p-values suggest a significant difference. The PH model estimates the
risk of developing an infection for patients with a percutaneously placed catheter to be
about half that for patients with a surgically placed catheter. Similarly, the AFT model
estimates that failure times in the percutaneously placed catheter group are decelerated by
a factor of about 1.8 relative to the surgically placed catheter group. It is estimated that
50% of individuals in the surgical group will develop an infection in 24 months (2 years),
while it would take 45 months in the percutaneous group for half of the subjects to get an
infection. The AH model estimates a deceleration in the time of risk progression in the
percutaneous group by 80%. This is seen as favorable to the percutaneous procedure because
the estimated underlying baseline (surgically-treated) hazard curve is shown to be
increasing (see Figure 3.14). Using the model-based variance to calculate the Wald test
statistic for the AH model resulted in a significant p-value (0.0014); however, using the
bootstrapped variance estimate resulted in a non-significant effect. Substantial differences
between model-based and resampling variance estimates are observed in this study, which has
a small sample size of 119.
Model   β        SE(β)              P-value            e^β     95% CI for e^β
PH      -0.618   0.398              0.121              0.539   (0.25, 1.18)
AFT      0.623   0.468              0.184              1.865   (0.75, 4.67)
AH      -1.564   0.4904 (1.1143*)   0.0014 (0.1605*)   0.209   (0.08, 0.55) (0.02, 1.86)*
Table 3.4: Estimated treatment effects on the percutaneous catheter placement using PH,
AFT, and AH models. Values with * in the AH model represent bootstrapped estimates.
The p-values correspond to Wald tests of a hypothesis of no treatment effect.
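The Wald quantities in Table 3.4 can be recomputed from the estimates and standard errors alone. The following is a minimal sketch, assuming a two-sided normal reference distribution and a critical value of 1.96 for the 95% interval; the helper name `wald_summary` is hypothetical and not part of any package used in the analysis.

```python
import math

def wald_summary(beta: float, se: float, z_crit: float = 1.96):
    """Return (p_value, exp_beta, ci_low, ci_high) for a Wald test of beta = 0."""
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return (
        p_value,
        math.exp(beta),
        math.exp(beta - z_crit * se),
        math.exp(beta + z_crit * se),
    )

# (beta, SE) pairs taken from Table 3.4.
rows = {
    "PH": (-0.618, 0.398),
    "AFT": (0.623, 0.468),
    "AH (model-based SE)": (-1.564, 0.4904),
    "AH (bootstrap SE)": (-1.564, 1.1143),
}
results = {name: wald_summary(b, se) for name, (b, se) in rows.items()}
for name, (p, eb, lo, hi) in results.items():
    print(f"{name:20s} p={p:.4f}  exp(beta)={eb:.3f}  CI=({lo:.2f}, {hi:.2f})")
```

The two AH rows use the same point estimate and differ only in the standard error, which is what moves the p-value from significant to non-significant.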
Figure 3.15 shows the non-parametric and fitted survivor and cumulative hazard curves
based on the fitted PH, AFT Weibull, and AH models. Neither the PH model nor any parametric
AFT model can accommodate crossing survivor curves. It is therefore not surprising
that both the PH and AFT models failed to capture this feature in the data, while the AH model
was able to offer a reasonably good fit to the data. The proportional hazards residual plot
in Figure 3.16 verifies the violation of the proportionality assumption in the early months
(p-value=0.003).
[Figure 3.14 plot omitted. Panel title: Smoothed Hazard Curves; x-axis: Follow-up Time in months (0 to 15); y-axis: Hazard Rate; legend: Surgically, Percutaneously.]
Figure 3.14: Smoothed hazard curves for surgically and percutaneously-placed catheters
for patients undergoing kidney dialysis.
[Figure 3.15 plots omitted. Six panels against Time in months (0 to 25). Top row: Cumulative Hazard for the PH Model, AFT Weibull Model, and AH Model. Bottom row: P(Infection) for the same three models. Legend: Kaplan-Meier, Fitted; Surgically, Percutaneously.]
Figure 3.15: Top: Cumulative hazard curves for the kidney catheter placement data using the
PH, AFT, and AH models. The non-parametric estimates are shown in dashed lines, while the
estimated cumulative hazard curves are shown in solid lines for the surgically placed group
(black) and the percutaneously placed group (red). Bottom: Kaplan-Meier survivor curves for
the surgically placed group (black dashed) and the percutaneously placed group (red dashed),
with the fitted survivor curves shown in solid lines.
[Figure 3.16 plot omitted. Panel title: Cox Proportional Hazards Residual Plot; x-axis: Time (1 to 25); y-axis: Beta(t) for V3.]
Figure 3.16: Residual plot from the fit of the proportional hazards model to the kidney
catheter placement data.
3.5 Brain Tumor Data
Brem et al. (1995) conducted a randomized, prospective clinical trial on the implantation of
carmustine (also known as bis-chloronitrosourea or BCNU) polymer discs in patients with
recurrent malignant brain gliomas. The carmustine in the polymer disc is a chemother-
apeutic drug for the treatment of brain tumours. The drug in the disc is slowly released
into the brain over a 2 to 3 week period from the day of surgery or implantation. The trial
consisted of 222 patients who were randomly assigned with equal probability to either a
placebo group (112), or the BCNU group (110), the group receiving the carmustine poly-
mer discs, with the placebo group receiving empty polymer implants. The mortality rate
for this trial was high, with about 93% of the patients dying within the 4-year time period.
However, in this study, we will only consider the first 52 weeks of the trial, as this is where
the marked differences between the placebo and BCNU groups occur. At week 52, about
80% (176 out of 222) of the patients had died, corresponding to a censoring percentage
of 20%. Displayed in Figure 3.17 are the Kaplan-Meier survivor curves and the corresponding
cumulative hazard curves for the two groups. The placebo group seems to perform worse
than the BCNU group after about week 12.
[Figure 3.17 plots omitted. Left panel: Kaplan-Meier Survival Curves, P(Survival) against Follow-up Time in Weeks (0 to 52). Right panel: Cumulative Hazard Curves, Hazard against Follow-up Time in Weeks (0 to 52). Legend: Placebo, BCNU.]
Figure 3.17: Left: Kaplan-Meier survivor curves for the placebo (black) and BCNU poly-
mer (red) groups. Right: Cumulative hazard curves for the two groups.
Table 3.5 provides the estimated treatment effect using the three models under study.
Although all p-values under a Wald test report a non-significant treatment effect, all models
agree on the direction of the treatment estimate, with both the PH and AH models having β < 0
and the AFT model having β > 0, implying a better prognosis for patients who received
the BCNU polymer disc. The hazard ratio of 0.83 from the PH model means that the
BCNU group has about a 17% lower risk than the placebo group. The AFT estimate of 1.25 implies
that the BCNU group ages more slowly (along the survival scale) by 25% when compared to the
individuals in the placebo group. It estimates that 50% of individuals in the placebo group
will die in about 32 weeks, while it would take about 40 weeks for individuals receiving
BCNU to reach the same mortality rate. Similarly, the AH estimate of 0.54 suggests that the
BCNU treatment decelerates the hazard progression of the placebo group by 46%. This is seen
as a beneficial effect since the hazard function increases after the initial decline in weeks 1
to 4.
Model   β        SE(β)              P-value            e^β     95% CI for e^β
PH      -0.191   0.151              0.207              0.827   (0.62, 1.11)
AFT      0.223   0.164              0.170              1.249   (0.91, 1.72)
AH      -0.609   0.1004 (1.0985*)   0.0545 (0.5791*)   0.544   (0.45, 0.66) (0.06, 4.68)*
Table 3.5: Estimated treatment effects on the carmustine (BCNU) polymer disc data using
PH, AFT, and AH models. Values with * in the AH model represent bootstrapped estimates.
The p-values correspond to Wald tests of a hypothesis of no treatment effect.
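The interpretations attached to the three estimates in Table 3.5 follow from simple transformations of β. The sketch below shows one way to compute them; the function names are hypothetical, and the 32-week placebo median is the figure quoted in the text rather than something computed here.

```python
import math

def percent_hazard_reduction(beta_ph: float) -> float:
    """Percent reduction in hazard implied by a PH coefficient.

    A negative result would indicate an increase in hazard.
    """
    return 100.0 * (1.0 - math.exp(beta_ph))

def aft_scaled_time(control_time: float, beta_aft: float) -> float:
    """Time at which the treated group reaches the survival probability
    that the control group reaches at control_time, under an AFT model."""
    return control_time * math.exp(beta_aft)

# Estimates from Table 3.5 (BCNU versus placebo).
ph_reduction = percent_hazard_reduction(-0.191)   # PH: percent lower hazard
bcnu_median = aft_scaled_time(32.0, 0.223)        # AFT: weeks to 50% mortality
ah_time_factor = math.exp(-0.609)                 # AH: time-scale factor for the hazard

print(f"PH hazard reduction: {ph_reduction:.1f}%")
print(f"AFT median for BCNU: {bcnu_median:.1f} weeks")
print(f"AH time-scale factor: {ah_time_factor:.3f}")
```

Note that the AH factor rescales the time argument of the baseline hazard, so whether it is beneficial depends on whether that hazard is rising or falling.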
Figure 3.19 displays the non-parametric cumulative hazard (top) and the survivor curves
(bottom), overlaid on the fitted curves from the three models. The PH model does not capture
the gap between the two curves from weeks 20 to 32 very well. In fact, the residual plot
in Figure 3.20 shows a deviation from the proportionality assumption. The AFT Weibull
model fails to capture the shape of the survivor curves, with the estimated Weibull shape
parameter being 0.92 (SE=0.05). Interestingly, the AH model gives a fairly good fit to the
data except from weeks 36 onwards, where the model displays a crossing of the survivor
curves when it should not.
[Figure 3.18 plot omitted. Panel title: Smoothed Hazard Curve; x-axis: Follow-up Time in Weeks (0 to 52); y-axis: Hazard Rate; legend: Placebo, BCNU.]
Figure 3.18: Smoothed hazard curves for the placebo (black solid) and BCNU polymer
(red dashed) for the treatment of brain tumor.
[Figure 3.19 plots omitted. Six panels against Follow-up Time (0 to 52). Top row: Hazard for the PH Model, AFT Model, and AH Model. Bottom row: P(Survival) for the same three models. Legend: Kaplan-Meier, Fitted; Placebo, BCNU.]
Figure 3.19: Top: Cumulative hazard curves for the BCNU polymer disc data using the PH,
AFT, and AH models. The non-parametric estimates are shown in dashed lines, while the
estimated cumulative hazard curves are shown in solid lines for the placebo group (black)
and the BCNU polymer group (red). Bottom: Kaplan-Meier survivor curves for the placebo
group (black dashed) and the BCNU polymer group (red dashed), with the fitted survivor
curves shown in solid lines.
[Figure 3.20 plot omitted. Panel title: Cox Proportional Hazards Residual Plot; x-axis: Time (6.4 to 44); y-axis: Beta(t) for treat.]
Figure 3.20: Residual plot from the fit of the proportional hazards model to the brain tumor
data.
3.6 Summary AH Parameter Interpretation
Parameter interpretation for the AH model is more complicated than the PH or AFT mod-
els because it relies on a careful assessment of the underlying hazard curve. In practice,
the hazard function can be estimated by the use of kernel smoothing methods, which can
give slightly different shapes depending on the bandwidth chosen. The sign of the estimate
cannot in itself determine a favourable or harmful treatment outcome. Instead, it needs to
be interpreted with the understanding of the shape of the underlying baseline hazard func-
tion. In cases where the baseline hazard function is non-monotone, interpretation may be
difficult.
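To illustrate why the sign of the estimate alone does not settle the direction of the effect, the sketch below assumes the usual accelerated hazards form, in which the hazard in the treated group is the baseline hazard evaluated at a rescaled time, h(t | treated) = h0(t e^β). The Weibull baselines and all names are hypothetical choices made only for this illustration.

```python
import math

def weibull_hazard(t: float, shape: float, scale: float = 1.0) -> float:
    """Weibull hazard: increasing for shape > 1, decreasing for shape < 1."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)

def ah_hazard(t: float, beta: float, baseline_hazard) -> float:
    """Assumed AH form: treated hazard = baseline hazard at time t * exp(beta)."""
    return baseline_hazard(t * math.exp(beta))

beta = -0.609  # a negative AH estimate, as in the brain tumor example
t = 10.0

def increasing(u: float) -> float:
    return weibull_hazard(u, shape=1.5)

def decreasing(u: float) -> float:
    return weibull_hazard(u, shape=0.5)

# Ratio of treated to baseline hazard at the same follow-up time.
ratio_increasing = ah_hazard(t, beta, increasing) / increasing(t)
ratio_decreasing = ah_hazard(t, beta, decreasing) / decreasing(t)
print(f"increasing baseline: ratio = {ratio_increasing:.3f}")
print(f"decreasing baseline: ratio = {ratio_decreasing:.3f}")
```

With the same negative β, the treated hazard falls below the baseline when the baseline hazard is increasing and rises above it when the baseline hazard is decreasing.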
Table 3.6 shows a summary of features of the five datasets. All of the datasets presented
had crossing hazards, except for the CABG data, which makes the AH model a plausible
choice in data fitting. In general, when the proportionality assumption is not grossly violated,
the proportional hazards model gives reasonable estimates and is the preferred model
choice. The AH model is superior only when there is crossing in both the hazard and
survivor functions, and when the hazard function is not flat (e.g. kidney catheter data). In
this particular case, the AH model was able to capture the crossing of the survivor curves,
which neither the PH nor the AFT model can accommodate. In the brain tumor data, both
the placebo and BCNU groups have similar hazards at time 0. The AH model gave a good
fit to the BCNU group but the fit to the placebo group had problems past week 28.
Dataset               Hazard same at t=0?   Hazards Cross?   Survivor Curves Cross?   Best Model
Breast Cancer         No                    Yes (tail)       No                       PH
CABG                  No                    No               No                       PH
Veteran Lung Cancer   No                    Yes              Yes                      None
Kidney Catheter       No                    Yes              Yes                      AH
Brain Tumor           Yes                   Yes              No                       AH (treatment)
Table 3.6: Summary of features of the five datasets considered in the chapter, and the best
model fitted in each dataset. The column headers reflect features seen from the estimated
Kaplan-Meier curves, without regard for significance of these effects.
Chapter 4
Exploring the fit of the PH and AH survivor curves when the model has crossing hazards
We investigate the performance of the proportional hazards and the accelerated hazards
models for a two-sample scenario when the true underlying hazard and survivor curves
cross. Our interest is mainly to see how well the accelerated hazards model can fit data
with crossing curves, but which are not of the AH family. Although we know that the pro-
portional hazards model cannot handle data with crossing hazards and/or survivor curves,
we use it here as a reference since the PH model is widely used in practice, at times even
when violations of the proportionality assumption are observed. This limited investigation
will then also provide some insight on the behaviour of PH regression when the PH model
is false.
We induce a non-AH model with crossing survivor curves through the use of a loglogistic
distribution with density function
$f(t) = \frac{a}{s}\left(\frac{t}{s}\right)^{a-1} \Big/ \left(1 + \left(\frac{t}{s}\right)^{a}\right)^{2}$,
hazard function
$h(t) = \frac{a}{s}\left(\frac{t}{s}\right)^{a-1} \Big/ \left(1 + \left(\frac{t}{s}\right)^{a}\right)$,
and corresponding survivor function
$S(t) = 1 \Big/ \left(1 + \left(\frac{t}{s}\right)^{a}\right)$. Here,
a is the shape parameter, while s is the scale parameter of the distribution. In a two-sample
(treatment/control) situation, survivor curves for the two groups cross if the treatment ef-
fect (denoted as β) is incorporated in the shape parameter of the loglogistic distribution.
Therefore, the shape and scale parameters for the control group are a and s, while the
corresponding parameters for the treatment group are $ae^{\beta}$ and s, respectively. The hazard
function of the loglogistic distribution is monotonically decreasing when the shape param-
eter is ≤ 1, and increases to a peak then decreases when the shape parameter is > 1 (see
Figure 4.1). The scale parameter, s, modulates the narrowing or widening of the hazard
curve on the time scale.
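The loglogistic density, hazard, and survivor functions given above can be written down directly, which makes it easy to check the stated hazard shapes. This is a minimal sketch; the function names and the evaluation grid are illustrative choices.

```python
def loglogistic_survivor(t: float, a: float, s: float) -> float:
    """S(t) = 1 / (1 + (t/s)^a)."""
    return 1.0 / (1.0 + (t / s) ** a)

def loglogistic_hazard(t: float, a: float, s: float) -> float:
    """h(t) = (a/s) (t/s)^(a-1) / (1 + (t/s)^a)."""
    return (a / s) * (t / s) ** (a - 1.0) / (1.0 + (t / s) ** a)

def loglogistic_density(t: float, a: float, s: float) -> float:
    """f(t) = (a/s) (t/s)^(a-1) / (1 + (t/s)^a)^2."""
    return (a / s) * (t / s) ** (a - 1.0) / (1.0 + (t / s) ** a) ** 2

# Evaluate the hazard on a grid for the three shapes shown in Figure 4.1 (s = 1).
grid = [0.1 * k for k in range(1, 101)]
hazards = {a: [loglogistic_hazard(t, a, 1.0) for t in grid] for a in (0.5, 1.0, 1.5)}

for a, values in hazards.items():
    peak_time = grid[values.index(max(values))]
    print(f"a = {a}: largest hazard on the grid occurs at t = {peak_time:.1f}")
```

For shapes of 0.5 and 1 the largest hazard sits at the left edge of the grid, consistent with a decreasing hazard, while for a shape of 1.5 it occurs at an interior point.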
[Figure 4.1 plot omitted. Panel title: Loglogistic Hazard Curves; x-axis: Time (0 to 10); y-axis: Hazard; legend: a=1, a=0.5, a=1.5.]
Figure 4.1: Hazard curves for the loglogistic distribution with shape parameters, a=0.5, 1,
and 1.5 and scale parameter, s=1.
We conducted a simulation study with a sample size of 100, with half of the observa-
tions in each of the control and treatment groups, using two different scenarios - a case
where both control and treatment groups do not have equal hazards at time 0, and a case
where they do. The fits of the models are compared by computing the mean square error
of the mean estimated survivor curves from 1000 runs. In both cases, to make a fair com-
parison of the treatment curves, the fitted curve for the PH model is truncated to match that
of the AH model. Recall that the AH model scales the time, and therefore, the timeline
for the predicted treatment survivor curve is a scaled function of that of the control group’s
survivor curve.
4.1 Case I
We explore the performance of the AH model when the control and treatment groups do
not have equivalent risk at the start of the study. A loglogistic distribution having a shape
parameter of 1.5 and a scale parameter of 4 was chosen for the control group and a shape
of $1.5e^{\beta}$ for the treatment group, with β = −1. The hazard and survivor curves for both
groups are displayed in Figure 4.2, with the control group denoted as the baseline. We note
that a crossing occurs at around t=4, which is about the median (50th percentile) for both
groups. This implies that half of the subjects in each group have failed by time=4. This
particular choice of parameters also induces a sharper decline in the survival curve for the
treatment group early in the study, but the rate of failure tapers off as time progresses.
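The location of the crossing follows from the survivor function: when both groups share the scale s and differ only in shape, the two curves agree where (t/s) raised to either shape gives the same value, which for t > 0 and β ≠ 0 happens only at t = s, where both survivor functions equal 1/2. The sketch below checks this for the Case I parameters; the function name is an illustrative choice.

```python
import math

def loglogistic_survivor(t: float, shape: float, scale: float) -> float:
    """S(t) = 1 / (1 + (t/scale)^shape)."""
    return 1.0 / (1.0 + (t / scale) ** shape)

# Case I parameters: control shape 1.5, common scale 4, beta = -1.
shape_control, scale, beta = 1.5, 4.0, -1.0
shape_treatment = shape_control * math.exp(beta)

curves = {}
for t in (1.0, 4.0, 16.0):
    curves[t] = (
        loglogistic_survivor(t, shape_control, scale),
        loglogistic_survivor(t, shape_treatment, scale),
    )
    print(f"t = {t:5.1f}: control S = {curves[t][0]:.3f}, treatment S = {curves[t][1]:.3f}")
```

Before t = 4 the treatment curve lies below the control curve, and after t = 4 it lies above, matching the early sharp decline and later tapering described in the text.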
Figure 4.3 shows the predicted baseline and treatment survivor curves of the two mod-
els, with the mean fitted curves from the PH model displayed in the panels on the left, and
the mean fitted curves from the AH model displayed in the panels on the right, for three
effect sizes, β = −0.5,−1,−1.5. Both models failed to capture the crossing of the sur-
vivor curves even when the effect size is large at -1.5. Alternatively, Figure 4.4 displays a
comparison of the mean fitted PH and AH baseline and treatment curves against the true
loglogistic survivor curves. When the effect size is small (-0.5), both the PH and AH models
have comparable mean squared errors (MSE). Mean squared error is defined here as
the average of the sum of the squared differences between the true survivor curve and the
estimated mean survivor curve over a grid of values, i.e.
$\mathrm{MSE} = \frac{1}{m-1}\sum_{i=1}^{m}(\text{true}_i - \text{estimate}_i)^2$, where m is the
number of grid points. For a moderate effect size (-1), both models give a similar fit, by
[Figure 4.2 plots omitted. Left panel: Hazard Curves, Risk against Time (0 to 20). Right panel: Survivor curves, P(Survival) against Time (0 to 20). Legend: Baseline, Treatment.]
Figure 4.2: Hazard (left) and survivor (right) curves for a loglogistic distribution with
shape=1.5 and scale=4 for a treatment effect size of -1.
underestimating the survival rate in the earlier times, and overestimating the survival rates
at later times. This pattern is consistent when the effect size is larger, at -1.5. The MSEs of
the mean fitted curves for the treatment group based on the AH model are smaller than the
corresponding values from the fit of the PH model.
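The mean squared error used in these comparisons can be sketched as a short function. This is an illustration under the definition given in the text (squared differences over a grid, divided by the number of grid points minus one); the function name and the toy curves are hypothetical.

```python
def grid_mse(true_curve, estimated_curve) -> float:
    """Sum of squared differences over the grid, divided by (m - 1),
    where m is the number of grid points."""
    m = len(true_curve)
    if m != len(estimated_curve):
        raise ValueError("curves must be evaluated on the same grid")
    if m < 2:
        raise ValueError("at least two grid points are needed")
    total = sum((t - e) ** 2 for t, e in zip(true_curve, estimated_curve))
    return total / (m - 1)

# Toy example: a survivor curve on three grid points and a slightly off estimate.
true_values = [1.0, 0.5, 0.0]
estimates = [1.0, 0.4, 0.1]
example_mse = grid_mse(true_values, estimates)
print(f"MSE = {example_mse:.4f}")
```

In the simulation study the estimated curve would be the mean of the fitted survivor curves across runs, or the fitted curve from a single run for the boxplots.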
The boxplots in Figure 4.5 show the distribution of the MSEs of the estimates of the
survivor function for the control and treatment groups, from each of the 1000 runs based
on both AH and PH models. The distribution of these MSEs based on the AH model has
a consistently heavier tail than that based on the PH model in all scenarios. However,
compared to those based on the PH model, the median MSE values based on the AH model are
generally similar or smaller for the estimated survivor function of the treatment group, and
larger for the estimated survivor function of the control group.
[Figure 4.3 plots omitted. Six panels of P(Survival) against Time (0 to 40). Left column: PH; right column: AH. Rows: β = −0.5, β = −1, β = −1.5. Legend: Baseline, Treatment, Fitted PH (or AH) Baseline, Fitted PH (or AH) Treatment.]
Figure 4.3: Comparison of the PH and AH fits for effect sizes of -0.5 (top), -1 (middle),
and -1.5 (bottom) when the hazards do not start at the same point at t=0. The dashed curves
display true survivor functions corresponding to the baseline and treatment groups.
[Figure 4.4 plots omitted. Six panels of P(Survival) against Time (0 to 40). Rows: β = −0.5, β = −1, β = −1.5. Legend: Baseline, Treatment, Fitted PH, Fitted AH. MSE values shown in the panel legends:
β = −0.5: left panel PH 8.0E−4, AH 5.8E−4; right panel PH 8.9E−4, AH 2.1E−4
β = −1: left panel PH 2.6E−3, AH 3.5E−3; right panel PH 3.4E−3, AH 0.8E−3
β = −1.5: left panel PH 5.0E−3, AH 1.2E−2; right panel PH 6.2E−3, AH 1.9E−3]
Figure 4.4: Comparison of the fitted PH and AH curves with the true survivor curves
(shown with dashed lines) for three effect sizes (-0.5, -1, -1.5), with the fitted baseline
curves on the left panels, and the fitted treatment curves on the right panels.
[Figure 4.5 plots omitted. Six boxplot panels comparing AH and PH. Top row: Baseline curve MSE (axis 0.000 to 0.030) for β = −0.5, −1, −1.5. Bottom row: Treatment curve MSE (axis 0.00 to 0.06) for β = −0.5, −1, −1.5.]
Figure 4.5: Boxplots of mean squared errors for the AH and PH models for β = -0.5, -1, -1.5
in Case I, for a sample size of 100, for 1000 simulation runs.
4.2 Case II
This section explores the performance of the AH model when the control and treatment
groups have equivalent risk at the beginning of the study. A loglogistic distribution having
a shape parameter of 4 and a scale parameter of 50 has hazard and survivor curves that cross
when the treatment curve is taken from the same distribution but with a shape parameter
of $4e^{\beta}$, β ≠ 0. Figure 4.6 displays the hazard and survivor curves for the loglogistic distribution
with β = −1. We restrict our attention to the time interval (0,100] by incorporating
fixed censoring at t=100. This censoring scheme corresponds to about a 20% censoring
percentage. This allows both an incorporation of censoring in this study, as well as a fo-
cus on the interesting feature of this scenario, namely the time of crossing of the survivor
curves.
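As a rough numerical check of this setup, the following Python sketch evaluates the loglogistic survivor and hazard functions for the baseline (shape 4, scale 50) and for a treatment group with shape $4e^{\beta}$. It assumes the common parametrization S(t) = 1/(1 + (t/s)^a), which the text does not state explicitly, and equal group sizes when averaging the fraction censored at t = 100. The function names are illustrative only.

```python
import math


def loglogistic_survivor(t, shape, scale):
    """S(t) = 1 / (1 + (t/scale)^shape); the parametrization is an assumption."""
    return 1.0 / (1.0 + (t / scale) ** shape)


def loglogistic_hazard(t, shape, scale):
    """h(t) = (shape/t) * u / (1 + u) with u = (t/scale)^shape, for t > 0."""
    u = (t / scale) ** shape
    return (shape / t) * u / (1.0 + u)


def censoring_fraction(beta, shape=4.0, scale=50.0, cutoff=100.0):
    """Expected fraction still event-free at the fixed censoring time,
    averaged over two equally sized groups (equal allocation is assumed)."""
    s0 = loglogistic_survivor(cutoff, shape, scale)
    s1 = loglogistic_survivor(cutoff, shape * math.exp(beta), scale)
    return 0.5 * (s0 + s1)


if __name__ == "__main__":
    beta = -1.0
    scale = 50.0
    shape0 = 4.0
    shape1 = 4.0 * math.exp(beta)
    # Print baseline and treatment survival on either side of the crossing.
    for t in (25.0, 50.0, 75.0, 100.0):
        print(t,
              round(loglogistic_survivor(t, shape0, scale), 3),
              round(loglogistic_survivor(t, shape1, scale), 3))
    print("fraction censored at t=100:", round(censoring_fraction(beta), 3))
```

Under this parametrization both survivor curves equal 0.5 at t = 50 whatever the shape parameter, so the crossing sits at the scale parameter: the baseline curve is higher before t = 50 and the treatment curve is higher afterwards.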
Figure 4.6: Hazard (left) and survivor (right) curves for a loglogistic distribution with shape
parameter a = 4 and scale parameter s = 50, for a treatment effect size of β = -1.
[Panels: Hazard Curves (Risk against Time) and Survivor curves (P(Survival) against Time), each showing the Baseline and Treatment groups.]
The mean fitted baseline and treatment curves based on both the PH (left panels) and
AH (right panels) models are shown in Figure 4.7. Although the AH model was able to
capture the crossing of the survivor curves, the predicted curves do not provide an adequate
fit to the data. Both models had trouble fitting the earlier part of the survivor curves but
gave a reasonable fit in the latter part of the curve. Alternatively, Figure 4.8 gives a
comparison of the mean fitted PH and AH baseline and treatment survivor curves against the
true loglogistic survivor curves. It is interesting to note that the AH model fits the baseline
curve reasonably well, having lower MSEs than the PH model for β = -1 and -1.5 and a
comparable MSE for β = -0.5. However, the opposite is observed in estimating the treatment
survivor curve, where the PH model has lower MSEs than the AH model for all three effect sizes.
Figure 4.7: Comparison of the PH and AH fits for effect sizes of -0.5 (top), -1 (middle), and
-1.5 (bottom) when the hazards start at the same point at t=0. The dashed curves display
true survivor functions corresponding to the baseline and treatment groups.
[Panels: PH Model (left) and AH Model (right), P(Survival) against Time on (0, 100], each showing the true Baseline and Treatment curves and the fitted baseline and treatment curves.]
The boxplots in Figure 4.9 show the distribution of the MSEs of the predicted baseline
and treatment survivor curves over the 1000 simulation runs for both the AH and PH
models. In all scenarios, the distribution of MSEs for the fits based on the AH model has
a consistently heavier tail than that from the PH model. The median MSE values of the
fitted baseline curves from the AH model are generally similar to or smaller than those
from the fitted PH model, but are larger for the estimated survivor function of the
treatment group. This pattern is opposite to what was observed in Case I.
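To make the quantities summarized in the boxplots concrete, here is a minimal Python sketch of one way to compute the MSE of a fitted survivor curve against the true curve on a time grid, and then summarize the MSEs across runs. The evenly spaced grid on (0, 100], the helper names, and the stand-in "fitted" curves (loglogistic curves with a perturbed shape parameter) are assumptions made for illustration. They are not the fitted PH or AH curves from the simulation study, and the MSE definition used in the study may differ.

```python
import statistics


def true_survivor(t, shape=4.0, scale=50.0):
    """Baseline loglogistic survivor function (parametrization assumed)."""
    return 1.0 / (1.0 + (t / scale) ** shape)


def curve_mse(fitted, truth, grid):
    """Mean squared difference between two survivor curves over a time grid."""
    return sum((fitted(t) - truth(t)) ** 2 for t in grid) / len(grid)


def summarize(mses):
    """Median and upper-tail summaries, the features a boxplot makes visible."""
    ordered = sorted(mses)
    upper = ordered[int(0.95 * (len(ordered) - 1))]
    return {"median": statistics.median(ordered), "p95": upper, "max": ordered[-1]}


# Evaluation grid on (0, 100], matching the fixed censoring time.
grid = [float(t) for t in range(1, 101)]

# Hypothetical stand-ins for fitted curves: one perturbed shape per "run".
shapes = [3.0 + 0.1 * k for k in range(21)]
mses = [
    curve_mse(lambda t, a=a: true_survivor(t, shape=a), true_survivor, grid)
    for a in shapes
]

print(summarize(mses))
```

Comparing the median with the upper-tail summaries for each model is one simple way to express the "heavier tail" seen in the boxplots as numbers.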
Figure 4.8: Comparison of the fitted PH and AH curves with the true survivor curves
(shown with dashed lines) for three effect sizes (-0.5, -1, -1.5), with the fitted baseline
curves on the left panels, and the fitted treatment curves on the right panels. Fixed censoring
was done at t=100.
[Panel legends, MSE in units of 10^-3. Baseline panels: β = -0.5: PH 1.3, AH 1.4; β = -1: PH 4.8, AH 2.7; β = -1.5: PH 10.0, AH 3.3. Treatment panels: β = -0.5: PH 1.8, AH 3.9; β = -1: PH 4.1, AH 17.9; β = -1.5: PH 8.1, AH 38.2.]
Figure 4.9: Boxplots of mean squared errors for the AH and PH models for β = -0.5, -1, -1.5 in Case II, for a sample size of 100, for 1000 simulation runs.
[Panels: Baseline curve MSE and Treatment curve MSE, one pair of AH and PH boxplots per effect size.]
4.3 Conclusion
Both cases investigated in this chapter allowed for a crossing between the baseline and
treatment survivor curves. In Case I, we showed a scenario where the differences between
the survivor rates for the baseline and treatment groups before the crossing are relatively
small, and may not be clinically significant. In this case, both the PH and AH models
captured the main feature of the data: the wide gap between the baseline and treatment
survivor curves after the cross-over. In Case II, the gaps between the baseline and
treatment survivor functions before and after the cross-over are both substantial. The PH
model essentially fits the data by smoothing out the before-and-after effect, and may lead to
no evidence of a difference between groups. The AH model acknowledges the cross-over
effect but does not estimate it well.
In our exploration, we found that the AH model had difficulties fitting crossing survivor
curves which are not from the AH family. Even when the true survivor curves cross, the
fitted AH curves may not. The PH model tends to form a middle ground between the
baseline and treatment groups, underestimating whichever survivor curve is higher and
overestimating whichever is lower, both before and after the cross-over. This suggests an
important need for the development and use of powerful tests of the PH form before
adopting a PH model.
Chapter 5
Discussion
This project investigated the performance of the Accelerated Hazards (AH) model relative to
the more commonly used Proportional Hazards (PH) and Accelerated Failure Time (AFT) models.
Unlike the PH model, the AH model allows for cross-overs of the survivor functions of the
cohorts under study. Since the AH model assumes that hazards for different cohorts are
time-scaled versions of the same function, it is useful to assess this assumption by first
empirically examining the shape of the hazard curves in a k-sample problem before using
the model. The simulation studies performed in Chapter 2 led us to propose the
non-parametric bootstrap as an alternative method for estimating the variance for the
AH model in small samples, as the performance of the Wald statistic was quite poor.
Analyses of five different datasets were performed to explore how the fit of the AH
model compared to the PH and AFT models. When the shapes of the hazard curves show the
same pattern but have different time scales, as seen in the brain tumor example, and when
the hazards of the groups at time 0 are similar, this model may provide a useful
alternative to the other, more frequently used, models. However, when the shapes
of the estimated hazard functions differ substantially, such that one group’s curve cannot
be empirically described as a scaled version of the other (e.g. the lung cancer example), and
when the estimated hazards at time 0 vary greatly between groups, as seen in the breast
cancer study, then difficulties may arise when attempting to fit the AH model. Note also
that interpretation of the parameters of the AH model may be difficult or not particularly
informative when the hazard function is non-monotone, especially when either the hazard
or survivor curves cross. For example, in the case where the hazard curve decreases at the
beginning of the study and then increases later, the treatment group may be preferable
(having lower risk) over the control group in the beginning but have higher risk in the
latter part of the study. In such cases, the researcher’s interest may lie in estimating the
time when the hazard curves reach their peaks or troughs rather than quantifying how much
faster the hazard of one group is accelerated over the other.
In Chapter 4, simulation studies were conducted to quantify the goodness of fit of the
AH model both when the hazards of the control and treatment groups did not have similar
values at time 0 (Case I), and when they did (Case II). In Case I, the shapes of the underly-
ing hazard curves were very different from each other. Although both PH and AH models
failed to capture the crossing in the beginning, the AH model had smaller MSE than the PH
model. In Case II, the shapes of the true hazard functions were similar but one group had
a considerably higher peak than the other, making it difficult for the AH model to adapt,
although the crossing of the survivor curves was captured slightly when the treatment effect
was large.
Chen & Jewell (2001) proposed a general class of models that includes the proportional
hazards, accelerated failure time and accelerated hazards models. For the two-sample
problem, for example, the model has two parameters: one parameter ($\beta_1$) quantifies the
time scale change of the hazard function for the treatment group relative to the control,
and the other ($\beta_2$) measures the proportionality of the hazard curves of the two
groups. The model is written as $h_1(t) = h_0(te^{\beta_1})e^{\beta_2}$, where $h_1$ is the
hazard function for the treatment group and $h_0$ is the hazard function for the control
group. Though fitting this model is not straightforward, it would be interesting to explore
whether a scenario such as Case II of Chapter 4 can be estimated well by this more general model.
Another idea might be to permit individual-specific flexibility in the AH model through
frailty terms which operate on the time scale of the hazard function, in a similar manner
to the covariate effects. As well, though semiparametric methods are quite popular, splines
are increasingly used for modeling baseline intensity functions; in the AH model,
using a simple cubic spline, say, with one or two linear knots might be useful.
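The way the two-parameter hazard of Chen & Jewell (2001) nests the three simpler models can be illustrated with a short Python sketch. It evaluates $h_0(te^{\beta_1})e^{\beta_2}$ for a loglogistic baseline and checks the special cases: $\beta_1 = 0$ gives the PH form, $\beta_2 = 0$ gives the AH form, and $\beta_1 = \beta_2$ gives the AFT form. The loglogistic baseline with shape 4 and scale 50, its parametrization, and all function names are assumptions chosen for illustration. This is not a fitting procedure.

```python
import math


def loglogistic_survivor(t, shape=4.0, scale=50.0):
    """Baseline survivor function S0(t) (parametrization assumed)."""
    return 1.0 / (1.0 + (t / scale) ** shape)


def loglogistic_hazard(t, shape=4.0, scale=50.0):
    """Baseline hazard h0(t) for t > 0, matching loglogistic_survivor."""
    u = (t / scale) ** shape
    return (shape / t) * u / (1.0 + u)


def general_hazard(t, beta1, beta2, h0=loglogistic_hazard):
    """Two-sample general model: h1(t) = h0(t * exp(beta1)) * exp(beta2)."""
    return h0(t * math.exp(beta1)) * math.exp(beta2)


def ph_hazard(t, beta, h0=loglogistic_hazard):
    """Proportional hazards: h1(t) = h0(t) * exp(beta)."""
    return h0(t) * math.exp(beta)


def ah_hazard(t, beta, h0=loglogistic_hazard):
    """Accelerated hazards: h1(t) = h0(t * exp(beta))."""
    return h0(t * math.exp(beta))


def aft_hazard(t, beta, h0=loglogistic_hazard):
    """Accelerated failure time: h1(t) = h0(t * exp(beta)) * exp(beta),
    which corresponds to the survivor function S1(t) = S0(t * exp(beta))."""
    return h0(t * math.exp(beta)) * math.exp(beta)


if __name__ == "__main__":
    beta = -1.0
    for t in (10.0, 50.0, 90.0):
        print(t,
              general_hazard(t, 0.0, beta) == ph_hazard(t, beta),
              general_hazard(t, beta, 0.0) == ah_hazard(t, beta),
              general_hazard(t, beta, beta) == aft_hazard(t, beta))
```

Writing the model this way makes it clear that $\beta_1$ only rescales the time axis of the baseline hazard while $\beta_2$ only rescales its height, which is why a scenario with similar hazard shapes but different peak heights might call for both parameters.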
As an attempt to describe the goodness of fit of the PH and AH models, we examined the
mean squared error of the mean estimate of the survivor curves. Formal goodness of fit tests
were not explored in this project, but Chen (2001) has explored tests for model
adequacy such as the Gill-Schumacher and Kolmogorov-Smirnov tests. We suggest that the
topic of goodness of fit for these models, particularly with regard to simple diagnostics
and plots for assessing departures from model assumptions, deserves greater study. The
incorporation of weight functions in the estimating equation presented in Chapter 2 for
the AH model might also be given further consideration. We conclude by noting that the
PH model tends to fit data reasonably well provided that the proportionality
assumption is not grossly violated.
Appendix A
Appendix
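Chen & Wang (2000) showed that the solution to (2.23), $\hat{\beta}_{AH}$, yields asymptotically normal
estimates with variance of the form
\[
\Sigma = \frac{\beta_{AH}^{2} \int_0^{T_0} g(t,\beta_{AH})\, h_0(t)\, dt}
              {\left\{ \int_0^{T_0} g(t,\beta_{AH})\, h_0'(t)\, t\, dt \right\}^{2}},
\]
where
\[
g(t,\beta_{AH}) = [1-\pi(t,\beta_{AH})]\, \pi(t,\beta_{AH})
                  \left\{ \rho S_0^{*}(t) + (1-\rho)\, S_1^{*}(t/\beta_{AH})/\beta_{AH} \right\},
\]
\[
\pi(t,\beta_{AH}) = \frac{(1-\rho)\, S_1^{*}(t/\beta_{AH})/\beta_{AH}}
                         {\rho S_0^{*}(t) + (1-\rho)\, S_1^{*}(t/\beta_{AH})/\beta_{AH}},
\]
\[
\rho = \lim_{n\to\infty} \frac{n_0}{n_0+n_1} > 0.
\]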
The estimation procedure for Σ as outlined in the Appendix of Chen & Wang (2000)
requires an estimate of the first and second derivatives of the unknown baseline hazard
function, $h^{AH}_0(t)$. In practice, this may be difficult to implement, although several strategies
have been put forward by Tsiatis (1990) and Lin et al. (1998). In a subsequent paper, Chen
& Jewell (2001) suggested a variance estimation method that relies on a large-sample
approximation and does not need an estimate of the baseline hazard function.
The method was adapted from a technical report by Huang that was subsequently published
(Huang, 2002). The method can be performed as follows:
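1. Find the estimate $\hat{\beta}_{AH}$ by solving $U(\hat{\beta}_{AH}-)\, U(\hat{\beta}_{AH}+) \le 0$ (a zero crossing of
$U(\beta_{AH})$), where $U(\beta_{AH})$ is defined as in (2.23).
2. Decompose the estimate of the variance, $\hat{\Sigma}$, into $\sigma\sigma'$, where
\[
\hat{\Sigma} = n^{-1} \sum_{i=1}^{n} \int_0^{t} (z_i - \bar{z}(t))^{2}\, dN_i(te^{-z_i\hat{\beta}_{AH}}), \qquad \text{(A.1)}
\]
and
\[
\bar{z}(t) = \frac{\sum_{i=1}^{n} Y_i(te^{-z_i\hat{\beta}_{AH}})\, z_i}{\sum_{i=1}^{n} Y_i(te^{-z_i\hat{\beta}_{AH}})}
           = \frac{\sum_{i=1}^{n} I(X_i \ge te^{-z_i\hat{\beta}_{AH}})\, z_i}{\sum_{i=1}^{n} I(X_i \ge te^{-z_i\hat{\beta}_{AH}})}.
\]
3. Solve for b such that $U(b) = \sigma$.
4. A variance estimate of $\sqrt{n}(\hat{\beta}_{AH} - \beta_{AH})$ is $(b - \hat{\beta}_{AH})^{2}$, where $\beta_{AH}$ is the true β, and
$\hat{\beta}_{AH}$ is the estimate of $\beta_{AH}$ from Step 1.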
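The logic of Steps 1, 3 and 4 can be sketched in Python for a scalar parameter. The sketch takes the estimating function as a callable and σ as a given number, so Step 2 is not implemented. The linear function used in the example is a hypothetical stand-in and is not U from (2.23); the helper names, the bisection search and the bracketing interval are also assumptions, and any scaling by n is left aside.

```python
def find_crossing(func, lo, hi, tol=1e-10, max_iter=200):
    """Bisection search for a sign change of func on [lo, hi].

    Only a sign change is required, so this also locates the zero crossing
    of a step function, up to the tolerance.
    """
    f_lo = func(lo)
    f_hi = func(hi)
    if f_lo * f_hi > 0:
        raise ValueError("func must change sign on [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid              # the crossing lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid  # the crossing lies in [mid, hi]
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def variance_without_slope(score, sigma, lo, hi):
    """Steps 1, 3 and 4: returns (beta_hat, variance estimate).

    score : callable standing in for the estimating function U
    sigma : value with sigma**2 equal to the estimated variance of U (Step 2)
    """
    beta_hat = find_crossing(score, lo, hi)                     # Step 1
    b = find_crossing(lambda x: score(x) - sigma, lo, hi)       # Step 3
    return beta_hat, (b - beta_hat) ** 2                        # Step 4


if __name__ == "__main__":
    slope = -2.5   # hypothetical slope of the estimating function
    sigma = 0.5    # hypothetical value from Step 2

    def toy_score(x):
        return slope * (x - 0.7)

    beta_hat, var_hat = variance_without_slope(toy_score, sigma, -5.0, 5.0)
    print(beta_hat, var_hat, sigma ** 2 / slope ** 2)
```

For an estimating function that is approximately linear near its root with slope A, solving U(b) = σ gives b − β̂ ≈ σ/A, so (b − β̂)² ≈ σ²/A². This is the usual sandwich form, obtained without estimating the slope, and hence without estimating the baseline hazard or its derivatives.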
Bibliography
[1] H. Brem, S. Piantadosi, P. Burger, M. Walker, R. Selker, N. Vick, K. Black, M. Sisti,
S. Brem, G. Mohr, P. Muller, and R. Morawetz. Placebo-controlled trial of safety
and efficacy of intraoperative controlled delivery by biodegradable polymers of
chemotherapy for recurrent gliomas. The Lancet, 345:1008–1012, 1995.
[2] N. Breslow. Covariance analysis of censored survival data. Biometrics, 30:89–99,
1974.
[3] Y.Q. Chen and N.P. Jewell. On a general class of semiparametric hazards regression
models. Biometrika, 88:687–702, Sept 2001.
[4] Y.Q. Chen and M.C. Wang. Analysis of accelerated hazards model. Journal of the
American Statistical Association, 95(450):608–618, June 2000.
[5] Y.Q. Chen and M.C. Wang. Estimating a treatment effect with the accelerated hazards
model. Controlled Clinical Trials, 21:369–380, 2000.
[6] D.R. Cox. Regression models and life-tables. Journal of the Royal Statistical Society
Series B, 34:187–220, 1972.
[7] M. Ghahramani, C. Dean, and J. Spinelli. Simultaneous modelling of operative
mortality and long-term survival after coronary artery bypass surgery. Statistics in
Medicine, 20:1931–1945, 2001.
[8] Y.J. Huang. Calibration regression of censored lifetime medical cost. Journal of the
American Statistical Association, 97:318–327, 2002.
[9] K. Humphries, M. Gao, A. Pu, S. Lichtenstein, and C. Thompson. Significant
improvement in short-term mortality in women undergoing coronary artery bypass
surgery (1991 to 2004). Journal of the American College of Cardiology, 49:1552–1558, 2007.
[10] J.D. Kalbfleisch and R.L. Prentice. The Statistical Analysis of Failure Time Data.
Wiley: New York, 1980.
[11] J. Lawless. Statistical Models and Methods for Lifetime Data. John Wiley and Sons:
Hoboken, 2003.
[12] D. Lin, L. Wei, and Z. Ying. Accelerated failure time models for counting processes.
Biometrika, 85:605–618, 1998.
[13] C. McGilchrist and C. Aisbett. Regression with frailty in survival analysis.
Biometrics, 47:461–466, 1991.
[14] S. Piantadosi. Clinical Trials: A Methodological Perspective. Wiley: New York,
1997.
[15] J. Ragaz, S. Jackson, N. Le, I. Plenderleith, and J. Spinelli. Adjuvant radiotherapy
and chemotherapy in node-positive premenopausal women with breast cancer. The
New England Journal of Medicine, 337:956–962, 1997.
[16] T. Therneau and P. Grambsch. Modeling survival data: extending the Cox model.
Springer: New York, 2000.
[17] A. A. Tsiatis. Estimating regression parameters using linear rank tests for censored
data. The Annals of Statistics, 18:354–372, 1990.
[18] L.J. Wei, Z. Ying, and D.Y. Lin. Linear regression analysis of censored survival data
based on rank tests. Biometrika, 77:845–851, 1990.