Post on 12-Oct-2020
Analysis of Interval-censored Data and Beyond II
Methods for Interval-Censored Failure Time
Data and Beyond II
(Tony) Jianguo Sun
Department of Statistics, University of Missouri
September 19, 2008
OUTLINE
• I. Analysis of Bivariate Interval-Censored Data
    I.1. An Example — AIDS Clinical Trial
    I.2. Nonparametric Maximum Likelihood Estimation
    I.3. Estimation of the Association Parameter
    I.4. Regression Analysis
• II. Analysis of Doubly Censored Data
    II.1. An Example — AIDS Cohort Study
    II.2. Nonparametric Estimation of Survival Functions
    II.3. Nonparametric Comparison of Survival Functions
    II.4. Regression Analysis
• III. Other Topics and Future Research
    III.1. Analysis with Informative Interval Censoring
    III.2. Bayesian Analysis of Interval-Censored Data
    III.3. Some Future Research Directions
I. Analysis of Bivariate Interval-Censored Data
I.1. An Example — AIDS Clinical Trial
• Subjects: 204 HIV-infected individuals
• Study: a substudy of a comparative clinical trial of three
anti-pneumocystis drugs w.r.t. the opportunistic
infection cytomegalovirus (CMV)
• Variables of interest: times to the presence of CMV
in blood and urine
• Observations: blood and urine samples were collected
and tested every 4 or 12 weeks
• Bivariate interval-censored data
• Goggins and Finkelstein (2000).
Table 1: Observed intervals in weeks for blood and urine shedding times along with the baseline CD4 status from ACTG 181
ID LB RB LU RU CD4 ID LB RB LU RU CD4
1 11 - 11 - 1 45 6 10 0 2 1
2 11 - 11 - 1 46 2 - 2 - 1
3 11 - 11 - 0 47 13 - 13 - 0
4 11 - 8 10 0 48 15 - 0 3 0
5 7 - 6 8 1 49 8 - 0 1 0
6 11 - 12 - 0 50 16 - 6 9 0
7 8 12 8 10 1 51 5 - 0 1 1
8 10 - 10 - 0 52 2 - 0 1 1
9 6 - 6 - 1 53 13 - 0 1 0
10 2 9 9 11 0 54 13 - 13 - 1
...... ......
I.2. Nonparametric Maximum Likelihood Estimation
• Consider a survival study giving bivariate interval-censored data
{Ui = (L1i, R1i] × (L2i, R2i], i = 1, ..., n}.
• Let F (t1, t2) = P (T1i ≤ t1, T2i ≤ t2) denote the cdf and
H = {Hj = (r1j, s1j] × (r2j, s2j], j = 1, ...,m}
the disjoint rectangles that constitute the regions of
possible support of the NMLE of F .
• Define pj = F (Hj) and αij = I(Hj ⊆ Ui).
• Then the likelihood function has the form
$$L(p) = \prod_{i=1}^{n} \sum_{j=1}^{m} \alpha_{ij}\, p_j$$
and the NMLE of F can be obtained by maximizing L(p)
subject to $\sum_{j=1}^{m} p_j = 1$ and $p_j \ge 0$ for all j.
• How to determine H:
Betensky and Finkelstein (1999),
Gentleman and Vandal (2001, 2002),
Bogaerts and Lesaffre (2004).
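The maximization of L(p) above can be illustrated with a small self-consistency (EM-type) iteration. This is only a sketch under stated assumptions: the rectangles H_j are taken as already determined, the indicator matrix `alpha` is supplied by the user, and the function name `npmle_em` and its arguments are hypothetical, not taken from the cited papers.

```python
def npmle_em(alpha, n_iter=500, tol=1e-10):
    """Self-consistency iteration for the masses p_j maximizing
    L(p) = prod_i sum_j alpha[i][j] * p[j].

    alpha[i][j] is 1 if candidate rectangle H_j lies inside the observed
    rectangle U_i and 0 otherwise (assumed: every row has at least one 1).
    """
    n, m = len(alpha), len(alpha[0])
    p = [1.0 / m] * m                      # start from uniform masses
    for _ in range(n_iter):
        new_p = [0.0] * m
        for row in alpha:
            # probability currently assigned to subject i's rectangle
            denom = sum(a * pj for a, pj in zip(row, p))
            for j in range(m):
                # expected share of subject i falling in H_j
                new_p[j] += row[j] * p[j] / denom
        new_p = [v / n for v in new_p]     # average the shares over subjects
        change = max(abs(a - b) for a, b in zip(new_p, p))
        p = new_p
        if change < tol:
            break
    return p


# Toy example: two candidate rectangles, four subjects.
alpha_example = [[1, 0], [1, 0], [0, 1], [1, 1]]
p_hat = npmle_em(alpha_example)
```

In the toy example the likelihood is proportional to p1² p2 (p1 + p2) with p1 + p2 = 1, so the iteration should settle near p1 = 2/3. Each update keeps the masses non-negative and summing to one, so the constraints need no separate handling.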
I.3. Estimation of the Association Parameter
— A copula model approach
• Consider a survival study giving bivariate interval-censored data
for (T1, T2) whose joint survival function is given by
S(t1, t2) = Cα(S1(t1), S2(t2)) ,
where α is a global association parameter.
• Let l(α, S1, S2) denote the log likelihood function and
$\hat S_1$ and $\hat S_2$ the marginal MLEs of S1 and S2. Then one can
estimate α by solving the equation
$$\frac{\partial\, l(\alpha, \hat S_1, \hat S_2)}{\partial \alpha} = 0 .$$
• Wang and Ding (2000), Sun, Wang and Sun (2006).
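As a rough illustration of the two-stage idea above, the sketch below plugs marginal survival estimates into a copula and maximizes the resulting pseudo log-likelihood over α. Several things here are assumptions made only for the example: the Clayton family is used for Cα, a grid search over α replaces solving the score equation, the marginal survival values at the interval endpoints are supplied by the user and are not extremely close to zero, and all function names are hypothetical.

```python
import math


def clayton(u, v, alpha):
    """Clayton copula C_alpha(u, v) for alpha > 0 (an assumed choice of family)."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** -alpha + v ** -alpha - 1.0) ** (-1.0 / alpha)


def rect_prob(s1L, s1R, s2L, s2R, alpha):
    """P(L1 < T1 <= R1, L2 < T2 <= R2) when S(t1, t2) = C_alpha(S1(t1), S2(t2)).

    s1L = S1(L1), s1R = S1(R1), s2L = S2(L2), s2R = S2(R2).
    """
    return (clayton(s1L, s2L, alpha) - clayton(s1R, s2L, alpha)
            - clayton(s1L, s2R, alpha) + clayton(s1R, s2R, alpha))


def pseudo_loglik(alpha, data):
    """Log likelihood in alpha with the marginal estimates held fixed."""
    return sum(math.log(max(rect_prob(*row, alpha), 1e-300)) for row in data)


def estimate_alpha(data, lo=0.01, hi=10.0, steps=500):
    """Grid search for the maximizer (a stand-in for solving dl/dalpha = 0)."""
    grid = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    return max(grid, key=lambda a: pseudo_loglik(a, data))


# Each row holds (S1(L1), S1(R1), S2(L2), S2(R2)) for one subject.
concordant = [(1.0, 0.5, 1.0, 0.5), (0.5, 0.0, 0.5, 0.0)]
discordant = [(1.0, 0.5, 0.5, 0.0), (0.5, 0.0, 1.0, 0.5)]
```

With the toy rows, `estimate_alpha(concordant)` lands at the upper end of the grid and `estimate_alpha(discordant)` at the lower end, matching the role of α as a measure of positive association.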
— An imputation approach
• Instead of the copula model approach, one can directly estimate
Kendall’s τ, defined as
τ = P{(T1i − T1j)(T2i − T2j) > 0} − P{(T1i − T1j)(T2i − T2j) < 0}.
• Step 1: Estimate the joint survival function of (T1, T2).
• Step 2: Impute the exact failure times M times.
• Step 3: Calculate the empirical Kendall’s τ for each of M sets
of the imputed data.
• Step 4: Estimate the Kendall’s τ by the average of the empirical
estimates.
• Betensky and Finkelstein (1999).
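Steps 2 to 4 above can be sketched in a few lines. The sketch makes a simplifying assumption that is not part of the procedure on the slide: exact times are imputed uniformly within each observed interval, rather than from an estimated joint survival function (Step 1), and all intervals are assumed finite. The function names are hypothetical.

```python
import random


def kendall_tau(x, y):
    """Empirical Kendall's tau: (concordant - discordant) / number of pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)


def imputed_tau(intervals, M=50, seed=0):
    """Average the empirical tau over M imputed data sets.

    intervals: list of (L1, R1, L2, R2) with finite endpoints.
    Imputation is uniform within each interval (an assumption made here
    for simplicity, standing in for draws from the estimated joint law).
    """
    rng = random.Random(seed)
    taus = []
    for _ in range(M):
        t1 = [rng.uniform(l1, r1) for (l1, r1, l2, r2) in intervals]
        t2 = [rng.uniform(l2, r2) for (l1, r1, l2, r2) in intervals]
        taus.append(kendall_tau(t1, t2))
    return sum(taus) / M


# Toy data: non-overlapping intervals that are perfectly concordant.
toy = [(0, 1, 0, 1), (2, 3, 2, 3), (4, 5, 4, 5)]
tau_hat = imputed_tau(toy)
```

Because the toy intervals do not overlap and increase together, every imputed data set is perfectly concordant and the averaged estimate is 1.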
I.4. Regression Analysis
— Observed data and models
• Consider a survival study involving K possibly correlated
failure times (T1, · · · , TK) and n independent subjects.
• Assume that for each Tk, only an interval (Lk, Rk] is observed,
giving Tk ∈ (Lk, Rk]. So the observed data have the form
{ (L1i, R1i] , ..., (LKi, RKi], Zi ; i = 1, ..., n }.
• Let λk(t;Z) and Sk(t;Z) denote the marginal hazard and
survival functions of Tk given covariates Z, respectively.
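• The PH model:
$$\lambda_k(t; Z) = \lambda_{k0}(t) \exp(Z'\beta) .$$
• The PO model:
$$\frac{S_k(t; Z)}{1 - S_k(t; Z)} = e^{-Z'\beta}\, \frac{S_k(t; Z = 0)}{1 - S_k(t; Z = 0)} ,$$
or equivalently
$$\operatorname{logit}[S_k(t; Z)] = \operatorname{logit}[S_{k0}(t)] - Z'\beta .$$
• The AH model:
$$\lambda_k(t; Z) = \lambda_{k0}(t) + Z'\beta .$$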
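To make the three models concrete, the sketch below maps a baseline survival probability S_k0(t) and a linear predictor η = Z′β to the covariate-specific survival probability under each model. It assumes time-independent covariates and, for the PO model, a baseline value strictly between 0 and 1. The function names are hypothetical.

```python
import math


def surv_ph(s0, eta):
    """PH model: the cumulative hazard is multiplied by exp(eta)."""
    return s0 ** math.exp(eta)


def surv_po(s0, eta):
    """PO model: the survival odds are multiplied by exp(-eta); needs 0 < s0 < 1."""
    odds = math.exp(-eta) * s0 / (1.0 - s0)
    return odds / (1.0 + odds)


def surv_ah(s0, eta, t):
    """AH model: eta is added to the hazard, so eta * t is added to the cumulative hazard."""
    return s0 * math.exp(-eta * t)


# With eta = 0 every model returns the baseline value.
baseline_check = (surv_ph(0.5, 0.0), surv_po(0.5, 0.0), surv_ah(0.5, 0.0, 3.0))
```

For example, with s0 = 0.5 and η = log 2, the PH model gives 0.5² = 0.25 and the PO model halves the survival odds from 1 to 0.5, giving 1/3.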
— A marginal inference procedure
• Assume that the Tk’s are discrete variables. For the analysis,
note that if T1, ..., TK are independent, the log-likelihood
is proportional to
$$l(\beta, A_1, \ldots, A_K) = \sum_{k=1}^{K} \sum_{i=1}^{n} \log \{ L_{ik}(\beta, A_k) \} ,$$
where Lik denotes the marginal likelihood on Tk from subject i,
$A_k(t) = S_{k0}(t) / \{1 - S_{k0}(t)\}$ for the PO model, or
$A_k(t) = \int_0^t \lambda_{k0}(s)\, ds$ for the PH or AH model.
• Thus one can estimate β and Ak’s by maximizing
l(β,A1, ..., AK).
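For orientation, the marginal likelihood contribution can be written out explicitly. The expressions below are a sketch under assumptions not stated on the slide: covariates are time-independent, the censoring interval is noninformative, and S_k(t; Z) = P(T_k > t | Z).

```latex
% Contribution of subject i for the k-th failure time, observed in (L_{ki}, R_{ki}]:
L_{ik}(\beta, A_k) = S_k(L_{ki}; Z_i) - S_k(R_{ki}; Z_i),
% with the survival function expressed through A_k under each model:
\text{PH:}\quad S_k(t; Z) = \exp\{ -A_k(t)\, e^{Z'\beta} \}, \qquad
\text{PO:}\quad S_k(t; Z) = \frac{e^{-Z'\beta} A_k(t)}{1 + e^{-Z'\beta} A_k(t)}, \qquad
\text{AH:}\quad S_k(t; Z) = \exp\{ -A_k(t) - Z'\beta\, t \}.
```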
• Let $\hat\beta$ denote the estimate of β defined above. Then
under certain conditions, $\hat\beta$ is consistent and, for large n, one can
approximate its distribution by the normal distribution
with the covariance matrix consistently estimated by
$$I^{-1}(\hat\beta, \hat A_k)\, D(\hat\beta, \hat A_k)\, I^{-1}(\hat\beta, \hat A_k) .$$
• Goggins and Finkelstein (2000), Kim and Xue (2002),
Chen, Tong and Sun (2007),
Tong, Chen and Sun (2008).
• Bogaerts, Leroy, Lesaffre and Declerck (2002),
He and Lawless (2003).
— An efficient estimation procedure
• Assume that K = 2 and the joint survival function of (T1, T2) is specified by a copula model as
S(s, t) = Cα(S1(s), S2(t)) .
• Also assume that the marginal survival functions S1 and S2
follow the PH model with
$$S_k(t) = \exp\{ -\Lambda_{0k}(t) \exp(\beta' X) \}, \quad k = 1, 2.$$
• Let l(β, α, Λ01, Λ02) denote the log likelihood function. For estimation of θ = (β, α), one can derive the efficient score
function $l^*_\theta$ for θ and solve $l^*_\theta(\theta, \Lambda_{01}, \Lambda_{02}) = 0$.
• Wang, Sun and Tong (2008) investigated this for bivariate case I interval-censored data and showed that the resulting estimates are consistent and efficient.
II. Analysis of Doubly Censored Data
II.1. An Example — AIDS Cohort Study
• Subjects: 257 individuals with hemophilia who were treated with
HIV-contaminated blood from 1978 to August 1988
• Groups: a heavily treated group, for those who received at least
1000 µg/kg of the blood for at least ... between 1982 and 1985,
and a lightly treated group, for those who received less
• Variable of interest: AIDS latency time, from HIV infection to
AIDS diagnosis
• Interval-censored HIV infection times, right-censored AIDS
diagnosis time
• De Gruttola and Lagakos (1989), Kim, De Gruttola and
Lagakos (1993)
Table 2: Observed intervals in 6-month scale given by (L, R] for HIV infection time and observations (denoted by T, with starred numbers being right-censored times) for AIDS diagnosis time for some of 188 HIV-infected patients (the numbers in parentheses are multiplicities)
 L  R  T          L  R  T          L  R  T          L  R  T
Lightly treated group
 0  5 23∗ (2)     0 11 23∗ (2)     0 12 23∗ (3)     0 14 23∗
 0 15 23∗ (9)     0 16 23∗ (4)     0 17 23∗         0 18 23∗
 2 10 23∗         5  8 23∗         6 10 23∗         6 12 23∗
 7 12 23∗         7 13 23∗         7 15 23∗         8 13 23∗
 8 14 23∗ (3)     9 12 23∗ (2)     9 16 23∗        10 14 23∗ (4)
11 13 23∗ (4)    11 14 23∗        12 14 23∗ (4)    12 15 23∗ (3)
13 15 23∗ (4)    14 16 23∗ (5)     0  3  8          0 12 15
 5 12 16          9 11 20          9 12 21         10 12 20
12 13 22         12 15 22          0 13 23∗         6 13 17
 3 11 23∗         4 11 23∗         5 13 23∗         7 16 23∗
 8 12 23∗         9 15 23∗        11 13 23
II.2. Nonparametric Estimation of Survival Functions
• Consider a study that involves n independent subjects who
experience two related events denoted by Xi and Si with
Xi ≤ Si and both being discrete, i = 1, ..., n.
• Suppose that the survival time of interest is Ti = Si − Xi
and Xi and Ti are independent.
• Let u1 < ... < ur denote the possible mass points for
the Xi’s and v1 < ... < vs the possible mass points
for the Ti’s.
• Define
w = {wj = Pr(Xi = uj) }, f = { fk = Pr(Ti = vk) },
j = 1, ..., r, k = 1, ..., s.
• Suppose that the observed data have the form
{ (Li, Ri] , (Ui, Vi] , i = 1, ..., n }
with Xi ∈ (Li, Ri] and Si ∈ (Ui, Vi].
• Then the full likelihood function is proportional to
$$L_F(w, f) = \prod_{i=1}^{n} \sum_{j=1}^{r} \sum_{k=1}^{s} \alpha_{ijk}\, w_j\, f_k ,$$
where $\alpha_{ijk} = I(L_i < u_j \le R_i ,\; U_i < u_j + v_k \le V_i)$.
• To estimate w and f , three self-consistency algorithms —
De Gruttola and Lagakos (1989): MLE
Gomez and Lagakos (1994): a two-step approach
Sun (1997): a conditional approach
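A self-consistency iteration for the full likelihood L_F(w, f) can be sketched as follows. This is an EM-type illustration in the spirit of the full-likelihood approach, not a reproduction of any of the three cited algorithms. The mass points u and v are assumed known, right censoring is encoded by setting V to infinity, and the function names are hypothetical.

```python
def alpha_ijk(obs, u, v):
    """alpha[i][j][k] = I(L_i < u_j <= R_i, U_i < u_j + v_k <= V_i).

    obs: list of (L, R, U, V); use float('inf') for V when S is right-censored.
    """
    return [[[1.0 if (L < uj <= R and U < uj + vk <= V) else 0.0
              for vk in v] for uj in u] for (L, R, U, V) in obs]


def em_doubly_censored(alpha, n_iter=1000, tol=1e-10):
    """EM-type updates of w_j = Pr(X = u_j) and f_k = Pr(T = v_k)."""
    n, r, s = len(alpha), len(alpha[0]), len(alpha[0][0])
    w = [1.0 / r] * r
    f = [1.0 / s] * s
    for _ in range(n_iter):
        new_w = [0.0] * r
        new_f = [0.0] * s
        for a in alpha:
            # likelihood contribution of this subject under the current (w, f)
            denom = sum(a[j][k] * w[j] * f[k]
                        for j in range(r) for k in range(s))
            for j in range(r):
                for k in range(s):
                    # posterior probability that (X, T) = (u_j, v_k)
                    mu = a[j][k] * w[j] * f[k] / denom
                    new_w[j] += mu / n
                    new_f[k] += mu / n
        change = max(max(abs(x - y) for x, y in zip(new_w, w)),
                     max(abs(x - y) for x, y in zip(new_f, f)))
        w, f = new_w, new_f
        if change < tol:
            break
    return w, f


# Toy data in which every (X, T) pair is in fact observed exactly.
u_pts, v_pts = [1, 2], [1, 2]
toy_obs = [(0, 1, 1, 2), (0, 1, 2, 3), (1, 2, 2, 3), (1, 2, 2, 3)]
w_hat, f_hat = em_doubly_censored(alpha_ijk(toy_obs, u_pts, v_pts))
```

In the toy data the four subjects have (X, T) equal to (1, 1), (1, 2), (2, 1), (2, 1), so the estimates reduce to the empirical proportions w = (0.5, 0.5) and f = (0.75, 0.25).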
[Figure: estimated survival functions (vertical axis: Survival Function, 0.4 to 1.0) plotted against Time by Six Months (0 to 15), showing the ML, TS, and CL estimates for the LT group and for the HT group.]
II.3. Nonparametric Comparison of Survival Functions
• Let the Xi’s, Si’s, and Ti’s be defined as before and suppose that
only doubly censored data are available and given by
{ (Li, Ri] , (Ui, Vi] , i = 1, ..., n } ,
where Xi ∈ (Li, Ri] and Si ∈ (Ui, Vi]. Also suppose that the
subjects come from p + 1 different groups with the survival
functions S1(t), ..., Sp+1(t) and the goal is to test
H0 : S1(t) = ... = Sp+1(t) .
• Again as before, let u1 < ... < ur denote the possible mass
points for the Xi’s and v1 < ... < vs the possible mass points
for the Ti’s. Define
w = {wj = Pr(Xi = uj) }, f = { fk = Pr(Ti = vk) },
j = 1, ..., r, k = 1, ..., s.
— A generalized log-rank test
• To test H0, let $\hat S$ and $\hat F$ denote the MLEs of the common
survival function of the Ti’s and the common cdf of the Xi’s
under H0, respectively. Define
$$d_j = \sum_{i=1}^{n} P(T_i = v_j \mid (L_i, R_i], (U_i, V_i], \hat S, \hat F) ,$$
$$n_j = \sum_{i=1}^{n} P(T_i \ge v_j \mid (L_i, R_i], (U_i, V_i], \hat S, \hat F) ,$$
and djl and njl as dj and nj except that the summation is
over the subjects in the lth group, j = 1, ..., s, l = 1, ..., p + 1.
• Then one can test H0 using the statistic U = (U1, ..., Up+1)′ with
$$U_l = \sum_{j=1}^{s-1} \left( d_{jl} - \frac{n_{jl}\, d_j}{n_j} \right) .$$
• Sun (2001).
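Once the expected counts d_jl and n_jl are available, the statistic U is simple arithmetic. The sketch below assumes those counts have already been computed from the estimated distributions. The function name and the toy numbers are hypothetical, and the covariance estimate needed to turn U into a test is not covered.

```python
def logrank_U(d, n):
    """Compute U_l = sum_{j=1}^{s-1} (d_jl - n_jl * d_j / n_j).

    d[j][l]: expected number of failures at the j-th mass point in group l.
    n[j][l]: expected number at risk at the j-th mass point in group l.
    The pooled d_j and n_j are the sums of d_jl and n_jl over groups.
    """
    s, groups = len(d), len(d[0])
    U = [0.0] * groups
    for j in range(s - 1):          # the last mass point is excluded
        dj, nj = sum(d[j]), sum(n[j])
        for l in range(groups):
            U[l] += d[j][l] - n[j][l] * dj / nj
    return U


# Made-up expected counts for two groups and three mass points.
d_toy = [[1.0, 1.0], [0.5, 1.5], [2.0, 0.0]]
n_toy = [[3.5, 2.5], [2.5, 1.5], [2.0, 0.0]]
U_toy = logrank_U(d_toy, n_toy)
```

For these numbers U works out to (−11/12, 11/12). The components always sum to zero, which is why only p of the p + 1 components carry information.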
II.4. Regression Analysis
• Let the Xi’s, Si’s, Ti’s, wj’s, fk’s, and αijk be defined as before
with doubly censored data
{ (Li, Ri] , (Ui, Vi] , Zi , i = 1, ..., n } .
• Assume that covariates have no effects on the Xi’s and that Xi is independent of Ti. For estimation of covariate effects on the Ti’s, consider
$$S_k(Z_i) = \Pr(T_i > v_k \mid Z_i) = (q_1 \cdots q_k)^{\exp(Z_i'\beta)} ,$$
the discrete PH model, or
$$\lambda_i(t) = \lambda_0(t) \exp(Z_i'\beta) ,$$
the continuous PH model, or the continuous AH model
$$\lambda_i(t) = \lambda_0(t) + Z_i'\beta .$$
• For the discrete PH model:
Kim, De Gruttola and Lagakos (1993) — MLE
Pan (2001) — Multiple imputation
• For the continuous PH model:
Sun, Liao and Pagano (1999) — Estimating equation
• For the continuous AH model:
Sun, Kim and Sun (2004) — Estimating equation
• If Xi and Ti = Si − Xi are not independent,
Frydman (1995) — Three state model
III. Other Topics and Future Research
III.1. Analysis with Informative Interval Censoring
• Consider a survival study yielding case II interval-censored data
{Ui, Vi, δi1 = I(Ti < Ui), δi2 = I(Ui ≤ Ti < Vi), δi3 = 1−δi1−δi2, Zi(·)}
for the survival times of interest Ti’s from n independent
subjects, where Ui < Vi are observation times.
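• For Ti along with Ui and Vi, assume that
$$\lambda_i^T(t \mid Z_i(s), b_i(s), s \le t) = \lambda_0(t) + \beta_0' Z_i(t) + b_i(t) ,$$
$$\lambda_i^U(t \mid Z_i(s), b_i(s), s \le t) = \lambda_1(t)\, e^{\gamma_0' Z_i(t) + b_i(t)} ,$$
$$\lambda_i^V(t \mid U_i = u_i, Z_i(s), b_i(s), s \le t) =
\begin{cases}
\lambda_2(t)\, e^{\gamma_0' Z_i(t) + b_i(t)} & \text{if } t \ge u_i \\
0 & \text{if } t < u_i
\end{cases}$$
given the latent process bi(t) with mean zero.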
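• To estimate regression parameters, define
$$N_i^{(1)}(t) = (1 - \delta_{1i})\, I(U_i \le t)$$
and
$$N_i^{(2)}(t) = \delta_{3i}\, I(U_i \le t)\, I(V_i \le t)$$
conditional on Ui, i = 1, ..., n. Also define
$$\tilde N_i^{(1)}(t) = I(U_i \le t), \qquad \tilde N_i^{(2)}(t) = I(U_i \le t)\, I(V_i \le t)$$
given Ui.
• For these counting processes, the intensity functions are
$$I(U_i \ge t)\, E_b\{ e^{-\int_0^t b_i(s)\, ds}\, e^{b_i(t)} \}\, e^{-\Lambda_0(t)}\, \lambda_1(t)\, e^{-\beta_0' Z_i^*(t) + \gamma_0' Z_i(t)} ,$$
$$I(u_i < t \le V_i)\, E_b\{ e^{-\int_0^t b_i(s)\, ds}\, e^{b_i(t)} \}\, e^{-\Lambda_0(t)}\, \lambda_2(t)\, e^{-\beta_0' Z_i^*(t) + \gamma_0' Z_i(t)} ,$$
$$I(U_i \ge t)\, E_b\{ b_i(t) \}\, \lambda_1(t)\, e^{\gamma_0' Z_i(t)} ,$$
$$I(u_i < t \le V_i)\, E_b\{ b_i(t) \}\, \lambda_2(t)\, e^{\gamma_0' Z_i(t)} .$$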
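• It follows that one can develop an estimating function as
$$U_\beta(\beta, \gamma) = \sum_{i=1}^{n} (1 - \delta_{1i}) \left\{ Z_i^*(U_i) - \frac{S_{1,\beta}^{(1)}(U_i, \beta, \gamma)}{S_{1,\beta}^{(0)}(U_i, \beta, \gamma)} \right\}
+ \sum_{i=1}^{n} \delta_{3i} \left\{ Z_i^*(V_i) - \frac{S_{2,\beta}^{(1)}(V_i, \beta, \gamma)}{S_{2,\beta}^{(0)}(V_i, \beta, \gamma)} \right\}$$
for estimation of β0 given γ0, where
$$S_{1,\beta}^{(j)}(t, \beta, \gamma) = n^{-1} \sum_{i=1}^{n} I(t \le U_i)\, e^{-\beta' Z_i^*(t) + \gamma' Z_i(t)}\, Z_i^{*(j)}(t) ,$$
$$S_{2,\beta}^{(j)}(t, \beta, \gamma) = n^{-1} \sum_{i=1}^{n} I(U_i < t \le V_i)\, e^{-\beta' Z_i^*(t) + \gamma' Z_i(t)}\, Z_i^{*(j)}(t)$$
for j = 0, 1 with $Z_i^{*(0)}(t) = 1$ and $Z_i^{*(1)}(t) = Z_i^*(t) = \int_0^t Z_i(s)\, ds$.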
• Let $\hat\gamma$ denote a consistent estimate of γ0. Then one can
estimate β0 by the solution $\hat\beta$ to $U_\beta(\beta, \hat\gamma) = 0$.
• Wang, Sun and Tong (2008) showed that under some regularity
conditions, $\hat\beta$ is consistent and the distribution of
$$n^{1/2} (\hat\beta - \beta_0)$$
can be asymptotically approximated by the normal distribution
with mean zero and covariance matrix Σ that can be
consistently estimated by
$$\hat\Sigma = A_\beta(\hat\beta, \hat\gamma)^{-1}\, \Gamma(\hat\beta, \hat\gamma)\, [A_\beta(\hat\beta, \hat\gamma)^{-1}]' ,$$
where $A_\beta(\beta, \gamma) = -n^{-1}\, \partial U_\beta(\beta, \gamma) / \partial \beta$.
III.2. Bayesian Analysis of Interval-censored Data
• Parametric Bayesian approaches —
Banerjee and Carlin (2004)
• Nonparametric Bayesian approaches —
Dirichlet prior: Doss (1994), Calle and Gomez (2001),
Zhou (2004)
Discrete beta prior: Sinha (1997)
• Semiparametric Bayesian approaches —
Komarek and Lesaffre (2007, 2008): AFT models for
multivariate interval-censored and doubly censored data,
respectively.
• Gomez, Calle and Oller (2004)
III.3. Some Future Research Directions
• Analysis with informative censoring: Assume that the
variables L and R in L < T ≤ R are not independent of T , or
P (L < T ≤ R,L = l, R = r) = P (l < T ≤ r|L = l, R = r) dG(l, r)
cannot be replaced by P (l < T ≤ r)
— Joint modeling approach: Finkelstein, Goggins and
Schoenfeld (2002)
— Conditional modeling approach: Zhang, Sun and Sun (2005),
Zhang, Sun, Sun and Finkelstein (2007)
• Model checking and regression diagnostics:
— case I interval-censored data: Ghosh (2003)
— case II interval-censored data: Farrington (2000),
Sun, Sun and Zhu (2007)
— Bivariate interval-censored data: Wang, Sun and Sun (2006)
• Asymptotic properties of MLE:
— for bivariate interval-censored data and
— for doubly censored data
Thank you !