Post on 18-Dec-2015
Goodness of Fit of a Joint Model for Event Time and Nonignorable Missing Longitudinal Quality of Life Data – A Study
by Sneh Gulati*
*with Jean-Francois Dupuy and Mounir Mesbah
HISTORY
Project is the result of a sabbatical spent at the University of South Brittany
In survival studies, two variables are of interest:
a terminal event and a covariate (possibly time dependent)
Dupuy and Mesbah (2002) modeled the above for the case where the covariate may be unobserved
We propose here a test statistic to validate their model
Still in progress
BREAKDOWN OF TALK
I) Preliminary Results
II) Missing Observations
III) Dupuy’s Model
IV) Goodness of fit for Dupuy’s Model
Preliminaries
The survival time (or duration time to some terminal event) T is often modeled by the Cox regression model:
$\lambda(t \mid Z) = \lambda(t)\exp\{\beta^T Z\}$
$\lambda(t)$ is the baseline hazard rate and Z is the vector of covariates. Survival times are right censored: one observes $X = \min(T, C)$ and $\delta = I\{T \le C\}$
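To make the right-censoring setup concrete, here is a minimal simulation sketch. It assumes a constant baseline hazard, a single time-fixed standard normal covariate, and exponential censoring; none of these choices come from the talk, and the names `simulate_right_censored`, `baseline_rate` and `censor_rate` are hypothetical.

```python
import numpy as np

def simulate_right_censored(n, beta, baseline_rate=1.0, censor_rate=0.5, seed=0):
    """Draw (X, delta, Z) when the hazard of T is baseline_rate * exp(beta * Z)."""
    rng = np.random.default_rng(seed)
    Z = rng.normal(size=n)                                        # time-fixed covariate (assumption)
    T = rng.exponential(1.0 / (baseline_rate * np.exp(beta * Z)))  # event times
    C = rng.exponential(1.0 / censor_rate, size=n)                # independent censoring times
    X = np.minimum(T, C)                                          # observed time X = min(T, C)
    delta = (T <= C).astype(int)                                  # delta = I{T <= C}
    return X, delta, Z

X, delta, Z = simulate_right_censored(n=500, beta=0.7)
```

Only `X`, `delta` and `Z` would be available to the analyst; `T` and `C` are never observed together.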
Types of Covariates
External – Not directly involved with the failure mechanism
Internal – Generated by the individual under study – observed only as long as the individual survives
Solution to the full model
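The parameter vector $\beta_0$ is estimated by maximizing the following:
$$L(\beta) = \prod_{i=1}^{n} \left[ \frac{\exp\{\beta^T Z_i(X_i)\}}{\sum_{j=1}^{n} I(X_j \ge X_i)\exp\{\beta^T Z_j(X_i)\}} \right]^{\delta_i}$$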
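The partial likelihood can be evaluated directly on a small data set. The sketch below computes its logarithm under the simplifying assumption that covariates are time-fixed, so that Z_j(X_i) = Z_j; the function name `cox_partial_loglik` is hypothetical.

```python
import numpy as np

def cox_partial_loglik(beta, X, delta, Z):
    """Log partial likelihood for time-fixed covariates (simplifying assumption)."""
    X = np.asarray(X, dtype=float)
    delta = np.asarray(delta, dtype=float)
    Z = np.asarray(Z, dtype=float).reshape(len(X), -1)
    beta = np.atleast_1d(beta).astype(float)
    eta = Z @ beta                                  # linear predictor beta^T Z_j
    at_risk = X[None, :] >= X[:, None]              # at_risk[i, j] = I(X_j >= X_i)
    log_denom = np.log((at_risk * np.exp(eta)[None, :]).sum(axis=1))
    return float(np.sum(delta * (eta - log_denom)))

# Three uncensored subjects with no covariate effect: risk sets have sizes 3, 2, 1.
X = np.array([1.0, 2.0, 3.0])
delta = np.array([1, 1, 1])
Z = np.array([0.0, 0.0, 0.0])
value = cox_partial_loglik(0.0, X, delta, Z)        # equals -log(3 * 2 * 1)
```

In practice the maximizer would be found by passing the negative of this function to a numerical optimizer.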
Estimate of the cumulative hazard function:
Linear interpolation between failure times of
$$\hat\Lambda(t) = \sum_{X_i \le t} \frac{\delta_i}{\sum_{j=1}^{n} I(X_j \ge X_i)\exp\{\hat\beta^T Z_j(X_i)\}}$$
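A short sketch of this cumulative hazard estimate follows. It assumes time-fixed covariates and no tied observation times, and it anchors the linear interpolation at (0, 0), which is a choice made here rather than something stated in the talk. Both function names are hypothetical.

```python
import numpy as np

def breslow_cumhaz(beta_hat, X, delta, Z):
    """Return failure times and the cumulative hazard estimate at those times.

    Assumes time-fixed covariates and no ties among the observed times.
    """
    X = np.asarray(X, dtype=float)
    delta = np.asarray(delta, dtype=float)
    Z = np.asarray(Z, dtype=float).reshape(len(X), -1)
    risk = np.exp(Z @ np.atleast_1d(beta_hat).astype(float))
    order = np.argsort(X)
    X_s, d_s, r_s = X[order], delta[order], risk[order]
    denom = np.cumsum(r_s[::-1])[::-1]      # total risk score of subjects still at risk
    cumhaz = np.cumsum(d_s / denom)         # step-function value just after each time
    failed = d_s == 1
    return X_s[failed], cumhaz[failed]

def interpolate_cumhaz(t, times, values):
    """Linearly interpolate between failure times, starting from (0, 0)."""
    xp = np.concatenate(([0.0], times))
    fp = np.concatenate(([0.0], values))
    return float(np.interp(t, xp, fp))

times, values = breslow_cumhaz(0.0, [2.0, 1.0, 3.0], [1, 1, 1], [0.0, 0.0, 0.0])
mid_value = interpolate_cumhaz(1.5, times, values)
```

With three uncensored subjects and no covariate effect, the jumps are 1/3, 1/2 and 1, so the estimate at the failure times is 1/3, 5/6 and 11/6.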
Let $\bar{Z}(t)$ denote the history of the covariate up to time t.
Let T be the time to some event. Then the hazard of T at time t is
$\lambda(t \mid \bar{Z}(t))\,dt = \lim_{dt \to 0} \Pr(t < T < t + dt \mid \bar{Z}(t))$
CLASSIFICATION OF THE DROP-OUT PROCESS
Completely Random Drop-out – Drop-out process is independent of both observed and unobserved measurements.
Random Drop-out – Drop-out process is independent of unobserved measurements, but depends on the observed measurements.
Nonignorable Drop-out – Drop-out process depends on unobserved measurements.
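The three mechanisms can be contrasted with a small simulation. This is only an illustrative sketch: the two-visit setup, the logistic drop-out probabilities and the coefficient values are assumptions made here, and `simulate_dropout` is a hypothetical name.

```python
import numpy as np

def simulate_dropout(mechanism, n=20000, seed=0):
    """Simulate a previous (always observed) and a current (possibly missing) value."""
    rng = np.random.default_rng(seed)
    z_prev = rng.normal(size=n)                        # observed measurement
    z_curr = 0.8 * z_prev + 0.6 * rng.normal(size=n)   # measurement lost at drop-out
    if mechanism == "completely_random":
        logit = np.zeros(n)                            # depends on nothing
    elif mechanism == "random":
        logit = 2.0 * z_prev                           # depends on observed value only
    elif mechanism == "nonignorable":
        logit = 2.0 * z_curr                           # depends on the unobserved value
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")
    p_drop = 1.0 / (1.0 + np.exp(-logit))
    dropped = rng.random(n) < p_drop
    return z_prev, z_curr, dropped

_, z_mcar, d_mcar = simulate_dropout("completely_random")
_, z_ni, d_ni = simulate_dropout("nonignorable")
bias_mcar = z_mcar[~d_mcar].mean() - z_mcar.mean()     # close to zero
bias_ni = z_ni[~d_ni].mean() - z_ni.mean()             # clearly negative
```

Under the nonignorable mechanism the subjects who remain are systematically those with low current values, which is why analyses restricted to observed data can be biased.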
Approaches
Use only the complete observations
Replace missing values with the sample mean
Estimate missing values with consistent estimators so that the likelihood is maximized (IMPUTATION)
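The first two approaches are simple enough to sketch in a few lines. The helper names `complete_cases` and `mean_impute` are hypothetical, and missing values are assumed to be coded as NaN.

```python
import numpy as np

def complete_cases(z):
    """Keep only the observed values (missing values coded as NaN)."""
    z = np.asarray(z, dtype=float)
    return z[~np.isnan(z)]

def mean_impute(z):
    """Replace each missing value with the mean of the observed values."""
    z = np.asarray(z, dtype=float)
    filled = z.copy()
    filled[np.isnan(z)] = np.nanmean(z)
    return filled

z = np.array([1.0, 2.0, np.nan, 3.0])
observed = complete_cases(z)     # array([1., 2., 3.])
imputed = mean_impute(z)         # array([1., 2., 2., 3.])
```

Neither helper accounts for why a value is missing, so both can mislead when drop-out is nonignorable.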
Previous Notable Work for Nonignorable Dropout
Diggle and Kenward (1994), Little (1995), Hogan and Laird (1997) - Essentially one integrates out the unobserved covariates
Martinussen (1999) – uses EM algorithm
Work of Dupuy and Mesbah
Subjects measured at discrete time intervals
Terminal Event – Disease Progression
Patients can drop out, and the covariate can be unobserved at dropout
The Model
n subjects are observed at fixed times $t_j$,
$t_0 = 0 < \dots < t_{j-1} < t_j < \dots < \infty$
$0 < \varepsilon_0 \le \Delta t_j = t_j - t_{j-1} \le \varepsilon_1 < \infty$
Let Z = internal covariate and $Z_i(t)$ = value of Z at time t for the ith individual
Let $Z_{i,j}$ denote the response for the ith subject on $(t_j, t_{j+1}]$.
Assumptions
1) The covariate vector Z is assumed to have a uniformly bounded continuous density $f(z, \alpha)$
2) The censoring time C has continuous distribution function $G_C(u)$
3) The censoring distribution is assumed to be independent of the unobserved covariate, and of the parameters $\alpha$, $\beta$ and $\lambda$.
Likelihood
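$$L(\theta) = \int \lambda(x)^{\delta} \exp\left\{\delta\,\beta^T w(x) - \int_0^x \lambda(u)\, e^{\beta^T w(u)}\, du\right\} f(z_0, \dots, z_{a-1}, z; \alpha)\, dz$$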
Let us call the above model Equation (1)
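The idea of integrating the unobserved covariate out of the likelihood can be sketched numerically. The example below is a heavily simplified stand-in, not the model of the talk: it assumes a constant baseline hazard `rate`, a covariate vector w = (z_prev, z) held fixed over [0, x], and an unobserved z that is Normal(mu, sigma^2). Under those assumptions the integral is approximated by Gauss-Hermite quadrature. The name `marginal_likelihood` and all its parameters are hypothetical.

```python
import numpy as np

def marginal_likelihood(x, delta, z_prev, beta, rate, mu, sigma, n_nodes=40):
    """Integrate a simplified subject likelihood over the unobserved covariate z.

    Assumptions (not from the slides): constant baseline hazard, w = (z_prev, z)
    fixed on [0, x], and z ~ Normal(mu, sigma^2).
    """
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    z = mu + np.sqrt(2.0) * sigma * nodes               # change of variables for the normal density
    hazard = rate * np.exp(beta[0] * z_prev + beta[1] * z)
    integrand = hazard ** delta * np.exp(-x * hazard)   # hazard^delta * survival to time x
    return float(np.sum(weights * integrand) / np.sqrt(np.pi))

# When the unobserved covariate has no effect, the integral has a closed form.
check = marginal_likelihood(1.5, 1, 0.3, (0.4, 0.0), 0.8, 0.0, 1.0)
h = 0.8 * np.exp(0.4 * 0.3)
closed_form = h * np.exp(-1.5 * h)

# With a nonzero effect the value must be computed numerically.
example_value = marginal_likelihood(1.5, 1, 0.3, (0.4, 0.5), 0.8, 0.0, 1.0)
```

The closed-form check is a useful sanity test for any quadrature rule before it is embedded in a maximization routine.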
Solution
Method of Sieves:
Replace the original parameter space $\Theta$ of the parameters $(\alpha, \beta, \Lambda)$ by an approximating space $\Theta_n$, called the sieve.
Instead of the hazard function, one considers
$\Lambda_{n,i} = \Lambda_n(T_{(i)})$, $i = 1, 2, \dots, p(n)$, where $T_{(1)} \le T_{(2)} \le \dots \le T_{(p(n))}$ are the order statistics corresponding to the distinct dropout times $T_1, T_2, \dots, T_{p(n)}$
Hence the approximating sieve is
$\Theta_n = \{\theta = (\alpha, \beta, \Lambda_n) : \alpha \in R^p,\ \beta \in R^2,\ \Lambda_{n,1} \le \Lambda_{n,2} \le \dots \le \Lambda_{n,p(n)}\}$.
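The sieve replaces an unknown function by finitely many values at the ordered distinct dropout times. A minimal sketch of that representation, with hypothetical helper names `sieve_grid` and `step_cumhaz`, stores the cumulative hazard as nonnegative jumps so that the ordering constraint holds automatically:

```python
import numpy as np

def sieve_grid(dropout_times):
    """Ordered distinct dropout times T_(1) <= ... <= T_(p(n))."""
    return np.unique(np.asarray(dropout_times, dtype=float))

def step_cumhaz(t, grid, jumps):
    """Value at t of a step function that jumps by jumps[k] at grid[k]."""
    jumps = np.asarray(jumps, dtype=float)
    if jumps.shape != grid.shape or np.any(jumps < 0):
        raise ValueError("need one nonnegative jump per grid point")
    cumulative = np.cumsum(jumps)                   # nondecreasing by construction
    idx = np.searchsorted(grid, t, side="right")    # number of grid points <= t
    return 0.0 if idx == 0 else float(cumulative[idx - 1])

grid = sieve_grid([2.0, 1.0, 2.0, 3.0])             # array([1., 2., 3.])
value = step_cumhaz(2.5, grid, [0.1, 0.2, 0.3])     # jumps at 1 and 2 are included
```

Parameterizing by jumps rather than by the cumulative values is a design choice made here for convenience; it turns the ordering constraint into simple nonnegativity.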
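Here $L^{(i)}(\theta)$ =
$$\int \left[\prod_{k=1}^{p(n)} \Delta\Lambda_{n,k}^{\,\delta_i 1\{T_{(k)} = x_i\}}\right] \exp\left\{\delta_i\,\beta^T w_i(x_i) - \sum_{k:\,T_{(k)} \le x_i} \Delta\Lambda_{n,k}\, e^{\beta^T w_i(T_{(k)})}\right\} f(z_{i,0}, \dots, z_{i,a_i-1}, z; \alpha)\, dz$$
where $\Delta\Lambda_{n,k} = \Lambda_{n,k} - \Lambda_{n,k-1}$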
THE MLE
$\hat\theta_n = (\hat\alpha_n, \hat\beta_n, \hat\Lambda_n)$
The MLE, obtained via the EM algorithm, is identifiable and asymptotically normally distributed
Goodness-of-fit for Dupuy’s Model
Issue of Model Checking Important
PROBLEM – MISSING DATA
Could use DOUBLE SAMPLING or IMPUTATION
SOLUTION: Validate model in Equation (1) – Marginal Model
Done by Using the Weights Method of Lin (1991)
Development of the Test Statistic
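Using a random weight function $WG(\cdot)$, define a class of weighted pseudo-likelihood functions given by
$WL_n(\theta) = \prod_{i=1}^{n} WL^{(i)}(\theta)$
Call the above Equation (2)
where $WL^{(i)}(\theta)$ =
$$\int \left[\prod_{k=1}^{p(n)} \Delta\Lambda_{n,k}^{\,\delta_i WG(x_i) 1\{T_{(k)} = x_i\}}\right] \exp\left\{\delta_i\, WG(x_i)\,\beta^T w_i(x_i) - WG(x_i) \sum_{k:\,T_{(k)} \le x_i} \Delta\Lambda_{n,k}\, e^{\beta^T w_i(T_{(k)})}\right\} f(z_{i,0}, \dots, z_{i,a_i-1}, z; \alpha)\, dz$$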
Define the maximizer of Equation (2) as:
$\hat\theta_{W,n} = (\hat\alpha_{W,n}, \hat\beta_{W,n}, \hat\Lambda_{W,n})$
The test statistic is a function of $\hat\theta_{W,n}$
Asymptotic Results for $\hat\beta_{W,n}$
Under the model in Equation (1), the vector $\sqrt{n}(\hat\beta_{W,n} - \beta_0)$ converges in distribution to a bivariate normal distribution with zero mean and covariance matrix $\Sigma_{W,0}^{-1}$
If the model in Equation (1) is correct, the weighted and the nonweighted MLEs should be close to each other:
Under the model in Equation (1), the vector $\sqrt{n}(\hat\beta_{W,n} - \hat\beta_n)$ converges in distribution to a bivariate normal distribution with zero mean and covariance matrix $D_W = \Sigma_{W,0}^{-1} - \Sigma_0^{-1}$
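One way such a limit result could be turned into a test is a quadratic form in the difference between the weighted and nonweighted estimates, referred to a chi-square distribution with 2 degrees of freedom, in the spirit of Lin (1991). The talk does not state the statistic explicitly, so the form below is an assumption, and the function name and inputs are hypothetical. The p-value uses the fact that the chi-square survival function with 2 degrees of freedom is exp(-q/2).

```python
import math
import numpy as np

def weighted_vs_unweighted_test(beta_w, beta_u, D_w, n):
    """Quadratic-form statistic n * d^T D_w^{-1} d with d = beta_w - beta_u.

    Assumes d is bivariate and D_w estimates the limiting covariance of sqrt(n) * d.
    """
    diff = np.asarray(beta_w, dtype=float) - np.asarray(beta_u, dtype=float)
    if diff.shape != (2,):
        raise ValueError("this sketch handles the bivariate case only")
    D_w = np.asarray(D_w, dtype=float)
    q = float(n * diff @ np.linalg.solve(D_w, diff))
    p_value = math.exp(-q / 2.0)      # chi-square survival function, 2 d.f.
    return q, p_value

q, p_value = weighted_vs_unweighted_test([0.1, 0.0], [0.0, 0.0], np.eye(2), n=100)
```

A small p-value would indicate that the two estimates differ by more than sampling variation allows under the model, which is evidence against the model in Equation (1).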
Proof still in progress
Proposed method:
Show that the score functions for the weighted and unweighted likelihoods are asymptotically jointly normal.
Use counting process techniques and martingale theory.