Vol.8 No.3    ACTA MATHEMATICAE APPLICATAE SINICA    July, 1992
ESTIMATION OF THE MEAN LIFETIME
IN NON-PARAMETRIC MODEL
UNDER MODIFIED TYPE I CENSORING*
ZHENG ZUKANG    WU LIPENG
Abstract
Methods to estimate the mean of the lifetimes under modified Type I censoring are discussed, in the case where the distribution function of the lifetimes is unknown and the information about the individuals is cut off at the fixed time C. As a slight modification, we suggest observing a small percentage of the lifetimes to construct the estimators. These estimators are unbiased and consistent.
1. Introduction
Sometimes experiments are run over a fixed time period in such a way that an individual's lifetime is known exactly only if it is less than some predetermined value. In such situations the data are said to be Type I censored. Precisely, let $X_1, \dots, X_n$ be nonnegative independent identically distributed random variables with mean $\mu$ and variance $\sigma^2 < \infty$. Let $C$ be a positive constant denoting the longest time that we can wait. In such experiments we only observe
\[ Z_i = \min(X_i, C), \qquad \delta_i = I_{\{X_i \le C\}} \qquad (i = 1, \dots, n), \]
where $I_{\{X_i \le C\}}$ denotes the indicator function, equal to 1 if $X_i \le C$ and to 0 if $X_i > C$. Based on $\{Z_i, \delta_i\}$ $(i = 1, \dots, n)$, we want to estimate the mean value $\mu$ of the $X_i$.
One may use $\frac{1}{n}\sum Z_i$ as a statistic to approximate $\mu$, but it underestimates $\mu$ since the true $X_i$ may be greater than $Z_i$. An alternative estimator of $\mu$ is $\sum Z_i\delta_i / \sum \delta_i$, which discards all the censored data and then averages the uncensored ones. Unfortunately this estimates the conditional expectation of $X_i$ given $X_i \le C$. Therefore both of them fail to estimate $\mu$.
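The bias of the two naive statistics can be illustrated numerically. The following is a minimal sketch, assuming exponential lifetimes with rate 1 and $C = 1$ (an illustrative choice, not part of the argument above); the function name `naive_estimates` is hypothetical. In this case the true mean is 1, while $E\min(X, C) = 1 - e^{-1} \approx 0.632$ and $E(X \mid X \le C) = (1 - 2e^{-1})/(1 - e^{-1}) \approx 0.418$.

```python
import random

def naive_estimates(lam=1.0, c=1.0, n=200_000, seed=0):
    """Simulate Type I censoring at c for Exp(lam) lifetimes.

    Returns (mean of Z_i, mean of the uncensored Z_i).
    """
    rng = random.Random(seed)
    sum_z = 0.0      # running sum of Z_i = min(X_i, c)
    sum_unc = 0.0    # running sum of Z_i * delta_i
    n_unc = 0        # number of uncensored observations
    for _ in range(n):
        x = rng.expovariate(lam)
        if x <= c:
            sum_z += x
            sum_unc += x
            n_unc += 1
        else:
            sum_z += c
    return sum_z / n, sum_unc / n_unc

mean_z, mean_uncensored = naive_estimates()
print(mean_z, mean_uncensored)   # both fall well below the true mean 1
```

Both statistics settle near their own limits rather than near $\mu$, which is the failure described above.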
In practice, if we do not treat the censored data well, it seems impossible to estimate $\mu$ since the information is not enough. We sometimes get very similar observations from two populations with different means. Thus how to treat the censored data skillfully is the key to solving our problem. A naive idea is that if $X_i$ is censored we add something to it to make up for the censored part, and that if $X_i$ is uncensored we also modify it appropriately to ensure unbiasedness in the sense that the modification has the same expectation as $X_i$. This leads to the Class K method in linear regression analysis due to T.L. Lai and Zheng Zukang (1984, 1987), which suggests using the modification $X_i^*$ of the form
\[ X_i^* = \delta_i \varphi_1(Z_i) + (1 - \delta_i)\varphi_2(Z_i), \tag{1} \]
where $\varphi_1$, $\varphi_2$ are continuous functions such that
\[ [1 - G(x)]\varphi_1(x) + \int_0^x \varphi_2(t)\,dG(t) = x, \tag{2} \]
$G$ being the continuous distribution function of independent censoring random variables $T_i$, and $Z_i = \min(X_i, T_i)$.
*Received August 28, 1989. Revised June 29, 1990.
Now, in our problem the distribution function of the lifetimes is unknown and $G$ is degenerate at $C$, which cuts off the information about the individuals after time $C$. As modified Type I censoring, we suggest observing a small percentage of the lifetimes to construct the estimators. This requires a more complicated procedure, which is discussed in the following sections.
2. "Open Windows" Method
We assume that the distribution function $F(x)$ of $X_i$ is continuous at the point $C$ and $F(C) < 1$. Intuitively, the fixed time $C$ for all individuals cuts off all the information after that time and leads to failure to estimate $\mu$. As a slight modification, one may think of opening a "window" to let a small percentage of the individuals go through this "window" and reveal their true lifetimes. It means that one loosens the censoring type. Mathematically, we consider a sequence of independent identically distributed censoring random variables $T_1, \dots, T_n$ which concentrate mass $1 - \Delta$ on the point $C$ and the remaining mass on $\infty$, where $\Delta$ is a small constant with $0 < \Delta < 1$, i.e., the $T_i$ have the common distribution function
\[ G(t) = \begin{cases} 0, & t < C, \\ 1 - \Delta, & t = C, \\ 1, & t = \infty. \end{cases} \tag{3} \]
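We assume that $\{X_i\}$ and $\{T_i\}$ are independent. Since for every $i$, $X_i$ is censored by $T_i$, we can only observe
\[ Z_i = \min(X_i, T_i), \qquad \delta_i = I_{\{X_i \le T_i\}}. \]
In particular, if $T_i = \infty$, we get the true value $X_i$. We suggest
\[ \tilde{X}_i = \begin{cases} Z_i, & Z_i \le C, \\ C + \dfrac{Z_i - C}{\Delta}, & Z_i > C, \end{cases} \tag{4} \]
and
\[ \tilde{\mu} = \frac{1}{n}\sum_{i=1}^{n} \tilde{X}_i. \tag{5} \]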
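The open-window construction can be sketched in a few lines. This is an illustration only, assuming exponential lifetimes and assuming that the modified observation equals $Z_i$ when $Z_i \le C$ and $C + (Z_i - C)/\Delta$ otherwise, which is the form that appears in the expectation calculation leading to (6). The function name `open_window_sample` and the parameter values are hypothetical.

```python
import math
import random

def open_window_sample(lam, delta, c, n, rng):
    """Return modified observations for Exp(lam) lifetimes under the open-window scheme."""
    modified = []
    for _ in range(n):
        x = rng.expovariate(lam)                      # true lifetime X_i
        t = math.inf if rng.random() < delta else c   # T_i: window open with probability delta
        z = min(x, t)                                 # observed Z_i
        if z <= c:
            modified.append(z)                        # includes the censored case Z_i = C
        else:
            modified.append(c + (z - c) / delta)      # inflate the excess over C by 1/delta
    return modified

rng = random.Random(1)
sample = open_window_sample(lam=1.0, delta=0.2, c=1.0, n=200_000, rng=rng)
mu_tilde = sum(sample) / len(sample)
print(mu_tilde)   # should be close to the true mean 1/lam = 1
```

Only the pairs $(Z_i, \delta_i)$ are needed to compute the modified values; the true $X_i$ is used here solely to generate the data.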
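It is clear that the $\tilde{X}_i$ are independent identically distributed random variables, and $\tilde{\mu}$ is an unbiased estimator of $\mu$, as one sees by checking $E(\tilde{X}) = E(X)$ as follows:
\[
\begin{aligned}
E(\tilde{X}) &= \Big\{ \iint_{x \le t} x\, I_{(x \le C)}\, dF(x)\, dG(t) + \iint_{x \le t} \Big(C + \frac{x - C}{\Delta}\Big) I_{(x > C)}\, dF(x)\, dG(t) \Big\} \\
&\quad + \Big\{ \iint_{x > t} t\, I_{(t \le C)}\, dF(x)\, dG(t) + \iint_{x > t} \Big(C + \frac{t - C}{\Delta}\Big) I_{(t > C)}\, dF(x)\, dG(t) \Big\} \\
&= \int_0^\infty \{1 - G(x^-)\}\, x\, I_{(x \le C)}\, dF(x) + \int_0^\infty \{1 - G(x^-)\} \Big(C + \frac{x - C}{\Delta}\Big) I_{(x > C)}\, dF(x) + \int_C^\infty C(1 - \Delta)\, dF(x) \\
&= \int_0^C x\, dF(x) + \int_C^\infty x\, dF(x) + \int_C^\infty (\Delta C - C)\, dF(x) + \int_C^\infty C(1 - \Delta)\, dF(x) \\
&= E(X) + C(\Delta - 1)\int_C^\infty dF(x) + C(1 - \Delta)\int_C^\infty dF(x) \\
&= E(X).
\end{aligned}
\tag{6}
\]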
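In the same fashion, we prove in the Appendix that
\[ \operatorname{var}(\tilde{X}) = \sigma^2 + \Big(\frac{1}{\Delta} - 1\Big) \int_C^\infty (x - C)^2\, dF(x). \tag{7} \]
It shows that the variance of $\tilde{X}$ is always greater than that of $X$ since $\Delta < 1$. Hence
\[ \operatorname{var}(\tilde{\mu}) = \frac{1}{n}\Big[\sigma^2 + \Big(\frac{1}{\Delta} - 1\Big) \int_C^\infty (x - C)^2\, dF(x)\Big], \tag{8} \]
and $\tilde{\mu}$ is consistent.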
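The variance formula (7) can be checked numerically in a special case. The sketch below assumes exponential lifetimes with rate $\lambda$, for which $\int_C^\infty (x - C)^2\, dF(x) = 2e^{-\lambda C}/\lambda^2$ and $\sigma^2 = 1/\lambda^2$; this closed form, the helper names and the parameter values are assumptions of the example rather than part of the paper.

```python
import math
import random

def var_open_window_exponential(lam, delta, c):
    """Formula (7) for Exp(lam) lifetimes, using int_C^inf (x-C)^2 dF = 2 exp(-lam C) / lam^2."""
    return 1.0 / lam**2 + (1.0 / delta - 1.0) * 2.0 * math.exp(-lam * c) / lam**2

def simulated_variance(lam, delta, c, n=200_000, seed=2):
    """Sample variance of the modified observations in the open-window scheme."""
    rng = random.Random(seed)
    values = []
    for _ in range(n):
        x = rng.expovariate(lam)
        if x <= c:
            values.append(x)                       # uncensored before C
        elif rng.random() < delta:
            values.append(c + (x - c) / delta)     # window open: true lifetime seen
        else:
            values.append(c)                       # censored at C
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)

theory = var_open_window_exponential(1.0, 0.5, 1.0)
empirical = simulated_variance(1.0, 0.5, 1.0)
print(theory, empirical)   # the two values should agree to within a few percent
```

The factor $1/\Delta - 1$ makes the price of a small window explicit: halving $\Delta$ roughly doubles the extra variance contributed by the lifetimes beyond $C$.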
3. Additional Exponential Censoring
In Section 2, we give a method to estimate the mean of the lifetimes, where we still observe the true lifetimes of a proportion $\Delta$ of the individuals. But sometimes these lifetimes are too long for us to wait for. We need additional censoring to save observation time. A simple way to do this is to add an exponential censoring after the fixed time $C$. Thus the common distribution function of the $T_i$ becomes
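\[ G(t) = \begin{cases} 0, & t < C, \\ 1 - \Delta, & t = C, \\ 1 - \Delta \exp(-\theta(t - C)), & t > C, \end{cases} \tag{9} \]
where the parameter $\theta > 0$ determines the average of $T$, that is,
\[
\begin{aligned}
E(T) &= \int_0^\infty [1 - G(t)]\, dt = \int_0^C [1 - G(t)]\, dt + \int_C^\infty [1 - G(t)]\, dt \\
&= C + \Delta \int_C^\infty \exp\{-\theta(t - C)\}\, dt = C + \frac{\Delta}{\theta} \int_C^\infty \exp[-\theta(t - C)]\, d\,\theta(t - C) \\
&= C + \frac{\Delta}{\theta}.
\end{aligned}
\tag{10}
\]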
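Formula (10) is easy to confirm by simulation. In the sketch below a censoring time with distribution (9) is drawn as $C$ with probability $1 - \Delta$ and as $C$ plus an exponential excess of rate $\theta$ otherwise; this sampling recipe follows from (9) but is spelled out here as an assumption, and the function name and parameter values are hypothetical.

```python
import random

def sample_censoring_time(delta, theta, c, rng):
    """Draw T from (9): mass 1 - delta at C, otherwise C plus an Exp(theta) excess."""
    if rng.random() < delta:
        return c + rng.expovariate(theta)
    return c

rng = random.Random(3)
delta, theta, c = 0.2, 0.5, 1.0
n = 200_000
mean_t = sum(sample_censoring_time(delta, theta, c, rng) for _ in range(n)) / n
expected_t = c + delta / theta      # formula (10)
print(mean_t, expected_t)
```

Since $E(T) - C = \Delta/\theta$, the expected extra waiting time beyond $C$ is controlled jointly by the window size and the censoring rate.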
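$E(T)$ increases as $\theta$ decreases. In particular, let $\theta = -\dfrac{\log \Delta}{C}$. Then
\[ G_\Delta(t) = \begin{cases} 0, & t < C, \\ 1 - \Delta^{t/C}, & t \ge C. \end{cases} \]
We define
\[ X_i^* = \int_0^{Z_i} \frac{du}{1 - G(u)} = \begin{cases} Z_i, & Z_i \le C, \\ C + \dfrac{1}{\Delta\theta}\big[\exp\{\theta(Z_i - C)\} - 1\big], & Z_i > C. \end{cases} \tag{11} \]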
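The two facts used here can be verified numerically: with $\theta = -\log\Delta / C$ the survival function $\Delta \exp(-\theta(t - C))$ reduces to $\Delta^{t/C}$, and the closed form of the modified observation agrees with the integral of $1/(1 - G(u))$. The sketch below is an illustration under these readings of the formulas; the function names and parameter values are hypothetical.

```python
import math

def survival_after_c(t, delta, theta, c):
    """1 - G(t) for t >= C under (9)."""
    return delta * math.exp(-theta * (t - c))

def modified_observation(z, delta, theta, c):
    """Integral of du / (1 - G(u)) from 0 to z, in closed form."""
    if z <= c:
        return z
    return c + (math.exp(theta * (z - c)) - 1.0) / (delta * theta)

delta, c = 0.2, 1.0
theta = -math.log(delta) / c          # the special choice of theta
for t in (1.0, 1.5, 3.0):
    # with this theta the survival function reduces to delta ** (t / c)
    assert math.isclose(survival_after_c(t, delta, theta, c), delta ** (t / c))

# numerical check of the closed form against a midpoint Riemann sum on [C, z]
z, steps = 2.5, 100_000
h = (z - c) / steps
riemann = c + sum(h / survival_after_c(c + (k + 0.5) * h, delta, theta, c)
                  for k in range(steps))
print(modified_observation(z, delta, theta, c), riemann)
```

The transform leaves observations up to $C$ untouched and stretches later ones exponentially, which compensates for the increasing chance of censoring after $C$.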
We still assume that $F(x)$ is continuous at $C$ and $F(C) < 1$. Now we calculate the expectation of $X^*$:
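\[
\begin{aligned}
E(X^*) &= \iint_{x \le t} x\, I_{(x \le C)}\, dF(x)\, dG(t) + \iint_{x \le t} \Big[C + \frac{1}{\Delta\theta}\{\exp(\theta(x - C)) - 1\}\Big] I_{(x > C)}\, dF(x)\, dG(t) \\
&\quad + \iint_{x > t} t\, I_{(t \le C)}\, dF(x)\, dG(t) + \iint_{x > t} \Big[C + \frac{1}{\Delta\theta}\{\exp(\theta(t - C)) - 1\}\Big] I_{(t > C)}\, dF(x)\, dG(t) \\
&= \int_0^\infty \{1 - G(x^-)\}\, x\, I_{(x \le C)}\, dF(x) + \int_C^\infty \Big[C + \frac{1}{\Delta\theta}\{\exp(\theta(x - C)) - 1\}\Big] \Delta \exp(-\theta(x - C))\, dF(x) \\
&\quad + \int_C^\infty C(1 - \Delta)\, dF(x) + \int_C^\infty \int_C^x \Big[C + \frac{1}{\Delta\theta}\{\exp(\theta(t - C)) - 1\}\Big] \Delta\theta \exp(-\theta(t - C))\, dt\, dF(x) \\
&= \int_0^C x\, dF(x) + \int_C^\infty C\Delta \exp(-\theta(x - C))\, dF(x) + \frac{1}{\theta}\int_C^\infty dF(x) - \frac{1}{\theta}\int_C^\infty \exp(-\theta(x - C))\, dF(x) \\
&\quad + C(1 - \Delta)(1 - F(C)) + \int_C^\infty \Big[C\Delta\{1 - \exp(-\theta(x - C))\} + (x - C) + \frac{1}{\theta}\{\exp(-\theta(x - C)) - 1\}\Big] dF(x) \\
&= \int_0^C x\, dF(x) + \int_C^\infty x\, dF(x) \\
&= E(X).
\end{aligned}
\tag{13}
\]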
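The identity $E(X^*) = E(X)$ can also be illustrated by Monte Carlo. The sketch below assumes exponential lifetimes with rate $\lambda$ and takes the modified observation to be $Z$ for $Z \le C$ and $C + \{\exp(\theta(Z - C)) - 1\}/(\Delta\theta)$ for $Z > C$; the function name `draw_modified` and the parameter values are illustrative assumptions.

```python
import math
import random

def draw_modified(lam, delta, theta, c, rng):
    """One draw of X* for an Exp(lam) lifetime censored by T distributed as in (9)."""
    x = rng.expovariate(lam)                                          # true lifetime
    t = c + rng.expovariate(theta) if rng.random() < delta else c     # censoring time
    z = min(x, t)                                                     # observed value
    if z <= c:
        return z
    return c + (math.exp(theta * (z - c)) - 1.0) / (delta * theta)

rng = random.Random(4)
lam, delta, theta, c = 1.0, 0.5, 0.02, 1.0
n = 200_000
mean_star = sum(draw_modified(lam, delta, theta, c, rng) for _ in range(n)) / n
print(mean_star, 1.0 / lam)   # the sample mean should be close to 1/lam
```

Note that the transform is applied to $Z$ whether or not the observation beyond $C$ is censored, so no knowledge of the true lifetime is needed.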
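Then we use
\[ \mu^* = \frac{1}{n}\sum_{i=1}^{n} X_i^* \tag{14} \]
to estimate $\mu$. It is unbiased and consistent for the same reason as in Section 2. Furthermore, we have
\[ \operatorname{var}(\mu^*) = \frac{1}{n}\operatorname{var}(X^*), \]
where
\[ \operatorname{var}(X^*) = \sigma^2 + \int_C^\infty \Big[\frac{2}{\Delta\theta^2}\{\exp(\theta(x - C)) - 1\} - \Big\{(x - C) + \frac{1}{\Delta\theta}\Big\}^2 + \frac{1}{\Delta^2\theta^2}\Big] dF(x). \tag{15} \]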
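It is easy to verify
\[
\begin{aligned}
\operatorname{var}(X^*) - \operatorname{var}(\tilde{X}) &= \int_C^\infty \Big[\frac{2}{\Delta\theta^2}\{\exp(\theta(x - C)) - 1\} - \Big\{(x - C) + \frac{1}{\Delta\theta}\Big\}^2 + \frac{1}{\Delta^2\theta^2} - \Big(\frac{1}{\Delta} - 1\Big)(x - C)^2\Big] dF(x) \\
&= \int_C^\infty \Big[\frac{2}{\Delta\theta^2}\{\exp(\theta(x - C)) - 1\} - \frac{2(x - C)}{\theta\Delta} - \frac{(x - C)^2}{\Delta}\Big] dF(x) \\
&\ge \int_C^\infty \Big[\frac{2}{\Delta\theta^2}\Big\{\Big(1 + \theta(x - C) + \frac{\theta^2(x - C)^2}{2}\Big) - 1\Big\} - \frac{2(x - C)}{\theta\Delta} - \frac{1}{\Delta}(x - C)^2\Big] dF(x) = 0.
\end{aligned}
\]
This shows that the additional censoring enlarges the variance of the estimator.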
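The size of this variance inflation can be made concrete in a special case. The sketch below assumes exponential lifetimes with rate $\lambda$; the closed forms are worked out for this example from (7) and (15) using $\int_C^\infty \{e^{\theta(x-C)} - 1\}\,dF = e^{-\lambda C}\theta/(\lambda - \theta)$ (finite only for $\theta < \lambda$), and are not stated in the paper. Function names and parameter values are hypothetical.

```python
import math

def var_tilde(lam, delta, c):
    """Variance (7) for Exp(lam) lifetimes."""
    return 1.0 / lam**2 + (1.0 / delta - 1.0) * 2.0 * math.exp(-lam * c) / lam**2

def var_star(lam, delta, theta, c):
    """Variance (15) for Exp(lam) lifetimes; finite only when theta < lam."""
    if theta >= lam:
        return math.inf
    tail = math.exp(-lam * c)   # P(X > C)
    return 1.0 / lam**2 + 2.0 * tail * (1.0 / (delta * lam * (lam - theta)) - 1.0 / lam**2)

lam, delta, c = 0.5, 0.2, 1.0
for theta in (0.4, 0.1, 0.02, 0.001):
    # var_star exceeds var_tilde and approaches it as theta tends to 0
    print(theta, var_star(lam, delta, theta, c), var_tilde(lam, delta, c))
```

In this example the extra variance vanishes as $\theta \to 0$, where the scheme reduces to the open-window method, and it blows up as $\theta$ approaches $\lambda$, so a heavier additional censoring shortens the experiment at an increasing cost in precision.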
4. Simulations
Let the lifetimes $X_i$ be independent identically exponentially distributed with parameter $\lambda$, i.e., with density function
\[ f(x) = \lambda \exp(-\lambda x), \qquad x \ge 0. \tag{16} \]
Let the $T_i$ be independent identically distributed censoring random variables with common distribution function
\[ G(t) = \begin{cases} 0, & t < 1, \\ 1 - \Delta, & t = 1, \\ 1, & t = \infty. \end{cases} \tag{17} \]
$\{X_i\}$ and $\{T_i\}$ are independent. Table 1 shows the value of $\tilde{\mu}$ when the sample size $n$ is 500. Notice that the expectation of $X$ is $\lambda^{-1}$ and its variance is $\lambda^{-2}$. In order to compare with the results based on the $X_i$'s, which we cannot observe in practice, we also list
\[ \hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} X_i \]
in the table. We denote by $m$ the number of uncensored data and by s.d. the sample standard deviation. We choose $\lambda = 1, 0.5, 0.2$ and $\Delta = 0.1, 0.2, 0.5, 0.7$ in Table 1.
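The simulation design for the open-window method can be sketched as follows. This is an illustrative reimplementation under the stated setup ($C = 1$, $n = 500$, exponential lifetimes); the random number generator and seeds used in the paper are unknown, so the printed numbers will not reproduce the published table. The function name `simulate_row` is hypothetical, and the symbols `mu_hat` and `mu_tilde` stand for the mean of the unobservable lifetimes and the open-window estimate.

```python
import random
import statistics

def simulate_row(lam, delta, n=500, c=1.0, seed=0):
    """One simulated cell: (m, mu_hat, sd of X_i, mu_tilde, sd of modified X_i)."""
    rng = random.Random(seed)
    lifetimes, modified, m = [], [], 0
    for _ in range(n):
        x = rng.expovariate(lam)
        window_open = rng.random() < delta      # T_i = infinity with probability delta
        lifetimes.append(x)
        if x <= c:
            modified.append(x)                  # uncensored before C
            m += 1
        elif window_open:
            modified.append(c + (x - c) / delta)  # uncensored beyond C
            m += 1
        else:
            modified.append(c)                  # censored at C
    return (m, statistics.mean(lifetimes), statistics.stdev(lifetimes),
            statistics.mean(modified), statistics.stdev(modified))

for lam in (1.0, 0.5, 0.2):
    for delta in (0.1, 0.2, 0.5, 0.7):
        m, mu_hat, sd_x, mu_tilde, sd_tilde = simulate_row(lam, delta)
        print(f"lam={lam} delta={delta} m={m} mu_hat={mu_hat:.4f} ({sd_x:.4f}) "
              f"mu_tilde={mu_tilde:.4f} ({sd_tilde:.4f})")
```

As in the published table, one would expect the number of uncensored observations to grow with $\Delta$ and the standard deviation of the modified observations to shrink toward that of the lifetimes themselves.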
Table 2 gives the results under the additional censoring of Section 3. The distribution function of the $T_i$ is
\[ G(t) = \begin{cases} 0, & t < 1, \\ 1 - \Delta, & t = 1, \\ 1 - \Delta \exp(-\theta(t - 1)), & t > 1, \end{cases} \tag{18} \]
where $\theta = 0.02$.
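Table 1

                              $\lambda = 1$          $\lambda = 0.5$        $\lambda = 0.2$
$\Delta$                      value      s.d.        value      s.d.        value      s.d.
0.1     $m$                   340                    228                    119
        $\hat{\mu}$           1.0091     1.0538      2.0182     2.1076      5.0455     5.2690
        $\tilde{\mu}$         0.9776     2.8827      1.9618     7.0545      4.8440     20.0862
0.2     $m$                   356                    259                    159
        $\hat{\mu}$           1.0091     1.0538      2.0182     2.1076      5.0455     5.2690
        $\tilde{\mu}$         1.0519     2.3532      2.1005     5.5345      5.2302     15.3917
0.5     $m$                   410                    352                    298
        $\hat{\mu}$           1.0091     1.0538      2.0182     2.1076      5.0455     5.2690
        $\tilde{\mu}$         1.0167     1.4361      2.0375     3.1981      5.1120     8.6411
0.7     $m$                   451                    411                    376
        $\hat{\mu}$           1.0091     1.0538      2.0182     2.1076      5.0455     5.2690
        $\tilde{\mu}$         1.0440     1.2562      2.0965     2.6839      5.2386     7.0626

Rows $\hat{\mu}$ give the mean of the unobservable $X_i$ and rows $\tilde{\mu}$ give the estimate based on the $\tilde{X}_i$; s.d. is the sample standard deviation of the $X_i$ or of the $\tilde{X}_i$, respectively.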
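Table 2

                              $\lambda = 1$          $\lambda = 0.5$        $\lambda = 0.2$
$\Delta$                      value      s.d.        value      s.d.        value      s.d.
0.1     $m$                   339                    227                    116
        $\hat{\mu}$           1.0091     1.0537      2.0182     2.1075      5.0455     5.2689
        $\mu^*$               0.9776     2.9838      1.9717     7.5526      5.1022     23.7257
0.2     $m$                   356                    256                    153
        $\hat{\mu}$           1.0091     1.0537      2.0182     2.1075      5.0455     5.2689
        $\mu^*$               1.0627     2.43767     2.1019     5.4578      5.1241     15.3585
0.5     $m$                   408                    346                    275
        $\hat{\mu}$           1.0091     1.0537      2.0182     2.1075      5.0455     5.2689
        $\mu^*$               1.0165     1.4698      2.0291     3.2595      5.0667     9.2104
0.7     $m$                   449                    403                    348
        $\hat{\mu}$           1.0091     1.0537      2.0182     2.1075      5.0455     5.2689
        $\mu^*$               1.0431     1.2634      2.1033     2.7746      5.1868     7.4750

Rows $\hat{\mu}$ give the mean of the unobservable $X_i$ and rows $\mu^*$ give the estimate based on the $X_i^*$; s.d. is the sample standard deviation of the $X_i$ or of the $X_i^*$, respectively.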
Appendix
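Calculation of var$(\tilde{X})$
In Section 2 we give the expression of var$(\tilde{X})$. Here is the proof.
\[
\begin{aligned}
E(\tilde{X}^2) &= \int_0^C x^2\, dF(x) + \Delta C^2\{1 - F(C)\} + 2C\int_C^\infty (x - C)\, dF(x) \\
&\quad + \frac{1}{\Delta}\int_C^\infty (x - C)^2\, dF(x) + \int_C^\infty C^2(1 - \Delta)\, dF(x) \\
&= \int_0^C x^2\, dF(x) + \{\Delta C^2 + C^2(1 - \Delta)\}\{1 - F(C)\} + 2C\int_C^\infty x\, dF(x) - 2C^2\{1 - F(C)\} \\
&\quad + \frac{1}{\Delta}\int_C^\infty x^2\, dF(x) - \frac{2C}{\Delta}\int_C^\infty x\, dF(x) + \frac{C^2}{\Delta}\int_C^\infty dF(x) \\
&= \int_0^C x^2\, dF(x) + \{1 - F(C)\}\Big(-C^2 + \frac{C^2}{\Delta}\Big) + 2C\int_C^\infty x\, dF(x) \\
&\quad - \frac{2C}{\Delta}\int_C^\infty x\, dF(x) + \frac{1}{\Delta}\int_C^\infty x^2\, dF(x) \\
&= E(X^2) + \Big(\frac{1}{\Delta} - 1\Big)\int_C^\infty x^2\, dF(x) - 2C\Big(\frac{1}{\Delta} - 1\Big)\int_C^\infty x\, dF(x) + \{1 - F(C)\}\Big(\frac{C^2}{\Delta} - C^2\Big).
\end{aligned}
\]
Therefore
\[
\begin{aligned}
\operatorname{var}(\tilde{X}) &= E(\tilde{X}^2) - (E(\tilde{X}))^2 = E(\tilde{X}^2) - (E(X^2) - \sigma^2) \\
&= \sigma^2 + \Big(\frac{1}{\Delta} - 1\Big)\int_C^\infty (x - C)^2\, dF(x).
\end{aligned}
\]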
This completes the proof. By using the same method, we can also prove (15).
References
[1] Zheng Zukang, Regression Analysis with Censored Data, Ph.D. Dissertation, Columbia University, 1984.
[2] Zheng Zukang, A Class of Estimators for the Parameters in Linear Regression with Censored Data, Acta Mathematicae Applicatae Sinica (English Series), 3(3) (1987), 231-241.