Daniels's association measures under right censoring




APPLIED STOCHASTIC MODELS AND DATA ANALYSIS, VOL. 11, 109-119 (1995)

DANIELS’S ASSOCIATION MEASURES UNDER RIGHT CENSORING

JOHN O’QUIGLEY

INSERM U194, 75654 Paris Cedex 13, France and Department of Mathematics, University of California, San Diego, La Jolla, CA 92093, U.S.A.

SUMMARY

This paper considers extensions of the Daniels generalized measures of association (Daniels 1944) to censored data. A number of existing tests for the detection of association in the presence of right censoring (Gehan 1965; Efron 1967; Brown, Hollander and Korwar 1974; Weier and Basu 1980) are seen to be special cases of the Daniels coefficient under particular scoring schemes. The Daniels coefficient provides a very general framework for the construction of tests, and a number of other scoring schemes, simplifying to give well-known tests in the non-censored case, are considered. In this paper we focus attention on tests which are, in some sense, non-parametric with respect to both variables, i.e. the explanatory variable as well as the failure time variable, although such a property does not necessarily imply rank invariance. Parametric tests can be seen to fit in with the same formulation although no exploration of this approach is given here. Moving away from the null hypothesis of no association, it is also possible to obtain population measures of association applicable to right-censored data. These will not in general converge to their non-censored equivalents although, for a large family of cases studied, they do come very close. We illustrate these findings by some simulations.

KEY WORDS censoring; correlation; Daniels coefficient; non-parametric; normal scores; ranks; survival analysis

1. INTRODUCTION

In studying the association between a biological parameter and survival in a medical context we are frequently confronted with the problem of choosing an appropriate scale for our measurements. The versatility and power of the proportional hazards model due to Cox¹ enables us to ignore scale as regards survival since inference is rank invariant with respect to time. However, simply inserting the observed value of the covariate, or for instance its logarithm, implies strong additional hypotheses which may not always be met. Procedures which are rank invariant both with respect to time and the covariate then provide a useful additional tool in analysing survival data.

These considerations partly motivated the development of linear rank tests with right-censored data by Prentice² and Cuzick.³ In Prentice's work, attention focused on inference for the linear regression model $y = \alpha + \beta z + \sigma e$, $e$ being an error term with density $f(e)$, based on the ranks of the residuals $w = y - \beta z$. Cuzick suggested that we could replace $z$ by a score computed from its rank, leading to procedures that are rank invariant with respect to survival and to the explanatory variable. Underlying the calculations is the marginal likelihood of the ranks from which, under some fairly general assumptions about the censoring

CCC 8755-0024/020109-11 © 1995 by John Wiley & Sons, Ltd.

Received January 1994 Revised November 1994


mechanism, a score statistic and its variance can be calculated. This score statistic is a simple linear rank statistic, the coefficients of which, conditional on the observed pattern of failures and censorings, depend only on the density of the error term $f(e)$. Thus, choosing different error terms leads to different linear rank statistics. An extreme value density leads to the log-rank scores. The log-rank test, then, as pointed out by Prentice, will have good power if the underlying error density is of the extreme value form. It can in fact be shown that these coefficients are directly related to the weights used in the group of tests known as weighted log-rank tests. These two forms of linear rank statistics are therefore equivalent.

Cuzick³ generalized the ideas of Prentice to bivariate models, enabling such things as covariate measurement error or measures of bivariate association to be explicitly catered for. Some of these ideas have been further developed in a more recent paper by Clayton and Cuzick⁶ and, although their main focus was the establishment of bivariate and multivariate failure time models by incorporating an underlying random effect ('frailty'), the approach can be used to advantage in the less general context where one of the variables is not subject to censoring.

However, apart from the logistic and extreme value error densities for $e$, approximations are necessary to evaluate the rank statistics. First order approximations have been proposed by Prentice and their properties studied by Cuzick.³ Furthermore, Pettitt⁷ has considered Taylor series approximations which could be applied to some fairly general situations including left censoring. Cuzick³ examines the question of efficiency loss under incorrect specification of the covariate distribution. In certain cases this can be quite sizeable: for instance, should the model for the explanatory variable lean on a negative exponential distribution but a proportional hazards model be assumed, then this loss amounts to nearly 40% asymptotically.

Uncertainty about the suitability of an underlying model may suggest a more data analytic approach to inference as opposed to a model oriented one. Such data analytic approaches motivated the derivation of the Gehan test⁸ and modifications to accommodate its dependence on the censoring mechanism.⁹⁻¹² All of these can be shown to be special cases of the Daniels coefficient,¹³ generalized in different ways to accommodate censoring. Just as a number of well-known tests can be seen to be particular cases of linear rank statistics, based on the marginal likelihood of ranks, so can a number of tests be seen to come under the umbrella of the Daniels formulation.

The Gehan test, as originally conceived or in its various modifications, can be derived using either approach. We shall see that a test stemming from the Daniels coefficient, which can be seen to be a generalization of Spearman's rank correlation measure, simplifies to give the same test as that given by Cuzick³ based on linear rank procedures for which $f(e) = 2\pi^{-1}\exp(-e)/\{1 + \exp(-e)\}^{2}$, when there is no censoring. This is no longer so when censoring is present.

On the other hand it does not seem possible, for instance, to obtain the log-rank test as a special case of the Daniels coefficient. Tests derived under the Daniels coefficient are not necessarily rank invariant and so it is clear that some of these tests will not be derivable within the framework of generalized linear rank statistics. Another important difference is that, for generalized linear rank procedures, no use is made of the information contained in the rankings of censored observations located between adjacent failures. For the coefficients considered in this paper, such information can be used and would, in some circumstances, lead to increased power.

The purpose of this paper is not to compare the two approaches or to try to formulate any recommendations, the range of possibilities being too wide to enable any useful coverage here. The aim is rather: (1) to present the Daniels coefficient; (2) to show how it permits a unified treatment of the problem of detecting association with right-censored data; and (3) to present an example in which its use is seen to be of help in demonstrating association between survival and a highly skewed explanatory variable.

2. THE DANIELS COEFFICIENT

2.1. A general test for association

Daniels¹³ introduced the following measure of association: consider the vectors $u = (u_1, \ldots, u_n)'$, $b = (b_1, \ldots, b_n)'$ and define

$$r = \frac{\sum G(u_i, u_j)\,G(b_i, b_j)}{\left\{\sum G^2(u_i, u_j)\,\sum G^2(b_i, b_j)\right\}^{1/2}} \qquad (2.1)$$

in which $G(\cdot\,,\cdot)$ is some 'distance' function such that $G(u, v) = -G(v, u)$ for all real $u$ and $v$. Unmarked summations are understood to range freely over distinct indices, i.e. $i$ and $j$ in the above formula. If we choose $G(u, v) = u - v$ then $r$ reduces to the ordinary product moment correlation coefficient. The choice $G(u, v) = 1\ (-1)$ for $u$ greater than (less than) $v$ leads to Kendall's $\tau$, whilst choosing $G(u_i, u_j)$ to be the difference between the ranks of $u_i$ and $u_j$ leads to Spearman's rank correlation $\rho$. One form not considered in Daniels's 1944 paper is $G(u_i, u_j) = \xi\{r(u_i) \mid n\} - \xi\{r(u_j) \mid n\}$, where $r(u_i)$ and $r(u_j)$ are the ranks of the untied observations $u_i$ and $u_j$, analogous definitions holding for $G(b_i, b_j)$, and where $\xi(k \mid n)$ is the expected value of the $k$th smallest standardized deviate in a sample of $n$ observations from a normal population. In this case, we can show that $r$ reduces to $r_F$, where

$$r_F = \sum_{i=1}^{n} \xi(i \mid n)\,\xi\{r(b_i) \mid n\} \Big/ \sum_{i=1}^{n} \xi^2(i \mid n)$$

and where $r(b_i)$ are the ranks of the second variable corresponding to the natural ordering, $i$, of the first. The coefficient $r_F$ is known as the Fisher-Yates coefficient.¹⁴
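The way a single formula yields the three classical coefficients can be checked numerically. The following is a minimal Python sketch, assuming untied observations and taking the Daniels coefficient to be the sum of score products over ordered pairs of distinct indices divided by the square root of the product of the two sums of squared scores; the function names and the data are illustrative only and are not part of the paper.

```python
from itertools import permutations
from math import sqrt


def ranks(values):
    """Ranks 1..n of untied observations."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        out[idx] = rank
    return out


def daniels_r(u, b, score):
    """Generalized correlation; sums run over ordered pairs of distinct indices."""
    pairs = list(permutations(range(len(u)), 2))
    num = sum(score(u[i], u[j]) * score(b[i], b[j]) for i, j in pairs)
    den_u = sum(score(u[i], u[j]) ** 2 for i, j in pairs)
    den_b = sum(score(b[i], b[j]) ** 2 for i, j in pairs)
    return num / sqrt(den_u * den_b)


def difference(a, c):
    """G(u, v) = u - v."""
    return a - c


def sign(a, c):
    """G(u, v) = 1 if u > v, -1 if u < v."""
    return (a > c) - (a < c)


# Hypothetical data, for illustration only.
u = [1.2, 3.4, 5.0, 7.7, 9.1]
b = [2.0, 1.5, 6.3, 4.8, 8.9]

pearson = daniels_r(u, b, difference)                  # product moment correlation
kendall = daniels_r(u, b, sign)                        # Kendall's tau
spearman = daniels_r(ranks(u), ranks(b), difference)   # Spearman's rho
```

Passing the score as a function keeps the coefficient itself unchanged across scoring schemes, which is the sense in which the formulation is a unifying one.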

The distribution of $r$ will depend on the scores chosen, the underlying model generating the scores and the sample size. Letting

$$S_1(u, b; G) = \sum G(u_i, u_j)\,G(b_i, b_j), \quad S_2(u, b; G) = \sum G^2(u_i, u_j)\,\sum G^2(b_i, b_j), \quad S_3(u; G) = \sum G(u_i, u_j)\,G(u_i, u_k)$$

then, under the null hypothesis of no association, Daniels¹³ showed that $S_1(u, b; G)$ converges to a normal distribution, that $E\{S_1(u, b; G)\} = 0$, and that

$$\mathrm{Var}\{S_1(u, b; G)\} = a_n S_3(u; G)\,S_3(b; G) + (n - 2)\,a_n S_2(u, b; G)/2 \qquad (2.2)$$

where $a_n = 4/\{n(n - 1)(n - 2)\}$ and where expectations and variances are understood to be with respect to the permutational distribution under $H_0$.
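The variance expression (2.2) can be checked against a brute-force enumeration of the permutational distribution for small $n$. The sketch below is written under stated assumptions: $S_2$ is read as the product of the two sums of squared scores over ordered distinct pairs, and $S_3$ as the sum of $G(u_i, u_j)G(u_i, u_k)$ over distinct $i$, $j$, $k$. These readings and all function names are ours rather than the paper's.

```python
from itertools import permutations


def sign(a, c):
    return (a > c) - (a < c)


def difference(a, c):
    return a - c


def score_matrix(values, score):
    """Antisymmetric matrix of pairwise scores with a zero diagonal."""
    n = len(values)
    return [[score(values[i], values[j]) if i != j else 0 for j in range(n)]
            for i in range(n)]


def sum_squares(a):
    """Sum of squared scores over ordered distinct pairs."""
    return sum(v * v for row in a for v in row)


def s3(a):
    """Sum of a[i][j] * a[i][k] over distinct i, j, k."""
    return sum(sum(row) ** 2 for row in a) - sum_squares(a)


def daniels_variance(a, b):
    """Right-hand side of (2.2) under the readings stated in the lead-in."""
    n = len(a)
    a_n = 4.0 / (n * (n - 1) * (n - 2))
    s2 = sum_squares(a) * sum_squares(b)
    return a_n * s3(a) * s3(b) + (n - 2) * a_n * s2 / 2.0


def brute_force_variance(a, b):
    """Variance of S1 over all n! relabellings of the second variable."""
    n = len(a)
    totals = [sum(a[i][j] * b[perm[i]][perm[j]]
                  for i in range(n) for j in range(n) if i != j)
              for perm in permutations(range(n))]
    mean = sum(totals) / len(totals)
    return sum((t - mean) ** 2 for t in totals) / len(totals)


# Kendall-type scores on five untied observations (illustrative data).
a = score_matrix([1, 2, 3, 4, 5], sign)
formula = daniels_variance(a, a)
exact = brute_force_variance(a, a)
```

For Kendall-type scores both quantities should equal $2n(n-1)(2n+5)/9$, the familiar variance of twice Kendall's $S$ in the absence of ties.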

2.2. Generalizations to incorporate right-censoring

Two different approaches are possible in order to accommodate right censoring. The first of these, which we call the marginal approach and consider only briefly here, is to choose some non-decreasing real function, 0 at $-\infty$ and 1 at $\infty$, and average the scores $G(\cdot\,,\cdot)$ with respect to this function, the function's argument corresponding to values the score could assume were censoring removed. A sensible choice, at least intuitively, would be one based on some consistent estimate of common survival probabilities under $H_0$, the Kaplan-Meier estimate for instance. The marginal approach, along with Kaplan-Meier estimates, in the case of Kendall's $\tau$, and assuming Daniels's results still hold, gives the test described by Brown et al. In the two-sample case we could use separate estimates for survival according to whether an observation came from the first or the second sample. This leads to the test described by Efron.⁹ A slight variation on this idea, and one we would anticipate as having very similar properties, would be to average over the potentially observable times rather than the potentially observable scores, i.e. replacing censored observations by their conditional expectations under $H_0$. Once again this has been considered in the case of Kendall's $\tau$ and, under some mild additional conditions, leads to the test described by Weier and Basu. A simple step function, depending only on the censoring indicators and not on estimated survival, leads to Gehan's test.⁸
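As an illustration of a score that depends only on the censoring indicators, the following Python sketch uses the usual pairwise definition of Gehan's generalized Wilcoxon scores: a pair contributes only when its ordering is known for certain. That definition is not spelled out in the text above, and pairing it with sign scores for the covariate is an assumption made for the sake of the example; ties are ignored.

```python
def sign(a, c):
    return (a > c) - (a < c)


def gehan_score(t_i, d_i, t_j, d_j):
    """+1 if subject i is known to outlive j, -1 if known not to, else 0.

    d = 1 marks an observed failure, d = 0 a censored observation.
    """
    if t_i > t_j and d_j == 1:
        return 1
    if t_i < t_j and d_i == 1:
        return -1
    return 0


def gehan_statistic(times, status, x):
    """Numerator of the Daniels coefficient with Gehan scores on time."""
    n = len(times)
    return sum(gehan_score(times[i], status[i], times[j], status[j])
               * sign(x[i], x[j])
               for i in range(n) for j in range(n) if i != j)


# Hypothetical data: five subjects, two of them censored, binary covariate.
times = [2, 3, 5, 7, 8]
status = [1, 0, 1, 1, 0]
group = [0, 0, 1, 0, 1]
statistic = gehan_statistic(times, status, group)
```

With no censoring the score reduces to the sign score, so the statistic reduces to the numerator of Kendall's coefficient.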

For more complicated scoring than that leading to Kendall's $\tau$, that leading to Spearman's $\rho$ for instance, the situation is unfortunately much less tractable. The calculations rapidly become unwieldy and are likely to prove too dissuasive to find practical application. Furthermore, if Daniels's distributional reasoning follows through in this case, as intuitively it would seem to do, it relies on the censoring mechanism's being marginally independent of the failure mechanism. In many practical applications of survival analysis such an assumption is often deemed too strong, for example in long term studies in which more or less subtle changes in recruitment policy occur through time, and it is preferred to make the weaker assumption of independence between the failure and censoring mechanisms, conditional on the observed explanatory variables. Finally, with the marginal approach it is not clear how to get an angle on small sample properties; even for large samples, a rigorous demonstration would seem to be difficult.

For the remainder of this article, then, we focus attention on an alternative approach, the conditional approach, so called because we sequentially condition on the risk sets, thereby deriving results that hinge only on the weaker censoring assumption alluded to above. The algebraic expressions we obtain are substantially simpler than those arising under the marginal approach and can, under any of the scoring schemes considered by Daniels, be readily evaluated. What is more, $S_1(\cdot\,,\cdot\,; G)$ under this generalization can be written in terms of a submartingale for which, following the Doob-Meyer decomposition theorem and the martingale central limit theorem, asymptotic properties follow straightforwardly. Finally, Robinson's results¹⁵ can be seen to apply to this version of $S_1(\cdot\,,\cdot\,; G)$ and give us a means of tackling small sample behaviour.

The remaining sections are organized as follows. In Section 2.3 we reformulate the expression for $S_1(\cdot\,,\cdot\,; G)$ in a way that immediately lends itself to the accommodation of censoring. Explicit formulae are derived for some special cases. In Section 3, large sample behaviour is investigated. Saddlepoint approximations are used in Section 4 to obtain a more accurate assessment of tail probabilities for smaller samples. The multivariate problem is looked at in Section 5, and an example is considered in Section 6.

2.3. A reformulation of the Daniels coefficient

We will need some additional definitions and notation before reformulating the expression for $S_1(\cdot\,,\cdot\,; G)$. Our data will consist of the triples $(t_i, x_i, d_i)$, $i = 1, \ldots, n$, where $t_i$ is the observed survival time for the $i$th subject, $x_i$ the associated covariate value and $d_i$ the usual censoring indicator taking the value 1 for a failure and zero for a censored observation. Let $N(t) = \{N_1(t), \ldots, N_n(t)\}$ be a multivariate counting process governed by the intensity process $\Lambda(t) = \{\Lambda_1(t), \ldots, \Lambda_n(t)\}$. The univariate process $N_i(t)$ is a step function, right-continuous with a left-hand limit, which has value zero prior to an observed failure and makes a unit jump at the observed failure time, if such a failure occurs. $X(t)$ is the $n$-dimensional vector of univariate processes $x_i(t)$ which, were we to be interested in the immediate extension to time dependent covariates, should be left-continuous with right-hand limits. $Y(t)$ is an $n$-dimensional vector of univariate processes $Y_i(t)$, each left-continuous with right-hand limits; $Y_i(t)$ takes the value one if the $i$th subject is exposed to failure at $t$ and is zero otherwise. Finally, we consider $\mathcal{F}_s$, the concept upon which our generalization of the Daniels expression to censored data hinges, defined by $\mathcal{F}_s = \{X(u), Y(u), N(u) : u < s\}$, which contains all information accumulated on failures, censorings and covariate paths up until time $s$.
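The processes just defined are simple to evaluate from the observed triples. The following Python sketch is one possible encoding, assuming fixed covariates and untied times; the function names and the small data set are hypothetical.

```python
def counting_process(t_i, d_i, t):
    """N_i(t): zero before an observed failure, one from the failure time on.

    The process is right-continuous, so the jump is already counted at t = t_i.
    A censored subject (d_i = 0) never jumps.
    """
    return 1 if (d_i == 1 and t >= t_i) else 0


def at_risk(t_i, t):
    """Y_i(t): one while subject i is still exposed to failure at time t.

    The process is left-continuous, so the subject is still at risk at t = t_i.
    """
    return 1 if t <= t_i else 0


def risk_set(times, t):
    """Indices of the subjects at risk at time t."""
    return [i for i, t_i in enumerate(times) if at_risk(t_i, t)]


# Hypothetical data: status 1 = failure, 0 = censored.
times = [2, 3, 5, 7, 8]
status = [1, 0, 1, 1, 0]

# Total number of failures observed by time 7.
n_bar_at_7 = sum(counting_process(t_i, d_i, 7)
                 for t_i, d_i in zip(times, status))
at_risk_at_5 = risk_set(times, 5)
```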

Let $t_0$ be some real number larger than all of the observed survival times; for simplicity, since rescaling is always possible, we take $t_0 = 1$. Note that replacing the unrestricted summation $\sum_i \sum_j$ in equation (2.1) by $\sum_j \sum_{i<j}$ only affects the numerator by the constant multiple two. We see then, still only considering the uncensored case, that an equivalent formulation for $S_1(\cdot\,,\cdot\,; G)$, the numerator in (2.1), can be expressed in terms of a double Stieltjes integral. Indeed, apart from the constant factor 2, we have that

$$S_1(t, x; G) = \int_0^1 \int_0^s G(s, u)\,G\{x(s), x(u)\}\,d\bar{N}(u)\,d\bar{N}(s) \qquad (2.3)$$

where $\bar{N}(t) = \sum_{i=1}^{n} N_i(t)$ and where $x(t)$, a function of bounded variation, is left-continuous with a right-hand limit and equals $x_i$ when $t = t_i$. The most natural way to accommodate the possibility of censoring in the above formulation is to introduce into the scores, at least implicitly, the quantity $\mathcal{F}_s$, noting that $\mathcal{F}_u \subseteq \mathcal{F}_s$, $u < s$, the resulting scores depending on three rather than two quantities. However, rather than write $G(s, u, \mathcal{F}_s)$ we continue to write $G(s, u)$ and use the notation $S_1^c(t, x; G)$, for example, simply as a reminder that the scores $G$ are restricted to a class more limited than that outlined at the beginning of Section 2.1. All we are saying is that at time $s$ the information necessary in the evaluation of $G(s, u)$, $u \le s$, must be available to us. More technically, we require that $G(s, u)$ be adapted to the filtration $\mathcal{F}_s$.
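In the uncensored case the double Stieltjes integral in (2.3) is just a sum over failure times $s$ and, within it, over the earlier failure times $u$. The Python sketch below, assuming untied and uncensored observations, shows that this time-ordered sum is exactly half the unrestricted double sum; the function names and data are illustrative only.

```python
def sign(a, c):
    return (a > c) - (a < c)


def difference(a, c):
    return a - c


def full_double_sum(times, x, score):
    """Numerator of the Daniels coefficient: all ordered pairs i != j."""
    n = len(times)
    return sum(score(times[i], times[j]) * score(x[i], x[j])
               for i in range(n) for j in range(n) if i != j)


def time_ordered_sum(times, x, score):
    """Discrete version of the double integral: pairs (s, u) with t_u < t_s."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    total = 0
    for pos, s in enumerate(order):
        for u in order[:pos]:  # jumps of the aggregate process before t_s
            total += score(times[s], times[u]) * score(x[s], x[u])
    return total


# Hypothetical uncensored data.
times = [2.0, 3.5, 5.1, 7.0, 8.2]
x = [1.2, 0.4, 3.3, 2.9, 5.0]

full = full_double_sum(times, x, sign)
ordered = time_ordered_sum(times, x, sign)
```

The factor of two arises because each product of scores is symmetric in the pair of indices, both scores changing sign together when the indices are exchanged.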

3. LARGE SAMPLE PROPERTIES

Suppose that

$$\int_0^s G(s, u)\,G\{x(s), x(u)\}\,d\bar{N}(u)$$

is of order no greater than $n^2$, in which case $S_1^c(t, x; G)$ is of order no greater than $n^3$. For all of the scoring schemes considered in this paper this is the case. We can then argue in the same way as Daniels¹³ to derive the asymptotic normality of $S_1^c(t, x; G)$. This, however, we can already deduce using standard martingale arguments. Of more interest is the expression for the variance derived via the Daniels approach, which is in many cases simpler to evaluate than that leaning on the theory of stochastic integrals.

Let $E_p$ and $\mathrm{Var}_p$ denote expectation and variance taken with respect to the permutational distribution, conditional on the observed censoring and chosen scores; $E_s$ and $\mathrm{Var}_s$ are the expectation and variance taken with respect to the distribution of $\mathcal{F}_s$. Defining

$$V_p^{(s)} = \mathrm{Var}_p\{S_1^c(t, x; G) \mid \mathcal{F}_s; H_0\}, \qquad \nu_p^{(s)} = E_p\{S_1^c(t, x; G) \mid \mathcal{F}_s; H_0\}$$

and noting that

$$\mathrm{Var}\{S_1^c(t, x; G)\} = E_s(V_p^{(s)}) + \mathrm{Var}_s(\nu_p^{(s)})$$

we have that $\nu_p^{(s)}$ will be very close to zero. In the absence of censoring, it is exactly equal to zero and for an independent censoring mechanism it tends to zero provided the proportion of censored observations tends to some number lying strictly between 0 and 1 as sample size increases. Consequently, we have that $\mathrm{Var}\{S_1^c(t, x; G)\}$ can be well approximated by

$$E_s\{a_n S_3^c(t; G)\,S_3^c(x; G) + (n - 2)\,a_n S_2^c(t, x; G)/2\}$$

where $n$ is random and corresponds to the number of distinct failure times,

$$S_3^c(t; G) = \int_0^1 \int_0^1 \int_0^1 G(s, u)\,G(s, v)\,d\bar{N}(s)\,d\bar{N}(u)\,d\bar{N}(v)$$

$$S_3^c(x; G) = \int_0^1 \int_0^1 \int_0^1 G\{x(s), x(u)\}\,G\{x(s), x(v)\}\,d\bar{N}(s)\,d\bar{N}(u)\,d\bar{N}(v)$$

and where

$$S_2^c(t, x; G) = \int_0^1 \int_0^1 G^2(s, u)\,d\bar{N}(s)\,d\bar{N}(u) \int_0^1 \int_0^1 G^2\{x(s), x(u)\}\,d\bar{N}(s)\,d\bar{N}(u).$$

These expectations could be estimated via the procedures of the previous section, under the additional assumption of an independent censoring mechanism and some model for the cumulative hazard, or via, say, resampling methods, the bootstrap for instance.¹⁶ This additional complexity, though, can be avoided on noting that the large sample results still apply when expectations are replaced by any consistent estimator, in particular the observed sample quantities. Our approximation for $\mathrm{Var}\{S_1^c(t, x; G)\}$ then becomes

$$a_k S_3^c(t; G)\,S_3^c(x; G) + (k - 2)\,a_k S_2^c(t, x; G)/2 \qquad (3.1)$$

where $k$ is the observed number of failure times. This expression should be compared with equation (2.2). In practice then, we only need bear in mind: (1) the way $\mathcal{F}_s$ is used to define the scores; (2) that contributions to the sums only occur at distinct failures, and that the results of Daniels can then be directly applied, proceeding now as if we had vectors of dimension $k$ rather than dimension $n$. In some cases further simplification will be possible, the conditional Wilcoxon scores $G_W$ for instance. For these, $S_3^c(t; G_W)$ is easily evaluated using elementary results of integer summation (see, for example, Reference 17, p. 308). We find that

$$S_3^c(t; G_W) = \sum_{i=1}^{k} \{n(R_i) - 1\}\{n(R_i) - 2\}$$

where $n(R_i)$ denotes the number of elements in the risk set $R_i$.
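The closed form for the conditional Wilcoxon scores needs only the sizes of the risk sets at the observed failure times. A short Python sketch, reading the sum as running over the $k$ distinct failure times and using hypothetical data and function names, is:

```python
def risk_set_sizes(times, status):
    """n(R_i) at each distinct observed failure time, in time order."""
    failure_times = sorted({t for t, d in zip(times, status) if d == 1})
    return [sum(1 for t in times if t >= f) for f in failure_times]


def s3_wilcoxon(times, status):
    """Sum over failure times of {n(R_i) - 1}{n(R_i) - 2}."""
    return sum((m - 1) * (m - 2) for m in risk_set_sizes(times, status))


# Hypothetical data: status 1 = failure, 0 = censored.
times = [2, 3, 5, 7, 8]
status = [1, 0, 1, 1, 0]

sizes = risk_set_sizes(times, status)   # risk sets at the failures 2, 5 and 7
k = len(sizes)                          # observed number of failure times
total = s3_wilcoxon(times, status)
```

Censored subjects enter only through the risk set sizes, which is how the information between adjacent failures is retained.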

4. SMALL SAMPLE PROPERTIES

The sum in equation (2.3) is composed of $k(k - 1)/2$ terms. We can leave the time ordering alone and note that for any given comparison $(t_i, t_j)$, under $H_0$, absence of association, the pair $(x_i, x_j)$ could equally well have been $(x_j, x_i)$. The null permutational distribution is then defined on an ordered set of $2^{k(k-1)/2}$ values, of which $S_1^c(t, x; G)$ is just one. Its significance can be assessed by counting the number of more extreme sums in the null distribution. Approximations are necessary since, for $k$ as small as 7, the null distribution ranges over more than two million possible values.
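For very small $k$ the null distribution described above can be enumerated directly: exchanging $x_i$ and $x_j$ changes the sign of the corresponding term, so each of the $k(k-1)/2$ terms is given both signs in turn. The Python sketch below does this by brute force for a hypothetical list of terms; it is an illustration of the counting argument, not of the approximation used in practice.

```python
from itertools import product


def number_of_sign_patterns(k):
    """Size of the null distribution for k distinct failure times."""
    return 2 ** (k * (k - 1) // 2)


def exact_upper_tail(terms):
    """Proportion of sign patterns whose sum is at least the observed sum."""
    observed = sum(terms)
    sums = [sum(s * v for s, v in zip(signs, terms))
            for signs in product((1, -1), repeat=len(terms))]
    return sum(1 for t in sums if t >= observed - 1e-12) / len(sums)


# Hypothetical pairwise terms for k = 3 failure times (three comparisons).
terms = [1, 2, 3]
p_upper = exact_upper_tail(terms)
patterns_for_k7 = number_of_sign_patterns(7)
```

The count for $k = 7$ makes plain why full enumeration quickly becomes impractical.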

Robinson¹⁵ developed a simple saddlepoint approximation to this distribution. Relabelling the elements of this sum as $y_i$, $i = 1, \ldots, k(k - 1)/2$, and the null density as $\psi(u)$, we then obtain, following Robinson's development and some intermediary calculations,

$$\psi(u) = \left\{\frac{k^2(k - 1)^2}{8\pi \sum y_i^2\,\mathrm{sech}^2(\lambda_u y_i)}\right\}^{1/2} \exp\left\{\sum \log\cosh(\lambda_u y_i) - \lambda_u u\right\}$$

where

For given values of $u$, $\lambda_u$ is obtained via this second equation. In the above expressions, all sums range from 1 to $k(k - 1)/2$. The significance level was then taken to be

the integrals being evaluated numerically using orthogonal polynomials (see Reference 18, Chapter 25). It is possible to avoid numerical integration using saddlepoint approximations to the cumulative distribution but we found that this process, in our case at least, turned out to be numerically more involved.

5. MULTIVARIATE CASE

Frequently we will have recordings on one or more variables known or suspected of being correlated with survival and whose influence we would like to take into consideration when assessing any additional role of the variables under immediate study.

A multiparameter model in which the relationships between survival and the covariables are explicitly set out is one approach. However, this can lead to other problems. Even the general model of Clayton and Cuzick,⁶ with a correct choice for the distribution of the explanatory variable, may run into identifiability problems such that the dependence parameter describes something other than dependence. Alternatives have been suggested but appear to be computationally complex.

5.1. Blocking variables

Taylor looked at the problem of obtaining significance tests for Kendall's $\tau$ and Spearman's $\rho$ in the presence of a blocking variable. As a test for conditional independence, given the block, he looked at statistics of the form $T = \sum_h w_h r_h$, where $r_h$ is the value of the coefficient in the $h$th block and $w_h$ is a weight to be chosen. Four possibilities for $w_h$ were considered: unity; a factor of order $n_h$ (the number of pairs in the $h$th block); a factor of order $n_h^2$; and a factor that attempts to compensate for variance underestimation in the presence of ties.

Our suggestion would be to look at statistics of the form $T = \sum_h w_h S_{1h}^c(t, x; G)$, where $S_{1h}^c(t, x; G)$ is our test statistic in the $h$th block and where $w_h$ is inversely proportional to its estimated variance. Even so, particular departures from the null hypothesis will escape detection using $T$. For instance, a test of the form $T = \sum_h w_h \{S_{1h}^c(t, x; G)\}^2$ would have much greater power in the presence of qualitative interactions across blocks. If a quantitative interaction were suspected, a statistic of the form $T = \sum_h w_h \phi(h) S_{1h}^c(t, x; G)$, where $\phi(h)$ is some monotonic function, may be worth considering. Otherwise, should the number of failures not vary greatly across blocks, and the direction of the alternative hypothesis be unclear, then Fisher's procedure for combining tests (see Reference 22, p. 80) ought to be adequate.
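Two of the combination rules mentioned above are easily sketched in Python. The first standardizes the inverse-variance weighted sum, assuming the blocks are independent and each block statistic has null mean zero; the second is Fisher's procedure, which refers $-2\sum_h \log p_h$ to a chi-squared distribution on $2H$ degrees of freedom, whose tail probability has a closed form for even degrees of freedom. The function names and the numbers used are hypothetical.

```python
from math import exp, log, sqrt


def inverse_variance_z(stats, variances):
    """Standardized T = sum_h w_h S_h with w_h = 1 / (estimated variance).

    Assumes independent blocks, so Var(T) = sum_h w_h^2 Var(S_h).
    """
    weights = [1.0 / v for v in variances]
    t = sum(w * s for w, s in zip(weights, stats))
    var_t = sum(w * w * v for w, v in zip(weights, variances))
    return t / sqrt(var_t)


def fisher_combined_p(p_values):
    """Upper tail of a chi-squared variable on 2H degrees of freedom.

    Uses P(X > x) = exp(-x/2) * sum_{j<H} (x/2)^j / j!, valid for even
    degrees of freedom, with x = -2 * sum(log p_h).
    """
    h = len(p_values)
    half_x = -sum(log(p) for p in p_values)
    term, total = 1.0, 1.0
    for j in range(1, h):
        term *= half_x / j
        total += term
    return exp(-half_x) * total


# Hypothetical block statistics, variances and p-values.
z = inverse_variance_z([2.0, 2.0], [1.0, 1.0])
p_combined = fisher_combined_p([0.05, 0.05])
```

Fisher's procedure ignores the direction of each block's departure, which is why it suits the case where the direction of the alternative is unclear.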

5.2. Models

The fact that the generalized measure of Daniels can be interpreted as a coefficient of correlation suggests studying measures of partial association such as partial correlation coefficients. However, once we move away from the null hypothesis of no association, we quickly run into trouble. The null hypothesis of no association is still appropriate for the variable under consideration, but if we are correctly to 'account for the effects' of a second variable, then the association between this variable and survival needs to be consistently estimated. Some model for this association is thus required in advance. A useful formulation is given by the parameter $\gamma = \mathrm{pr}(T_1^* - T_2^* > 0 \mid X_1 - X_2 > 0)$, where $T_1^*$ and $T_2^*$ are the random true survival times corresponding to two subjects with associated covariate values $X_1$ and $X_2$. For a binary covariate, Efron⁹ indicates how, in the presence of censoring, $\gamma$ can be consistently estimated. In essence, rather than work with a single estimate of the marginal survival distribution under $H_0$, a separate independent estimate is obtained for each level of the covariate. For a covariate assuming $p$ levels, Efron's results are immediately available to obtain $(p - 1)$ values of $\gamma$ (and thereby, after a simple linear transformation, equivalents of Kendall's $\tau$), providing we can estimate survival at each level. For a continuous covariate this is clearly no longer possible since, in practice, any given value will only be represented once in a data set not containing ties. We could employ some regression model, a proportional hazards model for instance, to overcome this difficulty, but might prefer in such cases a less circuitous route based on measures of association for these models.²³

An attractive alternative approach stems from the model described by Efron.²⁴ This model assumes the existence of marginal transformations to produce multivariate normality. For the continuous bivariate pair $(T^*, X)$, marginal transformations to normality always exist. Efron's model consists in assuming that such transformations also produce conditional normality. Several introductory statistics books provide examples in which marginal normality is not accompanied by bivariate normality, but in most practical cases it is probably quite reasonable to assume the model at least closely approximates the true bivariate distribution (see, for instance, Reference 25). Note also that such transformations are monotone and will therefore leave any rank based procedure unchanged. In the uncensored case, the coefficient $r_F$, described in Section 2.1, provides a consistent and fully efficient estimate for $\rho$ when sampling from a bivariate normal distribution with correlation parameter $\rho$. Monotonic transformations on the marginals leave $r_F$ unchanged, and so it can be considered to be more generally applicable than, say, the product moment correlation $r$, which estimates something other than $\rho$ outside the bivariate normal model. So, if we denote Fisher-Yates scoring by $G_{FY}$, then $\rho$ in the model described by Efron,²⁴ which is more general than bivariate normality, can be estimated by

$$r_F(t, x) = S_1^c(t, x; G_{FY}) \big/ \{S_2^c(t, x; G_{FY})\}^{1/2} \qquad (5.1)$$

A matrix of pairwise correlations can then be constructed from which partial or multiple correlations can be calculated. The distribution theory would seem to be more difficult, although, in extensive simulations carried out by myself and Bob Blizard²⁶ we found that the variance of $\tanh^{-1}\{r_F(t, x)\}$ does not, to a high level of approximation, depend on $\rho$ (as in the uncensored case) and is well approximated by $1/\{E(k) - 3\}$ for an independent censoring mechanism. In practice, $E(k)$ would be replaced by $k$. Recall that, in the uncensored case, the variance of $\tanh^{-1} r$ is accurately approximated by $(n - 3)^{-1}$.
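One practical use of the variance approximation is an approximate interval for the population coefficient on the $\tanh^{-1}$ scale. The Python sketch below assumes approximate normality of the transformed coefficient, which the text does not itself assert, and uses hypothetical values for the estimate and for the number of failures $k$; the function name is ours.

```python
from math import atanh, sqrt, tanh


def fisher_z_interval(r, k, z_crit=1.96):
    """Approximate interval for the population coefficient.

    Works on the tanh^{-1} scale with variance 1 / (k - 3), where k is the
    observed number of failures, then transforms back.
    """
    z = atanh(r)
    se = 1.0 / sqrt(k - 3)
    return tanh(z - z_crit * se), tanh(z + z_crit * se)


# Hypothetical estimate and number of failures.
lower, upper = fisher_z_interval(0.3, 50)
```

Because the variance is taken not to depend on the population coefficient, the interval has the same width on the transformed scale whatever the estimate.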

6. EXAMPLE AND FURTHER POINTS

The relationship between pre-operative levels of four acute phase reactant proteins (carcinoembryonic antigen (CEA), C-reactive protein (CRP), α₁-acid glycoprotein (AGP) and α₁-antichymotrypsin (ACT)) and prognosis in gastric cancer was the subject of an investigation by Rashid et al. A proportional hazards regression model was used in this work. Careful thought needed to be given to the scaling as the marginal distributions of the covariables were extremely skew and transformations had an impact on inferences. The results reported here are based on the Fisher-Yates coefficient. Similar findings were obtained for the other coefficients considered in this article, the significance levels for tests based on Wilcoxon generalizations being only slightly but systematically weaker.

For CEA and ACT the hypothesis of independence was rejected at the 1% level. For AGP the hypothesis was rejected at the 5% level, whilst for CRP the test was marginally non-significant. All p-values became slightly less impressive when calculating tail probabilities via equation (4.1), the test for AGP becoming non-significant whereas those for CEA and ACT now became significant at the 5% level. This may have implications more generally as to the use of small sample approximations since here, with a total number of patients equal to 104 and less than 20% censoring, the study is as respectable in terms of size as many. Population estimates, under the model described in Section 5.2, were 0.24 (CEA), 0.21 (ACT), 0.18 (AGP) and 0.26 (CRP). All partial estimates dropped to values much closer to zero apart from CEA, which dropped to 0.19. This reflected well the conclusions of the analysis based on the proportional hazards model, in which the use of several nested models implied that CEA carried prognostic information not contained by the other proteins but that the converse was not so. No tests were carried out for the partial coefficients. Purely for the sake of interest we calculated a squared multiple correlation coefficient from the matrix of pairwise correlations and this gave the value 0.51, in very good agreement with the approximate coefficient for this same data set calculated under quite different considerations.
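Partial and multiple coefficients of the kind quoted above follow from the pairwise correlations by the standard formulae. The Python sketch below treats the simplest case of survival and two covariates; the correlation between the two covariates is not reported in the text, so the value used here is purely hypothetical, as are the function names.

```python
from math import sqrt


def partial_correlation(r_ty, r_tz, r_yz):
    """Correlation of t and y after removing the linear effect of z."""
    return (r_ty - r_tz * r_yz) / sqrt((1 - r_tz ** 2) * (1 - r_yz ** 2))


def squared_multiple_correlation(r_ty, r_tz, r_yz):
    """Squared multiple correlation of t on the two covariates y and z."""
    return (r_ty ** 2 + r_tz ** 2 - 2 * r_ty * r_tz * r_yz) / (1 - r_yz ** 2)


# Pairwise coefficients with survival as quoted for two of the proteins;
# the correlation between the two covariates (0.5) is a made-up value.
r_ty, r_tz, r_yz = 0.24, 0.21, 0.5

partial = partial_correlation(r_ty, r_tz, r_yz)
r_squared = squared_multiple_correlation(r_ty, r_tz, r_yz)
```

With more than two covariates the same quantities would be obtained from the inverse of the full correlation matrix.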

In this paper we have considered how the Daniels coefficient, already quite general in nature, can be extended to deal with right censored data. Approaches leaning on different considerations are no less valuable28,29 and, although we have not carried out any comparative investigation here, it may be a subject worth pursuing. One advantage of the approach of this paper is that results30-38 from counting processes and stochastic integrals readily apply.

Tests based on the Daniels coefficient may be of most use when it is unclear what should be an appropriate scale for the explanatory variable. It is difficult to give any general recommendations as to which one to use, although questions of efficiency, as in the uncensored case, may give a guide in some instances. Wilcoxon type scores are the easiest to calculate but do not use as much information as the others which are, essentially, making non-parametric transformations to uniformity or normality. One advantage is that no maximization takes place and calculations are based on closed formulae requiring no iteration. In the data set analysed above there were some ties and these were randomly split. For very severe tying (a binary covariate) our impression was that tests based on the Daniels coefficient lacked power when compared with a proportional hazards model, but this and other situations require more thorough investigation before putting forward general recommendations.
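To make the closed-form nature of the calculation concrete, the following is a minimal sketch of the Daniels generalized coefficient in the uncensored case only, Γ = Σ a_ij b_ij / √(Σ a_ij² Σ b_ij²), with antisymmetric pair scores a_ij and b_ij. Sign scores give Kendall's tau and rank differences give Spearman's rho. The censored-data scores developed in this paper are not reproduced, and the function names (`daniels_coefficient`, `sign_kernel`, `diff_kernel`, `randomly_split_ranks`) and the data are illustrative assumptions.

```python
import math
import random


def sign_kernel(u, v):
    """Pair score sign(u - v): leads to a Kendall-type coefficient."""
    return (u > v) - (u < v)


def diff_kernel(u, v):
    """Pair score u - v: Spearman-type on ranks, product-moment on raw values."""
    return u - v


def daniels_coefficient(x, y, kernel):
    """Sum of a_ij * b_ij over all pairs, normalised by the root sums of squares."""
    n = len(x)
    num = s_aa = s_bb = 0.0
    for i in range(n):
        for j in range(n):
            a = kernel(x[i], x[j])
            b = kernel(y[i], y[j])
            num += a * b
            s_aa += a * a
            s_bb += b * b
    return num / math.sqrt(s_aa * s_bb)


def randomly_split_ranks(values, rng):
    """Ranks 1..n, with tied values placed in random order among themselves."""
    order = sorted(range(len(values)), key=lambda i: (values[i], rng.random()))
    ranks = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks


# Hypothetical covariate with ties and a hypothetical uncensored response.
z = [1.0, 2.0, 2.0, 3.5, 5.0, 5.0, 7.0]
t = [8.0, 9.5, 6.0, 7.0, 3.0, 4.5, 2.0]
rng = random.Random(1995)  # fixed seed so the random tie-splitting is repeatable
rz = randomly_split_ranks(z, rng)
rt = randomly_split_ranks(t, rng)

print("Kendall-type :", daniels_coefficient(rz, rt, sign_kernel))
print("Spearman-type:", daniels_coefficient(rz, rt, diff_kernel))
```

No iteration or maximization is involved: each coefficient is a single pass over the pairs. Replacing the ranks by normal scores before applying `diff_kernel` would give a normal-scores coefficient in the same way.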

ACKNOWLEDGEMENTS

The author would like to thank the referees for suggestions, for having detected some errors in the formulae and for pointing out some important references that had been overlooked.

REFERENCES

1. D. R. Cox, ‘Regression models and life tables (with discussion)’, J. R. Statist. Soc. B, 34, 187-220 (1972).
2. R. L. Prentice, ‘Linear rank tests with right censored data’, Biometrika, 65, 167-179 (1978).
3. J. Cuzick, ‘Asymptotic properties of censored linear rank tests’, Ann. Statist., 13, 133-141 (1985).
4. R. L. Prentice and P. Marek, ‘A qualitative difference between censored data rank tests’, Biometrics, 35, 861-867 (1979).
5. K. G. Mehrotra, J. E. Michalek and D. Mihalko, ‘A relationship between two forms of linear rank procedures for censored data’, Biometrika, 69, 674-676 (1982).
6. D. Clayton and J. Cuzick, ‘Multivariate generalizations of the proportional hazards model’, J. R. Statist. Soc. A, 148, 82-117 (1985).
7. A. N. Pettitt, ‘Approximate methods using ranks for regression with censored data’, Biometrika, 70, 121-132 (1983).
8. E. A. Gehan, ‘A generalized Wilcoxon test for comparing arbitrarily singly censored samples’, Biometrika, 52, 203-223 (1965).
9. B. Efron, ‘The two sample problem with censored data’, Proceedings of the 5th Berkeley Symposium in Mathematical Statistics, Prentice-Hall, New York, 1967, pp. 831-853.
10. R. Peto and J. Peto, ‘Asymptotically efficient rank invariant test procedures (with discussion)’, J. R. Statist. Soc. A, 135, 185-206 (1972).
11. B. W. Brown, M. Hollander and R. M. Korwar, ‘Non-parametric tests of independence for censored data with applications to heart transplant studies’, Reliability and Biometry, Philadelphia, S.I.A.M., 327-354 (1974).
12. D. R. Weier and A. P. Basu, ‘An investigation of Kendall’s tau modified for censored data with applications’, J. Statist. Inf. & Planning, 4, 381-393 (1980).
13. H. E. Daniels, ‘The relation between measures of correlation in the universe of sample permutations’, Biometrika, 33, 129-135 (1944).
14. R. A. Fisher and F. Yates, Statistical Tables for Biological, Agricultural and Medical Research, Oliver and Boyd, Edinburgh, U.K. 1938.
15. J. Robinson, ‘Saddlepoint approximations for permutation tests and confidence intervals’, J. R. Statist. Soc. B, 44, 91-101 (1982).
16. B. Efron, ‘Censored data and the bootstrap’, J. Amer. Statist. Assoc., 76, 312-319 (1981).
17. T. P. Hettmansperger, Statistical Inference Based on Ranks, New York, Wiley, 1984.
18. M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions, Dover, New York, 1970.
19. H. E. Daniels, ‘Tail probability approximations’, Int. Statist. Review, 55, 37-48 (1987).
20. P. Hougaard, ‘Fitting a multivariate survival distribution’, Biostatistical Department, Novo Research Institute, Copenhagen, Denmark, 1987.
21. J. M. G. Taylor, ‘Kendall’s and Spearman’s correlation coefficients in the presence of a blocking variable’, Biometrics, 43, 409-416 (1987).
22. D. R. Cox and D. V. Hinkley, Theoretical Statistics, Chapman & Hall, London, U.K. 1974.
23. J. Kent and J. O’Quigley, ‘Measures of dependence for censored survival data’, Biometrika, 75, 525-534 (1988).
24. B. Efron, ‘Bootstrap confidence intervals for a class of parametric problems’, Biometrika, 72, 45-58 (1985).
25. E. C. Fieller, H. O. Hartley and E. S. Pearson, ‘Tests for rank correlation coefficients I’, Biometrika, 44, 470-481 (1957).
26. J. O’Quigley and R. Blizard, ‘A model for univariate and multivariate association for randomly censored data’, University of Washington, Dept. of Biostatistics Technical Report, 84 (1988).
27. S. A. Rashid, J. O’Quigley, A. T. Axon and E. H. Cooper, ‘Plasma protein profiles and prognosis in gastric cancer’, Br. J. Cancer, 45, 390-394 (1982).
28. F. E. Zegers and J. M. F. Berge, ‘Correlation coefficients for more than one scale type: an alternative to the Janson and Vegelius approach’, Psychometrika, 51, 549-557 (1986).
29. F. Marcotorchino, ‘Maximal association theory as a tool of research’, in Proceedings of the 9th Annual Meeting of the Classification Societies, North Holland, Amsterdam, 1986.
30. P. K. Andersen and O. Borgan, ‘Counting process models for life history data: A review’, Scand. J. Statist., 12, 97-158 (1985).
31. P. K. Andersen and R. D. Gill, ‘Cox’s regression model for counting processes: a large sample study’, Ann. Statist., 10, 1100-1120 (1982).
32. P. Brémaud, Point Processes and Queues, Springer-Verlag, Berlin, 1981.
33. T. R. Fleming and D. P. Harrington, Counting Processes and Survival Analysis, New York, Wiley (1991).
34. P. Hougaard, ‘A class of multivariate failure time distributions’, Biometrika, 73, 671-678 (1986).
35. J. Kalbfleisch and R. L. Prentice, The Statistical Analysis of Failure Time Data, New York, Wiley, 1980.
36. P. C. O’Brien, ‘A nonparametric test for association with censored data’, Biometrics, 34, 243-250 (1978).
37. J. O’Quigley and R. L. Prentice, ‘Nonparametric tests of association between survival time and continuously measured covariates; the logit-rank and associated procedures’, Biometrics, 47, 117-127 (1991).
38. R. Rebolledo, ‘Central limit theory for local martingales’, Z. Wahrsch. verw. Gebiete, 51, 269-286 (1980).