nonparametric methods in longitudinal studies

This article was downloaded by: [Moskow State Univ Bibliote] On: 22 May 2013, At: 13:15 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK Journal of the American Statistical Association Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/uasa20 Nonparametric Methods in Longitudinal Studies Malay Ghosh a , James E. Grizzle b & Pranab Kumar Sen b a Department of Statistics, Indian Statistical Institute, Calcutta, India b Department of Biostatistics, University of North Carolina, Chapel Hill, N.C., 27514, USA Published online: 05 Apr 2012. To cite this article: Malay Ghosh , James E. Grizzle & Pranab Kumar Sen (1973): Nonparametric Methods in Longitudinal Studies, Journal of the American Statistical Association, 68:341, 29-36 To link to this article: http://dx.doi.org/10.1080/01621459.1973.10481329 PLEASE SCROLL DOWN FOR ARTICLE Full terms and conditions of use: http://www.tandfonline.com/page/terms-and-conditions This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. The publisher does not give any warranty express or implied or make any representation that the contents will be complete or accurate or up to date. The accuracy of any instructions, formulae, and drug doses should be independently verified with primary sources. The publisher shall not be liable for any loss, actions, claims, proceedings, demand, or costs or damages whatsoever or howsoever caused arising directly or indirectly in connection with or arising out of the use of this material.


© Journal of the American Statistical Association, March 1973, Volume 68, Number 341

Applications Section

Nonparametric Methods in Longitudinal Studies

MALAY GHOSH, JAMES E. GRIZZLE and PRANAB KUMAR SEN*

Inference procedures based on some simple rank statistics are proposed and studied for the statistical analysis of longitudinal data. These robust and asymptotically efficient procedures do not require the basic assumption of multivariate normality of the underlying distributions. The theory is illustrated with two examples.

1. INTRODUCTION

The analysis of data collected in longitudinal studies presents problems due to dependence among successive observations made on the same individual. Although some problems remain, good statistical methods for the analysis of continuous variables collected in longitudinal studies have been developed recently (cf. [12, 4]). However, the assumption of multivariate normality, which these methods invariably make, is not always tenable: a continuous variable (vector) may not have the hypothesized multivariate normal distribution, or a variable may take on discrete values based on some nominal scale such as a subjective clinical score. The object of the present article is to present generally applicable rank-based statistical methods appropriate when either of these two situations prevails.

We shall start with two practical examples in Section 2 to motivate the problems, and then develop statistical methods in Section 3.

2. SOME EXAMPLES

The first example is based on an experiment whose purpose was to investigate the potential of Beta-amino propionitrile (BAPN) as a preventive of joint stiffness.¹ The right knee joints of 250 gram female Sprague-Dawley rats were immobilized in 40° flexion with an internal extra-articular splint for two or three weeks. Five groups of rats were treated as follows:

Group 1: Knees immobilized three weeks; no drug given.
Group 2: Knees immobilized three weeks; drug given all three weeks.
Group 3: Knees immobilized three weeks; drug given the first week.
Group 4: Knees immobilized three weeks; drug given the third week.
Group 5: Knees immobilized two weeks; no drug given.

At the end of the period of treatment, the rats were sacrificed, and all the soft tissue was removed from the leg except the joint capsule and collateral ligaments. The femur was mounted on the movable arm of a goniometer, and the tibia was maintained in a horizontal position so that the force produced by adding weights to the distal end of the tibia would always be perpendicular to the horizontal plane. Half gram weights were added every ten seconds, and the angle $\theta$ between the tibia and the femur was recorded after the addition of each weight. In most cases, a point was reached at which the joint continued to extend without the addition of more weight. After the voluntary extension had stopped, addition of weights was resumed until an angle of 135° was reached.

A plot of the results for a typical group is shown in Figures A and B. Each line depicts the results for a different animal in the group.

[Figure A. Group II before transformation of data: angle θ plotted against weight in grams.]

* Malay Ghosh is associate professor, Department of Statistics, Indian Statistical Institute, Calcutta, India. Both James E. Grizzle and Pranab Kumar Sen are professors, Department of Biostatistics, University of North Carolina, Chapel Hill, N.C. 27514. This work was done while Professor Ghosh was also at Chapel Hill and was supported by the National Institutes of Health, Grant GM-12868. The authors are grateful to the referees for their useful comments on the article.

¹ MANOVA for the same data under the classical normal theory model is considered in Grizzle, J. E., Essays in Probability and Statistics, R. C. Bose, et al., eds., University of North Carolina Press, 1966, pp. 311-26.


[Figure B. Group II after transformation of data: sin(θ/2) plotted against weight in grams.]

Two straight lines were fitted to the response curves for each rat. These lines fit exceedingly well; the percent of the total sum of squares accounted for by regression was always over 95 percent and usually exceeded 98 percent. The estimated regression coefficients have a physical interpretation: they are the rate of increase of the distance between the point of attachment of the leg to the apparatus and the point of addition of weight, per half gram of weight.

For an adequate summary of the original data for each animal, we propose the vector

$$(\hat\gamma_1, \hat\alpha_1, \hat\gamma_2, \hat\alpha_2)',$$

where $\hat\gamma_1$ and $\hat\alpha_1$ are the estimated intercept and the slope of the first component of the response line, while $\hat\gamma_2$ and $\hat\alpha_2$ are those for the second. The statistic $s^2_{\max}/s^2_{\min}$ (where $s^2_{\max}$ and $s^2_{\min}$ are, respectively, the largest and smallest estimated variances) showed that the within-group variances of the $\hat\alpha_1$ are not homogeneous even after making a logarithmic transformation. The estimated parameters of the individual lines are shown in Table 1. We shall use these data to illustrate the use of


The figures show that the first portion of the response curve for each animal was approximately linear while the second portion was often definitely nonlinear. One form of Hooke's law is that the extension of a spring is proportional to the weight attached to it. The joint and capsule do not behave like a spring since the femur does not resume its original position after the weight is removed, but plastic substances often obey Hooke's law with the exception in behavior noted. The quantity $y = \sin(\theta/2)$ is proportional to the distance from the point at which the leg is suspended to the point at which the weight is applied, and thus provides the transformation for changing the angular measurement to a linear measurement. Since the points of suspension and the points of addition of the one-half gram weights were constant among animals, apart from experimental error in setting up the apparatus, the fact that we do not know the constant of proportionality is no handicap. Plotting the new variable $y = \sin(\theta/2)$ against weight resulted in two approximately straight lines for each animal: the first being the response line before the spontaneous movement without additional weights, and the second being the response after the spontaneous movement had taken place and the addition of weights had been resumed. The discontinuity between the two lines is the point at which the spontaneous movement took place.
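The reduction of one animal's readings to an intercept and a slope per segment can be sketched as follows. This is only an illustration: the weights and angles are hypothetical (the raw goniometer readings are not reproduced in the article), the function names are ours, and ordinary least squares is assumed for the line fits.

```python
import math

def linearize(theta_degrees):
    """Map a joint angle in degrees to y = sin(theta / 2)."""
    return math.sin(math.radians(theta_degrees) / 2.0)

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (intercept, slope, r_squared), where r_squared is the share of
    the total sum of squares accounted for by the regression.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, 1.0 - ss_resid / ss_total

# Hypothetical readings for one animal (weight in grams, angle in degrees),
# split at the point where the spontaneous extension occurred.
first_weights = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
first_angles = [46.0, 49.5, 53.0, 57.0, 60.5, 64.0]
second_weights = [3.5, 4.0, 4.5, 5.0, 5.5]
second_angles = [98.0, 100.0, 102.5, 104.5, 107.0]

gamma1, alpha1, r2_first = fit_line(
    first_weights, [linearize(a) for a in first_angles])
gamma2, alpha2, r2_second = fit_line(
    second_weights, [linearize(a) for a in second_angles])

# Four-component summary for this animal: two intercepts and two slopes.
summary = (gamma1, alpha1, gamma2, alpha2)
```

Each animal contributes one such four-component summary, and it is these summaries, not the raw angles, that enter the rank procedures.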

1. SUMMARY OF DATA OBTAINED FROM THE REGRESSION LINES

Group   $\hat\gamma_1$   $\hat\alpha_1$   $\hat\gamma_2$   $\hat\alpha_2$

1       0.3113   0.0332   0.7777   0.0092
        0.3202   0.0287   0.7438   0.0140
        0.3786   0.0219   0.7675   0.0094
        0.3921   0.0270   0.5520   0.0236
        0.3836   0.0402   0.7800   0.0196
        0.3869   0.0383   0.7092   0.0174

2       0.4524   0.1372   0.7487   0.0255
        0.3634   0.1013   0.8189   0.0162
        0.3734   0.0569   0.7619   0.0165
        0.3756   0.0495   0.7598   0.0136
        0.4361   0.0568   0.8174   0.0192

3       0.3596   0.0604   0.8521   0.0103
        0.4001   0.0468   0.8139   0.0106
        0.3861   0.0374   0.6014   0.0229
        0.3968   0.0412   0.7283   0.0226
        0.3963   0.0286   0.7662   0.0110

4       0.3603   0.0470   0.6917   0.0437
        0.3452   0.0381   0.7473   0.0172
        0.3962   0.0407   0.7848   0.0165
        0.3763   0.0434   0.8062   0.0184
        0.3910   0.0406   0.7334   0.0293

5       0.5005   0.0725   0.7767   0.0355
        0.5699   0.0699   0.7391   0.0269
        0.6114   0.0670   0.8009   0.0176
        0.3803   0.0417   0.7540   0.0270
        0.3951   0.0765   0.8518   0.0137


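$$\hat{\boldsymbol\alpha}_{ijk} = (\hat\alpha_{ijk}^{(1)}, \ldots, \hat\alpha_{ijk}^{(p)})', \quad k = 1, \ldots, n_{ij};$$

$$\boldsymbol\lambda_j = (\lambda_j^{(1)}, \ldots, \lambda_j^{(p)})'; \quad j = 1, \ldots, c; \tag{3.3}$$

where $G_1, \ldots, G_h$ are arbitrary, and the $\boldsymbol\lambda_j$'s account for the treatment effects, i.e., the differences in the true values of $\alpha_{ij}^{(s)} - \alpha_{ij'}^{(s)}$, $1 \le j \ne j' \le c$, $s = 1, \ldots, p$.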

Hospital 1 Hospital 2

Months after treatment Months after treatmentPatient Patient

1'> 3 12 18 24 1'> 12 18 24

Treatment 1

1 3 5 3 5 0 0 5 0

2 0 0 0 0 0 0 4 0 0

by an arbitrary function $g(\cdot)$, involving a set of unknown parameters and a set of known factors, such as time and other accountable variables.

For simplicity of presentation, we shall consider the particular model (3.1). Let $\mathbf{X}_{ijk} = (X_{ijk}(t_1), \ldots, X_{ijk}(t_q))'$ be the observation (vector) for the $k$th individual in the $i$th block receiving the $j$th treatment, for $k = 1, \ldots, n_{ij}$, $1 \le i \le h$, $1 \le j \le c$. Based on $\mathbf{X}_{ijk}$, we estimate $\gamma_{ijr}$ $(1 \le r \le d)$, $\alpha_{ij}^{(s)}$ $(1 \le s \le p)$ in some convenient way and denote these estimates by $\hat\gamma_{ijr,k}$, $\hat\alpha_{ijk}^{(s)}$ $(1 \le i \le h$, $1 \le j \le c$, $1 \le s \le p$, $k = 1, \ldots, n_{ij})$. For brevity, let us confine ourselves to the model connected with the second example. Let

000

o

19

20

18

35

o

o

Treatment 2

o

5 5

o 0 0

2. CLINICAL EVALUATIONS (TYPICAL DATA)

27

28

29

57

$$j = 1, \ldots, c; \quad i = 1, \ldots, h, \tag{3.2}$$

where the $\hat\alpha$'s are the coefficients of a polynomial of degree $p - 1$. The procedure for estimating the $\boldsymbol\alpha_{ijk}$'s may be quite arbitrary. For example, we may fit a polynomial of degree $p - 1$ by the usual least squares principle, or we may use some other robust procedure, such as the one in Sen [14] and Sen and Puri [16] when $p = 2$, etc.

Since we apply the same method of estimation of $\boldsymbol\alpha_{ij}$ for all the $n_{ij}$ vectors $\mathbf{X}_{ijk}$, $k = 1, \ldots, n_{ij}$, the estimates $\hat{\boldsymbol\alpha}_{ijk}$, $k = 1, \ldots, n_{ij}$, are all defined by the same function of the vectors $\boldsymbol\epsilon_{ijk}$, $k = 1, \ldots, n_{ij}$, which are independent and identically distributed random vectors. Hence, $\hat{\boldsymbol\alpha}_{ijk}$, $k = 1, \ldots, n_{ij}$, are $n_{ij}$ independent and identically distributed random vectors, having a $p$-variate continuous distribution $G_{ij}(x_1, \ldots, x_p)$, for $j = 1, \ldots, c$, $i = 1, \ldots, h$. We adopt the model

$$G_{ij}(\mathbf{x}) = G_i(\mathbf{x} - \boldsymbol\lambda_j),$$

$$X_{ijk}(t_l) = \sum_{r=1}^{d} \gamma_{ijr}\, g_{ijr} + \sum_{s=1}^{p} \alpha_{ij}^{(s)}\, m_s(t_l) + \epsilon_{ijk}(t_l), \quad l = 1, \ldots, q, \quad k = 1, \ldots, n_{ij}, \tag{3.1}$$

$p + d \le q$, where $\gamma_{ijr}$ $(1 \le r \le d)$, $\alpha_{ij}^{(s)}$ $(1 \le s \le p)$ are unknown real constants, $g_{ijr}$ $(1 \le r \le d)$ are known constants, and $m_s(t_l)$ are known functions of $t_l$ $(1 \le s \le p$, $1 \le l \le q)$.

In the first example, $h = 1$, $d = 2$, $p = 2$; $g_{1j1}$ $(g_{1j2}) = 1$ if the observation corresponding to the $j$th treatment is associated with the first (second) portion of the curve, and $= 0$ otherwise; $m_1(t_l)$ $(m_2(t_l))$ = time in the first (second) portion of the curve, and $= 0$ otherwise. In the second example, $h = 2$, $g_{ijr} = 0$ (identically) for all $r = 1, 2, \ldots, d$, and $m_s(t_l) = t_l^{s-1}$ $(1 \le s \le p$, $1 \le l \le q)$.

We propose to consider the MANOVA tests (as regards the effects of treatments) on $\gamma_{ijr}$ and $\alpha_{ij}^{(s)}$. We assume that the error vector $\boldsymbol\epsilon_{ijk} = (\epsilon_{ijk}(t_1), \ldots, \epsilon_{ijk}(t_q))'$ has a joint ($q$-variate) distribution function (d.f.) which is continuous everywhere. The procedures developed here are not necessarily restricted to the linear models in (3.1) but can as well be applied to general models where the first two terms on the right side of (3.1) can be replaced

3. STATISTICAL METHODS

3.1 The Model

nonparametric multivariate procedures for testing homogeneity of the five groups.

In the second example, the data were collected in a study of two treatments of duodenal ulcer. Each patient's clinical status was evaluated and arbitrarily scored as worse (0), unchanged (1), slightly improved (2), moderately improved (3) or markedly improved (4) when compared to his pretreatment status, or the patient could be symptom-free (5). These patients were evaluated at 1, 1½, 3, 6, 18 and 24 months after treatment. A cursory examination of data such as those shown in Table 2 makes it clear that the patients are relapsing. We shall be interested in whether one treatment group is relapsing at a faster rate than the other.

Patients were randomly assigned to one of the two treatments within hospital. Thus the design is a randomized block design with unequal replication. In this case also, we have reduced the original data to their intercept and trend components by fitting orthogonal polynomials to the arbitrary scores. The coefficients necessary are easily derived even though the spacing is unequal. Graphical examination of each patient's regression coefficients suggests that, contrary to what one might expect, their distributions are nonnormal in spite of each regression coefficient being a linear combination of random variables. The problem is to test for treatment differences without assuming multivariate normality.
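A minimal sketch of how orthogonal polynomial coefficients can be derived for unequally spaced times is given below. It assumes Gram-Schmidt orthogonalization of the powers of time, which is one standard construction and not necessarily the one the authors used; the evaluation times are the ones read from the text above, and the patient scores are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def orthogonal_polynomials(times, degree):
    """Gram-Schmidt orthogonalization of 1, t, t^2, ... over the given times.

    Returns a list of degree + 1 mutually orthogonal contrast vectors.
    """
    basis = []
    for power in range(degree + 1):
        vec = [float(t) ** power for t in times]
        for prev in basis:
            coef = dot(vec, prev) / dot(prev, prev)
            vec = [v - coef * b for v, b in zip(vec, prev)]
        basis.append(vec)
    return basis

def polynomial_components(scores, basis):
    """Coefficient of each orthogonal polynomial in a patient's score profile."""
    return [dot(scores, vec) / dot(vec, vec) for vec in basis]

# Evaluation times in months, as read from the text (an assumption).
TIMES = [1.0, 1.5, 3.0, 6.0, 18.0, 24.0]
BASIS = orthogonal_polynomials(TIMES, degree=2)

# Hypothetical patient who is symptom-free early and then relapses.
scores = [5, 5, 4, 3, 1, 0]
intercept, linear, quadratic = polynomial_components(scores, BASIS)
```

Because the contrast vectors are orthogonal, the first component is simply the patient's mean score, and the linear component measures the trend over time; a negative value corresponds to relapse.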

Consider a set of $q$ $(\ge 2)$ distinct time points $t_1, \ldots, t_q$ (where $0 \le t_1 < t_2 < \cdots < t_q$), and let $X_{ijk}(t_l)$ be the response at the time point $t_l$ on the $k$th individual receiving the $j$th treatment from the $i$th block, for $k = 1, \ldots, n_{ij}$, $i = 1, \ldots, h$ $(\ge 2)$, $j = 1, \ldots, c$ $(\ge 2)$. Consider the model


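Our first problem is to test the null hypothesis

$$H_0^{(1)}: \boldsymbol\lambda_1 = \cdots = \boldsymbol\lambda_c = \mathbf{0}, \tag{3.4}$$

Then, on denoting by

$$\mathbf{m}_i = \tfrac{1}{2}(N_i + 1)(1, \ldots, 1)'; \quad i = 1, \ldots, h, \tag{3.12}$$

$$\bar{\mathbf{R}}_{ij} = n_{ij}^{-1} \sum_{k=1}^{n_{ij}} \mathbf{R}_{ijk} \quad \text{for } 1 \le j \le c, \ 1 \le i \le h, \tag{3.9}$$

and the Spearman rank covariance matrices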

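$$\text{if } \mathcal{L} \begin{cases} \ge \chi^2_{ph(c-1),\alpha}, & \text{reject } H_0^{(1)}, \\ < \chi^2_{ph(c-1),\alpha}, & \text{accept } H_0^{(1)}, \end{cases} \tag{3.14}$$

$$G_{ij}(\mathbf{x}) = G_i(\mathbf{x} - \boldsymbol\theta_j), \quad G_i(\mathbf{x}) = G(\mathbf{x} - \boldsymbol\beta_i), \quad i = 1, \ldots, h, \tag{3.15}$$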

and following the Chatterjee-Sen [2, 3] rank permutation principle, our statistic $\mathcal{L}_i$ is proposed as

where $\boldsymbol\beta_1, \ldots, \boldsymbol\beta_h$ are the block effects (nuisance parameters), and the treatment effects $\boldsymbol\theta_j = (\theta_j^{(1)}, \ldots, \theta_j^{(p)})'$, $(j = 1, \ldots, c)$, are assumed to have elements "close to

has asymptotically the chi-square distribution with $ph(c - 1)$ degrees of freedom. Thus, for large samples, we may proceed as follows:

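$$\mathcal{L}_i = \frac{N_i - 1}{N_i} \sum_{j=1}^{c} n_{ij} (\bar{\mathbf{R}}_{ij} - \mathbf{m}_i)' \mathbf{V}_i^{*} (\bar{\mathbf{R}}_{ij} - \mathbf{m}_i), \quad i = 1, \ldots, h. \tag{3.13}$$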
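As an illustration of (3.9) through (3.13), the sketch below computes the rank covariance matrix and the rank sum statistic for a single block ($h = 1$) from the Table 1 estimates. Several points are assumptions on our part: the function names are hypothetical, $\mathbf{V}_i$ is taken to be nonsingular so that an ordinary linear solve stands in for the generalized inverse $\mathbf{V}_i^*$, tied values receive midranks, and the last $\hat\alpha_1$ entry of Group 5 is entered as 0.0765, the reading consistent with its rank of 24 in Table 3.

```python
# Table 1 estimates (gamma1, alpha1, gamma2, alpha2), one list per group.
TABLE_1 = [
    [(0.3113, 0.0332, 0.7777, 0.0092), (0.3202, 0.0287, 0.7438, 0.0140),
     (0.3786, 0.0219, 0.7675, 0.0094), (0.3921, 0.0270, 0.5520, 0.0236),
     (0.3836, 0.0402, 0.7800, 0.0196), (0.3869, 0.0383, 0.7092, 0.0174)],
    [(0.4524, 0.1372, 0.7487, 0.0255), (0.3634, 0.1013, 0.8189, 0.0162),
     (0.3734, 0.0569, 0.7619, 0.0165), (0.3756, 0.0495, 0.7598, 0.0136),
     (0.4361, 0.0568, 0.8174, 0.0192)],
    [(0.3596, 0.0604, 0.8521, 0.0103), (0.4001, 0.0468, 0.8139, 0.0106),
     (0.3861, 0.0374, 0.6014, 0.0229), (0.3968, 0.0412, 0.7283, 0.0226),
     (0.3963, 0.0286, 0.7662, 0.0110)],
    [(0.3603, 0.0470, 0.6917, 0.0437), (0.3452, 0.0381, 0.7473, 0.0172),
     (0.3962, 0.0407, 0.7848, 0.0165), (0.3763, 0.0434, 0.8062, 0.0184),
     (0.3910, 0.0406, 0.7334, 0.0293)],
    [(0.5005, 0.0725, 0.7767, 0.0355), (0.5699, 0.0699, 0.7391, 0.0269),
     (0.6114, 0.0670, 0.8009, 0.0176), (0.3803, 0.0417, 0.7540, 0.0270),
     (0.3951, 0.0765, 0.8518, 0.0137)],  # 0.0765 is an assumed reading
]

def midranks(values):
    """Ranks 1..N; tied values share the average of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = (start + end) / 2 + 1
        start = end + 1
    return ranks

def solve(matrix, rhs):
    """Solve matrix x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def rank_sum_statistic(groups):
    """Return (statistic, degrees of freedom, rank covariance matrix) for one block."""
    rows = [obs for group in groups for obs in group]
    n_total, p = len(rows), len(rows[0])
    # Rank each variate separately over the whole block.
    columns = [midranks([row[s] for row in rows]) for s in range(p)]
    ranks = [[columns[s][k] for s in range(p)] for k in range(n_total)]
    centre = (n_total + 1) / 2  # common element of m_i in (3.12)
    # Rank covariance matrix, (3.10) and (3.11).
    v = [[sum(r[s] * r[t] for r in ranks) / n_total - centre ** 2
          for t in range(p)] for s in range(p)]
    total, start = 0.0, 0
    for group in groups:
        block = ranks[start:start + len(group)]
        start += len(group)
        # Centered average ranks, (3.9) minus (3.12).
        dev = [sum(r[s] for r in block) / len(group) - centre for s in range(p)]
        total += len(group) * sum(d * w for d, w in zip(dev, solve(v, dev)))
    return (n_total - 1) / n_total * total, p * (len(groups) - 1), v

statistic, degrees_of_freedom, rank_cov = rank_sum_statistic(TABLE_1)
```

With several blocks, the same function would be applied to each block separately and the resulting statistics and degrees of freedom summed.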

The permutation distribution theory of $\mathcal{L}_i$ has been studied in detail by Chatterjee and Sen [3], and Puri and Sen [8]. It follows from their results that under $H_0^{(1)}$, $\mathcal{L}_i$ is asymptotically distributed as a central $\chi^2$ with $p(c - 1)$ degrees of freedom for each $i = 1, 2, \ldots, h$. Since $\mathcal{L}_1, \ldots, \mathcal{L}_h$ are stochastically independent, the distribution of

where $\chi^2_{t,\alpha}$ is the upper $100\alpha$ percent point of the chi-square distribution with $t$ d.f., and $\alpha$ $(0 < \alpha < 1)$ is the desired level of significance of the test.

3.3 Test of Homogeneity Under the Assumption of No Block vs Treatment Interaction

The method just presented will not be the most efficient (powerful) that could be used when there is no interaction. One can reason heuristically as follows: If there is no interaction, one should be able to devise a test statistic based on $(c - 1)p$ degrees of freedom by averaging over the $h$ blocks which would have very nearly the same numerical value. Instead of being compared to the tabular value of $\chi^2$ with $hp(c - 1)$ degrees of freedom to determine its significance level, it would be compared to the smaller tabular value with $p(c - 1)$ degrees of freedom.

The classical weighted least squares analysis based on the centered mean rank scores can be used in the present case. The method proposed can, under the assumption of additivity of block effects, give the best asymptotic normal estimates of the treatment effects averaged over the blocks when the treatment effects under the alternatives are "close" (e.g., in the Pitman sense) to the null hypothesis. We want to test the null hypothesis $H_0^{(1)}$ given in (3.4) against the alternatives

$$i = 1, \ldots, h, \quad j = 1, \ldots, c, \tag{3.5}$$

$$\mathbf{V}_i = ((v_{i,ss'}))_{s,s'=1,\ldots,p}; \quad i = 1, \ldots, h \tag{3.10}$$

and state the hypothesis of no interaction as

$$H_0^{(2)}: \boldsymbol\gamma_{11} = \cdots = \boldsymbol\gamma_{hc} = \mathbf{0}. \tag{3.6}$$

The tests we develop are based on suitable rank tests for MANOVA considered by Chatterjee and Sen [3], Puri and Sen [8], Sen [13] and others.

without assuming the additivity of the block effects. Second, we want to study whether the assumption of additivity of the block effects can be utilized to provide any improvement in the tests for (3.4). Finally, we want to test for no interaction between blocks and treatments.

We define

whose elements are numbers from 1 to $N_i = \sum_{j=1}^{c} n_{ij}$. Hence, the $N_i$ observations $\hat{\boldsymbol\alpha}_{ijk}$ in the $i$th block yield the rank collection matrix

where

3.2 Test of Homogeneity of Treatment Effects Without the Assumption of Additivity of Block Effects

Rank $\hat\alpha_{i11}^{(s)}, \ldots, \hat\alpha_{i1n_{i1}}^{(s)}, \ldots, \hat\alpha_{ic1}^{(s)}, \ldots, \hat\alpha_{icn_{ic}}^{(s)}$ in ascending order of magnitude and let $R_{ijk}^{(s)}$ be the rank of $\hat\alpha_{ijk}^{(s)}$ in this set; by virtue of the continuity of the joint distribution of the $\hat{\boldsymbol\alpha}_{ijk}$'s, the possibility of ties may be neglected, in probability. Note that the ranking is done separately for each block and, also within each block, separately for each $s$ $(= 1, \ldots, p)$. Thus, corresponding to the vector $\hat{\boldsymbol\alpha}_{ijk}$ we have a $p$-vector

$$\mathbf{R}_{ijk} = (R_{ijk}^{(1)}, \ldots, R_{ijk}^{(p)})', \tag{3.7}$$

(Each row of $\mathbf{R}_i^{N_i}$ consists of a permutation of the numbers $1, 2, \ldots, N_i$.)

Our proposed test is based on a statistic $\mathcal{L} = \sum_{i=1}^{h} \mathcal{L}_i$, where $\mathcal{L}_i$ is the multivariate multisample rank sum statistic for the $i$th set, considered in detail in [2, 3]. For this we consider the average ranks

$$v_{i,ss'} = N_i^{-1} \sum_{j=1}^{c} \sum_{k=1}^{n_{ij}} R_{ijk}^{(s)} R_{ijk}^{(s')} - \tfrac{1}{4}(N_i + 1)^2; \quad s, s' = 1, \ldots, p; \tag{3.11}$$

and let $\mathbf{V}_i^{*}$ be a generalized inverse of $\mathbf{V}_i$, $i = 1, \ldots, h$.


Then, we take,

(3.28)

(3.29)

(3.35)

(3.31)

(3.34)

(3.33)

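$$\mathcal{L}^* = \mathbf{U}'(\hat{\boldsymbol\nu}^{-1} \otimes \mathbf{A}^*)\mathbf{U},$$

Finally, on defining

$$\hat{\boldsymbol\nu} = h^{-1} \sum_{i=1}^{h} (N_i + 1)^{-2}\, \mathbf{V}_i,$$

$$\mathbf{A}^* = \operatorname{Diag}(n_{01}, \ldots, n_{0c}).$$

Consequently, we have

$$\mathbf{A} = ((n_{0j}^{-1}\delta_{jj'} - N^{-1})).$$

Hence, we may simplify (3.31) as

$$\hat{\boldsymbol\nu} \to \boldsymbol\nu, \quad \text{in probability.}$$

If we now express $\bar{\mathbf{Z}}$ as a column vector

$$j = 2, \ldots, c, \quad s = 1, \ldots, p, \quad i = 1, \ldots, h. \tag{3.36}$$

where $((\hat\nu^{ss'})) = \hat{\boldsymbol\nu}^{-1}$.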

The test for $H_0^{(1)}$ is based on $\mathcal{L}^*$, using the same rule as in (3.14), with the degrees of freedom adjusted as $p(c - 1)$.

For certain asymptotic properties of the tests based on $\mathcal{L}^*$ and $\mathcal{L}$, we may refer to Sen [13], who considered the special case of $c = 2$.

$$\mathbf{A}_i^* = \operatorname{Diag}(n_{i1}, \ldots, n_{ic}), \quad i = 1, \ldots, h, \tag{3.32}$$

so that on letting $n_{0j} = \sum_{i=1}^{h} n_{ij}$, we have by (3.24),

and from the results of Chatterjee and Sen [3], and Puri and Sen [8], it follows that under $H_0^{(2)}$: $\boldsymbol\Theta = \mathbf{0}$, $\mathcal{L}^*$ has approximately the chi-square distribution with $p(c - 1)$ degrees of freedom. The reduction in the degrees of freedom, made at the cost of the assumption of additivity of block effects, usually improves the efficiency of $\mathcal{L}^*$ as compared to that of $\mathcal{L}$.

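We can also take

where the $\mathbf{V}_i$ are defined by (3.10) and (3.11), we have from Chatterjee and Sen [3] that

$$(\bar{Z}_1^{(1)}, \ldots, \bar{Z}_1^{(p)}, \ldots, \bar{Z}_c^{(1)}, \ldots, \bar{Z}_c^{(p)}) = \mathbf{U}', \tag{3.30}$$

say, we can propose our test statistic as

3.4 Test for Interaction

Under the hypothesis of no interaction, for the $i$th replicate $\hat{\boldsymbol\alpha}_{i11}$, $\hat{\boldsymbol\alpha}_{i21} - \boldsymbol\Delta_{12}$, $\ldots$, $\hat{\boldsymbol\alpha}_{ic1} - \boldsymbol\Delta_{1c}$ $(i = 1, 2, \ldots, h)$ are all identically distributed random variables, where $\boldsymbol\Delta_{1j} = \boldsymbol\lambda_j - \boldsymbol\lambda_1$ $(j = 2, 3, \ldots, c)$. In other words, $\boldsymbol\Delta_{12}, \ldots, \boldsymbol\Delta_{1c}$ are the alignments needed for the observations corresponding to different samples (treatments) to have identical distributions. Let $\hat{\boldsymbol\Delta}_{12}^{(i)}, \ldots, \hat{\boldsymbol\Delta}_{1c}^{(i)}$ denote two-sample Wilcoxon scores estimators of $\boldsymbol\Delta_{12}, \ldots, \boldsymbol\Delta_{1c}$, respectively, based on the data relating to the $i$th replicate, for $i = 1, \ldots, h$. Then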

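$$\mathbf{Z}_i = ((N_i + 1)^{-1}\bar{\mathbf{R}}_{i1} - \tfrac{1}{2}\mathbf{1}, \ldots, (N_i + 1)^{-1}\bar{\mathbf{R}}_{ic} - \tfrac{1}{2}\mathbf{1}), \quad i = 1, \ldots, h, \tag{3.16}$$

$$\boldsymbol\Gamma = \mathbf{B}\boldsymbol\Theta = \mathbf{B}(\boldsymbol\theta_1, \ldots, \boldsymbol\theta_c), \quad \mathbf{B} = \operatorname{Diag}(B_1, \ldots, B_p); \tag{3.17}$$

$$B_s = \int_{-\infty}^{\infty} g_s^2(x)\,dx, \quad s = 1, \ldots, p, \tag{3.18}$$

$$E(\mathbf{Z}_i) \approx \boldsymbol\Gamma\mathbf{D}_i, \quad i = 1, \ldots, h, \tag{3.19}$$

$$V(\mathbf{Z}_i) \approx \boldsymbol\nu \otimes \mathbf{A}_i, \quad i = 1, \ldots, h, \tag{3.20}$$

$$\nu_{ss'} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} [G_{[s]}(x) - \tfrac{1}{2}][G_{[s']}(y) - \tfrac{1}{2}]\,dG_{[s,s']}(x, y), \tag{3.22}$$

$$\bar{\mathbf{Z}} = \Big(\sum_{i=1}^{h} \mathbf{Z}_i\mathbf{A}_i^*\Big)\mathbf{A}. \tag{3.25}$$

Then, by (3.19), (3.20), (3.24), and (3.25), we have

$$E\bar{\mathbf{Z}} \approx \boldsymbol\Gamma\Big(\sum_{i=1}^{h} \mathbf{D}_i\mathbf{A}_i^*\Big)\mathbf{A} \quad (= \mathbf{0} \text{ when } \boldsymbol\Theta = \mathbf{0}), \tag{3.26}$$

$$\operatorname{Var}(\bar{\mathbf{Z}}) \approx \boldsymbol\nu \otimes \mathbf{A}. \tag{3.27}$$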

where $G_{[s]}$ is the marginal cdf of the $s$th variate and $G_{[s,s']}$ is the bivariate joint cdf of the $(s, s')$th variates; note that for $s = s'$, $\nu_{ss} = 1/12$.

If the design is balanced, i.e., $n_{ij}/N_i$ does not depend on $i$ and $j$, an optimal linear compound of the adjusted rank averages is

where $g_s(x)$ is the marginal density of the $s$th variate for the cdf $G(\mathbf{x})$, defined in (3.15). Then, by the same technique as in [3] and [8], it follows that when the $\theta_j^{(s)}$ are small and the $n_{ij}$ are not small,

where

$$\mathbf{D}_i = ((\delta_{jj'} - n_{ij}/N_i)), \quad \mathbf{A}_i = ((n_{ij}^{-1}\delta_{jj'} - N_i^{-1})), \quad i = 1, \ldots, h, \tag{3.21}$$

$\delta_{jj'}$ is the usual Kronecker delta, and $\boldsymbol\nu$ is the $p \times p$ matrix of the grade covariances for the cdf $G$, i.e.,

zero." [More precisely, we write $\boldsymbol\theta_j = N^{-1/2}\boldsymbol\lambda_j$, $j = 1, \ldots, c$, where $N = \sum_{i=1}^{h} N_i$ and the $\boldsymbol\lambda_j$ contain finite elements.]

Since the $\bar{\mathbf{R}}_{ij} - \mathbf{m}_i$ defined by (3.9) and (3.12) are unaffected by the variation of the nuisance parameters $\boldsymbol\beta_1, \ldots, \boldsymbol\beta_h$, we like to consider a linear compound of the $h$ sets of vectors $\{\bar{\mathbf{R}}_{ij} - \mathbf{m}_i,\ j = 1, \ldots, c\}$, $i = 1, \ldots, h$, having some desirable properties. For this, we write $\mathbf{1} = (1, \ldots, 1)'$, and let

and one can then construct a suitable quadratic form in the elements of $\bar{\mathbf{Z}}$ as a test statistic. In the unbalanced case, the picture is a bit more complicated and is presented below.

We denote by $\mathbf{A}_i^*$ a generalized inverse of $\mathbf{A}_i$, $i = 1, \ldots, h$, and let


3. RANKS AND ADJUSTED RANKS OF DATA FOR EXAMPLE 1

           $\hat\gamma_1$            $\hat\alpha_1$            $\hat\gamma_2$            $\hat\alpha_2$
Group   Rank   Adj. rank   Rank   Adj. rank   Rank   Adj. rank   Rank   Adj. rank

1        1.0    -12.5       5.0     -8.5      17.0      3.5       1.0    -12.5
         2.0    -11.5       4.0     -9.5       8.0     -5.5       8.0     -5.5
        10.0     -3.5       1.0    -12.5      15.0      1.5       2.0    -11.5
        16.0      2.5       2.0    -11.5       1.0    -12.5      20.0      6.5
        12.0     -1.5       9.0     -4.5      18.0      4.5      17.0      3.5
        14.0      0.5       8.0     -5.5       4.0     -9.5      13.0     -0.5
Mean     9.17    -4.33      4.83    -8.67     10.50    -3.00     10.17    -3.33

2       23.0      9.5      26.0     12.5      10.0     -3.5      21.0      7.5
         6.0     -7.5      25.0     11.5      24.0     10.5       9.0     -4.5
         7.0     -6.5      19.0      5.5      13.0     -0.5      10.5     -3.0
         8.0     -5.5      17.0      3.5      12.0     -1.5       6.0     -7.5
        22.0      8.5      18.0      4.5      23.0      9.5      16.0      2.5
Mean    13.20    -0.30     21.00     7.50     16.40     2.90     12.50    -1.00

3        4.0     -9.5      20.0      6.5      26.0     12.5       3.0    -10.5
        21.0      7.5      15.0      1.5      22.0      8.5       4.0     -9.5
        13.0     -0.5       6.0     -7.5       2.0    -11.5      19.0      5.5
        20.0      6.5      12.0     -1.5       5.0     -8.5      18.0      4.5
        19.0      5.5       3.0    -10.5      14.0      0.5       5.0     -8.5
Mean    15.40     1.90     11.20    -2.30     13.80     0.30      9.80    -3.70

4        5.0     -8.5      16.0      2.5       3.0    -10.5      26.0     12.5
         3.0    -10.5       7.0     -6.5       9.0     -4.5      12.0     -1.5
        18.0      4.5      11.0     -2.5      19.0      5.5      10.5     -3.0
         9.0     -4.5      14.0      0.5      21.0      7.5      15.0      1.5
        15.0      1.5      10.0     -3.5       6.0     -7.5      24.0     10.5
Mean    10.00    -3.50     11.60    -1.90     11.60    -1.90     17.50     4.00

5       24.0     10.5      23.0      9.5      16.0      2.5      25.0     11.5
        25.0     11.5      22.0      8.5       7.0     -6.5      22.0      8.5
        26.0     12.5      21.0      7.5      20.0      6.5      14.0      0.5
        11.0     -2.5      13.0     -0.5      11.0     -2.5      23.0      9.5
        17.0      3.5      24.0     10.5      25.0     11.5       7.0     -6.5
Mean    20.60     7.10     20.60     7.10     15.80     2.30     18.20     4.70

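We denote by

$$\hat{\boldsymbol\Delta}_i = \begin{pmatrix} \hat\Delta_{i2}^{(1)} & \cdots & \hat\Delta_{i2}^{(p)} \\ \vdots & & \vdots \\ \hat\Delta_{ic}^{(1)} & \cdots & \hat\Delta_{ic}^{(p)} \end{pmatrix}, \quad i = 1, \ldots, h; \tag{3.37}$$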

estimator of the true $\Delta_{0j}$, the difference between the aligned Spearman rank correlation matrix and the true Spearman rank correlation matrix converges in probability to a null matrix. On the other hand, for linear rank statistics, such as the rank averages, using the recent results of Sen [15], Koul [6] and Jureckova [5] on the asymptotic linearity in translation parameters, it follows that the rank averages of the aligned observations $\tilde\alpha_{ijk}^{(s)}$ in (3.41) can be expressed as a linear function of the rank averages of the $\hat\alpha_{ijk}^{(s)} - \Delta_{0j}^{(s)}$, a linear combination of the $\hat\Delta_{0j}^{(s)}$ and a residual term which becomes negligible (in probability) for large $n_{ij}$. Thus, it follows that $\hat{\mathcal{L}}$ can be expressed for large sample sizes as a quadratic form in the $hp(c - 1)$ estimates $\hat\Delta_{ij}^{(s)}$, $j = 2, \ldots, c$, $s = 1, \ldots, p$ and $i = 1, \ldots, h$. Hence, using the multinormality of these estimates, studied by Puri and Sen [9], it follows that under the hypothesis of no interaction, $\hat{\mathcal{L}}$ has approximately the chi-square distribution with $p(c - 1)(h - 1)$ degrees of freedom. Thus, we have a test for $H_0^{(2)}$, based on $\hat{\mathcal{L}}$, where we use the same rule as in (3.14) with the degrees of freedom adjusted as $p(c - 1)(h - 1)$.
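The alignment idea can be illustrated for one variate and one pair of treatments within a single replicate. The sketch assumes that the two-sample Wilcoxon scores estimator of a shift is the median of all pairwise differences (the Hodges-Lehmann form); the function names and the data are hypothetical, and the pooling of the per-replicate estimates across blocks is not shown.

```python
from statistics import median

def wilcoxon_shift_estimate(first_sample, other_sample):
    """Median of all pairwise differences (other - first).

    Assumed here to be the two-sample Wilcoxon scores estimator of the
    shift between the two treatment samples.
    """
    return median(y - x for x in first_sample for y in other_sample)

def align(sample, shift):
    """Subtract the estimated shift so the sample is comparable with treatment 1."""
    return [y - shift for y in sample]

# Hypothetical slope estimates for treatments 1 and 2 in one replicate.
treatment_1 = [0.031, 0.027, 0.040, 0.038]
treatment_2 = [0.057, 0.049, 0.101, 0.056, 0.137]

shift_12 = wilcoxon_shift_estimate(treatment_1, treatment_2)
aligned_2 = align(treatment_2, shift_12)
```

After alignment, the ranking and the quadratic form are computed exactly as for the unaligned data, but on the aligned values.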

3.5 Use of General Rank Statistics

In the same way as Puri and Sen [8] extend the rank sum test of Chatterjee and Sen [3] to a general class of rank order tests, we can also extend the statistics $\mathcal{L}$, $\mathcal{L}^*$ and $\hat{\mathcal{L}}$ to a general class of rank order statistics. This extension is based on a replacement of the ranks $1, \ldots, N_i$ by rank scores $E_{N_i,1}, \ldots, E_{N_i,N_i}$, where $E_{N_i,j} = E_{N_i}(j)$, $j = 1, \ldots, N_i$, are explicitly known functions of the ranks $1, \ldots, N_i$. For example, if

The statistics $\hat{\mathcal{L}}_i$, $i = 1, \ldots, h$, are based on average ranks of the aligned observations and their Spearman rank correlation matrix. Since the $\hat\Delta_{0j}$ is a consistent

$$\mathbf{D}_i = ((n_{i1}^{-1} + \delta_{jj'}\, n_{ij}^{-1})), \quad j, j' = 2, \ldots, c; \quad i = 1, \ldots, h; \tag{3.38}$$

where $\delta_{jj'}$ is the usual Kronecker delta, and let

$$E_{N_i}(j) = \begin{cases} \cdots, & j \ge [\tfrac{1}{2}N_i], \\ \cdots, & j < [\tfrac{1}{2}N_i], \end{cases} \tag{3.43}$$

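4. RANK COVARIANCE MATRIX AND SPEARMAN CORRELATIONS FOR EXAMPLE 1*

Coefficient      $\hat\gamma_1$   $\hat\alpha_1$   $\hat\gamma_2$   $\hat\alpha_2$
$\hat\gamma_1$    56.25    18.98     1.75    21.11
$\hat\alpha_1$      .34      ...      ...    11.87
$\hat\gamma_2$      .03      ...      ...   -29.87
$\hat\alpha_2$      .38      ...      ...    56.25

* Correlations shown below the diagonal.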

we are led to the multivariate median tests, considered earlier by Chatterjee and Sen [2]. Similarly, if $E_{N_i}(j)$ is the expected value of the $j$th smallest observation of a sample of size $N_i$ from the standard normal distribution, $j = 1, \ldots, N_i$, we have the so-called normal scores, and the tests based on these were studied earlier by Puri and Sen [8].

We need to replace, in (3.9) and (3.10), the ranks by the rank scores, so that $m_i^{(s)} = N_i^{-1}\sum_{j=1}^{N_i} E_{N_i,j}^{(s)}$, $s = 1,$
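The replacement of ranks by rank scores can be sketched as below. Two conventions here are assumptions, not the article's definitions: the median scores are taken as an indicator of ranks above $N/2$, and the normal scores are approximated by the van der Waerden form $\Phi^{-1}(j/(N+1))$, since exact expected normal order statistics require tables or numerical integration. The function names are hypothetical.

```python
from statistics import NormalDist

def median_scores(n):
    """Indicator scores: 1 for ranks above n / 2, else 0 (assumed convention)."""
    return [1.0 if j > n / 2 else 0.0 for j in range(1, n + 1)]

def approximate_normal_scores(n):
    """Van der Waerden scores, an approximation to the expected values of
    standard normal order statistics (not the exact normal scores)."""
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf(j / (n + 1)) for j in range(1, n + 1)]

def apply_scores(ranks, scores):
    """Replace each untied rank r (1-based) by the score scores[r - 1]."""
    return [scores[int(r) - 1] for r in ranks]

ranks = [5, 4, 1, 2, 9, 8, 3, 6, 7]  # hypothetical ranks within one block
scored = apply_scores(ranks, approximate_normal_scores(len(ranks)))
```

The scored values then take the place of the ranks in the average-score vectors and in the covariance matrix; with tied ranks, the scores of the tied positions would have to be averaged, which this sketch does not do.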

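$$\hat{\boldsymbol\Delta}_0 = \mathbf{D}_0\Big(\sum_{i=1}^{h} \mathbf{D}_i^{-1}\hat{\boldsymbol\Delta}_i\Big) \tag{3.39}$$

$$= ((\hat\Delta_{0j}^{(s)})), \quad j = 2, \ldots, c; \quad s = 1, \ldots, p, \tag{3.40}$$

and conventionally, we let $\hat\Delta_{01}^{(s)} = \Delta_{01}^{(s)} = 0$ for $s = 1, \ldots, p$. Finally, let

$$\tilde\alpha_{ijk}^{(s)} = \hat\alpha_{ijk}^{(s)} - \hat\Delta_{0j}^{(s)}, \quad k = 1, \ldots, n_{ij}, \quad j = 1, \ldots, c, \quad s = 1, \ldots, p, \quad i = 1, \ldots, h. \tag{3.41}$$

We consider these aligned observations, and then proceeding as in (3.7) through (3.12), but replacing everywhere $\hat\alpha_{ijk}^{(s)}$ by $\tilde\alpha_{ijk}^{(s)}$, we define $\hat{\mathcal{L}}_i$, $i = 1, \ldots, h$, and finally, let

$$\hat{\mathcal{L}} = \sum_{i=1}^{h} \hat{\mathcal{L}}_i. \tag{3.42}$$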


6. RANK COVARIANCE AND CORRELATION MATRICES FOR EXAMPLE 2*

$\ldots, p$. The construction of $\mathcal{L}_i$ and $\hat{\mathcal{L}}_i$ remains the same. The estimates of $\hat\Delta_{ij}^{(s)}$, based on general rank scores, are more complicated; see Puri and Sen [9]. However, the construction of $\hat{\mathcal{L}}$ rests on the same alignment procedure.

Term      Hospital 1: Intercept   Linear term   Quadratic term      Hospital 2: Intercept   Linear term   Quadratic term

* The Spearman rank correlations shown below the diagonal are calculated by $\boldsymbol\rho_i = [12/(N_i^2 - 1)]\mathbf{V}_i$.

= 29.77, where $\mathcal{L}_i$ is computed by (3.13). Its value should be compared to the percentage points of the central $\chi^2$ with 16 degrees of freedom. The observed value of the test statistic is well beyond the five percent critical value.
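The comparison with the chi-square percentage points can be checked numerically. The sketch below uses the closed-form upper tail of the chi-square distribution, which holds only for an even number of degrees of freedom; the function name is ours.

```python
import math

def chi_square_upper_tail(x, df):
    """P(chi-square with df degrees of freedom exceeds x), for even df only.

    Uses the identity P = exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# Observed statistic for the first example, with 16 degrees of freedom.
p_value_example_1 = chi_square_upper_tail(29.77, 16)
```

The same function applies to any of the tests in this section whose degrees of freedom are even; for odd degrees of freedom a general incomplete gamma routine would be needed instead.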

The basic statistics W and V can be used to calculate a univariate Kruskal-Wallis test for each of the four variables, and a bivariate test can be made for the homogeneity of parameters for the five treatments in each segment of the response curve. These tests yield ... freedom for the four univariate K-W tests, and 22.20 and 12.65, each with eight degrees of freedom, for the bivariate tests of homogeneity within each segment of the response line.
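A univariate Kruskal-Wallis statistic can be obtained directly from the group rank sums. In the sketch below the rank sums are the column totals of the ranks in Table 3; the resulting values are our own computation and are not quoted from the article. The $\hat\alpha_2$ column is left out because it contains a tie, which would call for a tie correction that the sketch omits.

```python
def kruskal_wallis(rank_sums, sizes):
    """Kruskal-Wallis H from group rank sums, assuming no ties:
    H = 12 / (N (N + 1)) * sum(R_j^2 / n_j) - 3 (N + 1)."""
    n_total = sum(sizes)
    weighted = sum(r * r / n for r, n in zip(rank_sums, sizes))
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)

GROUP_SIZES = [6, 5, 5, 5, 5]
RANK_SUMS = {
    "gamma1": [55, 66, 77, 50, 103],
    "alpha1": [29, 105, 56, 58, 103],
    "gamma2": [63, 82, 69, 58, 79],
}

# Each statistic is referred to chi-square with c - 1 = 4 degrees of freedom.
h_statistics = {name: kruskal_wallis(sums, GROUP_SIZES)
                for name, sums in RANK_SUMS.items()}
```

On these rank sums the first-segment slope gives by far the largest statistic of the three.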

The two bivariate tests show that the most important differences occur within the first segment of the response line and that within this segment differences among the slopes are the most important. Inspection of the mean ranks shows that animals kept on BAPN three weeks (Group 2) had the largest slopes, i.e., had the greatest linear extension with the addition of each weight, or to put it more directly, had the least stiffness in their joints. However, one should note that Group 5, which received no BAPN and spent only two weeks in the cast, did as well. All treatments produced a more flexible joint than that of animals kept for three weeks without treatment.

All the intercepts should have been the same. However, the test statistics suggest that this may not be so. Probably this can be ascribed to the animals with less joint stiffness undergoing a spontaneous extension due to the weight of the extended limb in the measuring apparatus. This is further supported by the positive correlation between the slopes and intercepts of the first portion of the response line (Spearman correlation = .34).

In the second example, the estimated parameters obtained by fitting a second degree polynomial were ranked across treatments within each hospital. Table 2 shows typical data used in the second example, and the relevant summary statistics are displayed in Tables 5 and 6. The test statistic (based on a pooled estimate of the covariance matrix) has a $\chi^2$ distribution with six degrees of freedom under $H_0^{(1)}$. The observed value of this statistic is 9.26, which does not attain the .05 critical value of 12.59. The computation is straightforward except for the block by treatment interaction, which is accomplished as follows.

Compute all possible $n_{i1} n_{i2}$ differences among treatments for each group (block) and each variable. Then

[Table fragment: values 32.47, -33.74, 101.61, -0.08, -65.04, 269.15; row labels Intercept, Linear term, Quadratic term.]

5. MEAN RANKS AND ADJUSTED MEAN RANKS FOR EXAMPLE 2

3.6 Asymptotic Relative Efficiency (ARE) Results

For testing $H_0^{(1)}$ (when $h = 1$), the classical parametric test based on the (normal theory) likelihood ratio criterion is asymptotically equivalent to the alternative criterion based on the Hotelling-Lawley trace. Either of these statistics has asymptotically a chi-square distribution with $p(c - 1)$ degrees of freedom when $H_0^{(1)}$ holds, and has a noncentral chi-square distribution for local translation alternatives. Chatterjee and Sen [3] and Puri and Sen [8] have compared the relative performances of the proposed rank test and the likelihood ratio test when the underlying distribution is not necessarily normal. Since both statistics have, asymptotically (for local alternatives), noncentral chi-square distributions with the same degrees of freedom, differing only in the noncentrality parameters, the ratio of the noncentrality parameters provides a measure of the ARE. Unlike the univariate case, this ratio depends on the actual shift under the alternative hypotheses; however, certain useful lower and upper bounds are available in Puri and Sen [10, Chapter 5]. The proposed tests can be justified by their robustness against gross errors or outliers, and by their high ARE for distributions with heavy tails. The bounds for the ARE for $h = 1$ remain good for both $H_0^{(1)}$ and $H_0^{(2)}$ when $h \geq 2$, but the study of the exact ARE becomes more difficult. We may conclude this section by saying that the use of the normal scores leads us to asymptotically most efficient procedures when the underlying distribution is also multivariate normal. Computationally, the rank sum statistic appears to be simpler and, in many cases, reasonably efficient too.
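The two score choices mentioned above can be contrasted in a few lines. This is a hedged sketch only: the function names are illustrative, and the normal scores are approximated by normal quantiles at r/(N + 1), which is one common convention and not necessarily the exact definition used in the paper.

```python
from statistics import NormalDist

def rank_sum_scores(n):
    """Scores proportional to the ranks themselves: r / (n + 1), r = 1, ..., n."""
    return [r / (n + 1) for r in range(1, n + 1)]

def approx_normal_scores(n):
    """Normal quantiles at r / (n + 1); an approximation to expected normal order statistics."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf(r / (n + 1)) for r in range(1, n + 1)]

wilcoxon = rank_sum_scores(5)
normal = approx_normal_scores(5)
for r, (a, b) in enumerate(zip(wilcoxon, normal), start=1):
    print(r, round(a, 3), round(b, 3))
```

The rank sum scores need only the ranks, while the normal scores need a quantile evaluation for each rank, which is the computational difference the text alludes to.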

4. APPLICATIONS

The data used in the first example (Table 1) consist of the intercepts and slopes of the two portions of the individual response lines described in the first part of Section 2. The design used in this study is completely randomized without blocking. The matrix of ranks R, its corresponding adjusted rank matrix W, and the means of the adjusted and unadjusted ranks are shown in Table 3. The rank covariance matrix and the Spearman rank correlations are computed by (3.10) and (3.11) and are shown in Table 4. The test statistic $\mathcal{L} = \sum_i \mathcal{L}_i$

                      Mean rank                                   Adjusted mean rank
Hospital  Treatment   Intercept  Linear term  Quadratic term     Intercept  Linear term  Quadratic term
1         1           27.85      25.39        25.39              -1.15      -2.00        -3.61
1         2           30.03      30.80        32.25               1.03       1.80         3.25
2         1           16.89      14.50        18.94              -1.11      -3.58         0.94
2         2           19.18      21.11        11.00               1.18       3.71        -1.00
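Mean ranks and adjusted mean ranks of this kind can be sketched as below. The sketch assumes that ranking is done within each hospital and that the adjustment is centering at (N + 1)/2, where N is the number of subjects in the hospital; this is consistent with the Hospital 1 intercept entries (27.85 and -1.15 differ by 29), but it is an assumption about the paper's definition. The function name and data are hypothetical.

```python
def mean_and_adjusted_ranks(block):
    """block maps a treatment label to the observed values of one variable
    within one hospital. Returns {treatment: (mean rank, adjusted mean rank)}."""
    pooled = [v for vals in block.values() for v in vals]
    n = len(pooled)
    center = (n + 1) / 2.0

    def midrank(v):
        # Number of smaller values plus the average position among ties.
        smaller = sum(u < v for u in pooled)
        equal = sum(u == v for u in pooled)
        return smaller + (equal + 1) / 2.0

    result = {}
    for treatment, vals in block.items():
        mean_rank = sum(midrank(v) for v in vals) / len(vals)
        result[treatment] = (mean_rank, mean_rank - center)
    return result

# Hypothetical intercepts for two treatments in one hospital.
summary = mean_and_adjusted_ranks({1: [1.0, 3.0], 2: [2.0, 4.0, 5.0]})
print(summary)
```

With this centering, the adjusted mean ranks weighted by the treatment sample sizes sum to zero within each hospital.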



find the median of the differences for each variable. This yields a 1 × 2 vector since $c = 2$, $p = 2$. The matrix $D_i$ defined in (3.38) becomes the scalar $(1/n_{i1}) + (1/n_{i2})$, and

$\hat{\Delta}_0 = D_0 \left( \sum_{i=1}^{5} D_i^{-1} \hat{\Delta}_i \right), \quad D_0^{-1} = \sum_{i=1}^{5} D_i^{-1}.$

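Now, $\hat{\Delta}_0^{(s)}$ is subtracted from each $\hat{\alpha}_{i2k}^{(s)}$ ($k = 1, 2, \ldots, n_{i2}$), $s = 1, 2$, to adjust for the effects of blocks and treatments, leaving interaction effects only. The resulting observations are denoted by $\hat{\alpha}_{ijk}^{*(s)}$ ($k = 1, \ldots, n_{ij}$; $j = 1, 2$; $i = 1, \ldots, 5$; $s = 1, 2$). Note that the corresponding shift for the first treatment is 0, $s = 1, 2$. Hence $\hat{\alpha}_{i1k}^{*(s)} = \hat{\alpha}_{i1k}^{(s)}$, $k = 1, \ldots, n_{i1}$; $s = 1, 2$; $i = 1, \ldots, 5$.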

Once the adjusted observations $\hat{\alpha}_{ijk}^{*(s)}$ are calculated, the remainder of the test procedure is the same as before, except that the degrees of freedom are altered to $p(c - 1)(h - 1)$ and the rank correlation matrix is pooled across all blocks. The resulting test statistic is $\chi^2 = 0.850$, which is not significant at the five percent level of significance, since the upper five percent point of a $\chi^2$ statistic with eight degrees of freedom is 15.507.
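The alignment steps described above (all pairwise treatment differences within a block, their median, the weights $D_i = 1/n_{i1} + 1/n_{i2}$, and the pooled shift) can be sketched for a single variable as follows. This is an illustration under assumptions, not the authors' program: the function names and data are hypothetical, and differences are taken as treatment 2 minus treatment 1.

```python
from statistics import median

def block_shift(t1, t2):
    """Median of all n1 * n2 differences (treatment 2 minus treatment 1)
    within one block, together with the scalar D_i = 1/n1 + 1/n2."""
    diffs = [b - a for a in t1 for b in t2]
    return median(diffs), 1.0 / len(t1) + 1.0 / len(t2)

def pooled_shift(blocks):
    """Weighted average of the block medians with weights 1 / D_i."""
    num = den = 0.0
    for t1, t2 in blocks:
        delta_i, d_i = block_shift(t1, t2)
        num += delta_i / d_i
        den += 1.0 / d_i
    return num / den

def align(blocks):
    """Subtract the pooled shift from the treatment 2 observations in every
    block; the treatment 1 observations are left unchanged."""
    delta0 = pooled_shift(blocks)
    return [(list(t1), [x - delta0 for x in t2]) for t1, t2 in blocks]

# Hypothetical data: two blocks, each a pair (treatment 1, treatment 2).
blocks = [([1.0, 2.0, 3.0], [3.0, 4.0, 5.0]), ([0.0, 0.0], [4.0, 4.0])]
delta0 = pooled_shift(blocks)
aligned = align(blocks)
print(delta0, aligned)
```

For a vector of p variables the same steps would be applied to each variable separately before ranking the aligned observations.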

Since there is no evidence of interaction, the weighted regression principle can be used. This leads to a $\chi^2$ statistic with two degrees of freedom. The observed value of the statistic is 16.523. Although the observed value of $\chi^2$ is smaller than the one obtained by the unweighted least squares principle, the degrees of freedom in this case are two as compared to ten in the preceding case, so that the value of the chi-square statistic per degree of freedom is much higher.

The test statistic used for testing the presence of interaction has a $\chi^2$ distribution with three degrees of freedom, and yields an observed value of 3.13. There is thus no evidence of interaction, and the weighted least squares principle can be used to obtain a more powerful test. The calculations result in $\chi^2 = 6.62$ with three degrees of freedom, which is significant at slightly less than the .10 level.
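The two significance statements for three degrees of freedom can be checked with the closed-form upper tail for that case, $P(X > x) = \operatorname{erfc}(\sqrt{x/2}) + \sqrt{2x/\pi}\, e^{-x/2}$. The following is a minimal sketch with an illustrative function name; it is a numerical check, not part of the original analysis.

```python
import math

def chi2_sf_3df(x):
    """Upper-tail probability of a central chi-square with three degrees of freedom."""
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)

p_interaction = chi2_sf_3df(3.13)   # test for interaction
p_treatment = chi2_sf_3df(6.62)     # weighted least squares test
print(p_interaction > 0.10, 0.05 < p_treatment < 0.10)
```

Both comparisons agree with the text: the interaction statistic is far from significance, and the treatment statistic falls between the .05 and .10 levels.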

Although the evidence is not as strong as one would like, it is in favor of Treatment 2. In this case the intercept is simply the mean over the entire period of treatment. Notice that the adjusted rank score of the intercept of Treatment 2 is larger than for Treatment 1 in both hospitals. Also, in both hospitals, the linear regression coefficients were smaller (larger negative) for Treatment 1, indicating that there was a faster relapse on this treatment. However, keep in mind that the overall significance level was only .10.

Obviously the analysis could present more details of the findings for both Examples 1 and 2. We have pointed out a few facets of the data to emphasize that the physical interpretation of the analysis based on multivariate nonparametric methods is not particularly difficult.

[Received April 1970. Revised September 1972.]

REFERENCES

[1] Anderson, T.W., An Introduction to Multivariate Statistical Analysis, New York: John Wiley and Sons, Inc., 1958.

[2] Chatterjee, S.K. and Sen, P.K., "Nonparametric Tests for the Bivariate Two-Sample Location Problem," Calcutta Statistical Association Bulletin, 13 (April 1964), 18-58.

[3] --- and Sen, P.K., "Nonparametric Tests for the Multisample Multivariate Location Problem," in R.C. Bose, et al., eds., Roy Memorial Volume, University of North Carolina, 1966, 198-228.

[4] Grizzle, J.E. and Allen, David M., "Analysis of Growth and Dose Response Curves," Biometrics, 25 (June 1969), 307-18.

[5] Jurečková, Jana, "Asymptotic Linearity of a Rank Statistic in Regression Parameter," Annals of Mathematical Statistics, 40 (December 1969), 1889-1900.

[6] Koul, H.L., "Asymptotic Behavior of Wilcoxon Type Confidence Regions in Multiple Linear Regression," Annals of Mathematical Statistics, 40 (December 1969), 1950-79.

[7] Kruskal, W.H. and Wallis, W.A., "Use of Ranks in One-Criterion Variance Analysis," Journal of the American Statistical Association, 47 (June 1952), 583-621.

[8] Puri, M.L. and Sen, P.K., "On a Class of Multivariate Multisample Rank Order Tests," Sankhya, A, 28 (December 1966), 353-76.

[9] --- and Sen, P.K., "On a Class of Rank Order Estimators of Contrasts in MANOVA," Sankhya, A, 30 (March 1968), 31-6.

[10] --- and Sen, P.K., Nonparametric Methods in Multivariate Analysis, New York: John Wiley and Sons, Inc., 1971.

[11] Rao, C.R., Linear Statistical Inference and Its Applications, New York: John Wiley and Sons, Inc., 1965.

[12] ---, "Least Squares Theory Using an Estimated Dispersion Matrix and Its Application to Measurement of Signals," Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, 1, 1967, 355-72.

[13] Sen, P.K., "Rank Methods for Combination of Independent Experiments in Multivariate Analysis of Variance, Part I: Two Treatment Multiresponse Case," in R.C. Bose, et al., eds., Roy Memorial Volume, University of North Carolina Press, 1966, 631-54.

[14] ---, "Estimates of the Regression Coefficient Based on Kendall's Tau," Journal of the American Statistical Association, 63 (December 1968), 1379-89.

[15] ---, "On a Class of Rank Order Tests for the Parallelism of Several Regression Lines," Annals of Mathematical Statistics, 40 (October 1969), 1668-83.

[16] --- and Puri, M.L., "On Robust Nonparametric Estimation in Some Multivariate Linear Models," in P.R. Krishnaiah, ed., Proceedings of the Second International Symposium on Multivariate Analysis, New York: Academic Press, Inc., 1969, 33-52.
