
BIOMETRICS 56, 699-705 September 2000

Modeling Nonstationary Longitudinal Data

Vicente Núñez-Antón

Departamento de Econometría y Estadística, Universidad del País Vasco, Avenida del Lehendakari Aguirre 83, 48015 Bilbao-Vizcaya, Spain

and

Dale L. Zimmerman

Department of Statistics and Actuarial Science, The University of Iowa, Iowa City, Iowa 52242, U.S.A.

email: [email protected]

SUMMARY. An important theme of longitudinal data analysis in the past two decades has been the development and use of explicit parametric models for the data's variance-covariance structure. A variety of these models have been proposed, of which most are second-order stationary. A few are flexible enough to accommodate nonstationarity, i.e., nonconstant variances and/or correlations that are not a function solely of elapsed time between measurements. We review five nonstationary models that we regard as most useful: (1) the unstructured covariance model, (2) unstructured antedependence models, (3) structured antedependence models, (4) autoregressive integrated moving average and similar models, and (5) random coefficients models. We evaluate the relative strengths and limitations of each model, emphasizing when it is inappropriate or unlikely to be useful. We present three examples to illustrate the fitting and comparison of the models and to demonstrate that nonstationary longitudinal data can be modeled effectively and, in some cases, quite parsimoniously. In these examples, the antedependence models generally prove to be superior and the random coefficients models prove to be inferior. We conclude that antedependence models should be given much greater consideration than they have historically received.

KEY WORDS: Antedependence; ARIMA models; Covariance structure; Heteroscedasticity; Mixed models; Nonstationary correlations; Random coefficients; Unstructured covariance.

1. Introduction

An important theme of longitudinal data analysis in the past two decades has been the development of explicit parametric models for the data's variance-covariance structure (cf., Crowder and Hand, 1990, Chapters 5 and 6; Jones, 1993, Chapters 2 and 3; Lindsey, 1993, Chapters 3 and 4; Diggle, Liang, and Zeger, 1994, Chapters 4 and 5; Wolfinger, 1996). Compared with various analysis-of-variance methods, which ignore the covariance structure, and with the classical multivariate approach, which estimates the covariance matrix but imposes no structure on it, parametric covariance modeling has several advantages. First, it generally results in more efficient estimation of the mean structure and more appropriate standard error estimates. Second, it can deal more effectively with missing data and with data for which the measurement times are not common across subjects. Finally, it can be employed even when the number of measurement times is large relative to the number of subjects. The disadvantage of parametric covariance modeling has, until recently, been the lack of standard, widely available computer software for fitting the models. With the advent of the SAS MIXED Procedure (SAS Institute, 1996), however, the situation has changed considerably.

Perhaps the most prevalent kind of covariance structure exhibited by longitudinal data is serial correlation, i.e., within-subject correlations that decrease as the elapsed time between measurements increases. The most popular parametric models for serial correlation are stationary autoregressive (AR) models and other parsimonious second-order stationary models (see Jennrich and Schluchter, 1986; Jones and Boadi-Boateng, 1991; Muñoz et al., 1992). In these models, variances are constant over time and correlations between measurements equidistant in time are equal. When sample variances and correlations do not comport with these assumptions, models flexible enough to accommodate nonstationarity should be considered.

If nonstationarity is manifested by nonconstant variances only, options for analysis include transforming the data to stabilize the variance or generalizing stationary models to allow for heterogeneous variances. Heterogeneous extensions of several stationary models are described by Wolfinger (1996). These options may not be sufficient, however, when nonstationarity is also manifested by the correlations. Several alternative models are applicable to longitudinal data that exhibit nonstationarity in their correlations and variances. In this article, we examine and compare five of these, considering their relative strengths and weaknesses and emphasizing when each is inappropriate or problematic. We present three examples that demonstrate that nonstationary longitudinal data can be modeled effectively and, in some cases, quite parsimoniously with appropriate parametric models.

2. Data Sets

Data from three longitudinal studies serve to motivate the consideration of nonstationary models and later will be used to illustrate the fitting and comparison of models.

The cattle data come from an experiment reported by Kenward (1987). Cattle receiving one of two intestinal parasite treatments, say A and B, were weighed 11 times over a 133-day period. Thirty animals received treatment A and 30 received treatment B; we consider only the treatment A data. The first 10 measurements on each animal were made at 2-week intervals and the final measurement was made 1 week after the 10th measurement.

The speech recognition data are taken from the Iowa Cochlear Implant Project (Gantz et al., 1988). The data consist of scores on a sentence test administered under audition-only conditions to groups of human subjects wearing one of two types of cochlear implants, referred to here as A and B. Twenty subjects received implant A and 21 received implant B. Measurements were scheduled at 1, 9, 18, and 30 months after connection. However, some subjects did not show up for one or more scheduled follow-up times, so a substantial proportion of the data is missing.

The Jones data come from three related experiments described by Jones (1990). Measurement times are common across subjects within experiments, except for a very small number of missing observations, but they differ substantially across experiments. Each experiment considers the same two treatments. Experiments 1, 2, and 3 have 17, 33, and 40 subjects, respectively. Jones provides no further details about the nature of the response variate, treatments, or subjects.

Table 1 displays matrices of sample variances and correlations corresponding to the data sets. In the case of the speech recognition data, the EM algorithm was used to compute these quantities; for the remaining data sets, calculations were based on all available observations. Some pooling over subgroups was performed. Homogeneity tests indicated that pooling covariance matrices over treatments is reasonable for the speech recognition data and Jones data. Pooling the Jones data over experiments is not an option because the measurement times are not common across experiments.

These matrices exhibit several interesting features common to many longitudinal data sets. First, the variances are not homogeneous but instead tend to increase over time. Second, the correlations are mostly positive.

Table 1
Sample variances, along the main diagonal, and correlations, off the main diagonal, of data from three longitudinal studies described in Section 2

(a) Cattle data

106
0.82  155
0.76  0.91  165
0.66  0.84  0.93  185
0.64  0.80  0.88  0.94  243
0.59  0.74  0.85  0.91  0.94  284
0.52  0.63  0.75  0.83  0.87  0.93  306
0.53  0.67  0.77  0.84  0.89  0.94  0.93  341
0.52  0.60  0.71  0.77  0.84  0.90  0.93  0.97  389
0.48  0.58  0.70  0.73  0.80  0.87  0.88  0.94  0.96  470
0.48  0.55  0.68  0.71  0.77  0.83  0.86  0.92  0.96  0.98  445

(b) Speech recognition data

395
0.85  600
0.70  0.90  577
0.64  0.87  0.95  606

(c) Jones data

Experiment 1
 0.182
-0.20  0.067
 0.01  0.89  0.417
 0.08  0.84  0.99   0.925

Experiment 2
0.022
0.02  0.182
0.09  0.51  1.35
0.07  0.74  0.72  3.53

Experiment 3
0.083
0.60  0.141
0.52  0.89  0.281
0.58  0.85  0.84  2.03
0.02  0.91  0.97  0.95  0.666


Third, serial correlation appears to be present, as correlations within any given column tend to decrease toward zero (unless they are close to zero initially). Finally, correlations lagged the same number of observations apart are not constant but tend to increase early in the study before leveling off.

A battery of power transformations was attempted for each data set with the aim of variance stabilization. These efforts met with only limited success, which is not surprising given that, in several cases, the variances do not appear to be smooth functions of the mean. Even in those cases where a transformation successfully stabilized the variance, the nonstationary behavior of the correlations persisted after transformation.

3. The General Parametric Modeling Approach

Suppose, as in the previous three examples, that repeated measurements of a continuous response variable are observed over time on each of m subjects. Let yi = (yi1, ..., yini)' be the vector of ni measurements on the ith subject and let ti = (ti1, ..., tini)' be the corresponding vector of measurement times. Suppose also that we observe a p-vector of covariates, xij, associated with yij. Put y = (y1', ..., ym')', t = (t1', ..., tm')', Xi = (xi1, ..., xini)', X = (X1', ..., Xm')', and N = Σi ni.

We refer to the set of measurement times in the study as the measurement schedule. Measurement times may be unequally spaced within a subject (as is the case for all three examples) and may differ across subjects. If measurement times are common across subjects, we call the measurement schedule rectangular. Thus, the measurement schedule of the cattle data is rectangular, that of the speech recognition data was intended to be rectangular but is not so because of missing observations, and that of the Jones data is nonrectangular by design. The extent of the measurement schedule's departure from rectangularity can have important implications for modeling the covariance structure.

Several general assumptions now provide a framework for parametric modeling of the covariance structure (see Diggle et al., 1994). These include independence across subjects, linear mean structure, homogeneity of covariance matrices across subjects, normality, and ignorable missing data. These assumptions, together with an assumed parametric structure for the covariance matrix, yield the model

y ~ MVN(Xβ, Σ(t, θ)),   (1)

where β is a p-vector of fixed, unknown, and typically unrestricted parameters; Σ(t, θ) is an N × N block-diagonal covariance matrix, with nonzero blocks Vi(t, θ) = var(yi); and θ is a q-vector of unknown parameters, restricted to a parameter space Θ, which is either the set of all θ-vectors for which Σ is positive definite or some subset of that set. Note that Σ is positive definite if and only if Vi is positive definite for every i and that the Vi are all equal if the measurement schedule is rectangular. Because the parametric models we consider are models for within-subject covariance structure only, when examining this structure, we suppress the subscript i (denoting subject) on yij, ni, xij, and Vi.

4. Nonstationary Models

4.1 Unstructured Covariance Model

The unstructured (UN) covariance model can be regarded as an extreme case of a parametric covariance structure in which θ consists of the n(n + 1)/2 variances and covariances. The model is applicable regardless of spacing between measurements. Rectangularity is not essential, but near rectangularity is a practical necessity. If the measurement schedule is rectangular and m is sufficiently large (greater than or equal to n + p), then the residual maximum likelihood (REML) estimator is merely the sample covariance matrix, S, of the residuals from the regression of y on X, and the maximum likelihood (ML) estimator is [(m - p)/m]S. If, however, the measurement schedule is not rectangular, then explicit expressions for the REML and ML estimators do not exist. In this case, depending on the extent of the departure from rectangularity, it can be difficult or impossible to maximize the likelihood function. In particular, the likelihood function may be very flat (a consequence of the large number of parameters), which may cause convergence problems. Moreover, the parameter space Θ = {θ: V is positive definite} cannot be expressed as linear inequality constraints on individual parameters, which makes these constraints difficult to enforce.
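To make the rectangular-schedule case concrete, the following is a minimal sketch (our own illustration, not code from the paper or from PROC MIXED) of the REML and ML estimates under a saturated group-mean structure; the function name and the two-group setup are assumptions made only for this example.

```python
import numpy as np

def un_estimates(Y, groups):
    """Sketch: REML and ML estimates of an unstructured covariance matrix
    for a rectangular schedule with a saturated (group-mean) mean model.
    Y: (m, n) array, one row of n measurements per subject.
    groups: length-m array of treatment labels.
    Here p is taken to be the number of groups, i.e., the number of mean
    parameters per measurement occasion."""
    Y = np.asarray(Y, dtype=float)
    groups = np.asarray(groups)
    m, n = Y.shape
    labels = np.unique(groups)
    p = len(labels)
    resid = np.empty_like(Y)
    for g in labels:
        idx = groups == g
        resid[idx] = Y[idx] - Y[idx].mean(axis=0)   # residuals about occasion-wise group means
    S = resid.T @ resid / (m - p)                   # REML estimate: sample covariance of residuals
    return S, (m - p) / m * S                       # ML estimate rescales S by (m - p)/m
```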

4.2 Unstructured Antedependence Models

Observations y1, ..., yn whose joint distribution is multivariate normal are sth-order antedependent if yj and yj+k+1 are independent given the intervening observations yj+1, ..., yj+k, this having to hold for all j = 1, ..., n - k - 1 and all k ≥ s. This original definition of antedependence, due to Gabriel (1962), makes no explicit reference to a parametric structure. An equivalent but parametrically specified definition is as follows:

y1 = x1'β + ε1,

yj = xj'β + Σ_{k=1}^{s*} φjk(yj-k - xj-k'β) + εj   (j = 2, ..., n),   (2)

where s* = min(s, j - 1), the εj's are independent normal random variables with zero means and possibly time-dependent variances σj² > 0, and the φjk's are unrestricted parameters. Like AR models, this model allows for serial correlation within subjects, but unlike AR models it does not stipulate that the variances are constant nor that correlations between measurements equidistant in time are equal. Henceforth, we refer to model (2) as the unstructured antedependence (UAD) model of order s [UAD(s)], where by "unstructured" we mean that the (s + 1)(2n - s)/2 parameters {φjk} and {σj²} cannot be expressed as functions of a smaller set of parameters.
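A small numerical sketch (our own, with hypothetical parameter values) may help show how model (2) determines a covariance matrix: writing the centered model as A(y - Xβ) = ε with A unit lower triangular, the implied within-subject covariance is V = A^{-1} D A^{-T}, where D holds the innovation variances.

```python
import numpy as np

def uad_covariance(phi, sigma2):
    """Covariance matrix implied by antedependence model (2).
    phi[j-2] lists [phi_{j,1}, ..., phi_{j,s*}] for j = 2, ..., n;
    sigma2 holds the innovation variances sigma_1^2, ..., sigma_n^2.
    With A unit lower triangular and A[j, j-k] = -phi_{j,k}, the centered
    model is A(y - X beta) = eps, so V = A^{-1} diag(sigma2) A^{-T}."""
    n = len(sigma2)
    A = np.eye(n)
    for j in range(2, n + 1):
        for k, coef in enumerate(phi[j - 2], start=1):
            A[j - 1, j - 1 - k] = -coef
    Ainv = np.linalg.inv(A)
    return Ainv @ np.diag(sigma2) @ Ainv.T

# A UAD(1) example with increasing innovation variances (values are illustrative only)
V = uad_covariance(phi=[[0.9], [0.95], [1.0]], sigma2=[1.0, 1.5, 2.0, 2.5])
```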

The UAD(s) model is applicable regardless of spacing between measurements. Because the UAD(s) model has fewer parameters than the UN model, a rectangular or nearly rectangular measurement schedule is not as imperative as it is for the UN model. If the schedule is rectangular, however, and m is sufficiently large, then simple, closed-form expressions for the REML and ML estimators of V are possible in the first-order case (see Byrne and Arnold, 1983). In rectangular higher order cases, a recursive procedure requiring no numerical optimization can be given for computing the REML and ML estimators from the elements of S. In nonrectangular situations, one must resort to numerical optimization.


4.3 Structured Antedependence Models

Although the UAD(s) model is more parsimonious than the UN model, it may still have impracticably many parameters. This led Zimmerman and Núñez-Antón (1997) to propose more parsimonious versions called structured antedependence (SAD) models. In one useful class of these models, the autoregressive coefficients follow a Box-Cox power law and the innovation variances are parsimonious functions of time, i.e.,

φjk = ρk^{f(tj; λk) - f(tj-k; λk)}   (j = s + 1, ..., n; k = 1, ..., s),

σj² = g(tj; ψ)   (j = s + 1, ..., n),   (3)

where f(t; λ) equals (t^λ - 1)/λ if λ ≠ 0 and equals log t if λ = 0, g is a function of relatively few parameters (e.g., a low-order polynomial), and {φjk: j = 2, ..., s, k = 1, ..., j - 1} and σ1², ..., σs² are unstructured. Note that f prescribes that the kth-order autoregressive coefficients are monotone increasing if λk < 1, monotone decreasing if λk > 1, or constant if λk = 1 (k = 1, ..., s). Another useful class of SAD models takes same-lag correlations to be monotone functions of time and observation (rather than innovation) variances to be parsimonious functions of time (see Zimmerman and Núñez-Antón, 1997).
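As an illustration of (3) in the first-order case (again our own sketch with made-up values; for simplicity the variance function g is applied to every occasion, although (3) leaves σ1² unstructured), the implied covariance matrix can be built recursively from the lag-1 coefficients:

```python
import numpy as np

def boxcox(t, lam):
    """Box-Cox transform f(t; lambda) used in (3)."""
    t = np.asarray(t, dtype=float)
    return np.log(t) if lam == 0 else (t**lam - 1.0) / lam

def sad1_covariance(times, rho, lam, g):
    """Covariance implied by a SAD(1) model under (3):
    phi_j = rho**(f(t_j; lam) - f(t_{j-1}; lam)) and sigma_j^2 = g(t_j)."""
    f = boxcox(times, lam)
    n = len(times)
    sigma2 = np.array([g(t) for t in times], dtype=float)
    V = np.zeros((n, n))
    V[0, 0] = sigma2[0]
    for j in range(1, n):
        phi_j = rho ** (f[j] - f[j - 1])
        V[j, :j] = phi_j * V[j - 1, :j]              # cov(y_j, y_k) = phi_j * cov(y_{j-1}, y_k), k < j
        V[:j, j] = V[j, :j]
        V[j, j] = phi_j**2 * V[j - 1, j - 1] + sigma2[j]
    return V

# lambda < 1: with equally spaced times the lag-1 coefficients increase over time
V = sad1_covariance(times=[2, 4, 6, 8, 10], rho=0.5, lam=0.3, g=lambda t: 1.0 + 0.1 * t)
```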

A SAD(s) model has considerably fewer parameters than a UAD(s) model and is applicable regardless of the measurement schedule. ML and REML estimation of model parameters require numerical optimization over a constrained parameter space. The constraints associated with (3), e.g., are ρk > 0, σj² > 0, and {ψ: g(t; ψ) > 0}.

4.4 ARIMA Models

An ARIMA(s, d, q) model generalizes a stationary autoregressive moving average (ARMA) model by postulating that the dth-order differences among adjacent measurements, rather than the measurements themselves, follow a stationary ARMA(s, q) model. A highly parsimonious special case is the ARIMA(0, 1, 0) or random walk model,

yj - xj'β = a1 + a2 + ··· + aj   (j = 1, ..., n),   (4)

where a1, ..., an are independent N(0, σa²) random variables. For this process, var(yj) = jσa², cov(yj, yu) = jσa² for 1 ≤ j ≤ u ≤ n, and corr(yj, yu) = (j/u)^{1/2} for 1 ≤ j < u ≤ n. Thus, the variances increase (linearly) over time and the correlations between equidistant measurements also increase (nonlinearly) over time. This behavior is typical of ARIMA models in general (see Cryer, 1986, Chapter 5).

In order for ARIMA models to be applicable to longitudinal data, the measurement schedule must be equally spaced and rectangular. However, continuous-time analogues exist that permit these restrictions to be relaxed. Here we consider only the Wiener (WI) process, which is a continuous-time analogue of the random walk model. The covariance function of a Wiener process is cov(yj, yk) = σ² min(tj, tk), which coincides with the covariance function of (4) for equally spaced data. Simple expressions exist for the ML and REML estimators of σ².
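A brief sketch (ours, not the authors' code) of the Wiener covariance function; for equally spaced times tj = j it reproduces the variances and correlations of the random walk (4).

```python
import numpy as np

def wiener_covariance(times, sigma2=1.0):
    """Wiener-process covariance: cov(y_j, y_k) = sigma^2 * min(t_j, t_k)."""
    t = np.asarray(times, dtype=float)
    return sigma2 * np.minimum.outer(t, t)

# With t_j = j this matches (4): var(y_j) = j*sigma^2 and corr(y_j, y_u) = sqrt(j/u) for j < u
V = wiener_covariance(times=range(1, 6), sigma2=2.0)
d = np.sqrt(np.diag(V))
R = V / np.outer(d, d)   # correlations between equidistant measurements increase over time
```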

4.5 Random Coefficients Models

A rather general random coefficients (RC) model is

yi = Xiβ + Ziui + ei   (i = 1, ..., m),

where the Zi are specified matrices, the ui are vectors of random coefficients distributed independently as MVN(0, Gi), the Gi are positive definite but otherwise unstructured matrices, and the ei are distributed independently (of the ui and of each other) as MVN(0, σ²Ini). Typically, the Gi are assumed to be equal; hence, Vi = ZiGZi' + σ²Ini. Special cases include the linear random coefficients (RCL) and quadratic random coefficients (RCQ) models, for which Zi = [1ni, ti] and Zi = [1ni, ti, (ti1², ..., tini²)'], respectively.

RC models have often been considered as distinct from parametric covariance models, probably because they typically are motivated by a consideration of regressions that vary across subjects rather than a consideration of within-subject similarity. Nevertheless, they yield parsimonious parametric covariance structures that, in general, have nonconstant variances and nonstationary correlations, a fact that does not appear to be widely appreciated. Several kinds of variance and correlational behavior are permitted, including increasing variances, decreasing variances, and correlations of which some are negative while others are positive; however, the model does not accommodate variances that are a concave-down function of time nor allow variances to be constant if the same-lag correlations are not.

An arbitrary measurement schedule is permissible for RC models. Likelihood-based estimation generally requires numerical optimization, with parameters constrained to be such that σ² > 0 and G is positive definite.
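To illustrate the covariance structure an RC model induces (our own sketch with hypothetical values of G and σ²), the RCL case makes the concave-up behavior of the variances explicit:

```python
import numpy as np

def rcl_covariance(times, G, sigma2):
    """Covariance implied by the linear random coefficients (RCL) model,
    V = Z G Z' + sigma^2 I with Z = [1, t].  Because G is positive
    (semi)definite, var(y_j) = G[0,0] + 2*G[0,1]*t_j + G[1,1]*t_j**2 + sigma^2
    is a convex quadratic in t_j, so a concave-down variance function
    cannot be accommodated."""
    t = np.asarray(times, dtype=float)
    Z = np.column_stack([np.ones_like(t), t])
    return Z @ np.asarray(G, dtype=float) @ Z.T + sigma2 * np.eye(len(t))

# Random intercepts and slopes; the parameter values are illustrative only
V = rcl_covariance(times=[0, 2, 4, 6, 8], G=[[4.0, 0.5], [0.5, 0.25]], sigma2=1.0)
```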

5. Examples

5.1 General Issues

Since our main focus here is on modeling covariance structure, we take the mean structure to be as saturated as possible. For the speech recognition data and cattle data, two treatments are involved, so we assume that E(yij) = μAj if subject i receives treatment A and E(yij) = μBj if subject i receives treatment B. For the Jones data, the measurement times differ across experiments; thus, when we analyze each experiment separately, we use this mean structure also, but when we do a combined analysis of data from all three experiments, we follow Jones (1990) by using a mean model with fixed effects for experiments and treatments and a cubic function of time.

Nonstationary models fit to each data set include UN, WI, RCL, and RCQ, one or more UAD models, and one or more SAD models. PROC MIXED is used to fit UN, UAD(1), WI, RCL, and RCQ. SAD and higher order UAD models are fit using FORTRAN programs written by the authors (and available from them by request). For comparison purposes, we also fit two stationary models, compound symmetry (CS) and first-order autoregressive (AR(1)), and heterogeneous extensions of them (CSH and ARH(1)). The CS and CSH models are fit using PROC MIXED; AR(1) and ARH(1), which are special cases of SAD models, are fit by specializing our FORTRAN programs appropriately.

Fits of covariance models are compared using two widely used information criteria in larger-is-better form, i.e., AIC = lR(θ̂) - q and BIC = lR(θ̂) - (q/2) log(N - p). Here lR is the residual log likelihood, and θ̂ and q are the REML estimate and dimension, respectively, of θ. Another criterion used to compare fits of covariance models is a residual likelihood ratio test (RLRT). An RLRT is conducted by subtracting the minimized values of -2lR for two nested models and comparing this value to the χ² distribution with degrees of freedom equal to the difference in the number of covariance parameters.
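For concreteness, here is a small helper (our own sketch, not the authors' software) that evaluates these criteria and the RLRT; the example values are taken from the SAD(1) versus AR(1) comparison reported in Table 2.

```python
import math
from scipy.stats import chi2

def information_criteria(loglik_reml, q, N, p):
    """Larger-is-better criteria as defined above:
    AIC = l_R - q,  BIC = l_R - (q/2) * log(N - p)."""
    return loglik_reml - q, loglik_reml - 0.5 * q * math.log(N - p)

def rlrt(minus2lr_reduced, minus2lr_full, q_reduced, q_full):
    """Residual likelihood ratio test for nested covariance models:
    the drop in -2*l_R is referred to a chi-square distribution with
    degrees of freedom equal to the difference in covariance parameters."""
    stat = minus2lr_reduced - minus2lr_full
    df = q_full - q_reduced
    return stat, df, chi2.sf(stat, df)

# Example (cattle data, Table 2): AR(1) nested within SAD(1)
stat, df, p = rlrt(minus2lr_reduced=2101.8, minus2lr_full=2089.7, q_reduced=2, q_full=4)
```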

5.2 Cattle Data

Zimmerman and Núñez-Antón (1997) previously fitted UAD and SAD models to these data, and we expand upon that analysis here. Gabriel's (1962) test for the order of antedependence indicated that it was necessary to fit UAD models up to order 2 only.

Table 2 gives the number of covariance parameters, AIC, BIC, and -2lR for each fitted model. The SAD(1) model is superior on the basis of AIC and the AR(1) and SAD(2) models are the next most competitive. On the basis of BIC, the WI model fits best, followed closely by the SAD(1) and AR(1) models. The RLRTs also support the use of a SAD(1) model, though this model cannot be compared to a WI model using an RLRT.

5.3 Speech Recognition Data

Recall (from Table 1b) that the variances of these data are rather constant over at least the last three measurement times and perhaps over all four times and that correlations between consecutive measurements increase slightly over time. Consequently, we consider SAD(1) models with monotonic same-lag correlations and two functions for observation variances, (1) a step function having one value for measurements taken 1 month after connection and another value for measurements taken at the three remaining times (SADS) and (2) a constant function (SADC).

Table 3 gives, for each fitted model, the number of covariance parameters, AIC, BIC, and -2lR. The two SAD(1) models fit best.

Table 2
REML information criteria and likelihood ratio tests of covariance structures for the cattle data. Here and in subsequent tables, q is the number of covariance parameters, CM is the comparison model, ν is the degrees of freedom, χ² is the test statistic, and P is the p-value for the residual likelihood ratio test.

Structure    q      AIC       BIC      -2lR    CM        ν     χ²     P
CS           2   -1192.2   -1196.0   2380.4
AR(1)        2   -1052.9   -1056.7   2101.8
CSH         12   -1172.3   -1194.9   2320.6   CS       10    59.8   0.00
ARH(1)      12   -1057.0   -1079.6   2089.9   AR(1)    10    11.8   0.30
UN          66   -1075.7   -1200.0   2019.4   UAD(2)   36    31.7   0.67
UAD(1)      21   -1055.9   -1095.4   2069.8   ARH(1)    9    20.2   0.02
                                              SAD(1)   17    19.9   0.28
UAD(2)      30   -1055.6   -1112.1   2051.2   UAD(1)    9    18.6   0.03
                                              SAD(2)   22    35.2   0.04
SAD(1)       4   -1048.9   -1056.4   2089.7   AR(1)     2    12.0   0.00
SAD(2)       8   -1051.2   -1066.2   2086.4   SAD(1)    4     3.4   0.49
WI           1   -1053.7   -1055.6   2105.4
RCL          4   -1080.8   -1088.3   2153.6   CS        2   226.8   0.00
RCQ          7   -1051.2   -1064.4   2088.3   RCL       3    65.3   0.00

Table 3
REML information criteria and likelihood ratio tests of covariance structures for the speech recognition data (see Table 2 caption for explanation of column heads)

Structure    q     AIC      BIC     -2lR    CM         ν     χ²     P
CS           2   -534.6   -537.5   1065.2
AR(1)        2   -529.1   -531.2   1054.1
CSH          5   -536.1   -543.2   1062.2   CS         3    3.0   0.39
ARH(1)       5   -528.8   -535.9   1047.6   AR(1)      3    6.5   0.09
UN          10   -527.7   -542.0   1035.4   UAD(2)     1    1.2   0.27
UAD(1)       7   -526.0   -536.0   1038.1   ARH(1)     2    9.5   0.01
                                            SADS(1)    3    2.1   0.55
UAD(2)       9   -527.3   -540.2   1036.6   UAD(1)     2    1.5   0.47
SADS(1)      4   -524.1   -529.8   1040.2   SADC(1)    1    5.0   0.03
SADC(1)      3   -525.6   -529.9   1045.2   AR(1)      1    8.9   0.00
WI           1   -605.0   -606.4   1208.0
RCL          4   -531.1   -536.8   1054.2   CS         2   11.0   0.00
RCQ          7   -527.8   -537.8   1041.7   RCL        3   12.5   0.01


The next closest competitors, depending on which criterion is used, are a stationary AR(1) model and a UAD(1) model; note that the former is a special case of and the latter is a generalization of a SAD(1) model.

5.4 Jones Data

The sample variances and correlations of these data (Table 1c) are not as well behaved as those of the other data sets. In particular, the variances are much more heterogeneous and, in experiments 1 and 2, the same-diagonal correlations do not increase as smoothly over time. Consequently, we do not fit any SAD models to these data. Moreover, in order to compare our results with those of Jones (1990), all models were fit to the data combined over experiments. Difficulties in fitting the WI model led us to exclude it.

Table 4 gives, for each fitted model, the number of covariance parameters, AIC, BIC, and -2lR. Three models clearly stand out from the others, those being UAD(2), UN, and UAD(1). Based on AIC, these models, in this order, are best; based on BIC, the same models are best but their rank order is UAD(1) > UAD(2) > UN. A size-0.05 RLRT rejects UAD(1) in favor of UAD(2) but does not reject UAD(2) in favor of UN.

Jones (1990) fitted AR(1) and RCQ models only to these data, finding RCQ to be superior. On this basis, he asserted that a quadratic random coefficients model adequately explains the variance heterogeneity and serial correlation extant in these data. Upon comparing our results for RCQ with those of UN, we would dispute this assertion. Instead, we would make such an assertion about the UAD(2) model.

6. Discussion

The five nonstationary models we have considered can be compared and contrasted in several ways. First is the tradeoff between model flexibility and parsimony. The UN model with its O(n²) parameters is, of course, the most flexible and least parsimonious. The UAD model, in which variances are unstructured and unrelated to the correlations but the correlations are structured to some extent, has O(n) parameters and is the next most flexible. The SAD, ARIMA, and RC models all are highly structured, with O(1) parameters, and thus are not as flexible as the others. A closely related issue is the sample size required for the existence of a positive definite maximum likelihood estimate of the covariance matrix; this is largest for UN, intermediate for UAD, and smaller for the remaining three models. Second, the parameter constraints required for positive definiteness range from being very simple to enforce or even nonexistent (UAD) to not quite as simple to enforce (SAD, ARIMA, and RC) to requiring considerable care to enforce (UN). Third, irregular spacing of measurements and nonrectangularity of the measurement schedule present no problems for SAD or RC models, but one or both of these may require special care for UN, UAD, or ARIMA models. A final comparison pertains to the existence of widely available software for fitting the models. The UN and RC models and certain low-order UAD and ARIMA models have the advantage here, for they can be fitted in PROC MIXED.

Of all the parametric covariance structures that have been proposed for longitudinal data, stationary autoregressive and random coefficient models seem to receive the most attention; antedependence models, in contrast, get very little press. In our three examples, however, stationary models (either autoregressive or compound symmetric) generally did not fit as well as an antedependence model of some kind, and random coefficients models were not competitive at all. Thus, in these examples at least, it appears that some kind of antedependence model strikes the right balance between model flexibility and parsimony. Consequently, we believe that antedependence models should play a much more prominent role in longitudinal data analysis in the future.

ACKNOWLEDGEMENTS

Núñez-Antón's work was partially supported by Dirección General de Enseñanza Superior del Ministerio Español de Educación y Cultura and Universidad del País Vasco (UPV/EHU) under research grants PB95-0346, PB98-0149, and UPV 038.321-HC236/97. Zimmerman's work was partially supported by NSF grant 9628612.


Table 4
REML information criteria and likelihood ratio tests of covariance structures for the Jones data (see Table 2 caption for explanation of column heads)

Structure    q     AIC      BIC     -2lR    CM        ν     χ²      P
CS           2   -579.6   -583.5   1155.2
AR(1)        2   -537.9   -541.8   1071.8
CSH         10   -426.2   -445.5    832.5   CS        8   322.7   0.00
ARH(1)      10   -448.8   -468.1    877.5   AR(1)     8   194.3   0.00
UN          29   -373.2   -429.2    688.5   UAD(2)    5     4.7   0.45
UAD(1)      18   -380.9   -415.6    725.8   ARH(1)    8   151.7   0.00
UAD(2)      24   -370.6   -416.9    693.2   UAD(1)    6    32.6   0.00
RCL          4   -442.2   -449.9    876.4   CS        2   278.8   0.00
RCQ          7   -432.6   -446.1    851.2   RCL       3    25.2   0.00



REFERENCES

Byrne, P. J. and Arnold, S. F. (1983). Inference about multivariate means for a nonstationary autoregressive model. Journal of the American Statistical Association 78, 850-855.

Crowder, M. J. and Hand, D. J. (1990). Analysis of Repeated Measures. London: Chapman & Hall.

Cryer, J. D. (1986). Time Series Analysis. Boston: PWS-Kent.

Diggle, P. J., Liang, K. Y., and Zeger, S. L. (1994). Analysis of Longitudinal Data. New York: Oxford University Press.

Gabriel, K. R. (1962). Ante-dependence analysis of an ordered set of variables. Annals of Mathematical Statistics 33, 201-212.

Gantz, B. J., Tyler, R. S., Knutson, J. F., Woodworth, G. G., Abbas, P., McCabe, B. F., Hinrichs, J., Tye-Murray, N., Lansing, C., Kuk, F., and Brown, C. (1988). Evaluation of five different cochlear implant designs: Audiologic assessment and predictors of performance. Laryngoscope 98, 1100-1106.

Jennrich, R. I. and Schluchter, M. D. (1986). Unbalanced repeated-measures models with structured covariance matrices. Biometrics 42, 805-820.

Jones, R. H. (1990). Serial correlation or random subject effects? Communications in Statistics, Simulation, and Computation 19, 1105-1123.

Jones, R. H. (1993). Longitudinal Data with Serial Correlation: A State-Space Approach. London: Chapman & Hall.

Jones, R. H. and Boadi-Boateng, F. (1991). Unequally spaced longitudinal data with AR(1) serial correlation. Biometrics 47, 161-175.

Kenward, M. G. (1987). A method for comparing profiles of repeated measurements. Applied Statistics 36, 296-308.

Lindsey, J. K. (1993). Models for Repeated Measurements. Oxford: Oxford University Press.

Muñoz, A., Carey, V., Schouten, J. P., Segal, M., and Rosner, B. (1992). A parametric family of correlation structures for the analysis of longitudinal data. Biometrics 48, 733-742.

SAS Institute. (1996). SAS/STAT Software: Changes and Enhancements through Release 6.12. Cary, North Carolina: SAS Institute.

Wolfinger, R. D. (1996). Heterogeneous variance-covariance structures for repeated measures. Journal of Agricultural, Biological, and Environmental Statistics 1, 205-230.

Zimmerman, D. L. and Núñez-Antón, V. (1997). Structured antedependence models for longitudinal data. In Modelling Longitudinal and Spatially Correlated Data: Methods, Applications, and Future Directions, T. G. Gregoire, D. R. Brillinger, P. J. Diggle, E. Russek-Cohen, W. G. Warren, and R. Wolfinger (eds), 63-76. New York: Springer-Verlag.

Received January 1999. Revised February 2000. Accepted February 2000.
