nonlinear models for repeated measurement data


1462 BOOK REVIEWS

3. NONLINEAR MODELS FOR REPEATED MEASUREMENT DATA. Marie Davidian and David M. Giltinan, Chapman and Hall, Great Britain, 1995. No. of pages: xv + 359. Price: £32. ISBN: 0-412-98431-9

The central statistical model of this book is the non-linear mixed model for continuous responses, that is, a non-linear random regression model. The data structure can be thought of as one or more groups of individuals who are followed over a period of time, during which some response is measured repeatedly.

The book consists of 12 chapters, which can be roughly grouped into three categories: prerequisites, model formulation and inference, and applications.

The prerequisite part comprises the first three chapters: the introduction, the fundamentals of ordinary non-linear regression, and the theory of hierarchical linear models (classical two-stage random regression models).

The introductory chapter serves as a very fine appetizer, offering a series of examples with very clearly stated purposes.

Chapter 2 gives an outline of the theory of ordinary non-linear regression (with only one 'individual'), with a focus on modelling inhomogeneity in the variance as a parametric function of the mean. Useful practical and computational guidance is given, although not in detail.

Chapter 3 gives an almost self-contained exposition of classical normal-theory inference in the hierarchical (two-stage) linear model (linear mixed effects model). The computation of ML and REML estimates and of best linear unbiased predictors (BLUPs) for the random effects is described, and the relation to Bayes estimation is discussed. Numerical algorithms (Newton-Raphson, EM) are discussed, together with implementations in the standard software packages SAS, BMDP and S+. The last page gives an extremely useful bibliography.

Chapters 2 and 3 together form the natural building blocks for the two-stage non-linear models, which are introduced in Chapter 4. The first stage of the model describes the mean value behaviour of a single individual as a non-linear function, possibly depending on covariates, with a specified variance and covariance structure which in theory can be quite arbitrary. In the second stage, the parameters of the various individuals are assumed to follow some specified distribution, typically a normal distribution with unknown parameters.

To be specific, the models considered allow for the following:

(i) A non-linear mean value structure with dependence upon (possibly time-varying) covariates.

(ii) A variance which is allowed to depend upon the mean via some link function and possible covariates.

(iii) Some sort of intra-individual covariance structure.

(iv) A classical mixed-effects model for the parameters of the mean value, with a systematic structure given by some covariates (typically groups) and a random part governed by other covariates. This specification may be non-linear, possibly involving time-dependent covariates as well.

(v) A specification of the distribution of the random effects (typically taken to be normal).
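As a compact illustration of items (i) to (v), the two-stage model can be sketched in the following notation. The symbols are assumptions introduced here for illustration and are not quoted from the book: y_{ij} is the j-th response of individual i, x_{ij} the within-individual covariates, a_i the individual-level covariates, f and d the (possibly non-linear) stage-one and stage-two functions, and \xi, D the variance parameters.

```latex
\begin{align*}
\text{Stage 1 (individual } i\text{):}\quad
  & y_{ij} = f(x_{ij}, \beta_i) + e_{ij}, \qquad j = 1, \dots, n_i, \\
  & \mathrm{E}(e_i \mid \beta_i) = 0, \qquad
    \mathrm{Cov}(e_i \mid \beta_i) = R_i(\beta_i, \xi), \\
\text{Stage 2 (population):}\quad
  & \beta_i = d(a_i, \beta, b_i), \qquad b_i \sim N(0, D).
\end{align*}
```

Here f carries item (i), the matrix R_i covers both the mean-dependent variance of item (ii) and the intra-individual covariance of item (iii), d is the mixed-effects specification of item (iv), and the normal law for b_i is the typical choice under item (v).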

A key issue in the use of such general models in applied work is the possibility of carrying out adequate diagnostics. At present, few appropriate methods exist, and this book does not fill this gap. There are, however, some suggestions, for instance in the advice on specifying the distribution of the random effects. Whereas the normal distribution is usually assumed without much specific reason, the authors advocate a more flexible class of distributions, allowing for bimodality, so that a possible inhomogeneity in the random effects can be detected and possibly included in the model as an explicit dependence on some covariate.

A model this flexible can easily cause identifiability problems as we approach the limit of the information available in the data. Often, a simplification (although incorrect) may be preferable in order to avoid a great increase in estimation uncertainty. A good discussion of the pros and cons of complicated model building can be found in Chapter 4.

Being a combined generalization of the models described in the two previous chapters, these non-linear hierarchical models are necessarily burdened by more than the combination of the problems connected with either one. The main problems can be said to be:

(i) instability of the model because of its great flexibility (a large number of parameters in the mean value as well as in the variance/covariance, which to some extent describe the same features);

(ii) the asymptotic nature of the inference procedures (often carried out on approximations to the model as well, because of the intractability of the non-linearity).


These problems can in broad terms be characterized as computational. A key problem is the inability to calculate the marginal distribution of the response, making it impossible to use ML in a straightforward manner.

Basically, there are three ways of proceeding with inference in these models: to base inference on parameters estimated from each individual separately (Chapter 5), to linearize the problem (Chapter 6), or to use a Bayes approach with a Gibbs sampler (Chapter 8).

In order to base inference on individual estimates, we must of course demand a considerable amount of information from each individual. In particular, this approach makes all parameters random, thus ruling out parameters that are fixed across the population (which the authors believe to be of minor importance, since in their opinion all parameters do vary across individuals). The computational aspects of this approach are fairly straightforward and can be handled with software for matrix manipulations (Gauss, SAS IML, S+).
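The idea of building inference on individual estimates can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the authors' procedure: the mono-exponential mean function, the sampling times, the simulated data and all function names are hypothetical, and the second stage is the naive version that simply averages the individual estimates without correcting for their estimation uncertainty.

```python
import math
import random

TIMES = [0.5, 1.0, 2.0, 4.0, 6.0, 8.0]  # hypothetical sampling times


def model(t, a, k):
    """Assumed mean function: mono-exponential decay a * exp(-k * t)."""
    return a * math.exp(-k * t)


def sse(times, ys, a, k):
    """Residual sum of squares for one individual."""
    return sum((y - model(t, a, k)) ** 2 for t, y in zip(times, ys))


def fit_individual(times, ys, a0, k0, n_iter=100):
    """Stage 1: Gauss-Newton least squares for one individual's (a, k)."""
    a, k = a0, k0
    for _ in range(n_iter):
        saa = sak = skk = ga = gk = 0.0
        for t, y in zip(times, ys):
            e = math.exp(-k * t)
            r = y - a * e            # residual
            ja, jk = e, -a * t * e   # derivatives of the mean w.r.t. a and k
            saa += ja * ja
            sak += ja * jk
            skk += jk * jk
            ga += ja * r
            gk += jk * r
        # Solve the 2x2 normal equations (J'J) delta = J'r.
        det = saa * skk - sak * sak
        da = (skk * ga - sak * gk) / det
        dk = (saa * gk - sak * ga) / det
        # Step halving: shrink the step while it increases the SSE.
        step, current = 1.0, sse(times, ys, a, k)
        while step > 1e-8 and sse(times, ys, a + step * da, k + step * dk) > current:
            step *= 0.5
        a, k = a + step * da, k + step * dk
        if step * (abs(da) + abs(dk)) < 1e-10:
            break
    return a, k


def simulate(n_individuals, seed=0):
    """Simulated data: log-normal individual parameters around a=10, k=0.5."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_individuals):
        a_i = 10.0 * math.exp(rng.gauss(0.0, 0.1))
        k_i = 0.5 * math.exp(rng.gauss(0.0, 0.1))
        data.append([model(t, a_i, k_i) + rng.gauss(0.0, 0.1) for t in TIMES])
    return data


def two_stage(data, a0=10.0, k0=0.5):
    """Stage 2 (naive): mean and variance of the individual estimates."""
    fits = [fit_individual(TIMES, ys, a0, k0) for ys in data]
    n = len(fits)
    mean_a = sum(a for a, _ in fits) / n
    mean_k = sum(k for _, k in fits) / n
    var_a = sum((a - mean_a) ** 2 for a, _ in fits) / (n - 1)
    var_k = sum((k - mean_k) ** 2 for _, k in fits) / (n - 1)
    return mean_a, mean_k, var_a, var_k


mean_a, mean_k, var_a, var_k = two_stage(simulate(30, seed=1))
print(f"population mean: a = {mean_a:.3f}, k = {mean_k:.3f}")
print(f"between-individual variance: a = {var_a:.4f}, k = {var_k:.5f}")
```

Every individual needs enough observations for its own fit to be stable, which is exactly the data requirement noted above; with only two or three observations per individual the first stage would fail or be very imprecise.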

In contrast to the method of individual parameter estimation, linearization in the random effects (the topic of Chapter 6) is a technique that can be used with a limited (or even sparse?) amount of data for each individual. In addition, the approximation by linear models makes it possible to use methods which closely resemble those of linear inference.
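One common form of such a linearization is a first-order Taylor expansion of the mean function about zero random effects. The sketch below uses notation assumed here for illustration (f_i for the stacked mean function of individual i, \beta for the fixed effects, b_i for the random effects with covariance D, R_i for the intra-individual covariance); the book may expand about other points or use other conventions.

```latex
\begin{align*}
y_i &= f_i(\beta, b_i) + e_i, \qquad b_i \sim N(0, D), \quad \mathrm{Cov}(e_i) = R_i, \\
y_i &\approx f_i(\beta, 0) + Z_i(\beta)\, b_i + e_i, \qquad
  Z_i(\beta) = \left. \frac{\partial f_i(\beta, b_i)}{\partial b_i^{\top}} \right|_{b_i = 0}, \\
\mathrm{E}(y_i) &\approx f_i(\beta, 0), \qquad
  \mathrm{Cov}(y_i) \approx Z_i(\beta)\, D\, Z_i(\beta)^{\top} + R_i .
\end{align*}
```

The approximate model is linear in b_i, so its marginal mean and covariance have the same form as in a linear mixed model, which is why estimation methods resembling those of linear inference become available.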

Chapter 7 deals with relaxing the distributional assumptions for the random vector of parameters. A totally unspecified distribution is discussed, but hardly wins many votes because of the inherent difficulties with the discrete nature of the estimated distribution and the lack of uncertainty estimates. The more reasonable suggestion is the smooth alternative, in the form of a series expansion of the distribution itself, allowing for all sorts of distributions, including those with several modes. The series expansion is governed by a tuning parameter, in the absence of which the distribution reduces to the normal, thus giving rise to a simple test for normality (which in essence is a test for the omission of important covariates in the specification of the random effects). Splendid instructions on its use are given here. The method is quite demanding in terms of the number of individuals and it is very computer intensive, but the idea seems highly relevant and ought to be explored further.

Chapter 8 deals with Bayesian methods and in particular exploits the possibility of determining the marginal posterior distribution through the use of a Gibbs sampler, given a full set of conditional distributions (known only up to a normalizing constant). Starting from an initial guess, values are simulated successively from the conditional distributions by rejection sampling, and the guess is updated at each stage. In the end, the result is a simulated marginal density, to be used for inference purposes. The method of Gibbs sampling has great potential for use in complicated non-linear models and ought to be known by all applied statisticians.
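The mechanics of a Gibbs sampler can be shown on a toy problem. The sketch below is an illustration under stated assumptions and is unrelated to the book's own examples: the target is a standard bivariate normal with correlation rho, chosen because both full conditionals are normal and can be sampled directly, so no rejection step is needed here. The function name and its parameters are hypothetical.

```python
import math
import random


def gibbs_bivariate_normal(rho, n_draws, burn_in=500, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho.

    Full conditionals: x | y ~ N(rho * y, 1 - rho^2), and symmetrically for y.
    """
    rng = random.Random(seed)
    sd = math.sqrt(1.0 - rho * rho)  # conditional standard deviation
    x, y = 0.0, 0.0                  # initial guess
    draws = []
    for i in range(burn_in + n_draws):
        x = rng.gauss(rho * y, sd)   # update x given the current y
        y = rng.gauss(rho * x, sd)   # update y given the new x
        if i >= burn_in:             # discard the burn-in iterations
            draws.append((x, y))
    return draws


draws = gibbs_bivariate_normal(rho=0.8, n_draws=20000, seed=1)
xs = [x for x, _ in draws]
mean_x = sum(xs) / len(xs)
var_x = sum((x - mean_x) ** 2 for x in xs) / (len(xs) - 1)
# The simulated marginal of x should be close to N(0, 1).
print(f"marginal of x: mean = {mean_x:.3f}, variance = {var_x:.3f}")
```

The retained x values form a simulated sample from the marginal distribution of x, even though only the conditional distributions were ever sampled. In realistic non-linear models the conditionals are rarely this convenient, which is where a device such as rejection sampling comes in.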

The last three chapters of the book (9-11) give an excellent exposition of the usefulness of the models and the possibilities for model comparison. The applied fields are mainly pharmacokinetics, pharmacodynamics and biological assays, in which background knowledge of the system at hand naturally suggests the specific form of the mean value structure of a statistical model. Surely, other fields of application can make good use of these models as well; an example is given from seismology on the prediction of horizontal accelerations in connection with earthquakes.

The present book fills a gap in the literature. Up to now, work concerning non-linear repeated measures has been widely scattered in the statistical literature. Although the style of the book is somewhat technical, the essence can also be grasped by less mathematically oriented researchers, among whom pharmacokineticists and biologists in particular could profit. The numerous well-documented real-life examples add a lot to the practical understanding of the models. In order to actually carry out computations on one's own datasets, however, I think that close cooperation with a professional statistician is indispensable.

Even for applied statisticians, it would have been an advantage to have the computational aspects expanded by including a comparison of various software packages and routines, preferably with practical examples of programming. The preface states that most of the data sets, together with code for estimation, can be found on StatLib. The inclusion of a diskette containing the same information might have been handy.

I do not hesitate to recommend this book, as it will surely provide new insight and inspiration and a belief in the possibility of handling inference in complicated models.

L. T. SKOVGAARD
Department of Biostatistics
University of Copenhagen
Blegdamsvej 3,
DK-2200 Copenhagen N, Denmark