Chemometrics and Intelligent Laboratory Systems, 20 (1993) 1-24
Elsevier Science Publishers B.V., Amsterdam

Tutorial

Analysis of repeated measurement data using the nonlinear mixed effects model

Marie Davidian

Department of Statistics, Box 8203, North Carolina State University, Raleigh, NC 27695 (USA)

David M. Giltinan

Biostatistics Department, Genentech, Inc., South San Francisco, CA 94080 (USA)

(Received 20 July 1992; accepted 20 October 1992)

Abstract

Davidian, M. and Giltinan, D.M., 1993. Analysis of repeated measurement data using the nonlinear mixed effects model. Chemometrics and Intelligent Laboratory Systems, 20: 1-24.

Situations in which repeated measurements are taken on each of several individual items arise in many areas. These include assay development, where concentration-response data are available for each assay run in a series of assay experiments; pharmacokinetic analysis, where repeated blood concentration measurements are obtained from each of several subjects; and growth or decay studies, where growth or decay is measured over time for each plant, animal, or some other experimental unit. In these situations the model describing the response is often nonlinear in the parameters to be estimated, as is the case for the four-parameter logistic model, which is frequently used to characterize concentration-response relationships for radioimmunoassay and enzyme-linked immunosorbent assay. Furthermore, response variability typically increases with level of response. The objectives of an analysis vary according to the application: for assay analysis, calibration of unknowns for the most recent run may be of interest; in pharmacokinetics, characterization of drug disposition for a patient population may be the focus. The nonlinear mixed effects (NME) model has been used to describe repeated measurement data for which the mean response function is nonlinear. In this tutorial, the NME model is motivated and described, and several methods are given for estimation and inference in the context of the model. The methods are illustrated by application to examples from the fields of water transport kinetics, assay development, and pharmacokinetics.

CONTENTS

1. Introduction
2. Examples
   2.1. Water transport kinetics of high flux hemodialyzers
   2.2. Bioassay of relaxin by RIA
   2.3. Pharmacokinetics of cefamandole
3. The nonlinear mixed effects model
   3.1. Motivation and description
   3.2. Intra-individual variability
   3.3. Inter-individual variability
   3.4. Summary
4. Methods based on individual estimation
   4.1. Introduction
   4.2. Construction of individual estimates $\beta_i^*$
   4.3. Two-stage estimation of population parameters
   4.4. Pooled two-stage algorithm
5. Methods based on linearization
   5.1. Introduction
   5.2. Linear mixed effects algorithm
6. Examples revisited
   6.1. Water transport kinetics of high flux hemodialyzers
   6.2. Bioassay of relaxin by RIA
   6.3. Pharmacokinetics of cefamandole
7. Conclusion
References

Correspondence to: M. Davidian, Department of Statistics, Box 8203, North Carolina State University, Raleigh, NC 27695 (USA). Tel.: (+1-919) 5152532; fax: (+1-919) 5157591.

0169-7439/93/$06.00 © 1993 - Elsevier Science Publishers B.V. All rights reserved

1. INTRODUCTION

Data consisting of repeated measurements taken on each of a number of individuals arise in several areas, such as pharmacokinetics, assay development, and studies of growth and decay. The use of the term ‘individual’ may be quite broad; for instance, it may refer to human or animal subjects, experimental runs, laboratories, devices, etc. Repeated measurements on an individual may be taken over time, at several concentrations of analyte, at several pressures, or over some other set of conditions. The proposed systematic relationship between the measured response, y, and the repeated covariate, x, is often nonlinear in its parameters. In some applications, an appropriate nonlinear model may be derived theoretically on the basis of physical considerations. In other contexts, a nonlinear relationship may be used to provide an empirical description of the data.

The presence of repeated observations on an individual requires particular care in describing the random variation in the data. It is important to recognize explicitly two kinds of variability: random variation among measurements within a given individual (intra-individual variability) and random variation among individuals (inter-individual variability). Within an individual, a common phenomenon is a heterogeneous pattern of variability in the observed measurements which is systematically related to overall response level.

The objectives of any analysis of repeated measurement data depend to some extent on the particular problem. However, in any application, proper characterization of inter- and intra-individual variation is essential to ensure reliable inference. Analysis should therefore be conducted in a framework which explicitly acknowledges the existence of these two sources of variation in repeated measurement data and allows them to be evaluated. A natural parametric framework which accommodates these features is the nonlinear mixed effects model (NME).


This tutorial discusses the NME model and shows how it provides a basis for inference in several applications. Section 2 introduces examples from the fields of water transport kinetics, assay development, and pharmacokinetics. Section 3 describes the model framework. Several approaches to estimation and inference are described in Sections 4 and 5, and these techniques are illustrated by application to the examples in Section 6.

TABLE 1

Ultrafiltration rate responses (UFR, ml/h) for twenty high flux hemodialyzers operated at two flow rates; TMP = transmembrane pressure (mm Hg). Each dialyzer has seven (UFR, TMP) pairs, listed in order of increasing TMP.

Dialyzer

Flow rate = 200 ml/min
 1   UFR   123.0  1537.5  3283.5  3783.0  4059.0  3255.0  3430.5
     TMP    23.5    50.5   102.0   147.5   197.0   248.0   300.0
 2   UFR   948.0  2175.0  3723.0  4443.0  4216.5  4306.5  3661.5
     TMP    30.5    50.5    99.5   150.0   199.0   248.0   300.0
 3   UFR   393.0  1983.0  4042.5  5225.0  4949.5  4597.5  4191.0
     TMP    25.5    49.5    99.5   148.0   199.5   249.0   303.0
 4   UFR   156.0  1665.0  3453.0  4381.5  4849.5  4752.0  4164.0
     TMP    25.0    49.5   100.0   150.0   196.5   248.5   298.0
 5   UFR   982.5  2163.0  4227.0  5028.0  4551.0  4425.0  4230.0
     TMP    30.5    50.5    98.0   150.5   200.5   250.5   299.0
 6   UFR   298.5  1770.0  3529.5  4195.5  4761.0  4473.0  4603.5
     TMP    24.5    48.0   101.0   150.5   200.0   251.5   297.0
 7   UFR   321.0  1770.5  3249.0  4233.0  4573.5  4785.0  4804.5
     TMP    25.5    51.5   100.0   150.5   202.0   249.0   301.0
 8   UFR   366.0  1695.0  3609.0  4263.0  4647.0  4627.5  4398.0
     TMP    26.0    50.0   102.0   149.0   199.0   248.0   299.5
 9   UFR   372.0  1888.5  3469.5  4030.5  4447.5  4243.5  4465.5
     TMP    24.0    54.0    99.5   147.5   200.0   250.0   301.0
10   UFR    64.5  2011.5  3846.0  4498.5  5176.5  4657.5  4081.5
     TMP    24.0    50.5    99.5   148.5   202.0   249.5   297.0

Flow rate = 300 ml/min
 1   UFR   150.0  1540.5  3252.0  4243.5  4857.0  5368.5  5365.5
     TMP    28.5    52.0   100.5   150.0   198.5   249.0   299.5
 2   UFR   642.0  2025.0  4305.0  5811.0  6199.5  6091.5  6360.0
     TMP    29.5    51.5   101.0   148.0   200.0   248.0   300.5
 3   UFR   388.5  1915.5  3765.0  4789.5  5449.5  5317.5  5935.5
     TMP    25.5    50.0    98.0   149.0   201.5   251.0   298.0
 4   UFR  1093.5  1347.0  3535.5  4534.5  4944.0  5362.5  5643.0
     TMP    40.0    47.0   101.0   151.5   198.0   251.0   300.0
 5   UFR   405.0  1659.0  4051.5  5284.5  6043.5  6483.0  6382.5
     TMP    29.0    49.5   101.5   152.0   202.0   250.0   297.5
 6   UFR   360.0  2049.0  4188.0  4999.5  5767.5  6247.5  6214.5
     TMP    23.5    48.0   101.0   149.0   199.0   248.0   300.5
 7   UFR   117.0  1768.5  3970.5  5268.0  6180.0  6148.5  6142.5
     TMP    23.5    48.5   102.5   151.5   199.0   251.0   302.0
 8   UFR   189.0  1851.0  3721.5  5235.0  6091.5  6298.5  6477.0
     TMP    26.0    51.5    97.0   150.5   199.0   250.0   299.5
 9   UFR  1041.0  1932.0  4377.0  5122.5  5809.5  5409.0  6201.0
     TMP    35.5    48.0   102.5   150.0   199.0   250.0   300.5
10   UFR   571.5  2050.5  3940.5  5010.0  5515.5  6118.5  5071.5
     TMP    28.0    50.5   100.0   149.0   200.0   250.5   302.0


2. EXAMPLES

2.1. Water transport kinetics of high flux hemodialyzers

Ref. 1 presents data from an experiment which evaluates the water transport kinetics of high flux membrane dialyzers used for hemodialysis for patients with end-stage renal disease. Twenty dialyzers were evaluated in vitro with bovine blood at two different blood flow rates, 200 or 300 ml/min. For each of seven values of transmembrane pressure (TMP, mm Hg) exerted on the dialyzer membrane, ultrafiltration rate (UFR, ml/h) at which water is removed was measured for each dialyzer. These data are given in Table 1, and profiles for each dialyzer are plotted in Fig. 1. Inspection of the profiles indicates that, although the form of the relationship between UFR and TMP is similar for different dialyzers, its exact nature varies among dialyzers.

Fig. 1. Ultrafiltration rate data for twenty high flux hemodialyzers operated at two blood flow rates. Each symbol represents data for a different dialyzer. Flow rate: (a) 200; (b) 300 ml/min.

The relationship between UFR and TMP is believed to depend on protein polarization, which results in a relatively constant ultrafiltration rate at high pressure. Resistance due to patient oncotic pressure also affects the relationship. Thus, as described in ref. 1, a nonlinear model for the relationship between UFR and TMP is postulated:

$$\mathrm{UFR} = \beta_1\{1 - \exp[-\beta_2(\mathrm{TMP} - \beta_3)]\} \qquad (1)$$

Although this model is derived partly on empirical grounds, the parameters in Eqn. 1 have the following interpretation: $\beta_1$ represents maximum attainable ultrafiltration rate, $\beta_2$ is a hydraulic permeability transport rate, and $\beta_3$ is the transmembrane pressure required to offset patient oncotic pressure.

Fig. 2. Ultrafiltration rate data for dialyzer No. 1 from each of the two blood flow rates with LS fit of Model 1 superimposed, plotted against transmembrane pressure (mm Hg). Flow rate: (a) 200; (b) 300 ml/min.

For a given dialyzer, the seven available observations provide enough information to fit Eqn. 1 to the data by nonlinear (unweighted) least squares (LS). Fig. 2 shows a plot of the LS fit of Model 1 to the data for dialyzer 1 at each blood flow rate. Fig. 3 shows a plot of LS residuals against predicted response, as described in ref. 3, for all dialyzers. The ‘fan-shaped’ pattern is evidence that intra-dialyzer variability is an increasing function of ultrafiltration response level.
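As a minimal sketch of the unweighted LS criterion for a single dialyzer, the following Python fragment evaluates Eqn. 1 and the residual sum of squares for dialyzer 1 at 200 ml/min, using the seven (TMP, UFR) pairs as read from Table 1. The function names and the trial values in `beta_guess` are illustrative assumptions chosen by eye; they are not estimates reported in the paper, and no optimizer is included.

```python
import math

def model1(tmp, beta):
    """Eqn. 1: UFR = beta1 * {1 - exp[-beta2 * (TMP - beta3)]}."""
    b1, b2, b3 = beta
    return b1 * (1.0 - math.exp(-b2 * (tmp - b3)))

def residual_ss(beta, tmp_values, ufr_values):
    """Unweighted LS criterion: sum of squared residuals for one dialyzer."""
    return sum((y - model1(x, beta)) ** 2 for x, y in zip(tmp_values, ufr_values))

# Dialyzer 1, flow rate 200 ml/min (Table 1).
tmp = [23.5, 50.5, 102.0, 147.5, 197.0, 248.0, 300.0]
ufr = [123.0, 1537.5, 3283.5, 3783.0, 4059.0, 3255.0, 3430.5]

# Illustrative trial values only (assumption): plateau of 4000 ml/h,
# rate 0.015 per mm Hg, offset pressure 20 mm Hg.
beta_guess = (4000.0, 0.015, 20.0)
print(residual_ss(beta_guess, tmp, ufr))
```

An LS fit would minimize `residual_ss` over `beta`. The model returns zero at TMP = $\beta_3$ and approaches $\beta_1$ at high pressure, which matches the interpretation of the parameters given for Eqn. 1.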

Fig. 3. Standardized LS residuals versus predicted response for the dialyzer data. Each symbol represents data for a different dialyzer. Flow rate: (a) 200; (b) 300 ml/min.

One objective of the experiment was to contrast kinetic properties for dialyzers at the two different blood flow rates. This question is addressed in Section 6 within the framework of the NME model to incorporate both inter- and intra-dialyzer variation.

2.2. Bioassay of relaxin by RIA

Determination of the concentration of a particular protein in an unknown sample frequently relies on immunoassay or bioassay techniques. Bioassay methods are generally based on a relevant measure of the bioactivity of the protein in question and involve measuring activity at several known (standard) concentrations of the protein. The resulting concentration-response curve is used to determine protein concentration in unknown samples by inverse regression (calibration).

Table 2 shows concentration-response data obtained for standard concentrations (x, ng/ml) in nine runs of a bioassay for the therapeutic protein relaxin [2]. For this assay, bioactivity of relaxin is measured by increased generation and release of intracellular adenosine-3',5'-cyclic monophosphate (cAMP, pmoles/ml) by normal human uterine endometrial cells in the presence of relaxin. (cAMP is an enzyme catalyst which plays a key role in regulating glycogen metabolism in the cell.) For each of the nine runs, triplicate cAMP measurements were determined by RIA for each of seven known relaxin concentrations. A single measurement at zero standard was also available for each run. Fig. 4 illustrates concentration-response data for the standard for each run, where, by convention, the response at zero concentration has been plotted at two dilutions below the lowest standard. The four-parameter logistic model

$$\mathrm{cAMP} = \beta_1 + \frac{\beta_2 - \beta_1}{1 + \exp[\beta_4(\log x - \beta_3)]} \qquad (2)$$

is a standard choice in describing assay data [3]. The parameters have the following interpretation: $\beta_1$ and $\beta_2$ are the asymptotes of the curve, representing the response at zero and infinite concentration; $\beta_3$ is the log EC$_{50}$ value, that is, the log of the concentration which gives a response midway between $\beta_1$ and $\beta_2$; and $\beta_4$ is a slope parameter measuring the steepness of the concentration-response curve. LS fits of Model 2 are superimposed on the raw data for each run in Fig. 4. Inspection of these plots shows that the four-parameter logistic model provides a reasonable representation of the relaxin concentration-response relationship for a given run, but that several of the parameters, including the slope $\beta_4$, vary considerably from run to run. Furthermore, it is evident that variability in measured cAMP increases with response level for a given run (see ref. 3).

TABLE 2

Relaxin bioassay responses (cAMP, pmoles/ml) for nine experiments; x = standard concentration (ng/ml), triplicate responses at each x except 0.0

x      Experiment
       1                       2                       3                       4                       5
 0.00  1.77                    1.80                    1.87                    1.62                    1.03
 0.34  3.35, 4.00, 6.10        4.20, 5.85, 4.70        3.60, 4.70, 4.20        3.65, 3.60, 4.95        1.60, 2.30, 2.95
 0.69  8.40, 12.00, 8.15       7.10, 9.20, 9.00        8.65, 8.60, 4.85        5.35, 7.35, 9.00        3.15, 5.05, 6.70
 1.38  13.25, 24.25, 17.85     15.00, 23.45, 25.40     13.00, 16.95, 13.85     10.80, 18.25, 19.60     7.60, 11.55, 12.80
 2.75  40.15, 49.35, 40.05     30.15, 46.90, 74.45     24.60, 34.60, 31.70     21.75, 26.05, 32.60     14.90, 23.75, 27.55
 5.50  61.85, 85.15, 58.80     51.85, 72.95, 83.70     57.45, 58.25, 49.05     57.05, 92.05, 99.90     34.85, 61.50, 50.00
11.00  95.05, 118.50, 76.25    66.30, 101.25, 92.80    60.55, 77.80, 78.95     85.10, 92.30, 104.90    47.40, 57.20, 61.25
22.00  116.70, 140.45, 90.40   87.50, 115.30, 109.40   74.15, 103.90, 89.85    87.55, 97.35, 101.60    38.70, 55.30, 63.20

x      Experiment
       6                       7                       8                       9
 0.00  1.37                    1.17                    1.87                    2.32
 0.34  3.95, 4.00, 3.75        3.15, 3.70, 3.70        5.40, 4.55, 3.05        4.85, 4.50, 4.20
 0.69  6.00, 10.60, 10.00      2.45, 6.45, 6.25        7.35, 7.10, 7.30        8.40, 7.90, 7.05
 1.38  16.20, 23.95, 22.30     11.85, 16.75, 23.55     11.10, 16.35, 14.15     12.90, 18.20, 16.10
 2.75  34.80, 61.75, 54.05     28.45, 42.45, 49.40     25.70, 27.85, 35.90     31.25, 37.55, 35.75
 5.50  62.25, 112.90, 105.60   51.75, 91.25, 77.10     49.30, 63.60, 72.95     65.95, 94.60, 71.65
11.00  99.30, 137.15, 155.10   81.10, 122.90, 125.90   57.50, 71.45, 69.75     63.75, 68.85, 82.60
22.00  101.95, 166.30, 177.60  72.90, 118.20, 117.95   69.65, 92.05, 88.55     83.70, 103.85, 93.00

A primary objective in analyzing data such as those in Table 2 is the calibration of unknown samples in a given assay run. As described in refs. 4 and 5, the accuracy of calibration confidence intervals and precision profiles for an assay run depends critically on how well the increasing intra-assay variation is characterized. Section 6 shows that using the NME model as a framework for analysis allows this variation to be characterized using data from all nine runs, resulting in improved calibration inference.
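To make the calibration step concrete, here is a minimal Python sketch of the four-parameter logistic curve, written in the algebraic form $\beta_1 + (\beta_2 - \beta_1)/\{1 + \exp[\beta_4(\log x - \beta_3)]\}$, together with its algebraic inverse, which is the point estimate used in inverse regression. The parameter values are hypothetical, not estimates from the relaxin data. With this form, the sign of $\beta_4$ determines which asymptote is approached as $x \to 0$; the negative value used here is an assumption that makes the curve rise from $\beta_1$ to $\beta_2$. Calibration intervals, which depend on the variance characterization discussed in the text, are not shown.

```python
import math

def logistic4(x, beta):
    """b1 + (b2 - b1) / {1 + exp[b4 * (log x - b3)]}, for x > 0."""
    b1, b2, b3, b4 = beta
    return b1 + (b2 - b1) / (1.0 + math.exp(b4 * (math.log(x) - b3)))

def calibrate(y0, beta):
    """Invert the curve: the concentration whose mean response equals y0.

    Requires y0 strictly between the two asymptotes b1 and b2.
    """
    b1, b2, b3, b4 = beta
    return math.exp(b3 + math.log((b2 - y0) / (y0 - b1)) / b4)

# Hypothetical parameter values (assumption), not estimates from the paper.
beta = (1.5, 100.0, math.log(3.0), -1.5)

y_mid = logistic4(3.0, beta)      # response at x = exp(b3), the EC50
x_back = calibrate(y_mid, beta)   # should recover a concentration near 3.0
print(y_mid, x_back)
```

At $x = \exp(\beta_3)$ the exponent is zero, so the response is the midpoint $(\beta_1 + \beta_2)/2$, which is the sense in which $\beta_3$ is the log EC$_{50}$.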

2.3. Pharmacokinetics of cefamandole

The results of a pilot study to investigate the pharmacokinetics of cefamandole, a cephalosporin antibiotic, are reported in ref. 6 and are shown here in Table 3. A dose of 15 mg/kg body weight of cefamandole was administered by 10-min intravenous infusion to six healthy male volunteers, and blood samples were collected from each subject at each of fourteen time points (t, min) post-dose. Drug concentrations in plasma (y, µg/ml) for each sample were determined by high-performance liquid chromatography (HPLC). Fig. 5 plots the resulting plasma concentration-time profiles for each subject.

A commonly employed approach to characterizing the pharmacokinetics of a drug is to represent the body as a system of compartments and to assume that the rate of transfer between compartments and the rate of drug elimination from compartments follow first-order or linear kinetics [7]. Solution of the resulting differential equations shows that the relationship between drug concentration and time is described by a sum of exponential terms. For instance, the biexponential equation

$$y = \beta_1 \exp(-\beta_2 t) + \beta_3 \exp(-\beta_4 t) \qquad (3)$$

follows from the assumption of a two-compartment model to describe kinetics following intravenous injection [7]. Fig. 6, which shows an LS fit of Model 3 to data for subject 2, indicates that cefamandole kinetics are well described by a two-compartment model. The data in Fig. 5 share the characteristic observed in the previous examples: similarly shaped profiles for each subject, with possibly different parameter values for different subjects. It is commonly recognized that intra-subject variation in plasma concentrations increases with plasma concentration level [8-10], in part due to the nature of the HPLC assay.
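The behavior of the biexponential curve can be illustrated with a short Python sketch. The parameter values below are hypothetical (a fast and a slow phase, with time in minutes) and are not estimates for any subject; the sampling times are the fourteen post-dose times of the study. The sketch shows that, once the fast term has decayed, the log concentration is nearly linear in time with slope close to the negative of the smaller rate constant.

```python
import math

def biexponential(t, beta):
    """Eqn. 3: y = b1 * exp(-b2 * t) + b3 * exp(-b4 * t)."""
    b1, b2, b3, b4 = beta
    return b1 * math.exp(-b2 * t) + b3 * math.exp(-b4 * t)

# Hypothetical values (assumption): fast phase (b1, b2), slow phase (b3, b4).
beta = (200.0, 0.08, 40.0, 0.012)

# Sampling times in minutes, as in the cefamandole study.
times = [10, 15, 20, 30, 45, 60, 75, 90, 120, 150, 180, 240, 300, 360]
profile = [biexponential(t, beta) for t in times]

# Slope of log concentration between the last two sampling times.
late_slope = (math.log(profile[-1]) - math.log(profile[-2])) / (times[-1] - times[-2])
print(late_slope)
```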

In pilot experiments on volunteers, such as this study, the primary objectives are to establish an appropriate kinetic model, obtain preliminary information on values of the model parameters, and assess intra-subject measurement error, such as that due to the assay used to process blood samples. These issues are addressed within the NME framework in Section 6. Typically, results from the analysis of a pilot study are used as a basis in the subsequent investigation of kinetics in a more extensive patient population.

3. THE NONLINEAR MIXED EFFECTS MODEL

3.1. Motivation and description

Several common features are apparent in the examples in Section 2. The same nonlinear model for the relationship between the measured response and the covariate is suitable for describing data for each individual, but the values of the parameters that specify the model fully may differ among individuals (inter-individual variation). Furthermore, the variability associated with response measurements for a given individual depends on the response value in a way that is likely to be similar for all individuals, due to, for example, properties of an assay (intra-individual variation). Correct analysis should account for both sources of variation; moreover, characterization of these features may be a goal in itself. The following NME model incorporates both inter- and intra-individual variation. The use of this model as an analytic tool was pioneered by Beal and Sheiner [8-10], who recognized and advocated the need to accommodate and assess both types of variability in pharmacokinetic analysis.

Fig. 5. Cefamandole plasma level-time profiles for six subjects, plotted against time (hours). Each symbol represents data for a different subject.

Let $y_{ij}$ denote the $j$th response, $j = 1, \ldots, m_i$, for the $i$th individual, $i = 1, \ldots, n$, at the set of conditions summarized by the vector of covariates $x_{ij}$. The vector $x_{ij}$ incorporates variables such as time, concentration, pressure, etc. Suppose that a (nonlinear) function $f(x, \beta)$ may be specified to model the relationship between $y_{ij}$ and $x_{ij}$.

Fig. 6. Cefamandole plasma level-time profile for subject No. 2 with LS fit of Model 3, plotted against time (hours).

TABLE 3

Plasma concentrations (µg/ml) of cefamandole following 10-min infusion in six subjects

Time   Subject
(min)         1        2        3        4        5        6
   10    127.00   120.00   154.00   181.00   253.00   140.00
   15     87.00    90.10    94.00   119.00   176.00   120.00
   20     47.40    70.00    84.00    84.30   150.00   106.00
   30     39.90    40.10    56.00    56.10    90.30    60.40
   45     24.80    24.00    37.10    39.80    69.60    60.90
   60     17.90    16.10    28.90    23.30    42.50    42.20
   75     11.70    11.60    25.20    22.70    30.60    26.80
   90     10.90     9.20    20.00    13.00    19.60    22.00
  120      5.70     5.20    12.40     8.00    13.80    14.50
  150      2.55     3.00     8.30     2.40    11.40     8.80
  180      1.84     1.54     4.50     1.60     6.30     6.00
  240      1.50     0.73     3.40     1.10     3.80     3.00
  300      0.70     0.37     1.70     0.48     1.55     1.30
  360      0.34     0.19     1.19     0.29     1.22     1.03

Inter-individual variability is accommodated by the assumption that, although $f$ is common to all individuals, the ($p \times 1$) regression parameter vector $\beta$ may vary across individuals. This is incorporated by specification of a separate ($p \times 1$) vector of parameters $\beta_i$ for the $i$th individual. For example, for the relaxin bioassay, $\beta_i$ would be the ($4 \times 1$) vector whose components are the parameters of the four-parameter logistic Model 2 corresponding to the $i$th run of the assay. The mean response for individual $i$, given the parameter vector $\beta_i$, is thus $E(y_{ij} \mid \beta_i) = f(x_{ij}, \beta_i)$.

3.2. Intra-individual variability

For a given individual, the variability in $y_{ij}$ may be a function of $f(x_{ij}, \beta_i)$. For example, intra-individual variance may be proportional to a power of the mean response given $\beta_i$, that is, $\mathrm{Var}(y_{ij} \mid \beta_i) = \sigma^2 \{f(x_{ij}, \beta_i)\}^{2\theta}$ for some scale parameter $\sigma$ and a power $\theta$; if $\theta = 1$, this is the constant coefficient of variation (C.V.) model with C.V. $= \sigma$. In this specification, the parameters $\sigma$ and $\theta$ are common to all individuals, reflecting the belief that the pattern of variability in measurements is similar across individuals. This is the case if, for example, the pattern is primarily due to a (common) assay used to obtain response measurements. In general, write $\mathrm{Var}(y_{ij} \mid \beta_i) = \sigma^2 g^2\{f(x_{ij}, \beta_i), \theta\}$, where the variance function $g$ describes the common pattern of variability. Other examples of models $g$ for intra-individual variance are given in refs. 4 and 10.
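The power-of-the-mean variance model can be sketched in a few lines of Python. The function name and the value of `sigma` are illustrative assumptions. The example checks the two special cases mentioned or implied above: a power of 1 gives a constant coefficient of variation equal to the scale parameter, and a power of 0 gives constant variance.

```python
def power_variance_sd(mean, sigma, theta):
    """Intra-individual SD when Var(y | beta_i) = sigma^2 * mean^(2 * theta)."""
    return sigma * mean ** theta

sigma = 0.1  # hypothetical scale parameter (assumption)
means = [5.0, 50.0, 500.0]

# theta = 1: constant coefficient of variation, SD / mean = sigma at every level.
cv = [power_variance_sd(m, sigma, 1.0) / m for m in means]

# theta = 0: constant variance, SD = sigma regardless of the mean.
sd_const = [power_variance_sd(m, sigma, 0.0) for m in means]
print(cv, sd_const)
```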

With these definitions, the NME model assumes that the $j$th measurement on individual $i$ can be written as

$$y_{ij} = f(x_{ij}, \beta_i) + \sigma g\{f(x_{ij}, \beta_i), \theta\}\,\epsilon_{ij} \qquad (4)$$

where $\epsilon_{ij}$ is a random error with mean 0 and variance 1. It is often reasonable to assume that measurements within a given individual are statistically independent, reflected in the model by the assumption that the random errors $\epsilon_{ij}$ are independently distributed.

3.3. Inter-individual variability

In Eqn. 4, inter-individual variation is modeled through the assumption of the individual-specific regression parameter vector $\beta_i$. Part of the inter-individual variation in the values of the parameters characterizing mean response may be due to systematic dependence on individual attributes. For example, it is well known that pharmacokinetic parameters depend on an individual’s weight, disease status, or other demographic characteristics [8,9]. Parameters may also vary due to unexplained random variation in the population of individuals; for example, due to natural biological or physical variation among individuals or subtle run-to-run variation in assay procedure.

To account for these possibilities, a model for the dependence of $\beta_i$ on individual attributes and random variation may be specified. The simplest such model is that in which inter-individual variation is assumed to be entirely due to unexplained phenomena:

$$\beta_i = \gamma + z_i \qquad (5)$$

where $\gamma$ is a ($p \times 1$) vector of fixed parameters and $z_i$ is a ($p \times 1$) random vector assumed to arise from a population with mean $0_p$, a ($p \times 1$) vector of zeros, and covariance matrix $\Sigma$. Eqn. 5 states that the $\beta_i$ vary in the population of individuals about the value $\gamma$, their mean, and the variation in the population is described by the matrix $\Sigma$. The diagonal elements of $\Sigma$ characterize the variance of each component of the $\beta_i$ about $\gamma$, and the off-diagonal elements describe how the components vary together (covariance).

A more complicated model allows for dependence on both random and systematic phenomena. For example, in the dialyzer study of Section 2.1, water transport kinetics may be thought to vary among dialyzers in part because of flow rate and in part because of natural variation expected to occur among the devices. Both possibilities are taken into account by assuming that

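$$\beta_i = \gamma_{200} + z_i \quad \text{if dialyzer } i \text{ is used with flow rate 200 ml/min}$$

$$\beta_i = \gamma_{300} + z_i \quad \text{if dialyzer } i \text{ is used with flow rate 300 ml/min} \qquad (6)$$

where $\gamma_{200}$ and $\gamma_{300}$ are ($3 \times 1$) vectors describing the central tendency of the three kinetic parameters in Eqn. 1 for the populations of dialyzers operated at flow rates 200 ml/min and 300 ml/min, respectively. The ($3 \times 1$) random component $z_i$ is again assumed to have zero mean and covariance $\Sigma$, so that random variation in both populations is assumed to be similar. Model 6 may be written compactly in the form of a ‘linear regression’ model:

$$\beta_i = W_i \gamma + z_i \qquad (7)$$

where $\gamma = [\gamma_{200}^T, \gamma_{300}^T]^T$, that is, the ($6 \times 1$) vector with $\gamma_{200}$ and $\gamma_{300}$ stacked, and $W_i$ is a ($3 \times 6$) ‘design’ matrix such that

$$W_i = [I_3 \mid 0_{3,3}] \quad \text{if dialyzer } i \text{ is used with flow rate 200 ml/min}$$

$$W_i = [0_{3,3} \mid I_3] \quad \text{if dialyzer } i \text{ is used with flow rate 300 ml/min} \qquad (8)$$

where $I_r$ is an ($r \times r$) identity matrix and $0_{r,s}$ is an ($r \times s$) matrix of zeros. Note that, under Model 7, a comparison of kinetics between the two flow rates could be made by comparing $\gamma_{200}$ and $\gamma_{300}$, the ‘typical’ kinetic parameters for each population.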

In some cases, it may be appropriate to assume that certain components of the vector $\beta_i$ do not vary across individuals. For example, in a preliminary analysis of the dialyzer data, Vonesh and Carter [1] determined that $\beta_3$, the transmembrane pressure required to offset patient oncotic pressure, was constant across dialyzers within each flow rate, although the value of $\beta_3$ was probably different for the two flow rates. This may be accommodated in the model as follows. For the vector of kinetic parameters $\beta_i = [\beta_{i1}, \beta_{i2}, \beta_{i3}]^T$ associated with individual $i$,

$$\beta_{i1} = \gamma_{1,200} + z_{i1}, \quad \beta_{i2} = \gamma_{2,200} + z_{i2}, \quad \beta_{i3} = \gamma_{3,200}$$

if dialyzer $i$ is used with flow rate 200 ml/min, and

$$\beta_{i1} = \gamma_{1,300} + z_{i1}, \quad \beta_{i2} = \gamma_{2,300} + z_{i2}, \quad \beta_{i3} = \gamma_{3,300} \qquad (9)$$

if dialyzer $i$ is used with flow rate 300 ml/min, where $\gamma_{200} = [\gamma_{1,200}, \gamma_{2,200}, \gamma_{3,200}]^T$ and $\gamma_{300} = [\gamma_{1,300}, \gamma_{2,300}, \gamma_{3,300}]^T$, and $z_i = [z_{i1}, z_{i2}]^T$ is the random vector describing how the first two parameters (maximum ultrafiltration rate and permeability transport rate) vary in both flow rate populations. Model 9 may be written as

$$\beta_i = W_i \gamma + H_i z_i \qquad (10)$$

where $\gamma = [\gamma_{200}^T, \gamma_{300}^T]^T$ ($6 \times 1$), $W_i$ is defined in Eqn. 8, and $H_i$ is the ($3 \times 2$) matrix whose columns are the first two columns of $I_3$.
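The role of the two design matrices can be sketched in plain Python, with matrices stored as lists of rows. All numerical values are hypothetical and chosen only for illustration, and the helper names (`matvec`, `hstack`, `beta_i`) are not from the paper. The sketch builds the ($3 \times 6$) matrix that selects the fixed effects for each flow rate and the ($3 \times 2$) matrix formed from the first two columns of the identity, then forms the individual parameter vector as the fixed part plus the random part.

```python
def matvec(A, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def identity(r):
    return [[1.0 if i == j else 0.0 for j in range(r)] for i in range(r)]

def zeros(r, s):
    return [[0.0] * s for _ in range(r)]

def hstack(A, B):
    """Place two matrices with the same number of rows side by side."""
    return [ra + rb for ra, rb in zip(A, B)]

# W_i selects the 200 ml/min or the 300 ml/min block of the stacked gamma.
W_200 = hstack(identity(3), zeros(3, 3))
W_300 = hstack(zeros(3, 3), identity(3))

# H_i: first two columns of I_3, so the third parameter has no random effect.
H = [row[:2] for row in identity(3)]

# Hypothetical numbers (assumption), for illustration only.
gamma = [4500.0, 0.015, 20.0, 6000.0, 0.012, 22.0]
z_i = [150.0, -0.001]

def beta_i(W, gamma, H, z):
    """beta_i = W_i gamma + H_i z_i."""
    return [f + r for f, r in zip(matvec(W, gamma), matvec(H, z))]

print(beta_i(W_200, gamma, H, z_i))
print(beta_i(W_300, gamma, H, z_i))
```

The third component of each result equals the corresponding fixed effect exactly, reflecting the assumption that this parameter is constant across dialyzers within a flow rate.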

3.4. Summary

For convenience, the form of the NME model is summarized in vector notation as follows. Let $y_i = [y_{i1}, \ldots, y_{im_i}]^T$, $f_i(\beta_i) = [f(x_{i1}, \beta_i), \ldots, f(x_{im_i}, \beta_i)]^T$, and $\epsilon_i = [\epsilon_{i1}, \ldots, \epsilon_{im_i}]^T$ be the ($m_i \times 1$) vectors of responses, mean response functions, and random errors, respectively, for individual $i$. Let $G_i(\beta_i, \theta)$ be the ($m_i \times m_i$) diagonal matrix with diagonal elements $g_{ij} = g\{f(x_{ij}, \beta_i), \theta\}$. With these definitions, the nonlinear mixed effects model may be written as

$$
\begin{aligned}
y_i &= f_i(\beta_i) + \sigma G_i(\beta_i, \theta)\,\epsilon_i && (m_i \times 1) \\
\beta_i &= W_i \gamma + H_i z_i && (p \times 1) \\
z_i &\sim (0, \Sigma) \ (M \times 1) \ \text{distributed independently of} \ \epsilon_i \sim (0, I_{m_i}) \ (m_i \times 1) \\
i &= 1, \ldots, n, \quad j = 1, \ldots, m_i
\end{aligned}
\qquad (11)
$$

where the notation ‘$\sim (a, B)$’ means ‘distributed with mean $a$ and covariance matrix $B$’, $M$ is the dimension of the random components vector $z_i$, and the assumption that $z_i$ and $\epsilon_i$ are distributed independently reflects the belief that the mechanisms governing the two sources of variation operate independently. This is a version of the model discussed by several authors [1,5,8,9,11-13].
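The two-level structure of the model can be made concrete by simulating from it. The following Python sketch is hedged in several ways: it uses the simplest inter-individual model (fixed effects plus a random vector, with identity design matrices), the dialyzer mean function, and the power-of-the-mean variance function; it further assumes normal random effects with a diagonal covariance matrix and normal errors, whereas the model itself specifies only means and covariances. All parameter values and function names are hypothetical.

```python
import math
import random

def model1(tmp, beta):
    """Dialyzer mean function: b1 * {1 - exp[-b2 * (TMP - b3)]}."""
    b1, b2, b3 = beta
    return b1 * (1.0 - math.exp(-b2 * (tmp - b3)))

def simulate_individual(x_values, gamma, sd_z, sigma, theta, rng):
    """Draw beta_i = gamma + z_i, then y_ij = f + sigma * f^theta * eps_ij."""
    # Assumption: independent normal components for z_i (diagonal covariance).
    beta = [g + rng.gauss(0.0, s) for g, s in zip(gamma, sd_z)]
    y = []
    for x in x_values:
        mean = model1(x, beta)
        # abs() guards against a negative mean raised to a fractional power.
        y.append(mean + sigma * abs(mean) ** theta * rng.gauss(0.0, 1.0))
    return beta, y

rng = random.Random(1993)
tmp = [25.0, 50.0, 100.0, 150.0, 200.0, 250.0, 300.0]
gamma = [4500.0, 0.015, 20.0]   # hypothetical population means
sd_z = [400.0, 0.002, 0.0]      # hypothetical; third component does not vary
data = [simulate_individual(tmp, gamma, sd_z, 0.05, 1.0, rng) for _ in range(10)]
print(data[0])
```

Each simulated individual shares the same mean function and the same intra-individual variance parameters but has its own parameter vector, which is the structure the summary above expresses in vector form.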

For repeated measurement data, analysis objectives may usually be formulated in terms of the relevant components of the NME Model 11. Broadly speaking, two types of inference may be distinguished: population and individual inference. In the former case, interest focuses on estimation of the population characteristics $\gamma$ and $\Sigma$. For instance, in the dialyzer study, the objective is to contrast the kinetic behavior of dialyzers operated at the two flow rates as opposed to investigating the behavior of any particular dialyzer. This may be accomplished within the NME framework by estimation and comparison of $\gamma_{200}$ and $\gamma_{300}$. In contrast, calibration inference is typically based on the current assay run. Thus, inference about the individual run parameters $\beta_i$ is an important objective. Working within the NME framework leads to improved inference on $\beta_i$ by exploiting information on the common intra-assay variation from all assay runs.

4. METHODS BASED ON INDIVIDUAL ESTIMATION

4.1. Introduction

Model 11 acknowledges that inter- and intra-individual variation should be considered in an analysis of repeated measurement data. As pointed out by Beal and Sheiner [9], early attempts at analysis of these data did not take both features into account. In particular, a popular method of analysis was essentially to assume that the model for $y_{ij}$ is

$$y_{ij} = f(x_{ij}, \beta) + \sigma \epsilon_{ij} \qquad (12)$$

where $\beta$ is common to all individuals. Estimation of $\beta$, the vector of model parameters (assumed the same for all individuals), was accomplished by ordinary nonlinear LS based on Eqn. 12, ‘pooling’ the data from all individuals. This is referred to as the ‘naive pooled data’ method [8,9], since it ignores variation across individuals. Estimates of $\beta$ obtained by this method can be biased or imprecise [9], and no assessment of inter-individual variability is possible. This method is not recommended [9].
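The naive pooled criterion can be written down in a few lines, which makes its limitation visible. The Python sketch below uses the biexponential mean function and the first four cefamandole concentrations for subjects 1 and 5 from Table 3; the trial parameter vector is hypothetical and no minimization is performed. The function names are illustrative.

```python
import math

def biexponential(t, beta):
    """y = b1 * exp(-b2 * t) + b3 * exp(-b4 * t)."""
    b1, b2, b3, b4 = beta
    return b1 * math.exp(-b2 * t) + b3 * math.exp(-b4 * t)

def naive_pooled_ss(beta, data):
    """LS criterion with one beta for every individual, all data pooled."""
    return sum((y - biexponential(t, beta)) ** 2
               for times, ys in data for t, y in zip(times, ys))

# First four samples (10, 15, 20, 30 min) for subjects 1 and 5 (Table 3).
times = [10, 15, 20, 30]
data = [(times, [127.00, 87.00, 47.40, 39.90]),
        (times, [253.00, 176.00, 150.00, 90.30])]

beta_common = (200.0, 0.08, 40.0, 0.012)  # hypothetical trial value (assumption)
print(naive_pooled_ss(beta_common, data))
```

Because a single parameter vector must serve both subjects, the criterion contains no term that can absorb the systematic difference between their profiles; that difference is treated as if it were measurement error.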

The nonlinear mixed effects Model 11 is complex, since, in order to account for the two sources of variation, two random components, $z_i$ and $\epsilon_i$, are required. The random component $z_i$ appears in the model through the nonlinear function $f$; thus, the effect of inter-individual variation on response measurements is complicated. Standard statistical approaches such as maximum likelihood (ML) estimation or LS are predicated on the ability to specify a distributional model for a response vector $y_i$. In this situation, because of the complex way in which $z_i$ appears in the model, it is not possible to write down a distribution for $y_i$, even if it is assumed that both $z_i$ and $\epsilon_i$ are normally distributed. Thus, standard techniques may be difficult to implement. As a result, many of the methods that have been proposed for analysis of Model 11 are based on approximations which allow a distribution to be specified for $y_i$.

One approach to an approximation based on Model 11 relies on the ability to construct, for each individual, estimates $\beta_i^*$ for $\beta_i$. These estimates form the basis for estimation of $\gamma$, $\Sigma$, $\sigma$, and $\theta$. Intuitively, for this idea to be successful, sufficient data must be available on each individual to obtain suitable $\beta_i^*$. These methods are referred to as two-stage methods in the pharmacokinetics literature [8,9,12] since the idea consists of two stages: construction of $\beta_i^*$ and subsequent estimation of other parameters.


4.2. Construction of individual estimates $\beta_i^*$

For any two-stage method, the quality of the estimates of $\gamma$ and $\Sigma$ depends on the quality of the estimates $\beta_i^*$. If intra-individual variance is not constant but varies with response level according to the function $g$, using ordinary LS estimates for $\beta_i^*$ is undesirable, because LS estimates are inefficient relative to estimates based on weighted least squares (WLS) [4]. The weighting scheme to be used depends on what is known about the function $g$. As is frequently the case, one may be able to specify a function $g$ that describes the pattern of variation, but the value of $\theta$ providing a full characterization is unknown and must be estimated from the data [4]. If it is realistic to assume a common pattern of variation for all individuals, as in Model 11, it makes sense to use the data from all individuals to estimate $\theta$ (and $\sigma$) rather than to estimate separate values for each individual.

It is useful to first review estimation of the intra-individual variance parameters $\sigma$ and $\theta$ based on data from a single individual. As described in ref. 4, for a given individual, variance function estimation (VFE) procedures use residuals from a previous fit as the basis for estimation of $\sigma$ and $\theta$. Two such procedures are described in ref. 4 (Section 4) and are incorporated into an iterative generalized least squares (GLS) algorithm for estimation of the individual's regression parameter. This algorithm may be summarized as follows for use with data from a single individual and is abbreviated GLS-I to emphasize its use on individual data only.

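GLS-I Algorithm: For individual $i$:

1. Obtain the LS estimator $\beta_i^{*(0)}$ and set $k = 0$.

2. Given $\beta_i^{*(k)}$, minimize the objective function $O_i(\beta_i^{*(k)}, \sigma, \theta)$ in $\sigma$ and $\theta$ to obtain $\hat{\theta}^{(k)}$, where $O_i(\beta_i, \sigma, \theta)$ is an objective function for VFE such as

$$PL_i(\beta_i, \sigma, \theta) = \sum_{j=1}^{m_i} \left[ \frac{\{y_{ij} - f(x_{ij}, \beta_i)\}^2}{\sigma^2 g^2\{f(x_{ij}, \beta_i), \theta\}} + \log[\sigma^2 g^2\{f(x_{ij}, \beta_i), \theta\}] \right]$$

$$AR_i(\beta_i, \sigma, \theta) = \sum_{j=1}^{m_i} \left[ \frac{|y_{ij} - f(x_{ij}, \beta_i)|}{\sigma g\{f(x_{ij}, \beta_i), \theta\}} + \log[\sigma g\{f(x_{ij}, \beta_i), \theta\}] \right] \qquad (13)$$

and form estimated weights $\hat{w}_{ij}^{(k)} = 1/g^2\{f(x_{ij}, \beta_i^{*(k)}), \hat{\theta}^{(k)}\}$.

3. Obtain the GLS estimator of $\beta_i$ by WLS with weights $\hat{w}_{ij}^{(k)}$. Set $k = k + 1$, let $\beta_i^{*(k)}$ be this GLS estimator, and return to step 2.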

The objective functions for the estimation of $(\sigma, \theta)$ in step 2 are motivated in refs. 4 and 5. The scheme may be iterated a fixed number of times or until convergence; see ref. 4.
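Step 2 of GLS-I can be sketched in a few lines for the power-of-the-mean variance function $g\{f, \theta\} = f^{\theta}$. The sketch below is an illustration under stated assumptions rather than the procedure of refs. 4 and 5: it assumes the fitted means and residuals from the current fit are already available, profiles out $\sigma$ for each candidate $\theta$ (for fixed $\theta$, the PL objective is minimized by $\sigma^2 = m_i^{-1}\sum_j r_{ij}^2 / g_{ij}^2$), and replaces a general optimizer by a simple grid search. All function names are hypothetical.

```python
import math

def pl_objective(resid, fitted, sigma, theta):
    """PL objective of Eqn. 13 for one individual, with g = fitted**theta."""
    total = 0.0
    for r, mu in zip(resid, fitted):
        var = sigma ** 2 * mu ** (2 * theta)
        total += r * r / var + math.log(var)
    return total

def estimate_variance_params(resid, fitted, theta_grid):
    """Grid search over theta; sigma is profiled out at each grid point."""
    best = None
    for theta in theta_grid:
        sigma2 = sum(r * r / mu ** (2 * theta)
                     for r, mu in zip(resid, fitted)) / len(resid)
        value = pl_objective(resid, fitted, math.sqrt(sigma2), theta)
        if best is None or value < best[0]:
            best = (value, math.sqrt(sigma2), theta)
    return best[1], best[2]

def gls_weights(fitted, theta):
    """Estimated weights 1 / g^2 used in the WLS fit of step 3."""
    return [1.0 / mu ** (2 * theta) for mu in fitted]

# Toy data (made up): residual magnitude proportional to the fitted mean.
fitted = [1.0, 2.0, 4.0, 8.0]
resid = [0.1 * mu * s for mu, s in zip(fitted, [1, -1, 1, -1])]
sigma_hat, theta_hat = estimate_variance_params(
    resid, fitted, [i / 20 for i in range(41)])
weights = gls_weights(fitted, theta_hat)
```

A grid search is used only to keep the sketch dependency-free; any one-dimensional minimizer over $\theta$ could take its place.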

In ref. 5, the GLS-I algorithm is extended to allow estimation of $\sigma$ and $\theta$ based on data from all individuals. The resulting estimation scheme is abbreviated GLS-P to emphasize that data are pooled across individuals in step 2.

GLS-P Algorithm:

1. For each individual $i$, estimate $\beta_i$ by LS. Call these $n$ estimates $\beta_i^{*(0)}$, $i = 1, \ldots, n$, and set $k = 0$.

2. Minimize in $\sigma$ and $\theta$

$$\sum_{i=1}^{n} O_i(\beta_i^{*(k)}, \sigma, \theta)$$

where $O_i$ is one of the VFE objective functions in Eqns. 13, obtaining $\hat{\theta}^{(k)}$. For each individual $i$, form estimated weights $\hat{w}_{ij}^{(k)} = 1/g^2\{f(x_{ij}, \beta_i^{*(k)}), \hat{\theta}^{(k)}\}$, $i = 1, \ldots, n$.

3. For each individual $i$, obtain the GLS estimator of $\beta_i$ by WLS with weights $\hat{w}_{ij}^{(k)}$. Set $k = k + 1$, let $\beta_i^{*(k)}$, $i = 1, \ldots, n$, be these GLS estimators, and return to step 2.

In step 2, information from all individuals on the (common) intra-individual variance structure is used to estimate $(\sigma, \theta)$ by pooling information across individuals. Since the objective function is simply the sum of the objective functions for all individuals, minimization is no more difficult than for data from a single individual only. See refs. 4 and 5 for details on implementation with standard nonlinear regression software. As in the case of individual data, this scheme is iterated a fixed number of times or until convergence.

4.3. Two-stage estimation of population parameters

Once individual estimates $\beta_i^*$ have been obtained, several possibilities exist for constructing estimates of $\gamma$ and $\Sigma$.

The standard two-stage (STS) method (for example, ref. 12) treats estimates $\beta_i^*$ as if they were the true $\beta_i$. In the simplest case of Model 5, estimates for $\gamma$ and $\Sigma$ are constructed as

$$\hat{\gamma} = n^{-1} \sum_{i=1}^{n} \beta_i^*, \qquad \hat{\Sigma} = (n-1)^{-1} \sum_{i=1}^{n} (\beta_i^* - \hat{\gamma})(\beta_i^* - \hat{\gamma})^T \qquad (14)$$

the sample mean and covariance. Although simple, these estimators take no account of the uncertainty associated with estimation of $\beta_i$ by $\beta_i^*$. The result is that the estimator for $\Sigma$ can be biased and imprecise. Furthermore, no attempt is made to improve on $\beta_i^*$ by taking advantage of all available information. These estimators are generally regarded as undesirable [5,8,9,12].
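As a concrete illustration of Eqns. 14, the sample mean and covariance of the individual estimates can be computed as follows. This is a minimal sketch; the function name and the toy input are hypothetical.

```python
def standard_two_stage(beta_star):
    """STS estimates (Eqns. 14): sample mean and covariance of the
    individual estimates, given as a list of n parameter vectors."""
    n = len(beta_star)
    p = len(beta_star[0])
    gamma = [sum(b[k] for b in beta_star) / n for k in range(p)]
    sigma = [[sum((b[r] - gamma[r]) * (b[c] - gamma[c]) for b in beta_star) / (n - 1)
              for c in range(p)]
             for r in range(p)]
    return gamma, sigma

# Toy input (made up): two individuals, two parameters each.
gamma_sts, sigma_sts = standard_two_stage([[1.0, 2.0], [3.0, 6.0]])
```

Nothing in this computation uses the precision of each $\beta_i^*$, which is exactly the shortcoming noted above.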

Several methods that account for the estimation error in $\beta_i^*$, which STS estimation ignores, have been proposed [8,9,12,13]; one such method, described explicitly in ref. 12, is the global two-stage (GTS) method. The method is described here for the particular case given by Model 7.

An assessment of the uncertainty of estimation in $\beta_i^*$ may be obtained by appealing to large sample theory for nonlinear regression, as described in ref. 4 (Section 3.2). The theory states that, for large numbers of observations (in our case, large $m_i$), the sampling distribution of an estimator $\beta_i^*$ is $p$-variate normal with mean $\beta_i$ and some covariance matrix $V_i$, where the form of $V_i$ is determined by the nature of $f$ and $\beta_i^*$ and depends on $\beta_i$. In practice, $V_i$ is estimated by replacing parameters by estimates where they appear; henceforth, assume that $V_i$ refers to such an estimate. If this theory is relevant, then given $\beta_i$, $\beta_i^*$ is approximately normally distributed with mean $\beta_i$ and covariance $V_i$; write $\beta_i^* \mid \beta_i \sim N(\beta_i, V_i)$. Under the further assumption that $\beta_i$ is normally distributed with mean $W_i\gamma$ and covariance $\Sigma$, as suggested by Model 7 for normal $z_i$, it follows that $\beta_i^* \sim N(W_i\gamma, V_i + \Sigma)$. This argument thus leads to a distributional assumption about $\beta_i^*$, suggesting that standard approaches to the estimation of $\gamma$ and $\Sigma$ may be used, treating the $\beta_i^*$ as 'data'. The GTS method estimates $\gamma$ and $\Sigma$ by ML estimation based on this normal distribution and is implemented by a two-step iterative algorithm [5,12] which produces as a by-product 'refined' estimates of $\beta_i$ that make use of information from all individuals. The GTS algorithm may be started by using initial values for $\gamma$ and $\Sigma$ from Eqns. 14.

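GTS Algorithm: At iteration $(k + 1)$

1. Produce refined estimates of $\beta_i$:

$$\hat{\beta}_i^{(k+1)} = (V_i^{-1} + \hat{\Sigma}_{(k)}^{-1})^{-1} (V_i^{-1} \beta_i^* + \hat{\Sigma}_{(k)}^{-1} W_i \hat{\gamma}_{(k)}) \quad \text{for } i = 1, \ldots, n \qquad (15)$$

2. Obtain updated estimates of $\gamma$ and $\Sigma$:

$$\hat{\gamma}_{(k+1)} = \sum_{i=1}^{n} Q_i^{(k)} \hat{\beta}_i^{(k+1)}, \qquad Q_i^{(k)} = \Big( \sum_{l=1}^{n} W_l^T \hat{\Sigma}_{(k)}^{-1} W_l \Big)^{-1} W_i^T \hat{\Sigma}_{(k)}^{-1}$$

$$\hat{\Sigma}_{(k+1)} = n^{-1} \sum_{i=1}^{n} (\hat{\beta}_i^{(k+1)} - W_i \hat{\gamma}_{(k+1)})(\hat{\beta}_i^{(k+1)} - W_i \hat{\gamma}_{(k+1)})^T + n^{-1} \sum_{i=1}^{n} (V_i^{-1} + \hat{\Sigma}_{(k)}^{-1})^{-1}$$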

Iteration continues until the algorithm converges to final estimates $\hat{\gamma}$, $\hat{\Sigma}$, and $\hat{\beta}_i$. The algorithm is easily programmed in a matrix programming language. Note that the 'refined' estimates in Eqn. 15 have the form of a 'weighted average' of the individual estimate $\beta_i^*$ obtained from the data for individual $i$ only and $W_i\hat{\gamma}$, the estimate of $\beta_i$ predicated on Model 7 using data from all individuals. These estimates are in fact empirical Bayes estimates for $\beta_i$; that is, $\hat{\beta}_i$ is the mean of the distribution of $\beta_i$ given the data $\{y_{ij}\}$, where $\gamma$ and $\Sigma$ have been replaced by the current estimates [12]. An algorithm similar to the GTS algorithm is given in ref. 13.
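The weighted-average structure of the refined estimates is easiest to see when $\beta_i$ is a scalar and $W_i = 1$. The sketch below covers only that special case and is an illustration, not the matrix implementation referred to in the text: each refined estimate is a precision-weighted average of $\beta_i^*$ and the current $\hat{\gamma}$, and the updates for $\gamma$ and $\Sigma$ follow from the normal model $\beta_i^* \sim N(\gamma, V_i + \Sigma)$. The function name, the toy data, and the fixed iteration count are assumptions, and no safeguard is included for a variance estimate that collapses to zero.

```python
def global_two_stage_scalar(beta_star, v, n_iter=200):
    """Scalar GTS sketch (W_i = 1).

    beta_star -- individual estimates, one per individual
    v         -- their estimated sampling variances V_i
    """
    n = len(beta_star)
    # Start from the STS estimates (sample mean and variance).
    gamma = sum(beta_star) / n
    sigma2 = sum((b - gamma) ** 2 for b in beta_star) / (n - 1)
    refined = list(beta_star)
    for _ in range(n_iter):
        # Refined (empirical Bayes) estimates: precision-weighted averages.
        post_var = [1.0 / (1.0 / vi + 1.0 / sigma2) for vi in v]
        refined = [pv * (b / vi + gamma / sigma2)
                   for pv, b, vi in zip(post_var, beta_star, v)]
        # Updated population mean and variance.
        gamma = sum(refined) / n
        sigma2 = (sum((b - gamma) ** 2 for b in refined) / n
                  + sum(post_var) / n)
    return gamma, sigma2, refined

# Toy data (made up): five individuals, equal sampling variance 0.5.
gamma_gts, sigma2_gts, refined_gts = global_two_stage_scalar(
    [1.0, 2.0, 3.0, 4.0, 5.0], [0.5] * 5)
```

With equal $V_i$ the iteration settles where the total variance of the $\beta_i^*$ is split into $V_i$ plus the estimated $\Sigma$, and each refined estimate is pulled toward $\hat{\gamma}$ by the factor $\Sigma/(\Sigma + V_i)$.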


By standard statistical theory for ML estimation, estimates of the uncertainty associated with, for example, $\hat{\gamma}$, may be obtained. An estimate of the covariance matrix for $\hat{\gamma}$ is

$$\hat{A} = \Big( \sum_{i=1}^{n} W_i^T (V_i + \hat{\Sigma})^{-1} W_i \Big)^{-1} \qquad (16)$$

(ref. 2) and may be used to construct hypothesis tests about $\gamma$, as described in Section 6.

Two-stage methods are based on separate estimates $\beta_i^*$ for each individual. Thus, it is not straightforward to use these methods when one of the components of $\beta_i$ is taken to remain constant across individuals, as in Model 9.

4.4. Pooled two-stage algorithm

The final estimates $\beta_i^*$ from the GLS-P scheme are proposed in ref. 5 as input to one of the procedures such as GTS for estimation of $\gamma$ and $\Sigma$. These $\beta_i^*$ are to be preferred to those from the GLS-I method based on a separate estimate for $\theta$ for each individual. This is because the GLS-P estimates are likely to be more precise, being based on a weighting scheme that takes advantage of the information from all individuals through the 'pooled' estimation of $\sigma$ and $\theta$. The term pooled two-stage (PTS) algorithm is used to refer to the following procedure.

1. Obtain $\beta_i^*$ by the GLS-P algorithm based on pooled estimation of $(\sigma, \theta)$, and use these estimates to form $V_i$.

2. Use $\beta_i^*$ and $V_i$ as input to the GTS algorithm to estimate $(\gamma, \Sigma)$ and obtain refined estimates of $\beta_i$ if desired.

PTS methods may be used when sufficient data are available to obtain estimates $\beta_i^*$ from each individual.

5. METHODS BASED ON LINEARIZATION

5.1. Introduction

The other main class of approaches to estimation in the nonlinear mixed effects model is based on linearization of Model 11 by a Taylor series in the random effects $z_i$. This idea was first advocated by Beal and Sheiner [8,9]. Under the assumption that $z_i \sim (0, \Sigma)$ and $\epsilon_i \sim (0, I_{m_i})$, they proposed that the Taylor series be taken about $z_i = 0$, leading to the approximate model

$$y_i \approx f_i(W_i\gamma) + D_i(W_i\gamma) z_i + \sigma G_i(W_i\gamma, \theta) \epsilon_i \qquad (17)$$

where $D_i(W_i\gamma) = X_i(W_i\gamma) H_i$, and $X_i(\beta_i)$ is the $(m_i \times p)$ matrix of derivatives of $f_i(\beta_i)$ with respect to $\beta_i$. This approximation implies

$$E(y_i) \approx f_i(W_i\gamma), \qquad \mathrm{Cov}(y_i) \approx D_i(W_i\gamma) \Sigma D_i^T(W_i\gamma) + \sigma^2 G_i^2(W_i\gamma, \theta) \qquad (18)$$

If one assumes that the random components $z_i$ and $\epsilon_i$ are normally distributed, and if Model 17 is treated as exact, then the $y_i$ have $m_i$-variate normal distributions with mean and covariance given in Eqn. 18. This approximation is the basis for the suggestion of Beal and Sheiner to obtain estimates of $(\gamma, \Sigma, \sigma, \theta)$ by maximizing in these parameters the normal likelihood for the data vectors $y_i$, $i = 1, \ldots, n$, corresponding to these assumptions. In the pharmacokinetics literature, this procedure is referred to as the 'first-order' method, and the estimation scheme corresponding to normal ML is referred to as extended least squares (ELS) [8-10]. The method is implemented in the software package NONMEM, and the reader is referred to ref. 14 for computational details. The individual $\beta_i$ may be estimated by an empirical Bayes approach as well [14]. NONMEM is widely used in the analysis of population pharmacokinetic data. For some problems, this approach can be computationally intensive.
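To make Eqn. 18 concrete, the sketch below evaluates the approximate mean and covariance for a deliberately simple, hypothetical case that is not one of the tutorial's examples: a monoexponential mean $f(x, \beta) = \exp(-\beta x)$ with a single random effect ($\beta_i = \gamma + z_i$, so $W_i = H_i = 1$) and the power-of-the-mean variance function. In this case $D_i$ is the vector with elements $-x_{ij}\exp(-\gamma x_{ij})$. The function name and input values are assumptions made for illustration.

```python
import math

def first_order_moments(x, gamma, big_sigma, sigma, theta):
    """Approximate mean and covariance of y_i (Eqn. 18) for
    f(x, beta) = exp(-beta * x) with one scalar random effect."""
    mean = [math.exp(-gamma * xj) for xj in x]
    # Derivative of f with respect to beta, evaluated at z_i = 0.
    d = [-xj * math.exp(-gamma * xj) for xj in x]
    m = len(x)
    cov = [[big_sigma * d[j] * d[l]
            + (sigma ** 2 * mean[j] ** (2 * theta) if j == l else 0.0)
            for l in range(m)]
           for j in range(m)]
    return mean, cov

# Made-up inputs: two sampling times, gamma = 0 so that f = 1 everywhere.
mean_y, cov_y = first_order_moments([0.0, 1.0], gamma=0.0, big_sigma=2.0,
                                    sigma=0.5, theta=1.0)
```

The inter-individual term $D_i \Sigma D_i^T$ is what induces correlation among observations on the same individual; the intra-individual term contributes only to the diagonal.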

5.2. Linear mixed effects algorithm

Recently, Vonesh and Carter [1] have proposed an alternative to the ELS method which is also based on the linearization Model 17 but which is computationally simpler. In ref. 5, their method has been modified to incorporate a variance function $g$ and estimation of $\theta$. The procedure is a four-step iterative algorithm in the spirit of GLS. The second step is based on the idea that, if $\gamma$ were known in Model 17, $[y_i - f_i(W_i\gamma)] = D_i(W_i\gamma) z_i + \sigma G_i(W_i\gamma, \theta)\epsilon_i$ is a weighted linear 'regression model' for individual $i$ with 'regression parameter' $z_i$, 'design matrix' $D_i(W_i\gamma)$, and 'covariance matrix' $\sigma^2 G_i^2(W_i\gamma, \theta)$. The term linear mixed effects (LME) refers to the following procedure.

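1. Let $k = 0$, and obtain the ordinary LS estimator $\hat{\gamma}_{(0)}$ by minimizing in $\gamma$

$$\sum_{i=1}^{n} [y_i - f_i(W_i\gamma)]^T [y_i - f_i(W_i\gamma)]$$

2. Form $e_i = y_i - f_i(W_i\hat{\gamma}_{(k)})$, let $\hat{D}_i = D_i(W_i\hat{\gamma}_{(k)})$, and iterate the following steps:

(i) Let $j = 0$. For each individual, obtain an initial estimate of $z_i$, $\hat{z}_i^{(0)} = (\hat{D}_i^T \hat{D}_i)^{-1} \hat{D}_i^T e_i$.

(ii) Form 'residuals' $r_i = e_i - \hat{D}_i \hat{z}_i^{(j)}$ and obtain $(\hat{\sigma}, \hat{\theta})$ by jointly minimizing in $(\sigma, \theta)$

$$\sum_{i=1}^{n} O_i(r_i, W_i\hat{\gamma}_{(k)}, \sigma, \theta)$$

where $O_i$ is a suitably chosen objective function.

(iii) Form $\hat{G}_i = G_i(W_i\hat{\gamma}_{(k)}, \hat{\theta})$, update the random effects estimates for each individual by $\hat{z}_i^{(j+1)} = (\hat{D}_i^T \hat{G}_i^{-2} \hat{D}_i)^{-1} \hat{D}_i^T \hat{G}_i^{-2} e_i$, and let $j = j + 1$. Return to (ii).

Call the final estimates from this process $\hat{z}_i^{(k)}$, $\hat{\sigma}_{(k)}$, $\hat{\theta}_{(k)}$, and let $\hat{G}_i = G_i(W_i\hat{\gamma}_{(k)}, \hat{\theta}_{(k)})$.

3. Let $\bar{z} = n^{-1} \sum_{i=1}^{n} \hat{z}_i^{(k)}$ and $S_{zz} = (n-1)^{-1} \sum_{i=1}^{n} (\hat{z}_i^{(k)} - \bar{z})(\hat{z}_i^{(k)} - \bar{z})^T$, and estimate $\Sigma$ by

$$\hat{\Sigma}_{(k)} = S_{zz} - \min(\hat{\lambda}, 1)\, n^{-1} \hat{\sigma}_{(k)}^2 \sum_{i=1}^{n} (\hat{D}_i^T \hat{G}_i^{-2} \hat{D}_i)^{-1}$$

where $\hat{\lambda}$ is the smallest root of

$$\Big| S_{zz} - \lambda\, n^{-1} \hat{\sigma}_{(k)}^2 \sum_{i=1}^{n} (\hat{D}_i^T \hat{G}_i^{-2} \hat{D}_i)^{-1} \Big| = 0$$

4. Form $\hat{\Lambda}_i = \hat{D}_i \hat{\Sigma}_{(k)} \hat{D}_i^T + \hat{\sigma}_{(k)}^2 \hat{G}_i^2$ and update estimation of $\gamma$, obtaining $\hat{\gamma}_{(k+1)}$ by minimizing in $\gamma$

$$\sum_{i=1}^{n} [y_i - f_i(W_i\gamma)]^T \hat{\Lambda}_i^{-1} [y_i - f_i(W_i\gamma)]$$

Let $k = k + 1$ and return to Step 2.

When $G_i = I_{m_i}$ (constant intra-individual variance), no iteration is required within Step 2.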

In general, Step 2 is not too intensive since the ‘regression’ fits are linear, and the process often converges after a few iterations. As with the GLS-P algorithm of Section 4.2, at least one iteration of the entire algorithm should be taken; results often stabilize after three to five iterations. At the end of Step 4, individual estimates may be constructed as

$$\hat{\beta}_i^{(k+1)} = W_i \hat{\gamma}_{(k+1)} + H_i \hat{z}_i^{(k)} \qquad (19)$$

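Choice of VFE objective function $O_i$ is based on the same ideas as in Section 4.2. Two possibilities are

$$PL_i(r_i, \beta, \sigma, \theta) = \sum_{j=1}^{m_i} \left[ \frac{r_{ij}^2}{\sigma^2 g^2\{f(x_{ij}, \beta), \theta\}} + \log[\sigma^2 g^2\{f(x_{ij}, \beta), \theta\}] \right]$$

$$AR_i(r_i, \beta, \sigma, \theta) = \sum_{j=1}^{m_i} \left[ \frac{|r_{ij}|}{\sigma g\{f(x_{ij}, \beta), \theta\}} + \log[\sigma g\{f(x_{ij}, \beta), \theta\}] \right] \qquad (20)$$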

Minimization of these objective functions in Step 2(ii) can be accomplished using standard nonlinear regression software; see refs. 4 and 5. The required minimizations in Steps 1 and 4 can also be accomplished with standard nonlinear regression software, as described in ref. 1.

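The covariance matrix of $\hat{\gamma}_{(k+1)}$ obtained at the end of Step 4 can be estimated by

$$\hat{\Omega} = \Big( \sum_{i=1}^{n} W_i^T \hat{X}_i^T \hat{\Lambda}_i^{-1} \hat{X}_i W_i \Big)^{-1} \qquad (21)$$

where $\hat{X}_i$ is $X_i$ evaluated at $W_i\hat{\gamma}_{(k+1)}$ and $\hat{\Lambda}_i = D_i(W_i\hat{\gamma}_{(k+1)}) \hat{\Sigma}_{(k)} D_i^T(W_i\hat{\gamma}_{(k+1)}) + \hat{\sigma}_{(k)}^2 G_i^2(W_i\hat{\gamma}_{(k+1)}, \hat{\theta}_{(k)})$. This method may be used when information from each individual is sparse as long as $m_i > M$ (the dimension of $z_i$).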

Lindstrom and Bates [11] suggest an improved approximation leading to a more computationally intensive algorithm.


6. EXAMPLES REVISITED

6.1. Water transport kinetics of high flux hemodialyzers

One objective of the dialyzer study was to compare the kinetic properties of dialyzers operated at the two flow rates. This question is now addressed in the context of the NME framework, assuming the kinetic Model 1 for f throughout.

Following Vonesh and Carter [1], $\beta_i$ is assumed to conform to the model given by Eqns. 9 and 10. From Fig. 3, intra-dialyzer variability is an increasing function of ultrafiltration level. To accommodate this feature, assume that intra-dialyzer variance follows the power of the mean model

$$g\{f(x_{ij}, \beta_i), \theta\} = f(x_{ij}, \beta_i)^{\theta} \qquad (22)$$

In their analysis, Vonesh and Carter [1] assumed constant intra-dialyzer variance.

The two-stage methods in Section 4.3 do not allow for the possibility that some components of $\beta_i$ are fixed; the LME method based on linearization in Section 5.2 does. Because $\beta_{i3}$, the transmembrane pressure required to offset patient oncotic pressure, is taken to be fixed, two-stage methods are not applied to this example. Instead, the four-step LME algorithm given in Section 5.2 is used to estimate $(\gamma, \Sigma, \sigma, \theta)$. Results are similar for the two VFE objective functions in Eqn. 20; hence, results are reported for PL only. The process stabilized after three iterations in that the relative change in parameter estimates between successive iterations was less than $10^{-4}$. The estimates are:

$$\hat{\gamma} = [4512.782,\ 0.021,\ 21.922,\ 6253.007,\ 0.013,\ 21.791]^T$$

$$\hat{\Sigma} = \begin{pmatrix} 131142.26 & -0.325 \\ -0.325 & 7.53 \times 10^{-6} \end{pmatrix}$$

$(\hat{\sigma}, \hat{\theta}) = (30.04, 0.26)$. The estimated value of the power parameter $\theta$ suggests that variance in measurements on a given dialyzer increases like the square root of mean ultrafiltration rate.

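Standard errors for $\hat{\gamma}$ may be obtained as the square roots of the diagonal elements of the estimated covariance matrix $\hat{\Omega}$ in Eqn. 21. For the assumed model, $\hat{\Omega}$ is a $(6 \times 6)$ block diagonal matrix with symmetric $(3 \times 3)$ blocks

$$\hat{\Omega}_{200} = \begin{pmatrix} 15776.794 & -6.715 \times 10^{-2} & -5.500 \\ \cdot & 1.776 \times 10^{-6} & 2.665 \times 10^{-4} \\ \cdot & \cdot & 0.2688 \end{pmatrix}$$

$$\hat{\Omega}_{300} = \begin{pmatrix} 20668.417 & -7.289 \times 10^{-2} & -18.568 \\ \cdot & 1.053 \times 10^{-6} & 1.941 \times 10^{-4} \\ \cdot & \cdot & 0.4175 \end{pmatrix}$$

corresponding to the $(3 \times 1)$ components $\hat{\gamma}_{200}$ and $\hat{\gamma}_{300}$ of $\hat{\gamma}$, respectively.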

One objective of this study was to determine whether kinetic properties of dialyzers differ between the two flow rates. A formal statistical hypothesis test of this question is the three degree of freedom test of the hypotheses

$$H_0: \gamma_{1,200} - \gamma_{1,300} = 0,\ \gamma_{2,200} - \gamma_{2,300} = 0,\ \gamma_{3,200} - \gamma_{3,300} = 0$$

vs.

$$H_1: H_0 \text{ not true}$$

which contrasts the three kinetic parameters for the two flow rates. $H_0$ may be written in the form of a general linear hypothesis as $H_0: L\gamma = 0$ vs. $H_1: L\gamma \neq 0$, where $L = [I_3 \mid -I_3]$. By standard statistical theory [1,2], the test may be conducted by comparing the test statistic

$$T^2 = \hat{\gamma}^T L^T (L \hat{\Omega} L^T)^{-1} L \hat{\gamma} \qquad (23)$$

to critical values from the $\chi^2$ distribution with $\nu$ degrees of freedom, where $\nu$ corresponds to the degrees of freedom associated with the test (equivalently, the number of rows of $L$). (For the PTS algorithm of Section 4.4, a test statistic analogous to Eqn. 23 is $T^2 = \hat{\gamma}^T L^T (L \hat{A} L^T)^{-1} L \hat{\gamma}$, where $\hat{\gamma}$ is the final estimate of $\gamma$ obtained from Stage 2 of the PTS algorithm at convergence and $\hat{A}$ is given in Eqn. 16.)
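The statistic in Eqn. 23 can be evaluated directly from the rounded estimates printed in this section. The sketch below is illustrative only: it uses the facts that, for $L = [I_3 \mid -I_3]$ and a block diagonal covariance matrix, $L\hat{\gamma}$ is the difference of the two $(3 \times 1)$ components and $L\hat{\Omega}L^T$ is the sum of the two blocks. Two exponents in the second block are hard to read in the printed source and are assumed here to be $10^{-2}$ and $10^{-6}$, matching the first block. Because the inputs are rounded, the result is close to, but not identical with, the value reported for the dialyzer data. The helper `solve` is a hypothetical, dependency-free linear solver.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

gamma_200 = [4512.782, 0.021, 21.922]
gamma_300 = [6253.007, 0.013, 21.791]
omega_200 = [[15776.794, -6.715e-2, -5.500],
             [-6.715e-2, 1.776e-6, 2.665e-4],
             [-5.500, 2.665e-4, 0.2688]]
# Exponents of entries (1,2) and (2,2) are assumed (see lead-in).
omega_300 = [[20668.417, -7.289e-2, -18.568],
             [-7.289e-2, 1.053e-6, 1.941e-4],
             [-18.568, 1.941e-4, 0.4175]]

diff = [a - b for a, b in zip(gamma_200, gamma_300)]          # L gamma-hat
cov = [[omega_200[r][c] + omega_300[r][c] for c in range(3)]  # L Omega-hat L^T
       for r in range(3)]
t2 = sum(d * s for d, s in zip(diff, solve(cov, diff)))

# Upper tail of the chi-square distribution with 3 degrees of freedom
# (closed form valid for 3 degrees of freedom only).
p_value = math.erfc(math.sqrt(t2 / 2)) + math.sqrt(2 * t2 / math.pi) * math.exp(-t2 / 2)
```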

For the dialyzer data, $T^2 = 86.778$. From a table of $\chi^2$ critical values with three degrees of freedom, this is highly significant ($p < 0.0001$), suggesting that, overall, kinetic properties of dialyzers operated at 200 ml/min differ from those of dialyzers operated at 300 ml/min. Inspection of the estimate $\hat{\gamma}$ indicates that this result is largely due to differences in the first two components of $\gamma_{200}$ and $\gamma_{300}$, maximum attainable ultrafiltration rate and hydraulic permeability transport rate, for the two flow rates. This is the same qualitative result obtained in ref. 1 assuming constant intra-dialyzer variance. Although the conclusion here is the same, the knowledge that intra-dialyzer variance increases with mean response is useful for further study of the process. Evidence suggesting that incorrect specification of the intra-individual variance function can produce biased, less reliable estimates of $\gamma$ and $\Sigma$ is given in refs. 2 and 5.

6.2. Bioassay of relaxin by RIA

Calibration of unknown samples in a cell-based bioassay such as that for relaxin is always based on the standard curve for the current run. As noted in Section 2.2, accurate characterization of intra-assay variability in the response is essential for assessing the precision of calibration. This issue may be addressed for the relaxin bioassay within the NME model framework.

TABLE 4

Estimation based on LS, GLS-I, and GLS-P estimation, relaxin bioassay data (standard errors are given in parentheses)

LS, individual data

Experiment   $\beta_1$
1            132.58
2            111.05
3            100.12
4             95.96
5             54.67
6            158.99
7            111.08
8             82.64
9             86.19

LS estimates of $\beta_2$, $\beta_3$, $\beta_4$, listed in source order because their assignment to experiments cannot be recovered from the extracted layout: 0.71, 1.64, 1.31, -1.03, 1.27, 1.31, 1.60, 1.64, 1.38, 7.65, 4.18, 4.11, 1.30, 1.15, 1.50, 1.33, 1.31, 1.15, 3.25, 1.85, 1.67, 3.21, 2.12, 3.07, 5.51, 1.86, 2.73

Final estimates $\beta_i^*$ from Step 3 of GLS-I algorithm with PL VFE objective function

Experiment   $\beta_1$        $\beta_2$     $\beta_3$     $\beta_4$     $[\hat{\sigma}, \hat{\theta}]$
1            127.37 (16.59)   1.76 (0.44)   1.57 (0.23)   1.43 (0.14)   [0.27, 0.93]
2            107.79 (16.38)   1.86 (0.29)   1.27 (0.24)   1.55 (0.16)   [0.14, 1.16]
3            100.76 (10.37)   1.93 (0.35)   1.66 (0.19)   1.39 (0.11)   [0.21, 0.90]
4            127.86 (27.77)   1.68 (0.28)   1.75 (0.33)   1.43 (0.15)   [0.16, 1.16]
5             61.79 (8.37)    1.07 (0.27)   1.29 (0.22)   1.65 (0.20)   [0.27, 0.99]
6            165.58 (25.50)   1.37 (0.14)   1.58 (0.21)   1.57 (0.10)   [0.10, 1.21]
7            118.09 (16.07)   1.43 (0.44)   1.42 (0.21)   1.74 (0.21)   [0.35, 0.93]
8             91.93 (10.71)   2.08 (0.41)   1.52 (0.21)   1.47 (0.15)   [0.22, 0.95]
9            111.69 (17.42)   2.35 (0.16)   1.61 (0.21)   1.47 (0.08)   [0.03, 1.47]

Final estimates $\beta_i^*$ from Step 3 of GLS-P algorithm with PL VFE objective function

Experiment   $\beta_1$        $\beta_2$     $\beta_3$     $\beta_4$
1            127.32 (19.33)   1.76 (0.36)   1.57 (0.26)   1.43 (0.14)
2            105.56 (12.80)   1.90 (0.38)   1.23 (0.21)   1.59 (0.16)
3            101.83 (17.11)   1.91 (0.38)   1.67 (0.29)   1.38 (0.15)
4            121.58 (18.74)   1.76 (0.38)   1.66 (0.25)   1.49 (0.15)
5             62.13 (7.51)    (missing)     (missing)     1.64 (0.16)
6            164.12 (22.62)   1.38 (0.28)   1.56 (0.21)   1.58 (0.13)
7            121.39 (15.53)   1.34 (0.49)   1.47 (0.21)   1.68 (0.15)
8             93.83 (14.44)   2.02 (0.40)   1.56 (0.27)   (missing)
9             98.44 (12.79)   2.56 (0.49)   1.37 (0.21)   1.64 (0.19)

Pooled estimate of $\sigma$ = 0.20; pooled estimate of $\theta$ = 1.03

Variability in assay responses is typically an increasing function of the response level, and a standard model for characterizing this is the power of the mean variance function in Eqn. 22 [3]. For many assays, usual choices for the power, such as $\theta = 0.5$ or $1.0$, may be inappropriate [3]. Thus, one objective of an analysis is to determine an appropriate value for $\theta$ for the $i$th run. The choice of $\theta$ often has little impact on the parameter estimates in the four-parameter logistic Model 2 obtained by a weighted regression fit or on subsequent estimated concentrations for unknown samples (calibrated values). However, the value of $\theta$ used directly affects the ability to assess the precision of calibration accurately.

Appropriate values for $(\sigma, \theta)$ are often determined by estimation from the available data, usually based on data from the current run only, using, for example, the GLS-I algorithm given in Section 4.2. Although sufficient data are available to provide reliable estimates of $\beta_i$ based on data from a single run, estimates of $(\sigma, \theta)$ obtained in this way are usually less reliable, mainly because estimation of variance parameters is intrinsically more difficult.

This difficulty may be avoided if one is willing to assume a similar pattern of intra-assay variability across runs. Although the values of $(\sigma, \theta)$ may be expected to vary from run to run during assay development, once assay procedures have stabilized, supposition of common values of $\sigma$ and $\theta$ across assay runs is not unreasonable. The NME Model 11 provides a framework for analysis under this assumption, with $f$ as the four-parameter logistic function in Eqn. 2, $g$ the power of the mean Model 22, and $\beta_i$ as in Eqn. 5. The GLS-P algorithm in Section 4.2, where data are pooled across all available runs, is a natural way to estimate the common $(\sigma, \theta)$.

Table 4 displays the results of applying the following algorithms to the relaxin data: (unweighted) LS fit separately by assay run, GLS-I on each assay run, and GLS-P. In the latter two cases, because results were similar for both VFE objective functions in Eqn. 13, results are given for PL only. Although estimates of $\theta$ obtained from GLS-I separately by assay run are fairly similar across runs, the value for run 9 seems somewhat aberrant. This is probably attributable to the difficulty in estimating variance parameters mentioned above. The pooled estimate for $\theta$ from the GLS-P algorithm indicates that a constant C.V. model for intra-assay variance would be a reasonable choice for routine use. The difference in $\hat{\theta}$ for run 9 between GLS-I and GLS-P directly affects assessment of the precision of calibration, as the following development shows.

A standard technique for evaluating intra-assay precision associated with calibration of an unknown sample for a given run is to construct a precision profile [15]. Such a profile is a plot of the estimated precision of an estimated concentration against concentration across the assay range. One way to construct a precision profile is as follows. Let $d$ be the inverse of the four-parameter logistic function, that is, $x = d(y, \beta) = \exp\{\beta_3 + \log[(\beta_2 - y)/(y - \beta_1)]/\beta_4\}$, and let $d_y(y, \beta)$ and the $p$-variate vector-valued function $d_\beta(y, \beta)$ represent the derivatives of $d$ with respect to $y$ and $\beta$, respectively. Then if $y_0$ is the average response obtained for $r$ replicates of an unknown sample in the $i$th assay run at concentration $x_0$, the corresponding estimated concentration is given by $\hat{x}_0 = d(y_0, \hat{\beta}_i)$, where $\hat{\beta}_i$ is an estimate of $\beta_i$ with estimated covariance matrix $V_i$. An estimated approximate variance for $\hat{x}_0$ may be obtained by a Taylor series:

$$\mathrm{Var}(\hat{x}_0) \approx d_y^2(y_0, \hat{\beta}_i)\, \hat{\sigma}^2 g^2(y_0, \hat{\theta}) / r + d_\beta^T(y_0, \hat{\beta}_i)\, V_i\, d_\beta(y_0, \hat{\beta}_i) \qquad (24)$$

where $(\hat{\sigma}, \hat{\theta})$ are estimates of the intra-assay variance parameters. The first term in Eqn. 24 reflects uncertainty in the measurement $y_0$ and will usually dominate the second term, which corresponds to uncertainty in the fitted standard curve. A precision profile is constructed as $[\mathrm{Var}(\hat{x}_0)]^{1/2}/\hat{x}_0$ versus $\hat{x}_0$. Because the first term in Eqn. 24 depends on $(\sigma, \theta)$, assessment of precision of calibration will be sensitive to the values of $\sigma$ and $\theta$ used.
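One point of a precision profile can be sketched as follows. This is an illustration under several assumptions, not the computation used for Fig. 7: the four-parameter logistic is written in the form implied by the inverse $d$ given above, $f(x, \beta) = \beta_1 + (\beta_2 - \beta_1)/[1 + \exp\{\beta_4(\log x - \beta_3)\}]$; derivatives are taken by central differences rather than analytically; $y_0$ is set to the fitted mean at $x_0$; and $V_i$ is taken to be diagonal, built from the GLS-P standard errors for run 9 in Table 4, because the off-diagonal elements are not reported. Function names are hypothetical.

```python
import math

def four_pl(x, beta):
    """Four-parameter logistic in the form implied by its inverse d(y, beta)."""
    b1, b2, b3, b4 = beta
    return b1 + (b2 - b1) / (1.0 + math.exp(b4 * (math.log(x) - b3)))

def inverse_four_pl(y, beta):
    """d(y, beta): concentration corresponding to response y."""
    b1, b2, b3, b4 = beta
    return math.exp(b3 + math.log((b2 - y) / (y - b1)) / b4)

def calibration_cv(x0, beta, v, sigma, theta, r=1, rel_step=1e-6):
    """sqrt(Var(x0-hat)) / x0-hat from the two terms of Eqn. 24."""
    y0 = four_pl(x0, beta)
    h = rel_step * max(1.0, abs(y0))
    d_y = (inverse_four_pl(y0 + h, beta) - inverse_four_pl(y0 - h, beta)) / (2 * h)
    d_beta = []
    for k in range(4):
        h = rel_step * max(1.0, abs(beta[k]))
        up, down = list(beta), list(beta)
        up[k] += h
        down[k] -= h
        d_beta.append((inverse_four_pl(y0, up) - inverse_four_pl(y0, down)) / (2 * h))
    measurement = d_y ** 2 * sigma ** 2 * y0 ** (2 * theta) / r   # first term
    curve = sum(d_beta[a] * v[a][c] * d_beta[c]                    # second term
                for a in range(4) for c in range(4))
    return math.sqrt(measurement + curve) / inverse_four_pl(y0, beta)

# GLS-P estimates for run 9 (Table 4); diagonal V_i is an assumption.
beta_9 = [98.44, 2.56, 1.37, 1.64]
se_9 = [12.79, 0.49, 0.21, 0.19]
v_9 = [[se_9[a] ** 2 if a == c else 0.0 for c in range(4)] for a in range(4)]
cv_single = calibration_cv(3.0, beta_9, v_9, sigma=0.20, theta=1.03, r=1)
cv_quad = calibration_cv(3.0, beta_9, v_9, sigma=0.20, theta=1.03, r=4)
```

Evaluating `calibration_cv` over a grid of $x_0$ values traces out the profile; changing `sigma` and `theta` shows how strongly the first term drives the result.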


Fig. 7 shows the construction of a precision profile for a single replicate of an unknown sample, based on each fit for run 9. The profile based on the LS fit is inappropriate, since it takes no account of the nature of intra-assay variance. The profiles based on GLS-I and GLS-P fits differ appreciably over a large range of concentrations, reflecting the sensitivity of calibration inference to the method used to characterize intra-assay variability. If the model assumptions are correct, then the profile based on the pooled GLS-P fit should provide the most accurate assessment of precision. Overall, precision of this assay is low, but is not unusual for a cell-based bioassay.

6.3. Pharmacokinetics of cefamandole

One objective of a pilot pharmacokinetic study is to characterize intra-subject variation, which is typically an increasing function of plasma level. A second objective is to determine a suitable kinetic model and to obtain preliminary estimates of the values of the pharmacokinetic parameters. All of this information is used in subsequent design and analysis of pharmacokinetic studies in a patient population. These objectives may be addressed within the NME model framework.

From Section 2.3, the biexponential model is a suitable representation of the kinetics of cefamandole at time $x$. This model is often written in the form

$$f(x, \beta) = e^{\beta_1} \exp(-e^{\beta_2} x) + e^{\beta_3} \exp(-e^{\beta_4} x) \qquad (25)$$

in order to ensure positivity of the parameter estimates [16]. This model is used for $f$ in the following analyses of the cefamandole data.
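The parameterization in Eqn. 25 can be sketched as a short function. The code is illustrative; the function names are hypothetical, and the half-life helper simply applies $\log 2$ divided by the smaller of the two rate constants $e^{\beta_2}$ and $e^{\beta_4}$. The parameter values in the usage lines are made up.

```python
import math

def biexponential(x, beta):
    """Eqn. 25: both coefficients and both rate constants are
    exponentiated, so they are positive for any real beta."""
    b1, b2, b3, b4 = beta
    return (math.exp(b1) * math.exp(-math.exp(b2) * x)
            + math.exp(b3) * math.exp(-math.exp(b4) * x))

def terminal_half_life(beta):
    """log(2) divided by the smaller of the two rate constants."""
    slow_rate = min(math.exp(beta[1]), math.exp(beta[3]))
    return math.log(2.0) / slow_rate

# Made-up parameter vector on the log scale.
beta_demo = [5.0, 1.5, 4.0, -0.5]
conc_start = biexponential(0.0, beta_demo)
conc_later = biexponential(2.0, beta_demo)
half_life = terminal_half_life(beta_demo)
```

Working on the log scale means an unconstrained optimizer can be used while the fitted curve remains a sum of two positive, decaying exponentials.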

A standard model for intra-subject variance in pharmacokinetic applications is the power of the mean Model 22. As discussed in ref. 10, usual default values for $\theta$ are often inappropriate for pharmacokinetic data, so it is necessary to estimate the power parameter $\theta$ from the data. Because the same assay is used to process blood samples from all subjects, it is not unreasonable to expect a similar pattern of intra-subject variability for all subjects. Thus, assume that the intra-subject variance function $g$ is given by Eqn. 22 with unknown power $\theta$ (common to all individuals) to be estimated.

Fig. 7. Precision profiles for three different fits to run 9 of the relaxin bioassay: dotted line, based on LS fit from data for run 9 only; dashed line, based on GLS-I fit from data for run 9 only; solid line, based on GLS-P fit. (Horizontal axis: $x_0$.)

TABLE 5

Analysis of the cefamandole data; results using the PTS algorithm of Section 4.4

           Elements of $\beta_i^*$: GLS-P with PL,          Elements of $\hat{\beta}_i$: 'refined' PTS
           from Step 3 of the algorithm                      estimates from Eqn. 15
Subject    $\beta_1$  $\beta_2$  $\beta_3$  $\beta_4$        $\beta_1$  $\beta_2$  $\beta_3$  $\beta_4$
1          5.996      2.236      3.917       0.023           5.774      2.142      3.965       0.060
2          5.262      1.533      3.619      -0.018           5.334      1.664      3.782       0.026
3          5.632      1.884      4.135      -0.268           5.609      1.837      4.117      -0.254
4          6.113      2.161      4.516       0.210           6.088      2.072      4.371      -0.122
5          5.808      1.292      4.184      -0.308           5.801      1.387      4.397      -0.199
6          4.708      1.000      4.149      -0.271           4.986      0.914      3.936      -0.371

Pooled estimate of $\sigma$ = 0.58; pooled estimate of $\theta$ = 0.53

$$\hat{\gamma} = [5.599,\ 1.670,\ 4.095,\ -0.103]^T$$

$$\hat{\Sigma} = \begin{pmatrix} 0.151 & 0.136 & 0.070 & 0.047 \\ \cdot & 0.198 & 0.026 & 0.067 \\ \cdot & \cdot & 0.060 & 0.006 \\ \cdot & \cdot & \cdot & 0.037 \end{pmatrix}$$

Implied C.V.s for kinetic parameters (%): $[6.9,\ 26.6,\ 6.0,\ 186.8]^T$

Because no information on individual attributes is available, take $\beta_i = \gamma + z_i$ (Eqn. 5), where $\beta_i$ is the $(4 \times 1)$ vector of parameters in the order given in Eqn. 25 for the $i$th subject. Within this framework, $\gamma$ is the $(4 \times 1)$ vector of 'typical' pharmacokinetic parameters. Estimates of $\gamma$ and $\Sigma$, the covariance matrix of $z_i$, provide the required preliminary information on kinetics as well as a sense of how kinetic properties vary. Variability in the kinetic parameters is usually expressed as a coefficient of variation, that is, the square root of the appropriate diagonal element of $\hat{\Sigma}$ (the estimate of standard deviation for the kinetic parameter) divided by the corresponding component of $\hat{\gamma}$, times 100%.
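The coefficient of variation just described is a one-line computation. The sketch below applies it to the PTS estimates of $\gamma$ and the diagonal of $\Sigma$ reported in Table 5; the function name is hypothetical, and the absolute value of each component of $\hat{\gamma}$ is used so that the C.V. is positive when the estimate is negative.

```python
import math

def implied_cv_percent(gamma, variances):
    """100 * sqrt(diagonal element of Sigma-hat) / |component of gamma-hat|."""
    return [100.0 * math.sqrt(v) / abs(g) for g, v in zip(gamma, variances)]

# PTS estimates from Table 5.
gamma_pts = [5.599, 1.670, 4.095, -0.103]
sigma_diag_pts = [0.151, 0.198, 0.060, 0.037]
cv_pts = implied_cv_percent(gamma_pts, sigma_diag_pts)
```

The very large value for the fourth parameter arises mainly because the corresponding component of $\hat{\gamma}$ is close to zero on the log scale used in Eqn. 25.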

Under these assumptions, the PTS algorithm of Section 4.4 was used to estimate $(\sigma, \theta)$ using the PL objective function. Final estimates $\beta_i^*$ and corresponding estimated covariance matrices $V_i$ from the first stage of this procedure (GLS-P) were input to the GTS algorithm to estimate $\gamma$ and $\Sigma$ (Stage 2). Refined estimates $\hat{\beta}_i$ (Eqn. 15) of the individual parameters $\beta_i$ were also obtained. The results are summarized in Table 5.

TABLE 6

Analysis of the cefamandole data; results using the LME algorithm of Section 5.2 (after three iterations)

           Elements of $\hat{\beta}_i$: LME algorithm with PL, from Eqn. 19
Subject    $\beta_1$  $\beta_2$  $\beta_3$  $\beta_4$
1          5.500      1.836      3.778      -0.115
2          5.256      1.482      3.665      -0.114
3          5.489      1.685      3.961      -0.346
4          5.819      1.876      4.231       0.062
5          5.832      1.207      4.280      -0.287
6          4.947      1.350      4.356      -0.198

Estimate of $\sigma$ = 0.54; estimate of $\theta$ = 0.58

$$\hat{\gamma} = [5.484,\ 1.584,\ 4.130,\ -0.169]^T$$

$$\hat{\Sigma} = \begin{pmatrix} 0.077 & -0.010 & 0.001 & 0.004 \\ \cdot & 0.030 & -0.054 & 0.011 \\ \cdot & \cdot & 0.054 & -0.017 \\ \cdot & \cdot & \cdot & 0.012 \end{pmatrix}$$

Implied C.V.s for kinetic parameters (%): $[5.1,\ 10.9,\ 5.6,\ 64.8]^T$

Refined estimates of the elements of $\beta_i$ differ from the GLS-P estimates; this is due to the well-known phenomenon that empirical Bayes estimates borrow information across the sample to 'shrink' individual estimates toward the mean parameter value. This difference is most pronounced for the fourth element of $\beta_i$.

The LME algorithm in Section 5.2 was also used to estimate $(\gamma, \Sigma, \sigma, \theta)$ and obtain individual estimates $\hat{\beta}_i = \hat{\gamma} + \hat{z}_i$ of Eqn. 19, where $\hat{\gamma}$ and $\hat{z}_i$ are the final estimates of $\gamma$ and $z_i$. The PL objective function was used in Step 2 for the estimation of $(\sigma, \theta)$. Results after three iterations of the algorithm are given in Table 6. Estimated C.V.s for the parameters based on both analyses suggest that the second rate constant (and hence terminal half life) varies appreciably among subjects.

It is desirable that the two different estimation methods produce comparable results, and they do yield similar estimates for $\gamma$, $\sigma$, and $\theta$. The most striking difference between the two analyses is that the estimates of the covariance matrix $\Sigma$ differ appreciably. It is not clear which method gives the more reliable estimate of $\Sigma$, and hence the more reliable estimate of C.V. The discrepancy may simply stem from the fact that estimation of population variability cannot be very reliable when based on such a small sample (six subjects). In general, estimation of $\Sigma$ should not be attempted unless the number of individuals sampled ($n$) is large. Experience suggests that the quality of estimates of $\gamma$, $\sigma$, and $\theta$ is not as adversely affected by small $n$.

7. CONCLUSION

The nonlinear mixed effects model is a useful and flexible framework for the analysis of repeated measurement data. In this tutorial several approaches to estimation and inference within this framework have been reviewed and illustrated with data sets taken from several applications. The examples illustrate the types of analyses that are possible, as well as the care that must be taken in interpreting the results. In particular, the cefamandole example highlights the fact that estimation of variation in the population of individuals can be unreliable when the number of individuals sampled is small.

Other methods for estimation and inference in the NME model have been proposed that are more computationally intensive than those discussed here. See refs. 17-19 for descriptions of some alternative procedures.

The cefamandole example illustrates the analysis of pharmacokinetic data from a small, controlled pilot study. Clinical data from a patient population usually consist of only a few plasma concentration measurements on each of a large number of subjects along with information on patient attributes such as physical characteristics and disease status. The analysis of these data is usually complex, and improvement of existing techniques as well as development of new procedures is an area of current research [20,21]. An extensive bibliography of references on population pharmacokinetic analyses is given in ref. 22.

REFERENCES

1 E.F. Vonesh and R.L. Carter, Mixed effects nonlinear regression for unbalanced repeated measures, Biometrics, 48 (1992) 1-18.

2 M. Davidian and D. Giltinan, Some estimation methods for nonlinear mixed effects models, Journal of Biopharmaceutical Statistics, 3 (1992) 23-55.

3 D. Rodbard, R.H. Lenox, H.L. Wray and D. Ramseth, Statistical characterization of the random errors in the radioimmunoassay dose-response variable, Clinical Chemistry, 22 (1976) 350-358.

4 M. Davidian and P.D. Haaland, Regression and calibration with nonconstant error variance, Chemometrics and Intelligent Laboratory Systems, 9 (1990) 231-248.

5 M. Davidian and D.M. Giltinan, Some simple methods for estimating intraindividual variability in nonlinear mixed effects models, Biometrics, 49 (1993) in press.

6 N.S. Aziz, J.G. Gambertoglio, E.T. Lin, H. Grausz and L.Z. Benet, Pharmacokinetics of cefamandole using a HPLC assay, Journal of Pharmacokinetics and Biopharmaceutics, 6 (1978) 153-164.

7 M. Gibaldi and D. Perrier, Pharmacokinetics, Marcel Dekker, New York, 1982.

8 S.L. Beal and L.B. Sheiner, Estimating population kinetics, CRC Critical Reviews in Biomedical Engineering, 8 (1982) 195-222.

9 S.L. Beal and L.B. Sheiner, Methodology of population pharmacokinetics, in E.R. Garrett and J.L. Hirtz (Editors), Drug Fate and Metabolism - Method and Techniques, Marcel Dekker, New York, 1985.

10 S.L. Beal and L.B. Sheiner, Heteroscedastic nonlinear regression, Technometrics, 30 (1988) 327-338.

11 M.J. Lindstrom and D.M. Bates, Nonlinear mixed effects models for repeated measures data, Biometrics, 46 (1990) 673-687.

12 J.L. Steimer, A. Mallet, J.L. Golmard and J.F. Boisvieux, Alternative approaches to estimation of population pharmacokinetic parameters: comparison with the nonlinear mixed effect model, Drug Metabolism Reviews, 15 (1984) 265-292.

13 A. Racine-Poon, A Bayesian approach to nonlinear random effects models, Biometrics, 41 (1985) 1015-1023.

14 A.J. Boeckmann, L.B. Sheiner and S.L. Beal, NONMEM User’s Guide, Parts I-N, University of California at San Francisco, 1990.

15 R.P. Ekins and P.R. Edwards, The precision profile: its use in assay design, assessment, and quality control, in W.M. Hunter and J.E.T. Corrie (Editors), Immunoassays for Clinical Chemistry, Churchill Livingstone, Edinburgh, 1983.

16 L.B. Sheiner and S.L. Beal, Evaluation of methods for estimating population pharmacokinetic parameters. III. Monoexponential model: routine clinical pharmacokinetic data, Journal of Pharmacokinetics and Biopharmaceutics, 11 (1983) 303-319.

17 A.E. Gelfand, S.E. Hills, A. Racine-Poon and A.F.M. Smith, Illustration of Bayesian inference in normal data models using Gibbs sampling, Journal of the American Statistical Association, 85 (1990) 972-985.

18 A. Mallet, F. Mentre, J.-L. Steimer and F. Lokiec, Nonparametric maximum likelihood estimation for population pharmacokinetics, with application to cyclosporine, Journal of Pharmacokinetics and Biopharmaceutics, 16 (1988) 311-327.

19 M. Davidian and A.R. Gallant, The nonlinear mixed effects model with a smooth random effects density, Biometrika, (1993) in press.

20 J.W. Mandema, D. Verotta and L.B. Sheiner, Building population pharmacokinetic-pharmacodynamic models, Journal of Pharmacokinetics and Biopharmaceutics, 20 (1992) 511-528.

21 M. Davidian and A.R. Gallant, Smooth nonparametric maximum likelihood estimation for population pharmacokinetics, with application to quinidine, Journal of Pharmacokinetics and Biopharmaceutics, 20 (1992) 529-555.

22 L.B. Sheiner and T.M. Ludden, Population pharmacokinetics/dynamics, Annual Review of Pharmacology and Toxicology, 32 (1992) 185-209.
