

Comparison of One-Step and Two-Step Meta-Analysis Models Using Individual Patient Data

Thomas Mathew¹,* and Kenneth Nordström²

¹ Department of Mathematics and Statistics, University of Maryland Baltimore County, 1000 Hilltop Circle, Baltimore, MD 21250, USA
² Department of Mathematical Sciences, University of Oulu, FI-90014 Oulu, Finland

Received 14 June 2009, revised 8 November 2009, accepted 7 February 2010

The problem of combining information from separate trials is a key consideration when performing a meta-analysis or planning a multicentre trial. Although there is a considerable journal literature on meta-analysis based on individual patient data (IPD), i.e. a one-step IPD meta-analysis, versus analysis based on summary data, i.e. a two-step IPD meta-analysis, recent articles in the medical literature indicate that there is still confusion and uncertainty as to the validity of an analysis based on aggregate data. In this study, we address one of the central statistical issues by considering the estimation of a linear function of the mean, based on linear models for summary data and for IPD. The summary data from a trial is assumed to comprise the best linear unbiased estimator, or maximum likelihood estimator of the parameter, along with its covariance matrix. The setup, which allows for the presence of random effects and covariates in the model, is quite general and includes many of the commonly employed models, for example, linear models with fixed treatment effects and fixed or random trial effects. For this general model, we derive a condition under which the one-step and two-step IPD meta-analysis estimators coincide, extending earlier work considerably. The implications of this result for the specific models mentioned above are illustrated in detail, both theoretically and in terms of two real data sets, and the roles of balance and heterogeneity are highlighted. Our analysis also shows that when covariates are present, which is typically the case, the two estimators coincide only under extra simplifying assumptions, which are somewhat unrealistic in practice.

Key words: Balance; Covariates; Fixed effect; Random effect; Treatment-control difference.

1 Introduction

Combining the results from multiple trials is a problem of ever-increasing importance in medical research. Typically, the main statistical problem in a meta-analysis or multicentre trial reduces to estimation of, or drawing inferences on, a common effect or several common effects, based on observations from independent trials. The efficiency of a combined estimator, or of combined confidence intervals and tests, is thus a key statistical issue in meta-analysis as well as in multicentre trials. Such efficiency considerations are clearly useful for obtaining general statistical guidelines for practitioners involved in, for example, the design and analysis of multicentre clinical trials. For a thorough discussion of the design of such trials, see the article by Fedorov and Jones (2005).

Although a vast number of articles have appeared dealing with specific statistical issues within specific statistical models for meta-analysis, relatively few studies have addressed the underlying common estimation and inference problems from a broader statistical viewpoint, as understood here. Among such more general methodological contributions, in addition to the article cited above, we single out Fleiss (1993), Berrington and Cox (2003), and Senn (2000), the last reference containing a wide-ranging discussion of statistical modelling for meta-analysis and multicentre trials.

* Corresponding author: e-mail: [email protected], Phone: +1-410-455-2418, Fax: +1-410-455-1066

© 2010 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim

Biometrical Journal 52 (2010) 2, 271–287 DOI: 10.1002/bimj.200900143

The setup of this study is one in which data are obtained from several independent trials (studies), and the observations are assumed to follow a linear model. The linear model could result from the fact that such a model is found to be appropriate, perhaps after a suitable transformation of the responses, or the model could be deduced from asymptotic considerations. For example, the asymptotic normality of estimators may lead to a linear structure for the mean vector involving unknown parameters.

The statistical problem is to compare the estimators of a linear function of the mean, obtained from a linear model for individual patient data (IPD) and a linear model for summary (aggregate) data. Simmonds et al. (2005) refer to these as one-stage IPD meta-analysis and two-stage IPD meta-analysis, respectively. We shall refer to the corresponding estimators as the one-step IPD meta-analysis estimator and the two-step IPD meta-analysis estimator, respectively, following Riley et al. (2008b). In the terminology of Senn (2000), the comparison is between a meta-analysis of Type B and Type A. For the sole purpose of this comparison of estimators, we deliberately exclude various possible sources of concern such as publication bias or selection bias. Although such concerns would invariably enter into any practical meta-analytic modelling of trials, their inclusion in our comparison would complicate matters significantly. On the other hand, under this simplifying assumption, we are able to derive a complete solution to this problem for a rather general class of linear models, extending considerably earlier work by Olkin and Sampson (1998) and by Mathew and Nordström (1999). The result obtained has implications not only for meta-analysis, but also for the design of multicentre trials. It can also serve as a starting point for a more comprehensive comparison in which, for example, publication or selection bias is included.

The motivation for this study stems in part from some recent articles in the medical literature, which seem to indicate that there is still considerable confusion and uncertainty as to the validity of a two-step IPD meta-analysis, despite the considerable journal literature on this topic (see, e.g. Stewart and Parmar (1993), Jeng, Scott, and Burmeister (1995), Steinberg et al. (1997), Lau, Ioannidis, and Schmid (1998), Smith and Egger (1998), Blettner et al. (1999), Olkin (1999), Higgins et al. (2001), Egger, Ebrahim, and Smith (2002), Simmonds et al. (2005), Simmonds and Higgins (2007), and several books devoted solely to meta-analysis). Indeed, in a case study involving Caesarean section in HIV-positive women, Angelillo and Villari (2003, p. 323) comment that "One issue that merits closer scrutiny is whether meta-analysis of published data is sufficient or whether individual patient data are necessary"; see also the discussion in Lyman and Kuderer (2005).

On the other hand, a number of recent articles address other types of issues such as combining IPD and aggregate data and accounting for correlation between repeated observations; see, for example, Riley et al. (2008a,b) and Jones et al. (2009). However, a general statistical framework in which to compare directly the one-step and two-step IPD meta-analysis estimators appears not to have been utilized in the literature. Thus, while the problem will be formulated and solved in general statistical terms as a comparison of estimators of parameters in linear models, an effort will also be made to explain and illustrate the practical implications of our findings by way of several concrete examples.

Suppose then that the statistical goal is to estimate and draw inferences on some linear function of the mean vector, for example, the full vector of treatment-control differences in an analysis of variance setting. Suppose further that summary estimators, i.e. best (in the sense of minimum variance) linear unbiased estimators (BLUE) or maximum likelihood estimators (MLE) under normality assumptions, are available from each trial, along with the corresponding covariance matrices. The summary estimators can then be combined optimally to produce a two-step IPD meta-analysis estimator. On the other hand, combining the data from the different trials, a BLUE (or MLE) can also be obtained by specifying a linear model for the entire vector of responses. That is, we have a one-step IPD meta-analysis estimator. A natural question to ask is whether these two estimators coincide in general, and if not, under which condition(s) one can expect this to happen.


In Section 2, we formalize this comparison, and derive in Section 3 a criterion under which the two estimators do indeed coincide. The setup is general, allowing for the presence of random effects as well as covariates in the model. However, the covariance matrix of the responses will be assumed known (i.e. has been accurately estimated). This is a typical assumption in meta-analysis. Due to the generality of the setup, the criterion is a matrix-algebraic condition, whose proof (given in the Appendix) relies on a general matrix-convexity result, to be found in Jack Kiefer's (1959) famous discussion paper on optimum experimental designs. In Section 3, we also indicate briefly ways in which the possible loss of precision of the two-step IPD meta-analysis estimator can be measured.

To gain insight into the interpretation of the criterion and to indicate how to apply it in practice, we consider in Section 4 some commonly employed models. Linear models with fixed treatment effects, and fixed or random trial effects are first considered. The problem here consists of estimating a full set of treatment contrasts, such as the vector of treatment-control differences. For the case of fixed trial effects, it turns out that the one-step and two-step IPD meta-analysis estimators do coincide. This was noted earlier by Olkin and Sampson (1998) and by Mathew and Nordström (1999). However, when the trial effects are considered random, the two estimators coincide only under the condition that the fraction of observations corresponding to any given treatment remains the same across trials. Interestingly, the condition is free of variance components.

The latter result holds also when the trial effects are absent. Thus we find that, even for such a simple model, the one-step and two-step IPD meta-analysis estimators of the treatment contrasts differ, unless the above condition on equal fractions of observations is satisfied. When the condition fails, significant loss of precision may result from using the two-step IPD meta-analysis estimator. A criterion is given for assessing the magnitude of the loss of precision. This loss of precision has implications in interval estimation as well as in hypothesis testing. For example, confidence regions based on the two-step IPD meta-analysis estimator can have considerably larger volume compared with those based on the one-step IPD meta-analysis estimator. Similarly, tests may suffer from a serious loss of power.

We also consider models with a patient-level covariate and note that the one-step and two-step IPD meta-analysis estimators coincide only under a condition that represents homogeneity of the covariate across trials. As such homogeneity is quite unlikely to hold, the two estimators will not coincide in practice.

Our results have implications even for the standard setup of estimating a single treatment-control difference. Here are two examples addressing this scenario.

Example 1. This example is taken from Whitehead (2003, p. 50) and is based on a study comparing two anaesthetics A and B with respect to the recovery times of patients undergoing short surgical procedures. The data from nine centres are given in Table 1. The data are reproduced from Whitehead (2003, Table 3.16) and give the mean and standard deviations (SD) of the log-transformed recovery times (in minutes).

Let $\mu_A$ and $\mu_B$ denote the population mean log-recovery times for anaesthetics A and B, respectively. The problem is thus the estimation of $\mu_A - \mu_B$.

1.1 A model for the one-step IPD meta-analysis estimator

Given the information in Table 1, to compute the one-step IPD meta-analysis estimator of $\mu_A - \mu_B$, we shall use the following model. Let $y_{Aj}$ and $y_{Bj}$ denote the sample mean log-recovery times for anaesthetics A and B, respectively, for the $j$th centre. Letting $\sigma_j^2$ denote the variance of the recovery times for the $j$th centre, and letting $n_{Aj}$ and $n_{Bj}$ denote the sample sizes (i.e. the number of patients) for anaesthetics A and B, respectively, at the $j$th centre, we assume the linear model

$$
E\begin{pmatrix} y_{Aj} \\ y_{Bj} \end{pmatrix} = \begin{pmatrix} \mu_A \\ \mu_B \end{pmatrix}, \qquad
\mathrm{Cov}\begin{pmatrix} y_{Aj} \\ y_{Bj} \end{pmatrix} = \sigma_j^2 \begin{pmatrix} 1/n_{Aj} & 0 \\ 0 & 1/n_{Bj} \end{pmatrix} = V_j \ \text{(say)}, \tag{1}
$$

for the mean log-recovery times at the $j$th centre. Writing $y_j = (y_{Aj}, y_{Bj})'$, and stacking the corresponding mean log-recovery times from all the centres into one column vector as $y = (y_1', \ldots, y_9')'$, we have the IPD model

$$
E(y) = W \begin{pmatrix} \mu_A \\ \mu_B \end{pmatrix}, \qquad \mathrm{Cov}(y) = \mathrm{diag}(V_1, \ldots, V_9) = V \ \text{(say)}, \tag{2}
$$

where in terms of the Kronecker product $W = 1_9 \otimes I_2$, and $1_9$ denotes the column vector of 9 ones (the number of centres). The one-step IPD meta-analysis estimator of $\mu_A - \mu_B$ is obtained from the above model.

1.2 Two-step IPD meta-analysis estimator

The mean difference $\mu_A - \mu_B$ can also be estimated using the corresponding estimators from each centre along with their standard errors, by forming a weighted linear combination of these estimators of $\mu_A - \mu_B$. This is the two-step IPD meta-analysis estimator of $\mu_A - \mu_B$, which can be computed whenever the summary information available from the different centres is simply the estimators of $\mu_A - \mu_B$ and the corresponding standard errors.

As discussed earlier, one may ask whether the one-step and two-step IPD meta-analysis estimators coincide, and if not, by how much they will differ. We shall see that the model given above is a special case of the more general model introduced in Section 2.
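To make the two constructions concrete, the following Python sketch computes both estimators of $\mu_A - \mu_B$ from the Table 1 summaries under models (1) and (2). It is an illustration only: the choice of $\sigma_j^2$ as the pooled within-centre variance is an assumption made here (the text treats $\sigma_j^2$ as known), and the function names `one_step` and `two_step` are hypothetical.

```python
import numpy as np

# Table 1, one row per centre: n_A, mean_A, SD_A, n_B, mean_B, SD_B
table1 = np.array([
    [4, 1.141, 0.967, 5, 0.277, 0.620],
    [10, 2.165, 0.269, 10, 1.519, 0.913],
    [17, 1.790, 0.795, 17, 1.518, 0.849],
    [8, 2.105, 0.387, 9, 1.189, 1.061],
    [7, 1.324, 0.470, 10, 0.456, 0.619],
    [11, 2.369, 0.401, 10, 1.550, 0.558],
    [10, 1.074, 0.670, 12, 0.265, 0.502],
    [5, 2.583, 0.409, 4, 1.370, 0.934],
    [14, 1.844, 0.848, 19, 2.118, 0.749],
])
n_a, y_a, sd_a, n_b, y_b, sd_b = table1.T

# ASSUMPTION (not taken from the paper): sigma_j^2 is the pooled within-centre variance.
sigma2 = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)


def one_step(n_a, n_b, y_a, y_b, sigma2):
    """GLS under model (2): V is diagonal, so mu_A and mu_B are
    precision-weighted means of the centre means; return their difference."""
    w_a, w_b = n_a / sigma2, n_b / sigma2
    estimate = np.sum(w_a * y_a) / w_a.sum() - np.sum(w_b * y_b) / w_b.sum()
    variance = 1.0 / w_a.sum() + 1.0 / w_b.sum()
    return estimate, variance


def two_step(n_a, n_b, y_a, y_b, sigma2):
    """Inverse-variance weighting of the per-centre differences y_Aj - y_Bj."""
    diff = y_a - y_b
    weight = 1.0 / (sigma2 * (1.0 / n_a + 1.0 / n_b))
    return np.sum(weight * diff) / weight.sum(), 1.0 / weight.sum()


est1, var1 = one_step(n_a, n_b, y_a, y_b, sigma2)
est2, var2 = two_step(n_a, n_b, y_a, y_b, sigma2)
print("one-step:", est1, "variance:", var1)
print("two-step:", est2, "variance:", var2)
```

Under these assumptions the two-step variance can never fall below the one-step variance, and the two agree exactly when $n_{Aj}/(n_{Aj}+n_{Bj})$ is the same in every centre.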

Example 2. This example is given in Bower et al. (2003), and deals with the costs of counselling in primary care. Data from different studies (trials) are available on the costs for patients treated by counsellors, and for those who remained under the care of a general practitioner. Table 2 gives the mean and SD of the short-term costs (in British pounds) from four studies; the values are reproduced from Bower et al. (2003, Fig. 3).

Letting $\mu_1$ and $\mu_2$ denote the population average cost for patients treated by counsellors and for patients who remained under the care of a general practitioner, respectively, the problem is one of estimating $\mu_1 - \mu_2$. Again, we can obtain one-step and two-step IPD meta-analysis estimates of $\mu_1 - \mu_2$, by considering a model similar to the one considered in the previous example.

In both the examples, the possible loss of efficiency resulting from the use of the two-step IPD meta-analysis estimator is clearly of interest. An application of our main result (stated as "Result" in Section 3) shows that, in general, the two estimators do not coincide, unless the fraction of observations corresponding to the first treatment (say, anaesthetic A in Example 1, and treatment by counsellor in Example 2) is the same across the studies. Although this aspect must be well known to researchers in this area, we have been unable to find a formal explanation for this in the literature. Note that this fraction is nearly the same in Example 1, but not so in Example 2. Consequently, the loss of efficiency can be expected to be small in Example 1, compared with Example 2. We return in Section 5 to a more detailed analysis of the above examples, including the construction of confidence intervals based on both estimators.

Table 1 Log-transformed recovery times after anaesthesia using anaesthetics A and B.

                 Anaesthetic A                 Anaesthetic B
Centre (trial)   # patients   Mean    SD       # patients   Mean    SD
1                 4           1.141   0.967     5           0.277   0.620
2                10           2.165   0.269    10           1.519   0.913
3                17           1.790   0.795    17           1.518   0.849
4                 8           2.105   0.387     9           1.189   1.061
5                 7           1.324   0.470    10           0.456   0.619
6                11           2.369   0.401    10           1.550   0.558
7                10           1.074   0.670    12           0.265   0.502
8                 5           2.583   0.409     4           1.370   0.934
9                14           1.844   0.848    19           2.118   0.749

It is of some interest to note that the general question of combining information from independent linear experiments (models) has, in fact, a long history in statistics, going back at least to work by Fisher, Cochran and colleagues in the 1930s. For more recent work in this area, see, for example, Hedayat and Majumdar (1985) and the references therein.

2 The Model and the Meta-Analysis Problem

Consider $k$ independent trials and let $y_j$ denote the vector of $n_j$ responses in the $j$th trial. Thus, $y_1, \ldots, y_k$ comprise the IPD. We assume a linear model for the responses of the form

$$
E(y_j) = W_j\beta + Z_j\delta_j, \qquad \mathrm{Cov}(y_j) = V_j, \tag{3}
$$

$j = 1, \ldots, k$, where $\beta$ and $\delta_j$ are the vectors of unknown parameters of dimension $p$ and $q_j$, respectively, and $W_j$ and $Z_j$ are the corresponding design matrices. For example, in model (1) of Example 1, $W_j = I_2$ and $\beta = (\mu_A, \mu_B)'$, while $\delta_j$ is absent. A number of further special cases of model (3) appear in Section 4. The covariance matrix of the responses in (3), $V_j$, typically involves unknown parameters. We will assume that these have been estimated and that $V_j$ is thus free of unknown parameters. This is a common assumption in the meta-analysis literature. Note that this setup does allow for both fixed as well as random effects, the random effects appearing as variance components in $V_j$.

The parameter of interest to us is a linear function of the main parameter $\beta$, which is common to all the trials. The $\delta_j$s, on the other hand, are nuisance parameters that are specific to the trials and are included in the model to account for trial-specific quantities. There may also be nuisance parameters that are common across the trials; these (if any) are included in the parameter $\beta$. The model is thus general enough to include nuisance parameters that are both specific to the trials as well as common across trials. We shall assume that the matrix $X_j = (W_j, Z_j)$ has full rank $p + q_j$. The weighted least squares estimator of $(\beta', \delta_j')'$, based on the data from the $j$th trial, is thus given by

$$
\begin{pmatrix} \hat\beta^{(j)} \\ \hat\delta_j \end{pmatrix} = (X_j' V_j^{-1} X_j)^{-1} X_j' V_j^{-1} y_j .
$$

Table 2 Short-term costs for patients treated by a counsellor and for patients under the care of a general practitioner.

                 Treated by counsellor         Care by general practitioner
Study (trial)    # patients   Mean   SD        # patients   Mean   SD
1                58           304    170       57           226    480
2                87           221    157       45           140     97
3                82           283    142       79           171    291
4                53           322    285       49           166    329

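This estimator is well known to be the best (i.e., minimum covariance) linear unbiased estimator or BLUE of $(\beta', \delta_j')'$ based on $y_j$ and is also the MLE assuming normality for the responses.

Suppose the parameter of interest to us is $\theta = L\beta$, with $L$ an $s \times p$ matrix of full row rank. Under the model (3) and with known covariance matrix $V_j$, the BLUE of $\theta$ from the $j$th trial and the corresponding covariance matrix are given by

$$
\hat\theta^{(j)} = (L, 0)\begin{pmatrix} \hat\beta^{(j)} \\ \hat\delta_j \end{pmatrix}
\quad\text{and}\quad
\mathrm{Cov}(\hat\theta^{(j)}) = (L, 0)\left(\mathcal{I}^{(j)}\right)^{-1}(L, 0)', \tag{4}
$$

where $\mathcal{I}^{(j)} = X_j' V_j^{-1} X_j$. Using the partition of $X_j$ corresponding to the main and nuisance parameters in the model (3), the (information) matrix $\mathcal{I}^{(j)}$ can be partitioned correspondingly as

$$
\mathcal{I}^{(j)} = \begin{pmatrix} \mathcal{I}^{(j)}_{11} & \mathcal{I}^{(j)}_{12} \\ \mathcal{I}^{(j)}_{21} & \mathcal{I}^{(j)}_{22} \end{pmatrix}.
$$

Thus, letting

$$
\mathcal{I}^{(j)}_{11\cdot 2} = \mathcal{I}^{(j)}_{11} - \mathcal{I}^{(j)}_{12}\left(\mathcal{I}^{(j)}_{22}\right)^{-1}\mathcal{I}^{(j)}_{21}, \tag{5}
$$

it follows that

$$
\mathrm{Cov}(\hat\theta^{(j)}) = L\left(\mathcal{I}^{(j)}_{11\cdot 2}\right)^{-1}L'. \tag{6}
$$

The summary estimator of $\theta$ from the $j$th trial is thus $\hat\theta^{(j)}$, with covariance matrix given by (4), or equivalently by (6).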

2.1 The one-step IPD meta-analysis estimator

Suppose the IPD $y_1, \ldots, y_k$ are all combined into one linear model to estimate $\theta$. Thus, letting $y = (y_1', \ldots, y_k')'$, $W = (W_1', \ldots, W_k')'$, $Z = \mathrm{diag}(Z_1, \ldots, Z_k)$, $\delta = (\delta_1', \ldots, \delta_k')'$, and $V = \mathrm{diag}(V_1, \ldots, V_k)$, we obtain the linear model

$$
E(y) = W\beta + Z\delta, \qquad \mathrm{Cov}(y) = V. \tag{7}
$$

Under the model (7), the BLUE (or MLE) of $\theta$ and the corresponding covariance matrix are given by

$$
\hat\theta = (L, 0)\,\mathcal{I}^{-1} X' V^{-1} y, \quad\text{and}\quad \mathrm{Cov}(\hat\theta) = (L, 0)\,\mathcal{I}^{-1}(L, 0)', \tag{8}
$$

where $\mathcal{I} = X' V^{-1} X$ and $X = (W, Z)$. Thus, $\hat\theta$ is the one-step IPD meta-analysis estimator of $\theta$. Using the structure of these matrices, it is readily verified that the expression for the covariance matrix in (8) takes the form

$$
\mathrm{Cov}(\hat\theta) = L\left(\sum_{j=1}^{k} \mathcal{I}^{(j)}_{11\cdot 2}\right)^{-1} L', \tag{9}
$$

with $\mathcal{I}^{(j)}_{11\cdot 2}$ as defined in (5).

2.2 The two-step IPD meta-analysis estimator

Based on the estimators $\hat\theta^{(j)}$ and their covariance matrices, given by (4) and (6), the two-step IPD meta-analysis estimator of $\theta$, denoted $\tilde\theta$, and the corresponding covariance matrix are given by

$$
\tilde\theta = \left(\sum_{j=1}^{k} \mathrm{Cov}(\hat\theta^{(j)})^{-1}\right)^{-1} \sum_{j=1}^{k} \mathrm{Cov}(\hat\theta^{(j)})^{-1}\,\hat\theta^{(j)}
$$

and

$$
\mathrm{Cov}(\tilde\theta) = \left(\sum_{j=1}^{k} \mathrm{Cov}(\hat\theta^{(j)})^{-1}\right)^{-1}
= \left(\sum_{j=1}^{k} \left(L\left(\mathcal{I}^{(j)}_{11\cdot 2}\right)^{-1}L'\right)^{-1}\right)^{-1}. \tag{10}
$$

The above expressions are obtained in the usual way, as a weighted combination of the estimators from the individual trials.

The problem to be addressed is under which type of linear model the two estimators $\hat\theta$ and $\tilde\theta$ coincide. It should be noted that although special cases of this comparison have been touched upon in the literature in a somewhat ad hoc fashion, the only earlier systematic studies along the lines investigated here appear to be those by Olkin and Sampson (1998) and by Mathew and Nordström (1999). In the following section, we derive a condition for equality of these two estimators and consider ways of measuring the loss of precision that results from using the two-step IPD meta-analysis estimator $\tilde\theta$ in models where the two estimators differ.
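The two estimators of this section can be sketched numerically as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names, the simulated trials, and the assumption that every trial carries a trial-specific design matrix $Z_j$ are all choices made for the example. The one-step estimator is obtained by fitting the stacked model (7) directly, and the two-step estimator by combining the per-trial BLUEs with their inverse covariance matrices.

```python
import numpy as np


def block_diag(mats):
    """Block-diagonal matrix built from a list of 2-D arrays (NumPy only)."""
    out = np.zeros((sum(m.shape[0] for m in mats), sum(m.shape[1] for m in mats)))
    r = c = 0
    for m in mats:
        out[r:r + m.shape[0], c:c + m.shape[1]] = m
        r, c = r + m.shape[0], c + m.shape[1]
    return out


def gls(y, X, V):
    """BLUE (X'V^-1 X)^-1 X'V^-1 y and its covariance (X'V^-1 X)^-1."""
    Vi = np.linalg.inv(V)
    cov = np.linalg.inv(X.T @ Vi @ X)
    return cov @ X.T @ Vi @ y, cov


def one_step(ys, Ws, Zs, Vs, L):
    """Fit the stacked model E(y) = W beta + Z delta and return theta = L beta."""
    p = Ws[0].shape[1]
    X = np.hstack([np.vstack(Ws), block_diag(Zs)])
    coef, cov = gls(np.concatenate(ys), X, block_diag(Vs))
    return L @ coef[:p], L @ cov[:p, :p] @ L.T


def two_step(ys, Ws, Zs, Vs, L):
    """Combine per-trial BLUEs of theta, weighting by inverse covariance."""
    p = Ws[0].shape[1]
    precision_sum, weighted_sum = 0.0, 0.0
    for y, W, Z, V in zip(ys, Ws, Zs, Vs):
        coef, cov = gls(y, np.hstack([W, Z]), V)
        precision = np.linalg.inv(L @ cov[:p, :p] @ L.T)
        precision_sum = precision_sum + precision
        weighted_sum = weighted_sum + precision @ (L @ coef[:p])
    cov = np.linalg.inv(precision_sum)
    return cov @ weighted_sum, cov


# Hypothetical data: three trials, a treatment indicator and a covariate in W_j,
# a trial-specific intercept in Z_j, and V_j = I.
rng = np.random.default_rng(0)
ys, Ws, Zs, Vs = [], [], [], []
for n in (8, 10, 12):
    treat = np.arange(n) % 2
    x = rng.normal(size=n)
    ys.append(1.0 * treat + 0.5 * x + rng.normal(size=n))
    Ws.append(np.column_stack([treat, x]))
    Zs.append(np.ones((n, 1)))
    Vs.append(np.eye(n))

L = np.array([[1.0, 0.0]])  # theta = treatment effect only
theta_one, cov_one = one_step(ys, Ws, Zs, Vs, L)
theta_two, cov_two = two_step(ys, Ws, Zs, Vs, L)
print("one-step:", theta_one, cov_one)
print("two-step:", theta_two, cov_two)
```

With $L$ selecting only the treatment effect, the two answers generally differ because the covariate is not distributed identically across the simulated trials; replacing `L` by `np.eye(2)` makes them agree.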

3 The Meta-Analysis Result

As $\hat\theta$ is the BLUE of $\theta$ and as $\tilde\theta$ is another linear unbiased estimator, the corresponding covariance matrices in (9) and (10) must satisfy

$$
\left(\sum_{j=1}^{k} \left(L\left(\mathcal{I}^{(j)}_{11\cdot 2}\right)^{-1}L'\right)^{-1}\right)^{-1} \ \ge\ L\left(\sum_{j=1}^{k} \mathcal{I}^{(j)}_{11\cdot 2}\right)^{-1} L', \tag{11}
$$

where for matrices $A$ and $B$, $A \ge B$ denotes that $A - B$ is nonnegative definite. The two estimators $\hat\theta$ and $\tilde\theta$ thus coincide if and only if their covariance matrices are equal. A condition for this equality to hold is given in the following result.

Result. The two estimators $\hat\theta$ and $\tilde\theta$ coincide if and only if

(i) the matrices $\left(L\left(\mathcal{I}^{(j)}_{11\cdot 2}\right)^{-1}L'\right)^{-1} L\left(\mathcal{I}^{(j)}_{11\cdot 2}\right)^{-1}$ are equal for every $j = 1, \ldots, k$, or equivalently,

(ii) the matrices $\left(L_1\mathcal{I}^{(j)}_{11\cdot 2}L_1'\right)^{-1} L_1\mathcal{I}^{(j)}_{11\cdot 2}L'$ are equal for every $j = 1, \ldots, k$, where $L_1$ is a $(p-s) \times p$ matrix of rank $(p-s)$ satisfying $L_1L' = 0$.

The proof of this result is given in the Appendix.

Note that if $L$ is the identity matrix, i.e. if one is estimating the entire vector $\beta$, condition (i) (or (ii)) clearly holds, i.e. equality holds in (11), and consequently the two estimators $\tilde\beta$ and $\hat\beta$ always coincide. This requires that there be no common nuisance parameters across trials. The intuitive interpretation of the conditions above is perhaps not so obvious, but more insight can be gained by looking at some commonly used models obtained as special cases of our general setup. This will be done in Section 4.
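Condition (i) is straightforward to check numerically once the matrices $\mathcal{I}^{(j)}_{11\cdot 2}$ are available. The sketch below is illustrative only; the function names and the small $2 \times 2$ information matrices are hypothetical and chosen purely to exercise the condition.

```python
import numpy as np


def condition_i_matrices(infos, L):
    """For each trial, form (L I^-1 L')^-1 L I^-1 with I = I_{11.2}^{(j)}."""
    out = []
    for info in infos:
        inv = np.linalg.inv(info)
        out.append(np.linalg.solve(L @ inv @ L.T, L @ inv))
    return out


def estimators_coincide(infos, L, tol=1e-10):
    """True when the condition-(i) matrices agree across all trials."""
    mats = condition_i_matrices(infos, L)
    return all(np.allclose(m, mats[0], atol=tol) for m in mats[1:])


# Hypothetical information matrices for two trials (p = 2).
info_a = np.diag([1.0, 2.0])
info_b = np.array([[2.0, 1.0], [1.0, 3.0]])
L_first = np.array([[1.0, 0.0]])  # theta = first component of beta

print(estimators_coincide([info_a, info_b], L_first))      # condition fails
print(estimators_coincide([info_a, info_b], np.eye(2)))    # L = I: always holds
print(estimators_coincide([info_b, 3 * info_b], L_first))  # proportional information
```

The third call illustrates one simple sufficient case: when the trial information matrices are scalar multiples of each other, the scalar cancels in condition (i) for any $L$.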

As a measure of the loss of precision that results from using the two-step IPD meta-analysis estimator $\tilde\theta$ instead of the one-step IPD meta-analysis estimator $\hat\theta$, one may consider, for example,

$$
\frac{1}{s}\,\mathrm{tr}\left\{ \left(\sum_{j=1}^{k} \left(L\left(\mathcal{I}^{(j)}_{11\cdot 2}\right)^{-1}L'\right)^{-1}\right)^{-1} - L\left(\sum_{j=1}^{k} \mathcal{I}^{(j)}_{11\cdot 2}\right)^{-1} L' \right\}. \tag{12}
$$

The above quantity is simply the trace of the difference between the matrices in (11) divided by the dimension of the matrices. Note that as the difference between the matrices in (11) is always nonnegative definite, the difference has a nonnegative trace. The quantity in (12) is thus zero if and only if the estimators coincide. The larger the quantity in (12), the greater the loss of precision when using $\tilde\theta$ instead of $\hat\theta$. Also note that for computing the loss of precision using the criterion (12), it is necessary to have the matrices $\mathcal{I}^{(j)}_{11\cdot 2}$ available. If the summary information available from each trial is simply the estimator $\hat\theta^{(j)}$ along with the corresponding covariance matrix $L\left(\mathcal{I}^{(j)}_{11\cdot 2}\right)^{-1}L'$, then computation of (12) is not possible.

Apart from a criterion based on the trace, it is possible to use other functional measures to assess the loss of efficiency of the two-step IPD meta-analysis estimator $\tilde\theta$. For example, the ratio of the determinants

$$
\left|\sum_{j=1}^{k} \left(L\left(\mathcal{I}^{(j)}_{11\cdot 2}\right)^{-1}L'\right)^{-1}\right|^{-1/2} \Bigg/ \left| L\left(\sum_{j=1}^{k} \mathcal{I}^{(j)}_{11\cdot 2}\right)^{-1} L' \right|^{1/2} \tag{13}
$$

is a criterion that could be of practical interest. Note that (13) is the ratio of the expected volumes of the confidence regions for $\theta$, based on $\tilde\theta$ and $\hat\theta$, respectively, under normality. Similarly, one can also compare the power functions of tests for testing linear hypotheses concerning $\theta$, the tests being F-tests based on the one-step and two-step IPD meta-analysis estimators.
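Both loss measures can be evaluated directly from the matrices $\mathcal{I}^{(j)}_{11\cdot 2}$. The following sketch uses hypothetical function names and a small hypothetical pair of information matrices; it computes the trace criterion (12) and the determinant ratio (13) as the square root of $|\mathrm{Cov}(\tilde\theta)| / |\mathrm{Cov}(\hat\theta)|$.

```python
import numpy as np


def cov_one_step(infos, L):
    """L (sum_j I_j)^-1 L', the one-step covariance in (9)."""
    return L @ np.linalg.inv(sum(infos)) @ L.T


def cov_two_step(infos, L):
    """(sum_j (L I_j^-1 L')^-1)^-1, the two-step covariance in (10)."""
    precisions = [np.linalg.inv(L @ np.linalg.inv(info) @ L.T) for info in infos]
    return np.linalg.inv(sum(precisions))


def trace_loss(infos, L):
    """Criterion (12): trace of the covariance difference divided by s."""
    diff = cov_two_step(infos, L) - cov_one_step(infos, L)
    return np.trace(diff) / L.shape[0]


def volume_ratio(infos, L):
    """Criterion (13): ratio of the square roots of the two determinants."""
    return np.sqrt(np.linalg.det(cov_two_step(infos, L))
                   / np.linalg.det(cov_one_step(infos, L)))


# Hypothetical information matrices for two trials, theta = first component.
infos = [np.diag([1.0, 2.0]), np.array([[2.0, 1.0], [1.0, 3.0]])]
L = np.array([[1.0, 0.0]])

# By hand: Cov(two-step) = 3/8 and Cov(one-step) = 5/14,
# so (12) equals 1/56 and (13) equals sqrt(1.05).
print(trace_loss(infos, L), volume_ratio(infos, L))
```

A value of zero for `trace_loss`, or of one for `volume_ratio`, corresponds to the case in which the two estimators coincide.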

4 Application to Some Common Models

4.1 A model with fixed treatment effects and fixed trial effects

Let us first consider the model studied by Olkin and Sampson (1998) and by Mathew and Nordström (1999). Suppose there are $k$ trials from which data are obtained on $m$ treatments, and let $y_{ij\alpha}$ denote the $\alpha$th response on the $i$th treatment from the $j$th trial, $\alpha = 1, \ldots, n_{ij}$, assuming $n_{ij} > 0$ for all $i$ and $j$. Letting $\tau_i$ represent the $i$th treatment effect, and letting $\mu_j$ denote the effect due to the $j$th trial, we assume a linear model with mean structure

$$
E(y_{ij\alpha}) = \mu_j + \tau_i, \qquad \alpha = 1, \ldots, n_{ij}.
$$

Now let us combine all the responses from the $j$th trial into one linear model. For this, let $y_{ij\cdot}$ denote the vector of $n_{ij}$ responses on the $i$th treatment in the $j$th trial, so that $y_{\cdot j\cdot} = (y_{1j\cdot}', \ldots, y_{mj\cdot}')'$ is the vector of all the $n_{\cdot j} = \sum_{i=1}^{m} n_{ij}$ responses from the $j$th trial. Letting $D_j = \mathrm{diag}(1_{n_{1j}}, \ldots, 1_{n_{mj}})$, with $1_r$ the $r$-component vector of ones, and letting $\tau = (\tau_1, \ldots, \tau_m)'$ be the vector of all the treatment effects, we have the linear model for the $j$th trial

$$
E(y_{\cdot j\cdot}) = \mu_j 1_{n_{\cdot j}} + D_j\tau, \qquad \mathrm{Cov}(y_{\cdot j\cdot}) = V_j. \tag{14}
$$

Note that we allow for a general positive definite within-trial covariance matrix $V_j$. Furthermore, the trial effects as well as the treatment effects are all assumed to be fixed in this setup.

Suppose that we are interested in estimating simultaneously all the treatment-control differences. Letting the $m$th treatment represent the control, we thus wish to estimate $\theta = (\tau_1 - \tau_m, \ldots, \tau_{m-1} - \tau_m)'$. As $\tau = (\theta', 0)' + \tau_m 1_m$, the mean structure in (14) can be rewritten in the form

$$
E(y_{\cdot j\cdot}) = D_j^{*}\theta + \delta_j 1_{n_{\cdot j}}, \tag{15}
$$

where $D_j^{*}$ is obtained from the matrix $D_j$ by removing the last column and appending an appropriately-sized zero matrix at the bottom, and $\delta_j = \mu_j + \tau_m$.

From (15) it is thus clear that we have a special case of model (3), wherein $\delta_j$ is a single nuisance parameter specific to the $j$th trial. In the notation of Section 3, the problem of estimating $\theta$ is thus one of estimating $L\theta$, with $L$ the identity matrix. As pointed out in Section 3, the one-step and two-step IPD meta-analysis estimators, therefore, coincide. This is the result established by Olkin and Sampson (1998) for $V_j = \sigma^2 I_{n_{\cdot j}}$ and by Mathew and Nordström (1999) for general positive definite $V_j$.


4.2 A model with fixed treatment effects and random trial effects

Now consider the situation in which the treatment effects are fixed, but the trial effects $\mu_j$ are considered random. Thus, $\delta_j$ in (15) is random, and we make the usual normality assumption $\delta_j \sim N(\delta, \sigma_\delta^2)$. We also assume that the $\delta_j$s are independent of the sampling error vectors, assumed to follow normal distributions $N(0, \sigma_j^2 I_{n_{\cdot j}})$, and that the $\delta_j$s corresponding to the different trials are independent.

For the responses from the $j$th trial, we thus have the linear model

$$E(y_{\cdot j\cdot}) = W_j\boldsymbol{\beta}, \quad \mathrm{Cov}(y_{\cdot j\cdot}) = \sigma_\delta^2 1_{n_{\cdot j}} 1_{n_{\cdot j}}' + \sigma_j^2 I_{n_{\cdot j}} = V_j \ \text{(say)}, \qquad (16)$$

where $W_j = (D_j^{*}, 1_{n_{\cdot j}})$ and $\boldsymbol{\beta} = (\boldsymbol{\theta}', \delta)'$. Note that the models corresponding to the different trials now contain $\delta$ as a common nuisance parameter in the mean and that the matrix $Z_j$ appearing in model (3) is now absent. Our problem of estimating the vector of treatment-control differences $\boldsymbol{\theta}$ now reduces to that of estimating $L\boldsymbol{\beta}$ for $L = (I_{m-1}, 0)$. From the structure of the matrix $V_j$ in (16), it follows that

$$V_j^{-1} = \frac{1}{\sigma_j^2}\left( I_{n_{\cdot j}} - \frac{\sigma_\delta^2/\sigma_j^2}{1 + n_{\cdot j}\,\sigma_\delta^2/\sigma_j^2}\, 1_{n_{\cdot j}} 1_{n_{\cdot j}}' \right).$$
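As a numerical sanity check of the closed-form inverse above, the following minimal sketch builds the compound-symmetric matrix of (16) for a single trial and multiplies it by the stated inverse. The function names and the values chosen for the trial size and the two variance components are illustrative assumptions, not quantities taken from the paper.

```python
def compound_symmetry(n, sigma2_j, sigma2_delta):
    """V_j = sigma2_delta * 1 1' + sigma2_j * I for a trial with n observations."""
    return [[sigma2_delta + (sigma2_j if r == c else 0.0) for c in range(n)]
            for r in range(n)]


def closed_form_inverse(n, sigma2_j, sigma2_delta):
    """Inverse of V_j according to the closed-form expression in the text."""
    ratio = sigma2_delta / sigma2_j
    shrink = ratio / (1.0 + n * ratio)
    return [[((1.0 if r == c else 0.0) - shrink) / sigma2_j for c in range(n)]
            for r in range(n)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))]
            for r in range(len(a))]


def max_abs_deviation_from_identity(m):
    """Largest entrywise distance between a square matrix and the identity."""
    size = len(m)
    return max(abs(m[r][c] - (1.0 if r == c else 0.0))
               for r in range(size) for c in range(size))


# Hypothetical values, chosen only for illustration.
n, sigma2_j, sigma2_delta = 5, 2.0, 0.7
V = compound_symmetry(n, sigma2_j, sigma2_delta)
V_inv = closed_form_inverse(n, sigma2_j, sigma2_delta)
deviation = max_abs_deviation_from_identity(matmul(V, V_inv))
print(deviation)  # numerically zero if the closed form is correct
```

The product agrees with the identity up to floating-point rounding, as expected from the Sherman-Morrison formula for a rank-one update of a scaled identity matrix.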

As $L = (I_{m-1}, 0)$, the matrix $L_1$ in our Result is now the row vector $l_1' = (0, \ldots, 0, 1)$. Also note that we now have $\mathcal{I}^{(j)}_{11\cdot 2} = W_j' V_j^{-1} W_j$, which simplifies using the expressions for $W_j$ and $V_j^{-1}$ given above.

After some straightforward computations and simplifying condition (ii) in our Result, the two estimators are seen to coincide if and only if the vectors $\mathbf{n}_j/n_{\cdot j}$, with $\mathbf{n}_j = (n_{1j}, \ldots, n_{mj})'$, are all equal for $j = 1, \ldots, k$.

The condition thus states that the fraction of observations corresponding to any given treatment be the same across trials. It is interesting to note that this condition is free of the variance components $\sigma_\delta^2$ and $\sigma_j^2$, $j = 1, \ldots, k$. The condition is clearly satisfied if the vectors $\mathbf{n}_j$ are the same across the trials, i.e. if the $i$th treatment is replicated the same number of times in all the trials ($i = 1, \ldots, m$). The condition also holds if all the treatments are replicated the same number of times within a trial, i.e. we have balanced data within each trial.
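The condition on the allocation fractions can be checked directly from the sample sizes. The sketch below is illustrative only: the function names and the sample sizes are hypothetical, and exact rational arithmetic is used so that equality of the fractions is not blurred by rounding.

```python
from fractions import Fraction


def allocation_fractions(n_j):
    """Fractions n_ij / n_.j of one trial's observations on each treatment."""
    total = sum(n_j)
    return tuple(Fraction(n_ij, total) for n_ij in n_j)


def fractions_agree(sample_sizes):
    """True when every trial allocates the same fraction to each treatment.

    sample_sizes is a list with one tuple (n_1j, ..., n_mj) per trial.
    """
    distinct = {allocation_fractions(n_j) for n_j in sample_sizes}
    return len(distinct) == 1


# Hypothetical designs with m = 2 treatments.
balanced_within_trials = [(10, 10), (25, 25), (4, 4)]
same_ratio_across_trials = [(10, 20), (5, 10)]
reversed_ratio = [(10, 20), (20, 10)]

print(fractions_agree(balanced_within_trials))    # True
print(fractions_agree(same_ratio_across_trials))  # True
print(fractions_agree(reversed_ratio))            # False
```

The first two designs satisfy the condition even though the trial sizes differ; the third does not, because the fraction allocated to the first treatment changes from 1/3 to 2/3.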

On the issue of fixed versus random trial and/or treatment effects, there appears to be some controversy. Indeed, Higgins et al. (2001, p. 2225) comment that "Incorporating trial effects as random parameters is controversial in the field of meta-analysis... We believe the assumption of random trial effects is a degree less plausible than that of random treatment differences...". This question is also considered in Whitehead (2003, Chapter 5). Our view is that, strictly speaking, this is not an issue that can be decided upon based on formal statistical grounds only; subject-matter concerns dictate to some extent the proper model to be employed.

However, quite generally, when modelling an effect as random, one typically envisions an underlying hypothetical population from which the effects under study form a sample. Thus, random treatment effects (and their differences) should be interpreted as a sample from a multitude of possible treatment effects, while random trial effects would constitute a sample from a population of potential trials. From this point of view, the latter seems rather more plausible, contradicting the conclusion above. The underlying question is, of course, how to model the (possible) heterogeneity of treatment effects between trials (between-study variability), and what is the reference population about which inferences will be drawn.

The choice of fixed versus random components in the model should in any case be based on such broader considerations rather than issues of computational ease, extraction of information, or conventional practice and protocols. Thus, while we have shown above that, from the point of view of accuracy of estimation, treating the trial effects as fixed is advantageous (as the two-step and one-step IPD meta-analysis estimators coincide), we are not advocating that trial effects be treated as fixed based on this finding.

Biometrical Journal 52 (2010) 2 279


4.3 A model where trial effects are absent

If it is believed that heterogeneity across trials is not significant, one can ignore the trial effects $\mu_j$ included above (assuming them to be equal), and analyze the data using a model that involves only the treatment effects. Suppose we are again interested in estimating the full vector of treatment-control differences $\boldsymbol{\theta} = (\tau_1 - \tau_m, \ldots, \tau_{m-1} - \tau_m)'$. The model for the data from the $j$th trial now takes the form

$$E(y_{\cdot j\cdot}) = \beta_\mu 1_{n_{\cdot j}} + D_j^{*}\boldsymbol{\theta} = W_j\boldsymbol{\beta},$$

where $\beta_\mu$ is a common intercept term across the trials, $D_j^{*}$ and $W_j$ are as defined above, and $\boldsymbol{\beta} = (\boldsymbol{\theta}', \beta_\mu)'$. Our problem is thus again the estimation of $L\boldsymbol{\beta}$ with $L = (I_{m-1}, 0)$. Assuming

$$\mathrm{Cov}(y_{\cdot j\cdot}) = \sigma_j^2 I_{n_{\cdot j}},$$

and proceeding as before, one concludes that the two estimators coincide if and only if the vectors $\mathbf{n}_j/n_{\cdot j}$ are all equal for $j = 1, \ldots, k$. Interestingly, the one-step and two-step IPD meta-analysis estimators fail to coincide even in this simple model, unless the sample sizes satisfy the condition above. Note that, for the models considered above, the same condition and conclusion hold if we consider the estimation of any set of $m - 1$ linearly independent treatment contrasts, the full vector of treatment-control differences being one such set.

When the vectors $\mathbf{n}_j/n_{\cdot j}$ are not all equal, there will be loss of precision in using the two-step IPD meta-analysis estimator, and the loss can be assessed, for example, by computing the quantity in (12) or (13). We note that, apart from its dependence on the $n_{ij}$s, the loss of precision will also depend on the magnitudes of the within-trial variances $\sigma_j^2$.

To gauge the loss of precision in interval estimation, let us consider the problem of estimating a single treatment-control difference in the model where the trial effects are absent. Note that Examples 1 and 2 both fall within this scenario. Let $\bar{y}_{1j}$ and $\bar{y}_{2j}$ denote the sample means for the treatment and the control from the $j$th trial, respectively, and let $n_{1j}$ and $n_{2j}$ denote the corresponding sample sizes, $j = 1, \ldots, k$. Assuming normality, we thus have

$$\bar{y}_{1j} \sim N\!\left(\mu_1, \frac{\sigma_j^2}{n_{1j}}\right), \qquad \bar{y}_{2j} \sim N\!\left(\mu_2, \frac{\sigma_j^2}{n_{2j}}\right),$$

where $\mu_1$ and $\mu_2$ are the effects due to the treatment and control, respectively, and $\sigma_j^2$ is the within-trial variance for the $j$th trial, $j = 1, \ldots, k$. Let us assume that the within-trial variances are known. The estimator of $\theta = \mu_1 - \mu_2$ in the $j$th trial is $\bar{y}_{1j} - \bar{y}_{2j}$, with distribution

$$\bar{y}_{1j} - \bar{y}_{2j} \sim N\!\left(\theta,\ \sigma_j^2\,\frac{n_{1j} + n_{2j}}{n_{1j} n_{2j}}\right).$$

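Hence, the two-step IPD meta-analysis estimator of $\theta$ is obtained as the weighted linear combination

$$\tilde{\theta} = \left(\sum_{j=1}^{k} \frac{1}{\sigma_j^2}\,\frac{n_{1j} n_{2j}}{n_{1j} + n_{2j}}\right)^{-1} \left(\sum_{j=1}^{k} \frac{1}{\sigma_j^2}\,\frac{n_{1j} n_{2j}}{n_{1j} + n_{2j}}\,(\bar{y}_{1j} - \bar{y}_{2j})\right).$$

By contrast, the one-step IPD meta-analysis estimator (which is also the BLUE) of $\theta$ is obtained as

$$\hat{\theta} = \left(\sum_{j=1}^{k} \frac{n_{1j}}{\sigma_j^2}\right)^{-1} \sum_{j=1}^{k} \frac{n_{1j}\bar{y}_{1j}}{\sigma_j^2} \;-\; \left(\sum_{j=1}^{k} \frac{n_{2j}}{\sigma_j^2}\right)^{-1} \sum_{j=1}^{k} \frac{n_{2j}\bar{y}_{2j}}{\sigma_j^2}.$$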
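To make the two formulas concrete, here is a minimal sketch that evaluates both estimators from per-trial summary data. The function names, the dictionary keys and all numerical values are hypothetical and serve only to illustrate the formulas; the within-trial variances are treated as known, as in the text.

```python
def two_step_estimate(trials):
    """Inverse-variance weighted mean of the within-trial differences."""
    weights = [t["n1"] * t["n2"] / ((t["n1"] + t["n2"]) * t["sigma2"])
               for t in trials]
    diffs = [t["ybar1"] - t["ybar2"] for t in trials]
    return sum(w * d for w, d in zip(weights, diffs)) / sum(weights)


def one_step_estimate(trials):
    """Difference of the precision-weighted arm means pooled over all trials."""
    w1 = [t["n1"] / t["sigma2"] for t in trials]
    w2 = [t["n2"] / t["sigma2"] for t in trials]
    mean1 = sum(w * t["ybar1"] for w, t in zip(w1, trials)) / sum(w1)
    mean2 = sum(w * t["ybar2"] for w, t in zip(w2, trials)) / sum(w2)
    return mean1 - mean2


# Hypothetical summary data: equal allocation fractions in every trial.
balanced = [
    {"n1": 20, "n2": 20, "ybar1": 5.0, "ybar2": 4.0, "sigma2": 1.0},
    {"n1": 10, "n2": 10, "ybar1": 6.0, "ybar2": 4.5, "sigma2": 2.0},
]
# Hypothetical summary data: allocation fractions differ between trials.
unbalanced = [
    {"n1": 10, "n2": 30, "ybar1": 5.0, "ybar2": 4.0, "sigma2": 1.0},
    {"n1": 30, "n2": 10, "ybar1": 6.0, "ybar2": 4.5, "sigma2": 1.0},
]

print(two_step_estimate(balanced), one_step_estimate(balanced))
print(two_step_estimate(unbalanced), one_step_estimate(unbalanced))
```

With equal allocation fractions the two estimates agree (both equal 1.1 here, up to rounding), whereas in the unbalanced design the two-step estimate is 1.25 and the one-step estimate is 1.625.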


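The corresponding variances are

$$\mathrm{Var}(\tilde{\theta}) = \left(\sum_{j=1}^{k} \frac{1}{\sigma_j^2}\,\frac{n_{1j} n_{2j}}{n_{1j} + n_{2j}}\right)^{-1}$$

and

$$\mathrm{Var}(\hat{\theta}) = \left(\sum_{j=1}^{k} \frac{n_{1j}}{\sigma_j^2}\right)^{-1} + \left(\sum_{j=1}^{k} \frac{n_{2j}}{\sigma_j^2}\right)^{-1}.$$

As the variances $\sigma_j^2$ are assumed known, confidence intervals for $\theta$ can now be constructed using standard normal percentiles, and the ratio of the lengths of the confidence intervals based on the two estimators is simply

$$\left\{\mathrm{Var}(\tilde{\theta}) / \mathrm{Var}(\hat{\theta})\right\}^{1/2}.$$

If the $\sigma_j^2$s are all equal, the above quantity becomes

$$\left(\sum_{j=1}^{k} \frac{n_{1j} n_{2j}}{n_{1j} + n_{2j}}\right)^{-1/2} \Bigg/ \left[\left(\sum_{j=1}^{k} n_{1j}\right)^{-1} + \left(\sum_{j=1}^{k} n_{2j}\right)^{-1}\right]^{1/2}.$$

It is possible to make extreme choices of the $n_{ij}$s to produce large values of the above ratio. In any case, it is clear that the use of the two-step IPD meta-analysis estimator may result in considerable loss of precision.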

4.4 Models with a single patient-level covariate

For illustrating the effect of covariate heterogeneity on the equality of the one-step and two-step IPD meta-analysis estimators, let us consider a simple model with one covariate. We shall consider two versions of the model: one without treatment-covariate interaction, where the problem is that of estimating a common mean treatment difference among all trials, and a second model where such an interaction is present, and the problem is that of estimating the interaction parameter.

4.4.1 A model without treatment-covariate interaction

Let $y_{ij}$ denote the outcome for the $i$th patient in the $j$th trial, and consider the model given by

$$y_{ij} = \beta_{0j} + \beta_1 x_{1ij} + \beta_2 x_{2ij} + e_{ij}, \qquad (17)$$

with $i = 1, \ldots, n_j$ and $j = 1, \ldots, k$. The $\beta_{0j}$s are trial effects assumed here to be fixed, $x_{1ij}$ is a dummy variable that indicates treatment group (treatment or control), and the $x_{2ij}$s are the values of a single covariate; cf. Sections 3.1 and 3.3 in Higgins et al. (2001). The parameter $\beta_1$ represents the common mean treatment difference among the trials, while $\beta_2$ is the regression coefficient. We assume that the $e_{ij}$s are independent and identically distributed random variables with mean zero and variance $\sigma^2$. The problem is the estimation of $\beta_1$.

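Define

$$\bar{x}_{1j} = \frac{1}{n_j}\sum_{i=1}^{n_j} x_{1ij}, \quad \bar{x}_{2j} = \frac{1}{n_j}\sum_{i=1}^{n_j} x_{2ij}, \quad u_{1ij} = x_{1ij} - \bar{x}_{1j}, \quad u_{2ij} = x_{2ij} - \bar{x}_{2j}, \quad \text{and} \quad \delta_{0j} = \beta_{0j} + \beta_1\bar{x}_{1j} + \beta_2\bar{x}_{2j}. \qquad (18)$$

Let $y_j = (y_{1j}, \ldots, y_{n_j j})'$, with $e_j$ defined similarly. In addition, with the $u_{1ij}$s and $u_{2ij}$s defined by (18), let $u_{1j} = (u_{11j}, \ldots, u_{1 n_j j})'$ and let $u_{2j}$ be defined similarly. In terms of these quantities, the model (17)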


for the outcomes from the $j$th trial can be rewritten as

$$y_j = \delta_{0j} 1_{n_j} + \beta_1 u_{1j} + \beta_2 u_{2j} + e_j, \qquad (19)$$

where $1_{n_j}$ is the column vector of $n_j$ ones.

Note that $u_{1j}' 1_{n_j} = 0$ and $u_{2j}' 1_{n_j} = 0$. These orthogonality conditions are helpful for simplifying condition (i) or (ii) in our Result. It should be clear that model (19) is a special case of model (3). Straightforward algebra now shows that the one-step and two-step estimators of $\beta_1$ coincide when the quantities $u_{1j}' u_{2j} / u_{2j}' u_{2j}$ are equal for all $j = 1, \ldots, k$. This condition is verified by simplifying condition (i) or (ii) of the Result in Section 3. Using the definition of $u_{1ij}$ and $u_{2ij}$, the condition amounts to equality of the quantities

$$\sum_{i=1}^{n_j} (x_{1ij} - \bar{x}_{1j})(x_{2ij} - \bar{x}_{2j}) \Bigg/ \sum_{i=1}^{n_j} (x_{2ij} - \bar{x}_{2j})^2, \qquad (20)$$

for all $j = 1, \ldots, k$. Equality of the quantities in (20) clearly implies a certain level of homogeneity with respect to the treatment allocation and the covariates across trials. Patient-level homogeneity of the covariate within each trial obviously implies the required condition. Such restrictions on the covariates are clearly unrealistic in practice. Thus, equality of the two estimators is unlikely to hold when covariates are present.
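The quantity in (20) is the within-trial slope obtained by regressing the treatment indicator on the covariate, so the condition can be checked trial by trial. The sketch below is a hedged illustration: the function names, the tolerance and the three small data sets are hypothetical and are not taken from the paper.

```python
def condition_20_quantity(x1, x2):
    """Quantity (20) for one trial: centred cross-products over covariate SS."""
    n = len(x1)
    mean1 = sum(x1) / n
    mean2 = sum(x2) / n
    cross = sum((a - mean1) * (b - mean2) for a, b in zip(x1, x2))
    ss_cov = sum((b - mean2) ** 2 for b in x2)
    return cross / ss_cov


def covariate_condition_holds(trials, tol=1e-12):
    """True when quantity (20) is equal (up to tol) across all trials."""
    values = [condition_20_quantity(x1, x2) for x1, x2 in trials]
    return max(values) - min(values) <= tol


# Each hypothetical trial is (treatment indicators, covariate values).
trial_a = ([1, 1, 0, 0], [1.0, 2.0, 1.0, 2.0])  # covariate balanced over arms
trial_b = ([1, 1, 0, 0], [3.0, 5.0, 3.0, 5.0])  # covariate balanced over arms
trial_c = ([1, 1, 0, 0], [2.0, 3.0, 1.0, 2.0])  # treated patients have higher values

print(condition_20_quantity(*trial_a))                 # 0.0
print(condition_20_quantity(*trial_c))                 # 0.5
print(covariate_condition_holds([trial_a, trial_b]))   # True
print(covariate_condition_holds([trial_a, trial_c]))   # False
```

Trials in which the covariate is uncorrelated with treatment allocation all give the value zero, so they satisfy the condition jointly; a single trial with covariate imbalance between the arms is enough to break it.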

4.4.2 A model involving treatment-covariate interaction

Using the notation above, the model we shall consider is given by

$$y_{ij} = \beta_{0j} + \beta_1 x_{1ij} + \beta_2 x_{2ij} + \gamma x_{1ij} x_{2ij} + e_{ij}, \qquad (21)$$

$i = 1, \ldots, n_j$, $j = 1, \ldots, k$; this is model (3) in Simmonds and Higgins (2007). Here, $\gamma$ is the interaction parameter, and the remaining quantities are as defined before. Again, we assume that the $e_{ij}$s are independent and identically distributed random variables with mean zero and variance $\sigma^2$, and consider now the problem of estimating the interaction parameter $\gamma$.

The one-step and two-step IPD meta-analysis estimators of $\gamma$, say $\hat{\gamma}$ and $\tilde{\gamma}$, respectively, are given in Sections 3.2 and 3.3 in Simmonds and Higgins (2007), along with their variances. Letting $N = \sum_{j=1}^{k} n_j$, $\bar{x}_{2j} = \frac{1}{n_j}\sum_{i=1}^{n_j} x_{2ij}$, and $\bar{x}_2 = \frac{1}{N}\sum_{j=1}^{k}\sum_{i=1}^{n_j} x_{2ij}$, the variances are given by

$$\mathrm{Var}(\hat{\gamma}) = 4\sigma^2 \Bigg/ \sum_{j=1}^{k}\sum_{i=1}^{n_j} (x_{2ij} - \bar{x}_2)^2 \quad \text{and} \quad \mathrm{Var}(\tilde{\gamma}) = 4\sigma^2 \Bigg/ \sum_{j=1}^{k}\sum_{i=1}^{n_j} (x_{2ij} - \bar{x}_{2j})^2.$$

These two variances can be compared directly without appealing to our Result. As

$$\sum_{j=1}^{k}\sum_{i=1}^{n_j} (x_{2ij} - \bar{x}_2)^2 = \sum_{j=1}^{k}\sum_{i=1}^{n_j} (x_{2ij} - \bar{x}_{2j})^2 + \sum_{j=1}^{k} n_j (\bar{x}_{2j} - \bar{x}_2)^2, \qquad (22)$$

it is clear that $\mathrm{Var}(\hat{\gamma}) \le \mathrm{Var}(\tilde{\gamma})$, as expected. From (22), we also see that the two variances are equal only when the $\bar{x}_{2j}$s are equal. That is, the mean covariate value is the same across trials. Clearly, such a condition cannot be expected to hold in practice, and hence the two estimators of $\gamma$ differ in general.
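The decomposition (22) and its consequence for the two variances can be verified on a small data set. In the sketch below the function names and the covariate values are hypothetical, and the variance expressions $4\sigma^2$ divided by the total or within-trial sum of squares are simply taken as stated above.

```python
def sums_of_squares(covariate_by_trial):
    """Total, within-trial and between-trial sums of squares of the covariate."""
    all_values = [x for trial in covariate_by_trial for x in trial]
    grand_mean = sum(all_values) / len(all_values)
    trial_means = [sum(trial) / len(trial) for trial in covariate_by_trial]
    total = sum((x - grand_mean) ** 2 for x in all_values)
    within = sum((x - m) ** 2
                 for trial, m in zip(covariate_by_trial, trial_means)
                 for x in trial)
    between = sum(len(trial) * (m - grand_mean) ** 2
                  for trial, m in zip(covariate_by_trial, trial_means))
    return total, within, between


def interaction_variances(covariate_by_trial, sigma2):
    """Variances of the one-step and two-step interaction estimators."""
    total, within, _ = sums_of_squares(covariate_by_trial)
    return 4.0 * sigma2 / total, 4.0 * sigma2 / within


# Hypothetical covariate values for k = 2 trials with different means.
covariate = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
total, within, between = sums_of_squares(covariate)
var_one_step, var_two_step = interaction_variances(covariate, sigma2=1.0)

print(total, within, between)      # 17.5 4.0 13.5
print(var_one_step, var_two_step)
```

Here most of the covariate variation lies between the trials, which the two-step approach cannot use, so its variance (1.0) is much larger than that of the one-step estimator (4/17.5).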

5 Examples

To see the practical implications of the Result in Section 3, let us return to the two examples in Section 1 in greater detail.


5.1 Example 1 (continued)

Consider the data given in Table 1, and let $y_{Aj}$ and $y_{Bj}$ denote the sample mean log-recovery times for anesthetics A and B, respectively, for the $j$th centre. In addition, let $n_{Aj}$ and $n_{Bj}$ denote the corresponding sample sizes, $j = 1, \ldots, 9$. The values of these quantities are all given in Table 1. We shall consider the model (1) for the IPD at the $j$th centre, and the resulting model (2) for the IPD from all the centres.

Our problem is that of estimating $\mu_A - \mu_B$, the difference in population mean log-recovery times. The variance $\sigma_j^2$ can be estimated by pooling the pair of sample variances for each centre. Let $s_j^2$ denote the estimator so obtained. The estimator of $\mu_A - \mu_B$ from the $j$th centre is simply $y_{Aj} - y_{Bj}$, with estimated variance $s_j^2\left(\frac{1}{n_{Aj}} + \frac{1}{n_{Bj}}\right)$. A weighted linear combination of the $y_{Aj} - y_{Bj}$ thus gives the two-step IPD meta-analysis estimator of $\mu_A - \mu_B$, and this estimator has the value 0.627, with standard error 0.099; see Whitehead (2003, p. 85).

On the other hand, based on model (2), one obtains the one-step IPD meta-analysis estimator of $\mu_A - \mu_B$. This estimator has the value 0.6791, with standard error 0.0982. Using standard normal percentiles, one obtains the following 95% confidence intervals for $\mu_A - \mu_B$ based on the two-step and one-step IPD meta-analysis estimators, respectively: (0.433, 0.821) and (0.486, 0.874).

Note that the fraction of observations for anesthetic A is nearly (but not exactly) the same across the centres. Thus, the two estimators are not expected to be too different. However, note that the two confidence intervals are somewhat different.

5.2 Example 2 (continued)

For the data in Table 2, we assume that the variability in costs differs across trials. We also assume that the variability in costs for patients treated by counsellors differs from the variability in costs for patients who remained under the care of a general practitioner. We are interested in estimating $\mu_1 - \mu_2$, with $\mu_1$ and $\mu_2$ denoting the population average cost for patients treated by counsellors and for patients who remained under the care of a general practitioner, respectively.

Under the above assumptions on the variances, the two-step and one-step IPD meta-analysis estimators of $\mu_1 - \mu_2$ can be obtained by direct computation, along with their standard errors. The two-step IPD meta-analysis estimator is 94.09, with standard error 17.468, and the one-step IPD meta-analysis estimator is 117.050, with standard error 15.918. Assuming normality, the corresponding 95% confidence intervals are (59.86, 128.33) using the two-step IPD meta-analysis estimator, and (85.85, 148.25) using the one-step IPD meta-analysis estimator. The estimates, as well as the confidence intervals, are now drastically different, due in large measure to the fact that in Trial 2 the fraction of patients treated by a counsellor is quite different from that in the other studies.

The normality assumption is actually not appropriate for these data, and therefore Bower et al. (2003) use the bootstrap to compute a confidence interval for $\mu_1 - \mu_2$ based on the two-step IPD meta-analysis estimator. The confidence intervals are constructed above to pinpoint the consequences of using the two-step IPD meta-analysis estimator when the fractions of observations corresponding to a treatment vary considerably.
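For readers who wish to retrace the interval computations, the following minimal sketch rebuilds the normal-theory 95% intervals of Example 2 from the reported estimates and standard errors. The helper name `normal_ci` is hypothetical; only the four reported numbers come from the text.

```python
from statistics import NormalDist


def normal_ci(estimate, standard_error, level=0.95):
    """Two-sided normal-theory confidence interval for a given level."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return estimate - z * standard_error, estimate + z * standard_error


# Estimates and standard errors as reported for Example 2.
two_step_ci = normal_ci(94.09, 17.468)
one_step_ci = normal_ci(117.050, 15.918)

print(two_step_ci)
print(one_step_ci)
```

The computed endpoints agree with the intervals quoted above to within about 0.01, a difference attributable to rounding of the reported estimates.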

6 Discussion

For meta-analysis problems where summary data need to be combined to estimate common effects across trials, this article gives a general condition under which such an estimator coincides with the one-step IPD meta-analysis estimator, assuming that a linear model is appropriate for the data at hand. In particular, it is noted that there is no loss of efficiency when there are no common nuisance parameters in the mean vector across the different trials, i.e. when the parameter vector of interest consists of all the parameters that are common to the mean vectors across the trials. However, when


there are common nuisance parameters, loss of efficiency will result, unless the condition stated in the Result in Section 3 is satisfied. When estimating contrasts among the treatment effects, common nuisance parameters do occur in some frequently used models, and the condition for no loss of efficiency is a function of the sample sizes only. When covariates are present, the condition will also depend on the values of the covariates. As the values of covariates are very often beyond the control of the researcher, it is unlikely that the one-step and two-step IPD meta-analysis estimators will coincide.

In trials involving the comparison of a single treatment with a single control, it is typical to have balance (or near balance, as in Example 1) within each trial. If so, the summary data will provide a fully efficient estimator (or a nearly fully efficient estimator) of the treatment-control difference when covariates are absent. Although not reported here, it may perhaps be of some interest to compute the maximum loss of efficiency, the maximum being computed with respect to some of the parameters in the model such as the variance components. One can compute the maximum of the determinantal ratio in (13), for example, and such a maximum would characterize the worst-case scenario in terms of loss of efficiency.

In our investigation, we have made the assumption (common in meta-analysis) that the covariance matrices $V_j$, $j = 1, \ldots, k$, appearing in (3) and (7), are known. In reality, the covariance matrices must, of course, be estimated, and the covariance inequality (11) and the conditions in the Result guaranteeing equality therein do not apply as such. Indeed, when the matrices $V_j$ involve unknown parameters, both the one-step estimator $\hat{\boldsymbol{\theta}}$ and the two-step estimator $\tilde{\boldsymbol{\theta}}$ as well as their covariance matrices depend on these parameters. Therefore, we are no longer able to conclude the covariance inequality (11) by referring to the optimality (i.e. minimum variance) of one linear unbiased estimator over another.

Thus, when estimating within-trial variances (and perhaps covariances) simultaneously in the one-step approach versus estimating the within-trial variances separately in each trial and pooling estimates in the two-step approach, one is quite likely to arrive at conclusions which seem to contradict our Result. An example of this is shown in Jones et al. (2009), where the two-step IPD meta-analysis estimates have a very slightly smaller standard error than the one-step IPD meta-analysis estimates. We therefore reiterate that our Result is a theoretical result giving a complete description of the linear models under which the one-step and two-step IPD meta-analysis estimators coincide, assuming that the covariance matrices are known. It would be interesting to consider what (if anything) can be said about this problem when parameters in the covariances are estimated. Another interesting and useful extension would be to consider other types of data such as binary or ordinal data using, for example, generalized linear models.

Acknowledgements The article has benefited greatly from the thorough and detailed comments from two reviewers. Their suggestions have resulted in the clarification of ideas, inclusion of several relevant references, and an improved presentation of the results.

Conflicts of Interest

The authors have declared no conflict of interest.

Appendix: Proof of the result

For notational convenience, let us write

$$\mathcal{I}^{(j)}_{11\cdot 2} = A_j, \quad j = 1, \ldots, k. \qquad (A.1)$$


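The $A_j$s are thus $p \times p$ matrices. Recall that $L$ is an $s \times p$ matrix of rank $s$. As both $\hat{\boldsymbol{\theta}}$ and $\tilde{\boldsymbol{\theta}}$ are unbiased estimators of $\boldsymbol{\theta}$, and as $\hat{\boldsymbol{\theta}}$ is the BLUE, it follows that $\hat{\boldsymbol{\theta}}$ and $\tilde{\boldsymbol{\theta}}$ coincide if and only if the corresponding covariance matrices are equal. Thus, to prove the result, we have to show that

$$\left(\sum_{j=1}^{k} (L A_j^{-1} L')^{-1}\right)^{-1} = L\left(\sum_{j=1}^{k} A_j\right)^{-1} L', \qquad (A.2)$$

if and only if the matrices $(L A_j^{-1} L')^{-1} L A_j^{-1}$ are all equal, or equivalently, if and only if the matrices $(L_1 A_j L_1')^{-1} L_1 A_j L'$ are all equal for $j = 1, \ldots, k$, where $L_1$ is a $(p - s) \times p$ matrix of rank $p - s$, satisfying $L_1 L' = 0$.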

To begin with, note that we can assume $LL' = I_s$, without loss of generality. This follows from the observation that if (A.2) holds for a particular $L$, then (A.2) holds also when $L$ is replaced with $\tilde{L} = (LL')^{-1/2} L$ satisfying $\tilde{L}\tilde{L}' = I_s$. Thus from now on we shall assume, without loss of generality, that $LL' = I_s$. Now let $L_1$ be a $(p - s) \times p$ matrix of rank $p - s$, satisfying $L_1 L' = 0$, so that $L_0' = (L', L_1')$ is a $p \times p$ orthogonal matrix. Define

$$B_j = L_0 A_j L_0' = \begin{pmatrix} L A_j L' & L A_j L_1' \\ L_1 A_j L' & L_1 A_j L_1' \end{pmatrix} = \begin{pmatrix} B_{11}^{(j)} & B_{12}^{(j)} \\ B_{21}^{(j)} & B_{22}^{(j)} \end{pmatrix} \ \text{(say)} \qquad (A.3)$$

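for $j = 1, \ldots, k$. Now (A.2) can be re-expressed as

$$\left(\sum_{j=1}^{k} (L L_0' B_j^{-1} L_0 L')^{-1}\right)^{-1} = L L_0' \left(\sum_{j=1}^{k} B_j\right)^{-1} L_0 L',$$

or, equivalently, as

$$\left(\sum_{j=1}^{k} \bigl((I_s, 0) B_j^{-1} (I_s, 0)'\bigr)^{-1}\right)^{-1} = (I_s, 0)\left(\sum_{j=1}^{k} B_j\right)^{-1} (I_s, 0)'. \qquad (A.4)$$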

To arrive at (A.4), we have used the observation that $L L_0' = (I_s, 0)$, in view of the orthogonality of the matrix $L_0$. As the matrix $B_j$ has the partitioned form in (A.3), and using the expression for the inverse of a partitioned matrix, the terms in (A.4) can be written as

$$(I_s, 0) B_j^{-1} (I_s, 0)' = \left(B_{11}^{(j)} - B_{12}^{(j)} \bigl(B_{22}^{(j)}\bigr)^{-1} B_{21}^{(j)}\right)^{-1},$$

$$(I_s, 0)\left(\sum_{j=1}^{k} B_j\right)^{-1} (I_s, 0)' = \left\{\sum_{j=1}^{k} B_{11}^{(j)} - \left(\sum_{j=1}^{k} B_{12}^{(j)}\right)\left(\sum_{j=1}^{k} B_{22}^{(j)}\right)^{-1}\left(\sum_{j=1}^{k} B_{21}^{(j)}\right)\right\}^{-1}.$$

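Consequently, (A.4) can be written as

$$\sum_{j=1}^{k} \left(B_{11}^{(j)} - B_{12}^{(j)} \bigl(B_{22}^{(j)}\bigr)^{-1} B_{21}^{(j)}\right) = \sum_{j=1}^{k} B_{11}^{(j)} - \left(\sum_{j=1}^{k} B_{12}^{(j)}\right)\left(\sum_{j=1}^{k} B_{22}^{(j)}\right)^{-1}\left(\sum_{j=1}^{k} B_{21}^{(j)}\right),$$

or, equivalently, as

$$\sum_{j=1}^{k} B_{12}^{(j)} \bigl(B_{22}^{(j)}\bigr)^{-1} B_{21}^{(j)} = \left(\sum_{j=1}^{k} B_{12}^{(j)}\right)\left(\sum_{j=1}^{k} B_{22}^{(j)}\right)^{-1}\left(\sum_{j=1}^{k} B_{21}^{(j)}\right). \qquad (A.5)$$

Thus, to obtain a condition under which (A.2) holds, we have to obtain a condition under which (A.5) holds, with $B_{11}^{(j)}$, $B_{12}^{(j)}$, $B_{21}^{(j)}$, and $B_{22}^{(j)}$ the blocks of the partitioned matrix defined in (A.3).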


We shall now use a result due to Kiefer (1959) (Lemma 3.2), which states that, given a set of scalars $\lambda_j$, $j = 1, \ldots, k$, satisfying $0 < \lambda_j < 1$ and $\sum_{j=1}^{k} \lambda_j = 1$, the difference

$$\sum_{j=1}^{k} \lambda_j B_{12}^{(j)} \bigl(B_{22}^{(j)}\bigr)^{-1} B_{21}^{(j)} - \left(\sum_{j=1}^{k} \lambda_j B_{12}^{(j)}\right)\left(\sum_{j=1}^{k} \lambda_j B_{22}^{(j)}\right)^{-1}\left(\sum_{j=1}^{k} \lambda_j B_{21}^{(j)}\right) \qquad (A.6)$$

is a nonnegative definite matrix and is equal to the zero matrix if and only if the matrices $\bigl(B_{22}^{(j)}\bigr)^{-1} B_{21}^{(j)}$ are all equal for $j = 1, \ldots, k$. Choosing $\lambda_j = 1/k$, for $j = 1, \ldots, k$, we thus conclude that (A.5) holds if and only if the matrices $\bigl(B_{22}^{(j)}\bigr)^{-1} B_{21}^{(j)}$ are all equal for $j = 1, \ldots, k$. Using the expressions in (A.3), we see that this condition is equivalent to the matrices $(L_1 A_j L_1')^{-1} L_1 A_j L'$ being equal for $j = 1, \ldots, k$. This is condition (ii) of our main result.
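The smallest non-trivial case, $p = 2$ and $s = 1$ with $L = (1, 0)$ and $L_1 = (0, 1)$, gives a quick numerical illustration of this equivalence. In the sketch below each symmetric positive definite $2 \times 2$ matrix $A_j$ with rows $(a, b)$ and $(b, c)$ is stored as the tuple `(a, b, c)`; this encoding, the function names and the example matrices are all assumptions made for illustration.

```python
def two_step_variance(mats):
    """Left side of (A.2) for L = (1, 0): inverse of the summed Schur complements."""
    return 1.0 / sum(a - b * b / c for a, b, c in mats)


def one_step_variance(mats):
    """Right side of (A.2) for L = (1, 0): (1, 1) entry of the inverse of sum A_j."""
    sum_a = sum(m[0] for m in mats)
    sum_b = sum(m[1] for m in mats)
    sum_c = sum(m[2] for m in mats)
    return 1.0 / (sum_a - sum_b * sum_b / sum_c)


def condition_ii_holds(mats, tol=1e-12):
    """Condition (ii) in this case: the scalars b_j / c_j are all equal."""
    ratios = [b / c for _, b, c in mats]
    return max(ratios) - min(ratios) <= tol


# Hypothetical positive definite matrices, each encoded as (a, b, c).
equal_ratios = [(4.0, 1.0, 2.0), (9.0, 2.0, 4.0)]     # b/c = 0.5 in both
unequal_ratios = [(4.0, 1.0, 2.0), (9.0, -2.0, 4.0)]  # b/c = 0.5 and -0.5

for mats in (equal_ratios, unequal_ratios):
    print(condition_ii_holds(mats), two_step_variance(mats), one_step_variance(mats))
```

When the ratios agree, both sides of (A.2) equal 1/11.5; when they differ, the one-step variance is strictly smaller than the two-step variance, in line with the nonnegative definiteness of the difference (A.6).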

To show that condition (ii) is equivalent to condition (i), we use the observation

$$L_1'(L_1 A_j L_1')^{-1} L_1 = A_j^{-1} - A_j^{-1} L'(L A_j^{-1} L')^{-1} L A_j^{-1};$$

see, for example, Searle, Casella, and McCulloch (1992, p. 451). Using the property $LL' = I$, the above identity implies

$$L_1'(L_1 A_j L_1')^{-1} L_1 A_j L' = L' - A_j^{-1} L'(L A_j^{-1} L')^{-1}.$$

It now follows that if the matrices $(L_1 A_j L_1')^{-1} L_1 A_j L'$ are equal, then so are the matrices $(L A_j^{-1} L')^{-1} L A_j^{-1}$ for $j = 1, \ldots, k$, and conversely. This completes the proof of the main result.

We note that the matrix inequality (11) can also be deduced from the concavity of the matrix function $g(A) = (L A^{-1} L')^{-1}$, where $L$ is a given matrix of full row rank and $A$ is positive definite. The latter result is given, for example, in Marshall and Olkin (1979, p. 469).

References

Angelillo, I. F. and Villari, P. (2003). Meta-analysis of published studies or meta-analysis of individual data? Caesarean section in HIV-positive women as a case study. Public Health 117, 323–328.

Berrington, A. and Cox, D. R. (2003). Generalized least squares for the synthesis of correlated information. Biostatistics 4, 423–431.

Blettner, M., Sauerbrei, W., Schlehofer, B., Scheuchenpflug, T. and Friedenreich, C. (1999). Traditional reviews, meta-analyses and pooled analyses in epidemiology. International Journal of Epidemiology 28, 1–9.

Bower, P., Byford, S., Barber, J., Beecham, J., Simpson, S., Friedli, K., Corney, R., King, M. and Harvey, I. (2003). Meta-analysis of data on costs from trials of counseling in primary care: using individual patient data to overcome sample size limitations in economic analyses. British Medical Journal 326, 1247–1250.

Egger, M., Ebrahim, S. and Smith, G. D. (2002). Where now for meta-analysis? International Journal of Epidemiology 31, 1–5.

Fedorov, V. and Jones, B. (2005). The design of multicentre trials. Statistical Methods in Medical Research 14, 205–248.

Fleiss, J. L. (1993). The statistical basis of meta-analysis. Statistical Methods in Medical Research 2, 121–145.

Hedayat, A. S. and Majumdar, D. (1985). Combining experiments under Gauss–Markov models. Journal of the American Statistical Association 80, 698–703.

Higgins, J. P. T., Whitehead, A., Turner, R. M., Omar, R. Z. and Thompson, S. G. (2001). Meta-analysis of continuous outcome data from individual patients. Statistics in Medicine 20, 2219–2241.

Jeng, G. T., Scott, J. R. and Burmeister, L. F. (1995). A comparison of meta-analytic results using literature versus individual patient data. Paternal cell immunization for recurrent miscarriage. Journal of the American Medical Association 274, 830–836.

Jones, A. P., Riley, R. D., Williamson, P. R. and Whitehead, A. (2009). Meta-analysis of individual patient data versus aggregate data from longitudinal clinical trials. Clinical Trials 6, 16–27.

Kiefer, J. (1959). Optimum experimental designs. Journal of the Royal Statistical Society, Series B 21, 272–319.


Lau, J., Ioannidis, J. P. A. and Schmid, C. H. (1998). Summing up evidence: one answer is not always enough. The Lancet 351, 123–127.

Lyman, G. H. and Kuderer, N. M. (2005). The strengths and limitations of meta-analyses based on aggregate data. BMC Medical Research Methodology 5, 14. DOI: 10.1186/1471-2288-5-14.

Marshall, A. W. and Olkin, I. (1979). Inequalities: Theory of Majorization and Its Applications. Academic Press, New York.

Mathew, T. and Nordström, K. (1999). On the equivalence of meta-analysis using literature and using individual patient data. Biometrics 55, 1221–1223.

Olkin, I. (1999). Diagnostic statistical procedures in medical meta-analyses. Statistics in Medicine 18, 2331–2341.

Olkin, I. and Sampson, A. (1998). Comparison of meta-analysis versus analysis of variance of individual patient data. Biometrics 54, 317–322.

Riley, R. D., Dodd, S. R., Craig, J. V., Thompson, J. R. and Williamson, P. R. (2008a). Meta-analysis of diagnostic test studies using individual patient data and aggregate data. Statistics in Medicine 27, 6111–6136.

Riley, R. D., Lambert, P. C., Staessen, J. A., Wang, J., Gueyffier, F., Thijs, L. and Boutitie, F. (2008b). Meta-analysis of continuous outcomes combining individual patient data and aggregate data. Statistics in Medicine 27, 1870–1893.

Searle, S. R., Casella, G. and McCulloch, C. E. (1992). Variance Components. Wiley, New York.

Senn, S. (2000). The many modes of meta. Drug Information Journal 34, 535–549.

Simmonds, M. C. and Higgins, J. P. T. (2007). Covariate heterogeneity in meta-analysis: criteria for deciding between meta-regression and individual patient data. Statistics in Medicine 26, 2982–2999.

Simmonds, M. C., Higgins, J. P. T., Stewart, L. A., Tierney, J. F., Clarke, M. J. and Thompson, S. G. (2005). Meta-analysis of individual patient data from randomized trials: a review of methods used in practice. Clinical Trials 2, 209–217.

Smith, G. D. and Egger, M. (1998). Meta-analysis: Unresolved issues and future developments. British Medical Journal 316, 221–225.

Steinberg, K. K., Smith, S. J., Stroup, D. F., Olkin, I., Lee, N. C., Williamson, G. D. and Thacker, S. B. (1997). Comparison of effect estimates from a meta-analysis of summary data from published studies and from a meta-analysis using individual patient data for ovarian cancer studies. American Journal of Epidemiology 145, 917–925.

Stewart, L. A. and Parmar, M. K. B. (1993). Meta-analysis of the literature or of individual patient data: is there a difference? The Lancet 341, 418–422.

Whitehead, A. (2003). Meta-analysis of Controlled Clinical Trials. Wiley, New York.
