maximum likelihood estimation in meta-analytic structural equation modeling … · 2015. 12. 4. ·...
TRANSCRIPT
1
Maximum Likelihood Estimation in Meta-Analytic Structural Equation Modeling
Frans J. Oort
University of Amsterdam
Suzanne Jak
Utrecht University
This paper is currently under review in Research Synthesis Methods.
Correspondence should be adressed to:
Prof. dr. F.J. Oort
Research Institute of Child Development and Education
Faculty of Social and Behavioural Sciences
University of Amsterdam
P.O. Box 15780, 1001 NG Amsterdam, The Netherlands
Tel. +31 (0)20 525 1291
E-mail: [email protected]
2
Abstract
Meta analytic structural equation modeling (MASEM) involves fitting models to a common
population correlation matrix that is estimated on the basis of correlation coefficients that are
reported by a number of independent studies. MASEM typically consist of two stages, (1)
estimating the common correlation matrix and (2) fitting models. The method that has been
found to perform best in terms of statistical properties is the two stage structural equation
modeling (TSSEM), in which maximum likelihood (ML) analysis is used in the first stage,
and weighted least squares (WLS) analysis in the second stage. In the present paper we
propose to use ML estimation in both stages of TSSEM. In a simulation study we use both
methods and compare chi-square distributions, bias in parameter estimates, false positive
rates, and true positive rates. Both methods appear to yield unbiased parameter estimates, and
false and true positive rates that are close to the expected values. The differences between the
methods seem negligible, so the choice between the two methods may be based on other
fundamental or practical reasons such as consistency in the estimation method or the
availability of software to fit the models.
3
Introduction
The combination of meta-analysis (MA) and structural equation modeling (SEM) for the
purpose of testing hypothesized models is called meta-analytic structural equation modeling
or MASEM (Cheung & Chan, 2005a). Using MASEM, information from multiple studies can
be used to test a single model that explains the relationships between a set of variables or to
compare several models that are supported by different studies or theories. MASEM typically
consist of two stages (Viswesvaran & Ones, 1995). In the first stage, correlation matrices are
tested for homogeneity across studies and combined together to form a pooled correlation
matrix. Several methods to synthesize correlation matrices have been proposed and evaluated
(e.g. Becker, 1992, 1995, 2000; Cheung & Chan, 2005a; Hafdahl, 2007, 2008). In the second
stage, a structural equation model is fitted to the pooled correlation matrix. For this stage,
also several methods have been proposed (see Zhang (2011) for an overview). The method
that is generally found to perform best in terms of statistical properties and flexibility is the
two-stage SEM (TSSEM) approach of Cheung and Chan (2005a). In this method, maximum
likelihood estimation is used to estimate a pooled correlation matrix (Stage 1), and weighted
least squares estimation is used to fit structural equation models to this correlation matrix
(Stage 2). The purpose of the present paper is to propose and evaluate an alternative way of
conducting TSSEM, using maximum likelihood (ML) estimation in both stages of the
procedure.
Two-stage SEM with weighted least squares estimation in the second stage
In Stage 1 of the TSSEM approach, correlation matrices are tested for homogeneity across
studies using multigroup modeling. If homogeneity of correlation matrices is not rejected,
they are combined to form a pooled correlation matrix. In Stage 2, the pooled correlation
matrix is taken as the observed matrix in a SEM analysis. Cheung and Chan use maximum
4
likelihood to estimate the pooled correlation matrix in the first stage, and weighted least
squares (WLS) to estimate model parameters in the second stage. In Stage 2 WLS estimation,
the inversed matrix of asymptotic variances and covariances of the Stage 1 correlation
estimates are used as the weight matrix in the WLS optimization function. This ensures that
correlation coefficients that are estimated with more precision (based on more studies) in
Stage 1 get more weight in the estimation of model parameters in Stage 2. The precision of a
Stage 1 estimate depends on the number and the size of the studies that included the specific
correlation.
Cheung and Chan (2005a) conducted a large simulation study in which the
performance of the TSSEM approach was compared with two univariate approaches and one
multivariate approach. In the univariate approaches, the correlation elements are pooled
separately across studies based on bivariate information only. Cheung and Chan use the term
univariate-r method to refer to the approach from Hunter and Schmidt (1990), which involves
pooling the untransformed correlation coefficients. The univariate-z method refers to the
method of Hedges and Olkin (1985), who proposed using Fisher’s z-transformed correlation
coefficients. The multivariate approach with which they compared the two-stage SEM
method is the GLS method (Becker, 1992, 1995, 2000). The simulation showed the GLS-
method rejects homogeneity of correlation matrices too often and leads to biased parameter
estimates at Stage 2. The univariate methods lead to inflated Type 1 errors, while the TSSEM
method leads to unbiased parameter estimates and false positive rates close to the expected
rates. The statistical power to reject an underspecified factor model was extremely high for
all four methods. The TSSEM method thus came out as best out of these methods. Software
to apply TSSEM is readily available in the R-Package metaSEM (Cheung, 2013), which
relies on the OpenMx package (Boker et al., 2011) to fit the Stage 1 and Stage 2 models. In
5
this way, MetaSEM provides parameter estimates with standard errors, a chi-square measure
of fit, and likelihood based confidence intervals for parameters at both stages of the analysis.
Two stage SEM with maximum likelihood estimation in both stages
Instead of treating the estimated pooled correlation matrix as fixed and using WLS estimation
to fit a structural equation model in Stage 2, the structural equation model can be fitted to the
observed correlations directly, by imposing restrictions on the Stage 1 model. In this way, the
Stage 2 model is nested in the Stage 1 model, and both models can be fitted through
maximum likelihood estimation. This is an attractive approach because it provides a
maximum likelihood chi-square to evaluate the overall goodness-of-fit of the structural
equation model, and a chi-square difference test to evaluate the significance of the difference
in fit between the Stage 1 and Stage 2 models.
Method
Below we describe the Stage 1 and Stage 2 models that are used in MASEM, and the design
of a simulation study that evaluates the ML analysis in Stage 2 and compares it with Cheung
and Chan’s WLS analysis in Stage 2. The Stage 1 analysis is identical in the two approaches.
Stage 1: Testing homogeneity of correlation matrices
Let Rg be the pg x pg sample correlation matrix and pg be the number of observed variables in
the gth study. Not all studies include all variables. For example, in a meta-analysis of three
variables, the correlation matrices for the first three studies may look like this:
6
R1 = 1 _ 1
_ _ 1 , R2 =
1 _ 1 , and R2 =
1 _ 1 .
Here, Study 1 contains all variables, Study 2 misses Variable 1, and Study 3 misses Variable
3. An estimate of the population correlation matrix R of all q variables is obtained by fitting a
multigroup SEM model, where the model for each group (study) is:
Σg = Dg ( Mg R MgT ) Dg (1)
In this model, R is the q x q population correlation matrix with diag(R) = I, matrix Mg is a pg
x q selection matrix that accommodates smaller correlations matrices from studies with
missing variables (pg < q), and Dg is a pg x pg diagonal matrix that accounts for differences in
variances across the G studies. Correct parameter estimates can be obtained using maximum
likelihood estimation, optimizing
FML = g = 1
G FML(g) , (2)
with
FML(g) = logg - logSg + trace(Sg g-1) - pg , (3)
where Sg is the pg x pg sample variance-covariance matrix in the gth study. A chi-square
measure of fit for the model in Equation 1 is available by comparing its FML value with the
7
FML value of a saturated model that is obtained by leaving out the selection matrix Mg and
replacing R with Rg ,
Σg = Dg Rg Dg , (4)
where diag(Rg) = I for all g. The difference between the resulting FML values of the models in
Equations 1 and 4, multiplied by the total sample size minus the number of studies, has a chi-
square distribution with degrees of freedom equal to the difference in numbers of free
parameters. If the chi-square value of this likelihood ratio test is significant then the
hypothesis of homogeneity must be rejected.
If the correlation matrices are not homogeneous, it does not make sense to create a
single pooled correlation matrix to test the structural model in Stage 2. One possible solution
is to create subgroups of studies with homogenous correlation matrices and to fit different
Stage 2 models to each subgroup (Cheung & Chan, 2005b), or to allow study level variation
in the correlation coefficients by using a random effects approach (Cheung, 2014). In the
present study we assume that homogeneity holds, and we do not consider heterogeneous
situations.
Stage 2: Fitting structural equation models
Cheung and Chang (2005a) proposed to use WLS estimation to fit structural equation models
to the pooled correlation matrix R that is estimated in Stage 1. Fitting the Stage 1 model
provides estimates of the population correlation coefficients in R as well as the asymptotic
variances and covariances of these estimates, V. In Stage 2, hypothesized structural equation
models can be fitted to R by minimizing the weighted least squares fit function (also known
as the asymptotically distribution free fit function; Browne, 1984):
8
FWLS = (r – rMODEL)T V-1 (r – rMODEL) , (5)
where r is a column vector with the unique elements in R, rMODEL is a column vector with the
unique elements in the model implied correlation matrix (RMODEL) , and V-1 is the inversed
matrix of asymptotic variances and covariances that is used as the weight matrix. For
example, in order to fit a factor model with k factors, one would specify RMODEL as
RMODEL = Λ Φ ΛT + Θ , (6)
where Φ is a k by k covariance matrix of common factors, Θ is a q by q (diagonal) matrix
with residual variances, and Λ is a q by k matrix with factor loadings. Minimizing the WLS
function leads to correct parameter estimates with appropriate standard errors and a WLS
based chi-square test statistic TWLS (Cheung & Chan, 2005a, 2010).
As an alternative to using WLS for fitting a RMODEL to the estimated R in a single-
group analysis, we propose to retain the multigroup analysis and to use ML for fitting a
common RMODEL to the observed Rg matrices, which can be done by substituting RMODEL for
R in Equation 1,
Σg = Dg ( Mg RMODEL MgT ) Dg . (7)
Multigroup ML analysis of Equation 7 yields a ML chi-square of overall goodness-of-fit
(TML) and correct standard errors, just as multigroup ML analysis of Equation 1. Moreover,
as RMODEL in Equation 7 is a restriction of R in Equation 1, the difference between the
9
associated chi-square values has a chi-square distribution itself with degrees of freedom equal
to the difference in the numbers of free parameters in R and RMODEL.
For example, if we choose the factor model of Equation 6 as RMODEL, then the
multigroup structural equation model is given by
Σg = Dg ( Mg (Λ Φ ΛT + Θ) MgT ) Dg . (8)
In summary, the ML procedure for MASEM consists of a series of three nested
models: the saturated model (Equation 4), the Stage 1 model (Equation 1) and the Stage 2
model (Equation 7). Table 1 gives an overview of the three models and the associated tests.
The difference between the first two models provides a test of the homogeneity of correlation
coefficients, and the difference between the Stage 1 and Stage 2 models provides a test of the
structural model.
Simulation study
We will use simulated data to investigate the performance of ML TSSEM and compare it
with the TSSEM procedure as proposed by Cheung and Chan (2005a). As the two procedures
involve identical Stage 1 analyses, we will focus on the Stage 2 analyses, and calculate
estimation bias, false positive rates, and power in both the multi-group ML Stage 2 analysis
and the single group WLS Stage 2 analysis. The design and population model of our
simulation study are chosen similar to the design and population model in the Cheung and
Chan (2005a) study.
As a result we have numbers of studies equal to 5, 10, and 15, with 50, 100, 200, 500
and 1000 respondents each. The population model is a two-factor model with six variables,
with parameter values for Λ, Φ, and Θ as shown in Figure 1. We generated continuous
10
multivariate normal data using the MASS package in the R program (R Development Core
Team, 2011), after which we computed the correlation matrices. As it is not realistic that all
studies reported correlations between all research variables, some variables were removed
from some studies. Table 2 shows the patterns of missing variables in conditions with 5, 10
and 15 studies (these patterns are also chosen equal to the missing data patterns in Cheung
and Chan (2005a)). As the pattern is fixed a priori, the missingness may be considered
missing completely at random (MCAR, Graham, Hofer & MacKinnon, 1996), and will not
bias parameter estimates. We generated 500 data sets for each of the 15 conditions.
Estimation bias. We investigated estimation bias by fitting the model of Figure 1 to the data
with both the ML method and the WLS method. The percentage of estimation bias is
calculated as 100 x (mean estimated value - population value) / population value. According
to Muthén, Kaplan, and Hollis (1987), estimation bias less than 10% can be considered
negligible.
False positive rates. The distribution of TML and TWLS and false positive rates for the tests of
overall goodness-of-fit were inspected by fitting the correct model (Figure 1) to the data sets
(df = 8, = 0.05). We also calculated false positive rates for a single parameter, by fitting a
model in which Variable 4 has a redundant cross loading on the first factor. False positives
for this single parameter estimate were determined in two ways. Firstly, on the basis of the
likelihood ratio test, by checking the chi-square difference between models with and without
the redundant parameter (df = 1, = 0.05). Secondly, on the basis of the 95% confidence
interval of the estimate for the redundant parameter, by counting the number of times that the
interval did not include zero.
11
True positive rates (power). To investigate the power to detect non-zero parameters we
generated data according to the population model given by Figure 1 but with an additional
0.10 loading of Variable 4 on the first factor. Power was determined for the overall goodness-
of-fit (by counting true positives of the chi-square difference test with df = 8, = 0.05) and
for the test of the single additional 0.10 loading (chi-square difference test with df = 1, =
0.05). In addition, we counted the number of times that the 95% confidence interval of the
additional parameter estimate did not include zero.
We compare the power of the ML and WLS procedures to each other and to the power
that is expected on the basis of the non-centrality parameter that was estimated by fitting the
covariance matrix implied by the Stage 2 model (Model 2 as in Figure 1) to the covariance
matrix implied by the Stage 2 model with the cross loading (Model 2 with the additional
loading); see Saris and Satorra (1993).
Results
Table 3 shows the percentages of estimation bias in a selection of parameters: The first three
factor loadings and de covariance between the two common factors. For both the ML and the
WLS method, the bias is very small, ranging from 0.00% to (minus) 3.50%, which is well
below the 10% threshold suggested by Muthén et al. (1987). In general, the bias seems to be
somewhat smaller with the ML method than with the WLS-method, but the differences are
negligible.
Table 4 gives the means and standard deviations of the test statistics obtained with
ML and WLS estimation. In all conditions, both test statistics approximate the chi-square
distribution well, as the mean and standard deviations are very close to their expected values
(mean = 8, sd = 4), with the TML values generally slightly closer than the TWLS values. Table
4 also shows the false positive rates, that is, the rates of rejecting the correct model. The false
12
positive rates appear to converge to the expected value (0.05) when the sample size increases,
which is in accordance with the findings of Cheung and Chan (2005a). The results are very
similar for the ML and WLS method.
The false positive rates of the redundant cross loading are shown in Table 5. These
were very close to the expected 0.05 in all conditions and equal for the ML and WLS
methods, except for two conditions (with ML being 0.01 off in one condition and WLS being
0.01 off in another condition).
Table 6 gives the true positive rates or power to reject the model without a cross
loading for the overall goodness-of-fit test (df = 8), for single parameter test (df = 1), and for
the 95% confidence interval. As expected, the power increases with sample size. The number
of studies does not affect power that much, probably because the numbers of studies do not
vary much (5, 10 and 15). Both methods reach acceptable power levels with sample sizes of
500 or larger. The ML method generally shows power results that are somewhat closer to the
expected power than the WLS method, but the WLS method often yields slightly more true
positives.
Discussion
Our simulation study showed that using maximum likelihood estimation in both stages of
MASEM leads to almost identical results as using WLS estimation in Stage 2 of the analysis.
ML estimation may have an edge on WLS estimation, but the differences are not consistent
and hardly noticeable.
There are some fundamental and practical differences though, which may guide a
researcher’s choice between the two methods. Advantages of the ML procedure are that the
same estimation method is used at both stages, and that the Stage 1 and Stage 2 models are
nested. The WLS procedure has practical advantages. In the WLS procedure, the Stage 2
13
model is not a multi-group model, so that estimation convergence is much faster than in the
ML-approach. The necessity to calculate a weight matrix (the inverse of the matrix of
asymptotic variances and covariances of the pooled correlation coefficients) may count as a
disadvantage of the WLS method, but fortunately the readily available R package metaSEM
takes this burden off of the user. As a result, the WLS approach may actually be easier to take
than the ML approach. Another possible disadvantage of the WLS approach is that the pooled
correlation matrix that is estimated in the multi-group SEM in Stage 1, is taken as fixed in the
single group SEM in Stage 2. This could lead to larger rejection rates, and although in our
study the WLS method did not show more false positives than the ML method, it did yield
slightly more true positives. Notably, true positive rates for both methods were generally
somewhat higher than expected.
In our study we considered situations in which some studies did not include one or
more of the variables of interest. In addition to missing variables, there may also be missing
correlations. When authors do not report the full correlation matrix of all research variables,
some correlation coefficients may be derived from other statistics that are reported, such as
regression coefficients. This may not be possible for all correlations. For example, authors
may not report the correlations between the outcome variables in their study. One way to
handle missing correlation coefficients is presented by Jak, Oort, Roorda, and Koomen
(2013), but their method has not been thoroughly evaluated yet. A crude way to handle
missing correlations is to remove one of the two variables that is associated with the missing
correlation (Cheung & Chan, 2005a), but this method leads to a substantial additional loss of
information.
In applications of MASEM, correlation matrices will often not be homogenous,
rendering the fixed effects Stage 2 models inappropriate. One of the options to handle
heterogeneity is to create subgroups of studies that can be considered homogeneous with
14
respect to the correlation coefficients. These subgroups can be created based on study level
variables such as type of respondents in the study, experimental vs non-experimental studies,
type of measurement instruments used, and so on. If no interesting study level variables are
available, it may be an option to use a cluster analytic approach (Cheung & Chan, 2005b).
Another alternative is to use random effects modeling, which does not assume homogeneity
of effect sizes across studies. Cheung (2013, 2014) shows how the Stage 1 model can be
estimated using the random effects approach, after which the study level variance is
accounted for, and the Stage 2 model can be fitted to the pooled correlation matrix using the
asymptotic variances and covariances from Stage 1 as the weight matrix in WLS estimation.
The ML approach presented here has not yet been extended to include random effects.
It would be useful to be able to estimate study level variance for the Stage 2 parameters
directly. One way in which the ML procedure can handle heterogeneity is to freely estimate
one or more of the parameters across studies. A possible problem with this approach is that
the model should still be identified in each of the studies, which may be problematic if some
studies include too few variables.
The simulation study we performed was small, and only served to investigate whether
the ML approach is a viable alternative to the WLS approach. Future studies may examine
the performance of the method with larger numbers of studies, with varying sample sizes, and
with more variation in numbers and patterns of missing variables, and missing correlations.
References
Becker, B. J. (1992). Using results from replicated studies to estimate linear models. Journal
of Educational Statistics, 17, 341-362.
Becker, B. J. (1995). Corrections to" Using Results from Replicated Studies to Estimate
Linear Models". Journal of Educational and Behavioral Statistics, 20, 100-102.
15
Becker, B. J., & Schram, C. M. (2009). Model-based meta-analysis. In Cooper, H., Hedges,
L. V., & Valentine, J. C. (Eds), The handbook of research synthesis and meta-analysis
(2nd ed., pp. 377-395). New York, NY:Sage.
Boker, S., Neale, M., Maes, H., Wilde, M., Spiegel, M., Brick, T., Spies, J., Estabrook, R.,
Kenny, S., Bates, T., Mehta, P. and Fox, J. (2011). OpenMx: An open source
extended structural equation modeling framework. Psychometrika, 76, 306-317.
Browne, M. W. (1984). Asymptotically distribution-free methods for the analysis of
covariance structures. British Journal of Mathematical and Statistical Psychology, 37,
62-83.
Cheung, M. W. -L. (2010). Fixed-effects meta-analyses as multiple-group structural equation
models. Structural Equation Modeling, 17, 481-509.
Cheung, M. W. -L. (2013a). metaSEM: Meta-analysis using structural equation modeling.
Retrieved from http://courses.nus.edu.sg/course/psycwlm/Internet/metaSEM/R
package version 0.8–4. (Accessed August 26, 2014.)
Cheung, M. W. -L. (2013b). Multivariate meta-analysis as structural equation models.
Structural Equation Modeling, 20, 429-454.
Cheung, M. W. -L. (2014). Fixed-and random-effects meta-analytic structural equation
modeling: Examples and analyses in R. Behavior research methods, 46(1), 29-40.
Cheung, M.W. -L. and Chan, W. (2005a). Meta-analytic structural equation modeling: A
two-stage approach. Psychological Methods, 10, 40-64.
Cheung, M. W. -L., & Chan, W. (2005b). Classifying correlation matrices into relatively
homogeneous subgroups: A cluster analytic approach. Educational and Psychological
Measurement, 65, 954-979.
16
Graham, J. W., Hofer, S. M., & MacKinnon, D. P. (1996). Maximizing the usefulness of data
obtained with planned missing value patterns: An application of maximum likelihood
procedures. Multivariate Behavioral Research, 31(2), 197-218.
Hafdahl, A. R. (2007). Combining correlation matrices: Simulation analysis of improved
fixed-effects methods. Journal of Educational and Behavioral Statistics, 32, 180-205.
Hafdahl, A. R. (2008). Combining heterogeneous correlation matrices: Simulation analysis of
fixed-effects methods. Journal of Educational and Behavioral Statistics, 33, 507-533.
Hedges, L. V. & Olkin, I. (1985). Statistical methods for meta-analysis. Orlando, FL:
Academic Press.
Hunter, J. E. & Schmidt, F. L. (Eds.). (2004). Methods of meta-analysis: Correcting error
and bias in research findings. Newbury Park, CA: Sage.
Jak, S., Oort, F. J., Roorda, D. L., & Koomen, H. M. (2013). Meta-analytic structural
equation modelling with missing correlations. Netherlands Journal of Psychology, 67,
132-139.
Muthén, B., Kaplan, D., & Hollis, M. (1987). On structural equation modeling with data that
are not missing completely at random. Psychometrika, 52, 431-462.
Neale, M. C., & Miller, M. B. (1997). The use of likelihood-based confidence intervals in
genetic models. Behavior genetics, 27, 113-120.
R Development Core Team (2011). R: A Language and Environment for Statistical
Computing. [Computer software]. Vienna, Austria.
Viswesvaran, C., & Ones, D. (1995). Theory testing: Combining psychometric meta-analysis
and structural equations modeling. Personnel Psychology, 48, 865-885.
Zhang, Y. (2011). Meta-analytic structural equation modeling (MASEM): Comparison of
methods. Electronic Theses, Treatises and Dissertations. Paper 534.
17
Table 1. An overview of models and associated tests in the ML approach to MASEM Model Equation Chi-square Difference TestModel 0 (saturated) Σg = Dg Rg Dg - Model 1 (Stage 1) Σg = Dg ( Mg R Mg
T ) Dg Homogeneity of correlations (Model 1 vs Model 0) Model 2 (Stage 2) Σg = Dg ( Mg RMODEL Mg
T ) Dg Fit of structural equation model (Model 2 vs Model 1)
18
Table 2. The patterns of missing data in conditions with 5, 10, and 15 studies. Conditions with 5 studies Conditions with 10 studies Conditions with 15 studies Observed variable Observed variable Observed variable x1 x2 x3 x4 x5 x6 x1 x2 x3 x4 x5 x6 x1 x2 x3 x4 x5 x6Study 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Study 2 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 Study 3 1 0 1 1 1 0 1 0 1 1 1 0 1 0 1 1 1 0 Study 4 1 1 0 1 0 1 1 1 0 1 0 1 1 1 0 1 0 1 Study 5 1 1 1 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1Study 6 0 0 0 1 1 1 0 0 0 1 1 1 Study 7 1 0 0 0 1 1 1 0 0 0 1 1 Study 8 1 1 0 0 0 1 1 1 0 0 0 1 Study 9 1 1 1 0 0 0 1 1 1 0 0 0 Study 10 0 1 1 1 1 0 0 0 1 1 1 0 Study 11 0 0 0 0 1 1 Study 12 1 0 0 0 0 0 Study 13 1 1 0 0 0 0 Study 14 0 1 1 0 0 0 Study 15 0 0 0 1 1 0 Note: 1 = variable present in study; 0 = variable absent in study.
19
Table 3. Estimation bias for selected parameters, in percentages.
λ11 λ21 λ31 Φ21 Convergence* ML WLS ML WLS ML WLS ML WLS ML WLS 5 studies N=50 -0.45 -1.23 -0.33 -1.79 0.16 -1.64 1.21 -1.67 498 497 N=100 -0.77 -1.22 0.19 -0.49 0.28 -0.59 0.57 -1.00 499 499 N=200 -0.05 -0.25 -0.04 -0.29 -0.21 -0.63 0.67 -0.17 500 500 N=500 -0.01 -0.09 -0.01 -0.14 -0.04 -0.22 -0.18 -0.51 500 500 N=1000 0.11 0.06 0.06 0.00 -0.26 -0.33 0.13 -0.05 499 500 10 studies N=50 0.06 -0.56 -0.99 -1.76 -0.47 -1.81 -1.43 -3.50 500 500 N=100 0.06 -0.22 -0.32 -0.66 -0.27 -0.86 -0.13 -1.12 500 500 N=200 0.06 -0.09 -0.14 -0.30 -0.17 -0.45 -0.64 -1.15 500 500 N=500 -0.16 -0.22 0.16 0.08 0.06 -0.05 -0.51 -0.71 500 500 N=1000 -0.13 -0.16 0.04 0.01 0.02 -0.02 0.22 0.11 500 497 15 studies N=50 -0.19 -0.69 -0.45 -1.07 -0.68 -2.03 -0.39 -2.44 500 500 N=100 -0.13 -0.33 -0.02 -0.30 0.42 -0.16 -0.66 -1.64 500 499 N=200 -0.38 -0.48 0.00 -0.14 -0.02 -0.27 -0.85 -1.30 499 500 N=500 -0.08 -0.13 -0.09 -0.17 -0.21 -0.34 0.00 -0.13 493 500 N=1000 -0.03 -0.07 0.10 0.09 -0.15 -0.18 0.16 0.05 493 494 Note: Bold figures indicate cases in which WLS estimation bias is smaller than ML estimation bias.
*Number of data sets with which the analysis converged to a solution.
20
Table 4. Means, standard deviations, and rejection rates of TML and TWLS (df = 8, α = 0.05).
TML TWLS mean sd rejection
ratemean sd rejection
rate 5 studies N=50 8.28 3.96 0.07 8.38 4.05 0.07 N=100 8.25 4.23 0.07 8.30 4.26 0.07 N=200 7.81 3.79 0.05 7.83 3.81 0.05 N=500 8.20 4.26 0.05 8.22 4.27 0.05 N=1000 8.18 4.16 0.05 8.18 4.16 0.05 10 studies N=50 8.46 4.35 0.06 8.64 4.52 0.07 N=100 8.01 4.10 0.05 8.10 4.20 0.06 N=200 8.01 3.99 0.05 8.05 4.03 0.05 N=500 8.18 4.09 0.06 8.20 4.11 0.06 N=1000 8.01 3.94 0.04 8.05 3.93 0.04 15 studies N=50 8.52 4.13 0.08 8.77 4.37 0.09 N=100 8.08 4.20 0.06 8.15 4.28 0.07 N=200 7.92 3.86 0.04 7.97 3.92 0.05 N=500 7.78 3.80 0.04 7.86 3.84 0.04 N=1000 8.15 4.01 0.05 8.23 4.04 0.06 Notes: Bold figures indicate cases in which WLS results were closer to the expected mean (8.00),
standard deviation (4.00), or rejection rate (0.05). With df = 8, α = 0.05, the critical chi-square value is
15.51.
21
Table 5. False positives rates for redundant model parameter.
Chi-square difference test
(df = 1, α = .05)
95% CI Convergence
ML WLS ML WLS ML WLS 5 studies N=50 0.06 0.06 0.06 0.06 493 492 N=100 0.05 0.04 0.05 0.04 498 498 N=200 0.05 0.05 0.05 0.05 500 500 N=500 0.05 0.05 0.05 0.05 500 500 N=1000 0.06 0.06 0.05 0.06 499 499 10 studies N=50 0.04 0.04 0.04 0.04 497 496 N=100 0.04 0.05 0.04 0.05 499 500 N=200 0.05 0.05 0.05 0.05 499 499 N=500 0.07 0.07 0.07 0.07 500 499 N=1000 0.05 0.05 0.05 0.05 500 496 15 studies N=50 0.05 0.05 0.05 0.05 498 499 N=100 0.06 0.06 0.06 0.06 497 497 N=200 0.07 0.07 0.07 0.07 499 499 N=500 0.06 0.05 0.06 0.05 493 500 N=1000 0.05 0.05 0.05 0.05 493 496 Note: Bold figures indicate cases in which WLS has better (lower) false positive rates than ML.
Proportions are calculated based on converged solutions only.
22
Table 6. Power of the overall test, the likelihood ratio test and the likelihood based confidence
intervals to detect a cross loading of 0.10.
N Chi-square differencetest ( df = 8, α = 0.05)
Chi-square differencetest (df = 1, α = 0.05)
95% CI Convergence
Exp. ML WLS Exp. ML WLS ML WLS ML WLS 5 studies 50 0.10 0.11 0.13 0.22 0.24 0.25 0.24 0.25 496 495 100 0.17 0.17 0.16 0.38 0.38 0.39 0.38 0.39 499 499 200 0.33 0.32 0.31 0.66 0.64 0.65 0.64 0.65 499 499 500 0.76 0.74 0.72 0.96 0.96 0.96 0.96 0.96 497 500 1000 0.98 0.98 0.98 1.00 1.00 1.00 1.00 1.00 499 497 10 studies 50 0.12 0.11 0.13 0.26 0.27 0.28 0.26 0.28 498 498 100 0.21 0.21 0.22 0.46 0.45 0.47 0.45 0.47 500 499 200 0.42 0.40 0.40 0.75 0.72 0.73 0.72 0.73 500 500 500 0.87 0.83 0.83 0.99 0.99 0.99 0.99 0.99 500 500 1000 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 500 493 15 studies 50 0.11 0.11 0.13 0.25 0.22 0.23 0.22 0.23 495 500 100 0.20 0.24 0.25 0.44 0.44 0.43 0.43 0.43 500 498 200 0.39 0.40 0.40 0.73 0.73 0.73 0.73 0.73 500 497 500 0.84 0.85 0.85 0.98 0.97 0.97 0.97 0.97 494 500 1000 0.99 1.00 1.00 1.00 1.00 1.00 1.00 1.00 499 493 Note: Bold figures indicate cases in which WLS power is closer to the expected power that is based
on the non-centrality parameter.
23
Figure 1. Population model used in data generation.
y1 y2 y3 y4 y5 y6
ζ1 ζ2
.80 .70 .60 .80 .70 .60
.36 .51 .64 .36 .51 .64
1 1
.30