Challenges in nonlinear structural equation modeling¹
Polina Dimitruk, Karin Schermelleh-Engel, Augustin Kelava & Helfried Moosbrugger
Please cite as:
Dimitruk, P., Schermelleh-Engel, K., Kelava, A. & Moosbrugger, H. (2007). Challenges in
nonlinear structural equation modeling. Methodology, 3, 100-114.
¹ This research has been supported by the German Research Foundation, Grant No. Mo 474/6-1.
Abstract
Challenges in evaluating nonlinear effects in multiple regression analyses include reliability,
validity, multicollinearity, and dichotomization of continuous variables. While reliability and
validity issues are solved by employing nonlinear structural equation modeling,
multicollinearity remains a problem which may even be aggravated when using latent variable
approaches. Further challenges of nonlinear latent analyses comprise the distribution of latent
product terms, a problem especially relevant for approaches using maximum likelihood
estimation methods based on multivariate normally distributed variables, and unbiased
estimates of nonlinear effects under multicollinearity. The only methods that explicitly take
the non-normality of nonlinear latent models into account are Latent Moderated Structural
Equations (LMS) and Quasi-Maximum Likelihood (QML). In a small simulation study both
methods yielded unbiased parameter estimates and correct estimates of standard errors for
inferential statistics. The advantages and limitations of nonlinear structural equation modeling
are discussed.
Key words: Nonlinear structural equation modeling, nonlinear regression, reliability,
multicollinearity, latent product terms
1 Introduction
Hypotheses regarding interaction and quadratic effects between continuous variables are
frequently examined in psychological research using multiple regression analysis, although it
is well-known that this method is plagued by several methodological problems (e.g., Aiken &
West, 1991; Cohen, Cohen, Aiken, & West, 2003; MacCallum & Mar, 1995; Ganzach, 1997).
Over the last few years, nonlinear structural equation modeling has received much attention
and has become increasingly popular in the context of applied behavioral and social science
research (e.g., Schumacker & Marcoulides, 1998). Nonlinear structural equation modeling
(SEM) provides many advantages over the use of analyses based on observed variables. It is,
however, more complicated to conduct and is hindered by methodological problems that are
different from those in multiple regression analysis.
In the following, we will first present a short introduction to the analysis of nonlinear
effects at the level of observed variables as well as at the level of latent variables. Then, we
will discuss a number of challenges mainly connected with nonlinear regression analyses and
finally turn to problems associated with latent nonlinear analyses.
In a multiple regression equation, the variables are usually linearly related, that is, the
criterion variable Y is a linear function of the predictor variables. In some cases it may,
however, be theoretically plausible that the effect of a predictor variable X on a criterion
variable Y is itself moderated by a second predictor variable Z. Here, in addition to the linear
effects β₁ and β₂, an interaction effect β₃ becomes part of the model structure. In order to
analyze the interaction effect together with the linear effects in the regression equation, a new
variable must be created, i.e., the product XZ of the predictors X and Z, which is included as a
third term in the multiple regression equation:

Y = β₀ + β₁X + β₂Z + β₃XZ + ε        (1)
In Equation (1), Y is the criterion variable, X and Z are the predictor variables, the product
XZ is the interaction term, β₀ is the intercept, β₁ and β₂ are the linear effects, β₃ is the
interaction effect, and ε is the disturbance term.
Given the extensive use of moderated regression, Lubinski and Humphreys (1990) have
suggested that significant interaction effects found in practice may occasionally be spurious in
nature. Accordingly, investigators who test whether the slope of the regression of the
criterion variable Y on a predictor variable X varies with the realizations of the predictor
variable Z should also routinely check for quadratic effects. In such a case, nonlinear regression models can
contain several nonlinear effects, i.e., one or more interaction effects and one or more
quadratic effects, depending on the number of independent or predictor variables in the
equation. In the case of two predictors (see Equation 2), a fully specified nonlinear model
includes a criterion variable Y, two predictor variables X and Z, a product term XZ, and two
quadratic terms X² and Z².

Y = β₀ + β₁X + β₂Z + β₃XZ + β₄X² + β₅Z² + ε        (2)
In Equation (2), the moderator effect is denoted by β₃, the quadratic effects by β₄ and β₅.
Equation (2) can be rewritten as follows:

Y = β₀ + (β₁ + β₃Z + β₄X)X + (β₂ + β₅Z)Z + ε        (3)

Y = β₀ + (β₁ + β₄X)X + (β₂ + β₃X + β₅Z)Z + ε        (4)

As can be seen in Equations (3) and (4), the interpretation of the linear effect β₁ is only
reasonable as a component of the nonlinear function (β₁ + β₃Z + β₄X), and the interpretation
of β₂ is only reasonable as a component of the nonlinear function (β₂ + β₃X + β₅Z). The term
(β₁ + β₃Z + β₄X) is the slope of the regression of Y on X, which depends upon the particular
values of Z and X at which the slope is considered. Since this model is symmetrical, β₁ may
also be interpreted as part of the nonlinear function (β₁ + β₄X) and β₂ as part of the nonlinear
function (β₂ + β₅Z).
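To make Equations (2) to (4) concrete, the following sketch (our own illustration with simulated data and hypothetical effect sizes, not an analysis from this paper) fits the fully specified nonlinear regression by ordinary least squares in Python; the nonlinear terms are formed from centered predictors, a point taken up in Section 2.3.

import numpy as np

rng = np.random.default_rng(1)
n = 1000

# Simulate two correlated predictors and a criterion with interaction and quadratic effects
x = rng.normal(0, 1, n)
z = 0.5 * x + np.sqrt(1 - 0.25) * rng.normal(0, 1, n)
y = 1.0 + 0.4 * x + 0.4 * z + 0.2 * x * z + 0.15 * x**2 + 0.15 * z**2 + rng.normal(0, 1, n)

# Center the predictors, then form the nonlinear terms from the centered scores
xc, zc = x - x.mean(), z - z.mean()
design = np.column_stack([np.ones(n), xc, zc, xc * zc, xc**2, zc**2])

# Ordinary least squares estimates of beta_0, ..., beta_5 in Equation (2)
beta, *_ = np.linalg.lstsq(design, y, rcond=None)
for name, b in zip(["b0", "b1 (X)", "b2 (Z)", "b3 (XZ)", "b4 (X^2)", "b5 (Z^2)"], beta):
    print(f"{name:9s} {b: .3f}")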
For the examination of hypotheses concerning interaction and quadratic effects between
continuous variables in psychological research, multiple regression analysis is the most
commonly applied method. Nevertheless, a central problem of this method is that, among
other issues, the reliability and validity of the measures are all too often ignored. Taking
reliability and validity into account, structural equation modeling proves a better alternative. Recently, the
importance of nonlinear relations among latent variables when it comes to developing more
correct and meaningful models has been recognized and underscored (e.g., Bollen, 1996;
Jöreskog & Yang, 1996; Moosbrugger, Schermelleh-Engel & Klein, 1997; Schumacker &
Marcoulides, 1998; Yang-Jonsson, 1997). The analysis of nonlinear latent models has thus
received considerable attention.
While ordinary structural equation models incorporate linear relationships among latent
variables, a fully specified nonlinear structural equation model with two predictor variables
also includes a moderator effect ω₁₂ and two quadratic effects ω₁₁ and ω₂₂. In this model, the
slope of the latent regression of the criterion variable η on a predictor variable ξ₁ is not only
moderated by the realizations of the second predictor (or moderator) variable ξ₂ (interaction
effect ω₁₂), but also by realizations of the predictors ξ₁ and ξ₂ themselves (quadratic effects
ω₁₁ and ω₂₂).

η = α + γ₁ξ₁ + γ₂ξ₂ + ω₁₂ξ₁ξ₂ + ω₁₁ξ₁² + ω₂₂ξ₂² + ζ        (5)
In Equation (5), η is the latent criterion variable, α is the intercept, ξ₁ and ξ₂ are latent
predictor variables, ξ₁ξ₂ is the interaction term, ξ₁² and ξ₂² are the quadratic terms, γ₁ and γ₂
are the linear effects, ω₁₂ is the interaction effect, ω₁₁ and ω₂₂ are the quadratic effects,
and ζ is the disturbance term.
The nonlinear structural equation model is characterized by the following measurement
models (Equation 6) in which the latent constructs are each measured by at least two indicator
variables (see Figure 1):
Y = Λ_y η + ε
X = Λ_x ξ + δ        (6)

In the measurement models of Equation (6), Y is the vector of two criterion indicators Y1
and Y2, X is the vector of four predictor indicators X1 - X4, Λ_y and Λ_x denote the factor loading
matrices, and ε and δ are vectors of the measurement error terms.
The following assumptions are made:
X1, . . . , X4 are multivariate normal with zero means;
ξ₁ and ξ₂ are bivariate normal with zero means;
δ₁, . . . , δ₄, ε₁, ε₂, and ζ are normal with zero means;
δᵢ is independent of δᵢ′ for i ≠ i′ (i = 1, . . . , 4; i′ = 1, . . . , 4);
εⱼ is independent of εⱼ′ for j ≠ j′ (j = 1, 2; j′ = 1, 2);
δᵢ and εⱼ are independent of ξₖ for i = 1, . . . , 4, j = 1, 2, and k = 1, 2;
ζ is independent of δᵢ, εⱼ, and ξₖ for i = 1, . . . , 4, j = 1, 2, and k = 1, 2.
-----------------------------------------------------------------------------------------------------------------
Insert Figure 1 about here
-----------------------------------------------------------------------------------------------------------------
Although quadratic terms are usually included neither in regression nor in structural
equation models, it may be of interest to know whether a curvilinear relationship exists
between predictor and criterion in addition to an interactive relationship. The analyses of
nonlinear regression models as well as nonlinear structural equation models are, however,
hindered by several methodological problems that prevent researchers from employing these
methods. While some of the limitations of regression analysis may be removed when
structural equation modeling is used, still others may be aggravated and additional problems
may arise. Despite the plethora of research concerning nonlinear latent hypotheses, the
appropriate methods for testing these remain a subject of debate and ongoing research (cf.
Klein & Muthén, 2006; Marsh, Wen, & Hau, 2004; Schermelleh-Engel, Klein &
Moosbrugger, 1998).
A discussion of some of the problems associated with nonlinear regression and nonlinear
structural equation modeling follows, accompanied by a presentation of possible solutions.
First, we will describe methodological problems that are mainly linked to nonlinear regression
models, i.e., reliability, validity, multicollinearity, and dichotomization of continuous
variables, and subsequently address problems to be dealt with in applying nonlinear structural
equation models, i.e., non-normal distributions of product terms and the unbiasedness of
parameter estimates under multicollinearity.
2 Problems associated with nonlinear regression models
2.1 Reliability
The reliability problem arises because measurement error in the manifest or observed
variables is usually not taken into account in linear regression models. The reliability of
variables becomes a substantial problem in the analysis of linear as well as nonlinear
regression models when observed variables are treated as if each were a perfectly reliable
measure. Although this assumption is rather common in empirical research, it is all too often
incorrect (cf. Moosbrugger, Schermelleh-Engel, & Klein, 1997).
Ignoring measurement error can lead to biased estimates of the regression coefficients, a
problem that will be aggravated by adding nonlinear terms to the linear multiple regression
equation. Since observed variables are seldom measured with perfect or near perfect
reliability, the sample regression coefficients in multiple regression analysis are usually
attenuated (cf. Aguinis, 1995). Moreover, the reliabilities of nonlinear terms are affected to an
even greater extent, such that the sample-based regression coefficients associated with these
terms greatly underestimate the population coefficients. If the reliability of the criterion
variable Y is also less than perfect, relationships between Y and the predictor variables (e.g.,
X, Z, XZ, X², Z²) are attenuated even more.
Interaction models. As is well-known from classical test theory, the reliability of a variable
X is defined as the ratio of the true score variance Var(T_X) to the total variance Var(X):

Rel(X) = Var(T_X) / Var(X)        (7)
If X and Z are centered, the reliability of the interaction term XZ is defined as follows (cf.
Busemeyer & Jones, 1983):
Rel(XZ) = [Rel(X) · Rel(Z) + [Corr(X, Z)]²] / [1 + [Corr(X, Z)]²]        (8)
As can be seen in Equation (8), the reliability of the interaction term not only depends on
the reliability of the predictor variables, but also on the correlation between the two predictors.
Figure 2 illustrates the relation between the correlation of the predictors, Corr(X, Z), and the
reliability of the interaction term, Rel(XZ), for three reliability values of variables X and Z.
Without loss of generality, we assume that Rel(X) = Rel(Z).
-----------------------------------------------------------------------------------------------------------------
Insert Figure 2 about here
-----------------------------------------------------------------------------------------------------------------
As is apparent from Equation (8), when X and Z are uncorrelated (Corr(X, Z) = 0), the
reliability of the interaction term is reduced to the product of the reliabilities of the predictors
(see Figure 2). If, for example, X and Z are uncorrelated and Rel(X) = Rel(Z) = .50, then
Rel(XZ) = .25. With increasing correlation, the reliability of XZ also increases, though even
with very reliable measures (e.g. Rel(X) = Rel(Z) = .75) and a correlation of .75, the reliability
Rel(XZ) will be smaller than .75, i.e., .72. The reliability of the interaction term only amounts
to 1.00 given that the measures are perfectly reliable.
Because the correlation between two variables X and Z, corrected for attenuation due to
measurement error, is less than or equal to one (Equation 9),

Corr(X, Z) / √(Rel(X) · Rel(Z)) ≤ 1.0 ,        (9)

the correlation between the predictor variables cannot exceed the square root of the product of
the reliability coefficients of X and Z (Equation 10, see also Figure 2).

Corr(X, Z) ≤ √(Rel(X) · Rel(Z))        (10)
Quadratic models. Since the reliability of the interaction term not only depends on the
reliability of the predictor variables but also on the correlation of the two predictors, it seems
reasonable to assume that the reliability of the quadratic term is formed analogously. This
would mean that the reliability of the quadratic term not only depends on the reliability of the
predictor variable, but also on the correlation of the predictor with itself, which, given that a
variable always perfectly correlates with itself, is 1.0. Including the correlation of a variable
with itself is in this case not correct, due to the fact that not only the true scores but also the
error scores are perfectly correlated.
Corr(X, X) = [Cov(T_X, T_X) + Cov(E_X, E_X)] / [SD(X) · SD(X)] = [Var(T_X) + Var(E_X)] / Var(X) = 1.00        (11)
In order to calculate the reliability of the quadratic term correctly, the squared correlation
in the numerator of Equation (8) must be replaced by the squared reliability of the predictor
variable X. The reliability of the squared variable X² can then be written as follows:

Rel(X²) = [Var(T_X) · Var(T_X) + [Cov(T_X, T_X)]²] / [Var(X) · Var(X) + [Cov(X, X)]²]
        = 2[Var(T_X)]² / (2[Var(X)]²)
        = Rel(X) · Rel(X)        (12)
The reliability of the squared variable X² is thus the squared reliability of the variable X. As
shown in Figure 3, the reliability of the quadratic term X², being the square of Rel(X), is always
smaller than the reliability of the predictor variable X. Both variables X and X² will only reach
reliabilities of 1.0 when X is measured without error.
-----------------------------------------------------------------------------------------------------------------
Insert Figure 3 about here
-----------------------------------------------------------------------------------------------------------------
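As a quick check on Equations (8) and (12), the following small Python helpers (our own illustration, not part of the original paper) compute the reliability of a product term and of a quadratic term; the printed values reproduce the examples given in the text (.25, .72, and .75² ≈ .56).

def rel_product(rel_x: float, rel_z: float, corr_xz: float) -> float:
    """Reliability of the product term XZ for centered X and Z (Equation 8)."""
    return (rel_x * rel_z + corr_xz**2) / (1.0 + corr_xz**2)

def rel_quadratic(rel_x: float) -> float:
    """Reliability of the quadratic term X^2 for a centered, normal X (Equation 12)."""
    return rel_x**2

# Examples discussed in the text
print(rel_product(0.50, 0.50, 0.00))   # 0.25
print(rel_product(0.75, 0.75, 0.75))   # 0.72
print(rel_quadratic(0.75))             # 0.5625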
Since least squares estimates of the regression coefficients in multiple regression analysis
with unreliable variables are biased and inconsistent, the low reliability of the interaction and
quadratic terms aggravates this bias even more and, as a consequence, leads to an
underestimation of the nonlinear effects and a loss of statistical power in testing the
significance of nonlinear effects (cf. Kelava, Moosbrugger, Dimitruk, & Schermelleh-Engel,
submitted). Therefore, it is recommended that nonlinear multiple regression analysis only be
used for manifest (observed) variables with very high reliabilities. In all other cases, nonlinear
structural equation modeling, which accounts for measurement error, should be employed.
In recent years, a variety of new approaches have been developed for the analysis
of nonlinear structural equation models (for an overview, see Marsh et al., 2004; Schumacker
& Marcoulides, 1998), including - among others - the latent moderated structural equations
(LMS) approach (Klein, 2000; Klein & Moosbrugger, 2000) and its subsequent development
the quasi-maximum likelihood (QML) approach (Klein & Muthén, 2006), the LISREL
maximum likelihood approach of Jöreskog and Yang (1996), the centered constrained
approach (Algina & Moulder, 2001), the partially constrained approach (Wall & Amemiya,
2001), the two-step method of moments (2SMM) approach (Wall & Amemiya, 2000), and the
unconstrained approach (Marsh et al., 2004).
Based on the Kenny and Judd (1984) approach, LISREL requires that product variables be
formed as indicators for the latent product terms ξ₁ξ₂, ξ₁², and ξ₂². However, the multiplication
of measurement equations leads to nonlinear terms of the factor loadings and to complex error
terms (Jöreskog & Yang, 1996). These methods are thus extremely error-prone and limited to
models with few indicator variables.
To date, the only available methods especially developed for the analysis of nonlinear
structural equation models which do not require the formation of product variables are LMS
(implemented in Mplus; Muthén & Muthén, 2004) and QML. These methods are not affected
by problems resulting from low reliability of latent interaction and quadratic terms and should
be preferred over nonlinear regression (cf. Kelava et al., submitted).
2.2 Validity
The validity of observed variables is a problem often ignored in multiple regression
analyses. Validity refers to whether an observed variable is a valid representation of the latent
construct that it is intended to measure (construct validity) and/or whether predictor variables
significantly contribute to known and accepted standard measures or criteria (criterion
validity). Generalization of results from the sample to the population is often not justified
when each construct in the regression equation is measured by only one observed variable and
only linear terms are included in the regression equation in order to predict a criterion.
Provided with multiple indicators of each construct, structural equation methods supply a
much stronger basis for evaluating the underlying factor structure than multiple regression, by
relating multiple indicators to their factors, controlling for measurement error, increasing
power, and providing more defensible interpretations of the nonlinear effects (cf. Marsh et al.,
2004). Generally speaking, compared to the use of methods based on observed variables, both
latent variable modeling and nonlinear latent variable modeling prove advantageous in many
ways. These techniques are, however, more complicated to conduct and are accompanied by
further problems of which the researcher should be aware (see below).
2.3 Multicollinearity
Multicollinearity occurs when intercorrelations among predictor variables are so high that
certain mathematical operations are either impossible or unstable. The effects of
multicollinearity in linear multiple regression models manifest primarily in affecting the
parameter estimates and their standard errors. When predictor variables are uncorrelated, the
value of a parameter estimate remains unchanged regardless of which other predictor variables are
included in the regression equation. When predictors are correlated, however, the value of a
regression coefficient depends on which other variables are included in or excluded from the
model. Thus, when multicollinearity is present, a regression coefficient does not simply
reflect an inherent effect of the particular predictor variable on the criterion variable but rather
a partial effect. On these grounds, estimated regression coefficients may vary widely from one
data set to another.
The problem of multicollinearity is exacerbated in nonlinear regression models. Multiple
regression analyses with nonlinear terms are hindered not only by multicollinearity between
the predictor variables, but also by multicollinearity between the predictor variables and the
nonlinear terms, as well as by multicollinearity between the various nonlinear terms.
When multicollinearity between the independent variables including nonlinear terms, e.g.,
interaction and quadratic terms, is present, the observed interaction or quadratic effect may be
spurious; that is, the coefficient of the nonlinear term in the regression model may be
significant even when there is no true interaction or no true quadratic effect (Ganzach, 1997).
For example, as the correlation between X and Z increases, correlations between XZ and X²,
between XZ and Z², and between X² and Z² increase in parallel. This results in an overlap
between the variance explained by XZ and the variance explained by X² or Z² (cf. Busemeyer
& Jones, 1983).
There are two types of multicollinearity among predictors and nonlinear terms in a
nonlinear regression model: nonessential and essential multicollinearity (cf. Cohen et al.,
2003).
Nonessential multicollinearity exists merely due to the scaling of the linear predictor
variables and can be avoided by centering these variables. This can be seen in Equation (13)
for the covariation between a predictor X and an interaction term XZ derived by Bohrnstedt
and Goldberger (1969; see also Evans, 1991) and for the covariation between X and its
squared term X² in Equation (14). Whenever uncentered predictor variables X and Z (with
nonzero means X̄ and Z̄) are used and product variables of these predictors (for example, XZ,
X², and Z²) are computed, these product terms will be highly correlated with the original
linear predictors X and Z.
Cov(X, XZ) = Z̄ · Var(X) + X̄ · Cov(X, Z)        (13)

Cov(X, X²) = X̄ · Var(X) + X̄ · Cov(X, X) = 2X̄ · Var(X)        (14)
Correlations of the predictor variables with the nonlinear terms also depend on the
variances of these variables (Equations 15 and 16). A small change in the position of the zero
point can thus result in considerable changes in the sign and magnitude of the correlation
coefficient (Evans, 1991).
Var(XZ) = X̄² · Var(Z) + Z̄² · Var(X) + 2X̄Z̄ · Cov(X, Z) + Var(X) · Var(Z) + [Cov(X, Z)]²        (15)

Var(X²) = 4X̄² · Var(X) + 2[Var(X)]²        (16)
Centering predictor variables is therefore a convenient method for reducing nonessential
multicollinearity. Centering involves a linear transformation of the predictor variables X to Xc
and Z to Zc by subtracting the mean value from each score, i.e. converting the scores of the
predictor variables into deviation form. Once the linear predictors have been centered, the
nonlinear terms have to be formed from the centered variables; the covariance between the
centered predictor variables and their products is then equal to zero (Equation 17).

Cov(Xc, XcZc) = 0 · Var(Xc) + 0 · Cov(Xc, Zc) = 0        (17)
Essential multicollinearity results from any asymmetry (skewness) in the distribution of the
predictor variable. The variables Xc and Xc², for example, will therefore still be correlated,
although to a lesser extent than the uncentered variables X and X². If, however, the
predictor variables are normally distributed or at least symmetric, centering will remove
essential multicollinearity with all even higher-order terms so that, for instance, the centered
predictor variable Xc and its squared variable Xc² are uncorrelated.
While regression coefficients change according to whether variables are centered vs. not
centered, the squared multiple correlation obtained from ordinary least squares estimation is
invariant under linear transformation of the variables. Therefore, information regarding
nonlinear effects is not lost when variables are centered.
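The effect of centering on nonessential multicollinearity is easy to verify by simulation. The sketch below (our own illustration with arbitrarily chosen means and variances) checks Equation (13) for uncentered normal predictors and shows that, after centering, the covariances of the predictors with the product and quadratic terms formed from the centered scores are essentially zero, as stated in Equation (17).

import numpy as np

rng = np.random.default_rng(7)
n = 200_000

# Uncentered, correlated, normal predictors with nonzero means
x = rng.normal(3.0, 1.0, n)
z = 2.0 + 0.5 * (x - 3.0) + rng.normal(0.0, 1.0, n)

cov = lambda a, b: np.cov(a, b)[0, 1]

# Equation (13): Cov(X, XZ) = mean(Z) * Var(X) + mean(X) * Cov(X, Z)
print(cov(x, x * z), z.mean() * x.var(ddof=1) + x.mean() * cov(x, z))

# After centering, the covariance of the predictors with the nonlinear terms
# (formed from the centered scores) is essentially zero for symmetric predictors
xc, zc = x - x.mean(), z - z.mean()
print(cov(xc, xc * zc), cov(xc, xc**2))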
In structural equation modeling, the problems of nonessential multicollinearity are usually
more severe (cf. Kelava et al., submitted). Even if the latent predictors are centered,
correlations between the latent interaction and quadratic terms are higher compared to a
regression analysis because measurement errors are taken into account. Nevertheless,
removing nonessential multicollinearity by centering the predictor variables is also highly
recommended for structural equation modeling. Methods for the analysis of latent models
with multiple nonlinear terms should demonstrate that they are able to deal with essential and
nonessential multicollinearity inherent in these models.
2.4 Dichotomization of continuous variables
Because of the problems that may arise in the analysis of nonlinear regression models,
dichotomization of continuous variables is sometimes considered an alternative. Researchers
familiar with testing interactions between categorical variables in the context of analysis of
variance (ANOVA) might be tempted to analyze the interaction of continuous predictor
variables by dichotomizing these variables into two categories,
i.e., by forming median splits on both predictor variables. As will be shown, this
dichotomization strategy is problematic and cannot be recommended (cf. Whisman &
McClelland, 2005).
A frequently cited challenge to the detection of moderator effects with dichotomized
variables refers to the ensuing decrease in the measured relationships between variables. For
example, when a continuous normally distributed predictor is dichotomized at the median,
yielding high and low groups on this variable, its squared correlation with a normally
distributed criterion variable is reduced to 64% of the original squared correlation (Cohen, 1983). The
consequence is a loss in power equivalent to reducing the original sample size by
approximately one third. Humphreys (1978, p. 874) emphasized the loss of information
regarding individual differences and the biases in estimates of effects and concluded that
ANOVA methods are inappropriate, misleading and unnecessary when predictor variables are
continuous. A severe problem results from the necessary dependence of median splits on the
investigated samples. Researchers studying the very same phenomenon may obtain
completely different results simply because the medians happened to vary across their
respective samples, causing the variables to be split at different points.
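The attenuation caused by a median split is readily demonstrated by simulation. The sketch below (our own illustration; the population correlation of .50 is an arbitrary choice) dichotomizes a normally distributed predictor at its sample median and compares squared correlations with a normally distributed criterion; the ratio comes out near the 64% figure cited above.

import numpy as np

rng = np.random.default_rng(42)
n = 100_000
rho = 0.50

# Bivariate normal predictor and criterion
x = rng.normal(size=n)
y = rho * x + np.sqrt(1 - rho**2) * rng.normal(size=n)

# Median split of the predictor into "low" and "high" groups
x_dich = (x > np.median(x)).astype(float)

r_cont = np.corrcoef(x, y)[0, 1]
r_dich = np.corrcoef(x_dich, y)[0, 1]
print(r_cont**2, r_dich**2, r_dich**2 / r_cont**2)   # the ratio is about .64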
According to Cohen et al. (2003, p. 256), the median split strategy evolved because
methods for probing interactions in ANOVA were fully developed long before such methods
achieved the same status in multiple regression analysis. Due to the number of deleterious
effects, median splits are to be avoided.
A further ill-advised strategy in testing for interaction effects is the use of subgroup
analysis. Rather than dichotomizing continuous variables into two categories, the sample is
split at the median or some other point on the moderator variable, separate regression analyses
are performed for each subgroup, and the regression coefficients are compared. This technique is
also associated with several limitations (cf. Coulton & Chow, 1992).
First, information is lost when continuous variables are split. The cut-off point selected in
splitting the moderator variable can affect the results, leading to different conclusions
according to where it is drawn (cf. Cronbach & Snow, 1977). If there is no evidence for
a bimodal distribution, dichotomizing a continuous variable is statistically inadvisable.
Second, simulation studies clearly show that dichotomization causes moderate to
substantial decreases in measurement reliability. This loss of reliable information through
categorization attenuates correlations involving dichotomized variables (MacCallum, Zhang,
Preacher, & Rucker, 2002).
Third, in many examples of subgroup analysis multiple correlation coefficients and not
slopes are compared. Under these circumstances, it is possible that the formation of subgroups
within a categorized moderator variable will result in a smaller error term in one sample
compared to another, because the moderator variable is correlated with the error variable. In
this case, the explained variance may be larger because the error variance is smaller, and not
because the relationship between the independent and dependent variable is stronger.
The use of structural equation modeling with multiple indicators measuring latent
constructs enables the researcher to take measurement errors into account when analyzing
relations between latent variables. If moderator variables are categorical, multi-group
structural equation modeling of nested models provides an effective test of moderator effects
(Rigdon, Schumacker & Wothke, 1998). In such a case the invariance of structural
coefficients can be tested in multi-sample comparisons of structural invariance, in which the
model with structural coefficients constrained to being invariant across the multiple groups is
compared to the corresponding model in which the respective parameters are unconstrained.
The evaluation of a moderator effect requires that the model difference test be significant. As
the test statistic of each of the nested models follows a χ² distribution, the difference in χ²
values between two nested models is also χ² distributed (Steiger, Shapiro, & Browne, 1985),
and the number of degrees of freedom for the difference is equal to the difference in degrees
of freedom for the two models. If invariance testing reveals that the structural coefficient of
interest is statistically different across the groups (i.e. a model with equality constraints on the
particular coefficient shows a significantly worse model fit than the model without such
constraints), the hypothesis of no interaction effect should be discarded.
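Once the fit statistics of the constrained and unconstrained multi-group models are available, the χ² difference test described above takes only a few lines. The following Python sketch uses hypothetical fit values (not results from this paper) and scipy.stats.chi2.

from scipy.stats import chi2

# Hypothetical fit statistics of two nested multi-group models
chi2_constrained, df_constrained = 85.4, 40     # structural coefficient constrained equal
chi2_unconstrained, df_unconstrained = 78.1, 39 # coefficient freely estimated per group

delta_chi2 = chi2_constrained - chi2_unconstrained
delta_df = df_constrained - df_unconstrained
p_value = chi2.sf(delta_chi2, delta_df)

# A significant difference indicates that the coefficient differs across groups,
# i.e., evidence for a moderator effect of the categorical variable
print(delta_chi2, delta_df, p_value)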
However, if the moderator variable is not a categorical variable but rather a continuous
indicator variable to be dichotomized, the same problems arise as explained in the context of
regression analysis above. This practice once again entails a loss of information and may lead
to a reduction in power when detecting moderator effects, or to spurious interaction effects
(MacCallum et al., 2002). Furthermore, selecting the variable to be used for the creation of
subgroups proves problematic when a latent construct is measured by several indicators. In
this case it is possible to choose the most valid or most reliable indicator of the latent
construct, but nevertheless some information will be lost by ignoring the other indicators. A
further potential problem is the non-convergence of solutions or estimation problems such as
the occurrence of negative variances.
Thus, the practice of dichotomizing originally continuous variables in order to simplify the
testing of moderator effects is to be discouraged. The methodological literature clearly and
conclusively shows that dichotomization of quantitative measures has substantial negative
consequences and that the use of regression or structural equation methods is preferable in the
case of continuous variables (e.g., MacCallum et al., 2002; West, Aiken, & Krull, 1996;
Whisman & McClelland, 2005).
3 Problems associated with nonlinear structural equation
modeling
3.1 Distribution of latent product terms
It is well-known that when each independent variable is a latent variable inferred from
multiple indicators, structural equation modeling provides many advantages over the use of
analyses based on observed variables. Nevertheless, despite extensive use of this method for
the purposes of estimating linear relations among latent variables, the appropriate methods for
testing multiple nonlinear latent effects remain a subject of ongoing research (cf. Klein &
Muthén, 2006; Marsh et al. 2004; Schermelleh-Engel et al., 1998; Kelava et al., 2006). A
major challenge faced by researchers testing latent nonlinear effects concerns the multivariate
non-normality of the data.
Let us consider a full latent nonlinear model involving a latent interaction term and two
latent quadratic terms in the structural equation. Even if all indicators of the latent predictor
variables and the latent variables ξ₁ and ξ₂ themselves are normally distributed, the
distributions of the nonlinear terms ξ₁ξ₂, ξ₁², and ξ₂² are most certainly not normal.
Furthermore, the latent criterion variable η will also be non-normally distributed.
Figure 4 illustrates the results of a small simulation study carried out by the current authors,
with a sample size of N = 2000 and the structural equation η = -.40 + .40ξ₁ + .40ξ₂ + .20ξ₁ξ₂ +
.15ξ₁² + .15ξ₂² + ζ. The distribution of the normally distributed variable ξ₁ is shown in
comparison to the non-normal distributions of the nonlinear terms ξ₁ξ₂ and ξ₁² and of η; each
distribution is depicted in comparison to the density of a normally distributed variable with
the same expectation value and variance. The predictor variables ξ₁ and ξ₂ used for
calculating the product terms are standardized and follow a bivariate normal distribution with
a correlation of φ₂₁ = .50. As Figure 4 shows, all variables derived from the normally
distributed variables ξ₁ and ξ₂ are clearly non-normal. The quadratic variables are the most
skewed because squared scores are bounded below by zero, so that only nonnegative values
occur.
-----------------------------------------------------------------------------------------------------------------
Insert Figure 4 about here
-----------------------------------------------------------------------------------------------------------------
The distributions of the latent nonlinear terms in Figure 4 show considerable skewness and
kurtosis, indicating their deviation from normality. The deviation from normality becomes
more extreme as the covariance between ξ₁ and ξ₂ increases. Even if all indicators of the
latent exogenous variables and the latent exogenous variables themselves are normally
distributed, the distributions of the latent interaction term (cf. Moosbrugger, Schermelleh-
Engel, & Klein, 1997), the latent quadratic terms, and the latent endogenous variable are
definitely non-normal. Since the structural equation also includes nonlinear components, the
endogenous variable η also cannot be normally distributed. The degree of non-normality of
the distribution of η depends on the non-normality of ξ₁ξ₂, ξ₁², and ξ₂², the size of the
nonlinear effects ω₁₂, ω₁₁, and ω₂₂, and the variance of the disturbance term ζ in relation to the
variance of η.
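The non-normality just described can be reproduced with a short simulation. The sketch below (our own re-implementation of the data-generating equation given above; the disturbance variance of .25 is an assumed value, as it is not reported in the text) draws standardized bivariate normal ξ₁ and ξ₂ with a correlation of .50 and prints skewness and excess kurtosis of ξ₁ξ₂, ξ₁², and η.

import numpy as np
from scipy.stats import skew, kurtosis

rng = np.random.default_rng(0)
n = 2000

# Standardized bivariate normal predictors with correlation .50
phi = np.array([[1.0, 0.5], [0.5, 1.0]])
xi1, xi2 = rng.multivariate_normal([0.0, 0.0], phi, size=n).T

# Structural equation from the text; Var(zeta) = 0.25 is an assumed value here
zeta = rng.normal(0.0, 0.5, n)
eta = -0.40 + 0.40 * xi1 + 0.40 * xi2 + 0.20 * xi1 * xi2 \
      + 0.15 * xi1**2 + 0.15 * xi2**2 + zeta

for name, v in [("xi1", xi1), ("xi1*xi2", xi1 * xi2), ("xi1^2", xi1**2), ("eta", eta)]:
    print(f"{name:8s} skew = {skew(v): .2f}  excess kurtosis = {kurtosis(v): .2f}")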
Methods developed for the analysis of latent nonlinear effects can be sub-classified into
methods for the analysis of homoskedastic models and methods for the analysis of
heteroskedastic models (cf. Klein, 2006). A model which assumes multivariate normality of
the predictor and criterion variables is a homoskedastic model in the sense that the
distribution of the criterion variable conditional on the predictor variables is again normal,
with constant variance. In the case of nonlinear models, however, a homoskedastic model
does not adequately represent the relationships among the variables. In this case the
distribution of the criterion variable conditional on the predictor variables is heteroskedastic,
with a variance that depends on the levels of the predictor variables.
Most variants of maximum likelihood estimation methods according to the Kenny-Judd
approach (1984) can be subsumed under homoskedastic methods. These methods are the
LISREL maximum likelihood approach of Jöreskog and Yang (1996), the centered
constrained approach (Algina & Moulder, 2001), the partially constrained approach (Wall &
Amemiya, 2001), the two-step method of moments (2SMM) approach (Wall & Amemiya,
2000), and the unconstrained approach (Marsh et al., 2004). Using these applications,
multivariate non-normality poses a severe problem which increases with each product term
added to the structural equation, because these methods require the formation of product
indicators for the latent product terms. In a series of simulation studies, Marsh et al. (2004) compared three types of
product indicators formed from three indicators of ξ₁ and three indicators of ξ₂: all possible
products (X1X4, X1X5, X1X6, X2X4, X2X5, X2X6, X3X4, X3X5, X3X6), matched pair products (X1X4,
X2X5, and X3X6), and one pair (X1X4). Their results demonstrated that the precision of
estimation for matched pairs was systematically better than for other product types.
Nevertheless, using manifest product terms will exacerbate the problem of non-normality in
latent models.
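As an illustration of the indicator-construction step just described (our own sketch; the data are simulated placeholders, and the indicator assignment follows the matched-pair scheme of Marsh et al., 2004), product indicators can be formed as follows.

import numpy as np

rng = np.random.default_rng(3)
n = 400

# Placeholder indicator data: X1-X3 measure xi1, X4-X6 measure xi2 (simulated here)
X = rng.normal(size=(n, 6))
X1, X2, X3, X4, X5, X6 = X.T

# Matched-pair product indicators for the latent interaction term
matched_pairs = np.column_stack([X1 * X4, X2 * X5, X3 * X6])

# All nine possible product indicators, for comparison
all_products = np.column_stack([xi * xj for xi in (X1, X2, X3) for xj in (X4, X5, X6)])
print(matched_pairs.shape, all_products.shape)   # (400, 3) (400, 9)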
In estimating nonlinear models, the non-normal distribution has two possible consequences:
First, if an estimation procedure is used under the assumption of normally distributed
indicator variables, its robustness for nonlinear models should be demonstrated. The potential
bias of the estimated standard errors can become critical in the method’s performance, when it
comes to inferential statistics. Second, if estimation methods are used that are asymptotically
distribution-free, they must be tested with regard to their power and efficiency. In failing to
make distributional assumptions, these methods ignore the specific stochastic structure
implied by product terms.
As simulation studies have clearly shown, using the maximum likelihood estimation
method of the LISREL program with non-normal product terms leads to serious
underestimation of standard errors and biased chi-square values even for models with only
one latent interaction term (Marsh et al. 2004; Jöreskog & Yang, 1996; Schermelleh-Engel,
Klein & Moosbrugger, 1998). Asymptotically distribution-free methods, such as two-stage
least squares (2SLS; Bollen, 1995) and weighted least squares based on the augmented
moment matrix (WLSA; Jöreskog & Yang, 1996) do not constitute feasible alternatives. They
carry the disadvantage of low power and low efficiency and further require large sample sizes
in order to provide estimates of satisfactory precision for nonlinear models.
The only methods especially developed for the analysis of nonlinear structural equation
models which explicitly take heteroskedasticity into account are LMS (Klein, 2000; Klein &
Moosbrugger, 2000) and QML (Klein & Muthén, 2006). In contrast to the LISREL-type
approaches, no product indicators are required because the distributions of the indicator
variables are approximated by a finite mixture distribution.
Although simulation studies for latent interaction models have shown that LMS and QML
estimators are consistent, asymptotically unbiased, asymptotically efficient, and
asymptotically normally distributed (cf. Marsh et al. 2004; Schermelleh-Engel, Klein, &
Moosbrugger, 1998), it is not yet known, whether these methods perform equally well when
larger models including several nonlinear terms are analyzed.
3.2 Unbiasedness of nonlinear parameter estimates under multicollinearity
LMS and QML are the only methods that are able to adequately deal with the nonlinearity
induced by interaction and quadratic terms. These methods should also be able to sufficiently
differentiate between interaction and quadratic effects even when multicollinearity exists.
In a simulation study, we investigated the performance of QML and LMS for a sample size
of N = 400 (500 replications). The following parameter values were selected for the
population model: α = -.324, γ₁ = γ₂ = .400, ω₁₂ = .200, ω₁₁ = ω₂₂ = .112, φ₁₁ = φ₂₂ = 1.000,
and φ₂₁ = .500. The predictor variables ξ₁, ξ₂ and the criterion variable η were each measured
by three indicator variables with reliabilities of .75. The given selection of nonlinear effects
results in a model in which 5% of the variance of η is explained by the interaction effect and
2.5% by each quadratic effect. Hence, the study tested the performance of QML and LMS for
a nonlinear model with reasonably large nonlinear effects compared to those found in
empirical studies. The data for the latent predictor variables were generated according to the
normal distribution.
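For readers who wish to reproduce this design, the sketch below generates data consistent with the population model described above (our own illustration; the disturbance variance and the unit loadings are assumed values not reported here). With unit loadings, setting each indicator's error variance to one third of its latent variable's variance yields the stated indicator reliability of .75.

import numpy as np

rng = np.random.default_rng(123)
n = 400

# Latent predictors: standardized, correlated (phi_21 = .50)
phi = np.array([[1.0, 0.5], [0.5, 1.0]])
xi1, xi2 = rng.multivariate_normal([0.0, 0.0], phi, size=n).T

# Structural model with the population values given in the text
# (the disturbance variance is an assumed value chosen for illustration)
zeta = rng.normal(0.0, 0.5, n)
eta = -0.324 + 0.400 * xi1 + 0.400 * xi2 + 0.200 * xi1 * xi2 \
      + 0.112 * xi1**2 + 0.112 * xi2**2 + zeta

# Three indicators per latent variable with unit loadings; error variance equal to
# one third of the latent variance gives a reliability of 1 / (1 + 1/3) = .75
def indicators(latent, k=3):
    theta = latent.var(ddof=1) / 3.0
    return np.column_stack([latent + rng.normal(0.0, np.sqrt(theta), latent.size)
                            for _ in range(k)])

X1_3, X4_6, Y1_3 = indicators(xi1), indicators(xi2), indicators(eta)
print(X1_3.shape, X4_6.shape, Y1_3.shape)   # (400, 3) each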
----------------------------------------------------------------------------------------------------------------
Insert Table 1 about here
-----------------------------------------------------------------------------------------------------------------
Estimates of the nonlinear effects are presented in Table 1. As the results of the simulation
study show, the parameter estimates of both QML and LMS for a sample size of N = 400 are
unbiased and the standard errors are estimated correctly. Minor differences between the two
methods include a slightly increased bias for QML parameter estimators and marginally more
efficient LMS estimators (not shown in Table 1; see also Klein & Muthén, 2006).
All in all, this small simulation study shows that LMS and QML are well suited for the
analysis of multiple nonlinear effects and that they are adequately able to differentiate
between all three nonlinear effects when the predictor variables are correlated.
4 Conclusions
Several issues posing a challenge for researchers investigating nonlinear effects were
presented and several recommendations for dealing with these challenges provided in the
current paper.
Although imperfect, multiple regression analysis seems to be the preferred statistical
method when it comes to detecting nonlinear effects in models with continuous predictor
variables. This method has previously been primarily evaluated in the context of moderator
models, although the treatment of more complex models with multiple nonlinear terms, i.e.
one interaction term and one or more quadratic terms, is plagued by even greater problems
than those affecting moderator models.
As we have shown, the reliability of the interaction term in multiple regression analysis is
dependent on the reliabilities of both predictor variables and on their correlation, while the
reliability of the quadratic terms depends only on the reliability of the predictor variable used
to compute the quadratic variable (cf. Kelava et al., submitted). The reliability of quadratic
terms is generally lower than the reliability of an interaction term and both reliabilities are
always lower than the reliability of the predictor variables used to compute the product
variables. Therefore, measurement models that take the unreliability of linear and nonlinear
terms into account are required.
As is well-known from methodological literature, dichotomization of quantitative variables
has substantial negative consequences so that it is preferable to use regression methods in the
case of continuous variables (Coulton & Chow, 1992; Humphreys, 1978; MacCallum et al.,
2002). Multi-sample analysis should not be considered an alternative, even though at least one
predictor and the criterion may be measured reliably using multiple indicators. The loss of
information regarding individual differences, the loss of power, and the biases in estimates of
effects all indicate that multi-sample analysis is an inappropriate, misleading, and unnecessary
method when independent variables are continuous.
Multicollinearity in nonlinear regression models is to a large extent a result of scaling, and
can be lessened by centering or standardizing the predictor variables. Multiple regression
analysis with nonlinear terms contends not only with multicollinearity between the predictor
variables and between the different nonlinear terms, but also with multicollinearity between
the predictor variables and the nonlinear terms when variables are non-normally distributed
(Cohen et al., 2003). Nevertheless, removing nonessential multicollinearity by centering the
predictor variables is also highly recommended for structural equation modeling, when
methods are based on the Kenny-Judd model and product terms are to be defined.
To date it remains somewhat unclear whether methods developed for the analysis of latent
variables are able to adequately deal with the problem of multicollinearity resulting from
multiple nonlinear terms. Including several nonlinear terms in the structural equation results
in non-normal distributions of the latent interaction and quadratic terms as well as the latent
criterion variable, even if the latent predictors follow normal distributions. The multivariate
non-normal distribution has two possible consequences: First, if an estimation procedure is
used under the assumption of normally distributed indicator variables, e.g., LISREL-ML, the
robustness for nonlinear models should be thoroughly investigated. For inferential statistics,
the bias of the estimated standard errors can become critical in the method’s performance.
Second, asymptotically distribution-free estimation procedures (e.g., two-stage least squares;
Bollen, 1995, 1996) must be tested with regard to their power and efficiency. However,
simulation studies have already provided evidence that the distribution-free methods require
very large sample sizes when latent moderator models are to be analyzed.
Simulation studies for latent interaction models have previously shown that LMS and
QML estimators are consistent, asymptotically unbiased, asymptotically efficient, and
asymptotically normally distributed. As simulation studies have also demonstrated, QML
(Marsh et al., 2004) and LMS (Klein & Moosbrugger, 2000; Schermelleh-Engel, Klein, &
Moosbrugger, 1998) outperform other methods currently available with respect to efficiency.
First results of a simulation study on the performance of LMS for the analysis of more
complex nonlinear models prove rather promising. They indicate that LMS is well-suited for
the analysis of nonlinear latent models including one or more interaction terms (Klein &
Muthén, 2006) or an interaction term and two quadratic terms (cf. Kelava, Moosbrugger,
Dimitruk, & Schermelleh-Engel, 2006).
The QML method was developed for the efficient and computationally feasible estimation
of multiple nonlinear effects in structural equation models with quadratic forms. As Klein and
Muthén (2006) illustrated in their simulation study, QML leads to more efficient estimates
than LMS when a model with three latent interaction effects is subject to analysis, and the
confidence intervals based on the standard error estimates have no substantial bias.
Furthermore, QML appears to be more robust than LMS under violation of the normality
assumption.
Just as important, the QML and LMS estimation of standard errors showed no substantial
bias, which supports precise significance testing of multiple nonlinear effects. Both QML and
LMS seem to represent highly efficient, computationally feasible, and practically adequate
approaches, which are of particular relevance when rather complex structural equation models
with several nonlinear effects are to be analyzed.
Nevertheless, simulation studies are needed in order to analyze more complex models with
one or more interaction effects and one or more quadratic effects. Furthermore, the non-normality of
predictor variables should also be systematically varied in order to investigate whether
methodological problems may be aggravated as compared to models with normally
distributed predictor variables. As first results from a simulation study show, QML appears to
perform better than LMS when the normality assumption for the predictor variables is
violated (Klein & Muthén, 2006). Simulation studies are also necessary for the investigation
of the behavior of those methods based on the Kenny-Judd model regarding the detection of
multiple nonlinear effects when essential and nonessential multicollinearity is high and the
assumption of multivariate normality is violated.
A further challenge and a problem still unsolved is the evaluation of model fit of nonlinear
models. To date, the model fit for nonlinear structural equation models cannot be reliably
evaluated. The χ² test statistic used for hypothesis testing in evaluating the appropriateness of
a linear structural equation model cannot be employed, due to the violation of distributional
assumptions. Recently, Klein (2006) developed a likelihood ratio test implemented in QML.
In a small simulation study, this test did not show substantial inflation under non-normal
conditions. Further simulation studies should investigate the performance of this test
under varying conditions.
5 References
Aguinis, H. (1995). Statistical power with moderated multiple regression in management
research. Journal of Management, 21, 1141-1158.
Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions.
Thousand Oaks, CA: Sage.
Algina, J., & Moulder, B. C. (2001). A note on estimating the Jöreskog-Yang model for latent
variable interaction using LISREL 8.3. Structural Equation Modeling, 8, 40-52.
Bohrnstedt, G. W., & Goldberger, A. S. (1969). On the exact covariance of products of
random variables. Journal of the American Statistical Association, 64, 1439-1442.
Bollen, K. A. (1995). Structural equation models that are non-linear in latent variables. In P.
V. Marsden (Ed.), Sociological methodology 1995 (Vol. 25). Washington, DC: American
Sociological Association.
Bollen, K. A. (1996). An alternative two-stage least squares (2SLS) estimator for latent
variable equations. Psychometrika, 61, 109-121.
Busemeyer, J. R., & Jones, L. E. (1983). Analysis of multiplicative combination rules when
the causal variables are measured with error. Psychological Bulletin, 93, 549-562.
Cohen, J. (1983). The cost of dichotomization. Applied Psychological Measurement, 7, 249–
254.
Cohen, J., Cohen, P., West, S. G. & Aiken, L. S. (2003). Applied multiple regression/
correlation analysis for the behavioral sciences. Mahwah, NJ: Lawrence Erlbaum.
Coulton, C., & Chow, J. (1992). Interaction effects in multiple regression. Journal of Social
Service Research, 16 (1/2), 179-199.
Cronbach, L. J. & Snow, R. E. (1977). Aptitudes and instructional methods: A handbook for
research on interactions. New York: Irvington.
Evans, M. G. (1991). The problem of analyzing multiplicative composites: Interactions
Revisited. American Psychologist, 46, 6-15.
Ganzach, Y. (1997). Misleading interaction and curvilinear terms. Psychological Methods, 3,
235-247.
Humphreys, L. G. (1978). Doing research the hard way: Substituting analysis of variance for
a problem in correlational analysis. Journal of Educational Psychology, 70, 873-876.
Jöreskog, K. G. & Yang, F. (1996). Non-linear structural equation models: the Kenny-Judd
model with interaction effects. In G. Marcoulides & R. Schumacker (Eds.), Advanced
structural equation modeling (pp. 57-87). Mahwah, NJ: Lawrence Erlbaum.
Kelava, A., Moosbrugger, H., Dimitruk, P., & Schermelleh-Engel, K. (2006). Multicollinearity
and the performance of Ping's two-step approach and the LMS approach in detecting
nonlinear effects in structural equation models. Manuscript submitted for publication.
Kenny, D., & Judd, C. M. (1984). Estimating the nonlinear and interactive effects of latent
variables. Psychological Bulletin, 96, 201-210.
Klein, A. (2000). Moderatormodelle. Verfahren zur Analyse von Moderatoreffekten in Struk-
turgleichungsmodellen (Moderator models. Method for the analysis of moderator effects in
structural equation models). Hamburg: Dr. Kovač.
Klein, A. (2006). A saturated model and evaluation of model fit for multivariate
heteroscedastic models. Manuscript submitted for publication.
Klein, A. & Moosbrugger, H. (2000). Maximum likelihood estimation of latent interaction
effects with the LMS method. Psychometrika, 65, 457-474.
Klein, A. & Muthén, B. O. (2006). Quasi maximum likelihood estimation of structural
equation models with multiple interaction and quadratic effects. Methods of Behavioral
Research (in press).
Lubinski, D. & Humphreys, L.G. (1990). Assessing spurious “moderator effects”: Illustrated
substantively with the hypothesized (“synergistic”) relation between spatial and
mathematical ability. Psychological Bulletin, 107, 385-393.
MacCallum, R. C., & Mar, C. M. (1995). Distinguishing between moderator and quadratic
effects in multiple regression. Psychological Bulletin, 118, 405-421.
MacCallum, R. C., Zhang, S., Preacher, K. J., & Rucker, D. D. (2002). On the practice of
dichotomization of quantitative variables. Psychological Methods, 7, 19–40.
Marsh, H. W., Wen, Z., & Hau, K. T. (2004). Structural equation models of latent interactions:
Evaluation of alternative estimation strategies and indicator construction. Psychological
Methods, 9, 275-300.
Moosbrugger, H., Schermelleh-Engel, K., & Klein, A. (1997). Methodological problems of
estimating latent interaction effects. Methods of Psychological Research Online, 2, 95-111.
Muthén, L. K., & Muthén, B.O. (2004). Mplus: The comprehensive modeling program for
applied researchers. User’s guide (3rd ed.). Los Angeles: Muthén & Muthén.
Rigdon, E. E., Schumacker, R. E., & Wothke, W. (1998). A comparative review of interaction
and nonlinear modeling. In R. E. Schumacker & G. A. Marcoulides (Eds.), Interaction and
nonlinear effects in structural equation modeling (pp. 1-16). Mahwah, NJ: Lawrence Erlbaum
Associates.
Schermelleh-Engel, K., Klein, A. & Moosbrugger, H. (1998). Estimating nonlinear effects
using a Latent Moderated Structural Equations Approach. In R. E. Schumacker & G. A.
Marcoulides (Eds.), Interaction and nonlinear effects in structural equation modeling
(pp. 203-238). Mahwah, NJ: Lawrence Erlbaum Associates.
Schumacker, R. E., & Marcoulides, G. A. (Eds.) (1998). Interaction and nonlinear effects in
structural equation modeling. Mahwah, NJ: Lawrence Erlbaum Associates.
Steiger, J. H., Shapiro, A., & Browne, M. W. (1985). On the multivariate asymptotic distribution
of sequential chi-square statistics. Psychometrika, 50, 253-263.
Wall, M. M. & Amemiya, Y. (2000). Estimation for polynomial structural equation models.
Journal of the American Statistical Association, 95, 929-940.
Wall, M. M., & Amemiya, Y. (2001). Generalized appended product indicator procedure for
nonlinear structural equation analysis. Journal of Educational and Behavioral Statistics, 26,
1-29.
West, S. G., Aiken, L. S., & Krull, J. L. (1996). Experimental personality designs: Analyzing
categorical by continuous variable interactions. Journal of Personality, 64, 1–48.
Whisman, M. A., & McClelland, G. H. (2005). Designing, testing, and interpreting
interactions and moderator effects in family research. Journal of Family Psychology, 19,
111–120.
Yang-Jonsson, F. (1997). Non-linear structural equation models. Simulation studies of the
Kenny-Judd model. Stockholm: Gotab.
Table 1. Comparison of LMS and QML estimates of a nonlinear structural equation model
with large nonlinear effects (interaction effect ω₁₂ = .200, R² = 5%; quadratic effects ω₁₁ =
ω₂₂ = .112, R² = 2.5% each) and correlated predictor variables (φ₂₁ = .50)
                          LMS                                     QML
Parameter  True value    M      Bias     SD     SE    SE/SD      M      Bias     SD     SE    SE/SD
ω₁₁        .112          .111   -.45%    .044   .041  .936       .115   3.27%    .040   .041  1.031
ω₁₂        .200          .200    .05%    .066   .065  .983       .198  -1.07%    .069   .066   .960
ω₂₂        .112          .113   1.07%    .042   .041  .976       .112    .03%    .042   .041   .979
Note. M = mean parameter estimate, SD = standard deviation, SE = estimated standard error
FIGURE CAPTIONS
Figure 1. Nonlinear structural equation model with a latent interaction term ξ₁ξ₂ and two
quadratic terms ξ₁² and ξ₂².
Figure 2. Relation between the correlation of the predictors, Corr(X, Z), and the reliability of
the interaction term, Rel(XZ), for Rel(X) = Rel(Z) = .25, .50, .75, and 1.00
Figure 3. Relationship between the reliability of the predictor variable X and the reliability of
its quadratic term X².
Figure 4. Distribution of a latent predictor ξ₁, a moderator term ξ₁ξ₂, a quadratic term ξ₁² and
a latent criterion variable η compared to normally distributed variables with equal
expectation values and variances. Predictor variable ξ₁ and moderator variable ξ₂
are bivariate normal and correlated (φ₂₁ = .50).
[Figure 1: path diagram of the nonlinear structural equation model, showing the latent predictors ξ₁ and ξ₂ with indicators X1-X6, the latent criterion η with indicators Y1-Y3, the latent interaction term ξ₁ξ₂, and the quadratic terms ξ₁² and ξ₂²]
Figure 1. Nonlinear structural equation model with a latent interaction term ξ₁ξ₂ and two
quadratic terms ξ₁² and ξ₂².
[Figure 2: line plot of Rel(XZ) against Corr(X, Z) = .00, .25, .50, .75, with one line per reliability level Rel(X) = Rel(Z) = .25, .50, .75, and 1.00]
Figure 2. Relation between the correlation of the predictors, Corr(X, Z), and the reliability of
the interaction term, Rel(XZ), for Rel(X) = Rel(Z) = .25, .50, .75, and 1.00
[Figure 3: line plot of Rel(X²) against Rel(X), with plotted values .04, .16, .36, .64, and 1.00]
Figure 3. Relationship between the reliability of the predictor variable X and the reliability of
its quadratic term X².
[Figure 4: four histograms (N = 2000 each), each compared with a normal density of equal mean and variance:
ξ₁: Mean = -.01, Var(ξ₁) = 1.00, Skewness = .04, Kurtosis = .07
ξ₁ξ₂: Mean = .50, Var(ξ₁ξ₂) = 1.24, Skewness = 2.28, Kurtosis = 8.00
ξ₁²: Mean = 1.00, Var(ξ₁²) = 2.06, Skewness = 2.98, Kurtosis = 13.40
η: Mean = .00, Var(η) = 1.04, Skewness = 1.67, Kurtosis = 4.51]
Figure 4. Distribution of a latent predictor ξ₁, a moderator term ξ₁ξ₂, a quadratic term ξ₁² and
a latent criterion variable η compared to normally distributed variables with equal expectation
values and variances. Predictor variable ξ₁ and moderator variable ξ₂ are bivariate normal and
correlated (φ₂₁ = .50).