This article was downloaded by: [UNSW Library]
On: 27 March 2013, At: 21:20
Publisher: Routledge
Informa Ltd Registered in England and Wales Registered Number: 1072954
Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK
Multivariate Behavioral Research
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/hmbr20
On Structural Equation Model Equivalence
Tenko Raykov & Spiridon Penev
Version of record first published: 10 Jun 2010.
To cite this article: Tenko Raykov & Spiridon Penev (1999): On Structural Equation Model Equivalence, Multivariate Behavioral Research, 34:2, 199-244
To link to this article: http://dx.doi.org/10.1207/S15327906Mb340204
MULTIVARIATE BEHAVIORAL RESEARCH 199
Multivariate Behavioral Research, 34 (2), 199-244
Copyright © 1999, Lawrence Erlbaum Associates, Inc.
On Structural Equation Model Equivalence
Tenko Raykov
Fordham University
Spiridon Penev
University of New South Wales, Sydney
A necessary and sufficient condition for equivalence of structural equation models is presented. Compared to existing rules for equivalent model generation (Stelzl, 1986; Lee & Hershberger, 1990; Hershberger, 1994), it applies to a more general class of models, including models with parameter restrictions, models that may or may not fulfil the assumptions of the rules, and nonidentified models; it can also be used to show that two models are nonequivalent. The validity of the replacement rule by Lee and Hershberger, Stelzl’s rules, and Hershberger’s reverse indicator rule is implied by the present method. Its application for studying model equivalence or lack thereof is demonstrated on a series of empirical examples.
The problem of equivalent structural equation models has received considerable attention during the past decade or so (e.g., Bollen, 1989; Breckler, 1990; Hershberger, 1994; Jöreskog & Sörbom, 1993; Lee & Hershberger, 1990; Luijben, 1991; MacCallum, Wegener, Uchino, & Fabrigar, 1993; Raykov, 1997; Stelzl, 1986). Two or more models are defined as equivalent (Stelzl, 1986) if they reproduce the same set of covariance matrices when their parameters vary across their spaces. For essentially any model there exist potentially many equivalent models representing equally plausible, distinct means of description and explanation of the analyzed data (e.g., Breckler, 1990). They typically give rise to equal overall goodness-of-fit indices, such as chi-square values and related descriptive fit indices, degrees of freedom, and p-values, as well as covariance residuals (e.g., Jöreskog & Sörbom, 1993). Hence these models can be differentiated, if at all, by employing additional criteria. Such criteria are interpretability of parameters, meaningfulness of the model, design features (for example, those implied by experimental manipulation of key variables and/or longitudinal data collection), or characteristics of variables (MacCallum et al., 1993), as well as time precedence and mediating mechanism (Lee & Hershberger, 1990). In a multi-population context, special group equality restrictions can lead to different fit indices that allow statistical distinction between multiple-group versions of initial, single-group equivalent models (Raykov, 1997).
This research was partly supported by grants to T. Raykov from the Max Planck Society for Advancement of Science, the Australian Research Council, and the University of Melbourne. The Editor and two anonymous Referees have provided a number of suggestive and helpful comments on an earlier draft, which have substantially improved the article. We thank M. W. Browne and K. G. Jöreskog for valuable discussions on model equivalence. Address correspondence on this manuscript to Tenko Raykov, Department of Psychology, Fordham University, Bronx, NY 10458.
Equivalent models occur routinely in behavioral research and often in large numbers, as a recent study by MacCallum et al. (1993) exemplifies. Because they entail different and potentially incompatible, controversial or even opposite explanations of a studied phenomenon, they pose a serious theoretical challenge to the behavioral scientist that must be addressed. In particular, they represent a threat to the validity of substantive conclusions drawn from an entertained model. Thus, without explicit and careful consideration of the existence of equivalent models, any theoretical interpretation of results obtained with a given model is subject to question. Hence, unless the model equivalence issue is addressed, an originally considered model, regardless of how well it fits the data, remains only one possible means of explaining them (MacCallum et al., 1993). Therefore, theory development and construct validation using structural equation modeling (SEM) in the behavioral sciences depend on proper handling of the problem of equivalent models.
Major contributions to the understanding of this problem and its management were provided by Stelzl (1986), Lee & Hershberger (1990), and Hershberger (1994). They demonstrated that starting from a given model one can construct others equivalent to it using specifically developed rules. The distinctive feature of these rules is that frequently they can be readily employed in practice, in particular before the data are collected. This allows the researcher to generate alternative models before conducting his/her study, which should receive the same substantive attention, as rival means of explaining the phenomenon, as the originally considered model. However, while often easily utilized, these rules are limited to a class of models that does not cover all possibly interesting models of relationships between variables under investigation, as elaborated later in this article. Therefore a more general means of studying model equivalence, or lack thereof, will contribute to a more comprehensive treatment of this fundamental issue for applications of the SEM methodology.
The aim of the present article is to complement the existing rules by a method of studying model equivalence that has wider applicability and implies the validity of any of the rules. The approach can be used in all cases when a rule can be applied as well as when no rule can be utilized, and particularly: (a) to models with parameter restrictions; (b) to show that a pair of models are not equivalent, which can also be an important finding in behavioral research; (c) to demonstrate that considered models are equivalent only for certain values of some of their parameters and nonequivalent for others; and (d) to nonidentified models. The remainder of the article uses notation commonly employed in the SEM literature (e.g., Bollen, 1989). In it, $\theta$ denotes the vector of all parameters of a model $M$, $\Theta$ is its parameter space containing all possible values of $\theta$, and $\Sigma(\theta)$ is the covariance matrix implied by $M$ at $\theta$.
Model Parameter Transformation
Instrumental for this article is the concept of parameter transformation. For two models $M$ and $M'$, with parameter spaces $\Theta$ and $\Theta'$ respectively, it will be said that a transformation (mapping) $g$ has been defined on $\Theta$ with values in $\Theta'$, denoted $g: \Theta \to \Theta'$, if for each $\theta$ from $\Theta$ there exists an element $\theta'$ from $\Theta'$ such that $\theta$ is mapped into $\theta'$ by $g$, that is, $\theta' = g(\theta)$. The transformation is called surjective if for each $\theta'$ in $\Theta'$ there exists a $\theta$ in $\Theta$ such that $\theta$ is mapped into $\theta'$ by $g$. Thus, a surjective transformation covers all of $\Theta'$, that is, it is an “onto” mapping. Two models $M$ and $M'$ will be said to fulfil the $\Sigma$-condition if there exists a transformation $g: \Theta \to \Theta'$ with which $\Sigma(\theta) = \Sigma'[g(\theta)]$ holds true for their corresponding implied matrices $\Sigma$ and $\Sigma'$ for all $\theta$ from $\Theta$. (Note that the definition of the $\Sigma$-condition requires nothing from $g$ except its existence.) $M$ is called identified if it reproduces different covariance matrices at different parameter vectors, that is, if $\Sigma(\theta_1) = \Sigma(\theta_2)$ implies $\theta_1 = \theta_2$ for $\theta_1$ and $\theta_2$ from $\Theta$ (e.g., Jöreskog & Sörbom, 1993).
As a straightforward illustration of the parameter transformation concept, consider the following three models for a pair of covarying variables, $Y_1$ and $Y_2$ (e.g., Jöreskog & Sörbom, 1993, p. 251). They are reproduced in Figure 1 using widely followed path-diagrammatic notation, with a two-way arrow denoting covariance (e.g., Bentler, 1995). While Model A assumes that $Y_1$ plays an explanatory role for $Y_2$, Model B postulates the reverse relationship, and Model C only states that $Y_1$ and $Y_2$ are interrelated.
The parameters of Model A are: the variance of $Y_1$, denoted $\sigma_{Y_1}^2$; the variance $\sigma_{D_2}^2$ of the disturbance term $D_2$; and the regression coefficient $\beta_{21}$ associated with the path from $Y_1$ into $Y_2$. That is, the parameter vector of A is $\theta_A = (\sigma_{Y_1}^2, \sigma_{D_2}^2, \beta_{21})^t$ (‘$t$’ designates transposition in this article). The parameter vector of Model B is $\theta_B = (\sigma_{Y_2}^2, \sigma_{D_1}^2, \beta_{12})^t$, where $\sigma_{Y_2}^2$ is the variance of $Y_2$, $\sigma_{D_1}^2$ that of the disturbance term $D_1$, and $\beta_{12}$ is associated with the path from $Y_2$ into $Y_1$. The parameters of Model C are the variances $\sigma_{Y_1}^2$ and $\sigma_{Y_2}^2$ of $Y_1$ and $Y_2$, and their covariance $\sigma_{12}$. We note that the three models have the same number of parameters, namely three. (Throughout this article, as usual in the practice of behavioral research, all covariance matrices are assumed positive definite, and hence all variances positive; e.g., Mardia, Kent, & Bibby, 1979.)
Using the simple covariance rules in Equations 30 and 31 of Appendix 2 (e.g., Bollen, 1989), the covariance matrix implied by Model A, $\Sigma_A$, is directly obtained as (for symmetric matrices, elements above the main diagonal will not be presented in this article):

$$
\Sigma_A(\theta_A) =
\begin{bmatrix}
\sigma_{Y_1}^2 & \\
\beta_{21}\sigma_{Y_1}^2 & \beta_{21}^2\sigma_{Y_1}^2 + \sigma_{D_2}^2
\end{bmatrix}.
$$

Similarly, the covariance matrices reproduced by Models B and C are correspondingly

$$
\Sigma_B(\theta_B) =
\begin{bmatrix}
\beta_{12}^2\sigma_{Y_2}^2 + \sigma_{D_1}^2 & \\
\beta_{12}\sigma_{Y_2}^2 & \sigma_{Y_2}^2
\end{bmatrix},
$$

and

$$
\Sigma_C(\theta_C) =
\begin{bmatrix}
\sigma_{Y_1}^2 & \\
\sigma_{12} & \sigma_{Y_2}^2
\end{bmatrix}.
$$

Figure 1. Models A, B, and C.
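The three implied covariance matrices can also be written down as small functions, which makes the later substitutions easy to check numerically. The following is only an illustrative sketch: the function and argument names (`sigma_a`, `var_y1`, `b21`, and so on) are assumptions introduced here, and the parameter order follows the vectors $\theta_A$, $\theta_B$, and $\theta_C$ given above.

```python
def sigma_a(var_y1, var_d2, b21):
    """Implied covariance matrix of Model A (Y2 regressed on Y1)."""
    cov = b21 * var_y1
    return [[var_y1, cov], [cov, b21 ** 2 * var_y1 + var_d2]]


def sigma_b(var_y2, var_d1, b12):
    """Implied covariance matrix of Model B (Y1 regressed on Y2)."""
    cov = b12 * var_y2
    return [[b12 ** 2 * var_y2 + var_d1, cov], [cov, var_y2]]


def sigma_c(var_y1, var_y2, cov12):
    """Implied covariance matrix of Model C (Y1 and Y2 merely covary)."""
    return [[var_y1, cov12], [cov12, var_y2]]


# Illustrative parameter values (not taken from the article).
print(sigma_a(2.0, 1.0, 0.5))  # [[2.0, 1.0], [1.0, 1.5]]
```

Full symmetric matrices are returned rather than lower triangles so that two matrices can be compared element by element.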
To examine the relationship between Models A and B, consider the following parameter transformation, which consists of as many components as there are parameters in either model, namely 3, and is defined by equating the corresponding elements of their implied covariance matrices $\Sigma_A$ and $\Sigma_B$:

$$
\begin{aligned}
\sigma_{Y_1}^2 &= \beta_{12}^2\sigma_{Y_2}^2 + \sigma_{D_1}^2, \\
\beta_{21}\sigma_{Y_1}^2 &= \beta_{12}\sigma_{Y_2}^2, \quad \text{and} \\
\beta_{21}^2\sigma_{Y_1}^2 + \sigma_{D_2}^2 &= \sigma_{Y_2}^2.
\end{aligned}
\tag{1}
$$
Treating the left-hand sides of Equations 1 as fixed quantities, we observe that they represent a system of 3 nonredundant equations in 3 unknowns, the parameters of Model B. The solution of Equations 1 in terms of these 3 unknowns is obtained with straightforward algebraic rearrangements yielding

$$
\begin{aligned}
\sigma_{Y_2}^2 &= \beta_{21}^2\sigma_{Y_1}^2 + \sigma_{D_2}^2, \\
\beta_{12} &= \beta_{21}\sigma_{Y_1}^2 / (\beta_{21}^2\sigma_{Y_1}^2 + \sigma_{D_2}^2), \quad \text{and} \\
\sigma_{D_1}^2 &= \sigma_{Y_1}^2\sigma_{D_2}^2 / (\beta_{21}^2\sigma_{Y_1}^2 + \sigma_{D_2}^2).
\end{aligned}
\tag{2}
$$
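Equations 2 can be checked numerically at any particular parameter point. The sketch below, with hypothetical helper names (`g_ab`, `matrices_close`) and arbitrary illustrative values, maps a parameter vector of Model A into one of Model B and confirms that both models then imply the same covariance matrix. A numerical check at one point is of course only a sanity test, not a substitute for the algebra.

```python
import math


def sigma_a(var_y1, var_d2, b21):
    """Implied covariance matrix of Model A."""
    cov = b21 * var_y1
    return [[var_y1, cov], [cov, b21 ** 2 * var_y1 + var_d2]]


def sigma_b(var_y2, var_d1, b12):
    """Implied covariance matrix of Model B."""
    cov = b12 * var_y2
    return [[b12 ** 2 * var_y2 + var_d1, cov], [cov, var_y2]]


def g_ab(var_y1, var_d2, b21):
    """Equations 2: parameters of Model A -> parameters of Model B."""
    var_y2 = b21 ** 2 * var_y1 + var_d2
    b12 = b21 * var_y1 / var_y2
    var_d1 = var_y1 * var_d2 / var_y2
    return (var_y2, var_d1, b12)


def matrices_close(m1, m2, tol=1e-12):
    """Element-wise comparison that tolerates floating-point rounding."""
    return all(
        math.isclose(x, y, rel_tol=tol, abs_tol=tol)
        for row1, row2 in zip(m1, m2)
        for x, y in zip(row1, row2)
    )


theta_a = (2.0, 1.0, 0.5)   # illustrative values only
theta_b = g_ab(*theta_a)
same_matrix = matrices_close(sigma_b(*theta_b), sigma_a(*theta_a))
print(same_matrix)
```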
Equations 2 define a 3-dimensional transformation $g_{AB}$ of the parameter vector of Model A, $\theta_A = (\sigma_{Y_1}^2, \sigma_{D_2}^2, \beta_{21})^t$, into the parameter vector of Model B, $\theta_B = (\sigma_{Y_2}^2, \sigma_{D_1}^2, \beta_{12})^t$. Thereby, each equation in Equations 2 defines a component of $g_{AB}$, which yields from $\theta_A$ a corresponding parameter of Model B. Comparing the above covariance matrices $\Sigma_A$ and $\Sigma_B$ implied by Models A and B, we see that with $g_{AB}$ one obtains $\Sigma_A$ from $\Sigma_B$: indeed, by substituting into the above expression for $\Sigma_B(\theta_B)$ the right-hand sides of Equations 2 for $\sigma_{Y_2}^2$, $\beta_{12}$ and $\sigma_{D_1}^2$, respectively, one directly gets $\Sigma_B[g_{AB}(\theta_A)] = \Sigma_A(\theta_A)$. Conversely, $\Sigma_A$ yields $\Sigma_B$ if using the inverse transformation $g_{BA}$ with components obtained similarly, by solving the 3 equations of Equations 1 in terms of the 3 parameters of Model A or by solving Equations 2 now in terms of the parameters of A. Similarly one finds that the 3-component transformation $g_{BC}$ defined by

$$
\begin{aligned}
\sigma_{Y_1}^2 &= \beta_{12}^2\sigma_{Y_2}^2 + \sigma_{D_1}^2, \\
\sigma_{Y_2}^2 &= \sigma_{Y_2}^2, \quad \text{and} \\
\sigma_{12} &= \beta_{12}\sigma_{Y_2}^2
\end{aligned}
\tag{3}
$$
yields $\Sigma_B$ from $\Sigma_C$. Conversely, $\Sigma_C$ is obtained from $\Sigma_B$ by taking the inverse transformation $g_{CB}$ with components readily obtained by solving Equations 3 in terms of the parameters of Model B. Finally, the composition transformation (successive application) of $g_{AB}$ and then $g_{BC}$ yields $\Sigma_A$ from $\Sigma_C$; conversely, $\Sigma_A$ becomes $\Sigma_C$ after applying successively $g_{CB}$ and $g_{BA}$ on the former matrix. That is, $\Sigma_C$ leads to $\Sigma_A$ using the transformation resulting as a composition of $g_{AB}$ and $g_{BC}$, and similarly $\Sigma_A$ leads to $\Sigma_C$ using the composition of $g_{CB}$ and $g_{BA}$.
The concept of parameter transformation lies at the heart of the problem of equivalent models, as shown later in this article with its method of studying model equivalence. To motivate the development of this approach, the limitations of the currently available rules for generation of equivalent models are first explicated.
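The successive application of transformations described for Models A, B, and C can be sketched as function composition. In the sketch below `g_ab` and `g_bc` follow Equations 2 and 3, whereas `g_cb` and `g_ba` are inverses worked out here by solving Equations 3 and 1; they are not printed in the text, so they, together with all function names and numerical values, should be read as assumptions of this illustration.

```python
import math


def g_ab(var_y1, var_d2, b21):
    """Equations 2: Model A parameters -> Model B parameters."""
    var_y2 = b21 ** 2 * var_y1 + var_d2
    return (var_y2, var_y1 * var_d2 / var_y2, b21 * var_y1 / var_y2)


def g_bc(var_y2, var_d1, b12):
    """Equations 3: Model B parameters -> Model C parameters."""
    return (b12 ** 2 * var_y2 + var_d1, var_y2, b12 * var_y2)


def g_cb(var_y1, var_y2, cov12):
    """Inverse of g_bc, derived here by solving Equations 3 for B."""
    b12 = cov12 / var_y2
    return (var_y2, var_y1 - cov12 ** 2 / var_y2, b12)


def g_ba(var_y2, var_d1, b12):
    """Inverse of g_ab, derived here by solving Equations 1 for A."""
    var_y1 = b12 ** 2 * var_y2 + var_d1
    return (var_y1, var_y2 * var_d1 / var_y1, b12 * var_y2 / var_y1)


def compose(first, second):
    """Apply `first`, then `second`, to a parameter vector."""
    return lambda *theta: second(*first(*theta))


g_ac = compose(g_ab, g_bc)   # A -> B -> C
g_ca = compose(g_cb, g_ba)   # C -> B -> A

theta_a = (2.0, 1.0, 0.5)    # illustrative values only
theta_c = g_ac(*theta_a)
theta_a_back = g_ca(*theta_c)
round_trip_ok = all(math.isclose(x, y) for x, y in zip(theta_a, theta_a_back))
print(theta_c, round_trip_ok)
```

That the round trip returns the starting vector at this point is consistent with the two compositions being inverse to each other, as the text describes.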
Limitations of the Existing Rules for Generation of Equivalent Models
Stelzl’s rules, Lee and Hershberger’s replacement rule (abbreviated to RR in the remainder), and Hershberger’s reverse indicator rule represent only sufficient conditions for equivalence. Therefore, they imply equivalence only for models obtained from a given one using specific modification steps. Hence, if a model $M'$ is equivalent to another, $M$, but $M'$ cannot be obtained from $M$ (or vice versa) following any of these rules, none of the rules can be used in examining their equivalence status. Examples of such models are presented in Figures 7 and 8, and studied in Appendix 2 using Proposition 1 of the next section. The reason for this limitation of the rules is that they do not deal with necessary conditions of equivalence. That is, the rules identify only part of the models equivalent to a given one, and do not address potentially many others equivalent to it. Hence the researcher needs additional means of exploring equivalence of models of interest, in particular a method that gives necessary and sufficient conditions for equivalence. Such a means is provided by Proposition 1 of this article, which is less restrictive than the rules as it is not limited to an algorithm of how an equivalent model would be obtained from a given one. In addition, each of the rules imposes certain restrictive requirements in order to be applicable, which are discussed next.
1. The RR requires limited block-recursiveness. It presumes that a model of interest can be considered as consisting of a preceding block, a focal block (to which the rule is actually applied), and a succeeding block. The blocks can be delineated by considering (a) two variables of interest, $X$ and $Y$ say, as comprising the focal block; (b) all variables “causally” preceding $X$ and $Y$ as giving rise to the preceding block; and (c) all variables “causally” succeeding $X$ and $Y$ as building the succeeding block (Lee & Hershberger, 1990). In addition, within the preceding and succeeding blocks the relationships between the variables may be recursive or nonrecursive, yet across the three blocks only recursive relationships are allowed by the rule. Furthermore, it presumes that in the focal block the relationship between $X$ and $Y$ is recursive; that is, the rule is not applicable if there exists a nonrecursive relationship between $X$ and $Y$ (MacCallum et al., 1993). When all these assumptions are fulfilled, the RR states that to obtain an equivalent model a regression coefficient from $X$ into $Y$ can be exchanged for a covariance between their residuals if the predictors of $Y$ are the same as or include those of $X$. That is, even if the limited block-recursiveness assumption is fulfilled, which itself is restrictive, the RR cannot be applied unless the predictors of $Y$ include or are identical to those of $X$.
2. The four rules of Stelzl (1986) are subsumed under the RR (Lee & Hershberger, 1990), and hence all limitations of the RR apply to Stelzl’s rules as well.
3. Hershberger’s (1994) reverse indicator rule, an application of the RR, allows only inversion of a single path and thereby requires exogeneity of the focused measurement model (within which the path inversion is carried out). This is another restrictive assumption.
Furthermore, as mentioned by Lee & Hershberger (1990) and Stelzl (1986), a major limitation of the existing rules is that they all have been developed for models with only trivial restrictions, namely zero paths. These are the paths (one- and/or two-way) by which a model is obtained from a saturated model, that is, paths that are “not depicted” on its path diagram. In behavioral research, however, one is frequently interested in models having nontrivial parameter restrictions, such as equal effects, equal factor loadings or equal error variances, or assuming other meaningful parameter relationships. For such models it is also necessary to consider others equivalent to them when interpreting them substantively, yet no available rule is applicable to them, unlike Proposition 1 below.
Moreover, the RR as well as Stelzl’s rules subsumed under it deal only with such changes in an initial model that are confined to its focal block. In practice, however, models can be of interest that differ by details positioned further apart from each other, as exemplified in Appendix 2 (see Figures 7 and 8). These models cannot be studied via the rules but can be studied with the present method, as shown there. Last but not least, Proposition 1 below allows examination of equivalence or lack thereof contingent upon parameter values.
A Necessary and Sufficient Condition for Equivalence of Structural Equation Models
The above definition of equivalent models (Stelzl, 1986) as reproducing the same sets of covariance matrices is difficult, if at all possible, to use in empirical practice because it requires checking the identity of sets typically consisting of infinitely many matrices. An alternative means of demonstrating model equivalence is provided by the following necessary and sufficient condition, which is proved in Appendix 1 and illustrated in the next section.
Proposition 1: Two models $M$ and $M'$ are equivalent if and only if they fulfil the $\Sigma$-condition with a surjective transformation $g: \Theta \to \Theta'$ relating their parameter spaces.
This proposition states that two models are equivalent if a transformation of the parameters of one of them can be found that preserves the implied covariance matrix and covers the whole parameter space of the other; alternatively, any pair of models for which such a transformation can be constructed are equivalent. Hence, equivalent models are invariant up to such a transformation. Thus, the notion of parameter transformation plays a fundamental role for the occurrence of the equivalence phenomenon (cf. Luijben, 1991). We note that, similarly to the existing rules, application of Proposition 1 does not require any data. Also, we stress that Proposition 1 does not require identifiability of either of the two models.
We emphasize that neither (a) surjectivity of $g$ nor (b) the $\Sigma$-condition can be relaxed as conditions for model equivalence. First, if one does not require surjectivity (that is, if $g$ does not cover the whole parameter space of $M'$), it is not ruled out that the latter can reproduce a covariance matrix that is not implied by the other model, $M$, even if the $\Sigma$-condition holds. This can happen at a parameter vector $\theta'$ from $\Theta'$ that is not covered by $g$. Second, if one does not require the $\Sigma$-condition, preservation of the covariance matrices is not ensured. In either of these two cases, it follows that the sets of covariance matrices reproduced by the two models may not be identical. Thus, both the $\Sigma$-condition and surjectivity of the underlying parameter transformation, as stated by Proposition 1, are essential requirements for model equivalence.
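Why surjectivity cannot be dropped may be easier to see on a toy case. The sketch below is not taken from the article: it assumes a hypothetical model `M0` in which two variables have free variances and zero covariance, and compares it with a model in which the covariance is also free. The obvious mapping `g` preserves the implied covariance matrix, so the covariance-preservation condition holds, yet `g` never reaches parameter vectors with a nonzero covariance, and the two models are therefore not equivalent.

```python
def sigma_m0(var_y1, var_y2):
    """Hypothetical model M0: free variances, covariance fixed at zero."""
    return [[var_y1, 0.0], [0.0, var_y2]]


def sigma_c(var_y1, var_y2, cov12):
    """Model with two free variances and a free covariance."""
    return [[var_y1, cov12], [cov12, var_y2]]


def g(var_y1, var_y2):
    """Maps M0 parameters into the larger model's parameter space."""
    return (var_y1, var_y2, 0.0)


def in_image_of_g(theta_prime):
    """g only ever produces vectors whose covariance component is zero."""
    return theta_prime[2] == 0.0


theta = (2.0, 1.5)                      # illustrative values only
condition_holds = sigma_m0(*theta) == sigma_c(*g(*theta))

theta_prime = (2.0, 1.5, 1.0)           # valid point with nonzero covariance
covered = in_image_of_g(theta_prime)

print(condition_holds, covered)         # True False
```

The matrix implied at `theta_prime` has a nonzero off-diagonal element, which `M0` cannot reproduce for any of its parameter values.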
Proposition 1 and Existing Rules For Equivalent Model Generation
Proposition 1 covers a very wide class of structural equation models used in behavioral research. Since it makes no corresponding assumptions, these models (a) may or may not fulfil any of the requirements of any of the rules by Stelzl, Lee and/or Hershberger; (b) may or may not have parameter restrictions; and (c) may or may not be identified. Hence, this method is more general than any of the available rules. In fact, as shown in Appendix 3, Proposition 1 implies the validity of the RR and thus of all existing rules, since they can be justified on grounds of the RR (Lee & Hershberger, 1990). Moreover, being based on the parameter transformation notion and focusing on relationships between model parameters, this method yields a further insight into the nature of the equivalence phenomenon relative to all rules, since unlike Proposition 1 the rules are concerned with equivalent model generation only and not with specific relationships between the model parameters.
Proposition 1 also makes an important step beyond the existing rules because it provides both a necessary and sufficient condition for model equivalence, while the rules give only sufficient conditions. Hence, for a fixed model $M$ the present method identifies the relationship between $M$ and any model $M'$ equivalent to it, unlike any rule. Specifically, it follows from the necessity part of the proposition that all $M'$ equivalent to $M$ re-express the structure of the implied covariance matrix in terms of parameters that are the image of a surjective transformation of the parameters of $M$ with which $M$ and $M'$ satisfy the $\Sigma$-condition.
Relationship to Works by Luijben (1991) and Bekker, Merckens, & Wansbeek (1994)
The present proposition gives a necessary and sufficient condition for model equivalence that is based only on the relatively uninvolved notions of implied covariance matrix and parameter transformation. As exemplified in the next section, this condition is directly checked in practice by: (a) using the simple covariance rules in Equations 30 and 31 in Appendix 2 to obtain the models’ implied covariance matrices; (b) equating their corresponding elements and solving in terms of the parameters of one of the models (to demonstrate validity of the $\Sigma$-condition); and (c) inverting, in terms of parameters of the other model, the formulas of (b) to show surjectivity of that parameter transformation. The application of this method of studying model equivalence is thus more accessible to behavioral researchers than the necessary and sufficient condition for local equivalence by Luijben (1991, p. 660). The latter declares two models resulting from a common nested model via single-degree-of-freedom relaxations as locally equivalent if and only if the Jacobian matrix of the combined model with the union of their free parameters is of deficient rank. To apply Luijben’s condition one therefore needs to conduct more involved activities (for example, multivariable function differentiation, determinant or rank evaluation, construction of a one-to-one parameter transformation) in addition to working out the implied covariance matrices and finding a surjective transformation preserving them, which is all Proposition 1 requires. (A one-to-one, or bijective, transformation $g: \Theta \to \Theta'$ is a surjective mapping with the properties that no two distinct elements in $\Theta$ have the same image in $\Theta'$, and each element of $\Theta$ has a unique image in $\Theta'$; e.g., Luijben, 1991.) At least as importantly, local equivalence is a predominantly theoretical notion that is distinct from, and more restrictive than, the model equivalence concept of practical concern that is widely adopted in behavioral research and followed in this article (Bollen, 1989; Breckler, 1990; Jöreskog & Sörbom, 1993; Lee & Hershberger, 1990; Hershberger, 1994; MacCallum et al., 1993; Raykov, 1997; Stelzl, 1986). Specifically, local equivalence is defined only in a neighborhood of a parameter point, rather than the whole parameter space (possibly after constraining some parameters to achieve model equivalence; see next section) as in the present definition of model equivalence. To our knowledge, there are very limited applications of the local equivalence concept in behavioral research practice, whereas there are numerous utilizations of the model equivalence notion (Stelzl, 1986) followed in this article, which for comparative purposes here may be considered a global notion [within possibly constrained parameter space(s) to achieve equivalence; see next section] rather than a local concept of that type.¹
Proposition 1 is also easier to use than the necessary and sufficient conditions for local equivalence by Bekker et al. (1994). These are based on more involved notions from advanced functional analysis, and in addition are either too strong (in the sense of being rarely satisfied with realistic models) or all too difficult to evaluate in practice (e.g., pp. 158, 168, 170). Specifically, their Theorem 7.4.1 presents necessary and sufficient conditions for local equivalence that, as mentioned, is a more restrictive notion than that of model equivalence used in behavioral research practice and of interest in this article (see Footnote 1).
The present proposition is also more general in another respect than the above necessary and sufficient condition by Luijben (1991). His is valid only for models each representing a single-parameter relaxation of a common nested model, while the present method can also be used with models that can be obtained from such a model via relaxing multiple-degree-of-freedom restrictions. Further, Proposition 1 is more general than another, necessary-only condition for model equivalence by Luijben (1991, pp. 654, 658), that of equality of modification indices. Whereas this proposition is valid for any pair of models fulfilling its premises, Luijben’s condition is again valid only for models obtained via single-parameter relaxations of a common nested model, since modification indices have been defined only with reference to such model versions (e.g., Jöreskog & Sörbom, 1993).
¹ From local equivalence at all pairs of points $\theta$ and $\theta'$ of the parameter spaces $\Theta$ and $\Theta'$ of models $M$ and $M'$ respectively, it follows that $M$ and $M'$ are equivalent in the sense followed in this article (Luijben, 1991), that is, they reproduce the same set of covariance matrices across their parameter spaces (e.g., Stelzl, 1986). However, for two models to be declared equivalent in the sense of this article, it is not necessary that they be locally equivalent at any (pair of) points in their parameter spaces. That is, local equivalence at all points is a sufficient but not necessary condition for the models to reproduce the same set of covariance matrices across their parameter spaces.
Studying (Non)Equivalence Status of Models Contingent on Parameter Values
Not only can one use the sufficiency part of Proposition 1 for proving model equivalence, but its necessity part can also be utilized to demonstrate lack of equivalence of two considered models, similarly without using any data. Specifically, if for two models one can show that there cannot exist a parameter transformation as stated in Proposition 1 (see its proof in Appendix 1, or the next section), then the proposition implies that the two models are not equivalent. It is also possible to use Proposition 1 for generating models equivalent to a given one, $M$ say. If one could apply a surjective transformation to the parameters of $M$ for which another model, $M'$ say, could be found whose parameter space is the image of the transformation and which reproduces, at the transformed parameter vector, the covariance matrix of $M$, then $M'$ would be equivalent to $M$.²
Since use of Proposition 1 capitalizes on features of the parametric structure of implied covariance matrices, which in general depends on values of model parameters (that effectively disappear from the structure if their values are 0 or 1, for example), the present method also allows examination of model equivalence or lack thereof contingent upon values of model parameters, as demonstrated in the next section and Appendix 2.
² For example, given the developments in Appendix 3 showing that all existing rules are implied by Proposition 1, either of a pair of models obtained following one or more rules in succession can as well be considered generated from the other model of the pair using Proposition 1. Similarly, either of a couple of equivalent models focused on later in this article can be viewed as having been obtained from the other one using Proposition 1. This mode of application of the proposition, however, is substantially more difficult than a utilization of corresponding rules, which are frequently readily used for equivalent model generation in research practice and are therefore recommended for this purpose.
Limitations of Method
Unlike the available rules, which are frequently readily applied, utilization of Proposition 1 in research practice is more involved in that it requires working out implied covariance matrices and finding a surjective parameter transformation relating parameter spaces and preserving these matrices. However, that this method is more involved is only a natural consequence of its greater generality. In fact, one can expect a more general means like the present one to be usually less readily applicable than more restrictive and specific methods, as the existing rules are, since the former covers many more practically relevant situations.
While Proposition 1 makes an important step beyond the rules of equivalent model generation, it does not give a specific tool to construct all models equivalent to a given one, say $M$. However, we do not see in this a limitation of the present method because no single rule, nor all of them together, can generate all models equivalent to $M$, as shown in the preceding section. From a philosophy of science view, in the general case it will be very difficult, if at all possible, to find a method of identifying all models equivalent to a given one. Indeed, such a possibility would imply that one would be in a position to identify all possible explanations of a general phenomenon that has been observed to a limited extent, yet this does not appear to be feasible since our knowledge about the phenomenon is itself limited to begin with.
Based on this discussion it seems reasonable to recommend in behavioral research the application of available rules for generating models equivalent to a given one, while use of Proposition 1 is recommended for (a) detecting equivalence between considered models, and/or (b) showing that two models are not equivalent; that is, for examination of equivalence status (possibly contingent on parameter values) of two or more models as means of data description and explanation. It is in this sense that we see the method of this paper to be complementary to all currently available rules of equivalent model generation.
Illustrations of the Necessary and Sufficient Condition for Model Equivalence
Here the application of Proposition 1 to the study of model equivalence status is demonstrated by means of several empirical examples. A particular concern is the practical construction of a surjective parameter transformation with which (equivalent) models fulfil the S-condition.³ To this end, as indicated above, we typically obtain the implied covariance matrices from the model definition equations with the rules in Equations 30 and 31, solve the equations of their corresponding elements in terms of the parameters of one of the models, and then show that the resulting formulas can be inverted to obtain the parameters of the other. At the end of the section, Proposition 1 is used to demonstrate similarly that two models are not equivalent for most values of their parameters, and equivalent for the others. Although this proposition does not require identifiability of any of the models involved, as mentioned before, for the illustrative purposes of this section we will be concerned with identified models, as is the current practice of behavioral research.
Equivalent Models for a Pair of Covarying Variables
The discussion immediately after Equations 2 demonstrated that Models A and B fulfil the S-condition with the transformation $g_{AB}$ defined in Equations 2, because with $g_{AB}$ the covariance matrix implied by A was obtained from that implied by B. Next, Equations 2 are readily solved in terms of the parameters of Model A using direct algebraic rearrangements that yield $\sigma_{Y_1}^2 = \beta_{12}^2\sigma_{Y_2}^2 + \sigma_{D_1}^2$, $\beta_{21} = \beta_{12}\sigma_{Y_2}^2/(\beta_{12}^2\sigma_{Y_2}^2 + \sigma_{D_1}^2)$, and $\sigma_{D_2}^2 = \sigma_{Y_2}^2\sigma_{D_1}^2/(\beta_{12}^2\sigma_{Y_2}^2 + \sigma_{D_1}^2)$. (We observe that this solution has positive variances and hence indeed belongs to the parameter space of Model A.) Thus, for any parameter vector $\theta'$ of B there exists a parameter vector $\theta$ of A with which $g(\theta) = \theta'$ holds true. Hence, $g_{AB}$ is in addition surjective. Therefore, Models A and B are equivalent by Proposition 1. The discussion after Equations 2 also showed that Models A and C fulfil the S-condition with the parameter transformation $g_{AC}$, which is similarly found to be surjective, and therefore A and C are equivalent. In the same manner, one demonstrates that Models B and C are equivalent, too.⁴
³ For some of the model pairs discussed below, equivalence can be obtained directly from an application of the replacement rule or from a reference to a saturated structural part of one of the models (i.e., its part involving only the latent factors; e.g., Raykov, 1996). These examples will be used next, however, only to demonstrate details pertaining to the practical approach to constructing the sought parameter transformation g. Further examples where none of the available rules can be used are provided later and in Appendix 2.

⁴ The last statement also follows from the transitivity of the model equivalence relation (e.g., Stelzl, 1986; Lee & Hershberger, 1990; Hershberger, 1994).

Equivalent Models with Reciprocal Latent Causation Relationships

Unlike the preceding example, two nonsaturated models are considered here that are frequently used in behavioral research, for example, in autoregressive longitudinal modeling. Model $C_1$ assumes that the latent variable $\eta_2$ plays an explanatory role for the latent construct $\eta_1$; Model $C_2$ assumes the reverse relationship. Both models are identical in their measurement parts, and their path diagrams are presented in Figure 2. For identifiability, in both models the first factor loading per construct is fixed at 1, and the structural regression slopes are assumed different from zero (otherwise the models will be nonidentified; Bollen, 1989).
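The definition equations of Model $C_1$ are:

$$
\begin{aligned}
Y_1 &= \beta_{12}\eta_2 + \zeta_1 + \varepsilon_1 \\
Y_2 &= \lambda_2(\beta_{12}\eta_2 + \zeta_1) + \varepsilon_2 \\
Y_3 &= \eta_2 + \varepsilon_3 \\
Y_4 &= \lambda_4\eta_2 + \varepsilon_4 ,
\end{aligned}
\tag{4}
$$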
Figure 2. Models $C_1$ and $C_2$.
where $\varepsilon_1$ to $\varepsilon_4$ are the measurement errors, $\lambda_2$ and $\lambda_4$ the free factor loadings (the others are fixed at unity for model identification purposes), $\zeta_1$ is the structural disturbance term, and $\beta_{12}$ the structural regression path from $\eta_2$ into $\eta_1$. The defining equations of Model $C_2$ are:
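$$
\begin{aligned}
Y_1 &= \eta_1 + \varepsilon_1 \\
Y_2 &= \nu_2\eta_1 + \varepsilon_2 \\
Y_3 &= \beta_{21}\eta_1 + \zeta_2 + \varepsilon_3 \\
Y_4 &= \nu_4(\beta_{21}\eta_1 + \zeta_2) + \varepsilon_4 ,
\end{aligned}
\tag{5}
$$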
with $\beta_{21}$ being the path from $\eta_1$ into $\eta_2$, and $\zeta_2$ the corresponding structural residual term.

Using the covariance rules in Equations 30 and 31 of Appendix 2, the implied covariance matrix of Model $C_2$ is obtained as:
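$$
\Sigma_{c_2}(\theta_{c_2}) =
\begin{bmatrix}
\phi_1 + \delta_1 & & & \\
\nu_2\phi_1 & \nu_2^2\phi_1 + \delta_2 & & \\
\beta_{21}\phi_1 & \nu_2\beta_{21}\phi_1 & \beta_{21}^2\phi_1 + \psi_2 + \delta_3 & \\
\nu_4\beta_{21}\phi_1 & \nu_2\nu_4\beta_{21}\phi_1 & \nu_4(\beta_{21}^2\phi_1 + \psi_2) & \nu_4^2(\beta_{21}^2\phi_1 + \psi_2) + \delta_4
\end{bmatrix},
\tag{6}
$$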
where $\delta_i$, $i = 1, \ldots, 4$, are the measurement error variances, $\phi_1$ the independent latent variance, and $\psi_2$ the residual term variance. Similarly, the covariance matrix implied by Model $C_1$ is:
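$$
\Sigma_{c_1}(\theta_{c_1}) =
\begin{bmatrix}
\beta_{12}^2\phi_2 + \psi_1 + v_1 & & & \\
\lambda_2(\beta_{12}^2\phi_2 + \psi_1) & \lambda_2^2(\beta_{12}^2\phi_2 + \psi_1) + v_2 & & \\
\beta_{12}\phi_2 & \lambda_2\beta_{12}\phi_2 & \phi_2 + v_3 & \\
\lambda_4\beta_{12}\phi_2 & \lambda_2\lambda_4\beta_{12}\phi_2 & \lambda_4\phi_2 & \lambda_4^2\phi_2 + v_4
\end{bmatrix},
\tag{7}
$$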
where $v_1$ to $v_4$ are the measurement error variances, $\phi_2$ the independent latent variance, $\psi_1$ the residual term variance, and $\lambda_2$ and $\lambda_4$ the free factor loadings (their counterparts in Equation 6 are $\nu_2$ and $\nu_4$).

To find the parameter transformation g, as in the preceding example we equate each element of $\Sigma_{c_2}(\theta_{c_2})$ in Equation 6 with its corresponding one of $\Sigma_{c_1}(\theta_{c_1})$ in Equation 7, and simplify the resulting equations to a form expressing the parameters of Model $C_2$ in terms of those of $C_1$. The first equation relates the first elements of these matrices, viz. $\sigma^{(c_1)}_{11}$ with $\sigma^{(c_2)}_{11}$:
$$\phi_1 + \delta_1 = \beta_{12}^2\phi_2 + \psi_1 + v_1 , \tag{8}$$

and will be fulfilled if

$$\delta_1 = v_1 \quad\text{and}\quad \phi_1 = \beta_{12}^2\phi_2 + \psi_1 . \tag{9}$$

Given Equation 9, the elements $\sigma^{(c_1)}_{21}$ and $\sigma^{(c_2)}_{21}$ of Equations 6 and 7 will be equal if $\lambda_2 = \nu_2$. With this transformation component, Equations 8 and 9, and additionally

$$\delta_2 = v_2 , \tag{10}$$

we observe that $\sigma^{(c_1)}_{22}$ is automatically transformed into $\sigma^{(c_2)}_{22}$. Equating then the elements $\sigma_{33}$ across Equations 6 and 7, we arrive at the transformation components

$$\delta_3 = v_3 \quad\text{and}\quad \phi_2 = \beta_{21}^2\phi_1 + \psi_2 . \tag{11}$$

With Equation 11 and additionally

$$\delta_4 = v_4 , \tag{12}$$

we see from the next equality of the elements $\sigma_{44}$ across Equations 6 and 7 that

$$\lambda_4 = \nu_4 \tag{13}$$

would complete the transformation of $\sigma_{44}$ and of $\sigma_{43}$ across the two models. We now move to the lower-left corner of Equations 6 and 7, and observe that all four cross-latent-variable covariances will be reproduced exactly if our remaining transformation components yield

$$\beta_{21}\phi_1 = \beta_{12}\phi_2 . \tag{14}$$

To explicate Equation 14, we insert the earlier found transformation for $\phi_1$ (see Equation 9) and obtain, via division of both sides of Equation 14 by $\phi_1$, the transformation component of g that leads to $\beta_{21}$:

$$\beta_{21} = \beta_{12}\phi_2/(\beta_{12}^2\phi_2 + \psi_1) . \tag{15}$$
Finally, using Equations 9 and 15 we obtain from Equation 11

$$\psi_2 = \phi_2\psi_1/(\beta_{12}^2\phi_2 + \psi_1) . \tag{16}$$
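The algebra behind Equation 16 can be spelled out as a short worked step. It uses only Equations 9, 11, and 14 and is a restatement of the derivation, not an additional result:

$$
\psi_2 = \phi_2 - \beta_{21}^2\phi_1
       = \phi_2 - \frac{(\beta_{21}\phi_1)^2}{\phi_1}
       = \phi_2 - \frac{\beta_{12}^2\phi_2^2}{\beta_{12}^2\phi_2 + \psi_1}
       = \frac{\phi_2\psi_1}{\beta_{12}^2\phi_2 + \psi_1} .
$$

The first equality rearranges Equation 11, the third substitutes Equation 14 in the numerator and Equation 9 in the denominator, and the last places both terms over the common denominator.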
Thus, the implied covariance matrix by Model $C_1$ is obtained from that by $C_2$ using the 9-dimensional transformation g with the following components (the parameter on the left-hand side of an equation denotes the result of the application of g on the parameters of Model $C_1$):

$$
g:\ \begin{cases}
\delta_i = v_i , & i = 1, \ldots, 4, \\
\lambda_j = \nu_j , & j = 2, 4, \\
\phi_1 = \beta_{12}^2\phi_2 + \psi_1 , & \\
\beta_{21} = \beta_{12}\phi_2/(\beta_{12}^2\phi_2 + \psi_1) , & \text{and} \\
\psi_2 = \phi_2\psi_1/(\beta_{12}^2\phi_2 + \psi_1) . &
\end{cases}
\tag{17}
$$
This definition of g and the discussion preceding Equation 17 show that if one substitutes into Equation 6 the functions of Model $C_1$ parameters given on the right-hand sides of Equation 17 for their corresponding parameters on the left-hand sides, one obtains Equation 7. Thus, with g, $C_1$ and $C_2$ fulfil the S-condition. To show surjectivity of g, we need to demonstrate that its defining equations can be solved in terms of the parameters of Model $C_1$. To this end, one considers the left-hand sides of Equation 17 as fixed quantities, and hence Equation 17 as a system of 9 equations in 9 unknowns, the parameters of $C_1$. Obviously, the first 6 components of g are directly invertible for this purpose, as trivial identities that only use single symbols on both sides of their equations. The remaining 3 components of g are then solved with direct algebraic rearrangements, yielding $\psi_1 = \phi_1\psi_2/(\psi_2 + \phi_1\beta_{21}^2)$, $\phi_2 = \psi_2 + \phi_1\beta_{21}^2$, and $\beta_{12} = \phi_1\beta_{21}/(\psi_2 + \phi_1\beta_{21}^2)$. (We note that $\psi_1 > 0$ and $\phi_2 > 0$ are indeed fulfilled because $\phi_1 > 0$ and $\psi_2 > 0$ as variances, earlier assumed positive throughout, as is typical in empirical practice; hence the obtained solution belongs to the parameter space of Model $C_1$.) This implies that for each parameter vector $\theta'$ of Model $C_2$ there exists a parameter vector $\theta$ of Model $C_1$ such that $g(\theta) = \theta'$. That is, g is surjective. Therefore, by Proposition 1, Models $C_1$ and $C_2$ are equivalent.
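The construction in this example can be replayed numerically. The sketch below is an illustration under stated assumptions rather than part of the original method: the function names and the parameter values are hypothetical, the matrices are coded from Equations 6 and 7, and exact fractions are used so that matrix equality is tested without rounding error.

```python
from fractions import Fraction as F

def sigma_c1(b12, lam2, lam4, phi2, psi1, v):
    """Lower triangle of the covariance matrix implied by Model C1 (Equation 7)."""
    e1 = b12**2 * phi2 + psi1  # variance of eta1 under C1
    return [
        [e1 + v[0]],
        [lam2 * e1, lam2**2 * e1 + v[1]],
        [b12 * phi2, lam2 * b12 * phi2, phi2 + v[2]],
        [lam4 * b12 * phi2, lam2 * lam4 * b12 * phi2, lam4 * phi2, lam4**2 * phi2 + v[3]],
    ]

def sigma_c2(b21, nu2, nu4, phi1, psi2, d):
    """Lower triangle of the covariance matrix implied by Model C2 (Equation 6)."""
    e2 = b21**2 * phi1 + psi2  # variance of eta2 under C2
    return [
        [phi1 + d[0]],
        [nu2 * phi1, nu2**2 * phi1 + d[1]],
        [b21 * phi1, nu2 * b21 * phi1, e2 + d[2]],
        [nu4 * b21 * phi1, nu2 * nu4 * b21 * phi1, nu4 * e2, nu4**2 * e2 + d[3]],
    ]

def g(b12, lam2, lam4, phi2, psi1, v):
    """Equation 17: parameters of C1 mapped to parameters of C2."""
    phi1 = b12**2 * phi2 + psi1
    return (b12 * phi2 / phi1, lam2, lam4, phi1, phi2 * psi1 / phi1, v)

def g_inverse(b21, nu2, nu4, phi1, psi2, d):
    """Solution of Equation 17 for the parameters of C1."""
    phi2 = psi2 + phi1 * b21**2
    return (phi1 * b21 / phi2, nu2, nu4, phi2, phi1 * psi2 / phi2, d)

# Hypothetical C1 parameter values, chosen only for the check.
theta_c1 = (F(1, 2), F(3, 4), F(5, 4), F(2), F(1),
            (F(1, 3), F(1, 4), F(1, 5), F(1, 6)))
theta_c2 = g(*theta_c1)

print(sigma_c1(*theta_c1) == sigma_c2(*theta_c2))  # S-condition at this point
print(g_inverse(*theta_c2) == theta_c1)            # g can be inverted here
```

A check at a single parameter point does not prove equivalence; it only guards against algebraic slips in the components of g and in their inversion.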
A Simplifying Observation
As noted above, the first 6 components of g in Equation 17 are trivial transformations, that is, identity transformations that formally only use a different symbol on each side of their equation. They stem from parts in which the models are identical, namely the measurement models of $C_1$ and $C_2$. Therefore we could use the same symbols in both models for the measurement errors (which obviously is inconsequential for Equations 6 and 7). In fact, we could also have used the same symbols for the factor loadings and error variances, since the corresponding 6 components of g only reexpress each of these parameters as another single parameter. Employing the same symbols for them would have made the construction of Equation 17 less involved. (We note, however, that this use of the same symbols is not mandatory for the present method.)
This simplifying use of the same symbols across models while constructing g can be done for their "common parameters", in the terminology and method of Raykov (1997, pp. 96-97); for all remaining parameters, called "distinct parameters", one needs to keep different symbols across the models. An alternative approach to finding the common parameters is as follows. Fix one of the models, say M, and try to obtain the path diagram of the other, M′, from that of M via changes in one- and two-way paths. All paths (one- and/or two-way) that are not modified in this process are associated with likely common parameters of M and M′, which are unchanged by g. (That is, these parameters will likely be identically transformed by g; a formal check is described later in this paragraph.) Often, yet depending on M and M′, they may be found in the measurement parts, for example, factor loadings and measurement error variances/covariances. In contrast, all modified paths are associated with parameters of M that are nontrivially transformed by g, that is, their corresponding component of g is not the identity transformation. These are the distinct parameters, such as (a) variances, residual variances, and covariances pertaining to initially dependent/independent variables in M that change their status to independent/dependent variables in M′, and (b) all parameters of M that are associated with paths going into those initially dependent variables from other variables (if there are such paths). Notably, this approach is only exploratory and does not represent a rigorous means of determining common and distinct parameters of studied models. A formal test of whether a parameter p of M and its corresponding one p′ of M′ are common parameters of the two models is as follows. When different symbols are used for all latent variables and parameters of M and M′, common parameters are those that are related by the identity transformation; all others are distinct parameters of the two models. That is, p and p′ are common parameters of M and M′ if their corresponding component of g is the linear transformation p′ = a + b·p, with a = 0 and b = 1 (Raykov, 1997). The result of this test is positive for the first six components of g in Equation 17, which therefore involve common parameters of Models $C_1$ and $C_2$, and hence one can use the same symbols for them in both models when deriving g.

This discussion shows that it is possible to substantially simplify the construction of the parameter transformation g, since common parameters cannot contribute to a violation of surjectivity or of the S-condition. This is because their corresponding components of g are identity transformations, which are trivially invertible and, as simple parameter substitutions, obviously do not change the structure of the implied covariance matrices. Therefore, when constructing g, its components pertaining to common parameters, stemming from parts in which the models are identical, are of little concern. Instead, one can focus on those components of g that relate to distinct parameters, which are located in parts of the models where they differ. This observation concentrates one's main effort in the practical construction of g on finding its nontrivial components.
Equivalent Models of Educational Attainment
This example borrows from research on educational attainment by Sewell, Haller, & Ohlendorf (1970), and is closely related to one by Stelzl (1986). Each of the next two models assumes that educational attainment and variables presumably related to it, such as mental ability, academic performance, and socioeconomic status (SES), each represent a latent construct measured with a triple of indicators. The first model to be considered, Model 1, is presented in Figure 3; the second model, Model 2, has the same measurement part (Figure 4).

We note that, unlike the first example in this section, neither model is saturated, and unlike the earlier Models $C_1$ and $C_2$, the structural parts of Models 1 and 2 are not saturated, as they have 1 degree of freedom. While Model 1 assumes that academic performance causally affects educational attainment over and above their interrelationship due to mental ability and SES, Model 2 does not assume this causality but postulates an interrelationship between the unexplained variation in academic performance and that in educational attainment after their linear relationships to mental ability and SES have been taken into account.
For either model, let $\eta_i$ (i = 1, 2, 3, 4) denote the latent constructs in the order mental ability, SES, academic performance, and educational attainment, with pertinent indicators $Y_{ij}$ (i = 1, 2, 3, 4; j = 1, 2, 3) and associated error terms $E_{ij}$ (i = 1, 2, 3, 4; j = 1, 2, 3). Let $\zeta_3$ and $\zeta_4$ denote the respective structural residuals of academic performance and educational attainment, and for model identification assume $\sigma_{\eta_1}^2 = \sigma_{\eta_2}^2 = \lambda_{31} = \lambda_{41} = 1$. Either model has 29 parameters: (a) 12 measurement error variances; (b) 10 free factor loadings; and (c) 7 parameters in its structural part. The latter are: (i) the covariance of $\eta_1$ and $\eta_2$, (ii) 3 paths connecting the latent variables (see Figures 3 and 4), (iii) 2 structural disturbance variances, and (iv) the structural path connecting $\eta_4$ with $\eta_3$ in Model 1, or the structural disturbance covariance in Model 2. Due to the identity of their measurement parts, Models 1 and 2 do not differ in the 22 parameters in (a) and (b); this can also be observed when using the above-mentioned graphic method of obtaining the path diagram of Model 2 from that of Model 1, since no changes are carried out in the parts of the latter where the parameters in (a) and (b) reside.
Figure 3. Model 1

Next, for Model 1 let A, B, C, and D denote the postulated structural paths connecting the four latent constructs as in Figure 3. The definition equations of the model are:
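$$
\begin{aligned}
Y_{11} &= \lambda_{11}\eta_1 + E_{11} \\
Y_{12} &= \lambda_{12}\eta_1 + E_{12} \\
Y_{13} &= \lambda_{13}\eta_1 + E_{13} \\
Y_{21} &= \lambda_{21}\eta_2 + E_{21} \\
Y_{22} &= \lambda_{22}\eta_2 + E_{22} \\
Y_{23} &= \lambda_{23}\eta_2 + E_{23} \\
Y_{31} &= \eta_3 + E_{31} \\
Y_{32} &= \lambda_{32}\eta_3 + E_{32} \\
Y_{33} &= \lambda_{33}\eta_3 + E_{33} \\
Y_{41} &= \eta_4 + E_{41} \\
Y_{42} &= \lambda_{42}\eta_4 + E_{42} \\
Y_{43} &= \lambda_{43}\eta_4 + E_{43} \\
\eta_3 &= A\eta_1 + \zeta_3 \\
\eta_4 &= B\eta_1 + C\eta_2 + D\eta_3 + \zeta_4 = (AD + B)\eta_1 + C\eta_2 + D\zeta_3 + \zeta_4 .
\end{aligned}
\tag{18}
$$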
Figure 4. Model 2
The covariance matrix $\Sigma_{11}$ implied by Model 1 is (see Bollen, 1989; or Appendix 2):

$$\Sigma_{11} = \Lambda K_{11} \Lambda^{t} + \Xi , \tag{19}$$

where $\Lambda$ is the model-invariant factor loadings matrix, $\Xi$ the error covariance matrix, and $K_{11} = [\kappa^{(1)}_{ij},\ i = 1, 2, 3, 4,\ j = 1, 2, 3, 4,\ j \le i]$ is the (symmetric) covariance matrix of its latent variables $\eta_1, \ldots, \eta_4$. $K_{11}$ is structured in terms of model parameters as follows [using the symbols $\Phi_{12} = \mathrm{Cov}(\eta_1, \eta_2) = \mathrm{Corr}(\eta_1, \eta_2)$, where Corr(.) denotes correlation, $\Psi_3 = \mathrm{Var}(\zeta_3)$, $\Psi_4 = \mathrm{Var}(\zeta_4)$; employing capital symbols will turn out quite convenient shortly but, as before, is not mandatory]:
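$$
\begin{aligned}
\kappa^{(1)}_{11} &= 1 \\
\kappa^{(1)}_{21} &= \Phi_{12} , \qquad \kappa^{(1)}_{22} = 1 \\
\kappa^{(1)}_{31} &= A , \qquad \kappa^{(1)}_{32} = A\Phi_{12} \\
\kappa^{(1)}_{33} &= A^2 + \Psi_3 \\
\kappa^{(1)}_{41} &= AD + B + C\Phi_{12} \\
\kappa^{(1)}_{42} &= (AD + B)\Phi_{12} + C \\
\kappa^{(1)}_{43} &= A(AD + B) + AC\Phi_{12} + D\Psi_3 \\
\kappa^{(1)}_{44} &= (AD + B)^2 + C^2 + 2(AD + B)C\Phi_{12} + D^2\Psi_3 + \Psi_4 .
\end{aligned}
\tag{20}
$$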
The defining equations of Model 2 are as follows (cf. Figure 4; for simplicity of symbolic representation and to facilitate relating the covariance matrices implied by the two models, lower-case symbols corresponding to Model 1 parameters are used for these in the structural part of Model 2 but, as mentioned above, other choices are possible too):
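$$
\begin{aligned}
Y_{11} &= \lambda_{11}\eta_1 + E_{11} \\
Y_{12} &= \lambda_{12}\eta_1 + E_{12} \\
Y_{13} &= \lambda_{13}\eta_1 + E_{13} \\
Y_{21} &= \lambda_{21}\eta_2 + E_{21} \\
Y_{22} &= \lambda_{22}\eta_2 + E_{22} \\
Y_{23} &= \lambda_{23}\eta_2 + E_{23} \\
Y_{31} &= \eta_3 + E_{31} \\
Y_{32} &= \lambda_{32}\eta_3 + E_{32} \\
Y_{33} &= \lambda_{33}\eta_3 + E_{33} \\
Y_{41} &= \eta_4 + E_{41} \\
Y_{42} &= \lambda_{42}\eta_4 + E_{42} \\
Y_{43} &= \lambda_{43}\eta_4 + E_{43} \\
\eta_3 &= \alpha\eta_1 + \zeta_3 \\
\eta_4 &= \beta\eta_1 + \gamma\eta_2 + \zeta_4 .
\end{aligned}
\tag{21}
$$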
The covariance matrix implied by Model 2 is therefore

$$\Sigma_{22} = \Lambda K_{22} \Lambda^{t} + \Xi , \tag{22}$$

where now $K_{22}$ is the covariance matrix of its latent variables $\eta_1, \ldots, \eta_4$ with the following elements [in addition to the mentioned lower-case symbols, we use $\psi_{34} = \mathrm{Cov}(\zeta_3, \zeta_4)$]:
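$$
\begin{aligned}
\kappa^{(2)}_{11} &= 1 \\
\kappa^{(2)}_{21} &= \phi_{12} , \qquad \kappa^{(2)}_{22} = 1 \\
\kappa^{(2)}_{31} &= \alpha , \qquad \kappa^{(2)}_{32} = \alpha\phi_{12} \\
\kappa^{(2)}_{33} &= \alpha^2 + \psi_3 \\
\kappa^{(2)}_{41} &= \beta + \gamma\phi_{12} \\
\kappa^{(2)}_{42} &= \beta\phi_{12} + \gamma \\
\kappa^{(2)}_{43} &= \alpha\beta + \alpha\gamma\phi_{12} + \psi_{34} \\
\kappa^{(2)}_{44} &= \beta^2 + \gamma^2 + 2\beta\gamma\phi_{12} + \psi_4 .
\end{aligned}
\tag{23}
$$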
As indicated earlier, due to invariance in the measurement parts of Models 1 and 2, the sought transformation g contains 12 identity components for the error variances and 10 identities for the factor loadings. To construct the remaining 7 components of g, which pertain to the structural part, we first set $\kappa^{(1)}_{21} = \kappa^{(2)}_{21}$, which yields $\Phi_{12} = \phi_{12}$. Then $\kappa^{(1)}_{31} = \kappa^{(2)}_{31}$ results in $A = \alpha$. Now the implied transformation $A\Phi_{12} = \alpha\phi_{12}$ is consistent with $\kappa^{(1)}_{32} = \kappa^{(2)}_{32}$. Next, $\kappa^{(1)}_{33} = \kappa^{(2)}_{33}$ with the already found components of g leads to $\Psi_3 = \psi_3$. Then $\kappa^{(1)}_{41} = \kappa^{(2)}_{41}$ and $\kappa^{(1)}_{42} = \kappa^{(2)}_{42}$ result in $AD + B = \beta$ and $C = \gamma$. Subsequently, $\kappa^{(1)}_{43} = \kappa^{(2)}_{43}$ and all preceding components yield $D\Psi_3 = \psi_{34}$. In the same manner, $\kappa^{(1)}_{44} = \kappa^{(2)}_{44}$ suggests, on the basis of the previous ones, the final component of g as $D^2\Psi_3 + \Psi_4 = \psi_4$.
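Thus, the full definition of g is (the $\delta$'s are the invariant measurement error variances):

$$
g:\ \begin{cases}
\lambda_{12} = \lambda_{12},\ \lambda_{13} = \lambda_{13},\ \ldots,\ \lambda_{43} = \lambda_{43}, \\
\delta_{11} = \delta_{11},\ \ldots,\ \delta_{43} = \delta_{43}, \\
\phi_{12} = \Phi_{12}, \\
\alpha = A, \\
\psi_3 = \Psi_3, \\
\gamma = C, \\
\beta = AD + B, \\
\psi_{34} = D\Psi_3, \\
\psi_4 = D^2\Psi_3 + \Psi_4 .
\end{cases}
\tag{24}
$$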
The discussion preceding Equation 24 demonstrates that with g Models 1 and 2 fulfil the S-condition. To show that in addition g is surjective, as before we need to demonstrate that its defining equations can be solved in terms of the parameters of Model 1 that appear on the right-hand sides of Equation 24. Obviously, the first 26 components of g, as identity transformations, are directly invertible for this purpose. The last 3 components of g then represent 3 equations in 3 unknowns, namely the 3 remaining parameters of Model 1: D, B, and $\Psi_4$ (recall from the preceding components that $A = \alpha$ and $\psi_3 = \Psi_3$). These equations are readily solved with straightforward algebra, yielding $D = \psi_{34}/\psi_3$, $B = \beta - \alpha\psi_{34}/\psi_3$, and $\Psi_4 = (\psi_3\psi_4 - \psi_{34}^2)/\psi_3$. [We note that $\Psi_4 > 0$ holds, due to the Cauchy-Schwarz inequality (e.g., Mardia, Kent, & Bibby, 1979), and hence this solution belongs to the parameter space of Model 1.] Thus g is surjective, and hence Models 1 and 2 are equivalent by Proposition 1.
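The structural part of this example can also be checked numerically. The following sketch is an illustration only: the function names and parameter values are hypothetical, the unit variances of $\eta_1$ and $\eta_2$ are left out because they are fixed in both models, and the measurement parts are omitted because they are identical across Models 1 and 2.

```python
from fractions import Fraction as F

def k_model1(A, B, C, D, Phi12, Psi3, Psi4):
    """Latent covariances of Model 1 (Equation 20), keyed by (row, column)."""
    t = A * D + B  # total effect of eta1 on eta4
    return {
        (2, 1): Phi12, (3, 1): A, (3, 2): A * Phi12, (3, 3): A**2 + Psi3,
        (4, 1): t + C * Phi12, (4, 2): t * Phi12 + C,
        (4, 3): A * t + A * C * Phi12 + D * Psi3,
        (4, 4): t**2 + C**2 + 2 * t * C * Phi12 + D**2 * Psi3 + Psi4,
    }

def k_model2(alpha, beta, gamma, phi12, psi3, psi4, psi34):
    """Latent covariances of Model 2 (Equation 23), keyed by (row, column)."""
    return {
        (2, 1): phi12, (3, 1): alpha, (3, 2): alpha * phi12, (3, 3): alpha**2 + psi3,
        (4, 1): beta + gamma * phi12, (4, 2): beta * phi12 + gamma,
        (4, 3): alpha * beta + alpha * gamma * phi12 + psi34,
        (4, 4): beta**2 + gamma**2 + 2 * beta * gamma * phi12 + psi4,
    }

def g(A, B, C, D, Phi12, Psi3, Psi4):
    """Structural components of Equation 24: Model 1 -> Model 2."""
    return (A, A * D + B, C, Phi12, Psi3, D**2 * Psi3 + Psi4, D * Psi3)

def g_inverse(alpha, beta, gamma, phi12, psi3, psi4, psi34):
    """Solution of Equation 24 for the Model 1 parameters."""
    D = psi34 / psi3
    return (alpha, beta - alpha * D, gamma, D, phi12, psi3, psi4 - psi34**2 / psi3)

# Hypothetical Model 1 values (A, B, C, D, Phi12, Psi3, Psi4), chosen only for the check.
theta1 = (F(1, 2), F(1, 3), F(1, 4), F(2, 5), F(1, 5), F(3, 4), F(1, 2))
theta2 = g(*theta1)

print(k_model1(*theta1) == k_model2(*theta2))  # same latent covariance matrix
print(g_inverse(*theta2) == theta1)            # g can be inverted here
```

The inversion step divides by $\psi_3$, which is harmless because residual variances are assumed positive throughout.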
Nonequivalent Models of Educational Aspiration
The following two models do not differ in their measurement parts, and are assumed identical in their structural parts to the ones of Figure 3.3 in Hershberger (1994, p. 81; cf. Stelzl, 1986, p. 325). They are depicted in Figures 5 and 6 using a different notation from Hershberger's in order to preserve consistency with that followed in the present section. These models can be formally obtained from Models 1 and 2 by "redirecting" into $\eta_3$ the explanatory path from $\eta_2$ to $\eta_4$ in the corresponding Figures 3 and 4. (For the illustrative purposes of this example, we assume $AC \ne 0$ and $\alpha\gamma \ne 0$; obviously, the cases $A = C = 0$ and $\alpha = \gamma = 0$ turn this example into a special case of the preceding one; see Footnote 5.)
The model in Figure 5, called Model 3, assumes a causal impact of SES upon significant others' influence, and of significant others' influence upon educational aspiration. The model in Figure 6, called Model 4, does not postulate the latter causality but assumes instead interrelated unexplained parts of significant others' influence and educational aspiration. Neither model is saturated in its structural part or overall. There is a difference between Models 3 and 4 on the one hand and Models 1 and 2 on the other, beyond the obvious fact that they are based on different substantive variables: Models 3 and 4 do not postulate the relationship between $\eta_2$ and $\eta_4$, but instead posit one between $\eta_2$ and $\eta_3$.

Thus, a single change in the structure of a pair of models, such as the redirection of one path, can "break" their equivalence property. Indeed, while Models 1 and 2 are equivalent, Models 3 and 4 are generally not, as shown below. At the same time, comparison of the latter with the former pair of models shows that in general it is not possible to judge directly, for example only by looking at their path diagrams, whether two models are (non)equivalent. At first glance, these two pairs of models appear to have the same equivalence status: either both equivalent (Models 1 and 2 equivalent, and 3 and 4 equivalent), or both nonequivalent (Models 1 and 2 nonequivalent, and 3 and 4 nonequivalent). This simple example emphasizes, in addition to the preceding discussion, the need also for formal (rather than graphical) means of examining model equivalence status, such as the method of this article.
Figure 5. Model 3

Next, since Models 3 and 4 do not differ in their measurement parts, in line with the earlier discussion in this section we focus on their structural parts. (Without loss of generality, the same assumptions as in the preceding example are made for model identification.) Using for convenience the same notation as with Models 1 and 2, respectively, the structural definition equations of Models 3 and 4 are as follows (see Figures 5 and 6; we note that there is no restraint upon any parameter of Models 3 and 4, except the positive definiteness of any model-implied covariance matrix, as typically assumed in practice):

$$
\begin{aligned}
\eta_3 &= A\eta_1 + C\eta_2 + \zeta_3 \\
\eta_4 &= B\eta_1 + D(A\eta_1 + C\eta_2 + \zeta_3) + \zeta_4 = (AD + B)\eta_1 + CD\eta_2 + D\zeta_3 + \zeta_4 ,
\end{aligned}
\tag{25}
$$

and
$$
\begin{aligned}
\eta_3 &= \alpha\eta_1 + \gamma\eta_2 + \zeta_3 , \\
\eta_4 &= \beta\eta_1 + \zeta_4 .
\end{aligned}
\tag{26}
$$
From Equations 25 and 26, the covariance matrices implied by the structural parts of Models 3 and 4, $K_{33} = [\kappa^{(3)}_{ij},\ i = 1, 2, 3, 4,\ j = 1, 2, 3, 4,\ j \le i]$ and $K_{44} = [\kappa^{(4)}_{ij},\ i = 1, 2, 3, 4,\ j = 1, 2, 3, 4,\ j \le i]$, are respectively as follows (e.g., Appendix 2):
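$$
\begin{aligned}
\kappa^{(3)}_{11} &= 1 , \qquad \kappa^{(3)}_{21} = \Phi_{12} , \qquad \kappa^{(3)}_{22} = 1 \\
\kappa^{(3)}_{31} &= A + C\Phi_{12} , \qquad \kappa^{(3)}_{32} = A\Phi_{12} + C \\
\kappa^{(3)}_{33} &= A^2 + C^2 + 2AC\Phi_{12} + \Psi_3 \\
\kappa^{(3)}_{41} &= AD + B + CD\Phi_{12} \\
\kappa^{(3)}_{42} &= (AD + B)\Phi_{12} + CD \\
\kappa^{(3)}_{43} &= A(AD + B) + C^2D + (2ACD + BC)\Phi_{12} + D\Psi_3 \\
\kappa^{(3)}_{44} &= (AD + B)^2 + (CD)^2 + 2CD(AD + B)\Phi_{12} + D^2\Psi_3 + \Psi_4 ,
\end{aligned}
\tag{27}
$$

and

$$
\begin{aligned}
\kappa^{(4)}_{11} &= 1 , \qquad \kappa^{(4)}_{21} = \phi_{12} , \qquad \kappa^{(4)}_{22} = 1 \\
\kappa^{(4)}_{31} &= \alpha + \gamma\phi_{12} , \qquad \kappa^{(4)}_{32} = \alpha\phi_{12} + \gamma \\
\kappa^{(4)}_{33} &= \alpha^2 + \gamma^2 + 2\alpha\gamma\phi_{12} + \psi_3 \\
\kappa^{(4)}_{41} &= \beta , \qquad \kappa^{(4)}_{42} = \beta\phi_{12} \\
\kappa^{(4)}_{43} &= \alpha\beta + \beta\gamma\phi_{12} + \psi_{34} \\
\kappa^{(4)}_{44} &= \beta^2 + \psi_4 .
\end{aligned}
\tag{28}
$$

Figure 6. Model 4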
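The entries of $K_{33}$ follow from expanding covariances of the reduced-form expressions in Equation 25. As a hedged illustration (the helper `cov`, the ordering of the sources, and the numeric values are assumptions made here, not part of the original), the sketch below recomputes $\kappa^{(3)}_{43}$ and $\kappa^{(3)}_{44}$ directly from the reduced form and compares them with closed-form expressions. In $\kappa^{(3)}_{43}$ the coefficient of $\Phi_{12}$ collects $(AD + B)C$ from the pairing of $\eta_1$ with $\eta_2$ and $ACD$ from the pairing of $\eta_2$ with $\eta_1$, that is, $2ACD + BC$.

```python
from fractions import Fraction as F

def cov(x, y, S):
    """Covariance of two linear combinations x and y of sources with covariance matrix S."""
    return sum(x[i] * S[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))

# Hypothetical Model 3 parameter values, chosen only for the check.
A, B, C, D = F(1, 2), F(1, 3), F(1, 4), F(2, 5)
Phi12, Psi3, Psi4 = F(1, 5), F(3, 4), F(1, 2)

# Sources in the order (eta1, eta2, zeta3, zeta4); eta1 and eta2 have unit variance.
S = [[1, Phi12, 0, 0],
     [Phi12, 1, 0, 0],
     [0, 0, Psi3, 0],
     [0, 0, 0, Psi4]]
eta3 = [A, C, 1, 0]              # eta3 = A*eta1 + C*eta2 + zeta3
eta4 = [A * D + B, C * D, D, 1]  # reduced form of eta4 in Equation 25

k43_direct = cov(eta4, eta3, S)
k43_formula = A * (A * D + B) + C**2 * D + (2 * A * C * D + B * C) * Phi12 + D * Psi3

k44_direct = cov(eta4, eta4, S)
k44_formula = ((A * D + B)**2 + (C * D)**2
               + 2 * C * D * (A * D + B) * Phi12 + D**2 * Psi3 + Psi4)

print(k43_direct == k43_formula, k44_direct == k44_formula)
```

The same helper can be reused for any other entry of $K_{33}$ or $K_{44}$ by writing the corresponding latent variable as a coefficient vector over the four sources (for Model 4, the off-diagonal source entry for $\zeta_3$ and $\zeta_4$ would be $\psi_{34}$).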
The necessity part of Proposition 1 states that if two models are equivalent, there exists a surjective transformation of their parameter vectors that preserves their implied covariance matrices. According to the proof in Appendix 1, such a transformation should be definable by relating those parameter vectors of both models at which they reproduce the same covariance matrix. Assume now that Models 3 and 4 are equivalent. Then there should exist a mapping h of their parameter spaces (vectors) that equalizes the corresponding elements of their covariance matrices stated in Equations 27 and 28, respectively. Thus, $\kappa^{(3)}_{21} = \Phi_{12}$ equals $\kappa^{(4)}_{21} = \phi_{12}$, so h should transform $\phi_{12}$ into $\Phi_{12}$. Next, setting $\kappa^{(3)}_{31}$ and $\kappa^{(3)}_{32}$ equal to $\kappa^{(4)}_{31}$ and $\kappa^{(4)}_{32}$, respectively, and given $\Phi_{12} = \phi_{12} = \phi$, say, we obtain a system of 2 equations in 2 parameters: $A + C\phi = \alpha + \gamma\phi$ and $A\phi + C = \alpha\phi + \gamma$. Solving it, we see that h should transform $\alpha$ into A and $\gamma$ into C (if $\phi \ne \pm 1$; for the latter case of perfect correlation of the independent latent variables, see below). Then $\kappa^{(3)}_{33} = \kappa^{(4)}_{33}$ implies that h should transform $\psi_3$ into $\Psi_3$. Comparing then $\kappa^{(3)}_{41}$ and $\kappa^{(3)}_{42}$ with $\kappa^{(4)}_{41}$ and $\kappa^{(4)}_{42}$, respectively, in the same manner we deduce that $\Phi_{12}$ and $\phi_{12}$ should equal $\pm 1$: eliminating $\beta$ from $AD + B + CD\phi = \beta$ and $(AD + B)\phi + CD = \beta\phi$ leaves $CD(1 - \phi^2) = 0$, which for $CD \ne 0$ forces $\phi = \pm 1$. This, however, contradicts the general, unrestrained nature of the correlation between the independent latent variables in either Model 3 or Model 4 (see above). Hence in general no such mapping h can be defined as would be implied by Proposition 1 in the case of model equivalence. Therefore, Models 3 and 4 are in general not equivalent.
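The obstruction in this argument can be made concrete with a few lines of exact arithmetic. The sketch below is illustrative only (the function name and the parameter values are hypothetical): once $\beta$ is chosen so that $\kappa_{41}$ agrees across Models 3 and 4, the remaining gap in $\kappa_{42}$ equals $CD(1 - \phi^2)$, so with $C$ and $D$ nonzero it vanishes only at $\phi = \pm 1$.

```python
from fractions import Fraction as F

def mismatch(A, B, C, D, phi):
    """Gap left in kappa_42 after beta is chosen to match kappa_41 (Models 3 vs. 4)."""
    beta = A * D + B + C * D * phi  # forced by kappa_41 of Model 3 = kappa_41 of Model 4
    k42_model3 = (A * D + B) * phi + C * D
    k42_model4 = beta * phi
    return k42_model3 - k42_model4  # algebraically C*D*(1 - phi**2)

# Hypothetical Model 3 path values with C and D nonzero.
A, B, C, D = F(1, 2), F(1, 3), F(1, 4), F(2, 5)

for phi in (F(1, 5), F(1), F(-1)):
    print(phi, mismatch(A, B, C, D, phi))
```

The boundary values $\phi = \pm 1$ are included only to show where the gap closes; they correspond to the degenerate case of perfectly correlated independent latent variables discussed next.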
This demonstration also shows, in addition to Example 1 in Appendix 2, that two models can be equivalent for certain values of some of their parameters and nonequivalent for the remaining parameter values, a finding that cannot generally be deduced using the existing rules. Indeed, the last paragraph shows that Models 3 and 4 are in fact equivalent if and only if $\Phi_{12} = \phi_{12} = 1$ or $-1$. Then, for purposes of structural modeling, one can consider the independent latent variables in either model, $\eta_1$ and $\eta_2$, as essentially collapsing into a single one. In this empirical context of educational aspiration predictors, such a collapse cannot be interpreted meaningfully, as there is no construct that could be both mental ability and SES. (At least it is not true that mental ability and SES are perfectly linearly related, as would follow from $\Phi_{12} = \phi_{12} = \pm 1$.) Thus one can conclude that Models 3 and 4, as means of description and explanation of the particular substantive phenomenon focused on in this section, cannot be equivalent. In more general terms, this example shows that for some models studied for equivalence, the existence of the critical transformation g may be demonstrated only after certain restrictions are imposed on the parameter space (other examples are given in Appendix 2).⁵
⁵ We thank an anonymous referee for indicating that if $A = C = 0 = \alpha = \gamma$, Models 3 and 4 are equivalent (across their corresponding parameter spaces then). This was another reason to exclude the points with $A = C = 0$ and those with $\alpha = \gamma = 0$ from the parameter spaces of Models 3 and 4, respectively, at the outset of our illustrative discussion of this example.

Conclusion

This article presented a widely applicable necessary and sufficient condition for equivalence of structural equation models. Proposition 1 provides a means for demonstrating, without use of any data, the equivalence of a pair (or sets) of models, or lack thereof. The proposition is more general than all existing rules of equivalent model generation, which have serious limitations explicated in this paper and in addition give only sufficient conditions of equivalence. The present method is applicable to pairs of models with which no rule can be used, such as models with parameter constraints that are frequently employed in behavioral research (e.g., Appendix 2), or models not fulfilling restrictive assumptions of the available rules. Furthermore, the validity of all rules is implied by Proposition 1, as shown in Appendix 3, so that the proposition is applicable also to any pair of equivalent models to which a rule applies. Hence, Proposition 1 represents a genuine extension of the rules.

The method of this article is complementary also to the conditions of model equivalence by Luijben (1991). The present one is more general than his necessary and sufficient condition, as well as his modification indices equality condition, which in addition is only a necessary one. Proposition 1 is distinct from the treatment of the equivalence problem by Bekker et al. (1994), specifically their statements in Chapters 2 and 7. This is because those authors are concerned with local equivalence, which, like Luijben's necessary and sufficient condition, is mainly of theoretical interest (e.g., Bekker et al., pp. 158, 168, 170) and represents a substantially more restrictive notion than the practical concept of model equivalence that is widely adopted in the behavioral literature and is of concern in this article (see Footnote 1).

We therefore hope that Proposition 1, along with all existing rules for generating equivalent models by Stelzl (1986), Lee & Hershberger (1990), and Hershberger (1994), will contribute to managing the problem of equivalent structural equation models in behavioral research. Because the replacement rule, Stelzl's rules, and the inverse indicator rule are often relatively easy to apply, pairs of models that can be obtained from one another with these rules are best studied in terms of them rather than with Proposition 1, which is in general considerably more involved to use; in this we see its limitation. Being applicable when no existing rule is, however, the proposition is recommended for the latter cases, particularly for models having parameter restrictions, which are frequently utilized in behavioral research, or when assumptions of the available rules are not fulfilled. In addition, the present method of proving model equivalence or lack thereof yields a special insight into the nature of the equivalence phenomenon, because it is based on the identification of a parameter transformation preserving the implied covariance matrix. Hence, this approach contributes to "demystifying" the problem of equivalent models, in that it shows that all that is achieved by any model equivalent to a given one is a specific reexpression of the implied covariance matrix in a distinct set of parameters (see Proposition 1). Thus, the method of this article represents a useful extension of the armory of behavioral scientists for detecting structural equation models equivalent to considered ones, or for showing lack of equivalence, both possibly contingent on values of model parameters, beyond the region of applicability of the existing rules for generation of equivalent models by Stelzl (1986), Lee & Hershberger (1990), and Hershberger (1994).
References
Bekker, P. A., Merckens, A., & Wansbeek, T. J. (1994). Identification, equivalent models, and computer algebra. New York: Academic Press.
Bentler, P. M. (1995). EQS structural equations program manual. Encino, CA: Multivariate Software, Inc.
Bollen, K. A. (1989). Structural equations with latent variables. New York: Wiley.
Breckler, S. (1990). Applications of covariance structure modeling in psychology: Cause for concern? Psychological Bulletin, 107, 260-273.
Hershberger, S. L. (1994). The specification of equivalent models before the collection of data. In A. von Eye and C. C. Clogg (Eds.), Latent variables analysis (pp. 68-108). Thousand Oaks, CA: Sage.
Jöreskog, K. G., & Sörbom, D. (1993). LISREL 8: User's guide. Chicago, IL: SPSS Scientific Software.
Lee, S., & Hershberger, S. (1990). A simple rule for generating equivalent models in covariance structure modeling. Multivariate Behavioral Research, 25, 313-334.
Luijben, T. C. W. (1991). Equivalent models in covariance structure analysis. Psychometrika, 56, 653-665.
MacCallum, R. C., Wegener, D. T., Uchino, B. N., & Fabrigar, L. R. (1993). The problem of equivalent models in applications of covariance structure analysis. Psychological Bulletin, 114, 185-199.
Mardia, K. V., Kent, J. T., & Bibby, J. M. (1979). Multivariate analysis. New York: Academic Press.
Raykov, T. (1996). Structural equation models for measuring change in psychology. Habilitation thesis. Institute of Psychology, Humboldt University, Berlin.
Raykov, T. (1997). Equivalent structural equation models and group equality constraints. Multivariate Behavioral Research, 32, 95-104.
Stelzl, I. (1986). Changing a causal hypothesis without changing the fit: Some rules for generating equivalent path models. Multivariate Behavioral Research, 21, 309-331.
Accepted August, 1998.
Appendix 1
Proof of Proposition 1
Sufficiency
Let $M$ and $M'$ fulfil the $\Sigma$-condition with a surjective transformation $g: \Theta \to \Theta'$, and let $\Sigma$ be an arbitrary matrix reproduced by $M$. This means that there exists a parameter vector $\theta$ from $\Theta$ such that $\Sigma = \Sigma(\theta)$. Since $g$ is defined on $\Theta$, there exists a parameter vector $\theta'$ from $\Theta'$ such that $\theta' = g(\theta)$. Given that $M$ and $M'$ fulfil the $\Sigma$-condition with $g$, $\Sigma = \Sigma(\theta) = \Sigma'[g(\theta)] = \Sigma'(\theta')$ follows, that is, $\Sigma$ is reproduced by $M'$ as well, namely at $\theta'$. Thus, $\Sigma$ belongs also to $\Omega'$, and hence $\Omega \subseteq \Omega'$. (The symbol '$\subseteq$' denotes set inclusion, i.e., the set to the left is a subset of that on the right side.)

Conversely, let $\Sigma'$ be an arbitrary covariance matrix reproduced by $M'$. This means that there exists a parameter vector $\theta'$ from $\Theta'$ such that $\Sigma' = \Sigma'(\theta')$. Since $g$ is surjective, there exists a $\theta$ from $\Theta$ such that $g(\theta) = \theta'$. By virtue of the $\Sigma$-condition, $\Sigma'(\theta') = \Sigma'[g(\theta)] = \Sigma(\theta) = \Sigma'$ follows. Hence, $\Sigma'$ is reproduced also by $M$, namely at $\theta$. Consequently, $\Sigma'$ belongs to $\Omega$ as well, and hence $\Omega' \subseteq \Omega$ holds true. Given the previous finding $\Omega \subseteq \Omega'$, it follows that $\Omega' = \Omega$. That is, $M$ and $M'$ generate identical sets of covariance matrices across their parameter spaces, and hence $M$ and $M'$ are equivalent.
Necessity
Let $M$ and $M'$ be equivalent. Here we will construct a surjective transformation $g: \Theta \to \Theta'$ with which $M$ and $M'$ fulfil the $\Sigma$-condition. The assumption of equivalence means that $M$ and $M'$ reproduce the same corresponding sets $\Omega = \Omega' = \Omega^*$, say, of covariance matrices. Let $\theta$ be from $\Theta$. Then the covariance matrix $\Sigma(\theta) = \Sigma$ from $\Omega^*$ is reproduced by $M$ as well as by $M'$. Let $\Sigma$ be reproduced by $M'$ at $\theta'$ from $\Theta'$. This means that for each $\theta$ from $\Theta$ a corresponding element $\theta'$ from $\Theta'$ can be found. Thus, we can define a transformation $g: \Theta \to \Theta'$ by mapping $\theta$ into $\theta'$, that is, by the rule $\theta' = g(\theta)$. Thereby, $g$ is surjective. Indeed, if $\theta'$ is from $\Theta'$, the covariance matrix $\Sigma'(\theta')$ reproduced by $M'$ at $\theta'$ belongs to $\Omega^*$ ($= \Omega = \Omega'$). Hence, $\Sigma'(\theta')$ must be reproduced by $M$ as well, say at $\theta$ from $\Theta$. By the definition of $g$, however, $g(\theta) = \theta'$. Hence, for each $\theta'$ from $\Theta'$ there exists a $\theta$ from $\Theta$ such that $g(\theta) = \theta'$. This means that $g$ is surjective. Moreover, by the definition of $g$, $\Sigma = \Sigma(\theta) = \Sigma'(\theta') = \Sigma'[g(\theta)]$. Hence, $M$ and $M'$ also fulfil the $\Sigma$-condition with $g$.
This completes the proof of Proposition 1.
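The two ingredients of Proposition 1 can be illustrated on a toy pair of models. The following is a minimal sketch, not part of the original proof: it replaces the parameter spaces by small finite grids (an assumption made only so that every point can be enumerated), and the helper names `fulfils_sigma_condition`, `is_surjective`, and `reproduced_set`, as well as the two toy models, are hypothetical.

```python
from itertools import product

def fulfils_sigma_condition(sigma, sigma_prime, g, space):
    """True if sigma(theta) == sigma_prime(g(theta)) for every theta in space."""
    return all(sigma(theta) == sigma_prime(g(theta)) for theta in space)

def is_surjective(g, space, space_prime):
    """True if every point of space_prime is the image of some theta in space."""
    return {g(theta) for theta in space} == set(space_prime)

def reproduced_set(sigma, space):
    """Set of all implied covariance structures over a parameter space."""
    return {sigma(theta) for theta in space}

# Toy model M: parameters (phi1, phi12); implied (variance, covariance).
def sigma(theta):
    phi1, phi12 = theta
    return (phi1, phi12)

# Toy model M': parameters (Phi1, Phi3); the same two quantities re-expressed.
def sigma_prime(theta_prime):
    Phi1, Phi3 = theta_prime
    return (Phi1 + Phi3, Phi3)

# Candidate parameter transformation from the space of M to that of M'.
def g(theta):
    phi1, phi12 = theta
    return (phi1 - phi12, phi12)

# Finite grids standing in for the parameter spaces (illustrative assumption).
space = [(p1, p12) for p1, p12 in product(range(2, 6), range(1, 4)) if p12 < p1]
space_prime = [(P1, P3) for P1, P3 in product(range(1, 5), range(1, 4)) if P1 + P3 <= 5]

condition_holds = fulfils_sigma_condition(sigma, sigma_prime, g, space)
g_is_onto = is_surjective(g, space, space_prime)
same_sets = reproduced_set(sigma, space) == reproduced_set(sigma_prime, space_prime)
print(condition_holds, g_is_onto, same_sets)
```

On these grids both premises hold and the two reproduced sets coincide, which mirrors the sufficiency direction of the proof; a finite grid cannot, of course, replace the general argument.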
Appendix 2
Not All Models Equivalent to a Given One Can Be Obtained with the Existing Rules
Three examples of pairs of models are presented here. Neither of the first two pairs can be examined using an existing rule of equivalent model generation. Proposition 1 is however directly applicable to them, and can be used to study these models' equivalence status depending on the values of their parameters. (The RR should not be applied to the third pair of models below, as they contain a nontrivial parameter restriction; as mentioned in the main text, all rules are applicable only in the case of no nontrivial parameter restrictions.)
In the following Figure 7, which depicts models $M_1$ and $M_2$, $Y_1$, $Y_2$, $Y_3$ and $Y_4$ denote four observed variables, $\eta_1$ and $\eta_2$ two latent variables pertaining to the first and second pair of manifest variables, respectively, and $\varepsilon_1$, $\varepsilon_2$, $\varepsilon_3$, $\varepsilon_4$ are the corresponding measurement errors. For the second pair of models, $A_1$ and $A_2$ depicted in Figure 8, there is a third latent variable, $\eta_3$, in $A_2$, whereas $A_1$ is a special case of $M_1$ with unitary factor loadings.
Example 1
The definition equations of model $M_1$ in Figure 7 are:

$$
\begin{aligned}
Y_1 &= \eta_1 + \varepsilon_1 \\
Y_2 &= \lambda_2 \eta_1 + \varepsilon_2 \\
Y_3 &= \eta_2 + \varepsilon_3 \\
Y_4 &= \lambda_4 \eta_2 + \varepsilon_4 .
\end{aligned}
\tag{29}
$$

In addition, the model assumes interrelated latent variables, that is, $\mathrm{Cov}(\eta_1, \eta_2) = \phi_{12} \neq 0$.
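Model $M_2$ in Figure 7 has identical definition equations to those of $M_1$ but assumes that the latent variables are unrelated and that some of the measurement errors have equal (nonzero) covariances:

$$
\mathrm{Cov}(\varepsilon_1, \varepsilon_3) = \mathrm{Cov}(\varepsilon_1, \varepsilon_4) = \mathrm{Cov}(\varepsilon_2, \varepsilon_3) = \mathrm{Cov}(\varepsilon_2, \varepsilon_4) = \rho \neq 0 .
$$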
Evidently, the measurement parts of models $M_1$ and $M_2$ are neither saturated (even if the restrictions in the last equation are not imposed while these error covariances are assumed nonzero), nor identical. The models are identified, as can be shown directly, and have the same number of parameters. The structural model is 'void' in both models, that is, $B = 0$, in terms of Submodel 3 of the general LISREL model (Jöreskog & Sörbom, 1993). The structural correlation matrix, however, is different across $M_1$ and $M_2$. That is, their structural models differ: while $M_1$ is saturated there, $M_2$ is not. Evidently, $M_1$ has one fewer parameter in its measurement model than $M_2$ (which additionally contains the error covariance $\rho$), and one more in its structural model (the latent covariance $\phi_{12}$). That is, models $M_1$ and $M_2$ differ in both their measurement and structural parts (rather than only in their structural part as is most frequently the case when applying the RR; see MacCallum et al., 1993).

Figure 7. Models $M_1$ and $M_2$. The equality sign denotes assumed identity of the pertaining measurement error covariances.
The RR is not applicable to $M_1$ and $M_2$: neither a focal block, nor a preceding block, nor a succeeding block can be identified in order to get started with the rule, since these blocks are not definable with such a pair of models. This limitation is due to the fact that the RR is restrictive, in that it only inverts paths or exchanges them for covariances between the residuals of their pertinent variables (those that the path originally connects). That is, the RR has a local nature, in that the change in the model remains within the same portion of it (the focal block). Yet this is not the case with models $M_1$ and $M_2$. Neither of Stelzl's rules is applicable, nor is Hershberger's inverse indicator rule, as it only inverts a measurement model path. A major reason why no existing rule is applicable to $M_1$ and $M_2$ is that the rules have been developed for models without parameter restrictions, yet $M_2$ has such restrictions (see the last equation).
In contrast to the existing rules, Proposition 1 is readily applied to models $M_1$ and $M_2$. To this end, use will be made of the following two simple rules of covariance algebra [e.g., Bollen, 1989; Cov(.,.) denotes covariance and Var(.) variance].

Rule 1. For any random variable $X$ with finite second-order moment,

$$
\mathrm{Cov}(X, X) = \mathrm{Var}(X); \text{ and}
\tag{30}
$$

Rule 2. For any random variables $X$, $Y$, $Z$, and $U$ with finite second-order moments, and any real numbers $a$, $b$, $c$, and $d$,

$$
\mathrm{Cov}(aX + bY,\, cZ + dU) = ac\,\mathrm{Cov}(X,Z) + ad\,\mathrm{Cov}(X,U) + bc\,\mathrm{Cov}(Y,Z) + bd\,\mathrm{Cov}(Y,U) .
\tag{31}
$$
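Rules 1 and 2 are stated for population moments, but the sample covariance obeys the same identities exactly, which gives a quick way to check them. The sketch below is only an illustration under that assumption; the data values and the helper names `cov`, `var`, and `combine` are made up for the purpose, and exact rational arithmetic is used so that the comparison involves no rounding.

```python
from fractions import Fraction

def cov(x, y):
    """Unbiased sample covariance of two equally long sequences."""
    n = len(x)
    mean_x = Fraction(sum(x), n)
    mean_y = Fraction(sum(y), n)
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)

def var(x):
    """Rule 1: the variance is the covariance of a variable with itself."""
    return cov(x, x)

def combine(a, x, b, y):
    """Element-wise linear combination a*x + b*y."""
    return [a * xi + b * yi for xi, yi in zip(x, y)]

# Arbitrary illustrative data and coefficients (not from the article).
X = [1, 4, 2, 8, 5]
Y = [3, 1, 4, 1, 5]
Z = [2, 7, 1, 8, 2]
U = [6, 2, 8, 3, 1]
a, b, c, d = 2, -3, 5, 7

# Rule 2: covariance of two linear combinations expands bilinearly.
lhs = cov(combine(a, X, b, Y), combine(c, Z, d, U))
rhs = (a * c * cov(X, Z) + a * d * cov(X, U)
       + b * c * cov(Y, Z) + b * d * cov(Y, U))
print(lhs == rhs)
```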
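Now the covariance matrix $\Sigma_1$ implied by $M_1$ is directly obtained from its definition equations, Equations 29, and Equations 30 and 31. Denoting by $\phi_i$ the variances of $\eta_i$ ($i = 1, 2$) and by $\delta_i$ those of $\varepsilon_i$ ($i = 1, \ldots, 4$), one obtains:

$$
\Sigma_1 =
\begin{bmatrix}
\phi_1 + \delta_1 & & & \\
\lambda_2 \phi_1 & \lambda_2^2 \phi_1 + \delta_2 & & \\
\phi_{12} & \lambda_2 \phi_{12} & \phi_2 + \delta_3 & \\
\lambda_4 \phi_{12} & \lambda_2 \lambda_4 \phi_{12} & \lambda_4 \phi_2 & \lambda_4^2 \phi_2 + \delta_4
\end{bmatrix} .
\tag{32}
$$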
Using for convenience capital letters for corresponding parameters in the model definition equations of model $M_2$ ("look" at Equations 29 in terms of capital symbols now, yet this symbolism is not mandatory; see main text) one obtains its reproduced covariance matrix $\Sigma_2$:

$$
\Sigma_2 =
\begin{bmatrix}
\Phi_1 + \Delta_1 & & & \\
\Lambda_2 \Phi_1 & \Lambda_2^2 \Phi_1 + \Delta_2 & & \\
\rho & \rho & \Phi_2 + \Delta_3 & \\
\rho & \rho & \Lambda_4 \Phi_2 & \Lambda_4^2 \Phi_2 + \Delta_4
\end{bmatrix} .
$$
Comparing the corresponding entries of the reproduced covariance matrices $\Sigma_1$ and $\Sigma_2$, it is observed that $M_1$ and $M_2$ cannot be equivalent if at least one of the unknown factor loadings of the former, $\lambda_2$ and $\lambda_4$, differs from 1. This is because then $M_1$ would generally reproduce covariance matrices that cannot be implied by $M_2$. For instance, $M_1$ implies $\mathrm{Cov}(Y_1, Y_3) = \phi_{12}$ and $\mathrm{Cov}(Y_2, Y_3) = \lambda_2 \phi_{12}$, which differ when $\lambda_2 \neq 1$, whereas $M_2$ implies the common value $\rho$ for both. Yet by the necessity part of Proposition 1 (see its proof in Appendix 1) each of these matrices should be reproducible by $M_1$ at a parameter $\theta$, say, and by $M_2$ at a $\theta'$ [$= g(\theta)$] from its parameter space, which is a contradiction since there may be no such $\theta$ and $\theta'$.

However, if $\lambda_2 = \lambda_4 = 1 = \Lambda_2 = \Lambda_4$, there exists a 7-dimensional parameter vector transformation $g$ that relates the model free parameters symbolized by the same letters in different cases (i.e., capital vs. lower) across the models, and additionally maps $\phi_{12}$ into $\rho$. (Note that in case $\Lambda_2 \neq 1$ or $\Lambda_4 \neq 1$, even if $\lambda_2 = \lambda_4 = 1$, the models still cannot be equivalent, as is shown with the same argument as in the preceding paragraph.) Formally, $g$ is defined as follows:
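$$
g: \quad
\begin{cases}
\Delta_1 = \delta_1 \\
\Delta_2 = \delta_2 \\
\Delta_3 = \delta_3 \\
\Delta_4 = \delta_4 \\
\Phi_1 = \phi_1 \\
\Phi_2 = \phi_2 \\
\rho = \phi_{12}
\end{cases}
\tag{33}
$$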
With this $g$, evidently $\Sigma_1(\theta_1) = \Sigma_2[g(\theta_1)]$, where $\theta_1 = (\delta_1, \delta_2, \ldots, \phi_{12})^t$ is the 7-dimensional (free) parameter vector of Model $M_1$. That is, Models $M_1$ and $M_2$ fulfil the $\Sigma$-condition. Furthermore, from Equation 33 it is immediately seen that $g$ is surjective: for any given parameter vector $\theta_2 = (\Phi_1, \Phi_2, \Delta_1, \ldots, \Delta_4, \rho)^t$ of Model $M_2$, as a corresponding one of Model $M_1$ [i.e., as $\theta_1$ that is transformed by $g$ into the former: $\theta_2 = g(\theta_1)$] take that with identical measurement error variances and latent variances to those of $M_2$, and a latent covariance $\phi_{12} = \rho$. Thus, based on Proposition 1, Models $M_1$ and $M_2$ are equivalent if $\lambda_2 = \lambda_4 = 1 = \Lambda_2 = \Lambda_4$. As shown in the preceding paragraph, they are nonequivalent for any other values of these factor loadings that do not obey the equations $\lambda_2 = \lambda_4 = 1 = \Lambda_2 = \Lambda_4$. It is emphasized that none of these conclusions can be arrived at using any of the available rules of equivalent model generation.
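The comparison made in Example 1 can be traced numerically. The following minimal sketch builds the lower triangles of the two implied covariance matrices from the model definitions and applies the transformation of Equation 33; the function names, the dictionary layout of the parameter vector, and the numeric values are illustrative assumptions rather than anything prescribed by the article.

```python
def sigma_m1(lam2, lam4, phi1, phi2, phi12, delta):
    """Lower triangle of the covariance matrix implied by M1 (Equation 32)."""
    d1, d2, d3, d4 = delta
    return [
        [phi1 + d1],
        [lam2 * phi1, lam2**2 * phi1 + d2],
        [phi12, lam2 * phi12, phi2 + d3],
        [lam4 * phi12, lam2 * lam4 * phi12, lam4 * phi2, lam4**2 * phi2 + d4],
    ]

def sigma_m2(Lam2, Lam4, Phi1, Phi2, rho, Delta):
    """Lower triangle implied by M2: unrelated factors, equal error covariances rho."""
    D1, D2, D3, D4 = Delta
    return [
        [Phi1 + D1],
        [Lam2 * Phi1, Lam2**2 * Phi1 + D2],
        [rho, rho, Phi2 + D3],
        [rho, rho, Lam4 * Phi2, Lam4**2 * Phi2 + D4],
    ]

def g(phi1, phi2, phi12, delta):
    """Equation 33: identity on all variances, phi12 mapped into rho."""
    return {"Phi1": phi1, "Phi2": phi2, "rho": phi12, "Delta": delta}

# Arbitrary admissible parameter values for M1 (illustrative only).
theta1 = {"phi1": 3, "phi2": 5, "phi12": 2, "delta": (1, 1, 2, 2)}

# With all loadings equal to 1 the Sigma-condition holds at this point.
equal_at_unit_loadings = sigma_m1(1, 1, **theta1) == sigma_m2(1, 1, **g(**theta1))

# With lambda2 = 2, M1 implies Cov(Y1,Y3) != Cov(Y2,Y3); M2 forces both to rho.
s1 = sigma_m1(2, 1, **theta1)
cov_y1_y3, cov_y2_y3 = s1[2][0], s1[2][1]
print(equal_at_unit_loadings, cov_y1_y3, cov_y2_y3)
```

The second check shows why no parameter point of $M_2$ can match $M_1$ once a loading departs from 1: the two entries that $M_2$ constrains to be equal are unequal in the matrix implied by $M_1$.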
Example 2
For reasons outlined in Example 1, none of the existing rules can be applied to models $A_1$ and $A_2$ in Figure 8. The structural part (covariance matrix of size 3 × 3) of model $A_2$ is not saturated, as it has only 3 nonzero elements, the latent variances; yet that part of Model $A_1$ is saturated. Either model is identified, as can be directly shown.

For simplicity (and without any limitation of generality), all 8 factor loadings are assumed to be 1, to make more transparent the demonstration that no rule is applicable to these models while Proposition 1 is directly useable. The defining equations of Models $A_1$ and $A_2$ are respectively

$$
\begin{aligned}
Y_1 &= \eta_1 + \varepsilon_1 \\
Y_2 &= \eta_1 + \varepsilon_2 \\
Y_3 &= \eta_2 + \varepsilon_3 \\
Y_4 &= \eta_2 + \varepsilon_4 ,
\end{aligned}
$$

and

$$
\begin{aligned}
Y_1 &= \eta_1 + \eta_3 + \varepsilon_1 \\
Y_2 &= \eta_1 + \eta_3 + \varepsilon_2 \\
Y_3 &= \eta_2 + \eta_3 + \varepsilon_3 \\
Y_4 &= \eta_2 + \eta_3 + \varepsilon_4 ,
\end{aligned}
$$

whereby the latent covariance $\phi_{12} = \mathrm{Cov}(\eta_1, \eta_2)$ is a parameter of $A_1$ in addition to the latent variances $\phi_j = \mathrm{Var}(\eta_j)$, $j = 1, 2$, and the measurement error variances $\delta_i = \mathrm{Var}(\varepsilon_i)$, $i = 1, \ldots, 4$; instead of $\phi_{12}$, Model $A_2$ has the parameter $\Phi_3 = \mathrm{Var}(\eta_3)$, in addition to the latent variances $\Phi_j = \mathrm{Var}(\eta_j)$, $j = 1, 2$, and the measurement error variances $\Delta_i = \mathrm{Var}(\varepsilon_i)$, $i = 1, \ldots, 4$. For the developments next, it is also assumed that $0 < \phi_{12} < \min(\phi_1, \phi_2)$, where min(.,.) denotes the smaller number, but we note that this assumption does not limit the validity of the following argument.
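The covariance matrices implied by Models $A_1$ and $A_2$ are correspondingly

$$
\Sigma_{A_1} =
\begin{bmatrix}
\phi_1 + \delta_1 & & & \\
\phi_1 & \phi_1 + \delta_2 & & \\
\phi_{12} & \phi_{12} & \phi_2 + \delta_3 & \\
\phi_{12} & \phi_{12} & \phi_2 & \phi_2 + \delta_4
\end{bmatrix}
$$

$$
\Sigma_{A_2} =
\begin{bmatrix}
\Phi_1 + \Phi_3 + \Delta_1 & & & \\
\Phi_1 + \Phi_3 & \Phi_1 + \Phi_3 + \Delta_2 & & \\
\Phi_3 & \Phi_3 & \Phi_2 + \Phi_3 + \Delta_3 & \\
\Phi_3 & \Phi_3 & \Phi_2 + \Phi_3 & \Phi_2 + \Phi_3 + \Delta_4
\end{bmatrix}
$$

Figure 8. Models $A_1$ and $A_2$.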
With the method of this article, we successively equate corresponding elements of the implied covariance matrices, and obtain the equalities $\delta_i = \Delta_i$ ($i = 1, \ldots, 4$), $\phi_1 = \Phi_1 + \Phi_3$, $\phi_2 = \Phi_2 + \Phi_3$, $\phi_{12} = \Phi_3$. That is, the sought parameter transformation $g$ is defined as follows:

$$
g: \quad
\begin{cases}
\Delta_1 = \delta_1 \\
\Delta_2 = \delta_2 \\
\Delta_3 = \delta_3 \\
\Delta_4 = \delta_4 \\
\Phi_1 = \phi_1 - \phi_{12} \\
\Phi_2 = \phi_2 - \phi_{12} \\
\Phi_3 = \phi_{12}
\end{cases}
\tag{34}
$$
Direct comparison of $\Sigma_{A_1}$ and $\Sigma_{A_2}$ implies that with $g$ Models $A_1$ and $A_2$ fulfil the $\Sigma$-condition. Then, Equations 34 represent a system of 7 equations in 7 unknowns, the parameters of $A_1$. They can be directly solved in terms of the latter parameters: $\delta_i = \Delta_i$ ($i = 1, \ldots, 4$), $\phi_1 = \Phi_1 + \Phi_3$, $\phi_2 = \Phi_2 + \Phi_3$, and $\phi_{12} = \Phi_3$. (We note that due to the earlier made assumptions all obtained variances are positive, and hence this solution belongs to the corresponding parameter space of Model $A_1$.) Thus, $g$ is also surjective, and by Proposition 1 Models $A_1$ and $A_2$ are equivalent.
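The argument of Example 2 can likewise be checked at a single parameter point. The sketch below is an illustration only: the function names `sigma_a1`, `sigma_a2`, `g`, and `g_inverse` and the numeric values are assumptions, with unit loadings throughout as in the example.

```python
def sigma_a1(phi1, phi2, phi12, delta):
    """Lower triangle of the covariance matrix implied by A1 (unit loadings)."""
    d1, d2, d3, d4 = delta
    return [
        [phi1 + d1],
        [phi1, phi1 + d2],
        [phi12, phi12, phi2 + d3],
        [phi12, phi12, phi2, phi2 + d4],
    ]

def sigma_a2(Phi1, Phi2, Phi3, Delta):
    """Lower triangle implied by A2: three unrelated factors, eta3 loads on all Y."""
    D1, D2, D3, D4 = Delta
    return [
        [Phi1 + Phi3 + D1],
        [Phi1 + Phi3, Phi1 + Phi3 + D2],
        [Phi3, Phi3, Phi2 + Phi3 + D3],
        [Phi3, Phi3, Phi2 + Phi3, Phi2 + Phi3 + D4],
    ]

def g(phi1, phi2, phi12, delta):
    """Equation 34: maps a parameter point of A1 into one of A2."""
    return (phi1 - phi12, phi2 - phi12, phi12, delta)

def g_inverse(Phi1, Phi2, Phi3, Delta):
    """Solution of Equation 34 for the parameters of A1."""
    return (Phi1 + Phi3, Phi2 + Phi3, Phi3, Delta)

# Illustrative point satisfying 0 < phi12 < min(phi1, phi2).
theta_a1 = (5, 4, 2, (1, 2, 3, 4))

sigma_condition_holds = sigma_a1(*theta_a1) == sigma_a2(*g(*theta_a1))
round_trip_ok = g_inverse(*g(*theta_a1)) == theta_a1
print(sigma_condition_holds, round_trip_ok)
```

Because `g_inverse` returns a valid point of $A_1$ for any positive variances of $A_2$, every parameter point of $A_2$ is reached by `g`, which is the surjectivity used in the example.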
Example 3
In this final pair of models, $M_{01}$ and $M_{02}$, it is assumed that the percentages of explained variance in the latent dependent variables Mental Ability and Significant Others are identical. After fixing all latent variances to 1, this restriction is achieved by setting equal the variances of their associated residuals (see Figures 9 and 10). Note that apart from this parameter restraint all assumptions of the RR are satisfied with this pair of models.
With such a constraint, the RR should not be used because all rules were developed for models with no parameter restrictions (e.g., Stelzl, 1986; Lee & Hershberger, 1990). To show that this is an essential requirement, particularly of the replacement rule (RR), one may explore what the result would be if one nonetheless applied the RR. (We do not propose here that one should do so, though.) If this requirement were to be ignored, the RR would "suggest" equivalence of these two models. However, $M_{01}$ and $M_{02}$ are not equivalent. A simple way of seeing this lack of equivalence is via fitting them to an (appropriately sized) empirical covariance matrix, which would result in markedly different chi-square values. Alternatively, one can work out the variances and covariance of just an indicator of $\eta_2$ and one of $\eta_3$ in either model, similarly to what was done throughout this article. One will then find different implied elements of the reproduced covariance matrices by the two models, whereby these elements cannot be empirically matched to each other due to the restriction of equal latent residual variances. This being the case, the two models are not equivalent, since each model implies at least one covariance matrix not reproducible by the other model (for the argument yielding lack of equivalence, see Example 1 and Proposition 1). We emphasize that Example 3 is used here only to show that the requirement of no parameter restrictions is an essential one for a proper application of the rules, particularly the replacement rule.
Figure 9. Model $M_{01}$.
Note: Variances of latent variables $\eta_2$ and $\eta_3$ are set equal to 1 and variances of their disturbances $\zeta_2$ and $\zeta_3$ are set equal to one another.
Figure 10. Model $M_{02}$.
Note: Variances of latent variables $\eta_2$ and $\eta_3$ are set equal to 1 and variances of their disturbances $\zeta_2$ and $\zeta_3$ are set equal to one another.

Appendix 3
Proposition 1 Implies the Validity of all Existing Rules of Equivalent Model Generation

Let all assumptions of the RR be fulfilled, and let model $M_{11}$ before the application of the RR, as well as model $M_{22}$ after its application, be identified. (This is the setting considered by Lee & Hershberger, 1990.) Let $P = (P_1, \ldots, P_m)'$ be the vector of common predictors of $X$ and $Y$. (To save tedious writing, only in this subsection a prime will denote vector/matrix transposition.) Finally, let $Q = (Q_1, \ldots, Q_n)'$ be the vector of additional predictors of $Y$ ($m, n > 0$). The preceding block (abbreviated to PB below) comprises all predictors $P_1, \ldots, P_m, Q_1, \ldots, Q_n$ of $X$ and $Y$, with all their interrelationships (nonrecursive allowed too), and possibly other variables that are predictors of such in the succeeding block but not of $X$ and/or $Y$. The vector of all variables appearing in PB will be denoted by $V_p$. The vector of focal block (FB) variables will be written as $V_f$, and comprises $X$ and $Y$, that is, $V_f = (X, Y)'$. The succeeding block (SB) comprises, say, the variables $Z_1, \ldots, Z_q$ ($q > 0$), compactly written as the vector $V_s = (Z_1, \ldots, Z_q)'$. All variables in $V_p$, $V_f$, and $V_s$ will be considered latent below, as done by Lee and Hershberger. Thereby, the measurement models of $M_{11}$ and $M_{22}$ will be assumed identical, as Lee and Hershberger did. Because of this invariance, and as discussed in the main text, in order to use Proposition 1 it suffices to show the existence of a surjective transformation $g$ of the parameter vector of $M_{11}$ into that of $M_{22}$, with which $M_{11}$ and $M_{22}$ fulfil the $\Sigma$-condition.
Model Definition Equations
Model $M_{11}$ (before applying RR) is defined by the following equations; in them, $E_p$, $u$, $v$, and $E_s$ are pertinent residual terms with the usual assumptions of uncorrelatedness, including that of $u$ and $v$:

$$
\begin{aligned}
V_p &= A_{pp} V_p + E_p , \\
X &= a' P + u , \\
Y &= b' P + c' Q + B X + v \\
  &= (b' + B a') P + c' Q + (B u + v) \quad (B \neq 0 \text{ is assumed, see Lee \& Hershberger, 1990}) \\
V_s &= A_{ps} V_p + K V_f + L V_s + E_s .
\end{aligned}
\tag{35}
$$

In Equation 35, $a$, $b$, and $c$ are the $m \times 1$, $m \times 1$, and $n \times 1$ vectors containing the partial regression coefficients of $X$ and $Y$ upon the common predictors of $X$ and $Y$ and upon the additional predictors of $Y$, respectively. (Some of the elements of these vectors may be zero, as may any element of the matrices mentioned in this paragraph.) The matrix $A_{pp}$ is appropriately sized and contains all regression coefficients in the PB (i.e., of the relationships of PB-variables among themselves), and $A_{ps}$ is the matrix of coefficients of SB-variables upon PB-variables. The matrix $K$ has two columns only, containing the coefficients that relate the SB-variables to $X$ and $Y$, and the matrix $L$ is correspondingly sized and relates SB-variables to other SB-variables. It is noted that Equations 35 use the limited block-recursiveness assumption made by Lee & Hershberger (1990), in that variables in a 'later' block (in the order PB, FB and SB) do not impact variables in an 'earlier' block.
Model $M_{22}$ (obtained from $M_{11}$ via an application of the RR) is defined by the following equations ($u$ and $w$ are assumed generally correlated now, with covariance denoted $\sigma_{uw}$; $\tilde{b}$ denotes the vector of coefficients of $Y$ upon $P$ in $M_{22}$, which in general differs from $b$ in $M_{11}$):

$$
\begin{aligned}
V_p &= A_{pp} V_p + E_p && \text{(nothing has been changed by the RR in PB)} \\
X &= a' P + u && \text{(nothing changed here, too)} \\
Y &= \tilde{b}' P + c' Q + w && \text{(nothing changed regarding the predictors of } Y \text{ from set } Q) \\
V_s &= K V_f + L V_s + A_{ps} V_p + E_s && \text{(nothing changed in the SB, too).}
\end{aligned}
\tag{36}
$$

(In Equation 36, one could use different symbols for any of the parameters of $M_{22}$, relative to $M_{11}$. However, since many parameters of $M_{11}$ are identically transformed by $g$ defined in Equation 37 below, to save tedious rewriting the same symbols are used for them as with $M_{11}$; see main text on common parameters and implications for simplification when constructing $g$.)
Definition of the Sought Transformation g
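The transformation $g$ is defined as follows (additional notation used next: $\sigma_{uu}$ and $\sigma_{vv}$ denote the variances of $u$ and $v$, respectively, and $\tilde{b} = (\tilde{b}_1, \ldots, \tilde{b}_m)'$ denotes the vector of coefficients of $Y$ upon $P$ in $M_{22}$):

All elements of the parameter vector $\theta$ of $M_{11}$, except $b_1, \ldots, b_m$, $B$ and $\sigma_{vv}$, are identically transformed (i.e., left unchanged); in addition

$$
\begin{aligned}
\tilde{b}_1 &= b_1 + B a_1 \\
\tilde{b}_2 &= b_2 + B a_2 \\
&\;\;\vdots \\
\tilde{b}_m &= b_m + B a_m \quad (\text{that is, compactly written, } \tilde{b} = b + B a) \\
\sigma_{uw} &= B \sigma_{uu} \\
\sigma_{ww} &= B^2 \sigma_{uu} + \sigma_{vv} .
\end{aligned}
\tag{37}
$$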
With g Defined in Equation 37, Models $M_{11}$ and $M_{22}$ Fulfil the $\Sigma$-Condition and g Is Surjective

Indeed, we note first that a natural partitioning of the vector $V$ of all variables appearing in either model is as follows: $V = (V_p', V_f', V_s')'$. This leads to a partitioning of the covariance matrix ${}_1\Sigma$ implied by $M_{11}$ into six blocks. That pertaining to the PB-variables is denoted below by ${}_1\Sigma_{11}$, that to the FB-variables by ${}_1\Sigma_{22}$, the one to the SB-variables by ${}_1\Sigma_{33}$, and their covariances' blocks by ${}_1\Sigma_{21}$, ${}_1\Sigma_{31}$, and ${}_1\Sigma_{32}$, respectively. In the same way partition the matrix ${}_2\Sigma$ for $M_{22}$ into the corresponding blocks ${}_2\Sigma_{11}$ to ${}_2\Sigma_{32}$. We will show next that for $g$ defined as in Equation 37 above, ${}_1\Sigma_{ij}(\theta) = {}_2\Sigma_{ij}[g(\theta)]$ ($i, j = 1, 2, 3$; $i \geq j$), and hence ${}_1\Sigma(\theta) = {}_2\Sigma[g(\theta)]$. Throughout, $\tilde{b} = b + B a$ denotes the vector of coefficients of $Y$ upon $P$ in $M_{22}$ (see Equations 37).

1. ${}_1\Sigma_{11}(\theta) = {}_2\Sigma_{11}[g(\theta)]$ because (a) nothing has been changed in the PB by an application of the RR, and (b) due to the definition of $g$ its components of relevance to the PB are the identity transformations. Indeed, if $V_{pi}$ and $V_{pj}$ are two variables from the PB in $M_{11}$ ($i = j$ is not ruled out next), their definition equations are (see Equations 35)

$$
V_{pi} = A_{pp}(i) V_p + E_{pi}, \quad \text{and} \quad V_{pj} = A_{pp}(j) V_p + E_{pj},
$$

with $A_{pp}(i)$ and $A_{pp}(j)$ being the corresponding rows in the regression coefficient matrix $A_{pp}$, and $E_{pi}$ and $E_{pj}$ these variables' error terms. Then

$$
\mathrm{Cov}(V_{pi}, V_{pj}) = A_{pp}(i)\,\mathrm{Var}(V_p)\,A_{pp}(j)' + \mathrm{Cov}(E_{pi}, E_{pj})
\tag{38}
$$

in $M_{11}$. However, according to its definition $g$ transforms identically any of the parameters of $M_{11}$ appearing in the right-hand side of Equation 38. Hence, the expression $A_{pp}(i)\,\mathrm{Var}(V_p)\,A_{pp}(j)' + \mathrm{Cov}(E_{pi}, E_{pj})$ remains the same as a result of applying $g$. Since $M_{11}$ and $M_{22}$ are identical in the preceding block (as the RR does not change anything in the PB), this expression equals $\mathrm{Cov}(V_{pi}, V_{pj})$ also in $M_{22}$ (see Equations 36). Thus, Equation 38 is valid in $M_{22}$ as well. That is, for any two arbitrary variables in the preceding block, their covariance is preserved under $g$.

Therefore, ${}_1\Sigma_{11}(\theta) = {}_2\Sigma_{11}[g(\theta)]$.

2. Also ${}_1\Sigma_{22}(\theta) = {}_2\Sigma_{22}[g(\theta)]$ holds true because of the definition of $g$ in its part of relevance to the FB. Indeed, since $X = a' P + u$ in either model, its variance $a'\,\mathrm{Var}(P)\,a + \sigma_{uu}$ is not changed by $g$ (see Equations 37 defining the transformation) and represents the variance of $X$ in $M_{22}$ as well (see Equations 36; this is because the RR does not alter the relationship of $X$ to other variables in the PB). The variance of $Y$ in $M_{11}$ is (e.g., Mardia, Kent, & Bibby, 1979, ch. 2.2.2; see Equations 35):

$$
\mathrm{Var}(Y) = (b + B a)'\,\mathrm{Var}(P)\,(b + B a) + c'\,\mathrm{Var}(Q)\,c + (b + B a)'\,\mathrm{Cov}(P,Q)\,c + [(b + B a)'\,\mathrm{Cov}(P,Q)\,c]' + B^2 \sigma_{uu} + \sigma_{vv} .
$$

The latter expression is transformed by $g$ into (see Equations 37)

$$
\tilde{b}'\,\mathrm{Var}(P)\,\tilde{b} + c'\,\mathrm{Var}(Q)\,c + \tilde{b}'\,\mathrm{Cov}(P,Q)\,c + [\tilde{b}'\,\mathrm{Cov}(P,Q)\,c]' + \sigma_{ww} .
$$

This, however, is exactly the variance of $Y$ in model $M_{22}$, in terms of its own parameters, as can be directly deduced from Equations 36 using the covariance rules 30 and 31. Finally, in $M_{11}$ the covariance between $X$ and $Y$ is (see Equations 35):

$$
a'\,\mathrm{Var}(P)\,(b + B a) + a'\,\mathrm{Cov}(P,Q)\,c + B \sigma_{uu} .
$$

This expression is transformed by $g$ (see Equations 37 above) into

$$
a'\,\mathrm{Var}(P)\,\tilde{b} + a'\,\mathrm{Cov}(P,Q)\,c + \sigma_{uw} .
$$

The last expression is exactly the covariance of $X$ and $Y$ in $M_{22}$, in terms of its parameters, as is directly derived from Equations 36 using Equations 30 and 31.

Therefore, ${}_1\Sigma_{22}(\theta) = {}_2\Sigma_{22}[g(\theta)]$.

3. Furthermore, ${}_1\Sigma_{21}(\theta) = {}_2\Sigma_{21}[g(\theta)]$ is true as well. To see this, first it is observed that the covariances of $X$ with any of the PB variables are unchanged by $g$. Indeed, in $M_{11}$ (see Equations 35) these covariances are ($V_{pi}$ is an arbitrary variable from $V_p$, and the notation of point 1 above in this appendix is used, as well as Rules 30 and 31):

$$
\mathrm{Cov}(X, V_{pi}) = \mathrm{Cov}[a' P + u,\; A_{pp}(i) V_p + E_{pi}] = a'\,\mathrm{Cov}(P, V_p)\,A_{pp}(i)' .
\tag{39}
$$

The right-hand side of Equation 39 is unchanged by $g$ and equals again $\mathrm{Cov}(V_{pi}, X)$ in model $M_{22}$, as can be shown by working out $\mathrm{Cov}(V_{pi}, X)$ in $M_{22}$ using Equations 36, and Rules 30 and 31.

Second, the covariances of $Y$ with the PB-variables are (see Equations 35):

$$
\begin{aligned}
\mathrm{Cov}(V_{pi}, Y) &= \mathrm{Cov}[A_{pp}(i) V_p + E_{pi},\; (b' + B a') P + c' Q + (B u + v)] \\
&= A_{pp}(i)\,\mathrm{Cov}(V_p, P)\,(b + B a) + A_{pp}(i)\,\mathrm{Cov}(V_p, Q)\,c .
\end{aligned}
\tag{40}
$$

The right-hand side of Equation 40 is transformed by $g$ into (see Equations 37)

$$
A_{pp}(i)\,\mathrm{Cov}(V_p, P)\,\tilde{b} + A_{pp}(i)\,\mathrm{Cov}(V_p, Q)\,c ,
\tag{41}
$$

which is identical to $\mathrm{Cov}(V_{pi}, Y)$ in $M_{22}$, as can be shown directly using Equations 36, and Rules 30 and 31.

Therefore, ${}_1\Sigma_{21}(\theta) = {}_2\Sigma_{21}[g(\theta)]$.
4. Next, ${}_1\Sigma_{33}(\theta) = {}_2\Sigma_{33}[g(\theta)]$, due to the definition of $g$ in its part of relevance here; that part consists of identity components (see Equation 37). Thus, for any two variables $Z_i$ and $Z_j$ ($i = j$ is not ruled out) from the SB, all variables appearing in their equations and their pertinent coefficients are not changed by $g$. Since the RR does not change anything in the SB, and the corresponding components of $g$ are identities, $\mathrm{Cov}(Z_i, Z_j)$ is preserved under $g$, and therefore ${}_1\Sigma_{33}(\theta) = {}_2\Sigma_{33}[g(\theta)]$.

5. Similarly to 4, ${}_1\Sigma_{32}(\theta) = {}_2\Sigma_{32}[g(\theta)]$ is valid as well because the definition equation of any variable in the SB is preserved under $g$, as is its relationship to either $X$ or $Y$, which is the same in $M_{11}$ and $M_{22}$ due to the RR not changing the relationships of any variable in the SB to either $X$ or $Y$. Thus, for any 2 variables, one from SB and one from FB, their covariance is preserved under $g$ and remains the same in $M_{22}$. Thus, ${}_1\Sigma_{32}(\theta) = {}_2\Sigma_{32}[g(\theta)]$.

6. Finally, ${}_1\Sigma_{31}(\theta) = {}_2\Sigma_{31}[g(\theta)]$ holds true too, since the definition equations of any variable in the PB and any variable in the SB are identical in $M_{11}$ and $M_{22}$ (as the RR does not change anything within the PB and SB), and unchanged by the pertinent components of $g$ that are identities. Thus, for any two variables, one from the PB and one from the SB, their covariance is identical in $M_{11}$ and $M_{22}$, and preserved by $g$. Therefore, ${}_1\Sigma_{31}(\theta) = {}_2\Sigma_{31}[g(\theta)]$.

Thus, all 6 blocks of the covariance matrices implied by $M_{11}$ and $M_{22}$, ${}_1\Sigma$ and ${}_2\Sigma$, are preserved by $g$, and therefore ${}_1\Sigma(\theta) = {}_2\Sigma[g(\theta)]$. Thus, the model before the RR is applied and the one obtained from it via this rule fulfil the $\Sigma$-condition.
Surjectivity of g
That $g$ is in addition surjective is seen readily by solving its definition Equations 37 for the parameters of $M_{11}$, with the parameters of $M_{22}$ considered fixed quantities. This is easily done for the identically transformed parameters of both models. In addition, the remaining $m + 2$ components of $g$ in Equation 37 represent as many nonredundant equations in the $m + 2$ parameters $b_1, \ldots, b_m$, $B$, and $\sigma_{vv}$ of $M_{11}$, which can be easily solved for the latter, yielding $b = \tilde{b} - (\sigma_{uw}/\sigma_{uu})\,a$, $B = \sigma_{uw}/\sigma_{uu}$, and $\sigma_{vv} = \sigma_{ww} - (\sigma_{uw}/\sigma_{uu})^2 \sigma_{uu}$, where $\tilde{b}$ denotes the vector of coefficients of $Y$ upon $P$ in $M_{22}$. (We note that due to the Cauchy-Schwarz inequality and the assumed lack of perfect correlation of the discrepancy terms associated with $X$ and $Y$ by the very nature of the RR, $\sigma_{vv} > 0$ follows, and hence this solution belongs to the parameter space of model $M_{11}$.) Thus the premises of Proposition 1 are fulfilled, and hence a model fulfilling the assumptions of the replacement rule ($M_{11}$) and that obtained from it by applying the rule ($M_{22}$) are equivalent.
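The focal-block part of the argument and the inversion of $g$ can be traced in the simplest setting. The sketch below assumes scalar predictors ($m = n = 1$), exogenous $P$ and $Q$ with given variances and covariance, and arbitrary illustrative values; the function names and the symbol `b_t` (the coefficient of $Y$ upon $P$ after the replacement) are hypothetical. Exact rational arithmetic avoids rounding in the comparisons.

```python
from fractions import Fraction

def fb_block_m11(a, b, c, B, s_uu, s_vv, var_p, var_q, cov_pq):
    """(Var X, Cov(X,Y), Var Y) implied by M11 with X = a*P + u, Y = b*P + c*Q + B*X + v."""
    k = b + B * a  # total coefficient of P in the reduced form of Y
    var_x = a * a * var_p + s_uu
    cov_xy = a * var_p * k + a * cov_pq * c + B * s_uu
    var_y = k * k * var_p + c * c * var_q + 2 * k * cov_pq * c + B * B * s_uu + s_vv
    return (var_x, cov_xy, var_y)

def fb_block_m22(a, b_t, c, s_uu, s_uw, s_ww, var_p, var_q, cov_pq):
    """Same moments implied by M22 with Y = b_t*P + c*Q + w and Cov(u, w) = s_uw."""
    var_x = a * a * var_p + s_uu
    cov_xy = a * var_p * b_t + a * cov_pq * c + s_uw
    var_y = b_t * b_t * var_p + c * c * var_q + 2 * b_t * cov_pq * c + s_ww
    return (var_x, cov_xy, var_y)

def g(b, B, s_vv, a, s_uu):
    """Non-identity components of Equation 37: returns (b_t, s_uw, s_ww)."""
    return (b + B * a, B * s_uu, B * B * s_uu + s_vv)

def g_inverse(b_t, s_uw, s_ww, a, s_uu):
    """Solution of Equation 37 for (b, B, s_vv), given the parameters of M22."""
    B = s_uw / s_uu
    return (b_t - B * a, B, s_ww - s_uw**2 / s_uu)

# Illustrative parameter point of M11 and moments of the predictors.
a, b, c, B = Fraction(1, 2), Fraction(3), Fraction(2), Fraction(3, 4)
s_uu, s_vv = Fraction(2), Fraction(1)
var_p, var_q, cov_pq = Fraction(1), Fraction(1), Fraction(1, 3)

b_t, s_uw, s_ww = g(b, B, s_vv, a, s_uu)
before = fb_block_m11(a, b, c, B, s_uu, s_vv, var_p, var_q, cov_pq)
after = fb_block_m22(a, b_t, c, s_uu, s_uw, s_ww, var_p, var_q, cov_pq)
recovered = g_inverse(b_t, s_uw, s_ww, a, s_uu)
print(before == after, recovered == (b, B, s_vv))
```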
All Rules are Implied from Proposition 1
Thus, the validity of Proposition 1 implies that of the replacement rule. Since Lee & Hershberger (1990) have shown that the RR subsumes all rules by Stelzl, it follows that all her rules are also implied by Proposition 1 of this article. Furthermore, the reverse indicator rule by Hershberger (1994, pp. 85-92) is also obtained from Proposition 1, since it is a special case of the RR, as shown by Hershberger.

Therefore, the validity of all currently available rules for equivalent model generation is implied by Proposition 1.