bollen (1995 sm)

8/10/2019 Bollen (1995 Sm)


STRUCTURAL EQUATION MODELS THAT ARE NONLINEAR IN LATENT VARIABLES: A LEAST-SQUARES ESTIMATOR

Kenneth A. Bollen*

Busemeyer and Jones (1983) and Kenny and Judd (1984) proposed methods to include interactions of latent variables in structural equation models (SEMs). Despite the value of these works, their methods are limited by the required distributional assumptions, by their complexity in implementation, and by the unknown distributions of the estimators. This paper provides a framework for analyzing SEMs ("LISREL" models) that include nonlinear functions of latent or a mix of latent and observed variables in their equations. It permits such nonlinear functions in equations that are part of latent variable models or measurement models. I estimate the coefficient parameters with a two-stage least squares estimator that is consistent and asymptotically normal with a known asymptotic covariance matrix. The observed random variables can come from nonnormal distributions. Several hypothetical cases and an empirical example illustrate the method.

My thanks to Scott Long, the referees, and Peter Marsden for their comments on this paper and to Laura Stoker and John Zaller for their helpful discussions on the empirical example. I gratefully acknowledge the support from the Center for Advanced Study in the Behavioral Sciences and the Sociology Program of the National Science Foundation (SES-9121564).

*University of North Carolina at Chapel Hill


1. INTRODUCTION

Structural equation models (SEMs), sometimes called LISREL models, are widely used in the social sciences. These general models include multiple regression, confirmatory factor analysis, classical simultaneous equation models, and a variety of other common analysis techniques as special cases (Jöreskog and Sörbom 1993). Though it is straightforward to include nonlinear functions of exogenous or predetermined observed variables into these models (Bollen 1989, pp. 128-29) or to incorporate cross-product terms of "block" variables (Marsden 1983), the treatment of models with equations that are nonlinear in latent or unobserved variables is not fully developed. Typical examples are equations that include the product of two latent variables or the square of a latent variable as explanatory variables.

Researchers using SEMs have proposed two major solutions to this problem. One is based on the work of Busemeyer and Jones (1983), Bohrnstedt and Marwell (1978), Feucht (1989), and Heise (1986). The other derives from the work of Kenny and Judd (1984). These papers take important steps toward allowing product interactions and squared terms of latent variables into SEMs, but they have several limitations. This paper provides a more general framework for analyzing SEMs that include nonlinear functions of latent or a mix of latent and observed variables. In addition, I propose a limited information estimator for such models that is based on a two-stage least squares (2SLS) procedure described in Bollen (forthcoming). Unlike the other methods, this estimator is simple, easy to implement, and has known asymptotic properties that do not depend on the normality of the observed random variables.

The next section reviews the literature on product interactions and squares of latent variables in SEMs and instrumental variable/2SLS methods. Section 3 presents the notation, model assumptions, and the estimator, and Section 4 discusses the selection of instrumental variables (IVs) that are needed to implement the procedure. Section 5 includes three hypothetical examples and one empirical example to illustrate the methodology. The results are summarized in Section 6 in the conclusion.


2. LITERATURE REVIEW

2.1. Literature on Products of Latent Variables

An early study in the SEMs literature on incorporating products of latent variables in models was by Busemeyer and Jones (1983). Busemeyer and Jones focus on a single equation:

$$y_1 = \beta_{11} L_1 + \beta_{12} L_2 + \beta_{13} L_1 L_2 + \zeta_1, \tag{1}$$

where $y_1$ is an observed random variable, $L_1$ and $L_2$ are latent random variables, and $\zeta_1$ is a random disturbance term with a mean of 0. The latent variables $L_1$ and $L_2$ are each measured with a single indicator such that

$$y_2 = L_1 + \varepsilon_2 \tag{2}$$

$$y_3 = L_2 + \varepsilon_3, \tag{3}$$

where $E(\varepsilon_i)$ is zero, and $\varepsilon_2$, $\varepsilon_3$, and $\zeta_1$ are distributed independently of $L_1$ and $L_2$ and of each other. The terms $L_1$, $L_2$, $\varepsilon_2$, $\varepsilon_3$, and $\zeta_1$ are random variables from normal distributions; $\varepsilon_2$, $\varepsilon_3$, and $\zeta_1$ are each homoscedastic and nonautocorrelated; and $y_1$, $L_1$, $L_2$, $y_2$, $L_1 L_2$, and $y_3$ are deviated from their means.

Busemeyer and Jones (1983) show that knowledge of the error variances (or reliability) of $y_2$ and $y_3$, together with the results from Bohrnstedt and Marwell (1978) on estimating the reliability of the product of two normally distributed variables, allows one to consistently estimate the covariance matrix of $y_1$, $L_1$, $L_2$, and $L_1 L_2$. This in turn yields a consistent estimator of the parameters $\beta_{11}$, $\beta_{12}$, and $\beta_{13}$ in equation (1).

The major limitations of this method are: it allows only a single indicator per latent variable; the error variances of the nonproduct observed variables must be known; tests of statistical significance of parameter estimates are not provided; it offers no methods for estimating equation intercepts; and the robustness of the estimates to violations of the normality and independence assumptions for the nonproduct latent variables and nonproduct disturbances is not given (Bollen 1989, pp. 407-8).

Feucht (1989) draws on Fuller's (1980) work and suggests modifications that overcome some of these limitations. The Feucht-Fuller method ensures that the moment matrix that is corrected for measurement error is positive-definite, allows for nonnormally distributed explanatory variables, and provides estimates of the standard errors of the resulting coefficient estimates. Single indicators and known error variances (and error covariances if present) are still required, however. Heise's (1986) and Feucht's (1989) Monte Carlo simulation results provide mixed evidence on the value of these single indicator approaches to including interactions of latent variables.

Kenny and Judd (1984) give an alternative method of incorporating squares of or product interactions of latent variables into SEMs (see also Wong and Long 1987; Hayduk 1987; Bollen 1989). Their method allows multiple measures of each latent variable. Products of these indicators are incorporated into the model as indicators of the products of the latent variables.

To illustrate the Kenny-Judd method, consider the example including an interaction of latent variables in equation (1) and the indicators of $y_2$ for $L_1$ and $y_3$ for $L_2$ in equations (2) and (3). Since Kenny and Judd (1984) treat multiple indicators, add one more indicator each for $L_1$ and $L_2$, as in equations (4) and (5):

$$y_4 = \lambda_{41} L_1 + \varepsilon_4 \tag{4}$$

$$y_5 = \lambda_{52} L_2 + \varepsilon_5 \tag{5}$$

In addition to the assumptions already made for equations (1) to (3), the assumptions are that $\varepsilon_4$ and $\varepsilon_5$ have means of zero, come from normal distributions, are each homoscedastic and nonautocorrelated, and are independent of $L_1$, $L_2$, $\varepsilon_2$, $\varepsilon_3$, $\zeta_1$, and of each other. All $y$ variables are deviated from their means.

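Kenny and Judd (1984) suggest that analysts form indicators of the interaction term, $L_1 L_2$, by taking two-way products of the indicators of $L_1$ with the indicators of $L_2$. This results in four new measurement equations for the indicators of $L_1 L_2$:

$$y_2 y_3 = L_1 L_2 + L_1 \varepsilon_3 + L_2 \varepsilon_2 + \varepsilon_2 \varepsilon_3 \tag{6}$$

$$y_2 y_5 = \lambda_{52} L_1 L_2 + L_1 \varepsilon_5 + \lambda_{52} L_2 \varepsilon_2 + \varepsilon_2 \varepsilon_5 \tag{7}$$

$$y_4 y_3 = \lambda_{41} L_1 L_2 + L_2 \varepsilon_4 + \lambda_{41} L_1 \varepsilon_3 + \varepsilon_3 \varepsilon_4 \tag{8}$$

$$y_4 y_5 = \lambda_{41} \lambda_{52} L_1 L_2 + \lambda_{41} L_1 \varepsilon_5 + \lambda_{52} L_2 \varepsilon_4 + \varepsilon_4 \varepsilon_5 \tag{9}$$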
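The four product equations follow mechanically from multiplying the measurement equations (2) to (5). As a minimal numerical sketch (not part of the original paper), the snippet below simulates the indicators and checks each identity; the loading values, error scales, and sample size are hypothetical choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
lam41, lam52 = 0.8, 1.2            # hypothetical loadings for equations (4) and (5)

L1, L2 = rng.normal(size=(2, n))   # latent variables
e2, e3, e4, e5 = rng.normal(scale=0.5, size=(4, n))  # measurement errors

# Measurement equations (2) to (5)
y2 = L1 + e2
y3 = L2 + e3
y4 = lam41 * L1 + e4
y5 = lam52 * L2 + e5

# Right-hand sides of the product-indicator equations (6) to (9)
rhs6 = L1 * L2 + L1 * e3 + L2 * e2 + e2 * e3
rhs7 = lam52 * L1 * L2 + L1 * e5 + lam52 * L2 * e2 + e2 * e5
rhs8 = lam41 * L1 * L2 + L2 * e4 + lam41 * L1 * e3 + e3 * e4
rhs9 = lam41 * lam52 * L1 * L2 + lam41 * L1 * e5 + lam52 * L2 * e4 + e4 * e5

checks = {
    "(6)": np.allclose(y2 * y3, rhs6),
    "(7)": np.allclose(y2 * y5, rhs7),
    "(8)": np.allclose(y4 * y3, rhs8),
    "(9)": np.allclose(y4 * y5, rhs9),
}
print(checks)
```

Each product indicator brings three new composite error terms with it, which is the source of the proliferation of terms in this approach.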


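Equations (1) to (9) give the full model to estimate under the Kenny-Judd approach. This involves the introduction of a number of latent variables and combinations of latent and error variables. The list of such variables is $L_1$, $L_2$, $\varepsilon_2$ to $\varepsilon_5$, $\zeta_1$, $L_1 L_2$, $L_1 \varepsilon_3$, $L_1 \varepsilon_5$, $L_2 \varepsilon_2$, $L_2 \varepsilon_4$, $\varepsilon_2 \varepsilon_3$, $\varepsilon_2 \varepsilon_5$, $\varepsilon_3 \varepsilon_4$, and $\varepsilon_4 \varepsilon_5$.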

Estimating the measurement equations (6) to (9) involves linear and nonlinear constraints on the parameters. For instance, in equation (7) the factor loadings for $L_1 L_2$ and for $L_2 \varepsilon_2$ are both equal to $\lambda_{52}$ in equation (5). Equation (9) for $y_4 y_5$ has a nonlinear constraint on the coefficient for the $L_1 L_2$ variable. Additional restrictions occur for the variances of the product latent variables in equations (6) to (9). Under the assumption that $L_1$ and $L_2$ come from normal distributions, the variance of $L_1 L_2$ must be kept equal to $\mathrm{VAR}(L_1)\mathrm{VAR}(L_2) + [\mathrm{COV}(L_1, L_2)]^2$. Other examples of the restrictions are in Kenny and Judd (1984).
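The variance restriction on $L_1 L_2$ can be checked by simulation. The sketch below is an illustration rather than anything from the paper: it draws mean-zero bivariate normal latent variables with hypothetical variances and covariance and compares the sample variance of the product with the value implied by the restriction.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical variances and covariance for the two latent variables
var1, var2, cov12 = 1.5, 0.8, 0.4
cov = np.array([[var1, cov12], [cov12, var2]])

# Mean-zero bivariate normal draws, matching the deviation-score assumption
L = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=400_000)
product = L[:, 0] * L[:, 1]

implied = var1 * var2 + cov12 ** 2   # VAR(L1)VAR(L2) + [COV(L1, L2)]^2
simulated = product.var()            # sample variance of L1*L2

print(f"implied   VAR(L1*L2) = {implied:.3f}")
print(f"simulated VAR(L1*L2) = {simulated:.3f}")
```

The equality relies on normality and on zero means, so with nonnormal latent variables the restriction need not hold.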

The introduction of the nonlinear constraints implied by the model and assumptions allows consistent estimation of the coefficients of the terms that are nonlinear in the latent variables. Kenny and Judd use a GLS fitting function (Browne 1984) to estimate their model. See Higgins and Judd (1990) for another empirical application.

The Kenny-Judd method represents an advance in the ability to handle interactions and squares of latent variables, but it still has limitations. One is the lack of knowledge about the robustness of the method to the failure of the normality and independence assumptions. Another is the proliferation of product latent variables, disturbances, and observed variables that occurs with this method. Even a relatively simple model requires many terms when multiple indicators are available for each latent variable involved in the product interaction. Each of the new terms and the accompanying nonlinear constraints must be entered explicitly into the model. Also, the properties of the model with raw rather than deviation scores are not known.

2.2. Literature on Instrumental Variables and 2SLS

Other literature has been less concerned with nonlinear functions of latent variables but is relevant to this paper. This is the econometric literature on instrumental variables (IV) and two-stage least squares (2SLS) estimation. Most econometric texts (e.g., Johnston 1984; Judge et al. 1985) provide overviews of these methods.


IV and 2SLS techniques are helpful when an explanatory variable in a regression equation is correlated with the disturbance term of the equation. An IV is a variable that is correlated with an "endogenous" explanatory variable, but it is uncorrelated with the disturbance term. In 2SLS the predicted value of the endogenous explanatory variable, from a "first-stage" ordinary least squares (OLS) regression of the explanatory variable on the IV, replaces the explanatory variable in the original equation. The "second stage" of 2SLS is the OLS regression of the original dependent variable on this predicted endogenous explanatory variable and the other explanatory variables. It provides a consistent estimator of the coefficient in the original equation. When more than one IV is available, the 2SLS estimator is an IV estimator that uses an optimal combination of instruments.
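A small simulation can make the two stages concrete. The sketch below is illustrative only: it assumes a single explanatory variable observed with random measurement error and uses a second, independently mismeasured copy of it as the instrument. The slope, error scales, and sample size are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
beta = 2.0                                       # hypothetical true slope

x_true = rng.normal(size=n)                      # error-free explanatory variable
x_obs = x_true + rng.normal(scale=0.7, size=n)   # observed with measurement error
z = x_true + rng.normal(scale=0.7, size=n)       # instrument: a second measure
y = beta * x_true + rng.normal(size=n)


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    xc, yc = x - x.mean(), y - y.mean()
    return (xc @ yc) / (xc @ xc)


# First stage: regress the mismeasured regressor on the instrument
gamma = ols_slope(z, x_obs)
x_hat = x_obs.mean() + gamma * (z - z.mean())

# Second stage: regress y on the first-stage prediction
b_2sls = ols_slope(x_hat, y)
b_ols = ols_slope(x_obs, y)                      # attenuated by measurement error

print(f"OLS slope : {b_ols:.3f}")
print(f"2SLS slope: {b_2sls:.3f}")
```

Under these assumptions the OLS slope is pulled toward zero, while the two-stage estimate stays close to the true value because the instrument's error is independent of the error in `x_obs`.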

Random measurement error in an explanatory variable creates a correlation between it and the disturbance. The bulk of econometric research on IV and measurement error is restricted to bivariate or multiple regression models with a single explanatory variable measured with error. Reiersøl (1941) was one of the first to suggest the use of IV methods as a correction for an explanatory variable measured with error. Extensions of these methods allow an explanatory variable to have more than one measure or expand to a two- to three-equation model (e.g., Bowden and Turkington 1984, pp. 3-7, 58-62; Aigner et al. 1984). IV methods for models that have nonlinear functions of observed variables also are available (see Bowden and Turkington 1984).

Madansky (1964), Hägglund (1982), and Jöreskog (1983) proposed IV/2SLS methods to estimate factor analysis models. Bollen (forthcoming) developed a 2SLS estimator for the latent variable models as well. But none of these authors dealt with nonlinear functions of latent variables. The next section develops a general model and method that makes use of the 2SLS estimator for such models.

3. MODEL AND ESTIMATOR

Busemeyer and Jones (1983), Kenny and Judd (1984), and Feucht (1989) concentrated either on the product of two latent variables or the square of a latent variable in a single equation latent variable model. A more general approach permits any number of equations, allows other nonlinear functions of the latent or observed variables, and applies to the measurement model as well as to the latent variable model.

Suppose that the model for the latent variables is

$$\mathbf{L} = \boldsymbol{\alpha}_L + \mathbf{B}_1 \mathbf{L} + \mathbf{B}_2 \mathbf{f}(\mathbf{L}) + \boldsymbol{\zeta}, \tag{10}$$

where $\mathbf{L}$ is an $m \times 1$ vector of latent variables, $\boldsymbol{\alpha}_L$ is an $m \times 1$ vector of intercept terms, $\mathbf{B}_1$ is an $m \times m$ matrix containing constant coefficients for the effects of $\mathbf{L}$ on other $L$'s, $\mathbf{f}(\mathbf{L})$ is an $n \times 1$ vector of functions that are nonlinear in $\mathbf{L}$, $\mathbf{B}_2$ is an $m \times n$ matrix containing constant coefficients for the effects of $\mathbf{f}(\mathbf{L})$ on $\mathbf{L}$, and $\boldsymbol{\zeta}$ is an $m \times 1$ vector of disturbances with $E(\boldsymbol{\zeta})$ equal to zero and each $\zeta_i$ is i.i.d. That is, the disturbance for each equation is homoscedastic and nonautocorrelated across observations, though the variance and other distributional traits of $\zeta_i$ can differ from $\zeta_j$ for $i \neq j$. Typically some elements of $\mathbf{L}$ or $\mathbf{f}(\mathbf{L})$ are "predetermined" or exogenous in the sense that they are uncorrelated with, or even independently distributed of, $\boldsymbol{\zeta}$.

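The latent variables in $\mathbf{L}$ are observable through their indicators. A second equation provides the measurement model linking the latent to the observed variables

$$\mathbf{y} = \boldsymbol{\alpha}_y + \boldsymbol{\Lambda}_1 \mathbf{L} + \boldsymbol{\Lambda}_2 \mathbf{f}(\mathbf{L}) + \boldsymbol{\varepsilon}, \tag{11}$$

where $\mathbf{y}$ is a $p \times 1$ vector of random variables that are observed, $\boldsymbol{\alpha}_y$ is a $p \times 1$ vector of intercept constants for the measurement equations, $\boldsymbol{\Lambda}_1$ and $\boldsymbol{\Lambda}_2$ are $p \times m$ and $p \times n$ constant coefficient matrices for $\mathbf{L}$ and $\mathbf{f}(\mathbf{L})$, and $\boldsymbol{\varepsilon}$ is a $p \times 1$ vector, where each $\varepsilon_i$ is an i.i.d. random error of measurement that has a mean of zero and that is independent of $\mathbf{L}$ and $\mathbf{f}(\mathbf{L})$. If a "latent variable" is perfectly measured, then the corresponding element of $\boldsymbol{\alpha}_y$ is zero, the corresponding row of $\boldsymbol{\Lambda}_1$ has a 1 in the column that matches the latent variable and zeros in the rest of the row, and the corresponding row of $\boldsymbol{\Lambda}_2$ is zero, as is the matching element in $\boldsymbol{\varepsilon}$.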

If $\mathbf{B}_2$ and $\boldsymbol{\Lambda}_2$ are zero, then the model in equations (10) and (11) matches general SEMs with intercept terms such as Jöreskog and Sörbom's (1993) LISREL model. In the case of the LISREL model, equation (10) corresponds to the latent variable model, and equation (11) to the measurement (or confirmatory factor analysis) model. What is distinctive about the model of equations (10) and (11) is its inclusion of $\mathbf{f}(\mathbf{L})$. This permits effects that are nonlinear in the latent variables. The nonlinear terms can enter the latent variable or the measurement model. Thus the model is a generalization of the usual SEM.1

To help identify the model, assume that each latent variable has an indicator that "scales" the latent variable such that

$$y_i = L_i + \varepsilon_i. \tag{12}$$

This assumption does not rule out multiple indicators for a latent variable, nor does it require that indicators be influenced by no more than one latent variable. It requires only that there be at least one indicator per latent variable that "loads" exclusively on that latent variable and that scales it by virtue of having a loading of unity. Other scaling choices are possible, but the failure to assign a scale leads to an underidentified model (see, e.g., Bollen 1989, pp. 152-54, 307-9).

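Partition $\mathbf{y}$ such that the $m$ $y$'s that scale the latent variables occur first (as vector $\mathbf{y}_1$) and the other $(p - m)$ $y$'s second (as vector $\mathbf{y}_2$). This leads to

$$\mathbf{y} = \begin{bmatrix} \mathbf{y}_1 \\ \mathbf{y}_2 \end{bmatrix}, \tag{13}$$

where

$$\mathbf{y}_1 = \mathbf{L} + \boldsymbol{\varepsilon}_1 \tag{14}$$

and

$$\mathbf{L} = \mathbf{y}_1 - \boldsymbol{\varepsilon}_1. \tag{15}$$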

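Substituting equation (15) into (10) transforms (10) into an equation for the observed scaling variables rather than one for the latent variables:

$$\mathbf{y}_1 = \boldsymbol{\alpha}_L + \mathbf{B}_1 \mathbf{y}_1 + \mathbf{B}_2 \mathbf{f}(\mathbf{y}_1 - \boldsymbol{\varepsilon}_1) - \mathbf{B}_1 \boldsymbol{\varepsilon}_1 + \boldsymbol{\varepsilon}_1 + \boldsymbol{\zeta}. \tag{16}$$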
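For readers who want the intermediate step, the substitution can be written out term by term. This only restates equations (10) and (15) and introduces no new assumptions:

```latex
\begin{align*}
\mathbf{y}_1 - \boldsymbol{\varepsilon}_1
  &= \boldsymbol{\alpha}_L
     + \mathbf{B}_1(\mathbf{y}_1 - \boldsymbol{\varepsilon}_1)
     + \mathbf{B}_2\,\mathbf{f}(\mathbf{y}_1 - \boldsymbol{\varepsilon}_1)
     + \boldsymbol{\zeta}
  && \text{(set } \mathbf{L} = \mathbf{y}_1 - \boldsymbol{\varepsilon}_1 \text{ in (10))} \\
\mathbf{y}_1
  &= \boldsymbol{\alpha}_L
     + \mathbf{B}_1\mathbf{y}_1
     + \mathbf{B}_2\,\mathbf{f}(\mathbf{y}_1 - \boldsymbol{\varepsilon}_1)
     - \mathbf{B}_1\boldsymbol{\varepsilon}_1
     + \boldsymbol{\varepsilon}_1
     + \boldsymbol{\zeta}
  && \text{(expand and move } \boldsymbol{\varepsilon}_1 \text{ to the right)}
\end{align*}
```

The last three terms contain only errors and disturbances, which is why they end up in the composite disturbance once the nonlinear function is decomposed.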

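Similarly use equation (15) to rewrite the measurement model in equation (11) to exclude $\mathbf{L}$:

$$\mathbf{y} = \boldsymbol{\alpha}_y + \boldsymbol{\Lambda}_1 \mathbf{y}_1 + \boldsymbol{\Lambda}_2 \mathbf{f}(\mathbf{y}_1 - \boldsymbol{\varepsilon}_1) - \boldsymbol{\Lambda}_1 \boldsymbol{\varepsilon}_1 + \boldsymbol{\varepsilon}. \tag{17}$$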

Consider a single equation from the latent variable model in equation (16):

$$y_i = \alpha_{Li} + \mathbf{B}_{1i} \mathbf{y}_1 + \mathbf{B}_{2i} \mathbf{f}(\mathbf{y}_1 - \boldsymbol{\varepsilon}_1) - \mathbf{B}_{1i} \boldsymbol{\varepsilon}_1 + \varepsilon_i + \zeta_i, \tag{18}$$

where $y_i$ is one of the indicators that scales a latent variable. The $i$ subscript signifies the $i$th row of the matrix or vector, so, for instance, $\mathbf{B}_{1i}$ is the $i$th row of $\mathbf{B}_1$ and $\varepsilon_i$ is the $i$th element in the $\boldsymbol{\varepsilon}_1$ vector.

1 One can also view this as the "all Y model" (Bollen 1989, ch. 9) with the addition of nonlinear functions of the latent variables.

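In one broad and useful class of models, the nonlinear function of the latent variables is expressible as

$$\mathbf{f}(\mathbf{y}_1 - \boldsymbol{\varepsilon}_1) = \mathbf{g}_1(\mathbf{y}_1) + \mathbf{g}_2(\mathbf{y}_1, \boldsymbol{\varepsilon}_1), \tag{19}$$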

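where $\mathbf{g}_1(\cdot)$ and $\mathbf{g}_2(\cdot)$ are functions of the respective variables in parentheses. This class of models includes the common cases of product interactions and quadratic terms of latent variables that Busemeyer and Jones (1983) and Kenny and Judd (1984) examined. For instance, suppose $\mathbf{f}(\mathbf{L})$ is a scalar that consists of the product $L_1 L_2$. Then $\mathbf{f}(\mathbf{y}_1 - \boldsymbol{\varepsilon}_1)$ equals the scalar

$$f(\mathbf{y}_1 - \boldsymbol{\varepsilon}_1) = (y_1 - \varepsilon_1)(y_2 - \varepsilon_2) = y_1 y_2 - y_1 \varepsilon_2 - y_2 \varepsilon_1 + \varepsilon_1 \varepsilon_2, \tag{20}$$

where $y_1 y_2$ is $g_1(\mathbf{y}_1)$ and the last three terms are $g_2(\mathbf{y}_1, \boldsymbol{\varepsilon}_1)$. Or if $\mathbf{f}(\mathbf{L})$ equals $L_1^2$, then $\mathbf{f}(\mathbf{y}_1 - \boldsymbol{\varepsilon}_1)$ is the scalar

$$f(\mathbf{y}_1 - \boldsymbol{\varepsilon}_1) = y_1^2 - 2 y_1 \varepsilon_1 + \varepsilon_1^2, \tag{21}$$

where $y_1^2$ is $g_1(\mathbf{y}_1)$ and the remaining terms are $g_2(\mathbf{y}_1, \boldsymbol{\varepsilon}_1)$.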
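The two decompositions can be verified numerically. In the sketch below, which is an illustration and not the author's code, the latent variables and errors are simulated as independent normals with hypothetical scales; the code checks that $g_1 + g_2$ reproduces the latent product and the latent square, and reports the sample mean of each $g_2$.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000

L1, L2 = rng.normal(size=(2, n))             # latent variables
e1, e2 = rng.normal(scale=0.6, size=(2, n))  # errors of the scaling indicators
y1, y2 = L1 + e1, L2 + e2                    # scaling indicators, as in equation (14)

# Product interaction, equation (20)
g1_prod = y1 * y2
g2_prod = -y1 * e2 - y2 * e1 + e1 * e2

# Squared term, equation (21)
g1_sq = y1 ** 2
g2_sq = -2 * y1 * e1 + e1 ** 2

prod_ok = np.allclose(L1 * L2, g1_prod + g2_prod)
sq_ok = np.allclose(L1 ** 2, g1_sq + g2_sq)
print("decompositions hold:", prod_ok, sq_ok)

# g2 is what moves into the composite disturbance, so its mean matters
print(f"mean of g2, product case: {g2_prod.mean():.3f}")
print(f"mean of g2, squared case: {g2_sq.mean():.3f}")
```

With independent, mean-zero errors the product case gives a $g_2$ with mean near zero, whereas in the squared case the mean of $g_2$ is $-\mathrm{VAR}(\varepsilon_1)$ (here $-0.36$), so the mean of the disturbance depends on which nonlinear function is involved.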

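The decomposition in equation (19) is useful because it allows one to place the $\mathbf{g}_2(\mathbf{y}_1, \boldsymbol{\varepsilon}_1)$ component in the residual while keeping $\mathbf{g}_1(\mathbf{y}_1)$ in the main part of the model. For these and other functions that are expressible as in equation (19), I can write equation (18) as

$$y_i = \alpha_{Li} + \mathbf{B}_{1i} \mathbf{y}_1 + \mathbf{B}_{2i} \mathbf{g}_1(\mathbf{y}_1) + u_i, \tag{22}$$

where $u_i$ is the composite disturbance term

$$u_i = \mathbf{B}_{2i} \mathbf{g}_2(\mathbf{y}_1, \boldsymbol{\varepsilon}_1) - \mathbf{B}_{1i} \boldsymbol{\varepsilon}_1 + \varepsilon_i + \zeta_i. \tag{23}$$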

In general $u_i$ will be correlated with the right-hand side variables in equation (22), and that makes ordinary least squares an inconsistent estimator of $\alpha_{Li}$, $\mathbf{B}_{1i}$, and $\mathbf{B}_{2i}$. An exception would occur if all the right-hand side variables in the equation are measured without error and are uncorrelated with the equation disturbance, $\zeta_i$. In the more general case where the disturbance correlates with the right-hand side variables, a two-stage least squares (2SLS) estimator provides a consistent estimator of these parameters. The literature review described special cases where the 2SLS estimator has been successful. Here I develop a 2SLS estimator that applies to general SEMs, including the latent variable and the measurement model. And the 2SLS estimator allows for equations that are nonlinear in the latent or observed variables, requiring only that they be linear in the parameters.

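To develop this procedure, I modify the notation somewhat. Define $N$ to be the number of cases, $\mathbf{Y}_1^{(i)}$ to be the $N$ row matrix of values for the variables in $\mathbf{y}_1$ that have nonzero coefficients in the $y_i$ equation, and $\mathbf{g}_1(\mathbf{y}_1)^{(i)}$ to be the $N$ row matrix of values for the variables in $\mathbf{g}_1(\mathbf{y}_1)$ that have nonzero coefficients in the $y_i$ equation. The $N \times 1$ vector $\mathbf{y}_i$ contains the $N$ values of $y_i$ in the sample, and $\mathbf{u}_i$ is an $N \times 1$ vector of the values of $u_i$. Let $\mathbf{B}_1^{(i)}$ be a column vector of the coefficients that correspond to $\mathbf{Y}_1^{(i)}$ and $\mathbf{B}_2^{(i)}$ be the coefficient column vector for $\mathbf{g}_1(\mathbf{y}_1)^{(i)}$, with all coefficients being identified parameters. Define $\mathbf{Z}_i = [\mathbf{1} : \mathbf{Y}_1^{(i)} : \mathbf{g}_1(\mathbf{y}_1)^{(i)}]$ and $\mathbf{A}_i' = [\alpha_{Li} : \mathbf{B}_1^{(i)\prime} : \mathbf{B}_2^{(i)\prime}]$. Then rewrite equation (22) as

$$\mathbf{y}_i = \mathbf{Z}_i \mathbf{A}_i + \mathbf{u}_i. \tag{24}$$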

The 2SLS estimator requires a matrix of instrumental variables, say $\mathbf{V}_i$, that satisfy the assumptions

$$\operatorname{plim}\left(\frac{1}{N} \mathbf{V}_i' \mathbf{Z}_i\right) = \boldsymbol{\Sigma}_{V_i Z_i} \tag{25}$$

$$\operatorname{plim}\left(\frac{1}{N} \mathbf{V}_i' \mathbf{V}_i\right) = \boldsymbol{\Sigma}_{V_i V_i} \tag{26}$$

$$\operatorname{plim}\left(\frac{1}{N} \mathbf{V}_i' \mathbf{u}_i\right) = \mathbf{0}, \tag{27}$$

where plim stands for the probability limit as $N$ goes to infinity. Other assumptions are that the variables in $\mathbf{Z}_i$ have finite variances and covariances, that the right-hand side matrices of equations (25) to (27) are finite, that $\boldsymbol{\Sigma}_{V_i V_i}$ is nonsingular, and that $\boldsymbol{\Sigma}_{V_i Z_i}$ is nonzero. These assumptions require that the instrumental variables (IVs) correlate with $\mathbf{Z}_i$ and that the IVs not correlate with the composite disturbance $\mathbf{u}_i$. As I explain in the next section, the IVs will be observed variables ($y$'s) that are part of the model or nonlinear functions of such observed variables.

Assume that $E[\mathbf{u}_i \mathbf{u}_i'] = \sigma_{u_i}^2 \mathbf{I}$ so that the composite disturbance is homoscedastic and nonautocorrelated. Whether $E[\mathbf{u}_i] = \mathbf{0}$ will depend on the nonlinear function of the latent variables that occurs in the original model. For now assume that the model is such that the mean of the composite disturbance is zero; later two of the examples will illustrate the consequences that follow when this assumption is false. In general the $\mathbf{u}_i$ of equation (24) will correlate with one or more of the variables in $\mathbf{Z}_i$. This rules out the use of single-stage OLS to estimate $\mathbf{A}_i$.

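The first stage of the 2SLS estimator is to perform an OLS regression of $\mathbf{Z}_i$ on $\mathbf{V}_i$, with coefficients

$$(\mathbf{V}_i' \mathbf{V}_i)^{-1} \mathbf{V}_i' \mathbf{Z}_i. \tag{28}$$

The $\mathbf{V}_i$ matrix is then postmultiplied by this coefficient to form $\hat{\mathbf{Z}}_i$ ($= \mathbf{V}_i (\mathbf{V}_i' \mathbf{V}_i)^{-1} \mathbf{V}_i' \mathbf{Z}_i$), the predicted $\mathbf{Z}_i$ matrix. The second stage in the 2SLS estimation of $\mathbf{A}_i$ is the OLS regression of $\mathbf{y}_i$ on $\hat{\mathbf{Z}}_i$, which gives coefficients

$$\hat{\mathbf{A}}_i = (\hat{\mathbf{Z}}_i' \hat{\mathbf{Z}}_i)^{-1} \hat{\mathbf{Z}}_i' \mathbf{y}_i. \tag{29}$$

As is well known, the 2SLS estimator is a consistent estimator of $\mathbf{A}_i$ (e.g., see Johnston 1984, pp. 478-79).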

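Assume that

$$\frac{1}{\sqrt{N}} \hat{\mathbf{Z}}_i' \mathbf{u}_i \sim AN(\mathbf{0}, \sigma_{u_i}^2 \boldsymbol{\Sigma}_{\hat{Z}_i \hat{Z}_i}), \tag{30}$$

where $AN$ refers to an asymptotically normal distribution. The previous assumptions in equations (25) to (27) imply that

$$\operatorname{plim}\left(\frac{1}{N} \hat{\mathbf{Z}}_i' \hat{\mathbf{Z}}_i\right)^{-1} = \boldsymbol{\Sigma}_{\hat{Z}_i \hat{Z}_i}^{-1}. \tag{31}$$

The asymptotic distribution of $\hat{\mathbf{A}}_i$ is then

$$\sqrt{N}(\hat{\mathbf{A}}_i - \mathbf{A}_i) \xrightarrow{D} N(\mathbf{0}, \sigma_{u_i}^2 \boldsymbol{\Sigma}_{\hat{Z}_i \hat{Z}_i}^{-1}), \tag{32}$$

and an estimate of the asymptotic covariance matrix of $\hat{\mathbf{A}}_i$ is

$$\operatorname{acov}(\hat{\mathbf{A}}_i) = \hat{\sigma}_{u_i}^2 (\hat{\mathbf{Z}}_i' \hat{\mathbf{Z}}_i)^{-1}, \tag{33}$$

where $\hat{\sigma}_{u_i}^2 = (\mathbf{y}_i - \mathbf{Z}_i \hat{\mathbf{A}}_i)'(\mathbf{y}_i - \mathbf{Z}_i \hat{\mathbf{A}}_i)/N$. Thus the preceding procedure provides a consistent estimator of the coefficients for the linear and nonlinear terms in equation (22) as well as a measure of their statistical variability.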
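The two stages and the covariance estimate translate directly into a few lines of linear algebra. The function below is a minimal sketch following equations (28), (29), and (33); the function name, argument names, and the simulated data in the usage part are hypothetical and not from the paper.

```python
import numpy as np


def two_stage_least_squares(y, Z, V):
    """2SLS for y = Z A + u with instrument matrix V.

    y : (N,) dependent variable
    Z : (N, k) right-hand side variables, including a column of ones
    V : (N, q) instruments, including a column of ones, with q >= k
    Returns the coefficient estimates and their estimated asymptotic covariance.
    """
    N = y.shape[0]
    first_stage = np.linalg.solve(V.T @ V, V.T @ Z)   # equation (28)
    Z_hat = V @ first_stage                           # predicted Z
    ZhZh = Z_hat.T @ Z_hat
    A_hat = np.linalg.solve(ZhZh, Z_hat.T @ y)        # equation (29)
    resid = y - Z @ A_hat                             # residuals use Z, not Z_hat
    sigma2_u = resid @ resid / N
    acov = sigma2_u * np.linalg.inv(ZhZh)             # equation (33)
    return A_hat, acov


# Usage on hypothetical data: one regressor measured with error,
# instrumented by a second measure of the same variable.
rng = np.random.default_rng(4)
N = 50_000
x = rng.normal(size=N)
x_obs = x + rng.normal(scale=0.5, size=N)
w = x + rng.normal(scale=0.5, size=N)
y = 1.0 + 0.5 * x + rng.normal(size=N)

ones = np.ones(N)
A_hat, acov = two_stage_least_squares(
    y, np.column_stack([ones, x_obs]), np.column_stack([ones, w])
)
std_err = np.sqrt(np.diag(acov))
print("estimates :", A_hat)
print("std. errs :", std_err)
```

Note that the residual variance is computed from the original $\mathbf{Z}_i$ rather than $\hat{\mathbf{Z}}_i$, as in the definition of $\hat{\sigma}_{u_i}^2$; using the second-stage OLS residuals instead would give the wrong standard errors.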

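I have limited the discussion to the latent variable model in equation (10) that allows effects that are nonlinear in the latent variables for the class of models described in equations (22) and (23). A similar series of steps applies to the measurement model in equation (11). Substituting equation (15), $\mathbf{y}_1 - \boldsymbol{\varepsilon}_1$, for $\mathbf{L}$ in equation (11) leads to equation (17). Analogous to equation (18) from the latent variable model, a single equation for the measurement model is

$$y_i = \alpha_{yi} + \boldsymbol{\Lambda}_{1i} \mathbf{y}_1 + \boldsymbol{\Lambda}_{2i} \mathbf{f}(\mathbf{y}_1 - \boldsymbol{\varepsilon}_1) - \boldsymbol{\Lambda}_{1i} \boldsymbol{\varepsilon}_1 + \varepsilon_i. \tag{34}$$

Considering the $\mathbf{g}_1(\cdot)$ and $\mathbf{g}_2(\cdot)$ functions as before leads to

$$y_i = \alpha_{yi} + \boldsymbol{\Lambda}_{1i} \mathbf{y}_1 + \boldsymbol{\Lambda}_{2i} \mathbf{g}_1(\mathbf{y}_1) + u_i, \tag{35}$$

where

$$u_i = \boldsymbol{\Lambda}_{2i} \mathbf{g}_2(\mathbf{y}_1, \boldsymbol{\varepsilon}_1) - \boldsymbol{\Lambda}_{1i} \boldsymbol{\varepsilon}_1 + \varepsilon_i. \tag{36}$$

An appropriate redefinition of $\mathbf{Z}_i$, $\mathbf{A}_i$, and $\mathbf{u}_i$ leads back to equation (24), $\mathbf{y}_i = \mathbf{Z}_i \mathbf{A}_i + \mathbf{u}_i$. Under the assumptions detailed for the latent variable model, one can obtain a consistent 2SLS estimator of $\mathbf{A}_i$ with a known asymptotic distribution.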

4. INSTRUMENTAL VARIABLE SELECTION

Key to the success of using the procedures developed in the preceding section is finding appropriate instrumental variables (IVs) that satisfy the conditions for IVs and that lead to an identified model. When treating the selection of IVs, many econometric texts do not explain methods for finding the IVs. In contrast, the 2SLS procedure here depends on the model structure for the creation and selection of IVs. Indeed, the structure of the full model is essential in finding IVs, as is the idea that nonlinear functions of some of the observed variables can serve as IVs.

In practice the most challenging task is to find IVs that are uncorrelated with the composite disturbance $u_i$. Equations (25) to (27) along with the pattern of correlations among the errors, disturbances, and latent variables of the model are important aids to selecting appropriate IVs. A general procedure for selecting IVs has several steps. Assume that $v_i$ is a variable that might be a suitable instrumental variable. The following steps help to evaluate its eligibility: (1) Form $\mathrm{COV}(v_i, u_i)$; (2) if $v_i$ is an endogenous variable, substitute its reduced-form equation for it; (3) substitute the right-hand side of equation (23) or (36) for $u_i$; and (4) take the covariance of the resulting terms and see if it is zero. If so, then $v_i$ passes this condition for an IV.
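The four steps are algebraic, but a simulation gives a quick plausibility check. The sketch below is an illustration under stated assumptions, not a substitute for the derivation: it uses the product-interaction model of equations (1) to (5) with hypothetical parameter values, builds the composite disturbance from equation (23) with $g_2$ taken from equation (20), and estimates the covariance of several candidate variables with it.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 300_000
b11, b12, b13 = 0.5, 0.3, 0.4      # hypothetical coefficients of equation (1)
lam41, lam52 = 0.8, 1.2            # hypothetical loadings in equations (4) and (5)

L = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.3], [0.3, 1.0]], size=n)
L1, L2 = L[:, 0], L[:, 1]
e2, e3, e4, e5 = rng.normal(scale=0.5, size=(4, n))
zeta1 = rng.normal(scale=0.7, size=n)

# Steps 2 and 3: write candidates and the disturbance in terms of
# latent variables and errors (their reduced forms)
y2, y3 = L1 + e2, L2 + e3                    # scaling indicators
y4, y5 = lam41 * L1 + e4, lam52 * L2 + e5    # additional indicators
g2 = -y2 * e3 - y3 * e2 + e2 * e3            # g2 for the product L1*L2
u1 = b13 * g2 - b11 * e2 - b12 * e3 + zeta1  # composite disturbance

candidates = {
    "y2": y2, "y3": y3, "y2*y3": y2 * y3,
    "y4": y4, "y5": y5, "y4*y5": y4 * y5,
}

# Steps 1 and 4: form the covariance with u1 and see whether it is near zero
cov_with_u = {name: np.cov(v, u1)[0, 1] for name, v in candidates.items()}
for name, c in cov_with_u.items():
    print(f"COV({name}, u1) = {c: .4f}")
```

Under these assumptions the scaling indicators and their product show clearly nonzero covariances with the disturbance, while the nonscaling indicators and their product do not. A near-zero sample covariance is only suggestive; the algebraic check remains the actual criterion.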

A similar series of steps applies in the search for IVs that are nonlinear functions of the observed variables. For instance, when modeling the product of two latent variables, products of indicators that do not "scale" the respective latent variables are often suitable for use as IVs. Suppose that $y_1$ scales the first latent variable and $y_2$ and $y_3$ are additional measures of the same latent variable. Similarly, suppose that the $y_4$ variable scales the second latent variable and $y_5$ and $y_6$ are two other indicators. Then $y_2 y_5$, $y_2 y_6$, $y_3 y_5$, and $y_3 y_6$ often will qualify as IVs. Determination of their eligibility follows the same steps of writing a reduced-form expression for each variable in the product, obtaining the product of the reduced forms, and calculating its covariance with $u_i$ to see if it is zero. If so, this product of the observed variables can serve as an IV.

Researchers can sometimes form another IV by regressing each observed variable in the product term on all of the individual and product IVs of observed variables and calculating the predicted values from the linear regressions for each component (e.g., $\hat{y}_1$ and $\hat{y}_2$). Then one forms $\hat{y}_1 \hat{y}_2$ as an additional IV for the model. This latter IV follows a suggestion of Bowden and Turkington (1981) about the creation of IV for nonlinear functions of endogenous observed variables in econometric models.

The Kenny and Judd (1984) example discussed above provides an illustration of the selection of IVs for the 2SLS method. Recall that the latent variable equation was

$$y_1 = \beta_{11} L_1 + \beta_{12} L_2 + \beta_{13} L_1 L_2 + \zeta_1, \tag{37}$$

with $y_2$ and $y_3$ the indicators that scale $L_1$ and $L_2$, respectively (see equations [2] and [3]). Substituting $(y_2 - \varepsilon_2)$ for $L_1$ and $(y_3 - \varepsilon_3)$ for $L_2$ leads to

$$y_1 = \beta_{11} y_2 + \beta_{12} y_3 + \beta_{13} y_2 y_3 + u_1, \tag{38}$$

where $u_1 = -\beta_{11} \varepsilon_2 - \beta_{12} \varepsilon_3 - \beta_{13} y_2 \varepsilon_3 - \beta_{13} y_3 \varepsilon_2 + \beta_{13} \varepsilon_2 \varepsilon_3 + \zeta_1$.

All the r.h.s. variables of equation (38) are correlated with the composite disturbance, $u_1$. The $y_4$ and $y_5$ variables are indicators of $L_1$ and $L_2$, respectively (see equations [4] and [5]). The $y_4$, $y_5$, and $y_4 y_5$ variables satisfy the conditions for IVs, as the reader can confirm. Regressing $y_1$ and $y_2$ on these IVs and forming $\hat{y}_1$ and $\hat{y}_2$ and then calculating $\hat{y}_1 \hat{y}_2$ leads to another IV. The 2SLS estimator using all four IVs ($y_4$, $y_5$, $y_4 y_5$, and $\hat{y}_1 \hat{y}_2$) is a consistent estimator of the coefficients in equation (37).
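To see the estimator at work on this example, the sketch below simulates the model of equations (1) to (5) and compares OLS on equation (38) with 2SLS. It is a simplified illustration: all parameter values are hypothetical, and only the three IVs $y_4$, $y_5$, and $y_4 y_5$ are used (the additional IV built from predicted values is left out to keep the example short).

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200_000
b11, b12, b13 = 0.5, 0.3, 0.4      # hypothetical coefficients of equation (37)
lam41, lam52 = 0.8, 1.2            # hypothetical loadings in equations (4) and (5)

L = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.3], [0.3, 1.0]], size=n)
L1, L2 = L[:, 0], L[:, 1]
e2, e3, e4, e5 = rng.normal(scale=0.5, size=(4, n))
zeta1 = rng.normal(scale=0.7, size=n)

y1 = b11 * L1 + b12 * L2 + b13 * L1 * L2 + zeta1   # equation (37)
y2, y3 = L1 + e2, L2 + e3                          # scaling indicators
y4, y5 = lam41 * L1 + e4, lam52 * L2 + e5          # source of the IVs

ones = np.ones(n)
Z = np.column_stack([ones, y2, y3, y2 * y3])       # r.h.s. of equation (38)
V = np.column_stack([ones, y4, y5, y4 * y5])       # instruments

# First stage: predicted Z from the instruments; second stage: OLS on Z_hat
Z_hat = V @ np.linalg.solve(V.T @ V, V.T @ Z)
a_2sls = np.linalg.solve(Z_hat.T @ Z_hat, Z_hat.T @ y1)

# Single-stage OLS for comparison
a_ols = np.linalg.lstsq(Z, y1, rcond=None)[0]

print("true  :", [b11, b12, b13])
print("OLS   :", np.round(a_ols[1:], 3))
print("2SLS  :", np.round(a_2sls[1:], 3))
```

In this setup the OLS coefficients are biased toward zero by the measurement error in $y_2$ and $y_3$, while the 2SLS estimates land near the values used to generate the data.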

Though the specific steps outlined above apply to any model, some general guidelines for ruling out IVs emerge from closer examination of the composite disturbance $u_i$. For the latent variable model in equation (22), equation (23) defines $u_i$; it is repeated here for easy reference:

$$u_i = \mathbf{B}_{2i} \mathbf{g}_2(\mathbf{y}_1, \boldsymbol{\varepsilon}_1) - \mathbf{B}_{1i} \boldsymbol{\varepsilon}_1 + \varepsilon_i + \zeta_i. \tag{39}$$

Note that the latent variable model only has equations for the latent endogenous variables, so we do not have any equations to estimate for the latent exogenous variables in the latent variable model. Any variables correlated with $\zeta_i$ are ineligible as IVs (except in the improbable situation in which a variable has an exactly equal but opposite in sign covariance with the remaining components of $u_i$). In the typical situation, this means that other $y$'s that are indicators of an endogenous $L_i$ are ineligible as IVs in the latent variable model since these other indicators correlate with $\zeta_i$.2 Less obvious is that indicators of latent variables that are influenced by $L_i$ are unacceptable since they too will correlate with $\zeta_i$. Also, if $\zeta_i$ correlates with $\zeta_j$, then the indicators of $L_j$ are not suitable as IVs in the latent variable model.

The $\mathbf{B}_{1i} \boldsymbol{\varepsilon}_1$ term means that any of the scaling indicators for the latent variables that appear on the right-hand side of the $y_i$ equation cannot be IVs. Nor can $y$'s whose errors of measurement correlate with the errors of such scaling indicators serve as IVs. Furthermore, any $y$'s that have errors that correlate with $\varepsilon_i$ are ruled out as IVs. Finally, IVs must be uncorrelated with $\mathbf{B}_{2i} \mathbf{g}_2(\mathbf{y}_1, \boldsymbol{\varepsilon}_1)$. In many cases variables that do not correlate with the other terms in $u_i$ will not correlate with this one, but there are exceptions.

In the measurement model the composite disturbance $u_i$ equals $\boldsymbol{\Lambda}_{2i} \mathbf{g}_2(\mathbf{y}_1, \boldsymbol{\varepsilon}_1) - \boldsymbol{\Lambda}_{1i} \boldsymbol{\varepsilon}_1 + \varepsilon_i$. The IVs must be uncorrelated with these components of $u_i$. An indicator whose error correlates with $\varepsilon_i$ is ineligible. Scaling indicators for latent variables that affect $y_i$ or indicators whose errors of measurement correlate with the errors of such scaling indicators cannot qualify as IVs either. Last, the IVs must be uncorrelated with the nonlinear term, $\boldsymbol{\Lambda}_{2i} \mathbf{g}_2(\mathbf{y}_1, \boldsymbol{\varepsilon}_1)$. Note that unlike the composite disturbance in the latent variable model (see equation [23]), $\zeta_i$ does not appear in the composite disturbance for the measurement equation. This means that some of the observed variables that correlate with $\zeta_i$ and are hence ineligible as IVs for the latent variable equation might still be suitable IVs for equations from the measurement model.

2 Remember that I am referring to the latent variable model here. In measurement models some of the other indicators of the same latent variable can serve as IVs.

Another consideration in selecting IVs is that some variables might technically meet the conditions to be an IV, but they may not work well in practice. For instance, if the IVs collectively are poorly correlated with the variables that they are to replace, the resulting 2SLS estimates may be unstable and far from the true parameters. Analysts can check this by examining the $R^2$'s from the first stage of the 2SLS procedure. Low values (e.g.,
