
Equivalent Linear Logistic Test Models

Timo M. Bechger, Huub H. F. M. Verstralen & Norman D. Verhelst

Cito, PO Box 1034, 6801 MG Arnhem, The Netherlands

August 23, 2000

Abstract

This paper is about the Linear Logistic Test Model (LLTM). We demonstrate that there are infinitely many equivalent ways to specify a model. An implication is that there may well be many ways to change the specification of a given LLTM and achieve the same improvement in model fit. To illustrate this phenomenon we analyze a real data set using a Lagrange multiplier test for the specification of the model.

1 Introduction

Let $X_{vi}$ denote a dichotomous random variable with realization $x_{vi}$ that indicates that subject $v$ chooses a particular response to item $i$. For ease of presentation we will assume that $X_{vi}$ is coded as follows:

$$x_{vi} = \begin{cases} 1 & \text{if the response is correct} \\ 0 & \text{if the response is wrong} \end{cases} \qquad (1)$$

In the Rasch Model (RM), the probability of a correct response is defined as

$$\Pr(X_{vi} = x_{vi} \mid \xi_v, \beta_i) = \frac{\exp[x_{vi}(\xi_v - \beta_i)]}{1 + \exp(\xi_v - \beta_i)}, \qquad (2)$$

where $-\infty < \xi_v, \beta_i < \infty$. The probability of a correct answer is decreasing in $\beta_i$, which is recognized to be an item difficulty parameter. Similarly, $\xi_v$ is a person parameter.
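The response probability in (2) is straightforward to compute; the following is a minimal sketch in Python (function and variable names are ours, not the paper's):

```python
import math

def rasch_prob(xi, beta, x=1):
    """Probability of response x (1 = correct, 0 = wrong) under the Rasch model, Eq. (2)."""
    p_correct = math.exp(xi - beta) / (1.0 + math.exp(xi - beta))
    return p_correct if x == 1 else 1.0 - p_correct

# The probability of a correct answer decreases in the item difficulty beta:
p_easy = rasch_prob(xi=0.0, beta=-1.0)
p_hard = rasch_prob(xi=0.0, beta=1.0)
```

When $\xi_v = \beta_i$ the probability is exactly one half, and adding the same constant $c$ to both parameters leaves the probability unchanged, which is precisely the identification problem discussed next.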

The parameters of the RM are not identified because $\Pr(X_{vi} = x_{vi} \mid \xi_v + c, \beta_i + c) = \Pr(X_{vi} = x_{vi} \mid \xi_v, \beta_i)$ for any constant $c$, and for all $i \in \{1, \dots, k\}$. It is customary to remove this arbitrariness in the parameterization of the model by imposing a linear restriction on the item parameters. In Item Response Theory (IRT), this restriction is called a normalization.

Definition 1. A normalization is a linear restriction on the item parameters, i.e.,

$$\sum_{i=1}^{k} a_i\beta_i = d, \qquad (3)$$

where $\sum_{i=1}^{k} a_i = 1$, and $d$ is an arbitrary constant that is given the value zero.

A normalization implies that there are only $k - 1$ independent item parameters, and the item parameters must be interpreted relative to $\sum_{i=1}^{k} a_i\beta_i$. We may, for example, consider the value of one of the item parameters as a reference. This amounts to setting $a_g = 1$ (for some $g \in \{1, \dots, k\}$) and $a_i = 0$ ($i \neq g$).

Let $\beta = (\beta_1, \dots, \beta_k)^t$, where the superscript $t$ denotes transposition. The Linear Logistic Test Model (LLTM) is an RM with linear restrictions on the "normalized" item parameters, that is,

$$\beta = Q\eta = \Big(\sum_{l=1}^{p} q_{il}\eta_l\Big), \qquad (4)$$


where $Q = (q_{is} : i \in \{1, \dots, k\},\ s \in \{1, \dots, p\})$ is a $k \times p$ matrix of constant weights, and $\eta = (\eta_s : s \in \{1, \dots, p\})$ is a $p$-dimensional vector of so-called basic parameters (see Fischer, 1995a, and references therein). We assume that $p < k - 1$ so that the model is more parsimonious than the RM. We also assume that there are no restrictions on $\eta$.

In many applications, the LLTM is specified according to a formalized theory about the cognitive operations required to solve the items, and the basic parameters represent the difficulty of particular cognitive operations. Consider, as an illustration, a set of paragraph comprehension items, which are classified according to the "transformations" the students use in responding. For example, with some questions the answer resides in the text, in much the same wording as in the question, while with others, subjects would have to go beyond the information given and make inferences to arrive at the correct answer. Under the assumptions implied by the RM, the difficulty of each transformation can be represented by a basic parameter, and the $(g, l)$ entry in the $Q$-matrix represents the number of times the $l$th transformation is required to answer the $g$th item. Provided the theory is appropriate, and the $Q$-matrix is specified correctly, the LLTM furnishes a useful extension of the RM, especially because the model enables one to predict the difficulty of new items, which may facilitate the construction of new tests for different groups of students. When the $Q$-matrix is misspecified, however, the predictions based on the model may be erroneous. Note that there are many more applications of the LLTM, and the model has been extended in various ways (e.g., Fischer, 1995b; Glas & Verhelst, 1989; Linacre, 1994; Wilson, 1989).

The particular application discussed briefly in the previous paragraph illustrates the common practice of specifying the $Q$-matrix for $k$ item parameters while the need to impose a normalization is ignored. Fischer (e.g., 1995a) suggests that we examine the $Q$-matrix to assure ourselves that a normalization is imposed. To be more precise, he argues that the LLTM should be formulated as

$$\beta = Q\eta + c\mathbf{1}_k = \begin{pmatrix} Q & \mathbf{1}_k \end{pmatrix} \begin{pmatrix} \eta \\ c \end{pmatrix}, \qquad (5)$$

where $\mathbf{1}_k$ represents a $k$-dimensional unit vector, and the constant $c$ is added to indicate that the item parameters are not unique. A normalization implies that $c$ is uniquely determined, which is true if and only if the augmented matrix $\begin{pmatrix} Q & \mathbf{1}_k \end{pmatrix}$ has full column rank (e.g., Lang, 1987, pp. 59-60). Here, we propose a different approach which reveals the effect of a normalization more clearly and allows the researcher more freedom to choose a convenient normalization.

In Section 2, it is explained how the $Q$-matrix is adapted to impose a convenient normalization. Neither the normalization nor the particular parameterization of the basic parameters has an effect on the differences between the item parameters or on the fit of the model, so that there are infinitely many ways to specify the same LLTM. In Section 3 we discuss such equivalent LLTMs. An interesting implication is that researchers may think differently about a given set of items and nevertheless specify statistically equivalent models. Another implication is that there may well be many ways to change the specification of a given LLTM and achieve the same improvement in model fit. The latter implication is illustrated in Section 4, where we discuss a test for the hypothesis that a single entry in the $Q$-matrix is specified correctly. In line with Section 4, Section 5 is about the estimation of this entry when it is specified as a parameter. In Section 6, we analyze a real data set. Finally, we summarize and discuss our findings in Section 7.

2 Normalizing the LLTM

As noted in the Introduction, researchers usually specify the $Q$-matrix for $k$ item parameters and ignore the need for a normalization. As a first step in an analysis with the LLTM, one should therefore make sure that a normalization is imposed. Let $I_k$ denote a $k$-dimensional unit matrix. Consider an arbitrary normalization indexed by $a = (a_1, \dots, a_k)^t$. Given a vector of item parameters, the $a$-normalized item parameters are

$$\beta_a = L_a\beta, \qquad (6)$$

where $L_a = I_k - \mathbf{1}_k a^t$ and $a^t\mathbf{1}_k = 1$. Substituting the postulated LLTM for the item parameters gives

$$\beta_a = L_aQ\eta = Q_a\eta. \qquad (7)$$

Suppose, for example, that an LLTM is formulated as

$$\begin{pmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \end{pmatrix} = \begin{pmatrix} 1 & 3 \\ 2 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} \eta_1 \\ \eta_2 \end{pmatrix}. \qquad (8)$$


The differences between any two item difficulty parameters can be arranged in the following matrix:

$$\begin{pmatrix} 0 & -\eta_1 + 2\eta_2 & \eta_2 \\ \eta_1 - 2\eta_2 & 0 & \eta_1 - \eta_2 \\ -\eta_2 & -\eta_1 + \eta_2 & 0 \end{pmatrix}, \qquad (9)$$

where the $(i, j)$ entry is the difference between $\beta_i$ and $\beta_j$. This matrix may be used to specify the LLTM for differences between the item parameters. Its first column, for instance, provides the model for differences with the first item. This is an LLTM with the following "normalized" $Q$-matrix:

$$L_{(1,0,0)^t}Q = Q_{(1,0,0)^t} = \begin{pmatrix} 0 & 0 \\ 1 & -2 \\ 0 & -1 \end{pmatrix}. \qquad (10)$$

By adapting the $Q$-matrix we have obtained the model for the difficulties of the items relative to the first item.
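The adaptation above is a mechanical matrix computation; here is a sketch in Python with NumPy, using the $Q$-matrix of (8) and the normalization $a = (1, 0, 0)^t$ (all names are ours):

```python
import numpy as np

Q = np.array([[1.0, 3.0],
              [2.0, 1.0],
              [1.0, 2.0]])
a = np.array([1.0, 0.0, 0.0])   # reference-item normalization: a_1 = 1
k = len(a)

# L_a = I_k - 1_k a^t, cf. Eq. (6); note that a sums to one.
L_a = np.eye(k) - np.outer(np.ones(k), a)
Q_a = L_a @ Q                    # normalized Q-matrix, cf. Eq. (7)
```

Each row of `Q_a` is the corresponding row of `Q` minus the first row, reproducing the matrix in (10).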

The second step in the analysis is to check the rank of the "normalized" $Q$-matrix, $Q_a$, to make sure that the basic parameters are identified.

Proposition 2. The basic parameters are identified if and only if $Q_a$ has full column rank.

Proof. The item parameters are identified because we have applied a normalization. That is, $\beta_a^{(1)} \neq \beta_a^{(2)} \Rightarrow \Pr(X_{vi} = x_{vi} \mid \xi_v, \beta_i^{(1)}) \neq \Pr(X_{vi} = x_{vi} \mid \xi_v, \beta_i^{(2)})$, for some $i = 1, \dots, k$. Hence, we must establish that $\eta^{(1)} \neq \eta^{(2)} \Rightarrow \beta_a^{(1)} \neq \beta_a^{(2)}$. This is well known to apply if and only if $Q_a$ has full column rank (e.g., Lang, 1987, pp. 59-60).

The rank of $Q_a$ is independent of the normalization. It is easily checked that $L_aL_b = L_a$, where $a$ and $b$ denote two normalizations. Since $\mathrm{rank}(L_aQ) = \mathrm{rank}(L_aL_bQ) \leq \mathrm{rank}(L_bQ)$, while $\mathrm{rank}(L_bQ) = \mathrm{rank}(L_bL_aQ) \leq \mathrm{rank}(L_aQ)$, it follows that $\mathrm{rank}(L_aQ) = \mathrm{rank}(L_bQ)$.
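This invariance is easy to confirm numerically; a small sketch comparing a reference-item normalization with a mean normalization on an arbitrary, randomly generated (purely illustrative) Q-matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
k, p = 6, 3
Q = rng.normal(size=(k, p))      # an arbitrary illustrative Q-matrix

def L(a):
    """Normalization matrix L_a = I_k - 1_k a^t (a must sum to one)."""
    return np.eye(len(a)) - np.outer(np.ones(len(a)), a)

a = np.eye(k)[0]                 # reference-item normalization
b = np.full(k, 1.0 / k)          # mean normalization
rank_a = np.linalg.matrix_rank(L(a) @ Q)
rank_b = np.linalg.matrix_rank(L(b) @ Q)
```

The two ranks coincide, and one can also verify the identity $L_aL_b = L_a$ used in the argument.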

3 Equivalent LLTMs

There are many equivalent formulations of an LLTM because we can change the normalization and/or the meaning of the basic parameters without changing the restrictions on the differences between the item parameters. As explained in the previous section, the normalization may be changed by means of the $L_a$ matrix. The meaning of the basic parameters is changed by means of a non-singular transformation defined by $\eta^* = T\eta$, where $T$ is a non-singular matrix. In general, therefore, an LLTM may be written as

$$\beta_a = L_aQT^{-1}T\eta \equiv Q_aT^{-1}T\eta, \qquad (11)$$

where different choices of $a$ and $T$ give different specifications of the same LLTM.

An interesting implication is that researchers who hold different theories about a given set of items may nevertheless specify LLTMs that are equivalent from a mathematical point of view. How do we determine whether two LLTMs are equivalent?

Definition 3. In general, two LLTMs $Q^{(1)}\eta^{(1)}$ and $Q^{(2)}\eta^{(2)}$ are equivalent if

$$Q_a^{(1)}\eta^{(1)} = Q_a^{(2)}T^{-1}T\eta^{(2)}. \qquad (12)$$

If $Q_a^{(1)}$ is in the column space of $Q_a^{(2)}$, we may always find a non-singular matrix $T^{-1}$ such that $Q_a^{(1)} = Q_a^{(2)}T^{-1}$. Hence, two LLTMs are equivalent when both $Q_a$-matrices span the same column space, and the rank of the augmented matrix $[Q_a^{(1)}, Q_a^{(2)}]$ is $p$. The transformation matrix may then be calculated as

$$T^{-1} = \left(Q_a^{(1)}\right)^g Q_a^{(2)}, \qquad (13)$$

where $\left(Q_a^{(1)}\right)^g$ is an arbitrary g-inverse of $Q_a^{(1)}$ (see e.g., Pringle & Rayner, 1971).

Another implication is that the same improvement in the specification of a postulated $Q$-matrix can be achieved in many ways. As an illustration we will, in the next section, derive an efficient score (Rao, 1947) or Lagrange multiplier test (Aitchison & Silvey, 1958; Silvey, 1959) for the hypothesis that an arbitrary entry in the $Q$-matrix is specified correctly.
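The equivalence check of Definition 3 can be sketched numerically: two normalized Q-matrices specify equivalent LLTMs when the augmented matrix spans a p-dimensional column space, and the linking transformation can be recovered with a g-inverse (here the Moore-Penrose pseudoinverse). The matrices below are hypothetical illustrations, not from the paper:

```python
import numpy as np

Qa1 = np.array([[0.0, 0.0],
                [1.0, -2.0],
                [0.0, -1.0]])
T = np.array([[2.0, 1.0],
              [0.0, 1.0]])            # an arbitrary non-singular transformation
Qa2 = Qa1 @ np.linalg.inv(T)          # by construction, same column space as Qa1

p = Qa1.shape[1]
# Equivalent iff the augmented matrix [Qa1, Qa2] has rank p (same column space):
equivalent = np.linalg.matrix_rank(np.hstack([Qa1, Qa2])) == p

# Recover the linking matrix via a g-inverse, in the spirit of Eq. (13):
link = np.linalg.pinv(Qa2) @ Qa1      # recovers T here, since Qa1 = Qa2 T
```

When the two matrices do not share a column space, the rank of the augmented matrix exceeds $p$ and the models are not equivalent.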

4 A Lagrange Multiplier Test for the LLTM

Suppose we consider the postulated LLTM as the null hypothesis. To test the specification of the $(g, l)$ entry in the $Q$-matrix, $q_{gl}$, we compare the postulated LLTM to an alternative model where $q_{gl}$ is specified as a parameter $\sigma_{gl}$. That is,

$$H_0: \sigma_{gl} = q_{gl} \qquad (14)$$

$$H_1: \sigma_{gl} \neq q_{gl} \qquad (15)$$

If the null hypothesis is true, an estimate of $\sigma_{gl}$ would be near $q_{gl}$.¹ We could use a likelihood ratio test, but this requires estimation under the alternative model. To avoid this, we employ the Lagrange Multiplier (LM) test. The LM test is similar to the likelihood ratio test, but it does not require estimation under the general model (see Buse, 1982).

Let $\theta = (\eta^t, \sigma_{gl})^t$ denote the vector of parameters for the general model. Let $S_\theta(\theta_0)$ be the vector of first-order partial derivatives of the conditional loglikelihood (i.e., the score vector) of the general model evaluated at $\theta_0$, which denotes the Conditional Maximum Likelihood (CML) estimates under the LLTM. That is, $\theta_0 = (\hat{\eta}^t, q_{gl})^t$, where $\hat{\eta}$ denotes the CML estimate of the basic parameters. Similarly, $I_{\theta,\theta}(\theta_0)$ denotes the expected conditional information matrix evaluated at $\theta_0$. Since $\theta = (\eta^t, \sigma_{gl})^t$ is partitioned, the score vector and the information matrix may likewise be partitioned:

$$S_\theta(\theta_0) = \begin{pmatrix} S_\eta(\theta_0) \\ S_{\sigma_{gl}}(\theta_0) \end{pmatrix}, \quad I_{\theta,\theta}(\theta_0) = \begin{pmatrix} I_{\eta,\eta}(\theta_0) & I_{\eta,\sigma_{gl}}(\theta_0) \\ I_{\sigma_{gl},\eta}(\theta_0) & I_{\sigma_{gl},\sigma_{gl}}(\theta_0) \end{pmatrix}. \qquad (16)$$

The LM test involves the following statistic (cf. Glas & Verhelst, 1995):

$$\varphi_{gl} = S_{\sigma_{gl}}(\theta_0)^2 D_{gl}^{-1}, \qquad (17)$$

where

$$D_{gl} = I_{\sigma_{gl},\sigma_{gl}}(\theta_0) - I_{\sigma_{gl},\eta}(\theta_0) I_{\eta,\eta}(\theta_0)^{-1} I_{\eta,\sigma_{gl}}(\theta_0).$$

If the postulated LLTM holds, this statistic follows a chi-square distribution with one degree of freedom. Another view on the LM test is given by Sörbom (1989), who demonstrates that $\frac{1}{2}\varphi_{gl}$ is approximately the increase in conditional loglikelihood when the corresponding element of the $Q$-matrix is estimated (or specified as a constant with the value of the estimate). He calls this statistic the modification index.
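Given the partitioned score vector and information matrix in (16), the statistic (17) is a one-line computation. The sketch below uses hypothetical numbers in place of the quantities that the CML machinery would deliver:

```python
import numpy as np

# Hypothetical evaluated quantities at theta_0 (illustrative values only):
S_sigma = 2.1                          # score for sigma_gl
I_ss = 3.0                             # I_{sigma,sigma}
I_se = np.array([0.4, -0.2])           # I_{sigma,eta}
I_ee = np.array([[2.0, 0.3],
                 [0.3, 1.5]])          # I_{eta,eta}

# D_gl = I_{sigma,sigma} - I_{sigma,eta} I_{eta,eta}^{-1} I_{eta,sigma}
D = I_ss - I_se @ np.linalg.solve(I_ee, I_se)
phi = S_sigma**2 / D                   # LM statistic, Eq. (17)
half_phi = phi / 2.0                   # Sorbom's modification index
```

Under the null hypothesis, `phi` would be referred to a chi-square distribution with one degree of freedom.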

¹ Note that the alternative model pertains to a class of models that is more general than the LLTM. A definition of these models and conditions for their identifiability are given by Bechger, Verhelst and Verstralen (2000).


Note that

$$S_{\sigma_{gl}}(\theta_0)\,\hat{\eta}_l^{-1} = E[x_{\cdot g} \mid R = r] - x_{\cdot g} \quad (g \in \{1, \dots, k\};\ l \in \{1, \dots, p\}) \qquad (18)$$

provides an informal measure of the fit of item $g$ under the LLTM. The character $x_{\cdot g}$ denotes the number of correct responses to item $g$, and $E[x_{\cdot g} \mid R = r]$ denotes its expectation conditional on the observed subject scores. For later reference, we call (18) the Item Fit (IF) index.

It can be shown that $\varphi_{gl}$ is invariant under normalization and transformations of the basic parameters. This means that $\varphi_{gl}$ and $\varphi_{jh}$ (where $g, j \in \{1, \dots, k\}$, and $l, h \in \{1, \dots, p\}$) will be the same if the corresponding general models are equivalent. The reverse implication does not hold because $\varphi_{gl}$ and $\varphi_{jh}$ may be equal by chance.

Theorem 4. Let $\varphi_{gl}$ and $\varphi_{jh}$ denote the statistics associated with two entries in a given $Q$-matrix. Let $M^{(1)}$ denote a model where only the $(g, l)$ entry in $Q$ is estimated as a parameter, and let $M^{(2)}$ denote a model where only the $(j, h)$ entry is estimated as a parameter. Then, $\varphi_{gl} = \varphi_{jh}$ if $M^{(1)}$ is equivalent to $M^{(2)}$. The proof of this theorem is given in the Appendix.

This finding is intuitively plausible. When the same improvement in conditional loglikelihood can be achieved by estimating (or re-specifying) one of several entries in the $Q$-matrix, the corresponding statistics should indeed be equal.

5 Estimating $\sigma_{gl}$

Suppose we keep the basic parameters at the CML estimates and consider Fisher's Scoring (FS) algorithm to find optimal values for the parameter in the $Q$-matrix. After simplification of the gradient and the information matrix we find the following relation between estimates at iteration $t$ and $t + 1$:

$$\sigma_{gl}^{(t+1)} = \sigma_{gl}^{(t)} + D^{-1}S_{\sigma_{gl}}(\theta_0), \qquad (19)$$

where $D^{-1}S_{\sigma_{gl}}(\theta_0)$ is evaluated at the current values of the basic parameters. If we start the iterations at $\sigma_{gl}^{(0)} = q_{gl}$, the quantity $D^{-1}S_{\sigma_{gl}}(\theta_0)$, which equals $\varphi_{gl}$ divided by $S_{\sigma_{gl}}(\theta_0)$, gives $\sigma_{gl}^{(1)} - q_{gl}$. In the next section, we call this quantity the First FS Step (FFSS). To estimate the optimal value of $q_{gl}$, we fix $q_{gl}$ at the estimated value of $\sigma_{gl}$, estimate the basic parameters, take the next FS step, and continue this procedure until the specification of $q_{gl}$ can no longer be improved. We have then replaced $q_{gl}$ by a CML estimate of the parameter $\sigma_{gl}$, given the values of the other entries in $Q$. An estimate of its standard error is given by the square root of the inverse of $I_{\sigma_{gl},\sigma_{gl}}(\theta)$ evaluated at the final estimates. Note that the estimate of an entry in $Q$ will be biased when other entries of $Q$ are also misspecified.
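The update (19) has the generic Fisher scoring form "new = old + information⁻¹ × score". As a toy illustration (a one-parameter Bernoulli logit likelihood, not the paper's conditional likelihood; all numbers are hypothetical), the same iteration looks as follows:

```python
import math

def fisher_scoring(x, n, theta=0.0, tol=1e-10, max_iter=100):
    """Fisher scoring for the logit theta of a success probability,
    given x successes in n trials (toy stand-in for Eq. (19))."""
    for _ in range(max_iter):
        p = 1.0 / (1.0 + math.exp(-theta))
        score = x - n * p              # first derivative of the loglikelihood
        info = n * p * (1.0 - p)       # expected information
        step = score / info            # analogue of D^{-1} S_sigma in Eq. (19)
        theta += step
        if abs(step) < tol:
            break
    return theta

theta_hat = fisher_scoring(x=30, n=100)   # converges to logit(0.3)
```

The first step from the starting value plays the role of the FFSS; subsequent steps refine the estimate until convergence.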

This procedure makes sense if $\sigma_{gl}$ is identified and its value is unique. The following theorem provides a relatively simple way to determine the identifiability of single entries in the $Q$-matrix. It is proven in Bechger, Verhelst and Verstralen (to appear).

Theorem 5. Choose a normalization such that $a_g = 0$ and let $Q_{a(-g)}$ denote $Q_a$ with the $g$th row deleted. Then, the $(g, l)$ entry in $Q$ is identifiable if and only if $Q_{a(-g)}$ has full column rank and $\eta_l \neq 0$.

As an illustration, consider the following LLTM:

$$\begin{pmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 2 & 1 \\ 1 & \sigma_{32} \end{pmatrix} \begin{pmatrix} \eta_1 \\ \eta_2 \end{pmatrix} = \begin{pmatrix} \eta_1 \\ 2\eta_1 + \eta_2 \\ \eta_1 + \sigma_{32}\eta_2 \end{pmatrix}, \qquad (20)$$

where the $(3, 2)$ entry is considered a parameter. Suppose that we do not normalize the LLTM. If we solve the equations for $\sigma_{32}$ we find that

$$\sigma_{32} = \frac{\beta_3 - \beta_1}{\beta_2 - 2\beta_1}. \qquad (21)$$

One might be tempted to conclude that $\sigma_{32}$ is identified, but this is not true because $\beta_2 - 2\beta_1$ depends on the normalization; that is, if $c \neq 0$, $\beta_2 - 2\beta_1 \neq (\beta_2 + c) - 2(\beta_1 + c)$. This shows how important it is to consider the normalized $Q$-matrix rather than the matrix a researcher provides us with. The conclusion that $\sigma_{32}$ is not identified is confirmed if we look at the rank of the $Q_{a(-3)}$ matrix. In the example that was mentioned in Section 2,

$$Q_{a(-2)} = \begin{pmatrix} 0 & 0 \\ 0 & -1 \end{pmatrix}, \qquad (22)$$

and $\sigma_{21}$ is not identified. Both examples illustrate that the rank of $Q_{a(-g)}$ is at most $k - 2$, so that no element of the $Q$-matrix will be identified if $p \geq k - 1$. When $p < k - 1$, the $Q$-matrix may contain entries that are identified as well as entries that are not identified.
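Theorem 5 turns the identifiability question into a rank computation. Here is a sketch applying it to the two examples above, with the (3, 2) entry of (20) set to an arbitrary trial value (helper names are ours):

```python
import numpy as np

def entry_identifiable(Q, g, reference=0):
    """Full column rank of Q_a with row g deleted (0-indexed rows),
    using the reference-item normalization a = e_reference, so that a_g = 0."""
    k = Q.shape[0]
    a = np.eye(k)[reference]
    L_a = np.eye(k) - np.outer(np.ones(k), a)
    Qa_minus_g = np.delete(L_a @ Q, g, axis=0)
    return np.linalg.matrix_rank(Qa_minus_g) == Q.shape[1]

Q20 = np.array([[1.0, 0.0], [2.0, 1.0], [1.0, 0.7]])  # Eq. (20), trial sigma_32
Q8 = np.array([[1.0, 3.0], [2.0, 1.0], [1.0, 2.0]])   # the example of Section 2

id_32 = entry_identifiable(Q20, g=2)   # (3,2) entry
id_21 = entry_identifiable(Q8, g=1)    # (2,1) entry
```

Both checks come out negative: the rank of $Q_{a(-g)}$ is $1 < p = 2$ in each case, so neither entry is identified.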

Note that if we consider $\eta_l$ to be random with some non-degenerate, continuous distribution, the probability that $\eta_l = 0$ is zero. If, however, in applications $|\eta_l|$ becomes very small, the FS algorithm may fail to converge.

6 An Application

Consider as an illustration a data set consisting of 300 responses to five items that is published by Rost (1996, e.g., p. 303). For these five items, the following $Q$-matrix was believed to be appropriate (Rost, 1996, Section 3.4.1):

$$Q = \begin{pmatrix} 1 & 0 \\ 2 & 0 \\ 1 & 1 \\ 2 & 1 \\ 2 & 2 \end{pmatrix}. \qquad (23)$$

The entries in this $Q$-matrix specify the number of times that each of two cognitive operations is required to solve the corresponding item.

We have used CML estimation as described briefly in the Appendix. The basic parameters (with their asymptotic standard errors within parentheses) are estimated to be $\hat{\eta}_1 = 0.46$ $(0.14)$ and $\hat{\eta}_2 = 0.96$ $(0.10)$. The conditional loglikelihood of the model is $-324.92$, which is only slightly lower than the conditional loglikelihood for the RM, which is $-320.62$. (Using a central chi-square approximation, minus twice this difference is associated with a probability level of $0.01$.) As judged from the IF indices (see Equation 18), item 4 and item 5 are the worst fitting items. Item 4, for example, is associated with an IF index of $14.35$, which indicates that there are about 14 correct responses too many under the postulated LLTM. Since this is only a small percentage of the sample, the IF index suggests that the model fits the data very well.

The $\varphi$'s are

$$\begin{pmatrix} 1.61 & 1.61 \\ 1.76 & 1.76 \\ 1.61 & 1.61 \\ 6.93 & 6.93 \\ 8.30 & 8.30 \end{pmatrix}. \qquad (24)$$


Equalities between the MIs suggest "equivalent" ways to improve the model. For starters, the MIs are equal for each row of $Q$, since an equivalent improvement in loglikelihood can be achieved by a change in either one of the weights for the first or the second basic parameter. To be more precise, if the fit of the LLTM can be improved by adding a constant $d$ to the estimated $\beta_i$, this improvement can be achieved in two ways, since

$$\hat{\beta}_i + d = \left(q_{i1} + \frac{d}{\eta_1}\right)\eta_1 + q_{i2}\eta_2 = q_{i1}\eta_1 + \left(q_{i2} + \frac{d}{\eta_2}\right)\eta_2. \qquad (25)$$
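Identity (25) is elementary arithmetic; here is a quick numerical check with hypothetical weights and basic parameters (not the paper's estimates):

```python
# Hypothetical values (not the paper's estimates):
q1, q2 = 2.0, 1.0          # weights q_{i1}, q_{i2}
eta1, eta2 = 0.46, 0.96    # basic parameters
d = 0.5                    # desired shift of beta_i

beta = q1 * eta1 + q2 * eta2
via_first = (q1 + d / eta1) * eta1 + q2 * eta2    # shift absorbed in q_{i1}
via_second = q1 * eta1 + (q2 + d / eta2) * eta2   # shift absorbed in q_{i2}
# Both routes produce beta + d, i.e. the same change in fit.
```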

It is obvious that this phenomenon will appear in any LLTM, even when there are more than two basic parameters. The equality between the MIs of items 1 and 3 can also be explained by the presence of equivalent models. To be more specific, we found that an LLTM with the $(1, 2)$ entry specified as $-0.347$ is equivalent to an LLTM with entry $(1, 2)$ fixed at $0$ and entry $(3, 2)$ specified as $1 + 0.347$. The value $0.347$ was found using the FS procedure described in the previous section. With either specification, the conditional loglikelihood improved by $\frac{1}{2} \times 1.61$.

The results suggest that the greatest improvement in conditional loglikelihood (about $4.15$) can be achieved by estimating either entry in the fifth row. Since the entries in the $Q$-matrix represent the number of times that a certain cognitive operation is used in the solution process, it seems reasonable to consider only integer values. The FFSS suggests that the $(5, 1)$ entry should be about 2 lower and that the $(5, 2)$ entry should be about 1 lower. If we specify $q_{51} = 0$ we obtain an LLTM with a loglikelihood of $-321.15$. The basic parameters are now estimated to be $\hat{\eta}_1 = 0.49$ $(0.09)$ and $\hat{\eta}_2 = 1.38$ $(0.10)$. Suppose that instead of changing the value of $q_{51}$, we change the value of $q_{41}$ to 1. The $\varphi$'s for the parameters of the fifth item are no longer significant and the fit of the model has improved to $-321.68$, which is only slightly lower than the likelihood of the model with $q_{51} = 0$. This shows that the $\varphi$'s are not independent. The basic parameters are now estimated to be $\hat{\eta}_1 = 0.38$ $(0.09)$ and $\hat{\eta}_2 = 0.95$ $(0.10)$, close to the values under the first model. Although the different specifications are not equivalent, substantial arguments are needed to distinguish between them.


7 Conclusion

Researchers often specify the $Q$-matrix as if there were $k$ independent item difficulty parameters and ignore the need for a normalization. To deal with this unfortunate situation, Fischer (e.g., 1995a) suggests that we examine the $Q$-matrix to assure ourselves that a normalization is imposed. Unlike Fischer, we have proposed that a given $Q$-matrix be adapted to make sure that a convenient normalization is imposed. We have found that the basic parameters are identified if and only if the adapted $Q$-matrix is of full column rank. If the adapted $Q$-matrix is of deficient column rank, the researcher is required to specify a new $Q$-matrix.

We have also found that there are infinitely many ways to specify an LLTM because one is completely free to choose a normalization or transform the basic parameters. An interesting implication is that researchers may think differently about a given set of items and nevertheless specify statistically equivalent models. Another implication is that there are many ways to change the specification of a given LLTM and achieve the same improvement in model fit. We have demonstrated that this affects the LM test for single entries in the $Q$-matrix. In general, the statistics associated with different sets of entries should be equal if the same improvement can be achieved by changing the specification of either set of entries.

It may be tempting to use the LM test in the LLTM in the same way as is done in Structural Equation Modeling (SEM); that is, to examine the statistics, to free the restriction that leads to the largest reduction in goodness-of-fit, and to repeat the process with the revised model until an adequate fit is achieved. Bollen (1989) surveys years of experience with the LM test and puts in the following caveats against this procedure:

1. The central chi-square distribution may not even be valid for statistics associated with correctly specified entries. The reason is that the wrong values in other places in the $Q$-matrix are part of the null hypothesis.

2. The statistics are not independent.

3. The order in which parameters are freed or restricted can affect the significance tests for the remaining parameters.

4. The LM test is most useful when the misspecifications are minor. Fundamental changes in the structure of the model may be needed, but these are not detectable with these procedures.


5. Simulation studies show that a specification search that is guided by the LM test is quite likely to lead one to the wrong model.

These findings highlight the importance of guidance from substantive knowledge in the specification of the LLTM and suggest that the LM test should be used with caution. Finally, we note that it is common practice in SEM to supplement the LM test with a preliminary estimate of the parameter, which is given by the first FS step (see Section 5).


8 Appendix

8.1 The LM test depends neither on the normalization, nor on the parameterization of the basic parameters

Theorem 6. The value of the statistic associated with the LM test depends neither on the normalization, nor on the parameterization of the basic parameters.

To prove this statement we proceed in two steps. First, we prove that the statistic $\varphi_{gl}$ is independent of the normalization. Then, we prove that the value of the statistic is not changed by a transformation of the basic parameters. We start by giving general expressions for the score vector and the conditional information matrix.

In general, the statistic $\varphi$ is given by

$$\varphi = S_\theta(\theta_0)^t I_{\theta,\theta}(\theta_0)^{-1} S_\theta(\theta_0). \qquad (26)$$

It follows from the chain rule that $S_\theta(\theta)$ is given by

$$J^t \left( \frac{\partial l(\beta)}{\partial \beta} \right), \qquad (27)$$

where $l(\beta)$ denotes the conditional loglikelihood of the item parameters under the RM, $J = (\partial \beta_i / \partial \theta_j)$ denotes the Jacobian matrix of first-order partial derivatives (e.g., (35)), and $\theta_j$ denotes a parameter in $\theta = (\eta^t, \sigma_{gl})^t$. (Bechger, Verhelst and Verstralen (to appear) give a general expression for this Jacobian matrix.) The conditional information matrix is given by

$$I_{\theta,\theta}(\theta) = J^t \left( -E\left[ \frac{\partial^2 l(\beta)}{\partial \beta\, \partial \beta^t} \right] \right) J. \qquad (28)$$

The derivatives with respect to the item parameters can be found, e.g., in Fischer (1974 or 1995a).

Suppose that we change to a different normalization indexed by $a$. The Jacobian matrix is now:

$$J_a = L_aJ = J - \mathbf{1}_k a^t J. \qquad (29)$$


We use the following identities, and substitute $J_a$ in the equation for the statistic $\varphi$ to obtain $\varphi_a$, the LM test statistic under an arbitrary normalization:

$$\left( -E\left[ \frac{\partial^2 l(\beta)}{\partial \beta\, \partial \beta^t} \right] \right) \mathbf{1}_k = \mathbf{0}_k, \quad \text{and} \quad \left( \frac{\partial l(\beta)}{\partial \beta} \right)^t \mathbf{1}_k = 0. \qquad (30)$$

The first identity is proven by Verhelst (1993, Lemma 1). The second identity follows easily by substituting the derivatives. Now the LM test statistic is

$$\varphi_a = \left( \frac{\partial l(\beta)}{\partial \beta} \right)^t L_aJ \left[ J^t L_a^t \left( -E\left[ \frac{\partial^2 l(\beta)}{\partial \beta\, \partial \beta^t} \right] \right) L_aJ \right]^{-1} J^t L_a^t \left( \frac{\partial l(\beta)}{\partial \beta} \right). \qquad (31)$$

If one substitutes (29) in this equation, and uses the identities (30), one finds that $\varphi_a = \varphi$. This proves that the statistic $\varphi$ is independent of the normalization. It follows from the chain rule, and the rule about the transformation of the information matrix after reparameterization (e.g., Azzalini, 1996, Equation 3.11), that the statistic $\varphi$ is also invariant under various parameterizations of the basic parameters. Thus, if two alternative hypotheses are equivalent, the associated statistics will be the same.

8.2 A Common Framework for CML Estimation of the RM and the LLTM

In this paragraph we demonstrate that CML estimates of the item parameters in the RM or the basic parameters in the LLTM can be obtained using the same algorithm. This algorithm may not be efficient, because it involves multiplication with matrices that may in practice be very large and sparse, but it may be useful with small data sets (as in the present paper) or in theoretical work.

First, the researcher chooses a $Q$-matrix and a normalization, i.e., a vector $a$ such that $a^t\mathbf{1}_k = 1$. The CML estimates of the normalized item parameters are

$$\hat{\beta}_a = L_aQ\hat{\eta} = Q_a\hat{\eta}, \qquad (32)$$

where $\hat{\eta}$ denotes the CML estimates of the basic parameters and $L_a = I_k - \mathbf{1}_k a^t$. The asymptotic variance-covariance matrix of the normalized item parameters is

$$L_aQ\, I_{\eta,\eta}^{-1}(\hat{\eta})\, Q^t L_a^t, \qquad (33)$$

where $I_{\eta,\eta}^{-1}(\hat{\eta})$ denotes the asymptotic variance-covariance matrix of the estimates of the basic parameters.

If $\mathrm{rank}(Q_a) = p$, the basic parameters are identified and we may use an FS algorithm to estimate them. The FS algorithm requires the score vector and information matrix, which are given by (27) and (28), respectively. The Jacobian matrix is $J = L_aQ$. The FS algorithm involves the evaluation of the elementary symmetric functions, which is described by Verhelst, Glas and Van der Sluis (1984), and Verhelst and Veldhuijzen (1991).

To fit the RM, we may choose $Q = N_r$, the $k \times (k - 1)$ matrix that we obtain by deleting the $r$th column of $I_k$. Note that if $Q = N_r$, then $\eta^t = (\beta_a - \beta_r, \beta_{a+1} - \beta_r, \dots, \beta_{r-1} - \beta_r, \beta_{r+1} - \beta_r, \dots, \beta_{r+b} - \beta_r)$, where

$$a = \begin{cases} 2 & \text{if } r = 1 \\ 1 & \text{otherwise} \end{cases} \qquad (34)$$

and $b = k - r$. That is, the CML estimates must be interpreted as distances relative to the $r$th item.

To calculate the LM statistic for the $(g, l)$ entry in the $Q$-matrix it is convenient to choose a normalization where $a_g = 0$. The Jacobian matrix is then given by

$$J = [L_aQ(\sigma_{gl}),\ P(\eta_l)], \qquad (35)$$

where $P(\eta_l)$ denotes a column vector with all elements equal to zero except the $g$th entry, which is equal to $\eta_l$. The notation $Q(\sigma_{gl})$ indicates that $Q$ is now a function of the parameter $\sigma_{gl}$. The score vector and the conditional information matrix are again given by (27) and (28), respectively.


9 References

Aitchison, J., & Silvey, S. D. (1958). Maximum likelihood estimation of parameters subject to restraints. Annals of Mathematical Statistics, 29, 813-828.

Azzalini, A. (1996). Statistical inference based on the likelihood. London: Chapman and Hall.

Bechger, T. M., Verhelst, N. D., & Verstralen, H. (to appear). Identifiability of non-linear logistic test models. Accepted for publication in Psychometrika.

Buse, A. (1982). The likelihood ratio, Wald, and Lagrange multiplier tests: An expository note. The American Statistician, 36(3), 153-157.

Fischer, G. H. (1974). Einführung in die Theorie psychologischer Tests: Grundlagen und Anwendungen [Introduction to the theory of psychological tests: Foundations and applications]. Bern: Verlag Hans Huber.

Fischer, G. H. (1995a). The linear logistic test model. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch models: Foundations, recent developments and applications (Chapter 8). Berlin: Springer.

Fischer, G. H. (1995b). Linear logistic test models for change. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch models: Foundations, recent developments and applications. Berlin: Springer.

Glas, C. A. W., & Verhelst, N. D. (1989). Extensions of the Partial Credit Model. Psychometrika, 54, 635-659.

Glas, C. A. W., & Verhelst, N. D. (1995). Testing the Rasch model. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch models: Foundations, recent developments and applications. New York: Springer.

Lang, S. (1987). Linear algebra (3rd ed.). New York: Springer.

Linacre, J. M. (1994). Many-facet Rasch measurement. Chicago: MESA Press.

Pringle, R. M., & Rayner, A. A. (1971). Generalized inverse matrices with applications to statistics. London: Charles Griffin.

Rao, C. R. (1947). Large sample tests of statistical hypotheses concerning several parameters with applications to problems of estimation. Proceedings of the Cambridge Philosophical Society, 44, 50-57.

Rost, J. (1996). Testtheorie, Testkonstruktion [Test theory, test construction]. Bern: Verlag Hans Huber.

Silvey, S. D. (1959). The Lagrangian multiplier test. Annals of Mathematical Statistics, 30, 389-407.

Sörbom, D. (1989). Model modification. Psychometrika, 54, 371-384.

Verhelst, N. D. (1993). On the standard errors of parameter estimators in the Rasch model (Measurement and Research Department Reports 93-1). Arnhem: Cito.

Verhelst, N. D., Glas, C. A. W., & Van der Sluis, A. (1984). Estimation problems in the Rasch model: The basic symmetric functions. Computational Statistics Quarterly, 245-262.

Verhelst, N. D., & Veldhuijzen, N. H. (1991). A new algorithm for computing elementary symmetric functions and their first and second order derivatives (Measurement and Research Department Reports 91-5). Arnhem: Cito.

Wilson, M. (1989). Saltus: A psychometric model for discontinuity in cognitive development. Psychological Bulletin, 105, 276-289.
