
Jaime San Martín CMM-Universidad de Chile

December 2017 Nancy Lacourly (CMM), Paula Uribe (CMM), Mónica Silva (PUC)

Theoretical And Practical Problems with 2PL and 3PL

ITEM   DIFFICULTY 2PL
1      -0.3
2       0.5
3       1.3
4       0.6
5       3.2

Choose two items, which ones?

ITEM   DIFFICULTY 2PL   DISC.
1      -0.3             0.5
2       0.5             1.5
3       1.3             0.8
4       0.6             1.0
5       3.2             0.1

… and now ?

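where

\[
P_{ij} = \frac{e^{a_i(\theta_j - b_i)}}{1 + e^{a_i(\theta_j - b_i)}} = P(\theta_j, a_i, b_i),
\]

\[
u_{ij} =
\begin{cases}
1 & \text{student } j \text{ answers item } i \text{ correctly,} \\
0 & \text{otherwise,}
\end{cases}
\]

\[
C_j = \{ i \in I : u_{ij} = 1 \} \quad \text{(the set of correct items for student } j\text{).}
\]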

The total likelihood $L$ is the product $L = \prod_j L_j$, assuming local independence.

Recall that $P(\theta, a, b)$ represents the probability that a student with ability $\theta$ correctly answers a question with discrimination $a$ and difficulty $b$. Obviously this function has to be increasing in $\theta$, which amounts to saying that $a > 0$.
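The response function and the role of the sign of $a$ can be checked numerically. The following is a minimal sketch, not the authors' code; the helper name `p_correct` and the grid of abilities are illustrative assumptions, and the item parameters are those of item 2 in the introductory table.

```python
import math

def p_correct(theta, a, b):
    """2PL probability that a student with ability theta answers an item
    with discrimination a and difficulty b correctly (hypothetical helper)."""
    z = a * (theta - b)
    # Numerically stable logistic: avoids overflow of exp for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Item 2 of the introductory table: difficulty 0.5, discrimination 1.5.
thetas = [-2.0, -1.0, 0.0, 0.5, 1.0, 2.0]
probs = [p_correct(t, a=1.5, b=0.5) for t in thetas]
increasing = all(p < q for p, q in zip(probs, probs[1:]))

# With a negative discrimination the curve decreases in theta instead.
probs_neg = [p_correct(t, a=-1.5, b=0.5) for t in thetas]
decreasing = all(p > q for p, q in zip(probs_neg, probs_neg[1:]))

print(increasing, decreasing)
```

At $\theta = b$ the probability is exactly one half, whatever the value of $a$.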

In order to maximise $L$, we study the first-order equations associated with the log-likelihood $\mathcal{L} = \log(L)$, which are given by

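\[
\frac{\partial \mathcal{L}}{\partial a_i}
= \sum_{j \in A} (\theta_j - b_i)\left[ u_{ij}(1 - P_{ij}) - (1 - u_{ij}) P_{ij} \right]
= \sum_{j \in A:\, u_{ij} = 1} (\theta_j - b_i) - \sum_{j \in A} (\theta_j - b_i) P_{ij} = 0
\tag{2.1}
\]

\[
\frac{\partial \mathcal{L}}{\partial b_i}
= -\sum_{j \in A} a_i \left[ u_{ij}(1 - P_{ij}) - (1 - u_{ij}) P_{ij} \right]
= -a_i \left[ \sum_{j \in A:\, u_{ij} = 1} 1 - \sum_{j \in A} P_{ij} \right] = 0
\tag{2.2}
\]

\[
\frac{\partial \mathcal{L}}{\partial \theta_j}
= \sum_{i \in I} a_i \left[ u_{ij}(1 - P_{ij}) - (1 - u_{ij}) P_{ij} \right]
= \left[ \sum_{i \in C_j} a_i - \sum_{i \in I} a_i P_{ij} \right] = 0
\tag{2.3}
\]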
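Equation (2.3) can be sanity-checked against a numerical derivative. The sketch below assumes the usual Bernoulli form of the per-student likelihood, $L_j = \prod_i P_{ij}^{u_{ij}} (1 - P_{ij})^{1 - u_{ij}}$, which is not written out in this transcript; the function names and the response pattern are illustrative, and the item parameters are the five items of the introductory table.

```python
import math

def p_correct(theta, a, b):
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def log_lik_student(theta, items, u):
    """Log-likelihood of one student, assuming the Bernoulli form
    prod_i P^u (1 - P)^(1 - u) (an assumption, see the lead-in)."""
    total = 0.0
    for (a, b), u_i in zip(items, u):
        p = p_correct(theta, a, b)
        total += u_i * math.log(p) + (1 - u_i) * math.log(1.0 - p)
    return total

def score_theta(theta, items, u):
    """Right-hand side of (2.3): accumulated discrimination of the correct
    items minus the sum of a_i * P_ij over all items."""
    correct = sum(a for (a, _), u_i in zip(items, u) if u_i == 1)
    expected = sum(a * p_correct(theta, a, b) for a, b in items)
    return correct - expected

# (discrimination, difficulty) pairs of the five introductory items.
items = [(0.5, -0.3), (1.5, 0.5), (0.8, 1.3), (1.0, 0.6), (0.1, 3.2)]
u = [1, 1, 0, 0, 0]  # illustrative response pattern

theta, h = 0.7, 1e-6
numeric = (log_lik_student(theta + h, items, u)
           - log_lik_student(theta - h, items, u)) / (2 * h)
analytic = score_theta(theta, items, u)
print(abs(numeric - analytic) < 1e-6)
```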

Equation (2.3) gives the ability $\theta_j$ of student $j$ as a function of $\sum_{i \in C_j} a_i$, which is the accumulated discrimination of the items he/she answered correctly. We notice that the function

\[
g(\theta) = \sum_{i \in I} a_i P(\theta, a_i, b_i) = \sum_{i \in I} a_i \frac{e^{a_i(\theta - b_i)}}{1 + e^{a_i(\theta - b_i)}},
\]

assuming $((a_i, b_i) : i \in I)$ are known, where $I$ is the set of all items, is strictly increasing in $\theta$ (because all $a_i > 0$) and it is the same function for all students. Equation (2.3) is equivalent to

\[
g(\theta_j) = \sum_{i \in C_j} a_i .
\]

The solution of this equation is

\[
\theta_j = g^{-1}\left( \sum_{i \in C_j} a_i \right),
\]

which is an increasing function of the accumulated discrimination (of the items answered correctly by student $j$). The important observation is that this function is common to all students. In summary, the ability of a student is an increasing transformation of the accumulated discrimination of the correct items. And this happens even when the assumptions of the IRT model do not hold.

[Figure, "2PL model": ability (vertical axis) against accumulated discrimination of the correct items (horizontal axis).]

[Figure: discrimination (vertical axis) against difficulty (horizontal axis).]
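Why $g$ can be inverted follows from a short computation, sketched here with the same symbols as above and using only $a_i > 0$:

```latex
\[
g'(\theta) = \sum_{i \in I} a_i^2 \, P(\theta, a_i, b_i)\,\bigl(1 - P(\theta, a_i, b_i)\bigr) > 0,
\qquad
\lim_{\theta \to -\infty} g(\theta) = 0,
\qquad
\lim_{\theta \to +\infty} g(\theta) = \sum_{i \in I} a_i .
\]
```

So $g$ maps the real line one-to-one onto the interval $\left(0, \sum_{i \in I} a_i\right)$, and equation (2.3) has a finite solution exactly when student $j$ has at least one correct and at least one incorrect answer.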

The interpretation of this observation is that, for the 2PL model, students are ranked by ability score exactly as they are ranked by accumulated discrimination, which can contradict the ranking given by any reasonable score based on the difficulties of the correct items. Here a reasonable score is a function that is increasing in each of the difficulties of the correct items. We have not been able to prove this result theoretically, but we shall see it empirically with data from the PSU 2016 test and PISA Mexico 2012.

One could say that in a test the difficulties and discriminations are expected to be highly positively correlated, so that the principle could be fulfilled. In that case, is a discrimination parameter necessary?

Part of the explanation lies in the interpretation of the discrimination parameter, which varies across items in the 2PL and 3PL IRT models: if two items have the same difficulty parameter, the difference in the probability of a correct response between two students whose abilities are close to that difficulty is greater for the item with the greater discrimination. So a student with a higher proportion of correct responses than another can receive a smaller ability score, regardless of the item difficulties.
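The statement about two items of equal difficulty can be illustrated numerically. This is only a sketch with made-up values (difficulty 0.5, discriminations 0.5 and 1.5, and two pairs of student abilities); it also shows that the comparison can reverse when both students are far above the difficulty, where the steeper curve has already saturated.

```python
import math

def p_correct(theta, a, b):
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def gap(theta_low, theta_high, a, b):
    """Difference in the probability of a correct response between two students."""
    return p_correct(theta_high, a, b) - p_correct(theta_low, a, b)

b = 0.5  # common difficulty of the two hypothetical items

# Two students on either side of the difficulty.
near_low_a = gap(0.0, 1.0, a=0.5, b=b)
near_high_a = gap(0.0, 1.0, a=1.5, b=b)

# Two students far above the difficulty.
far_low_a = gap(4.0, 5.0, a=0.5, b=b)
far_high_a = gap(4.0, 5.0, a=1.5, b=b)

print(near_high_a > near_low_a)  # larger gap for the more discriminating item
print(far_high_a < far_low_a)    # the comparison reverses far from the difficulty
```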

To understand this statement, let us introduce a very strong order among students. For each student $j$ consider the difficulties $(b_i : i \in C_j)$ of the correct items. We order these difficulties in decreasing fashion:

\[
V_j = \left( b^j_{(1)} \ge b^j_{(2)} \ge \cdots \ge b^j_{(n_j)} \right),
\]

where $n_j$ is the number of correct items for student $j$ and we add an extra index $j$ because this order depends on the student $j$: $b^j_{(1)}$ is the difficulty of the most difficult item answered correctly by student $j$, $b^j_{(2)}$ is the difficulty of the second most difficult item answered correctly by student $j$, and so on.

We say student $k$ is weaker than student $j$, which we denote by $k \prec j$, if the following conditions hold: first $n_k \le n_j$, and for $\ell = 1, \dots, n_k$

\[
b^k_{(\ell)} \le b^j_{(\ell)},
\]

with at least one strict inequality.

That is, $j$ answered at least as many items correctly as $k$, the most difficult item answered by $j$ is at least as difficult as the most difficult item answered by $k$, the second most difficult item answered by $j$ is at least as difficult as the second of $k$, and so on for the $n_k$ items answered correctly by student $k$. Any reasonable score should put student $j$ ahead of student $k$.
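The order just defined translates directly into a comparison of sorted lists. The function below is a minimal sketch with a hypothetical name, `is_weaker`; it reads "at least one strict inequality" as covering either a strictly smaller number of correct items or a strictly smaller difficulty at some position, which is one possible interpretation of the definition.

```python
def is_weaker(b_k, b_j):
    """Return True if student k is weaker than student j.

    b_k, b_j: difficulties of the items each student answered correctly.
    """
    if len(b_k) > len(b_j):
        return False  # k must not have more correct items than j
    v_k = sorted(b_k, reverse=True)
    v_j = sorted(b_j, reverse=True)
    pairs = list(zip(v_k, v_j))  # zip stops after the first n_k positions
    if any(x > y for x, y in pairs):
        return False
    # Strictness (an interpretation): fewer correct items, or a strictly
    # easier item at some position.
    return len(b_k) < len(b_j) or any(x < y for x, y in pairs)

# Illustrative difficulties, not taken from the data.
print(is_weaker([1.0, 0.2], [1.5, 0.2, -0.4]))   # comparable: k is weaker
print(is_weaker([2.0, -1.0], [1.0, 0.0]))        # not comparable
print(is_weaker([1.0, 0.0], [2.0, -1.0]))        # not comparable either
```

The last two calls show that this is only a partial order: many pairs of students are not comparable at all.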

The main question here is: can there be two students, with $k$ weaker than $j$ ($k \prec j$), such that $\theta_j < \theta_k$? That is, even though $k$ is a weaker student than $j$, $k$ has a larger 2PL ability. In that case we will say that $k$ dominates $j$, or that $j$ is dominated by $k$. This would mean that the estimated ability is not an increasing function of the difficulties of the correct items, as one would expect. Numerical examples from different tests, in particular the Chilean national selection test (PSU), show that this does happen.


ITEM   DIFFI.   DISC.   Student A   Student B
1      -0.3     0.5     1           1
2       0.5     1.5     1           0
3       1.3     0.8     0           1
4       0.6     1.0     0           0
5       3.2     0.1     0           1

Accumulated discrimination: Student A 2.0, Student B 1.4

Difficulty (item number) of the correct items, in decreasing order:

Student A    Student B
 0.5 (2)      3.2 (5)
-0.3 (1)      1.3 (3)
             -0.3 (1)

Student A is weaker than Student B but Student A has larger ability
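The five-item example can be reproduced end to end. The sketch below takes the item parameters as known, as in the table above, and solves $g(\theta) = \sum_{i \in C_j} a_i$ by bisection; the bisection solver and its bracket $[-50, 50]$ are choices made for this illustration, not the estimation procedure used by the authors.

```python
import math

# item number -> (difficulty, discrimination), from the table above
items = {1: (-0.3, 0.5), 2: (0.5, 1.5), 3: (1.3, 0.8), 4: (0.6, 1.0), 5: (3.2, 0.1)}
correct = {"A": [1, 2], "B": [1, 3, 5]}

def g(theta):
    """Sum over all items of a_i * P(theta, a_i, b_i)."""
    return sum(a / (1.0 + math.exp(-a * (theta - b))) for b, a in items.values())

def ability(target, lo=-50.0, hi=50.0, tol=1e-10):
    """Solve g(theta) = target by bisection; valid because g is strictly increasing."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Accumulated discrimination of the correct items, then the implied ability.
acc = {s: sum(items[i][1] for i in c) for s, c in correct.items()}
theta = {s: ability(acc[s]) for s in acc}

# Difficulties of the correct items in decreasing order, to check the order.
v_a = sorted((items[i][0] for i in correct["A"]), reverse=True)
v_b = sorted((items[i][0] for i in correct["B"]), reverse=True)
a_weaker = (len(v_a) <= len(v_b)
            and all(x <= y for x, y in zip(v_a, v_b))
            and any(x < y for x, y in zip(v_a, v_b)))

print({s: round(v, 4) for s, v in acc.items()})
print(a_weaker, theta["A"] > theta["B"])
```

Because $g$ is common to both students, the comparison of abilities reduces to the comparison of the accumulated discriminations 2.0 and 1.4; the difficulties of the correct items play no role.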

Student   Ability 2PL   Accumulated difficulty   Accumulated discrimination   Number correct items
38514     1.0358        13.1309                  51.9933                      40
29583     0.9909        53.074                   51.264                       52

Item parameters and answers (1 = correct, 0 = incorrect):

Item   Difficulty   Discrimination   Answer 38514   Answer 29583
1      -0.41        2.2285           1              0
2      -0.0874      1.2119           1              0
3       0.0336      2.1389           1              1
4      -0.0797      1.7566           1              1
5      -0.3059      0.6366           1              1
7       0.0178      0.9428           1              1
8       0.0684      1.2415           1              1
9       0.422       0.969            0              1
10      0.2102      0.6165           1              1
11      0.4331      1.5098           1              0
12      0.7185      1.1861           1              1
13      0.5911      1.5789           1              1
14      3.989       0.4431           0              0
15      1.1378      1.1214           1              1
16      2.1176      0.6548           0              1
17     -0.3365      1.9848           1              0
18     -0.1473      1.0391           1              1
19      0.1341      1.8171           1              1
21     -0.5052      1.9939           1              1
22      0.4386      0.8056           1              1
23      0.8026      1.035            1              1
24      5.002       0.2882           0              1
25      2.0524      0.6176           0              1
26     -0.0039      2.0986           1              1
27      0.5566      0.8284           1              1
28     -0.4515      2.1566           1              0
29     -0.084       0.6707           0              0
30      0.051       2.6143           1              0
31      2.7128      0.5522           0              1
32      0.5211      0.7811           1              1
33      2.1526      0.8303           0              0
34      0.698       1.3899           1              1
35     -0.4304      1.5356           1              1
36      0.4484      1.1382           1              1
37      0.2912      1.1765           0              0
38      0.35        1.0862           1              1
40      1.5612      1.1222           0              1
41      0.2742      1.4516           1              0
42      0.5775      1.0202           1              1
43      1.1506      0.9262           1              1
44      1.2377      0.8757           1              0
45      1.181       0.8083           0              1
46      2.2684      0.6475           0              1
47      0.4698      1.0835           0              0
48      0.5287      0.9789           1              0
49      0.7649      0.7783           0              1
50      1.7425      0.6822           0              0
52      1.3249      0.8624           1              1
53      1.2181      1.08             1              0
54      1.5477      0.7811           0              1
55      0.5468      1.2549           0              1
56      0.6819      1.1438           1              1
57      1.2381      0.8504           1              1
58      2.1911      0.5307           0              1
59      2.3132      0.6762           0              1
60      1.2026      0.4363           0              1
61      0.7335      0.8598           1              1
62      2.2741      0.6425           0              1
63      4.941       0.3653           0              1
64     -0.3992      1.1334           0              1
65      0.2295      1.1576           1              0
66      0.3908      1.0264           0              1
67      2.439       0.5344           0              1
68      3.315       0.5355           0              1
70      0.8598      0.5714           0              0
71      1.4092      0.5843           0              1
72      2.921       0.5042           0              0
73      1.532       0.7587           0              1
74      1.9342      0.9473           0              0
75      0.5013      0.7655           0              0
76      0.8444      0.8226           0              1
77      0.7858      0.9971           0              0
78      0.2539      0.9606           1              1
79      6.6153      0.2841           0              0
80     -0.7708      1.3522           1              1

Correct items of each student, sorted by decreasing difficulty:

           Student 38514                        Student 29583
Position   Sorted difficulty   Item   Ranking   Sorted difficulty   Item   Ranking   Difference in ranking
1           1.3249             52     22         5.002              24     2         20
2           1.2381             57     23         4.941              63     3         20
3           1.2377             44     24         3.315              68     5         19
4           1.2181             53     25         2.7128             31     7         18
5           1.1506             43     28         2.439              67     8         20
6           1.1378             15     29         2.3132             59     9         20
7           0.8026             23     32         2.2741             62     10        22
8           0.7335             61     35         2.2684             46     11        24
9           0.7185             12     36         2.1911             58     12        24
10          0.698              34     37         2.1176             16     14        23
11          0.6819             56     38         2.0524             25     15        23
12          0.5911             13     39         1.5612             40     18        21
13          0.5775             42     40         1.5477             54     19        21
14          0.5566             27     41         1.532              73     20        21
15          0.5287             48     43         1.4092             71     21        22
16          0.5211             32     44         1.3249             52     22        22
17          0.4484             36     47         1.2381             57     23        24
18          0.4386             22     48         1.2026             60     26        22
19          0.4331             11     49         1.181              45     27        22
20          0.35               38     52         1.1506             43     28        24
21          0.2742             41     54         1.1378             15     29        25
22          0.2539             78     55         0.8444             76     31        24
23          0.2295             65     56         0.8026             23     32        24
24          0.2102             10     57         0.7649             49     34        23
25          0.1341             19     58         0.7335             61     35        23
26          0.0684             8      59         0.7185             12     36        23
27          0.051              30     60         0.698              34     37        23
28          0.0336             3      61         0.6819             56     38        23
29          0.0178             7      62         0.5911             13     39        23
30         -0.0039             26     63         0.5775             42     40        23
31         -0.0797             4      64         0.5566             27     41        23
32         -0.0874             2      66         0.5468             55     42        24
33         -0.1473             18     67         0.5211             32     44        23
34         -0.3059             5      68         0.4484             36     47        21
35         -0.3365             17     69         0.4386             22     48        21
36         -0.41               1      71         0.422              9      50        21
37         -0.4304             35     72         0.3908             66     51        21
38         -0.4515             28     73         0.35               38     52        21
39         -0.5052             21     74         0.2539             78     55        19
40         -0.7708             80     75         0.2102             10     57        18
41                                               0.1341             19     58        21
42                                               0.0684             8      59        20
43                                               0.0336             3      61        18
44                                               0.0178             7      62        17
45                                              -0.0039             26     63        16
46                                              -0.0797             4      64        15
47                                              -0.1473             18     67        12
48                                              -0.3059             5      68        11
49                                              -0.3992             64     70        9
50                                              -0.4304             35     72        7
51                                              -0.5052             21     74        5
52                                              -0.7708             80     75        4


[Figure: four panels with ability on the vertical axis, titled "Number of correct items: 50", "Number of correct items: 60", "Number of correct items: 70" and "Number of correct items: 74". Group sizes annotated in the panels: 50 correct items: n=231, n=306, n=289, n=22, n=7; 60 correct items: n=205, n=131, n=16, n=6; 70 correct items: n=127, n=99, n=34, n=3; 74 correct items: n=32.]

Number    Students   Dominating   %     Mean        Mean items   Max items    Mean ability   Max ability
correct              students           dominated   difference   difference   difference     difference
items                                   students
10        911        893          98%   722.7       1.97         9            0.11           0.67
20        2682       2632         98%   1010.2      1.49         10           0.14           0.70
30        997        982          98%   343.0       1.73         10           0.11           0.57
40        584        558          96%   199.3       1.66         12           0.10           0.58
50        385        367          95%   154.2       1.56         8            0.12           0.73
60        275        268          97%   96.0        1.30         5            0.14           0.61
70        161        149          93%   61.5        0.54         4            0.16           0.85
74        51         47           92%   20.9        0            0            0.15           0.48

Thanks