joint maximum likelihood estimates

41
Joint Maximum Likelihood Joint Maximum Likelihood Estimates on Items-Examinees Estimates on Items-Examinees Using Prox Method: A Study on Using Prox Method: A Study on the Grammar Subtest of TOEFL the Grammar Subtest of TOEFL Widiatmoko Widiatmoko E.: E.: [email protected] [email protected] W.: W.: http://mokogeong.multiply.com http://mokogeong.multiply.com Jakarta, Indonesia Jakarta, Indonesia

Upload: widiatmoko

Post on 12-Nov-2014

974 views

Category:

Documents


1 download

DESCRIPTION

Item response theory (IRT) as accurate solution to weaknesses of classical test theory (CTT). IRT provides much more advantages. Advantages: unidimension, local independence, and parameter invariance. Requirements needed in test construction. TOEFL known as test meeting requirements. It however concerns IRT. In this case, research dealt with grammar subtest of TOEFL. Research aimed to estimate examinee-item parameters. As 1PL model the intended model, Prox method to estimate parameters jointly, named joint maximum likelihood estimates. Method requires dichotomous data with 30 persons as examinee measure θ parameter and 20 items as b parameter. As a result, values of θ and b prove ranges as model intended in IRT named as ICC.

TRANSCRIPT

Page 1: Joint Maximum Likelihood Estimates

Joint Maximum Likelihood Estimates Joint Maximum Likelihood Estimates on Items-Examinees Using Prox on Items-Examinees Using Prox

Method: A Study on the Grammar Method: A Study on the Grammar Subtest of TOEFLSubtest of TOEFL

WidiatmokoWidiatmokoE.: E.: [email protected]@gmail.com

W.: W.: http://mokogeong.multiply.comhttp://mokogeong.multiply.comJakarta, IndonesiaJakarta, Indonesia

Page 2: Joint Maximum Likelihood Estimates

AbstractAbstract

Item response theory (IRT) as accurate solution to Item response theory (IRT) as accurate solution to weaknesses of classical test theory (CTT). weaknesses of classical test theory (CTT).

IRT provides much more advantages. IRT provides much more advantages. Advantages: unidimension, local independence, and parameter Advantages: unidimension, local independence, and parameter

invariance. invariance. Requirements needed in test construction. Requirements needed in test construction. TOEFL known as test meeting requirements. It however TOEFL known as test meeting requirements. It however

concerns IRT. concerns IRT. In this case, research dealt with grammar subtest of TOEFL. In this case, research dealt with grammar subtest of TOEFL. Research aimed to estimate examinee-item parameters. As 1PL Research aimed to estimate examinee-item parameters. As 1PL

model the intended model, Prox method to estimate model the intended model, Prox method to estimate parameters jointly, named joint maximum likelihood parameters jointly, named joint maximum likelihood estimates. estimates.

Method requires dichotomous data with 30 persons as Method requires dichotomous data with 30 persons as examinee measure examinee measure θ θ parameter and 20 items as parameter and 20 items as bb parameter. parameter.

As a result, values of As a result, values of θ θ and and bb prove ranges as model prove ranges as model intended in IRT named as ICC.intended in IRT named as ICC.

Page 3: Joint Maximum Likelihood Estimates

KeywordsKeywords

IRT, 1PL model, parameter estimates, Prox method.IRT, 1PL model, parameter estimates, Prox method.

Page 4: Joint Maximum Likelihood Estimates

I. IntroductionI. IntroductionBackground of StudyBackground of Study

Educational development in global era put Educational development in global era put in seriously months or even years. in seriously months or even years.

In micro scope, concerned with learning In micro scope, concerned with learning input, process, and output evaluation. input, process, and output evaluation.

Output evaluation as a determinant in Output evaluation as a determinant in success and failure of a curriculum. The success and failure of a curriculum. The evaluation takes the form of examination. evaluation takes the form of examination.

Some experts at evaluation use different Some experts at evaluation use different determinant, known program evaluation. determinant, known program evaluation.

The former and the latter perspectives can The former and the latter perspectives can be quantitatively and qualitatively be quantitatively and qualitatively approached. They need measurement.approached. They need measurement.

Page 5: Joint Maximum Likelihood Estimates

Measurement launched in 1960-s – mainly Measurement launched in 1960-s – mainly campaigned E.L. Thorndike’s Mental Measurement campaigned E.L. Thorndike’s Mental Measurement Theory (1914) – aims to measure difficult and Theory (1914) – aims to measure difficult and unobserved mental constructs (Crocker and unobserved mental constructs (Crocker and Algina, 1986: 5). Algina, 1986: 5).

One of them is learning outcome as a result of a One of them is learning outcome as a result of a measure from the instrument to examinees dealing measure from the instrument to examinees dealing with the trait, in this case, achievement. with the trait, in this case, achievement.

Attitude, motivation, or interest are mental Attitude, motivation, or interest are mental constructs, never quantitatively measured in constructs, never quantitatively measured in some classrooms. some classrooms.

The observation tells evaluation on attitude, The observation tells evaluation on attitude, motivation, and interest does not involve motivation, and interest does not involve measurement. May be due to a teacher’s measurement. May be due to a teacher’s unawareness and inability in constructing and unawareness and inability in constructing and analyzing instrument’s items. analyzing instrument’s items.

Page 6: Joint Maximum Likelihood Estimates

Measurement involves composite scores in analysis. Measurement involves composite scores in analysis. Scores function as a tool of decision making. In CTT, Scores function as a tool of decision making. In CTT, a score depends upon examinee measure and examinee a score depends upon examinee measure and examinee measure depends upon item difficulty. measure depends upon item difficulty.

Examinee of remarkable measure responds correct items Examinee of remarkable measure responds correct items highly and that of unfavourable responds correct highly and that of unfavourable responds correct items lowly. Items considered easy when they are items lowly. Items considered easy when they are responded by a large number of examinees and vice responded by a large number of examinees and vice versa (Dali, 1992: 48). versa (Dali, 1992: 48).

Correlation between item and examinee is implemented Correlation between item and examinee is implemented in terms of correlation between probability of in terms of correlation between probability of correct response and examinee’s success. correct response and examinee’s success.

Item and examinee characteristics cannot be known Item and examinee characteristics cannot be known precisely. Items produced cannot be applied in precisely. Items produced cannot be applied in different situation to different examinees and different situation to different examinees and certain examinee measure cannot be known by seeing on certain examinee measure cannot be known by seeing on composite scores obtained. composite scores obtained.

Page 7: Joint Maximum Likelihood Estimates

IRT overcomes weaknesses, i.e., when an item pool IRT overcomes weaknesses, i.e., when an item pool has item difficulties easy or difficult, has item difficulties easy or difficult, impossible to measure true examinee measure and impossible to measure true examinee measure and impossible to know different success of examinees. impossible to know different success of examinees.

Value of item difficulty is invariant to examinee Value of item difficulty is invariant to examinee measure and value of examinee measure is invariant measure and value of examinee measure is invariant to item difficulty (Dali, 1992: 161). Item’s to item difficulty (Dali, 1992: 161). Item’s characteristic and examinee’s trait are connected characteristic and examinee’s trait are connected by model forming as a function expressed by some by model forming as a function expressed by some parameters. parameters.

They are called as parameters of item’s They are called as parameters of item’s characteristic and examinee’s trait. characteristic and examinee’s trait.

When examinees do items, they produce composite When examinees do items, they produce composite scores interpreted as examinee measures scores interpreted as examinee measures θ. θ.

Empirically, examinee measure be constant when he Empirically, examinee measure be constant when he does item of same characteristic. Therefore, does item of same characteristic. Therefore, estimate of parameter of examinee trait is done by estimate of parameter of examinee trait is done by knowing item’s characteristic. knowing item’s characteristic.

Page 8: Joint Maximum Likelihood Estimates

Parameter of item’s characteristic and of Parameter of item’s characteristic and of examinee’s trait in IRT can be estimated examinee’s trait in IRT can be estimated marginally or jointly. marginally or jointly.

Examinee of certain measure Examinee of certain measure θθ can be known as he can be known as he does item of certain difficulty does item of certain difficulty bjbj. .

Item of certain difficulty Item of certain difficulty bj bj can be known as it can be known as it is done by examinee of certain measure is done by examinee of certain measure θθ..

Page 9: Joint Maximum Likelihood Estimates

There are many estimates of parameters. Procedures There are many estimates of parameters. Procedures or methods of estimates are for IRT models. or methods of estimates are for IRT models.

One model is 1PL model or named Rasch model. One model is 1PL model or named Rasch model. Model only contains parameter of examinee measure Model only contains parameter of examinee measure

θ θ and parameter of item difficulty and parameter of item difficulty bjbj. . Model is initially known before estimate occurs. Model is initially known before estimate occurs.

Estimate using Prox method is employed for 1PL Estimate using Prox method is employed for 1PL model. model.

Page 10: Joint Maximum Likelihood Estimates

There are some questions of research. There are some questions of research. Can Prox method be applied to estimate parameters of Can Prox method be applied to estimate parameters of

items-examinees in 2PL model? items-examinees in 2PL model? Can Prox method be applied to estimate parameters of Can Prox method be applied to estimate parameters of

items-examinees in 3PL model? items-examinees in 3PL model? Can Prox method be applied to estimate parameters of Can Prox method be applied to estimate parameters of

items-examinees in 4PL model? items-examinees in 4PL model? Can Prox method be applied to estimate parameters of Can Prox method be applied to estimate parameters of

items-examinees in polytomous scores? items-examinees in polytomous scores? Does Prox method produce item characteristic curves Does Prox method produce item characteristic curves

monotonically? monotonically? Does Prox method prove that result of estimate is the Does Prox method prove that result of estimate is the

same as ICC when number of items and examinees same as ICC when number of items and examinees increased? increased?

Is it true Prox method is as appropriate method to Is it true Prox method is as appropriate method to estimate parameters of joint items-examinees in 1PL estimate parameters of joint items-examinees in 1PL model? model?

Page 11: Joint Maximum Likelihood Estimates

It is limited to only one question of research, It is limited to only one question of research, formulated as: Is it true Prox method is as formulated as: Is it true Prox method is as appropriate method to estimate parameters of joint appropriate method to estimate parameters of joint items-examinees in 1PL model? items-examinees in 1PL model?

Page 12: Joint Maximum Likelihood Estimates

II. Theoretical ReviewII. Theoretical ReviewItem-Examinee Parameters in 1PL ModelItem-Examinee Parameters in 1PL Model

In CTT and IRT, central analysis: item and In CTT and IRT, central analysis: item and examinee. examinee.

IRT’s advantages: local independence, examinee’s IRT’s advantages: local independence, examinee’s invariance on an item and vice versa, and invariance on an item and vice versa, and unidimension of an item (Hulin, unidimension of an item (Hulin, et al., et al., 1983: 40-1983: 40-43). 43).

CTT does not provide advantages. Subpopulation of CTT does not provide advantages. Subpopulation of items cannot be separated from subpopulation of items cannot be separated from subpopulation of examinees. examinees.

When a subpopulation of homogenous items undertaken When a subpopulation of homogenous items undertaken by subpopulation of different examinees, by subpopulation of different examinees, characteristics of items change. characteristics of items change.

Item difficulty and item discriminating power Item difficulty and item discriminating power change due to different examinee measure and vice change due to different examinee measure and vice versa (Dali, 1992: 4-5). versa (Dali, 1992: 4-5).

Examinee-item depends on each other, examinee Examinee-item depends on each other, examinee measure not invariant to item difficulty, item is measure not invariant to item difficulty, item is multidimensional. multidimensional.

Page 13: Joint Maximum Likelihood Estimates

Advantages of IRT in the following. Firstly, local as a Advantages of IRT in the following. Firstly, local as a point in continuum of examinee’s trait parameter point in continuum of examinee’s trait parameter θ, θ, can can be interval containing homogenous subpopulation of be interval containing homogenous subpopulation of examinees. Independence as examinees in subpopulation be examinees. Independence as examinees in subpopulation be independent upon items in subpopulation meaning composite independent upon items in subpopulation meaning composite scores of items responded by homogeneous subpopulation of scores of items responded by homogeneous subpopulation of examinees be independent (Dali, 1992: 170-171). Local examinees be independent (Dali, 1992: 170-171). Local independence also as responses conditionally independent independence also as responses conditionally independent in subpopulation where examinee’s latent trait F1, ..., F in subpopulation where examinee’s latent trait F1, ..., F has constant value f1, ..., f. That two items or more has constant value f1, ..., f. That two items or more uncorrelated in homogenous subpopulation with particular uncorrelated in homogenous subpopulation with particular level of or fixed latent traits level of or fixed latent traits θ θ (Hulin (Hulin et al.et al., 1983: , 1983: 43; McDonald, 1999: 255). [In a heterogeneous population 43; McDonald, 1999: 255). [In a heterogeneous population where where θθ varies, item scores should be correlated]. Lord varies, item scores should be correlated]. Lord and Novick states local independence means within any and Novick states local independence means within any group of examinees all characterized by same values group of examinees all characterized by same values θ1, θ1, θ2, ..., θkθ2, ..., θk, (conditional) distributions of item scores , (conditional) distributions of item scores are all independent each other (Lord and Novick, 1968: are all independent each other (Lord and Novick, 1968: 361).361).

Page 14: Joint Maximum Likelihood Estimates

Secondly, parameter invariance as function of single Secondly, parameter invariance as function of single measure measure θθ or item characteristic or item characteristic bb which does not which does not change across subpopulation whenever subpopulation change across subpopulation whenever subpopulation changes. Parameter invariance also as examinee’s changes. Parameter invariance also as examinee’s trait which does not change whenever item chosen trait which does not change whenever item chosen changes (Hulin changes (Hulin et al.et al., 1983: 44; Dali, 1992: 173). , 1983: 44; Dali, 1992: 173).

Thirdly, unidimension as item measuring one trait or Thirdly, unidimension as item measuring one trait or characteristic over examinees (Dali, 1992: 164). characteristic over examinees (Dali, 1992: 164). Also means probability of item response as function Also means probability of item response as function of single latent characteristic of examinee of single latent characteristic of examinee θθ (Hulin (Hulin et al.et al., 1983: 40). Since every characteristic , 1983: 40). Since every characteristic determined by one measure, one type of measure as determined by one measure, one type of measure as requirement to measure one dimension of examinee’s requirement to measure one dimension of examinee’s latent trait over subpopulation. Items meeting latent trait over subpopulation. Items meeting requirement of unidimension found in certain requirement of unidimension found in certain battery, mostly, in homogenous test battery (Lord battery, mostly, in homogenous test battery (Lord and Novick, 1968: 381). and Novick, 1968: 381).

Page 15: Joint Maximum Likelihood Estimates

In calibrating test instrument, it concerns how In calibrating test instrument, it concerns how model decided instead of unidimension, parameter model decided instead of unidimension, parameter invariance, and local independence requirements. invariance, and local independence requirements. Models intended include Guttman perfect scale Models intended include Guttman perfect scale model, latent distance model, linear model, normal model, latent distance model, linear model, normal ogive model, and logistic model. ogive model, and logistic model.

Page 16: Joint Maximum Likelihood Estimates

Logistic Logistic ModelModel

Logistic model is normal ogive model like, but, Logistic model is normal ogive model like, but, both not the same. Logistic model does not both not the same. Logistic model does not require intricate mathematic calculations, require intricate mathematic calculations, whereas, normal ogive model requires complicated whereas, normal ogive model requires complicated ones. ones.

Page 17: Joint Maximum Likelihood Estimates

Logistic model also undertakes models of 1, 2, 3, Logistic model also undertakes models of 1, 2, 3, and 4 parameters. 1PL model only employs item and 4 parameters. 1PL model only employs item difficulty parameter difficulty parameter bjbj. 2PL model not only . 2PL model not only employs parameter of item difficulty employs parameter of item difficulty bj, bj, but also but also parameter of item discriminating power parameter of item discriminating power aa. 3PL . 3PL model not only employs parameter of item model not only employs parameter of item difficulty difficulty bj bj and item discriminating power and item discriminating power a,a, but but also parameter of examinee of low measure also parameter of examinee of low measure responding items correctly or guessing correct responding items correctly or guessing correct c. c. 4PL model not only employs parameter of item 4PL model not only employs parameter of item difficulty difficulty bj, a, bj, a, and and c, c, but also parameter of but also parameter of examinee of high measure responding items examinee of high measure responding items incorrectly incorrectly γγ. The latter model stated by McDonald . The latter model stated by McDonald (1967) and Barton and Lord (1981) quoted by (1967) and Barton and Lord (1981) quoted by Hambleton considered difficult model to prove Hambleton considered difficult model to prove (Hambleton, 1989: 157).(Hambleton, 1989: 157).

Page 18: Joint Maximum Likelihood Estimates

The simplest: 1PL model or Rasch model. The model The simplest: 1PL model or Rasch model. The model described as ICC as examinees’ responses to described as ICC as examinees’ responses to items. ICC is model of probability of correct items. ICC is model of probability of correct response on item response on item jj, given examinee’s parameter , given examinee’s parameter θθ symbolised as function symbolised as function Pj(θ).Pj(θ). ICC only depends ICC only depends upon distance between upon distance between θθ and and bj bj meaning item meaning item parameter parameter bj bj defined relative to examinee’s defined relative to examinee’s parameterparameter θ θ (Andersen, 1983: 197-198). Written in (Andersen, 1983: 197-198). Written in mathematical form, 1PL model as follows.mathematical form, 1PL model as follows.

(j = 1, 2, ..., n); D = a constant weighing (j = 1, 2, ..., n); D = a constant weighing 1,7; 1,7; e e exponential number (Hambleton, 1989: 154).exponential number (Hambleton, 1989: 154).

)(

)(

1 bjD

bjD

e

ePj

)(

)(

1 bjD

bjD

e

ePj

Page 19: Joint Maximum Likelihood Estimates

The 1PL model empirically model like ogive The 1PL model empirically model like ogive along with its low and high asymptotes. Low along with its low and high asymptotes. Low asymptote approximates value -∞. High asymptote asymptote approximates value -∞. High asymptote approximates value +∞. Theoretically, two approximates value +∞. Theoretically, two points describe examinee measure ranging from -points describe examinee measure ranging from -∞ to +∞ (Hambleton, 1989: 161; Dali, 1992: ∞ to +∞ (Hambleton, 1989: 161; Dali, 1992: 224). Practically, ranges from -3 to +3 (Hulin, 224). Practically, ranges from -3 to +3 (Hulin, et al., et al., 1983: 101) or from -4 to +4 (Dali, 1983: 101) or from -4 to +4 (Dali, 1992: 224). [Note, range (-3, +3) or (-4, +4) 1992: 224). [Note, range (-3, +3) or (-4, +4) rather than range (-∞, +∞) executed by rather than range (-∞, +∞) executed by transforming values of some normal probability transforming values of some normal probability distribution into values of standard normal distribution into values of standard normal probability distribution. Done by deciding probability distribution. Done by deciding value of mean parameter value of mean parameter μμ = 0 and value of = 0 and value of standard deviation parameter standard deviation parameter σσ = 1]. If ICC = 1]. If ICC described, as follows.described, as follows.

Page 20: Joint Maximum Likelihood Estimates

ICC can be model in study of item-ICC can be model in study of item-examinee parameters. One used as a model examinee parameters. One used as a model of parameter estimates. Therefore, of parameter estimates. Therefore, estimate is determined by model of item estimate is determined by model of item characteristics intended. Estimates of characteristics intended. Estimates of parameter done jointly or marginally. parameter done jointly or marginally. Estimates known as maximum likelihood Estimates known as maximum likelihood estimates. Among them, the simplest estimates. Among them, the simplest estimate of parameter conforms in 1PL estimate of parameter conforms in 1PL model. So far, estimate executes Prox model. So far, estimate executes Prox method.method.

Page 21: Joint Maximum Likelihood Estimates

Parameter Estimates Using Prox Method Parameter Estimates Using Prox Method

Many estimate methods in IRT. One needs Many estimate methods in IRT. One needs dichotomous data of items. Other polytomous ones dichotomous data of items. Other polytomous ones in analysis. Parameter estimates in IRT applied in analysis. Parameter estimates in IRT applied to 1PL, 2PL, 3PL, or 4PL models. to 1PL, 2PL, 3PL, or 4PL models.

Estimate in 4PL model more difficult than one in Estimate in 4PL model more difficult than one in 3PL model, 3PL model more difficult than 2PL 3PL model, 3PL model more difficult than 2PL model, 2PL model more difficult than 1PL model. model, 2PL model more difficult than 1PL model.

Estimate in 1PL model then called as estimate Estimate in 1PL model then called as estimate using Prox method requiring dichotomous data of using Prox method requiring dichotomous data of items. items.

Since estimate for 1PL model, thence it Since estimate for 1PL model, thence it estimates two parameters, i.e., parameters of estimates two parameters, i.e., parameters of examinee measure and item difficulty. Estimate examinee measure and item difficulty. Estimate with Prox method, so far, known as joint maximum with Prox method, so far, known as joint maximum likelihood estimate. That when estimate brings likelihood estimate. That when estimate brings about, all values of parameters of examinees’ about, all values of parameters of examinees’ traits and items’ characteristics not known traits and items’ characteristics not known (Dali, 1992: 264). (Dali, 1992: 264).

Page 22: Joint Maximum Likelihood Estimates

Prox method arranges initial Prox method arranges initial bjbj for for parameter of item’s characteristic and parameter of item’s characteristic and initial initial θ θ for parameter of examinee’s for parameter of examinee’s trait. Initial value trait. Initial value bj bj and and θθ based on based on logit incorrect value and logit correct logit incorrect value and logit correct value. Usually, initial value expressed as value. Usually, initial value expressed as deviation from mean of logit so both deviation from mean of logit so both values will be values will be bA bA andand θA θA. Referring to . Referring to variance of logit, Prox method forms variance of logit, Prox method forms expansion factors for two parameters, expansion factors for two parameters, i.e., i.e., F(bj) F(bj) and and F(θ)F(θ). By using initial . By using initial value value bjbj, , θ, F(bj), θ, F(bj), and and F(θ)F(θ), parameter of , parameter of item’s characteristic and parameter of item’s characteristic and parameter of examinee’s trait estimated. examinee’s trait estimated.

Page 23: Joint Maximum Likelihood Estimates

First, dichotomous response correct 1 incorrect First, dichotomous response correct 1 incorrect 0 matrix edited. Done in such a way that every 0 matrix edited. Done in such a way that every examinee or item all responses correct or all examinee or item all responses correct or all responses incorrect eliminated. That examinee responses incorrect eliminated. That examinee responding all items correctly and all items responding all items correctly and all items incorrectly not put in matrix. Also means item incorrectly not put in matrix. Also means item responded by all examinees correctly and responded by all examinees correctly and responded by all examinees incorrectly not put responded by all examinees incorrectly not put in matrix. Finally, examinees’ scores put in in matrix. Finally, examinees’ scores put in order vertically from smallest to largest and order vertically from smallest to largest and items’ proportions of correct responses put in items’ proportions of correct responses put in order horizontally from largest to smallest. In order horizontally from largest to smallest. In other words, examinees ordered from lowest other words, examinees ordered from lowest measure to highest and items ordered from measure to highest and items ordered from easiest to most difficult. easiest to most difficult.

Page 24: Joint Maximum Likelihood Estimates

Second, initial Second, initial bj bj computed. Done using logit computed. Done using logit incorrect value for each number correct. Logit incorrect value for each number correct. Logit incorrect value for each item computed as natural incorrect value for each item computed as natural logarithm of ratio proportion incorrect to logarithm of ratio proportion incorrect to proportion correct. This as reference to calibrate proportion correct. This as reference to calibrate initial initial bjbj. Then, examinees do N items so mean of . Then, examinees do N items so mean of logit incorrect values among items computed. logit incorrect values among items computed. Considering items sample, variance of logit Considering items sample, variance of logit incorrect value obtained. To compute variance of incorrect value obtained. To compute variance of logit incorrect value, necessary to compute sum of logit incorrect value, necessary to compute sum of logit incorrect value squared minus N items times logit incorrect value squared minus N items times mean adjustment squared, all divided by number of mean adjustment squared, all divided by number of items minus one. Variance required for expansion items minus one. Variance required for expansion factor computation. Initial factor computation. Initial bjbj meant to decide meant to decide deviation from mean of logit incorrect value. Done deviation from mean of logit incorrect value. Done for all items. for all items.

Page 25: Joint Maximum Likelihood Estimates

Third, initial Third, initial θθ calculated. Unlike items, calculated. Unlike items, calculation done using logit correct value instead calculation done using logit correct value instead of logit incorrect value. Logit correct value for of logit incorrect value. Logit correct value for each examinee computed as natural logarithm of each examinee computed as natural logarithm of ratio of proportion correct to proportion ratio of proportion correct to proportion incorrect. This as reference to compute initial incorrect. This as reference to compute initial θθ. . Then, items done by M examinees so mean of logit Then, items done by M examinees so mean of logit correct values among examinees computed. correct values among examinees computed. Considering examinees sample, variance of logit Considering examinees sample, variance of logit correct value obtained. To compute variance of correct value obtained. To compute variance of logit correct value, necessary to compute sum of logit correct value, necessary to compute sum of logit correct value squared minus M examinees logit correct value squared minus M examinees times mean adjustment squared, all divided by times mean adjustment squared, all divided by number of examinees minus one. Variance is number of examinees minus one. Variance is required for expansion factor computation. Initial required for expansion factor computation. Initial θθ meant to decide deviation from mean of logit meant to decide deviation from mean of logit correct value. Done for all examinees. correct value. Done for all examinees.

Page 26: Joint Maximum Likelihood Estimates

Fourth, expansion factor for items Fourth, expansion factor for items F(bj) F(bj) and value of and value of bjbj calculated. calculated. Prox method does not do estimate Prox method does not do estimate cycle repeatedly. Otherwise, it cycle repeatedly. Otherwise, it executes statistic estimate on executes statistic estimate on sample variance of logit incorrect sample variance of logit incorrect value and logit correct value. value and logit correct value. Factor for estimate is expansion Factor for estimate is expansion factor. Expansion factor then factor. Expansion factor then multiplied by initial multiplied by initial bj bj to obtain to obtain final estimate. Done for all items. final estimate. Done for all items.

Page 27: Joint Maximum Likelihood Estimates

Fifth, expansion factor for examinees Fifth, expansion factor for examinees F(θ) F(θ) and value of and value of θθ calculated. Prox calculated. Prox method does not do estimate cycle method does not do estimate cycle repeatedly either. Otherwise, it repeatedly either. Otherwise, it executes statistic estimate on sample executes statistic estimate on sample variance of logit correct value and variance of logit correct value and logit incorrect value. Factor for logit incorrect value. Factor for estimate also expansion factor. estimate also expansion factor. Expansion factor then multiplied by Expansion factor then multiplied by initial initial θ θ to obtain final estimate. Done to obtain final estimate. Done for all examinees. for all examinees.

Page 28: Joint Maximum Likelihood Estimates

MethodologyMethodology

This study is survey undertaking examinees-items This study is survey undertaking examinees-items of TOEFL’s grammar subtest. Subpopulation taken by of TOEFL’s grammar subtest. Subpopulation taken by using random purposive sampling technique using random purposive sampling technique involving some steps. First, population involving some steps. First, population determined, i.e. population of examinees in determined, i.e. population of examinees in Jakarta. Second, target population determined, Jakarta. Second, target population determined, i.e. examinees doing TOEFLi.e. examinees doing TOEFL in 2004-2005. [Note, in 2004-2005. [Note, subpopulation as term used to substitute for subpopulation as term used to substitute for sample]. Due to cost and time, it takes 30 persons sample]. Due to cost and time, it takes 30 persons as subpopulation of examinees in a simple random as subpopulation of examinees in a simple random way. Third, purposively determined one subtest of way. Third, purposively determined one subtest of grammar of TOEFL since population mostly concerned grammar of TOEFL since population mostly concerned with subtest. Forth, from subtest, in a simple with subtest. Forth, from subtest, in a simple random way, it takes 20 items as subpopulation of random way, it takes 20 items as subpopulation of items. Therefore, research analysis units 30 items. Therefore, research analysis units 30 examinees and 20 items of subtest of grammar of examinees and 20 items of subtest of grammar of TOEFLTOEFL..

Page 29: Joint Maximum Likelihood Estimates

IV. AnalysisIV. Analysis

Prox method employs steps to Prox method employs steps to estimate item difficulty estimate item difficulty bj bj and and examinee measure examinee measure θθ. Dichotomous . Dichotomous responses of correct 1 incorrect responses of correct 1 incorrect 0 matrix edited. Done in such a 0 matrix edited. Done in such a way that every examinee or item way that every examinee or item for which all responses correct for which all responses correct or all responses incorrect or all responses incorrect eliminated. See on Table 1.eliminated. See on Table 1.

Page 30: Joint Maximum Likelihood Estimates

The examinees’ scores put in order The examinees’ scores put in order vertically from the smallest to the vertically from the smallest to the largest and the items’ proportions largest and the items’ proportions of correct responses put in order of correct responses put in order horizontally from the largest to horizontally from the largest to the smallest. In other words, the the smallest. In other words, the examinees ordered from the lowest examinees ordered from the lowest measure to the highest one and the measure to the highest one and the items ordered from the easiest to items ordered from the easiest to the most difficult. See on Table the most difficult. See on Table 2.2.

Page 31: Joint Maximum Likelihood Estimates

Calibration of initial Calibration of initial bj bj computed. Uses logit computed. Uses logit incorrect value (incorrect value (LGi) LGi) as reference to calibrate as reference to calibrate initial value of parameters of item difficulty initial value of parameters of item difficulty bjbj. .

; where ; where Q(θ) = Q(θ) = probability of incorrect response, probability of incorrect response, P(θ)P(θ) = probability of correct response. = probability of correct response.

Then, examinees do N items so mean of logit Then, examinees do N items so mean of logit incorrect values among items computed. Variance incorrect values among items computed. Variance of logit incorrect valueof logit incorrect value (S2LG) (S2LG) is obtained. is obtained.

, where N = a number of items, μLG = mean of , where N = a number of items, μLG = mean of logit incorrect value.logit incorrect value.

)(

)(ln

P

QLGi

N

j

LGLG NLGiN

S1

222 )(1

1

Page 32: Joint Maximum Likelihood Estimates

Initial Initial bjbj meant to decide meant to decide deviation from mean of logit deviation from mean of logit incorrect value. Done for all incorrect value. Done for all items. See on Table 3.items. See on Table 3.

Page 33: Joint Maximum Likelihood Estimates

Then, initial examinee measure Then, initial examinee measure θθ calculated. Done calculated. Done by using logit correct value by using logit correct value (LSj) (LSj) as a reference. as a reference.

Items responded by M examinees necessary to Items responded by M examinees necessary to compute mean of logit correct values among compute mean of logit correct values among examinees. Then, variance of logit correct value examinees. Then, variance of logit correct value obtained. obtained.

, where M = a number of examinees, μLS = mean of , where M = a number of examinees, μLS = mean of logit correct value.logit correct value.

)(

)(ln

Q

PLSj

M

i

LSLS MLSjM

S1

222 )(1

1

Page 34: Joint Maximum Likelihood Estimates

Value of parameter of examinee Value of parameter of examinee measure measure θθ determined as determined as deviation from mean of logit deviation from mean of logit correct. Done for all correct. Done for all examinees. See on Table 4.examinees. See on Table 4.

Page 35: Joint Maximum Likelihood Estimates

Then, expansion factor for item Then, expansion factor for item characteristic characteristic F(bj) F(bj) and value and value of of bjbj computed. computed.

35,81

89,21

)(22

2

LGLS

LS

j SS

S

bF

Page 36: Joint Maximum Likelihood Estimates

Estimate of parameter of Estimate of parameter of bj bj obtained by multiplying initial obtained by multiplying initial value of parameter of item value of parameter of item difficulty by value of difficulty by value of expansion factor for expansion factor for bj bj gained. gained. Done for all items. See on Done for all items. See on Table 5.Table 5.

Page 37: Joint Maximum Likelihood Estimates

Expansion factor of examinee’s Expansion factor of examinee’s trait trait F(θ) F(θ) and value of and value of θθ computed. computed.

35,81

89,21

)(22

2

LGLS

LG

SS

S

F

Page 38: Joint Maximum Likelihood Estimates

Finally, estimate of parameter of Finally, estimate of parameter of examinee measure examinee measure θ θ obtained by obtained by multiplying initial value of examinee multiplying initial value of examinee measure by value of expansion factor measure by value of expansion factor for examinee measure for examinee measure θθ gained. Done gained. Done for all examinees. See on Table 6.for all examinees. See on Table 6.

From values of estimates of From values of estimates of θ θ and and b, b, proved that ICC conforms to 1PL proved that ICC conforms to 1PL model.model.

Page 39: Joint Maximum Likelihood Estimates

V. ClosingV. Closing

From data analysis, concluded that From data analysis, concluded that examinee measure examinee measure θθ as examinee’s trait as examinee’s trait and item difficulty and item difficulty bjbj as item’s as item’s characteristic estimated jointly by characteristic estimated jointly by using Prox method. Estimate using Prox using Prox method. Estimate using Prox method factually provides accurate method factually provides accurate result. It so happens that estimate result. It so happens that estimate forms ICC of 1PL model. Therefore, forms ICC of 1PL model. Therefore, joint estimate of item-examinee joint estimate of item-examinee parameters as a proof of intended model parameters as a proof of intended model accuracy, i.e. examinee measure accuracy, i.e. examinee measure θθ as as similar as item difficulty similar as item difficulty bj bj by by condition of condition of Pj(θ)Pj(θ) = 0,5. = 0,5.

Page 40: Joint Maximum Likelihood Estimates

Suggested that all people concerned Suggested that all people concerned with measurement, evaluation, test, with measurement, evaluation, test, and assessment pay much attention to and assessment pay much attention to examinee’s trait and item’s examinee’s trait and item’s characteristic. However, it deals characteristic. However, it deals with decision. Decision on examinees with decision. Decision on examinees includes pass-fail, accepted-includes pass-fail, accepted-rejected, and so forth. Whereas, rejected, and so forth. Whereas, decision on the items includes good-decision on the items includes good-bad items, valid-not valid items, and bad items, valid-not valid items, and so forth. so forth.

Page 41: Joint Maximum Likelihood Estimates

Research implication in line Research implication in line with recommendation of with recommendation of requirement of test requirement of test construction consisting of good construction consisting of good items kept in item banking. items kept in item banking. Item banking tells about item Item banking tells about item and examinee identity. and examinee identity.