ICCS NRC Workshop (February 2010): Plausible Values
DESCRIPTION
ICCS workshop transcript
Introduction to plausible values
National Research Coordinators Meeting Madrid, February 2010
Content of presentation
• Rationale for scaling
• Rasch model and possible ability estimates
• Shortcomings of point estimates
• Drawing plausible values
• Computation of measurement error
Rationale for IRT scaling of data
• Summarising data instead of dealing with many single items
• Raw scores or percent-correct values are sample-dependent
• Makes equating possible and can deal with rotated test forms
The ‘Rasch model’
• Models the probability of responding correctly to an item as

  P(X_ni = 1) = exp(θ_n − δ_i) / (1 + exp(θ_n − δ_i))

• Likewise, the probability of NOT responding correctly is modelled as

  P(X_ni = 0) = 1 / (1 + exp(θ_n − δ_i))

where θ_n is the ability of person n and δ_i is the difficulty of item i.
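The Rasch probabilities above can be sketched in a few lines of Python (a minimal illustration; θ and δ are on the logit scale):

```python
import math

def rasch_p_correct(theta, delta):
    """Rasch model probability of a correct response:
    P(X = 1) = exp(theta - delta) / (1 + exp(theta - delta))."""
    z = math.exp(theta - delta)
    return z / (1.0 + z)

# A person whose ability equals the item difficulty answers
# correctly with probability 0.5.
print(rasch_p_correct(0.0, 0.0))  # 0.5
```

Note that P(X = 1) + P(X = 0) = 1 by construction, so only the probability of a correct response needs to be coded.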
IRT curves
[Figure: item characteristic curves — probability of a correct response (0 to 1) plotted against ability (−4 to +4 logits)]
How might we impute a reasonable proficiency value?
• Choose the proficiency that makes the score most likely
  – Maximum Likelihood Estimate
  – Weighted Likelihood Estimate
• Choose the most likely proficiency for the score
  – empirical Bayes
• Choose a selection of likely proficiencies for the score
  – Multiple imputations (plausible values)
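The first option, the Maximum Likelihood Estimate, can be illustrated with a small grid search (a sketch only: six equally difficult items and the grid range are illustrative choices, not ICCS values):

```python
import math

def pattern_log_lik(theta, responses, difficulties):
    """Rasch log-likelihood of a dichotomous response pattern at ability theta."""
    ll = 0.0
    for x, d in zip(responses, difficulties):
        p = 1.0 / (1.0 + math.exp(-(theta - d)))
        ll += math.log(p) if x == 1 else math.log(1.0 - p)
    return ll

def mle_ability(responses, difficulties):
    """MLE of ability by a coarse grid search over theta.
    The MLE diverges for perfect and zero scores, so those need
    the arbitrary treatment mentioned on the next slides."""
    grid = [i / 100.0 for i in range(-400, 401)]  # -4.0 .. 4.0 logits
    return max(grid, key=lambda t: pattern_log_lik(t, responses, difficulties))

# 4 correct out of 6 equally difficult items (delta = 0):
# the true MLE is log(4/2) = log 2, about 0.69 logits.
print(round(mle_ability([1, 1, 1, 1, 0, 0], [0.0] * 6), 2))
```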
Maximum Likelihood vs. Raw Score
[Figure: raw score (0 to 5) plotted against proficiency — each raw score corresponds to a single ML proficiency estimate]
The Resulting Proficiency Distribution
[Figure: discrete distribution of ML estimates — one proficiency value on the logit scale for each raw score 0 to 6]
Characteristics of Maximum Likelihood Estimates (MLE)

• Unbiased at the individual level with sufficient information, BUT biased towards the ends of the ability scale
• Arbitrary treatment of perfect and zero scores required
• Discrete scale & measurement error lead to bias in population parameter estimates
Characteristics of Weighted Likelihood Estimates
• Less biased than MLE
• Provides estimates for perfect and zero scores
• BUT discrete scale & measurement error lead to bias in population parameter estimates
Plausible Values
• What are plausible values?
• Why do we use them?
• How to analyse plausible values?
Purpose of educational tests
• Measure particular students (minimise measurement error of individual estimates)
• Assess populations (minimise error when generalising to the population)
Posterior distributions for test scores on 6 dichotomous items
Empirical Bayes – Expected A Posteriori estimates (EAP)
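A minimal sketch of the EAP computation: the posterior mean over an equally spaced ability grid, assuming a Normal(μ, σ²) population prior (the prior parameters and grid are illustrative choices):

```python
import math

def eap_estimate(responses, difficulties, mu=0.0, sigma=1.0, n_grid=81):
    """Expected A Posteriori (EAP) ability estimate: the mean of the
    posterior distribution, computed on a grid of quadrature points.
    Assumes a Normal(mu, sigma^2) population prior -- an assumption
    the EAP always requires."""
    grid = [mu - 4 * sigma + 8 * sigma * k / (n_grid - 1) for k in range(n_grid)]
    post = []
    for t in grid:
        lik = 1.0  # Rasch likelihood of the response pattern at ability t
        for x, d in zip(responses, difficulties):
            p = 1.0 / (1.0 + math.exp(-(t - d)))
            lik *= p if x == 1 else (1.0 - p)
        prior = math.exp(-0.5 * ((t - mu) / sigma) ** 2)
        post.append(lik * prior)
    total = sum(post)
    return sum(t * w for t, w in zip(grid, post)) / total

# Unlike the MLE, the EAP stays finite even for a zero score:
print(round(eap_estimate([0] * 6, [0.0] * 6), 2))
```

The prior pulls every estimate towards the population mean, which is why the next slide notes that EAPs are biased at the individual level.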
Characteristics of EAPs
• Biased at the individual level, but population means are unbiased (NOT variances)
• Discrete scale, bias & measurement error lead to bias in population parameter estimates
• Requires assumptions about the distribution of proficiency in the population
Plausible Values
[Figure: plausible values — a full posterior distribution on the logit scale for each raw score 0 to 6, rather than a single point estimate]
Characteristics of Plausible Values
• Not fair at the student level
• Produces unbiased population parameter estimates – if assumptions of scaling are reasonable
• Requires assumptions about the distribution of proficiency
Estimating percentages below benchmark with Plausible Values
Level One Cutpoint
The proportion of plausible values below the cut-point is a superior estimator to EAP-, MLE- or WLE-based values
Methodology of PVs
• Mathematically computing posterior distributions around test scores
• Drawing 5 random values for each assessed individual from the posterior distribution for that individual
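The two steps above can be sketched as sampling from each individual's posterior over an ability grid (an illustration only: operational ICCS scaling additionally uses a conditioning model, and the prior parameters here are illustrative):

```python
import math
import random

def draw_plausible_values(responses, difficulties, mu=0.0, sigma=1.0,
                          n_draws=5, n_grid=81, rng=None):
    """Draw plausible values: random draws from this individual's
    posterior distribution, approximated on an ability grid with a
    Normal(mu, sigma^2) population prior."""
    rng = rng or random.Random(1)
    grid = [mu - 4 * sigma + 8 * sigma * k / (n_grid - 1) for k in range(n_grid)]
    post = []
    for t in grid:
        lik = 1.0  # Rasch likelihood of the response pattern at ability t
        for x, d in zip(responses, difficulties):
            p = 1.0 / (1.0 + math.exp(-(t - d)))
            lik *= p if x == 1 else (1.0 - p)
        post.append(lik * math.exp(-0.5 * ((t - mu) / sigma) ** 2))
    total = sum(post)
    weights = [w / total for w in post]
    return rng.choices(grid, weights=weights, k=n_draws)

# Five plausible values for one respondent with 3 of 6 items correct:
pvs = draw_plausible_values([1, 1, 0, 1, 0, 0], [0.0] * 6)
print(len(pvs))
```

Two respondents with the same raw score will generally receive different plausible values, which is exactly why PVs are "not fair" at the student level but work well for population estimates.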
What is conditioning?
• Assuming a normal posterior distribution: θ ~ N(μ, σ²)
• Model sub-populations (e.g. X = 0 for a boy, X = 1 for a girl): θ ~ N(μ + βX, σ²)
• With further conditioning variables X, Y, Z, …: θ ~ N(μ + β₁X + β₂Y + β₃Z + …, σ²)

[Figure: normal posterior density curve]
Conditioning Variables

• Plausible values should only be analysed with data that were included in the conditioning (otherwise, results may be biased)
• Aim: maximise the information included in the conditioning, that is, use as many variables as possible
• To reduce the number of conditioning variables, factor scores from principal component analysis were used in ICCS
• Use of classroom dummies takes between-school variation into account (no inclusion of school or teacher questionnaire data needed)
Plausible values
• A model with conditioning variables will improve the precision of the prediction of ability (population estimates ONLY)
• Conditioning provides unbiased estimates for modelled parameters
• Simulation studies comparing PVs, EAPs and WLEs show that:
  – Population means: similar results across methods
  – WLEs (or MLEs) tend to overestimate variances
  – EAPs tend to underestimate variances
Calculating measurement error

• As in TIMSS or PIRLS data files, there are five plausible values for cognitive test scales in ICCS
• Using five plausible values enables researchers to obtain estimates of the measurement error
How to analyse PVs - 1
• The estimated mean is the AVERAGE of the means for each PV:

  μ̂ = (1/M) Σ_{i=1}^{M} μ̂_(i)

• The sampling variance is the AVERAGE of the sampling variances for each PV:

  σ̂²_(μ̂) = (1/M) Σ_{i=1}^{M} σ̂²_(i)
How to analyse PVs - 2
• The measurement variance is computed from the variability of the M per-PV estimates:

  σ̂²_PV = (1/(M−1)) Σ_{i=1}^{M} (μ̂_(i) − μ̂)²

• The total error variance combines the sampling and measurement variance; the total standard error is its square root:

  σ̂²_error = σ̂²_(μ̂) + (1 + 1/M) · σ̂²_PV
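The quantities on the two slides above (PV mean, average sampling variance, measurement variance, total standard error) follow the standard combination rules for multiple imputations. A minimal sketch, using made-up country means and sampling variances:

```python
def combine_pv_estimates(pv_means, pv_sampling_vars):
    """Combine M per-PV statistics into one estimate and its total
    standard error: point estimate = average of the per-PV estimates;
    total error variance = average sampling variance
    + (1 + 1/M) * measurement (between-PV) variance."""
    m = len(pv_means)
    mu = sum(pv_means) / m
    sampling_var = sum(pv_sampling_vars) / m
    meas_var = sum((x - mu) ** 2 for x in pv_means) / (m - 1)
    total_se = (sampling_var + (1 + 1 / m) * meas_var) ** 0.5
    return mu, total_se

# Five per-PV country means and their sampling variances (made-up numbers):
mu, se = combine_pv_estimates([500.1, 501.3, 499.8, 500.6, 500.2],
                              [4.0, 4.1, 3.9, 4.0, 4.0])
print(round(mu, 2), round(se, 2))
```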
How to analyse PVs - 3
The mean μ̂ can be replaced by any statistic, for instance:
- SD
- Percentile
- Correlation coefficient
- Regression coefficient
- R-square
- etc.
Steps for estimating both sampling and measurement error
• Compute the statistic for each PV for the fully weighted sample
• Compute the statistic for each PV for the 75 replicate samples
• Compute the sampling error (based on the previous steps)
• Compute the measurement error
• Combine the error variances to calculate the standard error
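The steps above can be sketched for a single statistic as follows (a sketch only: the sampling variance formula assumes the simple sum-of-squares jackknife repeated replication estimator, and the exact replication factor depends on the ICCS sampling design):

```python
def total_standard_error(full_stats, rep_stats):
    """Combine sampling and measurement error for one statistic.
    full_stats: the statistic computed on the fully weighted sample,
                once per plausible value (length M).
    rep_stats:  rep_stats[i] = the statistic for PV i recomputed with
                each replicate weight (ICCS uses 75 jackknife replicates)."""
    m = len(full_stats)
    # Sampling variance per PV from its replicate estimates, then averaged.
    samp_vars = [sum((rep - full) ** 2 for rep in reps)
                 for full, reps in zip(full_stats, rep_stats)]
    sampling_var = sum(samp_vars) / m
    # Measurement (between-PV) variance from the full-sample estimates.
    mu = sum(full_stats) / m
    meas_var = sum((x - mu) ** 2 for x in full_stats) / (m - 1)
    return (sampling_var + (1 + 1 / m) * meas_var) ** 0.5

# Toy input: three PVs, two replicate estimates each (made-up numbers).
print(total_standard_error([10.0, 10.0, 10.0], [[10.1, 9.9]] * 3))
```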
Questions or comments?