Improving latent trait analysis using Mokken scaling analysis
Roger Watson, Lisa Kirke
Faculty of Health & Social Care
23 June 2015
03/05/2023 © The University of Sheffield / Department of Marketing and Communications
ROGER WATSON
PRESENTS
FUN WITH MOKKEN SCALING
Improving latent trait analysis using Mokken scaling analysis
• Revise IRT
• Local stochastic independence
• New developments
– Invariant item ordering
– Confidence intervals
– Person-item fit
– Sample size
• Example of recent work
Revise IRT
• Measurement in social sciences
• Advantages of IRT
• Guttman and Mokken
Methods of measurement in social research
Range of methods
• Classical test theory
• Item response theory
• Latent class analysis
Item response theory
• Rasch analysis
• Partial credit model
• Mokken scaling
Advantages of item response theory
• only a specific set of items produces a given score on the latent variable
• therefore, you know what the score means
Item response theory (IRT)
• The unit of analysis in IRT:
– The item characteristic curve (ICC)
• Also known as:
– The item response curve (IRC)
– The item response function (IRF)
Mokken Scaling
• Stochastic version of Guttman scaling
Guttman scales
Louis Guttman 1916-1987
Guttman scalogram
Guttman item responses are ‘deterministic’
[Figure: item response functions for Item i and Item j; horizontal axis θ, vertical axis P(θ), maximum 1]
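A deterministic response function can be sketched in a few lines. The example below is only an illustration and is not taken from the slides: `guttman_irf` and `stochastic_irf` are hypothetical helper names, the two difficulty values are invented, and the logistic curve is an assumed example of a smooth alternative (Mokken scaling itself does not require any particular curve shape).

```python
import math

def guttman_irf(theta, difficulty):
    """Deterministic (Guttman) response: endorsement is certain once theta reaches the item's difficulty."""
    return 1.0 if theta >= difficulty else 0.0

def stochastic_irf(theta, difficulty, slope=1.0):
    """Smooth response probability; the logistic form is an assumption made for illustration."""
    return 1.0 / (1.0 + math.exp(-slope * (theta - difficulty)))

# Hypothetical difficulties for two items: i (easier) and j (harder).
difficulty_i, difficulty_j = -0.5, 0.5

for theta in (-1.0, 0.0, 1.0):
    print(
        theta,
        guttman_irf(theta, difficulty_i),
        guttman_irf(theta, difficulty_j),
        round(stochastic_irf(theta, difficulty_i), 2),
        round(stochastic_irf(theta, difficulty_j), 2),
    )
```

With the step function a person either certainly endorses an item or certainly does not; with the smooth curve the probability rises gradually with θ.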
Deterministic versus stochastic: ‘league versus cup’
Mokken Scaling
• Stochastic version of Guttman scaling
• Adheres to the assumptions of item response theory
Assumptions of IRT
• Unidimensionality
• Local stochastic independence
• Monotone homogeneity
• Double monotonicity (non-intersection) for dichotomous items†
† e.g. “yes/no”
Robert Mokken 1929-
Mokken scaling
Mokken proposed a non-parametric item response theory in which the item characteristic curves (ICCs) only have to be monotone and non-intersecting
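The two requirements can be illustrated on curves tabulated at a few increasing values of θ. This is a minimal sketch under stated assumptions: `is_monotone` and `non_intersecting` are hypothetical helper names and the probabilities are invented for illustration. It is not how Mokken scaling software assesses these properties from real data.

```python
def is_monotone(curve):
    """True if the probabilities never decrease as theta increases."""
    return all(lo <= hi for lo, hi in zip(curve, curve[1:]))

def non_intersecting(curve_a, curve_b):
    """True if one curve lies on or above the other at every grid point."""
    differences = [a - b for a, b in zip(curve_a, curve_b)]
    return all(d >= 0 for d in differences) or all(d <= 0 for d in differences)

# Invented probabilities of endorsing each item at five increasing values of theta.
item_1 = [0.20, 0.40, 0.60, 0.80, 0.95]
item_2 = [0.05, 0.15, 0.35, 0.60, 0.85]  # always below item_1: a harder item
item_3 = [0.30, 0.35, 0.50, 0.90, 0.99]  # monotone, but crosses item_1

print(is_monotone(item_1), is_monotone(item_2), is_monotone(item_3))
print(non_intersecting(item_1, item_2))
print(non_intersecting(item_1, item_3))
```

Items 1 and 2 satisfy both requirements, so their ordering by difficulty is the same at every level of θ; items 1 and 3 are each monotone but their ordering changes along θ.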
Item characteristic curve
[Figure: a single item characteristic curve; horizontal axis θ (= latent variable), vertical axis P(θ), maximum 1]
Item characteristic curves
[Figure: item characteristic curves for Item 1 and Item 2; horizontal axis θ, vertical axis P(θ), maximum 1]
• item 2 is more ‘difficult’ than item 1
• it represents more of the latent variable
• more difficult items will have lower mean scores on the latent variable
Mokken Scaling
• Stochastic version of Guttman scaling
• Adheres to the assumptions of item response theory
• Hierarchical cumulative scales
Concept of a cumulative scale
• Items are ordered reproducibly
• Items are ordered meaningfully
• A score on an item indicates the extent to which the latent trait is present
• The sum of item scores is a measure of order on the latent trait
Cumulative scale: example
(Also known as an ‘implicational’ scale)
5) I would have no objections to my son or daughter marrying a Scottish person
4) At a party I would not hesitate to dance with a Scottish person
3) I would have no objections to having a Scottish person dine in my house
2) I would not object to having a Scottish family live next door
1) I would not object to sitting next to a Scottish person on a bus
Response format: “yes” = 0 / “no” = 1
[Arrow label: DIFFICULTY]
Local stochastic independence
New developments in Mokken scaling
• Invariant item ordering
Invariant item ordering
[Figure: item response curves for ‘Cut toe nails’ and ‘Tie a knot’; horizontal axis Disability, vertical axis P]
Fieo R, Watson R, Deary IJ, Starr JM (2010) A revised activities of daily living/instrumental activities of daily living instrument increases interpretive power: theoretical application for functional tasks Gerontology 56, 483-490
Polytomous item response functions (IRFs) of the Hostility (HOS) items
HT = 0.67
Polytomous item response functions (IRFs) of the Depression (DEP) items
HT = 0.47
Polytomous item response functions (IRFs) of the Physical Functioning (PF) items
HT = 0.53
New developments in Mokken scaling
• Invariant item ordering
• Confidence intervals
95% CI for Hij should not include 0
95% CI for Hi should not include 0.30
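For orientation, the coefficient that these intervals are built around can be computed by hand for two dichotomous items. The sketch below is an illustration under stated assumptions: `pair_h` is a hypothetical helper name, the response vectors are invented, and the function gives only the point estimate of Hij (one minus the ratio of observed to expected Guttman errors); it does not compute the confidence interval itself.

```python
def pair_h(item_a, item_b):
    """Point estimate of Loevinger's H for two dichotomous (0/1) items answered by the same people."""
    n = len(item_a)
    p_a = sum(item_a) / n
    p_b = sum(item_b) / n
    # Treat the more frequently endorsed item as the easier one.
    if p_a >= p_b:
        easy, hard, p_easy, p_hard = item_a, item_b, p_a, p_b
    else:
        easy, hard, p_easy, p_hard = item_b, item_a, p_b, p_a
    # A Guttman error: the harder item is endorsed but the easier one is not.
    observed = sum(1 for e, h in zip(easy, hard) if e == 0 and h == 1)
    # Number of such errors expected if the two items were unrelated.
    expected = n * (1 - p_easy) * p_hard
    if expected == 0:
        raise ValueError("H is undefined when no Guttman errors are possible")
    return 1 - observed / expected

# Invented responses from four people.
perfect_easy, perfect_hard = [1, 1, 1, 0], [1, 0, 0, 0]      # no Guttman errors
unrelated_easy, unrelated_hard = [1, 1, 0, 0], [1, 0, 1, 0]  # errors at chance level

print(pair_h(perfect_easy, perfect_hard))
print(pair_h(unrelated_easy, unrelated_hard))
```

A value near 1 means almost no Guttman errors; a value near 0 means the items order people no better than chance, which is why an interval for Hij that includes 0 is a warning sign.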
New developments in Mokken scaling
• Invariant item ordering
• Confidence intervals
• Person-item fit
…most studies are difficult to understand for nonspecialists.
[Figure: response patterns labelled FITS and DON’T FIT, with a marker denoting Guttman errors]
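Counting Guttman errors in one person's response pattern can be sketched as follows. This is a minimal illustration, assuming dichotomous items that have already been ordered from easiest to hardest; `guttman_errors` is a hypothetical helper name, the patterns are invented, and the count shown is not necessarily the person-fit statistic used in the study described next.

```python
def guttman_errors(responses_easy_to_hard):
    """Count item pairs in which a harder item is endorsed (1) but an easier one is not (0)."""
    errors = 0
    easier_items_failed = 0
    for response in responses_easy_to_hard:
        if response == 0:
            easier_items_failed += 1
        else:
            # Each easier item already scored 0 forms one error with this endorsed item.
            errors += easier_items_failed
    return errors

fitting_pattern = [1, 1, 1, 0, 0]     # endorses the easy items only
misfitting_pattern = [0, 1, 1, 0, 1]  # endorses hard items while missing easier ones

print(guttman_errors(fitting_pattern))
print(guttman_errors(misfitting_pattern))
```

The first pattern contains no errors and fits the cumulative scale; the second contains four, so that person's total score is harder to interpret.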
Italian EdFED scale PIF analysis
• Data for the EdFED (Edinburgh Feeding Evaluation in Dementia scale) from an intervention study (The Nutricare Project) in Italy with baseline and 6-monthly follow-up
• Analysed using the PerFit package in R
Italian EdFED scale PIF analysis
Person-item fit scores for selected individuals across the study
            Months
Person ID   0      1      2      3      4      5      6
148         0.29   0.28   0.28   0.28   0.28   0.28   0.27
269         0.30   0.22   0.22   0.22   0.25   0.24   0.24
290         0.29   0.25   0.25   0.25   0.25   0.23   0.23
291         0.29   0.28   0.28   0.28   0.25   0.23   0.30
330         NF     0.23   0.23   0.23   0.22   0.23   0.21
NF = not flagged
New developments in Mokken scaling
• Invariant item ordering
• Confidence intervals
• Person-item fit
• Sample size
The effect of sample size on Mokken scales in the Warwick-Edinburgh Mental Well-Being scale
• Increasing sample size
• 50/250/500/600/750
• Sampling with replacement
• Study effects on:
– Scalability
– Confidence intervals
– Per element accuracy
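The resampling design can be sketched as follows. This is an illustration under loudly stated assumptions, not the study's analysis: the data are simulated rather than real responses, the items are dichotomous (a simplification), the logistic response model and its parameters are invented, and `scale_h` and `simulate_population` are hypothetical helper names. Only the sample sizes, the ten samples per size and the sampling with replacement follow the slide.

```python
import math
import random

def scale_h(rows):
    """Scale H for dichotomous data: 1 minus observed over expected Guttman errors, summed over item pairs."""
    n = len(rows)
    k = len(rows[0])
    p = [sum(row[j] for row in rows) / n for j in range(k)]
    observed = 0.0
    expected = 0.0
    for a in range(k):
        for b in range(a + 1, k):
            # The more frequently endorsed item of the pair is treated as the easier one.
            easy, hard = (a, b) if p[a] >= p[b] else (b, a)
            observed += sum(1 for row in rows if row[easy] == 0 and row[hard] == 1)
            expected += n * (1 - p[easy]) * p[hard]
    if expected == 0:
        raise ValueError("H is undefined when no Guttman errors are possible")
    return 1 - observed / expected

def simulate_population(size, difficulties, rng):
    """Simulated respondents; the logistic response model is an assumption for illustration."""
    rows = []
    for _ in range(size):
        theta = rng.gauss(0.0, 1.0)
        rows.append([
            1 if rng.random() < 1 / (1 + math.exp(-2.0 * (theta - d))) else 0
            for d in difficulties
        ])
    return rows

rng = random.Random(2015)
population = simulate_population(2000, [-1.0, -0.5, 0.0, 0.5, 1.0], rng)

# Ten samples, drawn with replacement, at each sample size named on the slide.
h_by_n = {}
for n in (50, 250, 500, 600, 750):
    h_by_n[n] = [scale_h([rng.choice(population) for _ in range(n)]) for _ in range(10)]

for n, values in h_by_n.items():
    print(n, round(min(values), 2), round(max(values), 2))
```

Comparing the minimum and maximum H across the ten samples at each size gives a rough picture of how much the coefficient varies from sample to sample.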
Sample      Range of Hi   H     Hi with 95% CI < .3 (n)   Hij with 95% CI < 0 (n)   PEA (AISP)   PEA (GA)
sample 1    .18 - .46     .32   12                        47                        .64          .64
sample 2    .10 - .47     .31   12                        43                        .64          .64
sample 3    .16 - .51     .35   11                        39                        .79          .79
sample 4    .10 - .41     .26   14                        60                        .71          .71
sample 5    .27 - .58     .43   6                         25                        .86          .86
sample 6    .08 - .47     .35   9                         41                        .79          .79
sample 7    .02 - .40     .27   14                        53                        .64          .64
sample 8    .17 - .46     .29   12                        57                        .57          .57
sample 9    .14 - .56     .37   12                        46                        .79          .79
sample 10   .09 - .57     .38   9                         35                        .79          .79
Range of Hi values, H, CIs for Hi and Hij, and PEA for ten samples with n = 50
Note. Hi values are based on TEST for all 14 items. The values of PEA (AISP) and PEA (GA) are the same. However, the same items are not always selected in the same scale.
Sample      Range of Hi   H     Hi with 95% CI < .3 (n)   Hij with 95% CI < 0 (n)   PEA (AISP)   PEA (GA)
sample 1    .36 - .58     .48   2                         0                         1.00         1.00
sample 2    .31 - .52     .41   5                         3                         1.00         1.00
sample 3    .32 - .62     .48   2                         0                         1.00         1.00
sample 4    .26 - .55     .47   2                         0                         .93          .93
sample 5    .18 - .50     .39   6                         10                        .86          .86
sample 6    .37 - .61     .49   1                         0                         1.00         1.00
sample 7    .32 - .56     .46   2                         0                         1.00         1.00
sample 8    .40 - .60     .49   0                         0                         1.00         1.00
sample 9    .40 - .60     .49   0                         0                         .93          .93
sample 10   .24 - .55     .44   2                         2                         .93          .93
Range of Hi values, H, CIs for Hi and Hij, and PEA for ten samples with n = 250
Note. Hi values are based on TEST for all 14 items. The values of PEA (AISP) and PEA (GA) are the same. Furthermore, the same items are always selected in the same scale (only exception is shown in sample 9).
Sample      Range of Hi   H     Hi with 95% CI < .3 (n)   Hij with 95% CI < 0 (n)   PEA (AISP)   PEA (GA)
sample 1    .30 - .60     .49   1                         0                         1.00         1.00
sample 2    .32 - .58     .47   1                         0                         1.00         1.00
sample 3    .36 - .57     .47   1                         0                         1.00         1.00
sample 4    .33 - .58     .48   1                         0                         1.00         1.00
sample 5    .34 - .59     .49   1                         0                         1.00         1.00
sample 6    .35 - .61     .50   1                         0                         1.00         1.00
sample 7    .33 - .59     .47   1                         0                         1.00         1.00
sample 8    .37 - .60     .50   0                         0                         1.00         1.00
sample 9    .25 - .55     .43   2                         0                         .93          .93
sample 10   .33 - .63     .53   1                         0                         1.00         1.00
Range of Hi values, H, CIs for Hi and Hij, and PEA for ten samples with n = 500
Note. Hi values are based on TEST for all 14 items. The values of PEA (AISP) and PEA (GA) are the same. Furthermore, the same items are always selected in the same scale.
Sample      Range of Hi   H     Hi with 95% CI < .3 (n)   Hij with 95% CI < 0 (n)   PEA (AISP)   PEA (GA)
sample 1    .31 - .62     .50   0                         0                         1.00         1.00
sample 2    .33 - .57     .46   1                         0                         1.00         1.00
sample 3    .37 - .57     .48   0                         0                         1.00         1.00
sample 4    .33 - .59     .49   0                         0                         1.00         1.00
sample 5    .35 - .59     .49   1                         0                         1.00         1.00
sample 6    .34 - .61     .50   1                         0                         1.00         1.00
sample 7    .31 - .59     .47   1                         0                         1.00         1.00
sample 8    .36 - .59     .49   1                         0                         1.00         1.00
sample 9    .27 - .55     .44   1                         0                         .93          .93
sample 10   .33 - .61     .51   1                         0                         1.00         1.00
Range of Hi values, H, CIs for Hi and Hij, and PEA for ten samples with n = 600
Note. Hi values are based on TEST for all 14 items. The values of PEA (AISP) and PEA (GA) are the same. Furthermore, the same items are always selected in the same scale.
Sample      Range of Hi   H     Hi with 95% CI < .3 (n)   Hij with 95% CI < 0 (n)   PEA (AISP)   PEA (GA)
sample 1    .30 - .57     .47   1                         0                         1.00         1.00
sample 2    .38 - .59     .50   0                         0                         1.00         1.00
sample 3    .33 - .61     .52   1                         0                         1.00         1.00
sample 4    .37 - .64     .53   0                         0                         1.00         1.00
sample 5    .35 - .59     .49   1                         0                         1.00         1.00
sample 6    .33 - .60     .49   1                         0                         1.00         1.00
sample 7    .38 - .60     .50   0                         0                         1.00         1.00
sample 8    .37 - .60     .50   0                         0                         1.00         1.00
sample 9    .37 - .59     .49   0                         0                         1.00         1.00
sample 10   .40 - .61     .51   0                         0                         1.00         1.00
Range of Hi values, H, CIs for Hi and Hij, and PEA for ten samples with n = 750
Note. Hi values are based on TEST for all 14 items. The values of PEA (AISP) and PEA (GA) are the same.
The AQ formed reliable Mokken scales. There was a large overlap between the scale from the university student sample and the sample with ASC, with the first scale, relating to social interaction, being almost identical. The present study confirms the utility of the AQ as a single instrument that can dimensionalize autistic traits in both university student and clinical samples of ASC, and confirms that items of the AQ are consistently ordered relative to one another.
Conclusion & prospects
• Mokken scaling useful
BUT:
• Sample sizes need to be quite large
• User-friendly methods of assessing LSI need to be developed
• We need some formal criteria to decide what to do with person-item misfits