item difficulty modeling on logical and verbal reasoning tests kuan xing 1 and kirk becker 2 1...

14
Item Difficulty Modeling on Logical and Verbal Reasoning Tests Kuan Xing 1 and Kirk Becker 2 1 University of Illinois – Chicago; 2 Pearson VUE, Chicago IL Acknowledgement: This pilot study was done during first author’s internship at Pearson VUE. The first author wants to thank Pearson VUE, and especially Dr. Kirk Becker for his great support and mentoring. 1

Upload: earl-booth

Post on 19-Jan-2016

215 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Item Difficulty Modeling on Logical and Verbal Reasoning Tests Kuan Xing 1 and Kirk Becker 2 1 University of Illinois – Chicago; 2 Pearson VUE, Chicago

1

Item Difficulty Modeling on Logical and Verbal

Reasoning TestsKuan Xing1 and Kirk Becker2

1 University of Illinois – Chicago; 2 Pearson VUE, Chicago IL

Acknowledgement: This pilot study was done during first author’s internship at Pearson VUE. The first author wants to thank Pearson VUE, and especially Dr. Kirk Becker for his great support and mentoring.

Page 2: Item Difficulty Modeling on Logical and Verbal Reasoning Tests Kuan Xing 1 and Kirk Becker 2 1 University of Illinois – Chicago; 2 Pearson VUE, Chicago

2

Background

• Psychometric researchers and testing practitioners must investigate the quality of large-scale tests.

• Checking item parameters such as item difficulty is important in the test quality control process.

• However, researchers and practitioners want to know more, such as what makes items difficult (or easy)?

Page 3: Item Difficulty Modeling on Logical and Verbal Reasoning Tests Kuan Xing 1 and Kirk Becker 2 1 University of Illinois – Chicago; 2 Pearson VUE, Chicago

3

Background (continued)

• It’s essential to investigate the item features, and use those information to make predictions on the item qualities for future item generation (Irvine, 2003).

• In practice, often there are some discrepancies between test writers’ predicted item characteristics (e.g., item difficulty) and operational item characteristics. Item difficulty modeling may function as an effective way to generate new items with good quality.

Page 4: Item Difficulty Modeling on Logical and Verbal Reasoning Tests Kuan Xing 1 and Kirk Becker 2 1 University of Illinois – Chicago; 2 Pearson VUE, Chicago

4

Theoretical framework

• Earlier work: Mental rotation (Shepard & Metzler, 1971)

• Carroll and Johnson (1991) showed item difficulties could be predicted by radicals (those theoretically consonant structural components of items)

• Psychology of reasoning (Newstead et al., 2003): Using think-aloud protocols and experimental studies, they identified factors which could affect the item difficulties in Analytical Reasoning (AR), such as negatives, types of reasoning task (assignment or placing), overall informativeness, among others.

Page 5: Item Difficulty Modeling on Logical and Verbal Reasoning Tests Kuan Xing 1 and Kirk Becker 2 1 University of Illinois – Chicago; 2 Pearson VUE, Chicago

5

Logical Reasoning

• Items are grouped into sets;

• Each set provides a set of rules;

• The answer to each item can be deduced from the information provided.

Page 6: Item Difficulty Modeling on Logical and Verbal Reasoning Tests Kuan Xing 1 and Kirk Becker 2 1 University of Illinois – Chicago; 2 Pearson VUE, Chicago

6

Example of logical reasoning content• The game is for 3 players

• If A is playing, B is not

• Only 2 of B, C, or D can play at the same time

• C will always play

• If E plays, A will also play

• Question – if A and E are playing, who is also playing?

Page 7: Item Difficulty Modeling on Logical and Verbal Reasoning Tests Kuan Xing 1 and Kirk Becker 2 1 University of Illinois – Chicago; 2 Pearson VUE, Chicago

7

Methods

• Participants: A new-designed college administration test was piloted in a private university. This test consists of logical, quantitative, and verbal reasoning subtests, but we only focus on the logical and verbal parts.

• Procedures: We identified and coded test features from the previous research work (Newstead et al., 2003), answer sheets developed by test writers, and our discussions.

• Item level and set level features were coded.

• Multiple regressions were applied to predict item difficulties using different combinations of coded test features.

Page 8: Item Difficulty Modeling on Logical and Verbal Reasoning Tests Kuan Xing 1 and Kirk Becker 2 1 University of Illinois – Chicago; 2 Pearson VUE, Chicago

8

Results

Part 1: Logical reasoning test

• A total of 14 valid item difficulty parameters were finally entered in the regression equation for logical reasoning. Values used were the average difficulty of the 4 items within a set.

• Predictors including people, repeating groups, and cultural background explained a marginally significant proportion of variance in item difficulty parameters, R2 = .50, F(3,13) = 3.37, p = .06.

(A list of test features is shown in Table 1 on next page.)

Page 9: Item Difficulty Modeling on Logical and Verbal Reasoning Tests Kuan Xing 1 and Kirk Becker 2 1 University of Illinois – Chicago; 2 Pearson VUE, Chicago

9

Results (continued)

Test features Coding criteria

PeopleCategorical variable: coded as “1” if there are people (and objects) in test scenarios, and as “0” if there are only objects in test scenarios.

Repeating groupsCategorical variable: coded as “1” if the people/objects could be reassigned into different reasoning questions (usually one question set has 4 reasoning questions), and as “0” if they couldn’t.

Cultural backgroundCategorical variable: coded as “1” if the items possibly require examinees’ cultural background (e.g., sports knowledge), and as “0” if the items don’t require.

Information ratioContinuous variable: it equals to the number of reasoning rules divided by the number of information pieces. [Note. One rule may consists of more than one information piece.

Table 1. Test features in logical reasoning test

Page 10: Item Difficulty Modeling on Logical and Verbal Reasoning Tests Kuan Xing 1 and Kirk Becker 2 1 University of Illinois – Chicago; 2 Pearson VUE, Chicago

10

Part 2. Verbal Reasoning

Candidates are asked to read a paragraph with 4 missing words. Each “item” consists of the missing word and 3 distractors.

I had called upon my friend Sherlock Holmes upon the second morning after Christmas, with the _______ of wishing him the compliments of the season. He was ______ upon the sofa in a purple dressing-gown, a pipe-rack within his reach upon the right, and a pile of crumpled morning papers, evidently newly studied, near at hand. Beside the couch was a wooden chair, and on the angle of the back hung a very seedy and ________ hard-felt hat, much the worse for wear, and cracked in several places. A lens and a forceps lying upon the seat of the chair suggested that the hat had been suspended in this manner for the purpose of _______.

Page 11: Item Difficulty Modeling on Logical and Verbal Reasoning Tests Kuan Xing 1 and Kirk Becker 2 1 University of Illinois – Chicago; 2 Pearson VUE, Chicago

11

Verbal Reasoning Features

• Word Frequency (key - ngram from ngramr R package)

• Reading Ease

• Grade level

• Word count

• Characters

• Sentences

• Passive sentences

• Words per sentence

• Characters per word

Page 12: Item Difficulty Modeling on Logical and Verbal Reasoning Tests Kuan Xing 1 and Kirk Becker 2 1 University of Illinois – Chicago; 2 Pearson VUE, Chicago

12

Verbal Reasoning Correlations2014 2015

freq -.403** -.495**

readease -.197 .035

FK .226 -.189

Words -.133 -.099

chars -.057 -.070

sentences -.232 .231

passive -.164 -.141

wps .086 -.228

cpw .057 .185

Page 13: Item Difficulty Modeling on Logical and Verbal Reasoning Tests Kuan Xing 1 and Kirk Becker 2 1 University of Illinois – Chicago; 2 Pearson VUE, Chicago

13

Discussions

• Regression analysis for logical and verbal reasoning tests showed promising results (given the sample size is small). Test features were essential components to predict item difficulty parameters.

• Additional studies with more items and with verification of the predictions will be forthcoming.

Page 14: Item Difficulty Modeling on Logical and Verbal Reasoning Tests Kuan Xing 1 and Kirk Becker 2 1 University of Illinois – Chicago; 2 Pearson VUE, Chicago

14

Thanks for listening!Questions?

E-mails: [email protected] [email protected]

Key references

• Irvine, S. H. (2003). The foundations of item generation for mass testing. In S. H. Irvine & P. C. Kyllonen (Eds.), Item generation for test development (pp 3-34). Hillsdale, NJ: Erlbaum.

• Newstead, S., Bradon, P., Handley, S., Evans, J., & Dennis, I. (2003). Using the psychology of reasoning to predict the difficulty of analytical reasoning problems. In S. H. Irvine & P. C. Kyllonen (Eds.), Item generation for test development (pp 35-52). Hillsdale, NJ: Erlbaum.