www.pearsonpte.com
Using CEFR-derived guidelines in test development and assessment
Glyn Jones, Kirsten Ackermann
EALTA 2010 Den Haag
London Tests of English
www.pearsonpte.com
Deriving item writer guidelines from the CEFR
1. Rationale
2. Methodology
3. Guidelines
4. Monitoring & review
1. Rationale
• Context: item writer and rater training for PTE General (Pearson Test of English General, formerly London Tests of English)
• Linking to CEFR: standardisation of judgments
• Objectives
– to help item writers to produce items at the appropriate CEFR level
– to help raters to apply CEFR descriptors when judging spoken and written samples
• Challenge: confronting lack of definition in CEFR descriptors
• Developing item-specific guidelines derived from CEFR descriptors
2. Methodology
For each item type…
• What traits is the item designed to measure?
• Which CEFR descriptors relate most closely to those traits?
• What are the key terms in those descriptors?
• How should we interpret those terms in relation to the item?
[Diagram: for each item, trait(s) → descriptor(s) → terms → implications for text, audio, task, options, …]
Sample reading item
For me, the most depressing thing about their new album is its lack of invention - they just seem to be _______ old ideas.
A revising
B reviewing
C recycling
Item type: B2 section 5
Short text with a single multiple choice gapfill
Which CEF scale(s) are applicable?
• Overall reading comprehension
• Reading for orientation
Which terms need to be glossed?
Reading for orientation: Can quickly identify the content and relevance of news items, articles and reports on a wide range of professional topics, deciding whether closer study is worthwhile.
• "identify the content and relevance"
– The task should require the test taker to identify the topic of the text with precision, e.g. not just an ad for a holiday but for an adventure holiday (Sample 1).
• "wide range of professional topics"
– Texts can be work related but accessible to the general reader: either texts to inform general readers about technical matters (e.g. information leaflets) or generic work-related texts, e.g. about office procedures, document processing, line management etc.
• "deciding whether closer study is worthwhile"
– Solving the task should hinge on a fundamental aspect of the text (its structure, purpose or content), not on a minor detail.
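The chain from item type to CEFR scale, descriptor, key terms and implications can be captured as a small record. The following is only an illustrative sketch: the class and field names (`Gloss`, `ItemTypeGuideline`, `implication_for`) are hypothetical and not part of the PTE General documentation, and the text values are abridged from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Gloss:
    """One key term from a CEFR descriptor and what it implies for the item."""
    term: str
    implication: str


@dataclass
class ItemTypeGuideline:
    """Hypothetical record linking an item type to CEFR scales and glossed terms."""
    item_type: str
    scales: list
    descriptor: str
    glosses: list = field(default_factory=list)

    def implication_for(self, term):
        """Return the implication recorded for a key term (case-insensitive), or None."""
        wanted = term.lower()
        for gloss in self.glosses:
            if gloss.term.lower() == wanted:
                return gloss.implication
        return None


# Values abridged from the B2 section 5 slide.
b2_section5 = ItemTypeGuideline(
    item_type="B2 section 5: short text with a single multiple choice gapfill",
    scales=["Overall reading comprehension", "Reading for orientation"],
    descriptor=(
        "Can quickly identify the content and relevance of news items, "
        "articles and reports on a wide range of professional topics, "
        "deciding whether closer study is worthwhile"
    ),
    glosses=[
        Gloss("identify the content and relevance",
              "The task should require the test taker to identify the topic "
              "of the text with precision."),
        Gloss("wide range of professional topics",
              "Texts can be work related but accessible to the general reader."),
        Gloss("deciding whether closer study is worthwhile",
              "Solving the task should hinge on a fundamental aspect of the "
              "text, not on a minor detail."),
    ],
)

print(b2_section5.implication_for("Deciding whether closer study is worthwhile"))
```

Keeping each gloss next to the descriptor it interprets makes it easy to check, during item review, that every key term in a descriptor has a recorded interpretation.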
3. Guidelines
Section 4: CEF B2 descriptors and B2 guidance notes
CEF B2 descriptors
• B2 Overall reading comprehension: Can read with a large degree of independence, adapting style and speed of reading to different texts and purposes, and using appropriate reference sources selectively. Has a broad active reading vocabulary, but may experience some difficulty with low frequency idioms.
• B2 Reading for information and argument: Can obtain information, ideas and opinions from highly specialised sources within his/her field. Can understand articles and reports concerned with contemporary problems in which the writers adopt particular stances or viewpoints.
• B2 Reading for orientation: Can quickly identify the content and relevance of news items, articles and reports on a wide range of professional topics, deciding whether closer study is worthwhile.
Guidance notes: Text
• The text type may be any that a typical language user is likely to encounter in real life, including professional and academic situations.
• The lexis in the text should be accessible to an educated general reader.
• The text should not contain highly colloquial or idiomatic expressions.
• Texts can be work related but accessible to the general reader: either texts to inform general readers about technical matters (e.g. information leaflets) or generic work-related texts, e.g. about office procedures, document processing, line management etc.
Guidance notes: Task
• The task should require the test taker to identify the topic of the text with precision, e.g. not just an ad for a holiday but for an adventure holiday.
• The task should be designed to assess understanding of the purpose or main message of the text, or familiarity with the formal linguistic features of the genre, including stylistic features, register and appropriate vocabulary.
4. Monitoring & review
[Diagram: Guidelines, Item writing, Item review, Field testing / item seeding; feedback labelled "comments", "comments" and "data"]
Deriving Marking Criteria from the CEFR
1. Rationale
2. Methodology
3. Marking Criteria
4. Quantitative & Qualitative Analyses
5. Results
1. Rationale
• CEFR as a common standard for language learning, teaching and assessment
• Contributing to a robust alignment of PTE General to the CEFR
• Ensuring comparability between language examinations and qualifications
• Rating language functions as well as qualitative aspects of language
• Having a marking scale based on CEFR levels
2. Methodology
(1) Gathering expert judgments on the usability of the CEFR for the creation of marking criteria
(2) Identifying gaps and inconsistencies in the CEFR
(3) Producing additional guidelines to flesh out the descriptors
(4) Field testing the new marking criteria
(5) Using quantitative and qualitative methods for validation
3. Marking Criteria for PTE General: Speaking Test
Section of Speaking Test | Individual Trait | Scale
S1: Long turn | Sustained Monologue | 1-5
S2: Discussion | Turn Taking | 1-5
S3: Responding to a visual stimulus | Thematic Development | 1-5
S4: Role play | Sociolinguistic Appropriateness | 1-5
Qualitative Traits | Fluency, Interaction, Range, Accuracy, Phonological Control | 1-5 respectively
Sample B1 Speaking: Role Play
Marking Criteria - Sociolinguistic Appropriateness (abbr.)
Can perform and respond to a wide range of language functions, using their most common exponents in a neutral register.
Language Level Descriptor
Test takers may be required to perform the following functions and respond to them: requesting, offering, suggesting, thanking, rejecting, apologising or congratulating. While test takers’ language will generally be limited to a neutral register, some awareness of appropriateness (e.g., in terms of degrees of formality) is expected.
4. Quantitative & Qualitative Analyses
Quantitative
• Multi-faceted Rasch Analysis (FACETS)
• Internal consistency calculations (Cronbach’s alpha)
Qualitative
• Verbal Protocol Analysis (MAXQDA; Atlas.ti)
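The slides do not state the model specification used in FACETS. For orientation only, the many-facet Rasch model is commonly written in the following rating-scale form; assigning the facets to test takers, item traits and markers is an assumption made here for illustration.

```latex
\[
\ln\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k
\]
% P_{nijk}: probability that test taker n is awarded category k on trait i by marker j
% B_n: ability of test taker n
% D_i: difficulty of item trait i
% C_j: severity of marker j
% F_k: difficulty of the step from category k-1 to category k
```

Under this form, marker severity is estimated as its own facet, so differences between markers can be separated from differences in test taker ability.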
Internal Consistency Reliability
Item Trait | Session 1 B1 | Session 2 A1 | Session B2
Sustained Monologue | 0.77 | 0.92 | 0.88
Turn Taking | 0.84 | N/A | 0.93
Thematic Development | 0.90 | 0.84 | 0.80
Sociolinguistic Approp. | 0.84 | 0.85 | 0.87
Fluency | 0.88 | 0.93 | 0.93
Interaction | 0.83 | 0.90 | 0.91
Range | 0.87 | 0.91 | 0.95
Accuracy | 0.87 | 0.74 | 0.83
Phonological Control | 0.89 | 0.81 | 0.87
Number of Markers | 5 | 3 | 2
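A minimal sketch of how a Cronbach's alpha value of this kind could be computed for one trait follows. It assumes that the markers are treated as the "items" in the alpha formula, which the slides suggest (via the Number of Markers row) but do not state. The function name `cronbach_alpha` and the scores are invented for illustration and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a score matrix.

    ratings: one row per test taker, one column per marker
    (markers are assumed here to play the role of 'items').
    """
    n_markers = len(ratings[0])
    if n_markers < 2:
        raise ValueError("alpha needs at least two markers")
    # Sample variance of each marker's scores across test takers.
    marker_vars = [
        variance([row[j] for row in ratings]) for j in range(n_markers)
    ]
    # Sample variance of each test taker's summed score.
    total_var = variance([sum(row) for row in ratings])
    return n_markers / (n_markers - 1) * (1 - sum(marker_vars) / total_var)


# Invented scores on a 1-5 scale: five test takers, three markers.
hypothetical_scores = [
    [3, 3, 4],
    [2, 2, 2],
    [5, 4, 5],
    [1, 2, 1],
    [4, 4, 3],
]

print(round(cronbach_alpha(hypothetical_scores), 2))
```

Alpha rises toward 1 as the markers rank test takers more consistently. With only two or three markers, as in Sessions 2 and 3, a single divergent marker can move the value noticeably.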
5. Results
• Item writer guidelines and marking criteria robustly aligned to the CEFR
• Language level descriptors for each item trait to be assessed to support marking criteria
• Standardized rating process
• Assessment that ensures maximum reliability and validity
• Comparability and interpretability of test scores
Questions?