Optionality: forms, intended benefits and potential issues
Sandra Johnson, Assessment Europe
SQA Research seminar, October 2013
What is “optionality”?
Any feature in examinations that allows different candidates for the same qualification to achieve that qualification through different assessment routes
Forms of optionality
Appears in the context of the question rather than the content
Allows candidates to choose between items of mandatory content
Reflects optional content in courses
Embedded within questions
Parallel papers/units
Class-based tasks
Example 1 Higher SQP: Spanish Reading and Writing – Section 2 Writing
Choose one of the following four writing scenarios on the contexts of employability, culture, culture and learning that you have studied in the course. Write approximately 120-150 words.
OPTION 1
You have just come back from a summer job in Spain and you have been asked to write a report for your school’s/college’s Spanish webpage. You must include the following information:
Where you worked and how you got there every day
What you had to do every day as part of your duties and why you enjoyed them or did not enjoy them
How you got on with your boss and the other employees
How you think this experience will help you in the future
OPTION 2
You recently watched a Spanish film at an international film festival. You have been asked to write a review of the film, in Spanish, for a Spanish website. You must include the following information:
An outline of the plot of the film and a description of one of the major scenes
A description of the main characters and why you liked or disliked them
What the themes were and which one you considered to be the most important
Why you would recommend the film to other young people
Example 2 Higher SQP: Media
Attempt BOTH questions
You should use different examples of media content in your response to each question.
1. Media Content in Context
How audiences respond to genre texts can depend on the mixture of expected and unexpected elements within them. Analyse how this statement could apply to media content you have studied. In your response you must cover:
a) the ways in which genre markers are evident in narrative structures, codes and/or conventions (10 marks)
b) the ways in which genre markers are evident in at least one other key aspect from categories, language or representation (10 marks)
c) the ways in which different audiences might respond to expected and unexpected elements of the genre (10 marks)
You can use the bullet points to structure your response, or integrate your responses to the bullet points in any appropriate way.
2. The Role of Media
The media is consistently criticised as being intrusive, out of control or problematic in some other way. Often, the response from the media is that it is simply meeting needs. Discuss this statement with reference to media content you have studied.
Example 3 Higher SQP: Politics – Section 1 Political Theory
Answer either Question B1 or Question B2 and Question B3
B1
Compare the importance of the Executive in making policy, with reference to two political systems you have studied.
In your answer you should compare three aspects of policy making.
12 marks
B2
Compare the importance of the Judiciary, with reference to two political systems you have studied.
In your answer you should identify three aspects of the Judiciary.
12 marks
Example 4 Higher SQP: Biology – Section 2
8. Answer either A OR B.
A Describe how animals survive adverse conditions.
OR B Describe recombinant DNA technology.
Labelled diagrams may be used where appropriate
Example 5 National 5 English Critical Reading
Total marks – 40
SECTION 1 – Scottish Text – 20 marks
Read an extract from a Scottish text you have previously studied and attempt the questions.
Choose ONE text from either
Part A – Drama, Pages 2–7
or
Part B – Prose, Pages 8–17
or
Part C – Poetry, Pages 18–25
Attempt ALL the questions for your chosen text.
SECTION 2 – Critical Essay – 20 marks
Write ONE critical essay on a previously studied text from the following genres: Drama, Prose, Poetry, Film and Television Drama, or Language.
Your answer must be on a different genre from that chosen in Section 1.
Motivations for introducing optionality (intended benefits)
To allow flexibility in curriculum coverage
To maximise learning motivation and test motivation for candidates
To provide opportunities for candidates to build on their strengths and interests
Historic debates about the value of optionality – some evidence from the literature:
Should optional questions be used in examinations? Stalnaker, J.W., School and Society, 1935
Optional questions in tests and examinations. Devadson, M.D., Teacher Education, 1963
Test performance and the use of optional questions. Ducette, J. & Wolk, S., Journal of Experimental Education, 1972
Question choice in examinations: an experiment in geography and science. Taylor, E.G. & Nuttall, D.I., Educational Research, 1974
O level examined: the effect of question choice. Willmott, A.S. & Hall, C.G. Schools Council, 1975.
Issues: validity, reliability, fairness
Validity (comparability of demand)
Wisdom of candidates’ choices*
Reliability
Grading (all candidates in the undifferentiated test score distribution)
Fairness to candidates
* Willmott, A.S. & Hall, C.G. (1975) O level examined: the effect of question choice. Schools Council.
Candidates do not always make the wisest question choices
Principal sources of mark variation in assessment
Mark variation in testing situations principally arises from genuine between-candidate differences in the construct being assessed (e.g. history knowledge, mathematical ability, investigation skills), but also from between-question differences (in apparent difficulty), between-marker differences (in standards), and “interaction effects” (e.g. markers marking more or less severely than others on particular questions, or candidates performing better than others on some but not all questions – “jagged profiles”).
Interaction effects contribute to measurement error in candidate ranking applications, including examination-based grading. Where candidates can choose to respond to different questions, grading validity issues also arise.
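The decomposition described above can be illustrated with a minimal sketch. It assumes a fully crossed candidates × questions design with a single mark per cell and no marker facet, so the candidate-question interaction cannot be separated from residual error. The function name `variance_components` and the toy marks are invented for illustration and are not taken from the studies cited in this presentation.

```python
from statistics import mean


def variance_components(scores):
    """Estimate variance components for a crossed candidates x questions
    design with one mark per cell (rows = candidates, columns = questions).
    Uses the standard two-way ANOVA expected mean squares."""
    n_p, n_q = len(scores), len(scores[0])
    grand = mean(x for row in scores for x in row)
    cand_means = [mean(row) for row in scores]
    ques_means = [mean(row[j] for row in scores) for j in range(n_q)]

    # Mean squares for candidates, questions and the residual
    ms_cand = n_q * sum((m - grand) ** 2 for m in cand_means) / (n_p - 1)
    ms_ques = n_p * sum((m - grand) ** 2 for m in ques_means) / (n_q - 1)
    ms_res = sum(
        (scores[i][j] - cand_means[i] - ques_means[j] + grand) ** 2
        for i in range(n_p)
        for j in range(n_q)
    ) / ((n_p - 1) * (n_q - 1))

    return {
        # Negative estimates are truncated at zero, a common convention
        "candidate": max((ms_cand - ms_res) / n_q, 0.0),
        "question": max((ms_ques - ms_res) / n_p, 0.0),
        # Interaction is confounded with error when there is one mark per cell
        "interaction_plus_error": ms_res,
    }


# Invented marks: three candidates, two questions, perfectly "flat" profiles
toy_marks = [[1, 3], [2, 4], [3, 5]]
components = variance_components(toy_marks)
```

In this toy data every candidate gains the same two marks on the second question, so the interaction component is zero. Swapping the two marks of one candidate produces a "jagged profile" and a non-zero interaction term.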
Example 1: A 2-section electrotechnical qualification paper
Section A – 20 3-mark short-answer questions
Section B – 6 15-mark structured questions
No optionality
Example 1 continued: The total mark distribution for the electrotechnical paper
Example 1 continued: % contributions to score variation in the two sections
Johnson, S., Johnson, R., Miller, L. & Boyle, A. (2013) Reliability of Vocational Assessment: An evaluation of level 3 electro-technical qualifications. Coventry: Ofqual.
Example 1 continued: % contributions to score variation in the two sections
Section B differentiated more among the candidates than did Section A, while there was more between-question variation in Section A than in Section B. In both sections more than half of the variation in candidate-question-marker scores could be attributed to candidate-question interaction, i.e. inconsistent performances by individual candidates across the questions (examiners put this down to poor candidate preparation).
Example 1 continued: Reliability measures for the two paper sections
                  Section A                  Section B
                  (20 3-mark questions)      (6 15-mark questions)
                  Phi     95% CI*            Phi     95% CI*
Single marking    0.71    ± 9.4              0.66    ± 18.0
Double marking    0.73    ± 9.0              0.68    ± 17.1
* These are given as marks around candidates’ section total scores
Reliability for the whole paper: 0.71 (95% CI around total test mark ± 20.3)
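As a rough illustration of how a phi (dependability) coefficient and a 95% confidence interval around a total score relate to variance components, the sketch below uses the textbook formulas for a simple candidates × questions design. It is an assumption-laden simplification: the Ofqual study also modelled markers, which this sketch omits, and the function name `phi_and_ci` and the input values are hypothetical.

```python
from math import sqrt


def phi_and_ci(var_cand, var_ques, var_res, n_questions, z=1.96):
    """Return (phi, half-width of the CI around a total score).

    Variance components are on the single-question mark scale;
    var_res is the candidate-question interaction confounded with error.
    """
    # Absolute error variance of a candidate's mean mark over the questions
    abs_error_mean = (var_ques + var_res) / n_questions
    # Phi: share of absolute-decision variance due to true candidate differences
    phi = var_cand / (var_cand + abs_error_mean)
    # Standard error of measurement rescaled from the mean to the total score
    sem_total = n_questions * sqrt(abs_error_mean)
    return phi, z * sem_total


# Hypothetical variance components, not taken from the report
phi, ci_half_width = phi_and_ci(var_cand=4.0, var_ques=1.0, var_res=3.0, n_questions=4)
```

With these invented inputs the absolute error variance of the mean is 1.0, so phi is 0.8 and the interval is about ± 7.84 marks around the total. A larger interaction component lowers phi and widens the interval.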
Example 2: A GCE history question paper
one of several alternative Unit 1 papers
presented three extended response questions
candidates were required to answer two questions
so in practice there were three different pathways through the paper
each question was worth 60 marks for a 120-mark paper total
Example 2 continued: Total mark distribution for the history paper
Example 2 continued: Reliability statistics for the three history papers
Johnson, S. & Johnson, R. (2012) Component reliability in GCSE and GCE. Coventry: Ofqual.
Candidates who opted for pathway q2q3 were more reliably assessed than those who opted for pathway q1q2 or pathway q1q3
Example 3: A GCE AS geography paper
a 2-section paper
two three-part open-ended questions in each section
allowed candidates to choose between items of mandatory content
candidates were required to answer one question from each section
so in practice there were four different pathways through the paper
each question was worth 35 marks for a 70-mark paper total
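The pathway counts for the two optional designs can be checked with a short sketch. The question labels follow those used in this presentation (q1 to q3 for the history paper, q1 to q4 for the geography paper); the variable names are illustrative only.

```python
from itertools import combinations, product

# History paper: answer any two of three questions
history_pathways = list(combinations(["q1", "q2", "q3"], 2))

# Geography paper: answer one question from each of two sections
section_a = ["q1", "q2"]
section_b = ["q3", "q4"]
geography_pathways = list(product(section_a, section_b))

print(len(history_pathways), history_pathways)
print(len(geography_pathways), geography_pathways)
```

Each pathway is in effect a different test, so reliability needs to be examined separately for each one.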
Example 3 continued: Mark distributions for the geography paper (2009-2011; same structure, different content!)
Example 3: Reliability indices for the embedded geography papers
Baird, J-A., Hayes, M., Johnson, R., Johnson, S. & Lamprianou, I. (2013) Marker effects and examination reliability. Coventry: Ofqual.
Pathways through the paper               q1+q3   q1+q4   q2+q3   q2+q4
Average % candidates per pathway/year    4%      28%     8%      60%
Reliability coefficients
2009                                     0.70    0.63    0.66    0.58
2010                                     0.60    0.56    0.50    0.56
2011                                     0.72    0.65    0.59    0.62
Pathways q1q4 and q2q4 were very much more popular than pathways q1q3 and q2q3. The reliability statistics look similar across pathways, but the analysis could not take into account all the measurement error contributions, since marking was clip-based (i.e. part-questions were electronically randomly allocated to markers for marking).
Example 3: Reliability indices for part-questions in the embedded geography papers
Reliability coefficients    q2a     q2b     q2c     q4a     q4b     q4c
Relative measurement        0.44    0.42    0.70    0.52    0.54    0.69
Absolute measurement        0.41    0.38    0.66    0.49    0.48    0.65
NB. The lesser popularity of optional pathways involving q1 or q3, combined with the systematic random allocation of seeded clips to markers from across the whole candidate entry, meant that there were too few multiply marked clips available for the reliability of marking of q1 and q3 to be explored.
Some questions to conclude:
Should optionality be encouraged or eliminated?
If continued in future paper design – what forms would be most appropriate?
How can comparability in level of knowledge/skill demand be assured across optional choices?
How should marker reliability studies be modified to accommodate differential option popularity?
How soon can comprehensive reliability studies be introduced to look at optional pathways?
Without question pretesting, what might be done to give all candidates fair grading outcomes in the context of optionality?