Optionality: forms, intended benefits and potential issues

Sandra Johnson, Assessment Europe

SQA Research seminar, October 2013

What is “optionality”?

Any feature in examinations that allows different candidates for the same qualification to achieve that qualification through different assessment routes

Forms of optionality

Appears in the context of the question rather than the content

Allows candidates to choose between items of mandatory content

Reflects optional content in courses

Embedded within questions

Parallel papers/units

Class-based tasks

Example 1: Higher SQP: Spanish Reading and Writing – Section 2 Writing

Choose one of the following four writing scenarios on the contexts of employability, culture, culture and learning that you have studied in the course. Write approximately 120-150 words.

OPTION 1
You have just come back from a summer job in Spain and you have been asked to write a report for your school’s/college’s Spanish webpage. You must include the following information:

Where you worked and how you got there every day
What you had to do every day as part of your duties and why you enjoyed them or did not enjoy them
How you got on with your boss and the other employees
How you think this experience will help you in the future

OPTION 2
You recently watched a Spanish film at an International film festival. You have been asked to write a review of the film, in Spanish, for a Spanish website. You must include the following information:

An outline of the plot of the film and a description of one of the major scenes
A description of the main characters and why you liked or disliked them
What the themes were and which one you considered to be the most important
Why you would recommend the film to other young people

Example 2: Higher SQP: Media

Attempt BOTH questions
You should use different examples of media content in your response to each question.

1. Media Content in Context
How audiences respond to genre texts can depend on the mixture of expected and unexpected elements within them. Analyse how this statement could apply to media content you have studied. In your response you must cover:
a) the ways in which genre markers are evident in narrative structures, codes and/or conventions (10 marks)
b) the ways in which genre markers are evident in at least one other key aspect from categories, language or representation (10 marks)
c) the ways in which different audiences might respond to expected and unexpected elements of the genre (10 marks)
You can use the bullet points to structure your response, or integrate your responses to the bullet points in any appropriate way.

2. The Role of Media
The media is consistently criticised as being intrusive, out of control or problematic in some other way. Often, the response from the media is that it is simply meeting needs. Discuss this statement with reference to media content you have studied.

Example 3: Higher SQP: Politics – Section 1 Political Theory

Answer either Question B1 or Question B2 and Question B3

B1

Compare the importance of the Executive in making policy, with reference to two political systems you have studied.

In your answer you should compare three aspects of policy making.

12 marks

B2

Compare the importance of the Judiciary, with reference to two political systems you have studied.

In your answer you should identify three aspects of the Judiciary.

12 marks

Example 4: Higher SQP: Biology – Section 2

8. Answer either A OR B.

A Describe how animals survive adverse conditions.

OR B Describe recombinant DNA technology.

Labelled diagrams may be used where appropriate

Example 5: National 5 English Critical Reading

Total marks – 40

SECTION 1 – Scottish Text – 20 marks
Read an extract from a Scottish text you have previously studied and attempt the questions.
Choose ONE text from either
Part A – Drama, Pages 2–7
or
Part B – Prose, Pages 8–17
or
Part C – Poetry, Pages 18–25

Attempt ALL the questions for your chosen text.

SECTION 2 – Critical Essay – 20 marks
Write ONE critical essay on a previously studied text from the following genres: Drama, Prose, Poetry, Film and Television Drama, or Language.

Your answer must be on a different genre from that chosen in Section 1.

Motivations for introducing optionality (intended benefits)

To allow flexibility in curriculum coverage

To maximise learning motivation and test motivation for candidates

To provide opportunities for candidates to build on their strengths and interests

Historic debates about the value of optionality – some evidence from the literature:

Should optional questions be used in examinations? Stalnaker, J.W., School and Society, 1935

Optional questions in tests and examinations. Devadson, M.D., Teacher Education, 1963

Test performance and the use of optional questions. Ducette, J. & Wolk, S., Journal of Experimental Education, 1972

Question choice in examinations: an experiment in geography and science. Taylor, E.G. & Nuttall, D.L., Educational Research, 1974

O level examined: the effect of question choice. Willmott, A.S. & Hall, C.G., Schools Council, 1975

Issues: validity, reliability, fairness

Validity (comparability of demand)

Wisdom of candidates’ choices*

Reliability

Grading (all candidates in the undifferentiated test score distribution)

Fairness to candidates

* Willmott, A.S. & Hall, C.G. (1975) O level examined: the effect of question choice. Schools Council.

Candidates do not always make the wisest question choices

Principal sources of mark variation in assessment

Mark variation in testing situations principally arises from genuine between-candidate differences in the construct being assessed (e.g. history knowledge, mathematical ability, investigation skills), but also from between-question differences (in apparent difficulty), between-marker differences (in standards), and “interaction effects” (e.g. markers marking more or less severely than others on particular questions, or candidates performing better than others on some but not all questions, giving “jagged profiles”).

Interaction effects contribute to measurement error in candidate ranking applications, including examination-based grading. Where candidates can choose to respond to different questions, grading validity issues also arise.
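The decomposition described above can be illustrated with a minimal sketch. It assumes a simplified design in which every candidate answers every question and each response is marked once, so marker effects are left out and the candidate-question interaction cannot be separated from residual error. The function name and the marks are hypothetical; this is not the analysis used in the studies cited in these slides.

```python
from statistics import mean


def variance_components(scores):
    """Estimate variance components for a crossed candidates x questions design.

    scores[i][j] is the mark of candidate i on question j. Every candidate
    answers every question and there is one mark per cell (an assumption).
    """
    n_p, n_q = len(scores), len(scores[0])
    grand = mean(x for row in scores for x in row)
    cand_means = [mean(row) for row in scores]
    quest_means = [mean(row[j] for row in scores) for j in range(n_q)]

    # Mean squares from a two-way ANOVA without replication
    ms_p = n_q * sum((m - grand) ** 2 for m in cand_means) / (n_p - 1)
    ms_q = n_p * sum((m - grand) ** 2 for m in quest_means) / (n_q - 1)
    ss_res = sum(
        (scores[i][j] - cand_means[i] - quest_means[j] + grand) ** 2
        for i in range(n_p)
        for j in range(n_q)
    )
    ms_res = ss_res / ((n_p - 1) * (n_q - 1))

    # Expected-mean-square solutions; negative estimates are set to zero
    return {
        "candidate": max((ms_p - ms_res) / n_q, 0.0),
        "question": max((ms_q - ms_res) / n_p, 0.0),
        "interaction+error": ms_res,
    }


# Hypothetical marks: 4 candidates x 3 questions, with "jagged" profiles
marks = [
    [3, 2, 1],
    [2, 3, 1],
    [1, 1, 0],
    [3, 3, 2],
]
components = variance_components(marks)
total = sum(components.values())
shares = {name: 100 * value / total for name, value in components.items()}
```

The `shares` dictionary gives each source's percentage contribution to score variation, the kind of breakdown the examples that follow report for real papers.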

Example 1: A 2-section electrotechnical qualification paper

Section A – 20 3-mark short-answer questions

Section B – 6 15-mark structured questions

No optionality

Example 1 continued: The total mark distribution for the electrotechnical paper

Example 1 continued: % contributions to score variation in the two sections

Johnson, S., Johnson, R., Miller, L. & Boyle, A. (2013) Reliability of Vocational Assessment: An evaluation of level 3 electro-technical qualifications. Coventry: Ofqual.

Section B differentiated more among the candidates than did Section A, while there was more between-question variation in Section A than in Section B. In both sections more than half of the variation in candidate-question-marker scores could be attributed to candidate-question interaction, i.e. inconsistent performances by individual candidates across the questions (examiners put this down to poor candidate preparation).

Example 1 continued: Reliability measures for the two paper sections

                  Section A                  Section B
                  (20 3-mark questions)      (6 15-mark questions)
                  Phi      95% CI*           Phi      95% CI*
Single marking    0.71     ± 9.4             0.66     ± 18.0
Double marking    0.73     ± 9.0             0.68     ± 17.1

* These are given as marks around candidates’ section total scores

Reliability for the whole paper: 0.71 (95% CI around total test mark ± 20.3)
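To relate the Phi coefficients to the confidence intervals in marks, the sketch below assumes that each interval is ±1.96 times an absolute standard error of measurement (SEM), and that Phi is the ratio of universe-score variance to universe-score variance plus squared SEM. The slides do not state how the intervals were derived, so both relationships and the function names are assumptions for illustration only.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval (assumption)


def implied_sem(ci_half_width, z=Z_95):
    """SEM in marks implied by a confidence-interval half-width."""
    return ci_half_width / z


def implied_universe_sd(phi, ci_half_width, z=Z_95):
    """Universe-score SD implied by phi = var_u / (var_u + sem**2)."""
    sem = implied_sem(ci_half_width, z)
    return math.sqrt(phi / (1 - phi)) * sem


# Single-marking figures taken from the table above
section_a_sem = implied_sem(9.4)
section_b_sem = implied_sem(18.0)
section_a_sd = implied_universe_sd(0.71, 9.4)
section_b_sd = implied_universe_sd(0.66, 18.0)
```

Under these assumptions, the wider interval for Section B reflects a larger SEM in marks on a section with a higher maximum score, even though the two Phi values are close.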

Example 2: A GCE history question paper

one of several alternative Unit 1 papers

presented three extended response questions

candidates were required to answer two questions

so in practice there were three different pathways through the paper

each question was worth 60 marks for a 120-mark paper total

Example 2 continued: Total mark distribution for the history paper

Example 2 continued: Reliability statistics for the three history papers

Johnson, S. & Johnson, R. (2012) Component reliability in GCSE and GCE. Coventry: Ofqual.

Candidates who opted for pathway q2q3 were more reliably assessed than those who opted for pathway q1q2 or pathway q1q3

Example 3: A GCE AS geography paper

a 2-section paper

two three-part open-ended questions in each section

allowed candidates to choose between items of mandatory content

candidates were required to answer one question from each section

so in practice there were four different pathways through the paper

each question was worth 35 marks for a 70-mark paper total
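The pathway counts quoted for the history and geography papers follow directly from the choice rules, as this short sketch shows. The assignment of q1 and q2 to one geography section and q3 and q4 to the other is inferred from the pathway labels used in these slides.

```python
from itertools import combinations, product

# History paper: three questions, candidates answer any two
history_pathways = ["".join(pair) for pair in combinations(["q1", "q2", "q3"], 2)]

# Geography paper: one question from each of two sections
# (section membership inferred from the pathway labels)
section_1 = ["q1", "q2"]
section_2 = ["q3", "q4"]
geography_pathways = ["+".join(pair) for pair in product(section_1, section_2)]

print(history_pathways)    # ['q1q2', 'q1q3', 'q2q3']
print(geography_pathways)  # ['q1+q3', 'q1+q4', 'q2+q3', 'q2+q4']
```

Each pathway is in effect a different test, so reliability has to be estimated separately for each one.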

Example 3 continued: Mark distributions for the geography paper (2009-2011; same structure, different content!)

Example 3: Reliability indices for the embedded geography papers

Baird, J-A., Hayes, M., Johnson, R., Johnson, S. & Lamprianou, I. (2013) Marker effects and examination reliability. Coventry: Ofqual.

Pathways through the paper                q1+q3   q1+q4   q2+q3   q2+q4

Average % candidates per pathway/year     4%      28%     8%      60%

Reliability coefficients
  2009                                    0.70    0.63    0.66    0.58
  2010                                    0.60    0.56    0.50    0.56
  2011                                    0.72    0.65    0.59    0.62

Pathways q1q4 and q2q4 were very much more popular than pathways q1q3 and q2q3. The reliability statistics look similar, but the analysis could not take into account all the measurement error contributions, since marking was clip-based (i.e. part-questions were electronically randomly allocated to markers for marking).

Example 3: Reliability indices for part-questions in the embedded geography papers

Reliability coefficients    q2a     q2b     q2c     q4a     q4b     q4c

Relative measurement        0.44    0.42    0.70    0.52    0.54    0.69
Absolute measurement        0.41    0.38    0.66    0.49    0.48    0.65

NB. The lesser popularity of optional pathways involving q1 or q3, combined with the systematic random allocation of seeded clips to markers from across the whole candidate entry, meant that there were too few multiply marked clips available for the reliability of marking of q1 and q3 to be explored
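For readers unfamiliar with the distinction between relative and absolute measurement, the usual generalizability theory definitions are sketched below for the simplest case: candidates (p) crossed with a single facet i, such as markers, with n_i conditions per candidate. The slides do not spell out the design behind the part-question coefficients, so this single-facet form and its notation are assumptions for illustration.

```latex
% Relative coefficient: only error that changes candidate rank order counts
\[
E\rho^2 = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_\delta},
\qquad
\sigma^2_\delta = \frac{\sigma^2_{pi,e}}{n_i}
\]
% Absolute coefficient: the facet main effect also counts as error
\[
\Phi = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_\Delta},
\qquad
\sigma^2_\Delta = \frac{\sigma^2_i}{n_i} + \frac{\sigma^2_{pi,e}}{n_i}
\]
```

Because the absolute error variance includes every term in the relative error variance plus the facet main effect, the absolute coefficient can never exceed the relative one. This is consistent with the table, where each absolute value is below its relative counterpart.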

Some questions to conclude:

Should optionality be encouraged or eliminated?

If continued in future paper design – what forms would be most appropriate?

How can comparability in level of knowledge/skill demand be assured across optional choices?

How should marker reliability studies be modified to accommodate differential option popularity?

How soon can comprehensive reliability studies be introduced to look at optional pathways?

Without question pretesting, what might be done to give all candidates fair grading outcomes in the context of optionality?
