
HANDBOOK FOR CONSTRUCTING AND GRADING COURSE

ASSESSMENT1

First edition, May 2019

Examination Board Tilburg School of Economics and Management

1 This TiSEM Handbook is based on the TSB Handbook for Constructing and Grading Exams; second edition. Students cannot derive any rights from this Handbook. It is composed for internal use (i.e. teaching staff) only.


TABLE OF CONTENTS

Preface
Essentials for good assessment
Prologue – Introduction to course assessment methods and types
Chapter 1 – Multiple choice exams
Chapter 2 – Open-ended question exams
Chapter 3 – Oral exams
Chapter 4 – Written products
Chapter 5 – Oral presentations
Chapter 6 – Group assignments
Chapter 7 – Open-book exams and Take-home exams
Chapter 8 – Bachelor's and Master's thesis
Chapter 9 – Rules and Guidelines TiSEM
Appendix I: Grading
Appendix II: Checklist closed questions
Appendix III: Checklist open-ended questions
Appendix IV: Avoiding Free Riding Behavior
Appendix V: Checklist Specification Table
Appendix VI: Template for writing assignment descriptions
Appendix VII: Checklist for designing and assessing group work
References (selection)


PREFACE

The quality of exams is of continued and critical importance for all lecturers and is the key responsibility of the Examination Board. The legislator has also increased the responsibilities of examination boards in ensuring the quality of assessments. Given this responsibility, we consider the support of lecturers when constructing and grading exams an important proactive policy instrument. This is why the Examination Board of the Tilburg School of Economics and Management (TiSEM) has constructed this Handbook for Constructing and Grading Course Assessment.

The handbook discusses a number of key topics that virtually all lecturers deal with in their work, such as the choice between open-ended and multiple choice exams, the use of mid-terms, incorporating a guessing correction in a grading standard, how to create a specification table for a test, how to determine whether the quality of an exam question is sufficient, and how to avoid free-rider behavior. It specifies a number of do's and don'ts for all lecturers and stipulates a more common interpretation of assessment procedures across TiSEM.

The handbook builds on and clarifies the Education and Examination Regulations (EER) of TiSEM and the Rules and Guidelines (R&G) of the Examination Board of TiSEM. Lecturers are encouraged to also read the EER and R&G (see Chapter 9 for links). The handbook serves as a set of guidelines from which lecturers may diverge when this is justifiable. Readers may notice overlap between chapters; this is deliberate, to allow chapters to be read individually. This handbook is based on an earlier initiative of the Tilburg School of Social and Behavioral Sciences (TSB), is a collection of important readings found in the literature, and is adapted, where necessary, to the policy and practice of TiSEM.
The TiSEM Examination Board wishes to express its gratitude to the colleagues from TSB (Roel Rutten c.s.), AS (Amy Hsiao) and EST (Yvonne de Vries), for their support and allowing us to build on their initial work. Since a handbook like this is always work-in-progress, we invite you to send your comments, remarks and suggestions on the handbook to [email protected] so we can use them in future updates of this handbook. Examination Board, Tilburg School of Economics and Management


ESSENTIALS FOR GOOD ASSESSMENT

No matter what type of test is used to assess the knowledge, skills and attitude of students, there are some essential issues that should be taken into account at all times. These assessment essentials are listed below. Please note that in this handbook the length of multiple choice and/or open-ended question exams is based on a three-hour format.

ASSESSMENT ESSENTIALS

1) Exams should enable the ordering of students with respect to their abilities, and eventually divide the students into two groups: those who sufficiently master the subject (pass) and those who do not (fail). Although ideal exams do not exist, lecturers should aim for exams that satisfy this objective as best they can.

2) Assessment involves four quality criteria: validity, reliability, objectivity and transparency. A good test should meet all four criteria.

3) Creating and using a specification table when constructing exam questions is required, and helps meet the four quality criteria. A checklist for creating a specification table is included in Appendix V of this handbook.

4) When constructing a test, always keep in mind that guessing should not be rewarded. A guessing correction is not restricted to multiple choice exams; it may also apply to other assessment types, including open-ended question exams.

5) Each assessment should be peer reviewed a priori by a subject-matter expert (i.e., a second check, the so-called "four-eyes principle").

6) Instructions for students on each assessment type must clearly specify which content is assessed and how the content is assessed.

7) No assessment type performs best on all learning goals. Consequently, the use of a variety of assessment types is highly recommended, both within and across courses.

8) Students have the right to inspect their own work after it has been graded. However, this right of inspection should not result in an "extended" examination or an interminable chain of personal statements. The answer key is decisive.


PROLOGUE – INTRODUCTION TO COURSE ASSESSMENT METHODS AND TYPES
By Amy Hsiao, Assessment Specialist, Teacher Development

This introduction explains the two purposes of assessments (formative vs summative), two assessment methods (examination vs coursework), as well as commonly used assessment types.

A. Assessment purpose

There are two purposes of assessment:

a. Formative assessment (also called assessment for learning)

The purpose of formative assessment is to help both teachers and students identify discrepancies between current understanding/performance and the targeted learning goals (Hattie & Timperley, 2007).

The results of formative assessments are not only used to feed back on how students are learning and how teachers are teaching, but also to feed forward on how both students and teachers can effectively reduce the discrepancies and achieve the targeted learning goals. Formative assessment provides teachers with opportunities to recognize where students are struggling and to address problems immediately, so that teachers' feedback can appropriately meet students' needs.

In short, the focus of formative assessment is to monitor students’ progress towards the targeted learning goals. The results of formative assessment are usually not used for grading purposes.

b. Summative assessment (also called assessment of learning)

The purpose of summative assessment is to determine how well students have achieved the course learning goals (or program learning outcomes in the case of a thesis or graduation project). The results of summative assessment are used for assigning a grade and to decide who passed the course and who failed.

Teachers need to account for the quality of their assessments to internal and external users of the assessment results. In line with this, the results of summative assessment can be used formatively by both students and teaching staff. Students can use the results to guide their efforts and activities in subsequent courses. Teachers can use the results to evaluate the effectiveness of their course design: Are the learning goals set too high (difficult) or too low (easy)? Do the teaching and learning activities develop students' content knowledge and thinking skills? Does the assessment appropriately measure the learning goals? Based on the evaluation results, teachers can adjust their course design in the future. In addition, teachers in subsequent courses can use the results from previous courses to design their own course.

The Academic Directors of study programs can use the summative assessment results to evaluate the appropriateness of their curriculum design and program assessment plan.


B. Assessment methods

In general, there are two distinct methods for implementing course summative assessment at Tilburg University: examination and coursework. Examination is taken by individual students, whereas coursework can be carried out by either individuals or small groups.

Figure 1. Course summative assessment methods

Examination

An examination is an assessment method which relies upon individual students producing written or oral answers to questions.

An exam can be administered at different moments in the semester:

- Interim exams (mid-terms) take place during the semester or study unit.
- Final exams take place at the end of a semester or study unit.

An exam can be administered with or without invigilation:

- Invigilated exams take place under formal examination conditions: students complete the exam under invigilation in a university room, within a fixed duration (typically three hours).
- Take-home exams are completed at the students' own location, without invigilation. Teachers should inform students of the rules (set by the school) regarding individual work, originality of the exam answers, and submission, and must be able to guarantee that students comply with these rules and that the assessment quality criteria (see the assessment essentials listed earlier) are met.

[Figure 1 depicts the taxonomy of course summative assessment methods: Examination (individual), administered as interim (mid-term) or final, and invigilated or take-home; and Coursework, carried out individually or in a group.]


Invigilated exams

Under formal examination conditions, there are multiple ways to administer invigilated exams (Race, 2001), including:

- Seen exams provide students with the exam questions some time before the exam date. Students then prepare their answers before writing them down in a formal invigilated examination environment.
- Unseen exams do not release the exam questions before the exam date. Students are required to answer questions based upon what they have learned over the course of their academic study.
- Open-book exams allow students to use course materials to complete the exam. Students can bring course materials such as the textbook, notebook, or workbook, and refer to these materials during the exam.
- Closed-book exams do not allow students to refer to any course material.

Invigilated exams play a large part in the overall assessment picture at Tilburg University. The most commonly used form is the combination of an unseen and closed-book exam.

Coursework

Next to examination, coursework refers to all other assessment methods, including practical work, submission of essays, papers, reports, presentations, class tests, project or production of artefacts, design, etc. Coursework can be carried out by either an individual or a small group of students.

Please note that group work is different from group assessment. Group work refers to activities where students work together on a task. Group assessment refers to an item or element of assessment that has been completed by members of a group and for which group members are all assigned the same grade.

Group assessment should preferably not exceed 40% of the final course grade.

If group work is to be used either as preparation for an individual product or in preparing a full group product with a shared grade, then it has to be stated in the course manual. The course manual should include a clear statement of how the grade involving group work will be calculated.
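As an illustration of such a statement in a course manual, the weighting of an individual and a group component can be written out as a small computation. This is a hypothetical sketch (the function name and example weights are ours, not a TiSEM formula); it simply enforces the recommendation that group assessment not exceed 40% of the final course grade.

```python
def final_grade(individual_grade: float, group_grade: float,
                group_weight: float = 0.4) -> float:
    """Weighted final course grade from an individual and a group component.

    group_weight is the share of the final grade determined by group
    assessment; per the handbook's recommendation it should not exceed 0.4.
    """
    if not 0 <= group_weight <= 0.4:
        raise ValueError("group assessment should not exceed 40% of the final grade")
    return (1 - group_weight) * individual_grade + group_weight * group_grade
```

For example, with a 25% group component, an individual 7.0 and a group 9.0 combine to a 7.5.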

C. Assessment types

This section summarizes the common assessment types (i.e., types of assessment tasks) used at Tilburg University. Each assessment type can be used for either formative or summative assessment, and as either examination or coursework. For each assessment type, the definition, the targeted cognitive skills of Bloom's taxonomy, and the targeted learning outcomes (based on the Dublin Descriptors) are described.


S/C: Selected- or Constructed-response questions

For each assessment type below, the definition is followed by the targeted cognitive skills of Bloom's taxonomy (K = knowledge, C = comprehension, App = application, Ana = analysis, E = evaluation, S = synthesis) and the targeted learning outcomes based on the Dublin Descriptors2.

S – Multiple-choice questions
Multiple choice questions (MCQs) are closed questions for which students are asked to select one or more of the choices from a list of answers.
Bloom: K, C, App. Dublin Descriptors: KU, AKU, MJ.

C – Restricted-response essay questions / Short answer questions
Short answer questions are open-ended questions that require students to construct a response in one word, one sentence, or a few lines.
Bloom: C, App. Dublin Descriptors: KU, AKU, MJ.

C – Extended-response essay questions
Extended-response questions are open-ended questions that require students to construct a response in a paragraph or short composition.
Bloom: Ana, E, S. Dublin Descriptors: KU, AKU, MJ, CS.

The following assessment types measure two different qualities:
- students' oral/writing communication skills in general or language skills in particular (i.e., students' command of the oral/written medium itself);
- students' thinking skills regarding the course-specific content knowledge as demonstrated through the oral/written medium.
Course assessment should focus on the second quality, unless communication or language skills are essential course learning goals.

C – Essay
Essays require students to construct a debate around a particular issue, compare two or more related ideas, or persuade readers of a particular argument or position. Essays should be written in an academic style and formatted in a pre-defined structure (usually provided by the teacher).
Bloom: Ana, E, S. Dublin Descriptors: AKU, MJ, CS, LS.

C – Literature review
A literature review requires students to critically survey the academic publications and research on a particular topic. A literature review is often, but not always, part of a larger research project. It should be written in an academic style and formatted in a pre-defined structure (usually provided by the teacher).
Bloom: Ana, E, S. Dublin Descriptors: AKU, MJ, CS, LS.

C – Oral exam
Oral exams require students to give spoken responses to questions from one or more examiners. They are most notably used in "medicine with its clinical assessment, law with its mooting or mock trials and architecture with its design juries" (Joughin, 1998).
Bloom: K, C, App, Ana, E. Dublin Descriptors: AKU, MJ, CS, LS.

C – Presentation
Presentations require students to show and explain the content of a topic to an audience.
Bloom: K, C, App, Ana, E. Dublin Descriptors: AKU, MJ, CS, LS.

C – Report
Reports require students to identify and examine issues, events, or findings that have happened in a physical sense, such as events that have occurred within an organization, or findings from a research investigation. Reports should be organized concisely in a pre-defined structure (usually provided by the teacher).
Bloom: Ana, E, S. Dublin Descriptors: AKU, MJ, CS, LS.

2 Knowledge and Understanding (KU); Applying knowledge and understanding (AKU); Making judgements (MJ); Communication skills (CS); Learning skills (LS)


Chapter 1 – Multiple choice exams

Constructing and grading exams

We distinguish two stages: (A) constructing the exam and the initial grading standard, and (B) determining the final grading standard. For both stages we formulate some guidelines. Background on some of the guidelines can be found in the Appendix on grading (Appendix I).

(A) Constructing the exam and the initial grading standard

The four guidelines are summarized in Box 1.1.

BOX 1.1: GUIDELINES FOR CONSTRUCTING THE EXAM AND INITIAL GRADING STANDARD

1) Use a specification table to construct exam questions. Make sure that the test is representative for the weight assigned to each learning goal and the cognitive level specified in the table.
2) Construct a large number of items, preferably with many answering categories.
3) Let at least one other expert evaluate the constructed exam questions, and adapt the questions based on the evaluation of the expert(s).
4) Use a guessing correction or another grading standard to take into account that students can correctly answer questions by guessing.

Exams should (1) enable the ordering of students with respect to their abilities, and (2) divide the students into two groups: those who sufficiently master the subject (pass) and those who do not (fail). Although ideal exams do not exist, lecturers should aim for exams satisfying these two requirements as best they can. Two tools can be used: the number of questions (guideline 2) and the guessing correction (guideline 4). The first part of Appendix I illustrates how the number of questions affects the ordering of the students on the exam, whereas the second part of Appendix I illustrates how the guessing correction substantially lowers the probability that low-ability students pass the exam.

For multiple choice exams we recommend exams of at least 20 questions with four answering categories, or at least 50 questions with two answering categories (see Appendix I). If most or all of the questions can be answered using only recall, we recommend exams of at least 40 questions with four answering categories.

A well-known grading standard incorporating a guessing correction for multiple choice exams is

    grade = 10 × 1/(A − 1) × ((A/K) × X − 1) = 10 − Y × 10A / ((A − 1) × K)

with A the number of alternatives per question, K the number of questions, and X and Y the student's number of correct and false answers, respectively. That is, a student's grade can be computed either with the left-hand formula, using the number of correct answers X, or with the right-hand formula, using the number of false answers Y. The grade is set to zero when the outcome of the formulas is negative. This grading standard is based on the principle that students with no ability, who randomly guess all questions, should obtain the score zero. Without guessing correction, situations may arise where students with no or low ability have a high probability of passing the exam (see the second part of Appendix I).

For example, consider a student who answered 16 items correctly on an exam containing 20 items with 4 alternatives each. Applying the formulas above yields

    grade = 10 × 1/3 × ((4/20) × 16 − 1) = 7 1/3, or equivalently 10 − 4 × (40/60) = 7 1/3.
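For illustration, the two equivalent formulas above can be sketched in Python. This is a minimal sketch, not an official TiSEM tool; the function names and the explicit zero floor (following the rule that negative outcomes are set to zero) are ours.

```python
def mc_grade(A: int, K: int, X: int) -> float:
    """Guessing-corrected grade from the number of correct answers X.

    A: number of alternatives per question
    K: number of questions
    """
    grade = 10 * (1 / (A - 1)) * ((A / K) * X - 1)
    return max(grade, 0.0)  # negative outcomes are set to zero


def mc_grade_from_errors(A: int, K: int, Y: int) -> float:
    """Equivalent right-hand formula, using the number of false answers Y."""
    grade = 10 - Y * 10 * A / ((A - 1) * K)
    return max(grade, 0.0)
```

For the worked example (A = 4, K = 20, X = 16, and thus, assuming all questions were answered, Y = 4), both functions return 7 1/3.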

(B) Final grading

After grading the individual questions, lecturers should check whether these individual questions appear satisfactory, and whether the initial grading standard is satisfactory. Starting with the individual questions, there are three reasons to adapt the grading of, or discard, a question. One may consider discarding an individual question if it has at least one of the following three problems:

(1) Problems with the formulation of the question. These may show up after complaints by students, or after post-exam evaluation by the lecturer.

(2) The question was too difficult. For example, the proportion of correct answers to a multiple choice question was not larger than the guessing rate, 1/A.

(3) The RIT was not larger than zero. The RIT of an exam question is the correlation between the score on the question and the total score on the exam. A zero or negative RIT signifies that high-ability students have an equal or lower probability of doing well on this question than students with lower ability. For multiple choice exams, the RIT can be found in the output of the "item analysis" provided by the Team Exams.

After discarding an exam question, we recommend adapting the grading accordingly. An adaptation that is fair to students is to let all students pass the discarded question, while keeping the initial grading standard. This guarantees that a student who passed the exam before implementing the adaptation also does so after the adaptation.

The lecturer can also decide to modify the initial grading standard, for instance when the exam turned out to be far more difficult than expected, or when the average score under the initial grading standard is considerably lower than for previous exams (please compare regular exams with previous regular exam pass rates, and re-sits with previous re-sit pass rates). One easy way to adapt the grading standard is to adapt the guessing correction from above to

    grade = 10 − Y × 10A / (Corr × (A − 1) × K),

where Corr is a correction factor for the difficulty of the exam. For Corr = 1, the formula is identical to the initial grading standard with guessing correction, whereas for Corr > 1 it is more lenient. For instance, if Corr = 1.1, students can make 10% more errors and still obtain the same grade as under the initial standard. This correction is also available in the excel sheet which was referenced earlier in this chapter.
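The adjusted standard can be sketched the same way as before (a hedged illustration; the function name is ours, and Corr = 1 reproduces the initial guessing-corrected standard):

```python
def mc_grade_adjusted(A, K, Y, corr=1.0):
    """Guessing-corrected grade from the number of false answers Y,
    scaled by a difficulty correction factor corr (corr > 1 is more lenient:
    each error then costs fewer points).
    """
    grade = 10 - Y * 10 * A / (corr * (A - 1) * K)
    return max(grade, 0.0)  # negative outcomes are set to zero
```

With corr = 1.1, a student may indeed make 10% more errors for the same grade: mc_grade_adjusted(4, 20, 4.4, corr=1.1) equals mc_grade_adjusted(4, 20, 4).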

Please refrain from giving all students the same upgrade (e.g. one extra point), since students who already scored well on the test benefit less from such a correction than students who performed badly; that is unfair and inappropriate from a didactical perspective.


Chapter 2 – Open-ended question exams

Constructing and grading exams

We distinguish two stages: (A) constructing the exam and the initial grading standard, and (B) determining the final grading standard. For both stages we formulate some guidelines.

(A) Constructing the exam and the initial grading standard

The four guidelines are summarized in Box 2.1.

BOX 2.1: GUIDELINES FOR CONSTRUCTING THE EXAM AND INITIAL GRADING STANDARD

(1) Use a specification table to construct exam questions. Make sure that the test is representative for the weight assigned to each learning goal and cognitive level specified in the table.
(2) Construct a sufficient number of items.
(3) Let at least one other expert evaluate the constructed exam questions, and adapt the questions based on the evaluation of the expert(s).
(4) Use a guessing correction or another grading standard to take into account that students can correctly answer questions by guessing.

Exams should (1) enable the ordering of students with respect to their abilities, and (2) divide the students into two groups: those who sufficiently master the subject (pass) and those who do not (fail). Although ideal exams do not exist, lecturers should aim for exams satisfying these two requirements as best they can. Two tools they can use are the number of questions (guideline 2) and the guessing correction (guideline 4).

The first part of the Appendix on grading (Appendix I) illustrates how the number of questions affects the ordering of the students on exams; although the illustrations in Appendix I are based on exams with multiple choice questions, the same principles hold for open written exams. The more separate (sub)items, the more information there is to grade and order students. Take care to construct many independent items: wrongly answering one item should ideally not, by itself, affect the answers to subsequent items (although this cannot always be avoided). For exams with only open-ended questions, we recommend at least twelve short open-ended questions, i.e. (sub)questions that are graded independently and each cover content of their own. When an exam includes both open-ended and multiple choice questions, this recommendation can be combined with the recommendation for the number of multiple choice questions. For instance, if an exam consists of an open-ended part and a multiple choice part, each making up 50% of the final grade, we recommend at least six open-ended questions and 20 multiple choice questions.

A guessing correction is needed to take into account that students may provide correct (parts of) answers just by random elaboration (Clay, 2001). Without guessing correction, situations may arise where students with no or low ability have a high probability of passing the exam. The second part of Appendix I illustrates this in the context of multiple choice exams, but the same principle holds for open-ended questions. A grading standard with guessing correction for open-ended questions is: award positive points for good (parts of) answers, and subtract points for irrelevant and false arguments. Without guessing correction, long answers with many arguments (good, irrelevant, bad) get more points; with guessing correction, such long answers may result in zero points when the irrelevant and bad arguments outweigh the good ones.

For example, consider an open-ended question in which examinees are asked to provide four examples of a certain phenomenon. On the basis of inspection of the course materials and his expert knowledge, the lecturer lists five correct examples and believes about five to ten incorrect examples can be provided by the examinees. Hence, just by guessing or random elaboration, students may easily come up with at least one or even two correct examples. The lecturer can take this into account by awarding 0 points to answers with zero or one correct example (because one correct example could also be expected after random elaboration), 33% of the points when two correct examples are provided, and 66% and 100% of the points for three and four correct examples, respectively.
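The scoring rule in this example can be written out as a small helper. This is a hypothetical sketch (the function name and the numeric point fractions are ours, following the 0%/0%/33%/66%/100% scheme described above):

```python
def score_examples(n_correct: int, max_points: float = 6.0) -> float:
    """Guessing-corrected score for a 'give four correct examples' question.

    Zero or one correct example earns nothing, because one hit could
    already be expected from random elaboration; two, three and four
    correct examples earn 33%, 66% and 100% of the points, respectively.
    """
    fraction = {0: 0.0, 1: 0.0, 2: 0.33, 3: 0.66, 4: 1.0}
    return fraction[n_correct] * max_points
```

The key design choice is that the scoring starts above the level a blind guesser can reach, mirroring the guessing correction used for multiple choice exams.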

(B) Final grading

After the students have completed the exam and the lecturer has graded the individual questions, it is time for the final grading stage. In this stage, lecturers should check whether the individual questions appear satisfactory, and whether the initial grading standard is satisfactory. Starting with the individual questions, there are three reasons to adapt the grading of, or discard, a question. One may consider discarding an individual question if it has at least one of the following three problems:

(1) Problems with the formulation of the question. These may show up after complaints by students, or after post-exam evaluation by the lecturer.

(2) The question was too difficult. For example, the proportion of correct answers to the open-ended question was small: very few students answered the question completely correctly, or the average score on the question among students who performed well on the test is lower than, say, 50% of the maximum score on that question.

(3) The RIT was not larger than zero. The RIT of an exam question is the correlation between the score on the question and the total score on the exam. A zero or negative RIT signifies that high-ability students have an equal or lower probability of doing well on this question than students with lower ability. The RIT can easily be computed for written exams when the grading of questions is stored digitally.

After discarding an exam question, we recommend adapting the grading accordingly. An adaptation that seems fair to students is to let all students pass the discarded question, while keeping the initial grading standard. This guarantees that students who passed the exam before implementing the adaptation also do so after the adaptation. Please refrain from giving all students the same upgrade (e.g. one extra point), since students who already scored well on the test benefit less from such a correction than students who performed badly; that is unfair and inappropriate from a didactical perspective.


Chapter 3 – Oral exams The reliability and validity of the oral exam (not to be confused with oral presentations; see chapter 5) is known to be generally lower than for the multiple choice and open-ended question exams, with respect to grading and ordering candidates on ability (Berkel & Bax, 2014). Moreover, the oral exam measures other traits as well, such as responding effectively in oral situations (i.e., verbal communication skills) (Joughin, 1998). Finally, the cost effectiveness of oral examinations is low (already when there are more than four students) (Berkel & Bax, 2014), particularly when its cost is weighed against its reliability and validity as a measure of competence. Hence, and apart from the Bachelor and Master Thesis defense, we advise against using oral examinations, except for special circumstances, for example, when the goals of the course (as listed in the Course Catalog) include verbal communication skills, and these skills are also trained in the course. Other appropriate examples include the use of an oral exam to verify if all group members have contributed to a group assignment, or to make a quick and provisional decision whether a student masters a topic or not, which has low costs for the student in case an erroneous decision is made. If, after all, the lecturer plans an oral examination, careful precautions must be made to increase the reliability and validity. Constructing and grading exams

We distinguish three stages: (A) constructing the exam and the initial grading standard, (B) the oral exam, and (C) determining the final grading standard. For all three stages we formulate some guidelines.

(A) Constructing the exam and the initial grading standard

The four guidelines are summarized in Box 3.1.

Exams should (1) enable the ordering of students with respect to their abilities, and (2) divide the students into two groups: those who sufficiently master the subject (pass) and those who do not (fail). Although ideal exams do not exist, lecturers should aim for exams satisfying these two requirements as well as they can. Satisfying the four guidelines is more difficult for oral exams than for multiple choice and open-ended exams, for several reasons: oral examinations are often not standardized, are generally short with only a few questions, are affected by interference of the examiner (Burchard, Rowland-Morin, Coe, & Garb, 1995; Joughin, 1998), and mostly do not take answering correctly by guessing into account. For each of the four guidelines, some precautions are presented below that increase the reliability and validity of the oral exam.

BOX 3.1: GUIDELINES FOR CONSTRUCTING THE EXAM AND INITIAL GRADING STANDARD

(1) Use a specification table to construct exam questions. Make sure that the test is representative for the weight assigned to each learning goal and cognitive level specified in the table.

(2) Construct a large number of items.

(3) Let at least one other expert evaluate the constructed exam questions, and adapt the questions based on the evaluation of the expert(s).

(4) Use a guessing correction or another grading standard to take into account that students can correctly answer questions by guessing.
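As an illustration of guideline (4), the classical formula-scoring correction subtracts the score a student could expect from blind guessing. The function name and the numbers below are our own illustrative choices, not part of the handbook's procedures:

```python
def corrected_score(right: int, wrong: int, options: int) -> float:
    """Classical guessing correction (formula scoring): subtract a
    fraction of the wrong answers so that a student who guesses
    blindly on every item scores about zero on average."""
    return right - wrong / (options - 1)

# A student answers 12 of 20 four-option questions correctly and 8 wrongly:
print(corrected_score(right=12, wrong=8, options=4))  # 12 - 8/3 ≈ 9.33
```

Allocating negative points to wrong answers in the grading table, as recommended below for oral exams, serves the same purpose.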

The examiner should specify a table for each exam question with possible answering options and how these options are graded (guideline 1). The grading standard ideally takes irrelevant and wrong answers into account by allocating negative points to them (guideline 4). Specifying the table requires a great deal of planning. Throughout the oral exam, the lecturer can use the table to record the candidate's answer (e.g., by putting crosses in the table). A single holistic rating should be avoided at all costs, since such ratings are very subjective and unreliable, due to, for example, primacy and recency effects. Make sure the table for each exam question is discussed with at least one other colleague before the oral exam (guideline 3). As with multiple choice and open-ended exams, asking many questions increases the reliability and validity of the exam (guideline 2; also see Appendix I). Because many questions need to be asked, oral examinations cannot be short: good examinations take from half an hour up to three hours, or consist of several sessions of half an hour. To increase standardization, candidates should get, at least partly, the same type of questions.

For oral exams we recommend an exam with at least twelve short questions.

(B) The oral exam

Because grading by multiple lecturers greatly increases reliability and validity, we recommend the presence of, and independent grading of students by, at least two lecturers. Both lecturers should be very familiar with the specification table of each exam question. A lecturer's active participation in the exam hampers grading: lecturers have a tendency to concentrate too many questions in areas of student weakness or in fields of their own interest, or to deviate from the standardized protocol, for instance by providing clues to students. Active participation of the lecturer can be limited by just asking the questions and telling candidates to indicate when they have finished answering, without interference from the lecturer. After the candidate has signaled the end of the answer, the lecturer can ask the next question.

(C) Final grading

After the students have completed the exam and the lecturer has graded each of the individual questions, it is time for the final grading stage. How this stage proceeds depends on how many candidates completed the overall exam and each of the exam questions. If only one or two candidates completed each exam question, the grade can only be determined using the candidate's specification table completed during the exam.

If many candidates (e.g., five or more) completed each exam question, the final grading can proceed as for exams with multiple choice and open-ended questions. In that case, candidates are only graded after all candidates have completed the oral exam. In this final grading stage, lecturers should check whether the individual questions appeared satisfactory and whether the initial grading standard is satisfactory. Starting with the individual questions, there are three reasons to adapt the grading of, or discard, an individual question. One may consider discarding an individual question if it has at least one of the following three problems:


(1) Problems with the formulation of the question. These problems may show up after complaints from students, or after post-exam evaluation by the lecturer(s).

(2) The question was too difficult. For example, the proportion of correct answers to the oral question was small; that is, very few students answered the question completely correctly, or the average score on the question among students who performed well on the test is lower than, say, 50% of the maximum score on that question.

(3) The RIT was not larger than zero. The RIT of an exam question is the correlation between the score on the question and the total score on the exam. A zero or negative RIT signifies that high-ability students have an equal or lower probability of doing well on this question than students with lower ability. The RIT can easily be computed when the grading of questions is stored digitally.

After discarding an exam question, we recommend adapting the grading accordingly. An adaptation that seems fair to students is to let all students pass the discarded exam question while keeping the initial grading standard. This adaptation guarantees that students who passed the exam before the adaptation also pass after it.

Please refrain from giving all students the same upgrade (e.g., one extra point), since students who already scored well on the test benefit less from this correction than students who performed badly; that is unfair and inappropriate from a didactical perspective.
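The recommended adaptation (full credit on the discarded question, unchanged grading standard) can be sketched as follows; the point values and variable names are our own illustrative assumptions:

```python
def adjusted_total(raw_total, scored_on_discarded, max_on_discarded):
    """Give every student full credit for the discarded question.
    No student's total can go down, so everyone who passed before
    still passes, and students who missed the question gain the most."""
    return raw_total - scored_on_discarded + max_on_discarded

# Two hypothetical students on a 50-point exam; a 5-point question is discarded:
print(adjusted_total(40, 5, 5))  # already had full marks on it: stays 40
print(adjusted_total(28, 0, 5))  # scored 0 on it: rises to 33
```

Note how, unlike a flat one-point upgrade, this correction benefits precisely those students the discarded question disadvantaged.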


Chapter 4 – Written products

One common coursework test or assessment type is the written product. The term "written products" in this chapter refers to all kinds of written products that students may have to produce and that lecturers will assess to award students a final grade (exam) or a sub-grade (mid-term) for a course. A non-exhaustive list of written products includes: individual research papers, group research papers, essays, short written assignments, Bachelor's theses, and Master's theses. Issues specifically concerning the quality of the assessment of the Bachelor's and Master's thesis are dealt with in Chapter 8. The assessment of a written product depends largely on the instructions given to students. These instructions must explain how the learning goals of a course "translate" into a clearly specified checklist for the written product. Students then write the product using, and perhaps even based on, the checklist3. Obviously, the more learning goals a written product covers, the more detailed and fine-grained the instructions must be.

Table 4.1. Example: Overview of written products at different stages in BSc Business Economics

Columns: stage of the education program | written products for domain-specific knowledge and skills | writing skills (specified in the learning goals) | scoring criteria

- First-year Bachelor's: coursework (essays and short written assignments). Writing skills: introduction to academic reading & writing, a.o. in Academic Competences and Intro Data Analysis.

- Second-year Bachelor's: coursework (essays, short written assignments, individual research papers, group research papers). Writing skills (course subjects): intertextuality, i.e. making connections between texts and putting forward your own understanding, a.o. in Business Research Techniques and Management Information Systems.

- Third-year Bachelor's: coursework (essays, short written assignments, individual research papers, group research papers, Bachelor's thesis). Writing skills (course subjects / research cycle): structuring hypotheses and substantiating claims or assertions through careful argumentation (e.g. meta-analyses), a.o. in Bachelor Thesis and Business Ethics.

Scoring criteria: include criteria of academic writing skills, including relevance of content, use of source material, organization, cohesion and coherence, (language) accuracy, and presentation.

3 See the following websites for examples of instructions (http://writing.colostate.edu/guides/teaching/wassign/pop2d.cfm) and checklists (http://writing.colostate.edu/guides/teaching/wassign/pop2e.cfm).


In order to grade written products, lecturers must construct an assessment form in which they specify a number of criteria on the basis of which the work will be assessed. Lecturers can customize their assessment form to make the grading criteria correspond with the course and assignment learning goals. All criteria must be properly defined on the assessment form (i.e. rubrics). Furthermore, the assessment form must specify how points will be awarded to each of the criteria and how the final grade for the work will be determined. There are two basic ways to do this:

- A pre-specified number of points can be ascribed to each criterion and the total number of points adds up to 10 (or 100 points and the total is then divided by 10).

- Each criterion is marked on a five- (or four-) point scale, e.g. Very Weak – Insufficient – Sufficient – Good – Very Good. The lecturer specifies which grade (0 through 10) corresponds to which combination of 'Very Weak', 'Insufficient', 'Sufficient', 'Good' and 'Very Good' scores (see the proposed evaluation form for an example of such a specification).
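The first approach (points per criterion summing to 100, divided by 10) can be made concrete as follows; the criteria and point values are invented for illustration:

```python
# Hypothetical assessment form: maximum points per criterion sum to 100.
MAX_POINTS = {"theory": 30, "method": 30, "results": 25, "style": 15}

def final_grade(awarded: dict) -> float:
    """Sum the points awarded per criterion and divide by 10
    to obtain a final grade on the 10-point scale."""
    assert set(awarded) == set(MAX_POINTS), "score every criterion"
    assert all(0 <= awarded[c] <= MAX_POINTS[c] for c in awarded)
    return sum(awarded.values()) / 10

print(final_grade({"theory": 24, "method": 21, "results": 20, "style": 10}))  # 7.5
```

The second approach would instead map each criterion to a verbal scale and specify, in the evaluation form, which combination of scale values yields which grade.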

Depending on the objectives of the course and the scope of the written product (see the reference to the research cycle above), lecturers may give more or less weight to criteria covering the theoretical, methodological and empirical parts of the work. Obviously, lecturers should omit evaluation criteria from the example evaluation form that do not apply to the objectives and learning goals of their specific written product. Additionally, lecturers may decide to add criteria concerning the form and style of the work and the independence of the student(s) who wrote it. Lecturers may also choose to account for possible 'special circumstances' students might be confronted with (for example, illness). It is important that the evaluation form (i.e. the evaluation criteria, their definitions and the specification of how they are assessed) is made available to students at the start of the course so that they understand how they will be assessed. Written feedback (e.g. the filled-out assessment form) must also be made available and explained to the student(s) so that they can learn from the lecturer's evaluation.

Plagiarism

As access to documents on the World Wide Web has grown, the issue of plagiarism and the enforcement of the consequences for academic dishonesty have become important concerns when assessing written products. Plagiarism means "to use and pass off the ideas or writings of another as one's own; to appropriate for use as one's own passages or ideas from another; or to put forth as original to oneself the ideas or words of another" (Palmquist, 2003, pp. 173-174). Also unintentionally using someone else's work without properly acknowledging where the ideas came from (the most common form of plagiarism) is considered plagiarism.

Please discuss the concept of plagiarism in class and talk about the consequences and penalties before it becomes an issue. In case you suspect plagiarism, always report it to the Examination Board and let them deal with it. Do not set your own rules or take your own measures, since this will create arbitrariness. The Examination Board has the expertise and the Rules and Guidelines (R&G) to deal with this accurately and properly.

Box 4.1 Sample instructions on how to prevent plagiarism

“You must write your assignments in your own words. Whenever you take an idea, findings or passage from another author, you must cite accordingly, i.e. referencing in the text or in footnotes. When a passage is literally copied into your work, quotation marks must be used and you need to refer to the page number. When this is not done, you are committing plagiarism. Plagiarism is an extremely serious academic offense. If plagiarism is detected, the Examination Board will be notified and you will be sanctioned accordingly. All assignments will be checked for plagiarism. For this reason, each assignment must be sent digitally to XX. XX will check to what extent your work literally matches sources found on the Internet and that of your peers.

It is your responsibility to educate yourself about the different types of plagiarism, such as copying and inappropriate paraphrasing and citation. A manual on how to cite properly can be found on Canvas. XX (course subject) follows the American Psychological Association (APA) rules, and the teacher asks you to cite according to these rules. For more information see: http://www.apastyle.org. Furthermore, one of the seminars will pay extra attention to how to cite properly."

Sample grading criteria and rubrics

Authentic Assessment and Rubrics:
http://www.fctl.ucf.edu/teachingandlearningresources/coursedesign/assessment/assessmenttoolsresources/rubrics.php

Want to know more?

There is a free workshop, Assessment of Writing Assignments, organised by Teacher Technology Academy. This workshop is announced on Events for staff on the intranet (see this link: https://www.tilburguniversity.edu/intranet/staff-events). Additionally, there is a reader, "How to design writing assignments", available on request ([email protected]).


Chapter 5 – Oral presentations

Oral communication skills are important program learning outcomes, as specified in the international standards of the Dublin Descriptors and most of the domain-specific reference frameworks in the Netherlands. Oral presentations are commonly used to assess oral communication skills at Dutch research universities. Quite often, students have to present the results of an individual or group task. Presenting is one of the basic academic skills; it cannot be tested using a standard Q&A format, but requires real-time performance assessment by the lecturer and/or by an audience of fellow students. It is therefore highly recommended to use clearly specified criteria when assessing the performance of complex skills like presenting.

Criteria must be derived from the learning goals of the course, and the lecturer must think carefully about whether or not presentation skills are part of the learning goals before developing criteria to assess them. Criteria can then be further specified into items. For example, the criterion "use of theory" can be divided into items such as "the number of theories that are explained" and "the theory is explained accurately." The next step is to develop a proper scale for each item. For example, the item "quality of the research question" may be rated from (1) very poor to (5) very good, whereas the pace or tempo of the presentation may be rated from (1) much too slow, through (3) just right, to (5) much too fast. Evaluating a presentation taxes the short-term memory of the lecturer unless the presentation is recorded. This imposes constraints on the number of criteria that may be used to evaluate the presentation. A way to circumvent this may be to have different judges assess different criteria, but care should always be taken to limit the number of criteria, for instance by clustering them.

Some suggestions for criteria to evaluate different aspects of academic presentations are provided in Box 5.1 below. It should be understood, however, that criteria for evaluating presentations may vary considerably. The ultimate standard for determining the necessary criteria for evaluating a presentation is the learning objectives of the assignment. So, verbal and non-verbal presentation skills need not be included as a criterion if presentation skills are not part of the learning objectives of the assignment. Alternatively, lecturers may decide to include additional criteria for assessing the quality of the presentation based on the requirements of the assignment (e.g., criteria may be specified to evaluate the quality of a study that was done).
In sum, there is ample leeway to determine the set of criteria used for evaluating a presentation, as long as the criteria are clearly specified. Keep in mind that assessing the content of a presentation is something very different from assessing the form or style of a presentation. It is highly recommended that lecturers develop separate criteria for presentation content and presentation style, and assess both elements separately before combining them into a final grade.

With regard to group presentations, teams often put forward their most comfortable speaker to deliver the presentation. Whether or not this is a problem depends on the learning objective of the presentation. If the purpose of the presentation is to assess whether the team masters some kind of content, it is perfectly all right for a single team member to deliver the presentation. The grade for the presentation then counts for all team members, possibly after weighing the individual contribution of each team member, for example through peer evaluation. If, on the other hand, the objective of the presentation is (also) to assess the presentation skills of the individual team members, the lecturer must design the delivery of the presentations in such a way that each team member has the opportunity to present.

Box 5.1: A list of possible didactical criteria for the assessment of academic presentations

Content
Was the goal of the presentation made explicit? Was the use of literature appropriate and sufficient? Are the theories used understandable, structured and convincing? Does the research model make sense here? Is its use convincingly motivated? Are the findings, conclusions and recommendations clear and supportive? Was the presentation consistent in terms of argumentation? Was the content tailored to the audience? Et cetera.

Structure
Was the presentation logically organized, with a clear beginning, middle and end? Was the time spent on the various parts of the presentation proportional to their importance? Was the presentation too short or too long?

Verbal presentation skills
How were the oral skills of the presenter in terms of syntax and prose, pace, articulation, and intonation?

Non-verbal presentation skills
Did the presenter make adequate use of non-verbal communicative tools, such as posture, facial expression, and gestures, and did the presenter make contact with the audience?

Audiovisual tools
Were the audiovisual tools (slides, movies, etc.) used in a clear fashion? Did they support the presentation? Were they well integrated into the argumentation, adequately timed, and used in a proportional fashion?

For some examples of assessment criteria and indicators of presentation skills, see: Anholt, Robert R. H. (2010). Dazzle 'Em with Style: The Art of Oral Scientific Presentation. Elsevier Science & Technology. ProQuest Ebook Central, http://ebookcentral.proquest.com/lib/uvtilburg-ebooks/detail.action?docID=270284.

Want to know more?

There is a free workshop, Assessment of Presentations, organized by Teacher Technology Academy. This workshop is announced on Events for staff on the intranet (see this link: https://www.tilburguniversity.edu/intranet/staff-events). Additionally, there is a reader, Presentation assessment, available on request ([email protected]).


Chapter 6 – Group assignments

Group assignments are a tool to assess students' higher-order thinking skills regarding the course content as well as group skills, including planning, communication, dealing with conflict, reflection, et cetera. The reasons for lecturers to use group assignments may be twofold:

- Group/team skills are important program learning outcomes and/or part of the course learning goals;
- Group assignments are a convenient choice in case of high student numbers.

The first reason, of course, is the best reason to use group assignments. Keep in mind that in this case group processes or skills must be assessed in addition to content criteria. Using group assignments as a convenient choice because of high student numbers is understandable from a lecturer's perspective, but may be a problematic choice from a didactical perspective. In any case, design group assignments in such a way that their high complexity naturally invites students to collaborate (i.e., the assignment cannot easily be done by individual students) and encourages discussion among students. Always clearly explain to students why it is important that they work on the assignment in a group.

From a student perspective, group assignments have two key problems: (1) Group assignments require students to invest time and effort in coordination. When the learning value of coordination is not made clear, they consider coordination as 'additional workload'. (2) Students may find it unfair that their grades depend on other students who may not be as motivated as they are, who are not as good as they are, or who shirk their duties in the group (free riders; see Appendix IV). Of course, these observations from students may not be generalizable to all cases. Nevertheless, it helps if lecturers are aware of them and address these reservations in how they design their group assignments and instruct students about them.

Based on the above observations and on more general guidelines in the literature on assessment validity, we recommend that lecturers observe the following guidelines when designing group assignments.

Designing group assignments

(1) Clearly indicate why the assignment should be performed in groups rather than individually. This can be related to the workload and to the more 'in-depth' insights this yields for students. Lecturers can also make a discussion within or between groups part of the assignment. These suggestions effectively make group work a learning goal. Keep in mind that process elements (such as group discussions and individual contributions to the group product) must also be assessed (or graded) when they are defined as learning goals.

(2) Instruct students on how to do group assignments. Do not expect that all students will know how to work in groups effectively. New students may benefit from clear instruction on how (not) to do group assignments. For all group assignments, lecturers may invite students to be explicit towards their group members about their aspiration level (what grade they want) and to draft a brief work plan (regardless of whether the lecturer decides to give feedback on the work plan).

(3) Have a policy in place on how to deal with groups that complain about free riders. Make this policy explicit before the start of the group assignment; see Appendix IV.

Contents

(4) Each member of a group should be knowledgeable of and accountable for all the elements of the end product. Actively discourage students from simply dividing the work among themselves. Group work has added value for students only when it involves group interaction.

(5) When the learning goals of a course relate to group work, group assignments must be used. In all other cases, carefully consider why you use group assignments and if this is the best choice.

(6) Lecturers may have to determine the composition of groups, rather than the students themselves, in order to prevent self-selection among students.

Grading

(7) The weight assigned to the group assignments preferably should not count for more than 40% of the course grade. This addresses students’ fairness concerns as it allows students to pass the course based on their own performance on individual assessment types. This allows good students (in bad groups) to still get a reasonable course grade.

(8) Grades for group assignments should never be entirely based on self-assessment by students (because of self-serving bias).

(9) Lecturers should be reluctant to do this, but may choose to slightly differentiate grades for individual group members based on group performance and individual performance (say, 0.5 point up or down on a 10-point scale). Individual performance of group members may be assessed by using peer assessments, in which students assess each other's relative (not absolute) contribution to the group effort. Consequently, a higher grade for one person involves a lower grade for another group member. Peer assessment must be confidential, i.e. students must be unaware of each other's assessments.
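Guideline (9)'s zero-sum adjustment, where peer-assessed relative contributions shift individual grades slightly around the group grade, might be sketched like this. The 0.5-point cap and the relative (zero-sum) character come from the text; the scaling rule, names, and numbers are our own assumptions:

```python
def individual_grades(group_grade, contributions, max_shift=0.5):
    """Shift each member's grade around the shared group grade in
    proportion to peer-assessed relative contributions. Shifts are
    centered (they sum to zero: one member's gain is another's loss)
    and capped at max_shift points on the 10-point scale."""
    mean = sum(contributions.values()) / len(contributions)
    # Largest deviation sets the scale; guard against all-equal input.
    spread = max(abs(c - mean) for c in contributions.values()) or 1
    return {name: group_grade + max_shift * (c - mean) / spread
            for name, c in contributions.items()}

# Three hypothetical members; peers rated relative contributions out of 100:
print(individual_grades(7.0, {"Ann": 40, "Bob": 30, "Cas": 30}))
```

Because the shifts are centered on the group grade, the average of the individual grades equals the group grade, matching the relative-contribution idea in guideline (9).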

Retakes

(10) Allow students to retake the group assignment when they have failed, if this is feasible and they want to, even if they want to do this individually or with part of the group. If possible, use the same format and assessment criteria as in the regular exam.

(11) Sometimes, a resit opportunity cannot be offered for practical reasons. In that case, the results that have already been obtained in the group assignment can be incorporated in the results of the resit examination, on the condition that the student is offered a fair opportunity to pass the course in the resit examination. In all cases, make sure that the method of assessment and the calculation of the weighting are published on Canvas at the start of the course.

Want to know more?

There is a free workshop, Assessment of Group Work, organised by Teacher Technology Academy. This workshop is announced on Events for staff on the intranet (see this link: https://www.tilburguniversity.edu/intranet/staff-events). Additionally, there is a reader, Assessing group work, available on request ([email protected]).


Chapter 7 – Open book exams and Take home exams

Open book exams and take home exams allow students to take notes, texts or resource materials into an exam situation. They test students’ ability to find information and apply (this) knowledge. As such, they are often used in subjects requiring direct reference to written materials, like validated measurement tools and scales, or computer programming languages.

Open book exams usually come in two forms:

- Traditional sit-down / limited-time exams, with varying degrees of access to resources and references.
- Take home exams, i.e. open book exams you do at home. Questions are handed out, answers are attempted without help from others, and the exam is returned within a specified period of time (e.g. the next day or within 3 weeks).

For certain learning goals, open book exams (and take home exams) may be a suitable format because they test for more than just rote learning and memorizing information. Higher education is supposed to equip students with intellectual abilities and skills, and open book exams may test a student's ability to quickly find relevant sources and data, transform these into information (understand, analyze, apply knowledge), and think critically.

Open book exam questions usually require students to apply knowledge; as such, they may be essay-style questions or involve problem solving and delivering solutions. The assessment type may vary from multiple choice and open-ended questions to almost any type of written or oral product. For helpful instructions, recommendations and pitfalls, see the corresponding chapters in this handbook. The question style used is typically school and discipline dependent. In Economics and Management, for example, the questions may be set up around a real business case that the student needs to discuss and solve.

The materials students can take into an open book exam vary. Some exams restrict the type of materials that are allowed (e.g. formula sheets and tables, or a limited number of texts); others, like take home exams, may be totally unrestricted (any quantity or quality of any material). Such materials might include lecture notes, readings, reference materials or textbooks, equipment like calculators and a computer, drafting tools, et cetera.

Materials used in take home exams are usually unrestricted. The main restriction for take home exams is that they must be the student’s own work without any help from others, although this may be difficult to verify and assure. This should always be kept in mind when considering this type of examination. For this reason the pros and cons of open book exams in general and take home exams in particular are listed below.

Pros and Cons of Open Book and Take Home Exams

Advantages: Take home exams put less emphasis on the student's memory. Students can open and consult the book anytime they want and can also search for information on, e.g., the internet. Also, students can do their examination at home or anywhere they want, as long as they finish their exam on time. Since time may be a big issue or limitation in invigilated assessment types, this might make students feel more comfortable and unrestricted. Open book exam questions may be more challenging, requiring higher-level analytical thinking skills and creativity, but since examinees have extra time to study the problem carefully and to deliver well-structured and well-presented arguments and solutions, the quality of the answers and solutions may benefit. Related to this advantage, open book assessments may represent the professional setting better than closed-book tests do: students have full access to their references, just as they would on the job.

Disadvantages: Besides these benefits, there are important disadvantages as well. Firstly, “despite the expectations that open-book tests would encourage a deep learning approach, evidence suggests that (1) students spend less time preparing for open-book than for closed-book tests, and that (2) they seem to underestimate the need for preparation” (see: Heijne-Penninga, 2010, p. 16). Also, in the take-home exam it is essential that students do the work themselves and do not consult friends or senior students; i.e. students are supposed to behave ethically when answering take-home exams. This assumption is often naïve at best, especially in a digital era and when a take-home exam is used for summative testing. One way of dealing with this issue is to organize frequent progress reviews, as is common when supervising Bachelor's or Master's theses, but this is usually very time consuming.

In sum: in cases where the integrity of the assessment should be, but cannot be, guaranteed, it is recommended to use take-home exams as an aid for formative testing only.

Additional disadvantages include the following (see: https://www.ryerson.ca/content/dam/lt/resources/handouts/open_book_exams.pdf):

• “Students may place too much emphasis on their reference materials;
• Students may believe that they don’t need to study as much, or may underestimate how long it will take them to locate the information in their reference materials;
• Student workload may be increased by the need to create reference materials before exams;
• Instructor workload may be increased if it’s necessary to police the material that is used in the open book exam;
• Depending on the reference materials being used, limited desk space may be a problem;
• The reference material may not be available to all students, such as an expensive textbook that all students may not have purchased;
• Students may be unfamiliar with the format and will need to be provided with clear procedures and rules;
• Several types of questions that would be acceptable in a closed book exam will not work in an open book exam – such questions include questions that ask for definitions, descriptions, or lists of properties, characteristics, etc.”


Box 7.1 Misconceptions about Open Book and Take Home Exams adapted from https://student.unsw.edu.au/open-book-and-take-home-exams

1) Open Book exams are a breeze

Open-book exams are not an easy option for students. Answering the questions well requires more than just copying information straight from texts. For example, having access to a textbook cannot prevent students from giving a wrong answer if they do not know where to find, or how to use, a fact or formula. In open-book exams, it is how students locate, apply and use the information that is important.

2) You don't have to study

Probably the biggest misconception about open-book exams is that there is no need to study anything. Having books and notes to refer to might mean students don't have to memorize as much information, but they still need to be able to apply it effectively. Being able to apply information implies that students know it exists and what it means, or can mean, within its context.

This means that students must fully understand - and be familiar with - the content and materials of the course so they can find and use the appropriate information. In open-book exams, students need to find the relevant information in their resources quickly. If students have not studied, they will not know where to find it.

3) You can just copy straight from the book

Students can't copy chunks of text directly from textbooks or notes; this is plagiarism. In open book exams, the resource materials are made available to students, so they are expected to do more than just reproduce them. They must be able to find, interpret and apply the available information to the exam questions. They usually need to reference as well, just as one would for any other assignment.

4) The more materials the better

A big issue is that students sometimes get carried away and bring an overload of materials and resources into the exam. They should only take what they need. Stacks of books won't necessarily guarantee high performance, and students don't have time for extensive reading. Too many materials can end up distracting them and crowding their work space. Students should be encouraged to select materials carefully and organize them for quick reference.


Chapter 8 – Bachelor’s and Master’s thesis

The assessment of the Bachelor's and Master's thesis is a complex task that must be done rigorously and transparently. This chapter provides important guidelines to that end. It discusses three critical aspects of thesis assessment: (A) ensuring thesis assessment quality, (B) the independence of supervisors, and (C) resolving differences between assessors.

(A) Ensuring thesis assessment quality

Validity

According to the didactic principle of constructive alignment, the Bachelor's or Master's thesis must be assessed by rigorously examining the thesis content against the thesis learning goals and grading criteria specified in the assessment form. It is therefore critically important that thesis learning goals are clearly specified in a document that is available to both students and supervisors. Each study program should formulate shared thesis learning goals so that every thesis supervisor and assessor follows the same learning goals of their education program. Furthermore, the learning goals must be operationalized unambiguously into measurable criteria with specific indicators. Clearly specified criteria with indicators further ensure that all students are assessed in the same way, independent of who their supervisor happens to be.

Reliability

Reliability of thesis assessment refers to the degree of consistency. To ensure a satisfactory degree of consistency, there should be uniformity in the behavior of different assessors when assessing the same thesis, so that assessor subjectivity is minimized. The challenge in grading theses is to ensure that the grade truly reflects the quality of the student's thesis, regardless of who assesses the work. Unfortunately, a number of studies have addressed the issue of inconsistencies in thesis assessment (e.g. Hand & Clewes, 2000; Saunders & Davis, 1998; Webster, Pepper, & Jenkins, 2000; Williams & Kemp, 2018). They point out a number of factors that result in inconsistencies, such as time spent on assessment, the experience of assessors, assessors' interpretations of criteria, different weightings of different criteria, different ways of using the assessment forms, and the role taken when reading the thesis (as an academic or as a lay reader).

To ensure reliability, the program should take these factors into account and establish thesis assessment procedures to prevent or resolve these inconsistencies. Consequently, the specification and consistent use of rubrics is highly recommended.

Transparency

To ensure transparency, these criteria must be clearly explained and made available to students and supervisors. This guarantees that students know what is expected of them and that supervisors know how they must assess a thesis. When giving feedback on the draft and final thesis, supervisors should always relate their comments to these criteria so that students learn how to improve their work towards what the criteria require. To ensure thesis assessment quality, as a minimum, the following rules must be observed:


• Align thesis grading criteria to thesis learning goals and the supervision process.
• Operationalize the assessment criteria and indicators in a thesis assessment form.
• Establish thesis assessment procedures to ensure consistency/reliability.
• Make both thesis learning goals and the assessment form available to students and supervisors.
• Use the criteria for giving feedback and explain quality expectations during the supervision process.

(B) Independence of the supervisor

Supervisors are more closely involved in the process of writing the thesis than they are anywhere else in the education program. This presents unique challenges regarding supervisor independence, because the supervision process interlocks advising students and assessing their work in potentially problematic ways. To ensure that students receive an independent assessment of their thesis, the following rules must always be observed:

• Each student is assigned a supervisor and a second assessor, at least one of whom must possess the University Teaching Qualification and hold a degree at least one level higher than the diploma being assessed.

• The supervisor guides the student throughout the process of writing the thesis. The second assessor only assesses the final work; his or her job is to evaluate the thesis from the perspective of the NVAO/AACSB.

• The supervisor and the second assessor assess the thesis independently and individually fill out the thesis assessment form. Both forms must be kept for a period of seven years. The forms need not be made available to students, but may be used to explain the final grade to the student.

• The supervisor and the second assessor then discuss their assessments and together develop a final, joint assessment of the thesis. This joint assessment is also written down in the thesis assessment form and explains, for each criterion and in sufficient detail, why the thesis is assessed the way it is.

• To avoid dependency relationships between supervisors that may compromise their independence, PhD students cannot team up with their PhD supervisors. It is also recommended that the membership of the assessment team varies across department members, enabling calibration of grades on similar topics and avoiding the formation of resident teams.

• When a company supervisor is also involved (e.g. in a Master's thesis), be aware that this person has no mandate to act as an examiner and consequently cannot be involved in the assessment and grading of the thesis.

(C) Resolving differences between assessors


It is unavoidable that the assessments of the supervisor and the second assessor sometimes differ and even conflict. They may have different methodological and disciplinary backgrounds and may interpret the work of students differently. Moreover, the supervisor also grades the process leading to the final thesis, a process that is not observable to the second assessor (or the NVAO/AACSB). When the supervisor and the second assessor differ in their assessments, they must first of all discuss their differences and try to bridge them. If their differences cannot be resolved by discussion, the academic director must appoint (or act as) a third assessor (referee). In all cases of resolving differing assessment results, the following rules should be followed:

• In the moderation meeting, the supervisor and the second assessor discuss their initial assessment results based on the thesis learning goals and assessment criteria. They should make an effort to reach a joint assessment when the intended final grades (before the thesis defense) deviate by 1.5 points or more (≥1.5 points) on the 10-point scale.

• If the moderation is not successful (i.e. if the final grades of the supervisor and second assessor still differ by 1.5 points or more), the academic director must act as, or assign, a third assessor: a referee.

• The third assessor independently fills out a thesis assessment form and engages in the discussion between the supervisor and second assessor, trying to reach a joint assessment. If agreement cannot be reached, the assessment of the third assessor is leading and decisive.
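The moderation threshold above is mechanical enough to express as a simple check. The sketch below is our own illustration (the function and parameter names are not from the handbook):

```python
def needs_third_assessor(supervisor_grade: float,
                         second_assessor_grade: float,
                         threshold: float = 1.5) -> bool:
    """Apply the moderation rule described above: a referee (third
    assessor) is required when the intended final grades on the
    10-point scale still differ by 1.5 points or more."""
    return abs(supervisor_grade - second_assessor_grade) >= threshold
```

For example, grades of 6.0 and 7.5 differ by exactly 1.5 points, so a referee is required; with 6.0 versus 7.4, the two assessors resolve the difference themselves.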


Chapter 9 – Rules and Guidelines TiSEM

In order to guarantee the quality of a diploma, the Examination Board of each School has defined a set of rules and guidelines. These rules and guidelines apply to the tests, preliminary examinations and final examinations in all programs of the School (referred to as degree programs), irrespective of their starting date. Typically, these rules and guidelines deal with issues like: fraud and penalties, inspection, re-takes, authority and mandates, deadlines, invigilation during exams, award of academic distinction (judicium), exemptions, and assessment of course results. Please note that all examiners and students are expected to know the Rules and Guidelines of the Examination Board (R&G) and to follow them accordingly. The current Rules and Guidelines used at TiSEM can be found here:

https://www.tilburguniversity.edu/sites/tiu/files/download/TiSEM%20Regels%20en%20Richtlijnen%202018-2019%20FINAL_2.pdf

The current Rules and Guidelines used for Bachelor Data Science can be found here:

https://www.tilburguniversity.edu/sites/tiu/files/download/BSc%20RR%20DS%202018-2019_2.pdf

The current Rules and Guidelines used for Master Data Science can be found here:

https://www.tilburguniversity.edu/sites/tiu/files/download/MSc%20RR%20DSE%202018-2019_2.pdf

All Education and Examination Regulations of current TiSEM programs can be found here:

https://www.tilburguniversity.edu/students/studying/regulations/oer/economics-management


APPENDIX I: Grading

(1) Accurate ordering of students

A desirable characteristic of an exam is that it accurately orders students who differ in ability. The probability that an exam accurately orders such students is positively affected by the number of exam questions. For example, consider three students who differ widely in ability: A (knows the correct answer to 40% of all questions and randomly guesses the answers to all other questions), B (60%), and C (80%). Table I.1 shows the probability that the exam accurately orders the three students as a function of the number of questions (K) and the number of answering categories per question (A) in the context of a multiple choice exam.

Table I.1: Probability that an exam accurately orders three students with abilities 40%, 60%, 80%, as a function of the number of answering categories (A) and number of items (K).

  K    Probability of accurate     Probability of accurate
       ordering (A = 2)            ordering (A = 4)
 10          33.3%                       61.1%
 20          56.4%                       85.4%
 30          70.2%                       93.8%
 40          79.1%                       97.2%
 50          84.9%                       98.7%
 60          89.0%                       99.4%

For instance, if the exam has 10 questions with four answering categories each, the probability is 38.9% that a student of lower ability performs at least as well on the exam as one of the two students with higher ability. Requiring an 85% or higher probability of an accurate ordering implies constructing exams with at least 50 questions when there are two answering categories, and at least 20 questions when there are four answering categories.
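The figures in Table I.1 can be checked with a short simulation. The sketch below is our own illustration, not part of the handbook: each student answers the known fraction of the K questions correctly and guesses the rest with success probability 1/A, and we estimate the probability that the three resulting scores fall in strictly increasing order of ability.

```python
import random

def exam_score(p_know, K, A, rng):
    """Known items are answered correctly; the remaining items are
    guessed, each guess succeeding with probability 1/A."""
    known = round(p_know * K)
    guessed = sum(1 for _ in range(K - known) if rng.random() < 1.0 / A)
    return known + guessed

def p_accurate_ordering(K, A, abilities=(0.4, 0.6, 0.8),
                        trials=40000, seed=1):
    """Estimate P(score_A < score_B < score_C) for three students of
    increasing ability, as in Table I.1."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a, b, c = (exam_score(p, K, A, rng) for p in abilities)
        if a < b < c:
            hits += 1
    return hits / trials
```

With these assumptions, `p_accurate_ordering(10, 4)` comes out close to the 61.1% reported in the table, and longer exams push the probability towards 1.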

(2) Guessing correction

Another desirable characteristic of an exam is that students with no or low ability have a low probability of passing the exam. A grading standard incorporating guessing correction accomplishes this. Table I.2 provides the probabilities that a student with no ability (0%; randomly guesses all answers) and a student with some ability (50%; knows half of the answers and randomly guesses the rest) pass the exam, as a function of the number of answering categories (A) and the number of items (K). The probabilities are calculated for standards both without and with guessing correction.4

4 Without guessing correction, the grade equals 10 times X/K. The grade with guessing correction is calculated using the formula discussed in Chapter 1.
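The table's probabilities can be reproduced with a short calculation. Chapter 1's formula is not restated in this appendix; as an assumption, the sketch below (our own illustration) uses the standard linear correction grade = 10 * (X - c) / (K - c), with c = K/A the expected score from blind guessing, which matches the values in Table I.2.

```python
import math

def binom_sf(n, p, k):
    """P(Binomial(n, p) >= k)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(max(k, 0), n + 1))

def pass_probability(K, A, p_know, corrected):
    """Probability of a grade >= 5.5 for a student who knows a fraction
    p_know of the K questions and guesses the rest, each guess
    succeeding with probability 1/A."""
    known = round(p_know * K)
    c = K / A                       # expected score from blind guessing
    if corrected:
        # assumed correction formula: grade = 10 * (X - c) / (K - c)
        x_min = math.ceil(0.55 * (K - c) + c)
    else:
        # no correction: grade = 10 * X / K
        x_min = math.ceil(0.55 * K)
    needed = x_min - known          # correct guesses still required
    return 1.0 if needed <= 0 else binom_sf(K - known, 1.0 / A, needed)
```

For A = 2 and K = 10 this yields 37.69% without correction and 5.46% with correction for the no-ability student, and exactly 50% with correction for the 50%-ability student, matching the first row of Table I.2.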


Table I.2: Probability that a student passes the exam (grade ≥ 5.5), with and without guessing correction, as a function of his or her knowledge (0% or 50%), number of answering categories (A), and number of items (K).

           Without guessing correction         With guessing correction
 A    K    No ability     Some ability         No ability     Some ability
           (knows 0%)     (knows 50%)          (knows 0%)     (knows 50%)
 2   10      37.69%         96.87%                5.46%          50.00%
 2   20      41.19%         99.90%                0.49%          37.69%
 2   30      29.23%         99.95%                0.07%          30.36%
 2   40      31.79%         99.99%                0.03%          41.19%
 2   50      23.99%         99.99%                0.00%          34.50%
 2   60      25.94%        100.00%                0.00%          29.23%
 4   10       1.97%         76.27%                0.35%          10.35%
 4   20       0.39%         94.36%                0.00%          22.41%
 4   30       0.06%         91.98%                0.00%          31.35%
 4   40       0.01%         97.56%                0.00%          21.42%
 4   50       0.00%         96.78%                0.00%          14.94%
 4   60       0.00%         98.94%                0.00%          19.65%

Table I.2 illustrates that:5

• Without guessing correction, students without ability have a substantial probability of passing when each question has only two answering categories.

• Without guessing correction, students with some ability (knowing only 50% of the answers) almost always pass the exam.

• With guessing correction, students without ability almost never pass the exam.

• With guessing correction, students with some ability (knowing only 50% of the answers) have less than a 50% chance of passing the exam.

To conclude, guessing correction minimizes the probability that students without ability pass the exam, and limits the probability of passing for students who know the answers to half of the questions to .5 or lower.

5 These conclusions also hold for other values of the number of answering categories and items than those shown in the table, i.e., they hold in general.


APPENDIX II: Checklist closed questions (including MC exams)
By Claudia Loijens

General

Read the question and answer it without looking at the answer key.

When your answer is incorrect, check whether this is caused by the way the question is posed. Check:

- A whether there is a correct answer at all.
- B whether there is only one correct answer.
- C whether the question does not have any unclear aspects in it.
- D whether the alternatives do not have any unclear aspects in them.

Check B to D even when the answer was correct.

Each question should be important enough to ask. It has to be about important parts of the course. The question should fit your specification table/ test grid.

Each question should measure what you intended to measure. The purpose of the question should be clear.

Each question should only be answered based on content knowledge of your course. It should not be possible to answer the question correctly based on common knowledge.

Questions in the test should be independent of each other (as much as possible). That means (a) that by answering a question in the test correctly, it is not automatically clear what the correct answer to another question would be (b) it is possible to give a correct answer to a question even when the answer to a previous question is incorrect or unknown.

The stem

The stem should contain a clear question or assignment.

The stem should contain enough information to be able to answer the question.

The stem should not contain redundant information. If so, delete this information.

The stem should be phrased precisely, concisely and grammatically correct.

The stem cannot contain a double negation. If so, rephrase.

When the stem contains a negation, it should be placed in bold CAPITALS (NOT).

The alternatives

Try to have alternatives that are plausible answers to the question only.

The correct alternative (key) should not be the only alternative that repeats an expression from the stem. If it is, rephrase.

Do not use words like ‘always’, ‘never’ or ‘all’ in any of the distractors. Most students know that few things are universally true or false, so distractors containing these words can often be easily dismissed.

Make sure there are no double negations between the stem and one or more of the alternatives.

Make sure the alternatives are mutually exclusive.


Try to have alternatives with approximately the same length.

The alternatives should connect with the stem grammatically and in the matter of content.

The alternatives should not have repetitions from the stem or each other.

The alternatives should be logically ordered; alphabetically, chronologically, etc.

The alternatives should be distinctively different from each other.

The phrasing and word choice of the key should be similar to that of the distractors.

Relevance and content validity

The question should relate to the field of study to be tested.

The question should show clearly enough what subject matter and/or skill is being tested.

The level of difficulty of the question should be acceptable for the intended education (estimation).

The question type should be appropriate for the purpose.

Language

The phrasing of the question should be grammatically correct.

Make sure the syntax of the question is as uncomplicated as possible.

Make sure the question does not contain a double negation.

Make sure the question does not contain unnecessarily difficult language.

Make sure the question does not contain unnecessary infixes.

Try to phrase the question positively.

Make sure that the phrasing of the question does not lead to misconceptions (no ambiguities).

Make sure that the meaning of the question does not change when the words are stressed differently.

The use of context

The use of images can in some cases be more helpful than descriptions alone. The functionality of the image should be clear, however.

The use of pictures, graphs, et cetera should be functional.

The pictures, graphs, drawings, et cetera should be printed clearly.

Make sure that the images are correct.

Make sure that the accompanying information with the images is clear-cut, brief and to the point.

Make sure that the accompanying information does not contain unnecessary information.

Presentation

A clear lay-out of your test material is an important precondition for students taking a test.

The questions should be easily distinguished from each other.

The numeration needs to be clear-cut.

Stick to the general conventions for use of symbols, punctuation, etc.

Check your tables et cetera for mistakes.


The references to texts, drawings, tables or lines in the question should be correct.

Add on:

Avoid using “all of the above”. If “all of the above” is an option and students know two of the options are correct, the answer must be “all of the above”. If they know one is incorrect, the answer must not be “all of the above”. A student may also read the first option, determine that it is correct, and be misled into choosing it without reading all of the options. Furthermore, this alternative is usually not a grammatically correct answer to the question in the stem.

Avoid using “none of the above”. The option “none of the above” does not test whether the student knows the correct answer, but only whether he/she knows that the distractors are incorrect. Furthermore, if this were the correct alternative, why not give the correct answer as one of the alternatives? If that is not possible, it may not be a good question and should therefore not be used.


APPENDIX III: Checklist open-ended questions
By Claudia Loijens

Language

The language use should be clear and unambiguous to the student, to avoid misinterpretation.

The phrasing of the question should be grammatically correct.

The syntax of the question should be as uncomplicated as possible.

Make sure that the question does not contain a double negation.

Make sure that the question does not contain overly difficult language.

Try to phrase the question positively.

Make sure that the phrasing of the question does not lead to misconceptions (no ambiguities).

Make sure that the meaning of the question does not change when the words are stressed differently.

Information

The stem of the question should contain enough relevant information for the student to be able to answer as completely as possible.

(The stem of) The question should contain enough information for students to give the correct answer.

Make sure that the question contains enough information about the expected length, form and comprehensiveness of the answer.

Be clear about whether you expect a motivated answer from the student.

The informational part of the question and the definition of the problem/question should be easily distinguishable from each other.

Relevance and content validity

The question should clearly show what subject matter and/or skill is being tested.

It should not be possible to answer the question by using another skill than the intended one.

Make sure that the question is not a trick question. It should not imply a problem that doesn’t really exist.

Make sure that the question does not unintentionally contain clues to the correct answer. Test-wise students use such clues to answer the question correctly, or at least partially.

The level of difficulty of the question should be acceptable for the intended knowledge level of the students.

The level of difficulty of the question should not be increased by irrelevant information in the questions (unless exactly this is the intended learning goal).

Make sure that the question format is suitable for the aim of the test.

The level of difficulty of the test as a whole should be acceptable for the intended knowledge level of the students.

The test as a whole should give a reasonable representation (of the specifications) of the attainment targets of the course.


Objectivity

A question is considered objective when several experts all indicate the same answer as the correct one. Objectivity has nothing to do with the content of the question; that can be very subjective.

The experts must agree on the correct answer(s).

Make sure that the student is not asked to give his/her opinion without having to answer with regards to the context of the course or future profession. An opinion is hard, if not impossible, to grade.

The use of context

The use of images can be more helpful in some cases than descriptions alone. The functionality of the image should be clear.

The use of pictures, graphs, et cetera should be functional.

The pictures, graphs, drawings, et cetera should be printed clearly.

The images should be correct.

The accompanying information with the images should be clear-cut, brief and to the point.

The accompanying information should not contain unnecessary information.

Presentation

A clear lay-out of your test material is an important precondition for students taking a test.

Make sure that the questions are easily distinguishable from each other.

The numeration should be clear-cut.

Stick to the general conventions for use of symbols, punctuation, et cetera.

Check your tables for mistakes.

Make sure that the references to texts, drawings, tables or lines in the question are correct.

Scoring instructions

Scoring instructions consist of a set of model answers, a marking scheme (test score, number of score points per question, and scoring of partially correct answers) and general scoring instructions (general directions on how to rate/judge a test).

Every question should have a set of model answers.

The answers in the set should be correct, and the given correct answers should be plausible.

Clearly indicate how many score points should be awarded to certain correct elements in an answer. The scoring instructions should be clear about this.

Add general instructions.

The scoring instructions should be clear-cut.

When it is impossible to include a set of model answers for one or more questions, marking standards should be added in the scoring instructions.

The scoring instructions should not be so general or broad that consistent rating cannot be guaranteed.

Nor should the scoring instructions be so detailed that their sheer size makes them hard for the rater to use.

The lay-out of the scoring instructions should enable the rater to get a quick understanding of his or her task.


APPENDIX IV: Avoiding Free Riding Behavior
Source: edc.polyu.edu.hk/resources/free_riders.pdf

Best practices for coping with the free-rider problem in group work

The use of group assignments/projects has become a more common form of assessment in higher education globally. Students, however, often react negatively to group work due to two main factors: low grades on assessment and free riders in the group. The free-rider problem may contribute to both, as individual, hard-working students often adopt a sour-grapes framework and blame lower-than-anticipated grades on their free-riding classmates, who have the potential to wreak havoc and create ill will. Typical free-riding behaviors reported include inconsequential contributions to group work, students who mysteriously show up only for final presentations, and students who are willing to take advantage of their classmates' passivity.

One difficulty that arises with student group work is the assessment of individual student work in a group context. Assessing students individually is important because academic grades are supposed to represent the individual student's performance. When asking students to work in groups, it cannot be taken for granted that each student has contributed equally to any given assignment. A small but not insignificant number of students may attempt to take advantage of the group-work setting and contribute a minimal amount and quality of work. These students ride on the efforts of others, which is why they are dubbed 'free riders.' They are a source of frustration for student and teacher alike.

This paper is broken into four main sections. The first allows the reader to situate his/her choice amongst a continuum of options for addressing free-rider problems in group work, based on current practices at PolyU. The second draws on international best practices and offers teachers ideas about the design of their group projects. The third offers advice on ways to encourage students to take the initiative in policing their fellow group members, and the fourth discusses using peer review as a formative and summative tool.

1. Assessment strategies for teachers: a continuum of six general strategies

Broadly speaking, there are six general approaches to assessment design that teachers employ to prevent or reduce free-riding in their courses. Each approach has its pluses and minuses, and it is not easy to generalize about which is preferable given the wide range of subjects in which group work is used. The best assignments build in individual accountability to prevent social loafing. Besides the evaluation of the group work itself, other components have to be part of the assessment, or supplementary assessments offered within the context of the course:

Use individual-based assessment only, even in group work. This is the surest way to prevent free-riding, but often the toughest to carry out. It will work, but one of the reasons that teachers like group work is that it is meant to lessen the grading burden.


Make each student responsible for a certain section of the overall task and assess each student individually. This may prove difficult in some fields, as it is hard to find equivalent tasks and to achieve an even division of labor when tasks are assigned.

Employ a combination of group and individual work. This is a feasible solution but added work for those who prefer to give just one grade.

Use a group grade but with internal modifications made in light of individual work. Some teachers use peer review or allow students to report back on problematic team members.

Give the group the same grade but prevent free-riding by working on group dynamics. As above, these teachers use either a peer review or a reporting back to the teacher mechanism.

Have an assessment policy/plan where at least a certain amount of the overall subject grade must come from individual assessments. Teachers may use a quiz or part of a test in which students report individually on aspects of their group work.
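A policy like the last strategy amounts to a weighted combination of group and individual components. A minimal sketch in Python, where the weights, the minimum individual share, and the 1-10 grading scale are hypothetical examples rather than any prescribed policy:

```python
# Sketch of an assessment plan where a fixed share of the course grade
# must come from individual assessments (weights are hypothetical).

GROUP_WEIGHT = 0.6          # group project component
INDIVIDUAL_WEIGHT = 0.4     # individual quiz/exam component
MIN_INDIVIDUAL_SHARE = 0.3  # example policy: at least 30% individual

assert INDIVIDUAL_WEIGHT >= MIN_INDIVIDUAL_SHARE

def course_grade(group_grade: float, individual_grade: float) -> float:
    """Combine a shared group grade with an individual grade (1-10 scale)."""
    return round(GROUP_WEIGHT * group_grade +
                 INDIVIDUAL_WEIGHT * individual_grade, 1)

# A free-rider with a strong group result but a weak individual result
# is still pulled down by the individual component:
print(course_grade(8.0, 4.0))  # 0.6*8 + 0.4*4 = 6.4
print(course_grade(8.0, 8.0))  # 8.0
```

The design choice here is that the individual component caps how much a free-rider can profit from the group's work, without requiring any judgment call about contributions.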

2. Design of student group work for successful projects

One of the most effective ways of overcoming the free-rider problem lies in how group work is implemented in a course. While free-riding can be a persistent problem, other troublesome behaviors can also haunt group work, including dominant behavior, poorly constructed assignments and even intentional sabotage. Teachers who employ best practices in group work can actively prevent both these and the free-rider problem. Helpful design factors include:

Employ peer review (see below).

Make groups small enough that a division of labor is feasible and the agreed-upon tasks of each group member can be clearly delineated.

Make individual inputs visible and use a shared presentation time. Ask each individual questions, and let students know that they are jointly responsible for the entire project and might have to answer questions on any part of it.

Avoid giving only a group grade if possible. If not, at least incorporate group peer assessments or design the project so that each student is responsible for a separate part that is assessed.

Remember that effective practices for individual assignments are also effective practices for group assignments. These include using stages at which the work can be monitored and checked, self-reflective mechanisms for monitoring metacognitive skills, individual journals about the work, and the like. Teachers who are only concerned with the final product and ignore the process are more likely to experience the negative consequences of free-riders.

Model work on disciplinary competence. Beyond the potential grading convenience, teachers include group work as a means of mirroring the process of working in their field in the workplace.

Invest time in building teamwork skills. Don't assume that students know how to work in teams. Too many teachers use a strict division of labor where each student performs one task (e.g. abstract, paper, literature review, et cetera) and the stapler – voilà! – turns the work into teamwork. To work successfully in groups, students need to do a task assessment, a division of labor, and a team strategy or plan, coupled with time assessment and management.


Be attentive to special issues that arise in group work but not individual work. Teachers need to be aware of the nuances of group work: students need to be able to explain their ideas to others, listen and be 'open' to other ideas and perspectives, discuss and reach consensus, delegate responsibilities, coordinate efforts and schedules, and resolve conflicts.

Use rubrics to ensure that students understand what they are being assessed on and can act accordingly.

3. Empower students to monitor their own group work

It is important to police student work, as free-riding needs to be detected early, before it can do its damage. One way of policing student behavior to prevent free-riding is by putting the onus directly on the students themselves. The most effective type of cooperative work is one in which students are personally empowered and able to both design and monitor the work of their project and team. Just as teachers can come up with innovative topics for study, empowered students can do so too if given the chance. The following are some tips you can convey to the group to build structure and empower their learning. Consider the following:

Students typically prefer to choose their own groups. Whether they are allowed to do so depends on how important student empowerment is to the teacher relative to the other learning outcomes underpinning the work.

Get students to make an agreed-upon division of labor. Write down the nature of individual tasks, who does what, and who shares what material with the group and when, and include the agreed-upon tasks in the final submission of the project (consider it a potential learning contract). Free-riders dislike transparency, as they hope to take advantage of uncertainty.

Have students timetable their tasks with deadlines. This gives the person who is behind schedule a chance to catch up if there is slack in the schedule; it also makes the team aware of potential problems and allows them to respond.

Students are often too willing to ignore their colleagues' indiscretions. Let them know that they should not encourage free-riding behavior and that there is low to zero tolerance for excuses. Let them know that you prefer that they attempt to solve their own problems before coming to you as the final recourse.

Provide a mechanism for students to deal with uncooperative group members and employ peer review (see next section).

4. Effective use of peer review in group work

One of the most widely adopted mechanisms for avoiding free-riding is to have students conduct peer review of each other's work. Peer review is a powerful tool, so it is important that it is employed wisely. The following are some pointers to keep in mind:

Ensure the criteria by which students grade each other are known and readily available. In addition to understanding the criteria, students should also have an understanding of how to apply them. As mentioned previously, consider using a grading rubric.


Decide whether peer review will be used formatively or summatively. While teachers may prefer the latter for ease of use and the power of the grade, some students just need a slight nudge: a gentle, friendly reminder from their peers is all it takes to get them on task.

Help students by guiding their work and giving well-defined tasks; clearly explain your expectations. Students often dislike group work because of a perceived lack of fairness and a distrust of peer evaluations. It is important to get student 'buy-in', so that they trust the process.

Design an effective peer review form. Spend some time considering which factors are important to you (e.g. comes to meetings on time, produces work on time, volunteers for tasks, makes helpful suggestions, meets deadlines, shares resources, et cetera). Don't be afraid of being explicit.
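Once such a form is in place, the ratings it yields can be aggregated into individual grade adjustments. A minimal sketch in Python, where the 1-5 rating scale, the cap, and the adjustment rule are illustrative assumptions rather than a prescribed procedure:

```python
# Sketch: aggregating peer review ratings into individual grade adjustments.
# The 1-5 scale, the cap and the adjustment rule are illustrative assumptions.

def contribution_factor(own_mean: float, team_mean: float) -> float:
    """Ratio of a member's mean peer rating to the team-wide mean rating."""
    return own_mean / team_mean

def adjusted_grade(group_grade: float, factor: float, cap: float = 1.1) -> float:
    """Scale the shared group grade by the contribution factor, capped so
    strong contributors are not pushed far above the group grade."""
    return round(group_grade * min(factor, cap), 1)

# Mean rating each member received from peers on the form's criteria
# (e.g. 'meets deadlines', 'makes helpful suggestions'), on a 1-5 scale:
ratings = {"A": 4.5, "B": 4.3, "C": 2.0}          # C looks like a free-rider
team_mean = sum(ratings.values()) / len(ratings)  # 3.6

for member, own_mean in ratings.items():
    print(member, adjusted_grade(7.0, contribution_factor(own_mean, team_mean)))
```

Whatever the aggregation rule, the teacher should still verify that the ratings are valid before adjusting any grade.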

Those who do not actively wish to use peer assessments may use them as a safety mechanism to warn the teacher of troublesome behavior. One way this is done is by having the students recommend a percentage of the group grade the offending party should receive. The teacher then needs to verify whether the complaint is valid and, if verified, to adjust the corresponding grade.

Helping the free-riders

Although free-riders are often depicted as the 'bad guys', sometimes they are victims of circumstance too. Hall and Buzwell (2012) alert us to possible reasons for free-riding beyond social loafing, such as differing learning styles. 'Involuntary free-riding' (Vernon, 2008) can result from group dynamics, where low-status members gradually become more submissive and contribute less as their status becomes more polarized within the group over the course of the project. Teaching students how to work in a group and how to handle group dynamics may be another way to solve the free-rider problem.

Although free-riding is a perennial problem in group work, this paper has attempted to illustrate some of the ways a teacher can employ clever strategies to reduce the stress it causes to teacher and student alike.

References:

Dingel, M., Wei, W. and Huq, A. (2013). Cooperative learning and peer evaluation: The effect of free riders on team performance and the relationship between course performance and peer evaluation. Journal of the Scholarship of Teaching and Learning, 13(1), 45-56.

Gueldenzoph, L. and May, G. (2002). Collaborative peer evaluation: Best practices for group member assessments. Business Communication Quarterly, 65(1), 9-20.

Hall, D. and Buzwell, S. (2012). The problem of free-riding in group projects: Looking beyond social loafing as reason for non-contribution. Active Learning in Higher Education, 14(1), 37-49.

Pfaff, E. and Huddleston, P. (2003). Does it matter if I hate teamwork? What impacts student attitudes toward teamwork. Journal of Marketing Education, 25(1), 37-45.

Vernon, J. (2008). Involuntary free riding - how status affects performance in a group project. In: van den Heuvel-Panhuizen, M. and Köller, O. (eds.) Challenging Assessment - Book of Abstracts of the 4th Biennial EARLI/Northumbria Assessment Conference 2008. Berlin, Germany: Breitfeld Vervielfältigungsservice. ISBN 978-3-00-025471-0, p. 30.


Weimer M. (2009) Dealing with free riders. Faculty Focus. Retrieved from https://www.facultyfocus.com/articles/teaching-and-learning/dealing-with-free-riders/


APPENDIX V: Checklist Specification Table

By Amy Hsiao, Teacher Development, Academic Support (TiU)

Introduction

What does a specification table look like? A specification table consists of three components: a) course learning goals in the left-hand column, b) cognitive skills based on Bloom’s taxonomy (knowledge, comprehension, application, analysis, evaluation, synthesis/creation) in the horizontal row above the table, and c) weights of different cognitive skills per learning goal in the cells (such as percentage or number of test questions).

Please note that we use the term "test" to refer to diverse types of assessment, such as preliminary exams, resits, assignments, papers, reports, et cetera. A specification table should be established for each type of assessment that contributes to the course grade.

Why do I make a specification table? The most common and most convincing way to show the content validity of an assessment is to make a specification table. Content validity is the most important quality criterion teachers have to consider when developing any assessment type. It refers to how representative a test is of the course learning goals, i.e. the relationship between the learning goals and the test questions.

For whom? Before you make a table, it is important to know the interested parties. In most cases, making a table is important for you as the teacher and assessment developer of the course (sometimes this is done by others). When you construct a test, you need a blueprint to show the content validity, namely how representative the test is of the learning goals. In principle, you make this blueprint once for all future tests of the same course, such as the exam and the resit; both should be based on the same specification table. Only when the learning goals of the course change should the specification table be modified as well. With the specification table, you can see how the test questions are distributed per learning goal and cognitive skill. When you make a new test for the same course, you can use the table to check whether the new test has the same distribution. In addition, the specification table can help a new teacher of the same course construct his or her own test with the same distribution of test questions per learning goal and cognitive skill. You can also use the table to account for the validity to other stakeholders, such as the exam committee or an accreditation/visiting committee. Take the perspective of a test developer, a co-teacher and other stakeholders.


Checklist Specification Table

Guideline: Use course learning goals to develop a test.
Explanation: You have formulated the course learning goals; now use these same goals to develop the test. Please note: if you use learning objectives at the lecture level in the table, the course content will be measured at a very detailed level and it will not be possible to test all learning goals.
Meet the guideline? □ Yes / □ No

Guideline: Choose an assessment type that is suitable for measuring the cognitive skills in the learning goals.
Explanation: For example, a test with multiple choice questions is more suitable for measuring knowledge and comprehension, while an assignment with open questions is more suitable for measuring application and analysis.
Meet the guideline? □ Yes / □ No

Guideline: Make a separate table for each assessment type.
Explanation: If a test with multiple choice questions and an assignment with open questions are both used, make two specification tables. If a resit covers different course content or different learning goals (for example, the entire course instead of the first six lectures), you are also encouraged to make a different table.
Meet the guideline? □ Yes / □ No

Guideline: Make sure that all learning goals are measured by one or more assessment types.
Explanation: The learning goals can be measured by different assessment types, but all learning goals must be measured.
Meet the guideline? □ Yes / □ No

Guideline: Make sure that the distribution of the test questions reflects the importance of the learning goals.
Explanation: A more important learning goal should receive not only more instructional time but also more test questions.
Meet the guideline? □ Yes / □ No

Guideline: Make sure that the highest cognitive skill in each learning goal is measured.
Explanation: Common issues regarding the match between test questions and learning goals are: (a) learning goals that aim at lower cognitive skills are measured at higher levels; (b) learning goals that aim at higher cognitive skills are merely measured at lower levels. The highest cognitive skill must be measured. Lower cognitive skills can also be measured, but it is not necessary to measure all underlying lower skills explicitly for a learning goal that aims at a higher cognitive skill.
Meet the guideline? □ Yes / □ No

Guideline: Account for the match between the table and the test questions.
Explanation: A specification table is a blueprint. What matters most is that this blueprint actually serves as the basis of a test, i.e. that you really use it to construct the test and/or to check its validity. This is why it is important to check whether the test questions indeed match what you have specified in the table: do the test questions match the learning goal and cognitive skill? The accountability should indicate which goal and cognitive skill is measured by which test question (or part of any assessment type).
- For an exam with test questions: fill in the respective question numbers and points/percentages assigned in the cells. For example, "Q1 (5) Q2 (5)" means that 10% of the points are assigned to the first goal at the knowledge level.
- For an assignment/paper/report: make sure the answer model/scoring criteria match the specification table. For example, "formulating hypotheses (10%)" means that 10% of the points are assigned to hypotheses.
Meet the guideline? □ Yes / □ No


Example of a Specification Table

Course name: How to make a specification table for your assessment task

Course code: XX

Assessment type/Question type: Written exam/Open questions

Learning goals (the participant can …), the Bloom's cognitive skills measured (Knowledge, Comprehension, Application, Analysis, Evaluation, Synthesis/Creation), and the questions with their points per goal:

1. Describe the three components of a specification table and interpret how these three components relate to the content validity of the assessment task.
Questions: Q1 (5), Q2 (5), Q3 (4), Q4 (5), Q5 (6). Percentage points for this goal: 25%.

2. Find coherence among learning goals, instruction and assessment when making a specification table to develop an assessment task.
Questions: Q6 (10), Q7 (10), Q8 (10), Q9 (20). Percentage points for this goal: 50%.

3. Elaborate on your opinions regarding the usefulness of making a specification table for developing an assessment task.
Questions: Q10 (25). Percentage points for this goal: 25%.

Percentage points per cognitive skill: Knowledge 10%, Comprehension 15%, Application 30%, Analysis 20%, Evaluation 25%, Synthesis/Creation 0%; total 100%.
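The internal consistency of such a table can be checked mechanically. A minimal sketch in Python using the example table's numbers, where the data layout and function name are illustrative (the flattened layout does not preserve which cell each question occupies, so only row and grand totals are checked):

```python
# Sketch: the example specification table as data, with a consistency check
# that the question points add up per learning goal and to 100% overall.

# Points per question, grouped by learning goal (from the example table);
# points double as percentages because the exam totals 100 points.
table = {
    "goal 1": {"Q1": 5, "Q2": 5, "Q3": 4, "Q4": 5, "Q5": 6},   # target: 25%
    "goal 2": {"Q6": 10, "Q7": 10, "Q8": 10, "Q9": 20},        # target: 50%
    "goal 3": {"Q10": 25},                                     # target: 25%
}
targets = {"goal 1": 25, "goal 2": 50, "goal 3": 25}

def check_specification(table, targets):
    """True if each goal's points match its target weight and the
    grand total is 100."""
    per_goal = {goal: sum(qs.values()) for goal, qs in table.items()}
    total_ok = sum(per_goal.values()) == 100
    return total_ok and all(per_goal[g] == targets[g] for g in targets)

print(check_specification(table, targets))  # → True
```

The same check can be rerun for a resit to confirm it has the same distribution of points per learning goal as the original exam.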


APPENDIX VI: Template for writing assignment descriptions with explanations

By Amy Hsiao, Assessment Specialist, Teacher Development, Academic Support (TiU)

To ensure that you do not forget any important elements, use this template to structure your assignment descriptions. After filling out the content, keep the headings only and delete the explanation texts before handing it out to your students.

Introduction

Name of the writing assignment

Learning goals

List the course learning goals measured by this assignment (the scope of subject matter, expected cognitive skills, and writing skills if they are graded with a substantial weighting).

Alignment with learning goals and instruction

State briefly what thinking skills and course content topics are required by this assignment and what teaching and learning activities have prepared students for writing this higher-level writing assignment.

Genre of the writing assignment

Define the genre (e.g., research paper, essay, report) of the writing assignment.

Task descriptions

Traditional topic-focused task (Bean & Weimer, 2011; Forster, Hounsell, & Thompson, 1995)

If students are allowed to choose from a list of topics, make the list based on the learning goals you want to assess, and restrict the number of topics on it.

Problem-focused task with a rhetorical context6 (Bean & Weimer, 2011)

Write task descriptions as an intriguing problem with a rhetorical context that includes the writer’s role or purpose and their audience.

Product requirements

Length

6 https://writing.colostate.edu/guides/teaching/fys/rhetcontext.cfm


Specify the length in a range (e.g., 900-1100 words) and the consequence when the text is too short or too long.

Formatting requirements

Specify the required format and the citation style (APA, MLA). You can either make a Word template to specify the structure, content and style, or state them in the assignment description, as in the following sample.

Submission

State deadlines for the first draft (if any) and its revision/final version, consequences for late submission, and possibility of re-submission.

Grading and scoring

Grading

State the percentage that this writing assignment counts towards the course final grade.

Explain the relative weight of the draft and final version (if the draft is graded).

State whether there is a pass score/grade on this writing assignment and whether a resit is arranged and in what way.

Scoring criteria

Decide what constitutes a successful piece of writing. Make a list of criteria with indicators and explain the relative weighting in grading between thinking skills on the course subject matter (content) and writing skills (form).

Support and resources

Provide students with a suggested timeline/schedule for processing this assignment or ask students to make this by themselves.

Inform students what kind of support they can expect from you, such as office hours.

Inform students how and when you will give them feedback.

Make a list of resources available to students to help them complete the assignment, such as course materials, key references, example assignments from previous students (if possible, provide students with samples of different levels of performance: strong, average, and weak), tutoring services, links to style guides for bibliographies or outlines, etc. For example,

o Academic and Business Writing https://www.edx.org/course/academic-and-business-writing

o Academic Legal Writing https://edisciplinas.usp.br/pluginfile.php/135669/mod_resource/content/1/Academic.pdf

o Advanced Writing


https://www.coursera.org/learn/advanced-writing

o Assignment types http://owll.massey.ac.nz/main/assignment-types.php

o Critical Thinking https://skillshub.northampton.ac.uk/critical-thinking/


APPENDIX VII: Checklist for designing and assessing group work7

Designing a group assignment and assessing group work is complex.

Make sure that you have considered the following questions before you start with using group work for assessment.

Design and planning

Why am I using group work? □ Yes / □ No

Is group work the most appropriate choice? □ Yes / □ No

Have workload implications been considered? □ Yes / □ No

Should group work be compulsory or optional? □ Yes / □ No

What are the benefits of group work? □ Yes / □ No

Do staff have the necessary knowledge, skills and resources? □ Yes / □ No

What level of group work is required? □ Yes / □ No

What arrangements have been made for distance students to participate? □ Yes / □ No

How will the role of each group member be decided? □ Yes / □ No

What are the ethical implications for staff of role assignment when students' grades depend on the outputs and process of role tasks and responsibilities? □ Yes / □ No

How many students per group? Why? □ Yes / □ No

When will groups meet? □ Yes / □ No

Have expectations about frequency and timing of meetings been made clear? □ Yes / □ No

What workload is expected of students and staff? □ Yes / □ No

What types of tasks are suitable for group work and how should they be designed? □ Yes / □ No

Have students been involved in negotiating some or all aspects of the task? □ Yes / □ No

What support and training do students need to work well in groups? How will this be provided? □ Yes / □ No

What provision has been made for students with disabilities, language difficulties or other problems? □ Yes / □ No

What are the responsibilities of staff and students regarding group work? □ Yes / □ No

Assessment and Feedback

When should group tasks be assessed? □ Yes / □ No

What are the relevant issues affecting assessment? □ Yes / □ No

7 Adapted from https://www.mq.edu.au/lih/pdfs/058_AT_Group_Checklist.pdf


Is there a university policy that needs to be complied with? □ Yes / □ No

Should there be a maximum percentage of the grade allocated to group tasks in any course? □ Yes / □ No

What are the assessment criteria? □ Yes / □ No

Have students been involved in negotiating how the work will be assessed? □ Yes / □ No

Who will do the assessing? Self? Peer? Lecturer? A combination? In what proportion? □ Yes / □ No

What will be assessed? □ Yes / □ No

How do you differentiate between process and product in grading? □ Yes / □ No

How do you assess process? □ Yes / □ No

Do assessors understand how to grade process results, such as journals, learning logs, etc.? □ Yes / □ No

What mechanisms have been put in place to enable staff to monitor parity of contribution between students? □ Yes / □ No

How should grades be allocated between individual students where group work is to count towards grades? □ Yes / □ No

What safety net arrangements have been put in place where group work tasks count towards grades? □ Yes / □ No

What two-way feedback cycles/mechanisms have been designed into the group work process? □ Yes / □ No

How and in what form will feedback be given on completion of the group assignment? □ Yes / □ No


REFERENCES (selection)

Anholt, R. R. H. (2010). Dazzle 'Em With Style: The Art of Oral Scientific Presentation: Elsevier Science.

Bean, J. C., & Weimer, M. (2011). Engaging ideas: The professor's guide to integrating writing, critical thinking, and active learning in the classroom: Wiley.

Berkel, H., & Bax, A. (2014). Toetsen met een mondelinge toets. In H. van Berkel, A. Bax, & D. Joosten-ten Brinke (Eds.), Toetsen in het hoger onderwijs (pp. 157-168). Houten: Bohn Stafleu van Loghum.

Burchard, K. W., Rowland‐Morin, P. A., Coe, N. P. W., & Garb, J. L. (1995). A surgery oral examination: Interrater agreement and the influence of rater characteristics. Academic Medicine, 70(11), 1044-1046.

Clay, B. (2001). Is this a trick question? A short guide to writing effective test questions. Retrieved from http://www.k-state.edu/ksde/alp/resources/Handout-Module6.pdf

Cohen-Schotanus, J., & van der Vleuten, C. P. M. (2010). A standard setting method with the best performing students as point of reference: Practical and affordable. Medical Teacher, 32(2), 154-160. doi: 10.3109/01421590903196979

Crannell, A. (1999). Collaborative oral take home examinations. In B. Gold, S. Z. Keith, & W. A. Marion (Eds.), Assessment practices in undergraduate mathematics. Washington, DC: The Mathematical Association of America.

Dijkstra, J., Latijnhouwers, M., Norbart, A., & Tio, R. A. (2016). Assessing the “I” in group work assessment: State of the art and recommendations for practice. Medical Teacher, 38(7), 675-682. doi:10.3109/0142159X.2016.1170796

Dillenbourg, P. (1999). Chapter 1 (Introduction): What do you mean by 'collaborative learning'? (Vol. 1).

Dingel, M., Wei, W., & Huq, A. (2013). Cooperative learning and peer evaluation: The effect of free riders on team performance and the relationship between course performance and peer evaluation (Vol. 13).

edX - Public Speaking. (2019). Rubric for an Informative Presentation. Retrieved from https://courses.edx.org/courses/course-v1:RITx+SKILLS105x+1T2019/course/

Feichtner, S. B., & Davis, E. A. (1984). Why Some Groups Fail: a Survey of Students' Experiences with Learning Groups. Organizational Behavior Teaching Review, 9(4), 58-73. doi:10.1177/105256298400900409

Felder, R., & Brent, R. (2010). Effective strategies for cooperative learning (Vol. 10).

Forster, F., Hounsell, D., & Thompson, S. (1995). Tutoring and demonstrating: A handbook. Edinburgh, UK: Centre for Teaching, Learning and Assessment, University of Edinburgh.

Gueldenzoph, L. E., & May, G. L. (2002). Collaborative Peer Evaluation: Best Practices for Group Member Assessments. Business Communication Quarterly, 65(1), 9-20. doi:10.1177/108056990206500102

Hall, D., & Buzwell, S. (2013). The problem of free-riding in group projects: Looking beyond social loafing as reason for non-contribution. Active Learning in Higher Education, 14(1), 37-49. doi:10.1177/1469787412467123

Hand, L., & Clewes, D. (2000). Marking the Difference: An investigation of the criteria used for assessing undergraduate dissertations in a business school. Assessment & Evaluation in Higher Education, 25(1), 5-21. doi:10.1080/713611416

Hattie, J., & Timperley, H. (2007). The Power of Feedback. Review of Educational Research, 77(1), 81-112. doi:10.3102/003465430298487

Heijne-Penninga, M. (2010). Open-book tests assessed: Quality, learning behaviour, test time and performance. Groningen: s.n.

Joughin, G. (1998). Dimensions of oral assessment. Assessment & Evaluation in Higher Education, 23(4), 367-378. doi:10.1080/0260293980230404


King, A. (2007). Scripting Collaborative Learning Processes: A Cognitive Perspective. In (pp. 13-37).

Lejk, M., & Wyvill, M. (2001). Peer Assessment of Contributions to a Group Project: A comparison of holistic and category-based approaches. Assessment & Evaluation in Higher Education, 26(1), 61-72. doi:10.1080/02602930020022291

Moskal, B. M. (2000). Scoring rubrics: What, when and how? Practical Assessment, Research & Evaluation, 7(3). http://PAREonline.net/getvn.asp?v=7&n=3

Oakley, B., Brent, R., Felder, R., & Elhajj, I. (2004). Turning student groups into effective teams (Vol. 2).

Palmquist, M. (2005). The Bedford Researcher. Bedford/St. Martin's.

Parsons, D. (2004). Justice in the classroom: Peer assessment of contributions in group projects.

Pell, G., Boursicot, K., & Roberts, T. (2009). The trouble with resits …. Assessment & Evaluation in Higher Education, 34(2), 243-251. doi:10.1080/02602930801955994

Pfaff, E., & Huddleston, P. (2003). Does it matter if I hate teamwork? What impacts student attitudes toward teamwork (Vol. 25).

Quigley, B. L. (1998). Designing and Grading Oral Communication Assignments. New Directions for Teaching and Learning, 1998(74), 41-49. doi:10.1002/tl.7404

Race, P. (2001). The Lecturer's Toolkit: A Practical Guide to Learning, Teaching & Assessment: Kogan Page.

Race, P., & Brown, S. (2006). The lecturer's toolkit. London, UK: Routledge.

Saunders, M. N. K., & Davis, S. M. (1998). The use of assessment criteria to ensure consistency of marking: Some implications for good practice. Quality Assurance in Education, 6(3), 162-171. doi:10.1108/09684889810220465

University of New South Wales (2015). Open book and take home exams. https://student.unsw.edu.au/open-book-and-take-home-exams

Vernon, J. (2019). Involuntary free riding - how status affects performance in a group project.

Watkins, R. (2005). Groupwork and assessment Retrieved from https://www.economicsnetwork.ac.uk/handbook/groupwork/welcome

Webster, F., Pepper, D., & Jenkins, A. (2000). Assessing the undergraduate dissertation. Assessment & Evaluation in Higher Education, 25(1), 71-80. doi:10.1080/02602930050025042

Williams, L., & Kemp, S. (2018). Independent markers of master’s theses show low levels of agreement. Assessment & Evaluation in Higher Education, 1-8. doi:10.1080/02602938.2018.1535052