Fundamentals of Assessment and Grading

Alice CHUANG, MD Department of Obstetrics and Gynecology University of North Carolina-Chapel Hill Chapel Hill, NC AOE Basic Teaching Skills Curriculum April 16, 12:00 PM, Bondurant G010 Fundamentals of Assessment and Grading APGO Clerkship Directors’ School


TRANSCRIPT

Page 1: Fundamentals of Assessment and Grading

Alice CHUANG, MD
Department of Obstetrics and Gynecology
University of North Carolina-Chapel Hill
Chapel Hill, NC

AOE Basic Teaching Skills Curriculum
April 16, 12:00 PM, Bondurant G010

Fundamentals of Assessment and Grading

APGO Clerkship Directors’ School

Page 2: Fundamentals of Assessment and Grading

Neither I nor my spouse has any financial interests to disclose related to this talk.

Page 3: Fundamentals of Assessment and Grading

Objectives

Understand reliability and validity
Contrast formative and summative evaluation
Compare and contrast norm-referenced and criterion-referenced assessments
Improve delivery of feedback
Understand the NBME exam
Be familiar with different testing formats, their uses and their limitations

Page 4: Fundamentals of Assessment and Grading

Terminology

Validity: Are we measuring what we think we’re measuring?
Content: Does the instrument measure the depth and breadth of the content of the course? Does it inadvertently measure something else?
Construct: Do the evaluation criteria or grading construct allow for true measurement of the knowledge, skills or attitudes taught in the course? Is any part of the grading construct irrelevant?
Criterion: Does the outcome correlate with true competencies? Does it relate to important current or future events? Is the assessment relevant to future performance?

http://pareonline.net/getvn.asp?v=7&n=10

Page 5: Fundamentals of Assessment and Grading

Examples

Validity
Content: A summative ob/gyn test that covered only obstetrics
Construct: You allow students to use their textbook for a knowledge-based multiple choice test of foundational information on prenatal care.
Criterion: New Coke v. Old Coke

Page 6: Fundamentals of Assessment and Grading

Terminology

Reliability: Are our measurements consistent? The score should be the same no matter when the assessment was taken, who scored it, or when it was scored.
Interrater reliability: Is a student’s score consistent between evaluators?
Intrarater reliability: Is a student’s score consistent with the same rater even if rated under different circumstances?
Scoring rubric: standardized method of grading to increase interrater and intrarater reliability

http://pareonline.net/getvn.asp?v=7&n=10

Page 7: Fundamentals of Assessment and Grading

Examples:

In general, if you repeat the same assessment, will you get the same answer?
Interrater: 3 individuals are asked to go to the beach and estimate how many seagulls they see from 6-7 AM, and they come up with 200, 800 and 1200.
Intrarater: A particular food critic always gives low scores for food quality if the server is female.
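Interrater consistency can also be put into a number. The sketch below is one common way to do that, not something taken from the slides: it computes Cohen's kappa, which is the observed agreement between two raters corrected for the agreement expected by chance. The function name and the two lists of scores are hypothetical illustrations.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equally long, non-empty lists of ratings")
    n = len(rater_a)
    # Fraction of students to whom both raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected if each rater scored at random with their own frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical rubric scores (0-3) from two evaluators for the same eight students.
faculty = [3, 2, 2, 1, 0, 3, 2, 1]
resident = [3, 2, 1, 1, 0, 3, 2, 2]
kappa = cohens_kappa(faculty, resident)
print(round(kappa, 2))
```

A kappa near 1 suggests the two evaluators apply the scale in the same way; a kappa near 0 suggests their agreement is no better than chance.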

Page 8: Fundamentals of Assessment and Grading

Examples: Show Choir Audition Rubric

Singing Skills
Poor Candidate (0 points): Sings with as much expression as a wet noodle, cannot identify which tune candidate is singing, also cannot identify what the lyrics of song are secondary to poor pronunciation
Fair Candidate (1 point): Minimally expressive, pitch off significantly on occasion, diction unclear at times
Good Candidate (2 points): Very expressive, sings on pitch most of the time with minor errors, diction clear most of the time
Superior Candidate (3 points): Artistically expressive, sings on pitch, diction clear

Dancing Skills
Poor Candidate (0 points): Has 2 left feet, unable to learn new steps and continues to dance like MC Hammer despite different choreography demonstrated
Fair Candidate (1 point): Missteps despite multiple attempts, no artistic expression in dance moves, unable to learn new choreography after 3 demonstrations
Good Candidate (2 points): Occasionally missteps, but overall dance steps are accurate, adapts choreography fairly rapidly
Superior Candidate (3 points): Quick and nimble, dances artistically, able to learn new choreography quickly

Enthusiasm for Show Choir
Poor Candidate (0 points): Freely admits not knowing what GLEE is
Fair Candidate (1 point): Endorses enjoyment of GLEE, but unable to identify favorite character
Good Candidate (2 points): Has watched 70% of GLEE episodes
Superior Candidate (3 points): Has seen every episode of GLEE, all GLEE albums confirmed in iTUNES library, has been to GLEE LIVE each summer
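A rubric like this one translates directly into a small scoring routine. The following is a minimal sketch: the level names and point values come from the rubric, but adding the three criteria with equal weight is an assumption (the slide does not say how the points are combined), and the function name and example ratings are hypothetical.

```python
# Levels and point values as listed in the show choir rubric.
LEVEL_POINTS = {"Poor": 0, "Fair": 1, "Good": 2, "Superior": 3}
CRITERIA = ("Singing Skills", "Dancing Skills", "Enthusiasm for Show Choir")


def score_candidate(ratings):
    """Sum rubric points across criteria (equal weighting is an assumption)."""
    missing = [c for c in CRITERIA if c not in ratings]
    if missing:
        raise ValueError(f"unrated criteria: {missing}")
    # An unknown level name raises KeyError, so typos are caught rather than scored.
    return sum(LEVEL_POINTS[ratings[c]] for c in CRITERIA)


# Hypothetical ratings for one candidate.
example = {
    "Singing Skills": "Good",
    "Dancing Skills": "Superior",
    "Enthusiasm for Show Choir": "Fair",
}
total = score_candidate(example)
print(total)
```

Forcing every rater through the same fixed levels and criteria is what gives a rubric its effect on interrater and intrarater reliability.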

Page 9: Fundamentals of Assessment and Grading

Formative v. summative assessments

Formative: ongoing assessment, designed to help improve the educational program as well as learner progress
Summative: designed to evaluate overall student performance at the end of an educational phase and to evaluate the effectiveness of teaching

http://fcit.usf.edu/assessment/basic/basica.html

Page 10: Fundamentals of Assessment and Grading

Examples

Formative: short multiple choice exam, written in house, that is pass/fail; answers are reviewed with the class at the end of the testing session
Summative: NBME exam

Page 11: Fundamentals of Assessment and Grading

Formative v. summative assessments

ED30: The directors of all courses and clerkships must design and implement a system of formative and summative evaluation of student achievement in each course and clerkship.

Those responsible for the evaluation of student performance should understand the uses and limitations of various test formats, the purposes and benefits of criterion-referenced vs. norm-referenced grading, reliability and validity issues, formative vs. summative assessment, etc.

Page 12: Fundamentals of Assessment and Grading

Formative v. summative assessments

ED31: Each student should be evaluated early enough during a unit of study to allow time for remediation.

ED32: Narrative descriptions of student performance and of non-cognitive achievement should be included as part of evaluations in all required courses and clerkships where teacher-student interaction permits this form of assessment.

Page 13: Fundamentals of Assessment and Grading

Formative v. summative assessments

Uses for assessments

                             Formative                             Summative
Purpose                      Feedback for learning                 Certification/Grading
Breadth of scope             Narrow focus on specific objectives   Broad focus on general goals
Scoring                      Explicit feedback                     Overall performance
Learner affective response   Little anxiety                        Moderate to high anxiety
Target audience              Learner                               Society

Page 14: Fundamentals of Assessment and Grading

Characteristics of feedback

Effective Feedback:
given with the goal of improvement
timely
honest
respectful
clear
issue-specific
objective
supportive
motivating
action-oriented
solution-oriented

Destructive Feedback:
unhelpful
accusatory
personal
judgmental
subjective
It also undermines the self-esteem of the receiver, leaves the issue unresolved, and leaves the receiver unsure how to proceed.

http://www.expressyourselftosuccess.com/the-importance-of-providing-constructive-feedback/

Page 15: Fundamentals of Assessment and Grading

Feedback…from APGO/CREOG 2011

When you…
You give the impression…
I would stop…
I would recommend…instead

Page 16: Fundamentals of Assessment and Grading

Norm-referenced v. criterion-referenced assessments

Norm-referenced
Purpose is to classify students in order of achievement from low to high
Allows comparisons of students
May not give accurate information regarding student abilities
Half of the students should score above the midpoint score and the other half should score below it

Rickets C. A plea for the proper use of criterion-referenced tests in medical assessment. Med Educ, Vol 43, Issue 12.

Page 17: Fundamentals of Assessment and Grading

Norm-referenced v. criterion-referenced assessments

Criterion-referenced
Purpose is to evaluate students’ knowledge and skills compared to a pre-determined goal performance level
Gives information about a student’s achievement of certain objectives
Should be possible for everyone to earn a passing score

Rickets C. A plea for the proper use of criterion-referenced tests in medical assessment. Med Educ, Vol 43, Issue 12.

Page 18: Fundamentals of Assessment and Grading

Example

Norm-referenced: Soccer tryouts where 11 players are chosen out of 40
Criterion-referenced: Test for driver’s license
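The difference between the two approaches shows up clearly when the same scores are graded both ways. This is a minimal sketch with hypothetical names, scores, slot count and cutoff: the norm-referenced function fills a fixed number of places by rank (like the soccer tryout), while the criterion-referenced function passes everyone who reaches a fixed standard (like the driver's license test).

```python
def norm_referenced_select(scores, slots):
    """Pick the top `slots` candidates by rank, regardless of absolute score."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:slots]


def criterion_referenced_pass(scores, cutoff):
    """Everyone at or above the fixed cutoff passes; all may pass or all may fail."""
    return [name for name, score in scores.items() if score >= cutoff]


# Hypothetical candidates and scores.
scores = {"Ana": 92, "Ben": 85, "Cal": 78, "Dee": 74, "Eli": 61}

team = norm_referenced_select(scores, slots=2)
licensed = criterion_referenced_pass(scores, cutoff=70)
print(team)
print(licensed)
```

With rank-based selection the number of successful candidates is fixed in advance; with a cutoff it depends only on how many candidates meet the standard.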

Page 19: Fundamentals of Assessment and Grading

Norm-referenced v. criterion-referenced assessments

Be sure your assessment is appropriately norm-referenced or criterion-referenced.

Be sure that your assessment is designed with this in mind.

Most assessments in medical education are criterion-referenced.

Norm-referenced tests should emphasize variability; criterion-referenced tests should emphasize accuracy of tested material.

Page 20: Fundamentals of Assessment and Grading

NBME

Exams
Developed by committees and content experts
Same protocol used to build Step 1 and Step 2

In general
Subject exams are provided to:
all 130 LCME-accredited medical schools in the US
8 Canadian medical schools
8 osteopathic medical schools
22 international medical schools

Page 21: Fundamentals of Assessment and Grading

NBME

Scores are scaled to have a mean of 70 and an SD of 8, based on 9000 first-time test takers from 80+ schools who took the exam as an end-of-clerkship exam in 1993-94.

Scores do not reflect percentage of questions answered correctly.
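To see why a scaled score is not a percentage correct, it can help to work through a simple linear rescaling. The sketch below is an illustration only: the slides give the target mean (70) and SD (8) but do not describe the NBME's actual scaling procedure, so the linear formula is an assumption and the reference group's raw mean and SD are hypothetical numbers.

```python
TARGET_MEAN = 70.0  # scaled mean stated on the slide
TARGET_SD = 8.0     # scaled SD stated on the slide


def scale_score(raw_correct, ref_raw_mean, ref_raw_sd):
    """Linear rescaling (an assumption): place a raw score on the 70/8 scale."""
    # How many reference-group SDs the examinee sits above or below the mean.
    z = (raw_correct - ref_raw_mean) / ref_raw_sd
    return TARGET_MEAN + TARGET_SD * z


# Hypothetical reference group: mean of 65 items correct, SD of 10 items.
scaled = scale_score(75, ref_raw_mean=65, ref_raw_sd=10)
print(scaled)
```

Under these assumed numbers, an examinee one SD above the reference mean receives a scaled score of 78, whatever the number of items on the exam.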

Page 22: Fundamentals of Assessment and Grading

NBME: What do those scores mean?

A score of 60 in the fourth quarter means that 2% of the examinees in the fourth quarter scored 60 or below!

Percentile ranks, 2011-2012

Score         Total year   Q1   Q2   Q3   Q4
93 or above   98           99   98   97   97
92            97           98   98   97   96
86            90           93   91   89   88
80            75           80   77   73   71
78            67           71   69   63   62
74            49           54   51   45   44
70            29           33   32   26   25
62            6            7    6    5    4
60            3            4    4    3    2
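Reading this table amounts to a lookup: find the row for a score and read off the percentage of examinees at or below it. The sketch below encodes only the fourth-quarter column shown on the slide. Because the slide lists selected scores rather than every score, the function falls back to the nearest tabulated score at or below the one given, which is a conservative choice made for this example; the function and variable names are hypothetical.

```python
import bisect

# (score, percentile rank) pairs for the fourth quarter, from the slide.
# The "93 or above" row is stored under 93.
Q4_PERCENTILES = [
    (60, 2), (62, 4), (70, 25), (74, 44), (78, 62),
    (80, 71), (86, 88), (92, 96), (93, 97),
]


def percentile_floor(score, table=Q4_PERCENTILES):
    """Percentile rank of the highest tabulated score not exceeding `score`."""
    cut_scores = [s for s, _ in table]
    idx = bisect.bisect_right(cut_scores, score) - 1
    if idx < 0:
        return None  # below the lowest score shown on the slide
    return table[idx][1]


print(percentile_floor(60))
print(percentile_floor(75))
```

A score of 75 is not in the table, so the lookup reports the rank for 74; the true rank lies somewhere between the values for 74 and 78.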

Page 23: Fundamentals of Assessment and Grading

NBME: Academic purpose for exam

Purpose                     %
Advanced placement          5
Course/clerkship            95
Year-end                    12
Make-up                     21
Minimal competence          44
Identify at risk students   23
Practice for USMLE          47
Promotion requirement       37
Review course               1
Student self-assessment     26
Other                       4

Total responses: 78

Page 24: Fundamentals of Assessment and Grading

NBME: Weight given the subject exam

Weight given the subject exam   %
1-10%                           4
11-20%                          16
21-30%                          33
31-40%                          39
41-50%                          13
>50%                            0

Total number responding: 70

Page 25: Fundamentals of Assessment and Grading

NBME 2008 Clerkship Survey Results

Assessment/Evaluation Method              Ob/gyn (%)
Computer Case Simulations                 0.5
Subject Exam                              30
School’s MCQ Exam                         9
Observation and evaluation by residents   28
Observation and evaluation by faculty     26
Oral exam                                 14
OSCE                                      12
Peer evaluation                           1
Standardized patient exam                 3
Other                                     18

Total number responding: 81

Page 26: Fundamentals of Assessment and Grading

NBME

2004 and 2009 surveys of performance guidelines across clerkships

Recommend setting an absolute rather than a relative standard for performance
Angoff procedures: item-based; for each question, judges estimate the proportion of minimally proficient examinees who would answer it correctly
Hofstee method: judges determine the minimum and maximum acceptable passing scores and the minimum and maximum acceptable percentage of failures; these limits are then plotted against a graph of exam score versus failure rate
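The arithmetic behind an Angoff-style cut score is short enough to sketch. In the version below, each judge gives, for every item, the estimated probability that a minimally proficient examinee answers it correctly; the estimates are averaged across judges for each item and then summed across items, giving the expected number of items such an examinee would get right. The judges' estimates are hypothetical, and real standard-setting panels add discussion rounds and other steps not shown here.

```python
def angoff_cut_score(judge_estimates):
    """Expected number of items a minimally proficient examinee answers correctly.

    judge_estimates[j][i] is judge j's estimated probability (0 to 1) that a
    minimally proficient examinee answers item i correctly.
    """
    if not judge_estimates or not judge_estimates[0]:
        raise ValueError("need at least one judge and one item")
    n_judges = len(judge_estimates)
    n_items = len(judge_estimates[0])
    if any(len(row) != n_items for row in judge_estimates):
        raise ValueError("every judge must rate every item")
    # Average the judges' estimates item by item, then add up the item means.
    item_means = [
        sum(row[i] for row in judge_estimates) / n_judges for i in range(n_items)
    ]
    return sum(item_means)


# Hypothetical panel: three judges rating a four-item exam.
panel = [
    [0.5, 0.75, 0.25, 1.0],
    [0.5, 0.50, 0.50, 1.0],
    [0.5, 1.00, 0.00, 1.0],
]
cut = angoff_cut_score(panel)
print(cut)
```

Here the cut score works out to 2.5 of 4 items. Because it is set from the items rather than from how a particular class performed, it is an absolute standard.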

Page 27: Fundamentals of Assessment and Grading

NBME

Page 28: Fundamentals of Assessment and Grading

Testing Formats

Multiple choice exam (MCQ)
Objective structured clinical examination (OSCE)
Oral examination
Direct observation
Simulation
Standardized patient
Patient/procedure log
Medical record reviews
Written essay questions

Casey et al, To the point: reviews in medical education – the Objective Structured Clinical Examination. AJOG, Jan 2009.

Page 29: Fundamentals of Assessment and Grading

Testing format: MCQ

Use distractors which could plausibly represent the correct answer
Use a question format, not complete-the-statement format
Emphasize higher-level thinking, not strict memorization
Keep option length consistent within a question
Balance the placement of the correct answer
Use correct grammar
Avoid clues to the correct answer
Highly reliable and valid for assessing knowledge

http://testing.byu.edu/info/handbooks/14%20Rules%20for%20Writing%20Multiple-Choice%20Questions.pdf

Page 30: Fundamentals of Assessment and Grading

Testing format: OSCE

Examinees rotate through a circuit of stations (5-10 minutes each)
One-on-one examination (with examiner or trained or simulated patient)
List of criteria for successful completion of each station
Each station tests a specific skill or competency
Good for examining higher-order skills, clinical and technical skills
Requires large amount of resources

Page 31: Fundamentals of Assessment and Grading

Testing format: Oral Exam

Portfolio based: similar to case-based portion of Oral Boards
Poor inter-rater and intra-rater reliability
Scores higher when scored live versus on video
Teaching students how to do better on oral exam does not improve scores
Practicing oral exams does improve scores
Mock public oral exam improves performance

Limitations
Halo effect (grade reflects not only performance on exam but also previous experience)
Subconscious consensus grading: examiners take subconscious cues from each other.

Burch & Seggie, 2008; Kearney et al, 2001; Burchard et al, 2007; Jacobsohn et al, 2006

Page 32: Fundamentals of Assessment and Grading

Testing format: Oral Exam

Is an oral exam justified? Is there an advantage?
Does the material lend itself to open questioning?
How will communication skills and delivery of information be graded? Will only content be graded?
Is the examiner experienced? Will he/she skew grades in any way?
How will you prepare students for the exam?
Is there enough time to examine every student adequately?
How much prompting/assistance is allowed for oral examination?
How much time will you allow for “thinking?”
How will you ensure consistency in these areas for all examinees?

http://testing.byu.edu/info/handbooks/14%20Rules%20for%20Writing%20Multiple-Choice%20Questions.pdf

Page 33: Fundamentals of Assessment and Grading

Testing format: Direct observation

Formalized criteria
Various observers
True-to-life clinical setting (versus simulated)
Numerical scores
Comment anchored
Improve reliability with multiple perspectives
Consider 360 evaluation (including self, patient and other staff members)

Page 34: Fundamentals of Assessment and Grading

Testing format

                       MCQ   OSCE   Direct obs   Oral exam
Content                +++   ++     +            +
Construct              +++   ++     +            +
Criterion              +     ++     +            +
Reliability            +++   ++     +            +
Formative              Y     Y      Y            Y
Summative              Y     Y      Y            Y
Norm-referenced        Y     N      N            N
Criterion-referenced   Y     Y      Y            Y

Page 35: Fundamentals of Assessment and Grading

General rules of thumb

Be sure your assessment
Provides reliable data
Provides valid data
Provides valuable data
Is feasible
Can be incorporated into the systems in place (hospital, clinic, curriculum, etc.)
Is consistent with course objectives
Utilizes multiple instruments, multiple assessors and multiple points of assessment
Aligns with pre-specified criteria
Is fair

Lynch and Swing. Key Considerations for Selecting Assessment Instruments and Implementing Assessment Systems. ACGME.

Page 36: Fundamentals of Assessment and Grading

Bond, Linda A. (1996). Norm- and criterion-referenced testing. Practical Assessment, Research & Evaluation, 5(2). Accessed at http://pareonline.net/getvn.asp?v=5&n=2

Burch VC, Seggie JL. Use of a structured interview to assess portfolio-based learning. Med Ed 2008: 42: 894-900.

Burchard K et al. Is it live or is it Memorex? Student oral examinations and the use of video for additional scoring. Am J Surg. 193 (2007), 233-236.

Casey et al, To the point: reviews in medical education – the Objective Structured Clinical Examination. AJOG, Jan 2009.

Jacobsohn E, Kock PA, Avidan M. Poor inter-rater reliability on mock anesthesia oral examinations.

Kearney RA et al. The inter-rater and intra-rater reliability of a new Canadian oral examination format in anesthesia is fair to good. Can J Anesth 2002; 49:3, 232-236.

Lynch and Swing. Key Considerations for Selecting Assessment Instruments and Implementing Assessment Systems. ACGME.

Metheny WP, Espey EL, Bienstock J, et al. To the point: Medical education reviews evaluation in context: Assessing learners, teachers, and training programs. Am J Obstet Gynecol. 2005;192(1):34-37.

Moskal, Barbara M. & Jon A. Leydens (2000). Scoring rubric development: validity and reliability. Practical Assessment, Research & Evaluation, 7(10). Retrieved December 29, 2009 from http://PAREonline.net/getvn.asp?v=7&n=10

Rickets C. A plea for the proper use of criterion-referenced tests in medical assessment. Med Educ, Vol 43, Issue 12.

References

Page 37: Fundamentals of Assessment and Grading

14 Rules for Writing Multiple Choice Questions. Brigham Young University 2001 Annual Conference. Accessed at http://testing.byu.edu/info/handbooks/14%20Rules%20for%20Writing%20Multiple-Choice%20Questions.pdf

Formative vs. Summative Assessments. Classroom Assessment. Accessed at: http://fcit.usf.edu/assessment/basic/basica.html

NBME 2008 Clinical Clerkship Director Survey Results. Accessed at https://portal.nbme.org/web/medschools/home?p_p_id=62_INSTANCE_dOGM&p_p_action=0&p_p_state=maximized&p_p_mode=view&p_p_col_id=column-1&p_p_col_count=1&_62_INSTANCE_dOGM_struts_action=%2Fjournal_articles%2Fview&_62_INSTANCE_dOGM_keywords=&_62_INSTANCE_dOGM_advancedSearch=false&_62_INSTANCE_dOGM_andOperator=true&_62_INSTANCE_dOGM_groupId=1172&_62_INSTANCE_dOGM_searchArticleId=&_62_INSTANCE_dOGM_version=1.0&_62_INSTANCE_dOGM_name=&_62_INSTANCE_dOGM_description=&_62_INSTANCE_dOGM_content=&_62_INSTANCE_dOGM_type=&_62_INSTANCE_dOGM_structureId=&_62_INSTANCE_dOGM_templateId=&_62_INSTANCE_dOGM_status=approved&_62_INSTANCE_dOGM_articleId=817480

Objective Structured Clinical Examination. Wikipedia. Accessed at http://en.wikipedia.org/wiki/Objective_structured_clinical_examination

Reliability and Validity. Classroom Assessment. Accessed at: http://fcit.usf.edu/assessment/basic/basicc.html

Talk about teaching: Significant issues in Oral Examinations. Contributed by Meryl Carlson, Concordia College, Moorhead, MN. Accessed at http://www.cord.edu/faculty/ulnessd/oral/MCarlson/questions.html

References