Lessons from High-Stakes Licensure Examinations for Medical School Examinations


Post on 03-Jan-2016







Lessons from High-Stakes Licensure Examinations for Medical School Examinations. Queen's University, 4 December 2008. Dale Dauphinee, MD, FRCPC, FCAHS.


<ul><li><p>Lessons from High-Stakes Licensure Examinations for Medical School Examinations</p><p>Queen's University</p><p>4 December 2008</p><p>Dale Dauphinee, MD, FRCPC, FCAHS</p></li>
<li><p>Background: FAME Course</p><p>The goal today is to offer insights for those of you working at the undergraduate level, looking back on my two careers in assessment: Undergraduate Associate Dean and CEO of the MCC!</p></li>
<li><p>Validating Test Scores and Decisions</p><p>Pulling All of the Pieces Together!</p><p>Dale Dauphinee</p></li>
<li><p>Seeing the woods for the trees and defining the way ahead!</p><p>Why? To ensure that you keep out of trouble and get the effect/impact that you want!</p></li>
<li><p>FAME Course Framework</p><p>Assessment Frames</p><p>Themes</p><p>Knowledge and Reasoning</p><p>Clinical Skills</p><p>Workplace Performance</p><p>Program Evaluation</p><p>Scoring, Analysis &amp; Reporting</p><p>Test Material Development</p><p>Standard Setting</p><p>Test Design: Constructed Response</p><p>Test Design: Content and Validity</p></li>
<li><p>Elements of Talk</p><p>Process: be clear on why you are doing this!</p><p>Describe: assessment steps written down</p><p>Item design: key issues</p><p>Structure: be clear where decisions are made</p><p>Outcome: pass-fail or honours-pass-fail</p><p>Evaluation cycle: it is about improvement!</p><p>Getting into trouble</p><p>Problems in process: questions to be asked</p><p>Never ask them after the fact: ANTICIPATE</p><p>Prevention</p></li>
<li><p>Preparing a Course Flow Chart</p><p>For whom and what?</p><p>What is the practice/curriculum model?</p><p>What method?</p><p>What is the blueprint and sampling frame?</p><p>To what resolution level will they answer?</p><p>Scoring and analysis</p><p>Decision making</p><p>Reporting</p><p>Due process</p><p>HINT: Think project management! 
What are the intended steps?</p></li>
<li><p>Classic Assessment Cycle</p><p>Desired Objectives or Attributes</p><p>Educational Program</p><p>Assessment of Performance</p><p>Performance Gaps</p><p>Program Revisions</p></li>
<li><p>Change in the Hallmarks of Competence - Increase Validity</p><p>Knowledge assessment</p><p>Problem-solving assessment</p><p>Clinical skills assessment</p><p>Practice assessment</p><p>(adapted from van der Vleuten 2000)</p></li>
<li><p>Climbing the Pyramid</p><p>Knows</p><p>Knows how</p><p>Shows how</p><p>Does</p></li>
<li><p>Traditional View</p><p>Curriculum</p><p>Teacher</p><p>Assessment</p><p>Student</p><p>After van der Vleuten - 1999</p></li>
<li><p>An Alternative View</p><p>Curriculum</p><p>Teacher</p><p>Assessment</p><p>Student</p><p>After van der Vleuten - 1999</p></li>
<li><p>Traditional Assessment: What, Where &amp; How</p><p>Student-Trainee Assessment</p><p>Content: maps on to the domain and curriculum to which the results generalize - basis of assessment</p><p>Where and who: within set programs where candidates are in the same cohort</p><p>Measurement:</p><p>Test or tool testing time is long enough to yield reliable results</p><p>Tests are comparable from administration to administration</p><p>Controlled environment - not complex</p><p>Can attribute differences to the candidate and rule out exam-based or error attribution</p><p>Adequate numbers per cohort</p><p>Traditional Tests/Tools at School</p><p>Does content map to domain?</p><p>Test length = reliable?</p><p>Attributable to candidate?</p><p>Are tests comparable?</p><p>Ideal test or all these qualities!</p></li>
<li><p>Principle</p><p>It is all about the context and purpose of your course, then the intended use of the test score - or the program!</p><p>There is no test for all seasons or for all reasons.</p></li>
<li><p>Written Tests: Designing Items</p><p>Key Concepts</p></li>
<li><p>Principle</p><p>The case prompts or item stems must create low-level simulations in the candidate's mind about 
the performance situations that are about to be assessed.</p></li>
<li><p>Classifying Constructed Formats</p><p>Cronbach (1984): defined constructed response formats as a broad class of item formats where the response is generated by the examinee rather than selected from a list of options.</p><p>Haladyna (1997): constructed response formats</p><p>High inference format: requires expert judgment about a trait being observed</p><p>Low inference format: observes the behaviour of interest directly: short answer; checklists</p></li>
<li><p>Types of CR Formats*</p><p>Low Inference</p><p>Work sampling</p><p>Done in real time</p><p>In-training evaluations</p><p>Provide rating later</p><p>Mini-CEX</p><p>Short answer</p><p>Clinical orals: structured</p><p>Essays (with score key)</p><p>Key features (no menus)</p><p>OSCEs at early UG level</p><p>High Inference</p><p>Work 360s</p><p>OSCEs at grad level</p><p>Orals (not old vivas)</p><p>Complex simulations</p><p>Teams</p><p>Interventions</p><p>Case-based discussions</p><p>Portfolios</p><p>Demonstration of procedures</p><p>*Principle - All CR formats need lots of development planning: you can't show up and wing it!</p></li>
<li><p>What Do CRs Offer &amp; What Must One Consider for Good CRs</p><p>The CR format can provide:</p><p>Opportunity for candidates to generate/create a response</p><p>Opportunity to move beyond MCQs</p><p>Response is evaluated by comparing it to pre-developed criteria</p><p>Evaluation criteria have a range of values that are acceptable to the faculty of the course or testing body</p><p>CRs: other considerations</p><p>Writers/authors need training</p><p>Need CR development process</p><p>Need topic selection plan or blueprint</p><p>Need guidelines</p><p>Need scoring rubric and analysis reporting</p><p>Need content review process</p><p>Need test assembly process</p><p>May encounter technical issues</p></li>
<li><p>Moving to Clinical Assessment</p><p>Think of it as work assessment!</p><p>Point: validity of scoring is key because the scores are being used to judge clinical competence in certain domains!</p></li>
<li><p>Clinical Assessment Issues</p><p>Context: Clinical Skills Work Assessment</p><p>Overview:</p><p>Validating test scores</p><p>Validating decisions</p><p>Examples:</p><p>Exit (final) OSCE</p><p>Mini-CEX</p><p>Conclusion</p><p>Presentation 
Grid</p><p>Clinical Skills</p><p>Mini-CEX</p><p>Validating Scoring</p><p>Validating Decisions</p></li>
<li><p>Key Pre-condition #1</p><p>What Is the Educational Goal?</p><p>And the level of resolution expected?</p><p>Have you defined the purpose or goal of the evaluation and the manner in which the result will be used?</p><p>Learning point:</p><p>Need to avoid Downing's threats to validity</p><p>Too few cases/items (under-representation)</p><p>Flawed cases/items (irrelevant variance)</p><p>If not, you are not ready to proceed!</p></li>
<li><p>Key Pre-condition #2: Be Clear About Due Process!</p><p>Ultimately, if this instrument is an exit exam or an assessment to be used for promotion, clarity about due process is crucial.</p><p>Samples: the student must know that he/she has the right to the last word; the board has followed acceptable standards of decision-making; etc.</p></li>
<li><p>Practically in 2008, validity implies ... that in the interpretation of a test score, a series of assertions, assumptions and arguments are considered that support that interpretation!</p><p>Validation is a pre-decision assessment - specifying how you will consider and interpret the results as evidence that will be used in final decision-making!</p><p>In simple terms: for student promotion, a series of conditional steps (cautions) are needed to document a legitimate assessment process.</p><p>Critical steps for a valid process leading to the ultimate decision, i.e. 
make a pass/fail decision or provide a standing</p></li>
<li><p>General Framework for Evaluating Assessment Methods - after Swanson</p><p>Evaluation: determining the quality of the performance observed on the test</p><p>Generalization: generalizing from performance on the test to other tests covering similar, but not identical, content</p><p>Extrapolation: inferring performance in actual practice from performance on the test</p><p>Evaluation, Generalization, and Extrapolation are like links in a chain: the chain is only as strong as the weakest link</p></li>
<li><p>Evaluation</p><p>Generalization</p><p>Extrapolation</p><p>Kane's Links in a Chain Defense - after Swanson</p><p>Includes: Scoring and Decision-making</p></li>
<li><p>Scoring: Deriving the Evidence</p><p>Content validity: Performance and work based tests</p><p>Enough items/cases?</p><p>Match to exam blueprint and ultimate uses</p><p>Exam versus work-related assessment point</p><p>Direct measures of observed attributes</p><p>Key: is it being scored by items or cases?</p><p>Observed score compared to target score</p><p>Item (case) matches the patient problem!</p><p>And the candidate's ability!</p></li>
<li><p>Preparing the Evidence</p><p>From results to evidence: three inferences</p><p>Evaluate performance - get score</p><p>Generalize that to target score</p><p>Translate target score into a verbal description</p><p>All three inferences must be valid</p><p>Process:</p><p>Staff role versus decision-makers' responsibilities/role</p><p>Flawed items/cases</p><p>Flag unusual or critical events for decision-makers</p><p>Prepare analyses</p><p>Comparison data</p></li>
<li><p>Validating the Scoring - Evidence</p><p>Validation carried out in two stages</p><p>Developmental stage: process is nurtured, refined</p><p>Appraisal stage: real thing - trial by fire!</p><p>Interpretive argument</p><p>Content validity: how do scores function in various required conditions?</p><p>Enough items/cases?</p><p>Eliminate flawed items/cases</p></li>
<li><p>Evaluation</p><p>Generalization</p><p>Extrapolation</p><p>Observation of Performance With Real Patients - if sees variety of patients</p></li>
<li><p>Evaluation</p><p>Generalization</p><p>Extrapolation</p><p>Objective Structured Clinical Examination (OSCE) - Dave 
Swanson</p></li>
<li><p>Stop and Re-consider</p><p>What were the educational goals?</p><p>AND</p><p>How will the decision be used?</p></li>
<li><p>The Decision-making Process</p><p>Standard setting: many methods</p><p>But keys are: ultimate success; fidelity; care with which decision is executed is crucial; must be documented</p><p>Helpful Hint: can also use standard setting for defining faculty expectations for content and use - in advance of the test!</p></li>
<li><p>The Decision-making Process</p><p>Generic steps: exam was conducted properly; results are psychometrically accurate and valid; establish pass-fail point; and consider each candidate's results</p><p>Red steps require an evaluating process that is:</p><p>Deliberate and reflective</p><p>Open discussion</p><p>Black steps: decision</p><p>All members of the decision-making board must be in, or else an escalation procedure needs to be established in advance!</p></li>
<li><p>Examples</p><p>OSCE</p><p>MCC meeting steps</p><p>Overview: how exam went</p><p>Review each station</p><p>Discussion</p><p>Decision: use all cases</p><p>Review results in toto</p><p>Decide on pass-fail point</p><p>Consider each person:</p><p>Decide pass-fail for specific challenging instances</p><p>Award standing or tentative decision</p><p>Comments</p><p>Work-based: mini-CEX</p><p>Six month rotation in PGY-1</p><p>Construction steps</p><p>Sampling grid?</p><p>Numbers needed</p><p>Score per case</p><p>Rating issues:</p><p>Global (preferred) vs. Check-list</p><p>Scale issues</p><p>Examiner strategy</p><p>Not same one</p><p>Number needed</p><p>Preparation</p><p>Awarding standing: Pass-fail or one of several parameters?</p><p>Comments</p></li>
<li><p>Appeals vs. Remarking!</p><p>Again pre-defined process</p><p>Tending to make a negative decision</p><p>Candidate's right to last word before final decision</p><p>Where does that take place? 
Must plan this!</p><p>Differentiate decision-making from rescoring</p><p>Requires independent ombudsperson</p><p>Other common issues</p></li>
<li><p>Delivering the News</p><p>Depends on the purpose and desired use</p><p>Context driven</p><p>In a high stakes situation at a specific faculty, may want a two-step process</p><p>Tending to negative decision: notion of the right of the candidate to the last word before a decision is made: has the right to provide evidence that addresses the board's concerns</p><p>Final decision</p><p>Comments/queries?</p></li>
<li><p>Key Lessons: Re-cap</p><p>Purpose and use of result</p><p>Overview of due process in promotion</p><p>Overview of validity - prefer Kane's approach</p><p>Scoring component of validity</p><p>Generalization and extrapolation</p><p>True score variance - and error variance</p><p>Interpretation/Decision-making components of validity</p><p>Know due process</p></li>
<li><p>Are you ready?</p><p>Are the faculty clear on the ultimate use and purpose of the test or exam?</p><p>How will you track the issues to be resolved?</p><p>Have you defined the major feasibility challenges at your institution and a plan?</p><p>Do you have a process to assure valid scoring and interpretation of the result?</p><p>Do you have support and back-up?</p></li>
<li><p>Summary and Questions</p><p>Thank You!</p></li>
<li><p>References</p><p>Clauser BE, Margolis MJ, Swanson DB. (2008). Issues of Validity and Reliability for Assessments in Medical Education. In Practical Guide to the Evaluation of Clinical Competence. Hawkins R, Holmboe ES, eds. Mosby.</p><p>Pangaro L, Holmboe ES. (2008). Evaluation Forms and Global Rating Forms. In Practical Guide to the Evaluation of Clinical Competence. Hawkins R, Holmboe ES, eds. Mosby.</p><p>Newble D, Dawson-Saunders B, Dauphinee WD, et al. (1994). Guidelines for Assessing Clinical Competence. Teaching and Learning in Medicine 6 (3): 213-220.</p><p>Kane MT. (1992). An Argument-Based Approach to Validity. Psychological Bulletin 112 (3): 527-535.</p><p>Downing S. (2003). Validity: on the meaningful interpretation of assessment data. Medical Education 37: 830-7.</p><p>Norcini J. 
(2003). Work based assessment. BMJ 326: 753-5.</p><p>Smee S. (2003). Skill based assessment. BMJ 326: 703-6.</p></li></ul>