

The Laryngoscope
© 2012 The American Laryngological, Rhinological and Otological Society, Inc.

Validation of an Operating Room Immersive Microlaryngoscopy Simulator

Jason Fleming, MRCS, DOHNS; Karan Kapoor, MRCS, DOHNS; Nick Sevdalis, PhD; Meredydd Harries, FRCS

Objectives/Hypothesis: To assess the face and construct validity of two assessment tools for a microlaryngoscopy simulator: a Checklist Assessment for Microlaryngeal Surgery and a Global Rating Assessment for Microlaryngeal Surgery.

Study Design: Blinded experimental simulator-based study.

Methods: There were 15 candidates divided into a novice (≤50 procedures performed) or experienced (>50 procedures) group depending on their previous microlaryngoscopy experience. Each candidate undertook a 10-minute simulation of a microlaryngoscopy and excision biopsy, and two blinded experts rated their performance live on each of the two assessment tools. To assess face validity, each candidate subsequently completed a questionnaire about the simulator.

Results: The model demonstrated good face validity across all levels of experience. The global rating assessment demonstrated excellent interrater reliability (0.9) compared to the checklist assessment (0.7). The checklist assessment was able to differentiate experienced and novice candidates and therefore demonstrated construct validity. The global rating tool, however, was unable to differentiate candidates. There was a significant correlation between the two assessment tools (correlation coefficient = 0.624).

Conclusions: This study is the first reported study of a high-fidelity microlaryngoscopy simulator with task-specific rating tools. Use of these tools is recommended within otolaryngology training programs, with the global rating assessment for use as a frequently used feedback tool, and the checklist assessment as a confirmatory evaluation of competency at transitions of professional training.

Key Words: Simulation, validity, assessment, microlaryngoscopy.
Level of Evidence: 2b.

Laryngoscope, 122:1099–1103, 2012

INTRODUCTION

The turn of the 21st century has heralded a new dawn in UK postgraduate medical education and training. A perceived lack of clarity in postgraduate training, in Europe especially but also recently in the United States, and stipulations on working hours have fuelled change within the medical education landscape, with the result that reliance on experiential training is no longer feasible.1 This environment has without a doubt brought increased pressure for surgical educators to provide the media for trainees to gain increased exposure to procedures within an environment where competency to perform said procedures can be assessed, all without affecting patient safety.

The prime aim of simulators is to provide a high-fidelity learning environment for trainees, whereby the safety of patients is not compromised in the drive for clinical competency. Numerous studies have shown that simulation-based training can provide the perfect environment for safe training in highly specialized tasks.2 Issenberg et al.3 suggested that in the future, simulations would form the basis for technical skills training and assessment due to their ability to allow deliberate practice for trainees away from real patients in a nonthreatening educational environment. However, the recent 2009 Cochrane review of randomized controlled trials investigating the effectiveness of simulation-based interventions concluded that "research of higher methodological quality is needed."4 Introducing simulation into surgical assessment at all levels has been proposed, including for fully trained specialists as part of revalidation.

Otolaryngology is subject to the pressures of modern postgraduate skills training as much as any other surgical specialty. The low volume of suitable cases in a nontertiary care general hospital setting, in particular, affects experience in phonomicrosurgery. Any surgical errors or poor technique can manifest themselves in either the early or late postoperative course with phonatory or, even more seriously, airway complications.

From the ENT Department (J.F., K.K., M.H.), Royal Sussex County Hospital, Brighton, United Kingdom; and Division of Surgery (N.S.), Imperial College, London, United Kingdom.

Editor's Note: This manuscript was accepted for publication January 17, 2012.

Presented at the American Academy of Otolaryngology–Head and Neck Surgery Annual Meeting, San Francisco, California, U.S.A., September 11–14, 2011.

The authors have no funding, financial relationships, or conflicts of interest to disclose.

Send correspondence to Jason Fleming, MRCS, DOHNS, Specialty Registrar, ENT Department, Royal Sussex County Hospital, Eastern Road, Brighton, East Sussex, United Kingdom BN2 5BE. E-mail: [email protected]

DOI: 10.1002/lary.23240


No suitable high-fidelity simulators have previously existed for this procedure, and the operation itself lends itself poorly to well-controlled and supervised training. Thus, the need arises within the specialty to develop high-fidelity surgical simulators to fill the educational gaps that have appeared because of the organizational and logistical pressures on junior trainees. These simulators must be accessible, fit for purpose, and, ultimately, we must be able to demonstrate that the simulator and the tools used to assess a candidate's performance are validated.

However, simulators themselves are only part of the solution. To provide a robust and objective means of assessment, they need to be combined with a suitable tool to assess the performance of a trainee performing the simulated task. Checklists and global rating scores have the greatest number of studies in the medical literature and are the focus of this study. The specific definition of set criteria against which to reference performance was thought to be more objective, valid, and reliable, although the criticism persists that checklist assessments turn examiners into observers rather than expert interpreters of performance.5 The objective, structured, clinical examination was one of the first checklist-based assessments to gain widespread acceptance in the medical field and is now routinely used in both medical school and postgraduate examinations. The success of this assessment led a group in Toronto to develop a similar rating scale specific to technical skills assessment: the Objective Structured Assessment of Technical Skills (OSATS).6 This tool combines the benefits of both a task-specific checklist and a global rating scale of generic components anchored by behavioral descriptors. The aim of this form of assessment was therefore to reduce the subjectivity of the observer's experience. OSATS is an effective and reliable method of appraising surgical dexterity and has been demonstrated to be valid in inanimate bench, animal, cadaveric, and live operating simulation environments.6 Checklists and global rating scales, components of the tools used within the OSATS assessment, have been found to have construct validity and interrater reliability when used in the operating room environment.7 Task-specific checklists were designed to introduce some objectivity and reproducibility to technical skills assessment. Although they are highly content specific (assessing knowledge of the procedure), a number of studies have shown procedure-specific checklists to have only moderate validity in surgical skills assessment.8

We have developed a high-fidelity model of a human larynx that permits an accurate procedural simulation of the operation of microlaryngoscopy and excision of a benign lesion. The purpose of this single-center pilot study was to test the suitability of our new operating room immersive microlaryngoscopy simulator (ORIMS) for use as an assessment tool. There is no evidence in the literature of standardized face validity questionnaires in high-fidelity simulation. The senior author (M.H.) and another phonosurgery expert (K.K.) developed a Likert-scale questionnaire through a consensus opinion on the important features of the ideal simulator. For development of our novel assessment tools, both experts constructed a hierarchical task analysis of the components of microlaryngoscopy. This analysis was used to produce two new assessment tools: 1) the Global Rating Assessment for Microlaryngeal Surgery (GRAMS), which consists of a nine-parameter global rating (each parameter scored 1–5, maximum score 45; Supplementary Table I), and 2) the Checklist Assessment for Microlaryngeal Surgery (CAMS), consisting of a 28-point checklist (Supplementary Table II).
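For illustration only, the scoring rules of the two instruments can be sketched in a few lines of code. The parameter and checklist item wordings live in the supplementary tables and are not reproduced here, so numeric indices stand in for them; this is a minimal sketch of the arithmetic described above, not the published instruments.

```python
# Minimal sketch of the GRAMS and CAMS scoring rules as described in the text.
# Item labels are placeholders; only the counts and scales come from the paper.

GRAMS_PARAMETERS = 9   # nine global-rating parameters, each scored 1-5
CAMS_ITEMS = 28        # 28 checklist steps, each either completed or not

def grams_total(ratings: list[int]) -> int:
    """Sum a rater's nine 1-5 GRAMS ratings (maximum score 45)."""
    if len(ratings) != GRAMS_PARAMETERS:
        raise ValueError("GRAMS requires exactly nine parameter ratings")
    if not all(1 <= r <= 5 for r in ratings):
        raise ValueError("each GRAMS parameter is rated on a 1-5 scale")
    return sum(ratings)

def cams_total(steps_completed: list[bool]) -> int:
    """Count completed steps on the 28-point CAMS checklist."""
    if len(steps_completed) != CAMS_ITEMS:
        raise ValueError("CAMS requires exactly 28 checklist entries")
    return sum(steps_completed)

# Example: one rater's (invented) scores for a single candidate.
print(grams_total([3, 4, 2, 3, 5, 4, 3, 2, 4]))   # -> 30 (out of 45)
print(cams_total([True] * 20 + [False] * 8))      # -> 20 (out of 28)
```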

Our objectives for this study were: 1) Does the ORIMS demonstrate face validity, and is this affected by the level of experience of the participant? 2) Are the two scoring systems, CAMS and GRAMS, able to differentiate participants of differing levels of experience (construct validity)? 3) Are the scoring systems reliable?

MATERIALS AND METHODS

Ethical Considerations

This study was approved by the ethics board of the Brighton and Sussex University Hospitals National Health Service Trust.

Participants

All participants were recruited from attendees at the practical laryngology course run by the Ear, Nose, and Throat (ENT) Department at the Royal Sussex County Hospital, United Kingdom. A study information sheet and consent form were completed prior to participation in the study. The total number of recruited participants in this study was 15. Nine of the subjects were classed as novice (<50 unassisted microlaryngoscopies performed) and six as experienced (>50 performed). Two raters at the ENT consultant level with a subspecialty interest in laryngology were invited to take part.

Settings/Materials

The simulator was run within an operating room. The simulator makes use of a standard ENT operating microscope (Carl Zeiss Meditec, Jena, Germany) with a 400-mm lens connected to a screen monitor for live feed display. The equipment required for the task includes a laryngoscope with an adjustable stand for suspension, a standard set of microlaryngoscopy instruments, and the laryngeal model. This consists of a synthetic vocal cord model, an industry-produced high-quality replica constructed of copolymers, inserted into a modified advanced life support airway manikin.

Measures

To assess face validity, all participants completed a five-point Likert-scale questionnaire on their experience of the model immediately on completion of their simulator task. Two expert assessors, blinded to the identity of the candidate through the use of surgical hats and masks and to each other's marks, rated the performance of the candidate on the simulator using both the CAMS and GRAMS scoring systems.

Procedures

Following a 1-minute briefing, all assessments on the simulator ran for a maximum of 10 minutes, at which point the candidates were asked to stop. The participants were identified by number only. Rating was performed live in the operating room, and the technical performance of the candidate was assessed using both the microscope teaching arm and the live feed on the monitor.


After the test had begun, the participant was required to position the laryngoscope and suspend it correctly (Fig. 1). The microscope was then brought into correct position and focus. The training task provided by the model involved dissecting a polypoid lesion from the superficial layer of the vocal fold using a bimanual approach (Fig. 2).

Data Analysis

Statistical analysis was performed using SPSS version 17.0 for Windows software (SPSS Inc., Chicago, IL). Nonparametric tests were used for all analyses. Aggregate results for all analyses used the mean score for each candidate from the two examiners. Data from the two groups of varying experience were compared using the Mann-Whitney U test. Significance was set at P ≤ .05. Reliability was assessed as a measure of the internal consistency of the system; for that purpose, metrics that were common between all tasks were analyzed with the Cronbach α test. The Spearman rank correlation coefficient was calculated to compare the conformity between the two scoring systems.
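For readers without SPSS, the between-group comparison can be reproduced with open tooling. The sketch below shows a two-sided Mann-Whitney U test in SciPy on examiner-averaged scores; the score lists are invented stand-ins for illustration, not the study data.

```python
# Sketch of the between-group Mann-Whitney U comparison using SciPy.
# The score lists below are hypothetical, not the study's actual data.
from scipy.stats import mannwhitneyu

novice_cams = [11.5, 14.0, 15.5, 16.0, 17.0, 17.5, 18.0, 19.5, 21.5]
experienced_cams = [16.5, 18.0, 21.0, 23.0, 24.5, 26.0]

# Two-sided test on the examiner-averaged CAMS scores for each group.
stat, p = mannwhitneyu(novice_cams, experienced_cams, alternative="two-sided")
print(f"U = {stat:.1f}, P = {p:.3f}")  # significant if P <= .05
```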

RESULTS

There were 15 ENT participants (divided into novice [n = 9] and experienced [n = 6]). All participants were residing in the United Kingdom and were currently in an otolaryngology post, ranging from year 1 residents through to attending physicians. There was no significant difference between the ages of the two groups (P = .74). The novice group had a range of performed microlaryngoscopies of 0 to 40 procedures (median, 10; standard deviation [SD], 14.2), and the experienced group comprised a range of 60 to 1,000 procedures (median, 115; SD, 369.3).

Face Validity

Table I analyzes the responses of each defined group for the different questions. As can be seen, there were no significant differences in the impressions of the simulator between the experienced and novice groups, and responses were generally positive for the different questions.

Construct Validity

Table II demonstrates the descriptive statistics for the aggregated results of both examiners for both the novice and experienced groups. Statistical analysis demonstrates that the average marks of the novice group compared to the experienced group for CAMS are significantly different (P = .05), but this does not apply for the GRAMS assessment (P = .066).

Fig. 1. Operating room immersive microlaryngoscopy simulator setup. The airway manikin and suspended microlaryngeal tube are in place.

Fig. 2. Microscopic view of performing cold steel dissection of a polypoid lesion on the operating room immersive microlaryngoscopy simulator.

TABLE I. Face Validity Ratings by Group.

                              Total Mean   Experienced Group   Novice Group    P Value*
                                           Mean      SD        Mean      SD
Realism of model              4.00         4.00      0.00      4.00      0.50   1.00
Fair assessment of skills     4.07         4.00      0.00      4.11      0.60   .61
Stressful                     3.27         3.33      1.03      3.22      0.83   .70
Artificial                    3.20         4.00      0.63      2.67      0.87   .14
Use as a training tool        4.07         3.83      0.41      4.22      0.44   .11
Use as an assessment tool     4.13         4.00      0.63      4.22      0.67   .50

Ratings range from 1 (strongly disagree) to 5 (strongly agree).
*Mann-Whitney U test.
SD = standard deviation.


To assess the interrater reliability of the two scoring systems, the reliability coefficient, or Cronbach α, was first calculated for each of the scoring systems. The CAMS Cronbach α was calculated as 0.7 and the GRAMS as 0.9. The Spearman rank correlation between the two scoring systems (taking the mean score of the two raters) is significant (P = .013), with a correlation coefficient of 0.624, demonstrating a strong correlation between the two scoring systems (Fig. 3).
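These two computations can also be reproduced outside SPSS. SciPy has no built-in Cronbach α, so the sketch below implements the standard internal-consistency formula over the two raters' columns; all score arrays are hypothetical stand-ins, and the function and variable names are our own.

```python
# Sketch of the interrater reliability (Cronbach's alpha) and tool-correlation
# (Spearman) steps. All numbers are invented; only the method is from the text.
import numpy as np
from scipy.stats import spearmanr

def cronbach_alpha(scores: np.ndarray) -> float:
    """scores: rows = candidates, columns = raters (here, the two examiners)."""
    k = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1).sum()
    total_variance = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Hypothetical paired ratings (rater A, rater B) for each candidate.
cams = np.array([[12, 11], [15, 16], [17, 18], [20, 19], [22, 24], [25, 23]])
grams = np.array([[20, 21], [26, 27], [24, 25], [30, 29], [38, 40], [43, 44]])

print(f"CAMS alpha  = {cronbach_alpha(cams):.2f}")
print(f"GRAMS alpha = {cronbach_alpha(grams):.2f}")

# Conformity between the two tools, using each candidate's rater-mean score.
rho, p = spearmanr(cams.mean(axis=1), grams.mean(axis=1))
print(f"Spearman rho = {rho:.3f}, P = {p:.3f}")
```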

DISCUSSION

Key Findings

The results of the face validity part of the study give a quantitative snapshot of the participants' views of the simulator. The fact that no significant difference was shown between novice and experienced operators demonstrates that the model has high fidelity across the experience levels, and this can only be beneficial for the consideration of simulator use in a variety of guises for different levels of trainees. In addition, overall opinion about the ORIMS is favorable, with the simulator's realism, fairness, and use for both assessment and training scoring above 3.5 in the questionnaire.

Reliability and validity have been inconsistently defined and measured within the surgical literature.9 It is generally accepted that the Cronbach α coefficient should be above 0.7 for a tool to be of acceptable reliability, although some publications recommend values above 0.8 as the threshold for an acceptable test.10 Our results give a Cronbach α of 0.7 for the CAMS and 0.9 for the GRAMS. Thus, although the results of this pilot study demonstrate a significant correlation between the two scoring tools, in terms of reliability GRAMS appears to be the more reliable tool and meets the criteria for use in high-stakes assessment.

When assessing the ability of the tools to distinguish candidates of differing levels of experience (construct validity), a somewhat different result was obtained. Based on the P values obtained, only the CAMS assessment tool demonstrated a statistically significant ability to distinguish between the two groups. This differs from previous research in this area: this pilot study found that the GRAMS, our modified global rating scale, did not demonstrate similar construct validity.

Comparisons With Other Studies

Ongoing studies have demonstrated the construct validity of procedural checklist scales and their concurrent validity with global rating scales.11 OSATS has been used extensively in North America for the assessment of surgical residents in a clinical skills laboratory setting, where it has been shown to have construct validity and interrater reliability.12 Our results agree with a number of previous studies regarding the reliability of global rating scales, although the greater construct validity of the checklist assessment compared to our global rating scale differs from findings observed in a variety of specialty assessments. This may demonstrate that the checklist assessment more closely correlates with the technical steps involved in a microlaryngoscopy procedure. However, we believe the global rating scale has significant advantages when formalizing feedback to trainees. Other advantages have also been demonstrated in the literature. Regehr et al.8 showed that a global rating scale demonstrated better interobserver reliability and discrimination between surgeons of differing skill (construct validity).

TABLE II. Aggregate Scores and Demonstration of Construct Validity.

CAMS
              n   Median   IQR (25th–75th)   Range (Min–Max)   P Value, N vs. E*
Novice        9   17.00    15.88–18.50       11.50–21.50       .05
Experienced   6   22.00    17.63–24.13       16.50–26.00

GRAMS
              n   Median   IQR (25th–75th)   Range (Min–Max)   P Value, N vs. E*
Novice        9   26.00    21.25–26.75       16.50–29.50       .07
Experienced   6   35.75    24.50–42.00       20.00–45.00

*Mann-Whitney U test.
CAMS = Checklist Assessment for Microlaryngeal Surgery; GRAMS = Global Rating Assessment for Microlaryngeal Surgery; IQR = interquartile range; Min = minimum; Max = maximum; N = novice; E = experienced.

Fig. 3. Correlation graph comparing the two assessment tools. CAMS = Checklist Assessment for Microlaryngeal Surgery; GRAMS = Global Rating Assessment for Microlaryngeal Surgery.


The Imperial College group demonstrated high interobserver reliability and construct validity for a procedure-specific rating scale.13 Therefore, although checklist assessments may still have a role in training, procedure-specific scales can provide a greater degree of formative feedback to the trainee than a generic scale through the identification of areas of weakness.

Currently, only one laryngeal model study has been published in the medical literature: the laryngeal dissection model (LDM) developed at the Emory Voice Center within the Department of Otolaryngology.14 This model consists of an anatomically accurate model for laryngeal surgery, but the vocal folds are made of double-layered tissue paper on double-sided tape, thus limiting the degree of fidelity that this simulator can achieve. Indeed, the authors themselves acknowledge the need for a more "lifelike" synthetic vocal fold to improve the face validity of the model. However, the low cost of the LDM simulator is to be applauded, and it therefore should have a place as a modern teaching aid, evidenced by the improvement in novices' scores after training.

However, although there is abundant literature on the validation of checklists, global rating scales, and OSATS, much of it suffers from criticism of methodology. A comprehensive qualitative review by Feldman et al.15 highlighted reliability and validity studies involving competency assessment of laparoscopic surgical skills and concluded that a "lack of standardisation in tasks, metrics, and level of validation" was due to "significant design flaws" in the reviewed studies.

Limitations

Reviews of validation studies comment that many studies are underpowered. We acknowledge that the number of candidates in this study is not large enough to support decisions on absolute changes in educational/assessment policy; this reflects the pilot nature of the study and should act as a prompt for larger studies with the simulator, and ultimately further, larger multicenter studies. However, the use of operating room staff and the room itself is a significant expense and limits the availability of this model. One of the main drawbacks of this technique, or of any observer-based method, is the extensive commitment, both human and time, needed to carry out such an appraisal, which is compounded by using multiple examiners. Retrospective analysis of video-recorded performance could be trialed in future studies to address these problems, both by reducing the number of observers needed and by more accurately blinding the candidates' identity. The educational literature stresses the importance of accurately defining experience levels for the purpose of validation studies. With ranges from >50 procedures performed to >1,000 being used to define experts, a definition has clearly not been established in the literature. We elected to use the definitions of novice (≤50 procedures performed) and experienced (>50 procedures performed) to try to reduce the controversy of an arbitrary definition. As demonstrated by the range of procedure numbers in our two groups, candidates of similar experience close to the 50-procedure cutoff were nevertheless divided between the tested groups. Using two more contrasting groups, or further work on the definition of procedural experts for future validation studies, would help to define the construct validity of the assessment tools more clearly.

CONCLUSION

Simulation technology has been advocated as a far safer method for trainees to learn in high-fidelity scenarios, without exposing patients to any clinical risk. This study has shown our laryngeal simulator to be of high fidelity as used within an immersive operating room environment. The assessment tools of CAMS and GRAMS as used in our department have shown strong correlation. However, the agreement between them belies their distinct uses in skills assessment. We would advocate consideration of the use of GRAMS as a reliable, frequently used feedback tool, and of CAMS as a confirmatory evaluation of competency at transitions of professional training. To our knowledge, this is the first study that addresses assessment of microlaryngoscopy skills in an immersive environment. This tool may require further research and refinement, but it already has the potential to help assess trainees' skills on a much wider scale. The ultimate aim is for development of these simulators to reach a stage where their use in a structured training program, high-stakes assessment, and revalidation becomes seamless and commonplace.

BIBLIOGRAPHY

1. Chikwe J, De Souza AC, Pepper JR. No time to train surgeons. Br Med J 2004;328:418–419.

2. Singh H, Thomas EJ, Petersen LA, et al. Medical errors involving trainees: a study of closed malpractice claims from 5 insurers. Arch Intern Med 2007;167:2030–2036.

3. Issenberg SB, McGaghie WC, Hart IR, et al. Simulation technology for health care professional skills training and assessment. JAMA 1999;282:861–866.

4. Gurusamy KS, Aggarwal R, Palanivelu L, et al. Virtual reality training for surgical trainees in laparoscopic surgery. Cochrane Database Syst Rev 2009;(1):CD006575.

5. Moorthy K, Munz Y, Sarker SK, Darzi A. Objective assessment of technical skills in surgery. BMJ 2003;327:1032–1037.

6. Faulkner H, Regehr G, Martin J, et al. Validation of an objective structured assessment of technical skill for surgical residents. Acad Med 1996;71:1363–1365.

7. Winckel CP, Reznick RK, Cohen R, et al. Reliability and construct validity of a structured technical skills assessment form. Am J Surg 1994;167:423–427.

8. Regehr G, Macrae H, Reznick RK, et al. Comparing the psychometric properties of checklists and global rating scales for assessing performance on an OSCE-format examination. Acad Med 1998;73:993–997.

9. Van Nortwick SS, Lendvay TS, Jensen AR, et al. Methodologies for establishing validity in surgical simulation studies. Surgery 2010;147:622–630.

10. Gallagher AG, Ritter EM, Satava RM. Fundamental principles of validation, and reliability: rigorous science for the assessment of surgical education and training. Surg Endosc 2003;17:1525–1529.

11. Taffinder N, Sutton C, Fishwick RJ, et al. Validation of virtual reality to teach and assess psychomotor skills in laparoscopic surgery: results from randomised controlled studies using the MIST VR laparoscopic simulator. Stud Health Technol Inform 1998;50:124–130.

12. Reznick R, Regehr G, Macrae H, et al. Testing technical skill via an innovative "bench station" examination. Am J Surg 1997;173:226–230.

13. Pandey V, Wolfe J, Moorthy K, et al. Procedural rating scales increase objectivity in surgical assessment. Br J Surg 2003;90(suppl 1):14–15.

14. Contag SP, Klein AM, Blount AC, et al. Validation of a laryngeal dissection module for phonomicrosurgical training. Laryngoscope 2009;119:211–215.

15. Feldman LS, Sherman V, Fried GM. Using simulators to assess laparoscopic competence: ready for widespread use? Surgery 2004;135:28–42.
