P3-030: Consensus diagnosis: Assessing the impact



of dementia. In younger persons, CVD clustered with depression, whereas in the older persons with dementia, suggesting that cerebrovascular pathology may lead to depressive syndromes in relatively young elderly, and to dementia in very old persons.

P3-029 THE USEFULNESS OF AN AUTOBIOGRAPHICAL MEMORY TASK IN ASSESSING DEMENTIA SEVERITY

Denise M. Maue Dreyfus, Cathy M. Roe, John C. Morris, Washington University, St. Louis, MO, USA. Contact e-mail: [email protected]

Background: Deficits in episodic memory are useful diagnostic indicators in assessing the presence of Alzheimer’s disease and are among the most sensitive to the earliest signs of the disease. Episodic memory is typically assessed clinically by using standardized memory tasks, such as recall of a story or word lists presented during evaluation. An additional method used at the Washington University Alzheimer Disease Research Center involves querying participants about the details of recent personal events. This project tested whether the outcomes of the autobiographical memory query or the brief episodic memory tasks described below are more highly correlated with the clinicians’ final dementia severity rating. Methods: In the autobiographical memory task, participants (N = 852) aged 60+ years with no, very mild, or mild dementia were asked to recall two personal events, one occurring over the past week and one over the past month, that had been related to the clinician by an informant. The clinician scored the recounting of each event as largely correct, partially correct, or largely incorrect, and scores for the two events were combined. Brief measures of episodic memory examined were the number of errors on recall of the John Brown phrase from the Short Blessed Test (SBT), and the number incorrect on the MMSE 3-item recall. Spearman rank-order correlations were used to examine associations between scores on the memory tasks and dementia severity, as reflected in the Clinical Dementia Rating (CDR) Sum of Boxes. Results: Scores on the autobiographical recall task were more highly correlated with CDR Sum of Boxes than were scores on the John Brown phrase (p < .0001) or MMSE 3-item recall (p < .0001). Correlation coefficients (95% CI) with the CDR Sum of Boxes score were .76 (.74 to .79) for the autobiographical recall scores, .68 (.64 to .71) for the SBT John Brown phrase, and .65 (.61 to .69) for MMSE 3-item recall. The correlation coefficients of the two brief episodic memory tasks with Sum of Boxes did not differ from each other (p = .1822). Conclusions: Clinicians’ CDR ratings agree more closely with the results of the autobiographical memory task than with those of the brief episodic memory tasks.
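The Spearman rank-order correlation named in the Methods can be illustrated with a short sketch. This is not the authors' analysis code: the function names, the 0 to 4 error scale for the combined recall score, and the data values are all hypothetical, chosen only to show how the statistic is computed (rank both variables, averaging tied ranks, then take the Pearson correlation of the ranks).

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def spearman_rho(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for eight participants (NOT data from the study).
recall_errors = [0, 0, 1, 1, 2, 3, 4, 4]                      # assumed 0-4 scale
cdr_sum_of_boxes = [0.0, 0.5, 0.5, 1.0, 2.0, 3.5, 4.0, 6.0]

rho = spearman_rho(recall_errors, cdr_sum_of_boxes)
print(f"Spearman rho = {rho:.2f}")
```

Because both the recall scores and the CDR Sum of Boxes are ordinal with many ties, a rank-based coefficient with tie handling is a natural fit; in practice a statistics package would normally be used instead of a hand-written function.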

P3-030 CONSENSUS DIAGNOSIS: ASSESSING THE IMPACT

John R. McCarten, Laura S. Hemmy, Susan E. McPherson, Howard A. Fink, Susan J. Rottunda, Maurice W. Dysken, VA Medical Center, Minneapolis, MN, USA. Contact e-mail: [email protected]

Background: Accurate diagnosis is the cornerstone of clinical care and research. For diseases lacking a definitive biomarker, expert opinion is the gold standard. Diagnostic agreement among several experts--consensus diagnosis--increases accuracy and reliability. Objective: We assessed the value of a rigorously applied consensus diagnosis process. Methods: Patients presenting to the Minneapolis VAMC GRECC Memory Loss Clinic are assigned a diagnosis on three occasions: DX1, the physician’s diagnosis at the clinic visit (initial impression); DX2, the physician’s diagnosis after considering information gathered subsequent to the clinic visit (preconsensus diagnosis); and DX3, the consensus diagnosis following formal presentation and discussion with voting team members (neurologist, geropsychiatrist, internist, and neuropsychologist). Diagnosis is in two stages: (1) Dementia vs. cognitive impairment (CI)/not demented vs. no CI; (2) Primary and secondary etiologies of CI. Definitions for dementia, MCI, and specific etiologies were operationalized. Results: Complete data were available on 305 unique patients. DX1:DX2 agreement was 86% (180/209) for dementia, 83% (70/84) for CI/not demented, and 42% (5/12) for no CI. DX2:DX3 agreement was 96% (203/211) for dementia, 89% (75/84) for CI/not demented, and 80% (8/10) for no CI. DX1:DX3 agreement was 84% (177/211) for dementia, 81% (68/84) for CI/not demented, and 60% (6/10) for no CI. Agreement with consensus probable AD was 78% (113/145) and 98% (142/145) for DX1 and DX2, respectively. Agreement with consensus possible AD was 30% (10/33) and 55% (18/33) for DX1 and DX2, respectively. Agreement with consensus probable MCI was 60% (21/35) and 83% (29/35) for DX1 and DX2, respectively. Agreement with consensus possible MCI was 0% (0/9) and 56% (5/9) for DX1 and DX2, respectively. Conclusions: The consensus diagnosis process produced significant changes in diagnoses even among experienced clinicians. Agreement between preconsensus and consensus diagnoses was high for probable AD (98%), but less so for probable MCI (83%), possible AD (55%), and possible MCI (56%). Agreement between the initial impression and consensus diagnoses was lower, including probable AD (78%), probable MCI (60%), possible AD (30%), and possible MCI (0%). This work highlights the importance of a rigorous process of consensus diagnosis for identifying patients for clinical research.
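The agreement percentages in the Results follow directly from the reported counts. The sketch below recomputes them as a check; the counts are taken from the abstract, while the dictionary layout and the `percent` helper are hypothetical names introduced only for illustration.

```python
# (agreeing, total) counts as reported in the Results, keyed by consensus (DX3) category.
agreement_with_consensus = {
    "probable AD":  {"DX1": (113, 145), "DX2": (142, 145)},
    "possible AD":  {"DX1": (10, 33),   "DX2": (18, 33)},
    "probable MCI": {"DX1": (21, 35),   "DX2": (29, 35)},
    "possible MCI": {"DX1": (0, 9),     "DX2": (5, 9)},
}

def percent(agree, total):
    """Percent agreement, rounded to the nearest whole percent."""
    return round(100 * agree / total)

for label, stages in agreement_with_consensus.items():
    dx1 = percent(*stages["DX1"])  # initial impression vs. consensus
    dx2 = percent(*stages["DX2"])  # preconsensus diagnosis vs. consensus
    print(f"{label}: DX1 {dx1}%, DX2 {dx2}%, change {dx2 - dx1:+d} points")
```

Laying the two stages side by side makes the pattern in the Conclusions easy to see: agreement with the consensus diagnosis rises from DX1 to DX2 in every category, and the "possible" categories remain the least stable.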

P3-031 ADMINISTRATION AND SCORING VARIABILITY AMONG ADAS-COG RATERS

David S. Miller¹, Donald J. Connor², John Bartko³, ¹United BioSource Corporation, Wayne, PA, USA; ²Cleo Roberts Center for Clinical Research, Sun City, AZ, USA; ³Independent Consultant, Newville, PA, USA. Contact e-mail: [email protected]

Background: The Alzheimer’s Disease Assessment Scale - Cognitive section (ADAS-Cog) is the most commonly used primary efficacy measure in clinical trials assessing treatments for Alzheimer’s disease. The reliability and consistency of its administration and scoring are influenced by differences in the training on the scale that raters receive as part of a clinical trial. To assess the degree of variability in rater training, raters were surveyed across geographic regions. Methods: Surveyed raters were identified as having been previously trained, assessed, and certified on the ADAS-Cog in conjunction with one or more clinical trials. The survey was sent via fax or email to identified raters across countries. In addition to demographic information (including degree, number of years administering the scale, and number of different clinical protocols in which they administered the scale), 24 questions assessed both general and item-specific information (e.g., instructions for scoring Word Recall and administering Ideational Praxis, and scoring criteria for Constructional Praxis). For raters outside of North America, an additional six questions were included that assessed variability around translation issues. Results: A minimum of 48 raters per geographic region responded. Of those, 98% had administered the ADAS-Cog for at least one year, and 61% had administered the scale over 100 times. 58% had administered the scale in � 5 trials. 45% were doctoral level or above. Raters in each region noted considerable variation in their training and/or a decided lack of training depending on the particular scale item. The degree of reported variability in training ranged from 12% to 72% depending on the item. 79% of questions showed no statistically significant regional difference in the reported variability of training. Conclusions: The survey results demonstrate that an individual rater receives inconsistent training on how to administer and score the ADAS-Cog. These differences are present across regions. The impact of this variability on instrument reliability and ultimately on data integrity may be significant. Given the importance of this measure and its frequent use, standardizing training on the ADAS-Cog across trial protocols will limit variability in scale administration, enhance its sensitivity as an efficacy measure, and enable accurate comparison of outcomes between trials.

T525 Poster Presentations P3: