rapid critical appraisal of diagnostic accuracy studies professor paul glasziou centre for evidence...

Rapid Critical Appraisal of diagnostic accuracy studies

Professor Paul Glasziou

Centre for Evidence Based Medicine

University of Oxford

www.cebm.net

What are tests used for?

Log of reasons by several docs:

• Diagnosis – what is the problem?

• Monitoring – has it changed?

• Prognosis – risk/stage within Dx

• Treatment planning, e.g., location

• Stalling for time!

Is the test accurate?

To be accurate a test should be:

• Reproducible

o We get the same (wrong?) answer every time

o P I-I question

• Valid

o We get the right answer

o P I O question

Reproducibility: Agreement of histopathologists

Organ      Feature                 Agreement    Kappa          Reference
Rectal     Cancer grading          50% to 69%   0.11 to 0.5    Thomas
Hodgkins   Classification          56%          0.44           Holman
Melanoma   Depth                   82%; 64%     0.68; 0.23     Breslow; Clark
Breast     Cancer classification   73%          0.46           Stenkvist

Ken Fleming, Evidence-based pathology. EBM 1997
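The kappa column corrects raw agreement for the agreement two readers would reach by chance. A minimal sketch of Cohen's kappa for two raters follows; the counts are hypothetical and are not taken from any of the studies cited above.

```python
def cohen_kappa(table):
    """Return (observed agreement, kappa) for a square agreement table.

    table[i][j] = number of cases rater A put in category i
    and rater B put in category j.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance if the two raters were independent
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    return observed, (observed - expected) / (1 - expected)


# Hypothetical example: two pathologists grade 100 slides as high/low grade
hypothetical_table = [[40, 10],
                      [10, 40]]
observed, kappa = cohen_kappa(hypothetical_table)
print(f"agreement = {observed:.2f}, kappa = {kappa:.2f}")
```

In this made-up example the readers agree on 80% of slides, but half of that agreement is expected by chance, so kappa is noticeably lower than the raw agreement.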

Diagnostic Test Accuracy measurements

Sensitivity is the probability of a positive test in a diseased person

Specificity is the probability of a negative test in a non-diseased person.
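Both definitions can be read straight off a 2x2 table of index test result against reference standard. A minimal sketch, assuming the usual cell names (true/false positives and negatives); the counts are hypothetical.

```python
def sensitivity_specificity(tp, fp, fn, tn):
    """Sensitivity and specificity from 2x2 counts.

    tp: test positive, diseased       fp: test positive, non-diseased
    fn: test negative, diseased       tn: test negative, non-diseased
    """
    sensitivity = tp / (tp + fn)  # P(test positive | diseased)
    specificity = tn / (tn + fp)  # P(test negative | non-diseased)
    return sensitivity, specificity


# Hypothetical study: 100 diseased and 100 non-diseased people
sens, spec = sensitivity_specificity(tp=90, fp=20, fn=10, tn=80)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```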

Is the test helpful (valid)? The Youden Index

Youden Index = sensitivity + specificity - 1

• For a test to be useful, then

o sensitivity + specificity > 1 (Youden Index > 0)

Examples:

• Coin Toss with +ve = "heads"

sensitivity = 0.5 specificity = 0.5

• Youden = 0
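The coin-toss example can be checked in a few lines. This is only an illustrative sketch; the function names are made up for the example.

```python
def youden_index(sensitivity, specificity):
    """Youden Index = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1


def is_useful(sensitivity, specificity):
    """A test carries diagnostic information only if the Youden Index > 0."""
    return youden_index(sensitivity, specificity) > 0


# Coin toss with +ve = "heads": no better than chance
coin_toss = youden_index(0.5, 0.5)
print(coin_toss, is_useful(0.5, 0.5))
```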

Can a test rule-in or rule-out?

SpPin

• Specific test, Positive rules In

e.g., Rovsing's sign, ST elevation > 2mm

SnNout

• Sensitive test, Negative rules Out

e.g., Erect abdominal film for obstruction, Elevated WCC in CSF (>5/mm³)

Can I trust the accuracy data from the study?

RAMMbo

Recruitment: Was an appropriate spectrum of patients included?

• (Spectrum Bias)

Maintenance: All patients subjected to a Gold Standard?

• (Verification Bias)

Measurements: Was there an independent, blind or objective comparison with a Gold Standard?

• Observer Bias; Differential Reference Bias

Index test (Comparison)

Outcomes (Reference Standard)

Presenting Problem





Presenting Problem

Appraisal of Tests: RAMMbo – was the evaluation fair?

Outcome Measures (Gold Standard)

Outcome Measures (Gold Standard)

Index test

Comparator Test

Maintained?

Representation?

Population

Blinded or Objective?

QUADAS

The Literary Digest Poll: Landon versus Roosevelt, 1936

% for Roosevelt

Literary Digest: 2.4 million reader poll

• Prediction for Roosevelt: 43%

Gallup's 50,000 random sample

• Prediction of the election result: 56%

Gallup's 3,000 Digest readers

• Prediction of Digest prediction: 44%

Election result: 62%

Good sampling: needs a sample frame & unbiased selection

Target Population

Sample Frame

Actual Sample

Complete data

Were reference Measurements blinded or objective?

Index +ve

Index -ve

High Threshold

Low Threshold

Apparent difference

Use standardised measurement strategy across ALL patients

Smith H, et al. BMJ, 2000

BNP screen of GP elderly patients

UK GP setting 155 patients

• 70-84 yrs old

Echocardiogram

• 12 with CCF

Sens = 92%, Spec = 65%

Sens = 50%, Spec = 90%

http://www.bmj.com/content/vol320/issue7239/images/large/smih3149.f1r.jpeg

Assessment process systematic reviews

Assessment process original studies

Potentially Eligible Systematic Reviews 1999-2002, N=191

157 systematic reviews excluded

34 systematic reviews with 39 meta-analyses containing 678 original studies

6 systematic reviews with 8 meta-analyses excluded

28 systematic reviews with 31 meta-analyses containing 487 original studies

58 original studies excluded

Replication & Extension Study

AWS Rutjes et al. 2005

Other case-control designs

Study characteristics

No description population

No description reference

No description index

Retrospective data collection

Non-consecutive

Non blinded studies

Partial verification

Differential verification

Severe cases and healthy controls

487 studies; 31 meta-analyses (figure axis: RDOR, 0 to 5)

How well are diagnostic studies reported?

112 studies in 4 major journals (1978-1993)

Standard                       N (%)
Spectrum composition           30 (27)
Avoidance of workup bias       51 (46)
Avoidance of review bias       43 (38)
Test accuracy precision        12 (11)
Indeterminate test results     26 (23)
Test reproducibility           26 (23)
Accuracy in subgroups          9 (8)

Reid MC, Lachs MS, Feinstein AR. Use of methodological standards in diagnostic test research. JAMA 1995;274:645-651

Using Evidence about Tests

Appraise the study - PICO

Always ask

• Is it useful at all? (Youden Index > 0)

Usually ask

• Can it “rule in” or “rule out” a disease?

Often ask

• What is the post-test probability in the same situation as the study?

Rarely ask

• Calculation of post-test probabilities in a different situation
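The last two questions can be sketched with the odds form of Bayes' theorem. Likelihood ratios are not used in these slides, so treat them here as a standard technique brought in for illustration. The figures are the BNP numbers quoted earlier (155 patients, 12 with CCF, Sens = 92%, Spec = 65%), and the 30% pre-test probability for the "different situation" is purely hypothetical.

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios of a test."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Same situation as the study: pre-test probability = study prevalence
pre_test = 12 / 155
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.92, specificity=0.65)
same_pos = post_test_probability(pre_test, lr_pos)
same_neg = post_test_probability(pre_test, lr_neg)

# Different situation: hypothetical clinic with a 30% pre-test probability
other_pos = post_test_probability(0.30, lr_pos)

print(f"study setting: {same_pos:.0%} after +ve, {same_neg:.0%} after -ve")
print(f"hypothetical 30% setting: {other_pos:.0%} after +ve")
```

Under these assumptions, a positive result in the study setting still leaves heart failure unlikely (roughly 18%), while a negative result pushes it down to about 1%. That is the SnNout pattern: a sensitive test whose negative result helps rule out.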
