TRANSCRIPT
Mark E. Nunnally, MD, FCCM
Co-Director, Critical Care Fellowship and Associate Professor in the Department of Anesthesia and Critical Care
University of Chicago Medical Center
Chicago, Illinois
GRADE Methodology Expert
Contributing Author, “Surviving Sepsis Campaign: International Guidelines for Management of Sepsis and Septic Shock: 2012”
Making GRADE work: a how-to for guidelines authors
Mark E. Nunnally, MD, FCCM
Associate Professor
Department of Anesthesia & Critical Care
The University of Chicago
Course objectives I
• Translate evidence into graded recommendations.
• Identify the features that reduce or increase the quality of evidence.
Course objectives II
• Appraise clinical data to determine quality of evidence.
• Integrate quality of evidence for an intervention with costs, the balance between desirable and undesirable effects and values to determine the strength of a recommendation.
Contents
• GRADE- why?
• Transparency and Certainty
• The Guidelines process: a methodologist’s perspective
• GRADE- components
• Summary
Conflict of interest.
I am a GRADE advisor for the Surviving Sepsis Campaign
Conflict of interest.
I am also only a consultant. YOU are the experts.
WHY GRADE?
Many guidelines, little standardization
Some inform…
Some restrict…
All claim to be evidence-based…
…how can we be certain a guideline is supported by the evidence?
…how can we be certain its recommendations will hold over time?
…how relevant is the recommendation to the things that matter to me?
Should we rate evidence?
• ‘Quality’ is a diluted term
• Quality is a continuum
• Decisions are always somewhat arbitrary
• ‘Experts’ and clinicians don’t always share the same view
– This is one reason evidence and recommendations should be separate.
Should we rate evidence?
• You need some reference
• Simplicity
• Transparency
• Vividness
Grading of Recommendations Assessment, Development and Evaluation
• www.gradeworkinggroup.org
• International consensus document
• Template for systematic reviews, recommendations
TRANSPARENCY AND CERTAINTY
QOE- Philosophical Bent
• We are going to make recommendations that we (or others) will subsequently change.
– GRADE lets us:
• try to define how likely that is
• communicate our certainty in any effect
• translate findings to clinical realities, by accounting for the costs, tradeoffs and effort behind following a recommendation
Example- Glycemic Control
• 2001: Van den Berghe publishes sentinel article: NEJM 2001, 345
• 2003-2008: Guidelines, protocols, quality metrics proposed
• 2009: NICE-SUGAR
• 2009-present: Re-write or retire
Be Explicit
• What are the data?
• What are their limitations?
• How easy is it to do something?
• How confident are you in recommending?
The guidelines process: a methodologist’s perspective
Getting from evidence to guidelines
Evidence Hierarchy
• Experience
• Reports
• Observational Studies
• RCTs
• Meta-analyses
Guidelines Hierarchy
• Clinical biases
• Experience-based tendencies
• Cost analyses
• Decision analyses
• Formal Guidelines
Not all guidelines are created equal
[Figure: GRADE process overview.
Formulate question (PICO) → select outcomes (Outcome 1-4) → rate importance of outcomes (critical, important, not important) → systematic review of evidence (outcomes across studies) → evidence profile (GRADEpro) with pooled estimate of effect for each outcome → quality of evidence for each outcome → rate overall quality of evidence across outcomes (High | Moderate | Low | Very low) → formulate recommendations.
Quality of evidence: RCTs start high, observational studies start low. Rate down for: 1. risk of bias, 2. inconsistency, 3. indirectness, 4. imprecision, 5. publication bias. Rate up for: 1. large effect, 2. dose-response, 3. antagonistic bias.
Recommendations: for or against an action, strong or weak (strength). Strong or weak depends on: quality of evidence, balance of benefits/downsides, values and preferences, resource use (cost). Wording: “We recommend…” | “Clinicians should…” (strong); “We suggest…” | “Clinicians might…” (weak). Recommendations should be unambiguous, have clear implications for action, and be transparent (values & preferences statement).]
[Figure: Question (PICO) → Evidence (Summarize) → Judge (QOE, SOR)]
THE QUESTION
PICO
• Population
– Ventilated patients, APACHE scores
• Intervention
– Medicine, therapy, education, systems intervention
• Comparison
– High (how high) versus low (how low) tidal volume
• Outcome
– FBI: mortality (at what follow-up), LOS, VAP
Rating outcomes
• 7-9: critical [death, disability or both]
• 4-6: important [skin breakdown, sepsis]
• 1-3: limited [ileus, ICU stay]
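The 1-9 scale above maps directly onto three categories. A minimal sketch of that mapping follows; the function name and the example panel ratings are hypothetical, chosen only to illustrate the cut-offs on the slide.

```python
def outcome_importance(score: int) -> str:
    """Map a 1-9 importance rating to the category used on the slide."""
    if not 1 <= score <= 9:
        raise ValueError("score must be between 1 and 9")
    if score >= 7:
        return "critical"
    if score >= 4:
        return "important"
    return "limited"


# Hypothetical panel ratings for one clinical question (illustration only).
ratings = {"mortality": 9, "sepsis": 5, "ICU stay": 3}
categories = {name: outcome_importance(score) for name, score in ratings.items()}
print(categories)
```

In practice a panel would agree on these ratings before looking at the evidence, so that the choice of critical outcomes is not driven by the results.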
THE EVIDENCE
Collect evidence
• Be thorough
– Use explicit search strategies
– Decide on published v unpublished data
• Consider gray literature in some cases
– Proceedings papers
– Abstracts
– Clinicaltrials.gov
– ALWAYS consider comparator
Assembling Evidence is Hard
Data have to be summarized to inform
GRADE pragmatic approach
• Get a good meta-analysis (MA)
• If no MA, identify main studies
• If possible, do your own MA
• If no MA, describe main studies/results
– Be explicit (inclusion/exclusion, flaws)
• Keep the link between recommendation and evidence
Meta Analysis-the Good and the Bad
Don’t GRADE everything
• No plausible alternative
– Surveying for infection, resuscitating shock, practicing quality improvement
• Recommend to consider
– As opposed to not considering?
• Statements lacking specificity
– Intervention, Comparison, relevant Outcomes (good and bad)
JUDGING
Judge Evidence and Recommendation
• Unique to GRADE
• Related, but distinct
• Recommendation must take clinical realities into account
– Costs
– Burdens
– Benefits/risks
– Values
Recommendations Have 2 Components:
• Strength
• Direction
GRADE COMPONENTS
Entering the GRADE meat-grinder
• RCT- High quality
• Observational study- Low quality
• Expert report- Very Low quality
Grade Down
Study Limitations/Risk of Bias
• Bias definition:
1. Unequal distribution of risk factors (confounders) across study groups.
2. Factors that systematically change study effects to result in a directional change in the signal.
Risk of Bias
• GRADE treats bias by individual outcomes
– Pain scores- strong effect if unblinded
– Mortality- effect of blinding less clear
– Loss to follow-up for different outcome windows
• With multiple studies and different risks of bias, quality should be judged by the relative contribution of studies to the confidence in the effect.
Risk of Bias
• Blinding
– Patient, clinician, data assessor
• Concealment of allocation
• Intention-to-treat principle
– Absence negates the balance from randomization
Risk of Bias
• Stopping Early for Benefit, especially if trials have < 500 events
– Bassler D, et al. JAMA, 2010;303(12):1180-7
• Selective outcome reporting
– Only positive outcomes, composite results only, or lack of pre-specified outcomes
• Loss to follow-up
– Significance relates to # of events
Risk of bias- Observational Studies
• Prognosis can differ
• Groups can have multiple differences:
– Time
– Place
– Population
– Co-morbidity
This is why observational studies typically enter as “Low” quality of evidence
Grade Down
Inconsistency
• Definition:
1. Heterogeneity.
2. Lack of similarity of point estimates or confidence intervals.
3. Variable findings unexplained by a priori hypotheses.
4. Subgroup effects that cannot be sufficiently explained.
Inconsistency
• Generally, effects are looked at in relative terms, rather than absolute
– Subgroups may have different baseline rates, but similar relative effects
Inconsistency
• Inconsistency can come from study diversity:
– Populations
– Interventions
– Outcomes
– Study methods
• Credible inconsistency may lead to split recommendations
Basic assessments of inconsistency
• Point estimates vary widely
• Little or no CI overlap
• Test of heterogeneity shows a low P value
– χ² (P ≤ 0.10 may be sufficient)
• I² is large
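The heterogeneity statistics listed above can be computed from per-study effect estimates and their standard errors. The following is a minimal sketch, assuming inverse-variance (fixed-effect) weights and effects on a log scale (for example log relative risks); the function name and the study values are hypothetical. It returns Cochran's Q and I² only; the P value for Q would come from a chi-square distribution with the returned degrees of freedom.

```python
def heterogeneity(effects, std_errors):
    """Return (Q, degrees of freedom, I-squared in percent).

    Uses inverse-variance weights: Q = sum(w_i * (y_i - pooled)^2),
    I^2 = max(0, (Q - df) / Q) * 100.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2


# Hypothetical log relative risks and standard errors (illustration only).
consistent = heterogeneity([-0.20, -0.20, -0.20], [0.10, 0.15, 0.20])
divergent = heterogeneity([-0.50, 0.50], [0.10, 0.10])
print(consistent, divergent)
```

A large I² by itself does not force a downgrade; as the next slide notes, the variability matters only if it would change a clinical decision.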
Context
• It is only significant inconsistency if the variability would influence a clinical decision
– If point estimates and CIs favor treatment over costs/burdens/side effects, no need to downgrade
Inconsistency
• Example: Low-dose steroids in sepsis:
– 6 studies, 3 high baseline mortality, 3 low, with difference in effect:
• Patel GP. Am J Respir Crit Care Med 2012;185:133-139
Grade Down
Indirectness
• Definition:
1. Evidence does not directly compare to the clinical question of interest.
2. Differing patients, interventions, comparisons or outcomes in available studies necessitate extrapolation of evidence to question being addressed.
Indirectness
• Examples:
– Animal studies: downgrade 1 or 2 levels, in general, but consider the relevance of the data (toxicity v therapeutic benefit)
– If drug A>B and B>C, is A>C?
– Low-fat diet: US versus French population
• Setting, co-“interventions,” genetics
– Surrogate outcomes: Blood pressure control versus cardiovascular events
Indirectness
• Example:
– H2RA and PPI: C. difficile infection: observational study not direct to critically ill patients, but with interesting effect: Very Low QOE
• Leonard J et al. Am J Gastroenterol 2007;102:2047
Grade Down
Imprecision
• Definition:
1. High impact of random error on evidence quality.
2. Wide range of results to be expected from repetitive study.
3. Wide range in which the truth likely lies.
Imprecision
• Driven by # of events and by degree of effect
• 95% confidence intervals may encompass harm and benefit
– Taken in the context of the recommendation
• More important: 95% CIs embrace absolute values that reduce our confidence in a recommendation
Use absolute effects
Toxicity
Imprecision
• Example:
– NE v Vasopressin: Mortality CI wide, spanned RR = 1.
• for ventricular arrhythmias, RR 0.47 (0.38, 0.58), but 21 events: FRAGILE
– H2RA and pneumonia: unable to exclude harm
– Negative factors may require tighter CIs:
• Side effects/toxicity
• Burdens/costs
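The point that imprecision is driven by the number of events can be illustrated with the usual large-sample confidence interval for a relative risk, computed on the log scale. This is a sketch under stated assumptions: the function name is hypothetical, and the event counts are invented for illustration and are not taken from any study cited in this talk.

```python
import math


def risk_ratio_ci(events_t, n_t, events_c, n_c, z=1.96):
    """Relative risk with an approximate 95% CI (log-scale Wald interval).

    Requires at least one event in each group (no zero-cell correction).
    """
    rr = (events_t / n_t) / (events_c / n_c)
    se_log = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Same event rates (7% v 14%), but 21 events in total versus 210 events.
few_events = risk_ratio_ci(7, 100, 14, 100)
many_events = risk_ratio_ci(70, 1000, 140, 1000)
print(few_events)   # interval spans RR = 1
print(many_events)  # interval excludes RR = 1
```

Both hypothetical trials give the same point estimate, but only the larger one excludes no effect, which is the sense in which a result resting on few events is fragile.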
Grade Down
Publication Bias
• Definition:
1. Studies with statistically significant results more likely to be counted than negative studies.
2. Smaller, high-effect studies disproportionately impact published literature.
3. Published commercially-funded studies are more likely to be positive.
Publication Bias
• Publication: + Studies > – Studies (RR 1.78)
– Hopewell S, The Cochrane Database of Systematic Reviews, 2007.
• – Studies: delayed, obscure publication
• + studies: duplicate publication
• Small studies, industry sponsor ⇒ ↑publication bias
Publication Bias
• How to detect? It’s more difficult than one might think.
– Look for:
• Small trials
• Conflicts in authors/study sponsors
• Duplications
• Abstracts, grey literature with negative findings
• Unpublished data
– Ideally, we would trend MAs over time
Publication Bias
[Figure labels: Pooled Estimate; Selective Publication; Greater Study Limitations; More Restrictive/Responsive Population]
Publication Bias-Testing
• Tests of asymmetry
• Imputing missing information
• Repeated MA over time
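The slide does not name a specific asymmetry test. One commonly used option is Egger's regression, which regresses each study's standardized effect (effect / SE) on its precision (1 / SE); an intercept far from zero suggests funnel-plot asymmetry. The sketch below computes only that intercept with ordinary least squares. The function name and study values are hypothetical, and the significance test on the intercept is omitted.

```python
def egger_intercept(effects, std_errors):
    """OLS intercept from regressing effect/SE on 1/SE (Egger-style)."""
    x = [1.0 / se for se in std_errors]
    y = [e / se for e, se in zip(effects, std_errors)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x


# Hypothetical log relative risks; larger SE means a smaller study.
std_errors = [0.05, 0.10, 0.20, 0.40]
symmetric = egger_intercept([-0.2, -0.2, -0.2, -0.2], std_errors)
small_study_effect = egger_intercept([-0.2, -0.3, -0.6, -1.0], std_errors)
print(symmetric, small_study_effect)
```

When every study estimates the same effect the intercept is essentially zero; when the smaller studies report larger benefits, as in the second example, it moves away from zero. With few studies such tests have little power, which fits the earlier caution that detection is harder than one might think.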
Publication Bias-Addressing the Problem
• Thorough research
– Gray Literature
– FDA submissions
– Abstracts, proceedings
– Author Contact
• Clinicaltrials.gov
– N.B: only for RCTs, not observational studies
Grade Up
Moving Up- Examples
• Very strong, consistent association; no plausible confounders, up 2 grades
– insulin in diabetic ketoacidosis
– antibiotics in septic shock
• Strong, consistent association with no plausible confounders, up 1 grade
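The rate-down and rate-up steps described in the preceding slides can be summarized as simple bookkeeping: a body of evidence enters at a level set by study design and then moves down or up. The sketch below is an illustration only; the names are hypothetical, and the number of levels to move for each factor remains a panel judgment that no function can make.

```python
LEVELS = ["Very low", "Low", "Moderate", "High"]

# Entry levels by study design, as listed on the "meat-grinder" slide.
START = {"rct": 3, "observational": 1, "expert report": 0}


def grade_quality(design, rate_down=0, rate_up=0):
    """Return the final quality level.

    rate_down: total levels removed across risk of bias, inconsistency,
               indirectness, imprecision and publication bias.
    rate_up:   total levels added (for example 1 or 2 for a large effect).
    """
    index = START[design] - rate_down + rate_up
    index = max(0, min(index, len(LEVELS) - 1))  # stay within the four levels
    return LEVELS[index]


print(grade_quality("rct", rate_down=1))            # RCTs with serious inconsistency
print(grade_quality("observational", rate_up=2))    # very strong association
print(grade_quality("observational", rate_down=1))  # indirect observational data
```

The first call reproduces the pattern of RCT evidence downgraded once, and the third matches the earlier example of observational data that are indirect for critically ill patients and end at Very Low.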
How to get GRADEpro on your computer?
• Cochrane IMS website
• http://www.cc-ims.net/revman/gradepro/download
• Google ‘gradepro’
GRADE output: Summary of Findings
GRADE output: Evidence Profile
Question: Should longer term (7 day) low dose (up to 300 mg/day of hydrocortisone) glucocorticosteroids be used in severe sepsis and septic shock?
Settings: ICU
Bibliography: Annane 2009

Mortality, 28 days
– Quality assessment: 12 randomised trials; no serious limitations; serious inconsistency [1]; no serious indirectness; no serious imprecision; other considerations: none
– No of patients: glucocorticosteroids 236/629 (37.5%); control 264/599 (44.1%)
– Effect: relative RR 0.84 (95% CI 0.72 to 0.97); absolute 71 fewer per 1000 (from 13 fewer to 123 fewer)
– Quality: MODERATE
– Importance: CRITICAL [2]

GI bleeding
– Quality assessment: 3 randomised trials; no serious limitations; no serious inconsistency [3]; no serious indirectness; serious imprecision [4]; other considerations: none
– No of patients: glucocorticosteroids 65/827 (7.9%); control 56/767 (7.3%)
– Effect: relative RR 1.12 (95% CI 0.81 to 1.53); absolute 9 more per 1000 (from 14 fewer to 39 more)
– Quality: MODERATE
– Importance: IMPORTANT

Superinfections
– Quality assessment: 45 randomised trials; no serious limitations; no serious inconsistency [6]; no serious indirectness; no serious imprecision [7]; other considerations: none
– No of patients: glucocorticosteroids 184/983 (18.7%); control 170/934 (18.2%)
– Effect: relative RR 1.01 (95% CI 0.82 to 1.25); absolute 2 more per 1000 (from 33 fewer to 46 more)
– Quality: HIGH
– Importance: IMPORTANT

Footnotes
[1] Meta-regression examining the effect of severity of illness (baseline mortality) on efficacy suggested an effect - p value 0.04 using fixed effect and 0.06 using random effect model. JAMA 2009; 302:1643-1645.
[2] Reported for all trials
[3] I² = 0
[4] RR up to 1.53
[5] need to check
[6] I² = 8%
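The absolute effects in the evidence profile appear consistent with applying the relative risk to the control-group event rate: absolute difference per 1000 = 1000 × control risk × (RR − 1). A minimal sketch of that arithmetic, using the mortality and GI bleeding figures from the profile, is shown here; the function name is hypothetical, and the assumption that the control arm supplies the baseline risk is mine rather than something stated on the slide.

```python
def absolute_effect_per_1000(control_events, control_total, rr):
    """Events per 1000 patients gained (+) or avoided (-) at a given RR."""
    baseline_risk = control_events / control_total
    return round(1000 * baseline_risk * (rr - 1))


# Mortality, 28 days: control 264/599, RR 0.84 (0.72 to 0.97)
mortality = [absolute_effect_per_1000(264, 599, rr) for rr in (0.84, 0.97, 0.72)]
print(mortality)  # → [-71, -13, -123]

# GI bleeding: control 56/767, RR 1.12 (0.81 to 1.53)
gi_bleeding = [absolute_effect_per_1000(56, 767, rr) for rr in (1.12, 0.81, 1.53)]
print(gi_bleeding)  # → [9, -14, 39]
```

Negative values correspond to "fewer per 1000" and positive values to "more per 1000", matching the absolute effects reported for both outcomes.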
Final QOE
• High: A, ++++, ↑↑↑↑
• Moderate: B, +++-, ↑↑↑
• Low: C, ++--, ↑↑
• Very Low: D, +---, ↑
Alternate QOE interpretation
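• High- Further research very unlikely to change confidence in the estimate of effect
• Moderate- Further research likely to have an important impact on confidence
• Low- Further research very likely to have an important impact on confidence
• Very Low- Any estimate of effect is uncertain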
Separate QOE and Strength of Recommendation
• Evidence: high or low quality?
– likelihood estimates are true and adequate
• Recommendation: weak or strong?
– confidence that following recommendation will cause more good than harm
GRADE’s defining feature
Factors- STRONG vs WEAK
STRONG to stakeholders
• Patient: most people would want it
• Clinician: most should receive, uniform behavior
• Policymaker: adopt as policy, use as quality indicator
WEAK to stakeholders
• Patient: many people would not want it
• Clinician: help patient make a balanced decision
– decision aid might be needed
• Policymaker: debate
Final Strength of Recommendations
[Figure: GRADE process overview diagram, repeated from the slide “Not all guidelines are created equal”.]
Useful Resources
• BMJ: GRADE series
– GRADE Introduction:
• BMJ 2008;336:924-926
– Overview of Quality of Evidence:
• BMJ 2008;336:995-998
– Translating Evidence to Recommendations:
• BMJ 2008;336:1049-1051
– How to handle disagreements in guidelines panels:
• BMJ 2008;337:a744
Useful Resources II
• Journal of Clinical Epidemiology
– GRADE Guidelines Series: 1-9. 2011
– April, 2011 (64(4)): 1-4
• Intro, framing the question and outcomes, rating quality of evidence, risk of bias
– December, 2011 (64(12)): 5-9
• Publication bias, imprecision, inconsistency, indirectness, rating up