Post on 18-Dec-2015
How to design and interpret controlled clinical trials
“the dark side of the moon”
“How to session”, ESH, June 2005
Andreas Pittaras MD
E. Freis, 1912-2005
The father of the first multicenter, double-blinded, randomized trial of cardiovascular drugs, the VA Cooperative Study on Antihypertensive Agents
10,000 new randomized trials every year
>350,000 trials
General internists would need to read 20 articles a day all year round to maintain present knowledge
Systematic reviews and guidelines reduce this problem
“the aim of science (…clinical trial) is not to open a door to endless wisdom,
but to put a limit to endless error”
-Bertolt Brecht
Do we wear the same eyeglasses ?
Clinical
Trials
Clinical Studies: Essential Questions
• Was the study original?
• Whom is the study about?
• Was the design of the study sensible?
• Was systematic bias avoided or
minimized?
• Was the study large enough, and continued
for long enough, to make the results credible?
Clinical Studies: Essential Questions
• Was the study original?
• Is there any similar study?
• Is this study bigger, continued for longer,
or otherwise more substantial than previous
one(s)?
• Is the methodology of this study any more
rigorous (in particular, does it address any
specific methodological criticisms of previous
studies)?
• Will the numerical results of this study add
significantly to a meta-analysis of previous studies?
• Is the population that was studied different in any
way (has the study looked at different ages, sex, or
ethnic groups than previous studies)?
• Is the clinical issue addressed of sufficient
importance, and is there sufficient doubt in the minds
of the public or key decision makers, to make new
evidence “politically” desirable even when it is not
strictly scientifically necessary?
Clinical Studies: Essential Questions
• Whom is the study about?
• How were the subjects recruited? Advertisement in a local newspaper, primary care, veterans, homeless people, etc.
• Who was included in the study? Coexisting illness, local language, other medication, illiterate people, etc. (the results of studies of new drugs in 23-year-old healthy male volunteers will not be applicable to the average elderly woman)
• Who was excluded from the study? A study may be restricted to patients with moderate or severe CHF, which could lead to false conclusions about mild CHF. Hospital outpatient studies have a different disease spectrum from primary care
• Were the subjects studied in real life circumstances? If not, there is doubt about the applicability of the findings to your own practice
Clinical Studies: Essential Questions
• Was the design of the study sensible?
• What specific intervention or other maneuver was
being considered, and what was it being compared
with ?
•It is tempting to take published statements at face value, but
authors frequently misrepresent (usually subconsciously
rather than deliberately) what they actually did, and they
overestimate its originality and potential importance.
•…… examples of problematic descriptions in the method
section of a clinical trial……
What the authors said / What they should have said (or should have done) / An example of:

1. What the authors said: "We measured how often GPs ask patients whether they smoke."
   What they should have said: "We looked in patients' medical records and counted how many had had their smoking status recorded."
   An example of: Assumption that medical records are 100% accurate.

2. What the authors said: "We measured how doctors treat low back pain."
   What they should have said: "We measured what doctors say they do when faced with a patient with low back pain."
   An example of: Assumption that what doctors say they do reflects what they actually do.

3. What the authors said: "We compared a nicotine-replacement patch with placebo."
   What they should have said: "Subjects in the intervention group were asked to apply a patch containing 15 mg nicotine twice daily; those in the control group received identical-looking patches."
   An example of: Failure to state dose of drug or nature of placebo.

4. What the authors said: "We asked 100 teenagers to participate in our survey of sexual attitudes."
   What they should have said: "We approached 147 white American teenagers aged 12-18 (85 males) at a summer camp; 100 of them (31 males) agreed to participate."
   An example of: Failure to give sufficient information about subjects. (Note in this example the figures indicate a recruitment bias towards females.)

5. What the authors said: "We randomized patients to either 'individual care plan' or 'usual care'."
   What they should have said: "The intervention group were offered an individual care plan consisting of ...; control patients were offered ...."
   An example of: Failure to give sufficient information about intervention. (Enough information should be given to allow the study to be repeated by other workers.)

6. What the authors said: "To assess the value of an educational leaflet, we gave the intervention group a leaflet and a telephone helpline number. Controls received neither."
   What they should have done: If the study is purely to assess the value of the leaflet, both groups should have been given the helpline number.
   An example of: Failure to treat groups equally apart from the specific intervention.

7. What the authors said: "We measured the use of vitamin C in the prevention of the common cold."
   What they should have done: A systematic literature search would have found numerous previous studies on this subject [14].
   An example of: Unoriginal study.
• What outcome was measured, and how?
• If you had an incurable disease and were testing a new drug, you would measure the efficacy of the drug in terms of whether it made you live longer (and perhaps whether life was worth living, given your condition and any side effects of the medication)
• The measurement of symptomatic effects (pain), functional effects (mobility), psychological effects (anxiety), or social effects (inconvenience) of an intervention is fraught with even more problems.
• What is important in the eyes of the doctor may not be valued so highly by the patient, and vice versa.
Clinical Studies: Essential Questions
•Was systematic bias avoided or minimized?
The aim: groups as similar as possible except for the particular difference being examined
Receive same explanations
Have same contacts with health professionals
Be assessed the same number of times
Using the same outcome measures
Different study designs to reduce systematic bias
Randomized controlled trials
Non-randomized controlled clinical trials
Cohort studies
Case-control studies
Randomized double-blind controlled trials
“Gold standard”
The two treatments are investigated concurrently
Allocation of treatments to patients is by a random
process
Neither the patient nor the clinician knows which
treatment was received
“Single blind”: only the patient is unaware
Copyright ©1997 BMJ Publishing Group Ltd.
Sources of bias to check for in a randomised controlled trial
Random allocation: every patient has the same chance of receiving either treatment, so allocation is unbiased by definition
Minimization: each patient is automatically given the treatment that leads to less imbalance between the groups; an alternative in small trials
Systematic allocation: pseudo-random (e.g. even vs odd days groups); open to abuse
Non-random concurrent controls: active treatment vs a control group of ineligible patients and refusers; volunteer bias
Historical controls: a single group given the new treatment vs a group previously treated with an alternative treatment
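A minimal sketch of how random allocation and minimization could be coded, for illustration only. The function names, the prognostic factors ("sex", "age") and the deterministic form of minimization (ties broken at random) are assumptions made here, not something specified on the slides; real trials often add a random element to minimization.

```python
import random
from collections import defaultdict


def random_allocate(arms=("A", "B"), rng=None):
    """Simple random allocation: every arm has the same chance."""
    rng = rng or random.Random()
    return rng.choice(arms)


def minimization_allocate(patient, counts, arms=("A", "B"), rng=None):
    """Allocate `patient` to the arm that leaves the groups least imbalanced.

    `patient` maps each prognostic factor to the patient's level,
    e.g. {"sex": "F", "age": "65+"} (hypothetical factors).
    `counts[(factor, level, arm)]` holds how many earlier patients with
    that level are already in that arm; it is updated in place.
    """
    rng = rng or random.Random()
    scores = {}
    for arm in arms:
        total = 0
        for factor, level in patient.items():
            # Counts per arm if this patient were added to `arm`
            trial = [counts[(factor, level, a)] + (1 if a == arm else 0)
                     for a in arms]
            total += max(trial) - min(trial)
        scores[arm] = total
    best = min(scores.values())
    # Ties are broken at random
    chosen = rng.choice([a for a in arms if scores[a] == best])
    for factor, level in patient.items():
        counts[(factor, level, chosen)] += 1
    return chosen


if __name__ == "__main__":
    rng = random.Random(2005)
    counts = defaultdict(int)
    for patient in [{"sex": "F", "age": "65+"}, {"sex": "F", "age": "65+"},
                    {"sex": "M", "age": "<65"}]:
        print(patient, "->", minimization_allocate(patient, counts, rng=rng))
```

With two identical patients in a row, the second is always sent to the opposite arm, which is the behaviour that makes minimization attractive in small trials.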
Alternative designs
Parallel group design (two different groups are studied concurrently)
Crossover design
Within group (paired) comparisons
Sequential designs
Factorial designs
Adaptive designs
Zelen’s design
Sequential design
Parallel groups are studied, but the trial continues until a clear benefit of one treatment is seen, or until it is unlikely that any difference will emerge
Will be shorter than fixed-length trials
The data are analyzed after each patient's results become available
Blinding problems; ethical difficulties
Group sequential trial: a useful variation; the data are analyzed after each block of patients' results becomes available (early termination)
Factorial designs
Two treatments, A & B, are simultaneously compared with each other and with a control.
Patients are divided into four groups, who receive the control treatment, A only, B only, and both A & B.
Allows the investigation of the “synergy” between A & B
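As a rough illustration of how the four groups of a factorial design can be read, the sketch below computes the effect of each treatment and their interaction on an additive scale, which is one common way to quantify "synergy". The function name and the outcome values (mean falls in systolic BP, in mmHg) are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def factorial_effects(control, a_only, b_only, both):
    """Main effects and additive interaction from a 2x2 factorial design.

    Each argument is the mean outcome in one of the four groups.
    """
    effect_a_without_b = a_only - control
    effect_a_with_b = both - b_only
    effect_b_without_a = b_only - control
    effect_b_with_a = both - a_only
    return {
        # Average effect of each treatment across levels of the other
        "main_effect_a": (effect_a_without_b + effect_a_with_b) / 2,
        "main_effect_b": (effect_b_without_a + effect_b_with_a) / 2,
        # Extra effect of A when B is also given (0 means purely additive)
        "interaction": effect_a_with_b - effect_a_without_b,
    }


if __name__ == "__main__":
    # Hypothetical mean falls in systolic BP (mmHg) in the four groups
    print(factorial_effects(control=2, a_only=10, b_only=8, both=20))
```

A positive interaction here means the combination lowers BP by more than the sum of the separate effects; whether such a difference is real still needs a significance test or confidence interval.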
Clinical Studies: Essential Questions
• Was assessment “blind”?
“Blind” assessment?
Do the people who assess outcome know the patient’s group? Examples where this matters:
- Judging whether someone is still clinically in heart failure
- Saying whether an x ray is “improved” from last time
- Rechecking a high BP measurement in the active group
- BB vs ACEi, ARBs or diuretics (lower HR, lower K+)
- CCB vs others (pedal edema)
Clinical Studies: Essential Questions
• Was the study large enough, and
continued for long enough, to make the
results credible?
Sample size
Big enough to have a high chance of detecting a worthwhile effect if it exists
Big enough to be reasonably sure that no benefit exists if it is not found in the trial
Errors defined
Type I error (α): the probability of detecting a statistically significant difference when the treatments are in reality equally effective (the chance of a false-positive result)
Type II error (β): the probability of not detecting a statistically significant difference when a difference of a given magnitude in reality exists (the chance of a false-negative result)
Power (1-β): the probability of detecting a statistically significant difference when a difference of a given magnitude really exists
The simplest approximate sample size formula for binary outcomes, assuming α=0.05, power=0.90, and equal sample sizes in the two groups
n = 10.51 [(R+1) - p₂(R²+1)] / [p₂(1-R)²]
n: the sample size in each of the groups
p₁: event rate in the treatment group
p₂: event rate in the control group
R: risk ratio (p₁/p₂)
The simplest approximate sample size formula for binary outcomes, assuming α=0.05, power=0.90, and equal sample sizes in the two groups: worked example
n = 10.51 [(0.60+1) - 0.10(0.60²+1)] / [0.10(1-0.60)²] ≈ 962
n: the sample size in each of the groups
p₁: 0.06 (6% event rate in the treatment group)
p₂: 0.10 (estimated 10% event rate in the control group)
R: 0.60 = 6%/10% (p₁/p₂; to detect a 40% reduction)
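The worked example above can be checked with a few lines of code. This is a sketch only: the function name is made up here, and rounding up to the next whole patient is an assumption about how the 962 on the slide was obtained. The constant 10.51 is approximately (1.96 + 1.28)², which corresponds to a two-sided α of 0.05 and a power of 0.90.

```python
import math


def sample_size_per_group(p_control, risk_ratio):
    """Approximate patients needed per group (alpha=0.05, power=0.90).

    p_control  : expected event rate in the control group (p2)
    risk_ratio : risk ratio to detect, R = p1 / p2
    """
    if not 0 < p_control < 1:
        raise ValueError("p_control must be between 0 and 1")
    if risk_ratio == 1:
        raise ValueError("risk_ratio of 1 means no difference to detect")
    numerator = 10.51 * ((risk_ratio + 1) - p_control * (risk_ratio ** 2 + 1))
    denominator = p_control * (1 - risk_ratio) ** 2
    # Round up: a fraction of a patient cannot be recruited
    return math.ceil(numerator / denominator)


if __name__ == "__main__":
    # 10% control event rate, 40% relative reduction (R = 0.60)
    print(sample_size_per_group(p_control=0.10, risk_ratio=0.60))  # 962
```

Trying a smaller effect, for example R = 0.80 with the same control rate, shows how quickly the required size grows as the difference to be detected shrinks.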
Approximate relative trial sizes for different levels of “α” and “power”

α (type I error)    Power (1-β)
                    0.50    0.80    0.90    0.99
0.05                100     200     270     480
0.01                170     300     390     630
0.001               280     440     540     820
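The figures in this table can be approximated in code. The sketch below assumes, and this is an assumption rather than something stated on the slide, that trial size is proportional to (z for two-sided α + z for power)² and that the table is scaled so that α=0.05 with power 0.50 equals 100. Under that assumption the computed values agree with the table to within its rounding to the nearest 10.

```python
from statistics import NormalDist


def relative_trial_size(alpha, power, ref_alpha=0.05, ref_power=0.50):
    """Trial size relative to a reference design, with the reference = 100."""
    std_normal = NormalDist()

    def size_factor(a, pw):
        z_alpha = std_normal.inv_cdf(1 - a / 2)  # two-sided type I error
        z_beta = std_normal.inv_cdf(pw)          # power = 1 - beta
        return (z_alpha + z_beta) ** 2

    return 100 * size_factor(alpha, power) / size_factor(ref_alpha, ref_power)


if __name__ == "__main__":
    for alpha in (0.05, 0.01, 0.001):
        row = [relative_trial_size(alpha, pw) for pw in (0.50, 0.80, 0.90, 0.99)]
        print(alpha, [round(size) for size in row])
```

The practical reading is the same as the table's: moving from 80% to 90% power at α=0.05 costs roughly a third more patients.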
Duration of follow up
The study must continue long enough for the effect
of intervention to be reflected in the outcomes.
A study of a new painkiller on the postoperative
pain may only need a follow up period of 48h.
The effect of nutritional supplements in the
preschool years on the final height needs decades.
Events in newly diagnosed DM need >10 years
Interpretation “Tips” of results
p<0.05: a result like this would arise by chance less than 1 time in 20 (“significant”)
p<0.01: by chance less than 1 time in 100 (“highly significant”)
CI (“confidence interval”) around a result: indicates the limits within which the “real” difference is likely to lie
Every r value (correlation coefficient) should be accompanied by a p value or a CI
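To make the idea of a confidence interval concrete, here is a small sketch that computes an approximate 95% CI for the difference between two event rates, using the simple normal (Wald) approximation. The function name and the counts (60 events in 1000 treated patients vs 100 in 1000 controls) are hypothetical and not taken from any trial.

```python
import math
from statistics import NormalDist


def risk_difference_ci(events_treated, n_treated, events_control, n_control,
                       level=0.95):
    """Difference in event rates (treated - control) with a Wald CI."""
    p_treated = events_treated / n_treated
    p_control = events_control / n_control
    difference = p_treated - p_control
    standard_error = math.sqrt(p_treated * (1 - p_treated) / n_treated
                               + p_control * (1 - p_control) / n_control)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return difference, difference - z * standard_error, difference + z * standard_error


if __name__ == "__main__":
    diff, lower, upper = risk_difference_ci(60, 1000, 100, 1000)
    print(f"difference {diff:.3f}, 95% CI {lower:.3f} to {upper:.3f}")
```

For these hypothetical counts the whole interval lies below zero, which is what a p value under 0.05 for the same comparison would also indicate; the interval adds the plausible size of the difference.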
Interpretation “Tips” of results
Relative Risk of death
Relative Risk Reduction
Absolute Risk Reduction
Number needed to treat
Odds Ratio
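The measures listed above can all be derived from two event rates. The sketch below uses their standard definitions and, purely for illustration, reuses the 6% and 10% event rates from the sample size example; the function name is made up here.

```python
def risk_measures(p_treatment, p_control):
    """Common effect measures from the event rates in the two groups."""
    relative_risk = p_treatment / p_control
    absolute_risk_reduction = p_control - p_treatment
    odds_treatment = p_treatment / (1 - p_treatment)
    odds_control = p_control / (1 - p_control)
    return {
        "relative_risk": relative_risk,
        "relative_risk_reduction": 1 - relative_risk,
        "absolute_risk_reduction": absolute_risk_reduction,
        # Patients who must be treated to prevent one event
        "number_needed_to_treat": 1 / absolute_risk_reduction,
        "odds_ratio": odds_treatment / odds_control,
    }


if __name__ == "__main__":
    for name, value in risk_measures(p_treatment=0.06, p_control=0.10).items():
        print(f"{name}: {value:.3f}")
```

With these rates the relative risk reduction is 40% but the absolute risk reduction is only 4 percentage points, so about 25 patients need to be treated to prevent one event. Quoting only the relative figure can make the same result look more impressive.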
• 30-40% of patients do not receive care according to present scientific evidence
• 20-25% of care provided is not needed or is potentially harmful
Common clinician concerns about trials, subgroups, meta-analyses, and risk
“Could my patient have been randomized in this
trial? If so the results are applicable; if not, they
may not be”
“Is my patient so different from those in the trial
that its results cannot help me make my
treatment decision?”
[Figure: nested triangles, a different population with a common condition. Labels: Source Population, Eligible Population, Participants; Exposure or Intervention vs Comparison or Control; Outcomes (+/-) in each group]