statistical issues in clinical studies using laboratory endpoints elizabeth s. garrett, phd sidney...
TRANSCRIPT
Statistical Issues in Clinical Studies
UsingLaboratory Endpoints
Elizabeth S. Garrett, PhDSidney Kimmel Comprehensive Cancer Center
Johns Hopkins University
ACOSOG June 5, 2003
Montreal
Outline Motivating example: Endostatin trial
Surrogate markers
Design issues in laboratory studies
Defining outcomes
Issues relating to measurement
Motivating Example: Dose Finding Endostatin Trial*
* Herbst, Abbruzzese et al. Development * Herbst, Abbruzzese et al. Development of Biologic Markers of Response and of Biologic Markers of Response and Assessment of Antiangiogenic Activity Assessment of Antiangiogenic Activity in Clinical Trial of Human Recombinant in Clinical Trial of Human Recombinant Endostatin. JCO v. 20, 2002.Endostatin. JCO v. 20, 2002.
Endostatin is angiogenesis inhibitor.
Complex biology: finding best dose will require extensive study.
Cytostatic versus cytotoxic agent.
Outcomes of interest are biomarkers: Tumor blood flow Tumor metabolism
Outcomes were measured at baseline, 28 days and 56 days.
Motivating Example: Dose Finding Endostatin Trial* Why is this study different than other dose
finding studies? Standard Phase I trials find MTD (maximum
tolerated dose) with safety as primary outcome.
Primary outcome is ‘efficacy-related’. Outcomes are ‘surrogate’ outcomes. Measuring outcomes is more invasive and
more costly than standard safety or efficacy trials.
Measurement of outcomes can be complicated.
Surrogate Markers If tumor blood flow and tumor metabolism change
as we expect, what can we conclude?
Is it true, then, that endostatin is efficacious? Not necessarily…..how closely tied are these
markers and clinical response?
Surrogate outcomes: outcomes in the causal pathway of true outcome.
Replace a distal endpoint (response) by proxy endpoint (tumor metabolism and/or blood flow).
Benefits of using surrogate markers Reduction in sample size Reduction in trial duration Reduction in cost Reduction in time to evaluate new therapies
Their use is NOT AS EASY AS IT SOUNDS… Use of a marker as surrogate for outcome
requires that you first identify one...
Surrogate Markers
What is a surrogate marker? Defining Characteristic:
a marker must predict clinical outcome, in addition to predicting the effect of treatment on clinical outcome
Operational Definition establish an association between marker & clinical
outcome establish an association between marker,
treatment & clinical outcome, in which marker mediates relationship between clinical outcome and treatment
Surrogate Markers
[ Clinical Outcome | Treatment & Marker]
[ Clinical Outcome | Marker] =
Effect of the treatment on the clinical outcome is completely explained by the effect of the marker
Knowing the status of the marker, the treatment adds nothing more to explaining the clinical outcome.
Surrogate Markers
marker
Clinical outcome
treatmentClinical outcome
1) establish an association between marker & clinical outcome.
2) establish an association between marker, treatment & clinical outcome, in which marker completely mediates relationship between clinical outcome and treatment.
marker
NOT Surrogate Markers
marker
treatmentClinical outcome
treatment marker
Clinical outcome
Correlative Studies It is still valid to look at
markers that you would expect to be correlated with the clinical outcome.
But, we do not want to be overconfident by saying that they are true ‘surrogates.’
Correlative studies might include: Pharmacokinetics Pharmacodynamics Other biologic markers
that can be measured in serum, biopsy samples, etc.
treatment marker
Clinical outcome
Design Issues Novel laboratory studies may not be suited
for standard phase I and phase II clinical trials.
Standard Phase 1 Clinical Trial Designs Historically, MTD is of interest ‘3+3’: aims for dose with <33% toxicity
Treat 3 patients at a dose. If no toxicities, escalate. If 1 toxicity, treat 3 more at that level. If 2 or more, de-escalate to lower dose.
‘3+3’ or variants are most commonly seen. Not relevant to cytostatic and other agents (e.g.
vaccine trials). Need different approach.
Alternative Designs Continual Reassessment Method (CRM)
‘Response’ can be defined as toxicity or efficacy endpoint.
Target level of response is chosen. Mathematical model for dose-response is
assumed. Dose levels are not fixed in advance: later
doses are determined by responses to previous doses.
Many variants, but generally need to pre-specify a minimum and maximum dose.
Problem: we might not know what our ‘target’ response is.
Alternative Designs Dose-ranging study
Choose fixed doses and treat a fixed number of patients at each of the doses.
Allows estimation of dose-response curve. Choice of Herbst, et al. for endostatin trial. Can be used when safety is not of great concern (i.e.
probability of toxicities is small). For both CRM and dose-ranging, we tend to get
better estimates of dose-response curves. Toxicity almost always gets included as (at least)
a secondary outcome.
Selecting a Phase I Design Don’t assume the standard ‘phase 1’ trial designs are
appropriate for assessing laboratory endpoints. Dose-ranging studies are more likely candidates. Treat enough patients at each dose level to get
‘reasonable’ estimate of response (i.e. at least 3). Think carefully about dose levels….
Want enough dose levels to understand/estimate the curve Don’t want so many that it will take you a long time to finish
the study. Patient population
Endostatin trial: broad patient population 3 NSCLC, 8 melanoma, 2 HNSCC, 3 breast, 2 colon, 2 renal cell,
4 sarcoma, 1 thyroid. Generalizability versus accrual versus specificity
Choosing dose levels Can be hard to know
exactly how to select doses.
If there is too much ‘space’ between doses, it can be difficult to estimate true curve.
If you expect, ‘monotonicity’, then you can infer intermediate doses.
But, what if not monotonic? Endostatin example.
Choosing the timing Standard Phase I
Patients are under surveillance for short term toxicities. Patients contact study team when a toxicity occurs after
discharge. Not an issue of ‘when to look.’
In laboratory studies Usually a pre-post setting: how does the measure
compare after treatment to before treatment. Baseline (before treatment) measure is needed. Post-treatment measures:
When is treatment at its most potent? How often should response be measured? Expensive? Invasive? Is it sufficient to look at clinic visit times?
Timing in Endostatin Trial
28 days 56 days
Phase II Designs Phase II designs: efficacy Standard efficacy outcomes
Complete response Partial response Overall survival
For cytostatic agents Progression Progression-free survival Laboratory outcomes…
Defining outcomes in laboratory studies They are usually messy More common binary outcomes have nice
properties: “Looking for 40% response vs. 20% response”
Laboratory outcomes are not so nice: Often skewed. Often have ‘undetectable’ range. Often do not know what to expect. This makes it hard to plan (i.e. sample size, power).
Novel assays: Not obvious what expected changes would be without treatment. How much fluctuation would we expect to see?
Expected Fluctuations
0 28 56
But, with multiplepatients, these changes wouldtend to average tozero.
However, if halfof patients haveincrease and halfhave decrease, we might conclude thattreatment is 50%effective!
Expected Fluctuations Follow-up times are
compared to baseline That puts a lot of stock
in baseline measure Why not consider a
‘burn-in’ period? If baseline is
inaccurately measured, all comparisons will be incorrect.
Measurement Issues Regardless of natural
fluctuations… How accurately are we
measuring our outcome?
How accurate can we measure blood flow?
Are we on target?
Accurate?
Measurement Issues Why would it be measured inaccurately?
Often LONG protocol to get ‘final’ measure. Lots of room for errors!
Although the classic efficacy outcome of ‘response’ is relatively soft, it has been objectively defined.
Laboratory endpoints require assays and other measurements, and also assumptions.
Often, assay is being developed along with the trial.
Assay Properties IS THE ASSAY SENSITIVE? Sensitivity: does the assay detect
abnormalities in cases where abnormalities exist?
Specificity: does the assay find normal levels in normal cases?
“It has been shown that at high flow rates, measuring blood flow by PET underestimates blood flow.” bias in results.
More Measurement Issues Reliability of assay
How reproducible are the results? Two samples taken from the same patient on the same day
from different lesions? Two samples taken from the same patient on the same day
from the same lesion? One sample analyzed twice using the same method?
Subjectivity Inter-rater agreement Intra-rater agreement In what ways can ‘error’ come into the procedure?
Great study + bad assay = bad study
Measurement Issues How can we know the reliability of the
measures? Preliminary studies (pre-clinical). Build it into the study design!
Reliability substudy: Inter-rater agreement Intra-rater agreement
Incorporate burn-in period Take multiple measures Run the assay (test) more than once
Final Comments Novel trials often need creative/novel study designs Understanding the properties (i.e. reliability, sensitivity) of
your assay are crucial. Try to build reliability testing into study design Think carefully about:
Are your markers are truly surrogates? Which doses you should choose (minimum, maximum,
and increments)? How often you should measure the outcome?
Statisticians usually can provide guidance, but not necessarily answers to these questions! Answers need to be based on experience with the novel agent andthe underlying biology.