
Statistical Issues in Clinical Studies

Using Laboratory Endpoints

Elizabeth S. Garrett, PhD

Sidney Kimmel Comprehensive Cancer Center

Johns Hopkins University

ACOSOG June 5, 2003

Montreal

Outline

Motivating example: Endostatin trial

Surrogate markers

Design issues in laboratory studies

Defining outcomes

Issues relating to measurement

Motivating Example: Dose Finding Endostatin Trial*

* Herbst, Abbruzzese et al. Development of Biologic Markers of Response and Assessment of Antiangiogenic Activity in Clinical Trial of Human Recombinant Endostatin. JCO v. 20, 2002.

Endostatin is an angiogenesis inhibitor.

Complex biology: finding best dose will require extensive study.

Cytostatic versus cytotoxic agent.

Outcomes of interest are biomarkers:

Tumor blood flow

Tumor metabolism

Outcomes were measured at baseline, 28 days and 56 days.

Motivating Example: Dose Finding Endostatin Trial*

Why is this study different from other dose-finding studies?

Standard Phase I trials find the MTD (maximum tolerated dose) with safety as primary outcome.

Primary outcome is ‘efficacy-related’.

Outcomes are ‘surrogate’ outcomes.

Measuring outcomes is more invasive and more costly than in standard safety or efficacy trials.

Measurement of outcomes can be complicated.

Surrogate Markers

If tumor blood flow and tumor metabolism change as we expect, what can we conclude?

Is it true, then, that endostatin is efficacious?

Not necessarily... how closely tied are these markers and clinical response?

Surrogate outcomes: outcomes in the causal pathway of the true outcome.

Replace a distal endpoint (response) by a proxy endpoint (tumor metabolism and/or blood flow).

Benefits of using surrogate markers:

Reduction in sample size

Reduction in trial duration

Reduction in cost

Reduction in time to evaluate new therapies

Their use is NOT AS EASY AS IT SOUNDS...

Use of a marker as a surrogate for an outcome requires that you first identify one...

Surrogate Markers

What is a surrogate marker?

Defining characteristic: a marker must predict clinical outcome, in addition to predicting the effect of treatment on clinical outcome.

Operational definition:

Establish an association between marker & clinical outcome.

Establish an association between marker, treatment & clinical outcome, in which the marker mediates the relationship between clinical outcome and treatment.

Surrogate Markers

[ Clinical Outcome | Treatment & Marker ] = [ Clinical Outcome | Marker ]

Effect of the treatment on the clinical outcome is completely explained by the effect of the marker

Knowing the status of the marker, the treatment adds nothing more to explaining the clinical outcome.
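
One way to make this condition concrete is to tabulate outcome rates with and without conditioning on treatment. The sketch below is illustrative only: the function name `outcome_rates`, the category labels, and the counts are all hypothetical, and with real data the comparison would need a formal model rather than an eyeball check of rates.

```python
from collections import defaultdict

def outcome_rates(records):
    """Tabulate outcome rates from (treatment, marker, outcome) records.

    outcome is 0 or 1. Returns two dicts:
    rate of outcome given (treatment, marker), and rate given marker alone.
    """
    by_tm = defaultdict(lambda: [0, 0])  # (treatment, marker) -> [events, n]
    by_m = defaultdict(lambda: [0, 0])   # marker -> [events, n]
    for treatment, marker, outcome in records:
        for table, key in ((by_tm, (treatment, marker)), (by_m, marker)):
            table[key][0] += outcome
            table[key][1] += 1
    given_tm = {k: events / n for k, (events, n) in by_tm.items()}
    given_m = {k: events / n for k, (events, n) in by_m.items()}
    return given_tm, given_m

# Hypothetical counts, constructed so that treatment adds nothing
# once the marker status is known.
records = (
    [("treated", "changed", 1)] * 6 + [("treated", "changed", 0)] * 4
    + [("control", "changed", 1)] * 3 + [("control", "changed", 0)] * 2
    + [("treated", "unchanged", 1)] * 1 + [("treated", "unchanged", 0)] * 4
    + [("control", "unchanged", 1)] * 2 + [("control", "unchanged", 0)] * 8
)
given_tm, given_m = outcome_rates(records)
for (treatment, marker), rate in sorted(given_tm.items()):
    print(treatment, marker, round(rate, 2), "vs marker only:", round(given_m[marker], 2))
```

In this constructed example the rate within each marker group is the same for treated and control patients, which is the pattern the equality above describes.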

Surrogate Markers

[Diagrams: nodes labeled treatment, marker, and clinical outcome]

1) Establish an association between marker & clinical outcome.

2) Establish an association between marker, treatment & clinical outcome, in which the marker completely mediates the relationship between clinical outcome and treatment.

NOT Surrogate Markers

[Diagrams: arrangements of treatment, marker, and clinical outcome in which the marker is not a surrogate]

Correlative Studies

It is still valid to look at markers that you would expect to be correlated with the clinical outcome.

But we do not want to be overconfident by saying that they are true ‘surrogates.’

Correlative studies might include:

Pharmacokinetics

Pharmacodynamics

Other biologic markers that can be measured in serum, biopsy samples, etc.

[Diagram: treatment, marker, clinical outcome]

Design Issues

Novel laboratory studies may not be suited for standard phase I and phase II clinical trials.

Standard Phase 1 Clinical Trial Designs

Historically, MTD is of interest.

‘3+3’: aims for dose with <33% toxicity

Treat 3 patients at a dose.

If no toxicities, escalate.

If 1 toxicity, treat 3 more at that level.

If 2 or more, de-escalate to lower dose.

‘3+3’ or variants are most commonly seen.

Not relevant to cytostatic and other agents (e.g. vaccine trials). Need a different approach.
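
The first-cohort rule listed above can be written as a small decision function. This is a minimal sketch covering only the three cases stated on the slide; the function name and return strings are hypothetical, and what happens after the additional 3 patients is not specified here, so it is left out.

```python
def first_cohort_decision(n_toxicities):
    """Decision after the first 3 patients at a dose level ('3+3' rule as stated above)."""
    if not 0 <= n_toxicities <= 3:
        raise ValueError("n_toxicities must be between 0 and 3 for a cohort of 3")
    if n_toxicities == 0:
        return "escalate"
    if n_toxicities == 1:
        return "treat 3 more at this dose"
    # 2 or 3 toxicities among the 3 patients
    return "de-escalate"

for k in range(4):
    print(k, "toxicities:", first_cohort_decision(k))
```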

Alternative Designs

Continual Reassessment Method (CRM)

‘Response’ can be defined as toxicity or efficacy endpoint.

Target level of response is chosen.

Mathematical model for dose-response is assumed.

Dose levels are not fixed in advance: later doses are determined by responses to previous doses.

Many variants, but generally need to pre-specify a minimum and maximum dose.

Problem: we might not know what our ‘target’ response is.
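
To show how a target response level and an assumed dose-response model drive the next dose, here is a rough sketch of one common CRM variant. Everything specific in it is an assumption rather than something from the slides: it uses a discrete set of candidate doses with prior guesses of response probability (the `skeleton`), a one-parameter power model p_d^exp(a), a normal prior on `a` with a hypothetical standard deviation, and a simple grid approximation to the posterior.

```python
import math

def crm_next_dose(skeleton, doses_given, responses, target, prior_sd=1.16):
    """Pick the candidate dose whose posterior-mean response probability is closest to target.

    skeleton    : prior guesses of response probability per dose (strictly between 0 and 1)
    doses_given : index into skeleton for each treated patient
    responses   : 0/1 response for each treated patient
    """
    grid = [-4 + 8 * i / 400 for i in range(401)]  # grid over model parameter a
    weights = []
    for a in grid:
        log_w = -0.5 * (a / prior_sd) ** 2  # unnormalized normal prior on a
        for d, y in zip(doses_given, responses):
            p = skeleton[d] ** math.exp(a)  # power model for response probability
            log_w += math.log(p) if y else math.log(1 - p)
        weights.append(math.exp(log_w))
    total = sum(weights)
    post_probs = [
        sum(w * skeleton[d] ** math.exp(a) for a, w in zip(grid, weights)) / total
        for d in range(len(skeleton))
    ]
    best = min(range(len(skeleton)), key=lambda d: abs(post_probs[d] - target))
    return best, post_probs

# Hypothetical skeleton and data: 3 patients at the middle dose, none responding.
skeleton = [0.05, 0.10, 0.20, 0.30, 0.50]
next_dose, post_probs = crm_next_dose(skeleton, [2, 2, 2], [0, 0, 0], target=0.20)
print("next dose index:", next_dose)
print("posterior response probabilities:", [round(p, 3) for p in post_probs])
```

The sketch also makes the stated problem visible: `target` must be supplied up front, which is hard when the expected level of a laboratory response is unknown.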

Alternative Designs

Dose-ranging study

Choose fixed doses and treat a fixed number of patients at each of the doses.

Allows estimation of dose-response curve.

Choice of Herbst, et al. for endostatin trial.

Can be used when safety is not of great concern (i.e. probability of toxicities is small).

For both CRM and dose-ranging, we tend to get better estimates of dose-response curves.

Toxicity almost always gets included as (at least) a secondary outcome.

Selecting a Phase I Design

Don’t assume the standard ‘phase 1’ trial designs are appropriate for assessing laboratory endpoints.

Dose-ranging studies are more likely candidates.

Treat enough patients at each dose level to get a ‘reasonable’ estimate of response (i.e. at least 3).

Think carefully about dose levels...

Want enough dose levels to understand/estimate the curve.

Don’t want so many that it will take you a long time to finish the study.

Patient population

Endostatin trial: broad patient population: 3 NSCLC, 8 melanoma, 2 HNSCC, 3 breast, 2 colon, 2 renal cell, 4 sarcoma, 1 thyroid.

Generalizability versus accrual versus specificity

Choosing dose levels

Can be hard to know exactly how to select doses.

If there is too much ‘space’ between doses, it can be difficult to estimate the true curve.

If you expect ‘monotonicity’, then you can infer the response at intermediate doses.

But what if it is not monotonic? Endostatin example.
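
As a simple illustration of inferring an intermediate dose under monotonicity, the sketch below linearly interpolates between the two studied doses that bracket a new dose. The function name, doses, and response values are hypothetical, and linear interpolation is only one possible choice. If the true curve is not monotonic, this kind of inference can be badly wrong.

```python
def interpolate_response(doses, responses, new_dose):
    """Linearly interpolate the response at new_dose from studied (dose, response) pairs.

    Assumes distinct studied doses and a new_dose inside the studied range.
    """
    pairs = sorted(zip(doses, responses))
    for (d_lo, r_lo), (d_hi, r_hi) in zip(pairs, pairs[1:]):
        if d_lo <= new_dose <= d_hi:
            frac = (new_dose - d_lo) / (d_hi - d_lo)
            return r_lo + frac * (r_hi - r_lo)
    raise ValueError("new_dose is outside the studied range")

# Hypothetical dose levels and mean responses.
estimate = interpolate_response([10, 20, 40], [0.1, 0.3, 0.5], 30)
print(round(estimate, 3))
```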

Choosing the timing

Standard Phase I

Patients are under surveillance for short-term toxicities.

Patients contact study team when a toxicity occurs after discharge.

Not an issue of ‘when to look.’

In laboratory studies

Usually a pre-post setting: how does the measure compare after treatment to before treatment?

Baseline (before treatment) measure is needed.

Post-treatment measures:

When is treatment at its most potent?

How often should response be measured?

Expensive? Invasive?

Is it sufficient to look at clinic visit times?

Timing in Endostatin Trial

[Timeline figure: 28 days, 56 days]

Phase II Designs

Phase II designs: efficacy

Standard efficacy outcomes

Complete response

Partial response

Overall survival

For cytostatic agents

Progression

Progression-free survival

Laboratory outcomes…

Defining outcomes in laboratory studies

They are usually messy.

More common binary outcomes have nice properties: “Looking for 40% response vs. 20% response”

Laboratory outcomes are not so nice:

Often skewed.

Often have ‘undetectable’ range.

Often do not know what to expect.

This makes it hard to plan (i.e. sample size, power).

Novel assays:

Not obvious what expected changes would be without treatment.

How much fluctuation would we expect to see?
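
To show why binary outcomes are easier to plan for, the sketch below computes an approximate sample size for the “40% vs. 20%” statement. The framing is an assumption, not something stated on the slide: a single-arm study testing a null response rate of 20% against an alternative of 40%, with a one-sided normal-approximation test, and hypothetical defaults of 5% type I error and 80% power.

```python
import math
from statistics import NormalDist

def single_arm_sample_size(p0, p1, alpha=0.05, power=0.80):
    """Approximate n for a one-sided, one-sample test of a proportion (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    numerator = z_alpha * math.sqrt(p0 * (1 - p0)) + z_beta * math.sqrt(p1 * (1 - p1))
    return math.ceil((numerator / (p1 - p0)) ** 2)

n_needed = single_arm_sample_size(0.20, 0.40)
print("approximate patients needed:", n_needed)
```

No comparable calculation is available for a skewed laboratory measure with an undetectable range until its expected level and variability are known.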

Expected Fluctuations

[Figure: marker level at days 0, 28, and 56]

But with multiple patients, these changes would tend to average to zero.

However, if half of patients have an increase and half have a decrease, we might conclude that treatment is 50% effective!
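
A small simulation can illustrate this pitfall. The sketch below assumes no treatment effect at all: each patient's baseline and follow-up values are just independent fluctuations around the same true level. The function name, the normal noise model, and the noise size are hypothetical.

```python
import random

def simulate_null_changes(n_patients, noise_sd=1.0, seed=0):
    """Simulate follow-up minus baseline changes when treatment has no effect."""
    rng = random.Random(seed)
    changes = []
    for _ in range(n_patients):
        baseline = rng.gauss(0.0, noise_sd)   # fluctuation at baseline
        follow_up = rng.gauss(0.0, noise_sd)  # independent fluctuation at follow-up
        changes.append(follow_up - baseline)
    return changes

changes = simulate_null_changes(1000)
frac_increase = sum(c > 0 for c in changes) / len(changes)
mean_change = sum(changes) / len(changes)
print("fraction of patients with an increase:", round(frac_increase, 2))
print("mean change:", round(mean_change, 2))
```

Roughly half of the simulated patients show an increase even though nothing is happening, so counting patients who changed in the expected direction says little without knowing the natural fluctuation.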

Expected Fluctuations

Follow-up times are compared to baseline.

That puts a lot of stock in the baseline measure.

Why not consider a ‘burn-in’ period?

If baseline is inaccurately measured, all comparisons will be incorrect.

Measurement Issues

Regardless of natural fluctuations...

How accurately are we measuring our outcome?

How accurately can we measure blood flow?

Are we on target?

Accurate?

Measurement Issues

Why would it be measured inaccurately?

Often LONG protocol to get ‘final’ measure. Lots of room for errors!

Although the classic efficacy outcome of ‘response’ is relatively soft, it has been objectively defined.

Laboratory endpoints require assays and other measurements, and also assumptions.

Often, assay is being developed along with the trial.

Assay Properties

IS THE ASSAY SENSITIVE?

Sensitivity: does the assay detect abnormalities in cases where abnormalities exist?

Specificity: does the assay find normal levels in normal cases?

“It has been shown that at high flow rates, measuring blood flow by PET underestimates blood flow.” This implies bias in results.
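
The two definitions above correspond to simple proportions from a validation sample in which the true status is known. The sketch below is illustrative; the function name and the counts are hypothetical.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from counts in a validation sample."""
    sensitivity = true_pos / (true_pos + false_neg)  # abnormal cases the assay detects
    specificity = true_neg / (true_neg + false_pos)  # normal cases the assay reads as normal
    return sensitivity, specificity

# Hypothetical counts: 50 truly abnormal samples, 100 truly normal samples.
sens, spec = sensitivity_specificity(true_pos=45, false_neg=5, true_neg=80, false_pos=20)
print("sensitivity:", sens, "specificity:", spec)
```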

More Measurement Issues

Reliability of assay

How reproducible are the results?

Two samples taken from the same patient on the same day from different lesions?

Two samples taken from the same patient on the same day from the same lesion?

One sample analyzed twice using the same method?

Subjectivity

Inter-rater agreement

Intra-rater agreement

In what ways can ‘error’ come into the procedure?

Great study + bad assay = bad study
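
For categorical readings, inter-rater agreement is often summarized with Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The slides do not name a specific statistic, so the choice of kappa is an assumption, and the ratings below are hypothetical.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters scoring the same samples.

    Undefined (division by zero) if both raters always use one identical category.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement: product of each rater's marginal proportions, summed over categories.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical readings of four samples by two raters.
rater_1 = ["+", "+", "-", "-"]
rater_2 = ["+", "-", "-", "-"]
kappa = cohens_kappa(rater_1, rater_2)
print("kappa:", kappa)
```

The same function applies to intra-rater agreement if the two lists are one rater's readings of the same samples on two occasions.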

Measurement Issues

How can we know the reliability of the measures?

Preliminary studies (pre-clinical).

Build it into the study design!

Reliability substudy:

Inter-rater agreement

Intra-rater agreement

Incorporate burn-in period

Take multiple measures

Run the assay (test) more than once
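
The value of taking multiple measures can be quantified under a simple assumption: if replicate measurement errors are independent with the same standard deviation, the mean of k replicates has standard deviation reduced by a factor of the square root of k. The function name and numbers below are hypothetical, and the formula does not apply if replicate errors are correlated (for example, a shared calibration bias).

```python
import math

def sd_of_mean(single_measure_sd, n_replicates):
    """Standard deviation of the mean of independent replicates with equal error SD."""
    if n_replicates < 1:
        raise ValueError("n_replicates must be at least 1")
    return single_measure_sd / math.sqrt(n_replicates)

for k in (1, 2, 4):
    print(k, "replicates -> SD of mean:", round(sd_of_mean(2.0, k), 3))
```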

Final Comments

Novel trials often need creative/novel study designs.

Understanding the properties (i.e. reliability, sensitivity) of your assay is crucial.

Try to build reliability testing into study design.

Think carefully about:

Are your markers truly surrogates?

Which doses should you choose (minimum, maximum, and increments)?

How often should you measure the outcome?

Statisticians usually can provide guidance, but not necessarily answers to these questions!

Answers need to be based on experience with the novel agent and the underlying biology.
