SMART Designs for Developing Dynamic Treatment Regimes

S.A. Murphy

MD Anderson, December 2006

Collaborators

• A. John Rush (University of Texas Southwestern Medical Center)

• Bibhas Chakraborty, Lacey Gunter, Alena Scott (U. Michigan)

• Linda Collins (Penn State)

• Dave Oslin, Kevin Lynch, Tom Ten Have (UPenn)

Outline

• Why dynamic treatment regimes?

• Why SMART experimental designs?

• Experimental principles

• Constructing and addressing questions regarding an optimal dynamic treatment regime

• Why and when non-regular?

• A class of solutions

• A preliminary STAR*D analysis

Dynamic treatment regimes are individually tailored treatments, with treatment type and dosage changing according to patient outcomes. Operationalize clinical practice.

•Brooner et al. (2002) Treatment of Opioid Addiction

•Breslin et al. (1999) Treatment of Alcohol Addiction

•Prochaska et al. (2001) Treatment of Tobacco Addiction

•Rush et al. (2003) Treatment of Depression

Why Dynamic Treatment Regimes?

– High heterogeneity in response to any one treatment

• What works for one person may not work for another

– Improvement often marred by relapse

• What works now for a person may not work later

– Side effects and/or co-occurring disorders and/or adherence problems occur frequently

k Decisions on one individual

Observation available at jth decision

Action at jth decision

History available at jth decision

k Decisions

“Reward” following jth decision point (r_j is a known function)

Primary Outcome:

Construct decision rules that input information in the history at each decision point and output a recommended decision; these decision rules should lead to a maximal mean Y.

The dynamic treatment regime is the sequence of decision rules:

In the future we offer treatment

An example of a simple decision rule is: alter treatment at time j if

otherwise maintain on current treatment.
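The objects described above can be written compactly. The symbols below follow notation that is common in the dynamic treatment regime literature and are an assumption here: $X_j$ for the observation, $A_j$ for the action, $H_j$ for the history, $R_j$ for the reward, $d_j$ for the decision rule. The threshold $c$ in the last line is purely hypothetical and only illustrates the shape of a simple rule.

```latex
\begin{align*}
H_j &= (X_1, A_1, X_2, A_2, \ldots, A_{j-1}, X_j), \qquad j = 1, \ldots, k \\
R_j &= r_j(H_j, A_j, X_{j+1}) \\
\text{regime} &= (d_1, \ldots, d_k), \qquad \text{future treatment } A_j = d_j(H_j) \\
\text{goal} &: \text{choose } (d_1, \ldots, d_k) \text{ to maximize the mean of } Y \\
\text{example rule} &: d_j(H_j) = \text{alter treatment if } X_j > c, \text{ otherwise maintain}
\end{align*}
```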

SMART experimental designs are sequential, multiple assignment, randomized trial designs. At each step/critical decision, subjects are randomized among alternative options.

•CATIE (2001) Treatment of Psychosis in Schizophrenia

•STAR*D (2003) Treatment of Depression

•Tummarello (1997) Treatment of Small Cell Lung Cancer (many, for many years, in this field)

•Oslin (on-going) Treatment of Alcohol Dependence

•Pellman (on-going) Treatment of ADHD

SMART Trial for Alcohol Dependency

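Initial Txt, then Intermediate Outcome, then Secondary Txt (R = randomization)

• Med A

– Responder: R between TDM and Monitoring

– Nonresponder: R between Med B and EM + Med B + CBT

• Med A + CBT

– Responder: R between TDM and Monitoring

– Nonresponder: R between Med B and EM + Med B + CBT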
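The two randomizations in this trial can be sketched in a few lines of Python. This is a minimal illustration, not the trial's actual allocation procedure: the equal-probability assignment, the placeholder response probability `p_response`, and all function names are assumptions. Only the treatment labels come from the schematic.

```python
import random

# Treatment labels follow the trial schematic; the assignment logic is an
# illustrative assumption (equal-probability randomization at each stage).
INITIAL_OPTIONS = ["Med A", "Med A + CBT"]
SECONDARY_OPTIONS = {
    True: ["TDM", "Monitoring"],            # options for responders
    False: ["Med B", "EM + Med B + CBT"],   # options for nonresponders
}


def assign_initial(rng):
    """Randomize a subject to one of the initial treatments."""
    return rng.choice(INITIAL_OPTIONS)


def assign_secondary(responder, rng):
    """Re-randomize among the options allowed for this responder status."""
    return rng.choice(SECONDARY_OPTIONS[responder])


def simulate_subject(rng, p_response=0.5):
    """Walk one hypothetical subject through both randomizations."""
    first = assign_initial(rng)
    responder = rng.random() < p_response   # placeholder response model
    second = assign_secondary(responder, rng)
    return {"initial": first, "responder": responder, "secondary": second}


rng = random.Random(0)
subjects = [simulate_subject(rng) for _ in range(8)]
for s in subjects:
    print(s)
```

The key design feature is that the set of second-stage options depends only on the responder summary, which matches the "keep it simple" principle of restricting the next treatments by a summary rather than by all intermediate outcomes.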

Why SMART experimental designs?

• Why not use data from multiple randomized trials to construct the dynamic treatment regime?

• Use statistical methods that incorporate the potential for delayed effects and are suited for combining data from multiple randomized trials.

•Methods from Medical Decision Making involving a variation of a Markovian assumption

•Use (an approximation to) dynamic programming.

Why statistical methods for combining over multiple trials are not always the answer

Subjects who will enroll in, who remain in or who are adherent in the trial of the one-stage treatments may be quite different from the subjects in SMART.

Designing Principles for a SMART

•KEEP IT SIMPLE: At each stage, restrict class of treatments only by ethical, feasibility or strong scientific considerations. Use a summary (responder status) instead of all intermediate outcomes (time until nonresponse, adherence, burden, stress level, etc.) to restrict class of next treatments.

•Collect intermediate outcomes that might be useful in ascertaining for whom each treatment works best; information that might enter into the dynamic treatment regime.

Designing Principles

•Primary hypotheses concern “main effects” that are both scientifically important and aid in developing the dynamic treatment regime.

•Secondary hypotheses consider choice of variables that can be used to tailor treatment and/or compare treatments in an “optimal dynamic treatment regime.”

Primary Hypotheses

•EXAMPLE 1: (sample size is highly constrained): Hypothesize that given the secondary treatments provided, the initial treatment Med A + CBT leads to lower drinking than the initial treatment Med A alone.

•EXAMPLE 2: (sample size is less constrained): Hypothesize that nonresponders will make greater improvement on EM+Med B+CBT as compared to the improvement on Med B alone.
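A hypothesis like Example 1 compares the initial treatments while leaving the secondary treatments as randomized. One simple way to look at such a main effect is a two-sample comparison of mean outcomes between the initial arms. The sketch below uses simulated data; the effect size, the noise level, the record layout and the Welch-type statistic are all assumptions made for illustration, not the analysis planned for the trial.

```python
import math
import random
import statistics


def compare_initial_treatments(records):
    """Difference in mean outcome between initial arms, pooled over
    whatever secondary treatments were randomly assigned."""
    arm_a = [r["y"] for r in records if r["initial"] == "Med A"]
    arm_b = [r["y"] for r in records if r["initial"] == "Med A + CBT"]
    diff = statistics.mean(arm_b) - statistics.mean(arm_a)
    # Welch-type standard error (unequal variances allowed).
    se = math.sqrt(statistics.variance(arm_a) / len(arm_a)
                   + statistics.variance(arm_b) / len(arm_b))
    return diff, diff / se


def simulate_trial(n, rng):
    """Hypothetical data: y is a drinking measure, lower is better."""
    records = []
    for _ in range(n):
        initial = rng.choice(["Med A", "Med A + CBT"])
        effect = -2.0 if initial == "Med A + CBT" else 0.0  # assumed effect
        records.append({"initial": initial,
                        "y": 10.0 + effect + rng.gauss(0, 3)})
    return records


diff, z = compare_initial_treatments(simulate_trial(400, random.Random(1)))
print("difference in means:", diff, "z statistic:", z)
```

A negative difference here corresponds to lower drinking on Med A + CBT, the direction hypothesized in Example 1.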

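Intensive Outpatient Program, then initial treatment, then secondary treatment by responder status:

• Med A

– Responder: TDM or Monitoring

– Nonresponder: Med B or EM + Med B + CBT

• Med A + CBT

– Responder: TDM or Monitoring

– Nonresponder: Med B or EM + Med B + CBT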

Secondary Hypotheses

•EXAMPLE 1: Hypothesize that non-adhering non-responders will have lower drinking if provided a change in medication + CBT + EM as compared to a change in medication only.

•EXAMPLE 2: Hypothesize that the optimal sequence of treatments begins with Med A + CBT as opposed to Med A alone.

Constructing and Addressing Questions Regarding an Optimal Dynamic Treatment Regime

Four Categories of Methods

•Likelihood-based (Thall et al. 2000, 2002; POMDPs in medical decision making and in reinforcement learning; vast literature)

•Q-Learning (Watkins, 1989), a popular method from reinforcement learning: regression

•A-Learning (Murphy, 2003; Robins, 2004): regression on a mean zero space

•Weighting (Murphy et al., 2002; related to policy search in reinforcement learning): weighted mean
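To make the "weighted mean" idea concrete, the sketch below computes a generic inverse-probability-weighted mean of the outcome among subjects whose randomized treatments agree with a candidate regime. It is a textbook-style illustration and not necessarily the estimator of Murphy et al. (2002). The record fields (`a1`, `a2`, `p1`, `p2`, `responder`, `y`), the simulated data and the candidate regime are all hypothetical.

```python
import random


def weighted_regime_mean(records, rule1, rule2):
    """Weighted mean of y over subjects whose observed actions match the
    regime; weights are inverse randomization probabilities."""
    num = 0.0
    den = 0.0
    for r in records:
        if r["a1"] == rule1(r) and r["a2"] == rule2(r):
            w = 1.0 / (r["p1"] * r["p2"])
            num += w * r["y"]
            den += w
    return num / den


def simulate(n, rng):
    """Hypothetical two-stage SMART with known randomization probabilities."""
    records = []
    for _ in range(n):
        a1 = rng.choice([0, 1])
        responder = rng.random() < 0.4 + 0.2 * a1
        a2 = rng.choice([0, 1])
        # Assumed outcome model: a2 only helps nonresponders.
        y = 1.0 * a1 + (0.0 if responder else 2.0 * a2) + rng.gauss(0, 1)
        records.append({"a1": a1, "responder": responder, "a2": a2,
                        "p1": 0.5, "p2": 0.5, "y": y})
    return records


# Candidate regime: start with a1 = 1; give a2 = 1 to nonresponders only.
estimate = weighted_regime_mean(
    simulate(4000, random.Random(0)),
    rule1=lambda r: 1,
    rule2=lambda r: 0 if r["responder"] else 1,
)
print("estimated mean outcome under the regime:", estimate)
```

Under the assumed outcome model the true mean for this regime is 1 + 0.4 × 2 = 1.8, so the printed estimate should land near that value.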

Q-learning

Approximate
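For k = 2 decisions, the quantities that Q-learning approximates are usually written as below. This is the standard formulation, given here as an assumed reading; the symbols $Q_1$, $Q_2$, $H_j$ and $A_j$ are notation introduced for this sketch.

```latex
\begin{align*}
Q_2(h_2, a_2) &= E\left[\, Y \mid H_2 = h_2,\ A_2 = a_2 \,\right] \\
Q_1(h_1, a_1) &= E\left[\, \max_{a_2} Q_2(H_2, a_2) \mid H_1 = h_1,\ A_1 = a_1 \,\right] \\
d_j(h_j) &= \arg\max_{a_j} Q_j(h_j, a_j), \qquad j = 1, 2
\end{align*}
```

In practice each $Q_j$ is replaced by a regression model, which is what makes this an approximation to dynamic programming.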

A Simple Version of Q-Learning –binary actions

• Stage 2 regression: Use least squares with outcome, Y, and covariates to obtain

• Set

• Stage 1 regression: Use least squares with outcome, and covariates to obtain

Decision Rules:
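The steps above can be sketched as two least squares fits. This is a minimal illustration under stated assumptions: actions are coded 0/1, each stage uses one scalar covariate (`s1`, `s2`) with a linear working model, the data are simulated, and the small `ols` helper is written by hand so the example needs only the standard library. None of these choices is taken from the original analysis.

```python
import random


def ols(rows, y):
    """Least squares via the normal equations (Gauss-Jordan elimination)."""
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    m = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda i: abs(m[i][c]))
        m[c], m[piv] = m[piv], m[c]
        for i in range(p):
            if i != c:
                f = m[i][c] / m[c][c]
                m[i] = [u - f * v for u, v in zip(m[i], m[c])]
    return [m[i][p] / m[i][i] for i in range(p)]


def q_learning(data):
    """Two-stage Q-learning with binary actions coded 0/1."""
    # Stage 2 regression: Y on (1, s2, a2, a2*s2).
    x2 = [[1.0, d["s2"], d["a2"], d["a2"] * d["s2"]] for d in data]
    b20, b21, p20, p21 = ols(x2, [d["y"] for d in data])
    # Set the pseudo-outcome: fitted Y under the best stage 2 action.
    y_tilde = [b20 + b21 * d["s2"] + max(p20 + p21 * d["s2"], 0.0)
               for d in data]
    # Stage 1 regression: pseudo-outcome on (1, s1, a1, a1*s1).
    x1 = [[1.0, d["s1"], d["a1"], d["a1"] * d["s1"]] for d in data]
    _, _, p10, p11 = ols(x1, y_tilde)
    return (p10, p11), (p20, p21)


def simulate(n, rng):
    """Hypothetical generative model, used only to exercise the code."""
    data = []
    for _ in range(n):
        s1 = rng.gauss(0, 1)
        a1 = rng.choice([0, 1])
        s2 = 0.3 * s1 + 0.5 * a1 + rng.gauss(0, 1)
        a2 = rng.choice([0, 1])
        y = s2 + a2 * (0.5 + 1.0 * s2) + rng.gauss(0, 1)
        data.append({"s1": s1, "a1": a1, "s2": s2, "a2": a2, "y": y})
    return data


psi1, psi2 = q_learning(simulate(2000, random.Random(0)))


def rule2(s2):
    """Estimated stage 2 rule: treat (1) when the fitted effect is positive."""
    return int(psi2[0] + psi2[1] * s2 > 0)


def rule1(s1):
    """Estimated stage 1 rule, same form using the stage 1 coefficients."""
    return int(psi1[0] + psi1[1] * s1 > 0)


print("stage 1 effect coefficients:", psi1)
print("stage 2 effect coefficients:", psi2)
```

The decision rules depend only on the coefficients of the terms involving the action, which is why inference on those coefficients matters for deciding whether a covariate belongs in the rule.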

Why non-regular?

When do we have non-regularity?

Non-regularity
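A standard way to see the problem, offered here as an assumed illustration: the stage 1 pseudo-outcome involves a maximum over stage 2 actions, and a maximum is not differentiable where the stage 2 treatment effect is zero. Plugging an estimate into max(θ, 0) is then biased upward at θ = 0, by 1/√(2πn) when the estimate is normal with variance 1/n, while the bias is negligible when θ is far from zero. The simulation below uses hypothetical values of θ and n.

```python
import math
import random


def mean_plugin_max(theta, n, reps, rng):
    """Monte Carlo average of max(theta_hat, 0), theta_hat ~ N(theta, 1/n)."""
    sd = 1.0 / math.sqrt(n)
    return sum(max(rng.gauss(theta, sd), 0.0) for _ in range(reps)) / reps


rng = random.Random(0)
n = 100

# No stage 2 treatment effect: the target max(0, 0) is 0, yet the plug-in
# estimate is positive on average.
bias_at_zero = mean_plugin_max(0.0, n, 50000, rng) - max(0.0, 0.0)

# Large stage 2 treatment effect: the maximum is smooth near theta = 1.
bias_far = mean_plugin_max(1.0, n, 50000, rng) - max(1.0, 0.0)

print("bias at theta = 0:", bias_at_zero)
print("theoretical value:", 1.0 / math.sqrt(2.0 * math.pi * n))
print("bias at theta = 1:", bias_far)
```

Because the bias depends on whether the effect is exactly zero, near zero, or far from zero, the usual normal approximations for the stage 1 coefficients do not hold uniformly.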

A class of “solutions”

Test if coefficient of A1 is nonzero

Summary: This is an open problem

• Use a tuning parameter set around .25?

• Just use tests for main effects (averaging over future treatments)?

• Just use tests with the maximum (maximizing over future treatments)?

• Find a way to combine the main effect test with the use of the maximum?

STAR*D: "Sequenced Treatment Alternatives to Relieve Depression"

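Initial treatment: CIT (R = randomization)

Treatment Two, by preference:

• Augment: R between CIT + BUS and CIT + BUP-SR

• Switch: R among Bup-SR, VEN and SER

Intermediate Outcome:

• Remission: Follow-up

• Non-remission: continue to Treatment Three

Treatment Three, by preference:

• Augment: R between L2-Tx + THY and L2-Tx + LI

• Switch: R between MIRT and NTP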

Decision Rules: Outcome is Final QIDS_SR Score

Level 2 preference Sw1:

• Level 2: Choose SER if ζSw1 > 0 and ζSw1 - ψSw1 > 0

• Level 2: Choose VEN if ψSw1 > 0 and ψSw1 - ζSw1 > 0

• Level 3, Sw2: Choose MIRT over NTP if α(1-Aug2) > 0

• Level 3, Aug2: Choose LI over THY if βSw1*Aug2 + δAug2*QIDS > 0

Level 2 preference Aug1:

• Level 2: Choose CIT+BUP if η(1-Sw1) + φ(1-Sw1)*Anx > 0

• Level 3, Sw2: Choose MIRT over NTP if α(1-Aug2) > 0

• Level 3, Aug2: Choose LI over THY if θ(1-Sw1)*Aug2 + δAug2*QIDS > 0
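The Level 2 rules can be read as simple comparisons of estimated coefficients. The sketch below encodes them for a subject who prefers a switch (Sw1 = 1) or an augment (Sw1 = 0). Several things are assumptions: that ζ and ψ are the coefficients on Sw1*SER and Sw1*VEN, that Bup-SR and CIT + BUS are the options chosen when no condition holds, the function names, and the coefficient values in the usage lines, which are made up because no estimates are reported here.

```python
def level2_switch_choice(zeta, psi, reference="BUP-SR"):
    """Level 2 choice for a switch preference (Sw1 = 1).

    zeta: assumed coefficient on Sw1*SER.
    psi:  assumed coefficient on Sw1*VEN.
    Falling back to `reference` when neither condition holds is an
    assumption, not something stated in the rules.
    """
    if zeta > 0 and zeta - psi > 0:
        return "SER"
    if psi > 0 and psi - zeta > 0:
        return "VEN"
    return reference


def level2_augment_choice(eta, phi, anx, reference="CIT+BUS"):
    """Level 2 choice for an augment preference (Sw1 = 0).

    Implements 'choose CIT+BUP if eta + phi*Anx > 0', where anx is the
    binary anxiety indicator; the fallback option is an assumption.
    """
    return "CIT+BUP" if eta + phi * anx > 0 else reference


# Hypothetical coefficient values, for illustration only.
print(level2_switch_choice(zeta=0.4, psi=0.1))
print(level2_switch_choice(zeta=-0.2, psi=-0.1))
print(level2_augment_choice(eta=-0.3, phi=0.5, anx=1))
```

Writing the rules this way makes clear that a covariate such as Anx changes the recommendation only through the sign of a linear combination of coefficients.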

Regression

• “S30” = H3, Sw1, Sw1*Aug2, (1-Sw1)*Aug2, Aug2*QIDS

• “S31A3” = Sw1*Aug2*Li, (1-Sw1)*Aug2*Li, (1-Aug2)*MIRT, Aug2*Li*QIDS

• “S20” = H2, Sw1, (1-Sw1)*Anx

• “S21A2” = Sw1*SER, Sw1*VEN, (1-Sw1)*(CIT+BUP), (1-Sw1 )*Anx*(CIT+BUP)

(all covariates are binary except continuous QIDS and covariates in H2, H3)

Results are omitted from this web copy!

Results: Outcome is Final QIDS_SR Score (λ=.25, .33, .4)

[Figure: Outcome and Residual Plots. Three panels: Histogram of Symptom Severity (x-axis: Final QIDS-SR Score, 0 to 25); Level 2 Residuals (x-axis: Symptom Severity Residuals, -15 to 15); Level 3 Residuals (x-axis: Symptom Severity Residuals, -10 to 10).]

Discussion

• It is unclear how one might combine averaging over the future actions with maximizing over the future actions.

• Ideally the effect a covariate has on the maximized mean outcome should be used to decide whether to use the covariate in the decision rules. We did not do this here.

• Constructing “evidence-based” regimes is of great interest in clinical research and there is much to be done by statisticians.

This seminar can be found at: http://www.stat.lsa.umich.edu/~samurphy/seminars/MDAnderson12.06.ppt

Email me with questions or if you would like a copy!

samurphy@umich.edu
