SMART Designs for Developing Dynamic Treatment Regimes

Post on 04-Feb-2016


DESCRIPTION

SMART Designs for Developing Dynamic Treatment Regimes. S.A. Murphy, MD Anderson, December 2006. Collaborators: A. John Rush (University of Texas Southwestern Medical Center); Bibhas Chakraborty, Lacey Gunter, Alena Scott (U. Michigan); Linda Collins (Penn State)

TRANSCRIPT

SMART Designs for Developing Dynamic Treatment Regimes

S.A. Murphy

MD Anderson, December 2006

2

Collaborators

• A. John Rush (University of Texas Southwestern Medical Center)

• Bibhas Chakraborty, Lacey Gunter, Alena Scott (U. Michigan)

• Linda Collins (Penn State)

• Dave Oslin, Kevin Lynch, Tom TenHave (UPenn)

3

Outline

• Why dynamic treatment regimes?

• Why SMART experimental designs?

• Experimental principles

• Constructing and addressing questions regarding an optimal dynamic treatment regime

• Why and when non-regular?

• A class of solutions

• A preliminary STAR*D analysis

4

Dynamic treatment regimes are individually tailored treatments, with treatment type and dosage changing according to patient outcomes. Operationalize clinical practice.

•Brooner et al. (2002) Treatment of Opioid Addiction

•Breslin et al. (1999) Treatment of Alcohol Addiction

•Prochaska et al. (2001) Treatment of Tobacco Addiction

•Rush et al. (2003) Treatment of Depression

5

Why Dynamic Treatment Regimes?

– High heterogeneity in response to any one treatment

• What works for one person may not work for another

– Improvement often marred by relapse

• What works now for a person may not work later

– Side effects and/or co-occurring disorders and/or adherence problems occur frequently

6

k Decisions on one individual

Observation available at jth decision

Action at jth decision

History available at jth decision

7

k Decisions

History available at jth decision

“Reward” following jth decision point (rj is a known function)

Primary Outcome:

8

Goal:

Construct decision rules that input information in the history at each decision point and output a recommended decision; these decision rules should lead to a maximal mean Y.

The dynamic treatment regime is the sequence of decision rules:
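The formulas on the preceding slides did not survive in this transcript. As a hedged reconstruction, the block below writes out one standard notation for the objects named there. The symbols A_j, H_j, r_j and Y appear elsewhere in these slides; the names X_j, Y_j and d_j, and the choice of a sum for the primary outcome, are assumptions made for illustration.

```latex
% Assumed notation: X_j, Y_j, d_j and the sum form of Y are illustrative,
% not confirmed by the transcript.
\begin{align*}
X_j &: \text{observation available at the $j$th decision} \\
A_j &: \text{action at the $j$th decision} \\
H_j &= (X_1, A_1, \dots, X_{j-1}, A_{j-1}, X_j)
      && \text{history available at the $j$th decision} \\
Y_j &= r_j(X_1, A_1, \dots, X_j, A_j, X_{j+1})
      && \text{``reward'' following the $j$th decision} \\
Y   &= \sum_{j=1}^{k} Y_j
      && \text{one common choice of primary outcome} \\
\text{regime} &= \{d_1(H_1), \dots, d_k(H_k)\},
      && \text{chosen to maximize the mean of } Y
\end{align*}
```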

9

In the future we offer treatment

An example of a simple decision rule is: alter treatment at time j if

otherwise maintain on current treatment.

10

SMART experimental designs are sequential, multiple assignment, randomized trial designs. At each step/critical decision, subjects are randomized among alternative options.

•CATIE (2001) Treatment of Psychosis in Schizophrenia

•STAR*D (2003) Treatment of Depression

•Tummarello (1997) Treatment of Small Cell Lung Cancer (many, for many years, in this field)

•Oslin (on-going) Treatment of Alcohol Dependence

•Pellman (on-going) Treatment of ADHD

11

SMART Trial for Alcohol Dependency

Initial Txt (R)    Intermediate Outcome    Secondary Txt (R)
Med A              Responder               TDM or Monitoring
Med A              Nonresponder            Med B or EM + Med B + CBT
Med A + CBT        Responder               TDM or Monitoring
Med A + CBT        Nonresponder            Med B or EM + Med B + CBT

(R = randomization among the listed options)
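To make the sequential randomization concrete, here is a minimal simulation sketch of the alcohol dependence design. The treatment labels come from the slide; the 50% response probability, the equal randomization probabilities, the sample size of 800 and the function name `assign` are assumptions chosen only for illustration.

```python
import random
from collections import Counter

# Options at each randomization, taken from the slide's diagram.
INITIAL = ["Med A", "Med A + CBT"]
SECOND_STAGE = {
    "Responder": ["TDM", "Monitoring"],
    "Nonresponder": ["Med B", "EM + Med B + CBT"],
}

def assign(rng, response_prob=0.5):
    """Randomize one subject twice: initial treatment, then secondary treatment.

    response_prob is a hypothetical response rate, not a trial estimate.
    """
    first = rng.choice(INITIAL)                      # first randomization
    status = "Responder" if rng.random() < response_prob else "Nonresponder"
    second = rng.choice(SECOND_STAGE[status])        # second randomization
    return first, status, second

rng = random.Random(0)
paths = Counter(assign(rng) for _ in range(800))
for path, count in sorted(paths.items()):
    print(path, count)
```

Each subject follows one of eight paths, and the secondary options offered depend only on responder status, which is the kind of restriction the design principles later in the talk recommend.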

12

Why SMART experimental designs?

• Why not use data from multiple randomized trials to construct the dynamic treatment regime?

• Use statistical methods that incorporate the potential for delayed effects and are suited for combining data from multiple randomized trials.

•Methods from Medical Decision Making involving a variation of a Markovian assumption

•Use (an approximation to) dynamic programming.

13

Why statistical methods for combining over multiple trials are not always the answer

Subjects who will enroll in, who remain in or who are adherent in the trial of the one-stage treatments may be quite different from the subjects in SMART.

14

Designing Principles for a SMART

•KEEP IT SIMPLE: At each stage, restrict class of treatments only by ethical, feasibility or strong scientific considerations. Use a summary (responder status) instead of all intermediate outcomes (time until nonresponse, adherence, burden, stress level, etc.) to restrict class of next treatments.

•Collect intermediate outcomes that might be useful in ascertaining for whom each treatment works best; information that might enter into the dynamic treatment regime.

15

Designing Principles

•Primary hypotheses concern “main effects” that are both scientifically important and aid in developing the dynamic treatment regime.

•Secondary hypotheses consider choice of variables that can be used to tailor treatment and/or compare treatments in an “optimal dynamic treatment regime.”

16

Primary Hypotheses

•EXAMPLE 1: (sample size is highly constrained): Hypothesize that given the secondary treatments provided, the initial treatment Med A + CBT leads to lower drinking than the initial treatment Med A alone.

•EXAMPLE 2: (sample size is less constrained): Hypothesize that nonresponders will make greater improvement on EM+Med B+CBT as compared to the improvement on Med B alone.

17

SMART Trial for Alcohol Dependency

Initial Txt        Intermediate Outcome    Secondary Txt
Med A              Responder               TDM or Monitoring
Med A              Nonresponder            Med B or EM + Med B + CBT
Med A + CBT        Responder               TDM or Monitoring
Med A + CBT        Nonresponder            Med B or EM + Med B + CBT

Intensive Outpatient Program

19

Secondary Hypotheses

•EXAMPLE 1: Hypothesize that non-adhering non-responders will have lower drinking if provided a change in medication + CBT + EM as compared to a change in medication only.

•EXAMPLE 2: Hypothesize that the optimal sequence of treatments begins with Med A + CBT as opposed to Med A alone.

20

Constructing and Addressing Questions Regarding an Optimal Dynamic Treatment Regime

21

Four Categories of Methods

•Likelihood-based (Thall et al. 2000, 2002; POMDPs in medical decision making and in reinforcement learning; vast literature)

•Q-Learning (Watkins, 1989) (a popular method from reinforcement learning): regression

•A-Learning (Murphy, 2003; Robins, 2004): regression on a mean zero space

•Weighting (Murphy et al., 2002; related to policy search in reinforcement learning): weighted mean
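As a rough illustration of the weighting idea, the sketch below estimates the mean outcome under a given regime as an inverse-probability-weighted mean over subjects whose randomized actions happen to agree with that regime. It is not taken from the cited papers: the function name `ipw_mean`, the two-stage layout, the -1/+1 action coding, the known randomization probabilities of 0.5 and the simulated data are all assumptions.

```python
import random

def ipw_mean(data, d1, d2, p1=0.5, p2=0.5):
    """Weighted mean of Y over subjects whose actions agree with regime (d1, d2).

    data: iterable of (h1, a1, h2, a2, y); p1 and p2 are the (assumed known)
    probabilities of receiving the observed action at each stage.
    """
    num = den = 0.0
    for h1, a1, h2, a2, y in data:
        if a1 == d1(h1) and a2 == d2(h2):
            w = 1.0 / (p1 * p2)   # inverse probability of following the regime
            num += w * y
            den += w
    return num / den

# Hypothetical simulated SMART data: Y = A1 + A2 + noise.
rng = random.Random(1)
data = []
for _ in range(4000):
    h1 = rng.gauss(0, 1)
    a1 = rng.choice([-1, 1])
    h2 = rng.gauss(0, 1)
    a2 = rng.choice([-1, 1])
    data.append((h1, a1, h2, a2, a1 + a2 + rng.gauss(0, 1)))

always_plus = ipw_mean(data, lambda h: 1, lambda h: 1)
always_minus = ipw_mean(data, lambda h: -1, lambda h: -1)
print(always_plus, always_minus)
```

With constant randomization probabilities the weights cancel in the ratio; they matter once the probabilities depend on history, for example when only nonresponders are re-randomized.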

22

Q-learning (k=2)

23

Approximate

A Simple Version of Q-Learning –binary actions

• Stage 2 regression: Use least squares with outcome, Y, and covariates to obtain

• Set

• Stage 1 regression: Use least squares with outcome, and covariates to obtain
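The two regressions above can be sketched in a few lines. The formulas on this slide were lost in the transcript, so everything specific here is an assumption: actions are coded -1/+1, the history is summarized by one number, the working model is Q(h, a) = b0 + b1*h + (p0 + p1*h)*a, and the data are simulated. The helper names (`solve`, `least_squares`, `features`, `q_learning`, `rule`) are hypothetical.

```python
import random

def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def least_squares(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)

def features(h, a):
    """Assumed working model: Q(h, a) = b0 + b1*h + (p0 + p1*h)*a."""
    return [1.0, h, a, h * a]

def q_learning(data):
    """data: list of (h1, a1, h2, a2, y). Returns (stage-1, stage-2) coefficients."""
    # Stage 2 regression: outcome Y on stage-2 features.
    b2 = least_squares([features(h2, a2) for _, _, h2, a2, _ in data],
                       [row[4] for row in data])
    # Pseudo-outcome: fitted value under the best stage-2 action.
    # For a in {-1, +1}, the max over a of (p0 + p1*h)*a equals |p0 + p1*h|.
    y_tilde = [b2[0] + b2[1] * h2 + abs(b2[2] + b2[3] * h2)
               for _, _, h2, _, _ in data]
    # Stage 1 regression: pseudo-outcome on stage-1 features.
    b1 = least_squares([features(h1, a1) for h1, a1, _, _, _ in data], y_tilde)
    return b1, b2

def rule(coef, h):
    """Estimated decision rule: choose +1 when the action term is positive."""
    return 1 if coef[2] + coef[3] * h > 0 else -1

# Hypothetical simulated two-stage trial with randomized binary actions.
random.seed(0)
data = []
for _ in range(2000):
    h1 = random.gauss(0, 1)
    a1 = random.choice([-1, 1])
    h2 = 0.5 * h1 + 0.5 * a1 + random.gauss(0, 1)
    a2 = random.choice([-1, 1])
    y = h2 + a2 * (0.5 + h2) + random.gauss(0, 1)
    data.append((h1, a1, h2, a2, y))

b1, b2 = q_learning(data)
print("stage 2:", [round(v, 2) for v in b2])
print("stage 1:", [round(v, 2) for v in b1])
```

The absolute value in the pseudo-outcome is not differentiable where the stage-2 action term is zero, which is one way to see where the non-regularity discussed on the following slides can enter.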

24

Decision Rules:

25

Why non-regular?

26

When do we have non-regularity?

27

Non-regularity

28

A class of “solutions”

31

32

Test if coefficient of A1 is nonzero

37

Summary: This is an open problem

• Use a tuning parameter set around .25?

• Just use tests for main effects (averaging over future treatments?)

• Just use tests with maximum (maximizing over future treatments?)

• Find a way to combine the main effect test with the use of the maximum?

38

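STAR*D: "Sequenced Treatment Alternatives to Relieve Depression"

Starting point: CIT

Stage              Preference    Randomized (R) among
Treatment Two      Augment       CIT + BUS, CIT + BUP-SR
Treatment Two      Switch        BUP-SR, VEN, SER
Treatment Three    Augment       L2-Tx + THY, L2-Tx + LI
Treatment Three    Switch        MIRT, NTP

Intermediate Outcome after Treatment Two: Remission leads to Follow-up; Non-remission leads to Treatment Three.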

39

Decision Rules: Outcome is Final QIDS-SR Score

Level 2 preference: Sw1
  Level 2: Choose SER if ζSw1 > 0 and ζSw1 - ψSw1 > 0
           Choose VEN if ψSw1 > 0 and ψSw1 - ζSw1 > 0
  Level 3, Sw2: Choose MIRT over NTP if α(1-Aug2) > 0
  Level 3, Aug2: Choose LI over THY if βSw1*Aug2 + δAug2*QIDS > 0

Level 2 preference: Aug1
  Level 2: Choose CIT+BUP if η(1-Sw1) + φ(1-Sw1)*Anx > 0
  Level 3, Sw2: Choose MIRT over NTP if α(1-Aug2) > 0
  Level 3, Aug2: Choose LI over THY if θ(1-Sw1)*Aug2 + δAug2*QIDS > 0
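As a sketch of how one of these rules would be applied to a patient, the function below evaluates the Level 3 augmentation rule (LI versus THY), combining the Sw1 and Aug1 rows into one expression. The coefficient values in the example are made up purely to exercise the code, since the estimated coefficients are not given in this copy; the function name is hypothetical.

```python
def choose_level3_augment(sw1, aug2, qids, beta, theta, delta):
    """Apply 'Choose LI over THY if beta*Sw1*Aug2 + theta*(1-Sw1)*Aug2 + delta*Aug2*QIDS > 0'.

    sw1, aug2: binary indicators (1 = switched at Level 2, 1 = prefers
    augmentation at Level 3); qids: continuous QIDS score.
    beta, theta, delta: regression coefficients (hypothetical values below).
    """
    score = beta * sw1 * aug2 + theta * (1 - sw1) * aug2 + delta * aug2 * qids
    return "LI" if score > 0 else "THY"

# Made-up coefficients, for illustration only.
coefs = dict(beta=2.0, theta=-3.0, delta=-0.1)
print(choose_level3_augment(sw1=1, aug2=1, qids=10.0, **coefs))  # score = 2.0 - 1.0 = 1.0
print(choose_level3_augment(sw1=0, aug2=1, qids=10.0, **coefs))  # score = -3.0 - 1.0 = -4.0
```

When Sw1 = 1 only the β term is active and when Sw1 = 0 only the θ term is, which reproduces the two separate rows of the table.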

40

Regression

• “S30” = H3, Sw1, Sw1*Aug2, (1-Sw1)*Aug2, Aug2*QIDS

• “S31A3” = Sw1*Aug2*Li, (1-Sw1)*Aug2*Li, (1-Aug2)*MIRT, Aug2*Li*QIDS

• “S20” = H2, Sw1, (1-Sw1)*Anx

• “S21A2” = Sw1*SER, Sw1*VEN, (1-Sw1)*(CIT+BUP), (1-Sw1 )*Anx*(CIT+BUP)

(all covariates are binary except continuous QIDS and covariates in H2, H3)
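The Level 3 feature vectors listed above can be assembled as follows. This is a sketch under assumptions: H3 is passed as a plain list of baseline covariates, the treatment indicators Li and MIRT are coded 0/1 like the other binary covariates, and the function names `s30` and `s31a3` are hypothetical.

```python
def s30(h3, sw1, aug2, qids):
    """'S30' = H3, Sw1, Sw1*Aug2, (1-Sw1)*Aug2, Aug2*QIDS."""
    return list(h3) + [sw1, sw1 * aug2, (1 - sw1) * aug2, aug2 * qids]

def s31a3(sw1, aug2, li, mirt, qids):
    """'S31A3' = Sw1*Aug2*Li, (1-Sw1)*Aug2*Li, (1-Aug2)*MIRT, Aug2*Li*QIDS."""
    return [sw1 * aug2 * li, (1 - sw1) * aug2 * li,
            (1 - aug2) * mirt, aug2 * li * qids]

# One hypothetical patient: switched at Level 2, prefers augmentation at
# Level 3, randomized to LI, QIDS score of 12, one covariate in H3.
row = s30([0.5], sw1=1, aug2=1, qids=12.0) + s31a3(1, 1, li=1, mirt=0, qids=12.0)
print(row)
```

Stacking such rows over patients gives the design matrix for the Level 3 least squares fit; the Level 2 vectors "S20" and "S21A2" would be built the same way.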

41

Results: Outcome is Final QIDS-SR Score (λ = .25, .33, .4)

Results are omitted from this web copy!

42

Outcome and Residual Plots

[Figure: three histograms with frequency on the vertical axis. "Histogram of Symptom Severity" (x-axis: Final QIDS-SR Score, 0 to 25); "Level 2 Residuals" (x-axis: Symptom Severity Residuals, -15 to 15); "Level 3 Residuals" (x-axis: Symptom Severity Residuals, -10 to 10).]

43

Discussion

• It is unclear how one might combine averaging over the future actions with maximizing over the future actions.

• Ideally the effect a covariate has on the maximized mean outcome should be used to decide whether to use the covariate in the decision rules. We did not do this here.

• Constructing “evidence-based” regimes is of great interest in clinical research and there is much to be done by statisticians.

44

This seminar can be found at: http://www.stat.lsa.umich.edu/~samurphy/seminars/MDAnderson12.06.ppt

Email me with questions or if you would like a copy!

samurphy@umich.edu
