1 Dynamic Treatment Regimes Advances and Open Problems S.A. Murphy ICSPRAR-2008

Post on 19-Dec-2015


1

Dynamic Treatment Regimes Advances and Open Problems

S.A. Murphy

ICSPRAR-2008

2

Outline

– Dynamic Treatment Regimes

– Advances

– Inferential Challenges

• Incomplete, primitive, mechanistic models

• Measures of Confidence

3

4

Dynamic treatment regimes (i.e., policies) are individually tailored treatments, with treatment type and dosage changing according to patient outcomes.

k stages for one individual

Oj: observation available at jth stage

Aj: action at jth stage (usually a treatment)

5

k Stages

History available at jth decision

“Reward” following jth decision point (rj is a known function)

Primary Outcome:
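The symbols on this slide did not survive extraction. A plausible reconstruction is given below. It assumes the notation used elsewhere in the talk (Oj, Aj, Hj, rj, Y) and the usual convention in the dynamic treatment regime literature. In particular, taking Y to be the sum of the stage rewards is an assumption, not something recovered from the slide.

```latex
\begin{align*}
H_j &= (O_1, A_1, \ldots, A_{j-1}, O_j)
      && \text{history available at the $j$th decision} \\
R_j &= r_j(O_1, A_1, \ldots, O_j, A_j, O_{j+1})
      && \text{reward following the $j$th decision} \\
Y   &= \sum_{j=1}^{k} R_j
      && \text{primary outcome (assumed form)}
\end{align*}
```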

6

Goal:

Construct decision rules that input information in the history at each stage and output a recommended action; these decision rules should lead to a maximal mean Y.

The dynamic treatment regime (policy) is the sequence of decision rules.

In future one selects actions as:
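The formula that followed was lost in extraction; the natural reading is that the action at stage j is the output of the jth decision rule applied to the history, aj = dj(Hj). A minimal sketch of that idea follows. The function `run_regime`, the tuple encoding of the history, and the two threshold rules are all hypothetical illustrations. Observations are supplied up front for simplicity, whereas in practice each one arrives only after the previous action.

```python
from typing import Callable, List, Sequence, Tuple

History = Tuple[float, ...]            # hypothetical encoding: (O_1, A_1, ..., O_j)
DecisionRule = Callable[[History], int]


def run_regime(rules: Sequence[DecisionRule],
               observations: Sequence[float]) -> List[int]:
    """Apply decision rule d_j to the history H_j at each stage j."""
    history: Tuple[float, ...] = ()
    actions: List[int] = []
    for d_j, o_j in zip(rules, observations):
        history = history + (o_j,)     # H_j ends with the newest observation
        a_j = d_j(history)             # a_j = d_j(H_j)
        actions.append(a_j)
        history = history + (a_j,)     # the action becomes part of later histories
    return actions


# Hypothetical rules: treat (1) when the latest observation exceeds 0.5;
# at stage 2, only if no treatment was given at stage 1 (h[1] is A_1).
d1 = lambda h: int(h[-1] > 0.5)
d2 = lambda h: int(h[-1] > 0.5 and h[1] == 0)

print(run_regime([d1, d2], [0.2, 0.9]))
```

The regime is just the ordered list `[d1, d2]`; changing the policy means swapping rules, not changing the loop.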

7

Types of Data

• Large Observational Data Sets

– Noise in data

– Actions are not manipulated by scientist (causal inference methods required)

– Actions are measured with error

– Moderate to small number of variables

• Small Randomized Clinical Trials

– Actions are manipulated by scientist

– Unknown causes require causal inference methods

– Moderate to large number of variables

8

Reality

[Diagram: two "Unknown Causes" nodes shown above the sequence O1, A1, O2, A2, Y]

9

Constructing Dynamic Treatment Regimes

– Why is this more than a standard control problem?

• High quality mechanistic models are often unavailable. (Unknown, complex, system dynamics)

• Even when such models are available often they do not adequately simulate the interrelationships between observations and how the actions might impact the observations because there are strong behavioral and contextual influences.

10

Advances: Methods for Constructing Dynamic Treatment Regimes

•Likelihood-based (model conditional distribution of observations in each state given past history)

•Late stage cancer (Thall et al. 2000, 2002, 2007)

•Some HIV/AIDS (Davidian & colleagues, 2007)

11

Constructing Dynamic Treatment Regimes

Why is this more than a standard reinforcement learning problem?

• Unknown causes of observations in system dynamics (violates POMDP assumptions)

• Large data sets in which actions are manipulated are unavailable.

POMDP: Partially Observed Markov Decision Process (used in “medical decision making.”)

12

Advances: Methods for Constructing Dynamic Treatment Regimes

•Q-Learning (Watkins, 1989) (a popular method from reinforcement learning)

---generalization of regression

---may be misleading when actions are not randomized or there are unknown causes

13

Advances: Methods for Constructing Dynamic Treatment Regimes

(Deal with some causal inference issues in large observational data sets)

•A-Learning (Murphy, 2003; Robins, 2004) ---regression on a mean zero space

•Weighting (Murphy, et al., 2002; Tsiatis & coauthors, 2004, 2006; Hernan et al. 2006) ---weighted mean

14

Advances

Experimental Design

(Very applied, rather primitive)

•Adaptive Trial Design (Thall & colleagues, 2000, 2002)

•Sequential, Multiple Assignment, Randomized Trials (Murphy & colleagues, 2005, 2006, 2007)

•General Trial Design Issues (Lavori & Dawson, 1998, 2003, 2004)

15

STAR*D

• No statistical expertise was available at the time the trial was designed.

• This trial is over and one can apply for access to the data.

• One goal of the trial is to construct good treatment sequences for patients suffering from treatment-resistant depression.

www.star-d.org

16

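STAR*D "Sequenced Treatment Alternatives to Relieve Depression"

[Trial design diagram, flattened in extraction. Recoverable structure:]

Column headings: Preference, Treatment Two, Intermediate Outcome, Preference, Treatment Three, Follow-up

Initial treatment: CIT

Intermediate outcome: Remission (to Follow-up) or Non-remission (to next treatment)

Treatment Two, by preference:

– Augment (R): CIT + BUS, CIT + BUP-SR

– Switch (R): Bup-SR, VEN, SER

Treatment Three, by preference:

– Augment (R): L2-Tx + THY, L2-Tx + LI

– Switch (R): MIRT, NTP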

17

ExTENd

• Ongoing study at U. Pennsylvania

• Goal is to learn how best to help alcohol dependent individuals reduce alcohol consumption.

18

Oslin ExTENd

[Trial design diagram, flattened in extraction. Recoverable structure:]

Random assignment: Early Trigger for Nonresponse or Late Trigger for Nonresponse

Under each trigger definition:

– 8 wks Response: Random assignment to TDM + Naltrexone or Naltrexone

– Nonresponse: Random assignment to CBI or CBI + Naltrexone

19

Measures of Confidence

• We would like measures of confidence for the following:

– Aid in dynamic treatment regime construction

• To assess if there is sufficient evidence that a particular observation (e.g. output of a biological test) should be part of the dynamic treatment regime.

• To assess if there is sufficient evidence that a group of actions leads to equivalent outcomes for a given observation.

– To compare the mean outcomes of two estimated dynamic treatment regimes (both estimated using the same data).

20

Challenges: Measures of Confidence

– Measures of confidence are essential

• Need to know when a subset of actions are equivalent, that is, when there is little or no evidence that one of the actions leads to a better outcome.

• It is important to minimize the number of observations that must be collected in the future clinical setting.

– Randomized Clinical Trials

21

Measures of Confidence

• Traditional methods for constructing measures of confidence require some form of differentiability (if frequentist properties are desired).

• Non-differentiable operations are used to construct dynamic treatment regimes.

• The mean of the outcome Y following use of a dynamic treatment regime is a non-differentiable function of the regime.

22

Example: Q-learning

• Generalization of regression to multistage decisions.

• Move backward through time as in dynamic programming.

• Hj is the history available at stage j

• k=2 stages in following

23

Q-learning (k=2)
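The equations on this slide were lost in extraction. For orientation, the standard two-stage Q-functions are written out below, using the Hj, Aj, Y notation of the talk. This is the usual textbook form and is offered as an assumption about what the slide showed, not as a recovery of it.

```latex
\begin{align*}
Q_2(H_2, A_2) &= E\left[\, Y \mid H_2, A_2 \,\right] \\
Q_1(H_1, A_1) &= E\left[\, \max_{a_2} Q_2(H_2, a_2) \;\middle|\; H_1, A_1 \right] \\
d_j(h_j) &= \arg\max_{a_j} Q_j(h_j, a_j), \qquad j = 1, 2
\end{align*}
```

The maximum over a2 inside the stage-1 expectation is the non-differentiable operation that causes the inferential difficulties discussed later in the talk.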

24

Q-learning with Data

• Assume the actions, Aj, are randomized.

• Sj is a summary of the information available at and prior to stage j

• Binary actions in the following

25

Approximate

A Simple Version of Q-Learning –binary actions

• Stage 2 regression: Use least squares with outcome, Y, and covariates to obtain

• Set

• Stage 1 regression: Use least squares with outcome, and covariates to obtain

26

Decision Rules:

27

Inference is a non-regular problem

28

When do we have non-regularity?

29

Non-regularity

(Bootstrap and Taylor series-based estimators of standard errors, and Bayesian methods, have poor frequentist properties.)

30

Simulation Example

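• Generative Model: $Y = 1 + \gamma_1 A_1 + \gamma_2 A_2 + \gamma_3 A_1 A_2 + \varepsilon$, with $\varepsilon \sim N(0, 1)$; $A_1, A_2$ coded $\{0, 1\}$; no $S_j$’s

• Parameter of interest: $\beta_1$ in

$E[\max_{a_2} E[Y \mid A_1, A_2 = a_2]] = \alpha_1 + \beta_1 A_1$

$\beta_1 = \gamma_1 + (\gamma_2 + \gamma_3)^+ - \gamma_2^+$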

• 1000 simulated data sets
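The setup above can be reproduced with a few lines of code. The sketch below generates 1000 data sets from the stated model with all γ's equal to zero and computes the plug-in estimate of β1 from a least squares fit. The function names and the choice n = 100 are illustrative, and the confidence interval procedures compared on the next slide are not implemented here.

```python
import numpy as np


def pos(x):
    """Positive part x^+ = max(x, 0)."""
    return max(x, 0.0)


def beta1(g1, g2, g3):
    """beta_1 = gamma_1 + (gamma_2 + gamma_3)^+ - gamma_2^+."""
    return g1 + pos(g2 + g3) - pos(g2)


def estimate_beta1(a1, a2, y):
    """Plug least squares estimates of the gammas into the formula for beta_1."""
    X = np.column_stack([np.ones_like(y), a1, a2, a1 * a2])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    _, g1, g2, g3 = coef
    return beta1(g1, g2, g3)


rng = np.random.default_rng(0)
n, reps = 100, 1000
est = np.empty(reps)
for r in range(reps):
    a1 = rng.integers(0, 2, n).astype(float)
    a2 = rng.integers(0, 2, n).astype(float)
    y = 1.0 + rng.normal(size=n)       # gamma_1 = gamma_2 = gamma_3 = 0, so beta_1 = 0
    est[r] = estimate_beta1(a1, a2, y)

print(est.mean(), est.std())
```

At γ2 = γ3 = 0 both positive-part terms sit exactly at their kink, so the estimator is a non-smooth function of approximately normal quantities. A histogram of `est` is a quick way to look at the resulting sampling distribution.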

31

Confidence Rate when γ1 = γ2 = γ3 = 0, so β1 = 0

Sample Size   Asymptotic Normality   Percentile Bootstrap   Bayesian
100           .98                    .98                    .98
300           .97                    .97                    .97
500           .98                    .98                    .98
1000          .98                    .98                    .97

Confidence rate should be .95

32

A Challenge!

• The goal is to conduct inference on parameters in the dynamic treatment regime.

• I’ve worked on this problem for 3 years (!) and every solution I’ve formulated has unsatisfactory drawbacks.

• Can you produce a good solution; a solution that can be used in REAL LIFE to analyze clinical trial data?!

33

Measures of Confidence

• We would like measures of confidence for the following:

– Aid in Policy Construction

• To assess if there is sufficient evidence that a particular observation (e.g. output of a biological test) should be part of the policy.

• To assess if there is sufficient evidence that a subset of the actions leads to better rewards for a given observation than the remaining actions.

– To compare the mean outcomes of two estimated policies (both estimated using the same data).

34

Single Stage (k=1)

• Find a prediction interval for the mean outcome if a particular estimated policy (here one decision rule) is employed.

• Action A is binary in {-1,1}.

• Suppose the decision rule is of form

• We do not assume the Bayes decision boundary is linear.

35

Single Stage (k=1)

Mean outcome following this policy is

is the randomization probability
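The formula for the mean outcome was lost in extraction. A common way to estimate the mean outcome of a rule from randomized data is inverse probability weighting: average Y over subjects whose randomized action agrees with the rule, weighting by one over the randomization probability. The sketch below assumes a linear rule d(O1) = sign(βᵀO1), actions in {-1, 1}, and a known randomization probability `p`. The function name and the simulated data are illustrative assumptions.

```python
import numpy as np


def policy_value_ipw(beta, O, A, Y, p):
    """IPW estimate of the mean outcome under the rule d(O) = sign(beta^T O).

    O : (n, q) array of observations (include a column of ones for an intercept)
    A : (n,) array of randomized actions in {-1, +1}
    Y : (n,) array of outcomes
    p : probability with which each subject's observed action was assigned
    """
    recommended = np.where(O @ beta > 0, 1, -1)
    follows = (A == recommended).astype(float)   # 1 if the trial action matches the rule
    return np.mean(follows * Y / p)


# Illustrative randomized data: action +1 helps when O > 0 and hurts when O < 0.
rng = np.random.default_rng(1)
n = 5000
O = np.column_stack([np.ones(n), rng.normal(size=n)])
A = rng.choice([-1, 1], size=n)                  # randomized with probability 0.5
Y = A * O[:, 1] + rng.normal(size=n)

v_good = policy_value_ipw(np.array([0.0, 1.0]), O, A, Y, p=0.5)
v_bad = policy_value_ipw(np.array([0.0, -1.0]), O, A, Y, p=0.5)
print(v_good, v_bad)
```

The indicator inside the average is what makes the estimated value a step function of β, which is the smoothness problem raised two slides later.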

36

Classification

Misclassification rate for a given decision rule (classifier)

where V is defined by

(A is the {-1,1} classification; O1 is the observation; $\beta^T O_1$ is a linear classification boundary)

37

Prediction Interval for

Two problems

• V(β) is not necessarily smooth in β.

• We don’t know V so V must be estimated as well. Data set is small so overfitting is a problem.

38

Simulation Example

• Population: Ionosphere data from the UCI repository, 351 samples, O is composed of 9 covariates, A is binary

• Use least squares to form classification rule

39

“95% Prediction Intervals”

Sample Size   Percentile Bootstrap   Adjusted Bootstrap   Naïve Binomial   Our Method
30            .601                   .190                 .389             .993
50            .316                   .500                 .632             .957

Confidence rate should be .95

40

Prediction Interval for

Our method

• Obtains a prediction interval for a smooth upper bound on

• Our method is generally too conservative

41

A Challenge!

• Statistical methods for constructing the policy/classifier and for evaluating the policy/classifier should use the same small data set.

• Can you make an advance?

42

Discussion

These are real problems and the need for advances in statistical methods is great.

43

Discussion: Further Open Problems

• These are real problems and the need for advances in statistical methods is great.

• High level of interest in clinical medicine research.

• Developing methods for variable selection in decision making (in addition to variable selection for prediction)

44

Discussion: Further Open Problems

• Model selection when goal is constructing good policies.

• Feature Construction

• Methods for producing composite outcomes (Y)

– High quality elicitation of functionality

45

This seminar can be found at:

http://www.stat.lsa.umich.edu/~samurphy/seminars/ICSPRAR01.08Plenary.ppt

Email me with questions or if you would like a copy:

[email protected]

46

Studies under review

• H. Jones’ study of drug-addicted pregnant women (goal is to reduce cocaine/heroin use during pregnancy and thereby improve neonatal outcomes)

• J. Sacks’ study of parolees with substance abuse disorders (goal is to reduce recidivism and substance use)

47

Jones’ Study for Drug-Addicted Pregnant Women

[Trial design diagram, flattened in extraction. Recoverable elements: treatment options rRBT, tRBT, aRBT and eRBT; assessment of Response vs. Nonresponse at 2 wks; five random assignment points.]

48

Sacks’ Study of Adaptive Transitional Case Management

[Trial design diagram, flattened in extraction. Recoverable elements: treatment options Standard Services, Standard TCM and Augmented TCM; assessment of Response vs. Nonresponse at 4 wks; two random assignment points.]

49

Adaptive Treatment for ADHD

• Ongoing study at the State U. of NY at Buffalo (B. Pelham)

• Goal is to learn how best to help children with ADHD improve functioning at home and school.

50

ADHD Study

[Trial design diagram, flattened in extraction. Recoverable structure:]

Random assignment:

A. Begin low-intensity behavior modification

– 8 weeks, then assess: adequate response?

– Yes: A1. Continue, reassess monthly; randomize if deteriorate

– No: Random assignment:

• A2. Add medication; bemod remains stable but medication dose may vary

• A3. Increase intensity of bemod with adaptive modifications based on impairment

B. Begin low dose medication

– 8 weeks, then assess: adequate response?

– Yes: B1. Continue, reassess monthly; randomize if deteriorate

– No: Random assignment:

• B2. Increase dose of medication with monthly changes as needed

• B3. Add behavioral treatment; medication dose remains stable but intensity of bemod may increase with adaptive modifications based on impairment

51

A class of “solutions”

52

Soft-max

F is a distribution function (e.g. logistic) and λ is a tuning parameter.

Choosing the tuning parameter λ from the data is difficult.
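The soft-max formula itself was lost in extraction. One form such a smoothing could take, offered purely as an assumption, replaces the hard maximum of two values by a weighted average whose weight is F applied to λ times their difference. The function `soft_max` below is a hypothetical illustration using the logistic F and is not taken from the slide.

```python
import math


def logistic(x: float) -> float:
    """Logistic distribution function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def soft_max(a: float, b: float, lam: float, F=logistic) -> float:
    """Smooth surrogate for max(a, b): weight F(lam * (a - b)) on a, the rest on b.

    lam = 0 gives the plain average; as lam grows the value approaches max(a, b).
    """
    w = F(lam * (a - b))
    return w * a + (1.0 - w) * b


for lam in (0.0, 1.0, 10.0, 100.0):
    print(lam, soft_max(2.0, 0.0, lam))
```

Unlike the hard maximum, this surrogate is differentiable in a and b for any finite λ. Small λ buys smoothness at the cost of bias, which is one way to see why a data-based choice of λ is hard.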