
ORIGINAL PAPER

Empirically Supported Treatments and Efficacy Trials: What Steps Do We Still Need to Take?

Jessica D. Nasser

Department of Psychological Sciences, Psychology Program, Case Western Reserve University, 10900 Euclid Avenue, Cleveland, OH 44106-7123, USA; e-mail: [email protected]

Published online: 7 April 2013

© Springer Science+Business Media New York 2013

J Contemp Psychother (2013) 43:141–149; DOI 10.1007/s10879-013-9236-x

Abstract The Task Force on the Promotion and Dissemination of Psychological Procedures sought to identify scientifically supported treatments in order to espouse their use and improve client outcomes in therapy. Nevertheless, the gap between scientists and practitioners persists, and there still remain some limitations to the manner in which this goal is carried out. The criteria specifying whether a treatment qualifies as empirically supported are too lenient. The research used in the search for empirically supported treatments does not take into account the full literature base. Efficacy trials provide practitioners with limited information. This paper proposes means through which the field can improve its search for scientifically supported treatments. Alterations to the criteria that assess empirically supported treatments, greater research transparency and external validity, and collaboration between investigators and clinicians will allow the field of clinical psychology to better answer the question, "How can we most successfully treat this client?"

Keywords Clinical utility · Empirically supported treatments · Efficacy studies · External validity · Meta-analyses · Research transparency · Scientist-practitioner

Division 12 (Clinical Psychology) of the American Psychological Association (APA) established the Task Force on the Promotion and Dissemination of Psychological Procedures (hereinafter referred to as the Task Force) to identify and disseminate Empirically Validated Treatments (subsequently termed Empirically Supported Treatments [ESTs]; Chambless et al. 1998, 1996; APA 1993). The Task Force's objectives (e.g., increasing public awareness of psychotherapies' efficacy, ensuring that clinicians implement scientifically-based interventions, and improving treatment outcomes in psychotherapy) are laudable goals the field should continue to pursue. The manner in which ESTs are identified and the type of information they provide, however, still contain drawbacks that need to be addressed. Additionally, despite the Task Force's efforts, the gap between scientists and practitioners remains wide, and individuals in need of treatment often receive interventions that are not scientifically based (Lilienfeld 2010). Concerns regarding the EST movement are examined, and subsequent suggestions are provided that will both enhance the investigation of scientifically-based treatments and help decrease the scientist-practitioner gap.

Concern #1: The EST Criteria Include the Use of Wait-List Groups and Pill and Psychological Placebos as Controls

The criteria used to identify ESTs are too lenient and allow any intervention—whether it merely produces a placebo effect or adds an irrelevant component to an already established therapeutic intervention—to qualify as an EST (e.g., Eye Movement Desensitization Therapy, or EMDR; Davidson and Parker 2001; Shapiro 1989). Therapies demonstrating statistical superiority to wait-list controls in two different experiments qualify as "probably efficacious" (Chambless et al. 1998), and such criteria can result in the proliferation of interventions that are merely more helpful than doing nothing. Evidence indicating that therapy works better than no treatment does not provide clinicians with information that is applicable to their practice. If two individual studies examine different therapeutic interventions and demonstrate that each treatment is more beneficial than doing nothing, practitioners must still grapple with deciding which of the two treatments will most benefit their client.

The EST criteria also indicate that interventions can be considered "well-established" if they demonstrate significant superiority to medication placebo or intervention placebo in two studies conducted by different research teams (Chambless et al. 1998). This is despite the fact that investigators have often expressed concerns regarding the use of pill and psychological placebos (e.g., Mahoney 1978; O'Leary and Borkovec 1978). Recent studies have strengthened the basis for this concern by demonstrating that more patients prefer individual therapy to medication when given the choice (e.g., Feeny et al. 2009). Thus, participants who are randomly assigned to psychotherapy conditions might be more satisfied with their group placement and more likely to believe they will improve when compared to participants who are assigned to pill placebo conditions. Disparities in participants' satisfaction and expectations might make it easier to find a significant difference between groups in favor of therapy.

Implementing psychological placebo groups in randomized controlled trials (RCTs) can threaten a study's construct validity (Kazdin 2003). Successfully executing psychological placebo conditions in behavioral RCTs is difficult since both participants and researchers are conscious of the treatment they are receiving and providing, respectively (Castelnuovo 2010). Administering clinicians might believe in the active therapy's positive effects, be biased against the placebo condition, and unwittingly affect the manner in which each is delivered. For instance, the administering clinician might unconsciously act in such a way as to convey to participants whether or not they should be improving. Indeed, a recent experimental study suggests that, in instances in which experimenters' expectations regarding outcome are congruent with the experimental condition, experimenters can unwittingly influence participants' behavior (Doyen et al. 2012).

Participants' beliefs about whether they have been assigned to the treatment or placebo condition can also threaten a study's construct validity. Patients might not expect to improve if they believe they are in the placebo group, and vice versa (Colagiuri 2010; Kazdin 2003). Even if participants cannot determine their group placement, placebo groups lack active therapeutic components and might eventually discourage participants who sense no improvement. Disheartened participants might subsequently develop aversion towards and negative expectations regarding the placebo group or their own ability to change, in turn magnifying the observed differences between the placebo and treatment groups (O'Leary and Borkovec 1978).

Most importantly, when efficacious interventions for a given disorder are known to exist, providing participants with a non-therapeutic placebo intervention in an RCT is ethically questionable (Michels and Rothman 2003; World Medical Association 2008). Participants in placebo groups do not receive the treatment they need to address their psychopathology (Kazdin 2003; O'Leary and Borkovec 1978), and their symptoms might worsen. Participants in placebo groups might also blame themselves for their lack of improvement and begin believing they are "hopeless." Increased distress in individuals who are already in need of psychological interventions can exacerbate their psychopathology and be potentially harmful.

Finally, pill and psychological placebo comparison groups do not provide clinicians with the knowledge needed to make informed decisions about treatment choice. Clinicians are not trying to decide between implementing weekly therapy sessions and providing sugar pills or psychological placebos; the latter two are irrelevant in clinical practice. Instead, clinicians might be interested in whether or not a novel treatment is superior to the one they are currently employing. As is the case for wait-list control comparison groups, it is easier to find significant differences when comparing a treatment to a placebo than when comparing it to an active and efficacious intervention (Rothman and Michels 1994). Placebo control groups can thus lead to an infinite list of "efficacious" treatments without providing knowledge about which, amongst all interventions, is superior.

Suggestion #1: Only Active Comparison Groups Should Be Used to Determine EST Status

Given the limited information and possible complications that arise from pill and psychological placebos, investigators should compare novel therapies to the most efficacious treatment currently available. Active and efficacious comparison groups offer a legitimate test of a therapy's efficacy (as opposed to a placebo or a diluted version of therapy that is thought, from the beginning, not to work; Krause and Lutz 2009) and can provide an efficient means of narrowing down the list of ESTs (Castelnuovo 2010). Additionally, recent statistical advances suggest that comparing two potentially equivalent interventions is more feasible than previously believed (e.g., Greene et al. 2008; Spławinski and Kuzniar 2004; Streiner 2003).
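
To make the statistical point concrete, the sketch below shows one common way of testing near-equivalence, the two one-sided tests (TOST) procedure that underlies the equivalence and noninferiority designs Streiner (2003) and Greene et al. (2008) discuss. It is a minimal illustration, not a trial analysis plan: the outcome scores, sample sizes, and the 2-point equivalence margin are all invented for the example.

```python
import numpy as np
from scipy import stats

def tost_equivalence(a, b, margin):
    """Two one-sided tests: is mean(a) - mean(b) reliably inside +/- margin?"""
    a, b = np.asarray(a, float), np.asarray(b, float)
    diff = a.mean() - b.mean()
    se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    df = len(a) + len(b) - 2  # simple pooled-df approximation
    p_lower = 1 - stats.t.cdf((diff + margin) / se, df)  # H0: diff <= -margin
    p_upper = stats.t.cdf((diff - margin) / se, df)      # H0: diff >= +margin
    return diff, max(p_lower, p_upper)  # equivalence requires both to reject

rng = np.random.default_rng(0)
therapy_a = rng.normal(12.0, 4.0, 80)  # symptom reduction, arm A (simulated)
therapy_b = rng.normal(11.5, 4.0, 80)  # symptom reduction, arm B (simulated)
diff, p = tost_equivalence(therapy_a, therapy_b, margin=2.0)
print(f"mean difference = {diff:.2f}, TOST p = {p:.3f}")
```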

In addition to providing a more stringent test of a therapy's efficacy, active comparison groups also solve the ethical dilemma concerning delaying treatment for individuals who need therapy. Although wait-list and placebo groups control for nonspecific factors such as participant expectancies and natural symptom remission (Mahoney 1978; Lohr et al. 2005), such scientific goals should not trump the ethical obligation that researchers have to their participants (Michels and Rothman 2003). Additionally, wait-list and placebo control groups are not the only available means to account for nonspecific factors. Participant expectancies, for instance, can be assessed via simple inquiries (e.g., "What do you expect from this treatment?"; Price et al. 2007, p. 583) or through questionnaires (e.g., the Credibility/Expectancy Questionnaire; Devilly and Borkovec 2000). Participant preference for intervention type (e.g., psychotherapy versus medication) can be assessed with single questions about the type of treatment participants would select if given the option (e.g., "If you had a choice between individual therapy, medication, or no treatment to help you with [your symptoms], which would you choose?"; Feeny et al. 2009, p. 726). Differences in participant expectancies or preferences can subsequently be accounted for during group randomization or throughout statistical analyses (e.g., as covariates or possible moderators).
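
As a concrete illustration of that last point, the sketch below fits two regression models to simulated trial data: one treating a baseline expectancy rating as a covariate, the other testing it as a moderator of the treatment effect. The dataset, column names, and effect sizes are invented for the example; statsmodels' formula interface is used only because it makes the two models easy to read.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 120
expectancy = rng.normal(6, 2, n)  # baseline expectancy rating (simulated)
arm = rng.integers(0, 2, n)       # 0 = comparison arm, 1 = novel therapy
outcome = 2.0 * arm + 0.8 * expectancy + rng.normal(0, 3, n)
trial = pd.DataFrame({"outcome": outcome, "arm": arm, "expectancy": expectancy})

# Covariate model: adjusts the treatment effect for baseline expectancy.
covariate_fit = smf.ols("outcome ~ C(arm) + expectancy", trial).fit()
# Moderator model: tests whether the treatment effect depends on expectancy.
moderator_fit = smf.ols("outcome ~ C(arm) * expectancy", trial).fit()
print(covariate_fit.params, moderator_fit.pvalues, sep="\n")
```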

RCTs comparing two active psychotherapeutic interventions also have the benefit of providing therapists with information that is applicable to clinical practice. Studies with active comparison groups demonstrate which of two treatments is superior and indicate whether distinct treatments lead to differential outcomes. For instance, knowing that therapy Y decreases anxiety more so than therapy Z, while therapy Z increases positive affect more so than therapy Y, will allow clinicians to implement the treatment best suited to each client (e.g., a highly anxious, but not depressed, client would benefit from therapy Y as opposed to therapy Z).

Comparing active psychotherapy to active pharmacotherapy provides clinicians with different, yet equally important, information. Studies demonstrating whether therapy works better than medication, whether medication works better than therapy, or whether a combination of both therapy and medication is superior to either in isolation can help ensure that clients receive the most beneficial intervention possible. If the therapy in question is less beneficial than medication, then clinicians can seek a different treatment that is superior to medication. If medication is superior to all therapeutic interventions, therapists can suggest pharmacotherapy to their patients. If combining therapy and medication proves the most beneficial, then clinicians can suggest that clients concurrently engage in psychotherapy and medication management.

Active comparison groups, however, are not without their drawbacks. One potential weakness is their inability to provide information regarding the incremental and specific efficacy of individual components. Component control studies can provide such information (Lohr et al. 2005; O'Leary and Borkovec 1978) but might be ethically tenuous since participants are provided with a potentially less-efficacious intervention. If such studies are implemented, then researchers should ensure that participants in the dismantled arm receive the missing component after study completion if it is found to have incremental efficacy.

RCTs employing two active treatments are also subject to factors that can negatively affect construct validity. "Allegiance effects" refer to the fact that researchers' favored treatments usually demonstrate superior outcomes (Luborsky et al. 1999). This phenomenon might occur if investigators are more knowledgeable about the preferred intervention and provide the study's clinicians with better training on that treatment (Leykin and DeRubeis 2009). In studies where evaluators are not blinded, allegiance effects might also occur if researchers incorrectly (but inadvertently) assess individuals in a way that confirms their hypotheses. Indeed, in one experimental study in which experimenters' expectations were incongruent with the experimental condition, experimenters tended to inaccurately assess participant behavior so as to align results with their expectations (Doyen et al. 2012).

In order to reduce possible allegiance effects, investigator teams with different theoretical orientations can collaborate on studies to compare their respective interventions (Leykin and DeRubeis 2009; Luborsky et al. 1999; Mellers et al. 2001). Not only will such collaboration help minimize allegiance effects (Leykin and DeRubeis 2009), but it will also pool investigators' resources so that larger studies can be implemented and more intricate and detailed data analyses can be conducted. Carrying out multiple levels of analyses, for instance, can provide information on whether therapeutic techniques lead to distinct outcomes or degrees of efficacy with different individuals (Singer and Willett 2003). Such studies would allow researchers to take into account the differential effects of therapeutic techniques on subgroups within larger homogeneous groups (e.g., individuals with Generalized Anxiety Disorder who differ in age of onset or in ethnicity). Alternatively, when collaboration between research teams is not feasible, investigators can ensure treatment fidelity by employing experts (either researchers or clinicians) who specialize in the treatments under examination (Leykin and DeRubeis 2009; Luborsky et al. 1999).
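
The kind of multilevel analysis referenced above (Singer and Willett 2003) can be sketched as follows: simulated repeated measurements nested within clients, with a random intercept per client and a treatment-by-time fixed effect. All numbers and column names are invented; a real trial would add client-level moderators (e.g., age of onset) to probe subgroup effects.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n_clients, n_waves = 60, 5
client = np.repeat(np.arange(n_clients), n_waves)        # client id per row
time = np.tile(np.arange(n_waves), n_clients)            # assessment wave
arm = np.repeat(rng.integers(0, 2, n_clients), n_waves)  # arm per client
baseline = rng.normal(25, 4, n_clients)[client]          # client-level variation
symptoms = baseline - (1.0 + 0.8 * arm) * time + rng.normal(0, 2, client.size)
data = pd.DataFrame({"symptoms": symptoms, "time": time,
                     "arm": arm, "client": client})

# Random intercept per client; the time-by-arm term asks whether the
# two arms differ in their rate of symptom change.
fit = smf.mixedlm("symptoms ~ time * C(arm)", data, groups=data["client"]).fit()
print(fit.summary())
```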

Concern #2: The EST Criteria Do Not Account for All the Literature

The EST criteria disregard negative findings and only consider individual studies with positive outcomes. The criteria also lack provisions for removing treatments from the list of ESTs (Castelnuovo 2010). Thus, a treatment that is shown to produce positive effects in two individual studies can qualify as empirically supported regardless of whether other studies have found negative effects. Such criteria are problematic if different studies provide mixed results regarding a treatment's efficacy, or if an intervention initially demonstrates efficacy but subsequently does not work in more rigorous trials (Herbert 2003).

Given their singular focus on positive outcomes, the EST criteria do not account for studies that commit Type II errors. Methodologically rigorous studies with insufficient statistical power (due to the small samples that often constitute individual studies) can fail to detect veridical effects even when the intervention is indeed efficacious (Borenstein et al. 2009). Although meta-analyses render Type II errors for overall effects less likely by aggregating participants across primary studies (Borenstein et al. 2009), the EST criteria do not consider meta-analytic results. Failing to take into account the entire literature might make it difficult for researchers and clinicians to rely on the list of ESTs.
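
The power problem is easy to quantify. Under conventional assumptions (two-sided alpha = .05 and a true between-groups effect of Cohen's d = 0.5), the sketch below computes the probability that a two-sample t test detects the effect at several per-group sample sizes; the small samples common in individual trials leave Type II error rates well above the nominal 20%. The numbers are illustrative.

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for n_per_group in (15, 30, 64, 128):
    power = analysis.power(effect_size=0.5, nobs1=n_per_group,
                           alpha=0.05, ratio=1.0, alternative="two-sided")
    print(f"n = {n_per_group:3d} per group -> power = {power:.2f}")
```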

Suggestion #2: Aggregate Results in a Transparent Research Context Should Determine EST Status

Studies with both significant and nonsignificant results should be taken into account when determining EST status (Herbert 2003). Ultimately, meta-analytic results should be the final arbiter in determining treatment efficacy. Subsequent meta-analyses that include a wider range of studies could be used to challenge the results of older meta-analyses. Using meta-analyses to establish whether treatments are empirically supported will also provide a method of updating the list of ESTs (e.g., if a meta-analysis concludes that an intervention is not efficacious, then it can be removed from the list of ESTs). The benefits of using meta-analytic results to ultimately determine EST status will be maximized if some modifications—described below—are made to the research and publication process.

The validity of meta-analytic results is contingent upon various factors. First, meta-analyses are only as valid as the individual studies they include (Lipsey and Wilson 2001). Although the EST criteria call for "good" study designs (APA 1993), guidelines regarding acceptable research methodology are not specified. Studies used to determine EST status can therefore vary greatly in methodological rigor (Herbert 2003) and reporting accuracy. RCTs in top-tier journals (e.g., New England Journal of Medicine), for instance, have been shown to contain problems with their analytic quality, such as failing to identify primary outcomes, justify estimates of study size, and account for missing data (Spring et al. 2007). Statistical reanalysis of published psychology papers has revealed some calculation and reporting errors: in a random sample of 281 psychology articles, 15% of statistical results were incorrectly calculated and 18% were incorrectly reported (Bakker and Wicherts 2011). A meta-analysis that used t and F tests (with one df in the numerator) from such studies to compare two groups would arrive at a summary effect size differing by a mean of 0.17 in Cohen's d from that of a meta-analysis whose aggregated studies were free of such errors (Bakker and Wicherts 2011).
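
The link between a misreported test statistic and a distorted meta-analytic effect size is direct, because meta-analysts routinely convert reported t values (or F values with one numerator df, via t = sqrt(F)) into Cohen's d. The sketch below shows the standard conversion for an independent-groups design; the particular t values and sample sizes are invented.

```python
import math

def t_to_d(t, n1, n2):
    """Cohen's d implied by an independent-samples t statistic."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)

# The same 2 x 30 study, with a correctly vs. incorrectly reported t:
print(f"correct d     = {t_to_d(2.10, 30, 30):.2f}")  # ~0.54
print(f"misreported d = {t_to_d(2.80, 30, 30):.2f}")  # ~0.72
```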

Given that meta-analyses are likely to include studies with methodological limitations, meta-analyses looking to summarize the research regarding treatment efficacy will do well to consider individual studies' methodological features and quality and their effects on meta-analytic outcomes. For instance, meta-analyses can examine the independent and combined effect of each methodological feature on summary effect sizes. This in turn will indicate the extent to which meta-analytic results accurately represent the strength of the examined relationship (Lipsey and Wilson 2001). Meta-analyses should also examine the effects of confounding variables on summary effect sizes. So doing will ensure that results correctly reflect the factors of interest and will also provide the opportunity to elucidate which treatments work for whom, under what circumstances, and with what outcomes (Lipsey and Wilson 2001).
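
One simple form such a check can take is shown below: computing the inverse-variance summary effect separately for studies with and without a given methodological feature (blinded outcome assessment is used here purely as an example) and comparing the two. The effect sizes, variances, and feature coding are all invented; a full analysis would use meta-regression or a random-effects subgroup model.

```python
import numpy as np

d = np.array([0.45, 0.60, 0.30, 0.75, 0.20, 0.55])  # per-study Cohen's d
v = np.array([0.04, 0.06, 0.03, 0.08, 0.03, 0.05])  # per-study sampling variance
blinded = np.array([1, 0, 1, 0, 1, 0], dtype=bool)  # methodological feature flag

def summary_effect(effects, variances):
    weights = 1.0 / variances  # inverse-variance (fixed-effect) weights
    return (weights * effects).sum() / weights.sum()

print(f"blinded studies:   d = {summary_effect(d[blinded], v[blinded]):.2f}")
print(f"unblinded studies: d = {summary_effect(d[~blinded], v[~blinded]):.2f}")
```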

A second factor that can affect the validity of meta-analyses is the "file-drawer" phenomenon (Rosenthal 1979). This phenomenon suggests that published studies might not represent all the research investigators have conducted. Meta-analytic results thus run the risk of only reflecting studies with positive findings and ignoring those with nonsignificant outcomes. Multiple factors contribute to the file-drawer problem. Studies with larger effect sizes and significant results are more likely to be published than are those with smaller effect sizes and nonsignificant results (Borenstein et al. 2009). Reviewers often prefer clean research study presentations (Simmons et al. 2011) and are unlikely to accept studies with nonsignificant findings (Krause 2011; Rosenthal 1979). Additionally, researchers might not pursue publication if a study's findings are negative (Howard et al. 2009; Krause 2011) or undesired (Leykin and DeRubeis 2009). When investigators conduct meta-analyses, it is not always possible to find and acquire all the research that has been carried out but not published (Rothstein et al. 2005). Meta-analyses whose aim is to determine whether an intervention is indeed efficacious should therefore include analytic methods that assess the potential impact of missing studies on meta-analytic outcomes in order to determine the certainty with which aggregate results can be interpreted (see Rothstein et al. 2005 for detailed descriptions of such analyses).
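
Rosenthal's (1979) fail-safe N is the oldest of these methods and gives the flavor: it asks how many zero-effect unpublished studies would have to exist before the combined z of the published studies fell below significance. The sketch below implements that calculation; the z values are invented, and modern treatments (Rothstein et al. 2005) favor more sophisticated tools such as trim-and-fill or selection models.

```python
def fail_safe_n(z_values, z_crit=1.645):
    """Rosenthal's fail-safe N: zero-effect studies needed to pull the
    Stouffer-combined z below the one-tailed .05 criterion."""
    s = sum(z_values)
    return max(0.0, (s / z_crit) ** 2 - len(z_values))

published_z = [2.1, 1.8, 2.5, 1.6, 2.9]  # z scores from published trials
print(f"fail-safe N = {fail_safe_n(published_z):.0f}")
```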

Examining the effect of both methodology (Lipsey and Wilson 2001) and publication bias (Borenstein et al. 2009; Rothstein et al. 2005) on meta-analytic outcomes, and comparing results to analyses in which such factors are not taken into account, will provide researchers and clinicians with information regarding the validity of aggregate findings.


Using the entire literature to determine EST status, as opposed to a few select studies, will increase clinicians' confidence in research-based interventions. Additionally, given the difficulty clinicians may face when trying to keep up with the numerous research articles that are published each month (Herbert 2003), clinicians might find it easier to examine empirical findings if they are summarized in comprehensive meta-analyses.1

1 Clinicians might not have access to large databases that often accompany academic posts. Investigators can help make research articles more accessible to clinicians by posting word-processing copies of manuscripts on their websites whenever possible. As of 2008, PubMed Central has mandated that copies of articles written with the help of NIH grants be made available to the public for free (http://grants.nih.gov/grants/guide/notice-files/NOT-OD-08-033.html). In cases in which studies are not funded by the NIH (and therefore not publicly available on PubMed Central), authors can strive to publish their manuscripts in journals whose publishing company allows them to post a copy of the manuscript on their professional website (e.g., APA [http://www.apa.org/pubs/authors/posting.aspx], Elsevier [http://www.elsevier.com/wps/find/authorsview.authors/preprints], and Springer [http://www.springer.com/open+access/authors+rights?SGWID=0-176704-12-683201-0] journals).

Changes to the research and publication process that allow and encourage transparency can also improve the validity of meta-analytic results. Reviewers could be more accepting of mixed results and emphasize full disclosure of variables and outcome measures over tidy study packages and significant effects (Simmons et al. 2011).2 Basing publication decisions on methodology and transparency will increase the number of quality studies available and help ensure that a greater range of pertinent information is taken into account when meta-analyses examine intervention efficacy. Reviewers can also encourage authors to adhere to the comprehensive Journal Article Reporting Standards (JARS) or the Meta-Analysis Reporting Standards (MARS) so that manuscripts can be more easily evaluated and included in future meta-analyses (APA 2008). Rigorous methodological and reporting standards will give clinicians and other research consumers greater confidence in EST studies' findings.

2 Some individuals have suggested that publishing null results might decrease a journal's status (e.g., Nosek et al. 2012). However, if a journal primarily accepts studies that rigorously apply the scientific process and provide transparent methodology, there should be no reason for the journal's status to be marred. On the contrary: publishing rigorous and transparent studies might increase a journal's status. Literature consumers and researchers might be more likely to read, cite, and trust journal articles with methodologically rigorous and transparent studies than journal articles whose studies lack such qualities, regardless of whether results are significant.

Finally, registering unpublished studies and findings online can also help decrease the file-drawer problem and improve the validity of meta-analytic results. Current websites such as Figshare (http://figshare.com/) and the Open Science Framework (http://openscienceframework.org/; Nosek et al. 2012) allow researchers to upload and share study information. To encourage researchers to log unpublished studies' results, investigators should list such studies in their curricula vitae (CVs). After all, regardless of whether studies are published, investigators will be contributing to the field's knowledge base by making their studies and results accessible. The availability of unpublished studies will increase the accuracy of meta-analyses and will reduce the costs associated with unnecessary study replication that might occur when unpublished studies remain unknown.

The changes indicated above will facilitate comprehensiveness and transparency in therapy efficacy research. Clinicians, investigators, and the community at large will more easily rely on empirical findings. Moving towards an inclusive and transparent research model will help clinical psychology retain its credibility during the present era in which research practices throughout scientific fields are being questioned (e.g., Fanelli 2009; Ioannidis 2005; Steen 2011).3

3 The APA recently introduced a new journal, Archives of Scientific Psychology (http://www.apa.org/pubs/journals/arc/index.aspx), which is in line with some of the suggestions presented in this paper. Archives of Scientific Psychology strives to follow a transparent and accessible model of research. The journal's articles are open to the public at no cost (authors pay for publication fees). Additionally, authors complete JARS or MARS (APA 2008) criteria and make their data available for others to use.

Concern #3: ESTs Identified Via Efficacy Trials Lack External Validity and Clinical Utility

ESTs are identified through efficacy trials in highly controlled clinical research settings (Chambless et al. 1998; Lau et al. 2010). Although studies suggest that therapy tested in RCT efficacy trials is equally beneficial when conducted in clinical settings (e.g., Gibbons et al. 2010), outcome measures in such studies are often limited to Diagnostic and Statistical Manual of Mental Disorders (DSM) symptoms (e.g., depression scores, rate of substance use). It is therefore not clear whether ESTs improve self-esteem, quality of life, interpersonal and work functioning, coping with multiple stressors and life demands, and similar phenomena that are often important to clients seeking psychotherapy (Kazdin 2008; Mahrer 2005; Overholser 2007). Indeed, of 435 RCTs examining therapeutic interventions, only 18.8% included evidence for reduction in functional impairment (Becker et al. 2011).

Similarly, reported statistical results such as statistical significance and effect size are not always indicative of improvement in everyday functioning (Kazdin 2008). Statistical results often reflect mean treatment outcome scores and depend on sample size and variability within and between subjects. Statistical results therefore do not provide clinicians with comprehensive methods of determining treatment outcome, utility, and response variability (Jacobson et al. 1984; Kazdin 2008). Research reports from efficacy trials might also omit information that is useful for clinical practice, such as each individual patient's response to treatment (Barlow et al. 1984; Krause 2011), the proportion of improved and recovered patients, and whether improvement was maintained long-term (Becker et al. 2011; Westen et al. 2004). Given the Task Force's suggestion that clinicians implement ESTs in their practice (Chambless et al. 1998; APA 1993), it is important to broaden efficacy trials' investigative scope so that outcome measures and results can be more applicable to clinical settings.

Suggestion #3: Efficacy Trials Should Include Measures and Outcomes to Increase External Validity and Clinical Utility

Researchers can implement some additions to efficacy trials so that their outcomes are more informative for practicing clinicians. RCTs would benefit from linking outcome measures, effect sizes, and statistical and clinical significance to real-life functioning and practical significance (Blanton and Jaccard 2006; Kazdin 2006).4,5 Academic psychologists can remain informed about investigative practices and outcome measures that will most benefit clinical settings by actively engaging in clinical work (Overholser 2007, 2012).

4 See Kazdin (2001) for a discussion regarding how outcome measures can be linked to real-world functioning.

5 Relating changes in outcome measures to real-life phenomena will also help the research field elucidate whether or not outcome measures actually map on to the real-life experiences associated with the constructs the assessments are thought to evaluate. If an outcome measure does not demonstrate external validity, then future studies can implement and examine different measures.

Clinicians might find research reports from efficacy trials more informative if each individual participant's change is described and mapped onto criteria for clinical significance. Doing so will provide information about each participant's outcome and the proportion of individuals whose improvement was clinically significant (Jacobson et al. 1984). Additionally, information indicating whether each participant's change is statistically reliable will give literature consumers information about whether observed changes are truly a result of the intervention (as opposed to chance, measurement error, or some other unknown factor; Jacobson et al. 1984).
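
A reliable change computation in the spirit of Jacobson et al. (1984) can be sketched in a few lines: a client's pre-to-post change is divided by the standard error of the difference implied by the measure's reliability, and changes beyond 1.96 in absolute value are unlikely to reflect measurement error alone. The scale statistics below (normative SD, test-retest reliability) are invented for illustration.

```python
import math

def reliable_change(pre, post, sd_norm, reliability):
    """Change in scale points divided by the standard error of the difference."""
    se_measurement = sd_norm * math.sqrt(1.0 - reliability)
    s_diff = math.sqrt(2.0) * se_measurement
    return (post - pre) / s_diff

rci = reliable_change(pre=30, post=18, sd_norm=7.5, reliability=0.90)
label = "reliable" if abs(rci) > 1.96 else "not reliable"
print(f"RCI = {rci:.2f} ({label} change)")
```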

While statistical results from efficacy trials provide nomothetic information, clinicians are often seeking to answer idiographic questions (Persons et al. 2006). Efficacy research reports might therefore benefit from including clinical case studies of some individual participants. Case studies can report on patients' outcome scores and accounts of their reactions to particular interventions. Such information can provide researchers and clinicians with insight into what specific assessment scores mean for individual patients and how they might translate into real-life functioning (Gottdiener 2011). Including research summaries and case studies in research reports has been shown to increase clinicians' interest in receiving training in ESTs and implementing them in their practice (Stewart and Chambless 2007, 2010).

Finally, researchers should consider clearly delineating participants' demographic information in efficacy trials and reports. Descriptions of individual participants' change and case reports of individuals who vary in demographic variables might allow clinicians to more easily use study results to answer idiographic questions with minority clients. Studies examining the effects of ESTs on minorities are rare (Voss Horrell 2008; Miranda et al. 2005) and have found mixed results (e.g., Markowitz et al. 2000; Miranda et al. 2006). Clearly portraying participants' demographics in each individual study might also allow future meta-analyses to aggregate the effects of interventions for minorities and improve the field's knowledge in this area.6

6 A recent meta-analysis (Griner and Smith 2006) indicated that ethnicity interacted with acculturation in that individuals with low levels of acculturation responded better to culturally adapted interventions. Researchers might therefore consider including variables such as levels of acculturation and allocentrism (in addition to common variables such as race, ethnicity, and socioeconomic background) when collecting demographic information.

Increasing communication and cooperation between researchers and clinicians can also help make ESTs more applicable to clinicians' work. Researchers and clinicians can collaborate in designing and conducting treatment outcome studies (Castonguay 2011; Lau et al. 2010). Indeed, clinicians have expressed an interest in participating in research and in helping identify outcome measures that will provide useful information for clinical practice (Garland et al. 2003; Ogrodniczuk et al. 2010). Clinicians can indicate which outcome measures would most inform their practice and can provide researchers with alternative perspectives regarding successful therapeutic techniques that have not been empirically examined and which are worth studying. As indicated earlier, researchers wishing to compare two different therapeutic interventions could employ clinicians who specialize in each intervention to help ensure treatment fidelity. Including clinicians from the community in the investigative process and allowing them to have a direct say in the type of research that is conducted might increase clinicians' confidence in research outcomes and make them more open to incorporating results in their practice.

Conclusion

Clinical psychology can enhance its ability to answer the question, "How can we best help this client?" by increasing the rigor with which it examines treatment efficacy. Researchers can employ efficacious comparison groups and take into account all the variables and factors that can affect study results in order to strengthen the evidence regarding treatment efficacy. Clinicians will find efficacy research more applicable to clinical practice if researchers clearly delineate individual participants' responses to treatment. Clinicians can engage in research design and implementation to provide the investigative field with fresh perspectives and help increase studies' applicability to clinical practice.

We live in an era of virtually limitless online storage space and incredible ease of information sharing and communication. Taken together, these factors make it more feasible than ever to increase collaboration, research transparency, and information sharing and dissemination. By taking strides towards making its investigative processes available to all, the field of clinical psychology will not only maintain research consumers' trust but will also save time and resources and successfully build a cumulative field of knowledge.

Acknowledgments I would like to thank Dr. Amy Przeworski for reviewing earlier drafts of this paper and for her insightful suggestions. I would also like to thank the two anonymous reviewers for their thoughtful and helpful comments on earlier drafts of this manuscript.

Conflict of interest The author declares no conflict of interest.

References

American Psychological Association, Publications and Communications Board Working Group on Journal Article Reporting Standards. (2008). Reporting standards for research in psychology: Why do we need them? What might they be? American Psychologist, 63(9), 839–851. doi:10.1037/0003-066X.63.9.839.
American Psychological Association, Task Force on Promotion and Dissemination of Psychological Procedures. (1993). A report adopted by the Division 12 Board. Retrieved November 1, 2011 from http://www.apa.org/divisions/div12/journals.html#ESTs.
Bakker, M., & Wicherts, J. M. (2011). The (mis)reporting of statistical results in psychology journals. Behavior Research Methods, 43, 666–678. doi:10.3758/s13428-011-0089-5.
Barlow, D. H., Hayes, S. C., & Nelson, R. O. (1984). The scientist-practitioner: Research and accountability in clinical and educational settings. New York: Pergamon Press.
Becker, K. D., Chorpita, B. F., & Daleiden, E. L. (2011). Improvement in symptoms versus functioning: How do our best treatments measure up? Administration and Policy in Mental Health and Mental Health Services Research, 38, 440–458. doi:10.1007/s10488-010-0332-x.
Blanton, H., & Jaccard, J. (2006). Arbitrary metrics in psychology. American Psychologist, 61(1), 27–41. doi:10.1037/0003-066X.61.1.27.
Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2009). Introduction to meta-analysis. Chichester: Wiley.
Castelnuovo, G. (2010). Empirically supported treatments in psychotherapy: Towards an evidence-based or evidence-biased psychology in clinical settings? Frontiers in Psychology, 1(27), 1–10. doi:10.3389/fpsyg.2010.00027.
Castonguay, L. G. (2011). Psychotherapy, psychopathology, research and practice: Pathways of connections and integration. Psychotherapy Research, 21(2), 125–140. doi:10.1080/10503307.2011.563250.
Chambless, D. L., Baker, M. J., Baucom, D. H., Beutler, L. E., Calhoun, K. S., Crits-Christoph, P., et al. (1998). Update on empirically validated therapies, II. Clinical Psychologist, 51(1), 3–16.
Chambless, D. L., Sanderson, W. C., Shoham, V., Johnson, S. B., Pope, K., Crits-Christoph, P., & McCurry, S. (1996). An update on empirically validated therapies. Retrieved November 1, 2011 from http://www.apa.org/divisions/div12/journals.html#ESTs.
Colagiuri, B. (2010). Participant expectancies in double-blind randomized placebo-controlled trials: Potential limitations to trial validity. Clinical Trials, 7(3), 246–255. doi:10.1177/1740774510367916.
Davidson, P. R., & Parker, K. C. H. (2001). Eye movement desensitization and reprocessing (EMDR): A meta-analysis. Journal of Consulting and Clinical Psychology, 69(2), 305–316. doi:10.1037/0022-006X.69.2.305.
Devilly, G. J., & Borkovec, T. D. (2000). Psychometric properties of the Credibility/Expectancy Questionnaire. Journal of Behavior Therapy and Experimental Psychiatry, 31, 73–86.
Doyen, S., Klein, O., Pichon, C., & Cleeremans, A. (2012). Behavioral priming: It's all in the mind, but whose mind? PLoS ONE, 7, e29081. doi:10.1371/journal.pone.0029081.
Fanelli, D. (2009). How many scientists fabricate and falsify research? A systematic review and meta-analysis of survey data. PLoS ONE, 4(5), 1–11. doi:10.1371/journal.pone.0005738.
Feeny, N. C., Zoellner, L. A., Mavissakalian, M. R., & Roy-Byrne, P. P. (2009). What would you choose? Sertraline or prolonged exposure in community and PTSD treatment seeking women. Depression and Anxiety, 26(8), 724–731. doi:10.1002/da.20588.
Garland, A. F., Kruse, M., & Aarons, G. A. (2003). Clinicians and outcome measurement: What's the use? The Journal of Behavioral Health Services & Research, 30(4), 393–405.
Gibbons, C. J., Fournier, J. C., Stirman, S. W., DeRubeis, R. J., Crits-Christoph, P., & Beck, A. T. (2010). The clinical effectiveness of cognitive therapy for depression in an outpatient clinic. Journal of Affective Disorders, 125, 169–176.
Gottdiener, W. H. (2011). Improving the relationship between the randomized clinical trial and real-world clinical practice. Psychotherapy, 48(3), 231–233. doi:10.1037/a0022703.
Greene, C. J., Morland, L. A., Dirkalski, V. L., & Frueh, B. C. (2008). Noninferiority and equivalence designs: Issues and implications for mental health research. Journal of Traumatic Stress, 21(5), 433–439. doi:10.1002/jts.20367.
Griner, D., & Smith, T. B. (2006). Culturally adapted mental health interventions: A meta-analytic review. Psychotherapy: Theory, Research, Practice, Training, 43(4), 531–548. doi:10.1037/0033-3204.43.4.531.
Herbert, J. D. (2003). The science and practice of empirically supported treatments. Behavior Modification, 27(3), 412–430. doi:10.1177/0145445503253836.


Howard, G. S., Hill, T. L., Maxwell, S. E., Baptista, T. M., Farias, M. H., Coelho, C., et al. (2009). What's wrong with research literatures? And how to make them right. Review of General Psychology, 13(2), 146–166. doi:10.1037/a0015319.
Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS Medicine, 2(8), 0696–0701. doi:10.1371/journal.pmed.0020124.
Jacobson, N. S., Follette, W. C., & Revenstorf, D. (1984). Psychotherapy outcome research: Methods of reporting variability and evaluating clinical significance. Behavior Therapy, 15, 336–352.
Kazdin, A. E. (2001). Almost clinically significant (p < .10): Current measures may only approach clinical significance. Clinical Psychology: Science and Practice, 8(4), 455–462.
Kazdin, A. E. (2003). Research design in clinical psychology (4th ed.). Boston: Allyn and Bacon.
Kazdin, A. E. (2006). Arbitrary metrics: Implications for identifying evidence-based treatments. American Psychologist, 61(1), 42–49. doi:10.1037/0003-066X.61.1.42.
Kazdin, A. E. (2008). Evidence-based treatment and practice: New opportunities to bridge clinical research and practice, enhance the knowledge base, and improve patient care. American Psychologist, 63(3), 146–159. doi:10.1037/0003-066X.63.3.146.
Krause, M. S. (2011). What are the fundamental facts of a comparison of two treatments' outcomes? Psychotherapy, 48(3), 234–236. doi:10.1037/a0023383.
Krause, M. S., & Lutz, W. (2009). What should be used for baselines against which to compare treatments' effectiveness? Psychotherapy Research, 19(3), 358–367. doi:10.1080/10503300902926539.
Lau, M. A., Ogrodniczuk, J., Joyce, A. S., & Sochting, I. (2010). Bridging the practitioner-scientist gap in group psychotherapy research. International Journal of Group Psychotherapy, 60(2), 177–196.
Leykin, Y., & DeRubeis, R. J. (2009). Allegiance in psychotherapy outcome research: Separating association from bias. Clinical Psychology: Science and Practice, 16(1), 54–65.
Lilienfeld, S. O. (2010). Can psychology become a science? Personality and Individual Differences, 49, 281–288. doi:10.1016/j.paid.2010.01.024.
Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. Thousand Oaks: Sage Publications.
Lohr, J. M., Olatunji, B. O., Parker, L. M., & DeMaio, C. (2005). Experimental analysis of specific treatment factors: Efficacy and practice implications. Journal of Clinical Psychology, 61, 819–834. doi:10.1002/jclp.20128.
Luborsky, L., Diguer, L., Seligman, D. A., Rosenthal, R., Krause, E. D., Johnson, S., et al. (1999). The researcher's own therapy allegiances: A "wild card" in comparisons of treatment efficacy. Clinical Psychology: Science and Practice, 6(1), 95–106.
Mahoney, M. J. (1978). Experimental methods and outcome evaluation. Journal of Consulting and Clinical Psychology, 46, 660–672.
Mahrer, A. R. (2005). What is psychotherapy for? A plausible alternative to empirically supported therapies, therapy relationships, and practice guidelines. Journal of Contemporary Psychotherapy, 35(1), 99–115. doi:10.1007/s10879-005-0806-4.
Markowitz, J. C., Spielman, L. A., Sullivan, M., & Fishman, B. (2000). An exploratory study of ethnicity and psychotherapy outcome among HIV-positive patients with depressive symptoms. Journal of Psychotherapy Practice and Research, 9(4), 226–231.
Mellers, B., Hertwig, R., & Kahneman, D. (2001). Do frequency representations eliminate conjunction effects? An exercise in adversarial collaboration. Psychological Science, 12(4), 269–275.
Michels, K. B., & Rothman, K. J. (2003). Update on unethical use of placebos in randomized trials. Bioethics, 17, 188–204.
Miranda, J., Bernal, G., Lau, A., Kohn, L., Hwang, W. C., & LaFromboise, T. (2005). State of the science on psychosocial interventions for ethnic minorities. Annual Review of Clinical Psychology, 1, 113–142. doi:10.1146/annurev.clinpsy.1.102803.143822.
Miranda, J., Green, B. L., Krupnick, J. L., Chung, J., Siddique, J., & Revicki, D. (2006). One-year outcomes of a randomized clinical trial treating depression in low-income minority women. Journal of Consulting and Clinical Psychology, 74(1), 99–111. doi:10.1037/0022-006X.74.1.99.
Nosek, B. A., Spies, J. R., & Motyl, M. (2012). Scientific utopia: II. Restructuring incentives and practices to promote truth over publishability. Perspectives on Psychological Science, 7(6), 615–631. doi:10.1177/1745691612459058.
Ogrodniczuk, J. S., Piper, W. E., Joyce, A. S., Lau, M. A., & Sochting, I. (2010). A survey of Canadian Group Psychotherapy Association members' perceptions of psychotherapy research. International Journal of Group Psychotherapy, 60(2), 159–176.
O'Leary, K. D., & Borkovec, T. D. (1978). Conceptual, methodological, and ethical problems of placebo groups in psychotherapy research. American Psychologist, 33(9), 821–830.
Overholser, J. C. (2007). The Boulder model in academia: Struggling to integrate the science and practice of psychology. Journal of Contemporary Psychotherapy, 37(4), 205–211. doi:10.1007/s10879-007-9055-z.
Overholser, J. C. (2012). Behind a thin veneer: What lurks beneath the scientist-practitioner label? Journal of Contemporary Psychotherapy, 42, 271–279. doi:10.1007/s10879-012-9211-y.
Persons, J. B., Roberts, N. A., Zalecki, C. A., & Brechwald, W. A. G. (2006). Naturalistic outcome of case formulation-driven cognitive-behavior therapy for anxious and depressed outpatients. Behaviour Research and Therapy, 44, 1041–1051. doi:10.1016/j.brat.2005.08.005.
Price, D. D., Finniss, D. G., & Benedetti, F. (2007). A comprehensive review of the placebo effect: Recent advances and current thought. Annual Review of Psychology, 59, 565–590. doi:10.1146/annurev.psych.59.113006.095941.
Rosenthal, R. (1979). The file drawer problem and tolerance for null results. Psychological Bulletin, 86(3), 638–641.
Rothman, K. J., & Michels, K. B. (1994). The continuing unethical use of placebo controls. The New England Journal of Medicine, 331(6), 394–398.
Rothstein, H., Sutton, A. J., & Borenstein, M. (Eds.). (2005). Publication bias in meta-analysis: Prevention, assessment and adjustments. Chichester: Wiley.
Shapiro, F. (1989). Eye movement desensitization: A new treatment for post-traumatic stress disorder. Journal of Behavior Therapy and Experimental Psychiatry, 20(3), 211–217.
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22(11), 1359–1366. doi:10.1177/0956797611417632.
Singer, J. D., & Willett, J. B. (2003). Applied longitudinal data analysis: Modeling change and event occurrence. New York: Oxford University Press.
Spławinski, J., & Kuzniar, J. (2004). Clinical trials: Active control vs placebo—What is ethical? Science and Engineering Ethics, 10(1), 73–79.
Spring, B., Pagoto, S., Knatterud, G., Kozak, A., & Hedeker, D. (2007). Examination of the analytic quality of behavioral health randomized clinical trials. Journal of Clinical Psychology, 63(1), 53–71. doi:10.1002/jclp.20334.
Steen, R. G. (2011). Retractions in the scientific literature: Is the incidence of research fraud increasing? Journal of Medical Ethics, 37(4), 249–253. doi:10.1136/jme.2010.040923.
Stewart, R. E., & Chambless, D. L. (2007). Does psychotherapy research inform treatment decisions in private practice? Journal of Clinical Psychology, 63(3), 267–281. doi:10.1002/jclp.20347.


Stewart, R. E., & Chambless, D. L. (2010). Interesting practitioners in training in empirically supported treatments: Research reviews versus case studies. Journal of Clinical Psychology, 66(1), 73–95. doi:10.1002/jclp.20630.
Streiner, D. L. (2003). Unicorns do exist: A tutorial on "proving" the null hypothesis. Canadian Journal of Psychiatry, 48(11), 756–761.
Voss Horrell, S. C. (2008). Effectiveness of cognitive-behavioral therapy with adult ethnic minority clients: A review. Professional Psychology: Research and Practice, 39(2), 160–168. doi:10.1037/0735-7028.39.2.160.
Westen, D., Novotny, C. M., & Thompson-Brenner, H. (2004). The empirical status of empirically supported psychotherapies: Assumptions, findings, and reporting in controlled clinical trials. Psychological Bulletin, 130(4), 631–663. doi:10.1037/0033-2909.130.4.631.
World Medical Association. (2008). Declaration of Helsinki. Amended by the 59th WMA General Assembly, Seoul, Korea, October 2008. Retrieved from http://www.wma.net/en/30publications/10policies/b3/.
