Empirically Supported Treatments and Efficacy Trials: What Steps Do We Still Need to Take?
Jessica D. Nasser
Department of Psychological Sciences, Case Western Reserve University, Cleveland, OH, USA
J Contemp Psychother (2013) 43:141–149
DOI 10.1007/s10879-013-9236-x
Published online: 7 April 2013
© Springer Science+Business Media New York 2013
Abstract The Task Force on the Promotion and Dissemi-
nation of Psychological Procedures sought to identify scien-
tifically supported treatments in order to espouse their use and
improve client outcomes in therapy. Nevertheless, the gap
between scientists and practitioners persists, and there still
remain some limitations to the manner in which this goal is
carried out. The criteria specifying whether a treatment
qualifies as empirically supported are too lenient. The research
used in the search for empirically supported treatments does
not take into account the full literature base. Efficacy trials
provide practitioners with limited information. This paper
proposes means through which the field can improve its search
for scientifically supported treatments. Alterations to the cri-
teria that assess empirically supported treatments, greater
research transparency and external validity, and collaboration
between investigators and clinicians will allow the field of
clinical psychology to better answer the question, "How can
we most successfully treat this client?"
Keywords Clinical utility · Empirically supported
treatments · Efficacy studies · External validity ·
Meta-analyses · Research transparency · Scientist-practitioner
Division 12 (Clinical Psychology) of the American Psy-
chological Association (APA) established the Task Force
on the Promotion and Dissemination of Psychological
Procedures (hereinafter referred to as the Task Force) to
identify and disseminate Empirically Validated Treatments
(subsequently termed Empirically Supported Treatments
[ESTs]; Chambless et al. 1998, 1996; APA 1993). The
Task Force’s objectives (e.g., increasing public awareness
of psychotherapies’ efficacy, ensuring that clinicians
implement scientifically-based interventions, and improv-
ing treatment outcomes in psychotherapy) are laudable
goals the field should continue to pursue. The manner in
which ESTs are identified and the type of information they
provide, however, still contain drawbacks that need to be
addressed. Additionally, despite the Task Force’s efforts,
the gap between scientists and practitioners remains wide,
and individuals in need of treatment often receive inter-
ventions that are not scientifically based (Lilienfeld 2010).
Concerns regarding the EST movement are examined, and
subsequent suggestions are provided that will both enhance
the investigation of scientifically-based treatments and help
decrease the scientist-practitioner gap.
Concern #1: The EST Criteria Include the Use
of Wait-List Groups and Pill and Psychological
Placebos as Controls
The criteria used to identify ESTs are too lenient and allow
any intervention—whether it merely produces a placebo
effect or adds an irrelevant component to an already
established therapeutic intervention—to qualify as an EST
(e.g., Eye Movement Desensitization and Reprocessing, or EMDR;
Davidson and Parker 2001; Shapiro 1989). Therapies
demonstrating statistical superiority to wait-list controls in
two different experiments qualify as "probably efficacious"
(Chambless et al. 1998), and such criteria can result
in the proliferation of interventions that are merely more
helpful than doing nothing. Evidence indicating that
therapy works better than no treatment does not provide
clinicians with information that is applicable to their
practice. If two individual studies examine different ther-
apeutic interventions and demonstrate that each treatment
is more beneficial than doing nothing, practitioners must
still grapple with deciding which of the two treatments will
most benefit their client.
The EST criteria also indicate that interventions can be
considered "well-established" if they demonstrate signifi-
cant superiority to medication placebo or intervention pla-
cebo in two studies conducted by different research teams
(Chambless et al. 1998). This is despite the fact that inves-
tigators have often expressed concerns regarding the use of
pill and psychological placebos (e.g., Mahoney 1978;
O’Leary and Borkovec 1978). Recent studies have strength-
ened the basis for this concern by demonstrating that more
patients prefer individual therapy to medication when given
the choice (e.g., Feeny et al. 2009). Thus, participants who are
randomly assigned to psychotherapy conditions might be more
satisfied with their group placement and more likely to believe
they will improve when compared to participants who are
assigned to pill placebo conditions. Disparities in participants’
satisfaction and expectations might make it easier to find a
significant difference between groups in favor of therapy.
Implementing psychological placebo groups in random-
ized controlled trials (RCTs) can threaten a study’s construct
validity (Kazdin 2003). Successfully executing psychological
placebo conditions in behavioral RCTs is difficult since both
participants and researchers are conscious of the treatment
they are receiving and providing, respectively (Castelnuovo
2010). Administering clinicians might believe in the active
therapy’s positive effects, be biased against the placebo con-
dition, and unwittingly affect the manner in which each is
delivered. For instance, the administering clinician might
unconsciously act in such a way as to convey to participants
whether or not they should be improving. Indeed, a recent
experimental study suggests that, in instances in which
experimenters’ expectations regarding outcome are congruent
with the experimental condition, experimenters can unwit-
tingly influence participants’ behavior (Doyen et al. 2012).
Participants’ beliefs about whether they have been
assigned to the treatment or placebo condition can also
threaten a study’s construct validity. Patients might not
expect to improve if they believe they are in the placebo
group, and vice versa (Colagiuri 2010; Kazdin 2003). Even if
participants cannot determine their group placement, pla-
cebo groups lack active therapeutic components and might
eventually discourage participants who sense no improve-
ment. Disheartened participants might subsequently develop
aversion towards and negative expectations regarding the
placebo group or their own ability to change, in turn mag-
nifying the observed differences between the placebo and
treatment groups (O’Leary and Borkovec 1978).
Most importantly, when efficacious interventions for a
given disorder are known to exist, providing participants
with a non-therapeutic placebo intervention in an RCT is
ethically questionable (Michels and Rothman 2003; World
Medical Association 2008). Participants in placebo groups
do not receive the treatment they need to address their
psychopathology (Kazdin 2003; O’Leary and Borkovec
1978), and their symptoms might worsen. Participants in
placebo groups might also blame themselves for their lack
of improvement and begin believing they are "hopeless."
Increased distress in individuals who are already in need of
psychological interventions can exacerbate their psycho-
pathology and be potentially harmful.
Finally, pill and psychological placebo comparison
groups do not provide clinicians with the knowledge nee-
ded to make informed decisions about treatment choice.
Clinicians are not trying to decide between implementing
weekly therapy sessions and providing sugar pills or psy-
chological placebos; the latter two are irrelevant in clinical
practice. Instead, clinicians might be interested in whether
or not a novel treatment is superior to the one they are
currently employing. As is the case for wait-list control
comparison groups, it is easier to find significant differ-
ences when comparing a treatment to a placebo than when
comparing it to an active and efficacious intervention
(Rothman and Michels 1994). Placebo control groups can
thus lead to an infinite list of "efficacious" treatments
without providing knowledge about which, amongst all
interventions, is superior.
Suggestion #1: Only Active Comparison Groups
Should be Used to Determine EST Status
Given the limited information and possible complications
that arise from pill and psychological placebos, investigators
should compare novel therapies to the most efficacious
treatment currently available. Active and efficacious com-
parison groups offer a legitimate test of a therapy’s efficacy
(as opposed to a placebo or a diluted version of therapy that is
thought, from the beginning, not to work; Krause and Lutz
2009) and can provide an efficient means of narrowing down
the list of ESTs (Castelnuovo 2010). Additionally, recent
statistical advances suggest that comparing two potentially
equivalent interventions is more feasible than previously
believed (e.g., Greene et al. 2008; Spławinski and Kuzniar
2004; Streiner 2003).
In addition to providing a more stringent test of a therapy’s
efficacy, active comparison groups also solve the ethical
dilemma concerning delaying treatment for individuals who
need therapy. Although wait-list and placebo groups control
for nonspecific factors such as participant expectancies and
natural symptom remission (Mahoney 1978; Lohr et al. 2005),
such scientific goals should not trump the ethical obligation
that researchers have to their participants (Michels and
Rothman 2003). Additionally, wait-list and placebo control
groups are not the only available means to account for non-
specific factors. Participant expectancies, for instance, can be
assessed via simple inquiries (e.g., "What do you expect from
this treatment?"; Price et al. 2007, p. 583) or through ques-
tionnaires (e.g., The Credibility/Expectancy Questionnaire;
Devilly and Borkovec 2000). Participant preference for
intervention type (e.g., psychotherapy versus medication) can
be assessed with single questions about the type of treatment
they would select if given the option (e.g., "If you had a choice
between individual therapy, medication, or no treatment to
help you with [your symptoms], which would you choose?";
Feeny et al. 2009, p. 726). Differences in participant expec-
tancies or preferences can subsequently be accounted for
during group randomization or throughout statistical analyses
(e.g., as covariates or possible moderators).
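One design-stage way to account for such preferences, alongside covariate adjustment, is stratified randomization: participants are grouped by their stated preference and then randomized within each stratum so that the study arms stay balanced on that factor. The sketch below is illustrative only; the participant records, field names, and arm labels are hypothetical and are not drawn from any cited study.

```python
import random
from collections import defaultdict

def stratified_assignment(participants, stratum_key,
                          arms=("therapy", "medication"), seed=0):
    """Randomize within strata (e.g., stated treatment preference)
    so that each arm receives a balanced share of every stratum."""
    by_stratum = defaultdict(list)
    for p in participants:
        by_stratum[p[stratum_key]].append(p)
    rng = random.Random(seed)  # fixed seed only so this sketch is reproducible
    assignment = {}
    for group in by_stratum.values():
        rng.shuffle(group)                      # random order within the stratum
        for i, p in enumerate(group):
            assignment[p["id"]] = arms[i % len(arms)]  # alternate arms -> balance
    return assignment

# Hypothetical participants, half of whom prefer therapy and half medication
participants = [
    {"id": n, "prefers": "therapy" if n < 4 else "medication"}
    for n in range(8)
]
assignment = stratified_assignment(participants, "prefers")
```

Because arms alternate within each shuffled stratum, each preference group contributes equally to both conditions, so between-group differences cannot be driven by disparities in preference.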
RCTs comparing two active psychotherapeutic inter-
ventions also have the benefit of providing therapists with
information that is applicable to clinical practice. Studies
with active comparison groups demonstrate which of two
treatments is superior and indicate whether distinct treat-
ments lead to differential outcomes. For instance, knowing
that therapy Y decreases anxiety more so than therapy Z,
while therapy Z increases positive affect more so than
therapy Y, will allow clinicians to implement the treatment
best suited to each client. (For example, a highly anxious,
but not depressed, client would benefit from therapy Y
rather than therapy Z.)
Comparing active psychotherapy to active pharmaco-
therapy provides clinicians with different, yet equally
important, information. Studies demonstrating whether
therapy works better than medication, whether medication
works better than therapy, or whether a combination of
both therapy and medication is superior to either in isola-
tion can help ensure that clients receive the most beneficial
intervention possible. If the therapy in question is less
beneficial than medication, then clinicians can seek a dif-
ferent treatment that is superior to medication. If medica-
tion is superior to all therapeutic interventions, therapists
can suggest pharmacotherapy to their patients. If combin-
ing therapy and medication proves the most beneficial, then
clinicians can suggest that clients concurrently engage in
psychotherapy and medication management.
Active comparison groups, however, are not without their
drawbacks. One potential weakness is their inability to
provide information regarding the incremental and specific
efficacy of individual components. Component control
studies can provide such information (Lohr et al. 2005;
O’Leary and Borkovec 1978) but might be ethically tenuous
since participants are provided with a potentially less-effi-
cacious intervention. If such studies are implemented, then
researchers should ensure that they provide participants in
the dismantled arm with the missing component if it is found
to have incremental efficacy after study completion.
RCTs employing two active treatments are also subject
to factors that can negatively affect construct validity.
"Allegiance effects" refer to the fact that researchers'
favored treatments usually demonstrate superior outcomes
(Luborsky et al. 1999). This phenomenon might occur if
investigators are more knowledgeable about the preferred
intervention and provide the study’s clinicians with better
training on that treatment (Leykin and DeRubeis 2009). In
studies where evaluators are not blinded, allegiance effects
might also occur if researchers incorrectly (but inadver-
tently) assess individuals in a way that confirms their
hypotheses. Indeed, in one experimental study in which
experimenters’ expectations were incongruent with the
experimental condition, experimenters tended to inaccu-
rately assess participant behavior so as to align results with
their expectations (Doyen et al. 2012).
In order to reduce possible allegiance effects, investigator
teams with different theoretical orientations can collaborate
on studies to compare their respective interventions (Leykin
and DeRubeis 2009; Luborsky et al. 1999; Mellers et al. 2001).
Not only will such collaboration help minimize allegiance
effects (Leykin and DeRubeis 2009), but it will also pool
investigators’ resources so that larger studies can be imple-
mented and more intricate and detailed data analyses can be
conducted. Carrying out multiple levels of analyses, for
instance, can provide information on whether therapeutic
techniques lead to distinct outcomes or degrees of efficacy
with different individuals (Singer and Willett 2003). Such
studies would allow researchers to take into account the dif-
ferential effects of therapeutic techniques on subgroups within
larger homogenous groups (e.g., individuals with Generalized
Anxiety Disorder who differ in age of onset or in ethnicity).
Alternatively, when collaboration between research teams is
not feasible, investigators can ensure treatment fidelity by
employing experts (either researchers or clinicians) who
specialize in the treatments under examination for their study
(Leykin and DeRubeis 2009; Luborsky et al. 1999).
Concern #2: The EST Criteria Do Not Account
for All the Literature
The EST criteria disregard negative findings and only
consider individual studies with positive outcomes. The
criteria also lack provisions for removing treatments from
the list of ESTs (Castelnuovo 2010). Thus, a treatment that
is shown to produce positive effects in two individual
studies can qualify as empirically supported regardless of
whether other studies have found negative effects. Such
criteria are problematic if different studies provide mixed
results regarding a treatment’s efficacy, or if an interven-
tion initially demonstrates efficacy but subsequently does
not work in more rigorous trials (Herbert 2003).
Given their singular focus on positive outcomes, the EST
criteria do not account for studies that commit Type II errors.
Methodologically rigorous studies with insufficient statisti-
cal power (due to small sample sizes that often constitute
individual studies) can fail to detect veridical effects when
the intervention is indeed efficacious (Borenstein et al.
2009). Although meta-analyses render Type II errors for
overall effects less likely by aggregating participants across
primary studies (Borenstein et al. 2009), the EST criteria do
not consider meta-analytic results. Failing to take into
account the entire literature might make it difficult for
researchers and clinicians to rely on the list of ESTs.
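The power argument above can be made concrete with a small sketch of fixed-effect, inverse-variance pooling, a standard way meta-analyses combine study-level effect sizes (Borenstein et al. 2009). The numbers below are invented for illustration: three small trials whose individual z statistics all fall short of 1.96 (each a potential Type II error), yet whose pooled estimate is clearly significant.

```python
import math

def pooled_effect(effects, variances):
    """Fixed-effect inverse-variance pooling of study effect sizes.

    Each study is weighted by 1/variance; the pooled estimate's
    variance shrinks as studies are added, which is why an aggregate
    analysis can detect effects that individual underpowered studies
    miss."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    pooled_var = 1.0 / sum(weights)            # smaller than any single study's variance
    se = math.sqrt(pooled_var)
    z = pooled / se
    return pooled, se, z

# Three hypothetical small trials (Cohen's d and its sampling variance)
effects = [0.30, 0.25, 0.35]
variances = [0.04, 0.05, 0.045]
d, se, z = pooled_effect(effects, variances)   # pooled d ~ 0.30, z ~ 2.47 (> 1.96)
```

Individually, each study's z is 1.65 or below and would be read as a null result; pooled, the same data yield a significant overall effect, which is the sense in which aggregation guards against Type II errors.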
Suggestion #2: Aggregate Results in a Transparent
Research Context Should Determine EST Criteria
Studies with both significant and nonsignificant results
should be taken into account when determining EST status
(Herbert 2003). Ultimately, meta-analytic results should be
the final arbiter in determining treatment efficacy. Sub-
sequent meta-analyses that include a wider range of studies
could be used to challenge the results of older meta-anal-
yses. Using meta-analyses to establish whether treatments
are empirically supported will also provide a method of
updating the list of ESTs (e.g., if a meta-analysis concludes
that an intervention is not efficacious, then it can be
removed from the list of ESTs). The benefits of using meta-
analytic results to ultimately determine EST status will be
maximized if some modifications—described below—are
made to the research and publication process.
The validity of meta-analytic results is contingent upon
various factors. First, meta-analyses are only as valid as the
individual studies they include (Lipsey and Wilson 2001).
Although the EST criteria call for "good" study designs
(APA 1993), guidelines regarding acceptable research
methodology are not specified. Studies used to determine
EST status can therefore vary greatly in methodological
rigor (Herbert 2003) and reporting accuracy. RCTs in top-
tier journals (e.g., New England Journal of Medicine), for
instance, have been shown to contain problems with their
analytic quality such as failing to identify primary out-
comes, provide justification for estimating study size, and
account for missing data (Spring et al. 2007). Statistical
reanalysis of published psychology papers has revealed
some calculation and reporting errors (Bakker and Wich-
erts 2011). In a random sample of 281 psychology articles,
15 and 18 % of statistical results were incorrectly
calculated and reported, respectively (Bakker and Wicherts
2011). If a meta-analysis were to use the t and F tests (with
one df in the numerator) from such studies to compare the
difference between two groups, its summary effect would
differ by a Cohen's d of 0.17 from that of a meta-analysis
whose aggregate studies did not include such errors
(Bakker and Wicherts 2011).
Given that meta-analyses are likely to include studies with
methodological limitations, meta-analyses looking to sum-
marize the research regarding treatment efficacy will do well
to consider individual studies’ methodological features and
quality and their effects on meta-analytic outcomes. For
instance, meta-analyses can examine the independent and
combined effect of each methodological feature on summary
effect sizes. This in turn will indicate the extent to which
meta-analytic results accurately represent the strength of the
examined relationship (Lipsey and Wilson 2001). Meta-
analyses should also examine the effects of confounding
variables on summary effect sizes. So doing will ensure that
results correctly reflect the factors of interest and will also
provide the opportunity to elucidate which treatments work
for whom, under what circumstances, and with what out-
comes (Lipsey and Wilson 2001).
A second factor that can also affect the validity of meta-
analyses is the "file-drawer" phenomenon (Rosenthal 1979).
This phenomenon suggests that published studies might not
represent all the research investigators have conducted.
Meta-analytic results thus run the risk of only reflecting
studies with positive findings and ignoring those with non-
significant outcomes. Multiple factors contribute to the file-
drawer problem. Studies with larger effect sizes and signif-
icant results are more likely to be published than are those
with smaller effect sizes and non-significant results
(Borenstein et al. 2009). Reviewers often prefer clean
research study presentations (Simmons et al. 2011) and are
unlikely to accept studies with nonsignificant findings
(Krause 2011; Rosenthal 1979). Additionally, researchers
might not pursue publication if a study’s findings are nega-
tive (Howard et al. 2009; Krause 2011) or undesired (Leykin
and DeRubeis 2009). When investigators conduct meta-
analyses, it is not always possible to find and acquire all the
research that has been carried out but not published (Roth-
stein et al. 2005). Meta-analyses whose aim is to determine
whether an intervention is indeed efficacious should there-
fore include analytic methods that assess the potential impact
of missing studies on meta-analytic outcomes in order to
determine the certainty with which aggregate results can be
interpreted (see Rothstein et al. 2005 for detailed descrip-
tions regarding such analyses).
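One of the simplest such analyses is Rosenthal's (1979) fail-safe N, which asks how many unretrieved studies averaging a null result would have to be sitting in file drawers before the combined significance of the available studies disappeared. A minimal sketch, using invented z scores for four hypothetical published studies:

```python
def fail_safe_n(z_scores, z_crit=1.645):
    """Rosenthal's (1979) fail-safe N.

    Under Stouffer's method the combined z is sum(z) / sqrt(k + X),
    where k is the number of observed studies and X the number of
    hidden null studies (z = 0). Solving for the X that drags the
    combined z down to the one-tailed .05 criterion gives:
        X = (sum(z))^2 / z_crit^2 - k
    """
    s = sum(z_scores)
    return (s * s) / (z_crit * z_crit) - len(z_scores)

# Invented z scores for four published studies
n_missing = fail_safe_n([2.1, 1.8, 2.5, 1.2])  # ~17 null studies needed
```

A small fail-safe N means a handful of unpublished null results could overturn the aggregate conclusion; a large one suggests the pooled effect is robust to the file-drawer problem. More sophisticated alternatives (e.g., those collected in Rothstein et al. 2005) model the selection process directly.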
Examining the effect of both methodology (Lipsey and
Wilson 2001) and publication bias (Borenstein et al. 2009;
Rothstein et al. 2005) on meta-analytic outcomes and com-
paring results to analyses in which such factors are not taken
into account will provide researchers and clinicians with
information regarding the validity of aggregate findings.
Using the entire literature to determine EST status as
opposed to a few select studies will increase clinicians’
confidence in research-based interventions. Additionally,
given the difficulty clinicians may face when trying to keep
up with the numerous research articles that are published
each month (Herbert 2003), clinicians might find it easier to
examine empirical findings if they are summarized in com-
prehensive meta-analyses.1
Changes to the research and publication process that
allow and encourage transparency can also improve the
validity of meta-analytic results. Reviewers could be more
accepting of mixed results and emphasize full disclosure of
variables and outcome measures over tidy study packages
and significant effects (Simmons et al. 2011).2 Basing
publication decisions on methodology and transparency
will increase the number of quality studies available and
help ensure that a greater range of pertinent information is
taken into account when meta-analyses examine interven-
tion efficacy. Reviewers can also encourage authors to
adhere to the comprehensive Journal Article Reporting
Standards (JARS) or the Meta-Analysis Reporting Stan-
dards (MARS) so that manuscripts can be more easily
evaluated and included in future meta-analyses (APA
2008). Rigorous methodological and reporting standards
will give clinicians and other research consumers greater
confidence in EST studies’ findings.
Finally, registering unpublished studies and findings
online can also help decrease the file-drawer problem and
improve the validity of meta-analytic results. Current
websites such as Figshare (http://figshare.com/) and Open
Science Framework (http://openscienceframework.org/;
Nosek et al. 2012) allow researchers to upload and share
study information. To encourage researchers to log
unpublished studies’ results, investigators should list such
studies in their curricula vitae (CVs). After all, regardless
of whether studies are published, investigators will be
contributing to the field’s knowledge base by making their
studies and results accessible. The availability of unpub-
lished studies will increase the accuracy of meta-analyses
and will reduce the costs associated with unnecessary study
replication that might occur when unpublished studies
remain unknown.
The changes indicated above will facilitate compre-
hensiveness and transparency in therapy efficacy research.
Clinicians, investigators, and the community at large will
more easily rely on empirical findings. Moving towards an
inclusive and transparent research model will help clinical
psychology retain its credibility during the present era in
which research practices throughout scientific fields are
being questioned (e.g., Fanelli 2009; Ioannidis 2005; Steen
2011).3
Concern #3: ESTs Identified Via Efficacy Trials
Lack External Validity and Clinical Utility
ESTs are identified through efficacy trials in highly con-
trolled clinical research settings (Chambless et al. 1998;
Lau et al. 2010). Although studies suggest that therapy
conducted in RCT efficacy trials is equally beneficial when
conducted in clinical settings (e.g., Gibbons et al. 2010),
outcome measures in such studies are often limited to
Diagnostic and Statistical Manual of Mental Disorders
(DSM) symptoms (e.g., depression scores, rate of substance
use). It is therefore not clear whether ESTs improve self-
esteem, quality of life, interpersonal and work functioning,
coping with multiple stressors and life demands, and sim-
ilar phenomena that are often important to clients seeking
psychotherapy (Kazdin 2008; Mahrer 2005; Overholser
2007). Indeed, out of 435 RCTs examining therapeutic
interventions, only 18.8 % included evidence for reduction
in functional impairment (Becker et al. 2011).
Similarly, reported statistical results such as statistical
significance and effect size are not always indicative of
improvement in everyday functioning (Kazdin 2008).
Statistical results often reflect mean treatment outcome
scores and depend on sample size and variability within
and between subjects. Statistical results therefore do not
provide clinicians with comprehensive methods of determining
treatment outcome, utility, and response variability
(Jacobson et al. 1984; Kazdin 2008). Research reports from
efficacy trials might also omit information that is useful for
clinical practice, such as each individual patient's response
to treatment (Barlow et al. 1984; Krause 2011), the proportion
of improved and recovered patients, and whether
improvement was maintained long-term (Becker et al. 2011;
Westen et al. 2004). Given the Task Force's suggestion
that clinicians implement ESTs in their practice
(Chambless et al. 1998; APA 1993), it is important to
broaden efficacy trials' investigative scope so that outcome
measures and results can be more applicable to clinical
settings.

Footnote 1: Clinicians might not have access to large databases that
often accompany academic posts. Investigators can help make research
articles more accessible to clinicians by posting word-processing copies
of manuscripts on their websites whenever possible. As of 2008, PubMed
Central has mandated that copies of articles written with the help of
NIH grants be made available to the public for free
(http://grants.nih.gov/grants/guide/notice-files/NOT-OD-08-033.html).
In cases in which studies are not funded by the NIH (and therefore not
publicly available on PubMed Central), authors can strive to publish
their manuscripts in journals whose publishing company allows them to
post a copy of the manuscript on their professional website (e.g., APA
[http://www.apa.org/pubs/authors/posting.aspx], Elsevier
[http://www.elsevier.com/wps/find/authorsview.authors/preprints], and
Springer journals).

Footnote 2: Some individuals have suggested that publishing null results
might decrease a journal's status (e.g., Nosek et al. 2012). However, if
a journal primarily accepts studies that rigorously apply the scientific
process and provide transparent methodology, there should be no reason
for the journal's status to be marred. On the contrary: publishing
rigorous and transparent studies might increase a journal's status.
Literature consumers and researchers might be more likely to read, cite,
and trust journal articles with methodologically rigorous and
transparent studies than journal articles whose studies lack such
qualities, regardless of whether results are significant.

Footnote 3: The APA recently introduced a new journal, Archives of
Scientific Psychology (http://www.apa.org/pubs/journals/arc/index.aspx),
which is in line with some of the suggestions presented in this paper.
Archives of Scientific Psychology strives to follow a transparent and
accessible model of research. The journal's articles are open to the
public at no cost (authors pay publication fees). Additionally, authors
complete JARS or MARS (APA 2008) criteria and make their data available
for others to use.
Suggestion #3: Efficacy Trials Should Include Measures
and Outcomes to Increase External Validity
and Clinical Utility
Researchers can implement some additions to efficacy tri-
als so that their outcomes are more informative for prac-
ticing clinicians. RCTs would benefit from linking outcome
measures, effect sizes, and statistical and clinical signifi-
cance to real-life functioning and practical significance
(Blanton and Jaccard 2006; Kazdin 2006).4,5 Academic
psychologists can remain informed about investigative
practices and outcome measures that will most benefit
clinical settings by actively engaging in clinical work
(Overholser 2007, 2012).
Clinicians might find research reports from efficacy
trials more informative if each individual participant’s
change is described and mapped onto criteria for clinical
significance. Doing so will provide information about each
participant’s outcome and the proportion of individuals
whose improvement was clinically significant (Jacobson
et al. 1984). Additionally, information indicating whether
each participant’s change is statistically reliable will give
literature consumers information about whether observed
changes are truly a result of the intervention (as opposed to
chance, measurement error, or some other unknown factor;
Jacobson et al. 1984).
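The reliable change criterion described above is commonly operationalized as a reliable change index in the tradition of Jacobson and colleagues (Jacobson et al. 1984; later formalized by Jacobson and Truax 1991): a participant's pre-to-post difference is scaled by the standard error of the difference score, so that values beyond ±1.96 are unlikely to reflect measurement error alone. A sketch with invented scores (the measure, SD, and reliability values below are hypothetical):

```python
import math

def reliable_change_index(pre, post, sd_pre, reliability):
    """Reliable change index in the Jacobson tradition.

    Is an individual's pre-to-post change larger than measurement
    error alone would plausibly produce? |RCI| > 1.96 is the usual
    criterion for statistically reliable change."""
    se = sd_pre * math.sqrt(1.0 - reliability)  # standard error of measurement
    s_diff = math.sqrt(2.0 * se * se)           # SE of the difference score
    return (post - pre) / s_diff

# Hypothetical symptom scores (lower = better), scale SD = 10, test-retest r = .90
rci = reliable_change_index(pre=30, post=18, sd_pre=10, reliability=0.90)
reliable = abs(rci) > 1.96                      # RCI ~ -2.68: reliable improvement
```

Reporting each participant's RCI alongside a clinical-significance cutoff (e.g., ending treatment closer to the functional than the dysfunctional distribution) yields exactly the per-participant information the paragraph above calls for.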
While statistical results from efficacy trials provide
nomothetic information, clinicians are often seeking to
answer idiographic questions (Persons et al. 2006). Effi-
cacy research reports might therefore benefit from includ-
ing clinical case studies of some individual participants.
Case studies can report on the patient’s outcome scores and
accounts of their reactions to particular interventions. Such
information can provide researchers and clinicians with
insight into what specific assessment scores mean for
individual patients and how they might translate into real-
life functioning (Gottdiener 2011). Including research
summaries and case studies in research reports has been
shown to increase clinicians’ interest in receiving training
in ESTs and implementing them in their practice (Stewart
and Chambless 2007, 2010).
Finally, researchers should consider clearly delineating
participants’ demographic information in efficacy trials and
reports. Descriptions of individual participants’ change and
case reports of individuals who vary in demographic
variables might allow clinicians to more easily use study
results to answer idiographic questions with minority cli-
ents. Studies examining the effects of ESTs on minorities
are rare (Voss Horrell 2008; Miranda et al. 2005) and have
found mixed results (e.g., Markowitz et al. 2000; Miranda
et al. 2006). Clearly portraying participants’ demographics
in each individual study might also allow future meta-
analyses to aggregate the effects of interventions for
minorities and improve the field’s knowledge in this area.6
Increasing communication and cooperation between researchers and clinicians can also help make ESTs more applicable to clinicians' work. Researchers and clinicians can collaborate in designing and conducting treatment outcome studies (Castonguay 2011; Lau et al. 2010). Indeed, clinicians have expressed an interest in participating in research and in helping identify outcome measures that will provide useful information for clinical practice (Garland et al. 2003; Ogrodniczuk et al. 2010). Clinicians can indicate which outcome measures would most inform their practice and can offer researchers alternative perspectives on successful therapeutic techniques that have not yet been empirically examined and are worth studying. As indicated earlier, researchers wishing to compare two different therapeutic interventions could employ clinicians who specialize in each intervention to help ensure treatment fidelity. Including clinicians from the community in the investigative process and allowing them a direct say in the type of research that is conducted
might increase clinicians’ confidence in research outcomes4 See Kazdin (2001) for a discussion regarding how outcome
measures can be linked to real-world functioning.5 Relating changes in outcome measures to real-life phenomena will
also help the research field elucidate whether or not outcome
measures actually map on to the real-life experiences associated with
the constructs the assessments are thought to evaluate. If an outcome
measure does not demonstrate external validity, then future studies
can implement and examine different measures.
6 A recent meta-analysis (Griner and Smith 2006) indicated that
ethnicity interacted with acculturation in that individuals with low
levels of acculturation responded better to culturally adapted inter-
ventions. Researchers might therefore consider including variables
such as levels of acculturation and allocentrism (in addition to
common variables such as race, ethnicity, and socioeconomic
background) when collecting demographic information.
146 J Contemp Psychother (2013) 43:141–149
123
and make them more open to incorporating results in their
practice.
Conclusion
Clinical psychology can enhance its ability to answer the question, "How can we best help this client?" by increasing the rigor with which it examines treatment efficacy. Researchers can employ efficacious comparison groups and take into account all the variables and factors that can affect study results in order to strengthen the evidence regarding treatment efficacy. Clinicians will find efficacy research more applicable to clinical practice if researchers clearly delineate individual participants' responses to treatment. Clinicians can engage in research design and implementation to provide the investigative field with fresh perspectives and help increase studies' applicability to clinical practice.

We live in an era of virtually limitless online storage space and remarkable ease of information sharing and communication. Together, these factors make it more feasible than ever to increase collaboration, research transparency, and information sharing and dissemination. By taking strides toward making its investigative processes available to all, the field of clinical psychology will not only maintain research consumers' trust but also save time and resources and successfully build a cumulative field of knowledge.
Acknowledgments I would like to thank Dr. Amy Przeworski for
reviewing earlier drafts of this paper and for her insightful sugges-
tions. I would like to thank the two anonymous reviewers for their
thoughtful and helpful comments on earlier drafts of this manuscript.
Conflict of interest The author declares that the author has no
conflict of interest.
References
American Psychological Association, Publications and Communica-
tions Board Working Group on Journal Article Reporting
Standards. (2008). Reporting standards for research in psychol-
ogy: Why do we need them? What might they be? American
Psychologist, 63(9), 839–851. doi:10.1037/0003-066X.63.9.839.
American Psychological Association, Task Force on Promotion and
Dissemination of Psychological Procedures. (1993). A report
adopted by the division 12 board. Retrieved November 1, 2011
from http://www.apa.org/divisions/div12/journals.html#ESTs.
Bakker, M., & Wicherts, J. M. (2011). The (mis)reporting of statistical results in psychology journals. Behavior Research Methods, 43, 666–678. doi:10.3758/s13428-011-0089-5.
Barlow, D. H., Hayes, S. C., & Nelson, R. O. (1984). The scientist-
practitioner: Research and accountability in clinical and
educational settings. New York: Pergamon Press.
Becker, K. D., Chorpita, B. F., & Daleiden, E. L. (2011). Improve-
ment in symptoms versus functioning: How do our best
treatments measure up? Administration and Policy in Mental
Health and Mental Health Services Research, 38, 440–458.
doi:10.1007/s10488-010-0332-x.
Blanton, H., & Jaccard, J. (2006). Arbitrary metrics in psychology.
American Psychologist, 61(1), 27–41. doi:10.1037/0003-066X.
61.1.27.
Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R.
(2009). Introduction to meta-analysis. Chichester: Wiley.
Castelnuovo, G. (2010). Empirically supported treatments in psycho-
therapy: Towards an evidence-based or evidence-biased psy-
chology in clinical settings? Frontiers in Psychology, 1(27),
1–10. doi:10.3389/fpsyg.2010.00027.
Castonguay, L. G. (2011). Psychotherapy, psychopathology, research
and practice: Pathways of connections and integration. Psycho-
therapy Research, 21(2), 125–140. doi:10.1080/10503307.2011.
563250.
Chambless, D. L., Baker, M. J., Baucom, D. H., Beutler, L. E., Calhoun, K. S., Crits-Christoph, P., et al. (1998). Update on empirically validated therapies, II. The Clinical Psychologist, 51(1), 3–16.
Chambless, D. L., Sanderson, W. C., Shoham, V., Johnson, S. B.,
Pope, K., Crits-Christoph, P., & McCurry, S. (1996). An update
on empirically validated therapies. Retrieved November 1, 2011
from http://www.apa.org/divisions/div12/journals.html#ESTs.
Colagiuri, B. (2010). Participant expectancies in double-blind
randomized placebo-controlled trials: Potential limitations to
trial validity. Clinical Trials, 7(3), 246–255. doi:10.1177/174077
4510367916.
Davidson, P. R., & Parker, K. C. H. (2001). Eye movement desensitization and reprocessing (EMDR): A meta-analysis. Journal of Consulting and Clinical Psychology, 69(2), 305–316. doi:10.1037/0022-006X.69.2.305.
Devilly, G. J., & Borkovec, T. D. (2000). Psychometric properties of
the credibility/expectancy questionnaire. Journal of Behavior
Therapy and Experimental Psychiatry, 31, 73–86.
Doyen, S., Klein, O., Pichon, C., & Cleeremans, A. (2012). Behavioral priming: It's all in the mind, but whose mind? PLoS ONE, 7, e29081. doi:10.1371/journal.pone.0029081.
Fanelli, D. (2009). How many scientists fabricate and falsify
research? A systematic review and meta-analysis of survey data.
PLoS ONE, 4(5), 1–11. doi:10.1371/journal.pone.0005738.
Feeny, N. C., Zoellner, L. A., Mavissakalian, M. R., & Roy-Byrne, P.
P. (2009). What would you choose? Sertraline or prolonged
exposure in community and PTSD treatment seeking women.
Depression and Anxiety, 26(8), 724–731. doi:10.1002/da.20588.
Garland, A. F., Kruse, M., & Aarons, G. A. (2003). Clinicians and
outcome measurement: What’s the use? The Journal of Behav-
ioral Health Services & Research, 30(4), 393–405.
Gibbons, C. J., Fournier, J. C., Stirman, S. W., DeRubeis, R. J., Crits-
Christoph, P., & Beck, A. T. (2010). The clinical effectiveness of
cognitive therapy for depression in an outpatient clinic. Journal
of Affective Disorders, 125, 169–176.
Gottdiener, W. H. (2011). Improving the relationship between the
randomized clinical trial and real-world clinical practice. Psy-
chotherapy, 48(3), 231–233. doi:10.1037/a0022703.
Greene, C. J., Morland, L. A., Durkalski, V. L., & Frueh, B. C. (2008). Noninferiority and equivalence designs: Issues and implications for mental health research. Journal of Traumatic Stress, 21(5), 433–439. doi:10.1002/jts.20367.
Griner, D., & Smith, T. B. (2006). Culturally adapted mental health interventions: A meta-analytic review. Psychotherapy: Theory, Research, Practice, Training, 43(4), 531–548. doi:10.1037/0033-3204.43.4.531.
Herbert, J. D. (2003). The science and practice of empirically
supported treatments. Behavior Modification, 27(3), 412–430.
doi:10.1177/0145445503253836.
Howard, G. S., Hill, T. L., Maxwell, S. E., Baptista, T. M., Farias, M.
H., Coelho, C., et al. (2009). What’s wrong with research
literatures? And how to make them right. Review of General
Psychology, 13(2), 146–166. doi:10.1037/a0015319.
Ioannidis, J. P. A. (2005). Why most published research findings are
false. PLoS Medicine, 2(8), 0696–0701. doi:10.1371/journal.
pmed.0020124.
Jacobson, N. S., Follette, W. C., & Revenstorf, D. (1984). Psycho-
therapy outcome research: Methods of reporting variability and
evaluating clinical significance. Behavior Therapy, 15, 336–352.
Kazdin, A. E. (2001). Almost clinically significant (p < .10): Current measures may only approach clinical significance. Clinical Psychology: Science and Practice, 8(4), 455–462.
Kazdin, A. E. (2003). Research design in clinical psychology (4th
ed.). Boston: Allyn and Bacon.
Kazdin, A. E. (2006). Arbitrary metrics: Implications for identifying
evidence-based treatments. American Psychologist, 61(1), 42–49.
doi:10.1037/0003-066X.61.1.42.
Kazdin, A. E. (2008). Evidence-based treatment and practice: New
opportunities to bridge clinical research and practice, enhance
the knowledge base, and improve patient care. American
Psychologist, 63(3), 146–159. doi:10.1037/0003-066X.63.3.146.
Krause, M. S. (2011). What are the fundamental facts of a comparison
of two treatments’ outcomes? Psychotherapy, 48(3), 234–236.
doi:10.1037/a0023383.
Krause, M. S., & Lutz, W. (2009). What should be used for baselines
against which to compare treatments’ effectiveness? Psycho-
therapy Research, 19(3), 358–367. doi:10.1080/1050330090292
6539.
Lau, M. A., Ogrodniczuk, J., Joyce, A. S., & Sochting, I. (2010).
Bridging the practitioner-scientist gap in group psychotherapy
research. International Journal of Group Psychotherapy, 60(2),
177–196.
Leykin, Y., & DeRubeis, R. J. (2009). Allegiance in psychotherapy
outcome research: Separating association from bias. Clinical
Psychology: Science and Practice, 16(1), 54–65.
Lilienfeld, S. O. (2010). Can psychology become a science? Person-
ality and Individual Differences, 49, 281–288. doi:10.1016/j.paid.
2010.01.024.
Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis.
Thousand Oaks: Sage Publications.
Lohr, J. M., Olatunji, B. O., Parker, L. M., & DeMaio, C. (2005).
Experimental analysis of specific treatment factors: Efficacy and
practice implications. Journal of Clinical Psychology, 61, 819–834.
doi:10.1002/jclp.20128.
Luborsky, L., Diguer, L., Seligman, D. A., Rosenthal, R., Krause, E.
D., Johnson, S., et al. (1999). The researcher’s own therapy
allegiances: A ‘‘wild card’’ in comparisons of treatment efficacy.
Clinical Psychology: Science and Practice, 6(1), 95–106.
Mahoney, M. J. (1978). Experimental methods and outcome evalua-
tion. Journal of Consulting and Clinical Psychology, 46, 660–672.
Mahrer, A. R. (2005). What is psychotherapy for? A plausible
alternative to empirically supported therapies, therapy relation-
ships, and practice guidelines. Journal of Contemporary Psy-
chotherapy, 35(1), 99–115. doi:10.1007/s10879-005-0806-4.
Markowitz, J. C., Spielman, L. A., Sullivan, M., & Fishman, B. (2000).
An exploratory study of ethnicity and psychotherapy outcome
among HIV-positive patients with depressive symptoms. Journal
of Psychotherapy Practice and Research, 9(4), 226–231.
Mellers, B., Hertwig, R., & Kahneman, D. (2001). Do frequency
representations eliminate conjunction effects? An exercise in
adversarial collaboration. Psychological Science, 12(4), 269–275.
Michels, K. B., & Rothman, K. J. (2003). Update on unethical use of
placebos in randomized trials. Bioethics, 17, 188–204.
Miranda, J., Bernal, G., Lau, A., Kohn, L., Hwang, W. C., &
LaFromboise, T. (2005). State of the science on psychosocial
interventions for ethnic minorities. Annual Review of Clinical
Psychology, 1, 113–142. doi:10.1146/annurev.clinpsy.1.102803.
143822.
Miranda, J., Green, B. L., Krupnick, J. L., Chung, J., Siddique, J., &
Revicki, D. (2006). One-year outcomes of a randomized clinical
trial treating depression in low-income minority women. Journal
of Consulting and Clinical Psychology, 74(1), 99–111.
doi:10.1037/0022-006X.74.1.99.
Nosek, B. A., Spies, J. R., & Motyl, M. (2012). Scientific utopia: II.
Restructuring incentives and practices to promote truth over
publishability. Perspectives on Psychological Science, 7(6),
615–631. doi:10.1177/1745691612459058.
Ogrodniczuk, J. S., Piper, W. E., Joyce, A. S., Lau, M. A., &
Sochting, I. (2010). A survey of Canadian group psychotherapy
association members’ perceptions of psychotherapy research.
International Journal of Group Psychotherapy, 60(2), 159–176.
O’Leary, K. D., & Borkovec, T. D. (1978). Conceptual, methodo-
logical, and ethical problems of placebo groups in psychotherapy
research. American Psychologist, 33(9), 821–830.
Overholser, J. C. (2007). The boulder model in academia: Struggling
to integrate the science and practice of psychology. Journal of
Contemporary Psychotherapy, 37(4), 205–211. doi:10.1007/
s10879-007-9055-z.
Overholser, J. C. (2012). Behind a thin veneer: What lurks beneath
the scientist-practitioner label? Journal of Contemporary Psy-
chotherapy, 42, 271–279. doi:10.1007/s10879-012-9211-y.
Persons, J. B., Roberts, N. A., Zalecki, C. A., & Brechwald, W. A. G.
(2006). Naturalistic outcome of case formulation-driven cogni-
tive-behavior therapy for anxious and depressed outpatients.
Behaviour Research and Therapy, 44, 1041–1051. doi:10.1016/
j.brat.2005.08.005.
Price, D. D., Finniss, D. G., & Benedetti, F. (2007). A comprehensive review of the placebo effect: Recent advances and current thought. Annual Review of Psychology, 59, 565–590. doi:10.1146/annurev.psych.59.113006.095941.
Rosenthal, R. (1979). The file drawer problem and tolerance for null
results. Psychological Bulletin, 86(3), 638–641.
Rothman, K. J., & Michels, K. B. (1994). The continuing unethical
use of placebo controls. The New England Journal of Medicine,
331(6), 394–398.
Rothstein, H., Sutton, A. J., & Borenstein, M. (Eds.). (2005).
Publication bias in meta-analysis: Prevention, assessment and
adjustments. Chichester: Wiley.
Shapiro, F. (1989). Eye movement desensitization: A new treatment
for post-traumatic stress disorder. Journal of Behavior Therapy
and Experimental Psychiatry, 20(3), 211–217.
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive
psychology: Undisclosed flexibility in data collection and analysis
allows presenting anything as significant. Psychological Science,
22(11), 1359–1366. doi:10.1177/0956797611417632.
Singer, J. D., & Willett, J. B. (2003). Applied longitudinal data
analysis: Modeling change and event occurrence. New York:
Oxford University Press, Inc.
Spławinski, J., & Kuzniar, J. (2004). Clinical trials: Active control vs
placebo—What is ethical? Science and Engineering Ethics,
10(1), 73–79.
Spring, B., Pagoto, S., Knatterud, G., Kozak, A., & Hedeker, D. (2007). Examination of the analytic quality of behavioral health randomized clinical trials. Journal of Clinical Psychology, 63(1), 53–71. doi:10.1002/jclp.20334.
Steen, R. G. (2011). Retractions in the scientific literature: Is the
incidence of research fraud increasing? Journal of Medical
Ethics, 37(4), 249–253. doi:10.1136/jme.2010.040923.
Stewart, R. E., & Chambless, D. L. (2007). Does psychotherapy
research inform treatment decisions in private practice? Journal
of Clinical Psychology, 63(3), 267–281. doi:10.1002/jclp.20347.
Stewart, R. E., & Chambless, D. L. (2010). Interesting practitioners in
training in empirically supported treatments: Research reviews
versus case studies. Journal of Clinical Psychology, 66(1),
73–95. doi:10.1002/jclp.20630.
Streiner, D. L. (2003). Unicorns do exist: A tutorial on "proving" the null hypothesis. Canadian Journal of Psychiatry, 48(11), 756–761.
Voss Horrell, S. C. (2008). Effectiveness of cognitive-behavioral
therapy with adult ethnic minority clients: A review. Profes-
sional Psychology: Research and Practice, 39(2), 160–168.
doi:10.1037/0735-7028.39.2.160.
Westen, D., Novotny, C. M., & Thompson-Brenner, H. (2004). The
empirical status of empirically supported psychotherapies: Assump-
tions, findings, and reporting in controlled clinical trials. Psycholog-
ical Bulletin, 130(4), 631–663. doi:10.1037/0033-2909.130.4.631.
World Medical Association. (2008). Declaration of Helsinki. Amended by the 59th WMA General Assembly, Seoul, Korea, October 2008. Retrieved from http://www.wma.net/en/30publications/10policies/b3/.