Why Consider Single-Case Design for Intervention Research: Reasons and Rationales
Tom Kratochwill
February 12, Wisconsin Center for Education Research
School Psychology Program, University of Wisconsin-Madison
Madison, Wisconsin

Assumptions of the Presentation:
Some participants may have limited knowledge of single-case intervention research methods;
A good knowledge base in single-case design will come from resources from the presentation and/or future training in courses or institutes;
Participants have a commitment to the role of science in society and the importance of intervention research in psychology, education, special education, and related fields.

Specific Goals of the Presentation
Review Reasons and Rationale for Single-Case Design Research
Review the Logic and Foundations of Single-Case Design Intervention Research
Review pilot What Works Clearinghouse (WWC) Single-Case Design Standards:
WWC design standards (Single-Case Designs)
WWC evidence criteria (Visual Analysis)
Summarize proposed approaches to visual and statistical analysis within single-case intervention research
Review criteria for documenting evidence-based practices using single-case intervention research methods

Resources on Single-Case Design and Analysis
Recommended text for a general overview of single-case design: Kazdin (2011)
Recommended text for advanced information on design and data analysis: Kratochwill and Levin (2014)
Institute of Education Sciences 2015 Single-Case Design Institute Web site:
Rationale, Reasons, Logic, and Foundations of Single-Case Intervention Research
Purposes and Fundamental Assumptions of Single-Case Intervention Research Methods:
Defining features of SCDs
Core design types
Internal validity and the role of replication
Characteristics of Scientifically Credible Single-Case Intervention Studies:
True Single-Case Applications and the WWC Standards (design and evidence credibility)
Classroom-Based Applications (design and evidence credibility)

Features of Single-Case Research Methods
Experimental single-case research will have four features:
Independent variable
Dependent variable
Focus on a functional relation (causal effect)
Dimension(s) of predicted change over time (e.g., level, trend, variability, score overlap)

Additional Considerations
Operational definition of dependent variable (DV); measure of the DV is valid, reliable, and addresses the dimension(s) of concern
Repeated measurement of an outcome before, during, and/or after active manipulation of the independent variable
Operational definition of independent variable (IV); core features of the IV are defined and measured to document fidelity
Unit of IV implementation: group versus individual unit (an important distinction; the WWC Standards apply only to the individual unit of analysis)

Types of Research Questions that Can Be Answered with Single-Case Designs
Evaluate Intervention Effects Relative to Baseline: Does a forgiveness intervention reduce the level of bullying behaviors for students in a high school setting?
Compare Relative Effectiveness of Interventions: Is function-based behavior support more effective than non-function-based support at reducing the level and variability of problem behavior for this participant?
Compare Single- and Multi-Component Interventions: Does adding performance feedback to basic teacher training improve the fidelity with which instructional skills are used by new teachers in the classroom?
More Examples of SCD Research Questions that Might be Addressed
Is a certain teaching procedure functionally related to an increase in the level of social initiations by young children with autism?
Is time delay prompting or least-to-most prompting more effective in increasing the level of self-help skills performed by young children with severe intellectual disabilities?
Is the pacing of reading instruction functionally related to increased level and slope of reading performance (as measured by ORF) for third graders?
Is Adderall (at clinically prescribed dosage) functionally related to increased level of attention performance for elementary-age students with Attention Deficit Disorder?

Single-Case Designs are Experimental Designs
Like RCTs, the purpose is to document causal relationships
Control for major threats to internal validity through replication
Document effects for specific individuals/settings
Replication across participants is required to enhance external validity
Can be distinguished from case studies

Single-Case Design Standards were Developed to Address Threats to Internal Validity (when the unit of analysis is the individual):
Ambiguous Temporal Precedence
Selection
History
Maturation
Testing
Instrumentation
Additive and Interactive Effects of Threats
See Shadish, Cook, and Campbell (2002)

Distinctions Between Experimental Single-Case Design and Clinical Case Study Research
Some Characteristics of Traditional Case Study Research
Often characterized by narrative description of case, treatment, and outcome variables
Typically lack a formal design with replication but can involve a basic design format (e.g., A/B)
Methods have been suggested to improve drawing valid inferences from case study research [e.g., Kazdin, 1982; Kratochwill, 1985; Kazdin, A. E. (2011). Single-case research designs: Methods for clinical and applied settings (2nd ed.). New York: Oxford University Press]

Experimental Single-Case Designs
Defining Features of Single-Case Intervention Design
Nine Defining Features of Single-Case Research
1. Experimental control: The design allows documentation of causal (e.g., functional) relations between independent and dependent variables.
2. Individual as unit of analysis: The individual provides their own control. Can treat a group as a participant, with focus on the group as a single unit.
3. Independent variable is actively manipulated.
4. Repeated measurement of dependent variable: Direct observation at multiple points in time is often used, with inter-observer agreement to assess reliability of the dependent variable.
5. Baseline: To document the social problem and control for confounding variables.

Defining Features of Single-Case Research
6. Design controls for threats to internal validity: Opportunity for replication of the basic effect at three different points in time.
7. Visual analysis/statistical analysis: Visual analysis documents the basic effect at three different points in time; statistical analysis options are emerging and presented in textbooks.
8. Replication: Within a study to document experimental control; across studies to document external validity; across studies, researchers, contexts, and participants to document evidence-based practices.
9. Experimental flexibility: Designs may be modified or changed within a study (sometimes called response-guided research).

An emerging role for single-case research in development of effective interventions
Useful in the iterative development of interventions.
Documentation of experimental effects that help define the mechanism for change, not just the occurrence of change.
Allows study of low-prevalence disorders, where a large sample would otherwise be needed for statistical power (Odom et al., 2005).
Sometimes more palatable to service providers because SCDs may not include a no-treatment comparison group.
Allows an ongoing assessment of response to an intervention.
Has been recommended for establishing practice-based evidence.
Useful for pilot research to assess the effect size needed for other research methods (RCTs).
Useful for fine-grained analysis of weak and non-responders (negative results; to be discussed later).
Design Examples
Reversal/Withdrawal Designs
Multiple Baseline Designs
Alternating Treatment Designs
Others: Changing Criterion, Non-Concurrent Multiple Baseline, Multiple Probe

Descriptive Analysis
Hammond and Gast (2011) reviewed 196 randomly identified journal issues containing 1,936 articles (a total of 556 single-case designs were coded). Multiple baseline designs were reported more often than withdrawal designs, and these were more often reported across individuals and groups.

Overview of Basic Single-Case Intervention Designs
ABAB Design Description
Simple phase change designs [e.g., ABAB; BCBC design]. (In the literature, ABAB designs are sometimes referred to as withdrawal designs, intrasubject replication designs, within-series designs, operant designs, or reversal designs.)

ABAB Reversal/Withdrawal Designs
In these designs, estimates of level, trend, and variability within
a data series are assessed under similar conditions; the
manipulated variable is introduced and concomitant changes in the
outcome measure(s) are assessed in the level, trend, and
variability between phases of the series, with special attention to
the degree of overlap, immediacy of effect, and similarity of data
patterns across similar phases (e.g., all baseline phases).

ABAB Reversal/Withdrawal Designs
Some Example Design Limitations: Behavior must be reversible in the ABAB series (e.g., return to baseline). There may be ethical issues involved in reversing behavior back to baseline (A2). The study may become complex when multiple conditions need to be compared. There may be order effects in the design.

Multiple Baseline Design Description
Multiple baseline design. The design can be applied across units (participants), across behaviors, or across situations.

Multiple Baseline Designs
In these designs, multiple AB data series are compared and
introduction of the intervention is staggered across time.
Comparisons are made both between and within a data series.
Repetitions of a single simple phase change are scheduled, each
with a new series and in which both the length and timing of the
phase change differ across replications. Multiple Baseline
Design
Some Example Design Limitations: The design is generally limited to
demonstrating the effect of one independent variable on some
outcome. The design depends on the independence of the multiple
baselines (across units, settings, and behaviors). There can be
practical as well as ethical issues in keeping individuals on
baseline for long periods of time (as in the last series).
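Because the design's logic rests on staggered, independent baselines, the staggering itself can be checked mechanically. A minimal sketch with hypothetical data (the participant labels and helper function are mine, not part of any published protocol):

```python
# Hypothetical multiple-baseline data: for each participant (one AB series
# apiece), record the session at which the intervention was introduced.
intervention_starts = {
    "participant_1": 6,
    "participant_2": 9,
    "participant_3": 12,
}

def is_staggered(starts):
    """True if every series begins intervention at a distinct point in time.

    Distinct introduction points are what let between-series comparisons
    argue against history effects: an extraneous event would have to
    coincide with each separate introduction.
    """
    times = list(starts.values())
    return len(set(times)) == len(times)

print(is_staggered(intervention_starts))  # True: starts differ across series
```

A reviewer could extend the same idea to flag series whose baselines run unacceptably long, which is the practical/ethical concern noted above.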
Alternating Treatment Designs
Alternating treatments (in the behavior analysis literature, alternating treatment designs are sometimes referred to as part of a class of multi-element designs).

Alternating Treatment Design Description
In these designs, estimates of level, trend, and variability in a
data series are assessed on measures within specific conditions and
across time. Changes/differences in the outcome measure(s) are
assessed by comparing the series associated with different
conditions.

Alternating Treatment Design
Some Example Design Limitations: Behavior must be reversed during alternation of the intervention. There is the possibility of interaction/carryover effects as conditions are alternated. Comparing more than three treatments is very challenging in terms of balancing conditions.

Characteristics of Scientifically Credible Single-Case Intervention Studies Based on the WWC Standards
Motivation for "Standards" for Single-Case Intervention Research:
Foster professional agreement on the criteria for design and analysis of single-case research
Better standards (materials) for training in single-case methods
More precision in RFP stipulations and grant reviews
Established expectations for reviewers
Better standards for reviewing single-case intervention research (journal editors, reviewers)
Development of effect-size and meta-analysis technology
Consensus on what is required to identify evidence-based practices

Single-case researchers have a number of conceptual and methodological standards to guide their synthesis work. These standards, alternatively referred to as guidelines, have been developed by a number of professional organizations and authors interested primarily in providing guidance for reviewing the literature in a particular content domain (e.g., Smith, 2012; Wendt & Miller, 2012). The development of these standards has also provided researchers who are designing their own intervention studies with a protocol that is capable of meeting or exceeding the proposed standards.

Reviews of Appraisal Guidelines
Wendt and Miller (2012) identified seven quality appraisal tools and compared these standards to the single-case research criteria advanced by Horner et al. (2005). Smith (2012) reviewed the research design and various methodological characteristics of single-case designs in peer-reviewed journals, primarily from the psychological literature. Based on his review, he proposed six standards for appraisal of the literature (some of which overlap with the Wendt and Miller review).

Professional Groups with SCD Standards or Guidelines (Examples):
National Reading Panel
American Psychological Association (APA) Division 12/53
American Psychological Association (APA) Division 16
What Works Clearinghouse (WWC)
Consolidated Standards of Reporting Trials (CONSORT) Guidelines for N-of-1 Trials (the CONSORT Extension for N-of-1 Trials [CENT])

Context
Single-case methods developed and used within Applied Behavior Analysis
Recent Investment by IES:
Funding of grants focused on single-case methods
Formal policy that single-case studies are able to document experimental control
Inclusion of single-case options in IES RFPs
What Works Clearinghouse Pilot SCD Standards White Paper (2010)
Training of IES/WWC reviewers
White Paper on Single-Case Design Effect Size (2015)
Single-Case Design Institutes to Educate Researchers
Recent Investment by the American Psychological Association:
SCD Summer Institute in Madison, Wisconsin

Context: WWC White Paper
Single-Case Intervention Research Design Standards Panel:
Thomas R. Kratochwill (Chair), University of Wisconsin-Madison
John H. Hitchcock, Ohio University
Robert H. Horner, University of Oregon
Joel R. Levin, University of Arizona
Samuel M. Odom, University of North Carolina at Chapel Hill
David M. Rindskopf, City University of New York
William R. Shadish, University of California, Merced
Available at:

"True" Single-Case Applications and the WWC Standards
What Works Clearinghouse Standards:
Design Standards
Evidence Criteria
Effect Size and Social Validity (Effect-Size Estimation; Social Validity Assessment)

Review flow (from the WWC flowchart): Evaluate the design (Meets Design Standards, Meets with Reservations, or Does Not Meet Design Standards; if the design does not meet standards, stop). Is it possible to document experimental control? If so, evaluate the evidence (Strong Evidence, Moderate Evidence, or No Evidence). Do the data document experimental control? If so, proceed to effect-size estimation and social validity assessment: Is the effect something we should care about?

WWC Design Standards: Evaluating the Quality of Single-Case Designs
Research Currently Meeting WWC Design Standards
Sullivan and Shadish (2011) assessed the WWC pilot Standards
related to implementation of the intervention, acceptable levels of
observer agreement/reliability, opportunities to demonstrate a
treatment effect, and acceptable numbers of data points in a phase.
In published studies in 21 journals in 2008, they found that nearly
45% of the research met the strictest WWC standards of design and
30% met with some reservations. So, it can be concluded that around
75% of the published research during a sampling year of major
journals that publish single-case intervention research would meet
(or meet with reservations) the WWC design standards.

WWC Single-Case Design Standards
Four Standards for Design Evaluation:
Systematic manipulation of the independent variable
Inter-assessor agreement
Three attempts to demonstrate an effect at three different points in time
Minimum number of phases and data points per phase, for phases used to demonstrate an effect
Standard 3 Differs by Design Type: Reversal/Withdrawal Designs (ABAB and variations); Alternating Treatments Designs; Multiple Baseline Designs

History: History is a threat to internal validity when events occurring concurrently with the intervention could cause the observed effect. History is typically the most important threat to any time series, including SCDs. However, history threats are lessened in single-case research that involves one of the types of phase replication necessary to meet standards (e.g., the ABAB design discussed earlier). Designs such as the ABAB we just saw reduce the plausibility that extraneous events account for changes in the dependent variable(s), because an extraneous event would have to occur at about the same time as each of the multiple introductions of the intervention over time.
Ambiguous Temporal Precedence: SCD standards require that the independent variable is actively and repeatedly manipulated by the researcher.

Attrition:
1. Participants can be selectively unavailable for study - Standards require a minimum number of data points per phase.
2. Participants can leave the study altogether - Standards require at least three phase changes at three different points in time.
3. When a case comprises a group, membership can change during the study - Standards require documentation of this for the PI.

Regression to the mean: When cases (e.g., single participants, classrooms, schools) are selected on the basis of their extreme scores, their scores on other measured variables (including re-measured initial variables) typically will be less extreme. If only pretest and posttest scores were used to evaluate outcomes, statistical regression would be a major concern. However, the repeated assessment identified as a distinguishing feature of SCDs in the Standards (wherein performance is monitored to evaluate level, trend, and variability, coupled with phase repetition in the design) makes regression easy to diagnose as an internal validity threat. As noted in the Standards, data are repeatedly collected during baseline and intervention phases, and this repeated measurement enables the researcher to examine characteristics of the data for the possibility of regression effects under various conditions.
Standard 1: Systematic Manipulation of the Independent Variable
Researcher Must Determine When and How the Independent Variable Conditions Change. If Standard Is Not Met, Study Does Not Meet Design Standards.

Examples of Manipulation that is Not Systematic:
Teacher/consultee begins to implement an intervention prematurely because of parent pressure.
Researcher looks retrospectively at data collected during an intervention program.

Standard 2: Inter-Assessor Agreement
Each Outcome Variable for Each Case Must be Measured Systematically by More than One Assessor.
Researcher Needs to Collect Inter-Assessor Agreement: In each phase; on at least 20% of the data points in each condition (i.e., baseline, intervention).
Rate of Agreement Must Meet Minimum Thresholds (e.g., 80% agreement or Cohen's kappa of 0.60).
If No Outcomes Meet These Criteria, Study Does Not Meet Design Standards.

Standard 3: Three Attempts to Demonstrate an Intervention Effect at Three Different Points in Time
Attempts Are about Phase Transitions.
Designs that Could Meet This Standard Include: ABAB design; multiple baseline design with three baseline phases and staggered introduction of the intervention; alternating treatment design (other designs and design combinations).
Designs Not Meeting this Standard Include: AB design; ABA design; multiple baseline with three baseline phases and the intervention introduced at the same time for each case.

Basic Effect versus Experimental Control
Basic Effect: Change in the pattern of responding after manipulation of the independent variable (level, trend, variability).
Experimental Control: At least three demonstrations of the basic effect, each at a different point in time.

Design Evaluation
Meets Design Standards: IV manipulated directly; IOA documented (e.g., 80% agreement or a kappa of .60) on at least 20% of data points in each phase; design allows opportunity to assess the basic effect at three different points in time; five data points per phase (or design equivalent; ATD: four-comparison option).
Meets Design Standards with Reservations: all of the above, except at least three data points per phase.
Does Not Meet Design Standards: otherwise. (Kratochwill & Levin, Psychological Methods, 2010)
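The three-way design-evaluation rubric can be sketched as a small checker. This is only an illustration: the function and parameter names are mine, the thresholds come from the slides (80% agreement or kappa of .60, IOA on at least 20% of points, five data points per phase, or three with reservations), and it is not the WWC's actual review instrument.

```python
def rate_design(phase_lengths, ioa, ioa_coverage, effect_demonstrations):
    """Apply the Meets / Meets with Reservations / Does Not Meet rubric.

    phase_lengths: data points in each phase used to demonstrate an effect
    ioa: inter-assessor agreement as a proportion (the standards also
         accept a Cohen's kappa of at least .60 as the threshold)
    ioa_coverage: proportion of data points on which IOA was collected
    effect_demonstrations: opportunities to assess the basic effect at
         different points in time (three are required)
    """
    if ioa < 0.80 or ioa_coverage < 0.20 or effect_demonstrations < 3:
        return "Does Not Meet Design Standards"
    if all(n >= 5 for n in phase_lengths):
        return "Meets Design Standards"
    if all(n >= 3 for n in phase_lengths):
        return "Meets Design Standards with Reservations"
    return "Does Not Meet Design Standards"

# An ABAB design: four phases with 5, 5, 4, and 5 data points, IOA of .92
# collected on 25% of points, and three opportunities to show the effect.
print(rate_design([5, 5, 4, 5], ioa=0.92, ioa_coverage=0.25,
                  effect_demonstrations=3))
# -> Meets Design Standards with Reservations (one phase has only 4 points)
```

Note that the gating criteria (manipulation, IOA, three demonstrations) are all-or-nothing, while the data-points-per-phase criterion is what separates "Meets" from "Meets with Reservations."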
Random Thoughts On Enhancing the Scientific Credibility of Single-Case Intervention Research: Randomization to the Rescue for Designs (Kratochwill & Levin, Psychological Methods, 2010)
Why Random?
Internal Validity: Promotes the status of single-case research by increasing the scientific credibility of its methodology; the tradition has been replication, and with the use of randomization these procedures can rival randomized clinical trials.
Statistical-Conclusion Validity: Legitimizes the conduct of various statistical tests and one's interpretation of results; the tradition has been visual analysis.

Traditional ABAB Design
Visual Analysis of Single-Case Intervention Data
WWC Standards: Evaluating Single-Case Design Outcomes With Visual Analysis - Evidence Criteria
Visual Analysis of Single-Case Evidence
Traditional Method of Data Evaluation for SCDs:
Determine whether evidence of a causal relation exists
Characterize the strength or magnitude of that relation
The singular approach used by the WWC for rating SCD evidence

Methods for Effect-Size Estimation:
Several methods have been proposed. SCD WWC Panel members are among those developing these methods, but the methods are still being tested and some are now comparable with group-comparison studies. WWC standards for effect size are being assessed as the field reaches greater consensus on appropriate statistical approaches.

Goal, Rationale, Advantages, and Limitations of Visual Analysis
Goal is to Identify Intervention Effects
A basic effect is a change in the dependent variable in response to researcher manipulation of the independent variable.
Subjective determination of evidence, but practice and a common framework for applying visual analysis can help to improve agreement rates.
Evidence criteria are met by examining effects that are replicated at different points.
Encourages Focus on Interventions with Strong Effects
Strong effects are generally desired by applied researchers and clinicians.
Weak results are filtered out because effects should be clear from looking at the data - viewed as an advantage.
Statistical evaluation can be more sensitive than visual analysis in detecting intervention effects.
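The features a visual analyst inspects (level, trend, variability, and score overlap between phases) can each be quantified. A small sketch with made-up baseline and intervention data, using common summary statistics rather than any WWC-prescribed algorithm (the function names and example numbers are mine):

```python
from statistics import mean, stdev

def phase_features(series):
    """Level (mean), trend (least-squares slope), variability (SD)."""
    n = len(series)
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(series)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, series)) / \
            sum((x - x_bar) ** 2 for x in xs)
    return {"level": y_bar, "trend": slope, "variability": stdev(series)}

def overlap(baseline, intervention):
    """Proportion of intervention points falling within the baseline range -
    one simple way to express score overlap between adjacent phases."""
    lo, hi = min(baseline), max(baseline)
    return sum(lo <= y <= hi for y in intervention) / len(intervention)

baseline = [12, 14, 13, 15, 13]   # hypothetical problem-behavior counts
intervention = [9, 7, 6, 5, 4]

print(phase_features(baseline))
print(phase_features(intervention))
print(overlap(baseline, intervention))  # 0.0 -> no overlapping data points
```

Here a visual analyst would see a drop in level, a downward trend in the intervention phase, and no overlap: the kind of clear, strong effect the slides describe visual analysis as favoring.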
Goal, Rationale, Advantages, Limitations (cont'd)
Statistical Evaluation and Visual Analysis are Not Fundamentally Different in terms of Controlling Errors (Kazdin, 2011): both attempt to avoid Type I and Type II errors.
Type I: Concluding the intervention produced an effect when it did not.
Type II: Concluding the intervention did not produce an effect when it did.
Possible Limitations of Visual Analysis: Lack of concrete decision-making rules (e.g., in contrast to p