Why Consider Single-Case Design for Intervention Research: Reasons and Rationales
Tom Kratochwill
February 12, Wisconsin Center for Education Research
School Psychology Program, University of Wisconsin-Madison
Madison, Wisconsin

Assumptions of the Presentation:
Some participants may have limited knowledge of single-case intervention research methods;
A good knowledge base in single-case design will come from resources from the presentation and/or future training in courses or institutes;
Participants have a commitment to the role of science in society and the importance of intervention research in psychology, education, special education, and related fields.

Specific Goals of the Presentation
Review Reasons and Rationale for Single-Case Design Research
Review the Logic and Foundations of Single-Case Design Intervention Research
Review pilot What Works Clearinghouse (WWC) Single-Case Design Standards:
WWC design standards (Single-Case Designs)
WWC evidence criteria (Visual Analysis)
Summarize proposed approaches to visual and statistical analysis within single-case intervention research
Review criteria for documenting evidence-based practices using single-case intervention research methods

Resources on Single-Case Design and Analysis
Recommended text for a general overview of single-case design: Kazdin (2011)
Recommended text for advanced information on design and data analysis: Kratochwill and Levin (2014)
Institute of Education Sciences 2015 Single-Case Design Institute Web site:
Rationale, Reasons, Logic, and Foundations of Single-Case Intervention Research
Purposes and Fundamental Assumptions of Single-Case Intervention Research Methods:
Defining features of SCDs
Core design types
Internal validity and the role of replication
Characteristics of Scientifically Credible Single-Case Intervention Studies:
True Single-Case Applications and the WWC Standards (design and evidence credibility)
Classroom-Based Applications (design and evidence credibility)

Features of Single-Case Research Methods
Experimental single-case research will have four features:
Independent variable
Dependent variable
Focus on a functional relation (causal effect)
Dimension(s) of predicted change over time (e.g., level, trend, variability, score overlap)

Additional Considerations
Operational definition of dependent variable (DV); measure of the DV is valid, reliable, and addresses the dimension(s) of concern
Repeated measurement of an outcome before, during, and/or after active manipulation of the independent variable
Operational definition of independent variable (IV); core features of the IV are defined and measured to document fidelity
Unit of IV implementation: group versus individual unit (an important distinction; the WWC Standards apply only to the individual unit of analysis)

Types of Research Questions that Can Be Answered with Single-Case Designs
Evaluate Intervention Effects Relative to Baseline: Does a forgiveness intervention reduce the level of bullying behaviors for students in a high school setting?
Compare Relative Effectiveness of Interventions: Is function-based behavior support more effective than non-function-based support at reducing the level and variability of problem behavior for this participant?
Compare Single- and Multi-Component Interventions: Does adding performance feedback to basic teacher training improve the fidelity with which instructional skills are used by new teachers in the classroom?
More Examples of SCD Research Questions that Might be Addressed
Is a certain teaching procedure functionally related to an increase in the level of social initiations by young children with autism?
Is time delay prompting or least-to-most prompting more effective in increasing the level of self-help skills performed by young children with severe intellectual disabilities?
Is the pacing of reading instruction functionally related to increased level and slope of reading performance (as measured by ORF) for third graders?
Is Adderall (at clinically prescribed dosage) functionally related to increased level of attention performance for elementary-age students with Attention Deficit Disorder?

Single-Case Designs are Experimental Designs
Like RCTs, the purpose is to document causal relationships
Control for major threats to internal validity through replication
Document effects for specific individuals/settings
Replication across participants is required to enhance external validity
Can be distinguished from case studies

Single-Case Design Standards were Developed to Address Threats to Internal Validity (when the unit of analysis is the individual):
Ambiguous Temporal Precedence
Selection
History
Maturation
Testing
Instrumentation
Additive and Interactive Effects of Threats
See Shadish, Cook, and Campbell (2002)

Distinctions Between Experimental Single-Case Design and Clinical Case Study Research
Some Characteristics of Traditional Case Study Research
Often characterized by narrative description of case, treatment, and outcome variables
Typically lack a formal design with replication but can involve a basic design format (e.g., A/B)
Methods have been suggested to improve drawing valid inferences from case study research [e.g., Kazdin, 1982; Kratochwill, 1985; Kazdin, A. E. (2011). Single-case research designs: Methods for clinical and applied settings (2nd ed.). New York: Oxford University Press]

Experimental Single-Case Designs
Defining Features of Single-Case Intervention Design
Nine Defining Features of Single-Case Research
1. Experimental control: The design allows documentation of causal (e.g., functional) relations between independent and dependent variables.
2. Individual as unit of analysis: The individual provides their own control. Can treat a group as a participant, with focus on the group as a single unit.
3. Independent variable is actively manipulated.
4. Repeated measurement of dependent variable: Direct observation at multiple points in time is often used, with inter-observer agreement to assess reliability of the dependent variable.
5. Baseline: To document the social problem and control for confounding variables.

Defining Features of Single-Case Research
6. Design controls for threats to internal validity: Opportunity for replication of the basic effect at three different points in time.
7. Visual analysis/statistical analysis: Visual analysis documents the basic effect at three different points in time; statistical analysis options are emerging and presented in textbooks.
8. Replication: Within a study to document experimental control; across studies to document external validity; across studies, researchers, contexts, and participants to document evidence-based practices.
9. Experimental flexibility: Designs may be modified or changed within a study (sometimes called response-guided research).

An emerging role for single-case research in development of effective interventions
Useful in the iterative development of interventions.
Documentation of experimental effects that help define the mechanism for change, not just the occurrence of change.
Allows study of low-prevalence disorders, where a large sample would otherwise be needed for statistical power (Odom et al., 2005).
Sometimes more palatable to service providers because SCDs may not include a no-treatment comparison group.
Allows an ongoing assessment of response to an intervention.
Has been recommended for establishing practice-based evidence.
Useful for pilot research to assess the effect size needed for other research methods (RCTs).
Useful for fine-grained analysis of weak and non-responders (negative results; to be discussed later).
Design Examples
Reversal/Withdrawal Designs
Multiple Baseline Designs
Alternating Treatment Designs
Others: Changing Criterion, Non-Concurrent Multiple Baseline, Multiple Probe

Descriptive Analysis
Hammond and Gast (2011) reviewed 196 randomly identified journal issues containing 1,936 articles (a total of 556 single-case designs were coded). Multiple baseline designs were reported more often than withdrawal designs, and these were more often reported across individuals and groups.

Overview of Basic Single-Case Intervention Designs
ABAB Design Description
Simple phase change designs [e.g., ABAB; BCBC design]. (In the literature, ABAB designs are sometimes referred to as withdrawal designs, intrasubject replication designs, within-series designs, operant designs, or reversal designs.)

ABAB Reversal/Withdrawal Designs
In these designs, estimates of level, trend, and variability within
a data series are assessed under similar conditions; the
manipulated variable is introduced and concomitant changes in the
outcome measure(s) are assessed in the level, trend, and
variability between phases of the series, with special attention to
the degree of overlap, immediacy of effect, and similarity of data
patterns across similar phases (e.g., all baseline phases).

ABAB Reversal/Withdrawal Designs
Some Example Design Limitations: Behavior must be reversible in the ABAB series (e.g., return to baseline). There may be ethical issues involved in reversing behavior back to baseline (A2). The study may become complex when multiple conditions need to be compared. There may be order effects in the design.

Multiple Baseline Design Description
Multiple baseline design. The design can be applied across units (participants), across behaviors, or across situations.

Multiple Baseline Designs
In these designs, multiple AB data series are compared and
introduction of the intervention is staggered across time.
Comparisons are made both between and within a data series.
Repetitions of a single simple phase change are scheduled, each
with a new series and in which both the length and timing of the
phase change differ across replications. Multiple Baseline
Design
Some Example Design Limitations: The design is generally limited to
demonstrating the effect of one independent variable on some
outcome. The design depends on the independence of the multiple
baselines (across units, settings, and behaviors). There can be
practical as well as ethical issues in keeping individuals on
baseline for long periods of time (as in the last series).
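Because the design's logic rests on staggered, independent baselines, the staggering itself can be checked mechanically. A minimal sketch with hypothetical data (the participant labels and helper function are mine, not part of any published protocol):

```python
# Hypothetical multiple-baseline data: for each participant (one AB series
# apiece), record the session at which the intervention was introduced.
intervention_starts = {
    "participant_1": 6,
    "participant_2": 9,
    "participant_3": 12,
}

def is_staggered(starts):
    """True if every series begins intervention at a distinct point in time.

    Distinct introduction points are what let between-series comparisons
    argue against history effects: an extraneous event would have to
    coincide with each separate introduction.
    """
    times = list(starts.values())
    return len(set(times)) == len(times)

print(is_staggered(intervention_starts))  # True: starts differ across series
```

A reviewer could extend the same idea to flag series whose baselines run unacceptably long, which is the practical/ethical concern noted above.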
Alternating Treatment Designs
Alternating treatments (in the behavior analysis literature, alternating treatment designs are sometimes referred to as part of a class of multi-element designs).

Alternating Treatment Design Description
In these designs, estimates of level, trend, and variability in a
data series are assessed on measures within specific conditions and
across time. Changes/differences in the outcome measure(s) are
assessed by comparing the series associated with different
conditions.

Alternating Treatment Design
Some Example Design Limitations: Behavior must be reversed during alternation of the intervention. There is the possibility of interaction/carryover effects as conditions are alternated. Comparing more than three treatments is very challenging in terms of balancing conditions.

Characteristics of Scientifically Credible Single-Case Intervention Studies Based on the WWC Standards
Motivation for "Standards" for Single-Case Intervention Research:
Foster professional agreement on the criteria for design and analysis of single-case research
Better standards (materials) for training in single-case methods
More precision in RFP stipulations and grant reviews
Established expectations for reviewers
Better standards for reviewing single-case intervention research (journal editors, reviewers)
Development of effect-size and meta-analysis technology
Consensus on what is required to identify evidence-based practices

Single-case researchers have a number of conceptual and methodological standards to guide their synthesis work. These standards, alternatively referred to as guidelines, have been developed by a number of professional organizations and authors interested primarily in providing guidance for reviewing the literature in a particular content domain (e.g., Smith, 2012; Wendt & Miller, 2012). The development of these standards has also provided researchers who are designing their own intervention studies with a protocol that is capable of meeting or exceeding the proposed standards.

Reviews of Appraisal Guidelines
Wendt and Miller (2012) identified seven quality appraisal tools and compared these standards to the single-case research criteria advanced by Horner et al. (2005). Smith (2012) reviewed the research design and various methodological characteristics of single-case designs in peer-reviewed journals, primarily from the psychological literature. Based on his review, he proposed six standards for appraisal of the literature (some of which overlap with the Wendt and Miller review).

Professional Groups with SCD Standards or Guidelines (Examples):
National Reading Panel
American Psychological Association (APA) Division 12/53
American Psychological Association (APA) Division 16
What Works Clearinghouse (WWC)
Consolidated Standards of Reporting Trials (CONSORT) Guidelines for N-of-1 Trials (the CONSORT Extension for N-of-1 Trials [CENT])

Context
Single-case methods developed and used within Applied Behavior Analysis
Recent Investment by IES:
Funding of grants focused on single-case methods
Formal policy that single-case studies are able to document experimental control
Inclusion of single-case options in IES RFPs
What Works Clearinghouse Pilot SCD Standards White Paper (2010)
Training of IES/WWC reviewers
White Paper on Single-Case Design Effect Size (2015)
Single-Case Design Institutes to Educate Researchers
Recent Investment by the American Psychological Association:
SCD Summer Institute in Madison, Wisconsin

Context: WWC White Paper
Single-Case Intervention Research Design Standards Panel:
Thomas R. Kratochwill (Chair), University of Wisconsin-Madison
John H. Hitchcock, Ohio University
Robert H. Horner, University of Oregon
Joel R. Levin, University of Arizona
Samuel M. Odom, University of North Carolina at Chapel Hill
David M. Rindskopf, City University of New York
William R. Shadish, University of California, Merced
Available at:

"True" Single-Case Applications and the WWC Standards
What Works Clearinghouse Standards:
Design Standards
Evidence Criteria
Effect Size and Social Validity (Effect-Size Estimation; Social Validity Assessment)

Review flow (from the WWC flowchart): Evaluate the design (Meets Design Standards, Meets with Reservations, or Does Not Meet Design Standards; if the design does not meet standards, stop). Is it possible to document experimental control? If so, evaluate the evidence (Strong Evidence, Moderate Evidence, or No Evidence). Do the data document experimental control? If so, proceed to effect-size estimation and social validity assessment: Is the effect something we should care about?

WWC Design Standards: Evaluating the Quality of Single-Case Designs
Research Currently Meeting WWC Design Standards
Sullivan and Shadish (2011) assessed the WWC pilot Standards
related to implementation of the intervention, acceptable levels of
observer agreement/reliability, opportunities to demonstrate a
treatment effect, and acceptable numbers of data points in a phase.
In published studies in 21 journals in 2008, they found that nearly
45% of the research met the strictest WWC standards of design and
30% met with some reservations. So, it can be concluded that around
75% of the published research during a sampling year of major
journals that publish single-case intervention research would meet
(or meet with reservations) the WWC design standards.

WWC Single-Case Design Standards
Four Standards for Design Evaluation:
Systematic manipulation of the independent variable
Inter-assessor agreement
Three attempts to demonstrate an effect at three different points in time
Minimum number of phases and data points per phase, for phases used to demonstrate an effect
Standard 3 Differs by Design Type: Reversal/Withdrawal Designs (ABAB and variations); Alternating Treatments Designs; Multiple Baseline Designs

History: History is a threat to internal validity when events occurring concurrently with the intervention could cause the observed effect. History is typically the most important threat to any time series, including SCDs. However, history threats are lessened in single-case research that involves one of the types of phase replication necessary to meet standards (e.g., the ABAB design discussed earlier). Designs such as the ABAB we just saw reduce the plausibility that extraneous events account for changes in the dependent variable(s), because an extraneous event would have to occur at about the same time as each of the multiple introductions of the intervention over time.
Ambiguous Temporal Precedence: SCD standards require that the independent variable is actively and repeatedly manipulated by the researcher.

Attrition:
1. Participants can be selectively unavailable for study - Standards require a minimum number of data points per phase.
2. Participants can leave the study altogether - Standards require at least three phase changes at three different points in time.
3. When a case comprises a group, membership can change during the study - Standards require documentation of this for the PI.

Regression to the mean: When cases (e.g., single participants, classrooms, schools) are selected on the basis of their extreme scores, their scores on other measured variables (including re-measured initial variables) typically will be less extreme. If only pretest and posttest scores were used to evaluate outcomes, statistical regression would be a major concern. However, the repeated assessment identified as a distinguishing feature of SCDs in the Standards (wherein performance is monitored to evaluate level, trend, and variability, coupled with phase repetition in the design) makes regression easy to diagnose as an internal validity threat. As noted in the Standards, data are repeatedly collected during baseline and intervention phases, and this repeated measurement enables the researcher to examine characteristics of the data for the possibility of regression effects under various conditions.
Standard 1: Systematic Manipulation of the Independent Variable
Researcher Must Determine When and How the Independent Variable Conditions Change. If Standard Is Not Met, Study Does Not Meet Design Standards.

Examples of Manipulation that is Not Systematic:
Teacher/consultee begins to implement an intervention prematurely because of parent pressure.
Researcher looks retrospectively at data collected during an intervention program.

Standard 2: Inter-Assessor Agreement
Each Outcome Variable for Each Case Must be Measured Systematically by More than One Assessor.
Researcher Needs to Collect Inter-Assessor Agreement: In each phase; on at least 20% of the data points in each condition (i.e., baseline, intervention).
Rate of Agreement Must Meet Minimum Thresholds (e.g., 80% agreement or Cohen's kappa of 0.60).
If No Outcomes Meet These Criteria, Study Does Not Meet Design Standards.

Standard 3: Three Attempts to Demonstrate an Intervention Effect at Three Different Points in Time
Attempts Are about Phase Transitions.
Designs that Could Meet This Standard Include: ABAB design; multiple baseline design with three baseline phases and staggered introduction of the intervention; alternating treatment design (other designs and design combinations).
Designs Not Meeting this Standard Include: AB design; ABA design; multiple baseline with three baseline phases and the intervention introduced at the same time for each case.

Basic Effect versus Experimental Control
Basic Effect: Change in the pattern of responding after manipulation of the independent variable (level, trend, variability).
Experimental Control: At least three demonstrations of the basic effect, each at a different point in time.

Design Evaluation
Meets Design Standards: IV manipulated directly; IOA documented (e.g., 80% agreement or a kappa of .60) on at least 20% of data points in each phase; design allows opportunity to assess the basic effect at three different points in time; five data points per phase (or design equivalent; ATD: four-comparison option).
Meets Design Standards with Reservations: all of the above, except at least three data points per phase.
Does Not Meet Design Standards: otherwise. (Kratochwill & Levin, Psychological Methods, 2010)
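The three-way design-evaluation rubric can be sketched as a small checker. This is only an illustration: the function and parameter names are mine, the thresholds come from the slides (80% agreement or kappa of .60, IOA on at least 20% of points, five data points per phase, or three with reservations), and it is not the WWC's actual review instrument.

```python
def rate_design(phase_lengths, ioa, ioa_coverage, effect_demonstrations):
    """Apply the Meets / Meets with Reservations / Does Not Meet rubric.

    phase_lengths: data points in each phase used to demonstrate an effect
    ioa: inter-assessor agreement as a proportion (the standards also
         accept a Cohen's kappa of at least .60 as the threshold)
    ioa_coverage: proportion of data points on which IOA was collected
    effect_demonstrations: opportunities to assess the basic effect at
         different points in time (three are required)
    """
    if ioa < 0.80 or ioa_coverage < 0.20 or effect_demonstrations < 3:
        return "Does Not Meet Design Standards"
    if all(n >= 5 for n in phase_lengths):
        return "Meets Design Standards"
    if all(n >= 3 for n in phase_lengths):
        return "Meets Design Standards with Reservations"
    return "Does Not Meet Design Standards"

# An ABAB design: four phases with 5, 5, 4, and 5 data points, IOA of .92
# collected on 25% of points, and three opportunities to show the effect.
print(rate_design([5, 5, 4, 5], ioa=0.92, ioa_coverage=0.25,
                  effect_demonstrations=3))
# -> Meets Design Standards with Reservations (one phase has only 4 points)
```

Note that the gating criteria (manipulation, IOA, three demonstrations) are all-or-nothing, while the data-points-per-phase criterion is what separates "Meets" from "Meets with Reservations."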
Random Thoughts On Enhancing the Scientific Credibility of Single-Case Intervention Research: Randomization to the Rescue for Designs (Kratochwill & Levin, Psychological Methods, 2010)
Why Random?
Internal Validity: Promotes the status of single-case research by increasing the scientific credibility of its methodology; the tradition has been replication, and with the use of randomization these procedures can rival randomized clinical trials.
Statistical-Conclusion Validity: Legitimizes the conduct of various statistical tests and one's interpretation of results; the tradition has been visual analysis.

Traditional ABAB Design
Visual Analysis of Single-Case Intervention Data
WWC Standards: Evaluating Single-Case Design Outcomes With Visual Analysis - Evidence Criteria
Visual Analysis of Single-Case Evidence
Traditional Method of Data Evaluation for SCDs:
Determine whether evidence of a causal relation exists
Characterize the strength or magnitude of that relation
The singular approach used by the WWC for rating SCD evidence

Methods for Effect-Size Estimation:
Several methods have been proposed. SCD WWC Panel members are among those developing these methods, but the methods are still being tested and some are now comparable with group-comparison studies. WWC standards for effect size are being assessed as the field reaches greater consensus on appropriate statistical approaches.

Goal, Rationale, Advantages, and Limitations of Visual Analysis
Goal is to Identify Intervention Effects
A basic effect is a change in the dependent variable in response to researcher manipulation of the independent variable.
Subjective determination of evidence, but practice and a common framework for applying visual analysis can help to improve agreement rates.
Evidence criteria are met by examining effects that are replicated at different points.
Encourages Focus on Interventions with Strong Effects
Strong effects are generally desired by applied researchers and clinicians.
Weak results are filtered out because effects should be clear from looking at the data - viewed as an advantage.
Statistical evaluation can be more sensitive than visual analysis in detecting intervention effects.
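The features a visual analyst inspects (level, trend, variability, and score overlap between phases) can each be quantified. A small sketch with made-up baseline and intervention data, using common summary statistics rather than any WWC-prescribed algorithm (the function names and example numbers are mine):

```python
from statistics import mean, stdev

def phase_features(series):
    """Level (mean), trend (least-squares slope), variability (SD)."""
    n = len(series)
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(series)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, series)) / \
            sum((x - x_bar) ** 2 for x in xs)
    return {"level": y_bar, "trend": slope, "variability": stdev(series)}

def overlap(baseline, intervention):
    """Proportion of intervention points falling within the baseline range -
    one simple way to express score overlap between adjacent phases."""
    lo, hi = min(baseline), max(baseline)
    return sum(lo <= y <= hi for y in intervention) / len(intervention)

baseline = [12, 14, 13, 15, 13]   # hypothetical problem-behavior counts
intervention = [9, 7, 6, 5, 4]

print(phase_features(baseline))
print(phase_features(intervention))
print(overlap(baseline, intervention))  # 0.0 -> no overlapping data points
```

Here a visual analyst would see a drop in level, a downward trend in the intervention phase, and no overlap: the kind of clear, strong effect the slides describe visual analysis as favoring.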
Goal, Rationale, Advantages, Limitations (cont'd)
Statistical Evaluation and Visual Analysis are Not Fundamentally Different in terms of Controlling Errors (Kazdin, 2011): both attempt to avoid Type I and Type II errors.
Type I: Concluding the intervention produced an effect when it did not.
Type II: Concluding the intervention did not produce an effect when it did.
Possible Limitations of Visual Analysis: Lack of concrete decision-making rules (e.g., in contrast to p