
Standards of Evidence for Substance Abuse (SU) Prevention

Brian R. Flay, D.Phil.
Distinguished Professor, Public Health and Psychology
University of Illinois at Chicago
and Chair, Standards Committee, Society for Prevention Research

Prepared for Addiction Studies Workshop for Journalists
New York, January 11, 2005

What is SU Prevention?

• Programs or other interventions designed to prevent youth from becoming users or abusers of alcohol, tobacco and other drugs (ATOD) and to prevent the long-term health and social consequences of SU.
• Types of programs:
  – School-based educational programs
    • E.g., Life Skills Training, DARE
  – Mass media campaigns
    • E.g., the Office of National Drug Control Policy (ONDCP) campaign
  – Community-based programs
    • E.g., Midwestern Prevention Project (Project STAR)
• Age:
  – Most programs target middle school
  – Some target high school
  – More and more target elementary school and K-12

Approaches to SU Prevention: Historical View

1. Information
2. Scare tactics
3. Affective education
4. Resistance skills
5. Social/life skills
6. Comprehensive

Information and Fear

• Early approaches relied on information
  – “If only they knew (the consequences of what they are doing), they wouldn’t do it.”
• Some used fear tactics
  – Showed consequences, e.g., blackened lungs
  – “Reefer Madness”
  – Some of these had incorrect information
  – All were misleading to some degree
    • For example, not showing the benefits of use
• Neither approach reduces SU
  – Some informational approaches improve knowledge of consequences, but they do not reduce SU
  – Indeed, some caused negative effects (greater SU)

Affective (Feelings-Based) Approaches

• Values clarification
  – Techniques for thinking about values were taught
  – Tied to decision-making skills
• Decision-making skills
  – Think about alternative behaviors
  – Consider the positive and negative outcomes of each
  – Apply some kind of weighting scheme
• These approaches were not effective alone
  – Values were not taught
  – Decision-making skills improved, but SU was not reduced
  – Decision-making is just another way of using information

Resistance Skills

• Developed in the late 1970s through the 1980s
• Just Say “No”
  – This simplistic approach was popularized by Nancy Reagan
• Resistance skills are much more complex
  – Many different ways of saying “no”
  – Many other considerations
    • Peer pressure, direct or indirect
    • Consequences of different ways of resisting pressure
• The approach reduced SU in small-scale studies
  – Then some researchers thought they had a program
  – They just had one important (maybe) component for a more comprehensive, effective program

Life Skills Training

• Developed by Gilbert Botvin of Cornell, 1980
• More comprehensive than prior programs
• 30 sessions over grades 7-9
• Components:
  – Information about short- and long-term consequences, prevalence rates, social acceptability, habit development, addiction
  – Decision-making, independent thinking
  – Social influences and persuasive tactics
  – Skills for coping with anxiety
  – Social skills: communication, assertiveness, refusal
• Effectively reduced SU over multiple studies, including one study of long-term effects (through grade 12)

Getting More Comprehensive

• Adding small media to school-based programs
  – Videos, newsletters to engage parents
• Using mass media
  – Television
  – Multiple tests have failed, but well-designed programs based on theory and careful developmental research can be effective (Vermont study: Worden, Flynn)
• Incorporating community components
  – E.g., Midwestern Prevention Project
  – Difficult, expensive, and long-term effects not clear
• In general, adding mass media, family or community components improves effectiveness (Flay, 2000)
• Addressing multiple behaviors has also been found to be more effective (Flay, 2002)

Why a Comprehensive Approach Is Necessary

• There are multiple influences on (i.e., causes of) behavior, especially its development:
  – Biology
    • Genetic predispositions
  – Social contexts
    • Parents, friends, others
  – The broad socio-cultural environment
    • Religion, politics, economics, justice, education
• All of these have distal (causally distant) and proximal (causally close) qualities
• The Theory of Triadic Influence incorporates all of these from multiple theories

THE BASICS OF THE THEORY OF TRIADIC INFLUENCE

[Diagram: The Theory of Triadic Influence, organized from distal to intermediate to proximal levels. Three streams of influence (biology/personality, the social context, and the broad cultural/informational environment) flow through intermediate variables such as sense of self, social skills and self-determination; bonding, others’ behavior and attitudes, perceived norms and motivation to comply; and cultural knowledge, expectancies, values and evaluations. These feed self-efficacy, social normative beliefs and attitudes, which converge on decisions/intentions and, finally, behavior.]

Example of the Application of TTI to Program Content

Effects of Aban Aya Program on Reducing SU Onset

[Figure: Substance use among boys, grades 5-8. Y-axis: proportion reporting any substance use (0.3 to 0.9), derived from a proportional-odds growth model; x-axis: grade (5 to 8). Three conditions are plotted: Control, School Only, and School + Community.]

Pressure for Programs of “Proven Effectiveness”

• The Federal Government increasingly requires that Federal money be spent only on programs of “proven effectiveness”:
  – Center for Substance Abuse Prevention (CSAP) in the Substance Abuse and Mental Health Services Administration (SAMHSA)
  – U.S. Department of Education (DE)
  – Office of Juvenile Justice and Delinquency Prevention (OJJDP)

What Is “Proven Effectiveness”?

• Requires rigorous research methods to determine that observed effects were caused by the program being tested, rather than by some other cause.
• E.g., the Food and Drug Administration (FDA) requires at least two randomized controlled trials (RCTs) before approving a new drug.
• RCTs are always expensive, and there are many challenges to conducting them, especially in schools.

Standards of Evidence

• Each government agency and academic group that has reviewed programs for lists has its own set of standards.
• They are all similar but not equal
  – E.g., CSAP allows more studies in than DE
• All concern the rigor of the research
• The Society for Prevention Research recently created standards for the field
• Our innovation was to consider standards for efficacy, effectiveness and dissemination

Members of the SPR Standards Committee

• Brian R. Flay (Chair), D.Phil., U of Illinois at Chicago
• Anthony Biglan, Ph.D., Oregon Research Institute
• Robert F. Boruch, Ph.D., U of Pennsylvania
• Felipe G. Castro, Ph.D., M.P.H., Arizona State U
• Denise Gottfredson, Ph.D., U of Maryland
• Sheppard Kellam, M.D., AIR
• Eve K. Moscicki, Sc.D., M.P.H., NIMH
• Steven Schinke, Ph.D., Columbia U
• Jeff Valentine, Ph.D., Duke U
• With help from Peter Ji, Ph.D., U of Illinois at Chicago

Standards for Three Levels: Efficacy, Effectiveness and Dissemination

• Efficacy
  – What effects can the intervention have under ideal conditions?
• Effectiveness
  – What effects does the intervention have under real-world conditions?
• Dissemination
  – Is an effective intervention ready for broad application or distribution?
• Desirable
  – Additional criteria that provide added value to evaluated interventions

Overlapping Standards

• Efficacy standards are basic
  – Required at all three levels
• Effectiveness standards include all efficacy standards plus others
• Dissemination standards include all efficacy and effectiveness standards plus others

Four Kinds of Validity
(Cook & Campbell, 1979; Shadish, Cook & Campbell, 2002)

• Construct validity
  – Program description and measures of outcomes
• Internal validity
  – Was the intervention the cause of the change in the outcomes?
• External validity (generalizability)
  – Was the intervention tested on relevant participants and in relevant settings?
• Statistical validity
  – Can accurate effect sizes be derived from the study?
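The last point, accurate effect sizes, can be made concrete with one common metric: Cohen's d, the difference between group means divided by the pooled standard deviation. This is an illustrative sketch, not material from the talk; the sample scores are hypothetical.

```python
# Illustrative sketch (not from the talk): Cohen's d, one common
# effect-size metric: the difference in group means divided by the
# pooled standard deviation. Sample scores below are hypothetical.
import statistics

def cohens_d(group1, group2):
    """Standardized mean difference using the pooled (unbiased) SD."""
    n1, n2 = len(group1), len(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    pooled_sd = (((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)) ** 0.5
    return (statistics.mean(group1) - statistics.mean(group2)) / pooled_sd

if __name__ == "__main__":
    program = [3.1, 2.8, 3.5, 3.0, 2.9, 3.3]  # hypothetical outcome scores
    control = [2.5, 2.7, 2.4, 2.9, 2.6, 2.8]
    print(round(cohens_d(program, control), 2))  # → 1.98
```

Reporting a standardized difference like this, rather than only a p-value, is what lets reviewers compare the size of effects across studies.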

Specificity of Efficacy Statement

• “Program X is efficacious for producing Y outcomes for Z population.”
  – The program (or policy, treatment, strategy) is named and described
  – The outcomes for which effects are claimed are clearly stated
  – The population to which the claim can be generalized is clearly defined

Program Description

• Efficacy
  – The intervention must be described at a level that would allow others to implement or replicate it
• Effectiveness
  – Manuals, training and technical support must be available
  – The intervention should be delivered under the same kinds of conditions as one would expect in the real world
  – A clear theory of causal mechanisms should be stated
  – A clear statement of “for whom?” and “under what conditions?” the intervention is expected to work
• Dissemination
  – The provider must have the ability to “go to scale”

Program Outcomes

• For ALL levels:
  – The claimed public health or behavioral outcome(s) must be measured
    • Attitudes or intentions cannot substitute for actual behavior
  – At least one long-term follow-up is required
    • The appropriate interval may vary by type of intervention and the state of the field

Measures

• Efficacy
  – Psychometrically sound measures
    • Valid
    • Reliable (internal consistency, test-retest or inter-rater reliability)
  – Data collectors independent of the intervention
• Effectiveness
  – Implementation and exposure must be measured
    • Level and integrity (quality) of implementation
    • Acceptance/compliance/adherence/involvement of the target audience in the intervention
• Dissemination
  – Monitoring and evaluation tools available

Desirable Standards for Measures

• For ALL levels:
  – Multiple measures
  – Mediating variables (or immediate effects)
  – Moderating variables
  – Potential side-effects
  – Potential iatrogenic (negative) effects

Design – for Causal Clarity

• At least one comparison group
  – No-treatment, usual care, placebo or wait-list
• Assignment to conditions must maximize causal clarity
  – Random assignment is “the gold standard”
  – Other acceptable designs:
    • Repeated time-series designs
    • Regression-discontinuity designs
    • Well-done matched controls
      – Demonstrated pretest equivalence on multiple measures
      – Known selection mechanism

Level of Randomization

• In many drug and medical trials, individuals are randomly assigned
• In educational trials, classrooms or schools must be the unit of assignment
  – Students within classes/schools are not statistically independent: they are more alike than students in other classes/schools
• Such trials need large samples
  – Four or more schools per condition, preferably 10 or more, in order to have adequate statistical power
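The non-independence point above is usually quantified with the intraclass correlation (ICC) and the Kish design effect, DEFF = 1 + (m - 1) × ICC. Neither formula appears in the talk, but a short sketch shows why a few large schools buy far less statistical power than their raw student count suggests. The numbers are hypothetical.

```python
# Illustrative sketch (not from the talk): why clustered trials need many
# schools. Students within a school are correlated (intraclass correlation,
# ICC), so the Kish design effect DEFF = 1 + (m - 1) * ICC inflates the
# variance of estimates, shrinking the effective sample size.

def design_effect(cluster_size: int, icc: float) -> float:
    """Variance inflation factor for clusters of a given size."""
    return 1 + (cluster_size - 1) * icc

def effective_n(n_clusters: int, cluster_size: int, icc: float) -> float:
    """How many independent observations the clustered sample is worth."""
    total_n = n_clusters * cluster_size
    return total_n / design_effect(cluster_size, icc)

if __name__ == "__main__":
    # Hypothetical numbers: 10 schools of 100 students each, ICC = 0.02
    # (a plausible magnitude for school-based behavioral outcomes).
    print(round(effective_n(10, 100, 0.02), 1))  # → 335.6
```

Even a small ICC makes 1,000 clustered students worth only a few hundred independent ones, which is why the standard calls for many schools per condition rather than many students in few schools.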

Generalizability of Findings

• Efficacy
  – The sample is defined:
    • Who it is (from what “defined” population)
    • How it was obtained (sampling methods)
• Effectiveness
  – Description of the real-world target population and sampling methods
  – The degree of generalizability should be evaluated
• Desirable
  – Subgroup analyses
  – Dosage studies/analyses
  – Replication with different populations
  – Replication with different program providers

Precision of Outcomes: Statistical Analysis

• Statistical analysis allows unambiguous causal statements
  – Conducted at the same level as randomization, and includes all cases assigned to conditions
  – Tests for pretest differences
  – Adjustments for multiple comparisons
  – Analyses of (and adjustments for) attrition
    • Rates, patterns and types
• Desirable
  – Report the extent and patterns of missing data
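The requirement to analyze “at the same level as randomization” can be sketched simply: when whole schools are the randomized units, a valid analysis compares school-level summaries rather than pooling individual students (pooling ignores the clustering and overstates precision). A minimal illustration with simulated, hypothetical data:

```python
# Illustrative sketch (not from the talk): analyzing at the level of
# randomization. When whole schools are randomized, a simple valid
# analysis compares school-level summaries, not pooled student records.
import random
import statistics

def condition_difference(treated_schools, control_schools):
    """Difference in condition means, computed over school-level means."""
    treated_means = [statistics.mean(school) for school in treated_schools]
    control_means = [statistics.mean(school) for school in control_schools]
    return statistics.mean(treated_means) - statistics.mean(control_means)

if __name__ == "__main__":
    random.seed(42)
    # Hypothetical data: 4 schools per condition, 30 students each;
    # a student's outcome is 1 if they report any substance use.
    control = [[int(random.random() < 0.40) for _ in range(30)] for _ in range(4)]
    program = [[int(random.random() < 0.30) for _ in range(30)] for _ in range(4)]
    print(round(condition_difference(program, control), 3))
```

In practice researchers use multilevel (mixed-effects) models for the same purpose, but the school-level comparison shows the principle: the unit that was randomized is the unit that is analyzed.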

Precision of Outcomes: Statistical Significance

• Statistically significant effects
  – Results must be reported for all measured outcomes
  – Efficacy can be claimed only for constructs with a consistent pattern of statistically significant positive effects
  – There must be no statistically significant negative (iatrogenic) effects on important outcomes

Precision of Outcomes: Practical Value

• Efficacy
  – Demonstrated practical significance in terms of public health (or other relevant) impact
  – Report of effects for at least one follow-up
• Effectiveness
  – Report empirical evidence of practical importance
• Dissemination
  – Clear cost information available
• Desirable
  – Cost-effectiveness or cost-benefit analyses

Precision of Outcomes: Replication

• Consistent findings from at least two different high-quality studies/replicates that meet all of the other criteria for efficacy, each of which has adequate statistical power
  – Flexibility may be required in the application of this standard in some substantive areas
• When more than two studies are available, the preponderance of evidence must be consistent with that from the two most rigorous studies
• Desirable
  – The more replications the better

Additional Desirable Criteria for Dissemination

• Organizations that choose to adopt a prevention program that barely, or not quite, meets all criteria should seriously consider undertaking a replication study as part of the adoption effort, so as to add to the body of knowledge.
• A clear statement of the factors that are expected to assure the sustainability of the program once it is implemented.

Embedded Standards

• Efficacy: 20
• Effectiveness: 28
• Dissemination: 31
• Desirable: 43

What Programs Come Close to Meeting These Standards?

• Life Skills Training (Botvin)
  – Multiple RCTs with different populations, implementers and types of training
  – Only one long-term follow-up
  – Independent replications of short-term effects are now appearing (as well as some failures)
  – No independent replications of long-term effects yet

Comprehensive Programs?

• No comprehensive programs yet meet the standards
• Some have one study that meets most of them, but without long-term outcomes or replications (e.g., Project STAR)
• The most promising comprehensive program is Positive Action
  – K-12 curriculum, teacher training, school climate change, family and community involvement
  – Improves both behavior (SU, violence, etc.) and academic achievement
  – Well-designed matched controlled studies
  – Randomized trials currently in progress

Programs for Which the Research Meets the Standards, But That Do Not Work

• DARE
  – Many quasi-experimental and non-experimental studies suggested effectiveness
  – Multiple RCTs found no effects (Ennett et al., 1994 meta-analysis)
• Hutchinson Smoking Prevention Project (Peterson et al., 2000)
  – Well-designed RCT
  – Published results found no long-term effects
  – But no published information on the program or on short-term effects
  – The published findings cannot be interpreted because of this lack of information; they certainly cannot be interpreted to suggest that social-influences approaches can never have long-term effects

How You Can Use the Standards When Questioning Public Officials

• Has the program been evaluated in a randomized controlled trial (RCT)?
• Were classrooms or schools randomized to program and control (no-program or alternative-program) conditions?
• Has the program been evaluated on populations like yours?
• Have the findings been replicated?
• Were the evaluators independent of the program developers?