Two essential characteristics of an experiment:

• Random assignment of subjects (participants) to groups (e.g., treatments or conditions) AND

• Treatments or conditions that manipulate the independent variable (IV).
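Random assignment can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed procedure: the function name `randomly_assign`, the subject IDs, and the fixed seed are assumptions made for the example.

```python
import random

def randomly_assign(subjects, conditions, seed=None):
    """Shuffle the subjects, then deal them round-robin into the conditions."""
    rng = random.Random(seed)      # seeded generator keeps the example repeatable
    shuffled = list(subjects)      # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for i, subject in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(subject)
    return groups

# Hypothetical pool of 20 participants.
subjects = [f"S{i:02d}" for i in range(1, 21)]
groups = randomly_assign(subjects, ["treatment", "control"], seed=42)

for name, members in groups.items():
    print(name, len(members), sorted(members))
```

Because every subject has the same chance of landing in either group, pre-existing differences between subjects are spread across groups by chance rather than by the researcher's choice.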

GROUP COMPARISON RESEARCH

EXPERIMENTS

Evaluating Experiments: Internal and External Validity

Threats to Internal Validity

• (1) Maturation: passage of time may produce changes in research participants.

• (2) Historical effects: some significant historical event(s) may impact performance on the DVs (“time of measurement effects”).

• (3) Testing effects: practice and reactivity.


• (4) Instrumentation: differences in measurement techniques and measures over different measurement periods make it difficult to assess true changes in the DVs.

• (5) Regression to the mean: occurs whenever groups are selected based on extreme scores (i.e., very high or very low); scores “regress” toward their true mean over repeated measurement occasions.


• (6) Experimental mortality: participant attrition, or drop-out, from a study.

• (7) Participant selection: obtained effects on the DV are a function of characteristics of the sample.

• (8) Selection-by-maturation interaction: maturation effects are found in some samples, but not in others.
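Regression to the mean (threat 5) is easy to see in a small simulation. The sketch below is illustrative only: the score scale (true-score mean 100, SD 10), the error SD of 10, the sample size, and the top-10% selection rule are all assumptions, and no treatment of any kind is applied between the two tests.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable
N = 5000

# Each observed score = stable true score + independent measurement error.
true_scores = [rng.gauss(100, 10) for _ in range(N)]
test1 = [t + rng.gauss(0, 10) for t in true_scores]
test2 = [t + rng.gauss(0, 10) for t in true_scores]

# Select an "extreme" group: the top 10% of scorers on the first test.
cutoff = sorted(test1)[int(0.9 * N)]
selected = [i for i in range(N) if test1[i] >= cutoff]

mean_test1 = statistics.mean(test1[i] for i in selected)
mean_test2 = statistics.mean(test2[i] for i in selected)
overall_mean = statistics.mean(test2)

print(f"selected group, test 1: {mean_test1:.1f}")
print(f"selected group, test 2: {mean_test2:.1f}")
print(f"everyone, test 2:       {overall_mean:.1f}")
```

The selected group scores lower on the second test than on the first, yet still above the overall mean. In a one-group study, that drop could be mistaken for a treatment effect.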

External Validity

• People

• Situations (contexts)

• Time (history)

Factors that influence external (ecological) validity

• Explicit description of the experimental conditions.

• Multiple-treatment interference.

• Hawthorne (observer) effects.

• Novelty and disruption effects.

• Experimenter effects (teacher v. researcher)

• Pretest and post-test sensitization

Additional factors that influence external validity

• History X treatment interaction effects.

• The manner in which the dependent variable is measured.

• Interaction of time of measurement (immediate v. delayed) and treatment effects.

Improving External Validity (1)

Use random selection rather than a nonrandom procedure.

Keep attrition low.

Describe how your study’s setting and other settings differ; provide data about similarity between various groups of students, schools, and historical times.

Improving External Validity (2)

Conduct your study in a variety of schools, with different students, and at different times.

Replicate your study.

Experimental Design Issues

• How many treatments are involved in the experiment?

• Will a control or comparison group be used?

• Will a pretest of the DV be used?

• How many times will the DV be measured?

• How will internal and external validity threats be controlled?

Design 1:

ONE-GROUP PRETEST-POSTTEST DESIGN

pretest posttest

Group 1: O1 X O2

Problems with this pre-experimental design:

(1) Can't assume that any observed change is due to the treatment itself; other factors may explain the change.

(2) The design lacks internal validity.

(3) What is the effect of the pretest on subject performance?

Design 2:

POSTTEST ONLY CONTROL GROUP DESIGN (Static group comparison)

Group 1: X O11 (treatment)

Group 2: - O21 (control)

Problem with this design:

lacks random assignment; we can't assume groups are equivalent prior to treatment.

Design 3:

RANDOMIZED SUBJECTS; POSTTEST-ONLY CONTROL GROUP DESIGN

Group 1: [R] X O11 (treatment)

Group 2: [R] - O21 (control)

Strengths of design:

• Simple, but powerful experimental design.

• No pretest: useful in studies where pretest sensitivity/reactivity is likely, or pretest is not available.

• Main advantage: randomization--controls for several internal validity threats.

• Can be used with >2 groups.

Design 4:

RANDOMIZED MATCHED SUBJECTS, POST-TEST ONLY CONTROL GROUP

Group 1: [R] X O11 (treatment)

Matching

Group 2: [R] - O21 (control)

Similar to Design 3, except that Ss are first matched on some variables (e.g., IQ or reading achievement) before random assignment. Pretest scores can be used to match Ss. Matching variables are presumed to be correlated with the DV.

Within each matched pair, one S is randomly assigned to treatment and one S to control. Useful in studies with small sample sizes.
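The matching step can be sketched as follows. This is one possible way to do it, assuming a single numeric matching variable; the function name `matched_pair_assignment` and the reading scores are hypothetical.

```python
import random

def matched_pair_assignment(scores, seed=None):
    """Pair subjects with adjacent scores, then randomly split each pair.

    scores: dict mapping subject ID -> score on the matching variable.
    With an odd number of subjects, the lowest scorer is left unassigned.
    """
    rng = random.Random(seed)
    ranked = sorted(scores, key=scores.get, reverse=True)
    treatment, control = [], []
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]  # two most similar remaining subjects
        rng.shuffle(pair)                  # chance decides who gets the treatment
        treatment.append(pair[0])
        control.append(pair[1])
    return treatment, control

# Hypothetical reading-achievement scores used as the matching variable.
reading_scores = {"S1": 95, "S2": 72, "S3": 88, "S4": 70,
                  "S5": 91, "S6": 85, "S7": 60, "S8": 64}
treatment, control = matched_pair_assignment(reading_scores, seed=1)

for t, c in zip(treatment, control):
    print(f"{t} ({reading_scores[t]}) -> treatment | {c} ({reading_scores[c]}) -> control")
```

Matching keeps the two groups close on the matching variable even when the sample is small. Random assignment within each pair still decides who receives the treatment.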

Design 5:

RANDOMIZED SUBJECTS, PRETEST-POSTTEST CONTROL GROUP DESIGN

(a true experimental design)

pretest posttest

Experiment: [R] O11 X O12

Control group: [R] O21 O22

Strengths of this design:

(1) Initial randomization of groups assures statistical equivalence prior to treatment.

(2) Allows experimenter to study change (in attitudes, learning, behaviors, etc.) due to treatment.

(3) Pretest assures equivalence of groups.

(4) Controls for most threats to internal validity: history, maturation, and pretesting; also differential selection of Ss and statistical regression.

Weakness of this design:

Threat to internal validity: what effect does the pretest have on subjects’ responses to the treatment?
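One common way to summarize data from this design is to compare the mean pretest-to-posttest gain in each group. The sketch below assumes that approach; the scores are invented for illustration and the helper `mean_gain` is hypothetical.

```python
from statistics import mean

def mean_gain(pre, post):
    """Average posttest-minus-pretest change across subjects."""
    return mean(after - before for before, after in zip(pre, post))

# Invented scores for five subjects per group (O11/O12 and O21/O22).
experimental_pre = [10, 12, 11, 9, 13]
experimental_post = [15, 16, 14, 13, 18]
control_pre = [11, 10, 12, 9, 13]
control_post = [12, 11, 12, 10, 14]

gain_experimental = mean_gain(experimental_pre, experimental_post)
gain_control = mean_gain(control_pre, control_post)

# The control group's gain estimates what history, maturation, and testing
# alone would produce; the difference is attributed to the treatment.
treatment_effect = gain_experimental - gain_control

print(f"experimental gain: {gain_experimental:.1f}")
print(f"control gain:      {gain_control:.1f}")
print(f"difference:        {treatment_effect:.1f}")
```

A real analysis would add a significance test (for example, a t-test on gains or an analysis of covariance). This sketch shows only the logic of the comparison.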

Design 6:

SOLOMON THREE-GROUP DESIGN

Group 1: [R] O11 X O12

Group 2: [R] O21 - O22

Group 3: [R] - X O32

Design 7:

SOLOMON FOUR-GROUP DESIGN

Group 1: [R] O11 X O12

Group 2: [R] O21 - O22

Group 3: [R] - X O32

Group 4: [R] - - O42
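The four posttest observations can be read as a 2 x 2 layout (pretested or not, treated or not). The sketch below assumes that reading and uses invented posttest means. It estimates the treatment effect with and without a pretest and takes the difference as an estimate of pretest sensitization.

```python
# Invented posttest means for the four groups (not real data).
posttest_means = {
    ("pretested", "treatment"): 18,    # Group 1: O12
    ("pretested", "control"): 12,      # Group 2: O22
    ("unpretested", "treatment"): 16,  # Group 3: O32
    ("unpretested", "control"): 11,    # Group 4: O42
}

# Treatment effect among subjects who took the pretest.
effect_with_pretest = (posttest_means[("pretested", "treatment")]
                       - posttest_means[("pretested", "control")])

# Treatment effect among subjects who did not take the pretest.
effect_without_pretest = (posttest_means[("unpretested", "treatment")]
                          - posttest_means[("unpretested", "control")])

# Effect of the pretest alone, seen in the untreated groups.
pretest_only_effect = (posttest_means[("pretested", "control")]
                       - posttest_means[("unpretested", "control")])

# If the pretest changes how subjects respond to X, the two effects differ.
sensitization = effect_with_pretest - effect_without_pretest

print("treatment effect with pretest:   ", effect_with_pretest)
print("treatment effect without pretest:", effect_without_pretest)
print("pretest-only effect:             ", pretest_only_effect)
print("pretest x treatment interaction: ", sensitization)
```

A sensitization value near zero would suggest that the pretest does not change how subjects respond to the treatment.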

Factorial designs

• Include more than one independent variable.

• Often have more than one dependent variable.

• Can examine interactions between independent variables as well as the “main effects” of the individual independent variables on the dependent variables.

Design 8: SIMPLE FACTORIAL

EXPERIMENTAL VARIABLES

Independent variable 1: Teacher skill (novice vs. expert teacher)

Independent variable 2: Presentation (lecture vs. multimedia)

Student attribute variable: Aptitude

Dependent variable: Achievement

• What are the main effects of teacher skill on student achievement?

• What are the main effects of presentation format on student achievement?

• What are the interactive effects of skill and presentation on student achievement?

• Do these effects differ by the aptitude level (e.g., high, average, low) of students?

FACTORIAL DESIGN Groups:

Novice/Lecture, High Aptitude        Expert/Lecture, High Aptitude

Novice/Lecture, Low Aptitude         Expert/Lecture, Low Aptitude

Novice/Multimedia, High Aptitude     Expert/Multimedia, High Aptitude

Novice/Multimedia, Low Aptitude      Expert/Multimedia, Low Aptitude

Dependent Variable: Achievement

POTENTIAL RESULTS of a FACTORIAL DESIGN

M.E. for I.V. #1?      M.E. for I.V. #2?      Interaction btwn
(“Teacher skill”)      (“Presentation”)       I.V. #1 & #2?

NO                     NO                     NO
YES                    NO                     NO
NO                     YES                    NO
YES                    YES                    NO
NO                     NO                     YES
YES                    NO                     YES
NO                     YES                    YES
YES                    YES                    YES
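Main effects and the interaction can be computed directly from cell means. The sketch below uses the teacher-skill and presentation factors from the example, with invented achievement means. It treats a main effect as a difference between marginal means and the interaction as a difference of differences; real studies would also test these for significance.

```python
from statistics import mean

# Invented mean achievement scores for each cell of the 2 x 2 design.
cell_means = {
    ("Novice", "Lecture"): 60,
    ("Expert", "Lecture"): 70,
    ("Novice", "Multimedia"): 65,
    ("Expert", "Multimedia"): 85,
}

def marginal_mean(factor_index, level):
    """Average the cell means over the other factor."""
    return mean(v for key, v in cell_means.items() if key[factor_index] == level)

# Main effect of teacher skill: expert vs. novice, averaged over presentation.
skill_effect = marginal_mean(0, "Expert") - marginal_mean(0, "Novice")

# Main effect of presentation: multimedia vs. lecture, averaged over skill.
presentation_effect = marginal_mean(1, "Multimedia") - marginal_mean(1, "Lecture")

# Interaction: does the expert advantage depend on the presentation format?
expert_gain_multimedia = cell_means[("Expert", "Multimedia")] - cell_means[("Novice", "Multimedia")]
expert_gain_lecture = cell_means[("Expert", "Lecture")] - cell_means[("Novice", "Lecture")]
interaction = expert_gain_multimedia - expert_gain_lecture

print("main effect of teacher skill:", skill_effect)
print("main effect of presentation: ", presentation_effect)
print("interaction:                 ", interaction)
```

With these invented numbers, all three quantities are nonzero, which corresponds to the YES / YES / YES pattern among the potential results.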

Design 9: NON-RANDOMIZED CONTROL GROUP, PRETEST-POSTTEST DESIGN

Group 1: O11 X O12

Group 2: O21 - O22

(groups not randomly formed)

Design 10: COUNTERBALANCED DESIGN

Experiment 1

Group 1: Treatment A

Group 2: Treatment B

Replication (Experiment 2)

Group 1: Treatment B

Group 2: Treatment A
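The counterbalancing schedule can be generated mechanically. The sketch below assumes complete counterbalancing (every possible order of the treatments is used once); the variable names are illustrative.

```python
from itertools import permutations

treatments = ["A", "B"]

# Every possible order of the treatments; each group receives one order.
orders = list(permutations(treatments))
schedule = {f"Group {i + 1}": order for i, order in enumerate(orders)}

for group, order in schedule.items():
    steps = ", ".join(f"Experiment {n}: Treatment {t}"
                      for n, t in enumerate(order, start=1))
    print(f"{group} -> {steps}")
```

With two treatments this gives two orders. Because the number of orders grows factorially with the number of treatments, complete counterbalancing is practical only for small designs.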

Design 11:

ONE-GROUP TIME SERIES DESIGN

Pretests Posttests

Group 1: O1 O2 O3 X O4 O5 O6

Strengths:

(1) Multiple testing provides a check on threats to internal validity: maturation, testing, and regression can be accounted for.

Weaknesses:

(1) Fails to control for internal validity threat of history: perhaps some other, unexplained variable accounts for any observed change in DV.

(2) External validity problem: effect of repeated testing.

(3) Selection-maturation interaction may occur if atypical groups are selected.

(4) Statistical analysis/interpretation may be difficult.

Design 12: CONTROL GROUP TIME SERIES DESIGN

Pretests Posttests

Group 1: O1 O2 O3 X O4 O5 O6

Group 2: O1 O2 O3 - O4 O5 O6
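One simple way to read a control-group time series is to compare the change in level around X in each group. The sketch below assumes that approach, with invented observations and a hypothetical helper `level_change`; it ignores trends and autocorrelation, which a full time-series analysis would need to address.

```python
from statistics import mean

def level_change(series, n_pre):
    """Mean of the observations after X minus the mean of those before X."""
    return mean(series[n_pre:]) - mean(series[:n_pre])

# Invented observations O1..O6 for each group; X falls between O3 and O4.
treated = [10, 11, 10, 15, 16, 15]     # Group 1 receives X
comparison = [10, 10, 11, 11, 10, 11]  # Group 2 does not

change_treated = level_change(treated, n_pre=3)
change_comparison = level_change(comparison, n_pre=3)

# A historical event affecting both groups would appear in both changes,
# so subtracting the comparison group's change helps rule out history.
net_change = change_treated - change_comparison

print(f"change in treated group:    {change_treated:.2f}")
print(f"change in comparison group: {change_comparison:.2f}")
print(f"net change:                 {net_change:.2f}")
```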

Miller, Miller, & Rosen

• Reciprocal teaching of reading comprehension:

– summarizing a paragraph;

– asking a good question;

– clarifying the hard parts of text;

– predicting what comes next

Experimental design:

• Students randomly assigned to

– modified RT

– control group I

– control group II

• Research Design

– Experimental: R O X O

– Control 1: R O - O

– Control 2: R O - -

Can you summarize the results?

• What were the effects of modified reciprocal teaching?

• Did MRT improve reading comprehension, compared to controls? Did it increase writing skills, compared to controls?

• What were other, non-hypothesized effects of MRT?

Critique

• Why not train classroom teachers to conduct reciprocal teaching?

• Teacher bias affects grading of assignments.

• Little explanation of the reading comprehension strategies.

• Any practical difference among the groups?

• Why focus on student conduct?

Gettinger study: Effects of error correction on third graders’ spelling.

• Does 3rd graders’ spelling improve when instruction provides corrective feedback and sufficient time for mastery learning?

• Can a method successfully utilized under artificial (laboratory) conditions be successful in a classroom learning situation?

Hypothesis: “...students who received the error-correction intervention would evidence higher spelling accuracy than would students who received no additional modification beyond their standard spelling practice, or whose practice was optimized by dividing their words into smaller, daily chunks...”

Random assignment of intact classes to three experimental conditions:

(A) Standard condition

(B) Reduced number of words

(C) Error correction and practice

Dependent Variables:

(1) Spelling accuracy (weekly tests).

(2) Teacher ratings: spelling test performance and spelling accuracy when writing.

Experimental group only: Number of trials-to-criterion (TTC) and orthographic ratings of spelling attempts.

Procedures of the study:

Baseline (6 weeks)

Treatment (6 weeks)

Generalization (6 weeks)

Results

• The error-correction (EC) group had higher weekly spelling test scores during the intervention and generalization phases than the standard-condition (SC) or reduced-number (R#) groups.

• Similar results for dictated stories spelling.

• The EC group received higher teacher ratings during the intervention phase.

• Number of learning trials decreased.

• Orthographic ratings improved over time.

Critique

• Strengths

– good experimental design

– valid measures of spelling achievement

– sufficient treatment duration

– incorporated into regular classroom

• Weaknesses

– sample not fully representative of 3rd-grade students
