Compromised Randomization and Uncertainty of Treatment
Assignments in Social Experiments: The Case of the Perry
Preschool Program
James Heckman
Department of Economics
University of Chicago
[email protected]

Rodrigo R. A. Pinto
Department of Economics
University of Chicago
[email protected]

Azeem M. Shaikh
Department of Economics
University of Chicago
[email protected]

Adam Yavitz
Department of Economics
University of Chicago
[email protected]
July 24, 2009
1 James Heckman is Henry Schultz Distinguished Service Professor of Economics at the University of Chicago, Professor of Science and Society, University College Dublin, Alfred Cowles Distinguished Visiting Professor, Cowles Foundation, Yale University and Senior Fellow, American Bar Foundation. Azeem M. Shaikh is an assistant professor, Department of Economics, University of Chicago. Rodrigo Pinto and Adam Yavitz are graduate students at the University of Chicago. A version of this paper was presented at a seminar at the High/Scope Foundation, Ypsilanti, Michigan, December 2006; at a conference at the Minneapolis Federal Reserve in December 2007; at a National Poverty Center conference, Ann Arbor, Michigan, December 2007; at a conference sponsored by the Jacobs Foundation held at Castle Marbach, Germany, April 2008; at a Leibniz Network Conference on noncognitive skills in Mannheim, May 2008; and at an Institute for Research on Poverty conference, Madison, Wisconsin, June 2008. We thank participants at these conferences and seminars and Ricardo Barros and Peter Savelyev for helpful comments. We are grateful to Larry Schweinhart of the High Scope Foundation for his continual support. This research was supported by the Committee for Economic Development by a grant from the Pew Charitable Trusts and the Partnership for America’s Economic Success (PAES); the JB & MK Pritzker Family Foundation; Susan Thompson Buffett Foundation; NICHD (R01HD043411); and a grant from the American Bar Foundation. The views expressed in this presentation are those of the authors and not necessarily those of the funders listed here. Supplementary materials may be retrieved from http://jenni.uchicago.edu/perry comp rand/.
Abstract
Randomized controlled trials (RCTs) are considered the gold standard for assessing the effectiveness of a treatment. However, social experiments that rely on RCTs are often compromised, and the statistical consequences of these compromises are typically neglected. We develop a general inference framework that accounts for compromised RCTs in social experiments.
Under the null hypothesis of no treatment effects, the outcomes of an RCT should be independent of treatment assignments. However, compromises of the random assignment can induce a spurious dependence between treatment status and outcomes regardless of whether the null hypothesis holds. Failing to account for such dependence can produce biased inferences, casting doubt on the validity of traditional inference. We decompose the compromising features of a randomization into two aspects: (a) the uncertainty about the randomization mechanism that generated the compromises and (b) the role played by unobserved variables of participants in the implemented randomization. We provide a general framework that corrects for both issues.
We apply our analysis to the Perry Preschool Program, an early education intervention targeted toward disadvantaged African Americans. Evidence from the Perry Program is considered a cornerstone in support of early childhood interventions. Randomization in Perry was compromised by reassignment of treatment status according to a partially unknown randomization mechanism. The randomization compromises also depended on unobserved variables of participants. We develop a statistical method that corrects for the Perry compromises using the available information about the reassignment rule. We perform multiple hypothesis tests on a range of human activities. Treatment effects remain statistically significant for both males and females after accounting for the compromised randomization.
KEYWORDS: Multiple Testing, Multiple Outcomes, Randomized Trial, Perry Preschool Program, Program Evaluation, Familywise Error Rate, Exact Inference.
JEL Codes: C12, C82, C93
James J. Heckman
Department of Economics
University of Chicago
1126 East 59th Street
Chicago, Illinois 60637
Telephone: (773) 702-0634
Fax: (773) [email protected]

Rodrigo Pinto
Department of Economics
University of Chicago
1126 East 59th Street
Chicago, Illinois 60637
Phone: (773) 702-3478
Fax: (773) [email protected]

Azeem Shaikh
Department of Economics
University of Chicago
1126 East 59th Street
Chicago, Illinois 60637
Phone: (773) 702-3621
Fax: (773) [email protected]

Adam Yavitz
Department of Economics
University of Chicago
1155 E 60th Street, Suite 227
Chicago, Illinois 60637
Phone: (773) 702-4686
Fax: (773) [email protected]
1 Introduction
Randomized controlled trials (RCTs) are considered the gold standard for assessing the effectiveness of a treatment. Carefully executed social experiments allow researchers to identify mean treatment effects without bias.1 However, compromises are often made in practical applications of the RCT method. These compromises change the statistical properties of RCTs and cast doubt on the validity of traditional inference. This paper examines general sources of compromise in RCT experiments and proposes a framework to generate valid inferences.
Compromised randomization occurs when the actual assignments of treatment and control status among participants deviate from an ideal RCT randomization protocol. A common source of compromise is reassignment of treatment status, usually made to balance the distribution of background variables between treatment and control groups. Compromises can induce dependence between treatment assignments and some pre-program variables. If these pre-program variables affect outcomes, then treatment assignments may be correlated with outcomes through a common dependence on pre-program variables. Spurious correlation between treatment assignments and outcomes through pre-program variables (instead of through a true treatment effect) generates biased inferences when not controlled for.
It is useful to model a compromised randomization as a stepwise procedure in which researchers perform an initial, uncompromised randomization and then perform a series of actions that determine the compromises. If compromises occur as a result of some known rule, and do not depend on unobserved variables, then the distribution of treatment assignments that arises from the compromised randomization can be reproduced, and valid statistical inference can be achieved using this distribution.2 However, a full description of compromises is seldom available in social experiments. As a result, there is often uncertainty regarding the true distribution of treatment assignments.3 If the compromises do not depend on unobserved variables, it is possible to surmount the uncertainty about the distribution of treatment assignments by using an exchangeability property discussed in Subsection 3.3. However, when a randomization protocol combines both uncertainty and a dependence on unobserved variables, novel statistical approaches are necessary. Our method addresses this last case.
We apply our analysis to the Perry Preschool Program, a preschool intervention targeted toward disadvantaged young African Americans in Ypsilanti, Michigan, in the early 1960s. The Perry Program is the flagship for advocating the benefits of early childhood intervention. Although extensive, the literature on the Perry Program typically fails to account for its randomization compromises. An exception is Heckman, Moon, Pinto, Savelyev, and Yavitz (2008), which corrects for the compromises of the Perry randomization by exploiting a property of conditional exchangeability that remains valid under compromises. As a consequence of this property, they assume independence of outcomes and treatment assignments conditional on pre-program variables (a matching assumption). However, Heckman, Moon, Pinto, Savelyev, and Yavitz (2008) do not account for the compromises based on unobserved characteristics of the participants of the program. In the present paper, we present a method for generating conservative test statistics for the null hypothesis of no treatment effect that are consistent with compromises based on unobserved characteristics of Perry participants. To this end, we correct for the uncertain nature of the assignment distribution in the spirit of the literature on partial identification (Manski, 2003). Unlike this literature, we are not interested in estimating or bounding treatment effects. Instead, we are interested in developing conservative but data-consistent tests of the hypothesis of no treatment effect in the case of partial information about the randomization procedure as implemented.
The paper proceeds as follows. Section 2 describes the experimental setting of the Perry Program and its randomization protocol. Section 3 discusses sources of compromises and presents our model framework. Section 4 presents a formal description of our testing procedure. Section 5 presents our empirical analysis. Section 6 concludes.
1 See Heckman and Vytlacil (2007b) for one discussion of what randomized trials identify.
2 The required conditions depend on the exact assignment rule. If the reassignment rule is a deterministic function of observed variables, and support conditions for matching are satisfied, a simple matching procedure produces valid inference. For more general reassignment rules, identification of the assignment rule is required. See Heckman and Vytlacil (2007a) for one discussion of identification of models when there is selection into treatment on the basis of unobserved variables.
3 This problem is not unique to the program analyzed in this paper. For example, in the Abecedarian program, randomization was also compromised. One hundred twenty-two children, born to 120 economically disadvantaged families, underwent preliminary processing for the Abecedarian study. After target children were born, qualifying families were matched on High-Risk Index, sex of the child, maternal IQ, and number of siblings and then assigned to receive high quality child care (treatment) or no treatment (control) on the basis of a table of random numbers (Breitmayer and Ramey, 1986). After this phase, some families refused participation (7 families from the treatment group and one from the control group); two children from the control group were reassigned to the day-care condition at the request of local authorities and were dropped as subjects (Campbell and Ramey, 1994). In the SIME-DIME study, the randomization protocol was never clearly described (see Kurz and Spiegelman, 1972).
2 The Perry Experiment
2.1 The Perry Program
The High/Scope Perry Preschool Program was an early childhood education program conducted at the Perry Elementary School in Ypsilanti, Michigan, during the early 1960s. Treatment began at age three and lasted two years; it consisted of a 2.5-hour preschool program on weekdays during the school year, supplemented by weekly home visits by teachers.
The study admitted a total of five cohorts during the period 1962–1965; two were admitted in the first year and one in each subsequent year. The first cohort is an exception, as treated children received only one year of treatment, beginning at age four. High/Scope’s innovative curriculum, developed over the course of the Perry Program, was based on the Piagetian principle of active learning, guiding students through key learning experiences with open-ended questions (Schweinhart et al. 1993, pp. 34–36; Weikart et al. 1978, pp. 5–6, 21–23). Web Appendix A provides further information on the program activities.
Sample of Participants The study is made up of 123 children from 104 families. Participants were randomized into treatment (58 participants) and control (65 participants) groups. The distribution of siblings among families consists of 82 singletons, 17 pairs, 1 triple, and 1 quadruple. No family has more than one child per entry cohort.
Family Background Perry participants had an average of four older siblings; the average ma-
ternal age was 29 years at the time of enrollment. About half of the children were living with
both biological parents. Mothers had completed 9.4 years of schooling on average, but none of the
parents had completed more than 12 years of education.
Study Follow-Up Follow-up interviews were conducted yearly from ages 3 to 15 and then at approximately ages 19, 27, and 40. Perry’s extensive database includes two thousand questions about parental care, family socio-economic background, family structure, marital status, health, cognitive and noncognitive tests, criminal behavior, drug use, employment, income, consumption of durable goods, welfare status, and life-course expectations. Appendix D presents tables that describe the control and treatment differences for a selection of Perry outcomes. For further discussion of the Perry experiment see Heckman, Moon, Pinto, Savelyev, and Yavitz (2008).
2.2 Experimental Design
This subsection describes the Perry randomization protocol in detail. The description is useful for understanding the specifics of the compromises and for identifying where information about some steps of the protocol is missing. The next section defines the statistical problems that follow from the Perry randomization and proposes a Perry-tailored inference procedure that complies with the implemented randomization protocol.
Eligibility Criteria Children were drawn from the population surrounding the Perry Elementary
School. Subjects were located through a survey of families associated with that school, as well as
through neighborhood group referrals and door-to-door canvassing.
Disadvantaged children were identified via Stanford-Binet IQ score and socio-economic status
(SES) index cutoffs. The SES is a weighted linear combination of three components: paternal skill
level, parental educational attainment, and number of rooms per person in the family home.4 Sub-
jects with SES above a certain level (initially fixed at 11) were excluded. Children with biological
mental defects and those with IQ scores outside the range 70–85 were excluded. The IQ and SES
criteria were not always adhered to (see Figures 3–4).
Randomization Protocol It is essential to understand the randomization protocol in order to perform an appropriate inference procedure. Details of the implemented randomization provide the theoretical basis for establishing an exchangeability property of treatment assignments. Exchangeability is used to design an inference method that complies with the specifics of the randomization. Following Weikart et al. (1978, p. 16), there are 123 participants; the 51 females (25 treated and 26 control) and 72 males (33 treated and 39 control) are distributed among five cohorts. For each entry cohort, children were assigned to treatment and control groups under the following randomization protocol:
The Randomization Protocol:
Step 1: Younger siblings of previous Perry participants were set aside and assigned to the same treatment group as their elder siblings.
Step 2: Those remaining were ranked by IQ score measured at study entry, with ties broken randomly. Two groups were formed, with odd ranks in one group and even ranks in the other.
Step 3: Some individuals were swapped between the groups to “balance” gender and mean SES score (while keeping IQ roughly constant).
Step 4: The groups were assigned to treatment and control with equal probability.
Step 5: Some treated individuals whose mothers were working (and unable to make it to follow-up meetings) were reassigned to the control group.
4 This index is discussed in detail in the legend to Figure 4 below.
Note that Step 5 depends on unobserved variables of a subset of participants. The rationale for excluding younger siblings from the randomization process was to avoid “spillovers” within a family, which could potentially weaken the observed treatment effect.5 The steps of the Randomization Protocol can be used to characterize the exogenous random variables and reassignment functions described in Section 3. The randomization procedure is illustrated graphically in Figure 1, and Appendix D gives a detailed description of all available information on the randomization protocol.
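The five steps of the Randomization Protocol can be sketched in code for a single entry cohort. This is only an illustration under stated assumptions: the actual swap rule of Step 3 and the reassignment rule of Step 5 are not fully documented, so the rules used below (a greedy SES-balancing swap of rank-adjacent children, and a random choice of up to two treated children with working mothers) are hypothetical stand-ins. The class `Child`, its fields, and the function `randomize_cohort` are likewise names invented for this sketch, and gender balancing is omitted for brevity.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Child:
    cid: int
    iq: int
    ses: float
    mother_works: bool
    elder_status: Optional[str] = None  # "T" or "C" if an elder sibling enrolled earlier


def _ses_gap(g1, g2):
    """Absolute difference in mean SES between two non-empty groups."""
    return abs(sum(c.ses for c in g1) / len(g1) - sum(c.ses for c in g2) / len(g2))


def randomize_cohort(children, rng, max_reassign=2):
    """Return {cid: "T" or "C"} for one entry cohort (hypothetical sketch)."""
    status = {}
    # Step 1: younger siblings inherit the elder sibling's status.
    pool = []
    for c in children:
        if c.elder_status is not None:
            status[c.cid] = c.elder_status
        else:
            pool.append(c)
    # Step 2: rank by IQ with random tie-breaking; alternate ranks form two groups.
    ranked = sorted(pool, key=lambda c: (c.iq, rng.random()))
    g1, g2 = ranked[0::2], ranked[1::2]
    # Step 3 (ASSUMED rule): swap rank-adjacent pairs when that narrows the SES gap.
    for k in range(len(g2)):
        before = _ses_gap(g1, g2)
        g1[k], g2[k] = g2[k], g1[k]
        if _ses_gap(g1, g2) >= before:
            g1[k], g2[k] = g2[k], g1[k]  # undo: the swap did not help
    # Step 4: a fair coin decides which group is treated.
    treated, control = (g1, g2) if rng.random() < 0.5 else (g2, g1)
    # Step 5 (ASSUMED rule): move up to max_reassign treated children with
    # working mothers to the control group, chosen at random.
    eligible = [c for c in treated if c.mother_works]
    moved = rng.sample(eligible, rng.randint(0, min(max_reassign, len(eligible))))
    for c in treated:
        status[c.cid] = "C" if c in moved else "T"
    for c in control:
        status[c.cid] = "C"
    return status


# Made-up cohort: four children enter the randomization, one is a younger sibling.
cohort = [
    Child(1, 79, 8.5, False),
    Child(2, 79, 9.0, True),
    Child(3, 82, 7.5, True),
    Child(4, 84, 10.0, False),
    Child(5, 75, 8.0, False, elder_status="T"),
]
status = randomize_cohort(cohort, random.Random(0))
```

Because Steps 3 and 5 are placeholders, the sketch is useful mainly for seeing which steps consume exogenous randomness (Steps 2 and 4) and which depend on participant characteristics (Steps 1, 3 and 5).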
3 Model Framework
The fundamental evaluation model describes an observed outcome $k$ for a person $i$, denoted $Y_{i,k}$. Let $Y_{i,k,1}$ and $Y_{i,k,0}$ be the potential treated and control outcomes for person $i$. Let the random variable $D_i$ be the treatment assignment for person $i$, which takes the value 1 if treatment occurs and 0 otherwise. The observed outcome $Y_{i,k}$ is then defined by the following equation:
$$Y_{i,k} = D_i Y_{i,k,1} + (1 - D_i) Y_{i,k,0}.$$
The evaluation problem arises because either $Y_{i,k,1}$ or $Y_{i,k,0}$ is observed, but not both. Non-experimental analyses often encounter the problem of selection bias, which arises when person $i$ self-selects into treatment. Under self-selection, the observed difference between treated and control outcomes might not be a causal consequence of treatment itself, but rather the result of a sorting effect. Ideally, RCT experiments are designed to solve the self-selection problem by imposing independence between $(Y_{i,k,1}, Y_{i,k,0})$ and $D_i$. In standard notation, these experiments guarantee that $(Y_{i,k,1}, Y_{i,k,0}) \perp\!\!\!\perp D_i$, where “$\perp\!\!\!\perp$” denotes independence.
5 For example, through home visits or emulation of one child by another.
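The evaluation problem described above can be made concrete with a toy example. The potential outcomes below are made-up numbers, and the function names are illustrative only. The sketch applies the switching equation, then compares the naive treated-minus-control contrast under self-selection with its average over all balanced random assignments.

```python
from itertools import combinations


def observed_outcomes(d, y1, y0):
    """Switching equation: Y_i = D_i * Y_i1 + (1 - D_i) * Y_i0."""
    return [di * a + (1 - di) * b for di, a, b in zip(d, y1, y0)]


def difference_in_means(y, d):
    """Mean observed outcome of the treated minus that of the controls."""
    treated = [yi for yi, di in zip(y, d) if di == 1]
    control = [yi for yi, di in zip(y, d) if di == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


# Hypothetical potential outcomes for four people (never jointly observed in practice).
y1 = [3, 4, 5, 11]
y0 = [2, 4, 6, 8]
true_mean_effect = sum(a - b for a, b in zip(y1, y0)) / len(y1)

# Self-selection: only people who gain from treatment choose it.
d_selected = [1 if a > b else 0 for a, b in zip(y1, y0)]
naive_selected = difference_in_means(observed_outcomes(d_selected, y1, y0), d_selected)

# Randomization: average the same contrast over every assignment with two treated.
estimates = []
for treated_idx in combinations(range(len(y1)), 2):
    d = [1 if i in treated_idx else 0 for i in range(len(y1))]
    estimates.append(difference_in_means(observed_outcomes(d, y1, y0), d))
randomized_average = sum(estimates) / len(estimates)

print(true_mean_effect, naive_selected, randomized_average)
```

In this example the self-selected contrast differs from the true mean effect because of sorting, while the contrast averaged over the randomization distribution recovers it.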
Figure 1: Perry Randomization Protocol. [Diagram of an entry cohort passing through five panels, with final groups labeled C (control) and T (treatment). Step 0, Set Aside Younger Siblings: subjects with elder siblings in previous waves are assigned the same treatment status as those elder siblings. Step 1, Form Unlabeled Sets: children are ranked by IQ score, with ties broken randomly; even- and odd-ranked children form two sets, G₁ and G₂. Step 2, Balance Unlabeled Sets: some swaps between unlabeled sets to balance means (e.g. gender, SES). Step 3, Assign Treatment: randomly assign treatment status to the unlabeled sets (with equal probability). Step 4, Post-Assignment Swaps: some post-randomization swaps based on maternal employment. The figure numbers these steps 0 to 4; they correspond to Steps 1 to 5 of the Randomization Protocol in the text.]
The null hypothesis of no treatment effect for outcome $k$ is expressed as equality in distribution of the counterfactuals $Y_{i,k,1}$ and $Y_{i,k,0}$ for all participants $i$. Notationally, we write $H_k : Y_{k,1} \overset{d}{=} Y_{k,0}$, where $Y_{k,1}$, $Y_{k,0}$, and $Y_k$ are the vectors of pooled outcomes across agents $i$. It is well known that if the null hypothesis is true and $(Y_{i,k,1}, Y_{i,k,0}) \perp\!\!\!\perp D_i$ holds for all $i$, then the outcome $Y_k$ is independent of the random vector of treatment status $D$. In other words, hypothesis $H_k$ is equivalent to $H_k : Y_k \perp\!\!\!\perp D$.
Compromises of the randomization protocol may preclude the use of $H_k : Y_k \perp\!\!\!\perp D$ as a test for equality in distribution between $Y_{k,1}$ and $Y_{k,0}$. Indeed, if compromises create a dependence between some background variables $Z$ and treatment status $D$, then the randomization might also induce dependence between outcomes $Y_k$ and treatment status $D$ through their common dependence on $Z$. This induced dependence might invalidate the hypothesis $H_k : Y_k \perp\!\!\!\perp D$ even under the assumption of no treatment effects. Inference techniques can correct for this common dependence by accounting for the implemented randomization. Specifically, valid inference is obtained by using a distribution of treatment assignments $D$ that reproduces the relationship between $D$ and background variables $Z$ according to the implemented randomization. In order to generate valid inference, it is necessary to uncover the details of the compromised randomization. The next subsection discusses the sources of compromises and maps the case of Perry into a general framework.
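The induced dependence described above can be seen in a small numerical sketch. All numbers are made up: the outcome depends only on a binary background variable `z`, there is no treatment effect by construction, and treatment is more common when `z = 1`. The naive contrast is nonzero, while the contrast within each value of `z` vanishes.

```python
def difference_in_means(y, d):
    """Mean outcome of the treated minus mean outcome of the controls."""
    treated = [yi for yi, di in zip(y, d) if di == 1]
    control = [yi for yi, di in zip(y, d) if di == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


z = [0, 0, 0, 0, 1, 1, 1, 1]        # background variable
y = [10 + 5 * zi for zi in z]       # no treatment effect: Y1 == Y0 == Y
d = [1, 0, 0, 0, 1, 1, 1, 0]        # assignment depends on z (a "compromise")

# Ignoring z: a spurious difference appears even though the null is true.
naive = difference_in_means(y, d)

# Conditioning on z: the difference disappears in every stratum.
within = {}
for s in sorted(set(z)):
    ys = [yi for yi, zi in zip(y, z) if zi == s]
    ds = [di for di, zi in zip(d, z) if zi == s]
    within[s] = difference_in_means(ys, ds)

print(naive, within)
```

The example is deliberately extreme (the outcome is a deterministic function of `z`), but it shows why an inference procedure must reproduce the relationship between $D$ and $Z$ rather than treat all assignments as equally likely.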
3.1 Sources of Compromises
As mentioned in the introduction, we decompose the compromising features of a randomization into two aspects: (a) the uncertainty about the randomization protocol generated by the compromises and (b) the role played by unobserved variables of participants in the implemented randomization. We discuss each of these aspects in turn. We build our analysis on a general model of randomization that provides a theoretical basis for examining the Perry compromises.
We describe the randomization protocol as a stepwise procedure that generates the distribution of treatment status $D$. Let $\varepsilon$ be a vector of exogenous random variables, pooled across agents $i$, that is used at some step of the randomization protocol. For example, $\varepsilon$ could be a collection of i.i.d. Bernoulli random variables that assign the treatment status for each participant. Typically, researchers also want to balance some background variables $Z$ across treatment and control groups through the randomization protocol. We therefore define the initial random variable of treatment assignments generated by an uncompromised randomization as $D \sim \delta([Z, \varepsilon])$, where $Z$ denotes the background variables used in the randomization protocol. The function $\delta$ encompasses all steps of the randomization protocol and generates the random variable $D$ based on the statistical properties of the exogenous variables $\varepsilon$.
Reassignments: According to our framework, reassignments of participants after an initial treatment assignment constitute a compromise. We represent general types of reassignments by a deterministic function that changes the treatment status of participants. Notationally, we still use $\delta$ to denote these reassignment compromises. That is, $D \sim \delta([Z, \varepsilon])$, where the function $\delta$ accounts for both the initial randomization and the follow-up reassignment compromises. If the function $\delta$ is known, there is no uncertainty regarding the compromise, so the distribution of $D$ is fully characterized and can be reproduced. In this case, valid inference can be performed using the known distribution of $D$.
Uncertainty: Full descriptions of compromises are seldom available in social experiments. Lack of information on the compromises leads to uncertainty regarding the true distribution of treatment assignments $D$. In other words, if $\delta$ is unknown, the distribution of $D$ cannot be fully characterized. Nevertheless, valid inference can be performed by exploiting an exchangeability property that remains valid under uncertainty about $\delta$. Specifically, we prove (in Subsection 3.3) that, under general assumptions, a treatment assignment $D$ is exchangeable among participants who share identical values of the background variables $Z$. This exchangeability property implies that counterfactual outcomes $Y_{k,1}, Y_{k,0}$ are independent of treatment assignments $D$ conditional on $Z$, that is, $(Y_{k,1}, Y_{k,0}) \perp\!\!\!\perp D \mid Z$. In non-experimental data, this conditional independence is called the Matching Assumption (see Heckman (2006)). Researchers usually invoke the Matching Assumption to avoid the statistical modeling of selection bias in non-experimental data. In experimental data, if the Matching Assumption holds, it arises as a statistical property of the implemented RCT.
Unobserved Variables: The compromises that follow the initial randomization can also depend on unobserved variables of participants, denoted by $U$. In this case, we represent the vector of treatment status by $D \sim \delta([Z, \varepsilon, U])$. In general, the exchangeability property mentioned above does not hold when the randomization depends on unobserved variables of participants. Compromises through unobserved variables generate a problem that is statistically identical to selection bias in non-experimental data. Participants’ compliance is a typical case of compromise through unobserved variables. In the case of Perry, compromises through unobserved variables stem from reassigning participants whose mothers could not comply with the scheduled interviews.
3.2 The Case of Perry
Recall that the Perry protocol was designed to initially rank the IQs of the eldest siblings in each
family (Steps 1 and 2 of the Randomization Protocol). Swaps of individuals took place to “balance”
gender and mean SES score (Step 3); then treatment and control status were randomly assigned to
odd and even numbered families within each wave (Step 4). Step 5 reassigns the treatment status
of some participants whose mothers were employed at the onset of the survey.
Let the families of participants be indexed by $\mathcal{J} = \{1, \ldots, J\}$ and let $I_j$ be the index set of the siblings of family $j \in \mathcal{J}$. In most cases, $|I_j| = 1$, but, as described in Section 2, there are families with more than one sibling in the program. We denote by $Z_{i,j}$ the values of observed characteristics used in the randomization protocol for the $i$th child in the $j$th family:
$$Z_{i,j} = (W_{i,j}, \mathrm{IQ}_{i,j}, \mathrm{SES}_{i,j}, G_{i,j}, M_{i,j}),$$
where $W_{i,j}$ is the wave of the program in which the individual participated, $\mathrm{IQ}_{i,j}$ is the Stanford-Binet IQ score of the individual, $\mathrm{SES}_{i,j}$ is the measure of socio-economic status for the individual described in the preceding section, $G_{i,j}$ is the gender of the individual, and $M_{i,j}$ is an indicator variable for whether the individual’s mother was working or not. Also define $Z = (Z_{i,j} : i \in I_j, j \in \mathcal{J})$. Treatment status was assigned to families, not individuals. Thus, let $D_j$ be an indicator variable for whether the $j$th family was selected into treatment, and define $D = (D_j : j \in \mathcal{J})$.
A diagram of Perry’s five-step randomization protocol is presented in Figure 2. Steps 1, 2, 3 and 5 use background variables $Z$. Steps 2 and 4 use exogenous random variables $\varepsilon$. Exogenous random variables in Step 2 are associated with the breaking of IQ ties during the IQ ranking within each wave. Exogenous random variables in Step 4 are associated with treatment assignments for the groups that arise in Step 3. The last step of the randomization protocol states that a subset of treated individuals were reassigned to the control group. The reassignments (Step 5) were based on unobserved characteristics of participants. Notationally, the reassignments depend on participants’ unobserved variables $U = (U_j : j \in \mathcal{J})$. The sources of compromise are uncertainty about the Swaps (Step 3) and the Reassignments (Step 5).
The final distribution of treatment assignments can be described by $D \sim \delta([Z, \varepsilon, U])$. However, it is useful to define an intermediate vector of treatment assignments $\tilde{D}$, associated with Steps 1 through 4 (Appendix B provides a formal description of $\tilde{D}$).
Remark 3.1. Characteristics of $\tilde{D}$ are: (a) the distribution of $\tilde{D}$ is partially unknown; (b) uncertainty concerning $\tilde{D}$ comes from lack of information on the actual swaps that took place in Step 3 of the Randomization Protocol; (c) the distribution of $\tilde{D}$ does not depend on unobserved variables $U$, which enter only in the last step; (d) we can write this vector of treatment assignments as $\tilde{D} \sim \varphi([Z, \varepsilon])$.
The final distribution of treatment assignments $D$ can be decomposed in terms of the intermediate vector of treatment assignments $\tilde{D}$ (which encompasses Steps 1 through 4) and the subsequent actions of Step 5 (which are based on unobserved variables $U$ and background variables $Z$). We represent the final distribution of assignments $D$ by:
$$D \sim \delta([\tilde{D}, Z, U]), \tag{1}$$
$$\delta : \{0, 1\}^{|\mathcal{J}|} \times \mathrm{supp}(Z, U) \to \{0, 1\}^{|\mathcal{J}|}. \tag{2}$$
There are two advantages of this decomposition. First, an exchangeability property holds for $\tilde{D}$, but not for $D$ (see Subsection 3.3). Second, the last step focuses on a subset of participants (families whose mothers were working at the onset of the survey), which decreases the complexity of the function $\delta$ considerably.
Remark 3.2. Traditional resampling procedures that assume equal probability for each treatment
assignment regardless of the values of conditioning variables Z cannot generate the actual distri-
bution of treatment assignments given the rules followed in the Perry randomization.
Remark 3.3. As a consequence of Remark 3.2, a simple permutation of treatment labels regardless of background variables Z (as in Lehmann and Romano (2005)) is invalid to test the null hypothesis of no treatment effects.

Figure 2: Framework for Compromises in Perry Randomization Protocol
Step 1. Action: younger siblings set aside. Associated random variable: none. Associated background variables: Family ID.
Step 2. Action: IQ ranking. Associated random variable: $\varepsilon$ (exogenous random variable). Associated background variables: IQ, Wave.
Step 3. Action: swaps of participants. Associated random variable: none. Associated background variables: SES, Gender.
Step 4. Action: group treatment assignment. Associated random variable: $\varepsilon$ (exogenous random variable). Associated background variables: none.
Step 5. Action: reassignment upon unobserved variables. Associated random variable: $U$ (unobserved random variable). Associated background variables: Maternal Employment Status.
Arrows: Steps 1 through 4 yield $\tilde{D} \sim \varphi([Z, \varepsilon])$ (top left); Step 5 yields $D \sim \delta([\tilde{D}, Z, U])$ (top right); all five steps together yield $D \sim \delta([Z, \varepsilon, U])$ (bottom).
Notes: This diagram represents the steps of the Perry Randomization Protocol and its compromises. The five steps of the protocol are represented by the columns of the diagram. The first line lists the five steps. The second line lists the main action performed in each step. The third line presents the random variable used in each step. The fourth line lists the background variables used to perform the action associated with each step. Steps 1, 2, 3 and 5 use background variables $Z$. Steps 2 and 4 rely on exogenous random variables $\varepsilon$. The exogenous random variables of Step 2 are associated with the breaking of IQ ties during the IQ ranking within each wave. The exogenous random variables of Step 4 are associated with the treatment assignment for the groups that arise from Step 3. Only Step 5 is based on participants’ unobserved variables $U$. The bottom arrow recalls that the final distribution of treatment assignments $D$ can be described by $D \sim \delta([Z, \varepsilon, U])$, where $Z$ accounts for background variables, $\varepsilon$ accounts for exogenous randomization variables and $U$ accounts for participants’ unobserved variables. We decompose the distribution of $D$ into two parts. Let $\tilde{D}$ be the treatment assignment distribution associated with Steps 1 through 4 (top left arrow), denoted by $\tilde{D} \sim \varphi([Z, \varepsilon])$. Observe that $\tilde{D}$ does not depend on the unobserved variable $U$ because unobserved variables enter only in the last step. The final distribution of treatment assignments $D$ can be written in terms of the random variable $\tilde{D}$ and Step 5 (top right arrow), notationally, $D \sim \delta([\tilde{D}, Z, U])$.
Remark 3.4. Any permutation used for inference must be based upon an exchangeability property
that results from the true randomization protocol.
Conditional on the unobserved variable $U$, the available information on the reassignments allows us to map $\delta$ into some reassignment function $\sigma$ of observable variables $Z$, such that $\sigma \in \Sigma$, where $\Sigma$ is known. The set $\Sigma$ is restricted by the fact that at most two reassignments occurred per wave. Notationally, we define $\Sigma$ as the class of reassignment functions of the type
$$\sigma : \{0, 1\}^{|\mathcal{J}|} \times \mathrm{supp}(Z) \to \{0, 1\}^{|\mathcal{J}|}$$
that adhere to the available information on reassignments (e.g., all reassignment functions that reassign at most two treated participants per wave among those whose mothers were working).
Remark 3.5. The number of possible mappings from one vector of treatment assignments to another is finite for a fixed $Z$; thus the set $\Sigma$ is finite for a fixed $Z$.
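The finiteness of $\Sigma$ can be illustrated by enumeration. The sketch below assumes the example restriction mentioned above (at most two treated families with working mothers are moved to control in each wave) and lists every final assignment vector reachable from a given intermediate assignment. The function name, argument names and toy data are hypothetical, not taken from the paper.

```python
from itertools import combinations, product


def candidate_final_assignments(d_tilde, wave, mother_works, max_per_wave=2):
    """Yield each final assignment reachable from d_tilde by moving at most
    max_per_wave treated, working-mother families per wave to control."""
    per_wave_choices = []
    for w in sorted(set(wave)):
        eligible = [
            j for j in range(len(d_tilde))
            if wave[j] == w and d_tilde[j] == 1 and mother_works[j]
        ]
        # Every subset of eligible families of size 0, 1, ..., max_per_wave.
        choices = [
            subset
            for r in range(max_per_wave + 1)
            for subset in combinations(eligible, r)
        ]
        per_wave_choices.append(choices)
    # One reassignment pattern = one subset chosen independently in each wave.
    for combo in product(*per_wave_choices):
        d = list(d_tilde)
        for subset in combo:
            for j in subset:
                d[j] = 0  # reassigned from treatment to control
        yield tuple(d)


# Toy data: six families in two waves.
d_tilde = (1, 1, 0, 1, 1, 0)
wave = (0, 0, 0, 1, 1, 1)
mother_works = (True, True, False, True, False, True)
candidates = list(candidate_final_assignments(d_tilde, wave, mother_works))
print(len(candidates))
```

With $m$ eligible families in a wave, that wave contributes $1 + m + \binom{m}{2}$ patterns, so the number of candidates grows as a product over waves but always remains finite.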
Assumption 3.1. There is a finite partition of $\mathrm{supp}(U)$, that is, $\bigcup_{u \in I_U} A_u = \mathrm{supp}(U)$, where $I_U$ is the indexing set of the partition, such that if $U \in A_u$ then the function $\delta$ can be mapped into $\sigma_u \in \Sigma$.
Assumption 3.1 does not restrict our analysis; it is a consequence of the finiteness of $\Sigma$ for a fixed $Z$. Under Assumption 3.1, we can write $\delta$ as a linear combination of elements of $\Sigma$, that is:
$$\delta([\tilde{D}, Z, U]) = \sum_{u \in I_U} \mathbf{1}[U \in A_u] \cdot \sigma_u(\tilde{D}, Z), \qquad \sigma_u \in \Sigma. \tag{3}$$
The incompleteness of our model stems from three sources: (a) the distribution of $\tilde{D}$ is partially unknown, (b) there is not enough information to fully define the function $\delta$, and (c) the variable $U$ is unobserved. Uncertainty regarding the distribution of $\tilde{D}$ is addressed by using an exchangeability property that holds under compromises (described in Subsection 3.3). We account for the uncertainty about $\delta$ and $U$ by computing critical values that produce valid inferences under uncertainty. The degree of uncertainty in our model is related to the size of the set $\Sigma$. Our empirical analysis explores several different specifications of $\Sigma$, and the strength of our results depends on our assumptions about $\Sigma$. We show that uncertainty decreases as $\Sigma$ is narrowed and that even modest assumptions about $\Sigma$ lead to interesting inferences.
3.3 Exchangeability Property of Intermediate Treatment Assignments D
This section proves a useful exchangeability property of the random variable D: elements of D ∼
ϕ([Z, ε]) are exchangeable among participants that have the same values for background variables
Z. The property holds even though the function ϕ is partially unknown due to compromises. The
next section explores this property for inference.
Without loss of generality, we represent the exogenous variables ε as a vector of identical
variables such that each element of the vector is associated with a Perry participant. In other words,
ε not only has the same distribution across participants, but also takes the same value for each
participant.
Let gZ be a bijection of J onto itself (a permutation) that exchanges indexes j and j′ only if
Zj = Zj′ , j, j′ ∈ J . Let GZ be the collection of all such permutations gZ ; in other words, elements of GZ
permute participants that share the same information in Z. The set GZ is a subgroup of the symmetric
group on J ; it is finite and closed under compositions and inverses.
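The construction of a permutation in GZ can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names, the representation of Z as hashable tuples, and the toy data are all assumptions made for the example.

```python
import random
from collections import defaultdict


def draw_group_permutation(z_values, rng):
    """Draw a permutation that moves indexes only within groups of equal Z.

    Returns a list `perm` in which position j receives the entry previously
    held at position perm[j]; perm[j] always has the same Z value as j.
    """
    groups = defaultdict(list)
    for j, z in enumerate(z_values):
        groups[z].append(j)

    perm = list(range(len(z_values)))
    for members in groups.values():
        shuffled = members[:]
        rng.shuffle(shuffled)  # uniform over orderings of this group
        for target, source in zip(members, shuffled):
            perm[target] = source
    return perm


def apply_permutation(perm, vector):
    """Return the permuted vector: entry j is taken from position perm[j]."""
    return [vector[perm[j]] for j in range(len(vector))]


# Hypothetical data: five participants, Z given as (wave, gender) tuples.
z = [(1, 0), (1, 1), (1, 0), (2, 0), (1, 1)]
d = [1, 0, 0, 1, 1]
perm = draw_group_permutation(z, random.Random(0))
permuted_d = apply_permutation(perm, d)
```

Because each group is shuffled independently, the draw is uniform over the product of the within-group symmetric groups, and a participant with a unique value of Z (index 3 above) is never moved.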
Assumption 3.2. Assume $g_Z[Z, \varepsilon] \overset{d}{=} [Z, \varepsilon]$ for all $g_Z \in G_Z$.
Assumption 3.3. For any permutation g of indexes 1, . . . , |J | assume gϕ([Z, ε])|ε = ϕ(g[Z, ε])|ε.
The first assumption is innocuous given that $g_Z Z = Z$ and $g_Z \varepsilon \overset{d}{=} \varepsilon$ by construction. The second
assumption states that the mechanism ϕ is equivariant under permutations: conditional on a
draw of ε, any swap of participants is associated with the symmetric swap of the final output of ϕ.
In other words, Assumption 3.3 only states that if the arguments of the function ϕ are swapped, its
output is swapped symmetrically.
Remark 3.6. Observe that ϕ does not rely on the unobservable variables U , so agents with the same Z are
indistinguishable in terms of the swapping mechanism.
Theorem 3.1. The Exchangeability Property: under Assumptions 3.2 and 3.3,
$$g_Z D \overset{d}{=} D \quad \forall\, g_Z \in G_Z.$$
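Proof.
$$\begin{aligned}
g_Z D \mid \varepsilon &= g_Z \phi([Z, \varepsilon]) \mid \varepsilon \\
&= \phi(g_Z[Z, \varepsilon]) \mid \varepsilon && \text{by Assumption 3.3} \\
\therefore\ g_Z D &\overset{d}{=} \phi(g_Z[Z, \varepsilon]) \\
&\overset{d}{=} \phi([Z, \varepsilon]) && \text{by Assumption 3.2} \\
&\overset{d}{=} D
\end{aligned}$$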
Appendix B.1 provides a simple example that illustrates the intuition of the exchangeability
property described in this subsection.
Remark 3.7. As a consequence of Theorem 3.1, the distribution of treatment assignments D is
invariant to permutations gZ of the arguments of the function δ, that is,
$$D = \delta([D, Z, U]) \overset{d}{=} \delta(g_Z[D, Z, U]); \quad g_Z \in G_Z.$$
Remark 3.7 has a direct application to testing procedures. If the function δ were known, we
could obtain D by reversing the reassignments of the actual treatment status D. The conditional
distribution of D is uniform across elements of GZ . Thus, we could generate variability in D by applying
permutations gZD. The conditional distribution of D could then be obtained by applying the assignment
rule δ to the treatment vectors generated by gZD, and this distribution could be used in
hypothesis testing. Unfortunately, the reassignment rule is only partially known, and we account for
the uncertainty about δ by using a conservative approach. Section 4 provides a formal description of
our testing procedure, which is based on the exchangeability property of Theorem 3.1.
4 Setup and Notation
Let the vector of outcome k be denoted by Yk = (Yi,j,k : i ∈ Ij , j ∈ J), where outcomes are indexed
by K = {1, . . . ,K}. As mentioned, families are indexed by J and for each j ∈ J, Ij denotes the set
of siblings in the jth family. Denote the observed data by
XK = (Yi,j,k, Zi,j , Dj : i ∈ Ij , j ∈ J, k ∈ K)
and let P be the distribution of the observed and unobserved data, (X,U) ∼ P . Let Pk be defined
by the rule:
P ∈ Pk ⇐⇒ Yk ⊥⊥ D|Z,U
The null hypotheses of interest are the |K| null hypotheses
Hk : P ∈ Pk for each k ∈ K (4)
indexed by k ∈ K. The alternative hypothesis is the unrestricted version of hypothesis (4). Let
K0(P ) denote the set of true null hypotheses, i.e.,
k ∈ K0(P )⇐⇒ P ∈ Pk,
that is, k ∈ K0(P ) if and only if Hk is true. Our goal is to test the family of null hypotheses (4) in
a way that controls the Familywise Error Rate (FWER) at level α, that is,
FWERP = P{reject any Hk with k ∈ K0(P )} ≤ α . (5)
Remark 4.1. Since randomization is imperfect in our setting, testing even a single null hypothesis
in a way that controls the probability of a Type-I error at level α will be nontrivial.
Remark 4.2. Controlling the familywise error rate avoids making “too many” false rejections.
Another option for controlling Type-I errors in a multiple-testing setup is to consider error
rates that control the proportion of false rejections. See, e.g., Romano and Shaikh (2004).
4.1 Testing a Single Null Hypothesis
For L ⊆ K, consider the problem of testing
$$H_L : P \in \bigcap_{k \in L} P_k, \text{ that is, } H_k \text{ is true for all } k \in L \tag{6}$$
in a way that controls the probability of a false rejection at level α, that is,
P{reject HL} ≤ α whenever HL is true . (7)
Denote the data associated with hypothesis HL as XL and consider any test statistic TL = TL(XL)
for which large values of TL provide evidence against the null hypothesis. Note that we have
assumed that the test statistic depends only on XL rather than on XK , where, again, L ⊆ K (see footnote 6).
To test hypothesis (6) we use permutation methods tailored to the available information on
the implemented randomization. The distribution of treatment assignments D, described in Equation (1),
consists of a reassignment transformation of the random variable D. Moreover, D is
exchangeable according to Theorem 3.1. As a consequence of exchangeability, Remark 3.7 states
that $\delta([D, Z, U]) \overset{d}{=} \delta(g_Z[D, Z, U])$ for all $g_Z \in G_Z$. If the true reassignment rule δ were known, we could
obtain D by reversing the reassignments of the actual treatment status D. Let d be the actual
realization of D. The conditional distribution of D is uniform across elements of GZ . In particular,
random draws of gZ d follow the distribution of D conditional on D ∈ {gZ d : gZ ∈ GZ} (see
Theorem 15.2.2 in Lehmann and Romano (2005)). Consequently, we could obtain the conditional
distribution of D by applying the assignment rule δ. Unfortunately, the reassignment rule is only
partially known. We account for the uncertainty about δ by using a conservative approach: we do
inference using the σ ∈ Σ that generates the highest critical value. In this way, we bound the critical
value that would arise if δ were known and control the probability of a false rejection.
For each σ ∈ Σ, let σ−1(D,Z) be the set of possible vectors d of treatment assignments that
would generate D upon the reassignment function σ(d, Z).
Let the true d associated with δ and D be denoted by d∗, that is D = δ(d∗, Z, U).
Let $g^m_Z$, $m = 2, \ldots, M$, be an i.i.d. sequence of permutations in $G_Z$. Define the test statistic
$T^m_L(\sigma, d)$ by:
$$T^m_L(\sigma, d) = T_L(X^m_L(\sigma, d));$$
$$X^m_L(\sigma, d) = (Y_{i,j,k},\ \sigma(g^m_Z d, Z_{i,j}),\ Z_{i,j},\ D_j : i \in I_j,\ j \in J,\ k \in L)$$
for $\sigma \in \Sigma$, $d \in \sigma^{-1}(D, Z)$.
For notational purposes, let $T^1_L$ denote the test statistic based on the actual data $X_L$, that is, $T_L$.
6 This assumes no cross-variable restrictions between the outcomes Y in K \ L and the outcomes in L.
Theorem 4.1. The statistics TmL (δ, d∗), m = 1, . . . ,M are (conditionally) exchangeable.
Proof. We omit the proof, which is straightforward.
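Define $T^{(m)}_L(\sigma, d)$, $m = 1, \ldots, M$, as the ordered values of the test statistics $T^m_L(\sigma, d)$, thus:
$$T^{(1)}_L(\sigma, d) \le \cdots \le T^{(M)}_L(\sigma, d).$$
Let $c_L(\sigma, d, \alpha)$ be the $(1 - \alpha)$ quantile of $T^m_L(\sigma, d)$, $m = 1, \ldots, M$. Let $c_L(\Sigma, \alpha)$ be the
maximum value of $c_L(\sigma, d, \alpha)$ over $d \in \sigma^{-1}(D, Z)$ and $\sigma \in \Sigma$. Notationally,
$$c_L(\Sigma, \alpha) = \max\{c_L(\sigma, d, \alpha) : d \in \sigma^{-1}(D, Z),\ \sigma \in \Sigma\}, \tag{8}$$
where
$$c_L(\sigma, d, \alpha) = T^{(\lceil (1-\alpha) M \rceil)}_L(\sigma, d),$$
and $\lceil x \rceil$ denotes the smallest integer greater than or equal to $x$.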
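The definitions in (8) translate directly into code. The sketch below is illustrative only: it assumes the permutation statistics for every candidate pair (σ, d) have already been computed and stored in a dictionary, and the labels, numbers, and function names are hypothetical.

```python
import math


def critical_value(stats, alpha):
    """Return the ceil((1 - alpha) * M)-th smallest of the M statistics."""
    m = len(stats)
    # Round before taking the ceiling to guard against floating-point noise
    # in the product (1 - alpha) * M.
    k = math.ceil(round((1 - alpha) * m, 10))
    return sorted(stats)[k - 1]  # k is a 1-based rank


def conservative_critical_value(stats_by_candidate, alpha):
    """Maximize the critical value over all candidate (sigma, d) pairs."""
    return max(critical_value(s, alpha) for s in stats_by_candidate.values())


# Hypothetical statistics, M = 8 per candidate; the first entry of each list
# stands in for the statistic computed on the actual data.
stats_by_candidate = {
    ("sigma_a", "d_1"): [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0],
    ("sigma_a", "d_2"): [0.5, 0.2, 0.9, 0.1, 0.3, 0.4, 0.8, 0.7],
    ("sigma_b", "d_1"): [7.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
}
c_max = conservative_critical_value(stats_by_candidate, alpha=0.25)
observed = 5.5
reject = observed > c_max  # the test rejects only if T_L exceeds the maximum
```

With M = 8 and α = 0.25 the relevant rank is ⌈0.75 · 8⌉ = 6, so the three candidates give critical values 5.0, 0.7, and 0.0, and the conservative critical value is 5.0.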
Remark 4.3. Since Σ is finite for a fixed Z (Remark 3.5) and the sets σ−1(D,Z) are finite for all σ ∈ Σ,
the maximum in (8) is taken over a finite set, so cL(Σ, α) always exists.
Theorem 4.2. Under Assumption 3.1, cL(Σ, α) ≥ cL(δ, d∗, α).
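Proof. Under Assumption 3.1, Equation (3) states that for each value of U there exists a function σ ∈ Σ
such that δ = σ. Moreover, {d : d ∈ δ−1(Z,D,U)} ⊂ {d : d ∈ σ−1(Z,D), σ ∈ Σ}. Hence the maximum in (8) is taken over a set that contains the pair (δ, d∗), which yields the inequality.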
Remark 4.4. The critical value cL(Σ, α) relies only on observed variables and can therefore be computed.
In contrast, cL(δ, d∗, α) cannot be computed because neither δ nor d∗ is observed. Appendix C
describes the method for computing cL(Σ, α) in detail.
In this notation, we may now state the following result:
Theorem 4.3. Suppose data XK with distribution as described in Section 4 and σ satisfying σ ∈ Σ
are available. Let M ≥ 2 and 0 < α < 1 be given. Then, the test that rejects HL whenever
TL > cL(Σ, α) ,
satisfies (7).
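Proof. Suppose $H_L$ is true. Then, $T^1_L(\delta, d^*), \ldots, T^M_L(\delta, d^*)$ are (unconditionally) exchangeable.
Hence,
$$\begin{aligned}
P\{T_L > c_L(\delta, d^*, \alpha)\}
&= E\left[\mathbf{1}\{T_L > T^{(\lceil (1-\alpha)M \rceil)}_L(\delta, d^*)\}\right] \\
&= E\left[E\left[\frac{1}{M}\sum_{m=1}^{M} \mathbf{1}\{T^{(m)}_L(\delta, d^*) > T^{(\lceil (1-\alpha)M \rceil)}_L(\delta, d^*)\} \,\Big|\, \{T^{m}_L(\delta, d^*)\}_{m=1}^{M}\right]\right] \\
&\le E\left[E\left[\frac{M - \lceil (1-\alpha)M \rceil}{M} \,\Big|\, \{T^{m}_L(\delta, d^*)\}_{m=1}^{M}\right]\right] \\
&= \frac{1}{M}\left(M - \lceil (1-\alpha)M \rceil\right) \\
&\le \alpha ,
\end{aligned}$$
where the first inequality holds with equality when there are no ties among the statistics.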
By Theorem 4.2, cL(Σ, α) ≥ cL(δ, d∗, α). Thus, P {TL > cL(Σ, α)} ≤ P{TL > cL(δ, d∗, α)}, which
completes the proof.
4.2 Testing Multiple Null Hypotheses
We now return to the problem of testing the family of null hypotheses (4) in a way that satisfies
(5). We use a stepdown multiple testing procedure. The terminology reflects the fact that our
procedure begins with the most significant null hypothesis and then “steps down” to less significant
null hypotheses. The argument for the validity of our procedure follows the arguments given in
Romano and Wolf (2005), who provide general results on the use of stepdown multiple testing
procedures for control of the FWER.
In order to describe the procedure based on critical values for each k ∈ K, let Tk = Tk(Xk)
be any test statistic for testing Hk. We use lowercase subscripts to denote single outcomes and
capital subscripts to denote sets of outcomes. Data Xk are the set of values associated with the
single outcome Yk. As mentioned, large values of Tk provide evidence against the null hypothesis.
For any L ⊆ K, define
$$T_L(X_L) = \max_{k \in L} T_k(X_k) \tag{9}$$
and let cL(Σ, α), defined by (8), be the corresponding critical value for testing HL using TL as
described in Section 4.1. Using this notation, we now describe the respective testing procedure by
the following algorithm:
Algorithm 1.
Step 1: Let L1 = K. If
TL1 ≤ cL1(Σ, α) ,
then stop. Otherwise, reject any Hk such that Tk > cL1(Σ, α), and define the set L2 of
indexes in L1 whose test statistics do not exceed the critical value associated with
L1. Formally, set
L2 = {k ∈ L1 : Tk ≤ cL1(Σ, α)}
and go on to Step 2.
...
Step n: If TLn ≤ cLn(Σ, α) , then stop. Otherwise, reject any Hk such that Tk > cLn(Σ, α),
set Ln+1 = {k ∈ Ln : Tk ≤ cLn(Σ, α)} and go on to Step n+ 1.
...
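The loop structure of Algorithm 1 can be sketched as follows. This is an illustrative outline rather than the authors' code: `critical_value_for` is a hypothetical callable standing in for the computation of cL(Σ, α) for a given index set L, and the toy statistics and toy critical-value rule are invented for the example.

```python
def stepdown(stats, critical_value_for):
    """Run the stepdown loop of Algorithm 1.

    stats: dict mapping each outcome index k to its test statistic T_k.
    critical_value_for: callable taking a frozenset L of indexes and
        returning the critical value used for the joint test of H_L.
    Returns the set of rejected indexes.
    """
    remaining = set(stats)  # L_1 = K
    rejected = set()
    while remaining:
        c = critical_value_for(frozenset(remaining))
        newly_rejected = {k for k in remaining if stats[k] > c}
        if not newly_rejected:  # max over remaining T_k is <= c: stop
            break
        rejected |= newly_rejected
        remaining -= newly_rejected  # L_{n+1} keeps indexes with T_k <= c
    return rejected


# Toy inputs: the critical value shrinks as hypotheses are removed,
# mimicking the monotonicity used in the proof of FWER control.
toy_stats = {"a": 5.0, "b": 3.0, "c": 1.0}
toy_rejected = stepdown(toy_stats, lambda L: float(len(L)))
```

In the toy run, the first step uses a critical value of 3 and rejects only "a"; the second step uses 2 and rejects "b"; the third uses 1, nothing exceeds it, and the procedure stops.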
The following result shows that Algorithm 1 controls the FWER at level α.
Theorem 4.4. Suppose data XK with distribution P described in Section 4 and σ satisfying σ ∈ Σ
are available. Let M ≥ 2 and 0 < α < 1 be given. Then Algorithm 1 satisfies (5).
Proof. Assume without loss of generality that K0(P ), the set of indices corresponding to true null
hypotheses, is nonempty. Suppose Algorithm 1 leads to at least one false rejection. Let n be the
smallest step at which a false rejection occurs. Then, it must be the case that
Tk > cLn(Σ, α)
for some k ∈ K0(P ). But, by the minimality of n, it must also be the case that
K0(P ) ⊆ Ln .
Because TK0(P ) is a maximum over a subset of the outcomes entering TLn , for any σ ∈ Σ and d ∈ σ−1(D,Z),
cK0(P )(σ, d, α) ≤ cLn(σ, d, α) ,
and therefore
cK0(P )(Σ, α) ≤ cLn(Σ, α) .
Hence,
TK0(P ) ≥ Tk > cLn(Σ, α) ≥ cK0(P )(Σ, α) .
It follows that
FWERP ≤ P{TK0(P ) > cK0(P )(Σ, α)} ≤ α ,
where the final inequality follows from Theorem 4.3.
5 Inference from the Perry Experiment
6 Conclusion
References
Breitmayer, B. J. and C. T. Ramey (1986, October). Biological nonoptimality and quality of
postnatal environment as codeterminants of intellectual development. Child Development 57 (5),
1151–1165.
Campbell, F. A. and C. T. Ramey (1994, April). Effects of early intervention on intellectual and
academic achievement: A follow-up study of children from low-income families. Child Develop-
ment 65 (2), 684–698. Children and Poverty.
Heckman, J. J. (2006). The principles underlying evaluation estimators with an application to
matching. Forthcoming, Annales d’Economie et de Statistiques.
Heckman, J. J., S. H. Moon, R. R. Pinto, P. A. Savelyev, and A. Q. Yavitz (2008). A reanalysis
of the High/Scope Perry Preschool Program. Unpublished manuscript, University of Chicago,
Department of Economics. First draft, September, 2006.
Heckman, J. J. and E. J. Vytlacil (2007a). Econometric evaluation of social programs, part I: Causal
models, structural models and econometric policy evaluation. In J. Heckman and E. Leamer
(Eds.), Handbook of Econometrics, Volume 6B, pp. 4779–4874. Amsterdam: Elsevier.
Heckman, J. J. and E. J. Vytlacil (2007b). Econometric evaluation of social programs, part II:
Using the marginal treatment effect to organize alternative economic estimators to evaluate
social programs and to forecast their effects in new environments. In J. Heckman and E. Leamer
(Eds.), Handbook of Econometrics, Volume 6B, pp. 4875–5144. Amsterdam: Elsevier.
Kurz, M. and R. G. Spiegelman (1972). The Design of the Seattle and Denver Income Maintenance
Experiments. Menlo Park, CA: Stanford Research Institute.
Lehmann, E. L. and J. P. Romano (2005). Testing Statistical Hypotheses (Third ed.). New York:
Springer Science and Business Media.
Manski, C. F. (2003). Partial Identification of Probability Distributions. New York: Springer-Verlag.
Romano, J. P. and A. M. Shaikh (2004). On control of the false discovery proportion. Technical
Report 2004-31, Department of Statistics, Stanford University.
Romano, J. P. and M. Wolf (2005, March). Exact and approximate stepdown methods for multiple
hypothesis testing. Journal of the American Statistical Association 100 (469), 94–108.
Schweinhart, L. J., H. V. Barnes, and D. Weikart (1993). Significant Benefits: The High-Scope
Perry Preschool Study Through Age 27. Ypsilanti, MI: High/Scope Press.
Schweinhart, L. J., J. Montie, Z. Xiang, W. S. Barnett, C. R. Belfield, and M. Nores (2005). Lifetime
Effects: The High/Scope Perry Preschool Study Through Age 40. Ypsilanti, MI: High/Scope
Press.
Schweinhart, L. J. and D. P. Weikart (1980). Young Children Grow Up: The Effects of the Perry
Preschool Program on Youths through Age 15. Ypsilanti, MI: High/Scope Press.
Weikart, D. P. (Ed.) (1967). Preschool Intervention: A Preliminary Report of the Perry Preschool
Project. Ann Arbor, MI: Campus Publishers.
Weikart, D. P., J. T. Bond, and J. T. McNeil (1978). The Ypsilanti Perry Preschool Project:
Preschool Years and Longitudinal Results Through Fourth Grade. Ypsilanti, MI: Monographs of
the High/Scope Educational Research Foundation.
Weikart, D. P., A. S. Epstein, L. Schweinhart, and J. T. Bond (1978). The Ypsilanti Preschool
Curriculum Demonstration Project: Preschool Years and Longitudinal Results. Ypsilanti, MI:
High/Scope Press.
Appendix
A Background on the Perry Preschool Curriculum
Preschool Overview During each wave of the experiment, the preschool class consisted of 20–25
children, whose ages ranged from 3 to 4. This is true even of the first and last waves, as the first
wave admitted 4-year-olds, who only received one year of treatment, and the last wave was taught
alongside a group of 3-year-olds, who are not included in our data. Classes were 2-1/2 hours every
weekday during the regular school year (mid-October through May).
The preschool teaching staff of four produced a child-teacher ratio ranging from 5 to 6.25 over
the course of the program. Teaching positions were filled by public-school teachers who were
“certified in elementary, early childhood, and special education,” (Schweinhart et al., 1993, p.32).
Home Visits Home visits lasting 1-1/2 hours were conducted weekly by the preschool teachers.
The purpose of these visits was to “involve the mother in the educational process,” and “implement
the curriculum in the home,” (Schweinhart et al., 1993, p.32). By way of encouraging the mothers’
participation, teachers also helped with any other problems arising in the home during the visit.
Occasionally, these visits would consist of field trips to stimulating environments such as a zoo.
Curriculum The Perry Preschool curriculum was based on the Piagetian concept of active learn-
ing, which is centered around play that is based on problem-solving and guided by open-ended
questions. Children are encouraged to plan, carry out, and then reflect on their own activities.
The topics in the curriculum are not based on specific facts or topics, but rather on key experi-
ences related to the development of planning, expression, and understanding. The key experiences
are then organized into ten topical categories, such as “creative representation”, “classification”
(recognizing similarities and differences), “number”, and “time” (see footnote 7). These educational principles are
reflected in the types of open-ended questions asked by teachers: for example, “What happened?
How did you make that? Can you show me? Can you help another child?” (Schweinhart et al.,
1993, p.33)
As the curriculum was developed over the course of the program, its details and application
varied from year to year. While the first year involved “thoughtful experimentation” on the part of
the teachers, experience with the program and a series of seminars during subsequent years led to the
development and systematic application of teaching principles with “an essentially Piagetian theory-base.”
During the later years of the program, all activities took place within a structured daily
routine intended to help children “to develop a sense of responsibility and to enjoy opportunities
for independence,” (Schweinhart et al., 1993, pp. 32–33).
7 For a full list, see Schweinhart et al. (1993).
B Computing Initial Distribution of Treatment Assignments D
As described in Section 4, Yi,j,k denotes the kth outcome of the ith sibling in the jth family. The
set K = {1, . . . ,K} indexes outcomes and J = {1, . . . , J} is the set of families in the experiment;
the vector of outcome k is Yk ≡ (Yi,j,k : i ∈ Ij , j ∈ J). For each j ∈ J, let Ij denote the set of
siblings in the jth family. In most cases, |Ij | = 1, but, as
described in the preceding section, there are families with more than one sibling in the program.
Denote by Zi,j a vector of characteristics for the ith child in the jth family:
Zi,j = (Wi,j , IQi,j , SESi,j ,Gi,j ,Mi,j) ,
where Wi,j is the wave of the program in which the individual participated, IQi,j is the Stanford-
Binet IQ score of the individual, SESi,j is the measure of socio-economic status for the individual
described in the preceding section, Gi,j is the gender of the individual, and Mi,j is an indicator
variable for whether the individual’s mother was working or not. Also define Z = (Zi,j : i ∈
Ij , j ∈ J). Let Dj be an indicator variable for whether the jth family was selected into treatment,
moreover define D = (Dj , j ∈ J).
Controlling the FWER requires a formal description of the distribution of Dj , j ∈ J . Let
$W_j \equiv W_{i^*,j}$ and $IQ_j \equiv IQ_{i^*,j}$, where
$$i^* \equiv \arg\min_{i \in I_j} W_{i,j}.$$
In other words, Wj and IQj are the values of Wi,j and IQi,j for the sibling in the earliest wave of
the program for the jth family (the eldest sibling). Define SESj , Gj ,Mj symmetrically.
The Perry protocol was designed to initially rank the IQs of the eldest siblings in each family
and then to randomly assign treatment and control status to odd and even numbered families
within each wave, respectively, as determined by a single toss of a coin. A practical problem that
plagues the Perry randomization protocol is that in the first stage there is not necessarily a unique
ordering of IQ because of ties. The IQ ties for eldest siblings within each wave are broken by
assigning equal probability to each of the available orderings of the block of tied individuals. Specifically,
define the partition of family indices into sets $J_a$, where
$$[\,j, j' \in J_a\,] \iff [\,(W_j, IQ_j) = (W_{j'}, IQ_{j'})\,],$$
so that each set is a cluster of eldest siblings that share the same value of IQ and belong to the
same wave, and
$$J \equiv \bigcup_{a=1}^{A} J_a.$$
Without loss of generality, order the sets Ja by the lexicographic rank of the values of wave
and IQ, (Wj , IQj), of participants. Construct a vector of the indexes in J , that is
{1, 2, . . . , |J |}, that follows the order of the sets J1, . . . , JA. Whenever the set Ja has more than
one participant, pick an ordering of its elements at random, each
with probability 1/(|Ja|!). Let ϕ denote the random vector of indexes $(j_1, \ldots, j_{|J|})$
constructed in this fashion. Therefore ϕ has dimension |J |, takes values in $J^{|J|}$, and
no index j ∈ J is repeated in any realization of ϕ. Moreover, all realizations of ϕ are
equally probable, conditional on ((Wj , IQj) : j ∈ J). Formally, let
ϕ = ϕ(((Wj , IQj) : j ∈ J))
be the random variable, conditional on ((Wj , IQj) : j ∈ J), that is uniformly distributed over all
orderings $(j_1, \ldots, j_{|J|})$ satisfying the lexicographic ordering in R2 of ((Wj , IQj)):
$$(W_{j_1}, IQ_{j_1}) \le_L \cdots \le_L (W_{j_{|J|}}, IQ_{j_{|J|}}),$$
where $\le_L$ denotes the lexicographic order. Notationally, write $(j_1, \ldots, j_{|J|}) \sim \phi$.
Define the set of assignments resulting from this protocol by D, where D = (Dj : j ∈ J). It is
distributed as follows: for each wave w, with probability 1/2, $D_{j_\ell} = 1$ for all odd values of ℓ and
$D_{j_\ell} = 0$ for all even values of ℓ for which $W_{j_\ell} = w$; and with probability 1/2, $D_{j_\ell} = 0$ for all
odd values of ℓ and $D_{j_\ell} = 1$ for all even values of ℓ for which $W_{j_\ell} = w$.
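The initial assignment protocol described above can be simulated as follows. This is a sketch under stated assumptions, not the original randomization code: it ranks families within each wave rather than over the full lexicographic ordering (which yields the same distribution because each wave receives its own fair coin toss), and the function name and the toy data are hypothetical.

```python
import random


def initial_assignment(waves, iqs, rng):
    """Simulate the initial protocol for eldest siblings.

    Within each wave, families are ranked by IQ with ties broken uniformly
    at random; one coin toss per wave then decides whether the odd-ranked
    or the even-ranked families receive treatment.
    """
    n = len(waves)
    assignment = [None] * n
    for w in sorted(set(waves)):
        members = [j for j in range(n) if waves[j] == w]
        # Shuffle, then stable-sort by IQ: tied families keep their shuffled
        # relative order, so every ordering of a tied block is equally likely.
        rng.shuffle(members)
        members.sort(key=lambda j: iqs[j])
        odd_treated = rng.random() < 0.5  # the single coin toss for wave w
        for rank, j in enumerate(members, start=1):
            is_odd = rank % 2 == 1
            assignment[j] = int(is_odd == odd_treated)
    return assignment


# Hypothetical data: seven families in two waves, with one IQ tie in wave 1.
toy_waves = [1, 1, 1, 1, 2, 2, 2]
toy_iqs = [80, 85, 85, 90, 70, 75, 80]
toy_d = initial_assignment(toy_waves, toy_iqs, random.Random(1))
```

In the toy data, families 1 and 2 are tied on IQ and occupy ranks 2 and 3 in either order, so exactly one of them is treated; which one depends on both the tie-break and the coin toss.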
The variable D comprises the items 1, 2 and 4 of Procedure ?? in Subsection 2.2.
Remark B.1. The distribution of D|Z has ≈ $10^{10}$ points of support.
Recall that by Remark ??, we can interchange items 3 and 4 w.l.o.g. We represent the swaps
of item 3 in Procedure ?? by a function δ0 that depends only on the observable variables Z:
$$\delta_0 : \{0, 1\}^{|J|} \times \mathrm{supp}(Z) \to \{0, 1\}^{|J|}.$$
Thus we define the random vector of dichotomous random variables D that arises from items 1
through 4 of Procedure ?? by:
D ∼ δ0(D, Z).
Remark B.2. The function of swaps δ0 is only partially known. We represent the uncertainty about the
function δ0 by stating that δ0 ∈ Σ0, where Σ0 is known. Note that for a fixed Z, Σ0 is finite.
Remark B.3. Even though the distribution of D is unknown, inferences can be made by using the
exchangeability property that arises from the set up of the randomization. Section 3 describes the
exchangeability property of treatment assignments.
B.1 Example
An example clarifies these ideas. Consider three participants (I1, I2, I3), and let the initial
treatment statuses D = (Da, Db, Dc) be associated with these three participants, respectively. Elements
of the vector D do not need to be exchangeable. Now suppose the reassignment function δ swaps
the first and third elements of the vector [Da, Db, Dc]′, that is, δ([Da, Db, Dc]′) = [Dc, Db, Da]′.
The criterion for exchangeability relies on equality of the background variables Z. Say
that the first and second participants share the same value of the background variables, that is,
Z1 = Z2. A valid permutation g permutes the first and the second participants, that is,
gδ([Da, Db, Dc]′) = [Db, Dc, Da]′.
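Our exchangeability property says that $g\delta(D) \overset{d}{=} \delta(D)$, and this is indeed true. The rationale
relies on the method of randomization. For the researcher who did the randomization,
participants I1 and I2 are indistinguishable, as they share the same background information. Thus
the initial order of participants could equally have been (I2, I1, I3). Under the first ordering of participants, the
final treatment status would be:
$$\begin{bmatrix} D_1 \\ D_2 \\ D_3 \end{bmatrix} = \delta\left(\begin{bmatrix} D_a \\ D_b \\ D_c \end{bmatrix}\right) = \begin{bmatrix} D_c \\ D_b \\ D_a \end{bmatrix};$$
if the second ordering were used, the final treatment status would be:
$$\begin{bmatrix} D_2 \\ D_1 \\ D_3 \end{bmatrix} = \delta\left(\begin{bmatrix} D_a \\ D_b \\ D_c \end{bmatrix}\right) = \begin{bmatrix} D_c \\ D_b \\ D_a \end{bmatrix}, \quad \text{that is,} \quad \begin{bmatrix} D_1 \\ D_2 \\ D_3 \end{bmatrix} = \begin{bmatrix} D_b \\ D_c \\ D_a \end{bmatrix}.$$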
As these two configurations are both equally likely to occur, we conclude that the two first elements
of the treatment status vector associated with the ordered participants (I1, I2, I3) are exchangeable.
In other words, the participants who share the same content of Z still remain exchangeable after
being swapped.
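The two configurations in this example can be traced mechanically. The snippet below is only a bookkeeping aid using symbolic labels; the function names `delta` and `g` mirror the notation of the example and are otherwise assumptions of the sketch.

```python
def delta(v):
    """Reassignment that swaps the first and third elements."""
    return [v[2], v[1], v[0]]


def g(v):
    """Permutation that swaps the first and second positions (Z1 = Z2)."""
    return [v[1], v[0], v[2]]


initial = ["Da", "Db", "Dc"]

# Ordering (I1, I2, I3): final status of (I1, I2, I3).
first_ordering = delta(initial)

# Ordering (I2, I1, I3): delta acts on the same initial vector, and g
# relabels the result back to the participant order (I1, I2, I3).
second_ordering = g(delta(initial))
```

Here `first_ordering` is `['Dc', 'Db', 'Da']` and `second_ordering` is `['Db', 'Dc', 'Da']`, matching the two displayed configurations.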
The swap function might act only for some draws of treatment status; say it swaps the first and
third elements only if the first element is equal to one. In this scenario, the two configurations
of treatment status above are equally likely conditional on Da = 1, and the distribution of
[D1, D2, D3] is unchanged when Da = 0. This does not change the essence of the exchangeability argument.
The idea described here is subtle. Suppose an analyst draws on a slightly different idea and
fixes the order of participants at (I1, I2, I3) up front, that is, D = [D1, D2, D3]′ = [Da, Db, Dc]′.
Suppose the analyst assumes that Da and Db are exchangeable, so that the vector [Da, Db, Dc]′ has the same
distribution as the vector [Db, Da, Dc]′. Again, let g be a permutation that swaps the first and
second elements, associated with the participants that share the same content of Z, that is, I1 and I2. In this case,
δ([Da, Db, Dc]′) = [Dc, Db, Da]′ and gδ(D) = [Db, Dc, Da]′. Observe that the first two elements
of the vector δ(D), namely Dc and Db, are not exchangeable under this assumption.
C The Algorithm
This appendix describes the algorithm of the paper in detail. The appendix is divided into parts
that mirror the implemented algorithm. Subsection C.1 gives a short description of the testing
procedure based on a permutation approach; Subsection ?? gives a short description of the data
used in the randomization protocol; Subsection ?? describes how to construct the set $\Sigma_2^{-1}(D, Z)$,
which includes all treatment statuses $d \in \sigma_2^{-1}(D, Z)$ for all $\sigma_2 \in \Sigma_2$; Subsection ?? describes the set
of permutations gZ ∈ GZ ; finally, Subsection C.5 describes the maximization algorithm and the
stepdown procedure.
C.1 A Permutation Approach
The second reassignment δ2 concerns only participants assigned to treatment whose mothers were
working. The reassignment switches treatment status to control status for a few participants. The
function δ2 is unknown; however, δ2 ∈ Σ2 and Σ2 is known. Thus we maximize the critical value across all
possible σ2 ∈ Σ2. We apply the permutations gZ to the set of possible treatment statuses that could
have arisen prior to the swaps δ2, that is, $d \in \sigma_2^{-1}(D, Z)$. The variability of D is obtained from $\delta_2(g_Z \cdot d, Z)$.
Observe that we bypass the problem of maximizing over δ1 ∈ Σ1 by using the exchangeability
property of the randomization protocol.
Notationally, let $g^m_Z$, $m = 2, \ldots, M$, be an i.i.d. sequence of permutations in $G_Z$. Let $\sigma_2^{-1}(D, Z)$
be the inverse set associated with each function $\sigma_2 \in \Sigma_2$, that is, $\sigma_2^{-1}(D, Z) \equiv \{d : D = \sigma_2(d, Z)\}$,
where D is the actual treatment status. Let $\Sigma_2^{-1}(D, Z)$ be the union of the sets $\sigma_2^{-1}(D, Z)$ over
$\sigma_2 \in \Sigma_2$.
For $m = 2, \ldots, M$, let $X^m(\sigma_2, d) = (Y, \sigma_2(g^m_Z d, Z), Z)$, and define $X^1(\sigma_2, d) = (Y, D, Z)$.
Let $T^m(\sigma_2, d)$ be the test statistic evaluated at these datasets.
Let $c(\sigma_2, d)$ be the $(1 - \alpha)$ quantile of the $T^m(\sigma_2, d)$, $m = 1, \ldots, M$.
Let c be the maximum value of $c(\sigma_2, d)$ over $\sigma_2 \in \Sigma_2$ and $d \in \Sigma_2^{-1}(D, Z)$.
Since $\sigma_U \in \Sigma_2$ and $\delta_1(D, Z) \in \sigma_U^{-1}(D, Z)$, we have that $c \ge c(\sigma_U, \delta_1(D, Z))$.
C.2 Basic Data
Data for implementing the testing procedure are:
• Actual treatment status D (dichotomous vector 123 × 1);
• Identification Number Id (ordered vector of numbers from 1 to 123);
• Family Identification Number F (vector 123 × 1). There are 101 distinct families and therefore
101 eldest siblings;
• Wave W (vector 123 × 1), taking values from 1 to 5. Values correspond to the cohort of each
participant, from the first up to the fifth cohort;
• Maternal Employment Indicator MW (dichotomous vector 123 × 1);
• Socio-Economic Status SES (vector 123 × 1);
• Binet IQ at Entry IQ (vector 123 × 1);
• Gender, represented by a “male” indicator M (vector 123 × 1).
We use a superscript o to denote a vector associated with the eldest sibling
of each family. For example, while IQ has dimension 123, IQo has dimension 101 and comprises
only the IQs of the eldest siblings.
C.3 Constructing the Set of Vectors of Treatment Status Σ−12 (D, Z)
The set $\Sigma_2^{-1}(D, Z)$ consists of the possible vectors of treatment status that could yield
the actual treatment status after the reassignment σ2(d, Z). To construct it, we switch the control status of
selected participants to treatment status. Participants eligible for the switch from control status
to treatment status satisfy the following criteria:
1. Belong to waves 2 through 5;
2. Maternal employment status is “working”;
3. Be the eldest sibling of the family;
4. Be assigned to control status;
We assume that there was at least one switch per wave and at most two switches per wave.
There are at most 2700 vectors of treatment status in $\Sigma_2^{-1}(D, Z)$. Table 3 shows the identification
numbers of such participants in the third column. The first column gives the treatment status
of the selected participants, and the second column gives the gender (1 for males). The fourth column gives
the maternal employment status. The fifth column gives the number of possible combinations of switches per
wave considering two switches. The sixth column provides the number of possible combinations of
switches per wave considering one switch. The last column considers one or two switches. The final
number of possible treatment vectors that could have generated the current vector of treatment
status, considering one or two switches per wave, is 2700.
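The enumeration of candidate pre-reassignment vectors can be sketched with `itertools`. The example is illustrative only: the identification numbers, the wave labels, and the function name are hypothetical, and the sketch does not use the data of Table 3.

```python
from itertools import combinations, product


def candidate_vectors(status, eligible_by_wave, max_switches=2):
    """Enumerate treatment vectors that could precede the reassignment.

    status: dict mapping participant id to actual treatment status (0 or 1).
    eligible_by_wave: dict mapping wave to the ids of eligible controls.
    In every wave, between 1 and max_switches eligible controls are switched
    back to treatment. A wave with no eligible ids yields no candidates.
    """
    per_wave_choices = []
    for wave in sorted(eligible_by_wave):
        ids = eligible_by_wave[wave]
        choices = [
            combo
            for size in range(1, max_switches + 1)
            for combo in combinations(ids, size)
        ]
        per_wave_choices.append(choices)

    result = []
    for selection in product(*per_wave_choices):
        candidate = dict(status)
        for switched in selection:
            for pid in switched:
                candidate[pid] = 1  # control status switched to treatment
        result.append(candidate)
    return result


# Hypothetical example: three eligible controls in wave 2, two in wave 3.
toy_status = {1: 0, 2: 0, 3: 0, 4: 1, 5: 0, 6: 0}
toy_eligible = {2: [1, 2, 3], 3: [5, 6]}
toy_candidates = candidate_vectors(toy_status, toy_eligible)
```

In the toy example wave 2 contributes 3 + 3 = 6 choices and wave 3 contributes 2 + 1 = 3, so there are 18 candidates; the count for the actual data is obtained in the same multiplicative way from the per-wave counts.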
C.4 Constructing the Permutation Set GZ
We create a partition of the set of eldest siblings based on four criteria. Let i be the identification
number of the eldest sibling of a family; the criteria are:
1. Whether IQ at entry is strictly above the median, that is 1[IQo(i) > median(IQ)];
2. Whether SES at entry is strictly above the median, that is 1[SESo(i) > median(SES)];
3. Wave, that is W o(i);
4. Gender, that is Mo(i).
Eldest siblings who have the same value for each of the criteria above
are clustered in the same group. The final partition of eldest siblings is given in Table 4. The
treatment assignments for eldest siblings that belong to the same group are assumed to be
exchangeable. Only seven participants belong to singleton groups, and these do not permute at all.
Permutations within groups are independent.
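The grouping rule can be expressed compactly. The sketch below is an assumption-laden illustration: it takes the medians over whatever vectors are passed in (the text does not say whether they are computed over all participants or over eldest siblings only), and the function name and toy values are hypothetical.

```python
from statistics import median


def exchangeability_groups(iq, ses, wave, male):
    """Group indexes that agree on all four criteria.

    The key for index i is (IQ above median, SES above median, wave, gender).
    Returns a dict mapping each key to the list of indexes in that group.
    """
    iq_median = median(iq)
    ses_median = median(ses)
    groups = {}
    for i in range(len(iq)):
        key = (iq[i] > iq_median, ses[i] > ses_median, wave[i], male[i])
        groups.setdefault(key, []).append(i)
    return groups


# Hypothetical eldest siblings, all in wave 1 and all male.
toy_groups = exchangeability_groups(
    iq=[80, 90, 85, 70],
    ses=[8, 9, 10, 7],
    wave=[1, 1, 1, 1],
    male=[1, 1, 1, 1],
)
```

With these toy values the IQ median is 82.5 and the SES median is 8.5, so indexes 0 and 3 fall in one group and indexes 1 and 2 in another; treatment assignments would then be permuted only within each group.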
C.5 Maximizing the Critical Values
The code for the maximization procedure is short and it is printed in full.
D The Perry Study: Sampling and Randomization Protocol
D.1 Overview
Program Timeline Children were admitted to the study in five waves between 1962 and 1965
(Table 6). Waves (entry cohorts) were distinguished by birth year. During 1962, the first year of the
study, two waves were admitted: wave zero, at age 4, and wave one, at age 3. Additional waves were
admitted during each subsequent year, through wave four in 1965, all at age 3. Treatment lasted
one year for wave zero and two years for later waves, the reason being that the Perry Preschool
Program terminated at kindergarten entry (age 5).
D.2 Sampling
Sample Population The Perry study sampled children from families living in the catchment
area of the Perry Elementary School, in Ypsilanti, Michigan, as children in this district “seemed
most in need of early education support, as judged by the poor academic performance of children
at Perry School when compared with their peers in the community at large,” (Weikart et al., 1978,
pp. 6, 12). Table 7 shows these differences. All children were African-American, which reflected
both the de jure institutional segregation of the local school districts of the time as well as the de
facto segregation of residential areas through the 1960s (Schweinhart et al., 2005, p. 22).
By contrast, the Erickson school shown in Table 7 is characterized as “an all-white school located
in an upwardly-mobile section of the Ypsilanti Public School District,” (Weikart, 1967, p. 65).
Sampling Procedure Children were primarily selected using the family census of the Perry
Elementary School, as well as by neighborhood-group referral or door-to-door canvassing (Schweinhart
and Weikart, 1980, p. 17). The sampling procedure was comprehensive within its geographic scope:
“the total population in the Perry school district” with children born during the target period
was surveyed for each wave to determine eligibility, using census information for the school district
(Weikart, 1967, pp. 4, 65–68). The specific criteria for determining eligibility are described below.
Self-selection in the eligibility survey does not appear to be a factor; during the first two waves,
for example, “no parent who was approached refused to give information as to his socio-economic
status, and no parent refused to allow his child to be tested. Some families could not be found at
home in spite of several visits... [but] these families were later contacted during weekends. The
socio-economic data as well as test results on the children reached late indicated that these children
did not differ from those evaluated earlier,” (Weikart, 1967, p. 69). Further, the sample captured
a broad swath of the population, as “...virtually all eligible children were enrolled in the project
during its five years of operation, approximately 25% of the total preschool-age population at that
time within the attendance area,” (Weikart et al., 1978, p. 16).
The only voluntary self-selection is that “3 families with children identified for the study refused
to participate,” (Schweinhart et al., 2005, p. 22). There was some involuntary selection after
treatment assignments were known — four children’s families moved away and one child died
before treatment program completion (Schweinhart et al., 2005, p. 23) — but no families voluntarily
declined participation following treatment assignment (Weikart et al., 1978, p. 7).
Eligibility Criteria Disadvantaged children were identified using an index of socio-economic
status (SES) and an IQ scale (Stanford-Binet, 1960 Norm, Form L-M).
The SES index is a weighted linear combination of three components: paternal occupational
skill level, parental educational attainment, and number of rooms per person in the family home.
The distribution of SES over the sample is given in Figure 4 (with definitions of SES components
provided in the figure notes). While the program documentation variously gives 10 (Weikart, 1967,
p. 68) or 11 (Weikart et al., 1978, p. 14) as the upper limit for eligibility, some children with an
SES index greater than 11 were admitted; out of 7 total such children, 6 ended up in the treatment
group, and 6 were admitted in the last two waves.
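The SES index can be sketched in a few lines of Python, using the coefficients reported in the
notes to Figure 4 (1/2 on average parental grade, 2 on occupational level, 2 on rooms per person).
This is an illustrative sketch only: the function and argument names are hypothetical, and the
cutoff of 11 is one of the two limits given in the documentation, with 10 being the other.

```python
def ses_index(mean_parent_grade, occupation_level, rooms, household_size):
    """Weighted linear combination described in the notes to Figure 4.

    mean_parent_grade: average highest grade completed by the parents
    occupation_level:  3 skilled, 2 semi-skilled, 1 unskilled or none
    rooms, household_size: rooms in the home and people living in it
    """
    return (0.5 * mean_parent_grade
            + 2.0 * occupation_level
            + 2.0 * rooms / household_size)


def meets_ses_cutoff(ses, cutoff=11.0):
    """Eligibility check; the cutoff value is an assumption (10 or 11)."""
    return ses <= cutoff


# Hypothetical family: parents averaged grade 9, unskilled occupation,
# 5 rooms shared by 6 people.
example = ses_index(9, 1, 5, 6)
print(round(example, 2), meets_ses_cutoff(example))
```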
Weikart et al. (1978, p. 16) provides a Stanford-Binet IQ eligibility range between 50 and 85:
“The score range of 50 to 85 was selected for a very practical reason: special-education funds were
available from the state to aid children certified ‘educable mentally retarded’ based on demonstrated
performance within that range in the absence of discernible organic impairments.” This is at odds
with the ranges given in some other documentation, such as the most recent publication from the
experimenters: “They selected for the study those children whose intellectual performance scores
(IQs) at this initial testing qualified them as ‘borderline educable mentally impaired’ by the State of
Michigan... that is, in the range of 70 to 85,” while noting the exceptions to this rule (Schweinhart
et al., 2005, p. 23).
Despite the discrepancy in documentation, there is no suggestion that the lower IQ bounds were
varied systematically, or that the inclusion of low-IQ children was a result of anything other than
their chance occurrence in the sample. The same cannot be said for the inclusion of children with IQ
greater than 85: the documentation sources agree on that upper bound, and Weikart et al. (1978,
p. 16ff) indicates that “in some instances, [high-IQ inclusion] was done in order to fill preschool
vacancies in the experimental group and then to balance the control group; in other instances,
to include siblings of children already in the project.” Figure 3 shows all IQs by wave and final
treatment assignment.
Family Background The Perry sample is made up of 123 children from 104 families, with
each family contributing at most one child to each wave. Table 8 breaks down children by their
relationship to siblings in the study (if any). Expanding one’s view beyond the study, Perry children
were young, and from large families: each had at least one older sibling, with a study-wide average
of four.
About half of the children were living with both biological parents. Average maternal age was
29 years at the time of enrollment; average maternal educational attainment was 9.4 years, but no
parent had completed more than 12 years of education. Among mothers with children at the Perry
School in 1962, 77% were born and 53% were educated in the south; correspondingly, 79% of Perry
study participants’ mothers were from the south, and only 11% from Michigan (Schweinhart et al.,
2005, pp. 22-23, 27).
D.3 Randomization Protocol
Detailed information on the randomization protocol as implemented is essential for choosing an
appropriate inference procedure: one tailored to Perry that corrects for compromises in the
randomization and for uncertainty about the true assignment rule.
Following Weikart et al. (1978, p. 16), there are 123 participants; the 51 females (25 treated and
26 control) and 72 males (33 treated and 39 control) are distributed among five cohorts. For each
entry cohort, children were assigned to treatment and control groups following Procedure ??, which
is graphically illustrated in Figure 1 of Subsection 2.2.
Balancing IQ All documentation sources agree that the randomization protocol began with unlabeled
groups formed by stratifying on IQ rank. Weikart et al. (1978, p. 16) and Schweinhart et al.
(2005, p. 27) are not specific, indicating merely that children were “sorted” or “assigned” to groups
based on IQ rank. However, other sources differ on specifics. Weikart et al. (1978, p. 7) and
Schweinhart et al. (1993, p. 31) indicate that children were paired based on “similar” or “matched”
IQs, after which the two pair members were “randomly” assigned to one of each of the two unlabeled
groups. In contrast, Schweinhart and Weikart (1980, p. 20) indicates that “...[c]hildren were ranked
by their initial IQs; even rankings were assigned to one group and odd rankings to another.” We
interpret the invocation of randomness as describing the manner in which ties were broken, as there
are a few children with tied IQs in each wave: after ties were broken randomly, children
with even- and odd-ranked IQs were put in separate unlabeled groups.
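Our reading of this step can be sketched in Python as follows. The sketch is illustrative and rests
on assumptions: the function name is hypothetical, rank 1 is taken to be the lowest IQ (the sources
do not state the direction of the ranking), and the group labels “A” and “B” stand in for the two
unlabeled groups.

```python
import random


def split_by_iq_rank(iqs, rng):
    """Rank children by IQ, break ties at random, and send odd ranks to
    group "A" and even ranks to group "B" (labels are placeholders)."""
    # A random secondary key breaks ties among equal IQs.
    order = sorted(range(len(iqs)), key=lambda i: (iqs[i], rng.random()))
    labels = [None] * len(iqs)
    for rank, child in enumerate(order, start=1):
        labels[child] = "A" if rank % 2 == 1 else "B"
    return labels


# Hypothetical wave of seven children, two of whom share an IQ of 80.
wave_iqs = [80, 70, 85, 75, 80, 66, 88]
print(split_by_iq_rank(wave_iqs, random.Random(0)))
```

With distinct IQs the split is deterministic; randomness enters only through the order in which
tied children are ranked.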
Balancing SES and Gender Table 1 reviews the descriptions of the exchanges intended to
balance mean SES index and gender. In all cases, the text appears to indicate that exchanges
were made between one treatment child and one control child, keeping the marginal treatment
group counts fixed. The descriptions also indicate that swaps were made in such a way as to hold
IQ roughly constant across groups. Although a couple of descriptions indicate that exchanges were
made so as to balance IQ, we read this as merely a comment that mean IQ remains balanced because
balancing exchanges hold the distribution of IQ across groups roughly constant. For the number of
such exchanges, we use the most precise estimate: 1–2 per wave.
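One way to simulate such balancing exchanges is sketched below in Python. The documentation does
not state how the exchanged pairs were chosen, so the greedy rule and the equally weighted
imbalance measure used here are assumptions made purely for illustration; all names are
hypothetical. Swapping the two members of an IQ rank-pair keeps group sizes fixed and leaves the
distribution of IQ across groups roughly unchanged.

```python
def imbalance(group_a, group_b):
    """Absolute gap in mean SES plus absolute gap in share male
    (equal weighting is an assumption of this sketch)."""
    def mean(values):
        return sum(values) / len(values)
    ses_gap = abs(mean([c["ses"] for c in group_a])
                  - mean([c["ses"] for c in group_b]))
    male_gap = abs(mean([c["male"] for c in group_a])
                   - mean([c["male"] for c in group_b]))
    return ses_gap + male_gap


def balance_by_swaps(pairs, max_swaps=2):
    """Greedily exchange members of IQ rank-pairs, at most max_swaps times.

    pairs: list of (child_in_A, child_in_B), each child a dict with
    keys "ses" and "male". Returns the pairs after the exchanges.
    """
    pairs = [tuple(p) for p in pairs]
    for _ in range(max_swaps):
        best_k = None
        best_val = imbalance([a for a, _ in pairs], [b for _, b in pairs])
        for k in range(len(pairs)):
            trial = list(pairs)
            trial[k] = (pairs[k][1], pairs[k][0])
            val = imbalance([a for a, _ in trial], [b for _, b in trial])
            if val < best_val:
                best_k, best_val = k, val
        if best_k is None:  # no single exchange improves balance
            break
        pairs[best_k] = (pairs[best_k][1], pairs[best_k][0])
    return pairs
```

The default of two exchanges mirrors the 1–2 per wave used in the text.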
Reassigning Children with Working Mothers After the first two waves, the randomization
procedure “had to be qualified somewhat by practical considerations;” children with single mothers
had transportation difficulties (funding was not available for preschool-provided transportation),
and scheduling home visits proved problematic (Weikart et al., 1978, pp. 16–17). Table ??
reviews the documentation on reassignments of treatment children with working mothers to the
control group. We simulate these swaps by supposing that up to two such reassignments occurred
per wave.

Table 1: Randomization Protocol Documentation: Balancing Gender and SES Index

Source: Weikart et al. (1978, p. 16)
  # of Swaps: -
  IQ Invariance: Held Roughly Constant
  Balance IQ?: No
  Text: “If these groups had unequal sex ratios or unequal SES ratings, they were equated by
  exchanging them, with [IQ] scores held more or less constant.”

Source: Schweinhart and Weikart (1980, p. 20)
  # of Swaps: -
  IQ Invariance: Held Roughly Equal
  Balance IQ?: No
  Text: “Pairs of similarly-ranked children were exchanged between groups until [gender and SES]
  for the two groups were equivalent.”

Source: Weikart et al. (1978, p. 7)
  # of Swaps: -
  IQ Invariance: Btw. Similar Ranks
  Balance IQ?: No
  Text: “Then, pairs of similarly-ranked children were exchanged between groups to equate
  within-group ratios of boys to girls and the average socioeconomic levels of the two groups.”

Source: Schweinhart et al. (1993, p. 31)
  # of Swaps: Several
  IQ Invariance: Btw. Similar Ranks
  Balance IQ?: Yes
  Text: “They then exchanged several similarly-ranked pair members, so the two groups would be
  matched on mean [SES, IQ, and gender].”

Source: Schweinhart et al. (2005, p. 27)
  # of Swaps: 1–2
  IQ Invariance: Btw. Same Rank-Pairs
  Balance IQ?: Yes
  Text: “As part of the initial assignment procedure, they exchanged 1 or 2 pair members per class
  to ensure that the groups were matched on mean [SES, IQ, and gender].”
The lowest estimate of the total number of exchanges (2) comes from Schweinhart et al. (1993, p.
31), which adds, “[a] thorough review of the study’s comprehensive original data could identify only
two such transfers; the transferred children’s names were independently identified by the program’s
head teacher. (Because this transfer has become a point of question [cites], a special effort was
made to verify its extent.)” However, even if this were the overall limit, the fact that their timing
is unknown does not decrease the per-wave upper bound of 2.
Documentation is generally consistent on reassignments being unilateral from the treatment to
the control group. The only indication to the contrary is in Schweinhart et al. (2005, p. 27), which
uses the phrase “exchanged 1 or 2 pair members,” also used in that book to describe the balancing
exchanges (see Table 1). Since the other books seem consistent on employment-based reassignment
being unilateral, and balancing swaps not, we choose to ignore this aberration as an imprecision in
terminology.
Table 2: Randomization Protocol Documentation: Reassigning Children with Working Mothers

Source: Weikart et al. (1978, pp. 16–17)
  # of Swaps (a): -
  One- or Two-Way (b): -
  Text: “Following the assignment of Waves 0 and 1...[o]ccasional exchanges of children between
  groups also had to be made because of the inconvenience of half-day preschool for working
  mothers...”

Source: Schweinhart and Weikart (1980, p. 21)
  # of Swaps (a): 5 Total
  One- or Two-Way (b): One-Way
  Text: “...five children were transferred from the experimental group to the control group, rather
  than dropped from the study, because they were unable to attend preschool or to participate with
  their mothers in the home-visit component of the program. These children came from single-parent
  families in which the mother was employed.”

Source: Weikart et al. (1978, p. 7)
  # of Swaps (a): 5 Total
  One- or Two-Way (b): One-Way
  Text: “Five children with single parents employed outside the home had to be transferred from the
  preschool group to the no-preschool group because of their inability to participate in the
  classroom and/or home-visit components of the preschool program.”

Source: Schweinhart et al. (1993, p. 31)
  # of Swaps (a): 2 Total
  One- or Two-Way (b): One-Way
  Text: “...fearing overall sample attrition, staff transferred from the program group to the
  non-program group 2 children (with single mothers employed away from home) who were unable to
  participate in any of the program’s classes or home visits.”

Source: Schweinhart et al. (2005, p. 27)
  # of Swaps (a): 1–2/Wave
  One- or Two-Way (b): Two-Way?
  Text: “...as part of the initial assignment procedure in later classes, they exchanged 1 or 2 pair
  members per class to reduce the number of children of employed mothers in the program group,
  because it was difficult to arrange home visits for them.”

Notes: (a) Total number of exchanges, throughout the study; (b) Whether reassignments were
unilateral from the treatment group to the control group, or whether they were numerically balanced
by transfers from the control group to the treatment group.
Figure 3: IQ at Entry, by Wave and by Treatment Group
[Figure: Perry: Stanford-Binet Entry IQ by Cohort and Group Assignment. Five panels, Class 1
through Class 5, each listing entry IQ values with counts of control and treatment children.
Column totals (control, treatment): Class 1: 15, 13; Class 2: 9, 8; Class 3: 14, 12; Class 4: 14,
13; Class 5: 13, 12.]
Notes: Stanford-Binet IQ at entry (age 3) was used.
Figure 4: SES Index, by Gender and Treatment Status
[Figure: two panels, “SES Index: Male” and “SES Index: Female.” Horizontal axis: SES index (6 to
14); vertical axis: fraction (0 to 0.45); separate series for the control and treatment groups.]
Notes: The socio-economic status index (SES) was defined as a weighted linear combination of 3
variables: (1) average highest grade completed by parents (or the grade of the only present
parent), with coefficient 1/2; (2) father’s employment status (or mother’s if father is absent): 3
for skilled, 2 for semi-skilled, and 1 for unskilled or none, all with coefficient 2; (3) number of
rooms in home divided by number of people living in the household, with coefficient 2. The skill
level of the parent’s job is rated by the study coordinators and is not clearly defined. An SES
score of 11 or lower was required to enter the study (Weikart, Bond, and McNeil, 1978, p. 14). The
SES score cutoff of 11 was not always adhered to: out of the full sample, 7 individuals have SES
above the cutoff. 6 out of 7 are in the treatment group, and 6 out of 7 are in the last two waves.
Table 3: All Participants that can Switch Back to Treatment according to Σ−12 (D,Z)
                                      Total Combinations
Id    D   MW   Wave   Males   0 switches   1 switch   2 switches   0, 1 or 2
42    0   1    2      1       1            2          1            4
43    0   1    2      1
58    0   1    3      0       1            3          3            7
68    0   1    3      1
69    0   1    3      0
88    0   1    4      1       1            5          10           16
89    0   1    4      1
90    0   1    4      0
94    0   1    4      1
95    0   1    4      1
114   0   1    5      1       1            4          6            11
115   0   1    5      0
121   0   1    5      0
122   0   1    5      0
Total                         1            120        180          4928
Notes: Id gives the identification number, D the treatment status of the selected participants, MW
the maternal status, and Males the gender (1 for males). The combination columns give the number of
possible combinations of switches per wave, listed on the first row of each wave, under zero, one,
and two switches; the last column allows zero, one, or two switches. The final row gives the
product across waves. The number of possible treatment vectors that could have generated the
current vector of treatment status under 1 or 2 switches per wave is 2700.
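The totals in Table 3 follow from elementary counting: within a wave with n eligible children,
the number of ways to switch exactly k of them back is the binomial coefficient C(n, k), and
counts multiply across waves. The Python sketch below reproduces the totals. The per-wave counts of
eligible children are read off the rows of Table 3; the function and variable names are
hypothetical.

```python
from math import comb, prod

# Number of children listed in Table 3 for each wave (wave -> count).
ELIGIBLE_PER_WAVE = {2: 2, 3: 3, 4: 5, 5: 4}


def count_vectors(eligible_per_wave, allowed_switches):
    """Number of treatment vectors when each wave independently switches
    back a number of children drawn from allowed_switches."""
    return prod(
        sum(comb(n, k) for k in allowed_switches)
        for n in eligible_per_wave.values()
    )


print(count_vectors(ELIGIBLE_PER_WAVE, [0]))        # 1
print(count_vectors(ELIGIBLE_PER_WAVE, [1]))        # 120
print(count_vectors(ELIGIBLE_PER_WAVE, [2]))        # 180
print(count_vectors(ELIGIBLE_PER_WAVE, [0, 1, 2]))  # 4928
print(count_vectors(ELIGIBLE_PER_WAVE, [1, 2]))     # 2700
```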
Table 4: Partition of Eldest Siblings for Permutation Purposes
No. id Group   No. id Group   No. id Group   No. id Group
1 1 5 31 31 12 62 73 30 83 99 38
2 2 6 32 32 13 63 74 30 84 100 37
3 3 7 33 35 10 64 75 24 85 101 32
4 4 1 34 36 10 65 76 26 86 102 39
5 5 4 35 38 12 66 77 27 87 104 38
6 6 2 36 39 12 67 78 31 88 105 39
7 7 2 37 40 12 68 79 29 89 107 39
8 8 8 38 41 15 69 80 27 90 108 38
9 9 7 39 42 14 70 81 28 91 109 33
10 10 5 40 43 13 71 82 29 92 110 36
11 11 4 72 85 30 93 112 35
12 12 5 73 86 29 94 113 38
13 13 8 41 47 18 74 88 28 95 114 39
14 14 3 42 49 16 75 89 31 96 115 35
15 15 5 43 50 17 76 90 24 97 116 37
16 16 5 44 51 20 77 92 24 98 118 35
17 17 6 45 52 20 78 93 31 99 121 35
18 18 8 46 53 21 79 94 30 100 122 32
19 19 2 47 55 17 80 95 30 101 123 39
20 20 8 48 56 17 81 96 24
21 21 3 49 57 18 82 98 29
22 22 5 50 58 18
23 23 4 51 60 19
24 24 5 52 61 23
25 25 5 53 63 23
26 26 6 54 64 23
27 27 5 55 65 20
28 28 4 56 66 21
29 29 12 57 67 18
30 30 13 58 68 21
31 31 12 59 69 17
60 70 18
61 71 22
Wave 2 Wave 4 Wave 5
Wave 3
Wave 1
Notes: The treatment assignments for eldest siblings that belong to the same group number are assumed to be exchangeable.
Table 5: Partitions of Eldest Siblings According to Different Discretization Criteria

                                     Discretizations
                          1          2          3          4
Panel (a): Criteria
  IQ                      Median     Median     Tercile    Tercile
  SES                     Median     Tercile    Median     Tercile
  Gender                  Yes        Yes        Yes        Yes
  Wave                    Yes        Yes        Yes        Yes
Panel (b)
  N. of Partition Sets    38         49         53         64
  N. of Singletons        7          16         20         28
Panel (c)
  N. of Permutations      4.42E+29   6.39E+21   5.92E+19   5.07E+14
  N. of different gD      4.35E+13   2.94E+11   2.72E+09   4.03E+07

Notes: The table presents four partitions of the eldest siblings set, which comprises 101
participants. The partitions are based on the values of selected variables: gender (dichotomous
variable), wave (categorical variable from 1 to 5) and quantile indicators of IQ and SES, both
measured at entry. Participants with the same values of the selected variables are clustered into
the same partition set. We use two types of quantile indicators: (a) whether the participant is
above or below the median of the target variable; (b) the tercile of the participant’s value of SES
or IQ in the overall sample. Panel (a) in the table provides the specification of each one of the
four partitions. The first line in Panel (b) gives the number of sets in each partition. The second
line gives the number of singletons, that is, the number of sets that have a single participant. We
assume exchangeability of participants within partition sets. The first line of Panel (c) provides
the number of possible permutations for each partition. The second line of Panel (c) provides the
number of distinct vectors of treatment status that would have been obtained if the permutations
were applied to the actual treatment status.
Table 6: Timeline of Perry Preschool Program Waves

          Sample Size
Wave      Treat.   Ctl.
Zero      13       15
One       8        9
Two       12       14
Three     13       14
Four      12       13

[Timeline chart: for each wave, year columns from 1958 to 1967 mark the birth year and the years of
the Perry Preschool Program (a).]
Source: Weikart et al. (1978, p. 6)
Notes: (a) The Perry Preschool Program ran during the school year (October through mid-May).
Table 7: Comparing Families of Participants with Other Families with Children in the Perry
Elementary School Catchment, Ypsilanti, MI.

                                   Perry School   Perry          Erickson
                                   (Overall)(a)   Preschool(b)   School(c)
Mother
  Average Age                      35             31             32
  Mean Years of Education          10.1           9.2            12.4
  % Working                        60%            20%            15%
  Mean Occupational Level(d)       1.4            1.0            2.8
  % Born in South                  77%            80%            22%
  % Educated in South              53%            48%            17%
Father
  % Fathers Living in the Home     63%            48%            100%
  Mean Age                         40             35             35
  Mean Years of Education          9.4            8.3            13.4
  Mean Occupational Level(d)       1.6            1.1            3.3
Family & Home
  Mean SES(e)                      11.5           4.2            16.4
  Mean # of Children               3.9            4.5            3.1
  Mean # of Rooms                  5.9            4.8            6.9
  Mean # of Others in Home         0.4            0.3            0.1
  % on Welfare                     30%            58%            0%
  % Home Ownership                 33%            5%             85%
  % Car Ownership                  64%            39%            98%
  % Members of Library(f)          25%            10%            35%
  % with Dictionary in Home        65%            24%            91%
  % with Magazines in Home         51%            43%            86%
  % with Major Health Problems     16%            13%            9%
  % Who Had Visited a Museum       20%            2%             42%
  % Who Had Visited a Zoo          49%            26%            72%
N                                  277            45             148

Source: Weikart, Bond, and McNeil (1978). Notes: (a) These are data based on parents who attended
parent-teacher meetings at the Perry school or who were tracked down at their homes by Perry
personnel (Weikart, Bond, and McNeil, 1978, pp. 12–15); (b) The Perry Preschool subsample consists
of the full sample (treatment and control) from the first two waves; (c) The Erickson School was an
“all-white school located in a middle-class residential section of the Ypsilanti public school
district.” (ibid., p. 14); (d) Occupation level: 1 = unskilled; 2 = semiskilled; 3 = skilled;
4 = professional; (e) See the notes to Figure 4 for the definition of SES; (f) Any member of the
family.
Table 8: Siblings in Perry Sample

             Overall              Male                 Female
             All   Ctl.   Trt.    All   Ctl.   Trt.    All   Ctl.   Trt.
Singleton    82    41     41      51    26     25      31    15     16
Eldest       19    12     7       13    9      4       6     3      3
Younger      22    12     10      8     4      4       14    8      6
N            123   65     58      72    39     33      51    26     25

Note: The sample includes 17 pairs, 1 triple, and 1 quadruple of siblings.