A Structural Model of Charter School Choice and Academic Achievement

Christopher Walters∗

September 20th, 2013

Abstract

Lottery-based instrumental variables estimates show that Boston’s charter schools substantially increase test scores and close racial achievement gaps among their applicants. A key policy question is whether charter expansion can produce similar effects for broader groups of students. This paper uses a structural model of school choice and academic achievement to study the demand for charter middle schools in Boston, with an emphasis on the external validity of lottery-based estimates. I identify the parameters of the model using two sets of instruments: randomly assigned lottery offers, and distance to charter schools. Estimates of the model suggest that charter applicants are negatively selected on achievement gains. Specifically, low-income students and students with low prior achievement gain the most from charter attendance, but are unlikely to apply to charter schools. I explore the consequences of these findings by simulating the effects of charter expansion in an equilibrium school choice model with constant returns to scale on the supply side. The simulations show that while potential achievement gains for marginal charter applicants are large, the effects of charter expansion may nevertheless be limited by weak demand: perceived application costs are high, and the students for whom charter schools are most effective prefer to remain in traditional public schools. These findings suggest that in the absence of significant behavioral or institutional changes, the potential gains from charter expansion may be limited as much by demand as by supply.

Keywords: School choice; demand estimation; discrete choice modeling; treatment effects; charter schools

∗ University of California, Berkeley (e-mail: [email protected]). I am grateful to Joshua Angrist, Parag Pathak, and David Autor for their guidance and support. Seminar participants at APPAM, Cornell University, Duke University, Mathematica Policy Research, MIT, Northeastern University, the Spencer Foundation, UC Berkeley, UCLA, the University of Chicago Harris School of Public Policy, and the University of Michigan provided useful comments. I also thank Aviva Aron-Dine, David Chan, Sarah Cohodes, William Darrity Jr., Susan Dynarski, Michal Kurlaender, Bridget Terry Long, Christopher Palmer, Stephen Ryan, Xiao Yu Wang, Tyler Williams, and numerous others for helpful discussions. Special thanks go to Carrie Conaway and the Massachusetts Department of Elementary and Secondary Education for suggestions, assistance and data. This work was supported by a National Science Foundation Graduate Research Fellowship, a National Academy of Education/Spencer Dissertation Fellowship, and Institute for Education Sciences award number R305A120269.

1 Introduction

Differences in test scores between racial and socioeconomic groups are a pervasive

feature of the American educational landscape. In 2012, 13-year-old black students scored

more than 0.8 standard deviations (σ) below their white counterparts on the National

Assessment of Educational Progress Long Term Trend (NAEP-LTT) math test.1 Similar

achievement gaps appear in all subjects and at all grade levels (Fryer and Levitt 2004,

2006; Vanneman et al., 2009). Moreover, after several decades of convergence, the relative

scores of black students stagnated in the late 1980s (Neal 1996). Achievement gaps

between high- and low-income students have also grown in recent years (Reardon 2011).

Numerous policy interventions with the potential to affect achievement gaps have been

proposed, but few produce gains of the magnitude necessary to substantially reduce these

gaps (Fryer 2010). As a result, some analysts have argued that it is either impossible or

inordinately expensive to significantly reduce achievement gaps using educational policies

alone, especially policies that target adolescents (Rothstein 2004; Heckman 2011).

These pessimistic accounts notwithstanding, a growing body of evidence suggests that

some charter schools serving poor, minority populations in urban areas boost achievement

sharply. Charter schools are publicly funded, non-selective schools that typically have

more freedom than traditional public schools to set curricula and make staffing decisions.

Studies based on entrance lotteries show that attendance at charter schools in Boston

and New York’s Harlem Children’s Zone raises achievement by 0.25σ per year or more

(Abdulkadiroglu et al. 2011; Dobbie and Fryer 2011). Angrist et al. (2012, 2013a,

2013b), Dobbie and Fryer (2013), Gleason et al. (2010), Hoxby and Murarka (2009),

and Hoxby and Rockoff (2004) also report positive effects for urban charters.2 These

findings suggest that urban charter schools may have the potential to reduce achievement

gaps. Reflecting this hope, the Massachusetts legislature recently relaxed the state’s

charter school cap with the explicit goal of reducing racial and socioeconomic disparities

in academic performance (Commonwealth of Massachusetts 2010).

Despite the substantial effects reported in lottery-based studies, however, the efficacy

of charter school expansion may be limited by both demand-side and supply-side factors.

1 Author’s calculation.

2 Estimates for charter schools outside urban areas are mixed. Gleason et al. (2010) find that non-urban charters are no more effective than traditional public schools, while Angrist et al. (2013b) find negative effects for non-urban charter middle schools in Massachusetts.

On the supply side, inputs such as high-quality teachers and principals may become

increasingly scarce as the system expands (Wilson 2008). On the demand side, effects

for current applicants may differ systematically from potential effects for broader groups

of students. In particular, students with larger potential benefits may be more likely to

sign up for charter lotteries, in which case lottery-based estimates overstate the possible

effects of expansion. In support of this view, Ravitch (2010, pp. 144-145) argues that

“charter schools enroll the most motivated students in poor communities, those whose

parents push them to do better...as more charter schools open, the dilemma of educating

all students will grow sharper.” Similarly, Rothstein (2004, p. 82) writes of the Knowledge

is Power Program (KIPP), a high-performing urban charter operator: “[T]hese exemplary

schools...select from the top of the ability distribution those lower-class children with

innate intelligence, well-motivated parents, or their own personal drives, and give these

children educations they can use to succeed in life.” On the other hand, the parents of

low-achieving students may be unlikely to investigate alternatives to traditional public

school, despite evidence that urban charters are especially effective for such students

(Angrist et al. 2012).

This paper studies the demand for charter middle schools in Boston, with an emphasis

on predicting achievement effects for the full population of Boston students. To this

end, I develop and estimate a structural model that links students’ charter application

and enrollment decisions to their potential achievement gains in a parametric selection

framework. The model is similar to the stochastic portfolio choice problem outlined by

Chade and Smith (2006): students submit charter applications to maximize expected

utility, taking account of admission probabilities and non-monetary application costs.

To identify the model’s parameters, I combine instruments from entrance lotteries with a

second set of instruments based on proximity to charter schools. The approach taken here

is similar in spirit to other recent studies that use economic theory to extrapolate from

experimental or quasi-experimental causal estimates (see, e.g., Todd and Wolpin 2006,

Card and Hyslop 2006, Attanasio et al. 2011, Lise et al. 2004, and Duflo et al. 2012).3

The model accommodates heterogeneity in student preferences and achievement gains on

both observed and unobserved dimensions, as well as heterogeneous effects across charter

schools.

3 Angrist and Fernandez-Val (2011) and Hotz et al. (2005) describe approaches to extrapolation that emphasize variation in treatment effects as a function of observed covariates. Heckman and Vytlacil (2001, 2005), Carneiro et al. (2010), and Heckman (2010) discuss nonparametric estimation of marginal treatment effects (MTE), effects at particular values of the unobserved propensity to receive treatment. The approach taken here includes elements of both of these approaches.

Estimates of the model imply that charter applicants are negatively selected on gains

from charter attendance. Specifically, higher-achieving, less-disadvantaged students have

the strongest preferences for charter schools, but charters are most effective for poor

students and those with low previous achievement. The structural estimates imply that

charter applicants are also negatively selected on unobserved dimensions of achievement

gains. Surprisingly, these findings imply that lottery-based estimates understate the

achievement effects of charter schools for non-applicants: effects of treatment on the

treated (TOT) for current applicants are substantially smaller than the model’s implied

population average treatment effects (ATE). As a result, charter expansion has the

potential to substantially raise achievement for marginal applicants.

I quantify these findings by simulating the effects of charter school expansion in an

equilibrium school choice model. The model adds a constant-returns-to-scale

specification of charter supply, and allows charter admission probabilities to adjust endogenously

to clear the market for charter seats. The constant-returns-to-scale assumption, which

implies that expansion charter schools are able to replicate the production technology

used by existing schools, represents a best-case scenario on the supply side. This

assumption allows me to estimate upper bounds on the effects of charter expansion due to

demand-side behavior. The simulations show that the expansion currently underway in

Boston, which will raise the share of middle schoolers attending charters from 9 percent

to 14 percent, may reduce the gap in 8th-grade math scores between Boston and the rest

of Massachusetts by up to 3 percent, and also reduce citywide racial achievement gaps

by roughly 3 percent.

At the same time, my results show that even in the best-case scenario for charter

supply, the effects of large charter expansions may be limited by weak demand,

especially among students with large potential achievement gains. Students act as if charter

application costs are high, and most prefer to attend traditional public schools despite

the substantial achievement gains they would receive from charters. Furthermore, the

model predicts that the students with the largest potential gains will choose to remain in

traditional public schools even when charters offering guaranteed admission are located

in close proximity. As a result, the ATE remains well above the TOT in all simulations.

I conclude with a simulation that breaks the link between preferences and causal effects.

The results show that the effects of charter expansion would be substantially larger if

high-benefit students could be induced to apply.

These results contribute to a nascent literature assessing the demand for school quality

and its relationship to socioeconomic status. Hastings et al. (2009) show that

higher-socioeconomic-status parents are more likely to choose schools with high average test

scores in a public school choice plan in Charlotte, North Carolina. Using a nationally

representative sample from the Early Childhood Longitudinal Study, Butler et al. (2013)

show that richer children are more likely to enroll in charter schools. In the

higher-education sphere, Hoxby and Avery (2012) show that seemingly qualified poor students

are unlikely to apply to selective colleges. Similarly, Brand and Xie (2010) argue that

students with the largest potential returns are the least likely to enroll in college, while

Dillon and Smith (2013) document the prevalence of “mismatch” between student ability

and college quality in the National Longitudinal Survey of Youth. Farther afield, Ajayi

(2011) shows that students from lower-performing elementary schools in Ghana are less

likely to apply to selective secondary schools.

Finally, the equilibrium model used in my simulations contributes to a rich literature

analyzing models of school competition. Epple and Romano (1998) study a theoretical

model of competition between public and private schools. Epple et al. (2003, 2006,

2013) analyze theoretical and empirical models of college quality, student sorting, and

financial aid. In other related work, Ferreyra and Kosenok (2011) and Mehta (2011) study

models of charter school entry. The model used here allows charter schools to compete for

students by adjusting their lottery admission probabilities, while abstracting from other

supply-side issues in order to focus on the consequences of demand-side behavior.

The rest of the paper is organized as follows: The next section gives background on

charter schools in Boston and describes the data. Section 3 benchmarks the effects of

charter schools in the sample of lottery applicants. Section 4 outlines the structural model

of charter demand and academic achievement, and Section 5 discusses identification and

estimation of the model. Section 6 reports the structural estimates. Section 7 adds a

specification of charter supply, derives equilibrium admission probabilities, and uses the

model to simulate the effects of charter expansion. Section 8 concludes.

2 Setting and Data

2.1 Context: Charter Schools in Boston

Non-profit organizations, teachers, or other groups wishing to operate charter schools

in Massachusetts submit applications to the state’s Board of Education. If authorized,

charter schools are granted freedom to organize instruction around a philosophy or

curricular theme, as well as budgetary autonomy. Charter employees are also typically

exempt from local collective bargaining agreements, giving charters more discretion over

staffing than traditional public schools.4 Charters are funded primarily through per-pupil

tuition payments from local districts. Charter tuition is roughly equal to a district’s

per-pupil expenditure, though the state Department of Elementary and Secondary Education

partially reimburses these payments (Massachusetts Department of Elementary and

Secondary Education 2011). The Board of Education reviews each charter school’s academic

and organizational performance at five-year intervals, and decides whether charters should

be renewed or revoked.

Enrollment at Massachusetts charter schools is open to all students who live in the

local school district. If a charter school receives more applications than it has seats, it must

accept students by random lottery. Students interested in multiple charter schools must

submit separate applications to each charter, and may receive multiple offers through

independent school-specific lotteries. This system of independent enrollment processes is

in contrast to the centralized enrollment mechanism used for Boston’s traditional public

schools, which collects lists of students’ preferences over schools and generates a single

offer for each student (Pathak and Sonmez 2008).

The Boston Public Schools (BPS) district is the largest school district in Massachusetts,

and it also enrolls an unusually large share of charter students. In the 2010-2011 school

year, 14 charter schools operated in Boston, accounting for 9 percent of BPS enrollment.

The analysis here focuses on middle schools, defined as schools that accept students in

fifth or sixth grade; 12 percent of Boston middle schoolers attended charter schools in

2010-2011. Panel A of Appendix Table A1 lists names, grade structures, and years of

operation for the ten Boston charter middle schools that operated through the 2010-2011

school year. As described in Section 2.2, I use lottery data from seven of these schools to

produce the estimates reported below. These seven schools are marked in black on the

map in Appendix Figure A1.

4 Massachusetts has two types of charter schools: Commonwealth charters and Horace Mann charters. Commonwealth charters are usually new schools authorized directly by the Board of Education, while Horace Mann charters are often conversion schools and must be approved by the local school board and teachers’ union prior to state authorization. Horace Mann employees typically remain part of the collective bargaining unit. I focus on Commonwealth charter schools. No Horace Mann charter middle schools operated in Boston during my data window.

Many of Boston’s charter schools adhere to a model known as “No Excuses,” a set

of practices that includes extended instruction time, strict behavior standards, a focus

on traditional reading and math skills, selective teacher hiring, and teacher monitoring

(Wilson 2008). A growing body of evidence suggests that these practices boost student

achievement (Angrist et al., 2013b; Dobbie and Fryer, 2013; Fryer, 2011). Consistent

with this evidence, Abdulkadiroglu et al. (2011) use entrance lotteries to show that

Boston’s charter schools substantially increase achievement among their applicants. Their

estimates imply that a year of charter middle school attendance raises test scores by 0.4σ

in math and 0.2σ in reading. Similarly, Angrist et al. (2013a) show that Boston’s charter

high schools have substantial effects on longer-term outcomes like SAT scores and

four-year college enrollment.

These encouraging findings make Boston an appealing setting for studying the demand

for charter schools and the potential effects of charter expansion. The effects of expansion

in Boston are also relevant to an ongoing policy debate. In recent years, the growth of

charters in Massachusetts has been slowed by the state’s charter cap, a law that limits

expenditures on charter schools to 9 percent of the host district total.5 The Board of

Education stopped accepting proposals for new Boston charters in 2008 when charter

expenditure hit the cap (Boston Municipal Research Bureau 2008).

In 2010, the Massachusetts legislature relaxed the charter cap for low-performing

school districts. Specifically, for districts with test scores in the state’s lowest decile, the

limit on charter expenditures is to rise incrementally from 9 percent in 2010 to 18

percent in 2017 (Commonwealth of Massachusetts 2010). This law gives priority to “proven

providers” who have previously held leadership positions at schools demonstrating

academic success for similar student populations (Massachusetts Department of Elementary

and Secondary Education 2012a). The law also requires schools to specify recruitment

plans aimed at attracting applicants who are demographically similar to the local

population, though all students are free to apply and admissions will continue to be determined

by lottery (Massachusetts Department of Elementary and Secondary Education 2012c).

5 Legislation also limits the total number of Commonwealth charter schools to 72 and the number of Horace Mann charters to 48, though these caps are not currently binding.

Through 2011, the Board of Education received 51 charter applications under the new

law. Of these, 32 were selected as finalists, and 20 charters were granted, eleven to schools

in Boston (Massachusetts Department of Elementary and Secondary Education 2012b).

Panel B of Appendix Table A1 lists the six new Boston charter middle schools opened in

the 2011-2012 and 2012-2013 school years, as well as existing schools operated by the same

providers. Two opened in 2011-2012, while four opened in 2012-2013. Five new schools

are linked to existing charters in Boston; the sixth, KIPP Academy Boston, is part of the

Knowledge is Power Program, the nation’s largest charter management organization.6

The locations of the expansion middle schools are marked in red in Appendix Figure A1.

Below, I refer to these schools as part of Boston’s “planned” middle school expansion

because although they have started operating, lottery records and test scores for their

applicants are not yet available. Boston’s charter sector may continue to expand in the

near future; recently-proposed legislation would eliminate the charter cap in Boston and

other low-performing districts (Levitz 2013).

2.2 Data Sources and Sample Construction

The data used in my analysis come from three sources. First, I obtain

demographics, school attendance, and test scores from an administrative database provided by the

Massachusetts Department of Elementary and Secondary Education (DESE). Second, I

draw spatial location data from student addresses provided by the BPS district. Finally, I

obtain information on charter school applications and offers from lottery records gathered

from individual charter schools.

The DESE database covers all Massachusetts public school students from the

2001-2002 school year through the 2011-2012 school year. Key variables include sex, race,

subsidized lunch status, limited English proficiency (LEP), special education status (SPED),

town of residence, schools attended, and scores on Massachusetts Comprehensive

Assessment System (MCAS) math and English Language Arts (ELA) achievement tests. I begin

by selecting from the database the five cohorts of students who attended a traditional

BPS school in 4th grade between 2005-2006 and 2009-2010. I also require students to

have non-missing 4th grade demographics and test scores, as well as school attendance

information and test scores in 6th, 7th, or 8th grade. I use only the earliest test taken

by a given student in a particular subject and grade. I standardize these scores to have

mean zero and standard deviation one within each subject-year-grade in Massachusetts.

6 KIPP operates two charter schools in Lynn, a poor suburb of Boston. In a lottery-based evaluation of one of these schools, Angrist et al. (2012) estimate effects similar to those of Boston’s charter middle schools.

Next, I merge the student address database to the DESE administrative file using a

crosswalk between BPS and state student identifiers. The address database includes a

record for every year that a student attended a traditional BPS school between 1998 and

2011. I drop students in the state database without 4th-grade BPS address data. This

restriction eliminates less than 1 percent of Boston 4th graders. The address information

is used to measure proximity to each Boston charter school. I measure proximity using

great-circle distance in miles.7
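As an illustration of this proximity measure, the sketch below computes great-circle distance in miles with the haversine formula. The paper does not say which formula or Earth radius was used, so the radius constant, the function name great_circle_miles, and the coordinates are all assumptions introduced here.

```python
import math

# Mean Earth radius in miles. This constant is an assumption; the paper
# does not report the value used in its distance calculations.
EARTH_RADIUS_MILES = 3958.8


def great_circle_miles(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in miles; inputs are in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# Hypothetical coordinates for a student address and a charter school.
student = (42.3601, -71.0589)
school = (42.3301, -71.0889)
print(round(great_circle_miles(*student, *school), 2))
```

Unlike the travel-time measure mentioned in the footnote, a distance of this kind depends only on the two coordinate pairs, so it returns the same value every time it is computed.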

I then match the student data to records from lotteries held at seven charter middle

schools in Boston. The lotteries used here are an expanded version of the middle school

sample used by Abdulkadiroglu et al. (2011), including two additional schools. I focus on

middle schools because applicant records were more consistently available in middle school

than in elementary or high school. I restrict attention to applicant cohorts attending 4th

grade between 2006 and 2010 because lottery records for earlier cohorts were missing for

several schools, while lottery records and test scores for later cohorts are not yet available.

Column 4 of Appendix Table A1 summarizes the availability of lottery records for

the ten charter middle schools that operated between the 1997-1998 school year and the

2010-2011 school year.8 Of the three schools without available records, two closed prior

to the 2010-2011 school year; the third declined to provide records. In the analysis below,

I treat these schools as equivalent to traditional public schools. I matched the available

lottery records to the administrative data by name, grade, year, and (where available)

date of birth. This process produced unique matches for 92 percent of lottery applicants.

After matching the lottery files to the student data, I constructed two subsamples

for statistical analysis. The first is used to estimate causal effects for lottery applicants,

and thus excludes students who did not apply to charter schools. The lottery sample

includes 2,564 applicants to charter middle schools. A second sample, which includes

non-applicants, is used to estimate the structural model. The structural sample includes

13,473 students who attended BPS schools in 4th grade between 2006 and 2010.

7 I also performed the analysis using travel times measured by Google Maps, obtained using the STATA traveltime command. I chose to use great-circle distances instead because traveltime produced different results when queried at different times, making the results difficult to replicate. Key estimates were very similar for this alternative distance measure.

8 I classify charter schools as middle schools if they accept applicants in 5th or 6th grade. Two Boston charter schools accept students prior to 5th grade but serve grades 6 through 8. Since I restrict the analysis to students who attended traditional BPS schools in 4th grade, no students in the sample attend these schools.

2.3 Descriptive Statistics

Applicants to Boston charter schools differ from the general population of Boston

students. Specifically, charter applicants tend to have higher socioeconomic status, and

to enter middle school with higher prior achievement than non-applicants. This can be

seen in Table 1, which reports summary statistics for the structural sample in column (1)

and the applicant sample in column (2). Panel A shows that nineteen percent of students

applied to at least one charter lottery, thirteen percent were offered a charter seat, and

ten percent attended a charter school. Panel B shows mean characteristics measured

in 4th grade. Compared to the general student population, charter applicants are less

likely to be Hispanic, to be eligible for subsidized lunch, to have special education status,

or to be classified as limited English proficient. Charter applicants are more likely to be

black, and also live slightly closer to charter schools on average (1.9 miles from the closest

charter, compared to 2.1 miles for the full population).

Table 1 also displays information about 4th grade achievement test scores. Boston 4th

graders lag behind the state average by 0.54σ and 0.64σ in math and ELA. Students who

apply to charter schools have higher scores than the general Boston population:

applicants’ 4th grade scores exceed the Boston average by more than 0.15σ in both subjects.

Taken together, these summary statistics indicate that charter applicants are

higher-achieving, less economically disadvantaged, and less likely to have academic problems

than students who do not apply to charter schools.

To investigate balance in the charter lotteries used here, column (3) of Table 1

compares lottery winners and losers in the charter applicant sample. The estimates in this

column are coefficients from regressions of student characteristics on a lottery offer

indicator, controlling for lottery fixed effects capturing all combinations of schools and

applicant cohorts in the data. These lottery fixed effects are necessary to generate

unbiased comparisons between winners and losers, because the set of lotteries a student enters

10

influences her offer probability.9 The estimates in column (3) show that the lotteries are

balanced: there are no statistically significant differences between lottery winners and

losers, and the hypothesis of balance across all characteristics is not rejected in a joint

test (p = 0.22).10
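To see why the set of lotteries entered matters for offer probability, consider a minimal sketch that assumes lotteries are independent across schools, in line with the school-specific lotteries described in Section 2.1. The function name prob_any_offer and the admission probabilities are hypothetical and are not taken from the lottery records.

```python
from math import prod


def prob_any_offer(admission_probs):
    """Probability of at least one offer across independent lotteries."""
    # The chance of losing every lottery is the product of the loss
    # probabilities; the complement is the chance of holding an offer.
    return 1.0 - prod(1.0 - p for p in admission_probs)


# Hypothetical admission probabilities, for illustration only.
one_lottery = prob_any_offer([0.5])
two_lotteries = prob_any_offer([0.5, 0.5])
print(one_lottery, two_lotteries)
```

Under these assumed odds, a student who enters two lotteries holds at least one offer with probability 0.75, against 0.5 for a student who enters one. A raw comparison of winners and losers would therefore mix students with different application behavior, which is the difference the lottery fixed effects are meant to absorb.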

3 Effects on Lottery Applicants

3.1 Lottery Estimates

To motivate my analysis of the demand for charter schools, I begin by using lotteries

to benchmark the effects of charter attendance among the selected subset of students

who choose to apply to charters. I also report estimates of the effects of charters on

racial achievement gaps among applicants, which are of policy interest but have not been

previously investigated.

I interpret the lottery-based estimates in the Local Average Treatment Effect (LATE)

model described by Imbens and Angrist (1994), which provides a formal framework for

analyzing heterogeneity in causal effects. Let Yi(1) be applicant i’s potential test score

if she attends a charter school, and let Yi(0) be her test score if she attends a public

school. Si indicates charter attendance (the “treatment”), and Zi is a lottery offer dummy.

Let Si(1) and Si(0) denote potential treatment status as a function of Zi. The LATE

framework is based on the following assumptions for the lottery applicant sample:

A1 Independence and Exclusion: (Yi(1), Yi(0), Si(1), Si(0)) is independent of Zi.

A2 First Stage: 0 < Pr[Zi = 1] < 1 and Pr[Si(1) = 1] > Pr[Si(0) = 1].

A3 Monotonicity : Si(1) ≥ Si(0) ∀i.

The Independence and Exclusion assumption is motivated by the observation that offers

are randomly assigned among applicants, and are unlikely to affect test scores through any

channel but charter attendance. The First Stage assumption requires that winning the

lottery makes applicants more likely to attend charter school on average. Monotonicity

9 I construct separate lottery groups for sibling applicants, who are guaranteed admission. Since all siblings receive offers, the inclusion of sibling lottery combinations has no effect on the estimates.

10 Even with random assignment, the validity of lottery-based instruments can be compromised by non-random attrition. Appendix Table A2 shows that the follow-up rate for the lottery sample is 87 percent, and follow-up rates for lottery winners and losers are very similar. The first row of this table shows that follow-up rates are similar in the lottery and structural samples.

requires that winning the lottery does not discourage any applicant from attending charter

school.

Under assumptions A1-A3, applicants can be partitioned into three groups: never

takers, who never attend charters (Si(1) = Si(0) = 0), always takers, who attend re-

gardless of the offer (Si(1) = Si(0) = 1), and compliers, who are induced to attend by

receiving offers (Si(1) > Si(0)). Imbens and Angrist (1994) show that conventional in-

strumental variables (IV) methods consistently estimate LATE, the average treatment

effect for compliers. We have

\[
\frac{E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0]}{E[S_i \mid Z_i = 1] - E[S_i \mid Z_i = 0]} = E[Y_i(1) - Y_i(0) \mid S_i(1) > S_i(0)]. \tag{1}
\]

The Wald (1940) IV estimator is the empirical analogue of the left-hand side of equation

(1).
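As a minimal illustration of equation (1), the Wald estimator can be computed as the offer-induced difference in mean outcomes divided by the offer-induced difference in attendance rates. The sketch below uses made-up toy data and a hypothetical function name (`wald_estimate`); it ignores lottery fixed effects and is not the estimation code behind the results reported here.

```python
def wald_estimate(y, s, z):
    """Wald IV estimate: reduced form divided by first stage.

    y: outcomes, s: 0/1 charter attendance, z: 0/1 lottery offer.
    """
    def mean(values):
        return sum(values) / len(values)

    y_won = [yi for yi, zi in zip(y, z) if zi == 1]
    y_lost = [yi for yi, zi in zip(y, z) if zi == 0]
    s_won = [si for si, zi in zip(s, z) if zi == 1]
    s_lost = [si for si, zi in zip(s, z) if zi == 0]

    reduced_form = mean(y_won) - mean(y_lost)  # E[Y|Z=1] - E[Y|Z=0]
    first_stage = mean(s_won) - mean(s_lost)   # E[S|Z=1] - E[S|Z=0]
    return reduced_form / first_stage


# Toy data (illustrative only): eight applicants, four of whom receive offers.
z = [1, 1, 1, 1, 0, 0, 0, 0]
s = [1, 1, 1, 0, 0, 0, 0, 1]
y = [0.9, 0.7, 0.8, 0.1, 0.2, 0.0, 0.1, 0.7]
late = wald_estimate(y, s, z)
```

In this toy sample the reduced form is 0.375 and the first stage is 0.5, so the ratio is 0.75. One losing applicant attends anyway (an always taker) and one winner declines (a never taker), which is why the first stage is below one.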

I estimate LATE using a two-stage least squares (2SLS) procedure that combines

observations from multiple lotteries. The estimating equation for the lottery analysis is

\[
Y_i = \psi_\ell + \beta S_i + \epsilon_i, \tag{2}
\]

where Yi is a test score for applicant i, Si is a dummy variable indicating charter school attendance, and $\psi_\ell$ is a set of lottery fixed effects. I code a student as attending charter school if she attends a charter at any time after the lottery and prior to the test. The first stage equation is

\[
S_i = \kappa_\ell + \pi Z_i + \eta_i. \tag{3}
\]

The instrument Zi is one for students who receive any charter offer before the start of

the school year following the lottery. The 2SLS estimate of β can be interpreted as a

weighted average of within-lottery LATEs.11 To use all available test score information,

the sample stacks scores in grades six through eight. Standard errors are clustered at the

student level.
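The point estimate from equations (2) and (3) can be sketched in a few lines. With a single instrument and a single endogenous regressor, 2SLS with fixed effects reduces to a ratio of within-lottery covariances (by the Frisch-Waugh logic of partialling out the fixed effects). The code below is a sketch under that simplification: the function names and toy data are hypothetical, and it does not compute the student-clustered standard errors used in the text.

```python
from collections import defaultdict


def demean_within(values, groups):
    """Subtract the group mean from each value (absorbs group fixed effects)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


def iv_with_fixed_effects(y, s, z, lottery):
    """Just-identified IV estimate of beta with lottery fixed effects."""
    y_t = demean_within(y, lottery)
    s_t = demean_within(s, lottery)
    z_t = demean_within(z, lottery)
    numerator = sum(zi * yi for zi, yi in zip(z_t, y_t))    # within-lottery cov(Z, Y)
    denominator = sum(zi * si for zi, si in zip(z_t, s_t))  # within-lottery cov(Z, S)
    return numerator / denominator


# Toy data (illustrative only): two lotteries with different baseline score levels.
lottery = ["a", "a", "a", "a", "b", "b", "b"]
z = [1, 1, 0, 0, 1, 0, 0]
s = [1, 0, 0, 0, 1, 1, 0]
psi = {"a": 0.2, "b": -0.4}  # hypothetical lottery effects
y = [psi[g] + 0.5 * si for g, si in zip(lottery, s)]
beta = iv_with_fixed_effects(y, s, z, lottery)
```

Because the toy outcomes are generated without noise as a lottery effect plus 0.5 times attendance, the estimate recovers 0.5 up to floating-point error, regardless of the lottery-specific intercepts.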

Consistent with the results reported by Abdulkadiroglu et al. (2011), the 2SLS esti-

mates show that Boston’s charter schools have dramatic effects on student achievement for

lottery applicants. As shown in column (1) of Table 2, receipt of a lottery offer increases

the probability of charter attendance by 0.64. The second-stage estimates, reported in

11 With a first stage saturated in offer-times-lottery interactions, 2SLS produces a weighted average of within-lottery Wald estimates, with weights proportional to the variance of the first-stage fitted values (Angrist and Imbens 1995). Estimates from this fully saturated model were very similar to the more parsimonious model used here, which includes a single instrument and lottery fixed effects.

columns (2) and (3), imply that attending a charter school increases math scores by 0.49σ

and boosts ELA scores by 0.32σ. These effects are precisely estimated (p < 0.01).

These pooled results mask substantial heterogeneity in the benefits of charter school

attendance across racial groups. The second row of Table 2 shows that the effects of

charter schools are relatively modest for white students: the math estimate for whites is

a statistically insignificant 0.19σ, and the ELA estimate is negative and insignificant. In

contrast, the third and fourth rows reveal large, significant effects for black and Hispanic

students in both subjects. Charter attendance boosts scores for black students by 0.60σ

in math and 0.38σ in ELA. The corresponding effects for Hispanics are 0.54σ and 0.46σ.

The last row of Table 2 reports p-values from Wald tests of the equality of charter effects

across races. The null hypothesis of equal effects is rejected at conventional significance

levels (p < 0.05) for both subjects. These results show that Boston’s charter schools raise

test scores for non-white students much more than for whites.

3.2 Effects on Score Distributions by Race

As a final piece of motivation for the structural analysis to follow, I next use the

lottery sample to ask whether charter schools close racial achievement gaps among ap-

plicant compliers. To estimate effects on black and white complier score distributions, I

modify the methods described by Abadie (2002, 2003). Abadie notes that in addition to

LATE, the marginal distributions of Yi(1) and Yi(0) are separately identified for compli-

ers in instrumental variables settings. Intuitively, the distribution of Yi for students with

Si = Zi = 0 is a mixture of the distributions of Yi(0) for compliers and never takers. The

distribution of Yi(0) for never takers is directly observable among students with Zi = 1

and Si = 0. The distribution of Yi(0) for compliers can therefore be recovered by a de-

convolution that uses these two observed distributions. A similar argument shows that

the distribution of Yi(1) for compliers can be recovered using the distribution of Yi for

students with Si = Zi = 1 together with the distribution for students with Si = 1 and

Zi = 0. Abadie provides simple methods for estimating CDFs of potential outcome dis-

tributions for compliers, and outlines bootstrap procedures for testing hypotheses about

these distributions. I extend these methods to estimate potential outcome densities sep-

arately by race, and test for black-white equality among applicant compliers who are

randomly assigned to charter schools or public schools.

The estimating equations for the distributional analysis are of the form

\[
K_h(y - Y_i) \cdot S_i = \kappa_{\ell y} + \gamma(y) \cdot S_i + \eta_{iy}, \tag{4}
\]

where Si is treated as an endogenous regressor and instrumented with lottery offers. Here $K_h(t) = \frac{1}{h} K\left(\frac{t}{h}\right)$, $K(t)$ is a kernel function, and $h$ is a bandwidth. Let $f^c_s(y)$ be the density of $Y_i(s)$ for lottery compliers. Appendix A shows that the probability limit of the 2SLS estimate of $\gamma(y)$ is $f^c_1(y)$, the density function for treated compliers. $f^c_0(y)$ can be estimated by replacing $S_i$ with $(1 - S_i)$ in equation (4). I use a Kolmogorov-Smirnov (KS) statistic to test distributional equality for blacks and whites.12
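The idea behind equation (4) can be sketched for a single lottery, where the IV estimate of the treated complier density at a point is a Wald ratio with a kernel-weighted outcome in the numerator. The code below is only an illustration: the function names and toy data are hypothetical, lottery fixed effects are ignored, and the triangle kernel is used because it is the kernel named in the discussion of Figure 1.

```python
def triangle_kernel(t):
    """Triangle kernel K(t) = max(0, 1 - |t|), which integrates to one."""
    return max(0.0, 1.0 - abs(t))


def scaled_kernel(t, h):
    """K_h(t) = K(t / h) / h for bandwidth h > 0."""
    return triangle_kernel(t / h) / h


def treated_complier_density(y0, outcomes, attend, offer, h):
    """Wald-type IV estimate of the treated complier density at the point y0."""
    def mean(values):
        return sum(values) / len(values)

    # Dependent variable K_h(y0 - Y_i) * S_i, split by offer status.
    dep_won = [scaled_kernel(y0 - y, h) * s
               for y, s, z in zip(outcomes, attend, offer) if z == 1]
    dep_lost = [scaled_kernel(y0 - y, h) * s
                for y, s, z in zip(outcomes, attend, offer) if z == 0]
    s_won = [s for s, z in zip(attend, offer) if z == 1]
    s_lost = [s for s, z in zip(attend, offer) if z == 0]

    estimate = (mean(dep_won) - mean(dep_lost)) / (mean(s_won) - mean(s_lost))
    # IV density estimates can be negative; set those to zero
    # (the Imbens and Rubin 1997 suggestion followed in the text).
    return max(0.0, estimate)


# Toy data (illustrative only) with perfect compliance: S equals Z.
outcomes = [0.0, 1.0, 5.0, 6.0]
attend = [1, 1, 0, 0]
offer = [1, 1, 0, 0]
density = treated_complier_density(0.5, outcomes, attend, offer, h=1.0)
```

With perfect compliance every lottery winner is a complier, so the estimate collapses to an ordinary kernel density of the winners' scores; here that is 0.5 at the evaluation point 0.5.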

Boston’s charter schools cause math score distributions for black and white applicant

compliers to converge. This can be seen in Figure 1, which plots estimated complier

densities of Yi(0) and Yi(1) for math, separately by race and year since application. The

densities are estimated using a triangle kernel with bandwidth 1σ.13 Black vertical lines

indicate the pooled mean of Yi(0) in each figure, while red lines mark the mean of Yi(1)

in plots for treated students. At baseline (prior to treatment), distributions for treated

and non-treated compliers are similar, and black students lag behind whites throughout

the distribution. The KS test rejects baseline distributional equality at conventional

significance levels (p = 0.01 for the untreated, p = 0.07 for the treated). In post-

treatment years, the black distribution moves towards the white distribution for treated

compliers, and in 7th and 8th grade the null hypothesis of distributional equality cannot

be rejected for treated students (p ≥ 0.76). By 8th grade, the distribution for black

compliers lies slightly to the right of the white distribution, though these distributions

are not statistically distinguishable.

In contrast with the results for treated compliers, no convergence occurs for untreated

compliers. The left-hand panels of Figure 1 show that black compliers who attend public

schools lag behind their white counterparts in every year, with little relative change in the

12 The KS statistic is proportional to the maximum difference in complier CDFs between racial groups. To compute these CDFs, I replace $K_h(y - Y_i)$ with $1\{Y_i \le y\}$ in equation (4) and estimate this equation on a grid of 100 points covering the support of Yi. Inference is based on a stratified bootstrap procedure. For each of 200 bootstrap replications, I draw observations with replacement within lotteries to obtain a new sample with the same lottery-specific sample sizes as the original sample. I then randomly assign observations in each lottery to racial groups in the same proportions as in the original sample, and recalculate the KS statistic. The results provide the sampling distribution of the KS statistic under the null hypothesis of distributional equality for black and white compliers.

13 Imbens and Rubin (1997) point out that instrumental variables estimates of potential outcome densities are not guaranteed to be positive. I follow their suggestion and set the estimated densities to zero in a small number of cases where the 2SLS estimate is negative.

distributions after baseline (though distributional equality is not rejected in 8th grade due

to a lack of power). These results suggest that Boston’s charter schools close otherwise

persistent achievement gaps between black and white compliers in math. As shown in

Figure 2, black-white convergence is less pronounced for ELA than for math, though large

shifts in mean ELA scores are evident in the plots for 7th and 8th grade.

4 Modeling Charter School Attendance

4.1 Setup

The lottery estimates in Section 3 show that Boston’s charter schools have dramatic

effects on average test scores and racial achievement gaps for applicant compliers. At

the same time, effects for non-applicants may differ systematically from effects for appli-

cants, so these results need not provide an accurate guide to the likely consequences of

charter expansion. To study the external validity of the lottery-based estimates, I use

a structural model of charter application, attendance, and achievement. As in Chade

and Smith (2006) and Ajayi (2011), the charter school application decision is modeled

as a random utility portfolio choice problem: students choose a set of applications to

maximize expected utility, taking into account admission probabilities and application

costs. The model also allows for heterogeneous effects across charter schools.

Figure 3 explains the sequence of events described by the model. First, students decide whether to apply to each of K charter schools, indexed by $k \in \{1, \ldots, K\}$. The dummy variable $A_{ik} \in \{0, 1\}$ indicates that student i applies to school k. Second, charter schools randomize offers to applicants. The dummy variable $Z_{ik} \in \{0, 1\}$ indicates an offer for student i at school k, and $\pi_k$ denotes the admission probability for applicants to school k. In the third stage, students choose schools denoted $S_i$, where $S_i = 0$ indicates public school attendance. Any student can attend public school, but student i can attend charter school k only if $Z_{ik} = 1$. Finally, students take achievement tests. $Y_{ij}$ denotes student i’s score on math and ELA tests, indexed by $j \in \{m, e\}$.

4.2 Student Choice Problem

4.2.1 Preferences

Students’ preferences for schools depend on demographic characteristics, spatial prox-

imity, application costs, and unobserved heterogeneity. Specifically, the utility of attend-

ing charter school k is

\[
U_{ik} = \gamma^0_k + X_i'\gamma^x + \gamma^d \cdot D_{ik} + \theta_i + v_{ik} - c_i(A_i), \tag{5}
\]

where Xi is a vector of characteristics for student i including sex, race, subsidized lunch

status, special education status, limited English proficiency, and 4th grade math and

ELA scores. Dik measures distance to school k. The utility of public school attendance

is

Ui0 = vi0 − ci (Ai) .

The quantity ci (Ai) represents the utility cost of Ai, the application bundle chosen

by student i. (Here and elsewhere, variables without k subscripts refer to vectors, so

that Ai ≡ (Ai1, ..., AiK)′ and so on.) Application costs include the disutility of filling

out application forms and the opportunity cost of time spent attending lotteries. These

costs may also capture frictions associated with learning about charter schools.14 The

application cost function is parameterized as

ci (a) = γa · |a| − ψia.

The parameter γa is the marginal cost associated with an additional charter school appli-

cation. The error term ψia is a shock to the utility associated with a particular application

bundle. Applicants pay these costs whether or not they attend a charter.

The variables θi and vik represent unobserved heterogeneity in tastes. θi, which char-

acterizes student i’s preference for charter schools relative to traditional public school, is

the key unobservable governing selection into the charter sector. This variable includes

any latent factors that influence students to opt out of traditional public school in favor

of charter schools, such as the perceived achievement gain from attending charter schools,

14 Charter schools are not listed in informational resources provided to parents by the BPS district. For example, the “What Are My Schools?” tool located at www.bostonpublicschools.org provides a list of the BPS schools to which children are eligible to apply, but does not list charter schools (accessed September 13th, 2013).

proximity to the relevant public school, and parental motivation.15 In the language of the

random-coefficients logit model (see, e.g., Hausman and Wise 1978, Berry et al. 1995,

and Nevo 2000), θi is the random coefficient on a charter school indicator. The presence

of θi implies that charter schools are closer substitutes for each other than for traditional

public schools. I assume that θi follows a normal distribution with mean zero and variance $\sigma^2_\theta$.

The vik capture idiosyncratic preferences for particular schools, which are further

decomposed as

vik = τik + ξik.

Students know ψia, τik and θi before applying to charter schools, and learn ξik after

applying. The post-application preference shock ξik explains why some applicants decline

charter school offers. To generate multinomial logit choice probabilities, ψia, τik, and

ξik are assumed to follow independent extreme value type I distributions, with scale

parameters λψ, λτ and 1.16

4.2.2 School Lotteries

In the second stage of the model, schools hold independent lotteries. School k admits

applicants with probability πk. The probability mass function for the offer vector Zi

conditional on Ai is

\[
f(Z_i \mid A_i; \pi) = \prod_k \left[ A_{ik} \cdot \left( \pi_k Z_{ik} + (1 - \pi_k)(1 - Z_{ik}) \right) + (1 - A_{ik}) \cdot (1 - Z_{ik}) \right]. \tag{6}
\]

Initially, admission probabilities are treated as parameters to be estimated. However,

admission rates are likely to change as the system of charter schools expands. In the

simulations to follow, the πk adjust so that schools fill their seats in equilibrium. Section

7 and Appendix D discuss the determination of endogenous admission probabilities.
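Equation (6) is simple to evaluate directly. The sketch below treats the admission probabilities as given numbers; the function name `offer_pmf` and the example values are hypothetical and chosen only for illustration.

```python
from itertools import product


def offer_pmf(z, a, pi):
    """Probability of offer vector z given application vector a, as in equation (6)."""
    prob = 1.0
    for z_k, a_k, pi_k in zip(z, a, pi):
        if a_k == 1:
            # Applicants are admitted with probability pi_k.
            prob *= pi_k if z_k == 1 else 1.0 - pi_k
        else:
            # Non-applicants can never hold an offer.
            prob *= 1.0 if z_k == 0 else 0.0
    return prob


# Hypothetical example: three schools, applications to the first two.
pi = (0.5, 0.25, 0.9)
a = (1, 1, 0)
total = sum(offer_pmf(z, a, pi) for z in product((0, 1), repeat=len(pi)))
```

Summing over all $2^K$ offer vectors returns one, and any offer vector that includes a school the student did not apply to has probability zero.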

4.2.3 Application and Attendance Decisions

I derive students’ optimal application and attendance rules by backward induction.

A student is faced with a unique attendance decision after each possible combination of

15 Proximity to the relevant public school is treated as unobserved because Boston has a citywide choice plan, so students have a large number of traditional public schools to choose from.

16 That is, ξik follows a standard Gumbel distribution, which provides the scale normalization for the model.

charter school offers, because the set of offers in hand determines the available school

choices. Consider the decision facing a student at stage 3 in Figure 3. At this point, the

student knows her charter offers, application costs are sunk, and there is no uncertainty

about preferences. Student i can attend public school or any charter school that offers a

seat. Her choice set is

\[
C(Z_i) = \{0\} \cup \{k : Z_{ik} = 1\}.
\]

Define

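\[
U_{ik}(\theta_i, \tau_{ik}) \equiv \gamma^0_k + X_i'\gamma^x + \gamma^d \cdot D_{ik} + \theta_i + \tau_{ik},
\]

with $U_{i0}(\theta_i, \tau_{i0}) \equiv \tau_{i0}$. Student i’s optimal school choice is

\[
S_i = \arg\max_{k \in C(Z_i)} \; U_{ik}(\theta_i, \tau_{ik}) + \xi_{ik},
\]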

and the probability that student i chooses school k at this stage is given by

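\[
\Pr[S_i = k \mid X_i, D_i, Z_i, \theta_i, \tau_i] = \frac{\exp\left(U_{ik}(\theta_i, \tau_{ik})\right)}{\sum_{j \in C(Z_i)} \exp\left(U_{ij}(\theta_i, \tau_{ij})\right)} \equiv P_{ik}(Z_i, \theta_i, \tau_i).
\]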

The expected utility associated with this decision (before the realization of ξi) is

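\[
W_i(Z_i, \theta_i, \tau_i) \equiv E\left[\max_{k \in C(Z_i)} U_{ik}(\theta_i, \tau_{ik}) + \xi_{ik} \,\Big|\, X_i, D_i, Z_i, \theta_i, \tau_i\right] = \nu + \log \sum_{k \in C(Z_i)} \exp\left(U_{ik}(\theta_i, \tau_{ik})\right),
\]

where ν is Euler’s constant.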
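The attendance stage is a standard multinomial logit over the schools a student can actually attend. The following sketch computes the choice set, the choice probabilities, and the expected maximum utility for given values of the deterministic utilities. All function names and numbers are hypothetical; `u[0]` stands for the public school utility and `u[k]` for charter k.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, written nu in the text


def choice_set(z):
    """Public school (0) plus every charter k (1-based) with an offer."""
    return [0] + [k for k, z_k in enumerate(z, start=1) if z_k == 1]


def attendance_probs(u, z):
    """Logit probabilities over the choice set; u[k] is U_ik(theta, tau)."""
    options = choice_set(z)
    top = max(u[k] for k in options)  # subtract the max for numerical stability
    weights = {k: math.exp(u[k] - top) for k in options}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def inclusive_value(u, z):
    """Expected maximum utility W_i: Euler's constant plus a log-sum-exp."""
    options = choice_set(z)
    top = max(u[k] for k in options)
    return EULER_GAMMA + top + math.log(sum(math.exp(u[k] - top) for k in options))


# Hypothetical example: two charters, an offer from the first only.
u = [0.0, 0.0, 1.0]
z = (1, 0)
probs = attendance_probs(u, z)
w = inclusive_value(u, z)
```

In this example the second charter is the most attractive school but is not in the choice set, so the student splits evenly between public school and the first charter, and the expected maximum utility is Euler's constant plus log 2.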

Students choose charter applications to maximize expected utility, anticipating offer

probabilities and their own attendance choices. Consider the application decision facing

a student at stage 1 in Figure 3. The student knows θi, τi and ψi, but does not know ξi,

and her choice of Ai induces a lottery over Zi at a cost of ci(Ai). Define

\[
V_i(a, \theta_i, \tau_i) \equiv \sum_{z \in \{0,1\}^K} \left[ f(z \mid a; \pi) \cdot W_i(z, \theta_i, \tau_i) \right] - \gamma^a \cdot |a|.
\]

The expected utility associated with the choice Ai = a is Vi(a, θi, τi) + ψia, and the

probability of choosing this bundle is

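\[
\Pr[A_i = a \mid X_i, D_i, Z_i, \theta_i, \tau_i] = \frac{\exp\left(\frac{V_i(a, \theta_i, \tau_i)}{\lambda_\psi}\right)}{\sum_{a' \in \{0,1\}^K} \exp\left(\frac{V_i(a', \theta_i, \tau_i)}{\lambda_\psi}\right)} \equiv Q_{ia}(\theta_i, \tau_i).
\]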
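The backward-induction logic can be sketched by enumerating all $2^K$ application bundles, valuing each one by its expected inclusive value net of application costs, and applying a logit with scale $\lambda_\psi$. The code below is a sketch under those definitions, not the estimation code used here: function names and parameter values are hypothetical, and `u[0]` again denotes the public school utility.

```python
import math
from itertools import product

EULER_GAMMA = 0.5772156649015329  # Euler's constant


def offer_pmf(z, a, pi):
    """Probability of offer vector z given application vector a."""
    prob = 1.0
    for z_k, a_k, pi_k in zip(z, a, pi):
        if a_k == 1:
            prob *= pi_k if z_k == 1 else 1.0 - pi_k
        elif z_k == 1:
            return 0.0  # no offers without an application
    return prob


def inclusive_value(u, z):
    """Expected maximum utility over public school and offered charters."""
    utils = [u[0]] + [u[k] for k, z_k in enumerate(z, start=1) if z_k == 1]
    top = max(utils)
    return EULER_GAMMA + top + math.log(sum(math.exp(x - top) for x in utils))


def bundle_value(a, u, pi, gamma_a):
    """V_i(a): expected inclusive value over offers minus gamma_a per application."""
    expected = sum(offer_pmf(z, a, pi) * inclusive_value(u, z)
                   for z in product((0, 1), repeat=len(a)))
    return expected - gamma_a * sum(a)


def application_probs(u, pi, gamma_a, lam_psi):
    """Logit probabilities Q_ia over all application bundles."""
    bundles = list(product((0, 1), repeat=len(pi)))
    scaled = [bundle_value(a, u, pi, gamma_a) / lam_psi for a in bundles]
    top = max(scaled)
    weights = [math.exp(v - top) for v in scaled]
    total = sum(weights)
    return {a: w / total for a, w in zip(bundles, weights)}


# Hypothetical example with two charters and equal admission odds.
probs = application_probs(u=[0.0, 1.0, -1.0], pi=(0.5, 0.5), gamma_a=0.3, lam_psi=1.0)
costly = application_probs(u=[0.0, 1.0, -1.0], pi=(0.5, 0.5), gamma_a=50.0, lam_psi=1.0)
```

With equal admission probabilities and costs, the bundle containing only the more attractive charter is chosen more often than the bundle containing only the less attractive one, and a very large marginal application cost pushes almost all probability onto the empty bundle. Full enumeration is feasible here because the number of bundles is small; with seven schools, as in the sample, there are 128.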

4.3 Achievement

Students are tested after application and attendance decisions have been made. Po-

tential achievement at charter school k for student i in subject j is given by

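\[
Y_{ij}(k) = \alpha^0_{jk} + X_i'\alpha^x_{jc} + \alpha^\theta_{jc} \cdot \theta_i + \epsilon_{ijk}, \tag{7}
\]

while potential traditional public school achievement is

\[
Y_{ij}(0) = \alpha^0_{j0} + X_i'\alpha^x_{j0} + \alpha^\theta_{j0} \cdot \theta_i + \epsilon_{ij0}. \tag{8}
\]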

The subscript c in equation (7) denotes parameters that are constant across charter

schools. The causal effect of attending charter school k relative to traditional public

school for student i in subject j is Yij(k)− Yij(0). Observed scores for student i are the

potential scores associated with her optimal school choice: Yij = Yij(Si).

The unobserved determinants of achievement may be correlated across subjects. To

capture this possibility, I specify a multivariate normal distribution for the εijk:

\[
(\epsilon_{imk}, \epsilon_{iek})' \sim N(0, \Sigma_k), \quad k \in \{0, 1, \ldots, K\}. \tag{9}
\]

I assume that Σk is the same across charter schools, though its elements may differ

between charter schools and traditional public schools.
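A short derivation may help fix ideas about what equations (7) and (8) imply for treatment effects. Subtracting (8) from (7), and assuming the idiosyncratic errors have mean zero conditional on $X_i$ and $\theta_i$ (which follows from the exclusion restriction stated in equation (10) by iterated expectations), gives the expected gain from attending charter k. This is a sketch derived from the equations above, not an expression reproduced from elsewhere in the paper:

```latex
\begin{aligned}
E\left[ Y_{ij}(k) - Y_{ij}(0) \mid X_i, \theta_i \right]
  &= \left( \alpha^0_{jk} - \alpha^0_{j0} \right)
   + X_i' \left( \alpha^x_{jc} - \alpha^x_{j0} \right)
   + \left( \alpha^\theta_{jc} - \alpha^\theta_{j0} \right) \cdot \theta_i .
\end{aligned}
```

The sign of $\alpha^\theta_{jc} - \alpha^\theta_{j0}$ therefore determines whether students with stronger unobserved tastes for charters gain more or less than observably similar students with weaker tastes.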

4.4 Comments on Modeling Choices

Equations (5) through (9) provide a complete description of charter demand and

potential academic achievement. This section provides intuition for some of the key

modeling choices implicit in these equations.

First, the model emphasizes differences between charter and traditional public schools,

while limiting differences between charter schools. Heterogeneity in preferences and

achievement across students with different observed characteristics is governed by the

vectors γx, αxjc, and αxj0. This specification allows observed characteristics to affect the

choice of charter schools relative to traditional public schools, and to interact differently

with achievement in charter and public schools, but requires that these characteristics

affect preferences and achievement the same way at every charter. Similarly, equation (7)

implies that the relationship between the unobserved taste θi and student achievement

is the same at every charter school. Heterogeneity in preferences and achievement across

charter schools is captured by the school-specific intercepts $\gamma^0_k$ and $\alpha^0_{jk}$. These restrictions

limit the number of parameters to be estimated while also parsimoniously summarizing

heterogeneity across both students and schools. Moreover, this emphasis on differences

between charters and traditional public schools mirrors the approach to identification

described in Section 5.3, which emphasizes selection into the charter sector rather than

across charter schools.

A second notable feature of the model is that potential achievement does not enter

directly in students’ utility functions. Instead, achievement and preferences are linked

through the charter taste θi, which appears in equations (7) and (8). Appendix B shows

that this specification nests a standard Roy (1951) model of selection in which students

seek to maximize achievement and have private information about their potential scores in

charter and public schools. The model described here is more flexible than this Roy model,

in the sense that it allows students’ preferences to depend on unobserved factors besides

achievement. For example, students may be more likely to choose charter schools if they

expect to receive large achievement gains, but they may also be more likely to choose

charters if they have more motivated parents, and parental motivation may be positively

or negatively correlated with achievement gains. Equations (7) and (8) admit either

possibility by allowing for an unspecified relationship between the unobserved charter

taste θi and potential achievement. The next section formally discusses my strategy for

identifying this relationship and outlines my estimation procedure.

5 Identification and Estimation

5.1 Exclusion Restriction

Identification of the parameters of equations (7) and (8) is based on the following

exclusion restriction:

\[
E[\epsilon_{ijk} \mid X_i, Z_i, D_i, \theta_i, \psi_i, \tau_i, \xi_i] = 0, \quad k \in \{0, 1, \ldots, K\}. \tag{10}
\]

Equation (10) embeds three identifying assumptions. First, the lottery offer vector

Zi is excluded from equations (7) and (8). The exclusion of Zi requires that offers have

no direct effect on student achievement, a standard assumption in the charter lottery

literature. Second, assumption (10) requires that distance to charter schools is unrelated

to the idiosyncratic component of potential achievement, εijk. Finally, the school- and

application-specific preference shocks τi, ξi, and ψi are also taken to be unrelated to

potential achievement. I next discuss the latter two assumptions in detail and provide

suggestive evidence in support of them.

5.2 Exclusion of Distance

The model described here can be viewed as a generalized version of the Heckman

(1979) sample selection model, with multiple stages of treatment selection (application

and attendance), multiple treatments (charter schools), and multiple censored outcomes

(Yi(k), observed only when Si = k). In models of this type, excluded instruments are

necessary to identify potential outcome distributions without relying on functional form

(Heckman 1990). I use distance and lottery instruments as exogenous shifters of applica-

tion and attendance behavior in the model’s two choice stages.

The distance instrument is valid if it affects charter attendance and is uncorrelated

with unobserved determinants of achievement. The use of this instrument parallels the

use of proximity-based instruments in previous research on college and school choice (see,

e.g., Card 1993 and Booker et al. 2011). The exclusion of distance seems plausible in the

model described here, since Xi includes a rich set of student characteristics. The exclu-

sion restriction requires that distance to charter schools is effectively random conditional

on these variables. Together, the lottery and distance instruments identify the taste co-

efficients αθjc and αθj0. Appendix C demonstrates analytically how the combination of

lotteries and distance identifies these parameters in a simplified model with one charter

school.

Table 3 explores the validity of the distance instrument and compares IV estimates

based on lotteries and distance. Columns (1) and (2) report coefficients from ordinary

least squares (OLS) regressions of 4th grade test scores on distance to the closest charter

middle school, measured in miles. The estimates in the first row show that students who

live farther from charter middle schools have significantly higher 4th grade test scores,

suggesting that charter schools tend to locate in low-achieving areas of Boston. The sec-

ond row shows that adding controls for observed characteristics shrinks these imbalances

considerably and renders the coefficients statistically insignificant. This suggests that ob-

servable student characteristics capture the relationship between location and academic

achievement, and lends plausibility to the use of distance as an instrument in models that

control for these characteristics.

Columns (3) through (5) of Table 3 compare 2SLS estimates using lottery offers and

distance as instruments for charter attendance. Models using the lottery instrument

control for lottery fixed effects and limit the sample to applicants, while models using

the distance instrument control for student characteristics and include the full sample.

Column (3) shows that both instruments have strong, statistically significant first stage

effects on charter attendance: winning a lottery increases the probability of charter at-

tendance among applicants by 64 percentage points, while a one-mile increase in distance

decreases the probability of charter attendance by 2.2 percentage points. Columns (4)

and (5) show that both instruments produce positive estimates of the effects of char-

ter attendance, though the distance estimates are less precise. The lottery instrument

produces estimates of 0.49σ and 0.32σ for math and ELA tests, while the distance instru-

ment generates estimates of 0.64σ and 0.17σ. The structural estimates reported below

efficiently combine information from both sources of variation in charter attendance.

5.3 Exclusion of School-specific Preferences

The exclusion of application- and school-specific tastes from equations (7) and (8) im-

plies that selection on unobservables has a “single-index” form: the relationship between

potential achievement and unobserved preferences is driven only by the average charter

school taste θi. This allows selection on unobservables to be parsimoniously character-

ized by the coefficients αθjc and αθj0. The single-index restriction effectively requires that

students view charter schools as a homogeneous treatment, implemented at multiple sites

throughout Boston. Students may know about cross-site heterogeneity in average effects

(captured by α0jk) and about their own suitability for the charter treatment (captured

by θi), but the exclusion restriction requires that students do not make application or

attendance choices based on their own idiosyncratic treatment gains across sites.

The exclusion of school-specific preferences is more plausible if the charter treatment

is in fact homogeneous across schools. Table 4 lists responses to a survey on school

practices for the seven charter middle schools included in the sample.17 For comparison,

column (8) reports average responses for other charter middle schools in Massachusetts.

The survey results show that practices are highly uniform across Boston’s charter middle

schools, and differ markedly from schools elsewhere in the state. The seven Boston mid-

dle schools all strongly identify with the No Excuses educational approach, choosing at

least 4 on a 5-point scale measuring adherence to No Excuses. These schools report em-

phasizing the key individual components of the No Excuses model, including traditional

reading and math skills, discipline and comportment, and measurable results; with a few

exceptions, they also use similar policies and classroom practices, including commitment

contracts, uniforms, merit/demerit systems, cold-calling, and math and reading drills.

These practices are much less likely to be used by charters outside Boston. The pattern

in Table 4 shows that all of Boston’s charter schools adhere to the same distinctive ed-

ucational model, lending support to the assumption that students are unlikely to make

choices based on private information about school-specific achievement gains.

To further motivate the exclusion of ψi, τi and ξi from equations (7) and (8), Table 5

summarizes the relationship between distance to charter schools and the choice of schools

among charter applicants. In the model outlined above, the decision to apply to one

charter school over another is determined by the combination of distance, application-

specific tastes, and school-specific tastes. If application portfolio choices are dominated

by distance, then there is little scope for selection on unobserved tastes.

The results in Table 5 show that the choice of school conditional on applying is

determined mostly by distance. Forty-one percent of applicants apply to the closest

school, and these students travel an average of 1.95 miles to their chosen schools. Twenty-

17 Schools are randomly ordered to avoid divulging information about individual schools.

two percent apply to the second closest charter, traveling an average of 1.2 miles beyond

the closest school, and 25 percent choose the third closest, on average traveling 2.4 miles

further than necessary. Only 13 percent of applicants apply to the sixth or seventh closest

school. These facts show that although students are free to apply to distant schools, few

do so; conditional on choosing to apply to a charter, most students apply to one close by,

leaving little potential for matching on school-specific achievement gains.

5.4 Estimation

I estimate the parameters of the model by maximum simulated likelihood (MSL). Let

Ω denote the parameters of equations (5) through (9). The likelihood contribution of a

student with outcome variables (Ai, Zi, Si, Yim, Yie) can be written

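\[
L_i(\Omega) = \int Q_{ia(i)}(\theta, \tau) \cdot f(Z_i \mid A_i; \pi) \cdot P_{is(i)}(Z_i, \theta, \tau) \times \phi_b\left( Y_i(\theta); \Sigma_{s(i)} \right) dF(\theta, \tau \mid X_i, D_i, \Omega), \tag{11}
\]

where

\[
Y_{ij}(\theta) \equiv Y_{ij} - \alpha^0_{js(i)} - X_i'\alpha^x_{js(i)} - \alpha^\theta_{js(i)} \cdot \theta.
\]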

Here φb (y,Σ) is the bivariate normal density function with mean zero and covariance

matrix Σ evaluated at y, and Yi(θ) = (Yim(θ), Yie(θ))′. The subscript a(i) denotes the

application bundle chosen by student i, while s(i) denotes her school choice.18 To code

school choices, I assign each student to the charter school where she attended the most

days in grades five through eight; students who spent no time in charter schools are

assigned to public school.

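I evaluate the integral in equation (11) by simulation. Let $\theta^r_i$ and $\tau^r_i$ be draws of θ and τ for student i in simulation r. Define

\[
\hat{\ell}^r_i(\Omega) \equiv Q_{ia(i)}(\theta^r_i, \tau^r_i) \cdot f(Z_i \mid A_i; \pi) \cdot P_{is(i)}(Z_i, \theta^r_i, \tau^r_i) \cdot \phi_b\left( Y_i(\theta^r_i); \Sigma_{s(i)} \right).
\]

The simulated likelihood for observation i is

\[
\hat{L}_i(\Omega) = \frac{1}{R} \sum_{r=1}^{R} \hat{\ell}^r_i(\Omega),
\]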

18 Here s(i) is used to refer both to the specific school chosen by student i, as in the school-specific intercept $\alpha^0_{js(i)}$, and to the type of school chosen by student i (charter or public), as in the coefficient vector $\alpha^x_{js(i)}$.

where R is the number of draws. The MSL estimator is defined by

\[
\hat{\Omega}_{MSL} = \arg\max_{\Omega} \frac{1}{N} \sum_{i=1}^{N} \log \hat{L}_i(\Omega).
\]

If R rises faster than $\sqrt{N}$, the MSL estimator is $\sqrt{N}$-consistent and has the same asymptotic distribution as the conventional maximum likelihood estimator (Train 2003). I use 300 draws of θi and τi for each student, and maximize the simulated likelihood using the Newton-Raphson method, with the gradient of the objective function calculated analytically. The results were not sensitive to increasing the number of draws beyond around 100.
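The simulation step can be illustrated with a deliberately simplified, one-dimensional analogue of equation (11). The sketch below keeps only a univariate normal outcome density and a scalar taste θ, dropping the choice probabilities and the τ draws, so that the simulated likelihood can be checked against a closed form. All names and parameter values are hypothetical.

```python
import math
import random


def normal_pdf(x, sd):
    """Density of a mean-zero normal with standard deviation sd, evaluated at x."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def simulated_likelihood(y, alpha_theta, sigma_eps, sigma_theta, n_draws, rng):
    """Average the conditional density of y over draws of theta ~ N(0, sigma_theta^2)."""
    total = 0.0
    for _ in range(n_draws):
        theta = rng.gauss(0.0, sigma_theta)
        # Residual after removing the taste component, as in the definition of Y_ij(theta).
        total += normal_pdf(y - alpha_theta * theta, sigma_eps)
    return total / n_draws


rng = random.Random(0)  # fixed seed so the sketch is reproducible
approx = simulated_likelihood(y=0.3, alpha_theta=0.5, sigma_eps=1.0,
                              sigma_theta=1.0, n_draws=20000, rng=rng)
# In this simplified case y is normal with variance alpha^2 * sigma_theta^2 + sigma_eps^2.
exact = normal_pdf(0.3, math.sqrt(0.5 ** 2 * 1.0 ** 2 + 1.0 ** 2))
```

In the full model no such closed form exists, which is why the integral is simulated; the sketch uses many more draws than the 300 reported above only to make the comparison with the exact value tight.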

6 Estimates of the Structural Model

I next discuss MSL estimates of the key parameters of the model, which are reported

in tables 6, 7 and 8. Additional parameter estimates are listed in Appendix tables A3

through A6. I estimated the model separately for students with observed 6th, 7th, and

8th-grade test scores. To gauge the fit of the model, Appendix tables A7 and A8 show

key moments from the data, together with corresponding model-based predictions using

the MSL estimates. These predictions provide a good fit to observed choice probabilities

and achievement distributions.19

6.1 Preference Parameters

Table 6 shows MSL estimates of the parameters governing preferences for charter

schools. These estimates are for the 6th grade subsample, the largest available sample;

there were no meaningful differences in the preference estimates for the 7th and 8th grade

subsamples. Estimates of the vector γx are consistent with the demographic patterns

reported in Table 1. Subsidized lunch status, special education, and limited English

proficiency are associated with weak demand for charter schools, while black students

and students with higher baseline math and ELA scores have stronger preferences for

charters. The covariate vector is de-meaned in the estimation sample, so the intercept

19 The model slightly over-predicts the fraction of students applying to any charter school, and under-predicts the fraction of students applying to multiple charters. Model-predicted test score means and standard deviations for charter school and public school students are similar to the corresponding empirical means and standard deviations in both subjects and every grade.

(computed as the average of γ0k across schools) is the average utility of charter attendance.

The estimated intercept is negative and statistically significant, which implies that on

average, students prefer traditional public schools to charter schools even in the absence

of application and distance costs.

Column (3) of Table 6 shows average marginal effects of observed characteristics on

the probability of applying to at least one charter school. Marginal effects for discrete

variables are computed by simulating the model first with the relevant characteristic set

to zero for each student and then with it set to one, and computing the average difference

in application probabilities across these simulations. Marginal effects for continuous vari-

ables are average simulated numerical derivatives of the application probability. Poverty

has the largest effect on application behavior: Holding other variables constant, sub-

sidized lunch status reduces the probability of submitting a charter application by 7

percentage points. This effect is large relative to the mean application rate (19 percent).
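The procedure for discrete marginal effects can be sketched in a few lines. The example below is an illustration under strong simplifying assumptions: a plain logit application probability stands in for the full structural model, and the function names, coefficients, and data are all made up rather than taken from Table 6.

```python
import numpy as np

def apply_prob(X, gamma, intercept):
    """Hypothetical logit application probability for each student."""
    return 1.0 / (1.0 + np.exp(-(intercept + X @ gamma)))

def discrete_marginal_effect(X, gamma, intercept, col):
    """Average change in the application probability when column `col`
    is switched from 0 to 1 for every student, other covariates held fixed."""
    X0, X1 = X.copy(), X.copy()
    X0[:, col] = 0.0      # counterfactual: characteristic set to zero for all
    X1[:, col] = 1.0      # counterfactual: characteristic set to one for all
    return np.mean(apply_prob(X1, gamma, intercept) - apply_prob(X0, gamma, intercept))

rng = np.random.default_rng(1)
# Column 0: a binary characteristic; column 1: a continuous baseline score
X = np.column_stack([rng.integers(0, 2, 1000), rng.normal(size=1000)]).astype(float)
gamma = np.array([-0.5, 0.3])   # made-up coefficients, not estimates from the paper
ame = discrete_marginal_effect(X, gamma, intercept=-1.5, col=0)
```

Because the difference is averaged over the empirical distribution of the other covariates, the result is an average marginal effect rather than an effect evaluated at a single representative student.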

The bottom half of Table 6 reports estimates of the parameters governing preferences

for distance, application costs, and heterogeneity in tastes. Increased distance signifi-

cantly reduces the utility of charter school attendance. The marginal effect in column

(3) shows that a one-mile increase in distance to a charter school reduces the probability

of applying to that school by 0.6 percentage points.20 This effect is large relative to the

application rates at individual schools, which are around 3 percent. The application cost

γa is positive, large, and statistically significant. Its magnitude suggests that applying to

a charter school involves a utility cost equivalent to a five-mile increase in distance.

The estimates capturing unobserved heterogeneity in preferences for charter schools

are statistically significant and economically important: in utility terms, a one-standard-

deviation increase in θi is roughly equivalent to a 13-mile increase in distance to all

charter schools. The equivalent estimates for ξik, τik and ψia are smaller (6.9 miles, 0.12

miles, and 1.5 miles).21 The preference estimates therefore suggest that there is little

unobserved heterogeneity in tastes for individual schools or application bundles. This is

further evidence that a single-index selection model is appropriate.
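The distance equivalents quoted above can be read as ratios of a utility quantity to the magnitude of the distance coefficient. As a sketch, assuming utility is linear in distance with coefficient $\gamma^d < 0$, and writing $\sigma_\theta$ for the standard deviation of θi (a symbol introduced here only for illustration):

$$\text{miles equivalent of a utility change } c = \frac{c}{|\gamma^d|}, \qquad \frac{\gamma^a}{|\gamma^d|} \approx 5 \text{ miles}, \qquad \frac{\sigma_\theta}{|\gamma^d|} \approx 13 \text{ miles}.$$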

20 The reported marginal effect for distance is obtained by first computing the marginal effect of a one-mile increase in distance to each school on the probability of applying to that school, and then averaging these effects across schools.

21 The standard deviation of τik is $(\pi/\sqrt{6})\lambda_\tau$, and similarly for ξik and ψia.

6.2 Achievement Parameters

Table 7 reports estimates of the parameters of the achievement distribution for 8th

grade. I focus on 8th grade scores since charters have their largest impact on the 8th

grade score distribution, though patterns of results for 6th and 7th grade (reported in

Appendix tables A3 and A4) are similar. In each panel of Table 7, column (1) shows

estimates for charter schools, column (3) shows estimates for public schools, and column

(5) shows the difference, the causal effect of charter school attendance. Columns (2), (4),

and (6) report standard errors.

The estimates in Table 7 reveal that charters have larger effects on test scores for

more disadvantaged students. The constant term reported in column (5) implies that

on average, charter attendance raises 8th-grade math scores by 0.64σ. This estimate

is the population average treatment effect (ATE) of charter attendance.22 Subsidized

lunch students receive further benefits of 0.13σ, while black and Hispanic students also

experience larger gains. A one standard deviation improvement in baseline math scores

decreases the effect of charter attendance by 0.16σ. As shown in columns (1) and (3),

black students, Hispanics, and subsidized lunch students lag behind other students in

public school, but these characteristics are not predictive of scores in charter schools

conditional on the other included covariates. In this sense, the structural estimates

imply that charter schools close math achievement gaps between racial and socioeconomic

groups.

The last row of Panel A suggests that stronger unobserved preferences for charters

are associated with slightly smaller math gains. Column (3) shows that students with

strong charter preferences do better in public schools, and the estimates in column (1)

suggest that this relationship is weaker in charter schools, resulting in a negative inter-

action between charter preferences and achievement effects. Column (5) implies that a

one standard deviation increase in charter school tastes decreases the effect of charter

attendance by 0.05σ, and this estimate is marginally significant.

The pattern of estimates for 8th-grade ELA, displayed in panel B of Table 7, is broadly

similar to the corresponding results for math. Charter schools have a substantial average

treatment effect on ELA scores (0.63σ), with significantly larger effects for Hispanics,

subsidized lunch students, and students with low baseline math scores. As in math,

22 The mean potential outcome for charter schools is an average of the school-specific intercepts $\alpha^0_{mk}$.

race and subsidized lunch status are not predictive of ELA scores in charter schools.

Furthermore, the negative relationship between unobserved preferences and gains is even

more evident in ELA: a one standard deviation increase in tastes for charter schools

reduces the effect of charter attendance by 0.083σ. Unlike the math estimate, this is

driven partially by lower scores in the treated state among students with higher θi.

6.3 School Effects

Table 8 reports estimates of the model’s school-specific parameters, including the average utilities $\gamma^0_k$, the admission probabilities $\pi_k$, and 8th-grade test score effects ($\alpha^0_{jk} - \alpha^0_{j0}$).

School-specific test score effects for 6th and 7th grade are listed in Appendix Table A6.

The utility estimates in column (1) of Table 8 show that some charters are more popular

than others, but all of the estimates are negative, indicating that on average students

prefer traditional public school to attending any charter. The admission probabilities,

which are averages across applicant cohorts, range from 0.37 to 0.82. The achievement

effects reported in Table 8 suggest that the effects of Boston’s charters are not driven by

any particular school; all seven charter middle schools boost achievement in both sub-

jects. Interestingly, the two most effective schools, School 6 and School 7, are the least

popular, as measured by the average utility parameters in column (1).

6.4 The Relationship Between Preferences and Achievement Gains

Taken together, the structural estimates reported in tables 6 and 7 reveal an im-

portant pattern: charter students tend to be negatively selected with respect to gains

from charter attendance. Students with subsidized lunch status and those with low base-

line math scores have weaker preferences for charter schools, but get larger achievement

gains. Those with weaker unobserved tastes for charter schools also experience larger

gains. Black students are an exception to this pattern, as they have stronger-than-average

preferences for charters and receive larger-than-average gains from charter attendance.

To summarize the relationship between tastes and achievement effects, define the

preference index

$$P_i \equiv X_i'\gamma^x + \theta_i.$$

Pi indexes student i’s preference for charter schools relative to public school as a function

of observed characteristics and unobserved tastes. Figure 4 summarizes the relationship

between this index and the causal effect of charter attendance. Specifically, this figure

shows the function

$$\beta_j(p) \equiv E\left[\frac{1}{K} \sum_{k=1}^{K} \left(Y_{ij}(k) - Y_{ij}(0)\right) \,\middle|\, P_i = p\right].$$

βj(p) is the average effect of charter attendance in subject j for students with preference

Pi = p. Figure 4 plots this conditional expectation, estimated in simulated data using

local linear regressions with triangle kernels and bandwidths of 0.5 standard deviations.

(Pi is normalized to have mean zero and standard deviation one for the figure.)
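A local linear regression with a triangle kernel can be sketched as follows. This is a generic illustration rather than the code behind Figure 4: the function name `local_linear` and the simulated inputs are assumptions, and the made-up data simply impose a downward-sloping relationship so that the output resembles the pattern described in the text.

```python
import numpy as np

def local_linear(x, y, grid, bandwidth):
    """Local linear regression of y on x with a triangle kernel.

    Returns the fitted conditional mean at each point in `grid`
    (NaN where fewer than two observations receive positive weight).
    """
    fitted = np.full(len(grid), np.nan)
    for g, p in enumerate(grid):
        u = (x - p) / bandwidth
        w = np.clip(1.0 - np.abs(u), 0.0, None)    # triangle kernel weights
        if np.count_nonzero(w) < 2:
            continue                                # too few points to fit a line
        # Weighted least squares of y on (1, x - p); the intercept is the fit at p
        design = np.column_stack([np.ones_like(x), x - p])
        sw = np.sqrt(w)
        coef, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
        fitted[g] = coef[0]
    return fitted

rng = np.random.default_rng(2)
P = rng.normal(size=5000)                                   # standardized preference index
gain = 0.6 - 0.1 * P + rng.normal(scale=0.5, size=5000)     # made-up treatment effects
grid = np.linspace(-1.5, 1.5, 7)
beta_hat = local_linear(P, gain, grid, bandwidth=0.5)
```

Centering the regressor at each grid point means the estimated intercept is the fitted value there, which avoids a separate prediction step.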

For both math and ELA and in all grades, βj(p) is downward sloping, reflecting

the fact that students with stronger preferences for charter schools benefit less from

charter attendance. The vertical dashed line indicates the average taste for charter schools

among lottery applicants. Since applicants have stronger-than-average preferences, they

experience smaller-than-average gains. This explains why the 2SLS estimates of LATE

in Table 2 are smaller than the structural estimates of ATE in tables 7, A3 and A4. This

pattern is somewhat surprising; one might have expected the students with the largest

potential benefits to be the most likely to seek out charter schools. Instead, the findings

reported here suggest that disadvantaged students struggle the most in traditional public

schools, but are unlikely to investigate the charter alternative.

7 Counterfactual Simulations

7.1 Charter Supply and Market Equilibrium

The model estimated in this paper allows for a parsimonious description of charter

demand that can be used to make out-of-sample predictions about student choices and

achievement distributions in counterfactual environments. I next use the MSL estimates

to investigate the effects of charter school expansion in a series of counterfactual simu-

lations. These simulations require a model of the supply side, as well as a definition of

market equilibrium.

7.1.1 Supply

The supply side of the charter market is defined by a set of charter schools, with each school characterized by a location, an average utility $\gamma^0_k$, math and ELA achievement parameters $\alpha^0_{mk}$ and $\alpha^0_{ek}$, and a seating capacity $\Lambda_k$. I assume that the charter sector exhibits constant returns to scale. Specifically, I treat the vectors $(\gamma^0_k, \alpha^0_{mk}, \alpha^0_{ek}, \Lambda_k)'$ as independent and identically distributed draws from a fixed distribution $F(\gamma^0, \alpha^0_m, \alpha^0_e, \Lambda)$. Each new school is assigned a draw from the estimated joint distribution of school characteristics in Table 8.23

There are several reasons that the constant-returns-to-scale assumption may fail to

hold in practice. If teachers, principals, or other inputs are supplied inelastically, it may be

difficult for new charters to replicate the production technology used on existing campuses

(Wilson 2008). Public schools may also respond to charter competition (Imberman 2011).

In addition, part of the impact of charter schools may be due to peer effects, and these

effects may shrink in expansions that draw in less positively-selected students. Together,

these factors seem likely to reduce the efficacy of charter schools at larger scales. My

simulation results should therefore be viewed as upper bounds on the possible effects

of charter expansion. The constant-returns-to-scale assumption allows me to describe

demand-driven limits on the effects of charter expansion in the best-case scenario for

charter supply.

Simulating charter expansions also requires that I specify where new schools are located. To choose locations for Boston’s planned expansion, I use the addresses of the new

campuses opened through 2013. To manage the computational complexity of the model,

I assign locations in larger expansions randomly instead of modeling them as strategic

choices. Charter addresses are chosen at random from a grid of half-mile by half-mile

blocks covering Boston; if a drawn block does not already contain a charter school, I

add the next school in this location. Locations may be non-random in practice, and

other models of location choice might produce slightly different predictions of the effects

of charter expansion.24 As shown in Section 7.3, however, my estimates imply that the

23 I set capacities for existing schools equal to their average enrollment in years when they were oversubscribed. To simulate Boston’s planned expansion, I assign utility and test score effects using the characteristics of the linked existing schools (see Appendix Table A1). KIPP Academy Boston is not linked to an existing Boston charter, so it is assigned a draw from F.

24 Anecdotally, charter schools in Boston usually locate in vacant buildings, such as empty churches (Roy 2010).

market for charter schools will be saturated by the time 30 schools have been added: at

this point, almost all students live in close proximity to a charter offering guaranteed

admission. For expansions at this scale, other models of location choice are therefore

likely to produce results similar to those reported below. The yellow stars in Appendix Figure A1 show the locations used for the counterfactual simulations.
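The random location rule described above amounts to rejection sampling on a grid. The sketch below is one way to implement it, not the paper's code: the grid dimensions, the block coordinates of existing schools, and the function name `draw_locations` are hypothetical.

```python
import numpy as np

def draw_locations(n_new, occupied, n_rows, n_cols, rng):
    """Draw grid blocks uniformly at random, skipping blocks that already
    hold a charter school, until `n_new` new locations are assigned."""
    taken = set(occupied)
    if n_new > n_rows * n_cols - len(taken):
        raise ValueError("not enough empty blocks for the requested expansion")
    new = []
    while len(new) < n_new:
        block = (int(rng.integers(n_rows)), int(rng.integers(n_cols)))
        if block not in taken:      # redraw if the block is already occupied
            taken.add(block)
            new.append(block)
    return new

rng = np.random.default_rng(3)
existing = [(2, 3), (5, 5), (7, 1)]   # hypothetical blocks of existing schools
added = draw_locations(17, existing, n_rows=20, n_cols=20, rng=rng)
```

Each element of `added` is a (row, column) block index; converting blocks to addresses or coordinates would require the actual grid geometry, which is not reproduced here.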

7.1.2 Equilibrium

Charter admission probabilities adjust endogenously to equate the demand for charter

enrollment among admitted students with the supply of charter seats. I assume that each

charter school sets its admission probability to fill its capacity, or (if demand is too weak)

come as close as possible to doing so. Equilibrium admission probabilities are determined

in a Subgame Perfect Nash Equilibrium in which students correctly anticipate πk and

apply optimally, and schools optimally choose πk given students’ application decisions

and (correctly anticipated) admission probabilities at other schools. Appendix D formally

outlines the structure of the game, derives schools’ best response functions, shows that

an equilibrium exists, and gives conditions under which the equilibrium is unique.

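The results in Appendix D show that when students apply anticipating the admission probability vector $\pi^e \equiv (\pi^e_1, \ldots, \pi^e_K)'$ and schools other than k choose the vector $\pi_{-k}$, the best response admission probability for school k is given by

$$\pi_k^{BR}(\pi^e, \pi_{-k}) = \min\{\Gamma_k(\pi^e, \pi_{-k}),\, 1\},$$

where

$$\Gamma_k(\pi^e, \pi_{-k}) = \frac{\Lambda_k}{E\left[\sum_{a: a_k = 1} \sum_{z: z_k = 1} Q_{ia}(\theta_i, \tau_i; \pi^e)\, f_{-k}(z_{-k} \mid a_{-k}; \pi_{-k})\, P_{ik}(z, \theta_i, \tau_i)\right]}.$$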

Here $z_{-k}$, $a_{-k}$ and $f_{-k}$ omit the k-th elements of z, a and f, and the application probability $Q_{ia}$ now explicitly depends on $\pi^e$. I numerically solve for a fixed point of the best response vector $\pi^{BR}(\pi) \equiv (\pi_1^{BR}(\pi, \pi_{-1}), \ldots, \pi_K^{BR}(\pi, \pi_{-K}))'$ in each simulation. I never found more

than one equilibrium in any counterfactual. The equilibrium probabilities for a subset of

the simulations are reported in Appendix Table A9.
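One simple way to search for such a fixed point is damped iteration on the best response map. The sketch below illustrates the idea under heavy simplification: the demand function is a made-up reduced form rather than the model's expected enrollment demand, it does not distinguish anticipated from chosen probabilities, and the names `solve_admission_probs` and `toy_demand` are hypothetical. The paper does not state which numerical method was used.

```python
import numpy as np

def solve_admission_probs(demand, capacity, tol=1e-10, max_iter=10_000, damp=0.5):
    """Damped fixed-point iteration on pi = min(capacity / demand(pi), 1)."""
    pi = np.ones_like(capacity, dtype=float)
    for _ in range(max_iter):
        best_response = np.minimum(capacity / demand(pi), 1.0)
        new_pi = damp * best_response + (1.0 - damp) * pi
        if np.max(np.abs(new_pi - pi)) < tol:
            return new_pi
        pi = new_pi
    raise RuntimeError("fixed-point iteration did not converge")

# Toy demand: enrollment demand among admitted students rises with the
# anticipated admission probability (purely illustrative functional form).
n_students = 1000.0
share = np.array([0.10, 0.05, 0.02])

def toy_demand(pi):
    return n_students * share * (0.5 + 0.5 * pi)

capacity = np.array([30.0, 40.0, 50.0])
pi_star = solve_admission_probs(toy_demand, capacity)
```

In this toy example the third school has more seats than interested students, so its admission probability stays at one, while the other two schools ration seats with probabilities below one.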

7.2 Simulation Results

I use the model to investigate the effects of changing Boston’s charter school network

on the distribution of middle school test scores for a representative cohort of middle school

entrants. The model is simulated 100 times for each observation in every counterfactual,

and the results are used to compute average choice probabilities and achievement scores.

I begin with a look at the effects of closing Boston’s charter schools. I then simulate the

effects of Boston’s planned expansion, which adds six new charter middle schools to the

existing set of seven. This is followed by an analysis of progressively larger expansions that

add schools one by one until the number of charter schools reaches 30.25 Finally, I simulate

the effects of forcing all Boston middle school students to attend charter schools. While

unrealistic, this last scenario allows me to compare the population average treatment

effect (ATE) to other treatment parameters and put upper bounds on the possible effects

of charter expansion.

Figure 5 summarizes the counterfactual simulations. The outcomes of interest are

school choices, charter oversubscription, and test scores. In each panel, a vertical black

line indicates the existing number of charter schools, and a red line indicates the size of

Boston’s planned charter expansion. Panel A of Figure 5 shows how charter applications,

attendance, admission probabilities, and filled capacity change as the number of charter

schools rises. Panel B shows effects on average math and ELA scores, and panel C shows

effects on white-black achievement gaps. Tables 9 and 10 show numerical results for

choice behavior and 8th-grade achievement in a subset of the simulations. Table 10 also

reports the effect of treatment on the treated (TOT) for each simulation, which indicates

how charter effectiveness varies across counterfactuals.26

The simulations imply that charter schools have had a significant impact on the

distribution of test scores in Boston. This can be seen in the second row of Table 10,

which shows the simulated effects of closing all charter schools. Without charter schools,

the gap in average test scores between Boston and the rest of Massachusetts would widen

by 5 percent in math and 1 percent in ELA, and the citywide white-black achievement

gap would increase by 3 percent in both subjects.

25 The size of the application choice set becomes unmanageable for expansions at this scale, since the number of possible application portfolios grows exponentially with the number of schools. I therefore limit students to two applications in the counterfactual simulations. This restriction has little effect on the results since very few students (1.5 percent) apply to more than two schools in the data, and the model predicts that fewer students will submit multiple applications at larger scales since the incentive to insure against a lottery loss is reduced as offer probabilities rise.

26 The reported TOTs are given by $E[\beta_j(P_i) \mid S_i \neq 0]$. Note that $\beta_j(P_i)$ is an average effect across the distribution of charter schools. This definition isolates treatment effect variation driven by student composition, rather than allowing effects to vary according to which charter parameters are randomly drawn from F.

The next row of Table 10 shows that the recently authorized charter expansion is

predicted to further raise average math and ELA scores by 3 percent and 1 percent, and

reduce achievement gaps by 2 to 3 percent. As shown in Figure 5, opening additional

charters is predicted to boost average test scores further and continue to reduce the white-

black gap in both subjects and all grades. Furthermore, columns (3) and (6) of Table

10 imply that with constant returns to scale on the supply side, the efficacy of charter

schools is likely to increase as the charter sector expands. The TOTs associated with

Boston’s planned expansion are 1 percent and 7 percent larger than the TOT for current

applicants in math and ELA, and the TOTs for the 30-school expansion are 6 percent and

24 percent larger than the current TOT. This reflects the pattern of selection discussed

in Section 6: at the margin, charter expansion draws in students with weaker tastes for

charter schools, who receive larger achievement gains.

However, the simulation results also imply that demand for charter schools in Boston

is limited, especially among the students with the largest potential achievement gains.

Panel A of Figure 5 shows that charter expansion is predicted to reduce oversubscription:

admission probabilities rise quickly with the number of schools, and the share of charter

seats left unfilled also increases. In the 30-school expansion, only eight schools are over-

subscribed, so almost all students live in close proximity to a charter offering guaranteed

admission (see Appendix Table A9). Nevertheless, only about half of students apply to

a charter, and more than 40 percent of charter seats are unfilled. Moreover, the ATE

remains well above the TOT in all simulations, suggesting that many of the students

for whom charters are most effective choose to remain in public schools. This finding is

driven by the large application cost and negative average utilities reported in tables 6

and 8, together with comparatively weak preferences for charters among disadvantaged

and low-achieving students.

7.3 Breaking the Link Between Preferences and Achievement Gains

In a final set of simulations, I ask how the efficacy of charter expansion would change

if the pattern of selection into charter schools were altered. Specifically, I simulate coun-

terfactuals in which charter preferences are random, and counterfactuals in which the

observed pattern of selection is reversed. The results describe how charter effectiveness

might be expected to change in expansions that successfully target either average students

or high-benefit students.

For this purpose, I redefine charter preferences as

$$U_{ik} = \gamma^0_k + \gamma^d \cdot D_{ik} + v_{ik} - c_i(A_i) + \tilde{P}_i. \tag{12}$$

Setting $\tilde{P}_i = P_i$ yields the utility function in equation (5) and produces the pattern of negative selection on gains observed in the data. I compare this pattern to results from a “random preference” model in which $\tilde{P}_i$ is a random draw from the distribution of $P_i$. In addition, I simulate a “positive selection” model in which $\tilde{P}_i = 2E[P_i] - P_i$. These

alternative models keep the distribution of charter preferences constant in the population,

while either rendering preferences unrelated to achievement gains or creating a positive

relationship between preferences and gains.
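The two alternative preference models can be sketched as simple transformations of the preference index. The code below is illustrative only: it assumes the random draw is implemented by reshuffling the simulated index across students, which is one possible reading of the text, and the data and the function name `alternative_preferences` are made up.

```python
import numpy as np

def alternative_preferences(P, rng):
    """Return three versions of the preference index: observed
    (negative selection), random, and positive selection."""
    observed = P.copy()
    random_pref = rng.permutation(P)    # reshuffle: same values, unrelated to gains
    positive = 2.0 * P.mean() - P       # reflect around the mean: reverses the ordering
    return observed, random_pref, positive

rng = np.random.default_rng(4)
P = rng.normal(size=2000)
# Made-up gains that fall with the preference index (negative selection)
gain = 0.6 - 0.1 * P + rng.normal(scale=0.2, size=2000)

obs, rand, pos = alternative_preferences(P, rng)
corrs = [np.corrcoef(v, gain)[0, 1] for v in (obs, rand, pos)]
```

The reshuffle preserves the empirical distribution exactly. The reflection preserves the mean and variance, and it preserves the full distribution only when that distribution is symmetric around its mean. By construction, the correlation between the reflected index and gains is the negative of the original correlation.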

Table 11 shows simulated effects of charter expansion on 8th-grade test scores using

the alternative preference models. The results show that charter expansion would be

substantially more effective if charter applicants were randomly or positively selected

with respect to achievement gains. Simulated math TOTs are 0.11σ to 0.14σ larger for

the random preference model than for the negative selection model, and these differences

grow to between 0.23σ and 0.29σ for the positive selection model. As a result, charter

expansion increases average math scores in Boston by up to an additional 0.04σ in the

positive selection model, an effect equal to 10 percent of the current value of the gap

between Boston and the rest of the state.

The increase in charter efficacy for the alternative preference models is larger in ELA,

since the degree of selection is more extreme for ELA scores. TOTs for the random

preference and positive selection models in ELA are roughly 0.3σ and 0.6σ larger than

corresponding TOTs for the negative selection model. Charter expansion therefore raises

Boston’s average ELA score by up to an additional 0.05σ and 0.1σ in the two alternative

preference models. The latter estimate is more than 20 percent of the gap between

Boston and the Massachusetts average. In other words, charter expansion would close

a much larger share of the gap between Boston and the rest of the state if applicants

were positively selected. Interestingly, the effects of charter expansion on the racial

achievement gap are much less sensitive to the preference model used. This is due to the

fact that black students are an exception to the negative selection pattern documented

above: for black students, both charter preferences and achievement gains are above

the average. On the whole, the results in Table 11 show that charter expansion would

produce much larger effects for randomly- or positively-selected students than for the

students who are most likely to apply. This suggests that expansions may be especially

effective when paired with outreach policies targeting groups with low application rates.

8 Conclusion

Estimates based on admission lotteries show that Boston’s charter middle schools

have substantial positive effects on test scores and close racial achievement gaps among

their applicants. At the same time, the implications of these findings for charter school

expansion are unclear; applicants are a small, non-random subset of the student popula-

tion, so such gains may be atypical. This paper develops a structural model of charter

applications, school choice, and academic achievement that links the demand for charter

schools to achievement gains from charter attendance. To identify the parameters of the

model, I combine two sets of instruments based on random lotteries and proximity to

charter schools. I then use the model to predict out-of-sample effects for non-applicants

and use my estimates to simulate the effects of charter expansion in an equilibrium school

choice model.

Estimates of the structural model reveal that tastes for charter schools are inversely

related to achievement gains. Specifically, low-achievers, poor students, and those with

weak unobserved tastes for charters gain the most from charter attendance, but are

unlikely to apply. Consistent with this finding, counterfactual simulations show that the

average effect of charter schools is increasing in the size of the charter sector, as larger

expansions draw in students with weaker preferences who receive larger gains. At the

same time, demand for charters among the highest-benefit students is weak, so the effects

of charter expansion may be limited by a lack of demand even in the best-case scenario

for charter supply. Simulations using alternative preference models show that the effects

of charter expansion would be much larger if high-benefit students could be induced to

apply to charter schools.

This pattern is surprising – the canonical Roy (1951) selection model predicts that

students with larger potential gains will be more likely to apply. However, the “reverse

Roy” pattern described here is consistent with results from other contexts, such as the

long-term care insurance market described by Finkelstein and McGarry (2006). Finkel-

stein and McGarry note that the decision to purchase long-term care insurance is driven

by both risk aversion and health risk; since more risk-averse people also tend to have lower

health risks, they find that those who purchase more insurance are not higher risk on

average. More generally, in settings where participation decisions are driven by multiple

heterogeneous factors, selection on one dimension can lead to apparent negative selection

on gains on another dimension. In the charter school context, application decisions are

driven by socioeconomic status and baseline achievement, which are negatively correlated

with achievement gains from charter attendance. These findings are also consistent with

a growing body of evidence suggesting that low-income students are unlikely to choose

high-quality schools across many settings (Butler et al., 2013; Brand and Xie, 2010; Dillon

and Smith, 2013; Hastings et al., 2009; Hoxby and Avery, 2012).

These findings raise the further question of whether parents who forgo large potential

achievement gains are truly uninterested in achievement, or simply unaware of differences

in effectiveness across schools. The model estimated in this paper does not distinguish

between these two possibilities. If the lack of demand for charter schools among dis-

advantaged students reflects a lack of parental information, the demand for charters

may shift in the long run as parents become more informed. In related work, Hastings

and Weinstein (2008) show that providing test score information leads parents to choose

higher-performing schools, which suggests that informational frictions may play a role.

Changes in recruitment practices may also change the pattern of selection into charter

schools; the recent legislation authorizing charter expansion in Massachusetts requires schools to make efforts to recruit applicants who are demographically similar to students

in the local district. In future work, I plan to use data from Boston’s expansion to vali-

date the model estimated here, and study changes in the demand for charter schools as

the city’s charter network expands.

Figure 1: Math Score Distributions for Black and White Compliers

[Figure: kernel density plots, not recoverable from the extracted text. Panel p-values for distributional equality, in extraction order: 0.320, 0.845; 0.005, 0.070; 0.000, 0.000; 0.115, 0.760.]

Notes: This figure plots kernel densities of math test scores for black and white lottery compliers. Black vertical lines show mean scores for untreated compliers, while red vertical lines in the treated plots show means for treated compliers. The treated densities are estimated from 2SLS regressions of a kernel smoother interacted with a charter dummy on the charter dummy and lottery fixed effects. The charter dummy is instrumented with an indicator for winning the lottery. All densities use a triangle kernel with a bandwidth of one standard deviation. Untreated densities are estimated using analogous regressions that replace the charter dummy with a non-charter dummy. P-values are from bootstrap Kolmogorov-Smirnov tests of distributional equality.

Figure 2: ELA Score Distributions for Black and White Compliers

[Figure: kernel density plots, not recoverable from the extracted text. Panel p-values for distributional equality, in extraction order: 0.395, 0.535; 0.010, 0.005; 0.015, 0.075; 0.010, 0.855.]

Notes: This figure plots kernel densities of ELA test scores for black and white lottery compliers. The treated densities are estimated from 2SLS regressions of a kernel smoother interacted with a charter dummy on the charter dummy and lottery fixed effects. The charter dummy is instrumented with an indicator for winning the lottery. All densities use a triangle kernel with a bandwidth of one standard deviation. Untreated densities are estimated using analogous regressions that replace the charter dummy with a non-charter dummy. P-values are from bootstrap Kolmogorov-Smirnov tests of distributional equality.

Figure 3: Sequence of Events

[Timeline with stages marked 0 through 4. Events, in the order extracted:]
Students learn θi, τik, ψia
Students learn ξik
Students choose Aik
Schools offer Zik = 1 with probability πk · Aik
Students choose Si
Students earn test scores Yij

Figure 4: Relationship Between Charter Tastes and Achievement Effects

Notes: This figure describes the relationship between tastes for charter school attendance and the achievement impact of charter schools. Panel A plots the average causal effect of charter school attendance at each value of charter tastes for math. Panel B plots corresponding results for ELA. Charter effects are averages of school-specific effects. The vertical dashed lines indicate mean preferences among applicants. The plots come from local linear regressions using a triangle kernel and a bandwidth of 0.5 standard deviations, estimated in a simulated data set the same size as the actual data. The simulated data are produced by simulating the model using the MSL estimates and the empirical distribution of observed student characteristics.

A. Math effects B. ELA effects


Figure 5: Simulated Effects of Charter School Expansion

[Panel A. Choice behavior: applications and attendance; oversubscription]

[Panel B. Average citywide test scores: math; ELA]

[Panel C. Citywide achievement gaps: math; ELA]

Notes: This figure displays simulated effects of charter school expansion. The black dashed line in each panel corresponds to the existing number of charter schools, while the red dashed line corresponds to Boston's planned expansion. Simulated statistics are produced using 100 simulations per observation in the data set.


Table 1: Descriptive Statistics

                                     All students       Charter applicants
                                         Mean          Mean     Offer differential
                                         (1)           (2)            (3)

Panel A. Charter status
Applied to charter school               0.190         1.000            -
Received charter offer                  0.125         0.616          1.000
Attended charter school                 0.100         0.495      0.639*** (0.018)

Panel B. Baseline characteristics
Female                                  0.483         0.484       0.010 (0.024)
Black                                   0.449         0.512       0.027 (0.023)
Hispanic                                0.401         0.328      -0.028 (0.025)
Subsidized lunch                        0.817         0.734      -0.003 (0.021)
Special education                       0.225         0.176       0.011 (0.020)
Limited English proficiency             0.242         0.190       0.025 (0.019)
Miles to closest charter school         2.117         1.925      -0.042 (0.054)
4th grade math score                   -0.538        -0.368      -0.040 (0.049)
4th grade ELA score                    -0.642        -0.444       0.003 (0.050)

Joint p-value                             -             -            0.221
N                                       13473                2564

* significant at 10%; ** significant at 5%; *** significant at 1%

Notes: This table shows descriptive statistics for students attending 4th grade at traditional public schools in Boston between 2006 and 2010. The sample excludes students with missing middle school test scores. Column (1) shows means for the full sample, while column (2) shows means for charter applicants. Column (3) shows coefficients from regressions of characteristics on a charter offer indicator. These regressions control for lottery fixed effects. Robust standard errors are in parentheses. The p-value is from a test of the hypothesis that offer status is unrelated to all baseline characteristics.


Table 2: Lottery-based Estimates of Charter Effects

                                 First stage         Math effect         ELA effect
Race                                 (1)                 (2)                 (3)
All                            0.637*** (0.020)    0.492*** (0.072)    0.321*** (0.074)
  N                                 5465
White                          0.611*** (0.043)    0.187 (0.139)       -0.013 (0.144)
  N                                  898
Black                          0.650*** (0.023)    0.595*** (0.096)    0.384*** (0.098)
  N                                 2828
Hispanic                       0.672*** (0.028)    0.537*** (0.105)    0.457*** (0.108)
  N                                 1739
p-value for racial equality           -                0.044               0.025

* significant at 10%; ** significant at 5%; *** significant at 1%

Notes: This table reports 2SLS estimates of the effects of attendance at Boston charter schools on test scores for lottery applicants. The sample stacks test scores in grades 6 through 8. The endogenous variable is a dummy for attending any charter school after the lottery and prior to the test. The instrument is a dummy for receiving a lottery offer from any charter school. Column (1) reports coefficients from regressions of charter attendance on the offer variable. Columns (2) and (3) report second stage estimates for math and ELA scores. Race-specific estimates come from models that interact charter attendance with race dummies and use race*offer interactions as instruments. All models control for lottery fixed effects. P-values are from Wald tests of the hypothesis that the 2SLS coefficients are equal across races. Standard errors are robust to heteroskedasticity and are clustered at the student level.


Table 3: The Distance Instrument

Balance check: 4th grade scores
                                  Math                ELA
Controls                          (1)                 (2)                 N
None                        0.044*** (0.008)    0.052*** (0.008)        13473
Baseline characteristics    0.003 (0.005)       0.008 (0.005)           13473

2SLS comparison
                               First stage            Math                ELA
Instrument                        (3)                 (4)                 (5)                 N
Lottery                     0.637*** (0.020)    0.492*** (0.072)    0.321*** (0.074)         5465
Distance                    -0.022*** (0.002)   0.637*** (0.208)    0.173 (0.203)           31085

* significant at 10%; ** significant at 5%; *** significant at 1%

Notes: Columns (1) and (2) show regressions of 4th-grade test scores on miles to the closest charter middle school. The first row includes no controls, while the second controls for student characteristics, including sex, race, free lunch status, special education status, limited English proficiency, and 4th grade score in the other subject. Columns (3) through (5) show 2SLS results for middle school test scores using the lottery and distance instruments. The lottery estimates are reproduced from Table 2. The distance models control for demographics and 4th grade test scores. Standard errors are clustered at the student level.


Table 4: School Practices

                                          School 1  School 2  School 3  School 4  School 5  School 6  School 7  Other MA
Practice                                    (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)

Instruction time
  Days per year                             190       190       190       180       185       193       190       185
  Length of school day (hours:minutes)      8:25      7:00      8:30      7:56      9:00      7:33      7:14      7:17

School philosophy (5 pt. scale)
  No Excuses                                 4         4         4         5         5         5         5        2.76
  Emphasize traditional reading and math     5         5         5         5         5         5         4        3.86
  Emphasize discipline/comportment           5         5         5         5         5         5         5        3.33
  Emphasize measurable results               5         5         5         5         5         5         5        3.62

School practices (1 or 0 for yes/no)
  Parent and student contracts               1         1         1         0         1         1         1        0.67
  Uniforms                                   1         1         1         1         1         1         1        0.74
  Merit/demerit system                       1         1         1         1         0         1         1        0.30

Classroom techniques (5 pt. scale)
  Cold calling                               3         5         5         5         5         3         5        2.48
  Math drills                                2         4         5         5         5         5         5        3.33
  Reading aloud                              4         5         5         4         4         5         4        3.14

Notes: This table shows school practices at Boston charter middle schools, measured from a survey of school administrators. Columns (1)-(7) show practices for the 7 schools used to estimate the structural model, while column (8) shows an average for other charter middle schools in Massachusetts.


Table 5: Charter School Choices Among Applicants

Fraction of applicants      Fraction    Mean distance    Extra distance
choosing:                     (1)            (2)              (3)
Closest charter              0.411           1.95             0.00
2nd closest                  0.277           3.02             1.20
3rd closest                  0.251           4.20             2.42
4th closest                  0.164           5.14             3.11
5th closest                  0.194           7.01             5.14
6th closest                  0.099           8.35             6.30
7th closest                  0.034          12.39            10.53

Notes: This table shows the fractions of applicants in the structural sample applying to charter schools by distance. Column (1) shows fractions of applicants making each choice. Column (2) shows mean distance among students who applied. Column (3) shows extra distance relative to the closest charter school. Students who applied to multiple charters are used in the calculations for each of their choices.


Table 6: Maximum Simulated Likelihood Estimates of Utility Parameters

                                                              Estimate    Standard error    Marginal effect
Parameter   Description                                         (1)            (2)               (3)
γ0          Mean utility                                     -1.487***        0.113               -
γx          Female                                           -0.050           0.072            -0.004
            Black                                             0.427***        0.119             0.030
            Hispanic                                          0.113           0.122             0.006
            Subsidized lunch                                 -0.801***        0.098            -0.070
            Special education                                -0.350***        0.100            -0.025
            Limited English proficiency                      -0.075           0.095            -0.004
            Baseline math score                               0.187***        0.054             0.015
            Baseline ELA score                                0.135**         0.054             0.018
γd          Distance                                         -0.187***        0.008            -0.006
γa          Application cost                                  0.942***        0.032               -
σθ          Standard deviation of charter school tastes       2.422***        0.109               -
λτ          Scale of school-specific tastes                   0.017           0.042               -
λψ          Scale of application-specific tastes              0.221***        0.009               -
N           Sample size                                       13473

* significant at 10%; ** significant at 5%; *** significant at 1%

Notes: This table reports maximum simulated likelihood estimates of the parameters of the structural school choice model. Column (1) reports parameter estimates, while column (2) reports standard errors. The constant is the average of school-specific mean utilities, evaluated at the sample mean of the covariate vector X. Column (3) reports average marginal effects of observed characteristics on the probability of applying to at least one charter school. The marginal effect for distance is the effect of a one-mile increase in distance to a school on the probability of applying to that school, averaged across schools. The sample includes all students with observed 6th-grade test scores. The likelihood is evaluated using 300 simulations per observation. Marginal effects for discrete variables are differences between average simulated application probabilities with the relevant characteristic set to 1 and 0 for all observations. Marginal effects for continuous variables are average simulated numerical derivatives of the application probability. Marginal effects are evaluated using 100 simulations per observation.


Table 7: Maximum Simulated Likelihood Estimates of 8th-grade Achievement Parameters

                                                            Charter school       Traditional public school      Charter effect
                                                         Estimate   Std. error     Estimate   Std. error      Estimate   Std. error
Parameter   Description                                    (1)         (2)           (3)         (4)            (5)         (6)

Panel A. Math
α0m         Mean potential outcome                        0.216**     0.105       -0.424***     0.010         0.640***     0.105
αxm         Female                                       -0.052       0.046       -0.061***     0.017         0.008        0.049
            Black                                        -0.031       0.074       -0.198***     0.028         0.167**      0.079
            Hispanic                                      0.069       0.079       -0.092***     0.029         0.160*       0.084
            Subsidized lunch                              0.029       0.058       -0.105***     0.026         0.134**      0.064
            Special education                            -0.359***    0.061       -0.409***     0.021         0.051        0.064
            Limited English proficiency                  -0.029       0.071        0.068***     0.022        -0.096        0.075
            Baseline math score                           0.338***    0.034        0.495***     0.012        -0.157***     0.036
            Baseline ELA score                            0.056       0.034        0.058***     0.011        -0.002        0.036
αθm·σθ      Taste for charter schools (std. dev. units)   0.022       0.028        0.075***     0.011        -0.053*       0.030

Panel B. ELA
α0e         Mean potential outcome                        0.151       0.117       -0.482***     0.011         0.633***     0.118
αxe         Female                                        0.097*      0.052        0.177***     0.018        -0.080        0.055
            Black                                         0.024       0.079       -0.094***     0.031         0.118        0.085
            Hispanic                                      0.122       0.085       -0.046        0.032         0.169*       0.090
            Subsidized lunch                              0.025       0.068       -0.115***     0.028         0.140*       0.074
            Special education                            -0.361***    0.063       -0.428***     0.021         0.067        0.066
            Limited English proficiency                  -0.001       0.079        0.018        0.023        -0.019        0.082
            Baseline math score                           0.094**     0.037        0.179***     0.012        -0.084**      0.039
            Baseline ELA score                            0.294***    0.038        0.346***     0.011        -0.052        0.040
αθe·σθ      Taste for charter schools (std. dev. units)  -0.065**     0.031        0.018        0.012        -0.083**      0.033

N           Sample size                                   7279

* significant at 10%; ** significant at 5%; *** significant at 1%

Notes: This table reports maximum simulated likelihood estimates of the parameters of the 8th-grade achievement distribution. Panel A shows estimates for math, while Panel B shows estimates for ELA. The likelihood is evaluated using 300 simulations per observation. Mean potential outcomes are evaluated at the sample mean of the covariate vector X. The mean potential outcome for charter schools is an average of school-specific means. The sample includes all students with observed 8th-grade test scores.


Table 8: Maximum Simulated Likelihood Estimates of School-specific Parameters

                                                                  Test score effects (8th grade)
                      Average utility    Admission probability         Math                ELA
School                      (1)                  (2)                   (3)                 (4)
Charter school 1     -1.145*** (0.118)     0.523*** (0.060)      0.492*** (0.116)    0.577*** (0.125)
Charter school 2     -0.853*** (0.116)     0.370*** (0.050)      0.471*** (0.101)    0.540*** (0.118)
Charter school 3     -1.438*** (0.120)     0.542*** (0.036)      0.543*** (0.123)    0.510*** (0.137)
Charter school 4     -1.684*** (0.115)     0.574*** (0.042)      0.771*** (0.119)    0.618*** (0.139)
Charter school 5     -0.701*** (0.113)     0.346*** (0.053)      0.551*** (0.115)    0.344*** (0.126)
Charter school 6     -2.505*** (0.126)     0.823*** (0.049)      0.682*** (0.123)    0.834*** (0.149)
Charter school 7     -2.087*** (0.144)     0.769*** (0.043)      0.968*** (0.171)    1.007*** (0.174)

* significant at 10%; ** significant at 5%; *** significant at 1%

Notes: This table reports maximum simulated likelihood estimates of the school-specific parameters from the structural model. The likelihood is evaluated using 300 simulations per observation. Test score effects are evaluated at the mean of the covariate vector X. The admission probabilities in column (2) are averages for 2006-2010. Average utilities and test score effects are computed at the population mean.


Table 9: Simulated Effects of Policy Changes -- Choice Behavior

                                                      Fraction     Fraction     Avg. admission    Fraction of
                                                      applying     attending     probability      seats filled
Policy change                                           (1)          (2)            (3)              (4)
None (7 charter schools)                               0.231        0.094          0.682            1.000

Boston's planned expansion (expand to 13 schools)      0.347        0.142          0.846            0.852
                                                      (50.2%)      (51.7%)        (24.0%)         (-14.8%)

Expand to 30 schools                                   0.537        0.233          0.981            0.562
                                                     (132.4%)     (148.4%)        (43.8%)         (-43.8%)

Notes: This table reports simulated effects of changing Boston's charter school network on charter application and attendance behavior. Numbers in parentheses are percentage changes relative to the existing policy environment. Simulated statistics are produced using 100 simulations per observation in the data set.


Table 10: Simulated Effects of Policy Changes -- 8th-grade Test Scores

                                                                   Math                                    ELA
                                                      Boston     White-black              Boston     White-black
                                                      average        gap         TOT      average        gap         TOT
Policy change                                           (1)          (2)         (3)        (4)          (5)         (6)
None (7 charter schools)                              -0.400        0.632       0.497     -0.475        0.504       0.291

All charter schools close                             -0.420        0.653         -       -0.481        0.518         -
                                                      (-5.0%)      (3.3%)                 (-1.3%)      (2.7%)

Boston's planned expansion (expand to 13 schools)     -0.387        0.622       0.503     -0.470        0.487       0.311
                                                      (3.3%)      (-1.6%)      (1.1%)     (1.1%)      (-3.4%)      (6.9%)

Expand to 30 schools                                  -0.362        0.592       0.527     -0.463        0.469       0.361
                                                      (9.7%)      (-6.4%)      (6.0%)     (2.5%)      (-7.0%)     (24.1%)

All students forced to attend charter schools          0.216        0.333       0.640      0.149        0.262       0.633
                                                     (154.0%)     (-47.4%)     (28.7%)   (131.3%)     (-48.0%)    (117.8%)

Notes: This table reports simulated effects of modifying Boston's charter school network on the citywide 8th-grade test score distribution. Numbers in parentheses are percentage changes relative to the existing policy environment. Simulated statistics are produced using 100 simulations per observation in the data set.


Table 11: Simulation Results for Alternative Preference Models -- 8th-grade Test Scores

                                                                       Math                                   ELA
                                                           Boston    White-black             Boston    White-black
                                                           average       gap        TOT      average       gap        TOT
Counterfactual        Preference model                       (1)         (2)        (3)        (4)         (5)        (6)
No charter schools    -                                    -0.420       0.653        -       -0.481       0.518        -

7 charter schools     Negative selection (MSL estimates)   -0.400       0.632      0.497     -0.475       0.504      0.291
                      Random preferences                   -0.392       0.642      0.640     -0.454       0.512      0.633
                      Positive selection                   -0.384       0.647      0.788     -0.428       0.511      0.985

13 charter schools    Negative selection (MSL estimates)   -0.387       0.622      0.503     -0.470       0.487      0.311
                      Random preferences                   -0.374       0.634      0.640     -0.442       0.503      0.633
                      Positive selection                   -0.360       0.642      0.772     -0.405       0.501      0.957

30 charter schools    Negative selection (MSL estimates)   -0.362       0.592      0.527     -0.463       0.469      0.361
                      Random preferences                   -0.344       0.602      0.640     -0.417       0.483      0.633
                      Positive selection                   -0.322       0.610      0.757     -0.362       0.474      0.911

Notes: This table reports simulated effects of modifying Boston's charter school network on the citywide 8th-grade test score distribution. Simulated statistics are produced using 100 simulations per observation in the data set. The "negative selection" preference model uses the MSL estimates. The "random preferences" model draws preferences randomly from the distribution implied by the MSL estimates. The "positive selection" model reverses the pattern of selection implied by the MSL estimates while keeping the mean and variance of charter preferences constant.


References

[1] Abadie, A. (2002). “Bootstrap Tests for Distributional Treatment Effects in Instrumental Variable Models.” Journal of the American Statistical Association 97(457).

[2] Abadie, A. (2003). “Semiparametric Instrumental Variable Estimation of Treatment Response Models.” Journal of Econometrics 113(2).

[3] Abdulkadiroglu, A., Angrist, J., Dynarski, S., Kane, T., and Pathak, P. (2011). “Accountability and Flexibility in Public Schools: Evidence from Boston’s Charters and Pilots.” The Quarterly Journal of Economics 126(2).

[4] Ajayi (2011). “A Welfare Analysis of School Choice Reforms in Ghana.” Mimeo, Boston College.

[5] Angrist, J., Cohodes, S., Dynarski, S., Pathak, P., and Walters, C. (2013a). “Stand and Deliver: Effects of Boston’s Charter High Schools on College Preparation, Entry, and Choice.” NBER Working Paper no. 19275.

[6] Angrist, J., Pathak, P., and Walters, C. (2013b). “Explaining Charter School Effectiveness.” American Economic Journal: Applied Economics, forthcoming.

[7] Angrist, J., Dynarski, S., Kane, T., Pathak, P., and Walters, C. (2012). “Who Benefits from KIPP?” Journal of Policy Analysis and Management 31(4).

[8] Angrist, J., and Fernandez-Val, I. (2011). “ExtrapoLATE-ing: External Validity and Overidentification in the LATE Framework.” NBER Working Paper no. 16566.

[9] Angrist, J., and Imbens, G. (1995). “Two-Stage Least Squares Estimation of Average Causal Effects in Models with Variable Treatment Intensity.” Journal of the American Statistical Association 90(430).

[10] Attanasio, O., Meghir, C., and Santiago, A. (2011). “Education Choices in Mexico: Using a Structural Model and a Randomized Experiment to Evaluate PROGRESA.” Review of Economic Studies 79.

[11] Berry, S., Levinsohn, J., and Pakes, A. (1995). “Automobile Prices in Market Equilibrium.” Econometrica 63(4).

[12] Booker, K., Sass, T., Gill, B., and Zimmer, R. (2011). “The Effects of Charter High Schools on Educational Attainment.” Journal of Labor Economics 29(2).

[13] Brand, J., and Xie, Y. (2010). “Who Benefits Most from College? Evidence for Negative Selection in Heterogeneous Economic Returns to Higher Education.” American Sociological Review 75(2).

[14] Butler, J., Carr, D., Toma, E., and Zimmer, R. (2013). “Choice in a World of New School Types.” Journal of Policy Analysis and Management, forthcoming.

[15] Card, D. (1993). “Using Geographic Variation in College Proximity to Estimate the Return to Schooling.” NBER Working Paper no. 4483.

[16] Card, D., and Hyslop, R. (2005). “Estimating the Effects of a Time-Limited Earnings Subsidy for Welfare-Leavers.” Econometrica 73(6).


[17] Carneiro, P., Heckman, J., and Vytlacil, E. (2010). “Estimating Marginal Returns to Education.” IZA Discussion Paper no. 5275.

[18] Chade, H., and Smith, L. (2006). “Simultaneous Search.” Econometrica 74(5).

[19] Commonwealth of Massachusetts (2010). “An Act Relative to the Achievement Gap: Turning Around Low-Performing Schools and Promoting Innovation for All.” http://www.mass.gov/Eoedu/docs/legislation policy/20100125 ed law fact sheet.pdf.

[20] Dillon, E., and Smith, J. (2013). “The Determinants of Mismatch Between Students and Colleges.” NBER Working Paper no. 19286.

[21] Dobbie, W., and Fryer, R. (2011). “Are High Quality Schools Enough to Increase Achievement Among the Poor? Evidence from the Harlem Children’s Zone.” American Economic Journal: Applied Economics 3(3).

[22] Dobbie, W., and Fryer, R. (2013). “Getting Beneath the Veil of Effective Schools: Evidence from New York City.” American Economic Journal: Applied Economics, forthcoming.

[23] Duflo, E., Hanna, R., and Ryan, S. (2012). “Incentives Work: Getting Teachers to Come to School.” American Economic Review 102(4).

[24] Epple, D., and Romano, R. (1998). “Competition between Private and Public Schools, Vouchers, and Peer-Group Effects.” The American Economic Review 88(1).

[25] Epple, D., Romano, R., and Sieg, H. (2003). “Peer Effects, Financial Aid and Selection of Students into Colleges and Universities: An Empirical Analysis.” Journal of Applied Econometrics 18(5).

[26] Epple, D., Romano, R., and Sieg, H. (2006). “Admission, Tuition, and Financial Aid Policies in the Market for Higher Education.” Econometrica 74(4).

[27] Epple, D., Romano, R., Sarpca, S., and Sieg, H. (2013). “The US Market for Higher Education: A General Equilibrium Analysis of State and Private Colleges and Public Funding Policies.” NBER Working Paper no. 19298.

[28] Ferreyra, M., and Kosenok, G. (2011). “Charter School Entry and School Choice: The Case of Washington, D.C.” Mimeo, Carnegie Mellon University.

[29] Finkelstein and McGarry (2006). “Multiple Dimensions of Private Information: Evidence from the Long-Term Care Insurance Market.” American Economic Review 96(4).

[30] Fryer, R. (2010). “Racial Inequality in the 21st Century: The Declining Significance of Discrimination.” NBER Working Paper no. 16256.

[31] Fryer, R. (2011). “Creating ‘No Excuses’ (Traditional) Public Schools: Preliminary Evidence from an Experiment in Houston.” NBER Working Paper no. 17494.

[32] Fryer, R., and Curto, V. (2011). “Estimating the Returns to Urban Boarding Schools: Evidence from SEED.” NBER Working Paper no. 16746.


[33] Fryer, R., and Levitt, S. (2004). “Understanding the Black-White Test Score Gap in the First Two Years of School.” Review of Economics and Statistics 86(2).

[34] Fryer, R., and Levitt, S. (2006). “The Black-White Test Score Gap through Third Grade.” American Law and Economics Review 8(2).

[35] Gleason, P., Clark, M., Tuttle, C., and Dwoyer, E. (2010). “The Evaluation of Charter School Impacts: Final Report.” Washington, D.C.: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education.

[36] Hastings, J., Kane, T., and Staiger, D. (2009). “Heterogeneous Preferences and the Efficacy of Public School Choice.” Mimeo, Yale University.

[37] Hastings, J., and Weinstein, J. (2008). “Information, School Choice, and Academic Achievement: Evidence from Two Experiments.” The Quarterly Journal of Economics 123(4).

[38] Hausman, J., and Wise, D. (1978). “A Conditional Probit Model for Qualitative Choice: Discrete Decisions Recognizing Interdependence and Heterogeneous Preferences.” Econometrica 46(2).

[39] Heckman, J. (1979). “Sample Selection Bias as a Specification Error.” Econometrica 47(1).

[40] Heckman, J. (1990). “Varieties of Selection Bias.” The American Economic Review 80(2).

[41] Heckman, J. (2010). “Building Bridges Between Structural and Program Evaluation Approaches to Evaluating Policy.” Journal of Economic Literature 48(2).

[42] Heckman, J. (2011). “The American Family in Black and White: A Post-Racial Strategy for Improving Skills to Promote Equality.” IZA Discussion Paper no. 5495.

[43] Heckman, J., and Vytlacil, E. (2001). “Policy-Relevant Treatment Effects.” The American Economic Review 91(2).

[44] Heckman, J., and Vytlacil, E. (2005). “Structural Equations, Treatment Effects, and Econometric Policy Evaluation.” Econometrica 73(3).

[45] Hoxby, C., and Avery, C. (2012). “The Missing ‘One-Offs:’ The Hidden Supply of High-Achieving, Low-Income Students.” NBER Working Paper no. 18586.

[46] Hoxby, C., and Murarka, S. (2009). “Charter Schools in New York City: Who Enrolls and How They Affect Their Students’ Achievement.” NBER Working Paper no. 14852.

[47] Hoxby, C., and Rockoff, J. (2004). “The Impact of Charter Schools on Student Achievement.” Harvard Institute of Economic Research Working Paper Series.

[48] Hotz, V., Imbens, G., and Mortimer, J. (2005). “Predicting the Efficacy of Future Training Programs Using Past Experiences at Other Locations.” Journal of Econometrics 125(1).


[49] Imbens, G., and Angrist, J. (1994). “Identification and Estimation of Local Average Treatment Effects.” Econometrica 62(2).

[50] Imbens, G., and Rubin, D. (1997). “Estimating Outcome Distributions for Compliers in Instrumental Variables Models.” The Review of Economic Studies 64(4).

[51] Imberman, S. (2011). “The Effect of Charter Schools on Achievement and Behavior of Public School Students.” Journal of Public Economics 95(7-8).

[52] Lise, J., Seitz, S., and Smith, J. (2004). “Equilibrium Policy Experiments and the Evaluation of Social Programs.” NBER Working Paper no. 10283.

[53] Massachusetts Department of Elementary and Secondary Education (2011). “Understanding Charter School Tuition Reimbursements.” http://www.doe.mass.edu/charter/finance/tuition/Reimbursements.html.

[54] Massachusetts Department of Elementary and Secondary Education (2012a). “Application for a Massachusetts Horace Mann Public Charter School.” http://www.doe.mass.edu/charter/app/HM full.pdf

[55] Massachusetts Department of Elementary and Secondary Education (2012b). “Charter School News Archives.” http://www.doe.mass.edu/charter/default.html?section=archive.

[56] Massachusetts Department of Elementary and Secondary Education (2012c). “Opening Procedures Handbook: A Guide for Boards of Trustees and Leaders of New Charter Schools.” http://www.doe.mass.edu/charter/guides/ophandbook.pdf.

[57] Mehta, N. (2011). “Competition in Public School Districts: Charter School Entry, Student Sorting, and School Input Determination.” Mimeo, University of Pennsylvania.

[58] Neal, D. (2006). “Why has Black-White Skill Convergence Stopped?” In: Hanushek, E. and Welch, F., eds., Handbook of the Economics of Education, Volume 1. Amsterdam: Elsevier.

[59] Nevo, A. (2000). “A Practitioner’s Guide to Estimation of Random-Coefficients Logit Models of Demand.” Journal of Economics and Management Strategy 9(4).

[60] Pathak, P., and Sonmez, T. (2008). “Leveling the Playing Field: Sincere and Sophisticated Players in the Boston Mechanism.” American Economic Review 98(4).

[61] Ravitch, D. (2010). The Death and Life of the Great American School System: How Testing and Choice are Undermining Education. New York: Basic Books.

[62] Reardon, S. (2011). “The Widening Academic Achievement Gap Between the Rich and the Poor: New Evidence and Possible Explanations.” In: Duncan, G., and Murnane, R., eds., Whither Opportunity? New York: Russell Sage Foundation.

[63] Rothstein, R. (2004). Class and Schools: Using Social, Economic, and Educational Reform to Close the Black-White Achievement Gap. Washington, DC: Economic Policy Institute.


[64] Roy, A. (1951). “Some Thoughts on the Distribution of Earnings.” Oxford Economic Papers 3(2).

[65] Roy, A. (2010). “Public Charter School Sets Sights on Vacant South End Church.” South End Patch, November 10.

[66] Todd, P., and Wolpin, K. (2006). “Using Experimental Data to Validate a Dynamic Behavioral Model of Child Schooling and Fertility: Assessing the Impact of a School Subsidy Program in Mexico.” American Economic Review 96(5).

[67] Train, K. (2003). Discrete Choice Methods with Simulation. New York: Cambridge University Press.

[68] Vanneman, A., Hamilton, L., Anderson, J., and Rahman, T. (2009). “Achievement Gaps: How Black and White Students in Public Schools Perform in Mathematics and Reading on the National Assessment of Educational Progress.” US Department of Education, NCES 2009-455.

[69] Levitz, J. (2013). “New Front in Charter Schools.” The Wall Street Journal, March 10.

[70] Wald, A. (1940). “The Fitting of Straight Lines if Both Variables are Subject to Error.” The Annals of Mathematical Statistics 11(3).

[71] Wilson, S. (2008). “Success at Scale in Charter Schooling.” American Enterprise Institute Future of American Education Project, Working Paper no. 2008-02.


Appendix A: Complier Densities

This appendix shows that 2SLS estimation of equation (4) produces consistent estimates of potential outcome densities for lottery compliers. This result is an extension of the methods developed in Abadie (2002). Abadie’s Lemma 2.1 implies that the 2SLS estimate of $\gamma(y)$ is a consistent estimator of the expectation of $K_h(y - Y_i(1))$ for compliers as $N \to \infty$. I show that as $h \to 0$ and $Nh \to \infty$, this estimator converges in probability to the complier density function. Imbens and Rubin (1997) outline an alternative method for estimating complier densities based on linear combinations of empirical densities for the four possible combinations of $Z_i$ and $S_i$. The approach taken here allows these densities to be estimated in a simple one-step IV procedure.

Let $f^c_s(y)$, $f^{at}_s(y)$, and $f^{nt}_s(y)$ be the density functions of $Y_i(s)$ for compliers, always takers, and never takers, respectively, with $s \in \{0, 1\}$. Define $K_h(u) \equiv \frac{1}{h}K\left(\frac{u}{h}\right)$, where $K(\cdot)$ is a function that satisfies $\int K(u)\,du = 1$, $\int uK(u)\,du = 0$, $\int u^2K(u)\,du < \infty$, and $\int K^2(u)\,du < \infty$. Consider the equation

$$K_h(y - Y_i) \cdot 1\{S_i = s\} = \alpha_s(y) + \gamma_s(y) \cdot 1\{S_i = s\} + \varepsilon_{isy}$$

for $s \in \{0, 1\}$. If $Z_i$ is used as an instrument for $1\{S_i = s\}$ in this equation, the resulting IV estimator is

$$\hat{\gamma}_s(y) \equiv \frac{E_N[K_h(y - Y_i)\,1\{S_i = s\} \mid Z_i = 1] - E_N[K_h(y - Y_i)\,1\{S_i = s\} \mid Z_i = 0]}{E_N[1\{S_i = s\} \mid Z_i = 1] - E_N[1\{S_i = s\} \mid Z_i = 0]} \tag{13}$$

where $E_N[\cdot]$ is the empirical expectation operator. The following theorem shows that this estimator is consistent for $f^c_s(y)$.

Theorem: Suppose that assumptions A1-A3 hold, and that the density functions $f^c_s(y)$, $f^{at}_1(y)$, and $f^{nt}_0(y)$ exist and are twice differentiable at $y$. Then

$$\operatorname*{plim}_{h \to 0,\, Nh \to \infty} \hat{\gamma}_s(y) = f^c_s(y).$$

Proof: I demonstrate the result for $s = 1$. The proof for $s = 0$ is analogous. I begin by considering the expectation and variance of each term in the numerator of (13). Define

$$\hat{E}_z(y) = E_N[K_h(y - Y_i) \cdot 1\{S_i = 1\} \mid Z_i = z].$$


$\hat{E}_z(y)$ is a sample average, so it is unbiased for the corresponding population moment. For $z = 1$, we have

$$\begin{aligned}
E[\hat{E}_1(y)] &= E[K_h(y - Y_i) \cdot 1\{S_i = 1\} \mid Z_i = 1] \\
&= E[K_h(y - Y_i(1)) \mid S_i(1) > S_i(0)] \cdot \Pr[S_i(1) > S_i(0)] \\
&\quad + E[K_h(y - Y_i(1)) \mid S_i(1) = S_i(0) = 1] \cdot \Pr[S_i(1) = S_i(0) = 1],
\end{aligned}$$

which can be written

$$E[\hat{E}_1(y)] = \int K_h(y - t) \cdot \left(\phi_c f^c_1(t) + \phi_{at} f^{at}_1(t)\right) dt$$

where $\phi_c$ and $\phi_{at}$ are the fractions of compliers and always takers, respectively. Using the change of variables $u = \frac{y - t}{h}$ and taking a second-order Taylor expansion around $h = 0$ yields

$$E[\hat{E}_1(y)] = \phi_c f^c_1(y) + \phi_{at} f^{at}_1(y) + \frac{h^2}{2} \cdot \left(\phi_c f^{c\prime\prime}_1(y) + \phi_{at} f^{at\prime\prime}_1(y)\right) \cdot \int K(u)u^2\,du + o(h^2)$$

which implies

$$\lim_{h \to 0} E[\hat{E}_1(y)] = \phi_c f^c_1(y) + \phi_{at} f^{at}_1(y).$$

A similar argument shows that

$$\lim_{h \to 0} E[\hat{E}_0(y)] = \phi_{at} f^{at}_1(y).$$

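Next, consider the variance of $\hat{E}_1(y)$. We have

$$Var\left(\hat{E}_1(y)\right) = \frac{1}{N_1} E[K_h^2(y - Y_i) \cdot 1\{S_i = 1\} \mid Z_i = 1] - \frac{1}{N_1} E[\hat{E}_1(y)]^2$$

where $N_1$ is the number of observations with $Z_i = 1$. The argument above shows that $E[\hat{E}_1(y)]$ is bounded as $h \to 0$, so as $N_1 \to \infty$ (which is implied by $N \to \infty$ together with (A2)) the second term is negligible. The first term is

$$\begin{aligned}
\frac{1}{N_1} E[K_h^2(y - Y_i) \cdot 1\{S_i = 1\} \mid Z_i = 1] &= \frac{1}{N_1} \int K_h^2(y - t) \cdot \left(\phi_c f^c_1(t) + \phi_{at} f^{at}_1(t)\right) dt \\
&= \frac{1}{N_1 h} \cdot \left(\phi_c f^c_1(y) + \phi_{at} f^{at}_1(y)\right) \cdot \int K^2(u)\,du + o\left(\frac{1}{Nh}\right).
\end{aligned}$$

Therefore, we have

$$\lim_{h \to 0,\, Nh \to \infty} Var\left(\hat{E}_1(y)\right) = 0.$$

A similar calculation shows that the variance of $\hat{E}_0(y)$ also converges to zero.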


The arguments so far imply that as $h \to 0$ and $Nh \to \infty$, $\hat{E}_1(y)$ and $\hat{E}_0(y)$ converge in mean square, and therefore in probability, to $\left(\phi_c f^c_1(y) + \phi_{at} f^{at}_1(y)\right)$ and $\phi_{at} f^{at}_1(y)$, respectively. When $s = 1$, the probability limit of the denominator of (13) as $N \to \infty$ is $\phi_c$. Then by the continuous mapping theorem we have

$$\operatorname*{plim}_{h \to 0,\, Nh \to \infty} \hat{\gamma}_1(y) = \frac{\phi_c f^c_1(y) + \phi_{at} f^{at}_1(y) - \phi_{at} f^{at}_1(y)}{\phi_c} = f^c_1(y).$$

This completes the proof.
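As a numerical illustration of the estimator in (13), the following Python sketch computes the ratio of offer-group differences with a triangle kernel on simulated data. It is a minimal sketch under stated assumptions, not the code used in the paper: the function names, the population shares (60 percent compliers, 20 percent always takers, 20 percent never takers), the normal outcome distributions, and the bandwidth are all hypothetical choices made for the example.

```python
import random

def triangle_kernel(u):
    """Triangle kernel K(u) on [-1, 1]: integrates to one, symmetric around zero."""
    return max(0.0, 1.0 - abs(u))

def complier_density(y, Y, S, Z, h, s=1):
    """IV (Wald) estimate of the complier density of Y(s) at the point y, as in (13)."""
    # Per offer status z: [sum of K_h * 1{S = s}, count of S = s, group size]
    sums = {0: [0.0, 0.0, 0], 1: [0.0, 0.0, 0]}
    for yi, si, zi in zip(Y, S, Z):
        match = 1.0 if si == s else 0.0
        sums[zi][0] += triangle_kernel((y - yi) / h) / h * match
        sums[zi][1] += match
        sums[zi][2] += 1
    numerator = sums[1][0] / sums[1][2] - sums[0][0] / sums[0][2]
    denominator = sums[1][1] / sums[1][2] - sums[0][1] / sums[0][2]
    return numerator / denominator

def simulate(n, seed=0):
    """Hypothetical population: 60% compliers, 20% always takers, 20% never takers."""
    rng = random.Random(seed)
    Y, S, Z = [], [], []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0   # randomly assigned lottery offer
        r = rng.random()
        if r < 0.6:                          # complier: attends only when offered
            s = z
            mean = 0.5 if s == 1 else 0.0
        elif r < 0.8:                        # always taker
            s, mean = 1, 1.5
        else:                                # never taker
            s, mean = 0, -1.0
        Y.append(rng.gauss(mean, 1.0))
        S.append(s)
        Z.append(z)
    return Y, S, Z

Y, S, Z = simulate(100_000)
# In this simulation the true treated complier density at 0.5 is 1/sqrt(2*pi), about 0.399.
print(round(complier_density(0.5, Y, S, Z, h=0.3, s=1), 3))
```

With a large simulated sample and a small bandwidth, the estimate should be close to the true complier density even though always takers contribute to the treated group, which is the content of the theorem.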


Appendix B: Relationship to Roy Model

This appendix shows that equations (5) through (9) nest a Roy model of selection in which students seek to maximize achievement and have private information about their test scores in charter and public schools. For simplicity, I omit application costs and preferences for distance, and focus on scores in a single subject. Achievement for student $i$ at charter school $k$ is given by

$$Y_i(k) = \alpha^0_k + X_i'\alpha^x_c + \eta_{ic} + \nu_{ik},$$

while public school achievement is

$$Y_i(0) = \alpha^0_0 + X_i'\alpha^x_0 + \eta_{i0} + \nu_{i0}$$

where $E[\nu_{ik} \mid X_i, \eta_{ic}, \eta_{i0}] = 0$. Assume that students know the parameters of these equations, their own characteristics $X_i$, and private signals of their achievement in charter and public schools $\eta_{ic}$ and $\eta_{i0}$. Also assume that $(\eta_{ic}, \eta_{i0})'$ follows a bivariate normal distribution with $E[\eta_{i\ell} \mid X_i] = 0$ and $Var(\eta_{i\ell}) = \sigma^2_\ell$ for $\ell \in \{c, 0\}$, and $Cov(\eta_{ic}, \eta_{i0}) = \sigma_{c0}$. The $\nu_{ik}$ represent random fluctuations in test scores unknown to the student.

Suppose that students choose schools to maximize expected achievement. Then student utility can be written

$$u_{ik} = \alpha^0_k + X_i'\alpha^x_c + \eta_{ic}$$

$$u_{i0} = \alpha^0_0 + X_i'\alpha^x_0 + \eta_{i0}.$$

Subtracting $u_{i0}$ from $u_{ik}$, student preferences can be equivalently represented by the utility functions

$$U_{ik} = \gamma^0_k + X_i'\gamma^x + \theta_i,$$

where

$$\gamma^0_k = \alpha^0_k - \alpha^0_0$$

$$\gamma^x = \alpha^x_c - \alpha^x_0$$

$$\theta_i = \eta_{ic} - \eta_{i0}$$

and $U_{i0} \equiv 0$. These preferences are a special case of equation (5) with $\gamma^d = \gamma^a = 0$ and $Var(v_{ik}) = Var(\psi_{ia}) = 0$.


Returning to the test score equation, we have

E[Yi(k)|Xi, θi] = α0k +X ′iα

xc + αθc · θi

E[Yi(0)|Xi, θi] = α00 +X ′iα

x0 + αθ0 · θi,

where

αθc =σ2c − σc0

σ2c + σ2

0 − 2σc0

αθ0 =σc0 − σ2

0

σ2c + σ2

0 − 2σcp.

This implies that potential test scores are given by

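$$\begin{aligned}
Y_i(k) &= \alpha^0_k + X_i'\alpha^x_c + \alpha^\theta_c \cdot \theta_i + \varepsilon_{ik} \\
Y_i(0) &= \alpha^0_0 + X_i'\alpha^x_0 + \alpha^\theta_0 \cdot \theta_i + \varepsilon_{i0},
\end{aligned}$$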

where E[εik|Xi, θi] = 0, which is the specification for achievement in equations (7) and

(8).

Finally, note that the Roy framework implies that αθc > 0, αθ0 < 0, and αθc − αθ0 = 1.

If students choose schools to maximize academic achievement, then charter preferences

will be positively related to scores in charter schools, negatively related to scores in

public schools, and the causal effect of charter attendance will be increasing in charter

preferences.
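As a quick numerical illustration of the restriction above, the following minimal sketch computes the two projection coefficients from the variance parameters and confirms that they differ by exactly one. The function name and the parameter values are hypothetical, chosen only for illustration; the sign pattern in this example arises because the assumed covariance is smaller than both variances.

```python
def roy_selection_coefficients(var_c, var_0, cov_c0):
    """Coefficients from projecting eta_c and eta_0 on theta = eta_c - eta_0.

    var_c, var_0: variances of the charter and public school signals.
    cov_c0: covariance between the two signals.
    """
    var_theta = var_c + var_0 - 2.0 * cov_c0  # Var(eta_c - eta_0)
    if var_theta <= 0:
        raise ValueError("theta must have positive variance")
    alpha_theta_c = (var_c - cov_c0) / var_theta
    alpha_theta_0 = (cov_c0 - var_0) / var_theta
    return alpha_theta_c, alpha_theta_0


# Hypothetical parameter values, not estimates from the paper.
alpha_theta_c, alpha_theta_0 = roy_selection_coefficients(
    var_c=1.0, var_0=0.5, cov_c0=0.2
)
difference = alpha_theta_c - alpha_theta_0  # equals 1 up to rounding
```

The difference equals one for any admissible variance parameters, because the two numerators sum to the common denominator.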


Appendix C: Identification of Preference Coefficients

This appendix uses a simplified version of the structural model to show analytically

that the combination of lottery and distance instruments identifies the coefficients on the

charter preference θi in equations (7) and (8). Suppose there is a single charter school,

and the utilities of charter and public school attendance are given by

$$\begin{aligned}
U_{i1} &= \gamma^0 + \gamma^d \cdot D_i + \theta_i + v_i - \gamma^a \cdot A_i \\
U_{i0} &= -\gamma^a \cdot A_i,
\end{aligned}$$

where $D_i$ is distance to the charter school, $A_i$ indicates charter application, $\theta_i \sim N(0, \sigma^2_\theta)$ is observed prior to the application decision, and $v_i \sim N(0, 1)$ is observed after the application decision.[27] The charter school holds a lottery for applicants with acceptance probability $\pi$.

The expected utility of applying to the charter school is

$$\pi \cdot E[\max\{\gamma^0 + \gamma^d \cdot D_i + \theta_i + v_i,\; 0\} \mid \theta_i] - \gamma^a,$$

while not applying yields utility of zero with certainty. It is optimal to apply if

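$$\psi(\gamma^0 + \gamma^d \cdot D_i + \theta_i) > \frac{\gamma^a}{\pi},$$

where $\psi(t) \equiv E[\max\{t + v_i, 0\}] = \Phi(t) \cdot t + \phi(t)$. Since $\psi'(t) = \Phi(t) > 0$, $\psi(\cdot)$ is strictly increasing,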

so the application rule can be written

$$A_i = 1\{\theta_i > \theta^*(D_i)\},$$

where

$$\theta^*(D) = \psi^{-1}\left(\frac{\gamma^a}{\pi}\right) - \gamma^0 - \gamma^d \cdot D.$$

Note that with $\gamma^d < 0$, we have $d\theta^*/dD > 0$: students who live farther from the charter school

must have stronger tastes for charter attendance to justify incurring the application cost.
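The application threshold can be computed numerically. The sketch below takes ψ(t) to be E[max{t + v, 0}] for a standard normal v, which equals t·Φ(t) + φ(t), and inverts it by bisection. All function names and parameter values are hypothetical and serve only to illustrate that the threshold rises with distance when the distance coefficient is negative.

```python
import math


def psi(t):
    """E[max(t + v, 0)] for v ~ N(0, 1), i.e. t * Phi(t) + phi(t)."""
    cdf = 0.5 * math.erfc(-t / math.sqrt(2.0))
    pdf = math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)
    return t * cdf + pdf


def psi_inverse(target, lo=-10.0, hi=10.0, tol=1e-12):
    """Invert psi by bisection; valid because psi is strictly increasing."""
    if not (psi(lo) < target < psi(hi)):
        raise ValueError("target is outside the bracketing interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if psi(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def application_threshold(distance, gamma_0, gamma_d, gamma_a, pi):
    """theta*(D) = psi^{-1}(gamma_a / pi) - gamma_0 - gamma_d * D."""
    return psi_inverse(gamma_a / pi) - gamma_0 - gamma_d * distance


# Hypothetical parameter values, not estimates from the paper.
params = dict(gamma_0=-0.5, gamma_d=-0.1, gamma_a=0.3, pi=0.6)
threshold_near = application_threshold(1.0, **params)
threshold_far = application_threshold(5.0, **params)
```

With these assumed values the two thresholds differ by −γd · (5 − 1) = 0.4, so the more distant student needs a stronger taste for charters before applying.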

Let Si(z) indicate charter attendance as a function of the lottery offer Zi = z. Rejected applicants cannot

attend, so Si(0) = 0 ∀i. Attendance for admitted applicants is given by

$$S_i(1) = 1\{\gamma^0 + \gamma^d \cdot D_i + \theta_i + v_i > 0\}.$$

[27] I use a normal distribution rather than an extreme value distribution for vi because it allows me to obtain analytic formulas in the calculations to follow.


Lottery applicant compliers choose to apply and have Si(1) = 1. Compliers are therefore

characterized by

$$(A_i = 1) \cap (S_i(1) > S_i(0)) \iff \theta_i > \max\{\theta^*(D_i),\; -\gamma^0 - \gamma^d \cdot D_i - v_i\}.$$

The model for potential outcomes in charter and public school is

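$$\begin{aligned}
Y_i(1) &= \alpha^0_1 + \alpha^\theta_1 \cdot \theta_i + \varepsilon_{i1} \\
Y_i(0) &= \alpha^0_0 + \alpha^\theta_0 \cdot \theta_i + \varepsilon_{i0}
\end{aligned}$$

with $E[\varepsilon_{i\ell}|\theta_i, D_i] = 0$ for $\ell \in \{0, 1\}$. It is straightforward to show that average potential outcomes for compliers who live a distance $D$ from charter schools are given by

$$E[Y_i(\ell)|A_i = 1, S_i(1) > S_i(0), D_i = D] = \alpha^0_\ell + \alpha^\theta_\ell \cdot \mu^c_\theta(D),$$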

where

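$$\mu^c_\theta(D) = \sigma_\theta \cdot \Phi\left(\psi^{-1}\left(\frac{\gamma^a}{\pi}\right)\right) \cdot \lambda\left(\frac{\theta^*(D)}{\sigma_\theta}\right) + \sigma_\theta \cdot \left(1 - \Phi\left(\psi^{-1}\left(\frac{\gamma^a}{\pi}\right)\right)\right) \cdot \int \lambda\left(\frac{-\gamma^0 - \gamma^d \cdot D - v_i}{\sigma_\theta}\right) dF\left(v_i \,\middle|\, v_i < -\psi^{-1}\left(\frac{\gamma^a}{\pi}\right)\right).$$

Here $\lambda(t) \equiv \phi(t)/(1 - \Phi(t))$ is the inverse Mills ratio.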

The inverse Mills ratio is an increasing function, so µcθ(D) is increasing in D. Appli-

cant compliers who apply to charter from further away therefore have stronger preferences

for charters, and comparisons of potential outcomes for lottery compliers who live dif-

ferent distances from charter schools identify the relationship between preferences and

achievement. Specifically, for $D_1 \neq D_0$, we have

$$\frac{E[Y_i(\ell)|A_i = 1, S_i(1) > S_i(0), D_i = D_1] - E[Y_i(\ell)|A_i = 1, S_i(1) > S_i(0), D_i = D_0]}{\mu^c_\theta(D_1) - \mu^c_\theta(D_0)} = \alpha^\theta_\ell$$

for $\ell \in \{0, 1\}$. The numerator of the left-hand side of this equation can be computed using

the methods described in Abadie (2002) for estimating marginal mean counterfactuals for

compliers. The denominator is non-zero because complier preferences vary with distance;

it can be calculated with knowledge of the parameters of the student utility function,

which are identified from charter application and attendance behavior. The selection

parameters αθ` are therefore identified.
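The identification argument can be illustrated with a small Monte Carlo exercise. The sketch below simulates the simplified model at two distances, selects compliers using the application and attendance rules, and recovers the selection coefficient for charter outcomes from the ratio of differences in complier means. It is only an illustration under assumed parameter values: the simulation uses the simulated mean of θ among compliers in the denominator rather than the analytic expression, it averages potential outcomes over all compliers rather than applying the Abadie (2002) estimator, and every name and number in it is hypothetical.

```python
import math
import random


def psi(t):
    """E[max(t + v, 0)] for v ~ N(0, 1)."""
    cdf = 0.5 * math.erfc(-t / math.sqrt(2.0))
    pdf = math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)
    return t * cdf + pdf


def psi_inverse(target, lo=-10.0, hi=10.0):
    """Bisection inverse of the strictly increasing function psi."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if psi(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def complier_means(distance, n, rng, p):
    """Mean charter outcome and mean theta among compliers at one distance."""
    index_shift = p["gamma_0"] + p["gamma_d"] * distance
    theta_star = psi_inverse(p["gamma_a"] / p["pi"]) - index_shift
    sum_y, sum_theta, count = 0.0, 0.0, 0
    for _ in range(n):
        theta = rng.gauss(0.0, p["sigma_theta"])
        v = rng.gauss(0.0, 1.0)
        applies = theta > theta_star
        attends_if_offered = index_shift + theta + v > 0.0
        if applies and attends_if_offered:
            noise = rng.gauss(0.0, 0.5)  # assumed outcome noise
            sum_y += p["alpha_0"] + p["alpha_theta"] * theta + noise
            sum_theta += theta
            count += 1
    return sum_y / count, sum_theta / count


# Hypothetical parameter values, not estimates from the paper.
params = dict(gamma_0=0.0, gamma_d=-0.2, gamma_a=0.3, pi=0.6,
              sigma_theta=1.0, alpha_0=0.4, alpha_theta=-0.3)
rng = random.Random(0)
y_far, mu_far = complier_means(5.0, 100_000, rng, params)
y_near, mu_near = complier_means(0.0, 100_000, rng, params)
alpha_theta_hat = (y_far - y_near) / (mu_far - mu_near)
```

Compliers at the larger distance have a higher mean taste for charters, and the ratio lands close to the assumed coefficient, which is the comparison the identification argument relies on.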


Appendix D: Equilibrium Admission Probabilities

Description of the Game

This appendix describes the determination of equilibrium admission probabilities.

These probabilities are determined in a Subgame Perfect Nash Equilibrium (SPE) in

which students make utility-maximizing choices as described in Section 4, and schools set

admission probabilities to fill their capacities Λk, or come as close as possible to doing

so. The timing of the game follows Figure 3. Strategies in each stage of the game are as

follows:

1. Students choose applications.

2. Schools observe students’ application choices, and choose their admission probabil-

ities.

3. Offers are randomly assigned among applicants.

4. Students observe their offers and make school choices.

To simplify the game, note that the distribution of students is atomless, so schools do

not change their admission probabilities in the second stage in response to the application

decisions of individual students in the first stage. Students therefore act as “price takers”

in the first stage, in the sense that they do not expect schools to react to their application

choices. Without loss of generality, therefore, the game can be analyzed as if applications

and admission probabilities are chosen simultaneously. I analyze the static Nash equilibria

of this simultaneous-move game, which are equivalent to Subgame Perfect equilibria of

the dynamic game described above.

Definition of Equilibrium

An equilibrium of the game requires an application rule for each student, a vector

of admission probabilities π∗, and a rule for assigning school choices that satisfy the

following conditions:

1. The probability that student i chooses application bundle a is given by Qia(θi, τi; π∗),

where Qia is defined as in Section 4.2 and now explicitly depends on the vector of

admission probabilities students expect to face in each lottery.


2. For each k, π∗k is chosen to maximize enrollment subject to school k’s capacity

constraint, taking student application rules as given and assuming that other schools

choose π∗−k, which denotes the elements of π∗ excluding the k-th.

3. After receiving the offer vector z, student i chooses school k with probability

Pik(z, θi, τi) as in Section 4.2.

School Problem

I begin by deriving a school’s optimal admission probability as a function of students’

expected admission probabilities and the actions of other schools. Suppose that students

anticipate the admission probability vector πe when making application decisions in the

first stage of the model. Their application decisions are described by Qia(θi, τi; πe). In

addition, suppose that schools other than k admit students with probability π−k. If school

k admits students with probability πk in the second stage, its enrollment is given by

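$$e_k(\pi_k, \pi_{-k}, \pi^e) = E\left[\sum_{a \in \{0,1\}^K} \sum_{z \in \{0,1\}^K} Q_{ia}(\theta_i, \tau_i; \pi^e)\, f(z|a; \pi_k, \pi_{-k})\, P_{ik}(z, \theta_i, \tau_i)\right].$$

School k chooses πk to solve

$$\max_{\pi_k \in [0,1]} e_k(\pi_k, \pi_{-k}, \pi^e) \quad \text{s.t.} \quad e_k(\pi_k, \pi_{-k}, \pi^e) \leq \Lambda_k. \tag{14}$$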

The best response function πBRk (π−k, πe) is the solution to problem (14). The optimal

admission probability sets school k’s enrollment equal to its capacity if possible. The

following equation implicitly defines πBRk at interior solutions:

$$E\left[\sum_{a} \sum_{z} Q_{ia}(\theta_i, \tau_i; \pi^e)\, f(z|a; \pi^{BR}_k, \pi_{-k})\, P_{ik}(z, \theta_i, \tau_i)\right] = \Lambda_k.$$

Noting that Pik(z) = 0 when zk = 0 (since school k is not in student i’s choice set if she

does not receive an offer) and setting fk(1|ak; πk) = akπk, this equation can be rewritten

$$E\left[\sum_{a: a_k = 1} \sum_{z: z_k = 1} Q_{ia}(\theta_i, \tau_i; \pi^e)\, f_{-k}(z_{-k}|a_{-k}; \pi_{-k}) \cdot \pi^{BR}_k \cdot P_{ik}(z, \theta_i, \tau_i)\right] = \Lambda_k,$$

where z−k, a−k, and f−k are z, a and f excluding the k-th elements. An interior solution

for πBRk therefore satisfies

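$$\pi^{BR}_k = \frac{\Lambda_k}{E\left[\sum_{a: a_k = 1} \sum_{z: z_k = 1} Q_{ia}(\theta_i, \tau_i; \pi^e)\, f_{-k}(z_{-k}|a_{-k}; \pi_{-k})\, P_{ik}(z, \theta_i, \tau_i)\right]} \equiv \Gamma_k(\pi_{-k}, \pi^e).$$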

If the denominator of Γk is sufficiently small, it may exceed one, in which case school

k cannot fill its capacity. In this case, the optimal action is to set πk = 1 and fill as many

seats as possible. This implies that the best response function is given by

$$\pi^{BR}_k(\pi_{-k}, \pi^e) = \min\{\Gamma_k(\pi_{-k}, \pi^e),\; 1\}.$$
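The best response rule can be written in a few lines. In the minimal sketch below, `enrollment_per_unit_offer` is a hypothetical stand-in for the denominator of Γk, that is, the enrollment school k would obtain if it admitted every applicant; the numbers are illustrative only.

```python
def best_response(capacity, enrollment_per_unit_offer):
    """Admission probability min{Gamma_k, 1}.

    capacity: number of seats at school k (Lambda_k).
    enrollment_per_unit_offer: enrollment if every applicant were admitted,
        a stand-in for the denominator of Gamma_k (must be positive).
    """
    if enrollment_per_unit_offer <= 0:
        raise ValueError("enrollment per unit offer must be positive")
    gamma_k = capacity / enrollment_per_unit_offer
    return min(gamma_k, 1.0)


# Oversubscribed school: rations seats with an interior admission probability.
pi_oversubscribed = best_response(capacity=100, enrollment_per_unit_offer=250)

# Undersubscribed school: cannot fill its seats, so it admits everyone.
pi_undersubscribed = best_response(capacity=100, enrollment_per_unit_offer=80)
```

The first school sets its admission probability to capacity divided by full-admission enrollment, while the second hits the corner at one.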

Existence of Equilibrium

Let $\pi^{BR} : [0, 1]^K \to [0, 1]^K$ be the vector-valued function defined by

$$\pi^{BR}(\pi) \equiv \left(\pi^{BR}_1(\pi_{-1}, \pi), \ldots, \pi^{BR}_K(\pi_{-K}, \pi)\right)'.$$

A vector of admission probabilities supports a Nash equilibrium if and only if it is a fixed

point of πBR(π). The following theorem shows that an equilibrium of the game always

exists.

Theorem: There exists a π∗ ∈ [0, 1]K such that πBR (π∗) = π∗.

Proof: Note that Qia(θi, τi; π) is continuous in π and strictly positive, Pik(z, θi, τi) is

strictly positive when zk = 1, and f−k(z−k|a−k; π−k) is continuous in π−k and sums to

one for each a−k, so the denominator of Γk is always non-zero and continuous in π. πBRk

is therefore a composition of continuous functions, and is continuous. Then πBR is a

continuous function that maps the compact, convex set [0, 1]K to itself. Brouwer’s Fixed

Point Theorem immediately applies and πBR has at least one fixed point in [0, 1]K .
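The fixed-point characterization suggests a simple way to search for an equilibrium numerically: iterate on the best response map until the admission probabilities stop changing. The sketch below does this with damping in a toy model. The demand function, masses, and capacities are hypothetical placeholders for the model's expected enrollment, not the specification in the text. Brouwer's theorem guarantees existence but not that this iteration converges, so the routine raises an error if it fails to do so.

```python
def enrollment_per_unit_offer(k, pi, mass):
    """Toy stand-in for the denominator of Gamma_k (always positive).

    Applications to school k rise with its own expected admission
    probability; attendance by admitted students falls when other
    schools admit more often.
    """
    others = [p for j, p in enumerate(pi) if j != k]
    apply_share = 0.2 + 0.3 * pi[k]
    attend_share = 0.9 - 0.2 * sum(others) / len(others)
    return mass[k] * apply_share * attend_share


def best_response_vector(pi, mass, capacity):
    """Apply min{Gamma_k, 1} school by school."""
    return [
        min(capacity[k] / enrollment_per_unit_offer(k, pi, mass), 1.0)
        for k in range(len(pi))
    ]


def solve_equilibrium(mass, capacity, damping=0.5, tol=1e-12, max_iter=10_000):
    """Damped fixed-point iteration on the best response map."""
    pi = [1.0] * len(mass)
    for _ in range(max_iter):
        br = best_response_vector(pi, mass, capacity)
        if max(abs(b - p) for b, p in zip(br, pi)) < tol:
            return pi
        pi = [(1.0 - damping) * p + damping * b for p, b in zip(pi, br)]
    raise RuntimeError("fixed-point iteration did not converge")


# Hypothetical student masses and capacities for three schools.
mass = [1000.0, 800.0, 600.0]
capacity = [150.0, 200.0, 500.0]
pi_star = solve_equilibrium(mass, capacity)
fixed_point_gap = max(
    abs(b - p) for b, p in zip(best_response_vector(pi_star, mass, capacity), pi_star)
)
```

In this toy example the third school has more seats than it could ever fill, so its equilibrium admission probability is one, while the other two schools ration seats.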

Uniqueness of Equilibrium

I next give conditions under which the equilibrium is unique. Define the functions

$$\Delta_k(\pi) \equiv \pi_k - \min\{\Gamma_k(\pi_{-k}, \pi),\; 1\}$$

and let ∆(π) ≡ (∆1(π), ...,∆K(π))′. A vector supporting an equilibrium satisfies ∆(π∗) =

0. A sufficient condition for a unique equilibrium is that the Jacobian of ∆(π) is a positive

dominant diagonal matrix. This requires the following two conditions to hold at every

value of π ∈ [0, 1]K :

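1a. $\dfrac{\partial \Delta_k}{\partial \pi_k} > 0 \quad \forall k$

2a. $\left|\dfrac{\partial \Delta_k}{\partial \pi_k}\right| \geq \sum_{j \neq k} \left|\dfrac{\partial \Delta_k}{\partial \pi_j}\right| \quad \forall k$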

To gain intuition for when a unique equilibrium is more likely, note that in any

equilibrium, admission probabilities must be strictly positive for all schools; an admission

rate of zero guarantees zero enrollment, while expected enrollment is positive and less

than Λk for a sufficiently small positive πk. When πk > 0, we can write Γk as

$$\Gamma_k(\pi_{-k}, \pi) = \frac{\Lambda_k \pi_k}{e_k(\pi_k, \pi_{-k}, \pi)}.$$

It follows that conditions 1a and 2a are equivalent to the following conditions on the

model’s enrollment elasticities:

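1b. $\dfrac{\partial \log e_k}{\partial \log \pi_k} > \left(\dfrac{\Lambda_k - e_k}{\Lambda_k}\right) \quad \forall k$

2b. $\dfrac{\partial \log e_k}{\partial \log \pi_k} \geq \sum_{j \neq k} \dfrac{\pi_k}{\pi_j} \cdot \left|\dfrac{\partial \log e_k}{\partial \log \pi_j}\right| + \left(\dfrac{\Lambda_k - e_k}{\Lambda_k}\right) \quad \forall k$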

Condition 1b necessarily holds in the neighborhood of an equilibrium since the elas-

ticity of school k’s enrollment with respect to its own admission probability is positive

and Λk ≈ ek. This condition is more likely to hold throughout the parameter space when

demand for charter schools is strong, so that ek(πk, π−k, π) > Λk at most values of π.

Condition 2b is also more likely to hold in these circumstances, and when the cross elas-

ticities of enrollment at school k with respect to other schools’ admission probabilities are

small. This occurs when charter demand is more segmented. If preferences for distance

are strong enough, for example, each student will consider only the closest charter school,

and the cross elasticities are zero, leading to a unique equilibrium.
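The dominant diagonal condition can be checked numerically for a given demand system. The sketch below approximates the Jacobian of ∆(π) by central finite differences and tests conditions 1a and 2a on a coarse interior grid. The demand function, masses, and capacities are hypothetical, and a grid check of this kind is only suggestive: it does not establish the condition at every value of π.

```python
import itertools

MASS = [1000.0, 800.0, 600.0]      # hypothetical student masses
CAPACITY = [150.0, 200.0, 500.0]   # hypothetical capacities


def enrollment_per_unit_offer(k, pi):
    """Toy stand-in for the denominator of Gamma_k, with weak cross effects."""
    others = [p for j, p in enumerate(pi) if j != k]
    apply_share = 0.2 + 0.3 * pi[k]
    attend_share = 0.9 - 0.2 * sum(others) / len(others)
    return MASS[k] * apply_share * attend_share


def delta(pi):
    """Delta_k(pi) = pi_k - min{Gamma_k, 1} for each school k."""
    return [
        pi[k] - min(CAPACITY[k] / enrollment_per_unit_offer(k, pi), 1.0)
        for k in range(len(pi))
    ]


def jacobian(pi, h=1e-6):
    """Central finite-difference Jacobian; entry [k][j] is dDelta_k/dpi_j."""
    size = len(pi)
    jac = [[0.0] * size for _ in range(size)]
    for j in range(size):
        up, down = list(pi), list(pi)
        up[j] += h
        down[j] -= h
        d_up, d_down = delta(up), delta(down)
        for k in range(size):
            jac[k][j] = (d_up[k] - d_down[k]) / (2.0 * h)
    return jac


def is_positive_dominant_diagonal(jac):
    """Conditions 1a and 2a for every row of the Jacobian."""
    for k, row in enumerate(jac):
        off_diagonal = sum(abs(x) for j, x in enumerate(row) if j != k)
        if not (row[k] > 0.0 and abs(row[k]) >= off_diagonal):
            return False
    return True


grid = [0.25, 0.5, 0.75]
all_dominant = all(
    is_positive_dominant_diagonal(jacobian(list(point)))
    for point in itertools.product(grid, repeat=len(MASS))
)
```

The toy demand system builds in small cross effects between schools, which is the segmented-demand case the text associates with uniqueness.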


Figure A1: Boston Charter Middle Schools

Notes: Black stars mark the locations of the charter middle schools used to estimate the structural model (open by 2010-2011). Red stars mark the locations of additional schools opened through 2012-2013. Yellow stars mark the locations of schools opened in a hypothetical expansion that raises the number of charters to 30.


Table A1: Boston Charter Middle Schools

| School name (1) | Grade coverage (2) | Years open (3) | Lotteries available (4) | Linked schools (5) |
|---|---|---|---|---|
| Panel A. Schools open before 2011 | | | | |
| Academy of the Pacific Rim | 5-12 | 1997- | Yes | - |
| Boston Collegiate | 5-12 | 1998- | Yes | - |
| Boston Preparatory | 6-12 | 2004- | Yes | - |
| Edward Brooke | K-8 (with 5th entry) | 2002- | Yes | - |
| Excel Academy | 5-8 | 2003- | Yes | - |
| Frederick Douglass | 6-10 | 2000-2005 | No | - |
| MATCH Middle School | 6-8 | 2008- | Yes | - |
| Smith Leadership Academy | 6-8 | 2003- | No | - |
| Roxbury Preparatory | 6-8 | 1999- | Yes | - |
| Uphams Corner | 5-8 | 2002-2009 | No | - |
| Panel B. Expansion schools | | | | |
| Dorchester Preparatory | 5-12 | 2012- | - | Roxbury Preparatory |
| Edward Brooke II | K-8 (with 5th entry) | 2011- | - | Edward Brooke |
| Edward Brooke III | K-8 (with 5th entry) | 2012- | - | Edward Brooke |
| Excel Academy II | 5-12 | 2012- | - | Excel Academy |
| Grove Hall Preparatory | 5-12 | 2011- | - | Roxbury Preparatory |
| KIPP Academy Boston | 5-8 | 2012- | - | KIPP Academy Lynn |

Notes: This table lists charter middle schools serving traditional student populations in Boston, Massachusetts. Schools are included if they accept students in 5th or 6th grade. Panel A lists schools open before the 2011-2012 school year, while Panel B lists expansion schools opened for 2011-2012 and 2012-2013. Column (3) lists the opening and (where relevant) closing year for each school. Column (4) indicates whether lottery records were available for cohorts of applicants attending 4th grade between 2006 and 2010. For expansion schools, column (5) lists existing Massachusetts charter schools operated by the same organization.


Table A2: Attrition

| | Follow-up rate (1) | Differential (2) |
|---|---|---|
| All students | 0.889 | - |
| Lottery applicants | 0.869 | -0.001 (0.015) |
| N (scores) | 35849 | 6417 |

* significant at 10%; ** significant at 5%; *** significant at 1%

Notes: This table reports the fraction of follow-up test scores in grades 6 through 8 that are observed for students attending 4th grade in Boston between 2006 and 2010. A student is coded as observed in a grade if both her math and ELA scores are recorded in that grade. The sample stacks grades, and includes observations for all scores that should be observed assuming normal academic progress after 4th grade. Column (1) shows the follow-up rate, while column (2) shows the difference in follow-up rates for lottery winners and losers. This differential is computed from a regression that controls for lottery fixed effects. The standard error is robust to heteroskedasticity and is clustered at the student level.


Table A3: Maximum Simulated Likelihood Estimates of 6th-grade Achievement Parameters

| Parameter | Description | Charter school: Estimate (1) | Charter school: Standard error (2) | Traditional public school: Estimate (3) | Traditional public school: Standard error (4) | Charter effect: Estimate (5) | Charter effect: Standard error (6) |
|---|---|---|---|---|---|---|---|
| Panel A. Math | | | | | | | |
| α0m | Mean potential outcome | 0.028 | 0.069 | -0.478*** | 0.007 | 0.506*** | 0.069 |
| αxm | Female | -0.006 | 0.034 | -0.005 | 0.013 | -0.001 | 0.036 |
| | Black | -0.104* | 0.059 | -0.194*** | 0.021 | 0.090 | 0.062 |
| | Hispanic | 0.036 | 0.062 | -0.100*** | 0.022 | 0.136** | 0.066 |
| | Subsidized lunch | -0.022 | 0.040 | -0.147*** | 0.019 | 0.125*** | 0.044 |
| | Special education | -0.362*** | 0.040 | -0.354*** | 0.015 | -0.008 | 0.042 |
| | Limited English proficiency | -0.017 | 0.045 | 0.049*** | 0.016 | -0.066 | 0.048 |
| | Baseline math score | 0.402*** | 0.024 | 0.566*** | 0.009 | -0.164*** | 0.026 |
| | Baseline ELA score | 0.144*** | 0.022 | 0.101*** | 0.008 | 0.043* | 0.024 |
| αθm·σθ | Taste for charter schools (std. dev. units) | 0.109*** | 0.019 | 0.144*** | 0.009 | -0.035* | 0.021 |
| Panel B. ELA | | | | | | | |
| α0e | Mean potential outcome | -0.085 | 0.076 | -0.557*** | 0.008 | 0.472*** | 0.077 |
| αxe | Female | 0.095** | 0.038 | 0.158*** | 0.013 | -0.063 | 0.040 |
| | Black | -0.080 | 0.060 | -0.164*** | 0.021 | 0.084 | 0.063 |
| | Hispanic | -0.052 | 0.062 | -0.098*** | 0.022 | 0.047 | 0.066 |
| | Subsidized lunch | -0.063 | 0.045 | -0.150*** | 0.020 | 0.087* | 0.049 |
| | Special education | -0.331*** | 0.042 | -0.332*** | 0.015 | 0.001 | 0.045 |
| | Limited English proficiency | -0.009 | 0.048 | -0.069*** | 0.016 | 0.060 | 0.051 |
| | Baseline math score | 0.076*** | 0.026 | 0.183*** | 0.009 | -0.107*** | 0.027 |
| | Baseline ELA score | 0.488*** | 0.025 | 0.452*** | 0.008 | 0.036 | 0.026 |
| αθe·σθ | Taste for charter schools (std. dev. units) | -0.032 | 0.021 | 0.009 | 0.009 | -0.041* | 0.023 |
| N | Sample size | 13473 | | | | | |

* significant at 10%; ** significant at 5%; *** significant at 1%

Notes: This table reports maximum simulated likelihood estimates of the parameters of the 6th-grade achievement distribution. Panel A shows estimates for math, while Panel B shows estimates for ELA. The likelihood is evaluated using 300 simulations per observation. Mean potential outcomes are evaluated at the sample mean of the covariate vector X. The mean potential outcome for charter schools is an average of school-specific means. The sample includes all students with observed 8th-grade test scores.


Table A4: Maximum Simulated Likelihood Estimates of 7th-grade Achievement Parameters

| Parameter | Description | Charter school: Estimate (1) | Charter school: Standard error (2) | Traditional public school: Estimate (3) | Traditional public school: Standard error (4) | Charter effect: Estimate (5) | Charter effect: Standard error (6) |
|---|---|---|---|---|---|---|---|
| Panel A. Math | | | | | | | |
| α0m | Mean potential outcome | 0.219*** | 0.075 | -0.456*** | 0.008 | 0.676*** | 0.075 |
| αxm | Female | 0.116*** | 0.038 | 0.010 | 0.015 | 0.105*** | 0.040 |
| | Black | -0.036 | 0.067 | -0.181*** | 0.023 | 0.145** | 0.071 |
| | Hispanic | 0.033 | 0.070 | -0.083*** | 0.025 | 0.116 | 0.074 |
| | Subsidized lunch | 0.036 | 0.044 | -0.150*** | 0.021 | 0.186*** | 0.049 |
| | Special education | -0.374*** | 0.048 | -0.350*** | 0.017 | -0.024 | 0.051 |
| | Limited English proficiency | 0.005 | 0.058 | 0.083*** | 0.019 | -0.078 | 0.061 |
| | Baseline math score | 0.367*** | 0.028 | 0.494*** | 0.010 | -0.127*** | 0.029 |
| | Baseline ELA score | 0.073*** | 0.027 | 0.091*** | 0.009 | -0.019 | 0.028 |
| αθm·σθ | Taste for charter schools (std. dev. units) | 0.029 | 0.022 | 0.061*** | 0.011 | -0.032 | 0.025 |
| Panel B. ELA | | | | | | | |
| α0e | Mean potential outcome | 0.042 | 0.091 | -0.517*** | 0.009 | 0.559*** | 0.092 |
| αxe | Female | 0.170*** | 0.041 | 0.221*** | 0.015 | -0.051 | 0.044 |
| | Black | 0.003 | 0.068 | -0.084*** | 0.024 | 0.087 | 0.072 |
| | Hispanic | 0.088 | 0.073 | -0.023 | 0.025 | 0.111 | 0.077 |
| | Subsidized lunch | 0.029 | 0.051 | -0.130*** | 0.022 | 0.159*** | 0.055 |
| | Special education | -0.412*** | 0.052 | -0.404*** | 0.016 | -0.008 | 0.054 |
| | Limited English proficiency | -0.034 | 0.061 | -0.014 | 0.019 | -0.020 | 0.064 |
| | Baseline math score | 0.101*** | 0.030 | 0.180*** | 0.009 | -0.079** | 0.031 |
| | Baseline ELA score | 0.331*** | 0.029 | 0.370*** | 0.009 | -0.039 | 0.030 |
| αθe·σθ | Taste for charter schools (std. dev. units) | -0.025 | 0.027 | -0.002 | 0.012 | -0.024 | 0.030 |
| N | Sample size | 10333 | | | | | |

* significant at 10%; ** significant at 5%; *** significant at 1%

Notes: This table reports maximum simulated likelihood estimates of the parameters of the 7th-grade achievement distribution. Panel A shows estimates for math, while Panel B shows estimates for ELA. The likelihood is evaluated using 300 simulations per observation. Mean potential outcomes are evaluated at the sample mean of the covariate vector X. The mean potential outcome for charter schools is an average of school-specific means. The sample includes all students with observed 8th-grade test scores.


Table A5: Maximum Simulated Likelihood Estimates of Covariance Parameters

| Parameter | Description | Charter school: Estimate (1) | Charter school: Standard error (2) | Traditional public school: Estimate (3) | Traditional public school: Standard error (4) | Charter effect: Estimate (5) | Charter effect: Standard error (6) |
|---|---|---|---|---|---|---|---|
| Panel A. 6th Grade | | | | | | | |
| Var(εimk)^(1/2) | Math standard deviation | 0.643*** | 0.011 | 0.702*** | 0.004 | -0.059*** | 0.012 |
| Var(εiek)^(1/2) | ELA standard deviation | 0.582*** | 0.011 | 0.664*** | 0.006 | -0.082*** | 0.013 |
| ρme | Math/ELA correlation | 0.494*** | 0.018 | 0.424*** | 0.009 | 0.071*** | 0.020 |
| Panel B. 7th Grade | | | | | | | |
| Var(εimk)^(1/2) | Math standard deviation | 0.636*** | 0.011 | 0.681*** | 0.004 | -0.045*** | 0.012 |
| Var(εiek)^(1/2) | ELA standard deviation | 0.579*** | 0.013 | 0.674*** | 0.006 | -0.095*** | 0.014 |
| ρme | Math/ELA correlation | 0.481*** | 0.023 | 0.480*** | 0.008 | 0.001 | 0.024 |
| Panel C. 8th Grade | | | | | | | |
| Var(εimk)^(1/2) | Math standard deviation | 0.645*** | 0.016 | 0.718*** | 0.006 | -0.073*** | 0.017 |
| Var(εiek)^(1/2) | ELA standard deviation | 0.585*** | 0.016 | 0.672*** | 0.007 | -0.087*** | 0.018 |
| ρme | Math/ELA correlation | 0.530*** | 0.026 | 0.492*** | 0.010 | 0.038 | 0.028 |

* significant at 10%; ** significant at 5%; *** significant at 1%

Notes: This table reports maximum simulated likelihood estimates of the covariance parameters of the math and ELA achievement distributions. The likelihood is evaluated using 300 simulations per observation.


Table A6: School-specific Test Score Parameters

| School | Math effects: 6th grade (3) | Math effects: 7th grade (4) | Math effects: 8th grade (5) | ELA effects: 6th grade (6) | ELA effects: 7th grade (7) | ELA effects: 8th grade (8) |
|---|---|---|---|---|---|---|
| Charter school 1 | 0.434*** (0.075) | 0.577*** (0.082) | 0.492*** (0.116) | 0.283*** (0.084) | 0.453*** (0.098) | 0.577*** (0.125) |
| Charter school 2 | 0.448*** (0.073) | 0.486*** (0.078) | 0.471*** (0.101) | 0.357*** (0.085) | 0.472*** (0.091) | 0.540*** (0.118) |
| Charter school 3 | 0.375*** (0.080) | 0.562*** (0.088) | 0.543*** (0.123) | 0.410*** (0.086) | 0.364*** (0.108) | 0.510*** (0.137) |
| Charter school 4 | 0.470*** (0.083) | 0.573*** (0.087) | 0.771*** (0.119) | 0.381*** (0.091) | 0.540*** (0.106) | 0.618*** (0.139) |
| Charter school 5 | 0.476*** (0.077) | 0.692*** (0.087) | 0.551*** (0.115) | 0.187** (0.082) | 0.229** (0.100) | 0.344*** (0.126) |
| Charter school 6 | 0.656*** (0.095) | 0.790*** (0.100) | 0.682*** (0.123) | 0.846*** (0.114) | 0.843*** (0.121) | 0.834*** (0.149) |
| Charter school 7 | 0.684*** (0.100) | 1.050*** (0.118) | 0.968*** (0.171) | 0.842*** (0.108) | 1.013*** (0.135) | 1.007*** (0.174) |

Standard errors in parentheses. * significant at 10%; ** significant at 5%; *** significant at 1%

Notes: This table reports maximum simulated likelihood estimates of the school-specific test score parameters from the structural model. Test score effects are evaluated at the mean of the covariate vector X. The likelihood is evaluated using 300 simulations per observation.


Table A7: Model Fit -- Choice Probabilities

| Choice | Application: Data (1) | Application: Model (2) | Attendance: Data (3) | Attendance: Model (4) |
|---|---|---|---|---|
| Apply to/attend any charter | 0.190 | 0.231 | 0.100 | 0.094 |
| Apply to more than one charter | 0.061 | 0.045 | - | - |
| Number of charter applications | 0.272 | 0.281 | - | - |
| Among applicants/attenders: | | | | |
| Charter school 1 | 0.158 | 0.165 | 0.159 | 0.118 |
| Charter school 2 | 0.205 | 0.200 | 0.172 | 0.159 |
| Charter school 3 | 0.252 | 0.179 | 0.166 | 0.172 |
| Charter school 4 | 0.312 | 0.215 | 0.215 | 0.203 |
| Charter school 5 | 0.350 | 0.223 | 0.145 | 0.206 |
| Charter school 6 | 0.070 | 0.114 | 0.070 | 0.063 |
| Charter school 7 | 0.084 | 0.116 | 0.075 | 0.080 |

Notes: This table compares empirical choice probabilities to simulated probabilities using the MSL estimates. Model statistics are averages from 100 simulations per observation.


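Table A8: Model Fit -- Achievement Distributions

| Subject | Grade | Traditional public schools, mean: Data (1) | Traditional public schools, mean: Model (2) | Traditional public schools, standard deviation: Data (3) | Traditional public schools, standard deviation: Model (4) | Charter schools, mean: Data (5) | Charter schools, mean: Model (6) | Charter schools, standard deviation: Data (7) | Charter schools, standard deviation: Model (8) |
|---|---|---|---|---|---|---|---|---|---|
| Math | 6th | -0.521 | -0.520 | 1.038 | 1.034 | 0.282 | 0.314 | 0.814 | 0.837 |
| | 7th | -0.486 | -0.486 | 0.975 | 0.970 | 0.336 | 0.361 | 0.760 | 0.783 |
| | 8th | -0.457 | -0.452 | 0.958 | 0.954 | 0.350 | 0.348 | 0.732 | 0.747 |
| ELA | 6th | -0.580 | -0.581 | 1.057 | 1.052 | -0.040 | -0.028 | 0.916 | 0.939 |
| | 7th | -0.538 | -0.537 | 0.985 | 0.982 | 0.086 | 0.091 | 0.848 | 0.854 |
| | 8th | -0.506 | -0.502 | 0.993 | 0.987 | 0.143 | 0.121 | 0.801 | 0.812 |

Notes: This table compares empirical test score distributions to simulated distributions using the MSL estimates. Model statistics are produced using 100 simulations per observation.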


Table A9: Equilibrium Admission Probabilities

| School | 7 schools (1) | 13 schools (2) | 30 schools (3) |
|---|---|---|---|
| School 1 | 0.668 | 0.784 | 1.000 |
| School 2 | 0.449 | 0.569 | 0.880 |
| School 3 | 0.663 | 0.792 | 1.000 |
| School 4 | 0.699 | 0.871 | 1.000 |
| School 5 | 0.476 | 0.577 | 0.873 |
| School 6 | 0.951 | 1.000 | 1.000 |
| School 7 | 0.867 | 1.000 | 1.000 |
| School 8 | - | 0.794 | 1.000 |
| School 9 | - | 1.000 | 1.000 |
| School 10 | - | 1.000 | 1.000 |
| School 11 | - | 0.620 | 0.975 |
| School 12 | - | 1.000 | 1.000 |
| School 13 | - | 0.990 | 1.000 |
| School 14 | - | - | 1.000 |
| School 15 | - | - | 0.863 |
| School 16 | - | - | 1.000 |
| School 17 | - | - | 1.000 |
| School 18 | - | - | 1.000 |
| School 19 | - | - | 0.863 |
| School 20 | - | - | 1.000 |
| School 21 | - | - | 1.000 |
| School 22 | - | - | 1.000 |
| School 23 | - | - | 0.994 |
| School 24 | - | - | 1.000 |
| School 25 | - | - | 1.000 |
| School 26 | - | - | 0.991 |
| School 27 | - | - | 1.000 |
| School 28 | - | - | 1.000 |
| School 29 | - | - | 1.000 |
| School 30 | - | - | 0.981 |

Notes: This table shows equilibrium admission probabilities used for the counterfactual simulations.
