2012–2013 Growth Model
for Educator Evaluation
Technical Report
Prepared for the New York State
Education Department
December 2013
DRAFT – NOT FOR DISTRIBUTION OR CITATION
Table of Contents
INTRODUCTION ...........................................................................................................................1
Changes in Growth Measures for Teachers in Grades 4–8 from 2011–2012 to 2012–2013 ...........2
Growth Measures for Principals of Grades 9–12 .............................................................................3
Content and Organization of This Report ........................................................................................4
DATA ..............................................................................................................................................6
Test Scores .......................................................................................................................................6
Regents Exams .....................................................................................................................7
Demographics ..................................................................................................................................9
Academic History Variables ..............................................................................................11
Students with Disabilities (SWD) Variables ......................................................................13
English Language Learner Variables ................................................................................13
Economic Disadvantage Variables ....................................................................................13
Attribution Data and Weighting of Student Growth for Educators ...............................................14
Linking Students to Teachers of Grades 4–8 .....................................................................14
School and District Linkages in Grades 4–8 .....................................................................16
Linking Students to Principals of Grades 9–12 .................................................................17
School and District Linkages in Grades 9–12 ...................................................................17
MODEL .........................................................................................................................................19
MGP Model ...................................................................................................................................19
Covariate Adjustment Model .............................................................................................20
Accounting for Measurement Variance in the Predictor Variables ..................................20
Specification for MGP Model for Grades 4–8 ...................................................................22
Specification for the MGP Model for Grades 9–12 ...........................................................22
Student Growth Percentiles ...............................................................................................23
Mean Growth Percentiles ..................................................................................................25
Combining Growth Percentiles Across Grades and Subjects ............................................26
Comparative Growth in Regents Exams Passed (GRE) Model .....................................................26
REPORTING .................................................................................................................................29
Reporting for Teachers and Principals of Grades 4–8 ...................................................................29
Reporting for Grades 9–12.............................................................................................................30
Minimum Sample Sizes for Reporting ..........................................................................................30
Performance Categories .................................................................................................................31
RESULTS ......................................................................................................................................33
Results from Growth Models for Grades 4–8 ................................................................................33
Model Fit Statistics for Grades 4–8 ...................................................................................33
Student Growth Percentiles for Grades 4–8 ......................................................................34
Mean Growth Percentiles for Grades 4–8 .........................................................................34
Precision of the Mean Growth Percentiles for Grades 4–8 ..............................................35
Impact Data Results for Grades 4–8 .................................................................................38
Growth Ratings for Grades 4–8 .........................................................................................45
Stability of Growth Ratings for Grades 4–8 Over Time ....................................................45
Results for Grades 9–12 .................................................................................................................47
Model Fit Statistics for Grade 9–12 Models ......................................................................47
Correlation of Combined MGP with GRE Results ............................................................47
Fraction of Students Included in Measures .......................................................................47
Distribution of MGPs and GRE Scores for Grades 9–12 ..................................................47
Precision of the Measures for Grades 9–12 ......................................................................48
Impact Data Results for Grades 9–12 ...............................................................................51
Growth Ratings for Principals of Grades 9–12 .................................................................57
Growth Ratings for Schools/Principals Serving Grades 4–8 and Grade 9–12 .................57
CONCLUSION ..............................................................................................................................59
REFERENCES ..............................................................................................................................60
List of Appendices
Appendix A. Task Force and Technical Advisory Committee Members
Appendix B. Grade 4–8 Data Processing Rules and Results
Appendix C. Grade 4–8 Item Descriptions Used in Analysis
Appendix D. Model Derivation
Appendix E. Interpolating Standard Errors of Measurement at the Lowest and Highest
Obtainable Scale Scores (LOSS and HOSS)
Appendix F. Grade 9–12 Data Processing Rules and Results
Appendix G. Grade 4–8 Attribution and Weighting Rules
Appendix H. Model Coefficients
Appendix I. Grade 4–8 Impact Tables by Grade and Subject
List of Figures
Figure 1. Conditional Standard Error of Measurement Plot (Grade 8 ELA, 2010–2011) ............ 21
Figure 2. Sample Growth Percentile from Model ......................................................................... 24
Figure 3. Sample Growth Percentile from Model ......................................................................... 25
Figure 4. Determining Growth Ratings ........................................................................................ 32
Figure 5. Distribution of Grade 4–8 Teacher MGPs by Grade, Adjusted Model ......................... 35
Figure 6. Grade 4–8 Distribution of Principal MGPs, Adjusted Model ....................................... 35
Figure 7. Grades 4–8 Overall MGP with 95% Confidence Interval Based on Random Sample of 100 Teachers ......................................................................................................... 36
Figure 8. Grades 4–8 Overall MGP with 95% Confidence Interval Based on Random Sample of 100 Principals ....................................................................................................... 37
Figure 9. Grades 4–8 Relationship of Teacher MGP Scores to Percent of ELL Students in Class/Course ........................................................................................................... 40
Figure 10. Grades 4–8 Relationship of Teacher MGP Scores to Percent SWD in Class/Course . 40
Figure 11. Grades 4–8 Relationship of Teacher MGP Scores to Percent of Economically Disadvantaged Students in Class/Course ................................................................ 41
Figure 12. Grades 4–8 Relationship of Teacher MGP Scores to Mean Prior ELA Scores in Class/Course ........................................................................................................... 41
Figure 13. Grades 4–8 Relationship of Teacher MGP Scores to Mean Prior Math Scores in Class/Course ........................................................................................................... 42
Figure 14. Relationship of Principal MGP Scores to Percent of ELL Students ........................... 43
Figure 15. Relationship of Principal MGP Scores to Percent SWD in School ............................ 43
Figure 16. Relationship of Principal MGP Scores to Percent of Economically Disadvantaged Students ................................................................................................................... 44
Figure 17. Relationship of Principal MGP Scores to Average Prior ELA Scores ........................ 44
Figure 18. Relationship of Principal MGP Scores to Average Prior Math Scores ....................... 45
Figure 19. Grades 9–12 Distribution of Principal MGP, Adjusted Model ................................... 48
Figure 20. Grades 9–12 Distribution of Principal GRE Scores, Adjusted Model ........................ 48
Figure 21. Grades 9–12 Caterpillar Plot of School MGPs ............................................................ 49
Figure 22. Grades 9–12 Caterpillar Plot of School GRE Results ................................................. 50
Figure 23. Relationship of Principal MGP Scores to Percent of ELL Students ........................... 52
Figure 24. Relationship of Principal MGP Scores to Percent SWD in School ............................ 53
Figure 25. Relationship of Principal MGP Scores to Percent of Economically Disadvantaged Students ................................................................................................................... 53
Figure 26. Relationship of Principal MGP Scores to Average Prior ELA Scores ........................ 54
Figure 27. Relationship of Principal MGP Scores to Average Prior Math Scores ....................... 54
Figure 28. Relationship of Grades 9–12 Principal Growth in Regents Exam (GRE) Scores and Percent of ELL in the School .......................................................................... 55
Figure 29. Relationship of Grades 9–12 Principal Growth in Regents Exam (GRE) Scores and Percent of Students with Disabilities in the School ........................................ 55
Figure 30. Relationship of Grades 9–12 Principal Growth in Regents Exam (GRE) Scores and Percent of Economically Disadvantaged in the School ................................... 56
Figure 31. Relationship of Grades 9–12 Principal Growth in Regents Exam (GRE) Scores and Average Grade 8 ELA Scale Scores ............................................................... 56
Figure 32. Relationship of Grades 9–12 Principal Growth in Regents Exam (GRE) Scores and Average Grade 8 Math Scale Scores ............................................................... 57
List of Tables
Table 1. Variables Included in the Adjusted Models* ...................................................................10
Table 2. Grade 4–8 Teacher-Student Linkage Rates .....................................................................15
Table 3. Grades 4–8 School-Student Linkage Rates .....................................................................16
Table 4. Grades 4–8 District-Student Linkage Rates.....................................................................16
Table 5. Number of Unique Grades 4–8 Teacher-Schools, Schools, and Districts with Linked
Students ...........................................................................................................................17
Table 6. Grades 9–12 School-Student Linkage Rates ...................................................................18
Table 7. Number of Grade 9-12 Schools and Districts with Linked Students ...............................18
Table 8. Grade 4–8 Reporting Rates for Educators and Districts ..................................................31
Table 9. Grade 9–12 Reporting Rates for Educators and Districts ................................................31
Table 10. Grade 4–8 Pseudo R-Squared Values by Grade and Subject ........................................33
Table 11. Grade 4–8 Correlation between SGP and Prior Year Scale Score ................................34
Table 12. Grades 4–8 Mean Standard Errors, Standard Deviation, and Value of ρ for Adjusted
Model by Grade for Teachers and for Schools ...............................................................38
Table 13. Percent of Educator MGPs Above or Below Mean at the 95 % Confidence Level ......38
Table 14. Teacher MGP Correlated with Class/Course Characteristics ........................................39
Table 15. Principal MGP Correlated with School Characteristics ................................................42
Table 16. Grades 4–8 Teacher and Principal Growth Ratings .......................................................45
Table 17. Grades 4–8 Teacher and Principal Growth Ratings For Educators with Scores in 2011–
2012 and 2012–2013 .......................................................................................................46
Table 18. Grades 4–8 Teacher Growth Ratings for Teachers Present in Both 2011–2012 and
2012–2013.......................................................................................................................46
Table 19. Grade 4–8 School Growth Ratings for Schools Present in both
2011–2012 and 2012–2013 .............................................................................................46
Table 20. Grade 9–12 Pseudo R-Squared Values ..........................................................................47
Table 21. Average Percent of Students Included in 2012–2013 Measures ...................................47
Table 22. Grade 9–12 Percent of Principals Measures Above or Below Mean at the 95 %
Confidence Level ............................................................................................................50
Table 23. Grades 9–12 Mean Standard Errors, Standard Deviation,
and Value of ρ for Adjusted Model ................................................................................51
Table 24. Principal MGP Correlated with Demographic Characteristics ......................................51
Table 25. Distribution of Growth Ratings for Principals of Grades 9–12 in 2012–2013 ..............57
Table 26. Growth Ratings for Principals in 2012–2013 ................................................................58
INTRODUCTION
This document describes the models used to measure student growth for the purpose of educator
evaluation in New York State for the 2012–2013 school year. In 2012–2013, growth models
were implemented for teacher and principal evaluation in grades 4–8 English Language Arts
(ELA) and math and for principals of grades 9–12 (all grades). All models assess each
student’s change in performance between 2011–2012 and 2012–2013 on State assessments
relative to that of students with similar characteristics.
New York Education Law §3012-c requires performance evaluation for classroom teachers and
building principals in New York State. Under the law, New York State is required to
differentiate teacher and principal effectiveness using four rating categories: Highly Effective,
Effective, Developing, and Ineffective (HEDI). Education Law §3012-c(2)(a) requires Annual
Professional Performance Reviews (APPRs) resulting in a single composite teacher or principal
effectiveness score that incorporates multiple measures of effectiveness. Education Law §3012-
c(1) requires the results of the evaluations to be a significant factor in employment decisions,
including but not limited to promotion, retention, tenure determinations, termination, and
supplemental compensation. The law also provides that the results be a significant factor in
teacher and principal professional development (including but not limited to coaching, induction
support, and differentiated professional development).
State-provided growth scores are just one of several measures that make up the annual
professional performance reviews and count for 20 percent of an evaluation score for the 2012–
2013 and 2013–2014 school years. Another 20 percent of educators’ evaluations are based on
locally selected measures of student achievement that are rigorous and comparable across
classrooms in accordance with standards prescribed by the Commissioner. The remaining 60
percent is based on multiple measures of educator effectiveness consistent with standards
prescribed by the Commissioner in regulation. This includes the extent to which the educator
demonstrates proficiency in meeting New York State’s teaching or leadership standards. For
teachers with fewer than 50 percent of students who take State assessments in grades 4–8 in
ELA or math, other comparable measures of student learning growth must be used for the State
growth subcomponent, using the student learning objective (SLO) process established in State-
provided guidance. Results from the growth model will also be incorporated into the State’s
metrics used for school accountability as part of New York’s Elementary and Secondary
Education Act (ESEA) waiver.
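The 20/20/60 weighting described above can be sketched in code. This is an illustrative sketch only: the subcomponent names, the 0–100 subscore scale, and the combining function are assumptions made for demonstration, while the actual point bands and HEDI conversion are set by Commissioner’s regulation.

```python
# Illustrative sketch of the 20/20/60 evaluation weighting described above.
# Subcomponent names and the 0-100 scale are assumptions for demonstration;
# the real scoring rules and HEDI conversion are governed by regulation.

WEIGHTS = {
    "state_growth": 0.20,       # State-provided growth score (or SLO result)
    "local_achievement": 0.20,  # locally selected measures of achievement
    "other_measures": 0.60,     # other measures of educator effectiveness
}

def composite_score(subscores):
    """Combine subcomponent scores (each assumed on a 0-100 scale)
    into a single weighted composite."""
    return sum(WEIGHTS[k] * subscores[k] for k in WEIGHTS)

example = composite_score(
    {"state_growth": 70, "local_achievement": 80, "other_measures": 90}
)
```

The weights sum to 1.0, so a composite built this way stays on the same 0–100 scale as the hypothetical subscores.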
The Regents Task Force on Teacher and Principal Effectiveness, made up of representatives
from key stakeholder groups including educators, educator unions, educator professional
organizations, and other interested parties, has given input into the development of APPR
regulations and the design of the State-provided growth scores. In addition, a technical advisory
committee of leading experts in the nation has reviewed the technical accuracy and utility of the
statistical methodology used to calculate scores. A list of Task Force members and technical
advisory committee members is provided in Appendix A.
As required by Education Law §3012-c, New York State teachers of math and ELA in grades 4–
8 and their principals first received growth scores based on 2011–2012 State tests. Between
2011–2012 and 2012–2013, the New York State Education Department (NYSED) made a
number of refinements to the growth models and developed new measures for principals of
grades 9–12 (all grades). Note that students, teachers, and principals in grades 4–8 programs
administered by Boards of Cooperative Educational Services (BOCES) are not included in the
analysis and results presented in this report. Because BOCES are not comprehensive schools with
a full set of grades 9–12 course offerings, grades 9–12 State-provided growth measures were not
computed for BOCES. Students who take Regents Exams as part of a BOCES 9–12 program are
included in growth measures for their “home” schools.
Changes in Growth Measures for Teachers in Grades 4–8 from 2011–2012 to 2012–2013
A number of key changes were made to the growth models previously used to measure growth
for evaluation of teachers and principals of students in grades 4–8. These changes include:
Enhancement of factors used to define similar students. In 2011–2012, similar
students were defined as those with similar prior performance on State tests in the same
subject measured in the current year as well as whether a student had a disability,
lived in poverty, and/or was an English language learner (ELL). In 2012–2013,
additional variables (both student level and class/course level) were added to further
refine the definition of similar students in each of these areas. For example, in addition to
taking into account whether or not a student is an ELL student, in 2012–2013 the growth
model also accounts for the proportion of ELL students in a class or course and for the
level of a student’s English language proficiency (through use of New York State English
as a Second Language Achievement Test [NYSESLAT] scores). As described in the
Results section of this report, we see no substantive relationship between teacher Mean
Growth Percentiles (MGPs) and the characteristics of students in their courses in 2012–
2013. A full description of the factors used in the growth model in 2012–2013 can be
found later in this report.
Refined rules regarding which students count for teacher growth scores and with
what weight. In 2011–2012, a student counted for a teacher’s growth score only if the
student was enrolled with the teacher for 195 calendar days for ELA or 203 calendar days
for math. In 2012–2013, students counted if they spent at least 60 percent of a course
with a teacher. In addition, students meeting this minimum enrollment duration criterion
are weighted in a teacher’s growth score based on the amount of time they were enrolled
in and attended a teacher’s course. About 93 percent of student test scores were linked to
at least one teacher using these revised rules. A more detailed description of the process
used to attribute students to teachers in 2012–2013 can be found later in this report. No
change was made in 2012–2013 from the 2011–2012 minimum enrollment requirements
or the method for weighting students in growth scores for principals of grades 4–8. About
98 percent of students were linked to a school.
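The enrollment and attendance rule described above can be sketched as follows. The field names and the exact weighting formula are illustrative assumptions, not NYSED’s implementation; the authoritative attribution and weighting rules appear in Appendix G.

```python
# Hypothetical sketch of the 2012-2013 teacher attribution rule described
# above. The variable names and the product form of the weight are
# illustrative assumptions; see Appendix G for the actual rules.

def linkage_weight(days_enrolled, course_days, days_attended):
    """Return the weight a student contributes to a teacher's growth
    score, or 0.0 if the 60 percent minimum enrollment is not met."""
    if course_days <= 0 or days_enrolled <= 0:
        return 0.0
    enrollment_share = days_enrolled / course_days
    if enrollment_share < 0.60:  # minimum enrollment duration criterion
        return 0.0
    attendance_share = days_attended / days_enrolled
    # Weight reflects both time enrolled in and time attending the course.
    return enrollment_share * attendance_share

# Example: enrolled 150 of 180 course days, attended 140 of those days
w = linkage_weight(150, 180, 140)
```

Under this sketch a student enrolled for half the course contributes nothing, while a fully enrolled, fully attending student contributes a weight of 1.0.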
In addition to changes in the growth model itself, the State assessments that were administered in
2012–2013 measure the Common Core State Standards and have different scale scores than
those in 2011–2012. However, the relative performance of students compared to similar students
in 2012–2013 can still be calculated because all students took both old and new assessments and
the growth models do not depend on the use of a similar scale from year to year.1 Indeed, we see
that the statistical relationship between test scores from 2011–2012 and 2012–2013 is stronger
than that observed between 2010–2011 and 2011–2012. More detail about these results is
presented later in this report.
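The claim that the growth model does not depend on a common scale from year to year follows from its percentile basis: a student’s rank among similar students is unchanged by any strictly increasing rescaling of the scores. A minimal demonstration (not NYSED code; the percentile-rank function is a simplified stand-in for the model’s actual percentile logic):

```python
# Small demonstration that percentile ranks are invariant to a monotonic
# change of scale, which is why growth can be compared across years that
# use different reporting scales. Simplified stand-in, not NYSED code.

def percentile_ranks(scores):
    """For each score, the fraction of the group scoring at or below it
    (a simple stand-in for the model's percentile computation)."""
    n = len(scores)
    return [sum(1 for s in scores if s <= x) / n for x in scores]

old_scale = [650, 672, 701, 688, 660]
# Any strictly increasing transformation stands in for the new scale.
new_scale = [2 * s - 1000 for s in old_scale]

assert percentile_ranks(old_scale) == percentile_ranks(new_scale)
```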
Growth Measures for Principals of Grades 9–12
Beginning in 2012–2013, New York State principals of grades 9–12 also received growth scores
describing how much students in their schools are growing academically in algebra and ELA and
how well students are progressing toward passing the Regents Exams required for graduation
and college and career readiness, compared to similar students statewide.
Development of the growth measures for principals of grades 9–12 was informed by the
development of the growth model for principals of grades 4–8. Where possible, the same
definitions of similar students and the same rules about student attribution were used for the
grades 9–12 measures as for the grades 4–8 principal measures.
The goal of growth measures for principals of grades 9–12 is to measure student growth toward
graduation and college and career readiness using available Regents Exam data. To achieve this
goal for 2012–2013, two different growth measures are reported. These two measures are
intended to acknowledge progress in passing Regents Exams required for graduation as well as
to account for high performance on Regents Exams and passing Regents Exams beyond the
minimum five required. Using these two measures allows us to capture two different but
important aspects of student progress toward graduation and college and career readiness and to
include most students in a principal’s high school in at least one measure. Several alternatives
were considered by the Regents Task Force and SED before the combination of these two
measures was recommended to the Board of Regents. The rationale for the recommendation was
based on the following key points:
The Integrated Algebra and ELA Regents MGP measure closely parallels the measure
used for principal growth scores for grades 4–8 and generates results that have similar
technical characteristics. Using these measures promotes consistency between growth
measures for principals of grades 9–12 and lower grades.
However, using the MGP measures alone in grades 9–12 could leave some schools with
measures that include only a small fraction of students (if many students have taken
Algebra before grade 9), and could overlook the impact of students who drop out before
taking ELA Regents. Therefore, an additional measure (the Comparative Growth in
Regents Exams Passed [GRE] measure) was also constructed.
Initial analysis showed that the GRE measure includes an average of 84 percent of
students in a school, ensuring broad student coverage. Its correlation with the combined 1 For additional information about the State-provided growth scores calculated in 2011–12, see A Teacher’s Guide
to Interpreting State-Provided Growth Scores for Grades 4–8
(http://www.engageny.org/sites/default/files/resource/attachments/teachers_guide_to_interpreting_your_growth_s
core_2011-12.pdf) and A Principal’s Guide to Interpreting State-Provided Growth Scores for Grades 4–8
(http://www.engageny.org/sites/default/files/resource/attachments/principals_guide_to_interpreting_your_growth
_score_2011-12.pdf).
Growth Model for Educator Evaluation Technical Report
4
MGP measure was 0.42, suggesting it may measure a different form of progress than the
MGP measure, therefore accounting for more of the complex role of a high school
principal than the MGP measure alone. The GRE measure also incorporates the
consequences of dropout students and recognizes students who accomplish more than
the minimum five required Regents Exams.
In discussing the possible measures for grades 9–12, the Metrics subgroup of the Task
Force did not reach consensus on a recommendation. The group did agree that an April
2013 letter to the Commissioner written by SED staff characterized the various views of
the group. The letter explains that some members did not want the State to construct any
growth measure for 9–12 principals. Considering the two measures (MGP and GRE), the
letter explains that “most workgroup members expressed a preference for the
Comparative Growth in Regents Passed measure if a single measure was chosen. The
rationale expressed for preferring this measure was that it covered more Regents Exams
than other measures and more students were included in this measure than in the other
measures.” When asked to consider the option of combining more than one measure, the
letter explains, “Of the workgroup members willing to consider any of these measures,
most preferred that two of the high school measure options be used, the Growth in
Regents Passed measure and the MGP on Integrated Algebra and ELA Regents
measure.” A rationale expressed for using the MGP measure was that it provides some
continuity and similarity to the growth measure used for principals (and teachers) in
grades 4–8 and that these two Regents Exams are two of the basic exams required to
graduate. At an in-person meeting with the Task Force in June, a similar lack of consensus
emerged.
Each measure is described in detail in the sections that follow along with the technical and policy
considerations that led to the use of the two measures.
Content and Organization of This Report
Results presented in this report are based on 2012–2013 and prior school years’ data, with some
comparisons to the 2011–2012 results. A technical report describing models and full results from
the 2011–2012 school year can be found at the EngageNY website at
http://www.engageny.org/sites/default/files/resource/attachments/growth-model-11-12-air-
technical-report.pdf. The 2010–2011 “beta growth model” technical report, published in August
2012 (also available online at http://usny.nysed.gov/rttt/docs/nysed-2011-beta-growth-tech-
report.pdf) describes the initial models that were constructed with 2010–2011 and prior school
years’ data to design an initial model with stakeholder input. The 2010–2011 results were not
used for evaluation purposes.
This technical report contains four main sections:
1. Data. Description of the data used to implement the student growth model, including
data processing rules and relevant issues that arose during processing.
2. Model. Statistical description of the model.
3. Reporting. Description of reporting metrics and computation of effectiveness scores.
4. Results. Overview of key model results aimed at providing information on model
quality and characteristics.
DATA
To measure student growth and to attribute that growth to educators, at least two sources of data
are required: student test scores that can be observed over time and information describing how
students are linked to schools, teachers, and courses (i.e., identifying which teachers teach which
students for which tested subjects and which school[s] those students attended). In addition, New
York State models also use other information about students and schools, such as student
background.
There were several notable changes in the data used to estimate the 2012–2013 models: first,
assessments in grades 4–8 ELA and math were aligned to the Common Core State Standards (and
a new reporting scale was used); second, the rule used to link students in grades 4–8 to their
teachers was updated to make use of more detailed data on course enrollment and attendance.
The following sections describe the data used for model estimation in New York in more detail,
including some of the issues and challenges that arose and how they were handled.
Test Scores
New York’s student growth model drew on test score data from statewide testing programs in
grades 3–8 in ELA and math for the growth models for teachers and principals of grades 4–8 and
on Regents Exam scores for principals of grades 9–12. Models are estimated separately by grade
and subject using scores from each grade (e.g., grade 5 math) as the outcome, with predictor
scores as described in the following section.
State Tests in ELA and Math (Grades 3–8)
The New York State tests at the elementary and middle school grade levels include a variety of
content aimed at measuring a range of knowledge and skills in math and ELA. State tests in ELA
and math at grades 3–8 are given in the spring.
The 2012–2013 State tests were the first administered to New York students that were designed
to measure the Common Core State Standards. As expected, the percentage of students scoring
proficient or advanced was
significantly lower than in 2011–2012.
Although the specific content and skills covered, as well as the scale used to measure results,
changed from year to year, the statistical relationship between 2011–2012 and 2012–2013
test scores was strong (see the section on pseudo R-squared statistics for more detailed information). We
use test scores in each subject area as the predictor for that subject area (e.g., math scores are
used to predict math scores). In addition, the other subjects’ scores are used because they reflect
the general achievement of the students prior to the outcome year (e.g., ELA scores are used in
math models and vice versa).
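Conceptually, each grade-and-subject model regresses current-year scores on prior scores in both subjects, and a student's growth is then measured relative to the model's prediction for similar students. The following is a minimal illustration with simulated data; all variable names and coefficients are our own, and the operational model is considerably more complex (as described in the Model section of this report).

```python
import numpy as np

# Simulated data standing in for the real score files (assumption: scale
# scores roughly in the 200-400 range; the numbers are illustrative only).
rng = np.random.default_rng(0)
prior_math = rng.normal(300, 40, size=500)
prior_ela = 0.7 * prior_math + rng.normal(90, 30, size=500)
current_math = 0.8 * prior_math + 0.1 * prior_ela + rng.normal(60, 20, size=500)

# Design matrix: intercept plus same-subject and other-subject prior scores.
X = np.column_stack([np.ones_like(prior_math), prior_math, prior_ela])
coef, *_ = np.linalg.lstsq(X, current_math, rcond=None)

# Growth relative to prediction: positive residuals indicate students who
# outperformed similar students' expected scores.
residual_growth = current_math - X @ coef
```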
New York’s growth models include three prior test scores in the same subject area assessed by
the growth measure and one prior test score in another subject. If the immediate prior year test
score in the same subject was missing from the immediate prior grade, the student was not
included in the growth measures for that subject. For example, students without a prior year test
score or with a prior year test score for the same grade as the current year test score do not have
growth scores computed for them. More detail on exclusion rules and results of applying those
rules (along with other specifications) is included in Appendix B.
For the other prior scores, missing data indicators were used. These missing indicator variables
allow the model to include students who do not have the maximum possible test history and
mean that the model results measure outcomes for students with and without the maximum
possible assessment history. This approach was taken in order to include as many students as
possible. For the 2012–2013 analyses, data from 2012–2013 were used as outcomes, with prior
achievement predictors coming from the three years before (going back to 2009–2010). Specific
tests used vary by grade and subject and are as follows:
• Grade 4 ELA and math models use scores from grade 3 in ELA and math. Students are
NOT included if they lack grade 3 or 4 scores in the same subject.
• Grade 5 ELA and math models use scores from grades 3 and 4 in ELA and math.
Students are NOT included if they lack grade 4 scores in the same subject.
• Grades 6–8 ELA and math models use scores from grades 3–7 in ELA and math.
Students are NOT included if they lack the immediate prior year score in the same
subject.
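The same-subject inclusion rule above can be sketched as a simple check: a student is included in a subject's growth measure only with an immediate-prior-year score from the immediate prior grade. The function and data layout below are our own illustration, not NYSED's implementation.

```python
def include_in_growth(current_grade, prior_scores):
    """prior_scores: dict mapping (grade, years_back) -> scale score.

    A student is included only if a same-subject score exists from the
    immediately prior grade in the immediately prior year.
    """
    return (current_grade - 1, 1) in prior_scores

# Grade 5 student with a grade 4 score from last year: included.
assert include_in_growth(5, {(4, 1): 301.0, (3, 2): 295.0})

# Student repeating grade 6 (grade 6 score last year, grade 6 again now):
# excluded, since no grade 5 score exists from the prior year.
assert not include_in_growth(6, {(6, 1): 288.0})
```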
In addition to test scores, AIR also used the conditional standard errors of those test scores in
growth analysis. All assessments contain some amount of measurement error, and the New York
growth model accounts for this error (as described in more detail in the Model section of this
report). Conditional standard errors for prior year test scores were obtained from the
assessments' published technical reports; for 2012–2013 test scores, they came from a similar
table provided by the State's test vendor.
Regents Exams
One growth measure for grades 9–12 principals is the MGP measure, which is based on student
performance on grade 8 State tests in math or ELA compared to their performance in high school
on the Integrated Algebra and ELA Regents Exams. The model for generating the MGPs is very
similar to the grades 4–8 model, as described later in this report.
The ELA and Integrated Algebra Regents Exams are the most commonly taken exams in high
school. For 2012–2013, 43 percent of students who met the linkage requirements took either an
Integrated Algebra or an ELA Regents Exam. Because Regents Exams are offered multiple times each
year and students take Regents Exams at different points in their schooling, we include students
and test scores using the following rules:
• Students who take the Integrated Algebra or ELA Regents Exams prior to high school are
NOT included in the MGP of a principal of grades 9–12.
• We count Regents Exam scores from the following administrations: August of the prior
year (except for grade 9 students), January and June (of current year).
• Student scores are used until the students pass. (After students pass, we do not want the
measure alone to encourage additional test taking, which may not be necessary.)
• If a student takes a Regents Exam more than once during the school year, the higher test
score is used until that student receives a passing score.
• Students are included for up to eight years after first entering grade 9. (We want to
acknowledge schools that keep students beyond four years in high school to complete
graduation requirements.)
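The attempt-selection rules in the list above (use the higher score within a year, and stop counting attempts once the student passes) can be sketched as follows. The passing score of 65 and the function name are our assumptions for illustration; the sketch does not model the administration-window or grade 9 exclusions.

```python
PASSING = 65  # assumed passing score for illustration

def score_to_use(attempts):
    """attempts: scores in chronological order from the counted
    administrations. Returns the score that counts for the measure."""
    used = []
    for score in attempts:
        used.append(score)
        if score >= PASSING:
            break  # after a pass, later retakes are not counted
    return max(used) if used else None

# The 80 came after the student passed with a 71, so it is not counted.
assert score_to_use([52, 60, 71, 80]) == 71

# No passing score yet: the higher of the attempts is used.
assert score_to_use([58, 63]) == 63
```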
Another growth measure for grades 9–12 principals is the GRE metric. Since a major graduation
requirement is for students to pass five Regents Exams (more for advanced Regents diplomas),
this measure compares how much progress a school’s students are making from one year to the
next toward passing up to eight Regents Exams (the five required Regents Exams plus up to
three more). A principal’s score on this measure reflects whether or not his or her students
exceed the average change in number of Regents Exams passed each year by similar students
statewide. On average, about 84 percent of students in a high school are included in the GRE
measure. Major reasons for not including students in a 9–12 school’s GRE measure include lack
of grades 7 or 8 State test scores and having already passed the maximum number of Regents
Exams used in this measure.
As noted, Regents Exams are offered multiple times each year and students take Regents Exams
at different points in their schooling. We include students and test scores using the following
rules:
• We count Regents Exam scores from the following administrations: August of prior year
(except for grade 9 students) and January and June of current year.
• Student scores are used until they pass. (After students pass, we do not want the measure
alone to encourage additional test taking, which may not be necessary.)
• If a student takes a Regents Exam more than once during the year, we use the higher test
score until that student receives a passing score.
• Five required Regents Exams, and no more than three additional exams, are counted. The
scores for students who exceed eight Regents Exams passed are NOT included in a
principal’s results.
• The State’s modified passing score rules for students with disabilities are used to
determine passing for these students.
• All students who meet the minimum enrollment requirement (i.e., students who are
enrolled on BEDS day and at the beginning of the June Regents administration) are
included in determining a school’s score whether or not they take a Regents Exam during
the year.
• Students are included for up to eight years after first entering grade 9. (We want to
acknowledge schools that keep students beyond four years in high school to complete
graduation requirements.)
• Students who drop out are counted in the school from which they dropped out until they
have reached their fourth year since entering grade 9, starting with the 2012–2013 school
year. Students who dropped out prior to the 2012–2013 school year are not counted.
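The comparison at the heart of the GRE measure can be sketched as a student-level difference: the change in the number of Regents Exams passed this year, minus the average change for similar students statewide. The function name and the idea that the statewide expectation enters as a simple subtraction are our simplification of the model described later in this report.

```python
def gre_student_score(passed_before, passed_now, expected_change):
    """Sketch of one student's contribution to a school's GRE measure.

    expected_change is the average year-over-year change in exams passed
    for similar students statewide (a model-produced quantity).
    """
    actual_change = passed_now - passed_before
    return actual_change - expected_change

# A student who went from 2 to 4 exams passed, against an expected
# change of 1.3, contributes a positive (above-expectation) score.
result = gre_student_score(2, 4, 1.3)
```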
Demographics
The results of growth models are used to measure the effects of educators on student learning
gains, taking into account a student’s prior achievement; however, some factors outside of an
educator’s control may impact student learning gains. For example, different learning trajectories
are often statistically related to students living in poverty, beyond what would be expected based
only on the student’s prior achievement.
For all growth measures used in New York State for educator evaluation, students are always
compared to similar students in the State. That is, in computing student-level growth, we always
assess a student’s progress relative to students with a similar academic history and other defined
characteristics.
NYSED reports unadjusted growth scores that include only prior achievement as predictor
variables and adjusted growth scores including the full list of approved predictor variables.
Unadjusted scores are reported for informational purposes and are used for school accountability
in grades 4–8. In this report, the terms “SGP” and “MGP” refer to adjusted scores, including all
predictor variables, unless specifically identified as unadjusted.
Both student and classroom or school-level characteristics are included in growth measures used
for educator evaluation for 2012–2013 (and will be included in 2013–2014). Table 1 provides a
complete list of the factors included in 2012–2013. Note that additional factors were included in
2012–2013 models compared to 2011–2012. For instance, in 2012–2013, we account for whether
a student is an ELL student, and we also account for the percentage of ELL students in a school.
In 2011–2012, only individual student ELL status was included as a factor. This type of school-
level factor is intended to take peer effects into account, acknowledging that a student may have
a different growth trajectory in a classroom/course or school with many ELL students compared
to one with few ELL students.
Factors are the same for growth measures for teachers and principals of grades 4–8 as for
principals of grades 9–12, with a few additions for the high school context (e.g., we also account
for the total number of Regents Exams a student has passed at the time we measure growth).
Additional description of these variables follows Table 1.
Table 1. Variables Included in the Adjusted Models*
(Models covered: Grades 4–8 ELA and Math; Grades 9–12 Regents ELA, Regents Integrated
Algebra, and Comparative Growth in Regents Exams Passed)

Academic History Variables
• Prior year ELA scale score (student level)
• Two year prior ELA scale score if available (student level)
• Three year prior ELA scale score if available (student level)
• Prior year math scale score (student level)
• Two year prior math scale score if available (student level)
• Three year prior math scale score if available (student level)
• Retained in grade (student level)
• Mean prior score (aggregate level)
• Range around mean prior score (aggregate level)
• New to school in non-articulation year (student level)
• Number of years since entering ninth grade (student level)**
• Count of prior required Regents passed (student level)
• Count of prior required and other Regents passed (student level)

Students with Disabilities Variables
• Student with Disability (SWD) status (student level)
• SWD student is in the general education classroom < 40 percent of the time (student level)
• Percent SWD (aggregate level)

English Language Learner Variables
• ELL status (student level)
• Percent ELL (aggregate level)
• NYSESLAT scores (student level)

Economically Disadvantaged Variables
• ED status (student level)
• Percent ED (aggregate level)

* Aggregate variables are computed at the class/course level for grades 4–8 and at the school level for
grades 9–12.
** GRE models are estimated separately by cohort (based on number of years since entering grade 9).
In addition to prior achievement/academic history, the rules of the Board of Regents provide that
three specific types of characteristics be included in the growth model to produce adjusted scores
(ELL status, SWD status, and poverty status). In 2011–2012, these characteristics were included
at the student level only. In 2012–2013, the growth model was enhanced to include additional
factors that are related to or derived from these characteristics. These characteristics, which are
described in more detail here, were selected after discussion with the Regents Task Force and
other stakeholders.
Academic History Variables
Prior Achievement Scores
o For grades 4–8 growth measures, up to three years of prior achievement scores in
the same subject are included (except for grades 4 and 5, where fewer years of
data are available). Students without scores from the immediate prior grade level
in the immediate prior year are excluded from analysis. In addition, the immediate
prior grade level score in the other subject (for ELA models, the math score; for
math models, the ELA score) is included if available.
o For grades 9–12 growth measures, scores from grade 7 and grade 8 assessments
(if available) in ELA and math are used as predictors. For the MGP measure,
students must have at least one score from grade 7 or grade 8 in the same subject
(for Integrated Algebra Regents models, from the grade 7 or grade 8 math test; for
the ELA Regents models, from the grade 7 or grade 8 ELA test). For the
Comparative Growth in Regents Exams Passed measure, to be included in
analysis, students must have at least one grade 7 or grade 8 score in either math or
ELA.
Retained in grade (grades 4–8 growth measures only). This is a yes/no variable that
indicates whether a student was retained in grade in one of the two years preceding the
most recent school year for students above grade 4 (for example, if a student was in
grade 5, grade 5 again, and then grade 6). Since students must have an immediate prior
score from the prior grade, students who were retained in grade between 2011–2012 and
2012–2013 are not included in the model (for example, students with data from grade 5 in
2010–2011, grade 6 in 2011–2012, and grade 6 in 2012–2013). This variable is computed
based on students’ tested grades in the assessment score file.
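The retained-in-grade computation from tested grades can be sketched as follows. The function name and the three-year, oldest-first list layout are our assumptions for illustration.

```python
def retained_in_grade(tested_grades):
    """tested_grades: tested grade levels from the assessment score file,
    oldest first, e.g. [5, 5, 6] for 2010-11, 2011-12, and 2012-13.

    Flags a repeat in one of the two years preceding the most recent
    school year (students retained between the two most recent years are
    excluded from the model entirely, so that case never reaches here).
    """
    if len(tested_grades) < 3:
        return False
    return tested_grades[0] == tested_grades[1]

assert retained_in_grade([5, 5, 6])      # grade 5, grade 5 again, grade 6
assert not retained_in_grade([4, 5, 6])  # normal grade progression
```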
Mean prior score. This variable is intended to account for differences in learning
environments that are made up of students with disparate levels of incoming
achievement.
o For grades 4–8 growth measures, the average prior same-subject achievement on
the State test of all students attributed to a teacher in the current year is included
in the model. (For example, the average prior ELA achievement of all students in
a teacher’s class/course is included in ELA models.)
o For grades 9–12 growth measures, the average grade 8 achievement of the school's
students (when they were in grade 8) is included in each model. For the MGP
measure, the school-level average grade 8 achievement in the same subject (for
Integrated Algebra Regents models, the grade 8 math test; for the ELA Regents
models, the grade 8 ELA test) is used. For the Comparative Growth in Regents
Exams Passed measure, school-level average grade 8 achievement in math and ELA is
used (computed as a standardized average).
Range around mean prior score. Schools and classrooms/courses with the same
average prior score may differ in the range of prior scores, and students may have
different growth trajectories based on being in schools or classrooms/courses with more
widely varying prior scores than those in more closely bunched prior scores. In other
words, students’ peers may affect students not only through their average ability but also
through the diversity of ability levels in the classroom/course. This group variable is an
indicator of the magnitude of difference in prior achievement in a teacher’s class/course,
calculated as the interquartile range of prior test scores—or the distance between the 25th
and the 75th percentile of prior performance in the class/course. This variable is
calculated using prior achievement scores in the same subject in a teacher’s class/course.
For example, for ELA models, the interquartile range of prior scores in ELA in a
teacher’s class/course is used in the model.
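This group variable reduces to the interquartile range of same-subject prior scores within a class/course. A minimal sketch using the standard library (the function name and the "inclusive" quantile convention are our choices; the operational percentile method may differ):

```python
import statistics

def prior_score_iqr(prior_scores):
    """Interquartile range of prior scale scores in a class/course:
    75th percentile minus 25th percentile."""
    q = statistics.quantiles(prior_scores, n=4, method="inclusive")
    return q[2] - q[0]

# Eight students with evenly spaced prior scores.
scores = [280, 290, 300, 310, 320, 330, 340, 350]
spread = prior_score_iqr(scores)  # 332.5 - 297.5 = 35.0
```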
New to school in non-articulation year (grades 4–8 growth measures only). This
student-level variable is intended to account for differences between students who enroll
in a school at a different grade level than the typical entering year for most students (for
example, a student who enrolls as a seventh-grader in a school that serves grades 6–8,
when most other students entered the school at grade 6). To compute this variable, a
grades served file was compared to a student’s tested school.
Years since entering ninth grade (grades 9–12 growth measures only). This variable is
intended to account for differences among students related to when they take Regents
Exams, rather than using a student’s grade level (since grade may be inconsistently
reported and Regents Exams are taken in many different grades). For example, a student
who takes the Integrated Algebra Regents Exams as an 11th grader has a different
academic history than a student who takes it as a ninth grader. This variable is used as an
alternative to the “retained in grade” variable used in grades 4–8 analysis as a way to
compare students with similar kinds of academic histories. To compute this variable, we
use the grade 9 entry date provided on a high school enrollment file.
Count of prior required Regents Exams (grades 9–12 MGP measures only). This
variable captures the number of Regents Exams in the five required subject areas that
students have passed before the current year (in this case, 2012–2013) for the ELA and
Integrated Algebra Regents models. To compute this variable, we review Regents
assessment score files back to 2005–2006.
Count of prior required and prior additional Regents Exams (grades 9–12
comparative growth in Regents Exams passed measure only). This variable captures the
number of Regents Exams in the five required subject areas and up to three additional
non-required Regents Exams that students have passed before the current year (in this
case, 2012–2013). To compute this variable, we review Regents assessment score files
back to 2005–2006.
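The two count variables above differ only in whether the cap of three additional non-required exams is applied. The sketch below illustrates the counting; the set of five required subject areas shown is a placeholder (the report does not enumerate them here), as are the function and variable names.

```python
# Placeholder labels for the five required subject areas (assumption).
REQUIRED = {"ELA", "Math", "Science", "US History", "Global History"}

def prior_exams_passed(passed_subjects, include_additional=False):
    """Count Regents Exams passed before the current year: up to five
    required exams, plus (for the GRE variable) up to three additional
    non-required exams."""
    required = min(len(passed_subjects & REQUIRED), 5)
    if not include_additional:
        return required
    additional = min(len(passed_subjects - REQUIRED), 3)
    return required + additional

passed = {"ELA", "Math", "Science", "Physics", "Chemistry",
          "Earth Science", "French"}
assert prior_exams_passed(passed) == 3         # required subjects only
assert prior_exams_passed(passed, True) == 6   # plus capped additional exams
```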
Students with Disabilities (SWD) Variables
Disability status (SWD). A yes/no variable for each student to indicate whether the student has
an individualized education program (IEP). This variable is derived directly from the assessment
score file, representing data that districts report to the State.
Students with disabilities spending less than 40 percent of their time in general
education settings. This variable is intended to account for differences among special
education students in terms of the intensity or type of services received. Per Individuals
with Disabilities Education Act (IDEA) requirements, students should be enrolled in the
“least restrictive environment” appropriate for their learning needs. This variable captures
students who spend less than 40 percent of their time in a general education setting (who
may have a disability requiring more specialized or intensive services). This variable is
derived directly from the assessment score file, representing data that districts report to
the State.
Percent SWD. This variable is intended to account for differences in the learning
environment for courses or schools serving diverse proportions of special education
students. The variable is defined as the percent of students identified as SWD in the
class/course for grades 4–8 growth measures and percent of students identified as SWD
in the school for grades 9–12 measures.
English Language Learner Variables
o ELL status. This is a yes/no variable for each student to indicate whether he or
she is an ELL student. This variable is derived directly from the assessment score
file, representing data that districts report to the State.
o NYSESLAT Listening/Speaking (LS) and Reading/Writing (RW) scores. This
variable is intended to account for differences in the English language proficiency
of students identified as ELL students by controlling directly for their prior year
NYSESLAT listening/speaking and reading/writing scores. For grades 9–12
models, NYSESLAT scores from grade 8 are used. Prior year NYSESLAT scores
are used in analysis. That is, for the 2012–2013 growth model, NYSESLAT
scores from 2011–2012 are used.2
o Percent ELL. This variable is intended to account for differences in the learning
environment for courses or schools serving diverse proportions of ELL students.
The variable is defined as the percent of students identified as ELL in the
class/course for grades 4–8 growth measures and percent of students identified as
ELL in the school for grades 9–12 measures.
Economic Disadvantage Variables
Economic disadvantage (poverty). A yes/no variable for each student to indicate
whether the student is identified as economically disadvantaged based on eligibility for a
variety of State economic assistance programs. This flag is set to yes for students whose
families participate in economic assistance programs, such as the free or reduced-price
lunch programs, Social Security Insurance, food stamps, foster care, refugee assistance,
the earned income tax credit, the Home Energy Assistance Program, Safety Net Assistance,
the Bureau of Indian Affairs, or Temporary Assistance for Needy Families, based on
district-provided information. This variable is derived directly from the test score file,
representing data that districts report to the State.

2 Note that in 2012–2013 the NYSESLAT assessment was changed. Going forward, only a single scaled score
result will be available. Separate listening/speaking and reading/writing scores will not be reported.
Percent poverty. This variable is intended to account for differences in the learning
environment for courses or schools serving diverse proportions of economically
disadvantaged students. The variable is defined as the percent of students identified as
economically disadvantaged in the class/course for grades 4–8 growth measures and
percent of students identified as economically disadvantaged in the school for grades 9–
12 measures.
Not all students have values on the incoming data for all variables. For the three main student-
level demographic characteristics (SWD, ELL, and ED), missing values are set to zero,
indicating that the student is not in the status. For all other factors, any time a factor is missing
for a student, the value is set to zero and a flag is created that indicates that the variable was
missing for that student. These missing factors are then also used as predictors in the growth
model.
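The missing-data handling described above can be sketched as follows: the three main demographic flags default to zero, while every other factor gets a zero fill plus a companion missing indicator that itself enters the model as a predictor. The names below are our own illustration.

```python
# The three main student-level demographic characteristics (per the report).
DEMOGRAPHICS = {"swd", "ell", "ed"}

def prepare_predictors(record, factors):
    """record: raw values per factor (None where missing).
    Returns model-ready values plus missing-indicator flags."""
    out = {}
    for f in factors:
        value = record.get(f)
        if f in DEMOGRAPHICS:
            # Missing demographic status: student treated as not in status.
            out[f] = value if value is not None else 0
        else:
            # Other factors: zero fill plus a missing-indicator predictor.
            out[f] = value if value is not None else 0.0
            out[f + "_missing"] = 1 if value is None else 0
    return out

row = prepare_predictors({"ell": 1, "prior_math_2yr": None},
                         ["ell", "swd", "prior_math_2yr"])
# swd defaults to 0; the missing prior score is zero-filled and flagged.
```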
Attribution Data and Weighting of Student Growth for Educators
Student-level growth scores are attributed to educators based on records of educational links
between the educators and the students. Several different data sources and procedures are used to
link students to teachers and principals of grades 4–8 and 9–12 and to determine the weighting of
each student's score for teachers, as described in the sections that follow (see also Appendix G).
Linking Students to Teachers of Grades 4–8
A critical element of growth analyses is the accurate identification of the courses students are
taking in which they learn the content and skills covered on the tests used to measure their
learning. Another critical element is identifying who is teaching those courses.
A first step is to identify which courses are considered “relevant”—that is, courses in which
instruction is provided that is aligned to the test being used to measure student growth. New
York has developed a common set of course codes across the State, and we used the courses
identified as “relevant” by the State for analysis. Appendix C provides a list of the item
descriptions used.
New York also provided data files showing student enrollment in courses and teacher assignment
to those courses. Students enrolled in relevant courses were attributed to the teacher(s) identified
as a teacher of record for that course.
Scores are provided at the course or subject level, meaning that teachers’ scores may reflect
multiple classrooms of students in the same content area. For example, a grade 7 math teacher
might provide instruction for several sections of grade 7 math.
The rule for teacher attribution has evolved over time to make use of increasingly detailed data
on student-teacher-course relationships and to better account for the time that students spend
with teachers. This topic was a major focus of the Regents Task Force after the 2011–2012
school year. In the beta analysis year (2010–2011), students were linked to teachers when there
was any record indicating the student was enrolled in a course taught by the teacher. In the first
year of growth model implementation (2011–2012), this rule was updated to require a student be
enrolled for at least 195 days (ELA) or 203 days (math) to be attributed to a teacher. For this year
(2012–2013), the rule was updated again to include more students by requiring that a student be
enrolled for at least 60 percent of a course with a teacher. Students who were enrolled for less than 60
percent of a course’s duration were not included in a teacher’s MGP. Students with course
enrollment of 60 percent or more were included in a teacher’s MGP, and their SGPs were
weighted based on the percentage of time the students were enrolled in and attended the course.
SGPs for students who were in a teacher’s course for longer periods of time and who attended
the class/course more regularly counted more heavily in a teacher’s MGP than those who were
enrolled and attended for less time.
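The 60 percent threshold and attendance weighting described above can be sketched as a weighted mean of SGPs. The exact form of the weight is our assumption (here, enrollment share times attendance share); the report states only that weights reflect time enrolled and attended.

```python
def teacher_mgp(students):
    """students: list of (sgp, enrollment_frac, attendance_frac) tuples.

    Students below 60 percent course enrollment are excluded; the rest
    contribute SGPs weighted by time enrolled and attended (assumed here
    to be the product of the two fractions).
    """
    weighted = [(sgp, enroll * attend)
                for sgp, enroll, attend in students
                if enroll >= 0.60]
    if not weighted:
        return None
    total = sum(w for _, w in weighted)
    return sum(sgp * w for sgp, w in weighted) / total

students = [(70, 1.00, 0.95),   # full year, strong attendance
            (40, 0.50, 0.90),   # below the 60 percent threshold: excluded
            (55, 0.80, 1.00)]   # partial enrollment, weighted down
mgp = teacher_mgp(students)
```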
There were approximately 47,000 unique teacher-school combinations with any student
relationships represented in the student-teacher-course linkage files, as well as about 4,000
unique schools, and more than 800 unique local education agencies (districts, BOCES, and
charter schools).3
Table 2 shows the linkage between students with at least two years of valid same-subject test
results and teachers. Note that students can have test scores in both ELA and math, so the count
of students with valid test data does not represent unique students, but rather student test scores.
Appendix B provides additional detail on data processing and validation steps.
Table 2. Grade 4–8 Teacher-Student Linkage Rates

Grade | Student Records with at Least Two Years of Valid Same-Subject Test Data | Student Records Also Meeting Minimum Enrollment Requirements for at Least One Teacher | Linkage Rate
4 | 373,458 | 356,462 | 95.4%
5 | 372,252 | 355,939 | 95.6%
6 | 371,806 | 342,286 | 92.1%
7 | 379,726 | 347,204 | 91.4%
8 | 375,243 | 342,366 | 91.2%
Total | 1,872,485 | 1,744,257 | 93.2%
Overall, only 7 percent of student scores with sufficient data for inclusion in the model could not
be linked to a teacher (i.e., 93 percent of valid test scores were linked to at least one teacher).
This is an improvement of 10 percentage points from 17 percent in 2011–2012. (Note that the
linkage rate is not expected to be 100 percent, since students may move within and across
schools and teacher assignments may also change.)
3 Note that “teacher-school” refers to a teacher in a school. This is different from a unique individual teacher in that
a teacher may teach at several schools and so be represented as more than one “teacher-school.”
School and District Linkages in Grades 4–8
In both 2011–2012 and 2012–2013, students are linked to schools and districts based on a
continuous enrollment indicator found in the assessment files. This variable describes whether or
not a student was enrolled at the start and end of the year in a school or district (on BEDS day
and at the beginning of the State test administration in the spring). The same indicator is used for
institutional accountability purposes. Note that student results were not weighted by attendance
in determining a principal’s MGP and growth score. The policy rationale for not using
attendance weighting for principals (although it is used for teachers) is that principals may have
more influence on student attendance, and on the integrity of attendance data, than do teachers.
As a result of the difference in data sources and indicators used to attribute students to teachers,
principals, and districts, students can be linked to a district or a school but not a teacher, and in
rare cases, vice versa. Tables 3 and 4 show linkage rates for schools and districts.
The linkage rates at the school and district levels are higher than at the teacher level, with a 98.1
percent student test score-school linkage rate and 98.5 percent student test score-district linkage
rate. These linkage rates represent about a 3 percentage point increase in students linked to a
school from the 2011–2012 school year; however, some of this improvement is due to removing
non-reportable schools (such as non-public schools not subject to the APPR process) from the
original data files provided to AIR.
Table 3. Grades 4–8 School-Student Linkage Rates

Grade | Student Scores with Valid Data | Student Scores with Valid Data Linked to Schools | Linkage Rate
4 | 373,458 | 365,475 | 97.9%
5 | 372,252 | 364,851 | 98.0%
6 | 371,806 | 364,645 | 98.1%
7 | 379,726 | 372,660 | 98.1%
8 | 375,243 | 369,013 | 98.3%
Total | 1,872,485 | 1,836,644 | 98.1%
Table 4. Grades 4–8 District-Student Linkage Rates

Grade | Student Scores with Valid Data | Student Scores with Valid Data Linked to Districts | Linkage Rate
4 | 373,458 | 367,479 | 98.4%
5 | 372,252 | 366,282 | 98.4%
6 | 371,806 | 366,282 | 98.5%
7 | 379,726 | 374,141 | 98.5%
8 | 375,243 | 370,239 | 98.7%
Total | 1,872,485 | 1,844,423 | 98.5%
A small proportion of the teachers, schools, and districts represented in the data files have no
students associated with them (i.e., no students meet the minimum enrollment requirements).
Table 5 shows the number of unique teacher/schools, schools, and districts in the data files, and
the numbers with at least one student linked to them.
Table 5. Number of Unique Grades 4–8 Teacher-Schools, Schools, and Districts with
Linked Students

Level | Number in Data Files | Number with at Least One Student Linked | Percentage with at Least One Student Linked
Teachers | 46,762 | 44,343 | 94.8%
Schools | 3,641 | 3,525 | 96.8%
Districts4 | 934 | 870 | 93.1%
Linking Students to Principals of Grades 9–12
In 2012–2013, new measures of student growth for principals of grades 9–12 were implemented.
Students in grades 9–12 are linked to schools and districts based on a continuous enrollment
indicator created from a school enrollment file. Using school entry and exit dates, the indicator
describes whether or not a student was enrolled at the start and end of the year in a school or district
(on BEDS day and at the beginning of the June Regents Exam administration on June 11, 2013).
Students who were enrolled at these two points in time in a given school or district are attributed
to that school or district. This rule is similar to that used for principals of grades 4–8, although
the sources of data used to implement the rule are somewhat different.5 For grade 9–12 models,
students are linked to districts when they are linked to schools.
School and District Linkages in Grades 9–12
Table 6 shows linkage rates for both the MGP and GRE models. For the MGP models (based on
ELA and Integrated Algebra Regents Exams), students are described as having valid data when
they have a current year score, at least one valid grade 7 or 8 assessment in the same subject
(math for algebra and ELA for ELA), and did not pass that Regents Exam in a prior year. Nearly
99 percent of student scores with valid data are linked to schools.
For the GRE model, students are described as having valid data when they are enrolled at a
school in grades 9–12 for any amount of time. Again, note that any students linked to a school
are also linked to the associated district; therefore, the linkage rates are the same.
4 The number of districts includes 209 districts consisting of only a single charter school.
5 For grades 4–8, NYSED provides an indicator (the school_in flag) of student enrollment/attribution for schools.
For grades 9–12, AIR calculated a similar variable directly from enrollment data.
Table 6. Grades 9–12 School-Student Linkage Rates

Model | Student Scores (ELA and Algebra) or Students (GRE) with Valid Data | Number Linked to Schools and Districts | Linkage Rate
ELA | 176,911 | 175,009 | 98.9%
Algebra | 162,031 | 158,755 | 98.0%
GRE | 610,987 | 602,026 | 98.5%
A small proportion of the schools and districts represented in the data files have no students
associated with them (i.e., no students meet the minimum enrollment requirements). Table 7
shows the number of schools and districts in the data files, and the numbers with at least some
students linked to them. The one district not included in Table 7 is District 75 in New York City.
Further analysis will be done to determine if and how to include these schools with specialized
high school programs for students with disabilities in future years.
Table 7. Number of Grades 9–12 Schools and Districts with Linked Students

            Number in         Number with at Least   Percentage with at Least
            Incoming Files    One Student Linked     One Student Linked
Schools     1,117             1,077                  96.4%
Districts   693               692                    99.9%
MODEL
In 2012–2013, two different types of models were used to produce growth measures in New
York State. The first is the MGP model. Statistically, this is the same model that was used to
produce growth measures for teachers and principals of grades 4–8 in 2011–2012 (although
additional variables were included, as described in the previous section).
As described earlier in this report, beginning in 2012–2013, New York State principals of grades
9–12 also receive growth scores describing how much students in their schools are growing
academically in algebra and ELA and how well students are progressing toward passing the
Regents Exams required for graduation and college and career readiness. To produce scores
based on ELA and Integrated Algebra Regents Exams, the same basic MGP model that was used
for grades 4–8 is used with some differences in factors based on differences in the high school
environment. To produce scores describing how well students are progressing toward passing
Regents Exams, a new model was implemented. This second model is referred to as the
Comparative Growth in Regents Exams Passed (Growth in Regents Exam or GRE model). These
two models are described in detail in the sections that follow.
MGP Model
In this section we describe the statistical model used to measure student growth in New York
between two points in time on a single subject of a State assessment. We begin with a description
of the statistical model used to form the comparison point against which students are
measured (based on similar students) and follow with a description of how student growth
percentiles (SGPs) are derived from the comparison point and its dispersion as produced by the
model. In addition, we describe how mean growth percentiles (MGPs) and all variance estimates
are produced.
At the core of the New York growth model is the production of an SGP. This is a statistic that
characterizes the student’s current year score relative to other students with similar prior test
score histories. For instance, an SGP equal to 75 denotes that the student’s current year score is
the same as or better than 75 percent of the students in the data with prior test score histories and
other measured characteristics that are similar. It does not mean that the student’s growth is better
than that of 75 percent of all other students in the population.
One common approach to estimating SGPs is to use a quantile regression model (Betebenner,
2009). This approach models the current year score as a function of prior test scores and finds the
SGP by comparing the current year score to the predicted values at various quantiles of the
conditional distribution.
The methods described here do not rely on the quantile regression method for two reasons. First,
the typical implementation of the quantile regression makes no correction for measurement
variance in the predictor variables or in the outcome variable. It is known that ignoring the
measurement variance in the predictor variables yields bias in the model coefficients (e.g., Wei
& Carroll, 2009). Further complicating the issue, the measurement variance in the outcome
variable also adds to the bias in a quantile regression (Hausman, 2001), an issue that does not
occur with linear regression.
The model described in this section is designed to account for measurement variance in the
predictor variables, as well as in the outcome variable, to yield unbiased estimates of the model
coefficients. Subsequently, these model coefficients are used to form a predicted score, which is
ultimately the basis for the SGP. Because the prediction is based on the observed score, it is
necessary to account for measurement variance in the prediction as well. Hence, the model
accounts for measurement variance in two steps: first in the model estimation and second in
forming the prediction.
Covariate Adjustment Model
The statistical model implemented as the MGP model is typically referred to as a covariate
adjustment model (McCaffrey, Lockwood, Koretz, & Hamilton, 2004), as the current year
observed score is conditioned on prior levels of student achievement as well as other possible
covariates.
In its most general form, the model can be represented as:

y_ti = X_i β + Σ_r γ_r y_(t−r)i + Σ_q z_qi θ_q + ε_ti                [1]

where y_ti is the observed score at time t for student i, X_i is the model matrix for the student-
and school-level demographic variables, β is a vector of coefficients capturing the effect of any
demographics included in the model, y_(t−r)i is the observed lag score at time t−r (r ∈ {1, …, R}),
γ is the coefficient vector capturing the effects of lagged scores, and z_qi is the q, i element of a
design matrix Z_q with one column for each unit in q (q ∈ {1, …, Q}) and one row for each student
record in the database. The entries in the matrix indicate the association between the test
represented in the row and the unit (e.g., school, teacher) represented in the column. We often
concatenate the sub-matrices such that Z = {Z_1, …, Z_Q}. θ_q is the qth element of a vector of
effects for the units within a level. In New York, it represents the vector of school or teacher
random effects, for which we assume θ_q ~ N(0, σ²_q) for each level of q.
Corresponding to Z = {Z_1, …, Z_Q}, we define θ = {θ_1, …, θ_Q}. In the subsequent sections, we
use the notation y = {y_ti}, X = {X_i}, and Z = {Z_q} to simplify computation and
explanation.
Accounting for Measurement Variance in the Predictor Variables
All test scores are measured with variance, and the magnitude of the variance varies over the
range of test scores. The standard errors (variances) of measurement are referred to as
conditional standard errors of measurement (CSEMs) since the variance of a score is
heteroscedastic and depends on the score itself. Figure 1 shows a sample from the grade 8 ELA
test in New York.
Figure 1. Conditional Standard Error of Measurement Plot (Grade 8 ELA, 2010–2011)
Treating the observed scores as if they were the true scores introduces a bias in the regression,
and this bias cannot be ignored within the context of a high-stakes accountability system
(Greene, 2003). In test theory, the observed score is described as the sum of a true score plus an
independent variance component, X = X̃ + E, where E is a matrix of unobserved disturbances
with the same dimensions as X̃.
Our estimator accounting for the error in the predictor variables is derived in a manner similar to
Goldstein (1995). The estimator is presented below with a complete theoretical derivation
provided in Appendix D.
Using Henderson’s notation (1953), we define the expected values for the mixed model as:

( X′X   X′Z          ) ( β̂ )   ( X′y )
( Z′X   Z′Z + σ²G⁻¹ ) ( θ̂ ) = ( Z′y )

And taking the expectations shown in Appendix D, we arrive at the following estimator using the
observed scores:

( β̂ )   ( X′X − C   X′Z          )⁻¹ ( X′y )
( θ̂ ) = ( Z′X       Z′Z + σ²G⁻¹ )    ( Z′y )

where C is a diagonal “correction” matrix with dimensions p × p accounting for measurement
variance in the predictor variables, X, and p is the column dimension of X.
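The attenuation the correction addresses can be illustrated with simulated data. This is a minimal sketch with a single predictor and fixed effects only, assuming a known, constant measurement variance; the report's model uses score-dependent CSEMs and the full mixed-model estimator, so the scalar `correction` below stands in for the matrix C.

```python
import random

random.seed(0)
n = 20000
sem2 = 0.25          # assumed measurement variance (CSEM^2) of the prior score
beta0, beta1 = 2.0, 0.8

x_true = [random.gauss(0.0, 1.0) for _ in range(n)]
x_obs = [x + random.gauss(0.0, sem2 ** 0.5) for x in x_true]   # observed = true + error
y = [beta0 + beta1 * x + random.gauss(0.0, 0.5) for x in x_true]

def slope(xs, ys, correction=0.0):
    """Simple-regression slope, with an optional errors-in-variables
    correction subtracted from the X'X term (the matrix C, reduced to
    a scalar for one predictor)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (yy - my) for x, yy in zip(xs, ys))
    return sxy / (sxx - correction)

naive = slope(x_obs, y)                 # attenuated toward zero
corrected = slope(x_obs, y, n * sem2)   # subtracts summed measurement variance
```

The naive slope is biased toward zero (here roughly 0.8/1.25 = 0.64), while the corrected slope recovers the true coefficient of 0.8 up to sampling noise.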
Specification for MGP Model for Grades 4–8
The preceding section provides details on the general modeling approach and specifically how
measurement variance is accounted for in the model. The exact specification for the New York
model is described as:
y_gi = β₀ + Σ_{r=1}^{3} γ_r y_(g−r)i + Σ_m δ_m M_mi + Σ_k β_k X_ki + θ_s + θ_t + ε_i

where y_gi is the current year test scale score for student i in grade g; β₀ is the intercept; γ is the
set of coefficients associated with the three prior test scores; δ is the set of coefficients
associated with the missing variable indicators; β is the set of coefficients associated with the
student-level measured characteristics (which are described in the section on similar students);
and θ_s, θ_t, and ε_i are the school, teacher, and student random effects.
The model is implemented separately for each grade and subject. There are also two model runs.
The “adjusted” model is the model as described above. The “unadjusted” model is simply a
special case of the adjusted model that does not contain any fixed effects (such as the ELL
status) except prior test scores and missing indicators for the two- and three-year prior scores.
In all models, special procedures are used to adjust standard errors of measurement. These
procedures are described in Appendix E.
Specification for the MGP Model for Grades 9–12
The MGP model for grades 9-12 is implemented somewhat differently than the model used for
grades 4–8 in that prior Regents Exam scores are not themselves used as predictors (although the
number of Regents Exams passed prior to the outcome year is used as a predictor). For the MGP
model used for grades 9–12, scaled scores from assessments taken before grade 9 are used as
predictors. In addition, the MGP model for grades 9–12 does not fit random effects, since they
are not needed to generate SGPs and MGPs.6 This type of model is a special case of the grades
4–8 model where the teacher and principal random effects are zero. The specification for the
model is:
y_si = β₀ + Σ_r γ_r y_ri + Σ_m δ_m M_mi + Σ_k β_k X_ki + ε_i

where y_si is the Regents Exam scale score for student i in subject s; β₀ is the intercept; γ is the
set of coefficients associated with the grades 7 and 8 test scores and is estimated with an
error-in-variables approach; δ is the set of coefficients associated with the missing variable
6 Random effects were fit for the 4-8 model in 2011-2012 in order to allow for possible transition to reporting
metrics other than SGPs and MGPs, but a decision was made to maintain those metrics going forward. The 2012-
2013 grades 4-8 model maintained the use of random effects, but since 2012-2013 was the first implementation of
the grades 9-12 MGP measures, random effects were excluded from implementation in order to improve efficiency
of model estimation.
indicators; β is the set of coefficients associated with the student-level measured characteristics
(which are described in the section on similar students); and ε_i is the student random effect.
The model is implemented separately for each subject. There are also two model runs. The
“adjusted” model is the model as described above. The “unadjusted” model is simply a special
case of the adjusted model that does not contain any fixed effects except prior test scores and
missing indicators for the prior scores.
Student Growth Percentiles
The previously described regression model yields unbiased estimates of the fixed effects by
accounting for the measurement error in the observed scores. The resulting estimates of the fixed
effects are then used to form the student-level SGP statistic. For purposes of the growth model, a
predicted value and its variance for each student are required to compute the SGPs as:
SGP_i = Φ( (y_i − ŷ_i) / √Var(ŷ_i) ) × 100

where y_i is the observed value of the outcome variable, x_i is the ith row of the model matrix
used to form the prediction ŷ_i, and the notation Var(ŷ_i) is used to mean the variance of the
predicted value of y for the ith student.
Here the regression is of the form:

y_i = x_i β + ε_i

where:

ŷ_i = x_i β̂

The classic variance of a prediction is, for this case:

Var(ŷ_i) = σ̂²_ε [1 + x_i (X′X)⁻¹ x_i′]

where σ̂²_ε is the residual variance of the regression. However, in this case, we make two
refinements to acknowledge the effect of measurement error on the residual variance. The first is
to use the actual variance on y_i, called σ²_yi, rather than the population variance on y, called σ̄²_y,
which is already included in σ̂²_ε. This is done by subtracting the population variance and adding
back the individual variance. Thus, the variance on the prediction becomes:

Var(ŷ_i) = [σ̂²_ε − σ̄²_y + σ²_yi][1 + x_i (X′X)⁻¹ x_i′]

The second refinement is to replace the population variance in x, called σ̄²_x, with the individual
variance in x_i, called σ²_xi. This is done in the same way as the variance in y, so the variance
estimate is now:
Var(ŷ_i) = [σ̂²_ε − σ̄²_y + σ²_yi − σ̄²_x + σ²_xi][1 + x_i (X′X)⁻¹ x_i′]
There is then a predicted value for each student that is used to compute the SGP. However, that
prediction is based on the estimates of the fixed effects, which were corrected for measurement
variance, but also on the observed scores in the vector x_i.
Figure 2 below provides an illustration of how the SGPs are found from the previously described
approach. The illustration considers only a single predictor variable although the concept can be
generalized to multiple predictor variables, as presented above.
For each student, we find a predicted value conditional on his or her observed prior scores and
the model coefficients. To illustrate the concept, assume we find the prediction and its variance
but do not account for the measurement variance in the observed scores used to form that
prediction. We would form a conditional distribution around the predicted value and find the
portion of the normal distribution that falls below the student’s observed score. This is equivalent
to:
SGP_i = ∫_{−∞}^{y_i} f(t; ŷ_i, Var(ŷ_i)) dt

with f the normal density, although this is readily accomplished via the cumulative normal
distribution function, Φ.
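The SGP computation from an observed score, a prediction, and a prediction standard deviation can be sketched as follows; the score values are invented for illustration.

```python
from math import erf, sqrt

def sgp(observed, predicted, pred_sd):
    """SGP as the percent of the conditional normal distribution falling
    at or below the observed score: 100 * Phi((y - yhat) / sd)."""
    z = (observed - predicted) / pred_sd
    return 100.0 * 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Same observed score and prediction; widening the prediction SD pulls
# the SGP toward 50, mirroring the Figure 2 vs. Figure 3 contrast below.
tight = sgp(680.0, 660.0, 15.0)   # narrow conditional density -> high SGP
wide = sgp(680.0, 660.0, 40.0)    # wide conditional density -> SGP nearer 50
```

A student whose observed score equals the prediction exactly receives an SGP of 50 regardless of the prediction variance.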
Figure 2. Sample Growth Percentile from Model
Figure 3 below illustrates the same hypothetical student as above. Note that the observed score
and predicted value are exactly the same. However, the prediction variance is larger than in
Figure 2 above. As a result, when we integrate over the normal from −∞ to y_i, the SGP is 60
and not 90 as in the example above. This occurs because the conditional density curve has
become more spread out, reflecting less precision in the prediction.
Figure 3. Sample Growth Percentile from Model
Mean Growth Percentiles
Once the SGPs are estimated for each student, group-level (e.g., teacher-level) statistics can be
formed that characterize the typical performance of students within a group. The NYSED growth
model Technical Advisory Committee recommended using the mean SGP when providing
educator scores. Hence, group-level statistics are expressed as the mean SGP within a group.
This is referred to as the MGP or mean growth percentile.
For each aggregate unit j (j ∈ {1, …, J}), such as a class/course, the interest is a summary
measure of growth for students within this group. Within group j we have
SGP_j = {SGP_1j, …, SGP_Nj}. That is, there is an observed SGP for each student within group j.
Then the MGP for unit j is produced as (grade 4–8 and grade 9–12 principals):

MGP_j = mean(SGP_j) = (1/N) Σ_i SGP_ij

and using the weighted mean (grade 4–8 teachers only):

MGP_j = Σ_i w_ij SGP_ij / Σ_i w_ij

where w_ij is the weight for student i in teacher j’s class/course.
Like all statistics, the MGP is an estimate, and it has a variance term. For New York, AIR
provides the following measures of variance for the MGP.
The analytic standard error of the unweighted MGP (principals) is computed within unit j as:

se(MGP_j) = s_j / √N

and in the weighted case (teachers):

se(MGP_j) = s_j √(Σ_i w²_ij) / Σ_i w_ij

where s_j is the sample standard deviation of the SGPs in group j and N is the number of
students in group j.
The confidence intervals were computed for MGPs using the analytic standard errors based on
the t-distribution. In the prior year, AIR used a bootstrap method to compute the standard errors
because the analytic standard error has two theoretical limitations. First, MGPs are bounded
between 1 and 99; hence, the standard errors cannot be used to form confidence limits around the
MGP because the confidence limits must be asymmetric. Second, the standard errors do not
account for potential non-normality of the distribution of the SGPs. To improve efficiency, AIR
compared confidence intervals created through a bootstrap and analytic procedure and found that
they were nearly identical and that the theoretical criticisms of the analytic standard errors did
not pose problems in practice. Therefore, results were computed with analytic standard errors in 2012–
2013.
Combining Growth Percentiles Across Grades and Subjects
Many teachers and principals serve students from different grades and with results from different
tested subjects. For evaluation purposes, there is a need to aggregate these SGPs and form
summary measures.
Because the SGPs are expressed as percentiles, they are free from scale-specific inferences and
can be easily combined. For any aggregate-level statistics to be provided (in this case, MGPs),
we simply pool all SGPs of relevant students and find the average of the pooled SGPs. In the
case of grades 4–8 teachers, the average is a weighted average as described earlier. Variances of
these MGPs are found using the same methods described above. More detail on reported scores
can be found in the Reporting section.
Comparative Growth in Regents Exams Passed (GRE) Model
For this model, the outcome of interest is the number of Regents Exams that a student passes for
the first time in the outcome or current year (in this case, 2012–2013). Educators whose students
pass more Regents Exams in a year than similar students will have higher scores on this metric
than other educators. For this model, Regents Exams in the five required subject areas and up to
three additional Regents Exams (for a total possible of eight Regents Exams for each student) are
counted. Once a student has passed eight Regents Exams, he or she is excluded from the model.
Since the outcome can only take on positive integer values and is bounded by a minimum (a
student can never pass fewer than zero Regents Exams in a year) and a maximum (a student can
never have more than eight Regents Exams passed in a year), an ordered logit model is
implemented. The model is fit separately for each cohort of students (students who entered grade
9 one year ago, two years ago, and so on) for years 1, 2, 3, and 4. Students who entered grade 9
more than four years ago are aggregated into a single fifth run.
The linear part of the model is:

y*_i = X_i β^c

where X includes the variables named in the definition of similar students as well as an intercept
term, y*_i is the latent variable that dictates the number of Regents Exams a student passes, β^c is
the vector of fitted parameters for the variables in X, the superscript c is used to indicate that the
coefficients depend on the cohort, and the subscript i is used to indicate that y* and X are specific
to an individual student.
From this, the logistic function and a series of cut points are used to map y*_i to the outcome space,
generating an estimated fraction of the time that zero through eight Regents Exams were passed
by similar students. The fraction of similar students passing a particular number of Regents
Exams is then given by:

P(Y_i = k) = Λ(α_{k+1} − y*_i) − Λ(α_k − y*_i)

where Y_i is the number of Regents Exams passed this year, Λ is the logistic function, and the α_k
are fitted cut points7 between having passed k−1 and k Regents Exams (with the boundary cut
points taken as −∞ and +∞).
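The mapping from the latent index to category probabilities can be sketched as follows. The cut-point values are invented, and for brevity the sketch uses five categories rather than the model's nine (zero through eight exams).

```python
from math import exp

def logistic(t):
    return 1.0 / (1.0 + exp(-t))

def pass_count_probs(latent, cuts):
    """Ordered-logit probabilities over 0..len(cuts) exams passed.

    P(Y = k) = logistic(cut[k+1] - latent) - logistic(cut[k] - latent),
    with the boundary cut points treated as -inf and +inf.
    """
    cdf = [logistic(a - latent) for a in cuts] + [1.0]   # P(Y <= k) boundaries
    return [cdf[0]] + [cdf[k] - cdf[k - 1] for k in range(1, len(cdf))]

# Invented cut points separating 0/1, 1/2, 2/3, and 3/4 exams passed.
probs = pass_count_probs(0.5, [-1.0, 0.5, 2.0, 3.5])
```

Because the cut points are sorted, the probabilities are nonnegative and sum to one; a larger latent index shifts probability mass toward higher pass counts.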
This set of nine values is then collapsed into the average number of Regents Exams similar
students passed this year using:

Ê_i = Σ_{k=0}^{8} P(Y_i = k) · min(k, 8 − p_i)

where Ê_i is the estimated number of Regents Exams passed by similar students and p_i is the
number of Regents Exams passed at the initiation of this school year. In the above equation, the
first term represents the probability of a similar student having passed k Regents Exams this
year, and the second term multiplies that probability by k. A min function is included in the
second term that imposes a ceiling on the number of Regents Exams passed this year,
acknowledging that the total number passed this year plus the number that had been passed at the
beginning of this year (p_i) cannot exceed eight.
Finally, values of Ê_i that are larger than two are set to two, because to meet a projection larger
than two Regents Exams per year, students would have to complete the eight Regents Exams
counted in this model on a schedule faster than eight Regents Exams over four years. Since
NYSED did not wish to encourage unnecessary Regents Exam-taking, this cap on projected
Regents Exams was applied.
7 These are sometimes also called intercepts.
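The collapse into an expected count, with both ceilings applied, can be sketched as follows; the probability distribution is invented for illustration.

```python
def expected_new_passes(probs, passed_before):
    """Expected Regents Exams passed this year by similar students.

    probs[k] is the probability of passing exactly k exams this year.
    Each count k is capped at 8 - passed_before (the eight-exam total
    ceiling), and the projection itself is capped at two exams per
    year, per the report's rule.
    """
    e_hat = sum(p * min(k, 8 - passed_before) for k, p in enumerate(probs))
    return min(e_hat, 2.0)

# Invented distribution over passing 0..3 exams this year.
probs = [0.1, 0.3, 0.4, 0.2]
```

For a student who has already passed seven exams, every count above one is truncated to one, so the expectation reflects that at most one more exam can count.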
Using this approach, each student has an actual number of Regents Exams that he or she passed
(y_i) and a number passed by similar students (Ê_i); the latter is subtracted from the former to find
a student-level comparative growth in Regents Exams passed (GRE):

GRE_i = y_i − Ê_i

A school’s score is then the mean GRE (or MGRE) for students attributed to that school:

MGRE_j = (1/n_j) Σ_i GRE_i

The standard error is found by taking the sample standard deviation of the student GREs. Thus
the variance estimate is:

Var(MGRE_j) = (1/(n_j(n_j − 1))) Σ_i [GRE_i − MGRE_j]²

and the standard error is the square root of that. Confidence intervals are formed from the
variances and point estimates in the same way as they were for MGPs.
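The school-level aggregation can be sketched as follows; the actual and expected counts are invented for illustration.

```python
from math import sqrt

def mgre(actual, expected):
    """Mean comparative growth: mean of (actual - expected) per student."""
    gre = [a - e for a, e in zip(actual, expected)]
    return sum(gre) / len(gre)

def se_mgre(actual, expected):
    """SE of the MGRE: sample SD of the student GREs over sqrt(n)."""
    gre = [a - e for a, e in zip(actual, expected)]
    n = len(gre)
    m = sum(gre) / n
    s = sqrt(sum((g - m) ** 2 for g in gre) / (n - 1))  # sample SD
    return s / sqrt(n)
```

A positive MGRE means students attributed to the school passed more Regents Exams, on average, than similar students statewide.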
REPORTING
Results of the New York growth models are reported to districts in a series of data files as well
as through an online reporting system accessible to teachers, principals, and district
administrators.
Reporting for Teachers and Principals of Grades 4–8
The main reporting metrics generated for teachers and principals of grades 4–8 are:
Number of Student Scores. The number of SGPs included in an MGP.
Unadjusted MGP (Principal). The mean of the SGPs for students in the school is based
on similar prior achievement scores only, without taking into consideration ELL, SWD,
or economic disadvantage student characteristics.
Unadjusted MGP (Teacher). The weighted mean of the SGPs for students who are
linked to a teacher is based on similar prior achievement scores only, without taking into
consideration ELL, SWD, or economic disadvantage student characteristics. The
weighted mean is calculated based on the amount of time students were enrolled in and
attended a course with a teacher.
Adjusted MGP (Principal). Adjusted MGP is the mean of the SGPs for students linked
to a principal, based on similar prior achievement scores, and includes consideration of
ELL, SWD, and economic disadvantage student characteristics. This MGP is used to
determine a principal’s State-provided growth score and growth rating.
Adjusted MGP (Teacher). Adjusted MGP is the weighted mean of the SGPs for
students linked to a teacher, based on similar prior achievement scores, and includes
consideration of ELL, SWD, and economic disadvantage student characteristics. This
MGP is used to determine a teacher’s State-provided growth score and growth rating.
Lower Limit and Upper Limit. Highest and lowest possible MGP for a 95 percent
confidence range.
Growth Rating. Growth rating describes the educator’s HEDI performance on the State-
provided growth subcomponent.
Growth Score. Using scoring bands determined by the Commissioner, a growth score of
0–20 points is assigned to each educator based on his or her overall MGP within each
growth rating category.
Through the online reporting system, educators can also obtain MGPs based on the subgroups
listed below.
Students with Disabilities. Students identified as having disabilities, based on district-
provided information.
English Language Learners. Students identified as speaking English as a second
language or who are receiving services through a bilingual program or a two-way
bilingual education program, based on district-provided information.
Economically Disadvantaged. Students whose families participate in economic
assistance programs such as the free or reduced-price lunch programs, Social Security
Insurance, food stamps, foster care, refugee assistance, earned income tax credit, the
Home Energy Assistance Program, Safety Net Assistance, the Bureau of Indian Affairs,
or Temporary Assistance for Needy Families, based on district-provided information.
Low-Achieving. Students who achieved at performance level 1 in either math or ELA on
the prior year assessment.
High-Achieving. Students who achieved at performance level 4 in either math or ELA on
the prior year assessment.
Reporting for Grades 9–12
The main reporting metrics generated for principals of grades 9-12 are:
Number of Student Scores (for MGP measure) or Students (for GRE measure).
These numbers refer to the SGPs included in an MGP or the number of students included
in the GRE Exams Passed score.
Unadjusted Measure. This measure is based on student growth and accounts for prior
achievement scores only, without taking into consideration ELL, SWD, or economic
disadvantage student characteristics.
Adjusted Measure. This measure is based on student growth and is adjusted for prior
achievement scores and ELL, SWD, and economic disadvantage characteristics at the
student and school level.
Lower Limit and Upper Limit. Highest and lowest possible measure score for a
95 percent confidence range.
Growth Rating. Growth rating describes the educator’s performance category (HEDI)
for each individual measure (MGP) or GRE Exams Passed and overall for grades 9–12.
The overall growth rating is used in a principal’s evaluation on the State-provided growth
subcomponent.
Growth Score. A growth score of 0–20 points is computed for a principal for each
individual measure (MGP and GRE) growth score and overall. The overall growth score
is used in a principal’s evaluation on the State-provided growth subcomponent.
As with grades 4–8 measures, MGPs and GRE results are also reported by various categories
(such as cohort, ELL, and SWD subgroups).
Minimum Sample Sizes for Reporting
Minimum sample size requirements for reporting were determined to ensure a minimum level of
statistical reliability in the educator growth scores. Setting no (or a low) minimum sample size
will result in the greatest number of teachers and principals receiving information; on the other
hand, the quality of the information they receive may be poor. For 2012–2013 (and in 2011–
2012), a minimum threshold of 16 student scores or 16 students for the GRE measure was
implemented. Educator scores on any measure at any level based on fewer than 16 student scores
(or 16 students for the GRE measure) are not reported.
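The reporting rule reduces to a simple filter; the educator identifiers below are invented for illustration.

```python
MIN_N = 16  # minimum student scores (or students, for the GRE measure)

def reportable(scores_by_educator):
    """Drop educators whose measure is based on fewer than MIN_N scores."""
    return {eid: s for eid, s in scores_by_educator.items() if len(s) >= MIN_N}

results = reportable({"teacher_a": list(range(20)), "teacher_b": list(range(15))})
```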
After applying these rules, the fraction of teachers, principals, and districts with reported results
is shown in Table 8 for grades 4–8 and Table 9 for grades 9–12.
Table 8. Grade 4–8 Reporting Rates for Educators and Districts

            Number with at Least   Number Meeting the     Percentage Meeting the
            One Student Linked     Minimum Sample Size    Minimum Sample Size
                                   Requirement            Requirement
Teachers    44,343                 39,716                 89.6%
Schools     3,525                  3,460                  98.1%
Districts   870                    868                    99.8%
Table 9. Grades 9–12 Reporting Rates for Educators and Districts

            Number with at Least   Number Meeting the     Percentage Meeting the
            One Student Linked     Minimum Sample Size    Minimum Sample Size
                                   Requirement            Requirement
Schools     1,077                  1,067                  99.1%
Districts   692                    686                    99.1%
Performance Categories
To determine an educator’s growth rating (HEDI category) and growth points (0–20), NYSED
has developed a set of general rules that describe how similar or different a score on each
measure is from the State average. The actual scores that determine each rating may change from
year to year, while the general rules do not. The general rules used to obtain growth ratings are
shown in Figure 4.
Within each growth rating category, points are then assigned so that educators are approximately
uniformly distributed at each HEDI point value (with higher MGPs or GRE results earning more
points than lower MGPs or GRE results).
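The within-category allocation of points can be sketched as follows. The point values, the category composition, and the exact slicing rule here are hypothetical; the actual scoring bands are determined by the Commissioner.

```python
def assign_points(mgps_in_category, point_values):
    """Spread educators within one HEDI category roughly uniformly across
    that category's point values, higher MGPs earning more points.

    `point_values` (e.g., a hypothetical 3..5 span) and the equal-slice
    rule are illustrative assumptions, not the official procedure.
    """
    ranked = sorted(mgps_in_category)
    k, n = len(point_values), len(ranked)
    return {m: point_values[min(i * k // n, k - 1)] for i, m in enumerate(ranked)}

# Six educators in one category, split evenly across three point values.
points = assign_points([40, 42, 44, 46, 48, 50], [3, 4, 5])
```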
Figure 4. Determining Growth Ratings
For teachers and principals of grades 4–8, we use the overall adjusted MGP (that is, the MGP
that combines information across all applicable grade levels and subjects) and upper and lower
limit MGPs to determine their growth rating. To determine the growth rating for a principal of
grades 9–12, we first find a growth rating and score for each of the two types of principal
metrics: the MGP measure and the GRE measure using the process shown in Figure 4.
To determine a final State-provided growth subcomponent rating for principals who serve grades
4–8 and grades 9–12, growth ratings and scores are determined for grades 4–8 and grades 9–12
separately and then combined. The grades 4–8 measure growth rating is determined using the
process shown in Figure 4. Since there are multiple grade 9–12 measures, growth scores for each
grade 9–12 measure are averaged together, weighted by the number of students in each measure,
to find an overall grade 9–12 growth rating and score. An overall growth subcomponent rating
that includes results for both grades 4–8 and grades 9–12 students is then computed in the same
manner, by averaging grades 4–8 and grades 9–12 growth scores by the number of students in
each measure and finding the final rating.
Additional detail can be found in the resources for educators posted at:
http://www.engageny.org/resource/resources-about-state-growth-measures.
RESULTS
Results from Growth Models for Grades 4–8
This section provides an overview of the results of model estimation using 2012–2013 data. A
pseudo R-squared statistic and summary statistics characterizing the SGPs, MGPs, and their
precision provide an overview of model fit. Note that this section focuses on teacher-level
results, although additional information on principal/school-level results is available in the
appendices.
The appendices to this report provide more detailed information on model behavior and results,
including model coefficients and variance components.
Model Fit Statistics for Grades 4–8
The R-squared is a statistic commonly used to describe the goodness of fit for a regression model.
Because the model implemented here is a mixed model and not a least squares regression, we
refer to this as a pseudo R-squared. Table 10 presents the pseudo R-squared values for each grade
and subject, computed as the squared correlation between the fitted values and the outcome
variable.
The pseudo R-squared values increased between 2011–2012 and 2012–2013, as shown in
Table 10. The average increase for both models is over 0.10. Because the R-squared increase
also occurs in the unadjusted models (which do not contain any additional predictors beyond
prior achievement), a large fraction of the increase can be attributed to an increase in the
explanatory power of the prior year scores. This increase in explanatory power of the model
suggests that the change in assessments not only did not harm the prediction model, but actually
improved its precision.
Table 10. Grade 4–8 Pseudo R-Squared Values by Grade and Subject

                      2011–2012                  2012–2013
Subject   Grade   Unadjusted   Adjusted     Unadjusted   Adjusted
                  Model        Model        Model        Model
ELA       4       0.61         0.61         0.69         0.72
          5       0.63         0.64         0.73         0.74
          6       0.66         0.67         0.75         0.76
          7       0.64         0.65         0.74         0.76
          8       0.63         0.64         0.74         0.75
Math      4       0.60         0.61         0.70         0.73
          5       0.65         0.66         0.77         0.78
          6       0.62         0.62         0.79         0.80
          7       0.70         0.70         0.76         0.77
          8       0.66         0.66         0.78         0.79
Student Growth Percentiles for Grades 4–8
The SGPs describe a student’s current year score relative to other students in the data with
similar prior academic histories and other measured characteristics. A student’s SGP should not
be expected to be higher or lower based on his or her prior year score; if it were, students with
high prior year scores would be systematically receiving higher or lower SGPs than students at
large. The correlation between the prior-year scale score and SGP is shown in Table 11 for each
grade and subject. These small correlations are usually negative as a result of using the
errors-in-variables (EiV) approach to account for measurement variance in the prior year scale
score; the correlation need not be exactly zero.
Squaring these values gives the percent of variation in SGPs explained by prior year scores for
any grade and subject. Because less than 1 percent of the variation in SGPs is explained by the
prior year test score, the prior year test score is a poor predictor of current year SGPs. Because
SGPs are intended to allow students to show low or high growth no matter what their prior
performance was, this result is as expected.
Table 11. Grade 4–8 Correlation between SGP and Prior Year Scale Score
Grade ELA Math
4 0.00 –0.05
5 –0.05 –0.07
6 –0.03 –0.09
7 0.01 –0.09
8 0.04 –0.03
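As a minimal sketch of the check summarized in Table 11, the snippet below simulates hypothetical prior-year scores and SGPs that are generated independently of one another, computes their correlation, and squares it to obtain the share of SGP variance explained:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: prior-year scale scores and SGPs drawn independently,
# mimicking the near-zero relationship the model is designed to produce
prior_scores = rng.normal(300, 40, size=5000)
sgps = rng.integers(1, 100, size=5000).astype(float)  # percentiles 1-99

r = np.corrcoef(prior_scores, sgps)[0, 1]
variance_explained = r ** 2
print(f"correlation = {r:.3f}, share of SGP variance explained = {variance_explained:.4f}")
```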
Mean Growth Percentiles for Grades 4–8
As described earlier in this report, MGPs are aggregate educator-level statistics, computed as the weighted mean of the SGPs for all students on a roster (for teachers) or in a school (for principals). In this section, we provide descriptive statistics on overall, or combined, MGPs.
For teachers with results for students in both ELA and math, the combined MGP is an average
across SGPs for both subjects. For teachers who provide instruction in only one subject, their
overall or combined MGP is the same as their subject-specific MGP. At the teacher level, about
half of the teachers have results for students in only one subject (either math or ELA).
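The aggregation just described can be sketched as a weighted mean. The student SGPs and attribution weights below are hypothetical, chosen only to illustrate the computation:

```python
def mean_growth_percentile(sgps, weights):
    """Weighted mean of student SGPs: the MGP for one educator."""
    total_weight = sum(weights)
    return sum(s * w for s, w in zip(sgps, weights)) / total_weight

# One teacher, with ELA and math SGPs pooled into a combined MGP
sgps = [55, 62, 48, 71, 39]            # hypothetical student growth percentiles
weights = [1.0, 1.0, 0.5, 1.0, 0.8]    # hypothetical attribution weights (e.g., partial enrollment)
print(round(mean_growth_percentile(sgps, weights), 1))  # → 56.6
```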
Figure 5 provides a histogram of the teacher combined MGPs for the adjusted model (including
demographics). In all grades, the results are approximately normally distributed.
Figure 5. Distribution of Grade 4–8 Teacher MGPs by Grade, Adjusted Model
Figure 6 shows that for principals, the results are less widely distributed than for teachers.
Figure 6. Grade 4–8 Distribution of Principal MGPs, Adjusted Model
Precision of the Mean Growth Percentiles for Grades 4–8
The caterpillar plot in Figure 7 shows a random sample of 100 teacher MGPs taken from the 2012–2013 data. The MGPs are sorted from lowest to highest, with the corresponding 95 percent confidence range showing the lower and upper limits of each MGP. Figure 8 shows the same type of plot for principals (where larger underlying samples mean that there is substantially less variation in the MGP and the error bars are narrower). These figures provide a sample of the distribution of MGPs and a typical confidence range.
Figure 7. Grades 4–8 Overall MGP with 95 % Confidence Interval Based on Random
Sample of 100 Teachers
Figure 8. Grades 4–8 Overall MGP with 95 % Confidence Interval Based on Random
Sample of 100 Principals
Figures 7 and 8 provide a means to visually gauge the precision of MGPs. However, it may also be useful to examine a statistic that summarizes the precision of the teacher-level MGPs. We specify a reliability statistic as:

    ρ = 1 − ( mean SE / sd )²

where mean SE is the mean standard error of the MGP and sd is the standard deviation between teacher MGPs. In theory, the highest possible value of ρ is 1, which would represent complete precision in the measure. When ρ is 0, the variation in MGPs is explained entirely by sampling variation. Larger values of ρ are associated with more precisely measured MGPs.

Table 12 provides the mean standard errors, the standard deviations, and the values of ρ for the adjusted model by grade (again, for combined-subject MGPs). In all grades, the statistics are much closer to one than to zero, indicating that the differentiation among teachers and schools seen in the measures is not largely due to measurement variance.
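The reliability statistic can be computed directly from a mean standard error and a between-educator standard deviation. The sketch below uses the formula as reconstructed from the surrounding text; it reproduces the grade 4 teacher and school entries of Table 12 to within rounding of the published inputs:

```python
def reliability(mean_se, sd_mgp):
    """rho = 1 - (mean SE / between-educator SD)^2: the share of MGP
    variance not attributable to sampling variation."""
    return 1.0 - (mean_se / sd_mgp) ** 2

# Inputs taken from Table 12 (grade 4 teachers, then schools)
print(round(reliability(4.1, 11.6), 2))  # → 0.88
print(round(reliability(1.6, 7.0), 2))   # → 0.95
```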
Table 12. Grades 4–8 Mean Standard Errors, Standard Deviations, and Values of ρ for the
Adjusted Model by Grade for Teachers and for Schools

Grade          Adjusted    Adjusted     Reliability
(Teachers)     Mean SE     Standard     Statistic (ρ)
                           Deviation
4                4.1         11.6           0.88
5                4.1         11.6           0.87
6                4.1         10.8           0.86
7                3.8          9.6           0.85
8                3.8         10.2           0.86
Schools          1.6          7.0           0.95
Table 13 provides the share of educators whose MGPs are significantly above or below the State mean for that educator type, using the 95 percent confidence intervals. In all cases, the share is larger than would be expected by chance alone (by chance, 5 percent of educators would be expected to fall in the above- and below-mean categories combined, or 2.5 percent in each table entry).
Table 13. Percent of Educator MGPs Above or Below Mean at the 95 % Confidence Level
Level Below Mean Above Mean
Teacher 25 % 23 %
School 33 % 36 %
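The classification underlying Table 13 can be sketched as a simple confidence-interval test. The MGP, standard error, and State-mean values below are hypothetical:

```python
def rating_vs_mean(mgp, se, state_mean, z=1.96):
    """Classify an educator's MGP relative to the State mean using a 95% CI."""
    lower, upper = mgp - z * se, mgp + z * se
    if lower > state_mean:
        return "above mean"
    if upper < state_mean:
        return "below mean"
    return "not significantly different"

print(rating_vs_mean(58.0, 3.8, 50.0))  # CI = (50.55, 65.45) → "above mean"
print(rating_vs_mean(52.0, 4.1, 50.0))  # CI spans 50 → "not significantly different"
```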
Impact Data Results for Grades 4–8
Table 14 provides the correlations of the combined-subject MGP (or, for teachers with only one subject, their single-subject MGP) with five classroom/course characteristics: the three control variables at the individual student level that NYSED's regulations permit for inclusion in the model and that were selected after discussion with New York's Task Force and other stakeholders (ELL, SWD, and poverty or economic disadvantage (ED)), and the mean prior ELA and math scores of the students.8 Correlations are presented for adjusted MGPs (the adjusted model includes demographic variables for individual students).9
8 For prior scores, the Z-score of the scale score is used instead of the actual scale score because many teachers have students in various grades and the scale scores are not designed to be averaged directly across grades.
9 The impact of these demographic characteristics on the expected value of students' current test scores used to compute SGPs can be seen through the model coefficients presented in Appendix H. The inclusion of these variables serves to make SGPs for students with different demographic characteristics comparable, given the prior test scores included in the model.
Table 14. Teacher MGP Correlated with Class/Course Characteristics

                         2011–2012    2012–2013
                          Adjusted     Adjusted
Percent                    Model        Model
ELL in Class/Course         0.00         0.05
SWD in Class/Course        –0.06         0.05
ED in Class/Course         –0.10         0.05
Mean Prior ELA              0.10         0.02
Mean Prior Math             0.13         0.08
Large correlations between MGPs and classroom/course or school characteristics would indicate systematic relationships between scores and the types of students that teachers and schools serve. No such relationships are seen in the 2012–2013 data, where correlations generally decreased from 2011–2012 (likely due to the inclusion of additional covariates in 2012–2013, including aggregate covariates) and all have absolute values of 0.10 or less. A correlation of 0.10 means that 1 percent or less of the variance in MGPs can be predicted with that demographic variable, so these results are essentially zero.
The scatter plots shown in Figures 9 through 13 provide visual representations of the data underlying the correlations for teachers shown in Table 14, and Figures 14 to 18 provide similar images of the data underlying the school-level (principal MGP) correlations shown in Table 15.10

10 Results disaggregated by grade and subject are shown in Appendix I. The results in this section are combined over grades and subjects.
Figure 9. Grades 4–8 Relationship of Teacher MGP Scores to Percent of ELL Students in
Class/Course
Figure 10. Grades 4–8 Relationship of Teacher MGP Scores to Percent SWD in
Class/Course
Figure 11. Grades 4–8 Relationship of Teacher MGP Scores to Percent of Economically
Disadvantaged Students in Class/Course
Figure 12. Grades 4–8 Relationship of Teacher MGP Scores to Mean Prior ELA Scores in
Class/Course
Figure 13. Grades 4–8 Relationship of Teacher MGP Scores to Mean Prior Math Scores in
Class/Course
Table 15 provides the observed correlations of principal MGPs with the same characteristics presented for teachers (but aggregated to the school level). As was the case at the teacher level, correlations decreased between 2011–2012 and 2012–2013 (likely due to the inclusion of additional covariates in 2012–2013). Three impact correlations remain above 0.10, indicating that schools with students who have higher prior scores and more ELL students receive higher MGPs on average. However, the fraction of the variation in MGPs explained by these variables is still relatively small. For the mean prior math score, where the correlation is 0.23, the mean prior score explains about 5 percent of the variation in MGPs.
Table 15. Principal MGP Correlated with School Characteristics

                      2011–2012    2012–2013
                       Adjusted     Adjusted
Percent                 Model        Model
ELL in School            0.05         0.11
SWD in School           –0.23         0.04
ED in School            –0.11         0.06
Mean Prior ELA           0.35         0.16
Mean Prior Math          0.40         0.23
Figure 14. Relationship of Principal MGP Scores to Percent of ELL Students
Figure 15. Relationship of Principal MGP Scores to Percent SWD in School
Figure 16. Relationship of Principal MGP Scores to Percent of Economically
Disadvantaged Students
Figure 17. Relationship of Principal MGP Scores to Average Prior ELA Scores
Figure 18. Relationship of Principal MGP Scores to Average Prior Math Scores
Growth Ratings for Grades 4–8
This section describes the observed distribution of the growth ratings (assigned using the rules
described earlier in the results section). Table 16 shows the distribution for grades 4–8 teachers
and all principals in schools that serve at least grades 4–8 (including, for instance, schools
serving grades 4–12) for 2011–2012 and 2012–2013.
Table 16. Grades 4–8 Teacher and Principal Growth Ratings

School       Educator     Highly
Year         Level        Effective   Effective   Developing   Ineffective
2011–2012    Teacher         7 %        77 %         10 %          6 %
             Principal       6 %        79 %          8 %          7 %
2012–2013    Teacher         7 %        76 %         11 %          6 %
             Principal       9 %        75 %          9 %          7 %
Stability of Growth Ratings for Grades 4–8 Over Time
Table 17 shows the rating distributions in the current year and the prior year for all grades 4–8 teachers and principals who received a rating in both 2011–2012 and 2012–2013. Note that not all educators had scores in both years.
Table 17. Grades 4–8 Teacher and Principal Growth Ratings for Educators with Scores in
2011–2012 and 2012–2013

School       Educator     Highly
Year         Level        Effective   Effective   Developing   Ineffective
2011–2012    Teacher         7 %        78 %          9 %          5 %
             Principal       7 %        81 %          7 %          6 %
2012–2013    Teacher         7 %        76 %         11 %          6 %
             Principal       8 %        76 %          9 %          7 %
Note: Due to rounding, percentages may not add to 100.
For educators who had growth ratings in both 2011–2012 and 2012–2013, Table 18 shows the relationship between ratings across years; Table 19 shows the same relationship for school-level MGPs. The results show that the ratings are stable, with about two-thirds (68 percent of teachers and 69 percent of principals) remaining in the same growth rating category from year to year. Between 2011–2012 and 2012–2013, the MGPs have a Pearson correlation coefficient of 0.46 for teachers and 0.44 for schools/principals. These correlation coefficients are larger than those often reported in the literature on growth scores (see, e.g., McCaffrey, Sass, Lockwood, & Mihaly, 2009), suggesting that the NYS MGPs are relatively stable compared to other growth measures.
Table 18. Grades 4–8 Teacher Growth Ratings for Teachers Present in Both 2011–2012
and 2012–2013

                            Growth Rating in 2012–2013
Growth Rating         Highly
in 2011–2012          Effective   Effective   Developing   Ineffective
Highly Effective         2 %          5 %         0 %           0 %
Effective                5 %         63 %         8 %           3 %
Developing               0 %          6 %         2 %           1 %
Ineffective              0 %          3 %         1 %           1 %
Note: Due to rounding, percentages may not add to 100.
Table 19. Grade 4–8 School Growth Ratings for Schools Present in Both
2011–2012 and 2012–2013

                            Growth Rating in 2012–2013
Growth Rating         Highly
in 2011–2012          Effective   Effective   Developing   Ineffective
Highly Effective         2 %          4 %         0 %           0 %
Effective                6 %         64 %         7 %           4 %
Developing               0 %          5 %         1 %           1 %
Ineffective              0 %          3 %         1 %           2 %
Note: Due to rounding, percentages may not add to 100.
Results for Grades 9–12
This section provides the results for the grades 9–12 models using 2012–2013 Regents Exam
data.
Model Fit Statistics for Grade 9–12 Models
Table 20 shows the pseudo R-squared values for the MGP models based on ELA and Algebra Regents Exam data.
Table 20. Grade 9–12 Pseudo R-Squared Values

Subject    Unadjusted Model   Adjusted Model
ELA             0.52               0.60
Algebra         0.46               0.52
The GRE model is not a linear model, so the R-squared statistic is not an appropriate measure of fit quality; instead, we evaluate the behavior of the GRE measure using the impact data.
Correlation of Combined MGP with GRE Results
For grades 9–12 in 2012–2013, the correlation between a school's combined MGP and its GRE result was 0.41. This moderate correlation may indicate that the two measures capture different aspects of student growth, and it was one reason both measures were included for principals of grades 9–12.
Fraction of Students Included in Measures
On average, the GRE measure includes a much higher percentage of students in a 9–12 school
annually than the combined MGP measure. Table 21 shows the percentage included.
Table 21. Average Percent of Students Included in 2012–2013 Measures

Measure              Mean Percent of Students in a
                     School Included in the Measure
MGP (ELA/Algebra)               44 %
GRE                             84 %
Distribution of MGPs and GRE Scores for Grades 9–12
Figure 19 shows the distribution of combined school MGPs for grades 9–12, that is, MGPs that
combine information across SGPs in Algebra and ELA. The distribution is approximately
normal.
Figure 19. Grades 9–12 Distribution of Principal MGP, Adjusted Model
The GRE model reports results as the number of Regents Exams that the average student in a
school will pass compared to the number passed by similar students. For example, a GRE score
of 0.25 would indicate that, on average, students in that principal's school pass one-quarter of a
Regents Exam more than similar students. Over four years of high school, this rate per year
would add up to an additional Regents Exam passed by each student. Figure 20 displays a
histogram of GRE results. GRE results are somewhat skewed relative to the normal distribution.
Figure 20. Grades 9–12 Distribution of Principal GRE Scores, Adjusted Model
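The accumulation arithmetic in the GRE example above is simply the per-year score multiplied by the number of years of high school:

```python
# Worked example of the GRE interpretation: a per-year GRE of 0.25,
# accumulated over four years of high school, is one extra Regents Exam
# passed per student relative to similar students.
gre_per_year = 0.25
years = 4
print(gre_per_year * years)  # → 1.0
```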
Precision of the Measures for Grades 9–12
The caterpillar plot in Figure 21 shows 100 randomly selected school MGPs and their confidence
interval, giving a sense of the precision of the estimates. A second caterpillar plot in Figure 22
shows the GRE measure values and the associated confidence intervals. In both of these plots, it
is apparent that the confidence intervals are small relative to the overall dispersion in the
measures themselves.
Figure 21. Grades 9–12 Caterpillar Plot of School MGPs
Figure 22. Grades 9–12 Caterpillar Plot of School GRE Results
Table 22 shows the share of principals of grades 9–12 whose scores are significantly different from the mean (that is, whose confidence intervals on the caterpillar plot do not cross the average value). Once again, the share exceeds what would be expected by chance alone, indicating that the model is able to distinguish among schools.
Table 22. Grade 9–12 Percent of Principal Measures Above or Below Mean at the 95 %
Confidence Level

Educator Type      Below Mean   Above Mean
Principal, MGP        33 %         31 %
Principal, GRE        41 %         30 %
The reliability (ρ) statistic, which was introduced earlier as a measure of the precision of the MGP measure, is shown in Table 23 for both the GRE and MGP adjusted models for grades 9–12. In both cases, the statistics are much closer to one than to zero, indicating that the differentiation between schools seen in the measures is not largely due to measurement variance.
Table 23. Grades 9–12 Mean Standard Errors, Standard Deviations,
and Values of ρ for the Adjusted Models

Model    Adjusted    Adjusted     Reliability
         Mean SE     Standard     Statistic (ρ)
                     Deviation
MGP        2.1          7.7           0.92
GRE        0.05         0.21          0.95
Impact Data Results for Grades 9–12
Table 24 shows the correlations of the MGP and GRE adjusted models with various demographics aggregated at the school level.11 As was seen with the grades 4–8 models, all of the MGP correlations with demographics are small. (Note that there are no comparative data from 2011–2012 for these measures, since 2012–2013 is the first year of their implementation.) For the GRE model, however, all correlations are larger than 0.10 in absolute value, indicating that schools with higher percentages of ELL, SWD, or economically disadvantaged (ED) students receive lower GRE scores on average. In the case of percent ED, for example, with a correlation of –0.49, about 24 percent of the variation in GRE scores is explained by the percent of economically disadvantaged students.
Table 24. Principal MGP Correlated with Demographic Characteristics

                       MGP,         GRE,
                     Adjusted     Adjusted
Percent                Model        Model
ELL in School           0.04        –0.21
SWD in School          –0.01        –0.24
ED in School           –0.01        –0.49
Mean Grade 8 ELA        0.06         0.52
Mean Grade 8 Math       0.03         0.51
Figures 23 to 27 plot these data for MGP results, and Figures 28 to 32 for GRE results. The higher demographic correlations for the GRE measure (as compared to the MGP measure) are not surprising, given that it is rooted in a status (or achievement) metric: passing enough Regents Exams to earn a NYS diploma. Schools with large concentrations of high-achieving grade 8 students and smaller concentrations of students in poverty, students with disabilities, or ELLs will typically have a statistical advantage over schools whose students arrive with greater needs. The GRE measure partially mitigates that advantage by comparing the number of Regents Exams each student passes each year with the number passed by students with similar prior academic histories and demographic characteristics, but the overall pattern remains.
Although at the student level economically disadvantaged or lower-achieving students can outperform similar peers, schools whose students enter at lower average levels of achievement, or that have greater proportions of economically disadvantaged students, show less average annual schoolwide progress than other schools toward having their students pass the up to eight Regents Exams needed for New York's several diploma categories. Individual students with low grade 8 scores and/or who live in poverty may be further challenged if they attend schools with higher concentrations of low-achieving or poor students. At the same time, it is important to note that there is variation in school-level results at all levels of average prior achievement (as seen in the following figures), suggesting that schools can demonstrate strong results regardless of school characteristics.

11 Note that for the 9–12 models, all students have prior scores in the same grade (grade 8), so the scale scores themselves are averaged.
Figure 23. Relationship of Principal MGP Scores to Percent of ELL Students
Figure 24. Relationship of Principal MGP Scores to Percent SWD in School
Figure 25. Relationship of Principal MGP Scores to Percent of Economically
Disadvantaged Students
Figure 26. Relationship of Principal MGP Scores to Average Prior ELA Scores
Figure 27. Relationship of Principal MGP Scores to Average Prior Math Scores
Figure 28. Relationship of Grades 9–12 Principal Growth in Regents Exam (GRE) Scores
and Percent of ELL in the School
Figure 29. Relationship of Grades 9–12 Principal Growth in Regents Exam (GRE) Scores
and Percent of Students with Disabilities in the School
Figure 30. Relationship of Grades 9–12 Principal Growth in Regents Exam (GRE) Scores
and Percent of Economically Disadvantaged in the School
Figure 31. Relationship of Grades 9–12 Principal Growth in Regents Exam (GRE) Scores
and Average Grade 8 ELA Scale Scores
Figure 32. Relationship of Grades 9–12 Principal Growth in Regents Exam (GRE) Scores
and Average Grade 8 Math Scale Scores
Growth Ratings for Principals of Grades 9–12
Table 25 shows the distribution of growth ratings for principals of all schools serving grades 9–
12 (including those which may also serve other grades, such as grades 4–8).
Table 25. Distribution of Growth Ratings for Principals of Grades 9–12 in 2012–2013

Educator     Highly
Level        Effective   Effective   Developing   Ineffective
Principal       2 %        86 %         11 %          2 %
Note: Due to rounding, percentages may not add to 100.
Growth Ratings for Schools/Principals Serving Grades 4–8 and Grade 9–12
Some schools received separate growth ratings for grades 4–8 and grades 9–12. Table 26 shows
growth ratings for schools that serve only grades 4–8 (4–8 only), schools that serve grades 9–12
only (9–12 only), schools that serve grades 4–12 and receive both 4–8 and 9–12 growth ratings
(4–8 and 9–12), and all schools that received a growth rating (all schools).
Table 26. Growth Ratings for Principals in 2012–2013

                                   Highly
                  Inclusion        Effective   Effective   Developing   Ineffective
4–8 Growth        4–8 only            9 %        76 %          8 %          7 %
Rating            4–8 and 9–12        5 %        72 %         14 %         10 %
                  All schools         9 %        75 %          9 %          7 %
9–12 Growth       9–12 only           1 %        86 %         11 %          1 %
Rating            4–8 and 9–12        2 %        85 %         10 %          3 %
                  All schools         2 %        86 %         11 %          2 %
Overall Growth    4–8 and 9–12        2 %        82 %         14 %          1 %
Rating            All schools         7 %        78 %          9 %          6 %
Note: Due to rounding, percentages may not add to 100.
CONCLUSION
The models selected to estimate growth scores for New York State in 2012–2013 represent an
effort to improve on the models used in 2011–2012 by making better use of administrative data
on student-teacher linkages, and by enhancing the factors accounted for in the models. New
models for principals of grades 9–12 were selected and developed based on technical and data
considerations and on the recommendations of a variety of stakeholders.
Between 2012–2013 and 2013–2014, New York State plans to continue to improve the use of
data linking students to teachers and to principals, while maintaining the factors accounted for in
the model and the overall weight of State-provided growth scores in teacher and principal
evaluations (20 percent).
For 2014–2015, the New York Board of Regents has approved the use of a value-added model
that will allow some additional covariates to be included in the analyses and may include some
other technical refinements. This may involve including additional variables at the
classroom/course and school levels to help adjust for differences in teacher and principal
outcomes not captured by student-level variables.
REFERENCES
Betebenner, D. W. (2009). Norm- and criterion-referenced student growth. Educational Measurement: Issues and Practice, 28(4), 42–51.
Goldstein, H. (1995). Multilevel statistical models. Bristol, UK: University of Bristol. Available online at http://www.bristol.ac.uk/cmm/team/hg/multbook1995.pdf.
Greene, W. (2003). Econometric analysis (5th ed.). Upper Saddle River, NJ: Prentice Hall.
Hausman, J. (2001). Mismeasured variables in econometric analysis: Problems from the right and problems from the left. Journal of Economic Perspectives, 15(4), 57–67.
Henderson, C. R. (1953). Estimation of variance and covariance components. Biometrics, 9, 226–252.
McCaffrey, D. F., Lockwood, J. R., Koretz, D. M., & Hamilton, L. S. (2004). Evaluating value-added models for teacher accountability. Santa Monica, CA: RAND Corporation.
McCaffrey, D. F., Sass, T. R., Lockwood, J. R., & Mihaly, K. (2009). The intertemporal variability of teacher effect estimates. Education Finance and Policy, 4(4), 572–606.
Wei, Y., & Carroll, R. J. (2009). Quantile regression with measurement error. Journal of the American Statistical Association, 104, 1129–1143.
Appendix A. Task Force and Technical Advisory Committee Members

Participant             Affiliation

Technical Advisory Committee
Dan Goldhaber           University of Washington
Hamilton Lankford       State University of New York at Albany
Daniel F. McCaffrey     Educational Testing Service/RAND
Jonah Rockoff           Columbia University
Tim R. Sass             Georgia State University
Douglas Staiger         Dartmouth College
Marty West              Harvard University
James A. Wyckoff        University of Virginia
Appendix B. Grades 4–8 Data Processing Rules and Results
Table B-1: Grades 4–8 Data Processing Rules, Assessments

Rule   Description
D.1    Keep only records with item descriptions as shown in Appendix D of the
       specifications document.
D.2    Drop records exclusively from the assessment file when the State student ID
       (SSID) is missing or invalid or shows duplicate ID numbers. Drop both
       duplicates. A valid ID number is 10 characters long (including leading
       zeros) and contains only numbers. A duplicate ID occurs when there are two
       records for the same assessment date and the same SSID.
D.3    After applying rule D.2, drop records with out-of-range test scores in the
       current or prior years. Out-of-range scores are those with no standard
       errors of measurement (SEMs) on the SEM file for that year/subject/grade.
D.4    For the current-year file only, drop schools not on the school grade file.
D.5    Include in the analysis students with missing demographic data on variables
       included in the models.
D.6    Exclude from the analysis students who do not have a prior-year assessment
       from the prior grade. Thus a grade 4 student must have a valid score in
       2011–2012 for grade 3.
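Rule D.2's validity check can be sketched directly; the function name below is illustrative, not part of NYSED's production code:

```python
def is_valid_ssid(ssid: str) -> bool:
    """Rule D.2: a valid State student ID is exactly 10 characters long
    (including leading zeros) and contains only digits."""
    return len(ssid) == 10 and ssid.isdigit()

print(is_valid_ssid("0012345678"))  # → True
print(is_valid_ssid("12345678"))    # → False (too short)
print(is_valid_ssid("00123A5678"))  # → False (non-numeric character)
```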
Grades 4–8 Data Processing Rules, Teacher-Student-Course Linkage
Teacher-student-course linkage records contain the following key variables (some additional
variables that appear on the source files, such as student names, are not used in processing):
SSID, teacher ID, school ID, item description, course duration (minutes that the course was
scheduled), enrollment linkage duration (minutes that the student was enrolled in the course),
attendance linkage duration (minutes that the student attended the course), start date (date the
student enrolled in the course), end date (date the student exited the course), reporting date (date
the data were reported, typically January, April, or June), and course name. In addition to the
variables found in the original source data, AIR derives several variables from these data, such as
student growth percentile (SGP) weight, that are used in producing final mean growth percentiles
(MGPs). The original source data file contains multiple records per student-teacher-course
combination. The rules that follow describe how these data are processed to arrive at a single
record per student-teacher-course combination, so that student SGPs can be weighted in a
teacher’s MGP according to the length of time the student was enrolled and attended the
teacher’s course.
Step 1. (L.M2) Remove duplicate high school course records. In NYSED’s source data file, some
high school course names were associated with multiple item descriptions (e.g., grades 6, 7, and
8 math). In this step, we maintain only records with test scores in the grade and subject of the
assessment.
Step 2. (L.M1) Drop records with missing school IDs. Student-teacher-course records will
ultimately be merged to student test scores by student ID and school ID, and records without
school ID will not be able to be merged.
Step 3. (L.0) Drop all records with a start date after the exam date. Since we are interested in the
time students spent with teachers prior to testing, we drop any records with a student-teacher-
course relationship that began after the test date.
Step 4. (L.1) Drop duplicates of records that have the same student ID, teacher ID, school ID, district ID, item description, start date, end date, course duration, enrollment linkage duration, and attendance linkage duration. At this stage, records that are otherwise the same on key variables differ only by their reporting date; we keep the record with the later reporting date.
Step 5. (L.1a) For records that are otherwise the same in terms of student ID, teacher ID, school
ID, district ID, item description, and start date, adjust any end date that is after the assessment
date to be the assessment date. This step is an interim step to detect duplicate records for
otherwise similar records with different reporting dates (which are removed in step 7).
Step 6 (L.2). For records that are otherwise the same in terms of student ID, teacher ID, school
ID, district ID, item description, start date, and end date, drop any records that have zero or
missing course duration. If all records that are otherwise the same in terms of the variables
described in this step have missing or zero course duration, keep one.
Step 7 (L.3). For records that are otherwise the same in terms of student ID, teacher ID, school ID, district ID, item description, start date, and end date, keep the record with the earliest reporting date on or after the assessment date for this subject. If records have the same reporting date, keep all records. If no records have reporting dates after the assessment, keep the record with the latest reporting date.
Step 8 (L.4). For records that are otherwise the same in terms of student ID, teacher ID, school
ID, district ID, and item description, but where start and end dates do NOT overlap (e.g., one
record begins on September 5 and ends on December 20 and a second record starts on January 1
and ends on April 15), create one record by taking the following steps: (1) choose the longest
course duration; (2) sum the enrollment linkage duration and attendance linkage duration; and
(3) if necessary, adjust enrollment linkage duration and attendance duration to be no larger than
course duration. Create a days of enrollment variable (used only if the course duration data is
missing) that is the count of all unique days across all records.
Step 9. (L.5) For records that are otherwise the same in terms of student ID, teacher ID, school ID, district ID, and item description but have start and end dates that overlap (e.g., one record begins on September 5 and ends on December 20 and another record begins on September 5 and ends on April 15), create one record by summing the course durations, enrollment linkage durations, and attendance linkage durations. Create a days-of-enrollment variable (used only if the course duration data are missing) that is the count of all unique days across all records.
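Step 8 can be sketched as follows. This is an illustrative reconstruction (not NYSED's production code), with durations in minutes and hypothetical values:

```python
def merge_nonoverlapping(records):
    """Collapse non-overlapping linkage records for one student-teacher-course
    combination, per Step 8. Each record is a dict with course_dur, enroll_dur,
    and attend_dur (minutes)."""
    course_dur = max(r["course_dur"] for r in records)  # (1) longest course duration
    enroll_dur = sum(r["enroll_dur"] for r in records)  # (2) sum enrollment linkage
    attend_dur = sum(r["attend_dur"] for r in records)  #     and attendance linkage
    # (3) cap linkage durations so they do not exceed the course duration
    return {
        "course_dur": course_dur,
        "enroll_dur": min(enroll_dur, course_dur),
        "attend_dur": min(attend_dur, course_dur),
    }

# Hypothetical fall and spring linkage records for the same course
fall = {"course_dur": 5400, "enroll_dur": 3000, "attend_dur": 2800}
spring = {"course_dur": 5400, "enroll_dur": 2700, "attend_dur": 2500}
print(merge_nonoverlapping([fall, spring]))
```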
Table B-2: Grades 4–8 Data Processing Rules, NYSESLAT Assessments

Rule   Description
N.1    Drop records exclusively from the NYSESLAT assessment file with missing,
       invalid, or duplicate ID numbers. A valid ID number is defined in D.2.
N.3    Keep only scores that are valid (in the “Scale score” column of the
       appropriate appendix for that item description to the memo with the
       subject “NYSESLAT–Determining an English Language Learner’s English
       Performance Level,” dated August 2011).
N.4    When either the LS or the RW scale score is missing, drop the record.
N.5    Drop the NYSESLAT assessment forms variable when fewer than 10 students
       would otherwise have the variable defined.
Table B-3: ELA Data Processing
Data Processing Description Grade Year Resulting #
Obs After
Exclusion
# Records
excluded
2009–10 Score File Processing
Number of records in the input score file (includes math
and ELA)
All 2009–10 2,540,947 —
Number of records in the score file after keeping only
the records with valid item descriptions
All 2009–10 1,263,223 1,277,724
Number of records in the score file after deleting
blank/invalid/duplicate SSID
All 2009–10 1,262,683 540
Number of records in the score file after deleting
invalid and out-of-range scores
All 2009–10 1,261,629 1,054
2010–11 Score File Processing
Number of records in the input score file All 2010–11 2,540,183 —
Number of records in the score file after keeping only
the records with valid item descriptions
All 2010–11 1,263,809 1,276,374
Number of records in the score file after deleting
blank/invalid/duplicate SSID
All 2010–11 1,263,254 555
Number of records in the score file after deleting
invalid and out-of-range scores
All 2010–11 1,262,052 1,202
2011–12 Score File Processing
Number of records in the input score file All 2011–12 2,529,590 —
Number of records in the score file after keeping only
the records with valid item descriptions
All 2011–12 1,259,262 1,270,328
Number of records in the score file after deleting
blank/invalid/duplicate SSID
All 2011–12 1,258,696 566
Number of records in the score file after deleting
invalid and out-of-range scores
All 2011–12 1,257,534 1,162
2012–13 Score File Processing
Number of records in the input score file All 2012–13 2,503,076 —
Number of records in the score file after keeping only
the records with valid item descriptions
All 2012–13 1,248,810 1,254,266
Number of records in the score file after deleting
blank/invalid/duplicate SSID
All 2012–13 1,248,387 423
Number of records in the score file after deleting
invalid and out-of-range scores
All 2012–13 1,247,302 1,085
Number of records after deleting records not associated
with a school on the school grades file
All 2012–13 1,159,953 87,349
Number of records after deleting grade 3 from current
year
All 2012–13 967,067 192,886
Number of records in the data for a specific grade 4 2012–13 192,793 —
Number of records in the data for a specific grade 5 2012–13 191,249 —
Number of records in the data for a specific grade 6 2012–13 192,176 —
Number of records in the data for a specific grade 7 2012–13 196,508 —
Number of records in the data for a specific grade 8 2012–13 194,341 —
Merging 2012–13 Score File with Prior-Year Score
Files
Number of records in the merged score file after
deleting records without immediate prior score in prior
grade
All 2012–13 926,083 40,984
Number of records for a specific grade after merging 3
years of data
4 2012–13 184,564 —
Number of records for a specific grade after merging 3
years of data
5 2012–13 184,031 —
Number of records for a specific grade after merging 3
years of data
6 2012–13 183,606 —
Number of records for a specific grade after merging 3
years of data
7 2012–13 187,977 —
Number of records for a specific grade after merging 3
years of data
8 2012–13 185,905 —
Teacher-Course File Processing
Number of records in input teacher course file All 2012–13 39,405,383 —
Number of records after filtering to relevant item
descriptions and students who are not in schools for
which the growth model is intended
All 2012–13 2,910,713 36,494,670
Number of records with start dates before the
assessment day
All 2012–13 2,895,862 14,851
Number of records with schools on the school grade file All 2012–13 2,894,023 1,839
Number of records after condensing to a single record
per student/teacher/school combination
All 2012–13 1,299,638 1,594,385 [12]
Merging Teacher-Course File to Merged Student
Test score files
Number of records in student teacher course file after
deleting records with courses but no valid student score
All 2012–13 1,072,918 —
Number of students attributed to at least one teacher All 2012–13 900,674 —
NYSESLAT 2011–2012 File Processing
Number of records on NYSESLAT file All 2011–12 234,402 —
Number of records after dropping invalid scores and
IDs and LS/RW scores without the other score
All 2011–12 234,402 0
Number of records after dropping duplicate IDs All 2011–12 234,229 173
Merging NYSESLAT File to Merged Student Test
score files
Number of records with NYSESLAT scores after merge ALL 2012–13 70,274 —
Number of students with NYSESLAT scores defined
after keeping only those NYSESLAT forms with 10 or
more students with valid scores on that form
ALL 2012–13 70,254 20

[12] Condensed records are not excluded but aggregated with other records, resulting in a smaller number of records
overall.
Description of Final Reference File Used For
Analysis
Number of records in the ELA Reference File ALL 2012–13 926,083 —
Number of records in the ELA Reference File for a
specific grade
4 2012–13 184,564 —
Number of records in the ELA Reference File for a
specific grade
5 2012–13 184,031 —
Number of records in the ELA Reference File for a
specific grade
6 2012–13 183,606 —
Number of records in the ELA Reference File for a
specific grade
7 2012–13 187,977 —
Number of records in the ELA Reference File for a
specific grade
8 2012–13 185,905 —
Table B-4: Math Data Processing
Data Processing Description Grade Year Resulting #
Obs after
Exclusion
# Records
excluded
2009–10 Score File Processing
Number of records in the input score file (includes math
and ELA)
All 2009–10 2,540,947 —
Number of records in the score file after keeping only
the records with valid item descriptions
All 2009–10 1,277,724 1,263,223
Number of records in the score file after deleting
blank/invalid/duplicate SSID
All 2009–10 1,277,195 529
Number of records in the score file after deleting
invalid and out-of-range scores
All 2009–10 1,275,835 1,360
2010–11 Score File Processing
Number of records in the input score file (includes math
and ELA)
All 2010–11 2,540,183 —
Number of records in the score file after keeping only
the records with valid item descriptions
All 2010–11 1,276,374 1,263,809
Number of records in the score file after deleting
blank/invalid/duplicate SSID
All 2010–11 1,275,835 539
Number of records in the score file after deleting
invalid and out-of-range scores
All 2010–11 1,274,923 912
2011–12 Score File Processing
Number of records in the input score file (includes math
and ELA)
All 2011–12 2,529,590 —
Number of records in the score file after keeping only
the records with valid item descriptions
All 2011–12 1,270,328 1,259,262
Number of records in the score file after deleting
blank/invalid/duplicate SSID
All 2011–12 1,269,796 532
Number of records in the score file after deleting
invalid and out-of-range scores
All 2011–12 1,268,712 1,084
2012–13 Score File Processing
Number of records in the input score file (includes math
and ELA)
All 2012–13 2,503,076 —
Number of records in the score file after keeping only
the records with valid item descriptions
All 2012–13 1,254,266 1,248,810
Number of records in the score file after deleting
blank/invalid/duplicate SSID
All 2012–13 1,253,828 438
Number of records in the score file after deleting
invalid and out-of-range scores
All 2012–13 1,252,545 1,283
Number of records after deleting records not associated
with a school on the school grades file
All 2012–13 1,181,359 71,186
Number of records after deleting grade 3 from current
year
All 2012–13 984,115 197,244
Number of records in the data for a specific grade 4 2012–13 196,823 —
Number of records in the data for a specific grade 5 2012–13 194,983 —
Number of records in the data for a specific grade 6 2012–13 196,279 —
Number of records in the data for a specific grade 7 2012–13 199,555 —
Number of records in the data for a specific grade 8 2012–13 196,475 —
Merging 2012–13 Score File with Prior-Year Score
Files
Number of records in the merged score file after
deleting records without immediate prior score in prior
grade
All 2012–13 940,305 43,810
Number of records for a specific grade after merging 3
years of data
4 2012–13 188,048 —
Number of records for a specific grade after merging 3
years of data
5 2012–13 187,241 —
Number of records for a specific grade after merging 3
years of data
6 2012–13 187,074 —
Number of records for a specific grade after merging 3
years of data
7 2012–13 190,327 —
Number of records for a specific grade after merging 3
years of data
8 2012–13 187,615 —
Teacher-Course File Processing
Number of records in input teacher course file All 2012–13 39,405,383 —
Number of records after filtering to relevant item
descriptions and students who are not in schools for
which the growth model is intended
All 2012–13 2,723,930 36,681,453
Number of records with start dates before the
assessment day
All 2012–13 2,712,947 10,983
Number of records with schools on the school grade file All 2012–13 2,710,795 2,152
Number of records after condensing to a single record
per student/teacher/school combination
All 2012–13 1,240,663 1,470,132 [13]
Merging Teacher-Course File to Merged Student
Test Score Files
Number of records in student teacher course file after
deleting records with courses but no valid student score
All 2012–13 1,039,554 —
Number of students attributed to at least one teacher All 2012–13 908,368 —
NYSESLAT 2011–2012 File Processing
Number of records on NYSESLAT file All 2011–12 234,402 —
Number of records after dropping invalid scores and
IDs and LS/RW scores without the other score
All 2011–12 234,402 0
Number of records after dropping duplicate IDs All 2011–12 234,229 173
Merging NYSESLAT File to Merged Student Test
Score Files
[13] Condensed records are not excluded but aggregated with other records, resulting in a smaller number of records
overall.
Number of records with NYSESLAT scores after merge ALL 2012–13 72,262 —
Number of students with NYSESLAT scores defined
after keeping only those NYSESLAT forms with 10 or
more students with valid scores on that form
ALL 2012–13 72,240 22
Description of Final Reference File Used For
Analysis
Number of records in the Math Reference File ALL 2012–13 940,305 —
Number of records in the Math Reference File for a
specific grade
4 2012–13 188,048 —
Number of records in the Math Reference File for a
specific grade
5 2012–13 187,241 —
Number of records in the Math Reference File for a
specific grade
6 2012–13 187,074 —
Number of records in the Math Reference File for a
specific grade
7 2012–13 190,327 —
Number of records in the Math Reference File for a
specific grade
8 2012–13 187,615 —
Appendix C. Grades 4–8 Item Descriptions Used in Analysis
The teacher-student-course linkage file includes information about courses taught to students.
The item description provides information about which courses are relevant to State tests. Table
C-1 shows the records used for growth model analysis.
Table C-1: Relevant Item Descriptions
Item Description
Grade 3 ELA
Grade 3 Math
Grade 4 ELA
Grade 4 Math
Grade 5 ELA
Grade 5 Math
Grade 6 ELA
Grade 6 Math
Grade 7 ELA
Grade 7 Math
Grade 8 ELA
Grade 8 Math
Appendix D. Model Derivation
To describe how the model accounts for measurement variance, we first re-express the true score
regression as:

\[ y_{it}^{*} = w_{it}^{\prime} d + \sum_{r=1}^{L} b_r \, y_{i,t-r}^{*} + z_{it}^{\prime} u + \varepsilon_{it} \tag{1} \]

We use * to denote the variables without measurement variance. For convenience, define the
matrices \(X^{*} = \{[w_{it}^{\prime}, y_{i,t-1}^{*}, \ldots, y_{i,t-L}^{*}]\}\) with stacked coefficient vector
\(\beta = (d^{\prime}, b_1, \ldots, b_L)^{\prime}\), \(Z = \{z_{it}^{\prime}\}\), and \(y^{*} = \{y_{it}^{*}\}\). Label \(D\) the
matrix of measurement variance disturbances associated with \(X^{*}\), and label \(e\) the vector of
measurement disturbances associated with the dependent variable, \(y\), hence \(y = y^{*} + e\). Let
\(D\) have the same dimension as \(X\), but only the final L columns of \(D\) are non-zero, so
\(X = X^{*} + D\). If those disturbances were observed, the parameters \(\{\beta, u\}\) could be estimated
using Henderson’s methods (1953) by solving the following mixed model equations:

\[ \begin{pmatrix} X^{*\prime}X^{*} & X^{*\prime}Z \\ Z^{\prime}X^{*} & Z^{\prime}Z + G^{-1} \end{pmatrix}
\begin{pmatrix} \hat{\beta} \\ \hat{u} \end{pmatrix} =
\begin{pmatrix} X^{*\prime}y^{*} \\ Z^{\prime}y^{*} \end{pmatrix} \tag{2} \]

The matrix \(G\) is made up of Q diagonal blocks, one for each level in the hierarchy. Each diagonal
block is constructed as \(\sigma_q^2 I_q\), where \(I_q\) is an identity matrix with dimension equal to the number
of units at level q, and \(\sigma_q^2\) is the estimated variance of the random effects among units at level q.
When concatenated diagonally, the square matrix \(G\) has dimension \(\sum_{q=1}^{Q} n_q\).

Two complications intervene. First, we cannot observe \(D\), and second, the unobservable nature of
this term, along with the heterogeneous measurement variance in the dependent variable, renders
this estimator inefficient.

Addressing the first issue, upon expansion we see that:

\[ X^{\prime}X = (X^{*} + D)^{\prime}(X^{*} + D) = X^{*\prime}X^{*} + X^{*\prime}D + D^{\prime}X^{*} + D^{\prime}D \]

Taking expectation over the measurement error distributions and treating the true score matrix,
\(X^{*}\), as fixed, we have

\[ \mathrm{E}(X^{\prime}X) = X^{*\prime}X^{*} + \mathrm{E}(D^{\prime}D) \]

And then rearranging terms gives

\[ X^{*\prime}X^{*} = \mathrm{E}(X^{\prime}X) - \mathrm{E}(D^{\prime}D) \]

We also have \(\mathrm{E}(Z^{\prime}X) = Z^{\prime}X^{*}\), with the expectation taken over the measurement error
distributions associated with observed \(X\), and \(\mathrm{E}(Z^{\prime}y) = Z^{\prime}y^{*}\) and
\(\mathrm{E}(X^{\prime}y) = X^{*\prime}y^{*}\), with the expectation
taken over the measurement error distributions associated with observed \(y\).
Addressing the second issue, both the right-side and left-side variables in the model equation
measured with variance contribute to the heteroscedasticity. While the correction
\(\mathrm{E}(X^{\prime}X) - \mathrm{E}(D^{\prime}D)\)
eliminates the bias due to measurement variance associated with the independent variables, we
still do not have a variance-free measure of \(y\) for any time period. Therefore, the residual is made
up of:

\[ y_{it} - x_{it}^{\prime}\hat{\beta} - z_{it}^{\prime}\tilde{u} = \varepsilon_{it} + e_{it} - \sum_{r=1}^{L} b_r \, e_{i,t-r} \]

where \(\tilde{u}\) is the conditional mean of the random effects. The residual variance of any
given observation is

\[ \omega_{it} = \sigma_{\varepsilon}^2 + \sigma_{e,it}^2 + \sum_{r=1}^{L} b_r^2 \, \sigma_{e,i,t-r}^2 , \]

where \(\sigma_{e,it}^2\) is the known measurement variance of the dependent variable for student i at time t.
Similarly, \(\sigma_{e,i,t-r}^2\) are the known measurement variances of the r prior test scores. Now, let \(\Omega\) be a
diagonal matrix of dimension N with diagonal elements \(\omega_{it}\).

With the above, we can define the mixed model equations as:

\[ \begin{pmatrix} X^{*\prime}\Omega^{-1}X^{*} & X^{*\prime}\Omega^{-1}Z \\ Z^{\prime}\Omega^{-1}X^{*} & Z^{\prime}\Omega^{-1}Z + G^{-1} \end{pmatrix}
\begin{pmatrix} \hat{\beta} \\ \hat{u} \end{pmatrix} =
\begin{pmatrix} X^{*\prime}\Omega^{-1}y^{*} \\ Z^{\prime}\Omega^{-1}y^{*} \end{pmatrix} \]

Using observed scores and measurement error variance, the mixed model equations are redefined
as:

\[ \begin{pmatrix} \mathrm{E}(X^{\prime}\Omega^{-1}X) - \mathrm{E}(D^{\prime}\Omega^{-1}D) & X^{\prime}\Omega^{-1}Z \\ Z^{\prime}\Omega^{-1}X & Z^{\prime}\Omega^{-1}Z + G^{-1} \end{pmatrix}
\begin{pmatrix} \hat{\beta} \\ \hat{u} \end{pmatrix} =
\begin{pmatrix} X^{\prime}\Omega^{-1}y \\ Z^{\prime}\Omega^{-1}y \end{pmatrix} \]

Observed Values for \(D^{\prime}\Omega^{-1}D\)

As indicated, \(D\) is unobserved, and so the mixed model equations cannot be solved
unless \(D^{\prime}\Omega^{-1}D\) is replaced with some observed values. First, the mixed model equations are redefined
as:

\[ \begin{pmatrix} X^{\prime}\Omega^{-1}X - S & X^{\prime}\Omega^{-1}Z \\ Z^{\prime}\Omega^{-1}X & Z^{\prime}\Omega^{-1}Z + G^{-1} \end{pmatrix}
\begin{pmatrix} \hat{\beta} \\ \hat{u} \end{pmatrix} =
\begin{pmatrix} X^{\prime}\Omega^{-1}y \\ Z^{\prime}\Omega^{-1}y \end{pmatrix} \]

where \(S\) is a diagonal “correction” matrix with dimensions p × p accounting for measurement
variance in the predictor variables, and p is the column dimension of \(X\).

The matrix S is used in lieu of \(\mathrm{E}(D^{\prime}\Omega^{-1}D)\) based on the following justification. Recall that we
previously defined \(\Omega\) as \(\mathrm{diag}(\omega_1, \ldots, \omega_N)\),
and the matrix of unobserved disturbances is:

\[ D = \begin{bmatrix} 0 & \Delta \end{bmatrix} \]

where \(0\) is a matrix of dimension \(N \times (p - L)\) with elements of 0, and:

\[ \Delta = \begin{bmatrix} \delta_{11} & \cdots & \delta_{1L} \\ \vdots & \ddots & \vdots \\ \delta_{N1} & \cdots & \delta_{NL} \end{bmatrix} \]

The theoretical result of the matrix operation \(D^{\prime}\Omega^{-1}D\) yields a matrix whose only non-zero
entries form the following symmetric lower-right L × L block:

\[ \begin{bmatrix}
\sum_i \delta_{i1}^2/\omega_i & \sum_i \delta_{i1}\delta_{i2}/\omega_i & \cdots & \sum_i \delta_{i1}\delta_{iL}/\omega_i \\
\sum_i \delta_{i2}\delta_{i1}/\omega_i & \sum_i \delta_{i2}^2/\omega_i & & \vdots \\
\vdots & & \ddots & \\
\sum_i \delta_{iL}\delta_{i1}/\omega_i & \cdots & & \sum_i \delta_{iL}^2/\omega_i
\end{bmatrix} \]

The theoretical result is limited only because we do not observe \(\Delta\) since it is latent. However,
\(\mathrm{E}(\delta_{ij}^2) = \sigma_{ij}^2\), where \(\sigma_{ij}\) is taken as the conditional standard error of measurement for
student i on the jth variable. The theoretical result also simplifies because variances of measurement on different
variables are by expectation uncorrelated, \(\mathrm{E}(\delta_{ij}\delta_{ik}) = 0\) when \(j \neq k\).

Because the conditional standard error of measurement varies for each student i and the off-
diagonals can be ignored, let \(S\) be:

\[ S = \mathrm{diag}\!\left(0, \ldots, 0, \; \sum_i \sigma_{i1}^2/\omega_i, \; \sum_i \sigma_{i2}^2/\omega_i, \; \ldots, \; \sum_i \sigma_{iL}^2/\omega_i \right) \]

where \(\sigma_{ij}^2\) denotes the measurement variance for the jth, j = (1, 2, …, L), variable measured
with variance.
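Once the correction matrix S and the residual-variance weights are in hand, the corrected mixed model equations reduce to one block linear system. The sketch below is illustrative only, on simulated data with assumed shapes; it is not the operational estimator. It subtracts S from the weighted cross-product of the predictors and solves for the fixed and random effects jointly.

```python
import numpy as np

def solve_corrected_mme(X, Z, y, omega, G, S):
    """Solve the measurement-error-corrected mixed model equations:
    [X'O^-1 X - S, X'O^-1 Z; Z'O^-1 X, Z'O^-1 Z + G^-1][beta; u] =
    [X'O^-1 y; Z'O^-1 y], where O = diag(omega) holds the residual variances."""
    w = 1.0 / omega                      # diagonal of Omega^{-1}
    XtW = X.T * w                        # X' Omega^{-1}
    ZtW = Z.T * w                        # Z' Omega^{-1}
    lhs = np.block([[XtW @ X - S, XtW @ Z],
                    [ZtW @ X, ZtW @ Z + np.linalg.inv(G)]])
    rhs = np.concatenate([XtW @ y, ZtW @ y])
    sol = np.linalg.solve(lhs, rhs)
    p = X.shape[1]
    return sol[:p], sol[p:]              # fixed effects, random effects
```

With S set to a zero matrix and omega constant, this collapses to the ordinary Henderson mixed model equations.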
Appendix E. Interpolating Standard Errors of Measurement at the Lowest
and Highest Obtainable Scale Scores (LOSS and HOSS)
The linear model used to produce student-level predictions can cause these predictions to fall
outside the boundaries of the defined score scale. Let the floor and ceiling in the data be denoted
as \(l\) and \(h\), respectively. It is therefore possible that \(\hat{y}_i < l\) or \(\hat{y}_i > h\). However, the
observed score can never fall outside these bounds.
When a prediction falls outside the boundaries of the score scale, it can cause bias in the statistics
used to characterize a student, teacher, or school. This phenomenon seems to occur as a result of
the large conditional standard errors of measurement at the extreme scores, \(l\) and \(h\). The
procedure below is implemented to deal with these large standard errors.
INTERPOLATION PROCEDURE FOR CONDITIONAL STANDARD ERRORS OF LOSS
AND HOSS
Interpolate new conditional standard errors of measurement from the following kth degree
polynomial regression:

\[ \mathrm{CSEM}_i^2 = \sum_{k=0}^{K} b_k \, y_i^{k} + u_i , \]

where \(y_i\) is the observed score for the ith student. The square root of the
fitted values will then be used in lieu of the CSEM:

\[ \widetilde{\mathrm{CSEM}}_i = \sqrt{ \sum_{k=0}^{K} \hat{b}_k \, y_i^{k} } \]
IMPLEMENTATION
Implement the linear regression and subsequently the growth model using the following steps:
1. Run the regression without modification.
2. Verify that \(l \le \hat{y}_i \le h\) for all i.
3. If the inequality in step 2 is true, stop; the run is complete. Otherwise, continue to step 4.
4. Set M = 1 and update the SEMs of the exact HOSS and LOSS scores.
5. Use the updated SEMs in lieu of the standard error of the LOSS or HOSS in the test
score data.
6. Run the growth model/value-added model.
7. Verify the inequality in step 2; if it holds, stop updating. If it does not hold, increase M
by 1 and return to step 5.
If this method does not result in the inequality in step 2 being met after M = 7 (i.e., after running
with M = 7), then simply take the most recent run that did converge, set \(\hat{y}_i = h\) where \(\hat{y}_i > h\)
and \(\hat{y}_i = l\) where \(\hat{y}_i < l\). For the predicted variance, use the predicted variance of the closest
estimate where the inequality in step 2 does hold. Where there are several, take the mean.
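The polynomial smoothing step can be sketched with a least-squares fit of the squared CSEM on the observed score. This is a sketch under assumptions: the degree is arbitrary here, and the report does not specify the fitting routine.

```python
import numpy as np

def smooth_csem(scores, csem, degree=4):
    """Fit CSEM^2 as a polynomial in the observed score and return the
    square root of the fitted values (the degree is an assumption)."""
    coeffs = np.polyfit(scores, np.asarray(csem) ** 2, deg=degree)
    fitted = np.polyval(coeffs, scores)
    # Guard against slightly negative fitted values before the square root.
    return np.sqrt(np.clip(fitted, 0.0, None))
```

Replacing the tabled CSEMs at the LOSS and HOSS with these smoothed values is what shrinks the extreme standard errors in the iterative procedure above.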
Appendix F. Grades 9–12 Data Processing Rules and Results
Table F-1: Grades 9–12, Regents Exam Assessment Score File Processing
Description
A.1 Rows that have State student ID (SSID) values that are not 10 characters long
and entirely numeric are dropped. The dropped records from the exam history
file should be written to a file named invalid_regents.
A.2 Rows with scale scores that are not numeric or not between 0 and 100 are dropped.
A.3 Identify any records that have the same SSID but where the name (first name
or last name) is not the same. Mark this SSID as having an invalid Regents
Exam history.
A.4 Records that share a SSID, test date, item code, first name, and last name but
do not have the same scale score are dropped and the SSID is marked as
having an invalid Regents Exam history. If the scale score is the same, keep
only one record.
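Rules A.1 through A.4 can be sketched as a single pass over the score file. This is an illustrative sketch, not the production code; the column names (`ssid`, `first_name`, `last_name`, `test_date`, `item_code`, `scale_score`) are assumptions.

```python
import pandas as pd

def process_regents(df: pd.DataFrame):
    """Apply rules A.1-A.4; returns (clean file, invalid_regents file,
    SSIDs flagged as having invalid Regents Exam histories)."""
    # A.1: keep SSIDs that are exactly 10 numeric characters.
    ok = df["ssid"].astype(str).str.fullmatch(r"\d{10}")
    invalid_regents, df = df[~ok], df[ok]
    # A.2: keep numeric scale scores between 0 and 100.
    score = pd.to_numeric(df["scale_score"], errors="coerce")
    df = df[score.between(0, 100)]
    # A.3: one SSID carrying more than one name -> invalid exam history.
    names = df.groupby("ssid")[["first_name", "last_name"]].nunique()
    flagged = set(names[(names > 1).any(axis=1)].index)
    # A.4: same SSID/date/item/name with conflicting scores -> invalid
    # history; exact duplicates are collapsed to one record.
    key = ["ssid", "test_date", "item_code", "first_name", "last_name"]
    conflicts = df.groupby(key)["scale_score"].nunique()
    flagged |= set(conflicts[conflicts > 1].index.get_level_values("ssid"))
    df = df[~df["ssid"].isin(flagged)].drop_duplicates(key + ["scale_score"])
    return df, invalid_regents, flagged
```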
Table F-2: Grades 9–12, Student Enrollment File Processing
Description
B.1 Keep only records for the current year.
B.2 Only read in rows where the nominal_grade is 9, 10, 11, 12, or 14.
B.3 Drop records with invalid SSIDs. See A.1 for the definition of valid SSID.
B.4 Keep only students associated with schools in the school grade file.
B.5 Condense multiple records for a single student/school into a single record.
Mark the student as linked if they have at least one record showing they were
present on both BEDS day (10/03/2012) and the first day of June Regents
(06/11/2013). Also keep the student if they were present on BEDS day and
graduated or dropped out.
Table F-3: Grades 9–12, Algebra File Processing
Description
C.1 Students are included only when the student has a valid score on the relevant
Regents Exam. Regents alternatives (scores of 999) are not considered to be
valid.
C.2 Exclude students who have passed the Regents Exam in a prior year. The
definition of pass for this rule is 65 or better. This only regards that exact
exam; it does not exclude any exam that counts for the same requirement (i.e.,
math for the Algebra exam). However, a prior Regents alternative with the
same test code does count and means that a student will not be included in this
file. Within a year, include the highest score for that year. This means that
each student will appear on the file at most once.
C.3 Include only students who have valid grade 9 entry-date information.
C.4 Exclude SSIDs for students with invalid Regents Exam histories.
C.5 Exclude students who did not attend a school on the school grades file.
C.6 Students must have at least one grade 7 or 8 assessment in math.
C.7 Exclude students who take the Regents Exam in August before ever attending
9th grade.
Table F-4: Grades 9–12, ELA File Processing
Description
D.1 Students are included only when the student has a valid score on the relevant
Regents Exam. Regents alternatives (scores of 999) are not considered to be
valid.
D.2 Exclude students who have passed the Regents Exam in a prior year. The
definition of pass for this rule is 65 or better. This only regards that exact
exam; it does not exclude any exam that counts for the same requirement (i.e.
math for the Algebra exam). However, a prior Regents alternative with the
same test code does count and means that a student will not be included in this
file. Within a year, include the highest score for that year. This means that
each student will appear on the file at most once.
D.3 Include only students who have valid grade 9 entry-date information.
D.4 Exclude SSIDs for students with invalid Regents Exam histories.
D.5 Exclude students who did not attend a school on the school grades file.
D.6 Students must have at least one grade 7 or 8 assessment in ELA.
D.7 Exclude students who take the Regents Exam in August before ever attending
9th grade.
Table F-5: Grades 9–12, GRE File Processing
Description
E.1 Include all students who are attributable to a school on the grades served file
and who were present on both BEDS day (10/03/2012) and the first day of June
Regents Exam assessments (06/11/2013).
E.2 Students must have at least one grade 7 or 8 assessment in either math or ELA.
E.3 Students must have valid grade 9 entry-date information.
Table F-6: Grades 9–12 Data Processing
Data Processing Description Model
Resulting #
Obs after
Exclusion
# Records
excluded
2004–05 Regents Score File Processing
Number of records in the input score file All 1,705,365 —
Number of records in the score file after keeping
only valid SSIDs and removing duplicate IDs
without duplicate names and invalid scores
All
1,664,586 40,779
2005–06 Regents Score File Processing
Number of records in the input score file All 1,725,474 —
Number of records in the score file after keeping
only valid SSIDs and removing duplicate IDs
without duplicate names and invalid scores
All
1,712,963 12,511
2006–07 Regents Score File Processing
Number of records in the input score file. All 1,927,169 —
Number of records in the score file after keeping
only valid SSIDs and removing duplicate IDs
without duplicate names and invalid scores
All
1,874,520 52,649
2007–08 Regents Score File Processing
Number of records in the input score file. All 2,007,941 —
Number of records in the score file after keeping
only valid SSIDs and removing duplicate IDs
without duplicate names and invalid scores
All
1,987,860 20,081
2008–09 Regents Score File Processing
Number of records in the input score file. All 2,069,816 —
Number of records in the score file after keeping
only valid SSIDs and removing duplicate IDs
without duplicate names and invalid scores
All
2,042,581 27,235
2009–10 Regents Score File Processing
Number of records in the input score file. All 2,093,387 —
Number of records in the score file after keeping
only valid SSIDs and removing duplicate IDs
without duplicate names and invalid scores
All
2,083,841 9,546
2010–11 Regents Score File Processing
Number of records in the input score file All 2,184,153 —
Number of records in the score file after keeping
only valid SSIDs and removing duplicate IDs
without duplicate names and invalid scores
All
2,174,911 9,242
2011–12 Regents Score File Processing
Number of records in the input score file. All 2,134,464 —
Number of records in the score file after keeping
only valid SSIDs and removing duplicate IDs
without duplicate names and invalid scores
All
2,130,325 4,139
2012–13 Regents Score File Processing
Number of records in the input score file All 2,065,822 —
Number of records in the score file after keeping
only valid SSIDs and removing duplicate IDs
without duplicate names and invalid scores
All
2,062,193 3,629
Student Enrollment File Processing
Number of records on the Student Year File All 6,582,470 —
Number of records on the Student Year File for the
current year All
3,268,631 3,313,839
Number of records on the Student Year File after
keeping relevant grades All
1,013,853 2,254,778
Number of records on the Student Year File after
removing invalid SSIDs All
1,013,853 0
Number of records on the Student Year File after
keeping only schools in the school grade file All
852,429 161,424
Number of records after condensing to a single
record per SSID/school and keeping only students
who meet minimum enrollment
All
724,612 127,817
Number of records after removing duplicate
names/number of records attributable to a school All
724,162 450
NYSESLAT File Processing
Number of records in NYSESLAT File All 1,059,064
Number of records after keeping the most recent
score for each student All
1,058,837 227
Number of records after dropping invalid scores and
dropping LS/RW scores without the other score All
908,286 150,551
Algebra File Processing
Number of records with valid Algebra Regents
Exam in 2012–13 Alg
342,895
Number of records after keeping only the highest
score Alg
299,620 43,275
Students for whom the following rules applied were
all identified at once, and then students with any of
the following types of records were removed. Because
several lines may have applied to a single student, a
separate line showing the total number of affected
students is also shown.
Alg
— —
Number of students, after keeping only the
highest score, who have no valid grade 7 or 8
math test scores
Alg
— 117,076
Number of students, after keeping only the
highest score, who passed the Algebra Regents
Exam in a prior school year
Alg
— 24,900
Number of students, after keeping only the
highest score, who attended a school not in the
school grade files
Alg
— 103,939
Number of students, after keeping only the
highest score, who had invalid grade 9 entry
date information
Alg
— 116,941
Number of students, after keeping only the
highest score, who took the Regents Exam in
August before attending a high school for the
first time
Alg
— 977
Remove students where at least one of the above
five lines apply Alg
158,755 140,865
Number of students with a grade 8 math score Alg 156,980 —
Number of students with a grade 7 math score Alg 149,740 —
Number of NYSESLAT records merged on with
valid LS and RW scores Alg
144,707 —
ELA File Processing
Number of records with valid ELA Regents Exam
in 2012–13 ELA
282,851
Number of records after keeping only the highest
score ELA
242,035 40,816
Students for whom the following rules applied were
all identified at once, and then students with any of
the following types of records were removed. Because
several lines may have applied to a single student, a
separate line showing the total number of affected
students is also shown.
ELA
— —
Number of students, after keeping only the
highest score, who have no valid grade 7 or 8
ELA test scores
ELA
— 54,481
Number of students, after keeping only the
highest score, who passed the ELA Regents
Exam in a prior school year
ELA
— 11,891
Number of students, after keeping only the
highest score, who attended a school not in the
school grade files
ELA
— 38,739
Number of students, after keeping only the
highest score, who had invalid grade 9 entry
date information
ELA
— 51,778
Number of students, after keeping only the
highest score, who took the Regents Exam in
August before attending a high school for the
first time
ELA
— 22
Remove students where at least one of the above
five lines apply ELA
175,009 67,026
Number of students with a grade 8 ELA score ELA 173,749 —
Number of students with a grade 7 ELA score ELA 165,990 —
Number of NYSESLAT records merged on with
valid LS and RW scores
ELA 8,969 —
GRE File Processing
Number of records with valid Regents Exam history
for GRE model GRE
906,786
Students for whom the following rules applied were
all identified at once, and then students with any of
the following types of records were removed. Because
several lines may have applied to a single student, a
separate line showing the total number of affected
students is also shown.
GRE
— —
Number of students who have no valid grade 7
or 8 ELA or grade 7 or 8 math test scores GRE
— 213,954
Number of students who attended a school not in
the school grade files GRE
— 188,642
Number of students who had invalid grade 9
entry date information GRE
— 214,014
— 214,014
Remove students where at least one of the above
three lines applies. GRE
602,026 304,760
Appendix G. Grades 4–8 Attribution and Weighting Rules
Teacher attribution relies on a 60% enrollment fraction. Table C-2 describes the system by which
the teacher-student-course linkage records are condensed to a single record. Table G-1 describes
using that single record for attribution and weighting.
Table G-1: Teacher Attribution Rules
Attribution Rule
A.1 Set enrollment fraction to the enrollment duration (the length of time the course was
set to meet during which the student was enrolled in the course) divided by the
course duration (the length of time the course was set to meet). When the course
duration is zero, set the enrollment fraction to the number of days of enrollment
divided by 195 (ELA) or 203 (math). Days of enrollment is the number of unique
calendar days that the student is enrolled in the class before assessment day. When
the days of enrollment exceeds 195 (ELA) or 203 (math), set the enrollment fraction
to 1.
A.2 When the enrollment fraction is larger than 0.60 (60%), the student is attributed to
the teacher.
A.3 When there is a link, set the weight of the link to the attendance link duration (the
length of time the student attended the class) divided by the course duration.
Principal linkage/attribution is handled entirely with the “school enrollment flag” found in the
assessment score file. When this flag is marked “yes,” then a student is linked to the principal at
the school on the assessment score file. District linkage/attribution is handled in an identical way
with the “district enrollment flag.”
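The attribution logic in Table G-1 can be sketched as a small pure function. This is a sketch under assumptions: the handling of a missing course duration in the weight calculation (rule A.3 leaves it undefined) is an assumption noted in the comments, and the 195/203 full-year lengths come from rule A.1.

```python
def attribute(course_dur, enroll_dur, attend_dur, days_enrolled, subject="ELA"):
    """Apply rules A.1-A.3; returns (linked, weight)."""
    if course_dur and course_dur > 0:
        # A.1: enrollment fraction = enrollment duration / course duration.
        enroll_frac = enroll_dur / course_dur
    else:
        # A.1 fallback when course duration is missing: days of enrollment
        # over a full year (195 days ELA, 203 days math), capped at 1.
        full_year = 195 if subject == "ELA" else 203
        enroll_frac = min(days_enrolled / full_year, 1.0)
    linked = enroll_frac > 0.60          # A.2: strictly above 60%
    # A.3: weight = attendance link duration / course duration. Treating a
    # missing course duration as weight 0 is an assumption of this sketch.
    weight = attend_dur / course_dur if (linked and course_dur) else 0.0
    return linked, weight
```

For example, a student enrolled for 80 of a course's 100 scheduled days is attributed to the teacher (0.80 > 0.60), with the link weighted by attended days over the 100-day course duration.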
Appendix H. Model Coefficients
Table H-1. Grade 4 ELA Model Coefficients, Adjusted Model
Effect Name Effect Standard Error p-value
Constant term -643.211 7.729 0
Prior-Grade ELA Scale Score 1.109 0.005 0
Prior-Grade Math Scale Score 0.269 0.004 0
Missing Flag: Prior-Grade Math Scale Score 183.766 3.012 0
Mean Prior Score 0.019 0.011 0.089
Range Around Prior Score 0.073 0.011 0
New to School 0.937 0.182 0
SWD -7.838 0.154 0
Gen Ed < 40% (LRE3) -2.473 0.407 0
Percent SWD -0.048 0.005 0
English Language Learner (ELL) -0.293 1.619 0.856
Percent ELL 0.006 0.006 0.291
Missing Flag: Percent Variables 9.595 7.461 0.198
Grades 2–4 NYSESLAT LS Scale Score -0.007 0.004 0.088
Grades 2–4 NYSESLAT RW Scale Score 0.024 0.004 0
Missing Flag: Grades 2–4 NYSESLAT Scale Scores 10.369 3.208 0.001
ED -2.413 0.116 0
Percent ED -0.025 0.004 0
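Coefficient tables like H-1 can be read as a linear predictor: each effect multiplies the corresponding student or classroom variable (missing flags and status variables such as SWD and ED are 0/1 indicators), and the products are summed with the constant term. A minimal sketch, using a hypothetical Grade 4 student and only a subset of the Table H-1 effects for brevity:

```python
# Coefficients copied from Table H-1 (subset); the full model sums over
# every effect in the table.
coefficients = {
    "Constant term": -643.211,
    "Prior-Grade ELA Scale Score": 1.109,
    "Prior-Grade Math Scale Score": 0.269,
    "SWD": -7.838,
    "ED": -2.413,
}

# Hypothetical student values; the constant term multiplies 1, and
# SWD/ED are 0/1 indicators.
student = {
    "Constant term": 1.0,
    "Prior-Grade ELA Scale Score": 670.0,
    "Prior-Grade Math Scale Score": 675.0,
    "SWD": 0.0,
    "ED": 1.0,
}

predicted = sum(coefficients[k] * student[k] for k in coefficients)
```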
Table H-2. Grade 5 ELA Model Coefficients, Adjusted Model
Effect Name | Effect | Standard Error | p-value
Constant term -460.137 5.906 0
Prior-Grade ELA Scale Score 0.713 0.004 0
Two-Grades Prior ELA Scale Score 0.363 0.005 0
Missing Flag: Two-Grades Prior ELA Scale Score 242.194 3.314 0
Prior-Grade Math Scale Score 0.12 0.003 0
Missing Flag: Prior-Grade Math Scale Score 83.648 2.087 0
Mean Prior Score -0.046 0.007 0
Range Around Prior Score 0.075 0.007 0
Retained in Grade -1.523 0.3 0
New to School 1.831 0.186 0
SWD -3.533 0.149 0
Gen Ed < 40% (LRE3) 2.553 0.386 0
ELL 0.581 1.451 0.689
Percent SWD -0.04 0.005 0
Percent ELL -0.013 0.005 0.018
Missing Flag: Percent Variables -32.603 5.003 0
Grades 2–4 NYSESLAT LS Scale Score -0.02 0.004 0
Grades 2–4 NYSESLAT RW Scale Score 0.003 0.004 0.478
Missing Flag: Grades 2–4 NYSESLAT Scale Scores -14.501 3.067 0
Percent ED -0.011 0.004 0.008
ED -1.493 0.112 0
Table H-3. Grade 6 ELA Model Coefficients, Adjusted Model
Effect Name | Effect | Standard Error | p-value
Constant term -696.505 8.823 0
Prior-Grade ELA Scale Score 1.099 0.007 0
Two-Grades Prior ELA Scale Score 0.294 0.004 0
Missing Flag: Two-Grades Prior ELA Scale Score 194.178 2.431 0
Three-Grades Prior ELA Scale Score 0.043 0.002 0
Missing Flag: Three-Grades Prior ELA Scale Score 31.91 1.114 0
Prior-Grade Math Scale Score 0.078 0.002 0
Missing Flag: Prior-Grade Math Scale Score 50.869 1.685 0
Mean Prior Score -0.045 0.013 0
Range Around Prior Score 0.046 0.014 0.001
Retained in Grade -4.12 0.35 0
New to School 0.081 0.227 0.72
SWD -3.71 0.137 0
Gen Ed < 40% (LRE3) 0.684 0.342 0.046
Percent SWD -0.04 0.005 0
ELL 2.02 1.443 0.161
Percent ELL -0.005 0.006 0.446
Missing Flag: Percent Variables -35.411 8.637 0
Grades 5–6 NYSESLAT LS Scale Score -0.026 0.004 0
Grades 5–6 NYSESLAT RW Scale Score 0.047 0.004 0
Missing Flag: Grades 5–6 NYSESLAT Scale Scores 14.757 3.29 0
Percent ED -0.058 0.004 0
ED -1.755 0.1 0
Table H-4. Grade 7 ELA Model Coefficients, Adjusted Model
Effect Name | Effect | Standard Error | p-value
Constant term -955.57 11.186 0
Prior-Grade ELA Scale Score 1.25 0.007 0
Two-Grades Prior ELA Scale Score 0.293 0.005 0
Missing Flag: Two-Grades Prior ELA Scale Score 194.178 3.516 0
Three-Grades Prior ELA Scale Score 0.142 0.004 0
Missing Flag: Three-Grades Prior ELA Scale Score 97.045 2.419 0
Prior-Grade Math Scale Score 0.105 0.002 0
Missing Flag: Prior-Grade Math Scale Score 69.633 1.545 0
Mean Prior Score 0.028 0.016 0.085
Range Around Prior Score 0.096 0.017 0
Retained in Grade -6.358 0.337 0
New to School 0.783 0.197 0
SWD -1.464 0.134 0
Gen Ed < 40% (LRE3) 0.688 0.349 0.049
Percent SWD -0.032 0.005 0
ELL 2.282 1.227 0.063
Percent ELL -0.001 0.007 0.856
Missing Flag: Percent Variables 18.37 11.143 0.099
Grades 5–6 NYSESLAT LS Scale Score -0.007 0.004 0.112
Grades 5–6 NYSESLAT RW Scale Score 0.072 0.005 0
Missing Flag: Grades 5–6 NYSESLAT Scale Scores 43.722 3.295 0
ED -0.937 0.097 0
Percent ED -0.003 0.005 0.573
Table H-5. Grade 8 ELA Model Coefficients, Adjusted Model
Effect Name | Effect | Standard Error | p-value
Constant term -887.476 9.656 0
Prior-Grade ELA Scale Score 0.942 0.006 0
Two-Grades Prior ELA Scale Score 0.451 0.005 0
Missing Flag: Two-Grades Prior ELA Scale Score 294.657 3.528 0
Three-Grades Prior ELA Scale Score 0.057 0.002 0
Missing Flag: Three-Grades Prior ELA Scale Score 40.996 1.254 0
Prior-Grade Math Scale Score 0.149 0.003 0
Missing Flag: Prior-Grade Math Scale Score 97.999 1.833 0
Mean Prior Score 0.078 0.014 0
Range Around Prior Score 0.141 0.015 0
Retained in Grade -7.219 0.322 0
New to School 0.194 0.205 0.343
SWD -2.915 0.144 0
Gen Ed < 40% (LRE3) -0.959 0.373 0.01
Percent SWD -0.031 0.005 0
ELL 0.245 1.188 0.837
Percent ELL 0.013 0.006 0.036
Missing Flag: Percent Variables 49.667 9.324 0
Grades 7–8 NYSESLAT LS Scale Score 0.028 0.006 0
Grades 7–8 NYSESLAT RW Scale Score 0.074 0.006 0
Missing Flag: Grades 7–8 NYSESLAT Scale Scores 70.236 3.819 0
ED -1.452 0.103 0
Percent ED -0.031 0.005 0
Table H-6. Grade 4 Math Model Coefficients, Adjusted Model
Effect Name | Effect | Standard Error | p-value
Constant term -624.060 8.737 0.000
Prior-Grade Math Scale Score 1.071 0.004 0.000
Prior-Grade ELA Scale Score 0.341 0.004 0.000
Missing Flag: Prior-Grade ELA Scale Score 225.224 2.879 0.000
Mean Prior Score -0.026 0.012 0.034
Range Around Prior Score 0.171 0.013 0.000
New to School 0.065 0.197 0.742
SWD -4.634 0.157 0.000
Gen Ed < 40% (LRE3) 0.956 0.458 0.037
Percent SWD -0.038 0.006 0.000
ELL -4.033 1.604 0.012
Percent ELL -0.012 0.007 0.066
Missing Flag: Percent Variables -19.947 8.555 0.020
Grades 2–4 NYSESLAT LS Scale Score -0.013 0.004 0.001
Grades 2–4 NYSESLAT RW Scale Score -0.010 0.004 0.022
Missing Flag: Grades 2–4 NYSESLAT Scale Scores -20.674 2.967 0.000
ED -2.580 0.120 0.000
Percent ED -0.039 0.005 0.000
Table H-7. Grade 5 Math Model Coefficients, Adjusted Model
Effect Name | Effect | Standard Error | p-value
Constant term -408.845 5.816 0.000
Prior-Grade Math Scale Score 0.645 0.003 0.000
Two-Grades Prior Math Scale Score 0.330 0.004 0.000
Missing Flag: Two-Grades Prior Math Scale Score 227.868 3.044 0.000
Prior-Grade ELA Scale Score 0.090 0.003 0.000
Missing Flag: Prior-Grade ELA Scale Score 57.205 2.035 0.000
Mean Prior Score -0.003 0.007 0.681
Range Around Prior Score 0.053 0.009 0.000
Retained in Grade -2.741 0.276 0.000
New to School 0.361 0.191 0.058
SWD -3.640 0.139 0.000
Gen Ed < 40% (LRE3) 1.341 0.418 0.001
Percent SWD -0.040 0.005 0.000
ELL -1.417 1.306 0.278
Percent ELL -0.009 0.007 0.164
Missing Flag: Percent Variables -6.404 5.134 0.212
Grades 2–4 NYSESLAT LS Scale Score -0.023 0.003 0.000
Grades 2–4 NYSESLAT RW Scale Score 0.000 0.004 0.976
Missing Flag: Grades 2–4 NYSESLAT Scale Scores -19.591 2.542 0.000
ED -1.633 0.105 0.000
Percent ED -0.033 0.005 0.000
Table H-8. Grade 6 Math Model Coefficients, Adjusted Model
Effect Name | Effect | Standard Error | p-value
Constant term -450.533 6.397 0.000
Prior-Grade Math Scale Score 0.429 0.003 0.000
Two-Grades Prior Math Scale Score 0.365 0.003 0.000
Missing Flag: Two-Grades Prior Math Scale Score 247.924 2.287 0.000
Three-Grades Prior Math Scale Score 0.048 0.002 0.000
Missing Flag: Three-Grades Prior Math Scale Score 36.658 1.201 0.000
Prior-Grade ELA Scale Score 0.215 0.005 0.000
Missing Flag: Prior-Grade ELA Scale Score 137.373 3.013 0.000
Mean Prior Score 0.001 0.008 0.876
Range Around Prior Score 0.075 0.009 0.000
Retained in Grade -5.728 0.338 0.000
New to School 0.257 0.244 0.294
SWD -3.543 0.133 0.000
Gen Ed < 40% (LRE3) -1.139 0.378 0.003
Percent SWD -0.037 0.006 0.000
ELL 3.719 1.383 0.007
Percent ELL -0.005 0.008 0.552
Missing Flag: Percent Variables -0.935 5.621 0.868
Grades 5–6 NYSESLAT LS Scale Score -0.013 0.004 0.001
Grades 5–6 NYSESLAT RW Scale Score 0.052 0.004 0.000
Missing Flag: Grades 5–6 NYSESLAT Scale Scores 27.742 2.854 0.000
ED -0.882 0.098 0.000
Percent ED -0.025 0.006 0.000
Table H-9. Grade 7 Math Model Coefficients, Adjusted Model
Effect Name | Effect | Standard Error | p-value
Constant term -510.635 6.329 0.000
Prior-Grade Math Scale Score 0.367 0.003 0.000
Two-Grades Prior Math Scale Score 0.306 0.004 0.000
Missing Flag: Two-Grades Prior Math Scale Score 208.044 2.636 0.000
Three-Grades Prior Math Scale Score 0.154 0.003 0.000
Missing Flag: Three-Grades Prior Math Scale Score 107.308 1.800 0.000
Prior-Grade ELA Scale Score 0.351 0.005 0.000
Missing Flag: Prior-Grade ELA Scale Score 221.848 3.429 0.000
Mean Prior Score 0.005 0.007 0.520
Range Around Prior Score 0.155 0.008 0.000
Retained in Grade -8.164 0.360 0.000
New to School -0.487 0.219 0.026
SWD -2.788 0.143 0.000
Gen Ed < 40% (LRE3) 2.416 0.397 0.000
Percent SWD -0.028 0.006 0.000
ELL 2.099 1.275 0.100
Percent ELL -0.020 0.008 0.008
Missing Flag: Percent Variables 3.659 5.299 0.490
Grades 5–6 NYSESLAT LS Scale Score -0.009 0.004 0.034
Grades 5–6 NYSESLAT RW Scale Score 0.022 0.005 0.000
Missing Flag: Grades 5–6 NYSESLAT Scale Scores 7.094 3.053 0.020
ED -0.757 0.104 0.000
Percent ED -0.042 0.006 0.000
Table H-10. Grade 8 Math Model Coefficients, Adjusted Model
Effect Name | Effect | Standard Error | p-value
Constant term -444.783 7.100 0.000
Prior-Grade Math Scale Score 0.509 0.003 0.000
Two-Grades Prior Math Scale Score 0.285 0.004 0.000
Missing Flag: Two-Grades Prior Math Scale Score 192.148 2.570 0.000
Three-Grades Prior Math Scale Score 0.098 0.003 0.000
Missing Flag: Three-Grades Prior Math Scale Score 70.111 2.018 0.000
Prior-Grade ELA Scale Score 0.182 0.004 0.000
Missing Flag: Prior-Grade ELA Scale Score 113.546 2.847 0.000
Mean Prior Score 0.010 0.009 0.254
Range Around Prior Score 0.129 0.011 0.000
Retained in Grade -9.064 0.311 0.000
New to School -0.094 0.222 0.672
SWD -2.930 0.137 0.000
Gen Ed < 40% (LRE3) 1.635 0.421 0.000
Percent SWD -0.049 0.006 0.000
ELL 4.664 1.138 0.000
Percent ELL 0.005 0.009 0.551
Missing Flag: Percent Variables 4.368 6.309 0.489
Grades 7–8 NYSESLAT LS Scale Score -0.007 0.005 0.157
Grades 7–8 NYSESLAT RW Scale Score 0.018 0.006 0.001
Missing Flag: Grades 7–8 NYSESLAT Scale Scores 7.776 3.213 0.016
ED -0.628 0.098 0.000
Percent ED -0.029 0.006 0.000
Table H-11. Grades 9–12, GRE, Year in School 1 Model Coefficients, Adjusted Model
Effect Name | Estimate | Standard Error
Intercept 1 -56.485 *
Intercept 2 -57.950 *
Intercept 3 -63.164 *
Intercept 4 -65.773 *
Intercept 5 -68.650 *
Intercept 6 -71.785 *
Grade 8 ELA Scale Score 0.019 <0.001
Missing Flag: Grade 8 ELA Scale Score 11.631 0.310
Grade 7 ELA Scale Score 0.011 0.001
Missing Flag: Grade 7 ELA Scale Score 7.705 0.362
Grade 8 Math Scale Score 0.031 <0.001
Missing Flag: Grade 8 Math Scale Score 20.290 0.250
Grade 7 Math Scale Score 0.017 <0.001
Missing Flag: Grade 7 Math Scale Score 11.306 0.255
Mean Prior Grade 8 ELA -0.008 0.002
Mean Prior Grade 8 Math -0.008 0.001
Count of Prior Regents Exams = 0 14.587 *
Count of Prior Regents Exams = 1 14.353 *
Count of Prior Regents Exams = 2 14.040 *
Count of Prior Regents Exams = 3 13.580 *
Count of Prior Regents Exams = 4 12.082 *
Count of Prior Regents Exams = 5 10.391 *
Count of Prior Regents Exams = 6 10.577 *
Count of Prior Regents Exams = 7 0.000 —
SWD 0.146 0.017
Gen Ed < 40% (LRE3) -0.554 0.063
Percent SWD -0.034 0.001
ELL -0.259 0.065
Percent ELL -0.017 0.001
NYSESLAT LS Scale Score 0.001 0.001
NYSESLAT RW Scale Score 0.004 0.001
Missing Flag: NYSESLAT Scale Scores 3.017 0.533
ED -0.297 0.013
Percent ED -0.007 <0.001
Note 14: An asterisk indicates that the statistical software did not produce standard errors for these coefficients.
Note 15: An em dash indicates a standard error that is not defined. The count of prior Regents Exams = 7 variable was the omitted (reference) category; it is listed with an estimate of zero to make that explicit.
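The omitted-category convention in note 15 is standard reference (dummy) coding: one level of the categorical variable is held out with its coefficient fixed at zero, and every other level gets a 0/1 indicator. A minimal sketch, with illustrative helper names and effects taken from Table H-11:

```python
LEVELS = list(range(8))   # counts of prior Regents Exams, 0..7
REFERENCE = 7             # omitted category; its effect is fixed at 0

def dummy_code(count: int) -> dict[int, int]:
    """Return 0/1 indicators for every non-reference level."""
    return {level: int(count == level) for level in LEVELS if level != REFERENCE}

def category_effect(count: int, effects: dict[int, float]) -> float:
    """Sum indicator * coefficient; the reference level contributes 0."""
    return sum(effects[level] * flag for level, flag in dummy_code(count).items())
```

A student with three prior Regents Exams picks up only the "= 3" coefficient; a student with seven picks up nothing, since that level is the reference.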
Table H-12. Grades 9–12, GRE, Year in School 2 Model Coefficients, Adjusted Model
Effect Name Estimate Standard Error
Intercept 1 -28.957 0.807
Intercept 2 -30.171 0.807
Intercept 3 -31.933 0.807
Intercept 4 -35.059 0.808
Intercept 5 -38.375 0.810
Intercept 6 -41.536 0.851
Intercept 7 -44.176 1.286
Grade 8 ELA Scale Score 0.017 <0.001
Missing Flag: Grade 8 ELA Scale Score 10.756 0.244
Grade 7 ELA Scale Score 0.001 <0.001
Missing Flag: Grade 7 ELA Scale Score 0.991 0.136
Grade 8 Math Scale Score 0.015 <0.001
Missing Flag: Grade 8 Math Scale Score 9.379 0.197
Grade 7 Math Scale Score 0.009 <0.001
Missing Flag: Grade 7 Math Scale Score 5.635 0.190
Mean Prior Grade 8 ELA -0.011 0.002
Mean Prior Grade 8 Math 0.009 0.001
Missing Flag: Mean Prior Grade 8 ELA -15.599 121.656
Count of Prior Regents Exams = 0 3.039 0.270
Count of Prior Regents Exams = 1 3.801 0.270
Count of Prior Regents Exams = 2 4.172 0.269
Count of Prior Regents Exams = 3 3.913 0.270
Count of Prior Regents Exams = 4 2.022 0.269
Count of Prior Regents Exams = 5 1.386 0.272
Count of Prior Regents Exams = 6 1.149 0.284
Count of Prior Regents Exams = 7 0.000 —
SWD 0.073 0.016
Gen Ed < 40% (LRE3) -0.308 0.060
Percent SWD -0.028 0.001
ELL -0.347 0.049
Percent ELL -0.003 0.001
NYSESLAT RW Scale Score 0.004 0.001
NYSESLAT LS Scale Score -0.001 0.001
Missing Flag: NYSESLAT Scale Scores 2.013 0.444
ED -0.243 0.012
Percent ED -0.010 <0.001
Table H-13. Grades 9–12, GRE, Year in School 3 Model Coefficients, Adjusted Model
Effect Name Estimate Standard Error
Intercept 1 -13.036 0.790
Intercept 2 -14.583 0.790
Intercept 3 -17.073 0.790
Intercept 4 -19.788 0.790
Intercept 5 -22.183 0.791
Intercept 6 -24.374 0.801
Intercept 7 -26.389 0.866
Intercept 8 -27.775 1.060
Grade 8 ELA Scale Score 0.003 <0.001
Missing Flag: Grade 8 ELA Scale Score 2.065 0.140
Grade 7 ELA Scale Score 0.002 <0.001
Missing Flag: Grade 7 ELA Scale Score 1.876 0.158
Grade 8 Math Scale Score 0.008 0.000
Missing Flag: Grade 8 Math Scale Score 5.233 0.182
Grade 7 Math Scale Score 0.003 <0.001
Missing Flag: Grade 7 Math Scale Score 1.963 0.170
Mean Prior Grade 8 ELA 0.003 0.001
Mean Prior Grade 8 Math -0.002 0.002
Missing Flag: Mean Prior Grade 8 ELA -15.031 220.405
Count of Prior Regents Exams = 0 0.726 0.033
Count of Prior Regents Exams = 1 2.003 0.033
Count of Prior Regents Exams = 2 2.956 0.031
Count of Prior Regents Exams = 3 3.123 0.028
Count of Prior Regents Exams = 4 3.224 0.026
Count of Prior Regents Exams = 5 3.742 0.024
Count of Prior Regents Exams = 6 1.794 0.022
Count of Prior Regents Exams = 7 0.000 —
SWD -0.406 0.017
Gen Ed < 40% (LRE3) -0.496 0.071
Percent SWD -0.020 0.001
ELL -0.546 0.053
Percent ELL -0.002 0.001
NYSESLAT RW Scale Score 0.004 0.001
NYSESLAT LS Scale Score 0.000 0.001
Missing Flag: NYSESLAT Scale Scores 2.505 0.492
ED -0.051 0.012
Percent ED -0.012 <0.001
Table H-14. Grades 9–12, GRE, Year in School 4 Model Coefficients, Adjusted Model
Effect Name Estimate Standard Error
Intercept 1 -15.852 1.316
Intercept 2 -17.743 1.316
Intercept 3 -19.333 1.316
Intercept 4 -21.045 1.317
Intercept 5 -22.744 1.321
Intercept 6 -25.778 1.408
Intercept 7 -26.471 1.494
Grade 8 ELA Scale Score -0.002 <0.001
Missing Flag: Grade 8 ELA Scale Score -1.122 0.316
Grade 7 ELA Scale Score 0.000 <0.001
Missing Flag: Grade 7 ELA Scale Score -0.006 0.296
Grade 8 Math Scale Score 0.009 <0.001
Missing Flag: Grade 8 Math Scale Score 5.742 0.301
Grade 7 Math Scale Score 0.006 <0.001
Missing Flag: Grade 7 Math Scale Score 3.486 0.278
Mean Prior Grade 8 ELA -0.002 0.002
Mean Prior Grade 8 Math 0.012 0.003
Count of Prior Regents Exams = 0 1.286 0.044
Count of Prior Regents Exams = 1 2.359 0.042
Count of Prior Regents Exams = 2 3.027 0.038
Count of Prior Regents Exams = 3 3.053 0.033
Count of Prior Regents Exams = 4 2.427 0.028
Count of Prior Regents Exams = 5 0.727 0.026
Count of Prior Regents Exams = 6 0.148 0.026
Count of Prior Regents Exams = 7 0.000 —
SWD -0.706 0.024
Gen Ed < 40% (LRE3) -0.455 0.100
Percent SWD -0.007 0.002
ELL -0.279 0.061
Percent ELL 0.004 0.001
NYSESLAT RW Scale Score 0.002 0.001
NYSESLAT LS Scale Score -0.002 0.001
Missing Flag: NYSESLAT Scale Scores -0.367 0.563
ED 0.196 0.018
Percent ED 0.002 0.001
Table H-15. Grades 9–12, GRE, Year in School 5+ Model Coefficients, Adjusted Model
Effect Name Estimate Standard Error
Intercept 1 -9.735 4.010
Intercept 2 -11.303 4.010
Intercept 3 -12.799 4.011
Intercept 4 -14.371 4.013
Intercept 5 -16.305 4.026
Grade 8 ELA Scale Score 0.002 0.001
Missing Flag: Grade 8 ELA Scale Score 1.256 0.703
Grade 7 ELA Scale Score 0.001 0.001
Missing Flag: Grade 7 ELA Scale Score 0.748 0.600
Grade 8 Math Scale Score 0.004 0.001
Missing Flag: Grade 8 Math Scale Score 2.530 0.643
Grade 7 Math Scale Score 0.001 0.001
Missing Flag: Grade 7 Math Scale Score 0.595 0.636
Mean Prior Grade 8 Math -0.004 0.004
Mean Prior Grade 8 ELA 0.007 0.009
Missing Flag: Mean Prior Grade 8 ELA 4.140 5.285
Count of Prior Regents Exams = 0 0.994 0.209
Count of Prior Regents Exams = 1 1.974 0.202
Count of Prior Regents Exams = 2 2.420 0.197
Count of Prior Regents Exams = 3 2.530 0.194
Count of Prior Regents Exams = 4 2.186 0.191
Count of Prior Regents Exams = 5 0.751 0.196
Count of Prior Regents Exams = 6 0.050 0.218
Count of Prior Regents Exams = 7 0.000 —
SWD -0.612 0.064
Gen Ed < 40% (LRE3) -0.395 0.196
Percent SWD -0.012 0.005
ELL 0.069 0.115
Percent ELL 0.000 0.003
NYSESLAT RW Scale Score 0.003 0.002
NYSESLAT LS Scale Score -0.002 0.002
Missing Flag: NYSESLAT Scale Scores 0.283 1.151
ED 0.320 0.051
Percent ED 0.002 0.002
Table H-16. Grades 9–12, Algebra Model Coefficients, Adjusted Model
Effect Name Estimate Standard Error
Grade 8 Math Scale Score 0.158 0.002
Missing Flag: Grade 8 Math Scale Score 104.598 1.264
Grade 7 Math Scale Score 0.156 0.002
Missing Flag: Grade 7 Math Scale Score 100.435 1.590
Grade 8 ELA Scale Score 0.078 0.003
Missing Flag: Grade 8 ELA Scale Score 50.244 2.075
Grade 7 ELA Scale Score -0.021 0.004
Missing Flag: Grade 7 ELA Scale Score -9.664 2.263
Mean Prior Grade 8 Math 0.052 0.003
Count of Prior Required Regents Exams = 0 -191.593 2.623
Count of Prior Required Regents Exams = 1 -189.782 2.630
Count of Prior Required Regents Exams = 2 -189.203 2.636
Count of Prior Required Regents Exams = 3 -188.580 2.642
Count of Prior Required Regents Exams = 4 -188.090 2.651
Count of Prior Required Regents Exams = 5 -187.877 2.680
Cohort 1 -6.292 0.338
Cohort 2 -6.306 0.333
Cohort 3 -8.083 0.333
Cohort 4 -4.813 0.340
SWD -2.313 0.068
Gen Ed < 40% (LRE3) -1.544 0.233
Percent SWD -0.042 0.005
ELL -0.969 0.195
Percent ELL 0.002 0.003
NYSESLAT LS Scale Score -0.014 0.002
NYSESLAT RW Scale Score 0.002 0.003
Missing Flag: NYSESLAT Scale Scores -10.069 1.698
ED -0.702 0.054
Percent ED -0.062 0.001
Table H-17. Grades 9–12, ELA Model Coefficients, Adjusted Model
Effect Name Estimate Standard Error
Grade 8 ELA Scale Score 0.110 0.001
Missing Flag: Grade 8 ELA Scale Score 71.773 0.999
Grade 7 ELA Scale Score 0.087 0.002
Missing Flag: Grade 7 ELA Scale Score 57.588 0.997
Grade 8 Math Scale Score 0.071 0.002
Missing Flag: Grade 8 Math Scale Score 47.452 1.360
Grade 7 Math Scale Score 0.024 0.002
Missing Flag: Grade 7 Math Scale Score 17.227 1.321
Mean Prior Grade 8 ELA 0.029 0.004
Missing Flag: Mean Prior Grade 8 ELA 4.295 11.998
Count of Prior Required Regents Exams = 0 -168.250 3.697
Count of Prior Required Regents Exams = 1 -164.224 3.696
Count of Prior Required Regents Exams = 2 -159.886 3.697
Count of Prior Required Regents Exams = 3 -156.135 3.699
Count of Prior Required Regents Exams = 4 -156.826 3.700
Count of Prior Required Regents Exams = 5 -153.501 3.725
Cohort 1 1.167 0.418
Cohort 2 -1.197 0.309
Cohort 3 -2.615 0.302
Cohort 4 -3.312 0.311
SWD -6.290 0.079
Gen Ed < 40% (LRE3) -6.011 0.315
Percent SWD -0.003 0.006
ELL -3.423 0.238
Percent ELL 0.039 0.004
NYSESLAT RW Scale Score 0.013 0.004
NYSESLAT LS Scale Score 0.023 0.003
Missing Flag: NYSESLAT Scale Scores 25.893 2.469
ED -0.764 0.058
Percent ED -0.049 0.001
Appendix I. Grades 4–8 Impact Charts by Grade and Subject
Table I-1. Impact Correlations by Grade for ELA
Grade | %ELL | %SWD | %ED | Mean Prior Scale Score
4 0.04 0.10 0.06 0.03
5 0.08 0.04 0.06 0.08
6 0.03 0.07 0.03 -0.01
7 0.12 0.06 0.10 -0.02
8 0.08 0.06 0.03 0.00
Table I-2. Impact Correlations by Grade for Math
Grade | %ELL | %SWD | %ED | Mean Prior Scale Score
4 0.05 0.05 0.04 0.16
5 0.04 0.07 0.05 0.09
6 0.00 0.01 -0.01 0.08
7 0.02 -0.02 0.01 0.16
8 0.03 0.02 0.01 0.18
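Correlations like those in Tables I-1 and I-2, between educator-level results and classroom composition, can be computed with an ordinary Pearson correlation (assumed here to be the statistic used, consistent with the column headings). A minimal sketch in plain Python; the two sample lists are hypothetical:

```python
import math

def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

growth_scores = [48.0, 52.0, 50.0, 55.0, 47.0]  # hypothetical educator scores
pct_ell = [10.0, 5.0, 8.0, 2.0, 12.0]           # hypothetical classroom %ELL
r = pearson(growth_scores, pct_ell)
```

Values near zero, as in the tables above, indicate that classroom composition explains little of the variation in the educator measure.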
LOCATIONS
Domestic
Washington, D.C.
Atlanta, GA
Baltimore, MD
Chapel Hill, NC
Chicago, IL
Columbus, OH
Concord, MA
Frederick, MD
Honolulu, HI
Naperville, IL
New York, NY
Portland, OR
Sacramento, CA
San Diego, CA
San Mateo, CA
Silver Spring, MD
International
Egypt
Ethiopia
Georgia
Haiti
Honduras
Kenya
Liberia
Malawi
Nicaragua
Pakistan
South Africa
Zambia
ABOUT AMERICAN INSTITUTES FOR RESEARCH
Established in 1946, with headquarters in Washington, D.C., AIR is a
nonpartisan, not-for-profit organization that conducts behavioral
and social science research and delivers technical assistance
both domestically and internationally. As one of the largest
behavioral and social science research organizations in the world,
AIR is committed to empowering communities and institutions with
innovative solutions to the most critical challenges in education,
health, workforce, and international development.
1000 Thomas Jefferson Street NW
Washington, DC 20007-3835
202.403.5000 | TTY: 877.334.3499
www.air.org