John Cronin, Ph.D., Director
The Kingsbury Center @ NWEA
Implementing the Oregon Framework for Teacher and Administrator Professional Evaluation
Presenter - John Cronin, Ph.D.
Contacting us:
Rebecca Moore: 503-548-5129
E-mail: [email protected]
This PowerPoint presentation and recommended resources are available at our SlideShare website: http://www.slideshare.net/NWEA/tag/kingsbury-center
Implementing the Oregon Framework for Teacher and Administrator Professional Evaluation
Suggested reading
Baker, B., Oluwole, J., & Green, P. (2013). The legal consequences of mandating high stakes decisions based on low quality information: Teacher evaluation in the Race to the Top era. Education Policy Analysis Archives, 21(5).
We have entered an area of the law with little guiding precedent. In such cases, remember…
It’s rarely good to be the first wildebeest into the Zambezi River.
The difference between formative and summative evaluation
• Formative evaluation is intended to give educators useful feedback to help them improve their job performance. For most educators formative evaluation should be the focus of the process.
• Summative evaluation is a judgment of educator performance that informs future decisions about employment, including the granting of tenure, performance pay, and protection from layoff.
Teacher evaluation requirement
• Tested subjects (math and ELA)
– State assessment
– One additional measure
• Non-tested subjects (all others) – two of the following:
– Common measure (national, regional, or district developed) – mandatory
– Classroom-based measure
– School-wide measure
Administrator evaluation requirement
• Two goals
– One must be the state assessment
– The other must come from:
• A common measurement
• Other measures such as graduation rate, attendance rate, discipline data, and college readiness indicators
– Aligned with achievement compact indicators when possible
Unique features of Oregon’s approach
• Peer review panels for evaluation and support systems
• Local control
– Student growth data isn’t a fixed percentage of the evaluation
– Districts decide how to translate performance to ratings.
Distinguishing teacher effectiveness from teacher evaluation
• Teacher effectiveness – The judgment of a teacher’s ability to positively impact learning in the classroom.
• Teacher evaluation – The judgment of a teacher’s overall performance, including:
– Teacher effectiveness
– Common standards of job performance
– Participation in the school community
– Adherence to professional standards
Distinguishing teacher effectiveness from teacher evaluation
• Testing – A claim that the improvement in learning (or lack of it) reflected on one or more tests is caused by the teacher.
• Classroom observation – A claim that the observer’s ratings or conclusions are reliable and associated with behaviors that cause improved learning in the classroom.
Purposes of summative evaluation
• An accurate and defensible judgment of an educator’s job performance.
• Ratings of performance that provide meaningful differentiation across educators.
Differences between principal and teacher evaluation
Teachers
• Receive new groups of students each year.
• Instruction of that group is limited to one school year.
Thus teachers are normally held accountable for yearly growth.
Differences between principal and teacher evaluation
Principals
• Inherit a pre-existing staff.
• Have limited control over staffing conditions.
• Work with this intact group from year to year.
Thus principals should be evaluated on their ability to improve growth, or to maintain high levels of growth over time, rather than on their students’ growth within a single school year.
The testing to teacher evaluation process
Issues in the use of growth measures
Measurement design of the instrument
Many assessments are not designed to measure growth. Others do not measure growth equally well for all students.
Tests are not equally accurate for all students
[Charts: measurement accuracy across the score range for California STAR and NWEA MAP]
Issues in the use of growth measures
Instructional alignment
Tests used for teacher evaluation should align to the teacher’s instructional responsibilities.
Common problems with instructional alignment
• Using school level math and reading results in the evaluation of music, art, and other specials teachers.
• Using general tests of a discipline (reading, math, science) as a major component of the evaluation of high school teachers delivering specialized courses.
The testing to teacher evaluation process
The problem with spring-spring testing
[Timeline, 3/11–3/12: a spring-to-spring testing window spans the end of Teacher 1’s instruction, the summer, and Teacher 2’s year, so summer loss and part of Teacher 1’s instruction are attributed to Teacher 2.]
A better approach
• When possible, use a spring – fall – spring approach.
• Measure summer loss and incentivize schools and teachers to minimize it.
• Measure teacher performance fall to spring, giving as much instructional time as possible between assessments.
• Monitor testing conditions to minimize gaming of fall–spring results.
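The arithmetic behind this recommendation can be sketched with hypothetical scores (all numbers below are invented for illustration):

```python
# Hypothetical scores for one student; all values are invented.
spring_prior = 210   # end of previous year, with Teacher 1
fall = 205           # start of this year, after summer loss
spring = 218         # end of this year, with Teacher 2

spring_to_spring = spring - spring_prior   # charges summer loss to Teacher 2
summer_loss = fall - spring_prior          # change over the summer
fall_to_spring = spring - fall             # growth during Teacher 2's instruction

print(spring_to_spring, summer_loss, fall_to_spring)  # 8 -5 13
```

Under a spring-spring design, Teacher 2 would be credited with only 8 points of growth even though her students gained 13 points during her instruction; the fall test isolates the 5-point summer loss.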
Issues in the use of growth and value-added measures
Instability of results
A variety of factors can cause value-added results to lack stability.
Results are more likely to be stable at the extremes. The use of multiple years of data is highly recommended.
Issues in the use of growth and value-added measures
“Among those who ranked in the top category on the TAKS reading test, more than 17% ranked among the lowest two categories on the Stanford. Similarly more than 15% of the lowest value-added teachers on the TAKS were in the highest two categories on the Stanford.”
Corcoran, S., Jennings, J., & Beveridge, A. (2010). Teacher effectiveness on high and low stakes tests. Paper presented at the Institute for Research on Poverty summer workshop, Madison, WI.
Teachers with growth scores in lowest and highest quintile over two years using NWEA’s Measures of Academic Progress
            Bottom quintile Y1 & Y2    Top quintile Y1 & Y2
Number      59/493                     63/493
Percent     12%                        13%

r = .64, r² = .41
Typical r values for measures of teaching effectiveness range between .30 and .60 (Brown Center on Education Policy, 2010)
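Since r² is the proportion of variance in one year's scores explained by the other year's, the figures above can be checked directly. A quick sketch using only the values quoted on these slides:

```python
# r = .64 from the MAP quintile analysis above; r-squared is the share of
# variance in one year's growth scores explained by the prior year's.
r = 0.64
print(round(r ** 2, 2))  # 0.41

# The typical range of r reported by the Brown Center (2010):
for r in (0.30, 0.60):
    print(f"r = {r:.2f} -> variance explained = {r ** 2:.0%}")
```

Even at the top of the typical range, a single year of growth data explains barely a third of the variance in the next year's result.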
Reliability of teacher value-added estimates
The testing to teacher evaluation process
Value-added measures
• Estimate the progress of students relative to the progress that would be expected of students like these.
• Adjust for factors that are beyond the teacher’s control – for example:
– Starting score
– Income of the students
– ELL status
– Mobility
• The end result is an estimate of the teacher’s contribution to learning relative to other teachers.
How value-add analysis works

                 Erica      Juan
Starting score
Lunch status     Free       Not eligible
ELL status       English    ELL
SPED status      Dyslexic   Regular ed
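The adjustment described above can be sketched as a simple regression on simulated data. This is a deliberately simplified, hypothetical model (real value-added systems use far more elaborate mixed models); every variable name and coefficient here is invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Simulated students: starting score plus factors outside the teacher's control.
start = rng.normal(200, 10, n)        # fall starting scores
lunch = rng.integers(0, 2, n)         # 1 = free-lunch eligible (income proxy)
ell = rng.integers(0, 2, n)           # 1 = English language learner
teacher = rng.integers(0, 5, n)       # five hypothetical teachers
true_effect = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])  # unknown in practice

# Spring score = covariate-driven growth + the teacher's contribution + noise.
end = start + 8 - 1.5 * lunch - 2.0 * ell + true_effect[teacher] + rng.normal(0, 2, n)

# Step 1: predict each student's expected score from the covariates alone.
X = np.column_stack([np.ones(n), start, lunch, ell])
beta, *_ = np.linalg.lstsq(X, end, rcond=None)
expected = X @ beta

# Step 2: a teacher's value-added estimate is the mean residual of her
# students -- how far they landed above or below statistically similar peers.
value_added = np.array([(end - expected)[teacher == t].mean() for t in range(5)])
print(np.round(value_added, 1))
```

With enough students per teacher, the estimates recover the ordering of the simulated effects; with realistic class sizes the error bands are much wider, which is the instability discussed on the preceding slides.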
Issues in the use of growth and value-added measures
Differences among value-added models
Los Angeles Times Study
Issues in the use of value-added measures
Control for statistical error
All models attempt to address this issue. Nevertheless, many teachers’ value-added scores will fall within the range of statistical error.
Test Question:
There are 300 students in the 10th grade. Mary and Mark want to find the students’ favorite color. Mary asks 30 people. Mark asks 150 people. Mark says:
“My conclusions are more likely to be right than Mary’s.”
Why does Mark think he is right?
Because he is a man.
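Mark is right for a different reason: the margin of error of a sample proportion shrinks with the square root of the sample size. A minimal sketch (the 40% figure is an invented example value):

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion p estimated from n responses."""
    return z * math.sqrt(p * (1 - p) / n)

# Suppose 40% of the sampled students name blue as their favorite color.
mary = margin_of_error(0.40, 30)    # Mary's 30 students
mark = margin_of_error(0.40, 150)   # Mark's 150 students
print(f"Mary: ±{mary:.0%}  Mark: ±{mark:.0%}")  # Mary: ±18%  Mark: ±8%
```

The same logic applies to a teacher's value-added score: estimates based on a single small class carry a margin of error wide enough to span several rating categories.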
Range of teacher value-added estimates
Issues in the use of growth and value-added measures
Control for statistical error
New York City
Limitations of value-added metrics
• Value-added metrics are inherently NORMATIVE.
• Value-added metrics are not useful in small or medium district settings.
• Value-added metrics don’t measure improvement in the teacher population over time.
• Changes in the value-added model have the greatest effect on extreme cases.
What Makes Schools Work Study - Mathematics
The data used represent a portion of the teachers who participated in Vanderbilt University’s What Makes Schools Work project, funded by the federal Institute of Education Sciences.
What Makes Schools Work Study - Reading
The testing to teacher evaluation process
If evaluators do not differentiate their ratings, then all differentiation comes from the test.
Results of Tennessee Teacher Evaluation Pilot
Results of Georgia Teacher Evaluation Pilot
Bill and Melinda Gates Foundation (2013, January). Ensuring Fair and Reliable Measures of Effective Teaching: Culminating Findings from the MET Project’s Three-Year Study.
Model (component weights)                                                   Reliability coefficient (relative to state test value-added gain)   Proportion of variance explained

Model 1 – State test 81%, Student surveys 17%, Classroom observations 2%    .51   26.0%
Model 2 – State test 50%, Student surveys 25%, Classroom observations 25%   .66   43.5%
Model 3 – State test 33%, Student surveys 33%, Classroom observations 33%   .76   57.7%
Model 4 – Classroom observations 50%, State test 25%, Student surveys 25%   .75   56.2%
Reliability of evaluation weights in predicting stability of student growth gains year to year
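The MET models above are weighted composites of standardized component scores. A minimal sketch of how such a composite is formed (the component z-scores below are invented; the weights are Model 2's):

```python
# Model 2 weights from the MET table above.
weights = {"state_test": 0.50, "student_survey": 0.25, "observation": 0.25}

def composite(scores: dict[str, float]) -> float:
    """Weighted composite of standardized (z-score) component measures."""
    return sum(weights[k] * scores[k] for k in weights)

# A hypothetical teacher: strong test growth, weak surveys, good observations.
teacher = {"state_test": 0.8, "student_survey": -0.2, "observation": 0.5}
print(round(composite(teacher), 3))  # 0.475
```

Shifting weight away from the state test toward surveys and observations lowered year-to-year volatility in the MET data, which is why Models 2–4 outperform the test-heavy Model 1 in the table above.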
Bill and Melinda Gates Foundation (2013, January). Ensuring Fair and Reliable Measures of Effective Teaching: Culminating Findings from the MET Project’s Three-Year Study.
Observation configuration                                                                 Reliability coefficient (relative to state test value-added gain)   Proportion of variance explained

Principal – 1                                                                             .51   26.0%
Principal – 2                                                                             .58   33.6%
Principal and other administrator                                                         .67   44.9%
Principal and three short observations by peer observers                                  .67   44.9%
Two principal observations and two peer observations                                      .66   43.6%
Two principal observations and two different peer observers                               .69   47.6%
Two principal observations, one peer observation, and three short observations by peers   .72   51.8%
Reliability of a variety of teacher observation implementations
An example of an arbitrary rating matrix
Other issues
Security and Cheating
When measuring growth, one teacher who cheats disadvantages the next teacher.
Cheating
Atlanta Public Schools
Crescendo Charter Schools
Philadelphia Public Schools
Washington DC Public Schools
Houston Independent School District
Michigan Public Schools
Cheating
Atlanta Journal Constitution Database
Mean spring and fall test duration in minutes by school
Mean value-added growth by school
Security considerations
• Teachers should not be allowed to view the contents of the item bank or record items.
• Districts should have policies for accommodations that are based on student IEPs.
• Districts should consider having both the teacher and a proctor in the test room.
• Districts should consider whether other security measures are needed for the protection of both teachers and administrators.
14th Amendment – Due Process Clause
Procedural Due Process
“A right to a fair procedure or set of procedures before one can be deprived of property by the state.”
Procedural due process in educational employment
• Follow the contract.
• Give notice.
• Give opportunity to improve.
• Provide a process of appeal or review.
Substantive due process
“…the state is obligated to avoid action which is arbitrary and capricious, does not achieve or even frustrates a legitimate state interest, or is fundamentally unfair.”
Federal 5th Circuit, Debra P. v. Turlington, 1984, p.404
Quoted in Baker, Oluwole, and Green (2013)
Substantive due process
“…(state practices) reflect a substantial departure from academic norms as to demonstrate that the person or committee did not actually exercise professional judgment.”
GI Forum Image de Tejas v. Texas Education Agency, 2000, p. 682, quoting Regents of the University of Michigan v. Ewing, 1985, p. 225
Quoted in Baker, Oluwole, and Green (2013)
How state teacher evaluation policies introduce new problems.
“Due process is violated where administrators or other decision makers place blind faith in the quantitative measures, assuming them to be causal and valid and applying arbitrary and capricious cut-off points to these measures (performance categories leading to dismissal).”

“The problem…is that some of these state statutes require these due process violations, even when the informed, thoughtful professional understands full well that she is being forced to make the wrong decision. They require that decision makers take action based on these measures even against their informed professional judgment.”

Baker, Oluwole, & Green (2013)
Title VII of the Civil Rights Act of 1964
In United States employment law, the doctrine of disparate impact holds that employment practices may be considered discriminatory and illegal if they have a disproportionate "adverse impact" on members of a minority group.
Under the doctrine, a violation of Title VII of the 1964 Civil Rights Act may be proven by showing that an employment practice or policy has a disproportionately adverse effect on members of the protected class as compared with non-members of the protected class.
Retrieved on February 23, 2013 from Wikipedia http://en.wikipedia.org/wiki/Disparate_impact
Title VII of the Civil Rights Act of 1964
This form of discrimination occurs where an employer does not intend to discriminate; to the contrary, it occurs when identical standards or procedures are applied to everyone, despite the fact that they lead to a substantial difference in employment outcomes for the members of a particular group and they are unrelated to successful job performance.
Retrieved on February 23, 2013 from Wikipedia http://en.wikipedia.org/wiki/Disparate_impact
Disparate impact doctrine
Presenter - John Cronin, Ph.D.
Contacting us:
NWEA Main Number: 503-624-1951
E-mail: [email protected]
The presentation and recommended resources are available at our SlideShare site: http://www.slideshare.net/NWEA/tag/kingsbury-center
Thank you for attending