John Cronin, Ph.D., Director
The Kingsbury Center @ NWEA
Implementing the Oregon Framework for Teacher and Administrator Professional Evaluation
Presenter - John Cronin, Ph.D.
Contacting us:
Rebecca Moore: 503-548-5129
E-mail: [email protected]
This PowerPoint presentation and recommended resources are available at our SlideShare website: http://www.slideshare.net/NWEA/tag/kingsbury-center
Implementing the Oregon Framework for Teacher and Administrator Professional Evaluation
Suggested reading
Baker, B., Oluwole, J., & Green, P. (2013). The legal consequences of mandating high stakes decisions based on low quality information: Teacher evaluation in the Race to the Top era. Education Policy Analysis Archives, 21(5).
We have entered an area of the law with little guiding precedent. In such cases, remember…
It’s rarely good to be the first wildebeest into the Zambezi River.
The difference between formative and summative evaluation
• Formative evaluation is intended to give educators useful feedback to help them improve their job performance. For most educators formative evaluation should be the focus of the process.
• Summative evaluation is a judgment of educator performance that informs future decisions about employment, including the granting of tenure, performance pay, and protection from layoff.
Teacher evaluation requirement
• Tested subjects (math and ELA)
– State assessment
– One additional measure
• Non-tested subjects (all others) – two of the following:
– Common measure (national, regional, or district developed) – mandatory
– Classroom-based measure
– School-wide measure
Administrator evaluation requirement
• Two goals
– One must be the state assessment
– The other must come from:
• A common measurement
• Other measures such as graduation rate, attendance rate, discipline data, and college readiness indicators
– Aligned with achievement compact indicators when possible
Unique features of Oregon’s approach
• Peer review panels for evaluation and support systems
• Local control
– Student growth data isn’t a fixed percentage of the evaluation
– Districts decide how to translate performance to ratings.
Distinguishing teacher effectiveness from teacher evaluation
• Teacher effectiveness – The judgment of a teacher’s ability to positively impact learning in the classroom.
• Teacher evaluation – The judgment of a teacher’s overall performance, including:
– Teacher effectiveness
– Common standards of job performance
– Participation in the school community
– Adherence to professional standards
Distinguishing teacher effectiveness from teacher evaluation
• Testing – A claim that the improvement in learning (or lack of it) reflected on one or more tests is caused by the teacher.
• Classroom observation – A claim that the observer’s ratings or conclusions are reliable and associated with behaviors that cause improved learning in the classroom.
Purposes of summative evaluation
• An accurate and defensible judgment of an educator’s job performance.
• Ratings of performance that provide meaningful differentiation across educators.
Differences between principal and teacher evaluation
Teachers
• Receive new groups of students each year.
• Instruction of that group is limited to one school year.
Thus teachers are normally held accountable for yearly growth.
Differences between principal and teacher evaluation
Principals
• Inherit a pre-existing staff.
• Have limited control over staffing conditions.
• Work with this intact group from year to year.
Thus principals should be evaluated on their ability to improve growth, or to maintain high levels of growth over time, rather than on their students’ growth within a single school year.
The testing to teacher evaluation process
Issues in the use of growth measures
Measurement design of the instrument
Many assessments are not designed to measure growth. Others do not measure growth equally well for all students.
Tests are not equally accurate for all students
[Charts: measurement accuracy across the score range for California STAR and NWEA MAP]
Issues in the use of growth measures
Instructional alignment
Tests used for teacher evaluation should align to the teacher’s instructional responsibilities.
Common problems with instructional alignment
• Using school level math and reading results in the evaluation of music, art, and other specials teachers.
• Using general tests of a discipline (reading, math, science) as a major component of the evaluation of high school teachers delivering specialized courses.
The testing to teacher evaluation process
The problem with spring-spring testing
[Timeline, 3/11–3/12: a spring-to-spring testing window spans the end of Teacher 1’s instruction, the summer, and Teacher 2’s year, so summer loss and part of Teacher 1’s instruction are attributed to Teacher 2.]
A better approach
• When possible, use a spring – fall – spring approach.
• Measure summer loss and incentivize schools and teachers to minimize it.
• Measure teacher performance fall to spring, giving as much instructional time as possible between assessments.
• Monitor testing conditions to minimize gaming of fall–spring results.
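The arithmetic behind this recommendation can be sketched with hypothetical scores (all numbers below are invented for illustration):

```python
# Hypothetical scores for one student; all values are invented.
spring_prior = 210   # end of previous year, with Teacher 1
fall = 205           # start of this year, after summer loss
spring = 218         # end of this year, with Teacher 2

spring_to_spring = spring - spring_prior   # charges summer loss to Teacher 2
summer_loss = fall - spring_prior          # change over the summer
fall_to_spring = spring - fall             # growth during Teacher 2's instruction

print(spring_to_spring, summer_loss, fall_to_spring)  # 8 -5 13
```

Under a spring-spring design, Teacher 2 would be credited with only 8 points of growth even though her students gained 13 points during her instruction; the fall test isolates the 5-point summer loss.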
Issues in the use of growth and value-added measures
Instability of results
A variety of factors can cause value-added results to lack stability.
Results are more likely to be stable at the extremes. The use of multiple years of data is highly recommended.
Issues in the use of growth and value-added measures
“Among those who ranked in the top category on the TAKS reading test, more than 17% ranked among the lowest two categories on the Stanford. Similarly more than 15% of the lowest value-added teachers on the TAKS were in the highest two categories on the Stanford.”
Corcoran, S., Jennings, J., & Beveridge, A. (2010). Teacher effectiveness on high and low stakes tests. Paper presented at the Institute for Research on Poverty summer workshop, Madison, WI.
Teachers with growth scores in lowest and highest quintile over two years using NWEA’s Measures of Academic Progress
            Bottom quintile Y1 & Y2    Top quintile Y1 & Y2
Number      59/493                     63/493
Percent     12%                        13%

r = .64, r² = .41
Typical r values for measures of teaching effectiveness range between .30 and .60 (Brown Center on Education Policy, 2010)
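Since r² is the proportion of variance in one year's scores explained by the other year's, the figures above can be checked directly. A quick sketch using only the values quoted on these slides:

```python
# r = .64 from the MAP quintile analysis above; r-squared is the share of
# variance in one year's growth scores explained by the prior year's.
r = 0.64
print(round(r ** 2, 2))  # 0.41

# The typical range of r reported by the Brown Center (2010):
for r in (0.30, 0.60):
    print(f"r = {r:.2f} -> variance explained = {r ** 2:.0%}")
```

Even at the top of the typical range, a single year of growth data explains barely a third of the variance in the next year's result.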
Reliability of teacher value-added estimates
The testing to teacher evaluation process
Value-added measures
• Estimate the progress of students relative to the progress that would be expected of students like these.
• Adjust for factors that are beyond the teacher’s control – for example:
– Starting score
– Income of the students
– ELL status
– Mobility
• The end result is an estimate of the teacher’s contribution to learning relative to other teachers.
How value-add analysis works

                 Erica      Juan
Starting score
Lunch status     Free       Not eligible
ELL status       English    ELL
SPED status      Dyslexic   Regular ed
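The adjustment described above can be sketched as a simple regression on simulated data. This is a deliberately simplified, hypothetical model (real value-added systems use far more elaborate mixed models); every variable name and coefficient here is invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Simulated students: starting score plus factors outside the teacher's control.
start = rng.normal(200, 10, n)        # fall starting scores
lunch = rng.integers(0, 2, n)         # 1 = free-lunch eligible (income proxy)
ell = rng.integers(0, 2, n)           # 1 = English language learner
teacher = rng.integers(0, 5, n)       # five hypothetical teachers
true_effect = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])  # unknown in practice

# Spring score = covariate-driven growth + the teacher's contribution + noise.
end = start + 8 - 1.5 * lunch - 2.0 * ell + true_effect[teacher] + rng.normal(0, 2, n)

# Step 1: predict each student's expected score from the covariates alone.
X = np.column_stack([np.ones(n), start, lunch, ell])
beta, *_ = np.linalg.lstsq(X, end, rcond=None)
expected = X @ beta

# Step 2: a teacher's value-added estimate is the mean residual of her
# students -- how far they landed above or below statistically similar peers.
value_added = np.array([(end - expected)[teacher == t].mean() for t in range(5)])
print(np.round(value_added, 1))
```

With enough students per teacher, the estimates recover the ordering of the simulated effects; with realistic class sizes the error bands are much wider, which is the instability discussed on the preceding slides.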
Issues in the use of growth and value-added measures
Differences among value-added models
Los Angeles Times Study
Issues in the use of value-added measures
Control for statistical error
All models attempt to address this issue. Nevertheless, many teachers’ value-added scores will fall within the range of statistical error.
Test Question:
There are 300 students in the 10th grade. Mary and Mark want to find the students’ favorite color. Mary asks 30 people. Mark asks 150 people. Mark says:
“My conclusions are more likely to be right than Mary’s.”
Why does Mark think he is right?
Because he is a man.
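Mark is right for a different reason: the margin of error of a sample proportion shrinks with the square root of the sample size. A minimal sketch (the 40% figure is an invented example value):

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion p estimated from n responses."""
    return z * math.sqrt(p * (1 - p) / n)

# Suppose 40% of the sampled students name blue as their favorite color.
mary = margin_of_error(0.40, 30)    # Mary's 30 students
mark = margin_of_error(0.40, 150)   # Mark's 150 students
print(f"Mary: ±{mary:.0%}  Mark: ±{mark:.0%}")  # Mary: ±18%  Mark: ±8%
```

The same logic applies to a teacher's value-added score: estimates based on a single small class carry a margin of error wide enough to span several rating categories.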
Range of teacher value-added estimates
Issues in the use of growth and value-added measures
Control for statistical error
New York City
Limitations of value-added metrics
• Value-added metrics are inherently NORMATIVE.
• Value-added metrics are not useful in small or medium district settings.
• Value-added metrics don’t measure improvement in the teacher population over time.
• Changes in the value-added model have the greatest effect on extreme cases.
What Makes Schools Work Study - Mathematics
The data used represent a portion of the teachers who participated in Vanderbilt University’s What Makes Schools Work project, funded by the federal Institute of Education Sciences.
What Makes Schools Work Study - Reading
The testing to teacher evaluation process
If evaluators do not differentiate their ratings, then all differentiation comes from the test.
Results of Tennessee Teacher Evaluation Pilot
Results of Georgia Teacher Evaluation Pilot
Bill and Melinda Gates Foundation (2013, January). Ensuring Fair and Reliable Measures of Effective Teaching: Culminating Findings from the MET Project’s Three-Year Study.
Model (component weights)                                                   Reliability coefficient (relative to state test value-added gain)   Proportion of variance explained

Model 1 – State test 81%, Student surveys 17%, Classroom observations 2%    .51   26.0%
Model 2 – State test 50%, Student surveys 25%, Classroom observations 25%   .66   43.5%
Model 3 – State test 33%, Student surveys 33%, Classroom observations 33%   .76   57.7%
Model 4 – Classroom observations 50%, State test 25%, Student surveys 25%   .75   56.2%
Reliability of evaluation weights in predicting stability of student growth gains year to year
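The MET models above are weighted composites of standardized component scores. A minimal sketch of how such a composite is formed (the component z-scores below are invented; the weights are Model 2's):

```python
# Model 2 weights from the MET table above.
weights = {"state_test": 0.50, "student_survey": 0.25, "observation": 0.25}

def composite(scores: dict[str, float]) -> float:
    """Weighted composite of standardized (z-score) component measures."""
    return sum(weights[k] * scores[k] for k in weights)

# A hypothetical teacher: strong test growth, weak surveys, good observations.
teacher = {"state_test": 0.8, "student_survey": -0.2, "observation": 0.5}
print(round(composite(teacher), 3))  # 0.475
```

Shifting weight away from the state test toward surveys and observations lowered year-to-year volatility in the MET data, which is why Models 2–4 outperform the test-heavy Model 1 in the table above.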
Bill and Melinda Gates Foundation (2013, January). Ensuring Fair and Reliable Measures of Effective Teaching: Culminating Findings from the MET Project’s Three-Year Study.
Observation configuration                                                                 Reliability coefficient (relative to state test value-added gain)   Proportion of variance explained

Principal – 1                                                                             .51   26.0%
Principal – 2                                                                             .58   33.6%
Principal and other administrator                                                         .67   44.9%
Principal and three short observations by peer observers                                  .67   44.9%
Two principal observations and two peer observations                                      .66   43.6%
Two principal observations and two different peer observers                               .69   47.6%
Two principal observations, one peer observation, and three short observations by peers   .72   51.8%
Reliability of a variety of teacher observation implementations
An example of an arbitrary rating matrix
Other issues
Security and Cheating
When measuring growth, one teacher who cheats disadvantages the next teacher.
Cheating
Atlanta Public Schools
Crescendo Charter Schools
Philadelphia Public Schools
Washington DC Public Schools
Houston Independent School District
Michigan Public Schools
Cheating
Atlanta Journal Constitution Database
Mean spring and fall test duration in minutes by school
Mean value-added growth by school
Security considerations
• Teachers should not be allowed to view the contents of the item bank or record items.
• Districts should have policies for accommodations that are based on student IEPs.
• Districts should consider having both the teacher and a proctor in the test room.
• Districts should consider whether other security measures are needed for the protection of both teachers and administrators.
14th Amendment – Due Process Clause
Procedural Due Process
“A right to a fair procedure or set of procedures before one can be deprived of property by the state.”
Procedural due process in educational employment
• Follow the contract.
• Give notice.
• Give opportunity to improve.
• Provide a process of appeal or review.
Substantive due process
“…the state is obligated to avoid action which is arbitrary and capricious, does not achieve or even frustrates a legitimate state interest, or is fundamentally unfair.”
Federal 5th Circuit, Debra P. v. Turlington, 1984, p.404
Quoted in Baker, Oluwole, and Green (2013)
Substantive due process
“…(state practices) reflect a substantial departure from academic norms as to demonstrate that the person or committee did not actually exercise professional judgment.”
GI Forum Image de Tejas v. Texas Education Agency, 2000, p. 682, quoting Regents of the University of Michigan v. Ewing, 1985, p. 225
Quoted in Baker, Oluwole, and Green (2013)
How state teacher evaluation policies introduce new problems.
“Due process is violated where administrators or other decision makers place blind faith in the quantitative measures, assuming them to be causal and valid and applying arbitrary and capricious cut-off points to these measures (performance categories leading to dismissal).”

“The problem…is that some of these state statutes require these due process violations, even when the informed, thoughtful professional understands full well that she is being forced to make the wrong decision. They require that decision makers take action based on these measures even against their informed professional judgment.”

Baker, Oluwole, & Green (2013)
Title VII of the Civil Rights Act of 1964
In United States employment law, the doctrine of disparate impact holds that employment practices may be considered discriminatory and illegal if they have a disproportionate "adverse impact" on members of a minority group.
Under the doctrine, a violation of Title VII of the 1964 Civil Rights Act may be proven by showing that an employment practice or policy has a disproportionately adverse effect on members of the protected class as compared with non-members of the protected class.
Retrieved on February 23, 2013 from Wikipedia http://en.wikipedia.org/wiki/Disparate_impact
Title VII of the Civil Rights Act of 1964
This form of discrimination occurs where an employer does not intend to discriminate; to the contrary, it occurs when identical standards or procedures are applied to everyone, despite the fact that they lead to a substantial difference in employment outcomes for the members of a particular group and they are unrelated to successful job performance.
Retrieved on February 23, 2013 from Wikipedia http://en.wikipedia.org/wiki/Disparate_impact
Disparate impact doctrine
Presenter - John Cronin, Ph.D.
Contacting us:
NWEA Main Number: 503-624-1951
E-mail: [email protected]
The presentation and recommended resources are available at our SlideShare site: http://www.slideshare.net/NWEA/tag/kingsbury-center
Thank you for attending