Teacher evaluation presentation: Oregon


Posted on 19-Jan-2015






Overview of issues in the use of tests for teacher evaluation, and of the implementation of the new Oregon standards for evaluation.


1. Implementing the Oregon Framework for Teacher and Administrator Professional Evaluation
   John Cronin, Ph.D., Director, The Kingsbury Center @ NWEA

2. Implementing the Oregon Framework for Teacher and Administrator Professional Evaluation
   Presenter: John Cronin, Ph.D.
   Contacting us: Rebecca Moore, 503-548-5129, e-mail: rebecca.moore@nwea.org
   This PowerPoint presentation and recommended resources are available at our SlideShare website: http://www.slideshare.net/NWEA/tag/kingsbury-center (this presentation is the top presentation on that page).

3. Suggested reading
   Baker, B., Oluwole, J., & Green, P. (2013). The legal consequences of mandating high-stakes decisions based on low-quality information: Teacher evaluation in the Race to the Top era. Education Policy Analysis Archives, 21(5).

4. We have entered an area of the law with little guiding precedent. In such cases, remember: it's rarely good to be the first wildebeest into the Zambezi River.

5. The difference between formative and summative evaluation
   - Formative evaluation is intended to give educators useful feedback to help them improve their job performance. For most educators, formative evaluation should be the focus of the process.
   - Summative evaluation is a judgment of educator performance that informs future decisions about employment, including granting of tenure, performance pay, and protection from layoff.

6. Teacher evaluation requirement
   Tested subjects (math and ELA):
   - State assessment
   - One additional measure
   Non-tested subjects (all others), two of:
   - Common measure (national, regional, or district-developed) - mandatory
   - Classroom-based measure
   - School-wide measure

7. Administrator evaluation requirement
   Two goals:
   - One must be the state assessment.
   - The other must come from a common measurement, or other measures such as graduation rate, attendance rate, discipline data, or college-readiness indicators.
   Aligned with achievement compact indicators when possible.

8. Unique features of Oregon's approach
   - Peer review panels for evaluation and support systems
   - Local control
   - Student growth data isn't a fixed percentage of the evaluation
   - Districts decide how to translate performance to ratings
9. Distinguishing teacher effectiveness from teacher evaluation
   - Teacher effectiveness: the judgment of a teacher's ability to positively impact learning in the classroom.
   - Teacher evaluation: the judgment of a teacher's overall performance, including teacher effectiveness, common standards of job performance, participation in the school community, and adherence to professional standards.

10. Distinguishing teacher effectiveness from teacher evaluation
   - Testing: a claim that the improvement in learning (or lack of it) reflected on one or more tests is caused by the teacher.
   - Classroom observation: a claim that the observer's ratings or conclusions are reliable and associated with behaviors that cause improved learning in the classroom.

11. Purposes of summative evaluation
   - An accurate and defensible judgment of an educator's job performance.
   - Ratings of performance that provide meaningful differentiation across educators.

12. Differences between principal and teacher evaluation
   Teachers:
   - Get new groups of students each year.
   - Instruction of that group is limited to one school year.
   Thus teachers are normally held accountable for yearly growth.

13. Differences between principal and teacher evaluation
   Principals:
   - Inherit a pre-existing staff.
   - Have limited control over staffing conditions.
   - Work with this intact group from year to year.
   Thus principals should be evaluated on their ability to improve growth, or maintain high levels of growth, over time, rather than on their students' growth within a school year.

14. The testing to teacher evaluation process

15. Issues in the use of growth measures
   Measurement - design of the instrument: many assessments are not designed to measure growth. Others do not measure growth equally well for all students.

16. Tests are not equally accurate for all students
   [Chart comparing measurement accuracy across the score range: California STAR vs. NWEA MAP]

17. Issues in the use of growth measures
   Instructional alignment: tests used for teacher evaluation should align to the teacher's instructional responsibilities.
18. Common problems with instructional alignment
   - Using school-level math and reading results in the evaluation of music, art, and other specials teachers.
   - Using general tests of a discipline (reading, math, science) as a major component of the evaluation of high school teachers delivering specialized courses.

19. The testing to teacher evaluation process

20. The problem with spring-spring testing
   [Timeline, 3/11 through 3/12: a spring-to-spring window spans Teacher 1's instruction, the summer, and Teacher 2's instruction]

21. A better approach
   - When possible, use a spring-fall-spring approach.
   - Measure summer loss, and incentivize schools and teachers to minimize it.
   - Measure teacher performance fall to spring, giving as much instructional time as possible between assessments.
   - Monitor testing conditions to minimize gaming of fall-spring results.

22. Issues in the use of growth and value-added measures
   Instability of results: a variety of factors can cause value-added results to lack stability. Results are more likely to be stable at the extremes. The use of multiple years of data is highly recommended.

23. Issues in the use of growth and value-added measures
   Among those who ranked in the top category on the TAKS reading test, more than 17% ranked among the lowest two categories on the Stanford. Similarly, more than 15% of the lowest value-added teachers on the TAKS were in the highest two categories on the Stanford.
   Corcoran, S., Jennings, J., & Beveridge, A. (2010). Teacher effectiveness on high- and low-stakes tests. Paper presented at the Institute for Research on Poverty summer workshop, Madison, WI.

24. Reliability of teacher value-added estimates
   Teachers with growth scores in the lowest and highest quintile over two years, using NWEA's Measures of Academic Progress:

             Bottom quintile, Y1 & Y2    Top quintile, Y1 & Y2
   Number    59/493                      63/493
   Percent   12%                         13%

   r = .64, r^2 = .41. Typical r values for measures of teaching effectiveness range between .30 and .60 (Brown Center on Education Policy, 2010).

25. The testing to teacher evaluation process
26. Value-added measures
   - Estimate the progress of students relative to the progress that would be expected of students like these.
   - Adjust for factors that are beyond the teacher's control, for example: starting score, income of the students, ELL status, mobility.
   - The end result is an estimate of the teacher's contribution to learning relative to other teachers.

27.-30. How value-added analysis works
   [Slides 27-30 build a chart of expected growth for two students, Erica and Juan, adding one background factor at a time after the starting score:]

                  Erica     Juan
   Lunch status   Free      Not eligible
   ELL status     English   ELL
   SPED status    Dyslexic  Regular ed

31. Issues in the use of growth and value-added measures
   Differences among value-added models: the Los Angeles Times study.

32. Issues in the use of value-added measures
   Control for statistical error: all models attempt to address this issue. Nevertheless, many teachers' value-added scores will fall within the range of statistical error.

33. Test question:
   "There are 300 students in the 10th grade. Mary and Mark want to find the students' favorite color. Mary asks 30 people. Mark asks 150 people. Mark says: 'My conclusions are more likely to be right than Mary's.' Why does Mark think he is right?"
   Because he is a man.

34. Range of teacher value-added estimates

35. Issues in the use of growth and value-added measures
   Control for statistical error: New York City.

36. Limitations of value-added metrics
   - Value-added metrics are inherently NORMATIVE.
   - Value-added metrics are not useful in small or medium district settings.
   - Value-added metrics don't measure improvement in the teacher population over time.
   - Changes in the value-added model have the greatest effect on extreme cases.
37. What Makes Schools Work study - Mathematics
   Data used represent a portion of the teachers who participated in Vanderbilt University's What Makes Schools Work project, funded by the federal Institute of Education Sciences.

38. What Makes Schools Work study - Reading

39. The testing to teacher evaluation process

40. If evaluators do not differentiate their ratings, then all differentiation comes from the test.

41. Results of Tennessee Teacher Evaluation Pilot

42. Results of Georgia Teacher Evaluation Pilot

43. Reliability of evaluation weights in predicting stability of student growth gains year to year

   Model (weights, relative to state test value-added gain)       Reliability   Proportion of test
                                                                  coefficient   variance explained
   Model 1: state test 81%, student surveys 17%,
            classroom observations 2%                             .51           26.0%
   Model 2: state test 50%, student surveys 25%,
            classroom observations 25%                            .66           43.5%
   Model 3: state test 33%, student surveys 33%,
            classroom observations 33%                            .76           57.7%
   Model 4: classroom observations 50%, state test 25%,
            student surveys 25%                                   .75           56.2%

   Bill and Melinda Gates Foundation (2013, January). Ensuring Fair and Reliable Measures of Effective Teaching: Culminating Findings from the MET Project's Three-Year Study.

44. Reliability of a variety of teacher observation implementations

   Observation by                                                 Reliability   Proportion of test
                                                                  coefficient   variance explained
   Principal 1                                                    .51           26.0%
   Principal 2                                                    .58           33.6%
   Principal and other administrator                              .67           44.9%
   Principal and three short observations by peer observers       .67           44.9%
   Two principal observations and two peer observations           .66           43.6%
   Two principal observations and two different peer observers    .69           47.6%
   Two principal observations, one peer observation, and
   three short observations by peers                              .72           51.8%

   Bill and Melinda Gates Foundation (2013, January). Ensuring Fair and Reliable Measures of Effective Teaching: Culminating Findings from the MET Project's Three-Year Study.

45. An example of an arbitrary rating matrix
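Both MET tables pair a reliability coefficient r with a "proportion of test variance explained"; the second column is simply r squared, expressed as a percentage. A quick check against the reported pairs (the r = .66 entry is printed as 43.5% in one table and 43.6% in the other, so a small rounding tolerance is used):

```python
# Each (r, percent) pair is taken from the MET tables on slides 43-44.
reported = [
    (0.51, 26.0), (0.58, 33.6), (0.66, 43.5), (0.67, 44.9),
    (0.69, 47.6), (0.72, 51.8), (0.75, 56.2), (0.76, 57.7),
]
for r, pct in reported:
    # Variance explained = r^2; reported values differ only by rounding.
    assert abs(r * r * 100 - pct) < 0.2, (r, pct)
    print(f"r = {r:.2f} -> r^2 = {r * r * 100:.1f}%  (reported: {pct}%)")
```

This is why the differences between models look smaller in the r column than in the variance column: squaring stretches the gaps.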