Designing a Statewide System for Measuring Teacher and
Leader Effectiveness
Wyoming Accountability Advisory Committee
Scott Marion & Chris Domaleski
Center for Assessment
June 14, 2012
Center for Assessment. WY Accountability Advisory Committee (6/14/12)
Some background
Outline key decisions for creating educator evaluation systems
Our purpose today is to highlight some of the key decisions we will need to make through the interim
We’ll be asking a lot more questions than providing answers, but we will need to answer these questions in order to move forward…
A process note: Given the number of people on the WEBEX/call, I will pause at specific places in the presentation to respond to questions.
Overview of presentation…
Wyoming, like an increasing number of states, intends to revise its teacher and leader evaluation practices
Educator effectiveness will be determined “in part by student achievement”
This enterprise holds great promise, but also presents real challenges
We are fortunate to be able to build off of the work in many other states. We are closely involved in:
◦ CO, RI, NH, GA, PA, UT, NYC, HI, LA
Introduction
Why the interest in new forms of teacher evaluation?
Nobody doubts the critical influence of teacher quality on student achievement
Current (traditional) evaluation systems rarely identify either highly effective or ineffective teachers
Rationale
From Aspen Report and our experience:
◦ Vision and Goals
◦ State-Local Roles and Responsibilities
◦ Theory of Action
◦ General Evaluation Model
  Coherence
◦ Specific Measurement Model(s)
  Attribution rules
  Combining multiple measures
◦ Information Requirements
◦ Capacity Requirements
◦ Reporting & Communication
◦ Consequences & Support
◦ Monitoring and Evaluation
Key Decisions
What is the vision and what are the guiding principles of the system we will design?
For example, will the system be designed to identify and “counsel out” low-quality educators, or is it designed primarily to improve the performance of the majority of educators?
Goals and key principles
The primary purpose of the system is to maximize student learning
The system is designed to maximize educator development by providing specific information, including appropriate formative information that can be used to improve teaching quality.
Local instantiations of the State Model system must be designed collaboratively among teachers, leaders, and other key stakeholders such as parents and students as appropriate. Individual educators will have input into the specific nature of their evaluation and considerable involvement in the establishment of their specific goals.
The effectiveness rating of each educator shall be based on multiple measures of teaching practice and student outcomes including using multiple years of data when available, especially for measures of student learning.
The Model system is designed to ensure that the framework, methods, and tools lead to a coherent system that is also coherent with the developing NH Leader Evaluation System.
The Model system shall be applied by well trained leaders and evaluation teams using the multiple sources of evidence along with professional judgment to arrive at an overall evaluation for each educator.
Excerpt from NH’s draft system
What will be the “reach” of the state in defining local systems?
What factors must be considered in this decision?
◦ Comparability/portability vs. flexibility
◦ Support and capacity building
◦ Oversight and monitoring
◦ Required Framework, “State Model” or State-required system
We are proceeding here with the assumption that there will at least be a state-required framework.
Major policy decisions
Grounds our design
Clarifies the assumptions, purposes, and goals of the system
Specifies the various indicators and mechanisms by which the system will fulfill its purposes (and minimize unintended negative consequences)
Serves as a framework for evaluation
The ToA on the following slide is oversimplified and somewhat naïve, but it is what is driving much of the policy. We’ll be working with more complex and honest ToAs as we do our work.
A Theory of Action…
A Simplified Theory of Action for Reformed Educator Evaluation Systems
[Diagram: Measures of Educator Effectiveness and Evaluation Processes; Hiring, Placement, Career Ladder, Compensation, Dismissal, Professional Development; Student Outcomes Improve]
Basic Structure of a Theory of Action
[Diagram boxes: Assumptions or Antecedents; Activities and Mechanisms; Proximal Indicators; Intermediate Indicators; Activities and Mechanisms; Distal Indicators (Intended Outcomes); Consequences]
Let’s look at a more reasonable approximation for an improvement-based educator evaluation system
Theory of Action
Simple ToA for an “improvement” system
[Diagram boxes: Educator evaluation system; Student performance is well measured; Focuses educators’ attention on productive practices; Results are used to improve instruction; Evaluation results improve; Student Learning Improves]
Thinking Through a Theory of Action
Policy makers should have to say very explicitly why and how implementing test-based approaches to support educator effectiveness for these grades and subjects will lead to improved educational opportunities for students
For example, one might postulate that holding teachers accountable for increases in student test scores on classroom-based assessments will lead to the development of both better assessments and improvements in student learning.
What are the specific mechanism(s) by which the intended outcomes will occur?
E.g., targeted instruction, better PD, and/or more appropriate curricular materials?
What will be the major components of our system?
◦ Measures of teacher practice
◦ Measures of student performance
◦ Student voice?
◦ Peer input?
◦ Other?
How will these be combined and weighted?
How will these classes of indicators be integrated to form a coherent picture?
The General Evaluation Model
Involves ensuring that the school accountability and educator accountability systems are sending similar messages to schools and stakeholders
It would make sense to use data from the school accountability system to augment information from the educator system
Further, it would also make sense to integrate the various components of the educator evaluation system to avoid a silo effect
Coherence
The following slides present some of the key decisions related to the measurement model that will need to be made as we proceed.
As you know, the “devil is in the details” and there are many details with which to contend.
This is even more complicated when trying to reconcile and be clear about the state role
Specific Measurement Model
What are the indicators that operationalize the knowledge & skills that define educator practice? For example, domains from Danielson’s Framework for Teaching include:
  Planning and Preparation
  The Classroom Environment
  Instruction
  Professional Responsibilities
◦ Should these be the default “standards of professional practice” or should WY adopt more general standards (e.g., ISLLC, NC, CO) or leave it up to districts?
Measures of Educator Practice
Whatever standards are selected/developed, how shall they be measured?
◦ Classroom observations?
◦ Document (artifact) analysis?
◦ Structured interviews?
◦ Professional portfolios?
What about required data collection strategies and protocols (e.g., 4 observations/year)?
What are the expected levels of performance on the various indicators?
What about observer training and certification?
Measures of Educator Practice
Student Performance Measures and Analytics
What indicators of student growth should be used for PAWS grades and content areas?
What performance (growth) indicators should be used for non-PAWS grades and content areas?
◦ This is a huge issue!
Should state-level measures of student growth be combined with local measures of student performance for each educator determination? If so, how?
What analytic approach (model) will be used for analyzing State test data?
◦ What are the technical and policy issues that need to be considered in choosing a model?
◦ What are the advantages/disadvantages of using SGPs for educator evaluation?
What is the standard for ‘good enough’ growth?
Should growth expectations be “conditioned” on factors other than prior performance such as poverty, etc.?
What information should be reported to whom and at what level?
Student Performance: Analyzing Growth
Mapping educators to standards, assessments & growth (Lee, 2010, based on preliminary data from MA DOE)
[Figure: educators grouped by available data. Categories: No curriculum framework (25%); Curriculum framework but no assessment (32%); Assessment, but no growth (10%); Growth indirect (17%); Growth direct (16%). Educator groups labeled in the figure include HS electives, Pre-K – 2, special education, music, drama, visual arts, foreign language, phys ed, health, voc ed, MS & HS computers, business & marketing, 3rd grade teachers, MS & HS STE teachers, K-12 ELL teachers (MEPA), history and social science teachers, reading specialists, AP and IB teachers, admin staff, and ELA and math grades 4-8 self-contained classes and middle school subject teachers.]
* HSS tests have been suspended
** These teachers have not been linked yet
Spring 2010, Robert Lee, Massachusetts ESE
The Non-Tested Challenge
Lack of high quality measures of student performance, particularly for the purposes for which they are being used
Limitations of analytical options for calculating educator contributions to student performance
Comparability concerns
Lack of technical capacity at the local and even state levels
Lack of predictable course sequences
Not enough time
Not enough money
Too much policy pressure (e.g., 50%)
Huge risk of corruption
Challenging issues of attribution
Many of these are challenges for tested as well as non-tested, but may be exacerbated for non-tested subjects and grades
All Educators in NTSG are Not the Same
Instead of dealing with each individual case, it makes sense to create an approach for addressing categories of educators
The general categorization can occur at the state level and should be fine-tuned at the district or even school level
One classification approach is based on the data available for the various groups of educators
The following excerpt of a chart, created for Colorado, provides examples of the nominal types of educators that would fall into the different data categories
Personnel defined by end of year state summative assessments available, with Personnel Type (Examples) listed under each
Personnel teaching a core subject area where end of year state assessments measuring content taught in their subject area are available in two adjacent grades
◦ Grades 4-10 core subject teachers for literacy and math
◦ Interventionists/specialists with shared responsibility with core subject teachers for improving literacy/numeracy skills of students in grades 4-10 (e.g., RTI specialists, ELA, special education teachers)
Personnel teaching in a core subject area where an end of year state summative assessment is available to measure content taught in their classrooms.
◦ Science teachers (currently, grades 5, 8 and 10) and grade 3 teachers with end of year summative state assessments available for their respective grade
Personnel teaching in a core subject area where no end of year state summative assessments are currently available to measure content taught in their classrooms.
◦ Core subject teachers in the sciences (with the exception of grades 5, 8 and some personnel for 10) and social studies. All ECE, grades K-2 and grades 11-12 teachers.
◦ Resource teachers/specialists with instructional responsibility not directly linked to literacy/numeracy skills of students (e.g., music, arts, and P.E. teachers)
Personnel with no direct instructional responsibilities
◦ Resource teachers/specialists with indirect (non-instructional) responsibility for improving literacy/numeracy skills of students (e.g., social workers, psychologists, and school nurses).
Comparability
What do we mean by comparability in this context?
◦ Educators within the units of analysis are held to similar levels of expectations, at least in some relative sense
◦ For example, it would be a threat to the system if the teachers in grades 4-8 reading and math received noticeably lower ratings than the rest of the teachers (NTSG) in the school
At what levels is comparability important?
◦ Within schools? Clearly yes.
◦ Within districts? Probably yes.
◦ Within states? It would be nice, but it might be too high of a bar right now.
What Measurement Approaches Are Being Proposed?
1. Norm-referenced tests (NRTs)
2. Commercial interim assessments
3. State or district created end-of-course exams (both externally and locally developed)
a. Includes new assessment development in places like DE, CO, Hillsborough, FL
4. School or teacher-developed measures of student performance
a. Often includes Student Learning Objectives
*Note: 1 & 2 rarely cover courses beyond the core content areas and even then, not well in HS.
Analytic Approaches
If you thought the measurement/assessment issue was daunting….
It pales in comparison to the analytic challenges (i.e., how growth is calculated at local levels)
Remember, using the most sophisticated VAM models with high quality state test data has been rightfully questioned based on challenges with causal inferences, unreliability (year-to-year), and other technical issues (e.g., EPI report, Braun, et al., 2010, Rothstein, 2009 & 2010)
What Approaches Are Being Proposed for NTSG?
1. Growth models using pre and post test from the same subject
2. Value-added models
a. Pre and post test score in the same subject
b. Conditioned on data other than pretest from same content area as posttest
3. Student Growth Percentiles
4. Shared attribution of aggregate growth/VAM results
5. Student learning objectives (SLO)
Definitions
Growth refers to measures of performance for the same students at two or more points in time and requires a common, often vertical, scale to evaluate the magnitude of change. Only true growth model here.
VAM: Generally describes multivariate models that include certain variables to produce an expectation against which actual performance is evaluated.
Student Growth Percentiles (SGP) is a regression-based measure of growth that works by evaluating current achievement based on prior achievement and describing performance (using percentiles) relative to other students with the “same” prior achievement histories.
Student Learning Objectives (SLO) is a general approach (often called Student Growth Objectives) whereby educators establish goals for individual students or groups of students (often in conjunction with administrators) and then evaluate the extent to which the goals have been achieved.
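The differences among the first three definitions can be made concrete with a small sketch. The Python below is illustrative only: the student records are invented, the VAM-style function uses a one-predictor least-squares fit rather than a multivariate model, and the percentile function approximates the “same” prior achievement by exact matching on a single prior score, whereas operational SGP models typically rely on quantile regression over several prior years. All function names are hypothetical.

```python
from statistics import mean

# Hypothetical (prior_score, current_score) pairs on a common scale.
students = [(200, 210), (200, 220), (200, 230),
            (220, 225), (220, 245), (220, 235)]

def gain(prior, current):
    """Simple growth: change on a common (ideally vertical) scale."""
    return current - prior

def fit_line(pairs):
    """Ordinary least squares for current = a + b * prior."""
    xs = [p for p, _ in pairs]
    ys = [c for _, c in pairs]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b

def vam_residual(pairs, prior, current):
    """VAM-style idea: actual score minus the score the model expects."""
    a, b = fit_line(pairs)
    return current - (a + b * prior)

def growth_percentile(pairs, prior, current):
    """SGP-like idea: percentile rank among peers with the same prior score."""
    peers = [c for p, c in pairs if p == prior]
    below = sum(1 for c in peers if c < current)
    ties = sum(1 for c in peers if c == current)
    return 100 * (below + 0.5 * ties) / len(peers)
```

In this toy data, a student moving from 200 to 220 has a gain of 20 but sits at the middle of the students who started at 200, which illustrates why a raw gain and a conditional percentile can tell different stories about the same result.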
Attribution: linking educator behavior to student outcomes
◦ Assigning accountability
  Multiple educators contribute to instruction
  “Contact time” requirements: how long does the student need to be in the teacher’s classroom to count
◦ Opportunity to employ shared attribution strategies
  Must be tied to local theories of action or theories of improvement
Attribution
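One way to picture a contact-time rule combined with shared attribution is sketched below in Python. The 60% threshold, the record layout, and the function name are assumptions made purely for illustration; they are not recommendations and are not drawn from any state's rules.

```python
def attribute_student(contact_fractions, min_contact=0.6):
    """Split one student's result among teachers who meet a contact-time rule.

    contact_fractions maps teacher -> fraction of the year's instructional
    time the student spent with that teacher (co-teachers can both be 1.0).
    min_contact is a hypothetical policy threshold.
    Returns teacher -> attribution weight (weights sum to 1), or an empty
    dict when no teacher meets the threshold.
    """
    eligible = {t: f for t, f in contact_fractions.items() if f >= min_contact}
    total = sum(eligible.values())
    return {t: f / total for t, f in eligible.items()} if eligible else {}

# A co-taught student with a mid-year substitute who falls below the threshold.
weights = attribute_student({"core": 1.0, "specialist": 1.0, "mid_year_sub": 0.3})
```

Here the core teacher and the specialist each receive half of the attribution, and the substitute receives none. Whether that split matches a local theory of action is exactly the kind of policy question the slide raises.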
Combining Multiple Measures
How should we arrive at an overall judgment of educator effectiveness?
◦ Weighting of student performance and knowledge & skills
What are the different types of information that should be employed when evaluating principals compared with teachers?
◦ We know the specific indicators and even standards will differ
Who should be responsible for making these overall judgments?
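As one illustration of the weighting question, the Python sketch below combines component ratings with a weighted average. The measure names, the 1 to 4 rating scale, and the weights are hypothetical; a system could instead use a decision matrix or rely on professional judgment, and nothing here reflects a choice Wyoming has made.

```python
def composite_rating(ratings, weights):
    """Weighted average of component ratings that share one scale.

    ratings and weights are dicts keyed by measure name. Weights are
    renormalized over the measures actually present, so an educator with
    no student-growth data is rated on the remaining measures.
    """
    present = [m for m in weights if ratings.get(m) is not None]
    if not present:
        raise ValueError("no rated measures available")
    total_weight = sum(weights[m] for m in present)
    return sum(ratings[m] * weights[m] for m in present) / total_weight

# Hypothetical weights and ratings on a 1-4 scale.
weights = {"practice": 0.5, "student_growth": 0.3, "student_voice": 0.2}
tested = composite_rating(
    {"practice": 3, "student_growth": 2, "student_voice": 4}, weights)
non_tested = composite_rating({"practice": 3, "student_voice": 4}, weights)
```

Renormalizing the weights when a measure is missing is itself a design decision: in this example the educator without growth data ends up with a higher composite, which shows how a combination rule can affect comparability between tested and non-tested educators.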
Data system requirements to link students with teachers at the state level
Data system requirements to manage the data at the local level
Dealing with student mobility
Dealing with missing data, especially non-random missing data
“Full academic year” rules
Information Requirements
How will this be managed at the state level?
◦ Data, information, and analytics
◦ Reporting and communication
◦ Support and capacity building
◦ Training and monitoring
How will this be managed at the local level?
◦ Capacity for implementation
  Conducting observations, document analysis, etc.
  Induction, mentoring, and support
  Training
  Record keeping
  Reporting and feedback
  Decision making and appeals
Capacity Requirements
How will results be communicated to educators to improve practice?
How will information about the system be communicated to the public and policy makers while protecting educators?
Reporting & Communication
What sanctions, rewards, and/or consequences are appropriate to advance prioritized outcomes?
What strategies will be employed to use information to support schools/ teachers/ students?
Is there capacity in the state (in the districts) to improve educator quality in WY?
What resources will be required for this improvement to occur?
◦ Where will they come from?
Consequences & Support
Negative Consequences
As we consider the design and implementation of WY’s new educator evaluation system, we must be mindful that the likelihood of getting this wrong (i.e., leading to unintended negative consequences) is at least as high as the chances of getting it right (i.e., improving teacher quality and student learning)
Unintended consequences could include:
◦ Narrowing curriculum
◦ Competition vs. Cooperation
◦ Assignment of students or teachers to selected classes for reasons unrelated to educational benefit
◦ Educator transition
◦ Educator attrition
Campbell’s Law
"The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.” (emphases added)
http://en.wikipedia.org/wiki/Campbell%27s_Law
Educator accountability systems will invite significantly more implicit and explicit corruption than has been seen with school accountability
What types of formative evaluation approaches need to be put in place to monitor implementation and consequences?
Evaluate claims in theory of action
Evaluate impact
◦ Establish criteria to determine if results are reasonable
Develop methods and standards to assess the precision and stability of results
Does the system meet important utility criteria?
Monitoring and Evaluation
How should we plan our work going forward?
Who’s going to do what? How will we work?
Goals for next meeting…
Next steps…