A Comprehensive Assessment System: Tough Choices for the RTT Assessment Competition

Scott Marion
National Center for the Improvement of Educational Assessment

Race to the Top Assessment Public and Expert Input Meeting
Boston, MA
November 12, 2009

Overview of Comments

An explicit theory of action
Purposes and uses
Sound design principles
My proposed design
Access and equity
A note about psychometrics
High schools
Some advice on the proposed RFP/RFA

Marion. Center for Assessment. Nov. 12, 2009

A Preview of My Vision…

A conceptually coherent comprehensive assessment system that incorporates explicit curriculum/opportunity-to-learn (OTL) links
  End-of-year summative assessments built on well-articulated content and performance standards
  Interim performance tasks embedded in mini curricular units
  Formative assessment supports/probes
  Focused professional development
  Actionable reporting system to help reveal student and school strengths and weaknesses

A Theory of Action

Before finalizing the RFP, USED must articulate a clear and explicit theory of action
  Describes how the particular CLEAR goals will be achieved as a result of the particular assessment system(s)
  Specific mechanisms: how does USED expect we will get from A to B? What is the evidence to support this expectation?
  Explicitly describes prioritized design choices, e.g.: influencing and shaping teaching and learning, OR measuring existing knowledge, OR making cross-state comparisons
The theory of action is a check on the logic of the underlying assumptions


Purposes and Uses

The plethora of design requirements in the RTT notice will stress any (even comprehensive) assessment system
USED must have a firm sense of the likely accountability uses before letting the RFP
  Even though Congress will ultimately reauthorize ESEA
Clarity on purposes/uses will serve as an important touchstone during complicated design deliberations; this is where choices are made explicit, for example:
  Trying to have BOTH diagnostic information for each child and a common proficiency test for all students can be incompatible (certainly within the same reasonable-length test)
  Similarly, growth models could produce more valid information if we measured students over much of the achievement continuum rather than clustering test information around the proficient cut score


Overarching Goal

ALL students should have meaningful opportunities to develop deep understanding of important content and critical skills to allow for viable postsecondary choices (e.g., college/work ready) and for becoming contributing members of society
I propose a system that is intended to support this overall goal…


My Prioritized Purposes/Uses

1. Measuring a limited number of big ideas at deeper levels of understanding to provide students with opportunities to develop robust knowledge and skills for use in novel and complex settings
  Better integration of curriculum, instruction, and assessment because we cannot address these challenges with just an “assessment fix”
2. Measuring student longitudinal growth as a foundation for valid accountability systems and as information for school improvement
Notice that I am limiting myself to two main purposes, because I do not think a system can do more than 2-3 well.
Intentionally not focusing on cross-state comparisons… I think my proposed design purposes will help us meet the overall goal better, and the trade-offs are too great to focus on cross-state comparisons.


Design Principles: Theoretically Based

The RFP must require proposed designs to be based on theoretically sound design models. Two examples:
Evidence-centered design (ECD; Mislevy, 1994, 1996)
  Student model: exactly what do you want students to know and how (well) do we want them to know it?
  Evidence model: what will you accept as evidence that the student has the desired knowledge?
  Task model: what tasks will students perform to demonstrate/communicate their knowledge?
Knowing What Students Know (Pellegrino et al., 2001)
  [Figure: the assessment triangle, with vertices Cognition, Observation, and Interpretation]

My Vision…

A conceptually coherent comprehensive assessment system that incorporates explicit curricular connections
  End-of-year summative assessments built on well-articulated content and performance standards
  Interim performance tasks embedded in mini curricular units
  Formative assessment supports/prompts
  Focused professional development
  Actionable reporting system to help reveal student and school strengths and weaknesses
This proposal is designed to build a coherent system that bridges curriculum, multiple forms of assessment, and supports for instruction


Design reporting systems up front

Too often the reports are simply an add-on
Must be conceived as a system of reports
  Different purposes, users, and levels of information
Must be actionable: leads to appropriate inferences, decisions, and instructional/programmatic actions
  See http://www.schoolview.org/ for a terrific example of what’s possible
Should support the theory of action


The Curricular Units

Approximately 2-6 of these units throughout the year, varying by grade level (can be phased in)
The units could be as short as a few days or as long as a couple of weeks
Each unit is focused on a “big idea” of the domain
Can be strategically used within existing curricula (e.g., perhaps at the end of a longer unit of study)
Serves as the basis for performance tasks and as a context for summative assessment
Includes training materials and supports for implementing formative assessment and progress monitoring strategies within each unit
Flexible enough to use each year with new/comparable contexts: a different science experiment or grade-level text, but assessing the same concepts
Provides a vehicle for structuring equitable OTL and access for all


Summative Assessment

Serves as the foundation for growth measurement
Some of the content and specific examples will come from curricular units (so we can measure more than “general” knowledge)
Should be administered toward the end of the school year
Should include rich representation of knowledge and skills (i.e., plenty of open-ended tasks)
Why the obsession with “instant” results?
  Who needs the results, in what form, and by when?
  What will be done with these results?
  Remember, our current accountability schedule has driven this “need” for a rapid turnaround.
  Everything comes with a cost!


Interim performance tasks

These rich and engaging tasks are the foundation of this system
Contextualized within the curricular units
Scored locally and incorporated within local assessment and grading (graduation) systems
Local scoring audited (e.g., KY portfolios) so results can be used in state accountability systems
  School level for K-8
  School and individual (e.g., graduation) levels for high school
Tasks should be designed using ECD principles to reveal students’ need for additional support
Most tasks should be released each year


Formative assessment

The curricular units and associated materials should be designed to facilitate formative assessment probes and processes
Professional development provided to increase teachers’ capacity for implementing and using formative assessment to improve instruction
Formative assessment training and strategies should include a focus on helping all students achieve expectations
Maintain a clear separation between formative assessment and district/state accountability systems
  Stakes change (corrupt) everything


Opportunity, Access, and Equity

I argue that we have much more of an instruction (OTL) problem than an assessment problem
  Assessment can’t make up for lack of OTL
The proposed curricular units are designed to help level the curriculum and instruction playing field
Provide supports for teachers to help them ensure that all students access the knowledge and skills
Build formative assessment capacity and use so students don’t fall so far behind
Design tasks with multiple and varied opportunities for students to validly participate in the assessment system
Finally, assessment guidelines need to focus first on fair access and less on narrow definitions of comparability
  Capitalize on tremendous advances in innovative technological approaches for access and accommodations


A “New” Psychometrics

A system such as the one I’m proposing will require some serious re-examination of our current psychometric practices
We’ve traded a lot (of validity) in the past for student-level reliability, pretty scales, and overly strict notions of comparability
  Yes, we will have serious equating challenges
The foundations for “new” approaches have been established (e.g., Linn, Baker, & Dunbar, 1991; Mislevy, 1994; Pellegrino et al., 2001), but they still need more attention to work in large-scale, efficient practice
The RFP should push for requirements and expectations beyond the current “safe” methods


High Schools

Assessment system should be situated in specific “indicator” or core courses up to some point (e.g., 10th grade)
After this point, there should be more choice in the assessment (and accountability) system to allow for specialization and choice by students
Interim performance tasks can be used as part of a student accountability system like Wyoming’s or Rhode Island’s graduation systems


Some advice on RFA/RFP

Development is an ONGOING cost, not a one-time purchase!
Recognize and embrace the differences between high schools and elementary schools
Determine the absolutely essential pieces and then examine costs for additional components
Reconsider the current practice of having every student tested on every item
  Matrix sampling is still a viable approach
Allow for multiple awards
  Nobody has the “right” answer and even if they think they do, it won’t be “right” in all contexts
  Especially true in high school
According to Rich Hill, a good RFP is:
  Exceptionally clear on goals
  Flexible on specific means unless you are absolutely clear on what you want
Think about a phase-in over the next 5 years
Recognize critical operational and bureaucratic constraints
  Existing contracts, state laws, procurement rules
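
For readers unfamiliar with matrix sampling, here is a minimal illustrative sketch, not part of the original talk. It assumes a pool of 60 items split into 4 disjoint forms and 300 students; these numbers and the function name `matrix_sample` are hypothetical. Each student takes only one form, yet the school as a whole is measured on the full item pool.

```python
import random
from collections import Counter


def matrix_sample(item_ids, student_ids, n_forms, seed=0):
    """Split items into n_forms disjoint forms and assign one form per student.

    Hypothetical helper for illustration only; real designs also balance
    content coverage and difficulty across forms.
    """
    rng = random.Random(seed)
    items = list(item_ids)
    rng.shuffle(items)
    # Deal items round-robin so form sizes differ by at most one item.
    forms = [items[i::n_forms] for i in range(n_forms)]
    students = list(student_ids)
    rng.shuffle(students)
    # Spiral the forms across students so each form gets a near-equal share.
    assignment = {s: i % n_forms for i, s in enumerate(students)}
    return forms, assignment


forms, assignment = matrix_sample(range(60), range(300), n_forms=4)
print([len(f) for f in forms])       # items per form
print(Counter(assignment.values()))  # students per form
```

Under these assumed numbers, each student answers 15 of the 60 items. The trade-off is the one the slide points to: broader content coverage at the school level in exchange for less complete information on each individual student.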

For more information

Formal comments will be submitted by December 2, 2009 and available on request: [email protected]

