Automated Scoring Readiness for Next Generation Assessments Karen Lochbaum May 18, 2011


Page 1: Automated Scoring Readiness for Next Generation Assessmentsimages.pearsonassessments.com/images/pdf/Automate… ·  · 2011-05-19Automated Scoring Readiness for Next Generation Assessments

Automated Scoring Readiness for Next Generation Assessments

Karen Lochbaum

May 18, 2011

Page 2

Copyright © 2011 Pearson Education, Inc. or its affiliates. All rights reserved.

Welcome & Introductions

Moderator: Anne Johnson, Program Manager, Pearson

Presenter: Karen Lochbaum, VP Technology Services, Knowledge Technologies, Pearson

Page 3

Common Core State Standards: Key Features

• 21st Century Skills
• More authentic tasks and assessment
• More constructed responses => Automated scoring

Page 4

Benefits of Automated Scoring

• Immediacy & Efficiency
  – Evaluate responses in seconds
  – Reduce score turnaround time
  – Give students and teachers instant feedback
  – Reduce costs
• Accuracy
• Consistency, Objectivity
• Can detect off-topic, inappropriate and “odd” responses

Page 5

Common Core State Standards: Key Features

• Reading: Text complexity and the growth of comprehension

• Writing: Text types, responding to reading, and research

• Speaking and Listening: Flexible communication and collaboration

• Language: Conventions, effective use, and vocabulary

• Mathematics

Page 6

Reading

Page 7

Page 8

Page 9

Text Complexity

• Conceptual level of vocabulary vs. surface level (e.g. word frequency)
• Measure how word meanings are learned and change over time with increasing exposure

Page 10

Writing

Asks students to answer several questions about a hypothetical, yet realistic, scenario.

You advise Pat Williams, the president of DynaTech, a company that makes precision electronic instruments and navigation equipment. Sally Evans, a member of DynaTech’s sales force, recommends that DynaTech buy a small private plane (a SwiftAir 235) that she and other members of the sales force could use to visit customers. Pat was about to approve the purchase when there was an accident involving a SwiftAir 235.

Document Library

• Newspaper article about the accident
• Federal Accident Report on in-flight breakups in single-engine planes
• Internal Correspondence (Pat’s email to you and Sally’s email to Pat)
• Charts relating to SwiftAir’s performance characteristics
• Excerpt from magazine article comparing SwiftAir 235 to similar planes
• Pictures and descriptions of SwiftAir Models 180 and 235

Page 11

Science

Use the technical passage 'Green Ocean Machine' to answer the following.

The passage states that “the new green partner [alga] seems to provide Hatena with most of its energy needs.”

Describe the process that enables organisms to use energy from light to make food. In your description, be sure to include

* the specialized features needed to produce food
* the substances needed to produce food
* the substances produced during this process

Page 12

Listening & Speaking

RETELL

Page 13

Language

Page 14

Page 15

Language

Page 16

Mathematics

Page 17

Automated Scoring Approach

• Learn from human-scored student responses
• Measure the content and quality of responses by determining
  – The language features that human scorers evaluate when scoring a response
  – How those features are weighted and combined to produce scores
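
As a rough illustration of "learning how features are weighted and combined," the sketch below fits least-squares weights to human scores. The slides do not say which model is actually used, so the linear model, the two feature names (content, style), and the training data are all assumptions made for the example.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b] so each row operation applies to both sides.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_weights(features: Sequence[Sequence[float]],
                human_scores: Sequence[float]) -> List[float]:
    """Least-squares weights (intercept first) from the normal equations."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, human_scores)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(weights: Sequence[float], feature_row: Sequence[float]) -> float:
    """Combine feature values using the learned weights."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], feature_row))


# Hypothetical (content, style) feature values for five scored responses.
features = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0), (1.0, 2.0)]
# Hypothetical human scores, constructed as 1 + 2*content + 1*style.
human_scores = [3.0, 2.0, 4.0, 6.0, 5.0]

weights = fit_weights(features, human_scores)
print([round(w, 3) for w in weights])
```

Because the toy scores were built from a fixed rule, the fit recovers that rule; real human scores are noisy, and the fitted weights would only approximate how scorers trade features off.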

Page 18

Essay Scoring Process

Page 19

The Intelligent Essay Assessor

Learn to score like human scorers by measuring different aspects of writing

• Content, including subject area knowledge
  – Semantic analysis, measures of similarity to prescored responses, ideas, vocabulary growth, examples, …
• Style
  – Appropriate word choice, word and sentence flow, fluency, coherence, …
  – Does each sentence follow logically from the one before it?
  – Does each sentence contribute to the essay as a whole?
• Mechanics
  – Grammar, word usage, punctuation, spelling, …

Page 20

Other Features of IEA

• Uses non-coachable measures
  – No counts of total words, syllables, characters, etc.
  – No trigger surface features: “thus”, “therefore”
  – Detects larding of big words
• Knows when it doesn’t know
  – Detects off-topic or highly unusual essays, non-standard language constructions, too long, too short …
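
One way to picture "knows when it doesn't know" is a pre-check that routes unusual responses to a human reader instead of scoring them. The sketch below is only an illustration: the function name, the thresholds, and the idea that a topic-similarity value has already been computed elsewhere are all assumptions, not details from the slides. Length is used here only to flag a response, not to score it.

```python
from typing import List


def review_flags(text: str,
                 topic_similarity: float,
                 min_words: int = 50,
                 max_words: int = 1000,
                 min_similarity: float = 0.2) -> List[str]:
    """Return the reasons (if any) a response should go to a human reader.

    All thresholds are hypothetical; topic_similarity is assumed to be a
    0..1 similarity between the response and on-topic material.
    """
    flags = []
    word_count = len(text.split())
    if word_count < min_words:
        flags.append("too short")
    if word_count > max_words:
        flags.append("too long")
    if topic_similarity < min_similarity:
        flags.append("off-topic")
    return flags


print(review_flags("word " * 10, topic_similarity=0.9))
print(review_flags("word " * 100, topic_similarity=0.05))
print(review_flags("word " * 100, topic_similarity=0.9))
```

An empty list means the response can be scored automatically; anything else is a reason to hand it off.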

Page 21

Content Based Scoring

• Use Latent Semantic Analysis (LSA) to capture the “meaning” of language
• LSA knows that
  – Surgery is often performed by a team of doctors.
  – On many occasions, several physicians are involved in an operation.
  mean about the same thing even though they share no words.
• Enables evaluating the content of what is written rather than just matching keywords
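
To see why keyword matching fails on the two example sentences, the short check below measures their word overlap directly. It does not implement LSA; it only shows the baseline that LSA is meant to improve on. The helper name `words` is made up for the example.

```python
import re


def words(sentence: str) -> set:
    """Lowercased set of alphabetic tokens in a sentence."""
    return set(re.findall(r"[a-z]+", sentence.lower()))


a = "Surgery is often performed by a team of doctors."
b = "On many occasions, several physicians are involved in an operation."

shared = words(a) & words(b)
# Jaccard overlap: shared words divided by all distinct words.
jaccard = len(shared) / len(words(a) | words(b))

print(shared)   # set()
print(jaccard)  # 0.0
```

A keyword matcher sees these sentences as completely unrelated, which is the gap a meaning-based representation is intended to close.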

Page 22

Why LSA?

Search for “Cars”…

Page 23

Page 24

Why LSA?

• Studies have shown that:
  – People agree on the keywords for a text only 15% of the time
  – If you have 100 people name a document, you will get 30 different answers
• LSA operates on the level of deep (latent) word meaning

Page 25

What does that have to do with automated scoring?

• LSA reads lots of text
• Learns what words mean and how they relate to each other
• Result is a “Semantic Space”
• Every word represented as a vector
• Every paragraph represented as a vector

M(Paragraph) = M(w1) + M(w2) + …
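
The additive rule above can be sketched in a few lines. The two-dimensional word vectors below are invented purely for illustration (a real semantic space is learned from large amounts of text and has many more dimensions), and cosine similarity is used here as an assumed way of comparing the resulting paragraph vectors.

```python
import math
from typing import List, Sequence

# Hypothetical 2-D word vectors, chosen by hand for the example.
WORD_VECTORS = {
    "surgery": (0.9, 0.1),
    "operation": (0.8, 0.2),
    "doctors": (0.7, 0.3),
    "physicians": (0.7, 0.2),
    "cars": (0.1, 0.9),
    "engine": (0.2, 0.8),
}


def paragraph_vector(tokens: Sequence[str]) -> List[float]:
    """M(Paragraph) = M(w1) + M(w2) + ...; unknown words are skipped."""
    total = [0.0, 0.0]
    for token in tokens:
        vector = WORD_VECTORS.get(token)
        if vector is None:
            continue
        for i, value in enumerate(vector):
            total[i] += value
    return total


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


p1 = paragraph_vector(["surgery", "doctors"])
p2 = paragraph_vector(["operation", "physicians"])
p3 = paragraph_vector(["cars", "engine"])

print(round(cosine(p1, p2), 3))  # close to 1: similar meaning, no shared words
print(round(cosine(p1, p3), 3))  # much lower: different topic
```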


Page 26

Content Scoring

• Every essay represented as a vector
• New essays are placed based on the words they contain
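
A minimal sketch of scoring by placement: once a new essay has a vector, its content score can be estimated from the human scores of the prescored essays nearest to it. The slides mention "measures of similarity to prescored responses" but do not give the procedure, so the similarity-weighted nearest-neighbor rule, the value of `k`, and the toy vectors below are assumptions.

```python
import math
from typing import List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def content_score(new_vector: Sequence[float],
                  prescored: List[Tuple[Sequence[float], float]],
                  k: int = 3) -> float:
    """Similarity-weighted mean of the human scores of the k nearest essays."""
    ranked = sorted(prescored,
                    key=lambda item: cosine(new_vector, item[0]),
                    reverse=True)[:k]
    # Negative similarities get zero weight.
    sims = [max(cosine(new_vector, vector), 0.0) for vector, _ in ranked]
    total = sum(sims)
    if total == 0:
        # Nothing is similar: fall back to the plain mean of the neighbors.
        return sum(score for _, score in ranked) / len(ranked)
    return sum(s * score for s, (_, score) in zip(sims, ranked)) / total


# Hypothetical prescored essays: (essay vector, human score).
PRESCORED = [
    ((1.0, 0.0), 4.0),
    ((0.9, 0.1), 4.0),
    ((0.8, 0.2), 3.0),
    ((0.0, 1.0), 1.0),
    ((0.1, 0.9), 1.0),
]

on_topic = content_score((1.0, 0.1), PRESCORED, k=3)
off_topic = content_score((0.0, 1.0), PRESCORED, k=2)
print(round(on_topic, 2), round(off_topic, 2))
```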

Page 27

Reading Comprehension

• Every section represented as a vector
• Student summaries are placed based on the words they contain

Figure labels: Section 1, Section 2, Section 3, Summary

Page 28

Spoken Assessments

Figure labels: waveform, spectrum, segmentation, words

Page 29

Example: Learner

REPEAT: New York City is famous for its ethnic diversity.

Pronunciation: 5.9

Fluency: 3.3

Accuracy: 1 word error (insertion)
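
The "1 word error (insertion)" result can be illustrated with a word-level edit-distance alignment between the prompt and what was recognized. This is a generic sketch, not the Versant method; the learner's actual response is not given in the slides, so the response strings below are hypothetical.

```python
import re
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase words with punctuation removed."""
    return re.findall(r"[a-z']+", text.lower())


def word_errors(reference: str, response: str) -> Dict[str, int]:
    """Count substitutions, deletions, and insertions in a minimal alignment."""
    ref, hyp = tokenize(reference), tokenize(response)
    # dp[i][j] = edit distance between ref[:i] and hyp[:j].
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j - 1] + cost,  # match or substitution
                           dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1)         # insertion
    # Walk back from the end to classify each error.
    counts = {"substitution": 0, "deletion": 0, "insertion": 0}
    i, j = len(ref), len(hyp)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and ref[i - 1] == hyp[j - 1] and dp[i][j] == dp[i - 1][j - 1]:
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + 1:
            counts["substitution"] += 1
            i, j = i - 1, j - 1
        elif j > 0 and dp[i][j] == dp[i][j - 1] + 1:
            counts["insertion"] += 1
            j -= 1
        else:
            counts["deletion"] += 1
            i -= 1
    return counts


PROMPT = "New York City is famous for its ethnic diversity."
# Hypothetical recognized responses.
with_extra_word = word_errors(PROMPT, "New York City is famous for its the ethnic diversity")
with_missing_word = word_errors(PROMPT, "New York City is famous for its diversity")
print(with_extra_word)
print(with_missing_word)
```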

Page 30

Performance Comparison

Native speaker: 3.026 seconds

Learner: 5.502 seconds

Pronunciation, Accuracy, Fluency

Page 31

Versant Scoring Logic

Task types: Read, Answer Short Question, Repeat Sentence, Build Sentence, Retells

17 minutes

Subscores: Sentence Mastery, Fluency, Pronunciation, Vocabulary

Page 32

Mathematics Representation

Equations are saved using MathML markup, thus preserving the computational meaning of the math even if the presentation is changed.
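
To show what "computational meaning" can look like in practice, the sketch below evaluates a small expression written with Content MathML elements (`apply`, `plus`, `times`, `cn`). The slides do not say which MathML flavor is stored, so the choice of Content MathML is an assumption, and the snippet omits the MathML namespace and supports only two operators to stay short.

```python
import math
import xml.etree.ElementTree as ET

# Operators this sketch understands; each takes a list of operand values.
OPERATORS = {"plus": sum, "times": math.prod}


def evaluate(node: ET.Element) -> float:
    """Recursively evaluate a tiny subset of Content MathML."""
    if node.tag == "cn":
        # <cn> holds a numeric literal.
        return float(node.text)
    if node.tag == "apply":
        # The first child names the operator; the rest are its operands.
        operator, *operands = list(node)
        values = [evaluate(child) for child in operands]
        return float(OPERATORS[operator.tag](values))
    raise ValueError(f"unsupported element: {node.tag}")


# (2 + 3) * 4, written without the MathML namespace for brevity.
EXPRESSION = (
    "<apply><times/>"
    "<apply><plus/><cn>2</cn><cn>3</cn></apply>"
    "<cn>4</cn>"
    "</apply>"
)

value = evaluate(ET.fromstring(EXPRESSION))
print(value)  # 20.0
```

The structure, not the on-screen layout, determines the value, so the same markup can be rendered in different ways without changing what gets scored.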

Page 33

Keys to Success

Design for automated scoring from the start!

Page 34

Keys to Success

Item Development
– Optimize for scoring effectiveness

Item Delivery
– Math: Input and capture of student response

Field Test and Human Scoring
– Representative samples
– Double scoring with resolution
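
"Double scoring with resolution" can be sketched as a small rule: two readers score each response, and a third score settles large disagreements. The slides do not specify the rule, so the gap threshold, the averaging of close scores, and the function name below are hypothetical.

```python
from typing import Optional


def resolved_score(first: int,
                   second: int,
                   resolution: Optional[int] = None,
                   max_gap: int = 1) -> Optional[float]:
    """Final human score for a double-scored response.

    Scores within max_gap of each other are averaged. A larger gap needs a
    resolution read; None is returned until that score is supplied.
    """
    if abs(first - second) <= max_gap:
        return (first + second) / 2
    if resolution is None:
        return None
    return float(resolution)


print(resolved_score(3, 3))                # 3.0
print(resolved_score(3, 4))                # 3.5
print(resolved_score(1, 4))                # None: needs a third reader
print(resolved_score(1, 4, resolution=3))  # 3.0
```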

Page 35

Keys to Success

Psychometrics
– Automated scoring performance as part of field test item evaluation

Operational Scoring & Monitoring
– Requirements vary with nature of assessment and acceptable performance criteria
– Automated scoring in combination with human scoring
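
One common way to look at automated scoring performance during field testing is to compare machine scores with human scores on the same responses. The sketch below computes exact and adjacent agreement rates; the choice of these two statistics and the score data are assumptions for illustration, since the slides do not name the performance criteria.

```python
from typing import Sequence, Tuple


def agreement(machine: Sequence[int], human: Sequence[int]) -> Tuple[float, float]:
    """Return (exact, adjacent) agreement rates.

    Exact: scores are identical. Adjacent: scores differ by at most 1.
    """
    pairs = list(zip(machine, human))
    exact = sum(m == h for m, h in pairs) / len(pairs)
    adjacent = sum(abs(m - h) <= 1 for m, h in pairs) / len(pairs)
    return exact, adjacent


# Hypothetical scores for five field-test responses.
machine_scores = [3, 4, 2, 5, 1]
human_scores = [3, 3, 2, 5, 3]

exact_rate, adjacent_rate = agreement(machine_scores, human_scores)
print(exact_rate, adjacent_rate)  # 0.6 0.8
```

The same comparison run between two human readers gives a baseline for judging whether the machine agrees with humans about as well as humans agree with each other.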

Page 36

Q&A/Discussion

• If you have not done so already, please type any questions or comments you have about the webinar into the Chat Box on your screen.

• You may also email questions directly to [email protected] after the webinar.

Thank you!

Please join us May 25th when we discuss “Through Course Common Core Assessments: A Proposed Design for English Language Arts”

pearsonassessments.com/nextgenwebinars