
Page 1: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

SI 760 / EECS 597 / Ling 702

Language and Information

Winter 2004

Handout #3

Page 2: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Course Information

• Instructor: Dragomir R. Radev ([email protected])

• Office: 3080, West Hall Connector
• Phone: (734) 615-5225
• Office hours: M&F 12-1
• Course page: http://www.si.umich.edu/~radev/LNI-winter2004/

• Class meets on Mondays, 1-4 PM in 412 WH

Page 3: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Lexical Semantics and WordNet

Page 4: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Meanings of words

• Lexemes, lexicon, sense(s)
• Examples:

– Red, n: the color of blood or a ruby

– Blood, n: the red liquid that circulates in the heart, arteries and veins of animals

– Right, adj: located nearer the right hand esp. being on the right when facing the same direction as the observer

• Do dictionaries give us definitions?

Page 5: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Relations among words

• Homonymy:

– Instead, a bank can hold the investments in a custodial account in the client’s name.

– But as agriculture burgeons on the east bank, the river will shrink even more.

• Other examples: be/bee?, wood/would?

• Homophones

• Homographs

• Applications: spelling correction, speech recognition, text-to-speech

• Example: Un ver vert va vers un verre vert. (French homophones: "A green worm goes toward a green glass.")

Page 6: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Polysemy

• They rarely serve red meat, preferring to prepare seafood, poultry, or game birds.

• He served as U.S. ambassador to Norway in 1976 and 1977.

• He might have served his time, come out and led an upstanding life.

• Homonymy: distinct and unrelated meanings, possibly with different etymology (multiple lexemes).

• Polysemy: a single lexeme with two or more related meanings.

• Example: an “idea bank”

Page 7: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Synonymy

• Principle of substitutability
• How big is this plane?
• Would I be flying on a large or small plane?
• Miss Nelson, for instance, became a kind of big sister to Mrs. Van Tassel's son, Benjamin.
• ?? Miss Nelson, for instance, became a kind of large sister to Mrs. Van Tassel's son, Benjamin.
• What is the cheapest first class fare?
• ?? What is the cheapest first class cost?

Page 8: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Semantic Networks

• Used to represent relationships between words

• Example: WordNet - created by George Miller’s team at Princeton (http://www.cogsci.princeton.edu/~wn)

• Based on synsets (synonyms, interchangeable words) and lexical matrices

Page 9: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Lexical matrix

Word Meanings × Word Forms:

        F1      F2      F3    …    Fn
M1      E1,1    E1,2
M2              E2,2
…
Mm                                  Em,n

Page 10: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Synsets

• Disambiguation
  – {board, plank}
  – {board, committee}
• Synonyms
  – substitution
  – weak substitution
  – synonyms must be of the same part of speech

Page 11: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

$ ./wn board -hypen

Synonyms/Hypernyms (Ordered by Frequency) of noun board

9 senses of board

Sense 1
board => committee, commission => administrative unit => unit, social unit => organization, organisation => social group => group, grouping

Sense 2
board => sheet, flat solid => artifact, artefact => object, physical object => entity, something

Sense 3
board, plank => lumber, timber => building material => artifact, artefact => object, physical object => entity, something

Page 12: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Sense 4
display panel, display board, board => display => electronic device => device => instrumentality, instrumentation => artifact, artefact => object, physical object => entity, something

Sense 5
board, gameboard => surface => artifact, artefact => object, physical object => entity, something

Sense 6
board, table => fare => food, nutrient => substance, matter => object, physical object => entity, something

Page 13: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Sense 7
control panel, instrument panel, control board, board, panel => electrical device => device => instrumentality, instrumentation => artifact, artefact => object, physical object => entity, something

Sense 8
circuit board, circuit card, board, card => printed circuit => computer circuit => circuit, electrical circuit, electric circuit => electrical device => device => instrumentality, instrumentation => artifact, artefact => object, physical object => entity, something

Sense 9
dining table, board => table => furniture, piece of furniture, article of furniture => furnishings => instrumentality, instrumentation => artifact, artefact => object, physical object => entity, something
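The hypernym chains above can also be reproduced programmatically. A minimal sketch using NLTK's WordNet interface (NLTK and its wordnet corpus are an assumption here, not part of the handout; the command-line tool used above is the original wn browser):

    # Walk the hypernym chain for each noun sense of "board",
    # roughly mirroring the output of `wn board -hypen`.
    from nltk.corpus import wordnet as wn

    for i, synset in enumerate(wn.synsets('board', pos=wn.NOUN), start=1):
        print(f"Sense {i}: " + ", ".join(l.name() for l in synset.lemmas()))
        hyper = synset.hypernyms()
        while hyper:                      # follow the first hypernym path upward
            print("   => " + ", ".join(l.name() for l in hyper[0].lemmas()))
            hyper = hyper[0].hypernyms()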

Page 14: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Antonymy

• “x” vs. “not-x”

• “rich” vs. “poor”?

• {rise, ascend} vs. {fall, descend}

Page 15: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Other relations

• Meronymy: X is a meronym of Y when native speakers of English accept sentences similar to “X is a part of Y”, “X is a member of Y”.

• Hyponymy: {tree} is a hyponym of {plant}.

• Hierarchical structure based on hyponymy (and hypernymy).

Page 16: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Other features of WordNet

• Index of familiarity

• Polysemy

Page 17: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

board used as a noun is familiar (polysemy count = 9)

bird used as a noun is common (polysemy count = 5)

cat used as a noun is common (polysemy count = 7)

house used as a noun is familiar (polysemy count = 11)

information used as a noun is common (polysemy count = 5)

retrieval used as a noun is uncommon (polysemy count = 3)

serendipity used as a noun is very rare (polysemy count = 1)

Familiarity and polysemy

Page 18: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Compound nouns

advisory board, appeals board, backboard, backgammon board, baseboard, basketball backboard, big board, billboard, binder's board, binder board,
blackboard, board game, board measure, board meeting, board member, board of appeals, board of directors, board of education, board of regents, board of trustees

Page 19: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Overview of senses

1. board -- (a committee having supervisory powers; "the board has seven members")
2. board -- (a flat piece of material designed for a special purpose; "he nailed boards across the windows")
3. board, plank -- (a stout length of sawn timber; made in a wide variety of sizes and used for many purposes)
4. display panel, display board, board -- (a board on which information can be displayed to public view)
5. board, gameboard -- (a flat portable surface (usually rectangular) designed for board games; "he got out the board and set up the pieces")
6. board, table -- (food or meals in general; "she sets a fine table"; "room and board")
7. control panel, instrument panel, control board, board, panel -- (an insulated panel containing switches and dials and meters for controlling electrical devices; "he checked the instrument panel"; "suddenly the board lit up like a Christmas tree")
8. circuit board, circuit card, board, card -- (a printed circuit that can be inserted into expansion slots in a computer to increase the computer's capabilities)
9. dining table, board -- (a table at which meals are served; "he helped her clear the dining table"; "a feast was spread upon the board")

Page 20: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Top-level concepts

{act, action, activity}

{animal, fauna}

{artifact}

{attribute, property}

{body, corpus}

{cognition, knowledge}

{communication}

{event, happening}

{feeling, emotion}

{food}

{group, collection}

{location, place}

{motive}

{natural object}

{natural phenomenon}

{person, human being}

{plant, flora}

{possession}

{process}

{quantity, amount}

{relation}

{shape}

{state, condition}

{substance}

{time}

Page 21: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Text Summarization

Page 22: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

The BIG problem

• Information overload: 3 billion+ URLs catalogued by Google
• Possible approaches:
  – information retrieval
  – document clustering
  – information extraction
  – visualization
  – question answering
  – text summarization

Page 23: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3
Page 24: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3
Page 25: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

MILAN, Italy, April 18. A small airplane crashed into a government building in the heart of Milan, setting the top floors on fire, Italian police reported. There were no immediate reports on casualties as rescue workers attempted to clear the area in the city's financial district. Few details of the crash were available, but news reports about it immediately set off fears that it might be a terrorist act akin to the Sept. 11 attacks in the United States. Those fears sent U.S. stocks tumbling to session lows in late morning trading.

Witnesses reported hearing a loud explosion from the 30-story office building, which houses the administrative offices of the local Lombardy region and sits next to the city's central train station. Italian state television said the crash put a hole in the 25th floor of the Pirelli building. News reports said smoke poured from the opening. Police and ambulances rushed to the building in downtown Milan. No further details were immediately available.

Page 26: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

MILAN, Italy, April 18. A small airplane crashed into a government building in the heart of Milan, setting the top floors on fire, Italian police reported. There were no immediate reports on casualties as rescue workers attempted to clear the area in the city's financial district. Few details of the crash were available, but news reports about it immediately set off fears that it might be a terrorist act akin to the Sept. 11 attacks in the United States. Those fears sent U.S. stocks tumbling to session lows in late morning trading.

Witnesses reported hearing a loud explosion from the 30-story office building, which houses the administrative offices of the local Lombardy region and sits next to the city's central train station. Italian state television said the crash put a hole in the 25th floor of the Pirelli building. News reports said smoke poured from the opening. Police and ambulances rushed to the building in downtown Milan. No further details were immediately available.

Page 27: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

MILAN, Italy, April 18. A small airplane crashed into a government building in the heart of Milan, setting the top floors on fire, Italian police reported. There were no immediate reports on casualties as rescue workers attempted to clear the area in the city's financial district. Few details of the crash were available, but news reports about it immediately set off fears that it might be a terrorist act akin to the Sept. 11 attacks in the United States. Those fears sent U.S. stocks tumbling to session lows in late morning trading.

Witnesses reported hearing a loud explosion from the 30-story office building, which houses the administrative offices of the local Lombardy region and sits next to the city's central train station. Italian state television said the crash put a hole in the 25th floor of the Pirelli building. News reports said smoke poured from the opening. Police and ambulances rushed to the building in downtown Milan. No further details were immediately available.

How many victims?

Was it a terrorist act?

What was the target?

What happened?

Says who?

When, where?

Page 28: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

1. How many people were injured?
2. How many people were killed? (age, number, gender, description)
3. Was the pilot killed?
4. Where was the plane coming from?
5. Was it an accident (technical problem, illness, terrorist act)?
6. Who was the pilot? (age, number, gender, description)
7. When did the plane crash?
8. How tall is the Pirelli building?
9. Who was on the plane with the pilot?
10. Did the plane catch fire before hitting the building?
11. What was the weather like at the time of the crash?
12. When was the building built?
13. What direction was the plane flying?
14. How many people work in the building?
15. How many people were in the building at the time of the crash?
16. How many people were taken to the hospital?
17. What kind of aircraft was used?

Page 29: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Some concepts

• Abstracts: “a concise summary of the central subject matter of a document” [Paice90].

• Indicative, informative, and critical summaries

• Extracts (representative paragraphs/sentences/phrases)

• Still grammatical

Page 30: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Types of summaries

• Dimensions
  – Single-document vs. multi-document

• Context
  – Query-specific vs. query-independent

• Genres

Page 31: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Genres

• headlines
• outlines
• minutes
• biographies
• abridgments
• sound bites
• movie summaries
• chronologies, etc.

[Mani and Maybury 1999]

Page 32: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Bush may send 500-1,000 troops to Liberia

Wednesday, July 2, 2003 Posted: 7:36 PM EDT (2336 GMT)

President Bush could announce later this week that he is sending 500 to 1,000 peacekeeping troops to Liberia, two senior officials told CNN.

Facing mounting international pressure to have the United States lead a Liberia mission that also would include West African peacekeepers, Bush discussed such a deployment Wednesday, the officials said.

U.N. Secretary-General Kofi Annan and others have talked of a U.S. deployment of 2,000 troops, but U.S. officials told CNN any deployment would be no more than half that.

The officials said the timing of the announcement could be slowed by efforts to get Liberian President Charles Taylor, who faces war crimes charges by a U.N. court in neighboring Sierra Leone, to step down and leave the war-torn country.

The White House official line is that Taylor should leave now and face war crimes trial later. But Bush used different language Wednesday regarding Taylor, saying simply that he should leave the country. Many analysts read the new Bush language as a sign the president was prepared to accept Taylor going into exile in a country that would not extradite him to Sierra Leone.

Bush has been reluctant to commit U.S. troops to Liberia, which was founded in 1822 as a settlement for freed American slaves, and hoped West African peacekeepers would be enough, with the possible exception of Marine reinforcements at the U.S. Embassy in Monrovia. But Secretary of State Powell has been arguing in favor of a U.S. commitment, sources said -- citing recent peacekeeping commitments by France in the Ivory Coast and Great Britain in Sierra Leone.

Bush leaves this weekend for his first trip to Africa, and the Liberia issue has become a test of his promise to make a commitment to promoting peace, democracy and economic development in Africa, administration officials said. One senior official said, "There will be a U.S. role, but the details are still in somewhat of a flux." Another senior official said "it is not sealed" but a force of 500 to no more than 1,000 Army troops was under serious discussion and that there were "strong indications" a final decision in favor of a deployment "will be sooner rather than later."

Page 33: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Despite suggestions by some administration officials to the contrary, neither Defense Secretary Donald Rumsfeld nor Joint Chiefs Chairman Gen. Richard Myers has expressed reservations about involving U.S. troops in Liberia, key aides to both men told CNN.

An aide to Rumsfeld said the defense secretary believes the mission would fit into the category of "lesser contingencies" the Pentagon is prepared to handle. Sources close to Myers said the general shares that view.

Pentagon officials acknowledged forces are stretched thin overseas -- in Afghanistan, Iraq and the Balkans -- but said the small number of troops required for Liberia would not create problems.

But other administration officials said the Pentagon is wary in part because of the humiliating memories of the last major U.S. deployment in Africa -- to Somalia -- which ended in retreat 10 years ago after 18 Americans were killed. Several senior officials said reports that Bush had already signed orders authorizing a deployment were inaccurate.

But these officials said planning was intensifying, including detailed conversations with the United Nations and with West African nations that would be part of a peacekeeping mission.

Pentagon sources told CNN a unit of 50 U.S. Marines known as a FAST team -- for Fleet Anti-terrorism Security Team -- was on standby in Rota, Spain, for possible deployment to reinforce security at the U.S. Embassy.

Several hundred Americans remain in Liberia, where intense fighting between Taylor's government and rebel forces has continued despite a June 17 cease-fire.

Nigeria had been working with Taylor on a possible deal for him to take refuge in that country. One problem, however, is that Taylor has agreed to deals before, then backed out. Officials said the United States was working closely with members of the Economic Community of West African States on diplomatic efforts, particularly Ghana and Nigeria.

Comments Tuesday by White House press secretary Ari Fleischer that Bush was considering sending troops provoked a nearly instantaneous reaction in Monrovia, where thousands of people gathered outside the U.S. Embassy to cheer a possible American presence.

"We feel America can bring peace because they are the original founders of this nation, and secondly, they are the superpower of the world," one man said.

Page 34: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Bush may send 500-1,000 troops to Liberia

President Bush could announce later this week that he is sending 500 to 1,000 peacekeeping troops to Liberia.

Bush discussed such a deployment Wednesday.

The White House official line is that Liberian President Taylor should leave now and face war crimes trial later. A unit of 50 U.S. Marines known as a FAST team was on standby in Rota, Spain.

Several hundred Americans remain in Liberia, where intense fighting between Taylor's government and rebel forces has continued despite a June 17 cease-fire.

Page 35: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Bush may send 500-1,000 troops to Liberia

Wednesday, July 2, 2003 Posted: 7:36 PM EDT (2336 GMT)

President Bush could announce later this week that he is sending 500 to 1,000 peacekeeping troops to Liberia, two senior officials told CNN.

Facing mounting international pressure to have the United States lead a Liberia mission that also would include West African peacekeepers, Bush discussed such a deployment Wednesday, the officials said.

U.N. Secretary-General Kofi Annan and others have talked of a U.S. deployment of 2,000 troops, but U.S. officials told CNN any deployment would be no more than half that.

The officials said the timing of the announcement could be slowed by efforts to get Liberian President Charles Taylor, who faces war crimes charges by a U.N. court in neighboring Sierra Leone, to step down and leave the war-torn country.

The White House official line is that Taylor should leave now and face war crimes trial later. But Bush used different language Wednesday regarding Taylor, saying simply that he should leave the country. Many analysts read the new Bush language as a sign the president was prepared to accept Taylor going into exile in a country that would not extradite him to Sierra Leone.

Pentagon sources told CNN a unit of 50 U.S. Marines known as a FAST team -- for Fleet Anti-terrorism Security Team -- was on standby in Rota, Spain, for possible deployment to reinforce security at the U.S. Embassy.

Several hundred Americans remain in Liberia, where intense fighting between Taylor's government and rebel forces has continued despite a June 17 cease-fire.

Page 36: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

What does summarization involve?

• Three stages (typically)
  – content identification
  – conceptual organization
  – realization

Page 37: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Human summarization and abstracting

• What professional abstractors do

• Ashworth: "To take an original article, understand it and pack it neatly into a nutshell without loss of substance or clarity presents a challenge which many have felt worth taking up for the joys of achievement alone. These are the characteristics of an art form."

Page 38: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Borko and Bernier 75

• The abstract and its use:
  – Abstracts promote current awareness
  – Abstracts save reading time
  – Abstracts facilitate selection
  – Abstracts facilitate literature searches
  – Abstracts improve indexing efficiency
  – Abstracts aid in the preparation of reviews

Page 39: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Cremmins 82, 96

• American National Standard for Writing Abstracts:
  – State the purpose, methods, results, and conclusions presented in the original document, either in that order or with an initial emphasis on results and conclusions.

– Make the abstract as informative as the nature of the document will permit, so that readers may decide, quickly and accurately, whether they need to read the entire document.

– Avoid including background information or citing the work of others in the abstract, unless the study is a replication or evaluation of their work.

Page 40: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Cremmins 82, 96

– Do not include information in the abstract that is not contained in the textual material being abstracted.

– Verify that all quantitative and qualitative information used in the abstract agrees with the information contained in the full text of the document.

– Use standard English and precise technical terms, and follow conventional grammar and punctuation rules.

– Give expanded versions of lesser known abbreviations and acronyms, and verbalize symbols that may be unfamiliar to readers of the abstract.

– Omit needless words, phrases, and sentences.

Page 41: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Cremmins 82, 96

• Original version:

There were significant positive associations between the concentrations of the substance administered and mortality in rats and mice of both sexes.

There was no convincing evidence to indicate that endrin ingestion induced any of the different types of tumors which were found in the treated animals.

• Edited version:

Mortality in rats and mice of both sexes was dose related.

No treatment-related tumors were found in any of the animals.

Page 42: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Morris et al. 92

• Reading comprehension of summaries
• 75% redundancy of English [Shannon 51]
• Compare manual abstracts, Edmundson-style extracts, and full documents
• Extracts containing 20% or 30% of the original document are effective surrogates of the original document
• Performance on 20% and 30% extracts is no different than on informative abstracts

Page 43: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Luhn 58

• Very first work in automated summarization
• Computes measures of significance
• Words:
  – stemming
  – bag of words

[Figure: plot of word frequency against words, illustrating the resolving power of significant words.]

Page 44: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Luhn 58

• Sentences:
  – concentration of high-score words
• Cutoff values established in experiments with 100 human subjects

[Figure: a sentence with significant words marked among all words; a bracketed span of 7 words containing 4 significant words scores 4²/7 ≈ 2.3.]
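A minimal sketch of Luhn-style sentence scoring, consistent with the figure (4 significant words in a 7-word bracketed span give 4²/7 ≈ 2.3); the clustering gap of 4 non-significant words is an assumption for illustration:

    def luhn_sentence_score(tokens, significant, max_gap=4):
        # positions of significant words in the sentence
        pos = [i for i, t in enumerate(tokens) if t.lower() in significant]
        if not pos:
            return 0.0
        best, start = 0.0, 0
        for i in range(1, len(pos) + 1):
            # close the current bracketed span when the gap grows too large
            if i == len(pos) or pos[i] - pos[i - 1] > max_gap:
                span = pos[start:i]
                span_len = span[-1] - span[0] + 1
                best = max(best, len(span) ** 2 / span_len)  # (significant)^2 / span length
                start = i
        return best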

Page 45: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Edmundson 69

• Cue method:
  – stigma words ("hardly", "impossible")
  – bonus words ("significant")
• Key method:
  – similar to Luhn
• Title method:
  – title + headings
• Location method:
  – sentences under headings
  – sentences near the beginning or end of the document and/or paragraphs (also [Baxendale 58])

Page 46: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Edmundson 69

• Linear combination of four features:

  α1*C + α2*K + α3*T + α4*L

• Manually labelled training corpus
• Key not important!

[Figure: bar chart of extraction performance (0-100%) for RANDOM, KEY, TITLE, CUE, LOCATION, C + K + T + L, and C + T + L.]
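A minimal sketch of the linear combination above; the per-sentence feature values and the α weights (learned from the labelled training corpus) are assumed to be given:

    # Edmundson 69: score = a1*C + a2*K + a3*T + a4*L per sentence;
    # the "Key not important" finding corresponds to a near-zero a2.
    def edmundson_score(C, K, T, L, alphas=(1.0, 0.0, 1.0, 1.0)):
        a1, a2, a3, a4 = alphas
        return a1 * C + a2 * K + a3 * T + a4 * L

    def rank_sentences(feature_rows, alphas=(1.0, 0.0, 1.0, 1.0)):
        # feature_rows: one (C, K, T, L) tuple per sentence
        scores = [edmundson_score(*row, alphas=alphas) for row in feature_rows]
        return sorted(range(len(scores)), key=scores.__getitem__, reverse=True)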

Page 47: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Paice 90

• Survey up to 1990
• Techniques that (mostly) failed:
  – syntactic criteria [Earl 70]
  – indicator phrases ("The purpose of this article is to review…")
• Problems with extracts:
  – lack of balance
  – lack of cohesion
    • anaphoric reference
    • lexical or definite reference
    • rhetorical connectives

Page 48: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Paice 90

• Lack of balance
  – later approaches based on text rhetorical structure
• Lack of cohesion
  – recognition of anaphors [Liddy et al. 87]
• Example: "that" is
  – nonanaphoric if preceded by a research-verb (e.g., "demonstrat-"),
  – nonanaphoric if followed by a pronoun, article, quantifier, …,
  – external if no later than the 10th word,
  – else internal

Page 49: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Brandow et al. 95

• ANES: commercial news from 41 publications
• "Lead" achieves acceptability of 90% vs. 74.4% for "intelligent" summaries
• 20,997 documents
• words selected based on tf*idf
• sentence-based features:
  – signature words
  – location
  – anaphora words
  – length of abstract

Page 50: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Brandow et al. 95

• Sentences with no signature words are included if between two selected sentences

• Evaluation done at 60-, 150-, and 250-word lengths

• Non-task-driven evaluation:

“Most summaries judged less-than-perfect would not be detectable as such to a user”

Page 51: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Lin & Hovy 97

• Optimum position policy

• Measuring yield of each sentence position against keywords (signature words) from Ziff-Davis corpus

• Preferred order

[(T) (P2,S1) (P3,S1) (P2,S2) {(P4,S1) (P5,S1) (P3,S2)} {(P1,S1) (P6,S1) (P7,S1) (P1,S3) (P2,S3)} …]

Page 52: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Kupiec et al. 95

• Extracts of roughly 20% of original text
• Feature set:
  – sentence length: |S| > 5
  – fixed phrases: 26 manually chosen
  – paragraph: sentence position in paragraph
  – thematic words (binary: whether sentence is included in manual extract)
  – uppercase words: not common acronyms
• Corpus: 188 document + summary pairs from scientific journals

Page 53: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Kupiec et al. 95

• Uses Bayesian classifier:

  P(s ∈ S | F1, F2, …, Fk) = P(F1, F2, …, Fk | s ∈ S) * P(s ∈ S) / P(F1, F2, …, Fk)

• Assuming statistical independence:

  P(s ∈ S | F1, F2, …, Fk) = ( ∏_{j=1..k} P(Fj | s ∈ S) ) * P(s ∈ S) / ∏_{j=1..k} P(Fj)
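A minimal sketch of this naive Bayes scoring; the per-feature probability tables and the prior are assumed to have been estimated beforehand from the 188 training pairs:

    # P(s in S | F1..Fk) is proportional to
    # P(s in S) * prod_j P(Fj | s in S) / prod_j P(Fj)
    def extraction_score(features, p_f_given_s, p_f, prior):
        # features: dict feature_name -> observed value
        # p_f_given_s[name][value], p_f[name][value]: estimated probabilities
        score = prior
        for name, value in features.items():
            score *= p_f_given_s[name][value] / p_f[name][value]
        return score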

Page 54: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Kupiec et al. 95

• Performance:
  – For 25% summaries, 84% precision
  – For smaller summaries, 74% improvement over Lead

Page 55: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Salton et al. 97

• document analysis based on semantic hyperlinks (among pairs of paragraphs related by a lexical similarity significantly higher than random)

• Bushy paths (or paths connecting highly connected paragraphs) are more likely to contain information central to the topic of the article

Page 56: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Salton et al. 97

Page 57: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Salton et al. 97

Overlap between manual extracts: 46%

Algorithm            Optimistic   Pessimistic   Intersection   Union
Global bushy         45.60%       30.74%        47.33%         55.16%
Global depth-first   43.98%       27.76%        42.33%         52.48%
Segmented bushy      45.48%       26.37%        38.17%         52.95%
Random               39.16%       22.07%        38.47%         44.24%

Page 58: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Marcu 97-99

• Based on RST (nucleus + satellite relations)
• Text coherence
• 70% precision and recall in matching the most important units in a text
• Example: evidence

  [The truth is that the pressure to smoke in junior high is greater than it will be any other time of one's life:] [we know that 3,000 teens start smoking each day.]

• N+S combination increases R's belief in N [Mann and Thompson 88]

Page 59: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

[Figure: RST tree for a short passage about Martian weather (text units 1-10), built from relations including Elaboration, Example, Background/Justification, Concession, Antithesis, Contrast, Evidence, and Cause; the most promoted unit is (2), "Mars experiences frigid weather conditions."]

Page 60: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Barzilay and Elhadad 97

• Lexical chains [Stairmand 96]

Mr. Kenny is the person that invented the anesthetic machine which uses micro-computers to control the rate at which an anesthetic is pumped into the blood. Such machines are nothing new. But his device uses two micro-computers to achieve much closer monitoring of the pump feeding the anesthetic into the patient.

Page 61: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Barzilay and Elhadad 97

• WordNet-based

• three types of relations:
  – extra-strong (repetitions)
  – strong (WordNet relations)
  – medium-strong (link between synsets is longer than one + some additional constraints)

Page 62: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Barzilay and Elhadad 97

• Scoring chains:
  – Length
  – Homogeneity index = 1 - (# distinct words in chain / Length)

Score = Length * Homogeneity

Strong chains: Score > Average + 2 * st.dev.
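A minimal sketch of the chain scoring as reconstructed above (the homogeneity denominator is read as the chain length):

    from statistics import mean, stdev

    def chain_score(chain_words):
        length = len(chain_words)                       # total word occurrences in the chain
        homogeneity = 1 - len(set(chain_words)) / length
        return length * homogeneity

    def strong_chains(chains):
        # keep chains scoring more than two standard deviations above the mean
        scores = [chain_score(c) for c in chains]
        if len(scores) < 2:
            return list(chains)
        cutoff = mean(scores) + 2 * stdev(scores)
        return [c for c, s in zip(chains, scores) if s > cutoff]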

Page 63: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Mani & Bloedorn 97,99

• Summarizing differences and similarities across documents

• Single event or a sequence of events

• Text segments are aligned

• Evaluation: TREC relevance judgments

• Significant reduction in time with no significant loss of accuracy

Page 64: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Carbonell & Goldstein 98

• Maximal Marginal Relevance (MMR)

• Query-based summaries

• Law of diminishing returns

C = document collection
Q = user query
R = IR(C, Q, θ)
S = already retrieved documents
Sim1, Sim2 = similarity metrics used

MMR = arg max_{Di ∈ R\S} [ λ*Sim1(Di, Q) - (1 - λ)*max_{Dj ∈ S} Sim2(Di, Dj) ]
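A minimal sketch of MMR reranking under these definitions; Sim1 and Sim2 are assumed to be any similarity functions (for instance, the cosine measure used later in this handout):

    # Carbonell & Goldstein 98: greedily pick items that are relevant to the
    # query but maximally dissimilar from what has already been selected.
    def mmr_select(candidates, query, sim1, sim2, lam=0.7, k=5):
        selected, remaining = [], list(candidates)
        while remaining and len(selected) < k:
            def mmr(d):
                redundancy = max((sim2(d, s) for s in selected), default=0.0)
                return lam * sim1(d, query) - (1 - lam) * redundancy
            best = max(remaining, key=mmr)
            selected.append(best)
            remaining.remove(best)
        return selected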

Page 65: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Radev et al. 00

• MEAD
• Centroid-based
• Based on sentence utility
• Topic detection and tracking initiative [Allen et al. 98, Wayne 98]

Page 66: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

1. Algerian newspapers have reported that 18 decapitated bodies have been found by authorities in the south of the country.

2. Police found the ``decapitated bodies of women, children and old men,with their heads thrown on a road'' near the town of Jelfa, 275 kilometers (170 miles) south of the capital Algiers.

3. In another incident on Wednesday, seven people -- including six children -- were killed by terrorists, Algerian security forces said.

4. Extremist Muslim militants were responsible for the slaughter of the seven people in the province of Medea, 120 kilometers (74 miles) south of Algiers.

5. The killers also kidnapped three girls during the same attack, authorities said, and one of the girls was found wounded on a nearby road.

6. Meanwhile, the Algerian daily Le Matin today quoted Interior Minister Abdul Malik Silal as saying that ``terrorism has not been eradicated, but the movement of the terrorists has significantly declined.''

7. Algerian violence has claimed the lives of more than 70,000 people since the army cancelled the 1992 general elections that Islamic parties were likely to win.

8. Mainstream Islamic groups, most of which are banned in the country, insist their members are not responsible for the violence against civilians.

9. Some Muslim groups have blamed the army, while others accuse ``foreign elements conspiring against Algeria.’’

1. Eighteen decapitated bodies have been found in a mass grave in northern Algeria, press reports said Thursday, adding that two shepherds were murdered earlier this week.

2. Security forces found the mass grave on Wednesday at Chbika, near Djelfa, 275 kilometers (170 miles) south of the capital.

3. It contained the bodies of people killed last year during a wedding ceremony, according to Le Quotidien Liberte.

4. The victims included women, children and old men.

5. Most of them had been decapitated and their heads thrown on a road, reported the Es Sahafa.

6. Another mass grave containing the bodies of around 10 people was discovered recently near Algiers, in the Eucalyptus district.

7. The two shepherds were killed Monday evening by a group of nine armed Islamists near the Moulay Slissen forest.

8. After being injured in a hail of automatic weapons fire, the pair were finished off with machete blows before being decapitated, Le Quotidien d'Oran reported.

9. Seven people, six of them children, were killed and two injured Wednesday by armed Islamists near Medea, 120 kilometers (75 miles) south of Algiers, security forces said.

10. The same day a parcel bomb explosion injured 17 people in Algiers itself.

11. Since early March, violence linked to armed Islamists has claimed more than 500 lives, according to press tallies.

ARTICLE 18854: ALGIERS, May 20 (UPI)
ARTICLE 18853: ALGIERS, May 20 (AFP)

Page 67: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Vector-based representation

[Figure: documents and their centroid represented as vectors in a term space with axes Term 1, Term 2, and Term 3.]

Page 68: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Vector-based matching

• The cosine measure

  cos(x, y) = ( Σ_{i=1..n} x_i*y_i ) / ( sqrt(Σ_{i=1..n} x_i²) * sqrt(Σ_{i=1..n} y_i²) )
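A minimal sketch of the cosine measure on sparse term-weight vectors (dictionaries mapping term to weight are an assumed representation):

    from math import sqrt

    # cos(x, y) = sum_i x_i*y_i / (sqrt(sum_i x_i^2) * sqrt(sum_i y_i^2))
    def cosine(x, y):
        dot = sum(w * y.get(t, 0.0) for t, w in x.items())
        nx = sqrt(sum(w * w for w in x.values()))
        ny = sqrt(sum(w * w for w in y.values()))
        return dot / (nx * ny) if nx and ny else 0.0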

Page 69: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

CIDR

[Figure: CIDR incremental clustering: a document whose similarity to a cluster centroid is at least T (sim ≥ T) joins that cluster; otherwise (sim < T) it starts a new cluster.]

Page 70: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Centroids

C 00022 (N=44): diana 1.93, princess 1.52
C 00025 (N=19): albanians 3.00
C 00026 (N=10): universe 1.50, expansion 1.00, bang 0.90
C 10007 (N=11): crashes 1.00, safety 0.55, transportation 0.55, drivers 0.45, board 0.36, flight 0.27, buckle 0.27, pittsburgh 0.18, graduating 0.18, automobile 0.18
C 00035 (N=22): airlines 1.45, finnair 0.45
C 00031 (N=34): el 1.85, nino 1.56
C 00008 (N=113): space 1.98, shuttle 1.17, station 0.75, nasa 0.51, columbia 0.37, mission 0.33, mir 0.30, astronauts 0.14, steering 0.11, safely 0.07
C 10062 (N=161): microsoft 3.24, justice 0.93, department 0.88, windows 0.98, corp 0.61, software 0.57, ellison 0.07, hatch 0.06, netscape 0.04, metcalfe 0.02

Page 71: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

MEAD


Page 72: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

MEAD

• INPUT: Cluster of d documents with n sentences (compression rate = r)

• OUTPUT: (n * r) sentences from the cluster with the highest values of SCORE

SCORE(s_i) = w_c*C_i + w_p*P_i + w_f*F_i
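A minimal sketch of MEAD's scoring and selection step; the per-sentence feature values C_i, P_i, and F_i are assumed to be precomputed:

    # Keep the n*r highest-scoring sentences, then restore document order.
    def mead_summary(sentences, C, P, F, r, w_c=1.0, w_p=1.0, w_f=1.0):
        scores = [w_c * c + w_p * p + w_f * f for c, p, f in zip(C, P, F)]
        n_keep = max(1, int(len(sentences) * r))
        ranked = sorted(range(len(sentences)), key=scores.__getitem__, reverse=True)
        return [sentences[i] for i in sorted(ranked[:n_keep])]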

Page 73: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

[Barzilay et al. 99]

• Theme intersection (paraphrases)

• Identifying common phrases across multiple sentences:
  – evaluated on 39 sentence-level predicate-argument structures
  – 74% of p-a structures automatically identified

Page 74: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Other multi-document approaches

• Reformulation [McKeown et al. 99, McKeown et al. 02]

• Generation by Selection and Repair [DiMarco et al. 97]

Page 75: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Overview

• Schank and Abelson 77
  – scripts
• DeJong 79
  – FRUMP (slot-filling from UPI news)
• Graesser 81
  – Ratio of inferred propositions to those explicitly stated is 8:1
• Young & Hayes 85
  – banking telexes

Page 76: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Radev and McKeown 98

MESSAGE: ID                    TST3-MUC4-0010
MESSAGE: TEMPLATE              2
INCIDENT: DATE                 30 OCT 89
INCIDENT: LOCATION             EL SALVADOR
INCIDENT: TYPE                 ATTACK
INCIDENT: STAGE OF EXECUTION   ACCOMPLISHED
INCIDENT: INSTRUMENT ID
INCIDENT: INSTRUMENT TYPE
PERP: INCIDENT CATEGORY        TERRORIST ACT
PERP: INDIVIDUAL ID            "TERRORIST"
PERP: ORGANIZATION ID          "THE FMLN"
PERP: ORG. CONFIDENCE          REPORTED: "THE FMLN"
PHYS TGT: ID
PHYS TGT: TYPE
PHYS TGT: NUMBER
PHYS TGT: FOREIGN NATION
PHYS TGT: EFFECT OF INCIDENT
PHYS TGT: TOTAL NUMBER
HUM TGT: NAME
HUM TGT: DESCRIPTION           "1 CIVILIAN"
HUM TGT: TYPE                  CIVILIAN: "1 CIVILIAN"
HUM TGT: NUMBER                1: "1 CIVILIAN"
HUM TGT: FOREIGN NATION
HUM TGT: EFFECT OF INCIDENT    DEATH: "1 CIVILIAN"
HUM TGT: TOTAL NUMBER

Page 77: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Generating text from templates

On October 30, 1989, one civilian was killed in a reported FMLN attack in El Salvador.

Page 78: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Input: cluster of templates (T1, T2, …, Tm)

[Figure: system architecture: a conceptual combiner (combiner, paragraph planner, planning operators) and a linguistic realizer (sentence planner, lexical chooser, lexicon, domain ontology, sentence generator / SURGE) turn the input templates into a base summary.]

Page 79: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Excerpts from four articles JERUSALEM - A Muslim suicide bomber blew apart 18 people on a Jerusalem bus and wounded 10 in a mirror-image of an attack one week ago. The carnage could rob Israel's Prime Minister Shimon Peres of the May 29 election victory he needs to pursue Middle East peacemaking. Peres declared all-out war on Hamas but his tough talk did little to impress stunned residents of Jerusalem who said the election would turn on the issue of personal security.

JERUSALEM - A bomb at a busy Tel Aviv shopping mall killed at least 10 people and wounded 30, Israel radio said quoting police. Army radio said the blast was apparently caused by a suicide bomber. Police said there were many wounded.

A bomb blast ripped through the commercial heart of Tel Aviv Monday, killing at least 13 people and wounding more than 100. Israeli police say an Islamic suicide bomber blew himself up outside a crowded shopping mall. It was the fourth deadly bombing in Israel in nine days. The Islamic fundamentalist group Hamas claimed responsibility for the attacks, which have killed at least 54 people. Hamas is intent on stopping the Middle East peace process. President Clinton joined the voices of international condemnation after the latest attack. He said the ``forces of terror shall not triumph'' over peacemaking efforts.

TEL AVIV (Reuter) - A Muslim suicide bomber killed at least 12 people and wounded 105, including children, outside a crowded Tel Aviv shopping mall Monday, police said. Sunday, a Hamas suicide bomber killed 18 people on a Jerusalem bus. Hamas has now killed at least 56 people in four attacks in nine days. The windows of stores lining both sides of Dizengoff Street were shattered, the charred skeletons of cars lay in the street, the sidewalks were strewn with blood. The last attack on Dizengoff was in October 1994 when a Hamas suicide bomber killed 22 people on a bus.


Page 80: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Four templates

Template 1
MESSAGE: ID             TST-REU-0001
SECSOURCE: SOURCE       Reuters
SECSOURCE: DATE         March 3, 1996 11:30
PRIMSOURCE: SOURCE
INCIDENT: DATE          March 3, 1996
INCIDENT: LOCATION      Jerusalem
INCIDENT: TYPE          Bombing
HUM TGT: NUMBER         "killed: 18" "wounded: 10"
PERP: ORGANIZATION ID

Template 2
MESSAGE: ID             TST-REU-0002
SECSOURCE: SOURCE       Reuters
SECSOURCE: DATE         March 4, 1996 07:20
PRIMSOURCE: SOURCE      Israel Radio
INCIDENT: DATE          March 4, 1996
INCIDENT: LOCATION      Tel Aviv
INCIDENT: TYPE          Bombing
HUM TGT: NUMBER         "killed: at least 10" "wounded: more than 100"
PERP: ORGANIZATION ID

Template 3
MESSAGE: ID             TST-REU-0003
SECSOURCE: SOURCE       Reuters
SECSOURCE: DATE         March 4, 1996 14:20
PRIMSOURCE: SOURCE
INCIDENT: DATE          March 4, 1996
INCIDENT: LOCATION      Tel Aviv
INCIDENT: TYPE          Bombing
HUM TGT: NUMBER         "killed: at least 13" "wounded: more than 100"
PERP: ORGANIZATION ID   "Hamas"

Template 4
MESSAGE: ID             TST-REU-0004
SECSOURCE: SOURCE       Reuters
SECSOURCE: DATE         March 4, 1996 14:30
PRIMSOURCE: SOURCE
INCIDENT: DATE          March 4, 1996
INCIDENT: LOCATION      Tel Aviv
INCIDENT: TYPE          Bombing
HUM TGT: NUMBER         "killed: at least 12" "wounded: 105"
PERP: ORGANIZATION ID

Page 81: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Fluent summary with comparisons

Reuters reported that 18 people were killed on Sunday in a bombing in Jerusalem. The next day, a bomb in Tel Aviv killed at least 10 people and wounded 30 according to Israel radio. Reuters reported that at least 12 people were killed and 105 wounded in the second incident. Later the same day, Reuters reported that Hamas has claimed responsibility for the act.

(OUTPUT OF SUMMONS)

Page 82: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Operators

• If there are two templates
  AND the location is the same
  AND the time of the second template is after the time of the first template
  AND the source of the first template is different from the source of the second template
  AND at least one slot differs
  THEN combine the templates using the contradiction operator
  ...

Page 83: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Operators: Change of Perspective

Change of perspective

March 4th, Reuters reported that a bomb in Tel Aviv killed at least 10 people and wounded 30. Later the same day, Reuters reported that exactly 12 people were actually killed and 105 wounded.

Precondition: The same source reports a change in a small number of slots

Page 84: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Operators: Contradiction

Contradiction

The afternoon of February 26, 1993, Reuters reported that a suspected bomb killed at least six people in the World Trade Center. However, Associated Press announced that exactly five people were killed in the blast.

Precondition: Different sources report contradictory values for a small number of slots

Page 85: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Operators: Refinement and Agreement

Refinement

On Monday morning, Reuters announced that a suicide bomber killed at least 10 people in Tel Aviv. In the afternoon, Reuters reported that Hamas claimed responsibility for the act.

Agreement

The morning of March 1st 1994, both UPI and Reuters reported that a man was kidnapped in the Bronx.

Page 86: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Operators: GeneralizationGeneralization

According to UPI, three terrorists were arrested in Medellín last Tuesday. Reuters announced that the police arrested two drug traffickers in Bogotá the next day.

A total of five criminals were arrested in Colombia last week.

Page 87: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Other conceptual methods

• Operator-based transformations using terminological knowledge representation [Reimer and Hahn 97]

• Topic interpretation [Hovy and Lin 98]

Page 88: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Ideal evaluation

Compression Ratio = |S| / |D|

Retention Ratio = i(S) / i(D)

(i(·) denotes information content)

Page 89: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Overview of techniques

• Extrinsic techniques (task-based)

• Intrinsic techniques

Page 90: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

• Can you recreate what's in the original?
  – the Shannon Game [Shannon 1947-50]
  – but often only some of it is really important.
• Measure info retention (number of keystrokes):
  – 3 groups of subjects, each must recreate text:
    • group 1 sees original text before starting.
    • group 2 sees summary of original text before starting.
    • group 3 sees nothing before starting.
• Results (# of keystrokes; two different paragraphs):

  Group 1: approx. 10    Group 2: approx. 150    Group 3: approx. 1100

Hovy 98

Page 91: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

• Burning questions:
  1. How do different evaluation methods compare for each type of summary?
  2. How do different summary types fare under different methods?
  3. How much does the evaluator affect things?
  4. Is there a preferred evaluation method?

• Small experiment (Hovy 98): 2 texts, 7 groups.
• Results:
  – No difference!
  – As in the other experiment…
  – ? Extract is best?

[Table: rankings of summary types (Original; Abstract: Background, Just-the-News, Regular; Extract: Keywords, Random; No Text) under the Shannon, Q&A, and Classification methods. Reported ranks: Original 1 1 1 1 1; Abstract Background 1 3 1 1 1; Just-the-News 3 1 1 1; Regular 1 2 1 1 1; Extract Keywords 2 4 1 1 1; Random 3 1 1 1; No Text 3 5. Adjacent-rank differences: 1-2: 50%, 2-3: 50%; 1-2: 30%, 2-3: 20%, 3-4: 20%, 4-5: 100%.]

Page 92: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Precision and Recall

                        Relevant    Non-relevant
System: relevant           A             B
System: non-relevant       C             D

Page 93: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Precision and Recall

Precision: P = A / (A + B)

Recall: R = A / (A + C)

F = 2*P*R / (P + R)
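In code, with the contingency counts from the previous slide (a trivial sketch):

    def precision_recall_f(A, B, C):
        p = A / (A + B) if A + B else 0.0
        r = A / (A + C) if A + C else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        return p, r, f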

Page 94: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Jing et al. 98

• Small experiment with 40 articles

• When summary length is given, humans are pretty consistent in selecting the same sentences

• Percent agreement

• Different systems achieved maximum performance at different summary lengths

• Human agreement higher for longer summaries

Page 95: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

SUMMAC [Mani et al. 98]

• 16 participants
• 3 tasks:
  – ad hoc: indicative, user-focused summaries
  – categorization: generic summaries, five categories
  – question-answering
• 20 TREC topics
• 50 documents per topic (short ones are omitted)

Page 96: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

SUMMAC [Mani et al. 98]

• Participants submit a fixed-length summary limited to 10% and a “best” summary, not limited in length.

• variable-length summaries are as accurate as full text

• over 80% of summaries are intelligible

• technologies perform similarly

Page 97: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Goldstein et al. 99

• Reuters, LA Times
• Manual summaries
• Summary length rather than summarization ratio is typically fixed
• Normalized versions of R and F:

  R' = A / min(A + B, A + C)

  F' = 2*P*R' / (P + R')

Page 98: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Goldstein et al. 99

• How to measure relative performance?

  p' = (p - b) / (1 - b)

  p = performance
  b = baseline
  g = "good" system
  s = "superior" system

  (analogous normalizations can be defined relative to the "good" and "superior" systems g and s)

Page 99: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Radev et al. 00

Cluster-Based Sentence Utility

        Ideal   System 1   System 2
S1        +        +          -
S2        +        +          +
S3        -        -          -
S4        -        -          +
S5        -        -          -
S6        -        -          -
S7        -        -          -
S8        -        -          -
S9        -        -          -
S10       -        -          -

Page 100: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Cluster-Based Sentence Utility

Summary sentence extraction method:

        Ideal   System 1   System 2
S1        +        +          -
S2        +        +          +
S3        -        -          -
S4        -        -          +
S5-S10    -        -          -

CBSU method (sentence utilities, with + marking extracted sentences):

        Ideal     System 1   System 2
S1      10 (+)    10 (+)      5
S2       8 (+)     9 (+)      8 (+)
S3       2         3          4
S4       7         6          9 (+)

CBSU(system, ideal) = % of ideal utility covered by the system summary

Page 101: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Interjudge agreement

Judge1 Judge2 Judge3

Sentence 1 10 10 5

Sentence 2 8 9 8

Sentence 3 2 3 4

Sentence 4 5 6 9

Page 102: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Relative utility

Judge1 Judge2 Judge3

Sentence 1 10 10 5

Sentence 2 8 9 8

Sentence 3 2 3 4

Sentence 4 5 6 9

RU =

Page 103: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Relative utility

Judge1 Judge2 Judge3

Sentence 1 10 10 5

Sentence 2 8 9 8

Sentence 3 2 3 4

Sentence 4 5 6 9

RU = ? / 17

Page 104: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Relative utility

Judge1 Judge2 Judge3

Sentence 1 10 10 5

Sentence 2 8 9 8

Sentence 3 2 3 4

Sentence 4 5 6 9

RU = 13 / 17 = 0.765
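One reading consistent with these numbers: Judge 1's two-sentence extract {S1, S2} earns 5 + 8 = 13 under Judge 3's utilities, while the best two sentences by Judge 3's own utilities (S4 and S2) are worth 9 + 8 = 17, giving 13/17 ≈ 0.765. A minimal sketch:

    # Relative utility of an extract against a reference judge's utilities:
    # utility earned divided by the best achievable utility at the same size.
    def relative_utility(extract, utilities):
        earned = sum(utilities[s] for s in extract)
        best = sum(sorted(utilities.values(), reverse=True)[:len(extract)])
        return earned / best

    judge3 = {'S1': 5, 'S2': 8, 'S3': 4, 'S4': 9}
    print(relative_utility({'S1', 'S2'}, judge3))    # 0.7647... = 13/17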

Page 105: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Normalized System Performance

Interjudge agreement (relative utility of each judge's extract against each other judge):

           Judge 1   Judge 2   Judge 3   Average
Judge 1     1.000     1.000     0.765     0.883
Judge 2     1.000     1.000     0.765     0.883
Judge 3     0.722     0.789     1.000     0.756

Normalized system performance:

  D = (S - R) / (J - R)

  S = system performance
  J = interjudge agreement
  R = random performance

Page 106: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Random Performance

D = (S - R) / (J - R)

Page 107: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Random Performance

D = (S - R) / (J - R)

R = average over all n! / ((n(1-r))! * (r*n)!) possible extracts

Page 108: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Random Performance

D = (S - R) / (J - R)

R = average over all n! / ((n(1-r))! * (r*n)!) possible extracts

For n = 4 and r = 50%, the possible extracts are {1,2} {1,3} {1,4} {2,3} {2,4} {3,4}

Page 109: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Examples

D{14} = (S - R) / (J - R) = (0.833 - 0.732) / (0.841 - 0.732) = 0.927

Page 110: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Examples

D{14} = (S - R) / (J - R) = (0.833 - 0.732) / (0.841 - 0.732) = 0.927

D{24} = 0.963

Page 111: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Normalized evaluation of {14}

[Figure: the raw utility scale (0.0-1.0), with R = 0.732, S = 0.833, and J = 0.841, is mapped onto a normalized scale with R' = 0.0, S' = 0.927 = D, and J' = 1.0.]

Page 112: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Cross-sentence Informational Subsumption and Equivalence

• Subsumption: If the information content of sentence a (denoted as I(a)) is contained within sentence b, then a becomes informationally redundant and the content of b is said to subsume that of a:

  I(a) ⊆ I(b)

• Equivalence: if I(a) ⊆ I(b) and I(b) ⊆ I(a)

Page 113: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Example

(1) John Doe was found guilty of the murder.

(2) The court found John Doe guilty of the murder of Jane Doe last August and sentenced him to life.

Page 114: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Cross-sentence Informational Subsumption

        Article 1   Article 2   Article 3
S1         10          10           5
S2          8           9           8
S3          2           3           4
S4          7           6           9

Page 115: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Evaluation

Cluster   # docs   # sents   Source                            News sources    Topic
A         2        25        clari.world.africa.northwestern   AFP, UPI        Algerian terrorists threaten Belgium
B         3        45        clari.world.terrorism             AFP, UPI        The FBI puts Osama bin Laden on the most wanted list
C         2        65        clari.world.europe.russia         AP, AFP         Explosion in a Moscow apartment building (Sept. 9, 1999)
D         7        189       clari.world.europe.russia         AP, AFP, UPI    Explosion in a Moscow apartment building (Sept. 13, 1999)
E         10       151       TDT-3 corpus, topic 78            AP, PRI, VOA    General strike in Denmark
F         3        83        TDT-3 corpus, topic 67            AP, NYT         Toxic spill in Spain

Page 116: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Inter-judge agreement versus compression

[Figure: agreement (J) plotted against compression rate (r) from 10% to 100% for Clusters A-F; values fall roughly between 0.75 and 1.]

Page 117: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Evaluating Sentence Subsumption

Sent    Judge1   Judge2   Judge3   Judge4   Judge5   + score   - score
A1-1      -       A2-1     A2-1      -       A2-1       3
A1-2     A2-5     A2-5      -        -       A2-5       3
A1-3      -        -        -        -       A2-10                 4
A1-4     A2-10    A2-10    A2-10     -       A2-10      4
A1-5      -       A2-1      -       A2-2     A2-4                  2
A1-6      -        -        -        -       A2-7                  4
A1-7      -        -        -        -       A2-8                  4

Page 118: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Subsumption (Cont’d)

SCORE(s_i) = w_c*C_i + w_p*P_i + w_f*F_i - w_R*R_s

R_s = cross-sentence word overlap

R_s = 2 * (# overlapping words) / (# words in sentence 1 + # words in sentence 2)

w_R = max_s SCORE(s)
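A minimal sketch of the overlap penalty R_s (overlap is counted over distinct word types here, which is an assumption):

    # R_s = 2 * (# overlapping words) / (# words in sentence 1 + # words in sentence 2)
    def cross_sentence_overlap(tokens1, tokens2):
        overlap = len(set(tokens1) & set(tokens2))
        return 2 * overlap / (len(tokens1) + len(tokens2))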

Page 119: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Subsumption analysis

# judges     Cluster A   Cluster B   Cluster C   Cluster D   Cluster E   Cluster F
agreeing      +    -      +    -      +    -      +    -      +    -      +    -
5             0    7      0   24      0   45      0   88      1   73      0   61
4             1    6      3    6      1   10      9   37      8   35      0   11
3             3    6      4    5      4    4     28   20      5   23      3    7
2             1    1      2    1      1    0      7    0      7    0      1    0

Total: 558 sentences, full agreement on 292 (1+291), partial on 406 (23+383). Of 80 sentences with some indication of subsumption, only 24 had agreement of 4 or more judges.

Page 120: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Results

            10%    20%    30%    40%    50%    60%    70%    80%    90%
Cluster A   0.855  0.572  0.427  0.759  0.862  0.910  0.554  1.001  0.584
Cluster B   0.365  0.402  0.690  0.714  0.867  0.640  0.845  0.713  1.317
Cluster C   0.753  0.938  0.841  1.029  0.751  0.819  0.595  0.611  0.683
Cluster D   0.739  0.764  0.683  0.723  0.614  0.568  0.668  0.719  1.100
Cluster E   1.083  0.937  0.581  0.373  0.438  0.369  0.429  0.487  0.261
Cluster F   1.064  0.893  0.928  1.000  0.732  0.805  0.910  0.689  0.199

MEAD performed better than Lead in 29 (in bold) out of 54 cases.

MEAD+Lead performed better than the Lead baseline in 41 cases

Page 121: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Donaway et al. 00

• Sentence-rank based measures
  – IDEAL = {2,3,5}: compare {2,3,4} and {2,3,9}
• Content-based measures
  – vector comparisons of summary and document

Page 122: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Background

• Summer 2001

• Eight weeks

• Johns Hopkins University
• Participants: Dragomir Radev, Simone Teufel, Horacio Saggion, Wai Lam, Elliott Drabek, Hong Qi, Danyu Liu, John Blitzer, and Arda Çelebi

Page 123: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Humans: Percent Agreement (20-cluster average) and compression

[Figure: percent agreement plotted against compression rates from 5% to 90%.]

Page 124: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Kappa

• N: number of items (index i)

• n: number of categories (index j)

• k: number of annotators

K = \frac{P(A) - P(E)}{1 - P(E)}

P(A) = \frac{1}{Nk(k-1)} \left( \sum_{i=1}^{N} \sum_{j=1}^{n} m_{ij}^{2} - Nk \right)

P(E) = \sum_{j=1}^{n} \left( \frac{1}{Nk} \sum_{i=1}^{N} m_{ij} \right)^{2}

where m_ij is the number of annotators who assigned item i to category j.
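A minimal sketch of this computation (illustrative; m is an N-by-n matrix of counts in which m[i][j] is the number of annotators who put item i into category j, and every row sums to k):

def kappa(m, k):
    N, n = len(m), len(m[0])
    # Observed agreement P(A)
    p_a = (sum(m[i][j] ** 2 for i in range(N) for j in range(n)) - N * k) / (N * k * (k - 1))
    # Expected (chance) agreement P(E)
    p_e = sum((sum(m[i][j] for i in range(N)) / (N * k)) ** 2 for j in range(n))
    return (p_a - p_e) / (1 - p_e)

# e.g., 3 sentences judged in/out of the summary by 5 annotators:
# kappa([[5, 0], [4, 1], [2, 3]], k=5)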

Page 125: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Humans: Kappa and compression

(Plot: Kappa (K), 0 to 1, on the y-axis versus compression rate from 5% to 90% on the x-axis.)

Page 126: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Relative Utility (RU) per summarizer and compression rate (Single-document)

(Values plotted as Relative Utility, 0.6 to 1.0, versus compression rate for each summarizer.)

Summarizer | 5%    | 10%   | 20%   | 30%   | 40%   | 50%   | 60%   | 70%   | 80%   | 90%
J          | 0.785 | 0.79  | 0.81  | 0.833 | 0.853 | 0.875 | 0.913 | 0.94  | 0.962 | 0.982
R          | 0.636 | 0.65  | 0.68  | 0.711 | 0.738 | 0.765 | 0.804 | 0.84  | 0.896 | 0.961
WEBS       | 0.761 | 0.765 | 0.776 | 0.801 | 0.828 | -     | -     | -     | -     | -
MEAD       | 0.748 | 0.756 | 0.764 | 0.782 | 0.808 | 0.834 | 0.863 | 0.895 | 0.921 | 0.968
LEAD       | 0.733 | 0.738 | 0.772 | 0.797 | 0.829 | 0.85  | 0.877 | 0.906 | 0.936 | 0.973
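A minimal sketch of a utility-based score of this kind (an illustration of the idea, assuming each sentence carries a judge-assigned utility and the system extract is compared to the best achievable extract of the same length; not necessarily the exact RU definition used here):

def relative_utility(selected_ids, utilities):
    # utilities: dict sentence_id -> judge-assigned utility (e.g., 0-10)
    # selected_ids: the sentence ids chosen by the summarizer
    achieved = sum(utilities[s] for s in selected_ids)
    best = sum(sorted(utilities.values(), reverse=True)[:len(selected_ids)])
    return achieved / best if best else 0.0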

Page 127: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Relevance correlation (RC)

r = \frac{\sum_{i} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i} (x_i - \bar{x})^{2} \, \sum_{i} (y_i - \bar{y})^{2}}}

where x_i and y_i are the relevance scores assigned (e.g., by a retrieval engine for a given query) to document i and to its summary, respectively.
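A minimal sketch (illustrative; xs are the relevance scores computed from the full documents for a query, ys the scores computed from their summaries):

import math

def relevance_correlation(xs, ys):
    # Pearson correlation between the two score vectors.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den if den else 0.0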

Page 128: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Relevance Preservation Value (RPV) per compression rate and summarizer (English, 5 queries)

Compression | FD | MEAD  | WEBS  | LEAD  | SUMM  | RAND
5%          | 1  | 0.724 | 0.73  | 0.66  | 0.622 | 0.554
10%         | 1  | 0.834 | 0.804 | 0.73  | 0.71  | 0.708
20%         | 1  | 0.916 | 0.876 | 0.82  | 0.82  | 0.818
30%         | 1  | 0.946 | 0.912 | 0.88  | 0.848 | 0.884
40%         | 1  | 0.962 | 0.936 | 0.906 | 0.862 | 0.922

Page 129: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Properties of evaluation metrics

Four families of measures are compared: Kappa / precision-recall / accuracy; Relative Utility (RU); word overlap / cosine / LCS; and relevance preservation. They differ in which of the following they support:
• agreement among human extracts
• agreement between human extracts and automatic extracts
• agreement between human summaries and extracts
• non-binary decisions
• full documents vs. extracts
• systems with different sentence segmentation
• multidocument extracts
• full corpus coverage

Page 130: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

DUC 2003 [Harman and Over]

• Provide an overview of DUC 2003:
  – Data: documents, topics, viewpoints, manual summaries
  – Tasks:
    • 1: very short (~10-word) single-document summaries
    • 2-4: short (~100-word) multi-document summaries with focus
      (2: TDT event topics; 3: viewpoints; 4: question/topic)
  – Evaluation: procedures, measures
• Experience with implementing the evaluation procedure

Page 131: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Task 2: Mean LAC (length-adjusted coverage) with penalty

REGWQ grouping | Mean    | N  | Peer
A              | 0.18900 | 30 | 13
B A            | 0.18243 | 30 | 6
B A            | 0.17923 | 30 | 16
B A            | 0.17787 | 30 | 22
B A            | 0.17557 | 30 | 23
B A            | 0.17467 | 30 | 14
B A C          | 0.16550 | 30 | 20
B D A C        | 0.15193 | 30 | 18
B D A C        | 0.14903 | 30 | 11
B D A C        | 0.14520 | 30 | 10
B D E A C      | 0.14357 | 30 | 12
B D E A C      | 0.14293 | 30 | 26
B D E C        | 0.12583 | 30 | 21
D E C          | 0.11677 | 30 | 3
D E F          | 0.09960 | 30 | 19
D E F          | 0.09837 | 30 | 17
E F            | 0.09057 | 30 | 2
F              | 0.05523 | 30 | 15

(Peers whose groupings share a letter are not significantly different under the REGWQ test.)

Page 132: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Task 4: Mean LAC with penalty

REGWQ grouping | Mean     | N   | Peer
A              | 0.155814 | 118 | 23
A              | 0.144517 | 118 | 14
B A C          | 0.141136 | 118 | 22
B D C          | 0.134596 | 114 | 16
B D C          | 0.131220 | 118 | 5
B D C          | 0.123449 | 118 | 10
D C            | 0.122186 | 118 | 13
D              | 0.116576 | 118 | 4
E              | 0.092966 | 118 | 17
E              | 0.091059 | 118 | 20
F              | 0.058780 | 118 | 19

Page 133: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Part VI Language modeling

Page 134: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Language modeling

• Source/target language
• Coding process

e → [noisy channel] → f → [recovery] → e*

Page 135: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Language modeling

• Source/target language
• Coding process

e* = argmax_e p(e|f) = argmax_e p(e) · p(f|e)

p(E) = p(e1) · p(e2|e1) · p(e3|e1e2) · … · p(en|e1…en-1)    (chain rule)
p(E) = p(e1) · p(e2|e1) · p(e3|e2) · … · p(en|en-1)         (bigram approximation)
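A minimal sketch of the bigram approximation with maximum-likelihood estimates (illustrative; a real language model would add smoothing for unseen bigrams):

from collections import Counter

def train_bigram(sentences):
    # sentences: lists of tokens; count unigram contexts and bigrams,
    # with sentence-boundary markers.
    uni, bi = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        uni.update(tokens[:-1])            # contexts we condition on
        bi.update(zip(tokens, tokens[1:]))
    return uni, bi

def sentence_probability(sentence, uni, bi):
    # p(E) = p(e1|<s>) * p(e2|e1) * ... * p(</s>|en), MLE estimates.
    p = 1.0
    tokens = ["<s>"] + sentence + ["</s>"]
    for prev, cur in zip(tokens, tokens[1:]):
        if uni[prev] == 0 or bi[(prev, cur)] == 0:
            return 0.0                     # unseen context or bigram
        p *= bi[(prev, cur)] / uni[prev]
    return p

In the noisy-channel formulation above, such a model trained on summaries would play the role of the prior p(e).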

Page 136: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Summarization using LM

• Source language: full document
• Target language: summary

Page 137: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Berger & Mittal 00

• Gisting (OCELOT)

• Content selection (preserve frequencies)
• Word ordering (single words, consecutive positions)
• Search: readability & fidelity

g* = argmax_g p(g|d) = argmax_g p(g) · p(d|g)

Page 138: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Berger & Mittal 00

• Limit to the top 65K words
• Word relatedness = alignment
• Training on 100K summary+document pairs
• Testing on 1046 pairs
• Uses a Viterbi-type search
• Evaluation: word overlap (0.2-0.4)
• Translingual gisting is possible
• No word ordering

Page 139: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Berger & Mittal 00

Sample output:

Audubon society atlanta area savannah georgia chatham and local birding savannah keepers chapter of the audubon georgia and leasing

Page 140: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Banko et al. 00

• Summaries shorter than one sentence
• Headline generation
• Zero-level model: unigram probabilities (see the sketch after the sample output)
• Other models: part-of-speech and position
• Sample output:

Clinton to meet Netanyahu Arafat Israel
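A minimal sketch of the zero-level idea (illustrative only, not the authors' exact model: estimate from training pairs how likely a document word is to appear in the headline, then keep the top-scoring words in document order):

from collections import Counter

def train_headline_model(pairs):
    # pairs: (document_tokens, headline_tokens).
    # Estimate p(word appears in the headline | word appears in the document).
    in_doc, in_headline = Counter(), Counter()
    for doc, headline in pairs:
        doc_words = set(doc)
        in_doc.update(doc_words)
        in_headline.update(doc_words & set(headline))
    return {w: in_headline[w] / in_doc[w] for w in in_doc}

def generate_headline(doc_tokens, model, length=6):
    # Pick the top-scoring distinct words and emit them in document order.
    ranked = sorted(set(doc_tokens), key=lambda w: model.get(w, 0.0), reverse=True)
    chosen = set(ranked[:length])
    out = []
    for w in doc_tokens:
        if w in chosen and w not in out:
            out.append(w)
    return " ".join(out)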

Page 141: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Knight and Marcu 00

• Use structured (syntactic) information

• Two approaches:– noisy channel– decision based

• Longer summaries

• Higher accuracy

Page 142: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Teufel & Moens 02

• Scientific articles

• Argumentative zoning (rhetorical analysis)

• Aim, Textual, Own, Background, Contrast, Basis, Other

Page 143: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Buyukkokten et al. 02

• Portable devices (PDA)

• Expandable summarization (progressively showing “semantic text units”)

Page 144: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Barzilay, McKeown, Elhadad 02

• Sentence reordering for MDS

• Multigen

• “Augmented ordering” vs. Majority and Chronological ordering

• Topic relatedness

• Subjective evaluation

• 14/25 “Good” vs. 8/25 and 7/25

Page 145: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Osborne 02

• Maxent (log-linear) model – no independence assumptions (general form below)

• Features: word pairs, sentence length, sentence position, discourse features (e.g., whether sentence follows the “Introduction”, etc.)

• Maxent outperforms Naïve Bayes
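For reference, the general form of such a log-linear (maximum-entropy) model of the extract/skip decision for a sentence s, with the feature functions f_i listed above (a sketch, not Osborne's exact parameterization):

p(y \mid s) = \frac{\exp\left(\sum_i \lambda_i f_i(s, y)\right)}{\sum_{y'} \exp\left(\sum_i \lambda_i f_i(s, y')\right)}, \qquad y \in \{\text{extract}, \text{skip}\}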

Page 146: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Zhang, Blair-Goldensohn, Radev 02

• Multidocument summarization using Cross-document Structure Theory (CST)
• Models relationships between sentences: contradiction, followup, agreement, subsumption, equivalence
• Follow-up work (2003): automatic identification of CST relationships

Page 147: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Wu et al. 02

• Question-based summaries

• Comparison with Google

• Uses fewer characters but achieves higher MRR (mean reciprocal rank; see below)
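For reference, the standard definition of mean reciprocal rank over a question set Q, where rank_i is the position of the first correct answer for question i (not specific to Wu et al.):

\mathrm{MRR} = \frac{1}{|Q|} \sum_{i=1}^{|Q|} \frac{1}{\mathrm{rank}_i}

For example, first-correct-answer ranks of 1, 2, and 4 over three questions give MRR = (1 + 0.5 + 0.25) / 3 ≈ 0.58.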

Page 148: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Jing 02

• Using HMM to decompose human-written summaries

• Recognizing pieces of the summary that match the input documents

• Operators: syntactic transformations, paraphrasing, reordering

• F-measure: 0.791

Page 149: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Grewal et al. 03

• Take the sentence:
  "Peter Piper picked a peck of pickled peppers."
  Gzipped size of this sentence is: 66
• Next take the group of sentences:
  "Peter Piper picked a peck of pickled peppers. Peter Piper picked a peck of pickled peppers."
  Gzipped size of these sentences is: 70
• Finally take the group of sentences:
  "Peter Piper picked a peck of pickled peppers. Peter Piper was in a pickle in Edmonton."
  Gzipped size of these sentences is: 92
• A repeated sentence adds almost nothing to the gzipped size, while a novel sentence adds much more, so compressed size can signal cross-sentence redundancy (see the sketch below).
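A minimal sketch of how such sizes can be measured and turned into a redundancy signal (illustrative only; the normalization is an assumption, not necessarily the paper's exact measure):

import gzip

def gzipped_size(text):
    # Size in bytes of the gzip-compressed UTF-8 text.
    return len(gzip.compress(text.encode("utf-8")))

def redundancy(sentence_1, sentence_2):
    # A pair that compresses to little more than one sentence alone is
    # highly redundant; a pair of unrelated sentences compresses far less.
    joint = gzipped_size(sentence_1 + " " + sentence_2)
    separate = gzipped_size(sentence_1) + gzipped_size(sentence_2)
    return 1.0 - joint / separate

# The duplicated pair above (70 bytes) scores higher than the mixed
# pair (92 bytes), mirroring the sizes shown on the slide.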

Page 150: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

2003 WS papers

Headline generation (Maryland, BBN)

Compression-based MDS (Michigan)

Summarization of OCRed text (IBM)

Summarization of legal texts (Edinburgh)

Personalized annotations (UST&MS, China)

Limitations of extractive summ (ISI)

Human consensus (Cambridge, Nijmegen)

Page 151: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Newsinessence [Radev et al. 01]

Page 152: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3
Page 153: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3
Page 154: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3
Page 155: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3
Page 156: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3
Page 157: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Newsblaster [McKeown et al. 02]

Page 158: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Google News [02]

Page 159: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Summarization meetings

• Dagstuhl Meeting, 1993 (Karen Spärck Jones, Brigitte Endres-Niggemeyer)
• ACL/EACL Workshop, Madrid, 1997 (Inderjeet Mani, Mark Maybury)
• AAAI Spring Symposium, Stanford, 1998 (Dragomir Radev, Eduard Hovy)
• ANLP/NAACL, Seattle, 2000 (Udo Hahn, Chin-Yew Lin, Inderjeet Mani, Dragomir Radev)
• NAACL, Pittsburgh, 2001 (Jade Goldstein and Chin-Yew Lin)
• DUC 2001 (Donna Harman and Daniel Marcu)
• DUC 2002 (Udo Hahn and Donna Harman)
• HLT-NAACL, Edmonton, 2003 (Dragomir Radev, Simone Teufel)
• DUC 2003 (Donna Harman and Paul Over)
• DUC 2004 (Marie-Francine Moens and Stan Szpakowicz)

Page 160: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Readings

Advances in Automatic Text Summarization by Inderjeet Mani and Mark Maybury (eds.), MIT Press, 1999

Automated Text Summarization by Inderjeet Mani, John Benjamins, 2002

Page 161: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

1 Automatic Summarizing: Factors and Directions (K. Spärck Jones)

2 The Automatic Creation of Literature Abstracts (H. P. Luhn)

3 New Methods in Automatic Extracting (H. P. Edmundson)

4 Automatic Abstracting Research at Chemical Abstracts Service (J. J. Pollock and A. Zamora)

5 A Trainable Document Summarizer (J. Kupiec, J. Pedersen, and F. Chen)

6 Development and Evaluation of a Statistically Based Document Summarization System (S. H. Myaeng and D. Jang)

7 A Trainable Summarizer with Knowledge Acquired from Robust NLP Techniques (C. Aone, M. E. Okurowski, J. Gorlinsky, and B. Larsen)

8 Automated Text Summarization in SUMMARIST (E. Hovy and C. Lin)

9 Salience-based Content Characterization of Text Documents (B. Boguraev and C. Kennedy)

10 Using Lexical Chains for Text Summarization (R. Barzilay and M. Elhadad)

11 Discourse Trees Are Good Indicators of Importance in Text (D. Marcu)

12 A Robust Practical Text Summarizer (T. Strzalkowski, G. Stein, J. Wang, and B. Wise)

13 Argumentative Classification of Extracted Sentences as a First Step Towards Flexible Abstracting (S. Teufel and M. Moens)

14 Plot Units: A Narrative Summarization Strategy (W. G. Lehnert)

15 Knowledge-Based Text Summarization: Salience and Generalization Operators for Knowledge Base Abstraction (U. Hahn and U. Reimer)

16 Generating Concise Natural Language Summaries (K. McKeown, J. Robin, and K. Kukich)

17 Generating Summaries from Event Data (M. Maybury)

18 The Formation of Abstracts by the Selection of Sentences (G. J. Rath, A. Resnick, and T. R. Savage)

19 Automatic Condensation of Electronic Publications by Sentence Selection (R. Brandow, K. Mitze, and L. F. Rau)

20 The Effects and Limitations of Automated Text Condensing on Reading Comprehension Performance (A. H. Morris, G. M. Kasper, and D. A. Adams)

21 An Evaluation of Automatic Text Summarization Systems (T. Firmin and M. J. Chrzanowski)

22 Automatic Text Structuring and Summarization (G. Salton, A. Singhal, M. Mitra, and C. Buckley)

23 Summarizing Similarities and Differences among Related Documents (I. Mani and E. Bloedorn)

24 Generating Summaries of Multiple News Articles (K. McKeown and D. R. Radev)

25 An Empirical Study of the Optimal Presentation of Multimedia Summaries of Broadcast News (A. Merlino and M. Maybury)

26 Summarization of Diagrams in Documents (R. P. Futrelle)

Page 162: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Collections of papers

• Information Processing and Management, 1995

• Computational Linguistics, 2002

Page 164: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Ongoing projects

• Columbia, ISI, Michigan

• BBN, Maryland, Lethbridge, LCC

• Sheffield, KU Leuven

• Tokyo

• Etc.

Page 165: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Available corpora

– DUC corpus
  • http://duc.nist.gov
– MEAD/NIE corpus
  • www.summarization.com/mead
– SUMMAC corpus
  • send mail to [email protected]
– <Text+Abstract+Extract> corpus
  • send mail to [email protected]
– Open directory project
  • http://dmoz.org

Page 166: SI 760 / EECS 597 / Ling 702 Language and Information Winter 2004 Handout #3

Possible research topics

• Corpus creation and annotation

• MMM: Multidocument, Multimedia, Multilingual

• Evolving summaries

• Personalized summarization

• Web-based summarization

• Feature selection