the process of anaphora resolution (.ppt)

The process of anaphora resolution Ruslan Mitkov


Post on 11-Jun-2015

Page 1: The process of anaphora resolution (.ppt)

The process of anaphora resolution

Ruslan Mitkov

Page 2: The process of anaphora resolution (.ppt)

Anaphoric jokes or the importance of correct anaphora resolution

• There is a pile of inflammable trash next to your car. You'll have to get rid of it. 

• Fried eggs should be cooked properly and if there are frail or elderly people in the house, they should be hard-boiled.

• Autumn leaves may cause a problem for the elderly, especially when they fall and become a wet and soggy mess on the ground.

• If these shoes don't fit your feet, you can exchange them. 

Page 3: The process of anaphora resolution (.ppt)

Anaphoric jokes or the importance of correct anaphora resolution (2)

• If the baby does not thrive on raw milk, boil it.

• If an incendiary bomb drops near you, don't lose your head. Put it in a bucket and cover it with sand.

• There will be a Moscow Exhibition of Arts by 150,000 Soviet Republic painters and sculptors. These were executed over the past two years.

Page 4: The process of anaphora resolution (.ppt)

Outline of the lecture

• Anaphora resolution and the knowledge needed

• Anaphora resolution in practice

– Identification of anaphors

– Defining search scope and identifying candidates

– The resolution algorithm: factors in anaphora resolution

• Discussion limited to nominal anaphora

Page 5: The process of anaphora resolution (.ppt)

Morphological and lexical knowledge

• Some anaphors are successfully resolved on the basis of lexical information such as gender and number 

• Anaphors usually match (the head of) their antecedents in gender and number.

Page 6: The process of anaphora resolution (.ppt)

Examples of gender and number agreement rules

• Steven, of Worthing, Sussex, said he and Emily had a huge row after he discovered she had been skipping lessons at school.

• John Bradley spoke to Jane McCarthy and to the Browns about a forthcoming project. The businessman said this enterprise would cost millions

• The gender and number agreement rule is not as discriminative for English as for German or Russian.

Page 7: The process of anaphora resolution (.ppt)

Syntax knowledge

• Identification of boundaries: vital for identifying NPs, PPs, sentences

• Application of syntax-based constraints (“filtering” rules)

• Application of syntax-based preferences

Page 8: The process of anaphora resolution (.ppt)

Semantic knowledge

• However important morphological, lexical and syntax knowledge are, in many cases they alone cannot help.

– The mouse was under the table. It was eating a piece of cheese.

• Semantic knowledge vital for resolving lexical noun phrase anaphors

– Roy Keane has warned Manchester United he may snub their pay deal. United's skipper is even hinting that unless the future Old Trafford Package meets his demands, he could quit the club in June 2000. Alex Ferguson's No. 1 player confirmed…

Page 9: The process of anaphora resolution (.ppt)

Discourse knowledge

• Morphological, lexical, syntactic and semantic criteria are not always sufficient to distinguish between a set of possible candidates.

• Jenny put the cup on the plate and broke it.

• Jenny went window shopping yesterday and spotted a nice cup. She wanted to buy it, but she had no money with her. Nevertheless, she knew she would be shopping the following day, so she would be able to buy the cup then. The following day, she went to the shop and bought the coveted cup. However, once back home and in her kitchen, she put the cup on a plate and broke it...

Page 10: The process of anaphora resolution (.ppt)

Real-world knowledge

• The soldiers shot at the women and they fell.

• The soldiers shot at the women and they missed.

• They can be resolved only with the help of real-world knowledge.

• Rule 1: If X shoots at Y and if Z (Z ∈ {X, Y}) falls, then it is more likely for Z to be Y

• Rule 2: If X shoots at Y and if Z (Z ∈ {X, Y}) misses, then it is more likely for Z to be X
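
Rules 1 and 2 can be read as a small lookup from the verb of the second clause to the role that Z most likely fills. The sketch below is only an illustration: the `ROLE_PREFERENCE` table and the function name are assumptions introduced here, not part of the lecture.

```python
# Hypothetical encoding of Rule 1 and Rule 2 for "X shoots at Y and Z <verb>".
# The table and the function name are illustrative assumptions.
ROLE_PREFERENCE = {
    "fall": "target",   # Rule 1: the one who falls is more likely Y
    "miss": "shooter",  # Rule 2: the one who misses is more likely X
}


def prefer_antecedent(shooter, target, verb):
    """Return the more likely antecedent of Z, or None if no rule applies."""
    role = ROLE_PREFERENCE.get(verb)
    if role == "shooter":
        return shooter
    if role == "target":
        return target
    return None


print(prefer_antecedent("the soldiers", "the women", "fall"))
print(prefer_antecedent("the soldiers", "the women", "miss"))
```

A table like this only expresses a likelihood, as the rules themselves do; it says nothing about verbs it does not list.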

Page 11: The process of anaphora resolution (.ppt)

Real-world knowledge (2)

• The following pronominal anaphors are no easier to deal with:

• The FBI's role is to ensure our country's freedom and be ever watchful of those who threaten it.

• The KGB's role is to ensure our country's freedom and be ever watchful of those who threaten it.

• If Peter Mandelson had been in Tony Blair’s shoes he would have demanded his resignation the day the Prime Minister forced him to leave the Cabinet.

Page 12: The process of anaphora resolution (.ppt)

Anaphora resolution in practice

– Identification of anaphors

– Defining search scope and identifying candidates

– The resolution algorithm: factors in anaphora resolution

Page 13: The process of anaphora resolution (.ppt)

Identification of anaphors

• Identification of pronouns

• Identification of pleonastic pronouns:

– It must be stated that Oskar behaved impeccably

– It is cloudy

– It’s three o'clock
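
A very rough way to flag pleonastic "it" is pattern matching. The sketch below is an assumption-laden illustration: the three patterns and their word lists were chosen only to cover the example sentences above, and a practical recogniser would need far broader rules or a trained classifier.

```python
import re

# Illustrative patterns only; the word lists are assumptions chosen to cover
# the three slide examples, not a complete inventory.
PLEONASTIC_PATTERNS = [
    # "It must be stated that ..."
    re.compile(r"\bit\s+(?:must|should|can|may)\s+be\s+\w+\s+that\b", re.I),
    # "It is cloudy"
    re.compile(r"\bit\s*(?:is|was|'s|’s)\s+(?:cloudy|sunny|raining|snowing|windy)\b", re.I),
    # "It's three o'clock"
    re.compile(r"\bit\s*(?:is|was|'s|’s)\s+\w+\s+o['’]clock\b", re.I),
]


def is_pleonastic(sentence):
    """Return True if any illustrative pattern matches the sentence."""
    return any(p.search(sentence) for p in PLEONASTIC_PATTERNS)


for s in ["It must be stated that Oskar behaved impeccably",
          "It is cloudy",
          "It’s three o'clock",
          "It was eating a piece of cheese."]:
    print(s, "->", is_pleonastic(s))
```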

Page 14: The process of anaphora resolution (.ppt)

Identification of anaphors (2)

• Identification of lexical NPs (definite descriptions, proper names)

• Identification of non-anaphoric lexical NPs

• Queen Elizabeth attended the ceremony. The Queen delivered a speech.

• The Queen attended the ceremony. The Duchess of York was there too.

Page 15: The process of anaphora resolution (.ppt)

Tools and resources needed at this stage

• Morphological or lexical information usually provided by a morphological analyser, part-of-speech tagger or dictionary.

• Program for recognising pleonastic pronouns or one for identifying non-anaphoric definite descriptions

• Parser

• Machine learning annotated corpora.

• Partial parser or NP extractor

• Proper name recogniser

• Ontology (WordNet)

Page 16: The process of anaphora resolution (.ppt)

Location of the candidates for antecedents

• All NPs preceding an anaphor within a certain search scope are initially regarded as potential candidates for antecedents

• Typical search scope for pronominal anaphora: 2-3 sentences

• Typical search scope for lexical NP anaphors: up to 10 sentences

• Discourse segment
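
Candidate collection within a sentence window might look like the following sketch. It assumes that an NP extractor (not shown) has already produced one list of noun phrases per sentence, and it counts the anaphor's own sentence as part of the scope; both the data layout and the names are assumptions made for illustration.

```python
# Assumed input: one list of NP strings per sentence, in textual order,
# as produced by some NP extractor (not shown). All names are hypothetical.
PRONOUN_SCOPE = 3       # the anaphor's sentence plus two preceding ones
LEXICAL_NP_SCOPE = 10   # upper bound mentioned for lexical NP anaphors


def collect_candidates(nps_by_sentence, anaphor_sentence, anaphor_position, scope):
    """Collect NPs that precede the anaphor within `scope` sentences."""
    first = max(0, anaphor_sentence - scope + 1)
    found = []
    for i in range(first, anaphor_sentence + 1):
        nps = nps_by_sentence[i]
        if i == anaphor_sentence:
            nps = nps[:anaphor_position]  # only NPs before the anaphor
        found.extend(nps)
    return found


text = [
    ["Jenny", "a nice cup"],
    ["She", "it", "no money"],
    ["she", "the shop", "the coveted cup"],
    ["she", "the cup", "a plate", "it"],
]
# Candidates for the final "it" (sentence 3, position 3) with a 2-sentence scope.
print(collect_candidates(text, 3, 3, 2))
```

Replacing the sentence window with a discourse segment would only change how `first` is computed.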

Page 17: The process of anaphora resolution (.ppt)

Tools/resources needed at this stage

• Identifying noun phrases and the sentence boundaries

• Full parser / sentence splitter (or POS tagger) + NP extractor

• Clause splitter

• Discourse segmentation algorithm

• Proper name recogniser

Page 18: The process of anaphora resolution (.ppt)

The resolution algorithm: factors in anaphora resolution

• Once the anaphors have been detected, the program will attempt to resolve them by selecting their antecedents from the identified sets of candidates.

• The resolution rules based on the different sources of knowledge and used in the resolution process are usually referred to as "anaphora resolution factors".

Page 19: The process of anaphora resolution (.ppt)

Eliminating factors (constraints)

• Factors that eliminate candidates for antecedents

• gender constraints

• number constraints

• GB theory (c-command) constraints

• selectional restrictions

Page 20: The process of anaphora resolution (.ppt)

Definition of c-command

A node A c-commands a node B if and only if

I. A does not dominate B

II. B does not dominate A

III. the first branching node dominating A also dominates B
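
The three conditions translate fairly directly into code. The sketch below is a minimal illustration, assuming a simple `Node` class with parent pointers and the usual bracketing [S [NP Sylvia] [VP [V admires] [NP herself]]] for the sentence "Sylvia admires herself"; the class, the function names and the added rule that a node does not c-command itself are assumptions introduced here.

```python
class Node:
    """Minimal tree node with a parent pointer (hypothetical helper class)."""

    def __init__(self, label, children=()):
        self.label = label
        self.parent = None
        self.children = list(children)
        for child in self.children:
            child.parent = self


def dominates(a, b):
    """True if a is a proper ancestor of b."""
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def c_commands(a, b):
    # Conditions I and II, plus the assumption that a node
    # does not c-command itself.
    if a is b or dominates(a, b) or dominates(b, a):
        return False
    # Condition III: find the first branching node above a.
    node = a.parent
    while node is not None and len(node.children) < 2:
        node = node.parent
    return node is not None and dominates(node, b)


# Assumed tree: [S [NP Sylvia] [VP [V admires] [NP herself]]]
subject = Node("NP", [Node("Sylvia")])
verb = Node("V", [Node("admires")])
obj = Node("NP", [Node("herself")])
vp = Node("VP", [verb, obj])
s = Node("S", [subject, vp])

print(c_commands(subject, obj))  # subject NP vs. object NP
print(c_commands(obj, subject))  # object NP vs. subject NP
```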

Page 21: The process of anaphora resolution (.ppt)

Example of c-command

[Figure: tree diagram with nodes A to N]

For example (not exhaustive):

• A c-commands all the other nodes.

• B c-commands C and every node that C dominates.

• C c-commands B and every node that B dominates.

• D c-commands E and J, but not C, or any of the nodes that C dominates.

• H c-commands I and no other node.

Page 22: The process of anaphora resolution (.ppt)

Preferential factors (preferences)

• salience (center of attention)

• parallelism

• proximity

Page 23: The process of anaphora resolution (.ppt)

Gender and number agreement constraints

• This constraint requires that anaphors and their antecedents agree in number and gender.

– Jane told Philip and his friends that she was in love.
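
As a filter, the agreement constraint simply discards candidates whose gender or number clashes with the anaphor. The sketch below assumes hand-assigned feature values and a hypothetical `Mention` record; in practice these features would come from a dictionary, tagger or morphological analyser.

```python
from dataclasses import dataclass


@dataclass
class Mention:
    text: str
    gender: str  # "masc", "fem", "neut" or "unknown"
    number: str  # "sg" or "pl"


def agrees(anaphor, candidate):
    """Strict number match; gender matches unless one side is unknown."""
    gender_ok = ("unknown" in (anaphor.gender, candidate.gender)
                 or anaphor.gender == candidate.gender)
    return gender_ok and anaphor.number == candidate.number


def agreement_filter(anaphor, candidates):
    return [c for c in candidates if agrees(anaphor, c)]


she = Mention("she", "fem", "sg")
candidates = [
    Mention("Jane", "fem", "sg"),
    Mention("Philip", "masc", "sg"),
    Mention("his friends", "unknown", "pl"),
]
print([c.text for c in agreement_filter(she, candidates)])  # → ['Jane']
```

A strict filter of this kind would wrongly reject the singular "they" cases listed under the traps, so a real system may need to relax it.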

Page 24: The process of anaphora resolution (.ppt)

Traps: are gender and number agreement a must?

• Ask another Macintosh user about the problem you’re having; they may have a solution

• If there is a doctor on board, could they please make themselves known to the crew

• You were called on the 30th of April at 21.38 hours. The caller withheld their number

Page 25: The process of anaphora resolution (.ppt)

Traps: other gender and number agreement problems

Problems with proper names

Proper names can be ambiguous in gender within one language:

Chris, Lesley or Tracey in English, Claude in French

Or they can differ in gender across languages, such as Jean

Page 26: The process of anaphora resolution (.ppt)

C-command constraints (1)

A reflexive anaphor must be c-commanded by its antecedent. In "Sylvia admires herself", "herself" is c-commanded by "Sylvia".

[Figure: parse tree of "Sylvia admires herself" with nodes S, NP, VP, V, NP]

Page 27: The process of anaphora resolution (.ppt)

C-command constraints (2)

• A non-pronominal NP cannot corefer with an NP that c-commands it

– He warned Mr. Byers that whether it will be public or private investment….

Page 28: The process of anaphora resolution (.ppt)

Selectional restrictions (semantic constraints)

• Semantic (selectional) restrictions which apply to the anaphor should apply to the antecedent as well.

– Vincent removed the diskette from the computer and then disconnected it.

– Vincent removed the diskette from the computer and then copied it.
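
The two Vincent sentences can be separated with a toy lexicon of verb restrictions. Everything in the sketch below is an assumption made for illustration: the semantic classes, the restriction table and the function names are invented here and do not come from a real resource.

```python
# Toy lexicon: the semantic classes and verb restrictions are assumptions
# made for this example, not data from a real dictionary or ontology.
SEMANTIC_CLASS = {
    "the diskette": "storage_medium",
    "the computer": "device",
}
OBJECT_RESTRICTION = {
    "disconnect": {"device"},
    "copy": {"storage_medium"},
}


def satisfies_restriction(verb, candidate):
    """True if the candidate's class is allowed as the object of the verb."""
    allowed = OBJECT_RESTRICTION.get(verb)
    if allowed is None:
        return True  # unknown verb: do not filter anything out
    return SEMANTIC_CLASS.get(candidate) in allowed


def filter_by_selection(verb, candidates):
    return [c for c in candidates if satisfies_restriction(verb, c)]


nps = ["the diskette", "the computer"]
print(filter_by_selection("disconnect", nps))
print(filter_by_selection("copy", nps))
```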

Page 29: The process of anaphora resolution (.ppt)

Preferences

• There is a general preference for the most recent compatible candidate NP to be the antecedent, but this is not always the case.

– Jack took the newspaper and then got hold of the magazine. He started reading it straight away.

– The newspaper contained the latest international news but the magazine did not. Jack started reading it straight away.

Page 30: The process of anaphora resolution (.ppt)

Preferences (2)

• Entities in the main clause are favoured over those in the subordinate clause

• Preference for entities in non-adjunct phrases over those in adjunct phrases.

– Jack drank the wine on the table. It was brown and round.

Page 31: The process of anaphora resolution (.ppt)

Preferences (3)

• Preference is given to NPs with the same syntactic function as the anaphor.

– The programmer successfully combined Prolog with C, but he had combined it with Pascal last time.

– The programmer successfully combined Prolog with C, but he had combined Pascal with it last time.

– The program successfully combined Prolog with C but Jack had to modify it because the combination of Prolog with Perl did not work.
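
The parallelism preference can be sketched as a ranking step that moves candidates sharing the anaphor's syntactic function to the front. The function labels below are hand-assigned assumptions standing in for parser output, and the function name is hypothetical.

```python
# Each candidate is (text, syntactic_function). The labels would normally
# come from a parser; here they are hand-assigned for illustration.
def rank_by_parallelism(anaphor_function, candidates):
    """Stable sort: candidates sharing the anaphor's function come first."""
    return sorted(candidates, key=lambda c: c[1] != anaphor_function)


candidates = [("Prolog", "direct_object"), ("C", "prep_object")]

# "... but he had combined it with Pascal": "it" is a direct object.
print(rank_by_parallelism("direct_object", candidates)[0][0])
# "... but he had combined Pascal with it": "it" is a prepositional object.
print(rank_by_parallelism("prep_object", candidates)[0][0])
```

Because this is a ranking rather than a filter, other factors can still override it, which seems to be the point of the third example above.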

Page 32: The process of anaphora resolution (.ppt)

Preferences (4)

• Center / Focus preference

– If an incendiary bomb drops near you, don't lose your head. Put it in a bucket and cover it with sand.

– Tilly tried on the dress over her skirt and ripped it.

– Tilly's mother had agreed to make her a new dress for the party. She worked hard on the dress for weeks and finally it was ready for Tilly to try on. Impatient to see what it would look like, Tilly tried on the dress over her skirt and ripped it.

Page 33: The process of anaphora resolution (.ppt)

Tools and resources needed

• The gender and number filters require information on the gender and number of the anaphor and its candidates

• dictionaries or

• part-of-speech taggers

• partial parsers

• morphological analysers

Page 34: The process of anaphora resolution (.ppt)

Tools and resources needed (2)

• C-command constraints require tree structures of sentences; a full parser is desirable.

• Semantic knowledge can be provided by WordNet

• Some semantic information (verb selectional restrictions) supplied in dictionary entries

• Selectional restrictions can be cheaply modelled by collocations extracted from corpora

• Word-sense disambiguation

• A center tracking program is needed for approaches relying on center preference.

Page 35: The process of anaphora resolution (.ppt)

Example of anaphora resolution based on a simple model

• A simple constraint-based model using the gender and number agreement constraint and the center preference.

• Jim was dating both Jane and Becky but it was Jane he was falling in love with. The young lad found her very attractive and kind.
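
One way to picture this simple model on the pronoun "her" is the sketch below. It rests on several assumptions that are not in the slides: candidates are listed once each in order of first mention, their features are hand-assigned, Jane is marked as the center because of the cleft "it was Jane", and the fallback to the most recent compatible candidate is added only so the function always has a defined result.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    gender: str
    number: str
    is_center: bool = False  # assumed output of a center-tracking step


def resolve(anaphor_gender, anaphor_number, candidates):
    # Constraint: keep only candidates agreeing in gender and number.
    compatible = [c for c in candidates
                  if c.gender == anaphor_gender and c.number == anaphor_number]
    # Preference: among the survivors, prefer the discourse center.
    for c in compatible:
        if c.is_center:
            return c
    # Assumed fallback: the most recent compatible candidate, if any.
    return compatible[-1] if compatible else None


candidates = [
    Candidate("Jim", "masc", "sg"),
    Candidate("Jane", "fem", "sg", is_center=True),
    Candidate("Becky", "fem", "sg"),
]
antecedent = resolve("fem", "sg", candidates)
print(antecedent.text)  # → Jane
```

The agreement constraint removes Jim, and the center preference then chooses Jane over Becky, whom recency alone would have favoured in this simplified candidate list.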