Coreference Resolution: Successes and Challenges
IJCAI 2016 Tutorial

Vincent Ng
Human Language Technology Research Institute
University of Texas at Dallas
http://www.hlt.utdallas.edu/~vince/ijcai-2016/coreference
Plan for the Talk

- Part I: Background
  - Task definition
  - Why coreference is hard
  - Applications
  - Brief history
- Part II: Machine learning for coreference resolution
  - System architecture
  - Computational models
  - Resources and evaluation (corpora, evaluation metrics, …)
  - Employing semantics and world knowledge
- Part III: Solving hard coreference problems
  - Difficult cases of overt pronoun resolution
  - Relation to the Winograd Schema Challenge
Entity Coreference

Identify the noun phrases (or entity mentions) that refer to the same real-world entity

Queen Elizabeth set about transforming her husband, King George VI, into a viable monarch. A renowned speech therapist was summoned to help the King overcome his speech impediment...
- Inherently a clustering task
  - the coreference relation is transitive
  - Coref(A,B) ∧ Coref(B,C) ⇒ Coref(A,C)
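Because the relation is transitive, pairwise coreference decisions are typically closed into entity clusters. A minimal sketch of that closure step using union-find; the mention strings and pairwise links below are illustrative, not the output of any real system:

```python
class UnionFind:
    """Union-find over mentions; merging enforces transitivity of Coref."""
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

# Pairwise links predicted by some resolver (illustrative only)
links = [("Queen Elizabeth", "her"),
         ("her husband", "King George VI"),
         ("King George VI", "the King"),
         ("the King", "his")]

uf = UnionFind()
for a, b in links:
    uf.union(a, b)

# Coref(A,B) and Coref(B,C) now imply Coref(A,C):
assert uf.find("her husband") == uf.find("his")
```

Any clustering that respects the pairwise links gives the transitive closure for free; this is why coreference output is usually reported as entity clusters rather than raw pairs.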
- Typically recast as the problem of selecting an antecedent for each mention, mj
  - Does Queen Elizabeth have a preceding mention coreferent with it? If so, what is it?
  - Does her have a preceding mention coreferent with it? If so, what is it?
  - Does husband have a preceding mention coreferent with it? If so, what is it?
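The antecedent-selection view can be sketched as a left-to-right loop: for each mention mj, score every preceding mention as a candidate antecedent and either pick the best one or start a new entity. The scoring function here is a toy word-overlap heuristic standing in for a learned model:

```python
def head_match(mi, mj):
    """Toy scorer: fraction of m_j's words that also appear in the candidate.
    A real resolver would use a learned model over many features instead."""
    wi, wj = set(mi.lower().split()), set(mj.lower().split())
    return len(wi & wj) / len(wj)

def select_antecedents(mentions, score=head_match, threshold=0.5):
    antecedents = {}           # index of m_j -> index of its antecedent, or None
    for j in range(len(mentions)):
        best, best_s = None, threshold
        for i in range(j):     # only preceding mentions are candidates
            s = score(mentions[i], mentions[j])
            if s >= best_s:    # ties go to the closer (more recent) candidate
                best, best_s = i, s
        antecedents[j] = best  # None: m_j starts a new entity
    return antecedents

mentions = ["Queen Elizabeth", "her husband", "King George VI",
            "a renowned speech therapist", "the King"]
ante = select_antecedents(mentions)  # links "the King" back to "King George VI"
```

Note that this toy scorer cannot link the pronoun-like mentions at all; that gap is exactly what the knowledge sources discussed next are meant to fill.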
Why It’s Hard

- Many sources of information play a role
  - lexical / word: head noun matches
    - President Clinton = Clinton =? Hillary Clinton
  - grammatical: number/gender agreement, …
  - syntactic: syntactic parallelism, binding constraints
    - John helped himself to... vs. John helped him to…
  - discourse: discourse focus, salience, recency, …
  - semantic: semantic class agreement, …
  - world knowledge
- Not all knowledge sources can be computed easily
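Some of the cheaper knowledge sources above (head noun match, gender agreement) can be approximated with simple surface checks. A toy sketch, with hand-written word lists standing in for real gender lexicons:

```python
# Toy word lists standing in for real gender lexicons
FEMALE = {"she", "her", "woman", "queen", "mary"}
MALE = {"he", "him", "his", "man", "king", "john"}

def gender(mention):
    words = set(mention.lower().split())
    if words & FEMALE:
        return "f"
    if words & MALE:
        return "m"
    return "unknown"

def gender_compatible(mi, mj):
    """Gender agreement check: unknown is compatible with anything."""
    gi, gj = gender(mi), gender(mj)
    return "unknown" in (gi, gj) or gi == gj

def head_noun_match(mi, mj):
    # Crude surrogate: compare the last token of each mention as its head
    return mi.lower().split()[-1] == mj.lower().split()[-1]
```

Note that `head_noun_match("Hillary Clinton", "President Clinton")` returns True even though the mentions are not coreferent — exactly the unreliability the Clinton example above illustrates.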
Why It’s Hard

No single source is a completely reliable indicator
- number and gender
  - assassination (of Jesuit priests) =? these murders
  - the woman = she = Mary =? the chairman
Why It’s Hard

Coreference strategies differ depending on the mention type
- definiteness of mentions
  - … Then Mark saw the man walking down the street.
  - … Then Mark saw a man walking down the street.
- pronoun resolution alone is notoriously difficult
  - There are pronouns whose resolution requires world knowledge
    - The Winograd Schema Challenge (Levesque, 2011)
  - pleonastic pronouns refer to nothing in the text
    - I went outside and it was snowing.
Why It’s Hard

- Anaphoricity determination is a difficult task
  - determine whether a mention has an antecedent
  - check whether it is part of a coreference chain but is not the head of the chain

Queen Elizabeth set about transforming her husband, King George VI, into a viable monarch. Logue, a renowned speech therapist, was summoned to help the King overcome his speech impediment...
Resolving a non-anaphoric mention causes a coreference system’s precision to drop.
With perfect anaphoricity determination, coreference systems can improve by up to 5% absolute F-score (Stoyanov et al., 2009).
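Anaphoricity determination is often run as a filter before antecedent selection. The rules below (indefinite NPs, a simple pleonastic-"it" pattern) are illustrative heuristics only, not the learned classifiers used in the literature:

```python
import re

def likely_non_anaphoric(mention, sentence=""):
    """Heuristic pre-filter for mentions that probably have no antecedent."""
    m = mention.lower()
    # Indefinite NPs usually introduce new entities
    # ("a renowned speech therapist")
    if m.startswith(("a ", "an ", "some ")):
        return True
    # Pleonastic 'it' in simple weather patterns ("it was snowing")
    if m == "it" and re.search(r"\bit (is|was) \w+ing\b", sentence.lower()):
        return True
    return False
```

Mentions flagged by such a filter are never handed to the antecedent selector, which is how anaphoricity errors translate directly into precision gains or losses.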
Not all coreference relations are equally difficult to resolve
Queen Elizabeth set about transforming her husband, King George VI, into a viable monarch. A renowned speech therapist was summoned to help the King overcome his speech impediment...
Let’s modify the paragraph

Queen Mother asked Queen Elizabeth to transform her sister, Margaret, into an elegant lady. Logue was summoned to help the princess improve her manner...
- Not all coreference relations are equally difficult to identify
- A system will be more confident in predicting some and less confident in predicting others
  - use the more confident ones to help predict the remaining ones?
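The "use the more confident ones first" idea is the basis of easy-first and sieve-style resolvers: apply high-precision passes first, merge clusters, and let later, less confident passes see the entities built so far. A minimal sketch with two hypothetical passes:

```python
def easy_first(mentions, passes):
    """Apply linking passes in order of confidence; each merge is visible
    to all later passes."""
    clusters = {i: {i} for i in range(len(mentions))}  # singleton clusters
    for _, link_fn in passes:
        for i, j in link_fn(mentions):
            merged = clusters[i] | clusters[j]
            for k in merged:
                clusters[k] = merged
    return {frozenset(c) for c in clusters.values()}

def exact_match(mentions):
    """Pass 1: exact string match (high precision)."""
    return [(i, j) for i in range(len(mentions))
                   for j in range(i + 1, len(mentions))
                   if mentions[i].lower() == mentions[j].lower()]

def pronoun_to_nearest(mentions):
    """Pass 2: link a pronoun to the immediately preceding mention
    (low precision, so it runs last)."""
    pronouns = {"she", "her", "he", "his", "it"}
    return [(j - 1, j) for j in range(1, len(mentions))
            if mentions[j].lower() in pronouns]

mentions = ["Queen Elizabeth", "her", "Margaret", "the princess", "her"]
entities = easy_first(mentions, [("exact", exact_match),
                                 ("pronoun", pronoun_to_nearest)])
```

On this input the exact-match pass wrongly merges the two occurrences of "her", dragging Queen Elizabeth and the princess into one entity — which is why real sieve systems exclude pronouns from their string-match passes and reserve them for the low-confidence passes at the end.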
Applications of Coreference

- Question answering
- Information extraction
- Machine translation
- Text summarization
- Information retrieval
- ...
Application: Question Answering

[Diagram: a Question Answering System takes a natural language question, searches a text collection, and returns an answer]
Coreference for Question Answering

Where was Mozart born?

Mozart was one of the first classical composers. He was born in Salzburg, Austria, on 27 January 1756. He wrote music of many different genres...

Haydn was a contemporary and friend of Mozart. He was born in Rohrau, Austria, on 31 March 1732. He wrote 104 symphonies...
Application: Information Extraction

AFGANISTAN MAY BE PREPARING FOR ANOTHER TEST

Thousands of people are feared dead following... (voice-over) ...a powerful earthquake that hit Afghanistan today. The quake registered 6.9 on the Richter scale. (on camera) Details now hard to come by, but reports say entire villages were buried by the earthquake.

Disaster Type:
• location:
• date:
• magnitude:
• magnitude-confidence:
• damage:
• human-effect:
  • victim:
  • number:
  • outcome:
• physical-effect:
  • object:
  • outcome:
![Page 52: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/52.jpg)
52
Disaster Type: earthquake
• location: Afghanistan
• date: today
• magnitude: 6.9
• magnitude-confidence: high
• damage:
• human-effect:
• victim: Thousands of people
• number: Thousands
• outcome: dead
• physical-effect:
• object: entire villages
• outcome: damaged
AFGHANISTAN MAY BE
PREPARING FOR ANOTHER
TEST
Thousands of people are feared
dead following... (voice-over)
...a powerful earthquake that hit
Afghanistan today. The quake
registered 6.9 on the Richter
scale. (on camera) Details now
hard to come by, but reports say
entire villages were buried by
the earthquake.
Application: Information Extraction
Disaster Type: earthquake
• location: Afghanistan
• date: today
• magnitude: 6.9
• magnitude-confidence: high
• damage:
• human-effect:
• victim: Thousands of people
• number: Thousands
• outcome: dead
• physical-effect:
• object: entire villages
• outcome: damaged
AFGHANISTAN MAY BE
PREPARING FOR ANOTHER
TEST
Thousands of people are feared
dead following... (voice-over)
...a powerful earthquake that hit
Afghanistan today. The quake
registered 6.9 on the Richter
scale. (on camera) Details now
hard to come by, but reports say
entire villages were buried by
the earthquake.
The last major earthquake in
Afghanistan took place in
August 2001.
Coreference for Information Extraction
Application: Machine Translation
• Chinese-to-English machine translation
俄罗斯作为米洛舍夫维奇一贯的支持者,曾经提出
调停这场政治危机。
Russia is a consistent supporter of Milosevic, has proposed to mediate the political crisis.
Application: Machine Translation
• Chinese-to-English machine translation
俄罗斯作为米洛舍夫维奇一贯的支持者,曾经提出
调停这场政治危机。
Russia is a consistent supporter of Milosevic, ???has proposed to mediate the political crisis.
Rule-Based Approaches
• Popular in the 1970s and 1980s
  • A popular PhD thesis topic
  • Charniak (1972): children's story comprehension
    • "In order to do pronoun resolution, one had to be able to do everything else."
  • Focus on sophisticated knowledge & inference mechanisms
• Syntax-based approaches (Hobbs, 1976)
• Discourse-based approaches / centering algorithms
  • Kantor (1977), Grosz (1977), Webber (1978), Sidner (1979)
  • Centering algorithms and alternatives: Brennan et al. (1987), …
Rule-Based Approaches
• Knowledge-poor approaches (Mitkov, 1990s)
Evaluation of Rule-Based Approaches
• Small-scale evaluation
  • a few hundred sentences
  • sometimes carried out by hand
  • algorithm not always implemented/implementable
MUC Coreference
• The MUC conferences
  • Goal: evaluate information extraction systems
  • Coreference as a supporting task for information extraction
• First recognized in MUC-6 (1995)
• First large-scale evaluation of coreference systems, which needed:
  • a scoring program: the MUC scoring program (Vilain et al., 1995)
  • guidelines for coreference-annotating a corpus
• Original task definition was very ambitious
• Final task definition focuses solely on identity coreference
Other Types of Coreference
• Non-identity coreference: bridging
  • Part-whole relations
    • He passed by Jan's house and saw that the door was painted red.
  • Set-subset relations
• Difficult cases
  • Verb-phrase ellipsis
    • John enjoys watching movies, but Mary doesn't.
  • Reference to abstract entities
    • Each fall, penguins migrate to Fuji.
    • It happens just before the eggs hatch.
    • That's why I'm going there next month.
MUC Coreference
• MUC-6 coreference task
  • MUC-6 corpus: 30 training documents, 30 test documents
  • Although 30 training documents were available, all but one of the participating resolvers were rule-based
    • the UMass resolver (Lehnert & McCarthy, 1995) was learning-based
    • the best-performing resolver, FASTUS, was rule-based
• MUC-7 coreference task
  • MUC-7 corpus: 30 training documents, 20 test documents
  • none of the 7 participating resolvers used machine learning
• Learning-based approaches were not yet mainstream
  • But interest grew when Soon et al. (2001) showed that a learning-based resolver can offer competitive performance
System Architecture: Training
Training texts → Preprocessing → Mention Detection → Model Training → Coreference Model
• Preprocessing: tokenization, sentence splitting, part-of-speech tagging
• Mention detection (trivial at training time): extract the hand-annotated (i.e., gold) mentions from the training texts
• Model training: feature extraction, model design, and learning
System Architecture: Testing
Test text → Preprocessing → Mention Detection → Coreference Model → Coreference partition
• Preprocessing: tokenization, sentence splitting, part-of-speech tagging
• Mention detection (not so trivial at test time): extract the mentions (pronouns, names, nominals, nested NPs)
  • Some researchers reported results on gold mentions, not system mentions
  • This substantially simplified the coreference task: F-scores in the 80s rather than the 60s
The Mention-Pair Model
• A classifier that, given a description of two mentions, mi and mj, determines whether they are coreferent
• Coreference as a pairwise classification task
How to train a mention-pair model?
• Training instance creation
  • create one training instance for each pair of mentions from texts annotated with coreference information
  [Mary] said [John] hated [her] because [she] …
  (Mary, John): negative   (Mary, her): positive   (Mary, she): positive   (John, her): negative   (John, she): negative   (her, she): positive
• Problem: using all mention pairs produces a large and skewed data set
  • Soon et al.'s (2001) heuristic instance creation method addresses this
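The all-pairs instance creation sketched on this slide can be written out as follows (a minimal illustration; Soon et al.'s heuristic additionally restricts which pairs become instances):

```python
from itertools import combinations

def make_training_instances(mentions, gold_clusters):
    """Create one labeled instance per mention pair from a text
    annotated with coreference information (all-pairs scheme)."""
    # Map each mention index to the id of its gold coreference cluster.
    cluster_of = {}
    for cid, cluster in enumerate(gold_clusters):
        for m in cluster:
            cluster_of[m] = cid
    instances = []
    for i, j in combinations(range(len(mentions)), 2):
        label = "positive" if cluster_of[i] == cluster_of[j] else "negative"
        instances.append(((mentions[i], mentions[j]), label))
    return instances

# [Mary] said [John] hated [her] because [she] ...
mentions = ["Mary", "John", "her", "she"]
gold_clusters = [{0, 2, 3}, {1}]  # Mary = her = she; John is a singleton
pairs = make_training_instances(mentions, gold_clusters)
# 6 pairs in total: 3 positive, 3 negative
```

The skew the slide mentions is visible even in this tiny example: with more mentions, negative pairs quickly dominate, which is what motivates heuristic filtering.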
How to train a mention-pair model?
• Soon et al.'s feature vector describes the two mentions:
  • Exact string match: are mi and mj the same string after determiners are removed?
  • Grammatical: gender and number agreement, Pronoun_i?, Pronoun_j?, …
  • Semantic: semantic class agreement
  • Positional: distance between the two mentions
• Learning algorithm: the C5 decision tree learner
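A toy version of such a pair feature vector might look like this (the mention fields and their names are illustrative assumptions, not Soon et al.'s exact representation):

```python
def extract_features(mi, mj):
    """Compute a Soon et al.-style feature dict for a mention pair.
    mi, mj are dicts with hypothetical fields: text, gender, number,
    is_pronoun, semantic_class, sentence (sentence index)."""
    def strip_det(s):
        # drop determiners before comparing strings
        return " ".join(w for w in s.lower().split() if w not in {"a", "an", "the"})
    return {
        "str_match": strip_det(mi["text"]) == strip_det(mj["text"]),
        "gender_agree": mi["gender"] == mj["gender"],
        "number_agree": mi["number"] == mj["number"],
        "i_pronoun": mi["is_pronoun"],
        "j_pronoun": mj["is_pronoun"],
        "sem_class_agree": mi["semantic_class"] == mj["semantic_class"],
        "distance": mj["sentence"] - mi["sentence"],  # sentence distance
    }

mi = {"text": "the president", "gender": "m", "number": "sg",
      "is_pronoun": False, "semantic_class": "person", "sentence": 0}
mj = {"text": "President", "gender": "m", "number": "sg",
      "is_pronoun": False, "semantic_class": "person", "sentence": 2}
feats = extract_features(mi, mj)
# feats["str_match"] is True, feats["distance"] is 2
```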
Decision Tree Learned for MUC-6 Data
[Figure: the decision tree learned for the MUC-6 data; it branches on STR-MATCH, J-PRONOUN, APPOSITIVE, ALIAS, GENDER, I-PRONOUN, NUMBER, and DISTANCE, with coreferent (+) and not-coreferent (−) leaves]
Applying the mention-pair model
• After training, we can apply the model to a test text
• Classify each pair of mentions as coreferent or not coreferent
[Jing] likes [him] but [she] …
(Jing, him): positive   (him, she): negative   (Jing, she): positive
• Problem: the resulting classifications may violate transitivity!
How to resolve the conflicts?
How to resolve the conflicts?
• Given a test text,
  • process the mentions in a left-to-right manner
  • for each mj, select as its antecedent the closest preceding mention that is classified as coreferent with mj
  • if no such mention exists, no antecedent is found for mj
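The closest-first selection procedure above can be sketched directly (a toy implementation; `is_coreferent` stands in for the trained mention-pair classifier):

```python
def closest_first_resolve(mentions, is_coreferent):
    """Closest-first antecedent selection followed by single-link
    clustering of the selected antecedent links.

    mentions: list of mentions in document order
    is_coreferent(i, j): pairwise classifier decision for i < j
    Returns (antecedent indices, clusters as frozensets of indices).
    """
    n = len(mentions)
    antecedent = [None] * n
    for j in range(n):
        # scan preceding mentions from closest to farthest
        for i in range(j - 1, -1, -1):
            if is_coreferent(i, j):
                antecedent[j] = i
                break

    # union the antecedent links into clusters (single-link)
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x
    for j, i in enumerate(antecedent):
        if i is not None:
            parent[find(j)] = find(i)
    clusters = {}
    for m in range(n):
        clusters.setdefault(find(m), set()).add(m)
    return antecedent, sorted(map(frozenset, clusters.values()), key=min)

# [Jing] likes [him] but [she] ... with pairwise decisions:
# (Jing, him): positive, (him, she): negative, (Jing, she): positive
decisions = {(0, 1): True, (1, 2): False, (0, 2): True}
ants, clusters = closest_first_resolve(["Jing", "him", "she"],
                                       lambda i, j: decisions[(i, j)])
# ants == [None, 0, 0]; all three mentions end up in one cluster
```

Note how clustering overrides the (him, she) negative decision: this is exactly the transitivity conflict the slides describe, resolved greedily.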
How to resolve the conflicts?
[Jing] likes [him] but [she] …
(Jing, him): positive   (him, she): negative   (Jing, she): positive
• Jing has no preceding mention, so no antecedent is found for it
• him: the closest preceding mention classified as coreferent with it is Jing
• she: him is closer but was classified as not coreferent, so its antecedent is Jing
• In the end, all three mentions will be in the same cluster
• This is single-link clustering
MUC Scoring Metric
• Link-based metric
• Key: {A, B, C}, {D}, {E}
• Response: {A, B}, {C, D}, {E}
• Two links are needed to create the key clusters
  • the response recovered one of them, so recall is ½
• Of the two links in the response clusters, one is correct
  • so precision is ½
• F-measure (the harmonic mean of recall and precision) is ½
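The computation above follows Vilain et al.'s partition formulation; a compact sketch (my own implementation of the published definition, not the official scorer):

```python
def muc_score(key, response):
    """MUC link-based score (Vilain et al., 1995).
    key, response: lists of sets of mention ids (entity clusters)."""
    def score(gold, system):
        covered = set().union(*system) if system else set()
        num = den = 0
        for cluster in gold:
            # partitions of this gold cluster induced by the system clusters
            parts = {frozenset(cluster & s) for s in system if cluster & s}
            # mentions absent from the system count as singleton partitions
            p = len(parts) + len(cluster - covered)
            num += len(cluster) - p        # links recovered for this cluster
            den += len(cluster) - 1        # links needed for this cluster
        return num / den if den else 0.0

    r = score(key, response)               # recall
    p = score(response, key)               # precision (roles swapped)
    f = 2 * p * r / (p + r) if p + r else 0.0
    return r, p, f

key = [{"A", "B", "C"}, {"D"}, {"E"}]
response = [{"A", "B"}, {"C", "D"}, {"E"}]
r, p, f = muc_score(key, response)
# r == 0.5, p == 0.5, f == 0.5, matching the slide's worked example
```

Singleton clusters contribute nothing to either sum (|cluster| − 1 = 0), which is why adding {E} to both key and response leaves the scores unchanged.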
Soon et al. Results

         MUC-6               MUC-7
     R     P     F       R     P     F
   58.6  67.3  62.6    56.1  65.5  60.4
Anaphoricity Determination
Determines whether a mention has an antecedent
Anaphoricity Determination

- In single-link clustering, when selecting an antecedent for mj:
  - select the closest preceding mention that is coreferent with mj
  - if no such mention exists, mj is classified as non-anaphoric
- Why not explicitly determine whether mj is anaphoric prior to coreference resolution (anaphoricity determination)?
  - if mj is not anaphoric, we shouldn't bother to resolve it
- Pipeline architecture: filter non-anaphoric NPs prior to coreference resolution

  Anaphoricity Determination → Coreference Resolution
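The pipeline architecture can be sketched as below. The `is_anaphoric` and `resolve` callbacks are trivial stand-ins for trained models (a seen-before test and a closest-string-match resolver in the usage example), purely for illustration:

```python
def pipeline(mentions, is_anaphoric, resolve):
    """Anaphoricity filter followed by coreference resolution.
    mentions: list in document order.
    Returns {mention index: antecedent index} for mentions judged anaphoric;
    mentions the filter rejects are never passed to the resolver."""
    links = {}
    for j, m in enumerate(mentions):
        if is_anaphoric(m, mentions[:j]):          # anaphoricity determination
            links[j] = resolve(m, mentions[:j])    # coreference resolution
    return links
```

A toy run with stand-in models: `pipeline(["Clinton", "she", "Clinton"], lambda m, prev: m in prev, lambda m, prev: max(i for i, x in enumerate(prev) if x == m))` links only the second "Clinton" back to the first, and any filter error would be propagated exactly as the slides describe.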
Anaphoricity Determination (Ng & Cardie, COLING'02)

- Train a classifier to determine whether a mention is anaphoric (i.e., whether a mention has an antecedent)
  - one training instance per mention
  - 37 features
  - class value is anaphoric or not anaphoric
- Results on MUC-6/7 (Ng & Cardie, 2002)
  - coreference F-measure drops
  - precision increases, but recall drops abruptly: many anaphoric mentions are misclassified
  - error propagation
Some Questions (Circa 2003)

- Is there a model better than the mention-pair model?
- Can anaphoricity determination benefit coreference resolution?
Weaknesses of the Mention-Pair Model

- Limited expressiveness
  - information extracted from two mentions may not be sufficient for making an informed coreference decision
  - e.g., Mr. Clinton … Clinton … she: the pair (Clinton, she) looks compatible on its own, but the model cannot exploit the fact that this "Clinton" corefers with "Mr. Clinton"
- Can't determine which candidate antecedent is the best
  - only determines how good a candidate is relative to the mention to be resolved, not how good it is relative to the others
  - e.g., Bush … Clinton … she: the pairs (Bush, she) and (Clinton, she) are classified independently, so the two candidates are never directly compared
Improving Model Expressiveness

- Want a coreference model that can tell us whether "she" and a preceding cluster (e.g., {Mr. Clinton, Clinton}) are coreferent
The Entity-Mention Model

- A classifier that determines whether (or how likely it is that) a mention belongs to a preceding coreference cluster
- More expressive than the mention-pair model
  - an instance is composed of a mention and a preceding cluster
  - can employ cluster-level features defined over any subset of mentions in a preceding cluster
    - is a mention gender-compatible with all mentions in a preceding cluster?
    - is a mention gender-compatible with most of the mentions in it?
    - is a mention gender-compatible with none of them?

Pasula et al. (2003), Luo et al. (2004), Yang et al. (2004, 2008), Daume & Marcu (2005), Culotta et al. (2007), …
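The ALL/MOST/NONE gender questions above can be sketched as cluster-level features. The representation (precomputed gender labels per mention) and the `gender_features` helper are illustrative assumptions, not the actual feature set of any of the cited systems:

```python
def gender_features(mention_gender, cluster_genders):
    """Cluster-level ALL/MOST/NONE gender-compatibility features for one
    (mention, preceding-cluster) instance. Genders are strings such as
    "male", "female", or "unknown"; "unknown" is treated as compatible."""
    compatible = [g in (mention_gender, "unknown") for g in cluster_genders]
    frac = sum(compatible) / len(compatible)
    return {
        "gender_all":  frac == 1.0,   # compatible with all mentions in the cluster
        "gender_most": frac >= 0.5,   # compatible with most of them
        "gender_none": frac == 0.0,   # compatible with none of them
    }
```

For "she" against a cluster of two male mentions, only `gender_none` fires; a mention-pair model, seeing one pair at a time, has no way to express such a property of the whole cluster.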
How to address this problem?

- Idea: train a model that imposes a ranking on the candidate antecedents for a mention to be resolved
  - so that it assigns the highest rank to the correct antecedent
- A ranker allows all candidate antecedents to be compared
  - allows us to find the best candidate antecedent for a mention
- There is a natural resolution strategy for a mention-ranking model
  - a mention is resolved to the highest-ranked candidate antecedent

Yang et al. (2003), Iida et al. (2003), Denis & Baldridge (2007, 2008), …
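The resolution strategy can be sketched as below; `score` is a stand-in for a trained ranker (the string-match scorer in the usage example is hypothetical):

```python
def resolve(mentions, score):
    """Mention-ranking resolution: resolve each mention to its
    highest-scoring candidate antecedent.
    mentions: list in document order; score(antecedent, mention) -> float.
    Returns one antecedent index per mention (None for the first)."""
    links = []
    for j, mj in enumerate(mentions):
        if j == 0:
            links.append(None)      # no preceding candidates
            continue
        candidates = range(j)       # all preceding mentions compete
        best = max(candidates, key=lambda i: score(mentions[i], mj))
        links.append(best)
    return links
```

Note that `max` always returns some candidate, even when every score is poor; a ranker by itself has no way to leave a mention unresolved, which is exactly the caveat raised next.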
Caveat

- Since a mention ranker only imposes a ranking on the candidates, it cannot determine whether a mention is anaphoric
- Need to train a separate classifier to perform anaphoricity determination
Recap

  Problem                          Model addressing it
  Limited expressiveness           Entity-mention model
  Cannot determine best candidate  Mention-ranking model

Can we combine the strengths of these two models?
The Cluster-Ranking Model

- Mention-ranking model: ranks candidate antecedents
- Entity-mention model: considers preceding clusters, not candidate antecedents
- Cluster-ranking model: combines the two by ranking preceding clusters
The Cluster-Ranking Model (Rahman & Ng, EMNLP'09)

- Training: train a ranker to rank preceding clusters
- Testing: resolve each mention to the highest-ranked preceding cluster

"After many years of hard work … finally came up with cluster rankers, which are conceptually similar to Lappin & Leass' (1994) pronoun resolver" --- Bonnie Webber (2010)
- As a ranker, the cluster-ranking model cannot determine whether a mention is anaphoric
- Before resolving a mention, we still need an anaphoricity classifier to determine if it is anaphoric
  - yields a pipeline architecture
- Potential problem: errors made by the anaphoricity classifier are propagated to the coreference resolver
Potential Solution

- Jointly learn anaphoricity and coreference
How to jointly learn anaphoricity and coreference resolution?

- Currently, the cluster-ranking model is trained to rank the preceding clusters for a given mention, mj
- In joint modeling, it is instead trained to rank the preceding clusters plus a null cluster for a given mention, mj
  - the model is trained such that the null cluster has the highest rank if mj is non-anaphoric
- Joint training allows the model to simultaneously learn whether to resolve a mention, and if so, which preceding cluster is the best
How to apply the joint model?

- During testing, resolve mj to the highest-ranked cluster
  - if the highest-ranked cluster is the null cluster, mj is non-anaphoric
- The same idea can be applied to mention-ranking models
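Joint resolution can be sketched as below; `score` is again a stand-in for a trained ranker, and the empty tuple plays the role of the null cluster. This is an illustrative sketch of the idea, not Rahman & Ng's implementation:

```python
def resolve_joint(mentions, score):
    """Joint anaphoricity + cluster-ranking resolution: the null cluster
    competes against the preceding clusters, so no separate anaphoricity
    classifier is needed. score(cluster, mention) -> float, where cluster
    is a tuple of mentions and () is the null cluster. Returns clusters."""
    clusters = []
    for m in mentions:
        candidates = [()] + clusters              # null cluster + preceding clusters
        best = max(candidates, key=lambda c: score(tuple(c), m))
        if best == ():                            # null ranked highest: non-anaphoric
            clusters.append([m])                  # m starts a new cluster
        else:
            best.append(m)                        # m joins the top-ranked cluster
    return clusters
```

With a toy head-match scorer, "she" scores no better than the null cluster and correctly starts its own cluster, while the second "Clinton" joins the first, all in one pass.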
Experimental Setup

- The English portion of the ACE 2005 training corpus
  - 599 documents coref-annotated on the ACE entity types
  - 80% for training, 20% for testing
- Mentions extracted automatically using a mention detector
- Scoring programs: recall, precision, F-measure
  - B3 (Bagga & Baldwin, 1998)
  - CEAF (Luo, 2005)
B3 Scoring Metric

- Mention-based metric
  - computes per-mention recall and precision
  - aggregates per-mention scores into overall scores
- Key: {A, B, C}, {D}, {E}
- Response: {A, B}, {C, D}, {E}
- To compute the recall and precision for A:
  - A's key cluster and response cluster have 2 overlapping mentions
  - 2 of the 3 mentions in the key cluster are recovered, so recall = 2/3
  - both of the 2 mentions in the response cluster are correct, so precision = 2/2
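The per-mention computation aggregates as a simple average; a minimal B3 scorer over the slide's representation (not the official implementation) can be sketched as:

```python
def b_cubed(key, response):
    """B3 score: per-mention recall = |overlap| / |key cluster|,
    per-mention precision = |overlap| / |response cluster|, averaged
    over all mentions. key, response: lists of clusters (sets of
    mentions) covering the same mentions."""
    def cluster_of(m, clusters):
        return next(c for c in clusters if m in c)

    mentions = [m for c in key for m in c]
    r = p = 0.0
    for m in mentions:
        k, s = cluster_of(m, key), cluster_of(m, response)
        overlap = len(k & s)
        r += overlap / len(k)      # per-mention recall
        p += overlap / len(s)      # per-mention precision
    r, p = r / len(mentions), p / len(mentions)
    f = 2 * p * r / (p + r) if p + r else 0.0
    return r, p, f
```

On the slide's example, mention A contributes recall 2/3 and precision 2/2 exactly as computed above; averaging over all five mentions gives overall recall 11/15 and precision 4/5.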
CEAF Scoring Metric

- Entity/cluster-based metric
  - computes the best one-to-one (bipartite) matching between the set of key clusters and the set of response clusters
- Key: {A, B, C}, {D, E}
- Response: {A, B}, {C, D}, {E}
- Best matching: {A, B, C} ↔ {A, B} (overlap 2) and {D, E} ↔ {E} (overlap 1)
- Recall: (2 + 1) / (3 + 2) = 3/5, dividing by the total number of key mentions
- Precision: (2 + 1) / (2 + 2 + 1) = 3/5, dividing by the total number of response mentions
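The best-matching computation can be sketched with brute force over alignments (fine for toy examples; real scorers use the Kuhn-Munkres / Hungarian algorithm), using mention overlap as the cluster similarity and normalizing by the total mentions on each side as in Luo (2005):

```python
from itertools import permutations

def ceaf(key, response):
    """Mention-based CEAF: find the one-to-one matching between key and
    response clusters that maximizes total mention overlap.
    key, response: lists of clusters (sets). Returns (recall, precision)."""
    small, large = sorted([key, response], key=len)
    best = 0
    # Try every injective assignment of the smaller side into the larger.
    for perm in permutations(range(len(large)), len(small)):
        total = sum(len(small[i] & large[j]) for i, j in enumerate(perm))
        best = max(best, total)
    recall = best / sum(len(c) for c in key)          # over all key mentions
    precision = best / sum(len(c) for c in response)  # over all response mentions
    return recall, precision
```

On the slide's example the optimal matching pairs {A, B, C} with {A, B} and {D, E} with one of the remaining response clusters, for a total overlap of 3.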
Experimental Setup

- Three baseline coreference models: mention-pair, entity-mention, and mention-ranking
Results

                                               B3                 CEAF
                                         R     P     F       R     P     F
  Mention-Pair Baseline                 50.8  57.9  54.1    56.1  51.0  53.4
  Entity-Mention Baseline               51.2  57.8  54.3    56.3  50.2  53.1
  Mention-Ranking Baseline (Pipeline)   52.3  61.8  56.6    51.6  56.7  54.1
  Mention-Ranking Baseline (Joint)      50.4  65.5  56.9    53.0  58.5  55.6
  Cluster-Ranking Model (Pipeline)      55.3  63.7  59.2    54.1  59.3  56.6
  Cluster-Ranking Model (Joint)         54.4  70.5  61.4    56.7  62.6  59.5

- The pipeline variants first apply an anaphoricity classifier to filter non-anaphoric mentions
- In comparison to the best baseline (joint mention-ranking), joint cluster ranking achieves significant improvements in F-score for both B3 and CEAF, due to a simultaneous rise in recall and precision
- Joint modeling is better than pipeline modeling
The CoNLL Shared Tasks

- Much recent work on entity coreference resolution was stimulated in part by the availability of the OntoNotes corpus and its use in two coreference shared tasks: CoNLL-2011 and CoNLL-2012
- OntoNotes coreference: unrestricted coreference
Two Top Shared Task Systems

- Multi-pass sieve approach (Lee et al., 2011)
  - winner of the CoNLL-2011 shared task
  - English coreference resolution
- Latent tree-based approach (Fernandes et al., 2012)
  - winner of the CoNLL-2012 shared task
  - multilingual coreference resolution (English, Chinese, Arabic)
157
Stanford’s Sieve-Based Approach
� Rule-based resolver
� Each rule enables coreference links to be established
� Rules are partitioned into 12 components (or sieves) arranged
as a pipeline
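The pipeline can be sketched in code as follows. This is an illustrative sketch, not Stanford's actual implementation: clusters are tracked with union-find, and each sieve may only add links (merge clusters) on top of what earlier sieves built.

```python
# Minimal sketch of a sieve pipeline over partial clusters (illustrative;
# not Stanford's implementation). Each sieve may only merge clusters.

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

def exact_string_sieve(mentions, uf):
    # Highest-precision style of rule: link mentions with identical strings.
    seen = {}
    for i, m in enumerate(mentions):
        key = m.lower()
        if key in seen:
            uf.union(i, seen[key])
        else:
            seen[key] = i

def run_pipeline(mentions, sieves):
    uf = UnionFind(len(mentions))
    for sieve in sieves:          # sieves ordered by decreasing precision
        sieve(mentions, uf)       # each sieve sees clusters built so far
    clusters = {}
    for i, m in enumerate(mentions):
        clusters.setdefault(uf.find(i), []).append(m)
    return list(clusters.values())

print(run_pipeline(["Queen Elizabeth", "her", "queen elizabeth"],
                   [exact_string_sieve]))
# [['Queen Elizabeth', 'queen elizabeth'], ['her']]
```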
![Page 158: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/158.jpg)
158
The 12 Sieves
� Discourse Processing
� Exact String Match
� Relaxed String Match
� Precise Constructs
� Strict Head Matching A,B,C
� Proper Head Word Match
� Alias Sieve
� Relaxed Head Matching
� Lexical Chain
� Pronouns
![Page 159: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/159.jpg)
159
The 12 Sieves
� Discourse Processing
� Exact String Match
� Relaxed String Match
� Precise Constructs
� Strict Head Matching A,B,C
� Proper Head Word Match
� Alias Sieve
� Relaxed Head Matching
� Lexical Chain
� Pronouns
Each sieve is composed of a
set of rules for establishing
coreference links
![Page 160: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/160.jpg)
160
The 12 Sieves
� Discourse Processing
� Exact String Match
� Relaxed String Match
� Precise Constructs
� Strict Head Matching A,B,C
� Proper Head Word Match
� Alias Sieve
� Relaxed Head Matching
� Lexical Chain
� Pronouns
� Two mentions are coreferent
if they have the same string
![Page 161: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/161.jpg)
161
The 12 Sieves
� Discourse Processing
� Exact String Match
� Relaxed String Match
� Precise Constructs
� Strict Head Matching A,B,C
� Proper Head Word Match
� Alias Sieve
� Relaxed Head Matching
� Lexical Chain
� Pronouns
� Two mentions are coreferent
if the strings obtained by
dropping the text after their head words are identical
![Page 162: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/162.jpg)
162
The 12 Sieves
� Discourse Processing
� Exact String Match
� Relaxed String Match
� Precise Constructs
� Strict Head Matching A,B,C
� Proper Head Word Match
� Alias Sieve
� Relaxed Head Matching
� Lexical Chain
� Pronouns
� Two mentions are coreferent
if they are in an appositive
construction
� Two mentions are coreferent
if they are in a copular construction
� Two mentions are coreferent if one is a relative pronoun
that modifies the head of the
antecedent NP
� …
![Page 163: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/163.jpg)
163
The 12 Sieves
� Discourse Processing
� Exact String Match
� Relaxed String Match
� Precise Constructs
� Strict Head Matching A,B,C
� Proper Head Word Match
� Alias Sieve
� Relaxed Head Matching
� Lexical Chain
� Pronouns
Sieves implementing different
kinds of string matching
![Page 164: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/164.jpg)
164
The 12 Sieves
� Discourse Processing
� Exact String Match
� Relaxed String Match
� Precise Constructs
� Strict Head Matching A,B,C
� Proper Head Word Match
� Alias Sieve
� Relaxed Head Matching
� Lexical Chain
� Pronouns
Posits two mentions as coreferent
if they are linked by a WordNet
lexical chain
![Page 165: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/165.jpg)
165
The 12 Sieves
� Discourse Processing
� Exact String Match
� Relaxed String Match
� Precise Constructs
� Strict Head Matching A,B,C
� Proper Head Word Match
� Alias Sieve
� Relaxed Head Matching
� Lexical Chain
� Pronouns
Resolves a pronoun to a mention
that agrees in gender, number,
person, and semantic class
![Page 166: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/166.jpg)
166
A Few Notes on Sieves
� Each sieve is composed of a set of rules for establishing coreference links
� Sieves are applied in order of decreasing precision
![Page 167: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/167.jpg)
167
The 12 Sieves
� Discourse Processing
� Exact String Match
� Relaxed String Match
� Precise Constructs
� Strict Head Matching A,B,C
� Proper Head Word Match
� Alias Sieve
� Relaxed Head Matching
� Lexical Chain
� Pronouns
� Rules in the DP sieve have the
highest precision
![Page 168: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/168.jpg)
168
The 12 Sieves
� Discourse Processing
� Exact String Match
� Relaxed String Match
� Precise Constructs
� Strict Head Matching A,B,C
� Proper Head Word Match
� Alias Sieve
� Relaxed Head Matching
� Lexical Chain
� Pronouns
� Rules in the DP sieve have the
highest precision
� … followed by those in Exact
String Match (Sieve 2)
![Page 169: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/169.jpg)
169
The 12 Sieves
� Discourse Processing
� Exact String Match
� Relaxed String Match
� Precise Constructs
� Strict Head Matching A,B,C
� Proper Head Word Match
� Alias Sieve
� Relaxed Head Matching
� Lexical Chain
� Pronouns
� Rules in the DP sieve have the
highest precision
� … followed by those in Exact
String Match (Sieve 2)
� … followed by those in Relaxed
String Match (Sieve 3)
![Page 170: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/170.jpg)
170
The 12 Sieves
� Discourse Processing
� Exact String Match
� Relaxed String Match
� Precise Constructs
� Strict Head Matching A,B,C
� Proper Head Word Match
� Alias Sieve
� Relaxed Head Matching
� Lexical Chain
� Pronouns
� Rules in the DP sieve have the
highest precision
� … followed by those in Exact
String Match (Sieve 2)
� … followed by those in Relaxed
String Match (Sieve 3)
� Those in the Pronouns sieve
have the lowest precision
![Page 171: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/171.jpg)
171
A Few Notes on Sieves
� Each sieve is composed of a set of rules for establishing coreference links
� Sieves are applied in order of decreasing precision
� Coreference clusters are constructed incrementally
� Each sieve builds on the partial coreference clusters constructed
by the preceding sieves
� Enables the use of rules that link two clusters
� Rules can employ features computed over one or both clusters
� E.g., are there mentions in the 2 clusters that have the same head?
� Resemble entity-mention models
![Page 172: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/172.jpg)
172
Evaluation
� Corpus
� Training: CoNLL-2011 shared task training corpus
� Test: CoNLL-2011 shared task test corpus
� Scoring programs: recall, precision, F-measure
� MUC (Vilain et al., 1995)
� B3 (Bagga & Baldwin, 1998)
� CEAFe (Luo, 2005)
� CoNLL (unweighted average of MUC, B3 and CEAFe F-scores)
![Page 173: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/173.jpg)
173
Results (Closed Track)
� Caveat
� Mention detection performance could have played a role
� Best system’s mention detection results: R/75, P/67, F/71
� 2nd best system’s mention detection results: R/92, P/28, F/43
System MUC B3 CEAFe CoNLL
Rank 1: Multi-Pass Sieves 59.6 68.3 45.5 57.8
Rank 2: Label Propagation 59.6 67.1 41.3 56.0
![Page 174: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/174.jpg)
174
Lessons
� Easy-first coreference resolution
� Exploit easy relations to discover hard relations
� Results seem to suggest that humans are better at
combining features than machine learners
� Better feature induction methods for combining primitive features into more powerful features?
![Page 175: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/175.jpg)
175
Another Sieve-Based Approach
� Ratinov & Roth (EMNLP 2012)
� Learning-based
� Each sieve is a machine-learned classifier
� Later sieves can override earlier sieves’ decisions
� Can recover from errors as additional evidence is available
![Page 176: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/176.jpg)
176
Ratinov & Roth’s 9 Sieves (Easy First)
• Each sieve is a mention-pair model applicable to a subset of mention pairs
1. Nested (e.g., {city of {Jerusalem}})
2. Same Sentence both Named Entities (NEs)
3. Adjacent (Mentions closest to each other in dependency tree)
4. Same Sentence NE&Nominal (e.g., Barack Obama, president)
5. Different Sentence two NEs
6. Same Sentence No Pronouns
7. Different Sentence Closest Mentions (no intervening mentions)
8. Same Sentence All Pairs
9. All Pairs
![Page 177: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/177.jpg)
177
Information Propagation
� Encoded as features
� Decision-encoding features at sieve i
� whether mj and mk are posited as coreferent by sieve 1, sieve
2, …, sieve i-1
� whether mj and mk are in the same coreference cluster after sieve 1, sieve 2, …, sieve i-1
� the results of various set operations applied to the cluster
containing mj and the cluster containing mk
� set identity, set containment, set overlap, …
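These propagated decisions can be encoded roughly as in the sketch below; the feature names and the exact set operations are illustrative, not Ratinov & Roth's precise feature set.

```python
# Sketch of decision-encoding features for sieve i (feature names are
# illustrative, not Ratinov & Roth's exact set).

def decision_features(mj, mk, prior_decisions, clusters):
    """prior_decisions: one boolean per earlier sieve -- whether that sieve
    posited (mj, mk) as coreferent.
    clusters: mapping mention -> frozenset of mentions in its current cluster."""
    cj, ck = clusters[mj], clusters[mk]
    feats = {}
    for i, linked in enumerate(prior_decisions):
        feats[f"sieve_{i}_linked"] = linked
    feats["same_cluster"] = cj == ck           # set identity
    feats["containment"] = cj < ck or ck < cj  # proper set containment
    feats["overlap"] = bool(cj & ck)           # set overlap
    return feats

clusters = {"Obama": frozenset({"Obama", "he"}),
            "he": frozenset({"Obama", "he"}),
            "president": frozenset({"president"})}
f = decision_features("Obama", "president", [False, True], clusters)
# f["sieve_1_linked"] is True; the cluster-level features are all False
```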
![Page 178: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/178.jpg)
178
� Multi-pass sieve approach (Lee et al., 2011)
� Winner of the CoNLL-2011 shared task
� English coreference resolution
� Latent tree-based approach (Fernandes et al., 2012)
� Winner of the CoNLL-2012 shared task
� Multilingual coreference resolution (English, Chinese, Arabic)
Two Recent Approaches
![Page 179: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/179.jpg)
179
Training the Mention-Pair Model
� The mention-pair model determines whether two mentions are coreferent or not
� Each training example corresponds to two mentions
� Class value indicates whether they are coreferent or not
� Soon et al. train the model using a decision tree learner
� But we can train it using other learners, such as the
perceptron learning algorithm
![Page 180: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/180.jpg)
180
The Perceptron Learning Algorithm
� Parameterized by a weight vector w
� Learns a linear function
� Output of perceptron: y = w • x
w: weight vector
x: feature vector (features computed based on two mentions)
![Page 181: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/181.jpg)
181
The Perceptron Learning Algorithm
� Initialize w
� Loop
for each training example xi
(1) predict the class of xi using the current w
(2) if the predicted class is not equal to the correct class
w ← w + xi // update the weights
until convergence
� An iterative algorithm
� An error-driven algorithm:
weight vector is updated whenever a mistake is made
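As a concrete sketch, here is the binary perceptron applied to toy mention-pair vectors. The data and features are invented for illustration, and the sketch uses the standard signed update w ← w + y·x for the two-class case.

```python
# Toy perceptron for the mention-pair model: x is a feature vector computed
# from two mentions, y is +1 (coreferent) or -1 (not). Illustrative sketch
# using the standard signed update rule.

def predict(w, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else -1

def train_perceptron(examples, epochs=10):
    w = [0.0] * len(examples[0][0])
    for _ in range(epochs):
        for x, y in examples:
            if predict(w, x) != y:                         # error-driven:
                w = [wi + y * xi for wi, xi in zip(w, x)]  # update on mistakes
    return w

# features: [same_head, gender_compatible, bias]; coreferent only if both hold
data = [([1, 1, 1], 1), ([0, 1, 1], -1), ([1, 0, 1], -1), ([0, 0, 1], -1)]
w = train_perceptron(data)
assert all(predict(w, x) == y for x, y in data)
```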
![Page 182: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/182.jpg)
182
� Observation (McCallum & Wellner, 2004):
� Since the goal is to output a coreference partition, why not
learn to predict a partition directly?
� They modified the perceptron algorithm in order to learn to
predict a coreference partition
� each training example corresponds to a document
� Class value is the correct coreference partition of the mentions
![Page 183: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/183.jpg)
183
The Perceptron Learning Algorithm
� Initialize w
� Loop
for each training example xi
(1) predict the class of xi using the current w
(2) if the predicted class is not equal to the correct class
w ← w + xi // update the weights
until convergence
![Page 184: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/184.jpg)
184
The Perceptron Learning Algorithm
� Initialize w
� Loop
for each training example xi
(1) predict the class of xi using the current w
(2) if the predicted class is not equal to the correct class
w ← w + F(yi) – F(yi’) // update the weights
until convergence
predict the most probable
partition yi’
xi = document w/ correct partition yi
![Page 185: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/185.jpg)
185
The Perceptron Learning Algorithm
� Initialize w
� Loop
for each training example xi
(1) predict the class of xi using the current w
(2) if the predicted class is not equal to the correct class
w ← w + F(yi) – F(yi’) // update the weights
until convergence
� Each example xi corresponds to a partition. What features should be used to represent a partition?
xi = document w/ correct partition yi
predict the most probable
partition yi’
![Page 186: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/186.jpg)
186
The Perceptron Learning Algorithm
� Initialize w
� Loop
for each training example xi
(1) predict the class of xi using the current w
(2) if the predicted class is not equal to the correct class
w ← w + F(yi) – F(yi’) // update the weights
until convergence
� Each example xi corresponds to a partition. What features should be used to represent a partition?
� still pairwise features, but they are computed differently
� Before: do the two mentions have compatible gender?
� Now: how many coreferent pairs have compatible gender?
xi = document w/ correct partition yi
predict the most probable
partition yi’
![Page 187: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/187.jpg)
187
The Perceptron Learning Algorithm
� Initialize w
� Loop
for each training example xi
(1) predict the class of xi using the current w
(2) if the predicted class is not equal to the correct class
w ← w + F(yi) – F(yi’) // update the weights
until convergence
� Each example xi corresponds to a partition. What features should be used to represent a partition?
� still pairwise features, but they are computed differently
� Before: do the two mentions have compatible gender?
� Now: how many coreferent pairs have compatible gender?
xi = document w/ correct partition yi
predict the most probable
partition yi’
each value in F(yi’) is the
sum of the feature values
over all mention pairs in yi’
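In code, the global feature vector F(y) for a partition y can be sketched as a sum of pairwise feature vectors over the coreferent (within-cluster) pairs. The gender lexicon and feature below are toy examples for illustration.

```python
from itertools import combinations

# Sketch: F(y) for a partition y sums each pairwise feature over all
# coreferent (within-cluster) mention pairs.

def partition_features(partition, pair_feats):
    """partition: list of clusters (lists of mentions);
    pair_feats: (m1, m2) -> dict of pairwise feature values."""
    total = {}
    for cluster in partition:
        for m1, m2 in combinations(cluster, 2):
            for name, v in pair_feats(m1, m2).items():
                total[name] = total.get(name, 0) + v
    return total

gender = {"Hillary": "f", "she": "f", "her": "f", "Bill": "m"}

def pair_feats(m1, m2):
    return {"gender_compatible": int(gender[m1] == gender[m2])}

F = partition_features([["Hillary", "she", "her"], ["Bill"]], pair_feats)
# F == {"gender_compatible": 3}: three coreferent pairs, all compatible
```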
![Page 188: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/188.jpg)
188
The Perceptron Learning Algorithm
� Initialize w
� Loop
for each training example xi
(1) predict the class of xi using the current w
(2) if the predicted class is not equal to the correct class
w ← w + F(yi) – F(yi’) // update the weights
until convergence
� How can we predict most prob. partition using the current w?
Method 1:
� For each possible partition p, compute w • F(p)
� Select the p with the largest w • F(p)
xi = document w/ correct partition yi
predict the most probable
partition yi’
![Page 189: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/189.jpg)
189
The Perceptron Learning Algorithm
� Initialize w
� Loop
for each training example xi
(1) predict the class of xi using the current w
(2) if the predicted class is not equal to the correct class
w ← w + F(yi) – F(yi’) // update the weights
until convergence
� How can we predict most prob. partition using the current w?
Method 1:
� For each possible partition p, compute w • F(p)
� Select the p with the largest w • F(p)
Computationally
intractable
xi = document w/ correct partition yi
predict the most probable
partition yi’
![Page 190: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/190.jpg)
190
The Perceptron Learning Algorithm
� Initialize w
� Loop
for each training example xi
(1) predict the class of xi using the current w
(2) if the predicted class is not equal to the correct class
w ← w + F(yi) – F(yi’) // update the weights
until convergence
� How can we predict most prob. partition using the current w?
Method 2:
� Approximate the optimal partition given the current w using
correlation clustering
xi = document w/ correct partition yi
predict the most probable
partition yi’
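One simple way to approximate this step is greedy agglomerative correlation clustering (a sketch, not necessarily the exact approximation McCallum & Wellner used): start from singleton clusters and keep performing the merge with the largest positive total pairwise score.

```python
# Greedy sketch of correlation clustering: repeatedly merge the two
# clusters whose cross-cluster pairwise scores sum to the largest
# positive gain; stop when no merge has positive gain.

def greedy_correlation_clustering(mentions, score):
    clusters = [[m] for m in mentions]
    while True:
        best_gain, best_pair = 0.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                gain = sum(score(a, b)
                           for a in clusters[i] for b in clusters[j])
                if gain > best_gain:
                    best_gain, best_pair = gain, (i, j)
        if best_pair is None:
            return clusters
        i, j = best_pair
        clusters[i].extend(clusters.pop(j))  # i < j, so index i stays valid

# made-up pairwise scores standing in for w . f(m1, m2)
s = {("Obama", "he"): 1.0, ("Obama", "the president"): 0.5,
     ("he", "the president"): -2.0}

def score(a, b):
    return s.get((a, b), s.get((b, a), 0.0))

print(greedy_correlation_clustering(["Obama", "he", "the president"], score))
# [['Obama', 'he'], ['the president']]
```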
![Page 191: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/191.jpg)
191
The Structured Perceptron Learning
Algorithm
� Initialize w
� Loop
for each training example xi
(1) predict the class of xi using the current w
(2) if the predicted class is not equal to the correct class
w ← w + F(yi) – F(yi’) // update the weights
until convergence
� How can we predict most prob. partition using the current w?
Method 2:
� Approximate the optimal partition given the current w using
correlation clustering
xi = document w/ correct partition yi
predict the most probable
partition yi’
![Page 192: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/192.jpg)
192
Observation
� Now we have an algorithm that learns to partition
� Can we further improve?
![Page 193: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/193.jpg)
193
Observation
� Now we have an algorithm that learns to partition
� Can we further improve?
� Recall that to compute each pairwise feature, we sum the
values of the pairwise feature over the coreferent pairs
� E.g., number of coreferent pairs that have compatible gender
![Page 194: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/194.jpg)
194
Observation
� Now we have an algorithm that learns to partition
� Can we further improve?
� Recall that to compute each pairwise feature, we sum the
values of the pairwise feature over the coreferent pairs
� E.g., number of coreferent pairs that have compatible gender
� We are learning a partition from all coreferent pairs
� But … learning from all coreferent pairs is hard
� Some coreferent pairs are hard to learn from
� And … we don’t need to learn from all coreferent pairs
� We do not need all coreferent pairs to construct a partition
![Page 195: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/195.jpg)
195
Observation
� To construct a coreference partition, we need to construct each coreference cluster
� To construct a coreference cluster with n mentions, we need
only n-1 links
Queen Elizabeth
her
a viable monarch
a renowned speech therapist
speech impediment
husband
King George VI
the King
his
![Page 196: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/196.jpg)
196
Observation
� To construct a coreference partition, we need to construct each coreference cluster
� To construct a coreference cluster with n mentions, we need
only n-1 links
Queen Elizabeth
her
a viable monarch
a renowned speech therapist
speech impediment
husband
King George VI
the King
his
![Page 197: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/197.jpg)
197
husband
King George VI the King his
Observations
� There are many ways links can be chosen
husband
King George VI
the King
his
husband
King George VI
the King his
husband
King George VI his
the King
…
![Page 198: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/198.jpg)
198
Coreference Tree
husband
King George VI
the King his
Queen Elizabeth
her
a viable monarch
a renowned speech therapist
speech impediment
Root
![Page 199: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/199.jpg)
199
Coreference Tree
� A coreference tree is an equivalent representation of a coreference partition
� Latent tree-based approach (Fernandes et al., 2012)
� Learn coreference trees rather than coreference partitions
� But … many coreference trees can be created from one
coreference partition … which one should we learn?
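The partition-from-tree direction of the equivalence is easy to see in code: whichever n−1-edge tree we pick for a cluster, unioning the endpoints of its non-Root edges recovers the same partition. A small sketch using the mentions from the earlier slides:

```python
# Sketch: recover the coreference partition from a coreference tree.
# Any valid tree over the same clusters yields the same partition.

def partition_from_tree(mentions, edges):
    """edges: (parent, child) pairs; parent may be 'ROOT'."""
    parent = {m: m for m in mentions}

    def find(m):
        while parent[m] != m:
            m = parent[m]
        return m

    for p, c in edges:
        if p != "ROOT":                 # edges from Root start new entities
            parent[find(c)] = find(p)

    clusters = {}
    for m in mentions:
        clusters.setdefault(find(m), []).append(m)
    return list(clusters.values())

mentions = ["Queen Elizabeth", "her", "husband",
            "King George VI", "the King", "his"]
edges = [("ROOT", "Queen Elizabeth"), ("Queen Elizabeth", "her"),
         ("ROOT", "husband"), ("husband", "King George VI"),
         ("King George VI", "the King"), ("the King", "his")]
print(partition_from_tree(mentions, edges))
# [['Queen Elizabeth', 'her'], ['husband', 'King George VI', 'the King', 'his']]
```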
![Page 200: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/200.jpg)
200
The Structured Perceptron Learning
Algorithm
� Initialize w
� Loop
for each training example xi
(1) predict the class of xi using the current w
(2) if the predicted class is not equal to the correct class
w ← w + F(yi) – F(yi’) // update the weights
until convergence
xi = document w/ correct partition yi
predict the most probable
partition yi’
![Page 201: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/201.jpg)
201
The Structured Perceptron Learning
Algorithm
� Initialize w
� Loop
for each training example xi
(1) predict the class of xi using the current w
(2) if the predicted class is not equal to the correct class
w ← w + F(yi) – F(yi’) // update the weights
until convergence
predict the most
probable tree yi’
xi = document w/ correct tree yi
![Page 202: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/202.jpg)
202
The Structured Perceptron Learning
Algorithm
� Initialize w
� Loop
for each training example xi
(1) predict the class of xi using the current w
(2) if the predicted class is not equal to the correct class
w ← w + F(yi) – F(yi’) // update the weights
until convergence
predict the most
probable tree yi’
� How can we predict most prob. tree yi’ using the current w?
xi = document w/ correct tree yi
![Page 203: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/203.jpg)
203
The Structured Perceptron Learning
Algorithm
� Initialize w
� Loop
for each training example xi
(1) predict the class of xi using the current w
(2) if the predicted class is not equal to the correct class
w ← w + F(yi) – F(yi’) // update the weights
until convergence
predict the most
probable tree yi’
� How can we predict most prob. tree yi’ using the current w?
Method 1:
� For each possible tree t, compute w • F(t)
� Select the t with the largest w • F(t)
xi = document w/ correct tree yi
![Page 204: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/204.jpg)
204
The Structured Perceptron Learning
Algorithm
� Initialize w
� Loop
for each training example xi
(1) predict the class of xi using the current w
(2) if the predicted class is not equal to the correct class
w ← w + F(yi) – F(yi’) // update the weights
until convergence
predict the most
probable tree yi’
� How can we predict most prob. tree yi’ using the current w?
Method 1:
� For each possible tree t, compute w • F(t)
� Select the t with the largest w • F(t)
xi = document w/ correct tree yi
each value in F(yi’) is the
sum of the feature values
over all the edges in yi’
205
The Structured Perceptron Learning Algorithm
� How can we predict the most probable tree yi’ using the current w?
Method 2:
� Run the Chu-Liu/Edmonds algorithm to find the maximum spanning tree
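In the coreference setting, the maximum-spanning-tree search of Method 2 simplifies: every candidate antecedent of a mention precedes it (plus an artificial root for discourse-new mentions), so picking the best incoming edge for each mention independently can never create a cycle, and no general Chu-Liu/Edmonds run is needed. A minimal sketch of this special case, with a caller-supplied edge-scoring function (`best_tree` and `score` are illustrative names, not from the tutorial):

```python
# Sketch: highest-scoring antecedent tree under the current weights.
# Mentions are indexed 1..n; index 0 is an artificial root meaning
# "starts a new cluster". Since every candidate antecedent of mention i
# has a smaller index, independent per-mention argmax yields a tree.

def best_tree(n, score):
    """score(j, i) -> weight of the edge from candidate antecedent j
    (0 = root) to mention i; returns `parent` with parent[i] = j."""
    parent = [0] * (n + 1)                        # parent[0] unused
    for i in range(1, n + 1):
        parent[i] = max(range(i), key=lambda j: score(j, i))
    return parent

# Toy usage: 3 mentions, edge weights w . F(edge) precomputed in a table.
w_edges = {(0, 1): 0.5, (0, 2): -1.0, (1, 2): 2.0,
           (0, 3): 1.0, (1, 3): 0.2, (2, 3): 0.4}
tree = best_tree(3, lambda j, i: w_edges[(j, i)])
# tree == [0, 0, 1, 0]: mention 2 links to mention 1, the rest are new
```

For graphs without this left-to-right restriction, the full Chu-Liu/Edmonds algorithm would be required.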
206
The Structured Perceptron Learning Algorithm
� Each xi is labeled with correct tree yi. Since many correct trees can be created from a partition, which one should be used?
� Heuristically?
� Better: select the correct yi with the largest w • F(yi), i.e., the correct tree that is best given the current model
� A different correct tree will be selected in each iteration
211
The Latent Structured Perceptron Learning Algorithm (Joachims & Yu, 2009)
� Initialize w
� Loop
for each training example xi
(0) select the correct tree yi using the current w
(1) predict the most probable tree yi’ of xi using the current w
(2) if the predicted tree is not equal to the correct tree
w ← w + F(yi) – F(yi’) // update the weights
until convergence
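The loop above can be sketched in a few lines. `all_trees`, `correct_trees`, and `feature_vec` (F in the slides) are hypothetical helpers supplied by the caller: the first two enumerate candidate trees and gold-consistent trees for a document, and the third sums per-edge feature values over a tree. A sketch, not the authors' implementation:

```python
# Sketch of the latent structured perceptron: both the predicted tree
# and the "correct" tree are re-selected with the current weights w
# in every iteration.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def latent_structured_perceptron(docs, all_trees, correct_trees,
                                 feature_vec, dim, epochs=10):
    w = [0.0] * dim
    for _ in range(epochs):                       # loop "until convergence"
        for doc in docs:
            # (0) select the correct tree yi with the largest w . F(yi)
            yi = max(correct_trees(doc), key=lambda t: dot(w, feature_vec(t, doc)))
            # (1) predict the most probable tree yi' using the current w
            yp = max(all_trees(doc), key=lambda t: dot(w, feature_vec(t, doc)))
            # (2) if the predicted tree differs: w <- w + F(yi) - F(yi')
            if yp != yi:
                fi, fp = feature_vec(yi, doc), feature_vec(yp, doc)
                w = [wk + a - b for wk, a, b in zip(w, fi, fp)]
    return w

# Toy usage: one document, two candidate trees, 2-dim features.
fv = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
w = latent_structured_perceptron(
    docs=["d"],
    all_trees=lambda d: ["b", "a"],
    correct_trees=lambda d: ["a"],
    feature_vec=lambda t, d: fv[t],
    dim=2, epochs=3)
# after one mistake-driven update, w prefers the correct tree "a"
```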
213
What’s left?
� Recall that …
� Humans are better at combining features than machine learners
� Better feature induction methods for combining primitive
features into more powerful features?
� The latent tree-based model employs feature induction
� Entropy-based feature induction
� given the same training set used to train a mention-pair model,
train a decision tree classifier
214
[Decision tree figure: a learned tree over the mention-pair features STR-MATCH, GENDER, NUMBER, ALIAS, J-PRONOUN, I-PRONOUN, APPOSITIVE, and DISTANCE, with +/– branches at each node (DISTANCE splits on > 0 vs ≤ 0) and coreferent (+) / non-coreferent (–) leaves]
� Generate feature combinations using all paths of all possible lengths starting from the root node
224
Feature Induction
� Use the resulting feature combinations, rather than the original features, to represent a training/test instance
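The path-based induction can be sketched as follows: represent the learned decision tree as nested nodes, then emit every root-anchored prefix path as one conjoined feature. The feature names and the two-level tree below are illustrative, not the actual tree learned on the CoNLL data:

```python
# Sketch: turn every path of every length starting at the root of a
# decision tree into a conjoined feature. The tree is a nested tuple:
# (feature, {outcome: subtree-or-leaf}); leaves are class labels.

def induce_combinations(node, prefix=()):
    if not isinstance(node, tuple):      # leaf: class label, nothing to emit
        return []
    feat, branches = node
    combos = []
    for outcome, child in branches.items():
        path = prefix + ((feat, outcome),)
        combos.append("&".join(f"{f}={o}" for f, o in path))
        combos.extend(induce_combinations(child, path))
    return combos

# Illustrative two-level tree (not the real CoNLL tree):
tree = ("STR-MATCH", {"+": "+",
                      "-": ("ALIAS", {"+": "+", "-": "-"})})
print(induce_combinations(tree))
# ['STR-MATCH=+', 'STR-MATCH=-', 'STR-MATCH=-&ALIAS=+', 'STR-MATCH=-&ALIAS=-']
```

Each emitted string then becomes one binary feature of the instance representation.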
225
Evaluation
� Corpus
� Training: CoNLL-2012 shared task training corpus
� Test: CoNLL-2012 shared task test corpus
� Scoring programs: recall, precision, F-measure
� MUC (Vilain et al., 1995)
� B3 (Bagga & Baldwin, 1998)
� CEAFe (Luo, 2005)
� CoNLL (unweighted average of MUC, B3 and CEAFe F-scores)
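Of the three metrics, MUC is the simplest to state: recall counts, for each key entity, the coreference links needed to re-join the pieces into which the response splits it, and precision swaps the two roles. A sketch of this link-based definition (mention ids and the toy example are illustrative):

```python
# Sketch of the MUC link-based metric (Vilain et al., 1995).
# Each side is a list of clusters (sets of mention ids).

def muc_recall(key, response):
    num = den = 0
    for s in key:
        # pieces of s induced by the response clusters ...
        parts = sum(1 for r in response if s & r)
        # ... plus one singleton piece per mention absent from the response
        parts += len(s - set().union(*response)) if response else len(s)
        num += len(s) - parts
        den += len(s) - 1
    return num / den if den else 0.0

def muc_f1(key, response):
    r = muc_recall(key, response)
    p = muc_recall(response, key)   # precision = recall with roles swapped
    return 2 * p * r / (p + r) if p + r else 0.0

# Toy: one key entity {1,2,3,4}; the response splits it in two.
key = [{1, 2, 3, 4}]
response = [{1, 2}, {3, 4}]
# recall = (4 - 2) / (4 - 1) = 2/3; precision = 1; F = 0.8
print(round(muc_f1(key, response), 3))
```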
226
Results (Closed Track)
� Usual caveat
� Mention detection performance could have played a role
System                      English   Chinese   Arabic   Avg
Rank 1: Latent Trees          63.4      58.5     54.2    58.7
Rank 2: Mention-Pair          61.2      60.0     53.6    58.3
Rank 3: Multi-Pass Sieves     59.7      62.2     47.1    56.4
227
Latent Tree-Based Approach: Main Ideas
� Represent a coreference partition using a tree
� Avoid learning from the hard coreference pairs
� Allow gold tree to change in each perceptron learning iteration
� Reduce number of candidate trees using Stanford sieves
� Use feature induction to better combine available features
� Which of them are effective?
228
Recent Models
� Revival of mention-ranking models
� Durrett & Klein (2013): “There is no polynomial-time dynamic
program for inference in a model with arbitrary entity-level
features, so systems that use such features make decisions in
a pipelined manner and sticking with them, operating greedily
in a left-to-right fashion or in a multi-pass, sieve-like manner”
� Two recent mention-ranking models
� Durrett & Klein (EMNLP 2013)
� Wiseman et al. (ACL 2015)
230
Durrett & Klein (EMNLP 2013)
� Log linear model of the conditional distribution P(a|x)
[equation image omitted in the transcript; schematically, P(a|x) ∝ exp( Σi w · f(i, ai, x) ), where x is the document context, the sum ranges over all mentions, i indexes the ith mention, ai is the antecedent chosen for mention i, and a is the vector of antecedents chosen for the document, one per mention]
� Similar to a mention-ranking model, except that it is trained to jointly maximize the likelihood of selecting all antecedents
238
Durrett & Klein (EMNLP 2013)
� Goal
� maximize the likelihood of the antecedent vector
� Problem
� a mention may have more than one antecedent, so which one
should we use for training?
� Solution
� sum over all antecedent structures licensed by gold clusters
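The "sum over all antecedent structures licensed by gold clusters" can be made concrete: for each mention, the licensed antecedents are the preceding members of its gold cluster (or the NEW choice if it starts the cluster), and the per-mention log-likelihood is a difference of two log-sum-exps. A sketch for a single document with hypothetical precomputed scores (unregularized, not D&K's actual code):

```python
import math

# Sketch: the marginalized training objective. scores[i] maps each
# candidate choice of mention i -- a preceding mention index, or None
# for "NEW" -- to a model score; cluster[i] is mention i's gold cluster.

def log_sum_exp(xs):
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def gold_log_likelihood(scores, cluster):
    ll = 0.0
    for i, s in enumerate(scores):
        # antecedent structures licensed by the gold clusters: any
        # preceding mention in the same gold cluster, else NEW
        allowed = [j for j in s if j is not None and cluster[j] == cluster[i]]
        if not allowed:
            allowed = [None]
        ll += log_sum_exp([s[j] for j in allowed])   # licensed probability mass
        ll -= log_sum_exp(list(s.values()))          # normalizer over all choices
    return ll

# Toy: two mentions of the same entity, uniform scores.
scores = [{None: 0.0}, {None: 0.0, 0: 0.0}]
ll = gold_log_likelihood(scores, cluster=[7, 7])
# mention 1 must pick mention 0 out of two equal options: ll = -log 2
```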
239
Durrett & Klein (EMNLP 2013)
� Log linear model of the conditional distribution P(a|x)
� Log likelihood function
[equation image omitted in the transcript; schematically, ℓ(w) = Σt log Σa∈A(Ct) P(a|xt) − λ‖w‖1, where w holds the weight parameters, the outer sum ranges over all training docs, the inner sum ranges over all antecedent structures consistent with the gold clusters, and the last term is an L1 regularizer]
251
The Loss Function
[equation image omitted in the transcript; the loss compares the predicted antecedent vector a against the gold clusters C and charges a separate, tunable penalty for each of three error types: False Anaphoric (linking a mention that should start a new entity), False New (marking an anaphoric mention as new), and Wrong Link (linking an anaphoric mention to the wrong antecedent)]
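The three error types can be sketched directly: given a predicted antecedent vector and the gold clusters, each mention's decision is either correct or one of False Anaphoric, False New, and Wrong Link, each with its own weight. The alpha values below are illustrative placeholders, not the tutorial's tuned constants:

```python
# Sketch: mention-level error types behind the loss. a[i] is the
# predicted antecedent of mention i (None = discourse-new);
# cluster[i] is the gold cluster id of mention i.

def error_type(i, a, cluster):
    gold_antecedents = [j for j in range(i) if cluster[j] == cluster[i]]
    if a[i] is None:
        return None if not gold_antecedents else "false_new"
    if not gold_antecedents:
        return "false_anaphoric"     # linked a mention that starts an entity
    return None if cluster[a[i]] == cluster[i] else "wrong_link"

ALPHA = {"false_anaphoric": 0.1, "false_new": 3.0, "wrong_link": 1.0}

def loss(a, cluster, alpha=ALPHA):
    return sum(alpha[e] for i in range(len(a))
               if (e := error_type(i, a, cluster)) is not None)

# Toy: mentions 0,1,2; gold clusters {0,2} and {1}.
cluster = [0, 1, 0]
print(loss([None, 0, 1], cluster))   # m1 is false anaphoric, m2 a wrong link
```

Because False New is weighted most heavily here, the model is pushed hardest to avoid missing links, mirroring the role such weights play in trading precision against recall.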
259
Surface Features
� Features computed on each of the two mentions
� mention type (pronoun, name, nominal)
� complete string, semantic head
� first word, last word, preceding word, following word
� length (in words)
� Features computed based on both mentions
� Exact string match, head match
� Distance in number of sentences and number of mentions
� Feature conjunctions
� Attach to each feature the mention type (or the pronoun itself if
it’s a pronoun)
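The feature list above is simple enough to sketch end to end. Mentions here are plain dicts with illustrative fields (`words`, `type`, `sent`); this is a hedged sketch of the recipe, not D&K's feature extractor:

```python
# Sketch: surface features for an (antecedent, anaphor) mention pair.

def mention_features(m, role):
    w = m["words"]
    return [f"{role}_first={w[0]}", f"{role}_last={w[-1]}",
            f"{role}_len={len(w)}", f"{role}_type={m['type']}"]

def pair_features(ante, ana):
    feats = mention_features(ante, "ante") + mention_features(ana, "ana")
    feats.append(f"exact_match={' '.join(ante['words']) == ' '.join(ana['words'])}")
    feats.append(f"sent_dist={ana['sent'] - ante['sent']}")
    # conjunctions: attach the anaphor's mention type,
    # or the pronoun itself if the anaphor is a pronoun
    tag = ana["words"][0].lower() if ana["type"] == "pronoun" else ana["type"]
    feats += [f + "&ana=" + tag for f in list(feats)]
    return feats

barack = {"words": ["Barack", "Obama"], "type": "name", "sent": 0}
he = {"words": ["he"], "type": "pronoun", "sent": 1}
print(pair_features(barack, he)[:3])
```

Note how the first/last-word features stand in for definiteness and the conjunctions stand in for agreement checks, which is exactly the "implicit capture" argument made on the next slide.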
260
Results on the CoNLL-2011 test set
� Easy victories
� Using surface features (rather than standard coreference
features) allows their system to outperform the state of the art
261
Why?
� D&K’s explanation
� These surface features do capture the same phenomena as
standard coreference features, just implicitly
� Examples
� Rather than using rules targeting person, number, or gender of
mentions, they use conjunctions of pronoun identity
� Rather than using a feature encoding definiteness, the first
word of a mention would capture this
� Rather than encoding grammatical role (subject/object), such
information can be inferred from the surrounding words
262
Recent Models
� Second of the two recent mention-ranking models: Wiseman et al. (ACL 2015)
263
Wiseman et al. (ACL 2015)
• Observation: recent mention-ranking models are all linear; why not train a non-linear mention-ranking model?
• Recap: scoring function for linear mention-ranking models (shown as a figure), scoring an anaphor against each candidate antecedent
• Another way of expressing the scoring function:
  • for a non-null antecedent: weights applied to features on the anaphor and to features on both the anaphor and the candidate antecedent
  • for the null antecedent: weights applied to features on the anaphor only
  • in both cases, defined over raw/unconjoined features
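Since the scoring function itself appears only as a figure, the decomposition into an anaphor-only term (null antecedent) and an anaphor+pair term (non-null antecedent) can be sketched in code. This is a minimal illustration, not Wiseman et al.'s implementation; the feature extractors `phi_a`/`phi_p` and the weights are hypothetical stand-ins.

```python
import numpy as np

def score(x, y, w_a, w_p, phi_a, phi_p):
    """Linear mention-ranking score: features on the anaphor alone
    when y is the null antecedent, anaphor + pair features otherwise."""
    if y is None:                       # null antecedent: x starts a new entity
        return w_a @ phi_a(x)
    return w_a @ phi_a(x) + w_p @ phi_p(x, y)

def resolve(x, candidates, w_a, w_p, phi_a, phi_p):
    """Pick the highest-scoring option: a candidate antecedent or None."""
    options = [None] + list(candidates)
    return max(options, key=lambda y: score(x, y, w_a, w_p, phi_a, phi_p))
```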
275
Raw/Unconjoined Features
� Mention’s head, complete string, first word, last word, …
� Researchers don’t seem to like them
� they almost always use conjoined features
� created by hand or obtained via feature induction
� can add some non-linearity to the linear model
� Why?
• Wiseman et al. empirically showed that raw/unconjoined features are not predictive for the coreference task
276
Wiseman et al.'s Proposal
• Learn feature representations that are useful for the task
• Scoring function for linear mention-ranking model (shown as a figure)
• Scoring function for Wiseman's neural net-based model (shown as a figure)
  • The network's parameters can learn feature combinations
  • Option 1: let g be the identity function
    • Neural net model is the same as the linear model except that it is defined over non-linear feature representations
  • Option 2: let g be a non-linear function; it then functions as an additional hidden layer
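The neural scoring function was also shown as a figure; a rough sketch of the idea follows. It is not Wiseman et al.'s code: the weight shapes are illustrative, and `g` is the output-layer function from the two options above (identity for Option 1, a non-linearity such as tanh for Option 2).

```python
import numpy as np

def neural_score(phi_a_x, phi_p_xy, W_a, W_p, u, g=np.tanh):
    """Sketch of neural mention ranking: learn hidden representations
    of the raw anaphor and pair features, then score them with an
    output layer g."""
    h_a = np.tanh(W_a @ phi_a_x)    # learned representation of anaphor features
    h_p = np.tanh(W_p @ phi_p_xy)   # learned representation of pair features
    return float(u @ g(np.concatenate([h_a, h_p])))
```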
287
Wiseman et al.'s Proposal
� Pros
� Can learn (non-linear) feature representations from raw features
� Don’t have to conjoin features by hand
� Cons
� Training a non-linear model is more difficult than training a linear model
� Model performance sensitive to weight initializations
288
Training
1. Train two neural nets separately
� One for anaphoricity and one for coreference
2. Use the weight parameters learned as initializations for the combined neural net
� Objective function similar to Durrett & Klein’s, except that it
is based on margin loss rather than log loss
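The margin-based objective mentioned above can be sketched as a hinge loss over antecedent scores. This is a simplified form for illustration, not Wiseman et al.'s exact slack-rescaled objective: the best gold antecedent must outscore every non-gold candidate by a margin.

```python
def margin_loss(scores, gold, delta=1.0):
    """Hinge-style ranking loss: zero when the best gold antecedent
    beats all non-gold candidates by at least delta.

    scores: dict mapping each candidate antecedent to its score
    gold:   set of gold antecedents for this anaphor
    """
    best_gold = max(scores[y] for y in gold)
    return max([0.0] + [delta + scores[y] - best_gold
                        for y in scores if y not in gold])
```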
289
Results on CoNLL 2012 test set
290
Wiseman et al. (NAACL 2016)
� Improved their neural net model by incorporating entity-level information
� CoNLL score increased by 0.8
� State-of-the-art results on the English portion of the CoNLL
2012 test data
� Software available from the Harvard NLP group page
291
Unsupervised Models
� EM
� Cherry & Bergsma (2005), Charniak & Elsner (2009)
� Clustering
� Cardie & Wagstaff (1999), Cai & Strube (2010)
� Nonparametric models
� Haghighi & Klein (2007, 2010)
� Markov Logic networks
� Poon & Domingos (2008)
� Bootstrapping
� Kobdani et al. (2011)
292
Plan for the Talk
� Part I: Background
� Task definition
� Why coreference is hard
� Applications
� Brief history
� Part II: Machine learning for coreference resolution
� System architecture
� Computational models
� Resources and evaluation (corpora, evaluation metrics, …)
� Employing semantics and world knowledge
� Part III: Solving hard coreference problems
� Difficult cases of overt pronoun resolution
� Relation to the Winograd Schema Challenge
293
Semantics and World Knowledge
� Coreference resolution is considered one of the most difficult tasks in NLP in part because of its reliance on sophisticated
knowledge sources
� The importance of semantics and world knowledge in
coreference resolution has long been recognized
� Hobbs (1976)
� Syntactic approach (the naïve algorithm)
� Semantic approach
� Shift in research trends
� Knowledge-rich approaches (1970s and 1980s)
� Knowledge-lean approaches (1990s)
� Knowledge-rich approaches (2000 onwards)
294
Semantics and World Knowledge
� Have researchers been successful in employing semantics and world knowledge to improve learning-based coreference
resolution systems?
� Are these features useful in the presence of morpho-
syntactic (knowledge-lean, robustly computed) features?
295
Soon et al. (2001)
� One of the first learning-based coreference systems
� Mention-pair model trained using C4.5
� 12 features
� Mostly morpho-syntactic features
� One semantic feature computed based on WordNet (1st sense)
� U if one/both mentions have undefined WordNet semantic class
� T if the two have the same WordNet semantic class
� F otherwise
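The three-valued feature just described can be sketched as follows. The toy `SEM_CLASS` map stands in for a real WordNet first-sense lookup, and the mention strings are invented for illustration.

```python
# Toy semantic-class lookup standing in for WordNet's first-sense class.
SEM_CLASS = {"Mr. Lim": "person", "he": "person", "IBM": "organization"}

def wn_class_feature(m1, m2):
    """Soon et al.-style feature: U if either mention's class is
    undefined, T if the two classes match, F otherwise."""
    c1, c2 = SEM_CLASS.get(m1), SEM_CLASS.get(m2)
    if c1 is None or c2 is None:
        return "U"
    return "T" if c1 == c2 else "F"
```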
� No separate evaluation of how useful the WordNet semantic feature is, but…
296
Decision Tree Learned for MUC-6 Data
[Decision tree shown as a figure; it branches only on STR-MATCH, GENDER, NUMBER, ALIAS, J-PRONOUN, I-PRONOUN, APPOSITIVE, and DISTANCE — the WordNet semantic feature does not appear]
Semantic features not useful???
298
Ng & Cardie (ACL 2002)
� A large-scale expansion of the features used by Soon et al.
� 53 lexical, grammatical, and semantic features
299
Four WordNet-based Semantic Features
• Whether mj is the closest preceding mention that has the same WordNet semantic class as mk
• Whether the two mentions have an ancestor-descendent relationship in WordNet; if yes,
  • Encode the sense numbers in WordNet that give rise to this ancestor-descendent relationship
  • Compute the distance between the two WordNet synsets
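The ancestor-descendent check and synset distance can be sketched over a toy hypernym map; the entries below are invented stand-ins for WordNet's hierarchy, not Ng & Cardie's actual feature code.

```python
# Toy hypernym chains standing in for WordNet.
HYPERNYM = {"dog": "canine", "canine": "animal", "cat": "feline",
            "feline": "animal", "animal": "entity"}

def chain(word):
    """The word plus all of its ancestors in the toy hierarchy."""
    out = [word]
    while out[-1] in HYPERNYM:
        out.append(HYPERNYM[out[-1]])
    return out

def ancestor_features(w1, w2):
    """(has ancestor-descendent relation, #edges between the two)."""
    c1 = chain(w1)
    if w2 in c1:
        return True, c1.index(w2)
    c2 = chain(w2)
    if w1 in c2:
        return True, c2.index(w1)
    return False, None
```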
300
Evaluation
• No separate evaluation of how useful the WordNet semantic features are, but…
• A hand-selected subset of the features was used to train a mention-pair model that yielded better performance
  • This subset did not include any of these 4 semantic features
Semantic features not useful???
302
Kehler et al. (NAACL 2004)
� Approximate world knowledge using predicate-argument statistics
� forcing_industry is a more likely verb-object combination in
naturally-occurring data than forcing_initiative or forcing_edge
� Such predicate-argument (e.g., subject-verb, verb-object) statistics can be collected from a large corpus
� TDT-2 corpus (1.32 million subj-verb; 1.17 million verb-obj)
He worries that Trump’s initiative would push his industry
over the edge, forcing it to shift operations elsewhere.
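The idea of preferring the antecedent with the stronger predicate-argument statistic can be sketched as a lookup over mined verb-object counts. The counts below are invented for illustration, standing in for statistics collected from a corpus such as TDT-2.

```python
from collections import Counter

# Illustrative verb-object counts (made-up numbers).
VERB_OBJ = Counter({("force", "industry"): 30,
                    ("force", "initiative"): 2,
                    ("force", "edge"): 1})

def pred_arg_preference(verb, candidates):
    """Rank candidate antecedents of a pronoun in object position by
    how often each heads an object of this verb in the corpus."""
    return max(candidates, key=lambda c: VERB_OBJ[(verb, c)])
```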
303
Kehler et al. (NAACL 2004)
� Goal: examine whether predicate-argument statistics, when encoded as features, can improve a pronoun resolver
trained on state-of-the-art morpho-syntactic features
� Morpho-syntactic features
� Gender agreement
� number agreement
� Distance between the two mentions
� Grammatical role of candidate antecedent
� NP form (def, indef, pronoun) of the candidate antecedent
� Train a mention-pair model using maximum entropy
� Evaluate on the ACE 2003 corpus
304
Results
[Results table shown as a figure]
Semantic features not useful???
306
Error Analysis
� MaxEnt selected the temerity
� Predicate-argument statistics selected the endowment
� Need a better way to exploit the statistics?
After the endowment was publicly excoriated for having
the temerity to award some of its money to art that
addressed changing views of gender and race, …
307
Error Analysis
� MaxEnt selected the supporters
� So did the predicate-argument statistics
The dancers were joined by about 70 supporters as they
marched around a fountain not far from the mayor’s office.
308
Kehler et al.’s Observations
� In the cases in which statistics reinforced a wrong answer, no manipulation of features can rescue the prediction
• For the cases in which statistics could help, their successful use will depend on the existence of a formula that can capture these cases without changing the predictions for examples that the model currently classifies correctly
� Conclusion: predicate-argument statistics are a poor substitute for world knowledge, and more to the point, they
do not offer much predictive power to a state-of-the-art morphosyntactically-driven pronoun resolution system
309
Yang et al. (ACL 2005)
� Goal: examine whether predicate-argument statistics, when encoded as features, can improve a pronoun resolver
trained on state-of-the-art morpho-syntactic features
� But… with the following differences in the setup
� Web as corpus (to mitigate data sparsity)
� Pairwise ranking model
� Baseline morpho-syntactic features defined on m_i and m_j
� DefNP_i, Pro_i, NE_i, SameSent, NearestNP_i, Parallel_i,
FirstNPinSent_i, Reflexive_j, NPType_j
� Learner: C4.5
� Corpus: MUC-6 and MUC-7
310
Results

                                   Neutral Pro.    Personal Pro.      Overall
                                   Corp    Web     Corp    Web      Corp    Web
  Baseline                         73.9            91.9             81.9
  Baseline + pred-arg statistics   76.7    79.2    91.4    91.9     83.3    84.8

Semantic features are useful!!!
312
Ponzetto & Strube (NAACL 2006)
� Goal: improve learning-based coreference resolution by exploiting three knowledge sources
� WordNet
� Wikipedia
� Semantic role labeling
313
Using WordNet
� Motivation:
� Soon et al. employed a feature that checks whether two
mentions have the same WordNet semantic class
� Noisy: had problems with coverage and sense proliferation
� Solution: measure the similarity between the WordNet
synsets of the two mentions using six similarity measures
� 3 path-length based measures
� 3 information-content based measures
� Two features
� Highest similarity score over all senses and all measures
� Average similarity score over all senses and all measures
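The two features reduce to a max and an average over a grid of similarity scores. In a sketch, the scores can be taken as given; the numbers in the test are invented, standing in for the six WordNet measures' outputs over sense pairs.

```python
def wn_similarity_features(scores):
    """Two features from Ponzetto & Strube's WordNet setup:
    highest and average similarity over all sense pairs (rows)
    and all similarity measures (columns)."""
    flat = [s for per_pair in scores for s in per_pair]
    return max(flat), sum(flat) / len(flat)
```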
314
Using Wikipedia
• Could also resolve the celebrity using syntactic parallelism, but
  • heuristics are not always accurate
  • this does not mimic the way humans look for antecedents
� Use world knowledge extracted from Wikipedia
Martha Stewart is hoping people don’t run out on her.
The celebrity indicted on charges stemming from …
![Page 315: CoreferenceResolution: Successes and Challenges2 Plan for the Talk Part I: Background Task definition Why coreference is hard Applications Brief history Part II: Machine learning for](https://reader031.vdocuments.mx/reader031/viewer/2022011823/5ed4743c64cb9d0fda746f66/html5/thumbnails/315.jpg)
315
Using Wikipedia
316–319
Using Wikipedia
� Given mentions mi and mj, retrieve the Wiki pages they refer to, Pi and Pj, with titles mi and mj (or their heads)
� Create features for coreference resolution
  � Features based on the first paragraph of the Wiki page
    � Whether Pi’s first paragraph contains mj
    � Create an analogous feature by reversing the roles of mi & mj
  � Features based on the hyperlinks of the Wiki page
    � Whether at least one hyperlink in Pi contains mj
    � Create an analogous feature by reversing the roles of mi & mj
  � Features based on the Wiki categories
    � Whether the categories Pi belongs to contain mj (or its head)
    � Create an analogous feature by reversing the roles of mi & mj
  � Features based on overlap of first paragraphs
    � Overlap score between the first paragraphs of the two Wiki pages
  � Highest & average relatedness score of all category pairs formed from the categories associated with the two Wiki pages
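These feature groups can be sketched in Python. Everything here is a toy stand-in: the `WikiPage` container, the substring tests, and the Dice-style overlap score are illustrative, not the paper's exact implementation.

```python
from dataclasses import dataclass

@dataclass
class WikiPage:
    # Hypothetical container for the parts of a Wiki page the features need.
    title: str
    first_paragraph: str
    hyperlinks: list
    categories: list

def overlap_score(p1, p2):
    # Toy token-overlap (Dice) score between two paragraphs.
    t1, t2 = set(p1.lower().split()), set(p2.lower().split())
    return 2 * len(t1 & t2) / (len(t1) + len(t2)) if (t1 or t2) else 0.0

def wiki_features(mi, mj, pi, pj):
    """Pairwise features mirroring the groups on the slide; each
    directional feature also has its role-reversed twin."""
    contains = lambda texts, m: any(m.lower() in t.lower() for t in texts)
    return {
        "para_i_contains_j": mj.lower() in pi.first_paragraph.lower(),
        "para_j_contains_i": mi.lower() in pj.first_paragraph.lower(),
        "link_i_contains_j": contains(pi.hyperlinks, mj),
        "link_j_contains_i": contains(pj.hyperlinks, mi),
        "cat_i_contains_j": contains(pi.categories, mj),
        "cat_j_contains_i": contains(pj.categories, mi),
        "para_overlap": overlap_score(pi.first_paragraph, pj.first_paragraph),
    }
```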
320
Semantic Role Labeling (SRL)
� Knowing that “program trading” is the PATIENT of the “decry” predicate and “it” is the PATIENT of “denounce” could trigger the (semantic-parallelism-based) inference
Peter Anthony decries program trading as “limiting the
game to a few,” but he is not sure whether he wants to
denounce it because …
321
Semantic Role Labeling (SRL)
� Use the ASSERT semantic parser to identify all verb predicates in a sentence and their semantic arguments
� Each argument is labeled with its PropBank-style semantic role
� ARG1, …, ARGn
� Two SRL features
� The role-predicate pair of mention mi
� The role-predicate pair of mention mj
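These two features could be read off an SRL parse as sketched below; the `(predicate, {role: argument})` parse format is an invented simplification (ASSERT's real output format differs).

```python
def srl_features(mention_i, mention_j, frames):
    """frames: list of (predicate, {role: argument_text}) pairs from an SRL
    parse. Each feature is the mention's role-predicate pair, or None if the
    mention is not an argument of any predicate."""
    def role_pred(mention):
        for predicate, args in frames:
            for role, text in args.items():
                if mention in text:
                    return (role, predicate)
        return None
    return {"rolepred_i": role_pred(mention_i), "rolepred_j": role_pred(mention_j)}
```

On the decry/denounce sentence above, the two mentions come out as (ARG1, decry) and (ARG1, denounce), exposing the semantic parallelism.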
322
Experimental Setup
� Baseline
� mention-pair model trained with Soon et al.’s 12 features using
MaxEnt
� ACE 2003 (Broadcast News + Newswire)
� Evaluation metric: MUC scorer
324
Results
Semantic features are useful!!!
325
Rahman & Ng (ACL 2011)
� Goal: improve learning-based coreference resolution by exploiting two knowledge sources
� YAGO
� FrameNet
326
YAGO (Suchanek et al., 2007)
� contains 5 million facts derived from Wikipedia and WordNet
� each fact is a triple describing a relation between two NPs
� <NP1, rel, NP2>, rel can be one of 90 YAGO relation types
� focuses on two types of YAGO relations: TYPE and MEANS (Bryl et al., 2010; Uryupina et al., 2011)
� TYPE: the IS-A relation
� <AlbertEinstein, TYPE, physicist>
<BarackObama, TYPE, president>
� MEANS: addresses synonymy and ambiguity
� <Einstein, MEANS, AlbertEinstein>,
<Einstein, MEANS, AlfredEinstein>
� provide evidence that the two NPs involved are coreferent
327
Why YAGO?
� combines the information in Wikipedia and WordNet
� can resolve the celebrity to Martha Stewart
� neither Wikipedia nor WordNet alone can
� How to use YAGO to resolve?
1. Heuristically maps each Wiki category in the Wiki page for
Martha Stewart to its semantically closest WordNet synset
� AMERICAN TELEVISION PERSONALITIES → synset for sense #2 of personality
2. Realizes personality is a direct hyponym of celebrity in WordNet
3. Extracts the fact <MarthaStewart, TYPE, celebrity>
Martha Stewart is hoping people don’t run out on her.
The celebrity indicted on charges stemming from …
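Steps 1–3 can be sketched with toy stand-ins for the category→synset mapping and the WordNet hypernym graph (the real mapping is heuristic and covers all of Wikipedia and WordNet):

```python
# Toy stand-ins for the two resources the derivation consults.
CATEGORY_TO_SYNSET = {"American television personalities": "personality.n.02"}
DIRECT_HYPERNYM = {"personality.n.02": "celebrity.n.01"}

def derive_type_fact(entity, wiki_categories):
    """Map a Wiki category to a synset (step 1), follow its direct
    hypernym (step 2), and emit an <entity, TYPE, noun> fact (step 3)."""
    for cat in wiki_categories:
        synset = CATEGORY_TO_SYNSET.get(cat)
        if synset in DIRECT_HYPERNYM:
            noun = DIRECT_HYPERNYM[synset].split(".")[0]
            return (entity, "TYPE", noun)
    return None
```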
328
Using YAGO for Coreference Resolution
� create a new feature for the mention-pair model whose value is 1 if the two NPs are in a TYPE or MEANS relation, and 0 otherwise
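A minimal sketch of that binary feature, with a toy fact set standing in for the real YAGO triples:

```python
# Toy stand-in for YAGO: facts are (NP1, relation, NP2) triples.
FACTS = {
    ("AlbertEinstein", "TYPE", "physicist"),
    ("Einstein", "MEANS", "AlbertEinstein"),
    ("MarthaStewart", "TYPE", "celebrity"),
}

def yago_feature(np1, np2, facts=FACTS):
    """1 if the two NPs stand in a TYPE or MEANS relation
    (in either direction), else 0."""
    for rel in ("TYPE", "MEANS"):
        if (np1, rel, np2) in facts or (np2, rel, np1) in facts:
            return 1
    return 0
```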
329
FrameNet (Baker et al., 1998)
� A lexico-semantic resource focused on semantic frames
� A frame contains
� the lexical predicates that can invoke it
� the frame elements (i.e., the semantic roles)
� E.g., the JUDGMENT_COMMUNICATION frame describes
situations in which a COMMUNICATOR communicates a judgment of an EVALUEE to an ADDRESSEE
� frame elements: COMMUNICATOR, EVALUEE, ADDRESSEE
� lexical predicates: acclaim, accuse, decry, denounce, slam, …
331
Motivating Example
� To resolve it to Peter Anthony, it may be helpful to know
� decry and denounce are “semantically related”
� the two mentions have the same semantic role
Peter Anthony decries program trading as “limiting the
game to a few,” but he is not sure whether he wants to
denounce it because …
� Missing from Ponzetto & Strube’s application of semantic roles
� We model this using FrameNet
334
Observation
� Features encoding
� the semantic roles of the two NPs under consideration
� whether the associated predicates are “semantically related”
could be useful for identifying coreference relations.
Use ASSERT
� Provides PropBank-style roles (Arg0, Arg1, …)
Use FrameNet
� Checks whether the two predicates appear in the same frame
335
Results on ACE 2005 and OntoNotes
� Baseline: mention-pair model trained with 39 features from Rahman & Ng (2009)
                                          ACE            OntoNotes
                                        B3   CEAF       B3   CEAF
Baseline                               62.4  60.0      53.3  51.5
Baseline + YAGO types                  63.1  62.8      54.6  52.8
Baseline + YAGO types & means          63.6  63.2      55.0  53.3
Baseline + YAGO types & means & FN     63.8  63.1      55.2  53.4
336
Difficulty in Exploiting World Knowledge
� World knowledge extracted from YAGO is noisy
� Numerous entries may exist for a particular mention, all but one of which are irrelevant
337
Ratinov & Roth (EMNLP 2012)
� Goal: Improve their learning-based multi-pass sieve approach using world knowledge extracted from Wikipedia
� Extract for each mention knowledge attributes from Wiki
� Find the Wiki page the mention refers to using their context-sensitive entity linking system, which could reduce noise
� Extract from the retrieved page three attributes
� (1) gender; (2) nationality; (3) fine-grained semantic categories
� Create features from the extracted attributes
� Whether the two mentions are mapped to the same Wiki page
� Agreement w.r.t. gender, nationality, and semantic categories
� Augment each sieve’s feature set with these new features
338
Results on ACE 2004 Newswire texts
� System shows a minimum improvement of 3 (MUC), 2 (B3), and 1.25 (CEAF) F1 points on gold mentions
� Not always considered an acceptable evaluation setting
� Improvements on gold mentions do not necessarily imply
improvements on automatically extracted mentions
339
Durrett & Klein (EMNLP 2013)
� Using only surface features, their log-linear model achieved state-of-the-art results on the CoNLL-2011 test set
� Can performance be further improved with semantics?
� Derive semantic features from four sources
� WordNet hypernymy and synonymy
� Number and gender for names and nominals
� Named entity types
� Latent clusters computed from English Gigaword
� Each element in a cluster is a nominal head together with the
conjunction of its verbal governor and its semantic role
341
Results on CoNLL-2011 Test Set
Semantic features not useful???
343
Results on CoNLL-2011 Test Set
� D&K’s explanation
� only a small fraction of the mention pairs are coreferent
� A system needs very strong evidence to overcome the default
hypothesis that a pair of mentions is not coreferent
� The weak (semantic) indicators of coreference will likely have
high false positive rates, doing more harm than good
� Our explanation
� It’s harder to use semantics to improve a strong baseline
� D&K’s semantic features are shallow semantic features
345
Other (Failed) Attempts
� Stanford’s multi-pass sieve system
� best system in the CoNLL 2011 shared task
� extended their system with 2 new sieves that exploit semantics
from WordNet, Wikipedia infoboxes, and Freebase records
� the semantics sieves didn’t help
� Sapena et al.’s (2013) system
� 2nd best system in the CoNLL 2011 shared task
� Proposed a constraint-based hypergraph partitioning approach
� Used info extracted from Wikipedia as features/constraints
� Conclusion: the problem seems to lie with the extracted info, which is biased in favor of famous/popular entities (and includes false positives), leaving an imbalance against entities with little or no info
346
Plan for the Talk
� Part I: Background
� Task definition
� Why coreference is hard
� Applications
� Brief history
� Part II: Machine learning for coreference resolution
� System architecture
� Computational models
� Resources and evaluation (corpora, evaluation metrics, …)
� Employing semantics and world knowledge
� Part III: Solving hard coreference problems
� Difficult cases of overt pronoun resolution
� Relation to the Winograd Schema Challenge
347
Hard-to-resolve Definite Pronouns
� Resolve definite pronouns for which traditional linguistic constraints on coreference and commonly-used resolution
heuristics would not be useful
348
A Motivating Example (Winograd, 1972)
� The city council refused to give the demonstrators a permit
because they feared violence.
� The city council refused to give the demonstrators a permit because they advocated violence.
349
Another Motivating Example (Hirst, 1981)
� When Sue went to Nadia’s home for dinner, she served
sukiyaki au gratin.
� When Sue went to Nadia’s home for dinner, she ate sukiyaki au gratin.
350
Another Example
� James asked Robert for a favor, but he refused.
� James asked Robert for a favor, but he was refused.
351
Yet Another Example
� Sam fired Tom but he did not regret doing so.
� Sam fired Tom although he is diligent.
353
Focus on certain kinds of sentences
� The target pronoun should
� appear in a sentence that has two clauses with a discourse
connective, where the first clause contains two candidate
antecedents and the second contains the pronoun
� agree in gender, number, semantic class with both candidates
� We ensure that each sentence has a twin. Two sentences are twins if
� their first clauses are the same
� they have lexically identical pronouns with different antecedents
When Sue went to Nadia’s home for dinner, she served sukiyaki au gratin.
When Sue went to Nadia’s home for dinner, she ate sukiyaki au gratin.
Rahman & Ng EMNLP’12
354
Dataset
� 941 sentence pairs composed by 30 students who took my undergraduate machine learning class in Fall 2011
355
Our Approach: Ranking
� Create one ranking problem from each sentence
� Each ranking problem consists of two instances
� one formed from the pronoun and the first candidate
� one formed from the pronoun and the second candidate
� Goal: train a ranker that assigns a higher rank to the instance having the correct antecedent for each ranking problem
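A linear ranker over such two-instance problems can be sketched as follows; this is a minimal perceptron-style sketch with dict-based feature vectors, not the learner actually used in the system.

```python
def rank(problem, weights):
    """problem: one feature dict per (pronoun, candidate) instance.
    Returns the index of the instance a linear model scores highest."""
    score = lambda feats: sum(weights.get(f, 0.0) * v for f, v in feats.items())
    return max(range(len(problem)), key=lambda i: score(problem[i]))

def ranker_update(problem, gold, weights, lr=1.0):
    """One perceptron-style training step: if the ranker prefers the wrong
    instance, shift weight mass toward the gold instance's features."""
    pred = rank(problem, weights)
    if pred != gold:
        for f, v in problem[gold].items():
            weights[f] = weights.get(f, 0.0) + lr * v
        for f, v in problem[pred].items():
            weights[f] = weights.get(f, 0.0) - lr * v
    return weights
```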
356
Eight Components for Deriving Features
� Narrative chains
� Search engine (Google)
� FrameNet
� Semantic compatibility
� Heuristic polarity
� Machine-learned polarity
� Connective-based relations
� Lexical features
357
Narrative Chains (Chambers & Jurafsky, 2008)
� Narrative chains are learned versions of scripts
� Scripts represent knowledge of stereotypical event sequences
that can aid text understanding
� Reach restaurant, waiter seats you, gives you a menu, order food, …
� Partially ordered sets of events centered around a protagonist
� e.g., borrow-s invest-s spend-s pay-s raise-s lend-s
� Someone who borrows something may invest, spend, pay, or
lend it
� can contain a mix of “s” (subject role) and “o” (object role)
� e.g., the restaurant script
365
How can we apply narrative chains for
pronoun resolution?
Ed punished Tim because he tried to escape.
� Employ narrative chains to heuristically predict the antecedent
1) Find the event in which the pronoun participates and its role
� “he” participates in the “try” and “escape” events as a subject
2) Find the event(s) in which the candidates participate
� Both candidates participate in the “punish” event
3) Pair each candidate event with each pronoun event
� Two pairs are created: (punish, try-s), (punish, escape-s)
4) For each pair, extract chains containing both elements in pair
� One chain is extracted, which contains punish-o and escape-s
5) Obtain role played by pronoun in the candidate’s event: object
6) Find the candidate that plays the extracted role: Tim
� Creates a binary feature that encodes this heuristic decision
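Steps 1–6 above can be sketched as a lookup over typed events. The chain contents and the input format are toy stand-ins for the learned Chambers & Jurafsky chains:

```python
# Toy narrative chain: typed events, where "-s" = subject role, "-o" = object role.
CHAINS = [{"punish-o", "escape-s", "arrest-o"}]

def predict_antecedent(pronoun_events, candidate_events, candidates, chains=CHAINS):
    """pronoun_events: typed events the pronoun participates in (step 1),
    e.g. ["try-s", "escape-s"].
    candidate_events: untyped events the candidates participate in (step 2).
    candidates: {"s": subject_candidate, "o": object_candidate}."""
    for cand_ev in candidate_events:          # step 3: pair candidate/pronoun events
        for pron_ev in pronoun_events:
            for chain in chains:              # step 4: chains containing both elements
                if pron_ev not in chain:
                    continue
                for role in ("s", "o"):       # step 5: role the chain gives cand_ev
                    if f"{cand_ev}-{role}" in chain:
                        return candidates[role]   # step 6: candidate with that role
    return None
```

On the example, the extracted chain pairs escape-s with punish-o, so the heuristic resolves “he” to the object of “punish”, Tim.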
367
Search Engine (Google)
Lions eat zebras because they are predators.
1) Replace the target pronoun with a candidate antecedent
Lions eat zebras because lions are predators.
Lions eat zebras because zebras are predators.
2) Generate search queries based on lexico-syntactic patterns
   • Four search queries for this example: "lions are", "zebras are", "lions are predators", "zebras are predators"
3) Create features that compare the query counts obtained for the two candidate antecedents
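A minimal sketch of the three steps, with one big assumption: real web hit counts are replaced by a made-up `HIT_COUNTS` table, since a live search-engine query is outside the scope of a sketch.

```python
# Hypothetical hit counts standing in for search-engine results.
HIT_COUNTS = {
    "lions are": 900_000, "zebras are": 400_000,
    "lions are predators": 50_000, "zebras are predators": 2_000,
}

def query_features(candidates, patterns):
    """For each pattern, substitute each candidate and record which one
    yields the higher count -- the comparison used as a feature."""
    feats = {}
    for pattern in patterns:
        counts = {c: HIT_COUNTS.get(pattern.format(X=c), 0) for c in candidates}
        feats[pattern] = max(counts, key=counts.get)
    return feats

feats = query_features(["lions", "zebras"], ["{X} are", "{X} are predators"])
```

Under these counts both patterns favor "lions", matching the intended antecedent of "they".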
370
FrameNet
John killed Jim, so he was arrested.
• Both candidates are names, so search queries won't return useful counts.
• Solution: before generating search queries, replace each name with its FrameNet semantic role
   • "John" with "killer", "Jim" with "victim"
   • Search "killer was arrested", "victim was arrested", …
371
Semantic Compatibility
• Same as the Search Engine component, except that query counts are obtained from the Google Gigaword corpus
377
Heuristic Polarity
Ed was defeated by Jim in the election although he is more popular.
Ed was defeated by Jim in the election because he is more popular.
• Use polarity information to resolve target pronouns in sentences that involve comparison
1) Assign rank values to the pronoun and the two candidates
   • In the first sentence, "Jim" is better, "Ed" is worse, "he" is worse
   • In the second sentence, "Jim" is better, "Ed" is worse, "he" is better
2) Resolve the pronoun to the candidate that has the same rank value as the pronoun
• Create features that encode this heuristic decision and the rank values
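The two steps can be sketched for this sentence pair. The trigger logic ("defeated by" ranks Jim above Ed; "although" flips the pronoun's clause, "because" does not) is a simplified assumption, not the tutorial's exact rule set.

```python
def rank_values(connective):
    # "Ed was defeated by Jim" ranks Jim above Ed.
    ranks = {"Jim": "better", "Ed": "worse"}
    # "although" signals contrast with the main clause, "because" agreement,
    # so "he is more popular" attaches to the lower- or higher-ranked entity.
    ranks["he"] = "worse" if connective == "although" else "better"
    return ranks

def resolve(ranks):
    # Step 2: resolve to the candidate sharing the pronoun's rank value.
    return next(c for c in ("Ed", "Jim") if ranks[c] == ranks["he"])
```

With "although" the pronoun resolves to Ed; with "because", to Jim, as on the slide.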
379
Machine-Learned Polarity
• Hypothesis: rank values could be computed more accurately by employing a sentiment analyzer that can capture contextual information
• Same as Heuristic Polarity, except that OpinionFinder (Wilson et al., 2005) is used to compute rank values
383
Connective-Based Relations
Google bought Motorola because they are rich.
• To resolve "they", we:
1) Count the number of times the triple <"buy", "because", "rich"> appears in the Google Gigaword corpus
2) If the count exceeds a threshold, resolve the pronoun to the candidate that has the same deep grammatical role as the pronoun
3) Generate a feature based on this heuristic resolution decision
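A sketch of the three steps, assuming a made-up triple-count table and an arbitrary threshold in place of real Gigaword statistics.

```python
# Hypothetical corpus counts for <verb, connective, predicate> triples.
TRIPLE_COUNTS = {("buy", "because", "rich"): 120}
THRESHOLD = 100

def resolve_by_connective(triple, pronoun_role, candidates_by_role):
    """candidates_by_role maps deep grammatical roles to candidates,
    e.g. {'subject': 'Google', 'object': 'Motorola'}."""
    if TRIPLE_COUNTS.get(triple, 0) > THRESHOLD:
        # Resolve to the candidate sharing the pronoun's deep grammatical role.
        return candidates_by_role.get(pronoun_role)
    return None  # no decision; a feature encodes the outcome either way

antecedent = resolve_by_connective(
    ("buy", "because", "rich"), "subject",
    {"subject": "Google", "object": "Motorola"},
)
```

"they" is the deep subject of "are rich" and "Google" is the subject of "bought", so the heuristic picks "Google".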
386
Lexical Features
• Exploit information in the coreference-annotated training texts
• Antecedent-independent features
   • Unigrams
   • Bigrams (pairing the word before the connective with the word after it)
   • Trigrams (augmenting each bigram with the connective)
• Antecedent-dependent features
   • Pair a candidate's head word with
      • its governing verb
      • its modifying adjective
      • the pronoun's governing verb
      • the pronoun's modifying adjective
387
Evaluation
• Dataset
   • 941 annotated sentence pairs (70% training; 30% testing)
388
Results
                    |   Unadjusted Scores     |    Adjusted Scores
                    | Correct  Wrong  No Dec. | Correct  Wrong  No Dec.
Stanford            |  40.07   29.79   30.14  |  55.14   44.86    0.00
Baseline Ranker     |  47.70   47.16    5.14  |  50.27   49.73    0.00
Combined resolver   |  53.49   43.12    3.39  |  55.19   44.77    0.00
Our system          |  73.05   26.95    0.00  |  73.05   26.95    0.00
• Evaluation metrics: percentages of target pronouns that are
   • correctly resolved
   • incorrectly resolved
   • unresolved (no decision)
394
• Unadjusted Scores
   • Raw scores computed from a resolver's output
• Adjusted Scores
   • "Force" a resolver to resolve every pronoun by probabilistically assuming that it gets half of the unresolved pronouns right
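The adjustment is a one-line computation: half of the "no decision" mass is credited as correct and half as wrong, so every system is scored as if it resolved every pronoun.

```python
def adjust(correct, wrong, no_dec):
    # Split the unresolved percentage evenly between correct and wrong.
    return (round(correct + no_dec / 2, 2), round(wrong + no_dec / 2, 2), 0.0)

# Stanford's unadjusted row 40.07 / 29.79 / 30.14 becomes 55.14 / 44.86 / 0.00
stanford_adjusted = adjust(40.07, 29.79, 30.14)
```

This reproduces the Adjusted Scores columns of the results table from the Unadjusted ones.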
397
• Three baseline resolvers
   • Stanford resolver (Lee et al., 2011)
   • Baseline Ranker: same as our ranking approach, except that the ranker is trained using the 39 features from Rahman & Ng (2009)
   • Combined resolver: combines Stanford and the Baseline Ranker; the Baseline Ranker is used only when Stanford can't make a decision
399
• Stanford outperforms the Baseline Ranker
• The Combined resolver does not outperform Stanford
• Our system outperforms Stanford by 18 accuracy points
400
Ablation Experiments
• Remove each of the 8 components one at a time
• Accuracy drops significantly (paired t-test, p < 0.05) after each component is removed
• Most useful: Narrative Chains, Google, Lexical Features
• Least useful: FrameNet, Learned Polarity
401
Peng et al. (2015)
• An alternative approach to address this task
• Define two types of predicate schemas
• Collect statistics for instantiated schemas from different knowledge sources: Gigaword, Wikipedia, and the Web
402
Motivating Example
The bee landed on the flower because it had pollen.
• To correctly resolve "it", we need to know:
   S(have(m=[the flower], a=pollen)) > S(have(m=[the bee], a=pollen))
• Corresponding predicate schema: S(pred_m(m, a))
403
Motivating Example
The bird perched on the limb and it bent.
• To correctly resolve "it", we need to know:
   S(bend(m=[the limb], a=*)) > S(bend(m=[the bird], a=*))
• Corresponding predicate schema: S(pred_m(m, *))
404
Predicate Schema Type 1
• We saw S(pred_m(m, a)) and S(pred_m(m, *))
• More generally, we also need S(pred_m(a, m)) and S(pred_m(*, m))
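A Type 1 schema lookup can be sketched as follows. The scores S(.) are hypothetical stand-in counts; Peng et al. collect the real statistics from Gigaword, Wikipedia, and the Web.

```python
# Made-up counts for S(pred_m(m, a)): how often the mention m fills the
# subject slot of pred with argument a.
S = {
    ("have", "flower", "pollen"): 250,
    ("have", "bee", "pollen"): 10,
}

def prefer(pred, candidates, arg):
    # Resolve the pronoun to the candidate m maximizing S(pred_m(m, a)).
    return max(candidates, key=lambda m: S.get((pred, m, arg), 0))

antecedent = prefer("have", ["bee", "flower"], "pollen")
```

Since S(have(flower, pollen)) > S(have(bee, pollen)), "it" resolves to the flower, as in the motivating example.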
408
Motivating Example
Ed was afraid of Tim because he gets scared around new people.
• To correctly resolve "he", we need to know:
   S(be afraid of(m, a), because, get scared(m, a')) > S(be afraid of(a, m), because, get scared(m, a'))
• Corresponding predicate schemas:
   S(pred1_m(m, a), dc, pred2_m(m, a'))
   S(pred1_m(a, m), dc, pred2_m(m, a'))
411
Predicate Schema Type 2
• So far, we have
   S(pred1_m(m, a), dc, pred2_m(m, a'))
   S(pred1_m(a, m), dc, pred2_m(m, a'))
• More generally, we also need:
   S(pred1_m(m, a), dc, pred2_m(a', m))
   S(pred1_m(a, m), dc, pred2_m(a', m))
   S(pred1_m(m, *), dc, pred2_m(*, m))
   S(pred1_m(*, m), dc, pred2_m(*, m))
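A Type 2 schema compares two predicates linked by a discourse connective. In this sketch the scores decide which argument slot of the first predicate the pronoun should fill; the counts are hypothetical stand-ins for corpus statistics.

```python
# Made-up counts for S(pred1(., .), dc, pred2(m, a')), keyed by which slot
# of pred1 the shared mention m occupies.
S2 = {
    (("be afraid of", "m", "a"), "because", ("get scared", "m", "a'")): 80,
    (("be afraid of", "a", "m"), "because", ("get scared", "m", "a'")): 30,
}

def choose_slot(pred1, dc, pred2):
    subj = S2.get(((pred1, "m", "a"), dc, (pred2, "m", "a'")), 0)
    obj = S2.get(((pred1, "a", "m"), dc, (pred2, "m", "a'")), 0)
    # A higher subject-slot count means the pronoun corefers with pred1's subject.
    return "subject" if subj > obj else "object"

slot = choose_slot("be afraid of", "because", "get scared")
```

Here the subject slot wins, so "he" resolves to Ed, the subject of "was afraid of".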
414
Collecting Statistics for the Schemas
• … from Gigaword, Wikipedia, and the Web
415
Using the Statistics
• As features and/or as constraints for their coreference resolver
416
Peng et al.’s results on our dataset
417
Summary
• There has been a recent surge of interest in these hard, but incredibly interesting, pronoun resolution tasks
• They could serve as an alternative to the Turing Test (Levesque, 2011)
• This challenge is known as the Winograd Schema Challenge
   • Announced as a shared task at AAAI 2014
   • Sponsored by Nuance
• Additional test cases are available from Ernest Davis' website: https://www.cs.nyu.edu/davise/papers/WS.html
418
Challenges
420
Challenges: New Models
� Can we jointly learn coreference resolution with other tasks?
� Exploit cross-task constraints to improve model learning
� Durrett & Klein (2014): jointly learn coreference with two tasks
� Named entity recognition (coarse semantic typing)
� Entity linking (matching to Wikipedia entities)
using a graphical model, encoding soft constraints in factors
� Use semantic info in Wikipedia for better semantic typing
� Use semantic types to disambiguate tricky Wikipedia links
� Ensure consistent type predictions across coreferent mentions
� …
• Hajishirzi et al. (2013): jointly learn coreference w/ entity linking
• Can we jointly learn entity coreference with event coreference?
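One of the cross-task constraints above ("ensure consistent type predictions across coreferent mentions") can be illustrated with a hard, greedy version of what joint models encode as soft factors. This is only a sketch: `enforce_cluster_types` and its inputs are hypothetical names, not Durrett & Klein's actual system.

```python
from collections import defaultdict

def enforce_cluster_types(clusters, type_scores):
    """For each coreference cluster, pick the single entity type that
    maximizes the summed per-mention scores, and assign it to every
    mention in the cluster (a hard version of the soft consistency
    factors used in joint models)."""
    assignment = {}
    for cluster in clusters:
        totals = defaultdict(float)
        for mention in cluster:
            for etype, score in type_scores[mention].items():
                totals[etype] += score
        best = max(totals, key=totals.get)
        for mention in cluster:
            assignment[mention] = best
    return assignment

# Toy example: "Clinton" and "she" are coreferent; pooling their scores
# overrides the noisy per-mention prediction for "she".
clusters = [["Clinton", "she"]]
scores = {"Clinton": {"PER": 0.9, "ORG": 0.1},
          "she": {"PER": 0.4, "ORG": 0.6}}
print(enforce_cluster_types(clusters, scores))  # {'Clinton': 'PER', 'she': 'PER'}
```

In the full joint model this preference is soft (a factor the inference procedure can override), whereas the sketch above commits to one type per cluster.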
421
Challenges: New Features
• There is a limit on how far one can improve coreference resolution using machine learning methods
• A good model can profitably exploit the available features, but if the knowledge needed is not present in the data, there isn't much that the model can do
• We know that semantics and world knowledge are important
• But it's hard to use them to improve state-of-the-art systems
• Wiseman: learn non-linear representations from raw features
• What if we learn such representations from complex features, including those that encode world knowledge?
• Can we leverage recent advances in distributional lexical semantics (e.g., word embeddings)?
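A minimal sketch of the embedding idea above: the cosine similarity between two mentions' head words, usable as one real-valued feature in a mention-pair model. The 3-dimensional toy vectors are invented for illustration; a real system would load pretrained word2vec or GloVe vectors.

```python
import numpy as np

def embedding_similarity_feature(head1, head2, embeddings):
    """Cosine similarity between the head words of two mentions.
    Returns 0.0 when either head is out of vocabulary."""
    if head1 not in embeddings or head2 not in embeddings:
        return 0.0
    v1, v2 = embeddings[head1], embeddings[head2]
    return float(np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2)))

# Toy 3-dimensional "embeddings" (assumed values, for illustration only).
toy = {"president": np.array([0.9, 0.1, 0.2]),
       "leader":    np.array([0.8, 0.2, 0.1]),
       "banana":    np.array([0.0, 0.9, 0.1])}
print(embedding_similarity_feature("president", "leader", toy) >
      embedding_similarity_feature("president", "banana", toy))  # True
```

The appeal is that such features generalize beyond the lexical pairs seen in the training corpus, which is exactly where hand-built string-match features fall short.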
423
Challenges: New Languages
• Low-resource languages
• Large lexical knowledge bases may not be available
• Can we learn world knowledge from raw text?
• Idea: using appositive constructions
• E.g., Barack Obama, president of the United States, …
• Large coreference-annotated corpora may not be available
• Can we employ weakly supervised learning or active learning?
• Can we exploit resources from a resource-rich language?
• Idea: translation-based coreference annotation projection
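The appositive idea above can be sketched with a crude pattern: a capitalized name followed by a comma-delimited lowercase noun phrase. A real system would use a parser; this regex is only an illustration, and `extract_is_a_pairs` is a hypothetical name.

```python
import re

# Crude pattern for "NAME, description, ..." appositives:
# a capitalized name, then a comma-delimited lowercase phrase.
APPOSITIVE = re.compile(r"([A-Z][a-z]+(?: [A-Z][a-z]+)*), ([a-z][^,.]*),")

def extract_is_a_pairs(text):
    """Mine (name, description) pairs from appositive constructions."""
    return [(m.group(1), m.group(2)) for m in APPOSITIVE.finditer(text)]

pairs = extract_is_a_pairs(
    "Barack Obama, president of the United States, spoke yesterday.")
print(pairs)  # [('Barack Obama', 'president of the United States')]
```

Pairs mined this way supply exactly the kind of is-a knowledge (Obama is a president) that a large lexical knowledge base would otherwise provide.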
425
Translation-Based Projection: Example
1. Machine-translate document from target to source
玛丽告诉约翰她非常喜欢他。
Mary told John that she liked him a lot.
2. Run resolver on the translated document
• to extract mentions and produce coreference chains
3. Project annotations from source back to target
• project mentions
• project coreference chains
[玛丽]告诉[约翰][她]非常喜欢[他]。
Mary told John that she liked him a lot.
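The three projection steps above can be sketched as a pipeline. Everything here is a placeholder: `translate`, `resolve`, and `align` stand in for an MT system, a source-language coreference resolver, and a word aligner, and are not real APIs.

```python
def project_annotations(target_doc, translate, resolve, align):
    """Translation-based annotation projection sketch.
    The three callables are assumed stand-ins for real components."""
    # 1. Machine-translate the target-language document into the source language.
    source_doc = translate(target_doc)
    # 2. Run the source-language resolver to get mentions and chains.
    source_chains = resolve(source_doc)
    # 3. Project each mention back to the target side through the word alignment.
    alignment = align(target_doc, source_doc)  # source word -> target word
    return [[alignment[m] for m in chain if m in alignment]
            for chain in source_chains]

# Toy run on the slide's example sentence, with stubbed-out components.
chains = project_annotations(
    "玛丽告诉约翰她非常喜欢他。",
    translate=lambda d: "Mary told John that she liked him a lot.",
    resolve=lambda d: [["Mary", "she"], ["John", "him"]],
    align=lambda t, s: {"Mary": "玛丽", "John": "约翰", "she": "她", "him": "他"})
print(chains)  # [['玛丽', '她'], ['约翰', '他']]
```

In practice the hard part is step 3: MT and alignment errors mean projected mention boundaries are noisy, so projected annotations are usually treated as weak rather than gold supervision.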
434
Challenges: New Coreference Tasks
• Bridging [non-identity coreference]
• Set-subset relation, part-whole relation
• Event coreference resolution
• Determines which event mentions refer to the same event
• Difficult because for two events to be coreferent, one needs to check whether their arguments/participants are compatible
• Partial coreference relation [non-identity coreference]
• Subevent
• Subevent relations form a stereotypical sequence of events
• e.g., bombing → destroyed → wounding
• Membership
• Multiple instances of the same kind of event
• e.g., I attended three parties last month. The 1st one was the best.
435
Challenges: New Evaluation Metrics
• Designing evaluation metrics is a challenging task
• There are four commonly used coreference evaluation metrics (MUC, B³, CEAF, BLANC), but it's not clear which of them is the best
• Can we trust them? (Moosavi & Strube, 2016)
• Weaknesses
• Linguistically agnostic
• Are all links equally important?
• E.g., 3 mentions: Hillary Clinton, she, she
• System 1: Clinton-she; System 2: she-she
• Hard to interpret the resulting F-scores
• Can the scores tell which aspects of a system can be improved?
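The "are all links equally important?" weakness can be made concrete with a minimal sketch of MUC recall (Vilain et al., 1995), which counts how many links are missing from each key chain. The mention names below are invented for the slide's example.

```python
def muc_recall(key_chains, response_chains):
    """MUC recall sketch: recall = sum(|K| - p(K)) / sum(|K| - 1),
    where p(K) is the number of response partitions that key chain K
    is split into (unresolved mentions count as singleton partitions)."""
    num, den = 0, 0
    for key in key_chains:
        partitions = set()
        for mention in key:
            owner = next((i for i, r in enumerate(response_chains)
                          if mention in r), None)
            partitions.add(owner if owner is not None else ("singleton", mention))
        num += len(key) - len(partitions)
        den += len(key) - 1
    return num / den

# The slide's example: one key chain {Hillary Clinton, she, she}.
key = [["Hillary Clinton", "she1", "she2"]]
sys1 = [["Hillary Clinton", "she1"]]  # links Clinton-she
sys2 = [["she1", "she2"]]             # links she-she
print(muc_recall(key, sys1), muc_recall(key, sys2))  # 0.5 0.5
```

Both systems recover one of the two links, so MUC scores them identically (0.5 recall each) even though System 1's Clinton-she link is arguably the more informative one: the metric is blind to which links matter.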