Entity-Driven Desiderata
Computational Linguistics: Jordan Boyd-Graber
University of Maryland
COREFERENCE
Adapted from slides by Vincent Ng
Computational Linguistics: Jordan Boyd-Graber | UMD Entity-Driven Desiderata | 1 / 12
What is Coref?
Identify the noun phrases (or entity mentions) that refer to the same real-world entity
Example
Queen Elizabeth set about transforming her husband, King George VI, into a viable monarch. A renowned speech therapist was summoned to help the King overcome his speech impediment ...
• Inherently a transitive clustering task
• Typical reframing: selecting an antecedent for each mention m_j
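A minimal sketch of how the antecedent reframing recovers clusters, assuming each mention m_j has already been given an antecedent index (or None when it starts a new entity). The function name, the five-mention subset, and the antecedent choices are illustrative assumptions, not annotation from the slides. Merging each mention with its antecedent in a union-find structure makes the result transitive by construction.

```python
def clusters_from_antecedents(antecedent):
    """antecedent[j] is the index i < j chosen for mention j, or None for a new entity."""
    parent = list(range(len(antecedent)))

    def find(x):
        # Follow parent pointers to the cluster representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Each antecedent link merges two clusters.
    for j, i in enumerate(antecedent):
        if i is not None:
            parent[find(j)] = find(i)

    groups = {}
    for m in range(len(antecedent)):
        groups.setdefault(find(m), []).append(m)
    return sorted(groups.values())


# Hypothetical subset of the mentions in the example, in document order.
mentions = ["Queen Elizabeth", "her", "King George VI", "the King", "his"]
# Assumed antecedents: her -> Queen Elizabeth, the King -> King George VI, his -> the King.
antecedent = [None, 0, None, 2, 3]

for cluster in clusters_from_antecedents(antecedent):
    print([mentions[m] for m in cluster])
# ['Queen Elizabeth', 'her']
# ['King George VI', 'the King', 'his']
```

Note that "his" is never linked to "King George VI" directly; it ends up in the same cluster only through "the King", which is what makes the task transitive.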
Why it’s hard
• Many sources of information play a role
  ◦ lexical / word: head noun matches (President Clinton = Clinton =? Hillary Clinton)
  ◦ grammatical: number/gender agreement, ...
  ◦ syntactic: syntactic parallelism, binding constraints (John helped himself to... vs. John helped him to...)
  ◦ discourse: discourse focus, salience, recency, ...
  ◦ semantic: semantic class agreement, ...
  ◦ world knowledge
• Not all knowledge sources can be computed easily
Application: Question Answering
Where was Mozart born?
Hobbs' Algorithm
Intuition:
• Start with target pronoun
• Climb parse tree to S root
  ◦ For each NP or S
    ▪ Do breadth-first, left-to-right search of children
    ▪ Restricted to left of target
    ▪ For each NP, check agreement with target
• Repeat on earlier sentences until matching NP found
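The intuition above can be sketched in a few lines. This is a simplified illustration, not the full published procedure: it omits the additional conditions of the complete algorithm, so it would wrongly propose John as the antecedent of him in "John helped him". The `Node` class, the toy gender/number features, and the example trees are all assumptions made for illustration.

```python
from collections import deque


class Node:
    """Toy parse-tree node; gender/number are hypothetical agreement features."""

    def __init__(self, label, children=(), word=None, gender=None, number=None):
        self.label, self.word = label, word
        self.gender, self.number = gender, number
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def agrees(np, pronoun):
    return np.gender == pronoun.gender and np.number == pronoun.number


def bfs_nps(root, stop_before=None):
    """Yield NPs below root breadth-first, left to right.

    If stop_before is given, only children of root to its left are searched.
    """
    kids = root.children
    if stop_before is not None:
        kids = kids[:kids.index(stop_before)]
    queue = deque(kids)
    while queue:
        node = queue.popleft()
        if node.label == "NP":
            yield node
        queue.extend(node.children)


def resolve(pronoun, sentences):
    """sentences: S roots, oldest first; pronoun is a node in the last one."""
    # Climb from the pronoun; at each NP or S, search left of the path just climbed.
    came_from, node = pronoun, pronoun.parent
    while node is not None:
        if node.label in ("NP", "S"):
            for cand in bfs_nps(node, stop_before=came_from):
                if agrees(cand, pronoun):
                    return cand
        came_from, node = node, node.parent
    # Nothing found in the current sentence: try earlier ones, most recent first.
    for root in reversed(sentences[:-1]):
        for cand in bfs_nps(root):
            if agrees(cand, pronoun):
                return cand
    return None


# "King George met Queen Elizabeth. She smiled."
s1 = Node("S", [
    Node("NP", word="King George", gender="m", number="sg"),
    Node("VP", [
        Node("V", word="met"),
        Node("NP", word="Queen Elizabeth", gender="f", number="sg"),
    ]),
])
she = Node("NP", word="She", gender="f", number="sg")
s2 = Node("S", [she, Node("VP", [Node("V", word="smiled")])])

print(resolve(she, [s1, s2]).word)  # Queen Elizabeth
```

Here nothing lies to the left of "She" in its own sentence, so the search falls back to the previous sentence, skips "King George" on the gender check, and returns "Queen Elizabeth".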
Hobbs' Algorithm Example
[Example figure not preserved]
Machine Learning Approach
• Preprocessing
• Mention Detection
• Coreference
Mention detection is not so trivial: the system has to extract the mentions itself (pronouns, names, nominals, nested NPs). Some researchers reported results on gold mentions, not system mentions.
Machine Learning: Pairwise
Features:
• Exact string: are m_i and m_j the same after determiners are removed?
• Grammatical: gender and number agreement
• Semantic: class agreement (country/company)
• Positional: distance between the two mentions
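A minimal sketch of how these four feature families might be computed for one mention pair. The `Mention` fields, the determiner list, and the feature names are assumptions for illustration; a real system would derive gender, number, and semantic class from preprocessing rather than store them by hand.

```python
from dataclasses import dataclass

# Small illustrative determiner list (an assumption, not exhaustive).
DETERMINERS = {"a", "an", "the", "this", "that", "these", "those"}


@dataclass
class Mention:
    text: str
    position: int   # index of the mention in document order
    gender: str     # e.g. "m", "f", "n"
    number: str     # "sg" or "pl"
    sem_class: str  # e.g. "person", "country", "company"


def strip_determiners(text):
    return " ".join(t for t in text.lower().split() if t not in DETERMINERS)


def pair_features(mi, mj):
    """Features for candidate antecedent mi and later mention mj."""
    return {
        "exact_string": strip_determiners(mi.text) == strip_determiners(mj.text),
        "gender_agree": mi.gender == mj.gender,
        "number_agree": mi.number == mj.number,
        "class_agree": mi.sem_class == mj.sem_class,
        "distance": mj.position - mi.position,
    }


m_i = Mention("King George VI", 2, "m", "sg", "person")
m_j = Mention("the King", 3, "m", "sg", "person")
print(pair_features(m_i, m_j))
# {'exact_string': False, 'gender_agree': True, 'number_agree': True,
#  'class_agree': True, 'distance': 1}
```

The pair agrees on everything except the string match, which shows why exact string comparison alone misses many coreferent pairs.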
Problems
• Conflicts
• Constraints
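The slide does not spell out what the conflicts are. One common reading, assumed here, is that independent pairwise decisions can violate transitivity: the classifier links A with B and B with C but rejects A with C. The sketch below flags such triples; the function name and the pairwise labels are hypothetical.

```python
from itertools import combinations


def transitivity_conflicts(n, coreferent):
    """coreferent: set of frozenset({i, j}) pairs labelled positive by a pairwise model.

    Returns every triple of mentions in which exactly two of the three pairs are linked.
    """
    def linked(a, b):
        return frozenset((a, b)) in coreferent

    conflicts = []
    for a, b, c in combinations(range(n), 3):
        if sum([linked(a, b), linked(b, c), linked(a, c)]) == 2:
            conflicts.append((a, b, c))
    return conflicts


mentions = ["President Clinton", "Clinton", "Hillary Clinton"]
# Hypothetical pairwise output: head-noun match links 0-1 and 1-2, but 0-2 is rejected.
positive = {frozenset((0, 1)), frozenset((1, 2))}

print(transitivity_conflicts(len(mentions), positive))  # [(0, 1, 2)]
```

Any clustering step has to resolve such a triple one way or the other, either by merging all three mentions or by dropping one of the links.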
More Advanced Coreference
• Anaphoric classifier
• Rank mentions
• Cluster assignment
• Pipeline approach (Hand-crafted?)
Harder to evaluate!
Possible Projects
• Improve QA (find mentions of candidate answers in Wikipedia)
• Use world knowledge to improve coref
• Better features / representations