INTRODUCTION TO ARTIFICIAL INTELLIGENCE

Massimo Poesio

Supervised Relation Extraction

RE AS A CLASSIFICATION TASK

• Binary relations
• Entities already manually/automatically recognized
• Examples are generated for all sentences with at least 2 entities
• Number of examples generated per sentence is C(N, 2) = N(N-1)/2
– Combination of N distinct entities selected 2 at a time
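
The candidate count above can be checked with a small sketch. It assumes entity mentions are already available as a list of strings; the function name `generate_candidates` and the example mentions are illustrative, not taken from the slides.

```python
from itertools import combinations


def generate_candidates(entities):
    """Return every unordered pair of distinct entity mentions in one sentence."""
    return list(combinations(entities, 2))


# Hypothetical sentence with three recognized entity mentions.
entities = ["John Smith", "Acme Corp", "London"]
candidates = generate_candidates(entities)

# Three entities give C(3, 2) = 3 candidate pairs for the classifier.
for left, right in candidates:
    print(left, "<->", right)
```

Each pair then becomes one example for the classifier, labelled positive if the relation holds between the two entities and negative otherwise.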

GENERATING CANDIDATES TO CLASSIFY

RE AS A BINARY CLASSIFICATION TASK

NUMBER OF CANDIDATES TO CLASSIFY – SIMPLE MINDED VERSION

THE SUPERVISED APPROACH TO RE

• Most current approaches to RE are kernel-based
• Different information is used
– Sequences of words, e.g., through the GLOBAL CONTEXT / LOCAL CONTEXT kernels of Bunescu and Mooney / Giuliano, Lavelli & Romano
– Syntactic information through the TREE KERNELS of Zelenko et al. / Moschitti et al.
– Semantic information in recent work

KERNEL METHODS: A REMINDER

• Embedding the input data in a feature space

• Using a linear algorithm for discovering non-linear patterns

• Coordinates of images are not needed, only pairwise inner products

• Pairwise inner products can be efficiently computed directly from X using a kernel function K:X×X→R
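
As a small illustration of the last two points, the sketch below compares a kernel evaluated directly on the inputs with the inner product of explicitly embedded points. The degree-2 polynomial kernel and the 2-D inputs are chosen only for illustration; they are not the kernels used for RE in these slides.

```python
import math


def dot(u, v):
    """Ordinary inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def explicit_features(x):
    """Degree-2 feature map for a 2-D input: (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def poly2_kernel(x, z):
    """K(x, z) = (x . z)^2, computed without building the feature vectors."""
    return dot(x, z) ** 2


x, z = (1.0, 2.0), (3.0, 4.0)
via_kernel = poly2_kernel(x, z)
via_embedding = dot(explicit_features(x), explicit_features(z))
print(via_kernel, via_embedding)
```

Both routes give the same value, but the kernel never needs the coordinates of the images in feature space.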

MODULARITY OF KERNEL METHODS

THE WORD-SEQUENCE APPROACH

• Shallow linguistic information:
– tokenization
– lemmatization
– sentence splitting
– PoS tagging

Claudio Giuliano, Alberto Lavelli, and Lorenza Romano (2007), FBK-IRST: Kernel methods for relation extraction, Proc. of SemEval-2007

LINGUISTIC REALIZATION OF RELATIONS

Bunescu & Mooney, NIPS 2005

WORD-SEQUENCE KERNELS

• Two families of “basic” kernels
– Global Context
– Local Context
• Linear combination of kernels
• Explicit computation
– Extremely sparse input representation
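
A minimal sketch of explicit computation over sparse vectors and of a linear combination of kernels follows. The slides defining the global and local context features are not reproduced in this transcript, so the feature contents, the names `global` and `local`, the normalization step, and the equal weights are all assumptions made for illustration.

```python
import math
from collections import Counter


def sparse_dot(u, v):
    """Inner product of two sparse vectors stored as Counters (token -> count)."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the smaller vector
    return sum(count * v[token] for token, count in u.items() if token in v)


def normalized_kernel(u, v):
    """Cosine-style normalization so that K(u, u) = 1 for any non-empty u."""
    denom = math.sqrt(sparse_dot(u, u) * sparse_dot(v, v))
    return sparse_dot(u, v) / denom if denom else 0.0


def combined_kernel(ex1, ex2, weights):
    """Weighted sum of one normalized kernel per context type."""
    return sum(w * normalized_kernel(ex1[name], ex2[name])
               for name, w in weights.items())


# Two hypothetical candidate examples, each with a global and a local bag of tokens.
ex1 = {"global": Counter(["interacts", "with"]), "local": Counter(["protein"])}
ex2 = {"global": Counter(["interacts", "with"]), "local": Counter(["gene"])}
weights = {"global": 1.0, "local": 1.0}

score = combined_kernel(ex1, ex2, weights)
print(score)
```

Only the tokens that actually occur are stored, which is what makes explicit computation affordable despite the very large feature space.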

THE GLOBAL CONTEXT KERNEL

THE GLOBAL CONTEXT KERNEL

THE LOCAL CONTEXT KERNEL

LOCAL CONTEXT KERNEL (2)

KERNEL COMBINATION

EXPERIMENTAL RESULTS

• Biomedical data sets
– AIMed
– LLL
• Newspaper articles
– Roth and Yih
• SEMEVAL 2007

EVALUATION METHODOLOGIES

EVALUATION (2)

EVALUATION (3)

EVALUATION (4)

RESULTS ON AIMED

OTHER APPROACHES TO RE

• Using syntactic information
• Using lexical features

Syntactic information for RE

• Pros:
– more structured information useful when dealing with long-distance relations
• Cons:
– not always robust
– (and not available for all languages)

Zelenko et al JMLR 2003

• TREE KERNEL defined over a shallow parse tree representation of the sentences
– approach vulnerable to unrecoverable parsing errors
• data set: 200 news articles (not publicly available)
• two types of relations: person-affiliation and organization-location
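
To give a feel for what a tree kernel computes, here is a simplified subtree-counting kernel in the style of Collins and Duffy. It is not the kernel of Zelenko et al., which is not specified in this transcript; the nested-tuple tree encoding, the function names, and the `decay` parameter are assumptions made for illustration.

```python
def children(node):
    """A node is a tuple: (label, child, child, ...); word leaves are strings."""
    return node[1:]


def production(node):
    """The rule expanding this node, distinguishing words from sub-nodes."""
    return (node[0], tuple(("w", c) if isinstance(c, str) else ("n", c[0])
                           for c in children(node)))


def nodes(tree):
    """Yield every non-leaf node of the tree, root first."""
    yield tree
    for c in children(tree):
        if not isinstance(c, str):
            yield from nodes(c)


def delta(n1, n2, decay=1.0):
    """Weighted count of common tree fragments rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = decay
    for c1, c2 in zip(children(n1), children(n2)):
        if not isinstance(c1, str):  # words already matched via the production
            score *= 1.0 + delta(c1, c2, decay)
    return score


def tree_kernel(t1, t2, decay=1.0):
    """Sum the fragment counts over all pairs of nodes from the two trees."""
    return sum(delta(a, b, decay) for a in nodes(t1) for b in nodes(t2))


dog = ("NP", ("D", "the"), ("N", "dog"))
cat = ("NP", ("D", "the"), ("N", "cat"))
print(tree_kernel(dog, dog), tree_kernel(dog, cat))
```

The kernel value grows with the number of shared fragments, so a single wrong attachment in the parse can remove many shared fragments at once, which is one way to see why parsing errors are hard to recover from.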

ZELENKO ET AL

CULOTTA & SORENSEN 2004

• generalized version of Zelenko’s kernel based on dependency trees (smallest dependency tree containing the two entities of the relation)
• a bag-of-words kernel is used to compensate for syntactic errors
• data set: ACE 2002 & 2003
• results: syntactic information improves performance w.r.t. bag-of-words (good precision but low recall)
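
The "good precision but low recall" pattern can be made concrete with a short sketch. The relation tuples below are invented for illustration and are not results from Culotta & Sorensen; the function name is likewise hypothetical.

```python
def precision_recall_f1(predicted, gold):
    """Score a set of predicted relation instances against the gold standard."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical gold annotations: (entity 1, entity 2, relation type).
gold = {
    ("e1", "e2", "person-affiliation"),
    ("e3", "e4", "person-affiliation"),
    ("e5", "e6", "organization-location"),
    ("e7", "e8", "organization-location"),
}
# A cautious system: everything it predicts is right, but it misses half.
predicted = {
    ("e1", "e2", "person-affiliation"),
    ("e5", "e6", "organization-location"),
}

precision, recall, f1 = precision_recall_f1(predicted, gold)
print(precision, recall, f1)
```

Here precision is perfect while recall is only one half, so the F1 score is pulled well below the precision.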

CULOTTA AND SORENSEN (2)

EVALUATION CAMPAIGNS FOR RE

• Much modern evaluation of methods is done by competing with other teams in evaluation campaigns like MUC and ACE

• Modern evaluation campaigns for RE: SEMEVAL (now *SEM)

• Also interesting to look at the problems of
– DATA CREATION
– EVALUATION METRICS

SEMEVAL 2007

• 4th International Workshop on Semantic Evaluations
• Task 04: Classification of Semantic Relations between Nominals
– organizers: Roxana Girju, Marti Hearst, Preslav Nakov, Vivi Nastase, Stan Szpakowicz, Peter Turney, Deniz Yuret
– 14 participating teams

SEMEVAL 2007: THE RELATIONS

SEMEVAL 2007: DATASET CREATION

SEMEVAL 2007: DATASET CREATION (2)

SEMEVAL 2007 – DATASET CREATION (3)

SEMEVAL 2007 – DATASET CREATION (4)

SEMEVAL 2007: DATASET

SEMEVAL 2007: COMPETITION

SEMEVAL 2007: COMPETITION (2)

SEMEVAL 2007: BEST RESULTS

INFLUENCE OF NER ON RE

INFLUENCE OF NER ON RE (2)

GENERATING CANDIDATES

GENERATING CANDIDATES

GENERATING CANDIDATES

ACKNOWLEDGMENTS

• Many slides borrowed from
– Roxana Girju
– Alberto Lavelli