Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web

Giuseppe Rizzo, Marieke van Erp, Raphaël Troncy
@giusepperizzo @merpeltje @rtroncy


DESCRIPTION

"Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web" talk given at LREC'14, Reykjavik, Iceland

TRANSCRIPT

Page 1: Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web

Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web

Giuseppe Rizzo, Marieke van Erp, Raphaël Troncy

@giusepperizzo @merpeltje @rtroncy

Page 2: Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web


Benchmarking NER & NED

➢ NER
  ➢ [newswire] CoNLL, ACE, MUC
  ➢ [microposts] Microposts Concept Extraction

➢ NED
  ➢ [newswire] TAC KBP
  ➢ [microposts] Microposts NEEL

➢ Numerous academic and commercial NER and NED tools

➢ To name a few: AlchemyAPI, DBpedia Spotlight, GATE, OpeNER, Stanford

Page 3: Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web


This Work

➢ Evaluation and comparison of 11 NER and NED tools through the NERD API

➢ Combination of the 11 NER tools in NERD-ML

➢ Experiments on two types of corpora: newswire and microposts

Page 4: Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web


➢ http://nerd.eurecom.fr

➢ Ontology, REST API & Web Application

➢ Uniform access to 11 NER/NED external tools

  ➢ commercial: AlchemyAPI, dataTXT, OpenCalais, Saplo, TextRazor, Wikimeta, Yahoo!, Zemanta

  ➢ academic: DBpedia Spotlight, Lupedia, THD
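As an illustration of the uniform-access idea, here is a minimal sketch of querying a single REST extraction endpoint from Python. The endpoint path, parameter names, and response fields are assumptions for illustration only, not the documented NERD API; consult http://nerd.eurecom.fr for the real interface.

```python
import requests

# Hypothetical endpoint and parameters: the real NERD API may differ.
NERD_ENDPOINT = "http://nerd.eurecom.fr/api/extraction"  # assumed path

def extract_entities(text, extractor="combined", api_key="YOUR_KEY"):
    """Send text to one REST endpoint and get back a uniform list of
    entities, regardless of which underlying tool produced them."""
    response = requests.post(
        NERD_ENDPOINT,
        data={"text": text, "extractor": extractor, "key": api_key},
    )
    response.raise_for_status()
    # Assumed uniform schema: one dict per entity with a surface form,
    # a type, and a disambiguation URI.
    return response.json()

entities = extract_entities("Reykjavik hosted LREC in 2014.")
for e in entities:
    print(e.get("label"), e.get("nerdType"), e.get("uri"))
```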

Page 5: Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web


Theoretical limit

➢ Each of these systems has its own strengths in entity typing

➢ An ideal combination would pick, for each entity, the best type prediction among all tools

➢ Estimate the upper bound where each type is

t target=select te=tGS(te1

, te2, ... , t en

)
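A concrete reading of this upper bound: an oracle combiner counts an entity as correctly typed if at least one tool predicted its gold-standard type. A minimal sketch with toy data:

```python
def oracle_upper_bound(predictions, gold):
    """predictions: per-entity lists of types, one entry per tool.
    gold: the gold-standard type of each entity.
    Returns the typing accuracy an ideal combiner could reach."""
    correct = sum(
        1 for tool_types, t_gs in zip(predictions, gold)
        if t_gs in tool_types  # at least one tool got the type right
    )
    return correct / len(gold)

# Toy example: 3 entities, 3 tools each.
preds = [["PER", "ORG", "PER"], ["LOC", "LOC", "ORG"], ["MISC", "ORG", "ORG"]]
gold = ["PER", "LOC", "LOC"]
print(oracle_upper_bound(preds, gold))  # 2/3: no tool typed entity 3 as LOC
```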

Page 6: Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web


NERD-ML

➢ Try to perform better than each individual NER tool

➢ Learning from:
  ➢ NERD tool predictions
  ➢ Stanford CRF predictions
  ➢ Linguistic features

➢ Classifiers: Naive Bayes (NB), k-nearest neighbors (k-NN), Support Vector Machines (SVM, RBF kernel) (sketch below)
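For intuition, a minimal sketch of the learning step with scikit-learn, covering the three classifier families named above; the feature encoding is simplified (tool predictions and linguistic features already mapped to numbers) and the data is made up:

```python
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Each row is one token: encoded tool predictions plus linguistic features.
X_train = [[1, 0, 3, 1.0], [2, 2, 0, 0.0], [0, 1, 1, 0.5], [3, 3, 3, 1.0]]
y_train = ["PER", "LOC", "O", "ORG"]

classifiers = {
    "NB": GaussianNB(),
    "k-NN": KNeighborsClassifier(n_neighbors=3),
    "SVM": SVC(kernel="rbf"),  # RBF kernel, as on the slide
}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    print(name, clf.predict([[1, 0, 3, 1.0]]))
```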

Page 7: Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web


Feature Vector

[Diagram: construction of the training vector]

training vector = [ extractor1 type, extractor2 type, ..., extractorN type, linguistic vector, GS type ]
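A sketch of assembling one such training row per token; the helper name is hypothetical, and the token itself is included since the experimental settings later list it as the first feature:

```python
def build_training_vector(token, extractor_types, linguistic_features, gs_type):
    """One training row per token: the token, each extractor's predicted
    type, the linguistic feature vector, and the gold-standard type."""
    return [token] + extractor_types + linguistic_features + [gs_type]

row = build_training_vector(
    "Reykjavik",
    extractor_types=["LOC", "LOC", "ORG"],           # one prediction per extractor
    linguistic_features=["NNP", True, False, 0.11],  # POS, caps flags, cap ratio
    gs_type="LOC",
)
print(row)
```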

Page 8: Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web


Linguistic Features

[Diagram: composition of the linguistic vector]

linguistic vector = [ POS, initial cap (*), all caps (*), capitalized ratio (**), prefix, suffix, begin or end (*) ], computed per token (extraction sketch below)

* Boolean value
** Double value
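A minimal sketch of computing these features for a single token; the exact feature definitions (for example, 3-character affixes) are assumptions:

```python
def linguistic_features(token, pos_tag, position, sentence_length):
    """Compute the linguistic feature vector for one token.
    Feature names follow the slide; precise definitions are assumed."""
    return {
        "pos": pos_tag,
        "initial_cap": token[:1].isupper(),   # Boolean
        "all_caps": token.isupper(),          # Boolean
        "capitalized_ratio": sum(c.isupper() for c in token) / len(token),  # Double
        "prefix": token[:3],
        "suffix": token[-3:],
        "begin_or_end": position == 0 or position == sentence_length - 1,   # Boolean
    }

print(linguistic_features("LREC", "NNP", position=1, sentence_length=5))
```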

Page 9: Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web


Experiments - NER

➢ CoNLL2003 English, testb set [newswire]
  ➢ 231 Articles
  ➢ 46,435 Tokens
  ➢ 5,648 NEs

➢ MSM2013, test set [microposts]
  ➢ 1,450 Posts
  ➢ 29,085 Tokens
  ➢ 1,538 NEs
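For reproducing such counts, a minimal sketch of reading CoNLL-style token-per-line data; the column layout (token first, NE tag last, blank lines between sentences) is the usual CoNLL convention:

```python
def read_conll(path):
    """Yield sentences as lists of (token, ne_tag) pairs from a
    token-per-line file; blank lines separate sentences."""
    sentence = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:  # sentence boundary
                if sentence:
                    yield sentence
                sentence = []
                continue
            cols = line.split()
            if cols[0] == "-DOCSTART-":  # CoNLL document marker
                continue
            sentence.append((cols[0], cols[-1]))  # token and NE tag
    if sentence:
        yield sentence

sentences = list(read_conll("eng.testb"))
print(len(sentences), sum(len(s) for s in sentences))  # sentences, tokens
```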

Page 10: Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web


Results on CoNLL2003

Page 11: Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web


Results on MSM2013

Page 12: Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web


NERD-ML Incremental Learning (1/2): CoNLL2003

Experimental settings:
➢ Feature Vector: token, AlchemyAPI, DBpedia Spotlight, Cicero, Lupedia, OpenCalais, Saplo, Yahoo!, TextRazor, Wikimeta, Stanford, GS type
➢ Classifier = NB

Page 13: Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web


NERD-ML Incremental Learning (2/2): MSM2013

Experimental settings:
➢ Feature Vector: token, pos, initialcaps, allcaps, prefix, suffix, capitalfreq, start, AlchemyAPI, DBpedia Spotlight, Cicero, Lupedia, OpenCalais, TextRazor, Ritter, Stanford, GS type
➢ Classifier = SVM (evaluation sketch below)
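Incremental learning here means growing the feature vector one block at a time and re-training after each addition. A schematic sketch with scikit-learn; the data structures and helper names are hypothetical:

```python
from sklearn.svm import SVC
from sklearn.metrics import f1_score

def incremental_evaluation(block_order, train_blocks, y_train,
                           test_blocks, y_test):
    """Re-train after adding each extractor's feature block and report
    macro F1. *_blocks: dict mapping block name -> per-sample feature lists."""
    used = []
    for name in block_order:
        used.append(name)
        # Concatenate the feature lists of all blocks used so far.
        X_tr = [sum((train_blocks[b][i] for b in used), [])
                for i in range(len(y_train))]
        X_te = [sum((test_blocks[b][i] for b in used), [])
                for i in range(len(y_test))]
        clf = SVC(kernel="rbf").fit(X_tr, y_train)
        score = f1_score(y_test, clf.predict(X_te), average="macro")
        print("+".join(used), round(score, 3))
```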

Page 14: Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web


Experiments - NED

➢ AIDA CoNLL-YAGO links to Wikipedia, testb set [newswire]
  ➢ 231 Articles
  ➢ 46,435 Tokens
  ➢ 4,485 Links

➢ Microposts2014 links to DBpedia, test set [microposts]
  ➢ 1,165 Posts
  ➢ 23,815 Tokens
  ➢ 1,330 Links

Page 15: Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web


Results on AIDA CoNLL-YAGO

Wikipedia is the reference Knowledge Base

Page 16: Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web


Results on Microposts2014

DBpedia v3.9 is the reference Knowledge Base

Page 17: Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web


Discussion NER

➢ Newswire
  ➢ Robust performance on recognizing common types
  ➢ But the MISC class is hard to detect (and perhaps always will be)

➢ Microposts
  ➢ Fairly robust for PER
  ➢ Weak in recognizing LOC and ORG
  ➢ MISC F1 is around 30% (per-type scoring sketch below)
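Per-type figures like these come from class-wise precision, recall, and F1. A minimal sketch with scikit-learn on toy labels:

```python
from sklearn.metrics import classification_report

gold = ["PER", "LOC", "MISC", "ORG", "MISC", "PER"]
pred = ["PER", "ORG", "O",    "ORG", "MISC", "PER"]
# Per-class precision, recall, and F1, as in the NER discussion above.
print(classification_report(gold, pred,
                            labels=["PER", "LOC", "ORG", "MISC"],
                            zero_division=0))
```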

Page 18: Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web


Discussion NED

➢ Newswire
  ➢ Unreliable performance on linking, with a peak F1 of 50.41% for TextRazor
  ➢ Linkers use different reference knowledge bases; link normalization is a source of bias (sketch below)

➢ Microposts
  ➢ Linking shows a big drop in performance
  ➢ TextRazor has the best score, with an F1 of 32.65%
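To illustrate the normalization issue: before scoring, links returned against different knowledge bases must be mapped into one space. A minimal sketch that maps English Wikipedia article URLs onto DBpedia resource URIs via the shared page title; redirect resolution, which a real evaluation also needs, is omitted:

```python
from urllib.parse import unquote

def normalize_link(url):
    """Map a Wikipedia article URL or DBpedia resource URI to a
    canonical DBpedia URI via the shared page title."""
    for prefix in ("http://en.wikipedia.org/wiki/",
                   "https://en.wikipedia.org/wiki/",
                   "http://dbpedia.org/resource/"):
        if url.startswith(prefix):
            title = unquote(url[len(prefix):])
            return "http://dbpedia.org/resource/" + title.replace(" ", "_")
    return url  # unknown KB: left as-is (a real source of scoring bias)

print(normalize_link("http://en.wikipedia.org/wiki/Reykjav%C3%ADk"))
# http://dbpedia.org/resource/Reykjavík
```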

Page 19: Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web


Future Work

➢ NER
  ➢ Improving the taxonomy alignment

➢ NED
  ➢ Better harmonization of the linking stage

➢ NERD-ML
  ➢ Getting closer to the theoretical limit in NER
  ➢ Use of gazetteers for MISC types
  ➢ Combining the outputs of the NEL tools to predict the links

Page 20: Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web


Acknowledgments

The research leading to this paper was partially supported by the European Union’s 7th Framework Programme via the projects LinkedTV (GA 287911) and NewsReader (ICT-316404).

Page 21: Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web


Thank You For Listening

http://www.slideshare.net/giusepperizzo

https://github.com/giusepperizzo/nerdml