Ontology-Based Word Sense Disambiguation for Scientific Literature

Upload: exascale-infolab

Post on 10-May-2015

TRANSCRIPT

Page 1: Ontology-Based Word Sense Disambiguation for Scientific Literature

Ontology-based Word Sense Disambiguation for Scientific Literature

Roman Prokofyev, Gianluca Demartini, Philippe Cudre-Mauroux, Alexey Boyarsky and Oleg Ruchayskiy

eXascale Infolab, University of Fribourg, Switzerland

March 25, ECIR 2013, Moscow

Page 2: Ontology-Based Word Sense Disambiguation for Scientific Literature

Problem definition

Example of an ambiguous term (all three senses share the initials SSM): State Space Model, Sequential Standard Model, Supersymmetric Standard Model

•  Machine translation: correct lexical choice.
•  Information retrieval: ambiguity in queries, result diversification, etc.
•  Knowledge extraction: proper text analysis and classification (our case).

Datasets

•  ScienceWISE abstract dataset + SW ontology (http://sciencewise.info)
•  MSH abstract dataset + ontology from bioontology.org

Available at http://exascale.info/papers/ecir2013disambig

Our contribution: leveraging the structure of a community-based ontology to improve identification of the correct sense.

Page 3: Ontology-Based Word Sense Disambiguation for Scientific Literature

Base models

• Concept Context Vectors
  Star formation efficiency (SFE): (Instability, 4), (Supernova, 2), (Milky Way, 3), …

• Document Concept Context Vectors
  1: (Milky way, 1), (Electron neutrino, 1), (Electron antineutrino, 1), …
  2: (Local analysis, 1), (White dwarf, 3), (Poynting-Robertson effect, 1), …

Min distance

Minimum over the ontological paths to other concepts in the document
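The min-distance idea can be illustrated with a minimal Python sketch. It assumes the ontology is an undirected graph stored as an adjacency dictionary and that path length is counted in hops; the toy edges, the function names and the handling of unreachable concepts are hypothetical and not taken from the slides or from the ScienceWISE ontology.

```python
from collections import deque


def hops_from(graph, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def min_distance(graph, candidate, doc_concepts):
    """Smallest hop count from a candidate sense to any other document concept."""
    dist = hops_from(graph, candidate)
    reachable = [dist[c] for c in doc_concepts if c in dist and c != candidate]
    # Assumption: a sense with no path to the document scores infinity.
    return min(reachable) if reachable else float("inf")


def disambiguate(graph, candidates, doc_concepts):
    """Pick the candidate sense closest to the rest of the document."""
    return min(candidates, key=lambda s: min_distance(graph, s, doc_concepts))


# Hypothetical toy ontology, for illustration only.
EDGES = [
    ("Supersymmetric Standard Model", "Supersymmetry"),
    ("Supersymmetry", "Neutralino"),
    ("Neutralino", "Dark matter"),
    ("State Space Model", "Kalman filter"),
    ("Kalman filter", "Time series"),
]
ONTOLOGY = {}
for a, b in EDGES:
    ONTOLOGY.setdefault(a, set()).add(b)
    ONTOLOGY.setdefault(b, set()).add(a)

doc = ["Dark matter", "Neutralino"]
senses = ["State Space Model", "Supersymmetric Standard Model"]
print(disambiguate(ONTOLOGY, senses, doc))
```

In this toy graph the physics sense is two hops from a concept in the document while the statistics sense has no path at all, so the physics sense is selected.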

Page 4: Ontology-Based Word Sense Disambiguation for Scientific Literature

Ontology shortest path

Average distance to other concepts in the document

Nearest neighbors

Co-occurring 1-hop neighbors from the ontology
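A rough Python sketch of these two graph scores follows. It assumes, without confirmation from the slides, that "ontology shortest path" averages hop counts from a candidate sense to the other document concepts and that "nearest neighbors" counts the candidate's direct ontology neighbors that also appear in the document. The toy ontology, the function names and the choice to skip unreachable concepts are hypothetical.

```python
from collections import deque


def hops_from(graph, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def average_distance(graph, candidate, doc_concepts):
    """Mean hop count to the document concepts reachable from the candidate."""
    dist = hops_from(graph, candidate)
    found = [dist[c] for c in doc_concepts if c in dist and c != candidate]
    # Assumption: unreachable concepts are skipped rather than penalised.
    return sum(found) / len(found) if found else float("inf")


def neighbor_overlap(graph, candidate, doc_concepts):
    """Number of 1-hop ontology neighbors that co-occur in the document."""
    return len(set(graph.get(candidate, ())) & set(doc_concepts))


# Hypothetical toy ontology, for illustration only.
EDGES = [
    ("Supersymmetric Standard Model", "Supersymmetry"),
    ("Supersymmetry", "Neutralino"),
    ("Neutralino", "Dark matter"),
    ("State Space Model", "Kalman filter"),
    ("Kalman filter", "Time series"),
]
ONTOLOGY = {}
for a, b in EDGES:
    ONTOLOGY.setdefault(a, set()).add(b)
    ONTOLOGY.setdefault(b, set()).add(a)

doc = ["Kalman filter", "Time series", "Dark matter"]
for sense in ("State Space Model", "Supersymmetric Standard Model"):
    print(sense,
          average_distance(ONTOLOGY, sense, doc),
          neighbor_overlap(ONTOLOGY, sense, doc))
```

A lower average distance and a higher neighbor overlap both favor a sense; here both scores point to the statistics sense for this document.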

Page 5: Ontology-Based Word Sense Disambiguation for Scientific Literature

Graph models evaluation

Approach                    Precision (ScienceWISE)    Precision (MSH)
Min Distance                0.8882                     0.6728
Ontology Shortest Path      0.8646                     0.5677
Nearest neighbors           0.7393                     0.7237

Combined approaches

Approach                    Precision (ScienceWISE)    Precision (MSH)
Naïve Bayes                 0.8513                     0.6731
Binary CCV                  0.9334                     0.9077
+ Ontology Shortest Path    0.9444                     0.8077
+ Nearest neighbors         0.9453                     0.9060