Intelligent Database Systems Lab

Presenter: Chang, Chun-Chih

Authors: David Milne*, Ian H. Witten

2012, Artificial Intelligence (AI)

An open-source toolkit for mining Wikipedia

Outlines

• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments

Motivation

• The online encyclopedia Wikipedia is a vast, constantly evolving tapestry of interlinked articles.

• For developers and researchers it represents a giant multilingual database of concepts and semantic relations, a potential resource for natural language processing.

Objectives

• Describe the Wikipedia Miner toolkit, an open-source software system that allows researchers and developers to integrate Wikipedia’s rich semantics into their own applications.

• Wikipedia Miner is intended to be a platform for sharing data mining techniques.

Methodology - Architecture of the Wikipedia Miner toolkit

Methodology - Measuring relatedness between concepts
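
As a rough illustration of measuring relatedness between two articles from their links, the sketch below scores link overlap with a formula modeled on Normalized Google Distance, which is the form Milne and Witten's link-based measure is usually given in. The function name `link_relatedness`, the integer article IDs, and the `total_articles` value are hypothetical; this is not the Wikipedia Miner API, and the toolkit combines several such features rather than relying on one.

```python
import math


def link_relatedness(links_a, links_b, total_articles):
    """Score two articles in [0, 1] from the sets of articles that link to them.

    links_a, links_b: iterables of IDs of articles linking to A and to B
    total_articles:   assumed total number of articles in the collection
    """
    a, b = set(links_a), set(links_b)
    common = a & b
    if not common:
        # No shared links: treat the two articles as unrelated.
        return 0.0

    larger = max(len(a), len(b))
    smaller = min(len(a), len(b))
    if total_articles <= smaller:
        raise ValueError("total_articles must exceed the size of both link sets")

    # Distance modeled on Normalized Google Distance, computed over link sets.
    distance = (math.log(larger) - math.log(len(common))) / (
        math.log(total_articles) - math.log(smaller)
    )
    # Convert distance to relatedness and clamp at zero.
    return max(0.0, 1.0 - distance)


# Toy example: two articles that share two of their four incoming links.
score = link_relatedness({1, 2, 3, 4}, {3, 4, 5, 6}, total_articles=1000)
print(round(score, 3))
```

Articles with identical link sets score 1.0, and articles with no links in common score 0.0.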

Methodology - Features for measuring article relatedness

Experiments - Impact of thresholds for disambiguation and detection

Experiments - Impact of relatedness dependencies

Experiments - Impact of training data

Experiments - Performance of the disambiguator

Experiments - Performance of the detector

Conclusions

• Our aim in releasing this work open source is not to provide a complete and polished product, but rather a resource for the research community to collaborate around and continue building together.

Comments

• Advantages
• Applications
  - Wikipedia
  - Disambiguation
  - Annotation
