NewsKDD 2014: Crowdsourcing Event Extraction (Poster)

Posted on 29-Nov-2014

DESCRIPTION

Poster for our extended abstract presented at the NewsKDD workshop at the KDD 2014 conference and at the ESWC Summer School 2014, where it won 3rd place for best student poster.

TRANSCRIPT

Funded under: FP7
Area: Language Technologies (ICT-2011.4.2)
Project reference: 288342
Coordinator: Marko Grobelnik

www.xlike.org

Aljaž Košmerlj, Jenya Belyaeva, Gregor Leban, Blaž Fortuna, Marko Grobelnik

Artificial Intelligence Laboratory, Jožef Stefan Institute, Ljubljana, Slovenia

ABSTRACT

We present a system for manually extracting structured event information from freeform newswire text. The extraction is performed on news articles preprocessed by services developed within the XLike project and is guided by suggestions the system produces using machine learning techniques. Results of testing performed using human annotators show the system can produce meaningful data and suggest several avenues for improvement of the system.

INTERFACE

List of articles about the event and a list of entities (i.e. noun phrases) found in the articles.

Type of event described in the articles, defined by the user or selected from suggestions.

List of filled and unfilled roles defined by users for the selected event type.

Entity role selection using a dropdown list, either in the text or in the entity list.
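The annotation interface (an event type, user-defined roles that are either filled or unfilled, and entities chosen from a proposed list) can be pictured as a small data model. The sketch below is only an illustration under assumptions: the class `EventAnnotation`, its method names, and the sample entities are hypothetical and are not taken from the system itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class EventAnnotation:
    """One annotator's annotation of a single event (hypothetical model)."""
    event_type: str        # defined by the user or selected from suggestions
    entities: List[str]    # noun phrases proposed from the event's articles
    # role name -> assigned entity, or None while the role is unfilled
    roles: Dict[str, Optional[str]] = field(default_factory=dict)

    def define_role(self, role: str) -> None:
        """Add a user-defined role; it starts out unfilled."""
        self.roles.setdefault(role, None)

    def assign(self, role: str, entity: str) -> None:
        """Fill a role with one of the proposed entities (the dropdown choice)."""
        if entity not in self.entities:
            raise ValueError(f"unknown entity: {entity!r}")
        self.roles[role] = entity

    def filled_roles(self) -> Dict[str, str]:
        return {r: e for r, e in self.roles.items() if e is not None}

    def unfilled_roles(self) -> List[str]:
        return [r for r, e in self.roles.items() if e is None]


# Made-up example: an "earthquake" event with two roles, one of them filled.
ann = EventAnnotation("earthquake", ["Chile", "USGS", "tsunami warning"])
ann.define_role("location")
ann.define_role("reporting agency")
ann.assign("location", "Chile")
```

Keeping unfilled roles in the same mapping as filled ones mirrors the interface, which lists both kinds side by side for the selected event type.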

PIPELINE

Input: sets of articles about the same event from the Event Registry service (http://eventregistry.org).

EVENT TYPE SUGGESTION

suggestions generated by an SVM classifier built using the QMiner data analytics platform (http://qminer.ijs.si)

built on a dataset of 100 events annotated into 5 event types (road accident, product launch, protest, earthquake and bombing) by an expert annotator

features include concepts found in the event by Event Registry as well as bag-of-words features computed on article titles and the event summary

leave-one-out testing classification accuracy: CA = 0.67

EVALUATION

11 annotators annotating the same 10 events

12.1% ± 3.1% of proposed entities annotated per event

6.2 ± 0.9 roles filled per event

average pairwise event type agreement: 5.9 ± 2.0

average pairwise Jaccard index of roles with the same annotation: 0.25 ± 0.09

average number of successful event type suggestions per user: 6.6 ± 1.9
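The pairwise Jaccard index reported in the evaluation figures can be illustrated with a short sketch. The poster does not spell out how the index was computed, so the following assumes that each annotator's annotation of an event is a set of (role, entity) pairs and that two annotations agree on a pair only when both the role name and the entity match exactly. The function names and sample data are made up for illustration.

```python
from itertools import combinations
from statistics import mean
from typing import Dict, Set, Tuple

RoleAssignment = Tuple[str, str]  # (role name, entity); an assumed representation


def jaccard(a: Set[RoleAssignment], b: Set[RoleAssignment]) -> float:
    """|A ∩ B| / |A ∪ B|; taken to be 1.0 when both sets are empty."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def mean_pairwise_jaccard(annotations: Dict[str, Set[RoleAssignment]]) -> float:
    """Average Jaccard index over all unordered pairs of annotators."""
    pairs = list(combinations(annotations.values(), 2))
    if not pairs:
        raise ValueError("need at least two annotators")
    return mean(jaccard(a, b) for a, b in pairs)


# Made-up annotations of one event by three annotators.
example = {
    "annotator_1": {("location", "Chile"), ("agency", "USGS")},
    "annotator_2": {("location", "Chile")},
    "annotator_3": {("place", "Chile"), ("agency", "USGS")},
}
score = mean_pairwise_jaccard(example)
```

Note that annotator_3's role "place" does not match "location" under exact comparison even though both point at the same entity. Because roles are user-defined, naming differences of this kind would lower the index under this assumed definition.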
