News construction from microblogging posts using open data

Upload: francisco-berrizbeitia

Post on 13-Jun-2015

Category:

Technology


DESCRIPTION

Presentation of the report of the same name. The research was carried out for the Semantic Web course at the Universidad Simón Bolívar.

TRANSCRIPT

  • 1. News construction from microblogging posts using linked open data

  • 2. Introduction: Access to information can be limited when traditional media outlets can't cover events because of geographical limitations or censorship, in situations such as civil unrest, war or natural disasters. In this research we propose a method to create searchable, semantically annotated news articles from tweets in an automated way, using the cloud of linked open data.

  • 3. Motivation: "Everyone has the right to freedom of opinion and expression; this right includes freedom to hold opinions without interference and to seek, receive and impart information and ideas through any media and regardless of frontiers." (Universal Declaration of Human Rights, Article 19)

  • 4. An example: A tweet is not a document; it will be unreachable in a few days and the information will be lost.

  • 5. An example: We want to create a news article from the tweet using the cloud of linked open data, transforming the message into a document that can be retrieved and used later.

  • 6. What we want to do: Determine the 5 W's of the post: Who is it about? What happened? Where did it take place? When did it take place? Why did it happen? How did it happen? Use the cloud of linked open data to expand each concept, person, organization, place or action described in the post.

  • 7. Our method - overview (diagram labels): Tweet ID; Twitter API; tweet text; WordNet (local); list of candidates; Wikipedia API; word type recognition; noise removal; DBpedia endpoint; list of candidates with a known Wikipedia page; SPARQL query; semantic information; author information; Virtuoso endpoint; Turtle file.

  • 8. The rNews core news ontology

  • 9. Experimentation: We selected 90 tweets directly from the Twitter search on 3 subjects: the Brazilian riot during the 2014 World Cup, Barack Obama and Venezuela. Manually tag each tweet (twice). Run the automated approach and compare the results.

  • 10. Results: Expected terms: 413. Found: 433. Expected and found: 317. No added value: 63. Wrong: 53. Precision: 76.36 %. Errors: 12.24 %.

  • 11. Future work: Use a federated engine (ANAPSID) to provide more complete information on the subject. Design and implement an algorithm that retrieves all relevant information from the linked open data cloud. Use open data to resolve the disambiguation problem and minimize the incorrectly suggested concepts.

  • 12. Conclusions: These results encourage us to further develop the method and the system, first to solve the disambiguation problems and then to create a more ambitious approach that will allow us to create a semantically annotated news stream based not only on tweets but also on other microblogging services, independent blogs and corporate media outlets, which can serve as a centralized semantic endpoint for data journalism.

  • 13. Thank you!
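The method overview (slide 7) names a DBpedia endpoint and a SPARQL query as the step that turns a candidate with a known Wikipedia page into semantic information. The slides do not show the query itself, so the following is only a minimal sketch of what such a lookup might look like. The function names `build_query` and `build_request_url`, the choice of `rdf:type` and `dbo:abstract` as the retrieved properties, and the `format` request parameter are assumptions made for illustration, not the authors' implementation. The sketch only builds the request and does not send it.

```python
from urllib.parse import urlencode

# Public DBpedia SPARQL endpoint (assumed here; the slides only say "DBpedia endpoint").
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"


def build_query(wikipedia_title: str, lang: str = "en") -> str:
    """Return a SPARQL query asking for the types and abstract of one resource.

    Hypothetical helper: DBpedia resource names mirror Wikipedia page titles,
    with spaces replaced by underscores. Titles containing characters that are
    not valid inside an IRI would need extra escaping, which is omitted here.
    """
    resource = "http://dbpedia.org/resource/" + wikipedia_title.strip().replace(" ", "_")
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "SELECT ?type ?abstract WHERE {\n"
        f"  <{resource}> a ?type ;\n"
        "      dbo:abstract ?abstract .\n"
        f'  FILTER (lang(?abstract) = "{lang}")\n'
        "}\n"
        "LIMIT 50"
    )


def build_request_url(wikipedia_title: str) -> str:
    """Return a GET URL for the query, asking for JSON results."""
    params = {
        "query": build_query(wikipedia_title),
        "format": "application/sparql-results+json",
    }
    return DBPEDIA_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    # Candidates taken from the experiment subjects on slide 9.
    for candidate in ["Barack Obama", "Venezuela"]:
        print(build_query(candidate))
        print(build_request_url(candidate))
```

Keeping query construction separate from the HTTP call makes it easy to inspect or log the query for each candidate before any network traffic happens, which helps when checking disambiguation errors by hand.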
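The counts on the results slide (slide 10) can be checked with a few lines of arithmetic. The sketch below assumes that "Found" is the sum of the three reported outcomes and computes ratios using conventional definitions. The slides do not state which formula produced the 76.36 % precision figure, so the ratios here are offered only as a cross-check: the reported error rate of 12.24 % matches Wrong / Found, while the conventional precision (Expected and found / Found) and recall (Expected and found / Expected) come out at different values from the reported precision.

```python
# Counts as reported on the results slide.
expected = 413            # terms tagged manually
found = 433               # terms suggested by the automated approach
expected_and_found = 317  # suggestions that match a manual tag
no_added_value = 63       # suggestions judged to add no value
wrong = 53                # incorrect suggestions


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# The three outcome categories account for every suggestion.
assert expected_and_found + no_added_value + wrong == found

error_rate = pct(wrong, found)                             # 12.24, as reported
conventional_precision = pct(expected_and_found, found)    # 73.21
conventional_recall = pct(expected_and_found, expected)    # 76.76

if __name__ == "__main__":
    print("error rate:", error_rate)
    print("precision (conventional):", conventional_precision)
    print("recall (conventional):", conventional_recall)
```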