open data conference - sören auer - linked open data

http://lod2.eu Intelligent Information Management Collaborative Project 2010-2014 in Information and Communication Technologies Project No. 257943 Start Date 01/09/2010


Post on 20-Jan-2015

Page 1: Open Data Conference - Sören Auer - Linked Open Data

EU-FP7 LOD2 Project Overview . 02.09.2010 . Page 1 http://lod2.eu

Creating Knowledge out of Interlinked Data

http://lod2.eu

Intelligent Information Management

Collaborative Project 2010-2014 in Information and Communication Technologies

Project No. 257943 Start Date 01/09/2010

Page 2: Open Data Conference - Sören Auer - Linked Open Data

EU-FP7 LOD2 Project Overview . 31.08.2012 . Page 2 http://lod2.eu

The Emerging Web of Data: Achievements and Challenges

Achievements

1. Extension of the Web with a data commons (currently amounting to 25 Bn. facts)
2. Vibrant, global RTD community
3. Industrial uptake begins (e.g. BBC, Thomson Reuters, Eli Lilly)
4. Emerging governmental adoption in sight
5. Establishing Linked Data as a deployment path for the Semantic Web.

Challenges

1. Coherence: Relatively few, expensively maintained links
2. Quality: partly low-quality data and inconsistencies
3. Performance: Still substantial penalties compared to relational
4. Data Consumption: large-scale processing, schema mapping and data fusion still in its infancy
5. Usability: Missing direct end-user tools and network effect

These issues are closely related and should ultimately lead to an eco-system of interlinked knowledge!

Web: a global, distributed platform for data, information and knowledge integration, exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF

[Figure: four snapshots dated July 2007, April 2008, September 2008 and July 2009]
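To make "connecting pieces of data using URIs and RDF" concrete, here is a minimal sketch in plain Python. It models a statement as a subject/predicate/object triple and writes it in N-Triples syntax. The `Triple` type, the `to_ntriples` helper and the `example.org` URIs are assumptions made for illustration; they are not part of the LOD2 Stack.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str            # URI of the thing being described
    predicate: str          # URI of the property
    obj: str                # URI or literal value
    obj_is_literal: bool = False


def to_ntriples(triple: Triple) -> str:
    """Serialize one triple as a single N-Triples line."""
    if triple.obj_is_literal:
        # Escape backslashes and double quotes inside the literal.
        escaped = triple.obj.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    else:
        obj = f"<{triple.obj}>"
    return f"<{triple.subject}> <{triple.predicate}> {obj} ."


# Hypothetical URIs under example.org, used only for illustration.
triples = [
    Triple("http://example.org/city/Leipzig",
           "http://www.w3.org/2000/01/rdf-schema#label", "Leipzig", True),
    Triple("http://example.org/city/Leipzig",
           "http://www.w3.org/2002/07/owl#sameAs",
           "http://example.org/other/Leipzig"),
]
ntriples_lines = [to_ntriples(t) for t in triples]
```

The second triple is the kind of link the "Coherence" challenge refers to: an `owl:sameAs` statement connecting two descriptions of the same thing held in different datasets.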

Page 3: Open Data Conference - Sören Auer - Linked Open Data

LOD2 in a Nutshell (1)

Research Focus

- very large RDF data management
- knowledge enrichment & interlinking
- fusion & information quality
- adaptive, semantic user interfaces

Use Cases

- Media & Publishing
- Enterprise Data Webs
- Open Gov Data
- Public Sector Contracts

Main Result

- integrated LOD2 Stack for Linked Data lifecycle management

Partners

ULEI, CWI, NUIG, FUB/UMA, SWCG, OGL, Tenforce, Exalead, WKD, OKFN, UEP, ZEM, I2G, IMP, KAIST

Page 4: Open Data Conference - Sören Auer - Linked Open Data

LOD2 in a Nutshell (2)

LOD2: EC-funded collaborative project that aims to utilize the Web as an integration platform for data and information.

Linked Data: provides the necessary basic technologies and standards to realize the goal of LOD2.

Linked Open Data: publicly accessible data that is to be integrated into the Web and linked with one another and with non-public content such as enterprise intranets.

Project Highlights

- Open Government Linked Data Initiative
- Common European platform publicdata.eu
- Leading Web 3.0 technologies are combined in the project into the coherent LOD2 Stack (e.g. DBpedia, Virtuoso, Sindice, Silk)

Page 5: Open Data Conference - Sören Auer - Linked Open Data

WP1: Requirements, Design & LOD2 Stack Prototype

Use Case High-Level Abstraction

Page 6: Open Data Conference - Sören Auer - Linked Open Data

WP1: Use Case Objectives

MEDIA & PUBLISHING (Wolters Kluwer Germany)

Objective of WP7: Supporting content-related production workflows in the media & publishing industry.

ENTERPRISE APPLICATIONS (Exalead)

Objective of WP8: Applying Linked Data technologies in an enterprise stack to support Human Resources-related issues.

OPEN GOVERNMENT DATA (Open Knowledge Foundation)

Objective of WP9: Improving accessibility, findability and reusability of Open Government Data.

Page 7: Open Data Conference - Sören Auer - Linked Open Data

WP2: Storing & Querying very Large Knowledge Bases

Goal:

Enabling large-scale, feature-rich & enterprise-ready Linked Data management solutions

Database Partners in LOD2:

- CWI: Leading open source analytics RDBMS
- OGL: Leading linked data deployment platform

Technological Excellence:

- Creating and publishing metrics for choosing RDF solutions
- Bringing Column Store Technology for Business Intelligence on RDF
- Ground-breaking database innovations for RDF stores (Dynamic Query Optimization, Adaptive Caching of Joins, Optimized Graph Processing, Cluster/Cloud Scalability)
- MonetDB as advanced experimentation platform; first RDF-enabled prototype released Aug 2012
- V7: Vectorized Column-Store (a la CWI); full release in 2012
- LOD2 spawned the LOD Benchmark Council; LDBCouncil.org starts Oct 2012
- Six papers published on RDF column stores, large-scale query processing, benchmarking (1 EDBT, 1 IEEE DEBULL, 3 ISWC, 1 TPCTC)

Page 8: Open Data Conference - Sören Auer - Linked Open Data

WP2: Linked Open Data for Real in Your Apps

Business Advantages:

- Enrich your application with (free & rich) Linked Open Data
- RDF store technology has lower deployment costs than relational for “ragged” (untidy) data

Technological Flexibility:

- Deliver schema-last flexibility and inference at relational data warehouse cost and performance
- Grow as you go: LOD2 platform dynamically adapts to your usage patterns and structure of your data
- Integrate, resolve, align anything: schema, instance identity

Rich Features for Complex Applications:

- Advanced SPARQL and SQL query processing
- SPARQL and SQL federation
- Full text, geospatial, text search
- Scale-out on clusters, replication
- Virtuoso V7: much more compact, much faster
- Virtuoso Cluster V7: Distributed Asynchronous Queues (DAQ), recursive, function-shipping declarative cluster programming

Page 9: Open Data Conference - Sören Auer - Linked Open Data

WP3: Goals

General Goal:

Extraction, Enrichment, Repair (EER) of knowledge bases

Focus:

- very large knowledge bases, diverse knowledge, web data
- refine existing triplification approaches (Virtuoso Sponger, RDF Views, Triplify, D2R)
- improve schema of knowledge bases based on data
- fix problems in knowledge bases, e.g. inconsistencies

Techniques:

semi-automatic machine learning, ontology debugging, NLP, shallow parsing, etc.

Page 10: Open Data Conference - Sören Auer - Linked Open Data

WP3: Knowledge Base Improvement Cycle

Mutual Refinement Cycle (with optional Extraction phase)

[Diagram labels]

- Repair: Modelling Problems, Performance Problems
- Enrichment: Definitions, Disjointness
- Linkage Validation
- Extraction: Structured, Semi-structured, Unstructured

Page 11: Open Data Conference - Sören Auer - Linked Open Data

WP3: Multilingual Support for Information Extraction

Standard information extraction techniques can be used to extract the information from documents in different natural languages, but the use of available linguistic resources (e.g. electronic dictionaries) can simplify their application.

Goals:

- To integrate into the LOD2 Stack an information extraction tool that can use the available linguistic resources and represent the information extracted from natural language documents in the linked data format
- To integrate a document searching tool that can learn domain-specific vocabularies (words and phrases) in different languages and then, using these vocabularies and previously processed natural language documents, provide a list of similar documents sorted by semantic similarity to a given one
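The second goal, returning documents sorted by similarity to a given one, can be sketched with a deliberately simple stand-in. The example below ranks documents by cosine similarity over word counts. This is lexical rather than semantic similarity, and it is not the tool integrated in LOD2; the function names and sample texts are assumptions for illustration only.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar(query: str, documents: Dict[str, str]) -> List[Tuple[str, float]]:
    """Return (document id, score) pairs, most similar first."""
    query_vector = Counter(tokenize(query))
    scored = [
        (doc_id, cosine(query_vector, Counter(tokenize(text))))
        for doc_id, text in documents.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up sample documents.
sample_docs = {
    "doc-tax": "tax law for small companies",
    "doc-sport": "football results from the weekend",
}
ranking = rank_similar("new tax law", sample_docs)
```

A learned domain vocabulary, as described in the goal, would replace `tokenize` with something that maps phrases in several languages onto shared concepts before the vectors are compared.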

Page 12: Open Data Conference - Sören Auer - Linked Open Data

WP3: Tasks and Results – Knowledge Extraction

Knowledge Extraction Wikipedia Page:

- Overview at http://en.wikipedia.org/wiki/Knowledge_extraction (Wiki page was bootstrapped by LOD2)

Knowledge Extraction from structured sources:

- relational databases, spreadsheets, CMS, logs, XML documents
- many tools created and/or improved: D2R, Triplify, Virtuoso Sponger, SPARQLMap, SPARQLIFY, Poolparty (http://poolparty.biz)
- LOD2 Google Refine extension: http://code.zemanta.com/sparkica/download.html

Knowledge Extraction from semi-structured sources:

- Wikis, HTML
- DBpedia Live Extraction: http://live.dbpedia.org
- internationalization of DBpedia: http://wiki.dbpedia.org/Internationalization
- extension of DBpedia framework for all MediaWikis, e.g. http://wiki.dbpedia.org/Wiktionary
- large-scale text mining and entity detection from blogs (led by Zemanta)

Knowledge Extraction from unstructured sources:

- textual sources and Natural Language Processing (NLP)
- NLP2RDF and NIF 1.0: specification of an RDF/OWL vocabulary to convert NLP tool output to RDF
- project and reference implementations available at http://nlp2rdf.org
- Demo: http://nlp2rdf.lod2.eu/demo.php
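As a rough illustration of extraction from a structured source, the sketch below turns the rows of a relational table into N-Triples lines, one subject per row and one predicate per column. It is not how D2R, Triplify or the other tools named above are implemented. The `BASE` namespace, the URI layout and the `triplify` function are assumptions, and the table name is interpolated into the SQL, so it must come from trusted code.

```python
import sqlite3
from typing import Iterator

BASE = "http://example.org/"  # hypothetical namespace for minted URIs


def triplify(conn: sqlite3.Connection, table: str, key_column: str) -> Iterator[str]:
    """Yield one N-Triples line per non-key, non-NULL cell of the table.

    `table` is trusted input here; it is not safe for user-supplied names.
    """
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"<{BASE}{table}/{record[key_column]}>"
        for column, value in record.items():
            if column == key_column or value is None:
                continue
            literal = str(value).replace("\\", "\\\\").replace('"', '\\"')
            yield f'{subject} <{BASE}vocab/{table}#{column}> "{literal}" .'


# Small in-memory table with made-up content.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
connection.execute("INSERT INTO person VALUES (1, 'Ada', NULL)")
person_triples = list(triplify(connection, "person", "id"))
```

Real mapping tools add what this sketch leaves out: typed literals, foreign keys rendered as links between resources, and reuse of existing vocabularies instead of generated predicate URIs.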

Page 13: Open Data Conference - Sören Auer - Linked Open Data

WP3: Tasks and Results

Knowledge Base Schema Enrichment

- learn axioms in knowledge bases, e.g. disjointness, definitions, super-classes
- development of ORE and DL-Learner
- enrichment of ontologies with multilingual information to allow language-independent search

Knowledge Base Repair

- fix inconsistencies, modeling problems, reasoning performance problems
- development of ORE
- Web Linkage Validator reports whether knowledge base is suitable to be interlinked with others
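Enrichment and repair feed each other: once a disjointness axiom is known, instances that violate it become visible as inconsistencies. A minimal sketch of that check follows. It is not the algorithm used by ORE or DL-Learner; the data layout, function name and class names are assumptions for illustration.

```python
from typing import Dict, List, Set, Tuple


def find_disjointness_violations(
    instance_types: Dict[str, Set[str]],
    disjoint_pairs: List[Tuple[str, str]],
) -> List[Tuple[str, str, str]]:
    """Return (instance, class_a, class_b) for every instance typed with
    two classes that have been declared disjoint."""
    violations = []
    for instance in sorted(instance_types):
        classes = instance_types[instance]
        for class_a, class_b in disjoint_pairs:
            if class_a in classes and class_b in classes:
                violations.append((instance, class_a, class_b))
    return violations


# Made-up knowledge base: "Leipzig" is wrongly typed as both City and Person.
types = {
    "Leipzig": {"City", "Person"},
    "Ada": {"Person"},
}
axioms = [("City", "Person")]
problems = find_disjointness_violations(types, axioms)
```

A repair step would then propose removing one of the two type assertions, or retracting the axiom if it turns out to be too strict.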

Page 14: Open Data Conference - Sören Auer - Linked Open Data

WP4: Reuse, Interlinking and Knowledge Fusion (1)

Goal:

Provide open-source software components for link generation, schema mapping, data quality assessment and knowledge fusion.

Page 15: Open Data Conference - Sören Auer - Linked Open Data

WP4: Reuse, Interlinking and Knowledge Fusion (2)

Technological Excellence:

- Ease the creation of RDF links by using machine learning as well as a link quality assessment workbench
- Provide for the flexible integration of Web data based on mappings discovered on the Web
- Provide for assessing the quality of Web data and fusing high-quality data.

Expected Outcomes:

1. Link discovery tools, linking assist and workbench

2. Framework for publishing and discovering expressive mappings on the Web

3. Data quality assessment framework providing for a wide range of different quality assessment policies

4. Data fusion components providing various conflict resolution strategies

5. LOD-enabled data cleansing tool leveraging crowd intelligence to improve the quality of linked data
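Link discovery, the first outcome above, can be pictured as comparing entity descriptions across two datasets and emitting a link when they are similar enough. The sketch below compares labels with Python's `difflib` and proposes `owl:sameAs` links above a threshold. It is a toy stand-in, not Silk's linkage rules or its learning component; the function name, threshold and URIs are assumptions.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def discover_links(
    source: Dict[str, str],
    target: Dict[str, str],
    threshold: float = 0.9,
) -> List[Tuple[str, str, str, float]]:
    """Propose (source URI, owl:sameAs, target URI, score) for the best
    matching target label of each source entity, if it passes the threshold."""
    links = []
    for source_uri, source_label in source.items():
        best_uri, best_score = None, 0.0
        for target_uri, target_label in target.items():
            score = SequenceMatcher(
                None, source_label.lower(), target_label.lower()
            ).ratio()
            if score > best_score:
                best_uri, best_score = target_uri, score
        if best_uri is not None and best_score >= threshold:
            links.append((source_uri, OWL_SAME_AS, best_uri, round(best_score, 3)))
    return links


# Hypothetical datasets keyed by URI, with a label per entity.
dataset_a = {"http://example.org/a/Leipzig": "Leipzig"}
dataset_b = {
    "http://example.org/b/leipzig": "leipzig",
    "http://example.org/b/berlin": "Berlin",
}
proposed = discover_links(dataset_a, dataset_b)
```

Keeping the score next to each proposed link is what allows a quality assessment step, or a human in a workbench, to review borderline matches instead of accepting them blindly.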

Page 16: Open Data Conference - Sören Auer - Linked Open Data

WP4: Reuse, Interlinking and Knowledge Fusion (3)

Progress:

1. Link discovery tools, linking assist and workbench

2. Framework for publishing and discovering expressive mappings on the Web

3. Data quality assessment framework providing for different quality assessment policies

4. Data fusion components providing various conflict-resolution strategies

First version of Silk Workbench released in Feb 2012

Mapping Publication and Discovery Framework released in Feb 2012

Initial version of Data Fusion Component released in Aug 2012

Data Quality Assessment Tool released in Aug 2012

Data Linking Environment released in Aug 2012

Sieve Quality Assessment Language specified in Feb 2012

Mapping Publication and Discovery Language specified in Jul 2011

Sieve Data Fusion Language released in Feb 2012
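The "conflict-resolution strategies" mentioned for data fusion can be illustrated with two common ones: keep the most recently updated value, or keep the value most sources agree on. The sketch below is plain Python and does not use the Sieve languages; the `Claim` type, the function names and the sample values are assumptions.

```python
from collections import Counter
from typing import List, NamedTuple


class Claim(NamedTuple):
    value: str          # the conflicting property value
    source: str         # dataset the value came from
    last_updated: str   # ISO 8601 date, so string order equals date order


def keep_most_recent(claims: List[Claim]) -> str:
    """Resolve the conflict by trusting the freshest source."""
    return max(claims, key=lambda claim: claim.last_updated).value


def majority_vote(claims: List[Claim]) -> str:
    """Resolve the conflict by taking the most frequently stated value."""
    return Counter(claim.value for claim in claims).most_common(1)[0][0]


# Made-up conflicting population figures for one city.
population_claims = [
    Claim("520000", "dataset-a", "2010-01-01"),
    Claim("531000", "dataset-b", "2012-06-30"),
    Claim("520000", "dataset-c", "2011-05-01"),
]
freshest = keep_most_recent(population_claims)
most_common = majority_vote(population_claims)
```

The two strategies disagree on this input, which is why a fusion component typically lets the user choose a strategy per property, guided by quality scores for each source.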

Page 17: Open Data Conference - Sören Auer - Linked Open Data

WP4: Involvement of KAIST (Associated Partner from Korea)

Current Status:

Secured Korean government funding support

Original project scope expanded to involve an industry partner, Synapsoft, due to the Korean government funding requirement

Successfully completed and submitted work on Korean Resource Linking Assist.

Installed on the development web page for Korean NLP2RDF at http://semanticweb.kaist.ac.kr/nlp2rdf/.

Preparing Chinese and Japanese data for the work on Asian Resource Linking Assist

Goals:

Building a Korean, Chinese and Japanese RDF data integration platform for Linked Data

Multilingual Linked Data Fusion of Chinese and Japanese DBpedia with Korean and English DBpedia

Page 18: Open Data Conference - Sören Auer - Linked Open Data

WP5: Linked Data Browsing, Visualization, and Authoring Interfaces

Goal:

The objectives of WP5 are to develop new browsing, visualization and authoring interfaces for LOD that support a wide range of devices (from mobile phones to desktop PCs), integrate heterogeneous information from various sources, and support the evolution of both instance data and information structures over time. To achieve these objectives we will explore new browsing and visualization paradigms.

Partners:

- Universität Leipzig (ULEI)
- Stichting Centrum voor Wiskunde en Informatica (CWI)
- National University of Ireland, Galway (NUIG)
- Freie Universität Berlin (FUB)
- Semantic Web Company GmbH (SWCG)

Page 19: Open Data Conference - Sören Auer - Linked Open Data

WP5: Task Breakdown

Page 20: Open Data Conference - Sören Auer - Linked Open Data

WP5: Deliverables

In general, our concerns are the browsing, visual representation, and creation of linked data through various media. Our ongoing and delivered components for LOD2 include:

- Faceted Spatial Browsing
- LinkedGeoBrowser – a linked data spatial semantic browser
- Madr – a linked data augmented reality viewer
- Tools to search, browse and inspect datasets
- RDFAuthor – an adaptive RDF widget interface
- Sig.ma Enterprise Edition (Sig.ma EE) – an interactive information visualizer for linked data
- SparqlEd – an assisted SPARQL query writer
- Plus other LOD-powered widgets and social networking applications

Page 21: Open Data Conference - Sören Auer - Linked Open Data

WP6: Interfaces, Integration & LOD2 Stack (1)

The development of the LOD2 stack is driven by the application cases elaborated in WP7, WP8, WP9, WP9a and WP6. WP6 will deliver the following:

- Integrated user-interface components
- Integrated LOD2 Stack API components
- Documentation, Tutorials
- Evaluation of the LOD2 Stack

This work package deploys the LOD2 stack, a repository of downloadable packages. With these components the whole LOD lifecycle is supported.

Page 22: Open Data Conference - Sören Auer - Linked Open Data

WP6: Interfaces, Integration & LOD2 Stack (2)

[Diagram labels]

- Output of WP1 and WP2 – WP5: develop components
- SAF (Software Assembly Factory), starts on 09/2011: generates packages with integrated tools (open source package)
- LOD2 stack: yearly releases, released on 09/2012, 09/2013, 09/2014
- Applied on: WP7 (Media & Publishing), WP8 (Enterprise Data web), WP9, WP9a (Government Data)
- Each use case (Media & Publishing, Enterprise Data web, Government Data) contributes requirements/prerequisites
- Feedback

Page 23: Open Data Conference - Sören Auer - Linked Open Data

WP6: Evaluation of the LOD2 Stack (3)

LOD2 Stack Repository:

http://stack.lod2.eu/deb/distributions/dists/

LOD2 Stack Demo server:

http://demo.lod2.eu/lod2demo

LOD2 Stack Documentation:

http://wiki.lod2.eu/display/LOD2DOC/Home

Page 24: Open Data Conference - Sören Auer - Linked Open Data

WP6: Flexible Solutions for Different Application Domains (4)

Goals:

In-depth analysis of different application scenarios

Identification of LOD2 functional components that adequately respond to specific application requirements

Evaluation of the suitability of the LOD2 approach for different domains by building prototype applications

Development of a stack configurator that will enable:

- automatic initialization and configuration of the LOD2 Stack for domain-specific Linked Data applications
- potential users to create their own personalized version of the LOD2 Stack

Page 25: Open Data Conference - Sören Auer - Linked Open Data

WP7: LOD2 for Publishers

WKD Tax & Accounting

Companies/Brands

- Akademische Arbeitsgemeinschaft Verlag
- Addison Group
- Schleupen Tax
- Wago Curadata

Products (Examples)

- Tax SW for Consumers
- SW for Tax Accountants
- SW for SMEs with a focus on Controlling and Accounting

WKD Legal & Regulatory

Companies/Brands

- Carl Heymanns Verlag
- Luchterhand
- Werner Verlag
- Carl Link
- CW Haarfeld
- Deutscher Wirtschaftsdienst
- AnNoText
- Trigon Data

Products (Examples)

- IP, Administrative Law
- Civil, Family, Labor Law
- Construction Law
- Publications for Schools/KiTas
- Public Health Insurance
- Magazine „Personalwirtschaft“ (HR Management)
- SW for Lawyers and Notaries

WKD is part of Wolters Kluwer B.V.

Worldwide reach

- Europe
- North America
- Asia/Pacific

Economic success

- Revenue 2010: EUR 3.6 bn
- 19,000 employees
- Listed on the Amsterdam SE

Customer orientation

- Lawyers
- Tax Accountants
- Corporations and SMEs
- Financial institutions
- Health Providers
- Public Sector

Wolters Kluwer Deutschland (WKD): “Semantic Technologies and Standards are an enabler for the media and publishing industry to create added value for their customers at reasonable cost.”

Page 26: Open Data Conference - Sören Auer - Linked Open Data

WP7: WKD as a Consumer of LOD Data

Content Supply Chain of Wolters Kluwer Deutschland (WKD): Content Acquisition, Editing, Composing, Bundling, Publishing, Interfacing, Sales, Customer Service, Customer

Content Acquisition

Acquisition of LOD governmental data

- Laws & Regulations

- Court cases

- Administrative Rulings

- Statistical information

Based on:

- Adequate delivery format

- Adequate metadata

- Adequate Licensing and IPR

Content Enrichment

Enrichment of WKD data

- Enrichment with additional metadata from the LOD cloud

- Automatic Interlinking within WKD data, but also into the LOD cloud

Based on:

- Adequate delivery format

- Adequate metadata

- Adequate functionality

- Adequate Licensing and IPR

Enterprise Applications

Data integration in Enterprise and other Customer Applications

- Integration of customer and WKD data with data from the LOD cloud

- Development of new services, e.g. around metadata economics

Based on:

- Adequate functionality

- Adequate APIs

- Adequate Licensing and IPR

Page 27: Open Data Conference - Sören Auer - Linked Open Data

WP7: WKD as a Publisher of LOD Data

Content Supply Chain of Wolters Kluwer Deutschland (WKD): Content Acquisition, Editing, Composing, Bundling, Publishing, Interfacing, Sales, Customer Service, Customer

Marketing measures

Integration in overall marketing strategy of WKD

- Dissemination of LOD2 in media and publishing sector

- Launching surveys

- Permanent information of customers

- Sponsoring of conferences

Based on:

- Clear scope of LOD2 project to support future publishing paradigms

Cloud - Publishing

Development of WKpedia

- Publishing of enriched governmental information

- Publishing of legal domain thesauri

- Motivating contextualisation in LOD cloud

Based on:

- Adequate functionality

- Adequate APIs

- Adequate Licensing and IPR

Page 28: Open Data Conference - Sören Auer - Linked Open Data

WP8: Towards Linked Enterprise Data Webs (1)

Linked Enterprise Intra Data Webs can fill the gap between Intra-/Extranets and ERP systems and facilitate data integration along value chains within and across enterprises.

The pragmatic, incremental, vocabulary-based Linked Data approach reduces data integration costs significantly.

Objective: Promote openness and standards in enterprise data workflows and applications

Web publishing

Page 29: Open Data Conference - Sören Auer - Linked Open Data

WP8: Linked Enterprise Data Use Case Scenario (2)

Wage policy EBI: Build an application for surveying wage policy in a company, domain, sector, region, etc.

Scenarios:

- A company wants to know if its wage policy is consistent with the market (in similar and related companies and sectors).
- A job applicant would like to have an idea about his wage expectations according to his expertise, profile and education background.
- A governmental agency would like to survey the salaries in a particular region according to an economic branch and other parameters.

Page 30: Open Data Conference - Sören Auer - Linked Open Data

WP8: Linked Enterprise Data (3)

Targeted Service:

A SaaS service with different subscription levels

The service is a mashup of payroll and HR data of enterprises subscribing to the service to build an index store of data facts about wages.

Different consolidation parameters and key performance indicators (KPI) will be studied to provide relevant reports and visualisation interfaces.

Integration of external datasets in a particular survey: public datasets in the web cloud or private datasets of participating companies.

Privacy issues management: make private and nominative data anonymous.
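The combination of consolidation and privacy management described above can be sketched as follows. Employee identifiers are replaced by salted hashes before any aggregation, and a wage indicator is only reported for groups above a minimum size. Everything here is an assumption for illustration: the record fields, the function names and the group-size rule are not taken from the WP8 design. A salted hash is pseudonymisation rather than full anonymisation, so a real service would need stronger guarantees.

```python
import hashlib
import statistics
from collections import defaultdict
from typing import Dict, List


def pseudonymise(employee_id: str, salt: str) -> str:
    """Replace a nominative identifier with a salted SHA-256 digest prefix."""
    digest = hashlib.sha256((salt + employee_id).encode("utf-8")).hexdigest()
    return digest[:16]


def median_wage_by_sector(
    records: List[dict], salt: str, min_group_size: int = 3
) -> Dict[str, float]:
    """Median wage per sector, suppressing sectors with too few employees."""
    wages_by_sector: Dict[str, Dict[str, float]] = defaultdict(dict)
    for record in records:
        pseudonym = pseudonymise(record["employee_id"], salt)
        wages_by_sector[record["sector"]][pseudonym] = record["wage"]
    return {
        sector: statistics.median(wages.values())
        for sector, wages in wages_by_sector.items()
        if len(wages) >= min_group_size
    }


# Made-up payroll records.
payroll = [
    {"employee_id": "e1", "sector": "IT", "wage": 3000},
    {"employee_id": "e2", "sector": "IT", "wage": 4000},
    {"employee_id": "e3", "sector": "IT", "wage": 5000},
    {"employee_id": "e4", "sector": "Legal", "wage": 4500},
]
report = median_wage_by_sector(payroll, salt="hypothetical-secret")
```

Suppressing small groups matters because a "median wage" computed over a single employee would disclose that person's salary even though the identifier is hashed.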

Page 31: Open Data Conference - Sören Auer - Linked Open Data

WP8: Linked Enterprise Data (4) Preliminary Overview

Data crawler

Data cleaning and uniformisation

Data enrichment, annotation

Data consolidation

Indexer

RDF store

Full text index

Search and EBI interface

SPARQL endpoint

Taxonomies

Payroll software

Employees and HR database

Data LODification and anonymisation

Page 32: Open Data Conference - Sören Auer - Linked Open Data

WP9: Open Government Data Use Case

Overview PublicData.eu is a prototype of a pan-European data catalogue and federation mechanism, developed by OKFN as part of the FP7-funded LOD2 project. Based on the CKAN open-source data portal, the site is developed as a use case and an early adopter of the LOD2 linked data stack technologies.

Page 33: Open Data Conference - Sören Auer - Linked Open Data

WP9: Key Stats

Publicdata.eu provides robust and useful search, filtering and previewing tools. It currently houses 17027 data sets, harvested from 18 data catalogues, and it provides the option to browse data sets by top-level categories.

Page 34: Open Data Conference - Sören Auer - Linked Open Data

WP9: Results

Technical improvements to Publicdata.eu during the past year:

- In March 2012 we upgraded PublicData.eu to CKAN version 1.6, adding the data preview functionality (powered by Recline), improvements to search, interface improvements to dataset pages, newly added resource (file) pages and group pages.
- We also re-ran all the harvesters to have the most up-to-date set of datasets. Some catalogues have been migrated to groups on thedatahub.org and therefore can't currently be harvested without also including non-EU datasets. In the future we may resolve this by extending the harvester to allow us to specify which groups or tags should be harvested. This would allow us to import relevant datasets from thedatahub.org without importing non-EU datasets.
- Many new CKAN instances have recently been launched by various countries, which we hope to include in our harvesting for the intermediate launch (August 2013).

Page 35: Open Data Conference - Sören Auer - Linked Open Data

WP9: Next Steps (1)

Further technical improvements to Publicdata.eu:

Improvements that have already been implemented (Personalization features)

- activity streams and follow support (i.e. allowing users to subscribe to information based on the region of interest, language and other criteria)
- Social/sharing buttons

Improvements scheduled for the next site releases (Personalization features)

- ability to rate datasets (by users or via QA extension)
- allow users to add/revise their own data sets
- allow users to contribute tools, mash-ups and visualizations/Data store
- “related” extension allowing users to link to and from their data
- add Disqus comments extension

Page 36: Open Data Conference - Sören Auer - Linked Open Data

WP9: Next Steps (2)

CKAN core technology improvements (Harvesting)

- ability to only harvest changed data
- ability to harvest part of a site (i.e. a particular group)
- automate the harvesting process so that the data in publicdata.eu is always up to date; run harvesters more often/automatically
- show harvesting info publicly: Success/Failure alerts in the UI

Additional features

- adding more advanced multilingual capabilities to the portal
- adding an RDF datastore
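The first two harvesting improvements, harvesting only changed data and only part of a site, amount to a filter over the remote catalogue's dataset records. The sketch below shows that selection logic on simplified in-memory records. The record layout (`name`, `modified`, `groups`) and the function name are assumptions and do not reproduce the CKAN harvester's API or data model.

```python
from datetime import datetime
from typing import Iterable, List, Optional


def select_for_harvest(
    datasets: Iterable[dict],
    last_run: datetime,
    groups: Optional[Iterable[str]] = None,
) -> List[str]:
    """Return names of datasets modified after the last harvest run and,
    if `groups` is given, belonging to at least one of those groups."""
    wanted_groups = set(groups) if groups is not None else None
    selected = []
    for dataset in datasets:
        modified = datetime.fromisoformat(dataset["modified"])
        if modified <= last_run:
            continue  # unchanged since the previous run
        if wanted_groups is not None and not wanted_groups & set(dataset.get("groups", [])):
            continue  # outside the part of the site we harvest
        selected.append(dataset["name"])
    return selected


# Made-up catalogue entries.
catalogue = [
    {"name": "budget-2011", "modified": "2012-02-01T09:00:00", "groups": ["eu"]},
    {"name": "budget-2012", "modified": "2012-05-10T12:30:00", "groups": ["eu"]},
    {"name": "weather-us", "modified": "2012-06-01T08:00:00", "groups": ["us"]},
]
to_harvest = select_for_harvest(catalogue, datetime(2012, 3, 1), groups=["eu"])
```

Running such a filter on a schedule and recording the timestamp of each successful run is one way to keep the portal up to date without re-importing every catalogue in full.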

Page 37: Open Data Conference - Sören Auer - Linked Open Data

WP9: Linked Open Government Data in the West Balkan Countries

Objectives:

deployment of the LOD2 Stack to government data from this region

establishment and maintenance of the Serbian CKAN

provision of best practice examples

Page 38: Open Data Conference - Sören Auer - Linked Open Data

WP9: Linked Open Government Data Scenarios

National Statistical Office Use Case: using LOD2 tools for publishing and integrating statistical data into the LOD Cloud

Scenarios:

- A company interested in investing in Southeast Europe wants to search the PublicData.eu portal, download and analyse an extended set of economic indicators for the countries in this region and compare them with indicators from the UK.
- A statistician preparing reports on a monthly, quarterly and yearly basis would like to publish the reports in a machine-processable format and integrate them into the LOD2 Cloud.
- An IT Administrator maintaining the central repository of statistical classifications and statistical indicators in a standard machine-processable format would like to link the national code lists with the ones recommended by EUROSTAT, as well as to enrich their description with information from DBpedia.
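For the statistician's scenario, one possible machine-processable form is an RDF observation per reported figure. The sketch below emits N-Triples using terms from the W3C RDF Data Cube vocabulary and the SDMX dimension and measure namespaces. The slides do not say which vocabulary the use case adopts, so that choice is an assumption, as are the `example.org` URIs, the function name and the placeholder value.

```python
from typing import List

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
QB = "http://purl.org/linked-data/cube#"
SDMX_DIMENSION = "http://purl.org/linked-data/sdmx/2009/dimension#"
SDMX_MEASURE = "http://purl.org/linked-data/sdmx/2009/measure#"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"
BASE = "http://example.org/stats/"  # hypothetical publisher namespace


def observation_to_ntriples(dataset: str, area: str, period: str, value: float) -> List[str]:
    """Describe one statistical observation as N-Triples lines."""
    observation = f"<{BASE}{dataset}/obs/{area}-{period}>"
    return [
        f"{observation} <{RDF_TYPE}> <{QB}Observation> .",
        f"{observation} <{QB}dataSet> <{BASE}{dataset}> .",
        f"{observation} <{SDMX_DIMENSION}refArea> <{BASE}area/{area}> .",
        f'{observation} <{SDMX_DIMENSION}refPeriod> "{period}" .',
        f'{observation} <{SDMX_MEASURE}obsValue> "{value}"^^<{XSD_DECIMAL}> .',
    ]


# Placeholder figure, not a real statistic.
lines = observation_to_ntriples("indicator-x", "RS", "2012", 1.0)
```

Linking national code lists to the EUROSTAT ones, as in the third scenario, would then replace the locally minted `area/RS` URI with, or link it to, the corresponding shared code list entry.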

Page 39: Open Data Conference - Sören Auer - Linked Open Data

WP9: Linked Open Government Data Scenarios

Polish Ministry of Economy Use Case

- Using LOD2 tools for publishing and integrating Polish economy data into the LOD cloud
- Mission: to support Polish enterprises in data search, acquisition and use for decision making

Main Datasets

- INSIGOS – Online Business Information System (multidimensional reports on economy)
- CEIDG – Central Register and Information on Economic Activity (of natural persons)
- public procurement data

Tasks Overview

- requirements analysis – identification of main datasets and their intended use
- adoption of the LOD2 Stack for Polish economy data
- a showcase – sample application showing advantages of LOD for enterprises

Page 40: Open Data Conference - Sören Auer - Linked Open Data

WP9a: LOD2 for Public Contracts

Research Focus:

- creating linked data for public sector contracts
- matching the demand of public sector bodies with linked commerce data
- analytics of linked data for public sector contracts

Results: In the second year of the project, the Public Contracts Ontology was developed, which provides semantics to multiple triplified datasets (the European TED system and various national-level ones).

Next Step: An application for public contracts filing and management is under construction.

Page 41: Open Data Conference - Sören Auer - Linked Open Data

WP10: Training, Dissemination, Community Building & Fertilization (1)

The general aim of this work package is to establish a worldwide focal point for academic, governmental and industry parties interested in contributing to or taking advantage of the novel Linked Data methodologies and components, which will emerge in the project. In particular, our activities will be targeted at:

informing the community of the state-of-the-art developments taking place in the field,

disseminating the project results in order to foster community building and to create an impact on industry and research in Europe and worldwide,

providing training and consultancy to interested audiences in the technologies developed throughout the project

Page 42: Open Data Conference - Sören Auer - Linked Open Data

WP10: Training, Dissemination, Community Building & Fertilization (2)

Training Activities

Internal face-to-face training

External training

PhD programme

Dissemination Activities

Scientific dissemination

Industrial dissemination

Online marketing activities across all identified target groups

Training and Dissemination in Korea

KAIST will, for instance, ensure the penetration of LOD2 results in a dynamic Asian country by organizing a number of events and outreach activities, such as:

two research-oriented Data Web symposia aiming to bring together relevant researchers in Asia with the LOD2 consortium,

two industry workshops aiming at disseminating LOD2 results to Korean and Japanese companies and to facilitate cooperation and market entry of industrial LOD2 partners,

the 2012 International Asian LOD Summer School, already hosted from 13/08 to 17/08/2012 and attended by 35 students and 15 lecturers (http://semanticweb.kaist.ac.kr/2012lodsummer)

Page 43: Open Data Conference - Sören Auer - Linked Open Data

WP10: Most Important LOD2 Dissemination Resources

Website: http://lod2.eu

Weblog: http://blog.lod2.eu

Twitter: http://twitter.com/lod2project

Mailing List: [email protected]

SlideShare: http://www.slideshare.net/lod2project

PUBLINK: http://lod2.eu/Article/Publink.html

Webinar Series: http://lod2.eu/BlogPost/webinar-series

Flickr Account: http://www.flickr.com/photos/lod2/

#lod2 . @lod2project . #lod2stack

European Data Forum: http://data-forum.eu

Page 44: Open Data Conference - Sören Auer - Linked Open Data

PubLink – LOD2’s Linked Open Data Starter Service

PubLink supports selected organizations with focused consulting, delivered as mini projects, to publish and make use of Linked (Open) Data

PubLink helps to evaluate the LOD2 technologies and to increase the wealth of Linked Data

The annual application deadline is in winter (at the end of the respective year)

2012 PubLink participants include:

1. Food and Agriculture Organization of the United Nations (FAO), Italy

2. European Environment Agency, Denmark

3. Birmingham City Council, Great Britain

4. Municipality of Udine, Italy

5. Statistical Office of the Republic of Serbia, Serbia

6. Bezirksverordnetenversammlung Berlin-Kreuzberg, Germany

For details see: http://lod2.eu/Article/Publink.html

Page 45: Open Data Conference - Sören Auer - Linked Open Data

LOD2 Webinar Series – Online Training on LOD Tools

One of the most extensive training activities within the LOD2 project is the online seminar (webinar) series. It makes the main technologies in use and the major outcomes of the project known to an international audience.

LOD2 Webinars available:

1. Semantic Search (via PoolParty) – June 2011

2. Linked Data Management – November 2012

3. LOD2 Stack 1st Release – November 2011

4. Virtuoso – December 2011

5. OntoWiki – January 2012

6. Silk Workbench – February 2012

7. LIMES & SAIM – March 2012

8. Linked Data and SKOS – April 2012

9. D2R and Sparqlify – April 2012

10. CloudView – May 2012

11. PoolParty Thesaurus Manager (PPT) – June 2012

For details (infos, presentations, recordings) see: http://lod2.eu/BlogPost/webinar-series

Page 46: Open Data Conference - Sören Auer - Linked Open Data

WP11: Exploitation and Standardization (1)

Objectives:

Realizing the vision of the LOD2 project and use case studies

Standardisation of LOD2 architecture

Exploitation of knowledge and technical results

Exploitation:

Use case studies and the industrial and end-user community partners will drive the exploitation.

Tracking important technical and commercial developments in information retrieval and data management, including news and media.

Publishing an exploitation plan identifying opportunities, benefits and impact of the LOD2 consortium.

First LOD2 Exploitation Plan produced in M12

Page 47: Open Data Conference - Sören Auer - Linked Open Data

WP11: Exploitation and Standardization (2)

Intellectual Property Rights (IPR):

Core components of the LOD2 Stack will be published under an open-source license.

Domain adaptations of the LOD2 Stack are considered on a case-by-case basis to protect IPR.

The strategy ensures that all components of LOD2 are royalty-free.

First IPR Strategy plan produced in M18

Standardization:

Actively participating in appropriate standards bodies.

Establishing a W3C Linked Data interest group.

W3C Linked Data Interest and WebID Incubator groups established

Orchestration with other projects:

Encourage take-up of LOD2 technologies by other projects.

Foster input from other EU projects relevant for the development of LOD2

ESWC Project Networking Track M20

Annual Summer School ULEI

Page 48: Open Data Conference - Sören Auer - Linked Open Data

WP12: Fact Sheet (1)

Project

Instrument: Large-Scale Integrating Project

Objective: Intelligent Information Management

Call: FP7-ICT-2009-5

Duration: 09/2010 – 08/2014

Means

Total Budget: 9.9 M€

Total Funding: 7.2 M€

Total Resources: 1132 PM (person-months)

Page 49: Open Data Conference - Sören Auer - Linked Open Data

WP12: Fact Sheet (2)

Universität Leipzig (Coordinator), Germany

Centrum Wiskunde & Informatica, Netherlands

National University of Ireland in Galway, Ireland

Freie Universität Berlin/Universität Mannheim, Germany

OpenLink Software, United Kingdom

Semantic Web Company, Austria

TenForce, Belgium

Exalead, France

Wolters Kluwer Deutschland, Germany

Open Knowledge Foundation, United Kingdom

Vysoká škola ekonomická v Praze, Czech Republic

Zemanta d.o.o., Slovenia

Instytut Informatyki Gospodarczej, Poland

Institut Mihajlo Pupin, Serbia

Korea Advanced Institute of Science and Technology, South Korea

Consortium – 15 Partners from 11 European Countries + 1 Associated Partner from Korea

Page 50: Open Data Conference - Sören Auer - Linked Open Data

Address

University of Leipzig
Faculty of Mathematics and Computer Science
Department of Computer Science
Institute of Business Information Systems
Postfach 100920
04009 Leipzig
Germany

Coordinator

Thanks for your attention! http://lod2.eu

Contact

Dr Sören Auer
Scientific Project Leader
Phone: +49 (341) 97-32367
Fax: +49 (341) 97-32329
Email: [email protected]
http://www.informatik.uni-leipzig.de/~auer

Nadine Jänicke
Project Manager
Phone: +49 (341) 97-32332
Fax: +49 (341) 97-32329
Email: [email protected]
http://bis.informatik.uni-leipzig.de/NadineJaenicke