linked data for enterprise data integration

Linked Data for Enterprise Information Integration Sören Auer

Upload: soeren-auer

Post on 11-May-2015

DESCRIPTION

The Web is evolving into a Web of Data. In parallel, the intranets of large companies will evolve into Data Intranets based on the Linked Data principles. Linked Data has the potential to complement the SOA paradigm with a light-weight, adaptive data integration approach.

TRANSCRIPT

Page 1: Linked data for Enterprise Data Integration

Linked Data for Enterprise Information Integration

Sören Auer

Page 2: Linked data for Enterprise Data Integration

© Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS

The Web evolves into a Web of Data

Linked Open Data

Facebook Open Graph

Page 3: Linked data for Enterprise Data Integration

The Evolution of the Web & Enterprise Intranets

Web:
• Web 1.0 (Hypertext, 1990): static Web pages, hyperlinks, link directories
• Web 2.0 (Social Apps, 2000): Social Web, crowd-sourcing, mashups
• Web 3.0 (Linked Data, 2010): REST APIs, RDF, JSON-LD, vocabularies, rich snippets, semantic search

Intranet:
• Intranet 1.0 (Hypertext, 1995): static intranet pages, keyword search, hyperlinks
• Intranet 2.0 (Social Enterprise Apps, 2005): Salesforce, crowd-sourcing, mashups
• Intranet 3.0 (Enterprise Data Intranet, 2015): URI scheme, enterprise taxonomies / knowledge bases, RDB2RDF mapping

Page 4: Linked data for Enterprise Data Integration

Linked Data Principles

1. Use URIs to identify the “things” in your data

2. Use http:// URIs so people (and machines) can look them up on the web

3. When a URI is looked up, return a description of the thing (in RDF format)

4. Include links to related things

http://www.w3.org/DesignIssues/LinkedData.html
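The four principles can be illustrated with a minimal sketch. It is not a real Linked Data server: the triples live in memory, and every http://data.example.org/ URI and the `producedAt` property are made up for illustration. `describe` plays the role of a URI lookup (principle 3), and `linked_uris` shows the outgoing links (principle 4).

```python
# Hypothetical in-memory dataset of (subject, predicate, object) triples.
# All http://data.example.org/ URIs are invented for this sketch.
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

TRIPLES = [
    ("http://data.example.org/part/4711", RDFS_LABEL, '"Brake disc"'),
    ("http://data.example.org/part/4711",
     "http://data.example.org/vocab/producedAt",
     "<http://data.example.org/plant/leipzig>"),
    ("http://data.example.org/plant/leipzig", RDFS_LABEL, '"Plant Leipzig"'),
]


def describe(uri):
    """Return an N-Triples description of the thing identified by `uri`."""
    return "\n".join(
        f"<{s}> <{p}> {o} ." for s, p, o in TRIPLES if s == uri
    )


def linked_uris(uri):
    """Return the URIs that the description of `uri` links to."""
    return [o[1:-1] for s, _, o in TRIPLES if s == uri and o.startswith("<")]


if __name__ == "__main__":
    part = "http://data.example.org/part/4711"
    print(describe(part))
    # Following a link leads to the description of a related thing.
    for target in linked_uris(part):
        print(describe(target))
```

In a real deployment the lookup would be an HTTP GET with content negotiation; the dictionary-style lookup here only stands in for that step.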

Page 5: Linked data for Enterprise Data Integration

Linked Enterprise Data Principles

1. Evolve existing taxonomies into enterprise knowledge bases/hubs

2. Establish an enterprise-wide URI scheme

3. Equip existing information systems in your intranet with Linked Data interfaces

4. Establish links between related information
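One way to picture an enterprise-wide URI scheme (principle 2) is a single function that every system uses to mint identifiers. The sketch below is an assumption, not a scheme from the talk: the host `data.example.com` and the `/type/id` path layout are hypothetical.

```python
from urllib.parse import quote

# Hypothetical intranet host; a real scheme would use a company-controlled domain.
BASE = "http://data.example.com"


def mint_uri(entity_type, local_id):
    """Build a stable URI from an entity type and a system-local identifier."""
    type_segment = quote(entity_type.lower(), safe="")
    id_segment = quote(str(local_id), safe="")
    return f"{BASE}/{type_segment}/{id_segment}"


if __name__ == "__main__":
    # Two systems that know the same part number arrive at the same URI,
    # which is what makes linking across systems possible.
    print(mint_uri("Part", "A 205/17"))
    print(mint_uri("part", "A 205/17"))
```

Percent-encoding the identifier keeps characters such as spaces and slashes from breaking the path structure.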

Page 6: Linked data for Enterprise Data Integration

Linked Enterprise Data Advantages

• Light-weight linked data integration complements more complex SOA architectures

• Unified data (access) model simplifies data integration

• Increase standardization while preserving diversity

• Facilitate information flows along supply and value creation chains

Dramatically reduce data integration costs, increase enterprise flexibility

Page 7: Linked data for Enterprise Data Integration

Creating Knowledge out of Interlinked Data

Linked Data Lifecycle

[Figure: cycle of lifecycle stages]
• Extraction
• Storage/Querying
• Manual revision/authoring
• Inter-linking/Fusing
• Classification/Enrichment
• Quality Analysis
• Evolution/Repair
• Search/Browsing/Exploration

Page 8: Linked data for Enterprise Data Integration

Extraction

[Figure: Linked Data lifecycle diagram]

Page 9: Linked data for Enterprise Data Integration

From unstructured sources

• NLP, text mining, annotation

From semi-structured sources

• DBpedia, LinkedGeoData, DataCube

From structured sources

• RDB2RDF

Extraction
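For the structured case, the idea behind RDB2RDF can be sketched as a naive row-to-triples mapping: each row becomes a subject, each column a predicate, each cell a literal. This is a simplified illustration and not the algorithm of any specific tool named in this deck; the namespace, table, and car model rows are hypothetical.

```python
import sqlite3

BASE = "http://data.example.org/"  # hypothetical namespace


def table_to_ntriples(conn, table, key):
    """Emit one triple per non-NULL cell: row -> subject, column -> predicate."""
    cur = conn.execute(f'SELECT * FROM "{table}" ORDER BY "{key}"')
    columns = [d[0] for d in cur.description]
    triples = []
    for row in cur:
        record = dict(zip(columns, row))
        subject = f"<{BASE}{table}/{record[key]}>"
        for col, value in record.items():
            if col == key or value is None:
                continue  # the key is already in the subject; NULLs yield no triple
            literal = str(value).replace("\\", "\\\\").replace('"', '\\"')
            triples.append(f'{subject} <{BASE}{table}#{col}> "{literal}" .')
    return triples


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE model (id INTEGER PRIMARY KEY, name TEXT, body TEXT)")
conn.execute("INSERT INTO model VALUES (1, 'C 220', 'T-Model')")
conn.execute("INSERT INTO model VALUES (2, 'E 350', NULL)")
ntriples = table_to_ntriples(conn, "model", "id")

if __name__ == "__main__":
    print("\n".join(ntriples))
```

Real mapping tools add what this sketch leaves out, such as datatypes, foreign keys turned into links, and custom vocabularies.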

Page 10: Linked data for Enterprise Data Integration

Extraction: Relational Data

• Many different approaches: D2R, Virtuoso RDF Views, Triplify
• No agreement on a formal semantics of RDB2RDF mapping (LOD readiness, SPARQL-SQL translation)
• W3C RDB2RDF WG

Tool               | Triplify                    | Sparqlify                    | D2RQ           | Virtuoso RDF Views
Technology         | Scripting languages (PHP)   | Java                         | Java           | Whole middleware solution
SPARQL endpoint    | -                           | X                            | X              | X
Mapping language   | SQL                         | SPARQL CONSTRUCT Views + SQL | RDF based      | RDF based
Mapping generation | Manual                      | Semi-automatic               | Semi-automatic | Manual
Scalability        | Medium-high (but no SPARQL) | Very high                    | Medium         | High

Malhotra, Auer, Erling, Hausenblas: W3C RDB2RDF Incubator Group Report. W3C RDB2RDF Incubator Group, 2009.

Page 11: Linked data for Enterprise Data Integration

Sparqlify

• Rationale: exploit existing formalisms (SQL, SPARQL CONSTRUCT) as much as possible
• Flexible & versatile mapping language
• Translates one SPARQL query into exactly one efficiently executable SQL query
• Solid theoretical formalization based on SPARQL-relational algebra transformations
• Extremely scalable through an elaborate view candidate selection mechanism
• Used to publish 20B triples for LinkedGeoData

Stadler, Unbehauen, Auer, Lehmann: Sparqlify – Very Large Scale Linked Data Publication from Relational Databases. Submitted to VLDB-Journal.

[Figure: labeled parts SPARQL Construct, Bridge, SQL View]
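The idea of answering a SPARQL query with exactly one SQL query can be sketched for the simplest case, a single triple pattern `?s <predicate> ?o`. This is a toy under stated assumptions, not Sparqlify's mapping language or its view candidate selection: the `VIEWS` dictionary, the namespace, and the table are all hypothetical.

```python
import sqlite3

BASE = "http://data.example.org/"  # hypothetical namespace

# Hypothetical view definitions: predicate URI -> (table, key column, value column).
VIEWS = {
    BASE + "vocab/name": ("model", "id", "name"),
    BASE + "vocab/body": ("model", "id", "body"),
}


def pattern_to_sql(predicate):
    """Rewrite the triple pattern `?s <predicate> ?o` into one SQL query."""
    table, key, column = VIEWS[predicate]
    return (
        f"SELECT '{BASE}{table}/' || {key} AS s, {column} AS o "
        f"FROM {table} WHERE {column} IS NOT NULL ORDER BY {key}"
    )


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE model (id INTEGER PRIMARY KEY, name TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO model VALUES (?, ?, ?)",
    [(1, "C 220", "T-Model"), (2, "E 350", None)],
)

sql = pattern_to_sql(BASE + "vocab/body")
bindings = conn.execute(sql).fetchall()

if __name__ == "__main__":
    print(sql)
    print(bindings)
```

The subject URI is built inside the SQL query, so the database does all the work and no triples need to be materialized beforehand.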

Page 12: Linked data for Enterprise Data Integration

Storage and Querying

[Figure: Linked Data lifecycle diagram]

Page 13: Linked data for Enterprise Data Integration

Authoring

[Figure: Linked Data lifecycle diagram]

Page 14: Linked data for Enterprise Data Integration

Two Kinds of Semantic Wikis

1. Semantic (Text) Wikis
• Authoring of semantically annotated texts

2. Semantic Data Wikis
• Direct authoring of structured information (i.e. RDF, RDF-Schema, OWL)

Page 15: Linked data for Enterprise Data Integration

Can Linked Data help to solve the EII problem in a Fortune 500 company?

The situation at Daimler (€97.76 billion revenue, 250,000 employees):
• 3,000 heterogeneous IT systems
• Different units (car, bus, truck etc.) with very different views
• No common language
• Inability to identify crucial entities (parts, locations etc.) enterprise-wide

There is no (and cannot be a) single Enterprise Information Model.

A distributed, iterative, bottom-up integration approach such as Linked Data might be able to help (pay-as-you-go).

Page 16: Linked data for Enterprise Data Integration

Search before

Page 17: Linked data for Enterprise Data Integration

OntoWiki with loaded car model data

Page 18: Linked data for Enterprise Data Integration

OntoWiki with loaded car model data

Page 19: Linked data for Enterprise Data Integration

Management of Enterprise Taxonomies with OntoWiki

Based on the W3C SKOS standard

Corporate Language Management at Daimler: 500k concepts in 20 languages

Page 20: Linked data for Enterprise Data Integration

Search after

Showing recommendations from the knowledge base integrating car model data and enterprise taxonomy

Page 21: Linked data for Enterprise Data Integration

You can search for "Kombi" (station wagon) and find T-Models (the Daimler term for station wagon)
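The "Kombi" example can be sketched as query expansion over a SKOS-style taxonomy: a search term that matches any label of a concept is expanded to all labels of that concept. The concept, its labels, and the documents below are hypothetical stand-ins, not Daimler's taxonomy, and plain substring matching stands in for a real search engine.

```python
# Hypothetical SKOS-style concept: preferred labels per language plus alternatives.
CONCEPTS = {
    "http://data.example.org/concept/station-wagon": {
        "prefLabel": {"en": "station wagon", "de": "Kombi"},
        "altLabel": {"en": ["estate car"], "de": ["T-Modell"]},
    },
}

# Hypothetical intranet documents.
DOCUMENTS = [
    "Das neue T-Modell der C-Klasse",
    "Sedan price list 2013",
]


def labels_of(concept):
    """Collect all preferred and alternative labels of a concept."""
    labels = list(concept["prefLabel"].values())
    for alternatives in concept["altLabel"].values():
        labels.extend(alternatives)
    return labels


def expand(query):
    """Return the query plus all labels of any concept labelled with the query."""
    terms = {query.lower()}
    for concept in CONCEPTS.values():
        labels = [label.lower() for label in labels_of(concept)]
        if query.lower() in labels:
            terms.update(labels)
    return terms


def search(query):
    """Return documents containing the query or any of its expanded terms."""
    terms = expand(query)
    return [doc for doc in DOCUMENTS if any(t in doc.lower() for t in terms)]


if __name__ == "__main__":
    print(search("Kombi"))
```

Without the taxonomy, a search for "Kombi" would miss the first document because the word itself never appears in it.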

Page 23: Linked data for Enterprise Data Integration

© CC-BY-NC-ND by ~Dezz~ (residae on flickr)

Linking

[Figure: Linked Data lifecycle diagram]

Page 24: Linked data for Enterprise Data Integration

Linking Entities on the Data Web

In an uncontrolled environment such as the Data Web, there will be a proliferation of equivalent or similar entity identifiers.

Manual link discovery:
• Sindice integration into UIs
• Semantic Pingback

Semi-automatic:
• SILK
• LIMES

Automatic/supervised:
• RAVEN [1]

[1] Ngonga, Lehmann, Auer, Höffner: RAVEN -- Active Learning of Link Specifications, OM@ISWC, 2011.
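Link discovery of the semi-automatic kind boils down to a link specification: a similarity measure plus a threshold. The sketch below uses token-set Jaccard similarity on labels as an assumed measure; it does not reproduce the specification languages of SILK, LIMES, or RAVEN, and the two datasets are hypothetical.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Hypothetical labels of the same kind of entity in two separate systems.
SOURCE = {
    "http://a.example.org/plant/1": "Plant Bremen",
    "http://a.example.org/plant/2": "Plant Sindelfingen",
}
TARGET = {
    "http://b.example.org/site/B": "Bremen plant",
    "http://b.example.org/site/S": "Sindelfingen Assembly Plant",
    "http://b.example.org/site/HQ": "Stuttgart Headquarters",
}


def jaccard(a, b):
    """Jaccard similarity of the lowercase token sets of two labels."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def discover_links(source, target, threshold=0.6):
    """Compare every source label with every target label; keep pairs above the threshold."""
    links = []
    for s_uri, s_label in source.items():
        for t_uri, t_label in target.items():
            if jaccard(s_label, t_label) >= threshold:
                links.append((s_uri, OWL_SAME_AS, t_uri))
    return links


if __name__ == "__main__":
    for link in discover_links(SOURCE, TARGET):
        print(link)
```

The all-pairs comparison is quadratic, which is acceptable for a sketch but is one of the things dedicated link discovery frameworks are built to avoid.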

Page 25: Linked data for Enterprise Data Integration

Enrichment

[Figure: Linked Data lifecycle diagram]

Page 26: Linked data for Enterprise Data Integration

Enrichment & Repair

Linked Data is mainly instance data!

The ORE (Ontology Repair and Enrichment) tool helps improve an OWL ontology by fixing inconsistencies and suggesting further axioms to add.
• Ontology Debugging: uses OWL reasoning to detect inconsistencies and unsatisfiable classes, and identifies the most likely sources of the problems. The user can create a repair plan while maintaining full control.
• Ontology Enrichment: uses the DL-Learner framework to suggest definitions and super classes for existing classes in the KB. This works if instance data is available, and helps harmonise schema and data.

http://aksw.org/Projects/ORE

Lehmann, Auer, Tramp: Class Expression Learning for Ontology Engineering. Journal of Web Semantics (JWS), 2011.
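The intuition behind suggesting super classes from instance data can be shown with a deliberately simple heuristic: if every known instance of class A is also an instance of class B, propose "A subclass of B" to the user. This is an assumed stand-in for illustration only; DL-Learner's class expression learning is far more general. The class names and instances are hypothetical.

```python
from collections import defaultdict

# Hypothetical instance data: (instance, class) type assertions.
TYPES = [
    ("c220t", "StationWagon"), ("c220t", "Car"),
    ("e350t", "StationWagon"), ("e350t", "Car"),
    ("s500", "Car"),
    ("actros", "Truck"),
]


def suggest_superclasses(types, min_instances=2):
    """Suggest (subclass, superclass) pairs supported by the instance data."""
    members = defaultdict(set)
    for instance, cls in types:
        members[cls].add(instance)
    suggestions = []
    for sub, sub_members in members.items():
        if len(sub_members) < min_instances:
            continue  # too little evidence to suggest anything
        for sup, sup_members in members.items():
            if sub != sup and sub_members <= sup_members:
                suggestions.append((sub, sup))
    return sorted(suggestions)


if __name__ == "__main__":
    print(suggest_superclasses(TYPES))
```

The output is only a suggestion list; as on the slide, the decision to add an axiom stays with the user.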

Page 27: Linked data for Enterprise Data Integration

Quality Analysis

[Figure: Linked Data lifecycle diagram]

Page 28: Linked data for Enterprise Data Integration

Linked Data Quality Analysis

Quality on the Data Web varies a lot:
• Hand-crafted or expensively curated knowledge bases (e.g. DBLP, UMLS) vs. knowledge extracted from text or Web 2.0 sources (DBpedia)

Research challenge:
• Establish measures for assessing the authority, provenance, and reliability of Data Web resources

Opportunity for EII: employ crowd-sourced knowledge from the Data Web in the enterprise

FP7-IP DIACHRON: Managing the Evolution and Preservation of the Data Web (started April 2013)

Page 29: Linked data for Enterprise Data Integration

Evolution

© CC-BY-SA by alasis on flickr

[Figure: Linked Data lifecycle diagram]

Page 30: Linked data for Enterprise Data Integration

Exploration

[Figure: Linked Data lifecycle diagram]

Page 31: Linked data for Enterprise Data Integration

An ecosystem of LOD visualizations

LOD exploration widgets:
• Spatial faceted-browsing
• Faceted-browsing
• Statistical visualization
• Entity-/faceted-based browsing
• Domain-specific visualizations
• …

Choreography layer:
• Dataset analysis (size, vocabularies, property histograms etc.)
• Selection of suitable visualization widgets

LOD datasets

Brunetti, Auer, García: The Linked Data Visualization Model. To appear in IJSWIS, 2012.
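The two tasks of the choreography layer, dataset analysis and widget selection, can be sketched as a property histogram followed by a rule that picks widgets. The selection rule (offer a spatial widget when WGS84 coordinates are present) is an assumption made for this sketch, not the Linked Data Visualization Model itself, and the sample triples are hypothetical.

```python
from collections import Counter

GEO_LAT = "http://www.w3.org/2003/01/geo/wgs84_pos#lat"
GEO_LONG = "http://www.w3.org/2003/01/geo/wgs84_pos#long"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Hypothetical dataset of (subject, predicate, object) triples.
TRIPLES = [
    ("http://data.example.org/plant/1", RDFS_LABEL, "Plant Bremen"),
    ("http://data.example.org/plant/1", GEO_LAT, "53.07"),
    ("http://data.example.org/plant/1", GEO_LONG, "8.80"),
    ("http://data.example.org/plant/2", RDFS_LABEL, "Plant Sindelfingen"),
]


def property_histogram(triples):
    """Count how often each property is used in the dataset."""
    return Counter(predicate for _, predicate, _ in triples)


def select_widgets(histogram):
    """Pick visualization widgets that fit the properties found in the data."""
    widgets = ["faceted browsing"]  # assumed to be applicable to any dataset
    if GEO_LAT in histogram and GEO_LONG in histogram:
        widgets.append("spatial faceted browsing")
    return widgets


if __name__ == "__main__":
    histogram = property_histogram(TRIPLES)
    print(histogram.most_common())
    print(select_widgets(histogram))
```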

Page 32: Linked data for Enterprise Data Integration

LOD Life-(Washing-)cycle supported by Debian based LOD2 Stack

http://stack.lod2.eu

Page 33: Linked data for Enterprise Data Integration

Linked Enterprise Intra Data Webs fill the gap between Intra-/Extranets and EIS/ERP

Unstructured Information Management

Structured Information Management

Support the long tail of enterprise information domains:
• Human resources
• Requirements engineering
• Supply chains

Page 34: Linked data for Enterprise Data Integration

Take home messages

• Linked Data is a promising technology for closing the gap between SOA and unstructured information management
• The wealth of knowledge available as LOD can be leveraged as background knowledge for enterprise applications
• The application of Linked Data in the enterprise is still largely unexplored (opportunity)
• Linked Data will make Enterprise Information Integration more flexible, iterative, and cost-effective

Auer, Frischmuth, Klímek, Tramp, Unbehauen, Holzweißig, Marquardt: Linked Data in Enterprise Information Integration. Submitted to Semantic Web Journal.

Page 35: Linked data for Enterprise Data Integration

Thanks for your attention!

Sören Auer
http://www.informatik.uni-leipzig.de/~auer | http://aksw.org |

http://[email protected]