sigma ee: reaping low-hanging fruits in rdf-based data integration
DESCRIPTION
A presentation I gave at I-Semantics 2010 on Sigma EE, an RDF-based data integration front-end. Sigma EE is now available for download here: http://sig.ma/?page=help
TRANSCRIPT
Copyright 2009 Digital Enterprise Research Institute. All rights reserved.
Digital Enterprise Research Institute www.deri.ie
Sigma EE: Reaping low-hanging fruits in RDF-based data integration
Richard Cyganiak
I-Semantics 2010, Graz
Intro
Semantic Technologies conferences
– In-use Tracks
– Applications session
D2RQ
– Exposes contents of relational databases as RDF/SPARQL
– Just a format converter; what do people use it for?
The common theme …
Integration of data across the organization/project
The RDF-based data integration project
Probably limited budget … Otherwise would buy from SAP or Oracle
Where next after “Hello World”?
Sigma EE
Originally not built for enterprise data but for web data
Sindice, search engine for the Web of Data
– Microformats, RDFa, Linked Data on the Web
– For building apps on top of data; search API
– http://sindice.com/
How to show the richness of all that data? http://sig.ma/
sig.ma demo
Off-the-shelf UI for the RDF Bus
Background
The problem: How to provide uniform access to heterogeneous data sources?
Value-added services:
– Search
– Browsing
– Recommendations of related items
– Reporting
– Dashboarding
– Notifications
– …
Solutions?
– Data Warehousing
– Enterprise Information Integration
– Enterprise Search
– A middle ground in-between?
Data Warehousing, EII
– Integrate enterprise data sources into a new data source
  – Data Warehouse: materialized (new DB)
  – Enterprise Information Integration: virtual (distributed queries)
– Focus on data
– Tight integration
– High up-front cost
Enterprise Search
– Provides the most sought-after service (search)
– Focus on documents
  – Full-text search
– Lower up-front cost (no schema alignment)
– Providing value-added services on top is difficult
A middle ground
– Start by providing access to data on a per-business-object basis without prior schema alignment
  – Services: browsing of the catalog of objects; search
– Align, link and reconcile as required to enable more services, e.g., expressive queries
A middle ground
No accepted term yet
– Data Spaces?
– Pay-as-you-go Data Integration?
– Linked Enterprise Data?
The RDF technology stack
– A standards-based “data-first” approach
– RDF, SPARQL, OWL – W3C standards
– Off-the-shelf components
– Integrates well with web data sources
The “RDF Bus”
Various implementation strategies
– ETL + One Big Triple Store with SPARQL endpoint
– Several SPARQL endpoints (SPARQL 1.1 SERVICE feature?)
– Linked Data style (resolvable URIs)
Bus details determine what services can be provided
– Can you do high-performance SPARQL?
– Can you do full-text search?
– Real-time up-to-date information or significant delay?
– Where is alignment handled?
– Who can hook in new data sources?
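To make the "several SPARQL endpoints" strategy concrete, here is a minimal sketch of how one graph pattern could be sent to multiple endpoints using the SPARQL 1.1 SERVICE keyword. The function name `federated_query` and the endpoint URLs are hypothetical and not part of Sigma EE; the sketch only builds the query string and does not execute it.

```python
def federated_query(endpoints, pattern):
    """Build one SPARQL 1.1 query that asks every endpoint for the same pattern.

    `endpoints` is a list of SPARQL endpoint URLs (hypothetical here),
    `pattern` is a graph pattern such as "?s ?p ?o".
    """
    if not endpoints:
        raise ValueError("at least one endpoint is required")
    # One SERVICE block per endpoint; UNION combines the partial results.
    blocks = [f"  {{ SERVICE <{url}> {{ {pattern} }} }}" for url in endpoints]
    body = "\n  UNION\n".join(blocks)
    return f"SELECT * WHERE {{\n{body}\n}}"


if __name__ == "__main__":
    # Assumed example endpoints, for illustration only.
    print(federated_query(
        ["http://crm.example/sparql", "http://wiki.example/sparql"],
        "?entity ?property ?value",
    ))
```

Whether this is practical depends on the bus questions listed above: the query is only as fast and as up to date as the slowest endpoint involved.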
Sigma EE
Services: search, browsing
Strengths
– Minimal requirements for the RDF bus
– Strong support for provenance
– Dynamic UI
Bus has to provide Search and Entity descriptions
– E.g., SPARQL endpoint with full-text search
– E.g., Solr
– E.g., Sindice + (part of) the Web
– E.g., custom Java classes
– Or multiple of the above
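The two requirements on the bus (search and entity descriptions) can be pictured as a small interface. The following is an illustrative sketch only: the slides mention custom Java classes, and Sigma EE's real interfaces are not shown in the talk, so every name here (`BusSource`, `Statement`, `InMemorySource`, `search_all`, `describe_all`) is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Statement:
    prop: str    # property name, e.g. "name"
    value: str   # property value
    source: str  # which data source said so (provenance)


class BusSource(Protocol):
    """Hypothetical minimal contract a data source must fulfil."""

    def search(self, text: str) -> list[str]: ...
    def describe(self, entity: str) -> list[Statement]: ...


class InMemorySource:
    """Toy source backed by a dict: entity URI -> list of (property, value)."""

    def __init__(self, name: str, data: dict[str, list[tuple[str, str]]]):
        self.name = name
        self.data = data

    def search(self, text: str) -> list[str]:
        needle = text.lower()
        return [e for e, props in self.data.items()
                if any(needle in v.lower() for _, v in props)]

    def describe(self, entity: str) -> list[Statement]:
        return [Statement(p, v, self.name) for p, v in self.data.get(entity, [])]


def search_all(sources: list[BusSource], text: str) -> list[str]:
    """Query every source and return distinct entity URIs in first-seen order."""
    hits: list[str] = []
    for src in sources:
        for entity in src.search(text):
            if entity not in hits:
                hits.append(entity)
    return hits


def describe_all(sources: list[BusSource], entity: str) -> list[Statement]:
    """Collect statements about one entity from all sources, keeping provenance."""
    return [st for src in sources for st in src.describe(entity)]
```

A SPARQL endpoint, a Solr index or Sindice could each sit behind such an interface, which is why several of them can be combined.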
Architecture
Sigma UI
– Full-text search
– On-the-fly fuzzy merge of data sources
– Empower user to evaluate provenance, reject and accept data sources
– Show/hide/rearrange properties and values
– Browse to related entities
– Permalinks, embeddable widgets
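As a rough illustration of merging with provenance, the sketch below groups values from several sources by property, records which sources contributed each value, and lets a caller reject sources. The talk does not describe Sigma's actual merge algorithm; the case and whitespace normalisation used here is a deliberately crude stand-in for "fuzzy" matching, and the function name `merge` is hypothetical.

```python
from collections import defaultdict


def merge(statements, rejected=frozenset()):
    """Merge (property, value, source) tuples into one entity view.

    Returns {property: {normalised value: sorted list of sources}}.
    Statements from sources in `rejected` are left out, mimicking a user
    rejecting a data source in the UI.
    """
    merged = defaultdict(lambda: defaultdict(set))
    for prop, value, source in statements:
        if source in rejected:
            continue
        # Crude normalisation (assumption): lowercase, collapse whitespace.
        key = " ".join(value.lower().split())
        merged[prop][key].add(source)
    return {p: {v: sorted(srcs) for v, srcs in vals.items()}
            for p, vals in merged.items()}


if __name__ == "__main__":
    data = [
        ("name", "Richard Cyganiak", "crm"),
        ("name", "richard  cyganiak", "wiki"),
        ("phone", "123", "wiki"),
    ]
    print(merge(data))
    print(merge(data, rejected={"wiki"}))
```

Keeping the source list next to every value is what allows the user to judge provenance and to see the view change when a source is rejected or accepted again.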
Summary
– Sigma EE: front-end for your RDF Bus
  – E.g., for your triple store
– Off-the-shelf UI with minimum configuration
– Available under GPL or other licenses at http://sig.ma/?page=help
– Running at http://sig.ma/