facet: building web pages with sparql

Facet: Building Web Pages With SPARQL. SWIG-UK Event, HP Labs, November 23rd 2007. Leigh Dodds, Chief Technology Officer, Ingenta, a division of Publishing Technology.


Page 1: Facet: Building Web Pages with SPARQL

a division of Publishing Technology

Facet: Building Web Pages With SPARQL

SWIG-UK Event, HP Labs, November 23rd 2007

Leigh Dodds, Chief Technology Officer, Ingenta

Page 2: Facet: Building Web Pages with SPARQL

Problem Statement

Page 3: Facet: Building Web Pages with SPARQL

Where’s my RDF-native Web Framework?!

There is no good system for integrating RDF repositories with an existing web framework (in Java)

Page 4: Facet: Building Web Pages with SPARQL

Design Constraints

Page 5: Facet: Building Web Pages with SPARQL

Page 6: Facet: Building Web Pages with SPARQL

Page 7: Facet: Building Web Pages with SPARQL

Page 8: Facet: Building Web Pages with SPARQL

Page 9: Facet: Building Web Pages with SPARQL

Page 10: Facet: Building Web Pages with SPARQL

Page 11: Facet: Building Web Pages with SPARQL

Page 12: Facet: Building Web Pages with SPARQL

[Diagram. Labels: Journal; Issue (x3); Article (x3); 2007 Index; 2006 Index; New Article Index]

Page 13: Facet: Building Web Pages with SPARQL

Design Constraints

• A web page presents data that is a sub-graph (i.e. a view) over a larger RDF store (the data model)

• The extent of the sub-graph may vary for different presentations of the data, and may contain arbitrary properties

• The description of the sub-graph (a lens) should be declarative

• That sub-graph is “rooted” on a single primary resource (e.g. a Journal)

• The identifier of the primary resource can be derived from the request URL, e.g. by rewriting the URI, and vice versa
  – Therefore, we don’t support blank nodes as primary resources
  – Or fragment identifiers in URIs!

• The sub-graph should be serializable into an object graph for presentation to the templating system

Page 14: Facet: Building Web Pages with SPARQL

Facet Request Handling

To return a response we need to answer three questions…

1. What lens are we going to apply?

2. What data model are we going to apply it to?

3. What’s the identifier of the primary resource?

Page 15: Facet: Building Web Pages with SPARQL

Lenses

Describing views of RDF data

Page 16: Facet: Building Web Pages with SPARQL

A Simple Lens

PREFIX dc: <http://purl.org/dc/elements/1.1/>

CONSTRUCT {
  ?item dc:title ?title .
  ?item dc:language ?language .
}
WHERE {
  ?item dc:title ?title .
  OPTIONAL { ?item dc:language ?language . }
}

Page 17: Facet: Building Web Pages with SPARQL

Configuring Lenses

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ja="http://jena.hpl.hp.com/2005/11/Assembler#"
         xmlns:view="http://metastore.ingenta.com/facet/lens/">

  <rdf:Description rdf:about="http://metastore.ingenta.com/facet/lens/Sparql">
    <ja:assembler>com.ingenta.facet….</ja:assembler>
  </rdf:Description>

  <view:Sparql rdf:about="http://oecd.metastore.ingenta.com/views/just-the-title">
    <view:query>sparql/just-the-title.rq</view:query>
  </view:Sparql>

</rdf:RDF>

Page 18: Facet: Building Web Pages with SPARQL

Data Model

Configuring RDF graphs

Page 19: Facet: Building Web Pages with SPARQL

Data Model Configuration

• Jena Assembler API

• Add notion of application level default data model
  – Uses well-known URI

• Lenses may be configured to apply to a specific data model
  – Allows “sharding” of data models

<view:Sparql rdf:about="http://oecd.metastore.ingenta.com/views/just-the-title">
  <view:appliesTo rdf:resource="…"/>
</view:Sparql>
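
The fallback rule described on this slide (use the lens's own data model if one is declared, otherwise the application default) might be sketched as below. This is plain Java rather than the Jena Assembler API, and `DataModelSelector`, its methods and any URIs passed to it are hypothetical names chosen for illustration, not Facet's actual classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: class and method names are illustrative, not Facet's API.
final class DataModelSelector {
    // URI of the application-level default data model (the "well-known URI").
    private final String defaultModelUri;
    // lens URI -> data model URI, as a view:appliesTo statement would declare it.
    private final Map<String, String> appliesTo = new HashMap<>();

    DataModelSelector(String defaultModelUri) {
        this.defaultModelUri = defaultModelUri;
    }

    /** Records that a lens applies to one specific data model (a "shard"). */
    void declareAppliesTo(String lensUri, String modelUri) {
        appliesTo.put(lensUri, modelUri);
    }

    /** Returns the lens-specific data model if declared, else the default. */
    String modelFor(String lensUri) {
        return appliesTo.getOrDefault(lensUri, defaultModelUri);
    }
}
```

A lens with no declared model simply resolves to the default, so sharding stays opt-in per lens.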

Page 20: Facet: Building Web Pages with SPARQL

Resource Identifiers

Page 21: Facet: Building Web Pages with SPARQL

Mapping Resource Identifiers

• In a RESTful application, each resource should have a single primary location

• Allows resource identifiers to be derived using URL rewriting

http://test.sourceoecd.org/oecd/content/journal/18168116

http://oecd.metastore.ingenta.com/content/journal/18168116
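
A minimal sketch of such a rewrite, assuming the first URL above is the request URL and the second is the resource identifier, and that the mapping is a simple prefix swap. `ResourceIdMapper` and the way the two base prefixes are supplied are assumptions for illustration; the slides do not show how Facet configures this.

```java
import java.net.URI;

// Illustrative sketch only; ResourceIdMapper is a hypothetical name, not part of Facet.
final class ResourceIdMapper {
    private final String publicBase;   // prefix of the URLs requested by clients
    private final String resourceBase; // prefix of the identifiers used in the RDF store

    ResourceIdMapper(String publicBase, String resourceBase) {
        this.publicBase = publicBase;
        this.resourceBase = resourceBase;
    }

    /** Request URL -> identifier of the primary resource. */
    String toResourceId(String requestUrl) {
        return rewrite(requestUrl, publicBase, resourceBase);
    }

    /** Identifier of a resource -> the URL it is served from. */
    String toRequestUrl(String resourceId) {
        return rewrite(resourceId, resourceBase, publicBase);
    }

    private static String rewrite(String uri, String from, String to) {
        // Fragment identifiers never reach the server, so they cannot round-trip.
        if (URI.create(uri).getFragment() != null) {
            throw new IllegalArgumentException("Fragment identifiers are not supported: " + uri);
        }
        if (!uri.startsWith(from + "/")) {
            throw new IllegalArgumentException(uri + " is not under " + from);
        }
        // Swap the prefix and keep the rest of the path unchanged.
        return to + uri.substring(from.length());
    }
}
```

Because the same prefix swap runs in both directions, any identifier that maps to a URL maps back to itself, which is what lets links in rendered pages be generated from resource identifiers.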

Page 22: Facet: Building Web Pages with SPARQL

Serialization

Mapping an RDF sub-graph to a Java object model

Page 23: Facet: Building Web Pages with SPARQL

Serialization

• Primary resource is a ContentItem
  – Has an identifier and Map of properties

• Walk through graph, beginning at “root” resource, mapping RDF statements to Map entries

• Mapping of property names is configurable
  – Default based on namespace prefix, e.g. dc_title

• Mapping of objects of each statement to a suitable Java object
  – ContentItem, Map, List, Integer, etc.
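
The walk described above could look roughly like the following. It is a sketch under stated assumptions, not Facet's code: it uses a plain list of statements instead of a Jena model, keeps every literal as a String (typed literals such as Integer are left out), and `GraphSerializer`, `Statement` and the fields of `ContentItem` are hypothetical shapes for illustration. Only the `ContentItem` name and the `dc_title` naming style come from the slides.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the graph walk; not Facet's actual classes.
final class GraphSerializer {
    /** One RDF statement; objectIsResource separates URIs from literals. */
    static final class Statement {
        final String subject, predicate, object;
        final boolean objectIsResource;
        Statement(String subject, String predicate, String object, boolean objectIsResource) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
            this.objectIsResource = objectIsResource;
        }
    }

    /** A resource with an identifier and a Map of properties. */
    static final class ContentItem {
        final String identifier;
        final Map<String, Object> properties = new LinkedHashMap<>();
        ContentItem(String identifier) { this.identifier = identifier; }
    }

    private final Map<String, String> prefixByNamespace;

    GraphSerializer(Map<String, String> prefixByNamespace) {
        this.prefixByNamespace = prefixByNamespace;
    }

    /** Default naming: namespace prefix + "_" + local name, e.g. dc_title. */
    String propertyName(String predicateUri) {
        for (Map.Entry<String, String> e : prefixByNamespace.entrySet()) {
            if (predicateUri.startsWith(e.getKey())) {
                return e.getValue() + "_" + predicateUri.substring(e.getKey().length());
            }
        }
        throw new IllegalArgumentException("No prefix configured for " + predicateUri);
    }

    /** Builds the object graph starting at the root resource. */
    ContentItem serialize(String rootUri, List<Statement> graph) {
        return walk(rootUri, graph, new HashSet<>());
    }

    private ContentItem walk(String uri, List<Statement> graph, Set<String> visiting) {
        ContentItem item = new ContentItem(uri);
        if (!visiting.add(uri)) {
            return item; // cycle guard: emit the identifier only
        }
        for (Statement s : graph) {
            if (!s.subject.equals(uri)) continue;
            // Resource objects become nested ContentItems; literals stay Strings.
            Object value = s.objectIsResource ? walk(s.object, graph, visiting) : s.object;
            item.properties.put(propertyName(s.predicate), value);
        }
        visiting.remove(uri);
        return item;
    }
}
```

Flat keys such as `dc_title` are convenient for templating languages, which typically address Map entries by simple names.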

Page 24: Facet: Building Web Pages with SPARQL

Serialization (special cases)

• Multilingual properties
  – Special casing (i.e. a hack!) to modify naming, e.g. dc_title_fr

• Repeated properties, e.g. dc:subject
  – Use schema annotation to indicate these, and then serialize to a List

• XML Literals & multilingual data
  – E.g. multilingual abstracts (dc:description) that contain XHTML markup
  – Use schema annotation, parse and create separate Map entries
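
The first two special cases can be illustrated with a small sketch. It assumes the language tag is appended to the key with an underscore (as in `dc_title_fr`) and uses a plain set of property names as a stand-in for the schema annotation; `SpecialCaseNaming` and its methods are hypothetical, and the XML literal case is not covered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; the set of repeatable names stands in for a schema annotation.
final class SpecialCaseNaming {
    private final Set<String> repeatable; // e.g. "dc_subject"

    SpecialCaseNaming(Set<String> repeatable) {
        this.repeatable = repeatable;
    }

    /** Appends the language tag when present: dc_title + fr -> dc_title_fr. */
    static String key(String name, String lang) {
        return (lang == null || lang.isEmpty()) ? name : name + "_" + lang;
    }

    /** Stores a value, collecting repeatable properties into a List. */
    @SuppressWarnings("unchecked")
    void put(Map<String, Object> props, String name, String lang, Object value) {
        String k = key(name, lang);
        if (repeatable.contains(name)) {
            // Create the List on first use, then append each further value.
            ((List<Object>) props.computeIfAbsent(k, x -> new ArrayList<>())).add(value);
        } else {
            props.put(k, value);
        }
    }
}
```

Declaring repeatability up front means a template always sees a List for `dc_subject`, even when a particular resource happens to have only one value.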

Page 25: Facet: Building Web Pages with SPARQL

Additional Features

• “MultiLens”, applying multiple queries in series to build results

• Automatic availability of URL parameters as SPARQL query parameters

• Integral API support
  – RDF output for free; JSON output trivial

• Simple content lifecycle, mapping to HTTP resource statuses
  – E.g. Content Not Found, Moved, Gone
  – Add type (life:Deleted) and properties (life:newLocation) to data

• Support for URL Aliasing based on property values
  – /content/issn/1234-5678 -> /content/journal/abcdef
  – <prism:issn>
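
One plausible reading of the content lifecycle bullet is sketched below. The slides name the statuses (Not Found, Moved, Gone) and the markers (`life:Deleted`, `life:newLocation`) but not how they pair up, so the pairing here (new location gives 301, deleted type gives 410, no data gives 404) and the precedence between them are assumptions, as is the `LifecycleStatus` class itself.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; the marker-to-status pairing is an assumption, not taken from Facet.
final class LifecycleStatus {
    static final String LIFE_DELETED = "life:Deleted";
    static final String LIFE_NEW_LOCATION = "life:newLocation";

    /**
     * Picks an HTTP status from what the data model says about a resource.
     * found is false when the lens returned no statements for the resource.
     */
    static int statusFor(boolean found, Set<String> types, Map<String, String> properties) {
        if (!found) {
            return 404; // Content Not Found
        }
        if (properties.containsKey(LIFE_NEW_LOCATION)) {
            return 301; // Moved: redirect to the value of life:newLocation
        }
        if (types.contains(LIFE_DELETED)) {
            return 410; // Gone
        }
        return 200;
    }
}
```

Keeping the lifecycle in the data means retiring or moving content is a data change rather than a change to web server configuration.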

Page 26: Facet: Building Web Pages with SPARQL

Summary

Pros

• By embracing a few limitations on RDF modelling (e.g. on identifiers), Facet provides a very flexible means of building web pages from an RDF repository

• Reliance on SPARQL and Jena API features provides a great deal of configuration options

• Good integration with existing web templating environments

• Quick to learn

Cons

• Model limitations mean it’s not suited to all RDF “in the wild”