Facet: Building Web Pages With SPARQL

SWIG-UK Event, HP Labs, November 23rd 2007

Leigh Dodds, Chief Technology Officer, Ingenta

a division of Publishing Technology

Problem Statement

Where’s my RDF-native Web Framework?!

There is no good system for integrating RDF repositories with an existing web framework (in Java)

Design Constraints

[Diagram: boxes labelled Journal, Article, Issue, 2007 Index, 2006 Index, and New Article Index]

Design Constraints

• A web page presents data that is a sub-graph (i.e. a view) over a larger RDF store (the data model)

• The extent of the sub-graph may vary for different presentations of the data, and may contain arbitrary properties

• The description of the sub-graph (a lens) should be declarative

• That sub-graph is “rooted” on a single primary resource (e.g. a Journal)

• The identifier of the primary resource can be derived from the request URL (e.g. by rewriting the URI), and vice versa

– Therefore, we don’t support blank nodes as primary resources

– Or fragment identifiers in URIs!

• The sub-graph should be serializable into an object graph for presentation to the templating system

Facet Request Handling

To return a response we need to answer three questions…

1. What lens are we going to apply?

2. What data model are we going to apply it to?

3. What’s the identifier of the primary resource?

Lenses

Describing views of RDF data

A Simple Lens

PREFIX dc: <http://purl.org/dc/elements/1.1/>
CONSTRUCT {
  ?item dc:title ?title .
  ?item dc:language ?language .
}
WHERE {
  ?item dc:title ?title .
  OPTIONAL { ?item dc:language ?language . }
}

Configuring Lenses

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ja="http://jena.hpl.hp.com/2005/11/Assembler#"
         xmlns:view="http://metastore.ingenta.com/facet/lens/">
  <rdf:Description rdf:about="http://metastore.ingenta.com/facet/lens/Sparql">
    <ja:assembler>com.ingenta.facet….</ja:assembler>
  </rdf:Description>
  <view:Sparql rdf:about="http://oecd.metastore.ingenta.com/views/just-the-title">
    <view:query>sparql/just-the-title.rq</view:query>
  </view:Sparql>
</rdf:RDF>

Data Model

Configuring RDF graphs

Data Model Configuration

• Jena Assembler API

• Add notion of application level default data model

– Uses well-known URI

• Lenses may be configured to apply to a specific data model

– Allows “sharding” of data models

<view:Sparql rdf:about="http://oecd.metastore.ingenta.com/views/just-the-title">
  <view:appliesTo rdf:resource="…"/>
</view:Sparql>
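
The choice of data model for a lens could be sketched roughly as follows. This is only an illustration, not Facet's actual code: the class `DataModelResolver`, its methods, and the `urn:example:` URIs are hypothetical, and the slides do not state the real well-known URI of the default data model.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: picks the data model that a lens should be applied to. */
final class DataModelResolver {
    // Placeholder only; the actual well-known URI is not given in the slides.
    static final String DEFAULT_MODEL = "urn:example:default-data-model";

    // Lens URI -> data model URI, as would be declared with view:appliesTo.
    private final Map<String, String> appliesTo = new HashMap<>();

    /** Records that a lens is configured for a specific data model. */
    void register(String lensUri, String dataModelUri) {
        appliesTo.put(lensUri, dataModelUri);
    }

    /** Returns the lens-specific data model, or the application default. */
    String resolve(String lensUri) {
        return appliesTo.getOrDefault(lensUri, DEFAULT_MODEL);
    }
}
```

A lens without a `view:appliesTo` entry falls back to the default model, which is what would make sharding opt-in per lens.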

Resource Identifiers

Mapping Resource Identifiers

• In a RESTful application, each resource should have a single primary location

• Allows resource identifiers to be derived using URL rewriting

http://test.sourceoecd.org/oecd/content/journal/18168116

http://oecd.metastore.ingenta.com/content/journal/18168116
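
A minimal sketch of this kind of rewriting, assuming the first URL above is the request URL and the second is the resource identifier, and that the mapping is a simple prefix swap. The class `ResourceIdMapper` and its method names are hypothetical, not part of Facet.

```java
/** Hypothetical sketch: two-way mapping between request URLs and resource identifiers. */
final class ResourceIdMapper {
    private final String siteBase;   // e.g. "http://test.sourceoecd.org/oecd"
    private final String storeBase;  // e.g. "http://oecd.metastore.ingenta.com"

    ResourceIdMapper(String siteBase, String storeBase) {
        this.siteBase = siteBase;
        this.storeBase = storeBase;
    }

    /** Request URL -> resource identifier. */
    String toResourceId(String requestUrl) {
        return swapPrefix(requestUrl, siteBase, storeBase);
    }

    /** Resource identifier -> request URL (the "vice versa" direction). */
    String toRequestUrl(String resourceId) {
        return swapPrefix(resourceId, storeBase, siteBase);
    }

    private static String swapPrefix(String value, String from, String to) {
        // Fragment identifiers cannot be recovered from a request URL, so reject them.
        if (value.indexOf('#') >= 0) {
            throw new IllegalArgumentException("Fragment identifiers are not supported: " + value);
        }
        if (!value.startsWith(from + "/")) {
            throw new IllegalArgumentException("Not under " + from + ": " + value);
        }
        return to + value.substring(from.length());
    }
}
```

Because the mapping is reversible, links in a rendered page can be generated from resource identifiers with the same rule that resolves incoming requests.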

Serialization

Mapping an RDF sub-graph to a Java object model

Serialization

• Primary resource is a ContentItem

– Has an identifier and Map of properties

• Walk through graph, beginning at “root” resource, mapping RDF statements to Map entries

• Mapping of property names is configurable

– Default based on namespace prefix, e.g. dc_title

• Mapping of objects of each statement to a suitable Java object

– ContentItem, Map, List, Integer, etc.
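
The default naming rule and the shape of a ContentItem might look roughly like this. It is a sketch under assumptions: only the name `ContentItem` and the `dc_title` convention come from the slides, while the fields, the `PropertyNamer` class, and its methods are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Stand-in for the ContentItem on the slide: an identifier plus a Map of properties. */
final class ContentItem {
    final String id;
    final Map<String, Object> properties = new LinkedHashMap<>();

    ContentItem(String id) {
        this.id = id;
    }
}

/** Hypothetical sketch of the default naming rule: namespace prefix + "_" + local name. */
final class PropertyNamer {
    // Namespace URI -> prefix.
    private final Map<String, String> prefixes = new LinkedHashMap<>();

    void addPrefix(String prefix, String namespaceUri) {
        prefixes.put(namespaceUri, prefix);
    }

    /** Maps a property URI to a template-friendly key, e.g. dc_title. */
    String keyFor(String propertyUri) {
        for (Map.Entry<String, String> e : prefixes.entrySet()) {
            if (propertyUri.startsWith(e.getKey())) {
                return e.getValue() + "_" + propertyUri.substring(e.getKey().length());
            }
        }
        throw new IllegalArgumentException("No prefix registered for " + propertyUri);
    }
}
```

Keys without colons keep the property names usable as plain identifiers in a templating language.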

Serialization (special cases)

• Multilingual properties

– Special casing (i.e. a hack!) to modify naming, e.g. dc_title_fr

• Repeated properties, e.g. dc:subject

– Use schema annotation to indicate these, and then serialize to a List

• XML Literals & multi-lingual data

– E.g. multi-lingual abstracts (dc:description) that contain XHTML markup

– Use schema annotation, parse and create separate Map entries
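
The first two special cases could be sketched as below. The class `SpecialCaseMapper` and its methods are hypothetical, and the set of repeatable keys is filled by hand here, whereas the slides say it is driven by a schema annotation. XML literal handling is not shown.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: language-suffixed keys and List values for repeated properties. */
final class SpecialCaseMapper {
    // Keys flagged as repeatable (assumed to come from a schema annotation).
    private final Set<String> repeatable = new HashSet<>();
    private final Map<String, Object> properties = new LinkedHashMap<>();

    void markRepeatable(String key) {
        repeatable.add(key);
    }

    /** Adds one value; language may be null or empty for plain literals. */
    void add(String key, String language, Object value) {
        // Language-tagged literals get a suffixed key, e.g. dc_title_fr.
        String name = (language == null || language.isEmpty()) ? key : key + "_" + language;
        if (repeatable.contains(key)) {
            // Repeated properties accumulate into a List.
            @SuppressWarnings("unchecked")
            List<Object> values =
                (List<Object>) properties.computeIfAbsent(name, k -> new ArrayList<Object>());
            values.add(value);
        } else {
            // Single-valued properties: the last value wins.
            properties.put(name, value);
        }
    }

    Map<String, Object> properties() {
        return properties;
    }
}
```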

Additional Features

• “MultiLens”, applying multiple queries in series to build results

• Automatic availability of URL parameters as SPARQL query parameters

• Integral API support

– RDF output for free; JSON output trivial

• Simple content lifecycle, mapping to HTTP resource statuses

– E.g. Content Not Found, Moved, Gone

– Add type (life:Deleted) and properties (life:newLocation) to data

• Support for URL Aliasing based on property values

– /content/issn/1234-5678 -> /content/journal/abcdef

– <prism:issn>
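
One way the content lifecycle could map onto HTTP statuses is sketched below. The specific status codes (404, 301, 410), the precedence between them, and the `LifecycleStatus` class are assumptions for illustration; the slides name only the statuses and the `life:Deleted` and `life:newLocation` terms.

```java
import java.util.Set;

/** Hypothetical sketch: derives an HTTP status from lifecycle data on a resource. */
final class LifecycleStatus {
    static final int OK = 200;
    static final int MOVED_PERMANENTLY = 301;
    static final int NOT_FOUND = 404;
    static final int GONE = 410;

    /**
     * @param exists      whether any data was found for the primary resource
     * @param types       rdf:type values of the resource, as prefixed names
     * @param newLocation value of life:newLocation, or null if absent
     */
    static int statusFor(boolean exists, Set<String> types, String newLocation) {
        if (!exists) {
            return NOT_FOUND;           // Content Not Found
        }
        if (newLocation != null) {
            return MOVED_PERMANENTLY;   // Moved: redirect to the new location
        }
        if (types.contains("life:Deleted")) {
            return GONE;                // Gone
        }
        return OK;
    }
}
```

Keeping the lifecycle in the data means a lens query can surface it like any other property, with no separate status table.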

Summary

Pros

• By embracing a few limitations on RDF modelling (e.g. on identifiers), Facet provides a very flexible means of building web pages from an RDF repository

• Reliance on SPARQL and Jena API features provides a great deal of configuration options

• Good integration with existing web templating environments

• Quick to learn

Cons

• Model limitations mean it’s not suited to all RDF “in the wild”