oclc research tai chi webinar 7/1/2010 oclc open source linked data framework ralph levan sr....

Post on 23-Dec-2015

216 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

OCLC ResearchTAI CHI Webinar

7/1/2010

OCLC Open Source Linked Data Framework

OCLC Open Source Linked Data Framework

Ralph LeVanSr. Research ScientistOCLC Research

Goal: Expose Text Database Content as Linked DataGoal: Expose Text Database Content as Linked Data

Technique: Using a combination of the urlrewritefilter from tuckey.org, the content negotiation component from the Freie Universität Berlin’s Pubby server and our Open Source SRW/U server you can expose the records in your database as Linked Data

RoadmapRoadmap

1. SRW/U Server

2. URIs for records

3. Real World Objects for records

4. Multiple record formats

5. RDF needs to be returned

6. Content Negotiation

SRW/U ServerSRW/U Server

SRW/U server sits in front of text databases

• We have interfaces for DSpace, Lucene and Pears

Easy to write your own interface

• Convert CQL query to native query language

• Do search and return a resultset object

• Return records from the resultset

• (The Lucene interface is a good simple example of how to build your own database interface)

I expose my SRW/U service as <context>/search, but you can put it wherever you want.

URIs for recordsURIs for records

urlrewritefilter implements apache mod_rewrite patterns for java servlets

It sees the URI and converts it to an SRU search.

• <from>^/([0-9][0-9]+)/$</from>

<to>/search?query=local.viafID+exact+%22$1%22</to>

• E.g. viaf/123 becomes viaf/search?query=viafID+exact+%22123%22

Aside: What to Return?Aside: What to Return?

An SRU query returns a searchRetrieveResponse. A smart client can pick its record out of that response, but that seems wrong

A bad URI will result in “no records found”, but a 404 (record not found) is more appropriate

• Solution: add a new parameter (service=APP) to signal that this was a request for a single record

• E.g. viaf/123 becomes viaf/search?query=viafID+exact+%22123%22&service=APP

Real World Objects for recordsReal World Objects for records

urlrewritefilter can generate 303 (see other) redirects based on URI patterns

• <from>^/([0-9][0-9]+)$</from>

<to type="seeother-redirect">/viaf/$1/</to>

• E.g. viaf/123 redirects to viaf/123/

• The target, viaf/123/, is called the Generic Record

• (note: now we use viaf/123/ as the URI that gets turned into the SRU search)

Multiple record formatsMultiple record formats

urlrewritefilter plus the new httpAccept parameter in SRU

• E.g., viaf/123/marc21.xml becomes viaf/search?query=viafID+exact+123&httpAccept=application/marc21+xml

SRU is configured with a list of supported media types and the XSL stylesheets that render them

MimeType ConfigurationMimeType Configuration

XML.mimeTypes=application/sru+xml;q=0.85, application/xml, text/xml

HTML.mimeTypes=text/html;q=0.9, application/xhtml+xml

RSS.mimeTypes=application/rss+xml;q=0.8RSS.styleSheet=viaf2rss.xslM21.mimeTypes=application/marc21+xml;q=0.7M21.styleSheet=viaf2marc21.xslmarc21HTML.mimeTypes=application/

marc21+html;q=0.7marc21HTML.styleSheet=viaf2marc21.xsl

Aside: 123/marc21.xml NOT 123.m21Aside: 123/marc21.xml NOT 123.m21

The Generic Record being at viaf/123/ seems to imply that it is a collection of records

How do I ask for the HTML version of the MARC21 version of viaf/123 if suffix mangling is all I have?

RDF needs to be returnedRDF needs to be returned

viaf/123/rdf.xml

Making good RDF is tricky and beyond the scope of this presentation (but I think we’re getting close to agreements of sensible basics)

Content Negotiation on Generic RecordContent Negotiation on Generic Record

Pubby has a really nice Content Negotiation module

• It is configured with the list of supported media types with optional quality measures

• It takes an HTTP Accept header and returns the supported media type that matches best

SRU is configured with a list of supported media types and their quality measures and the XSL stylesheets that render them (see previous slide)

ResultResult

viaf/123 redirected to viaf/123/

viaf/123/ turned into viaf/search?query=viafID+exact+123

VIAF record returned by SRW/U server

Content Negotiation causes viaf/123/viaf.html to be returned to googlebot and browsers, viaf/123/rdf.xml to applications that ask for application/rdf+xml and viaf/123/viaf.xml returned when no preference is provided

Lucene Database DemonstrationLucene Database Demonstration

TBD

Questions?Questions?

top related