oclc research tai chi webinar 7/1/2010 oclc open source linked data framework ralph levan sr....

15
OCLC Research TAI CHI Webinar 7/1/2010 OCLC Open Source Linked Data Framework Ralph LeVan Sr. Research Scientist OCLC Research

Upload: rodney-mitchell

Post on 23-Dec-2015

216 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: OCLC Research TAI CHI Webinar 7/1/2010 OCLC Open Source Linked Data Framework Ralph LeVan Sr. Research Scientist OCLC Research

OCLC ResearchTAI CHI Webinar

7/1/2010

OCLC Open Source Linked Data Framework

OCLC Open Source Linked Data Framework

Ralph LeVanSr. Research ScientistOCLC Research

Page 2: OCLC Research TAI CHI Webinar 7/1/2010 OCLC Open Source Linked Data Framework Ralph LeVan Sr. Research Scientist OCLC Research

Goal: Expose Text Database Content as Linked DataGoal: Expose Text Database Content as Linked Data

Technique: Using a combination of the urlrewritefilter from tuckey.org, the content negotiation component from the Freie Universität Berlin’s Pubby server and our Open Source SRW/U server you can expose the records in your database as Linked Data

Page 3: OCLC Research TAI CHI Webinar 7/1/2010 OCLC Open Source Linked Data Framework Ralph LeVan Sr. Research Scientist OCLC Research

RoadmapRoadmap

1. SRW/U Server

2. URIs for records

3. Real World Objects for records

4. Multiple record formats

5. RDF needs to be returned

6. Content Negotiation

Page 4: OCLC Research TAI CHI Webinar 7/1/2010 OCLC Open Source Linked Data Framework Ralph LeVan Sr. Research Scientist OCLC Research

SRW/U ServerSRW/U Server

SRW/U server sits in front of text databases

• We have interfaces for DSpace, Lucene and Pears

Easy to write your own interface

• Convert CQL query to native query language

• Do search and return a resultset object

• Return records from the resultset

• (The Lucene interface is a good simple example of how to build your own database interface)

I expose my SRW/U service as <context>/search, but you can put it wherever you want.

Page 5: OCLC Research TAI CHI Webinar 7/1/2010 OCLC Open Source Linked Data Framework Ralph LeVan Sr. Research Scientist OCLC Research

URIs for recordsURIs for records

urlrewritefilter implements apache mod_rewrite patterns for java servlets

It sees the URI and converts it to an SRU search.

• <from>^/([0-9][0-9]+)/$</from>

<to>/search?query=local.viafID+exact+%22$1%22</to>

• E.g. viaf/123 becomes viaf/search?query=viafID+exact+%22123%22

Page 6: OCLC Research TAI CHI Webinar 7/1/2010 OCLC Open Source Linked Data Framework Ralph LeVan Sr. Research Scientist OCLC Research

Aside: What to Return?Aside: What to Return?

An SRU query returns a searchRetrieveResponse. A smart client can pick its record out of that response, but that seems wrong

A bad URI will result in “no records found”, but a 404 (record not found) is more appropriate

• Solution: add a new parameter (service=APP) to signal that this was a request for a single record

• E.g. viaf/123 becomes viaf/search?query=viafID+exact+%22123%22&service=APP

Page 7: OCLC Research TAI CHI Webinar 7/1/2010 OCLC Open Source Linked Data Framework Ralph LeVan Sr. Research Scientist OCLC Research

Real World Objects for recordsReal World Objects for records

urlrewritefilter can generate 303 (see other) redirects based on URI patterns

• <from>^/([0-9][0-9]+)$</from>

<to type="seeother-redirect">/viaf/$1/</to>

• E.g. viaf/123 redirects to viaf/123/

• The target, viaf/123/, is called the Generic Record

• (note: now we use viaf/123/ as the URI that gets turned into the SRU search)

Page 8: OCLC Research TAI CHI Webinar 7/1/2010 OCLC Open Source Linked Data Framework Ralph LeVan Sr. Research Scientist OCLC Research

Multiple record formatsMultiple record formats

urlrewritefilter plus the new httpAccept parameter in SRU

• E.g., viaf/123/marc21.xml becomes viaf/search?query=viafID+exact+123&httpAccept=application/marc21+xml

SRU is configured with a list of supported media types and the XSL stylesheets that render them

Page 9: OCLC Research TAI CHI Webinar 7/1/2010 OCLC Open Source Linked Data Framework Ralph LeVan Sr. Research Scientist OCLC Research

MimeType ConfigurationMimeType Configuration

XML.mimeTypes=application/sru+xml;q=0.85, application/xml, text/xml

HTML.mimeTypes=text/html;q=0.9, application/xhtml+xml

RSS.mimeTypes=application/rss+xml;q=0.8RSS.styleSheet=viaf2rss.xslM21.mimeTypes=application/marc21+xml;q=0.7M21.styleSheet=viaf2marc21.xslmarc21HTML.mimeTypes=application/

marc21+html;q=0.7marc21HTML.styleSheet=viaf2marc21.xsl

Page 10: OCLC Research TAI CHI Webinar 7/1/2010 OCLC Open Source Linked Data Framework Ralph LeVan Sr. Research Scientist OCLC Research

Aside: 123/marc21.xml NOT 123.m21Aside: 123/marc21.xml NOT 123.m21

The Generic Record being at viaf/123/ seems to imply that it is a collection of records

How do I ask for the HTML version of the MARC21 version of viaf/123 if suffix mangling is all I have?

Page 11: OCLC Research TAI CHI Webinar 7/1/2010 OCLC Open Source Linked Data Framework Ralph LeVan Sr. Research Scientist OCLC Research

RDF needs to be returnedRDF needs to be returned

viaf/123/rdf.xml

Making good RDF is tricky and beyond the scope of this presentation (but I think we’re getting close to agreements of sensible basics)

Page 12: OCLC Research TAI CHI Webinar 7/1/2010 OCLC Open Source Linked Data Framework Ralph LeVan Sr. Research Scientist OCLC Research

Content Negotiation on Generic RecordContent Negotiation on Generic Record

Pubby has a really nice Content Negotiation module

• It is configured with the list of supported media types with optional quality measures

• It takes an HTTP Accept header and returns the supported media type that matches best

SRU is configured with a list of supported media types and their quality measures and the XSL stylesheets that render them (see previous slide)

Page 13: OCLC Research TAI CHI Webinar 7/1/2010 OCLC Open Source Linked Data Framework Ralph LeVan Sr. Research Scientist OCLC Research

ResultResult

viaf/123 redirected to viaf/123/

viaf/123/ turned into viaf/search?query=viafID+exact+123

VIAF record returned by SRW/U server

Content Negotiation causes viaf/123/viaf.html to be returned to googlebot and browsers, viaf/123/rdf.xml to applications that ask for application/rdf+xml and viaf/123/viaf.xml returned when no preference is provided

Page 14: OCLC Research TAI CHI Webinar 7/1/2010 OCLC Open Source Linked Data Framework Ralph LeVan Sr. Research Scientist OCLC Research

Lucene Database DemonstrationLucene Database Demonstration

TBD

Page 15: OCLC Research TAI CHI Webinar 7/1/2010 OCLC Open Source Linked Data Framework Ralph LeVan Sr. Research Scientist OCLC Research

Questions?Questions?