Post on 04-Jan-2016
OCLC Online Computer Library Center
Interoperability Standards &
Searching Multiple Repositories
Ralph LeVan/OCLC
Ray Denenberg/Library of Congress
The Problem
How do I provide a common interface for my users?
How do I combine results from multiple sources?
How do I provide a common interface for my users?
How do I convert my queries into the Content Provider’s (CP’s) queries?
How do I ask for 10 records?
How do I ask for more records?
How do I interpret their response?
How do I convert my queries into the CP’s queries?
My user said “author=twain and title=huck finn”
Google expects: +twain +"huck finn"
Z39.50: twain/1=1003;4=2 "huck finn"/1=4;4=1 and
Lucene: creator:twain AND titlePhrase:"huck finn"
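The translation problem on this slide can be sketched in a few lines of Python. This is only an illustration: the `Clause` type, the function names and the two mapping tables are hypothetical, and the field names and attribute strings are simply copied from the slide examples rather than taken from any provider's documentation.

```python
from dataclasses import dataclass


@dataclass
class Clause:
    index: str  # common index name, e.g. "author" or "title"
    term: str   # the user's search term


# Hypothetical mappings, copied from the slide examples.
LUCENE_FIELDS = {"author": "creator", "title": "titlePhrase"}
Z3950_ATTRS = {"author": "1=1003;4=2", "title": "1=4;4=1"}


def quote(term):
    """Wrap multi-word terms in straight double quotes."""
    return f'"{term}"' if " " in term else term


def to_google(clauses):
    # Google-style: every term is required, index names are dropped.
    return " ".join("+" + quote(c.term) for c in clauses)


def to_lucene(clauses):
    # Lucene's classic query syntax needs the boolean operator in uppercase.
    return " AND ".join(
        f"{LUCENE_FIELDS[c.index]}:{quote(c.term)}" for c in clauses
    )


def to_z3950(clauses):
    # Reverse Polish form as on the slide: operands first, operators last.
    operands = [f"{quote(c.term)}/{Z3950_ATTRS[c.index]}" for c in clauses]
    return " ".join(operands) + " and" * (len(clauses) - 1)


user_query = [Clause("author", "twain"), Clause("title", "huck finn")]
print(to_google(user_query))
print(to_lucene(user_query))
print(to_z3950(user_query))
```

Even in this toy form, every target needs its own function and its own mapping table, which is the maintenance burden the rest of the talk addresses.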
How do I ask for 10 records?
Amazon won’t let you
RedLightGreen: MAXRECORDS=n
British Library: records=n
How do I ask for more records?
Amazon: page=n
RedLightGreen: STARTINDEX=n
British Library: start=n
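A client that talks to all three sites has to hide these differences behind one paging call. The sketch below is a guess at what that looks like; the function name, the provider keys, the assumption that start positions are 1-based, and the Amazon page size of 10 are all illustrative assumptions, not documented behavior of any of these sites.

```python
# ASSUMPTION: Amazon shows 10 results per page; the slides only say the
# page size cannot be requested.
AMAZON_PAGE_SIZE = 10


def paging_params(provider, start, count):
    """Translate a common (1-based start, count) request into the
    provider-specific URL parameters listed on the slides."""
    if provider == "redlightgreen":
        return {"STARTINDEX": start, "MAXRECORDS": count}
    if provider == "britishlibrary":
        return {"start": start, "records": count}
    if provider == "amazon":
        # Only page=n is available, so the requested count is ignored
        # and the start position is rounded down to a page boundary.
        return {"page": (start - 1) // AMAZON_PAGE_SIZE + 1}
    raise ValueError(f"unknown provider: {provider}")


print(paging_params("redlightgreen", 11, 10))
print(paging_params("britishlibrary", 11, 10))
print(paging_params("amazon", 11, 10))
```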
How do I interpret their response?
How many records did I retrieve?
Did something go wrong?
How do I convert the CP’s records into something my users will recognize?
How many records did I retrieve?
Amazon: <a href="/gp/search/ref=sr_nr_i_0/002-2019116-8269663?%5Fencoding=UTF8&keywords=pratchett&rh=i%3Aaps%2Ck%3Apratchett%2Ci%3Astripbooks&page=1">Books</a><span class="narrowValue"> (334)</span>
RedLightGreen: <b>Viewing:</b> 1-10 of 239 results
British Library: <opensearch:totalResults>190</opensearch:totalResults>
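To make the point concrete, here is a minimal sketch of pulling the hit count out of each of the three responses. The regular expressions are written against the fragments shown on this slide only; the provider keys and the function name are hypothetical, and the patterns would break as soon as a site changed its markup.

```python
import re

# One pattern per provider, each tied to that provider's current markup.
COUNT_PATTERNS = {
    "amazon": re.compile(r'<span class="narrowValue">\s*\((\d+)\)</span>'),
    "redlightgreen": re.compile(r"<b>Viewing:</b>\s*\d+-\d+ of (\d+) results"),
    "britishlibrary": re.compile(
        r"<opensearch:totalResults>(\d+)</opensearch:totalResults>"
    ),
}


def hit_count(provider, body):
    """Return the total hit count found in a response body, or None."""
    match = COUNT_PATTERNS[provider].search(body)
    return int(match.group(1)) if match else None


print(hit_count("redlightgreen", "<b>Viewing:</b> 1-10 of 239 results"))
```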
Did Something Go Wrong?
RedLightGreen: <span class=smallText>We didn't find any matches for <b>dog and</b>.</span>
British Library: <item ><title >Nothing found due to an error</title><description >Too many hits. Refine your request.</description></item>
How do I convert the records?
Amazon:
<table class="searchresults" border="0" width="100%" cellpadding="0" cellspacing="0">
<tr><td width="100%" class="searchitem" id="Td:0">
<table border="0" width="100%" cellpadding="0" cellspacing="0"><tr valign="top">
<td>
<table class="n2" border="0" cellpadding="0" cellspacing="0">
<tr>
<td class="imageColumn" width="88"><table border="0" cellpadding="0" cellspacing="0">
<tr><td align="center" width="80">
<a href="http://www.amazon.com/gp/product/0060815221/sr=8-1/qid=1142436987/ref=pd_bbs_1/002-2019116-8269663?%5Fencoding=UTF8"><img src="http://ec1.images-amazon.com/images/P/0060815221.01._PIsitb-st-arrow,TopLeft,-1,-14_SCTHUMBZZZ_.jpg" width="55" alt="Thud! (Discworld, Book 32)" height="82" border="0" /></a>
</td><td width="8"></td></tr></table></td>
<td class="dataColumn"><table cellpadding="0" cellspacing="0" border="0"><tr><td>
<a href="http://www.amazon.com/gp/product/0060815221/sr=8-1/qid=1142436987/ref=pd_bbs_1/002-2019116-8269663?%5Fencoding=UTF8"><span class="srTitle">Thud! (Discworld, Book 32)</span></a>
by Terry Pratchett (<span class="binding">Hardcover</span>
- Sep 13, 2005)</td></tr>
<tr><td class="brandLink"><span class="aliasName">Books:</span> <a href="/gp/search/ref=sr_nr_seeall_1/002-2019116-8269663?%5Fencoding=UTF8&keywords=pratchett&rh=i%3Aaps%2Ck%3Apratchett%2Ci%3Astripbooks">See all 334 items</a></td></tr>
<tr><td><span class="priceType"><a href="http://www.amazon.com/gp/product/0060815221/sr=8-1/qid=1142436987/ref=pd_bbs_1/002-2019116-8269663?%5Fencoding=UTF8">Buy new</a>: </span> <span class="listprice">$24.95</span> <span class="saleprice">$15.72</span>
<span class="priceType">
<a href="http://www.amazon.com/gp/offer-listing/0060815221/sr=8-1/qid=1142436987/ref=pd_bbs_1/002-2019116-8269663?%5Fencoding=UTF8">Used & new</a>
</span> from <span class="otherprice">$3.76</span>
<span class="avail">Usually ships in 24 hours</span>
</td></tr><tr><td colspan="2"><table cellpadding="0" cellspacing="0" border="0">
<tr><td class="excerptStart"><span class="excerptLead">Excerpt from</span> <a href="/gp/reader/0060815221/ref=sib_aps_pg/002-2019116-8269663?%5Fencoding=UTF8&keywords=pratchett&p=S00E&checkSum=y3glB4NEGJ6Ql3iAWFd6teZptAJmys3Uu8CCW9387%252BA%253D">page 2</a>: "<span class="excerpt">... Terry <b>Pratchett</b> "Most of the news is ...</span>"</td></tr>
<tr><td class="excerptSeeMore"><a href="/gp/reader/0060815221/ref=sib_aps_ref/002-2019116-8269663?%5Fencoding=UTF8&keywords=pratchett&v=search-inside">See more references</a> to <span class="excerptUserInput">pratchett</span> in this book.</td></tr><tr><td style="padding-top: 5px; padding-bottom: 8px;"><span style="font-weight: bold; color: #339933;">Surprise me!</span> <a href="http://www.amazon.com/gp/reader/0060815221/ref=sib_aps_sup/002-2019116-8269663?%5Fencoding=UTF8&p=random">See a random page</a> in this book.</td></tr></table></td></tr>
</table></td></tr></table>
</td></tr></table></td>
</tr>
Converting Records Cont.
RedLightGreen:
<td class="highlightcell"><span class="titleText"><b><a title="View more information about this title." href="ucw.servlets.UCWController?ACTION=EDITION&WORKID=21537371&LANGUAGE=ENG&MATERIAL=books&FROMRSLT=3&FROMWORK=1&lang=english">Hogfather</a></b>, by Terry Pratchett <br>3 editions published between 1996 and 1998 in English.<br>Primary Subject: Discworld Imaginary Place - Fiction<br><img src="/ucwprod/web/images/green.gif" height="3" width="10" alt="A title's position in a search result is based on relevancy (how closely your search terms match the description) 
and availability (how many libraries have a copy of the title)."/><img src="/ucwprod/web/images/white.gif" height="3" width="1"/><img src="/ucwprod/web/images/green.gif" height="3" width="10" alt="A title's position in a search result is based on relevancy (how closely your search terms match the description) 
and availability (how many libraries have a copy of the title)."/><img src="/ucwprod/web/images/white.gif" height="3" width="1"/><img src="/ucwprod/web/images/green.gif" height="3" width="10" alt="A title's position in a search result is based on relevancy (how closely your search terms match the description) 
and availability (how many libraries have a copy of the title)."/><img src="/ucwprod/web/images/white.gif" height="3" width="1"/><img src="/ucwprod/web/images/green.gif" height="3" width="10" alt="A title's position in a search result is based on relevancy (how closely your search terms match the description) 
and availability (how many libraries have a copy of the title)."/><img src="/ucwprod/web/images/white.gif" height="3" width="1"/><img src="/ucwprod/web/images/gray.gif" height="3" width="10" alt="A title's position in a search result is based on relevancy (how closely your search terms match the description) 
and availability (how many libraries have a copy of the title)."/><img src="/ucwprod/web/images/white.gif" height="3" width="1"/></span></td></tr></table><table xmlns="http://www.w3.org/TR/REC-html40" border="0" cellpadding="0" cellspacing="0" width="100%"><tr><td class="recordsepcell" colspan="2"><img src="/ucwprod/web/images/clear.gif" height="1"/></td></tr></table><table xmlns="http://www.w3.org/TR/REC-html40" border="0" cellpadding="3" cellspacing="0" width="100%"><tr valign="top"><td width="25" align="right" class="highlightcell"><span class="titleText">2.</span></td>
Converting Records Cont.
British Library:
<item ><title >Thud! / Terry Pratchett.</title>
<link >http://catalogue.bl.uk/F/-?func=direct-doc-set&doc_number=013220851&l_base=BLL01&from=A9OpenSearch</link>
<description > Pratchett, Terry. ; London : Doubleday, 2005. . ISBN 0385608675 (hbk.) : £17.99 . (Added : 20050614 )</description></item>
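Of the three, the British Library item is the only one that an XML parser can read directly. The sketch below turns such an item into a simple common record. It is an assumption-laden illustration: the output field names are invented, the link in the sample is shortened and has its ampersand escaped as `&amp;` so that the fragment is well-formed XML, and splitting the title on " / " is a guess at the cataloguing convention seen in this one example.

```python
import re
import xml.etree.ElementTree as ET

# Sample adapted from the slide; the link is shortened and its ampersand
# is escaped so the fragment parses as XML.
ITEM = """<item ><title >Thud! / Terry Pratchett.</title>
<link >http://catalogue.bl.uk/F/-?func=direct-doc-set&amp;doc_number=013220851</link>
<description > Pratchett, Terry. ; London : Doubleday, 2005. . ISBN 0385608675 (hbk.) : £17.99 .</description></item>"""


def item_to_record(item_xml):
    """Map one RSS <item> to a flat record with hypothetical field names."""
    item = ET.fromstring(item_xml)
    # "Title / statement of responsibility." is assumed from the example.
    title, _, responsibility = item.findtext("title", "").strip().partition(" / ")
    description = item.findtext("description", "")
    isbn = re.search(r"ISBN\s+([0-9Xx]{10,13})", description)
    return {
        "title": title,
        "creator": responsibility.rstrip("."),
        "isbn": isbn.group(1) if isbn else None,
        "link": item.findtext("link", "").strip(),
    }


print(item_to_record(ITEM))
```

The Amazon and RedLightGreen fragments would each need their own HTML-specific extraction code to produce the same record.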
How do I combine results from multiple sources?
Things you might want the server to do for you:
– Common Record Format
– Common Sort Order
– Common Rank Order
Functional Matrix
Request Record Starting Point
Request Number of Records
Request Record Schema
Defined Query Grammar
Specify Sort Order
Specify Ranking Order
Diagnostic Messages
XML Response
Record Count In Response
Records In Known Schema
The Old Solutions
Screen Scraping
Private APIs
Z39.50
Screen Scraping
A query has to be generated and embedded in a CP-specific URL
Code has to be written to examine the HTML returned by a CP
Prone to breakage
– Web sites change formatting frequently
Every site is unique
– Separate code to be maintained for every site
Private APIs
Often only a slight improvement over screen scraping
Provides documentation on how to construct the URL
Might provide documentation on how to construct the query
Might guarantee a stable response format
Still requires unique code for each site
Z39.50
Guarantees a standard request and response
But…
– Not HTTP or HTML
  • Binary encoding over raw TCP/IP
– Complicated
  • 11 services
  • 7 extended services
– Easy to be compliant and not interoperable
– Unfriendly
  • The response to a protocol error was to drop the connection
Why Use A Standard API?
Defined requests and responses
Reusable code across sites
Open Source code
The New Solutions
OpenSearch 1.1
MXG
– Levels 0-2
SRU
OpenSearch 1.1
From Wikipedia
– OpenSearch is a collection of technologies that allow publishing of search results in a format suitable for syndication. It is a way for search engines to publish their search results in a standard and accessible format
OpenSearch 1.1 (cont.)
Defines a Description Record with information about the CP
– ShortName and LongName
– Description
– Tags
– URL template
Example:
http://herbie.bl.uk:9080/opensearch.xml
OpenSearch 1.1 (cont.)
URL Template
– Server indicates how to specify OpenSearch request parameters
– Parameters not specified in the template are unavailable
– The only mandatory parameter is {searchTerms}
<Url type="application/rss+xml" template="http://herbie.bl.uk:9080/cgi-bin/OSxml1.cgi/?q={searchTerms}&start={startIndex?}&records={count?}&format=rss" />
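Filling in such a template is mostly string substitution. The sketch below expands the British Library template from this slide; the function name `expand` is hypothetical, optional parameters (those marked with `?`) are replaced with an empty string when no value is given, and namespace-prefixed parameter names are not handled.

```python
import re
from urllib.parse import quote_plus

TEMPLATE = (
    "http://herbie.bl.uk:9080/cgi-bin/OSxml1.cgi/"
    "?q={searchTerms}&start={startIndex?}&records={count?}&format=rss"
)


def expand(template, **values):
    """Substitute {name} and {name?} placeholders in a URL template."""

    def substitute(match):
        name, optional = match.group(1), match.group(2) == "?"
        if name in values:
            return quote_plus(str(values[name]))
        if optional:
            return ""  # optional parameter left empty
        raise KeyError(f"missing required parameter: {name}")

    return re.sub(r"\{(\w+)(\??)\}", substitute, template)


print(expand(TEMPLATE, searchTerms="huck finn", count=10))
```

Because the template is published by the server, the same client code can be pointed at any OpenSearch provider without knowing its parameter names in advance.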
OpenSearch 1.1 (cont.)
Request Parameters
– {searchTerms}
– {count}
– {startIndex}
– {startPage}
– {language}
– {outputEncoding}
– {inputEncoding}
OpenSearch 1.1 (cont.)
Uses RSS 2.0 with a few extra elements for the response
– RSS defines title, description and link elements
– OpenSearch adds the totalResults, startIndex, itemsPerPage, link and Query elements
http://herbie.bl.uk:9080/cgi-bin/OSxml1.cgi/?q=levan&format=rss
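Reading the paging elements back out of such a response might look like the following sketch. The sample document is made up for illustration (it is not the output of the URL above), and the function and key names are hypothetical; the namespace URI is the one defined for OpenSearch 1.1.

```python
import xml.etree.ElementTree as ET

OS_NS = {"opensearch": "http://a9.com/-/spec/opensearch/1.1/"}

# Illustrative response, not captured from a real server.
SAMPLE = """<rss version="2.0" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
<channel>
<title>Sample results</title>
<opensearch:totalResults>190</opensearch:totalResults>
<opensearch:startIndex>1</opensearch:startIndex>
<opensearch:itemsPerPage>10</opensearch:itemsPerPage>
<item><title>Thud! / Terry Pratchett.</title></item>
</channel></rss>"""


def parse_opensearch(rss_text):
    """Extract the OpenSearch paging elements and item titles."""
    channel = ET.fromstring(rss_text).find("channel")

    def number(name):
        text = channel.findtext(f"opensearch:{name}", namespaces=OS_NS)
        return int(text) if text is not None else None

    return {
        "totalResults": number("totalResults"),
        "startIndex": number("startIndex"),
        "itemsPerPage": number("itemsPerPage"),
        "titles": [item.findtext("title") for item in channel.findall("item")],
    }


print(parse_opensearch(SAMPLE))
```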
Functional Matrix: OS 1.1
Request Record Starting Point ●
Request Number of Records ○
Request Record Schema
Defined Query Grammar
Specify Sort Order
Specify Ranking Order
Diagnostic Messages
XML Response ○
Record Count In Response ○
Records In Known Schema ○
Key: ●==Full Support ○==Limited Support
Cool Feature
The RSS mechanism in OpenSearch provides the ability to have persistent and periodic queries!
NISO MetaSearch XML Gateway
MXG
MXG has been designed to provide a low implementation barrier to content providers that want to make their databases available to metasearch engines. Interoperability across content providers was explicitly not a goal of MXG.
MXG Levels of Support
Level 0: Requests are simple URLs using any query grammar and responses are XML records
Level 1: Adds a description record for the database
Level 2: Supports a limited subset of a standard query grammar: CQL
MXG Request
Version (mandatory)
Query (mandatory)
StartRecord
MaximumRecords
http://alcme.oclc.org/MXG/search/ORPubs?version=1.1&query="levan"&startRecord=1&maximumRecords=10
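A request like this one can be assembled with the standard library. In this sketch the function name and its keyword arguments are hypothetical; only the four URL parameter names come from the slide.

```python
from urllib.parse import urlencode

BASE = "http://alcme.oclc.org/MXG/search/ORPubs"


def mxg_url(base, query, start_record=None, maximum_records=None, version="1.1"):
    """Build an MXG request URL; version and query are mandatory."""
    params = {"version": version, "query": query}
    if start_record is not None:
        params["startRecord"] = start_record
    if maximum_records is not None:
        params["maximumRecords"] = maximum_records
    # urlencode percent-encodes the quotes around the search term.
    return base + "?" + urlencode(params)


print(mxg_url(BASE, '"levan"', start_record=1, maximum_records=10))
```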
MXG Response
<?xml version="1.0" ?>
<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
  <version>1.1</version>
  <numberOfRecords>10</numberOfRecords>
  <records> … </records>
  <nextRecordPosition>1</nextRecordPosition>
  <echoedSearchRetrieveRequest>
    <version>1.1</version>
    <query>"stuff"</query>
  </echoedSearchRetrieveRequest>
</searchRetrieveResponse>
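Because the response is namespaced XML with fixed element names, one parser can serve every MXG or SRU server. The sketch below reads the record count, the record positions and the next record position. The sample document and its values are invented for illustration, and the function and key names are hypothetical.

```python
import xml.etree.ElementTree as ET

SRW = {"srw": "http://www.loc.gov/zing/srw/"}

# Illustrative response with made-up values.
RESPONSE = """<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
  <version>1.1</version>
  <numberOfRecords>25</numberOfRecords>
  <records>
    <record><recordPosition>1</recordPosition></record>
    <record><recordPosition>2</recordPosition></record>
  </records>
  <nextRecordPosition>3</nextRecordPosition>
</searchRetrieveResponse>"""


def parse_response(xml_text):
    """Pull the paging information out of a searchRetrieveResponse."""
    root = ET.fromstring(xml_text)
    next_position = root.findtext("srw:nextRecordPosition", namespaces=SRW)
    return {
        "numberOfRecords": int(root.findtext("srw:numberOfRecords", namespaces=SRW)),
        "positions": [
            int(record.findtext("srw:recordPosition", namespaces=SRW))
            for record in root.findall("srw:records/srw:record", SRW)
        ],
        # Absent when there are no further records to fetch.
        "nextRecordPosition": int(next_position) if next_position else None,
    }


print(parse_response(RESPONSE))
```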
MXG Response Records
<record>
  <recordSchema>info:srw/schema/1/dc-v1.1</recordSchema>
  <recordPacking>xml</recordPacking>
  <recordData> … </recordData>
  <recordPosition>1</recordPosition>
</record>
MXG Response recordData
<srw_dc:dc xmlns="http://www.w3.org/TR/xhtml1/strict" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:srw_dc="info:srw/schema/1/dc-v1.1">
  <dc:identifier>rrl1234</dc:identifier>
  <dc:title>Dog and Cat</dc:title>
</srw_dc:dc>
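Since the record arrives in a known schema, turning it into fields needs no provider-specific code. A minimal sketch, assuming the Dublin Core record shown above (the default namespace declaration is omitted from the sample for brevity, and the function name is hypothetical):

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"

RECORD_DATA = """<srw_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:srw_dc="info:srw/schema/1/dc-v1.1">
  <dc:identifier>rrl1234</dc:identifier>
  <dc:title>Dog and Cat</dc:title>
</srw_dc:dc>"""


def dc_fields(record_xml):
    """Collect Dublin Core elements into a dict of name -> list of values."""
    fields = {}
    for element in ET.fromstring(record_xml):
        # ElementTree spells qualified names as "{namespace}localname".
        if element.tag.startswith("{" + DC + "}"):
            name = element.tag.split("}", 1)[1]
            fields.setdefault(name, []).append((element.text or "").strip())
    return fields


print(dc_fields(RECORD_DATA))
```

Values are kept in lists because Dublin Core elements may repeat within one record.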
MXG Error Messages
<diagnostics>
  <diagnostic xmlns="http://www.loc.gov/zing/srw/diagnostic/">
    <uri>info:srw/diagnostic/1/51</uri>
    <details>66ntqk</details>
  </diagnostic>
</diagnostics>
http://www.loc.gov/z3950/agency/zing/srw/diagnostics-list.html
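Answering "did something go wrong?" then becomes a matter of looking for diagnostic elements. The sketch below parses the fragment from this slide; the function name and the output keys are hypothetical, and taking the last path segment of the URI as a numeric code is an assumption based on the form of this one example.

```python
import xml.etree.ElementTree as ET

DIAG_NS = "http://www.loc.gov/zing/srw/diagnostic/"

SAMPLE = """<diagnostics>
  <diagnostic xmlns="http://www.loc.gov/zing/srw/diagnostic/">
    <uri>info:srw/diagnostic/1/51</uri>
    <details>66ntqk</details>
  </diagnostic>
</diagnostics>"""


def read_diagnostics(xml_text):
    """Return every diagnostic in the document as a small dict."""
    root = ET.fromstring(xml_text)
    found = []
    for diagnostic in root.iter(f"{{{DIAG_NS}}}diagnostic"):
        uri = diagnostic.findtext(f"{{{DIAG_NS}}}uri", default="")
        found.append({
            "uri": uri,
            "code": uri.rsplit("/", 1)[-1],  # assumed: last segment is the code
            "details": diagnostic.findtext(f"{{{DIAG_NS}}}details"),
        })
    return found


print(read_diagnostics(SAMPLE))
```

The code can be looked up in the diagnostics list linked above to obtain a human-readable message.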
Functional Matrix: MXG Level 0
Request Record Starting Point ●
Request Number of Records ●
Request Record Schema ○
Defined Query Grammar
Specify Sort Order
Specify Ranking Order
Diagnostic Messages ●
XML Response ●
Record Count In Response ●
Records In Known Schema ●
Key: ●==Full Support ○==Limited Support
MXG Level 1
Add a description record for the database
http://www.loc.gov/z3950/agency/zing/srw/explain.html
http://alcme.oclc.org/MXG/search/ORPubs
Functional Matrix: MXG Level 1
Request Record Starting Point ●
Request Number of Records ●
Request Record Schema ●
Defined Query Grammar
Specify Sort Order
Specify Ranking Order
Diagnostic Messages ●
XML Response ●
Record Count In Response ●
Records In Known Schema ●
Key: ●==Full Support ○==Limited Support
MXG Level 2
Supports a limited subset of a standard query grammar: CQL
Supports indexes and Booleans
http://www.loc.gov/z3950/agency/zing/cql/
http://alcme.oclc.org/srw/search/ORPublications?version=1.1&query=dc.author=levan&maximumRecords=1
Functional Matrix: MXG Level 2
Request Record Starting Point ●
Request Number of Records ●
Request Record Schema ●
Defined Query Grammar ○
Specify Sort Order
Specify Ranking Order
Diagnostic Messages ●
XML Response ●
Record Count In Response ●
Records In Known Schema ●
Key: ●==Full Support ○==Limited Support
SRU
MXG Level 2 Plus:
– Full Query Grammar (CQL)
– Full Sort Specification
CQL: Common Query Language
Loosely based on CCL Search
Boolean & Proximity Operators
Index Sets & Indexes
String Indexes vs. Keyword Indexes
Truncation Characters ‘*’, ‘#’ & ‘?’
Relations: ‘=’, all, any, exact, within
Example: dc.title="harry potter" or bib1.isbn=123-456-78x
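A client can build queries like the example above from index, relation and term triples. The helper names below are hypothetical, and the quoting rule (quote any term containing whitespace or one of a few reserved characters) is a simplification rather than the full CQL grammar.

```python
from urllib.parse import urlencode


def cql_term(term):
    """Quote a term if it contains whitespace or reserved characters."""
    if any(ch in term for ch in ' ()=<>/"'):
        escaped = term.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return term


def clause(index, relation, term):
    """One search clause: index, relation and (possibly quoted) term."""
    return f"{index} {relation} {cql_term(term)}"


def combine(operator, *clauses):
    """Join clauses with a boolean operator such as 'and' or 'or'."""
    return f" {operator} ".join(clauses)


query = combine(
    "or",
    clause("dc.title", "=", "harry potter"),
    clause("bib1.isbn", "=", "123-456-78x"),
)
print(query)
print(urlencode({"version": "1.1", "query": query}))
```

The second line printed is the query string that would follow the `?` in an SRU request URL.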
Sort
sortKeys parameter with the following comma-separated values specified:
– Xpath (path to the element to be sorted on)
– Schema (that the xpath comes from)
– Ascending (value is 1==true or 0==false, default==true)
– CaseSensitive (value is 1==true or 0==false, default==false)
– missingValue (values are omit, abort, highValue or lowValue, default==highValue)
e.g. &sortKeys=title,onix,0
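The five values can be assembled by a small helper. This sketch always writes out all five values, so it produces the fully spelled-out form of the example above rather than the abbreviated one; the function name and its keyword arguments are hypothetical.

```python
MISSING_VALUES = ("omit", "abort", "highValue", "lowValue")


def sort_key(xpath, schema="", ascending=True, case_sensitive=False,
             missing_value="highValue"):
    """Build one sortKeys value: xpath,schema,ascending,caseSensitive,missingValue."""
    if missing_value not in MISSING_VALUES:
        raise ValueError(f"unsupported missingValue: {missing_value}")
    return ",".join([
        xpath,
        schema,
        "1" if ascending else "0",
        "1" if case_sensitive else "0",
        missing_value,
    ])


# Descending sort on the ONIX title element, other values left at defaults.
print("sortKeys=" + sort_key("title", "onix", ascending=False))
```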
Functional Matrix: SRU
Request Record Starting Point ●
Request Number of Records ●
Request Record Schema ●
Defined Query Grammar ●
Specify Sort Order ●
Specify Ranking Order ○
Diagnostic Messages ●
XML Response ●
Record Count In Response ●
Records In Known Schema ●
Key: ●==Full Support ○==Limited Support
Cool Feature
Combining SRU response data and echoed request data with JavaScript and stylesheets allows for thin, browser-based clients
http://alcme.oclc.org/MXG/search/ORPubs?version=1.1&query="levan"&startRecord=1&maximumRecords=10
Functional Matrix
Feature | OS 1.1 | MXG L0 | MXG L1 | MXG L2 | SRU
Request Record Starting Point | ● | ● | ● | ● | ●
Request Number of Records | ○ | ● | ● | ● | ●
Request Record Schema | | ○ | ● | ● | ●
Defined Query Grammar | | | | ○ | ●
Specify Sort Order | | | | | ●
Specify Ranking Order | | | | | ○
Diagnostic Messages | | ● | ● | ● | ●
XML Response | ○ | ● | ● | ● | ●
Record Count In Response | ○ | ● | ● | ● | ●
Records In Known Schema | ○ | ● | ● | ● | ●
Key: ●==Full Support ○==Limited Support