Applications of XML in Libraries for Electronic Resources Karen A. Coombs University of Houston [email protected]

Upload: librarywebchic

Post on 27-Jan-2015


DESCRIPTION

Preconference presentation for ER&L 2009

TRANSCRIPT

Page 1: Xml Applications Libraries

Applications of XML in Libraries for Electronic Resources

Karen A. Coombs
University of Houston
[email protected]

Page 2: Xml Applications Libraries

XML formats you might see or use in libraries

• MARCXML

• MARCXML holdings

• ISO/FDIS 20775 - Holdings schema

• OpenURL XML formats
  • XML Metadata Format for Books (info:ofi/fmt:xml:xsd:book)
  • XML Metadata Format for Journals (info:ofi/fmt:xml:xsd:journal)

• Digital Library standards
  • Dublin Core
  • MODS
  • METS

Page 3: Xml Applications Libraries

MARCXML

• XML version of a MARC record

• Uses fields, subfields and indicators

• Very complex and often difficult to work with

• Typical output of most APIs for library catalogs

• Difficult to interpret if you don’t know MARC

• OCLC Bibliographic Standards and Formats - http://www.oclc.org/bibformats/default.htm

Page 4: Xml Applications Libraries

<?xml version="1.0"?>
<marc:collection xmlns:marc="http://www.loc.gov/MARC21/slim">
  <totalResults xmlns="http://a9.com/-/spec/opensearch/1.1/">1</totalResults>
  <startIndex xmlns="http://a9.com/-/spec/opensearch/1.1/">1</startIndex>
  <itemsPerPage xmlns="http://a9.com/-/spec/opensearch/1.1/">10</itemsPerPage>
  <record xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xmlns="http://www.loc.gov/MARC21/slim"
          xmlns:marc="http://www.loc.gov/MARC21/slim"
          xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
    <leader>04957cam a22004698a 4500</leader>
    <controlfield tag="001">ocm61260129</controlfield>
    <controlfield tag="003">OCoLC</controlfield>
    <controlfield tag="005">20080604173055.0</controlfield>
    <controlfield tag="008">050712s2005 caua b 000 0 eng</controlfield>
    <datafield tag="020" ind1=" " ind2=" ">
      <subfield code="a">0596007655 (pbk.)</subfield>
    </datafield>
    <datafield tag="035" ind1=" " ind2=" ">
      <subfield code="a">(OCoLC)61260129</subfield>
    </datafield>
    <datafield tag="050" ind1=" " ind2="4">
      <subfield code="a">QA76.9.D26</subfield>
      <subfield code="b">M67 2005</subfield>
    </datafield>
    <datafield tag="100" ind1="1" ind2=" ">
      <subfield code="a">Morville, Peter.</subfield>
    </datafield>
    <datafield tag="245" ind1="1" ind2="0">
      <subfield code="a">Ambient findability /</subfield>
      <subfield code="c">Peter Morville.</subfield>
    </datafield>
  </record>
</marc:collection>
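As a rough illustration of why fields, subfields and indicators make MARCXML awkward to work with, the sketch below pulls the control number, author and title out of a trimmed copy of the record above. It uses only Python's standard library; the helper names `subfield` and `summarize` are made up for this example and are not part of any catalog API.

```python
import xml.etree.ElementTree as ET

# Namespace used by MARCXML ("MARC21 slim").
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Trimmed copy of the record shown on the slide (only the fields read below).
SAMPLE = """<marc:collection xmlns:marc="http://www.loc.gov/MARC21/slim">
  <record xmlns="http://www.loc.gov/MARC21/slim">
    <controlfield tag="001">ocm61260129</controlfield>
    <datafield tag="100" ind1="1" ind2=" ">
      <subfield code="a">Morville, Peter.</subfield>
    </datafield>
    <datafield tag="245" ind1="1" ind2="0">
      <subfield code="a">Ambient findability /</subfield>
      <subfield code="c">Peter Morville.</subfield>
    </datafield>
  </record>
</marc:collection>"""


def subfield(record, tag, code):
    """Return the text of the first matching subfield, or None (hypothetical helper)."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    node = record.find(path, MARC_NS)
    return node.text if node is not None else None


def summarize(marcxml):
    """Return a small dict per record: control number, main author, title."""
    root = ET.fromstring(marcxml)
    summaries = []
    for record in root.iterfind("marc:record", MARC_NS):
        summaries.append({
            "id": record.findtext("marc:controlfield[@tag='001']", namespaces=MARC_NS),
            "author": subfield(record, "100", "a"),  # 100$a: main entry, personal name
            "title": subfield(record, "245", "a"),   # 245$a: title proper
        })
    return summaries


records = summarize(SAMPLE)
```

Note that the caller still has to know that 245$a is the title and 100$a the author, which is the "difficult to interpret if you don't know MARC" point in practice.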

Page 5: Xml Applications Libraries

MARCXML Holdings

• MARC format for holdings

• Most relevant for serials/journals

• Limited number of important fields

• 856 - Electronic Location and Access

• 853 - Captions and Pattern information

• 863 - Enumeration and Chronology

• 866 - Textual Statement of Holdings

Page 6: Xml Applications Libraries

<?xml version="1.0" encoding="UTF-8" ?>
<marc:collection xmlns:marc="http://www.loc.gov/MARC21/slim"
                 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
  <marc:record>
    <marc:leader>00381nam a2200133 45e0</marc:leader>
    <marc:controlfield tag="001">mfhd1</marc:controlfield>
    <marc:controlfield tag="008">8312304g####8###2001aaeng0831017</marc:controlfield>
    <marc:datafield tag="852" ind1=" " ind2=" ">
      <marc:subfield code="a">CSf</marc:subfield>
      <marc:subfield code="b">Sci</marc:subfield>
      <marc:subfield code="t">2</marc:subfield>
    </marc:datafield>
    <marc:datafield tag="853" ind1="1" ind2="0">
      <marc:subfield code="8">1</marc:subfield>
      <marc:subfield code="a">v.</marc:subfield>
      <marc:subfield code="b">no.</marc:subfield>
      <marc:subfield code="u">12</marc:subfield>
      <marc:subfield code="v">r</marc:subfield>
      <marc:subfield code="i">(year)</marc:subfield>
      <marc:subfield code="j">(month)</marc:subfield>
      <marc:subfield code="w">m</marc:subfield>
      <marc:subfield code="x">01</marc:subfield>
    </marc:datafield>
    <marc:datafield tag="863" ind1="4" ind2="0">
      <marc:subfield code="8">1.2</marc:subfield>
      <marc:subfield code="a">22</marc:subfield>
      <marc:subfield code="b">1-6</marc:subfield>
      <marc:subfield code="i">1982</marc:subfield>
      <marc:subfield code="j">01-06</marc:subfield>
    </marc:datafield>
  </marc:record>
</marc:collection>
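To show how the 853 and 863 fields fit together, here is a simplified sketch that pairs the captions in 853 ("v.", "no.") with the values in 863 through subfield $8 and builds a short display string. It is an assumption-laden illustration, not a full MARC holdings parser: it handles only the $a, $b and $i subfields, and the function names are invented for this example.

```python
import xml.etree.ElementTree as ET

NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Trimmed copy of the holdings record shown on the slide.
SAMPLE = """<marc:record xmlns:marc="http://www.loc.gov/MARC21/slim">
  <marc:datafield tag="853" ind1="1" ind2="0">
    <marc:subfield code="8">1</marc:subfield>
    <marc:subfield code="a">v.</marc:subfield>
    <marc:subfield code="b">no.</marc:subfield>
  </marc:datafield>
  <marc:datafield tag="863" ind1="4" ind2="0">
    <marc:subfield code="8">1.2</marc:subfield>
    <marc:subfield code="a">22</marc:subfield>
    <marc:subfield code="b">1-6</marc:subfield>
    <marc:subfield code="i">1982</marc:subfield>
  </marc:datafield>
</marc:record>"""


def subfields(field):
    """Map subfield code -> text for one datafield (repeated codes not handled)."""
    return {sf.get("code"): sf.text for sf in field.iterfind("marc:subfield", NS)}


def enumeration_statements(record_xml):
    """Combine 853 captions with 863 values into strings like 'v.22 no.1-6 (1982)'."""
    record = ET.fromstring(record_xml)

    # Index caption fields by their $8 link number.
    captions = {}
    for field in record.iterfind("marc:datafield[@tag='853']", NS):
        sf = subfields(field)
        captions[sf["8"]] = sf

    statements = []
    for field in record.iterfind("marc:datafield[@tag='863']", NS):
        sf = subfields(field)
        link = sf["8"].split(".")[0]  # "1.2" -> link "1", sequence "2"
        caption = captions.get(link, {})
        parts = [f"{caption.get(code, '')}{sf[code]}" for code in ("a", "b") if code in sf]
        text = " ".join(parts)
        if "i" in sf:  # first level of chronology (the year in this sample)
            text += f" ({sf['i']})"
        statements.append(text)
    return statements


statements = enumeration_statements(SAMPLE)
```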

Page 7: Xml Applications Libraries

ISO/FDIS 20775

• Standard for transmitting holdings information

• Also contains information about the library with the holdings

• Being used by OCLC in WorldCat API

• Can contain information about complex serial holdings

• Can contain information about availability, availability policy, conditions and charges

Page 8: Xml Applications Libraries

<holding>
  <institutionIdentifier>
    <value>CZP</value>
    <typeOrSource>
      <pointer>http://worldcat.org/registry/institutions/</pointer>
    </typeOrSource>
  </institutionIdentifier>
  <physicalLocation>Peninsula Library System</physicalLocation>
  <physicalAddress>
    <text>San Mateo, CA 94403 United States</text>
  </physicalAddress>
  <electronicAddress>
    <text>http://www.worldcat.org/wcpa/oclc/8114241?page=frame&amp;url=http%3A%2F%2Fcatalog.plsinfo.org%2Fsearch%2Fi0380641135&amp;title=Peninsula+Library+System&amp;linktype=opac&amp;detail=CZP%3APeninsula+Library+System%3APublic&amp;qt=affiliate&amp;ai=wcapi</text>
  </electronicAddress>
  <holdingSimple>
    <copiesSummary>
      <copiesCount>1</copiesCount>
    </copiesSummary>
  </holdingSimple>
</holding>
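A holding like the one above could be reduced to the few values an interface usually needs, along these lines. This is a minimal sketch that assumes the element names appear without a namespace, as they do on the slide; a live response may be namespaced, in which case the paths would need a namespace map. `summarize_holding` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <holding> element shown on the slide.
SAMPLE = """<holding>
  <institutionIdentifier>
    <value>CZP</value>
  </institutionIdentifier>
  <physicalLocation>Peninsula Library System</physicalLocation>
  <holdingSimple>
    <copiesSummary>
      <copiesCount>1</copiesCount>
    </copiesSummary>
  </holdingSimple>
</holding>"""


def summarize_holding(xml_text):
    """Return institution symbol, location name and copy count for one holding."""
    holding = ET.fromstring(xml_text)
    count = holding.findtext("holdingSimple/copiesSummary/copiesCount")
    return {
        "institution": holding.findtext("institutionIdentifier/value"),
        "location": holding.findtext("physicalLocation"),
        # copiesCount is optional in this sketch, so guard against its absence.
        "copies": int(count) if count is not None else None,
    }


summary = summarize_holding(SAMPLE)
```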

Page 9: Xml Applications Libraries

OpenURL XML formats

• Normally we think of OpenURL as a set of key/value pairs:

http://www.crossref.org/openurl?url_ver=Z39.88-2004&req_dat=username:password&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.atitle=Isolation of a common receptor for coxsackie B&rft.jtitle=Science&rft.aulast=Bergelson&rft.auinit=J&rft.date=1997&rft.volume=275&rft.spage=1320&rft.epage=1323

• It doesn’t have to be. Newer versions allow you to send the metadata as XML rather than as a set of key/value pairs
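For comparison with the XML formats, the familiar key/value form can be assembled with a few lines of standard-library Python. The sketch below builds only the key/value (KEV) style shown in the URL above, not the XML format; the function name `build_kev_openurl` is invented here, and the credentials parameter (`req_dat`) is deliberately left out.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_kev_openurl(resolver, journal_metadata):
    """Build a key/value OpenURL for a journal article (hypothetical helper)."""
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
    }
    # Referent metadata keys are prefixed with "rft." in the KEV format.
    for key, value in journal_metadata.items():
        params[f"rft.{key}"] = value
    # urlencode percent-encodes reserved characters and turns spaces into "+".
    return f"{resolver}?{urlencode(params)}"


citation = {
    "atitle": "Isolation of a common receptor for coxsackie B",
    "jtitle": "Science",
    "aulast": "Bergelson",
    "auinit": "J",
    "date": "1997",
    "volume": "275",
    "spage": "1320",
    "epage": "1323",
}
openurl = build_kev_openurl("http://www.crossref.org/openurl", citation)

# Round-trip check: decode the query string back into key/value pairs.
decoded = parse_qs(urlsplit(openurl).query)
```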

Page 10: Xml Applications Libraries

Digital Library Standards for Metadata

• There are lots of different types of metadata for digital objects

  • Descriptive
  • Structural
  • Administrative
  • Technical

• Different types of metadata = different standards

  • Dublin Core, MODS - Descriptive
  • METS - Structural, Administrative
  • PREMIS - Administrative
  • MIX - Technical

Page 11: Xml Applications Libraries

Dublin Core

• Two different element sets: Simple and Qualified

• Simple
  • 15 elements
  • Extremely simplistic
  • dc namespace

• Qualified
  • Includes all the elements in Simple Dublin Core plus additional elements that are refinements
  • description -> abstract
  • Still fairly simple but better granularity
  • dcterms namespace

Page 12: Xml Applications Libraries

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<records xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <record>
    <dc:creator>Morville, Peter.</dc:creator>
    <dc:date>2005</dc:date>
    <dc:description>Includes bibliographical references and index.</dc:description>
    <dc:description>How do you find your way in an age of information overload? How can you filter streams of complex information to pull out only what you want? Why does it matter how information is structured when Google seems to magically bring up the right answer to your questions? What does it mean to be "findable" in this day and age? This eye-opening new book examines the convergence of information and connectivity. Written by Peter Morville, author of the groundbreaking Information Architecture for the World Wide Web, the book defines our current age as a state of unlimited findability. In other words, anyone can find anything at any time. </dc:description>
    <dc:format>xiv, 188 : ill. (some col.) ; 23 cm.</dc:format>
    <dc:identifier>0596007655 (pbk.)</dc:identifier>
    <dc:identifier>9780596007652 (pbk.)</dc:identifier>
    <dc:language xsi:type="http://purl.org/dc/terms/ISO639-2">eng</dc:language>
    <dc:publisher>O'Reilly</dc:publisher>
    <dc:subject xsi:type="http://purl.org/dc/terms/DDC">005.72</dc:subject>
    <dc:subject xsi:type="http://purl.org/dc/terms/LCC">QA76.9.D26 M67 2005</dc:subject>
    <dc:subject xsi:type="http://purl.org/dc/terms/LCSH">Database searching.</dc:subject>
    <dc:subject xsi:type="http://purl.org/dc/terms/NLM">TK 5105.888 M892a 2005</dc:subject>
    <dc:title>Ambient findability </dc:title>
    <dc:type>Text</dc:type>
  </record>
</records>
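Because Dublin Core element names are self-describing, a record like the one above needs far less domain knowledge to read than MARCXML. The following sketch collects every dc element of a record into a dictionary of lists (elements such as identifier and subject repeat). It runs against a trimmed copy of the sample; `dc_fields` is a name made up for this illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

DC = "http://purl.org/dc/elements/1.1/"

# Trimmed copy of the Dublin Core record shown on the slide.
SAMPLE = """<records xmlns:dc="http://purl.org/dc/elements/1.1/">
  <record>
    <dc:creator>Morville, Peter.</dc:creator>
    <dc:date>2005</dc:date>
    <dc:identifier>0596007655 (pbk.)</dc:identifier>
    <dc:identifier>9780596007652 (pbk.)</dc:identifier>
    <dc:title>Ambient findability </dc:title>
  </record>
</records>"""


def dc_fields(record):
    """Group the dc:* children of one <record> by element name."""
    fields = defaultdict(list)
    prefix = "{" + DC + "}"  # ElementTree spells namespaced tags as {uri}name
    for child in record:
        if child.tag.startswith(prefix) and child.text:
            fields[child.tag[len(prefix):]].append(child.text.strip())
    return dict(fields)


records = [dc_fields(r) for r in ET.fromstring(SAMPLE).iterfind("record")]
```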

Page 13: Xml Applications Libraries

METS

• Metadata Encoding and Transmission Standard

• Used for digital objects to “wrap-up” all metadata elements

• Can include other metadata schemes

• Provides structural metadata

• what files are part of the objects

• what is their purpose

Page 14: Xml Applications Libraries

MODS

• Metadata Object Description Schema

• Advantages

• Richer description than Dublin Core

• Element names more user-friendly than MARCXML

• Better separation of data and presentation than MARC, plus actual datatyping of elements

• Typically used for describing digital library content but MARCXML can be converted to MODS

Page 15: Xml Applications Libraries

XML from the Internet also useful to Libraries

• Feeds

• Standard formats for syndicating content

• RSS

• title, description, link, author, pubDate

• Atom

• title, summary, link, modified, dc:date

Page 16: Xml Applications Libraries
Page 17: Xml Applications Libraries

<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Library Hi Tech </title>
    <link>http://www.emeraldinsight.com/0737-8831.htm</link>
    <description> Table of Contents from the most recently published issues of Library Hi Tech</description>
    <language>en-us</language>
    <copyright>2009 Emerald Group Publishing Ltd.</copyright>
    <image>
      <title>Library Hi Tech </title>
      <url>http://www.emeraldinsight.com/info/pics/journals/lht-cover-xix.gif</url>
      <width>120</width>
      <height>157</height>
    </image>
    <item>
      <title>Accessing information in a parliamentary environment: is the OPAC dead? : Table of Contents</title>
      <link/>
      <description> &lt;B&gt;Abstract:&lt;/B&gt;&lt;BR/&gt; &lt;B&gt;Purpose&lt;/B&gt; - Access to library collections in an era where users want to "get" rather than "find" offers particular challenges. This article explores users' needs for bibliographic records in a primarily full text environment.&lt;B&gt;Design/methodology/approach&lt;/B&gt; - The paper describes access to parliamentary and library information from the Australian Parliament. It then outlines the approach taken to develop and implement a new search system, ParlInfo, which applied a repository and search system that provides integrated access to bibliographic and full text information. The system was launched in September 2008 and offers facets, alerts, RSS feeds and other Web 2.0 functionality to offer both the Australian public and Parliamentary Network users to access to library collections and parliamentary collections. &lt;B&gt;Findings&lt;/B&gt; -.</description>
      <author>Ms. Roxanne Missingham, Ms. Rina Brettell, Ms. Shirley White, Dr. Sarah Miskin</author>
      <pubDate>Sun Jan 18 14:15:05 GMT 2009</pubDate>
    </item>
  </channel>
</rss>

Page 18: Xml Applications Libraries
Page 19: Xml Applications Libraries

<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://purl.org/atom/ns#"
      xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/"
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      version="0.3">
  <title>Geological Magazine - Current Issue</title>
  <link rel="alternate" href="http://journals.cambridge.org/action/displayJournal?jid=GEO" />
  <info>Geological Magazine, Volume 146 Issue 01 Geological Magazine , established in 1864, is one of the oldest and best-known periodicals in earth sciences. It publishes original scientific papers covering the complete spectrum of geological topics, with high quality illustrations. Its worldwide circulation and high production values, combined with Rapid Communications and Book Review sections keep the journal at the forefront of the field. This journal is included in the Cambridge Journals open access initiative, Cambridge Open Option. Offer readers unrestricted online access to your work, click here for more details.</info>
  <entry>
    <title>Volume 146 Issue 01</title>
    <link rel="alternate" href="http://journals.cambridge.org/action/displayIssue?jid=GEO&amp;volumeId=146&amp;issueId=01" />
    <modified>2009-01-01T00:00:00Z</modified>
    <summary type="text/plain" mode="xml">Geological Magazine, Volume 146 Issue 01 Geological Magazine , established in 1864, is one of the oldest and best-known periodicals in earth sciences. It publishes original scientific papers covering the complete spectrum of geological topics, with high quality illustrations. Its worldwide circulation and high production values, combined with Rapid Communications and Book Review sections keep the journal at the forefront of the field. This journal is included in the Cambridge Journals open access initiative, Cambridge Open Option. Offer readers unrestricted online access to your work, click here for more details.</summary>
    <dc:date>2009-01-01T00:00:00Z</dc:date>
  </entry>
</feed>
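The RSS and Atom samples carry roughly the same information under different element names, so code that consumes publisher feeds usually normalizes them first. The sketch below does this for the two shapes shown above (RSS 2.0 and Atom 0.3) using trimmed copies of the samples. It is an illustration only: `feed_entries` is an invented name, and other feed versions (for example Atom 1.0, which uses a different namespace and `updated` instead of `modified`) are not handled.

```python
import xml.etree.ElementTree as ET

# Namespace of the Atom 0.3 sample above.
ATOM03 = {"atom": "http://purl.org/atom/ns#"}

RSS_SAMPLE = """<rss version="2.0"><channel>
  <title>Library Hi Tech </title>
  <item>
    <title>Accessing information in a parliamentary environment: is the OPAC dead? : Table of Contents</title>
    <link/>
    <pubDate>Sun Jan 18 14:15:05 GMT 2009</pubDate>
  </item>
</channel></rss>"""

ATOM_SAMPLE = """<feed xmlns="http://purl.org/atom/ns#" version="0.3">
  <title>Geological Magazine - Current Issue</title>
  <entry>
    <title>Volume 146 Issue 01</title>
    <link rel="alternate" href="http://journals.cambridge.org/action/displayIssue?jid=GEO&amp;volumeId=146&amp;issueId=01" />
    <modified>2009-01-01T00:00:00Z</modified>
  </entry>
</feed>"""


def feed_entries(xml_text):
    """Return a list of {title, link, date} dicts for an RSS 2.0 or Atom 0.3 feed."""
    root = ET.fromstring(xml_text)
    entries = []
    if root.tag == "rss":
        for item in root.iterfind("channel/item"):
            entries.append({
                "title": item.findtext("title"),
                # An empty <link/> yields "", which is normalized to None here.
                "link": item.findtext("link") or None,
                "date": item.findtext("pubDate"),
            })
    else:
        for entry in root.iterfind("atom:entry", ATOM03):
            link = entry.find("atom:link[@rel='alternate']", ATOM03)
            entries.append({
                "title": entry.findtext("atom:title", namespaces=ATOM03),
                "link": link.get("href") if link is not None else None,
                "date": entry.findtext("atom:modified", namespaces=ATOM03),
            })
    return entries


rss_entries = feed_entries(RSS_SAMPLE)
atom_entries = feed_entries(ATOM_SAMPLE)
```

The dates are left as strings on purpose: the two samples use different date formats, and parsing them is a separate problem.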

Page 20: Xml Applications Libraries

Sources for data in XML format

• Syndicated Table of Contents feeds
  • From publisher websites - Emerald
  • From the ticTOCs project - http://www.tictocs.ac.uk

• WorldCat API

• Evergreen Catalogs (Georgia Pines, University of Prince Edward Island)

• xISSN services

• Serials Solutions API

Page 21: Xml Applications Libraries
Page 22: Xml Applications Libraries
Page 23: Xml Applications Libraries

WorldCat API

• Service Levels
  • Default - limited set of indexes and limits; limited bibliographic data returned
  • Full - all indexes available in WorldCat; full bibliographic data

• Search formats
  • OpenSearch
  • SRU

• Response formats
  • OpenSearch
    • RSS
    • Atom
  • SRU
    • MARCXML
    • Dublin Core

Page 24: Xml Applications Libraries

SRU Query to WorldCat Search API

• Can search by ISSN or other fields; full MARC records can be returned

http://worldcat.org/webservices/catalog/search/sru?query=srw.in+all+%221041-7915%22&version=1.1&operation=searchRetrieve&wskey=key&recordSchema=info%3Asrw%2Fschema%2F1%2Fmarcxml&maximumRecords=10&startRecord=1&recordPacking=xml&servicelevel=default&sortKeys=relevance&resultSetTTL=300&recordXPath=

• query - SRW query. Use the SRU Explain Service (http://worldcat.org/webservices/catalog/) to help construct your query

• wskey - API key
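The long SRU URL above is easier to get right when it is generated rather than typed. Here is a small sketch that rebuilds the core of that request from an ISSN; the parameter names and values are copied from the URL on the slide, while the function name and the placeholder key are assumptions of this example. The optional `sortKeys`, `resultSetTTL` and `recordXPath` parameters are omitted.

```python
from urllib.parse import urlencode

SRU_BASE = "http://worldcat.org/webservices/catalog/search/sru"


def issn_search_url(issn, wskey, max_records=10):
    """Build an SRU searchRetrieve URL that looks up an ISSN and asks for MARCXML."""
    params = {
        "query": f'srw.in all "{issn}"',  # srw.in is the index used in the slide's query
        "version": "1.1",
        "operation": "searchRetrieve",
        "wskey": wskey,
        "recordSchema": "info:srw/schema/1/marcxml",
        "maximumRecords": max_records,
        "startRecord": 1,
        "recordPacking": "xml",
        "servicelevel": "default",
    }
    # urlencode handles the quoting (spaces -> "+", quotes -> %22, ":" -> %3A, "/" -> %2F).
    return f"{SRU_BASE}?{urlencode(params)}"


# "key" stands in for a real WorldCat Search API key.
url = issn_search_url("1041-7915", "key")
```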

Page 25: Xml Applications Libraries

An OpenSearch Query to WorldCat Search API

• Can only search by keywords, and the data returned isn’t particularly useful when dealing with serials/journals

http://worldcat.org/webservices/catalog/search/worldcat/opensearch?q=computers%20in%20libraries&format=atom&wskey=key

• q - your query. This is very simple; it really can’t be anything but a keyword search

• format - the format you want results returned in (Atom or RSS)

• wskey - WorldCat Search API key

Page 26: Xml Applications Libraries

xISSN Service

• Several types of Requests

• getForms - returns a list of ISSNs, with their production form information, in the same group as the requested ISSN

• Form is the ONIX production form code

• JB (Printed serial), JC (Serial distributed electronically by carrier), JD (Electronic serial distributed online), MA (Microform)

• getEditions - returns a list of ISSNs in the same group as the requested ISSN

• form, oclcnum, peerreview, publisher, rawcoverage, title

• getHistory - returns a list of ISSNs in the same group as the requested ISSN, as well as ISSNs for preceding/succeeding groups

• getMetadata - returns metadata about the requested ISSN

• xISSN History Visualization Tool - generate a chart showing the history of a journal with a given ISSN

Page 27: Xml Applications Libraries

<rsp stat="ok">
  <group rel="this">
    <issn form="JD"
          oclcnum="57136697 222024701 34298537"
          rawcoverage="Vol. 1, no. 1 (July 3, 1880)-v. 3, no. 82 (Mar. 4, 1882); [New ser.] Vol. 1, no. 1 (Feb. 9, 1883)-v. 23, no. 581 (Mar. 23, 1894); [2nd ser.] v. 1, no. 1 (Jan. 4, 1895)-"
          title="Science"
          publisher="New York, N.Y. : s.n"
          peerreview="Y">1095-9203</issn>
    <issn form="JB"
          oclcnum="53849218 237823594 77943117 182894935 1644869 248155486 213776464 225979457 231016675 183350662 70737295 145332150 191712526 222180991 264687537 9292560 5582807 27118932 173731846 241455726 174295239 32917481 181820410 5933538 7648838 19698903"
          rawcoverage="Vol. 1, no. 1 (July 3, 1880)-v. 3, no. 82 (Mar. 4, 1882); [New ser.] Vol. 1, no. 1 (Feb. 9, 1883)-v. 23, no. 581 (Mar. 23, 1894); [2nd ser.] v. 1, no. 1 (Jan. 4, 1895)-"
          title="Science"
          publisher="New York, N.Y. : s.n"
          peerreview="Y">0036-8075</issn>
  </group>
</rsp>
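A response like this one could be turned into a list of print and electronic editions as sketched below. The code assumes the response has no XML namespace, exactly as printed on the slide; a live service response may declare one, in which case the paths would need a namespace map. The form-code labels come from the xISSN slide, and `editions` is a hypothetical function name.

```python
import xml.etree.ElementTree as ET

# ONIX production form codes as listed on the xISSN slide.
FORMS = {
    "JB": "Printed serial",
    "JC": "Serial distributed electronically by carrier",
    "JD": "Electronic serial distributed online",
    "MA": "Microform",
}

# Trimmed copy of the response shown on the slide.
SAMPLE = """<rsp stat="ok">
  <group rel="this">
    <issn form="JD" title="Science" peerreview="Y">1095-9203</issn>
    <issn form="JB" title="Science" peerreview="Y">0036-8075</issn>
  </group>
</rsp>"""


def editions(xml_text):
    """List the ISSNs in a response together with a readable form label."""
    rsp = ET.fromstring(xml_text)
    if rsp.get("stat") != "ok":
        raise ValueError(f"request failed with stat={rsp.get('stat')!r}")
    results = []
    for node in rsp.iterfind("group/issn"):
        code = node.get("form")
        results.append({
            "issn": node.text,
            "title": node.get("title"),
            "form": FORMS.get(code, code),  # fall back to the raw code if unknown
            "peer_reviewed": node.get("peerreview") == "Y",
        })
    return results


found = editions(SAMPLE)
```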

Page 28: Xml Applications Libraries
Page 29: Xml Applications Libraries

Serials Solutions API

• Proprietary APIs

• Available for customers only

• API for 360 Link (OpenURL)

• Serials Solutions provides other APIs depending on which of their products you subscribe to

• SFX OpenURL resolver also has an API

Page 30: Xml Applications Libraries

Query to Serials Solutions 360 Link XML API

http://<client identifier>.openurl.xml.serialssolutions.com/openurlxml?version=1.0&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rfr_id=info%3Asid%2Fsersol%3ARefinerQuery&url_ver=Z39.88-2004&rft_id=info%3Adoi%2F10.1037%2F0003-066X.59.1.29

• Standard OpenURL elements are passed

• In this case the DOI is providing the majority of the info
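Since the DOI carries most of the information, a request like the one above can be generated from a DOI alone. The sketch below mirrors the parameters in the slide's URL; the function name and the example client identifier `example` are placeholders invented here, and the API itself is available to customers only.

```python
from urllib.parse import urlencode


def link360_url(client_id, doi):
    """Build a 360 Link XML API request for a DOI (parameters copied from the slide)."""
    params = {
        "version": "1.0",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
        "rfr_id": "info:sid/sersol:RefinerQuery",
        "url_ver": "Z39.88-2004",
        "rft_id": f"info:doi/{doi}",  # the DOI is passed as an info: URI
    }
    host = f"{client_id}.openurl.xml.serialssolutions.com"
    return f"http://{host}/openurlxml?{urlencode(params)}"


# "example" is a made-up client identifier standing in for <client identifier>.
url = link360_url("example", "10.1037/0003-066X.59.1.29")
```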

Page 31: Xml Applications Libraries

Other XML standards of interest

• COUNTER and SUSHI - http://www.niso.org/schemas/sushi/

• Data can be transmitted in XML format

• ONIX

• For Books - http://www.editeur.org/onix.html

• For Serials - http://www.editeur.org/onixserials.html

• Actually a set of formats

• Much more complex than the books standard

Page 32: Xml Applications Libraries

Possible Applications

• Integrate journal table of contents into web pages

• Provide users with the latest articles in their field by creating an aggregated feed of important journals in a given field

• Provide better interfaces for resource discovery

• Display print journal holdings in-line with e-journal holdings

• Check for other versions/iterations of a journal during OpenURL resolution (xISSN)

• Show users relationships between journals and title changes over time

Page 33: Xml Applications Libraries

Possible Applications

• Provide links to journal table of contents

• Use WorldCat API to search ISSN and retrieve 856

• Manipulate usage statistics information outside an ERM

• Show most popular journals, databases, ebooks to users

• Provide better interface for ILL staff to see holdings and loan rule information for e-resources

• Better display of cross-references between print and electronic journal holdings for users

Page 34: Xml Applications Libraries

Further Resources

• Auto-Populating an ILL form with the Serial Solutions Link Resolver API - http://journal.code4lib.org/articles/108

• Dublin Core - http://dublincore.org/

• ISO/FDIS 20775 - Holdings schema - http://www.loc.gov/standards/iso20775/

• MARC Holdings - http://www.loc.gov/marc/holdings/echdhome.html

• MARCXML - http://www.loc.gov/standards/marcxml/

• MODS - http://www.loc.gov/standards/mods/

• METS - http://www.loc.gov/standards/mets/

• OCLC Developer’s Network - http://worldcat.org/devnet/wiki/Main_Page

• WorldCat Search API URI Evaluator - http://worldcat.org/webservices/catalog/evaluator.html

• xISSN Web Services Documentation - http://xissn.worldcat.org/xissnadmin/doc/api.htm