![Page 1: Open Archive Initiative – Protocol for metadata Harvesting (OAI-PMH) Surinder Kumar Technical Director NIC, New Delhi suri@nic.insuri@nic.in, 011-24305503](https://reader036.vdocuments.mx/reader036/viewer/2022062517/56649f315503460f94c4d67d/html5/thumbnails/1.jpg)
Open Archives Initiative – Protocol for Metadata Harvesting (OAI-PMH)
Surinder Kumar
Technical Director
NIC, New Delhi
suri@nic.in, 011-24305503
OAI-PMH
The mission of the Open Archives Initiative (OAI) (www.openarchives.org) is to develop and promote “interoperability standards that aim to facilitate the efficient dissemination of content.”
OAI-PMH
The OAI-PMH is based on a simple and powerful model “whereby repositories (data providers) make metadata . . . available via a well-defined protocol. The exposure of the metadata allows other organizations (service providers) to harvest it and then aggregate it, post-process it, and refine it with the goal of developing services that add value.”
Background
Has origins in ePrints (arXiv, CogPrints), dating back to 1999
– Actively seeking wider applicability
– Nothing to do with OAIS
Aims to “facilitate the efficient dissemination of content”
– Free access to the archives (at least: metadata)
– Consistent interfaces for archives and service providers
– Low-barrier protocol / effortless implementation (e.g., because based on HTTP, XML, DC)
Now on version 2.0 (June 2002)
OAI-PMH: what’s it all about
Service providers harvest metadata from data providers.
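[Diagram: a harvester at the service provider sends requests (HTTP) to the data provider, which holds metadata (+ resources) and returns metadata (XML) over OAI-PMH; the service provider builds a metadata “service” on the harvested records.]
Adapted from http://www.oaforum.org/tutorial/english/page3.htm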
What can be requested (verb)
Description of the archive (Identify)
A list of metadata formats supported by the data provider (ListMetadataFormats)
A list of sets provided (ListSets)
A list of resource identifiers (ListIdentifiers)
Many records (ListRecords)
An individual record (GetRecord)
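As a rough illustration of the six verbs, the Python sketch below lists each one with the arguments it takes and checks a request against that table. The argument lists follow the OAI-PMH 2.0 specification as best understood here; the `resumptionToken` argument is deliberately left out, and the names `VERB_ARGUMENTS` and `check_request` are made up for this example.

```python
# Arguments accepted by each OAI-PMH 2.0 verb (resumptionToken omitted for brevity).
VERB_ARGUMENTS = {
    "Identify": {"required": set(), "optional": set()},
    "ListMetadataFormats": {"required": set(), "optional": {"identifier"}},
    "ListSets": {"required": set(), "optional": set()},
    "ListIdentifiers": {"required": {"metadataPrefix"},
                        "optional": {"from", "until", "set"}},
    "ListRecords": {"required": {"metadataPrefix"},
                    "optional": {"from", "until", "set"}},
    "GetRecord": {"required": {"identifier", "metadataPrefix"}, "optional": set()},
}


def check_request(verb, arguments):
    """Return a list of problems with a request; an empty list means it looks valid."""
    if verb not in VERB_ARGUMENTS:
        return [f"unknown verb: {verb}"]
    spec = VERB_ARGUMENTS[verb]
    given = set(arguments)
    # Report required arguments that are absent.
    problems = [f"missing argument: {name}"
                for name in sorted(spec["required"] - given)]
    # Report arguments the verb does not accept.
    allowed = spec["required"] | spec["optional"]
    problems += [f"unexpected argument: {name}"
                 for name in sorted(given - allowed)]
    return problems


if __name__ == "__main__":
    print(check_request("Identify", {}))
    print(check_request("GetRecord", {"identifier": "oai:example:1"}))
```

A real data provider would report such problems as OAI-PMH error responses rather than Python lists; the table is only meant to make the verb list concrete.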
Example Requests
http://archive.example.org/oaipmh?verb=Identify
http://archive.example.org/oaipmh?verb=ListRecords&metadataPrefix=oai_dc
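Requests like these are ordinary HTTP GET URLs, so they can be assembled with the standard library. A minimal sketch, assuming the example base URL from the slide; `build_request_url` is a hypothetical helper name, not part of any OAI toolkit.

```python
from urllib.parse import urlencode

BASE_URL = "http://archive.example.org/oaipmh"  # example host from the slide


def build_request_url(base_url, verb, arguments=None):
    """Build an OAI-PMH request URL: the verb first, then any further arguments."""
    query = {"verb": verb}
    query.update(arguments or {})
    # urlencode percent-escapes values and keeps the insertion order of the dict.
    return f"{base_url}?{urlencode(query)}"


if __name__ == "__main__":
    print(build_request_url(BASE_URL, "Identify"))
    print(build_request_url(BASE_URL, "ListRecords", {"metadataPrefix": "oai_dc"}))
```

Arguments are passed as a dictionary rather than keyword arguments because `from`, one of the protocol's date-range arguments, is a reserved word in Python.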
Metadata Formats
Metadata may be returned in any XML format
Dublin Core is mandatory
– OAI-PMH specifies the XML schema to use
– No single DC element is mandatory
Other element sets / bindings are optional
– Qualified DC (e.g. RDN, NSDL)
– MODS (LoC)
– LOM (RDN-LTSN)
– ODRL (JORUM (I think))
– ...
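Because no single DC element is mandatory, a harvester cannot assume that any particular field is present. The sketch below reads an `oai_dc` record into a dictionary of lists so that absent and repeated elements are both handled; the sample record and the function name `dc_fields` are invented for illustration.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Invented sample record: it has a title and two creators, but no date or identifier.
SAMPLE = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                  xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An example thesis</dc:title>
  <dc:creator>A. Author</dc:creator>
  <dc:creator>B. Author</dc:creator>
</oai_dc:dc>"""


def dc_fields(xml_text):
    """Map each Dublin Core element name to the list of values found in the record."""
    root = ET.fromstring(xml_text)
    fields = {}
    for child in root:
        # ElementTree spells qualified names as "{namespace}localname".
        if child.tag.startswith("{" + DC_NS + "}"):
            name = child.tag.split("}", 1)[1]
            fields.setdefault(name, []).append((child.text or "").strip())
    return fields


if __name__ == "__main__":
    record = dc_fields(SAMPLE)
    print(record["creator"])       # both creators, in document order
    print(record.get("date", []))  # missing elements yield an empty list
```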
Sets
A grouping of items made to allow selective harvesting
– E.g. all theses
– E.g. the Engineering section
– E.g. all resources from a given source
Optional
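Selective harvesting by set can be pictured as a filter over each item's set memberships. The sketch assumes the colon-separated `setSpec` convention of the OAI-PMH 2.0 specification, under which a request for a set also covers its descendant sets; the item identifiers and set names are invented.

```python
# Invented items mapped to their setSpecs. Sets are optional,
# so an item may belong to no set at all.
ITEMS = {
    "oai:example:1": ["theses:engineering"],
    "oai:example:2": ["theses:physics"],
    "oai:example:3": ["articles"],
    "oai:example:4": [],
}


def in_set(item_set_specs, requested):
    """True if the item is in the requested set or in one of its descendants."""
    return any(spec == requested or spec.startswith(requested + ":")
               for spec in item_set_specs)


def select(items, requested=None):
    """Return identifiers of the items a harvester would get for a given set."""
    if requested is None:
        return sorted(items)  # no set argument: everything is harvested
    return sorted(identifier for identifier, specs in items.items()
                  if in_set(specs, requested))


if __name__ == "__main__":
    print(select(ITEMS, "theses"))
    print(select(ITEMS, "theses:physics"))
```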
List Records
Harvester can ask for a specific metadata format for
– All available items
– All items in a set
– All records modified in a given date range
– (A single item: GetRecord)
Data provider can return
– All relevant records
– Some relevant records + resumption token
– An error code (no such set / metadata format)
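The three possible responses suggest a simple harvesting loop: stop on an error, collect the records, and repeat the request with the resumption token until none is returned. The sketch below is one way to write that loop. `fetch` is a hypothetical callable standing in for the HTTP request, and the two canned pages are invented, heavily trimmed responses.

```python
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"


def harvest_identifiers(fetch, metadata_prefix="oai_dc"):
    """Yield record identifiers, following resumption tokens until the list is complete.

    `fetch` (hypothetical) takes a dict of request arguments and returns the response XML.
    """
    arguments = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(arguments))
        error = root.find(OAI + "error")
        if error is not None:
            # For example an unknown set or an unsupported metadata format.
            raise RuntimeError(f"{error.get('code')}: {error.text}")
        for header in root.iter(OAI + "header"):
            yield header.findtext(OAI + "identifier")
        token = root.findtext(f"{OAI}ListRecords/{OAI}resumptionToken")
        if not token:
            return  # a missing or empty token means this was the last chunk
        # A follow-up request carries only the verb and the token.
        arguments = {"verb": "ListRecords", "resumptionToken": token}


# Invented, trimmed responses: one record per page, two pages in total.
PAGES = {
    None: """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>
      <record><header><identifier>oai:example:1</identifier></header></record>
      <resumptionToken>page2</resumptionToken></ListRecords></OAI-PMH>""",
    "page2": """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>
      <record><header><identifier>oai:example:2</identifier></header></record>
      <resumptionToken/></ListRecords></OAI-PMH>""",
}


def fake_fetch(arguments):
    """Stand-in for an HTTP GET: pick the canned page by resumption token."""
    return PAGES[arguments.get("resumptionToken")]


if __name__ == "__main__":
    print(list(harvest_identifiers(fake_fetch)))
```

In practice `fetch` would issue the HTTP request and return the response body; keeping it as a parameter lets the loop be tried without a live repository.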
Static Repositories
Even lighter-weight specification for data providers with small and relatively static collections
– E.g. the output from a conference
Essentially an XML file available at a URL
Accessed through a “static repository gateway” intermediary
Issues: complexity
Providing data is easy
Harvesting data is easy
However
Doing so may lead to complex workflow / policy issues
– What do you do with the harvested metadata?
– Do you modify the metadata you harvest?
– If so, do you feed this back to the provider?
– What if the provider changes a modified record?
– Does a service provider disseminate via OAI?
Issues: uptake
Lots of implementers, who have produced lots of useful support
However
Relatively little commercial uptake
Relatively little support for harvesting rich metadata
Relatively little support/consensus on sets
Issues: Harvesting resource (e.g. Full text)
Nothing in OAI-PMH requires that full text be available for harvesting.
– Resource may be physical or access controlled
Nothing in OAI-PMH requires that the information needed for harvesting be available.
However, in many cases OAI-PMH will provide the information required to harvest the resource.
http://www.myoai.co
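In practice, the information needed to fetch a resource often appears as a URL among a record's `dc:identifier` values, though nothing guarantees this. A small sketch of that heuristic, assuming the identifier values have already been extracted from the record; the function name and sample values are invented.

```python
from urllib.parse import urlparse


def candidate_resource_urls(identifiers):
    """Keep only the identifier values that look like fetchable http(s) URLs."""
    urls = []
    for value in identifiers:
        value = value.strip()
        parsed = urlparse(value)
        if parsed.scheme in ("http", "https") and parsed.netloc:
            urls.append(value)
    return urls


if __name__ == "__main__":
    values = [
        "oai:example:1",                       # OAI identifier, not a location
        "http://archive.example.org/123.pdf",  # possible full-text location
        "ISBN 0-000-00000-0",                  # physical resource
    ]
    print(candidate_resource_urls(values))
```

An empty result is a normal outcome: the resource may be physical, access controlled, or simply not linked from the metadata.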
Service Provider
Arc: A Cross Archive Search Service
http://arc.cs.odu.edu/
From October 2000
Arc is an experimental research service that serves as a platform for demonstrating the scalability of the OAI-PMH and as a vehicle for providing access to OAI-compliant repositories through a unified search interface. Arc is the oldest federated search service based on the OAI-PMH.
OAISTER
OAIster is a union catalog of digital resources. We provide access to these digital resources by "harvesting" their descriptive metadata (records) using OAI-PMH (the Open Archives Initiative Protocol for Metadata Harvesting).
Service Providers
Citebase
http://citebase.eprints.org/
May 2001
Citebase “allows researchers to search across free, full-text research literature eprint archives, with results ranked according to many criteria (e.g., citation impact), and then to navigate that literature using citation links and analysis.”
Contd…
SAIL-eprints (Search, Alert, Impact and Link)
http://eprints.bo.cnr.it/
April 2003
SAIL-eprints is “an electronic open access service provider for finding scientific or technical documents, published or unpublished, in Chemistry, Physics, Engineering, Materials Sciences, Nanotechnologies, Microelectronics, Computer Sciences, Astronomy, Astrophysics, Earth Sciences, Meteorology, Oceanography, . . . [Agriculture], and related . . . [subjects].”
Resources
Open Archives Initiative
– http://www.openarchives.org/
– Spec, best practice guide and useful resources, mailing lists
OAI for beginners
– http://www.oaforum.org/tutorial/
– Online tutorial
OAI Repository Explorer
– http://www.purl.org/NET/oai_explorer
– Web interface for issuing OAI-PMH requests