Caltech Library Presentation · Author: Ed Sponsler
Post on 18-Jun-2020
http://resolver.caltech.edu/CaltechLIB:SPOiti05
Caltech CODA
• http://coda.caltech.edu
• CODA: Collection of Digital Archives
• Caltech Scholarly Communication
• 15 Production Archives
• 3102 Records
• Theses, technical reports, conference proceedings, oral histories, refereed articles
We Want Federation
• Search all archives at once (federated search)
• Browse all authors, and all records from a given author, in one place (electronic CV)
OAI-PMH Can Help
• Open Archives Initiative – Protocol for Metadata Harvesting
• http://www.openarchives.org
• Two Tier Model
  – Data Providers
  – Service Providers
• Service Providers harvest metadata from Data Providers via the OAI Protocol
Data Providers
• Expose Metadata
• All records must be described by a minimal set of metadata:
  – Author
  – Title
  – Abstract
  – Submission date
  – URL to Record
  – Unique Identifier
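As an illustration only, the minimal metadata set listed above could be modeled as a small record type. The class name `MinimalRecord`, its field names, and the `missing_fields` helper are assumptions made for this sketch; they do not come from the slides or from any OAI tool.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass(frozen=True)
class MinimalRecord:
    """Hypothetical container for the six fields on the slide."""
    identifier: str            # unique identifier
    title: str
    authors: Tuple[str, ...]   # one or more author names
    abstract: str
    submitted: date            # submission date
    url: str                   # URL to the full record

    def missing_fields(self) -> List[str]:
        """Return the names of required text fields that are empty."""
        required = ("identifier", "title", "authors", "abstract", "url")
        return [name for name in required if not getattr(self, name)]
```

A Data Provider could run a check like `missing_fields()` before exposing a record, so that every harvested record carries the full minimal set.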
Service Providers
• Metadata is routinely harvested and stored in a central database
• The central database is the foundation for federated services
• Examples: DP9, Celestial, Google Scholar
Federation using OAI
• A collection of records must be described with a common, minimal set of metadata
• Data Provider tools expose the metadata over HTTP using OAI-PMH
• Service Providers use OAI-PMH to harvest Data Providers, index the content, and produce a new service (such as search, or acting as a Data Provider themselves)
Data Provider Requirements
• Expose metadata by responding to simple commands, using XML over HTTP:
  – Identify
  – GetRecord
  – ListIdentifiers
  – ListMetadataFormats
  – ListRecords
  – ListSets
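Each of these commands is sent as an ordinary HTTP request with a `verb` parameter. The sketch below builds such request URLs; the helper name `build_request` is an assumption for illustration, and the base URL is the example one from the OAI Repository Explorer slide.

```python
from urllib.parse import urlencode

# The six OAI-PMH verbs listed on the slide.
OAI_VERBS = {
    "Identify", "GetRecord", "ListIdentifiers",
    "ListMetadataFormats", "ListRecords", "ListSets",
}


def build_request(base_url, verb, **arguments):
    """Return the GET URL for one OAI-PMH request (hypothetical helper)."""
    if verb not in OAI_VERBS:
        raise ValueError("not an OAI-PMH verb: " + verb)
    params = {"verb": verb}
    params.update(arguments)  # e.g. metadataPrefix, identifier, set
    return base_url + "?" + urlencode(params)


BASE = "http://caltechcstr.library.caltech.edu/perl/oai2"
print(build_request(BASE, "Identify"))
print(build_request(BASE, "ListRecords", metadataPrefix="oai_dc"))
```

The response to each request is an XML document, which is what a Service Provider parses when it harvests.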
OAI Repository Explorer
• Helps evaluate and validate a Data Provider implementation
• Provide an OAI Base URL and send it queries.
• Example Base URL: http://caltechcstr.library.caltech.edu/perl/oai2
Data Provider Tools
• http://www.openarchives.org/tools/tools.html
• Currently 26 tools freely available to help implement OAI
• Most implementation burden placed on Service Providers, not Data Providers
Eprints at Caltech
• Eprints.org is a scholarly communication archiving software package
• It is also an OAI Data Provider
• All Caltech CODA archives are Data Providers
• Most run on eprints.org; Theses runs on VT ETDdb
The Problem
• Each Service Provider must harvest each of our 15 archives individually
• This discourages participation
• It is unnecessary, provided we can build a local Service Provider (union catalog of all of CODA)
The Solution
• Design Caltech CODA Union Catalog
• Locally harvest each archive into a central database using OAI-PMH
• Implement this database as an OAI Data Provider
• Instruct all outside harvesters to use this one Data Provider rather than the 15 individually
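The local harvesting step might look roughly like the following sketch. It is not the Caltech implementation: the function names, the SQLite table layout, and the injectable `fetch_fn` parameter are all assumptions for illustration. The loop issues `ListRecords` requests with the `oai_dc` metadata prefix and follows the protocol's `resumptionToken` until the archive reports no further pages.

```python
import sqlite3
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

OAI = "{http://www.openarchives.org/OAI/2.0/}"  # OAI-PMH 2.0 XML namespace


def fetch(url):
    """Default fetcher: return the raw response body for one request."""
    with urlopen(url) as response:
        return response.read()


def harvest(base_url, fetch_fn=fetch):
    """Yield (identifier, record_element) for every record in one archive."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    while True:
        root = ET.fromstring(fetch_fn(base_url + "?" + urlencode(params)))
        for record in root.iter(OAI + "record"):
            identifier = record.findtext(OAI + "header/" + OAI + "identifier")
            yield identifier, record
        # An absent or empty resumptionToken means the list is complete.
        token = (root.findtext(".//" + OAI + "resumptionToken") or "").strip()
        if not token:
            break
        params = {"verb": "ListRecords", "resumptionToken": token}


def harvest_all(archives, conn, fetch_fn=fetch):
    """Store records from every archive in one table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records ("
        "archive TEXT, identifier TEXT, xml TEXT, "
        "PRIMARY KEY (archive, identifier))"
    )
    for name, base_url in archives.items():
        for identifier, record in harvest(base_url, fetch_fn):
            conn.execute(
                "INSERT OR REPLACE INTO records VALUES (?, ?, ?)",
                (name, identifier, ET.tostring(record, encoding="unicode")),
            )
    conn.commit()
```

Passing the fetcher in as a parameter keeps the loop testable without network access. A real run would call `harvest_all` with a mapping of archive names to their 15 base URLs and a connection to the central database.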
EPrints.org as SP
• Build a harvesting routine to feed metadata into another instance of eprints.org using OAI-PMH
• Eprints.org does the rest
  – browse screens
  – search interface
  – Data Provider
End Result
• The Caltech Union Catalog will contain all 3100 CODA records in one database
• The metadata describing the records will be only the oai_dc subset (author, title, abstract, unique id, URL to target)
• Each record in union catalog will contain a link back to the full record in the harvested archive
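To make the oai_dc subset concrete, here is a minimal sketch that pulls those fields out of one harvested record. The mapping is an assumption: it reads author from `dc:creator`, abstract from `dc:description`, and takes the first `dc:identifier` that looks like a URL as the link back to the full record. Individual archives may populate these elements differently, and the function name `parse_record` is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def parse_record(record_xml):
    """Extract the union-catalog fields from one OAI-PMH <record>."""
    record = ET.fromstring(record_xml)

    def values(tag):
        # All non-empty values of one Dublin Core element, in document order.
        return [el.text.strip()
                for el in record.iterfind(".//dc:" + tag, NS)
                if el.text and el.text.strip()]

    urls = [v for v in values("identifier")
            if v.startswith(("http://", "https://"))]
    titles = values("title")
    abstracts = values("description")
    return {
        "unique_id": record.findtext("oai:header/oai:identifier",
                                     default="", namespaces=NS).strip(),
        "authors": values("creator"),
        "title": titles[0] if titles else "",
        "abstract": abstracts[0] if abstracts else "",
        "url": urls[0] if urls else "",  # link back to the full record
    }
```

The `url` field is what lets each union-catalog entry point back to the full record in the archive it was harvested from.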
End Result
• There will be one place for all harvesters to obtain Caltech records, instead of 15
• Use eprints to provide the local federated search interface across all our archives
• Author browse pages (like a CV)
• Centralized RSS (eprints.org supports this)
• Centralized access statistics
Challenges
• Centralized Browse by Author requires author name identifier (authority)
• Implement OAI harvester to feed the Union Catalog (based on eprints.org)
• Customize eprints.org to import records provided by this harvester
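The author-identifier challenge can be illustrated with a small sketch. Without an authority, two spellings of the same name produce two separate browse pages; with one, both resolve to a single identifier. The authority table, the identifiers, the names, and the function `browse_by_author` are all invented for this example.

```python
from collections import defaultdict

# Hypothetical authority file: name variant -> stable author identifier.
AUTHORITY = {
    "Doe, Jane": "author-001",
    "Doe, J.": "author-001",
    "Roe, Richard": "author-002",
}


def browse_by_author(records, authority):
    """Group record ids by author identifier, falling back to the raw name."""
    pages = defaultdict(list)
    for record in records:
        for name in record["authors"]:
            key = authority.get(name, name)  # unmatched names stay as-is
            pages[key].append(record["unique_id"])
    return dict(pages)


records = [
    {"unique_id": "oai:a:1", "authors": ["Doe, Jane"]},
    {"unique_id": "oai:b:7", "authors": ["Doe, J.", "Roe, Richard"]},
]
print(browse_by_author(records, AUTHORITY))
```

With an empty authority table the same input yields three separate author pages instead of two, which is the problem the slide points at.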
Summary
• Using OAI-PMH for federated searching requires three steps:
  – Define a minimal metadata set for all records
  – Wrap a Data Provider service around each collection of records to expose metadata
  – Harvest metadata centrally, then produce a service (such as search and browse)
• Skip step three if you’re satisfied with existing OAI Service Providers (DP9, Google, Celestial, etc.)