Kurt Maly
Department of Computer Science
Old Dominion University
Norfolk, Virginia 23529, USA
Digital Libraries, OAI and Free Software for Education and Science
5th National Conference, Computer Application Federation of China Instrument & Control Society
Yinchuan, Ningxia Province, PRC
September 22-24, 2003
Sept 24, 2003 5th National CACIS Conference
Outline
• Digital Libraries
• The Open Archives Initiative
• Free Software Systems
  • Arc
  • DP9
  • Kepler
  • RVOT
• Conclusions
• Important URLs
Digital Libraries
• DL = library whose content is stored digitally and can be accessed over the Internet
• The key difference between DLs and the general Web is that DL content is structured and has metadata associated with it, allowing more precise results to queries
Digital Libraries
• Development of software to support DLs has proceeded along proprietary lines
• It is extremely difficult for the average user to find information that is spread across different DLs
• Need for interoperability between DLs
Digital Libraries
• DL interoperability can be achieved at three levels
  • technical: protocols, formats, etc. should be consistent so that messages can be exchanged
  • content: agreements cover the data and metadata, and the interpretation of messages
  • organizational: includes rules for access, for changing collections and services, payment, and authentication
• Need to federate, filter and provide value-added services on remote content
Open Archives Initiative
• Addresses technical interoperability among distributed archives
• Facilitates the discovery of content in distributed archives
• The OAI framework defines two functional roles: data providers (archives) and service providers
Open Archives Initiative
• Data providers: expose the metadata of their objects for harvesting
• Service providers: extract metadata from data providers via the OAI metadata harvesting protocol
• Service providers develop value-added services based on the metadata collected from data providers, such as cross-archive search engines, linking systems, and peer-review systems
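The harvesting exchange between a service provider and a data provider can be sketched in a few lines of Python. This is a minimal illustration, assuming the OAI-PMH 2.0 XML namespace and the `oai_dc` metadata prefix; the base URL, the sample response, and the helper names `build_request` and `parse_titles` are hypothetical and not taken from the slides. No network access is performed.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_request(base_url, verb, **params):
    """Return the URL of an OAI-PMH request; arguments travel in the query string."""
    query = {"verb": verb}
    query.update(params)
    return base_url + "?" + urlencode(query)


def parse_titles(xml_text):
    """Extract (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for record in root.iter("{%s}record" % OAI_NS):
        identifier = record.findtext("{%s}header/{%s}identifier" % (OAI_NS, OAI_NS))
        title = record.findtext(".//{%s}title" % DC_NS)
        pairs.append((identifier, title))
    return pairs


# Hypothetical response, shaped like what a data provider might return.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
        <datestamp>2003-09-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A hypothetical preprint</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(build_request("http://archive.example.org/oai", "ListRecords", metadataPrefix="oai_dc"))
print(parse_titles(SAMPLE_RESPONSE))
```

A service provider would issue such requests to each data provider it knows about and build its value-added service on the collected pairs.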
OAI origin
"The Open Archives Initiative has been set up to create a forum to discuss and solve matters of interoperability between preprint solutions, as a way to promote their global acceptance." Paul Ginsparg, Rick Luce & Herbert Van de Sompel
(slide: Herbert Van de Sompel)
Core concepts of the Santa Fe convention
• low-barrier interoperability
• data-provider & service-provider model
• metadata harvesting model
• shared metadata format and parallel, community-specific metadata formats
• acceptable use
Slide callouts: Dienst subset; OAMS; XML reply; HTTP based; Gentlemen's agreement
Core concepts in OAI 1.0
• low-barrier interoperability
• data-provider & service-provider model
• metadata harvesting model
• shared metadata format and parallel, community-specific metadata formats
• acceptable use
• flexibility
Slide callouts: OAI 1.0 protocol; Dublin Core; HTTP based; community specific; reply (XML Schema, self contained)
New OAI mission statement
The Open Archives Initiative develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content.
New OAI mission statement (continued)
The Open Archives Initiative has its roots in an effort to enhance access to e-print archives as a means of increasing the availability of scholarly communication. Continued support of this work remains a cornerstone of the Open Archives program.
New OAI mission statement (continued)
The fundamental technological framework and standards that are developing to support this work are, however, independent of both the type of content offered and the economic mechanisms surrounding that content, and promise to have much broader relevance in opening up access to a range of digital materials.
[...]
Free Software - Arc
• Arc currently harvests metadata from about 150 OAI-compliant archives, normalizes it, and stores it in a search service based on a relational database (MySQL or Oracle)
• Over 6 million metadata records from various subject domains
• Arc also provides an OAI layer, thus making hierarchical harvesting possible
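The harvest, normalize, store, and search pipeline can be illustrated with a small sketch. It is not Arc's code: it assumes Python's built-in `sqlite3` as a stand-in for MySQL or Oracle, and the table layout, normalization rules, and function names are hypothetical.

```python
import sqlite3


def normalize(record):
    """Collapse whitespace and lower-case the subject so that records
    harvested from different archives compare consistently."""
    return {
        "identifier": record["identifier"].strip(),
        "archive": record["archive"].strip(),
        "title": " ".join(record["title"].split()),
        "subject": record.get("subject", "").strip().lower(),
    }


def open_store():
    """Create an in-memory database (stand-in for the relational back end)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records ("
        "identifier TEXT PRIMARY KEY, archive TEXT, title TEXT, subject TEXT)"
    )
    return conn


def store(conn, records):
    """Normalize harvested records and insert them, replacing older copies."""
    rows = [normalize(r) for r in records]
    conn.executemany(
        "INSERT OR REPLACE INTO records "
        "VALUES (:identifier, :archive, :title, :subject)",
        rows,
    )
    conn.commit()


def search(conn, keyword):
    """Return (identifier, title) for records whose title contains the keyword."""
    cur = conn.execute(
        "SELECT identifier, title FROM records "
        "WHERE title LIKE ? ORDER BY identifier",
        ("%" + keyword + "%",),
    )
    return cur.fetchall()
```

Because the store holds normalized records keyed by identifier, the same table could in principle be re-exposed through an OAI layer for hierarchical harvesting; that part is not shown here.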
Free Software – DP9
• The "deep web" or "invisible web" is a vast repository of content, such as documents in online databases, that general-purpose web crawlers cannot reach
• Its size is estimated at 500 times that of the surface web
• Internet search engines cannot index OAI collections, as they are not aware of the OAI protocol
Free Software – DP9
• A Web crawler indexes a Web site by starting with a base HTML page and following the links on that page to retrieve other pages deeper in the site
• DP9 computes an HTML page from the result of an OAI request and presents it to the Web crawler; the links on that page lead to further OAI requests
Free Software – DP9
• DP9 provides an entry page; if a web crawler finds this entry page, it may follow the links on it and send requests to DP9
• DP9 then forwards each request to the corresponding OAI data provider and processes the returned XML records
• Depending on how deep it follows the links, a crawler can index all records in an OAI data provider
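The gateway idea can be sketched as follows: turn an OAI response into an HTML page whose links are plain URLs a crawler can follow, and map each such URL back to an OAI request. This is an illustrative sketch only. The gateway address, the URL layout, and the function names are hypothetical, and plain string building stands in for the XSLT step that the DP9 diagram mentions.

```python
from html import escape
from urllib.parse import quote, urlencode
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
GATEWAY = "http://gateway.example.org/dp9"  # hypothetical gateway address


def record_link(repository, identifier):
    """Static-looking URL for one record; the gateway would map it to GetRecord."""
    return "%s/getrecord/%s/%s" % (
        GATEWAY, quote(repository, safe=""), quote(identifier, safe=""))


def to_oai_request(base_url, identifier):
    """OAI-PMH request the gateway would forward to the data provider."""
    query = {"verb": "GetRecord", "identifier": identifier, "metadataPrefix": "oai_dc"}
    return base_url + "?" + urlencode(query)


def identifiers_to_html(repository, xml_text):
    """Render a ListIdentifiers response as an HTML list of crawlable links."""
    root = ET.fromstring(xml_text)
    items = []
    for node in root.iter(OAI_NS + "identifier"):
        ident = node.text.strip()
        items.append('<li><a href="%s">%s</a></li>'
                     % (escape(record_link(repository, ident)), escape(ident)))
    return "<html><body><ul>\n%s\n</ul></body></html>" % "\n".join(items)
```

A crawler that starts at the entry page only ever sees ordinary HTML links; the OAI protocol stays hidden behind the gateway.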
Free Software – DP9
[Figure: DP9 architecture. Components: Web Crawler, DP9, URL Wrapper, JSP/Servlet, XSLT Processor, OAI Handler, two OAI Repositories. Arrow labels: Static URL, Translate, Call, Send OAI request / Get XML reply, Return HTML.]
Free Software - Kepler
• The objective of the Kepler framework is to let the average researcher at an average university publish results and disseminate them to a wide audience quickly and conveniently
• The Kepler framework is based on OAI and supports what are called "personal data providers" or "archivelets"
Free Software - Kepler
• Kepler framework: a digital library of many 'little' publishers
  • an easy-to-use archivelet that is downloadable and self-installing
  • an automated registration service to support tens of thousands of publishers
  • a simple service provider to harvest metadata from archivelets
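How the three pieces fit together can be sketched in a few lines. This is not Kepler's code: the class, the method names, and the `fetch_records` callback are hypothetical, and a real service provider would fetch records over OAI-PMH rather than through a Python callable.

```python
class RegistrationService:
    """Keeps track of archivelets so service providers know where to harvest."""

    def __init__(self):
        self._archivelets = {}

    def register(self, archivelet_id, base_url):
        """Add or update an archivelet and the URL it can be harvested from."""
        self._archivelets[archivelet_id] = base_url

    def unregister(self, archivelet_id):
        """Remove an archivelet; unknown ids are ignored."""
        self._archivelets.pop(archivelet_id, None)

    def list_archivelets(self):
        """Return (archivelet_id, base_url) pairs in a stable order."""
        return sorted(self._archivelets.items())


def harvest_all(registry, fetch_records):
    """Ask every registered archivelet for its records and merge the results.

    fetch_records is a callable(base_url) -> list of metadata dicts; it is
    injected so the sketch stays independent of any network code.
    """
    merged = []
    for archivelet_id, base_url in registry.list_archivelets():
        for record in fetch_records(base_url):
            merged.append(dict(record, source=archivelet_id))
    return merged
```

Keeping the registry separate from the harvester mirrors the split on the slide: publishers only talk to the registration service, and service providers consult it to discover where to harvest.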
[Figure: Kepler framework. Three publishing tools, each paired with an OAI-compliant repository; a registration service; three service providers.]
Free Software - RVOT
• Rapid Visual OAI Tool (RVOT) helps small organizations make their collections OAI-PMH compliant
• Constructs an OAI-PMH repository from a collection of files
• Metadata translation tool: records in the original collection can be in any of the supported formats, including RFC 1807, a MARC subset, and COSATI
• Lightweight HTTP server including an OAI-PMH request handler
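The metadata translation step can be illustrated with a small sketch that reads an RFC 1807 style record (lines of the form `FIELD:: value`) and maps a few fields to Dublin Core element names. The field mapping and the function names are assumptions made for illustration; they are not RVOT's actual translation rules.

```python
import re

# Hypothetical mapping from RFC 1807 fields to Dublin Core elements.
RFC1807_TO_DC = {
    "ID": "identifier",
    "TITLE": "title",
    "AUTHOR": "creator",
    "ABSTRACT": "description",
    "DATE": "date",
}

FIELD_RE = re.compile(r"^([A-Z][A-Z-]*)::\s*(.*)$")


def parse_rfc1807(text):
    """Return (field, value) pairs; lines without a 'FIELD::' prefix
    are treated as continuations of the previous field."""
    fields = []
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            fields.append([match.group(1), match.group(2).strip()])
        elif fields and line.strip():
            fields[-1][1] += " " + line.strip()
    return [(tag, value) for tag, value in fields]


def to_dublin_core(fields):
    """Group mapped values under Dublin Core element names; unmapped fields are dropped."""
    dc = {}
    for tag, value in fields:
        element = RFC1807_TO_DC.get(tag)
        if element:
            dc.setdefault(element, []).append(value)
    return dc
```

Repeated fields such as `AUTHOR` become repeated Dublin Core elements, which is why each element maps to a list of values.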
Free Software - RVOT
Table 1. OAI-PMH Related Tools
| Category | Tools |
|---|---|
| Publishing software | DSpace, eprints.org, CDSWare, Kepler |
| Data provider programming framework | UIUC OAI Implementation, OCLC OAICat, VTOAI package, oaiperl |
| Server software integrated with harvester | Arc, Celestial |
| Harvester programming framework | OCLC OAIHarvester, oaiperl, my.OAI |
| Other tools | DP9, Repository Explorer |
Conclusions
• OAI lets the many digital libraries available today interoperate, so that users can discover information across a wide variety of domains without having to learn the different user interfaces of the individual libraries
• OAI was founded by researchers who were interested not only in the free distribution of information but also in the free distribution of software
Conclusions
• All the software systems described in this paper are freely available, either as open source or directly from the research group that created them
• One caveat: free software does not necessarily mean that services run at no cost. One still has to account for technical support and for the hardware needed to set up services
Important URLs
• http://dlib.cs.odu.edu - ODU digital library research group
• http://www.openarchives.org
• http://arc.cs.odu.edu
• http://sourceforge.net/projects/oaiarc/
• http://dlib.cs.odu.edu/dp9
• http://kepler.cs.odu.edu