Open Source Software for Libraries
A Trend Report
Submitted by
Saiful Amin
Guided by
Dr. A R D Prasad
Project 2
A guided Research Project Submitted in Partial Fulfillment of the Course Leading to the Award of Associateship in Documentation and
Information Science (ADIS)
2001 - 2003
DOCUMENTATION RESEARCH AND TRAINING CENTRE
INDIAN STATISTICAL INSTITUTE
8th Mile, Mysore Road
Bangalore – 560 059
Acknowledgement
I am deeply indebted to my guide Dr. ARD Prasad, Associate Professor, Documentation Research and Training Centre (DRTC), Indian Statistical Institute, Bangalore. I take this opportunity to thank him from the bottom of my heart.
I also want to thank Prof. IK Ravichandra Rao, Head, DRTC, and Dr. Devika P Madalli, Lecturer, DRTC, for their continuous encouragement.
I must also thank Dr. K Mohan and my colleagues at the Learning Resource Centre at the Indian School of Business, Hyderabad, who have helped create such a pleasant atmosphere to work in.
_______________ (Saiful Amin)
Place: Bangalore Date: August 27, 2003
Declaration
I do hereby declare that the project report entitled “Open Source Software for Libraries: A Trend Report”, which is being submitted in partial fulfillment of the course leading to the award of the Associateship in Documentation and Information Science in DRTC, Indian Statistical Institute, Bangalore, is the result of the work carried out by me under the guidance and supervision of Dr. ARD Prasad, Associate Professor, Documentation Research and Training Centre.
I further declare that neither I nor any other person has previously submitted this project report to any other Institution/University for any degree or diploma.
_____________________
(Saiful Amin)
Place: Bangalore
Date: August 27, 2003
It is certified that this project has been carried out under my guidance and supervision.
______________________
(Dr. ARD Prasad)
Place: Bangalore
Date: August 27, 2003
Table of Contents
Chapter 1 – Introduction
Chapter 2 – Use of Software in Libraries
Chapter 3 – What is OSS?
Chapter 4 – Software Tools for Automation
Chapter 5 – Software Tools for Value Added Services
Chapter 6 – Software Tools for DL Initiatives
Chapter 7 – Miscellaneous Supporting Tools
Chapter 8 – Conclusion
Chapter 9 – Appendix – OSI Certified Licenses
Chapter 10 – Selective Bibliography
Chapter 1
Introduction
“Any sufficiently advanced technology is indistinguishable from magic.” – Arthur C. Clarke
• An Invitation to Library Software
• Objective of the Study
• Scope of the Project
• Distribution of Chapters
1 An Invitation to Library Software
Developments in electronic and communication technology have affected every
profession in the past decades and libraries are no exception. Libraries of all types
are challenged to provide greater information access and improved levels of
service, while coping with the pace of technological change and ever-increasing
budget pressure.
Use of software applications in libraries has become essential due to a number of
factors. The most visible factors among them are:
• Growth of Electronic Resources: Large databases from periodical,
magazine, and journal publishers became increasingly available in digital
format – at first on CD-ROM, later via online services. Library services are
transitioning from local traditional collections to global resources provided
on demand via the most advanced networking technologies. Today, library
collections are used by people on campus as well as by individuals who are
not even present on the library’s premises.
• Anytime Anywhere Access: Access to online digital information from
anywhere is the need of the hour. This is forcing a shift in the role of the library
from a repository to a gateway, with users expecting online libraries that
can provide round-the-clock service.
“Library users have grown accustomed to using the Internet as a research
tool and do not always appreciate the difference in quality of information
available through a library’s specialized collections, especially when
compared to what can be located through an Internet search engine. Thus,
libraries with substantial collections of information often find those
collections under utilized if the user interface is not designed to make it
easy to locate the required information.” (Pasquinelli, 2003)
• Resource Sharing: Libraries of all types also need to utilize new
application systems to automate resource sharing. Union Catalogs and
Inter-Library Loan modules are needed to allow cooperating institutions to
combine their catalogs and allow patrons of one library to request and
borrow materials from linked institutions. These technologies will foster
the growth of library consortia and the extension of offerings beyond the
organizational boundaries of individual libraries.
However, implementing new technologies and tools in library environments can
be a highly challenging task. Despite the significant benefits, many libraries do not
have adequate resources and infrastructure to maintain and upgrade available
technologies. In addition, there is a significant demand for standards-based, open
systems to promote interoperability.
Open Source Software (OSS), as will be discussed in the present study, helps
less-privileged libraries deal with the increasing demands for use of
technology. OSS enables democratization of technology. OSS can lower the
total cost of ownership (TCO) relative to proprietary systems, since it carries no
license fee and is available free for download on the Internet. Thus OSS is of great
importance to libraries in developing countries like India.
OSS also gives users the freedom to customize the software to their needs,
since they have access to its source code.
2 Objective of the Study
The objective of the present study is to look into the technologies and tools
available in the open source world that can be used in improving the services
within the libraries.
3 Scope of the Project
The project is based on the study of available Open Source Software (OSS) useful
to libraries in general. It includes integrated library systems (ILS), cataloguing
tools, resource sharing tools, digital library tools, and other information service
tools useful in day-to-day functioning of the libraries.
4 Distribution of Chapters
Chapter 1 – Introduction
Chapter 2 – Use of Software in Libraries
• Why Automate?
• Software Needs for Automation
• Software Needs for Value Added Services
• Software Needs for DL Initiatives
Chapter 3 – What is OSS?
• What is OSS?
• Criteria for OSS
• The OSS movement
• Why adopt OSS in Libraries?
Chapter 4 – Software Tools for Automation
• Integrated Solutions
• Databases
• Cataloguing/MARC Tools
• Z39.50 Tools
• Barcode Makers
Chapter 5 – Software Tools for Value Added Services
• Library Portal Solutions
• User Services
• Subject Gateways
Chapter 6 – Software Tools for DL Initiatives
• Digital Library Solutions
• DL-like Software
• OAI-PMH Tools
Chapter 7 – Miscellaneous Supporting Tools
• HTML tools
• XML tools
• Information Retrieval Tools
Chapter 8 – Conclusion
• Barriers in Using OSS
• Criteria for Selection of OSS
• Conclusion
Chapter 9 – Appendix – OSI Certified Licenses
Chapter 10 – Selective Bibliography
Chapter 2
Use of Software in Libraries
“Necessity is the mother of invention” – Proverb
• Why Automate?
• Software Needs for Automation
• Software Needs for Value Added Services
• Software Needs for Digital Library Initiatives
1 Why Automate?
Automation has been so thoroughly debated over the last few decades that few
arguments against it remain. Still, we need to place the topic in the
context of possible improvements to existing library services.
Benefits for Patrons: Library automation offers many opportunities to improve
services to the library users. Benefits include faster access to resources through
OPACs, remote access, access to online reference tools, etc.
Benefits for Staff: Automation reduces the need to do repetitive jobs manually. It
reduces the manual work involved in circulation, cataloguing, acquisitions, etc.
Automation also allows the staff to draw on online resources and offline
databases when providing reference services.
Benefits for Institution: Automation not only builds a positive reputation for the
library's services but also increases the access points available to users.
2 Software Needs for Automation
Before we look into the software needs, let us see which activities in a
library can be automated. There are basically two kinds of activities in a
library, viz., visible and background. Activities such as circulation and reference
services, which are visible to the users, are of the first kind. Activities such as
ordering, accessioning, and cataloguing can be referred to as the background
activities in a library.
Libraries also need to interact with other libraries to share resources, so a
third type of activity is resource sharing with other libraries. In traditional
libraries, all three kinds of activities are still mostly done manually.
2.1 Housekeeping activities
The housekeeping activities are essential for the day-to-day functioning of the
library. These include:
• Acquisitions: tracking the purchase of materials through ordering,
claiming, receiving, invoicing, and processing.
• Cataloging: creating catalogue records.
• Serials: automating ordering, receipt, routing, and renewals of all serial
subscriptions.
• Reminders: for library patrons as well as vendors of books and periodicals.
2.2 Services to users
• Online Public Access Catalog (OPAC): an electronic record of holdings,
bibliographic, and item information.
• Circulation: allowing librarians to check materials in and out, place
renewals or holds, and enter payments.
• Reference Services: to the users and other communities.
2.3 Resource Sharing
• ILL: for sharing resources.
• Cooperative Cataloguing: for sharing the cataloguing work among a group
of libraries.
• Union Catalogue: to enable easy identification of a resource in the
holdings of a group of libraries.
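The idea of a union catalogue can be illustrated with a minimal sketch that merges the catalogues of several libraries into one index. This is only an illustration under stated assumptions: the library names, the record fields ("isbn", "title"), and the use of the ISBN as the match key are hypothetical and are not taken from any of the software described in this report.

```python
def build_union_catalogue(catalogues):
    """Merge per-library catalogues into one index keyed by ISBN.

    `catalogues` maps a library name to a list of records; each record is a
    dict with hypothetical "isbn" and "title" fields.
    """
    union = {}
    for library, records in catalogues.items():
        for record in records:
            # Create the union entry on first sight of this ISBN.
            entry = union.setdefault(
                record["isbn"], {"title": record["title"], "held_by": []}
            )
            # Record each holding library once.
            if library not in entry["held_by"]:
                entry["held_by"].append(library)
    return union


# Hypothetical sample data for two cooperating libraries.
catalogues = {
    "Library A": [{"isbn": "0000000001", "title": "Sample Title One"}],
    "Library B": [
        {"isbn": "0000000001", "title": "Sample Title One"},
        {"isbn": "0000000002", "title": "Sample Title Two"},
    ],
}
union = build_union_catalogue(catalogues)
for isbn, entry in sorted(union.items()):
    print(isbn, entry["title"], entry["held_by"])
```

A patron searching the merged index can see at once which cooperating libraries hold a title, which is the starting point for an inter-library loan request. Real systems would also need to reconcile records that describe the same work but differ in their cataloguing details.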
3 Software Needs for Value Added Services
Value addition is an important need for any service institution. Libraries
continually need to improve the quality of their service by adding value to each of
their products.
• Library Website: has become very important in modern libraries. It is
more than simply a library OPAC and can include library rules, subject-
based directories, access to online resources, news items, as well as online
reservation.
• Subject Guides: are useful for academic libraries for supporting the
existing curriculum of the parent institution.
• Reading Lists: are the modern version of literature search services on a
specific topic.
• Web Directories: are used to organize Internet resources on the basis of
classification, often biased towards a particular subject.
4 Software Needs for Digital Libraries
The growth of electronic information over the decades and the democratization of
the Internet have paved the way for the emergence of digital libraries. Digital
libraries are more than a mere collection of digital documents. They can be seen as an
extension of the existing libraries with all three basic functions, viz., collection,
organization, and dissemination of digital information resources.
The importance of digital libraries can be summarized in the following points:
• Digital Documents: As the number of digital and electronic documents
keeps growing, librarians need to organize them as efficiently
as possible. Simple information retrieval systems are not enough to handle
digital documents. Use of metadata is important in managing digital
content. That is where digital libraries come into the picture.
• Archival Needs: Libraries now have access to electronic documents online as
well as on CD-ROM. These resources need to be archived efficiently.
• Online/Remote Access: Managing online access to resources available
over a network is growing in importance by the day.
• Full-Text Search Capabilities: Full-text search is needed in a number of
situations, e.g., when context-based search does not fetch enough
documents.
• OAI-PMH Needs: Open Archives Initiative Protocol for Metadata
Harvesting (OAI-PMH) is a model for sharing of metadata between digital
libraries by means of metadata harvesting. The model supports building
low-barrier yet high-end federated search services across a number of digital
libraries. The protocol needs to be implemented by the individual digital
libraries as well as the search service providers.
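The harvesting side of OAI-PMH can be sketched in a few lines. The example below is a minimal illustration, not a full harvester: it builds a ListRecords request URL and extracts identifiers and Dublin Core titles from a response. The repository address and the sample response are hypothetical; the verb, the metadataPrefix parameter, and the XML namespaces are those defined by the protocol. The sketch makes no network call and ignores resumption tokens and error responses.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for a repository."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def harvest_titles(response_xml):
    """Extract (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(response_xml)
    pairs = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        identifier = record.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        title = record.findtext(f".//{{{DC_NS}}}title")
        pairs.append((identifier, title))
    return pairs


# Hypothetical response, trimmed to the elements this sketch reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample Title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

# The base URL is a placeholder, not a real repository.
print(list_records_url("http://repository.example.org/oai"))
print(harvest_titles(SAMPLE_RESPONSE))
```

A search service provider would run such requests against each participating digital library and index the harvested metadata, while each digital library only has to answer these requests.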
Chapter 3
What is OSS?
“Think free speech, not free beer” – Richard Stallman on Free Software Foundation
• What is OSS?
• The Ten Commandments
• The OSS movement
• Why Adopt OSS in Libraries?
1 What is OSS?
Open source is a software development model as well as a software distribution
model. In this model the source code of programs is made freely available with the
software itself so that anyone can see, change, and distribute it provided they abide
by the accompanying license. In this sense, Open Source is similar to peer review,
which is used to strengthen the progress of scholarly communication.
The open source software differs from the closed source or proprietary software
which may only be obtained by some form of payment, either by purchase or by
leasing. The primary difference between the two is the freedom to modify the
software.
An open system is a design philosophy antithetical to solutions designed to be
proprietary. The idea behind it is that institutions, such as libraries, can build a
combination of components and deliver services that include several vendors’
offerings. Thus, for instance, a library might use an integrated library system from
one of the major vendors in combination with an open source product developed by
another library or by itself in order to better meet its internal or users’ requirements.
Definition
According to Open Source Initiative (http://www.opensource.org/):
"Open source promotes software reliability and quality by supporting independent
peer review and rapid evolution of source code. To be certified as open source, the
license of a program must guarantee the right to read, redistribute, modify, and use
it freely."
Open source means several things (Chudnov, 1999):
• Open source software is typically created and maintained by developers
crossing institutional and national boundaries, collaborating by using
internet-based communications and development tools;
• Products are typically a certain kind of "free", often through a license that
specifies that applications and source code (the programming instructions
written to create the applications) are free to use, modify, and redistribute as
long as all uses, modifications, and redistributions are similarly licensed;
• Successful applications tend to be developed more quickly and with better
responsiveness to the needs of users who can readily use and evaluate open
source applications because they are free;
• Quality, not profit, drives open source developers who take personal pride
in seeing their working solutions adopted;
• Intellectual property rights to open source software belong to everyone who
helps build it or simply uses it, not just the vendor or institution who created
or sold the software.
2 The Ten Commandments
The Open Source Initiative (OSI) identified ten criteria for a software product to be
called open source. The OSI certifies a software license as an ‘OSI Certified
License’ on the basis of the following ‘Ten Commandments.’
1. Free Redistribution: The license shall not restrict any party from selling or
giving away the software as a component of an aggregate software
distribution containing programs from several different sources. The license
shall not require a royalty or other fee for such sale.
2. Source Code: The program must include source code, and must allow
distribution in source code as well as compiled form. Where some form of a
product is not distributed with source code, there must be a well-publicized
means of obtaining the source code for no more than a reasonable
reproduction cost–preferably, downloading via the Internet without charge.
3. Derived Works: The license must allow modifications and derived works,
and must allow them to be distributed under the same terms as the license of
the original software.
4. Integrity of the Author’s Source Code: The license may restrict source-
code from being distributed in modified form only if the license allows the
distribution of "patch files" with the source code for the purpose of
modifying the program at build time. The license must explicitly permit
distribution of software built from modified source code.
5. No Discrimination Against Persons or Groups: The license must not
discriminate against any person or group of persons.
6. No Discrimination Against Fields of Endeavor: The license must not
restrict anyone from making use of the program in a specific field of
endeavor.
7. Distribution of License: The rights attached to the program must apply to
all to whom the program is redistributed without the need for execution of
an additional license by those parties.
8. License Must not be Specific to a Product: The rights attached to the
program must not depend on the program's being part of a particular
software distribution.
9. The License Must not Restrict Other Software: The license must not
place restrictions on other software that is distributed along with the
licensed software. For example, the license must not insist that all other
programs distributed on the same medium must be open-source software.
10. The License must be Technology-Neutral: No provision of the license
may be predicated on any individual technology or style of interface.
3 The OSS Movement
The free/open source software movement began in the "hacker" culture of U.S.
computer science laboratories (Stanford, Berkeley, Carnegie Mellon, and MIT) in
the 1960's and 1970's. (Raymond, 2001)
The community of programmers at that time was small, and close-knit. Code
passed back and forth between the members of the community and if someone
made an improvement he/she was expected to submit that code to the community
of developers.
It was in this environment that Richard Stallman began his computer science career
in 1971, as a graduate student at the Massachusetts Institute of Technology
Artificial Intelligence Lab. In this environment, Stallman and his colleagues built
an enormous array of software tools for the PDP-10 (Rasch, 2000). In the early
eighties Stallman launched the GNU Project (http://www.gnu.org/), whose name
stands for GNU’s Not Unix, and later founded the Free Software Foundation
(http://www.fsf.org/) to support it.
The Open Source movement has its roots in this hacker culture of the seventies and
eighties. According to Morgan (2002):
“OSS is both a philosophy and a process. As a philosophy it describes the
intended use of software and methods for its distribution. Depending on
your perspective, the concept of OSS is a relatively new idea being only
four or five years old. On the other hand, the GNU Software Project -- a
project advocating the distribution of "free" software -- has been
operational since the mid '80's. Consequently, the ideas behind OSS have
been around longer than you may think. It begins when a man named
Richard Stallman worked for MIT in an environment where software was
shared.”
OSS is also a process for the creation and maintenance of software. This is not a
formalized process, but rather a process of convention with common characteristics
between software projects. (Morgan, 2002)
4 Why Adopt OSS in Libraries?
The range and quality of software available for libraries are limited compared to
those for other industries. According to David Chudnov (1999) this is not surprising:
“The library community is largely made up of not-for-profit, publicly
funded agencies which hardly command a major voice in today's high tech
information industry. As such, there is not an enormous market niche for
software vendors to fill our small demand for systems. Indeed the 1997
estimated library systems revenue was only $470 million, with the largest
vendor earning $60 million. Because even the most successful vendors are
very small relative to the Microsofts of this world (and because libraries
cannot compete against industry salary levels), there are relatively few
software developers available to build library applications, and therefore a
relatively small community pool of software talent.”
According to Eric Lease Morgan (2002), author of MyLibrary portal software:
“In many ways I believe OSS development, as articulated by Raymond, is
very similar to the principles of librarianship. First and foremost with the
idea of sharing information. Both camps put a premium on open access.
Both camps are gift cultures and gain reputation by the amount of "stuff"
they give away. What people do with the information, whether it be source
code or journal articles, is up to them. Both camps hope the shared
information will be used to improve our place in the world. Just as
Jefferson's informed public is a necessity for democracy, OSS is necessary
for the improvement of computer applications.”
According to Chudnov (1999) there are three factors pushing the use of OSS in
libraries:
1. OSS licenses allow libraries to cut their software budget and redirect the
savings to other areas needing more funds.
2. An OSS product is not locked into a single vendor. Thus even if a library buys
an open source system from one vendor, it might choose to buy technical
support from another company or get it from in-house experts.
3. The entire library community might share the responsibility of solving
information systems accessibility issues.
According to the Digital Library Federation (USA) Draft Report (2001) on
considering Open Source Software for Libraries, there are three virtues of OSS in
libraries. They are:
• OSS is an economical alternative to libraries' reliance upon commercially
supplied software. That is, there are real costs involved in the
development, maintenance, and use of OSS, but these are lower
than those associated with library reliance upon commercial software.
• OSS is essential if libraries are to develop software and systems that meet
their patrons' needs. With OSS the IT infrastructure that is essential to
library operations and services can be:
o open (that is, built according to open standards and as such
potentially interoperable with other essential software and systems);
o ubiquitously available to libraries;
o capable of being tailored to suit the needs and circumstances of
individual libraries;
o documented (and documentation must be available); and
o easier to debug, since errors can more effectively be identified and
corrected ("many eyeballs make bugs shallow")
• OSS ensures that library systems and online services will be more
functional for libraries and their patrons and as such be good for library
patrons. This hypothesis is posited because, through OSS developments,
libraries:
o are reinserted into the research and development process that results
in systems and software;
o share a stake in software development and as such have greater
influence over (and as a result take a greater interest in specification
of) the functional and performance requirements associated with
particular software tools and systems
o motivate and empower systems librarians and related technical staff
by encouraging creativity and positioning them to make a
difference; and
o are able to collaborate more easily with other information science
communities involved in common research and development areas.
OSS democratizes the use of software applications in libraries irrespective of the
size and scope of the library.
Chapter 4
Software Tools for Automation
“What one man can invent, another can discover.” – Arthur Conan Doyle
• Integrated Solutions
• Databases
• Cataloguing/MARC Tools
• Z39.50 Tools
• Barcode Makers
1 Integrated Solutions
Integrated Library Systems (ILS) are the current wave in the field of library
automation. An ILS combines several activities of the library into one integrated
system, allowing the library staff to perform all their functions online. These
activities range from simple housekeeping tasks, such as acquisition and
cataloguing, to user services and inter-library loan.
In the last few years we have seen the development of a number of ILS products in
the open source world. One important trend in this kind of product is the use of
web-based client/server architecture. Listed below are some of the well-known ILS
products.
1.1 Koha: The First Open Source Integrated Library System
Description: Koha is the first open source fully featured integrated library system
(ILS), used by a considerable number of libraries in the USA, New Zealand, and
Europe. The Koha ILS includes catalogue, OPAC, circulation, member
management, and acquisitions modules. Koha is used by public libraries, private
collectors, non-profit organizations, churches, schools, and corporations.
Special Features: Some of the key features are:
• Simple clear interface for librarians and members (patrons) to search right
from the front page.
• Customizable search - you choose which fields you want on your search
forms when you set it up
• Reading lists for members - now you can find the name of that great book
you read last year.
• Full acquisitions, including budgets and pricing information (with
supplier and currency conversion), kept so that you can see what
you've ordered and received - so handy at end of year and audit time.
• Simple acquisitions for the smaller library
• Able to catalogue websites as items, or have them as links to existing
records.
History: Koha was developed in 1999 and the first library went live in January of
2000. Koha's code has been in production since then and is continuing to move
towards higher levels of functionality and standards compliance, including
embracing the international records and cataloguing standards MARC and Z39.50.
Project Sponsors/Administrators: Katipo Communications, with funding from the
Horowhenua Library Trust and other libraries. The current project leader is Patrick
Eyler.
Dependency: Apache, Perl, MySQL (or any RDBMS)
Supported Platforms: Windows (without Z39.50 support), Linux, and UNIX
License: GNU General Public License
Availability: http://sourceforge.net/projects/koha, http://www.koha.org/download/
Further Information:
1. Project Homepage: http://www.koha.org/
2. Koha Wiki Page:
http://www.saas.nsw.edu.au/wiki/index.php?page=KohaProject
3. Koha Labs: http://www.kohalabs.com/
1.2 PhpMyLibrary
Description: PhpMyLibrary is a web-based library automation application meant
for smaller libraries. The system consists of cataloguing, circulation, and OPAC
modules. The system also has an import/export feature. It strictly follows the
USMARC standard for adding materials.
Special Features: The salient features are:
• Fully compatible with the Postnuke Content Management System enabling
easy integration with the Postnuke-based portal
• Online reservation system for library patrons with their own login
• Supports import from ISIS database with an ISIS2MARC program
History: Unknown
Project Sponsors/Administrators: Polerio Babao III, and Paolo Alexis Falcone
Dependency: Apache, PHP, MySQL, Python
Supported Platforms: Platform Independent
License: GNU General Public License
Availability: http://sourceforge.net/projects/phpmylibrary/
Further Information: Project Homepage: http://phpmylibrary.sourceforge.net/
1.3 OpenBiblio: A Library System That’s Free
Description: OpenBiblio is an easy to use, open source, automated library system
written in PHP containing OPAC, circulation, cataloging, and staff administration
functionality. The purpose of this project is to provide a cost effective library
automation solution for private collections, clubs, churches, schools, or public
libraries.
Special Features: The goals of the project have been to achieve the following:
• Intuitive and easy to use
• Well documented
• Easy to install with minimal expertise
• Designed with common library features to work with most library
workflows
It is fully compatible with the Postnuke Content Management System.
History: Unknown
Project Sponsors/Administrators: Dave Stevens
Dependency: Apache, PHP, MySQL
Supported Platforms: Platform Independent
License: GNU General Public License
Availability: http://sourceforge.net/project/showfiles.php?group_id=50071
Further Information: Project Home Page: http://obiblio.sourceforge.net/
1.4 GNU Library Management System (GLIBMS)
Description: Glibms is library management software developed using PHP and
PostgreSQL to automate the different activities carried out in the library. The
project is currently inactive at Sourceforge. It has been renamed Karuna and is
hosted at sarovar.org.
Special Features: Unknown
History: Unknown
Project Sponsors/Administrators: Sharmad Naik, Gaurav Priyolkar
Dependency: Apache, PHP, Perl, PostgreSQL
Supported Platforms: Linux, UNIX
License: GNU General Public License
Availability: http://sourceforge.net/projects/glibs/
Further Information: Project Home Page: http://sourceforge.net/projects/glibs/
1.5 Avanti: An Open Source Library Computing System
Description: Avanti MicroLCS is an open source general purpose library
computing system that is small, simple, and easy to install and use. Written in
Java, it is platform independent and can run on any system that supports a Java
runtime environment. Although it targets small libraries, it has a powerful and very
flexible architecture that allows it to be adapted for use in libraries of any type.
Special Features: Some key objectives of the project are:
• Keep it as small, simple and extendable as possible, using a well-
considered, clean design.
• Implementation neutral: Base the design on a purely abstract model of
library systems. Avoid designing for a literal library. This makes the core
system very portable and adaptable to the needs of libraries of all types.
• Platform independent: 100% pure Java.
• It should be easy to install and use. Unlike most other open source
solutions, it should not require the skills of a system administrator to install
and maintain.
• User interfaces should be modeless, flat and simple.
• Keep the memory and resource footprint very small. Avanti is expected
to be used in a variety of forms, including that of a library automation
server appliance.
• Incorporate standards such as MARC and Z39.50 as modules and
interfaces, but do not allow them to become part of the underlying design.
History: Avanti is an effort, begun in 1998 by Peter Schlumpf, to develop a simple,
flexible, and open source solution to automating small and medium-sized libraries
of various types that requires a minimum of technical expertise to install and use.
Project Sponsors/Administrators: Peter Schlumpf
Dependency: Java Virtual Machine
Supported Platforms: Platform Independent
License: Unknown
Availability: http://home.earthlink.net/~schlumpf/avanti/downloads.html
Further Information: Project Home Page:
http://home.earthlink.net/~schlumpf/avanti/index.html
1.6 PhpMyBibli: A Free Solution for the Media Library
Description: PhpMyBibli is a web-based library automation system for French libraries.
Special Features: Some of the features are:
• A simplified administration being able to be ensured by the personnel of the
library
• Support of format UNIMARC
• Management of the authorities (responsible, editors, collections, matters...)
• Management of the loan, the reservations, the borrowers...
• Support for cataloguing electronic resources
• The management of the periodicals
History: Unknown
Project Sponsors/Administrators: Francois Lemarchand
Dependency: Apache, PHP, MySQL
Supported Platforms: Platform Independent
License: GNU General Public License
Availability: http://sourceforge.net/project/showfiles.php?group_id=64869
Further Information: Project Home Page: http://phpmybibli.sourceforge.net/
1.7 OpenBook
Description: OpenBook, a free Web-based integrated library system, offers
flexible, sophisticated automation to small and mid-sized public or school libraries
and was created to increase digital access to information. OpenBook uses open
source code to offer a low-cost, simple-to-use system rich in features generally
found only in high-end systems. The current technical beta version includes
complex searching capabilities, a full bibliographic record with external resource
linking as defined in MARC21, and a cataloging function that is MARC21-
compatible.
Special Features: Some distinctive features include the following:
• A completely Web-based cataloging system—It's simple to use, works with
any existing hardware or software, and supports all popular browsers.
• Combines total capture and retention of all MARC21 fields with custom
configuration of cataloging display fields
• A multilingual interface: can be displayed in any Roman-character
language
• Patron ability to access the system from home
• Enhanced safety features, including backup, restore, and purge
• A home page development template
History: OpenBook was developed as a modification of Koha, the first free open source
library system, created in New Zealand by the Horowhenua Library Trust and
Katipo Communications, Ltd. The Technology Resource Foundation's OpenBook
design team, which comprises experienced librarians and programmers, used Koha
as a basis to develop OpenBook from the ground up.
Project Sponsors/Administrators: Technology Resource Foundation
Dependency: Apache, Perl, MySQL
Supported Platforms: Unknown
License: Unknown, GNU GPL
Availability: Currently not accessible
Further Information:
1. Project Home Page: http://www.trfoundation.org/projects/faq.html
2. Press Release: http://www.infotoday.com/IT/sep01/news16.htm
1.8 Learning Access ILS
Description: The Learning Access ILS is a full-feature Open Source library
automation system developed for use by small public and school libraries in the
U.S. and the rest of the world. The Institute will make this system available free to
libraries that, because of cost, have been unable to achieve the benefits of
automation.
The Learning Access ILS consists of three modules: the patron or user module
(OPAC), the cataloging module and the circulation module. Future releases
may also include an acquisition module. All modules have Web-based interfaces and
support multilingual use, with the initial release supporting English, Spanish
and French.
Special Features: The system supports the full MARC21 format for bibliographic,
holding, authority and community records. It has an intuitive importing program to
add records to its database. The cataloging client includes Z39.50 searching
capabilities to allow for copy cataloging against OCLC or other larger union
databases. Future releases will also support Z39.50 searches against the database.
History: The Learning Access Institute pursues its mission through two distinct yet
interconnected programme areas. The Technology Development Program focuses
its efforts on the development of and adaptation of information technology
solutions to meet the information and learning access needs of underserved
communities.
Project Sponsors/Administrators: Learning Access Institute
Dependency: Apache, PHP, Perl, MySQL
Supported Platforms: Linux, Windows NT/2000 (Not tested)
License: GNU General Public License
Availability: Not currently available
Further Information:
Project Home Page: http://www.learningaccess.org/website/techdev/ils.php
1.9 Karuna
Description: This project is a library management system designed to automate a
library. It takes into consideration all aspects of a library, such as search,
issue/retrieval, and acquisition.
Special Features: Unknown
History: It is another version of the GNU Library Management System (GNU
LMS). According to the author of Karuna (who was also one of the developers of
GNU LMS), the original GNU LMS is no longer supported.
Project Sponsors/Administrators: Sharmad Naik
Dependency: Apache, PHP, PostgreSQL
Supported Platforms: Linux, UNIX
License: GNU GPL
Availability: http://sarovar.org/project/showfiles.php?group_id=34
Further Information: Project Home Page: http://sarovar.org/projects/karuna/
2 Databases
Databases are now used throughout library software, whether in an
ILS, cataloguing software, an information retrieval tool, a reference service tool, a current
awareness service tool, or simply a library website. A number of
Relational Database Management Systems (RDBMS) are available as open source
(such as MySQL, PostgreSQL, and SAP DB), and they support the Structured Query
Language (SQL) standards.
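As a rough illustration of what SQL support means for a library application, the sketch below creates a small catalogue table and queries it. It uses Python's built-in sqlite3 module purely as a stand-in for the database servers named above, and the table name, column names and sample rows are assumptions made for the example.

```python
import sqlite3


def build_catalogue(rows):
    """Create an in-memory catalogue table and load
    (accession_no, title, author, year) rows into it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE book ("
        " accession_no TEXT PRIMARY KEY,"
        " title TEXT NOT NULL,"
        " author TEXT,"
        " year INTEGER)"
    )
    conn.executemany("INSERT INTO book VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def titles_by_author(conn, author):
    """Return the titles by one author, oldest first.
    The ? placeholder keeps user input out of the SQL text."""
    cursor = conn.execute(
        "SELECT title FROM book WHERE author = ? ORDER BY year", (author,)
    )
    return [row[0] for row in cursor.fetchall()]


# Hypothetical sample data for the sketch.
SAMPLE_ROWS = [
    ("A001", "Prolegomena to Library Classification", "Ranganathan, S. R.", 1937),
    ("A002", "The Five Laws of Library Science", "Ranganathan, S. R.", 1931),
    ("A003", "The Cathedral and the Bazaar", "Raymond, Eric S.", 1999),
]
```

The same CREATE, INSERT and SELECT statements should carry over to the servers described in this section with little change, although data types and connection code differ from product to product.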
2.1 OpenIsis
Description: OpenIsis is the open source member of the CDS/ISIS software
family. It is well suited for bibliographic databases with variable length fields and
repeatable sub-fields.
Special Features: Some of the special features are
• Highly flexible data structure: potentially unlimited number of data fields in
record
• Highly efficient storage: unused data fields consume no space
• Natural Modeling – ultra fast access: logically related data that would be
artificially separated in a relational DB is stored in a single record
• Highly flexible index structure: index entries associated with a record are
under full application control, can even be derived from associated text
documents of any format.
History: Developed since May 2001
Project Sponsors/Administrators: OpenIsis Verein, Berlin
Dependency: Unknown
Supported Platforms: Linux, UNIX, Windows, MacOS X
License: GNU GPL, LGPL
Availability: http://sourceforge.net/project/showfiles.php?group_id=11257
Further Information: Project Home Page: http://www.openisis.org/
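The data model described for OpenIsis (variable-length, repeatable fields, no space used by absent fields, index entries under application control) can be pictured with a toy structure. The sketch below is not OpenIsis code and does not use its API; the class and function names, the numeric tags and the "^" subfield convention (borrowed from CDS/ISIS practice) are assumptions for illustration only.

```python
from collections import defaultdict


class IsisStyleRecord:
    """Toy record: an ordered list of (tag, value) pairs.
    A tag that is never added occupies no storage at all."""

    def __init__(self):
        self.fields = []

    def add(self, tag, value):
        # Fields are repeatable: adding the same tag twice keeps both values.
        self.fields.append((tag, value))
        return self

    def get(self, tag):
        """Return every occurrence of a tag (empty list if absent)."""
        return [value for t, value in self.fields if t == tag]


def split_subfields(value):
    """Split 'text^aFoo^bBar' into {'': 'text', 'a': 'Foo', 'b': 'Bar'}.
    Simplification: a repeated subfield code keeps only its last value."""
    head, *parts = value.split("^")
    result = {"": head} if head else {}
    for part in parts:
        if part:
            result[part[0]] = part[1:]
    return result


def build_index(records, tag):
    """Build a word index over one tag; `records` maps a record
    number (MFN in CDS/ISIS terms) to an IsisStyleRecord."""
    index = defaultdict(set)
    for mfn, record in records.items():
        for value in record.get(tag):
            for word in value.lower().split():
                index[word].add(mfn)
    return index
```

Because the application decides what goes into the index, the same function could just as well index words taken from an attached document instead of a field value.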
2.2 PostgreSQL
Description: PostgreSQL is claimed to be the most advanced Open Source
database system in the world.
Special Features: Some of the special features are
• Exceptional performance and speed
• World-class security
• Flexibility to be extended as required
• Highly scalable design
• Minimal administration requirements
Full feature set is available at: http://advocacy.postgresql.org/advantages/
History: The PostgreSQL software itself had its beginnings in 1986 inside the
University of California at Berkeley as a research prototype, and in the 16 years
since has moved to its now globally distributed development model, with central
servers based in Canada.
Project Sponsors/Administrators: PostgreSQL Global Development Group
Dependency: Perl, Python, Tcl/Tk, JDK/Ant, Flex & Bison
Supported Platforms: Linux, UNIX, Windows (under cygwin environment)
License: BSD License
Availability: http://www.postgresql.org/mirrors-ftp.html
Further Information: Project Home Page: http://www.postgresql.org/
2.3 MySQL
Description: The MySQL database server is the world’s most popular open source
database. Its architecture makes it extremely fast and easy to customize. Extensive
reuse of code within the software and a minimalist approach to producing
functionality-rich features has resulted in a database management system
unmatched in speed, compactness, stability and ease of deployment. The unique
separation of the core server from the storage engine makes it possible to run with
strict transaction control or with ultra-fast transaction-less disk access, whichever is
most appropriate for the situation.
Special Features: Some of the major features are
• ANSI SQL syntax support
• Cross-platform support
• Independent storage engines
• Full-text indexing and searching
• Query caching
• Flexible security system, including SSL support
• Replication of database servers for robustness and speed
Full feature set is available at: http://www.mysql.com/products/mysql/index.html
History: The project was started in 1995 and has become quite mature in the last
five years. Undoubtedly it is the most popular open source RDBMS primarily
because of its speed.
Project Sponsors/Administrators: MySQL AB
Dependency: Unknown
Supported Platforms: Linux, UNIX, Windows, MacOS X
License: GNU GPL and Commercial non-GNU
Availability: http://www.mysql.com/downloads/index.html
Further Information: Project Home Page: http://www.mysql.com/
2.4 Firebird
Description: Firebird is a relational database offering many ANSI SQL-92 features
that runs on Linux, Windows, and a variety of Unix platforms. Firebird offers
excellent concurrency, high performance, and powerful language support for stored
procedures and triggers. It has been used in production systems, under a variety of
names since 1981.
Special Features: Unknown
History: In August 2000, Borland Software Corp. (formerly known as Inprise)
released the beta version of InterBase 6.0 as open source. The community of
waiting developers and users preferred to establish itself as an independent, self-
regulating team rather than submit to the risks, conditions and restrictions that the
company proposed for community participation in open source development. A
core of developers quickly formed a project and installed its own source tree on
SourceForge.
Project Sponsors/Administrators: Ann W. Harrison, Pavel Cisar, John Bellardo,
Mark Odonohue, David Jencks, Dmitry Yemanov, Sean Leyne
Dependency: glibc-2.2, ncurses4
Supported Platforms: Linux, UNIX, Windows
License: Mozilla Public License, InterBase Public License
Availability: http://sourceforge.net/project/showfiles.php?group_id=9028
Further Information: Project Home Page: http://www.firebirdsql.org/
2.5 SAP DB
Description: SAP DB is an open, SQL-based, relational database management
system for small to very large implementations, supporting object orientation and
unstructured data. SAP DB adheres to open standards including SQL, JDBC, and
ODBC; access from Perl and Python; and HTTP-based services with HTML or
extensible markup language (XML) content.
Special Features: The main features are
• Round-the-clock operation
• Easy administration
• Free of reorganization tasks
• Unlimited number of users
• Unlimited database size
• Supports all SAP solutions
History: Project started in October 2000.
Project Sponsors/Administrators: SAP AG, Germany
Dependency: Unknown
Supported Platforms: Windows NT, Linux
License: GNU GPL, LGPL
Availability: http://www.sapdb.org/7.4/sap_db_software.htm
Further Information: Project Home Page: http://www.sapdb.org/
3 Cataloguing/MARC Tools
Many small libraries cannot afford to implement an ILS, for reasons that
depend on their clientele and available resources. Automating a small
part of the library's functions, such as cataloguing or circulation, might satisfy them, and it
might convince the library authority to go for full-fledged automation in future.
These tools are also useful for building OPAC services within the library or
through the library website.
There are a number of tools available for automating the cataloguing function.
The important concern here is compliance with well-accepted standards such as
AACR and MARC, so that records can be integrated with future software.
3.1 Java Book Cataloguing System
Description: The purpose of this software is primarily to create a book catalogue
using barcode data from the freely available CueCat barcode reader.
Special Features: It uses an RDBMS backend database, and allows synchronization
between different library branches.
History: Unknown
Project Sponsors/Administrators: Josh Patterson
Dependency: Java, Hypersonic SQL, JDBC
Supported Platforms: Platform Independent
License: GNU Library or Lesser General Public License
Availability: http://sourceforge.net/project/showfiles.php?group_id=10661
Further Information: Project Home Page:
http://sourceforge.net/projects/jbiblioteca/
3.2 MARC/Perl
Description: MARC/Perl is a Perl library for reading, manipulating, outputting and
converting bibliographic records in the MARC format.
Special Features: Some of the important features are:
• Support for reading, editing, creating MARC records in batch mode
• Can be used to validate MARC records
• Can be used with Net::Z3950 to download MARC data in batch mode
History: In 1999 a group of developers began working on MARC.pm to provide a
Perl module for working with MARC data. MARC.pm was quite successful since it
grew to include many new options that were requested by the Perl/library
community.
In mid-2001 Andy Lester released MARC::Record and MARC::Field, which
provided a much simpler and more maintainable package for processing MARC
data with Perl. Instead of forking the two projects the developers agreed to
encourage use of the MARC::Record framework, and to work on enhancing
MARC::Record rather than extending MARC.pm further. Soon afterwards
MARC::Batch was added which allows you to read in a large data file without
having to worry about memory consumption.
Project Sponsors/Administrators: Andy Lester, Edward Summers
Dependency: Perl
Supported Platforms: UNIX, Linux, Windows
License: GNU General Public License
Availability: http://sourceforge.net/project/showfiles.php?group_id=1254,
http://www.cpan.org/modules/by-module/MARC/
Further Information:
1. Project Home Page: http://marcpm.sourceforge.net/
2. CPAN Site: http://search.cpan.org/author/PETDANCE/MARC-Record-1.29/
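To give a feel for what a MARC library has to do when it reads a record, here is a minimal sketch of decoding the ISO 2709 exchange structure (leader, directory, variable fields). It is written in Python rather than Perl, it is not the MARC::Record API, and the function names and the simplified leader written by the helper are assumptions for the example.

```python
FIELD_END = b"\x1e"    # field terminator
RECORD_END = b"\x1d"   # record terminator
SUBFIELD = "\x1f"      # subfield delimiter


def build_marc(fields):
    """Serialise (tag, text) pairs into one ISO 2709 record.
    Helper for the demo only; the leader values are simplified."""
    directory, body = b"", b""
    for tag, text in fields:
        data = text.encode("utf-8") + FIELD_END
        # Directory entry: 3-char tag, 4-digit length, 5-digit start offset.
        directory += f"{tag}{len(data):04d}{len(body):05d}".encode("ascii")
        body += data
    base = 24 + len(directory) + 1          # leader + directory + terminator
    total = base + len(body) + 1            # plus record terminator
    leader = f"{total:05d}nam a22{base:05d} a 4500".encode("ascii")
    return leader + directory + FIELD_END + body + RECORD_END


def parse_marc(raw):
    """Return (leader, fields). Control fields (00X) come back as
    (tag, data); others as (tag, indicators, [(code, value), ...])."""
    leader = raw[:24].decode("ascii")
    base = int(leader[12:17])               # base address of data
    directory = raw[24:base - 1]            # drop the directory terminator
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12].decode("ascii")
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        # length includes the field terminator, so leave the last byte out
        data = raw[base + start:base + start + length - 1].decode("utf-8")
        if tag < "010":
            fields.append((tag, data))
        else:
            indicators, *subs = data.split(SUBFIELD)
            fields.append((tag, indicators, [(s[0], s[1:]) for s in subs if s]))
    return leader, fields
```

Reading a large file one record at a time, as MARC::Batch is described as doing above, amounts to splitting the input on the record terminator and calling a function like parse_marc on each piece.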
3.3 MARC Template Library
Description: The MARC Template Library is a C++ API (using C++ templates
and STL) for reading, writing and processing MARC records.
Special Features: The project provides a simple Windows-based graphical tool to
convert MARC records into MARCXML.
History: The author developed these tools to improve his knowledge of C++
Standard Template Library.
Project Sponsors/Administrators: Mark Basedow
Dependency: C++ Compiler
Supported Platforms: Windows, Linux, UNIX
License: BSD License
Availability: http://sourceforge.net/project/showfiles.php?group_id=43694
Further Information: Project Home Page: http://mtl.sourceforge.net
3.4 jake2marc
Description: jake2marc is a utility that creates simple USMARC records for the
full-text journals in any of the databases listed in the jake (Jointly Administered
Knowledge Environment: http://www.jake-db.org/) project.
Special Features: Unknown
History: Unknown
Project Sponsors/Administrators: Mark Jordan
Dependency: Perl, libwww-perl & MARC::Record (Perl modules)
Supported Platforms: Linux, Windows
License: GNU GPL
Availability: http://jake.lib.sfu.ca/jake2marc/
Further Information: Project Home Page: http://jake.lib.sfu.ca/jake2marc/
3.5 UseMARCON
Description: The USEMARCON software is designed to provide users with two
specific services.
• The facility to convert MARC records compliant with a specified input
format into MARC records compliant with a specified output format.
• The facility to create and modify rules files, used to achieve MARC
conversions, in order to meet specific local requirements. The present
software is designed to be used by senior cataloguers or others with a
detailed knowledge of the structure of the MARC formats they wish to
convert between.
Special Features: The UseMARCON project aimed to develop a generic toolkit
for ISO2709 compatible MARC formats to enable libraries to create rules based
systems to convert records between national MARC formats. This would give
libraries the ability to obtain records from a far wider range of potential sources
than those currently available to them and stimulate an increase in the international
exchange of bibliographic records.
History: The UseMARCON Project, which was successfully completed in
February 1997, was funded by the consortium partners and the EU's Telematics
Applications Programme (DGXIII-E). The partners of the UseMARCON Project
consortium were drawn from a variety of library and information technology
backgrounds and comprised the following:
Partners:
• Koninklijke Bibliotheek, Holland
• Instituto da Biblioteca Nacional e do Livro, Portugal
• The British Library, UK
Project Sponsors/Administrators: UseMARCON Consortium and Jouve S.I.
Dependency: C++ Compiler, XVT C++ Toolkit
Supported Platforms: Windows, UNIX, Linux
License: Unknown: Unsupported freeware (with source code)
Availability: ftp://ftp.bl.uk/pub/nbs/ec/usemarcon/, ftp://ftp.kb.nl/pub/usemarcon/
Further Information: Project Home Page:
http://www.konbib.nl/kb/resources/frameset_kb.html?/kb/sbo/bibinfra/usema-
en.html
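The idea of rules-based conversion between national MARC formats can be sketched with a small mapping table. The example below is not USEMARCON's rule-file syntax, which is not reproduced in this report; the function name and record layout are assumptions, and the handful of UNIMARC to MARC21 mappings shown are a deliberately incomplete illustration.

```python
# (source tag, source subfield) -> (target tag, target subfield).
# A tiny, illustrative subset of a UNIMARC to MARC21 mapping.
UNIMARC_TO_MARC21 = {
    ("010", "a"): ("020", "a"),   # ISBN
    ("200", "a"): ("245", "a"),   # title proper
    ("200", "e"): ("245", "b"),   # other title information
    ("200", "f"): ("245", "c"),   # statement of responsibility
    ("210", "a"): ("260", "a"),   # place of publication
    ("210", "c"): ("260", "b"),   # publisher
    ("210", "d"): ("260", "c"),   # date of publication
}


def convert(record, rules):
    """Apply the rules to a record given as [(tag, [(code, value), ...])].

    Returns (converted, unmapped). Simplifications: indicators are
    ignored and repeated source fields are merged into one target field.
    """
    out = {}
    unmapped = []
    for tag, subfields in record:
        for code, value in subfields:
            target = rules.get((tag, code))
            if target is None:
                # Keep what the rules do not cover so a cataloguer can review it.
                unmapped.append((tag, code, value))
                continue
            dst_tag, dst_code = target
            out.setdefault(dst_tag, []).append((dst_code, value))
    return sorted(out.items()), unmapped
```

Keeping the rules in a table separate from the conversion code is what lets a cataloguer adjust a conversion to local requirements without touching the program itself.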
3.6 USEMARCON Plus
Description: USEMARCON is a software application that allows users to convert
bibliographic records from one MAchine-Readable Cataloguing (MARC) format to
another.
Special Features: The British Library has since further developed the
USEMARCON application. This work was carried out on behalf of the Library by
Crossnet Systems Limited. The program has been enhanced in the following ways:
• The redevelopment of the application removing all proprietary XVT
components and substituting public domain equivalents.
• The removal of the graphical user interface in order that the program can
function as part of a batch process from the system command line.
• The re-design of the application for 32bit MS Windows and Linux
operating systems.
• The optimization of the program to allow the conversion of large files.
• The integration of new rule functions to enable the creation of more
complex conversions.
History: In 1995, a project funded by the European Union was set up to address
this issue. The project was successfully completed in 1997 with the development of
the USEMARCON (User Controlled Generic MARC Converter) software.
Project Sponsors/Administrators: The British Library, Crossnet Systems
Dependency: C++ Compiler, XVT C++ Toolkit
Supported Platforms: Windows, Solaris, UNIX
License: Unknown: Unsupported freeware
Availability: ftp://ftp.bl.uk/pub/nbs/ec/usemarcon
Further Information: Project Home Page:
http://www.bl.uk/services/bibliographic/usemarcon.html
3.7 Marc2Opac
Description: Marc2Opac is a PHP4 script for searching and displaying MARC
files. It supports a good range of searching techniques and it is fast (searches more
than 100,000 entries in a second).
Special Features: The features added to this PHP module include
• Advanced search
• Subscriber logon
• Reservations system
History: Bundaberg City Council, Australia, developed Marc2Opac to put their
library catalogue online. Other features were added later.
Project Sponsors/Administrators: IT Services, Bundaberg City Council
(Australia)
Dependency: Apache, PHP, Grep
Supported Platforms: Linux
License: Unknown
Availability: http://www.bundaberg.qld.gov.au/library/catalog/about.php4
Further Information: Project Home Page:
http://www.bundaberg.qld.gov.au/library/catalog/about.php4
3.8 Medlane XMLMARC
Description: Medlane XMLMARC is a computer program that converts MARC
records into XML. It can also update MARC records, based on plain text
processing instructions, and write records to a file in the MARC format.
Special Features: Unknown
History: Unknown
Project Sponsors/Administrators: Kevin S. Clarke
Dependency: Java
Supported Platforms: Platform Independent
License: GNU GPL, LGPL
Availability: http://sourceforge.net/project/showfiles.php?group_id=48203
Further Information: Project Home Page: http://medlane.stanford.edu/
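A conversion from MARC to XML can be sketched in a few lines. The example below is not Medlane XMLMARC code, and its output format is an assumption: it writes the Library of Congress MARCXML "slim" element names, which may differ from what XMLMARC itself produces. The function name and the in-memory record layout are likewise hypothetical.

```python
import xml.etree.ElementTree as ET

MARCXML_NS = "http://www.loc.gov/MARC21/slim"


def _q(name):
    """Qualify an element name with the MARCXML namespace."""
    return f"{{{MARCXML_NS}}}{name}"


def record_to_xml(leader, fields):
    """Serialise one record to an XML string.

    `fields` holds (tag, data) for control fields and
    (tag, indicators, [(code, value), ...]) for data fields.
    """
    ET.register_namespace("", MARCXML_NS)   # write it as the default namespace
    record = ET.Element(_q("record"))
    ET.SubElement(record, _q("leader")).text = leader
    for field in fields:
        if len(field) == 2:
            tag, data = field
            ET.SubElement(record, _q("controlfield"), {"tag": tag}).text = data
        else:
            tag, indicators, subfields = field
            datafield = ET.SubElement(
                record,
                _q("datafield"),
                {"tag": tag, "ind1": indicators[0], "ind2": indicators[1]},
            )
            for code, value in subfields:
                ET.SubElement(datafield, _q("subfield"), {"code": code}).text = value
    return ET.tostring(record, encoding="unicode")
```

The XML library takes care of escaping characters such as "&" and "<" in field values, which is one reason to build the tree rather than concatenate strings.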
3.9 MARCUTL
Description: MARCUTL (the MARC Update and Transformation Language) is a
mapping language that converts MARC into XML or MARC into "updated
MARC" based on the instructions in a MARCUTL file. These files are expressed
in XML and must conform to the MARCUTL schema.
Special Features: MARCUTL provides for several built in methods of updating or
transforming MARC records, but it also provides for the creation of special MARC
processing classes. These classes implement a particular interface, described in the
MARCUTL API (application programming interface), and accept a MedMARC
Record as input. MedMARC is a Java API for handling MARC records that was
developed by the Medlane project.
History: Unknown
Project Sponsors/Administrators: Kevin S. Clarke
Dependency: Java
Supported Platforms: Platform Independent
License: GNU GPL, LGPL
Availability: To be available
Further Information: Project Home Page: http://medlane.stanford.edu/
4 Z39.50 Tools
The Z39.50 standard specifies a client/server-based protocol for searching and
retrieving information from remote databases. The protocol is sponsored by the
American National Standards Institute (ANSI) and the US National Information
Standards Organization (NISO). The first version of the protocol was published in
1988. The second version came out in 1992 and the latest version (version 3) is
dated 1995. A further revision (Z39.50-2001) is still in progress.
A library uses the Z39.50 protocol either to obtain bibliographic data from other
libraries or to provide bibliographic services to them. The library may
act as a client (searching and downloading records), as a server (allowing
others to search and download its local records), or as both. Tools are available to
implement both roles.
Z39.50 may prove beneficial in identifying resources through its powerful
broadcast search function, in which a user sends one query to a large number of
servers to search their bibliographic records. In this way the protocol can be seen as an
alternative to union catalogues, though it still does not support the display of
holdings records in the search results. It can also be combined with other activities,
such as inter-library loan (ILL), to speed up the process.
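The broadcast search idea can be sketched as sending one query to several targets in parallel and merging the answers. The example below does not speak Z39.50 at all: each target is a plain Python function standing in for a real client connection, and all names (make_target, broadcast_search, merge_by_isbn) and the record layout are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def make_target(catalogue):
    """Stand-in for a remote server: a title search over a local list."""
    def search(query):
        needle = query.lower()
        return [rec for rec in catalogue if needle in rec["title"].lower()]
    return search


def broadcast_search(targets, query):
    """Send `query` to every target at once.

    `targets` maps a library name to a search callable. Returns
    (hits, failures); a failing target does not stop the others.
    """
    hits, failures = [], {}
    with ThreadPoolExecutor(max_workers=max(1, len(targets))) as pool:
        futures = {pool.submit(search, query): name
                   for name, search in targets.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                hits.extend({**rec, "source": name} for rec in future.result())
            except Exception as exc:
                failures[name] = str(exc)
    return hits, failures


def merge_by_isbn(hits):
    """Group hits by ISBN and note which libraries returned each title,
    a crude substitute for a union catalogue entry."""
    merged = {}
    for hit in hits:
        entry = merged.setdefault(
            hit["isbn"], {"title": hit["title"], "sources": set()}
        )
        entry["sources"].add(hit["source"])
    return merged
```

In a real deployment each callable would wrap a Z39.50 client built with one of the toolkits described in this section, and slow or unreachable servers would also need a timeout.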
4.1 YAZ Toolkit
Description: YAZ (Yet Another Z39.50 Toolkit) is a toolkit for implementing the
Z39.50-1995 standard and protocol. Both the Origin (client) and Target (server)
roles of the protocol are supported. The toolkit is written in C.
Special Features: Its ability to provide an open, well-defined, and structured
framework to information retrieval tasks within any application domain makes it an
obvious candidate for use in many different roles.
History: Unknown
Project Sponsors/Administrators: Index Data
Dependency: None
Supported Platforms: UNIX, Linux, Windows
License: Index Data Copyright (Based on BSD License)
Availability: http://www.indexdata.dk/yaz/
Further Information: Project Home Page: http://www.indexdata.dk/yaz/
4.2 ZContent
Description: ZContent is a Perl script and module that provides a Z39.50 target for
the CONTENTdm server. CONTENTdm (http://contentdm.com/) is a commercial
digital collection management software.
ZContent is based on the open source SimpleServer Perl module which is provided
by Index Data (http://www.indexdata.com/simpleserver/). SimpleServer is based
on the YAZ toolkit, which is also provided by Index Data.
(http://www.indexdata.com/yaz/). USMARC Records are created using the
MARC::Record Perl module.
Special Features: Unknown
History: The University of Utah Marriott Library has developed software that adds
Z39.50 compatibility to any CONTENTdm digital collections server.
Project Sponsors/Administrators: Aaron DeMille, Kenning Arlitsch (University
of Utah)
Dependency: Perl, YAZ Toolkit, SimpleServer
Supported Platforms: Windows
License: GNU General Public License
Availability: http://sourceforge.net/projects/zcontent
Further Information: Project Home Page:
http://www.lib.utah.edu/digital/ZContent.html
4.3 SimpleServer
Description: SimpleServer is a Perl module which is intended to make it as simple
as possible to develop new Z39.50 servers over any type of database imaginable.
All you have to do is implement a function for initializing your database (optional),
searching the database, and returning "database records" on request. The module
takes care of everything else and automatically starts a server for you, listens to
incoming connections, and implements the Z39.50 protocol.
Special Features: Use SimpleServer together with other Perl modules to provide
gateways to relational databases, local file stores, SOAP/RDF-servers, etc.
SimpleServer currently supports the Init, Search, Present, Scan and Close services.
If you are interested in other functionality, get in touch and we'll help if we can.
History: Unknown
Project Sponsors/Administrators: Index Data
Dependency: YAZ 1.8 or later
Supported Platforms: UNIX, Linux, Windows
License: Index Data Copyright
Availability: http://www.indexdata.dk/simpleserver/
Further Information: Project Home Page: http://www.indexdata.dk/simpleserver/
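The division of labour described for SimpleServer (the developer supplies init, search and record-fetching functions, the module handles the rest) is a callback design. The sketch below is a Python analogy of that design only: it is not the SimpleServer API, it does no networking and no Z39.50 encoding, and the class name, request dictionaries and handler signatures are all hypothetical.

```python
class CallbackServer:
    """Toy dispatcher: the caller supplies the database logic as
    functions, the class keeps result sets and routes requests."""

    def __init__(self, search, fetch, init=None):
        self._search = search      # query -> list of record identifiers
        self._fetch = fetch        # record identifier -> record
        self._init = init          # optional set-up hook
        self._result_sets = {}

    def handle(self, request):
        kind = request["type"]
        if kind == "init":
            if self._init is not None:
                self._init(request)
            return {"type": "init_response", "accepted": True}
        if kind == "search":
            ids = self._search(request["query"])
            # Remember the result set so later present requests can use it.
            self._result_sets[request.get("set", "default")] = ids
            return {"type": "search_response", "hits": len(ids)}
        if kind == "present":
            ids = self._result_sets[request.get("set", "default")]
            start, count = request["start"], request["count"]  # start is 1-based
            wanted = ids[start - 1:start - 1 + count]
            return {"type": "present_response",
                    "records": [self._fetch(i) for i in wanted]}
        raise ValueError(f"unsupported request type: {kind}")


# Hypothetical back end: a dictionary standing in for "any type of database".
BOOKS = {1: "Koha manual", 2: "YAZ guide", 3: "Koha cookbook"}


def search_books(query):
    return [i for i, title in sorted(BOOKS.items()) if query.lower() in title.lower()]


def fetch_book(record_id):
    return BOOKS[record_id]
```

Swapping the dictionary for a relational database or a file store changes only the two back-end functions, which is the point of the callback approach.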
4.4 VB Zoom
Description: VB ZOOM is a collection of ActiveX COMponents, written in Visual
Basic, which implement the ZOOM 1.2 (Z39.50 Object-Orientation Model)
Abstract API. The current VB ZOOM is a wrapper for the YAZ Toolkit from Index
Data, plus a helper component for doing MARC-8 to Unicode character
conversions.
Special Features: Unknown
History: The original VB ZOOM was developed for a project called ZMARCO as
part of the Open Archives Initiative Metadata Harvesting Project at the University
of Illinois at Urbana-Champaign, funded by the Andrew Mellon Foundation.
Continuing work on this and the ZMARCO project is being funded by a Library
Services and Technology Act grant from the Illinois State Library.
Project Sponsors/Administrators: Index Data, Denmark
Dependency: Yaz.dll V 2.0.1 (YAZ Toolkit)
Supported Platforms: Windows
License: University of Illinois/NCSA Open Source License
(http://vb-zoom.sourceforge.net/License.html)
Availability: http://sourceforge.net/project/showfiles.php?group_id=53790
Further Information: Project Home Page: http://vb-zoom.sourceforge.net/
4.5 JZkit
Description: A pure Java toolkit to assist in the development of information
retrieval systems using the Z39.50 standard.
Special Features: The toolkit is presented in three distinct levels:
Encoders/Decoders, Protocol Endpoint and IR-Services. A number of example
origin and target implementations are available.
History: Unknown
Project Sponsors/Administrators: Ian Ibbotson
Dependency: Java VM
Supported Platforms: Platform Independent
License: GNU General Public License
Availability: http://sourceforge.net/project/showfiles.php?group_id=16429
Further Information: Project Home Page: http://www.k-int.com/jzkit
4.6 Zeta Perl
Description: ZETA Perl defines a set of functions, variables and conventions that
provide a consistent interface to the Z39.50 services and protocol for Perl
applications. It was designed and implemented mainly for use by web
developers, but it also helps in writing a Z39.50 client with
very little effort.
Special Features: The current version of the ZETA Perl (0.059) supports the
following APDUs: Init, Search, Present, Close, Delete, Scan and Sort
History: Unknown
Project Sponsors/Administrators: Unknown
Dependency: Perl 5.003 or better
Supported Platforms: Linux, Solaris, AIX
License: Perl Artistic License, GNU GPL
Availability: ftp://zeta.tlcpi.finsiel.it/pub/zeta/
Further Information: Project Home Page:
http://lcweb.loc.gov/z3950/agency/resources/software.html
4.7 ZedKit for Unix
Description: ZedKit is a set of Z39.50 application development libraries for UNIX, developed
for the German library project DBV OSI II and for the ONE project, co-funded by
the European Commission Libraries Programme.
Special Features: Unknown
History: Unknown
Project Sponsors/Administrators: Crossnet Systems, UK
(http://www.crxnet.com/)
Dependency: None
Supported Platforms: UNIX, Linux
License: Unknown (ftp://ftp.ddb.de/pub/dbvosi/dbvosiII-2.1.README)
Availability: http://www.crxnet.com/ZedKit_download.php
Further Information: Project Home Page: http://www.crxnet.com/zedkit.php
4.8 IrTcl Toolkit
Description: IrTcl is an extension to the Tcl/Tk (http://www.tcl.tk/) language
environments. IrTcl allows you to rapidly develop platform-independent, graphical
clients to the Z39.50 protocol supporting both the X Window and MS-Windows
environments.
Special Features: Unknown
History: Unknown
Project Sponsors/Administrators: Index Data
Dependency: Tcl/Tk, YAZ Toolkit
Supported Platforms: UNIX, Linux, Windows
License: Unknown
Availability: http://www.indexdata.dk/irtcl/
Further Information: Project Home Page: http://www.indexdata.dk/irtcl/
4.9 Net::Z3950
Description: The Net::Z3950 module provides a Perl interface to the Z39.50
information retrieval protocol (ISO 23950), a mature and powerful protocol used in
application domains as diverse as bibliographic information, geo-spatial mapping,
museums and other cultural heritage information, and structured vocabulary
navigation.
Special Features: Unknown
History: Unknown
Project Sponsors/Administrators: Mike Taylor
Dependency: Perl, YAZ Toolkit
Supported Platforms: UNIX, Linux, Windows (under Cygwin environment)
License: Perl Artistic License
Availability: http://perl.z3950.org/download/,
http://www.cpan.org/modules/by-module/Net/MIRK/
Further Information: Project Home Page: http://perl.z3950.org/
5 Barcode Makers
A barcode is a representation of an alphanumeric code as a pattern of printed
bars that can be read by machines. Barcode technology was invented for the automatic
identification of products in food store chains in the USA, to enable rapid checkout of
items. The use of barcodes for books came much later, after the ISBN came into general use.
Barcodes were at first used mostly by department stores and bookstores to speed up
checkout. They have since proved useful in automating
the check-in and check-out process in library circulation. Barcode
labels are usually assigned on the basis of the accession number of a
document, which uniquely identifies an item within the library.
5.1 GNU Barcode
Description: GNU Barcode is a tool to convert text strings to printed bars. It
supports a variety of standard codes to represent the textual strings and creates
postscript output. The popular KBarcode software uses GNU Barcode at its
backend.
Output is generated as either Postscript or Encapsulated Postscript (other back-ends
may be added if needed). The package is released as both a library and a command-
line frontend, so that one can include barcode-generation into one's application.
Special Features: Main features of GNU Barcode:
• Available as both a library and an executable program
• Supports UPC, EAN, ISBN, CODE39 and other encoding standards
• Postscript and Encapsulated Postscript output
• Accepts sizes and positions as inches, centimeters, millimeters
• Can create tables of barcodes (to print labels on sticker pages)
History: Unknown
Project Sponsors/Administrators: GNU
Dependency: Unknown
Supported Platforms: Unknown
License: GNU GPL
Availability: http://www.gnu.org/software/barcode/barcode.html
Further Information: Project Home Page:
http://www.gnu.org/software/barcode/barcode.html
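ISBN barcodes are conventionally printed as EAN-13 symbols carrying a 978 prefix. The short Python sketch below shows how a 10-digit ISBN could be converted into the 13 digits that such a barcode encodes. It is an illustration under stated assumptions, not the code of GNU Barcode itself; the function names are hypothetical, and the sample ISBN is a commonly used textbook example.

```python
def ean13_check_digit(first12: str) -> int:
    """Compute the EAN-13 check digit from the first 12 digits."""
    if len(first12) != 12 or not first12.isdigit():
        raise ValueError("expected exactly 12 digits")
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def isbn10_to_ean13(isbn10: str) -> str:
    """Convert an ISBN-10 to its 13-digit EAN form (prefix 978)."""
    digits = isbn10.replace("-", "").replace(" ", "")
    if len(digits) != 10:
        raise ValueError("expected a 10-character ISBN")
    body = "978" + digits[:9]  # drop the old ISBN-10 check character
    return body + str(ean13_check_digit(body))


if __name__ == "__main__":
    print(isbn10_to_ean13("0-306-40615-2"))
```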
5.2 KBarcode: The Open Source Barcode Solution
Description: KBarcode is a barcode and label printing application for KDE 3. It
can be used to print everything from simple business cards up to complex labels
with several barcodes (e.g. article descriptions).
Special Features: KBarcode comes with an easy-to-use WYSIWYG label
designer, a setup wizard, batch import of labels (directly from the delivery note),
thousands of predefined labels, database management tools and translations in many
languages. Printing more than 10,000 labels in one go is no problem for
KBarcode.
Additionally, it is a simple xbarcode replacement for the creation of barcodes. All
major types of barcodes like EAN, UPC, CODE39 and ISBN are supported.
History: Unknown
Project Sponsors/Administrators: Project Leader: Stefan Onken.
Core Programmer: Dominik Seichter
Dependency: KDE 3, pdf417_encode (for 2-D barcodes)
Supported Platforms: Linux
License: GNU GPL
Availability: http://sourceforge.net/project/showfiles.php?group_id=51628
Further Information: Project Home Page: http://www.kbarcode.net/
5.3 PHP Barcode
Description: PHP Barcode is a small barcode rendering class implemented in the
PHP language using the GD graphics library.
Special Features: Unknown
History: Unknown
Project Sponsors/Administrators: Karim Mribti
Dependency: Apache, PHP, GD Graphics Library
Supported Platforms: Unknown
License: GNU LGPL
Availability: http://www.mribti.com/barcode/download.php
Further Information: Project Home Page: http://www.mribti.com/barcode/
5.4 Barcodes-on-the-fly
Description: This utility will generate printable barcodes in the CODABAR
(NW-7) format based on the information you provide. The author hopes that
libraries and others will be able to print cheap disposable barcodes for, among
other things, books on loan from another library.
Special Features: Unknown
History: Unknown
Project Sponsors/Administrators: Ben Ostrowsky
Dependency: Apache, zlib, libpng, gd, Perl (with CGI), and GD::Barcode
Supported Platforms: Linux
License: GNU General Public License
Availability: http://bernie.tblc.org/~ostrowb/barcodes.html
Further Information: Project Home Page:
http://bernie.tblc.org/~ostrowb/barcodes.html
Chapter 5
Software Tools for Value Added Services
“Any tool should be useful in the expected way, but a truly great tool lends itself to
uses you never expected” – Eric S. Raymond
• Library Portal
• User Services
• Subject Gateways
• Inter Library Loan (ILL)
1 Library Portal
The wide use of the Internet by users has made it imperative for libraries to
have a presence there. According to Morgan (2003), a library website can carry
three types of content:
1. Information about the library: staff directories, departmental descriptions,
maps of the building, hours, etc.
2. Electronic versions of traditional library services: online tutorials, book
renewals, interlibrary loan requests and status reports, requests for purchase,
online chat/reference, virtual tours of the building(s), etc.
3. Access to library content: catalogs, indexes, full-text magazines and
journals, digitized special collections, free and commercial ebooks,
government documents, freely accessible Internet resources, electronic
encyclopedias and dictionaries, licensed content from vendors, etc.
Simple websites are fairly easy to maintain with a little knowledge of HTML
editors. But as a website grows, it needs better searching and browsing
interfaces. One must follow usability guidelines in creating and maintaining
websites so that users do not get lost while navigating the site.
1.1 MyLibrary
Description: MyLibrary is a user-driven, customizable interface to collections of
Internet resources -- a portal. Primarily designed for libraries, the system's purpose
is to reduce information overload by allowing patrons to select as little or as much
information as they so desire for their personal pages.
Special Features: Some of the important features are:
• Web-based administration to add, delete, modify user access
• Web-based report generation
• Current awareness service based on a cron job
• Search engine support based on Swish-E
History: Unknown
Project Sponsors/Administrators: Eric Lease Morgan
Dependency: Apache, Perl, MySQL/PostgreSQL
Supported Platforms: UNIX, Linux
License: GNU General Public License
Availability: http://dewey.library.nd.edu/mylibrary/download/
Further Information: Project Home Page: http://dewey.library.nd.edu/mylibrary/
1.2 The Scout Portal Toolkit
Description: The Scout Portal Toolkit (SPT) allows groups or organizations that
have a collection of knowledge or resources they want to share via the World Wide
Web to put that collection online without making a big investment in technical
resources or expertise.
Special Features: The portal interface has a number of useful features, including:
• Cross-Field Searching
• Resource Annotations by Users
• Intelligent User Agents
• Resource Quality Ratings by Users
• Suggested Resource Referrals (Recommender System)
Go to http://scout.wisc.edu/research/SPT/features.html to get a detailed description
of the above features.
The Scout Portal Toolkit also provides the Intelligent Metadata Tool (IMT). The
IMT is a web-based tool for the entry and editing of resource information.
Although only accessible to portal site administrators and designated users, the
IMT is an integrated part of the portal site, providing ready access to portal
facilities and information collected by the portal for discipline experts while they
are working on resource entries.
History: Unknown
Project Sponsors/Administrators: Internet Scout Project
Dependency: Apache, PHP, MySQL
Supported Platforms: Platform independent (but the installer works only in a
shell environment)
License: GNU GPL
Availability: http://scout.wisc.edu/research/SPT/download.html
Further Information: Project Home Page: http://scout.wisc.edu/research/SPT/
1.3 Research Guide
Description: Research Guide is a web-based system for managing subject guides
in academic libraries.
Special Features: Some of the features are:
• Support for creating specialist pages with contact information and other
background information on subject specialists in the library
• Web-based interface for creating and editing guides and specialist pages
• Database back-end
History: This application was written for use at the University of Michigan
Graduate Library. It is currently being used to serve research guides there
(http://www.lib.umich.edu/grad/guide/).
Project Sponsors/Administrators: Kelsey Libner
Dependency: Apache, MySQL, PHP
Supported Platforms: UNIX, Linux, Windows
License: MIT License
Availability: http://sourceforge.net/project/showfiles.php?group_id=63006
Further Information: Project Home Page: http://researchguide.sourceforge.net/
1.4 PostNuke Content Management System
Description: PostNuke is a popular open source content management system. It is
promoted as easy to install, easy to understand and use, and easy to administer.
Special Features: It is full of features including:
• Complete web-based administration
• Support for additional modules with PostNuke API
• Strong community support
History: Based on PHP-Nuke (http://www.phpnuke.org/).
Project Sponsors/Administrators: PostNuke Development Team
Dependency: Apache, MySQL, PHP, ADOdb
Supported Platforms: Platform Independent
License: GNU GPL
Availability: http://download.postnuke.com/
Further Information: Project Home Page: http://www.postnuke.com/
1.5 Cascade
Description: Cascade is a Perl-driven, web-based content management system. It
is based on a community model of managing a large directory resource. Cascade
allows one to easily maintain a web-based, Yahoo-like directory of resources using
web-based forms.
Special Features: Some of the features are:
• Supports Related Categories and Virtual Subcategories (the entries shown in
the Yahoo directory with an @ next to them)
• Designed to integrate with static content on your website
• Supports basic ratings of content
(Go to http://summersault.com/software/cascade/#features for detailed features)
History: Unknown
Project Sponsors/Administrators: Mark Stosberg
Dependency: Apache, Perl, RDBMS (MySQL/PostgreSQL)
Supported Platforms: Unix, Linux
License: GNU General Public License
Availability: http://sourceforge.net/project/showfiles.php?group_id=6582
Further Information: Project Home Page:
http://summersault.com/software/cascade/
2 User Services
User services such as reference, circulation, and document delivery are crucial
since they are the face of the library. Automating these functions not only helps
reduce the burden on librarians but also improves the image of the library
among its users.
2.1 Prospero
Description: An Open Source Internet Document Delivery (IDD) System.
Special Features: Prospero can be easily integrated with an ILL implementation
package.
History: Prospero was inspired by the Yale Library Electronic Document Delivery
(EDD) service authored by Daniel Chudnov from the Yale Medical Library. The
EDD Project (http://oss4lib.org/projects/edd.php3) is no longer supported.
Project Sponsors/Administrators: Eric Hamrick, Eric Schnell
Dependency: Perl, COMCTL32.DLL (for Windows), SAMBA (for Linux)
Supported Platforms: Staff Module (Windows), Server-side (Windows, Linux)
License: GNU GPL
Availability: http://bones.med.ohio-state.edu/prospero/current.html
Further Information: Project Home Page:
http://bones.med.ohio-state.edu/prospero/
2.2 Ask a Librarian (ASKAL)
Description: Ask a Librarian (ASKAL) is a self-managing email-based reference
service suite for libraries.
Special Features: Includes an administrative interface.
History: Originally developed for use in the University Library, University of
Nebraska (USA).
Project Sponsors/Administrators: Karen K. Hein, Marc W. Davis
Dependency: Apache, Mail Server (e.g., Sendmail), PHP, MySQL
Supported Platforms: Linux, UNIX, Windows
License: GNU GPL
Availability: http://apocalypse.unomaha.edu/ask/
Further Information: Project Home Page: http://apocalypse.unomaha.edu/ask/
2.3 Reference Desk Manager (RDM)
Description: The Reference Desk Manager (RDM) is a PHP based web
application, specifically designed to meet the needs of Reference Services in
libraries.
Special Features: Current RDM features are:
• Email weblog -- with search feature
• Electronic Card File -- with search feature
• Common Links Area
• Web-based Administration
History: The RDM was initially developed at Oregon State University (USA) for
use by their Reference Services staff.
Project Sponsors/Administrators: Terry Reese, Carrie Ottow, John Matylonek,
and Joe Toth.
Dependency: Apache, Sendmail, PHP, MySQL
Supported Platforms: Linux, UNIX, Windows (not tested)
License: Oregon State University Copyright (with source code). Free for non-
commercial and educational use.
Availability: http://oregonstate.edu/~reeset/RDM/downloads.html
Further Information: Project Home Page: http://oregonstate.edu/~reeset/RDM/
2.4 Morris Messenger
Description: Morris Messenger is a web-based messenger system which can be
used as an effective reference tool by the libraries.
Special Features: Unknown
History: Originally developed for use in the Morris Library, Southern Illinois
University Carbondale (USA).
Project Sponsors/Administrators: Keith VanCleave, Jody Fagan
Dependency: Apache, Perl, MySQL
Supported Platforms: Linux, UNIX
License: GNU GPL
Availability: http://www.lib.siu.edu/chat/#software
Further Information: Project Home Page: http://www.lib.siu.edu/chat/
3 Subject Gateways
Subject gateways, as the name suggests, typically focus on a particular subject
area. They are online services and sites that catalogue the Internet-based
resources available in a specific field of study. Libraries have an important role
in building subject gateways in the areas in which they specialize.
Building such services demanded a high level of technical skill in the past, but
the availability of good-quality, freely available OSS tools has removed that
barrier. Most of these tools comply with well-accepted metadata standards such as
Dublin Core and MARC.
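To make the role of a metadata standard concrete, the following minimal Python sketch serializes a resource description using Dublin Core element names. It is an illustration only and is not taken from any of the tools described in this section: the `record` wrapper element, the function name, and the sample values are assumptions, while the namespace URI is the one defined for the Dublin Core element set.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields: dict) -> str:
    """Serialize a mapping of Dublin Core element names to values as XML."""
    # "record" is a hypothetical wrapper element chosen for this sketch.
    root = ET.Element("record")
    for name, value in fields.items():
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Sample values are invented for illustration only.
    print(dublin_core_record({
        "title": "Example Subject Gateway Resource",
        "creator": "Example Author",
        "subject": "Library science",
        "identifier": "http://www.example.org/resource",
    }))
```

A gateway that stores its descriptions in this form can exchange them with any other system that understands the same element names.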
3.1 ROADS
Description: ROADS (Resource Organization And Discovery in Subject-based
Services) is a set of software tools to enable the set up and maintenance of Web
based subject gateways.
Special Features: ROADS is a software tool-kit allowing gateway managers to
pick and choose what parts of the software they require whilst allowing the
integration of other software according to requirement. ROADS includes advanced
features for linking distributed cooperative databases together using the IETF's
WHOIS++ search and retrieval protocol, and their Common Indexing Protocol
(CIP).
History: ROADS was originally developed as part of the UK Electronic Libraries
Programme (eLib) by a consortium including the Institute for Learning and
Research Technology at the University of Bristol, and the UK Office for Library
and Information Networking at the University of Bath, with the bulk of the
development being done by the Department of Computer Science at Loughborough
University. Although this project itself has finished, the software continues to be
developed and used all over the world.
Project Sponsors/Administrators: The ROADS project has three partners:
• The Department of Computer Science at Loughborough University of
Technology
• The ILRT (Institute for Learning and Research Technology) at Bristol
University
• UKOLN (the UK Office for Library and Information Networking) at the
University of Bath
Dependency: Apache, Perl
Supported Platforms: POSIX (UNIX, Linux)
License: Artistic License, GNU GPL
Availability: http://sourceforge.net/project/showfiles.php?group_id=6936
Further Information: Project Home Page: http://roads.sourceforge.net/
3.2 iVia
Description: iVia is an open source Internet subject portal or virtual library system.
As a hybrid expert and machine built collection creation and management system,
it supports a primary, expert-created, first-tier collection that is augmented by a
large, second-tier collection of significant Internet resources that are automatically
gathered and described.
Special Features: Some of the major features of the iVia system include:
• A core system that is fast, robust, reliable and scalable to millions of records
and users.
• An array of Web crawlers capable of fully- to semi-automating the
identification of significant Internet resources.
• Classifiers that enable semi-automated metadata content creation providing
expert/machine interaction throughout the record building process.
• Search/browse interface options that provide users with great flexibility in
finding resources and which support all levels of user search skills.
• Support for single or multiple subject virtual library projects which can
share data and efforts on any of several levels of cooperation.
• Support for the following standards: OAI Protocol for Metadata Harvesting
(OAI-PMH), Dublin Core, MARC (Machine-Readable Cataloging), Library
of Congress Subject Headings (LCSH), and Library of Congress
Classifications (LCC).
History: The iVia system is an INFOMINE creation generously funded by the
National Leadership Grant Program of the U.S. Institute of Museum and Library
Services, the Fund for the Improvement of Post-Secondary Education of the U.S.
Department of Education and the Library of the University of California, Riverside.
Project Sponsors/Administrators: INFOMINE, The Regents of the University of
California
Dependency: Apache, MySQL, Berkeley DB
Supported Platforms: Linux
License: Affero General Public License (http://www.affero.org/oagpl.html)
Availability: http://infomine.ucr.edu/iVia/ivia.php?section=2
Further Information:
1. Project Home Page: http://infomine.ucr.edu/iVia/
2. iVia Open Source Virtual Library System:
http://www.dlib.org/dlib/january03/mitchell/01mitchell.html
3.3 IMesh Toolkit
Description: The IMesh Toolkit is a coherent set of tools and standards being
developed for use by subject gateway software developers and technically savvy
subject gateway implementers. These tools and standards will make use of
established open protocols and interfaces wherever possible to ensure
interoperability. The toolkit will include reference implementations for all
standards.
Special Features: It has many components such as metadata exchange tools, RDF
query tools, OAI normalization tools, Reading Lists, etc.
History: The IMesh Toolkit Project is a joint effort by groups funded by JISC and
the NSF to develop the IMesh Toolkit. The major participants in this effort include
the UK Office for Library and Information Networking (UKOLN) at the
University of Bath in the UK, the Institute for Learning and Research Technology
(ILRT) at the University of Bristol in the UK, and the Internet Scout Project (ISP)
at the University of Wisconsin - Madison in the United States.
Project Sponsors/Administrators: UKOLN, ILRT, Internet Scout Project.
The IMesh Toolkit project was funded under the NSF/JISC International Digital
Libraries Initiative from September 1999 to July 2003.
Dependency: Perl
Supported Platforms: Unknown
License: GNU GPL
Availability: http://clark.cs.wisc.edu/cgi-bin/cvsweb.cgi
Further Information:
1. Project Home Page: http://www.imesh.org/toolkit/work/components/ME/
2. Internet Scout Portal Project: http://scout.wisc.edu/research/imeshtk/
4 Inter Library Loan
Inter Library Loan (ILL) is the most visible form of resource sharing among
libraries. The ILL protocol, developed by the National Library of Canada and
published as ISO 10160:1997, seeks to automate this process. Wide
implementation of this protocol would considerably reduce the turnaround time
of ILL requests.
4.1 ILL Wizard
Description: ILL Wizard is an ISO-compliant web form for handling ILL requests.
It can run from a desktop or from the library's Web site server directory.
Special Features: Non-programmer technical librarians should be able to
configure and mount this Java Web form without help from computer experts.
History: Originally developed for use in the Benner Library and Resource Center,
Olivet Nazarene University (Illinois, USA).
Project Sponsors/Administrators: Bryan Wilhelm, Craighton Hippenhammer
Dependency: Java, Web Server/Web Browser
Supported Platforms: Linux, UNIX, Windows
License: Unknown
Availability: http://library.olivet.edu/iso-ill.html
Further Information: Project Home Page: http://library.olivet.edu/iso-ill.html
4.2 Biblio::ILL::ISO
Description: Biblio::ILL::ISO is a Perl module that implements the ISO
Interlibrary Loan protocol (ISO 10161).
Special Features: The module implements the 20 Interlibrary Loan message
classes (ILL-Request, Answer, etc), plus the hundred or so types that make up
those classes. There is a test suite. There are a handful of test/example programs.
History: The author had earlier written Biblio::ILL::GS, which was an Interlibrary
Loan Generic Script.
Project Sponsors/Administrators: David Christensen
Dependency: Perl
Supported Platforms: Linux, UNIX
License: Perl Artistic License
Availability: Currently available at http://maplin.gov.mb.ca/pub/TEST/; check
http://search.cpan.org/author/DCHRIS/ in future
Further Information: Project Home Page: http://www.lib.siu.edu/chat/
Chapter 6
Software Tools for Digital Library Initiatives
“Future is digital” – Famous Advertisement Campaign
• Digital Library Toolkit
• DL-like Softwares
• OAI-PMH Tools
1 Digital Library Toolkit
The term "Digital Library" has a variety of potential meanings, ranging from a
digitized collection of material that one might find in a traditional library through to
the collection of all digital information. However, it is not merely equivalent to a
digitized collection with information management tools. It is also a series of
activities that brings together collections, services, and people in support of the full
life cycle of creation, dissemination, use, and preservation of data, information, and
knowledge.
The creation and maintenance of digital libraries has become imperative with the
growing amount of information available in digital format. Building digital
libraries requires a fair knowledge of information management tools such as
databases, web technology, information retrieval, and user interfaces. The
usability of hosted resources is as important as the quality of information presented.
The digital library toolkits discussed below are fairly integrated sets of solutions
for building digital libraries from born-digital resources. However, converting
existing hard-copy documents into digital format requires a few more tools, such
as a scanner, optical character recognition (OCR) software, word processing
software, and image editing tools.
1.1 Greenstone
Description: Greenstone is a suite of software for building and distributing digital
library collections. It provides a new way of organizing information and publishing
it on the Internet or on CD-ROM.
Special Features: Some of the important features are:
• Support for image, video, and text collection
• Support for multilingual collection building
• Z39.50 client available on Linux systems
• Highly portable collection, can easily be distributed even on a CD-ROM
History: Greenstone is produced by the New Zealand Digital Library Project at the
University of Waikato, and developed and distributed in cooperation with
UNESCO and the Human Info NGO.
Project Sponsors/Administrators: University of Waikato, New Zealand
Dependency: Apache, Perl, GDBM
Supported Platforms: UNIX, Windows, Linux, MacOS X
License: GNU GPL
Availability: http://www.greenstone.org/english/download.html
Further Information: Project Home Page: http://www.greenstone.org/
1.2 DSpace
Description: DSpace is a specialized type of digital asset management or content
management system: it manages and distributes digital items, made up of digital
files and allows for the creation, indexing, and searching of associated metadata to
locate and retrieve the items. It is designed to support the long-term preservation of
the digital material stored in the repository.
Special Features: The important features of DSpace are:
• Institutional Repository: DSpace is organized to accommodate the
multidisciplinary and organizational needs of a large institution.
• Document Formats: Support for a variety of digital formats and content
types including text, images, audio, and video
• Access Control: DSpace allows contributors to limit access to items in
DSpace - at the collection and the individual item level.
• Digital Preservation: DSpace provides long-term physical storage and
management of digital items in a secure, professionally managed repository
including standard operating procedures such as backup, mirroring,
refreshing media, and disaster recovery.
• Search and Retrieval: The DSpace submission process allows for the
description of each item using a qualified version of the Dublin Core
metadata schema.
(Go to http://dspace.org/technology/features.html for more detailed description on
features.)
History: DSpace was developed out of collaboration between MIT Libraries and
Hewlett-Packard Company.
Project Sponsors/Administrators: MIT Libraries & Hewlett-Packard Company
Dependency: Apache, Tomcat, PostgreSQL, Java
Supported Platforms: Claimed to be platform independent, but the installation
manual covers only UNIX-like platforms.
License: BSD License
Availability: http://dspace.org/technology/download.html,
http://sourceforge.net/project/showfiles.php?group_id=19984
Further Information: Project Home Page: http://www.dspace.org/
1.3 iVia
Description: iVia is an open source Internet subject portal or virtual library system.
As a hybrid expert and machine built collection creation and management system,
it supports a primary, expert-created, first-tier collection that is augmented by a
large, second-tier collection of significant Internet resources that are automatically
gathered and described.
Special Features: Some of the major features of the iVia system include:
• A core system that is fast, robust, reliable and scalable to millions of records
and users.
• An array of Web crawlers capable of fully- to semi-automating the
identification of significant Internet resources.
• Classifiers that enable semi-automated metadata content creation providing
expert/machine interaction throughout the record building process.
• Search/browse interface options that provide users with great flexibility in
finding resources and which support all levels of user search skills.
• Support for single or multiple subject virtual library projects which can
share data and efforts on any of several levels of cooperation.
• Support for the following standards: OAI Protocol for Metadata Harvesting
(OAI-PMH), Dublin Core, MARC (Machine-Readable Cataloging), Library
of Congress Subject Headings (LCSH), and Library of Congress
Classifications (LCC).
History: The iVia system is an INFOMINE creation generously funded by the
National Leadership Grant Program of the U.S. Institute of Museum and Library
Services, the Fund for the Improvement of Post-Secondary Education of the U.S.
Department of Education and the Library of the University of California, Riverside.
Project Sponsors/Administrators: INFOMINE, The Regents of the University of
California
Dependency: Apache, MySQL, Berkeley DB
Supported Platforms: Linux
License: Affero General Public License (http://www.affero.org/oagpl.html)
Availability: http://infomine.ucr.edu/iVia/ivia.php?section=2
Further Information:
1. Project Home Page: http://infomine.ucr.edu/iVia/
2. iVia Open Source Virtual Library System:
http://www.dlib.org/dlib/january03/mitchell/01mitchell.html
1.4 Dienst
Description: The distributed Dienst software is configured to handle textual
resources (documents) in a variety of formats. However, the Dienst architecture
includes a sophisticated document model that accommodates a wide variety of
digital resources. Using the Dienst software for these other resources will require
some programming.
Special Features: Unknown
History: Dienst is a project of the CDLRG - Cornell Digital Library Research
Group. Work on Dienst was sponsored by the Defense Advanced Research Projects
Agency (DARPA) on behalf of the Digital Libraries Initiative. Additional work on
Dienst is sponsored by the National Science Foundation Digital Libraries Initiative
Phase 2 Project Prism.
Project Sponsors/Administrators: Cornell University, USA
Dependency: Apache, Perl, mod_perl, ImageMagick, PerlMagick, freeWAIS-sf
Supported Platforms: UNIX, Linux, MacOS X, Windows (not tested)
License: Unknown
Availability: Currently not available
Further Information:
Project Home Page:
http://www.cs.cornell.edu/cdlrg/dienst/software/DienstSoftware.htm
1.5 Fedora
Description: Flexible Extensible Digital Object and Repository Architecture
(Fedora) is a toolkit to build a digital object repository management system. The
system, designed to be a foundation upon which interoperable web-based digital
libraries, institutional repositories and other information management systems can
be built, demonstrates how distributed digital library architecture can be deployed
using web-based technologies, including XML and Web services.
The interface to the system consists of three open APIs that are exposed as web
services:
• Management API (API-M) – defines an interface for administering the
repository. It includes operations necessary for clients to create and
maintain digital objects and their components. API-M is implemented as a
SOAP-enabled web service.
• Access API (API-A) – defines an interface for accessing digital objects
stored in the repository. It includes operations necessary for clients to
perform disseminations on objects in the repository and to discover
information about an object using object reflection. API-A is implemented
as a SOAP-enabled web service.
• Access-Lite API (API-A-Lite) – defines a streamlined version of the
Fedora Access Service that is implemented as an HTTP-enabled web
service.
Special Features: The major features are:
• Web Services: The interface to the Fedora repository system consists of
three open APIs that are exposed as web services: Management API known
as API-M, Access API known as API-A, and Access-Lite API known as
API-A-Lite.
• Datastreams: Objects in a repository may consist of content and metadata
(datastreams) that physically reside inside the repository or outside the
repository. The Fedora repository system supports content of any MIME
type.
• XML Submission and Storage: Digital objects are stored as XML-
encoded files that conform to an extension of the Metadata Encoding and
Transmission Standard (METS) schema. The schema for the extended
version of METS used by Fedora can be found at
http://www.fedora.info/definitions/1/0/mets-fedora-ext.xsd.
• OAI Metadata Harvesting Provider: The Fedora metadata is accessible
using the OAI Protocol for Metadata Harvesting, v2.0.
• Parameterized Behaviors: Behaviors defined for an object support user-
supplied options that are handled at dissemination time.
• Versioning: Although not fully implemented in release 1.1, the Fedora
repository system includes the infrastructure to support versioning of digital
objects and their components.
• Access Control and Authentication: Release 1.1 includes a simple form of
access control to provide access restrictions based on IP address. IP range
restriction is supported in both the Management and Access APIs.
(Go to http://www.fedora.info/ for complete feature set)
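The OAI-PMH provider feature listed above means that a harvester can request metadata with plain HTTP queries. The Python sketch below builds such request URLs using the six verbs defined by OAI-PMH 2.0. It is a minimal illustration: the function name is hypothetical, the base URL `http://repository.example.org/oai` is a placeholder rather than the address of any real Fedora installation, and no network request is made.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def build_oai_request(base_url: str, verb: str, **arguments: str) -> str:
    """Return the URL for an OAI-PMH request against the given base URL."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    params = {"verb": verb, **arguments}
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder base URL; substitute the repository's real OAI endpoint.
    base = "http://repository.example.org/oai"
    print(build_oai_request(base, "Identify"))
    print(build_oai_request(base, "ListRecords", metadataPrefix="oai_dc"))
```

The `oai_dc` metadata prefix requests records in unqualified Dublin Core, the format every OAI-PMH repository is required to support.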
History: Jointly developed by the University of Virginia and Cornell University,
the Fedora project was funded by the Andrew W. Mellon Foundation.
Project Sponsors/Administrators: University of Virginia and Cornell University.
Technical Coordinator: Ronda A. Grizzle (Virginia)
Dependency: Java SDK, MySQL/Oracle (optional), JDBC
Supported Platforms: Platform Independent (with Java)
License: Mozilla Public License
Availability: http://www.fedora.info/release/
Further Information: Project Home Page: http://www.fedora.info/
1.6 DjVuLibre
Description: DjVu (pronounced "deja vu") is a compression technique, a file
format, and a delivery platform that is specifically designed to enable the creation
of digital libraries of printed material, either scanned from paper or digitally
produced. For scanned documents, DjVu file sizes are typically 3 to 10 times
smaller than TIFF or PDF in black and white, and 5 to 10 times smaller than JPEG
in color.
DjVu documents are displayed within web browsers through a very lightweight
plug-in (available for all major platforms). Server-side full-text search can easily
be provided using free indexing tools and a few Perl scripts.
Special Features: Unknown
History: The DjVu project was started by Yann LeCun at AT&T Labs-Research in
1996. Much of the research and innovation behind DjVu was the work of Léon
Bottou, Yann LeCun, Patrick Haffner, Paul Howard, and Yoshua Bengio, with
some contributions from Pascal Vincent, Patrice Simard, and Steven Pigeon.
DjVuLibre is a GPL implementation of DjVu maintained by the original inventors
of DjVu. See http://djvu.sourceforge.net/credits.html for the historical
details of the project.
Project Sponsors/Administrators: Yann LeCun, Léon Bottou
Dependency: Unknown
Supported Platforms: Linux
License: GNU GPL (version 2)
Availability: http://sourceforge.net/project/showfiles.php?group_id=32953
Further Information: Project Home Page: http://djvu.sourceforge.net/
2 DL-like Softwares
Archiving of digital documents can be seen as an extension of digital libraries.
Digital archiving software can also be used to build useful services.
2.1 E-prints
Description: The primary purpose of the E-Prints software is to help create open
access to the peer-reviewed research output of all scholarly and scientific research
institutions. The default configuration creates a research papers archive, but could
be used for other purposes.
Special Features: Unknown
History: E-Prints was part of the Open Citation Project, a DLI2 International
Digital Libraries Project funded by the Joint Information Systems Committee
(JISC) of the Higher Education Funding Councils, in collaboration with the
National Science Foundation. E-Prints was previously supported by CogPrints,
funded by JISC as part of its Electronic Libraries (eLib) Programme.
Project Sponsors/Administrators: University of Southampton, UK
Dependency: Apache, Perl, mod_perl, MySQL
Supported Platforms: Linux, UNIX
License: GNU GPL
Availability: http://software.eprints.org/
Further Information: Project Home Page: http://www.eprints.org/
2.2 CDSWare
Description: CERN Document Server Software (CDSware) allows one to run
one's own electronic preprint server, online library catalogue or a document system
on the web. It complies with the Open Archives Initiative metadata harvesting
protocol (OAI-PMH) and uses MARC 21 as its underlying bibliographic standard.
Special Features: Some of the salient features are:
• Configurable portal-like interfaces for hosting various kinds of collections.
• Powerful search engine with Google-like syntax.
• User personalization, including document baskets and email notification
alerts.
• Electronic submission and upload of various types of documents.
• Running an OAI data and service provider enabling the metadata exchange
between heterogeneous repositories.
History: Developed for use at the CERN Library, Europe.
Project Sponsors/Administrators: CERN
Dependency: Apache, MySQL, PHP, Python, WML
Supported Platforms: Linux, UNIX
License: GNU GPL
Availability: http://cdsware.cern.ch/download/
Further Information: Project Home Page: http://cdsware.cern.ch/
2.3 Harvest
Description: Harvest is a system to collect information and make it searchable
using a web interface. Harvest can collect information on the Internet and on
intranets using http, ftp, and nntp, as well as from local files such as data on
hard disks, CD-ROMs, and file servers. The formats currently supported in addition
to HTML include TeX, DVI, PS, full text, mail, man pages, news, troff, WordPerfect,
RTF, Microsoft Word/Excel, SGML, C sources and many more.
Possible uses of Harvest include:
• Web Search Engine
• Specialized Search System
• Building a Distributed Search System
• Testbed for Search related Components
Special Features: Some of the features are:
• Harvest is designed to work as a distributed system.
• Harvest is designed to be modular.
• Harvest allows complete control over the content of data in the search
database.
• The Search interface is written in Perl to make customization easy, if
desired.
History: Unknown
Project Sponsors/Administrators: Kang-Jin Lee, Javier Masa Marin, Harald
Weinreich
Dependency: Apache, Perl, GDBM, Bison, Flex, and GCC (for compiling from
source)
Supported Platforms: Linux, UNIX, Windows (under cygwin)
License: GNU GPL
Availability: http://sourceforge.net/project/showfiles.php?group_id=27808
Further Information: Project Home Page: http://harvest.sourceforge.net/
3 OAI-PMH Tools
The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a
means of making machine-readable metadata widely available for use. In other
words, it is a means of sharing metadata between digital archives and repositories.
The Open Archives Initiative was originally proposed to achieve federated
searching of e-print/pre-print archives. Gradually, however, the scope of the
initiative has broadened to cover any kind of digital content including images and
videos.
The protocol is based on providing a simple yet powerful framework for metadata
harvesting. This harvesting method can be used to build high-quality federated
search systems across collections in a very short span of time. The protocol
stipulates that all metadata should be encoded in XML. The minimum common
denominator is Unqualified Dublin Core (UDC), chosen so that more digital
collections can implement the protocol without much hassle: virtually any other
metadata schema can be downgraded to conform to UDC.
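The harvesting model described above can be sketched in a few lines. The example builds a ListRecords request and reads titles out of an unqualified Dublin Core response. The verb, the metadataPrefix argument, and the XML namespaces come from the OAI-PMH 2.0 specification; the base URL, the function names, and the sample response are hypothetical and written only for illustration.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for a repository base URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def titles_from_response(xml_text):
    """Extract (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        pairs.append((identifier, title))
    return pairs


# Hand-written, abridged sample response; not taken from a real repository.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Open Source Software for Libraries</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("http://example.org/oai"))
    print(titles_from_response(SAMPLE))
```

A real harvester would also fetch the URL over HTTP and follow resumption tokens for large result sets; both are left out here to keep the sketch self-contained.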
3.1 OAICat
Description: OAICat is a Java Servlet web application providing an OAI-PMH
v2.0 repository framework. The framework can be customized to work with
arbitrary data repositories by implementing some Java interfaces.
Special Features: Unknown
History: Unknown
Project Sponsors/Administrators: Jeffrey Young, OCLC
Dependency: Java Servlet Engine, RDBMS (tested with MySQL)
Supported Platforms: Platform Independent
License: OCLC Research Public License
(http://purl.oclc.org/oclc/research/ORPL/)
Availability: http://pubserv.oclc.org/oaicat/jars/dist/dist.html
Further Information:
Project Home Page: http://www.oclc.org/research/software/oai/cat.shtm
3.2 PHP OAI Data Provider
Description: As the name suggests, it is an implementation of an OAI-PMH
(version 2) Data Provider.
Special Features: This implementation currently supports
• Full OAI-PMH version 2.0 compliance
• Compressed XML support, which greatly reduces the bandwidth used
• Can connect to many existing databases by using the PEAR abstraction layer
• Quite easy to configure
History: Developed at the University of Oldenburg, Germany.
Project Sponsors/Administrators: Heinrich Stamerjohanns
Dependency: Apache, PHP, RDBMS (Oracle8/MySQL)
Supported Platforms: Linux, UNIX, Windows
License: Unknown
Availability: http://physnet.uni-oldenburg.de/oai/
Further Information: Project Home Page: http://physnet.uni-oldenburg.de/oai/
3.3 VTOAI OAI-PMH2 Perl Implementation
Description: This toolkit implements the skeleton of the OAI-PMH v2.0 in an
object-oriented fashion, thus hiding the details of the protocol from code that is
derived from the predefined class.
Special Features: Some of the features are:
• Strict compliance with OAI-PMH v2.0
• One installation can easily be used for multiple archives
• All extensions, configurations, and containers are specified using XML
Schema
• Minimal changes are required to create a working implementation
History: Developed at the Digital Library Research Laboratory (DLRL) of
Virginia Tech University, USA.
Project Sponsors/Administrators: Hussein Suleman, Virginia Tech
Dependency: Apache, Perl
Supported Platforms: Linux, UNIX, Windows
License: Perl Artistic License
Availability: http://www.dlib.vt.edu/projects/OAI/software/vtoai/vtoai.html
Further Information:
Project Home Page: http://www.dlib.vt.edu/projects/OAI/software/vtoai/vtoai.html
3.4 ARC
Description: Arc is the first federated search service based on the OAI-PMH
protocol. It includes a harvester that can harvest OAI-PMH 1.x and OAI-PMH
2.0 compliant repositories, a basic database-backed search engine, and an
OAI-PMH layer over the harvested metadata. It was developed at Old Dominion
University, USA.
Special Features: It includes a harvester, a search engine together with a simple
search interface, and an OAI-PMH layer over harvested metadata. Arc can be easily
configured for a specific community.
History: Developed at the Digital Library Research Group of Old Dominion
University (USA).
Project Sponsors/Administrators: Digital Library Research Group, ODU
Dependency: Java Servlet Engine, Tomcat, RDBMS (Oracle/MySQL)
Supported Platforms: Platform Independent
License: University of Illinois/NCSA Open Source License
Availability: http://sourceforge.net/project/showfiles.php?group_id=61532
Further Information:
1. Project Home Page: http://oaiarc.sourceforge.net/
2. ARC Demo Search: http://arc.cs.odu.edu/
3.5 OAIHarvester
Description: Developed by OCLC, the OAIHarvester Open Source project is a
Java application providing an OAI-PMH v2.0 harvester framework. This
framework can be customized to perform arbitrary operations on harvested data by
implementing some Java interfaces.
Special Features: Unknown
History: Unknown
Project Sponsors/Administrators: Jeffrey Young, OCLC
Dependency: Java, Apache Ant
Supported Platforms: Platform Independent
License: OCLC Research Public License
Availability: http://www.oclc.org/research/software/oai/harvester.shtm
Further Information:
Project Home Page: http://www.oclc.org/research/software/oai/harvester.shtm
3.6 OAI/ODL Harvester
Description: Harvests data from one or more archives. This is a template that does
nothing useful besides printing the records to STDOUT (the screen). It is intended
that the Harvester class will be subclassed to perform more useful functions.
Special Features: Some of the important features are:
• Works with any OAI (PMH v1.0/1.1/2.0) or ODL (XOAIPMH v1.0)
archive
• Code layout for separate components or libraries of components
• One installation can easily be used for harvesting from multiple sites for
different purposes
• All extensions, configurations, and containers are specified using XML
Schema
History: Unknown
Project Sponsors/Administrators: Digital Library Research Lab, Virginia Tech
(USA).
Dependency: Apache, Perl
Supported Platforms: Linux, UNIX, Windows
License: Perl Artistic License
Availability: http://oai.dlib.vt.edu/odl/software/harvest/
Further Information: Project Home Page:
http://oai.dlib.vt.edu/odl/software/harvest/
3.7 Net::OAI::Harvester
Description: Net::OAI::Harvester is a Perl extension for easily querying OAI-PMH
repositories. OAI-PMH allows data repositories to share metadata about
their digital assets. Net::OAI::Harvester is an OAI-PMH client, so it does for
OAI-PMH what LWP::UserAgent does for HTTP. At the moment this module supports
only Dublin Core (oai_dc) schema handling through XML::Handler.
Special Features: Some of the features are:
• It is able to handle memory-crazy requests like listRecords and
listIdentifiers
• XML::SAX filters are used which will allow interested developers to write
their own metadata parsing packages, and drop them into place
• It has built in support for unqualified Dublin Core, and has a framework for
dropping in one’s own parser for other kinds of metadata
History: Unknown
Project Sponsors/Administrators: Edward Summers
Dependency: Perl, XML::SAX, LWP::UserAgent
Supported Platforms: Linux, UNIX, Windows
License: Perl Artistic License
Availability: http://search.cpan.org/author/ESUMMERS/OAI-Harvester-0.5/
Further Information:
Project Home Page: http://search.cpan.org/author/ESUMMERS/OAI-Harvester-0.5/
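The point about SAX filters and large responses can be made concrete. A SAX parser hands elements to a handler one at a time instead of building the whole document in memory, which is what makes very large record lists manageable. The sketch below uses Python's standard xml.sax module rather than the Perl module itself; the class name, function name, and sample data are hypothetical.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

DC_NS = "http://purl.org/dc/elements/1.1/"


class TitleHandler(ContentHandler):
    """Collect dc:title values one element at a time, without building a tree."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._buffer = None  # holds text only while inside a dc:title element

    def startElementNS(self, name, qname, attrs):
        if name == (DC_NS, "title"):
            self._buffer = []

    def characters(self, content):
        if self._buffer is not None:
            self._buffer.append(content)

    def endElementNS(self, name, qname):
        if name == (DC_NS, "title"):
            self.titles.append("".join(self._buffer).strip())
            self._buffer = None


def stream_titles(xml_bytes):
    """Parse XML bytes with a namespace-aware SAX parser and return the titles."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    handler = TitleHandler()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.titles


# Hand-written sample data, much simpler than a real OAI-PMH response.
SAMPLE = b"""<records xmlns:dc="http://purl.org/dc/elements/1.1/">
  <record><dc:title>First record</dc:title></record>
  <record><dc:title>Second record</dc:title></record>
</records>"""

if __name__ == "__main__":
    print(stream_titles(SAMPLE))
```

Swapping in a different handler class is the Python analogue of dropping in one's own parser for another metadata format.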
3.8 Rapid Visual OAI Tool (RVOT)
Description: Rapid Visual OAI Tool (RVOT) can be used to graphically construct
an OAI-PMH repository from a collection of files. The records in the original
collection can be in any one of the acceptable formats. RVOT helps to define the
mapping visually from a native format to oai_dc format, and once this is done the
tool can respond to OAI-PMH requests.
Special Features: The design of RVOT is such that it can be easily extended to
support other metadata formats.
History: Developed at the Digital Library Research Group of Old Dominion
University (USA).
Project Sponsors/Administrators: Sathish Kumar Kothamasa, M. Zubair
Dependency: Java SDK
Supported Platforms: Platform Independent (Linux, UNIX, Windows
2000/NT/XP)
License: University of Illinois/NCSA Open Source License
Availability: http://sourceforge.net/project/showfiles.php?group_id=66652
Further Information: Project Home Page: http://rvot.sourceforge.net/
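The mapping from a native format to oai_dc that RVOT lets a user define visually amounts to a field-by-field crosswalk. A minimal sketch follows; it is not RVOT's implementation, and the native field names and the sample record are hypothetical.

```python
# Hypothetical native field names; a real collection defines its own.
FIELD_MAP = {
    "heading": "title",
    "author": "creator",
    "keywords": "subject",
    "year": "date",
}


def to_dublin_core(native_record, field_map=FIELD_MAP):
    """Map a native record (a dict) to unqualified Dublin Core element names.

    Fields without a mapping are dropped: detail that the target schema
    cannot express is lost in the conversion.
    """
    dc = {}
    for field, value in native_record.items():
        element = field_map.get(field)
        if element is not None:
            # Dublin Core elements are repeatable, so values are kept in lists.
            dc.setdefault(element, []).append(value)
    return dc


if __name__ == "__main__":
    record = {"heading": "A Trend Report", "author": "Example, A.", "shelf": "QA76"}
    print(to_dublin_core(record))
```

Here the shelf field has no Dublin Core counterpart in the map and is silently discarded, which illustrates the cost of mapping a richer schema down to a simpler one.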
3.9 OAI Repository Explorer
Description: This site presents an interface to interactively test archives for
compliance with the OAI Protocol for Metadata Harvesting.
Special Features: Some of the features are:
• Simple web-based interface
• Tests compliance with OAI-PMH version 1.0/1.1/2.0, with schema validation
• Useful for testing the OAI compliance of a Data Provider before making it public
History: Developed at the Digital Library Research Laboratory, Virginia Tech
University (USA).
Project Sponsors/Administrators: Hussein Suleman, Edward Fox
Dependency: Web browser with JavaScript support
Supported Platforms: Platform Independent
License: Unknown
Availability: http://www.purl.org/NET/oai_explorer
Further Information: Project Home Page: http://www.purl.org/NET/oai_explorer
Chapter 7
Miscellaneous Supporting Software Tools
“The good news: Computers allow us to work 100% faster. The bad news: They
generate 300% more work” – Unknown
• HTML Tools
• XML Tools
• Information Retrieval Tools
1 HTML Tools
HTML, developed by the World Wide Web Consortium (W3C: http://www.w3c.org/), is
the lingua franca for publishing documents on the World Wide Web. It is
a non-proprietary format based upon Standard Generalized Markup Language
(SGML), and can be created and processed by a wide range of tools, from simple
plain text editors to sophisticated WYSIWYG (What You See Is What You Get)
authoring tools.
Described below are some of the open source tools available for editing HTML in
a WYSIWYG way.
1.1 Amaya
Description: Amaya is a Web editor, i.e. a tool used to create and update
documents directly on the Web. Browsing features are seamlessly integrated with
the editing and remote access features in a uniform environment.
Special Features: Amaya started as an HTML + CSS style sheets editor. Since that
time it was extended to support XML and an increasing number of XML
applications such as the XHTML family, MathML, and SVG. It allows all those
vocabularies to be edited simultaneously in compound documents.
Amaya includes a collaborative annotation application based on Resource
Description Framework (RDF), XLink, and XPointer. The current release, Amaya
8.1a, supports HTML 4.01, XHTML 1.0, XHTML Basic, XHTML 1.1, HTTP 1.1,
MathML 2.0, many CSS 2 features, and SVG (transformation, transparency,
and SMIL animation on OpenGL platforms).
History: Work on Amaya started at W3C in 1996 to showcase Web technologies in
a fully featured Web client. The main motivation for developing Amaya was to
provide a framework that can integrate as many W3C technologies as possible. It is
used to demonstrate these technologies in action while taking advantage of their
combination in a single, consistent environment.
Project Sponsors/Administrators: World Wide Web Consortium (W3C)
Dependency: None
Supported Platforms: Linux, Windows, UNIX, Solaris
License: W3C Software License (GNU GPL Compatible)
Availability: http://www.w3.org/Amaya/User/BinDist.html
Further Information: Project Home Page: http://www.w3.org/Amaya/
1.2 Mozilla
Description: Mozilla has a decent web page editor built into the browser.
Though a low-end product, it is useful for developing small websites and for editing
a page in a hurry.
Special Features: Mozilla is a browser that includes a web page editor, an address
book, an IRC chat client and a powerful mail client with intelligent spam filtering.
History: Developed by Netscape Communications.
Project Sponsors/Administrators: Mozilla Foundation
(http://www.mozillafoundation.org/)
Dependency: glibc2.2.4 or better (for Linux)
Supported Platforms: Linux, Windows, MacOS X, UNIX
License: Mozilla Public License
Availability: http://www.mozilla.org/
Further Information: Project Home Page: http://www.mozilla.org/
1.3 Bluefish Editor
Description: Bluefish is a powerful editor for experienced web designers and
programmers. Bluefish supports many programming and markup languages, but it
focuses on editing dynamic and interactive websites.
Special Features: A What You See Is What You Need interface. Multiple
document interface, easily opens 500+ documents (tested 3500 documents
simultaneously). Customizable syntax highlighting based on Perl Compatible
regular expressions, with sub pattern support and default patterns for PHP, HTML,
C, Java, XML, Python, ColdFusion, Pascal, and R.
Complete feature set is available at: http://bluefish.openoffice.nl/features.html
History: Unknown
Project Sponsors/Administrators: Olivier Sessink
Dependency: gtk2, libpcre, libaspell (optional, for spell checking)
Supported Platforms: Linux, FreeBSD, MacOS-X, OpenBSD, Solaris and Tru64
License: GNU GPL
Availability: http://bluefish.openoffice.nl/download.html
Further Information: Project Home Page: http://bluefish.openoffice.nl/
1.4 Quanta Plus
Description: Quanta+ is a web development environment for HTML and associated
languages for the K Desktop Environment on Linux. Quanta is designed for quick
web development and is rapidly becoming a mature editor with a number of great
features.
Special Features: Unknown
History: Unknown
Project Sponsors/Administrators: Andras Mantia, Robert Nickel, Eric Laffoon
Dependency: KDE, Perl
Supported Platforms: Linux
License: GNU GPL
Availability: http://sourceforge.net/project/showfiles.php?group_id=4113
Further Information: Project Home Page: http://sourceforge.net/projects/quanta/
2 XML Editors
World Wide Web Consortium (W3C) says: “Extensible Markup Language (XML)
is a simple, very flexible text format derived from SGML (ISO 8879). Originally
designed to meet the challenges of large-scale electronic publishing, XML is also
playing an increasingly important role in the exchange of a wide variety of data on
the Web and elsewhere.”
XML has found wide use in the library community for describing metadata. A
number of XML Schemas have been developed for various metadata standards like
Dublin Core, MARC, TEI, etc. Various digital library software packages, including
Greenstone, expect metadata only in XML format.
Creating well-formed or valid XML documents requires the help of XML editors.
We will look into a few of the WYSIWYG XML editors available as open source.
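The distinction between well-formed and valid XML matters when choosing a tool. A well-formed document merely obeys XML syntax (every tag closed, elements properly nested), whereas a valid document also conforms to a DTD or schema. The following sketch checks only well-formedness, using Python's standard library; the function name and the sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False on any syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    good = "<record><title>Trend Report</title></record>"
    bad = "<record><title>Trend Report</record>"  # title is never closed
    print(is_well_formed(good))
    print(is_well_formed(bad))
```

Checking validity needs a validating parser and the relevant DTD or schema, which this standard-library module does not provide.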
2.1 Open eXeed
Description: Open eXeed is an open source development environment for XML. It
is used to edit, create, and validate XML and other related documents, such as
XHTML and XSLT.
Special Features: Unknown
History: Unknown
Project Sponsors/Administrators: [email protected]
Dependency: MSXML (version 4)
Supported Platforms: Windows
License: GNU GPL
Availability: http://sourceforge.jp/frs/index.php?group_id=58
Further Information: Project Home Page: http://openexeed.sourceforge.jp/
2.2 Xerlin
Description: The Xerlin Project is a Java™ based XML modeling application
written to make creating and editing XML files easier. It runs on any Java 2 virtual
machine (JDK1.2.2 or higher). The application is extensible via custom editor
interfaces that can be added for individual DTD's.
Special Features: It is extensible via a plugin interface and can also be launched as
an XML editor widget to be included in other Java applications. It also supports
XML libraries such that XML components can be shared between different files. It
has standard editor features such as undo, cut, copy and paste.
History: Unknown
Project Sponsors/Administrators: SpeedLegal (http://www.speedlegal.com/)
Dependency: Java 2 Platform Standard Edition
Supported Platforms: Platform Independent
License: Unknown, claimed to be Apache-style
(http://www.xerlin.org/LICENSE.txt)
Availability: http://www.xerlin.org/downloads.shtml
Further Information: Project Home Page: http://www.xerlin.org/
2.3 Bitflux Editor
Description: Bitflux Editor (acronym: BXE) is a browser-based (currently Mozilla
only) WYSIWYG XML editor which is written in JavaScript and uses XML,
XSLT, and CSS for rendering. It is usable with any XML document and features
tables, lists, images, special chars, clipboard, undo/redo, and easy customization.
Special Features: Unknown
History: Unknown
Project Sponsors/Administrators: Bitflux, Switzerland
Dependency: Netscape/Mozilla
Supported Platforms: Platform Independent
License: Apache License
Availability: http://bitfluxeditor.org/download/
Further Information: Project Home Page: http://bitfluxeditor.org/
3 Information Retrieval Tools
There is a wide range of open source search engines or information retrieval tools
available on the web from Sourceforge (http://www.sf.net/). These systems can be
categorized into two main groups, viz., those that use inverted files and those that
use database systems. We will look into some of the most popular search engines.
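For readers unfamiliar with the term, an inverted file (or inverted index) maps each word to the list of documents that contain it, so that a query is answered by looking up words rather than scanning every document. A minimal sketch, with hypothetical function names and sample documents, is given below; real search engines add stemming, stop words, ranking, and on-disk storage.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search_and(index, *terms):
    """Return the ids of documents containing every term (Boolean AND)."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*postings) if postings else set()


# Hand-written sample documents.
DOCS = {
    1: "open source software for libraries",
    2: "digital libraries and metadata harvesting",
    3: "open archives metadata harvesting",
}

if __name__ == "__main__":
    index = build_index(DOCS)
    print(search_and(index, "metadata", "harvesting"))
    print(search_and(index, "open", "libraries"))
```

A database-backed engine stores essentially the same word-to-document relation in tables and lets the database system perform the lookups.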
3.1 Ht://Dig
Description: The ht://Dig system is a complete world wide web indexing and
searching system for a domain or intranet. It is not meant to replace
Internet-wide search engines; instead it is meant to cover the search
needs of a single company, campus, or even a particular subsection of a web site.
Special Features: Some of the special features are
• Intranet searching
• Robot exclusion is supported
• Boolean expression searching
• Configurable search results
• Email notification of expired documents
• Searches on subsections of the database
(Go to http://www.htdig.org/require.html for full feature set)
History: ht://Dig was developed at San Diego State University as a way to search
the various web servers on the campus network.
Project Sponsors/Administrators: San Diego State University
Dependency: C++ Compiler, libstdc++ (for building from source)
Supported Platforms: Linux, UNIX, BSD, Solaris, HP/UX
License: GNU GPL
Availability: http://www.htdig.org/mirrors.html, http://www.htdig.org/where.html
Further Information: Project Home Page: http://www.htdig.org/
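Robot exclusion, mentioned among the features above, means that an indexer consults a site's robots.txt file and skips the paths it disallows. The sketch below illustrates the idea with Python's standard urllib.robotparser module; it is not ht://Dig's own code, and the user agent name, the rules, and the URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hand-written rules for illustration; a real indexer downloads /robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed_urls(robots_txt, urls, user_agent="example-indexer"):
    """Return only the URLs that the robots.txt rules permit for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    candidates = [
        "http://example.org/index.html",
        "http://example.org/private/report.html",
    ]
    print(allowed_urls(ROBOTS_TXT, candidates))
```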
3.2 Swish-E
Description: Simple Web Indexing System for Humans - Enhanced (SWISH-E) is
a fast, powerful, flexible, free, and easy to use system for indexing collections of
Web pages or other files.
Special Features: Please refer to
http://swish-e.org/current/docs/README.html#Key_features for the full feature set.
Some of the major features are:
• Quickly index a large number of documents in different formats including
text, HTML, and XML
• Use “filters” to index other types of files such as PDF, gzip, or Postscript.
• Includes a web spider for indexing remote documents over HTTP. Follows
Robots Exclusion Rules (including META tags).
• Can use an external program to supply documents to Swish-e, such as an
advanced spider for your web server or a program to read and format
records from a relational database.
• Document “properties” (some subset of the source document, usually
defined as META or XML elements) may be stored in the index and
returned with search results
History: Developed by people at the University of California (Berkeley and San
Francisco) and other places.
Project Sponsors/Administrators: Roy Tennant (UC, Berkeley)
Dependency: (To build from source) GCC (C++ Compiler), and some other
optional packages. Please refer to http://swish-e.org/dev/docs/INSTALL.html for
the latest requirements.
Supported Platforms: Sun/Solaris, UNIX, BSD, Linux, OS X, Windows
License: GNU GPL, or LGPL
Availability: http://swish-e.org/Download/
Further Information:
1. Project Homepage: http://swish-e.org/
2. How to Index Anything: http://www.linuxjournal.com/article.php?sid=6652
3.3 ASPseek
Description: ASPseek is Internet search engine software developed by SWsoft. It
consists of an indexing robot, a search daemon, and a CGI search frontend. It can
index as many as a few million URLs and search for words and phrases, use
wildcards, and do a Boolean search. Search results can be limited to a given time
period, site, or Web space (set of sites) and sorted by relevance (PageRank is used)
or date.
Special Features: ASPseek is optimized for multiple sites (threaded index, async
DNS lookups, grouping results by site, Web spaces), but can be used for searching
one site as well. ASPseek can work with multiple languages/encodings at once
(including multibyte encodings such as Chinese) due to Unicode storage mode.
Other features include stopwords and ispell support, a charset and language
guesser, HTML templates for search results, excerpts, and query words
highlighting.
History: Developed and maintained by SWsoft.
Project Sponsors/Administrators: SWsoft (http://www.sw-soft.com/)
Dependency: C++ STL, RDBMS
Supported Platforms: Linux
License: GNU GPL
Availability: Binary packages: http://www.aspseek.org/packages.php, Source
packages: http://www.aspseek.org/download.php
Further Information: Project Home Page: http://www.aspseek.org/
3.4 Harvest: A Distributed Search System
Description: Harvest is a system to collect information and make it searchable
using a web interface. Harvest can collect information on the Internet and on
intranets using http, ftp, and nntp, as well as from local files such as data on
hard disks, CD-ROMs, and file servers.
Special Features: The formats currently supported in addition to HTML include
TeX, DVI, PS, full text, mail, man pages, news, troff, WordPerfect, RTF, Microsoft
Word/Excel, SGML, C sources and many more. Stubs for PDF support are included
in Harvest and will use Xpdf or Acroread to process PDF files. Adding support for
a new format is easy due to Harvest's modular design.
History: Unknown
Project Sponsors/Administrators: Developers: Kang-Jin Lee, Javier Masa Marin,
Harald Weinreich
Dependency: Apache, Perl, GCC (C Compiler), Bison, Flex
Supported Platforms: UNIX, Linux
License: GNU GPL
Availability: http://sourceforge.net/project/showfiles.php?group_id=27808,
http://harvest.sourceforge.net/harvest/doc/download.html
Further Information: Project Home Page: http://harvest.sourceforge.net/
3.5 Zebra Server
Description: Zebra is a high-performance, general-purpose structured text indexing
and retrieval engine. It reads structured records in a variety of input formats (e.g.
email, XML, MARC) and allows access to them through exact Boolean search
expressions and relevance-ranked free-text queries.
Special Features: Zebra supports large databases (more than ten gigabytes of data,
tens of millions of records). It supports incremental, safe database updates on live
systems. You can access data stored in Zebra using a variety of Index Data tools
(e.g. YAZ and PHP/YAZ) as well as commercial and freeware Z39.50 clients and
toolkits.
History: Unknown
Project Sponsors/Administrators: Index Data (http://indexdata.dk/)
Dependency: YAZ Toolkit, [To build from source: C++ Compiler (GCC or
VC++)]
Supported Platforms: UNIX, Linux, Windows
License: GNU GPL
Availability: Source and binary: http://indexdata.dk/zebra/
Further Information: Project Home Page: http://indexdata.dk/zebra/
3.6 SiteSearch
Description: The OCLC SiteSearch software provides a comprehensive solution
for managing distributed library information resources in a World Wide Web
environment. It offers tools that integrate electronic resources under one web
interface, provide flexible access to resources, and build text and image databases
locally.
Special Features: Unknown
History: Unknown
Project Sponsors/Administrators: OCLC, Inc
Dependency: Java
Supported Platforms: Platform Independent
License: SiteSearch Open Source License Terms
Availability:
http://www.sitesearch.oclc.org/project/showfiles.php?group_id=16381
Further Information: Project Home Page: http://www.sitesearch.oclc.org/
Chapter 8
Conclusion
“The computer should be doing the hard work. That's what it's paid to do, after
all” – Larry Wall, author of Perl programming language
• Barriers in Using OSS
• Criteria for Selection of OSS
• Conclusion
1 Barriers in Using OSS
The benefits of open source software notwithstanding, there are a number of
barriers to the use of OSS in libraries. Library administrators are often reluctant to
adopt OSS due to a number of factors.
According to the Draft Report (2001) of the Digital Library Federation (USA):
• OSS can lack formal support making it difficult for libraries without
significant capacity in their systems department to participate in OSS
development or to use OSS.
• OSS needs to develop a participatory organizational model that allows
many to contribute perhaps in different ways to OSS development.
• OSS is not always easy to use. It is therefore largely inaccessible to the
many libraries and library system departments that require plug-and-play
software that is well documented and supported and can be easily installed
(and uninstalled).
• OSS initiatives do not always do enough to get non-systems librarians and
library patrons involved in design and testing of OSS. As such, they are
seen as being something that exclusively offers benefits to and holds
interest for library systems staff and not for the wider library community.
Another factor that often comes up is the usability of open source software. The
basic problem is that most open source systems are written by programmers who do
not understand the end user needs and whose software is often complex and
difficult to use. Thus, people argue that open source software projects need to
adapt in order to produce systems that can be used by a typical and non-technical
user.
Another issue related to usability is the documentation of open source software.
A particular piece of software cannot be used easily without proper documentation.
While proprietary software vendors can afford to employ documentation staff to
do the job, the open source world largely lacks the resources to do it. Programmers
work in open source projects because they love programming and they do it as a
hobby or pastime. But documenting the product may not be as challenging to them
as writing the software. This factor sometimes reduces the usefulness of a software
product to a great extent.
2 Criteria for Selection of OSS
Frank Cervone (2003) has given the following guidelines for the evaluation of open
source software, all of which follow a single
principle: thoroughly investigate the software before implementing it.
Some questions to be asked include:
• What are the programming language requirements?
o Do you have people on staff who can program in the language in
which the software is written?
o If not, do you have ready access to people who can?
o If not, are there alternative packages you can support?
• What is the operating environment?
o Is this software supported on your hardware?
o Does it run on the operating systems you support?
o Is there a large, active user base?
• How is maintenance handled?
o Who is currently supporting it?
o Is there an electronic discussion list, newsgroup, or blog [weblog]
that can be used for support?
o Is there a commercial entity that could provide support?
o Is there a community of peers providing input on enhancements and
modifications?
o What sort of functional and integrated testing is performed by the
user community?
• Does the software have the necessary functionality?
o Is the product mature? Is it in a greater than 1.0 release?
o Will it require modification? If so, do you have the expertise?
o How will local modifications be folded back into the base product so
that the same modifications need not be repeated for each new
release?
Another concern is how much customization is needed to make the product work.
Often librarians lack the skills themselves or are unable to find suitable support to
customize a software application to their needs. A case in point could be the
PostNuke Content Management System, which requires extensive customization
before it can be used as the library’s portal.
3 Conclusion
Open Source essentially empowers less privileged communities, though it does not
follow that it is meant only for them. OSS undeniably helps bridge the digital
divide in more ways than one. Libraries in the developing
countries are able to support electronic access, digital libraries, and resource
sharing because they are able to use OSS. Even libraries in developed
countries are becoming more inclined towards OSS to improve their services.
Chapter 9
Appendix
“Making Linux freely available is the single best decision I've ever made. There are
lots of good technical stuff I'm proud of too in the kernel, but they all pale by
comparison.” – Linus Torvalds
• OSI Certified Licenses
OSI Certified Licenses
The Open Source Initiative (OSI) certifies open source licenses on the basis of ten
criteria described in Chapter 3 of this document. At the time of writing, 45 OSI
certified licenses are listed on its home page (http://www.opensource.org/).
1. Academic Free License: http://www.opensource.org/licenses/academic.php
2. Apache Software License:
http://www.opensource.org/licenses/apachepl.php
3. Apple Public Source License: http://www.opensource.org/licenses/apsl.php
4. Artistic license: http://www.opensource.org/licenses/artistic-license.php
5. Attribution Assurance Licenses:
http://www.opensource.org/licenses/attribution.php
6. BSD license: http://www.opensource.org/licenses/bsd-license.php
7. Common Public License: http://www.opensource.org/licenses/cpl.php
8. Eiffel Forum License: http://www.opensource.org/licenses/eiffel.php
9. Eiffel Forum License V2.0:
http://www.opensource.org/licenses/ver2_eiffel.php
10. Entessa Public License: http://www.opensource.org/licenses/entessa.php
11. GNU General Public License (GPL):
http://www.opensource.org/licenses/gpl-license.php
12. GNU Library or "Lesser" General Public License (LGPL):
http://www.opensource.org/licenses/lgpl-license.php
13. Lucent Public License (Plan9):
http://www.opensource.org/licenses/plan9.php
14. IBM Public License: http://www.opensource.org/licenses/ibmpl.php
15. Intel Open Source License:
http://www.opensource.org/licenses/intel-open-source-license.php
16. Historical Permission Notice and Disclaimer:
http://www.opensource.org/licenses/historical.php
17. Jabber Open Source License:
http://www.opensource.org/licenses/jabberpl.php
18. MIT license: http://www.opensource.org/licenses/mit-license.php
19. MITRE Collaborative Virtual Workspace License (CVW License):
http://www.opensource.org/licenses/mitrepl.php
20. Motosoto License: http://www.opensource.org/licenses/motosoto.php
21. Mozilla Public License 1.0 (MPL):
http://www.opensource.org/licenses/mozilla1.0.php
22. Mozilla Public License 1.1 (MPL):
http://www.opensource.org/licenses/mozilla1.1.php
23. Naumen Public License: http://www.opensource.org/licenses/naumen.php
24. Nethack General Public License:
http://www.opensource.org/licenses/nethack.php
25. Nokia Open Source License: http://www.opensource.org/licenses/nokia.php
26. OCLC Research Public License 2.0:
http://www.opensource.org/licenses/oclc2.php
27. Open Group Test Suite License:
http://www.opensource.org/licenses/opengroup.php
28. Open Software License: http://www.opensource.org/licenses/osl.php
29. Python license (CNRI Python License):
http://www.opensource.org/licenses/pythonpl.php
30. Python Software Foundation License:
http://www.opensource.org/licenses/PythonSoftFoundation.php
31. Qt Public License (QPL): http://www.opensource.org/licenses/qtpl.php
32. RealNetworks Public Source License V1.0:
http://www.opensource.org/licenses/real.php
33. Reciprocal Public License: http://www.opensource.org/licenses/rpl.php
34. Ricoh Source Code Public License:
http://www.opensource.org/licenses/ricohpl.php
35. Sleepycat License: http://www.opensource.org/licenses/sleepycat.php
36. Sun Industry Standards Source License (SISSL):
http://www.opensource.org/licenses/sisslpl.php
37. Sun Public License: http://www.opensource.org/licenses/sunpublic.php
38. Sybase Open Watcom Public License 1.0:
http://www.opensource.org/licenses/sybase.php
39. University of Illinois/NCSA Open Source License:
http://www.opensource.org/licenses/UoI-NCSA.php
40. Vovida Software License v. 1.0:
http://www.opensource.org/licenses/vovidapl.php
41. W3C License: http://www.opensource.org/licenses/W3C.php
42. wxWindows Library License:
http://www.opensource.org/licenses/wxwindows.php
43. X.Net License: http://www.opensource.org/licenses/xnet.php
44. Zope Public License: http://www.opensource.org/licenses/zpl.php
45. zlib/libpng license: http://www.opensource.org/licenses/zlib-license.php
Chapter 10
Selective Bibliography
1. Cervone, Frank (2003). The Open Source Option [online] Available from:
http://libraryjournal.reviewsnews.com/index.asp?layout=articlePrint&articleID=CA304084&publication=libraryjournal
(accessed on August 27, 2003)
2. Chawner, Brenda (2003). Open Source Software and Libraries Bibliographies
(Version 0.5) [online] Available from:
http://www.vuw.ac.nz/staff/brenda_chawner/biblio.html (accessed on July 23,
2003)
3. Chudnov, Daniel (1999). Open Source Library Systems: Getting Started
[online] Available from:
http://www.oss4lib.org/readings/oss4lib-getting-started.php
(accessed on July 23, 2003)
4. Ghosh, R.A. (1998). FM Interview with Linus Torvalds: What motivates free
software developers? First Monday [online] (2 March 1998) Vol. 3 (3).
Available from:
http://www.firstmonday.dk/issues/issue3_3/torvalds/index.html (accessed on
July 20, 2003)
5. Greenstein, D. (2001). DLF Architectures: Evaluation of Open Source
Software for Libraries [online] Available from:
http://www.diglib.org/architectures/ossrep.htm (accessed on August 26,
2003)
6. MacFarlane, Andrew (2003). On Open Source IR. In Aslib Proceedings, 55
(4), pp. 217-222.
7. Moody, Glynn (2001). Rebel Code: Linux and the Open Source Revolution.
Allen Lane, London.
8. Morgan, Eric Lease (2002). Open Source Software in Libraries [online]
Available from: http://dewey.library.nd.edu/morgan/musings/ossnlibraries/
(accessed on July 20, 2003)
9. Morgan, Eric Lease (2003). Building Your Library’s Portal [online] Available
from: http://dewey.library.nd.edu/morgan/musings/portals/ (accessed on
August 27, 2003)
10. OSS4Lib. Oss4lib – Projects [online] Available from:
http://www.oss4lib.org/projects/ (accessed on July 23, 2003).
11. OSS- Open Source Software [online] Available from:
http://www.eifl.net/opensoft/soft.html (accessed on July 23, 2003)
12. Pasquinelli, Art (2003). Information Technology Advances in Libraries
[online] Available from:
http://www.sun.com/products-n-solutions/edu/whitepapers/pdf/it_advances.pdf
(accessed on August 16, 2003)
13. Rasch, Chris (2000). A Brief History of Free/Open Source Software
Movement [online] Available from:
http://www.openknowledge.org/writing/open-source/scb/brief-open-source-history.html
(accessed on July 20, 2003)
14. Raymond, Eric S. (2001). The Cathedral and the Bazaar: Musings on Linux
and Open Source by an Accidental Revolutionary. Revised edition.
O’Reilly and Associates, Sebastopol, CA.