agenda


Upload: abel-abbott

Post on 31-Dec-2015


TRANSCRIPT

Page 1: Agenda

Agenda

• Why discuss Digital Libraries

• What is a Digital Library

• History

• Meta-data

• FEDORA

• NSDL

• DSpace

Page 2: Agenda

What is a Digital Library?

• There are several definitions of digital libraries

• Borgman identifies two major aspects emphasized in these definitions
  – DL researchers from computer science focus on DLs as content for user communities and therefore emphasize the enabling technologies
  – Library professionals appear to emphasize DLs as services / institutions

Page 3: Agenda

DL: Evolution

• Digital Library Initiative I

• Beginning in 1992-93, US government agencies (NSF, DARPA, and others) initiated digital library efforts in the US with massive funding, resulting in 6 major DL initiatives

• Digital Library Initiative II (1998-2002)

• E-Lib programme in the UK

Page 4: Agenda

Roots of Digital Libraries

• DLs have evolved from and are based on the techniques and principles developed by early IR researchers such as:
  – Calvin Mooers (1950)
  – Perry (1951)
  – Taube (1955)

• Information retrieval systems of the 1980s

• Salton’s automatic indexing and search systems (1960s)

• Hypertext systems of the 1980s

Page 5: Agenda

Meta-data

• Data about data

• Meta-data standards
  – Dublin Core
  – Open Archives Initiative (OAI)

• Meta-data collection and harvesting
  – Periodic
  – Federated

Page 6: Agenda

What is Meta-data?

• Record of the existence of knowledge (i.e. libraries)

• Linkable to each other

• Governed by standards to ensure compatibility

Page 7: Agenda

Dublin Core Meta-data Element Set

• Also known as (A.K.A.) Dublin Core

• “Born” in Dublin, Ohio in 1995, funded by the Online Computer Library Center (OCLC) and the National Center for Supercomputing Applications (NCSA)

• The main standard for meta-data

Page 8: Agenda

Dublin Core Elements

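• 15 elements make up the standardized Dublin Core; they describe a resource so that it is easy to search for. 14 names are listed below, several under early names (Author is now Creator, Other agent is now Contributor, Object type is now Type, Form is now Format)

Subject        Identifier
Title          Language
Author         Source
Publisher      Relation
Other agent    Rights
Date           Leader
Object type
Form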
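As a rough illustration of how a record that uses the older element names could be checked against the standardized element set, here is a minimal sketch. The 15 names and the four renamings are those of the Dublin Core Metadata Element Set; the function `normalize_record` and the sample record are hypothetical and exist only for this example.

```python
# The 15 element names of the standardized Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# Early element names mapped to their current names.
LEGACY_NAMES = {
    "author": "creator",
    "other agent": "contributor",
    "object type": "type",
    "form": "format",
}


def normalize_record(record):
    """Return (normalized, unknown).

    normalized: dict keyed by current Dublin Core element names.
    unknown: sorted list of original keys that are not Dublin Core elements.
    """
    normalized, unknown = {}, []
    for key, value in record.items():
        name = key.strip().lower()
        name = LEGACY_NAMES.get(name, name)
        if name in DC_ELEMENTS:
            normalized[name] = value
        else:
            unknown.append(key)
    return normalized, sorted(unknown)


# Hypothetical record; all values are invented for illustration.
example = {
    "Title": "Introduction to Digital Libraries",
    "Author": "A. Lecturer",
    "Object type": "Presentation",
    "Form": "application/pdf",
    "Leader": "n/a",
}
normalized, unknown = normalize_record(example)
```

In this sketch "Author" ends up under `creator`, while "Leader" is reported as unknown because it is not one of the 15 element names.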

Page 9: Agenda

Open Archives Initiative (OAI)

• Harvesting protocol (OAI-PMH): the most widely used of the harvesting protocols

• Requires Dublin Core as the minimum meta-data format

• Requests are sent over HTTP and responses are encoded in XML
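To make the harvesting request concrete, the following sketch builds the URL for an OAI-PMH `ListRecords` request. The `verb`, `metadataPrefix`, `from`, and `set` parameters and the `oai_dc` prefix come from the OAI-PMH protocol; the base URL and the helper name `list_records_url` are assumptions made up for this example, and no network request is sent.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    oai_dc is the unqualified Dublin Core format that OAI-PMH
    repositories are required to support.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date   # only records changed on or after this date
    if set_spec:
        params["set"] = set_spec     # restrict the harvest to one set
    return base_url + "?" + urlencode(params)


# Hypothetical repository endpoint, used only for illustration.
url = list_records_url("https://repository.example.org/oai", from_date="2015-12-01")
```

A harvester would fetch this URL and read the XML response; the `from` parameter is what lets a periodic harvest collect only what changed since the previous run.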

Page 10: Agenda

Collection and harvesting methods

Periodic

• Fixed interval collection (e.g. every day)

• Ability to rate relevance when using search engines

Federated

• “on the Fly”

• Smaller amounts of data constantly linked

• Becomes less effective with larger stores of data
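The difference between the two approaches can be sketched with a toy example. Everything here is hypothetical: the sources are in-memory dictionaries standing in for remote repositories, and the function names are invented for this illustration.

```python
# Hypothetical sources: each maps a record id to its title.
SOURCES = {
    "library_a": {"a1": "Digital library history", "a2": "Metadata standards"},
    "library_b": {"b1": "Harvesting metadata with OAI"},
}


def harvest(sources):
    """Periodic approach: copy every source's records into one local index.

    In practice this would run on a fixed schedule, e.g. once a day.
    """
    index = {}
    for source_name, records in sources.items():
        for record_id, title in records.items():
            index[(source_name, record_id)] = title
    return index


def search_index(index, term):
    """Search the local copy only; no source is contacted at query time."""
    term = term.lower()
    return sorted(key for key, title in index.items() if term in title.lower())


def federated_search(sources, term):
    """Federated approach: visit every source for every query ("on the fly"),
    so the per-query work grows with the number and size of the sources."""
    term = term.lower()
    hits = []
    for source_name, records in sources.items():
        for record_id, title in records.items():
            if term in title.lower():
                hits.append((source_name, record_id))
    return sorted(hits)


local_index = harvest(SOURCES)
periodic_hits = search_index(local_index, "metadata")
federated_hits = federated_search(SOURCES, "metadata")
```

Both searches return the same records here. The difference is where the cost falls: the periodic approach pays once per harvest and then answers from its local copy, while the federated approach contacts every source on every query.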

Page 11: Agenda

Issues with federated harvesting

• 1985

USGS - United States Geological Survey

FGDC - Federal Geographic Data Committee

• Attempted to build a large federated meta-data system that became much slower as the amount of data grew, resulting in less efficient searches

Page 12: Agenda

Current trends

• Dublin Core is more widely used in Europe than in the United States

• Periodic harvesting is preferred

• Publishers, and other industries that work closely with libraries, were among the first to use meta-data and digital libraries

Page 13: Agenda

Fedora

The problem

• Institutions and organizations face increasing demands to deliver rich digital content

• Delivery is only one aspect of a suite of content management tasks. Content needs to be created, ingested, and stored

Page 14: Agenda

• Without standardization, the costs of management tasks become prohibitive

• Content managers need a flexible content repository system that allows them to uniformly store, manage, and deliver all their existing content and that will accommodate new forms that will arise in the future.

Page 15: Agenda

Costs of not finding information

• IDC studies found that knowledge workers spend from 15% to 35% of their time searching for information

• 40% of corporate users reported that they cannot find the information they need to do their jobs on their intranets

Page 16: Agenda

• 90% of the time that knowledge workers spend is spent recreating information that already exists

• IDC estimates a cost of $5,000 per worker per year

• For a 1,000-employee company, the cost of reworking is $12 million a year (15% of time spent duplicating existing information)
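The rework figure can be unpacked with simple arithmetic. The sketch below uses only the $12 million, 1,000 employees, and 15% stated above; the per-worker values it derives are inferences from those numbers, not figures quoted from IDC.

```python
employees = 1_000
rework_cost_per_year = 12_000_000   # dollars per year, as stated above
rework_share_percent = 15           # percent of working time, as stated above

# Rework cost attributed to each worker per year.
rework_cost_per_worker = rework_cost_per_year // employees

# Annual cost per worker that would make 15% of working time worth that amount.
implied_cost_per_worker = rework_cost_per_worker * 100 // rework_share_percent
```

Under these assumptions the rework cost is $12,000 per worker per year, which corresponds to an annual cost of $80,000 per worker. This is more than the $5,000 per worker estimate, so the two figures presumably measure different costs.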

Page 17: Agenda

• First digital object repository management system based on the Flexible Extensible Digital Object and Repository Architecture (Fedora).

• Fedora is open source software.

• Not related to the Red Hat Fedora Project

Page 18: Agenda
Page 19: Agenda

• Powerful digital object model that supports multiple views of each digital object and the relationships among digital objects

• Digital objects can encapsulate locally-managed content or make reference to remote content.

• Dynamic views are possible

• Digital objects exist within a repository architecture that supports a variety of management functions.

• All functions of Fedora are exposed as web services
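The object model described above can be pictured with a small sketch. This is not Fedora's API: the class names, fields, and example values are assumptions chosen to mirror the bullets (local or remote content, relationships among objects, and multiple views computed on demand).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Datastream:
    """One piece of content: held locally, or a reference to a remote URL."""
    mime_type: str
    content: Optional[bytes] = None     # locally managed content
    remote_url: Optional[str] = None    # reference to remote content

    def is_local(self) -> bool:
        return self.content is not None


@dataclass
class DigitalObject:
    """Illustrative digital object: content, relationships, and views."""
    pid: str
    datastreams: Dict[str, Datastream] = field(default_factory=dict)
    # Relation name -> identifier of the related object.
    relationships: Dict[str, str] = field(default_factory=dict)
    # View name -> function that computes that view when requested.
    views: Dict[str, Callable[["DigitalObject"], str]] = field(default_factory=dict)

    def render(self, view_name: str) -> str:
        return self.views[view_name](self)


# Hypothetical object; identifiers and URLs are invented for illustration.
obj = DigitalObject(pid="demo:1")
obj.datastreams["text"] = Datastream("text/plain", content=b"Hello, repository")
obj.datastreams["image"] = Datastream("image/jpeg", remote_url="https://images.example.org/1.jpg")
obj.relationships["isMemberOf"] = "demo:collection"
obj.views["summary"] = lambda o: f"{o.pid}: {len(o.datastreams)} datastreams"
obj.views["text"] = lambda o: o.datastreams["text"].content.decode("utf-8")
```

Calling `obj.render("summary")` or `obj.render("text")` produces two different views of the same object, each computed when requested rather than stored.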

Page 20: Agenda

• Fedora is an attractive solution in a variety of domains

• Applications: library collections management, multimedia authoring systems, archival repositories, institutional repositories, and digital libraries for education

Page 21: Agenda

National Science Digital Library (NSDL)

• Created by the National Science Foundation in 2000.

• 192 projects awarded since start.

• ~50 directly funded by NSDL.

• 400 unique collections.

• Goal: to be a free service that directs users to exemplary resources for education.

Page 22: Agenda

NSDL

• Provides an organized point of access to STEM content (Science, Technology, Engineering, Mathematics).

• Supports Dublin Core standard.

• Most commonly, XML is used to store and encode documents for NSDL collections.
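As an illustration of what a Dublin Core record encoded in XML can look like and how it can be read, here is a minimal sketch. The namespace URI is the standard one for Dublin Core elements; the `record` wrapper element, the sample values, and the helper `parse_dc` are assumptions for this example and do not reflect NSDL's actual record format.

```python
import xml.etree.ElementTree as ET

# Standard namespace for the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical record; the wrapper element and values are invented.
RECORD_XML = """
<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Photosynthesis lab activity</dc:title>
  <dc:subject>Biology</dc:subject>
  <dc:subject>Plants</dc:subject>
  <dc:identifier>https://collection.example.org/items/42</dc:identifier>
</record>
"""


def parse_dc(xml_text):
    """Collect Dublin Core elements into a dict of element name -> list of values.

    Elements may repeat (e.g. several subjects), so each value is a list.
    """
    root = ET.fromstring(xml_text)
    fields = {}
    prefix = "{" + DC_NS + "}"
    for child in root:
        if child.tag.startswith(prefix):
            name = child.tag[len(prefix):]
            fields.setdefault(name, []).append((child.text or "").strip())
    return fields


record = parse_dc(RECORD_XML)
```

Keeping each field as a list reflects that Dublin Core elements are repeatable, as with the two `subject` values here.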

Page 23: Agenda

• Access to most resources discovered through the NSDL is free.

• NSDL is a collection of other digital library collections.
  – In essence, the NSDL is more of a digital card catalog spanning many different collections.

Page 24: Agenda

• Common selection mechanisms include:
  – Peer review boards
  – Content creation committees
  – User recommendations

• Advantage over other search methods: resources pass through a selection process, which is intended to ensure quality results.

Page 25: Agenda

DSpace

• Developed by HP and MIT in 2002.

• Open source digital repository system.
  – Provides digital archiving.

• Available under the BSD open source license.

• Free to download and use.

• Mission to: “Collect, manage, preserve and redistribute digital content.”

Page 26: Agenda

• Can store many different types of content, including books, theses, audio/video, and programs.

• Currently in use at many institutions worldwide, including Cornell, OSU, Vanderbilt, and many others.

Page 27: Agenda

• DSpace software is maintained by the DSpace Federation
  – No formal membership structure
  – Adoption is growing

• DSpace focuses on making content accessible and preserving it over time.

Page 28: Agenda

DSpace Data Model