lab : federated searching: searching distributed data & searching harvested data

Page 1: Lab :  federated searching: searching distributed data & searching harvested data

1 herbert van de sompel

CS 502 Computing Methods for Digital Libraries

Cornell University – Computer Science
Herbert Van de Sompel
[email protected]

Lab : federated searching: searching distributed data & searching harvested data

Access to PC
orpcuser
orpcpw

Page 2: Lab :  federated searching: searching distributed data & searching harvested data

[Diagram: federated services across A&I, image, FTXT, OPAC, and e-print resources]

Page 3: Lab :  federated searching: searching distributed data & searching harvested data

federated searching

• Distributed search approach ~ Z39.50, SDLIP, ...
  • today: MetaLib
  • commercial product by Ex Libris
  • searches repositories using “whichever” technique
  • normalizes results before presenting them to the user
  • can merge results after initial presentation

• Harvesting approach ~ OAI
  • today: ARC, the first OAI service provider
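The difference between the two approaches can be sketched in a few lines. This is a minimal illustration only: the `Repository` class, its methods, and the sample records are hypothetical stand-ins, not the Z39.50, SDLIP, or OAI protocols themselves. Distributed search contacts every repository at query time, while harvesting copies records into a local index beforehand and searches only that index.

```python
from dataclasses import dataclass


@dataclass
class Repository:
    """Hypothetical stand-in for a remote repository."""
    name: str
    records: list[str]
    queries_served: int = 0

    def search(self, term: str) -> list[str]:
        # Counts how often the remote side is contacted at query time.
        self.queries_served += 1
        return [r for r in self.records if term.lower() in r.lower()]

    def list_records(self) -> list[str]:
        return list(self.records)


def distributed_search(repos: list[Repository], term: str) -> list[tuple[str, str]]:
    """Query every repository at search time and merge the answers."""
    hits = []
    for repo in repos:
        hits.extend((repo.name, record) for record in repo.search(term))
    return hits


def harvest(repos: list[Repository]) -> list[tuple[str, str]]:
    """Copy all records into one local index ahead of time."""
    return [(repo.name, record) for repo in repos for record in repo.list_records()]


def harvested_search(index: list[tuple[str, str]], term: str) -> list[tuple[str, str]]:
    """Search the local index only; no repository is contacted."""
    return [(name, record) for name, record in index if term.lower() in record.lower()]


repos = [
    Repository("opac", ["Sleep medicine", "Digital libraries"]),
    Repository("eprints", ["Sleep and memory", "Reference linking"]),
]
live_hits = distributed_search(repos, "sleep")
index = harvest(repos)
local_hits = harvested_search(index, "sleep")
```

Both calls return the same hits here; what differs is where the work happens and how fresh the data is.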

Page 4: Lab :  federated searching: searching distributed data & searching harvested data

MetaLib

Goal: a single, consistent interface across library resources (think Google for the library collection)

Broadcast searching over a large collection of heterogeneous resources:
• Different metadata syntax (MARC, EAD, Dublin Core, TEI, unspecified, ...)
• Different protocols (Z39.50, HTTP, native ALEPH, screen scraping)

• Linking via integrated SFX server

• User personalization
• Administration of resources
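Handling different metadata syntaxes amounts to mapping each one onto a common record shape before results are shown. The sketch below assumes records arrive as plain dictionaries keyed by MARC tag (245 for title, 100 for personal-name main entry) or by Dublin Core element name (`title`, `creator`). The function names and the common shape are illustrative assumptions, not MetaLib internals.

```python
def normalize_marc(record: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map MARC tags onto the common shape (245 = title, 100 = main author)."""
    return {"title": record.get("245", []), "author": record.get("100", [])}


def normalize_dc(record: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map Dublin Core elements onto the common shape."""
    return {"title": record.get("title", []), "author": record.get("creator", [])}


# One normalizer per metadata syntax; the keys are hypothetical labels.
NORMALIZERS = {"marc": normalize_marc, "dc": normalize_dc}


def normalize(syntax: str, record: dict[str, list[str]]) -> dict[str, list[str]]:
    """Pick the normalizer for the syntax and apply it."""
    try:
        normalizer = NORMALIZERS[syntax]
    except KeyError:
        raise ValueError(f"no normalizer for syntax {syntax!r}") from None
    return normalizer(record)


from_marc = normalize("marc", {"245": ["Sleep"], "100": ["Kryger, Meir"]})
from_dc = normalize("dc", {"title": ["Sleep"], "creator": ["Kryger, Meir"]})
```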

Page 5: Lab :  federated searching: searching distributed data & searching harvested data

[Diagram, with an application level and a technology level: Universal Gateway; accurate, target-sensitive searching; context-sensitive linking; User Admin; customized, personalized services; resources’ administering; Information Gateway]

Page 6: Lab :  federated searching: searching distributed data & searching harvested data

[Diagram: Universal Gateway; accurate, target-sensitive searching; context-sensitive linking; User Admin; customized, personalized services; Information Gateway; resources’ administering]

Page 7: Lab :  federated searching: searching distributed data & searching harvested data

The Information Gateway: The KnowledgeBase

The KnowledgeBase includes all the library Resources

Cataloging Information (per collection):

Name of collection, owner, subject, services, language

Resources

Collections

Configuration Information (per MetaLib resource):

Interfacing protocol, internal format, rules of conversion
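The two kinds of KnowledgeBase information listed on this slide can be pictured as two record types kept side by side. The sketch below is an assumed data layout for illustration only. The class and field names are hypothetical and follow the slide's wording, not MetaLib's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class CollectionEntry:
    """Cataloging information, one per collection."""
    name: str
    owner: str
    subjects: list[str]
    services: list[str]
    language: str


@dataclass
class ResourceConfig:
    """Configuration information, one per resource."""
    protocol: str
    internal_format: str
    conversion_rules: dict[str, str] = field(default_factory=dict)


@dataclass
class KnowledgeBase:
    collections: dict[str, CollectionEntry] = field(default_factory=dict)
    resources: dict[str, ResourceConfig] = field(default_factory=dict)

    def register(self, key: str, entry: CollectionEntry, config: ResourceConfig) -> None:
        """Store the catalog entry and its configuration under one key."""
        self.collections[key] = entry
        self.resources[key] = config

    def by_subject(self, subject: str) -> list[str]:
        """Return the keys of all collections cataloged under a subject."""
        return sorted(k for k, c in self.collections.items() if subject in c.subjects)


kb = KnowledgeBase()
kb.register(
    "qe2",  # hypothetical key
    CollectionEntry(
        name="Queen Elizabeth II Library",
        owner="Memorial University of Newfoundland",
        subjects=["Computer Science", "Pure Science", "Humanities"],
        services=["search"],  # assumed value, for illustration
        language="English",
    ),
    # Protocol and conversion rule are assumed values, for illustration.
    ResourceConfig(protocol="Z39.50", internal_format="USMARC",
                   conversion_rules={"245": "title"}),
)
```

With this layout, choosing which resources to search by subject is a lookup on the cataloging side, and how to talk to each one is a lookup on the configuration side.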

Page 8: Lab :  federated searching: searching distributed data & searching harvested data

KnowledgeBase

Example of cataloged collection

245 Title: Queen Elizabeth II Library

270 Location: Memorial University of Newfoundland | St. John’s, Newfoundland | Canada | AC15S7

307 Access Times: Monday-Thursday 8:30-20:45 Closed Saturdays

520 Description: Main library covers humanities, science, computers, physical ed, social sciences, and engineering

546 Language: English

531 Access: Open to the public

901 Administrator: [email protected]

650 Subject: Computer Science

650 Subject: Pure Science

650 Subject: Humanities

USMARC
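A record laid out like the one above, with a three-digit tag, a label, and a value on each line, can be read into a tag-keyed dictionary. The sketch below assumes that textual layout; real USMARC records are exchanged in a binary or XML encoding, so this parses only the slide's display form. Repeatable tags such as 650 are collected into lists.

```python
import re
from collections import defaultdict

# tag, label, value: e.g. "650 Subject: Humanities"
FIELD = re.compile(r"^(\d{3})\s+([^:]+):\s*(.*)$")


def parse_fields(lines: list[str]) -> dict[str, list[str]]:
    """Group field values by tag, keeping repeated tags in order."""
    fields = defaultdict(list)
    for line in lines:
        match = FIELD.match(line.strip())
        if match is None:
            continue  # skip lines that are not "tag label: value"
        tag, _label, value = match.groups()
        fields[tag].append(value)
    return dict(fields)


record = parse_fields([
    "245 Title: Queen Elizabeth II Library",
    "307 Access Times: Monday-Thursday 8:30-20:45 Closed Saturdays",
    "546 Language: English",
    "650 Subject: Computer Science",
    "650 Subject: Pure Science",
    "650 Subject: Humanities",
    "USMARC",
])
```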

Page 9: Lab :  federated searching: searching distributed data & searching harvested data

[Diagram: Universal Gateway; accurate, target-sensitive searching; context-sensitive linking; User Admin; customized, personalized services; database of catalogs and databases; Information Gateway]

Page 10: Lab :  federated searching: searching distributed data & searching harvested data

[Diagram: the Universal Gateway and Diverse Information Resources, labeled ALEPH, Z39.50, HTTP, and OTHER]

Search Processed to Conform to Information Resources

Page 11: Lab :  federated searching: searching distributed data & searching harvested data

Search Command Adapted to Various Resources

User input:
Title: sleep
Author: Kryger, Meir

Adapted search commands:
Z39.50, Library of Congress: 1=Kryger, Meir AND 4=sleep
ALEPH, KOBV: WAU=Kryger, Meir AND WTI=sleep
HTTP, PubMed: (%20sleep[TITL]%20)%20AND%20(%20Kryger[AUTH]%20Meir[AUTH]%20)
Z39.50, MedLine: 1003=Kryger-M? AND 4=sleep
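The adaptation of one title/author search into several target-specific commands can be sketched as one small function per target. The rules below are inferred from the single example on this slide, so they are assumptions for illustration: the function names are hypothetical, and the pairing of the `1=` form with the Library of Congress and the `1003=` form with MedLine is a reading of the slide, not documented behavior.

```python
import re
from urllib.parse import quote


def to_loc_z3950(title: str, author: str) -> str:
    """Z39.50 style with numeric use attributes, as in the slide's first command."""
    return f"1={author} AND 4={title}"


def to_aleph(title: str, author: str) -> str:
    """ALEPH style with WAU (author) and WTI (title) index codes."""
    return f"WAU={author} AND WTI={title}"


def to_pubmed(title: str, author: str) -> str:
    """URL-encoded PubMed style: each author word gets its own [AUTH] tag."""
    words = [w for w in re.split(r"[,\s]+", author.strip()) if w]
    names = " ".join(f"{w}[AUTH]" for w in words)
    return quote(f"( {title}[TITL] ) AND ( {names} )", safe="()[]")


def to_medline_z3950(title: str, author: str) -> str:
    """Z39.50 style with surname, first initial, and '?' truncation."""
    surname, _, given = author.partition(",")
    initial = given.strip()[:1]
    return f"1003={surname.strip()}-{initial}? AND 4={title}"


commands = {
    "Library of Congress": to_loc_z3950("sleep", "Kryger, Meir"),
    "KOBV": to_aleph("sleep", "Kryger, Meir"),
    "PubMed": to_pubmed("sleep", "Kryger, Meir"),
    "MedLine": to_medline_z3950("sleep", "Kryger, Meir"),
}
```

Keeping one translator per target means a new resource only needs a new function and a configuration entry; the user's query form stays the same.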

Page 12: Lab :  federated searching: searching distributed data & searching harvested data

Universal Gateway Functions

FIND, PRESENT, COMBINE SET, FIND DUPLICATES

The Universal Gateway enables the use of basic components via API
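One way to picture these basic components is as small functions over result sets. The sketch below is a hypothetical reading of the function names on the slide, not the Universal Gateway API: the signatures, the record shape, and the duplicate rule (same title and author, ignoring case) are all assumptions.

```python
from collections import defaultdict

Record = dict[str, str]


def find(records: list[Record], term: str) -> list[Record]:
    """FIND: build a result set of records whose title contains the term."""
    return [r for r in records if term.lower() in r["title"].lower()]


def present(result_set: list[Record], start: int = 0, count: int = 10) -> list[Record]:
    """PRESENT: return one page of a result set."""
    return result_set[start:start + count]


def combine_sets(first: list[Record], second: list[Record]) -> list[Record]:
    """COMBINE SET: merge two result sets, keeping order, without exact repeats."""
    merged = list(first)
    merged.extend(r for r in second if r not in merged)
    return merged


def find_duplicates(result_set: list[Record]) -> list[list[Record]]:
    """FIND DUPLICATES: group records that share a title and author, ignoring case."""
    groups = defaultdict(list)
    for record in result_set:
        key = (record["title"].strip().lower(), record["author"].strip().lower())
        groups[key].append(record)
    return [group for group in groups.values() if len(group) > 1]


opac = [{"source": "opac", "title": "Sleep Medicine", "author": "Kryger, Meir"}]
medline = [
    {"source": "medline", "title": "sleep medicine", "author": "Kryger, Meir"},
    {"source": "medline", "title": "Sleep and memory", "author": "Doe, Jane"},
]
merged = combine_sets(find(opac, "sleep"), find(medline, "sleep"))
duplicates = find_duplicates(merged)
first_page = present(merged, start=0, count=2)
```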

Page 13: Lab :  federated searching: searching distributed data & searching harvested data

URLs

Distributed approach: MetaLib http://metalib01.exlibris-usa.com/V

Harvesting approach: ARC http://arc.cs.odu.edu/

Page 14: Lab :  federated searching: searching distributed data & searching harvested data

Pop quiz: reference linking papers

Go to http://63.70.76.27:8080/cs502/

Logon
• Box 1 : Firstname
• Box 2 : Lastname
• Box 3 : netid
• Click take

Take Quiz

Submit all responses at once

Page 15: Lab :  federated searching: searching distributed data & searching harvested data

Make-up Pop quiz: SODA and FEDORA papers

Go to http://63.70.76.27:8080/cs502/

Logon
• Box 1 : Firstname
• Box 2 : Lastname
• Box 3 : netid
• Click take

Take Quiz

Submit all responses at once