The University of Cambridge Universal Catalogue: a work in progress Patricia Killiard Head of IT Services Cambridge University Library


Page 1:

The University of Cambridge Universal Catalogue: a work in progress

Patricia Killiard
Head of IT Services
Cambridge University Library

Page 2:

Libraries in the University of Cambridge UC

University Library

Dependent libraries

• Medical Library

• Scientific Periodicals

• Squire Law Library

• Betty & Gordon Moore Library

• College libraries

• Departmental & Faculty libraries

• Affiliated Institutions

• Other libraries associated with the University

Page 3:

The Union Catalogue: Beginnings and growth

• Began in 1982 with the Union List of Serials – non-MARC records based on a printed list

• 1985: 5 libraries began contributing short records for books to a Union Catalogue

• 1987: UC first made available to the public with 53,000 records

• 2002: 90+ contributing libraries

• New contributors are still joining

• Software was written in-house and continued to be used until 2002

Page 4:

Standards ...

• Early records were subject to no bibliographic standards to encourage contributions

• Brief records due to cost of disk space in 1980s

• No authority control, even today

• Independence of colleges, faculties and departments means no overall control of standards ... consequences for the UC

• Serials records were non-MARC until 2002

Page 5:

Pre-2002 Union Catalogue Model

• Consortial model with duplicate bibliographic records

• No authority control

• Completely separate from the authority-controlled file for the University Library

• Separate Union List of Serials which was de-duplicated

• Can still be seen at http://linux01.lib.cam.ac.uk/Catalogues/OPAC/xunion.shtml

Page 6:

Pre-2002 Union Catalogue

Page 7:

Search Results in pre-2002 Union Catalogue

Page 8:

Cambridge Union List of Serials

Page 9:

Advantages and disadvantages of the old UC model

Advantages

• Ability to request preferred 3 libraries first

• Some patron functionality, e.g. patrons able to view books on loan

• Each library’s holdings could be distinguished immediately

Disadvantages

• Lack of de-duplication in the main Union Catalogue

• Large numbers of search results

• Exclusion of the University Library holdings from the UC

• Separation of serials catalogue from monographs

Page 10:

Voyager vision for Cambridge

• Single de-duplicated Universal Catalogue incorporating all public databases, bringing University Library and other databases together

• Based on authority-controlled records

• All patron functionality possible through the UC

• Libraries able to retain local rights over records and patron functionality

• Local subject headings retained

Page 11:

From Consortial Catalogue to Universal Catalogue

• Department/Faculty and College databases in Voyager have multiple owning libraries - no record sharing

• Could move to a Union Catalogue module by allowing record sharing within databases but ...
  – Requires political will
  – Is very slow since records would merge on an individual basis
  – Interim stage of merging confusing for patrons

Page 12:

Cambridge System Hardware

Universal Catalogue

Feeder databases

Web Server

Page 13:

Hardware specifications

Sun Fire 4800
4 x T3 arrays configured in 2 partner groups
2 x 4 x 750MHz CPUs
16GB memory (8GB for each domain)
Disk space is: 2 x 18GB (used for Solaris) and 2 x 9 x 36GB (in one T3 partner pair) for each domain

Domain A (Hookea) holds all production databases

Domain C (Hookec) holds UC

Web server = Sun 280R
2 x 750MHz UltraSPARC III processors
4GB memory
72GB disk

Test server = Sun 220R

Page 14:

Cambridge Voyager Databases

cambrdgedb University Library and dependent libraries

manuscrpdb Manuscript database

depfacaedb Departments & Faculties A-E

depfacfmdb Departments & Faculties F-M

depfacozdb Departments & Faculties O-Z

collandb Colleges A-N

collpwdb Colleges P-W

otherdb Affiliated Institutions

resourcedb Resource file* (non-UC)

ucdb Universal Catalog

Page 15:

De-duplication

• Indexes used:
  – 010, 020, 022, 0350, 0359

• Large proportion of records do not have ISBNs or LCCNs

• De-duplication is very loose

• Resulted in very low levels of de-duplication (3-15%)

• De-duplication may actually reduce as the file accumulates due to addition of older records without control numbers
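
As a rough illustration of control-number matching, the sketch below groups records that share a value in 010, 020, 022 or 035. It is not Voyager's duplicate detection: the record shape (a dict of MARC tag to list of values) and the names `match_keys` and `find_duplicates` are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical record shape: a dict mapping MARC tag -> list of values.
MATCH_TAGS = ("010", "020", "022", "035")


def match_keys(record):
    """Return the set of (tag, value) control-number keys for one record."""
    keys = set()
    for tag in MATCH_TAGS:
        for value in record.get(tag, []):
            value = value.strip()
            if value:
                keys.add((tag, value))
    return keys


def find_duplicates(records):
    """Group record ids that share at least one control-number key."""
    by_key = defaultdict(list)
    for rec_id, record in records.items():
        for key in match_keys(record):
            by_key[key].append(rec_id)
    # Keep only keys that occur in more than one record.
    return {key: ids for key, ids in by_key.items() if len(ids) > 1}
```

A record that carries only a title yields no keys at all, so under this kind of scheme it can never be matched, which is consistent with the low de-duplication levels reported above.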

Page 16:

Replace vs Merge in de-duplication

• Bi-directional merge profile should have been available in 2001.2 but not yet working

• Essential in order to preserve British Education Index and local subject headings in 650._4 and 650._7

• Might be used in future to preserve other fields, e.g. 856 fields
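
The difference between replace and merge can be sketched as follows. This is an illustration only, not the Voyager merge profile: fields are modelled as hypothetical `(tag, indicators, value)` tuples, and the rule that 650 fields with second indicator 4 or 7 are "protected" is taken from the slide above.

```python
def is_protected(field):
    """True for 650 fields whose second indicator is 4 or 7."""
    tag, indicators, _value = field
    return tag == "650" and len(indicators) == 2 and indicators[1] in ("4", "7")


def replace_record(existing, incoming):
    """Plain replace: the incoming record wins outright."""
    return list(incoming)


def merge_record(existing, incoming):
    """Replace, but carry over protected fields missing from the incoming record."""
    merged = list(incoming)
    for field in existing:
        if is_protected(field) and field not in merged:
            merged.append(field)
    return merged
```

With a plain replace, a local heading on the existing record is lost; with the merge variant it is appended to the incoming record.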

Page 17:

Quality Hierarchy

Leader/06  Leader/17  040$a  040$d
*          *          DLC    *
as         *          *      depfacaedb
ab         *          *      depfacaedb
as         *          *      depfacfmdb
ab         *          *      depfacfmdb
as         *          *      depfacozdb
ab         *          *      depfacozdb
as         *          *      collandb
ab         *          *      collandb
as         *          *      collpwdb
ab         *          *      collpwdb
as         *          *      otherdb
ab         *          *      otherdb
*          *          *      cambrdgedb

Page 18:

Trial UC build no. 1: Aug 2001

• First UC build with 2000.1.3 – built before remainder of system went live

• Contributing files were all test loads of data for all libraries - very slow to configure and build

• UC Phase 2 – should have had link back to holdings records but bug in 2000.1.3 prevented it from working

• Upgrade to 2000.2.1 needed to make it work (Oct 2001)

• No UB functionality

• Very generic build using only 010, 020, 022 and 035 to de-duplicate

Page 19:

Trial build no. 2: Nov 2002

• 2 databases: cambrdgedb and depfacaedb with 2001.2 Beta

• Bugs in Sysadmin affected
  – Duplicate detection profiles
  – Quality hierarchy
  – Bi-directional merge
  – Saving values in Sysadmin generally

• Build failed several times at pre-bulk stage

Page 20:

Trial no. 3: March 2003

• Began March 2003, again with 2 databases

• Early problems with matching location codes and Oracle database names

• Further pre-bulk problems

• Delayed while databases were clustered in March and upgraded to 2001.2.1 in early April

• Build completed but
  – quality hierarchy failed to work
  – bi-directional merge
  – unable to test patron functionality

Page 21:

Production build

• 21 July Initial load began with 2 databases: cambrdgedb and depfacaedb

• Indexed and reviewed at this stage

• 22 August load of remaining databases began

• 28 August load and indexing complete

• Currently under review
  – Authorities not loaded
  – UB not yet enabled
  – Bi-directional merge not yet functioning

Page 22:

De-duplication in production build

Database     Processed   Added       Discarded  Rejected  Replaced  Replacement level
Cambrdgedb   1,546,138   1,493,243   203        2,911     49,779    3.2%
Depfacaedb   412,727     339,408     397        59,397    13,523    3.3%
Collandb     481,002     260,311     9,593      136,146   74,951    15.6%
Depfacfmdb   352,619     284,674     1,419      47,660    18,866    5.3%
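
The replacement levels on this slide appear to be the number of replaced records as a percentage of processed records. That formula is an inference from the figures, not something the slide states. The short check below recomputes it; the names `BUILD_STATS` and `replacement_level` are made up for the example.

```python
# Figures transcribed from the slide; Collandb's "Replaced" value is read as 74,951.
BUILD_STATS = {
    "cambrdgedb": {"processed": 1_546_138, "replaced": 49_779},
    "depfacaedb": {"processed": 412_727, "replaced": 13_523},
    "collandb": {"processed": 481_002, "replaced": 74_951},
    "depfacfmdb": {"processed": 352_619, "replaced": 18_866},
}


def replacement_level(processed, replaced):
    """Replaced records as a percentage of processed records (assumed formula)."""
    return 100.0 * replaced / processed


for name, stats in BUILD_STATS.items():
    print(f"{name}: {replacement_level(**stats):.2f}%")
```

Under this assumption the computed values agree with the percentages on the slide to within 0.1 of a percentage point.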

Page 23:

Newton OPAC

Page 24:

UC Search Results

Page 25:

Full Record View

Page 26:

Major issues to tackle

• De-duplication of short records with no match points at present

• Authority control in a non-authority controlled environment

• Presentation of results to users:
  – Display doesn’t support multiple libraries in database: shows database name as location rather than holding library
  – Public names in OPAC need to be revised to reflect multiple libraries - 60 characters is not always sufficient

Page 27:

Short record with no de-duplication:

Page 28:

Short record de-duplication

Option 1: Additional indexes

• Creation of index solely for de-duplication purposes

• Manual matching by cataloguers

• Addition of local control number in matching records

• Accurate but extremely slow

• However, additional left-anchored indexes for de-duping, like 015 (BNB numbers), would help.

Page 29:

Short record de-duplication

Option 2:

• Combining indexes is probably the best way to tackle the very large numbers of short records

• Algorithm to combine author, title, and publication date would be ideal

Option 3:

• Upgrading all short records through retrocon projects - expensive and not justified if only purpose is de-duplication
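
One way Option 2 might look in practice is a composite match key built from author, title and publication date. The sketch below is an assumption for illustration, not a feature of Voyager: the normalisation rules and the names `normalise` and `composite_key` are invented here, and a real algorithm would need tuning against the actual short records.

```python
import re
import unicodedata


def normalise(text):
    """Lower-case, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def composite_key(author, title, pub_date):
    """Build a match key from author surname, title, and publication year."""
    # Assumes the author heading is in "Surname, Forename" form.
    surname = normalise(author.split(",")[0])
    year_match = re.search(r"\d{4}", pub_date)
    year = year_match.group(0) if year_match else ""
    return "|".join((surname, normalise(title), year))
```

Two short records for the same edition then compare equal even when capitalisation, punctuation or the form of the forename differ. The trade-off is that different editions published in the same year would also collide, so some manual review would still be needed.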

Page 30:

Serials: a special problem

• Two types of serials records:
  – Short Union List of Serials records: identical for all libraries but multiple copies in each database
  – Upgraded serials records in all department/faculty and college databases

• Need to ensure that
  – Higher quality records from departments etc. take precedence
  – Former Union List of Serials records do not diverge as they are upgraded, by controlling standards

Page 31:

Authority control in the UC

• Authority records from the University Library database will be loaded into UC

• Local authorities discarded from Voyager build

• No authorities in 7 out of 8 contributing databases

• Options?
  – Load authorities into all databases? Too much space
  – Introduce authority control into other 7 databases through Web authorities or copying authority records from cambrdgedb - problem of cleaning up existing records

Page 32:

Presentation of search results

• Patrons are interested in library holdings not database holdings

• Location Limits appear to be possible only by database not library

• May be able to work with access control groups and holdings sort groups

• Random order of MFHDs very confusing

Page 33:

Patron issues: UB environment ... but not entirely

• Full patron functionality in the UC OPAC was part of the Cambridge contract but recalls, holds and call slip requests not yet working

• Patron records from all contributing libraries display in OPAC

• Books on loan, requests, blocks, fines and fees from all libraries display in OPAC

• Circulation clustered environment

• UB installed but no reciprocal borrowing

Page 34:

Top Enhancements

• Additional tools for de-duplication, preferably allowing combinations of indexes

• Fix for the multiple MFHDs being delivered in random order - incomprehensible to the user

• ISBN matching not ignoring text after first 10 digits (problem nos. 13283, 58877, etc.)
  – 020 __ |a 0335203884 and
  – 020 __ |a 0335203884(pbk)

• Link from the UC record to the record in the contributing database would be very useful for Cambridge
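
The ISBN problem above can be illustrated with a small normalisation step that keeps only the leading 10-character ISBN from an 020 $a value before comparing. This is a sketch of the requested behaviour, not the vendor's fix; `isbn_match_key` is a hypothetical name, and the function deliberately handles only 10-character ISBNs, as on the slide.

```python
import re


def isbn_match_key(subfield_a):
    """Return the leading 10-character ISBN from an 020 $a value, or None."""
    # Drop hyphens and surrounding whitespace before looking for the number.
    compact = subfield_a.replace("-", "").strip()
    # Nine digits followed by a digit or the check character X.
    match = re.match(r"\d{9}[\dXx]", compact)
    return match.group(0).upper() if match else None
```

With this key, `0335203884` and `0335203884(pbk)` both reduce to the same value and would be treated as a match.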

Page 35:

Can be seen at:

http://hookec.lib.cam.ac.uk

University of Cambridge Universal Catalogue