1
Demystifying metadata
Ann Chapman
UKOLN
University of Bath
UKOLN is funded by Resource: The Council for Museums, Archives and Libraries, the Joint Information Systems Committee (JISC) of the Higher Education Funding Councils, as well as by project funding from the JISC and the European Union. UKOLN also receives support from the University of Bath where it is based.
2
What is metadata?
Structured data about resources
• Library catalogues
• Abstracting and indexing services
• Archival finding aids
• Museum documentation
• Community information
Carriers: MARC, HTML, SGML, XML
3
Markup languages
SGML - Standard Generalised Markup Language
- controls document formatting for publication
XML - Extensible Markup Language
- “next generation” SGML
HTML - Hyper Text Markup Language
- SGML application, controls display of web pages
Tags (usually paired) structure text into elements, e.g. headings, paragraphs, lists, etc.
<title> </title> <p> </p> <li> </li>
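As a minimal sketch of how paired tags delimit elements, the Python below uses the standard library's html.parser to list the elements found in a small fragment. The sample fragment and the class name ElementLister are assumptions made for illustration; they are not taken from the slides.

```python
from html.parser import HTMLParser


class ElementLister(HTMLParser):
    """Collect (tag, text) pairs for each paired element in a fragment."""

    def __init__(self):
        super().__init__()
        self.open_tags = []   # stack of [tag, text so far] for open elements
        self.elements = []    # completed (tag, text) pairs, in closing order

    def handle_starttag(self, tag, attrs):
        self.open_tags.append([tag, ""])

    def handle_data(self, data):
        # Text belongs to the innermost open element, if there is one.
        if self.open_tags:
            self.open_tags[-1][1] += data

    def handle_endtag(self, tag):
        # Only close an element when the end tag matches the innermost start tag.
        if self.open_tags and self.open_tags[-1][0] == tag:
            name, text = self.open_tags.pop()
            self.elements.append((name, text.strip()))


# Hypothetical sample text, made up for this sketch.
fragment = (
    "<h1>Demystifying metadata</h1>"
    "<p>Structured data about resources</p>"
    "<li>Library catalogues</li>"
)

lister = ElementLister()
lister.feed(fragment)
lister.close()
for name, text in lister.elements:
    print(name, "->", text)
```

Each start tag opens an element and the matching end tag closes it, which is the sense in which the tags structure the text.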
4
MARC - structure
• Structured format
• Numeric and alpha tags
• Fixed fields
• Leader, 001-008, 010-099
• Variable fields
5
MARC – elements
1XX Main entry
2XX Title, statement of responsibility, edition, publication
3XX Physical description
4XX Series
5XX Notes
6XX Subject access
7XX Added entries
8XX Added entries for series
9XX References and local fields
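The block scheme above can be sketched as a simple lookup on the first digit of a three-digit tag. This is an illustration only: the function name describe_tag and the wording of the 0XX entry (taken from the structure slide) are assumptions, and real MARC documentation defines each field individually.

```python
# Field blocks as listed on the slides, keyed by the first digit of the tag.
MARC_BLOCKS = {
    "0": "Fixed fields (001-008, 010-099)",
    "1": "Main entry",
    "2": "Title, statement of responsibility, edition, publication",
    "3": "Physical description",
    "4": "Series",
    "5": "Notes",
    "6": "Subject access",
    "7": "Added entries",
    "8": "Added entries for series",
    "9": "References and local fields",
}


def describe_tag(tag: str) -> str:
    """Return the block description for a three-digit numeric MARC tag."""
    if len(tag) != 3 or not (tag.isascii() and tag.isdigit()):
        raise ValueError(f"expected a three-digit tag, got {tag!r}")
    return MARC_BLOCKS[tag[0]]


for example in ("008", "245", "650"):
    print(example, "->", describe_tag(example))
```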
6
ONIX - structure
• Carrier - XML
• Primary use
  • publishers to Internet booksellers
  • rich product information
• In use
  • first version 1999
  • current version Release 2.0 (2001)
• Elements – XML reference name and tag
7
ONIX - elements
• Message header
• Product record
  • identifiers, author, title, edition, language, subject, audience, descriptions, publisher, dates
  • territorial rights, dimensions, suppliers, availability, promotions
• Main series and sub series records
8
ONIX record
<ISBN> 0123456789 </ISBN>
<DistinctiveTitle> Alice in Wonderland </DistinctiveTitle>
<Contributor>
<ContributorRole> Author </ContributorRole>
<PersonNameInverted> Carroll, Lewis </PersonNameInverted>
</Contributor>
<PublisherName> Collins </PublisherName>
<PublicationDate> 2000 </PublicationDate>
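Because ONIX is carried in XML, a record like the one above can be read with any XML parser. The sketch below uses Python's xml.etree.ElementTree and assumes the elements are wrapped in a single Product element so that the fragment is well-formed; the function name product_summary and the dictionary keys are illustrative choices, not part of ONIX.

```python
import xml.etree.ElementTree as ET

# The slide's elements, wrapped in a <Product> root (an assumption of this sketch).
ONIX_FRAGMENT = """
<Product>
  <ISBN> 0123456789 </ISBN>
  <DistinctiveTitle> Alice in Wonderland </DistinctiveTitle>
  <Contributor>
    <ContributorRole> Author </ContributorRole>
    <PersonNameInverted> Carroll, Lewis </PersonNameInverted>
  </Contributor>
  <PublisherName> Collins </PublisherName>
  <PublicationDate> 2000 </PublicationDate>
</Product>
"""


def product_summary(xml_text):
    """Pull a few fields out of a product record as a plain dictionary."""
    product = ET.fromstring(xml_text)

    def text_of(path):
        # Return the stripped text at `path`, or None if the element is absent.
        node = product.find(path)
        if node is None or node.text is None:
            return None
        return node.text.strip()

    return {
        "isbn": text_of("ISBN"),
        "title": text_of("DistinctiveTitle"),
        "contributor": text_of("Contributor/PersonNameInverted"),
        "role": text_of("Contributor/ContributorRole"),
        "publisher": text_of("PublisherName"),
        "date": text_of("PublicationDate"),
    }


summary = product_summary(ONIX_FRAGMENT)
for key, value in summary.items():
    print(f"{key}: {value}")
```

A parser of this kind rejects mismatched start and end tags, so element names must be spelled identically in both.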
9
Dublin Core - structure
• Simple resource discovery
• DCMES – Dublin Core Metadata Element Set
• HTML the most common ‘carrier’
• Comprises 15 elements with
  • element qualifiers
  • element encoding schemes
  • optional/mandatory elements
• Application profiles
10
Dublin Core - elements
Title
Creator
Subject
Description
Publisher
Contributor
Date
Resource Type
Format
Resource Identifier
Source
Language
Relation
Coverage
Rights
11
Dublin Core - record
<Title> Alice in Wonderland </Title>
<Creator> Lewis Carroll </Creator>
<Subject> <LCSH> Fiction </LCSH> </Subject>
<Publisher> Project Gutenberg </Publisher>
<Date> 2000 </Date>
<Format> ASCII file via FTP </Format>
<Identifier> http://promo.net/pg/….. </Identifier>
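Since HTML is named as the most common carrier, one way to picture the record above is as a set of HTML meta tags. The sketch below assumes the widely used "DC." name prefix with lowercase element names; the function name dc_meta_tags is hypothetical, and the Identifier is left out because its URL is truncated on the slide.

```python
from html import escape

# Element/value pairs from the slide's record (Identifier omitted: URL is elided).
RECORD = [
    ("Title", "Alice in Wonderland"),
    ("Creator", "Lewis Carroll"),
    ("Subject", "Fiction"),
    ("Publisher", "Project Gutenberg"),
    ("Date", "2000"),
    ("Format", "ASCII file via FTP"),
]


def dc_meta_tags(record):
    """Render (element, value) pairs as HTML <meta> tags, one per line."""
    lines = []
    for element, value in record:
        name = "DC." + element.lower()
        # Escape quotes and angle brackets so the value is safe in an attribute.
        content = escape(value, quote=True)
        lines.append(f'<meta name="{name}" content="{content}">')
    return "\n".join(lines)


print(dc_meta_tags(RECORD))
```

The tags would sit in the head of the web page being described, which is what makes HTML a carrier for the metadata.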
12
Encoded Archival Description
• EAD
• 1993 project to develop standard for machine-readable finding aids, Version 1 1998
• SGML (and XML compliant)
• Hierarchical structure of archives
  • repository, management group, fonds, series, file, item
• Possible to embed MARC elements
13
EAD - structure
<ead>
<eadheader>
</eadheader>
<frontmatter> [optional]
</frontmatter>
<archdesc>
<did>
</did>
</archdesc>
</ead>
14
EAD - elements
<eadheader> [id + bibliographic inf. for finding aid]
<archdesc> [data on a body of archival materials]
<did> [container, physical description, physical location, repository, date and title of unit]
<admininfo> [biography, scope, access, arrangement]
<controlaccess> [name, place, genre, subject, title]
</archdesc>
15
EAD record - <header>
<ead>
<eadheader>
<eadid> LKX-3042 </eadid>
<filedesc>
<titlestmt> <titleproper> Pitman Shorthand Collection Catalogue </titleproper> <author> Ann Chapman </author> </titlestmt>
<publicationstmt> <date> 1990 </date> <publisher> Bath University Library </publisher> </publicationstmt>
</filedesc> </eadheader>
16
EAD record - <archdesc>
<archdesc level="collection">
<did> <abstract> A collection of materials in and about shorthand collected by Sir Isaac Pitman and James Pitman </abstract> </did>
<controlaccess>
<subject encodinganalog="MARC650"> Shorthand </subject>
</controlaccess>
</archdesc>
</ead>
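The nesting of the EAD record above can be made visible by parsing it and printing an indented outline. The sketch below joins the two slides into one well-formed document and uses Python's xml.etree.ElementTree; the level="collection" attribute and the function names outline and subjects are assumptions of this sketch.

```python
import xml.etree.ElementTree as ET

EAD_SAMPLE = """<ead>
<eadheader>
<eadid>LKX-3042</eadid>
<filedesc>
<titlestmt><titleproper>Pitman Shorthand Collection Catalogue</titleproper>
<author>Ann Chapman</author></titlestmt>
<publicationstmt><date>1990</date>
<publisher>Bath University Library</publisher></publicationstmt>
</filedesc>
</eadheader>
<archdesc level="collection">
<did><abstract>A collection of materials in and about shorthand collected by
Sir Isaac Pitman and James Pitman</abstract></did>
<controlaccess>
<subject encodinganalog="MARC650">Shorthand</subject>
</controlaccess>
</archdesc>
</ead>"""


def outline(element, depth=0):
    """Return the element names as lines indented by nesting depth."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


def subjects(root):
    """Return (encodinganalog, text) pairs for every <subject> element."""
    return [
        (node.get("encodinganalog"), (node.text or "").strip())
        for node in root.iter("subject")
    ]


root = ET.fromstring(EAD_SAMPLE)
print("\n".join(outline(root)))
print(subjects(root))
```

The encodinganalog attribute is where the link to a MARC field is recorded, which is how MARC elements can be embedded in a finding aid.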
17
Collection Description
Schema developed May 2000
Access version for RSLP – summer 2001
Web version for Reveal – spring 2002
General attributes
Subject
Dates
Associated agents
External relationships
18
Coll.Desc. - elements
General: title, identifier, description, strength, physical characteristics, language, type, access control, accrual status, legal status, custodial history, note, location
Subject: concept, object, name, place, time
Dates: accumulation, contents
Agents: creator, owner
Relationships: sub/super collections, catalogues and descriptions, associated collections and publications
19
Coll. Desc. - record
Title: Pitman Collection
Strength: Shorthand – national collection
Phys. Desc: Printed texts and manuscripts
Lang: English, Spanish, Esperanto, ……
Access: Written request to the Librarian, Bath Univ.
Accrual: passive, deposit
Location: The Library, Bath University, Bath
Subject: Shorthand, Sir Isaac Pitman
Owner: Pitman Publishing Co.
Catalogue: Bath University OPAC
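A record in this labelled form maps naturally onto a set of attribute/value pairs. As a rough sketch, the Python below reads a few of the lines above into a dictionary; the function name parse_record is hypothetical, and the Lang line is left out because its value is elided on the slide.

```python
# A subset of the slide's record, one "Label: value" pair per line.
RECORD_TEXT = """Title: Pitman Collection
Strength: Shorthand – national collection
Phys. Desc: Printed texts and manuscripts
Accrual: passive, deposit
Location: The Library, Bath University, Bath
Owner: Pitman Publishing Co.
Catalogue: Bath University OPAC
"""


def parse_record(text):
    """Split each non-empty line at its first colon into label and value."""
    record = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        label, separator, value = line.partition(":")
        if not separator:
            raise ValueError(f"no label found in line: {line!r}")
        record[label.strip()] = value.strip()
    return record


record = parse_record(RECORD_TEXT)
for label, value in record.items():
    print(f"{label} = {value}")
```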
20
M21 Community Information
Same principles as MARC Bibliographic
Leader: individual/organization/program/event/other
Fixed fields: 001-008, 010-099
007 disability facilities
008 special aspects
Variable fields
21
M21 Comm. Inf. - elements
1XX Name
2XX Title and Address
3XX Physical description
4XX Series (for events)
5XX Notes
6XX Subject access
7XX Added entries
8XX Other variable fields
22
M21 Comm. Inf. - record
110 $a CILIP
245 $a CILIP HQ
247 $a LA HQ $f 19?? - 2002
270 $a 7 Ridgmount St, London, WC1E 7AE $k 020 7255 0505 $m [email protected] $r 9am to 6pm
311 $a Ewart Room $d seats 50 $g £100 per day
312 $a Overhead projector $f £10 per day
581 $a Library + Information Update
856 $u http://www.cilip.org.uk
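Each line of the record above is a tag followed by subfields introduced by a dollar sign and a one-character code. A minimal sketch of splitting such a line, assuming this display convention and that no subfield value itself contains a dollar sign, might look like the following; the function name parse_field is hypothetical.

```python
import re

# A subfield is "$", one code character, whitespace, then text up to the next "$".
SUBFIELD = re.compile(r"\$(\w)\s+([^$]*)")


def parse_field(line):
    """Split a displayed field into its tag and a list of (code, value) pairs."""
    tag, _, rest = line.strip().partition(" ")
    subfields = [(code, value.strip()) for code, value in SUBFIELD.findall(rest)]
    return tag, subfields


for line in (
    "311 $a Ewart Room $d seats 50 $g £100 per day",
    "312 $a Overhead projector $f £10 per day",
):
    tag, subfields = parse_field(line)
    print(tag, subfields)
```

The same tag and subfield pattern applies to MARC Bibliographic records, which is the sense in which the two formats share principles.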
23
Metadata – fit for purpose
• MARC Bibliographic
• ONIX
• Dublin Core
• EAD
• Collection description
• M21 Community Information