Using Metadata in CONTENTdm. Diana Brooking and Allen Maberry, Metadata Implementation Group, Univ. of Washington. Crossing Organizational Boundaries, Oct. 29, 2002.

Page 1: Using Metadata in CONTENTdm

Using Metadata in CONTENTdm

Diana Brooking and Allen Maberry

Metadata Implementation Group, Univ. of Washington

Crossing Organizational Boundaries

Oct. 29, 2002

Page 2: Using Metadata in CONTENTdm

Outline

• The metadata “environment”: factors that influence basic decisions

• Structure of metadata: Dublin Core, field structure in CONTENTdm

• Content standards: what goes into the fields, formatting, controlled vocabularies

• The data dictionary: bringing it all together

Page 3: Using Metadata in CONTENTdm

Metadata: what is it?

• Data about data
  – “Metadata are data that describe the attributes of a resource; characterize its relationships; support its discovery, management, and effective use; and exist in an electronic environment.” (Sherry Vellucci, LRTS 44 (1), 1999)

• Commonly known as cataloging

Page 4: Using Metadata in CONTENTdm

Metadata: how is it used?

• For description: information for display with the image

• For searching: users search for images by searching for text attached to the image

Page 5: Using Metadata in CONTENTdm

Basic Decisions: Description

• How much information do you have?

• How much information do your users need/want?
  – What is depicted in the image?
  – Who created it?
  – Why is it important? Why did you select it?

• How much detail do you need to go into?

Page 6: Using Metadata in CONTENTdm

Basic Decisions: Searching

• How will users find the images? What will they be looking for? What aspects are they interested in?

• How will you find the images? What are your staff’s needs?

• At what level do you need to distinguish images from one another?

• At what level do you need to bring like resources together?

Page 7: Using Metadata in CONTENTdm

Decision Factors

• Size of file
  – 50 images (small enough to browse)
  – 10,000 images (need for more precise searching)
  – 10,000 images of many different things vs. 10,000 images of trains

Page 8: Using Metadata in CONTENTdm

Decision Factors

• Audience
  – General public vs. specialists (e.g., railroad enthusiasts)

• Institutional mission
  – Say you are a railroad museum (audience expectations)

Page 9: Using Metadata in CONTENTdm

Decision Factors

• Legacy data
  – Starting from scratch
  – Years of good cataloging
  – Years of inconsistent cataloging

• Software issues
  – What kind of data can the system handle?
  – What are its search capabilities?
  – Short-term vs. long-term view

Page 10: Using Metadata in CONTENTdm

Basic Dublin Core Metadata

• What is the Dublin Core Metadata Element Set (DCMES)?

• Why was it developed, and how has it been developed?

• A short history of the DC Initiative is available at http://www.dublincore.org/about/overview/

Page 11: Using Metadata in CONTENTdm

Dublin Core Metadata Element Set

• There are 15 basic elements

• See Dublin Core Element Set, Version 1.1 - Reference Description

• But it is adaptable and expandable to fit the needs of different users through the use of “application profiles”
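For reference, the 15 elements of version 1.1 can be written out as a simple list. The Python sketch below does that and checks whether a given name is one of them; the helper name `is_dc_element` is an illustrative assumption, not part of any Dublin Core or CONTENTdm tooling.

```python
# The 15 elements of the Dublin Core Metadata Element Set, version 1.1.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def is_dc_element(name):
    """Return True if `name` (case-insensitive) is one of the 15 basic elements."""
    return name.strip().lower() in DC_ELEMENTS


print(len(DC_ELEMENTS))                # 15
print(is_dc_element("Creator"))        # True
print(is_dc_element("Photographer"))   # False: a local label, not a DC element
```

A local label such as “Photographer” is not itself a DC element; it only becomes interoperable once it is mapped to one.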

Page 12: Using Metadata in CONTENTdm

Dublin Core and CONTENTdm

• CONTENTdm is designed around the Dublin Core

• (Very) basic overview of how CONTENTdm works
  – CONTENTdm uses DC element names as file names
  – Because each database has constant file names, it is easy to combine them to search either one or more collections

Page 13: Using Metadata in CONTENTdm

Dublin Core mapping

• An example:
  – Collection A has a field “Photographer” mapped to DC:Creator, and Collection B has a field “Artist” mapped to DC:Creator. Searching across both databases searches the CONTENTdm index “Creat*” and retrieves data from the index for both “Photographers” and “Artists” for collections A + B or A + B + n…
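The mapping idea can be sketched in a few lines of Python. This is only an illustration of the principle, not how CONTENTdm is implemented: the collection names, field labels other than “Photographer” and “Artist”, sample records, and the function `search_element` are all invented for the example.

```python
# Hypothetical local field labels mapped to Dublin Core elements.
FIELD_MAPS = {
    "A": {"Photographer": "creator", "Caption": "title"},
    "B": {"Artist": "creator", "Caption": "title"},
}

# Invented sample records, keyed by collection.
RECORDS = {
    "A": [{"Photographer": "Smith, Jane", "Caption": "Depot at dawn"}],
    "B": [
        {"Artist": "Smith, Robert", "Caption": "Harbor study"},
        {"Artist": "Lee, Ann", "Caption": "Smith Tower"},
    ],
}


def search_element(element, term, collections):
    """Find records whose fields mapped to `element` contain `term`."""
    needle = term.lower()
    hits = []
    for name in collections:
        mapping = FIELD_MAPS[name]
        for record in RECORDS[name]:
            # Collect values from every local field mapped to the DC element.
            values = [v for f, v in record.items() if mapping.get(f) == element]
            if any(needle in v.lower() for v in values):
                hits.append((name, record["Caption"]))
    return hits


# One "creator" search reaches both Photographer (A) and Artist (B).
print(search_element("creator", "smith", ["A", "B"]))
```

The record captioned “Smith Tower” is not returned, because its match is in a field mapped to title rather than creator.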

Page 14: Using Metadata in CONTENTdm

Dublin Core and searching

• What are the practical consequences of this?
  – In cross-database searching, one can search on specific fields. However, the names of these fields will not be Photographer or Artist, but “Creator”, because that is the common name of the index in each collection.
  – However, you can do a keyword search on all “searchable” fields in the database, whether they are mapped to a Dublin Core field or not.
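The difference between a field search and a keyword search can be shown with a small Python sketch. The field setup, the sample record, and the function `keyword_fields` are assumptions made for illustration; the only point taken from the slide is that keyword searching depends on a field being searchable, not on its DC mapping.

```python
# Hypothetical field setup: label -> (DC element or None, searchable flag).
FIELDS = {
    "Photographer": ("creator", True),
    "Notes": (None, True),              # not mapped to DC, but searchable
    "Negative number": (None, False),   # neither mapped nor searchable
}

# Invented sample record.
record = {
    "Photographer": "Smith, Jane",
    "Notes": "Locomotive at the depot",
    "Negative number": "depot-0042",
}


def keyword_fields(record, keyword):
    """Return labels of searchable fields whose value contains `keyword`."""
    needle = keyword.lower()
    return [
        label
        for label, (_dc, searchable) in FIELDS.items()
        if searchable and needle in record.get(label, "").lower()
    ]


print(keyword_fields(record, "depot"))  # ['Notes']
```

“Notes” is found even though it has no DC mapping, while “Negative number” is skipped because it is not searchable.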

Page 15: Using Metadata in CONTENTdm

Modern Book Arts field labels
  – bibliographic description = descr0
  – text production = descr1
  – image production = descr2, etc.

Cross-database search index
  – Description = descr*
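One way to read the `descr*` notation is as a wildcard over internal field names. The Python sketch below illustrates that reading with the standard `fnmatch` module; the record contents and the function `values_for_index` are invented, and this is not a description of CONTENTdm internals.

```python
from fnmatch import fnmatchcase

# Invented record using the internal field names from the slide.
record = {
    "title": "Sample artist's book",
    "descr0": "Bibliographic description text",
    "descr1": "Text production notes",
    "descr2": "Image production notes",
}


def values_for_index(record, pattern):
    """Return values of all fields whose internal name matches `pattern`."""
    return [record[name] for name in sorted(record) if fnmatchcase(name, pattern)]


# A search on the shared "Description" index covers all three local fields.
print(values_for_index(record, "descr*"))
```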

Page 16: Using Metadata in CONTENTdm

Dublin Core tips

– Be careful about what information you put in searchable fields, even if they are not mapped to a DC element.

– If you have multiple collections, it is very important that the same type of data is mapped to the same DC elements consistently.

Page 17: Using Metadata in CONTENTdm

Content Standards

• Used for choosing and formatting the data that goes into the fields

• Increase coherence and intelligibility of description

• Enhance reliability of retrieval

• Enable compatibility with other collections (cross-database searching)

• Make maintenance and possible migration of data to other software easier

Page 18: Using Metadata in CONTENTdm

Standards = Consistency

• “Date” field: dates should always be formatted the same way

• “Photographer” field: same person’s name should always appear in the same form

• “Subject” field: same topic should have the same term used to describe it across images

• If different terms or formats are used, the user may not even realize that more than one search is necessary
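As an illustration of the “Date” point, the Python sketch below rewrites several common input styles into one form. The choice of ISO 8601 (YYYY-MM-DD) as the target and the list of accepted input formats are assumptions for the example; the slides do not prescribe a particular date format.

```python
from datetime import datetime

# Input styles this sketch accepts (an assumption, extend as needed).
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y", "%b. %d, %Y")


def normalize_date(raw):
    """Rewrite a date string in one of KNOWN_FORMATS as YYYY-MM-DD."""
    text = raw.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # try the next format
    raise ValueError(f"Unrecognized date format: {raw!r}")


for sample in ("Oct. 29, 2002", "10/29/2002", "October 29, 2002"):
    print(normalize_date(sample))  # 2002-10-29 each time
```

Rejecting unrecognized input, rather than guessing, keeps inconsistent dates from slipping into the field unnoticed.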

Page 19: Using Metadata in CONTENTdm

Examples of Content Standards

For description:

• Anglo-American Cataloging Rules, 2nd ed., 2002 revision (libraries)

• Graphic Materials: Rules for Describing Original Items and Historical Collections, 1982; revisions available electronically (libraries, also museums, historical societies, LC Prints & Photo., CORBIS)

Page 20: Using Metadata in CONTENTdm

Content Standards: Controlled Vocabularies

“Any subset of the lexicon of a natural language. A list of preferred and nonpreferred terms produced by the process of vocabulary control. Types of controlled vocabularies include subject heading lists and thesauri.” (NISO)
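The idea of preferred and nonpreferred terms can be sketched as a lookup table in Python. The vocabulary below is a made-up local list, not an excerpt from LCSH or any other published vocabulary, and `preferred_term` is a hypothetical helper.

```python
# Hypothetical local vocabulary: nonpreferred term (lowercase) -> preferred term.
USE = {
    "railways": "Railroads",
    "rail lines": "Railroads",
    "steam engines": "Locomotives",
}
PREFERRED = {"Railroads", "Locomotives"}


def preferred_term(term):
    """Return the preferred form of `term`, or None if it is not in the list."""
    cleaned = " ".join(term.split())  # collapse stray whitespace
    for candidate in PREFERRED:
        if candidate.lower() == cleaned.lower():
            return candidate
    return USE.get(cleaned.lower())


print(preferred_term("railways"))   # Railroads
print(preferred_term("Airplanes"))  # None: not covered by this vocabulary
```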

Page 21: Using Metadata in CONTENTdm

Controlled vocabs for which fields?

• When you need consistency across images, for user searches to find all …
  – Proper names for things (people, places, etc.)
  – Subjects depicted in the images

• Not necessary when you have…
  – Fields that contain data more likely to be unique to the particular image (title, notes, other free text fields)

Page 22: Using Metadata in CONTENTdm

Remember…

You can have fields that don’t use controlled vocabularies, but where you still need consistency in format:
  – Dates
  – Image numbers
  – Physical description

• You could create your own controlled vocab lists (if you really had to)

Page 23: Using Metadata in CONTENTdm

Controlled Vocabularies

For names:

• Library of Congress/National Authority File: http://authorities.loc.gov

• Union List of Artist Names (Getty): http://www.getty.edu/research/tools/vocabulary/ulan

• USGS Geographic Names Information System: http://geonames.usgs.gov/gnishome.html

Page 24: Using Metadata in CONTENTdm

Controlled Vocabularies

For subjects:

• Library of Congress Subject Headings: http://authorities.loc.gov

• LC Thesaurus for Graphic Materials: http://www.loc.gov/rr/print/tgm1

• Art & Architecture Thesaurus (Getty): http://www.getty.edu/research/tools/vocabulary/aat

• Chenhall’s Nomenclature (The Revised Nomenclature for Museum Cataloging. Walnut Creek: Altamira Press, 1995)

Page 25: Using Metadata in CONTENTdm

Vocabulary conflicts?

• DC Subject: LCSH vs. AAT
  – Church buildings vs. Churches

• DC Coverage: LC vs. Board of Geographic Names
  – Moscow vs. Moskva

• Challenge of meeting needs of diverse collections and users, while maintaining consistency within and between databases
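One possible way to soften such conflicts at search time is a hand-made crosswalk that expands a query term to its equivalents in other vocabularies. The Python sketch below is an assumption about how that could look, not something the slides propose; only the two term pairs come from the examples above.

```python
# Equivalent terms from the examples above, grouped in a hand-made crosswalk.
EQUIVALENTS = [
    {"Church buildings", "Churches"},  # LCSH vs. AAT
    {"Moscow", "Moskva"},              # LC vs. Board of Geographic Names
]


def expand_term(term):
    """Return `term` plus its equivalents from other vocabularies, sorted."""
    for group in EQUIVALENTS:
        if term in group:
            return sorted(group)
    return [term]  # no known equivalent; search the term as given


print(expand_term("Churches"))  # ['Church buildings', 'Churches']
print(expand_term("Moskva"))    # ['Moscow', 'Moskva']
```

A crosswalk helps retrieval across databases, but it does not replace choosing one vocabulary per field within a collection.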

Page 26: Using Metadata in CONTENTdm

Data Dictionaries

For each project a data dictionary documents:

• Database-specific field labels

• Mapping of fields to DC elements

• Data formatting instructions for each field

• Recommended controlled vocabularies

• UW data dictionaries: http://www.lib.washington.edu/msd/mig/datadicts/default.html

• MOHAI
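The four things a data dictionary documents can be captured in a small record structure. The Python sketch below is one possible shape; the class `FieldSpec`, its attribute names, and the sample entries are invented for illustration and are not taken from the UW data dictionaries.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldSpec:
    label: str                  # database-specific field label
    dc_element: Optional[str]   # mapped DC element, or None if unmapped
    formatting: str             # data formatting instructions
    vocabulary: Optional[str]   # recommended controlled vocabulary, if any


# Invented entries for a hypothetical photograph collection.
DATA_DICTIONARY = [
    FieldSpec("Photographer", "creator", "Last name, First name", "LC authority file"),
    FieldSpec("Date", "date", "YYYY-MM-DD", None),
    FieldSpec("Notes", None, "Free text", None),
]


def labels_for(element):
    """Return the local field labels mapped to a given DC element."""
    return [spec.label for spec in DATA_DICTIONARY if spec.dc_element == element]


print(labels_for("creator"))  # ['Photographer']
```

Keeping the dictionary in a structured form like this makes it easy to check that every collection maps the same kind of data to the same DC element.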