Extensible Metadata Developments in the Triangle Digital Library Project
The Problems:
• Heterogeneous data sources
– established collections
– individual collections
– widely differing metadata needs
• A wide range of catalogers
Towards a solution:
• Some standards:
– Dublin Core as “lowest common denominator”
– RDBMS as storage / management / retrieval mechanism
– XML as metadata storage / transport mechanism
Current Implementation
• Database
– Oracle
• Middleware
– Cocoon: a Web publishing framework that utilizes Java servlets and XML
Let’s restate the problem(s)
• Collections’ cataloging schemes may have more granularity than Dublin Core is capable of representing.
Example:
• VRA Core:
<object>
  <title>Artemis</title>
  <description>Artemis slaying Actaeon</description>
  …
  <type>bell krater</type>
  <material>clay</material>
  <technique>pottery – Red Figure</technique>
  …
</object>
Example:
• Dublin Core:
<object>
  <title>Artemis</title>
  <description>Artemis slaying Actaeon</description>
  <description>pottery – Red Figure</description>
  …
  <type>bell krater</type>
  <format>clay</format>
  …
</object>
– <description> doesn’t really correspond to technique
– <format> terminology will be confusing to people used to VRA
Proposed solution: Handle both DC and other schemata.
Store elements that don’t map to Dublin Core…
<object>
  <title>Artemis</title>
  <description>Artemis slaying Actaeon</description>
  <description>pottery – Red Figure</description>
  <type>bell krater</type>
  <format>clay</format>
  <meta>
    <vra:technique>pottery – Red Figure</vra:technique>
  </meta>
</object>
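One way to read such a record is to treat everything outside <meta> as Dublin Core and everything inside it as schema-specific extension data. The sketch below is illustrative only: the namespace URI and the function name split_record are assumptions (the slides show only the "vra:" prefix), and it is not the project's actual code.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; the slides show only the "vra:" prefix.
VRA_NS = "http://example.org/vra-core"

RECORD = f"""
<object xmlns:vra="{VRA_NS}">
  <title>Artemis</title>
  <description>Artemis slaying Actaeon</description>
  <description>pottery – Red Figure</description>
  <type>bell krater</type>
  <format>clay</format>
  <meta>
    <vra:technique>pottery – Red Figure</vra:technique>
  </meta>
</object>
""".strip()


def split_record(xml_text):
    """Return (core, extensions) as lists of (element name, value) pairs."""
    root = ET.fromstring(xml_text)
    core, extensions = [], []
    for child in root:
        if child.tag == "meta":
            # Everything under <meta> is schema-specific extension data.
            for ext in child:
                extensions.append((ext.tag, ext.text))
        else:
            # Everything else is treated as a plain Dublin Core element.
            core.append((child.tag, child.text))
    return core, extensions


core, extensions = split_record(RECORD)
```

ElementTree reports namespaced elements in the form {namespace-uri}name, so the extension element can be told apart from a Dublin Core element of the same local name.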
Proposed solution: Handle both DC and other schemata.
…and map DC elements to other schemas
Dublin Core       VRA Core
<title/>          <title/>
<description/>    <description/>
<type/>           <type/>
<format/>         <material/>
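The mapping in the table above can be sketched as a simple lookup from Dublin Core element names to VRA Core element names. This is a minimal illustration, not the project's implementation: the names DC_TO_VRA and crosswalk are hypothetical, and records are simplified to lists of (element, value) pairs.

```python
# Element-name crosswalk taken from the table above (Dublin Core -> VRA Core).
DC_TO_VRA = {
    "title": "title",
    "description": "description",
    "type": "type",
    "format": "material",
}


def crosswalk(record, mapping):
    """Rename the elements of a record given as (element, value) pairs.

    Elements with no entry in the mapping are returned separately
    rather than silently dropped.
    """
    mapped, unmapped = [], []
    for name, value in record:
        if name in mapping:
            mapped.append((mapping[name], value))
        else:
            unmapped.append((name, value))
    return mapped, unmapped


dc_record = [
    ("title", "Artemis"),
    ("description", "Artemis slaying Actaeon"),
    ("type", "bell krater"),
    ("format", "clay"),
]
vra_record, leftover = crosswalk(dc_record, DC_TO_VRA)
```

Returning the unmapped elements separately keeps the loss of granularity visible: anything a target schema cannot express is reported instead of disappearing.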
To accomplish this, we’ll need
• a schema repository
– to handle various schemata (both standard Dublin Core and extended in various ways)
• a crosswalk repository
– to handle transforming metadata records to different formats, such as VRA
Let’s restate the problem(s)
• Collections’ cataloging schemes may have more granularity than Dublin Core is capable of representing.
• Different disciplines have different vocabularies which may overlap.
Example:
<object>
  …
  <subject>tibia</subject>
  …
</object>
Is it (a) a leg bone? Or (b) a Roman flute?
Answer: depends on the context…
Proposed solution: Support multiple controlled vocabularies
[Diagram: vocabularies (LCSH, etc.); semantic_units (words and phrases); elements]
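The "tibia" ambiguity can be resolved by keying each term on the vocabulary it comes from. The sketch below assumes a flat lookup table; the vocabulary names ("anatomy", "classical music") and the function names are invented for illustration and do not reflect the project's actual data model.

```python
# Hypothetical vocabulary names and definitions, for illustration only.
SEMANTIC_UNITS = {
    ("anatomy", "tibia"): "leg bone",
    ("classical music", "tibia"): "Roman flute",
}


def resolve(term, vocabulary):
    """Look up a term within one controlled vocabulary; None if absent."""
    return SEMANTIC_UNITS.get((vocabulary, term.lower()))


def vocabularies_for(term):
    """List every vocabulary that defines the term, to expose ambiguity."""
    return sorted(v for (v, t) in SEMANTIC_UNITS if t == term.lower())
```

A subject value such as "tibia" then has no meaning on its own; it is interpreted only together with the vocabulary it was cataloged under.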
Let’s restate the problem(s)
• Collections’ cataloging schemes may have more granularity than Dublin Core is capable of representing.
• Different disciplines have different vocabularies which may overlap.
• Different users will have different metadata needs and desires, both in cataloging and retrieving data.
Proposed solution(s):
• Allow users to choose what cataloging schemes they wish to use for entering and retrieving information.
• Allow users to annotate objects and / or “over-write” metadata.
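One possible reading of annotating and “over-writing” is a per-user layer applied on top of the shared record, leaving the shared record untouched. The sketch below assumes that reading; the function name effective_metadata, the dict-of-lists record shape, and the sample values are all hypothetical.

```python
def effective_metadata(base, overrides=None, annotations=None):
    """Combine shared catalog metadata with one user's changes.

    base:        dict of element -> list of values (the shared record)
    overrides:   dict of element -> list of values replacing the base values
    annotations: dict of element -> list of values appended to what is there
    The shared record itself is never modified.
    """
    # Copy every value list so later changes cannot leak into `base`.
    view = {name: list(values) for name, values in base.items()}
    for name, values in (overrides or {}).items():
        view[name] = list(values)
    for name, values in (annotations or {}).items():
        view.setdefault(name, []).extend(values)
    return view


# Illustrative values only.
base = {"title": ["Artemis"], "type": ["bell krater"]}
view = effective_metadata(
    base,
    overrides={"title": ["Artemis slaying Actaeon"]},
    annotations={"type": ["vase"]},
)
```

Keeping user changes in a separate layer means each user can see a personalized view while the catalogers' original metadata remains available to everyone else.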
Model
Links
• UNC Digital Library Project
– http://www.unc.edu/projects/diglib
• 1999-2000 Digital Library Final Report
– http://www.unc.edu/projects/diglib/report.pdf
• Cocoon
– http://xml.apache.org/cocoon