1
Schemas or Vocabularies?
April 26, 2005
OASIS Symposium on the Future of XML Vocabularies
Bob DuCharme
LexisNexis
2
Outline
• review Dublin Core
• “vocabularies”
• creating vocabularies (and maybe schemas): required and optional steps
• case study: PRISM
3
Dublin Core
• Dublin Core Metadata Initiative
• dublincore.org
• DCMI Metadata Terms: elements, element refinements, encoding schemes, and vocabulary terms
• element: “A discrete unit of data or metadata. An element may contain subelements that are called qualifiers in Dublin Core.”
• creator, date, description, format, identifier…
4
“vocabulary”?
• list of words?
• DTD?
• schema?
– W3C Schema?
– RELAX NG schema?
– RDF Schema?
5
Mandatory step 1
Define your standard list of words:
• The actual words to use (PublishDate? publish-date? PubDate?)
• Their meanings.
• (optional) Value restrictions, e.g.
– formatting, such as ISO 8601 for dates (“2005-04-26T09:20”)
– list of values to choose from (Y/N, True/False, ISO 3166 country codes)
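As a rough illustration of step 1, the Python sketch below records a word list with meanings and optional value restrictions. The term names, the `Term` class, and the `check` helper are hypothetical, made up for this example; none of them come from Dublin Core or PRISM, and the date pattern covers only the shape shown above, not all of ISO 8601.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional

# Date with an optional hours-and-minutes time, e.g. 2005-04-26T09:20.
# Assumption: only this one ISO 8601 shape is accepted.
ISO_8601 = re.compile(r"\d{4}-\d{2}-\d{2}(T\d{2}:\d{2})?")


@dataclass
class Term:
    name: str                                             # the actual word to use
    meaning: str                                          # what the word means
    restriction: Optional[Callable[[str], bool]] = None   # optional value rule


# Hypothetical vocabulary: one agreed spelling per concept, each with a meaning.
VOCABULARY = {
    "PubDate": Term("PubDate", "Date the item was published",
                    lambda v: ISO_8601.fullmatch(v) is not None),
    "Reviewed": Term("Reviewed", "Whether an editor reviewed the item",
                     lambda v: v in {"Y", "N"}),
    "Title": Term("Title", "Name given to the item"),     # no restriction
}


def check(name: str, value: str) -> bool:
    """Return True if name is in the vocabulary and value obeys its rule."""
    term = VOCABULARY.get(name)
    if term is None:
        return False      # not one of the agreed words
    if term.restriction is None:
        return True       # any value is acceptable
    return term.restriction(value)
```

Here `check("PubDate", "2005-04-26T09:20")` passes, while `check("publish-date", "2005-04-26")` fails because that spelling is not in the agreed list.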
6
Example Dublin Core definition
Term Name: format
URI: http://purl.org/dc/elements/1.1/format
Label: Format
Definition: The physical or digital manifestation of the resource.
Comment: Typically, Format may include the media-type or dimensions of the resource. Format may be used to determine the software, hardware or other equipment needed to display or operate the resource. Examples of dimensions include size and duration. Recommended best practice is to select a value from a controlled vocabulary (for example, the list of Internet Media Types [MIME] defining computer media formats).
Reference: [MIME] http://www.iana.org/assignments/media-types/
Type of Term: element
Status: recommended
Date Issued: 1999-07-02
7
Optional Steps 2 and 3
• Figure out the relationships of your labeled pieces of information
• Write it down in a machine-readable form
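A minimal sketch of steps 2 and 3, assuming a single kind of relationship ("this term refines a broader one") and JSON as the machine-readable form. The term names and the `refines` key are hypothetical; a real vocabulary might use a DTD, an XML schema, or an RDF Schema instead.

```python
import json
from typing import List

# Step 2: relationships between terms. "refines" names the broader term
# that each term narrows (None for a top-level term). All names are made up.
TERMS = {
    "date": {"label": "Date", "refines": None},
    "publicationDate": {"label": "Publication Date", "refines": "date"},
    "onlinePublicationDate": {"label": "Online Publication Date",
                              "refines": "publicationDate"},
}


def broader_terms(name: str) -> List[str]:
    """Walk the "refines" links from name up to the most general term."""
    chain = []
    parent = TERMS[name]["refines"]
    while parent is not None:
        chain.append(parent)
        parent = TERMS[parent]["refines"]
    return chain


# Step 3: write the same information down in a machine-readable form.
machine_readable = json.dumps(TERMS, indent=2, sort_keys=True)
```

A system that only understands `date` can use `broader_terms` to fall back from a more specific term it does not recognize.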
8
RDF Schemas
“RDF user communities also need the ability to define the vocabularies (terms) they intend to use in those statements, specifically, to indicate that they are describing specific kinds or classes of resources, and will use specific properties in describing those resources…”
- W3C RDF Tutorial
9
Validation?
“RDF classes and properties are in some respects very different from programming language types. RDF class and property descriptions do not create a straightjacket into which information must be forced, but instead provide additional information about the RDF resources they describe.”
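One way to read the passage above, sketched in Python: a property description is used to add information about a resource, not to reject statements. The property names, class names, and the `infer_types` helper are hypothetical, and this is a simplified illustration in the spirit of rdfs:domain, not an RDF Schema reasoner.

```python
# Hypothetical property descriptions in the spirit of rdfs:domain:
# "anything that has this property is a member of this class".
PROPERTY_DOMAINS = {
    "issn": "Periodical",
    "startingPage": "Article",
}


def infer_types(statements):
    """Map each subject to the classes implied by the properties used on it.

    statements is an iterable of (subject, property, value) triples.
    Properties with no description are kept and simply add no type information.
    """
    types = {}
    for subject, prop, _value in statements:
        types.setdefault(subject, set())
        if prop in PROPERTY_DOMAINS:
            types[subject].add(PROPERTY_DOMAINS[prop])
    return types


statements = [
    ("doc1", "issn", "1234-5678"),
    ("doc1", "mood", "cheerful"),      # not described anywhere; still accepted
    ("doc2", "startingPage", "17"),
]
inferred = infer_types(statements)
```

Nothing is thrown away: the undescribed `mood` statement passes through untouched, and the descriptions only contribute the extra facts that `doc1` is a `Periodical` and `doc2` is an `Article`.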
10
Flexibility
• advantage: more systems can adapt, politically easier to sell
• disadvantage: fuzziness, more work to adopt a standard
11
PRISM
• Publishing Requirements for Industry Standard Metadata
• “Developing a standard XML metadata vocabulary for the publishing industry”
• http://www.prismstandard.org
• v 1.0: 2001; current version: 1.2
12
PRISM 1.2 “elements”
• General Purpose
– dc: identifier, title, creator, contributor, description, language, format, type
– prism: category
• Provenance
– dc: publisher, source
– prism: distributor, edition, issn, issueName, number, startingPage, volume
• Dates and Time
– prism: creationDate, expirationDate, modificationDate, publicationDate, releaseDate, receptionDate
• Subject Description
– dc: coverage, subject
– prism: event, industry, location, person, organization, section
• Relations
– prism: isCorrectionOf, hasCorrection, isPartOf, hasPart, isVersionOf, hasVersion, isFormatOf, hasFormat, references, isReferencedBy, isBasedOn, isBasisFor, isTranslationOf, hasTranslation, requires, isRequiredBy, isAlternativeFor, hasAlternative
• Rights
– dc: rights
– prism: copyright, expirationTime, releaseTime, rightsAgent
– prl: geography, industry, usage
• Controlled Vocabs
– pcv: broaderTerm, code, definition, Descriptor, label, narrowerTerm, relatedTerm, synonym, Vocabulary
• Inline Markup
– pim: event, industry, location, objectTitle, organization, person, quote
13
PRISM DTDs
• metadata vs. data:
– article titles, bylines
– identifying inline entities
• PRISM Aggregator DTD (PAM)
• Two levels of compliance
– level one: well-formed XML, dc:identifier
– level two: RDF profile, rdf:about
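The two levels can be approximated with a short check. This is a rough sketch based only on the wording of the bullets above, not on the PRISM specification: it treats "level one" as well-formed XML containing a dc:identifier element and "level two" as the presence of an rdf:about attribute. The `rough_level` function is hypothetical, and the real compliance rules are likely stricter.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def rough_level(xml_text: str) -> int:
    """Return 0, 1, or 2 using only the criteria named in the bullets."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return 0                                  # not well-formed XML
    elements = list(root.iter())                  # root and all descendants
    # Level two (as read here): some element carries an rdf:about attribute.
    about = "{%s}about" % RDF
    if any(about in el.attrib for el in elements):
        return 2
    # Level one (as read here): well-formed and contains a dc:identifier.
    identifier = "{%s}identifier" % DC
    if any(el.tag == identifier for el in elements):
        return 1
    return 0
```

Matching on namespace URIs, not on the `dc:` and `rdf:` prefixes, means the check still works if a document binds those namespaces to different prefixes.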
14
PRISM RDF Schema
• Tony Hammond, Nature Publishing Group
• under “contributed resources” on PRISM website
15
Lessons Learned
• Which works best for your industry: vocabulary, DTD, XSD, RNG…
• Layered approach a good option
• Say what you mean