1 Schemas or Vocabularies? April 26, 2005 OASIS Symposium on the Future of XML Vocabularies Bob DuCharme LexisNexis

1

Schemas or Vocabularies?

April 26, 2005

OASIS Symposium on the Future of XML Vocabularies

Bob DuCharme

LexisNexis

2

Outline

• review Dublin Core

• “vocabularies”

• creating vocabularies (and maybe schemas): required and optional steps

• case study: PRISM

3

Dublin Core

• Dublin Core Metadata Initiative

• dublincore.org

• DCMI Metadata Terms: elements, element refinements, encoding schemes, and vocabulary terms

• element: “A discrete unit of data or metadata. An element may contain subelements that are called qualifiers in Dublin Core.”

• creator, date, description, format, identifier…

4

“vocabulary”?

• list of words?

• DTD?

• schema?

– W3C Schema?

– RELAX NG schema?

– RDF Schema?

5

Mandatory step 1

Define your standard list of words:

• The actual words to use (PublishDate? publish-date? PubDate?)

• Their meanings.

• (optional) Value restrictions, e.g.

– formatting, such as ISO 8601 for dates (“2005-04-26T09:20”)

– list of values to choose from (Y/N, True/False, ISO 3166 country codes)
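
A minimal Python sketch of step 1, assuming a vocabulary is kept as a simple lookup table: each term has a name, a meaning, and an optional value restriction. The term names (PubDate, Country, Title), their meanings, and the three-code country list are illustrative assumptions, not part of any published vocabulary, and the date pattern covers only the date-plus-optional-time shape shown above rather than all of ISO 8601.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional

# Matches only YYYY-MM-DD with an optional THH:MM suffix, e.g. 2005-04-26T09:20.
# It checks the shape of the value, not whether the month or day is in range.
ISO_8601_SHAPE = re.compile(r"\d{4}-\d{2}-\d{2}(T\d{2}:\d{2})?")


@dataclass(frozen=True)
class Term:
    name: str                                         # the actual word to use
    meaning: str                                      # what the word means
    is_valid: Optional[Callable[[str], bool]] = None  # optional value restriction


def one_of(*allowed: str) -> Callable[[str], bool]:
    """Build a restriction that accepts only values from a fixed list."""
    allowed_set = frozenset(allowed)
    return lambda value: value in allowed_set


# Hypothetical vocabulary: names, meanings, and restrictions are made up.
VOCABULARY = {
    term.name: term
    for term in (
        Term("PubDate", "Date the item was published.",
             lambda v: ISO_8601_SHAPE.fullmatch(v) is not None),
        Term("Country", "Country the item is about.",
             one_of("US", "FR", "JP")),  # tiny sample of ISO 3166 codes
        Term("Title", "Name given to the item."),  # no restriction
    )
}


def check(name: str, value: str) -> bool:
    """True if the name is in the vocabulary and the value passes its restriction."""
    term = VOCABULARY.get(name)
    if term is None:
        return False
    return term.is_valid is None or term.is_valid(value)
```

Here `check("PubDate", "2005-04-26T09:20")` passes, while `check("PublishDate", "2005-04-26")` fails because that spelling is not the agreed word. Nothing in the sketch says where a term may appear, which is why structure is treated as a separate, optional step.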

6

Example Dublin Core definition

Term Name: format

URI: http://purl.org/dc/elements/1.1/format

Label: Format

Definition: The physical or digital manifestation of the resource.

Comment: Typically, Format may include the media-type or dimensions of the resource. Format may be used to determine the software, hardware or other equipment needed to display or operate the resource. Examples of dimensions include size and duration. Recommended best practice is to select a value from a controlled vocabulary (for example, the list of Internet Media Types [MIME] defining computer media formats).

Reference: [MIME] http://www.iana.org/assignments/media-types/

Type of Term: element

Status: recommended

Date Issued: 1999-07-02
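
As a rough illustration of how the term above is used, the following Python sketch writes a small metadata record with the standard library's ElementTree. The `record` wrapper element and the sample values are assumptions made for the example. The namespace is the term URI above without its final `format` segment, and the value is an Internet Media Type, following the recommended best practice in the comment.

```python
import xml.etree.ElementTree as ET

# Dublin Core element namespace; appending "format" gives the term URI.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def describe(title: str, media_type: str) -> ET.Element:
    """Build a record holding dc:title and dc:format for one resource."""
    record = ET.Element("record")  # hypothetical wrapper, not defined by Dublin Core
    ET.SubElement(record, f"{{{DC_NS}}}title").text = title
    ET.SubElement(record, f"{{{DC_NS}}}format").text = media_type
    return record


record = describe("Schemas or Vocabularies?", "application/pdf")
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

The definition only fixes the name and meaning of `dc:format`; which element wraps it is left to whoever adopts the vocabulary.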

7

Optional Steps 2 and 3

• Figure out the relationships of your labeled pieces of information

• Write it down in a machine-readable form
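
One way to picture these two steps, sketched in Python under simplifying assumptions: the relationships are reduced to "which labeled piece may appear inside which", and the machine-readable form is plain JSON rather than a DTD or schema language. The element names (`article`, `title`, `creator`, `date`, `byline`) are hypothetical.

```python
import json
import xml.etree.ElementTree as ET

# Step 2: relationships between the labeled pieces (parent -> allowed children).
STRUCTURE = {
    "article": ["title", "creator", "date"],
    "title": [],
    "creator": [],
    "date": [],
}

# Step 3: the same relationships written down in a machine-readable form.
machine_readable = json.dumps(STRUCTURE, indent=2)


def unexpected_children(element, structure):
    """Return (parent, child) tag pairs that the structure does not allow."""
    problems = []
    allowed = structure.get(element.tag, [])
    for child in element:
        if child.tag not in allowed:
            problems.append((element.tag, child.tag))
        problems.extend(unexpected_children(child, structure))
    return problems


doc = ET.fromstring("<article><title>T</title><byline>B</byline></article>")
problems = unexpected_children(doc, json.loads(machine_readable))
print(problems)
```

A real DTD or schema also expresses order, repetition, and optionality; this sketch checks containment only.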

8

RDF Schemas

“RDF user communities also need the ability to define the vocabularies (terms) they intend to use in those statements, specifically, to indicate that they are describing specific kinds or classes of resources, and will use specific properties in describing those resources…”

- W3C RDF Tutorial

9

Validation?

“RDF classes and properties are in some respects very different from programming language types. RDF class and property descriptions do not create a straightjacket into which information must be forced, but instead provide additional information about the RDF resources they describe.”
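
The contrast with validation can be sketched in a few lines of Python. Assuming triples are plain tuples and using made-up `ex:` names, the function below applies the RDFS domain rule: when a property with a declared domain is used on a resource, the resource is inferred to belong to that class. No statement is rejected. This is a single-rule, single-pass illustration, not a full RDFS reasoner.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def infer_types(triples):
    """Apply the RDFS domain rule once.

    If (P, rdfs:domain, C) and (s, P, o) are both present,
    add (s, rdf:type, C). Existing triples are never removed.
    """
    domains = {(s, o) for s, p, o in triples if p == RDFS_DOMAIN}
    inferred = set(triples)
    for s, p, o in triples:
        for prop, cls in domains:
            if p == prop:
                inferred.add((s, RDF_TYPE, cls))
    return inferred


# Hypothetical schema statement: anything with an ex:issn is an ex:Periodical.
schema = {("ex:issn", RDFS_DOMAIN, "ex:Periodical")}

# Data that never says what kind of thing ex:doc1 is.
data = {("ex:doc1", "ex:issn", "1234-5678")}

result = infer_types(schema | data)
print(sorted(result))
```

A DTD-style check would ask whether `ex:doc1` is allowed to carry an `ex:issn`; the RDF Schema reading instead adds the information that `ex:doc1` is a periodical.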

10

Flexibility

• advantage: more systems can adapt, politically easier to sell

• disadvantage: fuzziness, more work to adopt a standard

11

PRISM

• Publishing Requirements for Industry Standard Metadata

• “Developing a standard XML metadata vocabulary for the publishing industry”

• http://www.prismstandard.org

• v 1.0: 2001; current version: 1.2

12

PRISM 1.2 “elements”

• General Purpose

– dc: identifier, title, creator, contributor, description, language, format, type

– prism: category

• Provenance

– dc: publisher, source

– prism: distributor, edition, issn, issueName, number, startingPage, Volume

• Dates and Time

– prism: creationDate, expirationDate, modificationDate, publicationDate, releaseDate, receptionDate

• Subject Description

– dc: coverage, subject

– prism: event, industry, location, person, organization, section

• Relations

– prism: isCorrectionOf, hasCorrection, isPartOf, hasPart, isVersionOf, hasVersion, isFormatOf, hasFormat, References, isReferencedBy, isBasedOn, isBasisFor, isTranslationOf, hasTranslation, requires, isRequiredBy, isAlternativeFor, hasAlternative

• Rights

– dc: rights

– prism: copyright, expirationTime, releaseTime, rightsAgent

– prl: geography, industry, usage

• Controlled Vocabs

– pcv: broaderTerm, code, definition, Descriptor, label, narrowerTerm, relatedTerm, synonym, Vocabulary

• Inline Markup

– pim: event, industry, location, objectTitle, organization, person, quote

13

PRISM DTDs

• metadata vs. data:

– article titles, bylines

– identifying inline entities

• PRISM Aggregator DTD (PAM)

• Two levels of compliance

– level one: well-formed XML, dc:identifier

– level two: RDF profile, rdf:about
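
The two levels can be approximated in Python by reading the bullets above literally. This is a rough sketch, not the conformance test from the PRISM specification: it assumes level one means "parses as XML and contains a dc:identifier element" and level two means "parses as XML and some element carries an rdf:about attribute". The function name and the numeric return values are made up for the example.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def rough_compliance_level(xml_text: str) -> int:
    """Return 2, 1, or 0 under the simplified reading described above."""
    try:
        root = ET.fromstring(xml_text)  # fails if the text is not well-formed
    except ET.ParseError:
        return 0
    # Level two (assumed): any element identified with rdf:about.
    if any(f"{{{RDF_NS}}}about" in el.attrib for el in root.iter()):
        return 2
    # Level one (assumed): a dc:identifier element anywhere in the document.
    if any(el.tag == f"{{{DC_NS}}}identifier" for el in root.iter()):
        return 1
    return 0


level_one_doc = (
    f'<article xmlns:dc="{DC_NS}">'
    "<dc:identifier>example-001</dc:identifier>"
    "</article>"
)
level_two_doc = (
    f'<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}">'
    '<rdf:Description rdf:about="http://example.com/article/1">'
    "<dc:identifier>example-001</dc:identifier>"
    "</rdf:Description>"
    "</rdf:RDF>"
)

print(rough_compliance_level(level_one_doc), rough_compliance_level(level_two_doc))
```

The layering is the point: a producer can start with well-formed XML plus an identifier and move to the RDF profile later without changing the vocabulary itself.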

14

PRISM RDF Schema

• Tony Hammond, Nature Publishing Group

• under “contributed resources” on PRISM website

15

Lessons Learned

• Which works best for your industry: vocabulary, DTD, XSD, RNG…

• Layered approach a good option

• Say what you mean
