WI 4 (CWA1): Guidelines for machine-processable representation of Dublin Core Application Profiles
Pete Johnston, UKOLN, University of Bath
Thomas Baker, Fraunhofer-Gesellschaft
CEN/ISSS MMI-DC Meeting
Brussels, 22-23 September 2004
http://www.ukoln.ac.uk/
Machine-processable representation of Dublin Core Application Profiles
• Context
• Conceptual model for DCAP
• Suggested representation using RDF
Context
• Metadata "application profile"
– Recognition that implementers adapt metadata standards to context
– Use terms from multiple metadata vocabularies in combination
• CEN CWA 14855
– Guidelines for human-readable representation of DCAP
• Current draft
– Make information available in structured form, usable by applications
• Influenced by
– DCMI practice ("Grammatical Principles", Namespace Policy, declaration of metadata vocabularies, "DCMI Abstract Model")
– W3C Semantic Web activity
– Research projects on metadata schema registries
DCMI Abstract Model
• Working Draft of DC Architecture WG
• Seeks to make explicit the DC "meta-model"
– what are the component parts of any DC metadata description
– what information these components convey about the resources described by the DC metadata description
– independent of form in which DC metadata description is represented
– closely aligned with RDF meta-model
– adopts class hierarchy/property specialisation semantics of RDFS
DCMI Abstract Model
• Description as set of statements about a subject resource
• Each statement describes a relationship between the subject resource and a second resource (value)
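[Diagram: a Description holds a reference to the subject resource and one or more Statements; each Statement holds a reference to a property and a reference to a value.]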
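The description/statement structure above can be sketched in a few lines of Python. This is only an illustration of the model, not part of the DCMI draft: the class and method names are invented here, and the `example.org` URIs are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Set

DC = "http://purl.org/dc/elements/1.1/"


@dataclass(frozen=True)
class Statement:
    """One property/value pair about the subject resource."""
    property_uri: str  # reference to the property
    value_uri: str     # reference to the second resource (the value)


@dataclass
class Description:
    """A set of statements about exactly one subject resource."""
    resource_uri: str
    statements: Set[Statement] = field(default_factory=set)

    def add(self, property_uri: str, value_uri: str) -> None:
        self.statements.add(Statement(property_uri, value_uri))

    def values_of(self, property_uri: str) -> List[str]:
        """All values related to the subject by the given property, sorted."""
        return sorted(s.value_uri for s in self.statements
                      if s.property_uri == property_uri)


# Hypothetical resource and value URIs, for illustration only.
desc = Description("http://example.org/doc/1")
desc.add(DC + "language", "http://example.org/lang/en")
desc.add(DC + "language", "http://example.org/lang/fr")
desc.add(DC + "creator", "http://example.org/agent/pj")
```

Because the statements form a set, adding the same property/value pair twice has no effect, which matches the "set of statements" wording of the model.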
Fundamentals of DCAP
• DCAP does not define new terms
• DCAP references ("uses") terms already defined elsewhere
– Terms may be from multiple independently-created sources
• DCAP may describe how use of terms is "constrained", adapted, contextualised
• N.B. CWA 14855 employed "term usage"; current doc employs "property usage"
– DC differentiates different types of "term"
– Only use of properties is "constrained"
– Other types of term are referenced
• Only (or at least primarily) as part of constraints on property
Conceptual model for DCAP
• What is a DCAP?
• What are the component entities? What are the related entities?
• What are the attributes of a DCAP? And of these component and related entities?
• What types of relationship exist between these entities?
[Diagram: entity-relationship model of a DCAP and the metadata vocabularies it draws on. Entities: Agency, DCAP, PropertyUsage, BindingSchema, SchemaDocument, MetadataVocabulary, Property, Class, Instance. Relationships: administers, isDescribedIn, isExpressedBy, hasPropertyUsage, usesProperty, usesAsEncodingScheme, hasTerm, subprop, subclass, type.]
Metadata Vocabulary
A set of metadata terms (Properties, Classes, and Instances of those classes) managed as a coherent unit by an Agency
Relationships
MetadataVocabulary Is-Administered-By Agency m – 1
MetadataVocabulary Has-Member-Term Property 1 – m
MetadataVocabulary Has-Member-Term Class 1 – m
MetadataVocabulary Has-Member-Term Instance 1 – m
MetadataVocabulary Is-Described-By SchemaDocument 1 – 1
Examples: the DCMES, the DC Terms Vocabulary, the DCMI Type Vocabulary
Property
A Property is a type of relationship between two Resources.
A Property is declared as a term within exactly one Metadata Vocabulary.
A Property may be related to another property by a sub-property relationship: this states that all resources related by the first property are also related by the second property
Relationships
Property Is-Member-Term-Of MetadataVocabulary m – 1
Property Is-Subproperty-Of Property m – n
Property Is-Used-By PropertyUsage 1 – m
Examples: dc:creator, dcterms:modified, dcterms:audience
(All DCMI elements and element refinements are properties.)
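The sub-property rule can be illustrated with a small Python sketch. It assumes statements are held as plain (subject, property, value) tuples and uses the DCMI declaration of dcterms:modified as a refinement of dc:date; the function names and the example resource are invented for the illustration.

```python
DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"

# sub-property -> set of direct super-properties
SUBPROPERTY_OF = {DCTERMS + "modified": {DC + "date"}}


def super_properties(prop, table):
    """All properties implied by prop, following sub-property links transitively."""
    seen = set()
    stack = [prop]
    while stack:
        for parent in table.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def infer(statements, table):
    """Return the statements plus those entailed by sub-property relationships."""
    inferred = set(statements)
    for subject, prop, value in statements:
        for parent in super_properties(prop, table):
            inferred.add((subject, parent, value))
    return inferred


# Hypothetical resource, for illustration only.
doc = "http://example.org/doc/1"
stated = {(doc, DCTERMS + "modified", "2004-09-22")}
entailed = infer(stated, SUBPROPERTY_OF)
```

Here a single dcterms:modified statement yields an additional dc:date statement with the same subject and value, which is how an application that only understands the broader property can still use the data.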
Class
A Class is a group of resources. A Class is declared as a term within exactly one Metadata Vocabulary. A Class may be related to another class by a sub-class relationship: this states that all instances of the first Class are also instances of the second Class. A Resource is related to one or more Classes by a type relationship, and is said to be an Instance of those classes.
Relationships
Class Is-Member-Term-Of MetadataVocabulary m – 1
Class Is-Subclass-Of Class m – n
Class Has-Instance Instance m – n
Class Is-Used-As-Encoding-Scheme-By PropertyUsage m – n
Examples: dcterms:LCSH, dcterms:W3CDTF, dcmitype:Text, dcmitype:Collection
(All DCMI "encoding schemes" and type vocabulary terms are classes.)
[Diagram: the DCAP portion of the model. Entities: Agency, DCAP, SchemaDocument, BindingSchema, PropertyUsage, Property, Class. Relationships: administers, isDescribedIn, isExpressedBy, hasPropertyUsage, usesProperty, usesAsEncodingScheme.]
DC Application Profile (DCAP)
A set of Property Usages, created to meet the functional requirements of an application or context, and managed as a coherent unit by an Agency.
Relationships
DCAP Is-Administered-By Agency m – 1
DCAP Has-Member PropertyUsage 1 – m
DCAP Is-Described-By SchemaDocument 1 – 1
DCAP Is-Expressed-By BindingSchema 1 – m
Examples: the Simple Dublin Core DCAP, the RDN-DC DCAP, the Renardus DCAP
Attributes of DCAP
URI Reference: A URI Reference which identifies the DC application profile. Mandatory; Max=1.
Title: The name or title of the DC application profile. Mandatory; Max=1 (Max=unbounded if allowing for multiple languages).
Version: An indicator of the version of the DC application profile. Optional; Max=1.
Status: An indicator of the status of the DC application profile. Optional; Max=1.
Description: A summary of the scope and purpose of the DC application profile. Mandatory; Max=1 (Max=unbounded if allowing for multiple languages).
Specification: A human-readable document that provides more information about the DC application profile. Optional; Max=unbounded.
Property Usage
A Property Usage is a description of how a previously declared Property from a Metadata Vocabulary is deployed in the context of an application. A Property Usage
– must reference ("use") exactly one Property
– may provide additional documentation on how the property is interpreted in the context of this application
– may provide an application-specific label for the property
– may specify obligation for the use of statements referring to the property (whether it is mandatory, optional, conditional)
– may specify constraints on the occurrence of statements referring to the property
– may specify constraints on the permitted values of the property, by specifying that they are instances of specified classes (i.e. may specify "encoding schemes" for the property)
Property Usage
Relationships
PropertyUsage Uses Property m – 1
PropertyUsage Uses-As-Encoding-Scheme Class m – n
PropertyUsage Is-Member-Of DCAP m – 1
Examples: the usage of dc:title in the Simple Dublin Core DCAP, the usage of dc:title in RDN-DC DCAP, the usage of dc:title in Renardus DCAP
Attributes of Property Usage
Property Usage URI: A URI Reference which identifies the property usage. Mandatory; Max=1.
Label: A human-readable label assigned to the property, in the context of this DC application profile. Optional; Max=1 (Max=unbounded if allowing for multiple languages).
Status: An indicator of the status of the property usage. Optional; Max=1.
Definition: A statement of the concept and essential nature of the property, as it is used in this DC Application Profile. Optional; Max=1 (Max=unbounded if allowing for multiple languages).
Comments: Additional information about the property or its use specific to this DC Application Profile. Optional; Max=1 (Max=unbounded if allowing for multiple languages).
Attributes of Property Usage
Obligation: An indication of whether a statement using the property is required to occur in a metadata description conforming to this DC Application Profile. Mandatory; Max=1.
Condition: A description of the condition or conditions according to which a statement using the property is required to occur in a metadata description conforming to this DC Application Profile. Conditional (Mandatory if Obligation = Conditional); Max=1 (Max=unbounded if allowing for multiple languages).
Occurrences: The maximum permitted number of occurrences of statements using the property in a metadata description conforming to this DC Application Profile. Mandatory; Max=1.
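As a rough illustration of how these attributes and their obligations might be checked in software, the following Python sketch models a Property Usage as a record. The field names, the use of `None` for an unbounded maximum, and the `example.org` identifiers are assumptions of this sketch, not part of the draft.

```python
from dataclasses import dataclass, field
from typing import List, Optional

DC = "http://purl.org/dc/elements/1.1/"

# Obligation values named in the Property Usage definition.
OBLIGATIONS = {"mandatory", "optional", "conditional"}


@dataclass
class PropertyUsage:
    uri: str                           # Property Usage URI (mandatory)
    uses: str                          # URI of the one Property used
    obligation: str                    # Obligation (mandatory)
    max_occurs: Optional[int] = None   # Occurrences; None stands for "unbounded" (assumption)
    condition: Optional[str] = None    # required only when obligation is conditional
    label: Optional[str] = None
    encoding_schemes: List[str] = field(default_factory=list)

    def problems(self) -> List[str]:
        """List the attribute rules this record violates (empty if none)."""
        found = []
        if not self.uri:
            found.append("Property Usage URI is mandatory")
        if not self.uses:
            found.append("a Property Usage must use exactly one Property")
        if self.obligation not in OBLIGATIONS:
            found.append(f"unknown Obligation: {self.obligation!r}")
        if self.obligation == "conditional" and not self.condition:
            found.append("Condition is mandatory if Obligation = Conditional")
        if self.max_occurs is not None and self.max_occurs < 1:
            found.append("Occurrences must be a positive number or unbounded")
        return found


# Hypothetical usage records, for illustration only.
ok = PropertyUsage("http://example.org/ap#language", DC + "language", "optional")
bad = PropertyUsage("http://example.org/ap#date", DC + "date", "conditional")
```

The second record is flagged because it declares a conditional obligation without stating the condition.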
Representation of DCAP : XML?
• Could provide an XML DTD or XML Schema to define an XML format for a DCAP
• But the property usages in a DCAP reference existing terms
• Term descriptions already available, using RDF/RDFS (in some cases, at least!)
• Would require
– re-describing terms that are already described (or mapping existing data to new format); or
– using separate format/model for DCAP (DCAP-XML) and for metadata vocabulary (RDFS/RDF)
Representation of DCAP: XML Schema?
• An XML Schema describes constraints on the structure of a (class of) XML document(s)
• Abstract Model
– Description may be represented as records in multiple syntaxes
– May be multiple XML formats, each with different XML Schema
• A DCAP specifies the properties/classes used in a description
• So (potentially) one-to-many relation between DCAP and XML Schema
Representation of DCAP: XML Schema?
• However, XML implementers want
– to constrain structure of DC-in-XML documents during creation
– to validate structure of DC-in-XML documents post-creation
– … so need XML Schema corresponding to DCAP (for their chosen XML format)
• DCAP model is (probably?!) rich enough to generate XML Schema…
• …but N.B. that generation process requires additional information about each XML format
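A minimal Python sketch of such a generation step is shown below. It is not a proposed algorithm: the dictionary keys, the `record` type name and the `element_refs` mapping are invented for the illustration. The mapping from property URI to element name, and the choice of an ordered `xs:sequence`, stand in for the "additional information about each XML format" that the DCAP itself does not carry. The output is a skeleton only; namespace declarations and imports for the referenced elements are omitted.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
DC = "http://purl.org/dc/elements/1.1/"


def usages_to_xsd(usages, element_refs):
    """Build a skeleton XML Schema from property usages.

    usages: list of dicts with 'uses', 'obligation' and 'max_occurs' (None = unbounded).
    element_refs: format-specific mapping from property URI to qualified element name.
    """
    ET.register_namespace("xs", XS)
    schema = ET.Element(f"{{{XS}}}schema")
    record = ET.SubElement(schema, f"{{{XS}}}complexType", {"name": "record"})
    sequence = ET.SubElement(record, f"{{{XS}}}sequence")
    for usage in usages:
        ET.SubElement(sequence, f"{{{XS}}}element", {
            "ref": element_refs[usage["uses"]],
            "minOccurs": "1" if usage["obligation"] == "mandatory" else "0",
            "maxOccurs": ("unbounded" if usage["max_occurs"] is None
                          else str(usage["max_occurs"])),
        })
    return ET.tostring(schema, encoding="unicode")


# Hypothetical profile and element mapping, for illustration only.
example_usages = [
    {"uses": DC + "title", "obligation": "mandatory", "max_occurs": 1},
    {"uses": DC + "language", "obligation": "optional", "max_occurs": None},
]
example_refs = {DC + "title": "dc:title", DC + "language": "dc:language"}
xsd_text = usages_to_xsd(example_usages, example_refs)
```

A second XML format for the same DCAP would supply a different `element_refs` mapping (and possibly a different content model), giving the one-to-many relation between DCAP and XML Schema.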
Representation of DCAP: RDF?
• RDF provides simple meta-model
– Resource-property-value
• Descriptions of terms in DCMI metadata vocabularies already published using RDF
– using RDFS and DC vocabularies
• Many other significant vocabularies also available currently or will be available
• By definition DCAP references other terms
• Use of RDF facilitates merging of DCAP description and existing metadata vocabulary descriptions (and resource descriptions)
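The merging point can be sketched in Python by treating each RDF description as a set of (subject, property, value) triples, so that merging is set union. The `dcap` namespace URI and the property usage URI below are placeholders (no URIrefs had been assigned for dcap: terms), and the function name is invented for the illustration.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
DC = "http://purl.org/dc/elements/1.1/"
DCAP = "http://example.org/dcap#"  # hypothetical namespace

# Triples from an existing vocabulary description (published by the vocabulary owner).
vocabulary_graph = {
    (DC + "language", RDFS_LABEL, "Language"),
}

# Triples from the DCAP description (published by the profile owner).
dcap_graph = {
    ("http://example.org/ap#languageUsage", DCAP + "uses", DC + "language"),
}

# Merging two RDF graphs is just the union of their triples.
merged = vocabulary_graph | dcap_graph


def labels_of_used_properties(graph):
    """Map each property used by some property usage to its label, where one is known."""
    used = {o for (_, p, o) in graph if p == DCAP + "uses"}
    return {s: o for (s, p, o) in graph if p == RDFS_LABEL and s in used}
```

The label is only recoverable from the merged graph: the DCAP description alone references the property without re-describing it.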
Representation of DCAP: RDF?
• However, DCAP concept is closely associated with that of document/record/bounded description
– mandating that statement with specified property is present
– limiting number of occurrences of statements with specified property
– mandating that value of specified property is instance of specified class
• Generally, RDF applications tend to adopt "open-world" assumptions
– RDFS, OWL designed to support inferencing, rather than completeness/correctness checks (validation)
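The three kinds of record-level check listed above can be sketched as a closed-world validator in Python. This is an illustration under stated assumptions: the dictionary keys, the sample profile and the `instance_of` lookup are all invented here, and a value that is not known to be an instance of a required class is treated as a failure, which is exactly the closed-world reading that RDFS and OWL do not make.

```python
DC = "http://purl.org/dc/elements/1.1/"
RFC3066 = "http://purl.org/dc/terms/RFC3066"


def validate(description, usages, instance_of):
    """Check one bounded description against a profile's property usages.

    description: list of (property_uri, value) pairs about a single resource.
    usages: list of dicts with 'uses', 'obligation', 'max_occurs' (None = unbounded)
            and optional 'encoding_schemes' (class URIs).
    instance_of: dict mapping a value to the set of class URIs it is known to belong to.
    """
    errors = []
    for usage in usages:
        prop = usage["uses"]
        values = [v for (p, v) in description if p == prop]
        if usage["obligation"] == "mandatory" and not values:
            errors.append(f"missing mandatory {prop}")
        if usage["max_occurs"] is not None and len(values) > usage["max_occurs"]:
            errors.append(f"too many {prop}: {len(values)} > {usage['max_occurs']}")
        schemes = set(usage.get("encoding_schemes", ()))
        for value in values:
            if schemes and not (instance_of.get(value, set()) & schemes):
                errors.append(f"value {value!r} of {prop} is not an instance of a required class")
    return errors


# Hypothetical profile and class membership data, for illustration only.
profile = [
    {"uses": DC + "title", "obligation": "mandatory", "max_occurs": 1},
    {"uses": DC + "language", "obligation": "optional", "max_occurs": None,
     "encoding_schemes": [RFC3066]},
]
known_instances = {"en": {RFC3066}}
```

An open-world RDF application given the same data would not report the missing title as an error; it would simply conclude that no title is known.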
RDF representation
• Specify RDF classes and properties corresponding to the entity types, attributes, & relation types in model
• Use existing RDF vocabularies where possible
• RDF Vocabulary Description Language (RDF Schema) provides
– a semantics of class hierarchy/property specialisation
– an RDF vocabulary to represent RDFS semantics
– i.e. properties and classes to describe Properties, Classes (and Datatypes)
• DCMES/DC terms provide
– properties for many descriptive attributes
RDF representation
• RDFS has no concepts of application profile, property usage
• RDFS does not provide
– a class to represent a (Metadata) Vocabulary
• So need to provide additional classes and properties where required
– The "dcap" vocabulary
• Should provide RDFS descriptions of dcap: terms
• N.B. No URIrefs yet assigned for dcap: terms
Example
• RDN-DC
– DCAP used for record-sharing between partners in Resource Discovery Network (RDN)
– Sharing over OAI-PMH, so uses XML syntax
• Usage of dc:language
– Optional (recommended)
– Repeatable
– Requires use of RFC3066 encoding scheme
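[Diagram: RDF graph for the RDN-DC example. The node rdn:rdn-dc-dcap has rdf:type dcap:DCAP and dc:title "RDN-DC". A property usage node has rdf:type dcap:PropertyUsage, dcap:isMemberOf rdn:rdn-dc-dcap, dcap:uses dc:language (rdf:type rdf:Property) and dcap:encodingScheme dcterms:RFC3066 (rdf:type rdfs:Class).]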
<dcap:PropertyUsage>
  <dcap:uses rdf:resource="&dcns;language"/>
  <dc:description>Use the language codes defined in RFC 3066.</dc:description>
  <dcap:obligation rdf:resource="&dcapns;Obligation/recommended"/>
  <dcap:maxOccurs>Unbounded</dcap:maxOccurs>
  <dcap:encodingScheme rdf:resource="&dctermsns;RFC3066"/>
  <dcap:isMemberOf rdf:resource="http://www.rdn.ac.uk/ap/rdn_dc"/>
</dcap:PropertyUsage>
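To show how an application might read such a fragment, the Python sketch below wraps it in an `rdf:RDF` element with the entities expanded and extracts the property usage with the standard library. Two caveats apply: the `dcap` namespace URI is a placeholder, since no URIrefs had been assigned for dcap: terms, and this is plain XML processing of one fixed serialisation, not a general RDF/XML parser.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"
DCAP = "http://example.org/dcap#"  # hypothetical: no dcap: URIrefs assigned yet

# The fragment above, made self-contained by declaring namespaces and expanding entities.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:dc="{DC}" xmlns:dcap="{DCAP}">
  <dcap:PropertyUsage>
    <dcap:uses rdf:resource="{DC}language"/>
    <dc:description>Use the language codes defined in RFC 3066.</dc:description>
    <dcap:obligation rdf:resource="{DCAP}Obligation/recommended"/>
    <dcap:maxOccurs>Unbounded</dcap:maxOccurs>
    <dcap:encodingScheme rdf:resource="{DCTERMS}RFC3066"/>
    <dcap:isMemberOf rdf:resource="http://www.rdn.ac.uk/ap/rdn_dc"/>
  </dcap:PropertyUsage>
</rdf:RDF>"""


def property_usages(xml_text):
    """Yield one dict per dcap:PropertyUsage node.

    Keys are the child element names in {namespace}local form; values are the
    rdf:resource URI when present, otherwise the element's literal text.
    """
    root = ET.fromstring(xml_text)
    for node in root.findall(f"{{{DCAP}}}PropertyUsage"):
        yield {child.tag: child.get(f"{{{RDF}}}resource", child.text) for child in node}


usage = next(property_usages(DOC))
```

The resulting dictionary carries the same information as the fragment: the property used, the recommended obligation, the unbounded maximum, the RFC3066 encoding scheme and the profile the usage belongs to.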
Issues
• Choice of URIrefs for dcap: RDF vocabulary terms
• Currently, no DCMI-endorsed model for DCAP
• Proposed model is largely untested!
– But JISC IEMSR registry in development (similar data model)
• DCMI Abstract Model still work-in-progress
– Literal and non-literal values in DC metadata?
– Use of literal datatyping for syntax encoding schemes?
• DCAP for description v DCAP for description set
• CEN CWA 14855
– more "permissive" view of DCAP?