Feb 2007

Andy Powell, Eduserv Foundation

andy.powell@eduserv.org.uk

www.eduserv.org.uk/foundation

The Dublin Core Abstract Model – a packaging standard?

Content Packaging for Complex Objects: Technical Workshop

flickr photo by amalthya

Feb 2007, Content Packaging for Complex Objects: Technical Workshop

Why Dublin Core?

• well… maybe…

this workshop is about content packaging not metadata!?

DC is just 15 elements for describing Web pages isn’t it?

DC doesn’t do content packaging does it?

DC and content packaging

• this talk is about the DCMI Abstract Model…

• …and its relationship to content packaging

• it is not intended as a tutorial

• but I appreciate that the DCMI Abstract Model is new to many of you

• I will therefore start by summarising the background, context and main features of the DCAM

• then I’ll give some examples

• and finally try to draw some conclusions

http://dublincore.org/documents/abstract-model/

DCMI Abstract Model background

• in the early days of Dublin Core there was no explicit model associated with DC metadata descriptions

• there were implicit models and conventional wisdom…

– largely ‘flat’ in nature – i.e. a set of metadata elements describing a single thing (e.g. a Web page)

• and there were known problems…

– for example, it was sometimes obvious that an element was really being used to describe a second thing (e.g. the author of a Web page)

DCMI Abstract Model background

• as the various DC syntaxes matured

– XHTML, XML and RDF/XML

• the underlying model became more important

• primarily as a mechanism for mapping between syntaxes

• and there have been a number of attempts at applying the RDF model to DC

DCMI Abstract Model key features

• the DCAM (first published in 2005) attempts to make explicit the model that underpins DC

• the DCAM starts from the central notion of a ‘description set’

– a set of ‘descriptions’ about a group of related things (‘resources’)

– where each ‘description’ is about a single ‘resource’

– and where each ‘description’ is essentially made up of property/value pair ‘statements’

– ‘description sets’ are instantiated as ‘records’ (e.g. using XHTML, XML or RDF/XML) for the purpose of exchanging information between networked systems
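The containment described above (description set, description, statement) can be sketched as a few Python data classes. This is only an illustration: the class and field names are my own assumptions, not identifiers defined by the DCAM, and encoding schemes and languages are left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Statement:
    """One property/value pair; the value is given as a URI, a string, or both."""
    property_uri: str
    value_uri: Optional[str] = None
    value_string: Optional[str] = None


@dataclass
class Description:
    """A group of statements about exactly one resource."""
    resource_uri: str
    statements: List[Statement] = field(default_factory=list)


@dataclass
class DescriptionSet:
    """Descriptions of a group of related resources."""
    descriptions: List[Description] = field(default_factory=list)

    def resource_uris(self) -> List[str]:
        """List the resources this set describes, in insertion order."""
        return [d.resource_uri for d in self.descriptions]


book = Description(
    "http://example.org/mybook",
    [Statement("dcterms:hasPart", value_uri="http://example.org/chapter1")],
)
chapter = Description(
    "http://example.org/chapter1",
    [Statement("dc:title", value_string="Chapter 1")],
)
description_set = DescriptionSet([book, chapter])
print(description_set.resource_uris())
```

A ‘record’ would then be whatever serialisation of `description_set` is exchanged between systems.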

Model summary

record (encoded as HTML, XML or RDF/XML)
  description set
    description (about a resource (URI))
      statement
        property (URI)
        value (URI)
          vocabulary encoding scheme (URI)
          value string
            language (e.g. en-GB)
            syntax encoding scheme (URI)

DCAM and relationships

• the DCAM is very open about the nature of the relationships between the resources described in a description set

– whole / part (e.g. book / chapter / section / page)

– physical / digital (painting / digitised painting)

– object / human (document / author)

– conceptual / physical (work / item)

– or all of the above!

• the relationships between things are articulated in an ‘application model’ and captured using the properties specified in an ‘application profile’

Example 1 – Book application model

Book --hasPart (0..∞)--> Chapter

• here is a very simple ‘application model’…

Example 1 – pseudo-XML description set

<descriptionSet>
  <description resourceURI="http://example.org/mybook">
    <statement propertyURI="dcterms:hasPart" valueURI="http://example.org/chapter1" />
    <statement propertyURI="dcterms:hasPart" valueURI="http://example.org/chapter2" />
  </description>
  <description resourceURI="http://example.org/chapter1">
    <statement propertyURI="dc:title">
      <valueString>Chapter 1</valueString>
    </statement>
  </description>
  <description resourceURI="http://example.org/chapter2">
    <statement propertyURI="dc:title">
      <valueString>Chapter 2</valueString>
    </statement>
  </description>
</descriptionSet>
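As a rough sketch of how a record like Example 1 might be produced, the following uses Python's standard `xml.etree.ElementTree`. The element and attribute names follow the pseudo-XML on this slide, which is illustrative rather than an official DCMI XML format; the `DESCRIPTIONS` tuple layout and the `build_description_set` helper are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# (resource URI, [(property, value kind, value)]) tuples mirroring Example 1.
DESCRIPTIONS = [
    ("http://example.org/mybook", [
        ("dcterms:hasPart", "uri", "http://example.org/chapter1"),
        ("dcterms:hasPart", "uri", "http://example.org/chapter2"),
    ]),
    ("http://example.org/chapter1", [("dc:title", "string", "Chapter 1")]),
    ("http://example.org/chapter2", [("dc:title", "string", "Chapter 2")]),
]


def build_description_set(descriptions):
    """Build a <descriptionSet> element tree from the tuples above."""
    root = ET.Element("descriptionSet")
    for resource_uri, statements in descriptions:
        desc = ET.SubElement(root, "description", resourceURI=resource_uri)
        for prop, kind, value in statements:
            stmt = ET.SubElement(desc, "statement", propertyURI=prop)
            if kind == "uri":
                # The value is another resource, passed by reference.
                stmt.set("valueURI", value)
            else:
                # The value is a literal string.
                ET.SubElement(stmt, "valueString").text = value
    return root


root = build_description_set(DESCRIPTIONS)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Because the serialiser quotes every attribute value, the output is well-formed XML that another system can parse back into the same description set.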

Note 1 – objects packaged by reference

• note that objects within the package (the resources described within the description set) are passed ‘by reference’

• i.e. their URL is provided

• this is in common with other packaging standards

• passing ‘by value’ (i.e. embedding the object in-line) is theoretically possible using the DCAM ‘rich representation’ mechanism (but this is not discussed further here)

Note 2 - ordering

• the DCAM has no built-in support for ordering

• the model is graph-based rather than being an ordered tree

• for applications requiring ordering, e.g. the chapters in a book, it would therefore be necessary to invent properties (e.g. my:sequenceNumber) to capture the ordering as part of the description
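To illustrate the point about ordering, here is a small sketch in which the order of chapters is recovered from an invented property. `my:sequenceNumber` is hypothetical, as on the slide, and the flat tuple representation of statements is an assumption made for brevity.

```python
# Each tuple is (resource URI, property, value). "my:sequenceNumber" is a
# hypothetical property, not part of any DCMI vocabulary.
STATEMENTS = [
    ("http://example.org/chapter2", "dc:title", "Chapter 2"),
    ("http://example.org/chapter2", "my:sequenceNumber", "2"),
    ("http://example.org/chapter1", "dc:title", "Chapter 1"),
    ("http://example.org/chapter1", "my:sequenceNumber", "1"),
]


def ordered_resources(statements, order_property="my:sequenceNumber"):
    """Return resource URIs sorted by their numeric sequence value."""
    positions = {
        resource: int(value)
        for resource, prop, value in statements
        if prop == order_property
    }
    return sorted(positions, key=positions.get)


print(ordered_resources(STATEMENTS))
```

The statements arrive in arbitrary order, as they would from a graph; only the explicit sequence property makes the chapter order recoverable.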

Eprints application model

[Diagram: Eprints application model]

ScholarlyWork --isExpressedAs (0..∞)--> Expression --isManifestedAs (0..∞)--> Manifestation --isAvailableAs (0..∞)--> Copy

Relationships to Agent shown in the diagram: isCreatedBy, isPublishedBy, isEditedBy, isFundedBy, isSupervisedBy

Also shown: AffiliatedInstitution

• here is a more complex ‘application model’…

http://www.ariadne.ac.uk/issue50/allinson-et-al/

Example 2 – pseudo-XML

<descriptionSet>
  <description resourceURI="http://eprints.gla.ac.uk/503/">
    <statement propertyURI="dc:title">
      <valueString>Attempts to detect retrotransposition and de novo deletion of Alus and other dispersed repeats at specific loci in the human genome</valueString>
    </statement>
    <statement propertyURI="eprint:isExpressedAs" valueRef="expression1" />
  </description>
  <description resourceId="expression1">
    <statement propertyURI="eprint:isManifestedAs" valueRef="pdfmanifestation" />
  </description>
  <description resourceId="pdfmanifestation">
    <statement propertyURI="eprint:isAvailableAs"
               valueURI="http://eprints.gla.ac.uk/503/01/Eu_J._Hum_Gen.9(2)143_.pdf" />
    <statement propertyURI="eprint:isAvailableAs"
               valueURI="http://www.nature.com/ejhg/journal/v9/n2/pdf/5200590a.pdf" />
  </description>
  <!-- descriptions of the two copies here -->
</descriptionSet>
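One way to read Example 2 is as a small graph that can be walked from the scholarly work down to its available copies. The sketch below assumes a well-formed version of the pseudo-XML (quoted attributes, title omitted for brevity); the `follow` helper and the `CHAIN` list are assumptions made for illustration, not part of any eprints tooling.

```python
import xml.etree.ElementTree as ET

RECORD = """
<descriptionSet>
  <description resourceURI="http://eprints.gla.ac.uk/503/">
    <statement propertyURI="eprint:isExpressedAs" valueRef="expression1"/>
  </description>
  <description resourceId="expression1">
    <statement propertyURI="eprint:isManifestedAs" valueRef="pdfmanifestation"/>
  </description>
  <description resourceId="pdfmanifestation">
    <statement propertyURI="eprint:isAvailableAs"
               valueURI="http://eprints.gla.ac.uk/503/01/Eu_J._Hum_Gen.9(2)143_.pdf"/>
    <statement propertyURI="eprint:isAvailableAs"
               valueURI="http://www.nature.com/ejhg/journal/v9/n2/pdf/5200590a.pdf"/>
  </description>
</descriptionSet>
"""

CHAIN = ["eprint:isExpressedAs", "eprint:isManifestedAs", "eprint:isAvailableAs"]


def follow(root, start, chain):
    """Follow each property in turn, starting from one resource identifier."""
    # Index descriptions by whichever identifier they carry (URI or local id).
    index = {}
    for desc in root.iter("description"):
        index[desc.get("resourceURI") or desc.get("resourceId")] = desc
    current = [start]
    for prop in chain:
        found = []
        for key in current:
            desc = index.get(key)
            if desc is None:
                continue  # referenced resource has no description in this set
            for stmt in desc.findall("statement"):
                if stmt.get("propertyURI") == prop:
                    found.append(stmt.get("valueRef") or stmt.get("valueURI"))
        current = found
    return current


copies = follow(ET.fromstring(RECORD), "http://eprints.gla.ac.uk/503/", CHAIN)
print(copies)
```

The walk ends at the two PDF URLs, which are the objects the package passes by reference.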

Note 3 - Compound vs. complex objects

• note that the relationships between objects in this example are more complex than hasPart or isPartOf

– because the model doesn’t just deal with digital objects

• it may be worth drawing a distinction between

– ‘compound objects’ (where objects have whole / part type structural relationships) and

– ‘complex objects’ (where there are arbitrary relationships between objects)?

• most objects in digital libraries are complex… not just compound
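The tentative compound / complex distinction above could be expressed as a simple check on which properties link the resources in a description set. This is a sketch under my own assumptions: treating `dcterms:hasPart` and `dcterms:isPartOf` as the only structural properties is a simplification, and `classify` is a hypothetical helper.

```python
# Properties treated as purely structural (whole / part) for this sketch.
STRUCTURAL = {"dcterms:hasPart", "dcterms:isPartOf"}


def classify(relations):
    """Label a set of inter-resource relations as 'compound' or 'complex'.

    `relations` is an iterable of (subject, property, object) tuples, each
    linking one described resource to another.
    """
    properties = {prop for _, prop, _ in relations}
    # Compound: every relationship is whole/part. Anything else is complex.
    return "compound" if properties <= STRUCTURAL else "complex"


book = [
    ("mybook", "dcterms:hasPart", "chapter1"),
    ("mybook", "dcterms:hasPart", "chapter2"),
]
eprint = [
    ("work", "eprint:isExpressedAs", "expression1"),
    ("expression1", "eprint:isManifestedAs", "pdfmanifestation"),
]
print(classify(book), classify(eprint))
```

Under this reading, the book in Example 1 is compound and the eprint in Example 2 is complex.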

Summary – why DC?

• DC (and the DCAM) provides a simple packaging framework

– where objects within the package are typically passed by reference

– highly flexible and extensible relationship framework between objects

– supports multiple syntax encodings

– compatible with Semantic Web (which allows for possibility of inferencing across complex objects from unknown sources)

• content packaging is largely about relationships – i.e. it is just metadata
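To make the Semantic Web point a little more concrete, the statements from Example 1 can be written out as RDF triples in N-Triples syntax. This is a hand-rolled sketch rather than a full RDF serialiser: the tuple layout and helper names are assumptions, and only minimal string escaping is done. The two namespace URIs are the standard DCMI ones.

```python
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def expand(qname):
    """Turn a prefixed name such as 'dc:title' into a full property URI."""
    prefix, local = qname.split(":", 1)
    return PREFIXES[prefix] + local


def literal(text):
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def to_ntriples(statements):
    """Render (subject, property, value, is_uri) tuples as N-Triples lines."""
    lines = []
    for subject, prop, value, is_uri in statements:
        obj = "<" + value + ">" if is_uri else literal(value)
        lines.append("<" + subject + "> <" + expand(prop) + "> " + obj + " .")
    return lines


STATEMENTS = [
    ("http://example.org/mybook", "dcterms:hasPart",
     "http://example.org/chapter1", True),
    ("http://example.org/chapter1", "dc:title", "Chapter 1", False),
]
lines = to_ntriples(STATEMENTS)
print("\n".join(lines))
```

Each DCAM statement becomes one triple, so a description set from any source can be merged with others as a single graph.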

Questions…
