the 'digital object types' issue

The ‘Digital Object Types’ Issue

Andy Jackson, BL

Where are we now, technically?

• Current digital object property architecture:
  – PLATO captures institutional requirements.
  – Can link them directly to ‘technical’ properties, e.g. those extracted via the XCL tools.
• Planets has no other ‘Digital Object Properties’.
  – Conceptual models and definitions are considered institutionally dependent.
  – Note that O.T. templates can allow common property models and definitions to emerge.
• Should we try to define them? How?

Original Testbed Property Model

• Evaluate migration services based on comparing ‘Intellectual Properties’ of the input and output.
  – Defined as Properties of Types of Intellectual Object.
  – Extract relevant Intellectual Properties from initial and final Digital Objects.
  – Intellectual Properties judged same/similar/different.
  – Evaluate service/tool based on how well the properties were preserved.
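The evaluation loop on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Testbed's actual implementation: the property names, the `judge`, `compare_properties` and `preservation_score` helpers, the 5% tolerance used for ‘similar’, and the credit given to each judgement are all invented for the example.

```python
from enum import Enum


class Judgement(Enum):
    SAME = "same"
    SIMILAR = "similar"
    DIFFERENT = "different"


# Assumed credit per judgement; the slides do not define a scoring rule.
CREDIT = {Judgement.SAME: 1.0, Judgement.SIMILAR: 0.5, Judgement.DIFFERENT: 0.0}


def judge(before, after, rel_tol=0.05):
    """Judge one property value pair (hypothetical rule)."""
    if before == after:
        return Judgement.SAME
    numeric = (int, float)
    if isinstance(before, numeric) and isinstance(after, numeric):
        # Numeric values within rel_tol of each other count as 'similar'.
        scale = max(abs(before), abs(after))
        if abs(before - after) <= rel_tol * scale:
            return Judgement.SIMILAR
    return Judgement.DIFFERENT


def compare_properties(input_props, output_props):
    """Judge every property extracted from the input object."""
    results = {}
    for name, before in input_props.items():
        if name not in output_props:
            # A property that disappeared was not preserved.
            results[name] = Judgement.DIFFERENT
        else:
            results[name] = judge(before, output_props[name])
    return results


def preservation_score(judgements):
    """Average credit over all judged properties, in the range 0..1."""
    if not judgements:
        return 0.0
    return sum(CREDIT[j] for j in judgements.values()) / len(judgements)


if __name__ == "__main__":
    # Made-up properties of an object before and after migration.
    before = {"page_count": 12, "width_px": 2480, "title": "On Formats"}
    after = {"page_count": 12, "width_px": 2400, "title": "On formats"}
    results = compare_properties(before, after)
    print(results, preservation_score(results))
```

How much credit ‘similar’ should earn, and what tolerance makes two values similar, are exactly the kind of choices that vary between institutions.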

Digital Object ‘Types’

• In TB, an interpretation of the rendered object.
  – e.g. a scanned TIFF of a journal article could be interpreted as both an Image and a Document, employing properties from both.
• For comparison, in XCL, ‘type’ would refer to the information model which encompasses the information content of some set of formats.
  – e.g. the image model which covers PNG, TIFF, etc.
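The two readings of ‘type’ can be contrasted as data structures. The sketch below is hypothetical throughout: the property names, the two lookup tables and both helper functions are assumptions for illustration and are not taken from the Testbed or from XCL.

```python
# Testbed-style view: a 'type' is an interpretation of the rendered object,
# so a single object may carry several types at once.
TB_TYPE_PROPERTIES = {
    "Image": {"width", "height", "colour_depth"},
    "Document": {"page_count", "title", "text_content"},
}


def applicable_properties(interpreted_as):
    """Union of the properties of every type the object is interpreted as."""
    props = set()
    for type_name in interpreted_as:
        props |= TB_TYPE_PROPERTIES[type_name]
    return props


# XCL-style view: a 'type' is the information model that covers the
# information content of a set of formats.
XCL_MODEL_FORMATS = {
    "image": {"PNG", "TIFF"},
}


def xcl_model_for(file_format):
    """Return the information model covering a format, or None if unknown."""
    for model, formats in XCL_MODEL_FORMATS.items():
        if file_format.upper() in formats:
            return model
    return None


if __name__ == "__main__":
    # A scanned journal article: one TIFF file, two TB interpretations.
    print(sorted(applicable_properties(["Image", "Document"])))
    print(xcl_model_for("tiff"))
```

In the first view the type is chosen by whoever interprets the rendering; in the second it follows from the file format alone.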

Significance & Objectivity

• General realisation that ‘significance’ is at least very difficult to define.

• Significance of properties differs between Planets Partners.

• Can we define ‘objective’ properties?
  – Requires a shared system of property definitions.
  – Attach ‘significance’ to these properties in PLATO.
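One way to picture the split between shared ‘objective’ definitions and institution-specific ‘significance’ is a common property dictionary with per-institution weights. The sketch below is an assumption-laden illustration: the property names, the weights and the `weighted_score` helper are invented, and it does not reproduce how PLATO stores or evaluates requirements.

```python
# Shared, institution-independent property definitions (hypothetical).
SHARED_PROPERTIES = {
    "image_width": "Width of the rendered image in pixels",
    "colour_depth": "Bits per pixel of the rendered image",
    "text_content": "Character content of the rendered text",
}


def weighted_score(preserved, significance):
    """Combine objective measurements with institution-specific weights.

    preserved:    property name -> degree of preservation in 0..1
    significance: property name -> non-negative weight set by an institution
    """
    unknown = set(significance) - set(SHARED_PROPERTIES)
    if unknown:
        raise ValueError(f"not in the shared definitions: {sorted(unknown)}")
    total = sum(significance.values())
    if total == 0:
        return 0.0
    weighted = sum(w * preserved.get(name, 0.0) for name, w in significance.items())
    return weighted / total


if __name__ == "__main__":
    # One set of objective measurements for a single migration result...
    preserved = {"image_width": 1.0, "colour_depth": 0.0, "text_content": 1.0}
    # ...scored under two made-up significance profiles.
    text_focused = {"text_content": 3, "image_width": 1}
    image_focused = {"colour_depth": 3, "image_width": 1}
    print(weighted_score(preserved, text_focused))
    print(weighted_score(preserved, image_focused))
```

The same measurements give different scores under the two profiles, which is the point of keeping the definitions shared and the significance local.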

Requirements For Intellectual Properties

• Requirements
  – Must pertain to the rendition, i.e. transcend file format.
  – Must be institutionally independent, -> ‘objective’.
  – Must be self-consistent, implying one system of intellectual properties per rendition ‘type’.
  – Must be concrete, i.e. measurable in principle.
  – Must be capable of completely describing the rendered form (required in order to be ‘objective’, to be capable of verifying ‘authenticity’ for all definitions of ‘significant’).

• This has proven very difficult!

Problems with that model

• Not clear if some of those requirements are even possible to meet in principle.

• Clearly a very large amount of work.
  – Even if achieved, does not include enough technical properties that are of interest.
    • e.g. format properties with non-trivial consequences. Does service preserve the method of compression?
    • e.g. properties of services (speed etc), agents, technical environments, etc.
• So, where should we focus our efforts?

What kind of properties?

• If we do want higher-level properties…
  – ‘Type’ Properties?
  – Properties of the Rendition?
• What degree of coverage, of what?
  – Aim for completeness, but narrowly focussed?
  – Sparse, but concentrate on risks and flaws?
• Explicitly define comparative properties?
  – Don’t just want to compare properties extracted from files.
  – e.g. define a property that means the RMS difference between two images when rendered (cf. XCL).
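A comparative property such as the RMS difference takes two renditions as input, rather than being extracted from each file separately. A minimal sketch follows, assuming both images have already been rendered to greyscale pixel grids of equal size; the `rms_difference` name and the list-of-rows representation are choices made for the example, not XCL's definition.

```python
import math


def rms_difference(image_a, image_b):
    """Root-mean-square difference between two rendered greyscale images.

    Each image is a list of rows, each row a list of pixel values.
    """
    if len(image_a) != len(image_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(image_a, image_b)
    ):
        raise ValueError("images must have identical dimensions")
    squared = [
        (a - b) ** 2
        for row_a, row_b in zip(image_a, image_b)
        for a, b in zip(row_a, row_b)
    ]
    if not squared:
        raise ValueError("images must contain at least one pixel")
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    original = [[0, 0], [0, 0]]
    migrated = [[2, 2], [2, 2]]
    print(rms_difference(original, migrated))  # 2.0
```

A value of 0 means the two renderings are pixel-identical; what threshold still counts as ‘similar’ would again be an institutional decision.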

Ideas for how to do it.

• Use DOPWG(?) to pool knowledge only?
  – Park it until Planets 2.
• Use DOPWG as seed for long-term working group?
• Use DOPWG to monitor PLATO templates for emergence of common property models?
  – Look for commonality and encourage convergence?
• Use DOPWG to generate a property model?
  – Produce a document, XML Schema, an ontology, a dictionary, or PLATO O.T. template?

Summary

• Should we attempt to define common ‘Digital Object Properties’ in Planets (1)?
  – What kind of properties?
    • ‘Object’ or ‘Rendering/Performance/BetterWord’?
    • ‘Comparative’ as well as ‘of-an-instance’?
  – How to proceed?
    • Observe, steer, or control?
    • Define inside or outside PLATO? Augment ontology?
    • Needs well-defined objectives and allocated effort!
