A Motivating Scenario for Designing an Extensible Audio-Visual Description Language
Monday 25th of October, 2004
Raphaël Troncy, Jean Carrive, Steffen Lalande and Jean-Philippe Poli



Raphaël Troncy CoRIMedia - 10/25/2004 2

Description of the AV content

• Various uses / different granularity:
  – identification of the content creator and the content provider: Dublin Core metadata, VRA Core Categories, TV Anytime metadata …
  – feature extraction from the video signal: storing and exchanging the results of automatic tools (MPEG-7)
  – structural decomposition into video segments corresponding to a logical structure of the program: time-code, spatial coordinates
  – semantic description of these segments: controlled vocabulary, thesaurus, free-text annotation


Description of the AV content (cultural heritage point of view)

• Segmentation
  – locate and date some events
• Description
  – type each segment with an AV genre
  – type each segment with a general theme
  – give hints on the production
  – describe the scene (who, when, where, what, …)

[Figure: a segment on the time axis t, annotated with "report", "athletics", "fade in/out" and the text "Michael Johnson smashed the 200m world record, running the 200m in 19''32 in Atlanta at the Olympic Games"]

⇒ needs a powerful description language


Motivating scenario

• Generic application for manually describing TV programs w.r.t.:
  – structural constraints: patterns represent the logical structure of a document
  – semantic constraints: the description of the content is machine understandable
• Let us define the temporal structure of a Sports Magazine


MPEG-7, the natural candidate description language?

• ISO standard since December 2001
• Main components:
  – Descriptors (Ds) and Description Schemes (DSs)
  – DDL (XML Schema + extensions)
• Concerns all types of media

[Figure: overview of MPEG-7 Part 5 (MDS). Basic elements: Schema Tools, Basic datatypes, Links & media localization, Basic Tools. Content management: Media, Creation & Production, Usage. Content description: Structural aspects, Semantic aspects. Navigation & Access: Summaries, Views, Variations. Content organization: Collections, Models. User interaction: User Preferences, User History.]


MPEG-7: a non-suitable description language for this scenario

1. A non-extensible language
   • closed set of descriptors
2. An exchange syntax rather than a real machine-processable multimedia description language
   • non object-based data model
   • non-modular language (universal approach)
3. No formal semantics provided
   • applications cannot access the meaning of the documents

⇒ is the DDL (XML Schema) at fault?


MPEG-7: a non-suitable description language for this scenario

⇒ how to reconcile object-oriented semantic expression with structural validation (the critical issue)?

• How to define new descriptors?
• How to define new description schemes?
• How to make the description machine understandable?


Our proposition: AVDL

• AVDL: a reduced yet extensible audio-visual description language
  – an object meta-model (an instance model specifies the vocabulary for, and the rules followed by, the descriptions)
  – an XML syntax
  – a semantics (close to description logics (DL) for the descriptors)
• Description Schemes
  – Descriptors
  – Properties
  – Structures
• Descriptions
  – valid instances w.r.t. description schemes
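The split between description schemes (vocabulary and rules) and descriptions (valid instances) can be illustrated with a small sketch. It is written in Python for brevity, not in the C# of the actual implementation, and every class and method name below is a hypothetical stand-in rather than part of AVDL; structures are left out to keep it short.

```python
from dataclasses import dataclass, field


@dataclass
class Property:
    """A named attribute of a descriptor with a primitive range (e.g. int)."""
    name: str
    range_type: type


@dataclass
class Descriptor:
    """A descriptor declared in a description scheme."""
    name: str
    properties: list = field(default_factory=list)


@dataclass
class DescriptionScheme:
    """Vocabulary (descriptors) plus the rules that descriptions must follow."""
    descriptors: dict = field(default_factory=dict)

    def add(self, descriptor):
        self.descriptors[descriptor.name] = descriptor

    def validate(self, descriptor_name, values):
        """Return a list of error messages; an empty list means 'valid'."""
        descriptor = self.descriptors.get(descriptor_name)
        if descriptor is None:
            return [f"unknown descriptor: {descriptor_name}"]
        declared = {p.name: p for p in descriptor.properties}
        errors = []
        for name, value in values.items():
            if name not in declared:
                errors.append(f"unknown property: {name}")
            elif not isinstance(value, declared[name].range_type):
                errors.append(f"bad type for {name}")
        return errors


# Hypothetical scheme: one descriptor with one integer property.
scheme = DescriptionScheme()
scheme.add(Descriptor("Tracking", [Property("nbDetection", int)]))

ok = scheme.validate("Tracking", {"nbDetection": 1})
bad = scheme.validate("Tracking", {"nbDetection": "one", "speed": 3})
```

Here `ok` is an empty list, while `bad` reports one ill-typed property and one property that the scheme never declared.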


The meta class level


The class level


Location


Document, Content and Media

• Distinction:
  – Document vs Content vs Media
  – Virtual content vs physical content
• Media: a content abstraction for decomposition
  – audio tracks, subtitles


Defining Structures

• A structure defines how the descriptors may, and must, be combined
  – allows the descriptions to be controlled
  – allows the descriptions to be completed automatically
• AVDL provides some predefined structure models
  – containment: gives the list of the possible sub-segments of an AV segment (in space and in time)
  – regular expression: by analogy with a grammar, for temporal succession
• Other models are currently being studied: temporal constraints, etc.
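As an illustration of the regular-expression structure model, the sketch below checks the temporal succession of segments in a sports magazine against a pattern. The segment types and the grammar itself are assumptions made up for this example; the slides do not give the actual pattern used for the scenario.

```python
import re

# Hypothetical segment types for a sports magazine, each mapped to one symbol.
SYMBOLS = {"Opening": "O", "Studio": "S", "Report": "R", "Interview": "I", "Closing": "C"}

# Hypothetical grammar: an opening, then one or more blocks made of a studio
# sequence followed by at least one report or interview, then a closing.
SPORTS_MAGAZINE = re.compile(r"O(S[RI]+)+C")


def is_valid(segments):
    """Check that the temporal succession of segment types matches the pattern."""
    try:
        word = "".join(SYMBOLS[s] for s in segments)
    except KeyError:
        return False  # a segment type outside the declared vocabulary
    return SPORTS_MAGAZINE.fullmatch(word) is not None


valid = is_valid(["Opening", "Studio", "Report", "Studio", "Interview", "Report", "Closing"])
missing_studio = is_valid(["Opening", "Report", "Closing"])
```

Encoding each segment type as a single symbol keeps the pattern readable; the same idea supports completion, since the pattern also tells which segment types may come next.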


AVDL Implementation

• XML Serialization
  – Independent from a schema language
  – Uses XML Schema validation (mainly for datatypes)
• C#
  – Object inheritance
  – Use of .NET reflection


XML Serialization

[Figure: XML serialization. Boxes: Audio-Visual Description Language (avdl.xsd), Description Schemes (ds-17.xml), Descriptions (d-162.xml), ds-17.xsd. Arrow labels: "transformation", "partial control" (twice).]


XML Syntax (DS)

<!-- Descriptor "Tracking": refers to its property and its structure by id -->
<Descriptor xsi:type="LocatedDescriptorType" id="id-d2" name="Tracking">
  <Property ref="id-p2"/>
  <Structure ref="id-s2"/>
  <DescriptionRelationship characterization="string">
    <Location type="TemporalInterval"/>
    <Media type="Media"/>
  </DescriptionRelationship>
</Descriptor>

<!-- Property "nbDetection": integer-valued, with the Tracking descriptor as domain -->
<Property id="id-p2" name="nbDetection">
  <Domain descriptor="id-d2"/>
  <Range>
    <Primitive nameType="int"/>
  </Range>
</Property>

<!-- Structure: a temporal sequence of zero or more Detection elements -->
<Structure id="id-s2" name="TrackingStructure">
  <FormalModel>
    <Constraint type="temporal" validation="full" method="system" parser="XMLSchema">
      <xsd:sequence minOccurs="0" maxOccurs="unbounded">
        <xsd:element name="Detection" type="DetectionType"/>
      </xsd:sequence>
    </Constraint>
  </FormalModel>
</Structure>


XML Syntax (Descriptions)

<Tracking type="LocatedDescriptorType" nbDetection="1">
  <DescriptionRelationship>
    <Location>
      <avdl:Begin timeRef="147329280"/><avdl:End timeRef="147329280"/>
    </Location>
    <Media id="CPB86006610.mpg" name="CPB86006610.mpg" contentID="CPB86006610.mpg"/>
  </DescriptionRelationship>
  <Structure constraintType="temporal">
    <Detection type="LocatedDescriptorType" nbFeature="1">
      <DescriptionRelationship>
        <Location>
          <avdl:Instant timeRef="147329280"/>
        </Location>
        <Media id="CPB86006610.mpg" name="CPB86006610.mpg"
               contentID="CPB86006610.mpg" frameHeight="288" frameWidth="352"/>
      </DescriptionRelationship>
      <Structure constraintType="spatial">
        <Feature xsi:type="FaceType">
          <DescriptionRelationship>
            <Location>
              <avdl:BoundingBox>
                <avdl:NE numX="92" denX="352" numY="217" denY="288"/>
                <avdl:NW numX="92" denX="352" numY="267" denY="288"/>
                <avdl:SE numX="136" denX="352" numY="217" denY="288"/>
                <avdl:SW numX="136" denX="352" numY="267" denY="288"/>
              </avdl:BoundingBox>
            </Location>
            ...
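The bounding-box corners above are given as numerator/denominator pairs whose denominators match the frame size (352 x 288). The sketch below reads such a box and converts it to pixel coordinates, on the assumption that each pair is a fraction of the frame width or height. The namespace URI is hypothetical, since the slides never show the one bound to the `avdl` prefix, and the sample is a reduced, well-formed excerpt rather than the full description.

```python
import xml.etree.ElementTree as ET
from fractions import Fraction

# Hypothetical namespace URI; the slides never show the one bound to "avdl".
AVDL_NS = "urn:example:avdl"

SAMPLE = f"""
<Feature xmlns:avdl="{AVDL_NS}">
  <Location>
    <avdl:BoundingBox>
      <avdl:NE numX="92" denX="352" numY="217" denY="288"/>
      <avdl:NW numX="92" denX="352" numY="267" denY="288"/>
      <avdl:SE numX="136" denX="352" numY="217" denY="288"/>
      <avdl:SW numX="136" denX="352" numY="267" denY="288"/>
    </avdl:BoundingBox>
  </Location>
</Feature>
"""


def corners(xml_text, frame_width, frame_height):
    """Return {corner name: (x, y) in pixels}.

    Assumes num/den expresses a fraction of the frame width or height.
    """
    root = ET.fromstring(xml_text)
    box = root.find(f".//{{{AVDL_NS}}}BoundingBox")
    result = {}
    for corner in box:
        name = corner.tag.split("}")[1]  # strip the "{namespace}" prefix
        x = Fraction(int(corner.get("numX")), int(corner.get("denX"))) * frame_width
        y = Fraction(int(corner.get("numY")), int(corner.get("denY"))) * frame_height
        result[name] = (int(x), int(y))
    return result


pixels = corners(SAMPLE, frame_width=352, frame_height=288)
```

Storing coordinates as fractions would keep a description usable when the same content is available at another resolution, which may be why the denominators are stored alongside the numerators.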


.NET implementation

[Figure: .NET implementation. Labels: Description Schemes (ds-17.xml), Descriptions (d-162.xml), ds-17.dll, Memory. Arrow labels: "parsing" (twice), "read/write", ".NET instantiation".]


Two kinds of applications

• Static Description Schemes
  – DSs are well known
  – The developer uses generated libraries
• Dynamic Description Schemes
  – DSs are created by the application
  – Use of the dynamic instantiation mechanism (reflection) of .NET
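The dynamic case can be pictured with a rough analogue in Python, where a class is built at run time from a descriptor definition. This is only a sketch of the idea: the actual implementation relies on .NET reflection in C#, and the helper `make_descriptor_class` is a hypothetical name, not part of the AVDL API.

```python
def make_descriptor_class(name, properties):
    """Build a class at run time from a descriptor name and {property: type}."""

    def __init__(self, **values):
        for prop, value in values.items():
            if prop not in properties:
                raise AttributeError(f"{name} has no property {prop!r}")
            if not isinstance(value, properties[prop]):
                raise TypeError(f"{prop} expects {properties[prop].__name__}")
            setattr(self, prop, value)

    return type(name, (object,), {"__init__": __init__, "properties": properties})


# A static application would import a pre-generated class instead; here the
# class is created from a description scheme known only while the program runs.
Tracking = make_descriptor_class("Tracking", {"nbDetection": int})
track = Tracking(nbDetection=1)
```

In the static case the equivalent class would come from a generated library, so mistakes surface at compile time; the dynamic case trades that check for the ability to handle schemes created by the application itself.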


Carrying out the scenario

• Definition of new descriptors and properties
  – associating behavior with the corresponding classes
  – performing reasoning on the descriptions with the formal definitions in OWL
• Definition of logical and temporal structures
  – the description is controlled and validated by a grammar


Conclusion and Future Work

• AVDL: a reduced yet extensible Audio-Visual Description Language
  – descriptors, properties, structures
  – XML syntax and DL semantics
  – .NET implementation and APIs
• About structure validation:
  – which constructors to use? with which semantics?
• Trade-off: expressivity vs computability
  – OWL Full is undecidable
  – constraint satisfaction problems can be complex