FIB - Barcelona School of Informatics Universitat Politècnica de Catalunya (UPC)

Barcelona, Spain

Metadata Interoperability with JPSearch

Nicos Demetriou

Master in Computer Architecture Networks and Systems

Barcelona, Spain July 3, 2013

Outline

1. Introduction

2. State of the art

3. Problem definition

4. Resolving the problem

5. Experiments

6. Conclusion

Nicos Demetriou: Metadata Interoperability with JPSearch 2

1. Introduction

The increasing number of digital images causes:

Organization issues

Search and retrieval difficulties

The need to semantically describe them (annotations)

Portability problems

Metadata: “Data about data”

Plenty of metadata standards (MPEG-7)

JPSearch

New standard from JPEG

Tries to solve the aforementioned problems

Still improving


Introduction

Goal

Embed JPSearch data into image files

Easily create translation rules

Evaluate JPSearch standard

Possible shortcomings and improvements

Does JPSearch resolve the problem?

Research and application work

2. State of the art

Metadata

Describes the characteristics of a resource

Distinguished from the main content of the resource

E.g. Image {Content = Pixels, Metadata = Properties}

Improve organizing resources

Search mechanism

Embedded in the resource or stored externally

Digital images

Many image file formats, such as JPEG and PNG

Contain metadata, method differs

Native metadata (e.g. EXIF)

Metadata in XML format (e.g. XMP)

Handful of metadata standards


State of the art

Joint Photographic Experts Group (JPEG)

Lossy compression / adjustable compression

Standards: JFIF, EXIF and ICC profile (color space)

JPEG Interchange Format (JIF) byte layout

Segments of JPEG: Application Markers (0xFFEn)

JFIF – APP0

EXIF – APP1

XMP – APP1

ICC – APP2

JPSearch – APP3

Photoshop – APP13
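The application-marker layout listed above can be inspected with a few lines of code. The following is a minimal Python sketch and not part of the original slides: the function name `list_app_segments` and the hand-built sample bytes are illustrative assumptions. It relies only on the general JPEG segment layout (a 0xFF byte, a marker code, and a two-byte big-endian length that counts the length field itself) and ignores rarer cases such as fill bytes between segments.

```python
import struct


def list_app_segments(data: bytes):
    """Return (marker_name, payload_length) for each APPn segment in the header."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream: missing SOI marker")
    segments = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):  # SOS (scan data follows) or EOI: stop
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if 0xE0 <= marker <= 0xEF:  # APP0 .. APP15
            segments.append((f"APP{marker - 0xE0}", length - 2))
        pos += 2 + length  # marker (2 bytes) + length field and payload
    return segments


def _segment(n: int, payload: bytes) -> bytes:
    """Build one APPn segment (helper for the hypothetical sample below)."""
    return b"\xff" + bytes([0xE0 + n]) + struct.pack(">H", len(payload) + 2) + payload


# Hand-built sample: SOI, an APP0 segment, an APP3 segment, EOI (no image data).
sample = b"\xff\xd8" + _segment(0, b"JFIF\x00") + _segment(3, b"demo") + b"\xff\xd9"
print(list_app_segments(sample))  # [('APP0', 5), ('APP3', 4)]
```

On a real file the same function could be called on the bytes read from disk; EXIF and XMP would both show up as APP1, so telling them apart requires looking at the start of each payload.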


State of the art

JPSearch

Suite of specifications

Enrichment with metadata in JPEG, JPEG 2000

Abstract framework

Modular

Flexible search architecture

Six parts:

1. System framework and components

2. Schema and ontology

3. Query format

4. File format for embedded metadata

5. Data interchange format between repositories

6. Reference software

3. Problem definition

Metadata Interoperability

Metadata exchanged without loss of information

Different processes express metadata in their own way

Distinct services exchange query messages


Problem definition

Challenges

Manipulation of image collections’ metadata

Image search and retrieval

Image repository maintenance and synchronization

Metadata storage

Reuse of metadata without regenerating it

Transferability between various image formats

Semantic meaning differs among formats

Approach to the solution: JPSearch

What is missing?

Lack of approaches and applications

4. Resolving the problem

4.1 Approach

4.2 JPSearch Part 2

4.3 JPSearch Part 4

4.4 Tools developed

Resolving the problem

4.1 Approach

Implement tools

JPSearch Editor for JPEG files

Translation Rule generator

Analyze transformation rules for different metadata sets

Evaluate JPSearch

Find shortcomings

Suggest improvements


Resolving the problem

4.2 JPSearch Part 2

JPSearch Core Metadata Schema

Rules for machine readable translation


Resolving the problem

4.2 JPSearch Part 2

JPSearch Core Metadata Schema

19 Basic elements

XML syntax

Elements

Identifier
Modifiers
Creators
Publisher
CreationDate
ModifiedDate
Description
RightsDescription
Source
Keyword
Title
CollectionLabel
PreferenceValue
Rating
OriginalImageIdentifier
GPSPositioning
RegionOfInterest
Width
Height

Example

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ImageDescription xmlns="JPSearch:schema:coremetadata">
  <Creators>
    <GivenName>Leonardo</GivenName>
    <FamilyName>DaVinci</FamilyName>
  </Creators>
  <Publisher>
    <PersonName>
      <GivenName>Paris</GivenName>
      <FamilyName></FamilyName>
    </PersonName>
    <OrganizationInformation>
      <Name>Museum of Louvre</Name>
      <Address>
        <Name>Louvre, Paris</Name>
      </Address>
    </OrganizationInformation>
  </Publisher>
  <CreationDate>1503-01-01T00:00:00.0Z</CreationDate>
  <ModifiedDate>2013-06-24T13:30:41.395+03:00</ModifiedDate>
  <Description>The portrait of Mona Lisa</Description>
  <Keyword>Image</Keyword>
  <Keyword>fr</Keyword>
  <Title>Mona Lisa</Title>
  <CollectionLabel>Painting</CollectionLabel>
  <Width>677</Width>
  <Height>1024</Height>
</ImageDescription>
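A core metadata instance like the Mona Lisa example can be read with any namespace-aware XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the function name `summarize`, the `jps` prefix and the trimmed `SAMPLE` string are assumptions made for illustration, and only the namespace URI and element names come from the slide.

```python
import xml.etree.ElementTree as ET

# Namespace URI as it appears in the slide's example; the "jps" prefix is arbitrary.
NS = {"jps": "JPSearch:schema:coremetadata"}

# Trimmed copy of the slide's example document (a subset of its elements).
SAMPLE = """<ImageDescription xmlns="JPSearch:schema:coremetadata">
  <Creators>
    <GivenName>Leonardo</GivenName>
    <FamilyName>DaVinci</FamilyName>
  </Creators>
  <Keyword>Image</Keyword>
  <Keyword>fr</Keyword>
  <Title>Mona Lisa</Title>
  <Width>677</Width>
  <Height>1024</Height>
</ImageDescription>"""


def summarize(xml_text: str) -> dict:
    """Extract a few core fields from a JPSearch ImageDescription document."""
    root = ET.fromstring(xml_text)
    given = root.findtext("jps:Creators/jps:GivenName", namespaces=NS)
    family = root.findtext("jps:Creators/jps:FamilyName", namespaces=NS)
    return {
        "title": root.findtext("jps:Title", namespaces=NS),
        # Join whichever name parts are present and non-empty.
        "creator": " ".join(part for part in (given, family) if part),
        # Keyword may repeat, so collect every occurrence.
        "keywords": [k.text for k in root.findall("jps:Keyword", NS)],
        "size": (
            int(root.findtext("jps:Width", namespaces=NS)),
            int(root.findtext("jps:Height", namespaces=NS)),
        ),
    }


print(summarize(SAMPLE))
```

Because every element lives in the default namespace, each path step needs the prefix; looking up a bare `Title` would return nothing.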

Resolving the problem

4.2 JPSearch Part 2

Rules for machine readable translation

JPSearch Translation Rules Declaration Language

Rule types: OneToOne, OneToMany, ManyToOne

Dublin Core to JPSearch example

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<TranslationRules xmlns="JPSearch:schema:translation"
                  fromFormat="http://purl.org/dc/terms/"
                  toFormat="JPSearch:schema:coremetadata">
  <TranslationRule xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                   xsi:type="OneToManyFieldTranslationType">
    <FromField xsi:type="FilteredSourceFieldType">
      <XPathExpression>/creator</XPathExpression>
      <FilterWithRegExpr>(\S+) (\S+)</FilterWithRegExpr>
      <VariableBinding>
        <ExplicitPrefixBinding>$name</ExplicitPrefixBinding>
        <ExplicitPostfixBinding>$lastname</ExplicitPostfixBinding>
      </VariableBinding>
    </FromField>
    <ToField xsi:type="FormattedTargetFieldType">
      <XPathExpression>/JPSearchCore/Creators/GivenName</XPathExpression>
      <ReplaceWithRegExpr>$name</ReplaceWithRegExpr>
    </ToField>
    <ToField xsi:type="FormattedTargetFieldType">
      <XPathExpression>/JPSearchCore/Creators/FamilyName</XPathExpression>
      <ReplaceWithRegExpr>$lastname</ReplaceWithRegExpr>
    </ToField>
  </TranslationRule>
</TranslationRules>
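To make the effect of this OneToMany rule concrete, here is a small Python sketch of what applying it to a single Dublin Core `creator` value might look like. It is not the reference software: the function name `split_creator` is hypothetical, and it assumes that `$name` is bound to the first captured group of the filter expression and `$lastname` to the second.

```python
import re

# Filter expression copied from the rule's FilterWithRegExpr element.
CREATOR_FILTER = re.compile(r"(\S+) (\S+)")


def split_creator(creator: str):
    """Split one dc:creator value into the two JPSearch target fields.

    Returns None when the value does not match the filter, in which
    case the rule would simply not apply.
    """
    match = CREATOR_FILTER.fullmatch(creator.strip())
    if match is None:
        return None
    name, lastname = match.groups()  # assumed binding: $name, $lastname
    return {
        "Creators/GivenName": name,
        "Creators/FamilyName": lastname,
    }


print(split_creator("Leonardo DaVinci"))
print(split_creator("Leonardo da Vinci"))  # three tokens: no match, prints None
```

The second call shows a limit of this particular expression: names with more than two whitespace-separated parts would need a different filter or an additional rule.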

Resolving the problem

4.3 JPSearch Part 4

File format – extension of JPEG-1/JPEG 2000

Additional metadata can co-exist

JPSearchMetadata, ElementaryMetadata, data blocks
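As a rough illustration of how extra metadata can co-exist with what a JPEG file already carries, the sketch below adds one more application segment behind the existing APPn segments. It is a simplification under stated assumptions: the function name `insert_app_segment` is hypothetical, the payload is treated as opaque bytes, and the box structure that JPSearch Part 4 defines inside its segment is not reproduced here.

```python
import struct


def insert_app_segment(jpeg: bytes, app_n: int, payload: bytes) -> bytes:
    """Return a copy of `jpeg` with an APP<app_n> segment holding `payload`.

    The new segment goes after the APPn segments already at the start of
    the file, so an existing APP0 (JFIF) header stays first.
    """
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream: missing SOI marker")
    if not 0 <= app_n <= 15:
        raise ValueError("application marker index must be 0..15")
    if len(payload) > 0xFFFF - 2:
        raise ValueError("payload too large for a single segment")

    pos = 2
    # Skip over the application segments already present.
    while pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF and 0xE0 <= jpeg[pos + 1] <= 0xEF:
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        pos += 2 + length

    # The length field counts itself (2 bytes) plus the payload.
    segment = b"\xff" + bytes([0xE0 + app_n]) + struct.pack(">H", len(payload) + 2) + payload
    return jpeg[:pos] + segment + jpeg[pos:]


# Hand-built stand-in for a file: SOI, APP0 (JFIF), EOI.
base = b"\xff\xd8" + b"\xff\xe0" + struct.pack(">H", 7) + b"JFIF\x00" + b"\xff\xd9"
tagged = insert_app_segment(base, 3, b"<ImageDescription/>")
```

Readers that do not understand the new segment can skip it using its length field, which is what lets the added metadata sit alongside EXIF, XMP and the rest.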

Resolving the problem

4.4 Tools developed

JPSearch Editor

Features:

Opens .jpeg files

Shows JPSearch metadata

Embeds JPSearch

Alters JPSearch metadata

Saves JPSearch to XML

Saves metadata to DB

Imports external XML instance with its translation rules

Region Tagging


Resolving the problem

4.4 Tools developed

Rule generator

Features

Opens XML instance

Shows XML elements

Element selection

All three rule types supported

Rule type details

Show saved rules

Update/Deletion of saved rules

Save rules in XML file

5. Experiments

Methodology

Creating mappings of multiple XML instances

Use rule generator tool

Based on a practical approach

Standards experimented with:

MPEG-7

Dublin Core

EBU core

XMP

Other formats (DeviantArt)

Evaluate JPSearch

Detect shortcomings and possible tool improvements

Experiments

MPEG-7

Large schema

Favors moving pictures and audio

Impossible to map all elements

Work on a subset of the schema

Valid sample documents found online

Conclusions

Some elements are not trivial to map

Title of a person, e.g. Doctor, Professor

Version element exists

Audio and video metadata encountered

Selecting elements by their attributes is problematic

Experiments

MPEG-7 to JPSearch mappings

MPEG-7 element -> JPSearch element

../MediaIdentification/EntityIdentifier -> Identifier
../Creator/Agent[@type="PersonType"]/Name/GivenName -> Creators/GivenName
../Creation/Abstract/FreeTextAnnotation -> Description
../MediaLocator/MediaUri -> OriginalImageIdentifier/OriginationOfID
../CreationInformation/Creation/Title -> Title
../CreationPreferences/@preferenceValue -> PreferenceValue
../MediaReview/Rating/RatingValue -> Rating/LabelValue
../Place/GeographicPosition/Point/@latitude -> GPSPositioning/latitude
../SpatialDecomposition/StillRegion/SpatialLocator/Box -> RegionOfInterest/RegionLocator
../Subject/KeywordAnnotation/Keyword -> RegionOfInterest/Keyword
../DescriptionMetadata/Version -> Description (Regex: “Version: $a”)
../DescriptionMetadata/LastUpdate -> ModifiedDate
../MediaProfile/MediaFormat/Medium/Name -> Source/CreationMethod
../CreationInformation/Classification/Genre/Name -> CollectionLabel (Regex: “Genre: $a”)
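Several of these source paths select an element by an attribute value, as in `Agent[@type="PersonType"]`. The sketch below shows what such a selection does, using the XPath subset supported by Python's `xml.etree.ElementTree`. The `FRAGMENT` document is a hypothetical, heavily simplified stand-in for MPEG-7 (no namespaces, plain `type` attribute), and `person_given_names` is an illustrative name.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment; real MPEG-7 documents are namespaced.
FRAGMENT = """<Creation>
  <Creator>
    <Agent type="PersonType">
      <Name><GivenName>Leonardo</GivenName></Name>
    </Agent>
  </Creator>
  <Creator>
    <Agent type="OrganizationType">
      <Name>Museum of Louvre</Name>
    </Agent>
  </Creator>
</Creation>"""


def person_given_names(xml_text: str):
    """Return GivenName values only for agents whose type is PersonType."""
    root = ET.fromstring(xml_text)
    # The [@type='PersonType'] predicate filters Agent elements by attribute.
    path = "./Creator/Agent[@type='PersonType']/Name/GivenName"
    return [element.text for element in root.findall(path)]


print(person_given_names(FRAGMENT))  # ['Leonardo']
```

Without the predicate, the organization agent would be matched by the same path prefix, which is why attribute-based selection matters for this mapping.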

Experiments

Dublin Core

Simple schema

15 basic elements

Qualified Dublin Core adds 6 more elements

Samples found online and created manually

Conclusions

For some elements it is unclear how they should be mapped

Qualified Dublin Core elements

Elements can appear multiple times

Multiple elements map to Creators/Modifiers/Region of Interest

Experiments

Dublin Core to JPSearch mappings

Dublin Core element -> JPSearch Core element

contributor -> Publisher/OrganizationInformation/Name
creator -> Creators/GivenName, FamilyName
rights -> RightsDescription/Description
source -> Source/SourceElementType
relation -> OriginalImageIdentifier/Identifier
coverage -> Publisher/OrganizationInformation/Address/Name
rightsholder -> RightsDescription/ActualRightsDescription
accrualperiodicity -> Source/SourceElement/SourceElementDescription
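A mapping table of this kind lends itself to a table-driven translation. The following Python sketch applies a few of the one-to-one rows to a dictionary of Dublin Core values and builds the nested target elements. The names `DC_TO_JPSEARCH`, `set_path` and `translate` are hypothetical, namespaces are left out for brevity, and the input values are made up for the example.

```python
import xml.etree.ElementTree as ET

# A few one-to-one rows taken from the mapping table (namespaces omitted).
DC_TO_JPSEARCH = {
    "contributor": "Publisher/OrganizationInformation/Name",
    "rights": "RightsDescription/Description",
    "relation": "OriginalImageIdentifier/Identifier",
    "coverage": "Publisher/OrganizationInformation/Address/Name",
}


def set_path(root: ET.Element, path: str, text: str) -> None:
    """Create any missing elements along `path` and set the leaf's text."""
    node = root
    for tag in path.split("/"):
        child = node.find(tag)  # reuse an existing direct child if present
        if child is None:
            child = ET.SubElement(node, tag)
        node = child
    node.text = text


def translate(dc_fields: dict) -> ET.Element:
    """Translate mapped Dublin Core fields; unmapped fields are skipped."""
    root = ET.Element("ImageDescription")
    for name, value in dc_fields.items():
        target = DC_TO_JPSEARCH.get(name)
        if target is not None:
            set_path(root, target, value)
    return root


doc = translate({"contributor": "Museum of Louvre", "coverage": "Paris", "format": "image/jpeg"})
print(ET.tostring(doc, encoding="unicode"))
```

Both `contributor` and `coverage` land under the same `Publisher` element because `set_path` reuses existing parents instead of creating duplicates.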

Experiments

EBUcore

Extension to Dublin Core

Additional elements for video, audio and image

Extra contact details

Largely based on attributes

Manually created samples

Conclusions

Similar mapping problems as Dublin Core

Version element exists

Optional elements can be ignored

Special repeated attributes

typeGroup, typeLabel, typeLink and typeDefinition

Experiments

EBUcore to JPSearch mappings

EBUcore element -> JPSearch element

coreMetadata/publisher/entity/contactDetails/name -> Publisher/PersonName/GivenName, FamilyName
coreMetadata/publisher/entity/contactDetails/organisationDetails/details -> Publisher/OrganizationInformation/Name
coreMetadata/format/imageFormat/width -> Width
coreMetadata/coverage/spatial/location -> RegionOfInterest/ContentDescription/Place/Description
coreMetadata/version -> RegionOfInterest/ContentDescription/Object/Name
coreMetadata/rating/ratingValue -> Rating/LabelValue

Experiments

XMP

Different from other standards

Incorporates multiple standards in the same schema

Mixture

Namespace required

EXIF, Dublin Core, Photoshop tags

Heavily based on Resource Description Framework (RDF)

Consists of Description XMP packets

Stored in APP1 of JPEG

External file .xmp

Online valid samples

Conclusions

Large set of elements from different standards

Hard to map certain elements (camera raw metadata)


Experiments

Other formats

Third party web services use their own format

DeviantArt, Flickr and YouTube use oEmbed

12 basic elements

Most of them are optional

Use API to obtain metadata

No attributes

Samples obtained through the API

Conclusions

URL elements

Thumbnail sizes
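The two conclusions above (URL elements and thumbnail sizes) can be made visible with a small sketch that sorts the fields of an oEmbed response into those with an obvious JPSearch core target and those without. Everything specific here is an assumption for illustration: the sample response is invented rather than fetched from any service, and the choice of targets in `OEMBED_TO_JPSEARCH` is one plausible mapping, not the one used in the experiments.

```python
import json

# Assumed targets for a handful of oEmbed fields (illustrative only).
OEMBED_TO_JPSEARCH = {
    "title": "Title",
    "author_name": "Creators/GivenName",
    "width": "Width",
    "height": "Height",
}

# Invented oEmbed-style photo response; not real service output.
SAMPLE_RESPONSE = json.dumps({
    "version": "1.0",
    "type": "photo",
    "title": "Sample",
    "author_name": "someone",
    "url": "https://example.org/image.jpg",
    "width": 640,
    "height": 480,
    "thumbnail_url": "https://example.org/thumb.jpg",
    "thumbnail_width": 150,
    "thumbnail_height": 113,
})


def map_oembed(response_text: str):
    """Return (mapped JPSearch fields, sorted list of fields left unmapped)."""
    data = json.loads(response_text)
    mapped = {
        OEMBED_TO_JPSEARCH[key]: str(value)
        for key, value in data.items()
        if key in OEMBED_TO_JPSEARCH
    }
    unmapped = sorted(key for key in data if key not in OEMBED_TO_JPSEARCH)
    return mapped, unmapped


mapped, unmapped = map_oembed(SAMPLE_RESPONSE)
print(mapped)
print(unmapped)
```

With this mapping the URL and thumbnail fields end up in the unmapped list, which matches the kind of gap the slide points out.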

Experiments

[Figure: oEmbed example of a DeviantArt image]

Experiments

Execution of Tools

Rule generator

JPSearch metadata Editor

Experiments

Rule generator

[Screenshots of the rule generator; panel labels: Dublin Core XML instance, JPSearch Translation Rules]

Experiments

JPSearch Editor

[Screenshots of the JPSearch Editor; panel labels: Photo of Mona Lisa, Dublin Core XML instance, JPSearch Translation Rules]

6. Conclusion

JPSearch is a powerful international standard

Well documented/defined and still improving

Two tools were developed

JPSearch metadata Editor

Translation Rule generator

Evaluation of the standard through experimentation

JPSearch

provides storage for a large set of metadata

allows many metadata sets to be mapped

gives a solution to the metadata interoperability issue

Conclusion


Future work

Tools

Complete database support

Support JPSearch native metadata

JPEG 2000 file type support

Load XML schema (xsd files) to rule generator

Load XML rules to rule generator

Read and translate EXIF to JPSearch

Conclusion


Future work

JPSearch

Attribute support

Additional helpful elements

Extended Contact details (profession title, email, suffix)

Version

URLs and Thumbnail details

Dictionary based values e.g. “Perfect” in Rating
