
Combining Multimedia and Semantics

Oscar Corcho (ocorcho@fi.upm.es)

Universidad Politécnica de Madrid

http://www.oeg-upm.net/

LACNEM 2010, Cali, Colombia. September 9th 2010

Credits: Adrián Siles, Mariano Rico, Víctor Méndez, Hector Andrés García-Silva, María del Carmen Suárez-Figueroa, Ghislain Atemezing, Raphaël Troncy

Work distributed under the license Creative Commons Attribution-Noncommercial-Share Alike 3.0

http://www.slideshare.net/ocorcho


Ontology Engineering Group. Who we are

• Director: A. Gómez-Pérez
• Research Group (37 people)
  - 2 Full Professors
  - 4 Associate Professors
  - 1 Assistant Professor
  - 3 Postdocs
  - 17 PhD Students
  - 8 MSc Students
  - 2 Software Engineers
• Management (4 people)
  - 2 Project Managers
  - 1 System Administrator
  - 1 Secretary
• 50+ Past Collaborators
• 10+ visitors

Research Areas

• Semantic e-Science (Data Integration, Semantic Grid)
• Internet of Things
• (Social) Semantic
• Natural Language Processing
• Ontological Engineering

Timeline: 1997, 2000, 2004, 2008

Before we start…

• How many of you have ever heard the word “Ontology”?

• And how many of you do actually know what it means?

Coming to terms with ontologies and semantics

• An ontology is an engineering artifact, which provides:
  - A vocabulary of terms
  - A set of explicit assumptions regarding the intended meaning of the vocabulary
    • Almost always including concepts and their classification
    • Almost always including properties between concepts
• Shared understanding of a domain of interest
  - Agreement on the meaning of terms
  - Formal and machine-manipulable model of a domain of interest
• Besides...
  - The meaning (semantics) of such terms is formally specified
  - New terms can be formed by combining existing ones
  - Can also specify relationships between terms in multiple ontologies

Example: An ontology about satellites
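The satellite diagram is not reproduced here, so a minimal Python sketch may help make the definition concrete: a vocabulary of concepts, their classification, and a property between concepts. All concept and property names below (Satellite, EarthObservationSatellite, Instrument, carries) are hypothetical and are not taken from the original slide.

```python
# Toy ontology: the concept names, the classification and the property
# are hypothetical illustrations, not taken from the original slide.
SUBCLASS_OF = {
    "EarthObservationSatellite": "Satellite",
    "CommunicationSatellite": "Satellite",
    "Satellite": "Spacecraft",
}

# Properties between concepts: property name -> (domain, range).
PROPERTIES = {"carries": ("Satellite", "Instrument")}


def ancestors(concept):
    """Return every superclass of `concept`, nearest first."""
    result = []
    while concept in SUBCLASS_OF:
        concept = SUBCLASS_OF[concept]
        result.append(concept)
    return result


def is_a(concept, candidate):
    """True if `concept` is `candidate` or one of its subclasses."""
    return concept == candidate or candidate in ancestors(concept)


def can_have(concept, prop):
    """True if `prop` applies to `concept` through the property's domain."""
    domain, _range = PROPERTIES[prop]
    return is_a(concept, domain)
```

For instance, `can_have("CommunicationSatellite", "carries")` holds because the property is declared on the more general concept, which is the kind of explicit, machine-manipulable assumption the definition refers to.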

Outline

• Introduction
  - What I will be talking about and what I will not…

There were several options that I explored before selecting the one that you will be hearing in this talk…

Option 1: The Semantic Gap

• The lack of coincidence between the information that one can extract from the sensory data and the interpretation that the same data has for a user in a given situation

However, I already assumed that Ebroul would be talking a lot about it in his opening keynote (as he did).

Besides, I have not worked at all on the low-level part, so it may be difficult for me to provide you with good insight into the (many) open problems in this area.

A.W.M. Smeulders, M. Worring, S. Santini, A. Gupta, R. Jain: Content-based image retrieval at the end of the early years, IEEE PAMI, 1349–1380, 2000.

Option 2: MPEG-7 and the Semantic Web

• ISO standard since December 2001
• Main components:
  - Descriptors (Ds) and Description Schemes (DSs)
  - DDL (XML Schema + extensions)
• Concerns all types of media
• A good number of ontologies developed around it

[Figure: Part 5 – MDS (Multimedia Description Schemes). Labels shown: Basic elements, Schema Tools, Basic datatypes, Links & media localization, Basic Tools, Content management, Creation & Production, Media, Usage, Content description, Structural aspects, Semantic aspects, Navigation & Access, Summaries, Views, Variations, Content organization, Collections, Models, User interaction, User Preferences, User History]

Option 2: MPEG-7 and the Semantic Web

However, the talk may:

- Be a bit boring and too technical
- Lack the mix of state of the art and vision that an invited talk should normally have

And MPEG-7 is not widely used

…So I will cover only some aspects of this later, when I talk about multimedia ontologies.

Option 3: Canonical Processes of Media Production (and semantics, obviously)

• For example…
  - http://www.cewe-photobook.
• Application for authoring digital photo books
• Automatic selection, sorting and ordering of photos
  - Context analysis methods: timestamp, annotation, etc.
  - Content analysis methods: color histograms, edge detection, etc.
• Customized layout and background
• Printed by the leading European photo finishing company

Credits: Raphaël Troncy, Lynda Hardman

CeWe Color PhotoBook Processes

• Premeditate: “My winter ski holidays with my friends”
• Construct Message
• Create
• Package
• Annotate
• Organize
• Publish
• Distribute

Semantics can be important in the process

Option 3: Canonical Processes of Media Production

However, some of you probably attended Raphaël Troncy’s talk last year (available on SlideShare).

In summary…

• I decided to talk about something that I have been working on for the last couple of years, and which combines:
  - Semantics (of course, this is the key expertise of our group)
    • Mainly annotation, Linked Data and a bit of Multimedia Ontology Engineering
  - Social networks, collaboration, sharing and collective intelligence
    • Exploiting home networks and online multimedia sites
  - And, obviously, multimedia

And hence I still leave out many interesting topics (e.g., semantics in user interfaces)

Outline

• Introduction
  - What I will be talking about and what I will not
• Sem-UPnP-Grid
  - Sharing multimedia content across homes through semantic annotations
    • Credits: Mariano Rico and Adrián Siles (UPM), Víctor Méndez and José Manuel Gómez-Pérez (iSOCO), José Manuel Palacios and Mónica Pérez (TID)
• Sem4Tags
  - Tag disambiguation in Flickr
• M3 Ontology (only if time permits)
  - A semantic backbone for our multimedia-related work

• Conclusions and outlook

Internet

Motivation

• Multimedia resources in Web2.0 are stored on centralised servers.
• You lose some of your rights as an author when you upload these resources to these servers.
• Privacy problems.
• Poor annotations and metadata.
• These resources cannot be shared with other resources in your home.

UpGrid

Multimedia Content Sharing with UpGrid

Annotation: “Ángel on the beach”

Annotation: “Ángel playing soccer”

Additional semantic information:
- “Ángel is my son”
- “Pedro is my brother”

Semantic-based query: “multimedia content related to my nephew”

Additional semantic information: “Juan is my brother”

Juan

Reasoning:
- “Ángel is my son”
- “Pedro is my brother”
- “Juan is my brother”
- Therefore: Ángel is my nephew
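The reasoning step on this slide can be sketched as a single rule over triples. This is only an illustration in Python, not the UpGrid implementation: the identifier "me" (the owner of the home holding the annotated content), the relation names and the rule itself are assumptions made for the example.

```python
# Facts as (subject, relation, object) triples. "me" stands for the owner
# of the home that stores the annotated content (a hypothetical identifier).
FACTS = {
    ("Ángel", "sonOf", "me"),
    ("Pedro", "brotherOf", "me"),
    ("Juan", "brotherOf", "me"),
}


def infer_nephews(facts):
    """Apply the rule: X sonOf P and B brotherOf P  =>  X nephewOf B."""
    inferred = set()
    for child, rel1, parent in facts:
        if rel1 != "sonOf":
            continue
        for sibling, rel2, person in facts:
            if rel2 == "brotherOf" and person == parent:
                inferred.add((child, "nephewOf", sibling))
    return inferred


def nephews_of(person, facts):
    """Answer the query 'content related to my nephew' for `person`."""
    return sorted(c for c, _, uncle in infer_nephews(facts) if uncle == person)
```

When Juan asks for content related to his nephew, `nephews_of("Juan", FACTS)` yields Ángel, so the photos annotated with “Ángel” can be returned even though nobody annotated them with the word “nephew”.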

Architecture

Architecture (another view on it)

Snapshots from the application

Check http://www.youtube.com/results?search_query=UPnPGrid

Summary

• An effective means for sharing multimedia contents across homes, avoiding Web2.0 sites where your rights may be compromised

• However, it is still a prototype, and no serious usability testing has been done
  - Much work is still needed to turn it into a real system
• And end users find it difficult to provide annotations
  - Can you imagine your parents and grandparents annotating photos and videos like that?
  - Let’s see how this could be improved in the next part of the presentation.

Outline


• Credits: Héctor Andrés García Silva (graduate of Universidad del Valle)

• Social Tagging Systems
  - Web 2.0 applications
  - Applications for storing, sharing, and discovering information resources
  - Users assign tags to identify information resources
  - Tags are used to search/discover resources

Introduction

• Folksonomy
  - Emerging classification scheme from social tagging systems
  - Folk: People, Taxonomy: Classification
  - Represented by: Users, Tags, Resources

Introduction

Taxonomy

• Top-down
• Controlled vocabulary
• Hierarchical structure
• Exclusive/Restrictive
• Expensive to maintain

Folksonomy

• Bottom-up (user created)
• No fixed vocabulary
• No hierarchical structure
• Non-exclusive/Flexible
• Low cost

Introduction

• Why is tagging so popular?
  - Reduced cognitive burden
    • It’s easy to use
    • Users don’t need any special skill or experience
  - The benefits of tagging are immediate
    • Future retrieval
    • Contribution and sharing
    • Attracting attention
    • Self-presentation
    • Opinion expression

• However
  - Tags can be ambiguous
    • Polysemy: party as a celebration as opposed to party as a political organization
  - Synonymy: party and celebration
  - Morphological variations: party, parties, partying, partyign
    • Plurals
    • Acronyms
    • Conjugated verbs
    • Misspellings
  - Compound words
    • Political party, PoliticalParty, Political_party, Political-Party, etc.
  - Detail/granularity level
    • A general tag such as party in contrast to a specific tag such as banquet.
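Of the problems listed above, the compound-word variants are the easiest to illustrate in code. The following Python sketch is an assumption for illustration only (it is not part of Sem4Tags): it maps the different spellings of a compound tag to a single lowercase form, and it does not address plurals, misspellings, synonymy or polysemy.

```python
import re


def normalize_tag(tag):
    """Map compound-word variants of a tag to one lowercase form."""
    # Split camelCase boundaries: "PoliticalParty" -> "Political Party".
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", tag)
    # Treat underscores and hyphens as word separators.
    spaced = re.sub(r"[_\-]+", " ", spaced)
    # Collapse repeated whitespace and lowercase the result.
    return " ".join(spaced.split()).lower()


def group_variants(tags):
    """Group raw tags by their normalized form."""
    groups = {}
    for tag in tags:
        groups.setdefault(normalize_tag(tag), []).append(tag)
    return groups
```

With this, the four variants from the slide (Political party, PoliticalParty, Political_party, Political-Party) fall into one group, while "party" and "parties" still remain separate tags.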

Introduction

The problem: Morphological variations, synonyms, granularity, and polysemy hamper information retrieval processes based on folksonomies.

Motivation

710,659 results vs. 8,661,581 results

Systems ignore resources tagged with morphological variations or synonyms of that tag, as well as the resources tagged with more generic or more specific tags

When searching with polysemous tags, all the resources tagged with that tag are retrieved without taking into account the tag sense the user was looking for.

(e.g., querying Flickr with bank returns photos of financial institutions, river edges, fog banks, sand banks, etc.)

Motivation

• What if we associate tags with semantic entities?

Tags of one photo: uk, tories, party, conservative, speech

Tags of another photo: party, balloons, colors, bar, crowd

Ontology: http://morpheus.cs.umbc.edu/aks1/ontosem.owl
- Political sense: #political-party, #political-entity, #organization, #Coalition, #federation
- Celebration sense: #party, #special-occasion, #non-work-activity, #Celebration, #Birthday, #Anniversary

We can avoid the aforementioned pitfalls

State of the Art: Semantic Grounding of Cross-Lingual Folksonomies

None of the analyzed approaches deals with multilingual tags

Garcia HA, Corcho O, Alani H, Gómez-Pérez A. Review of the state of the art: Discovering and Associating Semantics to Folksonomies. Knowledge Engineering Review (in press)

Semantic Grounding of Cross-Lingual Folksonomies

• MSR: a Multilingual Sense Repository based on Wikipedia and enriched with semantic information taken from DBpedia.

[Figure: MSR entries for the senses of “Bank”/“Banco”, each with its terms and frequency]
- Bank: http://dbpedia.org/resource/Bank
- Cardumen: http://dbpedia.org/resource/Swarm
- Banco de Arena / Sandbank: http://dbpedia.org/resource/SandBank

• Sem4Tags: A process for Associating Semantics to Tags.

[Figure: example tag sets associated with http://dbpedia.org/resource/Bank]
- Tags: Europe, Euro, Finance, Central bank, awesomePic, Nikon, ..
- Tag “Banco” with context: Dinero, Calle, Santander, Money, Madrid, Atm, cajero

• Disambiguation activity
  - The candidate senses and the tag context are represented as vectors.
    • The vector components are the set of most frequent terms in each Wikipedia page representing a sense.
    • For each sense, the values of the vector are calculated using TF-IDF.
    • For the tag context, the value in each position is 1 if the corresponding term appears in the tag context and 0 otherwise.
  - The tag context vector is compared against each sense vector using the cosine of the angle as similarity measure.
  - The sense most similar to the tag context is selected as the one representing the meaning of the analyzed tag.
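The comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Sem4Tags code: the shared vocabulary, the TF-IDF weights and the function names are invented for the example, and the weights are given directly rather than computed from Wikipedia pages.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def context_vector(vocabulary, tag_context):
    """Binary vector: 1 if the term occurs in the tag context, else 0."""
    context = {t.lower() for t in tag_context}
    return [1.0 if term in context else 0.0 for term in vocabulary]


def disambiguate(vocabulary, sense_vectors, tag_context):
    """Return the sense whose vector is most similar to the tag context."""
    ctx = context_vector(vocabulary, tag_context)
    return max(sense_vectors, key=lambda s: cosine(sense_vectors[s], ctx))


# Hypothetical vocabulary and TF-IDF weights, invented for illustration.
VOCABULARY = ["money", "finance", "river", "sand"]
SENSES = {
    "http://dbpedia.org/resource/Bank": [0.9, 0.8, 0.0, 0.0],
    "http://dbpedia.org/resource/SandBank": [0.0, 0.0, 0.6, 0.9],
}
```

Calling `disambiguate(VOCABULARY, SENSES, ["Money", "Euro", "Finance"])` selects the financial sense, because the tag context shares no terms with the sand bank vector and its cosine with that sense is therefore zero.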

• Disambiguation activity
  - We use the information of the Wikipedia default sense for a term.
  - Sim(TagContext, Sense_i) = λ*Cosine + β*defaultSense
  - We experimentally set β = 0.2 and λ = 0.8
  - We attempt to use DBpedia semantic information in the disambiguation activity:
    • Sim(TagContext, Sense_i) = λ*Cosine + β*defaultSense + δ*SemanticInfo
    • Studies have shown that tags in Flickr refer mainly to locations, time, given names and photography-related subjects, among others.
    • We use DBpedia and YAGO relations to classify the senses according to these categories.
    • However, we found that not all the senses related to a term have the same amount of relations (e.g., Madrid is not classified as a city).
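The first weighted formula on this slide can be written out as follows. The weights 0.8 and 0.2 come from the slide; treating defaultSense as a 0/1 indicator is an assumption of this sketch, and the δ*SemanticInfo term is left out because the slides do not give a value for δ.

```python
LAMBDA = 0.8  # weight of the cosine similarity (value from the slides)
BETA = 0.2    # weight of the Wikipedia default-sense signal (value from the slides)


def combined_similarity(cosine_value, is_default_sense):
    """Sim = lambda * Cosine + beta * defaultSense.

    Assumption: defaultSense is 1 for the Wikipedia default sense, else 0.
    """
    default_sense = 1.0 if is_default_sense else 0.0
    return LAMBDA * cosine_value + BETA * default_sense


def best_sense(candidates):
    """Pick the best sense from {sense: (cosine, is_default_sense)}."""
    return max(candidates, key=lambda s: combined_similarity(*candidates[s]))
```

The default-sense term works as a prior: a sense with a slightly lower cosine can still win if it is the sense Wikipedia shows by default for the term.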

Let’s try it

• http://robinson.dia.fi.upm.es:8080/SemanticTagsWebApp/index.jsp

• What does “bernabeu” mean if its context is…?
  - estadio, madrid, fútbol

Experiment

• Baseline: Directly associate tags with DBpedia resources
  - Look for spaces and replace them with '_'.
  - For tags in English:
    • Create a URI of the form http://en.wikipedia.org/wiki/tag
    • Query DBpedia using the http://xmlns.com/foaf/0.1/page relation
  - For tags in Spanish:
    • Create a URI of the form http://es.wikipedia.org/wiki/tag
    • Query DBpedia using the http://dbpedia.org/property/wikipage-es relation
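The baseline steps above could look roughly like this in Python. The URI prefixes and relation URIs are the ones named on the slide; the function names and the exact shape of the SPARQL query are assumptions, and the sketch only builds the query text without sending it to any endpoint. It also does not handle capitalization or percent-encoding of Wikipedia page titles.

```python
WIKIPEDIA_PREFIX = {
    "en": "http://en.wikipedia.org/wiki/",
    "es": "http://es.wikipedia.org/wiki/",
}
# Relations named on the slide for each tag language.
RELATION = {
    "en": "http://xmlns.com/foaf/0.1/page",
    "es": "http://dbpedia.org/property/wikipage-es",
}


def wikipedia_uri(tag, language):
    """Replace spaces with '_' and append the tag to the Wikipedia prefix."""
    return WIKIPEDIA_PREFIX[language] + tag.strip().replace(" ", "_")


def baseline_query(tag, language):
    """Build a SPARQL query for resources linked to the tag's page.

    The query shape is an assumption; the slides only name the relation.
    """
    return (
        "SELECT ?resource WHERE { "
        f"?resource <{RELATION[language]}> <{wikipedia_uri(tag, language)}> "
        "}"
    )
```

Because the tag is used verbatim as a page name, this approach finds nothing when no page with that exact name exists, which is consistent with the low coverage reported for the baseline later in the talk.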

Experiment

• Approaches:
  - Baseline: Selection of the sense without a disambiguation activity.
  - Sem4Tags: For each sense we use the whole Wikipedia article as source for frequent terms.
  - Sem4TagsAC: Same as Sem4Tags including the selection of the Active Context.
  - Sem4TagsAbs: For each sense we use the Wikipedia article abstract (extracted from DBpedia) as source for frequent terms.
  - Sem4TagsAbsAC: Same as Sem4TagsAbs including the selection of the Active Context.

Experiment

• Initial Data Set
  - Wide range of users, photos, and tags.
  - 764 photos uploaded by 719 users to Flickr that have been tagged with tags describing tourist places in Spain
  - 12.4 (+/- 7.85) tags per photo
  - 9484 tagging activities (TAS): <user, photo, tag>
  - 4135 distinct tags were used
• Processed Data Set
  - From each photo we processed on average 2 tags
  - 2260 tagging activities (TAS)

Experiment

• Evaluation Campaign
  - 41 evaluators
  - Evaluate the semantic associations produced by each approach: <user; tag; photo; DBpedia resource; language>
  - Three different evaluators evaluated each semantic association.
  - Questions:
    • Whether they are able to identify the tag meaning (known or unknown)
    • Tag language (English, Spanish, both, other)
    • Whether the tag corresponds to a named entity
    • According to the identified tag language, they evaluate the semantic association as Highly related, Related, or Not related.

Experiment

• Results
  - Evaluators identified the semantics of 87% of TAS (known)
    • 62.6% of TAS were considered to be in English
    • 87.7% of TAS were considered to be in Spanish
  - Agreement among evaluators (Fleiss’ kappa statistic):
    • κ = 0.76 for the highly related case
    • κ = 0.71 for the related/highly related case
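For readers unfamiliar with the agreement statistic reported above, here is a short Python sketch of the standard Fleiss' kappa computation. It is not the evaluation script used in the experiment, and the input layout is an assumption: one row per semantic association, one column per answer category, each cell holding how many of the evaluators chose that category.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a table of counts.

    ratings[i][j] is the number of raters who put item i in category j;
    every row must sum to the same number of raters.
    """
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    n_categories = len(ratings[0])

    # Share of all assignments that went to each category.
    p_j = [
        sum(row[j] for row in ratings) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement per item, then averaged over items.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ]
    p_bar = sum(p_i) / n_items
    # Agreement expected by chance.
    p_e = sum(p * p for p in p_j)
    if p_e == 1.0:
        # Kappa is undefined when every rating falls in one category;
        # this sketch returns 1.0 by convention.
        return 1.0
    return (p_bar - p_e) / (1 - p_e)
```

With three evaluators per association, as in the campaign described earlier, a row such as `[3, 0, 0]` means all three chose the first category. Values close to 1 indicate agreement well above what chance alone would produce.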

Experiment

• Precision and Recall for Highly Relevant results (charts for Spanish and English)

Experiment

• Conclusions
  - The baseline obtained high precision; however, it was able to find semantic resources for just a fraction of the analyzed data set:
    • Baseline: 27.7% in English and 19.4% in Spanish.
    • Sem4Tags: 79.1% in English and 81.4% in Spanish.
  - All approaches obtained better precision with named entities than with unnamed entities.
  - Sem4Tags and Sem4TagsAC are the approaches that obtained the best results in terms of Precision and Recall.
    • Sometimes Sem4TagsAC obtains better P@1 values, but the improvements are supported by little or no statistical evidence.
  - Sem4TagsAbs and Sem4TagsAbsAC are clearly the worst approaches.

Outline


M3 Ontology

• Multimedia perspective
• Multidomain perspective
• Multilingual perspective

There are already multimedia ontologies

• MDS Upper Layer represented in RDFS
  - 2001: Hunter
  - Later on: link to the ABC upper ontology
• MDS fully represented in OWL-DL
  - 2004: Tsinaraki et al., DS-MIRF model
• MPEG-7 fully represented in OWL-DL
  - 2005: Garcia and Celma, Rhizomik model
  - Fully automatic translation of the whole standard
• MDS and Visual parts represented in OWL-DL
  - 2007: Arndt et al., COMM model
  - Re-engineering MPEG-7 using DOLCE design patterns
• However, their requirements are not always clear, nor have they been developed following clear methodological guidelines

NeOn Methodology

[Figure: overview of the NeOn Methodology, with scenarios numbered 1 to 9. Labels shown:]
- Knowledge Resources
- Non-Ontological Resources: Thesauri, Dictionaries, Glossaries, Lexicons, Taxonomies, Classification Schemas
- Non-Ontological Resource Reengineering
- Ontological Resources; O. Repositories and Registries; Flogic, RDF(S)
- Ontological Resource Reengineering; O. Aligning; O. Merging; Alignments
- O. Design Patterns; Ontology Design Pattern Reuse
- Ontology Restructuring (Pruning, Extension, Specialization, Modularization)
- O. Localization
- Scheduling; O. Specification; O. Conceptualization; O. Formalization; O. Implementation
- Ontology Support Activities: Knowledge Acquisition (Elicitation); Documentation; Configuration Management; Evaluation (V&V); Assessment

Ontology Requirements Specification (I)

Task 1. Identifying the purpose, scope and implementation language
Task 2. Identifying the intended end-users
Task 3. Identifying the intended uses
Task 4. Identifying requirements
Task 5. Grouping functional requirements
Task 6. Validating the set of requirements (Are they valid?)
Task 7. Prioritizing requirements
Task 8. Extracting terminology and its frequency

Actors: Users, Domain Experts and the Ontology Development Team (ODT)

Set of ontological…

Output: ORSD

Non-functional ontology requirements: Characteristics not related to the ontology content

Ontology Requirements Specification (II)

Task 1. Identifying the purpose, scope and implementation language
Task 2. Identifying the intended end-users
Task 3. Identifying the intended uses
Task 4. Identifying requirements
Task 5. Grouping functional requirements
Task 6. Validating the set of requirements (Are they valid?)
Task 7. Prioritizing requirements
Task 8. Extracting terminology and its frequency

Actors: Users, Domain Experts and the Ontology Development Team

Set of ontological…

Output: ORSD

Functional ontology requirements: Content-specific requirements referring to the particular knowledge to be represented by the ontology

Requirements in natural language:
- in the form of CQs (competency questions)
- in the form of sentences (General Characteristics)

Strategies: (1) Top-Down, (2) Bottom-Up, and (3) Middle-out

Ontology Requirements Specification (III): Functional Requirements on M3

Multimedia perspective

Multilingual perspective

Multidomain perspective

CQs: Competitions

“Multidomain” CQs

CQs: Current affairs

CQs: Sports

Abstraction process

Ontology Requirements Specification (IV): ORSD

Multidomain perspective

Multimedia perspective

Multilingual perspective


NeOn Methodology

Scheduling using gOntt

“I need to schedule the development of the M3 ontology network”

Task 1. Selecting the ontology network life cycle model
Task 2. Selecting the set of scenarios
Task 3. Updating initial plan
Task 4. Establishing resource restrictions and assignments

Outputs:
- Initial Ontology Network Life Cycle in the form of a Gantt chart
- Types of potential knowledge resources to be reused
- Modified Ontology Network Life Cycle in the…
- Scheduling for the Ontology Network Development

Snapshots: Life cycle model selection; Scenarios selection

Scheduling using gOntt (II)

Outputs: Modified Ontology…; Scheduling for the Ontology Network Development


NeOn Methodology

Reusing Ontological Resources: Comparative Analysis (I)

Reusing Ontological Resources: Comparative Analysis

Outline


Conclusions and outlook

• We all agree that…
  - Multimedia UGC has been one of the foundations of Web2.0
• The use of semantics can provide…
  - Better understanding of the domain and of its content
    • Heavyweight: addressing the semantic gap automatically
    • Lightweight: allowing users to annotate
    • Middleweight: from free tags to knowledge
  - Better exploratory navigation and serendipity
    • Interconnecting multimedia content with the Linked Data
• However, privacy issues are still a major barrier to a larger uptake, especially for some population segments
  - Allowing P2P exchange between “known” homes, while exploiting semantic-based search
