MediaEval 2017 Retrieving Diverse Social Images Task (Overview)


Page 1: MediaEval 2017 Retrieving Diverse Social Images Task (Overview)

Retrieving Diverse Social Images Task
- task overview -

Maia Zaharieva (TUW, Austria)
Bogdan Ionescu (UPB, Romania)
Alexandru Lucian Gînscǎ (CEA LIST, France)
Rodrygo L.T. Santos (UFMG, Brazil)
Henning Müller (HES-SO in Sierre, Switzerland)
Bogdan Boteanu (UPB, Romania)

University Politehnica of Bucharest | Universidade Federal de Minas Gerais, Brazil

MediaEval 2017, September 13-15, Dublin, Ireland

Page 2: MediaEval 2017 Retrieving Diverse Social Images Task (Overview)

Outline

- The Retrieving Diverse Social Images Task
- Dataset and Evaluation
- Participants
- Results
- Discussion and Perspectives

Page 3: MediaEval 2017 Retrieving Diverse Social Images Task (Overview)


Diversity Task: Objective & Motivation

Objective: image search result diversification in the context of social photo retrieval.

Why diversify search results?
- to respond to the needs of different users;
- to tackle queries with unclear information needs;
- to widen the pool of possible results (increase performance);
- to reduce the number and redundancy of the returned items.

Page 4: MediaEval 2017 Retrieving Diverse Social Images Task (Overview)


Diversity Task: Objective & Motivation #2

Page 5: MediaEval 2017 Retrieving Diverse Social Images Task (Overview)


Diversity Task: Objective & Motivation #2

Page 6: MediaEval 2017 Retrieving Diverse Social Images Task (Overview)


Diversity Task: Objective & Motivation #3

Page 7: MediaEval 2017 Retrieving Diverse Social Images Task (Overview)


Diversity Task: Definition

For each query, participants receive a ranked list of photos retrieved from Flickr using its default “relevance” algorithm.

Query = a general-purpose, multi-topic term, e.g.: autumn colors, bee on a flower, home office, snow in the city, holding hands, ...

Goal of the task: refine the results by providing a ranked list of up to 50 photos (a summary) that are both relevant and diverse representations of the query (a greedy re-ranking sketch follows the definitions below).

- relevant: a common photo representation of the query topics (all at once); bad-quality photos (e.g., severely blurred, out of focus) are not considered relevant in this scenario;
- diverse: depicting different visual characteristics of the query topics and subtopics with a certain degree of complementarity, i.e., most of the perceived visual information differs from one photo to another.
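A common way to build such a summary is greedy re-ranking in the style of maximal marginal relevance (MMR, which also appears in one run name later in this overview). The sketch below is purely illustrative: query_sim and photo_sim are hypothetical similarity functions, and it is neither the task baseline nor any team's actual method.

```python
# Minimal MMR-style greedy diversification sketch. `candidates`, `query_sim`,
# and `photo_sim` are hypothetical inputs, not part of the task's tooling.
def diversify(candidates, query_sim, photo_sim, k=50, lam=0.7):
    """Greedily pick up to k photos, trading relevance to the query
    against redundancy with the photos already selected."""
    selected, pool = [], list(candidates)
    while pool and len(selected) < k:
        def mmr(p):
            # redundancy = similarity to the closest already-selected photo
            redundancy = max((photo_sim(p, s) for s in selected), default=0.0)
            return lam * query_sim(p) - (1 - lam) * redundancy
        best = max(pool, key=mmr)
        selected.append(best)
        pool.remove(best)
    return selected
```

With lam = 1.0 the sketch reduces to plain relevance ranking; lowering lam favors diversity (cluster recall) over precision.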

Page 8: MediaEval 2017 Retrieving Diverse Social Images Task (Overview)


Dataset: General Information & Resources

Provided information:
- query text formulation;
- ranked list of Creative Commons photos from Flickr (up to 300 photos per query);
- metadata from Flickr (e.g., tags, description, views, comments, date and time the photo was taken, username, user id, etc.);
- visual, text & user annotation credibility descriptors;
- semantic vectors for general English terms computed on top of the English Wikipedia (wikiset);
- relevance and diversity ground truth.

Photos:
- development set: 110 queries, 32,340 photos;
- test set: 84 queries, 24,986 photos.

Page 9: MediaEval 2017 Retrieving Diverse Social Images Task (Overview)


Dataset: Provided Descriptors

General-purpose visual descriptors:
- e.g., Auto Color Correlogram, Color and Edge Directivity Descriptor, Pyramid of Histograms of Orientation Gradients, etc.;

Convolutional Neural Network based descriptors:
- Caffe framework based;

General-purpose text descriptors:
- e.g., term frequency (TF) and document frequency (DF) information and their combination, i.e., TF-IDF (a small worked example follows below);

User annotation credibility descriptors (give an automatic estimation of the quality of users' tag-image content relationships):
- e.g., a measure of user image relevance, the total number of images a user shared, the percentage of images with faces.
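As a quick illustration of the text descriptors, here is the common tf * log(N/df) weighting on a toy corpus; the dataset's actual TF-IDF variant may differ in normalization.

```python
import math

# Toy TF-IDF computation (common tf * log(N/df) variant; the provided
# descriptors may use a different normalization).
docs = [
    ["autumn", "colors", "forest"],
    ["autumn", "leaves"],
    ["snow", "city"],
]

def tf_idf(term, doc, corpus):
    tf = doc.count(term) / len(doc)            # term frequency within the document
    df = sum(1 for d in corpus if term in d)   # number of documents containing the term
    return tf * math.log(len(corpus) / df)     # weight down terms common across documents

print(round(tf_idf("autumn", docs[0], docs), 3))  # 0.135
```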

Page 10: MediaEval 2017 Retrieving Diverse Social Images Task (Overview)


Dataset: Basic Statistics

                              devset                  testset
                              (design the methods)    (final benchmarking)
#queries                      110                     84
#images                       32,340                  24,986
#img. per query
(min - average - max)         141 - 295 - 300         299 - 300 - 300
% relevant img.               53                      57.4
avg. #clusters per query      17                      14
avg. #img. per cluster        9                       14

Page 11: MediaEval 2017 Retrieving Diverse Social Images Task (Overview)


Dataset: Ground Truth - annotations

Relevance and diversity annotations were carried out by expert annotators:
- devset: relevance: 8 annotators + 1 master (3 annotations/query); diversity: 1 annotation/query;
- testset: relevance: 8 annotators + 1 master (3 annotations/query); diversity: 12 annotators (3 annotations/query).

Final relevance labels were determined by lenient majority voting (a sketch follows below).
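The exact lenient rule is not spelled out on the slide; one plausible reading, sketched below under that assumption, is that uncertain votes count toward relevance and ties resolve leniently.

```python
from collections import Counter

# Sketch of lenient majority voting over 3 relevance annotations per image.
# Assumption: annotators vote "yes"/"no"/"dontknow", "dontknow" counts
# toward relevance, and ties resolve to relevant; the task's actual rule
# may differ in detail.
def lenient_majority(votes):
    counts = Counter(votes)
    yes_like = counts["yes"] + counts["dontknow"]  # lenient: benefit of the doubt
    return yes_like >= counts["no"]

print(lenient_majority(["yes", "no", "dontknow"]))  # True  (2 vs. 1)
print(lenient_majority(["no", "no", "yes"]))        # False (1 vs. 2)
```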

Page 12: MediaEval 2017 Retrieving Diverse Social Images Task (Overview)


Evaluation: Run Specification

Participants are required to submit up to 5 runs:

Required runs:
- run 1: automated, using visual information only;
- run 2: automated, using textual information only;
- run 3: automated, using fused textual-visual information and no resources other than those provided by the organizers.

General runs:
- run 4: everything allowed, e.g., human-based or hybrid human-machine approaches, including the use of data from external sources (e.g., the Internet) or pre-trained models obtained from external datasets related to this task;
- run 5: everything allowed.

Page 13: MediaEval 2017 Retrieving Diverse Social Images Task (Overview)


Evaluation: Official Metrics

Cluster Recall at X (CR@X) = Nc / N, where X is the cutoff point, N is the total number of clusters for the current query (from the ground truth, N <= 25), and Nc is the number of different clusters represented among the top X ranked images. Cluster recall is computed over the relevant images only.

Precision at X (P@X) = R / X, where R is the number of relevant images among the top X.

F1-measure at X (F1@X) = the harmonic mean of CR@X and P@X.

Metrics are reported for several values of X (5, 10, 20, 30, 40 & 50), both per topic and overall (average).

Official ranking metric: F1@20 (a small computation sketch follows below).
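The metrics follow directly from the definitions above; here is a minimal per-query sketch (illustrative names, not the official evaluation tool):

```python
# Compute P@X, CR@X, and F1@X for one query, per the definitions above.
# `ranked` is a run's ranked photo-id list, `relevant` the set of relevant
# ids, `cluster_of` the ground-truth cluster of each relevant id, and
# `n_clusters` the query's total cluster count N (names are illustrative).
def metrics_at(ranked, relevant, cluster_of, n_clusters, x=20):
    top = ranked[:x]
    rel = [p for p in top if p in relevant]
    p = len(rel) / x                                       # P@X = R / X
    cr = len({cluster_of[q] for q in rel}) / n_clusters    # CR@X = Nc / N
    f1 = 2 * p * cr / (p + cr) if (p + cr) > 0 else 0.0    # harmonic mean
    return p, cr, f1
```

The overall score then averages the per-query values, with F1@20 used for the official ranking.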

Page 14: MediaEval 2017 Retrieving Diverse Social Images Task (Overview)


Participants: Basic Statistics

Survey:
- 22 respondents were interested in the task.

Registration:
- 14 teams registered (1 team organizer-related).

Run submission:
- 6 teams finished the task, including 1 organizer-related team;
- 29 runs were submitted.

Workshop participation:
- 5 teams are represented at the workshop.

Page 15: MediaEval 2017 Retrieving Diverse Social Images Task (Overview)


Participants: Submitted Runs (29)

Team          Country      1 (visual)  2 (text)  3 (vis-text)  4                  5                  P@20    CR@20   F1@20
NLE           France       ✓           ✓         ✓             visual-text        visual-text        0.793   0.679   0.705
MultiBrazil   Brazil       ✓           ✓         ✓             visual-text-cred.  visual-text-cred.  0.7208  0.6524  0.6634
UMONS         Belgium      ✓           ✓         ✓             visual-text-cred.  visual-cred.       0.8071  0.5856  0.6554
CFM           China        ✓           ✓         ✓             text-cred.         text-cred.         0.6881  0.6671  0.6533
tud-mmc       Netherlands  ✓           ✓         ✓             text-intent        ✗                  0.7262  0.6142  0.6462
Flickr (initial ranking)                                                                             0.6595  0.5831  0.5922
LAPI*         Romania      ✓           ✓         ✓             visual             cred.              0.633   0.6045  0.5777

Runs 1-3 are required; runs 4-5 are general. Results are the best per team.
*organizer-related team

Page 16: MediaEval 2017 Retrieving Diverse Social Images Task (Overview)


Results: P vs. CR @20 (all runs - testset)

[Scatter plot: P@20 (y-axis, 0.55-0.85) vs. CR@20 (x-axis, 0.5-0.7) for all submitted runs on the testset; legend: Flickr Initial, CFM, LAPI, MultiBrazil, NLE, tud-mmc, UMONS. The Flickr initial ranking, NLE, and UMONS points are highlighted.]

Page 17: MediaEval 2017 Retrieving Diverse Social Images Task (Overview)


Results: Best Team Runs (F1@X)

[Line chart: F1@X (y-axis, 0.3-0.8) at cutoffs X = 5, 10, 20, 30, 40, 50 for each team's best run vs. the initial Flickr ranking; runs: CFM_run5_text_cred.txt, LAPI_HC_PSRF_Run5.txt, run3VisualTextual_MultiBrasil.txt, NLE_run3_CMRF_MMR.txt, tudmmc_run4_tudmmc_intent.txt, UMONS_run5_visual_user_G.txt.]

Page 18: MediaEval 2017 Retrieving Diverse Social Images Task (Overview)


Results: Best Team Runs (Cluster Recall@X)

[Line chart: CR@X (y-axis, 0.55-0.9) at cutoffs X = 5, 10, 20, 30, 40, 50 for the same best team runs vs. the initial Flickr ranking.]

Page 19: MediaEval 2017 Retrieving Diverse Social Images Task (Overview)

Results: Visual Results – Flickr Initial Results

Truck Camper


Page 20: MediaEval 2017 Retrieving Diverse Social Images Task (Overview)

Results: Visual Results – Flickr Initial Results

Truck Camper CR@20=0.35, P@20=0.3, F1@20=0.32


Page 21: MediaEval 2017 Retrieving Diverse Social Images Task (Overview)

Results: Visual Results #2 – Best run (F1@20)


Truck Camper

Page 22: MediaEval 2017 Retrieving Diverse Social Images Task (Overview)

Results: Visual Results #2 – Best run (F1@20)


Truck Camper CR@20=0.68, P@20=0.8, F1@20=0.74

Page 23: MediaEval 2017 Retrieving Diverse Social Images Task (Overview)

Results: Visual Results #3 – Lowest run


Truck Camper

Page 24: MediaEval 2017 Retrieving Diverse Social Images Task (Overview)

Results: Visual Results #3 – Lowest run


Truck Camper CR@20=0.5, P@20=0.5, F1@20=0.5

Page 25: MediaEval 2017 Retrieving Diverse Social Images Task (Overview)


Brief Discussion

Methods:
- this year, mainly classification/clustering (& fusion), re-ranking, relevance feedback, and neural-network-based approaches;
- best run by F1@20: improving relevance (text) + neural-network-based clustering, using visual-text information (team NLE).

Dataset:
- getting very complex (read: diverse);
- Creative Commons resources on Flickr are still limited;
- the provided descriptors were very well received (employed as provided by all participants).

Page 26: MediaEval 2017 Retrieving Diverse Social Images Task (Overview)


Acknowledgements

Task auxiliaries:
- Bogdan Boteanu, UPB, Romania & Mihai Lupu, Vienna University of Technology, Austria

Task supporters:
- Alberto Ueda, Bruno Laporais, Felipe Moraes, Lucas Chaves, Jordan Silva, Marlon Dias, Rafael Glater;
- Catalin Mitrea, Mihai Dogariu, Liviu Stefan, Gabriel Petrescu, Alexandru Toma, Alina Banica, Andreea Roxana, Mihaela Radu, Bogdan Guliman, Sebastian Moraru.

Page 27: MediaEval 2017 Retrieving Diverse Social Images Task (Overview)


Questions & Answers

Thank you!