MediaEval 2016 - UNED-UV @ Retrieving Diverse Social Images Task
UNED-UV @ 2016 Retrieving Diverse Social Images
Ángel Castellanos, Xaro Benavent, Ana García-Serrano, Esther de Ves
03/11/2016 1
System Description
Two-step system focusing on a multimedia approach
• Relevancy via Relevance Feedback Algorithm (visual, developed by UV)
• Diversity via FCA-based and HAC Algorithms (textual, developed by UNED)
Relevancy via Relevance Feedback
STEP 1: Reduction of the data dimensionality
The low-level features are reduced by Principal Component Analysis (PCA), retaining only the first components that account for 80% of the data variability.
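The 80% criterion can be illustrated with a minimal sketch. It assumes the eigenvalues of the feature covariance matrix have already been computed; the function name `components_for_variance` and the example eigenvalues are hypothetical, not taken from the UV system.

```python
def components_for_variance(eigenvalues, target=0.80):
    """Return the smallest k such that the k largest eigenvalues
    explain at least `target` of the total variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= target:
            return k
    return len(ordered)


# Hypothetical eigenvalues (variance captured by each principal component).
example_eigenvalues = [6.0, 2.5, 1.0, 0.5]
n_components = components_for_variance(example_eigenvalues)
print(n_components)
```

The features would then be projected onto the first `n_components` principal axes.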
Relevancy via Relevance Feedback
STEP 2: Selection of the relevant and non-relevant image sets
1. The user marks images as relevant or non-relevant (run 4).
2. For the automatic runs (runs 1, 3 and 5), the sets are selected automatically.
Relevant images: the first 5 images of the ranked Flickr list that belong to different clusters
Non-relevant images: selected from other clusters at each query
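The automatic selection of the relevant set can be sketched as follows. This is only an illustration: it assumes a cluster label is already available for every image, the names `pick_relevant`, `ranked_ids` and `cluster_of` are hypothetical, and only the relevant set is covered.

```python
def pick_relevant(ranked_ids, cluster_of, n=5):
    """Walk the ranked list and keep the first image seen from each
    cluster until n images from n different clusters are collected."""
    seen_clusters = set()
    relevant = []
    for image_id in ranked_ids:
        cluster = cluster_of[image_id]
        if cluster in seen_clusters:
            continue  # this cluster already contributed an image
        seen_clusters.add(cluster)
        relevant.append(image_id)
        if len(relevant) == n:
            break
    return relevant


# Hypothetical ranked list and cluster assignment.
ranked_ids = ["a", "b", "c", "d", "e", "f", "g"]
cluster_of = {"a": 0, "b": 0, "c": 1, "d": 2, "e": 1, "f": 3, "g": 4}
print(pick_relevant(ranked_ids, cluster_of))
```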
Relevancy via Relevance Feedback
STEP 3: Parameter estimation of the Local Logistic Regression Models
STEP 4: Ranking of the database
The models are evaluated on all the images of the database, and each estimated model returns a probability of relevance. These probabilities are combined into a single value with a weighted average, which gives the (visual) score/probability of each image.
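A minimal sketch of the combination step, assuming the per-model probabilities are already available. The weights, the image identifiers and the names `combine_scores` and `rank_images` are hypothetical; the slides do not say how the weights are chosen.

```python
def combine_scores(probabilities, weights):
    """Weighted average of the relevance probabilities returned by
    the individual models for one image."""
    if len(probabilities) != len(weights):
        raise ValueError("one weight per model is required")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must have a positive sum")
    weighted = sum(p * w for p, w in zip(probabilities, weights))
    return weighted / total_weight


def rank_images(model_probabilities, weights):
    """Return image ids sorted by combined score, highest first."""
    scores = {
        image_id: combine_scores(probs, weights)
        for image_id, probs in model_probabilities.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical output of three models for three images.
model_probabilities = {
    "img1": [0.9, 0.4, 0.6],
    "img2": [0.2, 0.3, 0.1],
    "img3": [0.7, 0.8, 0.9],
}
print(rank_images(model_probabilities, weights=[1.0, 1.0, 2.0]))
```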
Clusters for diversity
STEP 1: Textual representation of an image
The Kullback-Leibler Divergence (KLD) is used to obtain the image representation, keeping not only the most relevant terms but also the most “different and original” ones.
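One common way to use KLD for term weighting is to score each term by its contribution to the divergence between the image's term distribution and the collection's. The sketch below follows that formulation as an assumption; the slides do not give the exact formula, and the name `kld_term_scores` and the sample terms are hypothetical.

```python
import math
from collections import Counter


def kld_term_scores(image_terms, collection_terms):
    """Score each term of one image by p_image * log(p_image / p_collection),
    its pointwise contribution to the KL divergence."""
    image_counts = Counter(image_terms)
    collection_counts = Counter(collection_terms)
    n_image = sum(image_counts.values())
    n_collection = sum(collection_counts.values())
    scores = {}
    for term, count in image_counts.items():
        if collection_counts[term] == 0:
            # Assumption: the collection includes the image's own text.
            raise ValueError(f"term {term!r} is missing from the collection")
        p_image = count / n_image
        p_collection = collection_counts[term] / n_collection
        scores[term] = p_image * math.log(p_image / p_collection)
    return scores


# Hypothetical image text and collection text.
image_terms = ["beach", "sunset", "beach"]
collection_terms = image_terms + ["beach"] * 10 + ["city"] * 7
scores = kld_term_scores(image_terms, collection_terms)
print(sorted(scores, key=scores.get, reverse=True))
```

A term that is frequent in the image but rare in the collection (here "sunset") scores higher than one that is common everywhere, which matches the "different and original" intent.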
STEP 2: FCA-based modeling
The theory of Formal Concept Analysis (FCA) is used to organize the images in a lattice according to their shared KLD features.
Clusters for diversity
The FCA model groups the images into formal concepts organized hierarchically in a lattice.
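As an illustration of what a formal concept is, the sketch below enumerates all concepts of a small image-by-term context: each concept pairs a set of images (the extent) with the set of terms they all share (the intent). It is a naive enumeration suitable only for tiny examples, not the UNED implementation, and the name `formal_concepts` and the sample context are hypothetical.

```python
def formal_concepts(context):
    """Enumerate the formal concepts of a context given as
    {object: set_of_attributes}. Returns (extent, intent) pairs."""
    all_attributes = frozenset().union(*context.values())
    # Every intent is an intersection of object intents, plus the
    # full attribute set (the intent of the empty extent).
    intents = {all_attributes}
    for attributes in context.values():
        intents |= {frozenset(attributes) & intent for intent in intents}
    concepts = []
    for intent in intents:
        extent = frozenset(
            obj for obj, attributes in context.items() if intent <= attributes
        )
        concepts.append((extent, intent))
    return concepts


# Hypothetical context: images described by their selected terms.
context = {
    "img1": {"beach", "sea"},
    "img2": {"beach", "sunset"},
    "img3": {"beach", "sea", "sunset"},
}
for extent, intent in formal_concepts(context):
    print(sorted(extent), sorted(intent))
```

Ordering the concepts by inclusion of their extents yields the lattice.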
STEP 3: To set the similarity/dissimilarity between the formal concepts:
A HAC (Hierarchical Agglomerative Clustering) approach, using the Zeros-Induced index as the similarity (dissimilarity) measure, groups the formal concepts into the resulting clusters.
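The agglomerative step can be sketched as below. Note the substitution: the slides do not define the index used by the system, so this sketch swaps in the Jaccard similarity between concept extents as a stand-in. The stopping threshold, the union-based cluster representation and the names `jaccard` and `hac` are likewise assumptions.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets (stand-in similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def hac(extents, threshold):
    """Repeatedly merge the two most similar clusters until the best
    similarity falls below `threshold`. A merged cluster is represented
    by the union of the images of its members."""
    clusters = [set(extent) for extent in extents]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                similarity = jaccard(clusters[i], clusters[j])
                if best is None or similarity > best[0]:
                    best = (similarity, i, j)
        if best[0] < threshold:
            break
        _, i, j = best
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters


# Hypothetical concept extents (sets of image ids).
extents = [{"a", "b"}, {"a", "b", "c"}, {"x", "y"}, {"y", "z"}]
print(hac(extents, threshold=0.3))
```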
Clusters for diversity
The clusters resulting from HAC are taken as the different topics/concepts addressed by the images.
STEP 4: Finally, the best image of each cluster is selected, i.e., the one with the highest score/probability according to its visual features.
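This final selection can be sketched in a few lines, assuming the clusters and the visual scores from the relevance step are available. The name `best_per_cluster` and the sample data are hypothetical.

```python
def best_per_cluster(clusters, visual_score):
    """Pick, for each cluster, the image with the highest visual score.
    Images are visited in sorted order so that ties break deterministically."""
    return [
        max(sorted(cluster), key=lambda image_id: visual_score[image_id])
        for cluster in clusters
    ]


# Hypothetical clusters and visual scores.
clusters = [{"a", "b", "c"}, {"x", "y"}]
visual_score = {"a": 0.2, "b": 0.9, "c": 0.5, "x": 0.4, "y": 0.3}
print(best_per_cluster(clusters, visual_score))
```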
• 2014: only two monomodal runs (textual)
• 2015: the monomodal runs (textual and visual) outperformed the automatic multimodal run
• 2016: the automatic multimodal runs (3 and 5) outperform the automatic monomodal runs (1 and 2).
• Importance of the relevant and non-relevant sets for the relevance algorithm.
Challenge: bringing the automatic approach up to the performance level of human-based selection of the relevant and non-relevant image sets.