![Page 1: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/1.jpg)
Using Probabilistic Models for Multimedia Retrieval
Arjen P. de Vries
arjen@acm.org
(Joint research with Thijs Westerveld)
Centrum voor Wiskunde en Informatica
E-BioSci/ORIEL Annual Workshop, Sep 3-5, 2003
![Page 2: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/2.jpg)
Introduction
• Eiffel tower
• scary/spooky Eiffel tower
![Page 3: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/3.jpg)
Outline
• Generative Models
  – Generative Model
  – Probabilistic retrieval
  – Language models, GMMs
• Experiments
  – Corel experiments
  – TREC Video benchmark
• Conclusions
![Page 4: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/4.jpg)
What is a Generative Model?
• A statistical model for generating data
  – Probability distribution over samples in a given ‘language’ M

$$P(x_1 x_2 x_3 x_4 \mid M) = P(x_1 \mid M)\,P(x_2 \mid M, x_1)\,P(x_3 \mid M, x_1 x_2)\,P(x_4 \mid M, x_1 x_2 x_3)$$
© Victor Lavrenko, Aug. 2002
![Page 5: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/5.jpg)
Generative Models
Text generated from a language model (LM) of the abstract:
video of Bayesian model to that present the disclosure can a on for retrieval in have is probabilistic still of for of using this In that is to only queries queries visual combines visual information look search video the retrieval based search. Both get decision (a visual generic results (a difficult We visual we still needs, search. talk what that to do this for with retrieval still specific retrieval information a as model still
![Page 6: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/6.jpg)
Unigram and higher-order models
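• Chain rule (no independence assumption)
  $$P(x_1 x_2 x_3 x_4) = P(x_1)\,P(x_2 \mid x_1)\,P(x_3 \mid x_1 x_2)\,P(x_4 \mid x_1 x_2 x_3)$$
• Unigram Models
  $$P(x_1)\,P(x_2)\,P(x_3)\,P(x_4)$$
• N-gram Models (bigram shown)
  $$P(x_1)\,P(x_2 \mid x_1)\,P(x_3 \mid x_2)\,P(x_4 \mid x_3)$$
• Other Models
  – Grammar-based models, etc.
  – Mixture models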
© Victor Lavrenko, Aug. 2002
![Page 7: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/7.jpg)
The fundamental problem
• Usually we don’t know the model M
  – But have a sample representative of that model
• First estimate a model from a sample
• Then compute the observation probability

$$P(\text{observation} \mid M(\text{sample}))$$

© Victor Lavrenko, Aug. 2002
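The two steps above (estimate a model from a sample, then compute the observation probability) can be sketched for the simplest case, a maximum-likelihood unigram model. This is a minimal illustration, not the method used in the talk; the function names and the toy sample are assumptions made here.

```python
from collections import Counter
from fractions import Fraction


def estimate_unigram(sample):
    """Step 1: maximum-likelihood unigram model (relative frequencies)."""
    counts = Counter(sample)
    total = sum(counts.values())
    return {token: Fraction(n, total) for token, n in counts.items()}


def observation_probability(observation, model):
    """Step 2: P(observation | M(sample)), assuming independent draws."""
    prob = Fraction(1)
    for token in observation:
        # Tokens never seen in the sample get probability 0 (no smoothing here).
        prob *= model.get(token, Fraction(0))
    return prob


sample = "the cat sat on the mat".split()  # hypothetical sample
model = estimate_unigram(sample)
p = observation_probability(["the", "cat"], model)
print(p)  # 1/18, i.e. (2/6) * (1/6)
```

Exact fractions are used only to keep the arithmetic readable; a practical system would more likely sum log-probabilities.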
![Page 8: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/8.jpg)
Indexing: determine models
• Indexing
  – Estimate Gaussian Mixture Models from images using EM
  – Based on feature vector with colour, texture and position information from pixel blocks
  – Fixed number of components

Docs → Models
![Page 9: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/9.jpg)
Retrieval: use query likelihood
• Query:
• Which of the models is most likely to generate these 24 samples?
![Page 10: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/10.jpg)
Probabilistic Image Retrieval
?
![Page 11: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/11.jpg)
Query
Rank by P(Q|M)
P(Q|M1)
P(Q|M4)
P(Q|M3)
P(Q|M2)
![Page 12: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/12.jpg)
Probabilistic Retrieval Model
• Text
  – Rank using probability of drawing query terms from document models
• Images
  – Rank using probability of drawing query blocks from document models
• Multi-modal
  – Rank using joint probability of drawing query samples from document models
![Page 13: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/13.jpg)
Text Models
• Unigram Language Models (LM)
  – Urn metaphor
• $P(x_1 x_2 x_3 x_4) \approx P(x_1)\,P(x_2)\,P(x_3)\,P(x_4) = 4/9 \cdot 2/9 \cdot 4/9 \cdot 3/9$
© Victor Lavrenko, Aug. 2002
![Page 14: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/14.jpg)
Generative Models and IR
• Rank models (documents) by probability of generating the query
• Q:
• $P(Q \mid M_1) = 4/9 \cdot 2/9 \cdot 4/9 \cdot 3/9 = 96/9^4$
• $P(Q \mid M_2) = 3/9 \cdot 3/9 \cdot 3/9 \cdot 3/9 = 81/9^4$
• $P(Q \mid M_3) = 2/9 \cdot 3/9 \cdot 2/9 \cdot 4/9 = 48/9^4$
• $P(Q \mid M_4) = 2/9 \cdot 5/9 \cdot 2/9 \cdot 2/9 = 40/9^4$
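The ranking on this slide can be reproduced with a short sketch. The urn contents and item names below are assumptions, chosen only so that the four products match the slide (each urn holds 9 items and the query is four draws); the model names M1 to M4 are labels introduced here.

```python
from fractions import Fraction


def query_likelihood(query, urn):
    """Probability of drawing the query items (with replacement) from one urn."""
    total = sum(urn.values())
    prob = Fraction(1)
    for item in query:
        prob *= Fraction(urn.get(item, 0), total)
    return prob


# Hypothetical urn contents; each urn contains 9 items.
urns = {
    "M1": {"a": 4, "b": 2, "c": 3},
    "M2": {"a": 3, "b": 3, "c": 3},
    "M3": {"a": 2, "b": 3, "c": 4},
    "M4": {"a": 2, "b": 5, "c": 2},
}
query = ["a", "b", "a", "c"]

scores = {name: query_likelihood(query, urn) for name, urn in urns.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    # Multiply by 9**4 to show the numerator over the common denominator 6561.
    print(name, scores[name] * 9**4, "/ 6561")
```

All four scores share the denominator 9^4 = 6561, so the ranking is decided by the numerators 96, 81, 48 and 40.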
![Page 15: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/15.jpg)
The Zero-frequency Problem
• Suppose some event not in our example
  – Model will assign zero probability to that event
  – And to any set of events involving the unseen event
• Happens frequently with language
• It is incorrect to infer zero probabilities
  – Especially when dealing with incomplete samples
?
![Page 16: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/16.jpg)
Smoothing
• Idea: shift part of probability mass to unseen events
• Interpolation with background (General English)
  – Reflects expected frequency of events
  – Plays role of IDF

$$\lambda\,P(x \mid \text{document}) + (1-\lambda)\,P(x \mid \text{background})$$
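A minimal sketch of interpolation with a background model, assuming simple count dictionaries. The weight `lam`, its default of 0.5 and the toy counts are assumptions for illustration; the slides do not state the value that was used.

```python
def smoothed_probability(term, doc_counts, background_counts, lam=0.5):
    """lam * P(term | document) + (1 - lam) * P(term | background)."""
    doc_total = sum(doc_counts.values())
    bg_total = sum(background_counts.values())
    p_doc = doc_counts.get(term, 0) / doc_total
    p_bg = background_counts.get(term, 0) / bg_total
    return lam * p_doc + (1 - lam) * p_bg


# Hypothetical counts: "paris" never occurs in the document.
doc = {"eiffel": 2, "tower": 2}
background = {"eiffel": 1, "tower": 3, "paris": 4, "the": 12}

p_unseen = smoothed_probability("paris", doc, background)
p_seen = smoothed_probability("tower", doc, background)
print(p_unseen, p_seen)
```

The unseen term now receives a non-zero probability from the background model, so a single missing query term no longer forces the whole query likelihood to zero.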
![Page 17: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/17.jpg)
Image Models
• Urn metaphor not useful
  – Drawing pixels useless
    • Pixels carry no semantics
  – Drawing pixel blocks not effective
    • chances of drawing exact query blocks from document slim
• Use Gaussian Mixture Models (GMM)
  – Fixed number of Gaussian components/clusters/concepts
![Page 18: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/18.jpg)
?
Image Models
• Expectation-Maximisation (EM) algorithm
  – iteratively
    • estimate component assignments
    • re-estimate component parameters
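The two alternating steps can be sketched for a one-dimensional Gaussian mixture. This is a simplified illustration under stated assumptions: the talk uses multi-dimensional feature vectors, whereas the code below handles scalar samples only, and the quantile-based initialisation and the iteration count are choices made here.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_gmm_1d(data, n_components, n_iter=50):
    n = len(data)
    ordered = sorted(data)
    # Assumed initialisation: means at evenly spaced quantiles of the data.
    means = [ordered[(2 * k + 1) * n // (2 * n_components)] for k in range(n_components)]
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    variances = [grand_var] * n_components
    weights = [1.0 / n_components] * n_components

    for _ in range(n_iter):
        # E-step: estimate soft component assignments (responsibilities).
        resp = []
        for x in data:
            joint = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: re-estimate weight, mean and variance of every component.
        for k in range(n_components):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)  # guard against collapsing components
    return weights, means, variances


# Synthetic data: two clusters around 0 and 10.
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(200)] + [rng.gauss(10.0, 1.0) for _ in range(200)]
weights, means, variances = em_gmm_1d(data, n_components=2)
print(weights, means, variances)
```

The number of components is fixed in advance, as on the slide; only the assignments and the component parameters change between iterations.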
![Page 19: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/19.jpg)
Expectation-Maximization
[Figure: E and M steps alternating over Component 1, Component 2 and Component 3]
![Page 20: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/20.jpg)
Expectation-Maximization
[Animation: E and M steps alternating over Component 1, Component 2 and Component 3]
![Page 21: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/21.jpg)
Key-frame representation
Query model
• Split colour channels (Y, Cb, Cr)
• Take samples
• Per sample: DCT coefficients + position
• EM algorithm

Feature vectors (DCT coefficients followed by position), one row per sample:
675 9 12 11 1 9 4 1517 -9 -3 0 0 0 1 850 15 4 0 1 4 -2 1 1
661 7 13 5 -5 11 3 1536 2 -4 0 1 1 0 844 5 4 -2 0 1 -2 1 2
668 -7 13 3 -3 0 -1 1534 0 -5 0 0 0 0 837 3 3 -3 0 -2 1 1 3
665 10 11 2 4 5 2 1534 0 -5 0 0 0 0 829 0 3 -1 0 0 0 1 4
669 -5 18 7 -3 1 -5 1534 0 -5 0 0 0 0 833 -5 4 -1 0 3 -1 1 5
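How a feature vector of DCT coefficients plus position might be built for one pixel block of one colour channel is sketched below. Several details are assumptions not given in the slides: the 8×8 block size, the orthonormal DCT-II scaling, and the diagonal ordering used to pick the first coefficients. The function names are hypothetical.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block (list of rows)."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for i in range(n):
                for j in range(n):
                    s += (block[i][j]
                          * math.cos((2 * i + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * j + 1) * v * math.pi / (2 * n)))
            coeffs[u][v] = alpha(u) * alpha(v) * s
    return coeffs


def feature_vector(block, n_coeffs, position):
    """First n_coeffs coefficients (low frequencies first) followed by the position."""
    n = len(block)
    coeffs = dct_2d(block)
    # Assumed ordering: by diagonal (u + v), lowest frequencies first.
    order = sorted(((u, v) for u in range(n) for v in range(n)),
                   key=lambda uv: (uv[0] + uv[1], uv[0]))
    return [coeffs[u][v] for u, v in order[:n_coeffs]] + list(position)


flat_block = [[10] * 8 for _ in range(8)]  # hypothetical uniform 8x8 block
fv = feature_vector(flat_block, n_coeffs=3, position=(1, 1))
print(fv)
```

For a uniform block only the first (DC) coefficient is non-zero, which matches the intuition that the remaining coefficients describe texture. A full feature vector would concatenate the coefficients of all three channels before appending the position.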
![Page 22: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/22.jpg)
Scary Formulas
![Page 23: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/23.jpg)
Probabilistic Retrieval Model
• Find document(s) D* with highest probability given query Q (MAP):
  $$D^* = \arg\max_i P(D_i \mid Q) = \arg\max_i \frac{P(Q \mid D_i)\,P(D_i)}{P(Q)}$$
• Equal priors → ML:
  $$D^* = \arg\max_i P(Q \mid D_i)$$
• Approximated by minimum Kullback-Leibler divergence:
  $$D^* = \arg\min_i \mathrm{KL}\left[P_q(x)\,\|\,P_i(x)\right] = \arg\max_i \int P(x \mid D_q)\,\log P(x \mid D_i)\,dx$$
![Page 24: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/24.jpg)
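Query Models
• Query
  – Bag of textual terms
  – Bag of visual blocks
  $$Q = \{x_1, x_2, \ldots, x_N\}$$
• Query model: empirical query distribution
  $$P(x \mid D_q) = \begin{cases} \frac{1}{N}, & x \in Q \\ 0, & \text{otherwise} \end{cases}$$
• KL distance
  $$\int P(x \mid D_q)\,\log P(x \mid D_i)\,dx = \frac{1}{N}\sum_{j=1}^{N} \log P(x_j \mid D_i)$$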
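With an empirical query distribution, the ranking criterion reduces to the average log-likelihood of the query samples under each document model. A minimal sketch for one-dimensional Gaussian mixture document models follows; the model parameters, document names and query samples are hypothetical, and real feature vectors would be multi-dimensional.

```python
import math


def gmm_density(x, weights, means, variances):
    """P(x | D_i) for a 1-D Gaussian mixture document model."""
    return sum(w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
               for w, m, v in zip(weights, means, variances))


def average_log_likelihood(query_samples, model):
    """(1/N) * sum_j log P(x_j | D_i), the quantity maximised over documents."""
    weights, means, variances = model
    total = sum(math.log(gmm_density(x, weights, means, variances)) for x in query_samples)
    return total / len(query_samples)


def rank_documents(query_samples, models):
    scores = {name: average_log_likelihood(query_samples, m) for name, m in models.items()}
    return sorted(scores, key=scores.get, reverse=True), scores


# Hypothetical document models: (weights, means, variances).
models = {
    "doc_a": ([0.5, 0.5], [0.0, 10.0], [1.0, 1.0]),
    "doc_b": ([0.5, 0.5], [5.0, 20.0], [1.0, 1.0]),
}
query_samples = [0.2, 9.7, -0.5, 10.4]

ranking, scores = rank_documents(query_samples, models)
print(ranking, scores)
```

Working in the log domain turns the product over query samples into a sum, which avoids numerical underflow when a query has many samples.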
![Page 25: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/25.jpg)
Corel Experiments
![Page 26: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/26.jpg)
Testing the Model on Corel
• 39 classes, ~100 images each
• Build models from all images
• Use each image as query
  – Rank full collection
  – Compute MAP (mean average precision)
    • AP = average of precision values after each relevant image is retrieved
    • MAP is mean of AP over multiple queries
  – Relevant from query class
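The AP and MAP definitions above can be written out as a short sketch. It assumes the common convention of dividing by the total number of relevant items, which coincides with the slide's definition when the full collection is ranked; the identifiers in the example are made up.

```python
def average_precision(ranked_ids, relevant_ids):
    """Mean of the precision values measured at the rank of each relevant item."""
    relevant = set(relevant_ids)
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """runs: list of (ranked_ids, relevant_ids) pairs, one per query."""
    return sum(average_precision(ranked, rel) for ranked, rel in runs) / len(runs)


ap = average_precision(["a", "x", "b", "y"], {"a", "b"})  # (1/1 + 2/3) / 2
map_score = mean_average_precision([
    (["a", "x", "b", "y"], {"a", "b"}),
    (["x", "a"], {"a"}),
])
print(ap, map_score)
```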
![Page 27: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/27.jpg)
Example results
Query:
Top 5:
![Page 28: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/28.jpg)
MAP per Class (mean: .12)
• English Pub Signs .36
• English Country Gardens .33
• Arabian Horses .31
• Dawn & Dusk .21
• Tropical Plants .19
• Land of the Pyramids .19
• Canadian Rockies .18
• Lost Tribes .17
• Elephants .17
• Tigers .16
• Tropical Sea Life .16
• Exotic Tropical Flowers .16
• Lions .15
• Indigenous People .15
• Nesting Birds .13
• …
• Sweden .07
• Ireland .07
• Wildlife of the Galapagos .07
• Hawaii .07
• Rural France .07
• Zimbabwe .07
• Images of Death Valley .07
• Nepal .07
• Foxes & Coyotes .06
• North American Deer .06
• California Coasts .06
• North American Wildlife .06
• Peru .05
• Alaskan Wildlife .05
• Namibia .05
![Page 29: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/29.jpg)
Class confusion
• Query from class A
• Relevant from class B
• Queries retrieve images from own class
• Interesting mix-ups
  – Beaches – Greek islands
  – Indigenous people – Lost tribes
  – English country gardens – Tropical plants – Arabian Horses
• Similar backgrounds
![Page 30: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/30.jpg)
Tuning the Models
• Yet another subset of Corel data
  – 39 classes, 10 images each
  – Index as before and calculate MAP
• Vary model parameters
  – NY: number of DCT coefficients from Y channel (1, 3, 6, 10, 15, 21)
  – NCbCr: number of DCT coefficients from Cb and Cr channels (0, 1, NY)
  – Xypos: do/do not use position of samples
  – C: number of components in GMM (1, 2, 4, 8, 16, 32)
![Page 31: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/31.jpg)
Example Image
![Page 32: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/32.jpg)
Example models + samples
Varying C (NY=10, NCbCr=1, Xypos=1)
C=4 C=8 C=32
![Page 33: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/33.jpg)
Example models + samples
Varying NCbCr (NY=10, Xypos=1, C=8)
NCbCr=0 NCbCr=1 NCbCr=10
![Page 34: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/34.jpg)
MAP with different parameters
NCbCr Xypos C=1 C=2 C=4 C=8 C=16 C=32
0 0 .08 .18 .20 .21 .21 .21
0 1 .09 .19 .21 .21 .21 .20
1 0 .13 .22 .23 .23 .23 .23
1 1 .13 .22 .23 .23 .23 .22
10 0 .12 .22 .24 .24 .24 .23
10 1 .13 .21 .24 .24 .24 .23
![Page 35: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/35.jpg)
Statistical Significance
• Mixture better than single Gauss (C>1)
• Small differences between settings
  – Yet, small differences might be significant
• Wilcoxon signed-rank test (sign. level 5%)

A      B       Diff  Rank  Signed rank
97     96      -1    1     -1
88     86      -2    2.5   -2.5
75     79       4    4      4
90     88      -2    2.5   -2.5
85     93       8    5      5
m=87   m=88.4        Σ=15  Z+=9, Z-=6

Expected value of Z+ and Z- if A and B do not differ: 15/2 = 7.5
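The rank sums in the example table can be recomputed with a short sketch. It covers only the ranking step (ties share the mean rank) and the two rank sums; it does not look up critical values or compute a p-value, and the function name is an assumption.

```python
def signed_ranks(a, b):
    """Signed ranks of the non-zero paired differences b - a (ties share the mean rank)."""
    diffs = [y - x for x, y in zip(a, b) if y != x]
    ordered = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    pos = 0
    while pos < len(ordered):
        end = pos
        while (end + 1 < len(ordered)
               and abs(diffs[ordered[end + 1]]) == abs(diffs[ordered[pos]])):
            end += 1
        mean_rank = (pos + end) / 2 + 1  # ranks are 1-based
        for i in ordered[pos:end + 1]:
            ranks[i] = mean_rank
        pos = end + 1
    return [r if d > 0 else -r for r, d in zip(ranks, diffs)]


a = [97, 88, 75, 90, 85]
b = [96, 86, 79, 88, 93]

sr = signed_ranks(a, b)
z_plus = sum(r for r in sr if r > 0)
z_minus = -sum(r for r in sr if r < 0)
expected = len(sr) * (len(sr) + 1) / 4  # expected rank sum if there is no difference
print(sr, z_plus, z_minus, expected)
```

For the five pairs in the table this gives Z+ = 9 and Z- = 6 against an expected value of 7.5, so this toy example shows no clear difference between A and B.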
![Page 36: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/36.jpg)
Statistical Significance
• Results
  – Optimal number of components at C=8
    • Fewer components -> insufficient resolution
    • More components -> overfitting
  – Colour information is important (NCbCr > 0)
    • More is better if enough components
  – Position information undecided
    • although using it never harms
![Page 37: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/37.jpg)
Background Matching
Query:
Top 5:
![Page 38: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/38.jpg)
Background Matching
Query:
Top 5:
![Page 39: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/39.jpg)
TREC Experiments
![Page 40: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/40.jpg)
TREC Video Track
• Goal: promote progress in content-based video retrieval via metric-based evaluation
• 25 topics
  – Multimedia descriptions of an information need; 22 had video examples (avg. 2.7 each), 8 had image examples (avg. 1.9 each)
• Task is to return up to 100 best shots
  – NIST assessors judged top 50 shots from each submitted result set; subsequent full judgements showed only minor variations in performance
![Page 41: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/41.jpg)
Video Data
• Used mainly Internet Archive
  – advertising, educational, industrial, amateur films 1930-1970
  – Noisy, strange color, but real archive data
  – 73.3 hours, partitioned as follows:
    • Search test: 40.12
    • Feature development (training and validation): 23.26
    • Feature test: 5.07
    • Shot boundary test: 4.85
![Page 42: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/42.jpg)
Video Representation
• Video as sequence of shots (all TREC)
  – Common ground truth shot set used in evaluation; 14,524 shots
• Shot = image + text (CWI specific):
  – Key-frame (middle frame of shot)
  – ASR speech transcript (LIMSI)
![Page 43: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/43.jpg)
Search Topics
• Requesting shots with specific or generic:
  – People, Things, Locations, Activities
Examples: George Washington, Football players
![Page 44: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/44.jpg)
Search Topics
• Requesting shots with specific or generic:
  – People, Things, Locations, Activities
Examples: Golden Gate Bridge, Sailboats
![Page 45: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/45.jpg)
Search Topics
• Requesting shots with specific or generic:
  – People, Things, Locations, Activities
Example: Overhead views of cities
![Page 46: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/46.jpg)
Search Topics
• Requesting shots with specific or generic:
  – People, Things, Locations, Activities
Example: Rocket taking off
![Page 47: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/47.jpg)
Search Topics Summary
• Requested shots with specific/generic:
  – Combinations of the above:
    • People spending leisure time at the beach
    • Locomotive approaching the viewer
    • Microscopic views of living cells
![Page 48: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/48.jpg)
Experiments
• …with official TREC measures
  – Query representation
  – Textual/Visual/Combined runs
• …without measures; inspecting visual similarity
  – Selecting components
  – Colour vs. texture
  – EM initialisation
![Page 49: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/49.jpg)
Measures
• Precision
  – fraction of retrieved documents that is relevant
• Recall
  – fraction of relevant documents that is retrieved
• Average Precision
  – precision averaged over different levels of recall
• Mean Average Precision (MAP)
  – mean of average precision over all queries
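Precision and recall for a single result set can be computed as in the sketch below; the shot identifiers are made up for illustration.

```python
def precision_recall(retrieved_ids, relevant_ids):
    """Return (precision, recall) for one result set."""
    retrieved = set(retrieved_ids)
    relevant = set(relevant_ids)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical shots: 4 retrieved, 3 relevant, 2 in common.
precision, recall = precision_recall(["s1", "s2", "s3", "s4"], {"s2", "s4", "s9"})
print(precision, recall)
```

Both measures ignore the order of the results, which is why the rank-sensitive average precision is used to compare runs.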
![Page 50: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/50.jpg)
Textual and Visual runs
• Textual
  – Short Queries (Topic description)
  – Long Queries (Topic description + transcripts from video examples)
• Visual
  – All examples
  – Best examples
• Combined
  – Simply add textual and visual log-likelihood scores (joint probability of seeing both query terms and query blocks)
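Adding log-likelihoods corresponds to multiplying the textual and visual probabilities, i.e. treating the two modalities as independent given the shot. A minimal sketch with made-up scores and shot identifiers:

```python
def combine_scores(text_loglik, visual_loglik):
    """Sum of textual and visual log-likelihoods for shots scored by both models."""
    shared = text_loglik.keys() & visual_loglik.keys()
    return {shot: text_loglik[shot] + visual_loglik[shot] for shot in shared}


# Hypothetical log-likelihood scores per shot.
text_loglik = {"shot_1": -12.0, "shot_2": -9.5, "shot_3": -15.0}
visual_loglik = {"shot_1": -80.0, "shot_2": -95.0, "shot_3": -70.0}

combined = combine_scores(text_loglik, visual_loglik)
combined_ranking = sorted(combined, key=combined.get, reverse=True)
print(combined_ranking)
```

In this toy example the visual scores have a much larger magnitude than the textual ones, so they dominate the sum; an unweighted sum can be sensitive to such scale differences between the two models.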
![Page 51: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/51.jpg)
Textual and Visual runs
• Textual > Visual
• Tlong > Tshort
• Combining overall not useful
• If both visual and textual runs good, combining improves

RUN               MAP
Tshort            0.0916
Tlong             0.1212
BoBfull           0.0287
BoBbest           0.0444
BoBbest + Tshort  0.0784
BoBbest + Tlong   0.0870
![Page 52: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/52.jpg)
Visual runs
• Scores for purely visual runs low (MAP .037)
• Drop further when video examples are removed from relevance judgements
![Page 53: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/53.jpg)
Observation
• Content-based retrieval (CBR) successful under two conditions:
  – the query example is derived from the same source as the target objects
  – a domain-specific detector is at hand
![Page 54: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/54.jpg)
vt076: Find shots with James H. Chandler
Top 10:
![Page 55: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/55.jpg)
Retrieval Results
• Non-interactive results disappointing
  – MAP across all participants/systems .056
  – Ignoring ASR runs, MAP drops to .044
• Only known-item retrieval possible
  – MAP for queries with examples from collection .094
  – MAP without these .026 (-40% from average)
• No significant differences between variants
![Page 56: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/56.jpg)
Selecting Query Images
• Find shots of the Golden Gate Bridge
• Full topic – use all examples
• Best example – compute results for individual examples and find best
• Manual example – manually select good example from ones available in topic
![Page 57: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/57.jpg)
Selecting Query Images
• In general Best > Full (MAP full: 0.0287, best: 0.0444)
• Sometimes Full > Best
![Page 58: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/58.jpg)
Selecting Components
• Query articulation can improve retrieval effectiveness, but requires enormous user effort [lowlands2001]
• Document models (GMM) allow for easy selection of important regions [LL10]
![Page 59: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/59.jpg)
Selecting Components
• For each topic we manually selected meaningful components
• No improvement in MAP
• Perhaps useful for more general queries (feature detection?)
  – Further investigation necessary
![Page 60: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/60.jpg)
Component Search
![Page 61: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/61.jpg)
Component Search
1-3:
18:
![Page 62: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/62.jpg)
Being lucky…
1-3:
10 17 68
Rel.:
Visually similar by chance
Visually NOT similar
Keyframe does not represent shot
![Page 63: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/63.jpg)
Informal Results Analysis
• Forget about MAP scores
• Investigate two aspects of experimental results
  – How is image similarity captured
    • Look at top 10 results
  – How do visual results contribute to (MAP) scores
    • Look at key-frames from relevant shots in top 100
• Qualitative observations
![Page 64: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/64.jpg)
Some Observations
• Colour dominates texture
• Homogeneous queries
  – Semantically similar results
  – …or at least visually similar
• Heterogeneous queries
  – Results dominated by subset of query
![Page 65: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/65.jpg)
Some Observations
• Colour dominates texture
![Page 66: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/66.jpg)
Colour dominates texture
![Page 67: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/67.jpg)
Some Observations
• Colour dominates texture
• Homogeneous queries give intuitive results
  – Semantically similar
  – ... or at least visually
![Page 68: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/68.jpg)
Homogeneous query with semantics
![Page 69: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/69.jpg)
Homogeneous query: no semantics, but visual similarity
Top 5 audience:
Top 5 grass:
Full query Audience component Grass component
![Page 70: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/70.jpg)
Some Observations
• Colour dominates texture
• Homogeneous queries give intuitive results
– Semantically similar
– ... or at least visually
• Results for heterogeneous queries are often dominated by a subset of the samples
![Page 71: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/71.jpg)
Heterogeneous query: full query
[Figure: results for five models M]
![Page 72: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/72.jpg)
Heterogeneous query: grass samples
[Figure: results for five models M]
![Page 73: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/73.jpg)
Heterogeneous query
• Possible explanations for the domination of the sky samples
– no document in the collection explains the grass samples well
– sky samples are well explained by any document (i.e. background probability is high)
• Smoothing with background probabilities might help
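The effect of such smoothing can be illustrated with a minimal sketch. It assumes linear interpolation between the document model and a collection-wide background model; the weight `lam`, the function name and the toy probabilities are illustrative assumptions, not values from the experiments.

```python
import math

def smoothed_score(doc_probs, bg_probs, lam):
    """Log-likelihood of the query samples under an interpolated model:
    sum over samples of log(lam * P(x | doc) + (1 - lam) * P(x | background))."""
    return sum(
        math.log(lam * p_doc + (1.0 - lam) * p_bg)
        for p_doc, p_bg in zip(doc_probs, bg_probs)
    )

# Hypothetical per-sample likelihoods for a query with one sky and one grass sample.
background = [0.5, 0.0005]   # sky is common in the collection, grass is rare
doc_a = [0.6, 0.001]         # better on the sky sample
doc_b = [0.3, 0.002]         # better on the grass sample

for lam in (1.0, 0.5):       # lam = 1.0 means no smoothing
    a = smoothed_score(doc_a, background, lam)
    b = smoothed_score(doc_b, background, lam)
    print(f"lam={lam}: doc_a={a:.3f} doc_b={b:.3f} difference={a - b:+.3f}")
```

In this toy setting the two documents tie without smoothing. With smoothing, the sky sample (already well explained by the background) contributes less to the difference, so the document that is better on the grass sample ranks higher.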
![Page 74: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/74.jpg)
Heterogeneous queries with smoothing
[Figure: results for five models M]
• Smoothing seems to help somewhat, but the problem is not solved
• Looking for a model that favors documents with balanced individual sample scores
![Page 75: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/75.jpg)
Controlled Experiments
• What determines visual similarity in the generative probabilistic model?
• Small special purpose collections created from the large TREC video collection
1. Emphasis on colour information
2. Role of initialisation of the mixture models
![Page 76: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/76.jpg)
Colour Experiments
• Collection with 2 copies of each frame
– Original colour image
– Greyscale version
• Build models
– Models can describe colour and texture
• Search using colour and greyscale queries
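Building such a paired collection could look like the sketch below. The frame representation (rows of RGB tuples), the function names and the luma weights are assumptions for illustration; the slides do not say how the greyscale versions were produced.

```python
def to_greyscale(frame):
    """Return a greyscale copy of a frame given as rows of (r, g, b) tuples."""
    grey = []
    for row in frame:
        grey_row = []
        for r, g, b in row:
            # Assumed luma weights (ITU-R BT.601); the conversion used is not specified.
            y = round(0.299 * r + 0.587 * g + 0.114 * b)
            grey_row.append((y, y, y))
        grey.append(grey_row)
    return grey

def build_paired_collection(frames):
    """Map each frame id to its two variants: the colour original and a greyscale copy."""
    return {
        frame_id: {"colour": frame, "grey": to_greyscale(frame)}
        for frame_id, frame in frames.items()
    }

# Toy frame with two pixels: pure red and white.
frames = {"frame1": [[(255, 0, 0), (255, 255, 255)]]}
collection = build_paired_collection(frames)
print(collection["frame1"]["grey"])
```

Models would then be built for both variants of every frame, and both variants can also serve as queries.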
![Page 77: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/77.jpg)
Colour Experiments
Models: M1A M1B, M2A M2B, ..., MNA MNB
P( | MiA) ~ P( | MiB)
P( | MiA) ~ P( | MiB)
![Page 78: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/78.jpg)
Distance between pairs: models without colour
• Results
– P( | ): Ranks 2.9
– P( | ): Ranks 2.0
![Page 79: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/79.jpg)
Distance between pairs: models with colour
• Results
– P( | ): Ranks 89.7
– P( | ): Ranks 7.3
Indeed colour dominates texture
![Page 80: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/80.jpg)
Colour Experiments
• Conclusion:
– Model from colour image only captures colour information
[Diagram: queries matched against models, with ranks 1, 1, 7.3 and 89.7]
![Page 81: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/81.jpg)
EM initialisation
• EM sensitive to initialisation
– Build collection with several models for each frame
– Compare scores for different models from same frame
– Concentrate on top ranks
![Page 82: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/82.jpg)
EM initialisation
• Collection with:
– 2 videos
– 5 frames / shot
– 10 models / frame
  • From random initialisations
• Models from same frame should have similar scores
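The idea of fitting several models to one frame from random initialisations, then comparing their scores, can be sketched as follows. This is a simplified illustration on one-dimensional toy data; the feature dimension, the number of components, the variance floor and all function names are assumptions, not the setup used in the experiments.

```python
import math
import random

def gauss(x, mean, var):
    """Density of a 1-D Gaussian."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def mixture_density(x, model):
    weights, means, variances = model
    return sum(w * gauss(x, m, v) for w, m, v in zip(weights, means, variances))

def fit_gmm(data, k, seed, iterations=50):
    """Fit a k-component 1-D Gaussian mixture with EM from a random start."""
    rng = random.Random(seed)
    weights = [1.0 / k] * k
    means = rng.sample(data, k)      # random initialisation: k data points as means
    variances = [1.0] * k
    for _ in range(iterations):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            joint = [w * gauss(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(joint) or 1e-300
            resp.append([p / total for p in joint])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            n_j = sum(r[j] for r in resp) or 1e-300
            weights[j] = n_j / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / n_j
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / n_j
            variances[j] = max(var, 1e-3)   # floor to keep a component from collapsing
    return weights, means, variances

def log_likelihood(samples, model):
    return sum(math.log(mixture_density(x, model) or 1e-300) for x in samples)

# Toy "frame" with two clusters of samples, and a query drawn from one of them.
data_rng = random.Random(0)
frame = ([data_rng.gauss(0.0, 1.0) for _ in range(100)]
         + [data_rng.gauss(5.0, 1.0) for _ in range(100)])
query = [data_rng.gauss(0.0, 1.0) for _ in range(20)]

models = [fit_gmm(frame, k=2, seed=s) for s in range(10)]   # 10 models for one frame
scores = [log_likelihood(query, m) for m in models]
print(f"score spread across initialisations: {max(scores) - min(scores):.4f}")
```

A small spread means the models from the same frame would end up close together in a ranking, which is the property the experiment checks.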
![Page 83: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/83.jpg)
EM initialisation
Ranks

| Set | Mean | Std-dev |
|---|---|---|
| Frame | 8.06 | 5.95 |
| Shot | 269.85 | 35.09 |
| Video | 2946.10 | 286.15 |
| Collection | 3075.5 | 374.02 |
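Statistics like these can be derived from a ranked list by collecting the ranks of all models that belong to a given set (frame, shot, video or collection). The sketch below uses a hypothetical ranked list and the population standard deviation; which estimator was actually used is not stated in the slides.

```python
import statistics

def rank_statistics(ranking, is_member):
    """Mean and standard deviation of the 1-based ranks of the selected models."""
    ranks = [pos for pos, model_id in enumerate(ranking, start=1) if is_member(model_id)]
    # Population standard deviation is an assumption; the estimator is not specified.
    return statistics.mean(ranks), statistics.pstdev(ranks)

# Hypothetical ranked list of (frame id, model number) pairs, best match first.
ranking = [("f1", 0), ("f1", 1), ("f2", 0), ("f1", 2), ("f3", 0), ("f1", 3)]
mean_rank, std_rank = rank_statistics(ranking, lambda model: model[0] == "f1")
print(mean_rank, round(std_rank, 2))
```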
![Page 84: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/84.jpg)
EM initialisation
• Results
– Models from query frame all near top of list
  • Mean rank: 8.06, std. dev. 5.95
– Models from same shot closer together than models from other frames
– In general: higher ranking frames have their models closer together
• Although EM is sensitive to initialisation, this does not affect ranking much
![Page 85: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/85.jpg)
Concluding Remarks
![Page 86: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/86.jpg)
Lessons TREC-10
• Generalization remains a problem
– Good results for examples from the collection
• Textual search outperforms visual search
– Even with topics designed for visual retrieval!
• Successful visual retrieval often comes down to luck (background, known-item)
• Combining textual and visual results is possible in the presented framework
– When both have reasonable performance, the combination outperforms individual runs
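One simple way to combine the two runs in a probabilistic framework is sketched below. It assumes that the textual and visual parts of the query are independent given the document, so that their log-likelihoods add; the function name, the independence assumption and the toy scores are illustrative only.

```python
def combine_runs(text_scores, visual_scores):
    """Add per-document log-likelihoods and return document ids, best first."""
    shared = text_scores.keys() & visual_scores.keys()
    combined = {doc: text_scores[doc] + visual_scores[doc] for doc in shared}
    return sorted(combined, key=combined.get, reverse=True)

# Hypothetical log-likelihoods of the textual and visual query parts per shot.
text_run = {"shot1": -2.0, "shot2": -1.0, "shot3": -3.0}
visual_run = {"shot1": -1.0, "shot2": -4.0, "shot3": -1.5}
print(combine_runs(text_run, visual_run))
```

Here `shot1` is not the best match in the textual run, yet it tops the combined ranking because it scores reasonably on both modalities.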
![Page 87: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/87.jpg)
Lessons TREC-10
• Component queries retrieve intuitive results
– Convenient for query articulation!
• Color dominates texture
• Sensitivity of EM to initialization does not harm results
• Note: findings are specific to the model, but at least suggest hypotheses for others to investigate
![Page 88: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/88.jpg)
Need for Test Collections
• Results on one collection do not automatically transfer to another
– Multiple collections needed to conclude one technique is better than another
• What is a good test collection?
– Should be representative of a realistic task
  • This is what TREC tries to achieve
– Results should be measurable
  • Like when using Corel
![Page 89: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/89.jpg)
Plans for TREC-11
• Better video representation
– More frames per shot
– Audio GMM (on MFCC)
• Spatial and temporal aspects
– Shot = background + “objects”
• Special research interest in the right balance between interactive query articulation and (semi-)automatic query formulation
![Page 90: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/90.jpg)
Future plans
• Balancing results for heterogeneous queries
• Propagating generic concepts
![Page 91: Using Probabilistic Models for Multimedia Retrieval Arjen P. de Vries arjen@acm.org (Joint research with Thijs Westerveld) Centrum voor Wiskunde en Informatica](https://reader034.vdocuments.mx/reader034/viewer/2022051416/56649e115503460f94afcb2e/html5/thumbnails/91.jpg)
Care for more?
Thijs Westerveld, Arjen de Vries, Alex van Ballegooij, Franciska de Jong and Djoerd Hiemstra. A Probabilistic Multimedia Retrieval Model and Its Evaluation. EURASIP Journal on Applied Signal Processing, 2003:2.