semantically relevant visual dictionary
European Conference on Operational Research, 2012, Vilnius, Lithuania
![Page 1: Semantically Relevant Visual Dictionary](https://reader033.vdocuments.mx/reader033/viewer/2022051311/540b0e018d7f72dc6a8b45b1/html5/thumbnails/1.jpg)
Semantically Relevant Visual Dictionary
Ashish Gupta (CVSSP)
University of Surrey
July 10, 2012
Ashish Gupta (CVSSP) Semantically Relevant Visual Dictionary
![Page 2: Semantically Relevant Visual Dictionary](https://reader033.vdocuments.mx/reader033/viewer/2022051311/540b0e018d7f72dc6a8b45b1/html5/thumbnails/2.jpg)
Contents
Introduction: Visual Category Recognition
Current practice: Visual Dictionary
Problem: inter-mixed feature vectors
Approach: Over-partition + Co-cluster image-word matrix
Solution: Group estimated categorically related partitions
Experiments:
Summary
![Page 3: Semantically Relevant Visual Dictionary](https://reader033.vdocuments.mx/reader033/viewer/2022051311/540b0e018d7f72dc6a8b45b1/html5/thumbnails/3.jpg)
Visual Category Recognition
Definition
Detect the presence of an instance of a visual category in an image.
![Page 4: Semantically Relevant Visual Dictionary](https://reader033.vdocuments.mx/reader033/viewer/2022051311/540b0e018d7f72dc6a8b45b1/html5/thumbnails/4.jpg)
Challenges
Several variations in visual category appearance render category recognition very difficult.
![Page 5: Semantically Relevant Visual Dictionary](https://reader033.vdocuments.mx/reader033/viewer/2022051311/540b0e018d7f72dc6a8b45b1/html5/thumbnails/5.jpg)
Visual Dictionary
Visual Word
Representative feature vector (generally the centroid) of each cluster.
Image Histogram
Histogram of assignments of image feature vectors to visual words.
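A minimal sketch of the two definitions above, assuming Euclidean nearest-centroid assignment. The function names and the toy 2-D descriptors are illustrative and not from the talk; real SIFT descriptors are 128-dimensional.

```python
from math import dist

def nearest_word(descriptor, words):
    """Index of the visual word (cluster centroid) closest to the descriptor."""
    return min(range(len(words)), key=lambda i: dist(descriptor, words[i]))

def image_histogram(descriptors, words):
    """Count how many of an image's descriptors are assigned to each visual word."""
    hist = [0] * len(words)
    for d in descriptors:
        hist[nearest_word(d, words)] += 1
    return hist

# Hypothetical toy dictionary of three visual words and one image with four descriptors.
words = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
descriptors = [(0.5, 0.2), (9.0, 11.0), (1.0, 9.0), (0.1, 0.1)]
hist = image_histogram(descriptors, words)
```

The histogram has one bin per visual word, so its length equals the dictionary size regardless of how many descriptors the image contains.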
![Page 6: Semantically Relevant Visual Dictionary](https://reader033.vdocuments.mx/reader033/viewer/2022051311/540b0e018d7f72dc6a8b45b1/html5/thumbnails/6.jpg)
Problems with Visual Dictionary
Inter-mixed
Categorically dissimilar feature vectors inter-mixed in feature space.
Semantic scatter
Feature vectors pertaining to the same category part scattered in feature space.
![Page 7: Semantically Relevant Visual Dictionary](https://reader033.vdocuments.mx/reader033/viewer/2022051311/540b0e018d7f72dc6a8b45b1/html5/thumbnails/7.jpg)
Inter-mixed Feature Vectors
Categorically equivalent vectors mapped to naturally occurring clusters
Easily partitioned to yield discriminative dictionary elements
![Page 8: Semantically Relevant Visual Dictionary](https://reader033.vdocuments.mx/reader033/viewer/2022051311/540b0e018d7f72dc6a8b45b1/html5/thumbnails/8.jpg)
Inter-mixed Feature Vectors
Categorically dissimilar vectors inter-mixed
Partitioning yields non-discriminative dictionary
![Page 9: Semantically Relevant Visual Dictionary](https://reader033.vdocuments.mx/reader033/viewer/2022051311/540b0e018d7f72dc6a8b45b1/html5/thumbnails/9.jpg)
Inter-mixed Feature Vectors
Over-partition feature space into tiny clusters.
Build a dictionary using these tiny clusters.
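One way to picture over-partitioning is to run a clustering algorithm with a deliberately large number of clusters. The sketch below assumes plain Lloyd's k-means with a strided, deterministic initialisation; the talk does not name the clustering algorithm, so both choices, along with the helper names and toy points, are assumptions.

```python
from math import dist

def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; initial centroids are k evenly strided input points."""
    step = len(points) // k
    centroids = [tuple(p) for p in points[::step][:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its members; leave empty clusters unchanged.
        centroids = [
            tuple(sum(coord) / len(members) for coord in zip(*members)) if members else centroids[c]
            for c, members in enumerate(clusters)
        ]
    return centroids

def distortion(points, centroids):
    """Total distance from each point to its nearest centroid."""
    return sum(min(dist(p, c) for c in centroids) for p in points)

# Hypothetical 2-D feature vectors forming two coarse groups, each with two tight sub-groups.
points = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (1.1, 0.9),
          (5.0, 5.0), (5.2, 5.1), (6.0, 6.0), (6.1, 5.9)]
coarse = kmeans(points, k=2)  # conventional partition
fine = kmeans(points, k=4)    # over-partition into tiny clusters
```

In this toy case the finer dictionary separates the tight sub-groups that the coarse one merges, which is the property the later grouping step relies on.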
![Page 10: Semantically Relevant Visual Dictionary](https://reader033.vdocuments.mx/reader033/viewer/2022051311/540b0e018d7f72dc6a8b45b1/html5/thumbnails/10.jpg)
Semantic Scatter
Small variations in instances of an object part cause the associated descriptors to scatter in feature space.
Combine visual words which are related to create a visual topic.
![Page 11: Semantically Relevant Visual Dictionary](https://reader033.vdocuments.mx/reader033/viewer/2022051311/540b0e018d7f72dc6a8b45b1/html5/thumbnails/11.jpg)
Hypothesis
Semantically related words can be discovered by analysing the image-word distribution.
![Page 12: Semantically Relevant Visual Dictionary](https://reader033.vdocuments.mx/reader033/viewer/2022051311/540b0e018d7f72dc6a8b45b1/html5/thumbnails/12.jpg)
Visual Topic Dictionary ← Visual Word Dictionary
![Page 13: Semantically Relevant Visual Dictionary](https://reader033.vdocuments.mx/reader033/viewer/2022051311/540b0e018d7f72dc6a8b45b1/html5/thumbnails/13.jpg)
Co-Clustering
Formulate the image-word matrix as a joint probability distribution.
$$C_X : \{x_1, x_2, \ldots, x_m\} \rightarrow \{\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_k\}$$
$$C_Y : \{y_1, y_2, \ldots, y_n\} \rightarrow \{\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_l\}$$
The tuple $(C_X, C_Y)$ is referred to as a co-clustering.
'Re-order' rows and columns of the matrix, which gives rise to blocks, referred to as co-clusters.
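As a rough illustration of the two steps on this slide, the sketch below normalises a toy image-word count matrix into a joint distribution and re-orders its rows and columns by cluster label so that the blocks become visible. The counts and the cluster assignments `row_of` and `col_of` are made up for the example; finding good assignments is the job of the co-clustering algorithm itself.

```python
def joint_distribution(counts):
    """Normalise an image-word count matrix so that all entries sum to 1."""
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]

def reorder(matrix, row_of, col_of):
    """Group rows and columns that share a cluster label (stable within a cluster)."""
    rows = sorted(range(len(matrix)), key=lambda i: row_of[i])
    cols = sorted(range(len(matrix[0])), key=lambda j: col_of[j])
    return [[matrix[i][j] for j in cols] for i in rows]

# Hypothetical counts: 4 images (rows) by 4 visual words (columns).
counts = [[5, 0, 4, 1],
          [0, 6, 1, 5],
          [4, 1, 6, 0],
          [1, 4, 0, 7]]
row_of = [0, 1, 0, 1]  # assumed image clusters
col_of = [0, 1, 0, 1]  # assumed word clusters

p = joint_distribution(counts)
blocks = reorder(counts, row_of, col_of)
```

After re-ordering, the large counts sit in the top-left and bottom-right 2x2 blocks, which are the co-clusters for this toy assignment.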
![Page 14: Semantically Relevant Visual Dictionary](https://reader033.vdocuments.mx/reader033/viewer/2022051311/540b0e018d7f72dc6a8b45b1/html5/thumbnails/14.jpg)
Co-clustering contd.
Optimal co-clustering minimizes the loss in mutual information $I(X;Y) - I(\hat{X};\hat{Y})$, given the number of row ($k$) and column ($l$) clusters.
For a given $(C_X, C_Y)$, the loss in mutual information can be expressed as the KL-divergence between $p(X,Y)$ and an approximation $q(X,Y)$:
$$I(X;Y) - I(\hat{X};\hat{Y}) = D_{KL}\left(p(X,Y) \,\|\, q(X,Y)\right)$$
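The identity can be checked numerically on a small example. The slide does not spell out the approximation q; the sketch assumes the standard information-theoretic co-clustering form q(x, y) = p(x̂, ŷ) p(x | x̂) p(y | ŷ), and the toy counts and cluster labels are invented for illustration.

```python
from math import log

def marginals(p):
    """Row and column marginals of a joint distribution given as a list of rows."""
    return [sum(row) for row in p], [sum(col) for col in zip(*p)]

def mutual_information(p):
    """Mutual information of the joint distribution p, in nats."""
    px, py = marginals(p)
    return sum(v * log(v / (px[i] * py[j]))
               for i, row in enumerate(p)
               for j, v in enumerate(row) if v > 0)

def block_joint(p, row_of, col_of, k, l):
    """Joint distribution of the cluster variables: sum of p over each block."""
    ph = [[0.0] * l for _ in range(k)]
    for i, row in enumerate(p):
        for j, v in enumerate(row):
            ph[row_of[i]][col_of[j]] += v
    return ph

def approximation(p, row_of, col_of, k, l):
    """Assumed form q(x, y) = p(xh, yh) * p(x | xh) * p(y | yh); needs non-empty clusters."""
    px, py = marginals(p)
    ph = block_joint(p, row_of, col_of, k, l)
    pxh, pyh = marginals(ph)
    return [[ph[row_of[i]][col_of[j]] * (px[i] / pxh[row_of[i]]) * (py[j] / pyh[col_of[j]])
             for j in range(len(py))]
            for i in range(len(px))]

def kl_divergence(p, q):
    """KL divergence D(p || q) in nats, summed over entries where p > 0."""
    return sum(v * log(v / q[i][j])
               for i, row in enumerate(p)
               for j, v in enumerate(row) if v > 0)

# Hypothetical image-word counts and a 2 x 2 co-clustering.
counts = [[5, 4, 0, 1], [4, 6, 1, 0], [0, 1, 6, 5], [1, 0, 4, 7]]
total = sum(map(sum, counts))
p = [[c / total for c in row] for row in counts]
row_of, col_of = [0, 0, 1, 1], [0, 0, 1, 1]

loss = mutual_information(p) - mutual_information(block_joint(p, row_of, col_of, 2, 2))
kl = kl_divergence(p, approximation(p, row_of, col_of, 2, 2))
```

Under that assumption `loss` and `kl` agree up to floating-point error, so a search over co-clusterings can minimise either quantity.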
![Page 15: Semantically Relevant Visual Dictionary](https://reader033.vdocuments.mx/reader033/viewer/2022051311/540b0e018d7f72dc6a8b45b1/html5/thumbnails/15.jpg)
Conceptual view
Image histogram feature vectors in the high-dimensional visual word space are projected to the lower-dimensional visual topic space.
The distance between feature vectors from the same category is reduced.
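A small sketch of this projection, assuming a topic histogram is formed by summing the counts of all words assigned to the same topic. The word-to-topic map and the two histograms are hypothetical and chosen so that the two images use different but related words.

```python
from math import dist

def topic_histogram(word_hist, topic_of_word, n_topics):
    """Project a visual-word histogram to topic space by summing counts per topic."""
    hist = [0] * n_topics
    for word, count in enumerate(word_hist):
        hist[topic_of_word[word]] += count
    return hist

# Assumed grouping of six visual words into three visual topics.
topic_of_word = [0, 0, 1, 1, 2, 2]

# Two images of the same category that fire on different words of the same topics.
h1 = [3, 0, 2, 0, 0, 1]
h2 = [0, 3, 0, 2, 1, 0]

t1 = topic_histogram(h1, topic_of_word, 3)
t2 = topic_histogram(h2, topic_of_word, 3)

word_space_distance = dist(h1, h2)
topic_space_distance = dist(t1, t2)
```

In this constructed case the two images are far apart in word space and coincide in topic space; how much real data benefits depends on the quality of the word grouping.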
![Page 16: Semantically Relevant Visual Dictionary](https://reader033.vdocuments.mx/reader033/viewer/2022051311/540b0e018d7f72dc6a8b45b1/html5/thumbnails/16.jpg)
Experiment
Feature descriptor
SIFT : Affine co-variant local image patch descriptor.
Data sets
Scene-15; Pascal VOC 2006; VOC 2007; VOC 2010.
Classifier
k-NN: verify whether the mutual distance between categorically equivalent feature vectors is reduced.
Performance metric
F1-score: harmonic mean of precision and recall. Popularly used in the classification and retrieval communities.
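For reference, the F1-score for one category can be computed from binary labels as sketched below. The function name and the toy labels are illustrative; returning 0.0 when precision or recall is undefined is a convention assumed here, not something stated in the talk.

```python
def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the given positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical ground truth and k-NN predictions for one category (1 = present).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
score = f1_score(y_true, y_pred)
```

Here there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75 and so is the F1-score.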
![Page 17: Semantically Relevant Visual Dictionary](https://reader033.vdocuments.mx/reader033/viewer/2022051311/540b0e018d7f72dc6a8b45b1/html5/thumbnails/17.jpg)
Scene-15 Dataset
It has 15 visual categories of natural indoor and outdoor scenes. Each category has about 200 to 400 images, and the entire dataset has 4485 images.
![Page 18: Semantically Relevant Visual Dictionary](https://reader033.vdocuments.mx/reader033/viewer/2022051311/540b0e018d7f72dc6a8b45b1/html5/thumbnails/18.jpg)
PASCAL VOC2006 Dataset
It has 10 visual categories with about 175 to 650 images per category. There are a total of 5304 images.
![Page 19: Semantically Relevant Visual Dictionary](https://reader033.vdocuments.mx/reader033/viewer/2022051311/540b0e018d7f72dc6a8b45b1/html5/thumbnails/19.jpg)
PASCAL VOC2007 Dataset
It has 20 visual categories. Each category contains between 100 and 2000 images, with 9963 images in all.
![Page 20: Semantically Relevant Visual Dictionary](https://reader033.vdocuments.mx/reader033/viewer/2022051311/540b0e018d7f72dc6a8b45b1/html5/thumbnails/20.jpg)
PASCAL VOC2010 Dataset
It has 20 visual categories and 300 to 3500 images in each category. Combines data from VOC2008 and VOC2009.
![Page 21: Semantically Relevant Visual Dictionary](https://reader033.vdocuments.mx/reader033/viewer/2022051311/540b0e018d7f72dc6a8b45b1/html5/thumbnails/21.jpg)
Dictionary Size
10,000 words → n Topics. Appropriate number of Topics?
Large dictionary becomes category dependent.
![Page 22: Semantically Relevant Visual Dictionary](https://reader033.vdocuments.mx/reader033/viewer/2022051311/540b0e018d7f72dc6a8b45b1/html5/thumbnails/22.jpg)
Summary
Visual dictionary is limited: unsupervised clustering.
Significant intra-category appearance variation: semantic scatter.
Feature vectors from different visual categories inter-mixed in feature space.
Visual Topic ← ∑ Visual Word: grouping the over-partitioned feature space.
Co-clustering Image-Word distribution: discover optimal grouping of words with minimal loss in mutual information.
Semantic dimensionality reduction.
![Page 23: Semantically Relevant Visual Dictionary](https://reader033.vdocuments.mx/reader033/viewer/2022051311/540b0e018d7f72dc6a8b45b1/html5/thumbnails/23.jpg)
Thank you.
Acknowledgement