Beyond bags of features: Adding spatial information


Page 1: Beyond bags of features: Adding spatial information (lazebnik/spring10/lec21_spatial.pdf)

Beyond bags of features: Adding spatial information

Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba

Page 2:

Adding spatial information
• Computing bags of features on sub-windows of the whole image
• Using codebooks to vote for object position
• Generative part-based models

Page 3:

Spatial pyramid representation
• Extension of a bag of features
• Locally orderless representation at several levels of resolution

level 0

Lazebnik, Schmid & Ponce (CVPR 2006)

Page 4:

Spatial pyramid representation
• Extension of a bag of features
• Locally orderless representation at several levels of resolution

level 0 level 1

Lazebnik, Schmid & Ponce (CVPR 2006)

Page 5:

Spatial pyramid representation
• Extension of a bag of features
• Locally orderless representation at several levels of resolution

level 0 level 1 level 2

Lazebnik, Schmid & Ponce (CVPR 2006)
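The pyramid construction can be sketched as follows. This is a minimal illustration, assuming features have already been quantized to visual-word indices; the function name, argument layout, and the omission of per-level weights are assumptions made here, not details taken from the slides.

```python
import numpy as np

def spatial_pyramid_histogram(xy, words, image_size, vocab_size, levels=2):
    """Concatenate per-cell visual-word histograms for pyramid levels 0..levels.

    xy         : (N, 2) feature coordinates (x, y) in pixels
    words      : (N,) visual-word indices in [0, vocab_size)
    image_size : (width, height) of the image
    Note: histograms are left unweighted (an assumption of this sketch).
    """
    xy = np.asarray(xy, dtype=float)
    words = np.asarray(words, dtype=int)
    width, height = image_size
    parts = []
    for level in range(levels + 1):
        cells = 2 ** level  # cells per side: 1, 2, 4, ...
        # Grid cell of each feature; clip so points on the far edge stay inside.
        cx = np.minimum((xy[:, 0] / width * cells).astype(int), cells - 1)
        cy = np.minimum((xy[:, 1] / height * cells).astype(int), cells - 1)
        cell_index = cy * cells + cx
        # One histogram per cell, laid out as consecutive blocks of vocab_size bins.
        flat = cell_index * vocab_size + words
        hist = np.bincount(flat, minlength=cells * cells * vocab_size)
        parts.append(hist.astype(float))
    return np.concatenate(parts)
```

With levels 0 to 2 and a vocabulary of V words, the resulting vector has V × (1 + 4 + 16) = 21V entries. Level 0 alone is the ordinary bag of features.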

Page 6:

Scene category dataset

Multi-class classification results (100 training images per class)

Page 7:

Caltech101 dataset
http://www.vision.caltech.edu/Image_Datasets/Caltech101/Caltech101.html

Multi-class classification results (30 training images per class)

Page 8:

Implicit shape models
• Visual codebook is used to index votes for object position

training image annotated with object localization info

visual codeword with displacement vectors

B. Leibe, A. Leonardis, and B. Schiele, Combined Object Categorization and Segmentation with an Implicit Shape Model, ECCV Workshop on Statistical Learning in Computer Vision 2004

Page 9:

Implicit shape models
• Visual codebook is used to index votes for object position

test image

B. Leibe, A. Leonardis, and B. Schiele, Combined Object Categorization and Segmentation with an Implicit Shape Model, ECCV Workshop on Statistical Learning in Computer Vision 2004

Page 10:

Implicit shape models: Details

B. Leibe, A. Leonardis, and B. Schiele, Combined Object Categorization and Segmentation with an Implicit Shape Model, ECCV Workshop on Statistical Learning in Computer Vision 2004
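The voting idea can be sketched roughly as below. This is a simplified illustration, not the method of Leibe et al.: it assumes hard codeword assignments and a coarse accumulator grid in place of continuous mode seeking, and the function names and `bin_size` parameter are hypothetical.

```python
from collections import defaultdict

import numpy as np

def train_codebook_votes(words, positions, object_center):
    """Store, for each codeword, displacement vectors from feature to object center."""
    votes = defaultdict(list)
    center = np.asarray(object_center, dtype=float)
    for w, p in zip(words, positions):
        votes[w].append(center - np.asarray(p, dtype=float))
    return votes

def vote_for_center(words, positions, votes, image_size, bin_size=10):
    """Cast votes into a coarse grid; return the center of the strongest bin and the grid."""
    width, height = image_size
    acc = np.zeros((height // bin_size + 1, width // bin_size + 1))
    for w, p in zip(words, positions):
        offsets = votes.get(w, [])
        for d in offsets:
            x, y = np.asarray(p, dtype=float) + d
            if 0 <= x < width and 0 <= y < height:
                # The stored displacements of one codeword share one unit of vote mass.
                acc[int(y) // bin_size, int(x) // bin_size] += 1.0 / len(offsets)
    row, col = np.unravel_index(np.argmax(acc), acc.shape)
    return ((col + 0.5) * bin_size, (row + 0.5) * bin_size), acc
```

Training on one annotated image fills the per-codeword displacement lists. At test time every matched codeword casts its stored displacements, and peaks in the accumulator are object-center hypotheses.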

Page 11:

Generative part-based models

R. Fergus, P. Perona and A. Zisserman, Object Class Recognition by Unsupervised Scale-Invariant Learning, CVPR 2003

Page 12:

Probabilistic model

$$P(\text{image} \mid \text{object}) = P(\text{appearance}, \text{shape} \mid \text{object}) = \max_h P(\text{appearance} \mid h, \text{object})\, p(\text{shape} \mid h, \text{object})\, p(h \mid \text{object})$$

h: assignment of features to parts

Candidate parts

Part descriptors

Part locations

Page 13:

Probabilistic model

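$$P(\text{image} \mid \text{object}) = P(\text{appearance}, \text{shape} \mid \text{object}) = \max_h P(\text{appearance} \mid h, \text{object})\, p(\text{shape} \mid h, \text{object})\, p(h \mid \text{object})$$

h: assignment of features to parts

Part 1

Part 2

Part 3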

Page 14:

Probabilistic model

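$$P(\text{image} \mid \text{object}) = P(\text{appearance}, \text{shape} \mid \text{object}) = \max_h P(\text{appearance} \mid h, \text{object})\, p(\text{shape} \mid h, \text{object})\, p(h \mid \text{object})$$

h: assignment of features to parts

Part 1

Part 2

Part 3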

Page 15:

Probabilistic model

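$$P(\text{image} \mid \text{object}) = P(\text{appearance}, \text{shape} \mid \text{object}) = \max_h P(\text{appearance} \mid h, \text{object})\, p(\text{shape} \mid h, \text{object})\, p(h \mid \text{object})$$

Distribution over patch descriptors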

High-dimensional appearance space

Page 16:

Probabilistic model

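$$P(\text{image} \mid \text{object}) = P(\text{appearance}, \text{shape} \mid \text{object}) = \max_h P(\text{appearance} \mid h, \text{object})\, p(\text{shape} \mid h, \text{object})\, p(h \mid \text{object})$$

Distribution over joint part positions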

2D image space

Page 17:

Results: Faces

Face shape model

Patch appearance model

Recognition results

Page 18:

Results: Motorbikes and airplanes

Page 19:

Representing people

Page 20:

Summary: Adding spatial information
• Spatial pyramids
  • Pro: simple extension of a bag of features, works very well
  • Con: no geometric invariance, no object localization
• Implicit shape models
  • Pro: can localize object, maintain translation and possibly scale invariance
  • Con: need supervised training data (known object positions and possibly segmentation masks)
• Generative part-based models
  • Pro: very nice conceptually, can be learned from unsegmented images
  • Con: combinatorial hypothesis search problem