![Page 1: Texton Boost: Joint Appearance, Shape and Context Modeling for Multi-class object recognition and segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052600/5572100e497959fc0b8ca53b/html5/thumbnails/1.jpg)
TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-Class Object
Recognition and Segmentation
J. Shotton; University of Cambridge
J. Winn, C. Rother, A. Criminisi; MSR Cambridge
Presented by Derek Hoiem
For Misc Reading 02/15/06
![Page 2: Texton Boost: Joint Appearance, Shape and Context Modeling for Multi-class object recognition and segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052600/5572100e497959fc0b8ca53b/html5/thumbnails/2.jpg)
The Ideas in TextonBoost
• Textons from Universal Visual Dictionary paper [Winn Criminisi Minka ICCV 2005]
• Color models and graph cuts (GC) from “GrabCut: Interactive Foreground Extraction using Iterated Graph Cuts” [Rother Kolmogorov Blake SIGGRAPH 2004]
• Boosting + Integral Image from Viola-Jones
• Joint Boosting from [Torralba Murphy Freeman CVPR 2004]
![Page 3: Texton Boost: Joint Appearance, Shape and Context Modeling for Multi-class object recognition and segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052600/5572100e497959fc0b8ca53b/html5/thumbnails/3.jpg)
What’s good about this paper
• Provides recognition + segmentation for many classes (perhaps most complete set ever)
• Combines several good ideas
• Very thorough evaluation
![Page 4: Texton Boost: Joint Appearance, Shape and Context Modeling for Multi-class object recognition and segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052600/5572100e497959fc0b8ca53b/html5/thumbnails/4.jpg)
What’s bad about this paper
• A bit hacky
• Does not beat past work (in terms of quantitative recognition results)
• No modeling of “everything else” class
![Page 5: Texton Boost: Joint Appearance, Shape and Context Modeling for Multi-class object recognition and segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052600/5572100e497959fc0b8ca53b/html5/thumbnails/5.jpg)
Object Recognition and Segmentation are Coupled
Images from [Leibe et al. 2005]
Panels: Approximate Segmentation, Good Segmentation, No Segmentation
People Present
![Page 6: Texton Boost: Joint Appearance, Shape and Context Modeling for Multi-class object recognition and segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052600/5572100e497959fc0b8ca53b/html5/thumbnails/6.jpg)
The Three Approaches
• Segment → Detect
• Detect → Segment
• Segment ↔ Detect
![Page 7: Texton Boost: Joint Appearance, Shape and Context Modeling for Multi-class object recognition and segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052600/5572100e497959fc0b8ca53b/html5/thumbnails/7.jpg)
Segment first and ask questions later.
• Reduces possible locations for objects
• Allows use of shape information and makes long-range cues more effective
• But what if segmentation is wrong?
[Duygulu et al ECCV 2002]
![Page 8: Texton Boost: Joint Appearance, Shape and Context Modeling for Multi-class object recognition and segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052600/5572100e497959fc0b8ca53b/html5/thumbnails/8.jpg)
Object recognition + data-driven smoothing
• Object recognition drives segmentation
• Segmentation gives little back
He et al. 2004
This Paper
![Page 9: Texton Boost: Joint Appearance, Shape and Context Modeling for Multi-class object recognition and segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052600/5572100e497959fc0b8ca53b/html5/thumbnails/9.jpg)
Is there a better way?
• Integrated segmentation and recognition
• Generalized Swendsen-Wang
[Tu et al. 2003]
[Barbu Zhu 2005]
![Page 10: Texton Boost: Joint Appearance, Shape and Context Modeling for Multi-class object recognition and segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052600/5572100e497959fc0b8ca53b/html5/thumbnails/10.jpg)
TextonBoost Overview
Shape-texture: localized textons
Color: mixture of Gaussians
Location: normalized x-y coordinates
Edges: contrast-sensitive Potts model
![Page 11: Texton Boost: Joint Appearance, Shape and Context Modeling for Multi-class object recognition and segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052600/5572100e497959fc0b8ca53b/html5/thumbnails/11.jpg)
Learning the CRF Params
• The authors claim to be using piecewise training …
[Sutton McCallum UAI 2005]
![Page 12: Texton Boost: Joint Appearance, Shape and Context Modeling for Multi-class object recognition and segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052600/5572100e497959fc0b8ca53b/html5/thumbnails/12.jpg)
Learning the CRF Params
• But it’s really just piecewise hacking
– Learn params for different potential functions independently
– Raise potentials to some exponent to reduce overcounting
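A minimal sketch of what "raise potentials to some exponent" could look like when independently trained terms are combined. The function name `combined_log_score` and the example exponents are assumptions for illustration, not values from the paper.

```python
import math


def combined_log_score(term_probs, exponents):
    """Return log(prod_k p_k ** w_k) for one pixel and one candidate class.

    term_probs: probabilities from independently trained terms (hypothetical inputs).
    exponents:  one weight per term; a weight below 1 flattens that term so that
                correlated terms are not counted at full strength twice.
    """
    return sum(w * math.log(p) for p, w in zip(term_probs, exponents))


# Two terms, the second one down-weighted with exponent 0.5.
score = combined_log_score([0.5, 0.25], [1.0, 0.5])
```

Working in the log domain turns the exponents into plain multiplicative weights, which is why they can be tuned on validation data after each term has been trained separately.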
![Page 13: Texton Boost: Joint Appearance, Shape and Context Modeling for Multi-class object recognition and segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052600/5572100e497959fc0b8ca53b/html5/thumbnails/13.jpg)
Location Term
• Counts for each normalized position over training images for each class
Exponent: from validation
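The counting step can be sketched as follows. This is an illustration under assumptions: the grid size, the smoothing constant `alpha`, and both function names are hypothetical, and label maps are plain nested lists of class ids.

```python
def location_counts(label_maps, num_classes, grid=4):
    """Count, per class, how often each normalized grid cell carries that label."""
    counts = [[[0] * grid for _ in range(grid)] for _ in range(num_classes)]
    for labels in label_maps:  # one 2-D list of class ids per training image
        height, width = len(labels), len(labels[0])
        for y, row in enumerate(labels):
            for x, c in enumerate(row):
                # Map the pixel to a grid cell so images of different sizes line up.
                gy = y * grid // height
                gx = x * grid // width
                counts[c][gy][gx] += 1
    return counts


def location_potential(counts, c, gy, gx, exponent=1.0, alpha=1.0):
    """Smoothed share of labels in cell (gy, gx) that belong to class c, raised to an exponent."""
    total = sum(per_class[gy][gx] for per_class in counts)
    return ((counts[c][gy][gx] + alpha) / (total + alpha)) ** exponent


maps = [[[0, 1], [0, 1]], [[0, 0], [1, 1]]]
counts = location_counts(maps, num_classes=2, grid=2)
```

The `exponent` argument is the knob that would be picked on validation data; `alpha` only keeps empty cells from producing a zero potential.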
![Page 14: Texton Boost: Joint Appearance, Shape and Context Modeling for Multi-class object recognition and segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052600/5572100e497959fc0b8ca53b/html5/thumbnails/14.jpg)
Color Term
• Mixture of Gaussians learned over the image
• Mixture coefficients determined separately for each class
• Iterate between class labeling and parameter estimation
Manual: 3
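One reading of the first two bullets, as a hedged sketch: the Gaussian components are shared across the image and only the mixture coefficients depend on the class. Diagonal covariances and the names `diag_gaussian` and `color_likelihood` are assumptions made to keep the example short; the iteration between labeling and re-estimation is not shown.

```python
import math


def diag_gaussian(x, mean, var):
    """Density of a Gaussian with diagonal covariance, evaluated at color x."""
    density = 1.0
    for xi, mi, vi in zip(x, mean, var):
        density *= math.exp(-0.5 * (xi - mi) ** 2 / vi) / math.sqrt(2.0 * math.pi * vi)
    return density


def color_likelihood(x, class_weights, means, variances):
    """P(x | c) = sum_k P(k | c) * N(x; mean_k, var_k), with components shared by all classes."""
    return sum(w * diag_gaussian(x, m, v)
               for w, m, v in zip(class_weights, means, variances))


# Two shared 1-D components; each class mixes them differently (made-up numbers).
means = [(0.0,), (10.0,)]
variances = [(1.0,), (1.0,)]
dark_class = [0.9, 0.1]
bright_class = [0.1, 0.9]
dark_wins = (color_likelihood((0.0,), dark_class, means, variances)
             > color_likelihood((0.0,), bright_class, means, variances))
```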
![Page 15: Texton Boost: Joint Appearance, Shape and Context Modeling for Multi-class object recognition and segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052600/5572100e497959fc0b8ca53b/html5/thumbnails/15.jpg)
Edge Term
• Parameters learned using validation data
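For reference, a contrast-sensitive Potts cost in its usual form can be sketched like this. The parameter names `beta`, `theta_contrast`, and `theta_bias` are hypothetical; the slide only states that the edge parameters are learned on validation data.

```python
import math


def edge_cost(label_i, label_j, color_i, color_j, beta, theta_contrast, theta_bias):
    """Penalty for giving neighboring pixels i and j different labels.

    Equal labels cost nothing. Different labels cost less across a strong color
    edge, which encourages label boundaries to follow image edges.
    """
    if label_i == label_j:
        return 0.0
    sq_diff = sum((a - b) ** 2 for a, b in zip(color_i, color_j))
    return theta_contrast * math.exp(-beta * sq_diff) + theta_bias


flat_region = edge_cost(0, 1, (0, 0, 0), (0, 0, 0), 0.5, 2.0, 1.0)
strong_edge = edge_cost(0, 1, (0, 0, 0), (1, 1, 1), 0.5, 2.0, 1.0)
```

In this sketch a label change inside a flat region (`flat_region`) is penalized more than one placed on a color discontinuity (`strong_edge`).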
![Page 16: Texton Boost: Joint Appearance, Shape and Context Modeling for Multi-class object recognition and segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052600/5572100e497959fc0b8ca53b/html5/thumbnails/16.jpg)
Texture-Shape
• 17 filters (oriented Gaussian/Laplacian + dots)
• Cluster responses to form textons
• Count textons within white box (relative to position i)
• Feature = texton + rectangle
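The "texton + rectangle" feature can be sketched with one integral image per texton, in the Viola-Jones style listed among the ideas this work builds on. This is an illustrative sketch, not the authors' code: the function names and the half-open rectangle convention are assumptions, and the texton map is a nested list of texton ids.

```python
def integral_image(texton_map, texton):
    """Summed-area table of the indicator 'pixel is assigned to `texton`'."""
    height, width = len(texton_map), len(texton_map[0])
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        for x in range(width):
            table[y + 1][x + 1] = (int(texton_map[y][x] == texton)
                                   + table[y][x + 1] + table[y + 1][x] - table[y][x])
    return table


def texton_count(table, y, x, rect):
    """Count the texton inside rect = (dy0, dx0, dy1, dx1), offsets relative to pixel (y, x)."""
    height, width = len(table) - 1, len(table[0]) - 1
    dy0, dx0, dy1, dx1 = rect
    # Clip to the image; bounds are half-open: rows [y0, y1), columns [x0, x1).
    y0, y1 = max(0, y + dy0), min(height, y + dy1)
    x0, x1 = max(0, x + dx0), min(width, x + dx1)
    if y0 >= y1 or x0 >= x1:
        return 0
    return table[y1][x1] - table[y0][x1] - table[y1][x0] + table[y0][x0]


texton_map = [[0, 1, 1],
              [2, 1, 0],
              [1, 1, 2]]
table = integral_image(texton_map, texton=1)
count = texton_count(table, y=0, x=0, rect=(0, 1, 2, 3))
```

Once the table is built, any rectangle count costs four lookups regardless of rectangle size, which is what makes evaluating many candidate rectangles per pixel affordable.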
![Page 17: Texton Boost: Joint Appearance, Shape and Context Modeling for Multi-class object recognition and segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052600/5572100e497959fc0b8ca53b/html5/thumbnails/17.jpg)
Boosting Textons
• Use “Joint Boosting” [Torralba Murphy Freeman CVPR 2004]
– Different classes share features
– Weak learners: decision stumps on texton count within rectangle
• To speed training:
– Randomly select 0.3% of possible features from large set
– Downsample texton maps for training images
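A simplified sketch of the two ingredients above: random subsampling of (texton, rectangle) candidates and a weighted least-squares decision stump on a texton count. It covers a single binary problem only; the feature sharing across classes that defines Joint Boosting is not shown, and all names here are hypothetical.

```python
import random


def sample_features(num_textons, rectangles, fraction, seed=0):
    """Pick a random subset of all (texton, rectangle) pairs to try in one round."""
    candidates = [(t, r) for t in range(num_textons) for r in rectangles]
    k = max(1, round(fraction * len(candidates)))
    return random.Random(seed).sample(candidates, k)


def fit_stump(values, targets, weights, threshold):
    """Fit h(v) = a * [v > threshold] + b by weighted least squares.

    values:  texton counts for one candidate feature, one per training pixel.
    targets: +1 / -1 labels; weights: current boosting weights.
    Returns (a, b, weighted squared error).
    """
    def weighted_mean(pairs):
        total = sum(w for _, w in pairs)
        return sum(z * w for z, w in pairs) / total if total > 0 else 0.0

    above = [(z, w) for v, z, w in zip(values, targets, weights) if v > threshold]
    below = [(z, w) for v, z, w in zip(values, targets, weights) if v <= threshold]
    b = weighted_mean(below)
    a = weighted_mean(above) - b
    error = sum(w * (z - (a * (v > threshold) + b)) ** 2
                for v, z, w in zip(values, targets, weights))
    return a, b, error


rectangles = [(0, 0, 1, 1), (0, 0, 2, 2)]
subset = sample_features(num_textons=4, rectangles=rectangles, fraction=0.5)
a, b, error = fit_stump([0, 1, 5, 6], [-1, -1, 1, 1], [0.25] * 4, threshold=3)
```

In each round, only the sampled subset would be scored with `fit_stump`, and the candidate with the lowest error kept; sampling a fresh subset per round trades a slightly worse choice per round for much cheaper rounds.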
![Page 18: Texton Boost: Joint Appearance, Shape and Context Modeling for Multi-class object recognition and segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052600/5572100e497959fc0b8ca53b/html5/thumbnails/18.jpg)
“Shape Context”
• Toy example
![Page 19: Texton Boost: Joint Appearance, Shape and Context Modeling for Multi-class object recognition and segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052600/5572100e497959fc0b8ca53b/html5/thumbnails/19.jpg)
Random Feature Selection
• Toy example (training on ten images)
![Page 20: Texton Boost: Joint Appearance, Shape and Context Modeling for Multi-class object recognition and segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052600/5572100e497959fc0b8ca53b/html5/thumbnails/20.jpg)
Results on Boosted Textons
• Boosted shape-textons in isolation
– Training time: 42 hrs for 5000 rounds on 21-class training set of 276 images
![Page 21: Texton Boost: Joint Appearance, Shape and Context Modeling for Multi-class object recognition and segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052600/5572100e497959fc0b8ca53b/html5/thumbnails/21.jpg)
Parameters Learned from Validation
• Number of AdaBoost rounds (when to stop)
• Number of textons
• Edge potential parameters
• Location potential exponent
![Page 22: Texton Boost: Joint Appearance, Shape and Context Modeling for Multi-class object recognition and segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052600/5572100e497959fc0b8ca53b/html5/thumbnails/22.jpg)
Qualitative (Good) Results
![Page 23: Texton Boost: Joint Appearance, Shape and Context Modeling for Multi-class object recognition and segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052600/5572100e497959fc0b8ca53b/html5/thumbnails/23.jpg)
Qualitative (Bad) Results
• But notice good segmentation, even with bad labeling
![Page 24: Texton Boost: Joint Appearance, Shape and Context Modeling for Multi-class object recognition and segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052600/5572100e497959fc0b8ca53b/html5/thumbnails/24.jpg)
Quantitative Results
![Page 25: Texton Boost: Joint Appearance, Shape and Context Modeling for Multi-class object recognition and segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052600/5572100e497959fc0b8ca53b/html5/thumbnails/25.jpg)
Effect of Different Model Potentials
Boosted textons only No color modeling Full CRF model
![Page 26: Texton Boost: Joint Appearance, Shape and Context Modeling for Multi-class object recognition and segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052600/5572100e497959fc0b8ca53b/html5/thumbnails/26.jpg)
Corel/Sowerby
![Page 27: Texton Boost: Joint Appearance, Shape and Context Modeling for Multi-class object recognition and segmentation](https://reader034.vdocuments.mx/reader034/viewer/2022052600/5572100e497959fc0b8ca53b/html5/thumbnails/27.jpg)
The End.