Latent SVMs for Human Detection with a Locally Affine Deformation Field
Ľubor Ladický¹, Phil Torr², Andrew Zisserman¹
¹ University of Oxford, ² Oxford Brookes University
Object Detection
• Find all objects of interest
• Enclose them tightly in a bounding box
HOG Detector
Dalal & Triggs CVPR05
• Sliding window using learnt HOG template
• Post-processing using non-maxima suppression
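The two steps on this slide can be sketched as follows. This is a minimal illustration, not Dalal & Triggs' implementation: the function names, the `(x1, y1, x2, y2)` box format, the greedy overlap rule and the 0.5 threshold are all assumptions made for the example.

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def non_maxima_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best box, drop boxes overlapping it, repeat."""
    order = list(np.argsort(scores)[::-1])
    keep = []
    while order:
        best = order.pop(0)
        keep.append(int(best))
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep

def sliding_window_scores(feature_map, template, bias):
    """Score every placement of a template on a grid of cell features."""
    H, W, _ = feature_map.shape
    h, w, _ = template.shape
    scores = np.empty((H - h + 1, W - w + 1))
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            window = feature_map[y:y + h, x:x + w]
            scores[y, x] = np.sum(window * template) + bias
    return scores
```

In practice the feature map would hold HOG cell descriptors computed at several image scales; here it is just an array of shape (rows, columns, descriptor length).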
HOG Detector
Dalal & Triggs CVPR05
Classifier response:
The weights w* and the bias b* are learnt using a linear SVM:
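For orientation, a linear SVM detector is commonly written in the form below. This is the textbook hinge-loss formulation rather than a transcription of the slide; the symbols Φ(x) (feature vector of window x), z_k ∈ {−1, +1} (label of training sample x_k) and λ (regularisation weight) are assumed notation.

```latex
H(x) = w^{*} \cdot \Phi(x) + b^{*}

(w^{*}, b^{*}) = \arg\min_{w,\, b} \; \lambda \lVert w \rVert^{2}
  + \sum_{k} \max\bigl(0,\; 1 - z_k \,(w \cdot \Phi(x_k) + b)\bigr)
```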
HOG Detector
Dalal & Triggs CVPR05
• The learnt HOG template does not fit well!
Deformable Part-based Model
Felzenszwalb et al. CVPR08
• Allows parts to move relative to the centre
• Effectively allows the template to deform
• Multiple models based on an aspect ratio
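The idea of parts moving relative to the centre can be illustrated with a toy scoring function. It is only a sketch under simplifying assumptions: the quadratic penalty with a single `deform_weight`, the argument names and the brute-force search are hypothetical, and Felzenszwalb et al. use learnt deformation coefficients and a more efficient search.

```python
import numpy as np

def part_based_score(root_score, part_responses, anchors, deform_weight=0.1):
    """Root score plus, for each part, the best response minus a quadratic
    penalty on how far the part moved from its anchor position."""
    total = root_score
    placements = []
    for response, (ay, ax) in zip(part_responses, anchors):
        ys, xs = np.indices(response.shape)
        penalty = deform_weight * ((ys - ay) ** 2 + (xs - ax) ** 2)
        adjusted = response - penalty
        best = np.unravel_index(np.argmax(adjusted), adjusted.shape)
        total += adjusted[best]
        placements.append((int(best[0]), int(best[1])))
    return float(total), placements
```

Each entry of `part_responses` is a 2D map of one part filter's response, and `anchors` gives the part's default position in that map.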
Our approach
Cells c displaced by the deformation field d:
Classifier response:
Regularisation takes the form of a pairwise MRF:
The weights w* and the bias b* are learnt using a latent linear SVM:
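One way to picture a response of this kind is sketched below. It is not the authors' formula: the squared-difference pairwise term, the weight `reg`, the integer displacements and all names are assumptions chosen to keep the example small.

```python
import numpy as np

def deformed_response(features, weights, bias, field, origin=(0, 0), reg=0.1):
    """Template response with each cell read from its displaced location,
    minus a pairwise smoothness penalty between neighbouring cells."""
    rows, cols, _ = weights.shape
    oy, ox = origin
    data_term = 0.0
    for i in range(rows):
        for j in range(cols):
            dy, dx = field[i, j]
            cell = features[oy + i + dy, ox + j + dx]
            data_term += float(weights[i, j] @ cell)
    # Pairwise term over horizontally and vertically adjacent cells.
    diff_h = field[:, 1:] - field[:, :-1]
    diff_v = field[1:, :] - field[:-1, :]
    penalty = reg * float(np.sum(diff_h ** 2) + np.sum(diff_v ** 2))
    return data_term + bias - penalty
```

Here `field` has shape (rows, cols, 2) and stores an integer (dy, dx) shift per template cell; a field of zeros reduces the response to the rigid HOG template score.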
Comparison with our approach
• HOG template (no deformation)
• Part-based model (rigid movable parts)
• Our model (deformation field)
Our approach
Why hasn’t anyone tried it before?
• Latent models with many latent variables tend to overfit
• Inference not feasible for a sliding window
• Deformation fields have previously been used only for:
  • Classification tasks (Duchenne et al. ICCV11)
  • Rescoring of detections (Ladický, PhD thesis)
Our approach
We restrict the deformation field to be locally affine:
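A possible reading of "locally affine", assumed here rather than taken from the slide, is that every 2×2 block of displaced cells forms a parallelogram. That reading is consistent with the deck's later statement that the first row and first column determine every cell. The sketch below checks the condition and rebuilds a field from its first row and column; the function names are hypothetical.

```python
import numpy as np

def is_locally_affine(field, tol=1e-9):
    """True if every 2x2 block of cell displacements forms a parallelogram:
    d[i, j] + d[i+1, j+1] == d[i+1, j] + d[i, j+1]."""
    lhs = field[:-1, :-1] + field[1:, 1:]
    rhs = field[1:, :-1] + field[:-1, 1:]
    return bool(np.all(np.abs(lhs - rhs) <= tol))

def field_from_first_row_and_column(first_row, first_col):
    """Rebuild the full field from its first row and first column using
    d[i, j] = d[i, 0] + d[0, j] - d[0, 0].
    Assumes first_row[0] and first_col[0] describe the same corner cell."""
    first_row = np.asarray(first_row, dtype=float)   # shape (cols, 2)
    first_col = np.asarray(first_col, dtype=float)   # shape (rows, 2)
    return first_col[:, None, :] + first_row[None, :, :] - first_row[0]
```

Under this assumption a field over an R×C grid has only R + C − 1 free displacements instead of R·C, which is one way to see why the restriction limits overfitting.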
Optimisation
Weights / bias (w*, b*) and the deformation fields d_k are estimated iteratively
Given the deformation fields, the problem is a standard linear SVM:
Given (w*, b*), the problem is a constrained MRF optimisation:
The last can be decomposed as:
By defining the optimisation becomes:
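The alternation described on this slide can be sketched on toy data. This is a simplified stand-in, not the authors' solver: each sample is given a small set of precomputed candidate feature vectors (one per candidate deformation), the deformation regulariser is omitted, the latent choice is made the same way for positive and negative samples, and the SVM is fitted by plain subgradient descent. All names and hyperparameters are assumptions.

```python
import numpy as np

def train_linear_svm(X, z, lam=0.01, epochs=200, lr=0.1):
    """Full-batch subgradient descent on lam*||w||^2 + mean hinge loss."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        margins = z * (X @ w + b)
        active = margins < 1                      # samples violating the margin
        pull = (z[active][:, None] * X[active]).sum(axis=0) / len(z)
        grad_w = 2 * lam * w - pull
        grad_b = -z[active].sum() / len(z)
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b

def train_latent_svm(candidates, z, rounds=5):
    """candidates[k, c] is the feature vector of sample k under candidate
    deformation c; z[k] in {-1, +1} is its label."""
    n = len(z)
    latent = np.zeros(n, dtype=int)               # start from candidate 0
    for _ in range(rounds):
        X = candidates[np.arange(n), latent]      # step 1: fix d_k, fit (w, b)
        w, b = train_linear_svm(X, z)
        latent = np.argmax(candidates @ w, axis=1)  # step 2: fix (w, b), pick d_k
    return w, b, latent
```

In the deck's setting, step 2 would be the constrained MRF optimisation over locally affine fields rather than an argmax over a short candidate list.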
Optimisation
• The locations of the cells in the first row and in the first column fully determine the location of each cell
• Any locally affine deformation field can be reached by two moves:
  • each column i moves by $(\Delta^c d_i^x, \Delta^c d_i^y)$
  • each row j moves by $(\Delta^r d_j^x, \Delta^r d_j^y)$
• Such moves do not alter the local affinity
• Both moves can be solved quickly using dynamic programming
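To illustrate why a move over columns (or rows) suits dynamic programming, the sketch below solves a generic chain problem: choose one label per column so that per-column costs plus a smoothness cost between adjacent columns are minimal. The cost tables, the squared-difference smoothness term and the use of scalar labels (instead of 2D shifts) are assumptions for the example, not the costs used in the talk.

```python
import numpy as np

def chain_dp(unary, pairwise_weight=1.0):
    """Minimise sum_i unary[i, s_i] + pairwise_weight * sum_i (s_i - s_{i+1})^2
    over one label s_i per chain element (e.g. one shift per column)."""
    n, m = unary.shape
    labels = np.arange(m)
    pair = pairwise_weight * (labels[:, None] - labels[None, :]) ** 2
    cost = unary[0].astype(float)
    back = np.zeros((n, m), dtype=int)
    for i in range(1, n):
        total = cost[:, None] + pair              # total[prev, cur]
        back[i] = np.argmin(total, axis=0)
        cost = total[back[i], labels] + unary[i]
    best = [int(np.argmin(cost))]
    for i in range(n - 1, 0, -1):                 # trace the best path backwards
        best.append(int(back[i][best[-1]]))
    return best[::-1], float(np.min(cost))
```

The run time is linear in the number of columns and quadratic in the number of candidate labels, compared with an exponential number of joint assignments.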
Learning multiple poses / viewpoints
• We define a similarity measure between two training samples as :
where
• K-medoid clustering of the similarity matrix S partitions the training data into multiple models
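A minimal sketch of k-medoid clustering on a precomputed similarity matrix follows. It assumes S is symmetric with larger values meaning more similar; the initialisation from the first k items (or a caller-supplied `init`) and the function name are choices made for the example, not details from the talk.

```python
import numpy as np

def k_medoids(S, k, init=None, iters=100):
    """Cluster items given a symmetric similarity matrix S (higher = closer).
    Returns the medoid indices and the cluster index of every item."""
    medoids = list(init) if init is not None else list(range(k))
    for _ in range(iters):
        assign = np.argmax(S[:, medoids], axis=1)
        for c, m in enumerate(medoids):           # a medoid stays in its own cluster
            assign[m] = c
        new_medoids = []
        for c in range(k):
            members = np.flatnonzero(assign == c)
            # New medoid: the member most similar to the rest of its cluster.
            within = S[np.ix_(members, members)].sum(axis=1)
            new_medoids.append(int(members[np.argmax(within)]))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign
```

Each resulting cluster would then be used to train one model. The result depends on the starting medoids, so several initialisations are usually worth trying.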
Experiments
• Buffy dataset (Ferrari et al. CVPR08), typically used for pose estimation
• Contains a large variety of poses, viewpoints and aspect ratios
• Consists of 748 images
• Episode s5e3 used for training
• Episode s5e4 used for validation
• Episodes s5e2, s5e5 and s5e6 used for testing
Clustering of training samples
Each row corresponds to one model (out of 10 models)
Qualitative results
Quantitative results
Conclusion
We propose:
• Novel inference for a locally affine deformation field (LADF)
• An object detector using LADF
• Clustering using LADF
Questions?
Thank you