![Page 1: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/1.jpg)
Video Compression using Computer Vision
Presented by Yehuda Dar
Advanced Topics in Computer Vision (048921) Winter 2011-2012
![Page 2: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/2.jpg)
Video Compression Basics
Fundamental tradeoff among:
• Bit-rate
• Distortion
• Computational complexity
![Page 3: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/3.jpg)
Video Compression Basics
Utilized redundancies:
• Spatial
• Temporal
• Psycho-visual
• Statistical
![Page 4: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/4.jpg)
H.264 Overview
![Page 5: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/5.jpg)
H.264 Redundancy Utilization

| Redundancy | Utilization | Means |
|---|---|---|
| Spatial | High | Transform coding; intra coding (spatial prediction) |
| Temporal | High | Motion estimation & compensation |
| Psycho-visual | Medium | YCbCr color space; 4:2:0 sampling; DC / AC coefficient quantization |
| Statistical | High | Entropy coding |
![Page 6: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/6.jpg)
Compression using Computer Vision
Motivation:
Better utilization of the psycho-visual redundancy
Application-specific compression methods
Exploring new approaches
![Page 7: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/7.jpg)
A Review of:
A Scheme for Attentional Video Compression
R. Gupta and S. Chaudhury
PAMI 2011
![Page 8: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/8.jpg)
Method Outline
• Salient region detection
• Foveated video coding
• Integration into H.264
Foveated image coding demonstration
Figure from Guo & Zhang, Trans. Image Process., 2010
![Page 9: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/9.jpg)
Saliency Map, Step 1: Creating a 3D Feature Map

| Feature type | Calculation method | Based on |
|---|---|---|
| Global | Color spatial variance | Liu et al, CVPR 2007 |
| Local | Center-surround multi-scale ratio of dissimilarity | Huang et al, ICPR 2010 |
| Rarity | Pulse-DCT | Yu et al, ICDL 2009 |
![Page 10: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/10.jpg)
Relevance Vector Machine (RVM)
Used here as a binary classifier
Advantages over support vector machine (SVM):
• Provides posterior probabilities
• Better generalization ability
• Faster decisions
![Page 11: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/11.jpg)
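Saliency Map, Step 2: Unify Features using RVM
Training Procedure for MBs:
• Sample: the global, local and rarity maps are each averaged over the MB, giving the feature vector $(\mathrm{avg}_{global}, \mathrm{avg}_{local}, \mathrm{avg}_{rarity})^T$
• Label: pixels of the ground truth are counted over the MB, giving 'salient' / 'non salient'
• The (sample, label) pairs are fed to the RVM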
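The construction of macroblock training pairs described on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the 2D-list image representation and the `salient_fraction` threshold used to turn the ground-truth pixel count into a binary label are all assumptions made for the example.

```python
MB = 16  # macroblock size in pixels (H.264 macroblocks are 16x16)


def block_mean(img, top, left, size=MB):
    """Mean value of the size x size block whose top-left corner is (top, left)."""
    rows = img[top:top + size]
    total = sum(sum(row[left:left + size]) for row in rows)
    return total / (size * size)


def mb_training_pairs(global_map, local_map, rarity_map, ground_truth,
                      salient_fraction=0.5, size=MB):
    """Build one (sample, label) pair per macroblock.

    sample = (avg_global, avg_local, avg_rarity)
    label  = 1 ('salient') if the share of ground-truth salient pixels in the
             MB reaches salient_fraction (a hypothetical threshold), else 0.
    ground_truth is a binary map (1 = salient pixel, 0 = non-salient pixel).
    """
    height, width = len(ground_truth), len(ground_truth[0])
    pairs = []
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            sample = tuple(block_mean(m, top, left, size)
                           for m in (global_map, local_map, rarity_map))
            # For a binary map, the block mean equals (salient pixel count) / (MB area).
            salient_share = block_mean(ground_truth, top, left, size)
            label = 1 if salient_share >= salient_fraction else 0
            pairs.append((sample, label))
    return pairs
```

The resulting pairs would then be passed to whichever RVM trainer is available; no specific RVM library is assumed here.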
![Page 12: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/12.jpg)
Saliency Map, Step 2: Unify Features using RVM
Trained RVM Usage:
• New input: the feature vector $(\mathrm{avg}_{global}, \mathrm{avg}_{local}, \mathrm{avg}_{rarity})^T$ of an MB
• Outputs of the RVM:
  • Binary label: 'salient' / 'non salient'
  • Probability: used as the relative saliency
![Page 13: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/13.jpg)
Saliency Map: Result Comparison
[Figure panels: input; global; local [Huang et al, ICPR 2010]; rarity [Yu et al, ICDL 2009]; proposed; [Harel et al, NIPS 2006]; [Bruce & Tsotsos, NIPS 2006]]
Figures from Gupta & Chaudhury, PAMI 2011
![Page 14: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/14.jpg)
Saliency Map: ROC Curve
[Figure curves: proposed; [Harel et al, NIPS 2006]]
Figure from Gupta & Chaudhury, PAMI 2011
![Page 15: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/15.jpg)
Integration Into H.264: Calculation of Saliency Values
Recalculating saliency map only when it significantly changes
Mutual-information between successive frames indicates changes in saliency:
Figures from Gupta & Chaudhury, PAMI 2011
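As an illustration of the mutual-information test on this slide, the sketch below estimates the MI between co-located pixels of two successive frames and flags a saliency recalculation when it drops below a threshold. The function names, the number of histogram bins and the threshold are assumptions for the example; the paper's exact criterion is not given on the slide.

```python
import math
from collections import Counter


def mutual_information(frame_a, frame_b, levels=8):
    """Mutual information (in bits) between co-located pixels of two frames.

    Frames are equal-size 2D lists of 8-bit intensities (0..255); intensities
    are grouped into `levels` histogram bins before estimating probabilities.
    """
    pairs = [(a * levels // 256, b * levels // 256)
             for row_a, row_b in zip(frame_a, frame_b)
             for a, b in zip(row_a, row_b)]
    n = len(pairs)
    joint = Counter(pairs)
    marg_a = Counter(a for a, _ in pairs)
    marg_b = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b) written with raw counts: count * n / (marg_a * marg_b)
        mi += p_ab * math.log2(count * n / (marg_a[a] * marg_b[b]))
    return mi


def needs_saliency_update(prev_frame, frame, threshold_bits):
    """Assumed rule: low MI means the frames differ a lot, so recompute saliency."""
    return mutual_information(prev_frame, frame) < threshold_bits
```

Identical frames give an MI equal to the entropy of the binned frame, while statistically unrelated frames give an MI near zero, which is why a drop in MI is read here as a significant change.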
![Page 16: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/16.jpg)
Integration Into H.264: Propagation of Saliency Values
For inter-coded MBs, the saliency value is a weighted average of the saliency values of the reference-frame MBs that the motion vector points to
Figures from Gupta & Chaudhury, PAMI 2011
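The propagation rule on this slide can be sketched as below. The slide does not state how the weights are chosen; this example assumes they are the overlap areas between the motion-compensated block and the reference-frame MBs it covers. The function name and the per-MB saliency grid are likewise assumptions.

```python
MB = 16  # macroblock size in pixels


def propagate_saliency(prev_saliency, mb_row, mb_col, motion_vector, size=MB):
    """Saliency of an inter-coded MB from the reference frame's per-MB saliency.

    prev_saliency: 2D list with one saliency value per MB of the reference frame.
    motion_vector: (dx, dy) in pixels, from the current MB to its prediction block.
    Returns the overlap-area-weighted average over the reference MBs covered by
    the prediction block; parts falling outside the frame are ignored.
    """
    dx, dy = motion_vector
    left = mb_col * size + dx
    top = mb_row * size + dy
    rows, cols = len(prev_saliency), len(prev_saliency[0])
    total, weight_sum = 0.0, 0
    for r in range(top // size, (top + size - 1) // size + 1):
        for c in range(left // size, (left + size - 1) // size + 1):
            if not (0 <= r < rows and 0 <= c < cols):
                continue
            overlap_h = min(top + size, (r + 1) * size) - max(top, r * size)
            overlap_w = min(left + size, (c + 1) * size) - max(left, c * size)
            area = overlap_h * overlap_w
            total += area * prev_saliency[r][c]
            weight_sum += area
    return total / weight_sum if weight_sum else 0.0
```

With a zero motion vector the MB simply inherits the saliency of the co-located reference MB; a block straddling several MBs receives a blend of their values.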
![Page 17: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/17.jpg)
Integration Into H.264: Salient-Adaptive Quantization
• Non-uniform bit-allocation
• Smaller saliency value => coarser quantization
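One possible way to realize "smaller saliency value => coarser quantization" is to shift the quantization parameter (QP) of each MB according to its saliency. The linear mapping, the function name and the default `base_qp` and `max_offset` values below are assumptions for illustration, not the paper's formula; only the 0..51 QP range comes from H.264 itself.

```python
QP_MIN, QP_MAX = 0, 51  # valid H.264 quantization parameter range (8-bit video)


def saliency_to_qp(saliency, base_qp=28, max_offset=8):
    """Map a saliency value in [0, 1] to a per-MB QP.

    Assumed linear rule: saliency 1.0 gives base_qp - max_offset (finer
    quantization), saliency 0.0 gives base_qp + max_offset (coarser).
    """
    saliency = min(max(saliency, 0.0), 1.0)
    qp = base_qp + round(max_offset * (1.0 - 2.0 * saliency))
    return min(max(qp, QP_MIN), QP_MAX)
```

Because a larger QP means a larger quantization step, non-salient MBs consume fewer bits, which frees bit budget for the salient regions.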
![Page 18: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/18.jpg)
Integration Into H.264
Figure from Gupta & Chaudhury, PAMI 2011
![Page 19: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/19.jpg)
Paper Evaluation
Novelty:
• Methods for:
  • saliency map
  • saliency value propagation
Assumption:
• All the MBs in P-frames are inter-coded (problematic)
Writing level:
• Good
• Partially self-contained
![Page 20: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/20.jpg)
Paper Evaluation
Feasibility:
• Higher complexity than H.264 encoders
• Not for real-time encoders
• Useful at low bit-rates
• Objects entering the scene may be considered unimportant
Experimental evaluation:
• Saliency:
  • visual comparison: good
  • ROC curve comparison: partial
• Compression:
  • None (authors' future direction)
![Page 21: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/21.jpg)
Future Directions
• Improving encoding complexity
  • Less complex saliency method
• Better object entrance treatment
  • Using mutual-information of frame areas
• Treat intra-coded MBs in P-frames
![Page 22: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/22.jpg)
A Review of:
3D Models Coding and Morphing for Efficient Video Compression
F. Galpin, R. Balter, L. Morin, K. Deguchi
CVPR 2004
![Page 23: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/23.jpg)
Method Outline
• 3D model extraction
• 3D model-based video coding
• Reconstruction using adaptive geometric morphing
![Page 24: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/24.jpg)
3D Models Stream Generation
Figure from Galpin et al, CVPR 2004
![Page 25: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/25.jpg)
Stream Compression
Three data types to compress:
• 3D model
• Texture images
• Camera parameters
![Page 26: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/26.jpg)
Texture Image Compression
Figure from Galpin et al, CVPR 2004
Reconstruction Process:
![Page 27: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/27.jpg)
3D Model Compression
The 3D model originates from a decimated depth map
Compressed by:
• Wavelet transform
• Depth-adaptive quantization
Figures from Galpin et al, CVPR 2004
![Page 28: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/28.jpg)
Video Reconstruction: Texture Fading
Figure from Galpin et al, CVPR 2004
![Page 29: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/29.jpg)
Video Reconstruction: Texture Fading
[Figure panels: without texture fading; with texture fading]
Figures from Galpin et al, CVPR 2004
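The slides do not spell out how texture fading is computed. As a rough illustration only, the sketch below assumes it is a linear cross-fade between the textures of two successive 3D models, with the blend weight growing from 0 to 1 over the frames of the transition. The function names and the 2D-list texture representation are hypothetical.

```python
def fade_textures(texture_a, texture_b, alpha):
    """Linear cross-fade of two equal-size textures (2D lists of intensities).

    alpha = 0 returns texture_a, alpha = 1 returns texture_b (assumed blend rule).
    """
    return [[(1.0 - alpha) * a + alpha * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(texture_a, texture_b)]


def fading_weights(num_frames):
    """Blend weights rising linearly from 0 to 1 across num_frames frames."""
    if num_frames < 2:
        return [0.0] * num_frames
    return [i / (num_frames - 1) for i in range(num_frames)]
```

Under this assumption, each reconstructed frame of the transition uses the weight for its position, so the texture changes gradually instead of switching abruptly between models.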
![Page 30: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/30.jpg)
Video Reconstruction: Geometric Morphing
Improving 3D model interpolation
Figure from Galpin et al, CVPR 2004
![Page 31: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/31.jpg)
Video Reconstruction: Geometric Morphing
[Figure panels: regular interpolation; interpolation with geometric morphing]
Figures from Galpin et al, CVPR 2004
![Page 32: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/32.jpg)
Result Comparison with H.264
![Page 33: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/33.jpg)
Paper Evaluation
Novelty:
• Compression using unknown 3D model
Assumptions:
• Static scene
• Moving monocular camera
• Neglected camera rotation
• GOP intrinsic parameters are fixed
Writing level:
• Good
• Not self-contained
![Page 34: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/34.jpg)
Paper Evaluation
Feasibility:
• Only for static scene video
• High encoder / decoder complexity
• Unsuitable for real-time
• Useful at very low bit-rates
Experimental evaluation:
• Sufficient visual comparison with H.264
• No run-time information
![Page 35: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/35.jpg)
Future Directions
• Treat moving objects
• Improve complexity
  • At least for real-time decoding
![Page 36: Presented by Yehuda Dar Advanced Topics in Computer Vision ( 048921 )Winter 2011-2012](https://reader036.vdocuments.mx/reader036/viewer/2022062407/56649d0c5503460f949e1041/html5/thumbnails/36.jpg)
Approach Comparison

| | 3D model | Attention |
|---|---|---|
| Video type | Static scene | Any |
| Bit-rates useful at | Very low | Low |
| Encoder complexity | High | High |
| Decoder complexity | High | Regular |
| Integration in H.264 | Unsuitable | Possible |
| Overall evaluation | Inferior | Promising |