

Page 1: Presented by Yehuda Dar, Advanced Topics in Computer Vision (048921), Winter 2011-2012

Video Compression using Computer Vision

Page 2:

Video Compression Basics

Fundamental tradeoff among:
• Bit-rate
• Distortion
• Computational complexity

Page 3:

Video Compression Basics

Utilized redundancies:
• Spatial
• Temporal
• Psycho-visual
• Statistical

Page 4:

H.264 Overview

Page 5:

H.264 Redundancy Utilization

Redundancy      Utilization   Means
Spatial         High          Transform coding; intra coding (spatial prediction)
Temporal        High          Motion estimation & compensation
Psycho-visual   Medium        YCbCr color space; 4:2:0 sampling; DC / AC coefficient quantization
Statistical     High          Entropy coding
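
The psycho-visual row can be illustrated with a minimal Python sketch of the color conversion and 4:2:0 chroma subsampling. The function names are hypothetical, the full-range BT.601 coefficients are one common choice (H.264 streams may signal others), and averaging each 2x2 block is an assumed, simple downsampling filter rather than what any particular encoder does.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert an HxWx3 RGB array (floats in 0..255) to full-range BT.601 Y, Cb, Cr planes."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 + 0.564 * (b - y)   # blue-difference chroma
    cr = 128.0 + 0.713 * (r - y)   # red-difference chroma
    return y, cb, cr

def subsample_420(chroma):
    """Halve a chroma plane in both directions by averaging each 2x2 block (assumed filter)."""
    h, w = chroma.shape
    even = chroma[:h - h % 2, :w - w % 2]          # drop an odd last row/column
    return even.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

if __name__ == "__main__":
    frame = np.random.default_rng(0).uniform(0, 255, size=(16, 16, 3))
    y, cb, cr = rgb_to_ycbcr(frame)
    cb_sub, cr_sub = subsample_420(cb), subsample_420(cr)
    # Luma keeps full resolution; each chroma plane keeps a quarter of its samples.
    print(y.shape, cb_sub.shape, cr_sub.shape)
```

With 4:2:0 sampling the frame carries 1.5 samples per pixel instead of 3, which exploits the eye's lower sensitivity to chroma detail.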

Page 6:

Compression using Computer Vision

Motivation:
• Better utilization of the psycho-visual redundancy
• Application-specific compression methods
• Exploring new approaches

Page 7:

A Review of:

A Scheme for Attentional Video Compression
R. Gupta and S. Chaudhury
PAMI 2011

Page 8:

Method Outline

• Salient region detection
• Foveated video coding
• Integration into H.264

[Figure: foveated image coding demonstration. Figure from Guo & Zhang, Trans. Image Process., 2010]

Page 9:

Saliency Map, Step 1: Creating a 3D Feature Map

Feature type   Calculation method                                    Based on
Global         Color spatial variance                                Liu et al, CVPR 2007
Local          Center-surround multi-scale ratio of dissimilarity    Huang et al, ICPR 2010
Rarity         Pulse-DCT                                             Yu et al, ICDL 2009

Page 10:

Relevance Vector Machine (RVM)

Used here as a binary classifier

Advantages over the support vector machine (SVM):
• Provides posterior probabilities
• Better generalization ability
• Faster decisions

Page 11:

Saliency Map, Step 2: Unify Features using RVM

Training procedure for MBs:

[Diagram: the global, local and rarity feature maps are each averaged over the MB, giving the sample vector (avg_global, avg_local, avg_rarity). The ground-truth pixels of the MB are counted to give the label 'salient' / 'non salient'. The sample and its label are fed to the RVM.]
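
A minimal Python sketch of how the per-MB training samples and labels could be assembled. The function names, the 16x16 MB size applied to the feature maps, and the rule "salient if at least half of the ground-truth pixels in the MB are salient" are assumptions for illustration; the slide only says that ground-truth pixels are counted. The RVM training itself is left out.

```python
import numpy as np

MB = 16  # H.264 macroblock size in luma samples

def block_means(feature_map, mb=MB):
    """Average a 2D map over non-overlapping mb x mb blocks (partial edge blocks are dropped)."""
    h, w = feature_map.shape
    rows, cols = h // mb, w // mb
    blocks = feature_map[:rows * mb, :cols * mb].reshape(rows, mb, cols, mb)
    return blocks.mean(axis=(1, 3))

def training_set(global_map, local_map, rarity_map, ground_truth, min_fraction=0.5):
    """Return one (avg_global, avg_local, avg_rarity) sample and one 0/1 label per MB."""
    samples = np.stack(
        [block_means(m).ravel() for m in (global_map, local_map, rarity_map)],
        axis=1,
    )
    # Fraction of salient ground-truth pixels in each MB; the threshold is hypothetical.
    salient_fraction = block_means(ground_truth.astype(float)).ravel()
    labels = (salient_fraction >= min_fraction).astype(int)
    return samples, labels
```

The resulting `samples` array has one row per MB and three columns, which is the input shape a binary classifier such as an RVM would be trained on.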

Page 12:

Saliency Map, Step 2: Unify Features using RVM

Trained RVM usage:

[Diagram: a new input vector (avg_global, avg_local, avg_rarity) is fed to the RVM, which outputs a binary label 'salient' / 'non salient' and a probability that serves as the relative saliency.]

Page 13:

Saliency Map: Result Comparison

[Figure panels: input; global; local [Huang et al, ICPR 2010]; rarity [Yu et al, ICDL 2009]; proposed; [Harel et al, NIPS 2006]; [Bruce & Tsotsos, NIPS 2006]]

Figures from Gupta & Chaudhury, PAMI 2011

Page 14:

Saliency Map: ROC Curve

[Figure: ROC curves of the proposed method and of [Harel et al, NIPS 2006]]

Figure from Gupta & Chaudhury, PAMI 2011
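
For reference, an ROC curve for a saliency map is usually obtained by thresholding the map at several levels and comparing against a binary ground-truth mask. The sketch below shows that standard construction in Python; the function name is hypothetical, and it is not claimed to be the exact evaluation protocol of the paper. It assumes the ground truth contains both salient and non-salient pixels.

```python
import numpy as np

def roc_points(saliency, ground_truth, thresholds):
    """Return (false-positive rate, true-positive rate) pairs, one per threshold."""
    truth = ground_truth.astype(bool)
    points = []
    for t in thresholds:
        predicted = saliency >= t                          # pixels declared salient
        tpr = np.sum(predicted & truth) / np.sum(truth)    # hit rate
        fpr = np.sum(predicted & ~truth) / np.sum(~truth)  # false-alarm rate
        points.append((float(fpr), float(tpr)))
    return points
```

A curve closer to the top-left corner (high hit rate at a low false-alarm rate) indicates a better saliency detector.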

Page 15:

Integration Into H.264: Calculation of Saliency Values

• The saliency map is recalculated only when it changes significantly
• Mutual information between successive frames indicates changes in saliency

Figures from Gupta & Chaudhury, PAMI 2011
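
A minimal Python sketch of the mutual-information test, assuming 8-bit grayscale frames. The joint-histogram estimator, the bin count, the function names, and the fixed-threshold decision rule are illustrative assumptions; the slide does not give the exact criterion used in the paper.

```python
import numpy as np

def mutual_information(frame_a, frame_b, bins=32):
    """Mutual information (in bits) between the gray levels of two equally sized frames."""
    joint, _, _ = np.histogram2d(
        frame_a.ravel(), frame_b.ravel(), bins=bins, range=[[0, 256], [0, 256]]
    )
    p_ab = joint / joint.sum()                 # joint distribution
    p_a = p_ab.sum(axis=1, keepdims=True)      # marginal of frame_a, shape (bins, 1)
    p_b = p_ab.sum(axis=0, keepdims=True)      # marginal of frame_b, shape (1, bins)
    nz = p_ab > 0                              # skip empty cells to avoid log(0)
    return float(np.sum(p_ab[nz] * np.log2(p_ab[nz] / (p_a @ p_b)[nz])))

def saliency_needs_update(prev_frame, cur_frame, threshold):
    """Hypothetical rule: recompute the saliency map when the frames share little information."""
    return mutual_information(prev_frame, cur_frame) < threshold
```

Similar successive frames give a high value, so the previous saliency map can be reused; a sharp drop suggests a scene change and triggers recalculation.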

Page 16:

Integration Into H.264: Propagation of Saliency Values

For inter-coded MBs, the saliency value is a weighted average of the saliency values of the reference-frame MBs that the motion vector points to

Figures from Gupta & Chaudhury, PAMI 2011
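
A minimal Python sketch of this propagation. It assumes integer-pixel motion vectors and weights proportional to the overlap area between the referenced 16x16 block and each reference-frame MB; the function name and both choices are assumptions, since the slide does not state the weights.

```python
import numpy as np

MB = 16  # macroblock size in luma samples

def propagate_saliency(ref_saliency, mb_row, mb_col, mv_y, mv_x, mb=MB):
    """Saliency of an inter-coded MB from the per-MB saliency grid of its reference frame."""
    rows, cols = ref_saliency.shape
    top = mb_row * mb + mv_y        # top-left corner of the referenced block, in pixels
    left = mb_col * mb + mv_x
    total, weight_sum = 0.0, 0.0
    # The referenced block overlaps at most a 2x2 neighbourhood of reference MBs.
    for r in (top // mb, top // mb + 1):
        for c in (left // mb, left // mb + 1):
            overlap_h = min(top + mb, (r + 1) * mb) - max(top, r * mb)
            overlap_w = min(left + mb, (c + 1) * mb) - max(left, c * mb)
            if overlap_h <= 0 or overlap_w <= 0:
                continue
            if not (0 <= r < rows and 0 <= c < cols):
                continue            # part of the block lies outside the frame
            area = overlap_h * overlap_w
            total += area * ref_saliency[r, c]
            weight_sum += area
    return total / weight_sum if weight_sum else 0.0
```

Propagating values this way avoids recomputing the saliency map for every P-frame.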

Page 17:

Integration Into H.264: Salient-Adaptive Quantization

• Non-uniform bit allocation
• Smaller saliency value => coarser quantization
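
A minimal Python sketch of saliency-adaptive quantization. The linear mapping, the function name and the default values of `base_qp` and `max_offset` are hypothetical; only the QP range 0..51 comes from H.264 itself.

```python
def macroblock_qp(saliency, base_qp=26, max_offset=10):
    """Map a saliency value in [0, 1] to an H.264 quantization parameter (0..51).

    Salient MBs keep base_qp; less salient MBs get a larger QP, i.e. coarser quantization.
    """
    s = min(max(saliency, 0.0), 1.0)                  # clamp to [0, 1]
    qp = base_qp + round((1.0 - s) * max_offset)      # assumed linear mapping
    return min(max(qp, 0), 51)                        # stay inside the legal QP range
```

Since the H.264 quantizer step size roughly doubles for every increase of 6 in QP, an offset of 10 spends considerably fewer bits on non-salient MBs.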

Page 18:

Integration Into H.264

Figure from Gupta & Chaudhury, PAMI 2011

Page 19:

Paper Evaluation

Novelty:
• Methods for:
    • saliency map
    • saliency value propagation

Assumption:
• All the MBs in P-frames are inter-coded (problematic)

Writing level:
• Good
• Partially self-contained

Page 20:

Paper Evaluation

Feasibility:
• Higher complexity than H.264 encoders
• Not for real-time encoders
• Useful at low bit-rates
• Objects entering the scene may be considered unimportant

Experimental evaluation:
• Saliency:
    • visual comparison: good
    • ROC curve comparison: partial
• Compression: none (authors' future direction)

Page 21:

Future Directions

• Improving encoding complexity
    • less complex saliency method
• Better object entrance treatment
    • using mutual information of frame areas
• Treat intra-coded MBs in P-frames

Page 22:

A Review of:

3D Models Coding and Morphing for Efficient Video Compression
F. Galpin, R. Balter, L. Morin, K. Deguchi
CVPR 2004

Page 23:

Method Outline

• 3D model extraction
• 3D model-based video coding
• Reconstruction using adaptive geometric morphing

Page 24:

3D Models Stream Generation

Figure from Galpin et al, CVPR 2004

Page 25:

Stream Compression

Three data types to compress:
• 3D model
• Texture images
• Camera parameters

Page 26:

Texture Image Compression

Reconstruction process:

Figure from Galpin et al, CVPR 2004

Page 27:

3D Model Compression

The 3D model originates from a decimated depth map

Compressed by:
• Wavelet transform
• Depth-adaptive quantization

Figures from Galpin et al, CVPR 2004

Page 28:

Video Reconstruction: Texture Fading

Figure from Galpin et al, CVPR 2004

Page 29:

Video Reconstruction: Texture Fading

[Figure panels: without texture fading; with texture fading]

Figures from Galpin et al, CVPR 2004
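
A minimal Python sketch, assuming that texture fading means a gradual cross-fade between the texture images of two successive 3D models rather than an abrupt switch. The function name and the linear weighting are assumptions, not details taken from the paper.

```python
import numpy as np

def fade_textures(texture_prev, texture_next, t):
    """Blend two equally sized texture images; t=0 gives the previous one, t=1 the next."""
    alpha = min(max(t, 0.0), 1.0)   # clamp the fading position to [0, 1]
    return (1.0 - alpha) * texture_prev + alpha * texture_next
```

For a reconstructed frame lying a quarter of the way between the two models, `t = 0.25` would take 75% of the previous texture and 25% of the next.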

Page 30:

Video Reconstruction: Geometric Morphing

Improving 3D model interpolation

Figure from Galpin et al, CVPR 2004

Page 31:

Video Reconstruction: Geometric Morphing

[Figure panels: regular interpolation; interpolation with geometric morphing]

Figures from Galpin et al, CVPR 2004

Page 32:

Result Comparison with H.264

Page 33:

Paper Evaluation

Novelty:
• Compression using unknown 3D model

Assumptions:
• Static scene
• Moving monocular camera
• Neglected camera rotation
• GOP intrinsic parameters are fixed

Writing level:
• Good
• Not self-contained

Page 34:

Paper Evaluation

Feasibility:
• Only for static-scene video
• High encoder / decoder complexity
• Unsuitable for real-time use
• Useful at very low bit-rates

Experimental evaluation:
• Sufficient visual comparison with H.264
• No run-time information

Page 35:

Future Directions

• Treat moving objects
• Improve complexity
    • at least for real-time decoding

Page 36:

Approach Comparison

                        3D model       Attention
Video type              Static scene   Any
Bit-rates useful at     Very low       Low
Encoder complexity      High           High
Decoder complexity      High           Regular
Integration in H.264    Unsuitable     Possible
Overall evaluation      Inferior       Promising