ee569 digital video processing 1 roadmap introduction intra-frame coding –review of jpeg...

EE569 Digital Video Processing

Roadmap

Introduction

Intra-frame coding
– Review of JPEG

Inter-frame coding
– Conditional Replenishment (CR) Coding
– Motion Compensated Predictive (MCP) Coding

Object-based and scalable video coding*
– Motion segmentation, scalability issues


Introduction to Video Coding

Lossless vs. lossy data compression
– Source entropy H(X)
– Rate-Distortion function R(D) or D(R)

Probabilistic modeling is at the heart of data compression
– What is P(X) for video source X?
– Is video coding more difficult than image coding?


Shannon’s Picture

[Figure: rate-distortion plane with axes Rate (bps) and Distortion, showing the operating points of Coder A and Coder B]

For a video source, no one knows the limit (bound)

For a Gaussian source N(0, σ²)
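For reference, the bound in the Gaussian case has a well-known closed form. The expression below assumes a memoryless Gaussian source with variance $\sigma^2$, squared-error distortion, and rate measured in bits per sample; these conditions are assumptions added here, not stated on the slide.

```latex
R(D) =
\begin{cases}
\dfrac{1}{2}\log_2\dfrac{\sigma^2}{D}, & 0 < D \le \sigma^2 \\[6pt]
0, & D > \sigma^2
\end{cases}
```

Any practical coder operates at a point on or above this curve. No comparable closed form is known for video sources.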


Distortion Measures

Objective
– Mean Square Error (MSE)
– Peak Signal-to-Noise-Ratio (PSNR)
– Measure the fidelity to original video

Subjective
– Human Vision System (HVS) based
– Emphasize visual quality rather than fidelity

We only discuss objective measures in this course, but subjective video quality assessment is an open and important topic
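The two objective measures can be sketched in a few lines. This is a minimal illustration that assumes 8-bit samples (peak value 255) and frames stored as NumPy arrays of equal shape; the function names are chosen here for illustration.

```python
import numpy as np

def mse(original, decoded):
    """Mean square error between two frames of equal shape."""
    # Convert to float first so that uint8 subtraction cannot wrap around.
    diff = original.astype(np.float64) - decoded.astype(np.float64)
    return float(np.mean(diff ** 2))

def psnr(original, decoded, peak=255.0):
    """PSNR in dB; peak=255 assumes 8-bit samples."""
    err = mse(original, decoded)
    if err == 0.0:
        return float("inf")  # identical frames
    return float(10.0 * np.log10(peak ** 2 / err))
```

PSNR is a logarithmic restatement of MSE, so both measure fidelity to the original rather than perceived quality.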


Video Coding Applications


Roadmap

Intra-frame coding
– Review of JPEG

Inter-frame coding
– Conditional Replenishment (CR)
– Motion Compensated Prediction (MCP)



A Tour of JPEG Coding Standard

Key Components

Transform
- 8×8 DCT
- boundary padding

Quantization
- uniform quantization
- DC/AC coefficients

Coding
- Zigzag scan
- run length/Huffman coding


JPEG Baseline Coder

Tour Example: 8×8 block of pixel values

183 160  94 153 194 163 132 165
183 153 116 176 187 166 130 169
179 168 171 182 179 170 131 167
177 177 179 177 179 165 131 167
178 178 179 176 182 164 130 171
179 180 180 179 183 169 132 169
179 179 180 182 183 170 129 173
180 179 181 179 181 170 130 169


Step 1: Transform
• DC level shifting
• 2D DCT

Input block:

183 160  94 153 194 163 132 165
183 153 116 176 187 166 130 169
179 168 171 182 179 170 131 167
177 177 179 177 179 165 131 167
178 178 179 176 182 164 130 171
179 180 180 179 183 169 132 169
179 179 180 182 183 170 129 173
180 179 181 179 181 170 130 169

After DC level shifting (-128):

 55  32 -34  25  66  35   4  37
 55  25 -12  48  59  38   2  41
 51  40  43  54  51  42   3  39
 49  49  51  49  51  37   3  39
 50  50  51  48  54  36   2  43
 51  52  52  51  55  41   4  41
 51  51  52  54  55  42   1  45
 52  51  53  51  53  42   2  41

After DCT:

[8×8 block of DCT coefficients; DC coefficient (top-left) = 313]
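Step 1 can be reproduced with a short sketch. It assumes the orthonormal DCT-II, written as a matrix product Y = C X Cᵀ, and uses only NumPy. The names `dct_matrix`, `forward_dct` and `BLOCK` are chosen here for illustration; the block values are those of the tour example.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix C, so that Y = C @ X @ C.T."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos((2 * i + 1) * k * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)       # DC basis vector has its own scale factor
    return c

BLOCK = np.array([
    [183, 160,  94, 153, 194, 163, 132, 165],
    [183, 153, 116, 176, 187, 166, 130, 169],
    [179, 168, 171, 182, 179, 170, 131, 167],
    [177, 177, 179, 177, 179, 165, 131, 167],
    [178, 178, 179, 176, 182, 164, 130, 171],
    [179, 180, 180, 179, 183, 169, 132, 169],
    [179, 179, 180, 182, 183, 170, 129, 173],
    [180, 179, 181, 179, 181, 170, 130, 169],
])

def forward_dct(block):
    """DC level shift (subtract 128) followed by the separable 2D DCT."""
    c = dct_matrix(block.shape[0])
    shifted = block.astype(np.float64) - 128.0
    return c @ shifted @ c.T

coeffs = forward_dct(BLOCK)
```

With this normalization the DC coefficient equals the sum of the level-shifted samples divided by 8, which gives 313 for this block. Because C is orthonormal, the inverse transform is `C.T @ coeffs @ C` followed by adding 128.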


Step 2: Quantization

[8×8 block of DCT coefficients from Step 1; DC coefficient = 313]

Q-table:

 16  11  10  16  24  40  51  61
 12  12  14  19  26  58  60  55
 14  13  16  24  40  57  69  56
 14  17  22  29  51  87  80  62
 18  22  37  56  68 109 103  77
 24  35  55  64  81 104 113  92
 49  64  78  87 103 121 120 101
 72  92  95  98 112 100 103  99

After quantization (Q):

 20   5  -3   1   3  -2   1   0
 -3  -2   1   2   1   0   0   0
 -1  -1   1   1   1   0   0   0
 -1   0   0   1   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0

Why do the Q-table entries increase from top-left to bottom-right?
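The quantization step can be sketched as an element-wise divide-and-round. The table is the one shown in this step; `quantize` and `dequantize` are illustrative names, and rounding to the nearest integer is an assumption about how the slide's numbers were produced.

```python
import numpy as np

Q_TABLE = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
])

def quantize(coeffs, q_table=Q_TABLE):
    """Uniform quantization: index = round(coefficient / step size)."""
    return np.rint(np.asarray(coeffs, dtype=np.float64) / q_table).astype(int)

def dequantize(indices, q_table=Q_TABLE):
    """Decoder side: reconstructed coefficient = index * step size."""
    return np.asarray(indices) * q_table
```

For example, the DC coefficient 313 with step size 16 maps to index 20 and is reconstructed as 320. The larger step sizes toward the bottom-right quantize high-frequency coefficients more coarsely, so most of them become zero.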


Step 3: Entropy Coding

Quantized coefficients:

 20   5  -3   1   3  -2   1   0
 -3  -2   1   2   1   0   0   0
 -1  -1   1   1   1   0   0   0
 -1   0   0   1   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0

Zigzag Scan

(20,5,-3,-1,-2,-3,1,1,-1,-1,0,0,1,2,3,-2,1,1,0,0,0,0,0,0,1,1,0,1,EOB)

EOB (End Of the Block): all following coefficients are zero
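The zigzag scan can be sketched by walking the anti-diagonals of the block and alternating direction. The helper names are illustrative, and `"EOB"` is a plain marker string here, not an actual Huffman symbol.

```python
def zigzag_indices(n=8):
    """(row, col) pairs of an n-by-n block in zigzag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col indexes one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        diag = [(r, s - r) for r in rows]  # listed with the row increasing
        # Odd diagonals run top-right to bottom-left, even ones the other way.
        order.extend(diag if s % 2 else diag[::-1])
    return order

def zigzag_scan(block):
    """Scan in zigzag order, drop the trailing zeros and append 'EOB'."""
    seq = [int(block[r][c]) for r, c in zigzag_indices(len(block))]
    last = max((i for i, v in enumerate(seq) if v != 0), default=-1)
    return seq[: last + 1] + ["EOB"]

QUANTIZED = [
    [20,  5, -3, 1, 3, -2, 1, 0],
    [-3, -2,  1, 2, 1,  0, 0, 0],
    [-1, -1,  1, 1, 1,  0, 0, 0],
    [-1,  0,  0, 1, 0,  0, 0, 0],
    [ 0,  0,  0, 0, 0,  0, 0, 0],
    [ 0,  0,  0, 0, 0,  0, 0, 0],
    [ 0,  0,  0, 0, 0,  0, 0, 0],
    [ 0,  0,  0, 0, 0,  0, 0, 0],
]
```

Applied to the quantized block of the tour example, `zigzag_scan(QUANTIZED)` returns the 28 values listed on the slide followed by the EOB marker.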


Roadmap

Intra-frame coding
– Review of JPEG

Inter-frame coding
– Conditional Replenishment (CR)
– Motion Compensated Prediction (MCP)



Conditional Replenishment

Based on motion detection rather than motion estimation

Partition the current frame into “still areas” and “moving areas”
– Replenishment is applied to moving regions only
– Repetition is applied to still regions

Need to transmit the location of moving areas as well as new (replenishment) information
– No motion vectors transmitted
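A block-wise version of this scheme might look as follows. The 8×8 block size, the mean-absolute-difference test and the threshold value are assumptions made for illustration; the slides do not specify the detection rule.

```python
import numpy as np

def conditional_replenishment(prev_decoded, current, block=8, threshold=10.0):
    """Return (moving_mask, reconstructed_frame).

    Frame dimensions are assumed to be multiples of the block size.
    """
    h, w = current.shape
    recon = prev_decoded.copy()  # still areas: repeat the previous frame
    mask = np.zeros((h // block, w // block), dtype=bool)
    for by in range(h // block):
        for bx in range(w // block):
            ys, xs = by * block, bx * block
            cur = current[ys:ys + block, xs:xs + block].astype(np.float64)
            ref = prev_decoded[ys:ys + block, xs:xs + block].astype(np.float64)
            # Motion detection: mean absolute frame difference of the block.
            if np.mean(np.abs(cur - ref)) > threshold:
                mask[by, bx] = True  # location of a moving area
                # Replenishment: send the new block (uncompressed in this sketch).
                recon[ys:ys + block, xs:xs + block] = current[ys:ys + block, xs:xs + block]
    return mask, recon
```

The mask is the side information that tells the decoder which areas are replenished. No motion vectors appear anywhere in the scheme.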


Conditional Replenishment


Motion Detection


From Replenishment to Prediction

Replenishment can be viewed as a degenerate case of prediction
– Only zero motion vector is considered
– Discard the history

A more powerful approach to exploiting temporal dependency is prediction
– Locate the best match from the previous frame
– Use the history to predict the current frame


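Differential Pulse Coded Modulation

[Figure: block diagrams of the DPCM encoder and decoder, with quantizer Q and delay D. The encoder forms the residue $y_n = x_n - \hat{x}_{n-1}$ and quantizes it to $\hat{y}_n$; the decoder reconstructs $\hat{x}_n = \hat{x}_{n-1} + \hat{y}_n$.]

$x_n, y_n$: unquantized samples and prediction residues

$\hat{x}_n, \hat{y}_n$: decoded samples and quantized prediction residues

$\hat{x}_n - x_n = \hat{y}_n - y_n$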
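The relation between reconstruction error and quantization error can be checked with a small closed-loop DPCM sketch. It assumes a uniform quantizer with step size `step` and a predictor memory initialized to zero; both choices are illustrative.

```python
def dpcm_encode(samples, step):
    """Closed-loop DPCM: predict each sample from the previous decoded sample."""
    indices = []
    prev_decoded = 0.0                # predictor memory, x_hat[n-1]
    for x in samples:
        y = x - prev_decoded          # prediction residue y[n]
        q = int(round(y / step))      # quantizer index for y[n]
        indices.append(q)
        prev_decoded += q * step      # x_hat[n] = x_hat[n-1] + y_hat[n]
    return indices

def dpcm_decode(indices, step):
    """Decoder runs the same recursion as the encoder's feedback loop."""
    decoded = []
    prev_decoded = 0.0
    for q in indices:
        prev_decoded += q * step
        decoded.append(prev_decoded)
    return decoded
```

Because the encoder predicts from decoded samples, the error of every reconstructed sample equals the quantization error of its residue and never exceeds half a step. Predicting from the original samples instead would let quantization errors accumulate at the decoder.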


Motion-Compensated Predictive Coding


A Closer Look


Key Components

Motion Estimation/Compensation
– At the heart of MCP-based coding

Coding of Motion Vectors (overhead)
– Lossless: errors in MV are catastrophic

Coding of MCP residues
– Lossy: distortion is controlled by the quantization step-size

Rate-Distortion optimization


Block-based Motion Model

Block size
– Fixed vs. variable

Motion accuracy
– Integer-pel vs. fractional-pel

Number of hypotheses
– Overlapped Block Motion Compensation (OBMC)
– Multi-frame prediction
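The simplest point in this design space (fixed block size, integer-pel accuracy, single hypothesis) can be sketched as a full-search block matching algorithm. The SAD criterion, the ±4 search range and the function name are assumptions for illustration.

```python
import numpy as np

def block_match(current, reference, top, left, block=8, search=4):
    """Full-search, integer-pel block matching; returns ((dy, dx), sad)."""
    target = current[top:top + block, left:left + block].astype(np.int64)
    h, w = reference.shape
    best_mv, best_sad = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + block > h or x + block > w:
                continue  # candidate falls outside the reference frame
            cand = reference[y:y + block, x:x + block].astype(np.int64)
            sad = int(np.abs(target - cand).sum())  # sum of absolute differences
            if best_sad is None or sad < best_sad:
                best_mv, best_sad = (dy, dx), sad
    return best_mv, best_sad
```

Variable block sizes, fractional-pel accuracy and multiple hypotheses all refine this loop, at the cost of more search and more motion overhead.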


Quadtree Representation of Motion Field with Variable Blocksize

Sullivan, G.J.; Baker, R.L., "Rate-distortion optimized motion compensation for video compression using fixed or variable size blocks," GLOBECOM '91, pp. 85-90, vol. 1, 2-5 Dec 1991


Example

counted bits using a VLC table


Fractional-pel BMA

Recall the tradeoff between spending bits on motion and spending bits on MCP residues

Intuitively, going from integer-pel to fractional-pel accuracy helps because it dramatically reduces the variance of MCP residues for some video sequences.

The gain quickly saturates as motion accuracy is refined further


Example

[Figure: MCP residue comparison for the first two frames of the Mobile sequence]
– 8-by-8 block, integer-pel: var(e) = 220.8
– 8-by-8 block, half-pel: var(e) = 123.8


Fractional-pel MCP

Girod, B., "Motion-compensating prediction with fractional-pel accuracy," IEEE Trans. on Communications, vol.41, no.4, pp.604-612, Apr 1993


Multi-Hypothesis MCP

Using one block from one reference frame represents a single-hypothesis MCP

It is possible to formulate multiple hypotheses by considering
– Overlapped blocks
– More than one reference frame

Why multi-hypothesis?
– The benefit of reducing variance of MCP residues outweighs the increased overhead on motion


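Example: B-frame

[Figure: frames $f_{n-1}$, $f_n$, $f_{n+1}$]

$\hat{f}_n(x,y) = a\, f_{n-1}(x,y) + (1-a)\, f_{n+1}(x,y), \quad a = 0 / 0.5 / 1$


Generalized B-frame

[Figure: frames $f_{n-2}$, $f_{n-1}$, $f_n$, $f_{n+1}$, $f_{n+2}$]

$\hat{f}_n(x,y) = \sum_{k \neq n} a_k\, f_k(x,y)$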


Overlapped Block Motion Compensation (OBMC)


Overlapped Block Motion Compensation (OBMC)

Conventional block motion compensation
– One best matching block is found from a reference frame
– The current block is predicted by the best matching block

OBMC
– Each pixel in the current block is predicted by a weighted average of several corresponding pixels in the reference frame
– The corresponding pixels are determined by the MVs of the current as well as adjacent MBs
– The weight for each corresponding pixel depends on the expected accuracy of the associated MV


OBMC Using 4 Neighboring MBs

Should be inversely proportional to the distance between x and the center of


Optimal Weighting Design*

Convert to an optimization problem:
– Minimize
– Subject to

Optimal weighting functions:


Multi-Hypothesis MCP






Motion Vector Coding

2D lossless DPCM
– Spatially (temporally) adjacent motion vectors are correlated
– Use causal neighbors to predict the current one
– Code Motion Vector Difference (MVD) instead of MVs

Entropy coding techniques
– Variable length codes (VLC)
– Arithmetic coding


MVD Example

[Figure: current motion vector MV and its causal neighbors MV1, MV2, MV3]

$MVD = MV - \mathrm{median}(MV_1, MV_2, MV_3)$

Due to the smoothness of the MV field, MVD usually has a smaller variance than MV
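Median prediction of a motion vector can be sketched as below. Taking the median separately for the horizontal and vertical components is an assumption here (the slide does not say how the median of vectors is defined), and the function names are illustrative.

```python
def median3(a, b, c):
    """Median of three scalars."""
    return sorted((a, b, c))[1]

def mv_difference(mv, mv1, mv2, mv3):
    """MVD = MV - median(MV1, MV2, MV3), taken component-wise."""
    pred = (median3(mv1[0], mv2[0], mv3[0]),
            median3(mv1[1], mv2[1], mv3[1]))
    return (mv[0] - pred[0], mv[1] - pred[1])

def mv_reconstruct(mvd, mv1, mv2, mv3):
    """Decoder side: MV = MVD + median(MV1, MV2, MV3)."""
    pred = (median3(mv1[0], mv2[0], mv3[0]),
            median3(mv1[1], mv2[1], mv3[1]))
    return (mvd[0] + pred[0], mvd[1] + pred[1])
```

With MV = (3, -1) and neighbors (2, 0), (4, -1), (3, 5), the prediction is (3, 0) and the MVD is (0, -1). The decoder can rebuild the prediction itself because it uses only causal neighbors, so the scheme is lossless.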


VLC Example

MVx/MVy   symbol   codeword
  0         1        1
  1         2        010
 -1         3        011
  2         4        00100
 -2         5        00101
  3         6        00110

Exponential Golomb codes: 0…01 x…x, where the prefix 0…01 has m bits and the suffix x…x has m-1 bits
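The codewords in the table can be generated mechanically. The sketch below assumes the mapping 0, 1, -1, 2, -2, 3 → symbols 1 to 6, which is one reading of the flattened table; the function names are illustrative.

```python
def mv_to_symbol(v):
    """Map a signed MV component to a positive symbol: 0->1, 1->2, -1->3, 2->4, ..."""
    return 2 * v if v > 0 else 1 - 2 * v

def exp_golomb(symbol):
    """Exp-Golomb codeword: the binary form of the symbol, preceded by
    as many zeros as there are bits after its leading 1."""
    if symbol < 1:
        raise ValueError("symbol must be >= 1")
    binary = bin(symbol)[2:]  # always starts with '1'
    return "0" * (len(binary) - 1) + binary

def encode_mv_component(v):
    """Codeword for one motion vector (or MVD) component."""
    return exp_golomb(mv_to_symbol(v))
```

The decoder counts the leading zeros to learn how many suffix bits follow, so no separate length information is needed. Small magnitudes, which dominate for MVDs, get the shortest codewords.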









MCP Residue Coding

Transform → Quantization → Coding

Conceptually similar to JPEG

Transform: unitary transform

Quantization: Deadzone quantization

Coding: Run-length coding


Transform

Unitary matrix: A is real, $A^{-1} = A^T$

Unitary transform: A is unitary, $Y = A X A^T$

Examples
– 8-by-8 DCT
– 4-by-4 integer transform

A =
 1  1  1  1
 2  1 -1 -2
 1 -1 -1  1
 1 -2  2 -1
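A quick numerical check of the 4-by-4 integer matrix is sketched below. The matrix entries used here (including signs) are those of the H.264-style core transform, which is an assumption about what the slide showed. Its rows are mutually orthogonal but not of unit length, so it becomes unitary only after row normalization.

```python
import numpy as np

A = np.array([
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
])

gram = A @ A.T                      # diagonal if and only if the rows are orthogonal
row_norms = np.sqrt(np.diag(gram))  # lengths of the rows: 2, sqrt(10), 2, sqrt(10)
A_unit = A / row_norms[:, None]     # normalized version satisfies A_unit^{-1} = A_unit^T

def forward_transform(x, a=A_unit):
    """Separable 2D transform Y = A X A^T."""
    return a @ x @ a.T

def inverse_transform(y, a=A_unit):
    """Inverse for a unitary matrix: X = A^T Y A."""
    return a.T @ y @ a
```

In integer implementations the normalization is typically not applied in the transform itself; the scale factors can be absorbed into the quantization step.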


Deadzone Quantization

[Figure: deadzone quantizer on the real line, with the deadzone centered at 0 and one codeword per quantization bin]
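A deadzone quantizer can be sketched as follows. The specific rule used here (zero bin of width 2Δ, all other bins of width Δ, reconstruction at bin centers) is a common textbook form and is an assumption; the slide's figure does not pin it down.

```python
import numpy as np

def deadzone_quantize(x, step):
    """Index = sign(x) * floor(|x| / step); every |x| < step falls in the deadzone."""
    x = np.asarray(x, dtype=np.float64)
    return (np.sign(x) * np.floor(np.abs(x) / step)).astype(int)

def deadzone_dequantize(indices, step):
    """Reconstruct non-zero bins at their centers; the deadzone reconstructs to 0."""
    idx = np.asarray(indices)
    return np.sign(idx) * (np.abs(idx) + 0.5) * step
```

Compared with a plain uniform quantizer, the wider zero bin sends more small residues to zero, which produces longer zero runs for the run-length coder.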








Constrained Optimization

Min f(x,y) subject to g(x,y)=c


Lagrangian Multiplier Method

$J = D + \lambda R$

Motion estimation: $J = D_{dfd} + \lambda_{Motion} R_{Motion}$

Mode selection: $J = D_{REC} + \lambda_{Mode} R_{REC}$

$\lambda_{Motion} = \sqrt{\lambda_{Mode}}, \quad \lambda_{Mode} = c \cdot QUANT^2$

QUANT: a user-specified parameter controlling quantization stepsize


Example: Rate-Distortion Optimized BMA

Distortion alone

Rate and Distortion

counted bits using a VLC table
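The difference between the two criteria can be sketched as a choice among candidate motion vectors, each with a distortion and a bit count. The candidate numbers below are hypothetical values made up for illustration, and `rd_select` is an illustrative name.

```python
def rd_select(candidates, lam):
    """Pick the candidate minimizing J = D + lam * R.

    candidates: iterable of (motion_vector, distortion, rate_bits).
    """
    return min(candidates, key=lambda c: c[1] + lam * c[2])

# Hypothetical candidates: (motion vector, distortion, bits from a VLC table).
CANDIDATES = [
    ((0, 0), 120.0, 2),
    ((1, -1), 100.0, 8),
    ((3, 2), 95.0, 14),
]

best_distortion_only = rd_select(CANDIDATES, lam=0.0)  # "distortion alone"
best_rate_distortion = rd_select(CANDIDATES, lam=2.0)  # "rate and distortion"
```

With `lam=0` the search returns the lowest-distortion vector regardless of its cost. A positive multiplier trades a small increase in distortion for cheaper motion vectors.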


Experimental Results

Cited from G. J. Sullivan and R. L. Baker, "Rate-distortion optimized motion compensation for video compression using fixed or variable size blocks," GLOBECOM '91


Summary

How does MCP coding work?
– The predictive model captures the slow-varying trend of the samples $\{f_n\}$
– The modeling of prediction residues $\{e_n\}$ is easier than that of original samples $\{f_n\}$

Fundamental weakness
– Quantization error will propagate unless the memory of predictor is refreshed
– Not suitable for scalable coding applications
