EE569 Digital Video Processing

Roadmap
Introduction
Intra-frame coding
– Review of JPEG
Inter-frame coding
– Conditional Replenishment (CR) Coding
– Motion Compensated Predictive (MCP) Coding
Object-based and scalable video coding*
– Motion segmentation, scalability issues

Introduction to Video Coding
Lossless vs. lossy data compression
– Source entropy H(X)
– Rate-Distortion function R(D) or D(R)
Probabilistic modeling is at the heart of data compression
– What is P(X) for video source X?
– Is video coding more difficult than image coding?
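
To make the source entropy H(X) concrete, here is a minimal sketch that estimates the first-order entropy of a discrete source from sample frequencies. The function name `entropy_bits` and the toy source are illustrative assumptions, not part of the lecture.

```python
import math
from collections import Counter

def entropy_bits(samples):
    """Empirical first-order entropy H(X) in bits per symbol."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Hypothetical source with probabilities 1/2, 1/4, 1/8, 1/8.
toy_source = [0, 0, 0, 0, 1, 1, 2, 3]
h = entropy_bits(toy_source)
print(h)  # 1.75 bits per symbol
```

A first-order estimate like this ignores dependencies between samples, which is why modeling P(X) for video is the harder part of the problem.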

Shannon’s Picture
[Figure: rate-distortion plot; axes: Rate (bps) and Distortion; operating points of Coder A and Coder B]
For a video source, no one knows the limit (bound)
For a Gaussian source N(0, σ²), the bound is known in closed form
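
For a memoryless Gaussian source with variance σ² under mean-square error, the rate-distortion function is R(D) = max(0, ½ log₂(σ²/D)) bits per sample. The sketch below evaluates it; the function name `gaussian_rate` is an illustrative assumption.

```python
import math

def gaussian_rate(distortion, variance):
    """R(D) in bits/sample for a memoryless Gaussian source under MSE."""
    if distortion <= 0:
        raise ValueError("distortion must be positive")
    if distortion >= variance:
        return 0.0  # sending nothing already achieves D = variance
    return 0.5 * math.log2(variance / distortion)

# Each 4x reduction of D costs one extra bit per sample.
for d in (4.0, 1.0, 0.25):
    print(d, gaussian_rate(d, variance=4.0))
```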

Distortion Measures
Objective
– Mean Square Error (MSE)
– Peak Signal-to-Noise-Ratio (PSNR)
– Measure the fidelity to original video
Subjective
– Human Vision System (HVS) based
– Emphasize visual quality rather than fidelity
We only discuss objective measures in this course, but subjective video quality assessment is an open and important topic
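
The two objective measures can be sketched in a few lines. This assumes 8-bit samples (peak value 255) stored as flat lists; the function names are illustrative.

```python
import math

def mse(original, decoded):
    """Mean square error between two equally sized sample lists."""
    if len(original) != len(decoded):
        raise ValueError("inputs must have the same length")
    return sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)

def psnr(original, decoded, peak=255):
    """PSNR in dB; infinite when the two signals are identical."""
    err = mse(original, decoded)
    return math.inf if err == 0 else 10.0 * math.log10(peak ** 2 / err)

orig = [10, 20, 30]
dec = [12, 20, 27]
print(mse(orig, dec), psnr(orig, dec))
```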

Video Coding Applications

Roadmap
Introduction
Intra-frame coding
– Review of JPEG
Inter-frame coding
– Conditional Replenishment (CR)
– Motion Compensated Prediction (MCP)
Object-based and scalable video coding*
– Motion segmentation, scalability issues

A Tour of JPEG Coding Standard
Key Components
Transform
– 8×8 DCT
– boundary padding
Quantization
– uniform quantization
– DC/AC coefficients
Coding
– Zigzag scan
– run length/Huffman coding

JPEG Baseline Coder
Tour Example: 8×8 block of pixel values
183 160  94 153 194 163 132 165
183 153 116 176 187 166 130 169
179 168 171 182 179 170 131 167
177 177 179 177 179 165 131 167
178 178 179 176 182 164 130 171
179 180 180 179 183 169 132 169
179 179 180 182 183 170 129 173
180 179 181 179 181 170 130 169

Step 1: Transform
• DC level shifting: subtract 128 from every pixel of the 8×8 block
 55  32 -34  25  66  35   4  37
 55  25 -12  48  59  38   2  41
 51  40  43  54  51  42   3  39
 49  49  51  49  51  37   3  39
 50  50  51  48  54  36   2  43
 51  52  52  51  55  41   4  41
 51  51  52  54  55  42   1  45
 52  51  53  51  53  42   2  41
• 2D DCT of the level-shifted block
[8×8 matrix of DCT coefficients; signs and column boundaries are not recoverable from the transcript. The DC coefficient is 313.]
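
The transform step can be sketched directly from the definition of the orthonormal 2D DCT-II. The block below uses the tour example's pixel values; their row and column layout is reconstructed from the transcript and should be treated as an assumption, as should the helper name `dct2`.

```python
import math

# 8x8 block from the tour example (layout reconstructed, an assumption).
BLOCK = [
    [183, 160,  94, 153, 194, 163, 132, 165],
    [183, 153, 116, 176, 187, 166, 130, 169],
    [179, 168, 171, 182, 179, 170, 131, 167],
    [177, 177, 179, 177, 179, 165, 131, 167],
    [178, 178, 179, 176, 182, 164, 130, 171],
    [179, 180, 180, 179, 183, 169, 132, 169],
    [179, 179, 180, 182, 183, 170, 129, 173],
    [180, 179, 181, 179, 181, 170, 130, 169],
]

def dct2(block):
    """Orthonormal 2D DCT-II of a square block (direct O(n^4) form)."""
    n = len(block)
    scale = [math.sqrt(1.0 / n)] + [math.sqrt(2.0 / n)] * (n - 1)
    cos = [[math.cos((2 * x + 1) * u * math.pi / (2 * n)) for x in range(n)]
           for u in range(n)]
    return [[scale[u] * scale[v] * sum(block[x][y] * cos[u][x] * cos[v][y]
                                        for x in range(n) for y in range(n))
             for v in range(n)]
            for u in range(n)]

shifted = [[p - 128 for p in row] for row in BLOCK]  # DC level shifting
coeffs = dct2(shifted)
print(round(coeffs[0][0]))  # DC coefficient: 313
```

A production codec would use a fast separable DCT; the direct form is kept here only because it mirrors the definition.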

Step 2: Quantization
Q-table
 16  11  10  16  24  40  51  61
 12  12  14  19  26  58  60  55
 14  13  16  24  40  57  69  56
 14  17  22  29  51  87  80  62
 18  22  37  56  68 109 103  77
 24  35  55  64  81 104 113  92
 49  64  78  87 103 121 120 101
 72  92  95  98 112 100 103  99
Quantized coefficients (DCT coefficient divided by the Q-table entry, then rounded)
 20   5  -3   1   3  -2   1   0
 -3  -2   1   2   1   0   0   0
 -1  -1   1   1   1   0   0   0
 -1   0   0   1   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
Why do the Q-table entries increase from top-left to bottom-right?
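
Uniform quantization divides each coefficient by its Q-table entry and rounds to the nearest integer. The sketch below applies this to one row; `Q_ROW0` is the first row of the standard JPEG luminance table, while `coeff_row` holds illustrative coefficient values (an assumption, since the exact coefficients are not recoverable here).

```python
import math

Q_ROW0 = [16, 11, 10, 16, 24, 40, 51, 61]

def quantize(coeff, step):
    """Round coeff/step to the nearest integer, halves away from zero."""
    level = int(math.floor(abs(coeff) / step + 0.5))
    return level if coeff >= 0 else -level

def dequantize(level, step):
    """Decoder-side reconstruction of the coefficient."""
    return level * step

coeff_row = [313, 56, -27, 18, 78, -60, 27, -27]  # illustrative values
levels = [quantize(c, s) for c, s in zip(coeff_row, Q_ROW0)]
recon = [dequantize(l, s) for l, s in zip(levels, Q_ROW0)]
print(levels)  # [20, 5, -3, 1, 3, -2, 1, 0]
print(recon)   # [320, 55, -30, 16, 72, -80, 51, 0]
```

The larger steps toward the right of the row discard more of the high-frequency detail, which is where the bit savings come from.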

Step 3: Entropy Coding
Zigzag scan of the quantized block:
(20,5,-3,-1,-2,-3,1,1,-1,-1,0,0,1,2,3,-2,1,1,0,0,0,0,0,0,1,1,0,1,EOB)
EOB (End Of the Block): all following coefficients are zero
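
The zigzag scan with an end-of-block marker can be sketched as follows. The quantized block is the one from the tour example, with signs taken from the zigzag sequence above; the function names are illustrative assumptions.

```python
QUANTIZED = [
    [20,  5, -3, 1, 3, -2, 1, 0],
    [-3, -2,  1, 2, 1,  0, 0, 0],
    [-1, -1,  1, 1, 1,  0, 0, 0],
    [-1,  0,  0, 1, 0,  0, 0, 0],
    [0] * 8, [0] * 8, [0] * 8, [0] * 8,
]

def zigzag_indices(n=8):
    """(row, col) pairs in zigzag order: anti-diagonals, alternating direction."""
    def key(rc):
        s = rc[0] + rc[1]
        # Odd diagonals run top-right to bottom-left, even ones the reverse.
        return (s, rc[0] if s % 2 else -rc[0])
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)

def zigzag_with_eob(block):
    """Scan the block, drop the trailing run of zeros, append 'EOB'."""
    seq = [block[r][c] for r, c in zigzag_indices(len(block))]
    while seq and seq[-1] == 0:
        seq.pop()
    return seq + ["EOB"]

print(zigzag_with_eob(QUANTIZED))
```

The scan orders coefficients roughly from low to high frequency, so the zeros cluster at the end and a single EOB symbol replaces them.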

Roadmap
Introduction
Intra-frame coding
– Review of JPEG
Inter-frame coding
– Conditional Replenishment (CR)
– Motion Compensated Prediction (MCP)
Object-based and scalable video coding*
– Motion segmentation, scalability issues

Conditional Replenishment
Based on motion detection rather than motion estimation
Partition the current frame into “still areas” and “moving areas”
– Replenishment is applied to moving regions only
– Repetition is applied to still regions
Need to transmit the location of moving areas as well as new (replenishment) information
– No motion vectors transmitted
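
A minimal sketch of conditional replenishment on a block basis follows. It assumes motion detection by thresholding the mean absolute frame difference per block; the block size, threshold and function name are illustrative choices, and the replenished pixels are copied without any coding.

```python
def conditional_replenishment(prev, curr, block=2, threshold=4.0):
    """Return (reconstructed frame, positions of replenished blocks).

    prev is the previously decoded frame, curr the current frame.
    """
    h, w = len(curr), len(curr[0])
    recon = [row[:] for row in prev]  # still areas: repeat previous frame
    moving = []
    for by in range(0, h, block):
        for bx in range(0, w, block):
            pixels = [(y, x) for y in range(by, min(by + block, h))
                      for x in range(bx, min(bx + block, w))]
            mad = sum(abs(curr[y][x] - prev[y][x]) for y, x in pixels) / len(pixels)
            if mad > threshold:  # motion detected: replenish this block
                moving.append((by, bx))
                for y, x in pixels:
                    recon[y][x] = curr[y][x]
    return recon, moving

prev = [[10] * 4 for _ in range(4)]
curr = [row[:] for row in prev]
for y in (2, 3):
    for x in (2, 3):
        curr[y][x] = 50
recon, moving = conditional_replenishment(prev, curr)
print(moving)  # [(2, 2)]
```

Only the list of moving block positions and their new pixel data would need to be transmitted; no motion vectors are involved.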

Conditional Replenishment
[Figure: Conditional Replenishment]

Motion Detection
[Figure: Motion Detection]

From Replenishment to Prediction
Replenishment can be viewed as a degenerate case of prediction
– Only the zero motion vector is considered
– Discard the history
A more powerful approach to exploiting temporal dependency is prediction
– Locate the best match from the previous frame
– Use the history to predict the current frame

Differential Pulse Code Modulation (DPCM)
[Block diagram: encoder with quantizer Q and a one-sample delay D in the prediction loop; decoder with the same delay D]
Encoder: $y_n = x_n - \hat{x}_{n-1}$, $\hat{y}_n = Q(y_n)$
Decoder: $\hat{x}_n = \hat{y}_n + \hat{x}_{n-1}$
$x_n$, $y_n$: unquantized samples and prediction residues
$\hat{x}_n$, $\hat{y}_n$: decoded samples and quantized prediction residues
$x_n - \hat{x}_n = y_n - \hat{y}_n$
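
The closed-loop structure can be sketched as below: the encoder predicts from the decoded previous sample, so the reconstruction error of each sample equals the quantization error of its residue and does not accumulate. The uniform quantizer, the zero initial prediction and the function names are assumptions made for illustration.

```python
import math

def quantize(y, step):
    """Uniform quantizer: nearest multiple of step."""
    return step * math.floor(y / step + 0.5)

def dpcm_encode(samples, step):
    """Return the quantized residues; prediction uses decoded samples."""
    residues, pred = [], 0  # pred holds the decoded previous sample
    for x in samples:
        y_hat = quantize(x - pred, step)
        residues.append(y_hat)
        pred += y_hat  # decoded current sample, same as in the decoder
    return residues

def dpcm_decode(residues):
    decoded, pred = [], 0
    for y_hat in residues:
        pred += y_hat
        decoded.append(pred)
    return decoded

samples = [100, 104, 109, 111, 108]
decoded = dpcm_decode(dpcm_encode(samples, step=4))
print(decoded)  # [100, 104, 108, 112, 108]
```

Every decoded sample stays within half a step of the original, which would not hold if the encoder predicted from the unquantized samples instead.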

Motion-Compensated Predictive Coding
[Figure: Motion-Compensated Predictive Coding]

A Closer Look
[Figure: A Closer Look]

Key Components
Motion Estimation/Compensation
– At the heart of MCP-based coding
Coding of Motion Vectors (overhead)
– Lossless: errors in MV are catastrophic
Coding of MCP residues
– Lossy: distortion is controlled by the quantization step-size
Rate-Distortion optimization

Block-based Motion Model
Block size
– Fixed vs. variable
Motion accuracy
– Integer-pel vs. fractional-pel
Number of hypotheses
– Overlapped Block Motion Compensation (OBMC)
– Multi-frame prediction
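
The simplest instance of this model is fixed block size, integer-pel accuracy and a single hypothesis. A minimal full-search sketch under those assumptions follows; the sum of absolute differences (SAD) criterion, the search range and the function names are illustrative choices.

```python
def sad(cur, ref, by, bx, dy, dx, n):
    """Sum of absolute differences between a block and its displaced match."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))

def full_search(cur, ref, by, bx, n=2, search=1):
    """Return ((dy, dx), SAD) of the best integer-pel match in ref."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + n <= h and
                      0 <= bx + dx and bx + dx + n <= w)
            if not inside:
                continue  # candidate block falls outside the reference frame
            cost = sad(cur, ref, by, bx, dy, dx, n)
            if best is None or cost < best[1]:
                best = ((dy, dx), cost)
    return best

# Reference frame and a current frame whose content moved one pixel right.
ref = [[10 * y + x for x in range(4)] for y in range(4)]
cur = [[row[max(x - 1, 0)] for x in range(4)] for row in ref]
mv, cost = full_search(cur, ref, by=1, bx=1)
print(mv, cost)  # (0, -1) 0
```

The motion vector points from the current block to its match in the reference frame, so content that moved right yields a negative horizontal component.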

Quadtree Representation of Motion Field with Variable Blocksize
Sullivan, G.J.; Baker, R.L., "Rate-distortion optimized motion compensation for video compression using fixed or variable size blocks," GLOBECOM '91, vol. 1, pp. 85-90, 2-5 Dec 1991

Example
Counted bits using a VLC table

Fractional-pel BMA
Recall the tradeoff between spending bits on motion and spending bits on MCP residues
Intuitively speaking, going from integer-pel to fractional-pel helps because it dramatically reduces the variance of MCP residues for some video sequences
The gain quickly saturates as the motion accuracy is refined further

Example
MCP residue comparison for the first two frames of the Mobile sequence
– 8-by-8 block, half-pel: var(e)=123.8
– 8-by-8 block, integer-pel: var(e)=220.8

Fractional-pel MCP
Girod, B., "Motion-compensating prediction with fractional-pel accuracy," IEEE Trans. on Communications, vol. 41, no. 4, pp. 604-612, Apr 1993

Multi-Hypothesis MCP
Using one block from one reference frame represents a single-hypothesis MCP
It is possible to formulate multiple hypotheses by considering
– Overlapped blocks
– More than one reference frame
Why multi-hypothesis?
– The benefit of reducing the variance of MCP residues outweighs the increased overhead on motion

Example: B-frame
[Figure: frames f_{n-1}, f_n, f_{n+1}]
$\hat{f}_n(x,y) = a f_{n-1}(x,y) + (1-a) f_{n+1}(x,y)$, with a = 0, 0.5 or 1

Generalized B-frame
[Figure: frames f_{n-2}, f_{n-1}, f_n, f_{n+1}, f_{n+2}]
$\hat{f}_n(x,y) = \sum_{k \neq n} a_k f_k(x,y)$
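
The weighted prediction can be sketched as a pixel-wise linear combination of reference frames. As in the formula, motion compensation of each reference is left out here; the requirement that the weights sum to 1 and the function name are assumptions.

```python
def weighted_prediction(frames, weights):
    """Predict the current frame as sum_k a_k * f_k over co-located pixels."""
    if len(frames) != len(weights):
        raise ValueError("one weight per reference frame is required")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights are assumed to sum to 1")
    h, w = len(frames[0]), len(frames[0][0])
    return [[sum(a * f[y][x] for a, f in zip(weights, frames))
             for x in range(w)]
            for y in range(h)]

f_prev = [[10, 20], [30, 40]]  # f_{n-1}
f_next = [[30, 40], [50, 60]]  # f_{n+1}
b_frame = weighted_prediction([f_prev, f_next], [0.5, 0.5])
print(b_frame)  # [[20.0, 30.0], [40.0, 50.0]]
```

Setting the weights to (1, 0) or (0, 1) gives pure forward or backward prediction, and (0.5, 0.5) gives the bidirectional average of a B-frame.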

Overlapped Block Motion Compensation (OBMC)
[Figure: Overlapped Block Motion Compensation]

Overlapped Block Motion Compensation (OBMC)
Conventional block motion compensation
– One best matching block is found from a reference frame
– The current block is predicted by the best matching block
OBMC
– Each pixel in the current block is predicted by a weighted average of several corresponding pixels in the reference frame
– The corresponding pixels are determined by the MVs of the current as well as adjacent MBs
– The weight for each corresponding pixel depends on the expected accuracy of the associated MV

OBMC Using 4 Neighboring MBs
Should be inversely proportional to the distance between x and the center of

Optimal Weighting Design*
Convert to an optimization problem:
– Minimize
– Subject to
Optimal weighting functions:

Multi-Hypothesis MCP

Key Components
Motion Estimation
– At the heart of MCP-based coding
Coding of Motion Vectors (overhead)
– Lossless: errors in MV are catastrophic
Coding of MCP residues
– Lossy: distortion is controlled by the quantization step-size
Rate-Distortion optimization

Motion Vector Coding
2D lossless DPCM
– Spatially (temporally) adjacent motion vectors are correlated
– Use causal neighbors to predict the current one
– Code Motion Vector Difference (MVD) instead of MVs
Entropy coding techniques
– Variable length codes (VLC)
– Arithmetic coding

MVD Example
[Figure: current motion vector MV and its neighboring motion vectors MV1, MV2, MV3]
$MVD = MV - \mathrm{median}(MV_1, MV_2, MV_3)$
Due to the smoothness of the MV field, MVD usually has a smaller variance than MV
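
The median prediction can be sketched as follows. Taking the median separately for the horizontal and vertical components is an assumption here (it is the common convention in block-based codecs), and the function names and sample vectors are illustrative.

```python
def median3(a, b, c):
    """Median of three scalars."""
    return sorted((a, b, c))[1]

def motion_vector_difference(mv, mv1, mv2, mv3):
    """MVD = MV - median(MV1, MV2, MV3), computed per component."""
    pred = (median3(mv1[0], mv2[0], mv3[0]),
            median3(mv1[1], mv2[1], mv3[1]))
    return (mv[0] - pred[0], mv[1] - pred[1])

neighbors = [(2, 0), (4, -1), (3, -2)]  # hypothetical causal neighbors
print(motion_vector_difference((3, -1), *neighbors))  # (0, 0)
print(motion_vector_difference((5, 2), *neighbors))   # (2, 3)
```

When the motion field is smooth, most differences land at or near zero, which is what makes them cheap to entropy code.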

VLC Example
MVx/MVy   symbol   codeword
 0        1        1
 1        2        010
-1        3        011
 2        4        00100
-2        5        00101
 3        6        00110
Exponential Golomb codes: 0…01x…x, where the prefix 0…01 has m bits and the suffix x…x has m−1 bits
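
The codewords in the table can be generated mechanically: write the symbol number in binary and prepend one zero per bit after the first. The sketch below assumes the mapping 0, 1, −1, 2, −2, 3 to symbols 1 through 6; the function names are illustrative.

```python
def symbol_index(v):
    """Map a signed value to its symbol: 0,1,-1,2,-2,... -> 1,2,3,4,5,..."""
    return 2 * v if v > 0 else -2 * v + 1

def exp_golomb(symbol):
    """Codeword for a symbol >= 1: (m-1) zeros, then the m-bit binary symbol."""
    if symbol < 1:
        raise ValueError("symbols are numbered from 1")
    bits = bin(symbol)[2:]
    return "0" * (len(bits) - 1) + bits

def encode_mv_component(v):
    return exp_golomb(symbol_index(v))

for v in (0, 1, -1, 2, -2, 3):
    print(v, symbol_index(v), encode_mv_component(v))
```

Small values get short codewords, matching the expectation that motion vector differences cluster around zero.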

Key Components
Motion Estimation
– At the heart of MCP-based coding
Coding of Motion Vectors (overhead)
– Lossless: errors in MV are catastrophic
Coding of MCP residues
– Lossy: distortion is controlled by the quantization step-size
Rate-Distortion optimization

MCP Residue Coding
Transform → Quantization → Coding
Conceptually similar to JPEG
Transform: unitary transform
Quantization: Deadzone quantization
Coding: Run-length coding

Transform
Unitary matrix: A is real, $A^{-1} = A^T$
Unitary transform: A is unitary, $Y = A X A^T$
Examples
– 8-by-8 DCT
– 4-by-4 integer transform
A =
 1  1  1  1
 2  1 -1 -2
 1 -1 -1  1
 1 -2  2 -1
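
A short sketch of Y = A X Aᵀ with the 4-by-4 integer matrix follows. The signs in `A` are those of the well-known H.264-style core transform, which is an assumption about the matrix intended here. Its rows are mutually orthogonal but not unit-norm, so A Aᵀ is diagonal rather than the identity; codecs using it typically fold the missing scaling into quantization.

```python
A = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def forward_transform(x):
    """Y = A X A^T using integer arithmetic only."""
    return matmul(matmul(A, x), transpose(A))

gram = matmul(A, transpose(A))  # diagonal: rows are orthogonal
flat_block = [[5] * 4 for _ in range(4)]
y = forward_transform(flat_block)
print(gram)
print(y)  # all energy of a flat block lands in y[0][0]
```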

Deadzone Quantization
[Figure: deadzone quantizer; labels: 0, deadzone, codewords]
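
A deadzone quantizer can be sketched as a uniform quantizer whose zero bin is widened. The version below assumes the zero bin spans (−Δ, Δ), twice the width of the other bins, and reconstructs nonzero levels at the bin midpoint; both choices and the function names are assumptions.

```python
import math

def deadzone_quantize(x, step):
    """Map x to an integer level; every |x| < step falls into level 0."""
    level = math.floor(abs(x) / step)
    return level if x >= 0 else -level

def reconstruct(level, step, offset=0.5):
    """Reconstruct at the midpoint of the bin (offset=0.5 is an assumption)."""
    if level == 0:
        return 0.0
    magnitude = (abs(level) + offset) * step
    return magnitude if level > 0 else -magnitude

for x in (-17, -7.9, 3, 7.9, 8, 20):
    q = deadzone_quantize(x, step=8)
    print(x, q, reconstruct(q, step=8))
```

The wide zero bin sends small residues to zero, which lengthens the zero runs that run-length coding exploits.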

Key Components
Motion Estimation
– At the heart of MCP-based coding
Coding of Motion Vectors (overhead)
– Lossless: errors in MV are catastrophic
Coding of MCP residues
– Lossy: distortion is controlled by the quantization step-size
Rate-Distortion optimization

Constrained Optimization
Min f(x,y) subject to g(x,y)=c

Lagrangian Multiplier Method
$J = D + \lambda R$
Motion estimation: $J_{motion} = D_{dfd} + \lambda_{motion} R_{motion}$
Mode selection: $J_{mode} = D_{REC} + \lambda_{mode} R_{REC}$
$\lambda_{motion} = \sqrt{\lambda_{mode}}$
$\lambda_{mode} = c \cdot QUANT^2$
QUANT: a user-specified parameter controlling quantization stepsize
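
The Lagrangian cost turns the constrained problem into picking the candidate with the smallest J = D + λR. The sketch below assumes c = 0.85, a value commonly quoted for this formula rather than one given here, and uses made-up candidate distortions and bit counts.

```python
import math

def lambda_mode(quant, c=0.85):
    """lambda_mode = c * QUANT^2 (c = 0.85 is an assumed constant)."""
    return c * quant ** 2

def lambda_motion(quant, c=0.85):
    """lambda_motion = sqrt(lambda_mode)."""
    return math.sqrt(lambda_mode(quant, c))

def pick(candidates, lam):
    """Return the (label, distortion, rate_bits) tuple minimizing D + lam * R."""
    return min(candidates, key=lambda cand: cand[1] + lam * cand[2])

# Hypothetical motion candidates: (motion vector, distortion, bits).
candidates = [((0, 0), 120.0, 1), ((3, -2), 100.0, 9)]
best_d = pick(candidates, 0.0)                # distortion alone
best_rd = pick(candidates, lambda_motion(4))  # rate and distortion
print(best_d[0], best_rd[0])  # (3, -2) (0, 0)
```

With λ = 0 the search keeps the lowest-distortion vector; with a positive λ the cheaper zero vector wins because the eight extra bits cost more than the distortion they save.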

Example: Rate-Distortion Optimized BMA
Distortion alone
Rate and Distortion
Counted bits using a VLC table

Experimental Results
Cited from G. J. Sullivan and R. L. Baker, “Rate-distortion optimized motion compensation for video compression using fixed or variable size blocks”, GLOBECOM '91

Summary
How does MCP coding work?
– The predictive model captures the slowly varying trend of the samples {f_n}
– The modeling of prediction residues {e_n} is easier than that of original samples {f_n}
Fundamental weakness
– Quantization error will propagate unless the memory of the predictor is refreshed
– Not suitable for scalable coding applications