Compression Efficiency and Delay Tradeoffs for Hierarchical B-Pictures and Pulsed-Quality Frames Athanasios Leontaris, Pamela C. Cosman Univ. of California at San Diego IEEE Transactions on Image Processing, July 2007

Compression Efficiency and Delay Tradeoffs for Hierarchical B-Pictures and Pulsed-Quality Frames

Athanasios Leontaris, Pamela C. Cosman
Univ. of California at San Diego
IEEE Transactions on Image Processing, July 2007

Outline

• Introduction
• End-to-End Delay
• Effect of branch removal from HIER coders
• Delay due to encoder output buffer
• Proposed framework for rate allocation
  • Motivation
  • Theoretical background
  • Proposed estimate
• Rate allocation algorithm
• Rate control
• Results
• Conclusion

Introduction

• Constraining delay is critical for real-time communication and live event broadcast

• Compression efficiency can be improved by
  • Increasing the buffering delay (the bit rate allocated to each frame can vary)
  • More flexible motion-compensated prediction structures

• When temporal correlation among several neighboring frames is better exploited, additional delay is incurred
  • Example: tradeoffs of delay and compression in MCTF, in which delay was reduced by selectively removing the update step

• Delay is an issue for hierarchical bi-predictive structures as well
  • The delay in the hierarchical case depends on the GOP size
  • Delay can't be reduced by removing update steps while keeping the GOP size intact
  • But in this work, it can be reduced by removing the MCP branches

Introduction

• One can also have increased delay when using single-direction (forward) prediction
  • One such codec employs two reference frames, short-term (ST) and long-term (LT)

• At constant transmission bit rate, the LT frames will take longer to transmit, introducing delay

• Compression efficiency can be improved for certain sequences, but how about delay?

• A key element of a delay-constrained video encoder is the rate control scheme
  • Rate control algorithms: Test Model 5 (TM5) for MPEG-2, a TM5 variant that replaces the block variance with the block SAD, and the quadratic rate-distortion model adopted in H.264

Introduction

• When the bit rate is distributed unevenly among the frames, extra buffering delay at the encoder output and decoder input is incurred

• So, given constraints on bit rate and buffering delay, an R-D model can yield an efficient rate allocation

• To obtain a model for hierarchical prediction, we need to account for the temporal prediction distance
  • In [16], the rate and distortion were calculated as functions of the power spectral density of the prediction error
  • This model introduced the concept of MC accuracy
  • In this work, we use the accuracy to model the temporal prediction distance

[16] B. Girod, “The efficiency of motion-compensating prediction for hybrid coding of video sequences”

End-to-End Delay

• The encoder is free to allocate the rate among the frames of the time unit so as to optimize some quality criterion, while making sure that the unit as a whole adheres to the CBR target rate

• The encoder output buffer determines how tightly the rate allocation and rate control must operate

• Allowing the encoder output buffer to be larger leads to higher video quality

End-to-End Delay

• The decoder input buffer has the same size as the encoder output buffer

• The encoder buffer fullness and the decoder buffer fullness are always complementary to each other and have a constant sum equal to the max size of each buffer

• The source coding end-to-end delay $D_{e2e}$ is:

$$D_{e2e} = D^{\text{in}}_{\text{enc}} + D^{\text{out}}_{\text{enc}} + D^{\text{in}}_{\text{dec}} + D^{\text{out}}_{\text{dec}}$$

• Assuming the size of the encoder output buffer and the decoder input buffer is B, we can obtain

$$D_B = D^{\text{out}}_{\text{enc}} + D^{\text{in}}_{\text{dec}}$$

Five types of encoders (1), (2)

Five types of encoders (3)

Five types of encoders (4), (5)

IBBBP coder, where all B-coded pictures are disposable and use only I- and P-coded pictures as references

End-to-End Delay

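• For all codecs apart from IBBBP

$$D^{\text{out}}_{\text{dec}} = \log_2(N_{\text{GOP}})\, t_{\text{fr}}$$

• For IBBBP

$$D^{\text{out}}_{\text{dec}} = 2\, t_{\text{fr}}$$

• Source coding end-to-end delay

$$D_{e2e} = D^{\text{in}}_{\text{enc}} + D_B + D^{\text{out}}_{\text{dec}} = (N_{\text{GOP}} - 1)\, t_{\text{fr}} + D_B + D^{\text{out}}_{\text{dec}}$$

• For IPPPP and PULSE codecs ($N_{\text{GOP}} = 1$), the end-to-end delay is

$$D_{e2e} = D_B$$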
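The delay terms on this slide can be evaluated numerically. The following is a minimal sketch, assuming the encoder input delay is (N_GOP − 1)·t_fr, the decoder output delay is log2(N_GOP)·t_fr for every codec except IBBBP and 2·t_fr for IBBBP, and that the buffering delay D_B is supplied by the caller. The function name and the `codec` argument are hypothetical, chosen only for illustration.

```python
import math


def end_to_end_delay(n_gop, fps, buffer_delay, codec="HIER"):
    """Source coding end-to-end delay in seconds (illustrative sketch).

    n_gop        -- GOP size N_GOP (1 for IPPPP and PULSE)
    fps          -- frame rate, so that t_fr = 1 / fps
    buffer_delay -- D_B, combined encoder-output / decoder-input buffering delay
    codec        -- hypothetical label; only "IBBBP" is treated specially
    """
    t_fr = 1.0 / fps
    d_enc_in = (n_gop - 1) * t_fr          # frames that must arrive before coding starts
    if codec == "IBBBP":
        d_dec_out = 2 * t_fr               # assumed fixed reordering delay for IBBBP
    else:
        d_dec_out = math.log2(n_gop) * t_fr
    return d_enc_in + buffer_delay + d_dec_out


if __name__ == "__main__":
    for n_gop in (1, 2, 4):
        delay = end_to_end_delay(n_gop, fps=30, buffer_delay=0.1)
        print(f"N_GOP={n_gop}: {delay * 1000:.1f} ms")
```

With N_GOP = 1 both structural terms vanish and the function returns D_B alone, which matches the last bullet of the slide.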

Effect of branch removal from HIER coders

Effect of branch removal from HIER coders

• A truncated branch brings down the structural delay by one half

• The structure is similar to a GOP size 2 structure

• Differences:
  • It still has 3 hierarchical levels and allows more granular temporal scalability or adaptation to network conditions
  • Frame 4 is predicted from frame 0, instead of being predicted from frame 2 as for GOP size 2. This means the compression performance will be worse than for a GOP size 2 structure

• For hierarchical B-pictures, the tradeoff of compression efficiency for delay becomes a tradeoff of compression efficiency for increased temporal scalability and bit-stream resilience and decreased delay
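To make the "3 hierarchical levels" and the frame 0 to frame 4 prediction concrete, here is a minimal sketch of a plain (untruncated) dyadic hierarchy. It is an illustration under our own assumptions, not the truncated structure from the slide's figure: the key frame at the end of the GOP is predicted from the previous key frame only, and every other frame is bi-predicted from its two nearest neighbors at the coarser level. The function name is hypothetical.

```python
def dyadic_hierarchy(n_gop):
    """Map frame index -> (temporal level, reference frame indices) for one GOP.

    Frame 0 is the previous key frame; frames 1..n_gop belong to this GOP.
    n_gop must be a power of two.
    """
    if n_gop < 1 or n_gop & (n_gop - 1):
        raise ValueError("n_gop must be a positive power of two")
    frames = {n_gop: (0, (0,))}            # key frame, forward-predicted from frame 0
    level, step = 1, n_gop // 2
    while step >= 1:
        # Frames at odd multiples of `step` are bi-predicted at this level.
        for idx in range(step, n_gop, 2 * step):
            frames[idx] = (level, (idx - step, idx + step))
        level += 1
        step //= 2
    return frames


if __name__ == "__main__":
    for idx, (level, refs) in sorted(dyadic_hierarchy(4).items()):
        print(f"frame {idx}: level {level}, references {refs}")
```

For N_GOP = 4 this gives three levels (frame 4; frame 2; frames 1 and 3), and frame 4 references frame 0, as described above.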

Delay due to encoder output buffer

• To avoid a buffer overflow during encoding, the necessary condition is

$$B \ge \max_{j \in [0, K-1]} \max\left( \sum_{i=0}^{j} (x_i - R),\ 0 \right)$$

• The encoder can estimate the encoder output buffer length from the bit rate allocation

• A useful and intuitive lower bound is

$$B \ge \max(x_0, \ldots, x_M) - R$$

• It translates the delay constraint into a rate allocation constraint
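The buffer condition can be checked directly for a candidate allocation. The sketch below assumes that x_i is the number of bits allocated to frame i and that R is the number of bits the channel drains per frame interval; neither definition is spelled out on the slide, and both function names are hypothetical.

```python
def min_buffer_size(frame_bits, drain_per_frame):
    """Smallest buffer that avoids overflow: the largest running excess of
    allocated bits over drained bits, floored at zero."""
    excess = 0.0
    worst = 0.0
    for bits in frame_bits:
        excess += bits - drain_per_frame   # running sum of (x_i - R)
        worst = max(worst, excess)
    return worst


def single_frame_lower_bound(frame_bits, drain_per_frame):
    """Intuitive lower bound: the largest frame minus one interval of drain."""
    return max(frame_bits) - drain_per_frame


if __name__ == "__main__":
    allocation = [2000, 2000, 500]         # illustrative bits per frame
    rate = 1500                            # illustrative bits drained per frame interval
    print(min_buffer_size(allocation, rate))
    print(single_frame_lower_bound(allocation, rate))
```

Dividing the resulting buffer size by the channel rate gives the corresponding buffering delay, which is how a delay constraint turns into a constraint on the allocation.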

Proposed Framework for Rate Allocation

Motivation

• The JM reference software uses a single QP for the entire frame

• However, rate allocation under tight delay constraints can't use the same QP for the entire frame
  • Extend the rate control algorithm to offer per-block decisions of QP, and seek to avoid buffer overflow and underflow and satisfy the target rate

• Goal: establish the bit rate allocation for different hierarchical levels with B-coded pictures

• We found through experiments that the efficiency of the bi-directional prediction of a frame depends on the distance from its references

Motivation

• Assumption a)
  • Frames within a temporal decomposition level have similar entropy → can be afforded the same number of bits
  • We seek a solution that doesn't depend on video content: fixed proportion of bits for each temporal decomposition level

• The requirement for fixed ratios is a result of computational and delay constraints
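Assumption a) amounts to splitting a GOP's bit budget by fixed per-level ratios. A minimal sketch follows; the weights used in the example are purely illustrative placeholders, not the ratios derived in the paper, and the function name is hypothetical.

```python
def allocate_gop_bits(gop_bits, frames_per_level, level_weights):
    """Return bits per frame for each temporal level.

    gop_bits         -- total bit budget for one GOP
    frames_per_level -- number of frames at each temporal level
    level_weights    -- relative bits per frame at each level (fixed, content-independent)
    """
    if len(frames_per_level) != len(level_weights):
        raise ValueError("one weight is required per temporal level")
    total_weight = sum(n * w for n, w in zip(frames_per_level, level_weights))
    return [gop_bits * w / total_weight for w in level_weights]


if __name__ == "__main__":
    # N_GOP = 4: one key frame, one level-1 frame, two level-2 frames.
    per_frame = allocate_gop_bits(4000, [1, 1, 2], [4, 2, 1])
    print(per_frame)
```

Because every frame in a level receives the same number of bits, the allocation needs no per-sequence analysis, which is the point of the fixed-ratio requirement.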

Motivation

• Assumption b)
  • Closed-loop coding
    • Refers to using as references the previously reconstructed versions of the frames

• Approaches for rate allocation in open-loop MCTF are based on temporal propagation of the error
  • They don't take into account the temporal distance between the frames and lack any delay constraints
  • Not appropriate for this work since we can't afford the delay and the computational complexity

Motivation

• Assumption c)
  • High-rate operation
    • Closed-loop prediction at high rates doesn't alter the signal significantly
    • Effect of quantization error on prediction efficiency can be neglected for fine quantization
    • Then, a closed-loop video coder using the optimal open-loop rate allocation performs close to the optimal closed-loop rate allocation

Theoretical background

• Rate-distortion (R-D) modeling scheme from [16] and [24]
  • [16] B. Girod, “The efficiency of motion-compensating prediction for hybrid coding of video sequences”
  • [24] B. Girod, “Efficiency analysis of multihypothesis motion-compensated prediction for video coding”

Theoretical background

• The prediction error:

$$e(x,y) = s(x,y) - \sum_{i=0}^{N-1} f_i(x,y) * c_i(x,y)$$

Theoretical background

• Assume the p.d.f. of the displacement errors $\Delta_x$ and $\Delta_y$ is a function of the temporal prediction distance

• If the power spectral density of the prediction error is known, then the error variance is given as

• The well-known rate distortion function for memoryless coding is
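For reference, the two expressions named on this slide have standard textbook forms. The sketch below assumes a 2-D signal sampled on a unit grid (so frequencies are normalized to [−π, π]) and a memoryless Gaussian source with mean-squared-error distortion. The symbols Φ_ee, ω_x, ω_y, σ_e², and D are our notation and are not recovered from the slide.

```latex
% Error variance as the integral of the prediction-error power spectral density
\sigma_e^2 = \frac{1}{4\pi^2} \int_{-\pi}^{\pi} \int_{-\pi}^{\pi}
             \Phi_{ee}(\omega_x, \omega_y) \, d\omega_x \, d\omega_y

% Rate-distortion function of a memoryless Gaussian source (bits per sample)
R(D) = \max\left\{ 0,\ \frac{1}{2} \log_2 \frac{\sigma_e^2}{D} \right\}
```

Under these assumptions, every halving of the error variance at fixed distortion D saves half a bit per sample, which is why the prediction-error variance is the quantity the model tracks.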

Theoretical background

• Power spectrum is calculated for N-hypothesis prediction in [24]
  • N=1
  • N=2

Theoretical background

• The power spectrum of the signal s is found in (19) in [17] as

• The noise power spectrum is

• The displacement error p.d.f.

• Fourier transforms of the above p.d.f.

Theoretical background

• and represent the Fourier transforms of the spatial filter for single and double hypothesis
  • For N=1 hypothesis,
  • For N=2 hypotheses,

• Continue the derivation of the power spectral densities

Proposed estimate

• seems to be a logarithmic function of the temporal distance

Proposed estimate

• Replace with the expression derived for the error variance

• Term produces approximately logarithmically spaced rate-distortion functions

Proposed estimate

• For fixed the standard deviation of the motion compensation displacement error varies approximately linearly with the temporal prediction distance

• The final rate-distortion model is written as

• The motivation behind adding the term to the denominator of Rl is that hybrid video coding is closed-loop and thus a case of dependent video coding

Rate allocation algorithm

Rate allocation algorithm

• An alternative approach to calculate the rate ratios was proposed by a reviewer for the HIER structure with NGOP = 4

• Constraining Dt, solve the equation (21) to find Dr. After Dr has been calculated, the calculation of R0, R1, and R2 is straightforward

Rate control

• For the IPPPP codec, the rate control algorithm included in the JM 10.1 reference software is used directly

• To ensure accurate rate control under tight delay constraints, we adopt the rate control approach of the PULSE codec, with multiple rate-control paths, each of which maintains its own quadratic model

• For a hierarchical stream, the number of rate control bins is equal to the number of temporal decomposition levels
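The idea of one rate-control bin per temporal level, each with its own quadratic model, can be sketched as follows. This assumes the commonly used quadratic form T = X1·MAD/Q + X2·MAD/Q², where T is the target bits, MAD the predicted mean absolute difference, and Q the quantization step. The class name, the parameter values, and the omission of the model-update step are simplifications of ours and do not reproduce the JM or PULSE implementations.

```python
import math
from dataclasses import dataclass


@dataclass
class QuadraticModel:
    """One rate-control bin: bits = x1 * mad / q + x2 * mad / q**2."""
    x1: float
    x2: float

    def predicted_bits(self, mad, qstep):
        return self.x1 * mad / qstep + self.x2 * mad / qstep ** 2

    def qstep_for_target(self, target_bits, mad):
        """Solve the quadratic for the step size that meets target_bits.

        Assumes target_bits > 0, mad > 0, x1 > 0, and x2 >= 0.
        """
        a = self.x2 * mad
        b = self.x1 * mad
        if a == 0:
            return b / target_bits          # model degenerates to first order
        # Positive root of a*u^2 + b*u - target = 0, with u = 1 / qstep.
        u = (-b + math.sqrt(b * b + 4 * a * target_bits)) / (2 * a)
        return 1.0 / u


# One independent model per temporal decomposition level (illustrative values).
bins = {level: QuadraticModel(x1=1000.0, x2=5000.0) for level in range(3)}

if __name__ == "__main__":
    q = bins[2].qstep_for_target(target_bits=800, mad=4.0)
    print(f"quantization step for level 2: {q:.2f}")
```

Keeping the bins separate means that statistics gathered on sparsely predicted low-level frames never contaminate the model used for the closely predicted high-level frames.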

Results

• Video sequences:
  • Akiyo: very static image sequence
  • Carphone: includes localized motion of various kinds. The majority of activity is due to the instability of the camera inside the car. There is repetitive translational global motion
  • Flower: high-frequency content; the motion is global and mainly follows the affine model
  • Football: extremely active, with local object motion
  • Mobile: substantial high-frequency content; the motion is mostly global due to the horizontal camera pan
  • Stefan: sports clip featuring a tennis court with very high motion

Results

PSNR versus delay (in seconds) for fixed source coding bit rate. (a) “Akiyo” CIF 352x288 at 15 fps. Initial QP 39. (b) “Akiyo” CIF at 30 fps. Initial QP 30.

Results

PSNR versus delay (in seconds) for fixed source coding bit rate. (a) “Carphone” QCIF 176x144 at 15 fps. Initial QP 31. (b) “Carphone” QCIF at 30 fps. Initial QP 29

Results

PSNR versus delay (in seconds) for fixed source coding bit rate. (a) “Flower-Garden” SIF 352x240 at 30 fps. Initial QP 29. (b) “Football” SIF at 30 fps. Initial QP 33

Results

PSNR versus delay (in seconds) for fixed source coding bit rate. (a) “Mobile-Calendar” QCIF 176x144 at 30 fps. Initial QP 29. (b) “Stefan” CIF 352x288 at 30 fps. Initial QP 31.

Results

Results

Conclusion

• The study of the delay tradeoffs yielded the following conclusions:

• IPPPP performs well for low delay applications and for sequences with high motion

• PULSE is advantageous for relatively static sequences with repetitive content

• NGOP > 1 structures benefit from static sequences and from sequences with global motion

• As NGOP increases, the gain is nontrivial only if the sequence is either static, or if the global motion is translational

• For the sequences we evaluated, the delay thresholds are as follows: between 40 and 80 ms, IPPPP is the best choice; between 80 and 125 ms, PULSE performs well; the large space between 125 and 270 ms is dominated by NGOP = 2; and for delays larger than 270 ms, NGOP = 4 is the best choice. Delays larger than 270 ms, however, are useful only in cases of live event broadcast or streaming of stored content. They are prohibitive for real-time interactive communication

• The truncated NGOP = 4 codec underperforms the NGOP = 2 codec but has similar delay with the added advantage of increased temporal scalability
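The delay thresholds in the conclusions can be written as a simple lookup. This is a minimal sketch: the thresholds are the ones quoted above for the evaluated sequences, while the treatment of the exact boundary values and of delays below 40 ms (returned as `None`) is our assumption. The function name is hypothetical.

```python
def suggested_codec(delay_ms):
    """Return the codec structure favored at a given end-to-end delay budget.

    Thresholds apply to the sequences evaluated in the study and are not
    universal; boundary handling is an assumption of this sketch.
    """
    if delay_ms < 40:
        return None            # below the range covered by the conclusions
    if delay_ms < 80:
        return "IPPPP"
    if delay_ms < 125:
        return "PULSE"
    if delay_ms <= 270:
        return "NGOP=2"
    return "NGOP=4"            # only practical for broadcast or stored content


if __name__ == "__main__":
    for budget in (50, 100, 200, 300):
        print(budget, "ms ->", suggested_codec(budget))
```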