
PERFORMANCE COMPARISON OF HEVC, H.264 and VP9

A PROJECT PROPOSAL UNDER THE GUIDANCE OF

DR. K. R. RAO

COURSE: EE5359 - MULTIMEDIA PROCESSING, SPRING 2015

SUBMITTED BY:

DEEPIKA SREENIVASULU PAGALA

[email protected]

1001112646

DEPARTMENT OF ELECTRICAL ENGINEERING

UNIVERSITY OF TEXAS AT ARLINGTON


Table of Contents:

1. Objective
2. Evolution of Video Coding Standards
3. Need for Video Compression
4. Fundamental Concepts in Video Coding
5. H.264/AVC
   5.1 Introduction
   5.2 Encoder and Decoder in H.264
   5.3 Features of H.264/AVC
       5.3.1 Prediction
       5.3.2 Transform and Quantization
       5.3.3 Entropy Coding
6. HEVC
   6.1 Introduction
   6.2 Encoder and Decoder in HEVC
   6.3 Features of HEVC
       6.3.1 Partitioning
       6.3.2 Prediction
       6.3.3 Transform and Quantization
       6.3.4 Entropy Coding
7. VP9
   7.1 Introduction
   7.2 Encoder and Decoder in VP9
   7.3 Features of VP9
       7.3.1 Prediction Block Sizes
       7.3.2 Prediction Modes
             7.3.2.1 Intra-prediction Modes
             7.3.2.2 Inter Prediction Modes
       7.3.3 Transform and Quantization
       7.3.4 Entropy Coding
8. Comparison Metrics
   8.1 Peak Signal to Noise Ratio
   8.2 Structural Similarity Index
   8.3 BD-BR and BD-PSNR
   8.4 Implementation Complexity
9. Profiles used for comparison
10. Test Sequences
11. References


List of Acronyms and Abbreviations:

ADST: Asymmetric Discrete Sine Transform.
AVC: Advanced Video Coding.
BD-BR: Bjontegaard Delta Bitrate.
BD-PSNR: Bjontegaard Delta Peak Signal to Noise Ratio.
CABAC: Context Adaptive Binary Arithmetic Coding.
CAVLC: Context Adaptive Variable Length Coding.
CTB: Coding Tree Block.
CTU: Coding Tree Unit.
CU: Coding Unit.
DBF: De-blocking Filter.
DCT: Discrete Cosine Transform.
DST: Discrete Sine Transform.
DPB: Decoded Picture Buffer.
HD: High Definition.
HEVC: High Efficiency Video Coding.
HM: HEVC Test Model.
IEC: International Electrotechnical Commission.
ISO: International Organization for Standardization.
ITU-T: International Telecommunication Union - Telecommunication Standardization Sector.
JCT: Joint Collaborative Team.
JCT-VC: Joint Collaborative Team on Video Coding.
JM: H.264 Test Model.
JPEG: Joint Photographic Experts Group.
KTA: Key Technical Areas (H.264 based exploration software of VCEG).
MC: Motion Compensation.
ME: Motion Estimation.
MPEG: Moving Picture Experts Group.
MSE: Mean Square Error.
NGOV: Next Generation Open Video.
PB: Prediction Block.
PCS: Picture Coding Symposium.
PSNR: Peak Signal to Noise Ratio.
PU: Prediction Unit.
QP: Quantization Parameter.
RD: Rate Distortion.
SAO: Sample Adaptive Offset.
SSIM: Structural Similarity Index.
TB: Transform Block.
TU: Transform Unit.
VCEG: Video Coding Experts Group.

1. Objective:

The objective of this project is to study the video coding standards HEVC [1] [34] [35] [36], H.264 [2] [34] [35] and VP9 [3] [4], and to understand the techniques they use, such as prediction, transform, quantization and entropy coding. The codecs will be compared on computational time, PSNR [25], SSIM [5] [20] [31], BD-bitrate [6] and BD-PSNR [7]. The software used for this purpose is HM 16.0 [26] [33] for HEVC, JM 18.6 [27] [32] for H.264, and the VPX encoder [28] from The WebM Project for VP9.

2. Evolution of Video Coding standards [8]:

Fig 1: Evolution of video coding standards [8]

Major video coding standards have been developed by the International Organization for Standardization / International Electrotechnical Commission (ISO/IEC) and the International Telecommunication Union – Telecommunication Standardization Sector (ITU-T) [8]. Figure 1 shows a historical perspective of video coding standards development since the very first standard, ITU-T H.120. H.264/AVC doubled the coding efficiency of the MPEG-4 simple profile and has therefore gained wide industrial acceptance [8]. Further extensions of H.264/AVC include the high profiles, the scalable video coding (SVC) extension and the multiview video coding (MVC) extension [8].

In 2005, the ITU-T Video Coding Experts Group (VCEG) began considering future work beyond H.264/AVC [8]. Possible targets and scope for a new standard were brainstormed, and exploration software known as Key Technical Areas (KTA) was developed and released in 2008 [8]. In 2009, the ISO/IEC Moving Picture Experts Group (MPEG) began a similar call for High-Performance Video Coding (HVC) [8].

3. Need for Video Compression - Growing demand for video [30]:

- Video exceeds half of internet traffic and will grow to 86 percent by 2016 [30]. Increase in applications, content, fidelity, etc. Need higher coding efficiency! [30]
- Ultra-HD 4K broadcast expected for Japan in 2014. London Olympics Opening and Closing Ceremonies shot in Ultra-HD 8K. Need higher throughput! [30]
- 25x increase in mobile data traffic over next five years. Video is a “must have” on portable devices. Need lower power! [30]

4. Fundamental Concepts in Video Coding:

Color Spaces

The common color spaces for digital image and video representation are:

- RGB color space – Each pixel is represented by three numbers indicating the relative proportions of red, green and blue.
- YCrCb color space – Y is the luminance component, a monochrome version of the color image. Y is a weighted average of R, G and B:

$$Y = k_r R + k_g G + k_b B$$

where $k_r$, $k_g$ and $k_b$ are the weighting factors. The color information is represented as color differences or chrominance components,

$$C_r = R - Y, \qquad C_g = G - Y, \qquad C_b = B - Y$$

where each chrominance component is the difference between R, G or B and the luminance Y.

Because the human visual system is less sensitive to color than to luminance, YCrCb has an advantage over RGB: the amount of data required to represent the chrominance components can be reduced without impairing the visual quality [10].
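The conversion described above can be sketched in a few lines of Python. This is an illustrative sketch only: the weighting factors are assumed to be the ITU-R BT.601 values (the text does not fix them), the 0.5 scaling of the color differences is a common normalisation rather than something stated here, and the function names are hypothetical.

```python
# Illustrative RGB <-> YCbCr conversion for a single pixel.
# Assumption: BT.601 weighting factors; the proposal does not name a specific set.
KR, KB = 0.299, 0.114
KG = 1.0 - KR - KB  # the three weights sum to 1


def rgb_to_ycbcr(r, g, b):
    """Convert normalised RGB values in [0, 1] to (Y, Cb, Cr)."""
    y = KR * r + KG * g + KB * b        # luminance: weighted average of R, G, B
    cb = 0.5 * (b - y) / (1.0 - KB)     # scaled blue color difference
    cr = 0.5 * (r - y) / (1.0 - KR)     # scaled red color difference
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Invert rgb_to_ycbcr; G is recovered from Y, R and B."""
    r = y + 2.0 * (1.0 - KR) * cr
    b = y + 2.0 * (1.0 - KB) * cb
    g = (y - KR * r - KB * b) / KG
    return r, g, b
```

Only Cb and Cr are carried alongside Y because the third difference (G - Y) can be recovered from the other components, as the inverse function shows.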

The popular pattern of sampling [10] is:

- 4:4:4 – The three components Y, Cr and Cb have the same resolution; that is, for every 4 luminance samples there are 4 Cr and 4 Cb samples.

The popular patterns of sub-sampling [10] are:

- 4:2:2 – For every 4 luminance samples in the horizontal direction, there are 2 Cr and 2 Cb samples. This representation is used for high quality video color reproduction.
- 4:2:0 – The Cr and Cb each have half the horizontal and vertical resolution of Y. This is popularly used in applications such as video conferencing, digital television and DVD storage.

Fig 2: 4:2:0 sub-sampling pattern [10].

Fig 3: 4:2:2 sub-sampling pattern and 4:4:4 sampling pattern [10].
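To make the data savings of these patterns concrete, the sketch below subsamples a chroma plane to 4:2:0 and counts samples per frame. It assumes simple 2x2 averaging as the downsampling filter (real systems may use other filters and sample positions), and the function names are hypothetical.

```python
def subsample_420(chroma):
    """Halve a chroma plane in both directions by averaging each 2x2 block.

    Assumption: plain averaging; actual encoders may use different filters.
    """
    height, width = len(chroma), len(chroma[0])
    assert height % 2 == 0 and width % 2 == 0, "plane dimensions must be even"
    return [
        [
            (chroma[r][c] + chroma[r][c + 1]
             + chroma[r + 1][c] + chroma[r + 1][c + 1]) / 4.0
            for c in range(0, width, 2)
        ]
        for r in range(0, height, 2)
    ]


def samples_per_frame(width, height, pattern):
    """Total number of Y, Cb and Cr samples in one frame."""
    # Fraction of luma resolution kept by EACH chroma plane.
    chroma_fraction = {"4:4:4": 1.0, "4:2:2": 0.5, "4:2:0": 0.25}[pattern]
    luma = width * height
    return int(luma * (1 + 2 * chroma_fraction))
```

For a QCIF frame (176x144), `samples_per_frame` gives 76032 samples for 4:4:4 and 38016 for 4:2:0, i.e. half the raw data before any compression is applied.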

5. H.264/AVC [2] :

5.1: INTRODUCTION [9]:

H.264/Advanced Video Coding (AVC) is a video coding standard of the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group [2].

- Version 1 was completed in May 2003 [9].
- H.264/AVC is the most popular video standard in the market: 80% of video on the internet is encoded with H.264 [9].
- ~50% higher efficiency than MPEG-2 [9].

Applications include:

- HDTV broadcast over satellite, cable and terrestrial channels
- Video content acquisition and editing
- Camcorders and security applications
- Internet and mobile network video
- Blu-ray Discs
- Real-time video chat, video conferencing and telepresence

5.2: Encoder and Decoder in H.264 [11]:

An H.264 video encoder carries out prediction, transform and encoding processes (Figure 4) to produce a compressed H.264 bit stream. An H.264 video decoder carries out complementary processes of decoding, inverse transform and reconstruction (Figure 5) to produce a decoded video sequence.


Fig 4: Encoding Process in H.264 [11]


Fig 5: Decoding Process in H.264 [11]

5.3 Features of H.264/AVC:

5.3.1 Prediction [12] : The encoder processes a frame of video in units of a macro-block (16x16 displayed pixels) [12] . It forms a prediction of the macro-block based on previously-coded data, either from the current frame (intra prediction) or from other frames that have already been coded and transmitted (inter prediction). The encoder subtracts the prediction from the current macro-block to form a residual. Intra prediction uses 16x16 and 4x4 block sizes to predict the macro-block from surrounding, previously coded pixels within the same frame.

Fig 6: Intra prediction in H.264 [12]


Inter prediction uses a range of block sizes (from 16x16 down to 4x4) to predict pixels in the current frame from similar regions in previously coded frames.

Fig 7: Inter prediction in H.264 [12]

Finding a suitable inter prediction is often described as motion estimation. Subtracting an inter prediction from the current macro-block is motion compensation.
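Motion estimation and the formation of the residual can be illustrated with a small full-search block-matching sketch. It assumes the sum of absolute differences (SAD) as the matching cost and integer-pixel displacements only; actual H.264 encoders also search sub-pixel positions and use faster search strategies. All names are hypothetical.

```python
def sad(cur, cy, cx, ref, ry, rx, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(
        abs(cur[cy + r][cx + c] - ref[ry + r][rx + c])
        for r in range(size)
        for c in range(size)
    )


def full_search(cur, ref, y, x, size=4, search=2):
    """Motion estimation: return (dy, dx, cost) of the best match in ref."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = y + dy, x + dx
            # Skip candidates that fall outside the reference frame.
            if ry < 0 or rx < 0 or ry + size > height or rx + size > width:
                continue
            cost = sad(cur, y, x, ref, ry, rx, size)
            if best is None or cost < best[2]:
                best = (dy, dx, cost)
    return best


def residual_block(cur, ref, y, x, mv, size=4):
    """Motion compensation: subtract the displaced reference block."""
    dy, dx = mv
    return [
        [cur[y + r][x + c] - ref[y + dy + r][x + dx + c] for c in range(size)]
        for r in range(size)
    ]
```

When the current block is an exact displaced copy of a reference region, the search returns a cost of zero and the residual is all zeros, which is the case that costs the fewest bits to code.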

5.3.2 Transform and Quantization [13]: A block of residual samples is transformed using a 4x4 or 8x8 integer transform, an approximate form of the Discrete Cosine Transform (DCT) [13]. The transform outputs a set of coefficients, each of which is a weighting value for a standard basis pattern. When combined, the weighted basis patterns re-create the block of residual samples. The output of the transform, a block of transform coefficients, is quantized, i.e. each coefficient is divided by an integer value. Quantization reduces the precision of the transform coefficients according to a quantization parameter (QP).
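The following sketch shows the 4x4 forward core transform of H.264 followed by a simplified uniform quantizer. Two simplifications are assumed: the standard folds per-coefficient scaling factors into the quantization stage, which is omitted here, and `approx_qstep` is only an approximation of the QP-to-step-size mapping (the step size doubles for every increase of 6 in QP). Function names are hypothetical.

```python
# Forward 4x4 core transform matrix of H.264 (integer approximation of the DCT).
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    return [list(col) for col in zip(*m)]


def core_transform(block):
    """Compute Cf * X * Cf^T using integer arithmetic only."""
    return matmul(matmul(CF, block), transpose(CF))


def approx_qstep(qp):
    """Approximate quantizer step size: 0.625 at QP 0, doubling every 6 QP."""
    return 0.625 * 2.0 ** (qp / 6.0)


def quantize(coeffs, step):
    """Simplified uniform quantizer (per-position scaling factors omitted)."""
    return [[int(round(c / step)) for c in row] for row in coeffs]
```

A flat residual block produces a single non-zero (DC) coefficient, so after quantization only one value needs to be entropy coded.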

5.3.3 Entropy Coding [15] :

H.264/AVC specifies two alternative methods of entropy coding: a low-complexity technique based on the usage of context-adaptively switched sets of variable length codes, so-called CAVLC, and the computationally more demanding algorithm of context-based adaptive binary arithmetic coding (CABAC) [14].

6. HEVC [1] [34]:

6.1 Introduction: High Efficiency Video Coding (HEVC) [1] is an international standard for video compression developed by a working group of ISO/IEC MPEG (Moving Picture Experts Group) and ITU-T VCEG (Video Coding Experts Group). The main goal of the HEVC standard is to significantly improve compression performance compared to existing standards (such as H.264/Advanced Video Coding [2]), in the range of 50% bit rate reduction at similar visual quality [1].

HEVC is designed to address existing applications of H.264/MPEG-4 AVC and to focus on two key issues: increased video resolution and increased use of parallel processing architectures [1]. It primarily targets consumer applications, as pixel formats are limited to 4:2:0 8-bit and 4:2:0 10-bit. The next revision of the standard, finalized in 2014, enables new use cases with support for additional pixel formats such as 4:2:2 and 4:4:4, bit depths higher than 10-bit [18], embedded bit-stream scalability, 3D video [17] and multiview video [41].

6.2 Encoder and Decoder in HEVC [19]:

Source video, consisting of a sequence of video frames, is encoded or compressed by a video encoder to create a compressed video bit stream. The compressed bit stream is stored or transmitted. A video decoder decompresses the bit stream to create a sequence of decoded frames.

The video encoder performs the following steps:

- Partitioning each picture into multiple units
- Predicting each unit using inter or intra prediction, and subtracting the prediction from the unit
- Transforming and quantizing the residual (the difference between the original picture unit and the prediction)
- Entropy encoding the transform output, prediction information, mode information and headers

The video decoder performs the following steps:

- Entropy decoding and extracting the elements of the coded sequence
- Rescaling and inverting the transform stage
- Predicting each unit and adding the prediction to the output of the inverse transform
- Reconstructing a decoded video image
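The encoder and decoder steps listed above can be sketched for a single block as follows. This is a toy illustration under stated assumptions: a floating-point orthonormal DCT stands in for HEVC's integer transforms, the prediction is supplied by the caller, and entropy coding is left out entirely. All names are hypothetical.

```python
import math


def dct_matrix(n):
    """Rows are the orthonormal DCT-II basis vectors of length n."""
    return [
        [
            math.sqrt((1.0 if k == 0 else 2.0) / n)
            * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        ]
        for k in range(n)
    ]


def matmul(a, b):
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    return [list(col) for col in zip(*m)]


def encode_block(block, prediction, step):
    """Subtract the prediction, transform the residual, quantize."""
    n = len(block)
    d = dct_matrix(n)
    residual = [[block[r][c] - prediction[r][c] for c in range(n)] for r in range(n)]
    coeffs = matmul(matmul(d, residual), transpose(d))
    return [[int(round(v / step)) for v in row] for row in coeffs]


def decode_block(levels, prediction, step):
    """Rescale, inverse transform, add the prediction back."""
    n = len(levels)
    d = dct_matrix(n)
    coeffs = [[level * step for level in row] for row in levels]
    residual = matmul(matmul(transpose(d), coeffs), d)
    return [[prediction[r][c] + residual[r][c] for c in range(n)] for r in range(n)]
```

The reconstruction differs from the original block only by the quantization error, which is why a larger step size lowers both the bit rate and the quality.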

Figures 8 [17] and 9 [21] show detailed block diagrams of the HEVC encoder and decoder, respectively:

Fig 8: Block Diagram of HEVC Encoder [17]

Fig 9: Block diagram of HEVC Decoder [21]


6.3 Features of HEVC:

6.3.1 Partitioning [19]:

HEVC supports highly flexible partitioning of a video sequence. Each frame of the sequence is split up into rectangular or square regions (units or blocks) [19], each of which is predicted from previously coded data. After prediction, any residual information is transformed and entropy encoded. Each coded video frame, or picture, is partitioned into tiles and/or slices, which are further partitioned into coding tree units (CTUs). The CTU is the basic unit of coding, analogous to the macro-block in earlier standards, and can be up to 64x64 pixels in size. A coding tree unit can be subdivided into square regions known as coding units (CUs) using a quad-tree structure. Each CU is predicted using inter or intra prediction and transformed using one or more transform units.
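The quad-tree subdivision of a CTU into CUs can be sketched as a short recursive function. The split decision is passed in as a hypothetical callback `should_split`; a real encoder would typically base it on rate-distortion cost, which is not modeled here. The default minimum CU size of 8 is an assumption of this sketch.

```python
def split_ctu(x, y, size, should_split, min_size=8):
    """Recursively split a square region into four quadrants.

    Returns the leaf coding units as (x, y, size) tuples.
    should_split(x, y, size) is a caller-supplied, hypothetical decision rule.
    """
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        for off_y in (0, half):
            for off_x in (0, half):
                leaves.extend(
                    split_ctu(x + off_x, y + off_y, half, should_split, min_size)
                )
        return leaves
    return [(x, y, size)]
```

For example, `split_ctu(0, 0, 64, lambda x, y, s: x == 0 and y == 0)` keeps splitting only the top-left quadrant, giving three 32x32 CUs, three 16x16 CUs and four 8x8 CUs that together still cover the whole 64x64 CTU.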

Fig 10: Picture, Slice, Coding Tree Unit (CTU), Coding Unit (CU) [19]

6.3.2 Prediction [1]:

Frames of video are coded using intra or inter prediction:

Intra prediction: Each PU is predicted from neighboring image data in the same picture, using DC prediction (an average value for the PU), planar prediction (fitting a plane surface to the PU) or directional prediction (extrapolating from neighboring data).

Inter prediction: Each PU is predicted from image data in one or two reference pictures (before or after the current picture in display order), using motion compensated prediction.
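As a minimal illustration of intra prediction, the sketch below implements DC prediction and the two simplest directional cases (pure vertical and pure horizontal) for a square block. It assumes the reconstructed neighboring samples above and to the left are already available, and it omits the boundary smoothing and planar mode that HEVC also defines. Function names are hypothetical.

```python
def dc_predict(above, left):
    """Fill an NxN block with the rounded mean of the neighboring samples."""
    n = len(above)
    dc = (sum(above) + sum(left) + n) // (2 * n)  # integer mean with rounding
    return [[dc] * n for _ in range(n)]


def vertical_predict(above):
    """Each column repeats the sample directly above the block."""
    return [list(above) for _ in range(len(above))]


def horizontal_predict(left):
    """Each row repeats the sample directly to the left of the block."""
    return [[sample] * len(left) for sample in left]
```

The encoder would form the residual by subtracting the chosen prediction from the original block and signal the selected mode in the bit stream.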


Fig 11: Modes and angular intra prediction directions in HEVC [1]

6.3.3 Transform and Quantization [19]: Any residual data remaining after prediction is transformed using a block transform based on the Discrete Cosine Transform (DCT) [13] or Discrete Sine Transform (DST). One or more block transforms of size 32x32, 16x16, 8x8 and 4x4 are applied to residual data in each CU. The transformed data is then quantized.

Fig 12: CTU showing range of transform (TU) sizes [19]

6.3.4 Entropy coding:

A coded HEVC bit stream consists of quantized transform coefficients, prediction information such as prediction modes and motion vectors, partitioning information and other header data. All of these elements are encoded using Context Adaptive Binary Arithmetic Coding (CABAC) [14] similar to H.264/AVC.


7. VP9 [3] [4] :

7.1 Introduction: VP9 is an open and royalty-free video compression standard being developed by Google. Earlier development names for VP9 were Next Generation Open Video (NGOV) and VP-Next. VP9 is the successor to VP8, and its development started in Q3 2011. One of the goals of VP9 is to reduce the bit rate by 50% compared to VP8 at the same video quality [22]. VP9 also aims to achieve better compression efficiency than High Efficiency Video Coding. VP9 extends techniques used in H.264/AVC and VP8 and is very likely to replace AVC, at least in the YouTube video service [23].

7.2 Encoder and Decoder in VP9 [24]:

A large part of the advances made by VP9 over its predecessors is natural progression from current generation video codecs to the next. Figures 13 and 14 represent block diagrams of encoder and decoder of VP9 respectively.

Fig 13: Encoder block diagram for VP9 [24]


Fig 14: Decoder block diagram for VP9 [24]

7.3 Features of VP9 :

7.3.1 Prediction Block Sizes: A large part of the coding efficiency improvements achieved in VP9 can be attributed to the incorporation of larger prediction block sizes [4] [23]. VP9 introduces super-blocks (SB) of size up to 64x64 and allows breakdown using recursive decomposition all the way down to 4x4.

Each sub-block may be further split into prediction blocks and transform blocks. Intra-prediction in VP9 is still performed on square regions thus rectangular prediction blocks represent two square prediction blocks with the same prediction mode.

Giving an analogy to HEVC [1], prediction splitting 2Nx2N, NxN, 2NxN or Nx2N is available where 2Nx2N is the size of the block being split. It is worth mentioning that 4x4 prediction blocks are determined within corresponding 8x8 blocks as a group, unlike other prediction sizes when prediction data is stored per each prediction block. Like in HEVC, a sub-block can be split into transform blocks in a quad-tree structure down to the smallest 4x4 block. The allowed sizes are 32x32, 32x16, 16x16, 8x16, 8x8 and 4x4.


Fig 15: Example partitioning of a 64x64 Super-block [4] [23]

7.3.2 Prediction Modes

7.3.2.1 Intra-prediction Modes [4]: VP9 supports a set of 10 intra-prediction modes for block sizes ranging from 4x4 up to 32x32: DC_PRED (DC prediction), TM_PRED (True-motion prediction), H_PRED (Horizontal prediction), V_PRED (Vertical prediction), and 6 oblique directional prediction modes: D27, D153, D135, D117, D63, D45, corresponding approximately to angles of 27, 153, 135, 117, 63 and 45 degrees (measured counter-clockwise against the horizontal axis). The horizontal, vertical and oblique directional prediction modes involve copying (or estimating) pixel values from surrounding blocks into the current block along the angle specified by the prediction mode. Figure 16 shows the angular intra-prediction modes in VP9.
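Of the modes listed, TM_PRED is the least self-explanatory, so a small sketch may help. It assumes the usual definition of true-motion prediction: each predicted pixel is the left neighbor plus the above neighbor minus the above-left corner pixel, clipped to the valid sample range. The function name and the 8-bit default are assumptions of this sketch.

```python
def tm_predict(above, left, above_left, bit_depth=8):
    """True-motion prediction for a block with len(left) rows, len(above) columns.

    pred[r][c] = clip(left[r] + above[c] - above_left)
    """
    max_value = (1 << bit_depth) - 1
    return [
        [min(max(l + a - above_left, 0), max_value) for a in above]
        for l in left
    ]
```

For a block whose neighbors form a smooth gradient, this propagates the horizontal change seen in the row above and the vertical change seen in the column to the left into the interior of the block.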

Fig 16: VP9 angular intra-prediction modes [4]


7.3.2.2 Inter Prediction Modes [4]:

VP9 supports a set of 4 inter prediction modes for block sizes ranging from 4x4 up to 64x64 pixels: NEARESTMV, NEARMV, ZEROMV, and NEWMV.

7.3.3 Transform and quantization [4]: The residuals after subtraction of predicted pixel values are subjected to transformation and quantization. Transform blocks can be 32x32, 16x16, 8x8 or 4x4 pixels. As in most other coding standards, these transforms are an integer approximation of the DCT [13].

For intra-coded blocks, either or both of the vertical and horizontal transform passes can use a DST (discrete sine transform) instead, which better matches the specific characteristics of the residual signal of intra blocks. In addition, VP9 introduces support for a new transform type, the Asymmetric Discrete Sine Transform (ADST), which can be used in combination with specific intra-prediction modes. Intra-prediction modes that predict from a left edge can use the 1-D ADST in the horizontal direction, combined with a 1-D DCT in the vertical direction.

Similarly, the residual signal resulting from intra-prediction modes that predict from the top edge can employ a vertical 1-D ADST transform combined with a horizontal 1-D DCT transform. Intra-prediction modes that predict from both edges such as the True Motion mode and some diagonal intra-prediction modes use the 1-D ADST in both horizontal and vertical directions.

7.3.4 Entropy coding:

VP9 uses the 8-bit arithmetic coding engine from VP8, known as the bool-coder [4]. Unlike in AVC or HEVC, the probabilities of the VP9 bool-coder do not change adaptively within a frame. Instead, VP9 makes use of forward context updates: flags in the frame header signal modifications of the coding contexts at the start of each frame. These probabilities are stored in what is known as a frame context. The decoder maintains four of these contexts, and each frame specifies in the bitstream which one to use.

8. Comparison Metrics:

8.1 Peak Signal to Noise Ratio [25]: Peak signal-to-noise ratio (PSNR) [25] is an expression for the ratio between the maximum possible value (power) of a signal and the power of distorting noise that affects the quality of its representation. Because many signals have a very wide dynamic range (ratio between the largest and smallest possible values of a changeable quantity), the PSNR is usually expressed in terms of the logarithmic decibel scale. PSNR is most commonly used to measure the quality of reconstruction of lossy compression codecs. The signal in this case is the original data, and the noise is the error introduced by compression. When comparing compression codecs, PSNR is an approximation to human perception of reconstruction quality. Although a higher PSNR generally indicates that the reconstruction is of higher quality, in some cases it may not. One has to be extremely careful with the range of validity of this metric; it is only conclusively valid when it is used to compare results from the same codec (or codec type) and same content.

PSNR is defined via the mean squared error (MSE). Given a noise-free m x n monochrome image f and its noisy approximation g, MSE is defined as:

$$\mathrm{MSE} = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \left[ f(i,j) - g(i,j) \right]^2$$

where

f represents the matrix data of the original image

g represents the matrix data of the degraded image

m represents the number of rows of pixels of the images and i represents the index of that row

n represents the number of columns of pixels of the images and j represents the index of that column

The PSNR is defined as:

$$\mathrm{PSNR} = 20 \log_{10} \left( \frac{MAX_f}{\sqrt{\mathrm{MSE}}} \right) \ \mathrm{dB}$$

where MAX_f is the maximum signal value that exists in the original image.
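A direct translation of the MSE and PSNR definitions into Python is sketched below for two equally sized monochrome images stored as lists of rows. The default peak value of 255 assumes 8-bit samples, which is a common convention rather than something fixed by the definition; the function names are hypothetical.

```python
import math


def mse(f, g):
    """Mean squared error between original f and degraded g (m x n lists)."""
    m, n = len(f), len(f[0])
    return sum(
        (f[i][j] - g[i][j]) ** 2 for i in range(m) for j in range(n)
    ) / (m * n)


def psnr(f, g, max_value=255):
    """PSNR in dB; max_value=255 assumes 8-bit samples."""
    error = mse(f, g)
    if error == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 20 * math.log10(max_value / math.sqrt(error))
```

For video, the value is usually computed per frame (and often per color component) and then averaged over the sequence.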

8.2 Structural Similarity Index [5] [20] [31]: The structural similarity (SSIM) index is a method for measuring the similarity between two images. The SSIM index is a full-reference metric; in other words, image quality is measured against an initial uncompressed or distortion-free image used as reference. SSIM is designed to improve on traditional methods like peak signal-to-noise ratio (PSNR) and mean squared error (MSE), which have proven to be inconsistent with human visual perception. The difference with respect to MSE or PSNR is that those approaches estimate absolute errors, whereas SSIM treats image degradation as a perceived change in structural information. Structural information is the idea that pixels have strong inter-dependencies, especially when they are spatially close. These dependencies carry important information about the structure of the objects in the visual scene. The SSIM metric is calculated on various windows of an image. The measure between two windows x and y of common size N×N is:

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)}$$

where

$\mu_x$ and $\mu_y$ are the averages of x and y

$\sigma_x^2$ and $\sigma_y^2$ are the variances of x and y

$\sigma_{xy}$ is the covariance of x and y

$c_1 = (k_1 L)^2$ and $c_2 = (k_2 L)^2$ are two variables that stabilize the division when the denominator is weak

L is the dynamic range of the pixel values, with $k_1 = 0.01$ and $k_2 = 0.03$ by default
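A single-window SSIM computation can be sketched as follows, assuming the standard definition from [5] with stabilizing constants c1 = (k1·L)² and c2 = (k2·L)², where L is the dynamic range. The sketch uses plain (unweighted) population statistics over the window, whereas practical implementations often use Gaussian weighting and average the result over many windows. The function name is hypothetical.

```python
def ssim_window(x, y, dynamic_range=255, k1=0.01, k2=0.03):
    """SSIM between two equally sized windows given as lists of rows."""
    xs = [v for row in x for v in row]
    ys = [v for row in y for v in row]
    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    # Population variance and covariance (an assumption of this sketch).
    var_x = sum((a - mu_x) ** 2 for a in xs) / n
    var_y = sum((b - mu_y) ** 2 for b in ys) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(xs, ys)) / n
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

Identical windows give a value of 1, and the value drops as the structure of the two windows diverges; it can become negative when the windows are anti-correlated.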


8.3 Bjontegaard Delta Bitrate (BD-BR) and Bjontegaard Delta PSNR (BD-PSNR) [6]: To objectively evaluate the coding efficiency of video codecs, Bjontegaard Delta PSNR (BD-PSNR) was proposed. Based on rate-distortion (R-D) curve fitting, BD-PSNR provides a good evaluation of R-D performance. BD metrics allow computing the average gain in PSNR or the average percent saving in bitrate between two rate-distortion curves. However, BD-PSNR has a critical drawback: it does not take the coding complexity into account.
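The BD metrics can be sketched with NumPy as below. The sketch assumes the commonly used procedure: fit a third-order polynomial to each R-D curve in the log-rate domain, integrate both fits over the overlapping interval, and average the difference. Function names are hypothetical, and each curve is expected to have at least four (rate, PSNR) points.

```python
import numpy as np


def _avg_diff(x_ref, y_ref, x_test, y_test):
    """Average of (test fit - reference fit) over the overlapping x range."""
    p_ref = np.polyfit(x_ref, y_ref, 3)
    p_test = np.polyfit(x_test, y_test, 3)
    lo = max(min(x_ref), min(x_test))
    hi = min(max(x_ref), max(x_test))
    int_ref = np.polyval(np.polyint(p_ref), hi) - np.polyval(np.polyint(p_ref), lo)
    int_test = np.polyval(np.polyint(p_test), hi) - np.polyval(np.polyint(p_test), lo)
    return (int_test - int_ref) / (hi - lo)


def bd_rate(rate_ref, psnr_ref, rate_test, psnr_test):
    """Average bitrate difference in percent at equal PSNR (negative = saving)."""
    diff = _avg_diff(psnr_ref, np.log(rate_ref), psnr_test, np.log(rate_test))
    return float((np.exp(diff) - 1.0) * 100.0)


def bd_psnr(rate_ref, psnr_ref, rate_test, psnr_test):
    """Average PSNR difference in dB at equal bitrate (positive = gain)."""
    return float(_avg_diff(np.log(rate_ref), psnr_ref, np.log(rate_test), psnr_test))
```

As a sanity check, a test codec that reaches the same PSNR values at exactly half the bitrate of the reference yields a BD-rate of about -50%.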

8.4 Implementation Complexity: The computational time of the HEVC, AVC and VP9 encoders will be compared, and this serves as an indication of implementation complexity.
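One simple way to collect encoding times is to wrap each encoder run in a wall-clock timer, as in the sketch below. The helper is hypothetical and deliberately generic: the command line for each encoder (HM, JM or the VPX encoder) must be supplied by the user, and no encoder options are assumed here.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd (a list of strings) and return (elapsed_seconds, return_code)."""
    start = time.perf_counter()
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return time.perf_counter() - start, result.returncode


if __name__ == "__main__":
    # Placeholder command; replace with the actual encoder invocation.
    elapsed, code = time_command([sys.executable, "-c", "pass"])
    print(f"exit code {code}, {elapsed:.3f} s")
```

Wall-clock time depends on the machine and its load, so all encoders should be timed on the same hardware, ideally over several runs.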

9. Profiles used for comparison:

The software used for comparison in this project is HM 16.0 [26] [33] for HEVC, JM 18.6 [27] [32] for H.264, and the VPX encoder from The WebM Project [28] for VP9.

10. Test Sequences [29]:

The following test sequences will be used for study and comparison of the codecs.

21 | P a g e

Page 22: PERFORMANCE COMPARISON OF HEVC, H.264 … · Web viewThe difference with respect to other techniques such as MSE or PSNR is that these approaches estimate perceived errors; on the

Fig.17 akiyo_qcif.yuv [38]

22 | P a g e

Page 23: PERFORMANCE COMPARISON OF HEVC, H.264 … · Web viewThe difference with respect to other techniques such as MSE or PSNR is that these approaches estimate perceived errors; on the

Fig.18 waterfall_cif.yuv [38]

23 | P a g e

Page 24: PERFORMANCE COMPARISON OF HEVC, H.264 … · Web viewThe difference with respect to other techniques such as MSE or PSNR is that these approaches estimate perceived errors; on the

Fig.19 BasketballDrill_832x480.yuv [39]

24 | P a g e

Page 25: PERFORMANCE COMPARISON OF HEVC, H.264 … · Web viewThe difference with respect to other techniques such as MSE or PSNR is that these approaches estimate perceived errors; on the

Fig.20 Jockey_1920x1080.yuv [29]


Fig.21 PeopleOnStreet_2560_1600_30_crop.yuv [29]


11. References:

[1] G. J. Sullivan et al, “Overview of the high efficiency video coding (HEVC) standard”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1649-1668, Dec. 2012.

[2] JVT Draft ITU-T recommendation and final draft international standard of joint video specification (ITU-T Rec. H.264-ISO/IEC 14496-10 AVC), March 2003, JVT-G050 available on http://ip.hhi.de/imagecom_G1/assets/pdfs/JVT-G050.pdf

[3] D. Grois et al, “Performance Comparison of H.265/MPEG-HEVC, VP9, and H.264/MPEG-AVC Encoders”, IEEE PCS 2013, pp. 394-397, San José, CA, USA, Dec. 8-11, 2013

[4] D. Mukherjee et al, “The latest open-source video codec VP9–An overview and preliminary results”, Google Inc., United States

[5] Z. Wang et al, “Image quality assessment: From error visibility to structural similarity,” IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600-612, Apr. 2004

[6] G. Bjøntegaard, “Calculation of average PSNR differences between RD-curves”, ITU-T Q.6/SG16 VCEG 13th Meeting, Document VCEG-M33, Austin, USA, Apr. 2001

[7] X. Li et al, “Rate-complexity-distortion evaluation for hybrid video coding”, IEEE International Conference on Multimedia and Expo (ICME), pp. 685-690, July 2010

[8] N. Ling, “High efficiency video coding and its 3D extension: A research perspective,” Keynote Speech, ICIEA, pp. 2150-2155, Singapore, July 2012 - http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=6361087

[9] V. Sze, M. Budagavi, "Design and Implementation of Next Generation Video Coding Systems (H.265/HEVC Tutorial)", IEEE ISCAS Tutorial 2014, Melbourne, Australia, June 2014 - http://www.rle.mit.edu/eems/wp-content/uploads/2014/06/H.265-HEVC-Tutorial-2014-ISCAS.pdf

[10] I. E. G. Richardson, “Video Codec Design: Developing Image and Video Compression Systems”, Wiley, 2002

[11] A. Puri et al, “Video coding using the H.264/MPEG-4 AVC compression standard”, Signal Processing: Image Communication, vol. 19, pp. 793-849, Oct. 2004.

[12] H.264 tutorial by I.E.G. Richardson: https://www.vcodex.com/h264.html

[13] N. Ahmed, T. Natarajan and K. R. Rao, “Discrete Cosine Transform”, IEEE Transactions on Computers, vol. C-23, pp. 90-93, Jan. 1974.


[14] D. Marpe, H. Schwarz, and T. Wiegand, “Context-based adaptive binary arithmetic coding in the H.264/AVC video compression standard,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, pp. 620–636, Jul. 2003.

[15] J. Ostermann et al, "Video coding with H.264/AVC tools, performance, and complexity", IEEE Circuits and Systems Magazine, vol. 4, pp. 7-28, Aug. 2004.

[16] J. Ohm et al, "Comparison of the Coding Efficiency of Video Coding Standards - including High Efficiency Video Coding (HEVC)", IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1669-1684, Dec. 2012.

[17] G. Sullivan et al, “Standardized Extensions of High Efficiency Video Coding (HEVC)”, IEEE Journal of Selected Topics in Signal Processing, vol. 7, no. 6, pp. 1001-1016, Dec. 2013.

[18] HEVC white paper - http://www.ateme.com/an-introduction-to-uhdtv-and-hevc

[19] HEVC tutorial by I.E.G. Richardson: http://www.vcodex.com/h265.html

[20] W. Malpica and A. Bovik, "Range image quality assessment by structural similarity", IEEE ICASSP 2009, 19-24 Apr. 2009.

[21] C. Fogg, “Suggested figures for the HEVC specification”, ITU-T / ISO-IEC Document: JCTVC J0292r1, July 2012.

[22] "VP-Next Overview and Progress Update" (PDF). WebM Project (Google). Retrieved 2012-12-29. Available on: http://downloads.webmproject.org/ngov2012/pdf/04-ngov-project-update.pdf

[23] M. P. Sharabayko et al, "Intra Compression Efficiency in VP9 and HEVC" Applied Mathematical Sciences, Vol. 7, no. 137, pp.6803 – 6824, Hikari Ltd, 2013

[24] J. Padia, “Complexity reduction for VP6 to H.264 transcoder using motion vector reuse,” M.S. Thesis, EE Dept., UTA, Arlington, TX, 2010. Available on: http://www.uta.edu/faculty/krrao/dip/Courses/EE5359/index_tem.html

[25] White paper on PSNR-NI - http://www.ni.com/white-paper/13306/en/

[26] Access to HM Reference Software: http://hevc.hhi.fraunhofer.de/

[27] Access to JM 18.6 Reference Software: http://iphome.hhi.de/suehring/tml/

[28] Chromium® open-source browser project, VP9 source code, Online: http://git.chromium.org/gitweb/?p=webm/libvpx.git;a=tree;f=vp9;hb=aaf61dfbcab414bfacc3171501be17d191ff8506

[29] http://ultravideo.cs.tut.fi/#testsequences - Video test sequences


[30] Cisco Visual Networking Index - http://www.cisco.com/c/en/us/solutions/service-provider/visual-networking-index-vni/index.html

[31] J. Wang et al, "Fractal image coding using SSIM", IEEE 18th International Conference on Image Processing, Brussels, pp. 241-244, 11-14 Sept. 2011.

[32] H.264/AVC Software Reference Manual: http://iphome.hhi.de/suehring/tml/JM%20Reference%20Software%20Manual%20(JVT-AE010).pdf

[33] HEVC Software Reference Manual: https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/branches/HM-9.2-dev/doc/software-manual.pdf

[34] K. R. Rao, D. N. Kim and J. J. Hwang, “Video Coding Standards: AVS China, H.264/MPEG-4 Part 10, HEVC, VP6, DIRAC and VC-1”, Springer, 2014.

[35] V. Sze, M. Budagavi, and G. J. Sullivan, "High Efficiency Video Coding (HEVC): Algorithms and Architectures", Springer, 2014.

[36] M. Wien, "High Efficiency Video Coding: Coding Tools and Specification", Springer, 2014.

[37] I. E. Richardson, "Coding Video: A practical guide to HEVC and beyond", Wiley, 11 May 2015.

[38] https://media.xiph.org/video/derf/ - test sequences

[39] ftp://ftp.kw.bbc.co.uk/hevc/hm-11.0-anchors/bitstreams/ - test sequences

[40] G. Correa et al, "Fast HEVC Encoding Decisions Using Data Mining", IEEE Transactions on Circuits and Systems for Video Technology, vol. 25, no. 4, pp. 660-673, April 2015.

[41] D. K. Kwon and M. Budagavi, "Combined scalable and multiview extension of High Efficiency Video Coding (HEVC)", IEEE Picture Coding Symposium, pp. 414-417, Dec. 2013.
