HEVC CABAC
Group Project Report
DR. K. R. RAO
COURSE: EE5359 – Topics in Signal Processing
SPRING 2016
Submitted by
Mohammed Mahmood Quraishi (1001151028), Satya Avasarala (1001090898), Sai Kumar Pola (1001110666)
Table of Contents
1. Objective
2. Motivation - Video Compression and Its Need
3. Evolution of Video Coding Standards
3.1 AVC/H.264
3.1.1 AVC/H.264 Encoder
3.2 Features
3.3 Applications
3.4 Disadvantages of AVC/H.264
4. HEVC
4.1 Versions
4.1.1 Version 1
4.1.2 Version 2
4.1.3 Version 3
4.2.1 HEVC Encoder and Decoder
4.2.2 Decoder and Steps
4.3 Features
4.3.1 Image Partitioning
4.3.2 A Picture Representing CT, CU, PU
4.3.3 Coding Efficiency
4.3.4 Coding Tools
4.3.4.1 Coding Tree
4.3.4.2 Parallel Handling Tools
4.3.4.3 Wavefront Parallel Processing (WPP)
4.3.5 Other Coding Tools
4.3.5.1 Entropy Coding
4.3.5.2 Intra Prediction
4.3.6 Motion Vector Prediction
4.3.7 Inverse Transform
4.3.8 De-blocking Filter
4.3.9 Sample Adaptive Offset
4.4 CABAC and Its Overview
4.4.1 CABAC Algorithm
4.5 HEVC CABAC and Its Main Steps
4.5.1 Binarization
4.5.2 Arithmetic Coding
4.5.3 Context Selection
4.6 CABAC Encoder
4.7 The Arithmetic Decoding Procedure
4.8 Throughput Improvement CABAC
4.8.1 Reduce Total Number of Bins
4.8.2 Reduce Total Number of Context Coded Bins
4.8.3 Grouping of Bypass Bins
4.9 Conclusion
4.10 References
ACRONYMS
ATSC: Advanced Television Systems Committee
AVC: Advanced Video Coding
AVCHD: Advanced Video Coding High Definition
CABAC: Context Adaptive Binary Arithmetic Coding
CAVLC: Context Adaptive Variable Length Coding
CCTV: Closed Circuit Television
CPU: Central Processing Unit
CTB: Coding Tree Block
CTU: Coding Tree Unit
CU: Coding Unit
DCT: Discrete Cosine Transform
DVB: Digital Video Broadcasting
FSM: Finite State Machine
HDTV: High Definition Television
HEVC: High Efficiency Video Coding
IDR: Instantaneous Decoding Refresh
IEC: International Electrotechnical Commission
ISDB-T: Integrated Services Digital Broadcasting - Terrestrial
ISO: International Organization for Standardization
ITU-T: International Telecommunication Union - Telecommunication Standardization Sector
JCT: Joint Collaborative Team
JCT-VC: Joint Collaborative Team on Video Coding
JM: H.264 Test Model
JPEG: Joint Photographic Experts Group
MC: Motion Compensation
ME: Motion Estimation
MPEG: Moving Picture Experts Group
MSE: Mean Square Error
PB: Prediction Block
PSNR: Peak Signal to Noise Ratio
PU: Prediction Unit
QP: Quantization Parameter
SAO: Sample Adaptive Offset
SSIM: Structural Similarity Index
TB: Transform Block
TU: Transform Unit
VCEG: Video Coding Experts Group
1.OBJECTIVE:
The goal of this project is to give insight into the CABAC (Context Adaptive Binary Arithmetic
Coding) entropy coding method as used in the recently developed HEVC standard, which was
finalized in 2013 by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving
Picture Experts Group (MPEG) standardization organizations. HEVC (H.265) is the most recent
video coding standard and was designed to improve on its predecessor, H.264 (MPEG-4 AVC).
The primary objective of the HEVC standardization is to enable significantly improved
compression performance relative to the existing AVC standard. At comparable video quality,
HEVC bitstreams require a substantially lower bit rate than H.264 bitstreams. HEVC uses
variable block-size coding and therefore uses integer transforms of different sizes. HEVC
supports resolutions up to Ultra High Definition (UHD), including 4K video.
2.VIDEO COMPRESSION AND ITS MOTIVATION:
We need to compress video in practice since:
1. Uncompressed video data are huge. In HDTV, the bit rate easily exceeds 1 Gbps, which causes
big problems for storage and network communications [3]. For example, one of the formats
defined for HDTV broadcasting within the United States is 1920 pixels horizontally by 1080
lines vertically, at 30 frames per second. If these numbers are all multiplied together, along
with 8 bits for each of the three primary colors, the total data rate required is approximately
1.5 Gb/sec. Because of the 6 MHz channel bandwidth allocated, each channel supports a data rate
of only 19.2 Mb/sec, which is further reduced to 18 Mb/sec by the fact that the channel must
also support audio, transport, and ancillary data. This restriction in data rate means that the
original signal must be compressed by a factor of approximately 83:1 [2]. This number seems all
the more impressive when it is realized that the intent is to deliver very high quality video
to the end user, with as few visible artifacts as possible.
2. Lossy methods have to be employed, since the compression ratio of lossless methods (e.g.,
Huffman, arithmetic, LZW) is not high enough for image and video compression.
The following techniques are commonly used in video compression:
Spatial redundancy removal – intra-frame coding (JPEG).
Spatial and temporal redundancy removal – intra-frame and inter-frame coding (H.261, MPEG).
Block motion estimation.
Entropy coding.
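The data-rate arithmetic in item 1 can be checked with a short sketch. The function names are ours and purely illustrative; the figures (1920×1080, 30 frames per second, 8 bits for each of three color components, 18 Mb/sec of usable channel capacity) are the ones quoted in the text.

```python
def raw_bit_rate(width, height, fps, bits_per_sample=8, components=3):
    """Uncompressed bit rate in bits per second."""
    return width * height * fps * bits_per_sample * components


def required_compression(raw_rate, channel_rate):
    """Factor by which the raw stream must shrink to fit the channel."""
    return raw_rate / channel_rate


hdtv_rate = raw_bit_rate(1920, 1080, 30)        # 1,492,992,000 bit/s
ratio = required_compression(hdtv_rate, 18e6)   # about 82.9, i.e. roughly 83:1
print(f"{hdtv_rate / 1e9:.2f} Gb/s, ratio {ratio:.1f}:1")
```

The result agrees with the approximate 1.5 Gb/sec and 83:1 figures given above.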
3.EVOLUTION OF DIFFERENT VIDEO CODING STANDARDS:
Figure # 1 Evolution of video coding standards [1]
3.1 AVC/H.264
H.264, also known as Advanced Video Coding (MPEG-4 AVC), is a video coding format developed
by the ITU-T Video Coding Experts Group (VCEG) together with the ISO/IEC JTC1 Moving Picture
Experts Group (MPEG) [4]. The H.264 standard can be viewed as a "group of standards" composed
of a number of profiles. The standard was developed jointly in a partnership of VCEG and MPEG,
after earlier development work in the ITU-T as a VCEG project called H.26L. It is thus common
to refer to the standard with names such as H.264/AVC, AVC/H.264, H.264/MPEG-4 AVC, or
MPEG-4/H.264 AVC. It is also referred to as "the JVT codec", in reference to the Joint Video
Team (JVT) organization that developed it [2]. Such dual naming is not new: the video
compression standard known as MPEG-2 also arose from a partnership between MPEG and the
ITU-T, and MPEG-2 video is known to the ITU-T community as H.262.
H.264 aims to provide good video quality at considerably lower bit rates, at a reasonable
level of complexity, while providing flexibility for a wide range of applications [1]. The
MPEG-2 video coding standard (also known as ITU-T H.262) [2], which was developed about ten
years before H.264, primarily as an extension of the prior MPEG-1 video capability with
support for interlaced video coding, was an enabling technology for digital television systems
worldwide. It is widely used for the transmission of standard definition (SD) and high
definition (HD) TV signals over satellite, cable, and terrestrial emission and for the storage
of high-quality SD video signals on DVDs.
H.264 is perhaps best known as being one of the video encoding standards for Blu-ray Discs; all
Blu-ray Disc players must be able to decode H.264. It is also widely used by streaming internet
sources, such as videos from Vimeo, YouTube, and the iTunes Store, web software such as the
Adobe Flash Player and Microsoft Silverlight, and also various HDTV broadcasts over terrestrial
(Advanced Television Systems Committee standards, ISDB-T, DVB-T or DVB-T2), cable
(DVB-C), and satellite (DVB-S and DVB-S2).
3.1.1 AVC /H.264 ENCODER [6]:
Figure # 2 AVC/H.264 ENCODER [2]
3.2 Features
H.264/AVC/MPEG-4 contains a number of new features that allow it to compress video much more
efficiently than older standards and to provide more flexibility for application to a wide
variety of network environments. In particular, some such key features include:
Multi-picture inter-picture prediction, including the following features:
Using previously encoded pictures as references in a more flexible way than in previous
standards, allowing up to 16 reference frames (or 32 reference fields, in the case of
interlaced encoding) to be used. In profiles that support non-IDR frames, most levels specify
that sufficient buffering should be available to allow for at least 4 or 5 reference frames at
maximum resolution. This is in contrast to prior standards, where the limit was typically one
or, in the case of conventional "B pictures" (B-frames), two. This feature usually allows
modest improvements in bit rate and quality in most scenes, but in certain types of sequences,
such as those with repetitive motion, back-and-forth scene cuts, or uncovered background
areas, it allows a significant reduction in bit rate while maintaining clarity.
Variable block-size motion compensation (VBSMC) with block sizes as large as 16×16 and as
small as 4×4, enabling precise segmentation of moving regions [6]. The supported luma
prediction block sizes include 16×16, 16×8, 8×16, 8×8, 8×4, 4×8, and 4×4, many of which can be
used together in a single macroblock. Chroma prediction block sizes are correspondingly
smaller according to the chroma sub-sampling in use.
The ability to use multiple motion vectors per macroblock (one or two per partition), with a
maximum of 32 in the case of a B macroblock constructed of sixteen 4×4 partitions. The motion
vectors for each 8×8 or larger partition region can point to different reference pictures.
Six-tap filtering for derivation of half-pel luma sample predictions, for sharper sub-pixel
motion compensation. Quarter-pixel motion is derived by linear interpolation of the half-pel
values, to save processing power. [6]
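As an illustration of the sub-pixel interpolation described above, the following is a minimal one-dimensional sketch. It assumes the six filter taps (1, -5, 20, 20, -5, 1)/32 commonly given for H.264 luma half-sample positions and 8-bit samples; the function names are ours, and a real codec applies the filter in two dimensions with picture-boundary handling that is omitted here.

```python
def clip8(value):
    """Clamp a value to the 8-bit sample range."""
    return max(0, min(255, value))


def half_pel(samples, i):
    """Half-sample value between samples[i] and samples[i + 1].

    Needs the six integer samples samples[i - 2] .. samples[i + 3].
    """
    e, f, g, h, i_, j = samples[i - 2:i + 4]
    total = e - 5 * f + 20 * g + 20 * h - 5 * i_ + j
    return clip8((total + 16) >> 5)   # divide by 32 with rounding


def quarter_pel(a, b):
    """Quarter-sample value: rounded average of two neighbouring values."""
    return (a + b + 1) >> 1


row = [10, 10, 10, 50, 50, 50]        # a sharp edge between index 2 and 3
b = half_pel(row, 2)                  # half-sample position on the edge
q = quarter_pel(row[2], b)            # quarter-sample position next to it
print(b, q)
```

On a flat row the filter returns the input value unchanged, since the taps sum to 32.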
Spatial prediction from the edges of neighboring blocks for "intra" coding, rather than the
"DC"-only prediction found in MPEG-2 Part 2 and the transform coefficient prediction found in
H.263 version 2 and MPEG-4 Part 2 [3]. This includes luma prediction block sizes of 16×16,
8×8, and 4×4 (of which only one type can be used within each macroblock).
An in-loop deblocking filter that helps prevent the blocking artifacts common to other
DCT-based image compression techniques, resulting in better visual appearance and compression
efficiency.
For entropy coding, H.264 supports both CABAC and CAVLC. Syntax elements that are not coded
with CABAC or CAVLC are coded with Exp-Golomb codes.
3.3 Applications
The H.264 video format has a very broad application range that covers all forms of digital
compressed video from low bit-rate Internet streaming applications to HDTV broadcast and
Digital Cinema applications with nearly lossless coding. With the use of H.264, bit rate savings
of 50% or more compared to MPEG-2 Part 2 are reported. For example, H.264 has been reported
to give the same Digital Satellite TV quality as current MPEG-2 implementations with less than
half the bitrate, with current MPEG-2 implementations working at around 3.5 Mbit/s and H.264
at only 1.5 Mbit/s [5]. AVCHD is a high-definition recording format designed by Sony and
Panasonic that uses H.264. AVC-Intra is an intra-frame-only compression format developed by
Panasonic.
To ensure compatibility and problem-free adoption of H.264/AVC, many standards bodies have
amended or added to their video-related standards so that users of these standards can employ
H.264/AVC. Both the Blu-ray Disc format and the now-discontinued HD DVD format include the
H.264/AVC High Profile as one of 3 video compression formats [4]. The Digital Video Broadcast
project (DVB) approved the use of H.264/AVC for broadcast television in late 2004.
The Advanced Television Systems Committee (ATSC) in the United States approved the use of
H.264/AVC for broadcast television in July 2008. It has also been approved for use with the
more recent ATSC-M/H (Mobile/Handheld) standard, using the AVC and SVC portions of
H.264.
XAVC is a recording format designed by Sony that uses level 5.2 of H.264/MPEG-4 AVC, which is
the highest level supported by that video standard. XAVC can support 4K resolution
(4096 × 2160 and 3840 × 2160) at up to 60 frames per second (fps). Sony has announced that
cameras that support XAVC include two CineAlta cameras, the Sony PMW-F55 and the Sony PMW-F5.
The Sony PMW-F55 can record XAVC with 4K resolution at 30 fps at 300 Mbit/s and 2K resolution
at 30 fps at 100 Mbit/s. XAVC can record 4K resolution at 60 fps with 4:2:2 chroma subsampling
at 600 Mbit/s.
The CCTV (Closed Circuit TV) and video surveillance markets have included the technology in
many products.
3.4 DISADVANTAGES OF AVC/H.264:
1. At the same bit rate, HEVC delivers considerably better video quality than H.264 [6].
2. H.264 encoding and decoding are more computationally complex than some other codecs such
as MPEG-4 Part 2.
4 HEVC
HEVC was developed by the JCT-VC organization, a collaboration between the ISO/IEC MPEG
and ITU-T VCEG. The ISO/IEC group refers to it as MPEG-H Part 2 and the ITU-T as
H.265[1].
In most ways, HEVC is an extension of the concepts in H.264/MPEG-4 AVC. Both work by
comparing different parts of a frame of video to find areas that are redundant, both within a
single frame and between consecutive frames [3]. These redundant areas are then replaced with
a short description instead of the original pixels. The primary changes for HEVC include the
expansion of the pattern comparison and difference-coding areas from 16×16 pixels to sizes up to
64×64, improved variable-block-size segmentation, improved "intra" prediction within the same
picture, improved motion vector prediction and motion region merging, improved motion
compensation filtering, and an additional filtering step called sample-adaptive offset filtering.
Effective use of these improvements requires much more signal processing capability for
compressing the video, but has less impact on the amount of computation needed for
decompression.
HEVC is protected by patents owned by various parties. Commercial use of HEVC technologies
requires the payment of royalties to licensors of HEVC patents, such as MPEG LA, HEVC
Advance, and Technicolor SA. The first version of HEVC was completed in January 2013[9].
The second version was completed and approved in 2014 and published in early 2015.
Additional 3D-HEVC extensions for 3D video were completed in early 2015. Further extensions
remain in development for completion in early 2016.
4.1 Versions[3]:
4.1.1 Version 1: On February 29, 2012, at the 2012 Mobile World Congress, Qualcomm
demonstrated an HEVC decoder running on an Android tablet with a Qualcomm Snapdragon S4
dual-core processor running at 1.5 GHz, showing H.264/MPEG-4 AVC and HEVC versions of
the same video content playing side by side. In this demonstration, HEVC reportedly showed
almost a 50% bit rate reduction compared to H.264/MPEG-4 AVC. The first version of the
HEVC/H.265 standard, approved on April 13, 2013, contains the Main, Main 10, and Main Still
Picture profiles.
4.1.2 Version 2: On April 3, 2013, ATEME announced the availability of the first open source
implementation of an HEVC software player, based on the OpenHEVC decoder and the GPAC video
player, which are both licensed under the LGPL. The OpenHEVC decoder supports the Main profile
of HEVC and can decode 1080p video at 30 fps using a single-core CPU. The second version of
the HEVC/H.265 standard, approved on October 29, 2014, adds 21 range extensions profiles, two
scalable extensions profiles, and one multi-view extensions profile.
4.1.3 Version 3: The third version of the HEVC/H.265 standard, approved on April 29, 2015,
adds the 3D Main profile.
4.2 HEVC ENCODER AND DECODER[7]:
Figure # 3 BLOCK DIAGRAM OF HEVC ENCODER [3]
Source video consisting of a sequence of video frames is encoded (compressed) by a video
encoder to create a compressed video bitstream. The compressed bitstream is stored or
transmitted. A video decoder decompresses the bitstream to create a sequence of decoded
frames. The steps followed by the encoder and the decoder are listed next.
4.2.1 ENCODER AND ITS STEPS:
The video encoder performs the following steps:
1. Partitioning each picture into multiple units
2. Predicting each unit using inter or intra-prediction, and subtracting the prediction from
the unit
3. Transforming and quantizing the residual (the difference between the original picture unit
and prediction)
4. Entropy encoding transform output, prediction information, mode information and
headers
4.2.2 DECODER AND ITS STEPS:
The video decoder performs the following steps:
1. Entropy decoding and extracting the elements of the coded sequence
2. Rescaling and inverting the transform stage
3. Predicting each unit and adding the prediction to the output of the inverse transform
4. Reconstructing a decoded video image.
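The encoder and decoder steps listed above can be sketched on a single toy unit. This is a deliberately reduced illustration, not the HEVC algorithm: the transform and entropy coding stages are omitted, the prediction is simply given, and the names encode_unit, decode_unit and qstep are our own.

```python
def encode_unit(unit, prediction, qstep):
    """Encoder side: subtract the prediction, then quantize the residual."""
    residual = [s - p for s, p in zip(unit, prediction)]
    return [round(r / qstep) for r in residual]


def decode_unit(levels, prediction, qstep):
    """Decoder side: rescale the levels, then add the prediction back."""
    return [p + level * qstep for level, p in zip(levels, prediction)]


unit = [53, 55, 61, 67]          # original samples of one unit
prediction = [50, 50, 60, 60]    # inter or intra prediction of that unit
qstep = 4                        # quantizer step size

levels = encode_unit(unit, prediction, qstep)
reconstructed = decode_unit(levels, prediction, qstep)
print(levels, reconstructed)
```

Quantization is the only lossy step in this sketch: each reconstructed sample differs from the original by at most half the step size.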
4.3 FEATURES
HEVC was designed to improve coding efficiency compared to H.264/MPEG-4 AVC HP, i.e. to
reduce bit-rate requirements by half with comparable image quality, at the expense of increased
computational complexity[7]. HEVC was designed with the goal of allowing video content to
have a data compression ratio of up to 1000:1 [3]. Depending on the application requirements,
HEVC encoders can trade off computational complexity, compression rate, robustness to errors,
and encoding delay time. Two of the key features where HEVC was improved compared to
H.264/MPEG-4 AVC were support for higher resolution video and improved parallel processing
methods.
HEVC is targeted at next-generation HDTV displays and content capture systems which feature
progressive scanned frame rates and display resolutions from QVGA (320x240) to 4320p
(7680x4320), as well as improved picture quality in terms of noise level, color spaces, and
dynamic range[1].
HEVC was designed with the idea that progressive scan video would be used and no coding tools
were added specifically for interlaced video. Interlace specific coding tools, such as MBAFF and
PAFF, are not supported in HEVC [4]. HEVC instead sends metadata that tells how the
interlaced video was sent. Interlaced video may be sent either by coding each frame as a separate
picture or by coding each field as a separate picture. For interlaced video HEVC can change
between frame coding and field coding using Sequence Adaptive Frame Field (SAFF), which
allows the coding mode to be changed for each video sequence. This allows interlaced video to be
sent with HEVC without needing special interlaced decoding processes to be added to HEVC
decoders.
4.3.1 Image Partitioning: Previous standards split pictures into block-shaped regions
called macroblocks and blocks. HEVC targets high-resolution video content, so the use of
variable block sizes is advantageous for encoding. To support this wide variety of block sizes
in an efficient manner, HEVC pictures are divided into so-called Coding Tree Units (CTUs).
Each CTU is divided into Coding Units (CUs), which in turn are divided into Prediction Units
(PUs), depending on what the content requires. Depending on the stream parameters, the
CTUs in a video sequence can have a size of 64×64, 32×32, or 16×16.
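To make the partitioning concrete, the sketch below counts how many CTUs are needed to cover a picture for each of the three CTU sizes named above. The function name is ours, and the sketch assumes that CTUs at the right and bottom edges may extend past the picture boundary, so the counts are rounded up.

```python
def ctu_grid(width, height, ctu_size):
    """Return (columns, rows) of CTUs needed to cover a width x height picture."""
    if ctu_size not in (16, 32, 64):
        raise ValueError("CTU size must be 16, 32 or 64")
    cols = -(-width // ctu_size)    # integer ceiling division
    rows = -(-height // ctu_size)
    return cols, rows


for size in (64, 32, 16):
    cols, rows = ctu_grid(1920, 1080, size)
    print(f"{size}x{size}: {cols} x {rows} = {cols * rows} CTUs")
```

For a 1920×1080 picture, 64×64 CTUs give a 30 by 17 grid, while 16×16 CTUs give 120 by 68.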
4.3.2 A PICTURE REPRESENTING CTU, CU, PU [2].
Figure # 4: A TREE DIAGRAM FOR CTU, CU, PU [2]
A Coding Tree Unit (CTU) is therefore a unit of coding, which is in turn encoded into an HEVC
bitstream. It consists of three blocks, namely a luma (Y) block that covers a square picture
area of L×L samples of the luma component and two chroma blocks (Cb and Cr) that each cover
L/2×L/2 samples of one chroma component, together with associated syntax elements. Each block
is called a Coding Tree Block (CTB).
Syntax elements describe properties of different types of units of a coded block of pixels and
how the video sequence can be reconstructed at the decoder. This includes the method of
prediction (e.g. inter or intra prediction, intra prediction mode, and motion vectors) and other
parameters.
4.3.3 Coding efficiency [3]:
The design of most video coding standards is primarily aimed at having the highest coding
efficiency. Coding efficiency is the ability to encode video at the lowest possible bit rate while
maintaining a certain level of video quality. There are two standard ways to measure the coding
efficiency of a video coding standard, which are to use an objective metric, such as peak signal-
to-noise ratio (PSNR), or to use subjective assessment of video quality. Subjective assessment of
video quality is considered to be the most important way to measure a video coding standard
since humans perceive video quality subjectively.
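For reference, the objective metric mentioned above can be computed as follows. This is a minimal sketch on flat lists of 8-bit samples, using the usual definition PSNR = 10·log10(MAX² / MSE); the function names are ours.

```python
import math


def mse(original, decoded):
    """Mean square error between two equally long sample lists."""
    return sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)


def psnr(original, decoded, max_value=255):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    error = mse(original, decoded)
    if error == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / error)


original = [52, 55, 61, 66, 70, 61, 64, 73]
decoded = [54, 54, 60, 68, 70, 60, 64, 72]
print(f"MSE = {mse(original, decoded):.3f}, PSNR = {psnr(original, decoded):.2f} dB")
```

A higher PSNR means the decoded samples are numerically closer to the original, which does not always match the subjective impression.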
HEVC benefits from the use of larger coding tree unit (CTU) sizes. This has been shown in
PSNR tests with a HM-8.0 HEVC encoder where it was forced to use progressively smaller CTU
sizes. For all test sequences, when compared to a 64×64 CTU size, it was shown that the HEVC
bit rate increased by 2.2% when forced to use a 32×32 CTU size, and increased by 11.0% when
forced to use a 16×16 CTU size. In the Class A test sequences, where the resolution of the video
was 2560×1600, when compared to a 64×64 CTU size, it was shown that the HEVC bit rate
increased by 5.7% when forced to use a 32×32 CTU size, and increased by 28.2% when forced
to use a 16×16 CTU size. The tests showed that large CTU sizes increase coding efficiency while
also reducing decoding time.
4.3.4 Coding tools:
4.3.4.1 Coding tree unit:
HEVC replaces the 16×16 pixel macroblocks used in previous standards with coding tree units
(CTUs), which can use larger block structures of up to 64×64 samples and can better
sub-partition the picture into variable sized structures. HEVC initially divides the picture
into CTUs, which can be 64×64, 32×32, or 16×16, with a larger block size usually increasing
the coding efficiency. [9]
4.3.4.2 Parallel handling tools:
Tiles allow for the picture to be divided into a grid of rectangular regions that can independently
be decoded/encoded[10]. The main purpose of tiles is to allow for parallel processing. Tiles can
be independently decoded and can even allow for random access to specific regions of a picture
in a video stream.
4.3.4.3 Wavefront parallel processing (WPP) [9]: WPP divides a slice into rows of CTUs, in
which the first row is decoded normally but each additional row requires that decisions be
made in the previous row. WPP has the entropy encoder use information from the preceding row
of CTUs and allows for a method of parallel processing that may allow for better compression
than tiles.
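The row dependency can be illustrated with a small scheduling sketch. It assumes the two-CTU lag usually described for HEVC WPP (a CTU can start once its left neighbour and the above-right CTU in the previous row are finished), one thread per CTU row, and one time step per CTU; the function name and the uniform timing are our simplifications.

```python
def wpp_schedule(rows, cols):
    """Earliest start step of every CTU when each CTU row has its own thread."""
    start = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            ready = 0
            if c > 0:                                  # left neighbour
                ready = max(ready, start[r][c - 1] + 1)
            if r > 0:                                  # above-right neighbour
                above_right = min(c + 1, cols - 1)
                ready = max(ready, start[r - 1][above_right] + 1)
            start[r][c] = ready
    return start


schedule = wpp_schedule(3, 5)
for row in schedule:
    print(row)
total_steps = schedule[-1][-1] + 1
print("parallel steps:", total_steps, "sequential steps:", 3 * 5)
```

Each row starts two steps after the row above it, which produces the diagonal wavefront that gives the tool its name.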
Tiles and WPP are allowed, but are optional. If tiles are present, they must be at least 64
pixels high and 256 pixels wide, with a level-specific limit on the number of tiles allowed.
Slices can, for the most part, be decoded independently from each other, with the main purpose
of slices being re-synchronization in case of data loss in the video stream.
Slices can be defined as self-contained in that prediction is not made across slice boundaries.
When in-loop filtering is done on a picture, though, information across slice boundaries may
be required. Slices are CTUs decoded in the order of the raster scan, and different coding
types can be used for slices, such as I types, P types, or B types.
Dependent slices can allow data related to tiles or WPP to be accessed more quickly by the
system than if the entire slice had to be decoded. The main purpose of dependent slices is to
allow for low-delay video encoding due to their lower latency.
... because in flat areas, which are prone to banding artifacts, sample amplitudes tend to be
clustered in a small range. The SAO filter was designed to increase picture quality, reduce
banding artifacts, and reduce ringing artifacts.
4.4 CABAC & ITS OVERVIEW [7]:
Context-adaptive binary arithmetic coding (CABAC) is a form of entropy coding used in the
H.264/MPEG-4 AVC and High Efficiency Video Coding (HEVC) standards. It is a well-known
throughput bottleneck in AVC/H.264 [8]. It is a lossless compression technique, although the
video coding standards in which it is used are typically for lossy compression applications.
CABAC is notable for providing much better compression than most other entropy coding
algorithms used in video encoding, and it is one of the key elements that give the H.264/AVC
encoding scheme better compression capability than its predecessors. CABAC has multiple
probability models for different contexts. It first converts all non-binary symbols to binary.
Then, for each bit, the coder selects which probability model to use and uses information from
nearby elements to optimize the probability estimate. Arithmetic coding is finally applied to
compress the data.
In H.264/MPEG-4 AVC, CABAC is only supported in the Main and higher profiles of the
standard, as it requires a larger amount of processing to decode than the simpler scheme known
as context-adaptive variable-length coding (CAVLC) that is used in the standard's Baseline
profile. CABAC is also difficult to parallelize and vectorize, so other forms of parallelism
may be coupled with its use. In HEVC, CABAC is used in all profiles of the standard.
4.4.1 CABAC ALGORITHM[9]:
CABAC is based on arithmetic coding, with a few innovations and changes to adapt it to the
needs of video encoding standards.
1. It encodes binary symbols, which keeps the complexity low and allows probability
modeling for more frequently used bits of any symbol.
2. The probability models are selected adaptively based on local context, allowing better
modeling of probabilities, because coding modes are usually locally well correlated.
3. It uses a multiplication-free range division by the use of quantized probability ranges and
probability states.
4.5 HEVC CABAC & its main steps[1]
4.5.1. Binarization: Syntax elements are mapped to binary symbols (bins) using a
binarization process. Various forms of binarization are used in AVC and HEVC (e.g.,
Exp-Golomb, fixed length, truncated unary, custom). Combinations of different binarizations
are also allowed, where the prefix and suffix are binarized differently. For example, the
prefix can be truncated unary and the suffix can be fixed length (this combination is also
called truncated Rice). Alternatively, truncated unary can be used for the prefix and
Exp-Golomb for the suffix. The standard defines which type of binarization is used for each
syntax element.
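The binarization schemes named above can be sketched as small functions that return bin strings. This is an illustration under assumptions, not the normative process: the function names and parameters are ours, the Exp-Golomb variant uses a prefix of ones terminated by a zero (the convention usually shown for CABAC escape codes), and prefix_suffix is a simplified truncated-Rice-style combination that always emits its fixed-length suffix.

```python
def truncated_unary(value, c_max):
    """`value` ones followed by a zero; the zero is dropped when value == c_max."""
    bins = "1" * value
    return bins if value == c_max else bins + "0"


def fixed_length(value, n_bits):
    """Plain binary representation of `value` on n_bits bins (n_bits >= 1)."""
    return format(value, "0{}b".format(n_bits))


def exp_golomb(value, k):
    """k-th order Exp-Golomb bin string with a unary prefix of ones."""
    bins = []
    while value >= (1 << k):
        bins.append("1")
        value -= 1 << k
        k += 1
    bins.append("0")
    for shift in range(k - 1, -1, -1):     # k-bit suffix, most significant first
        bins.append(str((value >> shift) & 1))
    return "".join(bins)


def prefix_suffix(value, suffix_bits, prefix_max):
    """Truncated unary prefix plus fixed-length suffix (simplified)."""
    prefix = truncated_unary(value >> suffix_bits, prefix_max)
    if suffix_bits == 0:
        return prefix
    return prefix + fixed_length(value & ((1 << suffix_bits) - 1), suffix_bits)


for v in range(5):
    print(v, truncated_unary(v, 4), fixed_length(v, 3), exp_golomb(v, 0))
```

Small values get short bin strings in the unary-based schemes, which suits syntax elements whose small values are the most frequent.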
4.5.2. Arithmetic Coding:
The bins are compressed into bits using arithmetic coding. Multiple bins can be represented by a
single bit. This allows syntax elements to be represented by a fractional number of bits, which
improves coding efficiency. Arithmetic coding involves recursive sub-interval division, where a
range is divided into two subintervals based on the probability of the symbol that is being
compressed. The encoded bits represent an offset that, when converted to a binary fraction,
selects one of the two subintervals, which indicates the value of the decoded bin. After every
decoded bin, the range is updated to equal the selected subinterval, and the interval division
process repeats itself. In order to effectively compress the bins to bits, the probability of the bins
must be accurately estimated.
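The recursive sub-interval division can be illustrated with exact fractions. The sketch below assumes a fixed probability for bin 0 (real CABAC adapts it per context and uses integer ranges with renormalization); the names encode, decode and p_zero are illustrative only.

```python
import math
from fractions import Fraction


def encode(bins, p_zero):
    """Return (low, width) of the final subinterval for a bin sequence."""
    low, width = Fraction(0), Fraction(1)
    for b in bins:
        width_zero = width * p_zero  # part of the interval assigned to bin 0
        if b == 0:
            width = width_zero
        else:
            low += width_zero
            width -= width_zero
    return low, width


def decode(offset, num_bins, p_zero):
    """Recover the bins from any offset that lies inside the final subinterval."""
    low, width = Fraction(0), Fraction(1)
    bins = []
    for _ in range(num_bins):
        width_zero = width * p_zero
        if offset < low + width_zero:
            bins.append(0)
            width = width_zero
        else:
            bins.append(1)
            low += width_zero
            width -= width_zero
    return bins


bins = [0, 0, 1, 0, 0, 0, 1, 0]
p_zero = Fraction(3, 4)
low, width = encode(bins, p_zero)
ideal_bits = -math.log2(float(width))  # information content of the sequence
print(decode(low, len(bins), p_zero) == bins, round(ideal_bits, 2))
```

With these assumed probabilities the eight bins need about 6.5 bits, which is the sense in which a syntax element can cost a fractional number of bits.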
4.5.3. Context Selection: Context modeling and selection are used to accurately
model the probability of each bin. The probability of a bin depends on the type of
syntax element it belongs to, the bin index within the syntax element (for example,
most significant or least significant bin), and the properties of spatially neighboring
coding units. There are several hundred different context models used in AVC and
HEVC. As a result, a large finite state machine is needed to select the correct context
for each bin. In addition, the estimated probability of the selected context model is
updated after each binary symbol is encoded or decoded. [8]
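A rough sketch of context selection and adaptation is given below. It is a floating-point approximation, not the table-driven state machine of the standard: the update rule follows the exponential model commonly described for CABAC (LPS probability scaled by a constant factor per state), and ContextModel, select_context and neighbor_inc are hypothetical names introduced for this example.

```python
ALPHA = (0.01875 / 0.5) ** (1 / 63)  # assumed per-state scaling of the LPS probability
P_MIN = 0.5 * ALPHA ** 62            # assumed smallest adaptive LPS probability


class ContextModel:
    """Tracks the most probable symbol (MPS) and the LPS probability estimate."""

    def __init__(self):
        self.mps = 0
        self.p_lps = 0.5

    def update(self, bin_val):
        if bin_val == self.mps:
            # MPS observed: the LPS becomes less likely
            self.p_lps = max(self.p_lps * ALPHA, P_MIN)
        else:
            # LPS observed: the LPS becomes more likely
            self.p_lps = ALPHA * self.p_lps + (1 - ALPHA)
            if self.p_lps > 0.5:
                # roles of MPS and LPS swap
                self.mps ^= 1
                self.p_lps = 1 - self.p_lps


contexts = {}


def select_context(syntax_element, bin_idx, neighbor_inc=0):
    """Pick the model from the syntax element, bin index and neighbor-derived increment."""
    key = (syntax_element, bin_idx, neighbor_inc)
    return contexts.setdefault(key, ContextModel())


ctx = select_context("example_flag", 0)
for _ in range(50):
    ctx.update(1)
print(ctx.mps, round(ctx.p_lps, 3))
```

After a run of identical bins the model settles on that value as the MPS with a small LPS probability, which is what lets the arithmetic coder spend well under one bit per bin.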
At the decoder, there are several feedback loops in CABAC. The context and
probability of the next bin depend on the decoded value of the current bin: the
current bin determines the bin index and syntax element, and consequently the
context, of the next bin. This bin-to-bin dependency makes it difficult to process
multiple bins in parallel. The context and probability for the next bin can
be determined ahead of time to increase concurrency; however, due to the complexity of the
context selection finite state machine, these operations are expensive in terms of
power and area, and their cost grows exponentially with the number of bins. The context
update and range update feedback loops are simpler than the context selection loop
and thus do not affect throughput as severely. Not all bins are coded using an
estimated probability, that is, context coded. [11]
Bins can also be coded assuming an equal probability of 0.5, which is known as bypass
coding. As a result, bypass coded bins avoid the feedback loop for context
selection. In addition, the arithmetic coding is simpler and faster for bypass
coded bins, as the division of the range into subintervals can be done by a shift
rather than the look-up table that is required for context coded bins. Thus
multiple bypass bins can be processed concurrently in the same cycle at lower power
and area cost than context coded bins. [12]
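The following sketch shows why bypass bins lend themselves to concurrent processing. It models only the decoder side, assuming the usual integer formulation in which the offset is shifted left by one bit per bypass bin and compared against an unchanged range; the function names are hypothetical, and the batched variant uses a plain division purely to show that several bins can be resolved in one step without any context state.

```python
def decode_bypass_sequential(rng, offset, bits):
    """Decode one bypass bin per incoming bit: shift, compare, subtract."""
    bins = []
    for b in bits:
        offset = (offset << 1) | b
        if offset >= rng:
            bins.append(1)
            offset -= rng
        else:
            bins.append(0)
    return bins, offset


def decode_bypass_batched(rng, offset, bits):
    """Decode all bypass bins at once; requires offset < rng on entry."""
    for b in bits:
        offset = (offset << 1) | b
    quotient, offset = divmod(offset, rng)
    bins = [(quotient >> i) & 1 for i in reversed(range(len(bits)))]
    return bins, offset


print(decode_bypass_sequential(300, 123, [1, 0, 1, 1]))
print(decode_bypass_batched(300, 123, [1, 0, 1, 1]))
```

Both calls return the same bins and the same final offset, because the range does not change between bypass bins and no probability model has to be looked up or updated.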
Figure # 8: BLOCK DIAGRAM OF CABAC [8]
4.6 CABAC ENCODER:
Figure # 9 A BLOCK DIAGRAM OF CABAC ENCODER [11]
4.7 The arithmetic decoding procedure[10]:
4.7.1. Probability estimation is performed by a transition process between 64 separate probability
states for the least probable symbol (LPS), the less probable of the two binary decisions "0" or "1".
4.7.2. The range R representing the current state of the arithmetic coder is quantized to a small
range of pre-set values before calculating the new range at each step, making it possible to
calculate the new range using a look-up table (i.e. multiplication-free).
4.7.3. A simplified encoding and decoding process is defined for data symbols with a near
uniform probability distribution.
The definition of the decoding process is designed to facilitate low-complexity implementations
of arithmetic encoding and decoding. Overall, CABAC provides improved coding efficiency
compared with CAVLC-based coding, at the expense of greater computational complexity.
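The multiplication-free range update can be sketched as a two-dimensional table indexed by probability state and quantized range. The table built below is illustrative only and is NOT the normative table from the standard: it assumes an exponential state-to-probability mapping, a 9-bit range between 256 and 510, a two-bit range quantizer, and hypothetical representative range values.

```python
ALPHA = (0.01875 / 0.5) ** (1 / 63)  # assumed scaling between neighboring states
NUM_STATES = 64
REP_RANGE = (288, 352, 416, 480)     # hypothetical representative range per quantizer cell


def lps_probability(state):
    """Assumed LPS probability of a state: 0.5 at state 0, shrinking with the state index."""
    return 0.5 * ALPHA ** state


# Precomputed products p_LPS * R, so that no multiplication is needed while coding.
LPS_RANGE_TABLE = [
    [round(lps_probability(s) * r) for r in REP_RANGE] for s in range(NUM_STATES)
]


def quantize_range(rng):
    """Map a range in [256, 510] to one of four cells using bits 7 and 6."""
    return (rng >> 6) & 3


def lps_subrange(state, rng):
    """Table look-up that stands in for the product p_LPS * rng."""
    return LPS_RANGE_TABLE[state][quantize_range(rng)]


print(lps_subrange(0, 256), lps_subrange(0, 510), lps_subrange(63, 300))
```

The look-up only approximates the exact product, which is the price paid for removing the multiplier from the critical path.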
4.8 THROUGHPUT IMPROVEMENTS IN CABAC
In HEVC, the binarization and context selection of CABAC were modified, while the arithmetic
coding engine remained the same as in AVC. This section highlights three techniques that were
used to improve the throughput of CABAC in HEVC [7]. We evaluate the performance of
HEVC against AVC with a new metric called Bjøntegaard delta cycles (BD-cycles). BD-cycles
uses the Bjøntegaard delta measurement method to compute the average difference between the
cycles vs. bit-rate curves of HEVC and AVC, and thereby the CABAC throughput of HEVC relative to AVC.
Enabling higher throughput is desirable since it can be used to support higher pixel rates for
higher resolutions and frame rates. Throughput can also be traded off for reduced power
consumption with the help of voltage scaling. Parallel processing is an effective way to increase
throughput. However, this is challenging, as the purpose of video compression is to remove
redundancy, which introduces dependencies and makes parallelism difficult.
4.8.1 Reduce total number of bins: Here, the binarization of the coefficient
level was modified to reduce the total number of bins. The coefficient levels account
for a significant portion, on average 15 to 25%, of the total number of bins. While
AVC uses a truncated unary prefix followed by an Exp-Golomb suffix, HEVC uses a
truncated unary prefix followed by a fixed length suffix. This combination is also
known as truncated Rice. For coefficient values up to 12, HEVC and AVC have the
same number of bins; however, for coefficient values above 12, the binarization
process used in HEVC always results in fewer bins than AVC [6][7][9][4]. The
transition point between the prefix and suffix was set such that the maximum total
number of bins for coefficient level 12 was reduced from 43 to 34. The maximum
total number of bins for delta QP was also reduced from 53 to 15 by using truncated
unary plus Exp-Golomb rather than unary, and signaling the sign value separately.
For the worst-case scenario of a 16x16 block of pixels, the total number of bins was reduced
in HEVC by 1.5x compared to AVC.
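The delta QP figures quoted above can be reproduced with a short bin count. The parameters below are assumptions that are not stated in this report: a delta QP range of -26 to 25 (8-bit video), an AVC-style mapping of the signed value to an unsigned code number followed by unary coding, and an HEVC-style truncated unary prefix with a maximum of 5, a 0th-order Exp-Golomb suffix, and one separate sign bin.

```python
def eg0_length(value):
    """Number of bins in the 0th-order Exp-Golomb code of a non-negative value."""
    return 2 * ((value + 1).bit_length() - 1) + 1


def avc_delta_qp_bins(delta_qp):
    """Unary code of the mapped signed value (assumed AVC-style mapping)."""
    mapped = 2 * abs(delta_qp) - (1 if delta_qp > 0 else 0)
    return mapped + 1


def hevc_delta_qp_bins(delta_qp, c_max=5):
    """Truncated unary prefix, EG0 suffix, and a separate sign bin (assumed parameters)."""
    magnitude = abs(delta_qp)
    prefix = min(magnitude, c_max)
    bins = prefix + 1 if prefix < c_max else c_max
    if magnitude >= c_max:
        bins += eg0_length(magnitude - c_max)
    if magnitude != 0:
        bins += 1  # sign bin, only needed for non-zero values
    return bins


delta_qp_values = range(-26, 26)
print(max(avc_delta_qp_bins(v) for v in delta_qp_values))   # 53
print(max(hevc_delta_qp_bins(v) for v in delta_qp_values))  # 15
```

Under these assumptions the worst case drops from 53 to 15 bins, matching the numbers given in the text.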
4.8.2 Reduce number of context coded bins: The number of context coded
bins was significantly reduced for syntax elements such as motion vectors and
coefficient levels. For these syntax elements, the first few bins were context coded,
and the remaining bins were bypass coded. For instance, in AVC, the first 9 bins of
the motion vector difference were context coded. For HEVC, it was determined that
only the first two bins had to be context coded. Similarly, the number of context
coded bins for each coefficient level was reduced from 14 in AVC to either 1 or 2
(depending on the number of coefficients per 4x4 block) in HEVC. Table 1
summarizes the reduction in context coded bins for various syntax elements. Overall,
counting the number of times each syntax element appears in a 16x16 block
of pixels, the worst-case number of context coded bins was reduced by over 8x
compared to AVC.
TABLE #1 : SYNTAX ELEMENTS FOR WHICH THE CONTEXT CODING IS
DONE. [12]
4.8.3 Grouping of bypass bins: Due to the effort to reduce the number of context
coded bins, bypass bins account for a significant portion of the total bins in HEVC.
As a result, processing multiple bypass bins per cycle can significantly increase the
overall CABAC throughput. Multiple bypass bins can only be processed in the same
cycle if bypass bins appear consecutively in the bin stream [3][10][9][7]. Thus long
runs of bypass bins result in higher throughput than frequent switching between
bypass and context coded bins. Accordingly, bypass bins were grouped together
across multiple syntax elements to maximize the run length of bypass bins. For
instance, the sign bins of the different coefficients were grouped together, and the
bypass portions of the coefficient levels were grouped together. Table 2 summarizes
the syntax elements where bypass grouping was used.
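The effect of grouping can be illustrated with a toy cycle model. The model is an assumption made for this example, not a description of any real decoder: each context coded bin ("C") costs one cycle, and each uninterrupted run of bypass bins ("B") is processed at a fixed number of bins per cycle.

```python
from itertools import groupby
from math import ceil


def cycles(bin_types, bypass_per_cycle):
    """Cycles to process a bin stream under the toy model described above."""
    total = 0
    for kind, run in groupby(bin_types):
        run_length = len(list(run))
        if kind == "C":
            total += run_length
        else:
            total += ceil(run_length / bypass_per_cycle)
    return total


def interleaved(num_coeffs, ctx_bins, bypass_bins):
    """Per coefficient: context coded bins immediately followed by its bypass bins."""
    return ("C" * ctx_bins + "B" * bypass_bins) * num_coeffs


def grouped(num_coeffs, ctx_bins, bypass_bins):
    """All context coded bins first, then all bypass bins in one long run."""
    return "C" * (ctx_bins * num_coeffs) + "B" * (bypass_bins * num_coeffs)


print(cycles(interleaved(4, 2, 1), 4))  # 12
print(cycles(grouped(4, 2, 1), 4))      # 9
```

The bins are identical in both orderings; only the longer bypass run in the grouped stream lets the multi-bin bypass path be used.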
TABLE # 2: SYNTAX ELEMENTS FOR WHICH BYPASS PROCESS IS
FOLLOWED [11]
The entropy coding method used in HEVC is context-based adaptive binary arithmetic
coding (CABAC). CABAC outperforms other entropy coders in terms of coding
efficiency, and entropy coding in HEVC is based on the CABAC algorithm. [7]
HEVC is based on a hybrid coding scheme with motion-compensated prediction,
transform coding and entropy coding of residual data. HEVC uses a new coding structure,
introducing the coding unit (CU), prediction unit (PU), and transform unit (TU), which perform
prediction and adaptive coding of data in blocks of various sizes [1]:
CU size: 64x64 down to 8x8. PU size: down to 8x4 or 4x8. TU size: 4x4 up to 32x32 (square),
or 32x8, 8x32, 16x4, 4x16 (non-square). The new coding structure allows the
size of the block for prediction and transform to be chosen adaptively, exploiting local features of an
image.
HEVC video compression uses context-based adaptive binary arithmetic
coding (CABAC) for entropy coding, which is a modified version of the CABAC encoder used
in the MPEG-4 AVC/H.264 video compression standard.
Figure # 10 BLOCK DIAGRAM OF CABAC [12]
The distinguishing feature of the CABAC algorithm is its binary arithmetic codec
core, which encodes binary symbols.
225 models are defined in CABAC for HEVC.
CABAC introduces two fundamental simplifications to accelerate computation:
- Symbol probabilities are computed in a simplified way using a pre-defined
finite state machine (FSM) with 64 states.
- Some symbols are encoded in the so-called bypass mode, with no data modeling
stage.
The simplifications made in the CABAC context modeler block substantially reduce the accuracy of
symbol probabilities, which negatively affects compression performance.
The main idea of the improved CABAC is to use a more accurate mechanism for estimating data statistics
than the original algorithm.
In the improved CABAC, the CTW (context-tree weighting) method is used by every statistical model to calculate
the conditional probability of a symbol more accurately.
• Applying the improved CABAC within HEVC increases the compression performance
of the video encoder.
• As seen in the table below, an average bit rate reduction of 2.6% was obtained for the
video sequences used.
IMPROVED CABAC VER 2 BLOCK DIAGRAM
Figure # 11 IMPROVED VERSION OF CABAC [8]
The compression performance of HEVC technology can be further improved by using
an enhanced entropy encoder.
More accurate estimation of data statistics in CABAC gives a 1.6% to 4.5% reduction of the
HEVC bit-stream.
5 CONCLUSION:
The throughput was measured for 24 video sequences of various resolutions that were used by
JCT-VC during the HEVC standardization process. These sequences were encoded with the HM-8.0
reference software for HEVC [13] and the JM-18.4 reference software for AVC. The HM-8.0
encoder was configured based on the common conditions set by JCT-VC [9][10]. Under the
common conditions, for both AVC and HEVC, 12 encoded bit-streams are generated for each of
the 24 video sequences, covering four quantization points (QP = 22, 27, 32, 37) and three
different configurations.
Graph # 1 PSNR vs BIT RATE COMPARISON OF HEVC AND AVC [8]
Table # 3 BIT RATE COMPARISON OF VERSIONS OF CABAC [8]
6 REFERENCES:
[1] J. Ohm, et al, "Comparison of the Coding Efficiency of Video Coding Standards – Including
High Efficiency Video Coding (HEVC)", IEEE Trans. on CSVT, vol. 22, no. 12, pp. 1669-1684,
2012.
[2] G. Sullivan, et al, "The H.264/AVC Advanced Video Coding Standard: Overview and
Introduction to the Fidelity Range Extensions", SPIE Conference on Applications of Digital
Image Processing XXVII, vol. 5558, pp. 53-74, Aug. 2004.
[3] M. Goldman, "High-Efficiency Video Coding (HEVC): The Next-Generation Compression
Technology", SMPTE Mot. Imag. J, vol. 121, no. 5, pp. 27-33, 2012.
[4] D. Marpe, et al, "Context-based adaptive binary arithmetic coding in the H.264/AVC video
compression standard", IEEE Trans. on CSVT, vol. 13, no. 7, pp. 620-636, July 2003.
[5] M. Zhou, et al, "Parallel tools in HEVC for high-throughput processing", Applications of
Digital Image Processing XXXV, 2012.
[6] T. Wiegand, et al, "Overview of the H.264/AVC video coding standard", IEEE Trans. Circuits
Syst. Video Technol., vol. 13, no. 7, pp. 560-576, 2003.
[7] S. Kim, et al, "Efficient entropy coding scheme for H.264/AVC lossless video coding", Signal
Processing: Image Communication, vol. 25, no. 9, pp. 687-696, 2010.
[8] V. Sze and M. Budagavi, "High Throughput CABAC Entropy Coding in HEVC",
IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1778-1791,
2012.
[9] Dajiang Zhou, et al, "Ultra-High-Throughput VLSI Architecture of H.265/HEVC CABAC
Encoder for UHDTV Applications", IEEE Trans. Circuits Syst. Video Technol., vol. 25, no. 3,
pp. 497-507, 2015.
[10] V. Sze, et al, "High Efficiency Video Coding (HEVC): Algorithms and Architectures",
Springer, 2014.
[11] S. Choi and S. Chae, "Comparison of CABAC rate estimation models for HEVC rate
distortion optimization", Electronics Letters, vol. 50, no. 6, pp. 441-442, 2014.
[12] J. Choi and Y. Ho, "Efficient residual data coding in CABAC for HEVC lossless video
compression", Signal, Image and Video Processing, vol. 9, no. 5, pp. 1055-1066, 2013.
[13] HM C++ Code: https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/
[14] HM software manual:
https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/trunk/doc/software-manual.pdf
[15] HEVC tutorial by I.E.G. Richardson: http://www.vcodex.com/h265.html
[16] G. J. Sullivan, et al, "Standardized Extensions of High Efficiency Video Coding
(HEVC)", IEEE Journal of Selected Topics in Signal Processing, vol. 7, no. 6, pp. 1001-1016,
Dec. 2013.
[17] Test sequences: https://media.xiph.org/video/derf/
[18] C. Fogg, "Suggested figures for the HEVC specification", ITU-T / ISO-IEC Document:
JCTVC J0292r1, July 2012.
[19] M. Wien, "High Efficiency Video Coding: Coding Tools and Specification", Springer,
2014.
[20] G. Correa, et al, "Fast HEVC Encoding Decisions Using Data Mining", IEEE
Transactions on Circuits and Systems for Video Technology, vol. 25, no. 4, pp. 660-673,
Apr. 2015.
[21] N. Ling, "High efficiency video coding and its 3D extension: A research perspective",
Keynote Speech, ICIEA, pp. 2150-2155, Singapore, July 2012.
[22] Video Sequences: http://forum.doom9.org/archive/index.php/t-135034.html
http://ultravideo.cs.tut.fi/
[23] ITU-T website: http://www.itu.int/ITU-T/index.html
[24] T. Nguyen, et al, "Transform Coding Techniques in HEVC", IEEE Journal of Selected Topics
in Signal Processing, vol. 7, pp. 978-989, Dec. 2013.
[25] X. Wang, et al, "Paralleling Variable Block Size Motion Estimation of HEVC on Multicore
CPU plus GPU platform", IEEE International Conference on Image Processing (ICIP), vol. 22,
pp. 1836-1839, Sep. 2013.
[26] Y.L. Lee, et al, "Improved lossless intra coding for H.264/MPEG-4 AVC", IEEE Trans on
Image Process, vol. 15, no. 9, pp. 2610-2615, Sep. 2006.
[27] M. Jakubowski and G. Pastuszak, "Block-based motion estimation algorithms-a survey",
Journal of Opto-Electronics Review, vol. 21, pp. 86-102, Mar. 2013.
[28] Tortoise SVN: http://tortoisesvn.net/downloads.html
[29] Website on PSNR: http://en.wikipedia.org/wiki/Peak_signal-to-noise_ratio
[30] White paper on PSNR-NI: http://www.ni.com/white-paper/13306/en/