Parallel Processing in HEVC
Seminar Selected Topics in Communications Engineering 21.05.2013
Abhishek Aggarwal
Supervisor: Christian Feldmann
Name - Kurztitel
Outline
Introduction
Parallel Processing
Parallel Processing in Previous Standard
What HEVC offers
Summary
Introduction
Increasing popularity of HD formats, emergence of beyond-HD formats (e.g., 4k×2k or 8k×4k resolution)
Huge amounts of data to be processed
Stronger needs for superior coding efficiency and fast processing
HEVC particularly focuses on two key issues:
- Increased video resolution
- Increased use of parallel processing architectures
HEVC Coding Design
Contd......
The classic block-based hybrid video coding approach.
A hybrid of:
- Interpicture prediction
- Intrapicture prediction
- Transform coding of the prediction residual signals
Picture splitting into block-shaped regions
Transform coefficients scaled, quantized, entropy coded, and transmitted together with the prediction information.
Parallel Processing
Contd...
A picture split into one or several segments.
Each segment represents an area of a picture.
Segments are self-contained, i.e., independently encodable/decodable.
Prediction within the picture not performed across segment boundaries.
Aim of Parallel Processing Techniques
Improve coding efficiency
Parallelize decoding/encoding operations at a finer level of granularity
Reduce latency
Parallel Processing in Previous Standard
In the H.264/AVC standard, parallel processing is typically achieved by the use of slices.
A slice can be:
- Decoded/encoded independently from other slices.
- Either an entire picture or a region of a picture.
A picture may be split into one or several slices
Contd....
A slice consists of a sequence of CTUs (macroblocks in H.264/AVC).
CTUs processed in raster-scan order.
Slices are less desirable for parallel processing:
- Slice header overhead
- Dependency on the raster scan order
- More prediction breaks
What HEVC offers
Tiles
Wavefront Parallel Processing
Tiles
Partition of a picture into columns and rows.
Intersecting column and row boundaries delineate rectangular regions called tiles
Contain an integer number of CTUs.
CTUs processed in raster scan order within tiles.
Tiles processed in raster scan order within a picture.
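The two scan rules above (tiles in raster order within the picture, CTUs in raster order within each tile) can be sketched as follows. This is a minimal illustration, not the normative derivation: the function names are hypothetical, and the tile boundaries are assumed to be spaced as evenly as integer division allows.

```python
def uniform_boundaries(size_in_ctus, num_parts):
    """Split `size_in_ctus` CTUs into `num_parts` nearly equal spans.

    Returns num_parts + 1 boundary indices (start of each span, then the end).
    Assumption: evenly spaced boundaries via integer division.
    """
    return [(i * size_in_ctus) // num_parts for i in range(num_parts + 1)]


def tile_scan_order(pic_w, pic_h, tile_cols, tile_rows):
    """Return (x, y) CTU coordinates in tile scan order.

    pic_w, pic_h: picture size in CTUs (hypothetical parameter names).
    """
    col_bd = uniform_boundaries(pic_w, tile_cols)
    row_bd = uniform_boundaries(pic_h, tile_rows)
    order = []
    for tr in range(tile_rows):            # tiles in raster order
        for tc in range(tile_cols):
            # CTUs in raster order inside the current tile
            for y in range(row_bd[tr], row_bd[tr + 1]):
                for x in range(col_bd[tc], col_bd[tc + 1]):
                    order.append((x, y))
    return order


if __name__ == "__main__":
    # Hypothetical 6x4-CTU picture with 2 tile columns and 2 tile rows.
    print(tile_scan_order(6, 4, 2, 2)[:8])
```

In this example the first six coordinates cover the top-left 3x2 tile before the scan moves on to the top-right tile, which is what lets each tile be handed to a separate thread.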
Tiles - Advantages
No header bit overhead. The tile widths and heights (in units of CTUs) are signaled in the sequence parameter set or in the picture parameter set.
More flexibility of frame partitioning.
Fewer boundaries to break prediction mechanisms.
Better Load balancing.
Contd...
A break at a column boundary offers:
- Smaller penalty on compression performance.
- Reduced delay when pixel data arrives one CTU row at a time from the camera.
- Finer-grained parallelism if there are multiple tiles within a slice.
Wavefront Parallel Processing
A Slice further divided into rows of CTUs.
CTUs processed in raster scan order.
First row processed in an ordinary way.
Second row begins to be processed after only two CTUs have been processed in the first row, and so on.
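The resulting staircase schedule can be illustrated with a small sketch. It assumes one thread per CTU row and that every CTU takes exactly one time step, which real encoders and decoders do not guarantee; the function name is hypothetical.

```python
def wavefront_schedule(rows, cols):
    """Earliest start time of each CTU under wavefront processing.

    Assumptions (for illustration only): one thread per CTU row, and
    every CTU takes exactly one time step. A CTU may start once its
    left neighbour and the CTU above and to the right are finished.
    """
    start = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            ready = 0
            if c > 0:
                # left CTU in the same row must be finished
                ready = max(ready, start[r][c - 1] + 1)
            if r > 0:
                # top-right CTU (clamped at the picture edge) must be finished
                above_right = min(c + 1, cols - 1)
                ready = max(ready, start[r - 1][above_right] + 1)
            start[r][c] = ready
    return start


if __name__ == "__main__":
    for row in wavefront_schedule(3, 5):
        print(row)
    # [0, 1, 2, 3, 4]
    # [2, 3, 4, 5, 6]
    # [4, 5, 6, 7, 8]
```

Under these assumptions each row trails the row above by two CTUs, so the 3x5 example finishes after 9 time steps instead of the 15 a single thread would need.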
Problem: CABAC probability dependency
The CABAC entropy coder updates its probabilities on the fly.
To encode the current CTU, the left, top-left, top and top-right CTUs need to be available.
Entropy coding of symbols requires the CABAC probabilities, which are available only after the previous CTU (left CTU) has been processed.
Contd...
The first CTU of a line uses the CABAC probabilities of the last CTU from the previous line.
Contd...
Not possible to start processing for threads 2, 3 before thread 1 is finished.
This prevents efficient parallelism.
One solution is to re-initialize the CABAC probabilities at the beginning of each line of CTUs, but this causes a large penalty in R/D performance.
Approach for CABAC probability update
Initialize CABAC probabilities of the first CTU of each line with the probabilities obtained after the second CTU of the upper line is processed.
Contd..
Avoids losing the probabilities that have already been acquired.
Allows for a quick learning of the probabilities along the first column of CTUs.
Can be performed without modifying the wavefront structure, because the second CTU of the upper line is always available.
Easy implementation: only a single additional probability buffer is required.
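The synchronization rule can be sketched with a toy model. The class below is not the CABAC state machine: it stands in for the whole context set with a single adaptive probability, and all names are hypothetical. Rows are simulated one after another for simplicity, whereas a real WPP implementation would run them concurrently.

```python
import copy


class ToyContext:
    """Stand-in for a CABAC context set: one adaptive probability of a '1' bin."""

    def __init__(self, p_one=0.5):
        self.p_one = p_one

    def update(self, bin_value, rate=0.05):
        # Simple exponential adaptation; an assumption, not the CABAC update rule.
        self.p_one += rate * (bin_value - self.p_one)


def code_picture_wpp(ctu_bins, init=0.5):
    """Return the probability each CTU row starts from.

    ctu_bins[r][c] is the list of bins (0/1) coded in CTU (r, c).
    """
    sync_buffer = None              # the single additional probability buffer
    row_start = []
    for row in ctu_bins:
        if sync_buffer is None:
            ctx = ToyContext(init)  # first row: ordinary initialization
        else:
            ctx = copy.copy(sync_buffer)  # inherit from the row above
        row_start.append(ctx.p_one)
        next_sync = None
        for c, bins in enumerate(row):
            for b in bins:
                ctx.update(b)
            if c == 1:              # state after the second CTU of this row
                next_sync = copy.copy(ctx)
        sync_buffer = next_sync     # stays None if the row has fewer than 2 CTUs
    return row_start


if __name__ == "__main__":
    # Hypothetical content: 3 rows of 3 CTUs, each CTU codes ten '1' bins.
    picture = [[[1] * 10 for _ in range(3)] for _ in range(3)]
    print(code_picture_wpp(picture))
```

In this toy run each row starts from a probability that has already adapted to the content, instead of falling back to the initial value, which is the R/D benefit the approach is after.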
WPP - Advantages
Parallelism at a fine level of granularity, i.e., within a slice.
No prediction breaks occur.
Provides better compression performance than tiles (and avoids some visual artifacts that may be induced by using tiles).
The concept of dependent slice segments supports fragmented packetization with lower latency.
Summary
Tiles
- Fewer prediction breaks than in the slice approach
- Parallelism at a coarser level of granularity (picture/subpicture)
- No sophisticated synchronization of threads required
WPP
- Parallelism at a fine level of granularity
- Latency reduction
- No prediction breaks, hence better compression performance