![Page 1: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/1.jpg)
EE 5359-MULTIMEDIA PROCESSING
3D EXTENSION of HEVC: Multi-View plus Depth
Parashar Nayana Karunakar
Student Id: 1000833406
Department of Electrical Engineering
![Page 2: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/2.jpg)
HEVC & MVD- BRIEF OVERVIEW
• High-Efficiency Video Coding (HEVC) is the newest video coding standard of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG).
• The Joint Collaborative Team on 3D Video Coding Extension Development (JCT-3V) was created to develop 3D video coding technology more advanced than the current multiview video coding (MVC) features of H.264. The standards for which these 3D video coding extensions will provide enhanced capabilities may include H.262, H.264 and High Efficiency Video Coding (HEVC).
![Page 3: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/3.jpg)
HEVC – ENCODER BLOCK DIAGRAM
Encoder block diagram H.264 [18]
Fig 1. – Typical HEVC Encoder[1]
![Page 4: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/4.jpg)
What is 3D/Multi-View/Stereo Video
• Multiple camera views of the same scene are captured: Multi-View Video (MVV).
• Efficient compression techniques are essential because MVV involves a vast amount of data, both in storage and in transmission.
• Inter-view statistical dependencies are exploited through combined temporal/inter-view prediction.
• When each color video is combined with an associated per-sample depth map, we get the Multi-View video plus Depth (MVD) representation.
Fig 2. – Test Sequence –Balloons with depth map [3]
![Page 5: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/5.jpg)
Fig 3. – Overview of the system structure and data format for the transmission of 3D video [12]
![Page 6: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/6.jpg)
MVC FOR MULTI-VIEW PLUS DEPTH
• Encoding and decoding each view of a multi-view data set separately, referred to as simulcast coding, can be done with any video codec, including H.264/AVC and HEVC.
• This is simple but inefficient, because inter-view statistical dependencies are not exploited.
• To exploit all the statistical dependencies within a multi-view data set, inter-view prediction has to be combined with temporal prediction.
• As seen in Fig 4b, in MVC one of the views is coded conventionally, in conformance with the HEVC codec. For coding the other views, already coded pictures of other views at the same time instant can be used as reference pictures, in addition to previously coded pictures of the same view.
• In the Multi-View video plus Depth (MVD) format, only a few views are actually coded. Additional views can be rendered from the transmitted videos and depth maps.
Fig 4a. – Simulcast coding structure with hierarchical B pictures for temporal prediction(black arrows)[2]
Fig 4b. – Multi-view coding structure with hierarchical B pictures for both temporal (black arrows) and inter-view prediction(red arrows) [2]
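The difference between the two structures in Fig 4a and Fig 4b can be sketched as the set of reference candidates available to one picture. This is a minimal illustration, not HM/HTM code: `PicId`, `referenceCandidates` and the simplified "previous N pictures" temporal model are assumptions (real hierarchical-B structures also reference later pictures in display order).

```cpp
#include <cassert>
#include <vector>

// Hypothetical picture identifier (view, time instant); not an HM/HTM type.
struct PicId {
    int view;
    int time;
};

inline bool operator==(const PicId& a, const PicId& b) {
    return a.view == b.view && a.time == b.time;
}

// Candidate reference pictures for the picture at (view, time).
// Temporal candidates: the previous `numTemporal` pictures of the same view
// (a simplification of a hierarchical-B structure).
// Inter-view candidates: pictures of already coded views at the same time
// instant; added only when `interView` is true (multi-view coding).
std::vector<PicId> referenceCandidates(int view, int time, int numTemporal, bool interView) {
    std::vector<PicId> refs;
    for (int k = 1; k <= numTemporal && time - k >= 0; ++k) {
        refs.push_back({view, time - k});
    }
    if (interView) {
        for (int v = 0; v < view; ++v) {  // assumes views are coded in increasing index
            refs.push_back({v, time});
        }
    }
    return refs;
}
```

With `interView` set to false the function models simulcast coding; with it set to true, a dependent view additionally sees the co-located pictures of the views coded before it.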
![Page 7: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/7.jpg)
BASIC 3D VIDEO CODEC STRUCTURE
Fig 5. – Block Diagram of a 3D Video Codec[4]
![Page 8: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/8.jpg)
MVD CODEC- WORKING
• The basic structure of the 3D video codec is shown in the block diagram of Figure 5. In principle, each component signal is coded using an HEVC-based codec. The resulting bit stream packets, or more accurately the resulting Network Abstraction Layer (NAL) units, are multiplexed to form the 3D video bit stream.
• The base (independent) view is coded using an unmodified HEVC codec. The base view sub-stream can be decoded directly by a conventional HEVC decoder.
• For coding the dependent views and the depth data, modified HEVC codecs are used. They are extended with additional coding tools and inter-component prediction techniques that employ already coded data inside the same access unit, as indicated by the red arrows in Figure 5.
• To allow the depth data to be optionally discarded from the bit stream, e.g., to support decoding of a stereo video for conventional stereo displays, the inter-component prediction can be configured so that video pictures can be decoded independently of the depth data.
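The optional discarding of depth data can be pictured as a filter over the multiplexed NAL units. The sketch below is only an illustration under stated assumptions: `NalUnit` and its fields are hypothetical stand-ins, not HTM syntax elements, and the filter is valid only when inter-component prediction is configured so that video pictures do not depend on depth.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for a NAL unit of the 3D bit stream.
// Field names are assumptions for illustration only.
struct NalUnit {
    int  viewId;     // camera position the unit belongs to
    bool isDepth;    // true for depth-map data, false for video
    int  sizeBytes;  // payload size
};

// Keeps only video NAL units with viewId in [0, maxViewId].
// maxViewId = 0 yields the base-view sub-stream; maxViewId = 1 yields a
// stereo video sub-stream without depth.
std::vector<NalUnit> extractVideoSubStream(const std::vector<NalUnit>& in, int maxViewId) {
    std::vector<NalUnit> out;
    for (const NalUnit& nal : in) {
        if (!nal.isDepth && nal.viewId <= maxViewId) {
            out.push_back(nal);
        }
    }
    return out;
}
```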
![Page 9: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/9.jpg)
MVD- CODING ALGORITHM
• The video pictures and, when present, the depth maps are coded access unit by access unit, as illustrated in Figure 6.
• An access unit includes all video pictures and depth maps that correspond to the same time instant. NAL units containing camera parameters may additionally be associated with an access unit.
• The video pictures and depth maps corresponding to a particular camera position are indicated by a view identifier (viewId). All video pictures and depth maps that belong to the same camera position are associated with the same value of viewId.
• Inside an access unit, the video picture and, when present, the associated depth map with viewId equal to 0 are coded first, followed by the video picture and depth map with viewId equal to 1, etc.
• For ordering the reconstructed video pictures and depth maps after decoding, each value of viewId is associated with another identifier called the view order index (VOI). The view order index is a signed integer value that specifies the ordering of the coded views from left to right.
Fig 6. - Access units structure and coding order of view components[12]
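The two orderings described above (coding order by viewId, output order by view order index) can be sketched as follows. `ViewInfo` and `leftToRightViewIds` are hypothetical names, and the sketch assumes that a smaller view order index means a position further to the left.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical description of one coded view; not an HTM data structure.
struct ViewInfo {
    int viewId;          // coding order inside an access unit (0 is coded first)
    int viewOrderIndex;  // signed; assumed here: smaller value = further left
};

// Returns the viewIds arranged from left to right, i.e. sorted by
// view order index rather than by coding order.
std::vector<int> leftToRightViewIds(std::vector<ViewInfo> views) {
    std::sort(views.begin(), views.end(),
              [](const ViewInfo& a, const ViewInfo& b) {
                  return a.viewOrderIndex < b.viewOrderIndex;
              });
    std::vector<int> ids;
    for (const ViewInfo& v : views) {
        ids.push_back(v.viewId);
    }
    return ids;
}
```

For example, if the center view is coded first (viewId 0, VOI 0), followed by the left view (viewId 1, VOI -1) and the right view (viewId 2, VOI 1), the left-to-right arrangement is viewId 1, 0, 2.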
![Page 10: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/10.jpg)
COMPARISON – MVD AND HEVC CODEC
• CODING OF DEPENDENT VIEWS – Additional tools have been integrated into the HEVC codec that use already coded data from other views to represent a dependent view efficiently. These tools include:
  - Disparity-compensated prediction
  - View synthesis based inter-view prediction
  - Post-processing in-loop filtering
  - Inter-view motion prediction
  - Depth-based motion parameter prediction
  - Inter-view residual prediction
  - Adjustment of the texture QP based on depth data
• CODING OF DEPTH MAPS – Some tools are added and some are removed for the coding of depth maps. The differences include:
  - Depth maps are coded in 4:0:0 format
  - Non-linear depth representation
  - Z-near/Z-far compensated weighted prediction
  - Modified motion compensation and motion vector coding (no interpolation is used, i.e., for depth maps the inter-picture prediction is always performed with full-sample accuracy)
  - Disabling of in-loop filtering (deblocking filter and SAO)
  - Depth modeling modes (four new intra-prediction modes)
  - Motion parameter inheritance
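Several of the tools listed above (disparity-compensated prediction, view synthesis based prediction, Z-near/Z-far compensation) rely on the relation between a depth sample, the scene depth and the disparity between two views. The slides do not state this mapping, so the sketch below assumes the inverse-depth quantization commonly used for 8-bit depth maps and rectified, parallel cameras; the function names and parameters are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Maps an 8-bit depth sample v (0..255) to a scene depth Z, assuming uniform
// quantization of 1/Z between 1/zFar (v = 0) and 1/zNear (v = 255).
double depthSampleToZ(int v, double zNear, double zFar) {
    assert(v >= 0 && v <= 255 && zNear > 0.0 && zFar > zNear);
    const double invZ = (v / 255.0) * (1.0 / zNear - 1.0 / zFar) + 1.0 / zFar;
    return 1.0 / invZ;
}

// Horizontal disparity (in samples) between two rectified, parallel cameras.
// focalLength is given in samples, baseline in the same unit as Z.
double disparityFromDepth(int v, double zNear, double zFar,
                          double focalLength, double baseline) {
    return focalLength * baseline / depthSampleToZ(v, zNear, zFar);
}
```

Under these assumptions, near objects (large depth sample values) produce large disparities and far objects produce small ones, which is why a change of Z-near or Z-far between pictures changes the meaning of the stored depth samples.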
![Page 11: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/11.jpg)
THE PROPOSAL
• The aim of the project is to import the MVD coding tools into the HM 9.1.
• The project will be carried out in the following steps:
1. Learn about the tools that distinguish MVD coding from the standard HEVC codec.
2. Study and compare the MVD extensions of HEVC carried out on HM 5.1 by Fraunhofer HHI and Qualcomm (the two most recent ones).
3. Import the MVD tools to HM 9.1 based on the observations from the first two stages.
4. Present the results and report the changes in bitrates and PSNR between the new imported software and the previously studied
![Page 12: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/12.jpg)
RESULTS
• The sequence Balloons (1024 x 768) has been used. The tested configuration is the Balloons sequence with three camera views including depth, with a QP of 30. The same configuration has been used with the sequence Kendo (1024 x 768) as well.
• The encoding is done using both HTM 5.1 and HTM 6.0 for comparison. Finally, the encoding is done using the modified code, which is referred to as “mychanges” henceforth.
• The results are shown in the next few slides.
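The two quality and rate measures reported on the following slides can be computed as sketched below. This is a minimal sketch assuming 8-bit samples and a single picture; the function names are hypothetical, and the exact averaging used by the reference software over frames is not shown in the slides.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Y-PSNR in dB between an original and a reconstructed 8-bit luma plane:
// PSNR = 10 * log10(255^2 / MSE).
double lumaPsnr(const std::vector<std::uint8_t>& orig,
                const std::vector<std::uint8_t>& recon) {
    assert(!orig.empty() && orig.size() == recon.size());
    double sse = 0.0;  // sum of squared errors
    for (std::size_t i = 0; i < orig.size(); ++i) {
        const double d = static_cast<double>(orig[i]) - static_cast<double>(recon[i]);
        sse += d * d;
    }
    if (sse == 0.0) {
        return std::numeric_limits<double>::infinity();  // identical planes
    }
    const double mse = sse / static_cast<double>(orig.size());
    return 10.0 * std::log10(255.0 * 255.0 / mse);
}

// Average bit rate in kbps for a coded component of numFrames pictures.
double bitRateKbps(std::uint64_t totalBits, int numFrames, double frameRate) {
    assert(numFrames > 0 && frameRate > 0.0);
    return static_cast<double>(totalBits) * frameRate / numFrames / 1000.0;
}
```

Applied per component (Video 0, Depth 0, Video 1, and so on), these two functions give the kind of per-component bit rate and Y-PSNR values that the charts compare across encoder versions.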
![Page 13: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/13.jpg)
RESULTS – Encoding Time (sec) Comparison for Balloons Sequence
Fig 7. Encoding Time in Seconds for Balloons Sequence
[Bar chart: encoding time in seconds (axis 2000 to 4500) for HTM 5.1, HTM 6.0 and MyChanges]
![Page 14: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/14.jpg)
RESULTS – Encoding Time (sec) Comparison for Kendo Sequence
Fig 8. Encoding Time in Seconds for Kendo Sequence
[Bar chart: encoding time in seconds (axis 2000 to 4500) for HTM 5.1, HTM 6.0 and MyChanges]
![Page 15: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/15.jpg)
RESULTS – Bit-rate (kbps) Comparison for Balloons Sequence
Fig 9. Bit-Rate (kbps) for Balloons Sequence
[Bar chart: bit rate in kbps (axis 0 to 1800) for Video 0, Depth 0, Video 1, Depth 1, Video 2 and Depth 2; legend: HTM 5.1, HTM 6.0]
![Page 16: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/16.jpg)
RESULTS – Bit-rate (kbps) Comparison for Kendo Sequence
Fig 10. Bit-Rate (kbps) for Kendo Sequence
[Bar chart: bit rate in kbps (axis 0 to 1200) for Video 0, Depth 0, Video 1, Depth 1, Video 2 and Depth 2; legend: HTM 5.1, HTM 6.0]
![Page 17: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/17.jpg)
RESULTS – Y-PSNR (dB) Comparison for Balloons Sequence
Fig 11. Y-PSNR (dB) for Balloons Sequence
[Bar chart: Y-PSNR in dB (axis 30 to 42) for Video 0, Depth 0, Video 1, Depth 1, Video 2 and Depth 2; legend: HTM 5.1, HTM 6.0]
![Page 18: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/18.jpg)
RESULTS – Y-PSNR (dB) Comparison for Kendo Sequence
Fig 12. Y-PSNR (dB) for Kendo Sequence
[Bar chart: Y-PSNR in dB (axis 25 to 45) for Video 0, Depth 0, Video 1, Depth 1, Video 2 and Depth 2; legend: HTM 5.1, HTM 6.0, My-Changes]
![Page 19: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/19.jpg)
FIG 13. -VIDEO USED – BEFORE AND AFTER COMPRESSION – HTM 5.1
Artifacts
![Page 20: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/20.jpg)
FIG 14. - VIDEO USED – BEFORE AND AFTER COMPRESSION – HTM 6.0
ArtifactsArtifacts
![Page 21: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/21.jpg)
FIG 15. - VIDEO USED – BEFORE AND AFTER COMPRESSION – MYCHANGES
Artifacts
![Page 22: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/22.jpg)
FIG 16. - VIDEO USED – BEFORE AND AFTER COMPRESSION – HTM 5.1-KENDO
Artifacts
![Page 23: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/23.jpg)
FIG 17. - VIDEO USED – BEFORE AND AFTER COMPRESSION – HTM 6.0 -KENDO
![Page 24: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/24.jpg)
FIG 18. - VIDEO USED – BEFORE AND AFTER COMPRESSION – MYCHANGES -KENDO
![Page 25: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/25.jpg)
AN EXAMPLE OF DIFFERENCE IN CODING APPROACH
CODING STRUCTURE – HM 9.2

    // coding structure
    Int       m_iIntraPeriod;
    Int       m_iDecodingRefreshType;
    Int       m_iGOPSize;
    Int       m_extraRPSs;
    GOPEntry  m_GOPList[MAX_GOP];
    Int       m_numReorderPics[MAX_TLAYER];
    Int       m_maxDecPicBuffering[MAX_TLAYER];
    Bool      m_bUseLComb;
    Bool      m_useTransformSkip;
    Bool      m_useTransformSkipFast;
    Bool      m_enableAMP;

CODING STRUCTURE – MYCHANGES

    // coding structure
    Int          m_iIntraPeriod;
    Int          m_iDecodingRefreshType;
    Int          m_iGOPSize;
    Int          m_extraRPSs[MAX_VIEW_NUM];
    GOPEntryMvc  m_GOPListsMvc[MAX_VIEW_NUM][MAX_GOP+1];
    Int          m_numReorderPics[MAX_VIEW_NUM][MAX_TLAYER];
    Int          m_maxDecPicBuffering[MAX_VIEW_NUM][MAX_TLAYER];
    Bool         m_bUseLComb;
    Bool         m_useTransformSkip;
    Bool         m_useTransformSkipFast;
    Bool         m_enableAMP;
![Page 26: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/26.jpg)
CONCLUSION
• The goal of porting the 3D compression tools to HM 9.2 has been successfully accomplished.
• From the results, it is clear that the changes made are not completely optimized, and hence the encoding time is the largest of the three encoders compared.
• The large encoding time can be accounted for by the fact that the project achieves a slightly better PSNR for depth map compression than the previous versions used for reference.
![Page 27: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/27.jpg)
FUTURE WORK
• USE OTHER SEQUENCES TO COMPARE AND UNDERSTAND THE RESULTS BETTER.
• GENERATE ADDITIONAL VIEWS FROM THE COMPRESSED VIEWS (RENDERING).
• IMPROVE ENCODING TIME – MANY METHODS USED IN HEVC COMPRESSION CAN BE USED TO IMPROVE ENCODING TIME IN 3D COMPRESSION AS WELL.
• USE PARALLELIZATION TECHNIQUES AND OTHER FAST ENCODING METHODS.
• DEPTH MAPS CAN BE MADE MORE ARTIFACT FREE.
• IN SHORT, 3D COMPRESSION IS A FERTILE FIELD FOR RESEARCH WITH A LOT OF SCOPE.
![Page 28: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/28.jpg)
LIST OF ACRONYMS
AVC: Advanced Video Coding
DIBR: Depth Image Based Rendering
HD: High Definition
HEVC: High Efficiency Video Coding
HHI: Heinrich Hertz Institute
HM: HEVC Test Model
HTM: Test Model for 3D extension of HEVC
IEC: International Electrotechnical Commission
ISO: International Organization for Standardization
ITU-T: International Telecommunication Union – Telecommunication Standardization Sector
MPEG: Moving Picture Experts Group
MC: Motion Compensation
MV: Motion Vector
MVC: Multi-View Coding
MVD: Multi-View plus Depth
MVV: Multi-View Video
NAL: Network Abstraction Layer
PSNR: Peak Signal to Noise Ratio
QP: Quantization Parameter
SAO: Sample Adaptive Offset
VCEG: Video Coding Experts Group
VOI: View Order Index
![Page 29: EE 5359-MULTIMEDIA PROCESSING](https://reader033.vdocuments.mx/reader033/viewer/2022051117/56815e03550346895dcc4b2f/html5/thumbnails/29.jpg)
REFERENCES
[1] G. J. Sullivan, J. Ohm, W.-J. Han and T. Wiegand, "Overview of the High Efficiency Video Coding (HEVC) Standard", IEEE Transactions on Circuits and Systems for Video Technology, Vol. 22, No. 12, pp. 1649-1668, December 2012.
[2] P. Merkle, A. Smolic, K. Müller and T. Wiegand, "Multi-View video plus depth data representation and coding", Picture Coding Symposium, 2007.
[3] Test Sequences: http://www.tanimoto.nuee.nagoya-u.ac.jp/~fukushima/mpegftv/
[4] H. Schwarz et al., "3D Video Coding Using Advanced Prediction, Depth Modeling, and Encoder Control Methods", Picture Coding Symposium, May 2012.
[5] G. Tech, H. Schwarz, K. Müller and T. Wiegand, "Effects of synthesized View Distortion based 3D Video Coding on the Quality of interpolated and extrapolated Views", IEEE Intl. Conf. on Multimedia and Expo, pp. 634-639, July 2012.
[6] P. Merkle et al., "3D Video: Depth Coding Based on Inter-component Prediction of Block Partitions", Picture Coding Symposium, May 2012.
[7] H. Schwarz and T. Wiegand, "Inter-View Prediction of Motion Data in Multiview Video Coding", Picture Coding Symposium, May 2012.
[8] G. Tech, H. Schwarz, K. Müller and T. Wiegand, "3D Video Coding using the Synthesized View Distortion Change", Picture Coding Symposium, May 2012.
[9] M. Winken, H. Schwarz and T. Wiegand, "Motion Vector Inheritance for High Efficiency 3D Video plus Depth Coding", Picture Coding Symposium, May 2012.
[10] S. Bosse, H. Schwarz, T. Hinz and T. Wiegand, "Encoder Control for Renderable Regions in High Efficiency Multiview Video Plus Depth Coding", Picture Coding Symposium, May 2012.
[11] 3D Extension Software Repository: https://hevc.hhi.fraunhofer.de/svn/svn_3DVCSoftware/
[12] "Test Model under Consideration for HEVC based 3D video coding", ISO/IEC JTC1/SC29/WG11 MPEG2011/N12559, February 2012, San Jose, CA, USA.
[13] HM Software Repository: https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/
[14] HEVC Text Specification Draft 9: http://phenix.int-evry.fr/jct/doc_end_user/current_document.php?id=6803
[15] H.264/AVC reference website: http://www.itu.int/rec/T-REC-H.264-201003-I