Nayana Parashar, Multimedia Processing Lab, University of Texas at Arlington. Supervising Professor: Dr. K.R. Rao. November 25th, 2013. IMPLEMENTATION OF AN OUT-OF-THE-LOOP POST-PROCESSING TECHNIQUE FOR HEVC DECODED DEPTH-MAPS


Page 1: Title slide

Nayana Parashar
Multimedia Processing Lab, University of Texas at Arlington
Supervising Professor: Dr. K.R. Rao
November 25th, 2013

IMPLEMENTATION OF AN OUT-OF-THE-LOOP POST-PROCESSING TECHNIQUE FOR HEVC DECODED DEPTH-MAPS

Page 2: CONTENTS

1. BASIC CONCEPTS
2. VIDEO COMPRESSION
3. 3D VIDEO COMPRESSION
4. THESIS-WORK
5. RESULTS
6. CONCLUSIONS
7. FUTURE-WORK
8. REFERENCES

Page 3: THESIS IN A NUTSHELL

Normal procedure:
3D VIDEO ENCODING (color sequence & corresponding depth-map) → 3D VIDEO DECODING (color sequence & corresponding depth-map) → VIEW RENDERING for DISPLAY (stereoscopic or multi-view)

Thesis:
3D VIDEO ENCODING (color sequence & corresponding depth-map) → 3D VIDEO DECODING (color sequence & corresponding depth-map) → post-processing of the decoded depth-map → VIEW RENDERING for DISPLAY (stereoscopic or multi-view)

Motivation: compression artifact removal and better perceptual quality of rendered frames.

Page 4: BASIC CONCEPTS

Page 5: Image and video

Images and video make up visual media.
An image is characterized by pixels (pels), the smallest addressable elements in a display device.
Properties of an image: number of pixels (height and width), and the color and brightness of each pixel.
Video is composed of a sequence of pictures (frames) taken at regular time (temporal) intervals.

Figure 1: 2D image with spatial samples (L) and video with N frames (R) [1]

Page 6: 3D video – multi-view video plus depth format

The multi-view video plus depth (MVD) format [2][3] is the most promising format for enhanced 3D visual experiences.
This representation provides, for each viewpoint, a texture (image sequence) and an associated depth-map sequence (fig. 2).

Figure 2: Color video frame (L) and associated depth map frame (R) [4]

Page 7: Depth-maps

Depth maps represent the per-pixel depth of a corresponding color image and signal the disparity information needed by the virtual (novel) view rendering system.
For storage and transmission they are represented as a gray-scale image sequence.
In a depth map, each pixel conveys the relative distance from the camera to the object in 3D space.
Efficient compression and transmission of depth maps to the decoder is important for view generation.
Depth maps are never actually displayed; they are used for view generation purposes only.

Page 8: Depth Image Based Rendering (DIBR) [5]

DIBR is the process of synthesizing "virtual" views of a scene from still or moving images and associated per-pixel depth information.
It is a two-step process:
1. The original image points are reprojected into the 3D world, using the respective depth data.
2. The 3D space points are projected into the image plane of a "virtual" camera located at the required viewing position.

Stereoscopic view generation: two (left and right) views are generated.
Multiple view generation: more than two views are generated, each corresponding to the scene viewed from a different angle.
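The two-step process above can be sketched for the simple case of a pinhole camera whose virtual counterpart is translated along the baseline. The function name, the forward-warping strategy, and the camera parameters (f, cx, cy, tx) are illustrative assumptions, not the thesis implementation:

```python
import numpy as np

def dibr_warp(image, depth_m, f, cx, cy, tx):
    """Warp `image` to a virtual camera translated by `tx` metres along
    the baseline, using metric per-pixel depth `depth_m` (pinhole model).
    Step 1: reproject each pixel into 3D; step 2: project into the
    virtual camera. Forward warping; holes are left as zeros."""
    h, w = depth_m.shape
    out = np.zeros_like(image)
    ys, xs = np.mgrid[0:h, 0:w]
    # Step 1: back-project pixels to 3D world coordinates.
    Z = depth_m
    X = (xs - cx) * Z / f
    Y = (ys - cy) * Z / f   # kept for completeness of the 3D point
    # Step 2: project into the virtual camera's image plane.
    x_v = np.round(f * (X - tx) / Z + cx).astype(int)
    valid = (x_v >= 0) & (x_v < w)
    out[ys[valid], x_v[valid]] = image[ys[valid], xs[valid]]
    return out
```

With constant depth, every pixel shifts by the same amount (f·tx/Z), which is the degenerate case; real depth maps produce depth-dependent parallax.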

Page 9: Stereoscopic view rendering

A color image and its per-pixel depth map can be used to generate virtual stereoscopic views, as shown in fig. 3.
In this process, the original image point at location (x, y) is transferred to new locations (xL, y) and (xR, y) for the left and right view respectively.

Figure 3: Virtual view generation in the Depth Image Based Rendering (DIBR) process [6]
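In code, the horizontal relocation of each point (x, y) to (xL, y) and (xR, y) might look as follows. The linear depth-to-disparity mapping and the d_near/d_far disparity limits (in pixels) are simplifying assumptions; the exact mapping follows from xB, D, knear, kfar and Npix as in [5][6], and no hole filling is done here:

```python
import numpy as np

def render_stereo(color, depth8, d_near=10.0, d_far=-2.0):
    """Generate a left/right view pair by shifting each pixel of `color`
    horizontally according to its 8-bit depth value. The linear mapping
    from depth to pixel disparity is an illustrative assumption."""
    h, w = depth8.shape
    disparity = depth8 / 255.0 * (d_near - d_far) + d_far   # pixels
    left = np.zeros_like(color)
    right = np.zeros_like(color)
    ys, xs = np.mgrid[0:h, 0:w]
    xl = np.clip(np.round(xs + disparity / 2).astype(int), 0, w - 1)
    xr = np.clip(np.round(xs - disparity / 2).astype(int), 0, w - 1)
    left[ys, xl] = color[ys, xs]    # original point (x, y) -> (xL, y)
    right[ys, xr] = color[ys, xs]   # original point (x, y) -> (xR, y)
    return left, right
```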

Page 10: VIDEO COMPRESSION

Page 11: Introduction

Data compression: the science of representing information in a compact format.
Common image/video compression techniques reduce the number of bits required to represent an image or video sequence (lossy or lossless).
Video compression strategies: spatial, temporal and bit-stream redundancies are exploited, and high-frequency components are removed.
Many organizations have produced a number of video compression codecs over the years [1].
High Efficiency Video Coding (HEVC) is the most recent video compression standard.

Page 12: HEVC overview [13][14]

Successor to the H.264/AVC video compression standard.
Multiple goals:
- improved coding efficiency
- ease of transport system integration
- data loss resilience
- implementability on parallel processing architectures

The complexity of some key modules such as transforms, intra prediction, and motion compensation is higher in HEVC than in H.264/AVC; the complexity of modules such as entropy coding and deblocking is lower in HEVC than in H.264/AVC [15].

Page 13: HEVC encoder – block diagram

LEGEND:
- High-frequency content removal
- Spatial redundancy exploitation
- Temporal redundancy exploitation
- Bit-stream redundancy exploitation
- Sharp edge smoothing

Figure 4: HEVC encoder block-diagram [13]

Page 14: 3D VIDEO COMPRESSION

Page 15: The depth-map dilemma

Compression of depth-maps is a challenge.
The quantization process eliminates high spatial frequencies in individual frames.
The resulting compression artifacts have adverse consequences for the quality of the rendered views.
It is highly important to preserve the sharp depth discontinuities present in depth maps for high-quality virtual view generation.

Two solutions exist to this dilemma.

Page 16: The two approaches to 3D compression

Approach one: use novel video compression techniques designed for 3D video, with special features added to overcome the depth-map dilemma. E.g. 3D video coding in H.264/AVC [16], the 3D video extension of HEVC [17][18][19].
Advantages: features specific to 3D video are exploited (inter-view prediction); dedicated blocks for depth-map compression in the codec.
Disadvantages: very complex, both in general codec structure and in encoding time.

Approach two: use already existing codecs to encode and decode the sequences, then apply image denoising techniques [20] to the decoded depth-maps to remove compression artifacts.
Advantages: not as complicated and complex as approach one; uses existing video codecs without any modification.
Disadvantages: there is never one right denoising solution.

Page 17: THESIS-WORK

Page 18: Scope and premises

This thesis falls under the second approach to 3D video compression described above.
Little research has been done on applying image denoising techniques to HEVC decoded depth-maps.
A post-processing framework based on analysis of how compression artifacts affect the generation of virtual views is used.
The framework applies a spatial filtering technique, specifically depth discontinuity analysis followed by an edge-adaptive joint trilateral filter (EA-JTF) [6], to reduce compression artifacts.
It effectively reduces the compression artifacts in HEVC decoded depth-maps.
It improves the perceptual quality of rendered views without using a depth-map-specific video codec.

Page 19: Algorithm – block diagram

Original depth map → Encoder/Decoder → Compressed depth map
(a) Depth discontinuity analysis (inputs: compressed depth map, corresponding color image) → Binary mask
(b) Edge-adaptive joint trilateral filter (inputs: compressed depth map, corresponding color image, binary mask) → Reconstructed depth map

Figure 5: Block-diagram of the algorithm used for depth-map enhancement

Page 20: Step (a): Depth discontinuity analysis [6]

The purpose is twofold:
1) Identify the areas that have aligned edges in the color image and the corresponding depth map. The filter kernels of the EA-JTF are adaptively selected based on this information.
2) Identify all depth discontinuities that are significant in terms of rendering.

Sub-steps:
The depth map is convolved with a vertical Sobel filter to obtain Gx.
An edge mask Ed is derived using Eq. (1.1), which marks pixel locations of significant depth discontinuities:

Ed(x, y) = 1 if |Gx(x, y)| ≥ Δd,th, and 0 otherwise    (1.1)

where Δd,th is a theoretical threshold obtained by studying the effect of compression artifacts on view rendering. It depends on:
- xB: distance between the left and right virtual cameras, i.e. eye separation (assumed to be 6 cm)
- D: viewing distance (assumed to be 250 cm)
- knear and kfar: range of the depth information respectively behind and in front of the picture, relative to the screen width
- Npix: screen width measured in pixels
- 8-bit images are considered (hence the constant 255)
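A sketch of these sub-steps, with SciPy standing in for the MATLAB implementation; the numeric `threshold` argument is a stand-in for the rendering-derived Δd,th of Eq. (1.1):

```python
import numpy as np
from scipy.ndimage import convolve

def depth_edge_mask(depth, threshold):
    """Convolve the depth map with a vertical (x-gradient) Sobel kernel
    to obtain Gx, then mark pixels whose |Gx| reaches `threshold` as
    significant depth discontinuities (the binary mask Ed)."""
    sobel_x = np.array([[-1, 0, 1],
                        [-2, 0, 2],
                        [-1, 0, 1]], dtype=float)  # detects vertical edges
    gx = convolve(depth.astype(float), sobel_x, mode='nearest')
    return (np.abs(gx) >= threshold).astype(np.uint8)
```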

Page 21: Step (a): (contd.)

To identify the regions in which the color edges and depth discontinuities are aligned, an edge mask Ec of the color image is generated by the Canny edge detection algorithm. Using Ed and Ec, the binary mask Es signifying the aligned edge areas is obtained as:

Es = (Ed ⊕ S1) ∧ (Ec ⊕ S2)    (1.2)

where ⊕ represents morphological dilation and S1 and S2 represent flat square structuring elements of size 2 and 7 respectively.

The different stages of step (a) are shown in figure 6.
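Step (a) can then be completed with standard morphological operations; here scipy's binary_dilation stands in for the MATLAB code, and the pairing of the size-2 and size-7 structuring elements with Ed and Ec respectively is an assumption about [6]:

```python
import numpy as np
from scipy.ndimage import binary_dilation

def aligned_edge_mask(ed, ec, s1=2, s2=7):
    """Dilate the depth-edge mask Ed with a flat square structuring
    element S1 and the Canny color-edge mask Ec with S2, then intersect
    the results to get Es, the aligned-edge mask of Eq. (1.2)."""
    d1 = binary_dilation(ed.astype(bool), structure=np.ones((s1, s1)))
    d2 = binary_dilation(ec.astype(bool), structure=np.ones((s2, s2)))
    return (d1 & d2).astype(np.uint8)  # Es = 1 where edges are aligned
```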

Page 22:

Figure 6: Illustration of depth discontinuity analysis

Page 23: Step (b): Edge-adaptive joint trilateral filter

The edge-adaptive joint trilateral filter [6] is based on the bilateral filter and the joint trilateral filter [7][8][9][10][11][12].

For a pixel position p, the filtered result F is given by Eq. (2.1):

F_p = ( Σ_q w_pq · I_q ) / ( Σ_q w_pq )    (2.1)

where I_q is the value at pixel position q in the kernel neighborhood. The filter weight w_pq at pixel position q is calculated as:

w_pq = c(p, q) · s_t(I_p, I_q)    (2.2)

The similarity filter kernel s_t of the joint trilateral filter is adaptively selected as given in Eq. (2.3). For the areas where the edges between the color image and the corresponding depth map are aligned (i.e. Es from Eq. (1.2) = 1), two similarity filter kernels are used, one derived from the compressed depth map (s) and one from the color image (s_j). For the remaining area, only the similarity filter kernel derived from the compressed depth map is used:

s_t = s · s_j if Es(p) = 1, and s otherwise    (2.3)

Both c and s are popularly implemented as Gaussians centered at p and I_p (the value at pixel position p), with standard deviations σc and σs respectively:

c(p, q) = exp(−‖p − q‖² / (2σc²))    (2.4)

s(I_p, I_q) = exp(−(I_p − I_q)² / (2σs²))    (2.5)
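Eqs. (2.1)-(2.5) translate into a direct (and deliberately slow) per-pixel sketch. Inputs are assumed normalized to [0, 1], and the σ values follow the parameter table later in the deck; this is an illustrative re-implementation, not the thesis MATLAB code:

```python
import numpy as np

def ea_jtf(depth, color, es, size=15, sigma_c=45.0, sigma_d=0.036, sigma_col=0.025):
    """Edge-adaptive joint trilateral filter (Eqs. 2.1-2.5), brute force.
    depth/color are float images in [0, 1]; es is the binary mask of
    Eq. (1.2). Where es = 1 the similarity kernel is the product of the
    depth-derived and color-derived kernels; elsewhere only the
    depth-derived kernel is used."""
    h, w = depth.shape
    r = size // 2
    ys, xs = np.mgrid[-r:r + 1, -r:r + 1]
    closeness = np.exp(-(xs**2 + ys**2) / (2 * sigma_c**2))    # c(p,q), Eq. (2.4)
    dpad = np.pad(depth, r, mode='edge')
    cpad = np.pad(color, r, mode='edge')
    out = np.empty_like(depth)
    for y in range(h):
        for x in range(w):
            dwin = dpad[y:y + size, x:x + size]
            s = np.exp(-(dwin - depth[y, x])**2 / (2 * sigma_d**2))   # s, Eq. (2.5)
            if es[y, x]:                                              # Eq. (2.3)
                cwin = cpad[y:y + size, x:x + size]
                s = s * np.exp(-(cwin - color[y, x])**2 / (2 * sigma_col**2))  # s_j
            wgt = closeness * s                                       # w_pq, Eq. (2.2)
            out[y, x] = (wgt * dwin).sum() / wgt.sum()                # F_p, Eq. (2.1)
    return out
```

Note the sanity property: on a perfectly flat depth map the weighted average reproduces the input, since the normalization in Eq. (2.1) makes the weights sum to one.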

Page 24: Step (c): Stereoscopic view rendering

The reconstructed depth-map from step (b) is used to generate left-side and right-side views using the stereoscopic view rendering process [21][22][27].

Finally, the frames obtained using the uncompressed depth-map, the HEVC decoded depth-map, and the HEVC decoded depth-map after post-processing are compared using the metrics PSNR, SSIM [24] and an approximation of the Mean Opinion Score [25] for image quality.

Page 25: RESULTS

Page 26: Results: Experimental set-up

To evaluate the performance of the EA-JTF [6] on HEVC decoded depth maps, color sequences along with the corresponding depth maps were compressed using the HEVC reference software HM 9.2 [26]. MATLAB R2013a (student version) was used for filtering and rendering.

For all sequences other than Ballet, a single-frame result is obtained at QP = 32. For Ballet, a 15-frame sequence at a frame rate of 3 frames/sec is used.

Three different rendered images are obtained:
1) Original image and the corresponding depth map (original)
2) HEVC decoded image and the corresponding decoded depth-map (compressed)
3) HEVC decoded image and the depth-map after post-processing (post-processed)

PSNR, SSIM [24] and an approximate Mean Opinion Score (MOS) [25] were used to evaluate the perceptual quality of the rendered views.
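Of the three metrics, PSNR is simple enough to sketch inline (SSIM [24] and the MOS procedure [25] need more machinery); this helper is an illustrative stand-in for the MATLAB computation:

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference frame and a
    rendered test frame, assuming an 8-bit pixel range by default."""
    mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```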

Page 27: Results: Input parameters

Parameter | Value
Viewing distance (D) | 250 cm (assumed)
Eye separation (xB) | 6 cm (assumed)
Screen width in pixels (Npix) | 1366 (for the laptop used for experimentation)
knear, kfar | knear = 44.00, kfar = 120.00 (BreakDancer); knear = 42.00, kfar = 130.00 (Ballet); knear = 448.25, kfar = 11206.28 (Balloons); knear = 448.25, kfar = 11206.28 (Kendo)
Resolution of the video sequences | 1024 x 768
EA-JTF | Kernel size: 15 x 15 pixels; standard deviation for the color similarity filter (σs) = 0.025 (normalized range 0-1); standard deviation for the depth similarity filter (σj) = 0.036 (normalized range 0-1); standard deviation for the closeness filter (σc) = 45

Page 28: Results: Break-dancer sequence

Original sequence obtained from Microsoft Research [23].
An increase in both PSNR and SSIM is seen.
High-quality rendering, as the original depth-maps were generated using computer vision algorithms.
A grayscale version of the sequence was used for the approximate MOS calculation. Here too, the post-processed method had better ratings than the compressed one.

Metric | Decoded (left view) | Processed (left view) | Decoded (right view) | Processed (right view)
PSNR (dB) | 41.9401 | 41.9804 | 41.9401 | 41.9804
SSIM | 0.9133 | 0.9139 | 0.9133 | 0.9139

Image | MOS rating (max = 3)
Original | 2.6
Decoded | 1.5
Processed | 1.9

Page 29: Results: Ballet sequence

Original sequence obtained from Microsoft Research [23].
An increase in both PSNR and SSIM is seen.
High-quality rendering, as the original depth-maps were generated using computer vision algorithms.
Sequence not used for MOS calculation.

Metric | Decoded (left view) | Processed (left view) | Decoded (right view) | Processed (right view)
PSNR (dB) | 42.7317 | 42.787 | 42.7317 | 42.787
SSIM | 0.9413 | 0.9444 | 0.9413 | 0.9444

Page 30: Results: Kendo sequence

Original sequence obtained from [4].
A very interesting sequence: there is not much edge information, so the original, post-processed and compressed views are all extremely similar perceptually.
However, there is a slight decrease in PSNR, while SSIM turned out to be exactly equal.
On the other hand, in the MOS calculation the post-processed frame performed better than the compressed frame.

Metric | Decoded (left view) | Processed (left view) | Decoded (right view) | Processed (right view)
PSNR (dB) | 45.7213 | 45.0551 | 45.7213 | 45.0551
SSIM | 0.9887 | 0.9887 | 0.9887 | 0.9887

Image | MOS rating (max = 3)
Original | 2.2
Decoded | 1.7
Processed | 2.1

Page 31: Results: Balloons sequence

Original sequence obtained from [4].
The compressed views have better PSNR and SSIM than the processed ones.
This can be attributed to the fact that the views rendered from the original sequence are themselves not optimal, due to noise in the original depth.
The proposed solution improves the perceptual quality to a great extent.
In the MOS calculation, the post-processed frame performed better than the compressed frame.

Metric | Decoded (left view) | Processed (left view) | Decoded (right view) | Processed (right view)
PSNR (dB) | 44.2039 | 43.209 | 44.2039 | 43.209
SSIM | 0.981 | 0.9798 | 0.981 | 0.9798

Image | MOS rating (max = 3)
Original | 2.4
Decoded | 1.0
Processed | 2.5

Page 32: CONCLUSIONS

Page 33: Conclusions

The quality of rendered views (stereoscopic rendering) generated using HEVC decoded depth-maps was improved.
Four multi-view plus depth sequences were used to carry out experiments.
There was an improvement in PSNR as well as SSIM for two sequences, Break-dancer and Ballet. The Break-dancer sequence saw an improvement of 0.04 dB in PSNR and 0.0006 in SSIM; Ballet saw an improvement of 0.055 dB in PSNR and 0.0031 in SSIM. For the Kendo sequence PSNR decreased slightly while SSIM remained constant (not much edge information); for the Balloons sequence there was no improvement in either PSNR or SSIM.
However, the main improvement brought about by this method was in the perceptual quality of the rendered views. An approximate MOS survey suggested that the views rendered after post-processing were always perceptually better than the ones rendered without post-processing. In this regard, all four test sequences showed improvement in perceptual quality.

Page 34: FUTURE-WORK

Page 35: Future work

Improve the filter design to provide more significant gains.
Move beyond stereoscopic rendering to multi-view rendering.
The method could be made in-loop and merged into the HEVC compression codec.
The current work used SSIM and an approximation of the Mean Opinion Score to assess perceptual quality; more research into perceptual quality assessment for depth-maps and rendered views would be useful.

Page 36: REFERENCES

Page 37: References

1. K.R. Rao, D.N. Kim and J.J. Hwang, "Video coding standards: AVS China, H.264/MPEG4-Part 10, HEVC, VP6, DIRAC and VC-1", Springer, 2014.
2. D.K. Shah, et al., "Evaluating multi-view plus depth coding solutions for 3D video scenarios", 3DTV-Conference: The True Vision – Capture, Transmission and Display of 3D Video (3DTV-CON), pp. 1-4, 15-17 Oct. 2012.
3. Fraunhofer HHI, 3D video coding information: http://www.hhi.fraunhofer.de/fields-of-competence/image-processing/research-groups/image-video-coding/3d-hevc-extension.html
4. Balloons and Kendo test sequences: http://www.tanimoto.nuee.nagoya-u.ac.jp/~fukushima/mpegftv/
5. C. Fehn, "A 3D-TV system based on video plus depth information", Conference Record of the Thirty-Seventh Asilomar Conference on Signals, Systems and Computers, vol. 2, pp. 1529-1533, 9-12 Nov. 2003.
6. D.V.S. De Silva, et al., "A depth map post-processing framework for 3D-TV systems based on compression artifact analysis", IEEE Journal of Selected Topics in Signal Processing, vol. PP, no. 99, pp. 1-30, 2011.
7. C. Tomasi and R. Manduchi, "Bilateral filtering for gray and color images", IEEE International Conference on Computer Vision, Washington DC, USA, pp. 839-846, 1998.
8. E. Eisemann and F. Durand, "Flash photography enhancement via intrinsic relighting", ACM Transactions on Graphics (TOG), vol. 23, no. 3, pp. 673-678, 2004.
9. G. Petschnigg, et al., "Digital photography with flash and no-flash image pairs", ACM Transactions on Graphics (TOG), vol. 23, no. 3, pp. 664-672, 2004.
10. B. Zhang and J. Allebach, "Adaptive bilateral filter for sharpness enhancement and noise removal", IEEE Transactions on Image Processing, vol. 17, no. 5, pp. 664-678, 2008.
11. P. Choudhury and J. Tumblin, "The trilateral filter for high contrast images and meshes", ACM SIGGRAPH 2005 Courses, ACM, 2005.
12. S. Liu, P. Lai, D. Tian, C. Gomila and C.W. Chen, "Joint trilateral filtering for depth map compression", Huangshan, China, 2010, paper 77440F.
13. G.J. Sullivan, J. Ohm, W.-J. Han and T. Wiegand, "Overview of the High Efficiency Video Coding (HEVC) standard", IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1649-1668, Dec. 2012.
14. HEVC text specification draft 10: http://phenix.it-sudparis.eu/jct/doc_end_user/current_document.php?id=7243

Page 38: References (contd.)

15. F. Bossen, et al., "HEVC complexity and implementation analysis", IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1685-1696, Dec. 2012.
16. 3DV for H.264: http://mpeg.chiariglione.org/technologies/general/mp-3dv/index.htm
17. Fraunhofer HHI, 3D video coding information: http://www.hhi.fraunhofer.de/fields-of-competence/image-processing/research-groups/image-video-coding/3d-hevc-extension.html
18. P. Merkle, A. Smolic, K. Müller and T. Wiegand, "Multi-view video plus depth data representation and coding", Picture Coding Symposium, 2007.
19. "Test Model under Consideration for HEVC based 3D video coding", ISO/IEC JTC1/SC29/WG11 MPEG2011/N12559, San Jose, CA, USA, Feb. 2012.
20. M.C. Motwani, et al., "A survey of image denoising techniques", Proceedings of GSPx 2004, Santa Clara, CA: http://www.cse.unr.edu/~fredh/papers/conf/034-asoidt/paper.pdf
21. ISO/IEC JTC1/SC29/WG11, "Proposed experimental conditions for EE4 in MPEG 3DAV", WG 11 doc. m9016, Shanghai, Oct. 2002.
22. C. Fehn, "Depth-image-based rendering (DIBR), compression and transmission for a new approach on 3D-TV", Proceedings of the SPIE, vol. 5291, p. 93, 2004.
23. Break-Dancers and Ballet sequences: http://research.microsoft.com/en-us/um/people/sbkang/3dvideodownload/
24. Z. Wang, A.C. Bovik, H.R. Sheikh and E.P. Simoncelli, "Image quality assessment: from error visibility to structural similarity", IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600-612, Apr. 2004.
25. L. Ma, et al., "Image retargeting quality assessment: a study of subjective scores and objective metrics", IEEE Journal of Selected Topics in Signal Processing, vol. 6, no. 6, pp. 626-639, Oct. 2012.
26. HEVC reference software (HM 9.2): https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/branches/HM-9.2-dev/
27. MATLAB code for stereoscopic view rendering: http://www.mathworks.com/matlabcentral/fileexchange/27538-depth-image-based-stereoscopic-view-rendering

Page 39: THANK YOU!

QUESTIONS?