Copyright © 2016, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.

Chapter 13

GPU Based Image Quality Assessment using Structural Similarity (SSIM) Index

Mahesh Satish Khadtare
Pune University, Maharashtra, India

DOI: 10.4018/978-1-4666-8853-7.ch013

ABSTRACT

This chapter deals with the performance analysis of a CUDA implementation of an image quality assessment tool based on the structural similarity (SSIM) index. Since it was first created at the University of Texas in 2002, the Structural SIMilarity (SSIM) image assessment algorithm has become a valuable tool for still image and video processing analysis. SSIM provided a big advance over the MSE (Mean Square Error) and PSNR (Peak Signal to Noise Ratio) techniques because it aligns much more closely with the results that would be obtained with subjective testing. For objective image analysis, this new technique represents as significant an advancement over SSIM as the advancement that SSIM provided over PSNR. The method is computationally intensive, which poses problems wherever real-time quality assessment is desired. We develop a CUDA implementation of this technique that offers a speedup of approximately 30x on an Nvidia GTX275 and 80x on a C2050 over a single-core Intel processor.

INTRODUCTION

Structural Similarity Index Measurement (SSIM) is an image quality assessment method that compares a test image with a reference image to find similarities between the two images. This method was proposed as an improvement over traditional methods like PSNR and MSE, which have proved to be inconsistent with the human visual system (HVS). A typical image quality assessment method is shown in Figure 1 (Zujovic et al., 2009).

The SSIM metric is calculated by taking two NxN regions, x and y, one from each of the two images, and computing the metric

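$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)} \tag{1}$$

where

$\mu_x$ is the mean of x; $\mu_y$ is the mean of y;
$\sigma_x^2$ is the variance of x; $\sigma_y^2$ is the variance of y;
$\sigma_{xy}$ is the covariance of x and y;
$C_1 = (K_1 L)^2$ and $C_2 = (K_2 L)^2$ are two variables that stabilize the division when the denominator is weak;
$L$ is the dynamic range of the pixel values (typically $2^{\#\text{bits per pixel}} - 1$);
$K_1 = 0.01$ and $K_2 = 0.03$ by default.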
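As a worked instance of these definitions, and assuming 8-bit grayscale pixels (an assumption; the chapter does not state the bit depth of the test images), the dynamic range and the two stabilizing constants evaluate to:

$$L = 2^{8} - 1 = 255, \qquad C_1 = (0.01 \times 255)^2 = 6.5025, \qquad C_2 = (0.03 \times 255)^2 = 58.5225$$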

The NxN regions are shifted around the image pixel by pixel to cover the whole image, and the final SSIM is obtained by summing the SSIM values of all the regions and normalizing by the number of regions. The resultant SSIM index is a decimal value between -1 and 1; the value 1 is reached only when the test image and the reference image are identical. The region size is typically taken to be 8x8 or 16x16 (Singh et al., 2011).
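The computation described above can be sketched in C as follows. This is a minimal CPU sketch under stated assumptions, not the chapter's implementation: the function names `ssim_region` and `ssim_mean_sliding`, the row-major 8-bit grayscale layout, the population (divide by N*N) estimates of variance and covariance, and the averaging of the per-region values into the final index are all choices made here for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* SSIM of one n x n region, following Equation (1).
 * x, y   : top-left pixel of the region in each 8-bit grayscale image
 * stride : row length of the full image in pixels (assumed row-major layout) */
double ssim_region(const unsigned char *x, const unsigned char *y,
                   size_t stride, size_t n)
{
    const double L  = 255.0;                     /* dynamic range, 8 bits/pixel (assumption) */
    const double C1 = (0.01 * L) * (0.01 * L);   /* (K1 * L)^2 */
    const double C2 = (0.03 * L) * (0.03 * L);   /* (K2 * L)^2 */
    const double count = (double)(n * n);
    double mx = 0.0, my = 0.0;

    assert(n > 0);

    /* First pass: means of both regions. */
    for (size_t r = 0; r < n; ++r)
        for (size_t c = 0; c < n; ++c) {
            mx += x[r * stride + c];
            my += y[r * stride + c];
        }
    mx /= count;
    my /= count;

    /* Second pass: variances and covariance around those means. */
    double vx = 0.0, vy = 0.0, cov = 0.0;
    for (size_t r = 0; r < n; ++r)
        for (size_t c = 0; c < n; ++c) {
            const double dx = x[r * stride + c] - mx;
            const double dy = y[r * stride + c] - my;
            vx  += dx * dx;
            vy  += dy * dy;
            cov += dx * dy;
        }
    vx /= count;
    vy /= count;
    cov /= count;

    return ((2.0 * mx * my + C1) * (2.0 * cov + C2)) /
           ((mx * mx + my * my + C1) * (vx + vy + C2));
}

/* Mean SSIM over every n x n window, shifted pixel by pixel. */
double ssim_mean_sliding(const unsigned char *ref, const unsigned char *test,
                         size_t width, size_t height, size_t n)
{
    assert(n > 0 && n <= width && n <= height);

    double sum = 0.0;
    size_t windows = 0;
    for (size_t r = 0; r + n <= height; ++r)
        for (size_t c = 0; c + n <= width; ++c) {
            const size_t offset = r * width + c;
            sum += ssim_region(ref + offset, test + offset, width, n);
            ++windows;
        }
    return sum / (double)windows;
}
```

Called on two identical buffers, every window contributes 1 and so does the mean; any distortion of the test image lowers the result. The quadruple loop also makes the cost visible: each of the roughly width x height windows touches N*N pixels twice.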

Image quality assessment is an important step in many image restoration applications such as image denoising, image deblurring, and image inpainting. It is also an important step in video codecs, where a block-based approach is followed for video compression.

It is clear from the definition of SSIM that it is a computationally intensive method. However, in many cases we may require a real-time or otherwise faster implementation. It is also clear from the definition that the SSIM computations of two different regions are independent of each other and can be done in parallel. This kind of parallelism is well suited to GPU architectures, where each streaming multiprocessor (SM) works independently of the other SMs (Shrivastav et al., 2011).

In this chapter, we report a performance evaluation of a CUDA implementation of the SSIM-based image quality assessment tool. For this purpose we have written a C implementation for a single-core Intel processor and a CUDA implementation for the Nvidia GTX275 and C2050. We have compared the two implementations on a database of six images of various sizes, taking region sizes of 8x8 and 16x16 (Wang et al., 2004).

Figure 1. Flow diagram of image quality assessment

Technical Background of CUDA

CUDA technology gives computationally intensive applications access to the tremendous processing power of recent GPUs through a C-like programming interface. The GPU is especially well suited to problems that can be expressed as data-parallel computations with high arithmetic intensity. Because the same program is executed for each data element, there is a lower requirement for sophisticated control; and because it is executed on many data elements and has high arithmetic intensity, the memory access latency can be hidden with calculations instead of big data caches. Data-parallel processing maps data elements to parallel processing threads (NVIDIA, 2012; Wang et al., 2002).

In this chapter, we use a GTX275, which has 240 CUDA cores, a 633 MHz graphics clock, a 1404 MHz processor clock, and 896 MB of GDDR3 RAM, and a Fermi-based C2050, which has 448 CUDA cores, a 1.15 GHz CUDA core frequency, and 3 GB of GDDR5 memory.

Implementation Details

In order to exploit the inherent parallelism in the computation of the SSIM metric for different regions, the images are split into NxN regions, as shown in Figure 2, and the computation for each region is done by one CUDA block (NVIDIA, 2012). To reduce the computational complexity further, pixel-by-pixel shifting of the regions is not performed; only the regions shown in Figure 2 are considered in the final SSIM calculation.

Since the SSIM metric requires the mean and variance of both images, the computations for the image blocks at the same location in both images are performed by one CUDA block, which avoids data dependence among different CUDA blocks. Within one CUDA block, warp independence is exploited by assigning different warps to the computation of different image parameters. Coalesced access to the device memory is ensured by using the cudaMallocPitch() API for image buffer allocation on the device. Shared memory optimization is applied by copying pixel values to the shared memory. The SSIM for each region is computed by one CUDA block and written back to the device memory. The SSIM sum is then computed by launching one more kernel (Zujovic et al., 2009; Aswathappa & Rao, 2010).

Figure 2. 8 x 8 block for image division
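The decomposition described above can be mimicked sequentially in C to make the data flow concrete. This is a CPU analogue under stated assumptions, not the chapter's kernel: each iteration of the tile loop stands in for one CUDA block, `pitch` imitates the padded row length that cudaMallocPitch() returns, the five running sums stand in for the image parameters that separate warps could accumulate, and the second function plays the role of the follow-up reduction kernel. The names `tile_ssim`, `ssim_tiles`, and `ssim_reduce`, the 8-bit constants, and the normalization by the tile count are hypothetical choices.

```c
#include <assert.h>
#include <stddef.h>

/* SSIM of one n x n tile from five running sums (x, y, x^2, y^2, x*y).
 * pitch is the distance in bytes between consecutive rows; it may exceed
 * the image width when rows are padded for alignment. */
static double tile_ssim(const unsigned char *x, const unsigned char *y,
                        size_t pitch, size_t n)
{
    const double C1 = 6.5025, C2 = 58.5225;  /* (0.01*255)^2 and (0.03*255)^2 */
    double sx = 0.0, sy = 0.0, sxx = 0.0, syy = 0.0, sxy = 0.0;

    for (size_t r = 0; r < n; ++r)
        for (size_t c = 0; c < n; ++c) {
            const double a = x[r * pitch + c];
            const double b = y[r * pitch + c];
            sx += a;
            sy += b;
            sxx += a * a;
            syy += b * b;
            sxy += a * b;
        }

    const double count = (double)(n * n);
    const double mx = sx / count, my = sy / count;
    const double vx = sxx / count - mx * mx;   /* E[x^2] - E[x]^2 */
    const double vy = syy / count - my * my;
    const double cov = sxy / count - mx * my;  /* E[xy] - E[x]E[y] */

    return ((2.0 * mx * my + C1) * (2.0 * cov + C2)) /
           ((mx * mx + my * my + C1) * (vx + vy + C2));
}

/* Pass 1: one SSIM value per non-overlapping tile, stored in out[].
 * Returns the number of tiles written; out must have room for them. */
size_t ssim_tiles(const unsigned char *ref, const unsigned char *test,
                  size_t pitch, size_t width, size_t height, size_t n,
                  double *out)
{
    assert(n > 0 && pitch >= width);

    const size_t tiles_x = width / n, tiles_y = height / n;
    for (size_t by = 0; by < tiles_y; ++by)        /* analogue of blockIdx.y */
        for (size_t bx = 0; bx < tiles_x; ++bx) {  /* analogue of blockIdx.x */
            const size_t offset = by * n * pitch + bx * n;
            out[by * tiles_x + bx] =
                tile_ssim(ref + offset, test + offset, pitch, n);
        }
    return tiles_x * tiles_y;
}

/* Pass 2: reduce the per-tile values to one index (sum, then normalize). */
double ssim_reduce(const double *tile, size_t count)
{
    double sum = 0.0;
    for (size_t i = 0; i < count; ++i)
        sum += tile[i];
    return count ? sum / (double)count : 0.0;
}
```

Because each tile reads only its own pixels and writes only its own slot in `out`, the iterations of the tile loop have no dependence on one another, which is the property that lets one CUDA block own one region. Keeping the reduction as a separate pass mirrors the extra kernel launch mentioned in the text.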

Results and Discussion

For evaluation purposes, the CUDA implementation is compared with a C implementation running on a single Intel core. The experiments were performed to evaluate the speedup achieved by the CUDA implementation, using region sizes of 8x8 and 16x16. The test images were either noisy images created by adding Gaussian noise to the reference image, blurred images obtained by applying a Gaussian blur to the reference image, or an altogether different image; examples are shown in Figure 3 and Figure 4.

Results are shown in the following tables (Zujovic et al., 2009).

Table 1 shows the SSIM results with region size 8x8 on the Intel core and on the CUDA cores, with and without the optimization methods. We observed around 29x speedup for region size 8x8.

Table 2 shows the SSIM results with region size 16x16 on the Intel core and on the CUDA cores, with and without the optimization methods. We observed around 32x speedup for region size 16x16.

Table 3 shows the SSIM results with region size 16x16 for the optimized CUDA code on the GTX275 and on the C2050. We observed around 32x speedup on the GTX275 and 80x speedup on the C2050 for region size 16x16.
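The speedup figures in the tables are consistent with the ratio of the CPU time to the GPU time for the same image pair. The short C sketch below shows that arithmetic; the helper names `speedup` and `mean_speedup` are hypothetical, and the use of an arithmetic mean to summarize several rows is an assumption, since the chapter does not say how its averages were formed.

```c
#include <assert.h>
#include <stddef.h>

/* Speedup of an accelerated run over a reference run, both in the same unit. */
double speedup(double cpu_time, double gpu_time)
{
    assert(gpu_time > 0.0);
    return cpu_time / gpu_time;
}

/* Arithmetic mean of the per-image speedups (assumed summary statistic). */
double mean_speedup(const double *cpu_time, const double *gpu_time, size_t count)
{
    assert(count > 0);

    double sum = 0.0;
    for (size_t i = 0; i < count; ++i)
        sum += speedup(cpu_time[i], gpu_time[i]);
    return sum / (double)count;
}
```

For the first row of Table 1, `speedup(7800.0, 332.0)` evaluates to about 23.49, which matches the value listed for the optimized CUDA code.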

Figure 3. Reference image

Figure 4. Distorted image

Table 1. SSIM results with region size 8 x 8, CUDA code. Times are in microseconds. "opt." denotes the version with optimization (Shared, Intrinsic, Pragma Loop); the unmarked CUDA columns are without optimization.

Reference image  Distorted image    Size         SSIM    Intel time  CUDA time  Speedup  CUDA time (opt.)  Speedup (opt.)
Lena.gif         Len.gif            256 x 256    1       7800        512        15.23    332               23.49
Lena.gif         Lena_gaussian.gif  256 x 256    0.95    7708        611        12.61    321               24.01
Lena.gif         Lena.gif           512 x 512    1       7927        451        17.57    270               29.35
Lena.gif         Lena_gaussian.gif  512 x 512    0.97    8300        466        17.81    291               28.52
Lena.gif         Lena.gif           1024 x 1024  1       8200        478        17.15    281               29.18
Lena.gif         Lena_gaussian.gif  1024 x 1024  0.90    8250        465        17.74    283               29.15
Lena.gif         Baboon.gif         512 x 512    -0.036  8100        461        17.57    292               27.73
Barbara.gif      Lena.gif           512 x 512    0.027   8112        464        17.48    277               29.28
Pepper.gif       Lena.gif           512 x 512    -0.012  8140        450        18.08    274               29.70
Pepper.gif       Pepper_blur.gif    512 x 512    0.324   8034        447        17.97    273               29.42

Table 2. SSIM results with region size 16 x 16, CUDA code. "opt." denotes the version with optimization; the unmarked CUDA columns are without optimization.

Reference image  Distorted image  Size         SSIM    Intel time  CUDA time  Speedup  CUDA time (opt.)  Speedup (opt.)
Lena.gif         Len.gif          256 x 256    1       6211        398        15.60    224               27.72
Lena.gif         Lena_g.gif       256 x 256    0.94    6314        401        15.74    231               27.33
Lena.gif         Lena.gif         512 x 512    1       6415        412        15.57    218               29.42
Lena.gif         Lena_g.gif       512 x 512    0.96    6434        406        15.84    207               31.08
Lena.gif         Lena.gif         1024 x 1024  1       6450        396        16.28    201               32.08
Lena.gif         Lena_g.gif       1024 x 1024  0.91    6543        394        16.60    202               32.39
Lena.gif         Baboon.gif       512 x 512    -0.034  6411        391        16.39    194               33.04
Barbara.gif      Lena.gif         512 x 512    0.026   6387        399        16.00    195               32.75
Pepper.gif       Lena.gif         512 x 512    -0.012  6410        389        16.47    199               32.21
Pepper.gif       Pepperb.gif      512 x 512    0.324   6412        390        16.44    200               32.06

CONCLUSION

In this chapter, we have evaluated the performance of a CUDA implementation of an SSIM-based image quality assessment tool. We have shown that the SSIM algorithm is highly suitable for GPU architectures. Our implementation achieved an average performance improvement of 30x on the GTX275 and 80x on the C2050.

Future work in this direction includes using the achieved speedup to evaluate SSIM with pixel-by-pixel shifting of the region, and performing further experiments to find the optimal window size that reduces computational complexity without compromising much on the index value.

Table 3. SSIM results with region size 16 x 16, CUDA code with optimization (Shared, Intrinsic, Pragma Loop), on the GTX275 (Tesla) and the C2050 (Fermi)

Reference image  Distorted image  Size         SSIM    Tesla (GTX275) time  Speedup  Fermi (C2050) time  Speedup
Lena.gif         Len.gif          256 x 256    1       224                  27.72    58                  107.08
Lena.gif         Lena_g.gif       256 x 256    0.94    231                  27.33    60                  105.23
Lena.gif         Lena.gif         512 x 512    1       218                  29.42    76                  84.40
Lena.gif         Lena_g.gif       512 x 512    0.96    207                  31.08    77                  83.55
Lena.gif         Lena.gif         1024 x 1024  1       201                  32.08    120                 53.75
Lena.gif         Lena_g.gif       1024 x 1024  0.91    202                  32.39    124                 52.76
Lena.gif         Baboon.gif       512 x 512    -0.034  194                  33.04    78                  82.19
Barbara.gif      Lena.gif         512 x 512    0.026   195                  32.75    78                  81.88
Pepper.gif       Lena.gif         512 x 512    -0.012  199                  32.21    80                  80.12
Pepper.gif       Pepperb.gif      512 x 512    0.324   200                  32.06    80                  80.15

REFERENCES

Aswathappa, B. M. K., & Rao, K. R. (2010). Rate-Distortion Optimization using Structural Information in H.264 Strictly Intra-frame Encoder. Paper presented at Southeastern Symposium on Systems Theory, Tyler, TX, USA (pp. 367-370). doi:10.1109/SSST.2010.5442789

NVIDIA Corporation. (2012). CUDA Parallel Computing Platform. Retrieved from http://www.nvidia.com/object/cuda_home_new.html

Shrivastav, A., Tomar, G. S., & Singh, A. K. (2011). Performance Comparison of AMBA Bus-Based System-On-Chip Communication Protocol. Paper presented at IEEE International Conference on Communication Systems and Network Technologies (CSNT), Katra, Jammu (pp. 449-454). doi:10.1109/CSNT.2011.98

Singh, R. R., Tiwari, A., Singh, V. K., & Tomar, G. S. (June, 2011). VHDL environment for floating point Arithmetic Logic Unit-ALU design and simulation. Paper presented at IEEE International Conference on Communication Systems and Network Technologies (CSNT), Katra, Jammu (pp. 469-472). doi:10.1109/CSNT.2011.167

Wang, Z., Bovik, A. C., Sheikh, H. R., & Simoncelli, E. P. (2004). Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4), 600–612. doi:10.1109/TIP.2003.819861 PMID:15376593

Wang, Z., Lu, L., & Bovik, A. C. (2002). Video quality assessment using structural distortion measurement. Paper presented at IEEE International Conference on Image Processing, Rochester, NY (pp. 65-68).

Zujovic, J., Pappas, T. N., & Neuhoff, D. L. (2009). Structural similarity metrics for texture analysis and retrieval. Paper presented at 16th IEEE International Conference on Image Processing (ICIP), Cairo, Egypt (pp. 2225-2228). doi:10.1109/ICIP.2009.5413897