analysis of the statistical dependencies in the curvelet domain and applications in image...

Analysis of the Statistical Dependencies in the Curvelet Domain and Applications in Image Compression

Alin Alecu1, Adrian Munteanu1, Aleksandra Pizurica2, Jan Cornelis1, Peter Schelkens1

1 Dept. of Electronics and Informatics, Vrije Universiteit Brussel – Interdisciplinary Institute for Broadband Technology (IBBT), Pleinlaan 2, 1050 Brussels, Belgium

{aalecu, acmuntea, jpcornel, pschelke}@etro.vub.ac.be, Phone: +32-2-629-1896

2 Dept. of Telecommunications and Information Processing, Ghent University, Sint-Pietersnieuwstraat 41, 9000 Gent, Belgium

{[email protected]}

Abstract. This paper reports an information-theoretic analysis of the dependencies that exist between curvelet coefficients. We show that strong dependencies exist in local intra-band micro-neighborhoods, and that the shape of these neighborhoods is highly anisotropic. In this respect, we find that the two immediately adjacent neighbors that lie in a direction orthogonal to the orientation of the subband convey the most information about the coefficient. Moreover, taking into account a larger local neighborhood set brings only mild gains with respect to intra-band mutual information estimates. Furthermore, we point out that linear predictors do not represent sufficient statistics if applied to the entire intra-band neighborhood of a coefficient. We conclude that intra-band dependencies are clearly the strongest, followed by their inter-orientation and inter-scale counterparts; in this respect, the more complex intra-band/inter-scale or intra-band/inter-orientation models bring only mild improvements over intra-band models. Finally, we exploit the coefficient dependencies in a curvelet-based image coding application and show that the scheme is comparable to, and in some cases even outperforms, JPEG2000.

Keywords: curvelet, coefficient dependency, mutual information, compression.

1. Introduction

For some time now, geometry-based image representations [1-4] have been emerging as the successors to classical wavelets [5]. These transforms overcome the limited ability of 2D tensor-product wavelets to capture directional information and, as such, are capable of providing optimally sparse representations of objects with edges. While most of the work in the literature has so far focused on the transforms themselves, practical applications that make use of these representations are only slowly coming to light. Carefully assessing the statistical dependencies between the resulting coefficients is of paramount importance in various applications. For instance, evolving from the original independence assumption [5] between wavelet coefficients towards the observation of strong inter- and intra-scale statistical dependencies [6-8] has led to the design of successful image coding and denoising applications. It is clear that in order to repeat the success of wavelets, a similar investigation of the statistical dependencies is required for the recently emerging geometric transforms.

Fig. 1. An image decomposition into curvelet subbands.

In this respect, we investigate in this paper a representation that appears to hold particular promise for future image processing applications, namely the curvelet transform [1]. The paper is organized as follows: section 2 gives a brief description of a curvelet image decomposition; we analyze the curvelet coefficient dependencies in terms of mutual information in section 3; we exploit these dependencies in a practical image coding application in section 4; finally, we draw the conclusions in section 5.

2. Curvelet Decompositions

The curvelet-based decomposition scheme employed in this work is the Digital Curvelet Transform via UnequiSpaced FFTs (DCT-USFFT) of Candès et al. While we will not go into an extensive discussion of the transform itself, instead referring the reader directly to the literature [1, 9], we would like to clarify here a few concepts and notations that will be used throughout the paper. Fig. 1 illustrates the decomposition of an image into curvelet subbands (shown as rectangles), each corresponding to a certain scale and orientation. At each finer scale, the number of orientations doubles with respect to the next coarser scale [1]. Subbands located at the same scale are displayed along concentric coronae, the outermost corresponding to the highest frequencies. The subbands are grouped as being mostly horizontal or mostly vertical (MH/MV), according to their orientation. We employ hereafter the terminology of [10, 11], such that, given a subband coefficient $X$, $P$ denotes its parent at the next resolution level, $C_k$ is a cousin at the same scale but in a different orientation band, and $N$ is a local (intra-band) neighbor of $X$. We refer to “adjacent” cousins as those belonging to subbands located at adjacent orientations, and we use the notation $C_{op}$ to denote the cousin belonging to the band with opposite orientation to the one containing $X$. Finally, the DCT-USFFT transform coefficients consist of wavelet coefficients at the coarsest scale, and curvelet coefficients at all finer scales [1].

Table 1. Mutual information estimates between $X$ and its single neighbors $N_{i,j}$ within a 5x5 neighborhood, as averages over a test set of 7 images. Columns are indexed by $i$ and rows by $j$; the central position is $X$ itself.

j \ i      -2       -1        0        1        2
 -2     0.1225   0.1462   0.1562   0.1466   0.1243
 -1     0.1379   0.1767   0.1972   0.1683   0.1314
  0     0.2032   0.4905      X     0.488    0.1968
  1     0.1337   0.1686   0.1963   0.1732   0.1351
  2     0.1244   0.1469   0.1552   0.1448   0.1154

3. Curvelet Coefficient Dependencies

In this paper, we express coefficient dependencies in terms of mutual information (MI). In general, the mutual information $I(X;Y)$ between two random variables $X$ and $Y$ can be reasonably estimated using existing methods (e.g., the log-scale histogram method or the adaptive partitioning method [12]). Nonetheless, it is well known that as the number of variables involved increases, one is confronted with the so-called curse of dimensionality, in which the difficulty of accurately estimating the joint pdfs increases exponentially with the number of variables. Hence, we adopt the approach of [6-8] that replaces a multi-dimensional $Y$ by its sufficient statistic $T = f(Y)$, such that $I(X;Y) = I(X;T)$.

We start by illustrating in Table 1 the intra-band MI estimates $I(X; N_{i,j})$ between a curvelet coefficient $X$ and each of its single neighbors $N_{i,j}$, $i, j \in \{-2, -1, 0, 1, 2\}$, $(i,j) \neq (0,0)$, of the symmetrical 5x5 neighborhood ($X$ refers here to the central coefficient, $i = 0$, $j = 0$). The MI values are computed as averages over the curvelet subbands of the two finest scales, over a test set of 7 images. It can be seen from these results that $N_{-1,0}$ and $N_{1,0}$ convey more information about $X$ (by a factor of 4×) than any other neighbor, the next strongest dependencies being observed amongst the horizontally and vertically located neighbors. Finally, we notice that MI estimates gradually decrease as the distance from $X$ increases.
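As an illustration of how single-neighbor estimates such as those of Table 1 can be obtained, the following is a minimal sketch of a plug-in histogram estimator in Python with NumPy. It is not the estimator used to produce the reported numbers: the function names `mutual_information` and `neighbor_mi`, the equal-width binning of log-magnitudes, the bin count and the convention that $i$ indexes columns and $j$ indexes rows are all assumptions made for the example.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Plug-in MI estimate (in bits) from a 2D histogram of the samples x and y."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                 # joint probability table
    px = pxy.sum(axis=1, keepdims=True)       # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)       # marginal of y, shape (1, bins)
    nz = pxy > 0                              # skip empty cells (0 * log 0 = 0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

def neighbor_mi(band, di, dj, bins=16):
    """MI between each coefficient X and its neighbor N_{i,j} at offset (di, dj).

    band: 2D array holding one subband; di shifts along columns, dj along rows
    (an assumed convention). Offsets must lie in [-2, 2].
    """
    mag = np.log(np.abs(band) + 1e-6)         # log-scale magnitudes (assumed preprocessing)
    h, w = mag.shape
    r = 2                                     # margin so every 5x5 offset stays inside
    x = mag[r:h - r, r:w - r]
    n = mag[r + dj:h - r + dj, r + di:w - r + di]
    return mutual_information(x.ravel(), n.ravel(), bins)
```

Calling `neighbor_mi(band, di, dj)` for all 24 offsets of the 5x5 window and averaging over subbands and images yields a table with the same layout as Table 1. Plug-in histogram estimates are biased upwards for small sample sizes, so absolute values depend on the bin count.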

We now focus on the MI estimates between curvelet coefficients $X$ and their entire local neighborhoods, i.e. sets of the form $\mathbf{N} = \{N_{i,j}\}_{i \in I, j \in J}$. In order to derive such estimates for multi-dimensional random variables $Y = \{Y_1, Y_2, \ldots, Y_N\}$, we employ a linear predictor of the magnitudes of the coefficients, i.e., we assume that $T = \sum_i a_i |Y_i|$ is a sufficient statistic of $Y$, where $a_i$ are weights that minimize the expected squared error [6]. Furthermore, we use a greedy algorithm in order to dynamically add the most informative neighbors to the set $\mathbf{N}$. In this sense, the algorithm starts with $\mathbf{N} = \emptyset$, and at each iteration extends the neighborhood set, $\mathbf{N} \leftarrow \mathbf{N} \cup \{N_{i,j}\}$ (i.e., $\mathrm{card}(\mathbf{N}) \leftarrow \mathrm{card}(\mathbf{N}) + 1$, where $\mathrm{card}(\cdot)$ is the cardinality of a set). The term $N_{i,j}$ denotes a single neighbor of $X$ chosen from among the available $i \in I$, $j \in J$, such that the MI for the given $\mathrm{card}(\mathbf{N})$ is maximized. The curvelet MI estimates $I(X;\mathbf{N})$ for the symmetrically located 5x5 neighborhood of $X$ are calculated for increasing values of $\mathrm{card}(\mathbf{N})$, $\mathrm{card}(\mathbf{N}) = 1, \ldots, 24$. These estimates are again computed as averages over the test set of 7 images.

Fig. 2. Curvelet intra-band mutual information estimates $I(X;\mathbf{N})$ as averages over a test set of 7 images, for successive values of $\mathrm{card}(\mathbf{N}) = 1, \ldots, 24$. (Axes: number of coefficients versus mutual information.)

Fig. 3. Curvelet intra-band mutual information estimates $I(X;\mathbf{N})$ as averages over a test set of 7 images, for successive values of $\mathrm{card}(\mathbf{N}) = 1, \ldots, 24$; a linear predictor of the entire neighborhood set $\mathbf{N}$ is employed. (Axes: number of coefficients versus mutual information.)

Fig. 4. Wavelet intra-band mutual information estimates $I(X;\mathbf{N})$ as averages over a test set of 7 images, for successive values of $\mathrm{card}(\mathbf{N}) = 1, \ldots, 24$. (Axes: number of coefficients versus mutual information.)

In a first experiment, we estimate $I(X;\mathbf{N})$ by employing a linear predictor $T$ only for the exclusive set $\mathbf{N} \setminus \{N_{-1,0}, N_{1,0}\}$. In other words, we calculate $I(X; N_{-1,0}, N_{1,0}, T)$, which, despite the curse of dimensionality, is still within reasonable computational limits. The obtained MI values are illustrated in Fig. 2. We exclude the set $\{N_{-1,0}, N_{1,0}\}$ because we have experimentally found that linear magnitude predictors of $\mathbf{N}$ do not behave well when $N_{-1,0} \in \mathbf{N}$ or $N_{1,0} \in \mathbf{N}$. Indeed, Fig. 3 plots similar results to those of Fig. 2, except that a linear predictor $T$ of $\mathbf{N}$ with $\{N_{-1,0}, N_{1,0}\} \subset \mathbf{N}$ is now used. It can be clearly seen from Fig. 3 that, after an initial abrupt increase, the MI decreases rapidly. At first sight, this appears to contradict the chain rule for MI, which states that $I(X; Y_1, \ldots, Y_k) \geq I(X; Y_1, \ldots, Y_{k-1})$ [8]. Nonetheless, let us recall that $I(X;\mathbf{N})$ can be estimated through its bound $I(X;T) \leq I(X;\mathbf{N})$ if $T$ is a sufficient statistic for $\mathbf{N}$, in which case $I(X;T) = I(X;\mathbf{N})$. The results of Fig. 3 indicate that $I(X;T)$ decreases rapidly for $\mathrm{card}(\mathbf{N}) > 2$. Hence, $T$ can no longer be considered a sufficient statistic for $\mathbf{N}$ when $\{N_{-1,0}, N_{1,0}\} \subset \mathbf{N}$. This is an important observation if one recalls that, in the case of wavelets, linear predictors have been shown to be sufficient statistics for the entire local neighborhood of a coefficient [7, 8]. In fact, for the sake of comparing curvelet MI behavior with that of a thoroughly studied transform, we illustrate in Fig. 4 results similar to those shown in Fig. 3, but for high-frequency wavelet subbands (the wavelet transform employed here is the (4,4) symmetrical biorthogonal transform, and the results refer to the horizontal detail subbands).
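The greedy construction of the neighborhood set with a linear magnitude predictor can be sketched as follows. This is a hedged illustration rather than the authors' implementation: it corresponds to the variant in which a single predictor $T$ summarizes the whole set (the setting of Fig. 3), it reuses a plug-in histogram estimate of $I(X;T)$, and the names `linear_statistic` and `greedy_neighborhood` as well as the bin count are assumptions.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Plug-in MI estimate (in bits) from a 2D histogram of the samples x and y."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

def linear_statistic(x_mag, Y_mag):
    """T = sum_i a_i |Y_i|, with weights a_i minimizing the squared error to |X|."""
    a, *_ = np.linalg.lstsq(Y_mag, x_mag, rcond=None)
    return Y_mag @ a

def greedy_neighborhood(x_mag, Y_mag, max_card=None, bins=16):
    """Greedily grow the neighbor set, one neighbor per iteration.

    x_mag: (samples,) magnitudes |X|; Y_mag: (samples, candidates) magnitudes
    of the candidate neighbors. Returns the selection order (column indices)
    and the estimate of I(X;T) after each step.
    """
    remaining = list(range(Y_mag.shape[1]))
    max_card = max_card or len(remaining)
    chosen, curve = [], []
    while remaining and len(chosen) < max_card:
        # Score every candidate by the MI reached when it joins the current set.
        scores = [
            mutual_information(x_mag, linear_statistic(x_mag, Y_mag[:, chosen + [k]]), bins)
            for k in remaining
        ]
        best = int(np.argmax(scores))
        curve.append(scores[best])
        chosen.append(remaining.pop(best))
    return chosen, curve
```

Plotting `curve` against the set size gives a curve of the kind shown in Fig. 3. Because $I(X;T)$ is only a lower bound on $I(X;\mathbf{N})$, a decreasing curve signals that $T$ has stopped being a sufficient statistic, not that the added neighbors remove information.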

A comparison of the results of Fig. 2 and Fig. 4 reveals that MI estimates increase more abruptly for curvelets than for wavelets. Indeed, curvelets require only two coefficients to approximately reach an MI maximum, while wavelets require four. Additionally, for both transforms, it can be noticed that after a certain $\mathrm{card}(\mathbf{N})$, the MI estimates exhibit a slow decay. This can be explained by the fact that these values of $\mathrm{card}(\mathbf{N})$ correspond to neighbors $N_{i,j}$ of $X$ located further away. At such distant locations, the correlation with respect to $X$ decreases significantly, such that $T$ deviates from the sufficient statistics assumption. The fact that, for curvelets, the slow decay of the MI starts at a low value of $\mathrm{card}(\mathbf{N})$ points to the conclusion that, although very strong, curvelet coefficient dependencies are limited to local micro-neighborhoods. Additionally, the difference between the magnitudes of the overall curvelet and wavelet MI estimates is due to the high oversampling of the curvelet transform [1], compared to the critically sampled wavelet transform. Indeed, oversampling induces redundancy and thus stronger dependencies.

Fig. 5. The shape of curvelet intra-band neighborhoods $\mathbf{N} = \{N_{i,j}\}$. The ordering of $N_{i,j}$ is denoted by the shades of grey, black signifying the strongest dependency.

Table 2. Mutual information estimates. $C_k$ denotes a cousin of $X$ located $k$ orientations away, $C_{op}$ is the opposite-orientation cousin, and finally $P$ denotes the parent.

           I(X;P)   I(X;C_1)   I(X;C_4)   I(X;C_12)   I(X;C_op)
Lena       0.1310   0.0806     0.0334     0.0102      0.1536
Peppers    0.0851   0.0293     0.0055     0.00001     0.0938

We conclude the analysis of intra-band curvelet coefficient dependencies by illustrating the shape of the curvelet neighborhood $\mathbf{N}$, for the first few values of $\mathrm{card}(\mathbf{N})$. These results are shown in Fig. 5, and correspond to the neighborhood employed in Fig. 2, the ordering of $N_{i,j}$ being denoted by the decreasing shades of grey. It can be seen from Fig. 5 that the strongest dependencies can be found for the immediate horizontal neighbors, followed by the next horizontal, and immediate vertical neighbors, respectively. In addition, this ordering appears to match the single MI results $I(X; N_{i,j})$ of Table 1. This is an interesting observation, especially if compared with the known classical wavelet dependencies [8]. Indeed, curvelet neighborhoods appear to have a strong anisotropic shape. A possible explanation for this is the fact that curvelets themselves possess anisotropic scaling laws, the support of a curvelet being contained in a ‘parabolic’ shape that obeys such laws [1].

Next, we briefly extend our investigation of curvelet MI estimates to inter-scale and inter-orientation coefficient dependencies (a discussion of their joint statistics can be found in [10]). We illustrate in Table 2, for “Lena” and “Peppers”, the MI estimates between a coefficient $X$ and some of its cousins $C_k$, between $X$ and $P$, and finally between $X$ and $C_{op}$. The results are derived for subbands located at the two finest scales. It can be observed that the MI decreases as the difference between orientations increases, the most significant cousin in this sense being the orientation-adjacent $C_1$. Nonetheless, the opposite-orientation cousin $C_{op}$ appears to be the most significant of all, outperforming even the parent coefficient $P$. We believe that this is a result of the real-valued curvelet transform implementation. Indeed, the DCT-USFFT investigated in this paper builds complex coefficient subbands that correspond to a single direction. Real-valued pairs of subbands and their “opposites” are then constructed from such single complex-valued subbands. As such, it is expected that the obtained “opposite” coefficients still display significant dependencies.

Table 3. Mutual information estimates between $X$ and its parent $P$, neighbors $\mathbf{N}$ and cousins $\mathbf{C}$.

           I(X;P)   I(X;N)   I(X;C)   I(X;N,P)   I(X;N,C)   I(X;P,C)
Lena       0.1310   0.9123   0.2051   1.0318     1.1555     0.4294
Peppers    0.0851   0.8138   0.0871   0.8668     0.8671     0.218
Barbara    0.0456   0.9092   0.2582   0.9124     1.1338     0.3538

We end this section by showing in Table 3, for a few images, the MI estimates between $X$ and its “generalized” neighborhood set $\mathbf{G} = \{P, \mathbf{N}, \mathbf{C}\}$, i.e. between $X$ and its parent $P$, its intra-band neighbors set $\mathbf{N}$ and its cousins set $\mathbf{C}$, respectively. Based on the previous findings, we chose $\mathbf{N} = \{N_{-1,0}, N_{1,0}\}$ and $\mathbf{C} = \{C_{-1}, C_1, C_{op}\}$, where $C_{-1}, C_1$ denote the two orientation-adjacent cousins of $X$. The first choice is motivated by the fact that $N_{-1,0}$ and $N_{1,0}$ convey the most information about $X$, the inclusion of additional neighbors beyond these two bringing insignificant gains with respect to MI; the second choice is based on the observed ordering of the $I(X; C_k)$ estimates.

From Table 3, we find that $I(X;P) < I(X;\mathbf{C}) < I(X;\mathbf{N})$ (i.e. the local neighbors provide the most information about $X$), and, furthermore, that $I(X;P,\mathbf{C}) < I(X;\mathbf{N},P) < I(X;\mathbf{N},\mathbf{C})$. At this point, the results lead us to conclude that intra-band models capture most of the dependencies between curvelet coefficients, with marginal gains for the more complex intra-band/inter-scale or intra-band/inter-orientation models.
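The increments that the parent or the cousins add on top of the intra-band neighbors can be read off Table 3 through the chain rule for MI. The worked equation below is plain arithmetic on the tabulated estimates, not an additional experimental result; the differences inherit the estimation error of the individual entries.

```latex
\begin{align*}
I(X;\mathbf{N},P) &= I(X;\mathbf{N}) + I(X;P \mid \mathbf{N})
  \;\Longrightarrow\; I(X;P \mid \mathbf{N}) = I(X;\mathbf{N},P) - I(X;\mathbf{N}) \\
\text{Lena:}    \quad I(X;P \mid \mathbf{N}) &= 1.0318 - 0.9123 = 0.1195,
  & I(X;\mathbf{C} \mid \mathbf{N}) &= 1.1555 - 0.9123 = 0.2432 \\
\text{Peppers:} \quad I(X;P \mid \mathbf{N}) &= 0.8668 - 0.8138 = 0.0530,
  & I(X;\mathbf{C} \mid \mathbf{N}) &= 0.8671 - 0.8138 = 0.0533 \\
\text{Barbara:} \quad I(X;P \mid \mathbf{N}) &= 0.9124 - 0.9092 = 0.0032,
  & I(X;\mathbf{C} \mid \mathbf{N}) &= 1.1338 - 0.9092 = 0.2246
\end{align*}
```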

4. Image Coding

In this section, we target a potential application of the curvelet transform, namely coding. In particular, we describe how the statistical coefficient dependencies investigated in the previous section have been exploited in the design of a competitive curvelet-based image compression scheme. Furthermore, we show that the proposed codec is comparable to, and in some cases even outperforms, JPEG2000 [13].

The general architecture of our scheme is derived from the general structure of a transform-based codec. Thus, at the encoder, a forward decomposition concentrates the energy of the signal in a few coefficients, followed by quantization, coding of the quantized coefficients to a set of symbols, and finally entropy coding. In the final stage, the scheme performs a context-based entropy coding that is steered by some parameters from the coding step. In the following, we will focus on the encoding of the transform coefficients, and the context models of the entropy coder, respectively.
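The generic transform-codec chain described above (quantization of the transform coefficients, mapping to symbols, entropy coding) can be illustrated with a minimal sketch. It is a toy stand-in, not the codec of this paper: the dead-zone quantizer, the mid-point reconstruction and the zeroth-order entropy used as a rate proxy are assumptions chosen for brevity, and all function names are hypothetical.

```python
import numpy as np

def quantize(coeffs, step):
    """Uniform dead-zone quantizer: index = sign(c) * floor(|c| / step)."""
    return (np.sign(coeffs) * np.floor(np.abs(coeffs) / step)).astype(np.int64)

def dequantize(q, step):
    """Mid-point reconstruction of nonzero indices; zero indices stay zero."""
    return np.sign(q) * (np.abs(q) + 0.5) * step

def empirical_entropy(symbols):
    """Zeroth-order entropy in bits per symbol, a crude proxy for the coded rate."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))
```

A context-based entropy coder aims to spend fewer bits than this zeroth-order figure by conditioning each symbol on already coded neighbors, which is where the dependency analysis of section 3 enters.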

Fig. 6. Context models and associated neighborhoods for (left) curvelet MV subbands, and (right) wavelet subbands.

Table 4. The coding gain obtained using the proposed context models versus those of JPEG2000, for a few images.

                    Lena     Barbara   Seismic
Average gain (dB)   0.1345   0.0804    0.0938
Max gain (dB)       0.2887   0.2562    0.1318

First, let us recall that the results of section 3 show that intra-band modeling of the curvelet transform captures most of the dependencies between curvelet coefficients, with marginal gains for the more complex intra-band/inter-scale or intra-band/inter-orientation models. In view of these observations, we adopt in this paper an intra-band coding strategy, wherein we encode the quantized curvelet coefficients using a 2D variant of the QuadTree-Limited (QT-L) codec of [14].

Furthermore, we have shown in Fig. 5 the shape of the curvelet intra-band neighborhood, for the set of coefficients exhibiting the highest dependencies. Based on these findings, we have designed context models for the curvelet transform, for the MH and MV subbands, respectively. The models have been derived using a training set of 9 representative images. An example of the associated coefficient neighborhoods is depicted in Fig. 6, for (left) curvelet MV subbands and (right) wavelet subbands (i.e., as employed in the context models of JPEG2000 [13]). The coding gains (i.e., the gains in PSNR) obtained using the proposed anisotropic context models versus the JPEG2000 models are shown in Table 4, for a few images. The gains are expressed here as averages over an extensive range of bit-rates. It can be seen from this table that the proposed models improve compression performance over the context models of JPEG2000 for all three images, with average gains of 0.08 to 0.13 dB and maximum gains of up to 0.29 dB.
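To make the idea of an anisotropic context model concrete, the sketch below measures how much a context built from a few neighbors lowers the entropy of the significance symbols in one bit-plane. It is an assumption-laden illustration and not the context models of Fig. 6: the context is simply the number of chosen neighbors that were already significant at the previous bit-plane, the default offsets are the two immediate horizontal neighbors, and `context_gain` is a hypothetical name.

```python
import numpy as np

def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) source."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def context_gain(band, threshold, offsets=((0, -1), (0, 1))):
    """Return (H(S), H(S | ctx)) in bits per significance symbol.

    S tells whether a coefficient that was insignificant at 2 * threshold
    becomes significant at threshold. ctx counts the neighbors, at the given
    (row, col) offsets, that were already significant at 2 * threshold.
    """
    mag = np.abs(band)
    sig_prev = mag >= 2 * threshold           # state known from the previous bit-plane
    sig_now = mag >= threshold                # state after the current bit-plane
    r = max(max(abs(dr), abs(dc)) for dr, dc in offsets)
    h, w = mag.shape
    ctx = np.zeros((h - 2 * r, w - 2 * r), dtype=int)
    for dr, dc in offsets:
        ctx += sig_prev[r + dr:h - r + dr, r + dc:w - r + dc]
    todo = ~sig_prev[r:h - r, r:w - r]        # only not-yet-significant coefficients are coded
    s = sig_now[r:h - r, r:w - r][todo]
    ctx = ctx[todo]
    h_plain = binary_entropy(s.mean())
    h_ctx = sum((ctx == c).mean() * binary_entropy(s[ctx == c].mean())
                for c in np.unique(ctx))
    return float(h_plain), float(h_ctx)
```

Comparing `context_gain(band, t)` with `context_gain(band, t, offsets=((-1, 0), (1, 0)))` on a subband indicates whether horizontal or vertical neighbors form the more useful context, which is the kind of evidence an anisotropic neighborhood shape is based on.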

Finally, we illustrate in Fig. 7 and Fig. 8, for the “finger” and “seismic” images of the JPEG2000 test set, the rate-distortion curves obtained using our curvelet-based coding scheme and JPEG2000, respectively. It can be seen that at the targeted rates the proposed scheme is comparable to, and in some cases outperforms, JPEG2000. These results are all the more important if we note that the transform employed has an oversampling factor of over 5.8. In this sense, to the best of our knowledge, this is the first work that shows that the high redundancy typical of the new geometric transforms [1-4] is not necessarily an impediment for coding applications, and that a correct exploitation of the dependencies that exist between the transform coefficients can lead to competitiveness with respect to the JPEG2000 standard and its critically sampled wavelets [13].
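For reference, the two quantities plotted in rate-distortion curves of this kind can be computed as in the short sketch below. The helper names `psnr` and `bits_per_pixel` and the 8-bit peak value of 255 are assumptions for the example.

```python
import numpy as np

def psnr(original, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal size."""
    diff = original.astype(np.float64) - decoded.astype(np.float64)
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(peak ** 2 / mse))

def bits_per_pixel(num_bytes, height, width):
    """Bit-rate in bits per pixel (bpp) of a compressed stream of num_bytes bytes."""
    return 8.0 * num_bytes / (height * width)
```

The rate is measured per image pixel, not per transform coefficient. With an oversampling factor above 5.8, a budget of 0.5 bpp therefore corresponds to less than about 0.5 / 5.8 ≈ 0.086 bits per curvelet coefficient on average.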

Fig. 7. For “finger”, the rate-distortion results obtained using the proposed scheme and JPEG2000, respectively. (Axes: rate in bpp, 0 to 0.5, versus PSNR in dB, 17 to 29; curves: JPEG2000 and Proposed Scheme.)

Fig. 8. For “seismic”, the rate-distortion results obtained using the proposed scheme and JPEG2000, respectively. (Axes: rate in bpp, 0 to 0.5, versus PSNR in dB, 27 to 45; curves: JPEG2000 and Proposed Scheme.)

5. Conclusions

This paper reports an information-theoretic analysis of the dependencies that exist between curvelet coefficients. We show that strong dependencies exist in local intra-band micro-neighborhoods, and that the shape of these neighborhoods is highly anisotropic. Specifically, we find that the two immediately adjacent neighbors that are located orthogonal to the orientation of the subband convey the most information about the coefficient. Moreover, taking into account a larger local neighborhood set brings only mild gains with respect to intra-band mutual information estimates. Furthermore, we point out that, unlike the case of wavelets [8], linear predictors do not represent sufficient statistics if applied to the entire intra-band neighborhood of a coefficient. Instead, such predictors should be used for a local neighborhood that does not include the two mentioned coefficients. Regarding inter-orientation dependencies, we observe that these strongly depend on the direction; in this sense, it is shown that the set of most significant predictors contains only three coefficients. We conclude that intra-band dependencies are clearly the strongest, followed by their inter-orientation and inter-scale counterparts; the more complex intra-band/inter-scale or intra-band/inter-orientation models bring only mild improvements. Finally, we exploit the coefficient dependencies in a curvelet-based image coding application and show that the proposed scheme is comparable to, and in some cases even outperforms, JPEG2000 [13].

References

1. Candès, E.J., Donoho, D.: New Tight Frames of Curvelets and Optimal Representations of Objects with Piecewise $C^2$ Singularities. Comm. Pure Appl. Math. 57 (2004) 219-266

2. Do, M.N., Vetterli, M.: Contourlets. In: Welland, G.V. (ed.): Beyond Wavelets. Academic Press (2003)

3. Le Pennec, E., Mallat, S.: Sparse Geometric Image Representations with Bandelets. IEEE Transactions on Image Processing 14 (2005) 423-438

4. Candès, E.J., Donoho, D.: Ridgelets: a key to higher-dimensional intermittency. Phil. Trans. R. Soc. Lond. A. 357 (1999) 2495-2509

5. Mallat, S.: A Theory for Multiresolution Signal Decomposition: The Wavelet Representation. IEEE Transactions on Pattern Analysis and Machine Intelligence 11 (1989) 674-693

6. Buccigrossi, R.W., Simoncelli, E.P.: Image Compression via Joint Statistical Characterization in the Wavelet Domain. IEEE Transactions on Image Processing 8 (1999) 1688-1701

7. Simoncelli, E.P.: Modeling the joint statistics of images in the wavelet domain. SPIE 44th Annual Meeting, Denver, CO (1999)

8. Liu, J., Moulin, P.: Information-Theoretic Analysis of Interscale and Intrascale Dependencies between Image Wavelet Coefficients. IEEE Transactions on Image Processing 10 (2001) 1647-1658

9. Candès, E.J., Demanet, L., Donoho, D.L., Ying, L.: Fast Discrete Curvelet Transforms. Applied and Computational Mathematics, California Institute of Technology, (2005)

10. Alecu, A., Munteanu, A., Pizurica, A., Philips, W., Cornelis, J., Schelkens, P.: Information-Theoretic Analysis of Dependencies between Curvelet Coefficients. IEEE International Conference on Image Processing (ICIP), Atlanta, GA, USA (2006)

11. Po, D.D.-Y., Do, M.N.: Directional multiscale modeling of images using the contourlet transform. IEEE Transactions on Image Processing (to appear)

12. Darbellay, G.A., Vajda, I.: Estimation of the information by an adaptive partitioning of the observation space. IEEE Transactions on Information Theory 45 (1999) 1315-1321

13. Taubman, D., Marcelin, M.W.: JPEG2000: Image Compression Fundamentals, Standards, and Practice. Kluwer Academic Publishers, Norwell, Massachusetts (2002)

14. Schelkens, P., Munteanu, A., Barbarien, J., Galca, M., Giro-Nieto, X., Cornelis, J.: Wavelet Coding of Volumetric Medical Datasets. IEEE Transactions on Medical Imaging 22 (2003) 441-458
