7/30/2019 Image Denoising Using Wavelet Thresholding and Model Selection
Image Denoising using Wavelet Thresholding and Model Selection
Shi Zhong
Dept. of ECE, Univ. of Texas at Austin
Vladimir Cherkassky
Dept. of ECE, Univ. of Minnesota
ABSTRACT
This paper describes wavelet thresholding for image denoising
under the framework provided by Statistical Learning Theory aka
Vapnik-Chervonenkis (VC) theory. Under the framework of VC-
theory, wavelet thresholding amounts to ordering of wavelet
coefficients according to their relevance to accurate function
estimation, followed by discarding insignificant coefficients.
Existing wavelet thresholding methods specify an ordering based
on the coefficient magnitude, and use threshold(s) derived under
a Gaussian noise assumption and asymptotic settings. In contrast,
the proposed approach uses orderings that better reflect the statistical
properties of natural images, and VC-based thresholding
developed for finite sample settings under very general noise
assumptions. A tree structure is proposed to order the wavelet
coefficients based on their magnitude, scale and spatial location.
The choice of a threshold is based on the general VC method for
model complexity control. Empirical results show that the
proposed method outperforms Donoho's level-dependent
thresholding techniques and that the advantages become more
significant under finite sample and non-Gaussian noise settings.
1. INTRODUCTION
In many applications, image denoising is used to produce good
estimates of the original image from noisy observations. The
restored image should contain less noise than the observations
while still keeping sharp transitions (i.e., edges).
The wavelet transform, due to its excellent localization property, has
rapidly become an indispensable signal and image processing
tool for a variety of applications, including compression [7,8] and
denoising [1,2,4,5,6,9]. Wavelet thresholding (first proposed by
Donoho [4,5,6]) is a signal estimation technique that exploits the
capabilities of the wavelet transform for signal denoising and has
recently received extensive research attention. It removes noise
by killing coefficients that are insignificant relative to some
threshold, and turns out to be simple and effective. The wavelet
thresholding solution given by Donoho has also been proven to be
asymptotically optimal in a minimax MSE (mean squared error)
sense over a variety of smoothness spaces [2,6]. It should be
pointed out, however, that all the proofs were conducted under
additive Gaussian noise assumptions.
In this paper, we interpret image denoising as a special case of
the signal estimation problem and propose a model selection based
denoising method under the framework of VC theory, which was
developed for estimating data dependencies from finite samples.
The methodology is presented in the next section, followed by
empirical results. Finally, we present the conclusions.
2. METHODOLOGY
2.1 VC-theory
VC-theory has recently emerged as a general theory for
estimating data dependencies from finite samples. It provides a
framework for model selection called structural risk
minimization (SRM). Under SRM, a set of possible models
(each model may consist of one or more basis functions) is
ordered according to complexity. The set, called a structure
in SRM, consists of a group of nested subsets $S_k$ such that
$$S_1 \subset S_2 \subset \cdots \subset S_k \subset \cdots \qquad (1)$$
where each element $S_k$ has finite VC-dimension (the complexity
measure in VC-theory) $h_k$. A structure is designed to provide
an ordering of its elements according to their complexity. Model
selection can be done by choosing the minimal analytic upper
bound (VC-bound) of the prediction risk provided for each
element by SRM. For detailed formulation and explanation of
VC-bound see [10]. A simplified formula [3] derived for signal
estimation (regression) is
$$R_{pred} \le R_{emp} \left( 1 - \sqrt{p - p \ln p + \frac{\ln n}{2n}} \right)_+^{-1} \qquad (2)$$
where $R_{emp}$ is the empirical risk, $R_{pred}$ is the estimated
prediction risk, $n$ is the number of signal samples, and $p = h_k / n$ is
a complexity parameter. This inequality holds with probability
$1 - 1/\sqrt{n}$. A straightforward implementation of SRM is to
construct each element $S_k$ in the structure as a linear
combination of $n_k$ basis functions, in which case the complexity
of each element $S_k$ is simply $h_k = n_k + 1$ [10].
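As an illustration, the bound (2) can be evaluated numerically as in the sketch below. The function names are hypothetical, and the sketch assumes that (2) reads $R_{pred} \le R_{emp} (1 - \sqrt{p - p \ln p + \ln n / (2n)})_+^{-1}$ with $p = h/n$; when the term in parentheses is not positive, the bound is treated as infinite.

```python
import math


def vc_penalization_factor(h, n):
    """Factor r such that R_pred <= r * R_emp, following bound (2).

    h is the VC-dimension of the model, n the number of samples.
    Returns math.inf when the bound is vacuous (the (.)_+ term is zero).
    """
    if not 0 < h < n:
        return math.inf
    p = h / n
    eps = p - p * math.log(p) + math.log(n) / (2.0 * n)
    denom = 1.0 - math.sqrt(eps)
    return 1.0 / denom if denom > 0.0 else math.inf


def vc_bound(r_emp, h, n):
    """Estimated prediction risk for empirical risk r_emp."""
    factor = vc_penalization_factor(h, n)
    return math.inf if math.isinf(factor) else r_emp * factor


if __name__ == "__main__":
    n = 100
    for h in (1, 10, 50):
        print(h, vc_penalization_factor(h, n))
```

The factor grows with $h$ for fixed $n$, so a more complex model must reduce the empirical risk enough to compensate for the larger penalty.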
2.2 Wavelet thresholding
Wavelet thresholding for image denoising involves two steps: 1)
taking the wavelet transform of an image (i.e., calculating the
wavelet coefficients); 2) discarding (setting to zero) the
coefficients with relatively small or insignificant magnitudes. By
discarding small coefficients, one actually discards the wavelet basis
functions whose coefficients fall below a certain threshold. The
denoised signal is obtained via the inverse wavelet transform of the
kept coefficients. One global threshold derived by Donoho [5,6]
under a Gaussian noise assumption is $T = \sigma \sqrt{2 \log(n)}$, where
$n$ is the number of samples and $\sigma$ is the noise standard deviation.
Clearly, wavelet thresholding can be viewed as a special case of
signal/data estimation from noisy samples, which can be
addressed within the framework of VC-theory. Consider the
following structure on a set of all discrete wavelet basis
functions: each element $S_k$ of the structure has exactly $k$
wavelet basis functions. Note that once the $k$ basis functions in $S_k$
are specified, minimizing the empirical risk is trivial due to the
orthogonality of wavelets and amounts to estimating the
wavelet coefficients via discrete wavelet decomposition.
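The two thresholding steps described above can be sketched on a 1-D signal as follows. This is only a minimal illustration under several assumptions that are not from the paper: an orthonormal Haar transform stands in for the biorthogonal 2-D filters used in the experiments, the natural logarithm is used in the global threshold, the coarsest approximation coefficient is left untouched, and all function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(x):
    """Orthonormal 1-D Haar transform; len(x) must be a power of two."""
    out = list(x)
    n = len(out)
    while n > 1:
        half = n // 2
        avg = [(out[2 * i] + out[2 * i + 1]) / SQRT2 for i in range(half)]
        dif = [(out[2 * i] - out[2 * i + 1]) / SQRT2 for i in range(half)]
        out[:n] = avg + dif  # approximations first, then details
        n = half
    return out


def haar_inverse(c):
    """Inverse of haar_forward."""
    out = list(c)
    n = 1
    while n < len(out):
        rec = []
        for a, d in zip(out[:n], out[n:2 * n]):
            rec += [(a + d) / SQRT2, (a - d) / SQRT2]
        out[:2 * n] = rec
        n *= 2
    return out


def universal_threshold(sigma, n):
    """Global threshold T = sigma * sqrt(2 log n)."""
    return sigma * math.sqrt(2.0 * math.log(n))


def denoise_1d(signal, sigma):
    """Transform, zero the small detail coefficients, transform back."""
    coeffs = haar_forward(signal)
    t = universal_threshold(sigma, len(signal))
    # Assumption: keep the approximation coefficient coeffs[0] as is.
    kept = [coeffs[0]] + [c if abs(c) >= t else 0.0 for c in coeffs[1:]]
    return haar_inverse(kept)
```

For example, `denoise_1d([0, 0, 0, 0, 8, 8, 8, 8], 1.0)` keeps the single large detail coefficient that encodes the step and returns the step unchanged, while small fluctuations around a constant level are flattened to the mean.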
In summary, application of SRM to wavelet thresholding for
image denoising involves the following steps:
1) Define a structure by an appropriate importance ordering of all
wavelet basis functions. Each element $S_k$ of a structure consists
of the first $k$ basis functions. The original wavelet thresholding
technique is equivalent to specifying a structure that uses only a
magnitude ordering of the wavelet coefficients. Obviously, this is
not the best way of ordering the coefficients. A better tree
structure is presented in this paper.
2) Estimate the prediction risk for each set of wavelet functions
formed in the structure. Since each $S_k$ is a set of linear models,
the VC-bound of the prediction risk (2) is easy to compute.
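Given an importance ordering, the model selection step can be sketched as below. The sketch assumes an orthonormal transform, so that the empirical risk of keeping the first $k$ coefficients equals the energy of the discarded coefficients divided by $n$, and it takes $h_k = k + 1$ as in Section 2.1. The function names are hypothetical.

```python
import math


def vc_penalization_factor(h, n):
    """Penalization factor of bound (2); math.inf if the bound is vacuous."""
    if not 0 < h < n:
        return math.inf
    p = h / n
    eps = p - p * math.log(p) + math.log(n) / (2.0 * n)
    denom = 1.0 - math.sqrt(eps)
    return 1.0 / denom if denom > 0.0 else math.inf


def select_num_coefficients(ordered_coeffs):
    """Return (k, bound): the number of leading coefficients to keep
    that minimizes the VC bound, and the value of that bound."""
    n = len(ordered_coeffs)
    energies = [c * c for c in ordered_coeffs]
    residual = sum(energies)  # energy of the discarded coefficients
    best_k, best_bound = 0, math.inf
    for k in range(1, n):
        residual -= energies[k - 1]
        r_emp = max(residual, 0.0) / n
        factor = vc_penalization_factor(k + 1, n)
        if math.isinf(factor):
            break  # the factor only grows with k, so later models are vacuous too
        bound = r_emp * factor
        if bound < best_bound:
            best_k, best_bound = k, bound
    return best_k, best_bound


if __name__ == "__main__":
    # Four significant coefficients followed by 60 small, noise-like ones.
    coeffs = [10.0, 8.0, 6.0, 5.0] + [0.1 * (-1) ** i for i in range(60)]
    print(select_num_coefficients(coeffs))
```

In this toy input the bound is smallest when exactly the four large coefficients are kept: dropping any of them raises the empirical risk sharply, while adding a small one raises the penalty more than it lowers the risk.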
2.3 Level dependent thresholding and importance ordering
Level-dependent thresholding has been proposed to improve the
performance of the wavelet thresholding method. Instead of using a
global threshold, level-dependent thresholding uses a group of
thresholds, one for each scale level. One popular level-dependent
thresholding scheme [4] is to set the threshold as:
$$t_{j,n} = \sigma \sqrt{2 \log(n)} \, 2^{(j-J)/2}, \quad j = 0, \ldots, J \qquad (3)$$
where $n$ is the total number of signal samples, $J$ is the number of
decomposition levels, $\sigma$ is the noise standard deviation (to be
estimated) and $j$ is the scale level. This scheme uses a larger
threshold at finer scale levels. It can be interpreted as:
1) Order the wavelet coefficients with respect to their
magnitudes adjusted by scale level, that is, multiplied by
$2^{-j/2}$, where $j$ is the scale level associated with each coefficient.
2) Apply the global threshold $t_n = \sigma \sqrt{2 \log(n)} \, 2^{-J/2}$.
This suggests that level-dependent thresholding can be viewed as
a special case of a more sophisticated importance ordering in the
model selection based denoising method.
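The equivalence between the two views can be checked numerically. The sketch below assumes that (3) reads $t_{j,n} = \sigma \sqrt{2 \log(n)} \, 2^{(j-J)/2}$ with larger $j$ denoting finer scales, and that the logarithm is natural; both are readings of the formula rather than statements from the paper, and the function names are hypothetical.

```python
import math


def level_threshold(sigma, n, j, J):
    """Level-dependent threshold t_{j,n} under the assumed form of (3)."""
    return sigma * math.sqrt(2.0 * math.log(n)) * 2.0 ** ((j - J) / 2.0)


def keep_level_dependent(coeff, j, sigma, n, J):
    """Decision made by comparing |coeff| with the threshold of its level."""
    return abs(coeff) >= level_threshold(sigma, n, j, J)


def keep_scaled_global(coeff, j, sigma, n, J):
    """Same decision, expressed as a scale-adjusted magnitude compared
    with a single global threshold."""
    adjusted = abs(coeff) * 2.0 ** (-j / 2.0)
    t_global = sigma * math.sqrt(2.0 * math.log(n)) * 2.0 ** (-J / 2.0)
    return adjusted >= t_global


if __name__ == "__main__":
    sigma, n, J = 2.0, 1024, 5
    for j in range(J + 1):
        print(j, round(level_threshold(sigma, n, j, J), 4))
```

Sorting the coefficients by the adjusted magnitude `abs(coeff) * 2 ** (-j / 2)` therefore gives an importance ordering for which a single cut-off reproduces the level-dependent rule.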
A number of different structures (ordering schemes) can be
specified on the same set of basis functions. The choice of a
structure can be critical for the success of image denoising. A
good ordering should reflect the prior knowledge about the
signal/data being estimated. For example, it is not sensible to
order a set of polynomial basis functions starting from the
highest order term, or order the Fourier basis functions from the
highest frequency down (because such orderings contradict the
basic assumptions about signal smoothness). Similarly, 2-D
image signal estimation with the VC approach may require a more
complicated ordering scheme.
Motivated by tree structures used in wavelet-based image
compression [7,8], an improved tree-based ordering structure is
proposed in this paper. The basic idea is to simultaneously
exploit the magnitude, scale and spatial location contribution of
each wavelet coefficient using a tree structure. This ordering
scheme includes the following steps:
1) Set the initial threshold $t = 2^{\lfloor \log_2 ( \max_{i,j} \{ |WY(i,j)| \} ) \rfloor}$ ($\lfloor \cdot \rfloor$ denotes
the closest smaller integer), the final threshold $t_f$ (usually 1), and
set the initial ordered coefficient list to an empty list;
2) Scan all the coefficients in order from low scale to high
scale. Within each scale, choose (in a certain order) those selected
coefficients whose magnitudes are equal to or larger than the
threshold $t$ and append them to the list (due to space limits, we
refer readers to [11] for details on which coefficients are selected);
3) Set the coefficients selected in step 2) to N/A (not
available in the next iteration) and halve the threshold $t$;
4) If $t \ge t_f$, repeat steps 2) and 3); otherwise, append
all the remaining coefficients to the list in a certain scanning order.
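A simplified sketch of this threshold-halving ordering is given below. It is not the authors' procedure: the selection rule that decides which coefficients are eligible in each pass is deferred to [11], so the sketch simply takes every remaining coefficient whose magnitude reaches the current threshold, scanning by scale and then by position. The tuple layout `(scale, position, value)` and the function name are hypothetical.

```python
import math


def threshold_halving_order(coeffs, t_final=1.0):
    """Order coefficients by repeated passes with a halving threshold.

    coeffs: list of (scale, position, value) tuples.
    Returns the same tuples in importance order.
    """
    # Scan order: low scale first, then position within the scale.
    remaining = sorted(coeffs, key=lambda c: (c[0], c[1]))
    max_mag = max((abs(c[2]) for c in remaining), default=0.0)
    if max_mag == 0.0:
        return remaining
    # Step 1: initial threshold is the largest power of two <= max magnitude.
    t = 2.0 ** math.floor(math.log2(max_mag))
    ordered = []
    while t >= t_final:
        # Step 2: append coefficients that reach the current threshold.
        ordered.extend(c for c in remaining if abs(c[2]) >= t)
        # Step 3: make them unavailable and halve the threshold.
        remaining = [c for c in remaining if abs(c[2]) < t]
        t /= 2.0
    # Step 4: append whatever is left in scan order.
    ordered.extend(remaining)
    return ordered


if __name__ == "__main__":
    demo = [(0, 0, 9.0), (1, 0, 3.0), (1, 1, -12.0), (2, 0, 0.4), (2, 1, 5.0)]
    print([v for _, _, v in threshold_halving_order(demo)])
```

In the demo, 9.0 is placed before -12.0 even though its magnitude is smaller, because both fall in the same threshold band and the lower scale is scanned first; a pure magnitude ordering would reverse them.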
3. EMPIRICAL RESULTS
We compared the following three denoising methods:
1) WaveThresh: Donoho's level-dependent thresholding
method using (3). The noise standard deviation is calculated
using Donoho's estimate MAD/0.6745 [4], where MAD is the
median of the magnitudes of all the coefficients at the finest
decomposition scale.
2) WaveVC: Order the wavelet coefficients using the tree
structure proposed in the previous section and use the VC-bound to
choose the optimal number of coefficients (minimizing the
bound).
3) Wiener2: Wiener2 in Matlab is a spatial version of the Wiener
filtering algorithm.
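The noise estimate used by WaveThresh in item 1) can be written in a few lines. The sketch below assumes the finest-scale detail coefficients are already available as a flat list; the function name is hypothetical.

```python
import statistics


def estimate_sigma_mad(finest_detail_coeffs):
    """Noise standard deviation estimate MAD / 0.6745, where MAD is the
    median magnitude of the finest-scale wavelet coefficients."""
    mad = statistics.median(abs(c) for c in finest_detail_coeffs)
    return mad / 0.6745


if __name__ == "__main__":
    print(estimate_sigma_mad([-0.6745, 0.6745, 2.0, -0.1, 0.6745]))
```

Because the median is insensitive to a few large coefficients, the estimate is driven by the noise-dominated majority of the finest-scale coefficients rather than by edges.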
Approaches 1) and 2) use biorthogonal wavelet filters. The window
size, a parameter in Wiener2, is set in our experiments to 3 × 3.
Different image sizes are tested. We mainly compare the different
methods on two measures: the Signal-to-Noise Ratio (SNR) of the
denoised image and the model complexity of the approximation.
SNR is defined as:
$$SNR = 10 \log_{10} \left( \frac{\mathrm{var}(Y)}{\mathrm{mse}(Y, \hat{Y})} \right) \qquad (4)$$
where $Y$ is the original clean image and $\hat{Y}$ is the denoised
image. The model complexity of WaveVC and WaveThresh is just
the VC dimension of the model. Wiener2 can be viewed as a
local K-mean method doing some local averaging over the noisy
image. Its model complexity can be approximated by the VC
dimension of the K-mean method, which is n/k [3], with n the
number of samples and k the size of the averaging window. For
example, for a 512 × 512 image with a 3 × 3 window, the model
complexity is 512 * 512 / 3 / 3 ≈ 29127.
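Both measures are easy to compute. The sketch below works on images flattened to lists of pixel values and assumes the population variance (division by the number of pixels) in (4), which the paper does not specify; the function names are hypothetical.

```python
import math


def variance(values):
    """Population variance of a list of pixel values."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def mse(a, b):
    """Mean squared error between two equally sized pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def snr_db(clean, denoised):
    """SNR in dB as defined in (4)."""
    return 10.0 * math.log10(variance(clean) / mse(clean, denoised))


def local_averaging_complexity(num_samples, window_size):
    """Approximate model complexity n/k of local averaging."""
    return num_samples / window_size


if __name__ == "__main__":
    print(snr_db([0.0, 2.0, 4.0, 6.0], [0.1, 1.9, 4.1, 5.9]))
    print(local_averaging_complexity(512 * 512, 3 * 3))
```

The second call reproduces the arithmetic in the text: 262144 / 9 is about 29127.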
Due to space limits, we only show results on the 8-bit Lenna image in
this paper. Fig. 1 and 2 show the comparable denoising results
on 512 × 512 Lenna images corrupted by Gaussian white noise
($\sigma = 15$), using WaveThresh and WaveVC, respectively. Fig. 3
compares the SNR values and the model complexities of the
three approaches on 512 × 512 Lenna images at a variety of
noise levels. The results on 128 × 128 images and
32 × 32 images are shown in Fig. 4 and Fig. 5, respectively. In
these results, multiplicative speckle noise is used to show the
advantages of our proposed method under non-Gaussian settings.
We have similar but less dramatic results for Gaussian noise
settings (which can be found in [11]).
WaveVC performs approximately the same as or better
than WaveThresh for 512 × 512 Lenna images and begins to
outperform WaveThresh for smaller (128 × 128) images. The
relative performance of WaveVC increases further for 32 × 32
images. The results can be explained as follows:
1) VC theory was designed for finite samples, whereas Donoho's
threshold was derived under asymptotic assumptions. As the
image size gets smaller, the asymptotic assumptions begin to fail.
2) The noise assumption used in Donoho's derivation fails when
images are not contaminated by additive Gaussian noise. In
contrast, the VC-based approach is more general in this sense.
As a global trend, WaveVC tends to use a large number of
coefficients for reconstructing the image when the true noise
standard deviation $\sigma$ is small and fewer when $\sigma$ is large.
This is true for different image sizes. So when the noise standard
deviation is fairly small, meaning the image is fairly clean, the VC
approach tends to keep a large number of coefficients, which
makes sense. WaveThresh does not show such a clear trend.
4. CONCLUSIONS
The image denoising problem can be cast as a 2-D signal estimation
problem. In this paper, a VC-based model selection method is
integrated with a variation of the wavelet thresholding method
and performs well on this problem. An importance ordering
structure (the tree structure), which reflects the prior knowledge
about the data and the basis functions used, turns out to
characterize the importance of noisy wavelet coefficients
successfully. However, better ordering schemes may exist
for this wavelet-based denoising problem.
Wiener filtering is an optimal linear MSE estimator, and Donoho
has proven his methods to be minimax optimal under certain
assumptions. However, both methods are based on a white noise
model and hold only in an asymptotic sense. In contrast, the model
selection based denoising method is more general and does not
need any noise assumption. Compared to Wiener filtering,
thresholding uses a sparse structure to approximate the original
signal and so provides a compressed representation of the original
signal (only a small number of coefficients need to be kept).
Our method therefore has a wider range of potential applications.
5. ACKNOWLEDGEMENT
This work was supported, in part, by a grant from the Minnesota
Department of Transportation.
6. REFERENCES
[1] S. G. Chang and M. Vetterli, "Spatial Adaptive Wavelet
Thresholding for Image Denoising", Proc. IEEE Int. Conf. on
Image Processing, 1997
[2] A. Chambolle, R. A. DeVore, N-Y Lee and B. J. Lucier,
"Nonlinear wavelet image processing: variational problems,
compression and noise removal through wavelet shrinkage",
IEEE Trans. Image Processing, vol. 7, pp. 319-335, 1998
[3] V. Cherkassky and F. Mulier, Learning from Data:
Concepts, Theory and Methods, Wiley Interscience, 1998
[4] D. L. Donoho, "Wavelet Thresholding and W.V.D.: A 10-
minute Tour", Int. Conf. on Wavelets and Applications,
Toulouse, France, June 1992
[5] D. L. Donoho and I. M. Johnstone, "Ideal spatial adaptation
via wavelet thresholding", Biometrika, vol. 81, pp. 425-455,
1994
[6] D. L. Donoho, "De-Noising by Soft-Thresholding", IEEE
Trans. Information Theory, vol. 41, no. 3, May 1995
[7] A. Said and W. A. Pearlman, "A New Fast and Efficient
Image Codec Based on Set Partitioning in Hierarchical Trees",
IEEE Trans. Circ. and Syst. Video Tech., vol. 6, June 1996
[8] J. M. Shapiro, "Embedded Image Coding using Zerotrees of
Wavelet Coefficients", IEEE Trans. Signal Processing, vol. 41,
pp. 3445-3462, Dec. 1993
[9] X. Shao and V. Cherkassky, "Model Selection for Wavelet-
based Signal Estimation", Proc. IEEE Int. Joint Conf. on Neural
Networks, Anchorage, Alaska, 1998
[10] V. Vapnik, The Nature of Statistical Learning Theory,
Springer, 1995
[11] S. Zhong and V. Cherkassky, "Image Denoising using
Wavelet Thresholding and Statistical Learning Theory",
submitted to IEEE Trans. Image Processing, Feb. 2000
Fig. 1 Denoised image by WaveThresh (SNR = 24.99 dB)
Fig. 2 Denoised image by WaveVC (SNR = 25.26 dB)
Fig. 3 Denoising results for multiplicative speckle noise on 512 by 512 Lenna image
Fig. 4 Denoising results for multiplicative speckle noise on 128 by 128 Lenna image
Fig. 5 Denoising results for multiplicative speckle noise on 32 by 32 Lenna image
[Plots for Fig. 3, 4 and 5: each figure has two panels plotted against noise standard deviation (0 to 50), one showing SNR (dB) and one showing model complexity (VC-dimension). Markers: "." Wiener2, "+" WaveThresh, "o" WaveVC.]