RECURSIVE IMAGE ESTIMATION AND
INPAINTING IN NOISE USING
NON-GAUSSIAN MRF PRIOR
A THESIS
submitted by
G. R. K. SAI SUBRAHMANYAM
for the award of the degree
of
DOCTOR OF PHILOSOPHY
DEPARTMENT OF ELECTRICAL ENGINEERING
INDIAN INSTITUTE OF TECHNOLOGY, MADRAS
JANUARY 2008
At the lotus feet of my samrdha sadgure
Bhagavaan Sri Venkaiah Swamy
THESIS CERTIFICATE
This is to certify that the thesis entitled “RECURSIVE IMAGE ESTIMATION
AND INPAINTING IN NOISE USING NON-GAUSSIAN MRF PRIOR”
submitted by G. R. K. S. Subrahmanyam to the Indian Institute of Technology,
Madras for the award of the degree of Doctor of Philosophy is a bona fide record of the
research work carried out by him under our supervision. The contents of this thesis,
in full or in parts, have not been submitted to any other Institute or University for
the award of any degree or diploma.
Chennai 600 036 (Dr. A.N. Rajagopalan and Dr. R. Aravind)
Date: Jan. 09, 2008 Research Guides
ACKNOWLEDGEMENTS
I owe a great debt of gratitude and offer my heartfelt pranamas at the lotus feet
of the merciful Bhagavan Sri Venkayya Swami of Golagamudi, who blessed me to pursue
a PhD. I express my gratitude with utmost reverence to my Guru Sri P. Subbaramayya
sir, who took loving care of my academic, physical and spiritual welfare during
my PhD, and to Sri Acharya Bharadwaja master, whose holy books and talks inspired
me to pursue a student career with noble motivations.
I would like to express my deep sense of gratitude and reverence to my thesis
adviser Prof. A. N. Rajagopalan for his constant guidance and encouragement. His
readiness to help with even minute aspects, such as algorithmic coding and the
careful editing of TeX files, remains a motivating memory for me.
I would also like to offer my deep admiration and gratitude to my co-adviser
Prof. R. Aravind for his understanding and support especially during my thesis work.
Without his in-depth clarifications and discussions, it would not have been possible to
make many points explicit in this thesis.
I specially thank Ibrahim, whose work and program code became an interesting
and easy starting point for my work. I would like to convey my warm regards to Suresh,
whose seniority and cooperation enabled me to progress quickly. I thank Rajiv, Aranov,
Paramanad and our other IPCV lab colleagues for their timely help. I acknowledge
the authors from whose publications I have cropped figures.
Finally, I bow at the lotus feet of Sri Sainath of Shiridi, who blessed me with home
food and satsang during my study through my mother. I convey my heartfelt love to my
mother, who took all pains to follow the verdict of swami by providing company and diet
for my recreation and health, to my father, without whose consent and cooperation this
could not have happened, and my best wishes to my brother for his spirit of encouragement.
- G. R. K. Sai Subrahmanyam
ABSTRACT
Keywords: Image estimation, Markov random field (MRF), discontinuity-adaptive
MRF, Kalman filter, auto-regressive model, photographic images, film-grain noise,
synthetic aperture radar, speckle, image inpainting, edge-preserving priors, unscented
transformation, unscented Kalman filter.
The task of image recovery generally refers to deriving an original image from
its observations, by making use of the degradation model and the noise statistics.
Methods for tackling this problem have to do a delicate balancing act of suppressing
noise without losing features of interest. The effect of noise is usually reduced by
constraining the possible set of solutions with suitably chosen image models. Arriving
at a judicious choice for the image prior is a very challenging task. Image degradation
can be of many kinds. In this thesis, we specifically address the following problems:
image estimation in additive white Gaussian noise, reduction of noise due to grains
in images captured on photographic films, suppression of speckle noise in synthetic
aperture radar (SAR) images, and image inpainting in the presence of film-grain noise.
The additive white Gaussian noise (AWGN) model is widely used in many real
situations, including the modeling of thermal noise and of noise in medical images [1, 2].
The photographic film is a very popular image sensor. The degradation phenomenon
associated with it is film-grain noise. Even though this noise can be modeled as additive
white Gaussian in the film density domain, the observation model is nonlinear [3, 4].
Synthetic aperture radar (SAR) imaging systems are a preferred choice for aerial
photography. Coherent processing, which is inherent in SAR systems, results in interference
patterns called speckle noise, which is multiplicative in nature [5]. Yet another type
of degradation results from loss of information in portions of an image due to aging,
scratches, blotches or occlusions. Image inpainting refers to the process of filling in
such damaged or missing regions [6]. A more complex problem arises when inpainting
must be done in the presence of noise. This has important applications in restoring
images and old movies captured on photographic films.
Among the various techniques proposed in the literature for handling image
degradations, recursive approaches are popular due to their memory and implementation
advantages. These methods are based on a dynamic state-space formulation, and facilitate
easy incorporation of spatial adaptivity into the estimation procedure. The image
is modeled by a state transition equation and the degradation process by a
measurement model. The state is recursively predicted from the state equation by
summarizing the information in the past estimates, while the update step corrects the
prediction using the current measurement. The recursive Bayesian filter propagates
the state conditional density sequentially based on the given state and measurement
models. The well-known Kalman filter (KF) [7, 8] belongs to this class, and is optimal
under the assumptions of a linear model and Gaussian noise. Techniques based on
the Kalman filter usually rely on a homogeneous auto-regressive (AR) model. A major
issue with AR-based KFs is that the incorporation of contextual knowledge such as an
edge-preserving prior is non-trivial. In more general nonlinear and/or non-Gaussian
situations, one seeks approximations to the Bayesian recursion [9, 10].
In this thesis, we first explore contextual modeling with a Markov random field
(MRF) based discontinuity-adaptive image prior within the Kalman filtering
framework for filtering images corrupted by additive Gaussian noise. The desired moments
of the non-Gaussian prior are estimated using a Monte Carlo sampling technique.
On the many images tested, the proposed approach performs markedly better than the
traditional 2D Kalman filter.
A recent nonlinear counterpart of the KF referred to as the unscented Kalman filter
(UKF) has been found to be quite effective in several 1D nonlinear estimation tasks.
It yields more stable and accurate estimates than the extended Kalman filter (EKF)
and uses exact nonlinear models. We investigate the applicability of the UKF for
nonlinear/non-additive image estimation tasks. A small set of deterministic samples
known as sigma points is used to capture and propagate the first two moments of the
state through the AR state model and true observation nonlinearity. The statistics at
the update step of the UKF are determined from the transformed sigma points.
We further explore the incorporation of an edge-preserving prior within the UKF
framework to accomplish excellent noise removal in conjunction with feature
preservation. We demonstrate the effectiveness of the proposed UKF-based schemes for
film-grain noise removal in photographic images, and for despeckling SAR imagery. Several
examples (both synthetic and real) are given and the results are also compared with
existing methods.
Finally, we consider the problem of filling in damaged or missing pixels in
photographic images by digital inpainting. We propose a novel procedure to reconstruct the
edges over damaged regions and develop an adaptive inpainting method which is guided by
the reconstructed edge map. The proposed edge-based inpainting algorithm is
embedded within the UKF framework to accomplish simultaneous film-grain denoising and
inpainting. The proposed approach is validated on simulated as well as real cases.
TABLE OF CONTENTS
Abstract
List of Tables
List of Figures
Abbreviations
Notations
1 INTRODUCTION
1.1 Image Degradations
1.2 Literature Review
1.2.1 Image Estimation in AWGN
1.2.2 Photographic Film-Grain
1.2.3 SAR Speckle
1.2.4 Digital Inpainting
1.3 Motivation
1.4 Objectives and Scope of the Thesis
1.4.1 Contributions of the Thesis
1.5 Organization of the Thesis
2 MARKOV RANDOM FIELDS
2.1 Introduction
2.1.1 Lattice, Sites, and Labels
2.1.2 Neighborhood System and Cliques
2.2 Markovianity
2.3 Clique Potentials
2.3.1 Potential Function and Gibbs Random Field
2.3.2 MRF-Gibbs Equivalence
2.4 MRF Priors
2.4.1 Gaussian MRF
2.4.2 Edge-preserving Priors
2.5 Discussion
3 IMPORTANCE SAMPLING KALMAN FILTER FOR IMAGE ESTIMATION
3.1 Introduction
3.2 Auto-Regressive Kalman Filter
3.2.1 Filter Equations
3.3 Discontinuity-adaptive MRF
3.3.1 Condition for DA Potentials
3.3.2 DAMRF Prior for Recursive Estimation
3.4 Principle of Importance Sampling
3.4.1 Moment Estimation
3.5 Kalman Filter with Non-Gaussian Prior
3.6 Experimental Results
3.7 Discussion
4 UNSCENTED FILTER FOR NON-LINEAR ESTIMATION
4.1 Unscented Transformation
4.2 UT Analysis
4.2.1 Accuracy of the Mean
4.2.2 Accuracy of the Covariance
4.3 Illustration of UT
4.4 UT-based Extension of the Kalman Filter
4.5 Discussion
5 NOISE REDUCTION IN PHOTOGRAPHIC IMAGES
5.1 Introduction
5.2 Auto-Regressive UKF
5.3 Importance Sampling UKF
5.3.1 ISUKF for Non-linear Image Estimation
5.4 Film-grain Noise
5.4.1 The Proposed Filters
5.5 Experimental Results
5.5.1 Simulations
5.5.2 Real Examples
5.6 Discussion
6 DESPECKLING SAR IMAGERY
6.1 Image Formation
6.1.1 SAR Methods
6.2 SAR Metrics
6.3 Noise Reduction in SAR
6.3.1 Speckle Suppression using ISUKF
6.4 Experimental Results
6.5 Discussion
7 JOINT INPAINTING AND DENOISING
7.1 Introduction
7.2 An Edge-based Approach
7.2.1 Reconstruction of Edge Image
7.2.2 The Proposed Method
7.3 Results for Inpainting
7.4 Inpainting in Presence of Noise
7.4.1 ISUKF for Simultaneous Inpainting and Filtering
7.5 Results for Joint Recovery
7.6 Discussion
8 CONCLUSIONS
8.1 Suggestions for Future Work
Bibliography
LIST OF PAPERS BASED ON THESIS
LIST OF TABLES
3.1 Moment estimation using IS with a proper choice of importance function.
3.2 Moment estimation using IS with a bad choice of importance function.
3.3 Comparison of per-pixel computational complexity: ARKF vs ISKF.
5.1 Computational complexity comparison at each pixel.
6.1 Quantitative comparison with standard filters.
6.2 Quantitative comparison.
6.3 Quantitative comparison with wavelet filters.
LIST OF FIGURES
2.1 Symmetric neighborhood system for (a) first-order, and (b) second-order.
2.2 Non-symmetric half-plane support at a pixel (m, n).
2.3 NSHP neighborhood system for (a) first-order, and (b) second-order.
2.4 Cliques for NSHP neighborhood.
2.5 Edge-preserving convex potentials. The x and y axes correspond to η and g(η), respectively.
2.6 Discontinuity-preserving non-convex potentials. The x and y axes correspond to η and g(η), respectively.
3.1 DA model. (a) Interaction of neighbors as a function of difference. (b) Penalty imposed by the DA model with increasing difference in intensity values.
3.2 (a) Plot shows how the smoothing strength of DAMRF varies with η^2. (b) Heavy-tailed DAMRF distributions.
3.3 Choice of importance function. (a) Good choice: The support of the target pdf is included in the sampler. (b) Bad choice: The support of the same target pdf is not included in the sampler.
3.4 Choice of importance function. (a) Good choice: Sampler support includes (non-symmetric) target pdf. (b) Bad choice: Non-symmetric sampler cannot be employed for IS of a symmetric target distribution.
3.5 (a) Original image. (b) Degraded image (SNR = 10 dB). Image estimated by (c) ARKF (ISNR = 3.06 dB), and (d) the ISKF (ISNR = 4.44 dB, γ = 1.8).
3.6 (a) Original image. (b) Degraded image (SNR = 10 dB). Image estimated using (c) ARKF (ISNR = 1.29 dB), and (d) the ISKF (ISNR = 2.43 dB).
3.7 (a) Original “House” image. (b) Degraded (σ_v^2 = 300). Image estimated using (c) AR-based KF (ISNR = 2.04 dB), and (d) ISKF (ISNR = 2.58 dB).
3.8 Building (a) Original. (b) Degraded image (σ_v^2 = 300). Image estimated by (c) AR-based KF (ISNR = 2.17 dB), and (d) ISKF (ISNR = 3.81 dB).
3.9 Daisy (a) Original image. (b) Degraded (σ_v^2 = 500). Image estimated using (c) AR-based KF (ISNR = 4.14 dB), and (d) ISKF (ISNR = 5.36 dB).
3.10 Performance comparison of ARKF and ISKF on different images for moderate noise of σ_v^2 = 300.
3.11 Plane image. (a) Original. (b) Degraded [11]. Image estimated using (c) BL-RUBF [11] (ISNR = 1.01 dB), and (d) ISKF (ISNR = 1.67 dB).
4.1 Principle of unscented transformation.
4.2 Block diagram of UT.
4.3 Posterior samples (a) Monte Carlo, and (b) UT.
4.4 Figure shows the posterior mean and uncertainty (1-σ) contour of covariance, determined by MC approach, i.e., true values (mean at '*'), linearization (mean at '+'), and unscented transformation (mean almost matches with that of MC mean).
4.5 Signal estimation: (—) original, (- . -) observations, and (- - -) UKF estimates.
4.6 Signal estimation: (—) original, (- . -) EKF, and (- - -) UKF estimates.
5.1 Auto-regressive unscented Kalman filter (ARUKF).
5.2 Importance sampling unscented Kalman filter (ISUKF).
5.3 (a) Original ‘flower’ image. (b) Degraded image. Image estimated using (c) MWF (ISNR = 2.86 dB), (d) PF (ISNR = 2.55 dB), (e) ARUKF (ISNR = 3.42 dB), and (f) ISUKF (ISNR = 4.63 dB).
5.4 (a) The ‘house’ image. (b) Degraded image (σ_v^2 = 0.05). Result using (c) MWF (ISNR = 2.51 dB), (d) PF (ISNR = 3.48 dB), (e) ARUKF (ISNR = 3.40 dB), and (f) ISUKF (ISNR = 4.55 dB).
5.5 (a) The ‘peppers’ image. (b) Degraded image. Output of (c) MWF (ISNR = 2.98 dB), (d) PF (ISNR = 4.47 dB), (e) ARUKF (ISNR = 4.12 dB), and (f) ISUKF (ISNR = 5.26 dB).
5.6 Performance comparison on different images in terms of mean value of ISNR over 20 MC runs (a) at moderate noise (σ_v^2 = 0.05), and (b) at high noise (σ_v^2 = 0.15).
5.7 (a) Cropped portion of a frame from the movie ‘Das Testament des Dr. Mabuse’. Output of (b) MWF, (c) PF, (d) ARUKF, and (e) ISUKF.
5.8 (a) Face image with real film-grain noise. Image estimated using (b) MWF, (c) PF, (d) ARUKF, and (e) ISUKF.
5.9 (a) A real building image. Output image obtained using (b) MWF, (c) PF, (d) ARUKF, and (e) ISUKF.
5.10 Cropped portion of a locomotive captured with a film camera. (a) Original with real film-grain noise. Image estimated using (b) PF, (c) ARUKF, and (d) ISUKF.
6.1 (a) Original image. (b) Noisy version. Output of (c) Enhanced Lee (FOM = 0.92, S/MSE = 20.51 dB), (d) Frost (FOM = 0.94, S/MSE = 20.51 dB), and (e) the proposed method (FOM = 0.98, S/MSE = 22.83 dB).
6.2 (a) Original SAR image. (b) Degraded (σ_v^2 = 0.04). Image estimated using (c) the Enhanced Lee filter, (d) the Frost filter, (e) AR-based UKF, and (f) the ISUKF.
6.3 (a) Original aerial image. (b) Degraded image. Image estimated using (c) Rayleigh prior MAP estimator [12], and (d) the proposed ISUKF.
6.4 (a) The Horse track image. Estimated output image using (b) the Frost filter, and (c) our method.
6.5 (a) Bedfordshire image. Result using (b) edge-based Bayesian wavelet filter [13], and (c) the ISUKF method.
6.6 (a) Airport image. Result using (b) MAP estimator in UDWL domain [14], and (c) the proposed method.
6.7 (a) Urban SAR image. (b) Degraded image. Image estimated using (c) wavelet-based filter [15], and (d) ISUKF.
6.8 (a) Real SAR image. Image estimated using (b) simulated annealing and Metropolis estimator [16], and (c) ISUKF.
7.1 A typical edge-based inpainting methodology (from [17]): (left) general algorithm outline, and (right) an illustration of the outputs for each stage.
7.2 (a) A real image superimposed with text. (b) Mask specifies the region where the original image information is lost. (c) A damaged photograph, and (d) its mask that specifies the regions to be filled in.
7.3 Edge reconstruction: (a) Degraded image showing maximum matching area dimensions. (b) Edge map of the degraded image. (c) Located end points, and (d) the reconstructed edge image.
7.4 Proposed inpainting algorithm.
7.5 Brick image (a) Original, (b) scratched, and (c) inpainted.
7.6 A synthetic image (a) with a ring mask. (b) Inpainted output. (c) Result of Bertalmio [6].
7.7 Peppers image. (a) Original, (b) scratched, (c) edges from degraded image, (d) reconstructed edges, and (e) inpainted image using reconstructed edge map.
7.8 An old painting (a) Degraded image. (b) Edge map from degraded image. (c) Reconstructed edge map. (d) Inpainted result using reconstructed edge map. (e) Output cropped from [6].
7.9 A real image (a) with mask on damaged region. (b) Inpainted result using our method. (c) Result of Bertalmio [6].
7.10 (a) A bird image superimposed with text. Inpainted result using (b) proposed method, and (c) method in [18].
7.11 Horse-cart image (a) with superimposed text. (b) Inpainted output using proposed method. (c) Output of [6].
7.12 Proposed framework for inpainting in the presence of noise.
7.13 (a) Image degraded by film-grain noise and a big patch. (b) Output of the proposed method.
7.14 Boat (a) Original. (b) Degraded and scratched. (c) Recovered image.
7.15 Peppers (a) Original. (b) Degraded and superimposed with text. (c) Reconstructed edge map. (d) Image recovered by the proposed filter (ISNR = 16.16 dB). (e) Degraded with high noise. (f) Recovered result using reconstructed edge map (when input is (e)) (ISNR = 14.38 dB).
7.16 Face with real film-grain. (a) Scratched version. (b) Inpainted and filtered result (when input is (a)). (c) Degraded by patches. (d) Image recovered by our filter (when input is (c)).
7.17 Mabuse (a) with scratches. (b) Edge image from (a). (c) Reconstructed edge image (when input is (b)). (d) Image recovered by the proposed filter.
7.18 (a) Scanned painting with real film-grain noise. (b) Scratched version, and (c) recovered image.
7.19 (a) An image captured using a film-camera, and (b) recovered output.
7.20 (a) A child image with text, scratch and film-grain noise. (b) Inpainted and filtered output of the proposed method.
ABBREVIATIONS
AWGN : Additive white Gaussian noise
MRF : Markov random field
AR : Auto-regressive
DA : Discontinuity-adaptive
DAMRF : Discontinuity-adaptive Markov random field
IS : Importance sampling
MAP : Maximum a Posteriori probability
pdf : Probability density function
KF : Kalman filter
ARKF : Auto-regressive Kalman filter
ISKF : Importance sampling Kalman filter
EKF : Extended Kalman filter
UKF : Unscented Kalman filter
UT : Unscented transformation
MWF : Modified Wiener filter
PF : Particle filter
MMSE : Minimum mean-square error
NSHP : Non-symmetric half-plane
2D : Two-dimensional
MC : Monte Carlo
ISNR : Improvement in signal-to-noise ratio
SAR : Synthetic aperture radar
ENL : Equivalent number of looks
ARUKF : Auto-regressive unscented Kalman filter
ISUKF : Importance sampling unscented Kalman filter
NOTATIONS
M, N : Number of rows and columns of an image
S : Lattice
L : Set of labels
F : Configuration space
N : Neighborhood system
N_{m,n} : Set of sites neighboring the site (m, n)
R_{m,n} : NSHP support of order M_1 at location (m, n)
c : A clique
C : Collection of all cliques
F : Random field
f : Configuration of F
p(.) : Probability density function
V_c : Potential function of clique c
γ : A constant controlling the convexity of DA function
ρ_{m,n} : Scale parameter of DAMRF density
s(m, n) : Pixel intensity at location (m, n)
s : Vector containing first-order NSHP support of (m, n)
η^2(s(m, n), s) : Energy term at (m, n) with neighbors s
x(m, n) : State vector at location (m, n)
y(m, n) : Known measurement at location (m, n)
u(m, n) : State noise with variance σ_u^2
u(m, n) : Vector state noise
v(m, n) : Measurement noise with variance σ_v^2
v(m, n) : Vector measurement noise
a(i, j) : AR model coefficients
E : Expectation operator
q(.) : Sampler density
z_l : l-th sample of random variable z
L : Number of samples
µ_p : Estimate of predicted mean
σ_p : Estimate of predicted covariance
f(.) : Known system transition function
F : State transition matrix
h(.) : Known measurement degradation function
g(.) : Arbitrary nonlinear function
s(m, n) : Mean estimate of the pixel intensity at location (m, n)
s_k : Mean estimate of the state vector at location k
P_k : Estimate of the state covariance at location k
P_{k/k−1} : Predicted state covariance matrix at location k
X : State sigma point matrix
Y : Measurement sigma point matrix
w : UT weights
κ, λ : UT scaling parameters
P(m, n) : State covariance at location (m, n)
P_x : Covariance of x at location (m, n)
P_{xy} : Cross covariance of x and y at location (m, n)
x(m, n) : Mean estimate of the state vector at location (m, n)
e(m, n) : Degraded image pixel in the exposure domain
Ω : Inpainting region
δΩ : Boundary of the inpainting region
CHAPTER 1
INTRODUCTION
1.1 IMAGE DEGRADATIONS
Imaging systems introduce different types of degradations resulting in an imperfect
rendition of the original scene. Blur is a spatial degradation caused by relative
motion between scene and camera or by an out-of-focus optical system [19]. Noise is
a point-wise degradation and refers to stochastic variations in the image intensity.
The thermal motion of electrons in electronic components constitutes electronic noise.
Thermal noise is additive white Gaussian (AWG) and signal-independent [1].
Quantization noise that results from digitization can be modeled by a correlated,
space-varying Gaussian distribution [20]. Photon noise is non-additive and Poisson
distributed, but approaches the normal distribution for large photon counts [1]. The
perturbation in magnetic resonance imaging systems is characterized by additive white
Gaussian noise (AWGN) [21]. Restoring an original image in the absence of blur is
referred to as image estimation.
Yet another type of degradation arises from the nonlinear input-output
characteristics of imaging sensors and scanners. In photographic films, the silver density
deposited on the film is logarithmically related to the light exposure [3, 4]. The
random grains that develop during the film formation process result in film-grain noise,
which can be modeled as AWG in the film density domain [4]. Aerial images captured
by synthetic aperture radar (SAR) systems are affected by multiplicative noise known
as speckle [22]. It results from the interference of back-scattered radar signals and is
usually modeled by a Gamma distribution [12].
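The three noise models described so far can be sketched in a few lines of code. The
following is a minimal illustration, not the simulation code used in this thesis: the
density-domain constants alpha and beta, the unit-mean Gamma parameterization with the
number of looks as its shape, and the flat test patch are all assumptions made for the
example.

```python
import math
import random


def add_awgn(s, sigma_v, rng):
    """Additive model: y = s + v, with v ~ N(0, sigma_v^2)."""
    return [p + rng.gauss(0.0, sigma_v) for p in s]


def film_grain_density(exposure, sigma_v, rng, alpha=1.0, beta=0.0):
    """Density-domain model: d = alpha*log10(e) + beta + v.

    alpha and beta are hypothetical film constants; the noise v is AWG in
    the density domain, so the observation is nonlinear in the exposure e.
    """
    return [alpha * math.log10(e) + beta + rng.gauss(0.0, sigma_v)
            for e in exposure]


def add_speckle(s, looks, rng):
    """Multiplicative model: y = s * n, with n ~ Gamma(shape=looks, scale=1/looks).

    This (assumed) parameterization gives E[n] = 1 and Var[n] = 1/looks.
    """
    return [p * rng.gammavariate(looks, 1.0 / looks) for p in s]


rng = random.Random(0)
s = [100.0] * 4096                      # a flat patch of intensity 100
y_awgn = add_awgn(s, 5.0, rng)
d_film = film_grain_density(s, 0.05, rng)
y_speckle = add_speckle(s, 4.0, rng)
```

On a flat patch, the additive and multiplicative observations both average close to
the true intensity, but the spread of the speckled samples scales with the intensity
itself, which is why a filter designed for AWGN does not carry over directly.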
Image degradations are not limited to the above; they also include situations (such
as in old films and photographs) where there is complete loss of information in some
regions of the image due to aging, scratches, etc. [17, 23]. Sometimes important
information is lost because of occlusion by another object or by superimposed text [6, 24].
The process of filling in lost information is called image inpainting [6].
In this thesis, we investigate and propose new methods for image estimation in
AWGN, film-grain noise reduction, SAR despeckling, and joint image inpainting and
noise filtering.
1.2 LITERATURE REVIEW
Developing effective image recovery techniques is important for improving image
quality, and continues to be an active area of research. These techniques must not only
rely on knowledge of the degradation process and the statistics of the noise, but must
also effectively exploit contextual image characteristics. They differ in how they
incorporate prior knowledge about the original scene into the estimation procedure.
Image recovery in the presence of noise is difficult not only because noise can be
non-Gaussian and/or non-additive, but also because preserving edges is equally
important for good qualitative and quantitative performance.
1.2.1 Image Estimation in AWGN
Work on suppressing AWG noise began with frequency domain approaches which
exploit the differences in the frequency or the spectral content between the original signal
and the noise for recovery. The classical Wiener filter (WF) [25] assumes
stationarity of signal and noise, and minimizes the mean-squared error between the original
and the restored image. This requires knowledge of the power spectral densities of the
original image and the noise. The filter attenuates each frequency according to the
signal-to-noise ratio at that frequency. The constrained least-squares filter [26] provides
additional control over the restoration process.
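For the noise-only case (no blur), with signal and noise uncorrelated, the Wiener
filter reduces to a per-frequency gain equal to the signal power divided by the sum of
signal and noise power. The sketch below evaluates this gain for hypothetical spectra;
the numbers are illustrative assumptions and are not taken from [25].

```python
def wiener_gain(signal_psd, noise_psd):
    """Per-frequency Wiener gain for additive noise without blur.

    H = S_s / (S_s + S_v), where S_s and S_v are the power spectral
    densities of the signal and the noise at each frequency bin.
    """
    return [ps / (ps + pv) for ps, pv in zip(signal_psd, noise_psd)]


# Hypothetical spectra: signal power falls with frequency, noise is white.
signal_psd = [100.0, 10.0, 1.0, 0.1]
noise_psd = [1.0, 1.0, 1.0, 1.0]
gains = wiener_gain(signal_psd, noise_psd)
```

The gain stays near one where the signal dominates and falls towards zero where the
noise dominates. Since edges carry high-frequency content, this is also where a
stationary Wiener filter tends to blur them.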
The Kalman filter (KF) is well-known for estimation of signals in AWGN and is
2
based on dynamic state-space formulation [7]. The KF was extended to 2D in [8] and
applied to remove AWGN from images. The reduced-order model Kalman filter [27]
achieves equivalent performance but with reduced computational complexity. Effects
of distortion resulting from noise can be reduced by Kalman filtering, provided the
image parameters such as the auto-regressive (AR) coefficients are known. In general,
however, these parameters are a priori unknown, and can also vary spatially as a
function of the image coordinates. Image estimation by KF with space-varying AR
parameters is considered in [28]. Jeng and Woods [29] propose residual and normalized
image models using local statistics based on the observation that a residual image can
be better predicted (by the AR model) than the original gray level image. Many
algorithms explicitly incorporate image edge-information within the KF framework.
An edge-adaptive filter for restoration of noisy and blurred images based on multiple
models has been presented in [30]. A modified KF in which edge information is used
to improve step-response is presented in [31]. In [32], the parameters of the AR model
are continuously updated to model non-homogeneity of the image. Kadaba et al.
[11] observe that the error residuals in the AR model can be better fitted by a Bernoulli-Laplacian
distribution. They analytically derive a sub-optimal recursive Bayesian filter
that can partially incorporate the non-Gaussian nature of error residuals. In [33], they
extend their framework to incorporate Cauchy AR residuals.
Based on the prior models of the image and the likelihood density specified by
the degradation model, many maximum a posteriori (MAP) techniques have been
developed. A great deal of image processing work stems from prior stochastic models
for image structure given by Markov random fields (MRFs). Geman and Geman [34]
exploit the equivalence between MRFs and Gibbs distributions to model images. The
optimal image estimate is given as a fixed point of an iterative procedure that relies on
neighborhood-dependent updates. Several MAP-based approaches have been proposed
to incorporate edge-preserving priors in the estimation procedure [35, 36]. In [37], a
compound Gauss MRF model is proposed for the image and its MAP estimate is
obtained by simulated annealing. Roth et al. [38] propose a Field of Experts (FoE)
model by learning the prior probability with generic images within a higher-order
MRF formulation. Edge-preserving image recovery using discontinuity-adaptive finite
Markov random fields has been considered in [39]. A combination of homogeneous and
inhomogeneous conditional densities has been used in [40] for Bayesian estimation of
images. There is also an increasing interest in handling non-Gaussian situations using
Monte-Carlo sampling methods [41, 42].
Yet another class of algorithms is based on partial differential equations (PDEs)
[43, 44]. These methods have their origins in the heat equation (referred to as anisotropic
diffusion) [45]. The total variation of an image is minimized numerically in [46] subject
to constraints involving the statistics of the noise. In recent years, wavelet transform-
based approaches have been proposed which perform a multi-resolution analysis of the
degraded image to effectively differentiate and suppress noise in the wavelet domain
[47, 48]. An approach that exploits the sparseness and independent nature of wavelet
coefficients is presented in [49]. The importance of prior modeling of the wavelet
coefficients is brought out in [50].
1.2.2 Photographic Film-Grain
In this problem, the main issue is to handle sensor non-linearity. Existing methods
for film-grain noise removal include extensions of the Wiener filter (WF) [4], MAP-
based approaches [51], and wavelet techniques [52]. An adaptive Wiener filter based
on non-stationary first-order statistics of the image is proposed in [53]. Tekalp et al. [4]
incorporate the effect of sensor nonlinearity into the Wiener filter structure to recover
from blurred and noisy photographic images. Both these variations of the Wiener filter
[4,53] result in good improvements over the conventional Wiener filter for suppressing
film-grain noise. In [54], the noise model parameters are estimated using higher-order
statistics. The filter coefficients are obtained by minimizing a cumulant criterion based
on the extended Wiener-Hopf equations.
Andrews et al. [3] expand the nonlinear observation model into a Taylor series
and derive an approximate filter for recovering the original image. In [55], using
the D − log E curve of the film, and assuming Gaussian models for image and noise
statistics, a MAP estimate of the restored image is derived. The resulting non-linear
equations are solved by steepest-descent using fast transform computations, and sig-
nificant improvements are reported over the WF. Image estimation based on different
priors is considered in [56]. The authors employ numerical integration and solve higher-
order polynomials to find the MAP estimate. In [51], the image is modeled as Gaussian
and a signal-independent transformation is used to reduce the complexity of the MAP
solution. Moldovan et al. [57] propose a Bayesian approach using learned spatial prior
and a likelihood based on inhomogeneous beta distribution. This is used to de-grain
individual frames of archival films. The particle filter (PF) has recently been proposed
for linear and non-linear image estimation in [58]. The PF discussed in [59]
outperforms the well-known modified WF (MWF) [4] in suppressing film-grain noise
in photographic images.
An undecimated wavelet domain technique for reducing film-grain noise is pro-
posed in [52]. Using a set of training images, the wavelet coefficients are thresholded
for different noise levels by optimizing a cost function which is related to visual quality.
In [60], film-grain noise reduction is performed both in spatial and wavelet domains.
The detail wavelet coefficients are adaptively shrunk using the local statistics derived
from the noisy image. The wavelet filter is shown to outperform its spatial counter-
part. Noise reduction in video sequences of old movies is accomplished in [61] with a
combination of wavelet and Wiener filtering techniques.
1.2.3 SAR Speckle
Research on multiplicative speckle suppression in SAR images began with adaptive
denoising techniques which rely on local statistics. Frost et al. [62] explore the property
of coefficient of variation and propose an exponentially weighted kernel for speckle
suppression. Adaptive speckle suppression filters proposed by Lee et al. [63] and Kuan
et al. [64] use the local statistics of the degraded image and the multiplicative model
of speckle noise. Enhanced versions proposed by Lopes et al. [65] filter independently
in homogeneous, heterogeneous and isolated point regions. Other well-known early
approaches to SAR despeckling include the Gamma-MAP filter [66] and the nonlinear
geometric filter [67]. In [68], speckle is suppressed by employing an adaptive block-
Kalman filter, which relies on block-varying AR parameters of the original image.
An adaptive MAP estimator with a heavy-tailed Rayleigh model has been suggested
in [12]. Speckle suppression by anisotropic diffusion using PDE-based approaches is
considered in [69, 70].
In recent years, wavelet techniques have been found to be effective for SAR image
denoising. Wavelet despeckling filters adaptively modify the wavelet coefficients of
the log-transformed speckled image. A wavelet method based on soft-thresholding has
been proposed in [15]. Xie et al. [71] have proposed a despeckling algorithm that fuses
Bayesian wavelet denoising with a regularizing Gaussian-mixture prior. Many SAR
despeckling methods report improved performance with non-Gaussian prior modeling
of the wavelet coefficients within the MAP framework [72,73]. The choice of prior for
encoding SAR data is discussed in [74].
1.2.4 Digital Inpainting
Digital inpainting provides a means for reconstruction of small identified damages and
is also useful for removing superimposed text such as dates, subtitles, publicity or
logos from still images or videos [6, 17]. The main issue in inpainting methods is to
recover image information at damaged edges and lost texture regions [75]. There are
two major categories of inpainting algorithms. The first category propagates infor-
mation along the isophote (lines of equal gray values) direction [6] by formulating
the problem in terms of partial differential equations or elastica models. The second
category relies on edge or structure information [17]. We provide a detailed review
of inpainting approaches in chapter 7. Traditional inpainting algorithms aim only at
filling-in identified damages or removing superimposed text or small objects in images
[6, 17]. To the best of our knowledge, simultaneous image recovery from (film-grain)
noise and damages has not been addressed in the literature. In traditional denoising,
the pixels contain information about underlying data as well as noise, while in image
inpainting, there is no information available in the region to be inpainted [6] which
compounds the problem further.
1.3 MOTIVATION
The classical problem of image estimation in AWGN is difficult because one must meet
the twin but contradictory objectives of noise reduction and detail preservation. The
Kalman filter is an elegant recursive filter that has been traditionally employed for
this purpose. However, the homogeneous image model, generally employed in KF [8],
can only provide limited compromise between noise removal and edge preservation.
For better performance, exploiting the non-stationary feature of the KF, algorithms
have been proposed to model the images with space-varying AR state models [28].
The state-space models are formulated with respect to residual and normalized images
to benefit from the use of the local statistics of the image [29]. However, such an
extension requires that the autocorrelation coefficients of the noise-free image be known
accurately. Explicit incorporation of edge-preserving priors within the KF framework
is an interesting research issue [11, 33].
The ubiquitous film is an important image recording medium. The degradation
phenomenon associated with it is film-grain noise which is not only visibly objection-
able but also renders compression difficult due to high entropy. Because the observa-
tion model is nonlinear, the noise turns out to be multiplicative and non-Gaussian in
the (visual) exposure domain [4]. A recent technique which is based on the recursive
particle filter (PF) [58] has the capability to incorporate sensor nonlinearity into its
estimation procedure. But its performance is constrained by the use of a homogeneous
autoregressive (AR) model. Also, it is computationally intensive. The widely em-
ployed nonlinear extension of the Kalman filter, the Extended Kalman filter (EKF),
only linearizes the state and observation nonlinear models about the (current) mean
of the state. The EKF is known to introduce considerable errors in the estimates
and is prone to instability. Handling sensor nonlinearity in the presence of noise is a
challenging problem indeed.
In synthetic aperture radar (SAR) imaging systems, adaptive speckle suppression
filters [62,63] are based on the local statistics of the degraded image. These are simple
but tend to over-smooth image texture. They also suffer from ineffective denoising
around edges. The adaptive block-Kalman filter uses a block-wise varying AR model
for suppressing speckle noise [68]. However, it requires the AR coefficients of the noise-
free image to be known accurately. Recently proposed wavelet domain [13] and MAP
approaches [12] have superior performance but require explicit optimization and/or
parameter estimation. Developing methods for reducing the effect of speckle while
preserving point and line targets is an active area of research. A direct application
of the Kalman filter or its linear extensions [8, 11, 28] can only deal with linear and
additive noise models. The EKF is also limited by the additive model assumption.
Inpainting methods typically propagate information either along the isophote di-
rection or explicitly rely on edge or structure information. Edge-based inpainting
methods [17] first reconstruct the edge information in the damaged region. Then the
filling-in is performed within each object guided by the reconstructed edges. For this
class of algorithms, there are two main issues to be addressed: i) how do we recon-
struct the edges in the region to be inpainted using the boundary information? and
ii) how do we utilize the reconstructed edges in propagating gray level values into the
region to be inpainted? Research that addresses the problem of dual degradation due
to film-grain noise and loss of information is still in its infancy.
1.4 OBJECTIVES AND SCOPE OF THE THESIS
A common goal underlying all image estimation techniques is that of noise reduc-
tion in conjunction with feature preservation. We propose to meet this goal within
a recursive framework. Recursive techniques have excellent memory and implementa-
tional advantages. Moreover, they permit spatial adaptivity to be easily incorporated
into the filter model [76]. The state equation of the Kalman filter imposes a strong
linear constraint on state evolution which hinders true transitions at image edges.
We first propose a novel methodology to incorporate a non-Gaussian Markov ran-
dom field (MRF) prior into the recursive Kalman filter for estimating images cor-
rupted by AWGN. We begin with the intuition that incorporating edge-preserving
conditional priors can considerably improve performance. Specifically, the incorpora-
tion of a non-Gaussian prior in the Kalman filter amounts to determining the first two
moments of the prior at each recursive step. We formulate a discontinuity-adaptive
MRF (DAMRF) prior using the continuous adaptive regularizer model proposed in
[77, 78] so as to suit recursive estimation. However, the image model becomes non-
Gaussian and intractable for direct moment estimation. In recent years, there has
been an increasing interest in the application of Monte Carlo methods for several sig-
nal and image processing problems [41,58]. The power of these methods lies in random
sampling to approximate complex multi-dimensional integrals with simple summations
[79]. In particular, the importance sampling (IS) method [79,80] provides a convenient
methodology to estimate the moments under any arbitrary probability density function
(pdf), known up to a multiplicative constant. This is achieved by using weighted
samples of another roughly approximate pdf from which it is easy to draw samples.
We employ importance sampling to estimate the first two moments of the DAMRF
non-Gaussian prior which effectively uses the samples of an appropriately chosen sam-
pler density. Since the first two moments of the DAMRF prior constitute the predicted
mean and covariance of the state, the update is carried out exactly as in traditional
KF to arrive at the final image estimates. The performance of the proposed filter is
demonstrated with many examples.
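The moment estimation step described above can be sketched in a few lines of Python. This is only an illustration of self-normalized importance sampling, not the DAMRF formulation developed in this thesis: the unnormalized Gaussian target, the Gaussian sampler density, and the names is_moments and log_target are assumptions chosen so that the result can be checked against known values.

```python
import numpy as np

def is_moments(log_target, sampler_mean, sampler_std, n_samples=20000, seed=0):
    """Estimate the mean and variance of a 1D target density that is known
    only up to a multiplicative constant, using weighted samples drawn from
    a Gaussian sampler density (self-normalized importance sampling)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(sampler_mean, sampler_std, size=n_samples)
    # Log of the sampler density up to a constant; constants cancel once
    # the weights are normalized to sum to one.
    log_q = -0.5 * ((x - sampler_mean) / sampler_std) ** 2
    log_w = log_target(x) - log_q
    w = np.exp(log_w - log_w.max())  # subtract the max for numerical stability
    w /= w.sum()
    mean = float(np.sum(w * x))
    var = float(np.sum(w * (x - mean) ** 2))
    return mean, var

def log_target(x):
    # Hypothetical stand-in target: unnormalized Gaussian, mean 2, std 0.5.
    return -0.5 * ((x - 2.0) / 0.5) ** 2

mean, var = is_moments(log_target, sampler_mean=0.0, sampler_std=3.0)
print(mean, var)  # close to 2.0 and 0.25
```

The sampler is deliberately broader than the target so that the weights stay bounded; a sampler with lighter tails than the target would make the estimates unreliable.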
The unscented Kalman filter (UKF), recently proposed by Julier and Uhlmann
[81], has been well-studied in the 1D domain and has been found to be more accurate
and stable than EKF [81]. It also has the capability to handle multiplicative, non-
Gaussian noise in its estimation procedure [82,83]. The UKF inherits the KF structure
and is the extension of a deterministic sampling approach known as the unscented
transformation (UT) to minimum mean-square error (MMSE) estimation [81]. In
the UKF, the recursive prediction and update of the first two moments of the state are
based on the propagation of sigma points (deterministic samples) through state and
observation models. The UKF has been applied to a wide variety of 1D nonlinear
signal estimation problems [81, 84, 85]. The UKF has also been explored for tracking
in computer vision [86, 87].
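As a rough sketch of the sigma-point idea, the following Python function propagates 2n + 1 deterministic samples through a function and recovers the transformed mean and covariance. The function name unscented_transform, the scaling parameter kappa, and the test matrices are assumptions made for illustration; the weights follow the basic form of the unscented transformation, and the thesis formulation in later chapters may differ in its choice of scaling.

```python
import numpy as np

def unscented_transform(mean, cov, func, kappa=1.0):
    """Propagate sigma points of (mean, cov) through func and return the
    weighted mean and covariance of the transformed points."""
    n = mean.size
    # Columns of the Cholesky factor of (n + kappa) * cov give the offsets.
    sqrt_cov = np.linalg.cholesky((n + kappa) * cov)
    # Sigma points: the mean, then mean +/- each column of the factor.
    sigma = np.vstack([mean, mean + sqrt_cov.T, mean - sqrt_cov.T])
    weights = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    weights[0] = kappa / (n + kappa)
    y = np.array([func(s) for s in sigma])
    y_mean = weights @ y
    diff = y - y_mean
    y_cov = (weights[:, None] * diff).T @ diff
    return y_mean, y_cov

# Sanity check on a linear map, for which the transformation is exact.
A = np.array([[1.0, 2.0], [0.0, 1.0]])
b = np.array([1.0, -1.0])
mean = np.array([1.0, 2.0])
cov = np.array([[2.0, 0.5], [0.5, 1.0]])
y_mean, y_cov = unscented_transform(mean, cov, lambda x: A @ x + b)
```

Any callable can be passed as func, including a nonlinear observation model; the linear case is used here only because its exact answer, A mean + b and A cov A^T, is known.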
We develop an extension of the UKF for nonlinear image estimation, beginning
with the homogeneous AR state model. The sigma points are generated by assuming
an initial mean and covariance for the state and are propagated through the state and
nonlinear observation models resulting in the prediction of state and measurement
sigma points. The UT is employed to determine the predicted first two statistics and
the update is performed as in the Kalman filter.
Since the performance of AR based UKF (ARUKF) is limited by the AR state
model, we explore the incorporation of an edge-preserving MRF prior into the UKF
framework. Using a similar methodology as applied to improve the KF, we formulate
the DAMRF state conditional density based on the DA potential function and the
estimated neighbors. We employ IS to estimate the mean and covariance of the state
from the DAMRF prior. Unlike in the KF extension, here we employ the predicted
state statistics to determine the predicted sigma points and (directly) propagate them
through the nonlinear observation model to obtain the predicted measurement sigma
points. We carry out the update as in the ARUKF. We refer to this formulation of
the UKF as the importance sampling UKF (ISUKF).
We specifically consider film-grain noise removal in photographic prints, and speckle
suppression in SAR images, to demonstrate the effectiveness of the proposed ISUKF.
The ISUKF outperforms the ARUKF and also the PF [88] for image estimation in
film-grain noise. We modify the formulation of the DA prior for SAR imagery and ap-
ply the ISUKF with a multiplicative measurement model to arrive at despeckled SAR
images. Experiments on images with simulated and real speckle reveal the efficacy of
the proposed filter over contemporary methods.
Finally, we consider the problem of inpainting damaged images by extending the
proposed recursive UKF framework to simultaneously perform film-grain noise removal
and inpainting. We observe that the main issue in inpainting within the filtering
framework is to infer the observations from noisy neighborhood pixels. In the pro-
posed inpainting algorithm, we consider global information in terms of edge image
reconstruction and inpaint locally to restore the image. Using the damaged image and
information about the location of damages, we adopt a constrained matching strategy
to reconstruct the damaged edges in the regions to be inpainted. We inpaint the miss-
ing pixels by propagating the boundary information judiciously using the reconstructed
edge image. We validate the proposed approach with several examples.
To recover an image from damages as well as noise, we propose to embed the
inpainting method in the ISUKF formulation. Since the UKF lacks observations at
damages during its update step, we fill in the missing observations by invoking our
inpainting method. We demonstrate the ability of the proposed unified framework for
inpainting and denoising of damaged film-grain noisy photographic images for both
simulated and real degradations.
1.4.1 Contributions of the Thesis
• We propose a discontinuity-adaptive image estimation scheme within the Kalman
framework by non-Gaussian MRF modeling of the image prior. This is achieved
by using importance sampling to estimate the first two moments of the non-
Gaussian prior and incorporating them into the update step of the KF.
• The unscented Kalman filter (UKF) with an AR state model is developed for
non-linear image estimation that can account for exact image sensor nonlinearity
in the observation model. To further improve performance, we incorporate an
edge-preserving Markov prior into the recursive estimation procedure of the
UKF through importance sampling. We employ this for film-grain noise removal
in photographic images.
• Next, the problem of despeckling of synthetic aperture radar (SAR) images is
considered. The discontinuity-adaptive MRF prior is modified so as to effectively
process SAR images and the UKF is formulated to account for multiplicative
noise in the image estimation procedure.
• We develop an edge-based image inpainting method and embed it into the pro-
posed UKF framework to simultaneously suppress film-grain noise and inpaint
(known) damages within a recursive framework.
1.5 ORGANIZATION OF THE THESIS
In chapter 2, we review the theory of Markov random fields.
Chapter 3 incorporates a discontinuity-adaptive MRF prior into the Kalman filter to
address the limitations of the AR-based KF for linear image estimation.
Chapter 4 introduces the principle of unscented transformation, analyzes the accuracy
of UT, illustrates UT with examples, and considers its extension to recursive nonlinear
estimation.
In chapter 5, we develop the UKF for nonlinear image estimation. We emphasize the
incorporation of edge-preserving MRF prior into the UKF and employ it for film-grain
noise suppression.
In chapter 6, we consider the problem of speckle noise suppression using the UKF
framework by suitably tailoring the prior for SAR images.
In chapter 7, the problem of filling-in missing information in noisy images is tackled
by embedding an edge-guided inpainting method into the UKF framework.
Chapter 8 summarizes the thesis and suggests directions for future work.
CHAPTER 2
MARKOV RANDOM FIELDS
2.1 INTRODUCTION
The study of random processes is important because filters or algorithms must be
developed in accordance with the probabilistic characteristics of the class of images
being processed [89]. A natural extension of a random process to two dimensions
is known as a random field. A realization of a random field is generated when we
perform a random experiment at each spatial location and assign the outcome of that
experiment to that location [90]. A Markov random field (MRF) is characterized by a conditional
density and possesses the Markovian property, i.e., the value of a pixel depends only
on the values of its neighboring pixels and on no other pixel [77]. The qualifying
feature of MRFs in image processing is that the information contained in the local,
physical structure of images is captured by means of the local conditional probability
distribution [91]. It is important to realise that at discontinuities or edges, the notion
of neighborhood dependency should be relaxed. How to incorporate the smoothness
constraint in MRFs while simultaneously modeling the edges is a challenging task
[39, 77]. The theory of MRFs is built on the concept of a neighborhood system and
requires the following terminology for its representation and study.
2.1.1 Lattice, Sites, and Labels
Many vision tasks can be posed as labeling problems in which the solution to a task is
a set of labels assigned to image pixels or features. A labeling is specified in terms of a
set of sites and a set of (corresponding) labels. Typically, image data are represented
by gray-level variations defined over a finite rectangular or square point lattice. A
lattice S is a discrete set of points or pixels. For a 2D image of size M × N , it is
denoted by
\[ S = \{(i, j) \mid 1 \le i \le M,\ 1 \le j \le N\}. \]
A lattice node t is uniquely specified by its coordinates t = (i, j), where i is the
image row number and j is the column number. A site represents a point or a region
in the Euclidean space. A site can be an image pixel (lattice node) or an image feature
such as a corner point, a line segment or a surface patch. Sites can be spatially regular
or irregular. A label is an event associated with a site. A label can assume either
continuous or discrete values. For instance, the analog pixel intensity is
a continuous label, whereas the quantized intensity values in the
set $\{0, 1, \ldots, 255\}$, or the set $\{\text{edge}, \text{non-edge}\}$ in edge detection, constitute discrete labels.
Another essential property of a label set is the ordering of the labels. For example,
elements in the continuous label set (the real space) can be ordered by the relation
“smaller than”. Let L be the set of labels. When a discrete set, say $\{0, 1, 2, \ldots, 255\}$,
represents pixel intensities, it is an ordered set because for intensity values we have
0 < 1 < ... < 255. When a label set contains different symbols such as texture types
or edge features, it is considered to be un-ordered.
The problem of labeling is to assign a label from the set L to each of the sites in
the lattice S. The set of all possible combinations of labels is called the configuration
space and is denoted by $\mathbb{F}$. In image estimation, the set S of sites corresponds to image
pixels and the discrete set L of labels is the intensity levels. For a discrete labeling
problem of k sites and l labels, there exist a total of $l^k$ possible labelings (the size of the
configuration space). However, among all the possibilities, there are only a few which
are optimal in terms of a criterion measuring the goodness (or inversely, the cost) of
solutions. This is the optimization approach to visual labeling.
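To make the size of the configuration space concrete, the following Python sketch enumerates every labeling of a very small lattice. The helper name configuration_space and the 2 × 2 binary example are assumptions for illustration only.

```python
from itertools import product

def configuration_space(num_sites, labels):
    """Enumerate every assignment of one label to each of num_sites sites."""
    return list(product(labels, repeat=num_sites))

# A 2 x 2 binary image: k = 4 sites, l = 2 labels.
configs = configuration_space(4, (0, 1))
print(len(configs))  # 16, i.e. 2**4
```

For an 8-bit image of even modest size the same count is 256 raised to the number of pixels, which is why exhaustive search over labelings is not practical.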
2.1.2 Neighborhood System and Cliques
One of the important characteristics of image data is statistical dependence of the
gray level at a lattice node on its neighbors. The sites in S are related to each other
Fig. 2.1: Symmetric neighborhood system for (a) first-order, and (b) second-order.
through a neighborhood system defined as
\[ \mathcal{N} = \{N_{i,j} \mid \forall\, (i, j) \in S\} \tag{2.1} \]
where $N_{i,j}$ is the set of sites neighboring the site (i, j). A collection of subsets of a
lattice S defined as $\mathcal{N} = \{N_{i,j} \mid (i, j) \in S,\ N_{i,j} \subset S\}$ is a neighborhood system on
S if the neighboring relationship satisfies the following properties:
1. $(i, j) \notin N_{i,j}$ (a site is not its own neighbor), and
2. if $(k, l) \in N_{i,j}$, then $(i, j) \in N_{k,l}$ (the neighborhood relationship is mutual).
Neighborhood systems that are commonly used are $\mathcal{N}^1$ and $\mathcal{N}^2$. Figs. 2.1 (a) and
(b) show first-order and second-order symmetric neighborhood systems, respectively.
In the first-order neighborhood system ($\mathcal{N}^1$), which is also called a 4-neighborhood
system, every site (m, n) has four neighbors as shown in Fig. 2.1 (a). In the second-order
neighborhood system ($\mathcal{N}^2$), also called an 8-neighborhood system, there are eight
neighbors for every site (shown in Fig. 2.1 (b)).
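The two symmetric neighborhood systems and the two defining properties can be checked with a short Python sketch. The function name neighbors and the clipping of neighbors at the lattice border are assumptions; sites are 1-indexed to match the lattice definition.

```python
def neighbors(site, shape, order=1):
    """Return the set of neighbors of a 1-indexed site (i, j) on an
    M x N lattice for the first-order (4-neighbor) or second-order
    (8-neighbor) symmetric neighborhood system."""
    i, j = site
    rows, cols = shape
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if order == 2:
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    # Assumption: neighbors falling outside the lattice are simply dropped.
    return {(i + di, j + dj) for di, dj in offsets
            if 1 <= i + di <= rows and 1 <= j + dj <= cols}

shape = (4, 5)
interior_first = neighbors((2, 2), shape, order=1)
interior_second = neighbors((2, 2), shape, order=2)
corner_first = neighbors((1, 1), shape, order=1)
```

Interior sites have four or eight neighbors, while border sites have fewer under the clipping assumption; in both cases a site is never its own neighbor and the relationship is mutual.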
If we introduce the notion of time in a 2D representation, which is very useful
in recursive analysis of images, at a current pixel (m, n), only the “past” pixels such
as those shown in Fig. 2.2 become neighbors. Such a neighborhood support, known
as non-symmetric half-plane (NSHP) support (at a pixel (m, n)) is mathematically
Fig. 2.2: Non-symmetric half-plane support at a pixel (m, n); the figure axes are labeled Rows and Columns.
Fig. 2.3: NSHP neighborhood system for (a) first-order, and (b) second-order.
represented as
\[ R_{m,n} = \{(m - i,\ n - j) \mid (1 \le i \le M_1,\ -M_1 \le j \le M_1) \cup (i = 0,\ 1 \le j \le M_1)\} \tag{2.2} \]
and is depicted in Fig. 2.2. Here, $M_1$ is the order of the NSHP support. The associated
local neighborhood $N_{m,n}$ of site (m, n) is defined as
\[ N_{m,n} = \{(k, l) \in S : (k, l) \in R_{m,n},\ (k, l) \neq (m, n)\} \tag{2.3} \]
The neighborhood system $\mathcal{N}$, using $N_{m,n}$ over the whole lattice S, is defined by Eq. 2.1.
Typically, we employ first and second-order NSHP neighborhood systems as shown in
Fig. 2.4: Cliques for NSHP neighborhood.
Figs. 2.3(a) and (b), respectively. MRFs with NSHP support are also referred to as
unilateral MRFs [92] or causal MRFs [93].
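A minimal Python sketch of the NSHP support of Eq. (2.2), restricted to the lattice as in Eq. (2.3), is given below. The function name nshp_support and its argument names are assumptions; sites are 1-indexed with (m, n) as (row, column).

```python
def nshp_support(m, n, order, shape):
    """Return the causal (NSHP) neighbors of site (m, n) for a support of
    the given order on a lattice with shape (rows, cols)."""
    rows, cols = shape
    # Rows above the current one: 1 <= i <= order, -order <= j <= order.
    offsets = [(i, j) for i in range(1, order + 1)
               for j in range(-order, order + 1)]
    # Current row, pixels to the left only: i = 0, 1 <= j <= order.
    offsets += [(0, j) for j in range(1, order + 1)]
    return {(m - i, n - j) for i, j in offsets
            if 1 <= m - i <= rows and 1 <= n - j <= cols}

first_order = nshp_support(3, 3, 1, (5, 5))
second_order = nshp_support(3, 3, 2, (5, 5))
print(sorted(first_order))  # [(2, 2), (2, 3), (2, 4), (3, 2)]
```

Every returned site precedes (m, n) in raster-scan order, which is what allows these pixels to act as the "past" in a recursive filter.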
A clique c of the pair $(S, \mathcal{N})$ is a subset of sites in S such that it consists either
of a single site $c = \{(m, n)\}$ or of a pair of neighboring sites $c = \{(m, n), (k, l)\}$, or
of a triplet of neighboring sites $c = \{(m, n), (k, l), (i, j)\}$ and so on. The collections of
single-site, pair-site and triple-site cliques will be denoted by $\mathcal{C}_1$, $\mathcal{C}_2$ and $\mathcal{C}_3$, respectively,
where $\mathcal{C}_1 = \{(m, n) : (m, n) \in S\}$,
$\mathcal{C}_2 = \{\{(m, n), (k, l)\} : (k, l) \in N_{m,n},\ (m, n) \in S\}$,
$\mathcal{C}_3 = \{\{(m, n), (k, l), (i, j)\} : (m, n), (k, l), (i, j) \in S \text{ are neighbors of one another}\}$,
and so on. Similarly, we can define higher-order cliques. The sites in a clique
are ordered, i.e., $\{(m, n), (k, l)\}$ is not the same as $\{(k, l), (m, n)\}$. The collection of
all cliques for $(S, \mathcal{N})$ is $\mathcal{C} = \mathcal{C}_1 \cup \mathcal{C}_2 \cup \mathcal{C}_3 \cup \ldots$, where “...” denotes possible sets of larger
cliques. The order or type of cliques for (S,N ) is determined by the size, shape and
orientation of the neighborhood.
Fig. 2.4 shows the clique types for the first-order causal neighborhood system of a
lattice at site (m, n), identified with a dot. The single-site, horizontal and vertical
pair-site cliques in Figs. 2.4 (a) and (b) are the clique types of a first-order symmetric
neighborhood system. The clique types for a first-order NSHP neighborhood system, in contrast,
include not only those in Figs. 2.4 (a) and (b) but also the diagonal pair-site cliques in Fig. 2.4
(c), the triple-site cliques in Fig. 2.4 (d), and the quadruple-site clique in Fig. 2.4 (e). We note that
clique types for a second-order symmetric non-causal neighborhood system contain all
those in Figs. 2.4 (a)-(e) and a fourth possible triple-site (which is not an NSHP clique
and hence is not shown in Fig. 2.4 (d)). As the order of the neighborhood system
increases, the number of cliques grows rapidly.
2.2 MARKOVIANITY
For the purpose of image analysis and processing, it is useful to have an underlying
model for the dominant characteristics of the given data. Although it is often difficult
to identify the physical mechanism which generated the observed data, an analytical
expression that captures the statistical properties of images can be used as a model
[94]. Specifically, the theory of MRF provides a consistent way of modeling context-
dependent entities [77]. The implicit assumption behind probabilistic approaches to
image analysis is that, for a given problem, there exists a probability distribution that
can capture to some extent the variability and interactions of different sets of relevant
image attributes [95].
A random field F is a collection of random variables arranged on the lattice S and
is defined as $F = \{F_{m,n},\ (m, n) \in S\}$. The random variables $F_{m,n}$, $(m, n) \in S$, take
values in the label set L. Specifically, for 8-bit quantized images, $L = \{0, 1, \ldots, 255\}$.
Let $\mathbb{F} = L^{M \times N}$ be the configuration space in which the random field F takes values. Here,
M × N is the dimension of the image. Let each random variable $F_{i,j}$ take a value $f_{i,j}$ in
L, identified with the notation $F_{i,j} = f_{i,j}$. The associated probability $p(F_{i,j} = f_{i,j})$ is
abbreviated as $p(f_{i,j})$. We use the notation $(F_{1,1} = f_{1,1}, \ldots, F_{M,N} = f_{M,N})$ for the joint
event. For simplicity, this joint event is abbreviated as F = f, where $f = \{f_{1,1}, \ldots, f_{M,N}\}$
is a configuration of F, corresponding to a realization of the field. The joint probability
p(F = f) is abbreviated as p(f). We define $f^c = \{f_{k,l} : (k, l) \in S - \{(i, j)\}\}$. The
Markovian property can be stated as
\[ p(f_{i,j} \mid f^c) = p(f_{i,j} \mid f_{k,l},\ (k, l) \in N_{i,j}) \quad \text{(Markovianity)} \tag{2.4} \]
A random field F over lattice S is a Markov random field with respect to the neigh-
borhood system N , if and only if,
1. $p(f) > 0,\ \forall f \in \mathbb{F}$ (Positivity)
2. F satisfies the Markovian property (Eq. 2.4)
(for all (i, j) ∈ S). Here, $f = \{f_{i,j},\ (i, j) \in S\}$ corresponds to a realization of the
random field F and $\mathbb{F}$ denotes the configuration space (the set of all possible configurations f).
The fundamental notion associated with Markovianity is one of conditional independence.
A one-dimensional process $F_n$ is Markovian if the knowledge of the process
at some point $F_n$ decouples the past $F^p$ and the future $F^f$, i.e., $p(f^f \mid f_n, f^p) = p(f^f \mid f_n)$.
The notion of local dependence extends directly to 2D. However, the notion of “past”
and “future” is optional in 2D since there is no natural ordering of elements in a grid.
Nevertheless, the notion of time is very useful in extending 1D recursive techniques to
2D. If we consider a raster-scan of a grid at pixel position (m, n), the present, past,
and future NSHP pixels are shown in Fig. 2.2. We define a random field F as Markov
if knowledge of the process at pixel (m, n) is completely specified by the pixels in its
NSHP neighborhood i.e.,
\[ p(f_{m,n} \mid f_{k,l},\ (k, l) \neq (m, n)) = p(f_{m,n} \mid f_{k,l},\ (k, l) \in R_{m,n}) \]
where Rm,n is the NSHP support for a site (m, n) as defined in Eq. (2.2).
In addition to the traditional discrete [96], continuous [77] and causal MRFs [93],
there exist several variants of MRFs such as discriminative RFs [97], conditional RFs
[98], and double random fields [99]. The utility of MRF theory has also been extended
to 3D [94] for volumetric analysis of images and for modeling the temporal dimension in
videos. Reconstruction of non-texture surfaces is an example application of continuous
MRFs where we need to assign continuous labels for a sampling of the underlying
surface. In the restoration of a quantized image, we employ discrete MRFs. MRFs
with a symmetric neighborhood fall into the class of noncausal MRFs while the ones
with NSHP support belong to the class of causal MRFs.
2.3 CLIQUE POTENTIALS
The cliques defined in the earlier section are characterised by their strengths referred to
as clique potentials. The value of a clique potential depends on the local configuration
of the clique c and the chosen potential function. The lower the potential of a certain
clique, the closer the current site is to its neighbors within that clique. The sum of
clique potentials is a measure of the energy of the spatial activity at that site. The
energy function determines the probability of a current site given its neighbors.
2.3.1 Potential Function and Gibbs Random Field
A random field $F = \{F_{i,j}\}$ defined on S is a Gibbs random field (GRF) with respect
to the neighborhood system N if and only if its joint distribution is of the form
\[ p(f) = \frac{1}{Z} \exp\{-U(f)\} \tag{2.5} \]
where $Z = \sum_{f \in \mathcal{F}} \exp\{-U(f)\}$ is a normalizing constant known as the partition
function, and U(f) is the energy function defined as the sum of clique potentials over all
possible cliques C and given by
\[ U(f) = \sum_{(m,n) \in S,\, c \in C_1} V_c(f_{m,n}) + \sum_{(m,n) \in S} \ \sum_{(k,l) \in N_{m,n},\, c \in C_2} V_c(f_{m,n}; f_{k,l}) + \ldots \tag{2.6} \]
where $V_c(f_{m,n})$ and $V_c(f_{m,n}; f_{k,l})$ are known as clique potentials such that $V_c(\cdot) \geq 0$.
This ensures that the Gibbs energy U(f) is non-negative. The distribution given by
Eq. (2.5) is called the Gibbs distribution (GD). The expression for the joint distribution
in Eq. (2.5) has the physical interpretation that the smaller the value of U(f), which is
the energy of a particular realization f, the more probable that realization is. A GRF
is said to be homogeneous if $V_c(f)$ is independent of the relative position of the clique
c in S. It is said to be isotropic if it is independent of the orientation of c.
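As an illustration of Eq. (2.5), the following minimal Python sketch enumerates every configuration of a tiny lattice and normalizes exp(-U(f)) by the partition function. It is not part of the original work: the 2 x 2 binary lattice, the quadratic pair-site potential, and the helper names `energy` and `gibbs_distribution` are assumptions made only for this example, and each neighboring pair is counted once.

```python
import itertools
import math

def energy(f, rows, cols, potential):
    """U(f): sum of pair-site potentials over horizontal and vertical pairs.

    f is a flat tuple of labels in raster order (assumed layout).
    Each unordered neighbor pair is counted once.
    """
    u = 0.0
    for m in range(rows):
        for n in range(cols):
            here = f[m * cols + n]
            if n + 1 < cols:
                u += potential(here - f[m * cols + n + 1])
            if m + 1 < rows:
                u += potential(here - f[(m + 1) * cols + n])
    return u

def gibbs_distribution(rows, cols, labels, potential):
    """Return ({configuration: probability}, Z) by brute-force enumeration."""
    configs = list(itertools.product(labels, repeat=rows * cols))
    weights = [math.exp(-energy(f, rows, cols, potential)) for f in configs]
    z = sum(weights)  # partition function Z of Eq. (2.5)
    return {f: w / z for f, w in zip(configs, weights)}, z

# Hypothetical example: 2 x 2 binary lattice with a quadratic potential.
dist, Z = gibbs_distribution(2, 2, labels=(0, 1), potential=lambda d: d * d)
most_probable = max(dist, key=dist.get)
```

Enumeration is feasible only for very small lattices, since the number of configurations grows exponentially with the number of sites; this is one reason the local conditional form derived later in this chapter is useful in practice.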
The clique potential Vc(.) is a function of the values of sites in clique c. It quantifies
the irregularity in the neighborhood. It helps in encoding prior information about the
image to be estimated to restrict the space of possible solutions. In fact, the power of
MRF lies in a careful selection of the clique potential Vc(f). Contextual constraints
on two labels are the lowest-order constraints to convey contextual information. They
are widely used not only because of their simple form and low computational cost, but
also due to the fact that they are usually sufficient to describe the local characteristics
of images. They are encoded in the Gibbs energy as pair-site clique potentials.
Models whose cliques involve only pair-wise interactions are known in MRF terminology
as auto-models [77, 100]. For these models, the clique potential $V_c(\cdot) = 0$ for $|c| > 2$.
There exist several MRF models based on the choice of the potential function
and the label set. The choice of model is application-dependent. For instance, an
auto-logistic model [77], which has a binary label set, is appropriate for modeling
binary textures; multi-level logistic (or “colour-blind”) models [77] with (multi-valued)
discrete label set are suitable for textures. The most widely used models are the
spatial autoregressive (AR) models or the auto-normal models (Gaussian MRFs) [40,
91]. These appear in a variety of applications such as texture feature extraction,
classification and segmentation. An extensive presentation and comparison of different
classes of MRF models can be found in [101].
2.3.2 MRF-Gibbs Equivalence
An MRF is characterized by its local (conditional) property whereas a GRF represents
global (joint) property. A theorem by Hammersley and Clifford [77,102] establishes the
equivalence between MRF and GRF and yields the joint distribution from conditional
distributions. The first part of this theorem states that any GRF is an MRF. Here, our
interest is to derive an expression for calculating the conditional probability $p(f_{i,j}/f^c)$
from potential functions based on the assumption that F is a GRF. For completeness,
we state the theorem but give the proof for only the part that is of interest to us.
Hammersley-Clifford Theorem: A random field $F = \{F_{i,j}\}$ is a Markov
random field with respect to neighborhood N if and only if its joint density function
is a Gibbs distribution with cliques associated with N.
Proof : We now only prove that a GRF is an MRF [77,102], which enables us to
derive the conditional density from the clique potential functions.
Let p(f) be a Gibbs distribution on S with respect to the neighborhood system
N. The conditional probability can be written as
\[ p(f_{i,j}/f^c) = \frac{p(f)}{p(f^c)} = \frac{p(f)}{\sum_{f'_{i,j} \in L} p(f')} \tag{2.7} \]
where $f^c$ denotes the values of f at all sites other than (i, j), and
$f' = \{f_{1,1}, \cdots, f_{i,j-1}, f'_{i,j}, f_{i,j+1}, \cdots, f_{M,N}\}$ is any configuration that agrees with
f at all sites except possibly at site (i, j). Since we assume that F is a GRF, using Eq. (2.5)
and Eq. (2.6), we can write
\[ p(f_{i,j}/f^c) = \frac{\exp\{-\sum_{c \in C} V_c(f)\}}{\sum_{f'_{i,j}} \exp\{-\sum_{c \in C} V_c(f')\}} \tag{2.8} \]
Dividing the set of cliques C into sets A consisting of cliques containing site (i, j) and
B with cliques not containing site (i, j), we can write
\[ p(f_{i,j}/f^c) = \frac{\exp\{-\sum_{c \in A} V_c(f)\} \, \exp\{-\sum_{c \in B} V_c(f)\}}{\sum_{f'_{i,j}} \exp\{-\sum_{c \in A} V_c(f')\} \, \exp\{-\sum_{c \in B} V_c(f')\}} \tag{2.9} \]
Since by definition $V_c(f) = V_c(f')$ for any clique c that does not contain site (i, j),
canceling common terms we get
\[ p(f_{i,j}/f^c) = \frac{\exp\{-\sum_{c \in A} V_c(f)\}}{\sum_{f'_{i,j}} \exp\{-\sum_{c \in A} V_c(f')\}} \tag{2.10} \]
Eq. (2.10) provides a formula for calculating the conditional density from the potential
functions, based on the assumption that F is a GRF. We note that Eq. (2.10) resembles
a Gibbs distribution (Eq. 2.5), except that now the energy function is obtained by
summing the potentials of only those cliques that contain the current site, and hence
involves only its neighboring sites. We refer the reader to
[103] for proof of the converse statement. In the light of this equivalence, we simply
specify the local “interaction” factors to define, up to some multiplicative constant,
the joint probability distribution p(f). With such a setup, each variable only directly
depends on a few other “neighboring” variables. From a more global point of view, all
the variables are mutually dependent, but only through a combination of successive
local interactions [95].
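The cancellation that leads from Eq. (2.8) to Eq. (2.10) can be checked numerically. The sketch below is an illustration only, under assumed settings: a 3 x 3 lattice with three labels, a quadratic pair-site potential, and hypothetical helper names (`pair_cliques`, `conditional`). It computes the conditional probability at the centre site once from all cliques and once from only the cliques in set A.

```python
import math

ROWS, COLS = 3, 3
LABELS = (0, 1, 2)

def pair_cliques(rows, cols):
    """All horizontal and vertical pair-site cliques of the lattice."""
    cliques = []
    for m in range(rows):
        for n in range(cols):
            if n + 1 < cols:
                cliques.append(((m, n), (m, n + 1)))
            if m + 1 < rows:
                cliques.append(((m, n), (m + 1, n)))
    return cliques

def potential(a, b):
    """Assumed pair-site clique potential (quadratic)."""
    return 0.5 * (a - b) ** 2

def conditional(f, site, cliques):
    """p(f_site / f^c) using the given cliques.

    With all cliques this is Eq. (2.8); with only the cliques that
    contain `site` (set A) this is Eq. (2.10).
    """
    weights = {}
    for label in LABELS:
        g = dict(f)
        g[site] = label  # configuration f' that differs only at `site`
        weights[label] = math.exp(-sum(potential(g[s], g[t]) for s, t in cliques))
    return weights[f[site]] / sum(weights.values())

all_cliques = pair_cliques(ROWS, COLS)
site = (1, 1)
local_cliques = [c for c in all_cliques if site in c]  # set A
f = {(m, n): (2 * m + n) % 3 for m in range(ROWS) for n in range(COLS)}

p_global = conditional(f, site, all_cliques)
p_local = conditional(f, site, local_cliques)
```

The two values agree to floating-point precision, while the local version touches only the four cliques around the centre site rather than all twelve.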
2.4 MRF PRIORS
In a Bayesian framework, many image analysis tasks such as edge detection, template
matching, super resolution and restoration can be posed as inference problems [77,100].
The aim is to estimate the actual random signal f from its observation e. Bayesian
techniques provide a way to invert an observation model taking the prior knowledge
p(f) into account. This is done by applying the Bayes rule on the prior p(f) and the
likelihood p(e/f), and then optimizing the a posteriori expected cost function for a
given observation [100].
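In the notation of this section, the Bayes step can be written out as follows. This is the standard identity rather than a result specific to this work; the second line is only one possible instance, assuming a 0-1 cost (which leads to the MAP estimate) and a Gibbs prior of the form in Eq. (2.5).

```latex
p(f/e) = \frac{p(e/f)\, p(f)}{p(e)} \;\propto\; p(e/f)\, p(f)
\\
\hat{f}_{\mathrm{MAP}} = \arg\max_{f} \; p(e/f)\, p(f)
 = \arg\min_{f} \left\{ -\log p(e/f) + U(f) \right\}
```

The normalizing term p(e) and the partition function Z do not depend on f, so they drop out of the optimization.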
The choice of the clique potential Vc(f) is crucial as it embeds important prior
information about the image to be reconstructed. The conditional probabilities for
image neighborhood configurations, based on cliques, play a similar role to image
energy in variational approaches. The cliques in MRF encode a set of probabilistic
assumptions (priors) about the geometric properties of the signal, and thus are very
effective when the signal conforms sufficiently well to the prior. In the sequel, we
present how Vc(f) can be appropriately modified to include contextual prior knowledge
in the formulation of the local conditional density.
2.4.1 Gaussian MRF
A generic contextual constraint is smoothness (or continuity) which assumes that the
physical properties in a neighborhood of space or in an interval of time have some
coherence and generally do not change abruptly. For example, the surface of a table
is flat, a meadow presents a texture of grass, and a temporal event does not change
abruptly over a short period of time. Smoothness constraints are often expressed as
the prior probability or equivalently an energy term measuring the extent to which the
smoothness assumption is violated by a random process or field f .
For spatially and temporally continuous MRFs, the smoothness prior often involves
derivatives as in analytical regularization. The order of the derivative n determines
the number of sites involved in the cliques; for example, n = 1 corresponds to a pair-
site smoothness potential (common auto-model). The energy is the integral of the
potential function over the (entire) range of the signal [77].
In the discrete case, where the surface is sampled at discrete points and hence
the label set is also discrete, we use a first-order difference to approximate the first
derivative and use a summation to approximate the integral. Thus, the potential
function, considering a 1D signal, is $V_c(f) = V_2(f_i, f_{i'}) = \frac{1}{2}(f_i - f_{i'})^2$. The energy is the
potential summed over all the pair-site cliques, i.e.,
\[ U(f) = \sum_{i \in S} \ \sum_{i' \in N_i,\, c \in C_2} V_2(f_i; f_{i'}) = \frac{1}{2} \sum_{i} \sum_{i' \in N_i} (f_i - f_{i'})^2. \]
In specifying prior clique potentials for piecewise continuous surfaces, only pair-site
clique potentials are normally used. In the simplest case of a flat surface, they can be
defined by [77] V2(fi, fi′) = g(fi − fi′) where g(.) is a function penalizing the violation
of smoothness caused by the difference fi − fi′. For the purpose of restoration, the
function g is generally even (i.e., g(η) = g(−η)) and nondecreasing (i.e., g′(η) ≥ 0).
The random field associated with quadratic smoothness prior is referred to as
Gauss Markov random field (GMRF). An image is usually modeled as a homogeneous
GMRF for simplicity and mathematical tractability. The clique potential function for
the GMRF, as discussed above, is a quadratic function represented as
\[ g(\eta) = \eta^2 \tag{2.11} \]
which corresponds to the prior clique potentials in the MRF models. Specifically, for
an image s at site (m, n), the pair-site potential for a GMRF is
\[ V_c(s_{m,n}, s_{k,l}) = g(s(m, n) - s(k, l)) = \frac{1}{z}(s(m, n) - s(k, l))^2 \]
where z is a normalizing constant. The advantage of GMRFs is that the potential (Eq.
2.11) is a convex function, and hence it favors global optimization.
Employing pair-site cliques with quadratic potential function, we can formulate
the energy function for GMRF over the sites containing (m, n) in its neighborhood as
\[ \eta^2(s(m, n), \hat{s}) = \frac{1}{2\rho^2_{(m,n)}} \sum_{(i,j) \in N_{m,n}} (s(m, n) - \hat{s}(m-i, n-j))^2. \tag{2.12} \]
Here, $\hat{s}$ are (already estimated) neighbors and $\rho_{(m,n)}$ is a weighting parameter based
on the neighborhood. Note that we represent the energy function
$\sum_{c \in A} V_c(s) \ (= \sum_{(k,l) \in N_{m,n}} g(s(m, n) - s(k, l)))$ with $\eta^2(s(m, n), \hat{s})$. Assuming that the image s is
a GRF, we obtain the conditional probability density function (pdf) as
\[ p(s(m, n)/\hat{s}(m-i, n-j)) = \frac{1}{(2\pi\rho^2_{(m,n)}/m_i)^{1/2}} \exp\left(-\eta^2(s(m, n), \hat{s})\right) \tag{2.13} \]
where $(i, j) \in N_{m,n}$. The neighborhood becomes $R_{m,n}$ in a causal recursive framework.
This defines a Gaussian intrinsic auto-regressive (AR) model with conditional mean
given by the mean of the neighboring values and conditional variance $\rho^2_{(m,n)}/m_i$, where $m_i$
is the number of neighbors and parameter $\rho_{(m,n)}$ controls the allowable variation of the
current pixel with respect to its neighboring pixel values [40]. In general, the smaller the
variance among the neighbors, the lower must be the value of $\rho_{(m,n)}$, allowing a prediction
closer to its neighborhood.
The Gaussian MRF model is effective in modeling smooth data, but it penalizes
edges. Since much of the information in an image is contained in edges, arbitrarily pe-
nalizing them is clearly disadvantageous. An image model must favor local smoothness
but at the same time should accommodate abrupt transitions such as edges.
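The statement that Eq. (2.12) yields a conditional mean equal to the neighbor mean and a conditional variance of ρ²/m_i can be verified numerically. The following sketch is illustrative only: the neighbor values, the value of ρ, and the function name `gmrf_energy` are assumptions, and the moments are obtained by a simple Riemann sum over a fine intensity grid.

```python
import numpy as np

def gmrf_energy(s, neighbors, rho):
    """eta^2 of Eq. (2.12) evaluated for every candidate value in s."""
    s = np.asarray(s, dtype=float)
    neighbors = np.asarray(neighbors, dtype=float)
    diffs = s[:, None] - neighbors[None, :]
    return np.sum(diffs ** 2, axis=1) / (2.0 * rho ** 2)

# Hypothetical neighbor values and weighting parameter.
neighbors = [100.0, 104.0, 97.0, 103.0]
rho = 6.0

grid = np.linspace(0.0, 255.0, 25501)            # candidate intensities, step 0.01
weights = np.exp(-gmrf_energy(grid, neighbors, rho))
weights /= weights.sum()                         # normalize numerically

cond_mean = float(np.sum(grid * weights))
cond_var = float(np.sum((grid - cond_mean) ** 2 * weights))

expected_mean = float(np.mean(neighbors))        # mean of the neighbors
expected_var = rho ** 2 / len(neighbors)         # rho^2 / m_i
```

Completing the square in the sum of squared differences gives the same result analytically: the sum equals m_i times the squared deviation from the neighbor mean plus a term that does not depend on s(m, n).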
2.4.2 Edge-preserving Priors
A solution to the over-smoothing nature of the GMRF prior is to choose a more
general contextual prior model through g as
\[ \sum_{c \in C} V_c(f) = \sum_{c \in C} g(d_c f) \]
where the differential operator $d_c f$ is a local spatial activity measure of the image and
should have a small value in smooth regions and a large value at edges.
In the literature, one approach is to detect discontinuities or edges explicitly and
preserve them using methods such as line fields or compound GMRF (CGMRF). Geman and
Geman [34] proposed the use of the line field model for preserving edges. The potential
function is a truncated quadratic and is given as
\[ g(\eta, \alpha) = \min\{\eta^2, \alpha\} \tag{2.14} \]
The function in Eq. (2.14) is non-convex (truncated convex). The smoothness
constraint is switched off at points where the magnitude of the signal derivative exceeds
the threshold $\sqrt{\alpha}$, thus preserving edge information. However, the line field potential
function is non-differentiable. An integration of GMRF and line fields
is referred to as compound GMRF (CGMRF) [37]. A CGMRF consists of several
conditional Gauss-Markov sub-models with an underlying line field. Preserving
discontinuities with this model requires the parameters of the CGMRF to be estimated.
Another approach is to replace the “square law” potentials by more “robust” functions
[100]. Optimization techniques such as simulated annealing with the Metropolis-Hastings
algorithm or the Gibbs sampler, which are quite involved, may be needed to reach a reliable
global minimum or maximum [100].
Fig. 2.5: Edge-preserving convex potentials, panels (a) to (d). The x and y axes correspond to η and g(η), respectively.
We now briefly discuss some well-known edge-preserving potential functions.
These can be classified as (discontinuity-preserving) convex
and non-convex potentials. Among the convex ones, Bouman and Sauer [35] define
generalized Gaussian potentials as
\[ g(\eta) = |\eta|^r \tag{2.15} \]
with r ∈ [1, 2] (r = 2 for GMRF). Figs. 2.5 (a) and (b) show this potential function for
r = 2, i.e., quadratic, and r = 1.2, respectively. We observe that there is a significant
difference in the magnitude of the penalty between the two values of r.
Stevenson et al. [104] propose the Huber MRF, whose potential function (shown
in Fig. 2.5 (c) for α = 0.6) is given by
\[ g(\eta) = \begin{cases} \eta^2, & |\eta| \leq \alpha \\ 2\alpha|\eta| - \alpha^2, & |\eta| > \alpha \end{cases} \tag{2.16} \]
The function proposed by Green [105]
\[ g(\eta) = 2\alpha^2 \log\cosh(\eta/\alpha) \tag{2.17} \]
is approximately quadratic for small η and linear for large η. Parameter α controls the
transition between the two behaviors. We depict the nature of this potential for four
different values of the parameter α (from 0.6 (solid) to 0.1 (dotted)) in Fig. 2.5 (d).
Non-convex potential functions penalise edges to a still smaller degree by
flattening their shape for large differences. Some of the non-convex potential functions
employed for edge-preservation include the one proposed by Geman and McClure [106]
(shown in Fig. 2.6 (a), for α = 1) and given by
\[ g(\eta) = \frac{\eta^2}{\eta^2 + \alpha^2} \tag{2.18} \]
Blake et al. [107] suggest a potential function which adaptively switches off penalising
severe discontinuities (as depicted in Fig. 2.6 (b) for α = 2) and is given by
\[ g(\eta) = (\min\{|\eta|, \alpha\})^2 \tag{2.19} \]
Geman and Reynolds [108] propose
\[ g(\eta) = \frac{|\eta|}{|\eta| + \alpha} \tag{2.20} \]
for constrained restoration of discontinuities. A typical plot is shown in Fig. 2.6 (c).
Hebert and Leahy [109] propose
\[ g(\eta) = \log(1 + (\eta/\alpha)^2) \tag{2.21} \]
which has been employed for 3D Bayesian reconstruction from Poisson noisy data.
The corresponding non-convex potential function is plotted in Fig. 2.6 (d). Further
references to edge-preserving priors can be found in [36, 39].
Fig. 2.6: Discontinuity-preserving non-convex potentials, panels (a) to (d). The x and y axes correspond to η and g(η), respectively.
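For readers who wish to compare these potentials side by side, the following sketch collects Eqs. (2.14) to (2.21) as small NumPy functions. It is an illustration, not code from the original work: the function names are hypothetical, and Green's potential is evaluated through `np.logaddexp` (an implementation choice made here) so that log cosh does not overflow for large arguments.

```python
import numpy as np

def truncated_quadratic(eta, alpha):       # Eq. (2.14), line field model
    return np.minimum(np.square(eta), alpha)

def generalized_gaussian(eta, r):          # Eq. (2.15), r in [1, 2]
    return np.abs(eta) ** r

def huber(eta, alpha):                     # Eq. (2.16)
    a = np.abs(eta)
    return np.where(a <= alpha, a ** 2, 2.0 * alpha * a - alpha ** 2)

def green(eta, alpha):                     # Eq. (2.17)
    x = np.asarray(eta, dtype=float) / alpha
    # log cosh(x) = log(exp(x) + exp(-x)) - log(2), computed stably
    return 2.0 * alpha ** 2 * (np.logaddexp(x, -x) - np.log(2.0))

def geman_mcclure(eta, alpha):             # Eq. (2.18)
    e2 = np.square(eta)
    return e2 / (e2 + alpha ** 2)

def blake(eta, alpha):                     # Eq. (2.19)
    return np.minimum(np.abs(eta), alpha) ** 2

def geman_reynolds(eta, alpha):            # Eq. (2.20)
    a = np.abs(eta)
    return a / (a + alpha)

def hebert_leahy(eta, alpha):              # Eq. (2.21)
    return np.log1p(np.square(np.asarray(eta, dtype=float) / alpha))

# Evaluate on the range used for the axes of Figs. 2.5 and 2.6.
eta = np.linspace(-50.0, 50.0, 1001)
curves = {
    "quadratic": generalized_gaussian(eta, 2.0),
    "generalized Gaussian r=1.2": generalized_gaussian(eta, 1.2),
    "Huber": huber(eta, 0.6),
    "Green": green(eta, 0.6),
    "Geman-McClure": geman_mcclure(eta, 1.0),
    "Blake": blake(eta, 2.0),
    "Geman-Reynolds": geman_reynolds(eta, 1.0),
    "Hebert-Leahy": hebert_leahy(eta, 1.0),
}
```

The parameter values mirror those quoted for the figures where the text gives them; the values for the Geman-Reynolds and Hebert-Leahy curves are placeholders, since the text does not state them.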
2.5 DISCUSSION
A brief review of the terminology, formulation and utility of the Markov random fields
as an image prior was given in this chapter. The theory of MRF provides a convenient
and consistent way of modeling the context-dependent information. Along with the
simple GMRF models, potential functions adaptive to discontinuities were also dis-
cussed. Applications of MRFs to image processing and computer vision problems can
be found in [91, 110].
CHAPTER 3
IMPORTANCE SAMPLING KALMAN FILTER FOR
IMAGE ESTIMATION
3.1 INTRODUCTION
In this chapter, we address the problem of image estimation in additive white Gaussian
noise. This can be cast as a problem of state estimation from noisy measurements.
When the state transition and measurement equations are both linear, and the state
and measurement noises are independent and additive Gaussian, the Kalman filter
is optimal and gives the minimum mean square error (MMSE) estimate of the state.
Extension of the 1D recursive KF to images was first proposed by Woods and Rade-
wan [8] and is referred to as the reduced-update Kalman filter. We first discuss the
formulation, abilities and limitations of the traditional auto-regressive Kalman filter
(ARKF). To address the shortcomings of the ARKF, we consider statistical modeling
of the contextual prior and incorporate it within the KF framework.
We propose an interesting extension for handling edges by modeling the original
image with a non-Gaussian MRF prior. The edge preservation capability is implic-
itly incorporated using a discontinuity-adaptive state conditional density. If the state
transition equation is not known but an assumption on the state transition density
(possibly non-Gaussian) can be made, we can still use the Kalman filter update equa-
tions in the proposed framework. We review the discontinuity-adaptive MRF model
proposed by Li [77,78] and employ it to formulate a state conditional prior to specify
the state dynamics. We discuss the importance sampling (IS) technique to estimate
the moments of an arbitrary pdf. The novelty of our approach lies in obtaining the
predicted mean and variance of the non-Gaussian state conditional density by im-
portance sampling and incorporating them in the update step of the Kalman filter.
Experimental results demonstrate the effectiveness of the proposed method in filtering
noise while preserving edges.
3.2 AUTO-REGRESSIVE KALMAN FILTER
The observation model for an image degraded by additive noise is given by
y(m, n) = s(m, n) + v(m, n) (3.1)
where s(m, n) is intensity of the original image at location (m, n), y(m, n) is the
observation, and v(m, n) is zero-mean white Gaussian noise with variance $\sigma_v^2$, which
is independent of s(m, n). The problem is to estimate s(m, n) given y(m, n). This is
not straightforward because we need to preserve the discontinuities in s(m, n) while
filtering out the noise.
Typically, the original image is modeled as a 2-D autoregressive (AR) process.
This is due to the fact that in the absence of any a priori constraints, the solution
can be very noisy. The corresponding AR equation [8, 25] can be written as a state
transition equation in the form
\[ s(m, n) = \sum_{(i,j) \in R_{m,n}} a(i, j)\, s(m-i, n-j) + u(m, n) \tag{3.2} \]
Here, Rm,n is the NSHP model support (which is commonly used) given by Eq. (2.2),
the scalars a(i, j) are the AR model coefficients (assumed stationary) and computed
from a prototype image, and u(m, n) is uncorrelated zero-mean white-Gaussian noise
driving the AR process. The term u(m, n) can also be regarded as the modeling error
between the image and its predicted value. Specifically, we obtain the AR coefficients
a(i, j) by solving the Yule-Walker equations [111]
\[ r_{k,l} - \sum_{(i,j) \in R_{m,n}} a(i, j)\, r_{k-i,l-j} = 0, \quad (k, l) \in R_{m,n} \]
\[ \sigma_u^2 = r_{0,0} - \sum_{(i,j) \in R_{m,n}} a(i, j)\, r_{i,j} \tag{3.3} \]
where $\sigma_u^2$ is the variance of the state noise and $r_{k,l}$ is the (k, l)th auto-correlation coefficient
of the original (or its prototype) image computed as
\[ r_{k,l} = \frac{1}{(M-k)(N-l)} \sum_{(i,j),\,(i-k,j-l) \in S} s(i, j)\, s(i-k, j-l) \]
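A minimal sketch of how Eq. (3.3) might be solved for the three-pixel support is given below. It is an illustration under stated assumptions, not the procedure used in this work: the support list, the helper names (`autocorr`, `yule_walker`, `synthesize`), the synthetic prototype image, and the coefficient values are all hypothetical, and the prototype is assumed to be zero-mean.

```python
import numpy as np

SUPPORT = [(0, 1), (1, 0), (1, 1)]   # assumed (i, j) offsets of the support

def autocorr(img, k, l):
    """Sample estimate of r_{k,l}: average of s(i,j) s(i-k,j-l) over valid sites."""
    M, N = img.shape
    a = img[max(k, 0):M + min(k, 0), max(l, 0):N + min(l, 0)]
    b = img[max(-k, 0):M + min(-k, 0), max(-l, 0):N + min(-l, 0)]
    return float(np.mean(a * b))

def yule_walker(img, support=SUPPORT):
    """Solve the linear system of Eq. (3.3) for a(i,j) and sigma_u^2."""
    A = np.array([[autocorr(img, k - i, l - j) for (i, j) in support]
                  for (k, l) in support])
    b = np.array([autocorr(img, k, l) for (k, l) in support])
    coeffs = np.linalg.solve(A, b)
    sigma_u2 = autocorr(img, 0, 0) - float(coeffs @ b)
    return coeffs, sigma_u2

def synthesize(shape, coeffs, rng, burn_in=50):
    """Generate a zero-mean AR field with unit-variance driving noise."""
    M, N = shape[0] + burn_in, shape[1] + burn_in
    s = np.zeros((M + 1, N + 1))     # extra zero row and column as boundary
    u = rng.standard_normal((M + 1, N + 1))
    for m in range(1, M + 1):
        for n in range(1, N + 1):
            s[m, n] = (coeffs[0] * s[m, n - 1] + coeffs[1] * s[m - 1, n]
                       + coeffs[2] * s[m - 1, n - 1] + u[m, n])
    return s[1 + burn_in:, 1 + burn_in:]   # discard the start-up transient

rng = np.random.default_rng(0)
true_coeffs = np.array([0.5, 0.4, -0.2])   # hypothetical a(0,1), a(1,0), a(1,1)
prototype = synthesize((200, 200), true_coeffs, rng)
est_coeffs, est_sigma_u2 = yule_walker(prototype)
```

On this synthetic prototype the estimated coefficients come out close to the ones used for synthesis, which is a simple sanity check before applying the same routine to a real prototype image.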
For simplicity, we rewrite the AR model (Eq. (3.2)) in matrix-vector form as
\[ s_k = F s_{k-1} + u_k \tag{3.4} \]
For example, if we consider a first-order NSHP support with three-pixel neighborhood,
then $s_k = [s(m, n), s(m, n-1), s(m-1, n)]^T$ and
$s_{k-1} = [s(m, n-1), s(m-1, n), s(m-1, n-1)]^T$. Matrix F contains the AR coefficients of the
image. If a(0, 1), a(1, 0), a(1, 1) are the AR coefficients corresponding to the three-pixel
NSHP neighborhood and obtained by solving Eq. (3.3), then $F = [f_1 \ f_2 \ f_3]$ where
$f_1 = (a(0, 1) \ 1 \ 0)^T$, $f_2 = (a(1, 0) \ 0 \ 1)^T$ and $f_3 = (a(1, 1) \ 0 \ 0)^T$. The vector
$u_k = [u(m, n) \ 0 \ 0]^T$ where u(m, n) is process noise and is assumed to be independent and
white Gaussian with zero mean and variance $\sigma_u^2$. The 3 × 3 covariance matrix Q of
vector $u_k$ is formed by augmenting $\sigma_u^2$ with zeros.
Based on Eq. (3.1), the measurement equation can be formulated as
\[ y_k = H s_k + v_k \tag{3.5} \]
Here, $y_k = y(m, n)$ is the given observation at (m, n) and H = [1 0 0]. The
measurement noise $v_k = v(m, n)$ has covariance matrix $R = \sigma_v^2$. The process and
measurement noise are assumed to be uncorrelated. Image estimation boils down to estimating
the state $s_k$ given the observation $y_k$.
3.2.1 Filter Equations
For the state-space model given by Eqs. (3.4) and (3.5), the MMSE estimate of state
$s_k$ can be derived using the following Kalman filter recursive equations [8, 112].
State estimate propagation: $\hat{s}_{k/k-1} = F \hat{s}_{k-1}$
Error covariance propagation: $P_{k/k-1} = F P_{k-1} F^T + Q$
Kalman gain matrix: $K_k = P_{k/k-1} H^T [H P_{k/k-1} H^T + R]^{-1}$
State update: $\hat{s}_k = \hat{s}_{k/k-1} + K_k (y_k - H \hat{s}_{k/k-1})$
Error covariance update: $P_k = (I - K_k H) P_{k/k-1}$
We initialize the state $\hat{s}_0$ (with the local mean of the observations) and covariance
matrix $P_0$ (with an identity matrix). Here, k − 1 represents (m, n − 1), $\hat{s}_{k-1}$ and
$P_{k-1}$ are the a posteriori estimates of the state and error covariance of the previous step
available at time k, $\hat{s}_{k/k-1}$ and $P_{k/k-1}$ are the a priori estimates of the state and error
covariance at time k, $y_k$ is the new measurement at time k, $K_k$ is the Kalman gain, and
$\hat{s}_k$ and $P_k$ are the a posteriori state and error covariance of the present step, respectively.
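The five recursions above can be sketched for the three-pixel state of Eq. (3.4) as follows. This is a minimal illustration with assumed values: the AR coefficients, noise variances, previous state, and observation are hypothetical, and the names `build_model` and `kalman_step` are introduced only for this example.

```python
import numpy as np

def build_model(a01, a10, a11, sigma_u2, sigma_v2):
    """State-space matrices for the three-pixel NSHP support (Eqs. 3.4, 3.5)."""
    F = np.array([[a01, a10, a11],    # first row carries the AR coefficients
                  [1.0, 0.0, 0.0],    # s(m, n-1) is copied from the old state
                  [0.0, 1.0, 0.0]])   # s(m-1, n) is copied from the old state
    H = np.array([[1.0, 0.0, 0.0]])
    Q = np.zeros((3, 3))
    Q[0, 0] = sigma_u2                # sigma_u^2 augmented with zeros
    R = np.array([[sigma_v2]])
    return F, H, Q, R

def kalman_step(s_prev, P_prev, y, F, H, Q, R):
    """One prediction and update step; returns (s_k, P_k, K_k)."""
    s_pred = F @ s_prev                                   # state propagation
    P_pred = F @ P_prev @ F.T + Q                         # covariance propagation
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)  # Kalman gain
    innovation = np.atleast_1d(float(y)) - H @ s_pred
    s_new = s_pred + K @ innovation                       # state update
    P_new = (np.eye(len(s_prev)) - K @ H) @ P_pred        # covariance update
    return s_new, P_new, K

# Hypothetical values for a single pixel update.
F, H, Q, R = build_model(0.6, 0.6, -0.2, sigma_u2=25.0, sigma_v2=100.0)
s_prev = np.array([100.0, 100.0, 100.0])   # e.g. local mean of observations
P_prev = np.eye(3)                         # identity initialization
s_new, P_new, K = kalman_step(s_prev, P_prev, 110.0, F, H, Q, R)
```

With these numbers the predicted intensity is 100 and the observation is 110, so the updated estimate of s(m, n) lies between the two, pulled toward the observation in proportion to the gain.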
The above filter is referred to as the Auto-regressive Kalman Filter (ARKF). It
is important to observe that linear dependence implies statistical dependence but not
vice-versa. Our idea is to arrive at a more general framework wherein pixel dependen-
cies can be expressed statistically rather than by imposing a strong linear constraint
as in the AR model.
3.3 DISCONTINUITY-ADAPTIVE MRF
To restore the original image in the presence of noise, it is important to incorporate
as much knowledge as possible about the original image in the estimation process.
The Gauss MRF or AR model, that is commonly used in image restoration, is one
way of imposing smoothness constraint through the state equation to regularize the
solution. A homogeneous and linear AR model compromises on the ability to detect
sharp transitions [28]. In [28, 113], spatially varying 2D AR parameters are estimated
by windowing the observed image. Jeng and Woods [29] propose inhomogeneous AR
image models using local statistics. Kadaba et al. [11, 33] observe that Bernoulli-Laplacian
and Cauchy distributions are a better fit for AR model residuals than the
Gaussian. In [37], a compound Gauss Markov random field is used to model images.
Edge-preserving Markov random fields have been proposed in [39].
The approach that we adopt to incorporate an edge-preserving MRF prior within
our recursive estimation scheme is based on the fact that statistical dependence in-
corporated using a Markov random field (MRF) model provides better flexibility in
incorporating contextual constraints through a suitably derived prior. Following the
discussions in chapter 2, we propose to employ a discontinuity-adaptive MRF model
to incorporate smoothness constraint while simultaneously preserving the edges.
3.3.1 Condition for DA Potentials
Li [77] proposes a continuous adaptive regularizer model in which, unlike the line field
model, the interaction decreases as the derivative magnitude becomes large and is
completely turned off at infinite magnitude of the signal derivative. A classical way of
finding solutions to ill-posed problems is based on regularization methods, where stability
and uniqueness of the solution are enforced by the introduction of prior smoothness
constraints. A more general approach, which includes the classical one as a special
case, is probabilistic regularization which considers the original image and its degraded
version as realizations of random fields. The prior knowledge about the solution is ex-
pressed in the form of a conditional or joint probability distribution that specifies the
desired dependencies among values at neighboring sites.
Without loss of generality, consider a 1-D signal f that is to be estimated. Let
$\eta = f'(x)$ and let $f^{(n)}$ denote the nth derivative of f. A potential function $g(f^{(n)}(x))$
quantifies the penalty against the irregularity in $f^{(n-1)}(x)$ and corresponds to prior
clique (a set of connected pixels) potentials in MRF models [77]. The clique potential
is usually chosen to satisfy two properties: (i) $g(\eta) = g(|\eta|)$, and (ii) the derivative
of g must be expressible as $g'(\eta) = 2\eta h(\eta)$ where h is a function which determines
the interaction among neighboring pixels. The magnitude $|g'(\eta)| = |2\eta h(\eta)|$ is the
strength with which the regularizer performs smoothing. A necessary condition for
any regularization model to be adaptive to discontinuities is
\[ \lim_{\eta \to \infty} |g'(\eta)| = \lim_{\eta \to \infty} |2\eta h(\eta)| = A \tag{3.6} \]
where A ≥ 0 is a constant. The above condition with A = 0 completely prohibits
smoothing at discontinuities as η → ∞ whereas with A > 0 the DA regularizer allows
limited (bounded) smoothing.
Fig. 3.1: DA model. (a) Interaction of neighbors as a function of difference (x axis: signed difference measure η; y axis: interaction function $h_\gamma(\eta)$). (b) Penalty imposed by the DA model with increasing difference in intensity values (x axis: signed difference measure η; y axis: DA-potential function $g_\gamma(\eta)$, with the convex and non-convex regions marked).
3.3.2 DAMRF Prior for Recursive Estimation
The DA models proposed by Li [77] satisfy all the conditions for discontinuity-adaptivity
and are robust to outliers [78, 114]. They are parameterized by a free parameter γ and
allow better control over the shape of the potential function [115]. Among the DA
potentials suggested by Li in [77], we use the interaction function
\[ h_\gamma(\eta) = \frac{1}{1 + (\eta^2/\gamma)} \]
where η is as defined before. The corresponding potential function takes the form
\[ g_\gamma(\eta) = \gamma \log(1 + (\eta^2/\gamma)) \tag{3.7} \]
Fig. 3.1 (a) shows how the function $h_\gamma(\eta)$ specifies interaction based on the difference
value η while Fig. 3.1 (b) brings out the behavior of the corresponding potential
function $g_\gamma(\eta)$. For this choice of $h_\gamma(\eta)$, the smoothing strength $|\eta h_\gamma(\eta)|$ increases
monotonically as |η| increases within the band $B_\gamma = (-\sqrt{\gamma}, \sqrt{\gamma})$. Outside the band, smoothing
decreases and becomes zero as η → ∞. Since this makes it possible to preserve image
discontinuities, the model is also called a discontinuity-adaptive MRF (DAMRF). We observe that
the variation from the convex (smoothing) region to the non-convex region (i.e., lesser increment
in penalty to distinct pixels) is gradual in Fig. 3.1 (b). This choice of DA-potential
function is based on our observation that its performance is superior to other
DA-potentials proposed by Li [77, 78]. Fig. 3.2 (a) shows how the smoothing strength of
the DAMRF varies as a function of $\eta^2$.
Fig. 3.2: (a) Plot shows how the smoothing strength of DAMRF, $|2\eta h_\gamma(\eta)|$, varies with the squared difference measure $\eta^2$. (b) Heavy-tailed DAMRF distributions (x axis: pixel intensity; y axis: DAMRF values; curves labeled (30, 50), (80, 30) and (160, 180)).
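The properties claimed for this DA potential can be checked numerically. The sketch below is illustrative only, with an assumed value γ = 25 and hypothetical function names: it confirms that the derivative of the potential equals 2η h(η), that the smoothing strength peaks at the edge of the band at |η| = √γ, and that it decays toward zero for very large differences (the case A = 0 in Eq. (3.6)).

```python
import numpy as np

def h_gamma(eta, gamma):
    """Interaction function of the DA model."""
    return 1.0 / (1.0 + eta ** 2 / gamma)

def g_gamma(eta, gamma):
    """DA potential function of Eq. (3.7)."""
    return gamma * np.log1p(eta ** 2 / gamma)

gamma = 25.0                                   # hypothetical value
eta = np.linspace(-20.0, 20.0, 4001)           # step 0.01

strength = np.abs(2.0 * eta * h_gamma(eta, gamma))   # |g'(eta)| = |2 eta h(eta)|
eta_peak = abs(float(eta[np.argmax(strength)]))      # expected near sqrt(gamma)

# Numerical derivative of the potential versus the closed form 2 eta h(eta).
numeric = np.gradient(g_gamma(eta, gamma), eta)
analytic = 2.0 * eta * h_gamma(eta, gamma)
max_derivative_error = float(np.max(np.abs(numeric[1:-1] - analytic[1:-1])))

# Smoothing strength far outside the band.
far_strength = float(abs(2.0 * 1e6 * h_gamma(1e6, gamma)))
```

Larger γ widens the band in which the prior behaves like a quadratic smoother, while smaller γ switches smoothing off at smaller intensity differences.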
When neighborhood interaction in s is defined as above, the irregularities in the
solution are penalized by the corresponding potential function, and the conditional
pdf takes the form
\[ p(s(m, n)/\hat{s}(m-i, n-j)) = \frac{1}{Z} \exp\left(-\gamma \log\left(1 + \frac{\eta^2(s(m, n), \hat{s})}{\gamma}\right)\right) \tag{3.8} \]
where Z is a normalization constant and the energy term
\[ \eta^2(s(m, n), \hat{s}) = \frac{1}{\rho^2_{(m,n)}} \sum_{(i,j) \in R_{m,n}} \frac{(s(m, n) - \hat{s}(m-i, n-j))^2}{(i^2 + j^2)} \tag{3.9} \]
Here, $(i, j) \in R_{m,n}$ where $R_{m,n}$ is the NSHP support of order $M_1$. Thus, $\hat{s}$ are the
NSHP neighbors of s(m, n). Parameter $\rho^2_{(m,n)}$ controls the allowable variation of the
current pixel with respect to its neighboring pixel values [40] and plays an important
role in adaptively modifying the prior distribution. Since the potential function used is
discontinuity-adaptive, this local conditional prior is expected to render better
edge-preserving capability compared to the quadratic GMRF regularizer. Fig. 3.2 (b)
illustrates the non-Gaussian nature of DAMRF distributions. The pdf in Eq. (3.8)
can be rewritten as
\[ p(s(m, n)/\hat{s}(m-i, n-j)) = \frac{1}{Z} \left(1 + \frac{\eta^2(s(m, n), \hat{s})}{\gamma}\right)^{-\gamma} \tag{3.10} \]
which has a maximum at $\frac{\sum \alpha_{ij}\, \hat{s}(m-i, n-j)}{\sum \alpha_{ij}}$ where $\alpha_{ij} = 1/(i^2 + j^2)$ are the
corresponding weights and $(i, j) \in R_{m,n}$. Further, the pdf p is symmetric about the
maximum (due to the quadratic nature of the argument $\eta^2$). Thus, the mean value of the
pdf $p(s(m, n)/\hat{s})$ is given by
\[ E(p(s(m, n)/\hat{s})) = \frac{\sum_{i,j} \alpha_{ij}\, \hat{s}(m-i, n-j)}{\sum_{i,j} \alpha_{ij}}. \tag{3.11} \]
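To make Eqs. (3.9) to (3.11) concrete, the following sketch evaluates the unnormalized DAMRF conditional pdf over the intensity range 0 to 255 and compares its peak with the weighted neighbor mean. It is an illustration only: the neighbor offsets and values, ρ², γ, and the function names are assumptions chosen for this example.

```python
import numpy as np

def damrf_unnormalized_pdf(s, neighbors, rho2, gamma):
    """(1 + eta^2/gamma)^(-gamma) of Eq. (3.10), omitting the factor 1/Z.

    neighbors maps an offset (i, j) to the already estimated value at
    (m - i, n - j); eta^2 follows Eq. (3.9).
    """
    s = np.asarray(s, dtype=float)
    eta2 = np.zeros_like(s)
    for (i, j), value in neighbors.items():
        eta2 += (s - value) ** 2 / (i * i + j * j)
    eta2 /= rho2
    return (1.0 + eta2 / gamma) ** (-gamma)

def weighted_neighbor_mean(neighbors):
    """Right-hand side of Eq. (3.11) with alpha_ij = 1 / (i^2 + j^2)."""
    alpha = np.array([1.0 / (i * i + j * j) for (i, j) in neighbors])
    values = np.array(list(neighbors.values()), dtype=float)
    return float(np.sum(alpha * values) / np.sum(alpha))

# Hypothetical causal neighbors of the current pixel and model parameters.
neighbors = {(0, 1): 100.0, (1, 0): 110.0, (1, 1): 120.0, (1, -1): 90.0}
grid = np.arange(256)
pdf = damrf_unnormalized_pdf(grid, neighbors, rho2=50.0, gamma=2.0)

mode = int(grid[np.argmax(pdf)])
mu = weighted_neighbor_mean(neighbors)
```

Here the diagonal neighbors receive half the weight of the horizontal and vertical ones, so the peak sits at 105 rather than at the plain average of the four values.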
3.4 PRINCIPLE OF IMPORTANCE SAMPLING
In order to incorporate the DAMRF prior (formulated in section 3.3.2), into the
Kalman filter framework, we need to estimate the mean and covariance of this non-
Gaussian prior (in each recursive prediction step). Since it is analytically not possible
to compute the variance of this pdf, we resort to a Monte Carlo method known as
importance sampling.
Monte Carlo (MC) is the art of approximating an expectation by the sample mean
of a function of simulated random variables [80]. Since the simulations are with random
numbers, the more simulations we perform, the more accurate the approximation
becomes [116]. However, the closer the approximation by a specific MC method, the fewer
samples are needed to reach a certain level of accuracy. With appropriate MC methods,
even when using random sampling, it is possible to reduce the (error) variance for a
given number of sample points. This is based on the intuition that the efficiency of
random sampling methods can be enormously enhanced by a suitable choice of sample
points. For example, to estimate moments under a Gaussian distribution with random
samples, we need to sample more frequently around the mean of the Gaussian pdf
than employ uniform sampling. The MC procedure that enables us to perform such
tasks systematically is known as importance sampling.
Importance sampling (IS) is an MC method for determining estimates under
a (difficult to sample) target function or pdf, provided its functional form is known
up to a multiplicative constant [79]. The idea behind IS is that certain values of
the input random variables have more impact on the parameter being estimated than
others. If these “important” values are emphasized by sampling more frequently, then
the estimator error variance can be reduced [80]. Hence, the basic methodology is to
choose a distribution for generating random samples which ‘encourages’ important
values. This is done by introducing a probability distribution function [116] that
assigns higher probability to the intervals of the target function that should
receive more samples. Sampling from this distribution alone would result in a biased
estimator. Hence, the outputs are corrected using
correction weights for each sample to ensure unbiasedness [116, 117]. The selection of
the sampling distribution, known as the importance function, requires prior knowledge of
the target pdf (or function).
Let us consider a pdf p(z) from which it is very difficult to make any estimates of its
moments. From the functional form, we can estimate its non-zero support. Consider
a distribution q(z), a reasonable approximation to p(z), which is also known up to a
multiplicative constant, is easy to sample, and is such that the (non-zero) support of
q(z) includes the support of p(z). Such a density q(z) is called the sampler density
(referred to as importance function).
Mathematically, the idea of importance sampling can be presented as follows.
Suppose the density q(z) roughly approximates the target density p(z), with
(non-zero) support A (i.e., $\int_{z \in A} q(z)\,dz = 1$). The expected value of a function g under pdf
p(z) can be written as
\[ E_p[g(z)] = \int g(z)\, p(z)\,dz = \int g(z) \left(\frac{p(z)}{q(z)}\right) q(z)\,dz = E_q\left[g(z) \left(\frac{p(z)}{q(z)}\right)\right] \tag{3.12} \]
Fig. 3.3: Choice of importance function. (a) Good choice: The support of the target pdf is included in the sampler (curves: Rayleigh (10), Gamma (2, 4)). (b) Bad choice: The support of the same target pdf is not included in the sampler (curves: q: Rayleigh (4), p: Gamma (2, 4)). The x and y axes correspond to the support and the pdf values, respectively.
Using the above identity and IS with the samples of q, $E_p[g(z)]$ is estimated as
$$\int g(z)\,p(z)\,dz \approx \frac{1}{L}\sum_{l=1}^{L} g(z^{(l)})\left(\frac{p(z^{(l)})}{q(z^{(l)})}\right) \tag{3.13}$$
where the L samples $z^{(l)}$ are drawn from the distribution given by q(z). The
(distribution) ratio p(z)/q(z) is known as the likelihood ratio [118], since it quantifies
the relative likelihood of a given sample $z^{(l)}$ under p compared to q. We note that
each sample drawn from q is weighted by the likelihood ratio (or the correction weight)
for counter-balancing [116].
It is important to note that the success of this method in getting accurate estimates
is entirely dependent on selecting a good importance function q(z). Fig. 3.3 shows
good and bad q for a given p. Even when q(z) has roughly the same shape as p(z),
serious difficulties arise if q(z) decays faster than p(z) at the tails (for example,
as shown in Fig. 3.3 (b)). In such cases, the improbable samples at the tails of q
receive correction weights that are orders of magnitude larger than usual, making the
estimate biased [80]. For further information about the choice of importance function,
refer to [118, 119]. Fig. 3.4 illustrates the limitation of a non-symmetric sampler.¹
[Figure 3.4: two plots of pdf values versus support. Panel (a) legend: Cauchy (2, 1), Rayleigh (2). Panel (b) legend: Rayleigh (10), Gaussian (5, 40).]
Fig. 3.4: Choice of importance function. (a) Good choice: Sampler support includes (non-symmetric) target pdf. (b) Bad choice: Non-symmetric sampler cannot be employed for IS of a symmetric target distribution.
3.4.1 Moment Estimation
Our aim is to determine the estimates under the pdf p, using samples from q. Since it
is difficult to draw samples from the target pdf p, we draw L samples,
$\{z^{(l)}\}_{l=1}^{L}$, from the sampler pdf q, i.e., they are chosen from a distribution q
which concentrates in the region where the function p is large. If these samples had
been drawn from p, we could determine the moments of p directly from them. In order to
use the samples from q to determine the estimates of the moments of p, we proceed as
follows. When we use samples from q to determine any estimates under p, the regions
where q is greater than p are over-represented, and the regions where q is less than p
are under-represented. To account for this, we use correction weights
$w_l = p(z^{(l)})/q(z^{(l)})$ in determining the estimates under p [see Eq. (3.13)]. While
estimating the moments, the correction weights must be normalized if the target p and
the sampler q (densities) are known only up to a multiplicative constant [118]. For
example, to find the mean of the distribution p we use
$$\mu_p = \frac{\sum_{l=1}^{L} w_l\, z^{(l)}}{\sum_{l=1}^{L} w_l}.$$
Similarly, the variance under the distribution p can be found from the samples of q by
importance sampling using
$$\sigma_p^2 = \frac{\sum_{l=1}^{L} w_l\, (z^{(l)} - \mu_p)^2}{\sum_{l=1}^{L} w_l}.$$
¹ In Fig. 3.4 (a), to estimate the moments under a non-symmetric distribution using the symmetric
sampler, explicit care must be taken to use zero values of target pdf for the negative samples.
Table 3.1: Moment estimation using IS with a proper choice of importance function (L = 1000 samples).
Target pdf                   | Sampler density    | True mean | IS mean (µp) | True variance | IS variance (σ²p)
-----------------------------+--------------------+-----------+--------------+---------------+------------------
Gaussian (µ = 20, σ² = 40)   | Cauchy (µ, 30)     | 20        | 19.99        | 40            | 40.004
Gaussian (200, 400)          | Cauchy (µ, 30)     | 200       | 199.97       | 400           | 400.12
Laplacian (µ = 2, b = 1)     | Cauchy (µ, 30)     | 2         | 2.007        | 2             | 2.025
Laplacian (20, 10)           | Cauchy (µ, 30)     | 20        | 20.011       | 200           | 199.999
Gamma (k = 2, θ = 4)         | Rayleigh (σ = 10)  | 8         | 8.024        | 32            | 32.245
Chi-square (k = 4)           | Rayleigh (σ = 10)  | 4         | 3.997        | 8             | 7.95
Rayleigh (σ = 4)             | Rayleigh (σ = 10)  | 5.013     | 4.997        | 6.867         | 6.815
Rayleigh (σ = 2)             | Cauchy (2, 1)      | 2.507     | 2.503        | 1.717         | 1.720
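The normalized-weight estimators of the mean and variance can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used for the reported experiments: the function name `is_moments`, the use of NumPy, the random seed and the sample count (much larger than L = 1000, so that the run-to-run scatter is small) are all assumptions. The setting mimics the Gaussian (µ = 20, σ² = 40) target with a Cauchy (µ, 30) sampler, with both densities specified only up to a constant.

```python
import numpy as np

def is_moments(log_p, log_q, z):
    """Self-normalized IS estimates of the mean and variance of p from samples z ~ q."""
    log_w = log_p(z) - log_q(z)          # log correction weights, up to an additive constant
    w = np.exp(log_w - log_w.max())      # common factors cancel after normalization
    w /= w.sum()                         # normalized weights
    mean = np.sum(w * z)
    var = np.sum(w * (z - mean) ** 2)
    return mean, var

# Target p: Gaussian with mean 20 and variance 40, known only up to a constant.
mu, sigma2 = 20.0, 40.0
log_p = lambda z: -0.5 * (z - mu) ** 2 / sigma2

# Sampler q: Cauchy with location mu and scale 30 (log-density up to a constant).
loc, scale = mu, 30.0
log_q = lambda z: -np.log1p(((z - loc) / scale) ** 2)

rng = np.random.default_rng(0)                       # assumed seed, for repeatability
u = rng.uniform(size=100_000)                        # assumed sample count
z = loc + scale * np.tan(np.pi * (u - 0.5))          # inverse-CDF Cauchy samples

mean_is, var_is = is_moments(log_p, log_q, z)
print(mean_is, var_is)                               # should be close to 20 and 40
```

Working with log-weights and subtracting the maximum before exponentiating is a numerical-stability choice of this sketch; it does not change the normalized weights.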
The accuracy of the IS estimates is quantified using their MC variance. For example,
the variance of the estimate of µp is defined as
$\sigma_{ac}^2 = \frac{1}{L}\sum_{l} [z^{(l)} - \mu_p]^2$, where
$z^{(l)}$ are now randomly sampled from p. Importance sampling with a proper choice of
importance function yields a much smaller variance than uniform sampling [80, 120].
We now illustrate and verify the ability of the importance sampling principle to
estimate the moments under some well-known distributions.
We performed experiments to estimate the mean and variance of distributions
such as Laplacian, Gamma and Rayleigh from samples of another distribution, using
importance sampling. In Tables 3.1 and 3.2, we have summarized the results for proper
and improper choices of the importance function, respectively. The tables also contain
the parameters used for the target distributions, the chosen parameters of the sampling
pdfs, and the true and estimated means and variances. With a proper choice of
importance function, the closeness of the estimated (first two) moments to
the true values (as given in Table 3.1) validates importance sampling. Figs. 3.3 (a)
and 3.4 (a) are good choices of importance function for which the estimated values
are given in rows 5 and 8, respectively, of Table 3.1. Note that Table 3.2 corresponds
to bad choices of the importance function. Figs. 3.3 (b) and 3.4 (b) illustrate target
distributions and importance functions corresponding to the first two rows of Table
3.2, respectively.
Table 3.2: Moment estimation using IS with a bad choice of importance function (L = 1000 samples).
Target pdf                   | Bad choice of sampler | True mean | IS mean | True variance | IS variance
-----------------------------+-----------------------+-----------+---------+---------------+------------
Gamma (k = 2, θ = 4)         | Rayleigh (σ = 4)      | 8         | 7.126   | 32            | 17.68
Gaussian (µ = 5, σ² = 40)    | Rayleigh (σ = 10)     | 5         | 7.51    | 40            | 22.46
Chi-square (k = 40)          | Rayleigh (σ = 10)     | 40        | 33.88   | 80            | 19.41
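A short worked calculation may help to see why the Rayleigh (σ = 4) sampler is a poor match for the Gamma (k = 2, θ = 4) target while Rayleigh (σ = 10) is acceptable. It assumes the standard parameterizations $p(z) = z e^{-z/4}/16$ and $q(z) = (z/\sigma^2)\, e^{-z^2/(2\sigma^2)}$ for $z > 0$, which agree with the true moments quoted in the tables. The correction weight is then

$$w(z) = \frac{p(z)}{q(z)} = \frac{\sigma^2}{16}\exp\!\left(\frac{z^2}{2\sigma^2} - \frac{z}{4}\right).$$

For σ = 4 this is $\exp(z^2/32 - z/4)$, which already equals $e^{4} \approx 54.6$ at z = 16. The target places $P_p(z > 16) = 5e^{-4} \approx 0.092$ of its mass beyond that point, whereas the sampler visits it only with probability $P_q(z > 16) = e^{-8} \approx 3.4 \times 10^{-4}$, so a sizeable part of the target is represented by very rare, very heavily weighted samples. For σ = 10 the weight is $6.25\exp(z^2/200 - z/4) \le 6.25$ on $0 < z \le 50$, and the target mass beyond z = 50 is only $13.5\,e^{-12.5} \approx 5 \times 10^{-5}$, so the weights stay moderate wherever the target has appreciable mass.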
In our problem, the DAMRF distribution corresponds to p. We choose the Cauchy
distribution as the importance function q. It is a better approximation than the
Gaussian distribution as it provides larger support overlap due to its heavy-tailed
nature [79]. During the estimation procedure, since the DA state conditional pdf
changes its mode (peak value) at each pixel based on its neighbors, we adaptively vary
the Cauchy sampler location based on the mean of the neighbors. To account for the
variations in shape (width and tail behavior) of the DAMRF prior at each pixel, we set
the scale of the Cauchy sampler sufficiently high.²
² The same value of scale was found to work well for all the images.
3.5 KALMAN FILTER WITH NON-GAUSSIAN PRIOR
We now present a recursive algorithm for estimating the original image within the KF
framework by non-Gaussian modeling of the image prior. In the proposed strategy,
knowledge of only the conditional pdf is required and its parameters are a function
of the already estimated pixels and the values of ρ(m,n) and γ as discussed in section
3.3.2. This implicitly generalizes the state transition equation and does not restrict it
to be linear. The proposed extension is based on the observation that the KF and its
variants differ only in state vector propagation, but have the same update equations.
Specifically, we formulate the conditional density of the pixel to be estimated (given
its neighbors) based on the DAMRF model. The state $s_k$ at position k = (m, n) is
the original pixel intensity s(m, n) which is to be estimated. The mean and variance
of the state conditional density constitute the predicted mean $\hat{s}_{k/k-1}$ and
predicted covariance $P_{k/k-1}$ of the state, respectively. We estimate the first two
statistics of the non-Gaussian prior by importance sampling. They are taken to the
update stage of the KF to arrive at the final image estimate.
The steps involved in the proposed method are as follows:
1. At each pixel, we construct the state conditional pdf using the past pixels in
the NSHP support, and the values of ρ(m,n) and γ in the DAMRF model (Eq.
3.8). For a first-order NSHP support (M1 = 1), the neighbors considered are
s(m, n− 1), s(m− 1, n), s(m− 1, n− 1) and s(m− 1, n + 1), which are already
estimated, and η2(s(m, n), s) is as defined in Eq. 3.9. Since the parameter
ρ2(m,n) of the DAMRF model characterizes local dependence it is set to Pk−1,
the covariance of the previous state.
2. We obtain the mean and covariance of the above pdf using importance sampling.
Draw samples $z^{(l)}$, l = 1, 2, ..., L from a Cauchy sampler³. To facilitate a
close match with the DAMRF pdf p at each pixel, the location of the sampler
density q(z) is varied as
$$\mu_q = \frac{s(m, n-1) + s(m-1, n) + s(m-1, n-1) + s(m-1, n+1)}{4}$$
and the scale is chosen large enough to include the support of p in q. The samples
are weighted by $w_l = p(z^{(l)})/q(z^{(l)})$ (section 3.4). The mean µp
can be computed analytically using Eq. (3.11) or by importance sampling. The
mean and variance $\sigma_p^2$ of p are computed as
$$\mu_p = \frac{\sum_{l=1}^{L} w_l\, z^{(l)}}{\sum_{l=1}^{L} w_l} \quad \text{and} \quad \sigma_p^2 = \frac{\sum_{l=1}^{L} w_l\, (z^{(l)} - \mu_p)^2}{\sum_{l=1}^{L} w_l}.$$
3. The predicted mean and error covariance are fed to the update stage of the
Kalman filter as $\hat{s}_{k/k-1} = \mu_p$ and $P_{k/k-1} = \sigma_p^2$,
where $\hat{s}_{k/k-1}$ and $P_{k/k-1}$ are the one-step forward predicted mean and
error covariance, respectively.
Kalman gain:
$$K_k = P_{k/k-1} H^T \left[ H P_{k/k-1} H^T + \sigma_v^2 \right]^{-1} = \frac{\sigma_p^2}{\sigma_p^2 + \sigma_v^2}$$
Mean update:
$$\hat{s}_k = \hat{s}_{k/k-1} + K_k (y_k - H \hat{s}_{k/k-1}) = \mu_p + \frac{\sigma_p^2 (y_k - \mu_p)}{\sigma_p^2 + \sigma_v^2}$$
Error covariance update:
$$P_k = (I - K_k H) P_{k/k-1} = \frac{\sigma_p^2\, \sigma_v^2}{\sigma_p^2 + \sigma_v^2}.$$
Since the state ($s_k$ = s(m, n)) is a scalar, from Eq. (3.5) we have H = 1. Here
$y_k$ = y(m, n) is the scalar observation.
This gives the estimated mean $\hat{s}(m, n) = \hat{s}_k$; we sequentially perform
steps 1 to 3 until we estimate the last pixel.
³ Cauchy distributed samples with location µq and scale s₁ can be generated using µq + s₁ tan(π(u − 0.5)), where u is uniformly distributed over 0 to 1.
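The per-pixel recursion in steps 2 and 3 can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation used in the experiments: the function names `iskf_pixel` and `toy_log_prior` are hypothetical, and the prior used in the usage lines is a simple Gaussian stand-in centred on the neighbor mean, not the DAMRF conditional pdf of Eq. 3.8, which would be supplied through the `log_prior` argument.

```python
import numpy as np

def iskf_pixel(y, neighbors, log_prior, noise_var, scale, L, rng):
    """One scalar ISKF step: IS prediction of (mean, variance), then Kalman update with H = 1."""
    neighbors = np.asarray(neighbors, dtype=float)
    mu_q = neighbors.mean()                               # sampler location: mean of the neighbors
    u = rng.uniform(size=L)
    z = mu_q + scale * np.tan(np.pi * (u - 0.5))          # Cauchy samples
    log_q = -np.log1p(((z - mu_q) / scale) ** 2)          # Cauchy log-density, up to a constant
    log_w = log_prior(z, neighbors) - log_q               # log correction weights
    w = np.exp(log_w - log_w.max())
    w /= w.sum()                                          # normalized weights
    mu_p = np.sum(w * z)                                  # predicted mean
    var_p = np.sum(w * (z - mu_p) ** 2)                   # predicted error covariance
    gain = var_p / (var_p + noise_var)                    # Kalman gain
    s_hat = mu_p + gain * (y - mu_p)                      # mean update
    P = var_p * noise_var / (var_p + noise_var)           # error covariance update
    return s_hat, P

def toy_log_prior(z, neighbors):
    # Hypothetical stand-in prior (Gaussian, variance 25), NOT the DAMRF model.
    return -0.5 * (z - neighbors.mean()) ** 2 / 25.0

rng = np.random.default_rng(1)                            # assumed seed
s_hat, P = iskf_pixel(y=110.0, neighbors=[100.0, 100.0, 100.0, 100.0],
                      log_prior=toy_log_prior, noise_var=25.0,
                      scale=50.0, L=50_000, rng=rng)
print(s_hat, P)
```

With this stand-in prior the prediction is approximately (100, 25), so the update should land near the midpoint between prediction and observation, with roughly half the predicted variance.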
Note that for the ARKF, the AR coefficients are assumed to be known accurately.
In real situations, this can be difficult because the original image is not available. In
contrast, for the ISKF, the statistical parameters µp and σ2p are obtained by IS.
3.6 EXPERIMENTAL RESULTS
In this section, we present results on image estimation using the proposed impor-
tance sampling-based Kalman filter (ISKF). We also provide comparisons with the
auto-regressive Kalman filter (ARKF) to demonstrate the improvement obtained by
incorporating a non-Gaussian prior in the Kalman filter. In ARKF, the AR parameters
and the process noise covariance are obtained from the auto-correlation coefficients of
the entire original image.
As a quantitative measure of the accuracy of the estimates, we use the
improvement-in-signal-to-noise-ratio (ISNR) defined as
$$\mathrm{ISNR} = 10 \log_{10}\left(\frac{\sum_{i,j}(y(i,j) - s(i,j))^2}{\sum_{i,j}(\hat{s}(i,j) - s(i,j))^2}\right)\ \mathrm{dB} \tag{3.14}$$
where s, y, and $\hat{s}$ represent the original, the degraded observation, and the
estimated image, respectively, and the summation is taken over the entire image.
Fig. 3.5: (a) Original image. (b) Degraded image (SNR = 10 dB). Image estimated by (c) ARKF (ISNR = 3.06 dB), and (d) the ISKF (ISNR = 4.44 dB, γ = 1.8).
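For reference, Eq. (3.14) translates directly into a few lines of Python. The function name `isnr_db` and the synthetic arrays in the usage lines are assumptions made for illustration only.

```python
import numpy as np

def isnr_db(original, degraded, estimate):
    """Improvement in SNR (dB): observation error energy over estimation error energy."""
    original = np.asarray(original, dtype=float)
    num = np.sum((np.asarray(degraded, dtype=float) - original) ** 2)
    den = np.sum((np.asarray(estimate, dtype=float) - original) ** 2)
    return 10.0 * np.log10(num / den)

# Synthetic check: an estimate whose error is half the observation error
# has one quarter of the error energy, i.e. ISNR = 10*log10(4) dB.
rng = np.random.default_rng(0)
s = rng.uniform(0.0, 255.0, size=(8, 8))
noise = rng.normal(0.0, 10.0, size=s.shape)
gain_db = isnr_db(s, s + noise, s + 0.5 * noise)
print(gain_db)
```

A positive value means the estimate is closer to the original than the observation is; a value of 0 dB means no improvement.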
In Fig. 3.5(a), an original flower image of size 200 × 200 with sharp and clean
petals is shown. The image obtained after degradation by additive white Gaussian
noise (AWGN) of SNR = 10 dB is given in Fig. 3.5(b). The image estimated by
ARKF and the proposed approach are shown in Figs. 3.5(c) and 3.5(d), respectively.
We note that using the proposed scheme the noise on the white petals and on the
black background is filtered out very effectively while the petals come out sharp. The
image estimated by ISKF not only has finer details but has less noise compared to the
output of the ARKF. This is also reflected in the ISNR values.
To illustrate the (non-Gaussian) nature of the DAMRF pdf, the distribution
corresponding to the DAMRF model has been plotted in Fig. 3.2 (b) for three different
pixel locations in the image (Fig. 3.5(d)), at (30, 50), (80, 30) and (160, 180). We
note that the DAMRF pdfs have a heavy-tailed nature. We have observed experimentally
that their variance (degree of heavy-tail) varies coarsely with the variation among
the neighbors (of the current pixel), while it is finely controlled by the (previous)
covariance estimate that is used for ρ²(m,n). From Fig. 3.2 (b), we see that the three
pdfs at three different locations have different means and different variances. This
shows the need for a space-varying heavy-tailed importance function to closely
approximate the DAMRF pdf at each pixel.
Fig. 3.6: (a) Original image. (b) Degraded image (SNR = 10 dB). Image estimated using (c) ARKF (ISNR = 1.29 dB), and (d) the ISKF (ISNR = 2.43 dB).
Fig. 3.7: (a) Original “House” image. (b) Degraded ($\sigma_v^2$ = 300). Image estimated using (c) AR-based KF (ISNR = 2.04 dB), and (d) ISKF (ISNR = 2.58 dB).
Fig. 3.6(a) shows an image of bricks with narrow joints while its degraded version
is shown in Fig. 3.6(b). The images recovered using the ARKF and the ISKF are
shown in Figs. 3.6(c) and 3.6(d), respectively. In attempting to filter the noise, the
ARKF yields a blurred output. On the other hand, the ISKF not only captures the
edges (joints) well but also effectively filters out the noise (on the bricks) resulting in
an output (Fig. 3.6(d)) which is much closer to the original image.
Fig. 3.7(a) shows a natural scene with a house, a tree and a car, while its degraded
version with $\sigma_v^2$ = 300 is shown in Fig. 3.7(b). The image recovered using ARKF
(Fig. 3.7(c)) retains the tree leaves but is very noisy. However, as shown in Fig. 3.7(d),
the ISKF has filtered the noise well in homogeneous regions (such as sky, roof and
road) while simultaneously retaining image details like grass and leaves. Adaptive
noise suppression with the proposed ISKF is clearly evident even in this example.
Fig. 3.8: Building. (a) Original. (b) Degraded image ($\sigma_v^2$ = 300). Image estimated by (c) AR-based KF (ISNR = 2.17 dB), and (d) ISKF (ISNR = 3.81 dB).
Fig. 3.9: Daisy. (a) Original image. (b) Degraded ($\sigma_v^2$ = 500). Image estimated using (c) AR-based KF (ISNR = 4.14 dB), and (d) ISKF (ISNR = 5.36 dB).
Fig. 3.8(a) shows a palace surrounded by a lake. Its degraded version is shown in
Fig. 3.8(b). The images estimated by ARKF and the ISKF are shown in Figs. 3.8(c)
and 3.8(d), respectively. Note the noise removal capability of the proposed DA prior
vis-a-vis the AR model. The sky is noise free, the edges on the building walls are sharp
and clear and the reflection of the building is clearly visible in the water in Fig. 3.8(d).
Fig. 3.9(a) shows an original “daisy” image. The image after degradation by
additive white Gaussian noise of $\sigma_v^2$ = 500 is shown in Fig. 3.9(b). Figs. 3.9(c)
and (d) show the images estimated by ARKF and the proposed ISKF, respectively. We
note that, unlike ARKF, the proposed approach suppresses the noise very well without
any artifacts at the edges. The petals come out clean and sharp compared to ARKF.
This is also reflected as a higher ISNR value.
[Figure 3.10: plot of ISNR values (vertical axis) versus image index (horizontal axis) for ISKF and ARKF.]
Fig. 3.10: Performance comparison of ARKF and ISKF on different images for moderate noise of $\sigma_v^2$ = 300.
In order to provide a comprehensive evaluation of performance, we experimented
with a large database of 25 images containing faces, natural scenes, textures, and
satellite images. The performance improvement is quantified in terms of the mean value
of ISNR over 20 Monte Carlo trials and is plotted in Fig. 3.10. The plot clearly
demonstrates the superior quantitative performance of the proposed approach.⁴
⁴ Except for some highly textured images (such as image 20 (Bark) and image 21 (Fabric)), where
the AR model is able to capture the image characteristics quite well.
We provide a computational complexity comparison of the proposed ISKF and
ARKF. As our prediction stage works by drawing samples, it is computationally more
intensive than ARKF. The approximate computation requirements per pixel are given in
Table 3.3. Here $n_x$ is the dimension of the state in ARKF and L is the number of
samples in the importance sampling step of ISKF. For the experiments conducted here
($n_x$ = 3 and L = 100), while the ARKF took 4 seconds, the proposed algorithm took
about 13 seconds for a 200 × 200 image, on a Pentium-IV PC running Matlab.⁵
⁵ The improvement in performance of ISKF with more than 100 samples in the IS step is only marginal.
Table 3.3: Comparison of per-pixel computational complexity: ARKF vs ISKF.
Operation | addition                          | multiplication/division  | exp/log
----------+-----------------------------------+--------------------------+--------
ARKF      | 3n_x²(n_x − 1) + n_x(n_x − 1)     | 3n_x³ + n_x² + 2n_x      | 0
ISKF      | 10L                               | 10L                      | 2L
Fig. 3.11: Plane image. (a) Original. (b) Degraded [11]. Image estimated using (c) BL-RUBF [11] (ISNR = 1.01 dB), and (d) ISKF (ISNR = 1.67 dB).
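As a worked reading of the per-pixel operation counts in Table 3.3 for the experimental setting quoted in the text ($n_x = 3$, L = 100), the expressions evaluate to

$$\begin{aligned}
\text{ARKF additions:} \quad & 3n_x^2(n_x - 1) + n_x(n_x - 1) = 3 \cdot 9 \cdot 2 + 3 \cdot 2 = 60,\\
\text{ARKF multiplications/divisions:} \quad & 3n_x^3 + n_x^2 + 2n_x = 81 + 9 + 6 = 96,\\
\text{ISKF additions:} \quad & 10L = 1000,\\
\text{ISKF multiplications/divisions:} \quad & 10L = 1000,\\
\text{ISKF exp/log:} \quad & 2L = 200.
\end{aligned}$$

These counts (156 versus 2200 operations) are only a rough guide to run time: the reported timings of 4 seconds and about 13 seconds differ by a much smaller factor than the raw counts.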
We also compared our approach with the method in [11] to further evaluate the
ISKF. Figs. 3.11(a), (b), and (c) show the original, degraded and the estimated images,
respectively, as reproduced from [11]. The recursive Bayesian filter in [11], with an AR
state model driven by a Bernoulli-Laplacian density, is shown to yield better performance
than the ARKF for image estimation in AWGN. The result of our approach, shown
in Fig. 3.11 (d), when applied to the degraded image (Fig. 3.11 (b)), is closer to the
original than Fig. 3.11(c). Our approach yields cleaner uniform regions while at the
same time the letters on the plane emerge sharper.
3.7 DISCUSSION
We proposed a novel discontinuity-adaptive Kalman filter for image estimation and
provided a framework to handle the state conditional pdf directly in the prediction
step of the Kalman filter. Instead of using the state transition equation, we used
importance sampling to predict the mean and error covariance of the non-Gaussian
conditional prior. The estimated statistics are used in the update equations of the
Kalman filter to obtain an estimate of the a posteriori intensity. Experimental
results demonstrate the effectiveness of the proposed method.
The applicability of the proposed ISKF is limited to linear degradations. However,
we also encounter many real situations where the observed image is a non-linear or
multiplicative degradation of the original image. This situation requires the non-linear
counterpart of the Kalman filter, which is the subject of the following chapter.
CHAPTER 4
UNSCENTED FILTER FOR NON-LINEAR ESTIMATION
The objective of this chapter is to introduce a powerful framework for handling non-
linear or non-additive (image) degradations. An important observation by Julier et al.
[81] is that the optimal Kalman filter update, even in non-linear/non-Gaussian cases,
requires only the prediction of the first two moments (mean and covariance) of the
state and measurement as accurately as possible [121, 122]. This problem is a specific
case of the general problem of calculating the statistics of a random variable (RV)
which has undergone a nonlinear transformation. The most widely used approaches
to transform the means and covariances through nonlinear transformations are the
linearization method [123] and the Monte Carlo (MC) sampling method [79], which
are constrained by accuracy and computational efficiency, respectively.
In this chapter, we introduce the principle of unscented transformation (UT), pro-
posed by Julier et al. [81, 121], that transforms the mean and covariance through
nonlinear transformations using a very small carefully chosen deterministic set of sam-
ples (and deterministic weights). It leads to more accurate prediction (of moments)
than linearization [81, 122], as we will demonstrate, and provides an alternative to
the computationally intensive MC approaches in many applications [85,124]. We also
illustrate the superiority of the UT over linearization, and its proximity to the true
values or the MC estimates, on simple and common nonlinear transformation examples
from the literature [81, 121]. The UT when embedded in a Kalman filter formulation
is referred to as the unscented Kalman filter (UKF). The UKF provides a simple and
(more) accurate solution to the nonlinear estimation problem (than the commonly
employed Extended Kalman Filter) [81, 121].
Fig. 4.1: Principle of unscented transformation.
4.1 UNSCENTED TRANSFORMATION
The unscented transformation (UT) is a deterministic sampling approach for calcu-
lating the statistics of a random variable which undergoes a nonlinear transformation.
The UT is founded on the intuition that with a fixed number of parameters it is easier
to approximate a probability distribution than it is to approximate an arbitrary non-
linear function or transformation [81, 121]. Following this intuition, Julier et al. [81]
formulated a parameterization which captures the mean and covariance information
(of a RV) while at the same time permitting direct propagation of the (mean and
covariance) information through an arbitrary set of nonlinear equations. This is ac-
complished in UT by i) capturing the Gaussian approximation of the prior distribution
(first two moments of the prior) with a very small set of carefully chosen deterministic
samples known as sigma points, ii) propagating these samples through the nonlinear-
ity, and iii) determining the moments of the posterior from the transformed samples.
Fig. 4.1 illustrates the transformation process of sigma points. Here, we have shown 5
sigma points (deterministic samples, marked Xi) with the mean being the sigma point
at the center, while the others lie on an ellipse (determined by the (scaled) square root
of their covariance matrix [125]). Each sigma point Xi is independently transformed
through a nonlinear function g (vector-to-vector transformation) as depicted in the
figure. The transformed sigma points $\mathcal{Y}_i = g(\mathcal{X}_i)$, i = 0, ..., 4, shown
on the right, predict the mean and covariance of the posterior random variable.
Consider propagating a random variable x through a nonlinear function
$g : \mathbb{R}^{n_x} \to \mathbb{R}^{n_y}$ to generate y = g(x). Assume x has mean
$\bar{x}$ and covariance $P_x$. To calculate the statistics (first two moments) of y using
the scaled unscented transformation (SUT), we proceed as follows: First, a set of
$2n_x + 1$ weighted samples or sigma points is deterministically chosen so that they
exactly capture the true mean and covariance of the prior random variable x. A
selection scheme that satisfies this requirement [83,126] is given by
$$\begin{aligned}
\mathcal{X}_0 &= \bar{x}, & w_0^{(\mu)} &= \frac{\lambda}{n_x + \lambda},\\
\mathcal{X}_i &= \bar{x} + \left(\sqrt{(n_x + \lambda) P_x}\right)_i, \quad i = 1, \dots, n_x, & w_0^{(c)} &= w_0^{(\mu)} + (1 - \alpha_{UT}^2 + \beta_{UT}),\\
\mathcal{X}_i &= \bar{x} - \left(\sqrt{(n_x + \lambda) P_x}\right)_{i - n_x}, \quad i = n_x + 1, \dots, 2n_x, & \lambda &= \alpha_{UT}^2 (n_x + \kappa) - n_x,\\
w_i^{(\mu)} &= w_i^{(c)} = \frac{1}{2(n_x + \lambda)}, \quad i = 1, \dots, 2n_x.
\end{aligned} \tag{4.1}$$
Parameter $\alpha_{UT}$ controls the spread of the sigma point distribution around
$\bar{x}$ and is usually set to a value between 0 and 1. The term $\beta_{UT}$ is used to
incorporate prior knowledge about x. Note that $\left(\sqrt{(n_x + \lambda) P_x}\right)_i$
represents the ith column of the $n_x \times n_x$ matrix square root¹ of
$(n_x + \lambda) P_x$. The addition and subtraction of these vectors from
the mean value of the prior results in $2n_x$ symmetric sigma points. Along with the
prior mean $\mathcal{X}_0$, we have $2n_x + 1$ sigma points². While $\mathcal{X}_i$ refers to
the ith sigma point, the superscripts µ and c on the weights refer to their use in mean
and covariance calculations, respectively. The weights $w_i^{(\mu)}$ associated with the
sigma points satisfy $\sum_{i=0}^{2n_x} w_i^{(\mu)} = 1$. Simple matrix calculations confirm
that the first two moments of these sigma points are exactly the same as that of the
prior [82]. We present explicit expressions in section 4.2.
¹ If the matrix square root A of matrix $P_x$ is of the form $P = A^T A$, then the sigma points are
formed from the rows of A. However, if $P = A A^T$, then the columns of A are used [81].
² We note that the number of sigma points is directly related to the dimension of the prior RV.
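Each sigma point is instantiated (independently passed) through the nonlinearity
to get $\mathcal{Y}_i = g(\mathcal{X}_i)$, i = 0, 1, ..., $2n_x$. The mean and covariance of y
are estimated as
$$\bar{y} = \sum_{i=0}^{2n_x} w_i^{(\mu)} \mathcal{Y}_i \tag{4.2}$$
and
$$P_y = \sum_{i=0}^{2n_x} w_i^{(c)} (\mathcal{Y}_i - \bar{y})(\mathcal{Y}_i - \bar{y})^T \tag{4.3}$$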
Positive semi-definiteness of the predicted covariance matrix Py is guaranteed by choos-
ing κ ≥ 0 [81,83]. These estimates of the moments of y are accurate to the second-order
(and third-order for symmetric priors) of the Taylor series expansion for any nonlinear
function g(x), since the prior sigma points completely capture the prior distribution
up to the second moment [81, 82]. Valuable insights into the UT can also be gained
by relating it to a numerical technique called Gaussian quadrature of integrals [127].
A close similarity also exists between the UT and the central difference interpolation
filtering (CDF) [128]. The principle of UT has also been used to improve the ’Expec-
tation step’ in the EM algorithm in the context of training neural networks [129].
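The sigma point selection and the weighted moment estimates of Eqs. 4.1 to 4.3 can be sketched compactly in Python. This is an illustrative sketch only: the function name `unscented_transform`, the use of a Cholesky factor as the matrix square root (so that columns are used, as for $P = AA^T$), and the default parameter values are assumptions, not choices taken from the thesis.

```python
import numpy as np

def unscented_transform(g, x_mean, P_x, alpha=1.0, beta=0.0, kappa=0.0):
    """Propagate (x_mean, P_x) through g using scaled sigma points and weights."""
    x_mean = np.asarray(x_mean, dtype=float)
    n = x_mean.size
    lam = alpha ** 2 * (n + kappa) - n
    # Lower-triangular A with A @ A.T = (n + lam) * P_x; its columns are the offsets.
    A = np.linalg.cholesky((n + lam) * np.asarray(P_x, dtype=float))
    sigma = np.vstack([x_mean, x_mean + A.T, x_mean - A.T])   # shape (2n + 1, n)
    w_mean = np.full(2 * n + 1, 0.5 / (n + lam))
    w_cov = w_mean.copy()
    w_mean[0] = lam / (n + lam)
    w_cov[0] = w_mean[0] + (1.0 - alpha ** 2 + beta)
    Y = np.array([np.atleast_1d(g(p)) for p in sigma])        # transformed sigma points
    y_mean = w_mean @ Y                                       # weighted mean
    d = Y - y_mean
    P_y = (w_cov[:, None] * d).T @ d                          # weighted covariance
    return y_mean, P_y

# Sanity check on a linear map, for which the exact answer is M x + b and M P M^T.
M = np.array([[1.0, 2.0], [0.0, 1.0]])
b = np.array([1.0, -1.0])
x_bar = np.array([1.0, 2.0])
P = np.array([[2.0, 0.5], [0.5, 1.0]])
y_bar, P_y = unscented_transform(lambda x: M @ x + b, x_bar, P, kappa=1.0)
print(y_bar)
print(P_y)
```

Passing the identity map instead of the linear map returns the prior mean and covariance, which is the property of the sigma points stated in the text.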
A block diagram (inspired by [112]) illustrating the steps for performing the UT
is shown in Fig. 4.2 for a prior RV of dimension 5. Its mean and covariance are
represented on the left by a vertical column and a square, respectively. To determine
the sigma points, the covariance matrix is passed through the matrix square root and
then multiplied by the factor $\lambda_c = \sqrt{5 + \lambda}$. The resulting 5 × 5 matrix
is split into 10 columns, with a positive and a negative sign before each of its 5
columns. They are augmented with a zero vector in the first column and shown as an
11-column rectangular matrix. On the top, the mean is replicated as 11 columns and is
added to the bottom
covariance components to obtain the sigma points as shown in the bottom-most (5×11)
rectangular matrix X. The ith column represents the ith sigma point Xi.
Each of these 11 prior sigma points is transformed through the nonlinear function
g to obtain the posterior sigma points $\mathcal{Y}_i$, each with dimension $n_y$. We note
that the ith posterior sigma point $\mathcal{Y}_i$ is completely determined only by the ith
prior sigma point $\mathcal{X}_i$ for i = 1, ..., 11, and not by the others. Then, using the UT
deterministic weights given in Eq. 4.1, we predict the posterior mean and covariance as
the weighted sample mean and covariance of the transformed samples $\mathcal{Y}_i$,
i = 1, ..., 11. We analyse the accuracy of the predicted mean and covariances in the
next section.
Fig. 4.2: Block diagram of UT.
We note that although this method bears a superficial resemblance to Monte Carlo
sampling methods (as employed in particle filters) there are several fundamental dif-
ferences. First, the sigma points are not drawn at random; they are deterministically
chosen so that they exhibit certain specific properties (e.g., have a given mean and
covariance). As a result, higher-order information about the distribution can be cap-
tured with a fixed and small number of points. In contrast, MC sampling methods
require orders of magnitude more random sample points in an attempt to propagate
an accurate (possibly non-Gaussian) distribution of the state. The second difference is
that sigma points can be weighted in ways that are inconsistent with the distribution
interpretation of sample points in a particle filter. For example, the weights on the
points do not have to lie in the range [0,1].
4.2 UT ANALYSIS
We first prove that with the sigma point selection scheme of Eq. 4.1, the prior sigma
points accurately capture the first two moments of the prior. For purposes of analysis,
we assign values $\beta_{UT} = 0$, $\alpha_{UT} = 1$ and $\kappa \ge 0$ in the zeroth point
covariance weight calculation. To relax this condition, refer to [126]. The prior sigma
points are given by
$$\mathcal{X}_i = \bar{x} \pm \sqrt{(n_x + \lambda)}\,\sigma_i = \bar{x} \pm \tilde{\sigma}_i$$
where $\sigma_i$ denotes the ith column of the matrix square root (A) of the true prior
covariance $P\ (= AA^T)$. This implies that $\sum_{i=1}^{n_x} \sigma_i \sigma_i^T = P$.
Since the points (leaving the zeroth sigma point) are symmetrically distributed and
chosen with equal weights about the mean of the prior $\bar{x}$, the sample mean (given
by Eq. 4.2) is exactly the same as the prior mean and all odd-ordered moments are
zero. The UT covariance $P_x$ (Eq. 4.3) is
$$\begin{aligned}
P_x &= \sum_{i=0}^{2n_x} w_i^{(c)} (\mathcal{X}_i - \bar{x})(\mathcal{X}_i - \bar{x})^T = \sum_{i=1}^{2n_x} \frac{1}{2(n_x + \lambda)} \left(\sqrt{(n_x + \lambda)}\,\sigma_i\right)\left(\sqrt{(n_x + \lambda)}\,\sigma_i\right)^T\\
&= \frac{1}{2} \sum_{i=1}^{2n_x} \sigma_i \sigma_i^T = P
\end{aligned} \tag{4.4}$$
This follows by noting that $\tilde{\sigma}_i = \mathcal{X}_i - \bar{x} = \bar{x} - \mathcal{X}_{i+n_x}$,
i = 1, ..., $n_x$, i.e., $\sigma_{i+n_x} = -\sigma_i$ for i = 1, ..., $n_x$. This proves that the
covariance $P_x$ of the UT sigma points exactly captures the true prior covariance P.
In many applications, we need to accurately calculate the expected mean and
covariance of a random variable that undergoes a nonlinear transformation [121,122].
Let the random variable x undergo an arbitrary nonlinear transformation, written as
$$y = g(x) = g(\bar{x} + \delta x) \tag{4.5}$$
where δx is a zero-mean random variable with the same covariance $P_x$ as x. The
problem is to determine the mean and covariance of y.
In order to analytically calculate these quantities, we first expand g(.) using a
multi-dimensional Taylor series expansion around $\bar{x}$. We show how UT achieves
second-order (or third-order for symmetric priors) accuracy of the Taylor series ex-
pansion in the prediction of the moments of y for any nonlinear function g(x). We
follow similar steps as in [112, 122] for analysing the accuracy of the mean while we
follow [81] for analysing the covariance with Taylor series.
To express the nonlinear function in a multi-dimensional Taylor series, we assume
that all nonlinear transformations are analytic across the domain of all possible values
of x. As the number of terms tends to infinity, the residual in the series tends to zero
and so the series always converges to the true value of the function [81]. We note that
these assumptions are more restrictive than those required for both linearization and
UT. To apply linearization, the function must necessarily be differentiable to form the
Jacobian matrix. The UT does not place this restriction.
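For the prior variable x, perturbed about a mean value $\bar{x}$ by a zero-mean
disturbance δx with covariance $P_x$, the Taylor series expansion of the nonlinear
transformation y = g(x) about $\bar{x}$ is [112]
$$y = g(x) = g(\bar{x}) + D_{\delta x} g + \frac{1}{2} D_{\delta x}^2 g + \frac{1}{3!} D_{\delta x}^3 g + \frac{1}{4!} D_{\delta x}^4 g + \dots \tag{4.6}$$
where $\delta x = [\delta x_1, \delta x_2, \dots, \delta x_{n_x}]^T$ and the operator
$D_{\delta x}$ evaluates the total differential of g(.) when perturbed around a nominal
value $\bar{x}$ by δx, i.e.,
$$D_{\delta x} = \delta x^T \Delta_x = \sum_{j=1}^{n_x} \delta x_j \frac{\partial}{\partial x_j} \tag{4.7}$$
which acts on g(.) on a component-by-component basis. Here
$\Delta_x = \left[\frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}, \dots, \frac{\partial}{\partial x_{n_x}}\right]^T$.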
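Using this definition, the kth term in the Taylor series can be written as
$$\frac{1}{k!} D_{\delta x}^k g = \frac{1}{k!} \left[ (\delta x^T \Delta_x)^k g(x) \right]_{x = \bar{x}} = \frac{1}{k!} \left[ \sum_{j=1}^{n_x} \delta x_j \frac{\partial}{\partial x_j} \right]^k g(x) \Bigg|_{x = \bar{x}} \tag{4.8}$$
which is composed of the kth order derivatives of g(.) and the kth order powers of (the
components of) δx.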
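For example, for a 3D vector $x = [x_1, x_2, x_3]^T$ and scalar y, in the second-order
term of the series for y = g(x), we can expand
$$D_{\delta x}^2 g = \delta x_1^2 \frac{\partial^2 g}{\partial x_1^2} + \delta x_2^2 \frac{\partial^2 g}{\partial x_2^2} + \delta x_3^2 \frac{\partial^2 g}{\partial x_3^2} + 2 \delta x_1 \delta x_2 \frac{\partial^2 g}{\partial x_1 \partial x_2} + 2 \delta x_1 \delta x_3 \frac{\partial^2 g}{\partial x_1 \partial x_3} + 2 \delta x_2 \delta x_3 \frac{\partial^2 g}{\partial x_2 \partial x_3}.$$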
4.2.1 Accuracy of the Mean
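The true mean of y is given by taking the expectation on both sides of the Taylor
series, i.e.,
$$\bar{y} = E[y] = E[g(x)] = E\left[ g(\bar{x}) + D_{\delta x} g + \frac{1}{2} D_{\delta x}^2 g + \frac{1}{3!} D_{\delta x}^3 g + \frac{1}{4!} D_{\delta x}^4 g + \dots \right] \tag{4.9}$$
If we assume that x is a symmetrically distributed random variable, then all odd
moments will be zero. Also we note that
$D_{\delta x}^2 g = D_{\delta x}(D_{\delta x} g) = (\Delta_x^T\, \delta x\, \delta x^T \Delta_x) g$ and
$E[\delta x\, \delta x^T] = P_x$. Using these, the mean can be reduced to
$$\bar{y} = g(\bar{x}) + \frac{1}{2} \left[ (\Delta_x^T P_x \Delta_x) g(x) \right]_{x = \bar{x}} + E\left[ \frac{1}{4!} D_{\delta x}^4 g + \frac{1}{6!} D_{\delta x}^6 g + \dots \right]. \tag{4.10}$$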
The UT calculates the posterior mean from the propagated sigma points using Eq.
4.1. We can write the propagation of each sigma point through the nonlinear function
as a Taylor series expansion about x
Y i = g(X i) = g(x) + Deσig +
1
2D2
eσig +
1
3!D3
eσig +
1
4!D4
eσig + ... (4.11)
The perturbation in sigma points X i about the mean value x is quantified in terms
of the corresponding covariance vectors (columns) σi.Using these sigma points (Eq. 4.11) and deterministic UT weights as in Eq. 4.2,
the UT predicted mean is: yUT =2nx∑
i=0
wmi Y i
=λ
nx + λg(x) +
1
2(nx + λ)
2nx∑
i=1
[g(x) + Deσi
g +1
2D2
eσig +
1
3!D3
eσig +
1
4!D4
eσig + ...
]
= g(x) +1
2(nx + λ)
2nx∑
i=1
[Deσi
g +1
2D2
eσig +
1
3!D3
eσig +
1
4!D4
eσig + ...
]
Since the sigma points are symmetrically distributed around x, all the odd order terms
add up to zero. This results in the simplification
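\[ \bar{y}_{UT} = g(\bar{x}) + \frac{1}{2(n_x + \lambda)} \sum_{i=1}^{2n_x} \left[ \frac{1}{2} D^2_{\tilde{\sigma}_i} g + \frac{1}{4!} D^4_{\tilde{\sigma}_i} g + \ldots \right] \tag{4.12} \]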
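By rewriting the operator
\[ D^2_{\delta x} g = \left[ (\delta x^T \Delta_x)^2 g(x) \right]_{x=\bar{x}} = (\Delta_x g)^T \delta x \delta x^T (\Delta_x g) \]
we have
\[
\begin{aligned}
\frac{1}{2(n_x + \lambda)} \sum_{i=1}^{2n_x} \frac{1}{2} D^2_{\tilde{\sigma}_i} g
&= \frac{1}{2(n_x + \lambda)} (\Delta_x g)^T \, \frac{1}{2} \left[ \sum_{i=1}^{2n_x} \sqrt{n_x + \lambda} \, (\sigma_i \sigma_i^T) \sqrt{n_x + \lambda} \right] (\Delta_x g) \\
&= \frac{n_x + \lambda}{2(n_x + \lambda)} (\Delta_x g)^T \, \frac{1}{2} \left[ \sum_{i=1}^{2n_x} \sigma_i \sigma_i^T \right] (\Delta_x g) \\
&= \frac{1}{2} \left[ (\Delta_x^T P_x \Delta_x) g(x) \right]_{x=\bar{x}}.
\end{aligned}
\tag{4.13}
\]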
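The UT predicted mean can be further simplified to
\[ \bar{y}_{UT} = g(\bar{x}) + \frac{1}{2} \left[ (\Delta_x^T P_x \Delta_x) g(x) \right]_{x=\bar{x}} + \frac{1}{2(n_x + \lambda)} \sum_{i=1}^{2n_x} \left[ \frac{1}{4!} D^4_{\tilde{\sigma}_i} g + \frac{1}{6!} D^6_{\tilde{\sigma}_i} g + \ldots \right]. \tag{4.14} \]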
Comparing Eqs. (4.10) and (4.14), we note that the true posterior mean and the mean
calculated by UT agree exactly up to third order (assuming a symmetric prior) and that
errors are introduced only in the fourth and higher-order terms. The magnitude
of these errors depends on the choice of the composite scaling parameter λ as well as
the higher-order derivatives of the nonlinear transformation g.
In contrast, the linearization approach calculates the posterior mean as
\[ \bar{y}_{LIN} = g(\bar{x}) \tag{4.15} \]
which only agrees with the true mean up to first order. Julier and Uhlmann [81]
show that, on a term-by-term basis, the errors in the higher-order terms of the UT are
consistently smaller than those for linearization.
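The comparison of Eqs. (4.10), (4.14) and (4.15) can be checked numerically on a case where the true mean is known in closed form. The following is a minimal sketch, not taken from the thesis: it assumes a scalar Gaussian input, the test function g(x) = exp(x) (whose mean is exp(µ + σ²/2)), and the illustrative value λ = 2; all variable names are hypothetical.

```python
import numpy as np

# Assumed illustrative setting: x ~ N(mu, sigma^2), g(x) = exp(x).
mu, sigma = 0.0, 0.5
n_x, lam = 1, 2.0            # lam = 2 is an assumed choice for this sketch

# Three sigma points and their weights (w0 = lam/(n_x+lam), others 1/(2(n_x+lam))).
spread = np.sqrt(n_x + lam) * sigma
sigma_pts = np.array([mu, mu + spread, mu - spread])
weights = np.array([lam, 0.5, 0.5]) / (n_x + lam)

true_mean = np.exp(mu + 0.5 * sigma**2)             # closed form for a lognormal mean
ut_mean = weights @ np.exp(sigma_pts)               # UT predicted mean
lin_mean = np.exp(mu)                               # linearization: g evaluated at the mean
second_order = np.exp(mu) * (1.0 + 0.5 * sigma**2)  # Taylor series cut after the 2nd-order term

print(f"true {true_mean:.6f}  UT {ut_mean:.6f}  "
      f"2nd-order {second_order:.6f}  linearized {lin_mean:.6f}")
```

In this setting the linearized mean misses the whole second-order correction, whereas the UT error comes only from the higher-order terms, so it is far smaller.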
4.2.2 Accuracy of the Covariance
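The true posterior covariance is given by
\[ P_y = E[(y - \bar{y})(y - \bar{y})^T] \tag{4.16} \]
where the expectation is taken over the distribution of y. Substituting Eqs. (4.6) and
(4.9), the realization of the (posterior) state error is
\[
\begin{aligned}
y - \bar{y} = g[\bar{x} + \delta x] - \bar{y} &= D_{\delta x} g + \frac{1}{2} D^2_{\delta x} g + \frac{1}{3!} D^3_{\delta x} g + \frac{1}{4!} D^4_{\delta x} g + \ldots \\
&\quad - E\left[ D_{\delta x} g + \frac{1}{2} D^2_{\delta x} g + \frac{1}{3!} D^3_{\delta x} g + \frac{1}{4!} D^4_{\delta x} g + \ldots \right]
\end{aligned}
\tag{4.17}
\]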
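Recalling the symmetry of δx, the expected values of all odd-order terms in δx
evaluate to zero. Using $y - \bar{y}$ in Eq. 4.16, the true posterior covariance is
\[
\begin{aligned}
P_y &= E\left[ D_{\delta x} g (D_{\delta x} g)^T + \frac{1}{3!} D_{\delta x} g (D^3_{\delta x} g)^T + \frac{1}{2 \times 2!} D^2_{\delta x} g (D^2_{\delta x} g)^T + \frac{1}{3!} D^3_{\delta x} g (D_{\delta x} g)^T \right] \\
&\quad - E\left[ \frac{1}{2!} D^2_{\delta x} g \right] E\left[ \frac{1}{2!} D^2_{\delta x} g \right]^T + \ldots
\end{aligned}
\tag{4.18}
\]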
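We note that
\[ D_{\delta x} g = \left[ (\delta x^T \Delta_x) g(x) \right]_{x=\bar{x}} = A_g \delta x \tag{4.19} \]
where $A_g = (\Delta_x^T g)|_{x=\bar{x}}$ is the Jacobian matrix of g(x) evaluated at $\bar{x}$. Substituting
the expectation of the second-order term given in Eq. 4.13, we can rewrite
Eq. 4.18 as
\[
\begin{aligned}
P_y &= A_g P_x A_g^T + E\left[ \frac{1}{3!} D_{\delta x} g (D^3_{\delta x} g)^T + \frac{1}{2 \times 2!} D^2_{\delta x} g (D^2_{\delta x} g)^T + \frac{1}{3!} D^3_{\delta x} g (D_{\delta x} g)^T \right] \\
&\quad - \left[ \left( \frac{1}{2!} (\Delta_x^T P_x \Delta_x) \right) g \right] \left[ \left( \frac{1}{2!} (\Delta_x^T P_x \Delta_x) \right) g \right]^T
\end{aligned}
\tag{4.20}
\]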
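The UT predicts the covariance using Eq. 4.3, which requires the values of $\mathcal{Y}_i - \bar{y}$
and $\mathcal{Y}_0 - \bar{y}$. Using Eqs. (4.11) and (4.12), these values are given by
\[
\begin{aligned}
\mathcal{Y}_i - \bar{y} &= D_{\tilde{\sigma}_i} g + \frac{1}{2} D^2_{\tilde{\sigma}_i} g + \frac{1}{3!} D^3_{\tilde{\sigma}_i} g + \frac{1}{4!} D^4_{\tilde{\sigma}_i} g + \ldots \\
&\quad - \frac{1}{2(n_x + \lambda)} \sum_{j=1}^{2n_x} \left[ D_{\tilde{\sigma}_j} g + \frac{1}{2} D^2_{\tilde{\sigma}_j} g + \frac{1}{3!} D^3_{\tilde{\sigma}_j} g + \frac{1}{4!} D^4_{\tilde{\sigma}_j} g + \ldots \right]
\end{aligned}
\tag{4.21}
\]
\[ \mathcal{Y}_0 - \bar{y} = - \frac{1}{2(n_x + \lambda)} \sum_{i=1}^{2n_x} \left[ D_{\tilde{\sigma}_i} g + \frac{1}{2} D^2_{\tilde{\sigma}_i} g + \frac{1}{3!} D^3_{\tilde{\sigma}_i} g + \frac{1}{4!} D^4_{\tilde{\sigma}_i} g + \ldots \right] \tag{4.22} \]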
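Noting that (by using Eq. 4.19)
\[ \frac{1}{2(n_x + \lambda)} \sum_{i=1}^{2n_x} D_{\tilde{\sigma}_i} g (D_{\tilde{\sigma}_i} g)^T = \frac{1}{2(n_x + \lambda)} \sum_{i=1}^{2n_x} A_g \tilde{\sigma}_i \tilde{\sigma}_i^T A_g^T = A_g P_x A_g^T, \tag{4.23} \]
and that all odd-order terms sum to zero owing to the symmetry of the sigma points, the
UT predicted covariance, $(P_y)_{UT} = \sum_{i=0}^{2n_x} w_i^{(c)} (g(\mathcal{X}_i) - \bar{y})(g(\mathcal{X}_i) - \bar{y})^T$,
can be expanded using Taylor series and algebraically simplified [81] to
\[
\begin{aligned}
(P_y)_{UT} &= A_g P_x A_g^T \\
&\quad + \frac{1}{2(n_x + \lambda)} \sum_{i=1}^{2n_x} \left[ \frac{1}{3!} D_{\tilde{\sigma}_i} g (D^3_{\tilde{\sigma}_i} g)^T + \frac{1}{2 \times 2!} D^2_{\tilde{\sigma}_i} g (D^2_{\tilde{\sigma}_i} g)^T + \frac{1}{3!} D^3_{\tilde{\sigma}_i} g (D_{\tilde{\sigma}_i} g)^T \right] \\
&\quad - \left[ \left( \frac{1}{2!} (\Delta_x^T P_x \Delta_x) \right) g \right] \left[ \left( \frac{1}{2!} (\Delta_x^T P_x \Delta_x) \right) g \right]^T
\end{aligned}
\tag{4.24}
\]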
Comparing Eqs. 4.20 and 4.24, we note that the UT calculates the posterior covariance
accurately to two terms in the Taylor series, with errors introduced only in the fourth
and higher-order moments. However, if the prior is non-symmetric, the UT mean
and covariance still agree exactly up to the second-order terms as described above
but errors are introduced from the third and higher-order terms in the Taylor series
expansion [81]. We consider only symmetric priors in our application.
The linearization algorithm predicts the covariance using
\[ (P_y)_{LIN} = A_g P_x A_g^T \tag{4.25} \]
which is the true series truncated after the first term. Julier et al. [81] show that
although both approaches predict the covariance correctly up to the second order, the
absolute errors in the fourth and higher-order terms for UT are smaller.
4.3 ILLUSTRATION OF UT
In this section, we illustrate the application of the unscented transformation. The
aim is to calculate the first two moments of the posterior (transformed sigma points),
compare them with the true analytical results when available, or otherwise with
Monte Carlo simulations, and also compare them with linearization.
Example 1: Consider propagating a scalar random variable x through the simple
nonlinear transformation $y = g(x) = x^2$, where x is normally distributed as $N(\bar{x}, \sigma_x^2)$.
For a random perturbation δx, $x = \bar{x} + \delta x$.
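• We have the analytical mean of y as $\bar{y} = E[x^2] = \bar{x}^2 + \sigma_x^2$.
• The squared error for the realization is
  $(y - \bar{y})^2 = \left( (\bar{x} + \delta x)^2 - (\bar{x}^2 + \sigma_x^2) \right)^2 = (\delta x)^4 + 4\bar{x}(\delta x)^3 + (4\bar{x}^2 - 2\sigma_x^2)(\delta x)^2 - 4\sigma_x^2 \bar{x}\,\delta x + \sigma_x^4$.
• Taking the expectation gives the true covariance as $\sigma_y^2 = E[(\delta x)^4] + 4\bar{x}^2\sigma_x^2 - \sigma_x^4$,
  where the first term is the fourth central moment (the kurtosis term). From the moment generating function, it
  can be shown to be $3\sigma_x^4$. Therefore, the true covariance is $\sigma_y^2 = 2\sigma_x^4 + 4\bar{x}^2\sigma_x^2$.
• The linearized mean is $\bar{y}_{LIN} = g(\bar{x}) = \bar{x}^2$.
• The linearization algorithm predicts the covariance (from Eq. 4.25) as
  $(\sigma_y^2)_{LIN} = 4\bar{x}^2\sigma_x^2$.
• For the unscented transform, we have:
  – Prior sigma points: $\mathcal{X} = \{\bar{x},\ \bar{x} - \sigma,\ \bar{x} + \sigma\}$ with $\sigma^2 = (n_x + \lambda)\sigma_x^2$.
  – Weights: $w_0 = \lambda/(n_x + \lambda)$ and $w_1 = w_2 = 1/(2(n_x + \lambda))$.
  – Transformed sigma points $\mathcal{Y}_i = \mathcal{X}_i^2$ yield
    $\mathcal{Y} = \{\bar{x}^2,\ \bar{x}^2 + \sigma^2 - 2\bar{x}\sigma,\ \bar{x}^2 + \sigma^2 + 2\bar{x}\sigma\}$.
  – The UT mean is
    $\bar{y}_{UT} = w_0\mathcal{Y}_0 + w_1\mathcal{Y}_1 + w_2\mathcal{Y}_2 = \frac{\lambda}{n_x + \lambda}\bar{x}^2 + \frac{1}{2(n_x + \lambda)}\, 2(\bar{x}^2 + \sigma^2) = \bar{x}^2 + \sigma_x^2$.
  – The UT covariance (using Eq. 4.3, with $n_x = 1$) is
    $(\sigma_y^2)_{UT} = \frac{\lambda}{1 + \lambda}(\mathcal{Y}_0 - \bar{y})^2 + \frac{1}{2(1 + \lambda)} \sum_{i=1}^{2n_x} (\mathcal{Y}_i - \bar{y})^2 = \lambda\sigma_x^4 + 4\bar{x}^2\sigma_x^2$.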
When the linearized mean and covariance are compared with those of the true system,
it can be seen that the linearization assumption eliminates significant terms in both the mean
and the covariance. This leads to a biased mean and under-prediction of the covariance.
As can be seen, the UT-predicted mean is independent of λ (the scaling parameter
of UT) and is exactly the same as the true mean. Since the UT introduces errors
both in mean and covariance only from the fourth-order term, the error in prediction
is affected by λ only from that term onwards. To evaluate the covariance
predicted by UT, we must specify the value of λ. It agrees with the true covariance if
we choose λ = 2 (see footnote 3). For this value, both the mean and the covariance are exactly
the same as the true values.
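Example 1 can be reproduced numerically. The sketch below is illustrative only: the function name `unscented_transform_scalar` and the numerical values of the prior mean and variance are assumptions, while the sigma points and weights follow the bullet list above.

```python
import numpy as np

def unscented_transform_scalar(g, x_mean, x_var, lam):
    """Three-point UT of Example 1 for a scalar random variable (n_x = 1)."""
    n_x = 1
    spread = np.sqrt((n_x + lam) * x_var)               # sigma in the text
    sigma_pts = np.array([x_mean, x_mean - spread, x_mean + spread])
    weights = np.array([lam, 0.5, 0.5]) / (n_x + lam)   # w0, w1, w2
    y_pts = g(sigma_pts)                                # transformed sigma points
    y_mean = np.dot(weights, y_pts)
    y_var = np.dot(weights, (y_pts - y_mean) ** 2)
    return y_mean, y_var

# Illustrative prior (values are assumptions, not from the text).
x_mean, x_var = 1.5, 0.4

ut_mean, ut_var = unscented_transform_scalar(np.square, x_mean, x_var, lam=2.0)
true_mean = x_mean**2 + x_var                            # analytical mean
true_var = 2.0 * x_var**2 + 4.0 * x_mean**2 * x_var      # analytical covariance
lin_mean = x_mean**2                                     # linearized mean
lin_var = 4.0 * x_mean**2 * x_var                        # linearized covariance

print("mean:", true_mean, ut_mean, lin_mean)
print("var :", true_var, ut_var, lin_var)
```

Calling the function with a different λ leaves the mean unchanged and changes only the λσ_x⁴ part of the covariance, in line with the closed-form expressions above.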
Example 2: Polar-to-Cartesian transformation. We now consider a common
nonlinear transformation to illustrate the effectiveness of UT [121]. Let range r and
angle θ of the polar coordinates be independent and normally distributed random variables,
i.e., $r \sim N(1, 0.02^2)$ and $\theta \sim N(\pi/2, (\pi/12)^2)$. The transformation equations
that relate the polar coordinates to the Cartesian coordinates are x = r cos(θ) and
y = r sin(θ).
Since we have two prior random variables (r and θ), the prior augmented mean is
$\bar{\Theta} = [1 \;\; \pi/2]^T$ and the augmented covariance matrix is
\[ P_\Theta = \begin{bmatrix} 0.02^2 & 0 \\ 0 & (\pi/12)^2 \end{bmatrix}. \]
In this example, since the true posterior mean and covariances cannot be evaluated
analytically, we compare the performance of UT and linearization with respect to a
Monte Carlo sampling-based approach. We generate a large number ($10^5$) of samples
with the given prior distributions for range and angle, and propagate them through
the transformations to arrive at the posterior samples (Fig. 4.3 (a)). The MC mean and
variance can be calculated as the (posterior) sample mean and variance, respectively.
For applying UT, the sigma point selection scheme Eq. (4.1) is employed on the
augmented prior mean and covariances. Since $n_\Theta = 2$, each sigma point will have
dimension 2 and the number of sigma points for UT is $2n_\Theta + 1 = 5$. The resulting
posterior sigma points are shown in Fig. 4.3 (b). Then, the UT mean and variance of
the posterior RVs are predicted using Eqs. (4.2) and (4.3), respectively.
As discussed in section 4.2, linearization produces the posterior mean and covariance
equal to the first term in the Taylor series expansion of the posterior mean
and covariance, respectively. Thus, the posterior linearized mean is $g(\bar{\Theta})$ and the
posterior covariance matrix is $\Delta g \, P_\Theta \, \Delta g^T$, where g is the nonlinear (vector) function
relating z = [x; y] with r and θ, and $\Delta g$ is the Jacobian of g with respect to (r, θ) evaluated at $\bar{\Theta}$.

[Figure: (a) cloud of posterior samples by Monte Carlo simulation; (b) transformed sigma points by UT. Axes: x (horizontal), y (vertical).]
Fig. 4.3: Posterior samples (a) Monte Carlo, and (b) UT.

Footnote 3: Justification for this choice of λ can be found in [81].
With the three methods, the calculated posterior moments in the y-direction are
given as follows:
MC: posterior mean of y = 0.9672 and covariance $P_{yy} = \begin{bmatrix} 0.0631 & 0 \\ 0 & 0.0260 \end{bmatrix}$.
UT: posterior mean of y = 0.9667 and covariance $P_{yy} = \begin{bmatrix} 0.0625 & 0 \\ 0 & 0.0380 \end{bmatrix}$.
Linearization: mean of y = 1.0000 and covariance $P_{yy} = \begin{bmatrix} 0.0685 & 0 \\ 0 & 0.0004 \end{bmatrix}$.
Figure 4.4 shows the mean and 1-σ contours for each of these methods. The
1-σ contour is the locus of points $\{ y : (y - \bar{y})^T P_{yy}^{-1} (y - \bar{y}) = 1 \}$ and is a graphical
representation of the size and orientation of $P_{yy}$ [130,131]. As can be seen, the linearized
transformation is biased and inconsistent. This is most severe in the range direction,
where linearization estimates that the position is 1.00 m whereas in reality it is 0.967 m.
Since it is a bias which arises from the transformation process itself, the same error
with the same sign will be committed each time a coordinate transformation takes
place.

[Figure: 1-σ contours in the x-y plane for linearization, MC, and UT.]
Fig. 4.4: Posterior mean and uncertainty (1-σ) contour of covariance, determined by the MC approach, i.e., true values (mean at '*'), linearization (mean at '+'), and unscented transformation (mean almost coincides with the MC mean).

Even if there were no bias, the transformation is inconsistent, since the 1-σ
contour of covariance obtained from linearization is significantly different from the MC
covariance. In comparison, the UT mean (0.9667) is quite close to the MC estimate, and
the UT 1-σ covariance contour is very similar to that of MC propagation. We note
that only 5 sigma points are required in UT as against $10^5$ samples for the MC method,
which works based on the laws of large numbers (to propagate the whole pdf).
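The three approaches of Example 2 can be compared side by side with a short script. This is a sketch under stated assumptions: the scaling parameter λ = 1 (so that n + λ = 3), the random seed, and all function names are illustrative choices, so the printed values need not coincide with the figures quoted above.

```python
import numpy as np

def polar_to_cartesian(points):
    """Map rows of [r, theta] to rows of [x, y]."""
    r, theta = points[:, 0], points[:, 1]
    return np.column_stack((r * np.cos(theta), r * np.sin(theta)))

def unscented_transform(g, mean, cov, lam):
    """Propagate (mean, cov) through g using 2n+1 symmetric sigma points."""
    n = mean.size
    root = np.linalg.cholesky((n + lam) * cov)        # columns: scaled sigma vectors
    sigma_pts = np.vstack((mean, mean + root.T, mean - root.T))
    weights = np.full(2 * n + 1, 0.5 / (n + lam))
    weights[0] = lam / (n + lam)
    y_pts = g(sigma_pts)
    y_mean = weights @ y_pts
    dev = y_pts - y_mean
    return y_mean, (weights[:, None] * dev).T @ dev

mean = np.array([1.0, np.pi / 2])                     # prior mean of [r, theta]
cov = np.diag([0.02**2, (np.pi / 12) ** 2])           # prior covariance

# Unscented transformation (lam = 1 is an assumed value).
ut_mean, ut_cov = unscented_transform(polar_to_cartesian, mean, cov, lam=1.0)

# Linearization: evaluate g and its Jacobian at the prior mean.
r0, th0 = mean
jac = np.array([[np.cos(th0), -r0 * np.sin(th0)],
                [np.sin(th0),  r0 * np.cos(th0)]])
lin_mean = polar_to_cartesian(mean[None, :])[0]
lin_cov = jac @ cov @ jac.T

# Monte Carlo reference with 10^5 samples (seed chosen arbitrarily).
rng = np.random.default_rng(0)
samples = rng.normal(mean, np.sqrt(np.diag(cov)), size=(100_000, 2))
mc_xy = polar_to_cartesian(samples)
mc_mean, mc_cov = mc_xy.mean(axis=0), np.cov(mc_xy, rowvar=False)

for name, m in (("MC", mc_mean), ("UT", ut_mean), ("LIN", lin_mean)):
    print(f"{name:3s} mean of y = {m[1]:.4f}")
```

Because r and θ are independent Gaussians, the exact mean of y here is exp(-(π/12)²/2), which gives an independent check on both the MC and the UT values.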
As observed, linearization predicts the posterior moments of a nonlinearly transformed
random variable by truncating their (posterior) Taylor series expansion after
the first two terms for the mean and after the first term for the covariance [81, 112].
This can introduce significant errors in prediction for many frequently occurring
nonlinearities [84, 132]. When the first two moments required for the Kalman filter are
predicted by linearization, the result is the most well-known application of the KF
framework to nonlinear systems, known as the extended Kalman filter (EKF). Even though
the EKF is one of the most widely used approximate solutions for nonlinear estimation
and filtering, it has serious limitations [81, 122].
On the other hand, predicting the posterior moments with Monte Carlo methods
requires a very large number (for example, of the order of $10^3$ to $10^5$) of random samples
(generated from the prior distribution) in an attempt to accurately propagate the
probability density function through the nonlinear models. The Monte Carlo approximation
of the optimal Bayesian estimator [41, 83], known as the particle filter, also
provides a solution (that is more accurate than the linearization used in EKF) to the
nonlinear estimation problem [59,133]. However, it is computationally very intensive.
4.4 UT-BASED EXTENSION OF THE KALMAN FILTER
Julier and Uhlmann [81] investigated a reliable nonlinear extension of the Kalman
filter and proposed the UT to predict the means and covariances arising from nonlinear
transformations, which serve as sufficient statistics in a Kalman framework. They
demonstrated superior performance of the new filter, referred to as the unscented
Kalman filter (UKF), over the widely used EKF on several highly nonlinear 1D
estimation problems [81, 132, 134]. Later, Wan and van der Merwe applied the UKF
to nonlinear parameter estimation [84] and dual (simultaneous state and parameter)
estimation [129]. Its utility as a smoother was demonstrated in the 'Expectation step'
of the EM algorithm for efficient training of neural networks, and then for the subsequent
design of a dual unscented filter. Van der Merwe et al. [135] further developed
a computationally efficient and numerically robust square root implementation of the
UKF. The unscented transformation with a reduced number of sigma points is proposed
in [136] and its use has been extended to more accurately estimate the means and
covariances in [137, 138]. The sigma point approaches such as the UKF and the CDF
[127] are classified in a unified general family of derivative-free Kalman filters for
nonlinear estimation in [128]. The use of the UKF and its Gaussian mixtures as a proposal
function in particle filters is reported in [42, 83] and [122,139], respectively.
A detailed discussion on the utilities of UT, including its application to discontinuous
transformations and multi-sensor fusion, is given in [125]. Since its invention,
it has been applied as a more accurate, simple, and stable replacement for the widely
used extended Kalman filter in diverse applications [85, 140]. The similarity of the
UKF with the statistical linear regression based Kalman filter (LRKF) is discussed
in [141] to provide useful insights into the properties of the UKF. The UKF is also
analysed from a Bayesian perspective in [142] and is suggested as a better alternative
to EKF and even to Monte Carlo filters for simple models. In vision, its application
to visual tracking has been explored in [86, 87, 143].
The UKF inherits the Kalman filter structure; however, it employs the UT to
propagate the first two moments of the state random variable through non-linear state
and measurement transformations [81, 83, 121]. As in the (extended) Kalman filter,
the state distribution is approximated by a Gaussian random variable (GRV) but is
now represented using a minimal set of carefully chosen sigma points. These sample
points completely capture the true mean and covariance of the prior GRV, and, when
propagated through the true nonlinear system, capture the posterior mean and covari-
ance accurately to second order (third order for symmetric priors) in the Taylor series
expansion for any nonlinearity [81].
State estimation in a general recursive framework employs a nonlinear/non-additive
state space model to describe the dynamics of the system. The evolution of the state
(the quantity to be estimated) is described by a state equation
\[ x_k = f(x_{k-1}, u_k) \tag{4.26} \]
and the degradation that relates the observation to the state is modeled as the
observation equation
\[ y_k = h(x_k, v_k) \tag{4.27} \]
where $u_k$ is the state noise and $v_k$ is the noise in the observations.
The UKF results from applying the UT to recursive minimum mean-square error
(MMSE) estimation [121, 122]. Kalman derived [7] the recursive form of the linear
Bayesian update of the state conditional mean, $\hat{x}_k = E[x_k \mid y_1, y_2, \ldots, y_k]$, and its covariance,
$P_k = E[(x_k - \hat{x}_{k/k-1})(x_k - \hat{x}_{k/k-1})^T]$, as
\[ \hat{x}_k = (\text{Prediction of } x_k) + (\text{Gain})_k \times [\, y_k - (\text{Prediction of } y_k) \,] \]
While this is a linear recursion, we need not assume linearity of the model. For MMSE,
the terms in this recursion are given by
\[ \hat{x}_{k/k-1} = E[f(\hat{x}_{k-1}, u_k)], \qquad \hat{y}_{k/k-1} = E[h(\hat{x}_{k/k-1}, v_k)] \tag{4.28} \]
\[ P_{xy} = E[(x_k - \hat{x}_{k/k-1})(y_k - \hat{y}_{k/k-1})^T], \qquad P_{yy} = E[(y_k - \hat{y}_{k/k-1})(y_k - \hat{y}_{k/k-1})^T] \tag{4.29} \]
\[ K_k = P_{xy} P_{yy}^{-1} \tag{4.30} \]
where the optimal prediction (prior mean at time k) of $x_k$ is written as $\hat{x}_{k/k-1}$ while
that of $y_k$ is represented as $\hat{y}_{k/k-1}$. The optimal gain term $K_k$ is expressed as a function
of the expected cross-correlation matrix of the state and observation prediction errors,
and the expected auto-correlation matrix of the observation prediction error.
Thus, to apply the Kalman filter to nonlinear/non-Gaussian situations, we only
require the first two moments of the state and the measurement random variables to
be predicted as accurately as possible. The resulting equations are the KF update
equations and assume the form [81,122]
\[
\begin{aligned}
\hat{x}_k &= \hat{x}_{k/k-1} + K_k (y_k - \hat{y}_{k/k-1}) \\
P_k &= P_{k/k-1} - K_k P_{yy} K_k^T.
\end{aligned}
\tag{4.31}
\]
Based on this observation, Julier et al. [121] formulate the problem of applying the KF
to nonlinear systems. The means and covariances in the above equations are predicted
using UT. We illustrate the application of UKF to nonlinear systems by a 1D scalar
state estimation example [83].
Example 3: Consider a time series generated by the process model
\[ x_k = 1 + \sin(w \pi k) + k_1 x_{k-1} + u_k \tag{4.32} \]
where $u_k$ is a zero-mean Gaussian random variable with variance 15 modeling the process
noise, and w = 0.04 and $k_1 = 0.5$ are scalar parameters. When the state $x_k$ is subjected to a
non-stationary observation model, we have the degraded observations
\[ y_k = \begin{cases} k_2 x_k^2 + v_k, & k \le 30 \\ k_3 x_k - 2 + v_k, & k > 30 \end{cases} \tag{4.33} \]
with $k_2 = 0.2$, $k_3 = 0.5$, and $v_k$ Gaussian distributed as N(0, 0.5). Given only the
noisy observations $y_k$, the problem is to estimate the underlying clean state sequence
$x_k$ for k = 1...60.
We show how the UKF can be used to track the state sequence $x_k$. The state
random variable is augmented with the noise variables [81] as $x^a_k = [x_k; u_k; v_k]$, having
dimension $n_a = n_x + n_u + n_v$. We represent the mean and covariance of the state
as $\hat{x}_k$ and $P_k$, respectively. The mean and covariance of this augmented vector become
\[ \hat{x}^a_{k-1} = [\hat{x}_{k-1}; 0; 0], \qquad P^a_{k-1} = \begin{bmatrix} P_{k-1} & 0 & 0 \\ 0 & \sigma_u^2 & 0 \\ 0 & 0 & \sigma_v^2 \end{bmatrix} \]
The scaled UT sigma point selection scheme (Eq. 4.1) is applied to the mean and
covariance matrices of this new augmented state RV (assuming an initial mean and
covariance for the state and known noise statistics) to calculate the corresponding
sigma matrix $\mathcal{X}^a$.
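The augmented sigma point matrix $\mathcal{X}^a$ is partitioned [81] as $\mathcal{X}^a = [\mathcal{X}^x; \mathcal{X}^u; \mathcal{X}^v]$,
i.e., the first $n_x$ rows, the subsequent $n_u$ rows and the last $n_v$ rows are the sigma point matrices
corresponding to the state, state noise and the observation noise, respectively. Each
component of these sigma points is independently propagated according to the state
equation (Eq. 4.32) as
\[ \mathcal{X}^x_{k/k-1} = f(\mathcal{X}^x_{k-1}, \mathcal{X}^u_{k-1}) = 1 + \sin(w \pi k) + k_1 \mathcal{X}^x_{k-1} + \mathcal{X}^u_{k-1} \tag{4.34} \]
and then through the measurement equation (Eq. 4.33) as
\[ \mathcal{Y}_{k/k-1} = h(\mathcal{X}^x_{k/k-1}, \mathcal{X}^v_k) = \begin{cases} k_2 \left( \mathcal{X}^x_{k/k-1} \right)^2 + \mathcal{X}^v_k, & k \le 30 \\ k_3 \mathcal{X}^x_{k/k-1} - 2 + \mathcal{X}^v_k, & k > 30 \end{cases} \tag{4.35} \]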
[Figure: signal values versus time step; legend: true signal, observations, UKF estimate.]
Fig. 4.5: Signal estimation: (-) original, (- . -) observations, and (- - -) UKF estimates.
In UKF, the means and covariances required for each recursive Kalman filtering
update step are computed from the corresponding predicted state and/or measurement
sigma points employing Eqs. (4.2) and (4.3) as $\bar{z} = \sum_{i=0}^{2n_a} w_i^{(\mu)} \mathcal{Z}_i$ and
$P_{zz} = \sum_{i=0}^{2n_a} w_i^{(c)} (\mathcal{Z}_i - \bar{z})(\mathcal{Z}_i - \bar{z})^T$, respectively, where $n_a$ is the dimension of the
augmented state. Here, the sigma matrix $\mathcal{Z}$ assumes $\mathcal{X}^x$
for the state and $\mathcal{Y}$ for the measurement. These predicted statistics along with the recent
observation are used in the Kalman update equations (Eq. 4.31) to arrive at the final
estimate of the original signal. Figure 4.5 shows the original time series, its observations
and the UKF estimates. From the plot, we note that the UKF estimates are quite
close to the original. A comparison of the accuracy of the UKF estimates with those
of the EKF (for the same observations) is shown in Fig. 4.6. We note that in the first
half-interval of the signal, where the observation model is nonlinear, the performance of
the UKF is clearly superior to that of the EKF, while they perform comparably in the
linear zone. The MSE for the EKF estimate is 0.52 while for the UKF it is only 0.29.
[Figure: signal values versus time step; legend: true signal, EKF estimate, UKF estimate.]
Fig. 4.6: Signal estimation: (—) original, (- . -) EKF, and (- - -) UKF estimates.
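The recursion of Example 3 can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the implementation used for Figs. 4.5 and 4.6: the scaling parameter λ = 0, the initial state statistics, the random seed, and all names (`ukf_step`, `W_FREQ`, and so on) are hypothetical choices, so the resulting errors need not match the MSE values quoted above.

```python
import numpy as np

W_FREQ, K1, K2, K3 = 0.04, 0.5, 0.2, 0.5      # parameters of Eqs. (4.32)-(4.33)
VAR_U, VAR_V = 15.0, 0.5                      # process / observation noise variances

def f(x_prev, u, k):
    """State equation (Eq. 4.32)."""
    return 1.0 + np.sin(W_FREQ * np.pi * k) + K1 * x_prev + u

def h(x, v, k):
    """Non-stationary observation equation (Eq. 4.33)."""
    return K2 * x**2 + v if k <= 30 else K3 * x - 2.0 + v

def ukf_step(x_est, p_est, y_obs, k, lam=0.0):
    """One UKF recursion for the scalar state, augmented with [u, v]."""
    n_a = 3
    mean_a = np.array([x_est, 0.0, 0.0])
    cov_a = np.diag([p_est, VAR_U, VAR_V])
    root = np.linalg.cholesky((n_a + lam) * cov_a)
    # Columns of `sigma` are the 2*n_a + 1 augmented sigma points.
    sigma = np.column_stack((mean_a, mean_a[:, None] + root, mean_a[:, None] - root))
    w = np.full(2 * n_a + 1, 0.5 / (n_a + lam))
    w[0] = lam / (n_a + lam)
    x_pts = f(sigma[0], sigma[1], k)          # Eq. (4.34)
    y_pts = h(x_pts, sigma[2], k)             # Eq. (4.35)
    x_pred, y_pred = w @ x_pts, w @ y_pts
    p_pred = w @ (x_pts - x_pred) ** 2
    p_yy = w @ (y_pts - y_pred) ** 2
    p_xy = w @ ((x_pts - x_pred) * (y_pts - y_pred))
    gain = p_xy / p_yy                        # Eq. (4.30)
    return x_pred + gain * (y_obs - y_pred), p_pred - gain * p_yy * gain  # Eq. (4.31)

# Simulate the time series and its observations (seed and x_0 = 0 are assumptions).
rng = np.random.default_rng(1)
steps = 60
x_true, y_obs = np.zeros(steps + 1), np.zeros(steps + 1)
for k in range(1, steps + 1):
    x_true[k] = f(x_true[k - 1], rng.normal(0.0, np.sqrt(VAR_U)), k)
    y_obs[k] = h(x_true[k], rng.normal(0.0, np.sqrt(VAR_V)), k)

# Run the filter from an assumed initial mean 0 and variance 1.
x_est, p_est = 0.0, 1.0
estimates, variances = [], []
for k in range(1, steps + 1):
    x_est, p_est = ukf_step(x_est, p_est, y_obs[k], k)
    estimates.append(x_est)
    variances.append(p_est)
estimates, variances = np.array(estimates), np.array(variances)
print("MSE over the linear zone:", np.mean((estimates[30:] - x_true[31:]) ** 2))
```

For k > 30 both models are linear, so each step of this sketch coincides with the ordinary Kalman filter update; the difference from linearization arises only in the quadratic regime.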
4.5 DISCUSSION
In this chapter, we introduced the principle of unscented transformation (UT) that has
the capability to transform mean and covariance through nonlinear transformations
and forms the basis for the UKF. We analysed the accuracy of the transformed (first
two) moments using the Taylor series expansion for nonlinear functions and compared
them with the true posterior and linearization following [81, 112, 122, 134]. We gave
simple and useful examples to illustrate the efficacy of the unscented transformation
over linearization and in comparison with analytical and Monte Carlo methods.
CHAPTER 5
NOISE REDUCTION IN PHOTOGRAPHIC IMAGES
5.1 INTRODUCTION
In this chapter, demonstrating the capability of unscented Kalman filter in the 2-
D domain, we first propose an auto-regressive UKF (ARUKF) for non-linear image
estimation. We further propose a novel methodology that reduces noise but pre-
serves edge information by judiciously incorporating a non-Gaussian prior within the
UKF framework through importance sampling (IS). We achieve this by formulating
a discontinuity-adaptive MRF (DAMRF) prior suited for recursive prediction and by
employing IS with a space-varying and heavy-tailed proposal density to estimate the
DAMRF statistics.
We address the specific problem of recovering images degraded by film-grain noise
[4, 55, 59]. This is an important practical example of sensor nonlinearity. The pho-
tographic film is a widely used image recording medium due to its high resolution
capability, exposure latitude, dynamic range, and ready availability [144]. In a photo-
graphic film, the film density is related linearly to the logarithm of the exposure. In
the density domain, the noise is modeled as additive, independent, and white Gaus-
sian. Due to the logarithmic relationship between density and exposure, the film-grain
noise manifests itself as multiplicative non-Gaussian noise in the exposure domain. We
explore the applicability of UKF for film-grain noise removal in photographic images.
5.2 AUTO-REGRESSIVE UKF
In this section, we propose a recursive filter based on the UKF that accounts for image
sensor nonlinearity. The UKF which is based on the UT provides a mathematically
tractable way to propagate the first two moments in the presence of non-linear degra-
dations even in the presence of non-Gaussian noise. This is achieved by augmenting
the state random variable with the (state and observation) noise random variables and
applying UT in the prediction of state and observation [81].
In order to apply the UKF for image estimation, we first model the original image
with an AR state model (as in Eq. 3.2). Adopting the same notation,
$x_{(m,n)}$ denotes a 3 × 1 state vector, u is the driving state noise with zero mean and
variance $\sigma_u^2$, and F is the state transition matrix containing the AR coefficients. For
non-linear/non-additive degradation, the measurement model assumes the general form
\[ y_{(m,n)} = h(x_{(m,n)}, v_{(m,n)}) \tag{5.1} \]
where $y_{(m,n)}$ is the observation at pixel (m, n) of dimension $n_y$, $v_{(m,n)}$ is the corresponding
measurement noise of dimension $n_v$, and h is the functional form of the degradation.
Based on the state and measurement models, we formulate the UKF to estimate the
original image from its degraded observation. We assume that the statistics of the noise
are known (given or estimated from the degraded image). In order to deal with
non-additive noise, we define an augmented vector $x^a_{(m,n)}$ as the concatenation of the state
vector and scalar noise variables, i.e., $x^a_{(m,n)} = [x^T_{(m,n)} \;\; u(m,n) \;\; v(m,n)]^T$ with dimension
$n_a = n_x + n_u + n_v$. For example, for a three-pixel neighborhood, $x_{(m,n)}$ has
dimension three and $x^a_{(m,n)}$ has dimension five. We apply the SUT sigma point selection
scheme (Eq. 4.1) to this augmented vector
to calculate the corresponding augmented sigma point matrix $\mathcal{X}^a_{(m,n)}$ of dimension
$n_a \times (2n_a + 1)$. Since $n_a = 5$, $\mathcal{X}^a_{(m,n)}$ will be of size 5 × 11 and each sigma point is of
dimension five.
We now describe the ARUKF which sequentially estimates the intensity of the
original image at each pixel (m, n) in a raster-scan order from left-to-right. The filter
updates the mean and covariance of the Gaussian approximation to the posterior
distribution of the state as described in the following steps.
Initialize state statistics: $\hat{x}_0 = E[x_0]$; $P_0 = E[(x_0 - \hat{x}_0)(x_0 - \hat{x}_0)^T]$;
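1. Augment state mean and covariance with noise statistics.
\[ \hat{x}^a_{(m,n-1)} = [\hat{x}^T_{(m,n-1)} \;\; 0 \;\; 0]^T, \qquad P^a_{(m,n-1)} = \begin{bmatrix} P_{(m,n-1)} & 0 & 0 \\ 0 & \sigma_u^2 & 0 \\ 0 & 0 & \sigma_v^2 \end{bmatrix} \]
2. Calculate the augmented sigma point matrix (Eq. 4.1), which constitutes the prior
sigma points.
\[ \mathcal{X}^a_{(m,n-1)} = \left[ \hat{x}^a_{(m,n-1)} \quad \hat{x}^a_{(m,n-1)} \pm \sqrt{(n_a + \lambda) P^a_{(m,n-1)}} \right] \]
3. The state sigma points and state noise sigma points are propagated through the
AR state model for prediction.
\[ \mathcal{X}^x_{(m,n)/(m,n-1)} = f(\mathcal{X}^x_{(m,n-1)}, \mathcal{X}^u_{(m,n-1)}) = F \mathcal{X}^x_{(m,n-1)} + \left[ \mathcal{X}^{uT}_{(m,n-1)} \;\; 0^T \;\; 0^T \right]^T \tag{5.2} \]
where $\mathcal{X}^x_{(m,n-1)}$ comprises the sigma points formed from the first $n_x$ rows of
$\mathcal{X}^a_{(m,n-1)}$ while $\mathcal{X}^u_{(m,n-1)}$ is formed from the $n_u$ rows that immediately follow the
first $n_x$ rows in $\mathcal{X}^a_{(m,n-1)}$.
4. The predicted state sigma points are used to determine the predicted mean and
covariances.
\[ \hat{x}_{(m,n)/(m,n-1)} = \sum_{i=0}^{2n_a} w_i^{(\mu)} \mathcal{X}^x_{i,(m,n)/(m,n-1)} \]
\[ P_{(m,n)/(m,n-1)} = \sum_{i=0}^{2n_a} w_i^{(c)} \left[ \mathcal{X}^x_{i,(m,n)/(m,n-1)} - \hat{x}_{(m,n)/(m,n-1)} \right] \left[ \mathcal{X}^x_{i,(m,n)/(m,n-1)} - \hat{x}_{(m,n)/(m,n-1)} \right]^T \]
where i represents the ith sigma point, which is the ith column of $\mathcal{X}^x_{(m,n)/(m,n-1)}$.
5. The sigma points corresponding to the measurement are predicted by propagating
the (predicted) state sigma points and measurement noise sigma points
through the observation non-linearity.
\[ \mathcal{Y}_{(m,n)/(m,n-1)} = h(\mathcal{X}^x_{(m,n)/(m,n-1)}, \mathcal{X}^v_{(m,n-1)}) \]
Here, $\mathcal{X}^v_{(m,n-1)}$ is formed from the last $n_v$ rows of $\mathcal{X}^a_{(m,n-1)}$. This results in a
measurement sigma point matrix $\mathcal{Y}_{(m,n)/(m,n-1)}$ of dimension $n_y \times (2n_a + 1)$.
6. Statistics required for the update are estimated using the predicted sigma points.
\[ \hat{y}_{(m,n)/(m,n-1)} = \sum_{i=0}^{2n_a} w_i^{(\mu)} \mathcal{Y}_{i,(m,n)/(m,n-1)} \]
\[ P_{yy} = \sum_{i=0}^{2n_a} w_i^{(c)} \left[ \mathcal{Y}_{i,(m,n)/(m,n-1)} - \hat{y}_{(m,n)/(m,n-1)} \right] \left[ \mathcal{Y}_{i,(m,n)/(m,n-1)} - \hat{y}_{(m,n)/(m,n-1)} \right]^T \]
\[ P_{xy} = \sum_{i=0}^{2n_a} w_i^{(c)} \left[ \mathcal{X}^x_{i,(m,n)/(m,n-1)} - \hat{x}_{(m,n)/(m,n-1)} \right] \left[ \mathcal{Y}_{i,(m,n)/(m,n-1)} - \hat{y}_{(m,n)/(m,n-1)} \right]^T \]
where $\mathcal{Y}_{i,(m,n)/(m,n-1)}$ represents the ith sigma point, which is the ith column of
$\mathcal{Y}_{(m,n)/(m,n-1)}$.
7. Update as in the Kalman filter:
The Kalman gain is $K_{(m,n)} = P_{xy} P_{yy}^{-1}$.
\[ \hat{x}_{(m,n)} = \hat{x}_{(m,n)/(m,n-1)} + K_{(m,n)} (y_{(m,n)} - \hat{y}_{(m,n)/(m,n-1)}) \]
\[ P_{(m,n)} = P_{(m,n)/(m,n-1)} - K_{(m,n)} P_{yy} K^T_{(m,n)} \]
The estimated intensity at (m, n) is $\hat{s}(m, n) = \hat{x}_{(m,n)}(1)$ (the first component of $\hat{x}_{(m,n)}$).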
We refer to this formulation of the UKF for image estimation as ARUKF. A block
diagram of the ARUKF structure is given in Fig. 5.1.
Fig. 5.1: Auto-regressive unscented Kalman filter (ARUKF).
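Steps 1 to 7 above can be condensed into a single per-pixel update function. The following is a minimal sketch and not the thesis implementation: it assumes a scalar observation ($n_y = 1$), scalar state and measurement noises, λ = 1, and a hypothetical transition matrix `F`; the observation function `density_h` and its parameters `ALPHA`, `BETA` are likewise illustrative placeholders for h.

```python
import numpy as np

def arukf_step(x_est, p_est, y_obs, F, h, var_u, var_v, lam=1.0):
    """One ARUKF pixel update (steps 1-7) for a scalar observation."""
    n_x = x_est.size
    n_a = n_x + 2
    # Step 1: augment mean and covariance with the noise statistics.
    mean_a = np.concatenate((x_est, [0.0, 0.0]))
    cov_a = np.zeros((n_a, n_a))
    cov_a[:n_x, :n_x] = p_est
    cov_a[n_x, n_x], cov_a[n_x + 1, n_x + 1] = var_u, var_v
    # Step 2: augmented sigma points (columns) and UT weights.
    root = np.linalg.cholesky((n_a + lam) * cov_a)
    sigma = np.column_stack((mean_a, mean_a[:, None] + root, mean_a[:, None] - root))
    w = np.full(2 * n_a + 1, 0.5 / (n_a + lam))
    w[0] = lam / (n_a + lam)
    x_pts, u_pts, v_pts = sigma[:n_x], sigma[n_x], sigma[n_x + 1]
    # Step 3: propagate through the AR state model; noise drives the first component.
    x_pred_pts = F @ x_pts
    x_pred_pts[0] += u_pts
    # Step 4: predicted state mean and covariance.
    x_pred = x_pred_pts @ w
    dx = x_pred_pts - x_pred[:, None]
    p_pred = (w * dx) @ dx.T
    # Step 5: propagate through the observation non-linearity.
    y_pts = h(x_pred_pts, v_pts)
    # Step 6: measurement statistics.
    y_pred = y_pts @ w
    dy = y_pts - y_pred
    p_yy = w @ dy**2
    p_xy = (w * dx) @ dy
    # Step 7: Kalman update.
    gain = p_xy / p_yy
    x_new = x_pred + gain * (y_obs - y_pred)
    p_new = p_pred - p_yy * np.outer(gain, gain)
    return x_new, p_new

# Hypothetical AR transition matrix and a density-domain observation model.
F = np.array([[0.5, 0.3, 0.2],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
ALPHA, BETA = 2.0, 0.5                        # assumed film parameters

def density_h(x_pts, v_pts):
    """Assumed h: log-domain sensor acting on the current pixel, additive noise."""
    return ALPHA * np.log10(x_pts[0]) + BETA + v_pts

x_new, p_new = arukf_step(np.array([100.0, 100.0, 100.0]), 4.0 * np.eye(3),
                          4.6, F, density_h, var_u=4.0, var_v=0.01)
print("estimated intensity:", x_new[0])
```

Passing a linear `h` makes this function reproduce the standard Kalman filter update, which is a convenient sanity check before using a nonlinear degradation.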
5.3 IMPORTANCE SAMPLING UKF
The performance of ARUKF can be improved by modeling the image with a non-
Gaussian edge preserving Markov prior as discussed next.
We construct the state conditional pdf using the past pixels in the NSHP support
and the DAMRF model as in ISKF (section 3.3.2)
\[ P(s(m, n)/s(m - i, n - j)) = \exp\left( -\gamma \log\left( 1 + \frac{\eta^2(s(m, n), s)}{\gamma} \right) \right) \tag{5.3} \]
Our objective is to tailor the DA prior for film-grain noise removal in photographic
images and then embed the prior into the UKF framework. As discussed in chapter
2, the energy term η2(s(m, n), s) must be modified to handle the specific class of
image features in a given application. Once the DA conditional prior is formulated,
the update stage of the UKF requires the predicted sigma points to be propagated
through the observation model.
5.3.1 ISUKF for Non-linear Image Estimation
We now extend the basic structure of the UKF to include the non-Gaussian prior.
As in ISKF (section 3.5), we model the image with a DAMRF model and estimate
its mean and variance by importance sampling. However, unlike in the simple ISKF
where we directly propagate the predicted mean and covariances through the Kalman
update equations, we employ them to determine the predicted sigma points. These
sigma points are propagated through the non-linearities, and the required statistics
for the update step of the UKF are computed based on the sigma points (and UT
weights). The update step yields the estimate of the pixel intensity.
The steps involved in the proposed importance sampling UKF (ISUKF) method
are as follows:
1. At pixel (m, n), we construct the (scalar) state conditional pdf p(s(m, n)/s)
using the DAMRF model, given the past NSHP pixels s, as in ISKF (step (1)
of section 3.5). The energy term η2(s(m, n), s) of the DAMRF prior must be
tailored specifically with respect to the given application to improve performance
(as discussed subsequently in sections 5.4.1 and 6.3.1). We note that unlike in
the ARUKF, the dimension of state in ISUKF is not defined by the number of
neighbors. It has to be determined by the number of features and/or pixels that
must be estimated. In our formulation of ISUKF, we estimate only the current
pixel intensity s(m, n) at pixel (m, n) and hence the state is a scalar.
2. From the state conditional prior p(s(m, n)/s), the predicted mean $\mu_p$ and predicted
covariance $\sigma_p^2$ estimates of the scalar state are obtained exactly as in
the ISKF algorithm by importance sampling of the DAMRF prior (step (2) of
section 3.5), using the samples of a space-varying Cauchy density. To achieve
support overlap with the prior at each pixel, the location of the Cauchy sampler
is varied with the mean of the neighboring pixels (s).
3. These estimates are augmented to deal with the non-additive measurement noise
and are used to predict directly the one-step-ahead sigma points (unlike in step
(2) of ARUKF in the earlier section).
\[ \hat{x}_{(m,n)/(m,n-1)} = \mu_p \quad \text{and} \quad P_{(m,n)/(m,n-1)} = \sigma_p^2 \]
\[ \hat{x}^a_{(m,n)/(m,n-1)} = [\hat{x}^T_{(m,n)/(m,n-1)} \;\; 0]^T = [\mu_p \;\; 0]^T \tag{5.4} \]
\[ P^a_{(m,n)/(m,n-1)} = \begin{bmatrix} P_{(m,n)/(m,n-1)} & 0 \\ 0 & \sigma_v^2 \end{bmatrix} = \begin{bmatrix} \sigma_p^2 & 0 \\ 0 & \sigma_v^2 \end{bmatrix} \tag{5.5} \]
Since the state mean and covariance prediction is based directly on the conditional
pdf, we need not augment with the state noise statistics (see footnote 1). This is in contrast
to step (1) of the ARUKF where we need to augment the state mean and
covariances with state and measurement noise statistics so as to propagate the
state and state noise sigma points through the state equation to obtain the
predicted sigma points (step (3) of ARUKF).
We note that while estimating a pixel, the dimension of the augmented vector
$n_a$ in ISUKF is only $1 + n_v = 2$ and is independent of the number of neighbors,
whereas in ARUKF it is $n_x + 2$ (assuming that $n_x$ is the number of neighbors,
and state and measurement noises are scalars).
(a) Use these augmented mean and covariances to determine the (predicted
state, and measurement noise) sigma point matrix using Eq. 4.1.
\[ \mathcal{X}^a_{(m,n)/(m,n-1)} = \left[ \hat{x}^a_{(m,n)/(m,n-1)} \quad \hat{x}^a_{(m,n)/(m,n-1)} \pm \sqrt{(n_a + \lambda) P^a_{(m,n)/(m,n-1)}} \right] \]
(b) Propagate each of the $2n_a + 1$ sigma points corresponding to state and
measurement noise through the observation nonlinearity.
\[ \mathcal{Y}_{(m,n)/(m,n-1)} = h(\mathcal{X}^x_{(m,n)/(m,n-1)}, \mathcal{X}^v_{(m,n-1)}) \]
where $\mathcal{X}^x_{(m,n)/(m,n-1)}$ and $\mathcal{X}^v_{(m,n-1)}$ are formed from the first row of $\mathcal{X}^a_{(m,n)/(m,n-1)}$
and the $n_v$ rows following the first, respectively.
(c) Estimate the statistics of y (from the sigma points) and update following
the same steps as (6) and (7) in the ARUKF (section 5.2).
The estimated pixel intensity is given by $\hat{s}(m, n) = \hat{x}_{(m,n)}$.

Footnote 1: Neighbors s and parameter $\rho^2_{(m,n)}$ used in the DAMRF prior serve as implicit counterparts of the
state noise (or uncertainty).
Thus, s yields the estimated image. This formulation of the UKF with a discontinuity-
adaptive prior is expected to preserve edges better than the ARUKF while allowing
for greater smoothing in uniform regions. A block schematic of the proposed ISUKF
is shown in Fig. 5.2 for ease of understanding.
Fig. 5.2: Importance sampling unscented Kalman filter (ISUKF).
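Steps 3(a) to 3(c) of the ISUKF reduce to a two-dimensional unscented update once the importance-sampled statistics are available. The sketch below takes $\mu_p$ and $\sigma_p^2$ as given inputs (the DAMRF importance sampling of steps 1 and 2 is not reproduced here) and assumes λ = 1; the function name `isukf_update`, the observation model `density_h`, and the parameters `ALPHA`, `BETA` are hypothetical.

```python
import numpy as np

def isukf_update(mu_p, var_p, y_obs, h, var_v, lam=1.0):
    """ISUKF steps 3(a)-(c): scalar state augmented with measurement noise only."""
    n_a = 2
    mean_a = np.array([mu_p, 0.0])                       # Eq. (5.4)
    offsets = np.diag(np.sqrt((n_a + lam) * np.array([var_p, var_v])))  # sqrt of Eq. (5.5), scaled
    # (a) 2*n_a + 1 sigma points as rows of [state, noise].
    sigma = np.vstack((mean_a, mean_a + offsets, mean_a - offsets))
    w = np.full(2 * n_a + 1, 0.5 / (n_a + lam))
    w[0] = lam / (n_a + lam)
    s_pts, v_pts = sigma[:, 0], sigma[:, 1]
    # (b) propagate through the observation nonlinearity.
    y_pts = h(s_pts, v_pts)
    # (c) measurement statistics and Kalman update.
    y_pred = w @ y_pts
    p_yy = w @ (y_pts - y_pred) ** 2
    p_xy = w @ ((s_pts - mu_p) * (y_pts - y_pred))
    gain = p_xy / p_yy
    return mu_p + gain * (y_obs - y_pred), var_p - gain**2 * p_yy

ALPHA, BETA = 2.0, 0.5                                   # assumed film parameters

def density_h(s_pts, v_pts):
    """Assumed observation model: log-domain sensor with additive noise."""
    return ALPHA * np.log10(s_pts) + BETA + v_pts

# mu_p and var_p would come from importance sampling of the DAMRF prior.
s_hat, var_post = isukf_update(mu_p=100.0, var_p=9.0, y_obs=4.52,
                               h=density_h, var_v=0.01)
print("estimated pixel intensity:", s_hat)
```

Only five sigma points are needed per pixel regardless of the neighborhood size, since the neighbors enter solely through the importance-sampled prior statistics.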
5.4 FILM-GRAIN NOISE
A slide, a color negative, or a black-and-white film, contains tiny crystals of silver halide
salts which are light sensitive. When a film is developed, these crystals turn into tiny
filaments of metallic silver conventionally called ‘grain’. The faster the film, the larger
the clumps of silver formed and blobs of dye generated (in color films), and the more
they tend to group together in random patterns and become more visible to the naked
eye. These randomly patterned grains are visibly objectionable in photographic prints
and constitute film-grain noise. This noise also limits resolution since it is determined
by how fast a film reacts to light [144]. There are applications such as in motion pictures
[145] where the presence of film-grain noise is desired. While film-grain enhances the
feel of a motion picture, it makes compression of the content more difficult due to high
entropy as the pattern is noise-like and independent from that of the adjacent frames.
To compress and transmit these images, removal of film-grain noise is a prerequisite.
For the multimedia industry, film-grain noise removal is important for digitisation and
storage of the huge legacy of images and movies of the last several decades [57].
When a photographic film is used as a recording medium, there is a well-known
nonlinear relationship between the incoming light intensity (exposure) and the silver
density deposited on the film. The density D recorded on a photographic film is related
to the logarithm of the exposure E through the D − log E curve of the film [3, 4] as
D(E) = \alpha \log_{10}(E) + \beta    (5.6)
Parameters α and β are the slope and offset derived from the D − log E curve [55].
The domain in which the actual film formation takes place is referred to as the density
domain, and the domain in which the developed image is available for visual consump-
tion is called the exposure domain. In density domain, using the sensor nonlinearity
(Eq. 5.6), we get the noisy observations
r_d(m, n) = \alpha \log_{10}(s(m, n)) + \beta + v(m, n)    (5.7)
where s(m, n) is the original scene, and v(m, n) denotes additive white Gaussian noise
with zero mean and variance \sigma_v^2. The inverse transform of Eq. (5.6), E(D) = 10^{(D-\beta)/\alpha},
gives the image in the exposure domain [4]. Applying this transformation to the density
domain model (Eq. 5.7), we obtain the observation model in the exposure domain as
r_e(m, n) = s(m, n) \, 10^{v(m,n)/\alpha}    (5.8)
Here, r_e(m, n) is the degraded image in the exposure domain, where the noise becomes
multiplicative and non-Gaussian.
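A minimal simulation sketch of Eqs. (5.7) and (5.8) is given below. The function name and the default β = 0 are assumptions made for illustration (the text fixes only α = 5 in the experiments), and the scene s must be strictly positive for the logarithm to exist.

```python
import numpy as np

def film_grain_exposure(s, alpha=5.0, beta=0.0, sigma2_v=0.05, rng=None):
    """Simulate film-grain noise: density-domain image r_d and exposure-domain image r_e."""
    rng = np.random.default_rng() if rng is None else rng
    s = np.asarray(s, dtype=float)
    v = rng.normal(0.0, np.sqrt(sigma2_v), size=s.shape)  # additive white Gaussian noise
    r_d = alpha * np.log10(s) + beta + v                  # Eq. (5.7)
    r_e = 10.0 ** ((r_d - beta) / alpha)                  # inverse of Eq. (5.6)
    return r_d, r_e, v

s = np.full((4, 4), 50.0)
r_d, r_e, v = film_grain_exposure(s, rng=np.random.default_rng(0))
# r_e equals s * 10**(v / alpha), i.e. the multiplicative form of Eq. (5.8)
```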
5.4.1 The Proposed Filters
In this section, we apply ARUKF and ISUKF for image estimation in film-grain noise.
The application of ARUKF and ISUKF for film-grain noise removal requires the prop-
agation of predicted state sigma points through the exposure domain film-grain non-
linearity (Eq. 5.8) with the observation being y(m,n) = re(m, n). Thus, step 5 in the
ARUKF algorithm that predicts the measurement sigma points becomes
Y_{(m,n)/(m,n-1)} = h(X^x_{(m,n)/(m,n-1)}, X^v_{(m,n-1)}) = H X^x_{(m,n)/(m,n-1)} .* 10.^{(X^v_{(m,n-1)}/\alpha)}    (5.9)
Operations .* and .^ represent element-by-element operations (multiplication and
raising to a power, respectively). For a three-pixel NSHP neighborhood in ARUKF, the
state has dimension 3 and H = [1 0 0]. The augmented state dimension is then 5
(three state components plus the scalar state and measurement noises), and
hence 2(5) + 1 = 11 sigma points are used to propagate statistics.
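As a sketch of Eq. (5.9), the measurement sigma points can be computed column-wise as shown below. The function name and the array layout (one sigma point per column) are assumptions for illustration.

```python
import numpy as np

def predict_measurement_sigma_points(Xx, Xv, H, alpha=5.0):
    """Eq. (5.9): Y = (H Xx) .* 10.^(Xv / alpha), one value per sigma point."""
    Xx = np.atleast_2d(np.asarray(Xx, dtype=float))   # shape (n_x, n_sigma)
    Xv = np.asarray(Xv, dtype=float).reshape(-1)      # shape (n_sigma,)
    H = np.asarray(H, dtype=float)                    # shape (n_x,)
    return (H @ Xx) * 10.0 ** (Xv / alpha)

# ARUKF case: 3-dimensional state, 11 sigma points, H = [1 0 0]
Xx = np.arange(33, dtype=float).reshape(3, 11)
Y = predict_measurement_sigma_points(Xx, np.zeros(11), [1.0, 0.0, 0.0])
```

With zero measurement-noise sigma points the output reduces to the first row of Xx, which is the noise-free observation of the current pixel.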
In order to apply the ISUKF, the image prior must be formulated based on the
type of features to be preserved. For edge preservation in photographic images, we
construct the state conditional pdf using first-order NSHP neighbors, i.e., (i, j) ∈ {(0, 1), (1, 0), (1, 1), (1, −1)}, and
\eta^2(s(m, n), s) = \frac{1}{\rho^2_{(m,n)}} \sum_{(i,j)} \frac{(s(m, n) - s(m-i, n-j))^2}{i^2 + j^2}.
We choose the parameter \rho^2_{(m,n)} = P_{(m,n-1)}, the covariance of the (recent) past pixel,
since it is a measure of local dependency. The first two moments of this prior are esti-
mated by importance sampling and are augmented with measurement noise statistics
to determine the predicted sigma points (steps 1, 2 in section 5.3).
The measurement equation (step 3(b)) in ISUKF has the same form as that of
the ARUKF (Eq. 5.9) except that H = 1. Since the predicted state by importance
sampling has dimension one, the augmented vector has dimension na = 2 and only
(2na+1 =)5 sigma points are needed to propagate the state dynamics. With this prior
and measurement formulations, the steps in ISUKF algorithm (described in section 5.3)
are followed to estimate the (original) image pixel intensity.
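The distance-weighted energy used for film grain can be sketched as below. The helper names are hypothetical, and the density form exp(−γ log(1 + η²/γ)) is the one written for SAR in Eq. (6.2); using it here for film grain is an assumption, since the original prior definition (Eq. 3.9) lies outside this section.

```python
import numpy as np

# First-order NSHP offsets (i, j); the neighbor of (m, n) is s_hat[m - i, n - j]
NSHP_OFFSETS = [(0, 1), (1, 0), (1, 1), (1, -1)]

def eta2_film_grain(s_val, s_hat, m, n, rho2):
    """Energy term: squared differences to past estimates, weighted by 1 / (i^2 + j^2)."""
    total = 0.0
    for i, j in NSHP_OFFSETS:
        total = total + (s_val - s_hat[m - i, n - j]) ** 2 / (i * i + j * j)
    return total / rho2

def damrf_density(s_val, s_hat, m, n, rho2, gamma=1.8):
    """Unnormalized prior value exp(-gamma * log(1 + eta^2 / gamma))."""
    eta2 = eta2_film_grain(s_val, s_hat, m, n, rho2)
    return np.exp(-gamma * np.log1p(eta2 / gamma))

s_hat = np.full((3, 4), 10.0)          # toy array of past estimates
p = damrf_density(12.0, s_hat, 1, 1, rho2=4.0)
```

Diagonal neighbors contribute with half the weight of horizontal and vertical ones, because i² + j² = 2 for them.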
Fig. 5.3: (a) Original ‘flower’ image. (b) Degraded image. Image estimated using (c) MWF (ISNR = 2.86 dB), (d) PF (ISNR = 2.55 dB), (e) ARUKF (ISNR = 3.42 dB), and (f) ISUKF (ISNR = 4.63 dB).
5.5 EXPERIMENTAL RESULTS
In this section, we present results obtained using the proposed methods, namely,
ARUKF and ISUKF, and compare their performance with the modified Wiener fil-
ter (MWF) [4] and the recently proposed particle filter (PF) [58,59]. We note that the
PF and ARUKF require accurate estimates of AR parameters of the original image
which is a difficult proposition in real situations. When only the corrupted image is
available, we estimate the AR parameters (approximately) from the degraded image
by solving Eqs. (3.3).
Fig. 5.4: (a) The ‘house’ image. (b) Degraded image (\sigma_v^2 = 0.05). Result using (c) MWF (ISNR = 2.51 dB), (d) PF (ISNR = 3.48 dB), (e) ARUKF (ISNR = 3.40 dB), and (f) ISUKF (ISNR = 4.55 dB).
The parameters that influence the DAMRF model are γ and \rho_{(m,n)}. We have
found experimentally that γ = 1.8 yields the best performance. From the D − log E
characteristics of the film, we took α = 5 for all the experiments [55]. The parameters
of UT were set as αUT = 1, βUT = 0 and κ = 1. The UKF was observed to be robust
to variations in these UT parameters.
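For reference, the UT parameters quoted above translate into sigma-point weights as in the sketch below. It assumes the commonly used scaled unscented transform parameterization, λ = α_UT²(n_a + κ) − n_a; the thesis's own definition (Eq. 4.1) is outside this section, so this mapping is an assumption.

```python
import numpy as np

def ut_weights(na, alpha_ut=1.0, beta_ut=0.0, kappa=1.0):
    """Return lambda and the mean/covariance weights of the scaled UT (assumed form)."""
    lam = alpha_ut ** 2 * (na + kappa) - na
    Wm = np.full(2 * na + 1, 1.0 / (2.0 * (na + lam)))   # weights of the 2*na outer points
    Wc = Wm.copy()
    Wm[0] = lam / (na + lam)                             # weight of the central point (mean)
    Wc[0] = Wm[0] + (1.0 - alpha_ut ** 2 + beta_ut)      # central weight for covariances
    return lam, Wm, Wc

lam, Wm, Wc = ut_weights(na=2)   # ISUKF case: 5 sigma points
```

Under this parameterization, α_UT = 1, β_UT = 0 and κ = 1 give λ = 1 for any n_a, and the mean and covariance weights coincide.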
5.5.1 Simulations
Fig. 5.3(a) shows a ‘flower’ image. It is degraded by film-grain noise with variance
\sigma_v^2 = 0.05 according to Eq. (5.7) and the exposure domain image (r_e in Eq. (5.8))
is shown in Fig. 5.3(b). The image estimated by MWF is shown in Fig. 5.3(c).
The result obtained using the particle filter [59] (Fig. 5.3(d)) with 1000 samples is
considerably less noisy than MWF. Fig. 5.3(e) shows the output of the ARUKF. It
is marginally sharper than the PF output and has a higher ISNR value. The image
estimated using ISUKF is presented in Fig. 5.3(f). The noise level is quite low while the
petals come out sharp and crisp, demonstrating the effectiveness of the non-Gaussian
prior incorporated in the ISUKF. The image is much closer to the original and has the
highest ISNR value among all these methods.
Fig. 5.5: (a) The ‘peppers’ image. (b) Degraded image. Output of (c) MWF (ISNR = 2.98 dB), (d) PF (ISNR = 4.47 dB), (e) ARUKF (ISNR = 4.12 dB), and (f) ISUKF (ISNR = 5.26 dB).
Fig. 5.4(a) shows a ‘house’ image. The image degraded by film-grain noise is
shown in Fig. 5.4(b). The images estimated by MWF, PF, ARUKF and ISUKF are
shown in Figs. 5.4(c), (d), (e) and (f), respectively. The ARUKF preserves sharp
details (such as grass and tree leaves) akin to MWF but is less noisy. The performance
of PF is comparable to that of ARUKF. However, the image estimated by ISUKF is
the best among all. The noise is effectively filtered while preserving fine details such
as window-panes, grass and leaves. This is also reflected in its high ISNR value.
Fig. 5.6: Performance comparison on different images in terms of mean value of ISNR over 20 MC runs (a) at moderate noise (\sigma_v^2 = 0.05), and (b) at high noise (\sigma_v^2 = 0.15). Both panels plot the ISNR values of the respective filters (ISUKF, ARUKF, PF, MWF) against the image index.
Fig. 5.5(a) shows the ‘peppers’ image. It is severely degraded by film-grain noise
with variance \sigma_v^2 = 0.1 as shown in Fig. 5.5(b). The image recovered by MWF is
given in Fig. 5.5(c). By applying the ARUKF, we achieve good noise reduction (Fig.
5.5(e)) and obtain a result that is visually comparable to that of PF (Fig. 5.5(d)). The
image estimated using ISUKF is presented in Fig. 5.5(f). We note that the output
of ISUKF is almost free from noise while the contours remain sharp. It also has the
highest ISNR value among all the methods.
In Figs. 5.6(a) and (b), we statistically compare the performance of the proposed
ARUKF and ISUKF with MWF and PF with prior boosting [146] on a wide variety
of images including face images, natural scenes, film images and textures, both at
moderate noise and high noise. Except on a very few (highly textured) images, ISUKF
outperforms all other filters at all noise levels. The ARUKF is quantitatively next to
ISUKF in performance. At low noise (say \sigma_v^2 < 0.04), PF results in poor performance
because the likelihood is very peaked. The performance of ARUKF is better than
MWF and comparable to PF. But the ISUKF with a DAMRF non-Gaussian prior
performs the best among all the filters. It is also computationally more efficient than PF.
Fig. 5.7: (a) Cropped portion of a frame from the movie ‘Das testament des Dr. Mabuse’. Output of (b) MWF, (c) PF, (d) ARUKF, and (e) ISUKF.
5.5.2 Real Examples
Fig. 5.7(a) shows a cropped portion of an image frame from the old classic movie ‘Das
testament des Dr. Mabuse’. The film-grain noise is real and shows up clearly on the
white coat, the face etc. Using a homogeneous region in the frame, the measurement
noise variance was found to be 0.04. The output of MWF, PF and the proposed
ARUKF, and ISUKF are shown in Figs. 5.7(b), (c), (d) and (e), respectively. We
observe that ARUKF performs on par with PF but the ISUKF is the most effective
even in real situations. The overall noise level is quite low and the folds on the cloth
become quite discernible in the output of the ISUKF.
Fig. 5.8: (a) Face image with real film-grain noise. Image estimated using (b) MWF, (c) PF, (d) ARUKF, and (e) ISUKF.
Next, a face image with film-grain noise is shown in Fig. 5.8(a). Residual film-
grain noise is visible in the output of MWF (Fig. 5.8(b)). The PF result (Fig. 5.8(c))
is comparatively better. The result using ARUKF (Fig. 5.8(d)) is marginally sharper
than PF. The output of ISUKF (Fig. 5.8(e)) has all edges intact, and is least affected
by grains. Finer details such as lips and eye-brows are well-preserved. The image
obtained using ISUKF is visually striking in appearance and appears the most natural
among them all.
Fig. 5.9(a) shows yet another real image with film-grain noise. Figs. 5.9(b), (c),
(d) and (e) depict the output from MWF, PF, ARUKF and ISUKF, respectively. The
result obtained using ARUKF is comparable to PF. The ISUKF output is again the
best. The edges have been preserved well and the grainy appearance in the original
degraded image of Fig. 5.9(a) is greatly mitigated.
Fig. 5.9: (a) A real building image. Output image obtained using (b) MWF, (c) PF, (d) ARUKF, and (e) ISUKF.
Finally, we consider the locomotive image shown in Fig. 5.10(a). The result
obtained using the PF with 500 samples is shown in Fig. 5.10(b). We note that
in spite of some blurring of the letters and edges, the PF could not effectively remove
the noise in the white and black homogeneous regions. The image estimated by the
ARUKF (Fig. 5.10(c)) is more effective in removing noise but it tends to blur details.
However, the output of the ISUKF, shown in Fig. 5.10(d), preserves the sharpness of
the details very well (for example, the Tamil letters) and has very little noise.
Fig. 5.10: Cropped portion of a locomotive captured with a film camera. (a) Original with real film-grain noise. Image estimated using (b) PF, (c) ARUKF, and (d) ISUKF.
Unlike the particle filter, in which a large number of samples must be propagated
throughout the algorithm in an attempt to faithfully represent the posterior distribution,
for the proposed method as few as 50 to 100 samples are sufficient to reliably
estimate the first two moments of the non-Gaussian prior during prediction. At each
pixel, the computational requirements are as follows: The particle filter, with S samples,
requires O(S) operations in weight calculations and O(S log S) comparisons in
the resampling step [41]. The proposed ISUKF requires O(L) operations in prediction (i.e.,
in the importance sampling step), n_a^3/6 + n_a^2 multiplications and additions for the matrix
square-root in sigma point calculation, and O(2n_a + 1) operations for the UKF update,
where L is the number of samples in the IS step and n_a is the dimension of the augmented
state vector. Though the MWF is very simple and requires only O(MN log MN) operations
overall, its performance is comparatively inferior. Table 5.1 summarizes the
computational complexity of these filters.
Table 5.1: Computational complexity comparison at each pixel.
Operation | PF (S = 200) | ARUKF (n_a = 5) | ISUKF (L = 100, n_a = 2)
Non-linear functional evaluations | O(S) | O(2n_a + 1) | O(L) + O(2n_a + 1)
Additions and multiplications | O(S) | n_a^3/6 + n_a^2 + O(2n_a + 1) | O(L) + n_a^3/6 + n_a^2 + O(2n_a + 1)
Inequality comparisons | O(S log S) | 0 | 0
Random number generations | O(S) | 0 | O(L)
Exec. time (over 200 × 200 image) | 148 seconds | 14 seconds | 18 seconds
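As a rough illustration of the square-root cost term quoted above, evaluating n_a^3/6 + n_a^2 for the two augmented dimensions in Table 5.1 gives the following. This is plain arithmetic on the stated expression, not a measured operation count.

```latex
\begin{align*}
\text{ISUKF } (n_a = 2):&\quad \frac{2^3}{6} + 2^2 = \frac{16}{3} \approx 5.3, \qquad 2n_a + 1 = 5 \text{ sigma points},\\
\text{ARUKF } (n_a = 5):&\quad \frac{5^3}{6} + 5^2 = \frac{275}{6} \approx 45.8, \qquad 2n_a + 1 = 11 \text{ sigma points}.
\end{align*}
```

The smaller augmented dimension of the ISUKF therefore keeps the sigma-point part of its cost low, and the O(L) importance sampling step accounts for the rest.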
5.6 DISCUSSION
In this chapter, we proposed a UKF-based approach to estimate images corrupted by
film-grain noise. We first considered the extension of the 1-D UKF for image estima-
tion with an auto-regressive state model. A small set of NSHP neighbors determines
the dimension of the state, and is augmented with noise variables. Statistics of the
augmented vector are employed to determine the sigma points which enable prediction
of prior state and measurement statistics by direct propagation through the AR state
and observation models.
To further enhance performance, we incorporated an edge-preserving MRF prior
(tailored for film-grain noise) into the recursive UKF framework. The predicted statistics
of the state conditional prior, obtained by importance sampling, determine the predicted
sigma points, which are used for the prediction of the measurement statistics, and
employed in the UKF update. To recover from the non-linear degradation due to film-
grain noise, we used the exact exposure domain relation of the photographic film as
the observation model for the UKF. Experimental results were given to demonstrate
the effectiveness of the proposed approaches.
CHAPTER 6
DESPECKLING SAR IMAGERY
In this chapter, we address the problem of speckle reduction in synthetic aperture radar
(SAR) imagery. We demonstrate excellent despeckling in conjunction with feature
preservation by incorporating a modified DAMRF prior, tailored for noise removal in
SAR imagery, within the unscented Kalman filter framework. The performance of the
proposed method is evaluated on both synthetic and real examples, and compared
with existing methods.
6.1 IMAGE FORMATION
Synthetic aperture radar (SAR) imaging is an alternative approach to remote earth
observation and provides several advantages over visible/infrared sensing technology.
Because radar is an active sensing system that provides its own illumination source,
it does not rely on energy reflected or radiated from the earth’s surface. Radar can,
therefore, gather imagery day or night. Additionally, radar operates in the microwave
region of the electromagnetic spectrum [147]. These waves can penetrate clouds, haze,
and rain, thus allowing operation even in unfavorable weather conditions which typi-
cally preclude the use of visible/infrared systems. The use of microwaves also allows
for the observation of earth properties that are unique to the microwave region and
are not detectable with visible and infrared systems.
SAR systems produce 2D images of mapped areas. A radar is an electromagnetic
wave sensor having a pulsed microwave transmitter and a phase-coherent receiver.
The radar is carried by a moving platform such as an aircraft or a satellite. A pulse
transmitter signal is radiated by the antenna, reflected from the target, and sensed by
the receiver. The reflected signal time delay is proportional to target range (distance).
The range resolution δr is determined by the effective radar pulse length τ as δr = cτ/2
where c is the velocity of light [148].
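As a quick numerical illustration of δr = cτ/2, the sketch below evaluates the range resolution for a hypothetical effective pulse length of 1 µs. The pulse length is an assumed value chosen only for the example.

```python
C_LIGHT = 299_792_458.0  # speed of light in m/s

def range_resolution(tau_s):
    """Range resolution in metres for an effective pulse length tau_s in seconds."""
    return C_LIGHT * tau_s / 2.0

delta_r = range_resolution(1e-6)   # about 150 m for a 1 microsecond pulse
```

Shorter effective pulses, which correspond to wider bandwidths, give proportionally finer range resolution.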
The resolution capability of a radar is specified in the range and azimuth directions,
which correspond to the two orthogonal axes of the processed image. SAR systems
achieve high resolution images by the use of very short transmitted pulses with wide
bandwidths in the range direction, and by the use of a synthesized antenna (aperture)
and coherent processing of the phase history of the echo signals from many successive
pulses in the cross-range direction [148].
A digital image generated from SAR echo-returns is represented by spatial varia-
tions of pixel intensities. The gray level of a pixel in the image is proportional to the
processed received power resulting from the backscattering produced by the ground
area represented by that pixel. A difference in gray level for two adjacent features
on an image is due to a difference between their individual reflectivities, since system
and propagation factors are essentially the same for both the features. The resultant
image becomes a 2-D map of the scene reflectivity factor [22].
A SAR system coherently records the amplitude and the phase echoes from a
distant target. Since each resolution cell of the system contains several scatterers,
and since the phases of the returned signals from the scatterers are randomly dis-
tributed, the inherent coherent processing involved results in noise-like interference
patterns called speckle [5]. Grainy in appearance, speckle noise is primarily due to
phase fluctuations of the electromagnetic return signals [149]. This affects the radio-
metric resolution, which is the ability of a SAR to distinguish different objects in the
scene on the basis of their electromagnetic signatures [150]. Speckle noise severely
impedes automatic scene segmentation and interpretation, and limits the resolution of
SAR images as well as their utility. Typical spatial resolution of SAR images is comparable
to the size of some of the objects of interest within the scene, such as houses,
trees or vehicles.
Fully developed speckle can be modeled as random multiplicative noise. If s
represents the original image and v is speckle noise, then the degraded observation y
is given by the relation
y(m, n) = s(m, n) \cdot v(m, n)    (6.1)
Noise v is assumed to be independent of s with unit mean and variance \sigma_v^2. The
multiplicative nature of speckle complicates the noise filtering process.
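The multiplicative model of Eq. (6.1) can be simulated as in the sketch below. The Gamma distribution is an assumption at this point of the text (it matches the simulated Gamma-distributed speckle used later in the experiments); with shape 1/σ_v² and scale σ_v² the noise has unit mean and variance σ_v², so L-look speckle corresponds to σ_v² = 1/L. The function name is hypothetical.

```python
import numpy as np

def add_speckle(s, sigma2_v=0.5, rng=None):
    """Return y = s * v with unit-mean Gamma speckle v of variance sigma2_v."""
    rng = np.random.default_rng() if rng is None else rng
    s = np.asarray(s, dtype=float)
    # Gamma(shape=k, scale=theta): mean = k * theta = 1, variance = k * theta**2 = sigma2_v
    v = rng.gamma(shape=1.0 / sigma2_v, scale=sigma2_v, size=s.shape)
    return s * v, v

s = np.full((200, 200), 100.0)
y, v = add_speckle(s, sigma2_v=0.04, rng=np.random.default_rng(0))
```

Because the noise scales with s, the absolute fluctuation of y is larger in bright regions than in dark ones.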
6.1.1 SAR Methods
In general, a speckle suppression filter is expected to effectively filter homogeneous
areas, retain image texture and edges, and preserve features (both linear and point-
type). In the Lee filter [63], the multiplicative model is first approximated by a linear
combination of the local mean and the observed pixel. Then, a minimum mean square
error (MMSE) estimator is applied to determine the weighting constant. The Frost
filter [62] is an adaptive and exponentially-weighted averaging filter. The weights
are based on the coefficient of variation c which is the ratio of the local standard
deviation to the local mean of the degraded image, the distance |t| of a pixel from the
central pixel, and the damping factor K. The exponential kernel weights are given by
w = exp[−Kc|t|]. The enhanced Lee and enhanced Frost filters proposed by Lopes
et al. [65] divide an image into homogeneous, heterogeneous areas, and isolated point
targets based on the value of the coefficient of variation c (low, intermediate, and
high, respectively). The main principle behind adaptive filters is that they approach
the local mean at homogeneous regions. At points of high activity they tend to retain
the original observation pixel. The disadvantages are over-smoothing of image texture
and ineffective denoising around edges.
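The Frost weighting described above can be sketched for a single window as follows. The Euclidean distance for |t|, the normalization by the sum of weights, and the function name are assumptions made for illustration; the kernel follows the form w = exp[−Kc|t|] given in the text.

```python
import numpy as np

def frost_pixel(window, K=3.0):
    """Frost-style estimate of the central pixel of an odd-sized window."""
    window = np.asarray(window, dtype=float)
    h, w = window.shape
    c = window.std() / window.mean()             # local coefficient of variation
    rows, cols = np.indices(window.shape)
    t = np.hypot(rows - h // 2, cols - w // 2)   # distance |t| from the central pixel
    wts = np.exp(-K * c * t)                     # exponential kernel weights
    return np.sum(wts * window) / np.sum(wts)

flat = np.full((5, 5), 7.0)
point = np.ones((5, 5))
point[2, 2] = 100.0                              # isolated bright point target
print(frost_pixel(flat), frost_pixel(point))
```

In the flat window c = 0, so all weights are equal and the output is the local mean. In the window with the point target c is large, the weights decay quickly, and the output stays close to the observed central pixel, which is the adaptive behavior described in the text.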
Wavelet despeckling approaches have been quite successful and are based on mod-
ifying the (log-transformed speckle) noisy wavelet coefficients according to some rule
(shrinkage) and reconstructing the filtered image from them. In [151], a Kalman
shrink maximum a posteriori (MAP) estimator is applied on high sub-band wavelet
coefficients. Xie et al. [71] have proposed a despeckling algorithm that fuses Bayesian
wavelet denoising with a regularizing prior. A MAP estimator with an alpha-stable
prior within the wavelet framework is proposed in [72] for despeckling. A wavelet
despeckling method based on Bayesian shrinkage which relies on edge information
has been proposed in [13]. Argenti et al. [14] propose despeckling in the undecimated
wavelet domain using a space-varying generalized Gaussian distribution for the wavelet
coefficients. In [149], a model for SAR imagery based on MRFs is proposed to exploit
the characteristics of speckle.
6.2 SAR METRICS
To compare the performance of different despeckling techniques, many metrics have
been proposed in the literature. These criteria measure how well speckle
is reduced and to what extent important details, such as point and line targets, are
preserved by an algorithm.
Some commonly used metrics for determining the performance of SAR despeckling
algorithms are given below.
1. Signal to Mean square error ratio (S/MSE) = 10 \log_{10} \left( \frac{\sum_{m,n} s(m, n)^2}{\sum_{m,n} (s(m, n) - \hat{s}(m, n))^2} \right).
The higher the S/MSE value, the closer the filtered image is to the original.
2. Edge Correlation Factor (ECF) yields a measure of edge-preservation capability
[152]. The ISNR is not always an accurate measure of noise suppression in
images. For example, it may not adequately account for the preservation of edges.
Therefore, we use a supplementary performance evaluation based on correlation
as proposed in [152]. The ECF is computed as
ECF = \frac{\Gamma(\Delta s - \overline{\Delta s}, \Delta\hat{s} - \overline{\Delta\hat{s}})}{\sqrt{\Gamma(\Delta s - \overline{\Delta s}, \Delta s - \overline{\Delta s}) \, \Gamma(\Delta\hat{s} - \overline{\Delta\hat{s}}, \Delta\hat{s} - \overline{\Delta\hat{s}})}}
where \Delta s and \Delta\hat{s} are the high-pass filtered versions of the original and
estimated images, respectively, and an overbar denotes the mean value. They are
obtained with a 3×3 Laplacian operator and \Gamma(s_1, s_2) = \sum_{m,n} s_1(m, n) s_2(m, n).
The correlation measure should be close to unity when the estimated image is
similar to the reference image.
3. Equivalent Number of Looks (ENL) measures the speckle suppression in a homogeneous
area of the image and is given by ENL = (\mu_f / \sigma_f)^2. Here, the mean value
\mu_f = \frac{1}{m_1 n_1} \sum_{m,n} s(m, n) and the standard deviation \sigma_f = \sqrt{\frac{1}{m_1 n_1} \sum_{m,n} (s(m, n) - \mu_f)^2}
are for a (chosen) uniform area of dimension m_1 × n_1 in the image. A large ENL
corresponds to better speckle suppression. This measure is of more importance
in real situations, where no reference image is available and S/MSE and ECF cannot be computed.
4. Figure of Merit (FOM) [15] gives a quantitative evaluation of detection of true
edges and suppression of false edges (in synthetic experiments). To assess FOM,
an edge map is created by applying the Roberts mask [153] on the filtered images.
Then, Pratt’s [154] FOM is adopted as FOM = \frac{1}{\max(N_f, N_I)} \sum_{m,n} \frac{1}{1 + \xi d^2(m, n)},
where N_f and N_I are the number of detected and ideal edge pixels, respectively.
Here, d(m, n) is the distance between the (m, n)th detected edge pixel and the
nearest ideal edge pixel, and ξ is a constant which is typically set to 1/9. FOM
ranges between 0 and 1. A high value indicates superior edge rendition.
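The first three metrics above can be sketched in a few lines, as shown below. The 4-neighbor form of the 3×3 Laplacian, its restriction to interior pixels, and the function names are assumptions for illustration; the ECF follows the usual mean-removed correlation reading of the definition.

```python
import numpy as np

def s_mse_db(s, s_hat):
    """Signal to mean square error ratio in dB."""
    s, s_hat = np.asarray(s, float), np.asarray(s_hat, float)
    return 10.0 * np.log10(np.sum(s ** 2) / np.sum((s - s_hat) ** 2))

def enl(region):
    """Equivalent number of looks of a (chosen) homogeneous region."""
    region = np.asarray(region, float)
    return (region.mean() / region.std()) ** 2      # std() uses 1/(m1*n1) normalization

def laplacian3x3(img):
    """3x3 Laplacian (assumed 4-neighbor kernel), evaluated on interior pixels only."""
    img = np.asarray(img, float)
    return (img[:-2, 1:-1] + img[2:, 1:-1] + img[1:-1, :-2] + img[1:-1, 2:]
            - 4.0 * img[1:-1, 1:-1])

def ecf(s, s_hat):
    """Edge correlation factor between original and estimated images."""
    ds, dh = laplacian3x3(s), laplacian3x3(s_hat)
    ds, dh = ds - ds.mean(), dh - dh.mean()          # remove the means
    return np.sum(ds * dh) / np.sqrt(np.sum(ds * ds) * np.sum(dh * dh))

rng = np.random.default_rng(0)
s = rng.uniform(50.0, 200.0, size=(16, 16))
print(s_mse_db(s, 0.9 * s), ecf(s, s), enl(s))
```

An estimate equal to 0.9 s has an error energy of one hundredth of the signal energy, so its S/MSE is 20 dB, and an image compared with itself has an ECF of unity.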
6.3 NOISE REDUCTION IN SAR
The multiplicative model of speckle noise complicates recursive propagation in the KF
(or EKF) formulation. However, the UKF provides a mathematically tractable way
to propagate the first two moments even in the presence of multiplicative noise. The
propagation and the calculation of these moments are based on sigma points. This
facilitates incorporation of non-additive noise in the recursive estimation procedure.
We first used the ARUKF filter for suppressing speckle. As earlier, we employed
the homogeneous AR model for state prediction based on a three-pixel NSHP neighbor-
hood. The sigma points were generated by assuming some initial mean and covariances
for the state and these points were propagated through the AR state model and then
through the pure multiplicative observation model resulting in the prediction of state
and measurement sigma points. The UT was employed to determine the predicted
first two statistics and the update was performed as in the Kalman filter. The image
estimated by the ARUKF preserves edges and fine features but at the cost of letting
in noise. The effect of speckle can be suppressed by using a higher covariance value
for the state noise, but this blurs edges and point features which affects recognition of
fine details.
6.3.1 Speckle Suppression using ISUKF
In this section, we present the ISUKF algorithm which we specifically tailor to de-
speckle SAR images. We formulate the DAMRF prior as earlier and propose to use
p(s(m, n)/s(m-i, n-j)) = \exp\left( -\gamma \log\left( 1 + \frac{\eta^2(s(m, n), s)}{\gamma} \right) \right)    (6.2)
where (i, j) ∈ {(0, 1), (1, 0), (1, 1), (1, −1)} for a first-order NSHP neighborhood. However,
\eta^2(s(m, n), s) is formulated as follows.
\eta^2(s(m, n), s) = \frac{1}{\rho^2_{(m,n)}} \sum_{(i,j)} (s(m, n) - s(m-i, n-j))^2.    (6.3)
Due to the need for effective smoothing of speckle in SAR images, and for preservation
of point features, we vary the scale parameter \rho^2_{(m,n)} based on the estimated past NSHP
pixels. Note that \rho^2_{(m,n)} is not set to the previous covariance estimate as in film-grain
noise removal (section 5.4.1). This is based on the following observation.
When DAMRF modeling is used in conjunction with the multiplicative noise model
in the UKF, we observed that a very low value of the parameter \rho^2_{(m,n)} is required in
low-intensity regions (compared to high-intensity regions) to perform an equivalent
amount of smoothing. To achieve overall good smoothing performance, we vary \rho^2_{(m,n)}
monotonically with the mean of the first-order NSHP neighborhood as \rho^2_{(m,n)} = \zeta \mu_s^{1.5},
where \mu_s = \frac{1}{4}(s(m, n-1) + s(m-1, n) + s(m-1, n-1) + s(m-1, n+1)). Here, \zeta
can be used as a tuning parameter to vary the amount of smoothing depending on the
image texture.
We summarize the ISUKF algorithm for estimating each pixel sequentially in the
presence of speckle as follows:
1. At each pixel, we formulate the state conditional density using the DAMRF
density (Eq. 6.2) with the energy function given in Eq. (6.3) employing the
past estimates.
2. We employ the importance sampling technique to estimate the first two moments of
the above conditional pdf as in step 2 of ISUKF (section 5.3.1), which constitute
the predicted state mean x and covariance P_x (see footnote 1).
3. We augment with the speckle noise statistics and employ the UT on the aug-
mented mean xa and covariance Pa to determine the corresponding sigma point
matrix Xa (as in step 3 (a) of ISUKF, section 5.3.1). Since the augmented ran-
dom variable has dimension 2, the number of sigma points will be 5, resulting
in Xa of dimension 2× 5.
4. Next, we apply the measurement model independently on each of the predicted
state and measurement noise sigma points to obtain the sigma points corre-
sponding to the measurement as
Y_{(m,n)/(m,n-1)} = X^x_{(m,n)/(m,n-1)} .* X^v_{(m,n-1)}    (6.4)
This is in accordance with the multiplicative noise model of Eq. 6.1. Here, X^x
and X^v are formed from the first and the second (which is the last) row of X^a,
respectively. Operation ‘.*’ represents element-by-element multiplication.
5. Using the predicted state and measurement sigma points, and the UT determin-
istic weights, we predict the mean and covariance of state and measurements
(x,Px, y,Py) and also the cross-covariance of the state and measurement Pxy.
6. Based on the state and measurement statistics and using the recent observa-
tion y(m, n), we employ the Kalman filter update to estimate the mean x and
covariance Px of the state.
The estimated state x(m, n) corresponds to the despeckled pixel.
1 These are scalars as described in section 5.3.1. We have dropped the subscripts referring to pixel
location, for brevity.
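The six steps summarized above can be put together for one pixel as in the following sketch. It is not the thesis implementation: the Gaussian proposal density and its width, the UT weights (scaled UT with α_UT = 1, β_UT = 0), the default values of ζ, γ and L, and the function name are all assumptions made for illustration, since the details of the importance sampling step are given in section 5.3.1, outside this section.

```python
import numpy as np

def isukf_speckle_pixel(y, neighbors, sigma2_v, zeta=0.02, gamma=1.8,
                        L=100, kappa=1.0, rng=None):
    """One ISUKF update for a single pixel under y = s * v (Eq. 6.1)."""
    rng = np.random.default_rng() if rng is None else rng
    nb = np.asarray(neighbors, dtype=float)       # four past NSHP estimates
    mu_s = nb.mean()
    rho2 = zeta * mu_s ** 1.5                     # scale parameter of the prior

    # Steps 1-2: moments of the DAMRF prior (Eqs. 6.2, 6.3) by importance sampling.
    # ASSUMPTION: Gaussian proposal centred on the neighborhood mean.
    q_std = max(nb.std(), np.sqrt(rho2))
    z = rng.normal(mu_s, q_std, size=L)
    eta2 = ((z[:, None] - nb[None, :]) ** 2).sum(axis=1) / rho2
    log_w = -gamma * np.log1p(eta2 / gamma) + 0.5 * ((z - mu_s) / q_std) ** 2
    w = np.exp(log_w - log_w.max())
    w /= w.sum()                                  # self-normalized importance weights
    x_pred = np.sum(w * z)
    P_pred = np.sum(w * (z - x_pred) ** 2)

    # Step 3: augment with speckle statistics (unit mean, variance sigma2_v).
    na, lam = 2, kappa                            # alpha_UT = 1 gives lambda = kappa
    xa = np.array([x_pred, 1.0])
    S = np.diag(np.sqrt((na + lam) * np.array([P_pred, sigma2_v])))
    Xa = np.column_stack([xa, xa[:, None] + S, xa[:, None] - S])   # 2 x 5
    W = np.full(2 * na + 1, 1.0 / (2.0 * (na + lam)))
    W[0] = lam / (na + lam)

    # Step 4: multiplicative measurement model (Eq. 6.4).
    Xx, Xv = Xa[0], Xa[1]
    Y = Xx * Xv

    # Step 5: predicted means, variances and cross-covariance.
    x_bar, y_bar = W @ Xx, W @ Y
    P_x = W @ (Xx - x_bar) ** 2
    P_y = W @ (Y - y_bar) ** 2
    P_xy = W @ ((Xx - x_bar) * (Y - y_bar))

    # Step 6: Kalman update with the observation y.
    K = P_xy / P_y
    return x_bar + K * (y - y_bar), P_x - K * P_y * K

x_est, P_est = isukf_speckle_pixel(100.0, [100.0] * 4, sigma2_v=0.04,
                                   rng=np.random.default_rng(0))
```

In a full filter this function would be called in raster-scan order, with `neighbors` taken from the already estimated pixels s(m, n−1), s(m−1, n), s(m−1, n−1) and s(m−1, n+1).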
Fig. 6.1: (a) Original image. (b) Noisy version. Output of (c) Enhanced Lee (FOM = 0.92, S/MSE = 20.51 dB), (d) Frost (FOM = 0.94, S/MSE = 20.51 dB), and (e) the proposed method (FOM = 0.98, S/MSE = 22.83 dB).
We note that the propagation of the variance parameter \rho^2_{(m,n)} of the DAMRF
prior for SAR is different from that for film-grain noise in which \rho^2_{(m,n)} was directly
set to the previous covariance estimate. In this chapter, the variance parameter \rho^2_{(m,n)}
(in Eq. 3.9) is propagated only based on neighboring estimates and tuned with a free
scaling parameter \zeta to effectively handle the wide variety of contextual features in
SAR images. The ISUKF leads to significantly better performance than the AR-based
UKF both in terms of (local) smoothing of uniform regions and in preserving point
and edge features.
6.4 EXPERIMENTAL RESULTS
In this section, we present results obtained using the proposed ISUKF method when
applied on SAR images. We compare the performance of the proposed approach
with both standard and recent methods. We note that in the proposed approach,
the parameter ζ of the DA model is image-dependent. Based on our experiments, it
typically takes a value between 0.01 and 0.03 for moderate noise.
Consider the edge image shown in Fig. 6.1(a). This image is degraded with
simulated 2-look Gamma-distributed speckle noise (Fig. 6.1(b)). The outputs of the
Enhanced Lee [65], Frost [62] and the proposed ISUKF method are shown in Figs.
6.1(c), (d) and (e), respectively. The images in Figs. 6.1(c) and (d) have a grainy
appearance in the white uniform region due to residual speckle. The proposed method
is not only effective in despeckling but also preserves well the transition between the
dark and bright regions. Our method yields an almost exact edge map (close to that of
the original) even though the noise level is high. This is also reflected in the S/MSE
and FOM values, which are higher than those of the standard filters (as
given in Table 6.1(a)).
Fig. 6.2: (a) Original SAR image. (b) Degraded (\sigma_v^2 = 0.04). Image estimated using (c) the Enhanced Lee filter, (d) the Frost filter, (e) AR-based UKF, and (f) the ISUKF.
The image used for the next experiment is shown in Fig. 6.2(a). The degraded
image with simulated speckle noise of variance \sigma_v^2 = 0.04 is shown in Fig. 6.2(b). The
despeckled images obtained with the Enhanced Lee filter and the Frost filter are shown
in Figs. 6.2(c) and (d), respectively. We observe that dark regions are over-smoothed
while the high intensity regions have a grainy appearance. The image obtained using
the ARUKF is shown in Fig. 6.2(e). The output is visually inferior to that of the Frost
filter but it does not over-smooth dark regions. The output of the proposed ISUKF is
shown in Fig. 6.2(f). Note that incorporating a non-Gaussian MRF prior in the UKF
gives a significant improvement over the ARUKF. The image in Fig. 6.2(f) is visually
closest to the original. Comparative metrics for this example are given in Table 6.1(b).
Table 6.1: Quantitative comparison with standard filters.
(a) Standard filters (Fig. 6.1)
Filter | S/MSE (dB) | FOM | ENL
Enhanced Lee (K = 1) | 20.51 | 0.92 | 305
Frost (K = 3) | 20.51 | 0.94 | 361.15
ISUKF (ζ = 0.01) | 22.83 | 0.98 | 3071.00
(b) Standard filters (Fig. 6.2)
Filter | S/MSE (dB) | ECF | ENL
Noisy (at (170, 160)) | 0 | 0.242 | 24.90
Enhanced Lee (K = 8) | 22.28 | 0.217 | 459.06
Frost (K = 12) | 23.35 | 0.199 | 559.18
ARUKF | 21.38 | 0.639 | 270.00
ISUKF (ζ = 0.025) | 22.15 | 0.624 | 1041.00
We also compared our method with the adaptive MAP technique in [12] which is
based on a heavy-tailed Rayleigh model. In Figs. 6.3 (a), (b) and (c), we reproduce
the original, degraded and the output image from [12], respectively. Even though the
method proposed in [12] reduces speckle, the output is blurred. The image estimated
using the proposed ISUKF approach (Fig. 6.3(d)) when used on the same degraded
image is sharp and even the fine details are recovered well. The output of our method
has higher S/MSE and ECF values as given in Table 6.2(a). Because the output
of [12] is blurred, its ENL value is marginally higher. Note that the output of the
proposed method is much closer to the original image (texture-wise also).
Next, we considered the case of a Horse track image that is affected by real speckle
noise (Fig. 6.4(a)). The output of the Frost filter, shown in Fig. 6.4(b), contains
noticeable residual speckle in uniform regions. Also, the edge boundaries are noisy.
Fig. 6.3: (a) Original aerial image. (b) Degraded image. Image estimated using (c) Rayleigh prior MAP estimator [12], and (d) the proposed ISUKF.
Fig. 6.4: (a) The Horse track image. Estimated output image using (b) the Frost filter, and (c) our method.
The image estimated by the ISUKF method is shown in Fig. 6.4(c). Our method
is significantly more effective in smoothing speckle over uniform regions (such as the
top-left and bottom-right gray regions); yet the edges are sharp and clear. Even the
small white blobs on the top-right corner are recovered well. Table 6.2(b) gives a
quantitative comparison with other filters. The cropped 60× 60 region that was used
to calculate the ENL is centered at (115, 55). The performance of our method is
evidently superior even in the real case.
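The ENL values in Table 6.2(b) are consistent with the common definition ENL = (µ_f/σ_f)², evaluated over a homogeneous patch; for the noisy image, (110.96/28.73)² ≈ 14.92. A minimal sketch under that assumption follows. The function names, the use of NumPy, and the reading of the patch center as (row, column) are illustrative choices and are not taken from this thesis.

```python
import numpy as np

def enl_from_stats(mu, sigma):
    """ENL from the patch mean and standard deviation (assumed definition)."""
    return (mu / sigma) ** 2

def enl(image, center, size=60):
    """ENL over a size x size patch; center is assumed to be (row, column)."""
    r, c = center
    h = size // 2
    patch = np.asarray(image, dtype=float)[r - h:r + h, c - h:c + h]
    # Population variance of the patch; a homogeneous region is assumed.
    return patch.mean() ** 2 / patch.var()

# Reproduces the 'Noisy' row of Table 6.2(b) from its mean and std. deviation.
print(round(enl_from_stats(110.96, 28.73), 2))
```

A larger ENL indicates stronger smoothing of the homogeneous patch, which is why it is reported alongside the edge-oriented measures rather than on its own.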
Figure 6.5(a) shows the Bedfordshire image (in Southeast England) with real
speckle. The image estimated using a Bayesian wavelet filter which relies on edge
information is reproduced from [13] and is given in Fig. 6.5(b). The output obtained
Table 6.2: Quantitative comparison.
(a) Rayleigh MAP-based (Fig. 6.3)
Method                         S/MSE (dB)   ECF     ENL
Noisy (at (165, 100))          0            0.67    13.44
Rayleigh MAP technique [12]    18.57        0.61    17.26
ISUKF (ζ = 0.03)               21.14        0.74    16.42
(b) Real speckle (Fig. 6.4)
Method                         µ_f          σ_f     ENL
Noisy (at (115, 55))           110.96       28.73   14.92
Frost (K = 10)                 110.46       8.92    153.10
Enhanced Lee (K = 6)           110.47       9.56    133.66
ISUKF (ζ = 0.01)               112.18       7.88    202.49
Fig. 6.5: (a) Bedfordshire image. Result using (b) edge-based Bayesian wavelet filter [13], and (c) the ISUKF method.
by the ISUKF filter on the same degraded image of Fig. 6.5(a) is shown in Fig. 6.5(c).
We note that the edges are recovered without any blurring effects while the homoge-
neous regions are almost free from noise. The isolated point targets (two white spots
in the dark homogeneous region on the left) become strikingly visible in Fig. 6.5(c).
A quantitative comparison of the two methods is given in Table 6.3(a).
Table 6.3: Quantitative comparison with wavelet filters.
(a) Wavelet edge-based (Fig. 6.5)
Method                         µ_f      σ_f     ENL
Noisy (at (90, 160))           89.73    25.73   12.16
Bayesian wavelet filter [13]   90.85    13.87   42.87
ISUKF (ζ = 0.015)              89.25    13.03   46.91
(b) UDWL-MAP (Fig. 6.6)
Method                         µ_f      σ_f     ENL
Noisy (at (280, 100))          52.10    8.73    35.61
UDWL [14]                      54.06    3.04    316.06
ISUKF (ζ = 0.004)              51.96    2.83    337.9
Fig. 6.6: (a) Airport image. Result using (b) MAP estimator in UDWL domain [14], and (c) the proposed method.
We also compared our method with a very recent wavelet despeckling technique
[14] developed in the undecimated wavelet (UDWL) domain. Fig. 6.6(a) shows a real
SAR image of an airport. This is a very difficult example as it contains lots of weak
edges. The outputs of the method in [14] and the proposed method are shown in Figs.
6.6(b) and (c), respectively. From a visual comparison, it is quite evident that the
ability of the ISUKF method in capturing soft edges is superior to that of [14]. Even
Fig. 6.7: (a) Urban SAR image. (b) Degraded image. Image estimated using (c) wavelet-based filter [15], and (d) ISUKF.
the ENL for our method is higher (Table 6.3 (b)).
We next present a few more results, only for visual comparison with some well-known
existing methods, to demonstrate the effectiveness of the proposed ISUKF method on
different kinds of SAR images. Figs. 6.7(a), (b) and (c) show the original, degraded
and the image estimated using the wavelet filter with soft-thresholding (from [15]),
respectively. The image obtained by using the ISUKF is shown in Fig. 6.7(d). We
observe that the proposed method is more effective in bringing out the narrow sharp
details in the bottom middle regions while effectively removing the overall speckle.
Next, we compare the proposed method with an MRF-based approach which obtains
an estimate by simulated annealing with a Metropolis sampler [16]. Figs. 6.8(a)
Fig. 6.8: (a) Real SAR image. Image estimated using (b) simulated annealing and Metropolis estimator [16], and (c) ISUKF.
and (b) show the real noisy SAR image and the output of the method in [16], respec-
tively. The image estimated using the proposed ISUKF is shown in Fig. 6.8(c). Our
approach could effectively suppress speckle while preserving image texture and sharp
details.
The above examples clearly demonstrate the effectiveness of the proposed ISUKF
in suppressing speckle while simultaneously preserving point and fine features. It
generally leads to a better visual output and also compares favorably in quantitative
performance as well as in computational complexity with other methods. It has the
same computational complexity as given in chapter 5 (Table (5.1)), and executes in 18
seconds using Matlab on a Pentium 4 PC with 256 MB RAM for a 200 x 200 image.
6.5 DISCUSSION
In this chapter, we explored the applicability of ISUKF for despeckling SAR imagery.
The formulation of the DAMRF prior was modified to preserve fine features in a wide
range of SAR images. A small set of sigma points was used to capture and propagate
the first two moments of the state and measurement noise through the multiplicative
speckle model.
CHAPTER 7
JOINT INPAINTING AND DENOISING
Old films and photographs usually suffer damage from physical and/or chemical
effects, resulting in stains, scratches, scribbling, noise, and digital drop-out in frames.
Inpainting is a technique for modifying user-specified regions of an image undetectably.
It provides a means for reconstruction of known damaged portions. It serves a wide
range of applications, such as removing superimposed text like dates, subtitles, public-
ity or logos from still images or videos, and reconstructing scans of deteriorated images
by removing scratches or stains [6, 18].
In this chapter, we present a scheme which can simultaneously filter film-grain
noise while filling-in identified damaged portions in an image within the unscented
Kalman filter (UKF) framework. We observe that a key issue in handling inpainting
with a recursive filter lies in arriving at the observations in the regions to be inpainted
based on the surrounding available information. Care must be taken to preserve the
edges through the inpainting regions. We first demonstrate the validity of our scheme
on line and region-scratches, and text removal. We then incorporate our inpainting
scheme within the UKF framework and demonstrate simultaneous film-grain noise
reduction and inpainting capability of the proposed filter.
7.1 INTRODUCTION
Most inpainting techniques smoothly propagate image information from outside the
boundary into the inpainting region based on edges or isophotes (lines of equal gray
values) [6] and require the user to only specify the region to be inpainted. Kokaram
et al. [23] interpolate losses in films from adjacent frames using motion estimation and
auto-regressive models. The technique, however, cannot be applied to still images or to
films where the regions to be inpainted span several frames. Hirani and Totsuka [155]
combine global frequency and local spatial information in order to fill a given region
with a selected texture. The algorithm requires the user to select the texture to be
copied into the region to be inpainted. Masnou and Morel [24] define dis-occlusion as
the recovery of hidden parts of objects in an image by interpolation from the vicinity
of the occluded area and perform inpainting by joining the points of the isophotes at
the boundary of the region to be inpainted.
Bertalmio et al. [6] formulate partial differential equations (PDEs) that smoothly
propagate the boundary information (the Laplacian of the image) in the direction of
the isophotes, estimated by the image gradient rotated by 90 degrees. Their algorithm
shows that both the gradient direction (geometry) and the gray-scale values (pho-
tometry) of the image should be propagated inside the region to be filled-in. They
demonstrate the generality of their method with various applications including region-
filling, text-removal, and special-effects restoration of old photographs. In [156], an
exemplar-based inpainting algorithm for filling-in large objects is proposed by combining
the principles of texture-synthesis and traditional inpainting. Another class
of inpainting algorithms, which emphasizes exploiting geometric image models in a
Bayesian framework, has been proposed by Chan and Shen in [18,157]. The first model
introduced by them uses a total variation-based image model [158]. This model can
successfully propagate sharp edges into the damaged domain. However, because of a
regularization term, the model exacts a penalty on the length of edges, and, thus, the
inpainting model cannot connect contours across very large distances. Subsequently,
Chan et al. [159] introduced the Mumford-Shah model that allows both for isophotes
to be connected across large distances, and their directions to be kept continuous across
edges in the inpainting region.
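The isophote direction mentioned above for the method of Bertalmio et al. [6] is simply the image gradient rotated by 90 degrees. A minimal sketch of that single step is given below; it is an illustration only, not the PDE scheme of [6]. The function name and the use of `numpy.gradient` for the derivatives are assumptions made for the example.

```python
import numpy as np

def isophote_direction(image):
    """Unit vectors along lines of equal gray value (gradient rotated by 90 deg).

    Returns (tx, ty): components along the column (x) and row (y) axes.
    """
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (cols, x).
    gy, gx = np.gradient(np.asarray(image, dtype=float))
    # Rotating the gradient (gx, gy) by 90 degrees gives (-gy, gx).
    tx, ty = -gy, gx
    norm = np.hypot(tx, ty)
    norm[norm == 0] = 1.0          # leave flat regions as zero vectors
    return tx / norm, ty / norm
```

For an image whose intensity increases from left to right, the gradient points along x and the returned direction points along y, i.e., the isophotes are vertical lines.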
Oliveira et al. [160] have proposed a fast and simple inpainting method which
repeatedly convolves a Gaussian 3 × 3 filter over the missing regions to diffuse in-
formation. However, it requires specification of the diffusion barriers (high-gradient
areas) manually. Telea [161] uses intensity gradient and distance information of the
boundary pixels and weighs them by the isophote direction. Yet another inpaint-
ing algorithm [162] uses the Sobel edge operator’s magnitude and angle to compute
isophotes. An inpainting method based on belief propagation is proposed in [163].
An exemplar-based image completion method that unifies texture synthesis and image
inpainting with a priority belief propagation-based optimization is proposed in [164].
Rares et al. [17, 165] propose inpainting methods which use edge information both
for the reconstruction of the skeleton image structure in missing areas, as well as for
guiding the interpolation that follows.
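The diffusion idea attributed above to Oliveira et al. [160] can be sketched as repeated 3 × 3 smoothing in which only the missing pixels are overwritten. The sketch below is an illustration under stated assumptions: the binomial kernel, the mean-value initialization, and the function name are choices made here and are not taken from [160], and the manually specified diffusion barriers are omitted.

```python
import numpy as np

# Assumed 3x3 Gaussian-like (binomial) kernel; weights sum to one.
KERNEL = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], dtype=float) / 16.0

def diffuse_inpaint(image, mask, iterations=100):
    """Fill masked pixels by repeatedly smoothing; known pixels are kept."""
    out = np.asarray(image, dtype=float).copy()
    mask = np.asarray(mask, dtype=bool)
    out[mask] = out[~mask].mean()            # crude initial guess (assumption)
    rows, cols = out.shape
    for _ in range(iterations):
        p = np.pad(out, 1, mode="edge")
        blurred = np.zeros_like(out)
        for dr in range(3):
            for dc in range(3):
                blurred += KERNEL[dr, dc] * p[dr:dr + rows, dc:dc + cols]
        out[mask] = blurred[mask]            # update only the missing region
    return out
```

Because the kernel is isotropic, information diffuses across edges as well as along them, which is the limitation that motivates the diffusion barriers and the edge-guided methods discussed in this chapter.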
In [166], a PDE model based on the Cahn-Hilliard equation [167] is proposed for
binary inpainting of degraded text. The image repairing methods in [168, 169] employ
texture segmentation followed by robust, non-iterative multi-dimensional tensor voting
[170] to globally infer the most suitable pixel value in a neighborhood under the
MRF assumption. In recent years, there has been great interest in extensions such as video
inpainting [171], video repair [172], video stabilization [173] and deinterlacing of video
sequences [174].
A very recent development in inpainting is simultaneous recovery from noise as well
as damages. In [175], a PDE-based method is proposed to fill-in missing information
while removing noise. Inside the inpainting domain, smoothing operation is carried
out by mean-curvature-flow to enable transportation of boundary information, while
outside of the inpainting domain smoothing is encouraged within homogeneous regions
and discouraged across boundaries. In this chapter, we first propose an inpainting
method that is guided by edge information. We next embed noise filtering into the
inpainting framework.
7.2 AN EDGE-BASED APPROACH
Since the image content of different objects is generally uncorrelated and objects
are separated by edges, edge-based inpainting methods first reconstruct explicit edge
information in the damaged region. Filling-in then proceeds within each object, guided
Fig. 7.1: A typical edge-based inpainting methodology (from [17]): (Left) General algorithm outline and, (right) an illustration of the outputs for each stage.
by the reconstructed edges. The user is required to specify the region to be inpainted.
Edge-based inpainting algorithms mainly involve two steps: i) reconstruction of the
edge image (in the masked regions) from damaged image edges and those available in
the unmasked regions, and ii) an edge-based pixel aggregation procedure. Consider an
image I that has lost information in region Ω(⊂ I), the region to be inpainted. Let
δΩ(⊂ I) represent the outer boundary of the inpainting region. Note that observations
are not available inside Ω and virtually anything could have existed there. One must
‘predict’ the image content in Ω based on appropriate assumptions about image prop-
erties such as local continuity and independence across objects separated by edges.
Digital inpainting algorithms are known to undetectably fill-in missing information
provided the damaged area is not too large in size.
The steps employed in a typical edge-based inpainting method and the correspond-
ing outputs are illustrated in Fig. 7.1. Following step 1 in the figure, the boundary
edges are detected and propagated through the damages. Once the edges in the dam-
Fig. 7.2: (a) A real image superimposed with text. (b) Mask specifies the region where the original image information is lost. (c) A damaged photograph, and (d) its mask that specifies the regions to be filled-in.
aged region are reconstructed (as shown in the image corresponding to step 2), the
inpainting guided by the reconstructed edges yields the restored image shown in the
final step. In general, there can be more than one damaged region in an image.
In this section, we present a simple and effective edge-based inpainting method.
The proposed algorithm is similar to the one proposed by Rares et al. [17, 165] in the
sense that the major steps illustrated in Fig. 7.1 are performed, but it differs in the
methodology of implementation. Moreover, here our interest is not to just inpaint
the damaged images but to inpaint the damages while simultaneously filtering the
film-grain noise.
As stated earlier, we assume that the region to be filled in (Ω) is given a priori.
A convenient way to specify the regions to be filled-in is by an image mask which is
a binary image that distinguishes the regions that must be inpainted. For example,
for the image shown in Fig. 7.2 (a), the region to be inpainted (superimposed text)
is shown in the mask image (Fig. 7.2(b)) by white pixels. Fig. 7.2 (d) is another
example of a mask image that shows the regions to be inpainted in Fig. 7.2 (c).
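As an illustration of the mask convention, the sketch below represents Ω as a boolean array (True for pixels to be inpainted, matching the white pixels of the mask image) and derives an outer boundary δΩ from it. The use of the 8-neighborhood for the boundary and the function name are assumptions made for this example.

```python
import numpy as np

def outer_boundary(mask):
    """Pixels outside the mask that touch it in the 8-neighborhood."""
    mask = np.asarray(mask, dtype=bool)
    rows, cols = mask.shape
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    dilated = np.zeros_like(mask)
    # OR together the nine shifted copies of the mask (a 3x3 dilation).
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            dilated |= padded[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols]
    return dilated & ~mask

omega = np.zeros((5, 5), dtype=bool)
omega[2, 2] = True                      # a single missing pixel
available = ~omega                      # complement of the inpainting region
print(int(outer_boundary(omega).sum())) # number of boundary pixels
```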
7.2.1 Reconstruction of Edge Image
In any inpainting method, propagating the information in a uniform region is trivial.
The main difficulty arises at lost edges. As edges determine the topology, edge re-
construction needs to incorporate global image structure information in the inpainting
procedure. It is useful to note that the structure of the original image inside the region
Ω is a continuation of the (edge) structure outside it as illustrated earlier in Fig. 7.1.
Edge information can be utilized to independently propagate the information between
any two different objects in the inpainting region. Moreover, once we reconstruct the
edge image, the gray level propagation can be local. Thus, it is intuitive and reason-
able to first reconstruct the edges in the missing data regions to reliably propagate the
boundary image information.
It is important to note that reconstruction of edges from a damaged image is
different from the usual edge-linking process.
• The region of inpainting can be large and can result in much larger breaks in
edges than those that typically arise from edge-detection methods.
• The end points of the broken edges across a damaged region may even be farther
than some possible end points on the same side of the mask.
• The location/distance between pixels and the direction of edges become more
important than gradient magnitude and pixel intensity.
• Since the mask is user-specified and causes the edge gaps, its location and di-
mensions should be used to make the edge-linking process effective.
We propose the following edge-reconstruction procedure which makes effective use
of image features of the end points of edges, the contextual dimensions of the damages,
and the location of the mask. It comprises the following steps.
1. Preprocessing and edge detection: We first find the edge map over Ωc with a
Canny edge operator [176]. The threshold is set high so that only strong edges
are captured. This yields an edge image everywhere except at the masked (to-
be-inpainted) regions. Fig. 7.3 (b) shows the edge information in Ωc derived
from Fig. 7.3 (a).
2. End points and features:
Fig. 7.3: Edge reconstruction: (a) Degraded image showing maximum matching area dimensions. (b) Edge map of the degraded image. (c) Located end points, and (d) the reconstructed edge image.
(a) By searching over the boundary of the masked regions, the end points of
edges are located as points with exactly one neighboring edge pixel within
their eight point neighborhood. The end points thus detected are marked
with dots as shown in Fig. 7.3 (c).
(b) For each end point at location (ik, jk), its gradient direction Dk, magnitude
Gk, and the corresponding pixel intensity Ik are collected to form a feature
vector Fk = [ik, jk, Dk, Gk, Ik].
3. Matching: Our aim is to link edges across (possibly large) damaged regions.
(a) The end points are matched using an area constraint. The local matching
area for an end point is chosen to be a rectangle whose dimensions are set
to the largest breaks Mr and Mc along height and width, respectively, in
the strong edges due to damages. In Fig. 7.3 (a), we show selection of Mr
and Mc for the given degraded image.
(b) Two end points are considered for matching, only if i) the row and column-
wise absolute difference of their spatial locations is less than Mr and Mc,
respectively, and ii) the region between the two points belongs to Ω, the
inpainting region i.e., they are separated by the same damage.
(c) Ratio-test: Consider the end point p1 (with coordinates (i1, j1)) in Fig. 7.3
(c). Note that end point p2 (with coordinates (i2, j2)) as well as p3 (with
coordinates (i3, j3)) satisfy the matching area constraint for p1. To decide
which of these two points p1 should be matched to, the extent of damage in
the region between two end points is inferred as follows.
Inside the rectangular region whose opposite corners are determined by the
pixels (i1, j1) and (i2, j2) to be matched, the ratio of masked pixels to the
available pixels is determined. If the ratio is higher than a threshold r, we
decide that the region between these end points is a damaged region and
consider only such points for matching. We note that the rectangular box
enclosing p1 and p3 in Fig. 7.3 (c) includes a larger fraction of available
pixels, in contrast to the rectangular box enclosing p1 and p2. Hence, p2 is
a better candidate-match for p1 than p3.
(d) Based on (b) and (c), we perform constrained matching as follows: Let
$F_{p_m}$ and $F_{p_n}$ denote the feature vectors corresponding to points $p_m$ and
$p_n$, respectively. If $|i_m - i_n| < M_r$ and $|j_m - j_n| < M_c$ and the pixels
in the rectangle satisfy the ratio test (c), then we compute the distance
$|F_{p_m} - F_{p_n}|^2$.
(e) The end points with the minimum distance are selected as the matches, i.e.,
$L(p_i) = \arg\min_k |F_{p_i} - F_{p_k}|^2$ among all candidate end points $p_k$.
This mask-dependent local matching strategy suppresses false matches very ef-
fectively as compared to naive direct matching which uses only the feature vec-
tors of the end points.
4. Edge linking: Only the mutually matched end points which satisfy pk = L(pm)
and pm = L(pk) are connected to obtain the reconstructed edge map from
the degraded image. In Fig. 7.3 (d), we show the reconstructed edge image
corresponding to Fig. 7.3(a) using the above steps.
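Steps 2 to 4 above can be sketched as follows. This is a simplified illustration rather than the implementation used for the results: condition (ii) of step 3(b) is approximated by the ratio test alone, the feature vectors are compared without any scaling, and all function and parameter names are assumptions made for the example.

```python
import numpy as np

def find_end_points(edges, search_region=None):
    """Step 2(a): edge pixels with exactly one edge pixel among their 8 neighbors."""
    e = np.asarray(edges, dtype=bool)
    rows, cols = e.shape
    p = np.pad(e, 1).astype(int)
    nbrs = sum(p[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols]
               for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0))
    ends = e & (nbrs == 1)
    if search_region is not None:        # e.g. the boundary of the masked regions
        ends &= np.asarray(search_region, dtype=bool)
    return [(int(r), int(c)) for r, c in np.argwhere(ends)]

def masked_ratio(mask, p, q):
    """Step 3(c): masked / available pixels in the rectangle with corners p, q."""
    r0, r1 = sorted((p[0], q[0]))
    c0, c1 = sorted((p[1], q[1]))
    box = np.asarray(mask, dtype=bool)[r0:r1 + 1, c0:c1 + 1]
    masked = int(box.sum())
    available = box.size - masked
    return np.inf if available == 0 else masked / available

def match_end_points(points, features, mask, Mr, Mc, r):
    """Steps 3(d), 3(e) and 4: constrained, mutually consistent matches."""
    F = np.asarray(features, dtype=float)   # rows are [i, j, D, G, I]
    n = len(points)
    dist = np.full((n, n), np.inf)
    for m in range(n):
        for k in range(n):
            if m == k:
                continue
            (im, jm), (ik, jk) = points[m], points[k]
            if (abs(im - ik) < Mr and abs(jm - jk) < Mc
                    and masked_ratio(mask, points[m], points[k]) > r):
                dist[m, k] = np.sum((F[m] - F[k]) ** 2)
    best = dist.argmin(axis=1)               # L(p_m) for every end point
    pairs = []
    for m in range(n):
        k = int(best[m])
        # Keep only mutual matches, and report each pair once.
        if np.isfinite(dist[m, k]) and int(best[k]) == m and m < k:
            pairs.append((m, k))
    return pairs
```

Each returned pair indexes two end points that would be joined in the edge-linking step; how the connecting curve is drawn is not covered by this sketch.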
7.2.2 The Proposed Method
Once the edge image is reconstructed, inpainting can proceed by using suitably-derived
boundary neighbors. In our method, the edge information is effectively used to define
the neighboring pixels so as to limit the propagation of information within each object.
The algorithmic steps of the proposed inpainting method are as follows:
If (m, n) ∈ Ω,
1. S = ∅. Take a window W with initial size 3 × 3 about the location (m, n).
2. Collect valid neighbors within the window W and append them as
$S = S \cup \{ I(i, j) : (i, j) \in W,\ (i, j) \notin \Omega \}$.
Note that the valid neighbors are pixels that belong to $\Omega^c$.
3. Check independently in all four directions whether the end pixel of the window
encounters an edge pixel of the reconstructed edge image; otherwise, increment
the size of window W in that direction. This limits the propagation of the
information within the object and renders it independent of other objects.
Let $w_l$ be the width of the window W between pixel (m, n) and the extreme
pixel on the left. We set $w_l = w_l + 1$ if $(m, n - w_l) \notin E$; otherwise $w_l$ is
left unchanged. Here E is the domain of non-zero edge pixels. Similarly, update the
window length along the right ($w_r$), top ($w_t$) and bottom ($w_b$) directions.
4. Stop pixel collection when a certain number of available neighbors are accumulated,
i.e., stop if the cardinality of S reaches T; else repeat steps (2) and (3). We set
the threshold T as 25.
Finally, the inpainted observation is $y_p(m, n) = \mathrm{median}(S)$.
Here, the median is preferred over the average value of the pixels in set S for
better noise robustness and edge-preservation. Also, note that due to the appending
operation employed in step 2 of the algorithm, the computed median is actually a
weighted median that gives higher priority to the nearest valid neighbors. A schematic
representation of the proposed edge-based inpainting approach is shown in Fig. 7.4.
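The steps above can be sketched as follows for a noise-free image. Two details are assumptions of this sketch and are not stated in the text: collection stops once S holds at least T values, and it also stops when the window can no longer grow in any direction. The function names are likewise illustrative.

```python
import numpy as np

def inpaint_pixel(image, mask, edges, m, n, T=25):
    """Edge-limited, growing-window median for one missing pixel (m, n)."""
    rows, cols = image.shape
    wl = wr = wt = wb = 1                  # half-widths of the initial 3x3 window
    S = []                                 # collected gray values (with repeats)
    while True:
        r0, r1 = max(m - wt, 0), min(m + wb, rows - 1)
        c0, c1 = max(n - wl, 0), min(n + wr, cols - 1)
        window = image[r0:r1 + 1, c0:c1 + 1]
        valid = ~mask[r0:r1 + 1, c0:c1 + 1]
        # Step 2: append valid neighbors; nearer pixels are re-appended on each
        # pass, which is what makes the final median a weighted one.
        S.extend(window[valid].tolist())
        if len(S) >= T:                    # step 4 (assumed as 'at least T')
            break
        # Step 3: grow each side unless its end pixel lies on a reconstructed edge.
        grown = False
        if n - wl >= 0 and not edges[m, n - wl]:
            wl += 1; grown = True
        if n + wr < cols and not edges[m, n + wr]:
            wr += 1; grown = True
        if m - wt >= 0 and not edges[m - wt, n]:
            wt += 1; grown = True
        if m + wb < rows and not edges[m + wb, n]:
            wb += 1; grown = True
        if not grown:                      # assumed extra stopping rule
            break
    return float(np.median(S)) if S else float("nan")

def inpaint(image, mask, edges, T=25):
    """Apply inpaint_pixel to every pixel of the mask."""
    image = np.asarray(image, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    edges = np.asarray(edges, dtype=bool)
    out = image.copy()
    for m, n in np.argwhere(mask):
        out[m, n] = inpaint_pixel(image, mask, edges, int(m), int(n), T)
    return out
```

In this sketch every missing pixel is estimated from the originally available pixels only, so the result does not depend on the order in which the masked pixels are visited.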
Fig. 7.4: Proposed inpainting algorithm.
7.3 RESULTS FOR INPAINTING
In this section, we demonstrate the performance of the proposed inpainting algorithm
on various synthetic as well as real degradations and compare it with well-known
existing techniques.
We begin with inpainting scratches on a synthetic image. Fig. 7.5(a) shows an
original brick image and Fig. 7.5(b) shows the degraded version with scratches (some
of which occlude edges). The output using the proposed inpainting scheme
is shown in Fig. 7.5(c). Even though this is a simulated example, the edge map was
reconstructed from the degraded image since the original image is never available in
real situations. We note that the inpainted image is quite close to the original image
even at the edges.
We next consider the degraded image shown in Fig. 7.6(a). The inpainting result
using the proposed algorithm is shown in Fig. 7.6(b). For comparison, we have
reproduced the output of the method described in [6] for the same image. Our result
Fig. 7.5: Brick image (a) Original, (b) scratched, and (c) inpainted.
Fig. 7.6: A synthetic image (a) with a ring mask. (b) Inpainted output. (c) Result of Bertalmio [6].
is sharper at edges than that of [6] (Fig. 7.6(c)) but the method in [6] preserves broken
curvature better.
Next, we consider Fig. 7.7(a) which shows an original peppers image. The dam-
aged version with thick line scratches and patches is shown in Fig. 7.7(b). Using the
edge map of the degraded image (Fig. 7.7 (c)), which is known only at unmasked
regions, we arrive at the reconstructed edge map shown in Fig. 7.7(d). The final
inpainted image is given in Fig. 7.7(e). We note that the final result is quite good.
In Fig. 7.8(a), we consider the case of a real image of an old damaged painting.
The edge image in the unmasked regions is shown in Fig. 7.8(b). The reconstructed
Fig. 7.7: Peppers image. (a) Original, (b) scratched, (c) edges from degraded image, (d) reconstructed edges, and (e) inpainted image using reconstructed edge map.
edge map using our edge reconstruction method is shown in Fig. 7.8(c) and the final
inpainted output using the proposed algorithm is given in Fig. 7.8(d). We note that
except at the middle right folds of the coat, where the edge reconstruction is not
perfect, our result compares favorably with the output (Fig. 7.8(e)) of the PDE-based
iterative approach of [6].
Next, we present inpainting results on a real damaged photograph (Fig. 7.9(a))
taken from [6]. The output of our algorithm and that of [6] are shown in Figs. 7.9(b)
and (c), respectively. While both the methods exhibit comparable performance overall,
our algorithm does better near the eyes while the result of [6] is better in inpainting
the damage near the bottom-hand.
Fig. 7.8: An old painting (a) Degraded image. (b) Edge map from degraded image. (c) Reconstructed edge map. (d) Inpainted result using reconstructed edge map. (e) Output cropped from [6].
We next consider inpainting of superimposed text. Fig. 7.10(a) shows a bird image
with text written on it. The inpainting result of the proposed algorithm is shown in
Fig. 7.10(b). We also show the result of Chan and Shen [18] in Fig. 7.10(c) for
comparison. Our method, despite being less complex, is quite close to that of [18] in
performance.
Finally, we apply the proposed algorithm on a horse-cart image (Fig. 7.11(a))
superimposed with dense text. Fig. 7.11 (b) shows the inpainted image obtained with
our algorithm. The image inpainting result of [6] is shown in Fig. 7.11(c). Again, our
approach performs comparably.
Fig. 7.9: A real image (a) with mask on damaged region. (b) Inpainted result using our method. (c) Result of Bertalmio [6].
Fig. 7.10: (a) A bird image superimposed with text. Inpainted result using (b) proposed method, and (c) method in [18].
7.4 INPAINTING IN PRESENCE OF NOISE
In chapter 5, we explored UKF for film-grain noise suppression. Old photographic
images and movies not only suffer from damages due to scratch, stain or blotches,
but also from film-grain noise. Text inpainting in the presence of film-grain noise is
also important for restoration and compression of photographic images and movies
containing subtitles and markers. In contrast to filtering, we must entirely rely on the
surrounding available pixels to estimate a pixel in image inpainting. This can be partially
accomplished during the prediction step of a recursive filter. Reliable prediction augurs
Fig. 7.11: Horse-cart image (a) with superimposed text. (b) Inpainted output using proposed method. (c) Output of [6].
well for simultaneous inpainting and filtering. Moreover, the prediction must not
depend on the missing observations in order to identify the state model parameters.
The importance sampling unscented Kalman filter (ISUKF), proposed in chapter 5, has a reliable
prediction step (without being dependent on observations) which suits our current
requirement. However, for the update step (in the damaged regions) we derive ‘virtual’
observations from the boundary topology and gray levels. This is in contrast to using
‘known’ observations in traditional restoration.
7.4.1 ISUKF for Simultaneous Inpainting and Filtering
In this subsection, we propose a unified framework for inpainting images in the presence
of film-grain noise. The proposed inpainting process is embedded within the recursive
ISUKF framework to simultaneously combat noise while also accounting for missing
data. Prediction is based on the already estimated NSHP pixels as before. The issue in
performing inpainting within the filter framework is to handle missing observations. To
address this, we propose to reconstruct a missing observation by using the (available)
surrounding observations in our inpainting method. Inpainting is invoked only when we
encounter a missing pixel during the sequential filtering process. We first reconstruct
the edge image from the damaged noisy image, exactly as in section 7.2.1. We set a
high threshold for finding edges to mitigate the effect of noise on edge reconstruction.
We use the reconstructed edge map to guide the inpainting process as described in
section 7.2.2.
Details of the proposed method to sequentially filter and recover the intensity of
the original image at each pixel (m, n) in a raster-scan order from left-to-right are as
follows:
1. Prediction is based on the DAMRF prior exactly as in ISUKF (chapter 5). We
first formulate the DAMRF conditional density following step (1) and predict
the state mean and covariance by importance sampling of this non-Gaussian pdf
as in step (2) of section 5.3.1.
2. Inpainting is performed only for the missing observation pixels. We use the
output of our inpainting algorithm for ‘reconstructing’ the observation pixel
intensity at (m, n) based on the available neighbors in the observation image.
3. The UKF update step is as in ISUKF (step (3) of section 5.3.1). Note that for
missing data, we have the inpainted observations.
(a) We predict the state sigma points using Eq. (4.1) based on the predicted
mean and covariance.
Fig. 7.12: Proposed framework for inpainting in the presence of noise.
(b) We use the film-grain nonlinearity (Eq. 5.8) to predict the measurement
sigma points using the predicted state and measurement noise sigma points
as in step 3(b) of section 5.3.1.
(c) Following step 3 (c) of section 5.3.1, we update the mean and covariance of
the state at each pixel. In order to do this, we require the observations from
the degraded image. In case of missing observations, we use the inpainted
output from step 2 (above).
4. The updated state is the estimated pixel. This is used in the formulation of the
DAMRF prior and also for inpainting subsequent missing observations.
Figure 7.12 gives a flow diagram of the proposed filter.
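The control flow of steps 1 to 4 can be sketched as a raster-scan loop in which the three stages are supplied as functions. Only the ordering of the stages is illustrated here. The callables `predict`, `inpaint_observation` and `update` are hypothetical stand-ins for the DAMRF-based prediction, the edge-guided inpainting of section 7.2.2 and the UKF update of section 5.3.1, none of which is implemented in this sketch.

```python
import numpy as np

def joint_inpaint_filter(y, mask, predict, inpaint_observation, update):
    """Raster-scan recursion: predict, obtain an observation, then update.

    y    : degraded (noisy, damaged) observation image
    mask : True where the observation is missing
    """
    y = np.asarray(y, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    rows, cols = y.shape
    x_hat = np.zeros((rows, cols), dtype=float)
    for m in range(rows):
        for n in range(cols):
            # Step 1: predict from already estimated pixels (hypothetical callable).
            mean, var = predict(x_hat, m, n)
            # Step 2: a missing observation is replaced by a 'virtual' one.
            if mask[m, n]:
                obs = inpaint_observation(y, x_hat, mask, m, n)
            else:
                obs = y[m, n]
            # Steps 3 and 4: the updated state is the estimated pixel.
            x_hat[m, n] = update(mean, var, obs)
    return x_hat
```

Passing both `y` and `x_hat` to `inpaint_observation` is a design choice of this sketch, so that either the available observations or the previously estimated pixels can be used when forming the virtual observation.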
7.5 RESULTS FOR JOINT RECOVERY
In this section, we recover damaged regions in images with both synthetic and real
film-grain noise. We begin with a synthetic image with two ellipses, a white region to
Fig. 7.13: (a) Image degraded by film-grain noise and a big patch. (b) Output of the proposed method.
Fig. 7.14: Boat (a) Original. (b) Degraded and scratched. (c) Recovered image.
be inpainted, and corrupted by simulated film grain noise, as shown in Fig. 7.13(a).
The proposed recursive filter not only suppresses film-grain noise but also inpaints
efficiently the damaged region as shown in Fig. 7.13(b).
Next, we show a boat image and its degraded version in Figs. 7.14 (a) and (b),
respectively. Note that the masts have been scratched through. The image estimated
by the proposed algorithm and shown in Fig. 7.14(c) recovers most of the details,
except at the very narrow poles where the neighboring information is very different.
In the next example, we apply the proposed approach to suppress film-grain noise
while removing dense superimposed text. An original peppers image is shown in Fig.
Fig. 7.15: Peppers (a) Original. (b) Degraded and superimposed with text. (c) Reconstructed edge map. (d) Image recovered by the proposed filter (ISNR = 16.16 dB). (e) Degraded with high noise. (f) Recovered result using reconstructed edge map (when input is (e)) (ISNR = 14.38 dB).
7.15(a). It is superimposed with dense text and corrupted (initially) with a low level
of film-grain noise as shown in Fig. 7.15(b). The reconstructed edge image is shown in
Fig. 7.15 (c). The recovered image using the reconstructed edge map is shown in Fig.
7.15(d). Next, we introduced a high level of film-grain noise as shown in Fig. 7.15(e)
and the corresponding inpainted and denoised result is shown in Fig. 7.15 (f). For
both levels of noise, we observe that the outputs are quite close to the original (Fig.
7.15(a)). Here we have given the ISNR values for a quantitative comparison.
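For reference, the ISNR is commonly computed as the ratio, in dB, of the error energy of the degraded image to that of the restored image, both measured against the original. The sketch below assumes this standard definition; the function name is illustrative, and how the masked pixels of the degraded image enter the computation for Fig. 7.15 is not specified here.

```python
import numpy as np

def isnr_db(original, degraded, restored):
    """Improvement in SNR: 10 log10(||x - y||^2 / ||x - x_hat||^2)."""
    x = np.asarray(original, dtype=float)
    y = np.asarray(degraded, dtype=float)
    x_hat = np.asarray(restored, dtype=float)
    return 10.0 * np.log10(np.sum((x - y) ** 2) / np.sum((x - x_hat) ** 2))
```

A positive value means the restored image is closer to the original than the degraded input was.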
Next, we consider a face image with real film-grain noise but simulated scratches
as shown in Fig. 7.16(a). The result of the proposed filter (Fig. 7.16(b)) looks quite
natural, recovering the fine details near the eyebrows, hair and nose. Scratches on
Fig. 7.16: Face with real film-grain. (a) Scratched version. (b) Inpainted and filtered result (when input is (a)). (c) Degraded by patches. (d) Image recovered by our filter (when input is (c)).
the ears and near the mouth are also well recovered. Even when the face image is
damaged with patches and thick stripes (Fig. 7.16(c)), the recovered image by the
proposed filter (shown in Fig. 7.16(d)) recovers most of the details, except for small
artifacts near the forehead and mouth.
A cropped portion of a frame from the film “Dr. Mabuse” with real film-grain noise
is synthetically damaged at many places as shown in Fig. 7.17(a). The result of the
proposed edge reconstruction method is given in Fig. 7.17(c); it bridges
most of the major breaks in Fig. 7.17(b). The image recovered by the proposed filter
is shown in Fig. 7.17(d). The edges at the shoulders and at the tie are well recovered,
and the image details near the eyes and nose are restored properly.
Fig. 7.17: Mabuse (a) with scratches. (b) Edge image from (a). (c) Reconstructed edge image (when input is (b)). (d) Image recovered by the proposed filter.
We next demonstrate the capability of the proposed approach to recover an image
of a scanned painting with real film-grain noise (Fig. 7.18(a)). It is scratched at
several places, including at edges, as shown in Fig. 7.18(b). The proposed approach
suppresses the film-grain noise and inpaints effectively (Fig. 7.18(c))
while retaining the image texture and finer details such as the ship partitions.
We have also removed scratches from an image captured on photographic
film. The image is that of a locomotive shown in Fig. 7.19(a). In the output
of the proposed algorithm shown in Fig. 7.19(b), the numbers and letters
are properly recovered and the noise is well suppressed.
Finally, we consider another real example that has film-grain noise, embedded
Fig. 7.18: (a) Scanned painting with real film-grain noise. (b) Scratched version, and (c) recovered image.
Fig. 7.19: (a) An image captured using a film-camera, and (b) recovered output.
text and a vertical scratch as shown in Fig. 7.20(a). The image recovered using the
proposed approach is shown in Fig. 7.20(b). It appears natural and free of grain.
We have successfully inpainted the titles and the mark in the middle of the face, and
have also recovered facial details such as the hair, nose and mouth quite well.
Fig. 7.20: (a) A child image with text, scratch and film-grain noise. (b) Inpainted and filtered output of the proposed method.
These examples demonstrate that the proposed filter is quite effective in suppressing
film-grain noise while performing image inpainting. A Matlab implementation of the
proposed algorithm takes about 20 seconds to process an image of size 200 × 200 pixels
on a Pentium-IV PC with 256 MB RAM.
7.6 DISCUSSION
We proposed a novel recursive filter based on the unscented Kalman filter to recover
an image degraded by both film-grain noise and scratches. We first proposed an edge-based
inpainting scheme that is suitable for incorporation into a recursive framework. As
our inpainting method relies on the edge image, we developed a local mask-dependent
edge-linking method to link edges across damaged portions, and a pixel
aggregation procedure to inpaint missing pixels.
To inpaint damages in images corrupted by film-grain noise, we proposed a recur-
sive scheme based on the UKF which uses our inpainting method to derive the ob-
servations in the damaged regions. Prediction was based on a discontinuity-adaptive
MRF prior which preserves the edges while achieving good noise-reduction in uniform
regions.
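The edge-preserving behaviour of a discontinuity-adaptive prior can be seen by comparing its potential with a quadratic (Gaussian MRF) potential. The sketch below assumes the bounded form g(eta) = gamma - gamma*exp(-eta^2/gamma), which belongs to the family of discontinuity-adaptive functions discussed by Li [78]; this particular form, the value of `gamma` and the function names are assumptions made for illustration and are not taken from this chapter.

```python
import numpy as np

def quadratic_potential(eta):
    """Gaussian MRF potential: the penalty grows without bound with |eta|."""
    return np.asarray(eta, dtype=np.float64) ** 2

def da_potential(eta, gamma=100.0):
    """Assumed discontinuity-adaptive potential, bounded above by gamma."""
    eta = np.asarray(eta, dtype=np.float64)
    return gamma - gamma * np.exp(-(eta ** 2) / gamma)

# Pixel differences: flat region, mild variation, moderate edge, strong edge.
diffs = np.array([0.0, 2.0, 50.0, 200.0])
penalties_quadratic = quadratic_potential(diffs)
penalties_da = da_potential(diffs)
```

For small differences the two potentials nearly coincide, so noise in uniform regions is smoothed; for large differences the adaptive potential saturates, so an edge is not penalised in proportion to its squared height.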
CHAPTER 8
CONCLUSIONS
In this thesis, we addressed the problem of recovering an image from its degraded
observation through novel edge-preserving extensions of the Kalman filter and its variants.
We first investigated the problem of filtering additive white Gaussian noise (AWGN). The original image was
modeled with a non-Gaussian MRF prior based on a discontinuity-adaptive poten-
tial function. This conditional prior implicitly models the non-homogeneous nature of
images with local statistical information. We brought in the principle of importance
sampling to predict the mean and covariance of the state conditional prior, which are
then fed to the update step of the Kalman filter to arrive at the final image estimate.
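The prediction and update steps summarised above can be sketched for a single pixel as follows. This is a minimal illustration rather than the thesis implementation: the bounded potential, the uniform proposal over the gray-level range [0, 255], the sample count and all function names are assumptions.

```python
import numpy as np

def da_potential(eta, gamma):
    """Bounded discontinuity-adaptive potential (assumed form)."""
    return gamma - gamma * np.exp(-(eta ** 2) / gamma)

def predict_from_prior(neighbors, gamma=50.0, num_samples=20000, rng=None):
    """Self-normalised importance sampling estimate of the mean and variance
    of p(x | neighbors), taken proportional to exp(-sum_n g(x - x_n))."""
    rng = np.random.default_rng() if rng is None else rng
    neighbors = np.asarray(neighbors, dtype=np.float64)
    # Proposal: uniform over the gray-level range (an assumed choice).
    samples = rng.uniform(0.0, 255.0, size=num_samples)
    # Energy of every sample with respect to all neighbors.
    energy = da_potential(samples[:, None] - neighbors[None, :], gamma).sum(axis=1)
    # The uniform proposal density is constant, so it cancels on normalisation.
    # Subtracting the minimum energy avoids numerical underflow.
    weights = np.exp(-(energy - energy.min()))
    weights /= weights.sum()
    mean = np.sum(weights * samples)
    var = np.sum(weights * (samples - mean) ** 2)
    return mean, var

def kalman_update(x_pred, p_pred, y, noise_var):
    """Scalar Kalman update for the observation y = x + v, var(v) = noise_var."""
    gain = p_pred / (p_pred + noise_var)
    x_est = x_pred + gain * (y - x_pred)
    p_est = (1.0 - gain) * p_pred
    return x_est, p_est
```

With neighbors such as [100, 100, 100, 200], the predicted mean stays close to 100 because the bounded potential limits the influence of the outlying neighbor; this is the mechanism by which the prior avoids blurring across an edge.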
We next examined the problem of handling non-linearity in photographic films and
proposed methods within the framework of the UKF. As a first step, we modeled the
image by a homogeneous AR state model with NSHP support. Assuming the initial
statistics of the state, we determined sigma points and transformed them through the
AR state model followed by the observation model. Employing UT, we predicted the
state and measurement statistics for use in the update stage of UKF.
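The unscented transformation (UT) referred to above can be sketched as follows. The code uses the basic symmetric sigma-point set with a single scaling parameter `kappa`; the parameter value, the function names and the generic mapping `func` are illustrative assumptions, and the scaled variants discussed in the literature differ in their weights.

```python
import numpy as np

def sigma_points(mean, cov, kappa=0.0):
    """Return the 2n+1 symmetric sigma points and their weights."""
    mean = np.asarray(mean, dtype=np.float64)
    cov = np.asarray(cov, dtype=np.float64)
    n = mean.size
    # Lower-triangular L with L @ L.T = (n + kappa) * cov.
    scaled_sqrt = np.linalg.cholesky((n + kappa) * cov)
    # Rows of scaled_sqrt.T are the columns of L.
    points = np.vstack([mean, mean + scaled_sqrt.T, mean - scaled_sqrt.T])
    weights = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    weights[0] = kappa / (n + kappa)
    return points, weights

def unscented_transform(func, mean, cov, kappa=0.0):
    """Approximate the mean and covariance of func(x) for x ~ (mean, cov)."""
    points, weights = sigma_points(mean, cov, kappa)
    # Propagate every sigma point through the (possibly nonlinear) mapping.
    transformed = np.array([np.atleast_1d(func(p)) for p in points])
    y_mean = weights @ transformed
    diff = transformed - y_mean
    y_cov = (weights[:, None] * diff).T @ diff
    return y_mean, y_cov
```

In a filter of this kind, `func` would stand for the state model in the prediction stage and for the observation model when forming the measurement statistics. For a linear mapping the transform reproduces the exact mean and covariance, which is a convenient sanity check.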
We further relaxed the linearity constraint imposed by the homogeneous AR model
in UKF and incorporated an edge-preserving MRF image prior to capture local statis-
tics. The principle of importance sampling was adopted to predict the mean and
covariance of the state from the prior, which allowed us to determine the predicted
sigma points directly from the prior. These were propagated through the film-grain
observation model to obtain measurement sigma points and subsequently the statis-
tics required for the update step of the UKF. Experimental results on film-grain noise
reduction in photographic images were given to demonstrate the effectiveness of the
UKF-based algorithms.
We next addressed the problem of suppression of speckle noise in SAR images.
Here, one had to cope with multiplicative noise and yet preserve critical point and
line features. In order to achieve these goals, we tailored the DAMRF prior for SAR
imagery. The predicted sigma points using this prior were propagated through the
multiplicative model and were subsequently used to arrive at the image estimates.
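As a worked check on the multiplicative model, consider a scalar pixel x observed as y = x*v, with x and v independent and E[v] = 1 (unit-mean speckle is a common convention and is assumed here). The first two moments needed by an update step then have a closed form, which can serve as a reference when testing a sigma-point implementation. The function name and arguments below are hypothetical.

```python
def multiplicative_moments(mean_x, var_x, var_v):
    """Moments of y = x * v for independent x and v with E[v] = 1."""
    # E[y] = E[x] E[v] = E[x].
    mean_y = mean_x
    # var(y) = E[x^2] E[v^2] - E[x]^2 = var_x (1 + var_v) + var_v E[x]^2.
    var_y = var_x * (1.0 + var_v) + var_v * mean_x ** 2
    # cov(x, y) = E[x^2] E[v] - E[x]^2 = var_x.
    cov_xy = var_x
    return mean_y, var_y, cov_xy
```

The measurement variance grows with the square of the local mean, which reflects the signal-dependent nature of speckle: bright regions appear noisier than dark ones.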
Finally, we considered the problem of digital inpainting of noisy photographs. We
first developed an edge-based inpainting method which relies on the reconstructed
edge image. The end points of edges are matched based on a mask-based constrained
local matching strategy. Missing pixels are inpainted by propagating the boundary
information which is guided by the reconstructed edge map. Embedding the developed
inpainting method within the ISUKF-based filtering framework, we proposed a method
to simultaneously suppress film-grain noise while inpainting missing pixels.
All the proposed approaches were validated on synthetic as well as real examples
and were compared with existing methods to demonstrate their effectiveness.
8.1 SUGGESTIONS FOR FUTURE WORK
There are several directions for pursuing further research. We list a few of them below.
• Extending the proposed ISKF and ISUKF for image restoration in the presence
of blur.
• Investigating the utility of the proposed ISUKF framework for other noise mod-
els and nonlinear restoration problems.
• Improving the performance of ARUKF and ISUKF through more accurate pre-
diction of the means and covariances using higher-order UT methods [137,138].
• Extending the inpainting ISUKF framework for occlusion-free tracking in videos.
• Exploring Gaussian-mixture UKF models for tracking multiple objects or people
in image sequences.
BIBLIOGRAPHY
[1] A. C. Bovik, Handbook of Image and Video Processing. Academic Press, 2005.
[2] A. K. Katsaggelos, “Recent trends in image restoration and enhancement techniques,” in Proc. IEEE Asia Pacific Conf. on Circuits and Systems, pp. 458–459, 1996.
[3] H. C. Andrews and B. R. Hunt, Digital image restoration. New Jersey: Prentice-Hall, Inc, 1977.
[4] A. M. Tekalp and G. Pavlovic, “Image restoration with multiplicative noise: Incorporating the sensor nonlinearity,” IEEE Trans. Signal Process., vol. 39, pp. 2132–2136, 1991.
[5] J. W. Goodman, “Some fundamental properties of speckle,” J. Opt. Soc. Amer., vol. 66, pp. 1145–1150, 1976.
[6] M. Bertalmio, G. Sapiro, V. Caselles, and C. Ballester, “Image inpainting,” in Proc. SIGGRAPH, Computer Graphics Proceedings, pp. 417–424, 2000.
[7] R. E. Kalman, “A new approach to linear filtering and prediction problems,” Trans. ASME, Ser. D, Journal of Basic Engineering, vol. 82, pp. 34–45, 1960.
[8] J. W. Woods and C. H. Radewan, “Kalman filtering in two dimensions,” IEEE Trans. Inform. Theory, vol. 23, pp. 473–482, 1977.
[9] F. Daum, “Nonlinear filters: beyond the Kalman filter,” IEEE Aerospace and Electronic Systems Magazine, vol. 20, pp. 57–69, 2005.
[10] Z. Chen, “Bayesian filtering: From Kalman filters to particle filters, and beyond.” Technical Report, 2003.
[11] S. R. Kadaba, S. B. Gelfand, and R. L. Kashyap, “Recursive estimation of images using non-Gaussian autoregressive models,” IEEE Trans. Image Processing, vol. 7, pp. 1439–1452, 1998.
[12] A. Achim, E. E. Kuruoglu, and J. Zerubia, “SAR image filtering based on the heavy-tailed Rayleigh model,” IEEE Trans. Image Processing, vol. 15, pp. 2686–2693, 2006.
[13] M. Dai, C. Peng, A. K. Chan, and D. Loguinov, “Bayesian wavelet shrinkage with edge detection for SAR image despeckling,” IEEE Trans. Geoscience and Remote Sensing, vol. 42, pp. 1642–1648, 2004.
[14] F. Argenti, T. Bianchi, and L. Alparone, “Multiresolution MAP despeckling of SAR images based on locally adaptive generalized Gaussian pdf modeling,” IEEE Trans. Image Processing, vol. 15, pp. 3385–3399, 2006.
[15] L. Gagnon and A. Jouan, “Speckle filtering of SAR images - a comparative study between a complex-wavelet-based and standard filters,” Proc. SPIE, vol. 3169, pp. 80–91, 1997.
[16] O. Lankoande, M. M. Hayat, and B. Santhanam, “Speckle modeling and reduction in synthetic aperture radar imagery,” in Proc. IEEE Int. Conf. Image Processing (ICIP), pp. III:317–320, 2005.
[17] A. Rares, M. Reinders, and J. Biemond, “Edge-based image restoration,” IEEE Trans. Image Processing, vol. 14, pp. 1454–1468, 2005.
[18] T. Chan and J. Shen, “Mathematical models for local non-texture inpaintings,” SIAM Journal on Applied Mathematics, vol. 62, pp. 1019–1043, 2001.
[19] A. K. Katsaggelos, “Iterative image restoration algorithms,” Optical Engineering, vol. 28, pp. 735–748, 1989.
[20] M. A. Robertson and R. L. Stevenson, “DCT quantization noise in compressed images,” IEEE Trans. Circuits and Systems for Video Technology, vol. 15, pp. 27–38, 2005.
[21] H. Soltanian-Zadeh, J. P. Windham, and A. E. Yagle, “A multi-dimensional non-linear edge-preserving filter for magnetic resonance image restoration,” IEEE Trans. Image Processing, vol. 4, pp. 147–161, 1995.
[22] F. T. Ulaby, “Radar signatures of terrain: Useful monitors of renewable resources,” Proc. IEEE, vol. 70, pp. 1410–1433, 1982.
[23] A. Kokaram, R. Morris, W. Fitzgerald, and P. Rayner, “Interpolation of missing data in image sequences,” IEEE Trans. Image Processing, vol. 11, pp. 1509–1519, 1995.
[24] S. Masnou and J. Morel, “Level-lines based disocclusion,” in Proc. IEEE Int. Conf. Image Processing (ICIP), pp. 259–263, 1998.
[25] A. K. Jain, Fundamentals of digital image processing. India: Prentice Hall, 2000.
[26] T. Berger, J. O. Stromberg, and T. Eltoft, “Adaptive regularized constrained least squares image restoration,” IEEE Trans. Image Processing, vol. 8, pp. 1191–1203, 1999.
[27] D. Angwin and H. Kaufman, “Image restoration using a reduced order model Kalman filter,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP), pp. 1000–1003, 1988.
[28] H. Kaufman, J. W. Woods, D. Subrahmanyam, and A. M. Tekalp, “Estimation and identification of two-dimensional images,” IEEE Trans. Automatic Control, vol. 28, pp. 745–756, 1983.
[29] F. C. Jeng and J. W. Woods, “Inhomogeneous Gaussian image models for estimation and restoration,” IEEE Trans. Acoust., Speech, Signal Processing, vol. 36, pp. 1305–1312, 1988.
[30] A. M. Tekalp, H. Kaufman, and J. Woods, “Edge-adaptive image restoration with ringing suppression,” IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 892–899, 1989.
[31] J. Biemond and J. Gerbrands, “An edge preserving recursive noise smoothing algorithm for image data,” IEEE Trans. Systems, Man and Cybernetics, vol. 9, pp. 622–627, 1979.
[32] H. R. Keshavan and M. D. Srinath, “Sequential estimation technique for enhancement of noisy images,” IEEE Trans. Computers, vol. 26, pp. 971–988, 1977.
[33] Y. C. Chang, S. R. Kadaba, P. C. Doerschuk, and S. B. Gelfand, “Image restoration using recursive Markov random field models driven by Cauchy distributed noise,” IEEE Trans. Signal Process. Letters, vol. 8, pp. 65–66, 2001.
[34] S. Geman and D. Geman, “Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 6, pp. 721–741, 1984.
[35] C. Bouman and K. Sauer, “A generalized Gaussian image model for edge-preserving MAP estimation,” IEEE Trans. Image Processing, vol. 2, pp. 296–310, 1993.
[36] P. Charbonnier, L. Blanc-Feraud, G. Aubert, and M. Barlaud, “Deterministic edge-preserving regularization in computed imaging,” IEEE Trans. Image Processing, vol. 6, pp. 298–311, 1997.
[37] F. C. Jeng and J. W. Woods, “Compound Gauss-Markov random fields for image estimation,” IEEE Trans. Signal Process., vol. 39, pp. 683–697, 1991.
[38] S. Roth and M. J. Black, “Fields of experts: A framework for learning image priors,” in Proc. IEEE Int. Conf. Computer Vision and Pattern Recognition, pp. 860–867, 2005.
[39] M. Ceccarelli, “A finite Markov random field approach to fast edge-preserving image recovery,” Image and Vision Computing, vol. 25, pp. 792–804, 2007.
[40] R. G. Aykroyd, “Bayesian estimation for homogeneous and inhomogeneous Gaussian random fields,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 20, pp. 533–539, 1998.
[41] M. S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, “A tutorial on particle filters for on-line nonlinear/non-Gaussian Bayesian tracking,” IEEE Trans. Signal Process., vol. 50, pp. 174–188, 2002.
[42] Y. Rui and Y. Chen, “Better proposal distributions: Object tracking using unscented particle filter,” in Proc. IEEE Computer Society Conf. Computer Vision and Pattern Recognition, pp. 786–794, 2001.
[43] H. E. Knutsson, R. Wilson, and G. H. Granlund, “Anisotropic nonstationary image estimation and its applications: Part I-restoration of noisy images,” IEEE Trans. Communication, vol. 31, pp. 388–397, 1983.
[44] G. Gilboa, N. Sochen, and Y. Y. Zeevi, “Image enhancement and denoising by complex diffusion process,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 25, pp. 1020–1036, 2004.
[45] P. Perona and J. Malik, “Scale-space and edge detection using anisotropic diffusion,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 12, pp. 629–639, 1990.
[46] L. I. Rudin, S. Osher, and E. Fatemi, “Nonlinear total variation based noise removal algorithms,” Physica D, vol. 60, pp. 259–268, 1992.
[47] L. Kaur, S. Gupta, and R. C. Chauhan, “Image denoising using wavelet thresholding,” in Indian Conf. on Computer Vision, Graphics and Image Processing (ICVGIP), (India), 2002.
[48] S. G. Chang, B. Yu, and M. Vetterli, “Adaptive wavelet thresholding for image denoising and compression,” IEEE Trans. Image Processing, vol. 9, pp. 1532–1546, 2000.
[49] M. A. T. Figueiredo and R. D. Nowak, “Wavelet based image estimation: An empirical Bayes approach, using Jeffrey’s noninformative prior,” IEEE Trans. Image Processing, vol. 10, pp. 1322–1331, 2001.
[50] M. Jansen and A. Bultheel, “Geometric prior for noise free wavelet coefficient in image de-noising,” in Lecture notes in Statistics (P. Mueller and B. Vidakovic, eds.), vol. 141, pp. 223–242, Springer-Verlag, 1999.
[51] A. S. Tavildar, H. M. Guptha, and S. N. Guptha, “Maximum a posteriori estimation in presence of film-grain noise,” Signal Process. (North Holland), vol. 8, pp. 363–368, 1985.
[52] A. D. Stefano, P. R. White, and W. B. Collis, “Film grain reduction in colour images using undecimated wavelet transform,” Image and Vision Computing, vol. 22, pp. 873–882, 2004.
[53] F. Naderi and A. A. Sawchuk, “Estimation of images degraded by film-grain noise,” J. Applied Optics, vol. 17, pp. 1228–1237, 1978.
[54] J. C. K. Yan and D. Hatzinakos, “Signal-dependent film grain noise removal and generation based on higher-order statistics,” in Proc. of IEEE Signal Processing Workshop on Higher-Order Statistics, pp. 77–81, 1997.
[55] B. R. Hunt, “Bayesian methods in nonlinear digital image restoration,” IEEE Trans. Computers, vol. 26, pp. 219–229, 1977.
[56] G. K. Froehlich, J. F. Walkup, and T. F. Krille, “Estimation in signal-dependent film-grain noise,” Applied Optics, vol. 20, pp. 3619–3626, 1981.
[57] T. M. Moldovan, S. Roth, and M. J. Black, “Denoising archival films using a learned Bayesian model,” in Proc. IEEE Int. Conf. Image Processing (ICIP), pp. 2641–2644, 2006.
[58] S. I. Sadhar and A. N. Rajagopalan, “Image recovery under non-linear and non-Gaussian degradations,” J. Opt. Soc. Amer. (A), vol. 22, pp. 604–615, 2005.
[59] S. I. Sadhar and A. N. Rajagopalan, “Image estimation in film-grain noise,” IEEE Trans. Signal Process. Letters, vol. 12, pp. 238–241, 2005.
[60] F. Argenti, G. Torricelli, and L. Alparone, “MMSE filtering of generalised signal-dependent noise in spatial and shift-invariant wavelet domains,” Signal Processing, vol. 86, pp. 2056–2066, 2006.
[61] V. Bruni and D. Vitulano, “Old movies noise reduction via wavelets and Wiener filter,” Journal of Winter School of Computer Graphics (WSCG), vol. 12, pp. 65–69, 2004.
[62] V. S. Frost, J. A. Stiles, A. Josephine, K. S. Shanmugan, and J. C. Holtzman, “A model for radar images and its application to adaptive filtering of multiplicative noise,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 4, pp. 157–165, 1982.
[63] J. S. Lee, “Digital image enhancement and noise filtering by use of local statistics,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 2, pp. 165–168, 1980.
[64] D. T. Kuan, A. A. Sawchuk, T. C. Strand, and P. Chavel, “Adaptive noise smoothing filter for images with signal dependent noise,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 7, pp. 165–177, 1985.
[65] A. Lopes, R. Touzi, and E. Nezry, “Adaptive speckle filters and scene heterogeneity,” IEEE Trans. Geoscience and Remote Sensing, vol. 28, pp. 992–1000, 1990.
[66] A. Lopes, E. Nezry, R. Touzi, and H. Laur, “Maximum a posteriori speckle filtering and first order texture models in SAR images,” in Proc. IEEE Int. Geoscience and Remote Sensing Symposium, pp. 2409–2412, 1990.
[67] T. R. Crimmins, “Geometric filter for reducing speckle,” Optical Engineering, vol. 25, pp. 651–654, 1986.
[68] M. R. Azimi-Sadjadi and S. Bannour, “Two-dimensional adaptive block Kalman filtering of SAR imagery,” IEEE Trans. Geoscience and Remote Sensing, vol. 29, pp. 742–753, 1991.
[69] Y. Yu and S. T. Acton, “Speckle reducing anisotropic diffusion,” IEEE Trans. Image Processing, vol. 11, pp. 1260–1270, 2002.
[70] H. M. Salinas and D. C. Fernandez, “Comparison of PDE-based nonlinear diffusion approaches for image enhancement and denoising in optical coherence tomography,” IEEE Trans. Medical Imaging, vol. 26, pp. 761–771, 2007.
[71] H. Xie, L. E. Pierce, and F. T. Ulaby, “SAR speckle reduction using wavelet denoising and Markov random field modeling,” IEEE Trans. Geoscience and Remote Sensing, vol. 40, pp. 2196–2212, 2002.
[72] A. Achim, P. Tsakalides, and A. Bezerianos, “SAR image denoising via Bayesian wavelet shrinkage based on heavy-tailed modeling,” IEEE Trans. Geoscience and Remote Sensing, vol. 41, pp. 1773–1785, 2003.
[73] M. I. H. Bhuiyan, M. O. Ahmad, and M. N. S. Swamy, “Spatially-adaptive wavelet-based method using the Cauchy prior for denoising the SAR images,” IEEE Trans. Circuits and Systems for Video Technology, vol. 17, pp. 500–507, 2007.
[74] S. P. Luttrell and C. J. Oliver, “Prior knowledge in synthetic-aperture radar processing,” J. Phys. D, vol. 19, pp. 333–356, 1986.
[75] M. Bertalmio, L. Vese, G. Sapiro, and S. Osher, “Simultaneous structure and texture image inpainting,” IEEE Trans. Image Processing, vol. 12, pp. 882–889, 2003.
[76] H. Kaufman and A. M. Tekalp, “Survey of estimation techniques in image restoration,” IEEE Control Systems Magazine, vol. 11, pp. 16–24, 1991.
[77] S. Z. Li, Markov Random Field Modeling in Computer Vision. New York, inc.: Springer-Verlag, 1995.
[78] S. Z. Li, “On discontinuity-adaptive smoothness priors in computer vision,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 17, pp. 576–586, 1995.
[79] D. J. C. MacKay, “Introduction to Monte Carlo methods,” in Learning in Graphical Models (M. I. Jordan, ed.), NATO Science Series, pp. 175–204, Kluwer Academic Press, 1998.
[80] E. C. Anderson, “Monte Carlo methods and importance sampling.” Lecture Notes, at ‘http://ib.berkeley.edu/labs/slatkin/eriq/classes/guest lect/mc lecture notes.pdf’, 1999.
[81] S. Julier and J. Uhlmann, “A general method for approximating nonlinear transformations of probability distributions.” Tech. rep., RRG, Dept. of Engineering Science, University of Oxford, 1996.
[82] S. J. Julier, J. K. Uhlmann, and H. F. Durrant-Whyte, “A new method for the nonlinear transformation of means and covariances in filters and estimators,” IEEE Trans. Automatic Control, vol. 45, pp. 477–482, 2000.
[83] R. V. Merwe, N. de Freitas, A. Doucet, and E. Wan, “The unscented particle filter.” Technical report CUED/F-INFENG/TR 380, Cambridge University Engineering Department, 2000.
[84] E. A. Wan and R. V. Merwe, “The unscented Kalman filter for nonlinear estimation,” in Proceedings of IEEE Symposium on Adaptive Systems for Signal Processing, Communications and Control (AS-SPCC), pp. 153–158, 2000.
[85] J. L. Crassidis and F. L. Markley, “Unscented filtering for spacecraft attitude estimation,” J. Guid. Control Dyn., vol. 26, pp. 536–542, 2003.
[86] P. Li, T. Zhang, and B. Ma, “Unscented Kalman filter for visual curve tracking,” Image and Vision Computing, vol. 22, pp. 157–164, 2004.
[87] B. Stenger, P. Mendonca, and R. Cipolla, “Model-based hand tracking using an unscented Kalman filter,” in Proc. British Machine Vision Conference, pp. 63–72, 2001.
[88] G. R. K. S. Subrahmanyam, A. N. Rajagopalan, and R. Aravind, “Unscented Kalman filter for image estimation in film-grain noise,” in Proc. IEEE Int. Conf. Image Processing (ICIP), pp. IV:17–21, 2007.
[89] E. R. Dougherty, Random processes for image and signal processing. New York: SPIE/IEEE Series on Imaging Science and Engineering, 1999.
[90] M. Petrou and P. G. Sevilla, Image processing: Dealing with texture. London, UK: John Wiley and Sons, 2006.
[91] A. Rangarajan and R. Chellappa, “Markov random field models in image processing,” in The handbook of brain theory and neural networks, NATO Science Series, pp. 564–567, MIT Press, 1998.
[92] N. Ahuja and B. Schachter, “Image models,” ACM Computing Surveys, vol. 13, pp. 373–397, 1981.
[93] S. C. Zhu, “Statistical modeling and conceptualization of visual patterns,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 25, pp. 691–712, 2003.
[94] E. B. Ranguelova, Segmentation of textured images on three-dimensional lattices. PhD thesis, Dept. of Electronic and Electrical Engg., Univ. of Dublin, Trinity College, Dublin, 2002.
[95] P. Perez, “Markov random fields and images,” CWI Quarterly, vol. 11, pp. 413–437, 1998.
[96] J. W. Woods, “Two-dimensional discrete Markovian fields,” IEEE Trans. Inform. Theory, vol. 18, pp. 232–240, 1972.
[97] S. Kumar and M. Hebert, “Discriminative random fields,” Int. J. Computer Vision, vol. 68, pp. 179–201, 2006.
[98] H. M. Wallach, “Conditional random fields: An introduction.” University of Pennsylvania CIS, Technical Report MS-CIS-04-21, 2004.
[99] D. E. Melas and S. P. Wilson, “Double Markov random fields and Bayesian image segmentation,” IEEE Trans. Signal Process., vol. 50, pp. 357–365, 2002.
[100] M. A. T. Figueiredo, “Bayesian methods and Markov random fields.” Department of Electrical and Computer Engineering, Instituto Superior Tecnico, available at “www.lx.it.pt/~mtf/FigueiredoCVPR.pdf”.
[101] D. Melas, A Bayesian Approach to the Segmentation of Textural Images. PhD thesis, Dept. of Electronic and Electrical Engg., Univ. of Dublin, Trinity College, Dublin, 1998.
[102] J. Besag, “Spatial interaction and the statistical analysis of lattice systems,” J. Royal Statist. Society (B), vol. 36, pp. 192–236, 1974.
[103] D. Griffeath, “Introduction to random fields,” in Denumerable Markov Chains (J. L. S. J. G. Kemeny and A. W. Knapp, eds.), pp. 425–458, New York: Springer-Verlag, 1976.
[104] R. Stevenson, B. Schmitz, and E. Delp, “Discontinuity-preserving regularization of inverse visual problems,” IEEE Trans. Systems, Man and Cybernetics, vol. 24, pp. 455–469, 1994.
[105] P. J. Green, “Bayesian reconstruction from emission tomography data using a modified EM algorithm,” IEEE Trans. Medical Imaging, vol. 9, pp. 84–93, 1990.
[106] S. Geman, D. McClure, and D. Geman, “A nonlinear filter for film restoration and other problems in image processing,” Computer Vision Graphics and Image Processing, vol. 54, pp. 281–289, 1992.
[107] A. Blake and A. Zisserman, Visual Reconstruction. Cambridge, M.A.: M.I.T. Press, 1987.
[108] S. Geman and G. Reynolds, “Constrained restoration and the recovery of discontinuities,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 14, pp. 367–383, 1992.
[109] T. Hebert and R. Leahy, “A generalized EM algorithm for 3-D Bayesian reconstruction from Poisson data using Gibbs priors,” IEEE Trans. Medical Imaging, vol. 8, pp. 194–202, 1989.
[110] S. Kapoor, P. Y. Mundkur, and U. B. Desai, “Depth and image recovery using a MRF model,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 16, pp. 1117–1122, 1994.
[111] L. Khouas, C. Odet, and D. Friboulet, “3D furlike texture generation by a 2D autoregressive synthesis,” in Proc. Inter. Conf. in Central Europe on Computer Graphics and Visualization (WSCG), pp. 171–177, 1998.
[112] S. Haykin, Kalman Filtering and Neural Networks. John Wiley and Sons, New York, inc.: Wiley-interscience, 2001.
[113] A. M. Tekalp, H. Kaufman, and J. W. Woods, “Fast recursive estimation of the parameters of a space-varying autoregressive image model,” IEEE Trans. Acoust., Speech, Signal Processing, vol. 33, pp. 469–472, 1985.
[114] S. Z. Li, “Discontinuity-adaptive MRF prior and robust statistics: A comparative study,” Image and Vision Computing, vol. 13, pp. 227–233, 1995.
[115] S. Z. Li, “Robustizing robust M-estimation using deterministic annealing,” Pattern Recognition, vol. 29, pp. 159–166, 1996.
[116] J. Pengelly, “Monte Carlo methods.” Students Tutorial, February 2002, available at ‘http://csnet.otago.ac.nz/cosc453/student tutorials/monte carlo.pdf.’.
[117] M. Pagano and W. Sandmann, “Efficient rare event simulation: A tutorial on importance sampling.” Tutorial presented at Third International Working Conference on Performance Modelling and Evaluation of Heterogeneous Networks, 2005.
[118] T. C. Hesterberg, Advances in Importance Sampling. PhD thesis, Stanford University, US, 1988.
[119] I. Beichl and N. F. Sullivan, “The importance of importance sampling,” Computing in Sci. and Engg., vol. 1, pp. 71–73, 1999.
[120] P. H. Borcherds, “Importance sampling: an illustrative introduction,” Eur. J. Phys., vol. 21, pp. 405–411, 2000.
[121] S. J. Julier and J. K. Uhlmann, “A new extension of the Kalman filter to nonlinear systems,” in Proc. of AeroSense: 11th International Symposium on Aerospace/Defense Sensing, Simulation and Controls, vol. 3068, pp. 182–193, 1997.
[122] R. V. Merwe, Sigma-point Kalman filters for probabilistic inference in dynamic state-space models. PhD thesis, OGI Sch. of Sci. and Engg., Oreg. Health and Sci. Univ., Portland, Oreg., 2004.
[123] B. D. Anderson and J. B. Moore, Optimal filtering. Englewood Cliffs, N. J: Prentice-Hall, 1979.
[124] B. Ristic, M. S. Arulampalam, A. Farina, and D. Benvenuti, “Performance bounds and comparison of nonlinear filters for tracking a ballistic object on re-entry,” in IEE Proceedings: Radar, Sonar and Navigation, vol. 150, pp. 65–70, 2003.
[125] S. J. Julier and J. K. Uhlmann, “Unscented filtering and nonlinear estimation,” Proc. IEEE, vol. 92, pp. 401–422, 2004.
[126] S. J. Julier and J. K. Uhlmann, “The scaled unscented transformation,” in Proc. Amer. Control Conf., pp. 4555–4559, 2002.
[127] K. Ito and K. Xiong, “Gaussian filters for nonlinear filtering problems,” IEEE Trans. Automatic Control, vol. 45, pp. 910–927, 2000.
[128] R. V. Merwe and E. A. Wan, “Sigma-point Kalman filters for probabilistic inference in dynamic state-space models,” in Proceedings of the Workshop on Advances in Machine Learning, 2003.
[129] R. V. Merwe, E. A. Wan, and A. T. Nelson, “Dual estimation and the unscented transformation,” in Advances in Neural Information Processing Systems, pp. 666–672, 2000.
[130] I. A. Gura and R. H. Gersten, “Interpretation of n-dimensional covariance matrices,” American Institute of Aeronautics and Astronautics Journal, vol. 9, pp. 740–742, 1971.
[131] A. ACCESS, “Statistical analysis site.” http://www.aiaccess.net/e gm.htm.
[132] S. J. Julier, J. K. Uhlmann, and H. F. Durrant-Whyte, “A new approach for filtering nonlinear systems,” in Proc. Amer. Control Conf., pp. 1628–1632, 1995.
[133] B. Ristic, S. Arulampalam, and N. Gordon, Beyond the Kalman Filter: Particle Filters for Tracking Applications. New York, inc.: Artech House Radar Library, 2004.
[134] S. J. Julier, Comprehensive process models for high-speed land vehicles. PhD thesis, Robotics Research Group, Wadham College, University of Oxford, UK, 1997.
[135] R. V. Merwe and E. Wan, “The square-root unscented Kalman filter for state and parameter-estimation,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP), pp. 3461–3464, 2001.
[136] S. J. Julier and J. K. Uhlmann, “Reduced sigma point filters for the propagation of means and covariances through nonlinear transformations,” in Proc. Amer. Control Conf., pp. 887–892, 2002.
[137] D. Tenne and T. Singh, “The higher order unscented filter,” in Proc. Amer. Control Conf., pp. 2441–2446, 2003.
[138] Y. Wu, M. Wu, D. Hu, and X. Hu, “An improvement to unscented transformation,” in Australian Conference on Artificial Intelligence, pp. 1024–1029, 2004.
[139] R. V. Merwe and E. A. Wan, “Gaussian mixture sigma-point particle filters for sequential probabilistic inference in dynamic state-space models,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP), pp. 701–704, 2003.
[140] L. Angrisani, A. Baccigalupi, and R. S. L. Moriello, “Ultrasonic time-of-flight estimation through unscented Kalman filter,” IEEE Trans. Instrumentation and Measurement, vol. 55, pp. 1077–1084, 2006.
[141] T. Lefebvre, H. Bruyninckx, and J. D. Schutter, “Comment on ‘a new method for the nonlinear transformation of means and covariances in filters and estimators (and author’s reply)’,” IEEE Trans. Automatic Control, vol. 47, pp. 1406–1409, 2002.
[142] A. Sitz, U. Schwarz, and J. Kurths, “The unscented Kalman filter, a powerful tool for data analysis,” Int. J. of Bifurcation and Chaos in App. Sci. and Engg., vol. 14, pp. 2093–2105, 2004.
[143] J. J. LaViola, “A comparison of unscented and extended Kalman filtering for estimating quaternion motion,” in Proc. Amer. Control Conf., pp. 2435–2440, 2003.
[144] S. I. Sadhar, Some new approaches for image restoration and blur identification using particle filter. PhD thesis, Dept. of Electrical Engg., IIT Madras, India, 2005.
[145] D. Zou, J. Tian, J. Bloom, and J. Zhai, “Data hiding in film grain,” in 5th International Workshop on Digital Watermarking, pp. 197–211, 2006.
[146] N. J. Gordon, D. J. Salmond, and A. F. M. Smith, “Novel approach to nonlinear/non-Gaussian Bayesian state estimation,” in IEE Proc. F: Radar and Signal Process., vol. 140, pp. 107–113, 1993.
[147] R. Birk, W. Camus, E. Valenti, and W. McCandless, “Synthetic aperture radar imaging systems,” IEEE AES Systems Magazine, vol. 10, pp. 15–23, 1995.
[148] K. Tomiyasu, “Tutorial review of synthetic-aperture radar (SAR) with applications to imaging of the ocean surface,” Proc. IEEE, vol. 66, pp. 563–587, 1978.
[149] O. Lankoande, M. M. Hayat, and B. Santhanam, “Speckle reduction of SAR images using a physically based Markov random field model and simulated annealing,” Proc. SPIE, vol. 5808, pp. 210–221, 2005.
[150] M. Costanntini, A. Farina, and F. Zirilli, “The fusion of different resolution SARimages,” Proc. IEEE, vol. 85, pp. 139–146, 1997.
[151] M. Mastriani and A. E. Giraldez, “Kalman’s shrinkage for wavelet-based despeckling of SAR images,” Int. J. Intelligent Technology, vol. 1, pp. 190–196, 2006.
[152] F. Sattar, L. Floreby, G. Salomonsson, and L. Benny, “Image enhancement based on a nonlinear multiscale method,” IEEE Trans. Image Processing, vol. 6, pp. 888–895, 1997.
[153] R. C. Gonzalez and R. E. Woods, Digital image processing. Asia: Pearson Education, 2000.
[154] I. E. Abdou and W. K. Pratt, “Quantitative design and evaluation of enhancement/thresholding edge detectors,” Proc. IEEE, vol. 67, pp. 753–766, 1979.
[155] A. Hirani and T. Totsuka, “Combining frequency and spatial domain information for fast interactive image noise removal,” in Proc. SIGGRAPH, Computer Graphics Proceedings, pp. 269–276, 1996.
[156] A. Criminisi, P. Perez, and K. Toyama, “Region filling and object removal by exemplar-based image inpainting,” IEEE Trans. Image Processing, vol. 13, pp. 1200–1212, 2004.
[157] J. Shen, “Inpainting and the fundamental problem of image processing,” SIAM News, vol. 36, 2003.
[158] T. Chan, S. Kang, and J. Shen, “Euler’s elastica and curvature based inpaintings,” SIAM Journal on Applied Mathematics, vol. 63, pp. 564–592, 2002.
[159] T. Chan and J. Shen, “Non-texture inpaintings by curvature-driven diffusions (CDD),” J. Vis. Commun. Image Res., vol. 12, pp. 436–449, 2001.
[160] M. M. Oliveira, B. Bowen, R. McKenna, and Y. S. Chang, “Fast digital image inpainting,” in Proc. of the Inter. Conf. on Visualization, Imaging and Image Processing (VIIP), pp. 261–266, 2001.
[161] A. Telea, “An image inpainting technique based on the fast marching method,” J. Graphics Tools, vol. 9, pp. 25–36, 2004.
[162] K. Ko and S. Kim, “Efficient inpainting of old film scratch using Sobel edge operator based isophote computation,” in Proc. IEEE Int. Conf. Comput. and Inf. Tech., pp. 124–129, 2006.
[163] M. Yasuda, J. Ohkubo, and K. Tanaka, “Digital image inpainting based on Markov random field,” in CIMCA-IAWTIC, pp. 747–752, IEEE Computer Society, 2005.
[164] N. Komodakis and G. Tziritas, “Image completion using global optimization,” in Proc. IEEE Computer Society Conf. Computer Vision and Pattern Recognition, pp. 442–452, 2006.
[165] A. Rares, M. J. T. Reinders, and J. Biemond, “Image sequence restoration in the presence of pathological motion and severe artifacts,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP), pp. 3365–3368, 2002.
[166] A. L. Bertozzi, S. Esedoglu, and A. Gillette, “Inpainting of binary images using the Cahn-Hilliard equation,” IEEE Trans. Image Processing, vol. 16, pp. 285–291, 2007.
[167] A. N. Cohen, “The Cahn-Hilliard equation: mathematical and modeling perspectives,” Adv. Math. Sci. Appl., vol. 8, pp. 965–985, 1998.
[168] J. Jia and C. Tang, “Image repairing: Robust image synthesis by adaptive ND tensor voting,” in Proc. IEEE Computer Society Conf. Computer Vision and Pattern Recognition, pp. 643–650, 2003.
[169] J. Jia and C. K. Tang, “Inference of segmented color and texture description by tensor voting,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 26, pp. 771–786, 2004.
[170] G. Medioni, M. Lee, and C. Tang, A computational framework for feature extraction and segmentation. Elsevier Science, 2000.
[171] K. A. Patwardhan, G. Sapiro, and M. Bertalmio, “Video inpainting under constrained camera motion,” IEEE Trans. Image Processing, vol. 16, pp. 545–553, 2007.
[172] J. Jia, Y. Tai, T. Wu, and C. Tang, “Video repairing under variable illumination using cyclic motions,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 28, pp. 832–839, 2006.
[173] Y. Matsushita, E. Ofek, W. Ge, X. Tang, and H. Shum, “Full-frame video stabilization with motion inpainting,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 28, pp. 1150–1164, 2006.
[174] C. Ballester, M. Bertalmio, V. Caselles, L. Garrido, A. Marques, and F. Ranchin, “An inpainting-based deinterlacing method,” IEEE Trans. Image Processing, vol. 16, pp. 2476–2491, 2007.
[175] C. A. Z. Barcelos and M. A. Batista, “Image restoration using digital inpainting and noise removal,” Image and Vision Computing, vol. 25, pp. 61–69, 2007.
[176] J. Canny, “A computational approach to edge detection,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 8, pp. 679–714, 1986.
LIST OF PAPERS BASED ON THESIS
A. Journal Papers
1. G. R. K. S. Subrahmanyam, A. N. Rajagopalan and R. Aravind, “Importance
sampling Kalman filter for image estimation”, IEEE Signal Processing Letters,
vol. 14, pp. 453-456, 2007.
2. G. R. K. S. Subrahmanyam, A. N. Rajagopalan and R. Aravind, “Importance
sampling unscented Kalman filter for film-grain noise removal”, IEEE Multimedia
(to appear).
B. Conference Papers
1. G. R. K. S. Subrahmanyam, A. N. Rajagopalan and R. Aravind, “A new extension
of Kalman filter to non-Gaussian priors”, Indian Conference on Computer
Vision Graphics and Image Processing (ICVGIP’2006), pp. 162-171, 2006.
2. G. R. K. S. Subrahmanyam, A. N. Rajagopalan and R. Aravind, “Unscented
Kalman filter for image estimation in film-grain noise”, IEEE International Conference
on Image Processing (ICIP’2007), pp. IV-17 - IV-20, 2007.