RECURSIVE IMAGE ESTIMATION AND

INPAINTING IN NOISE USING

NON-GAUSSIAN MRF PRIOR

A THESIS

submitted by

G. R. K. SAI SUBRAHMANYAM

for the award of the degree

of

DOCTOR OF PHILOSOPHY

DEPARTMENT OF ELECTRICAL ENGINEERING

INDIAN INSTITUTE OF TECHNOLOGY, MADRAS

JANUARY 2008

At the lotus feet of my samrdha sadgure

Bhagavaan Sri Venkaiah Swamy

THESIS CERTIFICATE

This is to certify that the thesis entitled “RECURSIVE IMAGE ESTIMATION

AND INPAINTING IN NOISE USING NON-GAUSSIAN MRF PRIOR”

submitted by G. R. K. S. Subrahmanyam to the Indian Institute of Technology,

Madras for the award of the degree of Doctor of Philosophy is a bona fide record of the

research work carried out by him under our supervision. The contents of this thesis,

in full or in parts, have not been submitted to any other Institute or University for

the award of any degree or diploma.

Chennai 600 036 (Dr. A.N. Rajagopalan and Dr. R. Aravind)

Date: Jan. 09, 2008 Research Guides

ACKNOWLEDGEMENTS

I owe a great debt of gratitude and offer my heartfelt pranamas to the lotus feet

of the merciful Bhagavan Sri Venkayya Swami of Golagamudi, who blessed me to pursue

a PhD. I express my gratitude with utmost reverence to my Guru Sri P. Subbaramayya

sir, who took loving care of my academic, physical and spiritual welfare during

my PhD, and to Sri Achraya Bharadwaja master, who enlightened me through his holy

books and talks to pursue a student career with noble motivations.

I would like to express my deep sense of gratitude and reverence to my thesis

adviser Prof. A. N. Rajagopalan for his constant guidance and encouragement. His

willingness to help with even minute aspects, such as algorithmic coding and the

careful editing of TeX files, remains a motivating memory for me.

I would also like to offer my deep admiration and gratitude to my co-adviser

Prof. R. Aravind for his understanding and support especially during my thesis work.

Without his in-depth clarifications and discussions it would not have been possible to

make many points explicit in this thesis.

I specially thank Ibrahim, whose work and program codes became an interesting

and easy starting point for my work. I would like to convey my warm regards to Suresh,

whose seniority and cooperation enabled me to proceed quickly. I thank Rajiv, Aranov,

Paramanad and our other IPCV lab colleagues for their timely help. I acknowledge

the authors from whose publications I cropped figures.

Finally, I bow at the lotus feet of Sri Sainath of Shiridi, who blessed me with home

food and satsang during my study through my mother. I convey my heartfelt love to my

mother, who took all pains to follow the verdict of swami by providing company and diet

for my recreative health, to my father, without whose consent and cooperation this could

not have happened, and my best wishes to my brother for his spirit of encouragement.

- G. R. K. Sai Subrahmanyam

ABSTRACT

Keywords: Image estimation, Markov random field (MRF), discontinuity-adaptive

MRF, Kalman filter, auto-regressive model, photographic images, film-grain noise,

synthetic aperture radar, speckle, image inpainting, edge-preserving priors, unscented

transformation, unscented Kalman filter.

The task of image recovery generally refers to deriving an original image from

its observations, by making use of the degradation model and the noise statistics.

Methods for tackling this problem have to do a delicate balancing act of suppressing

noise without losing features of interest. The effect of noise is usually reduced by

constraining the possible set of solutions with suitably chosen image models. Arriving

at a judicious choice for the image prior is a very challenging task. Image degradation

can be of many kinds. In this thesis, we specifically address the following problems:

image estimation in additive white Gaussian noise, reduction of noise due to grains

in images captured on photographic films, suppression of speckle noise in synthetic

aperture radar (SAR) images, and image inpainting in the presence of film-grain noise.

The additive white Gaussian noise (AWGN) model is widely used in many real situations,

including the modeling of thermal noise and of noise in medical images [1, 2]. The

photographic film is a very popular image sensor. The degradation phenomenon asso-

ciated with it is film-grain noise. Even though this noise can be modeled as additive

white Gaussian in the density medium, the observation model is nonlinear [3,4]. Syn-

thetic aperture radar (SAR) imaging systems are a preferred choice for aerial photog-

raphy. Coherent processing which is inherent in SAR systems results in interference

patterns called speckle noise which is multiplicative in nature [5]. Yet another type

of degradation results from loss of information in portions of an image due to aging,

scratches, blotches or occlusions. Image inpainting refers to the process of filling-in

such damaged or missing regions [6]. A more complex problem arises when inpainting

must be done in the presence of noise. This has important applications in restoring

images and old movies captured on photographic films.

Among the various techniques proposed in the literature for handling image degra-

dations, recursive approaches are popular due to memory and implementational ad-

vantages. These methods are based on dynamic state-space formulation, and facilitate

easy incorporation of spatial adaptivity into the estimation procedure. The image

is modeled with a state transition equation and the degradation process by a mea-

surement model. The state is recursively predicted based on the state equation by

summarizing the information from the past estimates, while the update step corrects the

prediction using the current measurement. The recursive Bayesian filter propagates

the state conditional density sequentially based on the given state and measurement

models. The well-known Kalman filter (KF) [7,8] belongs to this class, and is optimal

under the assumptions of a linear model and Gaussian noise. Techniques based on

the Kalman filter usually rely on a homogeneous auto-regressive (AR) model. A main

issue with AR-based KFs is that the incorporation of contextual knowledge such as an

edge-preserving prior is non-trivial. In more general nonlinear and/or non-Gaussian

situations, one seeks approximations to the Bayesian recursion [9, 10].

In this thesis, we first explore contextual modeling with a Markov random field

(MRF) based discontinuity-adaptive image prior within the Kalman filtering framework

for filtering images corrupted by additive Gaussian noise. The desired moments

of the non-Gaussian prior are estimated using a Monte Carlo sampling technique.

The performance of the proposed approach is markedly superior in comparison to the

traditional 2D Kalman filter when tested on many images.

A recent nonlinear counterpart of the KF referred to as the unscented Kalman filter

(UKF) has been found to be quite effective in several 1D nonlinear estimation tasks.

It yields more stable and accurate estimates than the extended Kalman filter (EKF)

and uses exact nonlinear models. We investigate the applicability of the UKF for

nonlinear/non-additive image estimation tasks. A small set of deterministic samples

known as sigma points is used to capture and propagate the first two moments of the

state through the AR state model and true observation nonlinearity. The statistics at

the update step of the UKF are determined from the transformed sigma points.

We further explore the incorporation of an edge-preserving prior within the UKF

framework to accomplish excellent noise removal in conjunction with feature preserva-

tion. We demonstrate the effectiveness of the proposed UKF-based schemes for film-

grain noise removal in photographic images, and for despeckling SAR imagery. Several

examples (both synthetic and real) are given and the results are also compared with

existing methods.

Finally, we consider the problem of filling in damaged or missing pixels in photographic

images by digital inpainting. We propose a novel procedure to reconstruct the

edges across damaged regions and develop an adaptive inpainting method that is guided by

the reconstructed edge map. The proposed edge-based inpainting algorithm is embed-

ded within the UKF framework to accomplish simultaneous film-grain denoising and

inpainting. The proposed approach is validated on simulated as well as real cases.

TABLE OF CONTENTS

Abstract

List of Tables

List of Figures

Abbreviations

Notations

1 INTRODUCTION

1.1 Image Degradations

1.2 Literature Review

1.2.1 Image Estimation in AWGN

1.2.2 Photographic Film-Grain

1.2.3 SAR Speckle

1.2.4 Digital Inpainting

1.3 Motivation

1.4 Objectives and Scope of the Thesis

1.4.1 Contributions of the Thesis

1.5 Organization of the Thesis

2 MARKOV RANDOM FIELDS

2.1 Introduction

2.1.1 Lattice, Sites, and Labels

2.1.2 Neighborhood System and Cliques

2.2 Markovianity

2.3 Clique Potentials

2.3.1 Potential Function and Gibbs Random Field

2.3.2 MRF-Gibbs Equivalence

2.4 MRF Priors

2.4.1 Gaussian MRF

2.4.2 Edge-preserving Priors

2.5 Discussion

3 IMPORTANCE SAMPLING KALMAN FILTER FOR IMAGE ESTIMATION

3.1 Introduction

3.2 Auto-Regressive Kalman Filter

3.2.1 Filter Equations

3.3 Discontinuity-adaptive MRF

3.3.1 Condition for DA Potentials

3.3.2 DAMRF Prior for Recursive Estimation

3.4 Principle of Importance Sampling

3.4.1 Moment Estimation

3.5 Kalman Filter with Non-Gaussian Prior

3.6 Experimental Results

3.7 Discussion

4 UNSCENTED FILTER FOR NON-LINEAR ESTIMATION

4.1 Unscented Transformation

4.2 UT Analysis

4.2.1 Accuracy of the Mean

4.2.2 Accuracy of the Covariance

4.3 Illustration of UT

4.4 UT-based Extension of the Kalman Filter

4.5 Discussion

5 NOISE REDUCTION IN PHOTOGRAPHIC IMAGES

5.1 Introduction

5.2 Auto-Regressive UKF

5.3 Importance Sampling UKF

5.3.1 ISUKF for Non-linear Image Estimation

5.4 Film-grain Noise

5.4.1 The Proposed Filters

5.5.1 Simulations

5.5.2 Real Examples

5.6 Discussion

6 DESPECKLING SAR IMAGERY

6.1 Image Formation

6.1.1 SAR Methods

6.2 SAR Metrics

6.3 Noise Reduction in SAR

6.3.1 Speckle Suppression using ISUKF

6.5 Discussion

7 JOINT INPAINTING AND DENOISING

7.1 Introduction

7.2 An Edge-based Approach

7.2.1 Reconstruction of Edge Image

7.2.2 The Proposed Method

7.3 Results for Inpainting

7.4 Inpainting in Presence of Noise

7.4.1 ISUKF for Simultaneous Inpainting and Filtering

7.5 Results for Joint Recovery

7.6 Discussion

8 CONCLUSIONS

8.1 Suggestions for Future Work

Bibliography

LIST OF PAPERS BASED ON THESIS

LIST OF TABLES

3.1 Moment estimation using IS with a proper choice of importance function.

3.2 Moment estimation using IS with a bad choice of importance function.

3.3 Comparison of per-pixel computational complexity: ARKF vs ISKF.

5.1 Computational complexity comparison at each pixel.

6.1 Quantitative comparison with standard filters.

6.2 Quantitative comparison.

6.3 Quantitative comparison with wavelet filters.

LIST OF FIGURES

2.1 Symmetric neighborhood system for (a) first-order, and (b) second-order.

2.2 Non-symmetric half-plane support at a pixel (m, n).

2.3 NSHP neighborhood system for (a) first-order, and (b) second-order.

2.4 Cliques for NSHP neighborhood.

2.5 Edge-preserving convex potentials. The x and y axes correspond to η and g(η), respectively.

2.6 Discontinuity-preserving non-convex potentials. The x and y axes correspond to η and g(η), respectively.

3.1 DA model. (a) Interaction of neighbors as a function of difference. (b) Penalty imposed by the DA model with increasing difference in intensity values.

3.2 (a) Plot shows how the smoothing strength of DAMRF varies with η^2. (b) Heavy-tailed DAMRF distributions.

3.3 Choice of importance function. (a) Good choice: The support of the target pdf is included in the sampler. (b) Bad choice: The support of the same target pdf is not included in the sampler.

3.4 Choice of importance function. (a) Good choice: Sampler support includes (non-symmetric) target pdf. (b) Bad choice: Non-symmetric sampler cannot be employed for IS of a symmetric target distribution.

3.5 (a) Original image. (b) Degraded image (SNR = 10 dB). Image estimated by (c) ARKF (ISNR = 3.06 dB), and (d) the ISKF (ISNR = 4.44 dB, γ = 1.8).

3.6 (a) Original image. (b) Degraded image (SNR = 10 dB). Image estimated using (c) ARKF (ISNR = 1.29 dB), and (d) the ISKF (ISNR = 2.43 dB).

3.7 (a) Original "House" image. (b) Degraded (σ_v^2 = 300). Image estimated using (c) AR-based KF (ISNR = 2.04 dB), and (d) ISKF (ISNR = 2.58 dB).

3.8 Building (a) Original. (b) Degraded image (σ_v^2 = 300). Image estimated by (c) AR-based KF (ISNR = 2.17 dB), and (d) ISKF (ISNR = 3.81 dB).

3.9 Daisy (a) Original image. (b) Degraded (σ_v^2 = 500). Image estimated using (c) AR-based KF (ISNR = 4.14 dB), and (d) ISKF (ISNR = 5.36 dB).

3.10 Performance comparison of ARKF and ISKF on different images for moderate noise of σ_v^2 = 300.

3.11 Plane image. (a) Original. (b) Degraded [11]. Image estimated using (c) BL-RUBF [11] (ISNR = 1.01 dB), and (d) ISKF (ISNR = 1.67 dB).

4.1 Principle of unscented transformation.

4.2 Block diagram of UT.

4.3 Posterior samples (a) Monte Carlo, and (b) UT.

4.4 Figure shows the posterior mean and uncertainty (1-σ) contour of covariance, determined by MC approach, i.e., true values (mean at '*'), linearization (mean at '+'), and unscented transformation (mean almost matches with that of MC mean).

4.5 Signal estimation: (—) original, (- . -) observations, and (- - -) UKF estimates.

4.6 Signal estimation: (—) original, (- . -) EKF, and (- - -) UKF estimates.

5.1 Auto-regressive unscented Kalman filter (ARUKF).

5.2 Importance sampling unscented Kalman filter (ISUKF).

5.3 (a) Original ‘flower’ image. (b) Degraded image. Image estimated using (c) MWF (ISNR = 2.86 dB), (d) PF (ISNR = 2.55 dB), (e) ARUKF (ISNR = 3.42 dB), and (f) ISUKF (ISNR = 4.63 dB).

5.4 (a) The ‘house’ image. (b) Degraded image (σ_v^2 = 0.05). Result using (c) MWF (ISNR = 2.51 dB), (d) PF (ISNR = 3.48 dB), (e) ARUKF (ISNR = 3.40 dB), and (f) ISUKF (ISNR = 4.55 dB).

5.5 (a) The ‘peppers’ image. (b) Degraded image. Output of (c) MWF (ISNR = 2.98 dB), (d) PF (ISNR = 4.47 dB), (e) ARUKF (ISNR = 4.12 dB), and (f) ISUKF (ISNR = 5.26 dB).

5.6 Performance comparison on different images in terms of mean value of ISNR over 20 MC runs (a) at moderate noise (σ_v^2 = 0.05), and (b) at high noise (σ_v^2 = 0.15).

5.7 (a) Cropped portion of a frame from the movie ‘Das testament des Dr. Mabuse’. Output of (b) MWF, (c) PF, (d) ARUKF, and (e) ISUKF.

5.8 (a) Face image with real film-grain noise. Image estimated using (b) MWF, (c) PF, (d) ARUKF, and (e) ISUKF.

5.9 (a) A real building image. Output image obtained using (b) MWF, (c) PF, (d) ARUKF, and (e) ISUKF.

5.10 Cropped portion of a locomotive captured with a film camera. (a) Original with real film-grain noise. Image estimated using (b) PF, (c) ARUKF, and (d) ISUKF.

6.1 (a) Original image. (b) Noisy version. Output of (c) Enhanced Lee (FOM = 0.92, S/MSE = 20.51 dB), (d) Frost (FOM = 0.94, S/MSE = 20.51 dB), and (e) the proposed method (FOM = 0.98, S/MSE = 22.83 dB).

6.2 (a) Original SAR image. (b) Degraded (σ_v^2 = 0.04). Image estimated using (c) the Enhanced Lee filter, (d) the Frost filter, (e) AR-based UKF, and (f) the ISUKF.

6.3 (a) Original aerial image. (b) Degraded image. Image estimated using (c) Rayleigh prior MAP estimator [12], and (d) the proposed ISUKF.

6.4 (a) The Horse track image. Estimated output image using (b) the Frost filter, and (c) our method.

6.5 (a) Bedfordshire image. Result using (b) edge-based Bayesian wavelet filter [13], and (c) the ISUKF method.

6.6 (a) Airport image. Result using (b) MAP estimator in UDWL domain [14], and (c) the proposed method.

6.7 (a) Urban SAR image. (b) Degraded image. Image estimated using (c) wavelet-based filter [15], and (d) ISUKF.

6.8 (a) Real SAR image. Image estimated using (b) simulated annealing and Metropolis estimator [16], and (c) ISUKF.

7.1 A typical edge-based inpainting methodology (from [17]): (Left) General algorithm outline and, (right) an illustration of the outputs for each stage.

7.2 (a) A real image superimposed with text. (b) Mask specifies the region where the original image information is lost. (c) A damaged photograph, and (d) its mask that specifies the regions to be filled-in.

7.3 Edge reconstruction: (a) Degraded image showing maximum matching area dimensions. (b) Edge map of the degraded image. (c) Located end points, and (d) the reconstructed edge image.

7.4 Proposed inpainting algorithm.

7.5 Brick image (a) Original, (b) scratched, and (c) inpainted.

7.6 A synthetic image (a) with a ring mask. (b) Inpainted output. (c) Result of Bertalmio [6].

7.7 Peppers image. (a) Original, (b) scratched, (c) edges from degraded image, (d) reconstructed edges, and (e) inpainted image using reconstructed edge map.

7.8 An old painting (a) Degraded image. (b) Edge map from degraded image. (c) Reconstructed edge map. (d) Inpainted result using reconstructed edge map. (e) Output cropped from [6].

7.9 A real image (a) with mask on damaged region. (b) Inpainted result using our method. (c) Result of Bertalmio [6].

7.10 (a) A bird image superimposed with text. Inpainted result using (b) proposed method, and (c) method in [18].

7.11 Horse-cart image (a) with superimposed text. (b) Inpainted output using proposed method. (c) Output of [6].

7.12 Proposed framework for inpainting in the presence of noise.

7.13 (a) Image degraded by film-grain noise and a big patch. (b) Output of the proposed method.

7.14 Boat (a) Original. (b) Degraded and scratched. (c) Recovered image.

7.15 Peppers (a) Original. (b) Degraded and superimposed with text. (c) Reconstructed edge map. (d) Image recovered by the proposed filter (ISNR = 16.16 dB). (e) Degraded with high noise. (f) Recovered result using reconstructed edge map (when input is (e)) (ISNR = 14.38 dB).

7.16 Face with real film-grain. (a) Scratched version. (b) Inpainted and filtered result (when input is (a)). (c) Degraded by patches. (d) Image recovered by our filter (when input is (c)).

7.17 Mabuse (a) with scratches. (b) Edge image from (a). (c) Reconstructed edge image (when input is (b)). (d) Image recovered by the proposed filter.

7.18 (a) Scanned painting with real film-grain noise. (b) Scratched version, and (c) recovered image.

7.19 (a) An image captured using a film-camera, and (b) recovered output.

7.20 (a) A child image with text, scratch and film-grain noise. (b) Inpainted and filtered output of the proposed method.

ABBREVIATIONS

AWGN : Additive white Gaussian noise

MRF : Markov random field

AR : Auto-regressive

DA : Discontinuity-adaptive

DAMRF : Discontinuity-adaptive Markov random field

IS : Importance sampling

MAP : Maximum a Posteriori probability

pdf : Probability density function

KF : Kalman filter

ARKF : Auto-regressive Kalman filter

ISKF : Importance sampling Kalman filter

EKF : Extended Kalman filter

UKF : Unscented Kalman filter

UT : Unscented transformation

MWF : Modified Wiener filter

PF : Particle filter

MMSE : Minimum mean-square error

NSHP : Non-symmetric half-plane

2D : Two-dimensional

MC : Monte Carlo

ISNR : Improvement in signal-to-noise ratio

SAR : Synthetic aperture radar

ENL : Equivalent number of looks

ARUKF : Auto-regressive unscented Kalman filter

ISUKF : Importance sampling unscented Kalman filter

NOTATIONS

M, N : Number of rows and columns of an image

S : Lattice

L : Set of labels

F : Configuration space

N : Neighborhood system

N_{m,n} : Set of sites neighboring the site (m, n)

R_{m,n} : NSHP support of order M_1 at location (m, n)

c : A clique

C : Collection of all cliques

F : Random field

f : Configuration of F

p(.) : Probability density function

V_c : Potential function of clique c

γ : A constant controlling the convexity of DA function

ρ_{m,n} : Scale parameter of DAMRF density

s(m, n) : Pixel intensity at location (m, n)

s : Vector containing first-order NSHP support of (m, n)

η^2(s(m, n), s) : Energy term at (m, n) with neighbors s

x(m, n) : State vector at location (m, n)

y(m, n) : Known measurement at location (m, n)

u(m, n) : State noise with variance σ_u^2

u(m, n) : Vector state noise

v(m, n) : Measurement noise with variance σ_v^2

v(m, n) : Vector measurement noise

a(i, j) : AR model coefficients

E : Expectation operator

q(.) : Sampler density

z_l : l-th sample of random variable z

L : Number of samples

µ_p : Estimate of predicted mean

σ_p : Estimate of predicted covariance

f(.) : Known system transition function

F : State transition matrix

h(.) : Known measurement degradation function

g(.) : Arbitrary nonlinear function

s(m, n) : Mean estimate of the pixel intensity at location (m, n)

s_k : Mean estimate of the state vector at location k

P_k : Estimate of the state Covariance at location k

P_{k/k−1} : Predicted state Covariance matrix at location k

X : State sigma point matrix

Y : Measurement sigma point matrix

w : UT weights

κ, λ : UT scaling parameters

P(m, n) : State Covariance at location (m, n)

P_x : Covariance of x at location (m, n)

P_{xy} : Cross covariance of x and y at location (m, n)

x(m, n) : Mean estimate of the state vector at location (m, n)

e(m, n) : Degraded image pixel in the exposure domain

Ω : Inpainting region

δΩ : Boundary of the inpainting region

CHAPTER 1

INTRODUCTION

1.1 IMAGE DEGRADATIONS

Imaging systems introduce different types of degradations resulting in imperfect ren-

dition of the original scene. Blur is a spatial degradation caused by relative mo-

tion between scene and camera or by an out-of-focus optical system [19]. Noise is

a point-wise degradation and refers to stochastic variations in the image intensity.

The thermal motion of electrons in electronic components constitutes electronic noise.

Thermal noise is additive white Gaussian (AWG) and signal-independent [1]. Quantization

noise that results from digitization can be modeled by a correlated, space-varying

Gaussian distribution [20]. Photon noise is non-additive and Poisson distributed but

approaches the normal distribution for large photon counts [1]. The perturbation in magnetic

resonance imaging systems is characterized by additive white Gaussian noise

(AWGN) [21]. Restoring an original image in the absence of blur is referred to as

image estimation.
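As an illustration of the AWGN degradation, the following minimal sketch adds zero-mean white Gaussian noise to a one-dimensional row of pixel values at a prescribed signal-to-noise ratio. The function names, the test signal, and the variance-ratio definition of SNR used here are assumptions made for this example and are not taken from the thesis.

```python
import math
import random


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def add_awgn(pixels, snr_db, rng):
    """Add zero-mean white Gaussian noise at the requested SNR (in dB).

    Assumption: SNR is the ratio of signal variance to noise variance.
    Returns the noisy pixels and the noise variance that was used.
    """
    noise_var = variance(pixels) / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_var)
    return [p + rng.gauss(0.0, sigma) for p in pixels], noise_var


rng = random.Random(0)
# Hypothetical clean row: a sinusoid around mid-gray, 100 full periods.
clean = [128.0 + 64.0 * math.sin(2.0 * math.pi * k / 50.0) for k in range(5000)]
noisy, noise_var = add_awgn(clean, snr_db=10.0, rng=rng)

# Verify the achieved SNR from the actual noise realization.
residual = [n - c for n, c in zip(noisy, clean)]
measured_snr_db = 10.0 * math.log10(variance(clean) / variance(residual))
print(f"noise variance = {noise_var:.1f}, measured SNR = {measured_snr_db:.2f} dB")
```

Because the noise is signal-independent, its variance is the same at every pixel regardless of the local intensity.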

Yet another type of degradation arises from the nonlinear input-output charac-

teristics of imaging sensors and scanners. In photographic films, the silver density

deposited on the film is logarithmically related to the light exposure [3, 4]. The ran-

dom grains that develop during the film formation process result in film-grain noise

which can be modeled as AWG in the film density domain [4]. Aerial images captured

by synthetic aperture radar (SAR) systems are affected by multiplicative noise known

as speckle [22]. It results from the interference of back-scattered radar signals and is

usually modeled by a Gamma distribution [12].
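The two non-additive degradations above can be contrasted in a short simulation. This is a sketch under stated assumptions: the film-grain observation is taken to be d = α log10(e) + β + v with Gaussian v in the density domain, and speckle is taken to be a unit-mean Gamma variable whose shape equals the number of looks. The parameters α, β, the number of looks, and all function names are hypothetical and chosen only for illustration.

```python
import math
import random


def film_grain_observation(exposure, alpha, beta, sigma_v, rng):
    """Density-domain observation: logarithmic in exposure, plus additive Gaussian noise.

    alpha and beta are hypothetical film parameters (assumed, not from the thesis).
    """
    density = alpha * math.log10(exposure) + beta
    return density + rng.gauss(0.0, sigma_v)


def speckle_observation(intensity, looks, rng):
    """Multiplicative speckle: intensity times a unit-mean Gamma variable.

    Shape = looks and scale = 1/looks give mean 1 and variance 1/looks.
    """
    return intensity * rng.gammavariate(looks, 1.0 / looks)


rng = random.Random(1)
exposure = 50.0
densities = [film_grain_observation(exposure, 1.0, 0.0, 0.1, rng) for _ in range(20000)]
speckled = [speckle_observation(exposure, 4, rng) for _ in range(20000)]

mean_density = sum(densities) / len(densities)
mean_speckled = sum(speckled) / len(speckled)
std_speckled = math.sqrt(sum((s - mean_speckled) ** 2 for s in speckled) / len(speckled))
ratio = std_speckled / mean_speckled  # coefficient of variation of the speckled data

print(f"mean density = {mean_density:.3f} (log10(exposure) = {math.log10(exposure):.3f})")
print(f"speckle std/mean = {ratio:.3f}")
```

In the film-grain case the noise is additive only after the logarithmic mapping, so the observation is a nonlinear function of the exposure. In the speckle case the standard deviation scales with the underlying intensity, which is what makes the noise multiplicative.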

Image degradations are not limited to only the above and include situations (such

as in old films and photographs) where there is complete loss of information in some

regions of the image due to aging, scratches, etc. [17, 23]. Sometimes important information

is lost because of occlusion by another object or by superimposed text [6, 24].

The process of filling in lost information is called image inpainting [6].

In this thesis, we investigate and propose new methods for image estimation in

AWGN, film-grain noise reduction, SAR despeckling, and joint image inpainting and

noise filtering.

1.2 LITERATURE REVIEW

Developing effective image recovery techniques is important for improving image qual-

ity, and continues to be an active area of research. These techniques must not only

rely on knowledge of the degradation process and the statistics of the noise, but must

also effectively exploit contextual image characteristics. They differ in how they incorporate

prior knowledge about the original scene into the estimation procedure. Image recovery

in the presence of noise is difficult not only because noise can be non-Gaussian

and/or non-additive, but also due to the fact that preservation of edges is equally

important for good qualitative and quantitative performance.

1.2.1 Image Estimation in AWGN

Work on suppressing AWG noise began with frequency domain approaches which ex-

ploit the differences in the frequency or the spectral content between the original signal

and the noise for recovery. The classical Wiener filter (WF) [25] assumes stationar-

ity of signal and noise, and minimizes the mean-squared error between the original

and the restored image. This requires knowledge of the power spectral density of the

original image and the noise. The filter attenuates each frequency according to the

signal-to-noise ratio at that frequency. The constrained least-squares filter [26] provides additional

control over the restoration process.
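The frequency-dependent attenuation of the Wiener filter can be sketched for the denoising case (no blur), where the gain at each frequency is S_s / (S_s + S_n), with S_s and S_n the power spectral densities of signal and noise. The example below is a one-dimensional illustration that assumes an oracle signal spectrum (the periodogram of the clean signal) and a known flat noise spectrum; the helper names and the test signal are hypothetical, and a naive DFT is used only to keep the sketch self-contained.

```python
import cmath
import math
import random


def dft(x):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT; returns the real part of the reconstruction."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def wiener_gain(signal_psd, noise_psd):
    """Per-frequency Wiener gain for additive noise without blur."""
    return [s / (s + v) for s, v in zip(signal_psd, noise_psd)]


def mse(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a)


rng = random.Random(2)
n, sigma_v = 64, 0.5
clean = [math.sin(2 * math.pi * 3 * t / n) + 0.5 * math.cos(2 * math.pi * 5 * t / n)
         for t in range(n)]
noisy = [c + rng.gauss(0.0, sigma_v) for c in clean]

# Assumption: the signal PSD is known exactly (oracle periodogram of the clean signal).
signal_psd = [abs(c) ** 2 / n for c in dft(clean)]
noise_psd = [sigma_v ** 2] * n  # white noise has a flat spectrum
gain = wiener_gain(signal_psd, noise_psd)
restored = idft([g * y for g, y in zip(gain, dft(noisy))])

print(f"MSE noisy = {mse(noisy, clean):.4f}, MSE restored = {mse(restored, clean):.4f}")
```

The gain is close to one where the signal dominates and close to zero where the noise dominates. In practice the spectra must be estimated, and the stationarity assumption is what limits the filter near edges.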

The Kalman filter (KF) is well-known for estimation of signals in AWGN and is

based on a dynamic state-space formulation [7]. The KF was extended to 2D in [8] and

applied to remove AWGN from images. The reduced-order model Kalman filter [27]

achieves equivalent performance but with reduced computational complexity. Effects

of distortion resulting from noise can be reduced by Kalman filtering, provided the

image parameters such as the auto-regressive (AR) coefficients are known. In general,

however, these parameters are a priori unknown, and can also vary spatially as a

function of the image coordinates. Image estimation by KF with space-varying AR

parameters is considered in [28]. Jeng and Woods [29] propose residual and normalized

image models using local statistics based on the observation that a residual image can

be better predicted (by the AR model) than the original gray level image. Many

algorithms explicitly incorporate image edge-information within the KF framework.

An edge-adaptive filter for restoration of noisy and blurred images based on multiple

models has been presented in [30]. A modified KF in which edge information is used

to improve step-response is presented in [31]. In [32], the parameters of the AR model

are continuously updated to model non-homogeneity of the image. Kadaba et al.

[11] observe that the error residuals in the AR model can be better fitted by a

Bernoulli-Laplacian distribution. They analytically derive a sub-optimal recursive Bayesian filter

that can partially incorporate the non-Gaussian nature of error residuals. In [33], they

extend their framework to incorporate Cauchy AR residuals.

Based on the prior models of the image and the likelihood density specified by

the degradation model, many maximum a posteriori (MAP) techniques have been

developed. A great deal of image processing work stems from prior stochastic models

for image structure given by Markov random fields (MRFs). Geman and Geman [34]

exploit the equivalence between MRFs and Gibbs distributions to model images. The

optimal image estimate is given as a fixed point of an iterative procedure that relies on

neighborhood-dependent updates. Several MAP-based approaches have been proposed

to incorporate edge-preserving priors in the estimation procedure [35, 36]. In [37], a

compound Gauss MRF model is proposed for the image and its MAP estimate is

obtained by simulated annealing. Roth et al. [38] propose a Field of Experts (FoE)

model by learning the prior probability with generic images within a higher-order

MRF formulation. Edge-preserving image recovery using discontinuity-adaptive finite

Markov random fields has been considered in [39]. A combination of homogeneous and

inhomogeneous conditional densities has been used in [40] for Bayesian estimation of

images. There is also an increasing interest in handling non-Gaussian situations using

Monte-Carlo sampling methods [41, 42].

Yet another class of algorithms is based on partial differential equations (PDEs)

[43, 44]. These methods have their origins in the heat equation (referred to as anisotropic

diffusion) [45]. The total variation of an image is minimized numerically in [46] subject

to constraints involving the statistics of the noise. In recent years, wavelet transform-

based approaches have been proposed which perform a multi-resolution analysis of the

degraded image to effectively differentiate and suppress noise in the wavelet domain

[47, 48]. An approach that exploits the sparseness and independent nature of wavelet

coefficients is presented in [49]. The importance of prior modeling of the wavelet

coefficients is brought out in [50].

1.2.2 Photographic Film-Grain

In this problem, the main issue is to handle sensor non-linearity. Existing methods

for film-grain noise removal include extensions of the Wiener filter (WF) [4], MAP-

based approaches [51], and wavelet techniques [52]. An adaptive Wiener filter based

on non-stationary first-order statistics of the image is proposed in [53]. Tekalp et al. [4]

incorporate the effect of sensor nonlinearity into the Wiener filter structure to recover

from blurred and noisy photographic images. Both these variations of the Wiener filter

[4,53] result in good improvements over the conventional Wiener filter for suppressing

film-grain noise. In [54], the noise model parameters are estimated using higher-order

statistics. The filter coefficients are obtained by minimizing a cumulant criterion based

on the extended Wiener-Hopf equations.

Andrews et al. [3] expand the nonlinear observation model into a Taylor series

and derive an approximate filter for recovering the original image. In [55], using

the D − log E curve of the film, and assuming Gaussian models for image and noise

statistics, a MAP estimate of the restored image is derived. The resulting non-linear

equations are solved by steepest-descent using fast transform computations, and sig-

nificant improvements are reported over the WF. Image estimation based on different

priors is considered in [56]. The authors employ numerical integration and solve higher-

order polynomials to find the MAP estimate. In [51], the image is modeled as Gaussian

and a signal-independent transformation is used to reduce the complexity of the MAP

solution. Moldovan et al. [57] propose a Bayesian approach using learned spatial prior

and a likelihood based on inhomogeneous beta distribution. This is used to de-grain

individual frames of archival films. The particle filter (PF) has recently been proposed

for linear and non-linear image estimation in [58]. The PF discussed in [59]

outperforms the well-known modified WF (MWF) [4] in suppressing film-grain noise

in photographic images.

An undecimated wavelet domain technique for reducing film-grain noise is pro-

posed in [52]. Using a set of training images, the wavelet coefficients are thresholded

for different noise levels by optimizing a cost function which is related to visual quality.

In [60], film-grain noise reduction is performed both in spatial and wavelet domains.

The detail wavelet coefficients are adaptively shrunk using the local statistics derived

from the noisy image. The wavelet filter is shown to outperform its spatial counter-

part. Noise reduction in video sequences of old movies is accomplished in [61] with a

combination of wavelet and Wiener filtering techniques.

1.2.3 SAR Speckle

Research on multiplicative speckle suppression in SAR images began with adaptive

denoising techniques which rely on local statistics. Frost et al. [62] explore the property

of coefficient of variation and propose an exponentially weighted kernel for speckle

suppression. Adaptive speckle suppression filters proposed by Lee et al. [63] and Kuan

et al. [64] use the local statistics of the degraded image and the multiplicative model

of speckle noise. Enhanced versions proposed by Lopes et al. [65] filter independently

in homogeneous, heterogeneous and isolated point regions. Other well-known early

approaches to SAR despeckling include the Gamma-MAP filter [66] and the nonlinear

geometric filter [67]. In [68], speckle is suppressed by employing an adaptive block-

Kalman filter, which relies on block-varying AR parameters of the original image.

An adaptive MAP estimator with a heavy-tailed Rayleigh model has been suggested

in [12]. Speckle suppression by anisotropic diffusion using PDE-based approaches is

considered in [69, 70].

In recent years, wavelet techniques have been found to be effective for SAR image

denoising. Wavelet despeckling filters adaptively modify the wavelet coefficients of

the log-transformed speckled image. A wavelet method based on soft-thresholding has

been proposed in [15]. Xie et al. [71] have proposed a despeckling algorithm that fuses

Bayesian wavelet denoising with a regularizing Gaussian-mixture prior. Many SAR

despeckling methods report improved performance with non-Gaussian prior modeling

of the wavelet coefficients within the MAP framework [72,73]. The choice of prior for

encoding SAR data is discussed in [74].

1.2.4 Digital Inpainting

Digital inpainting provides a means for reconstruction of small identified damages and

is also useful for removing superimposed text such as dates, subtitles, publicity or

logos from still images or videos [6, 17]. The main issue in inpainting methods is to

recover image information at damaged edges and lost texture regions [75]. There are

two major categories of inpainting algorithms. The first category propagates infor-

mation along the isophote (lines of equal gray values) direction [6] by formulating

the problem in terms of partial differential equations or elastica models. The second

category relies on edge or structure information [17]. We provide a detailed review

of inpainting approaches in chapter 7. Traditional inpainting algorithms aim only at

filling-in identified damages or removing superimposed text or small objects in images

[6, 17]. To the best of our knowledge, simultaneous image recovery from (film-grain)

noise and damages has not been addressed in the literature. In traditional denoising,

the pixels contain information about underlying data as well as noise, while in image

inpainting, there is no information available in the region to be inpainted [6] which

compounds the problem further.

1.3 MOTIVATION

The classical problem of image estimation in AWGN is difficult because one must meet

the twin (but contradictory) objectives of noise reduction and detail preservation. The

Kalman filter is an elegant recursive filter that has been traditionally employed for

this purpose. However, the homogeneous image model, generally employed in KF [8],

can only provide limited compromise between noise removal and edge preservation.

For better performance, exploiting the non-stationary feature of the KF, algorithms

have been proposed to model the images with space-varying AR state models [28].

The state-space models are formulated with respect to residual and normalized images

to benefit from the use of the local statistics of the image [29]. However, such an

extension requires that the autocorrelation coefficients of the noise-free image be known

accurately. Explicit incorporation of edge-preserving priors within the KF framework

is an interesting research issue [11, 33].

The ubiquitous film is an important image recording medium. The degradation

phenomenon associated with it is film-grain noise which is not only visibly objection-

able but also renders compression difficult due to high entropy. Because the observa-

tion model is nonlinear, the noise turns out to be multiplicative and non-Gaussian in

the (visual) exposure domain [4]. A recent technique which is based on the recursive

particle filter (PF) [58] has the capability to incorporate sensor nonlinearity into its

estimation procedure. But its performance is constrained by the use of a homogeneous

autoregressive (AR) model. Also, it is computationally intensive. The widely em-

ployed nonlinear extension of the Kalman filter, the Extended Kalman filter (EKF),

only linearizes the state and observation nonlinear models about the (current) mean

of the state. The EKF is known to introduce considerably large errors in the estimates

and is prone to be unstable. Handling sensor nonlinearity in the presence of noise is a

challenging problem indeed.

In synthetic aperture radar (SAR) imaging systems, adaptive speckle suppression

filters [62,63] are based on the local statistics of the degraded image. These are simple

but tend to over-smooth image texture. They also suffer from ineffective denoising

around edges. The adaptive block-Kalman filter uses a block-wise varying AR model

for suppressing speckle noise [68]. However, it requires the AR coefficients of the noise-

free image to be known accurately. Recently proposed wavelet domain [13] and MAP

approaches [12] have superior performance but require explicit optimization and/or

parameter estimation. Developing methods for reducing the effect of speckle while

preserving point and line targets is an active area of research. A direct application

of the Kalman filter or its linear extensions [8, 11, 28] can only deal with linear and

additive noise models. The EKF is also limited by the additive model assumption.

Inpainting methods typically propagate information either along the isophote di-

rection or explicitly rely on edge or structure information. Edge-based inpainting

methods [17] first reconstruct the edge information in the damaged region. Then the

filling-in is performed within each object guided by the reconstructed edges. For this

class of algorithms, there are two main issues to be addressed: i) how do we recon-

struct the edges in the region to be inpainted using the boundary information? and

ii) how do we utilize the reconstructed edges in propagating gray level values into the

region to be inpainted? Research that addresses the problem of dual degradation due

to film-grain noise and loss of information is still in its infancy.

1.4 OBJECTIVES AND SCOPE OF THE THESIS

A common goal underlying all image estimation techniques is that of noise reduc-

tion in conjunction with feature preservation. We propose to meet this goal within

a recursive framework. Recursive techniques have excellent memory and implementa-

tional advantages. Moreover, they permit spatial adaptivity to be easily incorporated

into the filter model [76]. The state equation of the Kalman filter imposes a strong

linear constraint on state evolution which hinders true transitions at image edges.

We first propose a novel methodology to incorporate a non-Gaussian Markov ran-

dom field (MRF) prior into the recursive Kalman filter for estimating images cor-

rupted by AWGN. We begin with the intuition that incorporating edge-preserving

conditional priors can considerably improve performance. Specifically, the incorpora-

tion of a non-Gaussian prior in the Kalman filter amounts to determining the first two

moments of the prior at each recursive step. We formulate a discontinuity-adaptive

MRF (DAMRF) prior using the continuous adaptive regularizer model proposed in

[77, 78] so as to suit recursive estimation. However, the image model becomes non-

Gaussian and intractable for direct moment estimation. In recent years, there has

been an increasing interest in the application of Monte Carlo methods for several sig-

nal and image processing problems [41,58]. The power of these methods lies in random

sampling to approximate complex multi-dimensional integrals with simple summations

[79]. In particular, the importance sampling (IS) method [79,80] provides a convenient

methodology to estimate the moments under any arbitrary probability density func-

tion (pdf), known up to a multiplication constant. This is achieved by using weighted

samples of another roughly approximate pdf from which it is easy to draw samples.

We employ importance sampling to estimate the first two moments of the DAMRF

non-Gaussian prior which effectively uses the samples of an appropriately chosen sam-

pler density. Since the first two moments of the DAMRF prior constitute the predicted

mean and covariance of the state, the update is carried out exactly as in traditional

KF to arrive at the final image estimates. The performance of the proposed filter is

demonstrated with many examples.
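
The moment-estimation step described above can be illustrated with a minimal sketch of self-normalized importance sampling. The target density below is a hypothetical one-dimensional non-Gaussian function known only up to a constant; it is not the DAMRF prior itself. The Gaussian sampler, its parameters, and all function names are assumptions made only for illustration.

```python
import math
import random


def target_unnormalized(x):
    # Hypothetical non-Gaussian density, known only up to a constant.
    return math.exp(-(x - 3.0) ** 4)


def importance_moments(target, sampler_mean, sampler_std, n_samples, rng):
    """Self-normalized importance sampling estimate of mean and variance."""
    samples, weights = [], []
    for _ in range(n_samples):
        x = rng.gauss(sampler_mean, sampler_std)  # draw from the sampler q
        # Unnormalized Gaussian q(x); constants cancel after normalization.
        q = math.exp(-0.5 * ((x - sampler_mean) / sampler_std) ** 2)
        samples.append(x)
        weights.append(target(x) / q)
    total = sum(weights)
    mean = sum(w * x for w, x in zip(weights, samples)) / total
    var = sum(w * (x - mean) ** 2 for w, x in zip(weights, samples)) / total
    return mean, var


rng = random.Random(0)
is_mean, is_var = importance_moments(target_unnormalized, 2.5, 1.0, 20000, rng)
print(is_mean, is_var)
```

Because the weights are normalized by their sum, neither the target nor the sampler needs a normalizing constant. In a recursive filter, the two returned values would play the role of the predicted mean and variance of the state.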

The unscented Kalman filter (UKF), recently proposed by Julier and Uhlmann

[81], has been well-studied in the 1D domain and has been found to be more accurate

and stable than EKF [81]. It also has the capability to handle multiplicative, non-

Gaussian noise in its estimation procedure [82,83]. The UKF inherits the KF structure

and is the extension of a deterministic sampling approach known as the unscented

transformation (UT) to minimum mean-square error (MMSE) estimation [81]. In

the UKF, the recursive prediction and update of the first two moments of the state are

based on the propagation of sigma points (deterministic samples) through state and

observation models. The UKF has been applied to a wide variety of 1D nonlinear

signal estimation problems [81, 84, 85]. The UKF has also been explored for tracking

in computer vision [86, 87].

We develop an extension of the UKF for nonlinear image estimation, beginning

with the homogeneous AR state model. The sigma points are generated by assuming

an initial mean and covariance for the state and are propagated through the state and

nonlinear observation models resulting in the prediction of state and measurement

sigma points. The UT is employed to determine the predicted first two statistics and

the update is performed as in the Kalman filter.
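
The sigma-point propagation described above can be sketched for a scalar state as follows. This is the basic form of the UT with a single scaling parameter; the value of `kappa`, the logarithmic test nonlinearity, and the function names are assumptions chosen for illustration and do not reproduce the image models used in this thesis.

```python
import math


def unscented_transform(mean, var, func, kappa=2.0):
    """Propagate a scalar mean/variance through func using sigma points."""
    n = 1  # state dimension (scalar sketch)
    spread = math.sqrt((n + kappa) * var)
    sigma_points = [mean, mean + spread, mean - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    transformed = [func(x) for x in sigma_points]
    y_mean = sum(w * y for w, y in zip(weights, transformed))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(weights, transformed))
    return y_mean, y_var


# Hypothetical logarithmic sensor response (illustration only).
log_mean, log_var = unscented_transform(100.0, 25.0, math.log)
# For a linear map the transform reproduces the exact moments.
lin_mean, lin_var = unscented_transform(10.0, 4.0, lambda x: 2.0 * x + 1.0)
print(log_mean, log_var, lin_mean, lin_var)
```

No derivatives of the nonlinearity are required, which is the main practical difference from the linearization used in the EKF.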

Since the performance of AR based UKF (ARUKF) is limited by the AR state

model, we explore the incorporation of an edge-preserving MRF prior into the UKF

framework. Using a similar methodology as applied to improve the KF, we formulate

the DAMRF state conditional density based on the DA potential function and the

estimated neighbors. We employ IS to estimate the mean and covariance of the state

from the DAMRF prior. Unlike in the KF extension, here we employ the predicted

state statistics to determine the predicted sigma points and (directly) propagate them

through the nonlinear observation model to obtain the predicted measurement sigma

points. We carry out the update as in the ARUKF. We refer to this formulation of

the UKF as the importance sampling UKF (ISUKF).

We specifically consider film-grain noise removal in photographic prints, and speckle

suppression in SAR images, to demonstrate the effectiveness of the proposed ISUKF.

The ISUKF outperforms the ARUKF and also the PF [88] for image estimation in

film-grain noise. We modify the formulation of the DA prior for SAR imagery and ap-

ply the ISUKF with a multiplicative measurement model to arrive at despeckled SAR

images. Experiments on images with simulated and real speckle reveal the efficacy of

the proposed filter over contemporary methods.

Finally, we consider the problem of inpainting damaged images by extending the

proposed recursive UKF framework to simultaneously perform film-grain noise removal

and inpainting. We observe that the main issue in inpainting within the filtering

framework is to infer the observations from noisy neighborhood pixels. In the pro-

posed inpainting algorithm, we consider global information in terms of edge image

reconstruction and inpaint locally to restore the image. Using the damaged image and

information about the location of damages, we adopt a constrained matching strategy

to reconstruct the damaged edges in the regions to be inpainted. We inpaint the miss-

ing pixels by propagating the boundary information judiciously using the reconstructed

edge image. We validate the proposed approach with several examples.

To recover an image from damages as well as noise, we propose to embed the

inpainting method in the ISUKF formulation. Since the UKF lacks observations at

damages during its update step, we fill-in the missing observations by invoking our

inpainting method. We demonstrate the ability of the proposed unified framework for

inpainting and denoising of damaged film-grain noisy photographic images for both

simulated and real degradations.

1.4.1 Contributions of the Thesis

• We propose a discontinuity-adaptive image estimation scheme within the Kalman

framework by non-Gaussian MRF modeling of the image prior. This is achieved

by using importance sampling to estimate the first two moments of the non-

Gaussian prior and incorporating them into the update step of the KF.

• The unscented Kalman filter (UKF) with an AR state model is developed for

non-linear image estimation that can account for exact image sensor nonlinearity

in the observation model. To further improve performance, we incorporate an

edge-preserving Markov prior into the recursive estimation procedure of the

UKF through importance sampling. We employ this for film-grain noise removal

in photographic images.

• Next, the problem of despeckling of synthetic aperture radar (SAR) images is

considered. The discontinuity-adaptive MRF prior is modified so as to effectively

process SAR images and the UKF is formulated to account for multiplicative

noise in the image estimation procedure.

• We develop an edge-based image inpainting method and embed it into the pro-

posed UKF framework to simultaneously suppress film-grain noise and inpaint

(known) damages within a recursive framework.

1.5 ORGANIZATION OF THE THESIS

In chapter 2, we review the theory of Markov random fields.

Chapter 3 incorporates a discontinuity-adaptive MRF prior into the Kalman filter to

address the limitations of the AR-based KF for linear image estimation.

Chapter 4 introduces the principle of unscented transformation, analyzes the accuracy

of UT, illustrates UT with examples, and considers its extension to recursive nonlinear

estimation.

In chapter 5, we develop the UKF for nonlinear image estimation. We emphasize the

incorporation of edge-preserving MRF prior into the UKF and employ it for film-grain

noise suppression.

In chapter 6, we consider the problem of speckle noise suppression using the UKF

framework by suitably tailoring the prior for SAR images.

In chapter 7, the problem of filling-in missing information in noisy images is tackled

by embedding an edge-guided inpainting method into the UKF framework.

Chapter 8 summarizes the thesis and suggests directions for future work.

CHAPTER 2

MARKOV RANDOM FIELDS

2.1 INTRODUCTION

The study of random processes is important because filters or algorithms must be

developed in accordance with the probabilistic characteristics of the class of images

being processed [89]. A natural extension of a random process to two dimensions

is known as a random field. A realization of a random field is generated when we

perform a random experiment at each spatial location and assign the outcome of that

experiment to that location [90]. A Markov random field (MRF) is specified through a conditional

density and possesses the Markovian property, i.e., the value of a pixel depends only

on the values of its neighboring pixels and on no other pixel [77]. The qualifying

feature of MRFs in image processing is that the information contained in the local,

physical structure of images is captured by means of the local conditional probability

distribution [91]. It is important to realise that at discontinuities or edges, the notion

of neighborhood dependency should be relaxed. How to incorporate the smoothness

constraint in MRFs while simultaneously modeling the edges is a challenging task

[39, 77]. The theory of MRFs is built on the concept of a neighborhood system and

requires the following terminology for its representation and study.

2.1.1 Lattice, Sites, and Labels

Many vision tasks can be posed as labeling problems in which the solution to a task is

a set of labels assigned to image pixels or features. A labeling is specified in terms of a

set of sites and a set of (corresponding) labels. Typically, image data are represented

by gray-level variations defined over a finite rectangular or square point lattice. A

lattice S is a discrete set of points or pixels. For a 2D image of size M × N , it is

denoted by

$$S = \{(i, j) \mid 1 \le i \le M,\ 1 \le j \le N\}.$$

A lattice node t is uniquely specified by its coordinates t = (i, j), where i is the

image row number and j is the column number. A site represents a point or a region

in the Euclidean space. A site can be an image pixel (lattice node) or an image feature

such as a corner point, a line segment or a surface patch. Sites can be spatially regular

or irregular. A label is an event associated with a site. A label can assume either

continuous or discrete values. For example, the value of the analog pixel intensity is

an example of a continuous label, whereas the quantized values of intensities in the

set $\{0, 1, \ldots, 255\}$, or $\{\text{edge}, \text{non-edge}\}$ in edge detection, constitute discrete labels.

Another essential property of a label set is the ordering of the labels. For example,

elements in the continuous label set (the real space) can be ordered by the relation

“smaller than”. Let $L$ be the set of labels. When a discrete set, say $\{0, 1, 2, \ldots, 255\}$,

represents pixel intensities, it is an ordered set because for intensity values we have

0 < 1 < ... < 255. When a label set contains different symbols such as texture types

or edge features, it is considered to be un-ordered.

The problem of labeling is to assign a label from the set L to each of the sites in

the lattice S. The set of all possible combinations of labels is called the configuration

space and is denoted by $\mathcal{F}$. In image estimation, the set S of sites corresponds to image

pixels and the discrete set L of labels is the intensity levels. For a discrete labeling

problem of $k$ sites and $l$ labels, there exist a total of $l^k$ possible labelings (the size of the

configuration space). However, among all the possibilities, there are only a few which

are optimal in terms of a criterion measuring the goodness (or inversely, the cost) of

solutions. This is the optimization approach to visual labeling.
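
As a small illustration of the size of the configuration space and of labeling by optimization, the sketch below enumerates all labelings of a tiny problem and picks the cheapest one. The observation, the cost function, and the function names are hypothetical; exhaustive search is feasible only because the example is tiny.

```python
from itertools import product


def enumerate_labelings(num_sites, labels):
    """Yield every assignment of one label per site (the configuration space)."""
    return product(labels, repeat=num_sites)


def best_labeling(num_sites, labels, cost):
    """Exhaustively pick the labeling with the lowest cost."""
    return min(enumerate_labelings(num_sites, labels), key=cost)


labels = (0, 1)          # l = 2 labels
num_sites = 4            # k = 4 sites (e.g. a 2 x 2 lattice in raster order)
all_labelings = list(enumerate_labelings(num_sites, labels))
observed = (0, 1, 1, 1)  # hypothetical noisy observation


def cost(f):
    # Assumed cost: data mismatch plus a penalty on label changes between
    # consecutive sites in raster order.
    data = sum((a - b) ** 2 for a, b in zip(f, observed))
    smooth = sum(abs(f[i] - f[i + 1]) for i in range(len(f) - 1))
    return data + 2 * smooth


best = best_labeling(num_sites, labels, cost)
print(len(all_labelings), best)
```

For an 8-bit image of even modest size, the count $l^k$ is far too large to enumerate, which is why practical methods rely on local or recursive updates.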

2.1.2 Neighborhood System and Cliques

One of the important characteristics of image data is statistical dependence of the

gray level at a lattice node on its neighbors. The sites in S are related to each other

through a neighborhood system defined as

$$\mathcal{N} = \{N_{i,j} \mid \forall\, (i, j) \in S\} \tag{2.1}$$

where $N_{i,j}$ is the set of sites neighboring the site $(i, j)$. A collection of subsets of a

lattice $S$ defined as $\mathcal{N} = \{N_{i,j} \mid (i, j) \in S,\ N_{i,j} \subset S\}$ is a neighborhood system on

$S$ if the neighboring relationship satisfies the following properties:

1. $(i, j) \notin N_{i,j}$ (a site is not its own neighbor), and

2. if $(k, l) \in N_{i,j}$, then $(i, j) \in N_{k,l}$ (the neighborhood relationship is mutual).

Fig. 2.1: Symmetric neighborhood system for (a) first-order, and (b) second-order.

Neighborhood systems that are commonly used are $\mathcal{N}^1$ and $\mathcal{N}^2$. Figs. 2.1 (a) and

(b) show first-order and second-order symmetric neighborhood systems, respectively.

In the first-order neighborhood system ($\mathcal{N}^1$), which is also called a 4-neighborhood

system, every site $(m, n)$ has four neighbors as shown in Fig. 2.1 (a). In the second-order

neighborhood system ($\mathcal{N}^2$), also called an 8-neighborhood system, there are eight

neighbors for every site (shown in Fig. 2.1 (b)).
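
The symmetric neighborhood systems and their two defining properties can be checked with a short sketch. The lattice size and function names are hypothetical, and sites use 1-based indices as in the text. Sites on the lattice border simply have fewer neighbors, which is an assumption about boundary handling.

```python
def neighbors(site, shape, order=1):
    """Symmetric first-order (4) or second-order (8) neighbors inside the lattice."""
    i, j = site
    rows, cols = shape
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if order == 2:
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    return {(i + di, j + dj) for di, dj in offsets
            if 1 <= i + di <= rows and 1 <= j + dj <= cols}


def is_neighborhood_system(shape, order):
    """Check the two defining properties on every site of the lattice."""
    rows, cols = shape
    sites = [(i, j) for i in range(1, rows + 1) for j in range(1, cols + 1)]
    for s in sites:
        nbrs = neighbors(s, shape, order)
        if s in nbrs:  # property 1: a site is not its own neighbor
            return False
        if any(s not in neighbors(t, shape, order) for t in nbrs):  # property 2
            return False
    return True


shape = (4, 5)  # hypothetical M x N lattice
print(len(neighbors((2, 3), shape, 1)), len(neighbors((2, 3), shape, 2)))
print(len(neighbors((1, 1), shape, 1)), is_neighborhood_system(shape, 2))
```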

If we introduce the notion of time in a 2D representation, which is very useful

in recursive analysis of images, at a current pixel (m, n), only the “past” pixels such

as those shown in Fig. 2.2 become neighbors. Such a neighborhood support, known

as non-symmetric half-plane (NSHP) support (at a pixel (m, n)) is mathematically

represented as

$$R_{m,n} = \{(m-i,\, n-j) \mid (1 \le i \le M_1,\ -M_1 \le j \le M_1) \cup (i = 0,\ 1 \le j \le M_1)\} \tag{2.2}$$

and is depicted in Fig. 2.2. Here, $M_1$ is the order of the NSHP support. The associated

local neighborhood $N_{m,n}$ of site $(m, n)$ is defined as

$$N_{m,n} = \{(k, l) \in S : (k, l) \in R_{m,n},\ (k, l) \ne (m, n)\} \tag{2.3}$$

Neighborhood system $\mathcal{N}$, using $N_{m,n}$ over the whole lattice $S$, is defined by Eq. 2.1.

Typically, we employ first and second-order NSHP neighborhood systems as shown in

Figs. 2.3(a) and (b), respectively. MRFs with NSHP support are also referred to as

unilateral MRFs [92] or causal MRFs [93].

Fig. 2.2: Non-symmetric half-plane support at a pixel (m, n) (axes: rows, columns).

Fig. 2.3: NSHP neighborhood system for (a) first-order, and (b) second-order.

Fig. 2.4: Cliques for NSHP neighborhood.
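
The NSHP support of Eq. (2.2) and the local neighborhood of Eq. (2.3) can be sketched as below. The second term of Eq. (2.2) is read here as the sites in the current row to the left of $(m, n)$, i.e., $i = 0$; this reading, the lattice size, and the function names are assumptions.

```python
def nshp_support(m, n, order=1):
    """Sites in the NSHP support of (m, n), following Eq. (2.2) as read here."""
    support = set()
    for i in range(1, order + 1):            # rows above the current pixel
        for j in range(-order, order + 1):
            support.add((m - i, n - j))
    for j in range(1, order + 1):            # same row, columns to the left
        support.add((m, n - j))
    return support


def nshp_neighborhood(m, n, shape, order=1):
    """Eq. (2.3): keep only the support sites that lie on the lattice."""
    rows, cols = shape
    return {(k, l) for (k, l) in nshp_support(m, n, order)
            if 1 <= k <= rows and 1 <= l <= cols and (k, l) != (m, n)}


print(sorted(nshp_support(3, 3, order=1)))
print(sorted(nshp_neighborhood(1, 1, (4, 4), order=1)))
```

In a raster scan, every site returned by `nshp_support` has already been visited when the filter reaches $(m, n)$, which is what makes this support suitable for recursive estimation.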

A clique $c$ of the pair $(S, \mathcal{N})$ is a subset of sites in $S$ such that it consists either

of a single site $c = \{(m, n)\}$, or of a pair of neighboring sites $c = \{(m, n), (k, l)\}$, or

of a triplet of neighboring sites $c = \{(m, n), (k, l), (i, j)\}$, and so on. The collections of

single-site, pair-site and triple-site cliques will be denoted by $C_1$, $C_2$ and $C_3$, respectively,

where

$$C_1 = \{(m, n) : (m, n) \in S\},$$

$$C_2 = \{\{(m, n), (k, l)\} : (k, l) \in N_{m,n},\ (m, n) \in S\},$$

$$C_3 = \{\{(m, n), (k, l), (i, j)\} : (m, n), (k, l), (i, j) \in S \text{ are neighbors of one another}\},$$

and so on. Similarly, we can define higher-order cliques. The sites in a clique

are ordered, i.e., $\{(m, n), (k, l)\}$ is not the same as $\{(k, l), (m, n)\}$. The collection of

all cliques for $(S, \mathcal{N})$ is $C = C_1 \cup C_2 \cup C_3 \cup \cdots$, where “...” denotes possible sets of larger

cliques. The order or type of cliques for $(S, \mathcal{N})$ is determined by the size, shape and

orientation of the neighborhood.
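
The single-site and pair-site clique collections can be enumerated directly from the definitions. The sketch below assumes a first-order symmetric neighborhood on a small hypothetical lattice and treats pair-site cliques as ordered, as stated above; the function names are illustrative.

```python
def four_neighbors(site, shape):
    """First-order symmetric neighbors of a site that lie on the lattice."""
    i, j = site
    rows, cols = shape
    candidates = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
    return [(k, l) for k, l in candidates if 1 <= k <= rows and 1 <= l <= cols]


def cliques_first_order(shape):
    """Single-site cliques C1 and ordered pair-site cliques C2."""
    rows, cols = shape
    sites = [(m, n) for m in range(1, rows + 1) for n in range(1, cols + 1)]
    c1 = [(s,) for s in sites]
    c2 = [(s, t) for s in sites for t in four_neighbors(s, shape)]
    return c1, c2


c1, c2 = cliques_first_order((3, 3))
print(len(c1), len(c2))
```

On the 3 x 3 lattice there are 12 adjacent site pairs, each counted in both orders, so the ordered pair-site collection has 24 members.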

Fig. 2.4 shows the clique types for the first-order causal neighborhood system of a

lattice at site (m, n), identified with a dot. The single-site, horizontal and vertical pair-site

cliques in Figs. 2.4 (a) and (b) are the clique types of a first-order symmetric neighborhood

system. The clique types for a first-order NSHP neighborhood system include

not only those in Figs. 2.4 (a) and (b) but also the diagonal pair-site cliques in Fig. 2.4

(c), the triple-site cliques in Fig. 2.4 (d), and the quadruple-site cliques in Fig. 2.4 (e). We note that

clique types for a second-order symmetric non-causal neighborhood system contain all

those in Figs. 2.4 (a)-(e) and a fourth possible triple-site clique (which is not an NSHP clique

and hence is not shown in Fig. 2.4 (d)). As the order of the neighborhood system

increases, the number of cliques grows rapidly.

2.2 MARKOVIANITY

For the purpose of image analysis and processing, it is useful to have an underlying

model for the dominant characteristics of the given data. Although it is often difficult

to identify the physical mechanism which generated the observed data, an analytical

expression that captures the statistical properties of images can be used as a model

[94]. Specifically, the theory of MRF provides a consistent way of modeling context-

dependent entities [77]. The implicit assumption behind probabilistic approaches to

image analysis is that, for a given problem, there exists a probability distribution that

can capture to some extent the variability and interactions of different sets of relevant

image attributes [95].

A random field $F$ is a collection of random variables arranged on the lattice $S$ and

is defined as $F = \{F_{m,n}, (m, n) \in S\}$. The random variables $F_{m,n}$, $(m, n) \in S$, take

values in the label set $L$. Specifically, for 8-bit quantized images, $L = \{0, 1, \ldots, 255\}$.

Let $\mathcal{F} = L^{M \times N}$ be the configuration space, where the random field $F$ takes values. Here,

$M \times N$ is the dimension of the image. Let each random variable $F_{i,j}$ take a value $f_{i,j}$ in

$L$, identified with the notation $F_{i,j} = f_{i,j}$. The associated probability $p(F_{i,j} = f_{i,j})$ is

abbreviated as $p(f_{i,j})$. We use the notation $(F_{1,1} = f_{1,1}, \ldots, F_{M,N} = f_{M,N})$ for the joint

event. For simplicity, this joint event is abbreviated as $F = f$ where $f = \{f_{1,1}, \ldots, f_{M,N}\}$

is a configuration of $F$, corresponding to a realization of the field. The joint probability

$p(F = f)$ is abbreviated as $p(f)$. We define $f^c = \{f_{k,l} : (k, l) \in S - \{(i, j)\}\}$. The

Markovian property can be stated as

$$p(f_{i,j} \mid f^c) = p(f_{i,j} \mid f_{k,l},\ (k, l) \in N_{i,j}) \quad \text{(Markovianity)} \tag{2.4}$$

A random field F over lattice S is a Markov random field with respect to the neigh-

borhood system N , if and only if,

1. $p(f) > 0,\ \forall f \in \mathcal{F}$ (Positivity)

2. $F$ satisfies the Markovian property (Eq. 2.4)

(for all $(i, j) \in S$). Here, $f = \{f_{i,j}, (i, j) \in S\}$ corresponds to a realization of the

random field $F$ and $\mathcal{F}$ denotes the configuration space (set of all possible labelings $f$).

The fundamental notion associated with Markovianity is one of conditional independence.

A one-dimensional process $F_n$ is Markovian if the knowledge of the process

at some point $F_n$ decouples the past $F^p$ and the future $F^f$, i.e., $p(f^f \mid f_n, f^p) = p(f^f \mid f_n)$.

The notion of local dependence extends directly to 2D. However, the notion of “past”

and “future” is optional in 2D since there is no natural ordering of elements in a grid.

Nevertheless, the notion of time is very useful in extending 1D recursive techniques to

2D. If we consider a raster-scan of a grid at pixel position (m, n), the present, past,

and future NSHP pixels are shown in Fig. 2.2. We define a random field F as Markov

if knowledge of the process at pixel (m, n) is completely specified by the pixels in its

NSHP neighborhood, i.e.,

$$p(f_{m,n} \mid f_{k,l},\ (k, l) \ne (m, n)) = p(f_{m,n} \mid f_{k,l},\ (k, l) \in R_{m,n})$$

where $R_{m,n}$ is the NSHP support for a site $(m, n)$ as defined in Eq. (2.2).

In addition to the traditional discrete [96], continuous [77] and causal MRFs [93],

there exist several variants of MRFs such as discriminative RF [97], conditional RFs

[98], and double random fields [99]. The utility of MRF theory has also been extended

to 3D [94] for volumetric analysis of images and for modeling the temporal dimension in

videos. Reconstruction of non-texture surfaces is an example application of continuous

MRFs where we need to assign continuous labels for a sampling of the underlying

surface. In the restoration of a quantized image, we employ discrete MRFs. MRFs

with a symmetric neighborhood fall into the class of noncausal MRFs while the ones

with NSHP support belong to the class of causal MRFs.

2.3 CLIQUE POTENTIALS

The cliques defined in the earlier section are characterised by their strengths, referred to as clique potentials. The value of a clique potential depends on the local configuration of the clique c and the chosen potential function. The lower the potential of a clique, the closer the current site is to its neighbors within that clique. The sum of clique potentials is a measure of the energy of the spatial activity at that site. The energy function determines the probability of a current site given its neighbors.

2.3.1 Potential Function and Gibbs Random Field

A random field $F = \{F_{i,j}\}$ defined on S is a Gibbs random field (GRF) with respect to the neighborhood system N if and only if its joint distribution is of the form

$$ p(f) = \frac{1}{Z} \exp\{-U(f)\} \tag{2.5} $$

where $Z = \sum_{f \in \mathbb{F}} \exp\{-U(f)\}$ is a normalizing constant known as the partition function (the sum runs over the configuration space), and U(f) is the energy function defined as the sum of clique potentials over all possible cliques C and given by

$$ U(f) = \sum_{(m,n) \in S,\ c \in C_1} V_c(f_{m,n}) + \sum_{(m,n) \in S}\ \sum_{(k,l) \in N_{(m,n)},\ c \in C_2} V_c(f_{m,n}; f_{k,l}) + \cdots \tag{2.6} $$

where $V_c(f_{m,n})$ and $V_c(f_{m,n}; f_{k,l})$ are known as clique potentials such that $V_c(\cdot) \geq 0$. This ensures that the Gibbs energy U(f) is non-negative. The distribution given by Eq. (2.5) is called the Gibbs distribution (GD). The expression for the joint distribution in Eq. (2.5) has the physical interpretation that the smaller the value of U(f), which is the energy of a particular realization f, the more probable that realization is. A GRF is said to be homogeneous if $V_c(f)$ is independent of the relative position of the clique c in S. It is said to be isotropic if it is independent of the orientation of c.
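The relation between energy and probability in Eq. (2.5) can be illustrated on a lattice small enough to enumerate. The following is a minimal sketch, assuming a 2 × 2 binary lattice and a squared-difference pair-site potential; the lattice size, label set and potential are illustrative assumptions and not choices made in this thesis.

```python
import itertools
import math

SITES = [(0, 0), (0, 1), (1, 0), (1, 1)]   # a 2 x 2 lattice S (assumed for illustration)
LABELS = (0, 1)                            # binary label set L (assumed)
# Pair-site cliques of the 4-neighbourhood system on this lattice.
PAIR_CLIQUES = [((0, 0), (0, 1)), ((1, 0), (1, 1)),
                ((0, 0), (1, 0)), ((0, 1), (1, 1))]

def energy(f):
    """U(f): sum of pair-site clique potentials V_c = (f_a - f_b)^2 >= 0."""
    return sum((f[a] - f[b]) ** 2 for a, b in PAIR_CLIQUES)

def gibbs_distribution():
    """Enumerate every configuration and return a list of (config, p(f)) pairs."""
    configs = [dict(zip(SITES, values))
               for values in itertools.product(LABELS, repeat=len(SITES))]
    weights = [math.exp(-energy(f)) for f in configs]
    Z = sum(weights)                       # partition function of Eq. (2.5)
    return [(f, w / Z) for f, w in zip(configs, weights)]

dist = gibbs_distribution()
most_probable = max(dist, key=lambda item: item[1])
print(most_probable)
```

The constant configurations have zero energy and therefore receive the largest probability, while the checkerboard configuration, in which every pair-site clique is violated, receives the smallest. Exhaustive enumeration of Z is feasible only for such toy lattices.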

The clique potential Vc(.) is a function of the values of sites in clique c. It quantifies

the irregularity in the neighborhood. It helps in encoding prior information about the

image to be estimated to restrict the space of possible solutions. In fact, the power of

MRF lies in a careful selection of the clique potential Vc(f). Contextual constraints

on two labels are the lowest-order constraints to convey contextual information. They

are widely used not only because of their simple form and low computational cost, but

also due to the fact that they are usually sufficient to describe the local characteristics

of images. They are encoded in the Gibbs energy as pair-site clique potentials. The

cliques with only pair-wise interactions are known in MRF terminology as auto-models [77, 100]. For these models, the clique potential $V_c(\cdot) = 0$ for $|c| > 2$.

There exist several MRF models based on the choice of the potential function

and the label set. The choice of model is application-dependent. For instance, an

auto-logistic model [77], which has a binary label set, is appropriate for modeling

binary textures; multi-level logistic (or “colour-blind”) models [77] with (multi-valued)

discrete label set are suitable for textures. The most widely used models are the

spatial autoregressive (AR) models or the auto-normal models (Gaussian MRFs) [40,

91]. These appear in a variety of applications such as texture feature extraction,

classification and segmentation. An extensive presentation and comparison of different

classes of MRF models can be found in [101].

2.3.2 MRF-Gibbs Equivalence

An MRF is characterized by its local (conditional) property whereas a GRF represents

global (joint) property. A theorem by Hammersley and Clifford [77,102] establishes the

equivalence between MRF and GRF and yields the joint distribution from conditional

distributions. The first part of this theorem states that any GRF is an MRF. Here, our

interest is to derive an expression for calculating the conditional probability p(fi,j/fc)

from potential functions based on the assumption that F is a GRF. For completeness,

we state the theorem but give the proof for only the part that is of interest to us.

Hammersley-Clifford Theorem: A random field $F = \{F_{i,j}\}$ is a Markov

random field with respect to neighborhood N , if and only if, its joint density function

is a Gibbs distribution with cliques associated with N .

Proof : We now only prove that a GRF is an MRF [77,102], which enables us to

derive the conditional density from the clique potential functions.

Let p(f) be a Gibbs distribution on S with respect to the neighborhood system

N . The conditional probability can be written as

$$ p(f_{i,j}/f^c) = \frac{p(f)}{p(f^c)} = \frac{p(f)}{\sum_{f'_{i,j} \in L} p(f')} \tag{2.7} $$

where $f' = \{f_{1,1}, \cdots, f_{i,j-1}, f'_{i,j}, f_{i,j+1}, \cdots, f_{M,N}\}$ is any configuration that agrees with f at all sites except at site (i, j). Since we assume that F is a GRF, using Eq. (2.5) and Eq. (2.6), we can write

$$ p(f_{i,j}/f^c) = \frac{\exp\{-\sum_{c \in C} V_c(f)\}}{\sum_{f'_{i,j}} \exp\{-\sum_{c \in C} V_c(f')\}} \tag{2.8} $$

Dividing the set of cliques C into the set A consisting of cliques containing site (i, j) and the set B of cliques not containing site (i, j), we can write

$$ p(f_{i,j}/f^c) = \frac{\exp\{-\sum_{c \in A} V_c(f)\}\, \exp\{-\sum_{c \in B} V_c(f)\}}{\sum_{f'_{i,j}} \exp\{-\sum_{c \in A} V_c(f')\}\, \exp\{-\sum_{c \in B} V_c(f')\}} \tag{2.9} $$

Since by definition $V_c(f) = V_c(f')$ for any clique c that does not contain site (i, j), canceling common terms we get

$$ p(f_{i,j}/f^c) = \frac{\exp\{-\sum_{c \in A} V_c(f)\}}{\sum_{f'_{i,j}} \exp\{-\sum_{c \in A} V_c(f')\}} \tag{2.10} $$

Eq. (2.10) provides a formula for calculating the conditional density from the potential

functions, based on the assumption that F is a GRF. We note that Eq. (2.10) resembles

a Gibbs random field (Eq. 2.5), except that now the energy function is obtained by

summing the clique potentials of only the neighboring sites of the current site. We refer the reader to [103] for a proof of the converse statement. In the light of this equivalence, we simply

specify the local “interaction” factors to define, up to some multiplicative constant,

the joint probability distribution p(f). With such a setup, each variable only directly

depends on a few other “neighboring” variables. From a more global point of view, all

the variables are mutually dependent, but only through a combination of successive

local interactions [95].
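The cancellation that leads from Eq. (2.7) to Eq. (2.10) can be checked numerically. The sketch below assumes a 3 × 3 lattice, three labels and an absolute-difference pair-site potential (all hypothetical choices made only for illustration) and computes the local conditional probability in two ways: from the full Gibbs energy, and from the cliques containing the site alone.

```python
import math

ROWS, COLS = 3, 3          # small lattice, assumed for illustration
LABELS = (0, 1, 2)         # label set L, assumed for illustration

def pair_cliques():
    """Horizontal and vertical neighbour pairs (4-neighbourhood pair-site cliques)."""
    cliques = []
    for i in range(ROWS):
        for j in range(COLS):
            if j + 1 < COLS:
                cliques.append(((i, j), (i, j + 1)))
            if i + 1 < ROWS:
                cliques.append(((i, j), (i + 1, j)))
    return cliques

CLIQUES = pair_cliques()

def energy(f, cliques):
    """Sum of the illustrative potential V_c = |f_a - f_b| over the given cliques."""
    return sum(abs(f[a] - f[b]) for a, b in cliques)

def conditional(f, site, cliques):
    """exp(-energy) at the current label, normalised over all labels at `site`."""
    def weight(label):
        g = dict(f)
        g[site] = label
        return math.exp(-energy(g, cliques))
    return weight(f[site]) / sum(weight(label) for label in LABELS)

def conditional_from_joint(f, site):
    """Eq. (2.7)/(2.8): uses every clique in C (the partition function cancels)."""
    return conditional(f, site, CLIQUES)

def conditional_from_local_cliques(f, site):
    """Eq. (2.10): uses only the set A of cliques that contain `site`."""
    return conditional(f, site, [c for c in CLIQUES if site in c])

f = {(i, j): (i + 2 * j) % 3 for i in range(ROWS) for j in range(COLS)}
p_joint = conditional_from_joint(f, (1, 1))
p_local = conditional_from_local_cliques(f, (1, 1))
print(p_joint, p_local)
```

Both routes give the same value at every site, but the second touches only the cliques around the site, which is what makes the local specification practical on full-size images.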

2.4 MRF PRIORS

In a Bayesian framework, many image analysis tasks such as edge detection, template

matching, super resolution and restoration can be posed as inference problems [77,100].

The aim is to estimate the actual random signal f from its observation e. Bayesian techniques provide a way to invert an observation model taking the prior knowledge p(f) into account. This is done by applying the Bayes rule on the prior p(f) and the likelihood p(e/f), and then optimizing the a posteriori expected cost function for a given observation [100].

The choice of the clique potential Vc(f) is crucial as it embeds important prior

information about the image to be reconstructed. The conditional probabilities for

image neighborhood configurations, based on cliques, play a similar role to image

energy in variational approaches. The cliques in MRF encode a set of probabilistic

assumptions (priors) about the geometric properties of the signal, and thus are very

effective when the signal conforms sufficiently well to the prior. In the sequel, we

present how Vc(f) can be appropriately modified to include contextual prior knowledge

in the formulation of the local conditional density.

2.4.1 Gaussian MRF

A generic contextual constraint is smoothness (or continuity) which assumes that the

physical properties in a neighborhood of space or in an interval of time have some

coherence and generally do not change abruptly. For example, the surface of a table

is flat, a meadow presents a texture of grass, and a temporal event does not change

abruptly over a short period of time. Smoothness constraints are often expressed as

the prior probability or equivalently an energy term measuring the extent to which the

smoothness assumption is violated by a random process or field f .

For spatially and temporally continuous MRFs, the smoothness prior often involves

derivatives as in analytical regularization. The order of the derivative n determines

the number of sites involved in the cliques; for example, n = 1 corresponds to a pair-

site smoothness potential (common auto-model). The energy is the integral of the

potential function over the (entire) range of the signal [77].

In the discrete case, where the surface is sampled at discrete points and hence

the label set is also discrete, we use a first-order difference to approximate the first

derivative and use a summation to approximate the integral. Thus, the potential function, considering a 1D signal, is $V_c(f) = V_2(f_i, f_{i'}) = \frac{1}{2}(f_i - f_{i'})^2$. The energy is the potential summed over all the pair-site cliques, i.e.,

$$ U(f) = \sum_{i \in S}\ \sum_{i' \in N_i,\ c \in C_2} V_2(f_i; f_{i'}) = \frac{1}{2} \sum_{i} \sum_{i' \in N_i} (f_i - f_{i'})^2. $$
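As a small numerical illustration of this energy, the sketch below evaluates the double sum for two 1D signals using the pair-site potential $\frac{1}{2}(f_i - f_{i'})^2$ and a two-nearest-neighbor system. The signals and the neighborhood are assumptions made for this example only.

```python
def quadratic_energy(f):
    """U(f) = sum_i sum_{i' in N_i} 0.5 * (f_i - f_i')^2 with N_i = {i-1, i+1}."""
    total = 0.0
    for i in range(len(f)):
        for k in (i - 1, i + 1):           # the two nearest neighbours of site i
            if 0 <= k < len(f):            # skip neighbours outside the signal
                total += 0.5 * (f[i] - f[k]) ** 2
    return total

ramp = [0, 1, 2, 3, 4, 5]   # smooth signal: five unit differences
step = [0, 0, 0, 5, 5, 5]   # same range, concentrated in a single jump

u_ramp = quadratic_energy(ramp)
u_step = quadratic_energy(step)
print(u_ramp, u_step)
```

Although both signals span the same range, the quadratic potential assigns the step a five times larger energy than the ramp, which is the reason a quadratic prior tends to smooth edges.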

In specifying prior clique potentials for piecewise continuous surfaces, only pair-site

clique potentials are normally used. In the simplest case of a flat surface, they can be

defined by [77] V2(fi, fi′) = g(fi − fi′) where g(.) is a function penalizing the violation

of smoothness caused by the difference $f_i - f_{i'}$. For the purpose of restoration, the function g is generally even (i.e., g(η) = g(−η)) and nondecreasing for η ≥ 0 (i.e., g′(η) ≥ 0 for η ≥ 0).

The random field associated with quadratic smoothness prior is referred to as

Gauss Markov random field (GMRF). An image is usually modeled as a homogeneous

GMRF for simplicity and mathematical tractability. The clique potential function for

the GMRF, as discussed above, is a quadratic function represented as

$$ g(\eta) = \eta^2 \tag{2.11} $$

which corresponds to the prior clique potentials in the MRF models. Specifically, for an image s at site (m, n), the pair-site potential for a GMRF is

$$ V_c(s_{m,n}, s_{k,l}) = g(s(m,n) - s(k,l)) = \frac{1}{z}\,(s(m,n) - s(k,l))^2 $$

where z is a normalizing constant. The advantage of GMRFs is that the potential (Eq.

2.11) is a convex function, and hence it favors global optimization.

Employing pair-site cliques with quadratic potential function, we can formulate the energy function for GMRF over the sites containing (m, n) in its neighborhood as

$$ \eta^2(s(m,n), s) = \frac{1}{2\rho^2_{(m,n)}} \sum_{(i,j) \in N_{m,n}} (s(m,n) - s(m-i, n-j))^2. \tag{2.12} $$

Here, s are (already estimated) neighbors and $\rho_{(m,n)}$ is a weighting parameter based on the neighborhood. Note that we represent the energy function $\sum_{c \in A} V_c(s)$ $\left(= \sum_{(k,l) \in N_{m,n}} g(s(m,n) - s(k,l))\right)$ with $\eta^2(s(m,n), s)$. Assuming that the image s is a GRF, we obtain the conditional probability density function (pdf) as

$$ p(s(m,n)/s(m-i, n-j)) = \frac{1}{(2\pi\rho^2_{(m,n)}/m_i)^{1/2}} \exp\left(-\eta^2(s(m,n), s)\right) \tag{2.13} $$

where $(i,j) \in N_{m,n}$. The neighborhood becomes $R_{m,n}$ in a causal recursive framework. This defines a Gaussian intrinsic auto-regressive (AR) model with conditional mean given by the mean of the neighboring values and conditional variance $\rho^2_{(m,n)}/m_i$, where $m_i$ is the number of neighbors and parameter $\rho_{(m,n)}$ controls the allowable variation of the current pixel with respect to its neighboring pixel values [40]. In general, the smaller the variance among the neighbors, the lower must be the value of $\rho_{(m,n)}$, allowing a prediction closer to its neighborhood.
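The stated conditional mean and variance follow by completing the square in the energy of Eq. (2.12). A short derivation is sketched below, where the shorthand $x = s(m,n)$, $x_{ij} = s(m-i, n-j)$ and $\bar{x}$ (the mean of the $m_i$ neighbors) is introduced here only for compactness and is not notation used elsewhere in this thesis.

```latex
\begin{aligned}
\sum_{(i,j) \in N_{m,n}} (x - x_{ij})^2
  &= m_i\,(x - \bar{x})^2 + \sum_{(i,j) \in N_{m,n}} (x_{ij} - \bar{x})^2,
  \qquad \bar{x} = \frac{1}{m_i} \sum_{(i,j) \in N_{m,n}} x_{ij}, \\
\exp\!\left(-\frac{1}{2\rho^2_{(m,n)}} \sum_{(i,j) \in N_{m,n}} (x - x_{ij})^2\right)
  &\propto \exp\!\left(-\frac{(x - \bar{x})^2}{2\,\rho^2_{(m,n)}/m_i}\right).
\end{aligned}
```

The second term on the right of the first line does not depend on $x$ and is absorbed into the normalization, so the conditional density is Gaussian in $x$ with mean $\bar{x}$ and variance $\rho^2_{(m,n)}/m_i$.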

The Gaussian MRF model is effective in modeling smooth data, but it penalizes

edges. Since much of the information in an image is contained in edges, arbitrarily pe-

nalizing them is clearly disadvantageous. An image model must favor local smoothness

but at the same time should accommodate abrupt transitions such as edges.

2.4.2 Edge-preserving Priors

One remedy for the over-smoothing nature of the GMRF prior is to choose a more general contextual prior model through g as

$$ \sum_{c \in C} V_c(f) = \sum_{c \in C} g(d_c f) $$

where the differential operator $d_c f$ is a local spatial activity measure of the image and should have a small value in smooth regions and a large value at edges.

In the literature, one approach is to detect discontinuities or edges explicitly and preserve them using methods such as line fields or compound GMRF (CGMRF). Geman and Geman [34] proposed the use of a line field model for preserving edges. The potential function is a truncated quadratic and is given as

$$ g(\eta, \alpha) = \min\{\eta^2, \alpha\} \tag{2.14} $$

The function in Eq. (2.14) is non-convex (truncated convex). The smoothness constraint is switched off at points where the magnitude of the signal derivative exceeds the threshold $\sqrt{\alpha}$, thus preserving edge information. But the use of the line field potential function renders it non-differentiable.

Fig. 2.5: Edge-preserving convex potentials, panels (a) to (d). The x and y axes correspond to η and g(η), respectively.

An integration of GMRF and line fields is referred to as compound GMRF (CGMRF) [37]. A CGMRF consists of several

conditional Gauss-Markov sub-models with an underlying line field. Preserving dis-

continuities with this model requires the parameters of the CGMRF to be estimated.

Another approach is to replace the “square law” potentials by more “robust” functions

[100]. Optimization techniques such as simulated annealing with the Metropolis-Hastings algorithm or the Gibbs sampler, which are quite involved, may be needed to reach a reliable global minimum or maximum [100].

We now briefly discuss some well-known edge-preserving potential functions. These can be classified as (discontinuity-preserving) convex and non-convex potentials. Among the convex ones, Bouman and Sauer [35] define generalized Gaussian potentials as

$$ g(\eta) = |\eta|^r \tag{2.15} $$

with r ∈ [1, 2] (r = 2 for GMRF). Figs. 2.5 (a) and (b) show this potential function for r = 2, i.e., quadratic, and r = 1.2, respectively. We observe that there is a significant difference in the magnitude of the penalty between the two values of r.

Stevenson et al. [104] propose the Huber MRF, whose potential function (shown in Fig. 2.5 (c) for α = 0.6) is given by

$$ g(\eta) = \begin{cases} \eta^2, & |\eta| \leq \alpha \\ 2\alpha|\eta| - \alpha^2, & |\eta| > \alpha \end{cases} \tag{2.16} $$

The function proposed by Green [105]

$$ g(\eta) = 2\alpha^2 \log\cosh(\eta/\alpha) \tag{2.17} $$

is approximately quadratic for small η and linear for large η. Parameter α controls the transition between the two behaviors. We depict the nature of this potential for four different values of the parameter α (from 0.6 (solid) to 0.1 (dotted)) in Fig. 2.5 (d).

Non-convex potential functions penalise the edges to a still smaller degree by flattening their shape for large differences. Some of the non-convex potential functions employed for edge-preservation include the one proposed by Geman and McClure [106] (shown in Fig. 2.6 (a), for α = 1) and given by

$$ g(\eta) = \frac{\eta^2}{\eta^2 + \alpha^2} \tag{2.18} $$

Blake et al. [107] suggest a potential function which adaptively switches off penalising severe discontinuities (as depicted in Fig. 2.6 (b) for α = 2) and is given by

$$ g(\eta) = (\min\{|\eta|, \alpha\})^2 \tag{2.19} $$

Geman and Reynolds [108] propose

$$ g(\eta) = \frac{|\eta|}{|\eta| + \alpha} \tag{2.20} $$

for constrained restoration of discontinuities. A typical plot is shown in Fig. 2.6 (c).

Hebert and Leahy [109] propose

$$ g(\eta) = \log(1 + (\eta/\alpha)^2) \tag{2.21} $$

which has been employed for 3D Bayesian reconstruction from Poisson noisy data. The corresponding nonconvex potential function is plotted in Fig. 2.6 (d). Further references to edge-preserving priors can be found in [36, 39].

Fig. 2.6: Discontinuity-preserving non-convex potentials, panels (a) to (d). The x and y axes correspond to η and g(η), respectively.
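To compare how these potentials treat small and large differences, the functions of Eqs. (2.14) to (2.21) can be coded directly. The sketch below is only an illustration: the function names and the parameter values used in the printout are our own choices, and the log cosh term is evaluated in a rearranged but equivalent form to avoid overflow for large arguments.

```python
import math

def truncated_quadratic(eta, alpha):      # Eq. (2.14)
    return min(eta ** 2, alpha)

def generalized_gaussian(eta, r):         # Eq. (2.15), 1 <= r <= 2
    return abs(eta) ** r

def huber(eta, alpha):                    # Eq. (2.16)
    if abs(eta) <= alpha:
        return eta ** 2
    return 2 * alpha * abs(eta) - alpha ** 2

def green(eta, alpha):                    # Eq. (2.17)
    x = abs(eta) / alpha
    # log(cosh(x)) = x + log(1 + exp(-2x)) - log(2), which stays finite for large x
    return 2 * alpha ** 2 * (x + math.log1p(math.exp(-2 * x)) - math.log(2))

def geman_mcclure(eta, alpha):            # Eq. (2.18)
    return eta ** 2 / (eta ** 2 + alpha ** 2)

def blake(eta, alpha):                    # Eq. (2.19)
    return min(abs(eta), alpha) ** 2

def geman_reynolds(eta, alpha):           # Eq. (2.20)
    return abs(eta) / (abs(eta) + alpha)

def hebert_leahy(eta, alpha):             # Eq. (2.21)
    return math.log(1 + (eta / alpha) ** 2)

# Penalty assigned to a small difference and to an edge-like difference.
for name, g in [("quadratic", lambda e: generalized_gaussian(e, 2)),
                ("huber", lambda e: huber(e, 0.6)),
                ("geman_mcclure", lambda e: geman_mcclure(e, 1.0)),
                ("hebert_leahy", lambda e: hebert_leahy(e, 1.0))]:
    print(f"{name:>14}: g(0.5) = {g(0.5):.3f}, g(50) = {g(50):.3f}")
```

The quadratic penalty grows without bound at an edge-like difference, the Huber penalty grows only linearly, and the non-convex potentials either saturate or grow logarithmically.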

2.5 DISCUSSION

A brief review of the terminology, formulation and utility of the Markov random fields

as an image prior was given in this chapter. The theory of MRF provides a convenient

and consistent way of modeling the context-dependent information. Along with the

simple GMRF models, potential functions adaptive to discontinuities were also dis-

cussed. Applications of MRFs to image processing and computer vision problems can

be found in [91, 110].

CHAPTER 3

IMPORTANCE SAMPLING KALMAN FILTER FOR

IMAGE ESTIMATION

3.1 INTRODUCTION

In this chapter, we address the problem of image estimation in additive white Gaussian

noise. This can be cast as a problem of state estimation from noisy measurements.

When the state transition and measurement equations are both linear, and the state

and measurement noises are independent and additive Gaussian, the Kalman filter

is optimal and gives the minimum mean square error (MMSE) estimate of the state.

Extension of the 1D recursive KF to images was first proposed by Woods and Rade-

wan [8] and is referred to as the reduced-update Kalman filter. We first discuss the

formulation, abilities and limitations of the traditional auto-regressive Kalman filter

(ARKF). To address the shortcomings of the ARKF, we consider statistical modeling

of the contextual prior and incorporate it within the KF framework.

We propose an interesting extension for handling edges by modeling the original

image with a non-Gaussian MRF prior. The edge preservation capability is implic-

itly incorporated using a discontinuity-adaptive state conditional density. If the state

transition equation is not known but an assumption on the state transition density

(possibly non-Gaussian) can be made, we can still use the Kalman filter update equa-

tions in the proposed framework. We review the discontinuity-adaptive MRF model

proposed by Li [77,78] and employ it to formulate a state conditional prior to specify

the state dynamics. We discuss the importance sampling (IS) technique to estimate

the moments of an arbitrary pdf. The novelty of our approach lies in obtaining the

predicted mean and variance of the non-Gaussian state conditional density by im-

portance sampling and incorporating them in the update step of the Kalman filter.

Experimental results demonstrate the effectiveness of the proposed method in filtering

noise while preserving edges.

3.2 AUTO-REGRESSIVE KALMAN FILTER

The observation model for an image degraded by additive noise is given by

y(m, n) = s(m, n) + v(m, n) (3.1)

where s(m, n) is intensity of the original image at location (m, n), y(m, n) is the

observation, and v(m, n) is zero-mean white Gaussian noise with variance $\sigma_v^2$, which

is independent of s(m, n). The problem is to estimate s(m, n) given y(m, n). This is

not straightforward because we need to preserve the discontinuities in s(m, n) while

filtering out the noise.

Typically, the original image is modeled as a 2-D autoregressive (AR) process.

This is due to the fact that in the absence of any a priori constraints, the solution

can be very noisy. The corresponding AR equation [8, 25] can be written as a state

transition equation in the form

$$ s(m,n) = \sum_{(i,j) \in R_{m,n}} a(i,j)\, s(m-i, n-j) + u(m,n) \tag{3.2} $$

Here, Rm,n is the NSHP model support (which is commonly used) given by Eq. (2.2),

the scalars a(i, j) are the AR model coefficients (assumed stationary) and computed

from a prototype image, and u(m, n) is uncorrelated zero-mean white-Gaussian noise

driving the AR process. The term u(m, n) can also be regarded as the modeling error

between the image and its predicted value. Specifically, we obtain the AR coefficients

a(i, j) by solving the Yule-Walker equations [111]

$$ \begin{aligned} r_{k,l} - \sum_{(i,j) \in R_{m,n}} a(i,j)\, r_{k-i,\,l-j} &= 0, \qquad (k,l) \in R_{m,n} \\ \sigma_u^2 &= r_{0,0} - \sum_{(i,j) \in R_{m,n}} a(i,j)\, r_{i,j} \end{aligned} \tag{3.3} $$

where $\sigma_u^2$ is the variance of the state noise and $r_{k,l}$ is the (k, l)th auto-correlation coefficient of the original (or its prototype) image computed as

$$ r_{k,l} = \frac{1}{(M-k)(N-l)} \sum_{(i,j),\,(i-k,\,j-l) \in S} s(i,j)\, s(i-k, j-l) $$
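For the three-pixel NSHP support, Eq. (3.3) is a 3 × 3 linear system in a(0, 1), a(1, 0) and a(1, 1). The sketch below solves it with plain Gaussian elimination. As input it uses a hypothetical separable autocorrelation model $r_{k,l} = \rho^{|k|+|l|}$ (an assumption made only so that the result can be checked by hand); in practice the $r_{k,l}$ would be the sample auto-correlations of the prototype image.

```python
SUPPORT = [(0, 1), (1, 0), (1, 1)]   # three-pixel NSHP offsets (i, j)

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]     # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):                       # back substitution
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def yule_walker(r):
    """Solve Eq. (3.3) for a(i, j) and sigma_u^2, given an autocorrelation r(k, l)."""
    A = [[r(k - i, l - j) for (i, j) in SUPPORT] for (k, l) in SUPPORT]
    b = [r(k, l) for (k, l) in SUPPORT]
    a = solve_linear(A, b)
    sigma_u2 = r(0, 0) - sum(a_ij * r(i, j) for a_ij, (i, j) in zip(a, SUPPORT))
    return dict(zip(SUPPORT, a)), sigma_u2

RHO = 0.9   # hypothetical correlation parameter of the test model

def separable_autocorrelation(k, l):
    """Assumed model r(k, l) = RHO^(|k| + |l|), used only as test input."""
    return RHO ** (abs(k) + abs(l))

coeffs, sigma_u2 = yule_walker(separable_autocorrelation)
print(coeffs, sigma_u2)
```

For this separable model the solution is a(0, 1) = a(1, 0) = ρ and a(1, 1) = −ρ², with $\sigma_u^2 = (1 - \rho^2)^2$, which can be verified by substituting into Eq. (3.3).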

For simplicity, we rewrite the AR model (Eq. (3.2)) in matrix-vector form as

$$ s_k = F s_{k-1} + u_k \tag{3.4} $$

For example, if we consider a first-order NSHP support with three-pixel neighborhood, then $s_k = [s(m,n),\ s(m,n-1),\ s(m-1,n)]^T$ and $s_{k-1} = [s(m,n-1),\ s(m-1,n),\ s(m-1,n-1)]^T$. Matrix F contains the AR coefficients of the image. If a(0, 1), a(1, 0), a(1, 1) are the AR coefficients corresponding to the three-pixel NSHP neighborhood and obtained by solving Eq. (3.3), then $F = [f_1\ f_2\ f_3]$ where $f_1 = (a(0,1)\ \ 1\ \ 0)^T$, $f_2 = (a(1,0)\ \ 0\ \ 1)^T$ and $f_3 = (a(1,1)\ \ 0\ \ 0)^T$. The vector $u_k = [u(m,n)\ \ 0\ \ 0]^T$, where u(m, n) is process noise and is assumed to be independent and white Gaussian with zero mean and variance $\sigma_u^2$. The 3 × 3 covariance matrix Q of vector $u_k$ is formed by augmenting $\sigma_u^2$ with zeros.

Based on Eq. (3.1), the measurement equation can be formulated as

$$ y_k = H s_k + v_k \tag{3.5} $$

Here, $y_k = y(m,n)$ is the given observation at (m, n) and H = [1 0 0]. The measurement noise is $v_k = v(m,n)$, with covariance matrix $R = \sigma_v^2$. The process and measurement noise are assumed to be uncorrelated. Image estimation boils down to estimating the state $s_k$ given the observation $y_k$.
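The matrices of the state-space model can be assembled directly from the AR coefficients. The following sketch builds F, H, Q and R for the three-pixel NSHP case and applies one state transition; the coefficient values, noise variances and pixel intensities are arbitrary numbers chosen for illustration, and the helper names are our own.

```python
def build_state_space(a01, a10, a11, sigma_u2, sigma_v2):
    """Return F, H, Q, R for the three-pixel NSHP model of Eqs. (3.4) and (3.5)."""
    F = [[a01, a10, a11],     # s(m, n) is predicted from its three NSHP neighbours
         [1.0, 0.0, 0.0],     # s(m, n-1) is carried over from the previous state
         [0.0, 1.0, 0.0]]     # s(m-1, n) is carried over from the previous state
    H = [1.0, 0.0, 0.0]       # only s(m, n) is observed
    Q = [[sigma_u2, 0.0, 0.0],
         [0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0]]     # sigma_u^2 augmented with zeros
    R = sigma_v2
    return F, H, Q, R

def propagate(F, s_prev, u):
    """s_k = F s_{k-1} + u_k with u_k = [u, 0, 0]^T."""
    s_k = [sum(F[r][c] * s_prev[c] for c in range(3)) for r in range(3)]
    s_k[0] += u
    return s_k

# All numerical values below are arbitrary and serve only as an example.
F, H, Q, R = build_state_space(0.9, 0.9, -0.81, sigma_u2=4.0, sigma_v2=25.0)
s_prev = [100.0, 104.0, 102.0]   # [s(m, n-1), s(m-1, n), s(m-1, n-1)]
s_k = propagate(F, s_prev, u=0.0)
print(s_k)
```

The first component of the new state is the AR prediction of Eq. (3.2), while the remaining components are shifted copies of the previous state, as implied by the columns $f_1$, $f_2$, $f_3$ of F.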

3.2.1 Filter Equations

For the state-space model given by Eqs. (3.4) and (3.5), the MMSE estimate of state

sk can be derived using the following Kalman filter recursive equations [8, 112].

State estimate propagation: $\hat{s}_{k/k-1} = F \hat{s}_{k-1}$

Error covariance propagation: $P_{k/k-1} = F P_{k-1} F^T + Q$

Kalman gain matrix: $K_k = P_{k/k-1} H^T [H P_{k/k-1} H^T + R]^{-1}$

State update: $\hat{s}_k = \hat{s}_{k/k-1} + K_k (y_k - H \hat{s}_{k/k-1})$

Error covariance update: $P_k = (I - K_k H) P_{k/k-1}$

We initialize the state $\hat{s}_0$ (with the local mean of the observations) and covariance matrix $P_0$ (with an identity matrix). Here, a hat denotes an estimate, k − 1 represents (m, n − 1), $\hat{s}_{k-1}$ and $P_{k-1}$ are the a posteriori estimates of the state and error covariance of the previous step available at time k, $\hat{s}_{k/k-1}$ and $P_{k/k-1}$ are the a priori estimates of the state and error covariance at time k, $y_k$ is the new measurement at time k, $K_k$ is the Kalman gain, and $\hat{s}_k$ and $P_k$ are the a posteriori state and error covariance of the present step, respectively.
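One cycle of the above recursion can be sketched in a few lines. The code below is a minimal illustration for the three-state model with scalar observation and H = [1 0 0]; it exploits this form of H, so that the matrix inverse in the gain reduces to a scalar division. The helper names and all numerical values are assumptions made for the example and do not reproduce the implementation used in our experiments.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def kalman_step(s_prev, P_prev, y, F, Q, R):
    """One predict/update cycle for a scalar observation with H = [1 0 0]."""
    n = len(s_prev)
    # State estimate propagation: s_pred = F s_prev
    s_pred = [sum(F[i][j] * s_prev[j] for j in range(n)) for i in range(n)]
    # Error covariance propagation: P_pred = F P_prev F^T + Q
    FPFt = matmul(matmul(F, P_prev), transpose(F))
    P_pred = [[FPFt[i][j] + Q[i][j] for j in range(n)] for i in range(n)]
    # Kalman gain: with H = [1 0 0], H P_pred H^T + R is the scalar P_pred[0][0] + R
    innovation_var = P_pred[0][0] + R
    K = [P_pred[i][0] / innovation_var for i in range(n)]
    # State update using the innovation y - H s_pred
    innovation = y - s_pred[0]
    s_new = [s_pred[i] + K[i] * innovation for i in range(n)]
    # Error covariance update: (I - K H) P_pred, where (K H P_pred)[i][j] = K[i] * P_pred[0][j]
    P_new = [[P_pred[i][j] - K[i] * P_pred[0][j] for j in range(n)]
             for i in range(n)]
    return s_new, P_new, K

# Arbitrary illustrative values for the model and the initialization.
F = [[0.9, 0.9, -0.81], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
Q = [[4.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
R = 25.0
s0 = [100.0, 100.0, 100.0]                                   # e.g. a local mean
P0 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]     # identity matrix
s1, P1, K1 = kalman_step(s0, P0, y=110.0, F=F, Q=Q, R=R)
print(s1[0], K1[0], P1[0][0])
```

Since the measurement noise variance is large relative to the predicted error variance in this example, the gain is small and the updated estimate moves only part of the way from the prediction towards the observation.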

The above filter is referred to as the Auto-regressive Kalman Filter (ARKF). It

is important to observe that linear dependence implies statistical dependence but not

vice-versa. Our idea is to arrive at a more general framework wherein pixel dependen-

cies can be expressed statistically rather than by imposing a strong linear constraint

as in the AR model.

3.3 DISCONTINUITY-ADAPTIVE MRF

To restore the original image in the presence of noise, it is important to incorporate

as much knowledge as possible about the original image in the estimation process.

The Gauss MRF or AR model, that is commonly used in image restoration, is one

way of imposing smoothness constraint through the state equation to regularize the

solution. A homogeneous and linear AR model compromises on the ability to detect

sharp transitions [28]. In [28, 113], spatially varying 2D AR parameters are estimated

by windowing the observed image. Jeng and Woods [29] propose inhomogeneous AR

image models using local statistics. Kadaba et al. [11, 33] observe that Bernoulli-Laplacian and Cauchy distributions are a better fit for AR model residuals than the

Gaussian. In [37], a compound Gauss Markov random field is used to model images.

Edge-preserving Markov random fields have been proposed in [39].

The approach that we adopt to incorporate an edge-preserving MRF prior within

our recursive estimation scheme is based on the fact that statistical dependence in-

corporated using a Markov random field (MRF) model provides better flexibility in

incorporating contextual constraints through a suitably derived prior. Following the

discussions in chapter 2, we propose to employ a discontinuity-adaptive MRF model

to incorporate smoothness constraint while simultaneously preserving the edges.

3.3.1 Condition for DA Potentials

Li [77] proposes a continuous adaptive regularizer model in which, unlike the line field

model, the interaction decreases as the derivative magnitude becomes large and is

completely turned off at infinite magnitude of the signal derivative. A classical way of

finding solutions to ill-posed problems is based on regularization methods, where stability and uniqueness of the solution are enforced by the introduction of prior smoothness

constraints. A more general approach, which includes the classical one as a special

case, is probabilistic regularization which considers the original image and its degraded

version as realizations of random fields. The prior knowledge about the solution is ex-

pressed in the form of a conditional or joint probability distribution that specifies the

desired dependencies among values at neighboring sites.

Without loss of generality, consider a 1-D signal f that is to be estimated. Let $\eta = f'(x)$ and let $f^{(n)}$ denote the nth derivative of f. A potential function $g(f^{(n)}(x))$ quantifies the penalty against the irregularity in $f^{(n-1)}(x)$ and corresponds to prior clique (a set of connected pixels) potentials in MRF models [77]. The clique potential is usually chosen to satisfy two properties: (i) g(η) = g(|η|), and (ii) the derivative of g must be expressible as g′(η) = 2ηh(η), where h is a function which determines the interaction among neighboring pixels. The magnitude |g′(η)| = |2ηh(η)| is the strength with which the regularizer performs smoothing. A necessary condition for any regularization model to be adaptive to discontinuities is

$$ \lim_{\eta \to \infty} |g'(\eta)| = \lim_{\eta \to \infty} |2\eta h(\eta)| = A \tag{3.6} $$

where A ≥ 0 is a constant. The above condition with A = 0 completely prohibits smoothing at discontinuities as η → ∞, whereas with A > 0 the DA regularizer allows limited (bounded) smoothing.

Fig. 3.1: DA model. (a) Interaction of neighbors as a function of difference (x axis: signed difference measure η; y axis: interaction function $h_\gamma(\eta)$). (b) Penalty imposed by the DA model with increasing difference in intensity values (x axis: signed difference measure η; y axis: strength of the DA-potential function $g_\gamma(\eta)$; the convex and non-convex regions are marked).

3.3.2 DAMRF Prior for Recursive Estimation

The DA models proposed by Li [77] satisfy all the conditions for discontinuity-adaptivity and are robust to outliers [78,114]. They are parameterized by a free parameter γ and allow better control over the shape of the potential function [115]. Among the DA potentials suggested by Li in [77], we use the interaction function

$$ h_\gamma(\eta) = \frac{1}{1 + (\eta^2/\gamma)} $$

where η is as defined before. The corresponding potential function takes the form

$$ g_\gamma(\eta) = \gamma \log(1 + (\eta^2/\gamma)) \tag{3.7} $$

Fig. 3.1 (a) shows how the function $h_\gamma(\eta)$ specifies interaction based on the difference value η, while Fig. 3.1 (b) brings out the behavior of the corresponding potential function $g_\gamma(\eta)$. For this choice of $h_\gamma(\eta)$, the smoothing strength $|\eta h_\gamma(\eta)|$ increases monotonically as |η| increases within the band $B_\gamma = (-\sqrt{\gamma}, \sqrt{\gamma})$. Outside the band, smoothing decreases and becomes zero as η → ∞. Since this enables image discontinuities to be preserved, the model is also called a discontinuity-adaptive MRF (DAMRF). We observe that the variation from the convex (smoothing) region to the non-convex region (i.e., lesser increment in penalty to distinct pixels) is gradual in Fig. 3.1 (b). This choice of DA-potential function is based on our observation that its performance is superior to other DA-potentials proposed by Li [77, 78]. Fig. 3.2 (a) shows how the smoothing strength of the DAMRF varies as a function of η².

Fig. 3.2: (a) Plot shows how the smoothing strength of DAMRF, $|2\eta h_\gamma(\eta)|$, varies with the squared difference measure η². (b) Heavy-tailed DAMRF distributions (x axis: pixel intensity; y axis: DAMRF values; legend entries (30, 50), (80, 30), (160, 180)).
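The behavior of the smoothing strength inside and outside the band can be verified numerically. The sketch below codes $h_\gamma$, $g_\gamma$ and $|2\eta h_\gamma(\eta)|$ and locates the peak of the smoothing strength on a grid; the value γ = 25 and the grid are illustrative assumptions only.

```python
import math

def h_gamma(eta, gamma):
    """Interaction function h(eta) = 1 / (1 + eta^2 / gamma)."""
    return 1.0 / (1.0 + eta ** 2 / gamma)

def g_gamma(eta, gamma):
    """DA potential of Eq. (3.7): gamma * log(1 + eta^2 / gamma)."""
    return gamma * math.log(1.0 + eta ** 2 / gamma)

def smoothing_strength(eta, gamma):
    """|g'(eta)| = |2 eta h(eta)|."""
    return abs(2.0 * eta * h_gamma(eta, gamma))

gamma = 25.0                                  # illustrative value only
etas = [0.1 * k for k in range(0, 2001)]      # eta on a grid over [0, 200]
strengths = [smoothing_strength(e, gamma) for e in etas]
eta_peak = etas[strengths.index(max(strengths))]
print(eta_peak, math.sqrt(gamma))
```

The peak of the smoothing strength falls at the edge of the band, η = √γ, and the strength decays towards zero for large differences, which corresponds to A = 0 in Eq. (3.6).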

When neighborhood interaction in s is defined as above, the irregularities in the

solution are penalized by the corresponding potential function, and the conditional

pdf takes the form

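$$ p(s(m,n)/s(m-i, n-j)) = \frac{1}{Z} \exp\left(-\gamma \log\left(1 + \frac{\eta^2(s(m,n), s)}{\gamma}\right)\right) \tag{3.8} $$

where Z is a normalization constant and the energy term

$$ \eta^2(s(m,n), s) = \frac{1}{\rho^2_{(m,n)}} \sum_{(i,j) \in R_{m,n}} \frac{(s(m,n) - s(m-i, n-j))^2}{(i^2 + j^2)} \tag{3.9} $$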

Here, $(i,j) \in R_{m,n}$ where $R_{m,n}$ is the NSHP support of order $M_1$. Thus, s are the NSHP neighbors of s(m, n). Parameter $\rho^2_{(m,n)}$ controls the allowable variation of the current pixel with respect to its neighboring pixel values [40] and plays an important role in adaptively modifying the prior distribution. Since the potential function used is discontinuity-adaptive, this local conditional prior is expected to render better edge-preserving capability compared to the quadratic GMRF regularizer. Fig. 3.2 (b) illustrates the non-Gaussian nature of DAMRF distributions. The pdf in Eq. (3.8) can be rewritten as

$$ p(s(m,n)/s(m-i, n-j)) = \frac{1}{Z} \left(1 + \frac{\eta^2(s(m,n), s)}{\gamma}\right)^{-\gamma} \tag{3.10} $$

which has a maximum at $\frac{\sum \alpha_{ij}\, s(m-i, n-j)}{\sum \alpha_{ij}}$, where $\alpha_{ij} = 1/(i^2 + j^2)$ are the corresponding weights and $(i,j) \in R_{m,n}$. Further, the pdf p is symmetric about the maximum (due to the quadratic nature of the argument η²). Thus, the mean value of the pdf p(s(m, n)/s) is given by

$$ E(p(s(m,n)/s)) = \frac{\sum_{i,j} \alpha_{ij}\, s(m-i, n-j)}{\sum_{i,j} \alpha_{ij}}. \tag{3.11} $$
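The location of the maximum and the symmetry of the DAMRF conditional pdf can be confirmed on a small example. The sketch below evaluates the unnormalized pdf of Eq. (3.10) for three NSHP neighbors and compares a grid search for the mode with the weighted mean of Eq. (3.11). The neighbor intensities and the values of ρ² and γ are made-up numbers, and the function names are our own.

```python
def damrf_unnormalized_pdf(x, neighbors, rho2, gamma):
    """(1 + eta^2 / gamma)^(-gamma) with eta^2 from Eq. (3.9); Z is omitted."""
    eta2 = sum((x - s) ** 2 / (i * i + j * j)
               for (i, j), s in neighbors.items()) / rho2
    return (1.0 + eta2 / gamma) ** (-gamma)

def damrf_mean(neighbors):
    """Weighted neighbour mean of Eq. (3.11) with alpha_ij = 1 / (i^2 + j^2)."""
    num = sum(s / (i * i + j * j) for (i, j), s in neighbors.items())
    den = sum(1.0 / (i * i + j * j) for (i, j) in neighbors)
    return num / den

# Offset (i, j) -> already estimated value s(m-i, n-j); the numbers are made up.
neighbors = {(0, 1): 120.0, (1, 0): 130.0, (1, 1): 90.0}
mean = damrf_mean(neighbors)

# Locate the mode on a grid of candidate intensities and compare it with the mean.
grid = [0.05 * k for k in range(0, 5101)]     # intensities from 0 to 255
mode = max(grid, key=lambda x: damrf_unnormalized_pdf(x, neighbors,
                                                      rho2=100.0, gamma=3.0))
print(mean, mode)
```

The diagonal neighbor receives half the weight of the horizontal and vertical ones, so its outlying value pulls the mean less than it would in a plain average.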

3.4 PRINCIPLE OF IMPORTANCE SAMPLING

In order to incorporate the DAMRF prior (formulated in section 3.3.2), into the

Kalman filter framework, we need to estimate the mean and covariance of this non-

Gaussian prior (in each recursive prediction step). Since it is analytically not possible

to compute the variance of this pdf, we resort to a Monte Carlo method known as

importance sampling.

Monte Carlo (MC) is the art of approximating an expectation by the sample mean

of a function of simulated random variables [80]. Since the simulations are with random

numbers, the more simulations we perform, the more accurate the approximation

becomes [116]. However, the closer the approximation by a specific MC method, the fewer samples are needed to reach a certain level of accuracy. With appropriate MC methods,

even when using random sampling, it is possible to reduce the (error) variance for a

given number of sample points. This is based on the intuition that the efficiency of

random sampling methods can be enormously enhanced by a suitable choice of sample

points. For example, to estimate moments under a Gaussian distribution with random

samples, we need to sample more frequently around the mean of the Gaussian pdf

than employ uniform sampling. The MC procedure that enables us to perform such

tasks systematically is known as importance sampling.

Importance sampling (IS) is an MC method for determining the estimates under

a (difficult to sample) target function or pdf, provided its functional form is known

up to a multiplication constant [79]. The idea behind IS is that certain values of

the input random variables have more impact on the parameter being estimated than

others. If these “important” values are emphasized by sampling more frequently, then

the estimator error variance can be reduced [80]. Hence, the basic methodology is to

choose a distribution for generating random samples which “encourages” important values. It does this through a probability distribution function [116] that assigns a higher probability to the intervals of the target function that should receive more samples. But this can result in a biased estimator. Hence, the outputs are corrected using correction weights for each sample to ensure unbiasedness [116, 117]. The selection of the sampling distribution, known as the importance function, requires prior knowledge of the target pdf (or function).

Let us consider a pdf p(z) from which it is very difficult to make any estimates of its

moments. From the functional form, we can estimate its non-zero support. Consider

a distribution q(z), a reasonable approximation to p(z), which is also known up to a

multiplicative constant, is easy to sample, and is such that the (non-zero) support of

q(z) includes the support of p(z). Such a density q(z) is called the sampler density

(referred to as importance function).

Mathematically, the idea of importance sampling can be presented as follows.

Suppose the density q(z) roughly approximates the target density p(z), with (non-zero) support A (i.e., $\int_{z \in A} q(z)\,dz = 1$). The expected value of a function g under pdf p(z) can be written as

\[
E_p[g(z)] = \int g(z)\,p(z)\,dz = \int g(z)\left(\frac{p(z)}{q(z)}\right) q(z)\,dz = E_q\!\left[g(z)\left(\frac{p(z)}{q(z)}\right)\right] \tag{3.12}
\]

Fig. 3.3: Choice of importance function. (a) Good choice: The support of the target pdf is included in the sampler [plotted: Rayleigh (10) and Gamma (2, 4)]. (b) Bad choice: The support of the same target pdf is not included in the sampler [plotted: q: Rayleigh (4) and p: Gamma (2, 4)]. Axes: support (horizontal), pdf values (vertical).

Using the above identity and IS with the samples of q, Ep[g(z)] is estimated as

\[
\int g(z)\,p(z)\,dz \approx \frac{1}{L}\sum_{l=1}^{L} g(z^{(l)})\left(\frac{p(z^{(l)})}{q(z^{(l)})}\right), \tag{3.13}
\]

where L samples z(l) are drawn from the distribution given by q(z). The (dis-

tribution) ratio p(z)/q(z) is known as the likelihood ratio [118], since it quantifies

the relative likelihood of a given sample z(l) under p compared to q. We note that

each sample drawn from q is weighted by the likelihood (or the correction weight) for

counter-balancing [116].
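As an illustration of Eq. (3.13), the following minimal Python sketch estimates expectations under a Gaussian target using samples drawn from a Cauchy sampler. The parameters mirror the first row of Table 3.1, reading "Cauchy (µ, 30)" as location µ and scale 30, which is an assumption. The function names, the use of NumPy, the seed and the sample count are also assumptions made for illustration; this is not the code used for the reported experiments.

```python
import numpy as np


def gaussian_pdf(z, mean, var):
    """Normalized Gaussian density with the given mean and variance."""
    return np.exp(-0.5 * (z - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def cauchy_pdf(z, loc, scale):
    """Normalized Cauchy density with the given location and scale."""
    return 1.0 / (np.pi * scale * (1.0 + ((z - loc) / scale) ** 2))


def target_pdf(z):
    """Target p: Gaussian with mean 20 and variance 40 (illustrative choice)."""
    return gaussian_pdf(z, 20.0, 40.0)


def sampler_pdf(z):
    """Sampler q: Cauchy with location 20 and scale 30 (illustrative choice)."""
    return cauchy_pdf(z, 20.0, 30.0)


def sampler_draw(rng, num_samples):
    """Draw samples from the Cauchy sampler q."""
    return 20.0 + 30.0 * rng.standard_cauchy(num_samples)


def is_expectation(g, p_pdf, q_pdf, q_draw, num_samples, rng):
    """Estimate E_p[g(z)] as in Eq. (3.13): the sample mean of g(z) p(z)/q(z) under q."""
    z = q_draw(rng, num_samples)
    weights = p_pdf(z) / q_pdf(z)  # likelihood ratio (correction weight) per sample
    return float(np.mean(g(z) * weights))


rng = np.random.default_rng(0)
mean_estimate = is_expectation(lambda z: z, target_pdf, sampler_pdf,
                               sampler_draw, 100_000, rng)
second_moment = is_expectation(lambda z: z ** 2, target_pdf, sampler_pdf,
                               sampler_draw, 100_000, rng)
print("mean estimate:", mean_estimate)
print("variance estimate:", second_moment - mean_estimate ** 2)
```

Because the Cauchy sampler has heavier tails than the Gaussian target, the ratio p(z)/q(z) stays bounded, so no single sample dominates the average.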

It is important to note that the success of this method in getting accurate estimates

is entirely dependent on selecting a good importance function q(z). Fig. 3.3 shows

good and bad q for a given p. Even when q(z) is roughly the same shape as p(z),

serious difficulties arise if q(z) decays faster than p(z) in the tails (for example, as shown in Fig. 3.3 (b)). In such cases, the improbable samples in the tails of q receive correction weights that are orders of magnitude larger than usual, making the estimate

biased [80]. For further information about the choice of importance function, refer to [118, 119]. Fig. 3.4 illustrates the limitation of a non-symmetric sampler¹.

Fig. 3.4: Choice of importance function. (a) Good choice: Sampler support includes (non-symmetric) target pdf [plotted: Cauchy (2, 1) and Rayleigh (2)]. (b) Bad choice: Non-symmetric sampler cannot be employed for IS of a symmetric target distribution [plotted: Rayleigh (10) and Gaussian (5, 40)]. Axes: support (horizontal), pdf values (vertical).

3.4.1 Moment Estimation

Our aim is to determine the estimates under the pdf p, using samples from q. Since it is difficult to draw samples from the target pdf p, we draw L samples $\{z^{(l)}\}_{l=1}^{L}$ from the sampler pdf q, i.e., they are chosen from a distribution q which concentrates in the region where the function p is large. If these samples had been drawn from p, we could determine the moments of p directly from them. In order to use the samples from q to estimate the moments of p, we proceed as follows. When we use samples from q to determine any estimates under p, the regions where q is greater than p are over-represented, while the regions where q is less than p are under-represented. To account for this, we use correction weights $w_l = p(z^{(l)})/q(z^{(l)})$ in determining

the estimates under p [see Eq. (3.13)]. While estimating the moments, the correction weights must be normalized if the target p and sampler q (densities) are known only up to a multiplicative constant [118]. For example, to find the mean of the distribution p we use

\[
\mu_p = \frac{\sum_{l=1}^{L} w_l\, z^{(l)}}{\sum_{l=1}^{L} w_l}.
\]

Similarly, the variance under the distribution p using the samples of q by importance sampling can be found using

\[
\sigma_p^2 = \frac{\sum_{l=1}^{L} w_l\,(z^{(l)} - \mu_p)^2}{\sum_{l=1}^{L} w_l}.
\]

¹ In Fig. 3.4 (a), to estimate the moments under a non-symmetric distribution using the symmetric sampler, explicit care must be taken to use zero values of target pdf for the negative samples.

Table 3.1: Moment estimation using IS with a proper choice of importance function.

Target pdf                    Sampler density      True     IS mean   True       IS variance
                              (L = 1000)           mean     (µp)      variance   (σ²p)
Gaussian (µ = 20, σ² = 40)    Cauchy (µ, 30)       20       19.99     40         40.004
Gaussian (200, 400)           Cauchy (µ, 30)       200      199.97    400        400.12
Laplacian (µ = 2, b = 1)      Cauchy (µ, 30)       2        2.007     2          2.025
Laplacian (20, 10)            Cauchy (µ, 30)       20       20.011    200        199.999
Gamma (k = 2, θ = 4)          Rayleigh (σ = 10)    8        8.024     32         32.245
Chi-square (k = 4)            Rayleigh (σ = 10)    4        3.997     8          7.95
Rayleigh (σ = 4)              Rayleigh (σ = 10)    5.013    4.997     6.867      6.815
Rayleigh (σ = 2)              Cauchy (2, 1)        2.507    2.503     1.717      1.720
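The role of the normalization by the sum of the weights can be seen from a short derivation. The symbols $\tilde{p}$, $\tilde{q}$, $Z_p$ and $Z_q$ below are introduced here only for illustration and denote the unnormalized densities and their (unknown) normalizing constants.

```latex
% Assumption: p(z) = \tilde{p}(z)/Z_p and q(z) = \tilde{q}(z)/Z_q, with Z_p and Z_q unknown.
% Only the unnormalized weight \tilde{w}_l can be computed in practice:
\tilde{w}_l = \frac{\tilde{p}(z^{(l)})}{\tilde{q}(z^{(l)})} = \frac{Z_p}{Z_q}\, w_l .

% The unknown factor Z_p/Z_q cancels in the normalized estimate of the mean:
\frac{\sum_{l=1}^{L} \tilde{w}_l\, z^{(l)}}{\sum_{l=1}^{L} \tilde{w}_l}
  = \frac{\frac{Z_p}{Z_q}\sum_{l=1}^{L} w_l\, z^{(l)}}{\frac{Z_p}{Z_q}\sum_{l=1}^{L} w_l}
  = \frac{\sum_{l=1}^{L} w_l\, z^{(l)}}{\sum_{l=1}^{L} w_l} = \mu_p .

% For properly normalized densities, \frac{1}{L}\sum_l w_l \approx E_q[p(z)/q(z)] = 1,
% so dividing by \sum_l w_l instead of L agrees with Eq. (3.13) for large L.
```

The same cancellation applies to the variance estimate, since the weights enter its numerator and denominator in the same way.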

The accuracy of the IS estimates is quantified using their MC variance. For example, the variance of the estimate of $\mu_p$ is defined as $\sigma_{ac}^2 = \frac{1}{L}\sum_{l}\,[z^{(l)} - \mu_p]^2$, where $z^{(l)}$ are now randomly sampled from p. Importance sampling with a proper choice of

importance function yields a much smaller variance than uniform sampling [80, 120].

We now illustrate and verify the ability of importance sampling principle to estimate

the moments under some well-known distributions.

We performed experiments to estimate the mean and variance of distributions such as Laplacian, Gamma and Rayleigh from samples of another distribution, using importance sampling. In Tables 3.1 and 3.2, we have summarized the results for proper and improper choice of the importance function, respectively. The tables also contain the parameters used for the target distributions, the chosen parameters of the sampling pdfs, and the true and estimated means and variances. With a proper choice of importance function, the closeness of the estimated (first two) moments to the true values (as given in Table 3.1) validates importance sampling. Figs. 3.3 (a) and 3.4 (a) are good choices of importance function for which the estimated values are given in rows 5 and 8, respectively, in Table 3.1. Note that Table 3.2 corresponds to bad choices of the importance function. Figs. 3.3 (b) and 3.4 (b) illustrate target distributions and importance functions corresponding to the first two rows of Table 3.2, respectively.

Table 3.2: Moment estimation using IS with a bad choice of importance function.

Target pdf                    Bad choice of sampler   True     IS       True       IS
                              (L = 1000)              mean     mean     variance   variance
Gamma (k = 2, θ = 4)          Rayleigh (σ = 4)        8        7.126    32         17.68
Gaussian (µ = 5, σ² = 40)     Rayleigh (σ = 10)       5        7.51     40         22.46
Chi-square (k = 40)           Rayleigh (σ = 10)       40       33.88    80         19.41

In our problem, the DAMRF distribution corresponds to p. We choose the Cauchy

distribution as the importance function q. It is a better approximation than the

Gaussian distribution as it provides larger support overlap due to its heavy-tailed

nature [79]. During the estimation procedure, since the DA state conditional pdf

changes its mode (peak value) at each pixel based on its neighbors, we adaptively vary

the Cauchy sampler location based on the mean of the neighbors. To account for the

variations in shape (width and tail behavior) of DAMRF prior at each pixel, we set

the scale of the Cauchy sampler sufficiently high.²

3.5 KALMAN FILTER WITH NON-GAUSSIAN PRIOR

We now present a recursive algorithm for estimating the original image within the KF

framework by non-Gaussian modeling of the image prior. In the proposed strategy,

knowledge of only the conditional pdf is required and its parameters are a function of the already estimated pixels and the values of ρ(m,n) and γ as discussed in section 3.3.2. This implicitly generalizes the state transition equation and does not restrict it to be linear. The proposed extension is based on the observation that the KF and its variants differ only in state vector propagation, but have the same update equations.

² The same value of scale was found to work well for all the images.

Specifically, we formulate the conditional density of the pixel to be estimated (given

its neighbors) based on the DAMRF model. The state sk at position k = (m, n) is

the original pixel intensity s(m, n) which is to be estimated. The mean and variance

of the state conditional density constitute the predicted mean sk/k−1 and predicted

covariance Pk/k−1 of the state, respectively. We estimate the first two statistics of the

non-Gaussian prior by importance sampling. They are taken to the update stage of

the KF to arrive at the final image estimate.

The steps involved in the proposed method are as follows:

1. At each pixel, we construct the state conditional pdf using the past pixels in

the NSHP support, and the values of ρ(m,n) and γ in the DAMRF model (Eq.

3.8). For a first-order NSHP support (M1 = 1), the neighbors considered are

s(m, n− 1), s(m− 1, n), s(m− 1, n− 1) and s(m− 1, n + 1), which are already

estimated, and η2(s(m, n), s) is as defined in Eq. 3.9. Since the parameter

ρ2(m,n) of the DAMRF model characterizes local dependence it is set to Pk−1,

the covariance of the previous state.

2. We obtain the mean and covariance of the above pdf using importance sampling.

Draw samples z(l), l = 1, 2, ..., L from a Cauchy sampler³. To facilitate a

close match with the DAMRF pdf p at each pixel, the location of the sampler

density q(z) is varied as µq = (s(m, n− 1) + s(m− 1, n) + s(m− 1, n− 1) +

s(m− 1, n + 1))/4 and the scale is chosen large enough to include the support

of p in q. The samples are weighted by $w_l = p(z^{(l)})/q(z^{(l)})$ (section 3.4). The mean $\mu_p$ can be computed analytically using Eq. (3.11) or by importance sampling. The mean and variance $\sigma_p^2$ of p are computed as

\[
\mu_p = \frac{\sum_{l=1}^{L} w_l\, z^{(l)}}{\sum_{l=1}^{L} w_l}
\quad \text{and} \quad
\sigma_p^2 = \frac{\sum_{l=1}^{L} w_l\,(z^{(l)} - \mu_p)^2}{\sum_{l=1}^{L} w_l}
\]

³ Cauchy distributed samples with location µq and scale s1 can be generated using µq + s1 tan(π(u − 0.5)), where u is uniformly distributed over 0 to 1.

3. The predicted mean and error covariance are fed to the update stage of the Kalman filter as $\hat{s}_{k/k-1} = \mu_p$ and $P_{k/k-1} = \sigma_p^2$, where $\hat{s}_{k/k-1}$ and $P_{k/k-1}$ are the one-step forward predicted mean and error covariance, respectively.

Kalman gain: $K_k = P_{k/k-1} H^T \left[ H P_{k/k-1} H^T + \sigma_v^2 \right]^{-1} = \dfrac{\sigma_p^2}{\sigma_p^2 + \sigma_v^2}$

Mean update: $\hat{s}_k = \hat{s}_{k/k-1} + K_k\,(y_k - H \hat{s}_{k/k-1}) = \mu_p + \dfrac{\sigma_p^2\,(y_k - \mu_p)}{\sigma_p^2 + \sigma_v^2}$

Error covariance update: $P_k = (I - K_k H)\,P_{k/k-1} = \dfrac{\sigma_p^2\,\sigma_v^2}{\sigma_p^2 + \sigma_v^2}$.

Since the state ($s_k = s(m, n)$) is a scalar, from Eq. (3.5) we have H = 1. Here $y_k = y(m, n)$ is the scalar observation.

This gives the estimated mean $\hat{s}(m, n) = \hat{s}_k$; we sequentially perform steps 1 to 3 until we estimate the last pixel.
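Steps 2 and 3 for a single pixel can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in this work: the DAMRF conditional pdf of Eq. 3.8 is not reproduced here, so the target is passed in as a callable `prior_unnorm`, and the usage example substitutes a hypothetical Gaussian-shaped stand-in for it. The function names, the sampler scale and the sample count are likewise illustrative.

```python
import numpy as np


def cauchy_samples(loc, scale, num_samples, rng):
    """Cauchy samples via loc + scale * tan(pi * (u - 0.5)), with u uniform on (0, 1)."""
    u = rng.uniform(0.0, 1.0, size=num_samples)
    return loc + scale * np.tan(np.pi * (u - 0.5))


def cauchy_pdf(z, loc, scale):
    """Cauchy density with the given location and scale."""
    return 1.0 / (np.pi * scale * (1.0 + ((z - loc) / scale) ** 2))


def iskf_pixel_update(prior_unnorm, neighbors, y, noise_var, scale, num_samples, rng):
    """One ISKF step at a pixel: IS prediction followed by the scalar Kalman update."""
    # Step 2: the sampler location is the mean of the already estimated neighbors.
    loc = float(np.mean(neighbors))
    z = cauchy_samples(loc, scale, num_samples, rng)
    w = prior_unnorm(z) / cauchy_pdf(z, loc, scale)
    w_sum = np.sum(w)
    mu_p = np.sum(w * z) / w_sum                  # predicted mean
    var_p = np.sum(w * (z - mu_p) ** 2) / w_sum   # predicted error covariance
    # Step 3: scalar Kalman update with H = 1.
    gain = var_p / (var_p + noise_var)
    s_hat = mu_p + gain * (y - mu_p)
    p_post = (1.0 - gain) * var_p
    return float(s_hat), float(p_post)


# Usage with a hypothetical stand-in prior (NOT the DAMRF pdf): an unnormalized
# Gaussian centred on the neighbor mean, with variance rho2.
neighbors = [100.0, 104.0, 98.0, 102.0]
rho2 = 25.0


def stand_in_prior(z):
    return np.exp(-0.5 * (z - np.mean(neighbors)) ** 2 / rho2)


rng = np.random.default_rng(0)
s_hat, p_post = iskf_pixel_update(stand_in_prior, neighbors, y=110.0, noise_var=100.0,
                                  scale=20.0, num_samples=50_000, rng=rng)
print("updated estimate:", s_hat, "updated covariance:", p_post)
```

Because the weights are normalized by their sum, the prior only needs to be known up to a multiplicative constant, which is why the stand-in above omits its normalizing factor.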

Note that for the ARKF, the AR coefficients are assumed to be known accurately.

In real situations, this can be difficult because the original image is not available. In

contrast, for the ISKF, the statistical parameters µp and σ2p are obtained by IS.

3.6 EXPERIMENTAL RESULTS

In this section, we present results on image estimation using the proposed impor-

tance sampling-based Kalman filter (ISKF). We also provide comparisons with the

auto-regressive Kalman filter (ARKF) to demonstrate the improvement obtained by

incorporating a non-Gaussian prior in the Kalman filter. In ARKF, the AR parameters

and the process noise covariance are obtained from the auto-correlation coefficients of

the entire original image.

As a quantitative measure of the accuracy of the estimates, we use the improvement-

in-signal-to-noise-ratio (ISNR) defined as

\[
\mathrm{ISNR} = 10 \log_{10}\left(\frac{\sum_{i,j}\,(y(i,j) - s(i,j))^2}{\sum_{i,j}\,(s(i,j) - \hat{s}(i,j))^2}\right) \mathrm{dB} \tag{3.14}
\]

where $s$, $y$, and $\hat{s}$ represent the original, the degraded observation, and the estimated image, respectively, and the summation is taken over the entire image.

Fig. 3.5: (a) Original image. (b) Degraded image (SNR = 10 dB). Image estimated by (c) ARKF (ISNR = 3.06 dB), and (d) the ISKF (ISNR = 4.44 dB, γ = 1.8).
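The ISNR of Eq. (3.14) can be computed with a few lines of Python. The sketch below assumes the three images are available as arrays of equal shape; the function name `isnr_db` and the toy arrays are illustrative only.

```python
import numpy as np


def isnr_db(original, degraded, estimate):
    """Improvement in SNR (dB) as in Eq. (3.14), summed over the entire image."""
    original = np.asarray(original, dtype=float)
    degraded = np.asarray(degraded, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    noise_energy = np.sum((degraded - original) ** 2)     # error before filtering
    residual_energy = np.sum((original - estimate) ** 2)  # error after filtering
    return float(10.0 * np.log10(noise_energy / residual_energy))


# Toy example: the estimate halves the error amplitude at every pixel,
# so the error energy drops by a factor of 4.
original = np.zeros((4, 4))
degraded = original + 2.0
estimate = original + 1.0
print(isnr_db(original, degraded, estimate))
```

A positive value indicates that the estimate is closer to the original than the degraded observation is; a value of 0 dB means no improvement.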

In Fig. 3.5(a), an original flower image of size 200 × 200 with sharp and clean

petals is shown. The image obtained after degradation by additive white Gaussian

noise (AWGN) of SNR = 10 dB is given in Fig. 3.5(b). The image estimated by

ARKF and the proposed approach are shown in Figs. 3.5(c) and 3.5(d), respectively.

We note that using the proposed scheme the noise on the white petals and on the

black background is filtered out very effectively while the petals come out sharp. The

image estimated by ISKF not only has finer details but has less noise compared to the

output of the ARKF. This is also reflected in the ISNR values.

To illustrate the (non-Gaussian) nature of the DAMRF pdf, the distribution cor-

responding to the DAMRF model has been plotted in Fig. 3.2 (b) for three different

pixel locations in the image (Fig. 3.5(d)), at (30, 50), (80, 30) and (160, 180) . We

note that the DAMRF pdfs have a heavy-tailed nature. We have observed experimen-

tally that their variance (degree of heavy-tail) coarsely varies with the variation among

neighbors (of the current pixel) while it is finely controlled by the (previous) covariance estimate that is used for ρ2(m,n). From Fig. 3.2 (b), we see that the three pdfs

at three different locations have different means and different variances. This shows

the need for a space-varying heavy-tailed importance function to closely approximate

the DAMRF pdf at each pixel.

Fig. 3.6: (a) Original image. (b) Degraded image (SNR = 10 dB). Image estimated using (c) ARKF (ISNR = 1.29 dB), and (d) the ISKF (ISNR = 2.43 dB).

Fig. 3.7: (a) Original "House" image. (b) Degraded ($\sigma_v^2$ = 300). Image estimated using (c) AR-based KF (ISNR = 2.04 dB), and (d) ISKF (ISNR = 2.58 dB).

Fig. 3.6(a) shows an image of bricks with narrow joints while its degraded version

is shown in Fig. 3.6(b). The image recovered using the ARKF and the ISKF are

shown in Figs. 3.6(c) and 3.6(d), respectively. In attempting to filter the noise, the

ARKF yields a blurred output. On the other hand, the ISKF not only captures the

edges (joints) well but also effectively filters out the noise (on the bricks) resulting in

an output (Fig. 3.6(d)) which is much closer to the original image.

Fig. 3.7(a) shows a natural scene with a house, tree and a car while its degraded

version with σ2v = 300 is shown in Fig. 3.7(b). The image recovered using ARKF (Fig.

3.7(c)) retains the tree leaves but is very noisy. However, as shown in Fig. 3.7(d),

the ISKF has filtered the noise well in homogeneous regions (such as sky, roof and

road) while simultaneously retaining image details like grass and leaves. Adaptive noise suppression with the proposed ISKF is clearly evident even in this example.

Fig. 3.8: Building. (a) Original. (b) Degraded image ($\sigma_v^2$ = 300). Image estimated by (c) AR-based KF (ISNR = 2.17 dB), and (d) ISKF (ISNR = 3.81 dB).

Fig. 3.9: Daisy. (a) Original image. (b) Degraded ($\sigma_v^2$ = 500). Image estimated using (c) AR-based KF (ISNR = 4.14 dB), and (d) ISKF (ISNR = 5.36 dB).

Fig. 3.8(a) shows a palace surrounded by a lake. Its degraded version is shown in

Fig. 3.8(b). The images estimated by ARKF and the ISKF are shown in Figs. 3.8(c)

and 3.8(d), respectively. Note the noise removal capability of the proposed DA prior

vis-a-vis the AR model. The sky is noise free, the edges on the building walls are sharp

and clear and the reflection of the building is clearly visible in the water in Fig. 3.8(d).

Fig. 3.9(a) shows an original ”daisy” image. The image after degradation by

additive white Gaussian noise of σ2v = 500 is shown in Fig. 3.9(b). Figs. 3.9(c) and

(d), show the images estimated by ARKF and the proposed ISKF, respectively. We

note that, unlike ARKF, the proposed approach suppresses the noise very well without any artifacts at the edges. The petals come out clean and sharp compared to ARKF. This is also reflected as a higher ISNR value.

Fig. 3.10: Performance comparison of ARKF and ISKF on different images for moderate noise of $\sigma_v^2$ = 300. [Plot: ISNR values (vertical axis) versus image index (horizontal axis), with one curve each for ISKF and ARKF.]

In order to provide a comprehensive evaluation of performance, we experimented

with a large database of 25 images containing faces, natural scenes, texture, and satel-

lite images. The performance improvement is quantified in terms of the mean value

of ISNR over 20 Monte Carlo trials and is plotted in Fig. 3.10. The plot clearly

demonstrates the superior quantitative performance of the proposed approach.⁴

Table 3.3: Comparison of per-pixel computational complexity: ARKF vs ISKF.

Operation   addition                            multiplication/division     exp/log
ARKF        3n_x^2(n_x − 1) + n_x(n_x − 1)      3n_x^3 + n_x^2 + 2n_x       0
ISKF        10L                                 10L                         2L

⁴ Except for some highly textured images (such as image 20 (Bark) and image 21 (Fabric)), where the AR model is able to capture the image characteristics quite well.

Fig. 3.11: Plane image. (a) Original. (b) Degraded [11]. Image estimated using (c) BL-RUBF [11] (ISNR = 1.01 dB), and (d) ISKF (ISNR = 1.67 dB).

We provide a computational complexity comparison of the proposed ISKF and ARKF. As our prediction stage works by drawing samples, it is computationally more intensive than ARKF. The approximate computation requirements per pixel are given in Table 3.3. Here nx is the dimension of the state in ARKF and L is the number of samples in the importance sampling step of ISKF. For the experiments conducted here (nx = 3 and L = 100), while the ARKF took 4 seconds, the proposed algorithm took about 13 seconds for a 200 × 200 image, on a Pentium-IV PC running Matlab⁵.

⁵ The improvement in performance of ISKF with more than 100 samples in IS step is only marginal.
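For concreteness, the per-pixel operation counts of Table 3.3 can be evaluated for the settings quoted above (nx = 3 and L = 100). The arithmetic below simply substitutes these values into the table entries.

```latex
% ARKF with n_x = 3
\text{additions: } 3 n_x^2 (n_x - 1) + n_x (n_x - 1) = 3 \cdot 9 \cdot 2 + 3 \cdot 2 = 60, \\
\text{multiplications/divisions: } 3 n_x^3 + n_x^2 + 2 n_x = 81 + 9 + 6 = 96, \qquad
\text{exp/log: } 0 .

% ISKF with L = 100
\text{additions: } 10 L = 1000, \qquad
\text{multiplications/divisions: } 10 L = 1000, \qquad
\text{exp/log: } 2 L = 200 .
```

These counts are approximate per-pixel requirements, as stated for Table 3.3; the measured run times quoted above (about 13 seconds versus 4 seconds) also reflect implementation overheads that the table does not capture.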

We also compared our approach with the method in [11] to further evaluate the

ISKF. Figs. 3.11(a), (b), and (c) show the original, degraded and the estimated images,

respectively, as reproduced from [11]. The recursive Bayesian filter in [11], with an AR

state model driven by a Bernoulli-Laplacian density, is shown to yield better performance

than the ARKF, for image estimation in AWGN. The result of our approach, shown

in Fig. 3.11 (d), when applied on the degraded image (Fig. 3.11 (b)), is closer to the

original than Fig. 3.11(c). Our approach yields cleaner uniform regions while at the

same time the letters on the plane emerge sharper.

3.7 DISCUSSION

We proposed a novel discontinuity-adaptive Kalman filter for image estimation and

provided a framework to handle the state conditional pdf directly in the prediction

step of the Kalman filter. Instead of using the state transition equation, we used

importance sampling to predict the mean and error covariance of the non-Gaussian

conditional prior. The estimated statistics are used in the update equations of the

Kalman filter to obtain an estimate of the a posteriori intensity. Experimental

results demonstrate the effectiveness of the proposed method.

The applicability of the proposed ISKF is limited to linear degradations. However,

we also encounter many real situations where the observed image is a non-linear or

multiplicative degradation of the original image. This situation requires the non-linear

counterpart of the Kalman filter, which is the subject of the following chapter.

CHAPTER 4

UNSCENTED FILTER FOR NON-LINEAR ESTIMATION

The objective of this chapter is to introduce a powerful framework for handling non-

linear or non-additive (image) degradations. An important observation by Julier et al.

[81] is that the optimal Kalman filter update, even in non-linear/non-Gaussian cases,

requires only the prediction of the first two moments (mean and covariance) of the

state and measurement as accurately as possible [121, 122]. This problem is a specific

case of the general problem of calculating the statistics of a random variable (RV)

which has undergone a nonlinear transformation. The most widely used approaches

to transform the means and covariances through nonlinear transformations are the

linearization method [123] and the Monte Carlo (MC) sampling method [79], which

are constrained by accuracy and computational efficiency, respectively.

In this chapter, we introduce the principle of unscented transformation (UT), pro-

posed by Julier et al. [81, 121], that transforms the mean and covariance through

nonlinear transformations using a very small carefully chosen deterministic set of sam-

ples (and deterministic weights). It leads to more accurate prediction (of moments)

than linearization [81, 122], as we will demonstrate, and provides an alternative to

the computationally intensive MC approaches in many applications [85,124]. We also

illustrate the superiority of the UT over linearization, and its proximity to the true

values or the MC estimates, on simple and common nonlinear transformation examples

from the literature [81, 121]. The UT when embedded in a Kalman filter formulation

is referred to as the unscented Kalman filter (UKF). The UKF provides a simple and

(more) accurate solution to the nonlinear estimation problem (than the commonly

employed Extended Kalman Filter) [81, 121].

Fig. 4.1: Principle of unscented transformation.

4.1 UNSCENTED TRANSFORMATION

The unscented transformation (UT) is a deterministic sampling approach for calcu-

lating the statistics of a random variable which undergoes a nonlinear transformation.

The UT is founded on the intuition that with a fixed number of parameters it is easier

to approximate a probability distribution than it is to approximate an arbitrary non-

linear function or transformation [81, 121]. Following this intuition, Julier et al. [81]

formulated a parameterization which captures the mean and covariance information

(of a RV) while at the same time permitting direct propagation of the (mean and

covariance) information through an arbitrary set of nonlinear equations. This is ac-

complished in UT by i) capturing the Gaussian approximation of the prior distribution

(first two moments of the prior) with a very small set of carefully chosen deterministic

samples known as sigma points, ii) propagating these samples through the nonlinear-

ity, and iii) determining the moments of the posterior from the transformed samples.

Fig. 4.1 illustrates the transformation process of sigma points. Here, we have shown 5

sigma points (deterministic samples, marked Xi) with the mean being the sigma point

at the center, while the others lie on an ellipse (determined by the (scaled) square root

of their covariance matrix [125]). Each sigma point Xi is independently transformed

through a nonlinear function g (vector-to-vector transformation) as depicted in the

figure. The transformed sigma points Yi = g(Xi), i = 0, ..., 4, shown on the right,

predict the mean and covariance of the posterior random variable.

Consider propagating a random variable $x$ through a nonlinear function $g : \mathbb{R}^{n_x} \rightarrow \mathbb{R}^{n_y}$ to generate $y = g(x)$. Assume $x$ has mean $\bar{x}$ and covariance $P_x$. To calculate the statistics (first two moments) of $y$ using the scaled unscented transformation (SUT), we proceed as follows: First, a set of $2n_x + 1$ weighted samples or sigma points is deterministically chosen so that they exactly capture the true mean and covariance of the prior random variable $x$. A selection scheme that satisfies this requirement [83,126] is given by

\[
\begin{aligned}
\mathcal{X}_0 &= \bar{x}, &
w_0^{(\mu)} &= \frac{\lambda}{n_x + \lambda} \\
\mathcal{X}_i &= \bar{x} + \left(\sqrt{(n_x + \lambda) P_x}\right)_i, \quad i = 1, \ldots, n_x, &
w_0^{(c)} &= w_0^{(\mu)} + (1 - \alpha_{UT}^2 + \beta_{UT}) \\
\mathcal{X}_i &= \bar{x} - \left(\sqrt{(n_x + \lambda) P_x}\right)_{i - n_x}, \quad i = n_x + 1, \ldots, 2n_x, &
\lambda &= \alpha_{UT}^2 (n_x + \kappa) - n_x \\
w_i^{(\mu)} &= w_i^{(c)} = \frac{1}{2(n_x + \lambda)}, \quad i = 1, \ldots, 2n_x
\end{aligned}
\tag{4.1}
\]

Parameter $\alpha_{UT}$ controls the spread of the sigma point distribution around $\bar{x}$ and is usually set to a value between 0 and 1. The term $\beta_{UT}$ is used to incorporate prior knowledge about $x$. Note that $\left(\sqrt{(n_x + \lambda) P_x}\right)_i$ represents the ith column of the $n_x \times n_x$ matrix square root¹ of $(n_x + \lambda) P_x$. The addition and subtraction of these vectors from the mean value of the prior results in $2n_x$ symmetric sigma points. Along with the prior mean $\mathcal{X}_0$, we have $2n_x + 1$ sigma points². While $\mathcal{X}_i$ refers to the ith sigma point, the superscripts µ and c on the weights refer to their use in mean and covariance calculations, respectively. The weight $w_i^{(\mu)}$ associated with the ith point satisfies $\sum_{i=0}^{2n_x} w_i^{(\mu)} = 1$. Simple matrix calculations confirm that the first two moments of these sigma points are exactly the same as that of the prior [82]. We present explicit expressions in section 4.2.

¹ If the matrix square root A of matrix Px is of the form $P = A^T A$, then the sigma points are formed from the rows of A. However, if $P = A A^T$, then the columns of A are used [81].

² We note that the number of sigma points is directly related to the dimension of the prior RV.

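Each sigma point is instantiated (independently passed) through the nonlinearity to get $\mathcal{Y}_i = g(\mathcal{X}_i)$, $i = 0, 1, \ldots, 2n_x$. The mean and covariance of $y$ are estimated as

\[
\bar{y} = \sum_{i=0}^{2n_x} w_i^{(\mu)}\, \mathcal{Y}_i \tag{4.2}
\]

and

\[
P_y = \sum_{i=0}^{2n_x} w_i^{(c)}\, (\mathcal{Y}_i - \bar{y})(\mathcal{Y}_i - \bar{y})^T \tag{4.3}
\]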

Positive semi-definiteness of the predicted covariance matrix Py is guaranteed by choos-

ing κ ≥ 0 [81,83]. These estimates of the moments of y are accurate to the second-order

(and third-order for symmetric priors) of the Taylor series expansion for any nonlinear

function g(x), since the prior sigma points completely capture the prior distribution

up to the second moment [81, 82]. Valuable insights into the UT can also be gained

by relating it to a numerical technique called Gaussian quadrature of integrals [127].

A close similarity also exists between the UT and the central difference interpolation

filtering (CDF) [128]. The principle of UT has also been used to improve the ’Expec-

tation step’ in the EM algorithm in the context of training neural networks [129].
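The sigma point selection of Eq. 4.1 and the moment estimates of Eqs. 4.2 and 4.3 can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than a reference implementation: the matrix square root is taken as the Cholesky factor A with P = AAᵀ (so columns are used), the default parameter values are arbitrary, and the polar-to-Cartesian map in the usage example is just one illustrative nonlinearity. The function name `unscented_transform` is hypothetical.

```python
import numpy as np


def unscented_transform(g, mean, cov, alpha=1.0, beta=0.0, kappa=1.0):
    """Propagate (mean, cov) through g using the scaled UT of Eqs. 4.1 to 4.3.

    Requires n + lambda > 0 and a positive definite cov (for the Cholesky factor).
    """
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    n = mean.size
    lam = alpha ** 2 * (n + kappa) - n

    # Columns of the Cholesky factor of (n + lam) * cov give the sigma point offsets.
    root = np.linalg.cholesky((n + lam) * cov)
    sigma_points = np.vstack([mean, mean + root.T, mean - root.T])  # one point per row

    # Weights for the mean and covariance calculations (Eq. 4.1).
    w_mean = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
    w_cov = w_mean.copy()
    w_mean[0] = lam / (n + lam)
    w_cov[0] = w_mean[0] + (1.0 - alpha ** 2 + beta)

    # Pass each sigma point through the nonlinearity independently.
    transformed = np.array([np.atleast_1d(g(x)) for x in sigma_points])

    y_mean = w_mean @ transformed                     # Eq. 4.2
    deviations = transformed - y_mean
    y_cov = (w_cov[:, None] * deviations).T @ deviations  # Eq. 4.3
    return y_mean, y_cov


def polar_to_cartesian(x):
    """Illustrative nonlinearity: (range, bearing) to (x, y)."""
    r, theta = x
    return np.array([r * np.cos(theta), r * np.sin(theta)])


y_mean, y_cov = unscented_transform(polar_to_cartesian,
                                    mean=[1.0, np.pi / 4.0],
                                    cov=np.diag([0.02 ** 2, 0.3 ** 2]))
print(y_mean)
print(y_cov)
```

For a linear map g(x) = Ax + b, this procedure returns the exact mean and covariance, which is a convenient sanity check on any implementation.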

A block diagram (inspired by [112]) illustrating the steps for performing the UT

is shown in Fig. 4.2 for a prior RV of dimension 5. Its mean and covariance are

represented on the left by a vertical column and a square, respectively. To determine

the sigma points, the covariance matrix is passed through the matrix square root and

then multiplied by the factor $\lambda_c = \sqrt{(5 + \lambda)}$. The resulting 5 × 5 matrix is split

into 10 columns with positive and negative sign before each of its 5 columns. They are

augmented with a zero vector in the first column and shown as a 11 column rectangular

matrix. On the top, the mean is replicated as 11 columns and is added to the bottom

covariance components to obtain the sigma points as shown in the bottom-most (5×11)

rectangular matrix X. The ith column represents the ith sigma point Xi.

Each of these 11 prior sigma points are transformed through the nonlinear function

g to obtain the posterior sigma points Yi, each with dimension ny. We note that the ith posterior sigma point Yi is completely determined only by the ith prior sigma point Xi for i = 1, ..., 11, and not by the others. Then, using the UT deterministic weights given in Eq. 4.1, we predict the posterior mean and covariance as the weighted sample mean and covariance of transformed samples Yi, i = 1, .., 11. We analyse the accuracy of the predicted mean and covariances in the next section.

Fig. 4.2: Block diagram of UT.

We note that although this method bears a superficial resemblance to Monte Carlo

sampling methods (as employed in particle filters) there are several fundamental dif-

ferences. First, the sigma points are not drawn at random; they are deterministically

chosen so that they exhibit certain specific properties (e.g., have a given mean and

covariance). As a result, higher-order information about the distribution can be cap-

tured with a fixed and small number of points. In contrast, MC sampling methods

require orders of magnitude more random sample points in an attempt to propagate

an accurate (possibly non-Gaussian) distribution of the state. The second difference is

that sigma points can be weighted in ways that are inconsistent with the distribution

interpretation of sample points in a particle filter. For example, the weights on the

points do not have to lie in the range [0,1].
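A small worked example makes the last point concrete. The parameter values below ($n_x = 2$, $\alpha_{UT} = 0.5$, $\kappa = 0$) are chosen here purely for illustration and are substituted into the weight expressions of Eq. 4.1.

```latex
\lambda = \alpha_{UT}^2 (n_x + \kappa) - n_x = 0.25 \cdot 2 - 2 = -1.5,
\qquad n_x + \lambda = 0.5,

w_0^{(\mu)} = \frac{\lambda}{n_x + \lambda} = \frac{-1.5}{0.5} = -3,
\qquad
w_i^{(\mu)} = \frac{1}{2 (n_x + \lambda)} = 1, \quad i = 1, \ldots, 4,

\sum_{i=0}^{4} w_i^{(\mu)} = -3 + 4 \cdot 1 = 1 .
```

The zeroth weight is negative, yet the weights still sum to one, so they cannot be read as probabilities attached to the sigma points.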

4.2 UT ANALYSIS

We first prove that with the sigma point selection scheme of Eq. 4.1, the prior sigma

points accurately capture the first two moments of the prior. For purpose of analysis,

we assign values βUT = 0, αUT = 1 and κ ≥ 0 in the zeroth point covariance weight

calculation. To relax this condition, refer to [126]. The prior sigma points are given by

\[
\mathcal{X}_i = \bar{x} \pm \left(\sqrt{(n_x + \lambda)}\,\sigma_i\right) = \bar{x} \pm \tilde{\sigma}_i
\]

where $\sigma_i$ denotes the ith column of the matrix square root (A) of the true prior covariance $P\,(= AA^T)$. This implies that $\sum_{i=1}^{n_x} \sigma_i \sigma_i^T = P$.

Since the points (leaving the zeroth sigma point) are symmetrically distributed and

chosen with equal weights about the mean of the prior x, the sample mean x (given

by Eq. 4.2) is exactly the same as the prior mean and all odd-ordered moments are

zero. The UT covariance Px (Eq. 4.3) is

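\[
\begin{aligned}
P_x &= \sum_{i=0}^{2n_x} w_i^{(c)}\, (\mathcal{X}_i - \bar{x})(\mathcal{X}_i - \bar{x})^T
= \sum_{i=1}^{2n_x} \frac{1}{2(n_x + \lambda)} \left(\sqrt{(n_x + \lambda)}\,\sigma_i\right)\left(\sqrt{(n_x + \lambda)}\,\sigma_i\right)^T \\
&= \frac{1}{2} \sum_{i=1}^{2n_x} \sigma_i \sigma_i^T = P
\end{aligned}
\tag{4.4}
\]

This follows by noting that $\tilde{\sigma}_i = \sqrt{(n_x + \lambda)}\,\sigma_i = \mathcal{X}_i - \bar{x} = \bar{x} - \mathcal{X}_{i+n_x}$, i.e., $\sigma_{i+n_x} = -\sigma_i$, for $i = 1, \ldots, n_x$. This proves that, with the UT sigma point selection scheme, $P_x$ exactly captures the true prior covariance P.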
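The scalar case gives a quick check of this result. The example below assumes $n_x = 1$, prior variance $P = \sigma^2$, and the analysis settings $\alpha_{UT} = 1$, $\beta_{UT} = 0$, so that $\lambda = \kappa$; the symbol $\sigma$ here denotes the prior standard deviation and is introduced only for this illustration.

```latex
\mathcal{X}_0 = \bar{x}, \qquad
\mathcal{X}_{1,2} = \bar{x} \pm \sqrt{1 + \kappa}\,\sigma, \qquad
w_0 = \frac{\kappa}{1 + \kappa}, \qquad
w_1 = w_2 = \frac{1}{2 (1 + \kappa)},

\sum_{i=0}^{2} w_i\, \mathcal{X}_i
  = \frac{\kappa}{1 + \kappa}\,\bar{x} + \frac{2 \bar{x}}{2 (1 + \kappa)} = \bar{x},

\sum_{i=0}^{2} w_i\, (\mathcal{X}_i - \bar{x})^2
  = 0 + 2 \cdot \frac{1}{2 (1 + \kappa)} \cdot (1 + \kappa)\,\sigma^2 = \sigma^2 .
```

The spread of the sigma points grows with $\kappa$ while their weights shrink by the same factor, which is why the captured variance does not depend on $\kappa$.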

In many applications, we need to accurately calculate the expected mean and

covariance of a random variable that undergoes a nonlinear transformation [121,122].

Let the random variable x undergo an arbitrary nonlinear transformation, written as

\[
y = g(x) = g(\bar{x} + \delta x) \tag{4.5}
\]

where δx is a zero-mean random variable with the same covariance Px as x. The

problem is to determine the mean and covariance of y.

In order to analytically calculate these quantities, we first expand g(.) using a

multi-dimensional Taylor series expansion around x. We show how UT achieves

second-order (or third-order for symmetric priors) accuracy of the Taylor series ex-

pansion in the prediction of the moments of y for any nonlinear function g(x). We

follow similar steps as in [112, 122] for analysing the accuracy of the mean while we

follow [81] for analysing the covariance with Taylor series.

To express the nonlinear function in a multi-dimensional Taylor series, we assume

that all nonlinear transformations are analytic across the domain of all possible values

of x. As the number of terms tends to infinity, the residual in the series tends to zero

and so the series always converges to the true value of the function [81]. We note that

these assumptions are more restrictive than those required for both linearization and

UT. To apply linearization, the function must necessarily be differentiable to form the

Jacobian matrix. The UT does not place this restriction.

For the prior variable x, perturbed about a mean value x by a zero mean distur-

bance δx with covariance Px, the Taylor series expansion of the nonlinear transforma-

tion y = g(x) about x is [112]

\[
y = g(x) = g(\bar{x}) + D_{\delta x}\, g + \frac{1}{2} D_{\delta x}^2\, g + \frac{1}{3!} D_{\delta x}^3\, g + \frac{1}{4!} D_{\delta x}^4\, g + \ldots \tag{4.6}
\]

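where $\delta x = [\delta x_1, \delta x_2, \ldots, \delta x_{n_x}]^T$ and the operator $D_{\delta x}$ evaluates the total differential of g(.) when perturbed around a nominal value $\bar{x}$ by $\delta x$, i.e.,

\[
D_{\delta x} = \delta x^T \Delta_x = \sum_{j=1}^{n_x} \delta x_j \frac{\partial}{\partial x_j} \tag{4.7}
\]

which acts on g(.) on a component-by-component basis. Here $\Delta_x = \left[\dfrac{\partial}{\partial x_1}, \dfrac{\partial}{\partial x_2}, \ldots, \dfrac{\partial}{\partial x_{n_x}}\right]^T$.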

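Using this definition, the $k$th term in the Taylor series can be written as

$$ \frac{1}{k!} D^k_{\delta x} g = \frac{1}{k!} \left[ (\delta x^T \Delta_x)^k g(x) \right]_{x=\bar{x}} = \frac{1}{k!} \left[ \sum_{j=1}^{n_x} \delta x_j \frac{\partial}{\partial x_j} \right]^k g(x) \Big|_{x=\bar{x}} \tag{4.8} $$

which is composed of the $k$th order derivatives of $g(\cdot)$ and the $k$th order powers of (the components of) $\delta x$.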

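For example, for a 3D vector $x = [x_1, x_2, x_3]^T$ and scalar $y$, in the second-order term of the series for $y = g(x)$, we can expand

$$ D^2_{\delta x} g = \delta x_1^2 \frac{\partial^2 g}{\partial x_1^2} + \delta x_2^2 \frac{\partial^2 g}{\partial x_2^2} + \delta x_3^2 \frac{\partial^2 g}{\partial x_3^2} + 2\,\delta x_1 \delta x_2 \frac{\partial^2 g}{\partial x_1 \partial x_2} + 2\,\delta x_1 \delta x_3 \frac{\partial^2 g}{\partial x_1 \partial x_3} + 2\,\delta x_2 \delta x_3 \frac{\partial^2 g}{\partial x_2 \partial x_3}. $$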

4.2.1 Accuracy of the Mean

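The true mean of $y$ is given by taking the expectation on both sides of the Taylor series, i.e.,

$$ \bar{y} = E[y] = E[g(x)] = E\left[ g(\bar{x}) + D_{\delta x} g + \frac{1}{2} D^2_{\delta x} g + \frac{1}{3!} D^3_{\delta x} g + \frac{1}{4!} D^4_{\delta x} g + \dots \right] \tag{4.9} $$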

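If we assume that $x$ is a symmetrically distributed random variable, then all odd moments will be zero. Also, we note that $D^2_{\delta x} g = D_{\delta x}(D_{\delta x} g) = (\Delta_x^T \delta x\, \delta x^T \Delta_x) g$ and $E[\delta x\, \delta x^T] = P_x$. Using these, the mean can be reduced to

$$ \bar{y} = g(\bar{x}) + \frac{1}{2} \left[ (\Delta_x^T P_x \Delta_x) g(x) \right]_{x=\bar{x}} + E\left[ \frac{1}{4!} D^4_{\delta x} g + \frac{1}{6!} D^6_{\delta x} g + \dots \right]. \tag{4.10} $$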

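The UT calculates the posterior mean from the propagated sigma points using Eq. 4.1. We can write the propagation of each sigma point through the nonlinear function as a Taylor series expansion about $\bar{x}$:

$$ \mathcal{Y}_i = g(\mathcal{X}_i) = g(\bar{x}) + D_{\tilde{\sigma}_i} g + \frac{1}{2} D^2_{\tilde{\sigma}_i} g + \frac{1}{3!} D^3_{\tilde{\sigma}_i} g + \frac{1}{4!} D^4_{\tilde{\sigma}_i} g + \dots \tag{4.11} $$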

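The perturbation in sigma points $\mathcal{X}_i$ about the mean value $\bar{x}$ is quantified in terms of the corresponding covariance vectors (columns) $\sigma_i$. Using these sigma points (Eq. 4.11) and the deterministic UT weights as in Eq. 4.2, the UT predicted mean is

$$ \begin{aligned} \bar{y}_{UT} &= \sum_{i=0}^{2n_x} w_i^{m}\, \mathcal{Y}_i \\ &= \frac{\lambda}{n_x + \lambda}\, g(\bar{x}) + \frac{1}{2(n_x + \lambda)} \sum_{i=1}^{2n_x} \left[ g(\bar{x}) + D_{\tilde{\sigma}_i} g + \frac{1}{2} D^2_{\tilde{\sigma}_i} g + \frac{1}{3!} D^3_{\tilde{\sigma}_i} g + \frac{1}{4!} D^4_{\tilde{\sigma}_i} g + \dots \right] \\ &= g(\bar{x}) + \frac{1}{2(n_x + \lambda)} \sum_{i=1}^{2n_x} \left[ D_{\tilde{\sigma}_i} g + \frac{1}{2} D^2_{\tilde{\sigma}_i} g + \frac{1}{3!} D^3_{\tilde{\sigma}_i} g + \frac{1}{4!} D^4_{\tilde{\sigma}_i} g + \dots \right] \end{aligned} $$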

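Since the sigma points are symmetrically distributed around $\bar{x}$, all the odd-order terms add up to zero. This results in the simplification

$$ \bar{y}_{UT} = g(\bar{x}) + \frac{1}{2(n_x + \lambda)} \sum_{i=1}^{2n_x} \left[ \frac{1}{2} D^2_{\tilde{\sigma}_i} g + \frac{1}{4!} D^4_{\tilde{\sigma}_i} g + \dots \right] \tag{4.12} $$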

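By rewriting the operator

$$ D^2_{\delta x} g = \left[ (\delta x^T \Delta_x)^2 g(x) \right]_{x=\bar{x}} = (\Delta_x g)^T \delta x\, \delta x^T (\Delta_x g) $$

we have

$$ \begin{aligned} \frac{1}{2(n_x + \lambda)} \sum_{i=1}^{2n_x} \frac{1}{2} D^2_{\tilde{\sigma}_i} g &= \frac{1}{2(n_x + \lambda)} (\Delta_x g)^T\, \frac{1}{2} \left[ \sum_{i=1}^{2n_x} \sqrt{n_x + \lambda}\, (\sigma_i \sigma_i^T)\, \sqrt{n_x + \lambda} \right] (\Delta_x g) \\ &= \frac{n_x + \lambda}{2(n_x + \lambda)} (\Delta_x g)^T\, \frac{1}{2} \left[ \sum_{i=1}^{2n_x} \sigma_i \sigma_i^T \right] (\Delta_x g) \\ &= \frac{1}{2} \left[ (\Delta_x^T P_x \Delta_x) g(x) \right]_{x=\bar{x}}. \end{aligned} \tag{4.13} $$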

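The UT predicted mean can be further simplified to

$$ \bar{y}_{UT} = g(\bar{x}) + \frac{1}{2} \left[ (\Delta_x^T P_x \Delta_x) g(x) \right]_{x=\bar{x}} + \frac{1}{2(n_x + \lambda)} \sum_{i=1}^{2n_x} \left[ \frac{1}{4!} D^4_{\tilde{\sigma}_i} g + \frac{1}{6!} D^6_{\tilde{\sigma}_i} g + \dots \right]. \tag{4.14} $$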

Comparing Eqs. (4.10) and (4.14), we note that the true posterior mean and the mean

calculated by UT agree exactly up to third order (assuming a symmetric prior) and that

the errors are only introduced in the fourth and higher-order terms. The magnitude

of these errors depends on the choice of the composite scaling parameter λ as well as

the higher-order derivatives of the nonlinear transformation g.

In contrast, the linearization approach calculates the posterior mean as

$$ \bar{y}_{LIN} = g(\bar{x}) \tag{4.15} $$

which only agrees with the true mean up to the first-order. Julier and Uhlmann [81]

show that, on a term-by-term basis, the errors in the higher-order terms of the UT are

consistently smaller than those for linearization.
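The agreement of Eqs. (4.10) and (4.14) up to the second-order term can be checked numerically with a quadratic function, for which all fourth and higher-order terms vanish. The following is a minimal sketch, not the thesis implementation: it assumes the basic (unscaled) UT weights $w_0 = \lambda/(n_x+\lambda)$ and $w_i = 1/(2(n_x+\lambda))$, a Cholesky factor as the matrix square root, and a hypothetical test function `g` with illustrative values for `A`, `b`, `x_mean` and `P`.

```python
import numpy as np

def sigma_points(x_mean, P, lam):
    """Return the 2n+1 sigma points (as columns) and the basic UT weights."""
    n = x_mean.size
    S = np.linalg.cholesky((n + lam) * P)      # columns are the scaled sigma vectors
    X = np.column_stack([x_mean, x_mean[:, None] + S, x_mean[:, None] - S])
    w = np.full(2 * n + 1, 0.5 / (n + lam))
    w[0] = lam / (n + lam)
    return X, w

def ut_mean(g, x_mean, P, lam):
    """UT prediction of E[g(x)] for a scalar-valued g."""
    X, w = sigma_points(x_mean, P, lam)
    return sum(w[i] * g(X[:, i]) for i in range(X.shape[1]))

# Hypothetical quadratic test function g(x) = x^T A x + b^T x (illustrative values)
A = np.array([[2.0, 0.3], [0.3, 1.0]])
b = np.array([1.0, -2.0])
g = lambda x: x @ A @ x + b @ x

x_mean = np.array([1.0, 0.5])
P = np.array([[0.4, 0.1], [0.1, 0.2]])

true_mean = g(x_mean) + np.trace(A @ P)   # exact mean of a quadratic form
lin_mean = g(x_mean)                      # linearization, Eq. 4.15
ut_means = [ut_mean(g, x_mean, P, lam) for lam in (0.5, 1.0, 2.0)]
print(true_mean, lin_mean, ut_means)
```

For this quadratic case the UT mean should coincide with the true mean for every value of $\lambda$ tried, while the linearized mean misses the term $\mathrm{tr}(A P_x)$ entirely.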

4.2.2 Accuracy of the Covariance

The true posterior covariance is given by

$$ P_y = E\left[ (y - \bar{y})(y - \bar{y})^T \right] \tag{4.16} $$

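where the expectation is taken over the distribution of $y$. Substituting Eqs. (4.6) and (4.9), the realization of the (posterior) state error is

$$ \begin{aligned} y - \bar{y} = g[\bar{x} + \delta x] - \bar{y} = {} & D_{\delta x} g + \frac{1}{2} D^2_{\delta x} g + \frac{1}{3!} D^3_{\delta x} g + \frac{1}{4!} D^4_{\delta x} g + \dots \\ & - E\left[ D_{\delta x} g + \frac{1}{2} D^2_{\delta x} g + \frac{1}{3!} D^3_{\delta x} g + \frac{1}{4!} D^4_{\delta x} g + \dots \right] \end{aligned} \tag{4.17} $$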

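Recalling the symmetry of $\delta x$, the expected values of all odd-order terms in $\delta x$ evaluate to zero. Using $y - \bar{y}$ in Eq. 4.16, the true posterior covariance is

$$ \begin{aligned} P_y = {} & E\left[ D_{\delta x} g\, (D_{\delta x} g)^T + \frac{1}{3!} D_{\delta x} g\, (D^3_{\delta x} g)^T + \frac{1}{2 \times 2!} D^2_{\delta x} g\, (D^2_{\delta x} g)^T + \frac{1}{3!} D^3_{\delta x} g\, (D_{\delta x} g)^T \right] \\ & - E\left[ \frac{1}{2!} D^2_{\delta x} g \right] E\left[ \frac{1}{2!} D^2_{\delta x} g \right]^T + \dots \end{aligned} \tag{4.18} $$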

We note that

$$ D_{\delta x} g = \left[ (\delta x^T \Delta_x) g(x) \right]_{x=\bar{x}} = A_g\, \delta x \tag{4.19} $$

where $A_g = (\Delta_x^T g)\big|_{x=\bar{x}}$ is the Jacobian matrix of $g(x)$ evaluated at $\bar{x}$. Substituting the expectation over the second-order term given in Eq. 4.13, we can rewrite Eq. 4.18 as

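$$ \begin{aligned} P_y = {} & A_g P_x A_g^T + E\left[ \frac{1}{3!} D_{\delta x} g\, (D^3_{\delta x} g)^T + \frac{1}{2 \times 2!} D^2_{\delta x} g\, (D^2_{\delta x} g)^T + \frac{1}{3!} D^3_{\delta x} g\, (D_{\delta x} g)^T \right] \\ & - \left[ \left( \frac{1}{2!} (\Delta_x^T P_x \Delta_x) \right) g \right] \left[ \left( \frac{1}{2!} (\Delta_x^T P_x \Delta_x) \right) g \right]^T \end{aligned} \tag{4.20} $$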

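The UT predicts the covariance using Eq. 4.3, which requires the values of $\mathcal{Y}_i - \bar{y}_{UT}$ and $\mathcal{Y}_0 - \bar{y}_{UT}$. Using Eqs. (4.11) and (4.12), these values are given by

$$ \begin{aligned} \mathcal{Y}_i - \bar{y}_{UT} = {} & D_{\tilde{\sigma}_i} g + \frac{1}{2} D^2_{\tilde{\sigma}_i} g + \frac{1}{3!} D^3_{\tilde{\sigma}_i} g + \frac{1}{4!} D^4_{\tilde{\sigma}_i} g + \dots \\ & - \frac{1}{2(n_x + \lambda)} \sum_{j=1}^{2n_x} \left[ D_{\tilde{\sigma}_j} g + \frac{1}{2} D^2_{\tilde{\sigma}_j} g + \frac{1}{3!} D^3_{\tilde{\sigma}_j} g + \frac{1}{4!} D^4_{\tilde{\sigma}_j} g + \dots \right] \end{aligned} \tag{4.21} $$

$$ \mathcal{Y}_0 - \bar{y}_{UT} = - \frac{1}{2(n_x + \lambda)} \sum_{i=1}^{2n_x} \left[ D_{\tilde{\sigma}_i} g + \frac{1}{2} D^2_{\tilde{\sigma}_i} g + \frac{1}{3!} D^3_{\tilde{\sigma}_i} g + \frac{1}{4!} D^4_{\tilde{\sigma}_i} g + \dots \right] \tag{4.22} $$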

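Noting that (by using Eq. 4.19)

$$ \frac{1}{2(n_x + \lambda)} \sum_{i=1}^{2n_x} D_{\tilde{\sigma}_i} g\, (D_{\tilde{\sigma}_i} g)^T = \frac{1}{2(n_x + \lambda)} \sum_{i=1}^{2n_x} A_g\, \tilde{\sigma}_i \tilde{\sigma}_i^T A_g^T = A_g P_x A_g^T, \tag{4.23} $$

and that all odd-order terms sum to zero owing to the symmetry of the sigma points, the UT predicted covariance $(P_y)_{UT} = \sum_{i=0}^{2n_x} w_i^{(c)} (g(\mathcal{X}_i) - \bar{y})(g(\mathcal{X}_i) - \bar{y})^T$ can be expanded using Taylor series and algebraically simplified [81] to

$$ \begin{aligned} (P_y)_{UT} = {} & A_g P_x A_g^T \\ & + \frac{1}{2(n_x + \lambda)} \sum_{i=1}^{2n_x} \left[ \frac{1}{3!} D_{\tilde{\sigma}_i} g\, (D^3_{\tilde{\sigma}_i} g)^T + \frac{1}{2 \times 2!} D^2_{\tilde{\sigma}_i} g\, (D^2_{\tilde{\sigma}_i} g)^T + \frac{1}{3!} D^3_{\tilde{\sigma}_i} g\, (D_{\tilde{\sigma}_i} g)^T \right] \\ & - \left[ \left( \frac{1}{2!} (\Delta_x^T P_x \Delta_x) \right) g \right] \left[ \left( \frac{1}{2!} (\Delta_x^T P_x \Delta_x) \right) g \right]^T \end{aligned} \tag{4.24} $$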

Comparing Eqs. 4.20 and 4.24, we note that the UT calculates the posterior covariance

accurately to two terms in the Taylor series, with errors introduced only in the fourth

and higher-order moments. However, if the prior is non-symmetric, the UT mean

and covariance still agree exactly up to the second-order terms as described above

but errors are introduced from the third and higher-order terms in the Taylor series

expansion [81]. We consider only symmetric priors in our application.

The linearization algorithm predicts the covariance using

$$ (P_y)_{LIN} = A_g P_x A_g^T \tag{4.25} $$

which is the true series truncated after the first term. Julier et al. [81] show that

although both approaches predict the covariance correctly up to the second order, the

absolute errors in the fourth and higher-order terms for UT are smaller.

4.3 ILLUSTRATION OF UT

In this section, we illustrate the application of the unscented transformation. The aim is to calculate the first two moments of the posterior (from the transformed sigma points), to compare them with the true analytical results when these are available, or otherwise with Monte Carlo simulations, and also to compare them with linearization.

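Example 1: Consider propagating a scalar random variable $x$ through the simple nonlinear transformation $y = g(x) = x^2$, where $x$ is normally distributed as $N(\bar{x}, \sigma_x^2)$. For a random perturbation $\delta x$, we have $x = \bar{x} + \delta x$.

• We have the analytical mean of $y$ as $\bar{y} = E[x^2] = \bar{x}^2 + \sigma_x^2$.

• The squared error for the realization is

$$ (y - \bar{y})^2 = \left( (\bar{x} + \delta x)^2 - (\bar{x}^2 + \sigma_x^2) \right)^2 = (\delta x)^4 + 4\bar{x}(\delta x)^3 + (4\bar{x}^2 - 2\sigma_x^2)(\delta x)^2 - 4\sigma_x^2 \bar{x}\, \delta x + \sigma_x^4 $$

• Taking expectation gives the true covariance as $\sigma_y^2 = E[(\delta x)^4] + 4\bar{x}^2\sigma_x^2 - \sigma_x^4$, where the first term is the kurtosis (fourth central moment). From the moment generating function, it can be shown to be $3\sigma_x^4$. Therefore, the true covariance is $\sigma_y^2 = 2\sigma_x^4 + 4\bar{x}^2\sigma_x^2$.

• The linearized mean is $\bar{y}_{LIN} = g(\bar{x}) = \bar{x}^2$.

• The linearization algorithm predicts the covariance (from Eq. 4.25) as $(\sigma_y^2)_{LIN} = 4\bar{x}^2\sigma_x^2$.

• For the unscented transform, we have:

– Prior sigma points: $\mathcal{X} = \{\bar{x},\ \bar{x} - \sigma,\ \bar{x} + \sigma\}$ with $\sigma^2 = (n_x + \lambda)\sigma_x^2$

– Weights: $w_0 = \lambda/(n_x + \lambda)$ and $w_1 = w_2 = 1/(2(n_x + \lambda))$

– Transformed sigma points $\mathcal{Y}_i = \mathcal{X}_i^2$ yield $\mathcal{Y} = \{\bar{x}^2,\ \bar{x}^2 + \sigma^2 - 2\bar{x}\sigma,\ \bar{x}^2 + \sigma^2 + 2\bar{x}\sigma\}$

– The UT mean is

$$ \bar{y}_{UT} = w_0 \mathcal{Y}_0 + w_1 \mathcal{Y}_1 + w_2 \mathcal{Y}_2 = \frac{\lambda}{n_x + \lambda}\, \bar{x}^2 + \frac{1}{2(n_x + \lambda)}\, 2(\bar{x}^2 + \sigma^2) = \bar{x}^2 + \sigma_x^2 $$

– The UT covariance (using Eq. 4.3) is

$$ (\sigma_y^2)_{UT} = \frac{\lambda}{1 + \lambda} (\mathcal{Y}_0 - \bar{y})^2 + \frac{1}{2(1 + \lambda)} \sum_{i=1}^{2n_x} (\mathcal{Y}_i - \bar{y})^2 = \lambda\sigma_x^4 + 4\bar{x}^2\sigma_x^2. $$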

When the linearized mean and covariance are compared with that of the true system,

it can be seen that the linearization assumption eliminates significant terms in mean

and covariance. This leads to a biased mean and under-prediction of the covariance.

As can be seen, the UT-predicted mean is independent of λ (the scaling parameter

of UT) and is exactly the same as the true mean. Since the UT introduces errors

both in mean and covariance only from the fourth-order term, the error in prediction

is affected by λ only from that term onwards. To evaluate the covariance predicted by the UT, we must specify the value of λ. The prediction agrees with the true covariance if we choose λ = 2 (see footnote 3). For this value, both results (mean and covariance) are exactly the same as the true values.
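The closed-form results of Example 1 can be reproduced with a few lines of code. This is a minimal sketch; the function name `ut_square` and the numerical values of $\bar{x}$ and $\sigma_x^2$ are illustrative assumptions, not part of the original example.

```python
import numpy as np

def ut_square(x_mean, var_x, lam, n_x=1):
    """UT mean and variance of y = x^2 for a scalar x (three sigma points)."""
    sigma = np.sqrt((n_x + lam) * var_x)
    X = np.array([x_mean, x_mean - sigma, x_mean + sigma])     # prior sigma points
    w = np.array([lam, 0.5, 0.5]) / (n_x + lam)                # w0, w1, w2
    Y = X ** 2                                                 # transformed sigma points
    y_mean = np.sum(w * Y)
    y_var = np.sum(w * (Y - y_mean) ** 2)
    return y_mean, y_var

x_mean, var_x = 3.0, 0.5                       # illustrative values
true_mean = x_mean**2 + var_x                  # x^2 + sigma_x^2
true_var = 2 * var_x**2 + 4 * x_mean**2 * var_x
lin_mean, lin_var = x_mean**2, 4 * x_mean**2 * var_x

ut_mean_2, ut_var_2 = ut_square(x_mean, var_x, lam=2.0)
ut_mean_1, ut_var_1 = ut_square(x_mean, var_x, lam=1.0)
print(true_mean, true_var, ut_mean_2, ut_var_2, lin_mean, lin_var)
```

With $\lambda = 2$ both UT moments should match the analytical values, whereas with $\lambda = 1$ only the mean matches and the variance follows $\lambda\sigma_x^4 + 4\bar{x}^2\sigma_x^2$.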

Example 2: Polar-to-Cartesian transformation. We now consider a common nonlinear transformation to illustrate the effectiveness of UT [121]. Let range $r$ and angle $\theta$ of the polar coordinates be independent and normally distributed random variables, i.e., $r \sim N(1, 0.02^2)$ and $\theta \sim N(\pi/2, (\pi/12)^2)$. The transformation equations that relate the polar coordinates to the Cartesian coordinates are $x = r\cos(\theta)$ and $y = r\sin(\theta)$.

Since we have two prior random variables ($r$ and $\theta$), the prior augmented mean is $\bar{\Theta} = [1 \;\; \pi/2]$ and the augmented covariance matrix is $P_\Theta = \begin{bmatrix} 0.02^2 & 0 \\ 0 & (\pi/12)^2 \end{bmatrix}$.

In this example, since the true posterior mean and covariances cannot be evaluated

analytically, we compare the performance of UT and linearization with respect to a

Monte Carlo sampling-based approach. We generate a large number ($10^5$) of samples

with the given prior distributions for range and angle, and propagate them through

the transformations to arrive at the posterior samples (Fig. 4.3 (a)). MC mean and

variance can be calculated as the (posterior) sample mean and variance, respectively.

For applying UT, the sigma point selection scheme Eq. (4.1) is employed on the

augmented prior mean and covariances. Since nΘ = 2, each sigma point will have

dimension 2 and the number of sigma points for UT is 2nΘ + 1 = 5. The resulting

posterior sigma points are shown in Fig. 4.3 (b). Then, the UT mean and variance of

the posterior RVs are predicted using Eqs. (4.2) and (4.3), respectively.

As discussed in section 4.2, linearization produces the posterior mean and covariance equal to the first term in the Taylor series expansion of the posterior mean and covariance, respectively. Thus, the posterior linearized mean is $g(\bar{\Theta})$ and the posterior covariance matrix is $\Delta g\, P_\Theta\, \Delta g^T$, where $g$ is the nonlinear (vector) function relating $z = [x; y]$ with $r$ and $\theta$, and $\Delta_z = \left[ \frac{\partial}{\partial x}, \frac{\partial}{\partial y} \right]$.

Footnote 3: Justification for this choice of λ can be found in [81].

Fig. 4.3: Posterior samples (a) Monte Carlo, and (b) UT. Panel (a) shows the cloud of posterior samples obtained by Monte Carlo simulation and panel (b) the transformed sigma points obtained by UT, both in the (x, y) plane.

With the three methods, the calculated posterior moments in the y-direction are given as follows:

MC: posterior mean of $y$ = 0.9672 and covariance $P_{yy} = \begin{bmatrix} 0.0631 & 0 \\ 0 & 0.0260 \end{bmatrix}$.

UT: posterior mean of $y$ = 0.9667 and covariance $P_{yy} = \begin{bmatrix} 0.0625 & 0 \\ 0 & 0.0380 \end{bmatrix}$.

Linearization: mean of $y$ = 1.0000 and covariance $P_{yy} = \begin{bmatrix} 0.0685 & 0 \\ 0 & 0.0004 \end{bmatrix}$.

Figure 4.4 shows the mean and 1-σ contours for each of these methods. The 1-σ contour is the locus of points $\{ y : (y - \bar{y})^T P_{yy}^{-1} (y - \bar{y}) = 1 \}$ and is a graphical representation of the size and orientation of $P_{yy}$ [130,131]. As can be seen, the linearized transformation is biased and inconsistent. This is most severe in the range direction, where linearization estimates that the position is 1.00 m whereas in reality it is 0.967 m. Since it is a bias which arises from the transformation process itself, the same error with the same sign will be committed each time a coordinate transformation takes place. Even if there were no bias, the transformation is inconsistent, since the 1-σ contour of covariance obtained from linearization is significantly different from the MC covariance. In comparison, the UT mean (0.9667) is quite close to the MC estimate, and the UT 1-σ covariance contour is very similar to that of MC propagation. We note that only 5 sigma points are required in UT as against $10^5$ samples for the MC method, which relies on the law of large numbers (to propagate the whole pdf).

Fig. 4.4: Posterior mean and uncertainty (1-σ) contour of covariance in the (x, y) plane, determined by the MC approach, i.e., true values (mean at '*'), linearization (mean at '+'), and unscented transformation (mean almost matches the MC mean).
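The comparison of Example 2 can be sketched as follows. This is an illustration only: the UT scaling $\lambda = 1$, the basic UT weights, the Cholesky square root and the random seed are assumptions, so the printed numbers will be close to, but not identical with, the values quoted above.

```python
import numpy as np

mean = np.array([1.0, np.pi / 2])                   # prior mean of [r, theta]
P = np.diag([0.02**2, (np.pi / 12)**2])             # prior covariance

def polar_to_cart(p):
    r, th = p
    return np.array([r * np.cos(th), r * np.sin(th)])

# Monte Carlo reference with 1e5 samples (seed chosen arbitrarily)
rng = np.random.default_rng(0)
s = rng.multivariate_normal(mean, P, size=100_000)
z_mc = np.column_stack([s[:, 0] * np.cos(s[:, 1]), s[:, 0] * np.sin(s[:, 1])])
mc_mean, mc_cov = z_mc.mean(axis=0), np.cov(z_mc.T)

# Unscented transformation with 2n+1 = 5 sigma points (lam = 1 is an assumption)
n, lam = 2, 1.0
S = np.linalg.cholesky((n + lam) * P)
X = np.column_stack([mean, mean[:, None] + S, mean[:, None] - S])
w = np.full(2 * n + 1, 0.5 / (n + lam))
w[0] = lam / (n + lam)
Z = np.column_stack([polar_to_cart(X[:, i]) for i in range(2 * n + 1)])
ut_mean = Z @ w
d = Z - ut_mean[:, None]
ut_cov = (w * d) @ d.T

# Linearization: first-order Taylor term with the Jacobian at the prior mean
r0, th0 = mean
J = np.array([[np.cos(th0), -r0 * np.sin(th0)],
              [np.sin(th0),  r0 * np.cos(th0)]])
lin_mean = polar_to_cart(mean)
lin_cov = J @ P @ J.T

print("MC :", mc_mean, np.diag(mc_cov))
print("UT :", ut_mean, np.diag(ut_cov))
print("LIN:", lin_mean, np.diag(lin_cov))
```

Because $r$ and $\theta$ are independent Gaussians, the exact mean in the y-direction is $e^{-\sigma_\theta^2/2}$, which gives a convenient check on both the MC and UT estimates.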

As observed, linearization predicts the posterior moments of a nonlinearly trans-

formed random variable by truncating their (posterior) Taylor series expansion after

the first two terms for the mean and after the first term for the covariance [81, 112].

This can introduce significant errors in prediction for many frequently occurring non-

linearities [84, 132]. When the first two moments required for the Kalman filter are

predicted by linearization, it leads to the most well-known application of the KF frame-

work to nonlinear systems, known as the extended Kalman filter (EKF). Even though

the EKF is one of the most widely used approximate solutions for nonlinear estimation

and filtering, it has serious limitations [81, 122].

On the other hand, predicting the posterior moments with Monte Carlo methods

requires a very large number (for example, of the order of $10^3$ to $10^5$) of random samples (generated from the prior distribution) in an attempt to accurately propagate the

probability density function through the nonlinear models. The Monte Carlo approx-

imation of the optimal Bayesian estimator [41, 83], known as the particle filter, also

provides a solution (that is more accurate than the linearization used in EKF) to the

nonlinear estimation problem [59,133]. However, it is computationally very intensive.

4.4 UT-BASED EXTENSION OF THE KALMAN FILTER

Julier and Uhlmann [81] investigated a reliable nonlinear extension of the Kalman

filter and came up with UT to predict the means and covariances arising from nonlin-

ear transformations, which serve as sufficient statistics in a Kalman framework. They

demonstrated superior performance of the new filter, referred to as the unscented

Kalman filter (UKF), over the widely used EKF on several highly nonlinear 1D estimation problems [81, 132, 134]. Later, Wan and van der Merwe applied the UKF to nonlinear parameter estimation [84] and dual (simultaneous state and parameter) estimation [129]. Its utility as a smoother was demonstrated in the 'Expectation step' of the EM algorithm for efficient training of neural networks, and then for the subsequent design of a dual unscented filter. Van der Merwe et al. [135] further developed

a computationally efficient and numerically robust square root implementation of the

UKF. The unscented transformation with reduced number of sigma points is proposed

in [136] and its use has been extended to more accurately estimate the means and

covariances in [137, 138]. The sigma point approaches such as the UKF and the CDF

[127] are classified in a unified general family of derivative-free Kalman filters for non-

linear estimation in [128]. The use of UKF and its Gaussian mixtures as a proposal

function in particle filters is reported in [42, 83] and [122,139], respectively.

A detailed discussion on the utilities of UT including its application to discontin-

uous transformations and multi sensor fusion is described in [125]. Since its invention,

it has been applied as a more accurate, simple, and stable replacement to the widely

used Extended Kalman filter in diverse applications [85, 140]. The similarity of the

UKF with the statistical linear regression based Kalman filter (LRKF) is discussed

in [141] to provide useful insights into the properties of the UKF. The UKF is also

analysed from a Bayesian perspective in [142] and is suggested as a better alternative

to EKF and even to Monte Carlo filters for simple models. In vision, its application

to visual tracking applications has been explored in [86, 87, 143].

The UKF inherits the Kalman filter structure; however, it employs the UT to

propagate the first two moments of the state random variable through non-linear state

and measurement transformations [81, 83, 121]. As in the (extended) Kalman filter,

the state distribution is approximated by a Gaussian random variable (GRV) but is

now represented using a minimal set of carefully chosen sigma points. These sample

points completely capture the true mean and covariance of the prior GRV, and, when

propagated through the true nonlinear system, capture the posterior mean and covari-

ance accurately to second order (third order for symmetric priors) in the Taylor series

expansion for any nonlinearity [81].

State estimation in a general recursive framework employs a nonlinear/non-additive

state space model to describe the dynamics of the system. The evolution of the state

(the quantity to be estimated) is described as a state equation

$$ x_k = f(x_{k-1}, u_k) \tag{4.26} $$

and the degradation that relates the observation to the state is modeled as the observation equation

$$ y_k = h(x_k, v_k) \tag{4.27} $$

where uk is state noise and vk is the noise in the observations.

The UKF results from applying the UT to recursive minimum mean-square error (MMSE) estimation [121, 122]. Kalman derived [7] the recursive form of the linear Bayesian update of the state conditional mean, $\bar{x}_k = E[x_k / y_1, y_2, \dots, y_k]$, and its covariance, $P_k = E[(x_k - \bar{x}_{k/k-1})(x_k - \bar{x}_{k/k-1})^T]$, as

$$ \bar{x}_k = (\text{Prediction of } x_k) + (\text{Gain})_k \times [\, y_k - (\text{Prediction of } y_k)\,] $$

While this is a linear recursion, we need not assume linearity of the model. For MMSE, the terms in this recursion are given by

$$ \bar{x}_{k/k-1} = E[f(x_{k-1}, u_k)], \qquad \bar{y}_{k/k-1} = E[h(x_{k/k-1}, v_k)] \tag{4.28} $$

$$ P_{xy} = E[(x_k - \bar{x}_{k/k-1})(y_k - \bar{y}_{k/k-1})^T], \qquad P_{yy} = E[(y_k - \bar{y}_{k/k-1})(y_k - \bar{y}_{k/k-1})^T] \tag{4.29} $$

$$ K_k = P_{xy} P_{yy}^{-1} \tag{4.30} $$

where the optimal prediction (prior mean at time $k$) of $x_k$ is written as $\bar{x}_{k/k-1}$ while that of $y_k$ is represented as $\bar{y}_{k/k-1}$. The optimal gain term $K_k$ is expressed as a function of the expected cross-correlation matrix of the state and observation prediction errors, and the expected auto-correlation matrix of the observation prediction error.

Thus, to apply the Kalman filter to nonlinear/non-Gaussian situations, we only

require the first two moments of the state and the measurement random variables to

be predicted as accurately as possible. The resulting equations are the KF update

equations and assume the form [81,122]

$$ \begin{aligned} \bar{x}_k &= \bar{x}_{k/k-1} + K_k (y_k - \bar{y}_{k/k-1}) \\ P_k &= P_{k/k-1} - K_k P_{yy} K_k^T. \end{aligned} \tag{4.31} $$

Based on this observation, Julier et al. [121] formulate the problem of applying the KF

to nonlinear systems. The means and covariances in the above equations are predicted

using UT. We illustrate the application of UKF to nonlinear systems by a 1D scalar

state estimation example [83].

Example 3: Consider a time series generated by the process model

$$ x_k = 1 + \sin(w\pi k) + k_1 x_{k-1} + u_k \tag{4.32} $$

where $u_k$ is a zero-mean Gaussian random variable with variance 15 modeling the process noise, and $w = 0.04$ and $k_1 = 0.5$ are scalar parameters. When the state $x_k$ is subjected to a non-stationary observation model, we have the degraded observations

$$ y_k = \begin{cases} k_2 x_k^2 + v_k, & k \le 30 \\ k_3 x_k - 2 + v_k, & k > 30 \end{cases} \tag{4.33} $$

with $k_2 = 0.2$, $k_3 = 0.5$, and $v_k$ Gaussian distributed as $N(0, 0.5)$. Given only the noisy observations $y_k$, the problem is to estimate the underlying clean state sequence $x_k$ for $k = 1, \dots, 60$.

We show how the UKF can be used to track the state sequence $x_k$. The state random variable is augmented with the noise variables [81] as $x^a_k = [x_k;\ u_k;\ v_k]$, having dimension $n_a = n_x + n_u + n_v$. We represent the mean and covariance of the state as $\bar{x}_k$ and $P_k$, respectively. The mean and covariance of this augmented vector become

$$ \bar{x}^a_{k-1} = [\bar{x}_{k-1};\ 0;\ 0], \qquad P^a_{k-1} = \begin{bmatrix} P_{k-1} & 0 & 0 \\ 0 & \sigma_u^2 & 0 \\ 0 & 0 & \sigma_v^2 \end{bmatrix} $$

The scaled UT sigma point selection scheme (Eq. 4.1) is applied to the mean and

covariance matrices of this new augmented state RV (assuming an initial mean and

covariance for the state and known noise statistics) to calculate the corresponding

sigma matrix Xa.

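The augmented sigma point matrix $\mathcal{X}^a$ is partitioned [81] as $\mathcal{X}^a = [\mathcal{X}^x;\ \mathcal{X}^u;\ \mathcal{X}^v]$, i.e., the first $n_x$ rows, subsequent $n_u$ rows and the last $n_v$ rows are sigma point matrices corresponding to the state, state noise and the observation noise, respectively. Each component of these sigma points is independently propagated according to the state equation (Eq. 4.32) as

$$ \mathcal{X}^x_{k/k-1} = f(\mathcal{X}^x_{k-1}, \mathcal{X}^u_{k-1}) = 1 + \sin(w\pi k) + k_1 \mathcal{X}^x_{k-1} + \mathcal{X}^u_{k-1} \tag{4.34} $$

and then through the measurement equation (Eq. 4.33) as

$$ \mathcal{Y}_{k/k-1} = h(\mathcal{X}^x_{k/k-1}, \mathcal{X}^v_k) = \begin{cases} k_2 \left( \mathcal{X}^x_{k/k-1} \right)^2 + \mathcal{X}^v_k, & k \le 30 \\ k_3 \mathcal{X}^x_{k/k-1} - 2 + \mathcal{X}^v_k, & k > 30 \end{cases} \tag{4.35} $$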

Fig. 4.5: Signal estimation (signal values versus time step): (solid) original, (dash-dot) observations, and (dashed) UKF estimates.

In UKF, the means and covariances required for each recursive Kalman filtering update step are computed from the corresponding predicted state and/or measurement sigma points employing Eqs. (4.2) and (4.3) as $\bar{z} = \sum_{i=0}^{2n_a} w_i^{(\mu)} \mathcal{Z}_i$ and $P_{zz} = \sum_{i=0}^{2n_a} w_i^{(c)} (\mathcal{Z}_i - \bar{z})(\mathcal{Z}_i - \bar{z})^T$, respectively, where the sums run over the $2n_a + 1$ sigma points of the augmented state. Here, the sigma matrix $\mathcal{Z}$ assumes $\mathcal{X}^x$ for state and $\mathcal{Y}$ for measurement. These predicted statistics along with the recent

observation are used in the Kalman update equations (Eq. 4.31) to arrive at the final

estimate of the original signal. Figure 4.5 shows the original time series, its observa-

tions and the UKF estimates. From the plot, we note that UKF estimates are quite

close to the original. A comparison of the accuracy of the UKF estimates with those

of EKF (for the same observations) is shown in Fig. 4.6. We note that in the first

half-interval of the signal where the observation model is nonlinear the performance of

the UKF is quite superior to that of the EKF, while they perform comparably in the

linear zone. The MSE for the EKF estimate is 0.52 while for the UKF it is only 0.29.
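The recursion of Example 3 can be sketched in code as follows. This is a minimal illustration rather than the implementation used for the reported figures: the initial mean and variance (`x0`, `P0`), the value `lam = 2`, the basic UT weights and the random seed are all assumptions, so the resulting errors will not reproduce the MSE values quoted above.

```python
import numpy as np

W_PAR, K1, K2, K3 = 0.04, 0.5, 0.2, 0.5      # parameters of Eqs. 4.32 and 4.33
VAR_U, VAR_V = 15.0, 0.5                     # process and observation noise variances

def simulate(n_steps, rng, x0=1.0):
    """Generate the state sequence and its noisy observations."""
    x, y, prev = np.empty(n_steps), np.empty(n_steps), x0
    for i in range(n_steps):
        k = i + 1
        x[i] = 1 + np.sin(W_PAR * np.pi * k) + K1 * prev + rng.normal(0, np.sqrt(VAR_U))
        v = rng.normal(0, np.sqrt(VAR_V))
        y[i] = (K2 * x[i] ** 2 if k <= 30 else K3 * x[i] - 2) + v
        prev = x[i]
    return x, y

def ukf(y, x0=1.0, P0=1.0, lam=2.0):
    """Scalar UKF with the state augmented by both noise variables (na = 3)."""
    na = 3
    w = np.full(2 * na + 1, 0.5 / (na + lam))
    w[0] = lam / (na + lam)
    x_est, P = x0, P0
    means, variances = np.empty(len(y)), np.empty(len(y))
    for i, yk in enumerate(y):
        k = i + 1
        xa = np.array([x_est, 0.0, 0.0])
        # Augmented covariance is diagonal, so its square root is element-wise
        S = np.diag(np.sqrt((na + lam) * np.array([P, VAR_U, VAR_V])))
        Xa = np.column_stack([xa, xa[:, None] + S, xa[:, None] - S])
        Xx = 1 + np.sin(W_PAR * np.pi * k) + K1 * Xa[0] + Xa[1]      # Eq. 4.34
        Y = (K2 * Xx ** 2 if k <= 30 else K3 * Xx - 2) + Xa[2]       # Eq. 4.35
        x_pred, y_pred = w @ Xx, w @ Y
        P_pred = w @ (Xx - x_pred) ** 2
        Pyy = w @ (Y - y_pred) ** 2
        Pxy = w @ ((Xx - x_pred) * (Y - y_pred))
        K = Pxy / Pyy                                                # Eq. 4.30
        x_est = x_pred + K * (yk - y_pred)                           # Eq. 4.31
        P = P_pred - K * Pyy * K
        means[i], variances[i] = x_est, P
    return means, variances

rng = np.random.default_rng(1)               # arbitrary seed
x_true, y_obs = simulate(60, rng)
x_ukf, P_ukf = ukf(y_obs)
mse_linear_zone = np.mean((x_ukf[30:] - x_true[30:]) ** 2)
print(mse_linear_zone)
```

In the linear zone ($k > 30$) the UT is exact, so this sketch reduces there to the ordinary Kalman filter; the quadratic zone is where the sigma-point propagation matters.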

Fig. 4.6: Signal estimation (signal values versus time step): (solid) original, (dash-dot) EKF, and (dashed) UKF estimates.

4.5 DISCUSSION

In this chapter, we introduced the principle of unscented transformation (UT) that has

the capability to transform mean and covariance through nonlinear transformations

and forms the basis for the UKF. We analysed the accuracy of the transformed (first

two) moments using the Taylor series expansion for nonlinear functions and compared

them with the true posterior and linearization following [81, 112, 122, 134]. We gave

simple and useful examples to illustrate the efficacy of the unscented transformation

over linearization and in comparison with analytical and Monte Carlo methods.

CHAPTER 5

NOISE REDUCTION IN PHOTOGRAPHIC IMAGES

5.1 INTRODUCTION

In this chapter, demonstrating the capability of unscented Kalman filter in the 2-

D domain, we first propose an auto-regressive UKF (ARUKF) for non-linear image

estimation. We further propose a novel methodology that reduces noise but pre-

serves edge information by judiciously incorporating a non-Gaussian prior within the

UKF framework through importance sampling (IS). We achieve this by formulating

a discontinuity-adaptive MRF (DAMRF) prior suited for recursive prediction and by

employing IS with a space-varying and heavy-tailed proposal density to estimate the

DAMRF statistics.

We address the specific problem of recovering images degraded by film-grain noise

[4, 55, 59]. This is an important practical example of sensor nonlinearity. The pho-

tographic film is a widely used image recording medium due to its high resolution

capability, exposure latitude, dynamic range, and ready availability [144]. In a photo-

graphic film, the film density is related linearly to the logarithm of the exposure. In

the density domain, the noise is modeled as additive, independent, and white Gaus-

sian. Due to the logarithmic relationship between density and exposure, the film-grain

noise manifests itself as multiplicative non-Gaussian noise in the exposure domain. We

explore the applicability of UKF for film-grain noise removal in photographic images.

5.2 AUTO-REGRESSIVE UKF

In this section, we propose a recursive filter based on the UKF that accounts for image

sensor nonlinearity. The UKF which is based on the UT provides a mathematically

tractable way to propagate the first two moments in the presence of non-linear degra-

dations even in the presence of non-Gaussian noise. This is achieved by augmenting

the state random variable with the (state and observation) noise random variables and

applying UT in the prediction of state and observation [81].

In order to apply the UKF for image estimation, we first model the original image

with an AR state model (as in Eq. 3.2). Adopting the same notation, we have

x(m,n) denote a 3 × 1 state vector, u is the driving state noise with zero-mean and

variance σ2u, and F is the state transition matrix containing the AR coefficients. For

non-linear/non-additive degradation, the measurement model assumes a general form

$$ y_{(m,n)} = h(x_{(m,n)}, v_{(m,n)}) \tag{5.1} $$

where y(m,n) is the observation at pixel (m, n) of dimension ny, v(m,n) is corresponding

measurement noise of dimension nv, and h is the functional form of the degradation.

Based on the state and measurement models, we formulate the UKF to estimate the

original image from its degraded observation. We assume that the statistics of the noise

are known (given or estimated from the degraded image). In order to deal with non-additive noise, we define an augmented vector $x^a_{(m,n)}$ as the concatenation of the state vector and the scalar noise variables, i.e., $x^a_{(m,n)} = [x^T_{(m,n)} \;\; u(m,n) \;\; v(m,n)]^T$ with dimension $n_a = n_x + n_u + n_v$. For example, for a three-pixel neighborhood, $x_{(m,n)}$ has dimension three and $x^a_{(m,n)}$ has dimension five. We apply the SUT sigma point selection scheme (Eq. 4.1) to the augmented vector $x^a_{(m,n)}$ to calculate the corresponding augmented sigma point matrix $\mathcal{X}^a_{(m,n)}$ of dimension $n_a \times (2n_a + 1)$. Since $n_a = 5$, $\mathcal{X}^a_{(m,n)}$ will be of size $5 \times 11$ and each sigma point is of dimension five.

We now describe the ARUKF which sequentially estimates the intensity of the

original image at each pixel (m, n) in a raster-scan order from left-to-right. The filter

updates the mean and covariance of the Gaussian approximation to the posterior

distribution of the state as described in the following steps.

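Initialize state statistics: $\bar{x}_0 = E[x_0]$; $P_0 = E[(x_0 - \bar{x}_0)(x_0 - \bar{x}_0)^T]$.

1. Augment state mean and covariance with noise statistics.

$$ \bar{x}^a_{(m,n-1)} = [\bar{x}^T_{(m,n-1)} \;\; 0 \;\; 0]^T, \qquad P^a_{(m,n-1)} = \begin{bmatrix} P_{(m,n-1)} & 0 & 0 \\ 0 & \sigma_u^2 & 0 \\ 0 & 0 & \sigma_v^2 \end{bmatrix} $$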

2. Calculate the augmented sigma point matrix (Eq. 4.1), which constitutes the prior sigma points.

$$ \mathcal{X}^a_{(m,n-1)} = \left[ \bar{x}^a_{(m,n-1)} \quad \bar{x}^a_{(m,n-1)} \pm \sqrt{(n_a + \lambda) P^a_{(m,n-1)}} \right] $$

3. The state sigma points and state noise sigma points are propagated through the AR state model for prediction.

$$ \mathcal{X}^x_{(m,n)/(m,n-1)} = f(\mathcal{X}^x_{(m,n-1)}, \mathcal{X}^u_{(m,n-1)}) = F \mathcal{X}^x_{(m,n-1)} + \left[ \mathcal{X}^{uT}_{(m,n-1)} \;\; 0^T \;\; 0^T \right]^T \tag{5.2} $$

where $\mathcal{X}^x_{(m,n-1)}$ comprises the sigma points formed from the first $n_x$ rows of $\mathcal{X}^a_{(m,n-1)}$, while $\mathcal{X}^u_{(m,n-1)}$ is formed from the $n_u$ rows that immediately follow the first $n_x$ rows in $\mathcal{X}^a_{(m,n-1)}$.

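4. The predicted state sigma points are used to determine the predicted mean and covariances.

$$ \bar{x}_{(m,n)/(m,n-1)} = \sum_{i=0}^{2n_a} w_i^{(\mu)} \mathcal{X}^x_{i,(m,n)/(m,n-1)} $$

$$ P_{(m,n)/(m,n-1)} = \sum_{i=0}^{2n_a} w_i^{(c)} \left[ \mathcal{X}^x_{i,(m,n)/(m,n-1)} - \bar{x}_{(m,n)/(m,n-1)} \right] \left[ \mathcal{X}^x_{i,(m,n)/(m,n-1)} - \bar{x}_{(m,n)/(m,n-1)} \right]^T $$

where $i$ represents the $i$th sigma point, which is the $i$th column of $\mathcal{X}^x_{(m,n)/(m,n-1)}$.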

5. The sigma points corresponding to the measurement are predicted by propagating the (predicted) state sigma points and measurement noise sigma points through the observation non-linearity.

$$ \mathcal{Y}_{(m,n)/(m,n-1)} = h(\mathcal{X}^x_{(m,n)/(m,n-1)}, \mathcal{X}^v_{(m,n-1)}) $$

Here, $\mathcal{X}^v_{(m,n-1)}$ is formed from the last $n_v$ rows of $\mathcal{X}^a_{(m,n-1)}$. This results in the measurement sigma point matrix $\mathcal{Y}_{(m,n)/(m,n-1)}$ of dimension $n_y \times (2n_a + 1)$.

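6. Statistics required for updation are estimated using the predicted sigma points.

$$ \bar{y}_{(m,n)/(m,n-1)} = \sum_{i=0}^{2n_a} w_i^{(\mu)} \mathcal{Y}_{i,(m,n)/(m,n-1)} $$

$$ P_{yy} = \sum_{i=0}^{2n_a} w_i^{(c)} \left[ \mathcal{Y}_{i,(m,n)/(m,n-1)} - \bar{y}_{(m,n)/(m,n-1)} \right] \left[ \mathcal{Y}_{i,(m,n)/(m,n-1)} - \bar{y}_{(m,n)/(m,n-1)} \right]^T $$

$$ P_{xy} = \sum_{i=0}^{2n_a} w_i^{(c)} \left[ \mathcal{X}^x_{i,(m,n)/(m,n-1)} - \bar{x}_{(m,n)/(m,n-1)} \right] \left[ \mathcal{Y}_{i,(m,n)/(m,n-1)} - \bar{y}_{(m,n)/(m,n-1)} \right]^T $$

where $\mathcal{Y}_{i,(m,n)/(m,n-1)}$ represents the $i$th sigma point, which is the $i$th column of $\mathcal{Y}_{(m,n)/(m,n-1)}$.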

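7. Update as in the Kalman filter:

The Kalman gain is $K_{(m,n)} = P_{xy} P_{yy}^{-1}$, and

$$ \bar{x}_{(m,n)} = \bar{x}_{(m,n)/(m,n-1)} + K_{(m,n)} \left( y_{(m,n)} - \bar{y}_{(m,n)/(m,n-1)} \right) $$

$$ P_{(m,n)} = P_{(m,n)/(m,n-1)} - K_{(m,n)} P_{yy} K^T_{(m,n)} $$

The estimated intensity at $(m, n)$ is $s(m, n) = \bar{x}_{(m,n)}(1)$ (the first component of $\bar{x}_{(m,n)}$).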

We refer to this formulation of the UKF for image estimation as ARUKF. A block

diagram of the ARUKF structure is given in Fig. 5.1.
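Steps 1 to 7 above can be sketched for a single pixel as follows. This is an illustrative sketch under stated assumptions, not the thesis code: the function name `arukf_step` is hypothetical, the observation is taken to be scalar ($n_y = 1$), the basic UT weights with `lam = 2` replace the scaled weights of Eq. 4.1, a Cholesky factor serves as the matrix square root, and the degradation `h(x, v)` is passed in by the caller because its functional form depends on the application.

```python
import numpy as np

def arukf_step(x_mean, P, y, F, var_u, var_v, h, lam=2.0):
    """One ARUKF pixel update for a scalar observation y (steps 1 to 7)."""
    nx = x_mean.size
    na = nx + 2                                   # state + state noise + obs. noise
    # Step 1: augment mean and covariance with the noise statistics
    xa = np.concatenate([x_mean, [0.0, 0.0]])
    Pa = np.zeros((na, na))
    Pa[:nx, :nx] = P
    Pa[nx, nx], Pa[nx + 1, nx + 1] = var_u, var_v
    # Step 2: augmented sigma points (columns) and basic UT weights
    S = np.linalg.cholesky((na + lam) * Pa)
    Xa = np.column_stack([xa, xa[:, None] + S, xa[:, None] - S])
    w = np.full(2 * na + 1, 0.5 / (na + lam))
    w[0] = lam / (na + lam)
    Xx, Xu, Xv = Xa[:nx], Xa[nx], Xa[nx + 1]
    # Step 3: propagate through the AR state model (Eq. 5.2);
    # the driving noise enters the first state component only
    Xpred = F @ Xx
    Xpred[0] += Xu
    # Step 4: predicted state mean and covariance
    x_pred = Xpred @ w
    dX = Xpred - x_pred[:, None]
    P_pred = (w * dX) @ dX.T
    # Step 5: propagate through the observation non-linearity
    Y = np.array([h(Xpred[:, i], Xv[i]) for i in range(2 * na + 1)])
    # Step 6: statistics required for the update
    y_pred = w @ Y
    dY = Y - y_pred
    Pyy = np.sum(w * dY ** 2)
    Pxy = (w * dX) @ dY
    # Step 7: Kalman update
    K = Pxy / Pyy
    x_new = x_pred + K * (y - y_pred)
    P_new = P_pred - Pyy * np.outer(K, K)
    return x_new, P_new
```

A convenient sanity check for such a sketch is a linear, additive observation such as `h = lambda x, v: x[0] + v`, for which the UT is exact and the result should coincide with the standard Kalman filter update.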

Fig. 5.1: Auto-regressive unscented Kalman filter (ARUKF).

5.3 IMPORTANCE SAMPLING UKF

The performance of ARUKF can be improved by modeling the image with a non-

Gaussian edge preserving Markov prior as discussed next.

We construct the state conditional pdf using the past pixels in the NSHP support

and the DAMRF model as in ISKF (section 3.3.2)

$$ P(s(m,n)/s(m-i, n-j)) = \exp\left( -\gamma \log\left( 1 + \frac{\eta^2(s(m,n), s)}{\gamma} \right) \right) \tag{5.3} $$

Our objective is to tailor the DA prior for film-grain noise removal in photographic

images and then embed the prior into the UKF framework. As discussed in chapter

2, the energy term η2(s(m, n), s) must be modified to handle the specific class of

image features in a given application. Once the DA conditional prior is formulated,

the update stage of the UKF requires the predicted sigma points to be propagated

through the observation model.

5.3.1 ISUKF for Non-linear Image Estimation

We now extend the basic structure of the UKF to include the non-Gaussian prior.

As in ISKF (section 3.5), we model the image with a DAMRF model and estimate

its mean and variance by importance sampling. However, unlike in the simple ISKF

where we directly propagate the predicted mean and covariances through the Kalman

update equations, we employ them to determine the predicted sigma points. These

sigma points are propagated through the non-linearities, and the required statistics

for the update step of the UKF are computed based on the sigma points (and UT

weights). The updation step yields the estimate of the pixel intensity.

The steps involved in the proposed importance sampling UKF (ISUKF) method

are as follows:

1. At pixel (m, n), we construct the (scalar) state conditional pdf p(s(m, n)/s)

using the DAMRF model, given the past NSHP pixels s, as in ISKF (step (1)

of section 3.5). The energy term η2(s(m, n), s) of the DAMRF prior must be

tailored specifically with respect to the given application to improve performance

(as discussed subsequently in sections 5.4.1 and 6.3.1). We note that unlike in

the ARUKF, the dimension of state in ISUKF is not defined by the number of

neighbors. It has to be determined by the number of features and/or pixels that

must be estimated. In our formulation of ISUKF, we estimate only the current

pixel intensity s(m, n) at pixel (m, n) and hence the state is a scalar.

2. From the state conditional prior p(s(m, n)/s), the predicted mean µp and predicted covariance σ²p estimates of the scalar state are obtained exactly as in

the ISKF algorithm by importance sampling of the DAMRF prior (step (2) of

section 3.5), using the samples of a space-varying Cauchy density. To achieve

support overlap with the prior at each pixel, the location of the Cauchy sampler

is varied with the mean of the neighboring pixels (s).

3. These estimates are augmented to deal with the non-additive measurement noise

and are used to predict directly the one-step ahead sigma points (unlike in step

(2) of ARUKF in the earlier section).

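$$x_{(m,n)/(m,n-1)} = \mu_p \quad \text{and} \quad P_{(m,n)/(m,n-1)} = \sigma_p^2$$

$$x^a_{(m,n)/(m,n-1)} = [\, x^T_{(m,n)/(m,n-1)} \;\; 0 \,]^T = [\, \mu_p \;\; 0 \,]^T \qquad (5.4)$$

$$P^a_{(m,n)/(m,n-1)} = \begin{bmatrix} P_{(m,n)/(m,n-1)} & 0 \\ 0 & \sigma_v^2 \end{bmatrix} = \begin{bmatrix} \sigma_p^2 & 0 \\ 0 & \sigma_v^2 \end{bmatrix} \qquad (5.5)$$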

Since state mean and covariance prediction is based directly on the conditional

pdf, we need not augment with the state noise statistics (see footnote 1). This is in contrast

to step (1) of the ARUKF where we need to augment the state mean and

covariances with state and measurement noise statistics so as to propagate the

state and state noise sigma points through the state equation to obtain the

predicted sigma points (step (3) of ARUKF).

We note that while estimating a pixel, the dimension of the augmented vector

na in ISUKF is only (1+nv =) 2 and is independent of the number of neighbors,

whereas in ARUKF it is nx + 2 (assuming that nx is the number of neighbors,

and state and measurement noises are scalars).

(a) Use these augmented mean and covariances to determine the (predicted

state, and measurement noise) sigma point matrix using Eq. 4.1.

$$X^a_{(m,n)/(m,n-1)} = \left[\, x^a_{(m,n)/(m,n-1)} \quad x^a_{(m,n)/(m,n-1)} \pm \sqrt{(n_a + \lambda)\, P^a_{(m,n)/(m,n-1)}} \,\right]$$

Footnote 1: Neighbors s and parameter ρ2(m,n) used in the DAMRF prior serve as implicit counterparts of the

state noise (or uncertainty).

(b) Propagate each of the 2na + 1 sigma points corresponding to state and

measurement noise through the observation nonlinearity.

$$Y_{(m,n)/(m,n-1)} = h\left(X^x_{(m,n)/(m,n-1)},\, X^v_{(m,n-1)}\right)$$

where $X^x_{(m,n)/(m,n-1)}$ and $X^v_{(m,n-1)}$ are formed from the first row of $X^a_{(m,n)/(m,n-1)}$

and the nv rows following the first, respectively.

(c) Estimate the statistics of y (from the sigma points) and update following

the same steps as (6) and (7) in the ARUKF (section 5.2)

The estimated pixel intensity is given by s(m, n) = x(m,n).

Thus, s yields the estimated image. This formulation of the UKF with a discontinuity-

adaptive prior is expected to preserve edges better than the ARUKF while allowing

for greater smoothing in uniform regions. A block schematic of the proposed ISUKF

is shown in Fig. 5.2 for ease of understanding.
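The per-pixel steps 3(a) to 3(c) can be sketched in a few lines of Python. This is only an illustrative sketch, not the thesis implementation: the function names are hypothetical, the weights use the standard scaled unscented transform (assumed here to match Eq. 4.1), and the predicted mean and variance are taken as given inputs rather than computed by importance sampling.

```python
import numpy as np


def ut_weights(n_a, alpha_ut=1.0, beta_ut=0.0, kappa=1.0):
    """Standard scaled-UT weights (assumed form; compare with Eq. 4.1)."""
    lam = alpha_ut ** 2 * (n_a + kappa) - n_a
    w_mean = np.full(2 * n_a + 1, 1.0 / (2.0 * (n_a + lam)))
    w_cov = w_mean.copy()
    w_mean[0] = lam / (n_a + lam)
    w_cov[0] = w_mean[0] + (1.0 - alpha_ut ** 2 + beta_ut)
    return lam, w_mean, w_cov


def isukf_pixel_update(mu_p, var_p, var_v, y, h):
    """One scalar-state update: augment, draw sigma points, propagate, update.

    mu_p, var_p : predicted mean and variance of the pixel (from the prior)
    var_v       : measurement noise variance
    y           : observed pixel value
    h           : observation function h(x_sigma, v_sigma), applied elementwise
    """
    n_a = 2                                    # state (1) + measurement noise (1)
    x_a = np.array([mu_p, 0.0])                # augmented mean, Eq. (5.4)
    P_a = np.diag([var_p, var_v])              # augmented covariance, Eq. (5.5)
    lam, w_mean, w_cov = ut_weights(n_a)

    # Step 3(a): 2*n_a + 1 = 5 sigma points arranged as a 2 x 5 matrix.
    S = np.linalg.cholesky((n_a + lam) * P_a)
    X = np.column_stack([x_a, x_a[:, None] + S, x_a[:, None] - S])
    X_x, X_v = X[0], X[1]

    # Step 3(b): propagate through the observation nonlinearity.
    Y = h(X_x, X_v)

    # Step 3(c): weighted statistics followed by a Kalman-type update.
    x_pred = w_mean @ X_x
    y_pred = w_mean @ Y
    P_x = w_cov @ (X_x - x_pred) ** 2
    P_y = w_cov @ (Y - y_pred) ** 2
    P_xy = w_cov @ ((X_x - x_pred) * (Y - y_pred))
    gain = P_xy / P_y
    return x_pred + gain * (y - y_pred), P_x - gain * P_y * gain
```

With a linear additive observation `h = lambda x, v: x + v`, the sketch reduces to the ordinary scalar Kalman update, which is a convenient sanity check before plugging in a non-linear model.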

Fig. 5.2: Importance sampling unscented Kalman filter (ISUKF).

5.4 FILM-GRAIN NOISE

A slide, a color negative, or a black-and-white film, contains tiny crystals of silver halide

salts which are light sensitive. When a film is developed, these crystals turn into tiny

filaments of metallic silver conventionally called ‘grain’. The faster the film, the larger

the clumps of silver formed and blobs of dye generated (in color films), and the more

they tend to group together in random patterns and become more visible to the naked

eye. These randomly patterned grains are visibly objectionable in photographic prints

and constitute film-grain noise. This noise also limits resolution since it is determined

by how fast a film reacts to light [144]. There are applications such as in motion pictures

[145] where the presence of film-grain noise is desired. While film-grain enhances the

feel of a motion picture, it makes compression of the content more difficult due to high

entropy as the pattern is noise-like and independent from that of the adjacent frames.

To compress and transmit these images, removal of film-grain noise is a prerequisite.

For the multimedia industry, film-grain noise removal is important for digitisation and

storage of the huge legacy of images and movies of the last several decades [57].

When a photographic film is used as a recording medium, there is a well-known

nonlinear relationship between the incoming light intensity (exposure) and the silver

density deposited on the film. The image recorded on a photographic film D, is related

to the logarithm of the exposure E given by the D − log E curve of the film [3, 4] as

$$D(E) = \alpha \log_{10}(E) + \beta \qquad (5.6)$$

Parameters α and β are the slope and offset derived from the D − log E curve [55].

The domain in which the actual film formation takes place is referred to as the density

domain, and the domain in which the developed image is available for visual consump-

tion is called the exposure domain. In density domain, using the sensor nonlinearity

(Eq. 5.6), we get the noisy observations

$$r_d(m,n) = \alpha \log_{10}(s(m,n)) + \beta + v(m,n) \qquad (5.7)$$

where s(m, n) is the original scene, and v(m, n) denotes additive white Gaussian noise

with zero mean and variance $\sigma_v^2$. The inverse transform of Eq. (5.6), $E(D) = 10^{(D-\beta)/\alpha}$,

gives the image in the exposure domain [4]. Applying this transformation to the density

domain model (Eq. 5.7), we obtain the observation model in the exposure domain as

$$r_e(m,n) = s(m,n)\, 10^{v(m,n)/\alpha} \qquad (5.8)$$

Here, re(m, n) is the degraded image in exposure domain, where the noise becomes

multiplicative and non-Gaussian.
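The passage from the density domain to the exposure domain can be checked numerically. The following Python sketch simulates Eq. (5.7), applies the inverse D − log E transform, and confirms that the result equals the multiplicative form of Eq. (5.8). The function names are hypothetical, and the offset β = 0 and the test image are arbitrary choices made only for illustration (α = 5 is the value used later in the experiments).

```python
import numpy as np


def film_grain_density(s, alpha, beta, var_v, rng):
    """Density-domain observation of Eq. (5.7); also returns the noise field v."""
    v = rng.normal(0.0, np.sqrt(var_v), size=s.shape)
    return alpha * np.log10(s) + beta + v, v


def density_to_exposure(d, alpha, beta):
    """Inverse D - log E transform: E(D) = 10^((D - beta) / alpha)."""
    return 10.0 ** ((d - beta) / alpha)


rng = np.random.default_rng(0)
s = rng.uniform(10.0, 200.0, size=(8, 8))   # synthetic positive-valued scene
alpha, beta, var_v = 5.0, 0.0, 0.05         # beta is an assumed offset

r_d, v = film_grain_density(s, alpha, beta, var_v, rng)
r_e = density_to_exposure(r_d, alpha, beta)

# Eq. (5.8): the additive Gaussian noise becomes a multiplicative factor.
multiplicative_form = s * 10.0 ** (v / alpha)
```

Since v is Gaussian, the factor 10^(v/α) is log-normally distributed, which is why the exposure-domain noise is both multiplicative and non-Gaussian.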

5.4.1 The Proposed Filters

In this section, we apply ARUKF and ISUKF for image estimation in film-grain noise.

The application of ARUKF and ISUKF for film-grain noise removal requires the prop-

agation of predicted state sigma points through the exposure domain film-grain non-

linearity (Eq. 5.8) with the observation being y(m,n) = re(m, n). Thus, step 5 in the

ARUKF algorithm that predicts the measurement sigma points becomes

$$Y_{(m,n)/(m,n-1)} = h\left(X^x_{(m,n)/(m,n-1)},\, X^v_{(m,n-1)}\right) = H X^x_{(m,n)/(m,n-1)} \;.\!*\; 10^{.\left(X^v_{(m,n-1)}/\alpha\right)} \qquad (5.9)$$

Operations .∗ and .() represent element-by-element operations (multiplication and

raised power, respectively). For a three pixel NSHP neighborhood in ARUKF, the

state has dimension 3 and H = [1 0 0]. The augmented state dimension will be 5 and

hence 11 sigma points are used to propagate statistics.

In order to apply the ISUKF, the image prior must be formulated based on the

type of features to be preserved. For edge preservation in photographic images, we

construct the state conditional pdf using first-order NSHP neighbors, i.e., (i, j) ∈ {(0, 1), (1, 0), (1, 1), (1, −1)}, and

$$\eta^2(s(m,n), s) = \frac{1}{\rho^2_{(m,n)}} \sum_{(i,j)} \frac{(s(m,n) - s(m-i,n-j))^2}{(i^2 + j^2)}.$$

We choose the parameter ρ2(m,n) = P(m,n−1), the covariance of the (recent) past pixel,

since it is a measure of local dependency. The first two moments of this prior are esti-

mated by importance sampling and are augmented with measurement noise statistics

to determine the predicted sigma points (steps 1, 2 in section 5.3).

The measurement equation (step 3(b)) in ISUKF has the same form as that of

the ARUKF (Eq. 5.9) except that H = 1. Since the predicted state by importance

sampling has dimension one, the augmented vector has dimension na = 2 and only

(2na+1 =)5 sigma points are needed to propagate the state dynamics. With this prior

and measurement formulations, the steps in ISUKF algorithm (described in section 5.3)

are followed to estimate the (original) image pixel intensity.

Fig. 5.3: (a) Original ‘flower’ image. (b) Degraded image. Image estimated using (c) MWF (ISNR = 2.86 dB), (d) PF (ISNR = 2.55 dB), (e) ARUKF (ISNR = 3.42 dB), and (f) ISUKF (ISNR = 4.63 dB).


In this section, we present results obtained using the proposed methods, namely,

ARUKF and ISUKF, and compare their performance with the modified Wiener fil-

ter (MWF) [4] and the recently proposed particle filter (PF) [58,59]. We note that the

PF and ARUKF require accurate estimates of AR parameters of the original image

which is a difficult proposition in real situations. When only the corrupted image is

available, we estimate the AR parameters (approximately) from the degraded image

by solving Eqs. (3.3).

Fig. 5.4: (a) The ‘house’ image. (b) Degraded image (σ2v = 0.05). Result using (c) MWF (ISNR = 2.51 dB), (d) PF (ISNR = 3.48 dB), (e) ARUKF (ISNR = 3.40 dB), and (f) ISUKF (ISNR = 4.55 dB).

The parameters that influence the DAMRF model are γ and ρ(m,n). We have

found experimentally that γ = 1.8 yields best performance. From the D − log E

characteristics of the film, we took α = 5 for all the experiments [55]. The parameters

of UT were set as αUT = 1, βUT = 0 and κ = 1. The UKF was observed to be robust

to variations in these UT parameters.

5.5.1 Simulations

Fig. 5.3(a) shows a ‘flower’ image. It is degraded by film-grain noise with variance

σ2v = 0.05 according to Eq. (5.7) and the exposure domain image (re in Eq. (5.8))

is shown in Fig. 5.3(b). The image estimated by MWF is shown in Fig. 5.3(c).

Fig. 5.5: (a) The ‘peppers’ image. (b) Degraded image. Output of (c) MWF (ISNR = 2.98 dB), (d) PF (ISNR = 4.47 dB), (e) ARUKF (ISNR = 4.12 dB), and (f) ISUKF (ISNR = 5.26 dB).

The result obtained using the particle filter [59] (Fig. 5.3(d)) with 1000 samples is

considerably less noisy than MWF. Fig. 5.3(e) shows the output of the ARUKF. It

is marginally sharper than the PF output and has a higher ISNR value. The image

estimated using ISUKF is presented in Fig. 5.3(f). The noise level is quite low while the

petals come out sharp and crisp demonstrating the effectiveness of the non-Gaussian

prior incorporated in the ISUKF. The image is much closer to the original and has the

highest ISNR value among all these methods.

Fig. 5.4(a) shows a ‘house’ image. The image degraded by film-grain noise is

shown in Fig. 5.4(b). The images estimated by MWF, PF, ARUKF and ISUKF are

shown in Figs. 5.4(c), (d), (e) and (f), respectively. The ARUKF preserves sharp

details (such as grass and tree leaves) akin to MWF but is less noisy. The performance

of PF is comparable to that of ARUKF. However, the image estimated by ISUKF is

the best among all. The noise is effectively filtered while preserving fine details such

as window-panes, grass and leaves. This is also reflected in its high ISNR value.

Fig. 5.6: Performance comparison on different images in terms of mean value of ISNR over 20 MC runs (a) at moderate noise (σ2v = 0.05), and (b) at high noise (σ2v = 0.15). [Plots of ISNR values of the respective filters (ISUKF, ARUKF, PF, MWF) versus image index.]

Fig. 5.5(a) shows the ‘peppers’ image. It is severely degraded by film-grain noise

with variance σ2v = 0.1 as shown in Fig. 5.5(b). The image recovered by MWF is

given in Fig. 5.5(c). By applying the ARUKF, we achieve good noise reduction (Fig.

5.5(e)) and obtain a result that is visually comparable to that of PF (Fig. 5.5(d)). The

image estimated using ISUKF is presented in Fig. 5.5(f). We note that the output

of ISUKF is almost free from noise while the contours remain sharp. It also has the

highest ISNR value among all the methods.

In Figs. 5.6(a) and (b), we statistically compare the performance of proposed

ARUKF and ISUKF with MWF and PF with prior boosting [146] on a wide variety

of images including face images, natural scenes, film images and textures, both at

moderate noise and high noise. Except on very few images (highly textured) ISUKF

outperforms all other filters at all noise levels. The ARUKF is quantitatively second

to the ISUKF in performance. At low noise (say σ2v < 0.04) PF results in poor performance

because the likelihood is very peaked. The performance of ARUKF is better than

MWF and comparable to PF. But the ISUKF with a DAMRF non-Gaussian prior

performs the best among all the filters. It is also computationally more efficient than PF.

Fig. 5.7: (a) Cropped portion of a frame from the movie ‘Das testament des Dr. Mabuse’. Output of (b) MWF, (c) PF, (d) ARUKF, and (e) ISUKF.

5.5.2 Real Examples

Fig. 5.7(a) shows a cropped portion of an image frame from the old classic movie ‘Das

testament des Dr. Mabuse’. The film-grain noise is real and shows up clearly on the

white coat, the face etc. Using a homogeneous region in the frame, the measurement

noise variance was found to be 0.04. The output of MWF, PF and the proposed

ARUKF, and ISUKF are shown in Figs. 5.7(b), (c), (d) and (e), respectively. We

observe that ARUKF performs on par with PF but the ISUKF is the most effective

even in real situations. The overall noise level is quite low and the folds on the cloth

become quite discernible in the output of the ISUKF.

Fig. 5.8: (a) Face image with real film-grain noise. Image estimated using (b) MWF, (c) PF, (d) ARUKF, and (e) ISUKF.

Next, a face image with film-grain noise is shown in Fig. 5.8(a). Residual film-

grain noise is visible in the output of MWF (Fig. 5.8(b)). The PF result (Fig. 5.8(c))

is comparatively better. The result using ARUKF (Fig. 5.8(d)) is marginally sharper

than PF. The output of ISUKF (Fig. 5.8(e)) has all edges intact, and is least affected

by grains. Finer details such as lips and eye-brows are well-preserved. The image

obtained using ISUKF is visually striking in appearance and appears the most natural

among them all.

Fig. 5.9(a) shows yet another real image with film-grain noise. Figs. 5.9(b), (c),

(d) and (e) depict the output from MWF, PF, ARUKF and ISUKF, respectively. The

result obtained using ARUKF is comparable to PF. The ISUKF output is again the

best. The edges have been preserved well and the grainy appearance in the original

degraded image of Fig. 5.9(a) is greatly mitigated.

Fig. 5.9: (a) A real building image. Output image obtained using (b) MWF, (c) PF, (d) ARUKF, and (e) ISUKF.

Finally, we consider the locomotive image shown in Fig. 5.10 (a). The result

obtained using the PF with 500 samples is shown in Fig. 5.10 (b). We note that

in spite of some blurring of the letters and edges, the PF could not effectively remove

the noise in the white and black homogeneous regions. The image estimated by the

ARUKF (Fig. 5.10 (c)) is more effective in removing noise but it tends to blur details.

However, the output of the ISUKF, shown in Fig. 5.10 (d), preserves the sharpness of

the details very well (for example, the Tamil letters) and has very little noise.

Fig. 5.10: Cropped portion of a locomotive captured with a film camera. (a) Original with real film-grain noise. Image estimated using (b) PF, (c) ARUKF, and (d) ISUKF.

Unlike the particle filter, in which a large number of samples must be propagated

throughout the algorithm in an attempt to faithfully represent the posterior distribution,

the proposed method needs as few as 50 to 100 samples to reliably

estimate the first two moments of the non-Gaussian prior during prediction. At each

pixel, the computational requirements are as follows: The particle filter, with S samples,

requires O(S) operations in weight calculations and O(S log S) comparisons in the

resampling step [41]. The proposed ISUKF requires O(L) operations in prediction (i.e.,

in the importance sampling step), $n_a^3/6 + n_a^2$ multiplications and additions for the matrix

square-root in sigma point calculation, and O(2na + 1) operations for the UKF update,

where L is the number of samples in the IS step and na is the dimension of the augmented

state vector. Though the MWF is very simple and requires only O(MN log MN) operations

overall, its performance is comparatively inferior. Table 5.1 summarizes the

computational complexity of these filters.

Table 5.1: Computational complexity comparison at each pixel.

| Operation | PF (S = 200) | ARUKF (na = 5) | ISUKF (L = 100, na = 2) |
| --- | --- | --- | --- |
| Non-linear functional evaluations | O(S) | O(2na + 1) | O(L) + O(2na + 1) |
| Additions and multiplications | O(S) | $n_a^3/6 + n_a^2$ + O(2na + 1) | O(L) + $n_a^3/6 + n_a^2$ + O(2na + 1) |
| Inequality comparisons | O(S log S) | 0 | 0 |
| Random number generations | O(S) | 0 | O(L) |
| Exec. time (over 200 x 200 image) | 148 seconds | 14 seconds | 18 seconds |

5.6 DISCUSSION

In this chapter, we proposed a UKF-based approach to estimate images corrupted by

film-grain noise. We first considered the extension of the 1-D UKF for image estima-

tion with an auto-regressive state model. A small set of NSHP neighbors determines

the dimension of the state, and is augmented with noise variables. Statistics of the

augmented vector are employed to determine the sigma points which enable prediction

of prior state and measurement statistics by direct propagation through the AR state

and observation models.

To further enhance performance, we incorporated an edge-preserving MRF prior

(tailored for film-grain noise) into the recursive UKF framework. The predicted statistics

of the state conditional prior, obtained by importance sampling, determine the predicted

sigma points, which are used for the prediction of the measurement statistics, and

employed in the UKF update. To recover from the non-linear degradation due to film-

grain noise, we used the exact exposure domain relation of the photographic film as

the observation model for the UKF. Experimental results were given to demonstrate

the effectiveness of the proposed approaches.

CHAPTER 6

DESPECKLING SAR IMAGERY

In this chapter, we address the problem of speckle reduction in synthetic aperture radar

(SAR) imagery. We demonstrate excellent despeckling in conjunction with feature

preservation by incorporating a modified DAMRF prior, tailored for noise removal in

SAR imagery, within the unscented Kalman filter framework. The performance of the

proposed method is evaluated on both synthetic and real examples, and compared

with existing methods.

6.1 IMAGE FORMATION

Synthetic aperture radar (SAR) imaging is an alternative approach to remote earth

observation and provides several advantages over visible/infrared sensing technology.

Because radar is an active sensing system that provides its own illumination source,

it does not rely on energy reflected or radiated from the earth’s surface. Radar can,

therefore, gather imagery day or night. Additionally, radar operates in the microwave

region of the electromagnetic spectrum [147]. These waves can penetrate clouds, haze,

and rain, thus allowing operation even in unfavorable weather conditions which typi-

cally preclude the use of visible/infrared systems. The use of microwaves also allows

for the observation of earth properties that are unique to the microwave region and

are not detectable with visible and infrared systems.

SAR systems produce 2D images of mapped areas. A radar is an electromagnetic

wave sensor having a pulsed microwave transmitter and a phase-coherent receiver.

The radar is carried by a moving platform such as an aircraft or a satellite. A pulse

transmitter signal is radiated by the antenna, reflected from the target, and sensed by

the receiver. The reflected signal time delay is proportional to target range (distance).

The range resolution δr is determined by the effective radar pulse length τ as δr = cτ/2

where c is the velocity of light [148].

The resolution capability of a radar is specified in the range and azimuth directions,

which correspond to the two orthogonal axes of the processed image. SAR systems

achieve high resolution images by the use of very short transmitted pulses with wide

bandwidths in the range direction, and by the use of a synthesized antenna (aperture)

and coherent processing of the phase history of the echo signals from many successive

pulses in the cross-range direction [148].

A digital image generated from SAR echo-returns is represented by spatial varia-

tions of pixel intensities. The gray level of a pixel in the image is proportional to the

processed received power resulting from the backscattering produced by the ground

area represented by that pixel. A difference in gray level for two adjacent features

on an image is due to a difference between their individual reflectivities, since system

and propagation factors are essentially the same for both the features. The resultant

image becomes a 2-D map of the scene reflectivity factor [22].

A SAR system coherently records the amplitude and the phase echoes from a

distant target. Since each resolution cell of the system contains several scatterers,

and since the phases of the returned signals from the scatterers are randomly dis-

tributed, the inherent coherent processing involved results in noise-like interference

patterns called speckle [5]. Grainy in appearance, speckle noise is primarily due to

phase fluctuations of the electromagnetic return signals [149]. This affects the radio-

metric resolution, which is the ability of a SAR to distinguish different objects in the

scene on the basis of their electromagnetic signatures [150]. Speckle noise severely

impedes automatic scene segmentation and interpretation, and limits the resolution of

SAR images as well as their utility. The typical spatial resolution of SAR images is comparable

to the size of some of the objects of interest within the scene, such as houses,

trees or vehicles.

A fully-developed speckle can be modeled as random multiplicative noise. If s

represents the original image and v is speckle noise, then the degraded observation y

is given by the relation

y(m, n) = s(m, n) · v(m, n) (6.1)

Noise v is assumed to be independent of s with unit mean and variance σ2v . The

multiplicative nature of speckle complicates the noise filtering process.
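For simulation purposes, the model of Eq. (6.1) can be sketched as follows. The sketch assumes Gamma-distributed intensity speckle, for which an L-look image has unit mean and variance 1/L, so the shape parameter is set from the desired variance σ2v. The function name and the constant test scene are hypothetical.

```python
import numpy as np


def add_speckle(s, var_v, rng):
    """Degrade s by unit-mean multiplicative Gamma speckle, y = s * v (Eq. 6.1).

    A Gamma variate with shape 1/var_v and scale var_v has mean 1 and
    variance var_v; the equivalent number of looks is 1/var_v.
    """
    v = rng.gamma(shape=1.0 / var_v, scale=var_v, size=s.shape)
    return s * v, v


rng = np.random.default_rng(0)
scene = np.full((400, 500), 50.0)                    # flat synthetic scene
noisy, v = add_speckle(scene, var_v=0.5, rng=rng)    # 2-look speckle
```

Because the noise scales with the signal, the standard deviation of `noisy` grows with the local intensity of the scene, which is the difficulty that the filters discussed below have to address.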

6.1.1 SAR Methods

In general, a speckle suppression filter is expected to effectively filter homogeneous

areas, retain image texture and edges, and preserve features (both linear and point-

type). In the Lee filter [63], the multiplicative model is first approximated by a linear

combination of the local mean and the observed pixel. Then, a minimum mean square

error (MMSE) estimator is applied to determine the weighting constant. The Frost

filter [62] is an adaptive and exponentially-weighted averaging filter. The weights

are based on the coefficient of variation c which is the ratio of the local standard

deviation to the local mean of the degraded image, the distance |t| of a pixel from the

central pixel, and the damping factor K. The exponential kernel weights are given by

w = exp[−Kc|t|]. The enhanced Lee and enhanced Frost filters proposed by Lopes

et al. [65] divide an image into homogeneous, heterogeneous areas, and isolated point

targets based on the value of the coefficient of variation c (low, intermediate, and

high, respectively). The main principle behind adaptive filters is that they approach

the local mean at homogeneous regions. At points of high activity they tend to retain

the original observation pixel. The disadvantages are over-smoothing of image texture

and ineffective denoising around edges.
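The behaviour of such adaptive filters can be illustrated with a minimal Python sketch of the exponential weighting quoted above, w = exp[−Kc|t|], applied to a single window. This follows the expression as stated in the text rather than any particular published implementation; the function name is hypothetical and |t| is taken to be the Euclidean distance from the central pixel.

```python
import numpy as np


def frost_filter_pixel(window, K):
    """Filtered value of the central pixel of an odd-sized square window."""
    window = np.asarray(window, dtype=float)
    mean = window.mean()
    # Coefficient of variation: local standard deviation over local mean.
    c = window.std() / mean if mean > 0 else 0.0
    rows, cols = window.shape
    yy, xx = np.mgrid[0:rows, 0:cols]
    t = np.hypot(yy - rows // 2, xx - cols // 2)   # distance from the centre
    weights = np.exp(-K * c * t)
    return float((weights * window).sum() / weights.sum())


flat = np.full((3, 3), 7.0)
point_target = np.ones((3, 3))
point_target[1, 1] = 100.0
```

In the flat window c = 0, so all weights are equal and the output is the local mean. In the point-target window c is large, the weights decay quickly away from the centre, and the output stays close to the observed central pixel.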

Wavelet despeckling approaches have been quite successful and are based on mod-

ifying the (log-transformed speckle) noisy wavelet coefficients according to some rule

(shrinkage) and reconstructing the filtered image from them. In [151], a Kalman

shrink maximum a posteriori (MAP) estimator is applied on high sub-band wavelet

coefficients. Xie et al. [71] have proposed a despeckling algorithm that fuses Bayesian

wavelet denoising with a regularizing prior. A MAP estimator with an alpha-stable

prior within the wavelet framework is proposed in [72] for despeckling. A wavelet

despeckling method based on Bayesian shrinkage which relies on edge information

has been proposed in [13]. Argenti et al. [14] propose despeckling in the undecimated

wavelet domain using a space-varying generalized Gaussian distribution for the wavelet

coefficients. In [149], a model for SAR imagery based on MRFs is proposed to exploit

the characteristics of speckle.

6.2 SAR METRICS

To compare the performance of different despeckling techniques many metrics have

been proposed in the literature. These criteria measure how well speckle

is reduced and to what extent important details, such as point and line targets, are

preserved by any algorithm.

Some commonly prevalent metrics for determining the performance of SAR algo-

rithms are given below.

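1. Signal to mean square error ratio:

$$\mathrm{S/MSE} = 10 \log_{10}\left(\frac{\sum_{m,n} s(m,n)^2}{\sum_{m,n} \left(s(m,n) - \hat{s}(m,n)\right)^2}\right),$$

where $\hat{s}$ denotes the estimated image. The higher the S/MSE value, the closer the filtered image is to the original.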

2. Edge Correlation Factor (ECF) yields a measure of edge-preservation capability

[152]. The ISNR is not always an accurate measure of noise suppression in

images. For example, it need not well-account for the preservation of edges.

Therefore, we use a supplementary performance evaluation based on correlation

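as proposed in [152]. ECF, which is a measure of edge preservation, is computed as

$$\mathrm{ECF} = \frac{\Gamma\left(\Delta s - \overline{\Delta s},\; \Delta\hat{s} - \overline{\Delta\hat{s}}\right)}{\sqrt{\Gamma\left(\Delta s - \overline{\Delta s},\; \Delta s - \overline{\Delta s}\right)\, \Gamma\left(\Delta\hat{s} - \overline{\Delta\hat{s}},\; \Delta\hat{s} - \overline{\Delta\hat{s}}\right)}}$$

where $\Delta s$ and $\Delta\hat{s}$ are the high-pass filtered versions of the original and estimated images, respectively,

obtained with a 3×3 Laplacian operator, an overbar denotes the mean value, and $\Gamma(s_1, s_2) = \sum_{m,n} s_1(m,n)\, s_2(m,n)$.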

The correlation measure should be close to unity when the estimated image is

similar to the reference image.

3. Equivalent Number of Looks (ENL) measures the speckle suppression in a homogeneous

area of the image and is given by $\mathrm{ENL} = (\mu_f / \sigma_f)^2$. Here, the mean value

$\mu_f = \frac{1}{m_1 n_1} \sum_{m,n} \hat{s}(m,n)$ and the standard deviation $\sigma_f = \sqrt{\frac{1}{m_1 n_1} \sum_{m,n} \left(\hat{s}(m,n) - \mu_f\right)^2}$

are for a (chosen) uniform area of dimension m1×n1 in the image. A large ENL

corresponds to better speckle suppression. This measure is of more importance

in the real situations, where we do not have S/MSE or ECF .

4. Figure of Merit (FOM) [15] gives a quantitative evaluation of detection of true

edges and suppression of false edges (in synthetic experiments). To assess FOM,

an edge map is created by applying the Roberts mask [153] on the filtered images.

Then, Pratt’s [154] FOM is adopted as

$$\mathrm{FOM} = \frac{1}{\max(N_f, N_I)} \sum_{m,n} \frac{1}{1 + \xi d^2(m,n)},$$

where Nf and NI are the number of detected and ideal edge pixels, respectively.

Here, d(m, n) is the distance between the (m, n)th detected edge pixel and the

nearest ideal edge pixel, and ξ is a constant which is typically set to 1/9. FOM

ranges between 0 and 1. A high value indicates superior edge rendition.
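The first three metrics are straightforward to compute. The Python sketch below is illustrative only: the function names are hypothetical, and the 3×3 Laplacian is assumed to be the 4-neighbor kernel with edge replication at the image border, since the text does not specify which variant was used.

```python
import numpy as np


def s_mse_db(s, s_hat):
    """Signal to mean square error ratio in dB."""
    return 10.0 * np.log10(np.sum(s ** 2) / np.sum((s - s_hat) ** 2))


def enl(region):
    """Equivalent number of looks of a (chosen) homogeneous region."""
    return (region.mean() / region.std()) ** 2


def laplacian3(img):
    """3x3 Laplacian (4-neighbor kernel assumed), borders replicated."""
    p = np.pad(img, 1, mode="edge")
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
            - 4.0 * p[1:-1, 1:-1])


def ecf(s, s_hat):
    """Edge correlation factor between original and estimated images."""
    a = laplacian3(s)
    b = laplacian3(s_hat)
    a = a - a.mean()
    b = b - b.mean()
    return np.sum(a * b) / np.sqrt(np.sum(a * a) * np.sum(b * b))
```

Note that `np.std` divides by the number of pixels in the region, which matches the definition of σf given above. ECF is invariant to a gain and offset applied to the estimate, so it complements S/MSE rather than replacing it.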

6.3 NOISE REDUCTION IN SAR

The multiplicative model of speckle noise complicates recursive propagation in the KF

(or EKF) formulation. However, the UKF provides a mathematically tractable way

to propagate the first two moments even in the presence of multiplicative noise. The

propagation and the calculation of these moments are based on sigma points. This

facilitates incorporation of non-additive noise in the recursive estimation procedure.

We first used the ARUKF filter for suppressing speckle. As earlier, we employed

the homogeneous AR model for state prediction based on a three-pixel NSHP neighbor-

hood. The sigma points were generated by assuming some initial mean and covariances

for the state and these points were propagated through the AR state model and then

through the pure multiplicative observation model resulting in the prediction of state

and measurement sigma points. The UT was employed to determine the predicted

first two statistics and the update was performed as in the Kalman filter. The image

estimated by the ARUKF preserves edges and fine features but at the cost of letting

in noise. The effect of speckle can be suppressed by using a higher covariance value

for the state noise, but this blurs edges and point features which affects recognition of

fine details.

6.3.1 Speckle Suppression using ISUKF

In this section, we present the ISUKF algorithm which we specifically tailor to de-

speckle SAR images. We formulate the DAMRF prior as earlier and propose to use

$$p\left(s(m,n)/s(m-i,n-j)\right) = \exp\left(-\gamma \log\left(1 + \frac{\eta^2(s(m,n), s)}{\gamma}\right)\right) \qquad (6.2)$$

where (i, j) ∈ (0, 1), (1, 0), (1, 1), (1,−1) for a first-order NSHP neighborhood. How-

ever, η2(s(m, n), s) is formulated as follows.

$$\eta^2(s(m,n), s) = \frac{1}{\rho^2_{(m,n)}} \sum_{(i,j)} \left(s(m,n) - s(m-i,n-j)\right)^2. \qquad (6.3)$$

Due to the need for effective smoothing of speckle in SAR images, and for preservation

of point features, we vary the scale parameter ρ2(m,n) based on the estimated past NSHP

pixels. Note that ρ2(m,n) is not set to the previous covariance estimate as in film-grain

noise removal (section 5.4.1). This is based on the following observation.

When DAMRF modeling is used in conjunction with the multiplicative noise model

in the UKF, we observed that a very low value of the parameter ρ2(m,n) is required in

low-intensity regions (compared to high-intensity regions) to perform an equivalent

amount of smoothing. To achieve overall good smoothing performance, we vary ρ2(m,n)

monotonically with the mean of the first-order NSHP neighborhood as $\rho^2_{(m,n)} = \zeta \mu_s^{1.5}$,

where $\mu_s = \frac{1}{4}\left(s(m, n-1) + s(m-1, n) + s(m-1, n-1) + s(m-1, n+1)\right)$. Here, ζ

can be used as a tuning parameter to vary the amount of smoothing depending on the

image texture.

We summarize the ISUKF algorithm for estimating each pixel sequentially in the

presence of speckle as follows:

1. At each pixel, we formulate the state conditional density using the DAMRF

density (Eq. 6.2) with the energy function given in Eq. (6.3) employing the

past estimates.

2. We employ importance sampling technique to estimate the first-two moments of

the above conditional pdf as in step 2 of ISUKF (section 5.3.1), which constitute

the predicted state mean x and covariance Px (see footnote 1).

3. We augment with the speckle noise statistics and employ the UT on the aug-

mented mean xa and covariance Pa to determine the corresponding sigma point

matrix Xa (as in step 3 (a) of ISUKF, section 5.3.1). Since the augmented ran-

dom variable has dimension 2, the number of sigma points will be 5, resulting

in Xa of dimension 2× 5.

4. Next, we apply the measurement model independently on each of the predicted

state and measurement noise sigma points to obtain the sigma points corre-

sponding to the measurement as

$$Y_{(m,n)/(m,n-1)} = X^x_{(m,n)/(m,n-1)} \;.\!*\; X^v_{(m,n-1)} \qquad (6.4)$$

This is in accordance with the multiplicative noise model of Eq. 6.1. Here, Xx

and Xv are formed from the first and the second (which is the last) row of Xa,

respectively. Operation ‘.*’ represents element-by-element multiplication.

5. Using the predicted state and measurement sigma points, and the UT determin-

istic weights, we predict the mean and covariance of state and measurements

(x,Px, y,Py) and also the cross-covariance of the state and measurement Pxy.

6. Based on the state and measurement statistics and using the recent observa-

tion y(m, n), we employ the Kalman filter update to estimate the mean x and

covariance Px of the state.

The estimated state x(m, n) corresponds to the despeckled pixel.
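Steps 1 and 2 above can be sketched in Python as a self-normalized importance sampler. This is a minimal sketch under stated assumptions, not the thesis code: the function names are hypothetical, the Cauchy sampler is centred at the neighborhood mean as described in section 5.3.1, its scale is arbitrarily set to the square root of ρ2(m,n), and γ = 1.8 is borrowed from the film-grain experiments.

```python
import numpy as np


def damrf_prior(x, neighbors, rho2, gamma=1.8):
    """Unnormalized DAMRF density of Eq. (6.2) with the energy of Eq. (6.3)."""
    eta2 = np.sum((x[:, None] - neighbors[None, :]) ** 2, axis=1) / rho2
    return np.exp(-gamma * np.log1p(eta2 / gamma))


def predict_moments(neighbors, zeta=0.02, gamma=1.8, n_samples=100, rng=None):
    """Predicted mean and variance of the pixel given its four NSHP neighbors."""
    rng = np.random.default_rng() if rng is None else rng
    mu_s = neighbors.mean()
    rho2 = zeta * mu_s ** 1.5             # intensity-dependent scale parameter
    scale = np.sqrt(rho2)                 # assumed Cauchy scale (not from the text)

    # Draw from a Cauchy proposal located at the neighborhood mean.
    x = mu_s + scale * rng.standard_cauchy(n_samples)
    q = 1.0 / (np.pi * scale * (1.0 + ((x - mu_s) / scale) ** 2))

    # Self-normalized importance weights: target density over proposal density.
    w = damrf_prior(x, neighbors, rho2, gamma) / q
    w = w / w.sum()

    mean = np.sum(w * x)
    var = np.sum(w * (x - mean) ** 2)
    return mean, var
```

The resulting mean and variance play the role of x and Px in step 2, and would then be augmented with the speckle statistics in step 3. Since the weights are normalized, the unknown normalizing constant of the prior never has to be computed.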

Footnote 1: These are scalars as described in section 5.3.1. We have dropped the subscripts referring to pixel

location, for brevity.

Fig. 6.1: (a) Original image. (b) Noisy version. Output of (c) Enhanced Lee (FOM = 0.92, S/MSE = 20.51 dB), (d) Frost (FOM = 0.94, S/MSE = 20.51 dB), and (e) the proposed method (FOM = 0.98, S/MSE = 22.83 dB).

We note that the propagation of the variance parameter ρ2(m,n) of the DAMRF

prior for SAR is different from that for film-grain noise in which ρ2(m,n) was directly

set to the previous covariance estimate. In this chapter, the variance parameter ρ2(m,n)

(in Eq. 3.9) is propagated only based on neighboring estimates and tuned with a free

scaling parameter ζ to effectively handle the wide variety of contextual features in

SAR images. The ISUKF leads to significantly better performance than the AR-based

UKF both in terms of (local) smoothing of uniform regions and in preserving point

and edge features.


In this section, we present results obtained using the proposed ISUKF method when

applied on SAR images. We compare the performance of the proposed approach

with both standard and recent methods. We note that in the proposed approach,

the parameter ζ of the DA model is image-dependent. Based on our experiments, it

typically takes a value between 0.01 and 0.03 for moderate noise.

Consider the edge image shown in Fig. 6.1(a). This image is degraded with

simulated 2-Look Gamma-distributed speckle noise (Fig. 6.1(b)). The outputs of the

Enhanced Lee [65], Frost [62] and the proposed ISUKF method are shown in Figs.

6.1(c), (d) and (e), respectively. The images in Figs. 6.1 (c) and (d) have a grainy

appearance in the white uniform region due to residual speckle. The proposed method

Fig. 6.2: (a) Original SAR image. (b) Degraded ($\sigma_v^2 = 0.04$). Image estimated using (c) the Enhanced Lee filter, (d) the Frost filter, (e) AR-based UKF, and (f) the ISUKF.

is not only effective in despeckling but also preserves the transition between the
dark and bright regions well. Our method yields an almost exact edge map (close to that of
the original) even though the noise level is high. This is also reflected in the S/MSE
and FOM values, which are higher than those of the standard filters (as
given in Table 6.1(a)).
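
The simulated degradation used in such experiments can be sketched as follows. This is a sketch under stated assumptions rather than the exact procedure used here: it assumes intensity-format data and the common model in which L-look speckle is Gamma-distributed with unit mean and variance 1/L. The function name and its interface are hypothetical.

```python
import numpy as np


def add_gamma_speckle(image, looks=2, seed=None):
    """Multiply an image by unit-mean Gamma speckle with variance 1 / looks."""
    rng = np.random.default_rng(seed)
    # Gamma with shape = looks and scale = 1 / looks has mean 1, variance 1 / looks.
    speckle = rng.gamma(shape=looks, scale=1.0 / looks, size=np.shape(image))
    return np.asarray(image, dtype=float) * speckle
```

Under this assumption, the 2-look case corresponds to a speckle variance of 0.5, which is a high noise level; a variance of 0.04 would correspond to 25 looks.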

The image used for the next experiment is shown in Fig. 6.2(a). The degraded

image with simulated speckle noise of variance $\sigma_v^2 = 0.04$ is shown in Fig. 6.2(b). The

despeckled images obtained with the Enhanced Lee filter and the Frost filter are shown

in Figs. 6.2(c) and (d), respectively. We observe that dark regions are over-smoothed

while the high intensity regions have a grainy appearance. The image obtained using

the ARUKF is shown in Fig. 6.2(e). The output is visually inferior to that of the Frost

filter but it does not over-smooth dark regions. The output of the proposed ISUKF is

Table 6.1: Quantitative comparison with standard filters.

(a) Standard filters (Fig. 6.1)

Filter                    S/MSE (dB)   FOM     ENL
Enhanced Lee (K = 1)      20.51        0.92    305
Frost (K = 3)             20.51        0.94    361.15
ISUKF (ζ = 0.01)          22.83        0.98    3071.00

(b) Standard filters (Fig. 6.2)

Filter                    S/MSE (dB)   ECF     ENL
Noisy (at (170, 160))     0            0.242   24.90
Enhanced Lee (K = 8)      22.28        0.217   459.06
Frost (K = 12)            23.35        0.199   559.18
ARUKF                     21.38        0.639   270.00
ISUKF (ζ = 0.025)         22.15        0.624   1041.00

shown in Fig. 6.2(f). Note that incorporating a non-Gaussian MRF prior in the UKF

gives a significant improvement over the ARUKF. The image in Fig. 6.2(f) is visually

closest to the original. Comparative metrics for this example are given in Table 6.1(b).

We also compared our method with the adaptive MAP technique in [12] which is

based on a heavy-tailed Rayleigh model. In Figs. 6.3 (a), (b) and (c), we reproduce

the original, degraded and the output image from [12], respectively. Even though the

method proposed in [12] reduces speckle, the output is blurred. The image estimated

using the proposed ISUKF approach (Fig. 6.3(d)) when used on the same degraded

image is sharp and even the fine details are recovered well. The output of our method

has higher S/MSE and ECF values as given in Table 6.2(a). Because the output

of [12] is blurred, its ENL value is marginally higher. Note that the output of the

proposed method is much closer to the original image (texture-wise also).

Next, we considered the case of a Horse track image that is affected by real speckle

noise (Fig. 6.4(a)). The output of the Frost filter, shown in Fig. 6.4(b), contains

noticeable residual speckle in uniform regions. Also, the edge boundaries are noisy.

Fig. 6.3: (a) Original aerial image. (b) Degraded image. Image estimated using (c) Rayleigh prior MAP estimator [12], and (d) the proposed ISUKF.

Fig. 6.4: (a) The Horse track image. Estimated output image using (b) the Frost filter, and (c) our method.

The image estimated by the ISUKF method is shown in Fig. 6.4(c). Our method

is significantly more effective in smoothing speckle over uniform regions (such as the

top-left and bottom-right gray regions); yet the edges are sharp and clear. Even the

small white blobs on the top-right corner are recovered well. Table 6.2(b) gives a

quantitative comparison with other filters. The cropped 60× 60 region that was used

to calculate the ENL is centered at (115, 55). The performance of our method is

evidently superior even in the real case.
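
For reference, the ENL over such a cropped region can be computed as sketched below. The sketch uses the usual definition of the equivalent number of looks as the squared mean divided by the variance of a homogeneous region; the tabulated µf, σf and ENL values are consistent with this ratio up to rounding. The function name and the (row, column) ordering of the center are assumptions.

```python
import numpy as np


def enl(image, center, size=60):
    """Equivalent number of looks of a size x size crop centered at `center`."""
    row, col = center  # assumed (row, column) ordering
    half = size // 2
    patch = np.asarray(image, dtype=float)[row - half:row + half,
                                           col - half:col + half]
    # ENL = mean^2 / variance over a region assumed to be homogeneous.
    return patch.mean() ** 2 / patch.var()
```

A larger ENL indicates stronger smoothing of the region, which is why a blurred output can score well on this measure while losing detail.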

Figure 6.5(a) shows an image of Bedfordshire (in Southeast England) with real

speckle. The image estimated using a Bayesian wavelet filter which relies on edge

information is reproduced from [13] and is given in Fig. 6.5(b). The output obtained

Table 6.2: Quantitative comparison.

(a) Rayleigh MAP-based (Fig. 6.3)

Filter                         S/MSE (dB)   ECF    ENL
Noisy (at (165, 100))          0            0.67   13.44
Rayleigh MAP technique [12]    18.57        0.61   17.26
ISUKF (ζ = 0.03)               21.14        0.74   16.42

(b) Real speckle (Fig. 6.4)

Filter                         µf       σf      ENL
Noisy (at (115, 55))           110.96   28.73   14.92
Frost (K = 10)                 110.46   8.92    153.10
Enhanced Lee (K = 6)           110.47   9.56    133.66
ISUKF (ζ = 0.01)               112.18   7.88    202.49

Fig. 6.5: (a) Bedfordshire image. Result using (b) edge-based Bayesian wavelet filter [13], and (c) the ISUKF method.

by the ISUKF filter on the same degraded image of Fig. 6.5(a) is shown in Fig. 6.5(c).

We note that the edges are recovered without any blurring effects while the homoge-

neous regions are almost free from noise. The isolated point targets (two white spots

in the dark homogeneous region on the left) become strikingly visible in Fig. 6.5(c).

A quantitative comparison of the two methods is given in Table 6.3(a).

Table 6.3: Quantitative comparison with wavelet filters.

(a) Wavelet edge-based (Fig. 6.5)

Filter                         µf      σf      ENL
Noisy (at (90, 160))           89.73   25.73   12.16
Bayesian wavelet filter [13]   90.85   13.87   42.87
ISUKF (ζ = 0.015)              89.25   13.03   46.91

(b) UDWL-MAP (Fig. 6.6)

Filter                         µf      σf     ENL
Noisy (at (280, 100))          52.10   8.73   35.61
UDWL [14]                      54.06   3.04   316.06
ISUKF (ζ = 0.004)              51.96   2.83   337.9

Fig. 6.6: (a) Airport image. Result using (b) MAP estimator in UDWL domain [14], and (c) the proposed method.

We also compared our method with a very recent wavelet despeckling technique

[14] developed in the undecimated wavelet (UDWL) domain. Fig. 6.6(a) shows a real

SAR image of an airport. This is a very difficult example as it contains lots of weak

edges. The outputs of the method in [14] and the proposed method are shown in Figs.

6.6(b) and (c), respectively. From a visual comparison, it is quite evident that the

ability of the ISUKF method in capturing soft edges is superior to that of [14]. Even

Fig. 6.7: (a) Urban SAR image. (b) Degraded image. Image estimated using (c) wavelet-based filter [15], and (d) ISUKF.

the ENL for our method is higher (Table 6.3 (b)).

We next present a few more results, only for visual comparison with some well-known

existing methods, to demonstrate the effectiveness of the proposed ISUKF method on

different kinds of SAR images. Figs. 6.7(a), (b) and (c) show the original, degraded

and the image estimated using the wavelet filter with soft-thresholding (from [15]),

respectively. The image obtained by using the ISUKF is shown in Fig. 6.7(d). We

observe that the proposed method is more effective in bringing out the narrow sharp

details in the bottom middle regions while effectively removing the overall speckle.

Next, we compare the proposed method with an MRF-based approach which obtains
an estimate by simulated annealing with a Metropolis sampler [16]. Figs. 6.8 (a)

Fig. 6.8: (a) Real SAR image. Image estimated using (b) simulated annealing and Metropolis estimator [16], and (c) ISUKF.

and (b) show the real noisy SAR image and the output of the method in [16], respec-

tively. The image estimated using the proposed ISUKF is shown in Fig. 6.8(c). Our

approach could effectively suppress speckle while preserving image texture and sharp

details.

The above examples clearly demonstrate the effectiveness of the proposed ISUKF

in suppressing speckle while simultaneously preserving point and fine features. It

generally leads to a better visual output and also compares favorably in quantitative

performance as well as in computational complexity with other methods. It has the

same computational complexity as given in chapter 5 (Table (5.1)), and executes in 18

seconds using Matlab on a Pentium 4 PC with 256 MB RAM for a 200 × 200 image.

6.5 DISCUSSION

In this chapter, we explored the applicability of ISUKF for despeckling SAR imagery.

The formulation of the DAMRF prior was modified to preserve fine features in a wide

range of SAR images. A small set of sigma points was used to capture and propagate

the first two moments of the state and measurement noise through the multiplicative

speckle model.


CHAPTER 7

JOINT INPAINTING AND DENOISING

Old films or photographs usually suffer from damages due to physical and/or chemical

effects resulting in stain, scratch, scribbling, noise, and digital drop-out in frames. In-

painting is a technique for modifying user-specified regions of an image undetectably.

It provides a means for reconstruction of known damaged portions. It serves a wide

range of applications, such as removing superimposed text like dates, subtitles, public-

ity or logos from still images or videos, and reconstructing scans of deteriorated images

by removing scratches or stains [6, 18].

In this chapter, we present a scheme which can simultaneously filter film-grain

noise while filling-in identified damaged portions in an image within the unscented

Kalman filter (UKF) framework. We observe that a key issue in handling inpainting

with a recursive filter lies in arriving at the observations in the regions to be inpainted

based on the surrounding available information. Care must be taken to preserve the

edges through the inpainting regions. We first demonstrate the validity of our scheme

on line and region-scratches, and text removal. We then incorporate our inpainting

scheme within the UKF framework and demonstrate simultaneous film-grain noise

reduction and inpainting capability of the proposed filter.

7.1 INTRODUCTION

Most inpainting techniques smoothly propagate image information from outside the

boundary into the inpainting region based on edges or isophotes (lines of equal gray

values) [6] and require the user to only specify the region to be inpainted. Kokaram

et al. [23] interpolate losses in films from adjacent frames using motion estimation and

auto-regressive models. The technique, however, cannot be applied to still images or to


films where the regions to be inpainted span several frames. Hirani and Totsuka [155]

combine global frequency and local spatial information in order to fill a given region

with a selected texture. The algorithm requires the user to select the texture to be

copied into the region to be inpainted. Masnou and Morel [24] define dis-occlusion as

the recovery of hidden parts of objects in an image by interpolation from the vicinity

of the occluded area and perform inpainting by joining the points of the isophotes at

the boundary of the region to be inpainted.

Bertalmio et al. [6] formulate partial differential equations (PDEs) that smoothly

propagate the boundary information (the Laplacian of the image) in the direction of

the isophotes, estimated by the image gradient rotated by 90 degrees. Their algorithm

shows that both the gradient direction (geometry) and the gray-scale values (pho-

tometry) of the image should be propagated inside the region to be filled-in. They

demonstrate the generality of their method with various applications including region-

filling, text-removal, and special-effects restoration of old photographs. In [156], an

exemplar-based inpainting algorithm for filling-in large objects is proposed by com-

bining the principles of texture-synthesis and traditional inpainting. Another class

of inpainting algorithms, which stresses exploiting geometric image models in a

Bayesian framework has been proposed by Chan and Shen in [18,157]. The first model

introduced by them uses a total variation-based image model [158]. This model can

successfully propagate sharp edges into the damaged domain. However, because of a

regularization term, the model exacts a penalty on the length of edges, and, thus, the

inpainting model cannot connect contours across very large distances. Subsequently,

Chan et al. [159] introduced the Mumford-Shah model that allows both for isophotes

to be connected across large distances, and their directions to be kept continuous across

edges in the inpainting region.

Oliveira et al. [160] have proposed a fast and simple inpainting method which

repeatedly convolves a Gaussian 3 × 3 filter over the missing regions to diffuse in-

formation. However, it requires specification of the diffusion barriers (high-gradient

areas) manually. Telea [161] uses intensity gradient and distance information of the


boundary pixels and weighs them by the isophote direction. Yet another inpaint-

ing algorithm [162] uses the Sobel edge operator’s magnitude and angle to compute

isophotes. An inpainting method based on belief propagation is proposed in [163].

An exemplar-based image completion method that unifies texture synthesis and image

inpainting with a priority belief propagation-based optimization is proposed in [164].

Rares et al. [17, 165] propose inpainting methods which use edge information both

for the reconstruction of the skeleton image structure in missing areas, as well as for

guiding the interpolation that follows.

In [166], a PDE model based on the Cahn-Hilliard equation [167] is proposed for

binary inpainting of degraded text. The image repairing methods in [168, 169] employ
texture segmentation followed by robust, non-iterative multi-dimensional tensor voting
[170] to globally infer the most suitable pixel value in a neighborhood by using the
MRF assumption. In recent years, there has been great interest in extensions such as video

inpainting [171], video repair [172], video stabilization [173] and deinterlacing of video

sequences [174].

A very recent development in inpainting is simultaneous recovery from noise as well

as damages. In [175], a PDE-based method is proposed to fill-in missing information

while removing noise. Inside the inpainting domain, smoothing operation is carried

out by mean-curvature-flow to enable transportation of boundary information, while

outside of the inpainting domain smoothing is encouraged within homogeneous regions

and discouraged across boundaries. In this chapter, we first propose an inpainting

method that is guided by edge information. We next embed noise filtering into the

inpainting framework.

7.2 AN EDGE-BASED APPROACH

Since image information in different objects need not generally be correlated, and the objects
are separated by edges, edge-based inpainting methods first reconstruct explicit edge
information in the damaged region. Filling-in then follows within each object, guided

Fig. 7.1: A typical edge-based inpainting methodology (from [17]): (left) general algorithm outline, and (right) an illustration of the outputs for each stage.

by the reconstructed edges. The user is required to specify the region to be inpainted.

Edge-based inpainting algorithms mainly involve two steps: i) reconstruction of the

edge image (in the masked regions) from damaged image edges and those available in

the unmasked regions, and ii) an edge-based pixel aggregation procedure. Consider an

image I that has lost information in region Ω(⊂ I), the region to be inpainted. Let

δΩ(⊂ I) represent the outer boundary of the inpainting region. Note that observations

are not available inside Ω and virtually anything could have existed there. One must

‘predict’ the image content in Ω based on appropriate assumptions about image prop-

erties such as local continuity and independence across objects separated by edges.

Digital inpainting algorithms are known to undetectably fill-in missing information

provided the damaged area is not too large in size.

The steps employed in a typical edge-based inpainting method and the correspond-

ing outputs are illustrated in Fig. 7.1. Following step 1 in the figure, the boundary

edges are detected and propagated through the damages. Once the edges in the dam-

Fig. 7.2: (a) A real image superimposed with text. (b) Mask specifies the region where the original image information is lost. (c) A damaged photograph, and (d) its mask that specifies the regions to be filled-in.

aged region are reconstructed (as shown in the image corresponding to step 2), the

inpainting guided by the reconstructed edges yields the restored image shown in the

final step. In general, there can be more than one damaged region in an image.

In this section, we present a simple and effective edge-based inpainting method.

The proposed algorithm is similar to the one proposed by Rares et al. [17, 165] in the

sense that the major steps illustrated in Fig. 7.1 are performed, but it differs in the

methodology of implementation. Moreover, here our interest is not to just inpaint

the damaged images but to inpaint the damages while simultaneously filtering the

film-grain noise.

As stated earlier, we assume that the region to be filled in (Ω) is given a priori.

A convenient way to specify the regions to be filled-in is by an image mask which is

a binary image that distinguishes the regions that must be inpainted. For example,

for the image shown in Fig. 7.2 (a), the region to be inpainted (superimposed text)

is shown in the mask image (Fig. 7.2(b)) by white pixels. Fig. 7.2 (d) is another

example of a mask image that shows the regions to be inpainted in Fig. 7.2 (c).

7.2.1 Reconstruction of Edge Image

In any inpainting method, propagating the information in a uniform region is trivial.

The main difficulty arises at lost edges. As edges determine the topology, edge re-


construction needs to incorporate global image structure information in the inpainting

procedure. It is useful to note that the structure of the original image inside the region

Ω is a continuation of the (edge) structure outside it as illustrated earlier in Fig. 7.1.

Edge information can be utilized to independently propagate the information between

any two different objects in the inpainting region. Moreover, once we reconstruct the

edge image, the gray level propagation can be local. Thus, it is intuitive and reason-

able to first reconstruct the edges in the missing data regions to reliably propagate the

boundary image information.

It is important to note that reconstruction of edges from a damaged image is

different from the usual edge-linking process.

• The region of inpainting can be large and can result in much larger breaks in

edges than those that typically arise from edge-detection methods.

• The end points of the broken edges across a damaged region may even be farther

than some possible end points on the same side of the mask.

• The location/distance between pixels and the direction of edges become more

important than gradient magnitude and pixel intensity.

• Since the mask is user-specified and causes the edge gaps, its location and di-

mensions should be used to make the edge-linking process effective.

We propose the following edge-reconstruction procedure which makes effective use

of image features of the end points of edges, the contextual dimensions of the damages,

and the location of the mask. It comprises the following steps.

1. Preprocessing and edge detection: We first find the edge map over Ωc with a

Canny edge operator [176]. The threshold is set high so that only strong edges

are captured. This yields an edge image everywhere except at the masked (to-

be-inpainted) regions. Fig. 7.3 (b) shows the edge information in Ωc derived

from Fig. 7.3 (a).

2. End points and features:

Fig. 7.3: Edge reconstruction: (a) Degraded image showing maximum matching area dimensions. (b) Edge map of the degraded image. (c) Located end points, and (d) the reconstructed edge image.

(a) By searching over the boundary of the masked regions, the end points of

edges are located as points with exactly one neighboring edge pixel within

their eight point neighborhood. The end points thus detected are marked

with dots as shown in Fig. 7.3 (c).

(b) For each end point at location (ik, jk), its gradient direction Dk, magnitude

Gk, and the corresponding pixel intensity Ik are collected to form a feature

vector Fk = [ik, jk, Dk, Gk, Ik].

3. Matching: Our aim is to link edges across (possibly large) damaged regions.

(a) The end points are matched using an area constraint. The local matching

area for an end point is chosen to be a rectangle whose dimensions are set

to the largest breaks Mr and Mc along height and width, respectively, in

the strong edges due to damages. In Fig. 7.3 (a), we show selection of Mr

and Mc for the given degraded image.

(b) Two end points are considered for matching, only if i) the row and column-

wise absolute difference of their spatial locations is less than Mr and Mc,

respectively, and ii) the region between the two points belongs to Ω, the

inpainting region i.e., they are separated by the same damage.


(c) Ratio-test: Consider the end point p1 (with coordinates (i1, j1)) in Fig. 7.3

(c). Note that end point p2 (with coordinates (i2, j2)) as well as p3 (with

coordinates (i3, j3)) satisfy the matching area constraint for p1. In order

to decide which of these two points p1 should be matched to, the damaged
content of the region between the two end points is inferred as follows.

Inside the rectangular region whose opposite corners are determined by the

pixels (i1, j1) and (i2, j2) to be matched, the ratio of masked pixels to the

available pixels is determined. If the ratio is higher than a threshold r, we

decide that the region between these end points is a damaged region and

consider only such points for matching. We note that the rectangular box

enclosing p1 and p3 in Fig. 7.3 (c) includes a larger fraction of available

pixels, in contrast to the rectangular box enclosing p1 and p2. Hence, p2 is

a better candidate-match for p1 than p3.

(d) Based on (b) and (c), we perform constrained matching as follows: Let
$F_{p_m}$ and $F_{p_n}$ denote the feature vectors corresponding to points $p_m$ and
$p_n$, respectively. If $|i_m - i_n| < M_r$ and $|j_m - j_n| < M_c$ and the pixels
in the rectangle satisfy the ratio test (c), then we compute the distance
$\|F_{p_m} - F_{p_n}\|_2$.

(e) The end points with the minimum distance are selected as the matches, i.e.,
$L(p_i) = \arg\min_k \|F_{p_i} - F_{p_k}\|_2$ among all candidate end points $p_k$.

This mask-dependent local matching strategy suppresses false matches very ef-

fectively as compared to naive direct matching which uses only the feature vec-

tors of the end points.

4. Edge linking: Only the mutually matched end points which satisfy pk = L(pm)

and pm = L(pk) are connected to obtain the reconstructed edge map from

the degraded image. In Fig. 7.3 (d), we show the reconstructed edge image

corresponding to Fig. 7.3(a) using the above steps.
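
The end-point detection, ratio test and mutual matching of steps 2 to 4 can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation used in this work: the edge map and mask are boolean arrays, the feature vectors are supplied by the caller (no normalization of their components is assumed), condition (ii) of step 3(b) is folded into the ratio test, and the threshold value `r_thresh` and all function names are hypothetical.

```python
import numpy as np

NEIGHBOR_SHIFTS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0)]


def find_end_points(edges, mask):
    """Edge pixels bordering the mask with exactly one 8-neighbor edge pixel."""
    edges = np.asarray(edges, dtype=bool)
    mask = np.asarray(mask, dtype=bool)
    h, w = edges.shape
    padded_edges = np.pad(edges, 1).astype(int)
    padded_mask = np.pad(mask, 1)
    edge_count = np.zeros((h, w), dtype=int)
    touches_mask = np.zeros((h, w), dtype=bool)
    for dr, dc in NEIGHBOR_SHIFTS:
        edge_count += padded_edges[1 + dr:1 + dr + h, 1 + dc:1 + dc + w]
        touches_mask |= padded_mask[1 + dr:1 + dr + h, 1 + dc:1 + dc + w]
    is_end = edges & ~mask & (edge_count == 1) & touches_mask
    return [(int(r), int(c)) for r, c in np.argwhere(is_end)]


def passes_ratio_test(mask, p, q, r_thresh=1.0):
    """True if masked pixels dominate the rectangle with corners p and q."""
    r0, r1 = sorted((p[0], q[0]))
    c0, c1 = sorted((p[1], q[1]))
    box = np.asarray(mask, dtype=bool)[r0:r1 + 1, c0:c1 + 1]
    masked = int(box.sum())
    available = box.size - masked
    return available == 0 or masked / available > r_thresh


def match_end_points(points, features, mask, m_r, m_c, r_thresh=1.0):
    """Return index pairs (a, b), a < b, of mutually best-matching end points."""
    n = len(points)
    dist = np.full((n, n), np.inf)
    for a in range(n):
        for b in range(n):
            if a == b:
                continue
            in_area = (abs(points[a][0] - points[b][0]) < m_r
                       and abs(points[a][1] - points[b][1]) < m_c)
            if in_area and passes_ratio_test(mask, points[a], points[b], r_thresh):
                dist[a, b] = np.sum((features[a] - features[b]) ** 2)
    best = dist.argmin(axis=1)
    # Keep only mutual matches with a finite distance (step 4).
    return [(a, int(best[a])) for a in range(n)
            if a < best[a] and np.isfinite(dist[a, best[a]]) and best[best[a]] == a]
```

The returned index pairs identify the end points to be joined, for example by drawing a straight line segment between each pair in the edge map.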


7.2.2 The Proposed Method

Once the edge image is reconstructed, inpainting can proceed by using suitably-derived

boundary neighbors. In our method, the edge information is effectively used to define

the neighboring pixels so as to limit the propagation of information within each object.

The algorithmic steps of the proposed inpainting method are as follows:

If (m, n) ∈ Ω,

1. S = ∅. Take a window W with initial size 3 × 3 about the location (m, n).

2. Collect valid neighbors within the window W and append as
S = S ∪ I(i, j); (i, j) ∈ W, (i, j) ∉ Ω.
Note that the valid neighbors are pixels that belong to Ωc.

3. Check independently in all four directions whether the end pixel of the window

encounters an edge pixel of the reconstructed edge image; otherwise, increment

the size of window W in that direction. This limits the propagation of the

information within the object and renders it independent of other objects.

Let wl be the width of the window W between pixel (m, n) and the extreme
pixel on the left. We perform wl = wl + 1 if (m, n − wl) ∉ E; else wl remains unchanged, where
E is the domain of non-zero edge pixels. Similarly, update the window length
along the right (wr), top (wt) and bottom (wb) directions.

4. Stop pixel collection when a certain number of available neighbors are accumulated,
i.e., stop if Cardinality(S) = T; else repeat steps (2) and (3). We set
the threshold T as 25.

Finally, the inpainted observation yp(m, n) = median(S).

Here, the median is preferred over the average value of the pixels in set S for

better noise robustness and edge-preservation. Also, note that due to the appending

operation employed in step 2 of the algorithm, the computed median is actually a

weighted median that gives higher priority to the nearest valid neighbors. Schematic

representation of the proposed edge-based inpainting approach is shown in Fig. 7.4.
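
The steps above can be sketched for a single missing pixel as follows. This is a minimal sketch and not the implementation used in this work. It assumes boolean `mask` and `edges` arrays, stops once at least T values have been collected (rather than exactly T), clamps the window at the image border, and gives up if the window can no longer grow; these choices and the function name are assumptions.

```python
import numpy as np


def inpaint_pixel(image, mask, edges, m, n, t=25):
    """Median of valid neighbors gathered in an edge-limited growing window."""
    image = np.asarray(image, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    edges = np.asarray(edges, dtype=bool)
    h, w = image.shape
    wl = wr = wt = wb = 1  # half-widths of the initial 3 x 3 window
    collected = []
    while True:
        r0, r1 = max(m - wt, 0), min(m + wb, h - 1)
        c0, c1 = max(n - wl, 0), min(n + wr, w - 1)
        window = image[r0:r1 + 1, c0:c1 + 1]
        valid = ~mask[r0:r1 + 1, c0:c1 + 1]
        # Appending on every pass repeats the nearest pixels, which weights
        # the final median towards them.
        collected.extend(window[valid].tolist())
        if len(collected) >= t:
            break
        # Grow each side independently unless its end pixel is an edge pixel
        # or the image border has been reached.
        grew = False
        if n - wl > 0 and not edges[m, n - wl]:
            wl, grew = wl + 1, True
        if n + wr < w - 1 and not edges[m, n + wr]:
            wr, grew = wr + 1, True
        if m - wt > 0 and not edges[m - wt, n]:
            wt, grew = wt + 1, True
        if m + wb < h - 1 and not edges[m + wb, n]:
            wb, grew = wb + 1, True
        if not grew:
            break
    return float(np.median(collected)) if collected else float("nan")
```

Because growth in a given direction stops at a reconstructed edge pixel, the collected set is dominated by pixels from the object that contains (m, n), so the median is not pulled towards the gray level on the other side of the edge.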


Fig. 7.4: Proposed inpainting algorithm.

7.3 RESULTS FOR INPAINTING

In this section, we demonstrate the performance of the proposed inpainting algorithm

on various synthetic as well as real degradations and compare it with well-known

existing techniques.

We begin with inpainting scratches on a synthetic image. Fig. 7.5(a) shows an

original brick image and Fig. 7.5(b) shows the degraded version with scratches (result-

ing in even some occluded edges). The output using the proposed inpainting scheme

is shown in Fig. 7.5(c). Even though this is a simulated example, the edge map was

reconstructed from the degraded image since the original image is never available in

real situations. We note that the inpainted image is quite close to the original image

even at the edges.

We next consider the degraded image shown in Fig. 7.6(a). The inpainting result

using the proposed algorithm is shown in Fig. 7.6(b). For comparison, we have

reproduced the output of the method described in [6] for the same image. Our result

Fig. 7.5: Brick image (a) Original, (b) scratched, and (c) inpainted.

Fig. 7.6: A synthetic image (a) with a ring mask. (b) Inpainted output. (c) Result of Bertalmio [6].

is sharper at edges than that of [6] (Fig. 7.6(c)) but the method in [6] preserves broken

curvature better.

Next, we consider Fig. 7.7(a) which shows an original peppers image. The dam-

aged version with thick line scratches and patches is shown in Fig. 7.7(b). Using the

edge map of the degraded image (Fig. 7.7 (c)), which is known only at unmasked

regions, we arrive at the reconstructed edge map shown in Fig. 7.7(d). The final

inpainted image is given in Fig. 7.7(e). We note that the final result is quite good.

In Fig. 7.8(a), we consider the case of a real image of an old damaged painting.

The edge image in the unmasked regions is shown in Fig. 7.8(b). The reconstructed

Fig. 7.7: Peppers image. (a) Original, (b) scratched, (c) edges from degraded image, (d) reconstructed edges, and (e) inpainted image using reconstructed edge map.

edge map using our edge reconstruction method is shown in Fig. 7.8(c) and the final

inpainted output using the proposed algorithm is given in Fig. 7.8(d). We note that

except at the middle right folds of the coat, where the edge reconstruction is not

perfect, our result compares favorably with the output Fig. 7.8(e) of the PDE-based

iterative approach of [6].

Next, we present inpainting results on a real damaged photograph (Fig. 7.9(a))

taken from [6]. The output of our algorithm and that of [6] are shown in Figs. 7.9(b)

and (c), respectively. While both the methods exhibit comparable performance overall,

our algorithm does better near the eyes while the result of [6] is better in inpainting

the damage near the bottom-hand.

Fig. 7.8: An old painting (a) Degraded image. (b) Edge map from degraded image. (c) Reconstructed edge map. (d) Inpainted result using reconstructed edge map. (e) Output cropped from [6].

We next consider inpainting of superimposed text. Fig. 7.10(a) shows a bird image

with text written on it. The inpainting result of the proposed algorithm is shown in

Fig. 7.10(b). We also show the result of Chan and Shen [18] in Fig. 7.10(c) for

comparison. Our method, despite being less complex, is quite close to that of [18] in

performance.

Finally, we apply the proposed algorithm on a horse-cart image (Fig. 7.11(a))

superimposed with dense text. Fig. 7.11 (b) shows the inpainted image obtained with

our algorithm. The image inpainting result of [6] is shown in Fig. 7.11(c). Again, our

approach performs comparably.

Fig. 7.9: A real image (a) with mask on damaged region. (b) Inpainted result using our method. (c) Result of Bertalmio [6].

Fig. 7.10: (a) A bird image superimposed with text. Inpainted result using (b) proposed method, and (c) method in [18].

7.4 INPAINTING IN PRESENCE OF NOISE

In chapter 5, we explored UKF for film-grain noise suppression. Old photographic

images and movies not only suffer from damages due to scratch, stain or blotches,

but also from film-grain noise. Text inpainting in the presence of film-grain noise is

also important for restoration and compression of photographic images and movies

containing subtitles and markers. In contrast to filtering, we must entirely rely on the

surrounding available pixels to estimate a pixel in image inpainting. This can be partially

accomplished during the prediction step of a recursive filter. Reliable prediction augurs

Fig. 7.11: Horse-cart image (a) with superimposed text. (b) Inpainted output using proposed method. (c) Output of [6].

well for simultaneous inpainting and filtering. Moreover, the prediction must not

depend on the missing observations in order to identify the state model parameters.

The importance sampling unscented Kalman filter (ISUKF), proposed in chapter 5, has a reliable

prediction step (without being dependent on observations) which suits our current

requirement. However, for the update step (in the damaged regions) we derive ‘virtual’

observations from the boundary topology and gray levels. This is in contrast to using

‘known’ observations in traditional restoration.


7.4.1 ISUKF for Simultaneous Inpainting and Filtering

In this subsection, we propose a unified framework for inpainting images in the presence

of film-grain noise. The proposed inpainting process is embedded within the recursive

ISUKF framework to simultaneously combat noise while also accounting for missing

data. Prediction is based on the already estimated NSHP pixels as before. The issue in

performing inpainting within the filter framework is to handle missing observations. To

address this, we propose to reconstruct a missing observation by using the (available)

surrounding observations in our inpainting method. Inpainting is invoked only when we

encounter a missing pixel during the sequential filtering process. We first reconstruct

the edge image from the damaged noisy image, exactly as in section 7.2.1. We set a

high threshold for finding edges to mitigate the effect of noise on edge reconstruction.

We use the reconstructed edge map to guide the inpainting process as described in

section 7.2.2.

Details of the proposed method to sequentially filter and recover the intensity of

the original image at each pixel (m, n) in a raster-scan order from left-to-right are as

follows:

1. Prediction is based on the DAMRF prior exactly as in ISUKF (chapter 5). We

first formulate the DAMRF conditional density following step (1) and predict

the state mean and covariance by importance sampling of this non-Gaussian pdf

as in step (2) of section 5.3.1.

2. Inpainting is performed only for the missing observation pixels. We use the

output of our inpainting algorithm for ‘reconstructing’ the observation pixel

intensity at (m, n) based on the available neighbors in the observation image.

3. The UKF update step is as in ISUKF (step (3) of section 5.3.1). Note that for

missing data, we use the inpainted observations.

(a) We predict the state sigma points using Eq. (4.1) based on the predicted

mean and covariance.


Fig. 7.12: Proposed framework for inpainting in the presence of noise.

(b) We use the film-grain nonlinearity (Eq. 5.8) to predict the measurement

sigma points using the predicted state and measurement noise sigma points

as in step 3(b) of section 5.3.1.

(c) Following step 3 (c) of section 5.3.1, we update the mean and covariance of

the state at each pixel. In order to do this, we require the observations from

the degraded image. In case of missing observations, we use the inpainted

output from step 2 (above).

4. The updated state is the estimated pixel. This is used in the formulation of the

DAMRF prior and also for inpainting subsequent missing observations.

Figure 7.12 gives a flow diagram of the proposed filter.
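
The update in steps 2 to 4 can be sketched for a single pixel as follows. This is a minimal scalar illustration under stated assumptions, not the thesis implementation: it uses the standard symmetric sigma-point set on the augmented (state, measurement noise) pair with a hypothetical tuning parameter `kappa`, and the observation function `h(x, v)` is supplied by the caller because the film-grain nonlinearity of Eq. (5.8) is not restated in this section. The names `ukf_update` and `update_pixel` are illustrative.

```python
import numpy as np

def ukf_update(pred_mean, pred_var, y, h, noise_var, kappa=1.0):
    """Scalar unscented update given a predicted mean and variance.

    h(x, v) maps state and measurement-noise sigma points (arrays)
    to measurement sigma points; it is an assumed, caller-supplied model.
    """
    n = 2  # augmented dimension: state and measurement noise
    scale = np.sqrt(n + kappa)
    sx = scale * np.sqrt(pred_var)
    sv = scale * np.sqrt(noise_var)
    # Symmetric sigma points around (pred_mean, 0)
    xs = np.array([pred_mean, pred_mean + sx, pred_mean - sx, pred_mean, pred_mean])
    vs = np.array([0.0, 0.0, 0.0, sv, -sv])
    w = np.full(5, 1.0 / (2.0 * (n + kappa)))
    w[0] = kappa / (n + kappa)
    ys = h(xs, vs)                                   # measurement sigma points
    y_mean = np.sum(w * ys)
    p_yy = np.sum(w * (ys - y_mean) ** 2)            # innovation variance
    p_xy = np.sum(w * (xs - pred_mean) * (ys - y_mean))
    gain = p_xy / p_yy
    return pred_mean + gain * (y - y_mean), pred_var - gain ** 2 * p_yy

def update_pixel(pred_mean, pred_var, observed, is_missing, inpainted, h, noise_var):
    """Use the inpainted (virtual) observation when the true one is missing."""
    y = inpainted if is_missing else observed
    return ukf_update(pred_mean, pred_var, y, h, noise_var)
```

For a linear observation model `h(x, v) = x + v` this update reduces to the ordinary Kalman update, which gives a convenient sanity check before substituting a nonlinear model.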

7.5 RESULTS FOR JOINT RECOVERY

In this section, we present results on recovering damaged images in the presence of both synthetic and real film-grain

noise. We begin with a synthetic image with two ellipses, a white region to


Fig. 7.13: (a) Image degraded by film-grain noise and a big patch. (b) Output of the proposed method.

Fig. 7.14: Boat (a) Original. (b) Degraded and scratched. (c) Recovered image.

be inpainted, and corrupted by simulated film grain noise, as shown in Fig. 7.13(a).

The proposed recursive filter not only suppresses film-grain noise but also efficiently

inpaints the damaged region as shown in Fig. 7.13(b).

Next, we show a boat image and its degraded version in Figs. 7.14 (a) and (b),

respectively. Note that the masts have been scratched through. The image estimated

by the proposed algorithm, shown in Fig. 7.14(c), recovers most of the details,

except at the very narrow poles where the neighboring information is very different.

In the next example, we apply the proposed approach to suppress film-grain noise

while removing dense superimposed text. An original peppers image is shown in Fig.


Fig. 7.15: Peppers (a) Original. (b) Degraded and superimposed with text. (c) Reconstructed edge map. (d) Image recovered by the proposed filter (ISNR = 16.16 dB). (e) Degraded with high noise. (f) Recovered result using reconstructed edge map (when input is (e)) (ISNR = 14.38 dB).

7.15(a). It is superimposed with dense text and corrupted (initially) with a low level

of film-grain noise as shown in Fig. 7.15(b). The reconstructed edge image is shown in

Fig. 7.15 (c). The recovered image using the reconstructed edge map is shown in Fig.

7.15(d). Next, we introduced a high level of film-grain noise as shown in Fig. 7.15(e)

and the corresponding inpainted and denoised result is shown in Fig. 7.15 (f). For

both levels of noise, we observe that the outputs are quite close to the original (Fig.

7.15(a)). Here we have given the ISNR values for a quantitative comparison.
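
For reference, the improvement in signal-to-noise ratio (ISNR) is commonly computed as ten times the base-10 logarithm of the ratio between the squared error of the degraded image and that of the restored image, both measured against the original. The sketch below follows this common definition. Whether the reported values were computed over the whole image or only over particular regions is not stated here, so the whole-image sum is an assumption, as is the function name.

```python
import numpy as np

def isnr_db(original, degraded, restored):
    """ISNR in dB: 10*log10(||original - degraded||^2 / ||original - restored||^2)."""
    original = np.asarray(original, dtype=float)
    err_in = np.sum((original - np.asarray(degraded, dtype=float)) ** 2)
    err_out = np.sum((original - np.asarray(restored, dtype=float)) ** 2)
    return 10.0 * np.log10(err_in / err_out)
```

A positive value indicates that the restored image is closer to the original than the degraded input was.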

Next, we consider a face image with real film-grain noise but simulated scratches

as shown in Fig. 7.16(a). The result of the proposed filter (Fig. 7.16(b)) looks quite

natural, recovering the fine details near the eyebrows, hair and nose. Scratches on


Fig. 7.16: Face with real film-grain. (a) Scratched version. (b) Inpainted and filtered result (when input is (a)). (c) Degraded by patches. (d) Image recovered by our filter (when input is (c)).

the ears and near the mouth are also well recovered. Even when the face image is

damaged with patches and thick stripes (Fig. 7.16(c)), the image recovered by the

proposed filter (shown in Fig. 7.16(d)) retains most of the details, except for small

artifacts near the forehead and mouth.

A cropped portion of a frame from the film “Dr. Mabuse” with real film-grain noise

is synthetically damaged at many places as shown in Fig. 7.17 (a). The result of the

proposed edge reconstruction method is given in Fig. 7.17(c); it bridges

most of the major breaks in Fig. 7.17(b). The image recovered by the proposed filter

is shown in Fig. 7.17(d). The edges at the shoulders and at the tie are well-recovered,

and the image details near eyes and nose are restored properly.


Fig. 7.17: Mabuse (a) with scratches. (b) Edge image from (a). (c) Reconstructed edge image (when input is (b)). (d) Image recovered by the proposed filter.

We next demonstrate the capability of the proposed approach to recover an image

of a scanned painting with real film-grain noise (Fig. 7.18(a)). It is scratched at

several places, including at edges, as shown in Fig. 7.18(b). In the recovered image (Fig.

7.18(c)), the proposed approach suppresses film-grain noise and inpaints effectively

while retaining the image texture and finer details such as the ship partitions.

We have also removed scratches from an image captured on photographic

film. The image is that of a locomotive shown in Fig. 7.19 (a). In the output

of the proposed algorithm shown in Fig. 7.19 (b), the numbers and letters

are properly recovered and the noise is well suppressed.

Finally, we consider another real example that has film-grain noise, embedded


Fig. 7.18: (a) Scanned painting with real film-grain noise. (b) Scratched version, and (c) recovered image.

Fig. 7.19: (a) An image captured using a film-camera, and (b) recovered output.

text and a vertical scratch as shown in Fig. 7.20(a). The image recovered using the

proposed approach is shown in Fig. 7.20(b). It appears natural and without grain.

We have successfully inpainted the titles and the mark in the middle of the face, and

also recovered quite well the facial details such as hair, nose and mouth.


Fig. 7.20: (a) A child image with text, scratch and film-grain noise. (b) Inpainted and filtered output of the proposed method.

These examples demonstrate that the proposed filter is quite effective in suppressing

film-grain noise while performing image inpainting. A Matlab implementation of the proposed algorithm takes

about 20 seconds to execute on an image of size 200 × 200 pixels when

run on a Pentium-IV PC with 256 MB RAM.

7.6 DISCUSSION

We proposed a novel recursive filter based on the unscented Kalman filter to recover

an image degraded by both film-grain noise and scratches. We first proposed an edge-

based inpainting scheme that is suitable for incorporating in a recursive framework. As

our inpainting method relies on the edge image, we proposed a local mask-dependent

edge-linking method to link edges across damaged portions. We proposed a pixel

aggregation procedure to inpaint missing pixels.

To inpaint damages in images corrupted by film-grain noise, we proposed a recursive

scheme based on the UKF which uses our inpainting method to derive the observations

in the damaged regions. Prediction was based on a discontinuity-adaptive

MRF prior which preserves the edges while achieving good noise-reduction in uniform

regions.


CHAPTER 8

CONCLUSIONS

In this thesis, we addressed the problem of recovering an image from its degraded

observation through novel edge-preserving extensions of the Kalman filter and its variants.

We first investigated the problem of filtering AWGN. The original image was

modeled with a non-Gaussian MRF prior based on a discontinuity-adaptive potential

function. This conditional prior implicitly models the non-homogeneous nature of

images with local statistical information. We brought in the principle of importance

sampling to predict the mean and covariance of the state conditional prior, which are

then fed to the update step of the Kalman filter to arrive at the final image estimate.
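
The prediction of the mean and variance by importance sampling can be illustrated with the following sketch. It is only indicative: the potential `da_potential` is one member of the discontinuity-adaptive family described by Li [78] and may differ from the function used in this thesis, and the Gaussian proposal centred on the neighbourhood mean, together with the parameters `gamma`, `proposal_std`, `n_samples` and `seed`, are assumptions made for the example.

```python
import numpy as np

def da_potential(eta, gamma):
    # Assumed discontinuity-adaptive potential: roughly quadratic for small
    # differences, saturating at gamma for large ones (so edges are not over-penalised).
    return gamma - gamma * np.exp(-eta ** 2 / gamma)

def predict_by_importance_sampling(neighbours, gamma=100.0, n_samples=2000,
                                   proposal_std=20.0, seed=0):
    """Estimate mean and variance of p(x | neighbours) ~ exp(-sum_j g(x - x_j))."""
    rng = np.random.default_rng(seed)
    neighbours = np.asarray(neighbours, dtype=float)
    centre = neighbours.mean()
    # Draw samples from a Gaussian proposal q centred on the neighbourhood mean
    samples = rng.normal(centre, proposal_std, n_samples)
    energy = da_potential(samples[:, None] - neighbours[None, :], gamma).sum(axis=1)
    log_q = -0.5 * ((samples - centre) / proposal_std) ** 2
    # Importance weights p/q, computed in the log domain and self-normalised
    log_w = -energy - log_q
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    mean = np.sum(w * samples)
    var = np.sum(w * (samples - mean) ** 2)
    return mean, var
```

The returned mean and variance play the role of the predicted state statistics that are passed to the update step.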

We next examined the problem of handling non-linearity in photographic films and

proposed methods within the framework of the UKF. As a first step, we modeled the

image by a homogeneous AR state model with NSHP support. Assuming the initial

statistics of the state, we determined sigma points and transformed them through the

AR state model followed by the observation model. Employing UT, we predicted the

state and measurement statistics for use in the update stage of UKF.

We further relaxed the linearity constraint imposed by the homogeneous AR model

in UKF and incorporated an edge-preserving MRF image prior to capture local statistics.

The principle of importance sampling was adopted to predict the mean and

covariance of the state from the prior, which allowed us to determine the predicted

sigma points directly from the prior. These were propagated through the film-grain

observation model to obtain measurement sigma points and subsequently the statistics

required for the update step of the UKF. Experimental results on film-grain noise

reduction in photographic images were given to demonstrate the effectiveness of the

UKF-based algorithms.


We next addressed the problem of suppression of speckle noise in SAR images.

Here, one had to cope with multiplicative noise and yet preserve critical point and

line features. In order to achieve these goals, we tailored the DAMRF prior for SAR

imagery. The predicted sigma points using this prior were propagated through the

multiplicative model and were subsequently used to arrive at the image estimates.

Finally, we considered the problem of digital inpainting of noisy photographs. We

first developed an edge-based inpainting method which relies on the reconstructed

edge image. The end points of edges are matched based on a mask-based constrained

local matching strategy. Missing pixels are inpainted by propagating the boundary

information which is guided by the reconstructed edge map. Embedding the developed

inpainting method within the ISUKF-based filtering framework, we proposed a method

to simultaneously suppress film-grain noise while inpainting missing pixels.

All the proposed approaches were validated on synthetic as well as real examples.

They were also compared with existing methods to demonstrate their

effectiveness.

8.1 SUGGESTIONS FOR FUTURE WORK

There are several directions for pursuing further research. We cite below a few of them.

• Extending the proposed ISKF and ISUKF for image restoration in the presence

of blur.

• Investigating the utility of the proposed ISUKF framework for other noise models

and nonlinear restoration problems.

• Improving the performance of ARUKF and ISUKF through more accurate prediction

of the means and covariances using higher-order UT methods [137,138].

• Extending the inpainting ISUKF framework for occlusion-free tracking in videos.

• Exploring Gaussian-mixture UKF models for tracking multiple objects or people

in image sequences.


BIBLIOGRAPHY

[1] A. C. Bovik, Handbook of Image and Video Processing. Academic Press, 2005.

[2] A. K. Katsaggelos, “Recent trends in image restoration and enhancement techniques,” in Proc. IEEE Asia Pacific Conf. on Circuits and Systems, pp. 458–459, 1996.

[3] H. C. Andrews and B. R. Hunt, Digital image restoration. New Jersey: Prentice-Hall, Inc, 1977.

[4] A. M. Tekalp and G. Pavlovic, “Image restoration with multiplicative noise: Incorporating the sensor nonlinearity,” IEEE Trans. Signal Process., vol. 39, pp. 2132–2136, 1991.

[5] J. W. Goodman, “Some fundamental properties of speckle,” J. Opt. Soc. Amer., vol. 66, pp. 1145–1150, 1976.

[6] M. Bertalmio, G. Sapiro, V. Caselles, and C. Ballester, “Image inpainting,” in Proc. SIGGRAPH, Computer Graphics Proceedings, pp. 417–424, 2000.

[7] R. E. Kalman, “A new approach to linear filtering and prediction problems,” Trans. ASME, Ser. D, Journal of Basic Engineering, vol. 82, pp. 34–45, 1960.

[8] J. W. Woods and C. H. Radewan, “Kalman filtering in two dimensions,” IEEE Trans. Inform. Theory, vol. 23, pp. 473–482, 1977.

[9] F. Daum, “Nonlinear filters: beyond the Kalman filter,” IEEE Aerospace and Electronic Systems Magazine, vol. 20, pp. 57–69, 2005.

[10] Z. Chen, “Bayesian filtering: From Kalman filters to particle filters, and beyond.” Technical Report, 2003.

[11] S. R. Kadaba, S. B. Gelfand, and R. L. Kashyap, “Recursive estimation of images using non-Gaussian autoregressive models,” IEEE Trans. Image Processing, vol. 7, pp. 1439–1452, 1998.

[12] A. Achim, E. E. Kuruoglu, and J. Zerubia, “SAR image filtering based on the heavy-tailed Rayleigh model,” IEEE Trans. Image Processing, vol. 15, pp. 2686–2693, 2006.

[13] M. Dai, C. Peng, A. K. Chan, and D. Loguinov, “Bayesian wavelet shrinkage with edge detection for SAR image despeckling,” IEEE Trans. Geoscience and Remote Sensing, vol. 42, pp. 1642–1648, 2004.

[14] F. Argenti, T. Bianchi, and L. Alparone, “Multiresolution MAP despeckling of SAR images based on locally adaptive generalized Gaussian pdf modeling,” IEEE Trans. Image Processing, vol. 15, pp. 3385–3399, 2006.


[15] L. Gagnon and A. Jouan, “Speckle filtering of SAR images - a comparative study between a complex-wavelet-based and standard filters,” Proc. SPIE, vol. 3169, pp. 80–91, 1997.

[16] O. Lankoande, M. M. Hayat, and B. Santhanam, “Speckle modeling and reduction in synthetic aperture radar imagery,” in Proc. IEEE Int. Conf. Image Processing (ICIP), pp. III:317–320, 2005.

[17] A. Rares, M. Reinders, and J. Biemond, “Edge-based image restoration,” IEEE Trans. Image Processing, vol. 14, pp. 1454–1468, 2005.

[18] T. Chan and J. Shen, “Mathematical models for local non-texture inpaintings,” SIAM Journal on Applied Mathematics, vol. 62, pp. 1019–1043, 2001.

[19] A. K. Katsaggelos, “Iterative image restoration algorithms,” Optical Engineering, vol. 28, pp. 735–748, 1989.

[20] M. A. Robertson and R. L. Stevenson, “DCT quantization noise in compressed images,” IEEE Trans. Circuits and Systems for Video Technology, vol. 15, pp. 27–38, 2005.

[21] H. Soltanian-Zadeh, J. P. Windham, and A. E. Yagle, “A multi-dimensional non-linear edge-preserving filter for magnetic resonance image restoration,” IEEE Trans. Image Processing, vol. 4, pp. 147–161, 1995.

[22] F. T. Ulaby, “Radar signatures of terrain: Useful monitors of renewable resources,” Proc. IEEE, vol. 70, pp. 1410–1433, 1982.

[23] A. Kokaram, R. Morris, W. Fitzgerald, and P. Rayner, “Interpolation of missing data in image sequences,” IEEE Trans. Image Processing, vol. 11, pp. 1509–1519, 1995.

[24] S. Masnou and J. Morel, “Level-lines based disocclusion,” in Proc. IEEE Int. Conf. Image Processing (ICIP), pp. 259–263, 1998.

[25] A. K. Jain, Fundamentals of digital image processing. India: Prentice Hall, 2000.

[26] T. Berger, J. O. Stromberg, and T. Eltoft, “Adaptive regularized constrained least squares image restoration,” IEEE Trans. Image Processing, vol. 8, pp. 1191–1203, 1999.

[27] D. Angwin and H. Kaufman, “Image restoration using a reduced order model Kalman filter,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP), pp. 1000–1003, 1988.

[28] H. Kaufman, J. W. Woods, D. Subrahmanyam, and A. M. Tekalp, “Estimation and identification of two-dimensional images,” IEEE Trans. Automatic Control, vol. 28, pp. 745–756, 1983.

[29] F. C. Jeng and J. W. Woods, “Inhomogeneous Gaussian image models for estimation and restoration,” IEEE Trans. Acoust., Speech, Signal Processing, vol. 36, pp. 1305–1312, 1988.


[30] A. M. Tekalp, H. Kaufman, and J. Woods, “Edge-adaptive image restoration with ringing suppression,” IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 892–899, 1989.

[31] J. Biemond and J. Gerbrands, “An edge preserving recursive noise smoothing algorithm for image data,” IEEE Trans. Systems, Man and Cybernetics, vol. 9, pp. 622–627, 1979.

[32] H. R. Keshavan and M. D. Srinath, “Sequential estimation technique for enhancement of noisy images,” IEEE Trans. Computers, vol. 26, pp. 971–988, 1977.

[33] Y. C. Chang, S. R. Kadaba, P. C. Doerschuk, and S. B. Gelfand, “Image restoration using recursive Markov random field models driven by Cauchy distributed noise,” IEEE Trans. Signal Process. Letters, vol. 8, pp. 65–66, 2001.

[34] S. Geman and D. Geman, “Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 6, pp. 721–741, 1984.

[35] C. Bouman and K. Sauer, “A generalized Gaussian image model for edge-preserving MAP estimation,” IEEE Trans. Image Processing, vol. 2, pp. 296–310, 1993.

[36] P. Charbonnier, L. Blanc-Feraud, G. Aubert, and M. Barlaud, “Deterministic edge-preserving regularization in computed imaging,” IEEE Trans. Image Processing, vol. 6, pp. 298–311, 1997.

[37] F. C. Jeng and J. W. Woods, “Compound Gauss-Markov random fields for image estimation,” IEEE Trans. Signal Process., vol. 39, pp. 683–697, 1991.

[38] S. Roth and M. J. Black, “Fields of experts: A framework for learning image priors,” in Proc. IEEE Int. Conf. Computer Vision and Pattern Recognition, pp. 860–867, 2005.

[39] M. Ceccarelli, “A finite Markov random field approach to fast edge-preserving image recovery,” Image and Vision Computing, vol. 25, pp. 792–804, 2007.

[40] R. G. Aykroyd, “Bayesian estimation for homogeneous and inhomogeneous Gaussian random fields,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 20, pp. 533–539, 1998.

[41] M. S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, “A tutorial on particle filters for on-line nonlinear/non-Gaussian Bayesian tracking,” IEEE Trans. Signal Process., vol. 50, pp. 174–188, 2002.

[42] Y. Rui and Y. Chen, “Better proposal distributions: Object tracking using unscented particle filter,” in Proc. IEEE Computer Society Conf. Computer Vision and Pattern Recognition, pp. 786–794, 2001.

[43] H. E. Knutsson, R. Wilson, and G. H. Granlund, “Anisotropic nonstationary image estimation and its applications: Part I-restoration of noisy images,” IEEE Trans. Communication, vol. 31, pp. 388–397, 1983.


[44] G. Gilboa, N. Sochen, and Y. Y. Zeevi, “Image enhancement and denoising by complex diffusion process,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 25, pp. 1020–1036, 2004.

[45] P. Perona and J. Malik, “Scale-space and edge detection using anisotropic diffusion,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 12, pp. 629–639, 1990.

[46] L. I. Rudin, S. Osher, and E. Fatemi, “Nonlinear total variation based noise removal algorithms,” Physica D, vol. 60, pp. 259–268, 1992.

[47] L. Kaur, S. Gupta, and R. C. Chauhan, “Image denoising using wavelet thresholding,” in Indian Conf. on Computer Vision, Graphics and Image Processing (ICVGIP), (India), 2002.

[48] S. G. Chang, B. Yu, and M. Vetterli, “Adaptive wavelet thresholding for image denoising and compression,” IEEE Trans. Image Processing, vol. 9, pp. 1532–1546, 2000.

[49] M. A. T. Figueiredo and R. D. Nowak, “Wavelet based image estimation: An empirical Bayes approach, using Jeffrey’s noninformative prior,” IEEE Trans. Image Processing, vol. 10, pp. 1322–1331, 2001.

[50] M. Jansen and A. Bultheel, “Geometric prior for noise free wavelet coefficient in image de-noising,” in Lecture notes in Statistics (P. Mueller and B. Vidakovic, eds.), vol. 141, pp. 223–242, Springer-Verlag, 1999.

[51] A. S. Tavildar, H. M. Guptha, and S. N. Guptha, “Maximum a posteriori estimation in presence of film-grain noise,” Signal Process. (North Holland), vol. 8, pp. 363–368, 1985.

[52] A. D. Stefano, P. R. White, and W. B. Collis, “Film grain reduction in colour images using undecimated wavelet transform,” Image and Vision Computing, vol. 22, pp. 873–882, 2004.

[53] F. Naderi and A. A. Sawchuk, “Estimation of images degraded by film-grain noise,” J. Applied Optics, vol. 17, pp. 1228–1237, 1978.

[54] J. C. K. Yan and D. Hatzinakos, “Signal-dependent film grain noise removal and generation based on higher-order statistics,” in Proc. of IEEE Signal Processing Workshop on Higher-Order Statistics, pp. 77–81, 1997.

[55] B. R. Hunt, “Bayesian methods in nonlinear digital image restoration,” IEEE Trans. Computers, vol. 26, pp. 219–229, 1977.

[56] G. K. Froehlich, J. F. Walkup, and T. F. Krille, “Estimation in signal-dependent film-grain noise,” Applied Optics, vol. 20, pp. 3619–3626, 1981.

[57] T. M. Moldovan, S. Roth, and M. J. Black, “Denoising archival films using a learned Bayesian model,” in Proc. IEEE Int. Conf. Image Processing (ICIP), pp. 2641–2644, 2006.


[58] S. I. Sadhar and A. N. Rajagopalan, “Image recovery under non-linear and non-Gaussian degradations,” J. Opt. Soc. Amer. (A), vol. 22, pp. 604–615, 2005.

[59] S. I. Sadhar and A. N. Rajagopalan, “Image estimation in film-grain noise,” IEEE Trans. Signal Process. Letters, vol. 12, pp. 238–241, 2005.

[60] F. Argenti, G. Torricelli, and L. Alparone, “MMSE filtering of generalised signal-dependent noise in spatial and shift-invariant wavelet domains,” Signal Processing, vol. 86, pp. 2056–2066, 2006.

[61] V. Bruni and D. Vitulano, “Old movies noise reduction via wavelets and Wiener filter,” Journal of Winter School of Computer Graphics (WSCG), vol. 12, pp. 65–69, 2004.

[62] V. S. Frost, J. A. Stiles, A. Josephine, K. S. Shanmugan, and J. C. Holtzman, “A model for radar images and its application to adaptive filtering of multiplicative noise,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 4, pp. 157–165, 1982.

[63] J. S. Lee, “Digital image enhancement and noise filtering by use of local statistics,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 2, pp. 165–168, 1980.

[64] D. T. Kuan, A. A. Sawchuk, T. C. Strand, and P. Chavel, “Adaptive noise smoothing filter for images with signal dependent noise,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 7, pp. 165–177, 1985.

[65] A. Lopes, R. Touzi, and E. Nezry, “Adaptive speckle filters and scene heterogeneity,” IEEE Trans. Geoscience and Remote Sensing, vol. 28, pp. 992–1000, 1990.

[66] A. Lopes, E. Nezry, R. Touzi, and H. Laur, “Maximum a posteriori speckle filtering and first order texture models in SAR images,” in Proc. IEEE Int. Geoscience and Remote Sensing Symposium, pp. 2409–2412, 1990.

[67] T. R. Crimmins, “Geometric filter for reducing speckle,” Optical Engineering, vol. 25, pp. 651–654, 1986.

[68] M. R. Azimi-Sadjadi and S. Bannour, “Two-dimensional adaptive block Kalman filtering of SAR imagery,” IEEE Trans. Geoscience and Remote Sensing, vol. 29, pp. 742–753, 1991.

[69] Y. Yu and S. T. Acton, “Speckle reducing anisotropic diffusion,” IEEE Trans. Image Processing, vol. 11, pp. 1260–1270, 2002.

[70] H. M. Salinas and D. C. Fernandez, “Comparison of PDE-based nonlinear diffusion approaches for image enhancement and denoising in optical coherence tomography,” IEEE Trans. Medical Imaging, vol. 26, pp. 761–771, 2007.

[71] H. Xie, L. E. Pierce, and F. T. Ulaby, “SAR speckle reduction using wavelet denoising and Markov random field modeling,” IEEE Trans. Geoscience and Remote Sensing, vol. 40, pp. 2196–2212, 2002.

[72] A. Achim, P. Tsakalides, and A. Bezerianos, “SAR image denoising via Bayesian wavelet shrinkage based on heavy-tailed modeling,” IEEE Trans. Geoscience and Remote Sensing, vol. 41, pp. 1773–1785, 2003.


[73] M. I. H. Bhuiyan, M. O. Ahmad, and M. N. S. Swamy, “Spatially-adaptive wavelet-based method using the Cauchy prior for denoising the SAR images,” IEEE Trans. Circuits and Systems for Video Technology, vol. 17, pp. 500–507, 2007.

[74] S. P. Luttrell and C. J. Oliver, “Prior knowledge in synthetic-aperture radar processing,” J. Phys. D, vol. 19, pp. 333–356, 1986.

[75] M. Bertalmio, L. Vese, G. Sapiro, and S. Osher, “Simultaneous structure and texture image inpainting,” IEEE Trans. Image Processing, vol. 12, pp. 882–889, 2003.

[76] H. Kaufman and A. M. Tekalp, “Survey of estimation techniques in image restoration,” IEEE Control Systems Magazine, vol. 11, pp. 16–24, 1991.

[77] S. Z. Li, Markov Random Field Modeling in Computer Vision. New York, inc.: Springer-Verlag, 1995.

[78] S. Z. Li, “On discontinuity-adaptive smoothness priors in computer vision,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 17, pp. 576–586, 1995.

[79] D. J. C. MacKay, “Introduction to Monte Carlo methods,” in Learning in Graphical Models (M. I. Jordan, ed.), NATO Science Series, pp. 175–204, Kluwer Academic Press, 1998.

[80] E. C. Anderson, “Monte Carlo methods and importance sampling.” Lecture Notes, at ‘http://ib.berkeley.edu/labs/slatkin/eriq/classes/guest lect/mc lecture notes.pdf’, 1999.

[81] S. Julier and J. Uhlmann, “A general method for approximating nonlinear transformations of probability distributions.” Tech. rep., RRG, Dept. of Engineering Science, University of Oxford, 1996.

[82] S. J. Julier, J. K. Uhlmann, and H. F. Durrant-Whyte, “A new method for the nonlinear transformation of means and covariances in filters and estimators,” IEEE Trans. Automatic Control, vol. 45, pp. 477–482, 2000.

[83] R. V. Merwe, N. de Freitas, A. Doucet, and E. Wan, “The unscented particle filter.” Technical report CUED/F-INFENG/TR 380, Cambridge University Engineering Department, 2000.

[84] E. A. Wan and R. V. Merwe, “The unscented Kalman filter for nonlinear estimation,” in Proceedings of IEEE Symposium on Adaptive Systems for Signal Processing, Communications and Control (AS-SPCC), pp. 153–158, 2000.

[85] J. L. Crassidis and F. L. Markley, “Unscented filtering for spacecraft attitude estimation,” J. Guid. Control Dyn., vol. 26, pp. 536–542, 2003.

[86] P. Li, T. Zhang, and B. Ma, “Unscented Kalman filter for visual curve tracking,” Image and Vision Computing, vol. 22, pp. 157–164, 2004.

[87] B. Stenger, P. Mendonca, and R. Cipolla, “Model-based hand tracking using an unscented Kalman filter,” in Proc. British Machine Vision Conference, pp. 63–72, 2001.


[88] G. R. K. S. Subrahmanyam, A. N. Rajagopalan, and R. Aravind, “Unscented Kalman filter for image estimation in film-grain noise,” in Proc. IEEE Int. Conf. Image Processing (ICIP), pp. IV:17–21, 2007.

[89] E. R. Dougherty, Random processes for image and signal processing. New York: SPIE/IEEE Series on Imaging Science and Engineering, 1999.

[90] M. Petrou and P. G. Sevilla, Image processing: Dealing with texture. London, UK: John Wiley and Sons, 2006.

[91] A. Rangarajan and R. Chellappa, “Markov random field models in image processing,” in The handbook of brain theory and neural networks, NATO Science Series, pp. 564–567, MIT Press, 1998.

[92] N. Ahuja and B. Schachter, “Image models,” ACM Computing Surveys, vol. 13, pp. 373–397, 1981.

[93] S. C. Zhu, “Statistical modeling and conceptualization of visual patterns,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 25, pp. 691–712, 2003.

[94] E. B. Ranguelova, Segmentation of textured images on three-dimensional lattices. PhD thesis, Dept. of Electronic and Electrical Engg., Univ. of Dublin, Trinity College, Dublin, 2002.

[95] P. Perez, “Markov random fields and images,” CWI Quarterly, vol. 11, pp. 413–437, 1998.

[96] J. W. Woods, “Two-dimensional discrete Markovian fields,” IEEE Trans. Inform. Theory, vol. 18, pp. 232–240, 1972.

[97] S. Kumar and M. Hebert, “Discriminative random fields,” Int. J. Computer Vision, vol. 68, pp. 179–201, 2006.

[98] H. M. Wallach, “Conditional random fields: An introduction.” University of Pennsylvania CIS, Technical Report MS-CIS-04-21, 2004.

[99] D. E. Melas and S. P. Wilson, “Double Markov random fields and Bayesian image segmentation,” IEEE Trans. Signal Process., vol. 50, pp. 357–365, 2002.

[100] M. A. T. Figueiredo, “Bayesian methods and Markov random fields.” Department of Electrical and Computer Engineering, Instituto Superior Tecnico, available at “www.lx.it.pt/~mtf/FigueiredoCVPR.pdf”.

[101] D. Melas, A Bayesian Approach to the Segmentation of Textural Images. PhD thesis, Dept. of Electronic and Electrical Engg., Univ. of Dublin, Trinity College, Dublin, 1998.

[102] J. Besag, “Spatial interaction and the statistical analysis of lattice systems,” J. Royal Statist. Society (B), vol. 36, pp. 192–236, 1974.

[103] D. Griffeath, “Introduction to random fields,” in Denumerable Markov Chains (J. G. Kemeny, J. L. Snell, and A. W. Knapp, eds.), pp. 425–458, New York: Springer-Verlag, 1976.


[104] R. Stevenson, B. Schmitz, and E. Delp, “Discontinuity-preserving regularization of inverse visual problems,” IEEE Trans. Systems, Man and Cybernetics, vol. 24, pp. 455–469, 1994.

[105] P. J. Green, “Bayesian reconstruction from emission tomography data using a modified EM algorithm,” IEEE Trans. Medical Imaging, vol. 9, pp. 84–93, 1990.

[106] S. Geman, D. McClure, and D. Geman, “A nonlinear filter for film restoration and other problems in image processing,” Computer Vision Graphics and Image Processing, vol. 54, pp. 281–289, 1992.

[107] A. Blake and A. Zisserman, Visual Reconstruction. Cambridge, M.A.: M.I.T. Press, 1987.

[108] S. Geman and G. Reynolds, “Constrained restoration and the recovery of discontinuities,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 14, pp. 367–383, 1992.

[109] T. Hebert and R. Leahy, “A generalized EM algorithm for 3-D Bayesian reconstruction from Poisson data using Gibbs priors,” IEEE Trans. Medical Imaging, vol. 8, pp. 194–202, 1989.

[110] S. Kapoor, P. Y. Mundkur, and U. B. Desai, “Depth and image recovery using a MRF model,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 16, pp. 1117–1122, 1994.

[111] L. Khouas, C. Odet, and D. Friboulet, “3D furlike texture generation by a 2D autoregressive synthesis,” in Proc. Inter. Conf. in Central Europe on Computer Graphics and Visualization (WSCG), pp. 171–177, 1998.

[112] S. Haykin, Kalman Filtering and Neural Networks. John Wiley and Sons, New York, inc.: Wiley-interscience, 2001.

[113] A. M. Tekalp, H. Kaufman, and J. W. Woods, “Fast recursive estimation of the parameters of a space-varying autoregressive image model,” IEEE Trans. Acoust., Speech, Signal Processing, vol. 33, pp. 469–472, 1985.

[114] S. Z. Li, “Discontinuity-adaptive MRF prior and robust statistics: A comparative study,” Image and Vision Computing, vol. 13, pp. 227–233, 1995.

[115] S. Z. Li, “Robustizing robust M-estimation using deterministic annealing,” Pattern Recognition, vol. 29, pp. 159–166, 1996.

[116] J. Pengelly, “Monte Carlo methods.” Students Tutorial, February 2002, available at ‘http://csnet.otago.ac.nz/cosc453/student tutorials/monte carlo.pdf.’.

[117] M. Pagano and W. Sandmann, “Efficient rare event simulation: A tutorial on importance sampling.” Tutorial presented at Third International Working Conference on Performance Modelling and Evaluation of Heterogeneous Networks, 2005.

[118] T. C. Hesterberg, Advances in Importance Sampling. PhD thesis, Stanford University, US, 1988.


[119] I. Beichl and N. F. Sullivan, “The importance of importance sampling,” Computing inSci. and Engg., vol. 1, pp. 71 – 73, 1999.

[120] P. H. Borcherds, “Importance sampling: an illustrative introduction,” Eur. J. Phys.,vol. 21, pp. 405–411, 2000.

[121] S. J. Julier and J. K. Uhlmann, “A new extension of the Kalman filter to nonlinearsystems,” in Proc. of AeroSense: 11th International Symposium on Aerospace/DefenseSensing, Simulation and Controls, vol. 3068, pp. 182–193, 1997.

[122] R. V. Merwe, Sigma-point Kalman filters for probabilistic inference in dynamic state-space models. PhD thesis, OGI Sch. of Sci. and Engg., Oreg. Health and Sci. Univ.,Portland, Oreg., 2004.

[123] B. D. Anderson and J. B. Moore, Optimal filtering. Englewood Cliffs, N. J: Prentice-Hall, 1979.

[124] B. Ristic, M. S. Arulampalam, A. Farina, and D. Benvenuti, “Performance boundsand comparison of nonlinear filters for tracking a ballistic object on re-entry,” in IEEProceedings: Radar, Sonar and Navigation, vol. 150, pp. 65–70, 2003.

[125] S. J. Julier and J. K. Uhlmann, “Unscented filtering and nonlinear estimation,” Proc.IEEE, vol. 92, pp. 401–422, 2004.

[126] S. J. Julier and J. K. Uhlmann, “The scaled unscented transformation,” in Proc. Amer.Control Conf., pp. 4555–4559, 2002.

[127] K. Ito and K. Xiong, “Gaussian filters for nonlinear filtering problems,” IEEE Trans.Automatic Control, vol. 45, pp. 910–927, 2000.

[128] R. V. Merwe and E. A. Wan, “Sigma-point Kalman filters for probabilistic inference in dynamic state-space models,” in Proceedings of the Workshop on Advances in Machine Learning, 2003.

[129] R. V. Merwe, E. A. Wan, and A. T. Nelson, “Dual estimation and the unscented transformation,” in Advances in Neural Information Processing Systems, pp. 666–672, 2000.

[130] I. A. Gura and R. H. Gersten, “Interpretation of n-dimensional covariance matrices,” American Institute of Aeronautics and Astronautics Journal, vol. 9, pp. 740–742, 1971.

[131] A. ACCESS, “Statistical analysis site.” http://www.aiaccess.net/e gm.htm.

[132] S. J. Julier, J. K. Uhlmann, and H. F. Durrant-Whyte, “A new approach for filtering nonlinear systems,” in Proc. Amer. Control Conf., pp. 1628–1632, 1995.

[133] B. Ristic, S. Arulampalam, and N. Gordon, Beyond the Kalman Filter: Particle Filters for Tracking Applications. New York, inc.: Artech House Radar Library, 2004.

[134] S. J. Julier, Comprehensive process models for high-speed land vehicles. PhD thesis, Robotics Research Group, Wadham College, University of Oxford, UK, 1997.

[135] R. V. Merwe and E. Wan, “The square-root unscented Kalman filter for state and parameter-estimation,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP), pp. 3461–3464, 2001.

[136] S. J. Julier and J. K. Uhlmann, “Reduced sigma point filters for the propagation of means and covariances through nonlinear transformations,” in Proc. Amer. Control Conf., pp. 887–892, 2002.

[137] D. Tenne and T. Singh, “The higher order unscented filter,” in Proc. Amer. Control Conf., pp. 2441–2446, 2003.

[138] Y. Wu, M. Wu, D. Hu, and X. Hu, “An improvement to unscented transformation,” in Australian Conference on Artificial Intelligence, pp. 1024–1029, 2004.

[139] R. V. Merwe and E. A. Wan, “Gaussian mixture sigma-point particle filters for sequential probabilistic inference in dynamic state-space models,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP), pp. 701–704, 2003.

[140] L. Angrisani, A. Baccigalupi, and R. S. L. Moriello, “Ultrasonic time-of-flight estimation through unscented Kalman filter,” IEEE Trans. Instrumentation and Measurement, vol. 55, pp. 1077–1084, 2006.

[141] T. Lefebvre, H. Bruyninckx, and J. D. Schutter, “Comment on ‘a new method for the nonlinear transformation of means and covariances in filters and estimators (and author’s reply)’,” IEEE Trans. Automatic Control, vol. 47, pp. 1406–1409, 2002.

[142] A. Sitz, U. Schwarz, and J. Kurths, “The unscented Kalman filter, a powerful tool for data analysis,” Int. J. of Bifurcation and Chaos in App. Sci. and Engg., vol. 14, pp. 2093–2105, 2004.

[143] J. J. LaViola, “A comparison of unscented and extended Kalman filtering for estimating quaternion motion,” in Proc. Amer. Control Conf., pp. 2435–2440, 2003.

[144] S. I. Sadhar, Some new approaches for image restoration and blur identification using particle filter. PhD thesis, Dept. of Electrical Engg., IIT Madras, India, 2005.

[145] D. Zou, J. Tian, J. Bloom, and J. Zhai, “Data hiding in film grain,” in 5th International Workshop on Digital Watermarking, pp. 197–211, 2006.

[146] N. J. Gordon, D. J. Salmond, and A. F. M. Smith, “Novel approach to nonlinear/non-Gaussian Bayesian state estimation,” in IEE Proc. F: Radar and Signal Process., vol. 140, pp. 107–113, 1993.

[147] R. Birk, W. Camus, E. Valenti, and W. McCandless, “Synthetic aperture radar imaging systems,” IEEE AES Systems Magazine, vol. 10, pp. 15–23, 1995.

[148] K. Tomiyasu, “Tutorial review of synthetic-aperture radar (SAR) with applications to imaging of the ocean surface,” Proc. IEEE, vol. 66, pp. 563–587, 1978.

[149] O. Lankoande, M. M. Hayat, and B. Santhanam, “Speckle reduction of SAR images using a physically based Markov random field model and simulated annealing,” Proc. SPIE, vol. 5808, pp. 210–221, 2005.

[150] M. Costantini, A. Farina, and F. Zirilli, “The fusion of different resolution SAR images,” Proc. IEEE, vol. 85, pp. 139–146, 1997.

[151] M. Mastriani and A. E. Giraldez, “Kalman’s shrinkage for wavelet-based despeckling of SAR images,” Int. J. Intelligent Technology, vol. 1, pp. 190–196, 2006.

[152] F. Sattar, L. Floreby, G. Salomonsson, and L. Benny, “Image enhancement based on a nonlinear multiscale method,” IEEE Trans. Image Processing, vol. 6, pp. 888–895, 1997.

[153] R. C. Gonzalez and R. E. Woods, Digital image processing. Asia: Pearson Education, 2000.

[154] I. E. Abdou and W. K. Pratt, “Quantitative design and evaluation of enhancement/thresholding edge detectors,” Proc. IEEE, vol. 67, pp. 753–766, 1979.

[155] A. Hirani and T. Totsuka, “Combining frequency and spatial domain information for fast interactive image noise removal,” in Proc. SIGGRAPH, Computer Graphics Proceedings, pp. 269–276, 1996.

[156] A. Criminisi, P. Perez, and K. Toyama, “Region filling and object removal by exemplar-based image inpainting,” IEEE Trans. Image Processing, vol. 13, pp. 1200–1212, 2004.

[157] J. Shen, “Inpainting and the fundamental problem of image processing,” SIAM News, vol. 36, 2003.

[158] T. Chan, S. Kang, and J. Shen, “Euler’s elastica and curvature based inpaintings,” SIAM Journal on Applied Mathematics, vol. 63, pp. 564–592, 2002.

[159] T. Chan and J. Shen, “Non-texture inpaintings by curvature-driven diffusions (CDD),” J. Vis. Commun. Image Res., vol. 12, pp. 436–449, 2001.

[160] M. M. Oliveira, B. Bowen, R. McKenna, and Y. S. Chang, “Fast digital image inpainting,” in Proc. of the Inter. Conf. on Visualization, Imaging and Image Processing (VIIP), pp. 261–266, 2001.

[161] A. Telea, “An image inpainting technique based on the fast marching method,” J. Graphics Tools, vol. 9, pp. 25–36, 2004.

[162] K. Ko and S. Kim, “Efficient inpainting of old film scratch using Sobel edge operator based isophote computation,” in Proc. IEEE Int. Conf. Comput. and Inf. Tech., pp. 124–129, 2006.

[163] M. Yasuda, J. Ohkubo, and K. Tanaka, “Digital image inpainting based on Markov random field,” in CIMCA-IAWTIC, pp. 747–752, IEEE Computer Society, 2005.

[164] N. Komodakis and G. Tziritas, “Image completion using global optimization,” in Proc. IEEE Computer Society Conf. Computer Vision and Pattern Recognition, pp. 442–452, 2006.

[165] A. Rares, M. J. T. Reinders, and J. Biemond, “Image sequence restoration in the presence of pathological motion and severe artifacts,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing (ICASSP), pp. 3365–3368, 2002.

[166] A. L. Bertozzi, S. Esedoglu, and A. Gillette, “Inpainting of binary images using the Cahn-Hilliard equation,” IEEE Trans. Image Processing, vol. 16, pp. 285–291, 2007.

[167] A. N. Cohen, “The Cahn-Hilliard equation: mathematical and modeling perspectives,” Adv. Math. Sci. Appl., vol. 8, pp. 965–985, 1998.

[168] J. Jia and C. Tang, “Image repairing: Robust image synthesis by adaptive ND tensor voting,” in Proc. IEEE Computer Society Conf. Computer Vision and Pattern Recognition, pp. 643–650, 2003.

[169] J. Jia and C. K. Tang, “Inference of segmented color and texture description by tensor voting,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 26, pp. 771–786, 2004.

[170] G. Medioni, M. Lee, and C. Tang, A computational framework for feature extraction and segmentation. Elsevier Science, 2000.

[171] K. A. Patwardhan, G. Sapiro, and M. Bertalmio, “Video inpainting under constrained camera motion,” IEEE Trans. Image Processing, vol. 16, pp. 545–553, 2007.

[172] J. Jia, Y. Tai, T. Wu, and C. Tang, “Video repairing under variable illumination using cyclic motions,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 28, pp. 832–839, 2006.

[173] Y. Matsushita, E. Ofek, W. Ge, X. Tang, and H. Shum, “Full-frame video stabilization with motion inpainting,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 28, pp. 1150–1164, 2006.

[174] C. Ballester, M. Bertalmio, V. Caselles, L. Garrido, A. Marques, and F. Ranchin, “An inpainting-based deinterlacing method,” IEEE Trans. Image Processing, vol. 16, pp. 2476–2491, 2007.

[175] C. A. Z. Barcelos and M. A. Batista, “Image restoration using digital inpainting and noise removal,” Image and Vision Computing, vol. 25, pp. 61–69, 2007.

[176] J. Canny, “A computational approach to edge detection,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 8, pp. 679–714, 1986.

LIST OF PAPERS BASED ON THESIS

A. Journal Papers

1. G. R. K. S. Subrahmanyam, A. N. Rajagopalan and R. Aravind, “Importance sampling Kalman filter for image estimation”, IEEE Signal Processing Letters, vol. 14, pp. 453-456, 2007.

2. G. R. K. S. Subrahmanyam, A. N. Rajagopalan and R. Aravind, “Importance sampling unscented Kalman filter for film-grain noise removal”, IEEE Multimedia (to appear).

B. Conference Papers

1. G. R. K. S. Subrahmanyam, A. N. Rajagopalan and R. Aravind, “A new extension of Kalman filter to non-Gaussian priors”, Indian Conference on Computer Vision Graphics and Image Processing (ICVGIP’2006), pp. 162-171, 2006.

2. G. R. K. S. Subrahmanyam, A. N. Rajagopalan and R. Aravind, “Unscented Kalman filter for image estimation in film-grain noise”, IEEE International Conference on Image Processing (ICIP’2007), pp. IV-17 - IV-20, 2007.
