
IMAGE INPAINTING

Chapter-1

PREAMBLE

1.1 Introduction

Inpainting, the technique of modifying an image in an undetectable form, is as ancient as art itself. The goals and applications of inpainting are numerous, from the restoration of damaged paintings and photographs to the removal/replacement of selected objects. In our project, we introduce a novel algorithm for digital inpainting of still images that attempts to replicate the basic techniques used by professional restorers. After the user selects the regions to be restored, the algorithm automatically fills in these regions with information surrounding them. The fill-in is done in such a way that isophote lines arriving at the region's boundaries are completed inside. In contrast with previous approaches, the technique introduced here does not require the user to specify where the novel information comes from. This is done automatically (and quickly), allowing numerous regions containing completely different structures and surrounding backgrounds to be filled in simultaneously. In addition, no limitations are imposed on the topology of the region to be inpainted. Applications of this technique include the restoration of old photographs and damaged film; removal of superimposed text like dates, subtitles, or publicity; and the removal of entire objects from the image, like microphones or wires, in special effects.

In our project we present a novel algorithm for removing objects from digital photographs and replacing them with visually plausible backgrounds. The algorithm effectively hallucinates new color values for the target region in a way that looks "reasonable" to the human eye. In previous work, several researchers have considered texture synthesis as a way to fill large image regions with "pure" textures – repetitive two-dimensional textural patterns with moderate stochasticity. This is based on a large body of texture-synthesis research, which seeks to replicate texture ad infinitum, given a small source sample of pure texture. Of particular interest are exemplar-based techniques, which cheaply and effectively generate new texture by sampling and copying color values from the source. As effective as these techniques are in replicating consistent texture, they have difficulty filling holes in photographs of real-world scenes, which often consist of linear structures and composite textures, i.e. multiple textures interacting spatially. The main problem is that boundaries between image regions are a complex product of mutual influences between different textures.

In contrast to the two-dimensional nature of pure textures, these boundaries form what might be considered more one-dimensional, or linear, image structures. A number of algorithms specifically address this issue for the task of image restoration, where speckles, scratches, and overlaid text are removed. These image inpainting techniques fill holes in images by propagating linear structures (called isophotes in the inpainting literature) into the target region via diffusion. They are inspired by the partial differential equations of physical heat flow, and work convincingly as restoration algorithms. Their drawback is that the diffusion process introduces some blur, which is noticeable when the algorithm is applied to fill larger regions. The algorithm presented here combines the strengths of both approaches. As with inpainting, we pay special attention to linear structures. But linear structures abutting the target region only influence the fill order of what is at core an exemplar-based texture synthesis algorithm. The result is an algorithm that has the efficiency and qualitative performance of exemplar-based texture synthesis, but which also respects the image constraints imposed by surrounding linear structures.

Our algorithm builds on very recent research along similar lines. A related approach decomposes the image into structure and texture components, processes each separately, and takes the output image as the sum of the two processed components. This approach still remains limited to the removal of small image gaps, however, as the diffusion process continues to blur the filled region. One of the first attempts to use exemplar-based synthesis specifically for object removal was by Harrison [1]. There, the order in which a pixel in the target region is filled was dictated by the level of "texturedness" of the pixel's neighborhood. Although the intuition is sound, strong linear structures were often overruled by nearby noise, minimizing the value of the extra computation. Finally, Zalesny et al. describe an interesting algorithm for the parallel synthesis of composite textures. They devise a special-purpose solution for the interface between two textures.


1.2 Problem Statement

The problem addressed by this work can be stated as "Object Removal by Exemplar-based Inpainting". Through this work, a particular object can be entirely or partially removed from an image. Sometimes an image contains unnecessary data; in such situations the objects are first marked with an appropriate color and then the algorithm is applied. The algorithm references the original image and removes the marked object from it. Exemplar-based methods are inspired by greedy image-based texture-growing algorithms and by the global image completion approach that was recently proposed to solve quality problems in exemplar-based image completion. We overcome the problem of filling large areas with the textures that surround the area and reconstruct the image. This technique is faster than the previous methods.

1.3 Hardware Requirement & Software Requirement

Hardware Requirement

I. P4 processor over 2.5 GHz

II. Minimum 512MB RAM.

III. Accelerated Graphics Card.

IV. Minimum 8GB Hard Disk.

V. Better performance with 32KB and above Cache Memory.

Software Requirement

Matlab 7

Components: Windows components need to be updated with the appropriate service pack (XP/NT/2000 service pack, depending upon the system).


1.4 Organization of the rest of the report

The rest of the project report is organized under the following topics:

LITERATURE SURVEY provides a brief description of the presently existing system and of the new system we are developing in the project.

REQUIREMENT SPECIFICATION gives a description of what the user wants the system to do and also lists the software and hardware requirements.

ANALYSIS DESIGN AND IMPLEMENTATION describes the process of defining the architecture and data for a system to satisfy the specified requirements, together with the implementation.

RESULTS contains the results of the experiments, including images of the objects before and after inpainting.

CONCLUSION AND FUTURE ENHANCEMENT: the conclusion gives the final say on the issues discussed in the report; the future enhancement describes the future of the project we have developed.

APPENDIX: refers to the references and bibliography of the project.


Chapter-2

LITERATURE SURVEY

2.1 EXISTING SYSTEM

In this survey, techniques developed in three distinct but related fields of study, variational image inpainting, texture synthesis and image completion, are investigated. Variational image inpainting involves filling narrow gaps in images. Though there are challenging alternative methods, the best results are obtained by PDE-based algorithms. Texture synthesis is the reproduction of a texture from a sample. Statistical model-based methods were proposed first for texture synthesis; pixel- and patch-based sampling techniques were developed later, preserving texture structures better than the statistical methods.

Bertalmio et al. [2] pioneered a digital image inpainting algorithm based on PDEs; it can be seen as an extension of the level-lines-based disocclusion method, with which it shares the same basic idea. After the user selects the region to be inpainted, the two methods iteratively propagate information from outside the area along the level lines, or isophotes (lines of equal gray value); the difference lies in the goal of maintaining the angle of arrival. In order to maintain the angle of arrival, the direction of the largest spatial change is used. This direction may be obtained by computing a discretized gradient vector and rotating it by 90 degrees. Instead of using geodesic curves to connect the isophotes, the prolongation lines are progressively curved while preventing the lines from intersecting each other.

Disadvantages:

The inpainting method is slow when large areas must be filled.

Large object removal is difficult.


2.2 PROPOSED SYSTEM

A new algorithm is proposed for removing large objects from digital images. The challenge is to fill in the hole that is left behind in a visually plausible way. Exemplar-based methods are inspired by greedy image-based texture-growing algorithms and by the global image completion approach that was recently proposed to solve quality problems in exemplar-based image completion. We overcome the problem of filling large areas with the textures that surround the area and reconstruct the image. This technique is faster than the previous methods.


Chapter-3

IMAGE INPAINTING

A digital image a[m,n] described in 2D discrete space is derived from an analog image a(x,y) in a 2D continuous space through a sampling process that is frequently referred to as digitization. For now we will look at some basic definitions associated with the digital image.

The 2D continuous image a(x,y) is divided into N rows and M columns. The intersection of a row and a column is termed a pixel. The value assigned to the integer coordinates [m,n], with m = 0,1,2,...,M-1 and n = 0,1,2,...,N-1, is a[m,n]. In fact, in most cases a(x,y), which we might consider to be the physical signal that impinges on the face of a 2D sensor, is actually a function of many variables including depth (z), color (λ), and time (t). We will consider the case of 2D, monochromatic, static images.

Figure 3.1: Digitization of a continuous image. The pixel at coordinates [m=10, n=3] has

the integer brightness value 110.

The image shown in Figure 3.1 has been divided into N = 16 rows and M = 16

columns. The value assigned to every pixel is the average brightness in the pixel rounded

to the nearest integer value. The process of representing the amplitude of the 2D signal at


a given coordinate as an integer value with L different gray levels is usually referred to as

amplitude quantization or simply quantization.
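As a small illustration of these definitions, the following MATLAB sketch (a minimal example; the test image 'cameraman.tif' and the choice of L = 16 levels are ours, not values from the report) loads a grayscale image as an N x M array and re-quantizes its amplitude to L gray levels:

% Load a grayscale image: a is an N-by-M array (N rows, M columns).
a = im2double(imread('cameraman.tif'));   % brightness scaled to [0,1]
[N, M] = size(a);

% Uniform amplitude quantization to L gray levels.
L = 16;
q = floor(a * L);                         % integer level index
q(q == L) = L - 1;                        % keep a = 1 inside the top level
aq = q / (L - 1);                         % quantized image, again in [0,1]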

3.1 Characteristics of Image Operations

There is a variety of ways to classify and characterize image operations. The

reason for doing so is to understand what type of results we might expect to achieve with

a given type of operation or what might be the computational burden associated with a

given operation.

3.1.1 Types of operations

The types of operations that can be applied to digital images to transform an input

image a[m,n] into an output image b[m,n] (or another representation) can be classified

into three categories as shown in Table 2.

Table 2: Types of image operations (image size = N x N; neighborhood size = P x P; the complexity is specified in operations per pixel).

Point  - The output value at a specific coordinate is dependent only on the input value at that same coordinate. Complexity per pixel: constant.

Local  - The output value at a specific coordinate is dependent on the input values in the neighborhood of that same coordinate. Complexity per pixel: P^2.

Global - The output value at a specific coordinate is dependent on all the values in the input image. Complexity per pixel: N^2.

This is shown graphically in Figure 3.2.


Figure 3.2: Illustration of various types of image operations
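To make the three classes of Table 2 concrete, the short MATLAB sketch below applies one operation of each type to an image a. The specific operations chosen (inversion, 3 x 3 averaging, and the 2-D DFT) are illustrative examples of ours, not operations prescribed by the report:

a = im2double(imread('cameraman.tif'));

% Point operation: each output pixel depends only on the same input pixel (constant cost).
b_point = 1 - a;                          % brightness inversion

% Local operation: each output pixel depends on a P x P neighborhood (cost ~ P^2 per pixel).
P = 3;
b_local = conv2(a, ones(P)/P^2, 'same');  % 3 x 3 mean filter

% Global operation: each output value depends on every input pixel (cost ~ N^2 per pixel
% for a direct DFT; the FFT lowers the total cost, but the dependence is still global).
b_global = fft2(a);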

3.1.2 Types of neighborhoods

Neighborhood operations play a key role in modern digital image processing. It is

therefore important to understand how images can be sampled and how that relates to the

various neighborhoods that can be used to process an image.

Rectangular sampling - In most cases, images are sampled by laying a rectangular grid over an image, as illustrated in Figure 3.1. This results in the type of sampling shown in Figures 4.2(a) and 4.2(b).

Hexagonal sampling - An alternative sampling scheme is shown in Figure 4.2(c) and is termed hexagonal sampling.

Both sampling schemes have been studied extensively and both represent a possible periodic tiling of the continuous image space. We will restrict our attention, however, to rectangular sampling only, as it remains, due to hardware and software considerations, the method of choice.

Local operations produce an output pixel value b[m=mo, n=no] based upon the pixel values in the neighborhood of a[m=mo, n=no]. The most common neighborhoods are the 4-connected and 8-connected neighborhoods in the case of rectangular sampling, and the 6-connected neighborhood in the case of hexagonal sampling, illustrated in Figure 4.2.


Figure 4.2: (a) Rectangular sampling, 4-connected neighborhood; (b) rectangular sampling, 8-connected neighborhood; (c) hexagonal sampling, 6-connected neighborhood.

3.2 Contour Representations

When dealing with a region or object, several compact representations are available that can facilitate manipulation of and measurements on the object. In each case we assume that we begin with an image representation of the object, as shown in Figure 3.2.1(a) and (b). Several techniques exist to represent the region or object by describing its contour.

3.2.1 Chain code

This representation is based upon the work of Freeman. We follow the contour in a clockwise manner and keep track of the directions as we go from one contour pixel to the next. For the standard implementation of the chain code we consider a contour pixel to be an object pixel that has a background (non-object) pixel as one or more of its 4-connected neighbors. See Figure 3.2.1.

The codes associated with the eight possible directions are the chain codes and, with x as the current contour pixel position, the codes are generally defined as:

                  3  2  1
    Chain codes = 4  x  0
                  5  6  7


Figure 3.2.1 : Region (shaded) as it is transformed from (a) continuous to (b) discrete

form and then considered as a (c) contour or (d) run lengths illustrated in alternating

colors.

3.2.2 Chain code properties

Even codes 0, 2, 4, 6 correspond to horizontal and vertical directions;

odd codes 1, 3, 5, 7 correspond to the diagonal directions.

Each code can be considered as the angular direction, in multiples of 45°, that we must move to go from one contour pixel to the next.

The absolute coordinates [m, n] of the first contour pixel (e.g. top,

leftmost) together with the chain code of the contour represent a complete

description of the discrete region contour.

When there is a change between two consecutive chain codes, then the

contour has changed direction. This point is defined as a corner.
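A minimal MATLAB sketch of the encoding step is given below. It assumes the contour has already been traced into an ordered, closed list of [row, column] pixel coordinates in which consecutive pixels are 8-neighbours; the function name and the row/column convention ("up" meaning a decreasing row index) are our own choices, not taken from the report.

% Freeman chain code of a closed contour given as a K-by-2 list of [row col] pixels.
% Direction numbering follows the diagram above: 0 = right, 2 = up, 4 = left, 6 = down,
% odd codes are the diagonals in between.
function code = chain_code(contour)
    d = diff(contour([1:end 1], :));      % steps between consecutive pixels (contour closed)
    lookup = [3 2 1; 4 NaN 0; 5 6 7];     % indexed by (row step + 2, column step + 2)
    code = zeros(size(d, 1), 1);
    for k = 1:size(d, 1)
        code(k) = lookup(d(k, 1) + 2, d(k, 2) + 2);
    end
end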


3.2.3 Crack code

An alternative to the chain code for contour encoding is to use neither the contour pixels associated with the object nor the contour pixels associated with the background, but rather the line, the "crack", in between. This is illustrated with an enlargement of a portion of the region in Figure 3.2.3.

The "crack" code can be viewed as a chain code with four possible directions instead of eight:

                     1
    Crack codes = 2  x  0
                     3


Figure 3.2.3: (a) Object including part to be studied. (b) Contour pixels as used in the

chain code are diagonally shaded. The "crack" is shown with the thick black line.

The chain code for the enlarged section of Figure 3.2.3b, from top to bottom, is

5,6,7,7,0. The crack code is 3, 2, 3, 3, 0, 3, 0, 0.

3.2.4 Run codes

A third representation is based on coding the consecutive pixels along a row -- a run -- that belong to an object, by giving the starting position and the ending position of the run. Such runs are illustrated in Figure 3.2.1(d). There are a number of alternatives for the precise definition of the positions. Which alternative should be used depends upon the application and thus will not be discussed here.
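One simple choice (start and end column of each run, stored per row) can be sketched in MATLAB as follows; the function name and the cell-array output format are ours:

% Run-length representation of a binary object image: for every row, record
% [row, start column, end column] of each run of object (nonzero) pixels.
function runs = run_code(bw)
    runs = {};
    for r = 1:size(bw, 1)
        x = [0, double(bw(r, :)), 0];     % pad so runs touching the border are closed
        starts = find(diff(x) == 1);      % 0 -> 1 transitions: a run begins here
        stops  = find(diff(x) == -1) - 1; % 1 -> 0 transitions: the run ended one pixel before
        for k = 1:numel(starts)
            runs{end + 1} = [r, starts(k), stops(k)]; %#ok<AGROW>
        end
    end
end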

3.3 Tools

Certain tools are central to the processing of digital images. These include mathematical tools such as convolution, Fourier analysis, and statistical descriptions, and manipulative tools such as chain codes and run codes.

3.4 Convolution

There are several possible notations to indicate the convolution of two (multi-dimensional) signals to produce an output signal. The most common is b = a * c; for discrete images the convolution sum is b[m,n] = sum over (j,k) of a[j,k] c[m-j, n-k].

3.4.1 Properties of Convolution

There are a number of important mathematical properties associated with convolution.

Convolution is commutative.

Convolution is associative.

Convolution is distributive.

where a, b, c, and d are all images, either continuous or discrete.
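These properties are easy to verify numerically; the MATLAB sketch below checks them on small random test images using conv2 (full convolution). It is an illustration of the properties, not anything specified in the report:

% Numerical check of commutativity, associativity and distributivity of convolution.
a = rand(8);  b = rand(3);  c = rand(3);

err_comm  = max(max(abs(conv2(a, b)           - conv2(b, a))));
err_assoc = max(max(abs(conv2(conv2(a, b), c) - conv2(a, conv2(b, c)))));
err_dist  = max(max(abs(conv2(a, b + c)       - (conv2(a, b) + conv2(a, c)))));
% All three errors are at floating-point round-off level (~1e-15).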

3.4.2 Fourier Transforms

The Fourier transform produces another representation of a signal, specifically a representation as a weighted sum of complex exponentials. Because of Euler's formula, e^(j*theta) = cos(theta) + j*sin(theta), we can say that the Fourier transform produces a representation of a (2D) signal as a weighted sum of sines and cosines. The defining formulas for the forward Fourier and the inverse Fourier transforms are as follows. Given an image a and its Fourier transform A, the forward transform goes from the spatial domain (either continuous or discrete) to the frequency domain, which is always continuous:

    Forward:  A(u,v) = F{a} = double integral over x and y of a(x,y) e^(-j(ux+vy)) dx dy

The inverse Fourier transform goes from the frequency domain back to the spatial domain:

    Inverse:  a(x,y) = F^-1{A} = (1/(4 pi^2)) double integral over u and v of A(u,v) e^(+j(ux+vy)) du dv

The Fourier transform is a unique and invertible operation, so that a = F^-1{F{a}} and A = F{F^-1{A}}.

3.4.3 Properties of Fourier Transforms

There are a variety of properties associated with the Fourier transform and the inverse Fourier transform. The Fourier transform is, in general, a complex function of the real frequency variables. Among the most important properties is the convolution theorem, F{a * c} = A C: convolution in the spatial domain is equivalent to multiplication in the Fourier (frequency) domain, and vice versa. This is a central result which provides not only a methodology for the implementation of a convolution but also insight into how two signals interact with each other, under convolution, to produce a third signal.
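The MATLAB sketch below illustrates this central result: a spatial convolution is reproduced by multiplying the (zero-padded) Fourier transforms and transforming back. The Gaussian kernel and the test image are illustrative choices of ours:

% Convolution theorem: spatial convolution equals the inverse FFT of the product of FFTs,
% provided both signals are zero padded to the size of the full linear convolution.
a = im2double(imread('cameraman.tif'));
h = fspecial('gaussian', 9, 2);

[M, N] = size(a);  [P, Q] = size(h);
rows = M + P - 1;  cols = N + Q - 1;

spatial  = conv2(a, h);                                   % direct convolution ('full')
spectral = real(ifft2(fft2(a, rows, cols) .* fft2(h, rows, cols)));

max_difference = max(max(abs(spatial - spectral)));       % round-off level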


3.5 Probability distribution function of the brightnesses

The probability distribution function, P(a), is the probability that a brightness chosen from the region is less than or equal to a given brightness value a. As a increases from -infinity to +infinity, P(a) increases from 0 to 1. P(a) is monotonic and non-decreasing in a, and thus dP/da >= 0.

3.5.1 Probability density function of the brightnesses

The probability that a brightness in a region falls between a and a + Δa, given the probability distribution function P(a), can be expressed as p(a) Δa, where p(a) is the probability density function p(a) = dP(a)/da.

For an image with quantized (integer) brightness amplitudes, the interpretation of Δa is the width of a brightness interval. We assume constant-width intervals. The brightness probability density function is frequently estimated by counting the number of times that each brightness occurs in the region to generate a histogram, h[a]. The histogram can then be normalized so that the total area under the histogram is 1. Said another way, p[a] for a region is the normalized count of the number of pixels in the region that have quantized brightness a: p[a] = h[a] / (total number of pixels in the region), so that the sum of p[a] over all a equals 1.
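In MATLAB this estimate is a one-line normalization of the histogram; the sketch below (using the Image Processing Toolbox function imhist on an arbitrary test image) also accumulates the density into the distribution function P[a]:

% Estimate the brightness density p[a] and distribution P[a] of a region.
region = imread('cameraman.tif');    % here the "region" is simply the whole image
h = imhist(region);                  % histogram h[a] (256 bins for uint8 data)
p = h / sum(h);                      % density estimate: sum(p) == 1
P = cumsum(p);                       % distribution function: rises monotonically from 0 to 1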

The brightness probability distribution function for an example image is shown in Figure 3.5.1(a). The (unnormalized) brightness histogram of the same image, which is proportional to the estimated brightness probability density function, is shown in Figure 3.5.1(b). The height in this histogram corresponds to the number of pixels with a given brightness.


Figure 3.5.1: (a) Brightness distribution function with minimum, median, and maximum indicated. See text for explanation. (b) Brightness histogram.

Both the distribution function and the histogram as measured from a region are a

statistical description of that region. It must be emphasized that both P[a] and p[a] should

be viewed as estimates of true distributions when they are computed from a specific

region. That is, we view an image and a specific region as one realization of the various

random processes involved in the formation of that image and that region. In the same

context, the statistics defined below must be viewed as estimates of the underlying

parameters.

3.5.2 Average

The average brightness of a region is defined as the sample mean of the pixel brightnesses within that region.

3.5.3 Mode

The mode of the distribution is the most frequent brightness value. There is no

guarantee that a mode exists or that it is unique.


3.5.4 Perception

Many image processing applications are intended to produce images that are to be

viewed by human observers (as opposed to, say, automated industrial inspection.) It is

therefore important to understand the characteristics and limitations of the human visual

system--to understand the "receiver" of the 2D signals. At the outset it is important to

realize that 1) the human visual system is not well understood, 2) no objective measure

exists for judging the quality of an image that corresponds to human assessment of image

quality, and, 3) the "typical" human observer does not exist. Nevertheless, research in

perceptual psychology has provided some important insights into the visual system. See,

for example, Stockham.

3.6 Brightness Sensitivity

There are several ways to describe the sensitivity of the human visual system. To

begin, let us assume that a homogeneous region in an image has an intensity as a function

of wavelength (color) given by I(λ). Further let us assume that I(λ) = Io, a constant.

3.6.1 Wavelength sensitivity

The perceived intensity as a function of λ, the spectral sensitivity, for the "typical

observer" is shown in Figure 3.6.1 .

Figure 3.6.1: Spectral Sensitivity of the "typical" human observer


3.6.2 Stimulus sensitivity

If the constant intensity (brightness) Io is allowed to vary then, to a good approximation, the visual response, R, is proportional to the logarithm of the intensity. This is known as the Weber-Fechner law: R = k log(I). The implications of this are easy to illustrate. Equal perceived steps in brightness, ΔR = k, require that the physical brightness (the stimulus) increase exponentially. This is illustrated in Figure 3.6.2(a) and (b).
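In equation form (a standard statement of the Weber-Fechner law, written out here for clarity):

    R = k \log\left(\frac{I}{I_0}\right), \qquad \Delta R \approx k\,\frac{\Delta I}{I}

so a constant perceived step \Delta R corresponds to a constant relative step \Delta I / I; to produce equal perceived steps the stimulus must therefore grow geometrically, I_n = I_0 r^n.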

A horizontal line through the top portion of Figure 3.6.2(a) shows a linear increase in objective brightness (Figure 3.6.2(b)) but a logarithmic increase in subjective brightness. A horizontal line through the bottom portion of Figure 3.6.2(a) shows an exponential increase in objective brightness (Figure 3.6.2(b)) but a linear increase in subjective brightness.

Figure 3.6.2: (a) Brightness steps ΔI = k (top) and ΔI = k·I (bottom). (b) Actual brightnesses plus interpolated values.

The Mach band effect is visible in Figure 3.6.2 a. Although the physical

brightness is constant across each vertical stripe, the human observer perceives an

"undershoot" and "overshoot" in brightness at what is physically a step edge. Thus, just

before the step, we see a slight decrease in brightness compared to the true physical


value. After the step we see a slight overshoot in brightness compared to the true physical

value. The total effect is one of increased, local, perceived contrast at a step edge in

brightness.

3.6.3 Spatial Frequency Sensitivity

If the constant intensity (brightness) Io is replaced by a sinusoidal grating with

increasing spatial frequency (Figure 3.6.3 a), it is possible to determine the spatial

frequency sensitivity. The result is shown in Figure 3.6.3 b .

Figure 3.6.3: (a) Sinusoidal test grating; (b) spatial frequency sensitivity.

To translate these data into common terms, consider an "ideal" computer monitor at a viewing distance of 50 cm. The spatial frequency that will give maximum response is at 10 cycles per degree (see Figure 3.6.3(b)). One degree at 50 cm translates to 50 * tan(1°) = 0.87 cm on the computer screen. Thus the spatial frequency of maximum response is fmax = 10 cycles / 0.87 cm = 11.46 cycles/cm at this viewing distance.


3.7 Optical Illusions

The description of the human visual system presented above is couched in

standard engineering terms. This could lead one to conclude that there is sufficient

knowledge of the human visual system to permit modeling the visual system with

standard system analysis techniques.

Two simple examples of optical illusions, shown in Figure 3.7, illustrate that this

system approach would be a gross oversimplification. Such models should only be used

with extreme care.

Figure 3.7: Optical Illusions

The left illusion induces the perception of gray values that the brain "knows" do not exist. Further, there is a sense of dynamic change in the image due, in part, to the saccadic movements of the eye. The right illusion, Kanizsa's triangle, shows enhanced contrast and false contours, neither of which can be explained by the system-oriented aspects of visual perception described above.


3.8 Noise

Images acquired through modern sensors may be contaminated by a variety of

noise sources. By noise we refer to stochastic variations as opposed to deterministic

distortions such as shading or lack of focus. We will assume for this section that we are

dealing with images formed from light using modern electro-optics. In particular we will

assume the use of modern, charge-coupled device (CCD) cameras where photons produce

electrons that are commonly referred to as photoelectrons. Nevertheless, most of the

observations we shall make about noise and its various sources hold equally well for

other imaging modalities.

While modern technology has made it possible to reduce the noise levels

associated with various electro-optical devices to almost negligible levels, one noise

source can never be eliminated and thus forms the limiting case when all other noise

sources are "eliminated".

3.8.1 Photon Noise

When the physical signal that we observe is based upon light, then the quantum

nature of light plays a significant role. A single photon at = 500 nm carries an energy of

E = h = hc/ = 3.97 x 10-19 Joules. Modern CCD cameras are sensitive enough to be

able to count individual photons. The noise problem arises from the fundamentally

statistical nature of photon production. We cannot assume that, in a given pixel for two

consecutive but independent observation intervals of length T, the same number of

photons will be counted. Photon production is governed by the laws of quantum physics

which restrict us to talking about an average number of photons within a given observation window. The probability distribution for p photons in an observation window of length T seconds is known to be Poisson:

    P(p | ρT) = (ρT)^p e^(-ρT) / p!

where ρ is the rate or intensity parameter measured in photons per second. It is critical to understand that even if there were no other noise sources in the imaging chain, the statistical fluctuations associated with photon counting over a finite time interval T would still lead to a finite signal-to-noise ratio (SNR). For a Poisson process the average value and the standard deviation are given by:

    Poisson process -   average = ρT,   standard deviation = sqrt(ρT)

and we therefore have for the SNR:

    Photon noise -   SNR = ρT / sqrt(ρT) = sqrt(ρT)   (i.e. 10 log10(ρT) dB)

The three traditional assumptions about the relationship between signal and noise

do not hold for photon noise:

Photon noise is not independent of the signal;

Photon noise is not Gaussian; and

Photon noise is not additive.

For very bright signals, where ρT exceeds 10^5, the noise fluctuations due to photon statistics can be ignored if the sensor has a sufficiently high saturation level.
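These statements are easy to see in a small MATLAB simulation (poissrnd is a Statistics Toolbox function; the chosen photon counts are illustrative values of ours):

% Simulate photon counting: each observation is a Poisson count with mean rho*T.
rhoT = [10 100 1000 10000];
snr  = zeros(size(rhoT));
for k = 1:numel(rhoT)
    counts = poissrnd(rhoT(k), 1, 1e5);   % 100 000 independent observation windows
    snr(k) = mean(counts) / std(counts);  % approximately sqrt(rho*T)
end
% The noise grows with the signal (it is neither additive, Gaussian, nor independent
% of the signal), yet the SNR still improves as sqrt(rho*T) for brighter signals.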

3.8.2 Thermal Noise

An additional, stochastic source of electrons in a CCD well is thermal energy.

Electrons can be freed from the CCD material itself through thermal vibration and then,

trapped in the CCD well, be indistinguishable from "true" photoelectrons. By cooling the

CCD chip it is possible to reduce significantly the number of "thermal electrons" that

give rise to thermal noise or dark current. As the integration time T increases, the number

of thermal electrons increases. The probability distribution of thermal electrons is also a

Poisson process where the rate parameter is an increasing function of temperature. There

are alternative techniques (to cooling) for suppressing dark current and these usually


involve estimating the average dark current for the given integration time and then

subtracting this value from the CCD pixel values before the A/D converter. While this

does reduce the dark current average, it does not reduce the dark current standard

deviation and it also reduces the possible dynamic range of the signal.

3.8.3 On-chip Electronic Noise

This noise originates in the process of reading the signal from the sensor, in this

case through the field effect transistor (FET) of a CCD chip.

Readout noise can be reduced to manageable levels by appropriate readout rates

and proper electronics. At very low signal levels however, readout noise can still become

a significant component in the overall SNR.

3.8.4 KTC Noise

Noise associated with the gate capacitor of an FET is termed kTC noise and can be non-negligible (Ni = 252 electrons). This value is a "one time" noise per pixel that occurs during signal readout and is thus independent of the integration time. Proper electronic design that makes use, for example, of correlated double sampling and dual-slope integration can almost completely eliminate kTC noise.

3.8.5 Noise Removal from Images

Imagine an image with noise. For example, the image on the left below is a

corrupted binary (black and white) image of some letters; 60% of the pixels are thrown

away and replaced by random gray values ranging from black to white.


Figure 3.8.5: Noise removal. (a) A "noisy" image; (b) smoothed; (c) continued smoothing.

One goal in image restoration is to remove the noise from the image in such a way that the "original" image is discernible. Of course, "noise" is in the eye of the beholder; removing the "noise" from a Jackson Pollock painting would considerably reduce its value. Nonetheless, one approach is to decide that features that exist on a very small scale in the image are noise, and that removing these while maintaining larger features might help "clean things up".

One well-traveled approach is to smooth the image. The simplest version is to replace each pixel by the average of the neighboring pixel values. If we do this a few times we get the image in the middle of Figure 3.8.5; if we do it many times, we get the image on the right.

On the plus side, much of the spotty noise has been muted out. On the downside,

the sharp boundaries that make up the letters have been smeared due to the averaging.

While many more sophisticated approaches exist, the goal is the same: to remove the

noise, and keep the real image sharp. The trick is to not do too much, and to "know when

to stop".
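A minimal MATLAB version of this smoothing scheme is shown below. The salt-and-pepper corruption from imnoise stands in for the 60% random-gray corruption described above, and the number of passes is an arbitrary choice of ours:

% Iterated 3 x 3 mean filtering of a corrupted image.
a     = im2double(imread('cameraman.tif'));
noisy = imnoise(a, 'salt & pepper', 0.6);   % corrupt roughly 60% of the pixels
k     = ones(3) / 9;

smoothed = noisy;
for pass = 1:5                              % a few passes: noise fades, but edges blur too
    smoothed = conv2(smoothed, k, 'same');
end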


3.9 Texture Analysis

In many machine vision and image processing algorithms, simplifying

assumptions are made about the uniformity of intensities in local image regions.

However, images of real objects often do not exhibit regions of uniform intensities. For

example, the image of a wooden surface is not uniform but contains variations of

intensities which form certain repeated patterns called visual texture. The patterns can be

the result of physical surface properties such as roughness or oriented strands which often

have a tactile quality, or they could be the result of reflectance differences such as the

color on a surface.

Image texture, defined as a function of the spatial variation in pixel intensities

(gray values), is useful in a variety of applications and has been a subject of intense study

by many researchers. One immediate application of image texture is the recognition of

image regions using texture properties. Texture is the most important visual cue in

identifying these types of homogeneous regions. This is called texture classification. The

goal of texture classification then is to produce a classification map of the input image

where each uniform textured region is identified with the texture class it belongs to.

The texture features (texture elements) are distorted by the imaging process and the perspective projection; these distortions provide information about surface orientation and shape.

Texture analysis is important in many applications of computer image analysis for

classification or segmentation of images based on local spatial variations of intensity or

color. A successful classification or segmentation requires an efficient description of

image texture. Important applications include industrial and biomedical surface inspection, for example for defects and disease; ground classification and segmentation of satellite or aerial imagery; segmentation of textured regions in document analysis; and content-based access to image databases. A major problem is that textures in the real world are often not uniform, due to changes in orientation, scale or other visual appearance. In addition, the degree of computational complexity of many of the proposed texture measures is very high.


The texture classification process involves two phases: the learning phase and the recognition phase. In the learning phase, the target is to build a model for the texture content of each texture class present in the training data, which generally comprises images with known class labels. The texture content of the training images is captured

with the chosen texture analysis method, which yields a set of textural features for each

image. These features, which can be scalar numbers or discrete histograms or empirical

distributions, characterize given textural properties of the images, such as spatial

structure, contrast, roughness, orientation, etc. In the recognition phase the texture

content of the unknown sample is first described with the same texture analysis method.

Then the textural features of the sample are compared to those of the training images with

a classification algorithm, and the sample is assigned to the category with the best match.

Statistical methods analyze the spatial distribution of gray values, by computing

local features at each point in the image, and deriving a set of statistics from the

distributions of the local features. Depending on the number of pixels defining the local feature, statistical methods can be further classified into first-order (one pixel), second-order (two pixels) and higher-order (three or more pixels) statistics.

The basic difference is that first-order statistics estimate properties (e.g. average

and variance) of individual pixel values, ignoring the spatial interaction between image

pixels, whereas second- and higher-order statistics estimate properties of two or more

pixel values occurring at specific locations relative to each other. The most widely used statistical methods are co-occurrence features and gray-level differences, which have inspired a variety of modifications. Properties of texture primitives (e.g. area and average intensity) can also serve as texture features. In model-based approaches, the intensity function is considered to be a combination of a function representing the known structural information on the image surface and an additive random noise sequence. Pixel-based models view an image as a collection of pixels, whereas region-based models regard an image as a set of sub-patterns placed according to given rules.
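As a concrete illustration of first-order versus second-order features, the MATLAB sketch below computes both for a gray-level region (graycomatrix and graycoprops are Image Processing Toolbox functions; the report itself does not prescribe any particular feature set):

% First-order and second-order texture features of a gray-level region.
region = imread('cameraman.tif');

% First-order statistics: ignore the spatial arrangement of the gray values.
f1_mean     = mean2(region);
f1_variance = std2(region)^2;

% Second-order statistics: gray-level co-occurrence of pixel pairs at a fixed
% offset (here: horizontally adjacent pixels), summarized by a few properties.
glcm = graycomatrix(region, 'Offset', [0 1]);
f2   = graycoprops(glcm, {'Contrast', 'Correlation', 'Energy', 'Homogeneity'});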


3.10 Image Inpainting

A new algorithm is proposed for removing large objects from digital images. The

challenge is to fill in the hole that is left behind in a visually plausible way. In the past,

this problem has been addressed by two classes of algorithms:

(1) “Texture Synthesis” algorithms for generating large image regions from sample

textures,

(2) “Inpainting” techniques for filling in small image gaps.

The former has been demonstrated for "textures" – repeating two-dimensional patterns with some stochasticity; the latter focuses on linear "structures", which can be thought of as one-dimensional patterns, such as lines and object contours. The algorithm that we use here is a novel and efficient algorithm that combines the advantages of these two approaches. We first note that exemplar-based texture synthesis contains the essential process required to replicate both texture and structure; the success of structure propagation, however, is highly dependent on the order in which the filling proceeds. We propose a best-first algorithm in which the confidence in the synthesized pixel values is propagated in a manner similar to the propagation of information in inpainting. The actual colour values are computed using exemplar-based synthesis. In this approach the simultaneous propagation of texture and structure information is achieved by a single, efficient algorithm. Computational efficiency is achieved by a block-based sampling process. A number of examples on real and synthetic images demonstrate the effectiveness of our algorithm in removing large occluding objects as well as thin scratches. The algorithm effectively hallucinates new colour values for the target region in a way that looks "reasonable" to the human eye. Of particular interest are exemplar-based techniques which cheaply and effectively generate new texture by sampling and copying colour values from the source. As effective as these techniques are in replicating consistent texture, they have difficulty filling holes in photographs of real-world scenes, which often consist of linear structures and composite textures, i.e. multiple textures interacting spatially.


The main problem is that boundaries between image regions are a complex

product of mutual influences between different textures. These image inpainting

techniques fill holes in images by propagating linear structures (called isophotes in the

inpainting literature) into the target region via diffusion. They are inspired by the partial

differential equations of physical heat flow, and work convincingly as restoration

algorithms. Their drawback is that the diffusion process introduces some blur, which

becomes noticeable when filling larger regions. The technique presented here combines

the strengths of both approaches into a single, efficient algorithm. As with inpainting, we

pay special attention to linear structures. But linear structures abutting the target region only influence the fill order of what is at core an exemplar-based texture synthesis algorithm.

3.10.1 Exemplar-based synthesis

The core of the algorithm is an isophote-driven image-sampling process. It is well understood that exemplar-based approaches perform well for two-dimensional textures. But we note, in addition, that exemplar-based texture synthesis is sufficient for propagating extended linear image structures as well; i.e., a separate synthesis mechanism is not required for handling isophotes.

Figure 3.10.1 illustrates this point. For ease of comparison, we adopt notation similar to that used in the inpainting literature. The region to be filled, i.e., the target region, is indicated by Ω, and its contour is denoted δΩ. The contour evolves inward as the algorithm progresses, and so we also refer to it as the "fill front". The source region, Φ, which remains fixed throughout the algorithm, provides samples used in the filling process. We now focus on a single iteration of the algorithm to show how structure and texture are adequately handled by exemplar-based synthesis. Suppose that the square template Ψp ∈ Ω centred at the point p (fig. 3.10.1(b)) is to be filled. The best-match sample from the source region comes from the patch Ψq̂ ∈ Φ which is most similar to those parts that are already filled in Ψp. In the example in fig. 3.10.1(b), we see that if Ψp lies on the continuation of an image edge, the most likely best matches will lie along the same edge.


Figure 3.10.1: Structure propagation by exemplar-based texture synthesis

All that is required to propagate the isophote inwards is a simple transfer of the

pattern from the best-match source patch (fig. 3.10.1d). Notice that isophote orientation is

automatically preserved. In the figure, despite the fact that the original edge is not

orthogonal to the target contour δΩ, the propagated structure has maintained the same

orientation as in the source region. We focus on a patch-based filling approach, as this improves execution speed. Furthermore, we note that patch-based filling improves the accuracy of the propagated structures.

In figure 3.10.1: (a) Original image, with the target region Ω, its contour δΩ, and the source region Φ clearly marked. (b) We want to synthesize the area delimited by the patch Ψp centered on the point p ∈ δΩ. (c) The most likely candidate matches for Ψp lie along the boundary between the two textures in the source region, e.g., Ψq′ and Ψq″. (d) The best matching patch in the candidate set has been copied into the position occupied by Ψp, thus achieving partial filling of Ω. Notice that both texture and structure (the separating line) have been propagated inside the target region. The target region Ω has now shrunk and its front δΩ has assumed a different shape.
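The patch-matching step at the heart of this iteration can be sketched in MATLAB as below. This is a minimal, brute-force version under our own naming (the report gives no code): given the image, a logical mask of the target region Ω and the centre p of the template Ψp, it returns the centre of the source patch Ψq̂ with the smallest sum of squared differences over the pixels of Ψp that are already filled.

% Brute-force search for the best-match source patch.
% img   : grayscale image (double); values inside the target region are ignored
% omega : logical mask, true for target (unknown) pixels
% pr,pc : centre of the template (assumed at least w pixels away from the border)
% w     : half-width of the square patch (patch size is 2w+1)
function [qr, qc] = best_exemplar(img, omega, pr, pc, w)
    templ = img(pr-w:pr+w, pc-w:pc+w);
    known = ~omega(pr-w:pr+w, pc-w:pc+w);      % template pixels that are already filled
    best = inf;  qr = 0;  qc = 0;
    [R, C] = size(img);
    for r = 1+w : R-w
        for c = 1+w : C-w
            patch = omega(r-w:r+w, c-w:c+w);
            if any(patch(:)), continue; end    % candidate must lie entirely in the source
            d = (img(r-w:r+w, c-w:c+w) - templ).^2;
            ssd = sum(d(known));               % compare only on the known template pixels
            if ssd < best, best = ssd; qr = r; qc = c; end
        end
    end
end

In the full algorithm this search is repeated for the highest-priority patch on the fill front until the target region is empty.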

3.10.2 Filling order is critical

The previous section has shown how careful exemplar-based filling may be capable of propagating both texture and structure information. This section demonstrates that the quality of the output image synthesis is highly influenced by the order in which the filling process proceeds. Furthermore, we list a number of desired properties of the "ideal" filling algorithm. A comparison between the standard concentric-layer filling (onion peel) and the desired filling behaviour is illustrated in Figure 3.10.2. Figures 3.10.2(b), (c), (d) show the progressive filling of a concave target region via an anti-clockwise onion-peel strategy. As can be observed, this ordering of the filled patches causes the horizontal boundary between the background image regions to be reconstructed, unexpectedly, as a curve.

Figure 3.10.2: The importance of the filling order when dealing with concave target regions.

A better filling algorithm would be one that gives higher priority of synthesis to those regions of the target area which lie on the continuation of image structures, as shown in figs. 3.10.2(b'), (c'), (d'). Together with the property of correct propagation of linear structures, the latter algorithm would also be more robust towards variations in the shape of the target region.

In figure 3.10.2: (a) is a diagram showing an image and a selected target region (in white); the remainder of the image is the source. (b), (c), (d) show different stages in the concentric-layer filling of the target region; in (d) the onion-peel approach produces artefacts in the synthesized horizontal structure. (b'), (c'), (d') show filling of the target region by an edge-driven filling order, which achieves the desired artefact-free reconstruction; (d') is the final edge-driven reconstruction, where the boundary between the two background image regions has been reconstructed correctly.

Chapter-4


SYSTEM ANALYSIS DESIGN AND

IMPLEMENTATION

4.1 Fundamentals

Let Ω stand for the region to be inpainted, and ∂Ω for its boundary (note once again that no assumption on the topology of Ω is made). Intuitively, the technique we propose will prolong the isophote lines arriving at ∂Ω, while maintaining the angle of "arrival". We proceed drawing from ∂Ω inward in this way, while curving the prolongation lines progressively to prevent them from crossing each other. Before presenting the detailed description of this technique, let us analyze how experts inpaint. Conservators at the Minneapolis Institute of Arts were consulted for this work and made it clear to us that inpainting is a very subjective procedure, different for each work of art and for each professional. There is no such thing as "the" way to solve the problem, but the underlying methodology is as follows:

(1) The global picture determines how to fill in the gap, the purpose of inpainting being to restore the unity of the work.

(2) The structure of the area surrounding Ω is continued into the gap: contour lines are drawn via the prolongation of those arriving at ∂Ω.

(3) The different regions inside Ω, as defined by the contour lines, are filled with color, matching those of ∂Ω.

(4) The small details are painted (e.g. little white spots on an otherwise uniformly blue sky): in other words, "texture" is added.

A number of lessons can immediately be learned from these basic inpainting rules used by professionals. Our algorithm simultaneously, and iteratively, performs steps (2) and (3) above. We progressively "shrink" the gap Ω by prolonging inward, in a smooth way, the lines arriving at the gap boundary ∂Ω.


4.2 Design

4.2.1 Dataflow diagram:

A data flow diagram (DFD) is a graphical representation of the "flow" of data through an information system. A data flow diagram can also be used for the visualization of data processing (structured design).

Figure 4.2.1: The DFD for image processing. Input (painted image) -> Processing (fill the region with features from its surroundings) -> Output (inpainted image).

4.3 The inpainting algorithm

The primary goal here is to fill in the background that is occluded by a stationary or moving object, using information from the surrounding neighborhood. We first assign confidence values to each pixel in every frame. The confidence of pixels which are deemed to belong to the moving foreground or to the damaged area is set to zero. The rest of the pixels are initialized to a confidence value of one. The process of background filling is completed in two steps:

Temporal Filling-in: We search for the highest priority pixel location in

the complete video sequence. Temporal information (background pixels)

is copied from the temporally nearest undamaged location having the

highest confidence.

Spatial Filling-In: Once the temporal filling is over, we are left with an image with a hole at some location. We again find the highest-priority location to be filled in, and find a best matching patch. This patch is copied to all the frames, so as to maintain a consistent background throughout the sequence.

4.3.1 Computing Confidence and Filling-In Priority

During the temporal filling-in step, the priority of filling in the 3-D hole is computed. The confidence term C(p), where p is the pixel under consideration, is initialized to zero if p is moving or is damaged, and C(p) is initialized to one otherwise. The second relevant term is called the data term D(p), and its value is based on the availability of temporal information at location p. The data term is computed as:

    D(p) = (1 / (2n + 1)) * sum over t from -n to n of Mt(p)

where Q indicates the hole to be filled in, δQ is its boundary, and Mt = 0 if p is damaged or if p is moving in frame t, else Mt = 1. The time index t indicates the relative position of any frame from the current frame (to which p belongs, t = 0). The denominator is a normalizing constant equal to 2n + 1, where n indicates the number of previous and next frames considered (here 2n + 1 = 13).

Finally, the priority of filling in at a boundary pixel p ∈ δQ is given by the product of the two terms:

    P(p) = C(p) * D(p)

This priority determines the damaged frame and pixel location which we need to fill in first with background information.

We then copy temporal information patches having the highest confidence value from the temporally nearest frame to the location p. Since we are filling in temporally, and the camera remains fixed, we do not need to perform an explicit search. Such confidence-based nearest-neighbor copying is better than copying directly from a median image, since the median may not contain all the information available in the temporal neighborhood. Once we copy a patch to the highest-priority location p, the confidence at all previously damaged pixels in Ψp is updated as

    C(p) = ( sum over q in Ψp ∩ (I - Q) of C(q) ) / |Ψp|

where Ψp is a patch centered at p, |Ψp| is its area, and I denotes the image.
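The priority computation can be sketched in MATLAB as follows. This is a minimal single-iteration version under our own naming (the report gives no code); the 2n+1 temporal masks Mt are assumed to be stacked along the third dimension of M:

% Filling priority P(p) = C(p) * D(p) for pixels on the hole boundary dQ.
% C0 : current confidence map (1 = known background, 0 = damaged or moving)
% M  : masks of temporal availability, M(:,:,t) = 1 where frame t is usable
% w  : half-width of the patch over which the confidence is averaged
function P = fill_priority(C0, M, w)
    hole  = (C0 == 0);
    front = hole & imdilate(~hole, ones(3));            % hole pixels touching known pixels
    D     = sum(M, 3) / size(M, 3);                     % data term, normalized by 2n + 1
    Cavg  = conv2(C0, ones(2*w+1) / (2*w+1)^2, 'same'); % patch-averaged confidence C(p)
    P     = zeros(size(C0));
    P(front) = Cavg(front) .* D(front);                 % nonzero only on the fill front
end

The location of the maximum of P then gives the pixel (and, with the temporal dimension added back, the frame) that is filled in first, as described above.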

4.4 MATLAB

4.4.1 Introduction to Matlab

In the 1960s and 1970s before the appearance of personal computers, complex

and large scale calculations were done on large mainframes using code primarily


developed with FORTRAN. As a number of related large subroutines were developed for

specific computational purposes, they were organized into public domain packages and

distributed for free. Matlab was originally created as a front end for one of these, the

LINPACK package -- a group of routines for working with matrices and linear algebra.

The primary developer, Professor Cleve Moler at the University of New Mexico,

eventually founded Mathworks, Inc., to further develop and market the product in a

commercial setting. From the original Matlab, a high powered suite of applications has

evolved. The current generation release, the Matlab R2006b suite, features the newest kernel, Matlab 7.3. It is largely backward compatible with recent Matlab versions, but there may be some slight changes.

Figure 4.4.1: The Matlab snapshot

4.4.2 The Current Directory Window

The Current Directory window displays a current directory with a listing of its

contents. There is navigation capability for resetting the current directory to any directory

among those set in the path. This window is useful for finding the location of particular


files and scripts so that they can be edited, moved, renamed, deleted, etc. The default

current directory is the Work subdirectory of the original Matlab installation directory.

4.4.3 The Workspace Window

The Workspace window provides an inventory of all the items in the workspace

that are currently defined, either by assignment or calculation, in the Command window

or by importation with a load or similar command from the Matlab command line

prompt. These items consist of the set of arrays (including 1x1 scalars) whose elements

are variables or constants and which have been constructed or loaded during the current

Matlab session and have remained stored in memory.

4.4.4 The Command History Window

The Command History window, at the lower left in the default desktop, contains a

log of commands that have been executed within the Command window. This is a

convenient feature for tracking when developing or debugging programs or to confirm

that commands were executed in a particular sequence during a multi-step calculation

from the command line.

4.4.5 The Command Window

The Command window is where the command line prompt for interactive

commands is located. This is also the only window that appears if you execute the UNIX

version of Matlab outside of an X environment, e.g., on a vt100 screen. Commands and

scripts can be executed from a vt100 window, but graphics and desktop tools will not be

available. The Matlab prompt on the command window consists of two adjacent right

angle brackets, i.e., >>. Results of command operations will also be displayed in this

window unless the command line is terminated by a semi-colon, in which case the

display of results is suppressed.

4.4.6 The Help Window


Separate from the main desktop layout is a Help desktop with its own layout. This

utility can be launched by selecting Help ->MATLAB Help from the Help pull down

menu. This Help desktop has a right side which contains links to help with functions,

help with graphics, and tutorial type documentation. The left side has various tabs that

can be brought to the foreground for navigating by table of contents, by indexed

keywords, or by a search on a particular string.

4.5 Principal Components of Inpainting

In essence, image inpainting is quite different from denoising or image enhancement. Unlike common image enhancement applications, in which pixels contain both information about the real data and the noise, in image inpainting the only information available for reconstruction is the average value of the erased block. Therefore, it is necessary to develop new techniques to address these problems. With regard to image inpainting, there have traditionally been three main types of methods. The first deals with the restoration of films, the second is related to structure and texture filling-in of missing image blocks, and the last is related to disocclusion. Our algorithm is based on solving partial differential equations (PDEs), similar to diffusion, using the pixel-value gradients of neighboring blocks.

4.5.1 Block Classification

The image inpainting region is divided into blocks. We first check whether each block is recoverable using either texture or structure inpainting. Then, the blocks to be removed, and the appropriate algorithm for their later reconstruction, are selected based on the analysis of the first step. For this decision, the relationship with the eight surrounding blocks is critical, because the neighboring blocks provide the information needed for the reconstruction of a missing block.

Texture inpainting successfully reconstructs a missing block when the surrounding blocks have similar statistical characteristics. Such similarity is briefly checked with the mean and variance of each neighboring block: if the mean and variance of the surrounding blocks are similar, then the block in the middle is recoverable with texture inpainting. In contrast to texture inpainting, which has limited applicability for filling in missing regions, structure inpainting reconstructs images successfully in general. We can apply structure inpainting when a block satisfies two conditions: 1) it does not contain 'strong edges', and 2) it is not composed of fine repetitive patterns. Sharp (strong) edges are critical when humans recognize the shape of an object, and structure inpainting blurs sharp edges because the algorithm is based on diffusion equations. Likewise, fine details are not recoverable, because the diffusion equation continuously connects the missing region with the surrounding region.
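A minimal Matlab sketch of this classification check is given below. The function name classifyBlock, the 8x8 block size, and the tolerances meanTol and varTol are illustrative assumptions rather than values prescribed by the algorithm:

    function label = classifyBlock(block, neighbours, meanTol, varTol)
    % classifyBlock  Decide how a candidate block could later be recovered.
    %   block      - the 8x8 block that we consider removing (double, grayscale)
    %   neighbours - cell array holding the eight surrounding 8x8 blocks
    %   meanTol, varTol - similarity tolerances (illustrative values)

        m = cellfun(@(b) mean(b(:)), neighbours);   % mean of every neighbour
        v = cellfun(@(b) var(b(:)),  neighbours);   % variance of every neighbour

        % The block is treated as texture when its own statistics stay close
        % to those of all eight neighbours.
        similarMean = all(abs(m - mean(block(:))) < meanTol);
        similarVar  = all(abs(v - var(block(:)))  < varTol);

        if similarMean && similarVar
            label = 'texture';     % recoverable by texture inpainting
        else
            label = 'structure';   % handled by diffusion-based structure inpainting
        end
    end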

4.5.2 Removing Blocks

Our goal is to remove as many blocks as possible and still recover the image with little perceptual difference. Based on the classification of Section 4.5.1, we designed an algorithm to automatically find suitable masks using the following general rules (a minimal sketch follows the list):

o For blocks recoverable using texture inpainting, remove alternate blocks, because the surrounding blocks are needed for texture inpainting.

o Remove as many blocks as possible in smooth areas that can be restored using structure inpainting; such blocks are not noticeable even if we simply fill in the DC value.

o Remove alternate blocks along a weak edge, since neighboring blocks are needed to connect the edge.

o Fill each removed block with the mean pixel value it had before removal.
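The following Matlab fragment is a minimal sketch of how such a mask could be assembled from per-block labels; the label encoding, the function name buildMask, and the checkerboard pattern are assumptions made purely for illustration:

    function mask = buildMask(labels)
    % buildMask  Sketch of the mask-selection rules listed above.
    %   labels - one entry per 8x8 block: 0 = flat structure, 1 = structure
    %            containing a weak edge, 2 = texture
    %   mask   - logical matrix, true where the corresponding block is removed

        [rows, cols] = size(labels);
        [cc, rr] = meshgrid(1:cols, 1:rows);
        checker  = mod(rr + cc, 2) == 0;           % alternating (checkerboard) pattern

        mask = false(rows, cols);
        mask(labels == 2) = checker(labels == 2);  % texture: remove alternate blocks
        mask(labels == 1) = checker(labels == 1);  % weak edge: remove alternate blocks
        mask(labels == 0) = true;                  % smooth area: remove aggressively
    end

Each removed block would then be replaced by its stored mean (DC) value before inpainting.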


Figure 4.4.2: An example of a mask on the Peppers test image (black: texture, gray: structure with edge, white: flat structure)

4.5.3. Texture Inpainting

We classify blocks whose statistical properties are similar to those of the surrounding blocks as texture. To fill a texture block, we therefore exploit this statistical similarity with the other blocks. (Our texture synthesis algorithm is not much different from previous work.) Briefly speaking, texture inpainting finds the statistically best match among the referable surrounding blocks. In detail, the texture synthesis process is as follows. First, when the texture synthesis function is called, it is given the referable neighborhood block information. Second, we set a template, 3x3 or 4x4 pixels, adjacent to the missing pixel we want to fill; this template is used as the best-match criterion. Using an MMSE criterion, we find the best match among the referable neighborhood blocks and copy the pixel adjacent to the matched position. The filling order is from top to bottom and from left to right. Finally, to obtain the same DC value as before, we normalize the pixel values of the filled texture block. Because this texture synthesis method is similar to previous work, we also used another method. In a large picture (512x512), an 8x8 block is not a large portion, and in smooth areas we do not use texture synthesis because of the resulting PSNR, even though texture synthesis is faster than structure inpainting. In most cases, we use texture synthesis only in very coarse or patterned areas. For these reasons, we copy a whole 8x8 block into the missing texture block, after finding the block whose mean and variance are closest to those of the missing block (a minimal sketch is given below). In this case we obtain a better result visually, while the PSNR remains almost the same.
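A minimal sketch of this whole-block variant follows. The function name fillTextureBlock and the assumption that the mean and variance of the erased block were stored before removal are illustrative, not part of a fixed interface:

    function filled = fillTextureBlock(candidates, targetMean, targetVar)
    % fillTextureBlock  Copy the referable 8x8 block whose mean and variance
    % are closest to the stored statistics of the missing block, then shift
    % it so the filled block keeps the original DC (mean) value.
    %   candidates - cell array of referable neighbouring 8x8 blocks
    %   targetMean, targetVar - statistics of the block before it was removed

        best = candidates{1};
        bestDist = inf;
        for k = 1:numel(candidates)
            c = candidates{k};
            d = (mean(c(:)) - targetMean)^2 + (var(c(:)) - targetVar)^2;
            if d < bestDist
                bestDist = d;
                best = c;
            end
        end
        % Re-normalise so that the copied block has the same DC value as before.
        filled = best - mean(best(:)) + targetMean;
    end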

4.5.4. Structure Inpainting

A structure region is one that can be clearly divided into two or more sectors by clear edges. Each sector is relatively devoid of minute details, and the missing block within a sector is easily predicted from the surrounding blocks. Stated simply, structure inpainting is the process of gradually propagating the information contained in the surrounding blocks into the missing block.
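As a minimal sketch, this propagation can be written as a simple isotropic diffusion iteration in Matlab; the function name diffuseFill and the iteration count are illustrative, and the actual implementation may use a more refined PDE:

    function img = diffuseFill(img, missing, nIter)
    % diffuseFill  Gradually propagate surrounding information into the
    % missing block by repeatedly replacing every missing pixel with the
    % average of its four neighbours (a simple heat-equation iteration).
    %   img     - grayscale image (double); missing pixels hold the DC value
    %   missing - logical mask, true at the pixels of the erased block
    %   nIter   - number of diffusion iterations, e.g. 200

        for it = 1:nIter
            up    = circshift(img, [-1  0]);
            down  = circshift(img, [ 1  0]);
            left  = circshift(img, [ 0 -1]);
            right = circshift(img, [ 0  1]);
            avg   = (up + down + left + right) / 4;
            img(missing) = avg(missing);   % known pixels are never modified
        end
    end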

4.6 Implementation

We implemented the entire algorithm from the exemplar-based inpainting paper [5]. A pseudo-code description of the algorithm (from the paper) follows. Let R represent the region to be filled, I the entire image, S = I - R the source region from which candidate exemplars are chosen, P(p) the priority of pixel p, C(p) the confidence term for pixel p, D(p) the data term for pixel p, and t the iteration index:

Extract the manually selected initial front δR0.

Repeat until done:

o Identify the fill front δRt. If δRt is empty, exit.

o Compute the priorities P(p) = C(p)*D(p) for all pixels p on δRt.

o Find the patch Ψp* centred on the front pixel p* with maximum priority, i.e. p* = arg max of P(p) over p on δRt.

o Find the exemplar patch Ψq* in the source region S that minimizes the sum squared error (SSE) with respect to the already-known pixels of Ψp*.

o Copy the image data from Ψq* into the unfilled pixels of Ψp*.

o Update C(p) = C(p*) for every pixel p of Ψp* that has just been filled.

The main contribution of this algorithm is the priority/patch-ordering mechanism

that allows an exemplar-based approach to respect the structural features of the input

image. The priority is composed of a confidence term, C(p), and a data term, D(p), both

defined over pixels:

C(p) = ( Σ C(q), summed over the pixels q of Ψp that are already filled or outside R ) / ( area of Ψp )

D(p) = | ∇I⊥(p) · n(p) | / α

where Ψp is the patch centred at p, ∇I⊥(p) is the isophote (the image gradient rotated by 90 degrees) at p, n(p) is the unit normal to the fill front at p, and α is a normalisation factor (e.g., 255 for a grayscale image).

Intuitively, the confidence term measures how sure a pixel is of its own value; this

is computed from the confidence of surrounding pixels that have already been filled (or

weren't in the fill region to begin with). Confidence tends to decay as the center of the fill

region is approached. Because of this, if the priority only consisted of the confidence

term, the patches would be selected in an "onion-peel" manner, which is typical of

current exemplar-based approaches. Confidence ignores structural information in the

image, however. This is why the data term is necessary; it measures how strongly an isophote at a pixel collides with the fill-front contour at that same pixel. An isophote is essentially the gradient at a pixel rotated by 90 degrees; it captures the "strength of flow" of an edge. If only the data term were used in the priority, however, edges would end up propagating where they should not. It is the balance between the two factors that produces good results. In this vein, both quantities are normalized (to lie between 0 and 1) by appropriate factors.
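The following Matlab fragment is a minimal sketch of how the two terms could be evaluated on the fill front. The patch half-width w, the normalisation factor alpha, and the function name computePriorities are assumptions made for illustration, and the sketch assumes the Image Processing Toolbox is available for imerode:

    function [P, C, D] = computePriorities(img, fillMask, C, w, alpha)
    % computePriorities  Evaluate P(p) = C(p)*D(p) for every pixel on the
    % current fill front of a grayscale image img (double).
    %   fillMask - logical mask, true at pixels still to be filled (region R)
    %   C        - current confidence map (1 outside R, 0 inside R initially)
    %   w        - patch half-width, e.g. 4 for 9x9 patches
    %   alpha    - normalisation factor, e.g. 255 for 8-bit images

        [gx, gy] = gradient(img);               % image gradient
        isoX = -gy;  isoY = gx;                 % isophote: gradient rotated 90 degrees
        [nx, ny] = gradient(double(fillMask));  % (unnormalised) normal to the front

        front = fillMask & ~imerode(fillMask, ones(3));   % boundary pixels of R
        P = zeros(size(img));  D = zeros(size(img));
        [rs, cs] = find(front);

        for i = 1:numel(rs)
            r = rs(i);  c = cs(i);
            rows = max(1, r-w):min(size(img,1), r+w);
            cols = max(1, c-w):min(size(img,2), c+w);
            known = ~fillMask(rows, cols);                         % already-known pixels
            Cp = sum(sum(C(rows, cols) .* known)) / numel(known);  % confidence term
            n  = [nx(r, c), ny(r, c)];
            n  = n / (norm(n) + eps);                              % unit front normal
            Dp = abs(isoX(r, c)*n(1) + isoY(r, c)*n(2)) / alpha;   % data term
            C(r, c) = Cp;  D(r, c) = Dp;  P(r, c) = Cp * Dp;
        end
    end

The patch centred on the highest-priority front pixel would then be filled from the source-region exemplar with the smallest SSE over its already-known pixels, after which the confidence of the newly filled pixels is updated.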

Chapter-5

RESULTS


A synthetic image: we created a 213x284 synthetic image for testing and sanity checking. Our implementation took 25 seconds to inpaint this image, and it correctly painted in the missing parts. The interesting thing to note about this example is the presence of the red dots in the data term; they lie along the hard edge between the black and gray regions. Because of such a strong edge, the pixels along that edge receive a high priority, and therefore those patches are filled first.

Painted images


Natural Images


Again, the data term reveals something about how linear structure is preserved. At two points in the jumper's torso, the data term becomes larger along the two edges created by the roof of the building in the background (it is hard to see; look for the red dots).


Chapter-6

CONCLUSION AND FUTURE WORK

6.1 Conclusion

Inpainting, the technique of modifying an image in an undetectable form, finds application in the restoration of old photographs and damaged films, the removal of superimposed text such as dates, subtitles, or publicity, and the removal of entire objects from an image.

The exemplar-based method, inspired by greedy image-based texture-growing algorithms, together with the global image completion method, was recently proposed for image inpainting, and we adopted it as the problem for our project. We implemented the technique in Matlab, taking an image file as input.

The region of the image to be filled is first marked in green, and the algorithm is then run to obtain the results discussed in the Results chapter. It can be observed that the algorithm is capable of removing unwanted objects from an image, and it was also found to be fast. At a glance the results do not reveal that any inpainting has been performed, but on keen observation it can be seen that the resolution of the filled region is not at its best.

6.2 Future work

The work can be extended to video.

The present work used a global image completion approach; a local image completion approach could also be used to improve the resolution.


Chapter-7

REFERENCES

[1] P. Harrison. A non-hierarchical procedure for re-synthesis of complex texture. In Proc. Int. Conf. Central Europe Comp. Graphics, Visualization and Comp. Vision, Plzen, Czech Republic, February 2001.

[2] M. Bertalmio, A. L. Bertozzi, and G. Sapiro. Navier-Stokes, fluid dynamics, and image and video inpainting. In Proc. Conf. Comp. Vision Pattern Rec., pages I:355-362, Hawaii, December 2001.

[3] A. Efros and W. T. Freeman. Image quilting for texture synthesis and transfer. In Proc. ACM Conf. Comp. Graphics (SIGGRAPH), pages 341-346, Eugene Fiume, August 2001.

[4] A. Zalesny, V. Ferrari, G. Caenen, and L. van Gool. Parallel composite texture synthesis. In Texture 2002 Workshop - ECCV, Copenhagen, Denmark, June 2002.

[5] A. Criminisi, P. Perez, and K. Toyama. Object removal by exemplar-based inpainting. In Proc. Conf. Comp. Vision Pattern Rec., Madison, WI, June 2003.

[6] M. Bertalmio, G. Sapiro, V. Caselles, and C. Ballester. Image inpainting. In Proc. ACM Conf. Comp. Graphics (SIGGRAPH), pages 417-424, New Orleans, LA, July 2000. http://mountains.ece.umn.edu/~guille/inpainting.htm.

[7] M. Bertalmio, L. Vese, G. Sapiro, and S. Osher. Simultaneous structure and texture image inpainting. In Proc. Conf. Comp. Vision Pattern Rec., Madison, WI, 2003. http://mountains.ece.umn.edu/~guille/inpainting.htm.
