image completion - university of michiganhyunjinp/proj/sample/... · today, smooth object removal...

Image Completion

A Comparative Analysis of Exemplar-Based Image Inpainting (EBII) and Simultaneous Cartoon and Texture Image Inpainting Using Sparse Representations

Abstract

Repairing cracks in old photographs, recovering pixels lost in data transmission, and removing text from a digital image are typical applications of image restoration. In digital images, background blur and edge discontinuities have been obvious indicators that an image has been modified. Today, smooth object removal is particularly useful in applications like special effects and digital photo touch-up. This project explores two recent digital techniques for filling in regions of an image. Previously, exemplar-based algorithms and image inpainting techniques were researched separately. Image inpainting focuses heavily on propagating linear structures, i.e. extending continuous lines and edges, while exemplar-based algorithms focus on replicating textures in order to create a smooth-looking background. The two techniques have been combined into one, Exemplar-Based Image Inpainting (EBII), which we explore further. We compare EBII with a newer technique, Simultaneous Cartoon and Texture Image Inpainting using Morphological Component Analysis (MCA). MCA is one of the most recent inpainting methods and combines the advantages of variational and local statistical analysis methods, i.e., it performs well on images that contain both piecewise smooth regions (cartoon) and texture. However, when the goal is to fill in a large missing region, it is outperformed by EBII. The analysis of this and other tradeoffs, together with implementation details, is of great interest because the method uses a novel approach and still lacks a complete theoretical formulation.

I. Introduction

The typical image restoration problem can be seen as the estimation of the true image from a noisy or corrupted observation. The estimator is usually obtained from past (noisy) measurements and is designed to provide a good approximation to the original image. However, when data are totally absent, common estimators and interpolation methods turn out to be inadequate because of the lack of a priori information about the signal. The obvious question is then: how should a missing region of an image be determined?

Intuitively, one should make this “hole” as close as possible to its surrounding pattern in order to provide a smooth and continuous restored image that is approximately equal to the original. This is the basic idea behind image inpainting methods: fill in the missing region with a pattern as similar as possible to that of its surrounding regions, ideally avoiding blur and edge discontinuities while remaining robust to noise contamination. Image inpainting has a wide range of applications, from photo restoration to reliable transmission in the presence of errors, image coding, and compression.

In this project, we analyze two recent digital techniques for filling in regions of an image: Exemplar-Based Image Inpainting (EBII) and Simultaneous Cartoon and Texture Image Inpainting using Morphological Component Analysis (MCA). Image inpainting focuses on linear structures, i.e. creating continuous lines and edges. Exemplar-based algorithms focus on repeating textures in order to create a smooth-looking background. EBII combines both techniques. MCA separates the image into its texture and structure components and then inpaints the texture, the structure (cartoon), or a region formed by a linear combination of both. The separation into cartoon and texture is achieved with overcomplete and sparse representations.

A comparative analysis is made between EBII and MCA. Both algorithms should be able to restore images with relatively small amounts of degradation; however, there are regions where the differences are noticeable, and we use these to compare their performance and respective tradeoffs. To better understand the strengths and weaknesses of each algorithm, we also explore other types of degradation, namely the removal of larger objects and the presence of additive noise.
With the goal of determining which algorithm most successfully completes images under different types of corruption, we will try to trick the human eye into believing that our reconstructed images have not been altered, i.e., that the object or degradation never existed.

II. Algorithms

A. Exemplar-Based Image Inpainting

Exemplar-Based Image Inpainting follows a more intuitive approach to completing an image. The algorithm works by copying groups of pixels (patches) from the known region into the missing region. We define a “patch” for each pixel as the neighborhood of pixels defined by

$$\left\{ x : p_x - \frac{\text{patch size}}{2} \le x \le p_x + \frac{\text{patch size}}{2} \right\}$$

where $p_x$ is the x-coordinate of each pixel, and

$$\left\{ y : p_y - \frac{\text{patch size}}{2} \le y \le p_y + \frac{\text{patch size}}{2} \right\}$$

where $p_y$ is the y-coordinate of each pixel. Instead of copying pixel by pixel, we fill in the missing region patch by patch. The most important decision in this algorithm turns out to be which missing pixel to start filling in around (please refer to Figure 3 for a graphical explanation). In the object removal literature, an image is regarded as composed of two kinds of structure, so-called “composite textures” and “isophotes”. Basically, isophotes are considered to be the linear structures in the image, whereas composite textures consist of the background elements. The human eye is very good at detecting discontinuities because natural images are continuous [5], [6]. To preserve the continuity of sharp edges, the algorithm must select where to start filling in the unknown target area, since starting from a random patch may result in the loss of linear structures (illustrated in Figure 1).
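The patch definition above can be sketched in a few lines of NumPy. This is only an illustration: the helper name `patch_bounds` is hypothetical, and clipping the window at the image border is an assumption, since the text does not say how border pixels are handled.

```python
import numpy as np

def patch_bounds(p, patch_size, shape):
    """Return (row_start, row_stop, col_start, col_stop) of the patch centered at p.

    The window spans p - patch_size/2 .. p + patch_size/2 along each axis and is
    clipped at the image border (an assumption, not stated in the text).
    """
    half = patch_size // 2
    r, c = p
    r0, r1 = max(r - half, 0), min(r + half + 1, shape[0])
    c0, c1 = max(c - half, 0), min(c + half + 1, shape[1])
    return r0, r1, c0, c1

# A 7x7 test image whose pixel values are just their flat indices.
image = np.arange(49).reshape(7, 7)
r0, r1, c0, c1 = patch_bounds((3, 3), patch_size=3, shape=image.shape)
patch = image[r0:r1, c0:c1]  # 3x3 neighborhood around pixel (3, 3)
print(patch)
```

With an odd patch size the center pixel sits exactly in the middle of the window, which is why odd sizes are the natural choice here.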


Figure 1. The white region is the target region that should be filled in. The top row shows the result of the filling process when we start from an arbitrary position. The bottom row shows the filling steps of the EBII algorithm. Note that where the algorithm starts filling in is essential for preserving the linear structures. (Extracted from [1])

Filling priority is thus given to certain portions of the target region, and it depends on two functions, “Data” and “Confidence”. “Data” is a measure of the linear structures thought to be in the patch, while “Confidence” is the ratio of known pixels in the unknown target patch to the patch area. The purpose of “Confidence” is to give more priority to the patches containing more source pixels, such as patches covering small unknown corners and thin unknown tendrils. The importance of correctly determining Priority is illustrated in Figure 2. In this image, the missing region is the upside-down white triangle (∇) in the center of the image. Figure 2.c shows that the algorithm starts filling the target from the patches that have high priority in terms of “Data”, in order to create smooth continuations of the circle edges. Also, note that the circles are completely filled in on the transition from Figure 2.d to 2.e, as those patches have high “Confidence”. If the algorithm had started in the center of the triangle, without prioritization through “Data” and “Confidence”, the circles and the triangle would most probably have been corrupted.

Figure 2. “Kanizsa triangle”: graphical example of why a correct Priority function can make a difference. Note how the algorithm first extends the known boundaries into the unknown region. (Extracted from [1])


(1) Algorithm:

1. Determine the contours of the target region:
The user interactively selects the region to be removed by painting it pure red (255, 0, 0). The algorithm finds the boundary of this region by searching for the pixels that have pure red on one side and another color on the other side.
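A minimal sketch of this contour search, assuming an RGB array of shape (height, width, 3). The text leaves open whether a contour pixel is itself red; the sketch below takes the reading that contour pixels are known (non-red) pixels that touch a red pixel through a 4-connected neighbor. The function names `target_mask` and `contour` are hypothetical.

```python
import numpy as np

def target_mask(rgb):
    """True where the pixel is pure red (255, 0, 0), i.e. marked for removal."""
    return (rgb[..., 0] == 255) & (rgb[..., 1] == 0) & (rgb[..., 2] == 0)

def contour(mask):
    """Known pixels with at least one 4-connected neighbor inside the target."""
    # Pad with False so that pixels outside the image never count as target.
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    touches_target = (padded[:-2, 1:-1] | padded[2:, 1:-1] |
                      padded[1:-1, :-2] | padded[1:-1, 2:])
    return ~mask & touches_target

# A black 6x6 image with a 2x2 block painted pure red.
rgb = np.zeros((6, 6, 3), dtype=np.uint8)
rgb[2:4, 2:4] = (255, 0, 0)
mask = target_mask(rgb)
front = contour(mask)
print(front.astype(int))
```

Taking the red pixels that touch a known pixel instead would be the complementary choice; only the last line of `contour` would change.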

2. Select the target patch to replace, or fill in, according to the prioritization:
After finding the boundary of the missing region, the algorithm searches through all of the pixels along the boundary to prioritize them. The priority of each target patch is calculated as follows [1]:

$$P(p) = C(p) \cdot D(p)$$

where p is the center pixel of the unknown target patch, C(p) is the Confidence function mentioned above, and D(p) is the Data function mentioned above:

$$C(p) = \frac{\sum_{q \in \Psi(p) \cap (I - \Omega)} C(q)}{|\Psi(p)|}, \qquad D(p) = \frac{\left| \nabla I_p^{\perp} \cdot n_p \right|}{\alpha}$$

I = complete image
Ω = unknown target region
Ψ(p) = current target patch
|Ψ(p)| = area of current target patch
∇I⊥p = isophote (direction and intensity) at the point p
np = unit vector orthogonal to the unknown boundary, at the point p
α = normalization factor

Initialize C(p) such that:

$$C(p) = \begin{cases} 0, & p \in \Omega \\ 1, & p \in I - \Omega \end{cases}$$

For the “Data” term D(p), the algorithm needs to calculate the gradient of the entire image and normalize it with α. First, the image is converted from RGB format to grayscale (Intensity = Y = 0.299R + 0.587G + 0.114B). Then, the differences between the center pixel p and its known neighbors are calculated. The gradient value at p is simply the largest of these difference values. After the gradient is calculated at each pixel p, it is normalized by α to keep the gradient value between 0 and 1.
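The priority computation can be sketched as follows. The sketch follows the simplified “Data” term described in the paragraph above (largest absolute intensity difference to a known neighbor) rather than the full isophote formula. It assumes that p has a valid intensity (for example a contour pixel on the known side), that the neighbors are the 8-connected ones, and that α = 255; none of these details is fixed by the text, and all function names are hypothetical.

```python
import numpy as np

def to_gray(rgb):
    """Luminance Y = 0.299 R + 0.587 G + 0.114 B."""
    return 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]

def window(p, half, shape):
    """Slices of the patch centered at p, clipped at the image border."""
    r, c = p
    return (slice(max(r - half, 0), min(r + half + 1, shape[0])),
            slice(max(c - half, 0), min(c + half + 1, shape[1])))

def confidence_term(conf, known, p, half):
    """C(p): summed confidence of the known pixels in the patch over the patch area."""
    win = window(p, half, conf.shape)
    return conf[win][known[win]].sum() / conf[win].size

def data_term(gray, known, p, alpha=255.0):
    """D(p): largest absolute difference between p and its known 8-neighbors, over alpha."""
    r, c = p
    best = 0.0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            inside = 0 <= rr < gray.shape[0] and 0 <= cc < gray.shape[1]
            if (dr or dc) and inside and known[rr, cc]:
                best = max(best, abs(gray[rr, cc] - gray[r, c]))
    return best / alpha

def priority(conf, gray, known, p, half):
    """P(p) = C(p) * D(p)."""
    return confidence_term(conf, known, p, half) * data_term(gray, known, p)

rgb = np.zeros((5, 5, 3))
rgb[:, 2:] = 255.0                    # dark left part, bright right part: a vertical edge
known = np.ones((5, 5), dtype=bool)
known[2:, 2:] = False                 # unknown target region (bottom-right block)
conf = known.astype(float)            # C = 1 on known pixels, 0 on the target
gray = to_gray(rgb)
p = (1, 2)                            # known pixel just above the target, on the edge
print(confidence_term(conf, known, p, 1), data_term(gray, known, p), priority(conf, gray, known, p, 1))
```

In this toy image the patch around p holds 7 known pixels out of 9 and p sits on a full-contrast edge, so the priority is close to 7/9.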

3. Choose the best match and fill in the target patch:
Euclidean distance is used to decide which source patch fills in the target patch. The algorithm looks for a patch in the source region that closely matches the target patch in terms of RGB Euclidean distance. If two patches have the same RGB Euclidean distance, the one with the smallest spatial distance to the target patch is selected. The source patch is then copied directly onto the target patch, as demonstrated in Figure 3. The pixels of the target patch that have been filled in are treated as part of the source region from then on.
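A brute-force sketch of this matching step is given below. It assumes that the target patch lies fully inside the image, that candidate source patches must be entirely known, that the distance is evaluated only over the already-known pixels of the target patch, and that only the unknown pixels of the target patch are overwritten. These are assumptions made for the illustration; the function names are hypothetical.

```python
import numpy as np

def best_match(image, known, p, half):
    """Center of the fully known source patch closest to the target patch at p."""
    h, w = known.shape
    r, c = p
    tgt = image[r - half:r + half + 1, c - half:c + half + 1].astype(float)
    tgt_known = known[r - half:r + half + 1, c - half:c + half + 1]
    best, best_key = None, None
    for i in range(half, h - half):
        for j in range(half, w - half):
            if not known[i - half:i + half + 1, j - half:j + half + 1].all():
                continue  # source patches must lie entirely in the known region
            src = image[i - half:i + half + 1, j - half:j + half + 1].astype(float)
            diff = (src - tgt)[tgt_known]  # compare over known target pixels only
            # Primary key: squared RGB distance; tie-break: squared spatial distance.
            key = (float((diff ** 2).sum()), (i - r) ** 2 + (j - c) ** 2)
            if best_key is None or key < best_key:
                best, best_key = (i, j), key
    return best

def copy_patch(image, known, p, q, half):
    """Copy source patch q into the unknown pixels of target patch p (in place)."""
    r, c = p
    i, j = q
    tgt = (slice(r - half, r + half + 1), slice(c - half, c + half + 1))
    src = (slice(i - half, i + half + 1), slice(j - half, j + half + 1))
    hole = ~known[tgt]
    image[tgt][hole] = image[src][hole]
    known[tgt] = True  # the filled pixels now belong to the source region

# Vertical stripes with a 3x3 hole in the middle.
cols = (np.arange(9) % 2) * 255.0
image = np.tile(cols[None, :, None], (9, 1, 3))
known = np.ones((9, 9), dtype=bool)
known[3:6, 3:6] = False
image[~known] = 0.0
p = (2, 4)
q = best_match(image, known, p, half=1)
copy_patch(image, known, p, q, half=1)
print(q, image[3, 3:6, 0])
```

The exhaustive double loop is quadratic in the image size for every filled patch; it is written for clarity, not speed.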


Figure 3. Example of one iteration in the Exemplar-Based Image Inpainting Algorithm (extracted from [1]).

4. Iterate until the whole missing region is filled in:
After each iteration, “Confidence” C(p) is updated and new contours are found. The algorithm is repeated until the entire target area is filled in. The “Confidence” term C(p) for each pixel is updated as follows. For each pixel, we take the patch surrounding it, with the pixel at the center. The confidence value of the pixel is the number of known pixels in the patch divided by the patch size. Once the confidence values of all pixels have been found, we normalize them by dividing by the maximum confidence value, which keeps them between 0 and 1.
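The confidence update described above can be sketched as a box count over the known-pixel mask. Counting positions outside the image as unknown is an assumption (the text does not discuss the border), and the function name `update_confidence` is hypothetical.

```python
import numpy as np

def update_confidence(known, half):
    """Fraction of known pixels in the patch around every pixel, scaled to a maximum of 1."""
    h, w = known.shape
    # Positions outside the image are padded with 0, i.e. treated as unknown.
    padded = np.pad(known.astype(float), half, mode="constant")
    size = 2 * half + 1
    counts = np.zeros((h, w))
    for dr in range(size):
        for dc in range(size):
            counts += padded[dr:dr + h, dc:dc + w]
    conf = counts / size ** 2
    peak = conf.max()
    return conf / peak if peak > 0 else conf

known = np.ones((5, 5), dtype=bool)
known[2, 2] = False                    # a single unknown pixel in the center
conf = update_confidence(known, half=1)
print(np.round(conf, 2))
```

Because every full 3x3 window in this example contains the unknown center pixel, the largest raw value is 8/9, and the normalization maps it to 1.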

(2) Algorithm Details and Possible Improvements:

The Exemplar-Based Image Inpainting algorithm given above is reported to work effectively in removing large objects as well as small cracks. The algorithm is essentially a “texture synthesis” algorithm that has been improved to include inpainting techniques.

Reported pros [1] of the algorithm are:

(i) Preservation of edge sharpness.
(ii) The use of the Confidence term in the Priority function helps to avoid over-shooting artifacts, such as those seen in Figure 4. As stated in [1]: “At a coarse level, the term C(p) of the Priority equation approximately enforces the desirable concentric fill order. As filling proceeds, pixels in the outer layers of the target region will tend to be characterized by greater confidence values, and therefore be filled earlier; pixels in the centre of the target region will have lesser confidence.”
(iii) Computational efficiency.
(iv) Accuracy in the synthesis of texture.
(v) Accurate propagation of linear structures.

Figure 4. The “overshoot” artifact. Using only the data term in the priority function may lead to undesired edge “over-shoot” artifacts, because some edges may grow indiscriminately. A balance between structure and texture synthesis is highly desirable and is achieved in [1]. (Extracted from [1])

Reported cons [1] of the algorithm are:

(i) It cannot produce reasonable results if there are insufficient “examples” in the background.
(ii) It is not designed to handle curved structures, as can be seen in Figure 5.

Figure 5. Here, red marks the missing region and the black circle represents the source region we are trying to complete. As seen in the reconstruction, the algorithm simply attempts to extend linear structure into the missing region. It is not designed to handle curved structures.

(iii) The algorithm does not handle depth ambiguities in the images. For example, a similar patch may be found that was physically at a different depth than the target patch. This can lead to regions that look incongruous to the eye, as can be seen in Figure 6.

Figure 6. One would expect EBII to produce overlapping rectangles (either green overlaps brown or brown overlaps green). Instead, EBII produces a broken green rectangle.

The most critical part of the algorithm lies in the prioritization of the contours. The priority function, composed of the “Confidence” and “Data” functions, was obtained empirically, so its formulation does not rest on a rigorous mathematical derivation. The authors of the algorithm claim that it works well for small targets as well as for the removal of large objects, although this remains an open question.

A possible improvement on this algorithm might be obtained by finding a different priority function that again takes into account “Confidence” and “Data”. Also, after experimenting on different types of images, one could arrive at several priority functions, each suited to one type of image.

During our implementation we experimented with several different priority functions in addition to the function given in [1], such as

$$P(p) = a \cdot C(p) + b \cdot D(p), \qquad a, b \ge 0.$$

This function provides the option of weighting confidence and data separately. Although it works very well for specific types of images (images consisting mostly of lines), the product form of the priority function in [1] can handle more general types of pictures.

B. Simultaneous Cartoon and Texture Image Inpainting using MCA

Numerous approaches to filling in holes in images are based on variational methods, which are attractive because they motivate filling-in algorithms through geometrical considerations (one should fill in by smooth continuation of isophotes). The variational approach has been shown to perform well on piecewise smooth images, also called cartoons. However, real images also contain textured regions, and variational methods generally fail in such settings. On the other hand, local statistical analysis and prediction have been shown to perform well at filling in texture content [2]. Real images contain both geometry and texture, which demands approaches that work for images containing both cartoon and texture layers. Furthermore, methods based on image segmentation (labeling each pixel as either cartoon or texture) are to be avoided, since some areas in the image contain contributions from both layers. Instead, a method that additively decomposes the image into layers (cartoon and texture) is preferred, allowing a combination of two layer-specific methods for filling in [4].

The central idea is to use two mutually incoherent layers (also called dictionaries), one adapted to represent textures and the other to represent cartoons. The algorithm we explore is a direct extension of a recently developed sparse-representation-based image decomposition method called MCA (Morphological Component Analysis), and it is the first to combine the decomposition and filling-in stages into a single step. It has the following desirable properties [2]: (1) the image is allowed to include additive noise; (2) the image is allowed to have missing pixels; (3) the image is assumed to be a sparse combination of atoms from the two dictionaries.

This method is one of the most recent inpainting methods, and it has substantial advantages over other methods proposed in the literature [2]: (1) the use of general overcomplete representations; (2) a global treatment of the image rather than a local block-based analysis; (3) a coherent modeling of the overall problem as an optimization, rather than the presentation of a numerical scheme; and, perhaps most important of all, (4) the ability to treat overlapping texture and cartoon layers, thanks to the separation properties mentioned above. An example of the importance of separating the image into its texture and cartoon layers is given in Figure 7. Note that typical exemplar-based methods often fail to propagate lines into the target in textured regions. The fine detail of the texture regions is evident once it is isolated in the texture layer (bottom left image). The cartoon portion on the bottom right contains the piecewise smooth structures of the image.


Figure 7. Decomposition of the Barbara image. The top image is the original image, the bottom left is the texture, and the bottom right is the structure (cartoon). (Image extracted from [2])

(1) Image Inpainting Using MCA Principle:

The idea behind the MCA representation is simple and intuitive: find two matrices Tt and Tn (referred to from here on as dictionaries) with the following properties:
- They should provide localization of the texture layer (Tt) and the geometric layer (Tn) through a multiscale and local analysis of the image content.
- They must be incoherent, i.e. Tt must sparsely represent the texture parts of the image and Tn the cartoon parts, while neither is able to sparsely represent the other component of the image.

(2) Model and Mathematical Description:

Let the input image be an NxN image, treated as a vector of N*N pixels. To model images containing only texture (Xt), it is assumed that the matrix $T_t \in M^{N^2 \times L}$ (typically L >> N*N, i.e. the dictionary is overcomplete) allows a sparse decomposition. Informally this can be written as:

$$X_t = T_t \alpha_t, \quad \text{where } \alpha_t \text{ is sparse.}$$

An analogous representation is also defined for Xn. Sparsity is measured in terms of norms, including the ℓ0 “norm” (the number of non-zero coefficients), with a small value indicating sparsity. The goal is to find a sparse representation for an arbitrary image X, containing both texture (Xt) and piecewise smooth content (Xn), over a combined dictionary containing both Tt and Tn. The best dictionary to choose depends on the structure of the image and the user’s experience. For example, ridgelets are a good choice for Tn when the cartoon is built of pure edges that are globally straight lines, and a global DCT is a good choice for Tt when the texture is global and highly periodic.
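The notions of sparsity and incoherence can be illustrated on a one-dimensional toy example. In this sketch a hand-built orthonormal DCT stands in for the texture dictionary and the pixel (identity) basis stands in for the second dictionary; the identity basis is only a simple placeholder chosen for the illustration, not one of the cartoon dictionaries named above. The helper names and the tolerance used to count non-zero coefficients are assumptions.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; row k is the k-th cosine atom."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (i + 0.5) * k / n)
    C[0] /= np.sqrt(2.0)
    return C

def l0(v, tol=1e-8):
    """l0 'norm': number of coefficients whose magnitude exceeds tol."""
    return int(np.sum(np.abs(v) > tol))

n = 64
C = dct_matrix(n)

texture = C[7].copy()      # a single cosine atom: a periodic, texture-like signal
spike = np.zeros(n)
spike[5] = 1.0             # a single isolated sample

print("cosine: DCT coefficients =", l0(C @ texture), ", pixel coefficients =", l0(texture))
print("spike:  DCT coefficients =", l0(C @ spike), ", pixel coefficients =", l0(spike))
```

Each signal needs one coefficient in its own dictionary and all 64 in the other one, which is the behavior the incoherence requirement asks for.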


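The sparse representation over a combined dictionary can be represented mathematically by:

$$\{\alpha_t^{opt}, \alpha_n^{opt}\} = \operatorname{Arg}\min_{\{\alpha_t, \alpha_n\}} \|\alpha_t\|_0 + \|\alpha_n\|_0 \quad \text{subject to: } X = T_t \alpha_t + T_n \alpha_n,$$

where $\|\alpha\|_0 = \#\{ i : \alpha(i) \neq 0 \}$.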

We will not go into great detail here (for more information see [2]), but the optimization problem is non-convex, and thus intractable, when the ℓ0 norm is used. The basis pursuit (BP) method [2] suggests replacing the ℓ0 norm with the ℓ1 norm, which leads to a tractable convex optimization problem, i.e., find the solution of:

$$\{\alpha_t^{opt}, \alpha_n^{opt}\} = \operatorname{Arg}\min_{\{\alpha_t, \alpha_n\}} \|\alpha_t\|_1 + \|\alpha_n\|_1 \quad \text{subject to: } X = T_t \alpha_t + T_n \alpha_n$$

Note that if the image is noisy it cannot be cleanly decomposed into sparse texture and cartoon layers, so some residual error remains. This error results from the fact that some sections of the image may not be well represented by either of the two dictionaries. The noise-cognizant version of BP is in this case:

$$\{\alpha_t^{opt}, \alpha_n^{opt}\} = \operatorname{Arg}\min_{\{\alpha_t, \alpha_n\}} \|\alpha_t\|_1 + \|\alpha_n\|_1 \quad \text{subject to: } \|X - T_t \alpha_t - T_n \alpha_n\|_2 \le \varepsilon$$

To make the problem more tractable, instead of solving the above optimization problem directly (finding the two representation vectors $\alpha_t^{opt}, \alpha_n^{opt}$), we reformulate it so that the texture and cartoon images Xt and Xn are the unknowns. The reason for this reformulation is the simplicity of searching over lower-dimensional vectors (recall that for overcomplete dictionaries the representation vectors are much longer than the image they represent). Defining $X_t = T_t \alpha_t$, it is possible to recover αt as

$$\alpha_t = T_t^{+} X_t + r_t,$$

where rt is an arbitrary vector in the null-space of Tt. A similar structure holds for Tn.

Our initial problem is now reduced to:

$$\{X_t^{opt}, X_n^{opt}\} = \operatorname{Arg}\min_{\{X_t, X_n, r_t, r_n\}} \|T_t^{+} X_t + r_t\|_1 + \|T_n^{+} X_n + r_n\|_1 + \lambda \|M (X - X_t - X_n)\|_2^2 + \gamma\, TV\{X_n\} \qquad (1)$$

subject to $T_t r_t = 0$, $T_n r_n = 0$, where:
- M is a diagonal mask matrix with one diagonal entry per pixel (N*N x N*N for the vectorized NxN image) that encodes the pixel status, namely ‘1’ for an existing pixel and ‘0’ for a missing one.
- γ TV{Xn} is an unconstrained total variation penalty that has been verified to work well in recovering piecewise smooth objects with pronounced edges.

The terms $T_t^{+} X_t$ and $T_n^{+} X_n$ are overcomplete linear transforms of the images Xt and Xn respectively. For tight frames these transformations are equivalent to multiplication by the adjoint of the original dictionaries Tt and Tn, which leads to a low-complexity implementation. In the same spirit of simplification we also assume rt = rn = 0, thereby obtaining a suboptimal solution to the problem. These simplifications lead to the following equation:

$$\min_{\{X_t, X_n\}} \|T_t^{+} X_t\|_1 + \|T_n^{+} X_n\|_1 + \lambda \|M (X - X_t - X_n)\|_2^2 + \gamma\, TV\{X_n\} \qquad (2)$$

Note that rt = rn = 0 is a reasonable assumption [2], since the minimum of equation (2) is an upper bound on the minimum of equation (1), which is also optimized with respect to rt and rn. Furthermore, in the special case where the dictionaries Tt and Tn are square and non-singular (leading to a complete, rather than overcomplete, representation), or when the ℓ1 norms are replaced by ℓ2 norms, formulations (1) and (2) are equivalent, which leads us to expect that the suboptimal solution of equation (2) is close to the optimal one [2].

The above optimization problem is solved using an iterative Block Coordinate Descent algorithm [2] (see footnote 1), which is divided into 4 main steps (described in Figure 8). This algorithm not only performs the sparse separation very well but also turns out to be very effective for image denoising. The purpose of the diagonal mask matrix M is to ensure that the update of the residual, R = M(X − Xt − Xn), in each iteration is done only with respect to existing pixels, thus increasing the fidelity of the approximation.

Figure 8. Block Coordinate Descent algorithm used to obtain the solution of the minimization problem in (2). (Extracted from [2])

1 The name of the algorithm arises from the fact that in each iteration one image (e.g. Xt) is held fixed while the other (Xn) is updated, and then Xn is held fixed while Xt is updated, or vice versa.
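The alternating update can be sketched on a one-dimensional signal. This is a simplified illustration under stated assumptions, not the algorithm of Figure 8: an orthonormal DCT stands in for the texture dictionary, the identity (pixel) basis stands in for the second dictionary, the total variation step is omitted, soft thresholding is used as the shrinkage step for the ℓ1 terms, and the threshold is decreased linearly to zero. The function names and the threshold schedule are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II analysis matrix C (C @ C.T = I); C.T synthesizes."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (i + 0.5) * k / n)
    C[0] /= np.sqrt(2.0)
    return C

def soft(v, t):
    """Soft thresholding: shrink every entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def mca_inpaint_1d(x, mask, n_iter=200):
    """Split x into a DCT-sparse part x_t and a pixel-sparse part x_n.

    mask is True where a sample was observed; values of x at missing
    samples are ignored because the residual is always multiplied by the mask.
    """
    n = x.size
    C = dct_matrix(n)
    m = mask.astype(float)
    x_t = np.zeros(n)
    x_n = np.zeros(n)
    delta0 = max(np.abs(C @ (m * x)).max(), np.abs(m * x).max())
    for delta in np.linspace(delta0, 0.0, n_iter):   # assumed linear schedule
        # Update x_n with x_t fixed (identity dictionary).
        r = m * (x - x_t - x_n)
        x_n = soft(x_n + r, delta)
        # Update x_t with x_n fixed (DCT dictionary).
        r = m * (x - x_t - x_n)
        x_t = C.T @ soft(C @ (x_t + r), delta)
    return x_t, x_n

# Toy signal: two cosine atoms plus two spikes, with about 25% of the samples missing.
rng = np.random.default_rng(0)
n = 128
alpha = np.zeros(n)
alpha[10], alpha[25] = 10.0, -7.0
x = dct_matrix(n).T @ alpha
x[20] += 6.0
x[90] -= 5.0
mask = rng.random(n) > 0.25
mask[[20, 90]] = True                      # keep the spike samples observed
x_t, x_n = mca_inpaint_1d(np.where(mask, x, 0.0), mask)

err_missing = np.linalg.norm((x_t + x_n - x)[~mask])
err_zero = np.linalg.norm(x[~mask])        # error of leaving the holes at zero
print("error on missing samples:", err_missing, "zero-fill error:", err_zero)
```

Because the residual is masked, the pixel-basis layer never receives any signal at missing positions, so those samples are filled in entirely by the globally supported DCT layer.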


(3) Transform Choice and Implementation:

The decomposition into the texture and cartoon layers is performed using two mutually incoherent transforms. The texture image Xt is isolated using a global DCT analysis of the original image (one can also use Gabor or wavelet packets). Overcomplete wavelets are used for isolating the structure (cartoon) image Xn (several other transforms can also be applied, such as curvelets, ridgelets, and contourlets). The texture image contains the fine details of objects, and the structure image is piecewise smooth and contains the coarse details, mostly object edges.

The choice of transform is key to the success of the results and depends significantly on the content of the image. There is no systematic procedure for finding an adequate transform other than the user’s experience. For example, if the texture is highly periodic and globally so, a global DCT is well suited. However, if it is spatially varying, a block DCT with overlaps is preferable, with the block size chosen such that the texture is nearly homogeneous on the support. As a second example, if the cartoon is built of pure edges that are global straight lines, then ridgelets are the choice to take. If it is spatially varying, curved, and not global, then contourlets or curvelets are the best choice. Using simple wavelets is hardly a good choice because (1) they are separable and thus cannot represent diagonal curves well; and (2) they are not overcomplete, which implies that they are not shift invariant and not rich enough. In any case it is crucial to use the DCT with overlaps, and when wavelets are used they must be overcomplete.

III. Analysis

A. Performance Criteria

To analyze the performance of the two Image Completion algorithms, we focus on several performance criteria:

1) Propagation of image structures into target regions. Discontinuity of image structure across the source/target boundary can result in a highly visible edge. A good inpainting algorithm must propagate these structures smoothly into the target region.

2) Accuracy of texture reconstruction. Filled textures should resemble the textures surrounding the target region. In addition, if the filled texture contains a regular pattern (such as thin horizontal strips), we require that pattern to be extended continuously across the boundary. We rely on visual inspection to judge whether textures are filled in “correctly”.

3) Ability to propagate complicated structures such as jagged lines.

4) Resilience to noise. We observe how well the algorithms perform in the presence of noise. We try white Gaussian noise and impulse (Salt & Pepper) noise.
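The two noise models used for the fourth criterion, together with the mean squared error (MSE) used to score a result, can be sketched as follows. The 8-bit intensity range 0 to 255, the meaning of `density` (fraction of corrupted pixels, split evenly between salt and pepper), and the function names are assumptions made for the illustration.

```python
import numpy as np

def add_gaussian_noise(img, sigma, rng):
    """Additive white Gaussian noise, clipped to the 0..255 range."""
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 255.0)

def add_salt_and_pepper(img, density, rng):
    """Set a fraction `density` of the pixels to 0 (pepper) or 255 (salt)."""
    noisy = img.copy()
    u = rng.random(img.shape)
    noisy[u < density / 2] = 0.0
    noisy[u > 1 - density / 2] = 255.0
    return noisy

def mse(a, b):
    """Mean squared error between two images of the same shape."""
    return float(np.mean((a.astype(float) - b.astype(float)) ** 2))

rng = np.random.default_rng(0)
img = np.full((64, 64), 128.0)                 # flat gray test image
sp = add_salt_and_pepper(img, 0.5, rng)        # 50% Salt & Pepper
gauss = add_gaussian_noise(img, 20.0, rng)
print("MSE salt & pepper:", mse(img, sp), " MSE Gaussian:", mse(img, gauss))
```

Salt & Pepper noise leaves the uncorrupted pixels exactly intact, whereas Gaussian noise perturbs every pixel slightly.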

B. Testing with Images

We tested both algorithms on real images and synthetic images. We reached the following conclusions for each algorithm:

1) Propagation of image structures into target regions.

MCA: Works almost perfectly for small missing regions. For large regions (8x8 or larger), it tends to blur the region instead of recreating it. Figure 9 shows three sets of missing regions on the Lena image, of size 4x4, 8x8, and 16x16 pixels. As one can see, the larger blocks lead to blurring. The reason is that the MCA algorithm fills in missing regions iteratively through the overcomplete representation of the texture and cartoon layers. When the region to fill in is too extensive, it is difficult to complete the missing information accurately from the dictionaries, and the inpainted area is blurred. On the other hand, if the region to fill in is small, the overcomplete representation allows almost perfect restoration of the image.

Figure 9. Results of inpainting missing blocks on the Lena image: (a) rectangular blocks (4x4, 8x8, and 16x16 pixels), labeled X; (b) resulting inpainted image using MCA, labeled Xhat; (c) separated texture, Xt; (d) separated cartoon, Xn.

EBII: Works well for propagating linear structures into missing regions. Figures 10 and 11 show that EBII preserves straight edges. In addition, sharp corners are not distorted. For the algorithm to preserve the corners, the size of the patch should be selected carefully. EBII begins filling at the junctions indicated by the arrows in Figure 10, as these pixels have higher priority due to the Data term.


Figure 10. Propagation into missing regions prioritizes the preservation of linear structures in order to maintain the boundaries and edges, thus leading to a visually pleasing image. EBII begins filling at the junctions indicated by the arrows, as these pixels have higher priority due to the Data term.

Figure 11. Propagation of linear structure in a real image. The bridge is reconstructed successfully by EBII.

2) Accuracy of texture reconstruction.

MCA: Does not perform well at reconstructing texture. The inpainting result tends to blur out regions, as can be seen in Figure 12. Although the different texture regions are still preserved, the inpainted regions are noticeably blurred. The blur occurs because the region to inpaint is large and contains mostly fine details that are very difficult to predict from the overcomplete representation.

Figure 12. Texture inpainting using MCA. As can be observed on the right, the different texture regions can still be seen, but they appear blurred.


EBII: In Figure 13 one can observe that EBII performs very well at reconstructing texture. The final inpainted image is aesthetically pleasing, and it is almost impossible to find any difference from the original. Note that there are many example texture regions in the source region. EBII achieves its best performance with this type of image. Patch size plays a major role: the patch must be selected to match the size of the tessellation in the texture.

Figure 13. Background texture is reconstructed almost perfectly by the EBII algorithm.

3) Ability to propagate complicated structures such as jagged lines.

MCA: Observing Figure 12, one can conclude that the MCA algorithm can extend jagged edges like the ones in this stripes design. Even though it is not perfect, the restored texture is a good reproduction of the texture pattern. Although some blur is noticeable, the results are quite impressive given the total absence of information and the complexity of the texture.

EBII: Achieved good results inpainting images containing jagged edges. As can be seen in Figure 14, below, the reproduced coastline looks “natural”.

Figure 14. The complicated contours of the coastline are reproduced in a way visually pleasing to the human eye.


4) Resilience to noise.

MCA: Handles Salt & Pepper noise with remarkable performance, in fact outperforming current benchmarks (e.g. NPLS, Wiener filtering). However, when the noise is Gaussian its performance is comparable to that of the other denoising algorithms. These conclusions can be visualized in Figure 15 (50% Salt & Pepper). Such good performance in the presence of Salt & Pepper noise results from the fact that a decent approximation to the original image can be obtained from a few coefficients of its sparse representation. Even though the original image is highly corrupted, an accurate decomposition is still possible from the uncorrupted pixels. On the other hand, typical image enhancement (e.g. denoising) algorithms obtain the estimate of the image directly from the noisy measurements, which leads to bad results when the level of corruption is high.

[Figure 15 panels: Original; Noisy, MSE = 63.96; NPLS, MSE = 59.89; MCA, MSE = 9.72]

Figure 15. Denoising using MCA provides amazing results. The original image can be understood from an unrecognizable noisy image.

EBII: As expected, this algorithm doesn’t have any denoising capabilities and obviously it will reproduce noisy structures and textures. No illustration is provided here.


IV. Conclusions and future work

In this project, two very recent algorithms were analyzed. One algorithm, Exemplar-Based Image Inpainting, uses a more intuitive method to complete images, taking existing patches and iteratively filling in an unknown target region. The second algorithm, Simultaneous Cartoon and Texture Image Inpainting using MCA, decomposes an image into its texture and cartoon layers and performs inpainting separately on each component. The final image is a result of a linear combination of both inpainted structures.

We conclude that each algorithm has its own strengths and weaknesses. EBII is great for removing large portions of images and reproducing texture, while MCA works best with small missing regions consisting of an overlap of cartoon and texture. MCA is highly dependent on image content and developer familiarity with transforms. We conjecture that it performs poorly with texture because texture regions contain only fine details that are difficult to predict from the decomposition in the two dictionaries.

For future work, we would like to merge the ideas behind EBII and MCA. EBII has difficulty inpainting a region that has both structure and texture, as can be seen in Figure 16. The texture-only missing regions in Figure 16 are reproduced accurately, but EBII introduced artifacts into missing regions that have both texture and structure.

Figure 16. EBII fails to accurately reproduce areas that have both texture and structure, as can be seen in the above Adar image. The area around the hand on the right and the hat both show artifacts, but the texture-only regions on the left and under the left arm are both reproduced accurately.

We feel that an image inpainting algorithm which separates an image into texture and structure using MCA, and then inpaints each layer individually with EBII, would outperform either algorithm alone. To our knowledge such an approach has not been tried before, so we decided to try it. The initial results we obtained seem very promising, as can be seen in Figure 17:


Figure 17a Left: The original image: Water lilies with an overlaying texture. Right: Missing regions marked by red.

Figure 17b Reconstruction with EBII without first separating the image into cartoon and texture. The white circles show the resulting artifacts.

Figure 17c Image separated into Left: Cartoon, and Right: Texture, using MCA. The missing regions in the two components are filled in using EBII.


Figure 17d The final result obtained by adding the reconstructed cartoon and texture together. Note that the artifacts are much less visible than those in Figure 17b.

First separating the image into cartoon and texture using MCA, and then using EBII to complete the missing regions, works much better than EBII alone. This approach could be analyzed in further detail as future work on this topic. Also interesting for future work is the design of a procedure to build the dictionaries empirically from the data, thereby avoiding the difficulties of using transforms that depend on image content.

V. References

[1] Toyama, K., Criminisi, A., Perez, P., “Region Filling and Object Removal by Exemplar-Based Image Inpainting”. Technical Report MSR-TR-2003-84, Microsoft (Nov, 2003).
[2] Elad, M., Starck, J.-L., Querre, P., Donoho, D.L., “Simultaneous Cartoon and Texture Image Inpainting Using Morphological Component Analysis (MCA)”. ACHA, to appear (March 2005).
[3] Rudin, L.I., Osher, S., Fatemi, E., “Nonlinear total variation noise removal algorithm”. Physica D, 60: 259-268, 1992.
[4] Starck, J.-L., Elad, M., Donoho, D.L., “Image Decomposition Via The Combination Of Sparse Representations And A Variational Approach”. IEEE Transactions on Image Processing, 2004, in press.
[5] Bornard, R., Lecan, E., Laborelli, L., Chenot, J-H., “Missing data correction in still images and image sequences”. In ACM Multimedia, France, December 2002.
[6] Chan, T.F., Shen, J., “Non-texture inpainting by curvature-driven diffusions (CDD)”. J. Visual Comm. Image Rep., 4(12):436-449, 2001.
[7] Rice Wavelet Toolbox. This product includes software developed by Rice University, Houston, Texas and its contributors.


Appendix

Some results obtained using EBII are:


Some results obtained using MCA are:
