
A Hybrid Image Retargeting Approach via Combining Seam Carving and Grid Warping

Lifang Wu 1, Lianchao Cao 1, Min Xu 2, and Jinqiao Wang 3

1. School of Electronic Information and Control Engineering, Beijing University of Technology, Beijing, China

2. iNEXT, School of Computing and Communications, University of Technology, Sydney, Australia

3. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Science, Beijing, China

Email: [email protected], [email protected], [email protected], [email protected]

Abstract—Image retargeting is a critical technique for

browsing images on diverse display devices. In this paper, we propose a hybrid image resizing approach that jointly uses seam carving and warping. First, based on an importance partition derived from the saliency map, we apply a weighted seam carving approach so that the seams removed from important regions are dispersed rather than adjacent. We then propose the Content Aware Image Distance (CAID) to assess the deformation caused by removing seams. Weighted seam carving stops once CAID reaches a fixed threshold, which keeps the degradation of visual quality small. Finally, grid-based warping with a global optimization model is used to reach the final size, since warping tends to avoid discontinuity artifacts in important regions and typically distributes distortion over unimportant regions more coherently. Experiments on the public RetargetMe dataset [1] and comparisons with Dong [2], Energy-based deformation [3], Multi-operator [4], Seam Carving [5], simple scaling, Shift-maps [6], Scale-and-Stretch [7], Streaming Video [8] and Non-homogeneous warping [9] show the superiority of the proposed approach.

Index Terms—Image Resizing; Weight Transfer; Content

Aware Image Distance

I. INTRODUCTION

With the rapid development of mobile multimedia techniques, users now browse images on devices of very different sizes, from cellular phones to high-resolution TVs. This requires image retargeting techniques [10], which adapt images of various aspect ratios to the target screen while maximizing the viewer experience. Many content-aware retargeting methods have been proposed, such as seam carving [4], [5], [11], [12], mesh-based retargeting [7], [9], [13], [14] and hybrid approaches [2], [15], [16].

Seam carving (SC) [4], [5], [11], [12] removes or inserts 1D seams that pass through the less important regions in order to preserve media content. However, it usually introduces discontinuity artifacts when the removed or inserted seams pass through important objects or regions. To restrain the noise introduced by seam carving, Achanta [17] computed the saliency map only once, independent of the number of seams inserted or removed. Toony [18] assumed that object edges are removed while carving seams, and proposed a modified saliency map for seam carving that combines the traditional saliency map with local edges. Domingues [19] proposed stream carving, which uses an adaptive importance map to merge several features such as gradient magnitude, saliency, and face, edge and straight-line detection. Chen [20] proposed balanced seam carving (BSC) with a criterion that evaluates diagonal artifacts in addition to the previously considered horizontal and vertical artifacts. Zhang [21] defined handles to describe both local regions and image edges, assigned a weight to each handle based on an importance map of the source image, and constructed a quadratic distortion energy to measure the shape distortion of each handle. Huang [22] presented a fast seam-based image resizing approach that searches for seams by establishing a matching relation between adjacent rows or columns. A linear algorithm was proposed to find the optimal matches, which could save about 99% of the time compared with [12].

In addition to seam carving, mesh-based approaches [7], [9], [13], [14] form a family of continuous approaches. They warp an image from the source size to a target size under several types of constraints that protect media content. Warping tends to avoid discontinuity artifacts and typically preserves the overall shapes of image objects more coherently.

The remaining methods are hybrid approaches. Dong [2] proposed combining seam carving and scaling based on an image distance function. Their strategy is interesting, but it needs to scale the image to the target size and compute the distance each time a seam is removed, which is time-consuming. In addition, scaling may deform important regions. A similar idea was proposed by Hwang [16], who used an importance map formed as a weighted combination of the saliency map, gradient and face regions. Their scheme switches from seam carving to warping when the energy of a seam exceeds a threshold. In [15], a scheme that jointly uses seam carving and warping was proposed, in which the deformation of semantic edges (DSE) was defined to measure the deformation of resized images. This approach is computationally more efficient than [2], but the semantic edges need to be detected beforehand, and DSE can only be applied to straight-line edges. To compare the image quality of different retargeting methods reliably, an objective metric that assesses the quality of retargeted images from global to local viewpoints was presented in [23].

JOURNAL OF MULTIMEDIA, VOL. 9, NO. 4, APRIL 2014 483

© 2014 ACADEMY PUBLISHER doi:10.4304/jmm.9.4.483-492

Generally speaking, removing too many seams easily degrades image quality in seam carving. Cho [24] proposed an importance diffusion scheme that propagates the importance of removed pixels to their neighbors in order to preserve visual context and avoid over-shrinkage of unimportant parts. Artifacts introduced by seam carving in unimportant regions cause less visual quality degradation than those in important regions. We further observe that, for the same number of removed seams, the subjective image quality degrades much less noticeably when the seams are relatively far from each other, as shown in Figure 1. In this paper, we propose a weighted seam carving approach in which a weighted forward energy function [5] is used instead of energy diffusion [24]. The weight of a removed seam that falls in an important region is propagated to the pixels in its neighborhood. This increases the cost of removing these neighboring pixels and decreases the likelihood that they are included in a later seam. As a result, the seams that fall into important regions are non-adjacent.

Figure 1. Seam carving under different seam distributions. (a) The original image. (b) Uniformly distributed seams. (c) Consecutive seams. (d) Result of (b). (e) Result of (c).

The above scheme reduces image degradation when the same number of seams is removed. However, it cannot fundamentally resolve the problem of artifacts: once enough seams have been removed, it also causes visual quality degradation. Warping, in contrast, is a continuous operator that tends to avoid discontinuity artifacts in important regions and typically distributes distortion over unimportant regions more coherently. Therefore, a scheme that jointly uses seam carving and grid warping is proposed in this paper, as illustrated in Figure 2. Motivated by the objective quality assessment model of [25], which showed that SSIM (Structural Similarity) is consistent with the subjective mean opinion score (MOS) through a logarithmic function, we propose the Content Aware Image Distance (CAID) to assess image quality at different retargeting sizes. With the CAID model, we can assess the structural damage caused by removing seams and thereby combine the advantages of weighted seam carving and warping. Weighted seam carving stops when the visual quality degradation measured by CAID reaches a threshold. Grid-based warping with a global optimization model is then used to reach the target size, which effectively preserves the overall structure by distorting the unimportant regions coherently.
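The two-stage control flow described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `hybrid_retarget`, `carve_one_seam`, `caid` and `warp` are hypothetical names for caller-supplied operations, the image is assumed to be a list of pixel rows, and only the width is reduced. The default threshold of 0.2 is the value reported in Section IV. Keeping the last carved seam before switching to warping is an assumption of this sketch.

```python
def hybrid_retarget(image, target_width, carve_one_seam, caid, warp, threshold=0.2):
    """Carve seams until CAID reaches the threshold, then warp to the target width.

    image is a list of pixel rows. carve_one_seam(img) returns a copy that is one
    column narrower, caid(original, current) returns a distance in [0, 1], and
    warp(img, width) resizes to the given width. All three are hypothetical
    callables supplied by the caller.
    """
    current = image
    while len(current[0]) > target_width:
        current = carve_one_seam(current)
        # CAID is always measured against the original image.
        if caid(image, current) >= threshold:
            break  # distortion no longer acceptable: switch to warping
    if len(current[0]) > target_width:
        current = warp(current, target_width)
    return current
```

If the target width is reached before the threshold is hit, the warping stage is skipped entirely in this sketch.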

The contributions of this paper can be summarized as

follows:

To reduce the structural deformation caused by removing neighboring seams, we propose a weighted seam carving approach. By transferring the weights of removed seams to their neighboring pixels, the seams that fall in important regions are removed more uniformly. Weighted seam carving therefore causes less visual quality degradation than seam carving [5].

To combine seam carving with warping, we introduce CAID as an objective measure from the viewpoint of structural image quality assessment. The structure components of sub images are used to measure the deformation caused by removed seams, which is more consistent with human perception than BDS [26] and Liu [23].

With the CAID model for quality assessment, we can effectively control the structural deformation caused by removing seams and combine the advantages of weighted seam carving and grid warping. Our approach is superior in terms of removing unimportant regions and keeping the aspect ratio of important objects.

II. CONTENT AWARE STRUCTURAL SIMILARITY

In this paper, we introduce CAID as an objective measure for seam carving from the viewpoint of image quality assessment. CAID measures the structural quality loss of the important regions and determines the point at which the procedure switches from seam carving to grid warping.

A. Region Importance Determination

We calculate the visual saliency with [27], since it is easy to implement and its performance is acceptable. The saliency value of each pixel is normalized to [0, 1] and used as the weight $w(x, y)$ of the pixel at $(x, y)$. The threshold $r_0$ is chosen so that half of the values $w(x, y)$ are smaller than $r_0$. We then obtain the important regions from the saliency map by binarization. The binary image $b(x, y)$ is computed as follows:

$$b(x, y) = \begin{cases} 1, & w(x, y) \ge r_0 \\ 0, & w(x, y) < r_0 \end{cases} \tag{1}$$

Figure 3 shows an example of region importance determination, including the original image, the saliency map and the important regions. In Figure 3(c), the important regions are marked as 1.
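The thresholding step can be illustrated with a short sketch. It assumes the saliency map is already normalized and stored as a list of rows, and it uses the median as the threshold $r_0$, which is one way to satisfy the "half of the weights are smaller" condition. The function name `important_regions` is hypothetical.

```python
from statistics import median


def important_regions(weights):
    """Binarize a normalized saliency map given as a list of rows.

    Returns the binary map b and the threshold r0. Using the median for r0
    is an assumption that matches "half of the weights are smaller than r0".
    """
    r0 = median(w for row in weights for w in row)
    binary = [[1 if w >= r0 else 0 for w in row] for row in weights]
    return binary, r0
```

For a 2×2 map `[[0.1, 0.9], [0.4, 0.6]]`, the threshold is 0.5 and only the right-hand column is marked as important.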


Figure 2. Image retargeting by combination of seam carving and grid warping

Figure 3. An example of region importance determination. (a) the original image, (b) the saliency map, (c) the important regions

B. Statistic Structural Similarity

The structural similarity measure (SSIM) [25] is a classic approach to image quality assessment, and applying the SSIM index locally rather than globally gives a more effective quality assessment. In SSIM, three comparison functions are calculated, covering luminance, contrast and structure. According to the statistical results in our previous work [28], a removed seam introduces a large intensity discontinuity and causes more structural variation. Therefore, the structural comparison function is more effective for measuring the quality degradation caused by seam carving. For two images $f$ and $g$, the structural comparison function is

$$s(f, g) = \frac{\sigma_{fg} + C}{\sigma_f \sigma_g + C} \tag{2}$$

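where $C$ is a small constant added to both the numerator and the denominator to avoid instability when the remaining part of the denominator is very close to zero. The local statistics, namely the mean intensities $\mu_f$, $\mu_g$, the standard deviations $\sigma_f$, $\sigma_g$ and the covariance $\sigma_{fg}$, are as follows:

$$\mu_f = \frac{1}{N} \sum_{i=1}^{N} f_i, \qquad \mu_g = \frac{1}{N} \sum_{i=1}^{N} g_i \tag{3}$$

$$\sigma_f = \left( \frac{1}{N-1} \sum_{i=1}^{N} (f_i - \mu_f)^2 \right)^{1/2}, \qquad \sigma_g = \left( \frac{1}{N-1} \sum_{i=1}^{N} (g_i - \mu_g)^2 \right)^{1/2} \tag{4}$$

$$\sigma_{fg} = \frac{1}{N-1} \sum_{i=1}^{N} (f_i - \mu_f)(g_i - \mu_g) \tag{5}$$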

where $N$ is the total number of pixels in each image, and $f_i$ and $g_i$ are the gray values of pixel $i$. By the Cauchy-Schwarz inequality, $s(f, g) \le 1$, and $s(f, g) = 1$ if and only if $f$ and $g$ are linearly dependent. Based on this analysis, we define the distance between two images $f$ and $g$ as follows:

$$Dis(f, g) = 1.0 - s(f, g) \tag{6}$$

When the two images are the same, the value of $s(f, g)$ is 1, so the value of $Dis(f, g)$ is 0.
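Equations 2 to 6 translate directly into a few lines of code. The sketch below assumes each image (or sub image) is passed as a flat sequence of gray values. The function name `structure_distance` and the default value of the constant `c` are placeholders, since the text only states that $C$ is small.

```python
from math import sqrt


def structure_distance(f, g, c=1e-4):
    """Return Dis(f, g) = 1 - s(f, g) for two equally sized gray-value sequences.

    c plays the role of the stabilizing constant C; its default value here is
    an arbitrary placeholder, not a value taken from the paper.
    """
    n = len(f)
    if n != len(g) or n < 2:
        raise ValueError("f and g must have the same length (at least 2)")
    mu_f = sum(f) / n
    mu_g = sum(g) / n
    # Unbiased (N - 1) estimates, as in Equations 4 and 5.
    sigma_f = sqrt(sum((v - mu_f) ** 2 for v in f) / (n - 1))
    sigma_g = sqrt(sum((v - mu_g) ** 2 for v in g) / (n - 1))
    sigma_fg = sum((a - mu_f) * (b - mu_g) for a, b in zip(f, g)) / (n - 1)
    s = (sigma_fg + c) / (sigma_f * sigma_g + c)
    return 1.0 - s
```

Identical inputs give a distance of (numerically) zero, while inputs with reversed structure, such as `[1, 2, 3, 4]` and `[4, 3, 2, 1]`, give a distance close to 2.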

C. Content Aware Image Distance

For the original image $f(x, y)$ and the target image $g(u, v)$, in order to calculate the structural similarity, we uniformly split the important regions of the original image into sub images of 9×9 pixels. Each sub image is represented by its center coordinate, $I_{sub\_n}(x_n, y_n)$, $n = 1, 2, \ldots, N_{sub}$, where $N_{sub}$ is the total number of sub images. For each sub image in $f(x, y)$, we find the corresponding sub image in $g(u, v)$. During seam carving, if a seam passes through the center point of a sub image, the center of that sub image is updated.

Let us take the removal of vertical seams as an example. When the removed seams do not pass through the center of a sub image, $Dis$ is calculated as in Equation 6. Suppose a seam passes through the center of the $i$-th sub image, whose coordinate is $p(x_i, y_i)$. Its left and right neighboring pixels are $p(x_i, y_i - 1)$ and $p(x_i, y_i + 1)$. The centers of the nearest sub images on its left and right are $p(x_i, y_{i\_left})$ and $p(x_i, y_{i\_right})$, respectively. The distances from the left and right neighboring pixels to the centers of the left and right sub images are computed as in Equation 7:

$$\begin{aligned} Dis\_Ave_{i\_left} &= (y_i - 1) - y_{i\_left} \\ Dis\_Ave_{i\_right} &= y_{i\_right} - (y_i + 1) \end{aligned} \tag{7}$$

The neighboring pixel with the larger distance is set as the new center of the $i$-th sub image. If the neighboring pixels of $p(x_i, y_i)$ include the centers of other important sub images, the $i$-th sub image is deleted and the corresponding $Dis$ is set to 1.0.
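The center update rule can be sketched as a small function. This is an illustrative reading of Equation 7 for vertical seams only: the function name `update_center` and its parameters are hypothetical, `other_centers` is assumed to hold the column coordinates of other sub-image centers in the same row, and ties between the two distances are resolved to the left, which the text does not specify.

```python
def update_center(y_i, y_left, y_right, other_centers=()):
    """Return the new column of a sub-image center hit by a vertical seam.

    y_i is the column of the current center, y_left and y_right are the columns
    of the nearest sub-image centers on each side. Returns None when the sub
    image must be deleted (its Dis is then set to 1.0).
    """
    left, right = y_i - 1, y_i + 1
    if left in other_centers or right in other_centers:
        return None  # a neighbor is already the center of another sub image
    dis_left = left - y_left      # Dis_Ave for the left neighbor
    dis_right = y_right - right   # Dis_Ave for the right neighbor
    # Assumption: on a tie, the left neighbor is chosen.
    return left if dis_left >= dis_right else right
```

Choosing the neighbor that is farther from the adjacent center keeps the updated centers spread out, which matches the intent of the rule above.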

Similarly, if horizontal seams are removed, the upper and lower neighboring pixels are used for calculating the distances. Figure 4(b) gives the sub images corresponding to Figure 4(a) after 150 seams are removed.

After building the correspondence between the sub images in $f(x, y)$ and $g(u, v)$, we compute the distance $Dis_{sub\_n}(f_{sub\_n}, g_{sub\_n})$ between each pair of important sub images as in Equation 6. CAID is the average distance over all sub images:

$$CAID = \frac{1}{N_{sub}} \sum_{n=1}^{N_{sub}} Dis_{sub\_n} \tag{8}$$
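Equation 8 can be sketched as follows, under several stated assumptions: images are lists of gray-value rows, sub-image centers are given as `(row, column)` pairs that lie at least four pixels from the border, a deleted sub image is marked with `None` and contributes a distance of 1.0, and the constant `c` is an arbitrary placeholder. The names `caid`, `_patch` and `_dis` are hypothetical.

```python
from math import sqrt


def _dis(f, g, c=1e-4):
    """Structural distance 1 - s(f, g) of Equation 6 for flat gray-value lists."""
    n = len(f)
    mu_f, mu_g = sum(f) / n, sum(g) / n
    sigma_f = sqrt(sum((v - mu_f) ** 2 for v in f) / (n - 1))
    sigma_g = sqrt(sum((v - mu_g) ** 2 for v in g) / (n - 1))
    sigma_fg = sum((a - mu_f) * (b - mu_g) for a, b in zip(f, g)) / (n - 1)
    return 1.0 - (sigma_fg + c) / (sigma_f * sigma_g + c)


def _patch(image, center, half=4):
    """Flatten the (2 * half + 1)-square sub image around center = (row, col)."""
    r0, c0 = center
    return [image[r][c]
            for r in range(r0 - half, r0 + half + 1)
            for c in range(c0 - half, c0 + half + 1)]


def caid(src, dst, centers_src, centers_dst):
    """Average sub-image distance; a center of None in dst counts as Dis = 1.0."""
    total = 0.0
    for c_src, c_dst in zip(centers_src, centers_dst):
        if c_dst is None:
            total += 1.0
        else:
            total += _dis(_patch(src, c_src), _patch(dst, c_dst))
    return total / len(centers_src)
```

With `half=4` each sub image is 9×9 pixels, as in the text. Comparing an image with itself yields a CAID of zero, and each deleted sub image raises the average by `1 / N_sub`.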


Figure 4. The corresponding sub images between the source image and the target image. (a) The important sub images of Figure 3(a). (b) The corresponding sub images of Figure 3(a) after 150 seams are removed.

III. WEIGHTED SEAM CARVING

The strategy for removing seams is critical for reducing the degradation of visual image quality. As the different strategies in Figure 1 illustrate, quality degradation is clearly visible when the removed seams are consecutive, especially in important regions, whereas it is not obvious when the seams are dispersed. Therefore, we remove seams according to the following rules:

As many seams as possible are removed from the unimportant regions.

The distribution of seams is as uniform as possible.

Based on these two rules, we adopt a weighted strategy to improve the performance of seam carving. If a seam to be removed lies in an important region, the pixels around the seam are penalized with a weight, so that the pixels in its neighborhood are less likely to be removed. This ensures that seams in important regions are removed non-adjacently. Moreover, it causes more seams to be removed from the unimportant regions, with less degradation.

A. Weighted Gradient Energy

For weighted seam carving, we first compute the weighted gradient energy. Given an image $I(x, y)$, we compute the three possible step costs, to the left, up and right:

$$\begin{aligned} C_{Left}(x, y) &= |w(x, y+1) I(x, y+1) - w(x, y-1) I(x, y-1)| + |w(x-1, y) I(x-1, y) - w(x, y-1) I(x, y-1)| \\ C_{Up}(x, y) &= |w(x, y+1) I(x, y+1) - w(x, y-1) I(x, y-1)| \\ C_{Right}(x, y) &= |w(x, y+1) I(x, y+1) - w(x, y-1) I(x, y-1)| + |w(x-1, y) I(x-1, y) - w(x, y+1) I(x, y+1)| \end{aligned} \tag{9}$$

where the initial weight $w(x, y)$ is the normalized saliency value of Section II-A. We then compute the forward-cumulative cost matrix $M(x, y)$. For vertical seams, the cost $M(x, y)$ is updated as follows:

$$M(x, y) = \min \begin{cases} M(x-1, y-1) + C_{Left}(x, y) \\ M(x-1, y) + C_{Up}(x, y) \\ M(x-1, y+1) + C_{Right}(x, y) \end{cases} \tag{10}$$

The minimal value of $M$ is the minimal cost $RLCost$. The path $path_y(x, y)$ corresponding to $RLCost$ is the optimal seam path. In other words, removing $path_y(x, y)$ inserts the minimal amount of weighted gradient energy.
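The dynamic program of Equations 9 and 10 can be sketched in plain Python. Several details below are assumptions, since the text does not specify them: the image and weights are lists of rows indexed as `[x][y]` with `x` the row and `y` the column, out-of-range neighbors are clamped to the border, the first row is initialized with the `C_Up` cost, and the minimum of `M` is taken over the last row as in standard seam carving. The function name `find_vertical_seam` is hypothetical.

```python
def find_vertical_seam(I, w):
    """Return (seam, cost): seam[x] is the column removed in row x."""
    rows, cols = len(I), len(I[0])

    def wi(x, y):
        # Weighted intensity w * I, with coordinates clamped to the image border.
        x = min(max(x, 0), rows - 1)
        y = min(max(y, 0), cols - 1)
        return w[x][y] * I[x][y]

    M = [[0.0] * cols for _ in range(rows)]
    back = [[0] * cols for _ in range(rows)]
    for x in range(rows):
        for y in range(cols):
            c_up = abs(wi(x, y + 1) - wi(x, y - 1))
            if x == 0:
                M[x][y] = c_up  # assumption: first row uses the C_Up cost only
                continue
            c_left = c_up + abs(wi(x - 1, y) - wi(x, y - 1))
            c_right = c_up + abs(wi(x - 1, y) - wi(x, y + 1))
            candidates = [(M[x - 1][y] + c_up, y)]
            if y > 0:
                candidates.append((M[x - 1][y - 1] + c_left, y - 1))
            if y < cols - 1:
                candidates.append((M[x - 1][y + 1] + c_right, y + 1))
            M[x][y], back[x][y] = min(candidates)

    # Backtrack from the cheapest entry of the last row.
    y = min(range(cols), key=lambda j: M[rows - 1][j])
    cost = M[rows - 1][y]
    seam = [0] * rows
    for x in range(rows - 1, -1, -1):
        seam[x] = y
        if x > 0:
            y = back[x][y]
    return seam, cost
```

Because the costs are computed on `w * I` rather than on `I` alone, raising the weights around a previously removed seam makes later seams through that neighborhood more expensive.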

B. Weight Transfer

When a removed seam passes through the important regions, the weights of the pixels in its neighborhood are updated accordingly. The main idea is to transfer the weight of the removed pixels on the seam to their neighboring pixels. Taking the removal of a vertical seam as an example, the seam to be removed is marked in red in Figure 5. We update the weights of the pixels in the horizontal neighborhood of this seam. For a pixel $(x_0, y_0)$ on the removed seam, the neighboring pixel set is denoted as $C = \{(x_0 \pm p, y_0),\ p = 1, 2, \ldots, (Bandwidth - 1)\}$. For each pixel $(x_0 \pm p, y_0)$ in set $C$, the corresponding weight is updated using Equation 11:

$$w(x_0 \pm p, y_0) = w(x_0 \pm p, y_0) + \left(1.0 - \frac{p}{Bandwidth}\right) w(x_0, y_0) \tag{11}$$

where $Bandwidth$ is the half width of the neighborhood. We set $Bandwidth = 5$ in our experiments.
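The weight transfer of Equation 11 can be sketched as follows for a vertical seam. The sketch assumes weights are stored as a list of rows indexed `[row][column]`, so the horizontal offset `p` is applied to the column index, and that neighbors outside the image are simply skipped. The function name `transfer_weight` and the optional `mask` parameter (the binary importance map of Section II-A) are hypothetical.

```python
def transfer_weight(w, seam, bandwidth=5, mask=None):
    """Add the weight of each removed seam pixel to its horizontal neighbors.

    seam[r] is the column of the removed pixel in row r. If mask is given,
    weight is transferred only where mask[r][seam[r]] == 1. w is updated in
    place and returned.
    """
    cols = len(w[0])
    for r, c0 in enumerate(seam):
        if mask is not None and mask[r][c0] != 1:
            continue  # seam pixel lies in an unimportant region
        w0 = w[r][c0]
        for p in range(1, bandwidth):
            gain = (1.0 - p / bandwidth) * w0
            for c in (c0 - p, c0 + p):
                if 0 <= c < cols:
                    w[r][c] += gain
    return w
```

With `bandwidth=5`, the immediate neighbors receive 80% of the removed pixel's weight and the neighbors four pixels away receive 20%, so the penalty fades linearly with distance from the seam.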

Figure 5. A seam and corresponding neighboring regions for weight transfer

IV. GRID WARPING

After each seam is removed, the CAID between the target image and the original image is computed. If the CAID is larger than or equal to 0.2 (a value chosen from our extensive observation), we consider the distortion unacceptable and the procedure switches to warping. The optimization model of [29] is employed to allocate the image aspect distortion optimally based on a grid model. Rectangular-grid constraints are used to avoid serious shape transformation in resizing, and the aspect ratio changes of all grids are summed up to measure the distortion energy of retargeting. For grid construction, an image is divided into $N \times K$ grids, denoted by $M = (V, E)$, in which $V$ is the set of 2D grid coordinates and $E$ is the set of grid edges. The grids are denoted by $G = \{g_{11}, \ldots, g_{ij}, \ldots, g_{NK}\}$, where $(i, j)$ is the location of grid $g_{ij}$. Owing to the rectangular-grid constraint, all grids in a row have the same height and all grids in a column have the same width. The edges are therefore simply denoted by $E = \{(w_1, h_1), \ldots, (w_N, h_K)\}$, where $w_i$ and $h_j$ are the width and the height of grid $g_{ij}$, respectively. The importance $s_{ij}$ of each grid is computed by averaging the pixel-wise importance values in grid $g_{ij}$. With these definitions, the objective function and the boundary constraints are formulated as follows:

The Objective Function A nonlinear objective function is employed to reallocate distortion over a large proportion of (or all) unimportant regions to avoid discontinuity. To minimize the grid distortion energy, the objective function is defined as:

$$\min \sum_{i} \sum_{j} \left| y_i(t) - ar_s\, x_j(t) \right|^m s_{ij}^n \tag{12}$$

where $m$ is an even number, $m \ge 2$ and $n \ge 1$. We do not prefer a linear form for the objective function. For example, when $|y_i(t) - ar_s\, x_j(t)|$ is used as the distortion energy of the grids, the optimization always chooses one entire row or column of grids with the minimum importance value to shrink or stretch. When the width or height of those grids is reduced to zero, the spatial continuity of the whole image is destroyed.

Distortion Energy We use the edges of the grids rather than the coordinates of the vertices to measure the distortion energy of each grid. For an image $t$, the distortion energy of each grid $g_{ij}$ is defined as:

$$D(g_{ij}) = \left| y_i(t) - ar_s\, x_j(t) \right| \tag{13}$$

where $ar_s$ is the aspect ratio of the original grid, and $x_j$ and $y_i$ are the width and the height of the target grid $g_{ij}$, respectively.

Boundary Constraints We introduce the constraints as follows:

$$\begin{aligned} \sum_{i=1}^{N} y_i(t) &= H_T \\ \sum_{j=1}^{N} x_j(t) &= W_T, \qquad i, j = 1, 2, \ldots, n \\ y_i(t) &\ge 0 \\ x_j(t) &\ge 0 \end{aligned} \tag{14}$$

where $H_T$ and $W_T$ are the height and the width of the target image, respectively. Note that the minimum height or width of a grid is set to one pixel, as adjacent grids should not overlap each other.

Global Solution To get a global solution, we employ an active-set method to solve this optimization problem. This nonlinear program is a convex program, and any local solution of a convex program is a global solution. The function $|y_i(t) - ar_s\, x_j(t)|^m$ is convex, so our objective function $\sum_{i} \sum_{j} |y_i(t) - ar_s\, x_j(t)|^m s_{ij}^n$ is also convex.

Moreover, the equality constraints are linear functions and the inequality constraints can be seen as concave functions. The solutions satisfying the equality and inequality constraints therefore form a convex set. Once a local solution is found, the global solution is obtained.

An active-set method starts by making a guess of the optimal active set that satisfies the equalities. Given the height and width of the target image, $H_T$ and $W_T$, the initial guess is $\{H_T/N, \ldots, W_T/N, \ldots\}$, which satisfies the equality constraints. The nonlinear program can then be solved iteratively to obtain the global solution in the feasible region.

For convex programming, the Hessian matrix of the objective function is positive semi-definite. The complexity is similar to that of linear programming and depends on the number of model variables, $O(N + K)$ (i.e. the divisions of width and height).
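To make the optimization model concrete, the sketch below evaluates the objective of Equation 12, checks the constraints of Equation 14 with the one-pixel minimum size, and builds a uniform initial guess. It does not implement the active-set solver itself. The function names, the argument layout (row heights and column widths as separate lists) and the defaults `m=2`, `n=1` are assumptions made for illustration.

```python
def grid_energy(row_heights, col_widths, importance, ar_s, m=2, n=1):
    """Sum of |y_i - ar_s * x_j|^m * s_ij^n over all grids (Equation 12)."""
    return sum(abs(y - ar_s * x) ** m * importance[i][j] ** n
               for i, y in enumerate(row_heights)
               for j, x in enumerate(col_widths))


def is_feasible(row_heights, col_widths, target_h, target_w, min_size=1.0, tol=1e-9):
    """Check the equality constraints and the minimum grid size of one pixel."""
    sizes = list(row_heights) + list(col_widths)
    return (abs(sum(row_heights) - target_h) <= tol
            and abs(sum(col_widths) - target_w) <= tol
            and all(v >= min_size for v in sizes))


def uniform_guess(target_h, target_w, n_rows, n_cols):
    """Feasible starting point: equal row heights and equal column widths."""
    return [target_h / n_rows] * n_rows, [target_w / n_cols] * n_cols
```

For example, with a 4×4 grid of originally square cells (`ar_s = 1.0`), a 100×50 target and importance concentrated in the two right-hand columns, the uniform guess has energy 1250, whereas shrinking the two unimportant columns to widths of 5 and keeping the important ones at 20 lowers it to 200. This is the kind of reallocation the solver is expected to find.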

Figure 6. Results of grid based image warping

V. EXPERIMENTAL RESULTS

A. Experimental Setting

To evaluate the effectiveness and efficiency of our retargeting method, we conduct experiments and comparisons on the RetargetMe dataset [1] and on our own collected image datasets. Our experiments involve three parts:

Evaluate the effects of our weighted seam carving;

Evaluate the CAID measure;

Evaluate the effectiveness of our combined approach.


B. Effects of Weighted Seam Carving

To evaluate the performance of weighted seam carving, we compare it with Rubinstein's SC [5] in Figure 7, which shows the corresponding images when the number of removed seams is 100, 200, 300, 400 and 500, respectively. Figure 3(a) is the original image of size 500 × 500, which we resize to 250 × 250. We use CAID to assess the loss of visual quality. We also indicate the centers of the important sub images on the corresponding images.

Figure 7. Comparison of Rubinstein's SC [5] and weighted seam carving. (a1)-(a5) are the results of Rubinstein's SC [5] with their CAID values. (b1)-(b5) are the results of our approach.

As Figure 7 shows, our approach achieves a more uniform distribution of important sub images than Rubinstein's SC [5], especially for 300, 400 and 500 removed seams. This demonstrates that our approach causes less deformation (lower CAID) in important regions than Rubinstein's SC [5]. Moreover, the overall visual effect of our approach is much better after the seams are removed. Most importantly, as the number of seams increases, Rubinstein's SC [5] degrades image quality faster than our approach.

C. Experimental Results of CAID Measure

To show the performance of the CAID measure, we compare it with two objective measures, BDS [26] and Liu [23]. We chose ten images from the RetargetMe dataset whose width or height is larger than 1000 pixels. Each image is resized using our weighted seam carving by removing 100, 200, 300, 400, 500, 600 and 700 seams, respectively. Both subjective and objective measures are used to assess each resized image. For the subjective evaluation, we collected the ratings of 40 participants on how similar each retargeted image is to its original image. The participants were asked to give a dissimilarity score in the range [0, 1]. To avoid influencing their aesthetic judgment, no hints or prior knowledge were given to them in advance. We average all the scores as the result of the subjective measure. All results are normalized to a dissimilarity in the range [0, 1], and the experimental results are shown in Figure 8. From Figure 8, we can see that the shape of the CAID curve is more similar to that of the user study than the curves of the other methods. The absolute values of CAID and the user study differ because the two use different quantitative scales. The results of CAID and the user study both increase dramatically as the number of removed seams increases, whereas Liu [23] increases linearly and BDS [26] fluctuates. These results indicate that the CAID measure is more consistent with human perception.

Figure 8. Comparison results between CAID, BDS [26], Liu [23], and the User study

D. Performance of Weighted Seam Carving + Warping

We compare warping [29], Rubinstein's SC [5], our weighted SC, Dong [2], our weighted SC + scaling, and our weighted SC + warping. A comparison example is given in Figure 9. From Figure 9, we can see that our weighted SC is much better than Rubinstein's SC [5]. Compared with the other approaches, our weighted SC + warping removes more unimportant regions (such as the sky in the left images) than Dong [2], and it causes less deformation of the key object (the tower) than our weighted SC + scaling. However, the tower in Figure 9(g) is a little smaller than those in Figure 9(d)-(f).

In order to further show the effectiveness of our results,

we also compute the area and aspect ratio of the

important object in Figure 10.The key object lotus is

marked using a bounding box in Figure 10(a). The area

and aspect ratio of key objects in different images are

listed in table I. The results of comparison in Table I

show that our weighted SC + warping can preserve more

important regions than warping, and it keeps the aspect


ratio of the important object much closer to that of the original image than the other approaches. Our weighted SC causes less deformation of details than Rubinstein's SC [5]. In short, our weighted SC + warping performs better in terms of removing more unimportant pixels while preserving more details and the aspect ratio of the important objects. Our approach achieves a favorable trade-off between preserving the aspect ratio of important regions and limiting deformation.

Figure 9. Comparison of different image resizing approaches

E. User Study

A subjective evaluation is further performed to compare our approach with the other methods on the RetargetMe dataset [1]. User preference and scoring are used to measure effectiveness quantitatively. In total, 40 students and teachers participated in the user study. Each participant is shown an original image and a randomly ordered sequence of retargeting results from different methods, including Cropping, Energy-based deformation [3], Multi-operator [4], Rubinstein's SC [5], Simple scaling operator, Shift-maps [6], Scale-and-Stretch [7], Streaming Video [8], Non-homogeneous warping [9], Wang's method [9] and our weighted SC + warping. The participants are asked to vote for their most favorite and least favorite result among all the retargeting methods. To avoid influencing the participants' aesthetic judgment, no hints or prior knowledge are given to them in advance.

The statistical results are shown in Table II. According to the percentages listed in Table II, our approach ranks 2nd for "most favorite" and 8th of the ten methods for "least favorite", so only two methods are chosen as least favorite less often. These results show that participants frequently prefer our results and seldom consider them the worst. Figure 11 gives some comparison results from the RetargetMe database.
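The rankings can be re-derived from the percentages printed in Table II. The snippet below is only a convenience sketch for checking those ranks; the helper name `rank_descending` is hypothetical, and the numbers are copied from the table as printed.

```python
# Percentages copied from Table II (column order as printed in the table).
METHODS = ["Cr", "Lg", "Multioperator", "Sc", "Scl", "Sm", "Sns", "Sv", "Warp", "Ours"]
MOST_FAVORITE = [18.28, 5.20, 14.42, 7.03, 4.58, 12.52, 6.13, 9.07, 5.97, 16.79]
LEAST_FAVORITE = [16.67, 8.84, 3.51, 14.40, 17.07, 13.41, 7.40, 4.42, 7.24, 7.04]

def rank_descending(names, values):
    """Map each name to its rank (1 = highest percentage)."""
    order = sorted(zip(names, values), key=lambda pair: pair[1], reverse=True)
    return {name: position + 1 for position, (name, _) in enumerate(order)}

most_rank = rank_descending(METHODS, MOST_FAVORITE)
least_rank = rank_descending(METHODS, LEAST_FAVORITE)

for name in METHODS:
    print(f"{name:14s} most favorite: rank {most_rank[name]:2d}   "
          f"least favorite: rank {least_rank[name]:2d}")
```

A high rank in the first column and a low rank (large number) in the second are both desirable, since the second column counts how often a method was picked as the worst.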

Figure 10. Comparison of different image resizing approaches. (Image is resized from 333 × 500 to 333 × 250.) (a) Original image (b) Warping (c) Rubinstein's SC (d) Our weighted SC (e) Dong's with DCD [2] (f) Dong's without DCD [2] (g) Our weighted SC + scaling (h) Our weighted SC + warping

TABLE I. THE AREA AND ASPECT RATIO OF IMAGES IN FIGURE 10.

Image  Area (pixels)  Aspect ratio
(a)    35.6k          1.76
(b)    17.1k          1.47
(c)    29.7k          1.44
(d)    27.5k          1.37
(e)    23.3k          1.29
(f)    26.8k          1.50
(g)    24.8k          1.22
(h)    20.4k          1.57
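The comparison drawn from Table I can be made explicit by relating each result to the original image (a). The following is a minimal sketch using the table values as printed; the function name `compare_to_original` and the choice of absolute aspect-ratio difference as the deviation measure are assumptions for illustration.

```python
# (area in thousands of pixels, aspect ratio) of the key object, copied from Table I.
# Labels follow the Figure 10 caption.
TABLE_I = {
    "a": (35.6, 1.76),  # original image
    "b": (17.1, 1.47),  # warping
    "c": (29.7, 1.44),  # Rubinstein's SC
    "d": (27.5, 1.37),  # our weighted SC
    "e": (23.3, 1.29),  # Dong's with DCD
    "f": (26.8, 1.50),  # Dong's without DCD
    "g": (24.8, 1.22),  # our weighted SC + scaling
    "h": (20.4, 1.57),  # our weighted SC + warping
}

def compare_to_original(table, ref="a"):
    """Return {label: (retained area fraction, |aspect ratio - original|)}."""
    ref_area, ref_ratio = table[ref]
    return {
        key: (area / ref_area, abs(ratio - ref_ratio))
        for key, (area, ratio) in table.items()
        if key != ref
    }

stats = compare_to_original(TABLE_I)
closest = min(stats, key=lambda key: stats[key][1])

for key in sorted(stats):
    kept, deviation = stats[key]
    print(f"({key}) area kept: {kept:.2f}   aspect-ratio deviation: {deviation:.2f}")
print("closest aspect ratio to the original:", closest)
```

Reading the two columns side by side shows the trade-off discussed in the text: a method can retain more of the object's area while distorting its proportions more.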

F. Failure Case

Figure 12 gives a failure case of our approach. The original image (in Figure 12(a)) is resized from 683×1024 to 684×768. There is obvious deformation in the shelf, as shown in Figure 12(e). This is because we place more constraints on the important regions and apply no special handling to the unimportant regions: when the removed seams pass through the region of the "shelf", serious deformation


TABLE II. RESULTS OF SUBJECTIVE EVALUATION. CR: CROPPING, LG: ENERGY-BASED DEFORMATION [3], SC: RUBINSTEIN'S SC [5], SCL: SCALE, SM: SHIFT-MAPS [6], SNS: SCALE-AND-STRETCH [7], SV: STREAMING VIDEO [8]

                    Cr     Lg    Multioperator  Sc     Scl    Sm     Sns   Sv    Warp  Ours
Number of images    78     75    77             77     78     73     71    78    77    78
Most favorite (%)   18.28  5.20  14.42          7.03   4.58   12.52  6.13  9.07  5.97  16.79
Least favorite (%)  16.67  8.84  3.51           14.40  17.07  13.41  7.40  4.42  7.24  7.04

Figure 11. Comparison results in RetargetMe images

Figure 12. A failure case. (a) Original image. (b) The saliency map. (c) The important regions. (d) The important regions with structural constraints. (e) The resized result of our approach. (f) The improved resized result with structural constraints.

happens. As a next step, we will add structural constraints for the unimportant regions, as in Figure 12(d), so that there is no obvious deformation in the resized image, as shown in Figure 12(f).

Our approach is implemented in C++ and runs on a PC with a 2.10 GHz duo CPU and 1 GB of memory. On average, 40 seconds are needed to resize a 500×500 image to half size.

VI. CONCLUSIONS

In this paper, we have proposed a hybrid image resizing approach that jointly uses seam carving and warping. We apply CAID to monitor the degradation of visual quality as seams are removed during resizing. The weighted seam carving stops at a fixed threshold to ensure little degradation of visual image quality. The current image is then warped to the target size. In this way, we remove as many unimportant pixels as possible while keeping the deformation of the important regions as small as possible. Experiments and comparisons on the RetargetMe database show the superiority of the proposed approach.
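The two-stage control flow summarized above (remove seams until a distortion threshold is reached, then resize the remainder to the target width) can be sketched as follows. This is a simplified illustration under several assumptions and is not the authors' implementation: it uses plain gradient-energy seam carving rather than the weighted variant, a hypothetical distortion proxy (the share of the original image energy removed so far) in place of CAID, and nearest-neighbour column resampling in place of grid warping. All function names and the threshold value are made up for the example.

```python
def energy_map(img):
    """Forward-difference gradient energy |dx| + |dy| with clamped borders."""
    h, w = len(img), len(img[0])
    return [[abs(img[y][min(x + 1, w - 1)] - img[y][x]) +
             abs(img[min(y + 1, h - 1)][x] - img[y][x])
             for x in range(w)] for y in range(h)]

def min_vertical_seam(e):
    """Dynamic programming; returns (seam column per row, total seam cost)."""
    h, w = len(e), len(e[0])
    cost = [row[:] for row in e]
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    x = min(range(w), key=lambda c: cost[h - 1][c])
    total = cost[h - 1][x]
    seam = [x]
    for y in range(h - 1, 0, -1):  # backtrack from the bottom row to the top
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=lambda c: cost[y - 1][c])
        seam.append(x)
    seam.reverse()
    return seam, total

def remove_seam(img, seam):
    """Delete one pixel per row along the seam."""
    return [row[:x] + row[x + 1:] for row, x in zip(img, seam)]

def scale_width(img, target_w):
    """Nearest-neighbour column resampling (a stand-in for grid warping)."""
    w = len(img[0])
    cols = [min(int((i + 0.5) * w / target_w), w - 1) for i in range(target_w)]
    return [[row[c] for c in cols] for row in img]

def hybrid_resize(img, target_w, threshold):
    """Carve seams while the distortion proxy stays within threshold, then scale."""
    total_energy = sum(map(sum, energy_map(img))) or 1.0
    removed_cost, seams_removed, cur = 0.0, 0, img
    while len(cur[0]) > target_w:
        seam, cost = min_vertical_seam(energy_map(cur))
        # Hypothetical proxy, NOT CAID: fraction of original energy removed so far.
        if (removed_cost + cost) / total_energy > threshold:
            break
        removed_cost += cost
        cur = remove_seam(cur, seam)
        seams_removed += 1
    return scale_width(cur, target_w), seams_removed

def make_demo_image(h=6, w=12):
    """Flat background (0) with a textured 4-column 'object' in the middle."""
    return [[100 + 50 * ((x + y) % 2) if 4 <= x < 8 else 0 for x in range(w)]
            for y in range(h)]

demo = make_demo_image()
resized, n_seams = hybrid_resize(demo, target_w=4, threshold=0.05)
print("seams removed before switching to scaling:", n_seams)
print("output size:", len(resized), "x", len(resized[0]))
```

In this toy image, seam removal stays in the flat background and the loop stops once the next seam would have to cut through the textured object; the remaining width reduction is left to the scaling step, mirroring the hand-over from seam carving to warping described in the text.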


ACKNOWLEDGMENT

The authors are grateful to the anonymous referees for their valuable comments and suggestions to improve the presentation of this paper. This work was supported by the 973 Program (2010CB327905) and the National Natural Science Foundation of China (61273034, 61040052).

REFERENCES

[1] M. Rubinstein, D. Gutierrez, O. Sorkine, and A. Shamir, "A comparative study of image retargeting," ACM Transactions on Graphics, vol. 29, no. 6, Dec. 2010.
[2] W. Dong, N. Zhou, J. Paul, and X. Zhang, "Optimized image resizing using seam carving and scaling," ACM SIGGRAPH Asia 2009 papers, vol. 28, no. 5, pp. 1–10, 2009.
[3] Z. Karni, D. Freedman, and C. Gotsman, "Energy-based image deformation," Computer Graphics Forum, vol. 28, no. 5, pp. 1257–1268, 2009.
[4] M. Rubinstein, A. Shamir, and S. Avidan, "Multi-operator media retargeting," ACM Transactions on Graphics, vol. 28, no. 3, pp. 1–11, 2009.
[5] M. Rubinstein and A. Shamir, "Improved seam carving for video retargeting," ACM Transactions on Graphics, vol. 27, no. 3, pp. 1–9, 2008.
[6] Y. Pritch, E. Kav-Venaki, and S. Peleg, "Shift-map image editing," ICCV, pp. 151–158, 2009.
[7] Y. Wang, C. L. Tai, O. Sorkine, and T. Y. Lee, "Optimized scale-and-stretch for image resizing," ACM Transactions on Graphics, vol. 27, no. 5, pp. 1–8, 2008.
[8] P. Krahenbuhl, M. Lang, A. Hornung, and M. Gross, "A system for retargeting of streaming video," ACM Transactions on Graphics, vol. 28, no. 5, 2009.
[9] L. Wolf, M. Guttmann, and D. Cohen-Or, "Non-homogeneous content driven video retargeting," IEEE 11th International Conference on Computer Vision (ICCV), pp. 1–6, 2007.
[10] L. Chen, X. Xie, X. Fan, W. Ma, H. Zhang, and H. Zhou, "A visual attention model for adapting images on small displays," ACM Multimedia Systems Journal, vol. 9, no. 4, pp. 353–364, 2003.
[11] A. Shamir and O. Sorkine, "Visual media retargeting," ACM SIGGRAPH ASIA 2009 Courses, pp. 1–13, 2009.
[12] S. Avidan and A. Shamir, "Seam carving for content-aware image resizing," ACM Transactions on Graphics, vol. 26, no. 3, p. 10, 2007.
[13] R. Gal, O. Sorkine, and D. Cohen-Or, "Feature aware texturing," EGSR, pp. 297–303, 2009.
[14] Y. Zhang, S. Hu, and R. Martin, "Shrinkability maps for content-aware video resizing," Computer Graphics Forum, vol. 27, no. 7, pp. 1797–1804, 2008.
[15] Y. Gong, L. Wu, and X. Zhang, "A semantic aware image retargeting scheme combining seam carving and non-homogeneous warping," Proceedings of the Second International Conference on Internet Multimedia Computing and Service, pp. 53–56, 2010.
[16] D.-S. Hwang and S.-Y. Chien, "Content-aware image resizing using perceptual seam carving with human attention model," Multimedia and Expo, 2008 IEEE International Conference, pp. 1029–1032, 2008.
[17] R. Achanta and S. Susstrunk, "Saliency detection for content-aware image resizing," Image Processing (ICIP), 2009 16th IEEE International Conference, pp. 1005–1008, 2009.
[18] Z. Toony and M. Jamzad, "A modified saliency detection for content-aware image resizing using cellular automata," Signal and Image Processing (ICSIP), 2010 International Conference, pp. 175–179, 2010.
[19] D. Domingues, A. Alahi, and P. Vandergheynst, "Stream carving: An adaptive seam carving algorithm," Image Processing (ICIP), 2010 17th IEEE International Conference, pp. 901–904, 2010.
[20] J. Chen, L. Miao, and X. Liu, "Balanced energy for content-aware image resizing," Ubi-media Computing (U-Media), 2010 3rd IEEE International Conference, pp. 24–29, 2010.
[21] G. Zhang, M. Cheng, S. Hu, and R. R. Martin, "A shape-preserving approach to image resizing," Computer Graphics Forum, vol. 28, no. 7, pp. 1897–1906, 2009.
[22] H. Huang, T. Fu, R. Paul, and L. Chun, "Real-time content-aware image resizing," Science in China Series F: Information Sciences, vol. 52, no. 2, pp. 172–182, 2009.
[23] Y. Liu, X. Luo, Y. Xuan, W. Chen, and X. Fu, "Image retargeting quality assessment," Eurographics, vol. 30, no. 2, 2011.
[24] S. Cho, H. Choi, Y. Matsushita, and S. Lee, "Image retargeting using importance diffusion," Image Processing (ICIP), 2009 16th IEEE International Conference, pp. 977–980, Nov. 2009.
[25] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: From error visibility to structural similarity," IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004.
[26] D. Simakov, Y. Caspi, E. Shechtman, and M. Irani, "Summarizing visual data using bidirectional similarity," CVPR, pp. 1–8, 2008.
[27] S. Goferman, L. Zelnik-Manor, and A. Tal, "Context-aware saliency detection," IEEE Computer Vision and Pattern Recognition (CVPR) 2010 (ORAL), pp. 2376–2383, 2010.
[28] L. Wu, L. Cao, J. Wang, and S. Liu, "Content aware metric for image resizing assessment," 12th Pacific-Rim Conference on Multimedia (PCM), pp. 203–212, 2011.
[29] B. Li, Y. Chen, and J. Wang, "Fast retargeting with adaptive grid optimization," Proc. ICME'11, 2011.

Lifang Wu received her B.E. and M.E. degrees from Beijing University of Technology (BJUT), Beijing, China, in 1991 and 1994, respectively. She received her Ph.D. degree in pattern recognition and intelligent systems from BJUT in 2003. She is now on the faculty of the School of Electronic Information and Control Engineering, Beijing University of Technology, where she currently serves as a professor. She has published over 50 refereed technical papers in international journals and conferences on image/video processing and pattern recognition. Her research interests include image/video analysis and understanding, face detection and recognition, and face encryption. She is a senior member of the Chinese Institute of Electronics.

Lianchao Cao was born in China in 1987. He received the Bachelor's degree in Software Engineering in 2010 from China University of Geosciences in Beijing. He is currently a postgraduate student at Beijing University of Technology. His research mainly focuses on digital image retargeting.


Min Xu received the B.E. degree from the University of Science and Technology of China in 2000, the M.S. degree from the National University of Singapore in 2004, and the Ph.D. degree from the University of Newcastle, Australia, in 2010. Currently, she is a lecturer in the School of Computing and Communications, Faculty of Engineering and IT, University of Technology, Sydney. Her research interests include multimedia content analysis, video adaptation, interactive multimedia, pattern recognition and computer vision.

Jinqiao Wang received the B.E. degree in 2001 from Hebei University of Technology, China, and the M.S. degree in 2004 from Tianjin University, China. He received the Ph.D. degree in pattern recognition and intelligence systems from the National Laboratory of Pattern Recognition, Chinese Academy of Sciences, in 2008. He is currently an Associate Professor with the Chinese Academy of Sciences. His research interests include pattern recognition and machine learning, image and video processing, mobile multimedia, and intelligent video surveillance.
