
Digital Image Stabilization Techniques

Ashok Kumar‡, Ram Saran*, Hari Babu Srivastava†
Instruments Research and Development Establishment (DRDO), Raipur Road, Dehra Dun-248008, India
‡[email protected], Tel. +91-135-2782143
*[email protected], Tel. +91-135-2782143
†[email protected], Tel. +91-135-2782470

ABSTRACT

The acquisition of digital video usually suffers from undesirable camera jitter caused by unstable camera motion. Stabilizing the video sequence from a camera mounted on a moving platform such as an airborne reconnaissance vehicle, a main battle tank or a ship is particularly challenging, because the motion is composed not only of translation but also of rotation, zooming in and out, and panning. In this paper, we propose a robust real-time video stabilization algorithm that rejects undesirable disturbances in the unstable video to produce a stabilized output. The proposed algorithm uses a block-based parametric motion model to generate global camera motion parameters, which are then smoothed temporally to reduce motion fluctuations. The smoothed motion parameters, together with the original frames, are used to produce the stabilized video sequence. Real video and synthetic image sequences have been used to demonstrate the performance and efficiency of the algorithm.

Key Words: Digital image stabilization, motion estimation, real-time, rotational motion, motion compensation

1. INTRODUCTION

Video stabilization is an important video enhancement technology which aims at removing unintentional shaky motion while preserving the intentional global motion of a video sequence. Stabilization by hand-held mechanical devices such as the Steadicam and by optical systems [1] has been popular in the past, and efficient real-time digital implementations are now becoming common in military systems as well [2]. A sighting system mounted on a vehicle such as a main battle tank requires a high-performance stabilization function which removes all angular motion disturbances under stationary or moving conditions [3]. Because of its advantages of high stabilizing precision, compact size, light weight, low power consumption and reasonable price, digital image stabilization has become a focus of study in many countries [4].

Digital image stabilization is typically considered to consist of three successive steps: motion estimation, motion filtering, and motion compensation. Success in each of these phases affects the quality of the resulting video [5]. For example, image motion caused by camera motion has to be separated from other motion seen in the scene, and only the unintentional part of this motion should be removed. Compensating for the motion should not decrease the image quality.

A number of approaches have been proposed for video stabilization, in which accurate motion estimation is critical. Image motion can be estimated using spatio-temporal or region matching approaches. Spatio-temporal approaches include block matching [6], optical flow estimation [7], a probabilistic motion model with Kalman filtering [8] and a least mean-square error matrix approach [9]. Region matching methods include bit-plane matching [10], feature tracking [11], point-to-line correspondence [12], and pyramidal approaches [13]. In this paper, a block-matching algorithm has been implemented. It utilizes the full image information and can be applied to any type of image, rich or poor in texture. The block correlation is robust against random noise and has high accuracy.

The remainder of this paper is organized as follows. Section 2 describes the structure of the overall digital image stabilization algorithm: it presents the motion estimation scheme used to estimate the local and global motion parameters, the motion decision step that distinguishes translational motion from rotational motion, and the motion compensation module that generates the stabilized video sequence. Section 3 presents the experimental results obtained with the proposed system and, finally, Section 4 gives the conclusions.

2. DIGITAL IMAGE STABILIZATION SYSTEM

The digital image stabilization system consists of a local motion estimator, a motion decision unit, a global motion estimator, a motion smoother and a motion compensation unit. The functioning of each unit is explained below.

Figure 1. Overview of the digital stabilization algorithm (Input Video Sequence → Local Motion Estimation → Motion Decision Unit → Global Motion Estimation → Motion Smoother → Motion Compensation → Output Video Sequence)

First, an estimation of the local motions is carried out. The aim of this step is to obtain the motion vectors which characterize the different motions present within the image. This paper uses a block-matching algorithm with the minimum mean absolute difference (MAD) criterion for local motion estimation. Next, a motion decision is made in order to determine whether the global motion is purely translational, purely rotational, or a combination of these two components. With this approach, the system is able to overcome the limitation of previous algorithms that could detect a single type of motion only; the decision unit is able to segment a video sequence containing both types of motion. Third, the parameters defining the global motion are estimated depending on the result of the previous module: if the global motion has been classified as purely translational, only the displacements along the axes are estimated, and if the motion is declared rotational, or a combination of the two, the center and angle of rotation are estimated. Fourth, the global motion parameters are sent to the motion smoother unit, where they are filtered to remove the unwanted motion while retaining intentional motion, i.e. panning. Finally, the motion compensation unit warps the current frame using the filtered global motion parameters and generates the stabilized video sequence.



2.1 Local Motion Estimation

Block matching can be considered the most popular method due to its low hardware complexity. The cost function implemented for matching the blocks (the input image is divided into blocks) is the mean absolute difference (MAD). For each block, a search region is defined in the current frame to estimate its local motion vector. The functional definition of the MAD is given below:

MAD(d_1, d_2) = \frac{1}{N_1 N_2} \sum_{(x,y) \in B} \left| f_1(x, y) - f_2(x + d_1, y + d_2) \right|

where B denotes an N_1 \times N_2 block and (d_1, d_2) is a candidate motion vector. The estimate of the motion vector is taken to be the value of (d_1, d_2) which minimizes the MAD, that is

[d_1, d_2]^T = \arg\min_{(d_1, d_2)} MAD(d_1, d_2)

Finding the best matching block requires optimizing the matching criterion over all possible candidate displacements for each block. A full search, which evaluates the criterion for every candidate, is obviously extremely time consuming; in order to reduce the computational burden, the search area is limited to a window or region of ±R pixels around each block. The output of this unit is a local motion vector for each block. The block-matching algorithm is given below.

2.1.1 Block Matching Algorithm

% f1: previous image frame, f2: current image frame
% R: half-size of the search window (in pixels), N: block size (the image is divided into NxN blocks)
% height and width are the dimensions of the image
[height, width] = size(f1);
mvx = zeros(floor(height/N), floor(width/N));
mvy = zeros(floor(height/N), floor(width/N));
for i = 1:N:height-N+1                      % for every block in the previous frame
    for j = 1:N:width-N+1
        MAD_min = 256; d1 = 0; d2 = 0;
        for k = -R:R                        % within the search region R
            for l = -R:R
                % skip candidates that fall outside the current frame
                if i+k < 1 || j+l < 1 || i+k+N-1 > height || j+l+N-1 > width
                    continue;
                end
                % calculate MAD for this candidate displacement
                MAD = (1/(N*N))*sum(sum(abs(f1(i:i+N-1, j:j+N-1) ...
                                          - f2(i+k:i+k+N-1, j+l:j+l+N-1))));
                if MAD < MAD_min            % keep the minimum-MAD candidate
                    MAD_min = MAD; d2 = k; d1 = l;
                end
            end
        end
        % store the estimated motion vector for this block
        iblk = (i-1)/N + 1; jblk = (j-1)/N + 1;
        mvx(iblk, jblk) = d1;
        mvy(iblk, jblk) = d2;
    end
end
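For reference, the same exhaustive MAD search can be expressed compactly in Python with NumPy. The sketch below is only illustrative (it assumes a single-channel image stored as a 2-D array) and is not the authors' implementation:

import numpy as np

def block_match(f1, f2, N=16, R=8):
    """Full-search block matching between previous frame f1 and current frame f2 (2-D arrays).
    Returns per-block motion vectors (mvx, mvy) that minimize the mean absolute difference."""
    f1 = f1.astype(np.float32)
    f2 = f2.astype(np.float32)
    height, width = f1.shape
    nby, nbx = height // N, width // N
    mvx = np.zeros((nby, nbx))
    mvy = np.zeros((nby, nbx))
    for bi in range(nby):
        for bj in range(nbx):
            i, j = bi * N, bj * N
            block = f1[i:i+N, j:j+N]
            best, d1, d2 = np.inf, 0, 0
            for k in range(-R, R + 1):          # vertical candidate displacement
                for l in range(-R, R + 1):      # horizontal candidate displacement
                    if i+k < 0 or j+l < 0 or i+k+N > height or j+l+N > width:
                        continue                # candidate block falls outside the frame
                    mad = np.mean(np.abs(block - f2[i+k:i+k+N, j+l:j+l+N]))
                    if mad < best:
                        best, d2, d1 = mad, k, l
            mvx[bi, bj], mvy[bi, bj] = d1, d2
    return mvx, mvy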


2.2 Motion Decision Estimation

Once the local motion vectors have been estimated, the motion decision unit has to determine the type of global motion. Distinguishing translational motion from rotational motion becomes straightforward by examining the statistics of the local motion vectors. The decision on the global motion type is taken by evaluating the variance of the local motion vectors, which gives a clue as to whether the motion is translation, rotation, or a combination of the two. When the variance of each component (x and y) of the local motion vectors is below a threshold, the motion is declared translational: a low variance implies that the motion vector field is very homogeneous, while a large variance indicates that the field is heterogeneous, as results from rotational motion. Fig. 2 plots the variance of the local motion vector distributions for two sequences with rotational and translational movement, respectively. It is evident from the plot that the variance differs for the two types of motion.
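This decision rule can be sketched in a few lines of Python/NumPy. The variance threshold below is an assumed value for illustration only; the paper does not specify a numerical threshold:

import numpy as np

def decide_motion_type(mvx, mvy, var_threshold=1.5):
    """Classify the global motion from the per-block local motion vectors.
    mvx, mvy: 2-D arrays of x and y vector components (one entry per block).
    var_threshold: assumed threshold on the variance (illustrative value)."""
    if np.var(mvx) < var_threshold and np.var(mvy) < var_threshold:
        return 'translation'    # homogeneous motion vector field
    return 'rotation'           # heterogeneous field: rotation (possibly combined with translation)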

Figure 2. Variance of the local motion vector distribution for (a) translational motion (b) rotational motion

2.3 Global Motion Estimation

This module estimates the parameters defining the global motion of the sequence. These parameters vary depending on the type of motion previously determined; that is, for translational motion only the displacements along x and y are estimated, while for motion containing rotation the center and angle of rotation are estimated.

2.3.1 Motion Model

When a consecutive image pair is purely rotated about an arbitrary rotation center (x_0, y_0) by a rotation angle θ, the movement of a pixel between the image frames is as shown in Fig. 3. This relation can be expressed as [14]:

\begin{bmatrix} x_2 \\ y_2 \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x_1 - x_0 \\ y_1 - y_0 \end{bmatrix} + \begin{bmatrix} x_0 \\ y_0 \end{bmatrix}    (1)

where the point (x2,y2) in the current image frame is matched with the point (x1,y1) in the previous image frame after pure rotation. If translation is combined with pure rotational motion, (1) becomes

\begin{bmatrix} x_2 \\ y_2 \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x_1 - x_0 \\ y_1 - y_0 \end{bmatrix} + \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} + \begin{bmatrix} d_x \\ d_y \end{bmatrix}    (2)


where d_x and d_y are the translational components along the x and y directions between consecutive frames.
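For illustration, the following Python/NumPy snippet maps points of the previous frame into the current frame according to model (2); it is a sketch useful for generating or checking synthetic motion and is not part of the authors' implementation:

import numpy as np

def apply_motion_model(x1, y1, x0, y0, theta, dx=0.0, dy=0.0):
    """Map points (x1, y1) of the previous frame to (x2, y2) of the current frame
    using the rotation-plus-translation model of Eq. (2)."""
    c, s = np.cos(theta), np.sin(theta)
    x2 = c * (x1 - x0) + s * (y1 - y0) + x0 + dx
    y2 = -s * (x1 - x0) + c * (y1 - y0) + y0 + dy
    return x2, y2

# Example: rotate a point by 2 degrees about (120, 160) and add a small translation.
x2, y2 = apply_motion_model(150.0, 200.0, 120.0, 160.0, np.deg2rad(2.0), dx=3.0, dy=-1.0)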

Figure 3. Movement of a pixel from (x_1, y_1) to (x_2, y_2) after rotation about the center (x_0, y_0) by angle θ

2.3.2 Translational Global Motion Estimation

After calculating the local motion vector field, the global displacements along the x and y axes are estimated by determining which vector components of the field are the most frequent ones (i.e. which of them have the highest probability of appearance, as is evident from the histogram plot given in Fig. 4):

v = (d_x, d_y) = \{\, v_{xy} \;|\; \forall\, v_{ij} \in V,\; P(v_{xy}) > P(v_{ij}) \,\}    (3)

where V is the set of estimated vectors in the local motion vector field and P denotes the probability of v in V.
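One simple way to realize (3) is to take the most frequent (mode) vector of the local motion vector field, e.g. by counting integer displacements; the NumPy sketch below is illustrative and assumes integer-valued block motion vectors:

import numpy as np

def translational_global_motion(mvx, mvy):
    """Estimate the global displacement (dx, dy) as the most frequent local motion vector, per Eq. (3)."""
    pairs = np.stack([mvx.astype(int).ravel(), mvy.astype(int).ravel()], axis=1)
    vectors, counts = np.unique(pairs, axis=0, return_counts=True)
    dx, dy = vectors[np.argmax(counts)]     # vector with the highest probability of appearance
    return int(dx), int(dy)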

Figure 4. Motion estimation results for typical translational motion: (a) quiver plot of local motion vectors; (b) histogram of local motion vectors

2.3.3 Rotational Global Motion Estimation

Following the approach described in [13], [14], we consider that a translation combined with a rotation (or a pure rotation alone) can be described by the simplified single-rotation model given in (4):


\begin{bmatrix} x_2 \\ y_2 \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x_1 - \alpha \\ y_1 - \beta \end{bmatrix} + \begin{bmatrix} \alpha \\ \beta \end{bmatrix}    (4)

where α and β are given as

\alpha = \frac{(1-\cos\theta)\,(x_0 - x_0\cos\theta - y_0\sin\theta + d_x) + \sin\theta\,(y_0 - y_0\cos\theta + x_0\sin\theta + d_y)}{2(1-\cos\theta)}    (5)

\beta = \frac{1}{1-\cos\theta}\left[ x_0\sin\theta - y_0\cos\theta + y_0 + d_y - \frac{\sin\theta\left[(1-\cos\theta)(x_0 - x_0\cos\theta - y_0\sin\theta + d_x) + \sin\theta\,(y_0 - y_0\cos\theta + x_0\sin\theta + d_y)\right]}{2(1-\cos\theta)} \right]    (6)

where (x_0, y_0) is the rotation center and θ is the rotation angle (θ ≠ 2nπ). Fig. 5 shows the quiver and histogram plots for rotational motion.

Figure 5. Motion estimation results for typical rotational motion: (a) quiver plot of local motion vectors; (b) histogram of local motion vectors

2.3.3.1 Rotation Center Estimation

Based on the estimated motion field, the parameters (α, β, θ) defining the simplified model are estimated in the following way. Define the motion vector components as in (7):

u = x_2 - x_1 \quad \text{and} \quad v = y_2 - y_1    (7)

where (x_2, y_2) is a pixel in the current image frame, (x_1, y_1) is the corresponding pixel in the previous stabilized image frame, and u, v are the motion vector components along the x-axis and y-axis, respectively.


Figure 6. (a) Rotation center estimation (b) Rotation angle estimation

In Fig. 6(a), A1(x_1, y_1) and A2(x_2, y_2) are the base points of two local motion vectors. The equation of the perpendicular bisector of the local motion vector at A1 can be written as

y = a_1 x + b_1    (8)

where a_1 = -\dfrac{u_1}{v_1}, \; b_1 = \left(y_1 + \dfrac{v_1}{2}\right) - a_1\left(x_1 + \dfrac{u_1}{2}\right), and u_1, v_1 are the components of the local motion vector at A1. Similarly, the equation of the perpendicular bisector of the local motion vector at A2 can be written as

y = a_2 x + b_2    (9)

where a_2 = -\dfrac{u_2}{v_2}, \; b_2 = \left(y_2 + \dfrac{v_2}{2}\right) - a_2\left(x_2 + \dfrac{u_2}{2}\right), and u_2, v_2 are the components of the local motion vector at A2.

Although two distinct points are enough to compute the rotation center and angle, the inclusion of more points reduces the influence of noise and errors. For the case of N points, we obtain the N equations

\begin{bmatrix} a_1 & -1 \\ a_2 & -1 \\ \vdots & \vdots \\ a_N & -1 \end{bmatrix} \begin{bmatrix} \alpha \\ \beta \end{bmatrix} = \begin{bmatrix} -b_1 \\ -b_2 \\ \vdots \\ -b_N \end{bmatrix}    (10)


which can be written in matrix form as

A\mathbf{x} = \mathbf{b}    (11)

The center of rotation \mathbf{x} = [\alpha, \beta]^T can then be estimated by

\mathbf{x} = \left(A^T A\right)^{-1} A^T \mathbf{b}    (12)
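The least-squares solution (12) can be computed directly from the local motion vector field, as in the following Python/NumPy sketch. Skipping vectors with a near-zero vertical component (where the bisector slope -u/v in (8)-(9) is undefined) is our own guard, not something stated in the paper:

import numpy as np

def estimate_rotation_center(xs, ys, us, vs, eps=1e-3):
    """Estimate the rotation center (alpha, beta) from the base points (xs, ys) of the
    local motion vectors (us, vs), using the perpendicular bisectors of Eqs. (8)-(10)."""
    xs, ys, us, vs = (np.asarray(a, dtype=float).ravel() for a in (xs, ys, us, vs))
    keep = np.abs(vs) > eps                       # bisector slope -u/v is undefined when v is ~0
    a = -us[keep] / vs[keep]
    b = (ys[keep] + vs[keep] / 2) - a * (xs[keep] + us[keep] / 2)
    A = np.column_stack([a, -np.ones_like(a)])    # rows [a_i, -1], as in Eq. (10)
    center, *_ = np.linalg.lstsq(A, -b, rcond=None)   # least-squares solution of Eq. (12)
    alpha, beta = center
    return alpha, beta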

2.3.3.2 Rotation Angle Estimation

The rotation angle is estimated by

\theta = \tan^{-1}\!\left[ \frac{(y_1 - \beta)(x_2 - \alpha) - (x_1 - \alpha)(y_2 - \beta)}{(x_1 - \alpha)(x_2 - \alpha) + (y_1 - \beta)(y_2 - \beta)} \right]    (13)

where (α, β) is the center of rotation estimated previously.
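Equation (13) gives one angle estimate per local motion vector; averaging the per-block estimates (our choice, the paper does not state how multiple estimates are combined) reduces the influence of noise:

import numpy as np

def estimate_rotation_angle(xs, ys, us, vs, alpha, beta):
    """Estimate the rotation angle from the correspondences (x1, y1) -> (x2, y2) = (x1+u, y1+v)
    about the previously estimated center (alpha, beta), per Eq. (13)."""
    x1 = np.asarray(xs, dtype=float).ravel()
    y1 = np.asarray(ys, dtype=float).ravel()
    x2 = x1 + np.asarray(us, dtype=float).ravel()
    y2 = y1 + np.asarray(vs, dtype=float).ravel()
    num = (y1 - beta) * (x2 - alpha) - (x1 - alpha) * (y2 - beta)
    den = (x1 - alpha) * (x2 - alpha) + (y1 - beta) * (y2 - beta)
    return float(np.mean(np.arctan2(num, den)))   # one estimate per block, averaged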

2.4 Motion Smoother

The motion smoother is used to smooth out abrupt motions. The algorithm needs to distinguish intentional panning from unwanted motion. In particular, we assume that intentional panning is usually smooth, with slow variations from frame to frame (i.e. temporally correlated motion with small variance). On the other hand, unwanted motion involves rapid motion variations over time in a random fashion, usually with large variance. As shown in Fig. 7(a), motion regarded as random-like fluctuates around zero and its variance is usually large. In contrast, Fig. 7(b) shows temporally correlated motion, which usually moves in a specific direction and has a small variance. In short, the high-frequency components of the motion variations over time are considered to be effects of unwanted motion and therefore need to be removed. This can be done by applying a low-pass filter to the estimated motion parameters. In this paper, a moving-average filter has been adopted, given as:

h[n] = \begin{cases} \dfrac{1}{N}, & n = 0, 1, \ldots, N-1 \\ 0, & \text{otherwise} \end{cases}

In our simulation, we have used N = 7.
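Applied causally to the sequence of global motion parameters, the filter can be sketched as follows (an illustrative NumPy version; padding the past with the first value is our assumption):

import numpy as np

def moving_average(params, N=7):
    """Smooth a 1-D sequence of global motion parameters with the length-N moving-average
    filter h[n] = 1/N, n = 0..N-1 (causal: each output uses the current and N-1 past values)."""
    params = np.asarray(params, dtype=float)
    h = np.ones(N) / N
    padded = np.concatenate([np.full(N - 1, params[0]), params])   # pad the past with the first value
    return np.convolve(padded, h, mode='valid')                    # same length as the input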

Figure 7. Movement versus time: (a) random-like motion (b) temporally correlated motion



2.5 Motion Compensation

The valid local motion vectors are combined to estimate the global motion vector for motion compensation. Since shaking of the camera or of the mounting platform usually involves both translational and rotational movement, a single combined translation-rotation motion model has been used in this algorithm to describe the global motion. Once the motion has been estimated accurately, the motion compensation problem can be addressed. Motion compensation is composed of two steps: determining the amount of compensation to apply, and then warping the original frame into its stabilized version. In order to obtain a stabilized sequence from the first frame to the current frame, we must accumulate each frame motion vector (FMV) to form an accumulated motion vector (AMV). To suppress error accumulation and to provide a mechanism that slowly pulls the focus center back to the frame center, the following equation (14), which uses a damping factor α, is used to calculate the accumulated motion vector robustly.

AMV(t) = \alpha \cdot AMV(t-1) + FMV(t)    (14)

In the simulation, α depends upon the input video sequence.
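The accumulation of (14) can be sketched as below; the damping value of 0.95 is only an illustrative assumption, since, as noted, α is chosen according to the input sequence:

import numpy as np

def accumulate_motion(fmv_list, damping=0.95):
    """Accumulate frame motion vectors (FMV) into accumulated motion vectors (AMV)
    per Eq. (14): AMV(t) = damping * AMV(t-1) + FMV(t). The damping factor slowly
    pulls the focus center back to the frame center and suppresses error build-up."""
    amv = np.zeros_like(np.asarray(fmv_list[0], dtype=float))
    history = []
    for fmv in fmv_list:
        amv = damping * amv + np.asarray(fmv, dtype=float)
        history.append(amv.copy())
    return history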

3. SIMULATION RESULTS

The proposed method has been tested on 8 video sequences. In the following, the results for two representative real sequences, namely building and chart (shown in Figs. 8 and 9), are presented. One synthetic sequence has been simulated for rotation (Fig. 10). Images from the different sequences vary in dimension. One way to visualize the image stabilization process is to look at plots of the estimated motion vectors. It has also been verified subjectively that the stabilized sequences have better visual quality. Table 1 shows the results of rotating a frame by different angles; it is clear from the table that the algorithm correctly computes the angle of rotation and its center.


Figure 8: Difference between frames before and after motion correction (a) original frame 21, (b) difference between original frames 21 and 22, (c) difference between motion-corrected frames 21 and 22, and (d) original vertical and horizontal estimated motion vectors.


Figure 9: Difference between frames before and after motion correction (a) original frame 16, (b) difference between original frames 16 and 17, (c) difference between motion-corrected frames 16 and 17 and (d) original vertical and horizontal estimated motion vectors.


Table 1. Simulated test conditions and estimation results

| Simulated rotation angle (deg.) | Simulated center of rotation | Simulated translation X (pixels) | Simulated translation Y (pixels) | Estimated motion type | Estimated angle (deg.) | Estimated center of rotation | Estimated dx (pixels) | Estimated dy (pixels) |
| - | - | -10 | -13 | Translation | - | - | 10 | 14 |
| - | - | 16 | -15 | Translation | - | - | -16 | 16 |
| 0.9 | (120, 160) | - | - | Rotation | -0.95 | (120.55, 167.33) | - | - |
| -2.0 | (120, 160) | - | - | Rotation | 1.97 | (120.27, 160.23) | - | - |
| 7.0 | (120, 160) | - | - | Rotation | -6.61 | (119.64, 164.96) | - | - |

Figure 10. Frames before and after motion correction: (a) original 16th frame of a synthetic sequence, (b) frame rotated by 2 degrees and (c) frame compensated by 1.97 degrees

4. CONCLUSION

The algorithm discussed in this paper caters for two types of motion, translational and rotational, or a combination of the two. The most crucial part of digital stabilization is the accurate estimation of the local motion vectors, because any error in the local motion vectors propagates into the subsequent phases of the algorithm. A block-matching algorithm has been used for estimating the local motion. The algorithm implements a single combined translational and rotational motion model for estimating the global motion parameters. A motion decision unit has also been implemented to handle both types of motion within the same video sequence. The motion smoother unit preserves intentional panning while rejecting the unwanted motion. Finally, motion compensation generates the stabilized video sequence. At present, the algorithm caters for only these two types of motion; other types of motion, such as scaling or zooming, remain to be handled. The proposed scheme also lacks a deblurring stage; deblurring of the video sequence is necessary to avoid inaccurate estimation of the motion parameters, and this part may be included in future work.

ACKNOWLEDGEMENT

The authors would like to acknowledge the encouragement given by Shri S. S. Sundaram, Director, IRDE, Dehra Dun, for giving permission and providing the necessary resources to carry out this work. The authors express their sincere thanks to Shri Y.B. Limbu, Joint Director, and Shri Avnish Kumar, Joint Director, for their encouragement, ideas and advice.


REFERENCES

[1] Carlos Morimoto and Rama Chellappa, "Fast Electronic Digital Image Stabilization for Off-Road Navigation", Real-Time Imaging 2, 285-296 (1996).
[2] Daniel McReynolds, Pascal Marchand and Yunlong Sheng, "Stabilization of infra-red aerial image sequences using robust estimation", Proceedings of the Conference Vision Interface 1999, Trois-Rivières, 1999.
[3] R.C. Hardie, K.J. Barnard, J.G. Bognar, E.E. Armstrong and E.A. Watson, "High-resolution image reconstruction from a sequence of rotated and translated frames and its application to an infrared imaging system", Opt. Eng. 37, 247-260 (1998).
[4] Hung-Chang Chang, Shang-Hong Lai, and Kuang-Rong Lu, "A Robust and Efficient Video Stabilization Algorithm", IEEE International Conference on Multimedia and Expo (ICME), 2004.
[5] Douglas R. Droege, "Electronic Image Stabilization Based on the Spatial Intensity Gradient", Proc. SPIE Vol. 6238.
[6] Stanislav Soldatov, Konstantin Strelnikov and Dmitriy Vatolin, "Low Complexity Global Motion Estimation from Block Motion Vectors", http://www.compression-links.info/
[7] J.Y. Chang, W.F. Hu, M.H. Cheng and G.S. Chang, "Digital image translational and rotational motion stabilization using optical flow technique", IEEE Transactions on Consumer Electronics, Vol. 48, No. 1, pp. 108-115, Feb. 2002.
[8] Andrew Litvin, Janusz Konrad and William C. Karl, "Probabilistic video stabilization using Kalman filtering and mosaicking", IS&T/SPIE Symposium on Electronic Imaging, Image and Video Communications and Processing, Jan. 20-24, 2003.
[9] Hung-Chang Chang, Shang-Hong Lai and Kuang-Rong Lu, "A Robust real-time video stabilization algorithm", Journal of Visual Communication and Image Representation, 2006.
[10] S.J. Ko, S.H. Lee and K.H. Lee, "Digital image stabilization algorithms based on bit-plane matching", IEEE Transactions on Consumer Electronics, Vol. 44, No. 3, Aug. 1998.
[11] Luca Bombini, Pietro Cerri, Paolo Grisleri, Simone Scaffardi and Paolo Zani, "An Evaluation of Monocular Image Stabilization Algorithms for Automotive Applications", http://www.ce.unipr.it/
[12] Daniel McReynolds and Yunlong Sheng, "Stabilization of infrared image sequence with rotation, scaling and view angle changes", International Conference on Applications of Photonic Technology, Ottawa, July 27-30, 1998.
[13] Enrique Estalayo, Luis Salgado, Fernando Jaureguizar and Narciso Garcia, "Efficient Image Stabilization and Automatic Target Detection in Aerial FLIR Sequences", Proc. SPIE Vol. 6234, 2006.
[14] Ho Dong Seok and Joon Lyou, "Digital Image Stabilization Using Simple Estimation of the Rotation and Translation Motion", Proc. SPIE Vol. 5810, 2005.