ISSN 1054-6618, Pattern Recognition and Image Analysis, 2008, Vol. 18, No. 3, pp. 370–375. © Pleiades Publishing, Ltd., 2008.

Enhanced Motion Parameters Estimation for an Active Vision System

M. Shafik and B. Mertsching

GET Lab, Paderborn University, Pohlweg 47–49, 33098 Paderborn, Germany
e-mail: [shafik, mertsching]@get.upb.de

Abstract—In this paper, we present an enhanced approach for estimating 3D motion parameters from 2D motion vector fields. The proposed method achieves a substantial reduction in computational time and shows high robustness against noise in the input data. The output of the algorithm is part of a multiobject segmentation approach implemented in an active vision system. Hence, the improvement in the motion parameter estimation process speeds up the overall segmentation process.

DOI: 10.1134/S1054661808030024

Received November 26, 2007

¹ The text was submitted by the authors in English.

1. INTRODUCTION

The research presented in this paper is a building block of a comprehensive system for multiobject motion analysis in robotic vision, which is intended to be included in active vision applications [1–3]. The interpretation of multiple moving objects is a challenging problem. While many approaches to segmenting the 3D motion of multiple moving objects assume that each segment represents a rigid and connected object [4, 5], the 3D motion segmentation approach in [6] is conceptually able to handle transparent motion regardless of the connectivity of objects. Here the estimated motion parameters serve as a homogeneity criterion for the segmentation approach, while other applications, e.g., global motion estimation (GME) [7], use 2D affine transformation parameters for this purpose.

The basic idea of the proposed algorithm is to enhance the computational speed of the motion segmentation approach presented in [6] by improving the 3D motion parameter estimation process. The segmentation approach initializes the segmentation process with the whole motion vector field (MVF) as one segment. The objective is to reach a state where only MVs belonging to the same 3D motion are connected. Figure 1 demonstrates the segmentation process of a synthetic MVF containing two different motions. The estimated motion parameters at a point $p_m$ are influenced by other MVs depending on their connectivity to the same 3D motion. Hence, the motion parameter estimation process is repeated $N$ times for each iteration, where $N$ is the total number of detected MVs. Therefore, enhancing the computational speed of the motion parameter estimation process leads to a significant speed-up in the segmentation approach.


The remainder of the paper is organized as follows: Section 2 gives an account of work related to the proposed method, while Section 3 describes the proposed algorithm for estimating motion parameters. Section 4 discusses the results of experiments and evaluates the outcome of the proposed method, and finally, Section 5 concludes the paper.

2. RELATED WORK

3D motion interpretation of an image flow has become an important problem in computer vision.

Fig. 1. Segmentation of two different motions: (a) input synthetic MVF; (b) result of the segmentation process; (c) development of the segmentation process after the i-th iteration, i = 1, …, 6.


Early publications such as [8] discussed the estimation of general 3D motion parameters of a rigid body from two or more consecutive image frames. Longuet-Higgins and Prazdny [9] introduced equations for computing the 3D egomotion in a stable scene. They suggest that the 3D motion interpretation problem is a matter of solving a system of equations for six motion parameters. A linear optimization approach was introduced in [10] with the assumption that the optical flow is accurately available.

A neural system for interpreting optical flow [11] includes a 2D signal transform similar to that described by Daugman [12] for performing the Gabor transform of images. Daugman employed a network of neuron-like units with a specified learning rule. According to the architectural design, the stabilized connection weights are the best least mean squares approximation to the Gabor parameters. Daugman's transform finds the derivative of an error with respect to each of the Gabor parameters using a gradient descent method in order to iteratively approximate the solution.

In the case of interpreting an optical flow, the elementary signals are 2D vector fields of infinitesimal generators of a 3D Euclidean group. The infinitesimal motion of a rigid body, i.e., a 3D vector field, can be expressed as a linear combination of six component 3D vector fields. The computation of a 3D motion from a 2D image flow or a motion template finds the optimal coefficient values in a 2D signal transform. The ideal optical motion $v_{\mathrm{opt}}$ caused by the motion of a point $(x, y, d)$ on a rigid visible surface $d = \rho(x, y)$ is

$$v_{\mathrm{opt}}(x, y) = \sum_{i=1}^{6} c_i e_i(x, y), \qquad (1)$$

where $e_i(x, y)$ represents the six infinitesimal generators in the form of 2D vector fields. For translation,

$$e_1(x, y) = \begin{pmatrix} \rho^{-1}(x, y)\sqrt{1 + x^2 + y^2} \\ 0 \end{pmatrix},\quad
e_2(x, y) = \begin{pmatrix} 0 \\ \rho^{-1}(x, y)\sqrt{1 + x^2 + y^2} \end{pmatrix},\quad
e_3(x, y) = \begin{pmatrix} -x\rho^{-1}(x, y)\sqrt{1 + x^2 + y^2} \\ -y\rho^{-1}(x, y)\sqrt{1 + x^2 + y^2} \end{pmatrix}, \qquad (2)$$

and for rotation,

$$e_4(x, y) = \begin{pmatrix} -xy \\ 1 + y^2 \end{pmatrix},\quad
e_5(x, y) = \begin{pmatrix} 1 + x^2 \\ xy \end{pmatrix},\quad
e_6(x, y) = \begin{pmatrix} -y \\ x \end{pmatrix}. \qquad (3)$$

The error function $E$ is defined as the difference between the ideal optical motion $v_{\mathrm{opt}}(x, y)$ and the sensed optical motion $v(x, y)$:

$$E(W) = \sum_{(x, y) \in W} \| v(x, y) - v_{\mathrm{opt}}(x, y) \|^2, \qquad (4)$$

where $v(x, y)$, $(x, y) \in W$, is an image flow in a window $W$ with $n$ points, $W = \{p_i, i = 1, \ldots, n\}$. A least square error solution is a set of coefficients $c_i$ which minimizes the error $E(W)$, i.e., $dE(W) = 0$. The derivative of the error $E(W)$ with respect to $c_i$ is given as

$$D_{c_i} = \frac{\partial E(W)}{\partial c_i}
= 2 \sum_{(x, y) \in W} [v(x, y) \cdot e_i(x, y)] - 2 \sum_{(x, y) \in W} \left( \sum_{k=1}^{6} c_k e_k(x, y) \right) \cdot e_i(x, y)
= 2 \sum_{(x, y) \in W} [v(x, y) - v_{\mathrm{opt}}(x, y)] \cdot e_i(x, y). \qquad (5)$$

This approach has been improved in [6] by including a recursive term $\alpha$ into the learning rule

$$c_{k+1} = c_k + \Delta c_k \quad \text{with} \quad \Delta c_k^i = -\frac{1}{2}\frac{\partial E}{\partial c_i} + \alpha \Delta c_{(k-1)i}, \qquad (6)$$

where $\alpha$ is a constant learning rate, which yields a noticeable speed-up at gradual slopes.
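A minimal sketch of this descent, assuming the `generators` and `ideal_motion` helpers from the listing above and a sensed flow `v` sampled on the same grid; the normalization by the window size and the parameter defaults are our choices, not the paper's.

```python
import numpy as np

def estimate_momentum(v, x, y, rho, alpha=0.5, iters=300):
    """Estimate c by gradient descent on E(W) with the recursive
    (momentum-like) term of Eq. (6)."""
    e = generators(x, y, rho)
    c = np.zeros(6)
    dc_prev = np.zeros(6)
    n = x.size                                   # window size |W|
    for _ in range(iters):
        r = v - ideal_motion(c, x, y, rho)       # residual flow v - v_opt
        # -(1/2) dE/dc_i = sum_W (v - v_opt) . e_i, cf. Eq. (5);
        # dividing by n is a stabilizing choice of ours
        grad = np.array([np.sum(r * ei) for ei in e]) / n
        dc = grad + alpha * dc_prev              # Eq. (6)
        c = c + dc
        dc_prev = dc
    return c
```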

3. PROPOSED ALGORITHM

This part describes the functionality of the proposed algorithm. It discusses the drawback in Daugman's algorithm, according to which the change in a single estimated parameter, i.e., $c_k$, is affected by the estimation of the other parameters. This would generate an error, especially in scenarios where an input MVF describes the motion generated by one of the parameters in a motion template. The proposed method approaches the aforementioned problem by making use of a global minimum search criterion for each parameter in an MVF, which is applicable from the first iteration step $k = 0$. It is quite possible that each parameter in the estimation process may require a different number of iterations $m$, i.e., $m \in \{0, 1, 2, \ldots, p\}$ for a particular $k \in \{0, 1, 2, \ldots, N\}$, to be $c_{k_m}^i$, where $i \in \{1, 2, \ldots, 6\}$.


The root mean square error (RMSE) $E_{k_m}^i(c)$ is calculated between the input motion vector field and the estimated MVF as

$$E_{k_m}^i(c) = \frac{1}{V} \sum_{(x, y) \in V} (V(x, y) - V_{\mathrm{est}}(x, y))^2, \qquad (7)$$

where $V(x, y)$ is a vector component of the input MVF, $V_{\mathrm{est}}(x, y)$ is the vector component of the estimated MVF, and $V$ is the total number of detected vectors. Afterwards, the change in error $\Delta E_{k_m}^i$ between two successive iterations is calculated as

$$\Delta E_{k_m}^i(c) = E_{k_m}^i(c) - E_{k_{m-1}}^i(c). \qquad (8)$$

The above parameter $\Delta E_{k_m}^i$ is significant in devising a set of learning rules which determine the stop criterion during the motion parameter estimation process.
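In code, Eqs. (7) and (8) amount to a mean squared difference and its discrete change; a short sketch, with function names that are our assumptions:

```python
import numpy as np

def mvf_error(v_in, v_est):
    """Eq. (7): error between the input MVF and the estimated MVF,
    averaged over the V detected vectors (arrays of shape (2, ...))."""
    return np.mean((v_in - v_est) ** 2)

def error_change(err_now, err_prev):
    """Eq. (8): change in error between two successive iterations."""
    return err_now - err_prev
```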

We start with the computation of a particular parameter coefficient $c_{k_{m+1}}^i$ as follows:

$$c_{k_{m+1}}^i = c_{k_m}^i + \Delta c_{k_m}^i. \qquad (9)$$

The convergence of $c_{k_{m+1}}^i$ is dependent on the value of $\Delta c_{k_m}^i$, which depends on the value of the RMSE $E_{k_m}^i(c)$ as given in

$$\Delta c_{k_m}^i = -\frac{1}{2} \frac{\Delta E_{k_m}^i(c)}{\widehat{\Delta c}_{k_m}^i} + \alpha_i \Delta c_{k_{m-1}}^i, \qquad (10)$$

where

$$\widehat{\Delta c}_{k_m}^i = \begin{cases} 1 & \text{if } c_{k_m}^i = c_{k_{m-1}}^i \\ \dfrac{c_{k_m}^i - c_{k_{m-1}}^i}{\left| c_{k_m}^i - c_{k_{m-1}}^i \right|} & \text{if } c_{k_m}^i \neq c_{k_{m-1}}^i \end{cases} \qquad (11)$$

and $\alpha_i$ is an adjustable learning force parameter. Let us assume that at the start of the estimation process, when $k = 0$, $c_k^i = 0$ as a default value.
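Read literally, Eqs. (9)–(11) give the per-coefficient update below; the state passed in (previous coefficient value and error change) is our bookkeeping assumption, as is treating the previous step $\Delta c_{k_{m-1}}^i$ as the difference of the last two coefficient values per Eq. (9).

```python
def coefficient_step(c_now, c_prev, dE, alpha_i):
    """One update of a single coefficient according to Eqs. (9)-(11)."""
    dc_prev = c_now - c_prev                    # previous step Delta c, cf. Eq. (9)
    # Sign term of Eq. (11); defaults to 1 when the coefficient is unchanged.
    sign = 1.0 if dc_prev == 0 else dc_prev / abs(dc_prev)
    dc = -0.5 * dE / sign + alpha_i * dc_prev   # Eq. (10)
    return c_now + dc                           # Eq. (9)
```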

Fig. 2. Synthetic MVFs: (a) generated by c = (1, 0, –1, –1.8, –2, 0.6); (b) after application of noise and removal of MVs; (c) progression of the mean square error E(c) over the general iteration steps k.

Fig. 3. Progression of the mean error of the estimated parameters over the particular iteration steps for Daugman's NN and the proposed algorithm: (a)–(f) for the instantaneous velocity coefficients c1, …, c6, respectively; (g) for a synthetic MVF generated by c = (1, 0, –1, –1.8, –2, 0.6); (h) for the same synthetic MVF after application of 100% noise to each vector component and random, equally distributed removal of MVs (with ρ = 0.5).


$$\Rightarrow\ c_{k_{m+1}}^i = \Delta c_{k_m}^i\ \Rightarrow\ \Delta c_{k_m}^i = -\frac{1}{2}\,\Delta E_{k_m}^i(c), \quad \text{since } \Delta c_{k_{m-1}}^i = 0\ \forall i \text{ and } \widehat{\Delta c}_{k_m}^i = 1. \qquad (12)$$

It can be seen that $\Delta c_{k_m}^i$ is proportional to $\Delta E_{k_m}^i(c)$ at the first step. This means $\Delta E_{k_m}^i(c)$ will be positive for the first iteration under the assumption that the default input MVF is a blank template, i.e., an MVF generated from the motion parameter vector c = (0, 0, 0, 0, 0, 0). Now, if $\Delta E_{k_m}^i(c)$ is increasing, this means that $c_{k_m}^i$ is not a negative value. Hence, we will seek within the positive values. In order to speed up the search process, we will consider the value of the estimated $\Delta c_{k_m}^i$ obtained in the first step (12) in order to skip redundant computations.

Fig. 4. Representation of computed MVFs at different scales: (a) input sequence from the PETS dataset (Performance Evaluation of Tracking and Surveillance); (b) resulting MVF; (c) left: MVFs generated from scaling the input images; right: MVFs resulting from scaling the generated MVF. From top to bottom, the image sizes are 128 × 192, 64 × 96, and 32 × 48, respectively.


Afterwards, we test $\Delta E_{k_m}^i(c)$ again. In case it is increasing, we have not reached a global minimum. However, $c_{k_m}^i$ may reach a local minimum which could further reduce the RMSE $E_{k_m}^i(c)$; this would not be an optimum solution. This point actually highlights the main difference between Daugman's algorithm and the new methodology, according to which the new algorithm will not consider the value of $c_{k_m}^i$ obtained at the first iteration $k = 0$ in estimating the other coefficients. The learning rule has been changed to

$$c_{k_{m+1}}^i = \begin{cases} c_{k_m}^i + \Delta c_{k_m}^i \\ c_{k_m}^i - 2\Delta c_{k_m}^i & \text{if } (\Lambda = 0) \\ c_{k_0}^i & \text{if } (\Lambda = 1 \wedge k = 0), \end{cases} \qquad (13)$$

where $\Lambda$ is a testing criterion to check the validity of the error convergence in a particular direction:

$$\Lambda = \begin{cases} 0 & \text{if } (\Delta E_{k_m}^i(c) \geq 0 \wedge \Delta c_{k_m}^i < 0) \\ 1 & \text{if } (\Delta E_{k_m}^i(c) \geq 0 \wedge \Delta c_{k_m}^i \geq 0). \end{cases} \qquad (14)$$

For primary motion templates, each template has been generated using only one coefficient, the other coefficients being equal to zero. This leads to the fact that, in order to estimate the right value for that coefficient in a fast way, the other coefficients should be zeros. So, for the first iteration, as we seek to discover whether the MVF is one of those primary motion templates, we assume that a correctly constructed MVF will be generated using only one coefficient. Therefore, we check for each $c_{k_m}^i$ whether it reaches a global minimum or not, independently of the other coefficients.
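A sketch of the modified rule, with Eq. (14) as a small predicate. Eq. (14) leaves Λ undefined when the error is decreasing, so falling back to the default branch of Eq. (13) in that case is our reading, and all names are assumptions.

```python
def testing_criterion(dE, dc):
    """Lambda of Eq. (14); None when the error is decreasing (dE < 0)."""
    if dE >= 0:
        return 0 if dc < 0 else 1
    return None

def learning_rule(c, dc, c0, k, dE):
    """Modified learning rule of Eq. (13) for one coefficient."""
    lam = testing_criterion(dE, dc)
    if lam == 0:
        return c - 2 * dc      # reverse past the rejected search direction
    if lam == 1 and k == 0:
        return c0              # keep the initial coefficient value
    return c + dc              # default update
```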

4. RESULTS AND DISCUSSION

In this section, two types of synthetic motions are tested. The first data set describes the motion obtained by the projection of a single instantaneous velocity coefficient at a time. The second data set, represented in Fig. 2, describes a synthetic MVF generated by different coefficient values after application of 100% noise to each vector component and random, equally distributed removal of MVs (with ρ = 0.5).

In order to investigate the performance of the proposed algorithm correctly, the testing criterion is based on the progression of the mean error of the estimated parameters $E_{\mathrm{total}}$ instead of the progression of the mean square error $E(c)$ over the general iteration steps $k$ shown in Fig. 2:

$$E_{\mathrm{total}} = \frac{1}{6} \sum_{i=1}^{6} \varepsilon_i, \quad \text{where} \quad \varepsilon_i = \frac{|c_{\mathrm{opt}} - c_i|}{c_{\mathrm{opt}}} \times 100.$$
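A direct transcription of this measure; since $c_{\mathrm{opt}}$ may contain zero entries (as in Fig. 2), the guard against division by zero below is our addition and is not specified in the paper.

```python
import numpy as np

def mean_parameter_error(c_opt, c_est):
    """E_total: mean relative error of the estimated parameters, in percent."""
    eps = [abs(o - e) / abs(o) * 100.0
           for o, e in zip(c_opt, c_est) if o != 0]   # skip zero coefficients
    return np.mean(eps)
```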

Figure 3 demonstrates a comparison of the progression of the mean error of the estimated parameters $E_{\mathrm{total}}$ over the particular iteration steps between Daugman's NN and the proposed algorithm.

Due to the linearity between the derivative error and the partial velocity coefficients of the translation in the X and Y directions (c1, c2), the performance of Daugman's network is almost the same as that of the proposed algorithm. On the contrary, the nonlinear relation with respect to the translation in the Z direction and the rotation in X, Y, and Z (c3, …, c6) leads to the need for an increased number of iteration steps. This drawback has been overcome by the new algorithm, as seen in the results of the first data set. In the second data set, the new approach showed an enhanced performance, reaching a minimum error of E_total < 0.01% for a synthetic MVF and E_total < 0.5% for a significant alteration of the same MVF.


Fig. 5. Representation of scaled MVFs: (a) input sequence from the left camera of "SiMORE"; (b) resulting MVF at times t = {t1, t4, t5}.

Fig. 6. Reduction of the computational time achieved by the improved algorithm, relative to the time needed for segmenting an MVF of size 128 × 192; the chart compares Daugman's NN and the improved algorithm for image sizes 32 × 48, 64 × 96, and 128 × 192.



In the case of using a sequence of real images, a new challenge is raised by the large number of generated motion vectors. In order to reduce the number of processed vectors, the input data can be represented at different scales. Scaling the input image itself results in a large loss of input information, while scaling the generated MVF produces a better result, as shown in Fig. 4. Figure 5 represents the scaled MVF for a sequence of images generated from the left camera of the simulation framework "SiMORE" [13]. As a result of the egomotion of the robot, some artifacts affect the direction of the MVs, raising a new challenge which is the subject of our current development. Figure 6 demonstrates the improvement in the computational time of the motion segmentation approach with respect to the computational time needed for segmenting an MVF of size 128 × 192, compared to the results obtained by [6].
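Scaling a generated MVF can be sketched as block averaging of the vector field. The paper does not specify the scheme, so the averaging and the magnitude rescaling below are assumptions.

```python
import numpy as np

def downscale_mvf(mvf, factor=2):
    """Downscale an MVF of shape (2, H, W) by averaging factor x factor
    blocks; vector lengths are divided by factor so they stay consistent
    with the coarser pixel grid. H and W must be divisible by factor."""
    _, h, w = mvf.shape
    blocks = mvf.reshape(2, h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(2, 4)) / factor
```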

5. CONCLUSIONS

In this paper, we have presented a fast approach to estimating the motion parameter coefficients, which results in a significant speed-up compared to the estimation process from primitive motion patterns, as it enhances the reduction of the mean error of the estimated parameters even for a highly noised MVF. The proposed algorithm can greatly reduce the computational time of motion segmentation approaches that require fast processing methods.

ACKNOWLEDGMENTS

The authors would like to thank I. Ali for very helpful comments.

REFERENCES

1. Z. Aziz and B. Mertsching, "An Attentional Approach for Perceptual Grouping of Spatially Distributed Patterns," in 29th Annual Symposium of the German Association for Pattern Recognition, DAGM 07, 2007.

2. Z. Aziz, B. Mertsching, M. Shafik, and R. Stemmer, "Evaluation of Visual Attention Models for Robots," in IEEE ICVS, 2006.

3. I. Ali, Z. Aziz, and B. Mertsching, "Design Issues for Multi-Modal Attention in Autonomous Robot Systems," in International Conference on Spatial Cognition, 2006.

4. L. Hajder and D. Chetverikov, "Robust 3D Segmentation of Multiple Moving Objects under Weak Perspective," in ICCV Workshop on Dynamical Vision, Beijing, 2005.

5. H. Sekkati and A. Mitiche, "Joint Optical Flow Estimation, Segmentation, and 3D Interpretation with Level Sets," Computer Vision and Image Understanding 103 (2), 89–100 (2006).

6. A. Massad, M. Jesikiewicz, and B. Mertsching, "Space-Variant Motion Analysis for an Active-Vision System," in Proceedings of ACIVS 2002, Ghent, Belgium, 2002.

7. Y. Su, M. T. Sun, and V. Hsu, "Global Motion Estimation from Coarsely Sampled Motion Vector Field and the Applications," IEEE Trans. Circuits Syst. Video Technol. 15 (2), 232–242 (2005).

8. J.-Q. Fang and T. S. Huang, "Solving Three-Dimensional Small-Rotation Motion Equations: Uniqueness, Algorithms, and Numerical Results," Computer Vision, Graphics, and Image Processing 26, 183–206 (1984).

9. H. C. Longuet-Higgins and K. Prazdny, "The Interpretation of a Moving Retinal Image," Proc. R. Soc. (London), Ser. B 208, 385–397 (1981).

10. G. Adiv, "Determining Three-Dimensional Motion and Structure from Optical Flow Generated by Several Moving Objects," IEEE Trans. Pattern Anal. Machine Intell. PAMI-7, 384–401 (1985).

11. T.-R. Tsao, H.-J. Shyu, J. M. Libert, and V. C. Chen, "A Lie Group Approach to a Neural System for Three-Dimensional Interpretation of Visual Motion," IEEE Trans. on Neural Networks 2 (1), 149–155 (1991).

12. J. G. Daugman, "Complete Discrete 2D Gabor Transform by Neural Networks for Image Analysis and Compression," IEEE Trans. on ASSP 36 (7), 1169–1179 (1988).

13. B. Mertsching, Z. Aziz, and R. Stemmer, "Design of a Simulation Framework for Evaluation of Robot Vision and Manipulation Algorithms," ICSC, 2005, pp. 494–498.

Mohamed Shafik obtained his B.Sc. in mechanical engineering at the University of Banha. In 2004 he earned an Information Technology Diploma in Mechatronics from the Information Technology Institute (ITI). In 2006 he obtained his M.Eng. in applied mechatronics at the University of Paderborn. Since 2006, he has been a PhD student and a scientific assistant in the GET Lab. His research interests focus on robotic vision, neural networks, and mechatronic systems.

Baerbel Mertsching studied electrical engineering and obtained her PhD at the University of Paderborn. Between 1994 and 2003, she was a professor of computer science at the University of Hamburg. In 2003 she returned to the University of Paderborn, where she is now a professor of electrical engineering and the director of the GET Lab. Her research interests focus on cognitive systems engineering, especially active vision systems, and microelectronics for image and speech processing. She has been a member of a variety of scientific councils and editorial boards and is the author of more than 120 scientific publications.