
Discrete Contours in Multiple Views: Approximation and Recognition


M. Pawan Kumar, Saurabh Goyal, Sujit Kuthirummal, C. V. Jawahar, and P. J. Narayanan

Centre for Visual Information Technology, International Institute of Information Technology

Gachibowli, Hyderabad. India. 500 019

Abstract

Recognition of discrete planar contours under similarity transformations has received a lot of attention, but little work has been reported on recognizing them under more general transformations. Planar object boundaries undergo projective and affine transformations across multiple views when the cameras used are perspective and affine, respectively. We present two methods to recognize discrete curves in this paper. The first method computes a piecewise parametric approximation of the discrete curve that is projectively invariant. A polygon approximation scheme and a piecewise conic approximation scheme are presented here. The second method computes an invariant sequence directly from the sequence of discrete points on the curve in a Fourier transform space. The sequence is shown to be identical up to a scale factor in all affine related views of the curve. We present the theory and demonstrate its applications to several problems including numeral recognition, aircraft recognition, and homography computation.

Key words: Planar Shape Recognition, Polygonal Approximation, Fourier Transform, Invariant, Projective Geometry, Piecewise Conic Approximation

This work was partially supported by the Advanced Data Processing Research Institute, Dept. of Space.

Email address: {pawan@gdit., saurabh@gdit., sujit@gdit., jawahar@, pjn@}iiit.net (M. Pawan Kumar, Saurabh Goyal, Sujit Kuthirummal, C. V. Jawahar, and P. J. Narayanan).

1 Introduction

Analysis of multiple views of the same scene is an area of active research in computer vision today. The structure of projections of points and lines in two views received much attention in the eighties and early nineties [1–3]. Similar studies of projections of points and lines in three and more views have followed since then [2,4,5]. These multiview studies have concentrated on how geometric primitives like points, lines and planes are related across views. Specifically, the algebraic constraints satisfied by the projections of such primitives in different views have been the focus of intense study. Discrete contours formed by boundaries of objects are of great interest for shape recognition. A contour consists of an ordered sequence of points. While the projection of each point satisfies multiview constraints separately, the sequence can satisfy additional constraints that help in recognizing the contour across multiple views.

Preprint submitted to Elsevier Science 8 February 2003

We study the issues in recognizing discrete planar contours, consisting of a sequence of points, in multiple views under different projection models in this paper. We limit the recognition problem to planar objects represented using their discrete boundaries. Planar shape recognition has been studied well. Recognition by alignment [6], polygonal approximation [7], geometrically invariant features [8], linear combination of models [9], etc., have appeared in the literature. Descriptors computed in the Fourier domain [10] have also been used for recognition. These algorithms work well for similarity transformations between views. The image-to-image transformation is more general in practical situations of interest. When a planar object is imaged from multiple viewpoints using a perspective camera, its images are related by a projective homography. When affine cameras are used, the homography is affine. Both these situations occur frequently in practice. Conventional algorithms for similarity transforms will not work for them.

The problem of shape recognition in the context of multiple views can be posed thus: given images of an object in one or more views, can we recognize them to be images of the same object when the viewing parameters of the cameras are unknown?

Projective and affine invariants for points, lines, and parametric and algebraic curves, such as conics, have been discovered [11]. Discrete contours, not amenable to simple algebraic or parametric representations, pose new problems and have not been studied much earlier. Differential invariants that require high order derivatives of contours have been used [11,12]. These are highly unstable numerically for real images. Another relevant effort identifies a few affine invariant features and uses them for recognition [13] in a classical pattern recognition framework.

In this paper, we present two approaches for the recognition of discrete planar contours in multiple views. The first approach is inspired by polygonal approximation techniques. We extend it to piecewise parametric approximation of the discrete contour. The specific parametric structures used are lines, resulting in a polygonal approximation, and conics. We show how the multiview invariants of the underlying parametric structure can be exploited to devise a projective invariant piecewise parametric approximation. Thus, the approximations computed from multiple views of the contour will be isomorphic to each other. This necessary condition satisfied by matching contours can be used for recognizing a shape in multiple views without explicit correspondence between points.

The second approach uses new contour invariants computed from the sequence of points on the contour. This approach combines multiview constraints on corresponding points and the constraints of the sequence of points. We achieve their fusion in a Fourier domain representation of the point sequence. We show that the novel Fourier vector representation of the contour points yields a single-view invariant sequence $\kappa$ that is identical up to a scale factor for all affine-related projections of the contour. This can be the basis for recognizing the shape in multiple views. This is a direct method for recognition that does not involve the selection of starting points or approximation of the contour using parametric structures. We establish the contour invariants for affine homographies, but show that they are satisfied for most practical situations when the image-to-image homography is not strictly affine. We present the most general contour invariant in this paper. In a previous work, we have shown that recognition constraints based on the magnitude and phases of the Fourier vector representation also exist [14].

Section 2 presents the basic problem formulation and establishes the notation used throughout this paper. Section 3 discusses the general theory behind projective invariant piecewise parametric representation of discrete contours. The Fourier domain contour invariant is presented in Section 4. We establish the affine invariant and present the multiview recognition condition as a rank constraint on a measurement matrix. Section 5 shows that the problem of recognizing polygonal shapes in multiple views can be solved using the Fourier domain invariant on a sequence of lines in the dual space. Section 6 shows the results of applying both approaches for contour recognition on a number of applications involving the recognition of numerals and aircraft silhouettes. We also show the extension of the recognition scheme to compute point-wise correspondence between matching contours and then to compute the homography between a pair of views. Section 7 presents a few concluding remarks.

2 Problem Formulation

Let O be a set of $N$ points on the boundary of a planar object imaged in multiple views. Let $\mathbf{x}^l[i]$ be the homogeneous coordinates of points on the image of the closed boundary in view $l$. This boundary is represented by a sequence of vectors as shown below.

$$\mathbf{x}^l[i] = \begin{bmatrix} x^l[i] \\ y^l[i] \\ w^l[i] \end{bmatrix}$$

When a planar object is imaged from multiple viewpoints, points on it undergo a transformation that can be described mathematically as a general projective transformation, which is a linear operation (a $3 \times 3$ matrix) in homogeneous coordinates:

$$\mathbf{x}^l[i] = \mathbf{M}_l \, \mathbf{x}^0[i] \tag{1}$$

In some special cases, additional constraints on $\mathbf{M}_l$ result in an affine or similarity transformation.

Two views are related by a similarity transformation only when there is a rotation, and/or a translation, and/or a uniform scaling between them. Most reported recognition algorithms are primarily designed to handle similarity transformations. However, affine transformations deform the shape of an object beyond rigid motions, though they do preserve parallelism and points at infinity. An affine homography between two images is represented by a matrix $\mathbf{M}_l$ with its last row as $[0 \;\; 0 \;\; 1]$. Affine transformations form a subset of the general projective group.

Projective transformations, in general, do not preserve parallelism of lines and can map points at infinity to finite points and vice versa. It has been shown that a planar object viewed from multiple viewing positions results in a projective image-to-image homography [2]. However, for cases like a narrow field of view or imaging an object from a distance, an affine transformation is considered to be a good approximation of the projective transformation.

As we go higher up in the hierarchy of transformations (similarity $\subset$ affine $\subset$ projective), the constructs that are preserved (remain invariant) become fewer, making the task of recognition more challenging. We address the problem of planar object recognition using appropriate invariants.

2.1 Invariants

Let $\mathbf{p}$ be a parameter vector subject to the linear transformation $\mathbf{M}_l$. A scalar invariant $I(\mathbf{p})$ of $\mathbf{p}$ is preserved under the transformation if $I(\mathbf{p}') = I(\mathbf{p})$ [11]. Here $I(\mathbf{p}')$ is the function of the parameters after the linear transformation. If the transformation $\mathbf{M}_l$ is affine, the invariants are called affine invariants. For example, the ratio of areas and the ratio of lengths on parallel lines are invariant to affine transformations. Several cross ratios are invariant under projective transformation. The cross ratio of four collinear points is defined as the ratio of ratios of distances between points. Below we describe the cross ratios that we employ, namely the cross ratio of areas and the cross ratio of concurrent lines.

Cross-ratio of areas of five points: The cross-ratio of the areas of five points $\mathbf{x}_1$, $\mathbf{x}_2$, $\mathbf{x}_3$, $\mathbf{x}_4$, and $\mathbf{x}_5$ is defined by

$$cr(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3, \mathbf{x}_4, \mathbf{x}_5) = \frac{\Delta_{541} \cdot \Delta_{321}}{\Delta_{531} \cdot \Delta_{421}} \tag{2}$$

where $\Delta_{ijk}$ is the area of the triangle formed by points $\mathbf{x}_i$, $\mathbf{x}_j$, $\mathbf{x}_k$. This is invariant to general linear or projective transformations [11].
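The invariance claimed for Equation 2 can be checked numerically. The following is a minimal sketch, assuming that Equation 2 pairs the triangles as (5,4,1)(3,2,1) over (5,3,1)(4,2,1) and that signed areas are used; the function names, the five test points and the homography `H` are hypothetical and chosen only for illustration.

```python
def tri_area(p, q, r):
    """Signed area of the triangle p, q, r (inhomogeneous 2D points)."""
    return 0.5 * ((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1]))


def cross_ratio_areas(x1, x2, x3, x4, x5):
    """Cross-ratio of areas: (A_541 * A_321) / (A_531 * A_421)."""
    return (tri_area(x5, x4, x1) * tri_area(x3, x2, x1)) / (
        tri_area(x5, x3, x1) * tri_area(x4, x2, x1))


def apply_homography(H, p):
    """Map a 2D point through a 3x3 homography given as nested lists."""
    x, y = p
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


# Hypothetical data: five points in general position and an arbitrary homography.
pts = [(0.0, 0.0), (4.0, 1.0), (3.0, 5.0), (-1.0, 3.0), (2.0, -2.0)]
H = [[1.1, 0.2, 3.0], [-0.3, 0.9, 1.0], [0.01, 0.02, 1.0]]

before = cross_ratio_areas(*pts)
after = cross_ratio_areas(*[apply_homography(H, p) for p in pts])
```

Each point appears equally often in the numerator and the denominator, which is why the per-point projective scale factors cancel and `before` and `after` agree up to rounding.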

Cross-ratio of four concurrent lines: The cross-ratio of four concurrent lines $l_1, l_2, l_3, l_4$ is defined as

$$cr(l_1, l_2, l_3, l_4) = \frac{\sin\theta_{13} \cdot \sin\theta_{24}}{\sin\theta_{23} \cdot \sin\theta_{14}} \tag{3}$$

where $\theta_{ij}$ represents the angle formed by the lines $l_i$ and $l_j$. This is invariant to a general projective transformation [11].
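Equation 3 can be illustrated in the same way. This sketch assumes signed sines (computed from direction vectors) and represents each concurrent line by the common point plus one further point on the line; all names and the test data are hypothetical.

```python
import math


def signed_sin(d1, d2):
    """Sine of the angle from direction d1 to direction d2."""
    cross = d1[0] * d2[1] - d1[1] * d2[0]
    return cross / (math.hypot(*d1) * math.hypot(*d2))


def cross_ratio_lines(center, a, b, c, d):
    """Cross-ratio of the four lines joining `center` to a, b, c, d."""
    l1, l2, l3, l4 = [(p[0] - center[0], p[1] - center[1]) for p in (a, b, c, d)]
    return (signed_sin(l1, l3) * signed_sin(l2, l4)) / (
        signed_sin(l2, l3) * signed_sin(l1, l4))


def apply_homography(H, p):
    """Map a 2D point through a 3x3 homography given as nested lists."""
    x, y = p
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


# Hypothetical data: a common point and one point on each of the four lines.
center = (1.0, 1.0)
on_lines = [(5.0, 2.0), (4.0, 6.0), (0.0, 5.0), (-3.0, 2.0)]
H = [[1.1, 0.2, 3.0], [-0.3, 0.9, 1.0], [0.01, 0.02, 1.0]]

before = cross_ratio_lines(center, *on_lines)
after = cross_ratio_lines(apply_homography(H, center),
                          *[apply_homography(H, p) for p in on_lines])
```

Since a homography maps lines to lines, the transformed points still define four concurrent lines, and the two values coincide up to rounding.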

We employ these invariances for piecewise parametric approximation of discrete contours.

3 Projectively Invariant Piecewise Parametric Representation

Parametric representation is a popular method for providing a compact representation of planar contours. A piecewise parametric representation of a contour partitions the points on the contour into sets, and represents each set with the help of a few parameters. One example of such a representation is polygonal approximation. This has been extensively used as an intermediate step in various applications such as planar shape recognition, volume rendering and multi-resolution modeling [15–17].

Polygonal approximation may be achieved by minimizing the error in approximation. A general optimization of the objective function may be computationally expensive and prone to getting stuck in local minima. Most popular polygonal approximation algorithms look for an optimal solution using a greedy algorithm [18,7]. Some of these algorithms are designed for fast approximation. They, in general, exploit the connectedness and ordering of the points in the set. There are also algorithms that do not exploit the connectedness of points [19]. They group points on the boundary into linear clusters. Some other algorithms emphasize the optimality and efficiency of the polygonal approximation [20]. Conic sections are also often used as the parametric representation for each partition [21].

Let $\mathbf{x}^l[i]$, $i = 1, 2, \cdots, N$, be a set of points on the boundary of a planar object to be approximated using parametric sections in the $l$th view. These sequences in multiple views are related by $\mathbf{x}^l[i] = \mathbf{M}_l \, \mathbf{x}^0[i]$. In individual views, these boundary sequences can be partitioned into $k$ mutually exclusive and collectively exhaustive subsets $S_1, \ldots, S_k$ such that each of the subsets can be approximated using a few parameters. This can be achieved by minimizing an objective function of the form

$$E = \sum_{i=1}^{N} d(P_j, \mathbf{x}^l[i]), \qquad \mathbf{x}^l[i] \in S_j \tag{4}$$

where $P_j$ is the parametric structure which approximates the points in $S_j$ and $d(\cdot)$ is a measure of separation of $\mathbf{x}^l[i]$ from $P_j$. The separation $d(\cdot)$ can be the distance along the normal or a measure of how the parametric structure $P_j$ distorts the real point $\mathbf{x}^l[i]$. For the parametric representation to be invariant to projective transformations, there should be a one-to-one mapping from the subsets $S_i$ obtained in one view to the subsets $S_i'$ obtained in any other view.

We now show that as long as we can identify a projective invariant which gives a measure of separation for the desired parametric structure, we can find an algorithm to represent the curve using that parametric structure. We discuss two specific cases of this approach in the form of polygonal approximation and piecewise conic section approximation.

3.1 Proposed Approach

The existence of a projectively invariant piecewise parametric method depends on the availability of a projective invariant which can give a measure of separation of a point from the desired parametric structure. Let $I(\mathbf{p})$ be a projective invariant for a set of points $\mathbf{p}$ and let $I(\mathbf{x}[i])$ denote its value when all points of $\mathbf{p}$ except $\mathbf{x}[i]$ are kept fixed. If the locus of $\mathbf{x}[i]$ for which $I(\mathbf{x}[i]) = k$ is a parametric structure $P$, we can use $|I(\mathbf{x}[i]) - k|$ as a measure of separation. We first select the fixed points and compute the invariants corresponding to those points. We then move along the contour until the approximation error is sufficiently large. Thus, we can identify regions in the curve which can be approximated using a parametric structure. We describe two specific examples of the above method in the form of polygonal approximation and piecewise conic section approximation in the following subsections.

Fig. 1. Points 1, 2, 3, 4, 5 and 6. (The figure marks the segment lengths $len$ and the perpendicular distances $dist^5_{13}$, $dist^5_{14}$, $dist^6_{13}$ and $dist^6_{14}$.)

3.2 Projective Invariant Polygon Approximation

In this section, we describe an algorithm that can result in isomorphic polygonal approximation for projectively transformed versions of the same image. Some preliminary results of this have been presented in [15].

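Consider points $\mathbf{x}_1$, $\mathbf{x}_2$, $\mathbf{x}_3$, $\mathbf{x}_4$, $\mathbf{x}_5$ and $\mathbf{x}_6$. Writing $\Delta_{ijk}$ for the area of the triangle formed by points $i$, $j$ and $k$, and using Equation 2, we get

$$cr^5_{1234} = \frac{\Delta_{541} \cdot \Delta_{321}}{\Delta_{531} \cdot \Delta_{421}}, \qquad cr^6_{1234} = \frac{\Delta_{641} \cdot \Delta_{321}}{\Delta_{631} \cdot \Delta_{421}}.$$

We see that the area $\Delta_{541} = \frac{1}{2} \cdot len_{14} \cdot dist^5_{14}$, where $len_{14}$ is the length of the line segment defined by points 1 and 4 and $dist^5_{14}$ is the perpendicular distance of point 5 from the line defined by points 1 and 4. We define a ratio of cross-ratios as

$$\rho = \frac{cr^5_{1234}}{cr^6_{1234}} = \frac{\Delta_{541} \cdot \Delta_{631}}{\Delta_{531} \cdot \Delta_{641}} = \frac{len_{14} \cdot dist^5_{14} \cdot len_{13} \cdot dist^6_{13}}{len_{13} \cdot dist^5_{13} \cdot len_{14} \cdot dist^6_{14}} = \frac{dist^5_{14} \cdot dist^6_{13}}{dist^5_{13} \cdot dist^6_{14}}. \tag{5}$$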

We first establish three useful properties before proving the existence of projective invariant polygon approximation.

Property 1: An invariant polygonal approximation algorithm can exist only for a transformation where collinearity is preserved. To verify this, we assume that we have a polygonal approximation algorithm invariant to a transformation which does not preserve collinearity. Under such a transformation, not every line gets transformed as a line. From this it is clear that there cannot exist a polygonal approximation algorithm which is invariant to such a transformation. Since projective transformation preserves collinearity, there can exist a polygonal approximation algorithm invariant to this transformation.

Property 2: The ratio of cross-ratios of areas gives us a measure of deviation $d(\cdot)$ for polygonal approximation. This is true as Equation 5 shows that $\rho$ can be interpreted as the ratio of the ratios of perpendicular distances of the points $\mathbf{x}_5$ and $\mathbf{x}_6$ from the two lines $line_{13}$ and $line_{14}$. This value equals 1 when $\mathbf{x}_6$ lies on the line $line_{15}$. When the point $\mathbf{x}_6$ moves away from this line, $\rho$ moves away from 1. Thus, $\rho$ gives a measure of the collinearity of the points $\mathbf{x}_1$, $\mathbf{x}_5$ and $\mathbf{x}_6$. The measure $d(\cdot) = |\rho - 1|$ can then be used as the measure of deviation as given in Equation 4.
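The claim that the ratio of cross-ratios of Equation 5 (written $\rho$ here) equals 1 on the line through points 1 and 5 can be spelled out in one step. The sketch below assumes point 6 is parametrized as $\mathbf{x}_6 = \mathbf{x}_1 + t(\mathbf{x}_5 - \mathbf{x}_1)$ with $t \neq 0$; this parametrization is introduced only for illustration.

```latex
% Perpendicular distance to any line through x_1 scales linearly along a ray from x_1:
dist^6_{13} = |t| \, dist^5_{13}, \qquad dist^6_{14} = |t| \, dist^5_{14}
% Substituting into the last form of Equation 5:
\rho = \frac{dist^5_{14} \cdot dist^6_{13}}{dist^5_{13} \cdot dist^6_{14}}
     = \frac{dist^5_{14} \cdot |t| \, dist^5_{13}}{dist^5_{13} \cdot |t| \, dist^5_{14}} = 1
```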

Property 3: The ratio of cross-ratios of areas ($\rho$) is invariant to projective transformations since the cross-ratios of areas are projectively invariant.

Therefore, there exists an algorithm for polygonal approximation which is invariant to projective transformation, as there is a function $d(\cdot)$ which is invariant to projective transformation. This would result in a one-to-one mapping between the subsets $\{S_1, S_2, \ldots, S_k\}$.

We develop the algorithm in the following manner.

(1) Choose point 1 as the starting point and point 5 as the point next to point 1 on the curve. Points 2, 3, and 4 may or may not lie on the curve.

(2) Choose the point adjacent to 5 on the curve as point 6. Measure $d = |\rho - 1|$.

(3) If $d < \epsilon$ (a suitable tolerance threshold), move point 6 forward along the curve and repeat the measurement of step 2.

(4) Otherwise, approximate the curve by the line segment joining points 1 and 6. Set point 6 as the new point 1. Go to step 1.
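Steps (1) to (4) can be sketched as a greedy pass over an ordered point list. This is a minimal illustration, not the implementation used for the experiments: it assumes an open sequence of points, fixed off-curve reference points `p3` and `p4` (point 2 cancels in the ratio of cross-ratios), signed triangle areas, and a hypothetical threshold `eps`; all names are chosen for illustration only.

```python
def tri_area(p, q, r):
    """Signed area of the triangle p, q, r."""
    return 0.5 * ((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1]))


def rho(p1, p3, p4, p5, p6):
    """Ratio of cross-ratios of areas: (A_541 * A_631) / (A_531 * A_641)."""
    return (tri_area(p5, p4, p1) * tri_area(p6, p3, p1)) / (
        tri_area(p5, p3, p1) * tri_area(p6, p4, p1))


def breakpoints(contour, p3, p4, eps):
    """Indices where a new line segment starts, following steps (1)-(4)."""
    breaks = [0]
    start = 0
    while start + 2 < len(contour):
        p1, p5 = contour[start], contour[start + 1]     # step (1)
        k = start + 2                                    # step (2): point 6
        while k < len(contour) and abs(rho(p1, p3, p4, p5, contour[k]) - 1.0) < eps:
            k += 1                                       # step (3): move point 6 forward
        if k == len(contour):
            break
        breaks.append(k)                                 # step (4): point 6 becomes point 1
        start = k
    if breaks[-1] != len(contour) - 1:
        breaks.append(len(contour) - 1)
    return breaks


# Hypothetical L-shaped boundary: 11 points along y = 0, then 10 points up x = 10.
contour = [(float(x), 0.0) for x in range(11)] + [(10.0, float(y)) for y in range(1, 11)]
p3, p4 = (-7.0, 23.0), (31.0, -5.0)   # reference points, not collinear with the curve
result = breakpoints(contour, p3, p4, eps=0.05)
```

On this input the breakpoint falls at index 11, one sample past the corner at index 10, because step (4) places the break at the first point whose deviation exceeds the threshold. The sketch does not guard against point 6 becoming collinear with points 1 and 3 or points 1 and 4, in which case an area in the denominator vanishes.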

The algorithm described above was implemented to work with real images. Points 3, 4, and 5 have to be chosen carefully in order to overcome errors that are present in real images. The most prominent error is due to the discretization of the curve, because of which the cross-ratios calculated may differ in different views [11]. In order to decrease the relative error, points 3, 4 and 5 were chosen so that the areas to be calculated for $\rho$ were large. Point 5 need not be the same in multiple views of the same object as long as it is close to point 1.

The reference points should not be collinear, since in that case the cross ratio of areas is undefined. Another problem arises due to the differences in the location of points 3 and 4 in different views. However, as long as the threshold is low, this results in a difference of only 1-2 pixels in the boundary points. The choice of the threshold should depend on the curvature of the section that we are trying to approximate. For example, for a linear section the threshold should be low to retrieve the same linear section as the approximation.

Fig. 2. Polygon approximation of two original curves and after applying two projective transformations.

Example: The results of the proposed algorithm on two planar boundaries are shown in Figure 2. The results were obtained on images of size $300 \times 300$. We considered only the boundary pixels for approximation. The original image is shown on top with two projectively transformed versions below it. The boundary pixels are drawn in black colour. The red lines in the figure show the polygonal approximation of the boundary using our algorithm. The green lines show the projectively transformed polygon which approximates the original images. The green and red polygons in the figures are identical in all cases except for one or two breakpoints which were shifted by 1-2 pixels due to discretization. We obtained good results on other planar boundaries as well.

3.3 Projectively Invariant Piecewise Conic Approximation

The general method for parametric approximation described in Section 3.1 can be applied for finding piecewise conic approximations to a closed curve which are invariant to projective transformations. Such a method is possible because a projective invariant exists which can give the measure of deviation from a conic section. This invariant is the cross ratio of lines, which is defined for four concurrent lines as

$$cr_{lines}(l_1, l_2, l_3, l_4) = \frac{\sin\theta_{13} \cdot \sin\theta_{24}}{\sin\theta_{23} \cdot \sin\theta_{14}} \tag{6}$$

where $\theta_{ij}$ is the angle between lines $l_i$ and $l_j$. This is invariant to general linear or projective transformations [11]. This can alternatively be defined for five points in general position as

$$cr(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3, \mathbf{x}_4, \mathbf{x}_5) = cr_{lines}(l_{15}, l_{25}, l_{35}, l_{45}) \tag{7}$$

where $l_{ij}$ is the line joining points $\mathbf{x}_i$ and $\mathbf{x}_j$.

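According to Chasles' theorem [22], if we keep $\mathbf{x}_1$, $\mathbf{x}_2$, $\mathbf{x}_3$ and $\mathbf{x}_4$ fixed, the locus of all points $\mathbf{x}$ such that $cr(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3, \mathbf{x}_4, \mathbf{x})$ is fixed is a conic curve. If we define $k = cr(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3, \mathbf{x}_4, \mathbf{x}_5)$, then the value $|cr(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3, \mathbf{x}_4, \mathbf{x}) - k|$ gives the measure of separation of point $\mathbf{x}$ from the conic defined by points $\mathbf{x}_1$ to $\mathbf{x}_5$.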
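The conic property can be checked numerically on a circle. The sketch below assumes the five-point cross-ratio is evaluated through the four lines joining the first four points to the fifth, with signed sines; the function names and sample points are hypothetical.

```python
import math


def cr_five(x1, x2, x3, x4, x5):
    """Cross-ratio of the four lines joining x1..x4 to x5."""
    def sin_between(p, q):
        d1 = (p[0] - x5[0], p[1] - x5[1])
        d2 = (q[0] - x5[0], q[1] - x5[1])
        return (d1[0] * d2[1] - d1[1] * d2[0]) / (math.hypot(*d1) * math.hypot(*d2))
    return (sin_between(x1, x3) * sin_between(x2, x4)) / (
        sin_between(x2, x3) * sin_between(x1, x4))


def on_circle(t):
    """Point on the unit circle at angle t (radians)."""
    return (math.cos(t), math.sin(t))


# Four fixed points and a fifth point, all on the unit circle.
fixed = [on_circle(t) for t in (0.3, 1.1, 2.0, 2.9)]
k_ref = cr_five(*fixed, on_circle(4.0))

# Deviation for another point on the same conic, and for a point off it.
dev_on = abs(cr_five(*fixed, on_circle(5.0)) - k_ref)
dev_off = abs(cr_five(*fixed, (0.2, -0.4)) - k_ref)
```

The deviation stays at rounding level for points on the circle and becomes clearly nonzero for the interior point, which is the behaviour the breakpoint test relies on.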

The above results are used to develop an algorithm for projectively invariant piecewise conic approximation. The algorithm is similar to the one described for polygonal approximation except that the measure of deviation used is the cross-ratio of lines. We denote the starting point by $\mathbf{x}_1$. Points $\mathbf{x}_2$, $\mathbf{x}_3$, $\mathbf{x}_4$ and $\mathbf{x}_5$ are chosen near $\mathbf{x}_1$ on the curve. We then move clockwise from point $\mathbf{x}_5$ and compute the deviation $d(\mathbf{x})$ as the difference between the cross ratio of lines of points $(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3, \mathbf{x}_4, \mathbf{x}_5)$ and $(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3, \mathbf{x}_4, \mathbf{x})$ for every boundary point $\mathbf{x}$. As long as $d(\mathbf{x})$ is less than a suitable tolerance threshold, the visited points can be approximated with a conic section. When $d(\mathbf{x})$ becomes larger than the threshold, we put a breakpoint there and continue with the next point on the boundary as $\mathbf{x}_1$.

Fig. 3. Breakpoints found by the conic approximation algorithm on two views, (a) and (b), of a contour.

The algorithm will partition the curve points into sets which can be approximated using a conic section. A projectively invariant conic fitting algorithm should then be used to estimate the conic section which fits those points. This, however, does not pose a big problem, as the points to be fitted are not highly scattered and therefore algorithms which are affine or similarity invariant lead to only very small errors in fitting [21].

Example: Figure 3 shows the breakpoints obtained by the above mentioned algorithm on two views of a contour. The boundary is shown in black and the breakpoints are shown by the red dots. The contours show the outer boundary points of five intersecting circles. In both the views the conic approximation obtained is identical and the breakpoints are very close to the intersections of the circles.

4 Fourier Domain Affine Invariant for Discrete Contours

In this section, we analyze the properties of a collection of points, such as a planar object's contour, in a transform domain. Collections of points such as a boundary have more information than isolated points. The sequencing inherent in such a collection makes a transform domain approach, such as the Fourier one, a good tool to study their properties. The linear image-to-image relationships combined with the properties of the contour in the Fourier domain enable rich constraints that essentially characterize the contour independent of the viewpoint. We come up with a view-independent characterization of the planar shape boundary using a measure computed in the Fourier domain. Some preliminary results were presented in [14,23].

4.1 Fourier Domain Representation of the Contour

Let the Fourier domain representation of the sequence $\mathbf{x}^l[i]$, $0 \le i < N$ (introduced in Section 2) be $\mathbf{X}^l[k]$, $0 \le k < N$, such that

$$\mathbf{X}^l[k] = \begin{bmatrix} X^l[k] \\ Y^l[k] \\ W^l[k] \end{bmatrix}$$

where $X^l[k]$, $Y^l[k]$, and $W^l[k]$ are respectively the Fourier transforms of the individual sequences $x^l[i]$, $y^l[i]$, and $w^l[i]$. We call this the vector Fourier representation of the contour. Note that the sequences $\mathbf{X}^l[k]$ are periodic and conjugate symmetric.

Theorem 1: The Fourier transform and the collineation commute with the above representation. That is, if points are transformed between views 0 and $l$ using Equation 1, the same homography will transform corresponding frequency terms in the Fourier domain also. In other words,

$$\mathbf{X}^l[k] = \mathbf{M}_l \, \mathbf{X}^0[k], \qquad 0 \le k < N. \tag{8}$$

Proof: Let $\mathbf{M}_l = [a_{ij}]$, $1 \le i, j \le 3$. Expanding Equation 1 for the $x$ term,

$$x^l[i] = a_{11} x^0[i] + a_{12} y^0[i] + a_{13} w^0[i].$$

Taking the Fourier transform of the above equation and using the linearity property of Fourier transforms, we get

$$X^l[k] = a_{11} X^0[k] + a_{12} Y^0[k] + a_{13} W^0[k].$$

Similarly for $Y^l[k]$ and $W^l[k]$ (note that $w^l[i]$ need not be 1). It is now easy to see that $\mathbf{X}^l[k] = \mathbf{M}_l \, \mathbf{X}^0[k]$, giving us the desired result. $\square$
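Theorem 1 is a direct consequence of the linearity of the discrete Fourier transform and can be verified numerically. The sketch below uses a plain $O(N^2)$ DFT and keeps the transformed homogeneous coordinates unnormalized (the third coordinate need not be 1); the matrix `M`, the sample contour and the function names are hypothetical.

```python
import cmath


def dft(seq):
    """Plain O(N^2) discrete Fourier transform of a real or complex sequence."""
    n = len(seq)
    return [sum(seq[i] * cmath.exp(-2j * cmath.pi * i * k / n) for i in range(n))
            for k in range(n)]


def vector_fourier(points):
    """Vector Fourier representation: one (X[k], Y[k], W[k]) triple per frequency."""
    xs, ys, ws = zip(*points)
    return list(zip(dft(xs), dft(ys), dft(ws)))


def mat_vec(M, v):
    """Product of a 3x3 matrix (nested lists) with a 3-vector."""
    return tuple(sum(M[r][c] * v[c] for c in range(3)) for r in range(3))


# Hypothetical homography and a small closed contour in homogeneous coordinates.
M = [[1.1, 0.2, 3.0], [-0.3, 0.9, 1.0], [0.01, 0.02, 1.0]]
view0 = [(0.0, 0.0, 1.0), (2.0, 0.0, 1.0), (3.0, 1.0, 1.0),
         (2.0, 3.0, 1.0), (0.0, 2.0, 1.0), (-1.0, 1.0, 1.0)]
view1 = [mat_vec(M, p) for p in view0]        # transformed points, not divided by w

lhs = vector_fourier(view1)                    # transform of the transformed points
rhs = [mat_vec(M, X) for X in vector_fourier(view0)]   # M applied per frequency
max_err = max(abs(a - b) for L, R in zip(lhs, rhs) for a, b in zip(L, R))
```

Dividing `view1` by its third coordinate before the transform would break the equality for a general projective matrix, since that division is not linear.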

Given a set of $M$ views, the recognition problem can be formulated as the identification of a view-independent function $f(\cdot)$ such that $f(\mathbf{x}^0, \mathbf{x}^1, \ldots, \mathbf{x}^{M-1}) = 0$. This recognition constraint can be linear or nonlinear in image coordinates. The algebraic relation given by $f(\cdot)$ can then be used to settle the question of whether the $M$ observed views were of the same object.

4.2 Rank Constraint for Recognition

If the image-to-image homography is affine, the transformation matrix has $a_{31} = a_{32} = 0$ and $a_{33} = 1$. The transformation can be expressed in terms of inhomogeneous coordinates as

$$\mathbf{x}^l[i] = \mathbf{A} \, \mathbf{x}^0[i] + \mathbf{b} \tag{9}$$

where $\mathbf{x}^l[i]$ is the inhomogeneous representation of the $i$th point on the contour in view $l$, $\mathbf{A}$ is the upper $2 \times 2$ minor of $\mathbf{M}_l$ and $\mathbf{b}$ is the upper two elements of the last column of $\mathbf{M}_l$.

The above expression is valid for the scenarios in which correspondence between points across views is known. However, in practice, correspondence is rarely available. In case correspondence information is not available, Equation 9 assumes the form $\mathbf{x}^l[i] = \mathbf{A} \, \mathbf{x}^0[i + \lambda] + \mathbf{b}$, where cyclically shifting the sequence $\mathbf{x}^0$ by $\lambda$ would align the corresponding points of $\mathbf{x}^0$ and $\mathbf{x}^l$. The frequency domain representation is given by

$$\mathbf{X}^l[k] = \mathbf{A} \, \mathbf{X}^0[k] \exp\left(\frac{j 2 \pi \lambda k}{N}\right), \qquad 0 < k < N \tag{10}$$

if the $\mathbf{b}$ term is eliminated by omitting the $k = 0$ term in the Fourier domain.

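Let us define a measure called the cross-conjugate product (CCP) on the Fourier representations of two views as

$$\psi(0, l)[k] = (\mathbf{X}^0[k])^{*T} \, \mathbf{X}^l[k], \qquad 0 < k < N$$
$$\phantom{\psi(0, l)[k]} = (\mathbf{X}^0[k])^{*T} \, \mathbf{A} \, \mathbf{X}^0[k] \exp\left(\frac{j 2 \pi \lambda k}{N}\right). \tag{11}$$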

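The matrix $\mathbf{A}$ can be expressed as a sum of a symmetric matrix and a skew-symmetric matrix as $\mathbf{A} = \mathbf{A}_s + \mathbf{A}_{sk}$, where $\mathbf{A}_s = \frac{1}{2}(\mathbf{A} + \mathbf{A}^T)$ and $\mathbf{A}_{sk} = \frac{1}{2}(\mathbf{A} - \mathbf{A}^T)$. The skew-symmetric matrix reduces to

$$\mathbf{A}_{sk} = \frac{c}{2} \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix},$$

where $c = a_{12} - a_{21}$ is the difference of the off-diagonal elements of $\mathbf{A}$. We now have

$$\psi(0, l)[k] = (\mathbf{X}^0[k])^{*T} \left[ \mathbf{A}_s + \mathbf{A}_{sk} \right] \mathbf{X}^0[k] \exp\left(\frac{j 2 \pi \lambda k}{N}\right).$$

Apart from the common exponential factor, the first term of the above equation (the quadratic form in $\mathbf{A}_s$) is purely real and the second term (the quadratic form in $\mathbf{A}_{sk}$) is purely imaginary. We observe that the effect of the transformation matrix $\mathbf{A}$ on the second term is restricted to a scaling by a factor proportional to $c$. We can define a new measure $\kappa$, ignoring scale, for the sequence $\mathbf{X}^l$ in view $l$ as

$$\kappa(l)[k] = (\mathbf{X}^l[k])^{*T} \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \mathbf{X}^l[k]. \tag{12}$$

It can be shown [23] that

$$\kappa(l)[k] = |\mathbf{A}| \, \kappa(0)[k], \qquad 0 < k < N. \tag{13}$$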
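The step from Equation 12 to Equation 13 can be checked directly. In the short derivation below, $\mathbf{J}$ is shorthand for the $2 \times 2$ matrix appearing in Equation 12, $\lambda$ denotes the unknown cyclic shift of Equation 10, and $a, b, c, d$ denote the entries of $\mathbf{A}$; these names are introduced here only for the worked step.

```latex
% Equation 10 with \phi_k = 2\pi\lambda k / N:
\mathbf{X}^l[k] = \mathbf{A}\,\mathbf{X}^0[k]\,e^{j\phi_k}
% Substitute into Equation 12; A is real, and the phase cancels against its conjugate:
\kappa(l)[k] = (\mathbf{X}^0[k])^{*T}\,\mathbf{A}^T \mathbf{J}\,\mathbf{A}\,\mathbf{X}^0[k]
% For any 2x2 matrix A = [a b; c d]:
\mathbf{A}^T \mathbf{J}\,\mathbf{A}
  = \begin{bmatrix} a & c \\ b & d \end{bmatrix}
    \begin{bmatrix} c & d \\ -a & -b \end{bmatrix}
  = (ad - bc) \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}
  = |\mathbf{A}|\,\mathbf{J}
% Hence
\kappa(l)[k] = |\mathbf{A}|\,(\mathbf{X}^0[k])^{*T}\,\mathbf{J}\,\mathbf{X}^0[k] = |\mathbf{A}|\,\kappa(0)[k]
```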

Equation 13 gives a necessary condition for the sequences $\mathbf{X}^0$ and $\mathbf{X}^l$ to be two affine views of the same planar shape; in other words, the values of the measure $\kappa(\cdot)$ in the two views should be scaled versions of each other. The $\kappa(\cdot)$ values for a contour are, thus, signatures of that contour. This extends to multiple views also. Consider the $M \times (N - 1)$ matrix formed by the coefficients of the $\kappa(\cdot)$ measures for $M$ different views.

$$\Theta = \begin{bmatrix}
\kappa(0)[1] & \cdots & \kappa(0)[N-1] \\
\kappa(1)[1] & \cdots & \kappa(1)[N-1] \\
\cdots & \cdots & \cdots \\
\kappa(M-1)[1] & \cdots & \kappa(M-1)[N-1]
\end{bmatrix}$$

The necessary condition for matching of the planar shape in $M$ views then reduces to

$$\mathrm{rank}(\Theta) = 1. \tag{14}$$

The view-independent function that we want is $f(\cdot) = 0 \Leftrightarrow \mathrm{rank}(\Theta) - 1 = 0$. It should be noted that this recognition constraint does not require correspondence across views and is valid for any number of views.

Since the $\kappa$ measures in the various views are only scaled versions of each other, if we normalize the $\kappa$ measure terms in each view with respect to a fixed one, then

$$\bar{\kappa}(l)[k] = \frac{\kappa(l)[k]}{\kappa(l)[p]} \;\; (p \text{ is fixed}) = \frac{|\mathbf{A}| \, \kappa(0)[k]}{|\mathbf{A}| \, \kappa(0)[p]} = \frac{\kappa(0)[k]}{\kappa(0)[p]}, \qquad \bar{\kappa}(l)[k] = \bar{\kappa}(0)[k]. \tag{15}$$

The terms of the normalized $\kappa$ measure, the $\bar{\kappa}$ measure, are independent of the view. Hence, $\bar{\kappa}$ is an affine view invariant of a contour, whose computation does not need correspondence information across views.
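Equations 12, 13 and 15 can be illustrated on a synthetic contour. The sketch below assumes a plain $O(N^2)$ DFT, inhomogeneous 2D points, and a second view produced by a hypothetical affine map plus an unknown cyclic shift of the starting point; the contour, the map and all names are illustrative assumptions, not data from the experiments.

```python
import cmath
import math


def dft(seq):
    """Plain O(N^2) discrete Fourier transform."""
    n = len(seq)
    return [sum(seq[i] * cmath.exp(-2j * cmath.pi * i * k / n) for i in range(n))
            for k in range(n)]


def kappa(contour):
    """Measure of Equation 12 for 0 < k < N: conj(X) * Y - conj(Y) * X."""
    xs, ys = zip(*contour)
    X, Y = dft(xs), dft(ys)
    return [X[k].conjugate() * Y[k] - Y[k].conjugate() * X[k]
            for k in range(1, len(contour))]


# Hypothetical closed contour sampled at N points.
N = 32
ts = [2.0 * math.pi * i / N for i in range(N)]
shape = [(math.cos(t) + 0.3 * math.cos(2 * t), math.sin(t) + 0.3 * math.sin(2 * t))
         for t in ts]

# Second view: affine map A, translation t_vec, and a cyclic shift of the start point.
A = [[1.2, 0.5], [-0.3, 0.8]]
t_vec = (3.0, -2.0)
shift = 5
view = [(A[0][0] * x + A[0][1] * y + t_vec[0], A[1][0] * x + A[1][1] * y + t_vec[1])
        for x, y in (shape[(i + shift) % N] for i in range(N))]

k0, k1 = kappa(shape), kappa(view)
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]
scale_err = max(abs(v1 - det_A * v0) for v0, v1 in zip(k0, k1))    # Equation 13

p = 0  # list index 0 corresponds to frequency k = 1, used as the fixed term
norm0 = [v / k0[p] for v in k0]
norm1 = [v / k1[p] for v in k1]
norm_err = max(abs(u - v) for u, v in zip(norm0, norm1))           # Equation 15
```

Neither the translation nor the shift of the starting point affects the result: the translation only changes the omitted $k = 0$ term, and the shift contributes a phase that cancels in the conjugate product.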

Example: We consider four views of a dinosaur shown in Figure 4. These views are related by affine transformations. Euclidean measures are not preserved under these transformations. However, we can still recognize them using the rank constraint. Discretization noise does introduce errors into the framework that make the rank constraint an approximation, but the constraint is robust and still works well. To verify whether a matrix has rank $r$, we consider the ratio of the $r$th to the $(r+1)$th singular values of the matrix. This ratio is high if the matrix has rank $r$. To determine whether the rank of $\Theta$ is 1, we need to compare the ratio of the highest two singular values. The ratios of the two highest singular values of $\Theta$ constructed from various two-view combinations of the views shown in Figure 4 are arranged in Table 1.
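The singular-value test for two views can be sketched without a linear algebra library, since the two singular values of a $2 \times (N-1)$ matrix follow from the eigenvalues of its $2 \times 2$ Gram matrix. The sketch below assumes the rows are built from the imaginary parts of the $\kappa$ values (which are purely imaginary) and uses two hypothetical synthetic contours; it is an illustration and not the procedure that produced Table 1.

```python
import cmath
import math


def dft(seq):
    """Plain O(N^2) discrete Fourier transform."""
    n = len(seq)
    return [sum(seq[i] * cmath.exp(-2j * cmath.pi * i * k / n) for i in range(n))
            for k in range(n)]


def kappa_row(contour):
    """Imaginary parts of kappa[k] for 0 < k < N (one row of the matrix)."""
    xs, ys = zip(*contour)
    X, Y = dft(xs), dft(ys)
    return [2.0 * (X[k].conjugate() * Y[k]).imag for k in range(1, len(contour))]


def top_two_ratio(row_a, row_b):
    """Ratio of the two singular values of the matrix with rows row_a and row_b."""
    aa = sum(u * u for u in row_a)
    bb = sum(v * v for v in row_b)
    ab = sum(u * v for u, v in zip(row_a, row_b))
    trace, det = aa + bb, aa * bb - ab * ab
    disc = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    lam1, lam2 = (trace + disc) / 2.0, (trace - disc) / 2.0
    # lam2 can round to zero or below when the rows are exactly proportional.
    return math.inf if lam2 <= 0.0 else math.sqrt(lam1 / lam2)


N = 32
ts = [2.0 * math.pi * i / N for i in range(N)]
shape_a = [(math.cos(t) + 0.3 * math.cos(2 * t), math.sin(t) + 0.3 * math.sin(2 * t))
           for t in ts]
shape_b = [(math.cos(t) + 0.25 * math.cos(3 * t), math.sin(t) - 0.25 * math.sin(3 * t))
           for t in ts]
# Affine view of shape_a with a shifted starting point (hypothetical parameters).
affine_a = [(1.2 * x + 0.5 * y + 3.0, -0.3 * x + 0.8 * y - 2.0)
            for x, y in shape_a[5:] + shape_a[:5]]

ratio_same = top_two_ratio(kappa_row(shape_a), kappa_row(affine_a))
ratio_diff = top_two_ratio(kappa_row(shape_a), kappa_row(shape_b))
```

For noise-free synthetic data the ratio for two affine views of the same shape is limited only by floating-point rounding, whereas two different shapes give a modest ratio; with real, discretized contours the ratios are finite, as in Table 1.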

5 Polygons in Multiple Views

The recognition mechanism developed in the previous section can be used for recognizing polygons under affine image-to-image homographies. In this section, we formulate a technique for recognizing polygons in multiple views, specifically when the optical axes of the cameras used for imaging are coincident. We represent the boundary as a sequence of lines instead of points. This is applicable when the objects are polygons or the representation is an approximation.

| | Dinosaur 1 | Dinosaur 2 | Dinosaur 3 | Dinosaur 4 |
|---|---|---|---|---|
| Dinosaur 1 | - | 43176.5 | 23988.5 | 35453.9 |
| Dinosaur 2 | 43176.5 | - | 25733.7 | 35352.6 |
| Dinosaur 3 | 23988.5 | 25733.7 | - | 17548 |
| Dinosaur 4 | 35453.9 | 35352.6 | 17548 | - |

Table 1. Ratio of the highest singular value to the second highest singular value of $\Theta$ constructed from various two-view combinations of the views shown in Figure 4.

Fig. 4. Four affine transformed views, (a) to (d), of a dinosaur.

When a planar object is being imaged from multiple viewpoints so that attention is focused on it, the location of the viewpoints can be characterized using the azimuthal angle $\alpha$ in the horizontal plane, the elevation angle $\beta$ from the vertical axis, the twist angle $\tau$ about its own axis, and the distance $d$ from the origin (Figure 5(a)). The image-to-image homography $\mathbf{H}_{12}$ relating two views given by $(\alpha_1, \beta_1, \tau_1, d_1)$ and $(\alpha_2, \beta_2, \tau_2, d_2)$ due to the object plane can be expressed up to scale as

Fig. 5. (a) Characterizing a 3D point P in terms of azimuthal ($\alpha$), elevation ($\beta$) and twist ($\tau$) angles and the distance $d$ from the origin, relative to the axes X, Y and Z. (b) Two views of a polygon related by a homography whose transpose has the form of an affine homography.

$$\mathbf{H} = \begin{bmatrix}
d_1 \begin{vmatrix} r^2_{11} & r^1_{21} \\ r^2_{12} & r^1_{22} \end{vmatrix} &
-d_1 \begin{vmatrix} r^2_{11} & r^1_{11} \\ r^2_{12} & r^1_{12} \end{vmatrix} & 0 \\
d_1 \begin{vmatrix} r^2_{21} & r^1_{21} \\ r^2_{22} & r^1_{22} \end{vmatrix} &
-d_1 \begin{vmatrix} r^2_{21} & r^1_{11} \\ r^2_{22} & r^1_{12} \end{vmatrix} & 0 \\
h_{31} & h_{32} & h_{33}
\end{bmatrix}$$

where

$$h_{31} = \begin{vmatrix} r^1_{21} & (d_1 r^2_{31} - d_2 r^1_{31}) \\ r^1_{22} & (d_1 r^2_{32} - d_2 r^1_{32}) \end{vmatrix}, \quad
h_{32} = \begin{vmatrix} r^1_{11} & (d_1 r^2_{31} - d_2 r^1_{31}) \\ r^1_{12} & (d_1 r^2_{32} - d_2 r^1_{32}) \end{vmatrix}, \quad
h_{33} = d_2 \begin{vmatrix} r^1_{11} & r^1_{12} \\ r^1_{21} & r^1_{22} \end{vmatrix}$$

and $r^1_{ij}$ and $r^2_{ij}$ are the entries of the $3 \times 3$ rotation matrices that relate the vertical view to the view given by $(\alpha_i, \beta_i, \tau_i)$, respectively for the views 1 and 2. This homography matrix has the special form of the transpose of an affine homography.

If the homography $H$ relates points in two views, then $H^{-T}$ relates corresponding lines in the two views. Therefore, if the homography relating corresponding points in two views has the form of the transpose of an affine homography, corresponding lines in the two views are related by affine image-to-image homographies. So, if we represent the boundary in a line space, making use of the edges of the polygons, we can use the same approach as outlined in Subsection 4.2 to achieve recognition.
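The relation between the point and line homographies can be illustrated with a small numerical sketch. The matrix `H` below is a hypothetical example chosen only to have the transpose-of-affine form; it is not taken from the experiments.

```python
import numpy as np

# Hypothetical point homography with the transpose-of-affine form:
# its third column is (0, 0, 1).
H = np.array([[1.2, -0.3, 0.0],
              [0.4,  0.9, 0.0],
              [0.05, -0.02, 1.0]])

H_lines = np.linalg.inv(H).T        # lines transform as l' = H^{-T} l

# Two points in view 1 (homogeneous coordinates) and the line through them.
p = np.array([1.0, 2.0, 1.0])
q = np.array([-3.0, 0.5, 1.0])
line = np.cross(p, q)

# Transfer points and line to view 2.
p2, q2 = H @ p, H @ q
line2 = H_lines @ line

# The transferred line still passes through the transferred points.
incidence = np.array([line2 @ p2, line2 @ q2])

# The line homography is affine: its last row is proportional to (0, 0, 1).
last_row = H_lines[2] / H_lines[2, 2]
print(incidence, last_row)
```

Because the last row of `H_lines` is (0, 0, 1) up to scale, lines whose third component is normalized to unity stay normalized, which is what allows the affine machinery to be reused on the line representation.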

Example: Figure 5(b) shows two views of a polygon whose vertices are related by a homography whose transpose has the form of an affine homography. Hence, corresponding edges are related by affine homographies. All lines were normalized so that the third component of each line was unity. The measurement matrix was computed from the line representation of the polygon. Its rank was essentially 1, as its highest singular value was [illegible] times larger than the next highest singular value.

6 Applications and Extensions

6.1 Numeral Recognition

We demonstrate the applicability of polygonal approximation and the Fourier domain invariant to the recognition of numerals. Numeral recognition has conventionally been addressed by the document image processing community. Popular OCR systems are designed to address this problem only under similarity transformations. There are many other situations in image and video processing where numerals must be recognized under transformations more general than similarity. For instance, the alphanumeric characters on a number plate imaged from different viewpoints undergo projective transformations, as can be seen in Figure 7(a). We tried approximating the numerals in multiple views. Figure 7(b) shows the result of approximating one of the numerals in the two views shown in Figure 7(a).

The numerals in the multiple views are related by projective homographies. Though the Fourier domain invariant has been derived for affine image-to-image homographies, we empirically found it to be applicable to most real-life situations. We can solve the generic numeral recognition problem using both approaches: polygonal approximation and the Fourier domain invariant. We constructed a dataset of 10 numerals with 35 images, some of which are shown in Figure 6.

Both approaches provide excellent results on numerals. The Fourier domain invariant had an accuracy of [illegible]%, while polygonal approximation resulted in [illegible]%. The numerals 6 and 9 are related by a rotation, and since the algorithms are rotationally invariant, once a numeral was identified as 6 or 9 we had to take its horizontal projection to distinguish between the two.

The Fourier domain invariant had difficulty in distinguishing between two pairs of numerals (1 and [illegible], 0 and [illegible]), while the polygonal approximation based methods were able to identify them correctly. The Fourier domain invariant works well when the boundary has more detail (presence of high frequencies). Thus, the difficulty in distinguishing 1 from [illegible] is to be expected.
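The text does not say how the horizontal projection is used to separate 6 from 9. One plausible reading, sketched below in Python, is to sum the ink in each row of an upright binary glyph and compare the upper and lower halves: the closed loop of a 6 puts more ink in the lower half, while that of a 9 puts more in the upper half. The tiny bitmaps and the function `six_or_nine` are illustrative assumptions, not the authors' procedure.

```python
import numpy as np

def parse(rows):
    """Turn strings of '0'/'1' into a binary image (1 = ink)."""
    return np.array([[int(ch) for ch in row] for row in rows])

# Hypothetical 7 x 5 glyph of the numeral 6; a 9 is the same glyph rotated by 180 degrees.
SIX = parse(["01110",
             "10000",
             "10000",
             "11110",
             "10001",
             "10001",
             "01110"])
NINE = np.rot90(SIX, 2)

def six_or_nine(glyph):
    """Decide between 6 and 9 from the horizontal projection (ink per row)."""
    profile = glyph.sum(axis=1)
    half = len(profile) // 2
    top, bottom = profile[:half].sum(), profile[-half:].sum()
    return 6 if bottom > top else 9

print(six_or_nine(SIX), six_or_nine(NINE))
```

This test presumes the glyph is presented upright in the image, which is exactly the information a rotation-invariant descriptor discards.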

Fig. 6. Some inputs used in numeral recognition

Fig. 7. Two views of a number plate

Fig. 8. Some inputs used in aircraft recognition

6.2 Aircraft Recognition

Most aircraft can be recognized from their boundaries. Hence, shape-based approaches to their recognition have received a lot of attention [24]. We considered [illegible] categories of aircraft for the experimental verification of the proposed algorithms. Using projective transformations, these planar contours were transferred to 10 views each. Some of the sample inputs are shown in Figure 8.

The polygonal approximation based recognition approach gave a recognition accuracy of [illegible]%, while the Fourier domain invariant gave a recognition accuracy of [illegible]%. The aircraft contour is complex and not amenable to easy polygonal approximation, so the Fourier domain invariant (which prefers complex contours) would be expected to perform better.

For the Fourier domain invariant recognition mechanism, we consider the ratio of the two highest singular values of the measurement matrix constructed from the views to be matched. When views of the same contour were matched, this ratio was more than 100. The ratio dropped below 10 when attempting to match views of dissimilar contours. The Fourier domain invariant can be extended to compute correspondence across views. Once correspondence is established, we can then compute the homography between the views.
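The matching decision described above reduces to a rank-1 test on the measurement matrix. The Python sketch below shows only that test; the construction of the measurement matrix itself (Section 4.2) is not reproduced, so synthetic matrices stand in for it, and the `threshold` parameter is an illustrative choice rather than a value prescribed by the method.

```python
import numpy as np

def singular_value_ratio(measurement):
    """Ratio of the largest to the second largest singular value."""
    s = np.linalg.svd(measurement, compute_uv=False)   # sorted in descending order
    return s[0] / max(s[1], np.finfo(float).tiny)      # guard against division by zero

def same_contour(measurement, threshold=100.0):
    """Accept the match when the matrix is numerically close to rank 1."""
    return singular_value_ratio(measurement) > threshold

# Synthetic stand-ins for measurement matrices (assumption, for illustration only).
rng = np.random.default_rng(0)
rank_one = np.outer(rng.standard_normal(4), rng.standard_normal(64))
rank_one += 1e-6 * rng.standard_normal((4, 64))        # small perturbation
unrelated = rng.standard_normal((4, 64))                # generic full-rank matrix

print(singular_value_ratio(rank_one), singular_value_ratio(unrelated))
```

A large gap between the two regimes, as reported for the aircraft contours, makes the decision insensitive to the exact threshold.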


6.3 Computing Correspondence Across Views

We can use a modified version of the invariant described in Section 4.2 to determine the shift $\lambda$ which, when applied to the boundary representation in view $l$, would align all corresponding points in affine views $l$ and $0$ (the reference view).

The measure $\kappa$ derived above correlates each vector Fourier coefficient with itself. The modified measure $\kappa^{\prime}_{p}$ correlates all frequency terms with a fixed coefficient $p$. We define $\kappa^{\prime l}_{p}$ as

\[
\kappa^{\prime l}_{p}[k] = \left( \bar{X}^{l}[k]^{*} \right)^{T} \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \bar{X}^{l}[p], \qquad p \neq 0.
\]

Expanding using $\bar{X}^{l}[k] = M \bar{X}^{0}[k]\, e^{j 2 \pi \lambda k / N}$ as before, we get

\[
\kappa^{\prime l}_{p}[k] = |M| \, \kappa^{\prime 0}_{p}[k] \, \exp\!\left( -\frac{j 2 \pi \lambda (k - p)}{N} \right) \tag{16}
\]

and

\[
\frac{\kappa^{\prime l}_{p}[k]}{\kappa^{\prime 0}_{p}[k]} = |M| \, \exp\!\left( -\frac{j 2 \pi \lambda (k - p)}{N} \right) \tag{17}
\]

Equation 17 states that the quotient series $\kappa^{\prime l}_{p} / \kappa^{\prime 0}_{p}$ is a complex sinusoid, whose inverse Fourier transform shows a distinct peak at the shift that aligns corresponding points. Figure 10 shows the magnitude spectrum of the IDFT of $\kappa^{\prime l}_{p} / \kappa^{\prime 0}_{p}$ against the shift for the views shown in Figure 9(a) and Figure 9(b). The IDFT magnitude spectrum peaks at 729, which is the correct shift value.
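The shift recovery can be sketched in a few lines of Python. The sketch assumes a particular convention that the garbled source does not let us confirm: view $l$ is generated as `view[n] = A @ reference[n + shift] + t`, the Fourier coefficients are taken after removing the centroid, and the fixed coefficient is `p = 1`. The synthetic contour and all function names are illustrative.

```python
import numpy as np

def fourier_coefficients(points):
    """DFT of an N x 2 contour after removing its centroid (the k = 0 term vanishes)."""
    centred = points - points.mean(axis=0)
    return np.fft.fft(centred, axis=0)                  # complex array of shape (N, 2)

def modified_measure(X, p):
    """conj(X[k])^T [[0, 1], [-1, 0]] X[p], evaluated for every k."""
    return np.conj(X[:, 0]) * X[p, 1] - np.conj(X[:, 1]) * X[p, 0]

def estimate_shift(view, reference, p=1):
    num = modified_measure(fourier_coefficients(view), p)
    den = modified_measure(fourier_coefficients(reference), p)
    valid = np.abs(den) > 1e-9 * np.abs(den).max()      # skip k = 0 and vanishing terms
    quotient = np.zeros_like(num)
    quotient[valid] = num[valid] / den[valid]           # complex sinusoid in k
    return int(np.argmax(np.abs(np.fft.ifft(quotient))))  # peak location = shift

# Synthetic irregular closed contour (assumption, for illustration only).
rng = np.random.default_rng(0)
N, true_shift = 256, 37
theta = 2 * np.pi * np.arange(N) / N
radius = 1 + 0.3 * np.cos(3 * theta) + 0.05 * rng.standard_normal(N)
reference = np.column_stack([radius * np.cos(theta), radius * np.sin(theta)])

# Affine view whose samples start 'true_shift' points further along the contour.
A = np.array([[1.3, 0.4], [-0.2, 0.8]])
view = np.roll(reference, -true_shift, axis=0) @ A.T + np.array([5.0, -2.0])

print(estimate_shift(view, reference))
```

The quotient is exactly sinusoidal here because the second view samples the same contour points; with independent sampling of real images the peak broadens but, as Figure 10 indicates, remains distinct.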

6.4 Computing Image-to-Image Homographies

Given two views of a planar shape, we can use the modified measure of Section 6.3 to compute the shift that aligns corresponding points in the two views. In practice, though, the sampling of the contour does not extract projections of the same points, and so we cannot obtain exactly corresponding points. A spatial domain approach that computes the homography by solving the set of equations derived from corresponding points would hence be prone to errors. We can solve the problem of homography computation more robustly in the Fourier domain by solving the linear equations in Equation 9. The homography relating views (a) and (b) of Figure 11 is projective. On computing an affine homography between the two contours and projecting view (a) into view (b), we see that the reprojected view is a very good approximation of the target view. The contours are shown in Figure 12.

Fig. 9. Two views of an aircraft related by affine image-to-image homographies

Fig. 10. Graph showing the amplitude of the IDFT of the quotient series of Equation 17 against the shift for views 9(a) and 9(b). The shift aligning corresponding points in these views is given by the location of the peak, which is 729.

Fig. 11. Two views of a road sign

Fig. 12. Computation of homography: the homography between views 11(a) and 11(b) was computed and used to project (a) into (b). The target view and the projected view are shown above
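A Fourier domain estimate of the affine homography can be sketched as a linear least-squares problem. Equation 9 is not reproduced here, so the sketch assumes the linear relation between the Fourier coefficients of two affine views, namely that the coefficients of the second view equal a real 2 x 2 matrix `A` applied to those of the reference, times a phase term due to the shift. The function name `affine_from_fourier` and the synthetic data are illustrative.

```python
import numpy as np

def affine_from_fourier(view, reference, shift):
    """Least-squares A (2 x 2) and t with view[n] ~ A @ reference[n + shift] + t."""
    N = len(reference)
    k = np.arange(1, N)                                   # k = 0 carries only translation
    Xv = np.fft.fft(view - view.mean(axis=0), axis=0)[1:]
    Xr = np.fft.fft(reference - reference.mean(axis=0), axis=0)[1:]
    Xv = Xv * np.exp(-2j * np.pi * k * shift / N)[:, None]  # undo the phase of the shift
    # Stack real and imaginary parts so that the unknown matrix stays real:
    # each row satisfies  Xr_row @ A.T = Xv_row.
    Cv = np.vstack([Xv.real, Xv.imag])
    Cr = np.vstack([Xr.real, Xr.imag])
    A_T, *_ = np.linalg.lstsq(Cr, Cv, rcond=None)
    A = A_T.T
    t = view.mean(axis=0) - A @ reference.mean(axis=0)    # centroids fix the translation
    return A, t

# Synthetic contour and affine view (assumption, for illustration only).
rng = np.random.default_rng(1)
N, shift = 128, 20
theta = 2 * np.pi * np.arange(N) / N
radius = 1 + 0.25 * np.cos(4 * theta) + 0.05 * rng.standard_normal(N)
reference = np.column_stack([radius * np.cos(theta), radius * np.sin(theta)])
A_true = np.array([[0.9, -0.5], [0.3, 1.1]])
t_true = np.array([2.0, 1.0])
view = np.roll(reference, -shift, axis=0) @ A_true.T + t_true

A_est, t_est = affine_from_fourier(view, reference, shift)
projected = np.roll(reference, -shift, axis=0) @ A_est.T + t_est
print(np.abs(projected - view).max())
```

Every non-zero frequency contributes an equation, so the estimate does not hinge on any individual pair of sampled points, which is the robustness argument made above.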

7 Conclusions and Future Work

We presented two approaches to recognizing discrete planar contours in this paper. The first involves computing a piecewise parametric representation of the contour. Specifically, we used polygonal approximation and piecewise conic approximation. The approximation is projectively invariant. Thus, the piecewise parametric representations generated from multiple views are isomorphic to each other. The second approach involves computing an invariant directly from the contour using a Fourier domain representation. The affine invariant we derived has excellent recognition properties and also works well on contours under reasonable projective transformations. We demonstrated the application of these approaches to real-life problems such as numeral recognition and aircraft recognition.

The two approaches are, in a sense, complementary. The Fourier domain approach requires sufficient frequency components in the contour and works well on irregular contours. It tends to perform poorly when the contour has many smooth sections. The piecewise parametric representation, on the other hand, works exceedingly well in these cases, but has difficulty approximating the contour by parametric sections if the contour is highly irregular. Together, they can provide excellent recognition of discrete planar contours in multiple views. We are presently working on an algorithm that combines the two approaches to obtain robust recognition of all types of contours.

References

[1] O. Faugeras, Q. Luong, The Geometry of Multiple Images, MIT Press, 2001.

[2] R. Hartley, A. Zisserman, Multiple View Geometry, Cambridge University Press, 2000.

[3] H. C. Longuet-Higgins, A Computer Algorithm for Reconstructing a Scene from two Projections, Nature 293 (1981) 133–135.

[4] R. Hartley, Lines and points in three views: An integrated approach, Proc. ARPA Image Understanding Workshop.

[5] A. Shashua, Algebraic Functions for Recognition, IEEE Trans. Pattern Anal. Machine Intelligence 16 (1995) 778–790.

[6] D. P. Huttenlocher, S. Ullman, Object Recognition using Alignment, Proc. International Conference on Computer Vision (1987) 102–111.

[7] T. Pavlidis, Structural Pattern Recognition, Springer-Verlag, New York, 1977.

[8] A. K. Jain, Fundamentals of Digital Image Processing, Prentice-Hall, 1989.

[9] S. Ullman, R. Basri, Recognition by Linear Combination of Models, IEEE Trans. Pattern Analysis and Machine Intelligence 13 (1991) 992–1006.

[10] C. Zahn, R. Roskies, Fourier Descriptors for Planar Curves, IEEE Trans. Comput. C-21.

[11] J. L. Mundy, A. Zisserman, Geometric Invariance in Computer Vision, MIT Press, 1992.

[12] L. J. V. Gool, T. Moons, E. Pauwels, A. Oosterlinck, Geometric Invariance in Computer Vision, MIT Press, 1992, Ch. Semi-Differential Invariants.

[13] K. Arbter, W. Snyder, H. Burkhardt, G. Hirzinger, Application of Affine-Invariant Fourier Descriptors to Recognition of 3D Objects, IEEE Trans. on Pattern Analysis and Machine Intelligence 12.

[14] S. Kuthirummal, C. V. Jawahar, P. J. Narayanan, Multiview constraints for recognition of planar curves in Fourier domain, Indian Conference on Computer Vision, Graphics and Image Processing (2002) 323–328.

[15] M. Pawan, S. Goyal, C. V. Jawahar, P. J. Narayanan, Polygonal approximation of closed curves across multiple views, Indian Conference on Computer Vision, Graphics and Image Processing (2002) 317–322.

[16] M. DeHaemer, Jr., M. J. Zyda, Simplification of objects rendered by polygonal approximations, Computers and Graphics 15 (2) (1991) 175–184.

[17] P. Shirley, A. A. Tuchman, Polygonal approximation to direct scalar volume rendering, in: Proceedings San Diego Workshop on Volume Visualization, Computer Graphics, Vol. 24:5, 1990, pp. 63–70.

[18] M. K. Leung, Y. Yang, Dynamic strip algorithm in curve fitting, in: Computer Vision, Graphics and Image Processing, Vol. 23, 1990, pp. 69–79.

[19] I. Anderson, J. Bezdek, Curvature and tangential deflection of discrete arcs, T-PAMI 6 (1984) 27–40.

[20] M. Salotti, An efficient algorithm for the optimal polygonal approximation of digitized curves, Pattern Recognition Letters 22 (2) (2001) 215–221.

[21] F. Bookstein, Fitting conic sections to scattered data, in: Computer Graphics and Image Processing, Vol. 9, 1979, pp. 56–71.

[22] R. Mohr, Projective Geometry and Computer Vision, World Scientific Pub., 1993.

[23] S. Kuthirummal, C. V. Jawahar, P. J. Narayanan, Planar Shape Recognition across Multiple Views, International Conference on Pattern Recognition.

[24] T. P. Wallace, P. A. Wintz, An Efficient Three-Dimensional Aircraft Recognition Algorithm using Normalized Fourier Descriptors, Computer Graphics and Image Processing 13 (2) (1980) 99–126.
