
Isometric Deformation Modelling for Object Recognition

Dirk Smeets*, Thomas Fabry**, Jeroen Hermans, Dirk Vandermeulen, and Paul Suetens

K.U. Leuven, Faculty of Engineering, Department of Electrical Engineering, Center for Processing Speech and Images, Medical Imaging Research Center, Universitair Ziekenhuis Gasthuisberg, Herestraat 49 bus 7003, B-3000 Leuven, Belgium

Abstract. We present two methods for isometrically deformable object recognition. The methods are built upon the use of geodesic distance matrices (GDM) as an object representation. The first method compares these matrices by using histogram comparisons. The second method is a modal approach. The largest singular values or eigenvalues appear to be an excellent shape descriptor, based on the comparison with other methods also using the isometric deformation model and a general baseline algorithm. The methods are validated using the TOSCA database of non-rigid objects and a rank 1 recognition rate of 100% is reported for the modal representation method using the 50 largest eigenvalues. This is clearly higher than other methods using an isometric deformation model.

1 Introduction

During the last decades, many developments in 3D modelling and 3D capturing techniques have increased the interest in the use of 3D objects for a number of applications. Examples of these are CAD/CAM, architecture, computer games, archaeology, medical applications and biometrics. Because of this growing use of 3D objects, we see the emergence of 3D databases, which leads to a new research question: 3D object retrieval. One sign of this is the yearly SHREC contest [1]. For the last three years, the 3D SHape REtrieval Contest has had the objective of evaluating the effectiveness of 3D shape retrieval algorithms.

Our contribution considers 3D object recognition that copes with non-rigid deformations in particular. A few examples of these kinds of deformations are the expression variations of a human face, the movement of different subparts of a fabrication robot or simply the movement of a walking human.

Based on the assumption that geodesic distances¹ remain approximately constant during natural non-rigid deformations, we propose a technique for non-rigid object recognition based on the geodesic distance matrix (GDM), a matrix summarizing all point-to-point geodesic distances on the object mesh.

* Corresponding author: [email protected]

** Joint first author.

¹ The geodesic distance between two points is the length of the shortest path on the object surface between those two points.

X. Jiang and N. Petkov (Eds.): CAIP 2009, LNCS 5702, pp. 757–765, 2009. © Springer-Verlag Berlin Heidelberg 2009


2 Related Work

Some 3D object recognition methods dealing with non-rigid objects and making use of the geodesic distance matrix can already be found in the literature. The one that received the most attention is probably the method of Elad and Kimmel [2]. Here, the GDM is computed using the fast marching on triangulated domains (FMTD) method. Then, the GDM is processed using the multidimensional scaling (MDS) approach, converting the non-rigid objects into rigid invariant signature surfaces. These can be compared using simpler algorithms for rigid matching. We will use an implementation of this method for comparison.

Another 3D object recognition method that shows some similarity to one method we propose here is the Geodesic Object Representation of Hamza and Krim [3]. Here, the shape descriptor is a global geodesic shape function. This shape function is defined in each point on the surface and measures the normalized integral of squared geodesic distances to other points on the surface. These global geodesic shape functions are then used to construct geodesic shape distributions. These are kernel density estimates (KDE) made of the (discretised) global geodesic shape functions of a particular object. For the actual recognition, these KDEs are compared using the Jensen-Shannon divergence. The similarity of this method to our modal representation lies in the use of the geodesic distance matrix, which, in the method of Hamza and Krim, is used for the computation of the geodesic shape functions.

Finally, a similar method to our modal representation approach is the method shown in Jain and Zhang's work [4]. This method measures the inter-object distance by taking the $\chi^2$-distance between the 20 largest eigenvalues of a weighted GDM. We will show that the weighting of the GDM has an adverse effect on the accuracy of the method.

3 Isometric Deformation Modelling

In mathematics, an isometry is a distance-preserving isomorphism between metric spaces. The basis of the isometric deformation model is therefore the invariance of distances measured along the surface, called geodesic distances. Consequently, an appropriate object representation to exploit the advantages of an isometric model is the geodesic distance matrix (GDM). We call $G$ a GDM for a particular object if $G = [g_{ij}]$, with $g_{ij}$ the geodesic distance between points $i$ and $j$ on the object surface. This matrix is symmetric and is defined up to an arbitrary permutation of the points on the represented object surface. Figure 1 shows a 3D object and the associated GDM.

Fig. 1. 3D mesh of an object (a) and its geodesic distance matrix representation (b)

For the calculation of the GDM, a fast marching algorithm for triangulated meshes is used [5]. The algorithm computes the distance of the shortest (discrete) path between each pair of surface points. The complexity of this computation is $O(n^2)$, with $n$ the dimension of the GDM. Besides the geodesic distance matrix ($G_1 = [g_{ij}]$), other affinity matrices closely related to the GDM are also examined: the squared GDM ($G_2 = [g_{ij}^2]$), the Gaussian weighted GDM ($G_3 = [\exp(-g_{ij}^2/(2\sigma^2))]$) and the increasing weighting function GDM ($G_4 = [(1 + \frac{1}{\sigma} g_{ij})^{-1}]$) [6].
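As an illustration of what the GDM and its weighted variants contain, the following minimal sketch builds them for a small mesh. It is not the fast marching method of [5]: it swaps in Dijkstra's algorithm restricted to mesh edges, which only approximates (and generally overestimates) geodesic distances. The function names and the value of `sigma` are hypothetical, and the weightings are applied element-wise.

```python
import heapq
import math


def geodesic_distance_matrix(vertices, edges):
    """Approximate GDM: shortest edge-path length between all vertex pairs.

    NOTE: Dijkstra on the edge graph is a stand-in for fast marching;
    paths are restricted to mesh edges, so distances are upper bounds.
    """
    n = len(vertices)
    adjacency = [[] for _ in range(n)]
    for i, j in edges:
        length = math.dist(vertices[i], vertices[j])
        adjacency[i].append((j, length))
        adjacency[j].append((i, length))

    gdm = []
    for source in range(n):
        dist = [math.inf] * n
        dist[source] = 0.0
        queue = [(0.0, source)]
        while queue:
            d, u = heapq.heappop(queue)
            if d > dist[u]:
                continue  # stale queue entry
            for v, w in adjacency[u]:
                if d + w < dist[v]:
                    dist[v] = d + w
                    heapq.heappush(queue, (d + w, v))
        gdm.append(dist)
    return gdm


def weighted_gdms(gdm, sigma):
    """Return G2, G3 and G4 computed element-wise from G1 = gdm."""
    g2 = [[g * g for g in row] for row in gdm]
    g3 = [[math.exp(-g * g / (2.0 * sigma ** 2)) for g in row] for row in gdm]
    g4 = [[1.0 / (1.0 + g / sigma) for g in row] for row in gdm]
    return g2, g3, g4
```

For a real mesh with about 3000 vertices, a dedicated fast marching implementation would be the more faithful choice.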

3.1 Multidimensional Scaling

Multidimensional scaling (MDS) is a technique that allows visualisation of the proximity between points with respect to some kind of dissimilarity (distance) measure matrix. For Euclidean distance matrix representations of a 3D object, three-dimensional MDS provides the configuration of the original object. In [7], MDS is applied on the GDM in order to obtain a configuration of points where pointwise Euclidean distances approximately equal the original pointwise geodesic distances. Figure 2 shows the resulting 2D and 3D configurations, called canonical forms, calculated using classical MDS.
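The text does not spell out the classical MDS computation. For reference, the standard textbook formulation is sketched here; the symbols $J$, $B$, $V_d$, $\Lambda_d$ and $X$ are introduced only for this illustration, and it is an assumption that [7] follows exactly this variant. With $[g_{ij}^2]$ the matrix of squared geodesic distances and $\mathbf{1}$ the all-ones vector of length $n$:

$$J = I - \tfrac{1}{n}\mathbf{1}\mathbf{1}^T, \qquad B = -\tfrac{1}{2}\, J\, [g_{ij}^2]\, J = V \Lambda V^T, \qquad X = V_d \Lambda_d^{1/2},$$

where $\Lambda_d$ holds the $d$ largest eigenvalues of $B$ (assumed positive), $V_d$ the corresponding eigenvectors, and the rows of $X$ are the points of the $d$-dimensional canonical form ($d = 2$ or $d = 3$ in Fig. 2). Because geodesic distances are in general not Euclidean, $B$ can have negative eigenvalues and the embedding is only approximate.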

Because the geodesic distances remain constant under isometric transformations, the GDM of an object is invariant with respect to isometric transformations up to an arbitrary, simultaneous permutation of rows and columns. The canonical forms, however, have the same shape regardless of this permutation. Therefore, objects can be compared by rigidly aligning the canonical forms and comparing the registration error.

Fig. 2. 2D (a) and 3D (b) canonical form of the same object as shown in Fig. 1

3.2 Histogram Comparison

We propose another way to compare deformable objects: by comparing histograms of the values contained in the geodesic distance matrices. The resulting representation is invariant to matrix permutations. Experiments were conducted with two kinds of histograms. The first are histograms calculated from all values in the upper triangle of the GDM. The second are histograms of mean geodesic distances per point. Examples of those histograms for the object in Fig. 1 are shown in Fig. 3. Other histogram variants are possible but are not considered here.
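The two histogram variants can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the shared bin range `upper`, the normalisation to unit sum and all function names are assumptions, since the text does not specify them.

```python
def histogram(values, n_bins, upper):
    """Normalised histogram of `values` over n_bins equal bins on [0, upper]."""
    counts = [0] * n_bins
    for v in values:
        # Clamp so that v == upper falls in the last bin.
        index = min(int(v / upper * n_bins), n_bins - 1)
        counts[index] += 1
    total = len(values)
    return [c / total for c in counts]


def all_value_histogram(gdm, n_bins, upper):
    """Histogram of every entry in the strict upper triangle of the GDM."""
    n = len(gdm)
    values = [gdm[i][j] for i in range(n) for j in range(i + 1, n)]
    return histogram(values, n_bins, upper)


def pwa_histogram(gdm, n_bins, upper):
    """Histogram of the point-wise averaged (PWA) geodesic distances."""
    n = len(gdm)
    values = [sum(row) / n for row in gdm]
    return histogram(values, n_bins, upper)
```

Both functions depend only on the multiset of matrix entries (or of row means), which is why a simultaneous permutation of rows and columns leaves the result unchanged.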

Fig. 3. Histograms of all (a) and point-wise averaged (PWA) (b) geodesic distances of the same object as shown in Fig. 1

The histograms $S_j$ ($j = 1, \ldots, n$), with $n$ the number of objects, can be thought of as $m$-dimensional vectors, with $m$ the number of bins. They can be compared with a plethora of dissimilarity measures. We have tested 8 different ones. Histograms can be compared using the Jensen-Shannon divergence [8]:

$$\mathrm{JSD}(S_1, S_2, \ldots, S_n) = H\Big(\sum_{j=1}^{n} \pi_j S_j\Big) - \sum_{j=1}^{n} \pi_j H(S_j), \qquad (1)$$

with $\pi_j$ the weight for the histogram vector $S_j$ and $H(S_j)$ the Shannon entropy, given by $H(S) = -\sum_{i=1}^{m} S_i \log_b S_i$. In this work only pair-wise comparisons are considered. Both histograms receive equal weighting ($\pi_1 = \pi_2 = 1/2$). The other dissimilarity measures need less explanation and are listed in Tab. 1.
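The pair-wise case of Eq. 1 with equal weights can be written out directly. The sketch below assumes histograms normalised to unit sum and uses base $b = 2$ as a default, which the text leaves unspecified; the function names are illustrative.

```python
import math


def shannon_entropy(hist, base=2.0):
    """H(S) = -sum_i S_i log_b S_i, with 0 log 0 taken as 0."""
    return -sum(p * math.log(p, base) for p in hist if p > 0.0)


def jensen_shannon_divergence(s1, s2, base=2.0):
    """Pair-wise JSD with equal weights pi_1 = pi_2 = 1/2."""
    mixture = [0.5 * a + 0.5 * b for a, b in zip(s1, s2)]
    return shannon_entropy(mixture, base) - 0.5 * (
        shannon_entropy(s1, base) + shannon_entropy(s2, base)
    )
```

With base 2 the divergence is 0 for identical histograms and reaches 1 for histograms with disjoint support.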

3.3 Modal Representation

A third approach for object comparison using the isometric model is based on a modal representation. Here, the information in the geodesic distance matrix is separated into a matrix that contains intrinsic shape information and a matrix with information about corresponding points. This is done with an eigenvalue decomposition (EVD) or a singular value decomposition (SVD) of the GDM. Both decompositions give similar results because the GDM is a symmetric matrix. The eigenvalues and singular values can be used as intrinsic shape descriptors, while the eigenvectors and singular vectors give information about correspondences. For numerical reasons, only the largest eigenvalues or singular values are computed.

Table 1. Dissimilarity measures

| Dissimilarity measure | Formula |
|---|---|
| Jensen-Shannon divergence | $D_1 = H(\tfrac{1}{2}S^k + \tfrac{1}{2}S^l) - (\tfrac{1}{2}H(S^k) + \tfrac{1}{2}H(S^l))$ |
| Mean normalized Manhattan distance | $D_2 = \sum_{i=1}^{m} \frac{2\lvert S_i^k - S_i^l\rvert}{S_i^k + S_i^l}$ |
| Mean normalized maximum norm | $D_3 = \max_i \frac{2\lvert S_i^k - S_i^l\rvert}{S_i^k + S_i^l}$ |
| Mean normalized absolute difference of square root vectors | $D_4 = \sum_{i=1}^{m} \frac{2\lvert\sqrt{S_i^k} - \sqrt{S_i^l}\rvert}{\sqrt{S_i^k} + \sqrt{S_i^l}}$ |
| Correlation | $D_5 = 1 - \frac{S^k \cdot S^l}{\lVert S^k\rVert \lVert S^l\rVert}$ |
| Euclidean distance | $D_6 = \sqrt{\sum_{i=1}^{m} (S_i^k - S_i^l)^2}$ |
| Normalized Euclidean distance | $D_7 = \sqrt{\sum_{i=1}^{m} (S_i^k - S_i^l)^2/\sigma_i^2}$ |
| Mahalanobis distance | $D_8 = \sqrt{\sum_{i=1}^{m} (S^k - S^l)^T \operatorname{cov}(S)^{-1} (S^k - S^l)}$ |
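Several of the measures in Tab. 1 translate directly into code. The sketch below covers $D_2$, $D_4$, $D_5$ and $D_6$ for two descriptor vectors of equal length. Skipping terms whose denominator is zero is an assumption (the text does not say how empty bins are handled), $D_4$ assumes non-negative entries, and the function names are illustrative.

```python
import math


def d2_normalized_manhattan(sk, sl):
    """D2: sum_i 2|Sk_i - Sl_i| / (Sk_i + Sl_i), skipping zero denominators."""
    return sum(2.0 * abs(a - b) / (a + b) for a, b in zip(sk, sl) if a + b > 0.0)


def d4_sqrt_difference(sk, sl):
    """D4: the D2 formula applied to element-wise square roots."""
    total = 0.0
    for a, b in zip(sk, sl):
        ra, rb = math.sqrt(a), math.sqrt(b)
        if ra + rb > 0.0:
            total += 2.0 * abs(ra - rb) / (ra + rb)
    return total


def d5_correlation(sk, sl):
    """D5: one minus the cosine of the angle between the two vectors."""
    dot = sum(a * b for a, b in zip(sk, sl))
    norm_k = math.sqrt(sum(a * a for a in sk))
    norm_l = math.sqrt(sum(b * b for b in sl))
    return 1.0 - dot / (norm_k * norm_l)


def d6_euclidean(sk, sl):
    """D6: Euclidean distance between the two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(sk, sl)))
```

$D_7$ and $D_8$ additionally need per-component variances or a covariance matrix estimated over the whole set of descriptors, so they are left out of this sketch.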

Because we do not know anything about the order of the points, $G$ and all possible simultaneous permutations of rows and columns of $G$ determine the configuration of the object. Let $P$ be an arbitrary permutation matrix, such that $G' = PGP^T$ is a GDM with rows and columns permuted, and $G = U\Sigma V^T$ a singular value decomposition, then

$$G' = PGP^T = PU\Sigma(PV)^T. \qquad (2)$$

Because $PU$ and $PV$ remain unitary matrices and $\Sigma$ is still a diagonal matrix with non-negative real numbers on the diagonal, the right-hand side of Eq. 2 is a valid singular value decomposition of $G'$. A common convention is to sort the singular values in non-increasing order. In this case, the diagonal matrix $\Sigma$ is uniquely determined by $G'$. Therefore, $\Sigma = \Sigma'$, with $\Sigma'$ the singular value matrix of $G'$.
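The unitarity claim follows in one line from the fact that a permutation matrix satisfies $P^TP = I$. Written out for the real-valued case considered here:

$$(PU)^T(PU) = U^T P^T P\, U = U^T U = I, \qquad (PV)^T(PV) = V^T P^T P\, V = V^T V = I,$$

and expanding the right-hand side of Eq. 2 recovers $PU\Sigma(PV)^T = P\,(U\Sigma V^T)\,P^T = PGP^T = G'$.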

From this, we can see that $\Sigma$ contains the intrinsic information about geometry, while $U$ and $V$ contain the information about correspondences between points. This justifies our approach of object recognition using $S = \{\sigma_1, \sigma_2, \ldots, \sigma_k\}$, with $\sigma_1, \sigma_2, \ldots, \sigma_k$ the first $k$ singular values of the GDM, as a shape descriptor. As such, the computational complexity of the singular value calculation is limited to $O(k \cdot n^2)$, with $n$ the dimension of the GDM.
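The permutation invariance of the descriptor can be checked numerically. The following sketch assumes NumPy is available and uses a synthetic symmetric matrix with zero diagonal instead of a GDM from a real mesh; `modal_descriptor` is a hypothetical name. For simplicity it computes the full SVD, so it does not achieve the $O(k \cdot n^2)$ cost mentioned above, which would require an iterative solver for the leading singular values.

```python
import numpy as np


def modal_descriptor(gdm, k):
    """Return the k largest singular values of the GDM, in non-increasing order."""
    singular_values = np.linalg.svd(np.asarray(gdm, dtype=float), compute_uv=False)
    return singular_values[:k]


# Toy check on synthetic data (not a mesh): symmetric, zero diagonal.
rng = np.random.default_rng(0)
a = rng.random((6, 6))
G = a + a.T
np.fill_diagonal(G, 0.0)

perm = rng.permutation(6)
P = np.eye(6)[perm]        # permutation matrix
G_permuted = P @ G @ P.T   # rows and columns permuted simultaneously

# The descriptor should agree up to floating-point error.
same = np.allclose(modal_descriptor(G, 3), modal_descriptor(G_permuted, 3))
```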

For comparing these singular value vectors, we can use the same dissimilarity measures as we described in Sect. 3.2 (see Tab. 1).


4 Experimental Validation

To examine the deviation from the isometric deformation assumption in a realistic situation, we looked at the change in geodesic distance between four fingertips in three different configurations of a hand. This results in a mean coefficient of variation (CV) of 5.3% for the geodesic distances, while the CV for Euclidean distances is equal to 27.6%.
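The mean coefficient of variation can be computed as sketched below: for each pair of fingertips, the CV is the standard deviation of that pair's distance over the hand configurations divided by its mean, and the per-pair CVs are then averaged. Whether the population or the sample standard deviation was used is not stated; the population form is assumed here, and the function names are illustrative.

```python
import statistics


def coefficient_of_variation(samples):
    """CV = standard deviation / mean (population standard deviation assumed)."""
    return statistics.pstdev(samples) / statistics.fmean(samples)


def mean_cv(distances_per_pair):
    """Average the CV over all point pairs.

    distances_per_pair[p] lists the distance of pair p in each configuration.
    """
    return statistics.fmean(
        coefficient_of_variation(samples) for samples in distances_per_pair
    )
```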

For the validation of the three proposed object recognition approaches, we use the TOSCA database [9]. This database consists of various 3D non-rigid shapes in a variety of poses and is intended for non-rigid shape similarity and correspondence experiments. We use 133 objects, i.e. 9 cats, 11 dogs, 3 wolves, 17 horses, 21 gorillas, 1 shark, 24 female figures, and two different male figures, containing 15 and 20 poses. Each object contains approximately 3000 vertices.

We compare the three GDM-based methods with a baseline algorithm: the standard iterative closest point (ICP) algorithm [10]. This is a well-known and extensively used rigid object registration method that minimizes the sum of squared Euclidean distances between closest points. After rigid registration the objects can be compared using the value of the employed registration objective function.

After roughly tuning the parameters, we used 100 bins for the histogram comparison with all values and 80 bins for the pointwise averaged (PWA) value histogram comparison. These numbers were determined experimentally.

The different approaches are validated using standard recognition experiments, i.e. the verification and the identification scenario. The performance in those scenarios is measured with the receiver operating characteristic (ROC) curve and the cumulative matching curve (CMC), respectively. The former is a curve plotting the false rejection rate (FRR) against the false acceptance rate (FAR), while the latter gives the recognition rate for several ranks. These curves can be found in Fig. 4. Here, we plotted the best combination of GDM weighting, dissimilarity measure and, for the modal representation approach, the optimal number of eigenvalues (see below).

The equal error rate (EER) and rank 1 recognition rate (R1RR) are characteristic points on the ROC and CMC, respectively. These are tabulated in Tab. 2.
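Both summary figures can be derived from a matrix of pair-wise dissimilarities and the object class labels. The sketch below is one plausible protocol, not necessarily the one used for Tab. 2: it assumes a leave-one-out identification setting in which each object is matched against all others, and it approximates the EER by the threshold at which FAR and FRR are closest. The function names are illustrative.

```python
def rank1_recognition_rate(dissimilarity, labels):
    """Fraction of probes whose nearest other object shares their label."""
    n = len(labels)
    hits = 0
    for probe in range(n):
        others = [j for j in range(n) if j != probe]
        nearest = min(others, key=lambda j: dissimilarity[probe][j])
        hits += labels[nearest] == labels[probe]
    return hits / n


def equal_error_rate(dissimilarity, labels):
    """Approximate EER: the mean of FAR and FRR where they are closest."""
    n = len(labels)
    genuine, impostor = [], []
    for i in range(n):
        for j in range(i + 1, n):
            target = genuine if labels[i] == labels[j] else impostor
            target.append(dissimilarity[i][j])

    best = None
    for t in sorted(set(genuine + impostor)):
        frr = sum(g > t for g in genuine) / len(genuine)     # genuine pairs rejected
        far = sum(s <= t for s in impostor) / len(impostor)  # impostor pairs accepted
        if best is None or abs(far - frr) < best[0]:
            best = (abs(far - frr), (far + frr) / 2.0)
    return best[1]
```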

Figure 5 plots the R1RR against the number of eigenvalues (logarithmic scale) used in the shape descriptor. A plateau of maximum recognition is observed for shape descriptors using a number of eigenvalues between 35 and 430.

In Tab. 3, the different dissimilarity measures are compared, showing that the best results are obtained with the mean normalized absolute difference of square root vectors of the 50 largest eigenvalues.

Table 2. Results of different isometric deformation model methods on the TOSCA database

| Experiment | R1RR | EER |
|---|---|---|
| MDS | 39.34% | 29.49% |
| Histogram of PWA values | 63.11% | 16.93% |
| Histogram of all values | 72.13% | 14.90% |
| Modal representation | 100.0% | 2.43% |
| ICP | 35.29% | 40.07% |

Fig. 4. Results of standard validation experiments on the TOSCA database with CMC (a), plotting recognition rate [%] against rank, and ROC (b), plotting FRR against FAR. Object recognition with a baseline algorithm (thin solid line) is compared to object recognition using MDS (dash-dot line), histogram comparison of PWA (dotted line) and all values (dashed line) and modal representation (thick solid line).

Fig. 5. The R1RR [%] is plotted against the number of eigenvalues (in log scale) used in the shape descriptor

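Table 3. Comparison of different dissimilarity measures as defined in Tab. 1

| Method | Measure | D1 | D2 | D3 | D4 | D5 | D6 | D7 | D8 |
|---|---|---|---|---|---|---|---|---|---|
| PWA value histogram comparison | R1RR | 45.08% | 54.92% | 45.08% | 54.92% | 46.72% | 56.56% | 63.11% | 20.49% |
| PWA value histogram comparison | EER | 18.68% | 15.83% | 25.31% | 15.69% | 34.68% | 23.13% | 16.93% | 42.07% |
| All value histogram comparison | R1RR | 67.21% | 69.67% | 47.54% | 69.67% | 58.20% | 72.13% | 66.39% | 20.49% |
| All value histogram comparison | EER | 14.95% | 15.26% | 21.01% | 15.26% | 19.63% | 14.90% | 16.94% | 48.37% |
| Modal representation | R1RR | 84.43% | 100.0% | 85.25% | 100.0% | 54.92% | 76.23% | 97.54% | 33.79% |
| Modal representation | EER | 10.11% | 2.43% | 10.09% | 2.44% | 20.33% | 10.74% | 7.74% | 34.18% |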

To show the influence of different weightings of the GDM, we also tabulate the rank 1 recognition rate and the equal error rate for the different weighting functions as proposed in Sect. 3. This can be found in Tab. 4. The abbreviations used are the ones introduced in Sect. 3. We can clearly see that every weighting reduces the accuracy of both methods, sometimes quite drastically.

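Table 4. Comparison of different weighting functions of the GDM as defined in Sect. 3

| Method | Measure | G1 | G2 | G3 | G4 |
|---|---|---|---|---|---|
| All value histogram comparison | R1RR | 72.13% | 70.49% | 70.49% | 69.67% |
| All value histogram comparison | EER | 14.90% | 14.01% | 14.14% | 15.42% |
| Modal representation | R1RR | 100.0% | 97.54% | 71.31% | 90.98% |
| Modal representation | EER | 2.43% | 3.47% | 17.79% | 12.25% |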

All results clearly show that the modal representation of the geodesic distance matrices provides the highest performance. We also note that all methods using geodesic distance matrices perform better than the baseline algorithm.

5 Conclusions

In this article, different methods using geodesic distance matrices are compared. Amongst all the representations and methods, the modal approach outperforms the other methods: a geodesic histogram based representation, the MDS approach and the baseline ICP algorithm. For the TOSCA database a rank 1 recognition rate of 100% is obtained.

As future work, we propose to further exploit the modal decomposition method in order to obtain correspondences between different objects. This can be done using the eigenvectors or singular vectors, based on the method of Shapiro and Brady [11].

References

1. AIM@SHAPE: SHREC - 3D shape retrieval contest, http://www.aimatshape.net/event/SHREC

2. Elad, A., Kimmel, R.: On bending invariant signatures for surfaces. IEEE Transactions on Pattern Analysis and Machine Intelligence 25(10), 1285–1295 (2003)

3. Hamza, A.B., Krim, H.: Geodesic object representation and recognition. In: Nystrom, I., Sanniti di Baja, G., Svensson, S. (eds.) DGCI 2003. LNCS, vol. 2886, pp. 378–387. Springer, Heidelberg (2003)

4. Jain, V., Zhang, H.: A spectral approach to shape-based retrieval of articulated 3D models. Computer-Aided Design 39(5), 398–407 (2007)

5. Peyre, G., Cohen, L.D.: Heuristically driven front propagation for fast geodesic extraction. Intl. Journal for Computational Vision and Biomechanics 1(1), 55–67

6. Carcassoni, M., Hancock, E.R.: Spectral correspondence for point pattern matching. Pattern Recognition 36, 193–204 (2003)

7. Bronstein, A.M., Bronstein, M.M., Kimmel, R.: Expression-invariant 3D face recognition. In: Kittler, J., Nixon, M.S. (eds.) AVBPA 2003. LNCS, vol. 2688, pp. 62–69. Springer, Heidelberg (2003)

8. Lin, J.: Divergence measures based on the Shannon entropy. IEEE Transactions on Information Theory 37(1), 145–151 (1991)

9. Bronstein, A., Bronstein, M., Kimmel, R.: Numerical Geometry of Non-Rigid Shapes. Springer, Heidelberg (2008)

10. Besl, P.J., McKay, H.D.: A method for registration of 3-D shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence 14(2), 239–256 (1992)

11. Shapiro, L.S., Brady, J.M.: Feature-based correspondence: an eigenvector approach. Image Vision Comput. 10(5), 283–288 (1992)