generalized elastic graph matching for face recognition



www.elsevier.com/locate/patrec

Pattern Recognition Letters 28 (2007) 1077–1082

Generalized elastic graph matching for face recognition

Hochul Shin a,*, Seong-Dae Kim a, Hae-Chul Choi b

a Division of Electrical Engineering, Department of Electrical Engineering and Computer Science, Korea Advanced Institute of Science and Technology (KAIST), 373-1 Guseong-dong, Yuseong-gu, Daejeon 305-701, Republic of Korea

b Broadcasting Media Research Group, Electronics and Telecommunications Research Institute (ETRI), 161 Gajeong-dong, Yuseong-gu, Daejeon 305-701, Republic of Korea

Received 3 November 2005; received in revised form 2 January 2007

Available online 17 January 2007

Communicated by T.K. Ho

Abstract

Elastic graph matching (EGM) is a well-known approach to face recognition that is robust to rotation in depth and to changes in facial expression. We extend the conventional EGM to a generalized EGM (G-EGM) that can handle even globally warped faces, by making the node descriptors more robust to global warping and by introducing warping-compensated edges into the graph matching cost function. The improved performance of the G-EGM was evaluated through recognition simulations on arbitrarily posed faces.

© 2007 Elsevier B.V. All rights reserved.

Keywords: Face recognition; Face alignment; Elastic graph matching

1. Introduction

Although the elastic graph matching-based approaches of Lades et al. (1993), Wiskott et al. (1997), Duc et al. (1999), Lyons et al. (1999) and Tefas et al. (2001) are good at recognizing faces that are rotated in depth or changed in facial expression, they do not address changes in face scale or rotation about the optical axis. These were assumed to be compensated beforehand by fixing the camera view or by a face alignment preprocessing step. In real environments, however, it is very hard to compensate global warps between faces completely, and the resulting uncontrolled global misalignment seriously lowers recognition performance. Even if faces are acquired in a very controlled environment and a powerful face alignment technique is available, the face matching and classification modules that follow the alignment stage should be able to absorb errors that the alignment left uncorrected, which is a desirable function for a high-end face recognition system.

0167-8655/$ - see front matter © 2007 Elsevier B.V. All rights reserved.

doi:10.1016/j.patrec.2007.01.003

* Corresponding author. Tel.: +82 42 869 5430; fax: +82 42 869 8570.

E-mail addresses: [email protected] (H. Shin), [email protected] (S.-D. Kim), [email protected] (H.-C. Choi).

With this background, we enhanced the previous EGM so that it can handle even unexpected global misalignments between faces, by introducing a robust local descriptor and a generalized elastic matching cost. The proposed local descriptor for labeling graph nodes is a feature set that is more robust to global distortions in scale and rotation than plain Gabor wavelet coefficients. The proposed elastic matching cost function has a modified form that compensates for a global affine warp between the matched faces. The generalized EGM (G-EGM) is an EGM based on this robust jet and this warping-robust elastic matching cost. Unlike previous methods, the G-EGM must search simultaneously for the global warping parameters and the local deformations between the matched face graphs, so we also introduce an effective and stable cost optimization strategy for this elastic matching process.

The well-known FERET face database (Phillips et al., 2000) includes a variety of faces that are arbitrarily rotated, scale-changed, and tilted in pose. To evaluate how effectively our method improves face recognition performance, we made two face sets from a subset of the FERET database that differ in their alignment conditions: one set consists of faces manually aligned according to their eye and nose positions, and the other consists of faces that received no alignment processing. Compared with the conventional EGM, the proposed method showed better performance on the manually pre-aligned faces and, in particular, clearly superior performance on the unaligned faces.

2. Elastic graph matching

A graph on a face is a set of $N$ nodes, $\{z_i = (x_i, y_i)\}_{i=1}^{N}$, together with a local descriptor $j(z_i)$ that labels each node. This descriptor, called a jet, is a feature vector extracted by a tool such as the Gabor wavelet transformation (GWT). Elastic matching between a reference face and a test face means, in practice, finding the optimal test graph, $\{z_i^t\}_{i=1}^{N}$, that minimizes the elastic matching cost for the given reference graph, $\{z_i^r\}_{i=1}^{N}$, and jets, $\{j^r(z_i)\}_{i=1}^{N}$. The cost function for EGM is given as

$$\left(\{z_i^t\}_{i=1}^{N}\right)_{opt.} = \arg\min_{\{z_i^t\}_{i=1}^{N}} \sum_{i=1}^{N} \left[ -S_{node}\left(j^t(z_i^t), j^r(z_i^r)\right) + \lambda \sum_{j \in E_i} S_{edge}\left(e_{ij}^t, e_{ij}^r\right) \right], \qquad (1)$$

where $e_{ij}^t = z_i^t - z_j^t$ and $e_{ij}^r = z_i^r - z_j^r$ are called edges and represent the linkage between nodes. $j^t(z_i^t)$ is the jet extracted from the test image at the node position $z_i^t$, and $E_i$ denotes the set of indexes of the nodes in the neighborhood of the $i$th node. During the minimization of the cost, every node of the test graph is driven toward the position whose jet is most similar to the reference jet by maximizing the node similarities, $S_{node}$, while the structure of the test graph is preserved by minimizing the edge distortions, $S_{edge}$. Here $\lambda$ controls the elasticity of the test graph. Generally, $S_{edge}$ and $S_{node}$ are given as

$$S_{edge}\left(e_{ij}^t, e_{ij}^r\right) = \left\| e_{ij}^t - e_{ij}^r \right\|. \qquad (2)$$

$$S_{node}\left(j^t(z_i^t), j^r(z_i^r)\right) = \frac{j^t(z_i^t) \cdot j^r(z_i^r)}{\left\| j^t(z_i^t) \right\| \left\| j^r(z_i^r) \right\|}. \qquad (3)$$
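The cost in (1) to (3) can be evaluated directly once node positions, jets, and neighborhoods are available. The following is a minimal sketch, not the authors' implementation: the function names, the toy three-node graph, the jet values, and the choice `lam=0.5` are all assumptions made for illustration.

```python
import numpy as np

def node_similarity(jet_t, jet_r):
    """Eq. (3): normalized dot product between a test jet and a reference jet."""
    jet_t = np.asarray(jet_t, dtype=float)
    jet_r = np.asarray(jet_r, dtype=float)
    return float(jet_t @ jet_r / (np.linalg.norm(jet_t) * np.linalg.norm(jet_r)))

def edge_distortion(z_t, z_r, i, j):
    """Eq. (2): norm of the difference between the test and reference edges."""
    e_t = z_t[i] - z_t[j]
    e_r = z_r[i] - z_r[j]
    return float(np.linalg.norm(e_t - e_r))

def egm_cost(z_t, z_r, jets_t, jets_r, neighbors, lam):
    """Objective of Eq. (1) for one candidate test graph z_t."""
    total = 0.0
    for i in range(len(z_t)):
        total -= node_similarity(jets_t[i], jets_r[i])
        total += lam * sum(edge_distortion(z_t, z_r, i, j) for j in neighbors[i])
    return total

# Hypothetical toy graph: three nodes, fully connected.
z_r = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
jets = np.array([[1.0, 2.0], [3.0, 1.0], [2.0, 2.0]])

cost_identical = egm_cost(z_r, z_r, jets, jets, neighbors, lam=0.5)
cost_shifted = egm_cost(z_r + 5.0, z_r, jets, jets, neighbors, lam=0.5)

z_bent = z_r.copy()
z_bent[1] += np.array([0.5, 0.5])   # move one node only
cost_bent = egm_cost(z_bent, z_r, jets, jets, neighbors, lam=0.5)
```

With identical jets, a pure translation of the test graph leaves the cost unchanged because edges are differences of node positions, whereas moving a single node raises the edge term. A global rotation or scaling would also change every edge, which is the limitation the generalized cost function addresses.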

For a given face image, $I(z)$, the typical jet at $z_0$, presented by Lades et al. (1993), is a vector of the magnitudes of the Gabor wavelet coefficients calculated by convolving the image with Gabor kernels centered at $z_0$, i.e.,

$$j_{\mu,\nu}(z_0) = I(z) * \psi_{\mu,\nu}(z - z_0), \qquad (4)$$

$$j(z_0)_{magnitude} = \left( |j_{0,0}|, |j_{1,0}|, \ldots, |j_{L-1,M-1}| \right), \qquad (5)$$

where $j_{\mu,\nu}$ is a simplified notation for $j_{\mu,\nu}(z_0)$. The kernels of the GWT commonly used for extracting jets are given as

$$\psi_{\mu,\nu}(z) = \frac{\|k_{\mu,\nu}\|^2}{\sigma^2} \exp\left( -\frac{\|k_{\mu,\nu}\|^2 \|z\|^2}{2\sigma^2} \right) \left( e^{i k_{\mu,\nu} z} - e^{-\sigma^2/2} \right), \qquad (6)$$

where $\mu = 0, \ldots, L-1$, $\nu = 0, \ldots, M-1$ and $z = (x, y)$. $L$ denotes the number of directions and $M$ is the number of scales. Normally, $L$ is set to 6 or 8, and $M$ to 3 or 5. The characteristics of the Gabor kernels are determined by the wave vector, which is defined as

$$k_{\mu,\nu} = \left( k_{max} / f^{\nu} \right) \exp(i \pi \mu / L). \qquad (7)$$

$k_{max}$ is the maximum frequency and $f$ is the spacing factor between kernels in the frequency domain.
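The kernel bank of (6) and (7) and the magnitude jet of (4) and (5) can be sketched as follows. This is a minimal illustration under stated assumptions: the values `k_max = pi/2`, `f = sqrt(2)`, `sigma = 2*pi`, the 33 × 33 kernel support, and the random test image are not given in the text, and the response at $z_0$ is computed as the inner product of the image patch with the kernel sampled around $z_0$, following the form of (4) (a strict convolution would flip the kernel).

```python
import numpy as np

def wave_vector(mu, nu, L=8, k_max=np.pi / 2, f=np.sqrt(2)):
    """Eq. (7): wave vector for orientation mu and scale nu, returned as (kx, ky)."""
    k = (k_max / f**nu) * np.exp(1j * np.pi * mu / L)
    return np.array([k.real, k.imag])

def gabor_kernel(mu, nu, L=8, size=33, sigma=2 * np.pi):
    """Eq. (6): complex Gabor kernel sampled on a size x size pixel grid."""
    k = wave_vector(mu, nu, L=L)
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    k2 = k @ k
    envelope = (k2 / sigma**2) * np.exp(-k2 * (x**2 + y**2) / (2 * sigma**2))
    # Oscillating carrier minus the DC-compensation term of Eq. (6).
    carrier = np.exp(1j * (k[0] * x + k[1] * y)) - np.exp(-sigma**2 / 2)
    return envelope * carrier

def magnitude_jet(image, z0, L=8, M=5, size=33):
    """Eqs. (4)-(5): magnitudes of the L*M Gabor responses at z0 = (x0, y0)."""
    x0, y0 = z0
    half = size // 2
    patch = image[y0 - half:y0 + half + 1, x0 - half:x0 + half + 1]
    # mu varies fastest, matching the element ordering of Eq. (5).
    return np.array([abs(np.sum(patch * gabor_kernel(mu, nu, L=L, size=size)))
                     for nu in range(M) for mu in range(L)])

rng = np.random.default_rng(0)
image = rng.random((64, 64))          # stand-in for a face image
jet = magnitude_jet(image, (32, 32))  # node must lie at least size//2 from the border
```

The sketch recomputes each kernel on every call for clarity; a practical implementation would precompute the bank once or filter the whole image in the frequency domain.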

3. Robust jet

The robust jet is also based on Gabor wavelet coefficients, but it is made more robust to rotation and scaling. To generate a robust jet, the extracted Gabor wavelet coefficients are first arranged into an $L \times M$ Gabor-wavelet-coefficient matrix, i.e.,

$$\Phi = \begin{bmatrix} j_{0,0} & \cdots & j_{0,M-1} \\ \vdots & \ddots & \vdots \\ j_{L-1,0} & \cdots & j_{L-1,M-1} \end{bmatrix} = \left[ j_{\mu,\nu} \right]_{\mu = 0, \ldots, L-1;\ \nu = 0, \ldots, M-1}. \qquad (8)$$

Next, an $L \times M$ 2D discrete Fourier transform (DFT) is applied to the Gabor matrix,

$$\Omega = \text{2D-DFT}_{L \times M}(\Phi) = \left[ c_{\mu,\nu} \right]_{\mu = 0, \ldots, L-1;\ \nu = 0, \ldots, M-1}. \qquad (9)$$

The proposed robust jet is then given as the vector of the magnitudes of the elements of $\Omega$,

$$j(z_0)_{robust} = \left( |c_{0,0}|, |c_{1,0}|, \ldots, |c_{L-1,M-1}| \right) / \rho, \qquad (10)$$

where $\rho = \left( \sum_{\mu=0}^{L-1} \sum_{\nu=0}^{M-1} |c_{\mu,\nu}|^2 \right)^{1/2}$.

To show why these steps give jets additional robustness to rotation and scaling, assume that an image, $I(z)$, is scaled and rotated, and that the GWT is applied to the distorted image:

$$j'_{\mu,\nu}(z_0) = I\left( s R_{\theta}(z - z_0) + z_0 \right) * \psi_{\mu,\nu}(z - z_0). \qquad (11)$$

$R_{\theta}$ denotes the rotation matrix for angle $\theta$ and $s$ is a scaling factor. Equation (11) can also be expressed through the inverse rotation and scaling of the Gabor kernels, i.e.,

$$j'_{\mu,\nu}(z_0) = I(p) * \psi_{\mu,\nu}\left( \frac{1}{s} R_{-\theta}(p - z_0) \right), \qquad (12)$$

where $p = s R_{\theta}(z - z_0) + z_0$. This means that rotating and scaling an image is equivalent to applying the opposite rotation and scaling to the Gabor kernels. In addition, as an important property of the GWT, it is easily derived from (4) that the rotation and scaling of the kernels can be compensated by shifts of $\mu$ and $\nu$, i.e.,

$$\psi_{\mu,\nu}\left( \frac{1}{s} R_{-\theta}(p - z_0) \right) = s^2 \psi_{\mu + (L\theta/\pi),\ \nu + (\ln s / \ln f)}(p - z_0). \qquad (13)$$

By merging (12) and (13),

$$j'_{\mu,\nu}(z_0) = I(p) * \left( s^2 \psi_{\mu + (L\theta/\pi),\ \nu + (\ln s / \ln f)}(p - z_0) \right), \qquad (14)$$

$$j'_{\mu,\nu}(z_0) = s^2 j_{\mu + \mu',\ \nu + \nu'}(z_0), \qquad (15)$$

where $\mu' = L\theta/\pi$ and $\nu' = \ln s / \ln f$. In the Gabor-wavelet-coefficient matrix of (8), these shifts of the kernel indexes appear as a magnitude scaling, a circular translation along the $\mu$-axis, and a simple translation along the $\nu$-axis of all elements. Therefore, the effects of image rotation and scaling are removed or minimized by the following sequence: (1) applying the 2D DFT to $\Phi$, (2) taking the magnitudes of the elements of $\Omega$, and (3) normalizing the magnitude of the jet. This works because the magnitude of a DFT coefficient is invariant to shifts of the input.
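The three-step sequence of (8) to (10) is short enough to sketch directly. The example below is an illustration, not the authors' code: the function name `robust_jet`, the random coefficient matrix, and the chosen rotation and scale are assumptions. It models a rotation by $\theta = \pi/4$ with $L = 8$ as a circular shift of two rows (since $L\theta/\pi = 2$) and a scaling as the factor $s^2$ from (15).

```python
import numpy as np

def robust_jet(phi):
    """Eqs. (9)-(10): normalized 2D-DFT magnitudes of the L x M Gabor matrix phi."""
    omega = np.fft.fft2(phi)            # Eq. (9)
    mags = np.abs(omega)                # magnitudes discard the shift-induced phase
    rho = np.sqrt(np.sum(mags**2))      # normalization constant of Eq. (10)
    return (mags / rho).flatten(order="F")  # mu varies fastest, as in Eq. (10)

L, M = 8, 5
rng = np.random.default_rng(1)
phi = rng.normal(size=(L, M)) + 1j * rng.normal(size=(L, M))  # stand-in coefficients

s = 1.3                                           # hypothetical scale factor
phi_warped = s**2 * np.roll(phi, 2, axis=0)       # circular shift along the mu-axis

jet_original = robust_jet(phi)
jet_warped = robust_jet(phi_warped)
```

Here `jet_original` and `jet_warped` agree to numerical precision, because a circular shift only changes the phase of the DFT and the factor $s^2$ is cancelled by $\rho$. The shift along the $\nu$-axis is a plain rather than circular translation, so for scale changes the invariance is only approximate, which is consistent with the text's wording "removed or minimized".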

4. Generalized elastic matching cost function

By inserting global warping parameters into the elastic matching cost function, the conventional EGM is generalized so that it can handle even an affine warp of the graph. Our generalized elastic matching cost function includes a $2 \times 2$ warping matrix, $A$, and is given as

$$\left( \{z_i^t\}_{i=1}^{N}, A \right)_{opt.} = \arg\min_{\{z_i^t\}_{i=1}^{N}} \sum_{i=1}^{N} \left[ -S_{node}\left(j^t(z_i^t), j^r(z_i^r)\right) + \lambda \sum_{j \in E_i} S_{edge}\left(d_{ij}^t, e_{ij}^r\right) \right], \qquad (16)$$

where $d_{ij}^t = A^{-1} e_{ij}^t = A^{-1}(z_i^t - z_j^t)$. Instead of the simple edge, $e_{ij}^t$, the warping-compensated edge, $d_{ij}^t$, is used in our cost function. Because the proposed cost measures the structural distortion between the reference graph and the warping-compensated test graph, it can adapt stably to a wider variety of target face poses without the help of any pre-alignment process. Moreover, the proposed robust jet is well suited to this generalized framework for global warping (see Fig. 1).

Fig. 1. Elastic graph matching: rectangular grid and single reference case.

Unlike the conventional cost, our cost function must be minimized together with a search for the optimal global warping parameters. An iterative strategy, shown in Fig. 2, is proposed to perform two jobs at once: node localization and global graph fitting. In the first step, all node positions of the test graph are searched using the simulated annealing optimization technique (Press et al., 1992) under the cost function of (16) with a small $\lambda$, which is the methodology commonly used in conventional approaches. A small $\lambda$ should be assigned initially to allow a freer and wider search for the local node positions. In the second step, reliable nodes are selected according to the algorithm introduced at the end of this section. A reliable node is a node that is properly located at its expected position in the test face. In the last step, a temporary warping matrix is estimated from the selected reliable nodes, and the translation between the graphs is also found at this stage. The temporarily optimal warping parameters are estimated by

$$(A, t)_{opt.} = \arg\min_{A, t} \sum_{k \in \{i^*\}} \left\| z_k^t - \left( A z_k^r + t \right) \right\|^2. \qquad (17)$$

$\{i^*\}$ denotes the set of indexes of the reliably located nodes, and $t$ is a two-dimensional translation vector between the graphs. Simple linear regression (Press et al., 1992) is enough to estimate the warping parameters from the selected reliable nodes. More than three nodes should be used in the warping parameter estimation, because six parameters in total must be estimated by (17).
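The least-squares problem in (17) is linear in the six unknowns, so it can be solved in closed form. The sketch below is one way to do it with `numpy.linalg.lstsq`, together with the warping-compensated edge $d_{ij}^t$ of (16). The function names, the four sample nodes, and the true warp are assumptions chosen for illustration; in the algorithm, the inputs would be the reliable nodes $\{i^*\}$.

```python
import numpy as np

def fit_affine(z_r, z_t):
    """Eq. (17): least-squares A (2x2) and t (2,) such that z_t ~ A z_r + t."""
    z_r = np.asarray(z_r, dtype=float)
    z_t = np.asarray(z_t, dtype=float)
    X = np.hstack([z_r, np.ones((len(z_r), 1))])    # rows are [x_r, y_r, 1]
    P, *_ = np.linalg.lstsq(X, z_t, rcond=None)     # P is 3x2: z_t ~ X @ P
    return P[:2].T, P[2]

def compensated_edge(A, z_t, i, j):
    """Warping-compensated edge d_ij = A^{-1} (z_i^t - z_j^t) used in Eq. (16)."""
    return np.linalg.solve(A, z_t[i] - z_t[j])

# Hypothetical reference nodes and a known warp (rotation by 20 degrees, scale 1.2).
z_r = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
theta = np.deg2rad(20.0)
A_true = 1.2 * np.array([[np.cos(theta), -np.sin(theta)],
                         [np.sin(theta),  np.cos(theta)]])
t_true = np.array([3.0, -2.0])
z_t = z_r @ A_true.T + t_true

A_est, t_est = fit_affine(z_r, z_t)
d_01 = compensated_edge(A_est, z_t, 0, 1)   # should match the reference edge
e_01 = z_r[0] - z_r[1]
```

When the test graph is an exact affine image of the reference graph, the compensated edge `d_01` equals the reference edge `e_01`, so the edge term of (16) vanishes even though the raw edges differ in length and direction.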


Fig. 2. Iterative algorithm for generalized elastic graph matching.


After the three steps are finished, all nodes of the test graph are relocated according to the estimated temporary $A$ and $t$, and a slightly increased $\lambda$ is assigned so that the individual nodes settle closer to their proper positions. The process then goes back to the first step and starts the next iteration. The optimization iterates until the cost of (16) falls below a goal value or remains unchanged for several iterations.

The key point of our optimization algorithm is that only the selected reliable nodes are used for the temporary estimation of the warping parameters. False global warping parameters may drive some nodes to very poor positions from which they cannot escape through a local search of limited range. Therefore, finding the reliably located temporary nodes during the optimization is very important for stable cost minimization.

Fig. 3. Various relative angles around a node.

As shown in Fig. 3, a relative angle around a node is the angle formed by two edges attached to that node. The relative angle is invariant to rotation and scaling of the graph, so it is a good measure for deciding whether the node that owns the angle is properly located. However, when a face is distorted by a change in facial expression, or when the person in the test face differs from the person in the reference face, the graph may be seriously distorted and even the relative angles may not be preserved. For these cases, we check all available angles around a node, and if any one of them is preserved during the local search, we accept the node as reliable.

Algorithm. Reliable node selection

Goal. Find the set of indexes of reliable nodes, $\{i^*\}$, from the temporary test graph, $\{z_i^t\}$, and the reference graph, $\{z_i^r\}$.

For all $N$ nodes from $i = 1$ to $i = N$,

Step 1. Get relative angles around a node.

Find all relative angles $\theta^t_{relative}$ around a node $z_i^t$,

$$\theta^t_{relative}(k, l, i) = \cos^{-1}\left( \frac{e_{ki}^t \cdot e_{li}^t}{\|e_{ki}^t\| \|e_{li}^t\|} \right) = \cos^{-1}\left( \frac{(z_k^t - z_i^t) \cdot (z_l^t - z_i^t)}{\|z_k^t - z_i^t\| \|z_l^t - z_i^t\|} \right),$$

where $z_k^t$ and $z_l^t$ are neighboring nodes of $z_i^t$ and $k \neq l$. (If there are $p$ neighboring nodes around $z_i^t$, there are $\binom{p}{2}$ relative angles.) Also find all relative angles $\theta^r_{relative}$ around the reference node $z_i^r$.

Step 2. Check the node reliability.

For all possible relative angles around the $i$th test node, if any one of them is similar to the corresponding relative angle around the $i$th reference node, i.e.,

$$\left| \theta^t_{relative}(k, l, i) - \theta^r_{relative}(k, l, i) \right| < T_{angle},$$

then insert $i$ into $\{i^*\}$, the reliable node index set.
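The two steps of the reliable node selection can be sketched as below. This is a minimal illustration under assumptions: the 3 × 3 grid with 4-connected neighbors, the threshold of 5 degrees for $T_{angle}$ (the text gives no value), the zero-based node indexes, and the function names are all hypothetical.

```python
import numpy as np
from itertools import combinations

def relative_angles(z, i, neighbors):
    """Step 1: angle between every pair of edges attached to node i."""
    angles = {}
    for k, l in combinations(neighbors[i], 2):
        a, b = z[k] - z[i], z[l] - z[i]
        c = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
        angles[(k, l)] = np.arccos(np.clip(c, -1.0, 1.0))  # clip guards rounding
    return angles

def select_reliable_nodes(z_t, z_r, neighbors, t_angle):
    """Step 2: keep node i if any of its relative angles matches the reference."""
    reliable = []
    for i in range(len(z_t)):
        th_t = relative_angles(z_t, i, neighbors)
        th_r = relative_angles(z_r, i, neighbors)
        if any(abs(th_t[key] - th_r[key]) < t_angle for key in th_t):
            reliable.append(i)
    return reliable

# Hypothetical 3 x 3 reference grid, node index = 3 * row + col, 4-connected.
z_r = np.array([[c, r] for r in range(3) for c in range(3)], dtype=float)
neighbors = {i: [j for j in range(9)
                 if abs(z_r[i] - z_r[j]).sum() == 1.0] for i in range(9)}
t_angle = np.deg2rad(5.0)  # assumed threshold

# Test graph 1: reference grid rotated by 30 degrees and scaled by 1.5.
th = np.deg2rad(30.0)
R = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
z_warped = 1.5 * z_r @ R.T
reliable_warped = select_reliable_nodes(z_warped, z_r, neighbors, t_angle)

# Test graph 2: only the center node (index 4) is badly mislocated.
z_bad = z_r.copy()
z_bad[4] = [1.6, 1.3]
reliable_bad = select_reliable_nodes(z_bad, z_r, neighbors, t_angle)
```

For the rotated and scaled grid every node is accepted, which reflects the invariance of the relative angle to rotation and scaling. In the second graph the displaced center node is rejected because none of its six relative angles stays within the threshold. Since a single preserved angle is enough for acceptance, the criterion is deliberately lenient.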


5. Experimental results

To evaluate the improved performance of our G-EGM for face recognition, recognition simulations were carried out on a subset of the FERET database that includes rotation, scale change, facial expression change, and tilt in face pose. From the FERET database, we built a sub-FERET database by selecting 94 persons whose face images include some with relatively serious rotation and scale change, and by choosing 10 near-frontal face images (tilt in depth within ±22.5 degrees) per person. In total, 940 face images were selected for the experiments. The 10 images of each person were divided into four prototypes and six test patterns. Some example face images used in the experiments are shown in Fig. 4.

Three elastic approaches and two well-known appearance-based approaches, Eigenfaces and Fisherfaces, were tested and compared. The three elastic matching methods compared are: (1) the conventional EGM based on the magnitude jet, (2) an alternative EGM based on the morphological jet introduced by Tefas et al. (2001), and (3) the proposed warping-robust EGM based on the robust jet and the generalized cost function. In the simulation, test faces were aligned to a bunch graph of normal reference faces (Wiskott et al., 1997) by each elastic matching method, and features were extracted by linear discriminant analysis (LDA) on a merged vector of localized jets, namely the LG (local-to-global) vector introduced by Lyons et al. (1999). The nearest neighbor classification rule was then used to determine the identities of the tested faces.

Fig. 4. Examples of selected face images used in recognition experiments and G-EGM results.

For a more detailed performance evaluation, two simulations were performed; the results are shown in Fig. 5. The first simulation was performed on faces manually aligned according to the positions of their eyes and the top of the nose. Although the variations caused by global face warping were almost entirely compensated by the manual pre-alignment, the flexible approaches based on the EGM showed better performance because they adapt to facial expression changes and to distortions left uncorrected by the manual alignment. Even though serious rotation and scale change were almost removed by the manual pre-alignment, the proposed method shows nearly 5% improvement in recognition success rate, which demonstrates that the robust jet is more effective than the other types of jet in both robustness and information preservation.

The second simulation was performed on the raw faces, which received no manual alignment, in order to compare the robustness of each method to arbitrary face poses. The performance of every method degraded because of the uncorrected global misalignment between faces. The appearance-based methods showed the most serious degradation, and even the conventional EGM showed about a 10% decrease in recognition success rate, because the conventional cost function cannot work properly under serious scale and pose changes. In contrast, our proposed method showed a degradation of only about 2%, which shows that the generalized EGM deals successfully with arbitrary global warps of faces.


Fig. 5. Result of face recognition simulation.

Table 1. Average elapsed time (in s) for elastically matching a 128 × 128-pixel face to a bunch graph of 50 reference faces (simulation environment: P4 2.4 GHz CPU and Windows 2000 OS)

| Method | Required time for jet extraction | Required time for cost minimization | Total |
|---|---|---|---|
| Morphological jet + conventional cost function | 0.35 | 4.06 | 4.41 |
| Magnitude jet + conventional cost function | 1.27 | 4.07 | 5.34 |
| Robust jet + generalized cost function | 1.93 | 5.24 | 7.17 |


The computation times required by the three elastic matching methods are presented in Table 1. Each time is the average required to elastically match a test face to a reference bunch graph built from 50 reference faces. Our proposed method takes longer than the conventional methods because of the iterative estimation of pose parameters and the DFT processing in the robust jet extraction. Nevertheless, when a recognition system has to work with arbitrary face images, the proposed method remains valuable for achieving steadily high recognition performance at the cost of longer processing time.

6. Conclusion

A new robust local descriptor for labeling graph nodes and an elastic matching cost function enhanced by a global warping model are proposed for face recognition in more realistic environments. Consistent face recognition performance is achieved by absorbing global face alignment, which is usually treated as an independent preprocessing step, into the elastic matching process. Global face alignment and localized elastic matching cooperate in a complementary way to find the best local and global matching between faces. The validity of the proposed method was demonstrated by several elastic face matching tests and by face recognition simulations on a subset of the FERET database that includes faces in various poses.

The performance of the proposed G-EGM is expected to improve further with a more complex 8- or 12-parameter warping model. In addition, the relative angle, which is invariant only to scaling and to rotation about the z-axis, should be replaced by an affine-invariant feature that can be extracted around a graph node. Reducing the computational complexity of our algorithm is another important topic for future work. As shown in Table 1, seven seconds is quite a long time to recognize a face in a real environment, although we expect that more powerful hardware may alleviate this. We are also now researching a high-performance face recognition framework, namely an elastic-kernel-based approach, which combines the proposed G-EGM with recent kernel-based approaches (Liu, 2004; Lu et al., 2003).

References

Duc, B., Fischer, S., Bigun, J., 1999. Face authentication with gabor information on deformable graph. IEEE Trans. Image Process. 8, 504–516.

Lades, M., Vorbruggen, J.C., Buhmann, J., Lange, J., Von der Malsburg, C., Wurtz, R.P., Konen, W., 1993. Distortion invariant object recognition in the dynamic link architecture. IEEE Trans. Comput. 42, 300–311.

Liu, C., 2004. Gabor-based kernel PCA with fractional power polynomial models for face recognition. IEEE Trans. Pattern Anal. Machine Intell. 26 (5).

Lu, J., Plataniotis, K.N., Venetsanopoulos, A.N., 2003. Face recognition using kernel direct discriminant analysis algorithms. IEEE Trans. Neural Networks 14 (1).

Lyons, M.L., Budyn, J., Akamatsu, S., 1999. Automatic classification of single facial images. IEEE Trans. Pattern Anal. Machine Intell. 21, 1357–1362.

Phillips, P.J., Moon, H., Rizvi, S.A., Rauss, P.J., 2000. The FERET evaluation methodology for face recognition algorithms. IEEE Trans. Pattern Anal. Machine Intell. 22, 1090–1104.

Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P., 1992. Numerical Recipes in C: The Art of Scientific Computing, second ed. Cambridge University Press.

Tefas, A., Kotropoulos, C., Pitas, I., 2001. Using support vector machines to enhance the performance of elastic graph matching for frontal face authentication. IEEE Trans. Pattern Anal. Machine Intell. 23, 735–746.

Wiskott, L., Fellous, J.-M., Kruger, N., Von der Malsburg, C., 1997. Face recognition by elastic bunch graph matching. IEEE Trans. Pattern Anal. Machine Intell. 19, 775–779.