
Robust Face Recognition of Images Captured by Different Devices

Guangda Su, Yan Shang, and Baixing Zhang

Research Institute of Image and Graphics, Electronic Engineering Department, Tsinghua University

100084, Beijing, P.R. China
[email protected]

Abstract. The performance of a face recognition system is greatly influenced when the target and test images are captured through different imaging devices, such as scanned photos and images captured through a digital camera or video camera. Excluding the influence of lighting, expression and time difference, this paper analyzes the difference in gray scale distribution between images captured through different devices. Based on this analysis, we decompose the facial feature into a general feature and a local feature. The general facial feature is determined by the imaging device, while the local facial feature represents the individual facial difference, which is what should be used for face recognition. Before face recognition is performed, the general facial feature is excluded through gray scale normalization. In this paper the normalization is carried out through gray scale sorting, and experimental results under both ideal and realistic conditions show the performance improvement that this technique brings to a face recognition system.

1 Introduction

Face recognition is an active topic in the pattern recognition community, and it has found wide usage in many areas [3,4,5]. In our research on face recognition techniques [1,2,6], we have noticed that the performance of a face recognition system is greatly influenced by the input imaging device. If the input facial image and the target facial images are captured through the same type of imaging device, a high recognition rate can be achieved. The recognition rate decreases significantly when the input image and the target ones come from different imaging devices. In some real applications, this influence seriously degrades system performance. For example, in a face surveillance system at an airport, the facial images in the pre-stored database are usually scanned photo images, while the facial image to be recognized is captured on-site through a digital camera or digital video camera. In other applications, facial images captured through mobile phones may be compared with scanned photos, which is frequently encountered in forensic work. In the circumstances mentioned above, facial features of the same person are compared across images captured through different types of devices, and the recognition rate consequently deteriorates.



Nowadays, quite a lot of face recognition systems with satisfactory laboratory performance encounter difficulties in real-world applications. The reason lies not only in the commonly considered differences in pose, expression, lighting and imaging time; the difference in imaging devices is also an important factor, and it is the topic of this paper.

In order to investigate the influence of different imaging devices on a face recognition system, we have captured facial images through three types of devices: traditional camera, digital camera and digital camcorder. Frontal images under the same lighting, pose and expression were captured; Figure 1 shows the captured facial images. From Figure 1, it is obvious that there exists a fundamental difference in gray scale distribution between the different image types, which we have investigated theoretically in [6]. However, it is difficult to apply a unified theoretical equation to exclude this device-dependent difference because of the huge variance of imaging environments in real-world applications. Some researchers have addressed the problem of excluding differences in lighting, pose, expression, etc. in a face recognition system through preprocessing algorithms [7,8,9,10]. To our knowledge, there is no dedicated research aimed at excluding the influence of different imaging devices. In this paper, we propose a novel gray scale normalization method based on the device characteristic to solve this problem. In the following section, this gray scale normalization method is explained in theoretical detail, and Section 3 describes the realization procedure. The experimental results are given in Section 4, and the paper is concluded in Section 5.


Fig. 1. Facial images of the same candidates captured through (a) traditional camera, (b) digital camcorder, and (c) digital camera. All the images are captured under the same circumstances, thus differences in lighting, pose and expression are excluded.

2 Gray Scale Normalization Based on Device Characteristics

Figure 2 shows the corresponding gray scale histograms of the facial images in Figure 1. From Figure 2, it can be seen that the histograms from one type of imaging device have a similar shape, while the histogram shapes of different imaging devices show significant differences.



Fig. 2. Histograms of the corresponding facial images in Figure 1. From left to right: histograms of the facial images from (a) traditional camera, (b) digital camcorder, and (c) digital camera. All the images are captured under the same circumstances, thus differences in lighting, pose and expression are excluded.

From this special characteristic of the device-dependent histogram distribution, we have derived the following conclusion: under an ideal situation, in which only the difference of imaging devices is considered, the facial feature can be expressed as

Facial Feature = General Feature + Individual Feature. (1)

Here, the general feature is determined by the imaging device and can be seen as a carrier wave; the individual feature is determined by each person's physiological characteristics and can be regarded as a small signal. The whole facial image can be seen as these two signals modulated together according to the following equation,

f0 = fg · fi, (2)

where f0 is the facial image to be processed, fg is the general feature, and fi is the individual feature. According to information theory, the information contained in the facial image can be calculated as

H(f0) = H(fg fi) = H(fi|fg) + H(fg), (3)

where H refers to the entropy of the two-dimensional image matrix; Eq. (3) is simply the chain rule of entropy applied to the modulated signal.

In a face recognition system, the basic operation is to compare the similarity between two facial images f10 and f20. For a robust and reliable face recognition system, it is important to compare only the individual facial features, which means that f1g and f2g should have no influence on the recognition result. Therefore, a face recognition system should estimate the general feature of an input facial image and exclude this feature before submitting the image for similarity comparison:
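As a side note, the step behind Eq. (3) is the standard chain rule of joint entropy; a minimal sketch of the derivation in LaTeX (our addition, not spelled out in the paper):

```latex
% Chain rule of joint entropy: H(X, Y) = H(Y) + H(X \mid Y).
% Applied with X = f_i (individual feature) and Y = f_g (general feature):
H(f_0) = H(f_g f_i) = H(f_g, f_i) = H(f_g) + H(f_i \mid f_g).
% Since f_g is fixed by the imaging device, H(f_g) is a device-dependent
% constant, and only H(f_i \mid f_g) carries person-specific information.
```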

(1) When f1g = f2g = fg, the feature difference between the two input facial images can be expressed as:

Dif(f10, f20) = Dif(f1i|fg, f2i|fg) = Dif(f1i, f2i). (4)

This is equivalent to comparing two images from the same input device. In this circumstance, no pre-processing is needed to exclude the influence of the input device.


(2) When f1g ≠ f2g, the feature difference can be expressed as:

Dif(f10, f20) = Dif(f1i|f1g, f2i|f2g) + H(f1gf2g). (5)

This means that the two images are from different imaging devices. Here H(f1g f2g) is a constant, which can be derived through large-sample training. Therefore, we only need to calculate Dif(f1i|f1g, f2i|f2g).

For the first situation mentioned above, many face recognition systems have achieved satisfactory results, while for the second situation it is normally difficult to achieve reliable performance without further preprocessing. In the following section, we propose a novel gray scale sorting algorithm, which excludes the general feature introduced by the input device while preserving the individual facial feature.

3 Normalization Through Gray Scale Sorting

In order to improve face recognition performance for images captured through different devices, it is important to exclude the general image feature introduced by the input device. Gray scale sorting is adopted in this paper to perform gray scale normalization for images from different input sources. Standard gray scale distributions are defined, which represent the general characteristics of facial images captured through a certain type of input imaging device. These standard distributions are derived by training on a large number of samples with known input devices. Gray scale sorting is then performed to transform the histogram of the facial image to be recognized to the standard distribution.

3.1 Standard Gray Scale Distribution

The standard gray scale distribution is calculated through a training process: a set of facial images from the same type of imaging device are chosen as the training samples, and the histograms reflecting the gray scale distributions of these images are submitted to a statistical process, which calculates a weighted average version fg from all the images. It should be mentioned here that the training images are gray scale images, and the pixel values of the average image fg are not truncated to integers; instead they are preserved in this process as positive real numbers.

To get a standard gray scale distribution for a certain type of imaging device, which can be used as a reference to normalize a facial image from a different input device, we first calculate a class center distribution function (CCDF). Based on this CCDF, we then calculate a class center cumulative function, which is used as a lookup table in the gray scale normalization. The following procedure describes how these two functions are calculated:

(1) First the histogram hn(g) of every facial image in the training set is calculated (where n denotes the nth sample, g is the gray scale, n = 1, 2, ..., N, and g = 0, 1, ..., 255).


(2) Define B(g) as the class center distribution function for the specified input class. For every gray scale, B(g) is calculated recursively. Take B(0) as an example: suppose the class center for the first two images in the training set is k1 and the precision radius is r1, then

k1 = (h1(0) + h2(0)) / 2 and r1 = |h1(0) − h2(0)| / 2. (6)

If |h3(0) − k1| ≤ r1, the class center and the precision radius remain the same: k2 = k1 and r2 = r1; otherwise,

k2 = (k1 + h3(0)) / 2 and r2 = |k1 − h3(0)| / 2. (7)

The same calculation is carried out over all N training samples, with B(0) = kN−1. The class center values for the other gray scales are calculated in the same way as B(0).

(3) Based on B(g), the class center cumulative function P(g) is calculated as

P(g) = ∫_0^g B(t) dt. (8)
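To make the procedure concrete, the following is a minimal NumPy sketch of steps (1)–(3), assuming the training histograms are stored as an N × 256 array; the function names are ours, not from the paper:

```python
import numpy as np

def class_center_distribution(histograms):
    """Compute B(g) via the recursive class-center update of Eqs. (6)-(7).

    histograms: array of shape (N, 256); row n holds the histogram h_n(g)
    of the nth training image. Requires N >= 2. (Sketch; names are ours.)
    """
    N, G = histograms.shape
    B = np.zeros(G)
    for g in range(G):
        h = histograms[:, g]
        k = (h[0] + h[1]) / 2.0       # Eq. (6): initial class center k_1
        r = abs(h[0] - h[1]) / 2.0    # Eq. (6): initial precision radius r_1
        for n in range(2, N):
            if abs(h[n] - k) > r:     # sample falls outside the radius:
                # Eq. (7): both updates use the previous class center k
                k, r = (k + h[n]) / 2.0, abs(k - h[n]) / 2.0
        B[g] = k                      # B(g) = k_{N-1}
    return B

def class_center_cumulative(B):
    """Discrete form of Eq. (8): P(g) = sum of B(t) for t <= g."""
    return np.cumsum(B)
```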

Figure 3 shows the class center cumulative functions for facial images from scanned photos, digital camera and digital camcorder. There are altogether 10,000 images in the training set, and each of the three cumulative functions has its own distinctive shape.


Fig. 3. Class center cumulative functions for facial images from (a) scanned photos, (b) digital camera, and (c) digital camcorder

3.2 Gray Scale Sorting

The aim of gray scale sorting is to normalize a facial image to a standard gray scale distribution of a certain image type. First we define the following function.

Definition 1. Nδ((i, j), (m, n), D), where (i, j) and (m, n) are the coordinates of pixels in a two-dimensional facial image, D is the gray scale matrix, and δ is a neighborhood region of pixel (m, n). Nδ((i, j), (m, n), D) is defined as the ascending gray scale order of pixel (i, j) within the neighborhood region of pixel (m, n) in the gray scale matrix D. The neighborhood region δ can be a 3 × 3 region, or the entire image. It can also be a region containing facial parts such as the eye, nose, brow, etc.


Let Ds be the gray scale distribution matrix of the standard image fg, Do the gray scale distribution of the image to be processed, and X the target image after gray scale sorting, with the same size as the original one. For every pixel in the target, its gray scale is computed as

X(i, j) = Ds(m, n), (9)

where Nδ((i, j), (i, j), Do) = Nδ((m, n), (i, j), Ds). It is easy to prove that this assignment is unique and complete. In a specified neighborhood δ, the gray scales of the pixels in the target image are assigned from the gray scales of the standard image; therefore the target image preserves the statistical characteristics of the standard image, such as mean and variance. At the same time, the gray scale order of the original image is preserved in the target image; therefore the individual facial feature remains intact in this process, while the general feature is normalized to that of a standard device. Figure 4 shows sample images and their corresponding sorted images using the whole image as the neighborhood δ.
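For the whole-image neighborhood, Eq. (9) reduces to rank matching: each pixel keeps its ascending gray scale rank from the original image but takes its value from the equally ranked pixel of the standard image. A minimal sketch, assuming both images are 2-D NumPy arrays of the same size (the function name is ours):

```python
import numpy as np

def gray_scale_sort(original, standard):
    """Gray scale sorting with the whole image as the neighborhood delta.

    Each pixel of `original` keeps its ascending gray scale rank but is
    assigned the gray value of the equally ranked pixel of `standard`,
    as in Eq. (9). (Sketch; assumes equal shapes.)
    """
    order = np.argsort(original.ravel(), kind="stable")  # pixel indices by rank
    sorted_standard = np.sort(standard.ravel())          # standard values, ascending
    result = np.empty_like(original.ravel())
    result[order] = sorted_standard  # pixel with rank t receives the t-th standard value
    return result.reshape(original.shape)
```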

Fig. 4. Sample images and their sorted images: (a) destination image from a scanned photo, (b) image from CCD camcorder, (c) image from digital camera, (d) histogram of (a), (e) normalized image from (b), (f) normalized image from (c)

4 Experimental Results

We have tested the effect of the proposed gray scale sorting algorithm on a face recognition system based on multimodal part-based PCA (MMP-PCA) [1]. Construction of all the eigenspaces is performed on a selected training set. Firstly,


the pure face, brow+eye, eye, nose and mouth regions are automatically extracted from a normalized face image, as shown in Figure 5. PCA is applied to construct the desired eigenspaces: eigenface (1), eigenbrow+eye (2), eigeneye (3), eigennose (4) and eigenmouth (5). The projection vectors Bi of the extracted facial parts qi from the facial image to be recognized can be calculated through Equation 10,

Bi = ui^T × qi, i = 1, 2, 3, 4, 5, (10)

where ui is the matrix composed of the eigenvectors of the corresponding ith eigenspace obtained from the training set. For face recognition, the projection vectors of the facial image to be recognized, calculated from Equation 10, are compared with the corresponding projection vectors of every face image stored in the database to calculate the similarity between the two images.
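A minimal sketch of the projection step of Equation 10, assuming each part image is flattened to a vector and each eigenspace is stored as a matrix whose columns are its eigenvectors (names are ours):

```python
import numpy as np

def project_parts(parts, eigenspaces):
    """Eq. (10): B_i = u_i^T q_i for i = 1..5.

    parts: list of five flattened part images q_i (pure face, brow+eye,
    eye, nose, mouth); eigenspaces: list of five matrices u_i whose
    columns are the eigenvectors learned from the training set.
    (Sketch; names are ours.)
    """
    return [u.T @ q for u, q in zip(eigenspaces, parts)]
```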

Fig. 5. Extracted pure face, brow+eye, eye, nose, and mouth from a geometrically normalized face image

Altogether 31 modalities can be formed through different combinations of the above 5 eigenspaces, which can be chosen freely for recognition based on the similarity evaluation of different facial parts by the witness or the operator. When the global modality is chosen, i.e. the combination of all the eigenspaces together, the total similarity is calculated through a weighting scheme of 6 : 5 : 4 : 3 : 2 over the individually calculated similarities of pure face, brow+eye, eye, nose and mouth, as sketched below. Table 4.1 shows the recognition rates of the partial and global modalities for face images of different ages in a database of 100,000 face images. 500 test images are used, and the age difference between the test image to be recognized and the target image stored in the database is above three years. Table 4.1 shows that the global recognition modality with the proposed weighting scheme has the highest recognition rate, and that brow+eye has the best performance among the facial parts.
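A minimal sketch of the global-modality fusion under the 6 : 5 : 4 : 3 : 2 weighting; the paper does not spell out how the weights are normalized, so the weighted average below is one plausible reading (the function name is ours):

```python
import numpy as np

# 6 : 5 : 4 : 3 : 2 weighting for pure face, brow+eye, eye, nose, mouth
WEIGHTS = np.array([6.0, 5.0, 4.0, 3.0, 2.0])

def global_similarity(part_similarities):
    """Fuse the five per-part similarities into the global-modality score.

    part_similarities: similarity scores in the order listed above.
    Assumes a normalized weighted average. (Sketch.)
    """
    s = np.asarray(part_similarities, dtype=float)
    return float(np.dot(WEIGHTS, s) / WEIGHTS.sum())
```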

The proposed algorithm was tested using the face recognition system explained above. We designed two experiments. First, 20 test images of 20 individuals were inserted into the database of 100,000 images. These test samples are scanned from photos taken under an ideal situation, which excludes differences in lighting, expression, pose and time. Another 20 images of the same persons,


Table 4.1. Recognition rates of the partial and global recognition modalities for face images of different ages in a database of 100,000 facial images. The first column shows the position of the target image in the descending similarity list.

          pure face  brow+eye  eye     nose    mouth   global
first     30.6%      18.3%     13.4%   6.2%    14.0%   71.8%
first 5   66.3%      32.4%     23.8%   11.5%   14.0%   71.8%
first 10  71.4%      38.4%     28.2%   15.0%   17.0%   78.9%
first 50  79.5%      51.2%     37.8%   24.3%   26.6%   87.6%

Table 4.2. Recognition rates under the ideal situation with/without gray scale normalization. The first column shows the position of the target image in the descending similarity list.

          without normalization  after normalization
first     60%                    75%
first 5   85%                    90%
first 10  90%                    90%
first 50  95%                    95%

Table 4.3. Recognition rates under the realistic situation with/without gray scale normalization. The first column shows the position of the target image in the descending similarity list.

          without normalization  after normalization
first     27.1%                  40%
first 5   45.3%                  57.1%
first 10  50.0%                  61.8%
first 50  68.2%                  75.9%

serving as the test group, were captured under the same conditions using a digital camcorder; they were processed with the proposed gray scale sorting algorithm, which normalized their gray scale distribution to the standard distribution of scanned photos. The second experiment was carried out in the same scenario as the first one, except that the images were not captured under the same conditions: there are differences in lighting and pose, as well as a half-year time difference, and the number of test images was increased to 170. This represents a more realistic scenario. Under the global recognition modality, the face recognition performance for the above two experiments is given in Tables 4.2 and 4.3 respectively. The data in these two tables show that the recognition performance is increased by around 5% in the ideal situation and by around 10% in the more realistic situation.

5 Conclusion

One of the factors that influence the performance of a face recognition system is that the images to be compared are often from different imaging devices. For example, in many cases the images stored in the database are scanned photos, while the facial images to be recognized are captured through a digital camcorder. Therefore, in order to increase system performance in this situation, it is important to normalize the gray scale distribution of the images before comparing them. In this paper, a gray scale sorting algorithm is proposed to perform the gray scale normalization, and the experimental results show the effectiveness of the proposed algorithm. The image data in our system were collected under the same illumination environment, which ensures that the difference in gray scale distribution comes only from the different imaging devices. It is worth mentioning that although the proposed gray scale normalization method was tested on a face recognition system based on PCA, it is also applicable to face recognition systems that adopt LDA, SVM, etc.

In the future, it is worth investigating further the influence of the neighborhood size, which in this paper we set to the entire image. Another choice, which may perform better, is to set the neighborhood to important facial regions. It is also worth investigating the influence of the proposed algorithm on face recognition systems not based on the eigenface technique adopted in this paper.

References

1. Su, G.D., Zhang, C.P., Ding, R., Du, C.: MMP-PCA Face Recognition Method. Electronics Letters, Vol. 38, No. 25, (2002) 415–438

2. Ding, R., Su, G.D., Lin, X.G.: Face Recognition Algorithm Using Both Local Information and Global Information. Electronics Letters, Vol. 38, No. 8, (2002) 363–364

3. Zhao, W., Chellappa, R., Phillips, P.J., Rosenfeld, A.: Face Recognition: A Literature Survey. ACM Computing Surveys, (2003) 399–458

4. Turk, M., Pentland, A.: Face Recognition Using Eigenfaces. Proc. IEEE CVPR '91, (1991) 586–591

5. Pentland, A., Moghaddam, B., Starner, T.: View-Based and Modular Eigenspaces for Face Recognition. Proc. IEEE CVPR '94, (1994) 84–91

6. Zhang, B.X., Su, G.D.: Research on Characteristic of Facial Imaging and Goals of Facial Image Normalization. Optic Electronic · Laser, Vol. 14, No. 4, (2003) 406–410

7. Heseltine, T.: Evaluation of Image Pre-processing Techniques for Eigenface-Based Face Recognition. The Second International Conference on Image and Graphics (ICIG 2002), Hefei, China, SPIE, Vol. 4875, (2002) 677–685

8. Gross, R., Baker, S., Matthews, I., Kanade, T.: Face Recognition Across Pose and Illumination. In: Li, S.Z., Jain, A.K. (eds.): Handbook of Face Recognition. Springer-Verlag, June 2004

9. Georghiades, A.S., Belhumeur, P.N., Kriegman, D.J.: From Few to Many: Illumination Cone Models for Face Recognition under Variable Lighting and Pose. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 23, No. 6, (2001) 643–660

10. Tian, Y.L., Kanade, T., Cohn, J.F.: Recognizing Action Units for Facial Expression Analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 23, No. 2, (2001) 97–115