2013 IEEE International Conference on Signal and Image Processing Applications (ICSIPA)

Initial Experimental Results of Real-Time Variant Pose Face Detection and Tracking System

Nur Baiti Zahir1, Rosdiyana Samad2, Mahfuzah Mustafa3

Faculty of Electrical & Electronics Engineering, University Malaysia Pahang

26600, Pekan, Pahang, Malaysia

Tel: +609-4246099 '[email protected]

2 [email protected]

[email protected]

Abstract- A face detection system is a computer application that automatically detects a human face in a digital image or video frame. This paper presents a face detection system that uses a web camera to detect and track a face in real time. To detect a face in the image, a simple skin color detection method is used, which allows the face to be segmented easily from a complex background. However, detecting a face in real time is challenging, especially when the face is moving and the environment has uneven illumination. This paper presents preliminary results of a face detection and tracking system that detects a face in different poses in real time under uneven lighting conditions. To complete the detection process, a contour detection method is added so that the detection is more accurate. The system can be applied in many areas, such as banking systems (to reduce forgery), security systems, and human-computer interaction (HCI).

Keywords: face detection, face tracking, skin color analysis, contour method

I. INTRODUCTION

Nowadays, many computer applications use face detection algorithms to detect and track faces. A face detection and tracking system is a computer vision technology that determines the movement of a face in arbitrary images [1]. It detects facial features and ignores anything else, such as buildings, trees and bodies. However, not all face detection systems can detect and track a face accurately. A major problem is that the detector sometimes fails to find the correct face and may instead detect other objects that resemble a face. Besides, in a real-time environment, a human face usually moves to the right and left, tilts at a certain angle, and sometimes rotates. This environment is challenging, and uneven illumination (caused by changes in time of day, weather or lighting) may further affect the detection. Although many face detection systems have been proposed, not all are able to detect a face when the person is moving or turning their head.

A robust face detection algorithm is very important for applications that use face detection as input to other processes, such as facial expression recognition or personal identification. Therefore, the detection must be accurate for the tracking to be useful. In an automatic facial expression recognition system, face detection is an important task that has a major influence on the performance of the entire system. Various techniques have been proposed to detect a face automatically from still images or image sequences. For most proposed face detection techniques, the facial image or image sequence is obtained in a controlled environment with uniform illumination and a simple background. When capturing a face image indoors, the camera distance and illumination settings are normally carefully controlled and constantly recalibrated to ensure that the settings are identical across subjects [2].

Face detection in semi-controlled or uncontrolled conditions is the most challenging [3]. These conditions arise in complex environments, where cluttered backgrounds and occasional occlusions occur. Illumination variation in an uncontrolled environment is a major problem, especially for appearance-based approaches. It can result in huge variations in images of the same person due to varying lighting conditions. The problem is mainly due to the 3D shape of human faces under lighting from different directions [4].

Researchers have proposed many face detection techniques. In one of the earliest works on face detection, Turk and Pentland [4] developed a near-real-time system that located and tracked a person's head and then recognized the person. The system was based on eigendecomposition, as used in Principal Component Analysis (PCA). The most famous and successful real-time face detection method is the Viola-Jones framework, which was initially proposed by Viola and Jones [5][6] and improved by Lienhart [7]. This framework uses a set of Haar-like features, each of which is described by a template.

Another way to detect a face in a scene is skin color segmentation. Various approaches have been proposed using different colour spaces and skin models. For instance, Chandrappa et al. [8] and Phung et al. [9] proposed human skin models in the YCbCr color space. Li et al. [10] combined the YCbCr color space with the normalized RGB color space to create a mixed colour model that enhances detection reliability. Phung, Bouzerdoum and Chai [11] studied the colour representation, colour quantization and classification algorithm in the colour pixel classification approach to skin segmentation. Their analysis of several representative colour spaces, using a Bayesian classifier with the histogram technique, shows that skin segmentation based on colour pixel classification is largely unaffected by the choice of colour space.

978-1-4799-0269-9/13/$31.00 ©2013 IEEE 264

In other research, N. Sethi and A. Aggarwal [6] developed a face detection and tracking system using the pyramidal Lucas-Kanade algorithm. At the face detection stage, the Shi and Tomasi algorithm was used to extract feature points, and the pyramidal Lucas-Kanade algorithm was used to track the detected features.

In this study, the initial experimental results of a variant pose face detection and tracking system operating in real time are presented. The objective of this study is to develop a face detection and tracking system that can detect a face in frontal view and in several poses and orientations, and to apply a different approach to detecting and tracking a human face, namely the contour method. The experiments have been done both in simulation and in a real-time system.

A collection of still face images is used for the simulation; for the real-time system, a sequence of images is captured using a webcam. The system detects a face and tracks its movement when the face moves at a certain angle. The system is developed using Microsoft Visual Studio 2010 (C++) and the OpenCV library 2.2.

II. METHODOLOGY

This section presents the method used to develop the face detection and tracking system. Figure 1 shows a flowchart of the whole process. The following subsections provide the details of each step.

A. Skin Colour Analysis

The skin color of a human face can be used as a statistical cue for locating the face. Many papers have shown that skin color is a strong and useful characteristic for human face detection [12].

Thus, for skin color detection, the HSV color space has been chosen, so the image is converted from the R, G and B space into the H, S and V space. H is the hue, which represents the color; S is the saturation, which represents the amount of white mixed into the color; and V is the value, which represents the amount of black mixed into the color. The HSV color model is related to human color perception and carries both luminance and chromaticity information. Separating the brightness information from the chromaticity in HSV reduces the effect of uneven illumination in an image [13].
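The effect of separating brightness from chromaticity can be illustrated with a small sketch. It is written in Python using only the standard `colorsys` module, not the C++/OpenCV code used in this work, and the pixel values are hypothetical. It converts a skin-like RGB pixel and a darker copy of the same pixel to HSV: under this idealized uniform dimming, hue and saturation stay the same and only the value component changes.

```python
import colorsys

def to_hsv(r, g, b):
    """Convert an 8-bit RGB triple to (h, s, v), each in the range [0, 1]."""
    return colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)

# A hypothetical skin-like pixel, and the same pixel at half the brightness.
bright = to_hsv(200, 140, 110)
dim = to_hsv(100, 70, 55)

print("bright (h, s, v):", bright)
print("dim    (h, s, v):", dim)
```

Real illumination changes are rarely a uniform scaling of all three channels, so in practice H and S drift as well; the sketch only shows the ideal case that motivates the choice of HSV.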

After the image is converted into the HSV color space, the skin color is separated from the background color. This is done by determining threshold ranges for hue, saturation and value. To determine the threshold values, experiments were carried out in two environments, indoor and outdoor. The two differ slightly: the outdoor environment is brighter than the indoor one, and the light on a face is sometimes uneven. Appropriate threshold values are important for good segmentation, so that the skin is separated well from the background. Based on the experimental results, the ranges of skin color in the H, S and V components are as follows:

$$6 \le H \le 38, \quad 98 \le S \le 256, \quad 100 \le V \le 256 \tag{1}$$

$$6 \le H \le 38, \quad 90 \le S \le 200, \quad 86 \le V \le 216 \tag{2}$$

[Figure 1: flowchart. Capture image from camera -> Skin Color Analysis -> Binary Process -> Binary Morphology -> Contour Process -> Bounding Box -> Result]

Fig. 1. Flow chart of the variant pose face detection and tracking.

The ranges in Equation (1) are the threshold values recorded for the experiments in the indoor environment, while the threshold values in Equation (2) were taken during the experiments in the outdoor environment. The two sets differ because the illumination outdoors differs from that indoors. The range of each color component may also need to change if the background illumination differs. The original image is shown in Figure 2 and the result of the binarization is shown in Figure 3.
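The thresholding step can be sketched as follows. This is a minimal illustration in plain Python rather than the C++/OpenCV implementation used in this work. The ranges are taken from Equations (1) and (2); the scaling of H to [0, 180) and of S and V to [0, 255] is an assumption (it is the usual convention for 8-bit HSV images in OpenCV), and the function names and test pixels are hypothetical.

```python
import colorsys

# Threshold ranges from Equations (1) and (2), as (low, high) per channel.
INDOOR = {"h": (6, 38), "s": (98, 256), "v": (100, 256)}
OUTDOOR = {"h": (6, 38), "s": (90, 200), "v": (86, 216)}

def rgb_to_hsv8(r, g, b):
    """8-bit RGB -> HSV with H in [0, 180) and S, V in [0, 255] (assumed scaling)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 180.0, s * 255.0, v * 255.0

def is_skin(pixel, ranges):
    """Return True if all three HSV components fall inside the given ranges."""
    h, s, v = rgb_to_hsv8(*pixel)
    return (ranges["h"][0] <= h <= ranges["h"][1]
            and ranges["s"][0] <= s <= ranges["s"][1]
            and ranges["v"][0] <= v <= ranges["v"][1])

def skin_mask(image, ranges):
    """Binarize an image given as rows of (r, g, b) tuples: 1 = skin, 0 = background."""
    return [[1 if is_skin(p, ranges) else 0 for p in row] for row in image]

# One skin-like pixel and one blue background pixel (hypothetical values).
image = [[(200, 140, 110), (30, 60, 200)]]
print(skin_mask(image, INDOOR))
```

Keeping the two range sets as separate named constants mirrors the way the indoor and outdoor thresholds are reported separately, and makes it easy to swap one for the other.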


Fig. 2. The original image

Fig. 3. The binary image after skin colour segmentation

B. Morphology Process

After the skin color is segmented from the background, small objects with colors similar to skin still remain. These are unwanted pixels or objects that should be removed, which is done with a binary morphology process. Here, the erosion function is applied to the image. Erosion removes small points or objects from the image [12][14]. A 3x3 structuring element is used, and the result is shown in Figure 4. The operation is formulated as follows:

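$$F_m = F \ominus B \tag{3}$$

$$B = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \tag{4}$$

where

F_m : Skin color mask

F : Binary image from the color segmentation

B : Structuring element

⊖ : Erosion operator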

Fig. 4. Result of the erosion process
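Binary erosion with a 3x3 structuring element can be sketched in a few lines. The version below is plain Python, not the OpenCV routine used in this work. It assumes an all-ones 3x3 element and treats pixels outside the image as background; both are assumptions made for illustration, and library implementations may handle the border differently.

```python
def erode3x3(mask):
    """Erode a binary mask (rows of 0/1) with an all-ones 3x3 structuring element.

    A pixel stays 1 only if it and all eight neighbours are 1.
    Pixels outside the image are treated as 0 (assumed border handling).
    """
    rows, cols = len(mask), len(mask[0])

    def is_set(r, c):
        return 0 <= r < rows and 0 <= c < cols and mask[r][c] == 1

    return [
        [1 if all(is_set(r + dr, c + dc)
                  for dr in (-1, 0, 1) for dc in (-1, 0, 1)) else 0
         for c in range(cols)]
        for r in range(rows)
    ]

# A 3x3 blob plus one isolated noise pixel in the last column.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 1],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
for row in erode3x3(mask):
    print(row)
```

In this example the isolated pixel disappears and the blob shrinks to its centre pixel, which is the behaviour relied on to suppress small skin-colored objects.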

C. Contour Detection

In the previous section, the skin was extracted from the background and small objects or unwanted pixels were removed until only the main face region remained. The next step is to assemble the pixels of that object (the face) into contours. A contour is a list or sequence of points that represents a curve in an image. In this work, contours are represented by sequences in which every entry encodes information about the location of the next point on the curve [15]. The contours are computed from a binary image that contains large and small objects, in which the edges are implicit as boundaries between positive (white) and negative (black) regions.

Based on the experimental results, it was found that the large object in the binary image is the face, while the small objects are pixels or objects that are not needed. Computing the contours of all objects gives the boundary of each object, and each object can then be differentiated using its geometric characteristics [16]. Thus, only the object of interest for the detection remains. Detection becomes easier once the region of interest (ROI) of the object has been determined. After that, the system can determine whether the detected object is a face or not.
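The idea of keeping only the largest object can be sketched as below. This plain-Python illustration uses connected-component labeling with 8-connectivity as a stand-in for contour extraction, and uses pixel count (area) as the geometric characteristic; both choices are assumptions, since the exact characteristic is not specified here.

```python
from collections import deque

def connected_components(mask):
    """Return a list of components, each a list of (row, col) pixels (8-connectivity)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] == 0 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components

def largest_component(mask):
    """Return the pixels of the component with the largest area, or [] if none."""
    components = connected_components(mask)
    return max(components, key=len) if components else []

mask = [
    [1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [1, 0, 0, 0, 0],
]
print(sorted(largest_component(mask)))
```

Here the two isolated pixels are discarded and the 2x2 block is kept as the candidate face region.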

Contour detection can be done using several methods, for example the Hough transform, the active snake contour and the geometric active contour (GAC) [17]. These methods are useful for shape analysis and object detection or recognition. In this experiment, the active snake contour was chosen since it has been widely used for contour detection.

Here, the contour is said to possess an energy (E_snake), which is defined as the sum of three energy terms as follows [18]:

$$E_{\text{snake}} = E_{\text{internal}} + E_{\text{external}} + E_{\text{constraint}} \tag{5}$$

$$E_{\text{snake}} = \int_0^1 E_{\text{snake}}(v(s))\,ds = \int_0^1 \left[ E_{\text{int}}(v(s)) + E_{\text{image}}(v(s)) + E_{\text{con}}(v(s)) \right] ds \tag{6}$$

where

E_int(v(s)) : represents the internal energy of the spline due to bending

E_image(v(s)) : represents the image forces

E_con(v(s)) : represents the external constraint force
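For the internal term defined above, the classical snake formulation writes the energy density as (alpha |v'(s)|^2 + beta |v''(s)|^2) / 2, where the first term penalizes stretching and the second penalizes bending. The sketch below evaluates a finite-difference version of this term for a closed contour given as a list of points. It is plain Python for illustration only; the weights `alpha` and `beta`, the discretization and the example contour are assumptions and are not values reported in this work.

```python
def internal_energy(points, alpha=1.0, beta=1.0):
    """Discrete internal energy of a closed contour given as (x, y) points.

    Uses forward differences for the first derivative and central second
    differences for the second derivative, wrapping around the contour.
    """
    n = len(points)
    total = 0.0
    for i in range(n):
        px, py = points[i - 1]           # previous point (index -1 wraps around)
        cx, cy = points[i]               # current point
        nx, ny = points[(i + 1) % n]     # next point
        elastic = (nx - cx) ** 2 + (ny - cy) ** 2
        bending = (nx - 2 * cx + px) ** 2 + (ny - 2 * cy + py) ** 2
        total += 0.5 * (alpha * elastic + beta * bending)
    return total

# A unit square traced by its four corners (hypothetical contour).
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(internal_energy(square))            # elastic and bending terms
print(internal_energy(square, beta=0.0))  # elastic term only
```

A full snake would also need the image and constraint terms and an iterative minimization; only the internal term is shown because it depends on the contour alone.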

The internal energy (E_internal) depends on the intrinsic properties of the curve and is the sum of the elastic energy and the bending energy. The external energy (E_external) is derived from the image, so it takes its smaller values at features of interest such as boundaries. The result is shown in Figure 5, where the contour area is marked with a white ellipse. From the region of the contour, a bounding box is created based on the maximum area of the detected pixels. The bounding box is formed by connecting corner points until they make a rectangle.

Fig. 5. Contour detection (figure label: Contour area)
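The bounding box step can be sketched as follows: given the pixels of the detected region, the box is the smallest axis-aligned rectangle that contains them, and its four corners are the points that are connected to draw it. This is a plain-Python illustration with hypothetical function names and input, not the routine used in this work.

```python
def bounding_box(points):
    """Smallest axis-aligned box around (row, col) points, as (x, y, width, height)."""
    rows = [r for r, _ in points]
    cols = [c for _, c in points]
    top, left = min(rows), min(cols)
    height = max(rows) - top + 1
    width = max(cols) - left + 1
    return left, top, width, height

def corners(box):
    """Corner points (x, y) of the box, in drawing order around the rectangle."""
    x, y, w, h = box
    return [(x, y), (x + w - 1, y), (x + w - 1, y + h - 1), (x, y + h - 1)]

region = [(1, 2), (1, 3), (2, 2), (3, 4)]   # hypothetical region pixels
box = bounding_box(region)
print(box)
print(corners(box))
```

Returning the box as (x, y, width, height) follows a common rectangle convention in image libraries, which makes the result easy to pass to a drawing routine.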

III. EXPERIMENTAL RESULT & DISCUSSION

The simulation results were obtained using Microsoft Visual C++ 2010 with the OpenCV library, and are shown in Figures 6, 7 and 8. The experiments were carried out in both indoor and outdoor environments.

The real-time results are produced by detecting only one face in a sequence of images. The experiment has two parts: testing face detection in real time, and a simulation using a collection of still images. In the simulation, 55 of the 60 images were detected accurately, giving a detection and tracking accuracy of 91.67%. Some images failed to be detected because of the influence of background illumination. The testing results are shown in Figures 6, 7 and 8.

In the outdoor experiment, the system is able to detect a face in real time, but with a limitation caused by uneven illumination. Outdoor illumination sometimes changes, either from dark to bright or from bright to dark.

As a result, some of the images in the outdoor test are very bright, which causes the detection to fail. In Figure 7, the detection fails for one of the face images: the red rectangle appears only around the nose and mouth, which shows that the whole face is not detected. Very bright illumination can therefore affect outdoor detection, and uneven illumination should be addressed in future work.

Figure 8 shows that the system is also able to detect and track a face against a complex background, even though some background objects have a color similar to the subject's face. The system is also capable of detecting the faces of individuals with different skin colors.

Fig. 6. Face tracking in real time for indoor testing.

Fig. 7. Face tracking in real time for outdoor testing.


Fig. 8. Real-time face tracking with the complex background.

IV. CONCLUSION

This paper presented a pose-variant face detection and tracking system that runs in real time using a web camera. The system uses a contour detection method for motion tracking based on shape recognition, and the experimental results show that it is able to detect and track a human face in a variety of poses. Although many methods can be used for tracking, such as the Lucas-Kanade method, the active snake contour method is also capable of tracking the human face, with results nearly similar to those of Lucas-Kanade tracking. The system can be used robustly in an indoor environment but has limitations in several outdoor environments. Outdoors, the face can still be detected and tracked well if the illumination is not extremely bright. In the future, the threshold ranges for both indoor and outdoor environments can be adjusted, or an automatic thresholding method can be applied, to improve the accuracy of the detection.

REFERENCES

[1] M.-H. Yang, D. J. Kriegman and N. Ahuja, "Detecting faces in images: a survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24(1), Jan. 2002.

[2] S. Z. Li and A. K. Jain, "Handbook of Face Recognition," Springer Verlag, New York, 2005.

[3] R. Samad, "A study of natural facial expression recognition using compressed features and its application to robust face detection," PhD thesis, Kagawa University, Japan, March 2012.

[4] M. Turk and A. Pentland, "Eigenfaces for recognition," The Journal of Cognitive Neuroscience, vol. 3(1), pp. 71-86, 1991.

[5] P. Viola and M. Jones, "Robust real-time face detection," The International Journal of Computer Vision, vol. 57(2), pp. 137-154, 2004.

[6] N. Sethi and A. Aggarwal, "Robust face detection and tracking using pyramidal Lucas Kanade tracker algorithm," International Journal of Computer Technology and Applications, vol. 2(5), pp. 1432-1438, 2011.

[7] R. Lienhart and J. Maydt, "Haar-like features for rapid object detection," in Proc. of IEEE International Conference on Image Processing, pp. 900-903, 2002.

[8] D. N. Chandrappa, M. Ravishankar and D. R. R. Babu, "Face detection in color images using skin color model algorithm based on skin color information," in Proc. The 3rd International Conference on Electronics Computer Technology, ICECT, pp. 254-258, 2011.

[9] S. L. Phung, A. Bouzerdoum and D. Chai, "A novel skin color model in YCbCr color space and its applications to human face detection," in Proc. The International Conference on Image Processing, pp. 289-292, 2002.

[10] Z. Li, L. Xue and F. Tan, "Face detection in complex background based on skin color features and improved Adaboost algorithm," in Proc. The IEEE International Conference on Progress in Informatics and Computing, PIC, pp. 723-727, 2010.

[11] S. L. Phung, A. Bouzerdoum and D. Chai, "Skin segmentation using color pixel classification: analysis and comparison," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27(1), pp. 148-154, 2005.

[12] G. Huang and 1. Su, "A real-time face detection and tracking," in Proc. International Conference on Audio, Language and Image Processing, ICALIP 2008, pp. 173-179, 2008.

[13] P. Sebastian, Y. V. Voon and R. Comley, "The effect of colour space on tracking robustness," in Proc. The 3rd IEEE Conf. Industrial Electronics and Applications, ICIEA 2008, pp. 2512-2516, 3-5 June 2008.

[14] M. Sonka, V. Hlavac and R. Boyle, "Image processing, analysis and machine vision," Second Edition, PWS Publishing, Pacific Grove, CA, 1999.

[15] G. Bradski and A. Kaehler, "Learning OpenCV," O'Reilly Media Inc., US, 2008.

[16] M. Kass, A. Witkin and D. Terzopoulos, "Snakes: Active contour models," International Journal of Computer Vision, pp. 321-331, 1988.

[17] P. Arbelaez, M. Maire, C. Fowlkes and J. Malik, "Contour detection and hierarchical image segmentation," IEEE TPAMI, vol. 33(5), pp. 898-916, May 2011.

[18] M. Maire, P. Arbelaez, C. Fowlkes and J. Malik, "Using contours to detect and localize junctions in natural images," in Proc. IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2008, pp. 1-8, 23-28 June 2008.
