
Face tracking for interaction - review and work

Changbo Hu

Advisor: Matthew Turk

Department of Computer Science, University of California, Santa Barbara

Outline

Review
– What is the aim of face tracking?
– How did people do it?
– What are we going to do?

Current work
– Mean-shift skin tracking
– Mean-shift elliptical head tracking
– Face tracking and imitation

Face in interaction

Where? → Detection

Who? → Recognition, verification

What? → Expression, talking… attributes

What do we expect from the computer?
– To perceive the above information
– To respond properly → applications

Applications

– Authentication
– Human recognition
– Internet
– Human-computer interface
– Facial animation
– Talking agent
– Model-based video coding

The role of tracking

Two meanings:
– Once a face is detected, keep up with its motion (tracking is easier in this sense)
– Some tasks require you to know its pose: to improve performance in face and expression recognition, and for synthesis and animation

What factors cause face variation?

1. Pose (models the view relative to the camera)

2. Deformation (models facial expression and talking…)

3. Intensity change (models the illumination and sensor)

What is face tracking?

To find all the variation factors. Problem formulation:

$I = g\bigl(P(R\,x(t))\bigr)$, matched against a reference image $I_0$,

where $x(t)$ captures the deformation and translation of the face, $R$ the rotation, $P$ the projection onto the image plane, and $g$ the intensity (sensor) model.

How did people do it?

(continued)

To look into some details

Gang Xu, ICPR98

Black, CVPR 95

To look into some details

Blake, ICCV98

Bilinear combination of motion and expression

Cassia CVPR99

To look into some details

DT, PAMI 93

Pentland, Computer Graphics, 96

To look into some details

Pentland ICCV workgroup 99

To look into some details

GorkTurk ICCV01

What will we do?

Task: personalized full tracking and animation of the face.

– Starting point: 2D face location
– Selecting a face model
– Modeling expression
– Modeling illumination
– Animation

What conditions do we have?

A personalized face makes it specific:
– to model the shape
– to model the expression
– to have stable feature points
– to sample lighting effects

Statistical learning (see the sketch below):
– PCA, ASM, AAM
– muscle vectors, human metrics for expression
– learning feature point locations
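To make the statistical-learning item concrete, here is a minimal sketch of a PCA point-distribution model in the spirit of ASM/AAM shape models. The function names, array shapes, and number of retained modes are illustrative assumptions, not the implementation used in this work.

```python
import numpy as np

def learn_shape_model(shapes, n_modes=10):
    """Learn a PCA shape space from aligned landmark sets.
    shapes: (K, N, 2) array of K training shapes with N landmarks each."""
    X = shapes.reshape(len(shapes), -1)        # flatten each shape to (2N,)
    mean = X.mean(axis=0)
    # Rows of Vt are the principal modes of shape variation.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_modes]                  # mean shape and basis P

def synthesize_shape(mean, P, b):
    """Generate a shape from mode coefficients b: x = mean + b @ P."""
    return (mean + b @ P).reshape(-1, 2)
```

An AAM would add an analogous PCA over pose-normalized texture on top of this shape space.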

Starting point: current work

– Mean-shift tracking of skin color
– Mean-shift tracking of an elliptical head
– Two-step face tracking and expression imitation

Selecting a face model

Face modeling is itself a large topic, related to graphics, talking faces, etc. In choosing a model, we must consider:

1. The model can account for 3D motion.

2. The model is easy to adjust to an individual.

From Reference [29]

Face model: data capture

To determine the head geometry:
– Method: two calibrated front and profile images
– 10 feature points: the four eye corners, the two nostrils, the bottom of the upper front teeth, the chin, and the base of the ears

Face model: locate features

To locate the facial features with high precision, in three steps:
– find a coarse outline of the head and estimate the main features
– analyze the important areas in more detail
– zoom in on specific points and measure with high accuracy

Face model: locate features

Face model: location of main features

Texture segmentation (sketched below):
– using the luminance image
– bandpass filter and adaptive threshold
– morphological operations
– connected-component analysis
– extracting the center of mass, width, and height of each blob
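A rough sketch of this segmentation pipeline in Python with OpenCV; all filter sizes and threshold parameters are illustrative guesses, not the values used in [29].

```python
import cv2
import numpy as np

def segment_texture_blobs(gray):
    """Texture segmentation as described above: bandpass filter, adaptive
    threshold, morphology, connected components, blob statistics."""
    # Bandpass filter: difference of Gaussians on the luminance image.
    low = cv2.GaussianBlur(gray, (3, 3), 1.0)
    high = cv2.GaussianBlur(gray, (15, 15), 4.0)
    bandpass = cv2.normalize(low.astype(np.float32) - high, None, 0, 255,
                             cv2.NORM_MINMAX).astype(np.uint8)
    # Adaptive threshold turns textured regions into a binary mask.
    mask = cv2.adaptiveThreshold(bandpass, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                 cv2.THRESH_BINARY, 31, -5)
    # Morphological opening and closing clean up speckle.
    kernel = np.ones((3, 3), np.uint8)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    # Connected-component analysis; stats give width/height, centroids give
    # the center of mass of each blob.
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(mask)
    blobs = []
    for i in range(1, n):                      # label 0 is the background
        x, y, w, h, area = stats[i]
        blobs.append({"center": tuple(centroids[i]),
                      "width": w, "height": h, "area": area})
    return blobs
```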

Face model: location of main features

Color segmentation:
– background color vs. skin and hair color
– extracting the same kinds of features as in the texture step

Evaluating combinations of features:
– train a 2D head model (size)
– score blobs to select candidates
– check each eye candidate for a good combination
– evaluate the whole head

Face model: measuring facial features

To find the exact dimensions (see the sketch below):
– areas around the mouth and the eyes
– using the HSI color space
– a threshold for each (predefined) color cluster
– recalibrating the color thresholds dynamically
– remarkably accurate, but not robust enough: 2 pixels standard deviation
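A minimal sketch of the color-cluster thresholding and dynamic recalibration. OpenCV's HSV space stands in for the HSI space named above, and the bounds and margin are assumptions.

```python
import cv2
import numpy as np

def color_cluster_mask(bgr, lo, hi):
    """Threshold one predefined color cluster in HSV space."""
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    return cv2.inRange(hsv, np.array(lo), np.array(hi))

def recalibrate(samples, margin=1.2):
    """Dynamically recenter the thresholds around recently accepted pixel
    samples (K, 3), mimicking the recalibration step described above."""
    mean, std = samples.mean(0), samples.std(0)
    return (mean - margin * std).astype(int), (mean + margin * std).astype(int)
```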

Face model: measuring facial features

The colors of the teeth, the lips, and the inner, dark part of the mouth are learned in advance.

Face model: high-accuracy feature points

Correlation analysis (see the sketch below):
– a group of kernels
– kernel chosen by width and height
– scan the image for the best correlation
– a 20×20 kernel in a 100×100 window, with a conjugate gradient descent approach
– 0.5 pixel standard deviation
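A minimal sketch of the correlation scan using OpenCV's normalized cross-correlation. The dense scan stands in for the conjugate-gradient refinement described above; the 20×20 and 100×100 sizes come from the slide, everything else is an assumption.

```python
import cv2

def best_correlation(search_win, kernel):
    """Find the kernel location with the highest normalized correlation.
    search_win: e.g. a 100x100 grayscale patch; kernel: e.g. 20x20 template."""
    scores = cv2.matchTemplate(search_win, kernel, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(scores)
    x, y = max_loc                              # top-left corner of best match
    return (x + kernel.shape[1] // 2, y + kernel.shape[0] // 2), max_val
```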

Face model: high accuracy from correlation

Face model: pose estimation

– using 6 corners whose 3D positions are known from the model
– an iterative equation (to find i, j, and Z0)
– low-pass filtering on their trajectories

A sketch of this style of pose recovery follows.
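Below is a hedged sketch of pose recovery from known 3D model points under scaled orthographic projection: a single POS step, where the full iterative scheme would loop with depth corrections. This is one standard way to solve for the rotation rows i, j and the depth Z0, not necessarily the exact equation used in [29]; names and shapes are illustrative.

```python
import numpy as np

def pos_pose(model_pts, image_pts, focal):
    """One POS (pose from scaled orthography) step.
    model_pts: (N, 3) 3D feature points, first point taken as the origin;
    image_pts: (N, 2) their image projections; focal: focal length, pixels."""
    A = model_pts[1:] - model_pts[0]          # model vectors from reference
    B = np.linalg.pinv(A)                     # (3, N-1) least-squares inverse
    x = image_pts[1:, 0] - image_pts[0, 0]
    y = image_pts[1:, 1] - image_pts[0, 1]
    I = B @ x                                 # scaled first rotation row
    J = B @ y                                 # scaled second rotation row
    s = 0.5 * (np.linalg.norm(I) + np.linalg.norm(J))   # scale = f / Z0
    i, j = I / np.linalg.norm(I), J / np.linalg.norm(J)
    k = np.cross(i, j)                        # third rotation row
    R = np.vstack([i, j, k])
    Z0 = focal / s                            # depth of the reference point
    return R, Z0
```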

Modeling expression

As in AAM, create pose-free appearance patches.

Modeling illumination

A 3D linear space, assuming a Lambertian surface without shadowing:

$E(p) = a(p)\, n(p)^T s$

where $a(p)$ is the albedo at surface point $p$, $n(p)$ its normal, and $s$ the light source direction scaled by intensity.

Considering shadowing and distortion, the basis can be increased to around 10.

Using only one subject, we can learn the linear space by experiment (see the sketch below).
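A minimal sketch of learning such a linear lighting space from images of one subject under varying illumination, via SVD/PCA. The basis size of 10 follows the slide; the function names and data layout are assumptions.

```python
import numpy as np

def learn_lighting_basis(images, n_basis=10):
    """Learn a low-dimensional linear lighting space from grayscale images
    of one subject under varying illumination (fixed pose)."""
    X = np.stack([im.ravel().astype(np.float64) for im in images])  # (K, P)
    mean = X.mean(axis=0)
    # SVD of the centered stack; rows of Vt are the lighting basis images.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_basis]

def lighting_coeffs(image, mean, basis):
    """Project a new image of the subject onto the learned lighting basis."""
    return basis @ (image.ravel().astype(np.float64) - mean)
```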

Animation

– Synthesis animation
– Performance-driven sketch animation

End

Questions and comments?

Mean shift color tracking

An implementation to show the power of skin color; the feature is the probability of skin hue. Mean-shift search:

1. Choose a search window size.
2. Choose the initial location of the search window.
3. Compute the mean location in the search window.
4. Center the search window at the mean location computed in Step 3.
5. Repeat Steps 3 and 4 until convergence.

(continued)

Find the zeroth moment $M_{00}$.

Find the first moments for x and y: $M_{10}$, $M_{01}$.

Then the mean search-window location (the centroid) is $(x_c, y_c)$:

$x_c = M_{10} / M_{00}, \quad y_c = M_{01} / M_{00}$

Get features from the blob:
– length, width, rotation

A sketch of the whole loop follows.
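A minimal sketch putting the five steps and the moment computations together over a skin-probability image; the window bookkeeping and stopping thresholds are assumptions.

```python
import numpy as np

def mean_shift(prob, window, max_iter=20, eps=1.0):
    """Mean-shift search over a skin-probability image.
    prob: 2D array of skin-hue probabilities; window: (x, y, w, h)."""
    x, y, w, h = window                        # steps 1-2: size and location
    for _ in range(max_iter):
        roi = prob[y:y + h, x:x + w]
        M00 = roi.sum()                        # zeroth moment
        if M00 <= 0:
            break
        ys, xs = np.mgrid[0:roi.shape[0], 0:roi.shape[1]]
        M10, M01 = (xs * roi).sum(), (ys * roi).sum()   # first moments
        xc, yc = M10 / M00, M01 / M00          # step 3: centroid in the ROI
        nx = int(round(x + xc - w / 2))        # step 4: recenter the window
        ny = int(round(y + yc - h / 2))
        nx = min(max(nx, 0), prob.shape[1] - w)
        ny = min(max(ny, 0), prob.shape[0] - h)
        if abs(nx - x) < eps and abs(ny - y) < eps:
            break                              # step 5: converged
        x, y = nx, ny
    return x, y, w, h
```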


Mean-shift elliptical head tracking

Based on shape and adaptive color: the head is modeled as an ellipse, and the head's appearance is represented by an adaptive color model.

– First: mean shift to track the color blob.
– Second: maximize the normalized gradient around the boundary of the elliptical head.

The head's hue varies during tracking, especially across different views or large rotations. To handle this, we update the head's color model continuously during tracking, using the tracking result (a sketch follows):

$h_N = (1 - \alpha)\, h_T + \alpha\, h_R$

$h_T$: the initial color representation
$h_R$: the color of the tracking result in the current frame
$h_N$: the head's color model for tracking in the next frame
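The update itself is one line; a minimal sketch with an assumed blending weight alpha (the weight's value was lost in extraction):

```python
import numpy as np

def update_color_model(h_T, h_R, alpha=0.1):
    """Blend the initial hue histogram h_T with the current tracking
    result's histogram h_R; alpha is an illustrative blending weight."""
    h_N = (1.0 - alpha) * h_T + alpha * h_R
    return h_N / h_N.sum()        # keep it a probability distribution
```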

Why adaptive color

Relocate the elliptical head

Maximizing the normalized gradient:

– Assume the elliptical head's state is $s = (x, y, h)$ (location and size).
– $g_i$ is the intensity gradient at perimeter pixel $i$ of the ellipse.
– $N_h$ is the number of pixels on the perimeter of the ellipse.

$\hat{s} = \arg\max_{s \in S} \frac{1}{N_h} \sum_{i=1}^{N_h} |g_i(s)|$

Then update the color model. A sketch of the scoring step follows.
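A minimal sketch of the gradient scoring and relocation step; the ellipse parameterization, sampling density, and search radius are assumptions.

```python
import numpy as np

def ellipse_gradient_score(grad_mag, cx, cy, a, b, n_pts=72):
    """Normalized gradient magnitude around an ellipse perimeter, i.e. the
    quantity maximized above. grad_mag: 2D gradient-magnitude image;
    (cx, cy): center; a, b: semi-axes."""
    t = np.linspace(0.0, 2.0 * np.pi, n_pts, endpoint=False)
    xs = np.clip((cx + a * np.cos(t)).astype(int), 0, grad_mag.shape[1] - 1)
    ys = np.clip((cy + b * np.sin(t)).astype(int), 0, grad_mag.shape[0] - 1)
    return grad_mag[ys, xs].mean()            # (1/N_h) * sum of |g_i|

def relocate_head(grad_mag, cx, cy, a, b, search=3):
    """Pick the candidate state with the highest perimeter score,
    searching a small window around the mean-shift result."""
    best = max(((ellipse_gradient_score(grad_mag, cx + dx, cy + dy, a, b),
                 cx + dx, cy + dy)
                for dx in range(-search, search + 1)
                for dy in range(-search, search + 1)))
    return best[1], best[2]
```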

Benefits

Compared with Bradski's paper [28] and the Stanford elliptical head tracker [16], our approach has these benefits:
– Robust: fusion of color and gradient cues, adaptive to color change
– Fast: no exhaustive search is needed, and the mean-shift iteration converges quickly

Demo


Real-time face pose tracking & expression imitation (work in progress)

A modification to the Active Appearance Model (AAM). The most obvious drawback of AAM?
– It is slow, because the PCA projection cannot be applied directly.

Our approach:
– Explicitly compute the rigid motion from a set of rigid feature points.
– Learn the PCA space for the nonrigid shape and appearance.

Two-step face tracking

Formulation (a sketch of the loop follows):
– Rigid features $x_1$, nonrigid features $x_2$.
– $T_a(x_1) \to z_1$; the same transform gives $T_a(x_2) \to z_2$.
– PCA model of the pose-normalized nonrigid features: $Z_2 = \bar{Z}_2 + P b$.

Deal with imprecision in the rigid points through synthesized feedback:
– In the synthesized $Z_2$, relocate the rigid features $x_1$ and compute a new $T$.
– Iterate until convergence.
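A hedged sketch of the two-step loop: a similarity transform is estimated from the rigid points, the nonrigid points are pose-normalized and projected onto the PCA space, and the synthesized shape feeds back. The slide's appearance-based relocation of the rigid features is approximated here by re-fitting the pose each pass; all array shapes are hypothetical.

```python
import numpy as np

def fit_similarity(src, dst):
    """Least-squares 2D similarity (scale-rotation A, translation t)
    with dst ~= src @ A.T + t; src and dst are (N, 2) point sets."""
    ms, md = src.mean(0), dst.mean(0)
    s, d = src - ms, dst - md
    denom = (s ** 2).sum()
    a = (s * d).sum() / denom
    b = (s[:, 0] * d[:, 1] - s[:, 1] * d[:, 0]).sum() / denom
    A = np.array([[a, -b], [b, a]])
    return A, md - ms @ A.T

def two_step_iteration(x1_img, x2_img, x1_ref, z2_mean, P, n_iter=5):
    """x1_img/x2_img: tracked rigid and nonrigid image points; x1_ref:
    rigid points in the pose-free reference frame; z2_mean (2N,) and
    P (2N, k): PCA mean and basis of the pose-free nonrigid shape."""
    for _ in range(n_iter):
        A, t = fit_similarity(x1_img, x1_ref)       # step 1: rigid pose T_a
        z2 = (x2_img @ A.T + t).ravel()             # pose-free nonrigid shape
        b = P.T @ (z2 - z2_mean)                    # step 2: PCA projection
        z2_syn = (z2_mean + P @ b).reshape(-1, 2)   # synthesized clean shape
        x2_img = (z2_syn - t) @ np.linalg.inv(A).T  # feedback to image frame
    return A, t, b
```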

Pose-free expression

[Diagram: applying pose $T$ to the pose-free expression yields a new face with both pose and expression]

Animation

One implementation: using hand-drawn corresponding modes, for example:


References

1. [H. Li, PAMI93] H. Li, P. Roivainen, and R. Forchheimer, "3-D motion estimation in model-based facial image coding," PAMI, 6, 1993.
2. [DT, PAMI93] D. Terzopoulos and K. Waters, "Analysis and synthesis of facial image sequences using physical and anatomical models," PAMI, 6, 1993.
3. [Black, CVPR95] M. Black and Y. Yacoob, "Tracking and recognizing rigid and non-rigid facial motion using local parametric models of image motion," CVPR, 1995.
4. [Essa, ICCV95] I. Essa and A. Pentland, "Facial expression recognition using a dynamic model and motion energy," Proc. 5th Int. Conf. on Computer Vision, pp. 360-367, 1995.
5. [Darrell, CVPR96] T. Darrell, B. Moghaddam, and A. Pentland, "Active face tracking and pose estimation in an interactive room," CVPR, 1996.
6. [Pentland, Computer Graphics 96] I. Essa, S. Basu, T. Darrell, and A. Pentland, "Modeling, tracking and interactive animation of faces and heads using input from video," Proc. Computer Graphics, 1996.
7. [L. Davis, FG96] T. Horprasert, Y. Yacoob, and L. S. Davis, "Computing 3D head orientation from a monocular image sequence," FG, 1996.
8. [Yacoob, PAMI96] Y. Yacoob and L. S. Davis, "Computing spatio-temporal representations of human faces," PAMI, 6, 1996.
9. [DeCarlo, CVPR96] D. DeCarlo and D. Metaxas, "The integration of optical flow and deformable models with applications to human face shape and motion estimation," CVPR, 1996.
10. [Nesi, RTI96] P. Nesi and R. Magnolfi, "Tracking and synthesizing facial motions with dynamic contours," Real Time Imaging, 2:67-79, 1996.
11. [Oliver, CVPR97] N. Oliver and A. Pentland, "LAFTER: Lips and face real time tracker," CVPR, 1997.
12. [DT, CVPR97] P. Fieguth and D. Terzopoulos, "Color-based tracking of heads and other mobile objects at video frame rates," CVPR, 1997.
13. [Pentland, CVPR97] T. S. Jebara and A. Pentland, "Parameterized structure from motion for 3D adaptive feedback tracking of faces," CVPR, 1997.
14. [Cootes, ECCV98] T. Cootes and G. Edwards, "Active appearance models," ECCV, 1998.
15. [Gang Xu, ICPR98] G. Xu and T. Sugimoto, "Rits Eye: A software-based system for realtime face detection and tracking using pan-tilt-zoom controllable camera," Proc. 14th International Conference on Pattern Recognition, pp. 1194-1197, 1998.
16. [Birchfield, CVPR98] S. Birchfield, "Elliptical head tracking using intensity gradients and color histograms," CVPR, 1998.
17. [Hager, PAMI98] G. Hager and P. Belhumeur, "Efficient region tracking with parametric models of geometry and illumination," IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(10), pp. 1125-1139, 1998.
18. [Schödl, PUI98] A. Schödl, A. Haro, and I. Essa, "Head tracking using a textured polygonal model," PUI, 1998.
19. [Blake, ICCV98] B. Bascle and A. Blake, "Separability of pose and expression in facial tracking and animation," ICCV, 1998.
20. [Cassia, CVPR99] M. La Cascia and S. Sclaroff, "Fast, reliable head tracking under varying illumination," CVPR, 1999.
21. [Pentland, ICCV workshop 99] J. Ström, T. Jebara, S. Basu, and A. Pentland, "Real time tracking and modeling of faces: An EKF-based analysis by synthesis approach," International Conference on Computer Vision: Workshop on Modelling People, Corfu, Greece, September 1999.
22. [GorkTurk, ICCV01] S. B. Gokturk, J.-Y. Bouguet, et al., "A data-driven model for monocular face tracking," ICCV, 2001.
23. [Y. Li, ICCV01] Y. Li, S. Gong, and H. Liddell, "Modelling faces dynamically across views and over time," ICCV, 2001.
24. [Feris, ICCV workshop 01] R. S. Feris and R. M. Cesar Jr., "Efficient real-time face tracking in wavelet subspace," ICCV Workshop, 2001.
25. [Ahlberg, RATFFG-RTS01] J. Ahlberg, "Using the active appearance algorithm for face and facial feature tracking," 2nd International Workshop on Recognition, Analysis and Tracking of Faces and Gestures in Real-Time Systems (RATFFG-RTS), pp. 68-72, Vancouver, Canada, July 2001.
26. [CC Chang, IJCV02] C.-C. Chang and W.-H. Tsai, "Determination of head pose and facial expression from a single perspective view by successive scaled orthographic approximations," IJCV, 3, 2002.
27. D. Comaniciu and P. Meer, "Real-time tracking of non-rigid objects using mean shift," Proc. IEEE CVPR, 2000, pp. 142-149.
28. G. R. Bradski, "Real-time face and object tracking as a component of a perceptual user interface," IEEE Workshop on Applications of Computer Vision, 1998, pp. 214-219.
29. E. Cosatto and H. P. Graf, "Photo-realistic talking-heads from image samples," IEEE Trans. on Multimedia, vol. 2, no. 3, September 2000.
