
SIViP
DOI 10.1007/s11760-011-0278-9

ORIGINAL PAPER

3D facial model reconstruction, expressions synthesis and animation using single frontal face image

Narendra Patel · Mukesh Zaveri

Received: 15 May 2011 / Revised: 10 November 2011 / Accepted: 12 November 2011
© Springer-Verlag London Limited 2011

Abstract With a better understanding of face anatomy and technical advances in computer graphics, 3D face synthesis has become one of the most active research fields for many human-machine applications, ranging from immersive telecommunication to the video games industry. In this paper we propose a method that automatically extracts features such as the eyes, mouth, eyebrows and nose from a given frontal face image. A generic 3D face model is then superimposed onto the face in accordance with the extracted facial features, fitting the input face image by transforming the vertex topology of the generic face model. The 3D-specific face can finally be synthesized by texturing the individualized face model. Once the model is ready, six basic facial expressions are generated with the help of MPEG-4 facial animation parameters. To generate transitions between these facial expressions, we use 3D shape morphing between the corresponding face models and blend the corresponding textures. The novelty of our method is the automatic generation of a 3D model and the synthesis of a face with different expressions from a frontal neutral face image. Our method has the advantage that it is fully automatic, robust and fast, and it can generate various views of the face by rotation of the 3D model. It can be used in a variety of applications for which the accuracy of depth is not critical, such as games, avatars and face recognition. We have tested and evaluated our system using a standard database, namely BU-3DFE.

N. Patel (B)
Department of Computer Engineering, BVM Engineering College, V.V.Nagar, Gujarat, India
e-mail: [email protected]

M. Zaveri
Department of Computer Engineering, SVNIT, Surat, Gujarat, India
e-mail: [email protected]

Keywords A generic 3D model · Morphing · Animation · MPEG-4 · FAPs · Texture

1 Introduction

Facial modeling and animation are among the fastest developing technologies in the field of animation. A model is a description of 3D objects in a strictly defined language or data structure. The face model is a representation of the head that accurately captures the shape of the head and face. The head model may be obtained using photographs, 3D scanning or modeling with special software. Models can use different representations, such as polygon meshes and B-splines. To synthesize a specific 3D face, two basic sets of data have to be extracted: the vertex topology of the 3D wire frame and the texture of the specific face. The vertex topology specifies the structure of the 3D generic model, while the face texture enhances the degree of realism of the face model. Algorithms for 3D face synthesis can be classified into four categories [1]: 3D laser scanning, stereo images, multi-view images and single-view images. Given the growing demand for real-time applications such as video telephony and video conferencing, methods that generate a 3D model from multiple images are not feasible because they need user supervision and are computationally complex. Therefore, to achieve a simple, fast system, 3D face synthesis from a single image has come to be an ideal choice. There has been a lot of work on face modeling from images. In [2] researchers have suggested the use of two orthogonal views, one frontal view and one side view, to create the 3D model. Such systems require the user to manually specify the face features on the two images. Blanz and Vetter [3] developed a system to create face models from a single image. Their system uses both a geometry database and an image database.

Their system is computationally more expensive. Feng and Yuen [4] synthesized the face from only a single image, but their method needs to estimate head rotation parameters using another reference image. Liu [5] also proposed a system to create face models from a single face image, but they used existing software to detect features. After generation of the 3D face model, the next important issue is synthesizing the human face with a high degree of realism. One way of achieving realism is modeling facial expressions and animation on the synthesized human face. However, the task of modeling all human expressions on a virtual character is complicated by the richness of human facial expressions and the fact that each individual has their own unique way of expressing emotions facially. Approaches to modeling facial expressions and animation need to manipulate the face mesh's vertices. There are five different approaches to facial animation [1]: interpolation, performance-driven, muscle-based, pseudo-muscle-based and direct parameterization animation. The most commonly used animation approach nowadays is direct parameterization, where facial animation is parameterized by a set of animation parameters [6]. These parameters not only govern global animation of the head but are also able to emulate a variety of facial expressions, overcoming the limitation of expression interpolation. Typical parameters are affine transformation parameters, action units (AUs) and facial animation parameters (FAPs).

The MPEG-4 standard employs FAPs operating on a set of facial definition parameters (FDPs), or facial feature points (FPs). The FDPs define the 3D locations of 84 points on a neutral face. FDPs usually correspond to facial features and therefore roughly outline the shape of the face. The FAPs specify FDP displacements which model actual facial feature movements in order to generate various expressions [7,8]. All FAPs involving translational movement are expressed in terms of facial animation parameter units (FAPUs).

We have proposed a method that automatically generates a 3D face model from a given frontal image of the face and generates the six universal expressions of the face: fear, anger, happiness, surprise, disgust and sadness. In our proposed system these expressions are represented with the help of MPEG-4 facial animation parameters. The MPEG-4 visual standard specifies a set of facial definition parameters (FDPs) and facial animation parameters (FAPs) for facial animation. The FAPs are used to characterize the movements of facial features defined over the jaw, lips, eyes, mouth, nose and cheeks. Facial animation is produced by interpolation between two or more different models using 2D morphing techniques combined with 3D transformations of the geometric model. The novelty of our method is the automatic generation of a 3D model and the synthesis of a face with different expressions from a frontal neutral face image.

We have used the 3D facial expression database BU-3DFE [9] for texture mapping and to determine the values of the FAPs. It is also possible to generate a new expression through the blending of selected expressions.

The paper is organized as follows: Sect. 2 describes 3D face model reconstruction. It is followed by expression generation and expression morphing in Sects. 3 and 4, respectively. Section 5 describes animation. The simulation results and conclusions are discussed in Sects. 6 and 7.

2 3D face model reconstruction

The method that reconstructs the 3D face model from a single frontal 2D face image consists of facial feature extraction, face model adaptation and texture mapping [10,11]. Our proposed method first extracts features from the given face image, and then a generic 3D face model is superimposed onto the face in accordance with the extracted facial features in order to fit the input face image by transforming the vertex topology of the generic face model. The advantage of using a generic model is that the number of triangles used to represent the model is fixed. These triangles are deformed to fit a specific face. Even in the case of a large image size our approach is able to reconstruct the model properly, because we have taken care of this in the rendering part.

2.1 Facial feature extraction

Facial feature extraction comprises two phases: face detection and facial feature extraction [12,13]. The face is detected by segmenting skin and non-skin pixels. It is reported that the YCbCr color model is more suitable for face detection than any other color model. It is also reported that the chrominance components Cb and Cr of skin tone always have values in the ranges 77 <= Cb <= 127 and 133 <= Cr <= 173, respectively [14]. A face detection method using skin tone only detects the face correctly when the person has dark hair; light-colored hair is also detected as skin tone. We separate hair from the face by finding luminance changes, which are evident in the hair. To find luminance changes, the standard deviation is calculated by dividing the face region into 4 × 4 blocks. After calculating the standard deviation, the region in which the standard deviation is prominent is found. Finally, the face region is extracted without hair.
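
As an illustration of the two steps above, the following is a minimal Python sketch of the Cb/Cr skin test and the block-wise standard deviation used to separate hair. The function names and array layout are our own assumptions, not code from the paper.

import numpy as np

def skin_mask(ycbcr):
    """Boolean mask of skin pixels for an H x W x 3 YCbCr image,
    using the chrominance ranges quoted above:
    77 <= Cb <= 127 and 133 <= Cr <= 173."""
    cb = ycbcr[:, :, 1].astype(float)
    cr = ycbcr[:, :, 2].astype(float)
    return (cb >= 77) & (cb <= 127) & (cr >= 133) & (cr <= 173)

def block_std(luma, block=4):
    """Standard deviation of the luma channel over non-overlapping
    block x block tiles; tiles with prominent deviation indicate hair."""
    h, w = luma.shape
    h2, w2 = h - h % block, w - w % block
    tiles = luma[:h2, :w2].reshape(h2 // block, block, w2 // block, block)
    return tiles.std(axis=(1, 3))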

After detection of the face, features such as the eyes, mouth and eyebrows are detected. We first build two separate eye maps, one from the chrominance components and the other from the luminance component [15]. We have used the upper half of the face region to prepare the eye maps for detecting the eyes. The eye map from the chroma is based on the observation that high Cb and low Cr values are found around the eyes. It is constructed by

Ec = (1/3) (Cb^2 + (1 − Cr)^2 + Cb/Cr)   (1)

where Cb^2, (1 − Cr)^2 and Cb/Cr are all normalized to the range [0, 1]; here 1 − Cr is the negative of Cr.

The eyes usually contain both dark and bright pixels in the luma component, so the grayscale morphological operators dilation (⊕) and erosion (⊖) are used to emphasize brighter and darker pixels in the luma component around the eye regions. The luma eye map is constructed using Eq. 2:

El = (Y(x, y) ⊕ G(x, y)) / (Y(x, y) ⊖ G(x, y))   (2)

where Y(x, y) is the luma component of the face region and G(x, y) is the structuring element.

The eye map from the chroma is combined with the eye map from the luma by an AND (multiplication) operation. The resulting eye map is dilated with the same structuring element to brighten the eyes and suppress other facial areas. The locations of the eyes are estimated from the eye map. We have determined the mean and standard deviation of the eye map, which are used to find the locations of the eyes. After a large number of experiments we have set the threshold to T = mean + 0.3 ∗ variance. The eye feature points, i.e. the left and right corners and the upper and lower middle points of the eyelids, are extracted from the edge map of the eye using the Sobel gradient operator. After the two eye corners and the two middle points on the eyelids have been located, two parabolas are fitted to the detected eyes. The location and feature points of the eyebrows are found from the edge map of the region of the face above the eyes.
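
A minimal sketch of the eye map construction (Eqs. 1 and 2) and the thresholding rule described above, assuming Y, Cb and Cr are already normalized to [0, 1] over the upper half of the face; the helper name and the small epsilon guards are ours.

import numpy as np
from scipy.ndimage import grey_dilation, grey_erosion

def eye_mask(y, cb, cr, size=5):
    """Combine the chroma eye map (Eq. 1) with the luma eye map (Eq. 2)
    and threshold the result. y, cb, cr: 2D arrays normalized to [0, 1]."""
    eps = 1e-6
    ec = (cb ** 2 + (1.0 - cr) ** 2 + cb / (cr + eps)) / 3.0   # Eq. 1
    el = (grey_dilation(y, size=(size, size)) /
          (grey_erosion(y, size=(size, size)) + eps))          # Eq. 2
    combined = grey_dilation(ec * el, size=(size, size))  # AND, then dilate
    t = combined.mean() + 0.3 * combined.var()            # threshold rule
    return combined > t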

The lip region is extracted using the observation that lip pixels have a stronger red component while their green and blue components are almost the same. Skin pixels also have a stronger red component, but their green component has a higher value than their blue component. The difference between the red and green components is greater for lip pixels than for skin pixels. Hulbert and Poggio [16] proposed a pseudo hue definition that calculates the pseudo hue as

H(x, y) = R(x, y) / (R(x, y) + G(x, y))   (3)

where R(x, y) and G(x, y) are the red and green components of the pixel (x, y), respectively. However, for a person with reddish skin, as shown in Fig. 1, the pseudo hue may not give the correct result. The method discussed in [17] also fails when the person has reddish skin. So we have combined the pseudo hue H(x, y) with H1(x, y):

H1(x, y) = log(G(x, y) / B(x, y))   (4)

where G(x, y) and B(x, y) are the green and blue components of the pixel (x, y), respectively. Lip pixels have lower values of the green and blue color components, so the log function is used to enhance contrast.

Fig. 1 a Frontal face image. b Pseudo hue. c Log(G/B) of lip area

Lip pixels have a higher value of H(x, y) and a lower value of H1(x, y). The location of the mouth is detected by finding the region having a higher value of H(x, y) and a lower value of H1(x, y). We have found that the pseudo hue (H) value varies from 0.55 to 0.65 and the value of H1 is less than 0.73 for lip pixels. It is found that the lip corners are in shadow and have lower intensity values, so the lip corner points are found using the low-valued intensity component of the lip region. The pseudo hue component H(x, y) of the lip region is shown in Fig. 2. It is observed that the hue values (H) for the middle part of the lip pixels are higher when the mouth is closed. But when the mouth is open, the hue value is lower for the teeth but higher for the cavity. This observation is used to check whether the mouth is closed or open.
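
A sketch of the lip test built from Eqs. 3 and 4 with the threshold values quoted above; the function name and epsilon guards are illustrative assumptions.

import numpy as np

def lip_mask(rgb):
    """Boolean lip mask from the pseudo hue (Eq. 3) and log ratio (Eq. 4).
    rgb: H x W x 3 float array of the face region."""
    eps = 1e-6
    r, g, b = rgb[:, :, 0], rgb[:, :, 1], rgb[:, :, 2]
    h = r / (r + g + eps)               # Eq. 3: pseudo hue
    h1 = np.log((g + eps) / (b + eps))  # Eq. 4: log(G/B)
    # Thresholds from the text: 0.55 <= H <= 0.65 and H1 < 0.73
    return (h >= 0.55) & (h <= 0.65) & (h1 < 0.73)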

We have applied the Canny edge detector to the intensity component of the lip region and determined the edge points corresponding to the upper outer and lower outer lip contours for the middle column. The edge map of the lip region is shown in Fig. 3. When the mouth is closed, the inner upper and inner lower boundary edge points are the same; they are the points with the maximum pseudo hue value for the middle column, as shown in Fig. 2.

We have found the points P2, P3 and P4 on the upper boundary of the lip, as shown in Fig. 4. To find P2 we have traversed the left edge of the upper lip boundary from P4 while the position is decreasing; P2 is the edge point with the lowest position. Similarly, we have traversed the right edge of the upper lip boundary to find the point P3 (Fig. 4). When the mouth is open, the feature points on the inner upper lip boundary (P8) and the inner lower lip boundary (P9) are determined. Teeth and tongue cause problems in the determination of P8 and P9 from the edge map. So after the determination of P4 we have searched downward, up to P1, for the first point which has the maximum gradient of pseudo hue (H) for the middle column; this is P8. In the same way, after the determination of P6 we have searched upward, up to P1, for the first point which has the maximum gradient of pseudo hue (H) for the middle column; this is P9. After detecting the feature points, the upper lip boundary is modeled using a cubic curve (cardinal spline) [18].

Fig. 2 a Lip region. b Pseudo hue (H)

Fig. 3 a Lip region. b Edges of lip region

Experimentally, it is found that the upper inner boundary, lower inner boundary and lower outer boundary of the lip can be modeled more accurately using parabolas than a cubic curve, as shown in Fig. 4. The location and feature point of the nose are found using the vertical component of the gradient of the face image between the eyes and the mouth.

Fig. 4 Lip model

Fig. 5 Facial features

The detected feature points for a frontal face image are shown in Fig. 5.

2.2 Face model adaptation

This is a process in which the generic 3D face model is deformed to fit a specific face. Our proposed generic model [18] is shown in Figs. 6 and 7; it is polygon-based (a triangle mesh) and consists of 350 triangles and 221 vertices.

The model is adapted to the given frontal face image with the help of two geometrical transformations, scaling and translation. Assuming orthographic projection, the translation vector can be derived by calculating the distance between the 3D face model center and the 2D face center. Let Cl denote the center of the left eye, Cr the center of the right eye, Cc the middle point between the two eyes and Cm the center of the mouth in the given face. Similarly, C′l, C′r, C′c and C′m are the corresponding points in the 2D projection of the 3D face model. The model is scaled by amounts Sx, Sy and Sz using Eq. 5.

Sx = |Cl − Cr| / |C′l − C′r|,  Sy = |Cc − Cm| / |C′c − C′m|,  Sz = (Sx + Sy) / 2   (5)
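
Eq. 5 reduces to two distance ratios and their average; a small sketch, with the landmarks given as (x, y) pairs and names of our own choosing:

import numpy as np

def model_scale(cl, cr, cc, cm, cl_p, cr_p, cc_p, cm_p):
    """Scale factors of Eq. 5. cl, cr, cc, cm: eye centers, mid-eye point
    and mouth center in the image; cl_p, ..., cm_p: the corresponding
    points in the 2D projection of the 3D model."""
    sx = np.linalg.norm(np.subtract(cl, cr)) / np.linalg.norm(np.subtract(cl_p, cr_p))
    sy = np.linalg.norm(np.subtract(cc, cm)) / np.linalg.norm(np.subtract(cc_p, cm_p))
    return sx, sy, (sx + sy) / 2.0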

After global adaptation of the model we perform local refinement, aligning the model's eyes, eyebrows and mouth with the corresponding face features. The face boundary is detected using the morphological erosion operation as shown in Eq. 6.

B(x, y) = F(x, y) − F(x, y) ⊖ G(x, y)   (6)

where F(x, y) is the face image and G(x, y) is the structuring element.

Fig. 6 Generic face model

Fig. 7 Models of a eyebrow, b eyes, c mouth, d left cheek, e right cheek, f nose

Fig. 8 Chin model


To get the complete 3D model, the model boundary points are aligned with the corresponding face boundary points with the help of translation. In many faces the boundary corresponding to the chin may not be found properly. So for the chin we have fitted a parabola, as shown in Fig. 8, after finding the points C1, C2 and C3. C1 and C2 are the boundary points corresponding to the mouth corner points. C3 is the bottom boundary point for the middle column, which is the minimum-intensity point between the mouth and the neck, because there is a shadow between the face and the neck. After the 3D face geometry is reconstructed, it is rendered and the appropriate texture is mapped to synthesize the 2D face image.
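
The chin parabola through C1, C2 and C3 can be recovered by solving a 3 × 3 linear system; a minimal sketch (the paper gives no code, so the names are ours):

import numpy as np

def chin_parabola(c1, c2, c3):
    """Fit y = a*x^2 + b*x + c through the chin points C1, C2 and C3,
    each given as an (x, y) pair. Returns (a, b, c)."""
    pts = np.array([c1, c2, c3], dtype=float)
    A = np.column_stack([pts[:, 0] ** 2, pts[:, 0], np.ones(3)])
    return tuple(np.linalg.solve(A, pts[:, 1]))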

3 Expressions generation

Expressions are represented with the help of MPEG-4 facial animation parameters (FAPs). The FAPs are a set of parameters defined in the MPEG-4 visual standard for the animation of synthetic face models. There are 68 FAPs, including 2 high-level FAPs used for visual phonemes and expressions and 66 low-level FAPs used to characterize facial feature movements over the jaw, lips, eyes, mouth, nose, cheeks, ears, etc. The expression parameter FAP-2 defines the six primary facial expressions as shown in Table 1. We have generated the six basic expressions with the help of the low-level FAPs as discussed in [17,19]. The FAPs are computed by tracking the set of facial features defined in Fig. 5, and they are measured in facial animation parameter units (FAPUs) that permit us to place FAPs on any facial model in a consistent way [20]. The FAPUs are defined with respect to the distances between key facial features in their neutral state, such as the eye separation (ES0), the iris diameter (IRISD0), the eye-nose separation (ENS0), the mouth-nose separation (MNS0) and the mouth width (MW0), as shown in Fig. 9.

Table 2 gives the relation between the expressions and the involved FAPs. Expressions are generated by moving and deforming various control vertices of the face model according to the FAPs. A negative sign with a FAP indicates motion in the opposite direction. If Vm denotes the neutral coordinate of the m-th vertex in a certain dimension of the 3D space, its animated position V′m in the same dimension can be expressed as

V′m = Vm + wn ∗ FAPUn ∗ in   (7)

where wn is the weight of the n-th FAP, FAPUn is the FAPU of the n-th FAP and in is the amplitude of the FAP, ranging over [0, 1]. In fact, the term wn ∗ FAPUn defines the maximum displacement of FAPn, while the coefficient in controls the intensity of FAPn. We have developed a scan line algorithm that establishes the correspondence between each triangle of the neutral model and the expression model, for each scan line and each pixel, to generate the expression-specific texture.

4 Expression morphing

Expression morphing means the generation of continuous and realistic transitions between different facial expressions. We achieve these effects by morphing between the corresponding face models. A 3D morphing sequence can be obtained using simple linear interpolation between the geometric coordinates of corresponding vertices in each of the two face meshes. Together with the geometric interpolation, we need to blend the associated textures. When we morph two different expressions of the same face model, the intermediate face model is first generated by geometric interpolation. The texture for this intermediate model is generated directly from the neutral face by establishing correspondence between each triangle of the neutral model and the intermediate expression model, for each scan line and each pixel.

Table 1 Facial expressions defined by FAP-2

Happiness: The eyebrows are relaxed. The mouth is open and the mouth corners are pulled back toward the ears.

Sadness: The inner eyebrows are bent upward. The eyes are slightly closed, the mouth is relaxed.

Anger: The inner eyebrows are pulled downward and together. The eyes are wide open. The lips are pressed against each other or opened to expose the teeth.

Fear: The eyebrows are raised and pulled together. The inner eyebrows are bent upward. The eyes are tense and alert.

Disgust: The eyebrows and eyelids are relaxed. The upper lip is raised and curled, often asymmetrically.

Surprise: The eyebrows are raised. The upper eyelids are wide open, the lower relaxed. The jaw is opened.

Fig. 9 Neutral face and FAPUs

Table 2 Facial expressions and FAPs

Happiness: raise corner lip, stretch corner lip, lift cheek, mouth open (FAPs 59, 60, 6, 7, 41, 42, 4-, 5-, 51-, 52-)

Sadness: lower corner lip, lower inner eyebrow, close eyelid (FAPs 59-, 60-, 31-, 32-, 19, 20)

Disgust: close eyelid, mouth open, stretch nose, raise corner lip (FAPs 19, 20, 4-, 5, 51-, 52, 61, 62, 59, 60)

Surprise: raise eyebrow, mouth open, open eyelid (FAPs 31, 32, 33, 34, 35, 36, 4-, 5-, 51-, 52-, 19-, 20-)

Anger: open eyelid, lower eyebrow, squeeze eyebrow, mouth open (FAPs 19-, 20-, 31-, 32-, 37, 38, 4-, 5, 51-, 52)

Fear: open eyelid, raise eyebrow, squeeze eyebrow, mouth open (FAPs 19-, 20-, 31, 32, 33, 34, 35, 36, 37, 38, 4-, 5-, 51-, 52-)

We have also developed an algorithm which morphs the expressions of any two face models. We have used a triangle-based warping method [21,22]. The intermediate texture is generated using linear interpolation of the respective source and destination triangles for each scan line.

A new expression can be generated by blending any two of the six basic expressions with the help of morphing. Our proposed morphing algorithm is as follows.

Step 1: For each value of the interpolation factor t (0 <= t <= 1), repeat Steps 2 to 5.

Step 2: For each triangle of the model, repeat Steps 3 to 5 to generate the intermediate frame.

Step 3: Sort the three vertices of the triangle by their y coordinates in ascending order.

Step 4: Interpolate the vertices of the triangle in the source and target models:

Interpolated vertices = (1 − t) ∗ source triangle vertices + t ∗ target triangle vertices

MI = M1 ∗ Ms, so M1 = MI ∗ Inv(Ms)
MI = M2 ∗ Mt, so M2 = MI ∗ Inv(Mt)

where M1 is the affine transformation matrix that relates the source triangle vertices (Ms) to the intermediate triangle vertices (MI). In the same way, M2 is the affine transformation matrix that relates the target triangle vertices (Mt) to the intermediate triangle vertices (MI).

Step 5: Generate the texture for the intermediate frame by establishing the correspondence between the source model triangle and the destination model triangle for each scan line and each pixel:

Image(x, y) = (1 − t) ∗ Image1(x1, y1) + t ∗ Image2(x2, y2)

[x1, y1, 1]^T = Inv(M1) ∗ [x, y, 1]^T
[x2, y2, 1]^T = Inv(M2) ∗ [x, y, 1]^T

where (x, y) is a pixel in the intermediate frame, (x1, y1) is the corresponding pixel in the source image and (x2, y2) is the corresponding pixel in the destination image.

Fig. 10 Synthesized images and 3D models

The starting x value for the next scan line equals the previous starting x value plus 1/slope. Because of the affine transformations, this algorithm gives better results than our previously proposed algorithm discussed in [17,23].
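
Steps 4 and 5 for a single triangle can be sketched as below. The sketch uses per-pixel inverse mapping rather than an explicit scan-line loop, nearest-neighbor sampling, and names of our own; it illustrates the affine construction above rather than reproducing the authors' implementation.

import numpy as np

def affine_from_intermediate(tri_from, tri_inter):
    """3 x 3 matrix A such that A @ [x, y, 1]^T maps a point of the
    intermediate triangle onto tri_from (both are 3 x 2 vertex arrays)."""
    src = np.column_stack([tri_inter, np.ones(3)])
    dst = np.column_stack([tri_from, np.ones(3)])
    return np.linalg.solve(src, dst).T

def morph_triangle(img1, img2, tri_s, tri_t, t):
    """Interpolate one triangle's vertices (Step 4) and return them with a
    per-pixel sampler that blends both textures (Step 5)."""
    tri_i = (1 - t) * tri_s + t * tri_t               # Step 4
    inv_m1 = affine_from_intermediate(tri_s, tri_i)   # (x, y) -> (x1, y1)
    inv_m2 = affine_from_intermediate(tri_t, tri_i)   # (x, y) -> (x2, y2)

    def sample(x, y):
        x1, y1, _ = inv_m1 @ np.array([x, y, 1.0])
        x2, y2, _ = inv_m2 @ np.array([x, y, 1.0])
        p1 = img1[int(round(y1)), int(round(x1))]     # row = y, col = x
        p2 = img2[int(round(y2)), int(round(x2))]
        return (1 - t) * p1 + t * p2                  # Step 5 blend

    return tri_i, sample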

5 Animation

Once the expressions are modeled as described above, it is time to animate our character of interest. As mentioned earlier, we have used the key-frame approach to animation. Every facial expression can be stored as a key-frame by storing the values of its parameters. After all key-frames are defined, the animation can be created by generating intermediate frames. Intermediate frames are generated by interpolating the parameter values of successive key-frames. Global animation of the face is of great importance in the implementation of facial animation, where an animator manipulates a 3D face in terms of translation, rotation and zooming. We have rotated the 3D model about the three primary axes, and our morphing algorithm automatically generates the texture for the rotated model. In the same way, the 3D model is scaled to create the effect of zoom-in and zoom-out.

6 Simulation results

The face images we have used for simulation are mainly from the 3D facial expression database BU-3DFE. The database covers both male and female subjects with different expressions, various nationalities and different illuminations. We have tested our algorithm on lower-resolution face images as well as on higher-resolution face images of size 512 × 512. The results discussed in this paper are visually better than those published in [17,19,23]. Our database consists of images of

Fig. 11 a Frontal face with hair. b Synthesized face. c Synthesized face after rotation about the y-axis

different people with different illuminations. We have evaluated our algorithm on many face images, and the result of facial feature extraction for one of the face images is shown in Fig. 5. Some of the results of 3D model construction are shown in Fig. 10. We have also constructed the 3D model of a face image with hair, which is shown in Fig. 11. We have tested the accuracy of our algorithm by comparing the synthesized face image with the original face image at the pixel level. Our algorithm manipulates the vertices of the face model according to the FAPs specified in the MPEG-4 standard and generates the six basic expressions. Figure 12 shows the happy, sad, angry, fear, disgust and surprise expressions on a synthesized human face. The 3D morphing sequence is obtained using linear interpolation between the geometric coordinates of corresponding vertices in each of the two face meshes and blending the associated textures. In our experiments we have used the linear interpolation parameter t = 0.2, 0.4, 0.6, 0.8 and 1 to generate intermediate frames. We have chosen the value of t in steps of 0.2 to show a smooth change from one frame to the next, which enables our algorithm to run faster without compromising the quality of reconstruction. In Fig. 13 we have morphed the sad and surprise expressions and generated a new expression. Figure 14 shows the result of morphing between two different face models. The result of 3D rotation about the three principal axes is shown in Fig. 15. The results of zoom-in and zoom-out are shown in Fig. 16.

We have developed our algorithm in MATLAB and tested it on a PIV, 3 GHz, 1 GB RAM computer. The total time from feature extraction to synthesis of the image is shown in Table 3. The overall speed of the algorithm is lower for larger image sizes, being proportional to the number of pixels to be rendered.

7 Conclusion

Our proposed method adapts a generic 3D model into a face-specific model and successfully synthesizes the 3D face, i.e. it generates the 3D topology and texture. Our proposed method does not require any user interaction. We have also implemented facial animation with the synthesized 3D face in various manners, including expression synthesis and the synthesis of different views with the help of rotation of the 3D model. Our proposed morphing algorithm generates new expressions from the existing six basic expressions.

Fig. 12 Synthesized expressions: a happy, b sad, c angry, d fear, e disgust, f surprise

Fig. 14 Morphing between Model A and Model B

Fig. 15 a Synthesized face after rotation of the model about the y-axis (45°). b Synthesized face after rotation of the model about the x-axis (45°). c Synthesized face after rotation of the model about the z-axis (30°)

Fig. 16 a Zoom-out (Sx, Sy, Sz) = (0.5, 0.5, 0.5). b Zoom-in (Sx, Sy, Sz) = (2, 2, 2)

Fig. 13 Morphing between the sad and surprise expressions

Table 3 Time complexity of the 3D model reconstruction algorithm

Feature points in every case: left eye: 4, right eye: 4, nose: 1, lip: 8, chin: 3, face contour: 20. Generic 3D model in every case: 350 triangles, 221 vertices.

Image size    Time (s)
200 × 200     1.438
512 × 512     5.578
688 × 472     7.125

The morphing result indirectly indicates the accuracy of our algorithm, because it only produces a smooth result if the features are properly aligned in the respective models.

Acknowledgments The authors would like to thank the State University of New York at Binghamton for providing the BU-3DFE database.

References

1. Ersotelos, N., Dong, F.: Building highly realistic facial modeling and animation: a survey. Vis. Comput. 24(1), 13–30 (2008)

2. Ip, H.H.S., Yin, L.: Constructing a 3D individualized head model from two orthogonal views. Vis. Comput. 12(5), 254–266 (1996)

3. Blanz, V., Vetter, T.: A morphable model for the synthesis of 3D faces. In: Proceedings of SIGGRAPH '99, pp. 187–194. Los Angeles, CA, USA (1999)

4. Feng, G.C., Yuen, P.C.: Recognition of head and shoulder face image using virtual frontal view image. IEEE Trans. Syst. Man Cybern. Part A 30(6), 871–882 (2000)

5. Liu, Z.: A fully automatic system to model faces from a single image. MSR technical report (2003)

6. Parke, F.I.: Parameterized models for facial animation. IEEE Comput. Graph. Appl. 2(9), 61–68 (1982)

7. Pandzic, I.S., Komiya, R., Forchheimer, R.: MPEG-4 Facial Animation: The Standard, Implementation and Applications. John Wiley & Sons, New York (2002). ISBN: 0-470-84465-5

8. Zhang, Y., Zhu, Z., Yi, B.: Dynamic facial expression analysis and synthesis with MPEG-4 facial animation parameters. IEEE Trans. Circuits Syst. Video Technol. 18(10), 1383–1396 (2008)

9. Yin, L., Sun, X., Wang, Y., Rosato, M.J.: A 3D facial expression database for facial behavior research. In: Proceedings of the International Conference on FGR, pp. 211–216. UK, April 2–6 (2006)

10. Guan, Y.: Automatic 3D face reconstruction based on single 2D image. In: International Conference on Multimedia and Ubiquitous Engineering, pp. 1216–1219. Seoul, Korea, April 26–28 (2007)

11. Baek, S., Kim, B., Lee, K.: 3D face model reconstruction from single 2D frontal image. In: VRCAI 2009, pp. 95–101. Yokohama, Japan, Dec. 14–15 (2009). ACM, ISBN 978-1-60558-912-1

12. Yagi, Y.: Facial feature extraction from frontal face image. In: Proceedings of the IEEE International Conference on Signal Processing, vol. 2, pp. 1225–1232 (2000)

13. Sheng, Y., Sadka, A.H., Kondoz, A.M.: An automatic algorithm for facial feature extraction in video applications. In: Proceedings of the Fifth International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS 2004), Lisbon, Portugal, April 21–23 (2004)

14. Sheng, Y., Sadka, A.H., Kondoz, A.M.: Automatic single view based 3D face synthesis for unsupervised multimedia applications. IEEE Trans. Circuits Syst. Video Technol. 18(7), 961–974 (2008)

15. Hsu, R.L., Abdel-Mottaleb, M., Jain, A.K.: Face detection in color images. IEEE Trans. Pattern Anal. Mach. Intell. 24(5), 696–706 (2002)

16. Hulbert, A., Poggio, T.: Synthesizing a color algorithm from examples. Science 239, 482–485 (1988)

17. Patel, N.M., Zaveri, M.: 3D facial model construction and expressions synthesis using a single frontal face image. Int. J. Comput. Graph. SERC 1(1) (2010)

18. Patel, N.M., Patel, P., Zaveri, M.: Parametric model based facial animation synthesis. In: International Conference on Emerging Trends in Computing, pp. 8–10. Kamaraj College of Engineering & Technology, Tamil Nadu, India (2009)

19. Narendra, P., Mukesh, Z.: 3D model construction and expression synthesis from a single frontal image. In: International Conference on Computer and Communication Technology, MNNIT, Allahabad, Sept. 17–19 (2010)

20. Krinidis, S., Pitas, I.: Statistical analysis of human facial expressions. J. Inf. Hiding Multimed. Signal Process. 1(3), 241–260 (2010)

21. Pighin, F., Hecker, J., Lischinski, D., Szeliski, R., Salesin, D.H.: Synthesizing realistic facial expressions from photographs. In: Proceedings of SIGGRAPH, pp. 75–84 (1998)

22. Lee, W., Magnenat-Thalmann, N.: Head modeling from pictures and morphing in 3D with image metamorphosis based on triangulation. LNCS, vol. 1537, pp. 254–267. Springer, Berlin, Heidelberg (1998)

23. Narendra, P., Mukesh, Z.: 3D facial model construction and animation from a single frontal face image. In: International Conference on Communications and Signal Processing, NIT Calicut, Feb. 10–12 (2011)
