automated restyling of human portrait based on facial...

3
Automated Restyling of Human Portrait Based on Facial Expression Recognition and 3D Reconstruction Cheng-Han(Dennis) Wu, Hsin Chen Department of Electrical Engineering, Department of Mechanical Engineering, Stanford University 1. 8 Emotion Effects 2. Unseen Dataset Test accuracy Using 746 labeled Cohn-Kanade(CK) images Fine KNN has a 73.5% accuracy over unseen data. KNN/subKNN robust toward two models. Softmax performance heavily depends on model and feature quality. Classification 1. Classification Algorithm Benchmark Cross-validation accuracy Max. Accuracy: Googlenet(~99%) > DeXpression(~94%) 2. KNN and Subspace KNN Kinect • RGB & Depth Preproccesing Expression • CNN Model Training • CNN Model Comparison • Classification Algo. Optimization • Expression Recognition Restyling • 3D Face Model Calculation • Shading/Relighting/Restyling Results Feature Extraction Neutral Angry Contempt Disgust Fear Sad Happy Surprise 1. Dataset(CK+) Extended Cohn-Kanade dataset 5876 images, 8 Emotions Trained Size: Small: 981 images Large: 1635 images 2. Training Platform: Caffe 3. Model DeXpression(DeX): 11 main layers 400000 Iterations (loss ~ 0.0008) GoogleNet(GlN): 27 main layers 240000 Iterations (loss ~ 0.0004) Image Processing Misclassification Analysis System Architecture Related Work Motivation 1. Kinect for Windows V2 Calibration 2. Cropping ROI and Image Masking Using Depth Crop with OpenCV tracked face ROI. Mask with dilated, hole-filled, and threshold depthmap. 3. Depth Denoising 4. Shading/Relighting Calculate directional lighting intensity and color for normal map. Original Left(-x) blue light Top(+y) red light Depth Normal Map (R:x G:y B:z) Surface Normal Calculation 1. Depth Based Visual Effect Matthias Ziegler, Andreas Engelhardt, Stefan Mller, Joachim Keinert, Frederik Zilly, Siegfried Foessel. Multi- camera system for depth based visual effects and compositing. CVMP, 2015. 2. Caffe Model: GoogLeNet Christian Szegedy, Wei Liu, Yangqing Jia, …et al. Going Deeper with Convolutions. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 3. Caffe Model: DeXpression Peter Burkert, Felix Trier …et al. DeXpression: Deep Convolutional Neural Network for Expression Recognition. German Research Center for Artificial Intelligence (DFKI), Kaiserslautern, Germany. Aug, 2016. Inspired by the movie Inside Out, the project showcases the capabilities of AI in visual effects automation in photography or film making. Furthermore, through convolutional neural network(CNN) with enhanced classification algorithm, we are able to achieve high expression recognition accuracy. 0 10 20 30 40 50 60 70 80 90 100 Complex Tree Quadratic Discriminant Cubic SVM Fine KNN Subspace KNN Softmax Small dataset/Dexpression Large dataset/Dexpression Large dataset/Googlenet 0 10 20 30 40 50 60 70 80 90 100 Complex Tree Quadratic Discriminant Cubic SVM Fine KNN Subspace KNN Softmax Small dataset/Dexpression Large dataset/Dexpression Large dataset/Googlenet Confusion Matrix (subspace KNN) Confusion Matrix (softmax) Contempt Neutral Disgust Angry 2. GoogleNet: prone to angry → disgust misclassification 1. DeXpression: prone to contempt → neutral misclassification KNN: Given a set of N points in an -dimensional feature space, calculate proximity of the test point with the closest neighbors using their relative Euclidean distances. Assign the test point to the class that has the most frequent occurrence. Subspace KNN: Given a set of N points in an -dimensional feature space, randomly choose −dimensional subspace ( < ), in which find nearest neighbors for a test point. Repeat such procedure for times, then assign the test point to the class that has the most frequent occurrence. For this model, = 8, = 4, = 30.

Upload: others

Post on 24-Mar-2020

8 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Automated Restyling of Human Portrait Based on Facial ...cs229.stanford.edu/proj2016/poster/WuChen-Automated Restyling of Human Portrait Based...Automated Restyling of Human Portrait

Automated Restyling of Human Portrait Based on Facial Expression Recognition and 3D Reconstruction

Cheng-Han(Dennis) Wu, Hsin ChenDepartment of Electrical Engineering, Department of Mechanical Engineering, Stanford University

1. 8 Emotion Effects2. Unseen Dataset Test accuracy Using 746 labeled Cohn-Kanade(CK) images Fine KNN has a 73.5% accuracy over unseen

data. KNN/subKNN robust toward two models. Softmax performance heavily depends on

model and feature quality.

Classification1. Classification Algorithm Benchmark Cross-validation accuracy Max. Accuracy: Googlenet(~99%) > DeXpression(~94%)

2. KNN and Subspace KNN

Kinect • RGB & Depth Preproccesing

Expression

• CNN Model Training

• CNN Model Comparison

• Classification Algo. Optimization

• Expression Recognition

Restyling• 3D Face Model Calculation

• Shading/Relighting/Restyling

Results

Feature Extraction

Neutral Angry

Contempt Disgust

Fear

Sad Happy

Surprise

1. Dataset(CK+) Extended Cohn-Kanade dataset 5876 images, 8 Emotions Trained Size:

● Small: 981 images● Large: 1635 images

2. Training Platform: Caffe3. Model DeXpression(DeX): 11 main layers400000 Iterations (loss ~ 0.0008) GoogleNet(GlN): 27 main layers240000 Iterations (loss ~ 0.0004)

Image Processing

Misclassification Analysis

System Architecture

Related Work

Motivation

1. Kinect for Windows V2 Calibration2. Cropping ROI and Image Masking Using Depth Crop with OpenCV tracked face ROI. Mask with dilated, hole-filled, and thresholddepthmap. 3. Depth Denoising4. Shading/Relighting Calculate directional lighting intensity and color for normal map. Original Left(-x) blue

lightTop(+y) red light

Depth

Normal Map(R:x G:y B:z)

Surface Normal

Calculation

1. Depth Based Visual EffectMatthias Ziegler, Andreas Engelhardt, Stefan Mller, Joachim Keinert, Frederik Zilly, Siegfried Foessel. Multi-camera system for depth based visual effects and compositing. CVMP, 2015.

2. Caffe Model: GoogLeNetChristian Szegedy, Wei Liu, Yangqing Jia, …et al. Going Deeper with Convolutions. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

3. Caffe Model: DeXpressionPeter Burkert, Felix Trier …et al. DeXpression: Deep Convolutional Neural Network for Expression Recognition. German Research Center for Artificial Intelligence (DFKI), Kaiserslautern, Germany. Aug, 2016.

Inspired by the movie Inside Out, the project showcases the capabilities of AIin visual effects automation in photography or film making. Furthermore,through convolutional neural network(CNN) with enhanced classificationalgorithm, we are able to achieve high expression recognition accuracy.

0

10

20

30

40

50

60

70

80

90

100

Complex Tree QuadraticDiscriminant

Cubic SVM Fine KNN Subspace KNN Softmax

Small dataset/Dexpression Large dataset/Dexpression Large dataset/Googlenet

0

10

20

30

40

50

60

70

80

90

100

Complex Tree QuadraticDiscriminant

Cubic SVM Fine KNN Subspace KNN Softmax

Small dataset/Dexpression Large dataset/Dexpression Large dataset/Googlenet

Confusion Matrix (subspace KNN) Confusion Matrix (softmax)

Contempt

Neutral

Disgust

Angry

2. GoogleNet: prone to angry → disgust

misclassification

1. DeXpression: prone to contempt → neutral

misclassification

KNN:Given a set of N points in an 𝑛 -dimensional feature space, calculate proximity of the test point with the closest 𝑘 neighbors using their relative Euclidean distances. Assign the test point to the class that has the most frequent occurrence.Subspace KNN:Given a set of N points in an 𝑛 -dimensional feature space, randomly choose 𝑚−dimensional subspace (𝑚 < 𝑛), in which find 𝑘 nearest neighbors for a test point.Repeat such procedure for 𝐶𝑛

𝑚 times, then assign the test point to the class that has the most frequent occurrence.For this model, 𝑛 = 8, 𝑚 = 4, 𝑘 = 30.

Page 2: Automated Restyling of Human Portrait Based on Facial ...cs229.stanford.edu/proj2016/poster/WuChen-Automated Restyling of Human Portrait Based...Automated Restyling of Human Portrait

Dex largeDex

Automated Restyling of Human Portrait Based on Facial Expression Recognition and 3D Reconstruction

Cheng-Han(Dennis) WuDepartment of Electrical Engineering, Stanford University

Page 3: Automated Restyling of Human Portrait Based on Facial ...cs229.stanford.edu/proj2016/poster/WuChen-Automated Restyling of Human Portrait Based...Automated Restyling of Human Portrait

OriginalFront(+z)

white lightLeft(-x) blue

lightTop(+y) red light

Depth

Normal Map(R:x G:y B:z)

Original Depth Face Mask

Surface NormalCalculation

1. Cropping ROI and Image Masking Crop with OpenCV tracked face ROI Mask with dilated, hole-filled, and

thresholded depthmap. 2. Shading/Relighting Calculate surface normal from depth. Calculate directional lighting intensity

and color for the face.

2. Restyling Apply mood lighting and background, processed using above mask applied on both images.