Face Recognition Based on Deep Learning (Yurii Pashchenko Technology Stream)
recognizzit
www.recognizz.it
Face recognition approaches
Verification
Identification
Similarity
Attributes
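The difference between verification (1:1) and identification (1:N) can be sketched as follows. This is a minimal illustration that assumes faces are already encoded as feature vectors; the function names, the choice of cosine similarity, the 0.5 threshold, and the toy vectors are assumptions made for the example, not something taken from the slides.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def verify(feat_a, feat_b, threshold=0.5):
    """Verification (1:1): do the two faces belong to the same person?"""
    return cosine_similarity(feat_a, feat_b) >= threshold

def identify(probe, gallery):
    """Identification (1:N): label of the gallery entry most similar to the probe."""
    scores = {name: cosine_similarity(probe, feat) for name, feat in gallery.items()}
    return max(scores, key=scores.get)

# Toy, hand-made feature vectors (hypothetical data)
gallery = {
    "alice": np.array([1.0, 0.0, 0.2]),
    "bob": np.array([0.0, 1.0, 0.1]),
}
probe = np.array([0.9, 0.1, 0.2])

same_person = verify(probe, gallery["alice"])  # 1:1 decision
best_match = identify(probe, gallery)          # 1:N search
```

Verification needs a threshold, while identification only needs a ranking over the gallery; in practice both would run on features produced by a trained network.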
Benchmarks
ORL
FERET
Labeled Faces in the Wild (LFW)
YouTube Faces (YTF)
ORL
Images taken between April 1992 and April 1994 at the lab
There are 10 different images of each of 40 distinct subjects
The size of each image is 92x112 pixels, with 256 grey levels per pixel
Size ~10 MB
* http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html
FERET
Images taken in 2 steps: 1993-1997, 1998-2001
1200 subjects, 14051 face images
Image sets: fa, fb, duplicate 1, fc, duplicate 2
Heads with views ranging from frontal to left and right profiles
Different lighting
Size ~ 8GB
* http://www.itl.nist.gov/iad/humanid/feret/feret_master.html
Labeled Faces in the Wild
Faces collected from the web
5749 subjects (1680 of them with two or more images), 13 233 face images
Size ~200 MB
* http://vis-www.cs.umass.edu/lfw/
LFW unrestricted with labeled outside data protocol
10 batches
Each batch: 300 matched pairs, 300 mismatched pairs
Huang, G.B., Learned-Miller, E.: Labeled faces in the wild: Updates and new reporting procedures. Technical Report UM-CS-2014-003, UMass Amherst (2014)
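10 rounds of cross-validation (Round 1, Round 2, Round 3, ..., Round 10): in each round one batch serves as the validation set and the remaining nine as the training set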
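The "Accuracy ± SE" figures quoted for LFW are a mean accuracy over the 10 rounds together with the standard error of that mean. The sketch below shows one way to compute them, assuming each pair has already been reduced to a similarity score; the helper names, the threshold search, and the synthetic scores are assumptions for illustration only.

```python
import numpy as np

def best_threshold(scores, labels, n_candidates=201):
    """Pick the threshold that maximizes accuracy on the training batches."""
    candidates = np.linspace(scores.min(), scores.max(), n_candidates)
    accs = [np.mean((scores >= t) == labels) for t in candidates]
    return candidates[int(np.argmax(accs))]

def lfw_style_accuracy(scores, labels, n_folds=10):
    """Mean accuracy and standard error of the mean over n_folds rounds."""
    score_folds = np.array_split(scores, n_folds)
    label_folds = np.array_split(labels, n_folds)
    fold_acc = []
    for k in range(n_folds):
        # Fit the threshold on the other nine batches, test on batch k
        train_s = np.concatenate([s for i, s in enumerate(score_folds) if i != k])
        train_l = np.concatenate([l for i, l in enumerate(label_folds) if i != k])
        t = best_threshold(train_s, train_l)
        fold_acc.append(np.mean((score_folds[k] >= t) == label_folds[k]))
    fold_acc = np.array(fold_acc)
    std_err = fold_acc.std(ddof=1) / np.sqrt(n_folds)
    return float(fold_acc.mean()), float(std_err)

# Synthetic stand-in: 10 batches x (300 matched + 300 mismatched) pairs
rng = np.random.default_rng(0)
labels = np.tile(np.r_[np.ones(300, dtype=bool), np.zeros(300, dtype=bool)], 10)
scores = np.where(labels,
                  rng.normal(0.7, 0.15, labels.size),
                  rng.normal(0.3, 0.15, labels.size))

mean_acc, std_err = lfw_style_accuracy(scores, labels)
print(f"{mean_acc:.4f} ± {std_err:.4f}")
```

Fitting the threshold only on the training batches keeps the validation batch of each round unseen, which is the point of the protocol.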
Face Recognition Algorithm
Face localization -> Normalization -> Feature extraction -> Comparing (verification metric)
Example output: FALSE
What came before deep convolutional networks (DCN)
Method                    Accuracy ± SE
combined Joint Bayesian   0.9242 ± 0.0108
Tom-vs-Pete + Attribute   0.9330 ± 0.0128
High-dim LBP              0.9517 ± 0.0113
TL Joint Bayesian         0.9633 ± 0.0108
Human, cropped            0.975
DeepFace
Y. Taigman, M. Yang, M. Ranzato, and L. Wolf. DeepFace: Closing the gap to human-level performance in face verification. In CVPR, 2014.
Training data
Dataset: Social Face Classification (4030 subjects, ~4.4 million face images)
Alignment: 2D and 3D
Architecture
Input: RGB image 152x152
Output feature size: 4096
Parameters: ~ 120 million
Verification metric: weighted chi-squared distance, Siamese network
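The weighted chi-squared distance between two non-negative feature vectors is sum_i w_i (f1_i - f2_i)^2 / (f1_i + f2_i). A minimal sketch follows; in DeepFace the weights are learned, whereas here uniform weights, the small `eps` guard against division by zero, and the 4-element toy vectors are placeholders assumed for illustration.

```python
import numpy as np

def weighted_chi2(f1, f2, w, eps=1e-12):
    """Weighted chi-squared distance: sum_i w_i * (f1_i - f2_i)^2 / (f1_i + f2_i)."""
    f1, f2, w = (np.asarray(x, dtype=float) for x in (f1, f2, w))
    return float(np.sum(w * (f1 - f2) ** 2 / (f1 + f2 + eps)))

# Toy non-negative features; uniform weights stand in for learned ones (assumption)
f1 = np.array([0.2, 0.0, 0.5, 0.3])
f2 = np.array([0.1, 0.1, 0.5, 0.3])
w = np.ones(4)

d_same = weighted_chi2(f1, f1, w)  # identical vectors give zero distance
d_diff = weighted_chi2(f1, f2, w)  # larger when the features disagree
```

A pair would then be declared a match when the distance falls below a threshold chosen on training data.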
Results on LFW
Method                    Accuracy ± SE
combined Joint Bayesian   0.9242 ± 0.0108
Tom-vs-Pete + Attribute   0.9330 ± 0.0128
High-dim LBP              0.9517 ± 0.0113
TL Joint Bayesian         0.9633 ± 0.0108
DeepFace-align2D          0.943 ± 0.0043
DeepFace-Siamese          0.9617 ± 0.0038
DeepFace-ensemble         0.9735 ± 0.0025
Human, cropped            0.975
DeepID
Y. Sun, X. Wang, and X. Tang. Deep learning face representation from predicting 10,000 classes. In CVPR, 2014.
Training data
Dataset: CelebFaces (10 177 subjects, 202 599 face images)
Alignment: 2D Patch
Architecture
Input: 39x31 RGB or grayscale
Output feature size: 160
Additional algorithms: Joint Bayesian, PCA
Only 1 patch
Results on LFW
Method                     #Net   Accuracy ± SE
DeepFace-Siamese           1      0.9617 ± 0.0038
DeepFace-ensemble          7      0.9735 ± 0.0025
DeepID on CelebFaces       60     0.9605 ± …
DeepID on CelebFaces+      100    0.972 ± …
DeepID on CelebFaces+ TL   100    0.9745 ± 0.0026
Human, cropped                    0.975
CASIA
D. Yi, Z. Lei, S. Liao, and S. Z. Li. Learning face representation from scratch. Technical report, arXiv:1411.7923, 2014.
Training data
Dataset: CASIA WebFace (10 575 subjects and 494 414 face images)
Alignment: 2D
Architecture
Input: gray image 100x100
Output feature vector size: 320
Parameters: ~5 million
Verification metric: Cosine
Additional algorithms: Joint Bayes, PCA
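Reducing the feature vector with PCA and then comparing by cosine similarity can be sketched as below. This is an illustration under stated assumptions: the random matrix stands in for real 320-dimensional features, and the helper names and the choice of 64 components are hypothetical, not values from the paper.

```python
import numpy as np

def fit_pca(features, n_components):
    """Mean and top principal directions of a feature matrix (rows = samples)."""
    mean = features.mean(axis=0)
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(features, mean, components):
    """Project centered features onto the principal directions."""
    return (features - mean) @ components.T

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
train = rng.normal(size=(500, 320))  # random stand-in for 320-d face features
mean, comps = fit_pca(train, n_components=64)

a, b = project(train[:2], mean, comps)  # two faces in the reduced space
score = cosine(a, b)                    # compared against a threshold to verify
```

The results table distinguishes fitting the PCA on CASIA-WebFace from fitting it on the LFW training set; in this sketch that only changes which matrix is passed to `fit_pca`.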
Combined identification and verification loss
Very deep architecture
Layers (input to output): Conv11, Conv12, Pool1 2x2+2(S), Conv21, Conv22, Pool2 2x2+2(S), Conv31, Conv32, Pool3 2x2+2(S), Conv41, Conv42, Pool4 2x2+2(S), Conv51, Conv52, Pool5 7x7+1(S), Dropout 40%, Fc6
Cost1: Softmax (face labels)
Cost2: Contrastive (verification labels [0,1])
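A combined identification and verification loss adds a softmax cost on the face labels to a contrastive cost on pairs of features. The sketch below uses the common textbook form of the contrastive loss; the margin, the weighting factor, and the function names are assumptions, and the exact formulation in the paper may differ.

```python
import numpy as np

def softmax_cross_entropy(logits, label):
    """Identification cost: cross-entropy of a softmax over identity classes."""
    z = logits - logits.max()  # shift for numerical stability
    log_probs = z - np.log(np.exp(z).sum())
    return float(-log_probs[label])

def contrastive_loss(f1, f2, same, margin=1.0):
    """Verification cost: pull matched pairs together, push mismatched pairs past the margin."""
    d = np.linalg.norm(f1 - f2)
    if same:
        return float(0.5 * d ** 2)
    return float(0.5 * max(0.0, margin - d) ** 2)

def combined_loss(logits1, label1, logits2, label2, f1, f2, weight=1.0, margin=1.0):
    """Sum of both softmax costs plus the weighted contrastive cost for one pair."""
    ident = softmax_cross_entropy(logits1, label1) + softmax_cross_entropy(logits2, label2)
    verif = contrastive_loss(f1, f2, same=(label1 == label2), margin=margin)
    return ident + weight * verif

# Toy pair: 4 identity classes, 2-d features (hypothetical values)
logits = np.zeros(4)
f1, f2 = np.array([0.0, 0.0]), np.array([0.6, 0.8])
loss_same = combined_loss(logits, 2, logits, 2, f1, f2)
loss_diff = combined_loss(logits, 2, logits, 3, f1, f2, margin=2.0)
```

The verification label is derived here from the two identity labels, so each training pair feeds both costs.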
Results on LFW
Method                                  #Net   Accuracy ± SE
DeepFace                                7      0.9735 ± 0.0025
DeepID                                  100    0.9745 ± 0.0026
DR + Cosine                             1      0.9613 ± 0.003
DR + PCA on CASIA-WebFace + Cosine      1      0.963 ± 0.0035
DR + Joint Bayes on CASIA-WebFace       1      0.973 ± 0.0031
DR + PCA on LFW training set + Cosine   1      0.9633 ± 0.0042
DR + Joint Bayes on LFW training set    1      0.9773 ± 0.0031
* DR – CASIA-WebFace
CASIA replication
Training data    Accuracy on LFW
normalized       97.27%
no alignment     94%
our alignment    96.2%
Database size
Jingtuo Liu, Yafeng Deng, Tao Bai, Zhengping Wei, and Chang Huang; Baidu. Targeting Ultimate Accuracy: Face Recognition via Deep Embedding.
Identities   Faces   Error rate
1.5K         150K    3.1%
9K           450K    1.35%
18K          1.2M    0.87%
Database availability
Dataset         #Subjects   #Images     Availability
LFW             5 749       13 233      Public
WDRef           2 995       99 773      Public (feature only)
CelebFaces      10 177      202 599     Private
SFC             4 030       4 400 000   Private
CACD            2 000       163 446     Public (partially annotated)
CASIA-WebFace   10 575      494 414     Public
D. Yi, Z. Lei, S. Liao, and S. Z. Li. Learning face representation from scratch. Technical report, arXiv:1411.7923, 2014.
Recommendations
Large dataset with a large number of subjects
Careful face extraction and alignment
Deep architecture
Joint Identification-Verification
K. Simonyan and A. Zisserman. “Very deep convolutional networks for large-scale image recognition”. arXiv preprint arXiv:1409.1556, 2014
Y. Sun, Y. Chen, X. Wang, and X. Tang. Deep learning face representation by joint identification-verification. In NIPS, 2014.