face recognition based on deep learning (yurii pashchenko technology stream)

Face Recognition Based on Deep Learning

recognizzit

www.recognizz.it

Face recognition approaches

Verification

Identification

Similarity

Attributes

Benchmarks

ORL

FERET

Labeled Faces in the Wild (LFW)

YouTube Faces (YTF)

ORL

Images taken between April 1992 and April 1994 at the Olivetti Research Laboratory (ORL) in Cambridge

There are 10 different images of each of 40 distinct subjects

The size of each image is 92x112 pixels, with 256 grey levels per pixel

Size ~10 MB

* http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html

FERET

Images taken in 2 steps: 1993-1997 and 1998-2001

1200 subjects, 14051 face images

Image sets: fa, fb, duplicate 1, fc, duplicate 2

Heads with views ranging from frontal to left and right profiles

Different lighting

Size ~8 GB

* http://www.itl.nist.gov/iad/humanid/feret/feret_master.html

Labeled Faces in the Wild

Faces collected from the web

5749 subjects (1680 with two or more images), 13 233 face images

Size ~200 MB

* http://vis-www.cs.umass.edu/lfw/

LFW unrestricted with labeled outside data protocol

10 batches

Each batch: 300 matched and 300 mismatched pairs

Huang, G.B., Learned-Miller, E.: Labeled faces in the wild: Updates and new reporting procedures. Technical Report UM-CS-2014-003, UMass Amherst (2014)

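Figure: 10 rounds of cross-validation (Round 1, Round 2, Round 3, ..., Round 10). In each round one batch is the validation set and the remaining batches are the training set.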
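The 10-round protocol and the "Accuracy ± SE" figures reported in the tables can be sketched roughly as follows. This is a minimal illustration, not the official evaluation script: the function names, the threshold search, and the synthetic scores are assumptions made for the example. It assumes a similarity score is already available for every pair.

```python
import numpy as np

def best_threshold(scores, labels):
    """Pick the score threshold that maximizes accuracy on the training pairs."""
    candidates = np.unique(scores)
    accs = [np.mean((scores >= t) == labels) for t in candidates]
    return candidates[int(np.argmax(accs))]

def lfw_ten_fold(scores, labels, n_folds=10):
    """scores, labels: arrays of shape (n_folds, pairs_per_fold).

    labels are True for matched pairs and False for mismatched pairs.
    Returns the mean accuracy over the folds and its standard error.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    fold_acc = []
    for k in range(n_folds):
        train = np.arange(n_folds) != k          # the other nine batches
        t = best_threshold(scores[train].ravel(), labels[train].ravel())
        fold_acc.append(np.mean((scores[k] >= t) == labels[k]))
    fold_acc = np.array(fold_acc)
    mean = fold_acc.mean()
    se = fold_acc.std(ddof=1) / np.sqrt(n_folds)  # standard error of the mean
    return mean, se

# Synthetic example: 10 batches of 300 matched + 300 mismatched pairs.
rng = np.random.default_rng(0)
labels = np.tile(np.r_[np.ones(300, bool), np.zeros(300, bool)], (10, 1))
scores = np.where(labels,
                  rng.normal(0.7, 0.15, labels.shape),
                  rng.normal(0.3, 0.15, labels.shape))
mean_acc, se = lfw_ten_fold(scores, labels)
print(f"{mean_acc:.4f} ± {se:.4f}")
```

The threshold is chosen only on the nine training batches of each round, so the held-out batch never influences its own decision rule.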

Face Recognition Algorithm

Face localization -> Normalization -> Feature extraction -> Comparing

Verification metric: FALSE

Before deep convolutional networks (DCN)

Method                     Accuracy ± SE
combined Joint Bayesian    0.9242 ± 0.0108
Tom-vs-Pete + Attribute    0.9330 ± 0.0128
High-dim LBP               0.9517 ± 0.0113
TL Joint Bayesian          0.9633 ± 0.0108
Human, cropped             0.975

DeepFace

Y. Taigman, M. Yang, M. Ranzato, and L. Wolf. DeepFace: Closing the gap to human-level performance in face verification. In CVPR, 2014.

Training data

Dataset: Social Face Classification (4030 subjects, ~4.4 million face images)

Alignment: 2D, 3D

Architecture

Input: RGB image 152x152

Output feature size: 4096

Parameters: ~ 120 million

Verification metric: weighted chi-squared distance, Siamese network
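As an illustration of the weighted chi-squared distance named above, here is a minimal sketch. It assumes non-negative feature vectors and takes the per-dimension weights `w` as given; in practice they would be learned, and the values in the example are made up.

```python
import numpy as np

def weighted_chi2(f1, f2, w, eps=1e-12):
    """Weighted chi-squared distance: sum_i w[i] * (f1[i] - f2[i])^2 / (f1[i] + f2[i]).

    Assumes non-negative features. Dimensions where both features are zero
    contribute nothing to the sum.
    """
    f1 = np.asarray(f1, dtype=float)
    f2 = np.asarray(f2, dtype=float)
    w = np.asarray(w, dtype=float)
    num = (f1 - f2) ** 2
    den = f1 + f2
    terms = np.where(den > eps, num / np.maximum(den, eps), 0.0)
    return float(np.dot(w, terms))

# Toy example with hypothetical weights.
d = weighted_chi2([1.0, 0.0, 2.0], [1.0, 0.0, 0.0], w=[1.0, 1.0, 0.5])
print(d)
```

A smaller distance means the two faces are more likely to belong to the same person; the decision threshold has to be chosen on training pairs.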

Results on LFW

Method                     Accuracy ± SE
combined Joint Bayesian    0.9242 ± 0.0108
Tom-vs-Pete + Attribute    0.9330 ± 0.0128
High-dim LBP               0.9517 ± 0.0113
TL Joint Bayesian          0.9633 ± 0.0108
DeepFace-align2D           0.943 ± 0.0043
DeepFace-Siamese           0.9617 ± 0.0038
DeepFace-ensemble          0.9735 ± 0.0025


DeepID

Y. Sun, X. Wang, and X. Tang. Deep learning face representation from predicting 10,000 classes. In CVPR, 2014.

Training data

Dataset: CelebFaces (10 177 subjects, 202 599 face images)

Alignment: 2D Patch

Architecture

Input: 39x31 RGB or grayscale

Output feature size: 160

Additional algorithms: Joint Bayesian, PCA

ONLY 1 PATCH

Results on LFW

Method                      #Net   Accuracy ± SE
DeepFace-Siamese            1      0.9617 ± 0.0038
DeepFace-ensemble           7      0.9735 ± 0.0025
DeepID on CelebFaces        60     0.9605 ± …
DeepID on CelebFaces+       100    0.972 ± …
DeepID on CelebFaces+ TL    100    0.9745 ± 0.0026


CASIA

D. Yi, Z. Lei, S. Liao, and S. Z. Li. Learning face representation from scratch. Technical report, arXiv:1411.7923, 2014.

Training data

Dataset: CASIA WebFace (10 575 subjects and 494 414 face images)

Alignment: 2D

Architecture

Input: gray image 100x100

Output feature vector size: 320

Parameters: ~5 million

Verification metric: Cosine

Additional algorithms: Joint Bayes, PCA

Combined identification and verification loss

Very deep architecture

Network diagram (layers from input to output):
Conv11, Conv12, Pool1 2x2+2(S)
Conv21, Conv22, Pool2 2x2+2(S)
Conv31, Conv32, Pool3 2x2+2(S)
Conv41, Conv42, Pool4 2x2+2(S)
Conv51, Conv52, Pool5 7x7+1(S)
Dropout 40%
Fc6
Cost1: Softmax (Face Labels)
Cost2: Contrastive (Verification Labels [0,1])
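The combination of an identification cost (Softmax on face labels) and a verification cost (Contrastive on pair labels) can be sketched as below. This is only an illustration in NumPy: the `weight` and `margin` values are hypothetical, and the slides do not say how the two costs are balanced.

```python
import numpy as np

def softmax_cross_entropy(logits, label):
    """Identification cost: cross-entropy of softmax(logits) against the true identity."""
    logits = np.asarray(logits, dtype=float)
    z = logits - np.max(logits)                # shift for numerical stability
    log_probs = z - np.log(np.sum(np.exp(z)))
    return float(-log_probs[label])

def contrastive_loss(f1, f2, same, margin=1.0):
    """Verification cost: pull same-identity features together,
    push different-identity features at least `margin` apart."""
    d = np.linalg.norm(np.asarray(f1, dtype=float) - np.asarray(f2, dtype=float))
    if same:
        return float(0.5 * d ** 2)
    return float(0.5 * max(0.0, margin - d) ** 2)

def joint_loss(logits1, logits2, id1, id2, f1, f2, weight=0.5, margin=1.0):
    """Sum of both identification costs plus a weighted verification cost.
    `weight` and `margin` are illustrative values, not taken from the slides."""
    ident = softmax_cross_entropy(logits1, id1) + softmax_cross_entropy(logits2, id2)
    verif = contrastive_loss(f1, f2, same=(id1 == id2), margin=margin)
    return ident + weight * verif

# Toy pair: two images of identity 2 out of 4 identities.
loss = joint_loss(np.zeros(4), np.zeros(4), 2, 2, [0.0, 0.0], [0.3, 0.4])
print(loss)
```

The identification term teaches the network to separate many subjects, while the verification term shapes the feature space directly for pair comparison.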

Results on LFW

Method                                   #Net   Accuracy ± SE
DeepFace                                 7      0.9735 ± 0.0025
DeepID                                   100    0.9745 ± 0.0026
DR + Cosine                              1      0.9613 ± 0.003
DR + PCA on CASIA-WebFace + Cosine       1      0.963 ± 0.0035
DR + Joint Bayes on CASIA-WebFace        1      0.973 ± 0.0031
DR + PCA on LFW training set + Cosine    1      0.9633 ± 0.0042
DR + Joint Bayes on LFW training set     1      0.9773 ± 0.0031

* DR – CASIA-WebFace
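The "Cosine" rows compare two feature vectors by the cosine of the angle between them. A minimal sketch follows; the threshold value is a placeholder and would normally be chosen on training pairs.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1 = same direction)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def same_person(a, b, threshold=0.5):
    """Declare a match when the similarity reaches the threshold.
    The default threshold is a hypothetical value for illustration."""
    return cosine_similarity(a, b) >= threshold

# Toy 2-D features; real features would be e.g. 320-dimensional.
print(same_person([1.0, 0.0], [2.0, 0.0]))
print(same_person([1.0, 0.0], [0.0, 1.0]))
```

Because the cosine ignores vector length, only the direction of the feature vector matters for the decision.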

CASIA replication

Training data    Accuracy on LFW
normalized       97.27%
no alignment     94%
our alignment    96.2%
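The slides do not describe how the alignment was done. As one common way to do 2D alignment, the sketch below computes a similarity transform (rotation, uniform scale, translation) that moves the two detected eye centres to fixed target positions. The target coordinates and function names are assumptions chosen for a 100x100 crop.

```python
import numpy as np

def eye_alignment_matrix(left_eye, right_eye,
                         target_left=(30.0, 40.0), target_right=(70.0, 40.0)):
    """Return a 2x3 similarity-transform matrix mapping the detected eye
    centres onto the target positions (hypothetical targets for a 100x100 crop)."""
    p1, p2 = complex(*left_eye), complex(*right_eye)
    q1, q2 = complex(*target_left), complex(*target_right)
    if p1 == p2:
        raise ValueError("eye positions must differ")
    a = (q2 - q1) / (p2 - p1)   # rotation and scale as one complex factor
    b = q1 - a * p1             # translation
    return np.array([[a.real, -a.imag, b.real],
                     [a.imag,  a.real, b.imag]])

def apply_transform(matrix, point):
    """Map a single (x, y) point through the 2x3 matrix."""
    x, y = point
    return matrix @ np.array([x, y, 1.0])

# Toy example: a tilted face with eyes at (40, 60) and (80, 50).
M = eye_alignment_matrix((40.0, 60.0), (80.0, 50.0))
print(apply_transform(M, (40.0, 60.0)))
print(apply_transform(M, (80.0, 50.0)))
```

A 2x3 matrix of this form can be handed to an image-warping routine such as OpenCV's `cv2.warpAffine` to produce the aligned crop.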

Database size

Jingtuo Liu, Yafeng Deng, Tao Bai, Zhengping Wei, and Chang Huang (Baidu). Targeting Ultimate Accuracy: Face Recognition via Deep Embedding.

Identities   Faces   Error rate
1.5K         150K    3.1%
9K           450K    1.35%
18K          1.2M    0.87%

Database availability

Dataset          #Subjects   #Images      Availability
LFW              5 749       13 233       Public
WDRef            2 995       99 773       Public (feature only)
CelebFaces       10 177      202 599      Private
SFC              4 030       4 400 000    Private
CACD             2 000       163 446      Public (partially annotated)
CASIA-WebFace    10 575      494 414      Public

D. Yi, Z. Lei, S. Liao, and S. Z. Li. Learning face representation from scratch. Technical report, arXiv:1411.7923, 2014.

Recommendations

Large dataset with a large number of subjects

Careful face extraction and alignment

Deep architecture

Joint Identification-Verification

K. Simonyan and A. Zisserman. “Very deep convolutional networks for large-scale image recognition”. arXiv preprint arXiv:1409.1556, 2014

Y. Sun, Y. Chen, X. Wang, and X. Tang. Deep learning face representation by joint identification-verification. In NIPS, 2014.

THANK YOU

Yurii Pashchenko

www.recognizz.it
