An Illumination Invariant Face Recognition System for Access Control using Video

Ognjen Arandjelović, Roberto Cipolla

Funded by Toshiba Corp. and Trinity College, Cambridge

Face Recognition

• Single-shot recognition – a popular area of research since the 1970s

• Many methods have been developed

• Poor performance in the presence of:

– Illumination variation

– Pose variation

– Facial expression

– Occlusions (glasses, hair etc.)

Example methods: Eigenfaces, wavelet methods, 3D Morphable Models

Face Recognition from Video

• Face motion helps resolve the ambiguities of single-shot recognition – implicit 3D

• Video information often available (surveillance, authentication etc.)

Figure: recognition setup, showing a training stream and a novel stream

Face Manifolds

• Face patterns describe manifolds which are:

– Highly nonlinear, and

– Noisy, but

– Smooth

Figure: facial features, face region, and face pattern manifold

Limitations of Previous Work

• In this work we address 3 fundamental questions:

– How to model nonlinear manifolds of face motion

– How to achieve illumination and pose robustness

– How to choose the distance measure

Face Motion Manifolds: Revisited

Unchanging identity, changing illumination

Changing identity, unchanging illumination

• Motivation: How can we use the prior knowledge on the shape of the manifolds?

Pose Clusters

• Face motion manifolds are nonlinear, but:

– Low-dimensional (cf. registration, which reduces the dimensionality), and

– Key observation: can be described well using only 3 linear pose clusters

Colour-coded pose clusters for 3 manifolds

Determining Pose Clusters

• Pose clusters are semantic clusters:

– K-means and similar algorithms are unsuitable

– We use a simple method based on motion parallax

– Cluster membership is decided by maximum likelihood

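Figure: yaw measure. The discrepancy η between the pupils and the nostrils is measured in the image plane:

$$\eta = \frac{0.5\,\left(x_{reye} + x_{leye} - x_{rnostril} - x_{lnostril}\right)}{x_{reye} - x_{leye}}$$

Figure: distribution of η for the 3 clusters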
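The pose clustering step can be sketched in a few lines. This is a minimal illustration, assuming the yaw measure η is the horizontal offset between the pupil midpoint and the nostril midpoint, normalized by the inter-pupil distance, and that each pose cluster is modelled by a 1D Gaussian over η. The cluster names, the sign convention, and the Gaussian parameters in `POSE_CLUSTERS` are hypothetical placeholders, not values from the slides.

```python
import math

def yaw_measure(x_reye, x_leye, x_rnostril, x_lnostril):
    """Offset between pupil and nostril midpoints, normalized by inter-pupil distance."""
    eye_span = x_reye - x_leye
    if eye_span == 0:
        raise ValueError("pupils coincide horizontally; yaw measure is undefined")
    return 0.5 * (x_reye + x_leye - x_rnostril - x_lnostril) / eye_span

def gaussian_log_likelihood(value, mean, std):
    """Log-density of a 1D Gaussian."""
    return -0.5 * ((value - mean) / std) ** 2 - math.log(std * math.sqrt(2.0 * math.pi))

# Hypothetical (mean, std) of eta per pose cluster; in practice these would be
# estimated from data. The left/right labels are an arbitrary convention.
POSE_CLUSTERS = {"left": (-0.25, 0.08), "frontal": (0.0, 0.08), "right": (0.25, 0.08)}

def assign_pose_cluster(eta, clusters=POSE_CLUSTERS):
    """Maximum-likelihood cluster membership for a single frame."""
    return max(clusters, key=lambda name: gaussian_log_likelihood(eta, *clusters[name]))
```

A frame whose pupil and nostril midpoints coincide gives η = 0 and falls into the frontal cluster under these placeholder parameters.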

Pose Clusters: Example

Input manifold and colour-coded pose clusters

Sample frames from the 3 pose clusters

Illumination Compensation

• Performed in two stages:

– Coarse illumination compensation (exploiting face smoothness)

– Fine illumination compensation (exploiting low dimensionality of the face illumination subspace)

Figure: compensation pipeline from input to output (RGIC, then optimization using the reference cluster and the illumination subspace)

Region-based GIC

• Gamma Intensity Correction (GIC) finds the gamma value that best maps the input image I onto a canonical image I_C:

$$\gamma^* = \arg\min_{\gamma} \sum_{x,y} \left[ I(x,y)^{\gamma} - I_C(x,y) \right]^2, \qquad I^*(x,y) = I(x,y)^{\gamma^*}$$

– Solved by 1D non-linear optimization

• Region-based GIC (RGIC): faces are (roughly) divided into regions with smoothly varying surface normal

Figure: the four face regions (1 to 4) and the effect of varying gamma
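The 1D optimization behind GIC can be illustrated as follows. This is a sketch under stated assumptions: images are flat lists of intensities in (0, 1], the cost is the sum of squared differences between the gamma-corrected input and the canonical image, and a golden-section search stands in for whatever 1D optimizer was actually used. The function names and the search interval are hypothetical.

```python
import math

def gic_cost(image, canonical, gamma):
    """Sum of squared differences between I^gamma and the canonical image."""
    return sum((p ** gamma - c) ** 2 for p, c in zip(image, canonical))

def fit_gamma(image, canonical, lo=0.1, hi=5.0, tol=1e-6):
    """Golden-section search; assumes the cost is unimodal on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if gic_cost(image, canonical, c) < gic_cost(image, canonical, d):
            b = d
        else:
            a = c
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return 0.5 * (a + b)

def region_based_gic(image_regions, canonical_regions):
    """Fit one gamma per region and correct each region independently."""
    corrected = []
    for region, reference in zip(image_regions, canonical_regions):
        gamma = fit_gamma(region, reference)
        corrected.append([p ** gamma for p in region])
    return corrected
```

Fitting each region independently is what allows the gamma value to jump at region boundaries.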

Region-based GIC: Artefacts

• Region-based GIC suffers from artefacts at region boundaries

Figure: mean face, γ value map, and smoothed γ map

Figure: input face, RGIC face (boundary artefacts), and our method (artefacts removed)
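One way to picture the smoothed γ map is to blur a per-pixel map of the region gamma values before applying it. The sketch below assumes a simple mean (box) filter, which is a stand-in: the slides show only that the map is smoothed, not which filter is used. Images and maps are nested lists indexed as `[row][column]`, and all names are hypothetical.

```python
def box_smooth(gamma_map, radius=1):
    """Mean filter over a (2*radius+1)^2 window, clipped at the image border."""
    h, w = len(gamma_map), len(gamma_map[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            window = [gamma_map[j][i] for j in ys for i in xs]
            out[y][x] = sum(window) / len(window)
    return out

def apply_gamma_map(image, gamma_map):
    """Per-pixel gamma correction: each pixel is raised to its own gamma."""
    return [[p ** g for p, g in zip(row, g_row)] for row, g_row in zip(image, gamma_map)]
```

With a map that is 0.5 on the left half and 1.5 on the right half, the step of 1.0 at the boundary shrinks to 1/3 after one pass with `radius=1`, so the corrected image no longer changes abruptly there.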

Illumination Subspace

• Each input frame is corrected by a linear Pose Illumination Subspace component so that it matches the reference distribution of the same pose

– Illumination subspace is high-dimensional

– Constrained to expected variations by Mahalanobis distance

Figure: input manifold and reference manifold

$$I^* = I + B_I a^*$$

Subject to:

$$a^* = \arg\min_{a} \left\| I + B_I a \right\|_M$$

Where $\|\cdot\|_M$ is the Mahalanobis distance in the reference Gaussian, and the columns of $B_I$ span the illumination subspace
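For reference, the minimization has a closed-form solution if the Mahalanobis distance is taken from the mean of the reference Gaussian. The derivation below is a sketch under that assumption; the symbols $\mu$ and $\Sigma$ (mean and covariance of the reference Gaussian) are introduced here for illustration and do not appear on the slides. It also assumes $\Sigma$ is positive definite and $B_I$ has full column rank.

```latex
% Assumption: \|v\|_M^2 = (v - \mu)^T \Sigma^{-1} (v - \mu), with \mu and \Sigma
% the mean and covariance of the reference Gaussian (hypothetical symbols).
\begin{align}
J(a) &= (I + B_I a - \mu)^{T} \Sigma^{-1} (I + B_I a - \mu) \\
\nabla_a J &= 2\, B_I^{T} \Sigma^{-1} (I + B_I a - \mu) = 0 \\
a^* &= -\left(B_I^{T} \Sigma^{-1} B_I\right)^{-1} B_I^{T} \Sigma^{-1} (I - \mu) \\
I^* &= I + B_I a^*
\end{align}
```

Minimizing the squared distance $J(a)$ gives the same $a^*$ as minimizing the distance itself, so the correction reduces to a weighted least-squares problem.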

Illumination Compensation Results

Original/input frames

Illumination-corrected frames

Reference frames

Strong side lighting

And in face pattern space…

Comparing Pose Clusters

• “Distribution-based” distances (Kullback-Leibler divergence, Resistor Average Distance etc.) are unsuitable

• We use the simple Euclidean distance between cluster centres

Figure labels: reference cluster, novel cluster, reduced spread, cluster centres
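The cluster comparison can be sketched directly. This minimal example assumes each pose cluster is a list of frames, each frame a flat list of pixel values of equal length. The function names are hypothetical.

```python
import math

def cluster_centre(frames):
    """Mean frame of a pose cluster."""
    n = len(frames)
    return [sum(f[i] for f in frames) / n for i in range(len(frames[0]))]

def centre_distance(cluster_a, cluster_b):
    """Euclidean distance between the centres of two pose clusters."""
    return math.dist(cluster_centre(cluster_a), cluster_centre(cluster_b))

def pose_distances(reference_clusters, novel_clusters):
    """One distance per corresponding pose cluster pair."""
    return [centre_distance(r, n) for r, n in zip(reference_clusters, novel_clusters)]
```

With 3 pose clusters per manifold, `pose_distances` returns three values, one per pose.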

Unified Manifold Similarity

• Recognition is based on the likelihood ratio:

$$\frac{P(D_{1,2,3} \mid s)}{P(D_{1,2,3} \mid \neg s)}$$

– $s$: the manifolds belong to the same person

– $D_{1,2,3}$: distances between pose clusters

• Learn likelihoods from ground truth training data

Figure panels: likelihood histogram; undefined value regions; RBF-interpolated likelihood; two-pose interpolated likelihood; likelihood now monotonically decreasing
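A likelihood ratio over pose cluster distances can be illustrated with a smooth density estimate learned from labelled training pairs. The sketch below uses a Gaussian-kernel (RBF) density estimate as a stand-in for the histogram and RBF interpolation shown on the slide; it is not the authors' procedure. The `bandwidth` and `floor` parameters and the function names are hypothetical.

```python
import math

def rbf_likelihood(d, training, bandwidth=0.5):
    """Gaussian-kernel density estimate at distance vector d."""
    norm = (bandwidth * math.sqrt(2.0 * math.pi)) ** len(d)
    total = 0.0
    for t in training:
        sq_dist = sum((a - b) ** 2 for a, b in zip(d, t))
        total += math.exp(-0.5 * sq_dist / bandwidth ** 2)
    return total / (len(training) * norm)

def likelihood_ratio(d, same_training, different_training, bandwidth=0.5, floor=1e-12):
    """P(D | same person) / P(D | different people); floor guards against division by zero."""
    p_same = rbf_likelihood(d, same_training, bandwidth)
    p_diff = rbf_likelihood(d, different_training, bandwidth)
    return p_same / max(p_diff, floor)
```

A ratio above 1 favours the hypothesis that the two manifolds belong to the same person. The decision threshold would be chosen from training data.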

Face Video Database Revisited

• Testing performed under extreme, varying illuminations

– 10 illumination conditions used (a random 5 for training, the others for testing)

Registration

• Linear operations on images are highly nonlinear in the pattern space

• Translation/rotation and weak perspective can be easily corrected for directly from point correspondences

– We use the locations of pupils and nostrils to robustly estimate the optimal affine registration parameters

Figure: translation manifold, skew manifold, and rotation manifold
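Estimating affine registration parameters from point correspondences can be sketched as a least-squares fit. This is an illustration, not the authors' implementation: it assumes four detected feature points (pupils and nostrils) are mapped onto hypothetical canonical locations, and it solves the normal equations with a small Gaussian elimination routine to stay dependency-free. All function names are hypothetical.

```python
def solve_linear(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution

def fit_affine(src, dst):
    """Least-squares affine map taking src points to dst points."""
    rows = [[x, y, 1.0] for x, y in src]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for k in range(2):  # k = 0 fits the output x row, k = 1 the output y row
        atb = [sum(r[i] * d[k] for r, d in zip(rows, dst)) for i in range(3)]
        params.append(solve_linear(ata, atb))
    return params  # [[a, b, c], [d, e, f]]

def apply_affine(params, point):
    """Map a point through x' = a*x + b*y + c, y' = d*x + e*y + f."""
    x, y = point
    return tuple(p[0] * x + p[1] * y + p[2] for p in params)
```

Four correspondences overdetermine the six affine parameters, so the fit averages out some of the localization error in the individual features.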

Registration Method Used

• Feature localization based on the combination of shape and pattern matching (Fukui et al. 1998)

Figure: detect features, then crop and affine-register faces

Results

• Very high recognition rates attained (96% average) under extreme variations in illumination

• Other methods showed little to no illumination invariance

Results, continued

• The method was shown to give promising results for authentication uses:

– Good separability of inter- and intra-class manifold distances was found

– It can provide a secure system with a false positive rate of only 0.1% and a false negative rate of 8%

Figure: cumulative distributions of inter- and intra-class manifold distances

Figure: the ROC curve for the proposed method
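The false positive and false negative rates at a given operating point can be computed from the two sets of manifold distances. The sketch below assumes a pair is accepted as the same person when its distance falls below a threshold. The function names and the choice of candidate thresholds are hypothetical.

```python
def error_rates(intra_distances, inter_distances, threshold):
    """Return (false positive rate, false negative rate) at the given threshold."""
    false_neg = sum(d >= threshold for d in intra_distances) / len(intra_distances)
    false_pos = sum(d < threshold for d in inter_distances) / len(inter_distances)
    return false_pos, false_neg

def roc_points(intra_distances, inter_distances):
    """Sweep the threshold over all observed distances: (threshold, FPR, FNR) triples."""
    thresholds = sorted(set(intra_distances) | set(inter_distances))
    return [(t,) + error_rates(intra_distances, inter_distances, t) for t in thresholds]
```

Lowering the threshold trades false positives for false negatives, which is the trade-off an ROC curve traces out.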

Future Research

• Non-constant illumination within a single sequence causes problems

• Illumination compensation is still not perfect – pose illumination subspaces have unnecessarily high dimensions

• Pose estimation is too primitive – outliers cause problems in estimation of linear subspaces

• Complete pose invariance is still not achieved (what if there are no corresponding pose clusters?)

For suggestions, questions etc. please contact me at: [email protected]
