binning and indexing biometric records

Binning and Indexing Biometric Records Sharat S. Chikkerur CUBS, University at Buffalo [email protected]


Post on 13-Jan-2016


Page 1: Binning and Indexing Biometric Records

Binning and Indexing Biometric Records

Sharat S. Chikkerur, CUBS, University at Buffalo

[email protected]

Page 2: Binning and Indexing Biometric Records

Problem Description

Biometrics are being deployed for immigration and national ID applications, such as the US-VISIT program and voter ID and national ID programs [3]

The potential database size can run into millions of records

Largest study by NIST considers only 620,000 records[4]

Apart from accuracy, speed and efficiency also become important at this scale

Only biometric identification (1:N matching) can prevent duplicate enrollments

Page 3: Binning and Indexing Biometric Records

Problem Description (cont.)

In biometric templates, there is no natural order by which one can sort the biometric records

Biometric templates are inherently high-dimensional

Semantic features are not stored in the template

Page 4: Binning and Indexing Biometric Records

Identification Problem

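Let FAR and FRR be the false acceptance rate and false reject rate for 1:1 matching

For 1:N matching,

$\mathrm{FAR}_N = 1 - (1 - \mathrm{FAR})^N \approx N \cdot \mathrm{FAR}$

$\mathrm{FRR}_N = \mathrm{FRR}$

The total number of false accepts is given by

$\text{False accepts} = N \cdot \mathrm{FAR}_N = N \left(1 - (1 - \mathrm{FAR})^N\right) \approx N^2 \cdot \mathrm{FAR}$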

Even if FAR = 0.0001%, false accepts occur in about 1 in 10 identifications for N = 100,000 (lower bound)
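The "1 in 10" figure can be checked numerically. The sketch below is only an illustration: it assumes the N comparisons are independent, and the function name `far_1_to_n` is made up for this example. It compares the exact expression 1 - (1 - FAR)^N with the approximation N x FAR.

```python
def far_1_to_n(far: float, n: int) -> float:
    """Probability of at least one false accept when a probe is
    compared against n enrolled records (independence assumed)."""
    return 1.0 - (1.0 - far) ** n


far = 1e-6      # FAR = 0.0001 %, written as a probability
n = 100_000     # database size used on the slide

exact = far_1_to_n(far, n)
approx = n * far            # first-order approximation N * FAR

print(f"exact  FAR_N = {exact:.4f}")
print(f"approx FAR_N = {approx:.4f}")
```

Both values come out close to 0.1, that is, roughly one false accept in every ten identification attempts.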

No single biometric is capable of meeting this security requirement individually

Page 5: Binning and Indexing Biometric Records

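Uses of Indexing and Binning

Ways to reduce identification errors:

Reduce N
Reduce FAR (limited by technology)

We can reduce N by pruning the records

Let PSYS be the penetration rate. For 1:N matching,

$\mathrm{FAR}_N = 1 - (1 - \mathrm{FAR})^{N \cdot \mathrm{PSYS}} \approx N \cdot \mathrm{PSYS} \cdot \mathrm{FAR}$

$\mathrm{FRR}_N = \mathrm{FRR}$

The total number of false accepts is given by

$\text{False accepts} = N \cdot \mathrm{PSYS} \cdot \mathrm{FAR}_N \approx (N \cdot \mathrm{PSYS})^2 \cdot \mathrm{FAR}$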

State-of-the-art fingerprint systems have PSYS = 0.5 [6]
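To see how much pruning helps, the following sketch evaluates the 1:N false accept rate when only a fraction PSYS of the database is searched, using FAR_N = 1 - (1 - FAR)^(N x PSYS). The function name and the FAR and N values are assumptions chosen for illustration; independence of comparisons is assumed as before.

```python
def far_with_binning(far: float, n: int, psys: float) -> float:
    """1:N false accept rate when only a fraction psys of the
    n enrolled records is actually compared against the probe."""
    searched = n * psys                 # expected number of comparisons
    return 1.0 - (1.0 - far) ** searched


far, n = 1e-6, 100_000                  # illustrative values
for psys in (1.0, 0.75, 0.5, 0.3, 0.1):
    rate = far_with_binning(far, n, psys)
    print(f"PSYS = {psys:4.2f}  FAR_N = {rate:.4f}")
```

With these numbers, halving the penetration rate roughly halves FAR_N, because FAR_N is nearly linear in N x PSYS while N x PSYS x FAR is small.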

Page 6: Binning and Indexing Biometric Records

Indexing and Binning (cont.)

Will allow us to screen immigrants at airports against a ‘watch list’

Will make biometric systems more user-friendly by eliminating the need to remember PINs and IDs

Will improve accuracy (FAR_N) and performance

[Figure: number of false accepts (0 to 10 x 10^7) versus database size N (0 to 10 x 10^5), with one curve per penetration rate PSYS = 1.0, 0.75, 0.5, 0.3 and 0.1]

Page 7: Binning and Indexing Biometric Records

Binning Biometric Data

Vector Quantization Approach

Page 8: Binning and Indexing Biometric Records

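Vector Quantization (cont.)

In general, a biometric template may be represented as a k-dimensional vector

$x_i = [x_{i1}, x_{i2}, x_{i3}, x_{i4}, \ldots, x_{ik}]$

The objective is to classify the vectors into N distinct classes (code book vectors)

$Y = [Y_1, Y_2, Y_3, Y_4, \ldots, Y_N], \quad Y_j \in \mathbb{R}^k$

The code book vectors divide the feature space into N distinct Voronoi regions $V_i$

Properties of the regions:

$V_i = \{\, x : \|x - Y_i\|^2 \le \|x - Y_j\|^2 \ \ \forall j \ne i \,\}$

$\bigcup_i V_i = \mathbb{R}^k \quad \text{and} \quad V_i \cap V_j = \emptyset \ \text{for } i \ne j$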
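Assigning a template to a bin amounts to finding the code book vector with the smallest squared Euclidean distance, which is the Voronoi region the template falls in. A minimal sketch follows; the 2-D code book and the function names are hypothetical and chosen only to keep the example readable.

```python
def squared_distance(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def assign_bin(x, codebook):
    """Index i of the code book vector Y_i closest to x, i.e. the
    Voronoi region V_i containing x (ties go to the lowest index)."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(x, codebook[i]))


# Hypothetical code book with N = 3 vectors in k = 2 dimensions.
codebook = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]

print(assign_bin((0.5, 0.2), codebook))   # closest to (0, 0)
print(assign_bin((3.1, 0.4), codebook))   # closest to (4, 0)
print(assign_bin((1.0, 3.5), codebook))   # closest to (0, 4)
```

At identification time only the records stored in the returned bin need to be matched against the probe.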

Page 9: Binning and Indexing Biometric Records

Vector Quantization-Voronoi Regions

Page 10: Binning and Indexing Biometric Records

Hand Geometry- Template Model

Page 11: Binning and Indexing Biometric Records

Experimental Evaluation

25x10 hand geometry features used
Each print represented by a 21D vector
Data divided equally between training and testing
Data is normalized using

$X' = (X - \mu) / \sigma$

VQ is implemented using k-means clustering
The codebook vectors are used on the test set
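The pipeline described above (normalize, build a code book with k-means on the training half, apply it to the test half) can be sketched as follows. This is not the author's implementation: the data are made up, the normalization is assumed to be a per-feature z-score using the population standard deviation, and the evenly spaced initialization is a simplification that keeps the example deterministic.

```python
from statistics import mean, pstdev


def zscore(data):
    """Per-feature normalization X' = (X - mu) / sigma (assumed form)."""
    cols = list(zip(*data))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) or 1.0 for c in cols]   # guard constant features
    normed = [tuple((v - m) / s for v, m, s in zip(row, mus, sigmas))
              for row in data]
    return normed, mus, sigmas


def nearest(x, centers):
    """Index of the center closest to x in squared Euclidean distance."""
    return min(range(len(centers)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(x, centers[i])))


def kmeans(data, k, iters=50):
    """Plain Lloyd iterations; returns k code book vectors."""
    # Deterministic, evenly spaced initial centers (a simplification).
    centers = [data[i * len(data) // k] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in data:
            groups[nearest(x, centers)].append(x)
        new_centers = [
            tuple(mean(col) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)       # empty cluster keeps its center
        ]
        if new_centers == centers:              # converged
            break
        centers = new_centers
    return centers


# Made-up 2-D training data; the slides use 21-D hand geometry vectors.
train = [(10, 60), (12, 62), (11, 58), (50, 20), (52, 22), (49, 18)]
train_n, mus, sigmas = zscore(train)
codebook = kmeans(train_n, k=2)

# A test sample is normalized with the training statistics, then binned.
test = (51, 21)
test_n = tuple((v - m) / s for v, m, s in zip(test, mus, sigmas))
print(nearest(test_n, codebook))
```

Reusing the training mean and standard deviation for the test sample keeps both halves in the same feature space, which is what lets the training code book be applied to unseen data.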

Page 12: Binning and Indexing Biometric Records

Normalization

[Figure: scatter plot of raw features, FTR(1) versus FTR(2), both axes from 0 to 80]

Observations
Data normalization leads to spreading of the data
Without normalization, clusters converge to a single center
Equivalent to measuring Mahalanobis distance [5]
Different instances of the same hand are misclassified

[Figure: scatter plot of normalized features, FTR'(1) versus FTR'(2)]

Page 13: Binning and Indexing Biometric Records

Preliminary Results

[Figure: penetration rate PSYS (%) versus number of bins]

Penetration rate (%) by number of bins

Bins    2       3       4       5       6       7       8       9       10      11      12      21
Run 1   56.95   46.9    42.6    39.6    37.39   36.14   34.56   33.63   31.91   32.25   31.95   29.44
Run 2   56.95   47.2    42.6    40.17   37.39   35.77   34.56   33.62   33.13   32.5    31.95   29.44
Run 3   56.95   47.24   43.26   39.65   37.82   36.02   34.89   33.91   33.13   31.3    31.95   29.44
Run 4   56.95   46.95   42.6    39.65   37.39   35.77   34.56   33.91   32.82   32.25   31.95   29.44
Run 5   56.95   46.66   42.6    39.65   37.82   36.14   34.78   33.62   32.86   32.25   31.95   29.44
Mean    56.95   46.99   42.732  39.744  37.562  35.968  34.67   33.738  32.77   32.11   31.95   29.44
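The slides do not say how the penetration rate was measured. One common reading is the average fraction of the enrolled database that has to be searched per probe. The sketch below assumes that definition and assumes each probe is compared only against the records in its own bin; the bin labels are made up.

```python
from collections import Counter


def penetration_rate(db_bins, probe_bins):
    """Average fraction of the enrolled database searched per probe,
    assuming a probe is matched only against records in its own bin."""
    sizes = Counter(db_bins)            # number of enrolled records per bin
    total = len(db_bins)
    return sum(sizes[b] / total for b in probe_bins) / len(probe_bins)


db_bins = [0, 0, 0, 0, 0, 0, 1, 1, 1, 2]   # made-up labels, 10 enrolled records
probe_bins = [0, 1, 2, 0]                   # made-up labels, 4 probes

rate = penetration_rate(db_bins, probe_bins)
print(f"penetration rate = {100 * rate:.1f} %")
```

Under this definition, unevenly filled bins raise the penetration rate, since most probes land in the largest bins.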

Page 14: Binning and Indexing Biometric Records

Indexing Biometric Data

Spatial Access Methods Approach

Page 15: Binning and Indexing Biometric Records

Introduction to Spatial Databases

Relational databases organize and store scalar data
  Have planar organization
  Contain scalar data (excluding LOBs, binary)
  Data can be ordered linearly
  Structured Query Language used to retrieve records

Spatial databases
  Contain multi-dimensional or vectorial data
  Relative positions may be explicit or inferred
  Linear proximity does not imply spatial proximity

Multi-dimensional data is used in computer vision, medical imaging, and BIOMETRICS

Original applications
  Point sets
  CAD
  VLSI drawings
  Cartography, astronomy

Page 16: Binning and Indexing Biometric Records

Spatial databases (cont.)

Difference from pattern classification: QUERIES
  Spatial searches
  Neighborhood searches

PAM/SAM

Point Access Methods
  Used on point databases
  Points may be multi-dimensional (vectors)
  Points have no spatial extent; intersection is undefined
  Each point is specified uniquely by its d coordinates

Spatial Access Methods
  Used on lines, polygons, solids
  Objects have spatial extent; intersection of objects is well defined
  A point may be occupied by more than one object

Page 17: Binning and Indexing Biometric Records

Problems with vectorial/spatial data

No standard algebra defined on spatial data
  Union and intersection are not defined exactly
  Data operations are highly application specific
  Operators are not closed

Queries
  Need support for spatial queries: point and region queries
  No standard spatial query language

No natural ordering
  An ordering that preserves spatial proximity does not exist
  There is no mapping from multi-dimensional space to 1D such that two points that are close together in the higher-dimensional space are also close in the linear order [1]
  Is it possible to do this via PCA/KLT?
  Cannot extend single-key structures like the B-Tree
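A small example makes the ordering problem concrete. The sketch below uses a row-major mapping of a 2-D grid onto a 1-D key, which is only one possible linearization and is chosen here for illustration; the grid width and the points are arbitrary.

```python
def row_major(x: int, y: int, width: int) -> int:
    """Map a 2-D grid cell to a 1-D key by laying rows end to end."""
    return y * width + x


width = 1000
a = (5, 10)
b = (5, 11)       # directly above a: spatial distance 1
c = (999, 10)     # same row as a, but 994 cells away

gap_ab = abs(row_major(*a, width) - row_major(*b, width))
gap_ac = abs(row_major(*a, width) - row_major(*c, width))

print(gap_ab)     # 1000
print(gap_ac)     # 994
```

The immediate neighbour b ends up further from a in the 1-D key than the distant point c, so a range scan on the key around a would miss b while visiting many unrelated cells.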

Page 18: Binning and Indexing Biometric Records

Requirements of a spatial database

Dynamic updates
  The structure should be consistent as data is inserted and deleted
  Changes should be tracked

Independence of input data and insertion sequence
  Should handle skewed data
  Structure should be independent of insertion sequence (compare tree)

Scalable

Efficiency
  Time efficiency: an efficient design will approach the performance of B-Trees
  Space efficiency: indexing overhead should be small

Page 19: Binning and Indexing Biometric Records

Types of structures

K-d Trees
  Binary tree in d-dimensional space
  (d-1)-dimensional hyperplanes separate the subspaces
  The splitting directions alternate among the d possibilities
  Insertion and search are straightforward
  Deletion is cumbersome
  Structure is sensitive to insertion order
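The properties listed above can be seen in a few lines of code. This is a minimal sketch of point insertion and exact-match search in a k-d tree, not a production structure: there is no deletion or balancing, and the sample points are arbitrary.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    point: Point
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], point: Point, depth: int = 0) -> Node:
    """Insert a point; the splitting axis cycles through the d dimensions."""
    if root is None:
        return Node(point)
    axis = depth % len(point)
    if point[axis] < root.point[axis]:
        root.left = insert(root.left, point, depth + 1)
    else:
        root.right = insert(root.right, point, depth + 1)
    return root


def contains(root: Optional[Node], point: Point, depth: int = 0) -> bool:
    """Exact-match search that follows the same comparisons as insert."""
    if root is None:
        return False
    if root.point == point:
        return True
    axis = depth % len(point)
    child = root.left if point[axis] < root.point[axis] else root.right
    return contains(child, point, depth + 1)


def height(root: Optional[Node]) -> int:
    """Number of nodes on the longest root-to-leaf path."""
    return 0 if root is None else 1 + max(height(root.left), height(root.right))


points = [(5, 4), (2, 6), (8, 1), (1, 3), (9, 7), (7, 2), (3, 8)]

tree = None
for p in points:
    tree = insert(tree, p)

sorted_tree = None
for p in sorted(points):            # same points, different insertion order
    sorted_tree = insert(sorted_tree, p)

print(contains(tree, (9, 7)), contains(tree, (4, 4)))
print(height(tree), height(sorted_tree))
```

The two trees hold the same points but have different heights, which illustrates why the structure is described as sensitive to insertion order.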

Page 20: Binning and Indexing Biometric Records

References

1. Gaede and Günther, “Multidimensional Access Methods”, ACM Computing Surveys, Vol. 30, No. 2, 1998

2. www.geocities.com/mohamedqasem/vectorquantization/vq.html

3. Bolle et al., Guide to Biometrics, Springer Verlag, 2003

4. NIST report to the United States Congress, “Summary of NIST Standards for Biometric Accuracy, Tamper Resistance and Interoperability”, http://www.itl.nist.gov/iad/894.03/NISTAPP_Nov02.pdf

5. http://www.galactic.com/Algorithms/discrim_mahaldist.htm

6. Dr. Wayman’s report, NIST