udrc summer school introductionudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · 2016. 3....

10
1 1 UDRC Summer School Pattern Recognition Josef Kittler email: [email protected] www.ee.surrey.ac.uk/Personal/J.Kittler 2 Introduction Pattern recognition is a branch of signal processing that focuses on the recognition of patterns and regularities in data Detected regularities (shape, texture, relations) enable signal classification Applications span a vast range of problems auto- mating human perception in decision making Biometrics Bridge detection Object detect Target detect Pattern recognition problem variants 3 Multiclass pattern recognition Detection (two class problem) One-class problems (anomaly detection) Verification Multilabel classification Retrieval 4 Pattern recognition system Pattern recognition problem m – number of classes x - pattern vector -priori probability of class -measurement distri- bution of patterns in class Z - input pattern y - pattern representation vector x - feature vector d - decision

Upload: others

Post on 31-Dec-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: UDRC Summer School Introductionudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · 2016. 3. 15. · Feature value Pdf of Class 2Pdf Pdf of Class 2 Aposteriori class probabilities

1

1

UDRC Summer School

Pattern Recognition

Josef Kittler email: [email protected]

www.ee.surrey.ac.uk/Personal/J.Kittler

2

Introduction

n  Pattern recognition is a branch of signal processing that focuses on the recognition of patterns and regularities in data

n  Detected regularities (shape, texture, relations) enable signal classification

n  Applications span a vast range of problems auto-mating human perception in decision making

Biometrics Bridge detection Object detect Target detect

Pattern recognition problem variants

3

§ Multiclass pattern recognition § Detection (two class problem) § One-class problems (anomaly detection) § Verification § Multilabel classification § Retrieval

4

Pattern recognition system

Pattern recognition problem m – number of classes x - pattern vector

- priori probability of class

- measurement distri- bution of patterns in class

Z - input pattern y - pattern representation vector x - feature vector d - decision

Page 2: UDRC Summer School Introductionudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · 2016. 3. 15. · Feature value Pdf of Class 2Pdf Pdf of Class 2 Aposteriori class probabilities

2

Sensor

n  Broad concept, including n  Physical device n  Signal reconstruction and conditioning n  Background removal n  Registration n  Invariant representation

n  Sensor design/development is application specific

5 6

Example: Image analysis

0

1

R e l a t i v e  f r e q u e n c y

1 2 3 4 5

C o d e b o o k  e l e m e n t

Spatial pyramid (2x2)

Dense sampling

Harris-Laplace salient points

Point sampling strategy Color descriptor extraction Codebook model

0

1

1 2 3 4 5

0

1

1 2 3 4 5

0

1

1 2 3 4 5

0

1

1 2 3 4 5

0

1

R e l a t i v e  f r e q u e n c y

1 2 3 4 5

C o d e b o o k  e l e m e n t

Bag-of-words

Bag-of-words

Multiple bags-of-words

Classifier

.

.

.

.Image

Focus

Object likelihood

.

.

.

.

..

Spatial pyramid (1x3) Multiple bags-of-words

0

1

1 2 3 4 5

0

1

1 2 3 4 5

0

1

1 2 3 4 5

Focus

Machine learning

7

Pattern recognition system

Pattern representation Pattern recognition problem

m – number of classes

- priori probability of class

- measurement distri- bution of patterns in class

8

Training data

Geometric viewpoint of the pattern recognition problem

Page 3: UDRC Summer School Introductionudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · 2016. 3. 15. · Feature value Pdf of Class 2Pdf Pdf of Class 2 Aposteriori class probabilities

3

9

Overlapping classes

n Probabilistic model required

10

Course topics § Statistical Pattern Recognition

§  Pattern generation process §  Statistical decision theory §  Density function estimation §  Basic classifiers §  Sparse representation §  Similarity based classification §  Performance evaluation §  Multiple classifier systems

§ Dimensionality reduction §  Feature selection §  Feature extraction

§ Support Vector Machines (John Shawe-Taylor) § Deep Neural Networks (Mark Plumbley)

11

UDRC Summer School

Statistical Pattern Recognition:

Classification

12

Basic probability relationships

n  In Pattern Recognition we are dealing with two random variables

n  The probability of their joint occurrence can be expressed in terms of conditional probabilities

Bayes formula relating conditional probabilities

Page 4: UDRC Summer School Introductionudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · 2016. 3. 15. · Feature value Pdf of Class 2Pdf Pdf of Class 2 Aposteriori class probabilities

4

13

Probability distribution functions

14

Problem formulation

15

Statistical decision theory

n  Given the probabilistic model of a pattern generating process, how do we decide the class membership of pattern

n  Need to introduce decision costs associated with the assignment

of pattern , belonging to class to class n  Note

n  Example: Signature verification

16

Bayes minimum error rule

Page 5: UDRC Summer School Introductionudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · 2016. 3. 15. · Feature value Pdf of Class 2Pdf Pdf of Class 2 Aposteriori class probabilities

5

17

Feature value

Pdf of Class 2

Pdf of Class 2

Pdf

0 T Class 1 errors Class 2 errors

Decision threshold 18

Actual decision rule

n  Aposteriori class probabilities estimated from training data set

n  Normally, the training data set would be labelled

n  n  System performance characterisation based on

an independent test data set

19

Terminology n  The design of a pattern recognition system using

labelled data is referred to as supervised learning

n  If class labels are unavailable, we deal with a non supervised learning problem

n  PR system design involves

n  development of practical decision rules n  model inference n  performance evaluation

20

Parametric decision rules Gaussian classifiers

§  is the mean vector of class i

§ 

Page 6: UDRC Summer School Introductionudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · 2016. 3. 15. · Feature value Pdf of Class 2Pdf Pdf of Class 2 Aposteriori class probabilities

6

21

Parametric decision rules

Nearest mean

Piecewise linear

Quadratic

22

k-nearest neighbour decision rules

nearest neighbour (k=1) ki….# of samples from class i among the k-nearest neighbours to x - choice of k

Properties of NN rule: - intuitive - suboptimal - computationally expensive - dependent on metric used

23

NN decision rule summary

n  Intuitively appealing n  Suboptimal performance unless the training set

is edited using the MULTIEDIT algorithm n  Computationally involved, unless data set edited

and condensed n  The choice of metric used for measuring distance

is very important

24

Optimal metric

Page 7: UDRC Summer School Introductionudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · 2016. 3. 15. · Feature value Pdf of Class 2Pdf Pdf of Class 2 Aposteriori class probabilities

7

25

Gaussian Mixture models GMM estimation

26

GMM comments

n  xxx

27

Dirichlet prior

n  xxx

28

Page 8: UDRC Summer School Introductionudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · 2016. 3. 15. · Feature value Pdf of Class 2Pdf Pdf of Class 2 Aposteriori class probabilities

8

Dirichlet prior GMM estimation

n  cc

29

Sparse representation classification (SRC)

n  xx

30

SRC

n  xx

31

…………. =

SRC algorithm

n  xx

32

Page 9: UDRC Summer School Introductionudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · 2016. 3. 15. · Feature value Pdf of Class 2Pdf Pdf of Class 2 Aposteriori class probabilities

9

Similarity based classification

n  xx

33

Similarity based class.

n  xxx

34

35

Performance characterisation Multiple classifier systems

n  Aim: fuse multiple classifier outputs to improve performance, combating the effect of n  Particular training set n  Particular classifier choice n  Particular classifier parameters n  Particular feature space

n  Exploiting n  Multiple modalities n  Context n  Quality gauging n  Error correcting coding

36

Page 10: UDRC Summer School Introductionudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · 2016. 3. 15. · Feature value Pdf of Class 2Pdf Pdf of Class 2 Aposteriori class probabilities

10

37

Levels of Fusion

PCA

LDA

MFCC

PLP

DCT GMM

MLP

MSE

GMM HMM

Fusion

Feature Fusion Data

Fusion

Score Fusion

less information to deal with threshold

score

Legend

38

Fixed decision fusion strategies

Other examples

39

n  Stacked generalisation (Wolpert)

n  Error correcting output coding (ECOC) (Dietrich) cl

ass

code

References 1.  PA Devijver and J Kittler, Pattern recognition: A statistical

approach, Prentice Hall 1982 2.  T Dietterich et al, Solving multiclass learning problems via error-

correcting output coding, Jnl. Artificial Intel. Research, 1995 3.  R Duda et al, Pattern classification, Wiley 2001 4.  K Fukunaga, Introduction to statistical pattern recognition,

Academic Press, 1990 5.  T Kimura et al, Expectation-maximization algorithms for inference

in Dirichlet processes mixture, Pattern Analysis and Appl. 16, pp 55-67, 2013

6.  J Kittler et al, On combining classifiers, PAMI, pp 226-239, 1998 7.  LI Kuncheva, Combining pattern classifiers, Wiley 2004 8.  M Pelillo, Similarity-based pattern analysis and recognition,

Springer 2013 9.  C Rasmussen, The infinite Gaussian mixture model, In Advances in

Neural Information Processing Systems 12, MIT Press 2000 10.  AR Webb & KD Copsey, Statistical Pattern Recognition, Wiley 2011 11.  DH Wolpert, Stacked generalization, Neural Networks,5, pp

241-259, 1992 12.  AY Wright et al., Robust face recognition via sparse

representation. IEEE PAMI, 2009.

40