udrc summer school introductionudrc.eng.ed.ac.uk/sites/udrc.eng.ed.ac.uk/files... · 2016. 3....
TRANSCRIPT
1
1
UDRC Summer School
Pattern Recognition
Josef Kittler email: [email protected]
www.ee.surrey.ac.uk/Personal/J.Kittler
2
Introduction
n Pattern recognition is a branch of signal processing that focuses on the recognition of patterns and regularities in data
n Detected regularities (shape, texture, relations) enable signal classification
n Applications span a vast range of problems auto-mating human perception in decision making
Biometrics Bridge detection Object detect Target detect
Pattern recognition problem variants
3
§ Multiclass pattern recognition § Detection (two class problem) § One-class problems (anomaly detection) § Verification § Multilabel classification § Retrieval
4
Pattern recognition system
Pattern recognition problem m – number of classes x - pattern vector
- priori probability of class
- measurement distri- bution of patterns in class
Z - input pattern y - pattern representation vector x - feature vector d - decision
2
Sensor
n Broad concept, including n Physical device n Signal reconstruction and conditioning n Background removal n Registration n Invariant representation
n Sensor design/development is application specific
5 6
Example: Image analysis
0
1
R e l a t i v e f r e q u e n c y
1 2 3 4 5
C o d e b o o k e l e m e n t
Spatial pyramid (2x2)
Dense sampling
Harris-Laplace salient points
Point sampling strategy Color descriptor extraction Codebook model
0
1
1 2 3 4 5
0
1
1 2 3 4 5
0
1
1 2 3 4 5
0
1
1 2 3 4 5
0
1
R e l a t i v e f r e q u e n c y
1 2 3 4 5
C o d e b o o k e l e m e n t
Bag-of-words
Bag-of-words
Multiple bags-of-words
Classifier
.
.
.
.Image
Focus
Object likelihood
.
.
.
.
..
Spatial pyramid (1x3) Multiple bags-of-words
0
1
1 2 3 4 5
0
1
1 2 3 4 5
0
1
1 2 3 4 5
Focus
Machine learning
7
Pattern recognition system
Pattern representation Pattern recognition problem
m – number of classes
- priori probability of class
- measurement distri- bution of patterns in class
8
Training data
Geometric viewpoint of the pattern recognition problem
3
9
Overlapping classes
n Probabilistic model required
10
Course topics § Statistical Pattern Recognition
§ Pattern generation process § Statistical decision theory § Density function estimation § Basic classifiers § Sparse representation § Similarity based classification § Performance evaluation § Multiple classifier systems
§ Dimensionality reduction § Feature selection § Feature extraction
§ Support Vector Machines (John Shawe-Taylor) § Deep Neural Networks (Mark Plumbley)
11
UDRC Summer School
Statistical Pattern Recognition:
Classification
12
Basic probability relationships
n In Pattern Recognition we are dealing with two random variables
n The probability of their joint occurrence can be expressed in terms of conditional probabilities
Bayes formula relating conditional probabilities
4
13
Probability distribution functions
14
Problem formulation
15
Statistical decision theory
n Given the probabilistic model of a pattern generating process, how do we decide the class membership of pattern
n Need to introduce decision costs associated with the assignment
of pattern , belonging to class to class n Note
n Example: Signature verification
16
Bayes minimum error rule
5
17
Feature value
Pdf of Class 2
Pdf of Class 2
0 T Class 1 errors Class 2 errors
Decision threshold 18
Actual decision rule
n Aposteriori class probabilities estimated from training data set
n Normally, the training data set would be labelled
n n System performance characterisation based on
an independent test data set
19
Terminology n The design of a pattern recognition system using
labelled data is referred to as supervised learning
n If class labels are unavailable, we deal with a non supervised learning problem
n PR system design involves
n development of practical decision rules n model inference n performance evaluation
20
Parametric decision rules Gaussian classifiers
§ is the mean vector of class i
§
6
21
Parametric decision rules
Nearest mean
Piecewise linear
Quadratic
22
k-nearest neighbour decision rules
nearest neighbour (k=1) ki….# of samples from class i among the k-nearest neighbours to x - choice of k
Properties of NN rule: - intuitive - suboptimal - computationally expensive - dependent on metric used
23
NN decision rule summary
n Intuitively appealing n Suboptimal performance unless the training set
is edited using the MULTIEDIT algorithm n Computationally involved, unless data set edited
and condensed n The choice of metric used for measuring distance
is very important
24
Optimal metric
7
25
Gaussian Mixture models GMM estimation
26
GMM comments
n xxx
27
Dirichlet prior
n xxx
28
8
Dirichlet prior GMM estimation
n cc
29
Sparse representation classification (SRC)
n xx
30
SRC
n xx
31
…………. =
SRC algorithm
n xx
32
9
Similarity based classification
n xx
33
Similarity based class.
n xxx
34
35
Performance characterisation Multiple classifier systems
n Aim: fuse multiple classifier outputs to improve performance, combating the effect of n Particular training set n Particular classifier choice n Particular classifier parameters n Particular feature space
n Exploiting n Multiple modalities n Context n Quality gauging n Error correcting coding
36
10
37
Levels of Fusion
PCA
LDA
MFCC
PLP
DCT GMM
MLP
MSE
GMM HMM
Fusion
Feature Fusion Data
Fusion
Score Fusion
less information to deal with threshold
score
Legend
38
Fixed decision fusion strategies
Other examples
39
n Stacked generalisation (Wolpert)
n Error correcting output coding (ECOC) (Dietrich) cl
ass
code
References 1. PA Devijver and J Kittler, Pattern recognition: A statistical
approach, Prentice Hall 1982 2. T Dietterich et al, Solving multiclass learning problems via error-
correcting output coding, Jnl. Artificial Intel. Research, 1995 3. R Duda et al, Pattern classification, Wiley 2001 4. K Fukunaga, Introduction to statistical pattern recognition,
Academic Press, 1990 5. T Kimura et al, Expectation-maximization algorithms for inference
in Dirichlet processes mixture, Pattern Analysis and Appl. 16, pp 55-67, 2013
6. J Kittler et al, On combining classifiers, PAMI, pp 226-239, 1998 7. LI Kuncheva, Combining pattern classifiers, Wiley 2004 8. M Pelillo, Similarity-based pattern analysis and recognition,
Springer 2013 9. C Rasmussen, The infinite Gaussian mixture model, In Advances in
Neural Information Processing Systems 12, MIT Press 2000 10. AR Webb & KD Copsey, Statistical Pattern Recognition, Wiley 2011 11. DH Wolpert, Stacked generalization, Neural Networks,5, pp
241-259, 1992 12. AY Wright et al., Robust face recognition via sparse
representation. IEEE PAMI, 2009.
40