
Page 1: Computer Vision

Colorado School of Mines
Professor William Hoff, Dept. of Electrical Engineering & Computer Science
http://inside.mines.edu/~whoff/

Page 2: Pattern Recognition

Page 3: Pattern Recognition

• Pattern recognition is the process by which patterns in data are found, recognized, or discovered
– It usually aims to classify data (patterns) based either on a priori knowledge or on statistical information extracted from the patterns
– The patterns to be classified are observations, defining points in a multidimensional space
• Classification is usually based on a set of patterns that have already been classified (e.g., by a person)
– This set of patterns is termed the training set
– The learning strategy is called supervised
• Learning can also be unsupervised
– In this case there is no training set
– Instead, the system establishes the classes itself based on the statistical regularities of the patterns
• Resources
– The Statistics and Machine Learning Toolbox in Matlab
– Journals include “Pattern Recognition” and “IEEE Trans. Pattern Analysis & Machine Intelligence”
– A good book: “Pattern Recognition and Machine Learning” by Bishop

Page 4: Approaches

• Statistical Pattern Recognition
– We assume that the patterns are generated by a probabilistic system
– The data is reduced to vectors of numbers, and statistical techniques are used for classification
• Structural Pattern Recognition
– The process is based on the structural interrelationships of features
– The data is converted to a discrete structure (such as a grammar or a graph), and classification techniques such as parsing and graph matching are used
• Neural
– The model simulates the behavior of biological neural networks

Page 5: Unsupervised Pattern Recognition

• The system must learn the classifier from unlabeled data

• It’s related to the problem of trying to estimate the underlying probability density function of the data

• Approaches to unsupervised learning include
– clustering (e.g., k-means, mixture models, hierarchical clustering)
– techniques for dimensionality reduction (e.g., principal component analysis, independent component analysis, non-negative matrix factorization, singular value decomposition)

Page 6: k-means Clustering

• Given a set of n-dimensional vectors
• Also specify k (the number of desired clusters)
• The algorithm partitions the vectors into k clusters so as to minimize the sum, over all clusters, of the within-cluster sums of point-to-cluster-centroid distances (written out below)
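Written out, this is the standard k-means criterion (a textbook formulation of what the slide describes, not printed on the slide itself):

$$J = \sum_{j=1}^{k} \sum_{\mathbf{x} \in S_j} \|\mathbf{x} - \mathbf{m}_j\|^2$$

where $S_j$ is the set of vectors assigned to cluster $j$ and $\mathbf{m}_j$ is that cluster's centroid.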

Page 7: k-means Algorithm

1. Given a set of vectors {xi}
2. Randomly choose a set of k means {mi} as the center of each cluster
3. For each vector xi, compute the distance to each mi; assign xi to the closest cluster
4. Update the means to get a new set of cluster centers
5. Repeat steps 3 and 4 until there is no more change in the cluster centers

k-means is guaranteed to terminate, but may not find the global optimum in the least-squares sense. A sketch of these steps appears below.
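A minimal Matlab sketch of these five steps, assuming pdist2 from the Statistics and Machine Learning Toolbox is available and that no cluster becomes empty (the helper name simpleKMeans is ours, not from the slides):

function [idx, M] = simpleKMeans(X, k)
% X is an n-by-d data matrix; k is the number of clusters.
% Step 2: initialize the k means with k distinct random data points.
M = X(randperm(size(X,1), k), :);
idx = zeros(size(X,1), 1);
while true
    % Step 3: assign each vector to the cluster with the closest mean.
    D = pdist2(X, M);              % n-by-k matrix of point-to-mean distances
    [~, newIdx] = min(D, [], 2);
    % Step 5: stop when the assignments (and hence the centers) stop changing.
    if isequal(newIdx, idx), break; end
    idx = newIdx;
    % Step 4: update each mean (assumes every cluster stays non-empty).
    for j = 1:k
        M(j,:) = mean(X(idx==j, :), 1);
    end
end
end

The toolbox kmeans function used later in these slides implements the same iteration, with extras such as the 'EmptyAction' option for handling empty clusters.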

Page 8: Example: Indexed Storage of Color Images

• If an image uses 8 bits for each of R, G, B, there are 2^24 possible colors
• Most images don't use the entire color space of possible values; we can get by with fewer
• We'll use k-means clustering to find the reduced set of colors

[Figures: the same image using the full color space and using only 32 discrete colors]

Page 9: Indexed Storage of Color Images

• For each image, we find a set of colors that are a good approximation of the entire set of pixels in the image, and put those colors into a colormap
• Then for each pixel, we just store an index into the colormap

[Figures: color image using 64 colors, and the corresponding image of indices (0..63)]

Page 10: Indexed Storage of Color Images

• Use a colormap
– Image f(x,y) stores indices into a lookup table (colormap)
– Colormap specifies the RGB values for each index

>> cmap(1:20,:)

ans =

0.0980    0.0941    0.1020

0.1451    0.1020    0.1098

0.1608    0.1412    0.1804

0.2196    0.1216    0.1020

0.2431    0.1569    0.1373

0.2196    0.1843    0.2118

0.2471    0.2353    0.2824

0.3137    0.1490    0.1020

0.3294    0.2039    0.1569

0.4118    0.1725    0.0902

0.4235    0.2314    0.1373

0.3176    0.2471    0.2431

0.4039    0.2784    0.2118

0.4078    0.3137    0.2627

0.3255    0.3059    0.3490

0.5176    0.2039    0.0157

0.5059    0.2275    0.0902

0.6039    0.2392    0.0471

0.6392    0.3059    0.0353

0.5098    0.2706    0.1529

:

The display system will display these RGB values (e.g., a pixel whose index is 17 is shown using row 17 of the colormap).

[img,cmap] = imread('kids.tif');
imshow(img, cmap);

Also see rgb2ind and ind2rgb.
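As a brief sketch of the round trip with the two functions just named (peppers.png is one of Matlab's bundled demo images; the choice of 32 colors is ours):

RGB = imread('peppers.png');
[ind, cmap] = rgb2ind(RGB, 32);  % quantize to a colormap of at most 32 colors
figure, imshow(ind, cmap);       % display the indexed image
RGB2 = ind2rgb(ind, cmap);       % convert back to a truecolor (double) image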

Page 11

clear all
close all

% Read image
RGB = im2double(imread('peppers.png')); RGB = imresize(RGB, 0.5);
%RGB = im2double(imread('pears.png')); RGB = imresize(RGB, 0.5);
%RGB = im2double(imread('tissue.png')); RGB = imresize(RGB, 0.5);

figure, imshow(RGB);

% Convert 3-dimensional (M,N,3) array to 2D (M*N,3)
X = reshape(RGB, [], 3);

k = 16;  % Number of clusters to find

% Call kmeans. It returns:
%  IDX: for each point in X, which cluster (1..k) it was assigned to
%  C: the k cluster centers
[IDX,C] = kmeans(X, k, ...
    'EmptyAction', 'drop');  % if a cluster becomes empty, drop it

% Reshape the index array back to a 2-dimensional image
I = reshape(IDX, size(RGB,1), size(RGB,2));

% Show the reduced color image
figure, imshow(I, C);

% Plot pixels in color space
figure
hold on
for i = 1:20:size(X,1)
    plot3(X(i,1), X(i,2), X(i,3), ...
        '.', 'Color', C(IDX(i),:));
end

% Also plot cluster centers
for i = 1:k
    plot3(C(i,1), C(i,2), C(i,3), 'ro', 'MarkerFaceColor', 'r');
end
xlabel('Red'), ylabel('Green'), zlabel('Blue');
axis equal
axis vis3d

Page 12

[Figure: the image's pixels plotted in RGB color space, each dot colored by its cluster center, with the cluster centers marked in red]

Page 13: Supervised Statistical Methods

• A class is a set of objects having some important properties in common
– We might have a known description for each class
– We might have a set of samples for each class
• A feature extractor is a program that inputs the data (image) and extracts features that can be used in classification; these values are put into a feature vector
• A classifier is a program that inputs the feature vector and assigns it to one of a set of designated classes, or to the “reject” class
• We will look at these classifiers:
– Decision tree
– Nearest class mean
• Another powerful classifier is the “support vector machine”

Page 14: Feature Vector Representation

• A feature vector is a vector x = [x1, x2, …, xn], where each xj is a real number
• The elements xj may be object measurements
• For example, xj may be a count of object parts or properties
• Example: an object region can be represented by [#holes, #strokes, moments, …]

from Shapiro & Stockman

Page 15: Possible Features for Character Recognition

[Figure: from Shapiro & Stockman]

Page 16: Discriminant functions

• Functions f(x, K) perform some computation on feature vector x

• Knowledge K from training or programming is used

• Final stage determines class

from Shapiro & Stockman

Page 17: Decision Trees

• Strength: easy to understand
• Weakness: overtraining

from Shapiro & Stockman

Class  #holes  #strokes  Best axis  Moment of inertia
'A'    1       3         90         Med
'B'    2       1         90         Large
'8'    2       0         90         Med
'0'    1       0         90         Large
'1'    0       1         90         Low
'W'    0       4         90         Large
'X'    0       2         ?          Large
'*'    0       0         ?          Large
'-'    0       1         0          Low
'/'    0       1         60         Low

Page 18: Entropy-Based Automatic Decision Tree Construction

[Figure: the root node (Node 1), with the questions: what feature should be used? what values?]

Training set S: x1 = (f11, f12, …, f1m), x2 = (f21, f22, …, f2m), …, xn = (fn1, fn2, …, fnm)

Choose the feature which results in the most information gain, as measured by the decrease in entropy (the standard formula is given below).

from Shapiro & Stockman
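The information-gain criterion just described is usually written as follows (a textbook formulation, not printed on the slide), where S_v is the subset of S for which feature A takes value v:

$$\mathrm{Gain}(S, A) = \mathrm{Entropy}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\,\mathrm{Entropy}(S_v)$$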

Page 19: Entropy

Given a set of training vectors S, if there are c classes,

$$\mathrm{Entropy}(S) = -\sum_{i=1}^{c} p_i \log_2 p_i$$

where p_i is the proportion of category i examples in S.

If all examples belong to the same category, the entropy is 0.

If the examples are equally mixed (1/c examples of each class), the entropy is at its maximum (log2 c, which is 1.0 for c = 2).

e.g., for c = 2: -0.5 log2 0.5 - 0.5 log2 0.5 = -0.5(-1) - 0.5(-1) = 1

from Shapiro & Stockman
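A quick numeric check of the c = 2 case in Matlab (the variable names are ours):

p = [0.5 0.5];           % equal class proportions
H = -sum(p .* log2(p))   % yields 1.0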

Page 20: Decision-Tree Classifier

• Uses subsets of features in sequence

• Feature extraction may be interleaved with classification decisions

• Can be easy to design and efficient in execution

from Shapiro & Stockman

Page 21: Matlab demo

• See “Decision Trees” in the Matlab help
– Statistics and Machine Learning Toolbox

• Fisher's iris data consists of measurements on the sepal length, sepal width, petal length, and petal width of 150 iris specimens. There are 50 specimens from each of three species. 

clear all
close all

% Loads:
%  meas(150,4) - each row is a pattern (a 4-dimensional vector)
%  species{150} - each element is the name of a flower
load fisheriris

% Create a vector of class numbers. We know that the input data is grouped
% so that 1..50 is the 1st class, 51..100 is the 2nd class, 101..150 is the
% 3rd class.
y(1:50,1) = 1;     % class 'setosa'
y(51:100,1) = 2;   % class 'versicolor'
y(101:150,1) = 3;  % class 'virginica'

X = meas(:, 1:2);  % just use the first 2 features (easier to visualize)

Page 22

% We will just use the first 2 features, since it is easier to visualize.
% However, when we do that there is a chance that some points will be
% duplicated (since we are ignoring the other features). If so, just keep
% the first point.
indicesToKeep = true(size(X,1),1);
for i = 1:size(X,1)
    % See if we already have the ith point.
    if any((X(i,1)==X(1:i-1,1)) & (X(i,2)==X(1:i-1,2)))
        indicesToKeep(i) = false;  % Skip this point
    end
end
X = X(indicesToKeep, :);
y = y(indicesToKeep);

% Plot the feature vectors.
figure
hold on
plot(X(y==1,1), X(y==1,2), '*r');
plot(X(y==2,1), X(y==2,2), '*g');
plot(X(y==3,1), X(y==3,2), '*b');
xlabel('Sepal length'), ylabel('Sepal width');

[Figure: scatter plot of the three classes in sepal length/width space]

Page 23: Choose a test point to classify

% Specify a test vector to classify.
xTest = [5.6, 3.1];

plot(xTest(1), xTest(2), 'ok');  % the black circle (k) is the test point
hold off

[Figure: the test point shown as a black circle among the three classes]

Page 24: Construct a decision (or classification) tree

% A "classification" tree produces classification decisions that are
% "nominal" (i.e., names). A "regression" tree produces classification
% decisions that are numeric.
ctree = ClassificationTree.fit(X, y, ...
    'MinParent', 10);  % default is 10

view(ctree);                   % Prints a text description
view(ctree, 'mode', 'graph');  % Draws a graphic description of the tree
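In newer Matlab releases, fitctree is the documented successor to ClassificationTree.fit; an equivalent call would be approximately:

ctree = fitctree(X, y, 'MinParentSize', 10);  % 'MinParentSize' replaces 'MinParent'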

Page 25: Use decision tree to classify a vector

In our example, x1 = 5.6 and x2 = 3.1.

% Classify a test vector, using the decision tree.
class = predict(ctree, xTest);
fprintf('Test vector is classified as %d\n', class);

Page 26: View the whole feature space

% Visualize the entire feature space, and what class each vector belongs to.
xmin = min(X(:,1)); xmax = max(X(:,1));
ymin = min(X(:,2)); ymax = max(X(:,2));

hold on;
dx = (xmax-xmin)/40;
dy = (ymax-ymin)/40;
% Use loop variables px,py so the training labels y are not overwritten.
for px = xmin:dx:xmax
    for py = ymin:dy:ymax
        class = predict(ctree, [px py]);
        if class==1
            plot(px,py,'.r');
        elseif class==2
            plot(px,py,'.g');
        else
            plot(px,py,'.b');
        end
    end
end
hold off;

Note that some training vectors are incorrectly classified.
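An alternative to the double loop: predict accepts many query rows at once, so the whole grid can be classified in a single call (a sketch; the grid construction here is ours):

[gx, gy] = meshgrid(xmin:dx:xmax, ymin:dy:ymax);
gridClass = predict(ctree, [gx(:), gy(:)]);   % classify every grid point at once
% Reshape back to the grid to display as an image of class labels:
figure, imagesc([xmin xmax], [ymin ymax], reshape(gridClass, size(gx)));
axis xy   % put the y-axis back in the usual orientation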

Page 27

The input parameter 'MinParent' has a default value of 10. Setting 'MinParent' = 1 will cause the decision tree to keep splitting (making new nodes) as long as any training instances are still incorrectly labeled.

[Figure: decision tree grown with 'MinParent' = 1]

Page 28: Generalization

• Making the tree accurately classify every single training point leads to “overfitting”
– If you have lots of data, some training points may be noisy
– We don't want the tree to learn the “noise”
• We want the tree to generalize from the training data
– It should learn the general, underlying rules
– It's ok to misclassify a few training points

Page 29

[Figure: classification regions in sepal length/width space for 'MinParent' = 1]

Page 30

[Figure: classification regions for 'MinParent' = 20]

Page 31

[Figure: classification regions for 'MinParent' = 40]

Page 32: Classification using nearest class mean

• Compute the Euclidean distance between feature vector x and the mean of each class
• Choose the closest class, if it is close enough (reject otherwise)
• Low error rate (errors occur where the classes intersect)

$$d_i = \|\mathbf{x} - \bar{\mathbf{x}}_i\| = \left([\mathbf{x} - \bar{\mathbf{x}}_i]^T [\mathbf{x} - \bar{\mathbf{x}}_i]\right)^{1/2}$$

from Shapiro & Stockman
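A minimal sketch of this classifier, assuming the class means are stacked in a hypothetical k-by-d matrix M (one row per class) and rejectThreshold is a tuning parameter of our choosing:

% Euclidean distance from xTest (1-by-d) to every class mean.
d = sqrt(sum((M - xTest).^2, 2));   % implicit expansion (R2016b+)
[dmin, bestClass] = min(d);
if dmin > rejectThreshold
    bestClass = 0;                  % assign to the "reject" class
end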

Page 33: Scaling Distance Using Standard Deviations

• Scale the distance to the mean of class c according to the measured standard deviation σi in each direction i
• Otherwise, a point near the top of class 3 will be closer to the class 2 mean

$$d_c = \left(\sum_i \left[\frac{x_i - \bar{x}_{c,i}}{\sigma_i}\right]^2\right)^{1/2}$$

from Shapiro & Stockman

Page 34: If ellipses are not aligned with axes

• Instead of using the standard deviation along each separate axis, use the covariance matrix C
• Variance (of a single variable x) is defined as

$$\sigma_{xx}^2 = \frac{1}{N-1}\sum_{i=1}^{N}(x_i - \bar{x})^2$$

• Covariance (of two variables, x and y) is

$$\sigma_{xy}^2 = \frac{1}{N-1}\sum_{i=1}^{N}(x_i - \bar{x})(y_i - \bar{y})$$

• These are collected into the covariance matrix

$$C = \begin{pmatrix} \sigma_{xx}^2 & \sigma_{xy}^2 \\ \sigma_{xy}^2 & \sigma_{yy}^2 \end{pmatrix}$$

[Figure: two point clouds ('x' and 'o') whose spread is not aligned with the coordinate axes]

Page 35: Examples

• Notes
– Off-diagonal values are small if the variables are independent
– Off-diagonal values are large if the variables are correlated (they vary together)

C = 0.8590  0.8836
    0.8836  1.1069

C = 0.0497  0.0123
    0.0123  0.8590

[Figures: scatter plots of the two data sets corresponding to these covariance matrices]

Matlab “cov” function

Page 36: Probability Density

• Let's assume that the errors are Gaussian
• The probability density for a 2-dimensional error vector x is

$$p(\mathbf{x}) = \frac{1}{2\pi\,|C|^{1/2}} \exp\!\left(-\tfrac{1}{2}\,\mathbf{x}^T C^{-1} \mathbf{x}\right)$$

[Figure: surface plot of the 2D Gaussian density over the x1 and x2 axes]

Page 37: Probability Density

• Look at where the probability is a constant. This is where the exponent is a constant:

$$\mathbf{x}^T C^{-1} \mathbf{x} = z^2$$

• This is the equation of an ellipse.
• For example, with uncorrelated errors this reduces to

$$\frac{x^2}{\sigma_{xx}^2} + \frac{y^2}{\sigma_{yy}^2} = z^2$$

• We can choose z to get a desired probability. For z = 3, the cumulative probability is about 97%.

Page 38: Plotting

• Contours of constant probability

[Figures: contour plots of constant probability in the (x1, x2) plane]

Page 39: Matlab code

% Show covariance of two variables

clear all
close all

randn('state', 0);

yp = randn(40,1);
xp = 0.25 * randn(40,1);

% xp = randn(40,1);
% yp = xp + 0.5*randn(40,1);

plot(xp, yp, '+'), axis equal;
axis([-3.0 3.0 -3.0 3.0]);

C = cov(xp, yp)

Cinv = inv(C);
detCsqrt = sqrt(det(C));

% Plot the probability density,
% p(x,y) = (1/(2*pi*det(C)^0.5)) * exp(-x'*Cinv*x/2)
L = 3.0;
delta = 0.1;
[x1, x2] = meshgrid(-L:delta:L, -L:delta:L);

for i = 1:size(x1,1)
    for j = 1:size(x1,2)
        x = [x1(i,j); x2(i,j)];
        fX(i,j) = (1/(2*pi*detCsqrt)) * exp( -0.5*x'*Cinv*x );
    end
end

hold on
% meshc(x1,x2,fX);  % this does a surface plot
contour(x1,x2,fX);  % this does a contour plot
xlabel('x1 - axis');
ylabel('x2 - axis');

Page 40: Example: flower data from Matlab

% This loads in the measurements:
%  meas(N,4) are the feature values
%  species{N} are the species names
load fisheriris;

% There are three classes
X1 = meas(strmatch('setosa', species), 3:4);  % use features 3,4
X2 = meas(strmatch('virginica', species), 3:4);
X3 = meas(strmatch('versicolor', species), 3:4);

hold on
plot( X1(:,1), X1(:,2), '.r' );
plot( X2(:,1), X2(:,2), '.g' );
plot( X3(:,1), X3(:,2), '.b' );

m1 = sum(X1)/length(X1);
m2 = sum(X2)/length(X2);
m3 = sum(X3)/length(X3);

plot( m1(1), m1(2), '*r' );
plot( m2(1), m2(2), '*g' );
plot( m3(1), m3(2), '*b' );

[Figure: petal length vs. petal width for the three classes, with the class means marked]

Page 41: Overlaying probability contours

% Plot the contours of equal probability
[f1, f2] = meshgrid( min(meas(:,3)):0.1:max(meas(:,3)), ...
    min(meas(:,4)):0.1:max(meas(:,4)) );

C = cov(X1);
Cinv = inv(C);
detCsqrt = sqrt(det(C));
for i = 1:size(f1,1)
    for j = 1:size(f1,2)
        x = [f1(i,j) f2(i,j)];
        fX(i,j) = (1/(2*pi*detCsqrt)) * exp( -0.5*(x-m1)*Cinv*(x-m1)' );
    end
end
contour(f1, f2, fX);

C = cov(X2);
Cinv = inv(C);
detCsqrt = sqrt(det(C));
for i = 1:size(f1,1)
    for j = 1:size(f1,2)
        x = [f1(i,j) f2(i,j)];
        fX(i,j) = (1/(2*pi*detCsqrt)) * exp( -0.5*(x-m2)*Cinv*(x-m2)' );
    end
end
contour(f1, f2, fX);

C = cov(X3);
Cinv = inv(C);
detCsqrt = sqrt(det(C));
for i = 1:size(f1,1)
    for j = 1:size(f1,2)
        x = [f1(i,j) f2(i,j)];
        fX(i,j) = (1/(2*pi*detCsqrt)) * exp( -0.5*(x-m3)*Cinv*(x-m3)' );
    end
end
contour(f1, f2, fX);

Page 42: Mahalanobis distance

• Given an unknown feature vector x, which class is it closest to?
– Assume you know the class centers (centroids) zi and their covariances Ci
– We find the class whose center has the smallest distance to the point in feature space
• The distance is weighted by the covariance; this is called the “Mahalanobis distance”
• For example, the Mahalanobis distance of feature vector x to the ith class is

$$d_i = (\mathbf{x} - \mathbf{z}_i)^T C_i^{-1} (\mathbf{x} - \mathbf{z}_i)$$

• where Ci is the covariance matrix of the feature vectors in the ith class
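A short sketch of classification by Mahalanobis distance, reusing X1..X3 and m1..m3 from the flower-data slides (the test point is an arbitrary choice for illustration; Matlab's mahal function computes the same squared distance):

xTest = [4.5, 1.5];                 % hypothetical petal length/width to classify
means = {m1, m2, m3};
covs  = {cov(X1), cov(X2), cov(X3)};
d = zeros(1,3);
for c = 1:3
    v = xTest - means{c};           % 1-by-2 difference from the class mean
    d(c) = v * inv(covs{c}) * v';   % squared Mahalanobis distance
end
[~, bestClass] = min(d);
fprintf('Closest class (Mahalanobis): %d\n', bestClass);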

Page 43: Summary / Questions

• In pattern recognition, we classify “patterns” (usually in the form of vectors) into “classes”.
• Training of the classifier can be supervised (i.e., we have to provide labeled training data) or unsupervised.
– k-means clustering is an example of unsupervised learning
• Approaches to classification include
– Statistical
– Structural
– Neural
• Name some statistical pattern recognition methods.