1
Pattern Recognition:Statistical and Neural
Lonnie C. Ludeman
Lecture 14
Oct 14, 2005
Nanjing University of Science & Technology
2
Lecture 14 Topics
1. Review structures of the Optimal Classifier
2. Define linear functions, hyperplanes, boundaries, unit normals, and various distances
3. Use linear discriminant functions for defining classifiers, with examples
3
Motivation!
4
Optimum Decision Rules: 2-class Gaussian

Case 1: K1 ≠ K2 (Quadratic Processing)

if  -(x - M1)^T K1^-1 (x - M1) + (x - M2)^T K2^-1 (x - M2)  >  T1  decide C1,  <  T1  decide C2

Case 2: K1 = K2 = K (Linear Processing)

if  (M1 - M2)^T K^-1 x  >  T2  decide C1,  <  T2  decide C2

Review 1
5
Optimum Decision Rules: 2-class Gaussian (continued)

Case 3: K1 = K2 = K = σ²I (Linear Processing)

if  (M1 - M2)^T x  >  T3  decide C1,  <  T3  decide C2

Review 2
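A minimal numerical sketch of these three rules in Python with NumPy (the function names, and the idea of passing the thresholds T1, T2, T3 in as arguments, are mine, not from the slides):

    import numpy as np

    def case1_quadratic(x, M1, M2, K1, K2, T1):
        # Case 1 (K1 != K2): quadratic statistic compared against T1
        q = (-(x - M1) @ np.linalg.inv(K1) @ (x - M1)
             + (x - M2) @ np.linalg.inv(K2) @ (x - M2))
        return "C1" if q > T1 else "C2"

    def case2_linear(x, M1, M2, K, T2):
        # Case 2 (K1 = K2 = K): the statistic is linear in x
        return "C1" if (M1 - M2) @ np.linalg.inv(K) @ x > T2 else "C2"

    def case3_linear(x, M1, M2, T3):
        # Case 3 (K = sigma^2 I): correlate x with the mean difference
        return "C1" if (M1 - M2) @ x > T3 else "C2"

For example, case3_linear(np.array([1.0, 0.0]), np.ones(2), np.zeros(2), 0.0) returns "C1", since (M1 - M2)^T x = 1 > 0.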
6
M-Class General Gaussian MPE and MAP

Case 1: K1, K2, …, KM unequal

Qi(x) = (x - Mi)^T Ki^-1 (x - Mi) - 2 ln P(Ci) + ln |Ki|

Select Class Cj if Qj(x) is MINIMUM

Case 2: K1 = K2 = … = KM = K

Li(x) = Mi^T K^-1 x - ½ Mi^T K^-1 Mi + ln P(Ci)

Select Class Cj if Lj(x) is MAXIMUM

Review 3
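As a sketch, both statistics can be evaluated directly and the selection done with argmin/argmax (NumPy; the function names and the driver line are mine, not from the slides):

    import numpy as np

    def Q(x, Mi, Ki, Pi):
        # Case 1 statistic; select the class whose Q_i(x) is MINIMUM
        d = x - Mi
        return (d @ np.linalg.inv(Ki) @ d
                - 2.0 * np.log(Pi) + np.log(np.linalg.det(Ki)))

    def L(x, Mi, K_inv, Pi):
        # Case 2 statistic (common K); select the class whose L_i(x) is MAXIMUM
        return Mi @ K_inv @ x - 0.5 * (Mi @ K_inv @ Mi) + np.log(Pi)

    # Case 1 decision for a pattern x, given parameter lists Ms, Ks, Ps:
    # j = np.argmin([Q(x, Ms[i], Ks[i], Ps[i]) for i in range(len(Ps))])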
7
M-Class General Gaussian: Bayes

The Bayes decision rule is determined from a set of yi(x) defined below, where

Ck :  X ~ N(Mk, Kk),  P(Ck)

p(x|Ck) = 1 / ( (2π)^(N/2) |Kk|^(1/2) ) · exp( -½ (x - Mk)^T Kk^-1 (x - Mk) )

Review 4
8
yi(x) = Σ (j = 1 to M)  Cij / ( (2π)^(N/2) |Kj|^(1/2) ) · exp( -½ (x - Mj)^T Kj^-1 (x - Mj) ) P(Cj)

Taking the ln of yi(x) for this case does not simplify to a linear or quadratic processor. The structure of the optimum classifier uses a sum of exp(quadratic form) terms and is thus a special form of nonlinear processing built from quadratic forms.

Review 5
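A direct sketch of this Bayes computation (NumPy; gauss_pdf, the cost matrix C, and the parameter lists are names I introduce, not from the slides; the decision is the argmin over i):

    import numpy as np

    def gauss_pdf(x, M, K):
        # N-variate Gaussian density p(x|C_j)
        N = x.size
        d = x - M
        norm = (2 * np.pi) ** (N / 2) * np.sqrt(np.linalg.det(K))
        return np.exp(-0.5 * d @ np.linalg.inv(K) @ d) / norm

    def y(i, x, C, Ms, Ks, Ps):
        # y_i(x) = sum_j C_ij p(x|C_j) P(C_j)
        return sum(C[i, j] * gauss_pdf(x, Ms[j], Ks[j]) * Ps[j]
                   for j in range(len(Ps)))

    # Bayes decision: i_star = np.argmin([y(i, x, C, Ms, Ks, Ps) for i in range(M_classes)])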
9
Reasons for studying linear, quadratic, and other special forms of nonlinear processing

Gaussian assumptions lead to linear and quadratic processing.

If Gaussian, we can find or learn a usable decision rule, and the rule is optimum.

If non-Gaussian, we can still find or learn a usable decision rule; however, the rule is NOT necessarily optimum.
10
Linear functions

One variable:    f(x1) = w1x1 + w2

Two variables:   f(x1, x2) = w1x1 + w2x2 + w3

Three variables: f(x1, x2, x3) = w1x1 + w2x2 + w3x3 + w4
11
Constant: w1x1 + w2 = 0

Line:     w1x1 + w2x2 + w3 = 0

Plane:    w1x1 + w2x2 + w3x3 + w4 = 0

?:        w1x1 + w2x2 + w3x3 + w4x4 + w5 = 0

Answer: Hyperplane
12
Hyperplanes

w1x1 + w2x2 + … + wnxn + wn+1 = 0    (n-dimensional hyperplane)

Define
x  = [x1, x2, …, xn]^T
w0 = [w1, w2, …, wn]^T

An alternative representation of a hyperplane is

w0^T x + wn+1 = 0
13
Hyperplanes as boundaries for Regions

Hyperplane boundary:  w0^T x + wn+1 = 0

Positive side of the hyperplane boundary:  R+ = { x : w0^T x + wn+1 > 0 }

Negative side of the hyperplane boundary:  R- = { x : w0^T x + wn+1 < 0 }
14
15
Definitions

(1) Unit normal:  u = w0 / ||w0||
16
(2) Distance from a point y to the hyperplane:  D(y) = | w0^T y + wn+1 | / ||w0||

(3) Distance from the origin to the hyperplane:  D(0) = | wn+1 | / ||w0||
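In code, definitions (1) through (3) reduce to a few NumPy lines (the particular w0 and wn+1 values here are hypothetical):

    import numpy as np

    w0 = np.array([3.0, 4.0])      # hypothetical [w1, w2]
    w_last = -10.0                 # hypothetical w_{n+1}

    u = w0 / np.linalg.norm(w0)    # (1) unit normal, here [0.6, 0.8]

    def dist(y):
        # (2) distance from point y to the hyperplane
        return abs(w0 @ y + w_last) / np.linalg.norm(w0)

    d_origin = abs(w_last) / np.linalg.norm(w0)   # (3) distance from origin, here 2.0

    print(dist(np.zeros(2)) == d_origin)          # True: the origin is just a point y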
17
(4) Linear discriminant functions:  d(x) = w^T xa

where
xa = [x1, x2, …, xn, 1]^T is the augmented pattern vector
w  = [w1, w2, …, wn, wn+1]^T is the weight vector
18
Linear Decision Rule: 2-Class Case using a single linear discriminant function

given: d(x) = w1x1 + w2x2 + … + wnxn + wn+1

for a vector x: if d(x) > 0 decide C1; if d(x) < 0 decide C2; on the boundary d(x) = 0 decide randomly between C1 and C2

No claim of optimality !!!
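A sketch of this rule using the augmented-vector form from definition (4) (the weight values are hypothetical; the random boundary tie-break follows the slides):

    import numpy as np

    w = np.array([2.0, -1.0, 0.5])        # hypothetical [w1, w2, w_{n+1}]

    def decide(x):
        d = w @ np.append(x, 1.0)         # d(x) = w^T [x1, ..., xn, 1]^T
        if d > 0:
            return "C1"
        if d < 0:
            return "C2"
        return np.random.choice(["C1", "C2"])   # on the boundary: random

    print(decide(np.array([1.0, 1.0])))   # d = 2 - 1 + 0.5 = 1.5 > 0, so "C1"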
19
20
Linear Decision Rule: 2-Class Case using two linear discriminant functions

given two discriminant functions d1(x) and d2(x), define the decision rule by the regions shown, except on the boundaries d1(x) = 0 and d2(x) = 0, where we decide randomly between C1 and C2
21
Decision regions (2-class case) using two linear discriminant functions and AND logic
22
Decision regions (2-class case) using two linear discriminant functions (continued)
23
Decision regions (2-class case) alternative formulation using two linear discriminant functions
24
Decision regions (2-class case) using alternative form of two linear discriminant functions
equivalent to
25
Decision regions (3-class case) using two linear discriminant functions
26
Decision regions (4-class case) using two linear discriminant functions
27
Decision region R1 (M-class case) using K linear discriminant functions
28
Example: Piecewise linear boundaries
Given the following discriminant functions
29
Define the following decision rule:

If   ( d1(x) > 0 AND d2(x) > 0 )
OR   ( d3(x) > 0 AND d4(x) > 0 AND d5(x) > 0 AND d6(x) > 0 )

then decide x comes from class C1; on the boundaries decide randomly; otherwise decide C2.

Show the decision regions in the two-dimensional pattern space.

Example Continued
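A sketch of this AND/OR rule (the six discriminant functions are passed in as callables, since their coefficients appear only on the slide's figure; treating "any di(x) exactly zero" as the boundary is my simplification):

    import numpy as np

    def decide(x, d):
        # d is a list of the six discriminant functions d1..d6
        v = [di(x) for di in d]
        if any(vi == 0 for vi in v):
            return np.random.choice(["C1", "C2"])   # on a boundary: random
        if (v[0] > 0 and v[1] > 0) or all(vi > 0 for vi in v[2:6]):
            return "C1"    # (d1>0 AND d2>0) OR (d3>0 AND d4>0 AND d5>0 AND d6>0)
        return "C2"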
30
Solution:
31
Lecture 14 Summary

1. Reviewed structures of the Optimal Classifier
2. Defined linear functions, hyperplanes, boundaries, unit normals, and various distances
3. Used linear discriminant functions for defining classifiers, with examples
32
End of Lecture 14