
Page 1:

Pattern Recognition: Statistical and Neural

Lonnie C. Ludeman

Lecture 14

Oct 14, 2005

Nanjing University of Science & Technology

Page 2:

Lecture 14 Topics

1. Review structures of the optimal classifier

2. Define linear functions, hyperplanes, boundaries, unit normals, and various distances

3. Use of linear discriminant functions for defining classifiers, with examples

Page 3:

Motivation!

Motivation!

Motivation!

Page 4:

Optimum Decision Rules: 2-class Gaussian

Case 1: $K_1 \neq K_2$ (Quadratic Processing)

if $-(x - M_1)^T K_1^{-1} (x - M_1) + (x - M_2)^T K_2^{-1} (x - M_2) \;\overset{C_1}{\underset{C_2}{\gtrless}}\; T_1$

Case 2: $K_1 = K_2 = K$ (Linear Processing)

if $(M_1 - M_2)^T K^{-1} x \;\overset{C_1}{\underset{C_2}{\gtrless}}\; T_2$

Review 1
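
To make these rules concrete, here is a minimal numpy sketch (not from the lecture); the means, covariances, thresholds, and test point below are made-up placeholders.

```python
import numpy as np

def quadratic_rule(x, M1, M2, K1, K2, T1):
    """Case 1 (K1 != K2): quadratic processing for the 2-class Gaussian rule."""
    q = (-(x - M1) @ np.linalg.inv(K1) @ (x - M1)
         + (x - M2) @ np.linalg.inv(K2) @ (x - M2))
    return "C1" if q > T1 else "C2"

def linear_rule(x, M1, M2, K, T2):
    """Case 2 (K1 = K2 = K): linear processing."""
    return "C1" if (M1 - M2) @ np.linalg.inv(K) @ x > T2 else "C2"

# Placeholder parameters for illustration only.
M1, M2 = np.array([1.0, 0.0]), np.array([-1.0, 0.0])
K1, K2 = np.eye(2), 2.0 * np.eye(2)
x = np.array([0.5, 0.2])
print(quadratic_rule(x, M1, M2, K1, K2, T1=0.0))  # -> C1
print(linear_rule(x, M1, M2, np.eye(2), T2=0.0))  # -> C1
```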

Page 5:

Optimum Decision Rules: 2-class Gaussian (cont)

Case 3: $K_1 = K_2 = K = \sigma^2 I$ (Linear Processing)

if $(M_1 - M_2)^T x \;\overset{C_1}{\underset{C_2}{\gtrless}}\; T_3$

Review 2

Page 6:

M-Class General Gaussian MPE and MAP

Case 1: the $K_i$ are unequal.

$Q_i(x) = (x - M_i)^T K_i^{-1} (x - M_i) - 2 \ln P(C_i) + \ln |K_i|$

Select class $C_j$ if $Q_j(x)$ is MINIMUM.

Case 2: $K_1 = K_2 = \dots = K_M = K$.

$L_i(x) = M_i^T K^{-1} x - \frac{1}{2} M_i^T K^{-1} M_i + \ln P(C_i)$

Select class $C_j$ if $L_j(x)$ is MAXIMUM.

Review 3
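
A minimal numpy sketch of both M-class rules (an illustration, not the lecture's code); the means, covariances, and priors are placeholder values.

```python
import numpy as np

def classify_quadratic(x, means, covs, priors):
    """Case 1: select the class Cj whose Qj(x) is MINIMUM."""
    Q = [(x - M) @ np.linalg.inv(K) @ (x - M)
         - 2.0 * np.log(P) + np.log(np.linalg.det(K))
         for M, K, P in zip(means, covs, priors)]
    return int(np.argmin(Q))

def classify_linear(x, means, K, priors):
    """Case 2 (K1 = ... = KM = K): select the class Cj whose Lj(x) is MAXIMUM."""
    Kinv = np.linalg.inv(K)
    L = [M @ Kinv @ x - 0.5 * M @ Kinv @ M + np.log(P)
         for M, P in zip(means, priors)]
    return int(np.argmax(L))

# Placeholder 3-class problem for illustration only.
means = [np.array([0.0, 0.0]), np.array([2.0, 2.0]), np.array([0.0, 3.0])]
covs = [np.eye(2)] * 3
priors = [1.0 / 3] * 3
x = np.array([1.8, 2.1])
print(classify_quadratic(x, means, covs, priors))    # -> 1
print(classify_linear(x, means, np.eye(2), priors))  # -> 1
```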

Page 7:

M-Class General Gaussian: Bayes

The Bayes decision rule is determined from a set of $y_i(x)$, defined below, where

$C_k : X \sim N(M_k, K_k)$ with prior $P(C_k)$

and

$p(x|C_k) = \frac{1}{(2\pi)^{N/2} |K_k|^{1/2}} \exp\left( -\frac{1}{2} (x - M_k)^T K_k^{-1} (x - M_k) \right)$

Review 4

Page 8:

$y_i(x) = \sum_{j=1}^{M} \frac{C_{ij}}{(2\pi)^{N/2} |K_j|^{1/2}} \exp\left( -\frac{1}{2} (x - M_j)^T K_j^{-1} (x - M_j) \right) P(C_j)$

Taking the ln of the $y_i(x)$ for this case does not simplify to a linear or quadratic processor.

The structure of the optimum classifier uses a sum of exponentials of quadratic forms, and thus is a special form of nonlinear processing using quadratic forms.

Review 5
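
For illustration, a numpy sketch of $y_i(x)$; it assumes, as is standard for Bayes risk with costs $C_{ij}$, that the class with the minimum $y_i(x)$ is selected. All parameters here are caller-supplied placeholders.

```python
import numpy as np

def gaussian_pdf(x, M, K):
    """p(x|Ck) for X ~ N(Mk, Kk), with x of dimension N."""
    N = len(x)
    d = x - M
    norm = (2.0 * np.pi) ** (N / 2) * np.sqrt(np.linalg.det(K))
    return np.exp(-0.5 * d @ np.linalg.inv(K) @ d) / norm

def bayes_decision(x, means, covs, priors, C):
    """Select the class Ci minimizing yi(x) = sum_j Cij p(x|Cj) P(Cj),
    where C is the M x M cost matrix."""
    M_classes = len(priors)
    y = [sum(C[i][j] * gaussian_pdf(x, means[j], covs[j]) * priors[j]
             for j in range(M_classes))
         for i in range(M_classes)]
    return int(np.argmin(y))
```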

Page 9:

Reasons for studying linear, quadratic, and other special forms of nonlinear processing

Gaussian assumptions lead to linear and quadratic processing.

If Gaussian, we can find or learn a usable decision rule, and the rule is optimum.

If non-Gaussian, we can still find or learn a usable decision rule; however, the rule is NOT necessarily optimum.

Page 10:

Linear functions

One variable: $f(x_1) = w_1 x_1 + w_2$

Two variables: $f(x_1, x_2) = w_1 x_1 + w_2 x_2 + w_3$

Three variables: $f(x_1, x_2, x_3) = w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4$

Page 11:

$w_1 x_1 + w_2 = 0$ (Constant)

$w_1 x_1 + w_2 x_2 + w_3 = 0$ (Line)

$w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 = 0$ (Plane)

$w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 x_4 + w_5 = 0$ (?)

Answer: Hyperplane

Page 12:

Hyperplanes

An n-dimensional hyperplane: $w_1 x_1 + w_2 x_2 + \dots + w_n x_n + w_{n+1} = 0$

Define $x = [x_1, x_2, \dots, x_n]^T$ and $w_0 = [w_1, w_2, \dots, w_n]^T$.

An alternative representation of a hyperplane is $w_0^T x + w_{n+1} = 0$.
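
A quick numpy check (placeholder weights only) that the scalar and vector forms of the hyperplane equation agree:

```python
import numpy as np

# Placeholder weights for a 4-dimensional hyperplane (illustration only).
w0 = np.array([1.0, -2.0, 0.5, 3.0])   # [w1, ..., wn]
w_n1 = -1.5                            # w_{n+1}
x = np.array([0.3, 0.1, -0.7, 0.4])

scalar_form = sum(wi * xi for wi, xi in zip(w0, x)) + w_n1
vector_form = w0 @ x + w_n1            # w0^T x + w_{n+1}
assert np.isclose(scalar_form, vector_form)
```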

Page 13:

Hyperplanes as boundaries for Regions

Hyperplane boundary: $w_0^T x + w_{n+1} = 0$

$R^+ = \{\, x : w_0^T x + w_{n+1} > 0 \,\}$ (positive side of the hyperplane boundary)

$R^- = \{\, x : w_0^T x + w_{n+1} < 0 \,\}$ (negative side of the hyperplane boundary)


Page 15:

Definitions

(1) Unit normal: $u = w_0 / \|w_0\|$, the unit vector normal to the hyperplane $w_0^T x + w_{n+1} = 0$.

Page 16:

(2) Distance from a point y to the hyperplane: $D(y) = \dfrac{|w_0^T y + w_{n+1}|}{\|w_0\|}$

(3) Distance from the origin to the hyperplane: $D(0) = \dfrac{|w_{n+1}|}{\|w_0\|}$
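
All three quantities follow directly from $w_0$ and $w_{n+1}$; a small numpy sketch with placeholder values:

```python
import numpy as np

w0 = np.array([3.0, 4.0])   # hyperplane normal direction (placeholder)
w_n1 = -10.0                # offset term w_{n+1} (placeholder)

u = w0 / np.linalg.norm(w0)                      # (1) unit normal
def dist(y):                                     # (2) point-to-hyperplane distance
    return abs(w0 @ y + w_n1) / np.linalg.norm(w0)
origin_dist = abs(w_n1) / np.linalg.norm(w0)     # (3) distance from the origin

print(u)                           # -> [0.6 0.8]
print(dist(np.array([1.0, 1.0])))  # -> 0.6
print(origin_dist)                 # -> 2.0
```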

Page 17:

(4) Linear discriminant functions: $d(x) = w^T y$, where $y = [x_1, x_2, \dots, x_n, 1]^T$ is the augmented pattern vector and $w = [w_1, w_2, \dots, w_n, w_{n+1}]^T$ is the weight vector.

Page 18:

Linear Decision Rule: 2-Class Case using a single linear discriminant function

Given $d(x) = w_1 x_1 + w_2 x_2 + \dots + w_n x_n + w_{n+1}$, for a vector x decide $C_1$ if $d(x) > 0$ and decide $C_2$ if $d(x) < 0$; on the boundary $d(x) = 0$, decide randomly.

No claim of optimality !!!
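
A minimal Python sketch of this rule using the augmented pattern vector from definition (4); the weight vector below is a made-up placeholder.

```python
import numpy as np

def classify(x, w):
    """Decide with a single linear discriminant d(x) = w^T [x, 1]."""
    y = np.append(x, 1.0)     # augmented pattern vector
    d = w @ y
    if d > 0:
        return "C1"
    if d < 0:
        return "C2"
    return np.random.choice(["C1", "C2"])  # on the boundary, decide randomly

# Placeholder weight vector [w1, w2, w_{n+1}] for illustration.
w = np.array([1.0, -1.0, 0.5])
print(classify(np.array([2.0, 1.0]), w))   # d = 2 - 1 + 0.5 = 1.5 -> C1
```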


Page 20:

Linear Decision Rule: 2-Class Case using two linear discriminant functions

Given two discriminant functions $d_1(x)$ and $d_2(x)$, define the decision rule from their signs, except on the boundaries $d_1(x) = 0$ and $d_2(x) = 0$, where we decide randomly between $C_1$ and $C_2$.

Page 21:

Decision regions (2-class case) using two linear discriminant functions and AND logic

Page 22:

Decision regions (2-class case) using two linear discriminant functions (continued)

Page 23:

Decision regions (2-class case) alternative formulation using two linear discriminant functions

Page 24:

Decision regions (2-class case) using the alternative form of two linear discriminant functions (equivalent to the preceding formulation)

Page 25:

Decision regions (3-class case) using two linear discriminant functions

Page 26:

Decision regions (4-class case) using two linear discriminant functions

Page 27:

Decision region $R_1$ (M-class case) using K linear discriminant functions

Page 28:

Example: Piecewise linear boundaries

Given the following discriminant functions

Page 29:

Example (continued)

Define the following decision rule:

If ( $d_1(x) > 0$ AND $d_2(x) > 0$ ) OR ( $d_3(x) > 0$ AND $d_4(x) > 0$ AND $d_5(x) > 0$ AND $d_6(x) > 0$ ), then decide x comes from class $C_1$; on the boundaries decide randomly; otherwise decide $C_2$.

Show the decision regions in the two-dimensional pattern space.
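
A minimal Python sketch of this rule; the discriminants are passed in as callables, and the sample d1..d6 below are hypothetical placeholders, not the example's actual functions (those are given in the figure).

```python
import numpy as np

def decide(x, d):
    """Apply the piecewise linear rule to x, given discriminants d = [d1..d6]."""
    v = [di(x) for di in d]
    region_a = v[0] > 0 and v[1] > 0          # d1 > 0 AND d2 > 0
    region_b = all(vi > 0 for vi in v[2:6])   # d3..d6 all > 0
    if region_a or region_b:
        return "C1"
    if any(vi == 0 for vi in v):              # on a boundary
        return np.random.choice(["C1", "C2"])
    return "C2"

# Hypothetical affine discriminants for illustration only.
d = [lambda x: x[0] - 1, lambda x: x[1] - 1,
     lambda x: -x[0], lambda x: -x[1],
     lambda x: x[0] + 2, lambda x: x[1] + 2]
print(decide(np.array([2.0, 2.0]), d))   # d1, d2 > 0 -> C1
print(decide(np.array([0.5, 0.5]), d))   # neither region -> C2
```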

Page 30:

Solution:

Page 31:

Lecture 14 Summary

1. Reviewed structures of the optimal classifier

2. Defined linear functions, hyperplanes, boundaries, unit normals, and various distances

3. Used linear discriminant functions for defining classifiers, with examples

Page 32:

End of Lecture 14