8/10/2015 1 rbf networksm.w. mak radial basis function networks 1. introduction 2. finding rbf...



04/19/23 1RBF Networks M.W. Mak

Radial Basis Function Networks

1. Introduction

2. Finding RBF Parameters

3. Decision Surface of RBF Networks

4. Comparison between RBF and BP


1. Introduction

MLPs are highly non-linear in the parameter space, so gradient-descent training can get trapped in local minima.

RBF networks solve this problem by dividing the learning into two independent processes.


RBF networks implement the function

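$$s(x) = w_0 + \sum_{i=1}^{M} w_i\,\phi(\|x - c_i\|)$$

$w_i$, $\sigma_i$ and $c_i$ can be determined separately

Fast learning algorithm

Basis function types

$$\phi(r) = r^2 \log(r)$$

$$\phi(r) = \exp\left(-\frac{r^2}{2\sigma^2}\right)$$

$$\phi(r) = \sqrt{r^2 + \sigma^2}$$

$$\phi(r) = \frac{1}{\sqrt{r^2 + \sigma^2}}$$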
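The basis function types listed on this slide can be sketched in NumPy as follows. This is a minimal illustration, assuming the four common choices (thin-plate spline, Gaussian, multiquadric, inverse multiquadric); the function names and the default `sigma=1.0` are hypothetical and not taken from the slides.

```python
import numpy as np

def thin_plate_spline(r):
    """phi(r) = r^2 log(r), using the limiting value 0 at r = 0."""
    r = np.asarray(r, dtype=float)
    safe_r = np.where(r > 0, r, 1.0)  # avoids log(0); log(1) = 0
    return np.where(r > 0, r ** 2 * np.log(safe_r), 0.0)

def gaussian(r, sigma=1.0):
    """phi(r) = exp(-r^2 / (2 sigma^2))."""
    r = np.asarray(r, dtype=float)
    return np.exp(-r ** 2 / (2.0 * sigma ** 2))

def multiquadric(r, sigma=1.0):
    """phi(r) = sqrt(r^2 + sigma^2)."""
    r = np.asarray(r, dtype=float)
    return np.sqrt(r ** 2 + sigma ** 2)

def inverse_multiquadric(r, sigma=1.0):
    """phi(r) = 1 / sqrt(r^2 + sigma^2)."""
    return 1.0 / multiquadric(r, sigma)

# Evaluate each basis function on a few radii (illustrative values only).
radii = np.array([0.0, 0.5, 1.0, 2.0])
for phi in (thin_plate_spline, gaussian, multiquadric, inverse_multiquadric):
    print(phi.__name__, phi(radii))
```

Only the Gaussian and the inverse multiquadric decay as r grows, which is what makes them local in the input space.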


For Gaussian basis functions

$$s(x_p) = w_0 + \sum_{i=1}^{M} w_i\,\phi(\|x_p - c_i\|) = w_0 + \sum_{i=1}^{M} w_i \exp\left(-\sum_{j=1}^{n} \frac{(x_{pj} - c_{ij})^2}{2\sigma_{ij}^2}\right)$$

Assume the variance is equal across all dimensions ($\sigma_{ij} = \sigma_i$ for all $j$):

$$s(x_p) = w_0 + \sum_{i=1}^{M} w_i \exp\left(-\frac{1}{2\sigma_i^2} \sum_{j=1}^{n} (x_{pj} - c_{ij})^2\right)$$


To write in matrix form, let

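$$a_{pi} = \phi_i(\|x_p - c_i\|), \qquad s(x_p) = \sum_{i=0}^{M} w_i a_{pi}, \quad \text{where } a_{p0} = 1$$

$$\begin{bmatrix} s(x_1) \\ s(x_2) \\ \vdots \\ s(x_N) \end{bmatrix} = \begin{bmatrix} 1 & a_{11} & a_{12} & \cdots & a_{1M} \\ 1 & a_{21} & a_{22} & \cdots & a_{2M} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & a_{N1} & a_{N2} & \cdots & a_{NM} \end{bmatrix} \begin{bmatrix} w_0 \\ w_1 \\ \vdots \\ w_M \end{bmatrix}$$

$$s = Aw \;\Rightarrow\; w = A^{-1} s$$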
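The matrix form can be sketched in NumPy as below. This is an illustrative example, assuming Gaussian basis functions with one width per center; the function name `design_matrix` and the sample data are hypothetical.

```python
import numpy as np

def design_matrix(X, centers, sigmas):
    """Build the N x (M+1) matrix A with a_p0 = 1 and
    a_pi = exp(-||x_p - c_i||^2 / (2 sigma_i^2)) for i = 1..M."""
    X = np.asarray(X, dtype=float)
    centers = np.asarray(centers, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    # Squared Euclidean distance between every sample and every center (N x M).
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    phi = np.exp(-sq_dist / (2.0 * sigmas[None, :] ** 2))
    # Prepend the column of ones that multiplies the bias w_0.
    return np.hstack([np.ones((X.shape[0], 1)), phi])

# Illustrative data: N = 4 samples, M = 2 centers.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
centers = np.array([[0.0, 0.0], [1.0, 1.0]])
sigmas = np.array([1.0, 1.0])
w = np.array([0.5, 1.0, -1.0])  # [w_0, w_1, w_2]

A = design_matrix(X, centers, sigmas)
s = A @ w  # network outputs for all samples at once
print(A.shape, s)
```

Note that A is square, and so directly invertible, only when N = M + 1; with more samples than basis functions the weights are found by least squares instead.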


2. Finding the RBF Parameters

Use the K-means algorithm to find the centers $c_i$


K-means Algorithm

step1: K samples are chosen randomly as the initial cluster means, forming K groups.

step2: Each sample is added to the group whose mean is closest to this sample.

step3: Adjust the mean of each group to take account of the new points.

step4: Repeat step2 and step3 until the distance between the old means and the new means of all clusters is smaller than a predefined tolerance.
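The four steps can be sketched as follows. This is a minimal batch variant (all samples are reassigned in each pass, rather than one sample at a time); the function name `k_means`, the `tol`, `max_iter` and `seed` parameters, and the toy data are assumptions made for illustration.

```python
import numpy as np

def k_means(X, K, tol=1e-6, max_iter=100, seed=0):
    """Batch K-means: returns (means, labels)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 1: pick K distinct samples as the initial means.
    means = X[rng.choice(len(X), size=K, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(max_iter):
        # Step 2: assign each sample to the group with the closest mean.
        dists = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: recompute each mean (an empty group keeps its old mean).
        new_means = means.copy()
        for k in range(K):
            members = X[labels == k]
            if len(members) > 0:
                new_means[k] = members.mean(axis=0)
        # Step 4: stop once no mean has moved by more than the tolerance.
        shift = np.linalg.norm(new_means - means, axis=1).max()
        means = new_means
        if shift < tol:
            break
    return means, labels

# Two well-separated groups of three points each (illustrative data).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
means, labels = k_means(X, K=2)
print(means)
print(labels)
```

The result depends on the random initial means in general, so in practice the algorithm is often run from several starting points.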


Outcome: There are K clusters, with each mean representing the centroid of its cluster.

Advantages: (1) A fast and simple algorithm.

(2) Reduces the effects of noisy samples.


Use the K-nearest-neighbor rule to find the function width

$$\sigma_i = \frac{1}{K} \sum_{k=1}^{K} \|c_k - c_i\|_2$$

where $c_k$ is the k-th nearest neighbor of $c_i$.

The objective is to cover the training points so that a smooth fit of the training samples can be achieved
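A minimal sketch of the K-nearest-neighbor width rule, assuming the width of each basis function is the average Euclidean distance from its center to its K nearest neighboring centers. The function name `knn_widths` and the example centers are hypothetical.

```python
import numpy as np

def knn_widths(centers, K):
    """Width of center i = mean distance to its K nearest other centers.
    Assumes distinct centers and K <= (number of centers - 1)."""
    centers = np.asarray(centers, dtype=float)
    # Pairwise distances between centers (M x M).
    dists = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    # After sorting each row, column 0 is the zero distance to itself.
    nearest = np.sort(dists, axis=1)[:, 1:K + 1]
    return nearest.mean(axis=1)

centers = np.array([[0.0, 0.0], [1.0, 0.0], [4.0, 0.0]])
sigmas = knn_widths(centers, K=2)
print(sigmas)
```

Centers in dense regions get small widths and isolated centers get large ones, so neighboring basis functions overlap enough to cover the training points.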


Centers and widths found by K-means and K-NN


Determining weights w using the least square method

$$E = \sum_{p=1}^{N} \left( d_p - \sum_{j=0}^{M} w_j\,\phi_j(\|x_p - c_j\|) \right)^2$$

where dp is the desired output for pattern p

$$E = (d - Aw)^T (d - Aw)$$

$$\text{Set } \frac{\partial E}{\partial w} = 0 \;\Rightarrow\; w = (A^T A)^{-1} A^T d$$


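Let E be the total-squared error between the actual output and the target output $d = [d_1\ d_2\ \cdots\ d_N]^T$:

$$E = (d - Aw)^T (d - Aw) = d^T d - d^T A w - w^T A^T d + w^T A^T A w$$

$$\frac{\partial E}{\partial w} = \frac{\partial}{\partial w}\left(d^T d\right) - \frac{\partial}{\partial w}\left(d^T A w\right) - \frac{\partial}{\partial w}\left(w^T A^T d\right) + \frac{\partial}{\partial w}\left(w^T A^T A w\right) = -2A^T d + 2A^T A w$$

$$\frac{\partial E}{\partial w} = 0 \;\Rightarrow\; A^T A w = A^T d \;\Rightarrow\; w = (A^T A)^{-1} A^T d$$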
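The least-squares result can be checked numerically. The sketch below uses a random, well-conditioned matrix A with a leading column of ones (illustrative data, not from the slides), solves the normal equations, confirms that the gradient of E vanishes at the solution, and compares against NumPy's own least-squares solver.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative design matrix: N = 20 patterns, bias column plus M = 3 columns.
A = np.hstack([np.ones((20, 1)), rng.normal(size=(20, 3))])
w_true = np.array([0.5, -1.0, 2.0, 0.3])
d = A @ w_true + 0.01 * rng.normal(size=20)  # noisy desired outputs

# Normal equations: solve (A^T A) w = A^T d without forming the inverse.
w_normal = np.linalg.solve(A.T @ A, A.T @ d)

# Gradient of E = (d - Aw)^T (d - Aw) evaluated at the solution.
grad = -2.0 * A.T @ d + 2.0 * A.T @ A @ w_normal

# Reference solution from NumPy's least-squares routine.
w_lstsq = np.linalg.lstsq(A, d, rcond=None)[0]

print(w_normal)
print(np.linalg.norm(grad))
```

Solving the linear system is generally preferable to computing the inverse of A^T A explicitly, although both routes suffer when A^T A is close to singular.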


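Note that

$$\frac{\partial}{\partial x}\left(x^T y\right) = y, \qquad \frac{\partial}{\partial x}\left(x^T A y\right) = A y, \qquad \frac{\partial}{\partial x}\left(x^T A x\right) = A x + A^T x$$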

Problems

(1) Susceptible to round-off error.

(2) No solution if $A^T A$ is singular.

(3) If $A^T A$ is close to singular, we get very large components in w.


Reasons

(1) Inaccuracy in forming $A^T A$.

(2) If A is ill-conditioned, a small change in A introduces a large change in $(A^T A)^{-1}$.

(3) If $A^T A$ is close to singular, dependent columns in $A^T A$ exist, e.g. two parallel straight lines in the x-y plane.


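Singular matrix:

$$\begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

If the lines are nearly parallel, they intersect each other at $(\pm\infty, \mp\infty)$, i.e. $x \gg 0,\ y \ll 0$ or $x \ll 0,\ y \gg 0$.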

So, the magnitude of the solution becomes very large; hence overflow will occur.

The effect of the large components can be cancelled out if the machine precision is infinite.
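The growth of the solution as two lines become parallel can be illustrated with a small experiment. The matrix below perturbs the singular matrix [[1, 2], [2, 4]] by a small `eps`; the right-hand side and the values of `eps` are chosen purely for illustration.

```python
import numpy as np

def solve_nearly_parallel(eps):
    """Solve x + 2y = 0 and 2x + (4 + eps) y = 1.
    Analytically y = 1/eps and x = -2/eps."""
    A = np.array([[1.0, 2.0], [2.0, 4.0 + eps]])
    b = np.array([0.0, 1.0])
    return np.linalg.solve(A, b)

for eps in (1e-1, 1e-3, 1e-6):
    x, y = solve_nearly_parallel(eps)
    print(f"eps={eps:g}  x={x:.6g}  y={y:.6g}")
```

As `eps` shrinks, the intersection point moves away from the origin in proportion to 1/eps, with x and y of opposite sign.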


If the machine precision is finite, we get a large error. For example,

0

0

102

104

21

2138

38

Finite machine precision =>

33

33

38

38

101

101

102

1000001.4

21

21

Solution: Singular Value Decomposition
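A minimal sketch of least squares via singular value decomposition, assuming the usual remedy of discarding singular values that are negligibly small relative to the largest one. The function name `svd_least_squares`, the threshold `rel_tol` and the example matrix (which has two identical columns, so that A^T A is exactly singular) are illustrative assumptions.

```python
import numpy as np

def svd_least_squares(A, d, rel_tol=1e-10):
    """Minimum-norm least-squares solution of A w = d using the SVD."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Invert only the singular values that are safely above zero.
    keep = s > rel_tol * s.max()
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]
    return Vt.T @ (s_inv * (U.T @ d))

# Columns 2 and 3 are identical, so the normal equations have no unique solution.
A = np.array([[1.0, 1.0, 1.0],
              [1.0, 2.0, 2.0],
              [1.0, 3.0, 3.0],
              [1.0, 4.0, 4.0]])
d = np.array([1.0, 3.0, 5.0, 7.0])  # equals -1 + 2 * (second column)

w = svd_least_squares(A, d)
print(w)
print(A @ w)
```

Among all weight vectors that fit the data equally well, this picks the one with the smallest norm, which avoids the very large components described above. `np.linalg.pinv(A) @ d` gives the same result.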


RBF learning process

[Figure: block diagram. $x_p$ → K-means → $c_i$; $c_i$ → K-nearest neighbor → $\sigma_i$; $c_i$, $\sigma_i$ → basis functions → A → linear regression → w]


RBF learning by gradient descent

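Let

$$\phi_i(x_p) = \exp\left\{ -\frac{1}{2} \sum_{j=1}^{n} \frac{(x_{pj} - c_{ij})^2}{\sigma_{ij}^2} \right\} \quad \text{and} \quad e(x_p) = d(x_p) - s(x_p),$$

$$E = \frac{1}{2} \sum_{p=1}^{N} \left[ e(x_p) \right]^2.$$

Applying gradient descent with $\dfrac{\partial E}{\partial w_i}$, $\dfrac{\partial E}{\partial \sigma_{ij}}$, and $\dfrac{\partial E}{\partial c_{ij}}$,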


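we have the following update equations

$$w_i(t+1) = w_i(t) + \mu_w \sum_{p=1}^{N} e(x_p)\,\phi_i(x_p) \quad \text{when } i = 1, 2, \ldots, M$$

$$w_i(t+1) = w_i(t) + \mu_w \sum_{p=1}^{N} e(x_p) \quad \text{when } i = 0$$

$$\sigma_{ij}(t+1) = \sigma_{ij}(t) + \mu_\sigma \sum_{p=1}^{N} e(x_p)\, w_i\, \phi_i(x_p)\, \frac{(x_{pj} - c_{ij})^2}{\sigma_{ij}^3(t)}$$

$$c_{ij}(t+1) = c_{ij}(t) + \mu_c \sum_{p=1}^{N} e(x_p)\, w_i\, \phi_i(x_p)\, \frac{x_{pj} - c_{ij}}{\sigma_{ij}^2(t)}$$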
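One batch update of all parameters can be sketched as below, assuming Gaussian basis functions with a separate width per center and per dimension and the error E = 1/2 * sum of e(x_p)^2. The function names, learning rates and training data are hypothetical choices for illustration.

```python
import numpy as np

def forward(X, w0, w, C, S):
    """Return basis outputs phi (N x M), differences x_pj - c_ij (N x M x n),
    and network outputs s (N)."""
    diff = X[:, None, :] - C[None, :, :]
    phi = np.exp(-0.5 * ((diff / S[None]) ** 2).sum(axis=2))
    return phi, diff, w0 + phi @ w

def total_error(X, d, w0, w, C, S):
    """E = 1/2 * sum_p e(x_p)^2 with e = d - s."""
    return 0.5 * ((d - forward(X, w0, w, C, S)[2]) ** 2).sum()

def gradient_step(X, d, w0, w, C, S, mu_w=1e-3, mu_c=1e-3, mu_s=1e-3):
    """Apply one batch gradient-descent update to w0, w, C and S."""
    phi, diff, s = forward(X, w0, w, C, S)
    e = d - s
    common = e[:, None] * phi * w[None, :]  # e(x_p) * w_i * phi_i(x_p), N x M
    new_w0 = w0 + mu_w * e.sum()
    new_w = w + mu_w * (e @ phi)
    new_S = S + mu_s * (common[:, :, None] * diff ** 2 / S[None] ** 3).sum(axis=0)
    new_C = C + mu_c * (common[:, :, None] * diff / S[None] ** 2).sum(axis=0)
    return new_w0, new_w, new_C, new_S

# Illustrative problem: N = 30 samples in 2-D, M = 3 basis functions.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(30, 2))
d = np.sin(np.pi * X[:, 0]) * X[:, 1]
w0, w = 0.0, rng.normal(scale=0.1, size=3)
C, S = X[:3].copy(), np.ones((3, 2))

E_before = total_error(X, d, w0, w, C, S)
for _ in range(100):
    w0, w, C, S = gradient_step(X, d, w0, w, C, S)
E_after = total_error(X, d, w0, w, C, S)
print(E_before, E_after)
```

Unlike the two-stage procedure, all parameters are adjusted together here, so the learning rates must be kept small enough for the updates to remain stable.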


Elliptical Basis Function networks

$$\phi_j(x_p) = \exp\left\{ -\frac{1}{2} (x_p - \mu_j)^T \Sigma_j^{-1} (x_p - \mu_j) \right\}$$

$\mu_j$: function centers

$\Sigma_j$: covariance matrix

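[Figure: network diagram with inputs $x_1, x_2, \ldots, x_n$, hidden nodes $1, 2, \ldots, M$, and outputs $y_1(x), \ldots, y_K(x)$]

$$y_k(x_p) = \sum_{j=0}^{J} w_{kj}\,\phi_j(x_p)$$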
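A minimal sketch of the elliptical basis functions and the network outputs, assuming one mean vector and one full covariance matrix per basis function and a constant basis function for the bias weights. The function names and the example values are hypothetical.

```python
import numpy as np

def ebf_activations(X, means, covs):
    """phi_j(x_p) = exp(-0.5 (x_p - mu_j)^T Sigma_j^{-1} (x_p - mu_j))."""
    X = np.asarray(X, dtype=float)
    out = np.empty((X.shape[0], len(means)))
    for j, (mu, cov) in enumerate(zip(means, covs)):
        diff = X - mu
        # Solve Sigma_j z = diff^T rather than forming the inverse explicitly.
        z = np.linalg.solve(cov, diff.T).T
        out[:, j] = np.exp(-0.5 * np.einsum("pi,pi->p", diff, z))
    return out

def ebf_outputs(X, means, covs, W):
    """W has shape K x (J+1); column 0 holds the bias weights w_k0."""
    phi = ebf_activations(X, means, covs)
    phi = np.hstack([np.ones((phi.shape[0], 1)), phi])
    return phi @ W.T

# One elliptical basis function stretched along the first axis.
means = np.array([[0.0, 0.0]])
covs = np.array([[[4.0, 0.0], [0.0, 1.0]]])
X = np.array([[2.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
W = np.array([[0.5, 2.0]])  # a single output: bias 0.5, weight 2.0

phi = ebf_activations(X, means, covs)
y = ebf_outputs(X, means, covs, W)
print(phi.ravel(), y.ravel())
```

The points (2, 0) and (0, 1) give the same activation because each lies one standard deviation from the center along its own axis; a radial basis function with a single width could not treat them alike.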


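K-means and Sample covariance

K-means:

$$\mu_j = \frac{1}{N_j} \sum_{x \in X_j} x, \qquad x \in X_j \ \text{ if } \ \|x - \mu_j\| < \|x - \mu_k\| \quad \forall\, k \neq j$$

Sample covariance:

$$\Sigma_j = \frac{1}{N_j} \sum_{x \in X_j} (x - \mu_j)(x - \mu_j)^T$$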
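Given cluster assignments from K-means, the cluster means and sample covariances can be computed as in the sketch below. The function name `cluster_statistics` and the random data are illustrative; the covariance divides by the cluster size N_j rather than N_j - 1.

```python
import numpy as np

def cluster_statistics(X, labels, K):
    """Return per-cluster means (K x n) and covariances (K x n x n).
    Assumes every cluster has at least one member."""
    means, covs = [], []
    for j in range(K):
        Xj = X[labels == j]
        mu = Xj.mean(axis=0)
        diff = Xj - mu
        means.append(mu)
        covs.append(diff.T @ diff / len(Xj))  # divide by N_j
    return np.array(means), np.array(covs)

# Illustrative data: two groups with hand-assigned labels.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(size=(15, 2)),
               rng.normal(size=(10, 2)) + np.array([5.0, 5.0])])
labels = np.array([0] * 15 + [1] * 10)

means, covs = cluster_statistics(X, labels, K=2)
print(means)
print(covs)
```

A cluster needs more members than input dimensions for its covariance matrix to be invertible, which the elliptical basis function requires.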

The EM algorithm


EBF vs. RBF networks

[Figure: two panels, "RBFN with 4 centers" and "EBFN with 4 centers"; both axes run from -3 to 3; legend: Class 1, Class 2]


Elliptical Basis Function Networks

EBF network's output

[Figure: plot titled "Output 1 of an EBF network (bias, no rescale, gamma=1)"; legend 'nxor.ebf 4.Y.N.1.dat ' with values 1.43, 0.948, 0.463, -0.0209, -0.505; horizontal axes from -3 to 3, vertical axis from -1 to 2]


RBFN for Pattern Classification

MLP: Hyperplane

RBF: Kernel function

The probability density function (also called conditional density function or likelihood) of the k-th class is defined as

$$p(x \mid C_k)$$


• According to Bayes' theorem, the posterior prob. is

$$P(C_k \mid x) = \frac{p(x \mid C_k)\,P(C_k)}{p(x)}$$

where $P(C_k)$ is the prior prob. and

$$p(x) = \sum_{r} p(x \mid C_r)\,P(C_r)$$

• It is possible to use a common pool of M basis functions, labeled by an index j, to represent all of the class-conditional densities, i.e.

$$p(x \mid C_k) = \sum_{j=1}^{M} p(x \mid j)\,P(j \mid C_k)$$


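[Figure: the basis-function densities $p(x \mid 1), p(x \mid 2), \ldots, p(x \mid M)$ feed a node computing $p(x \mid C_k)$ through weights up to $P(M \mid C_k)$]

$$p(x \mid C_k) = \sum_{j=1}^{M} p(x \mid j)\,P(j \mid C_k)$$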


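$$p(x) = \sum_{k} \sum_{j=1}^{M} p(x \mid j)\,P(j \mid C_k)\,P(C_k) = \sum_{j=1}^{M} p(x \mid j) \sum_{k} P(j \mid C_k)\,P(C_k) = \sum_{j=1}^{M} p(x \mid j)\,P(j)$$

$$P(C_k \mid x) = \frac{\displaystyle\sum_{j=1}^{M} p(x \mid j)\,P(j \mid C_k)\,P(C_k)\,\frac{P(j)}{P(j)}}{\displaystyle\sum_{j'=1}^{M} p(x \mid j')\,P(j')}$$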


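$$P(C_k \mid x) = \sum_{j=1}^{M} \frac{P(j \mid C_k)\,P(C_k)}{P(j)} \cdot \frac{p(x \mid j)\,P(j)}{\displaystyle\sum_{j'=1}^{M} p(x \mid j')\,P(j')} = \sum_{j=1}^{M} P(C_k \mid j)\,P(j \mid x) = \sum_{j=1}^{M} w_{kj}\,\phi_j(x)$$

Hidden node's output $\phi_j(x) = P(j \mid x)$: posterior prob. of the j-th set of features in the input.

Weight $w_{kj} = P(C_k \mid j)$: posterior prob. of class membership, given the presence of the j-th set of features.

No bias term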
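The probabilistic interpretation can be sketched numerically. The example below assumes isotropic Gaussian densities for p(x|j) and made-up values for P(j) and P(C_k|j); all names and numbers are hypothetical and serve only to show that the class posteriors sum to one without a bias term.

```python
import numpy as np

def class_posteriors(x, means, sigmas, P_j, P_C_given_j):
    """Return (P(C_k|x) for all k, phi_j(x) = P(j|x) for all j).
    P_C_given_j[k, j] = P(C_k | j) plays the role of w_kj; columns sum to 1."""
    n = x.shape[0]
    sq_dist = ((x - means) ** 2).sum(axis=1)
    # p(x|j): isotropic Gaussian density of basis function j (an assumption).
    p_x_given_j = np.exp(-sq_dist / (2.0 * sigmas ** 2))
    p_x_given_j /= (2.0 * np.pi * sigmas ** 2) ** (n / 2.0)
    # Hidden-node outputs: normalized so that they sum to 1 over j.
    joint = p_x_given_j * P_j
    phi = joint / joint.sum()
    return P_C_given_j @ phi, phi

means = np.array([[0.0, 0.0], [3.0, 3.0]])
sigmas = np.array([1.0, 1.0])
P_j = np.array([0.5, 0.5])
P_C_given_j = np.array([[0.9, 0.2],   # P(C_1 | j)
                        [0.1, 0.8]])  # P(C_2 | j)

posterior, phi = class_posteriors(np.array([0.0, 0.0]), means, sigmas, P_j, P_C_given_j)
print(phi, posterior)
```

Because the hidden outputs are normalized and each column of the weight matrix sums to one, the network outputs are valid probabilities for any input.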


Comparison of RBF and MLP

                          RBF networks                          MLP
Learning speed            Very fast                             Very slow
Convergence               Almost guaranteed                     Not guaranteed
Response time             Slow                                  Fast
Memory requirement        Very large                            Small
Hardware implementation   IBM ZISC036, Nestor Ni1000            Voice Direct 364
                          (www-5.ibm.com/fr/cdlab/zisc.html)    (www.sensoryinc.com)
Generalization            Usually better                        Usually poorer

To learn more about NN hardware, see http://www.particle.kth.se/~lindsey/HardwareNNWCourse/home.html