Supervised Learning: Linear Perceptron NN

Page 1: Supervised Learning:  Linear Perceptron NN

Supervised Learning: Linear Perceptron NN

Page 2: Supervised Learning:  Linear Perceptron NN

Distinction Between Approximation-Based vs. Decision-Based NNs

•Teacher in Approximation-Based NN are quantitative in real or complex values

•Teacher in Decision-Based NNs are symbols, instead of numeric complex values.

Page 3: Supervised Learning:  Linear Perceptron NN

Decision-Based NN (DBNN)

•Linear Perceptron

•Discriminant function (Score function)

•Reinforced and Anti-reinforced Learning Rules

•Hierarchical and Modular Structures

Page 4: Supervised Learning:  Linear Perceptron NN

[Figure: training loop of a decision-based network. Each pattern is scored by the discriminant functions φ₁(x, w), φ₂(x, w), …, φ_M(x, w); depending on whether the winning class is correct or incorrect, the network either proceeds to the next pattern or updates the weights.]

Page 5: Supervised Learning:  Linear Perceptron NN

Supervised Learning: Linear Perceptron NN

Page 6: Supervised Learning:  Linear Perceptron NN

Two Classes: Linear Perceptron Learning Rule

Upon the presentation of the m-th training pattern z(m), the weight vector w(m) is updated as

w(m+1) = w(m) + η (t(m) − d(m)) z(m)

where η is a positive learning rate. For a linear discriminant function, φ_j(x, w_j) = xᵀw_j + w₀ = zᵀŵ_j, so the gradient with respect to the (augmented) weight vector is ∇φ_j(z, ŵ_j) = z.
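The two-class rule above can be sketched in a few lines. This is not code from the slides; it is a minimal Python illustration, where the function name `perceptron_train`, the learning rate `eta`, and the {0, 1} target encoding are choices made here for concreteness:

```python
import numpy as np

def perceptron_train(Z, t, eta=0.1, max_epochs=100):
    """Two-class linear perceptron learning rule.

    Z   : (n_samples, n_features) augmented patterns z = [x; 1]
    t   : (n_samples,) target labels in {0, 1}
    eta : positive learning rate

    Applies w <- w + eta * (t - d) * z for each pattern,
    where the decision is d = 1 if z^T w > 0 else 0.
    """
    w = np.zeros(Z.shape[1])
    for _ in range(max_epochs):
        updated = False
        for z, target in zip(Z, t):
            d = 1.0 if z @ w > 0 else 0.0
            if d != target:
                w = w + eta * (target - d) * z
                updated = True
        if not updated:  # every pattern classified correctly
            break
    return w
```

On a linearly separable training set this loop stops after finitely many epochs, which is what the convergence theorem that follows guarantees.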

Page 7: Supervised Learning:  Linear Perceptron NN

Linear Perceptron: Convergence Theorem (Two Classes)

If a set of training patterns is linearly separable, then the linear perceptron learning algorithm converges to a correct solution in a finite number of iterations.

Page 8: Supervised Learning:  Linear Perceptron NN

The rule converges when the learning rate η is small enough:

w(m+1) = w(m) + η (t(m) − d(m)) z(m)

Page 9: Supervised Learning:  Linear Perceptron NN

Multiple Classes

[Figure: illustration of linearly separable vs. strongly linearly separable multi-class training sets.]

Page 10: Supervised Learning:  Linear Perceptron NN

Linear Perceptron Convergence Theorem (Multiple Classes)

If the given multiple-class training set is linearly separable, then the linear perceptron learning algorithm converges to a correct solution after a finite number of iterations.

Page 11: Supervised Learning:  Linear Perceptron NN

Multiple Classes: Linear Perceptron Learning Rule (assuming linear separability)

Page 12: Supervised Learning:  Linear Perceptron NN
Page 13: Supervised Learning:  Linear Perceptron NN

P₁ⱼ = [ z 0 0 … −z 0 … 0 ]  (z in the first block, −z in the j-th block)
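The P₁ⱼ vector on this slide packs a multi-class constraint w₁ᵀz > wⱼᵀz into one long two-class pattern. A hypothetical Python helper showing the construction (the name `kesler_pair` and the 0-based block indices are mine, not from the slides):

```python
import numpy as np

def kesler_pair(z, i, j, M):
    """Embed the M-class constraint w_i^T z > w_j^T z as a single
    two-class pattern of dimension M*d: z sits in block i, -z in
    block j, zeros everywhere else."""
    d = len(z)
    p = np.zeros(M * d)
    p[i * d:(i + 1) * d] = z    # block of the preferred class
    p[j * d:(j + 1) * d] = -z   # block of the competing class
    return p
```

Running the two-class perceptron on these stacked patterns then solves the multi-class problem with a single weight vector.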

Page 14: Supervised Learning:  Linear Perceptron NN

DBNN Structure for Nonlinear Discriminant Function

[Figure: network with inputs x, y feeding discriminant functions φ₁(x, w), φ₂(x, w), φ₃(x, w), whose outputs compete in a MAXNET.]

Page 15: Supervised Learning:  Linear Perceptron NN

DBNN

[Figure: DBNN with inputs x, y, weight vectors w₁, w₂, w₃, and a MAXNET selecting the winner; a teacher triggers training only when it indicates the need.]

Page 16: Supervised Learning:  Linear Perceptron NN
Page 17: Supervised Learning:  Linear Perceptron NN

The decision-based learning rule is based on a minimal updating principle: it tends to avoid or minimize unnecessary side-effects due to overtraining.

• In the first scenario, the pattern is already correctly classified by the current network; no update is attributed to that pattern, and the learning process proceeds with the next training pattern.

• In the second scenario, the pattern is incorrectly classified to another winning class. In this case, the parameters of two classes must be updated: the score of the winning class is reduced by the anti-reinforced learning rule, while the score of the correct (but not winning) class is enhanced by the reinforced learning rule.
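The two scenarios above can be condensed into one update step. This is an illustrative Python sketch, not the slides' code; the name `dbnn_update` and the plain linear scores xᵀwⱼ (no bias term) are simplifying assumptions made here:

```python
import numpy as np

def dbnn_update(W, x, true_class, eta=0.05):
    """One decision-based update under the minimal-updating principle.

    W : (M, d) weight matrix, one linear discriminant per class;
        the score of class j is phi_j(x) = x^T W[j].

    If x is already classified correctly, nothing changes. Otherwise
    only two rows change: the winning (wrong) class is anti-reinforced
    and the true class is reinforced. For a linear discriminant the
    gradient of the score is simply x.
    """
    winner = int(np.argmax(W @ x))
    if winner == true_class:
        return W                      # scenario 1: correct, no update
    W = W.copy()
    W[true_class] += eta * x          # reinforced learning
    W[winner] -= eta * x              # anti-reinforced learning
    return W
```

Note that all other classes are left untouched, which is exactly the "minimal updating" behaviour described above.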

Page 18: Supervised Learning:  Linear Perceptron NN

Reinforced and Anti-reinforced Learning

Suppose that the m-th training pattern x(m) is known to belong to the i-th class, and the leading challenger is denoted by

j = arg max_{k ≠ i} φ(x(m), Θ_k)

Reinforced learning: w_i ← w_i + η ∇φ(x, w_i)
Anti-reinforced learning: w_j ← w_j − η ∇φ(x, w_j)

Page 19: Supervised Learning:  Linear Perceptron NN

For Simple RBF Discriminant Function

φ_j(x, w_j) = −½ ‖x − w_j‖², so ∇φ_j(x, w_j) = (x − w_j).

Upon the presentation of the m-th training pattern z(m), the weights are updated as

Reinforced learning: w_i ← w_i + η (x − w_i)
Anti-reinforced learning: w_j ← w_j − η (x − w_j)
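Under the RBF score, the two rules reduce to moving a centroid toward or away from the pattern. A small Python sketch (the function names are mine, not from the slides):

```python
import numpy as np

def rbf_score(x, w):
    """Simple RBF discriminant phi(x, w) = -0.5 * ||x - w||^2
    (larger score = centroid closer to the pattern)."""
    return -0.5 * np.dot(x - w, x - w)

def rbf_reinforce(w, x, eta=0.1):
    """Reinforced update: move the centroid toward x,
    following the gradient (x - w)."""
    return w + eta * (x - w)

def rbf_antireinforce(w, x, eta=0.1):
    """Anti-reinforced update: move the centroid away from x."""
    return w - eta * (x - w)
```

Reinforcing the correct class's centroid strictly increases its score for that pattern, while anti-reinforcing the wrong winner decreases its score.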

Page 20: Supervised Learning:  Linear Perceptron NN
Page 21: Supervised Learning:  Linear Perceptron NN
Page 22: Supervised Learning:  Linear Perceptron NN

Decision-Based Learning Rule

The learning scheme of the DBNN consists of two phases:

• locally unsupervised learning
• globally supervised learning

Page 23: Supervised Learning:  Linear Perceptron NN

Locally Unsupervised Learning via VQ or EM Clustering Methods

Several approaches can be used to estimate the number of hidden nodes; the initial clustering can be determined by VQ or EM clustering methods.

[Figure: scatter plot of data in the space of the 1st and 2nd principal components, grouped into four clusters.]

• EM allows the final decision to incorporate prior information, which can be instrumental in multiple-expert or multiple-channel information fusion.
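The locally unsupervised phase can be initialized with plain k-means as a stand-in for the VQ step. A minimal sketch, assuming Euclidean distance and a fixed iteration budget; `kmeans_init` is a hypothetical helper, not from the slides:

```python
import numpy as np

def kmeans_init(X, k, iters=20, seed=0):
    """Plain k-means: cluster one class's feature vectors to place
    the initial hidden-node centroids for that class's subnetwork."""
    rng = np.random.default_rng(seed)
    # start from k distinct data points
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign every point to its nearest centroid
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        # move each centroid to the mean of its assigned points
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(0)
    return centers
```

The resulting centroids then serve as the starting point for the globally supervised (reinforced/anti-reinforced) phase; an EM clustering step could replace k-means when soft assignments or priors are wanted.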

Page 24: Supervised Learning:  Linear Perceptron NN

Globally Supervised Learning Rules

• The objective of learning is minimum classification error (not maximum likelihood estimation).

• Inter-class mutual information is used to fine-tune the decision boundaries (i.e., the globally supervised learning).

• In this phase, the DBNN applies the reinforced/anti-reinforced learning rule [Kung95], or the discriminative learning rule [Juang92], to adjust network parameters. Only misclassified patterns are involved in this training phase.

Page 25: Supervised Learning:  Linear Perceptron NN

Pictorial Presentation of Hierarchical DBNN

[Figure: scatter plots of patterns from classes a, b, and c, showing how a hierarchical DBNN partitions each class into subclusters.]

Page 26: Supervised Learning:  Linear Perceptron NN

Discriminant function (Score function)

•LBF Function (or Mixture of)

•RBF Function (or Mixture of)

•Prediction Error Function

• Likelihood Function: HMM

Page 27: Supervised Learning:  Linear Perceptron NN

Hierarchical and Modular DBNN

•Subcluster DBNN

•Probabilistic DBNN

•Local Experts via K-mean or EM

•Reinforced and Anti-reinforced Learning

Page 28: Supervised Learning:  Linear Perceptron NN

MAXNET

Subcluster DBNN

Page 29: Supervised Learning:  Linear Perceptron NN

Subcluster DBNN

Page 30: Supervised Learning:  Linear Perceptron NN

Subcluster Decision-Based Learning Rule

Page 31: Supervised Learning:  Linear Perceptron NN
Page 32: Supervised Learning:  Linear Perceptron NN
Page 33: Supervised Learning:  Linear Perceptron NN
Page 34: Supervised Learning:  Linear Perceptron NN
Page 35: Supervised Learning:  Linear Perceptron NN

Probabilistic DBNN

Page 36: Supervised Learning:  Linear Perceptron NN

MAXNET

Probabilistic DBNN

Page 37: Supervised Learning:  Linear Perceptron NN

Probabilistic DBNN

Page 38: Supervised Learning:  Linear Perceptron NN

MAXNET

Probabilistic DBNN

Page 39: Supervised Learning:  Linear Perceptron NN

A subnetwork of a Probabilistic DBNN is basically a mixture of local experts.

[Figure: the k-th subnetwork. The input x feeds several RBF local experts, whose posteriors P(y|x, ·) are combined into the subnetwork output P(y|x, Θ_k).]

Page 40: Supervised Learning:  Linear Perceptron NN

Probabilistic Decision-Based Neural Networks

Page 41: Supervised Learning:  Linear Perceptron NN

Training of Probabilistic DBNN

• Selection of initial local experts: intra-class training
  - unsupervised training
  - EM (probabilistic) training
• Training of the experts: inter-class training
  - supervised training
  - reinforced and anti-reinforced learning

Page 42: Supervised Learning:  Linear Perceptron NN

Probabilistic Decision-Based Neural Networks

Training procedure

[Figure: flowchart with a locally unsupervised phase and a globally supervised phase. Feature vectors x(t) are first clustered by K-means, K-NNs, and EM to initialize each class's parameters; then each x(t) with its class ID is classified, misclassified vectors drive reinforced learning, and the loop repeats until convergence.]

Page 43: Supervised Learning:  Linear Perceptron NN

Probabilistic Decision-Based Neural Networks

2-D Vowel Problem:

[Figure: two F1 (Hz) vs. F2 (Hz) scatter plots of the ten vowel classes (heed, hid, head, had, hod, hawed, hud, hood, who'd, heard), comparing GMM and PDBNN decision boundaries.]

Page 44: Supervised Learning:  Linear Perceptron NN

Difference Between MOE and DBNN

For the MOE, the influence of the training patterns on each expert is regulated by the gating network (which is itself under training), so that as training proceeds the patterns gain influence on nearby experts and lose influence on far-away ones. (The MOE updates all the classes.)

Unlike the MOE, the DBNN makes use of both unsupervised (EM-type) and supervised (decision-based) learning rules. The DBNN uses only misclassified training patterns for its globally supervised learning, and it updates only the "winner" class and the class to which the misclassified pattern actually belongs. Its training strategy is to abide by a "minimal updating principle".

Page 45: Supervised Learning:  Linear Perceptron NN

DBNN/PDBNN Applications

• OCR (DBNN)
• Texture Segmentation (DBNN)
• Mammogram Diagnosis (PDBNN)
• Face Detection (PDBNN)
• Face Recognition (PDBNN)
• Money Recognition (PDBNN)
• Multimedia Library (DBNN)

Page 46: Supervised Learning:  Linear Perceptron NN

OCR Classification (DBNN)

Page 47: Supervised Learning:  Linear Perceptron NN

Image Texture Classification (DBNN)

Page 48: Supervised Learning:  Linear Perceptron NN
Page 49: Supervised Learning:  Linear Perceptron NN
Page 50: Supervised Learning:  Linear Perceptron NN
Page 51: Supervised Learning:  Linear Perceptron NN

Face Detection (PDBNN)

Page 52: Supervised Learning:  Linear Perceptron NN

Face Recognition (PDBNN)

Page 53: Supervised Learning:  Linear Perceptron NN
Page 54: Supervised Learning:  Linear Perceptron NN

show movies

Page 55: Supervised Learning:  Linear Perceptron NN

Multimedia Library (PDBNN)

Page 56: Supervised Learning:  Linear Perceptron NN

MatLab Assignment #4: DBNN to separate 2 classes

•RBF DBNN with 4 centroids per class

•RBF DBNN with 4 and 6 centroids for the green and blue classes, respectively.

ratio=2:1

Page 57: Supervised Learning:  Linear Perceptron NN

RBF-BP NN for Dynamic Resource Allocation

•use content to determine renegotiation time

•use content/ST-traffic to estimate how much resource to request

The neural-network traffic predictor yields a smaller prediction MSE and higher link utilization.

Page 58: Supervised Learning:  Linear Perceptron NN

Modern information technology in the internet era should support interactive and intelligent processing that transforms and transfers information.

Intelligent Media Agent

Integration of signal processing and neural net techniques could be a versatile tool for a broad spectrum of multimedia applications.

Page 59: Supervised Learning:  Linear Perceptron NN

EM Applications

•Uncertain Clustering/ Model

•Channel Confidence

[Figure: two local experts (Expert 1, Expert 2) fed by two channels (Channel 1, Channel 2).]

Page 60: Supervised Learning:  Linear Perceptron NN

Channel Fusion

Page 61: Supervised Learning:  Linear Perceptron NN

[Figure: classes-in-channel network, with one subnetwork per channel.]

Sensor = Channel = Expert

Page 62: Supervised Learning:  Linear Perceptron NN

Sensor Fusion

Human Sensory Modalities vs. Computer Sensory Modalities

[Figure: audio-visual fusion example. An auditory "Ba" paired with a visual "Ga" is perceived as "Da" (the McGurk effect).]

Page 63: Supervised Learning:  Linear Perceptron NN

Fusion Example

Toy Car Recognition

Page 64: Supervised Learning:  Linear Perceptron NN

Probabilistic Decision-Based Neural Networks

(Pages 64 through 66 repeat the PDBNN training-procedure flowchart and the 2-D vowel comparison from Pages 42 and 43.)

Page 67: Supervised Learning:  Linear Perceptron NN

References:

[1] Lin, S.H., Kung, S.Y., and Lin, L.J. (1997). "Face recognition/detection by probabilistic decision-based neural network," IEEE Trans. on Neural Networks, 8(1), pp. 114-132.

[2] Mak, M.W. et al. (1994). "Speaker Identification using Multi Layer Perceptrons and Radial Basis Function Networks," Neurocomputing, 6(1), pp. 99-118.