lecture 25 radial basis network (ii)
Intro. ANN & Fuzzy Systems
Lecture 25 Radial Basis Network (II)
(C) 2001-2003 by Yu Hen Hu 2
Outline
• Regularization Network Formulation
• Radial Basis Network Type 2
• Generalized RBF network
– Training algorithm
– Implementation details
Properties of Regularization network
• An RBF network is a universal approximator:
– it can approximate arbitrarily well any multivariate continuous function on a compact subset of $R^n$, where $n$ is the dimension of the feature vectors, given a sufficient number of hidden neurons.
• It is optimal in that it minimizes the cost functional $E(F)$.
• It also has the best approximation property. That means that, given an unknown nonlinear function $f$, there always exists a choice of RBF coefficients that approximates $f$ better than all other possible choices of coefficients.
Radial Basis Network (Type II)
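• Instead of $x_i$, use virtual data points $t_j$, $1 \le j \le J$, in the solution of $F(x)$. Define
$$F(x) = \sum_{j=1}^{J} w_j\, G(x; t_j)$$
$$G = \begin{bmatrix} G(x_1,t_1) & G(x_1,t_2) & \cdots & G(x_1,t_J) \\ G(x_2,t_1) & G(x_2,t_2) & \cdots & G(x_2,t_J) \\ \vdots & \vdots & & \vdots \\ G(x_K,t_1) & G(x_K,t_2) & \cdots & G(x_K,t_J) \end{bmatrix}, \qquad G_0 = \begin{bmatrix} G(t_1,t_1) & G(t_1,t_2) & \cdots & G(t_1,t_J) \\ G(t_2,t_1) & G(t_2,t_2) & \cdots & G(t_2,t_J) \\ \vdots & \vdots & & \vdots \\ G(t_J,t_1) & G(t_J,t_2) & \cdots & G(t_J,t_J) \end{bmatrix}$$
• Substitute each $x_i$ into the equation
$$F(x_i) = d_i$$
and we have a new system:
$$(G^T G + \lambda G_0)\, w = G^T d$$
where $\lambda$ is the regularization parameter. Thus,
$$w = (G^T G + \lambda G_0)^{-1} G^T d$$
When $\lambda = 0$,
$$w = G^{+} d = (G^T G)^{-1} G^T d$$
where $G^{+}$ is the pseudo-inverse matrix of $G$.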
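As a rough illustration of the two solutions above, the following NumPy sketch builds $G$ and $G_0$ and solves for $w$. It assumes a Gaussian kernel with a single shared width; the names `gaussian_kernel`, `rbn2_weights`, `lam`, and `sigma2` are hypothetical and do not come from the lecture.

```python
import numpy as np

def gaussian_kernel(A, B, sigma2=1.0):
    """Matrix with entries G(a_i, b_j) = exp(-||a_i - b_j||^2 / (2*sigma2)).

    Assumption: Gaussian kernel with one shared width sigma2.
    A is (K, n), B is (J, n); the result is (K, J).
    """
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-0.5 * d2 / sigma2)

def rbn2_weights(X, d, T, lam=0.0, sigma2=1.0):
    """Solve (G^T G + lam * G0) w = G^T d for the weights w."""
    G = gaussian_kernel(X, T, sigma2)    # K x J
    G0 = gaussian_kernel(T, T, sigma2)   # J x J
    if lam == 0.0:
        # Pseudo-inverse solution w = G^+ d
        return np.linalg.pinv(G) @ d
    return np.linalg.solve(G.T @ G + lam * G0, G.T @ d)
```

A call such as `rbn2_weights(X, d, T, lam=0.1, sigma2=0.05)` returns one weight per center; a larger `lam` pulls the solution toward a smoother fit.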
RBN2 Algorithm Summary
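Given: $\{x_i;\ 1 \le i \le K\}$, $d$: desired output, and $J$: # of radial basis neurons
• Cluster $\{x_i\}$ into $J$ clusters, find clustering centers $\{t_j;\ 1 \le j \le J\}$. Variance $\sigma_j^2$ or inverse covariance matrix $\Sigma_j^{-1}$ are also computed.
• Compute G matrix (K by J) and $G_0$ matrix.
$$G_{i,j+1} = \exp\left(-0.5\,\|x(i) - t_j\|^2 / \sigma_j^2\right)$$
or
$$G_{i,j+1} = \exp\left(-0.5\,(x(i) - t_j)^T\, \Sigma_j^{-1}\, (x(i) - t_j)\right)$$
• Solve $w = G^{+} d$ or $w = (G^T G + \lambda G_0)^{-1} G^T d$ ($\lambda$: regularization parameter)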
• The above procedure can be refined by fitting the clusters with a Gaussian mixture model trained with the EM algorithm.
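The steps of this summary can be sketched end to end in NumPy. This is a minimal illustration under stated assumptions, not the lecture's implementation: the clustering is a plain Lloyd-style K-means, each width is the mean squared distance of a cluster's members to its center (floored to stay positive), and all function names (`kmeans`, `cluster_variances`, `design_matrix`, `fit_rbn2`, `predict`) are hypothetical.

```python
import numpy as np

def kmeans(X, J, n_iter=50, seed=0):
    """Plain Lloyd-style K-means (assumption: random initial centers from X)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=J, replace=False)].copy()
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(J):
            members = X[labels == j]
            if len(members) > 0:          # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, d2.argmin(axis=1)

def cluster_variances(X, centers, labels, floor=1e-3):
    """Per-cluster variance sigma_j^2, floored so that it stays positive."""
    sigma2 = np.full(len(centers), floor)
    for j in range(len(centers)):
        members = X[labels == j]
        if len(members) > 0:
            sigma2[j] = ((members - centers[j]) ** 2).sum(axis=1).mean()
    return np.maximum(sigma2, floor)

def design_matrix(X, centers, sigma2):
    """Entries exp(-0.5 * ||x_i - t_j||^2 / sigma_j^2)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-0.5 * d2 / sigma2[None, :])

def fit_rbn2(X, d, J, lam=0.0):
    """Cluster, build G and G0, then solve for the weights."""
    centers, labels = kmeans(X, J)
    sigma2 = cluster_variances(X, centers, labels)
    G = design_matrix(X, centers, sigma2)          # K x J
    G0 = design_matrix(centers, centers, sigma2)   # J x J
    if lam == 0:
        w = np.linalg.lstsq(G, d, rcond=None)[0]   # least-squares (pseudo-inverse) solution
    else:
        w = np.linalg.solve(G.T @ G + lam * G0, G.T @ d)
    return centers, sigma2, w

def predict(X, centers, sigma2, w):
    return design_matrix(X, centers, sigma2) @ w
```

For example, `centers, sigma2, w = fit_rbn2(X, d, J=6)` followed by `predict(X_test, centers, sigma2, w)` evaluates the fitted network on new inputs. The EM refinement mentioned in the last bullet is not included here.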
Example
[Figure: RBF network fit on a one-dimensional example. Legend: train samples, test samples, approximated curve, radial basis. Horizontal axis from -0.5 to 0.5; vertical axis from -1.5 to 1.]
General RBF Network
• Consider a Gaussian RBF model
$$F(x) = \sum_{j=1}^{J} w_j \exp\left(-\frac{\|x - t_j\|^2}{2\sigma^2}\right)$$
• In RBN-II training, in order to compute $\{w_j\}$, the parameters $\{t_j\}$ are determined in advance using K-means clustering, and $\sigma^2$ is selected initially.
• To fit the model better at $F(x_i) = d_i$, these parameters may need fine-tuning.
• Additional enhancements include
– allowing each basis function to have its own width parameter $\sigma_j$, and
– adding a bias term $b$ to compensate for a nonzero background value of the function over the support:
$$F(x) = \sum_{j=1}^{J} w_j \exp\left(-\frac{\|x - t_j\|^2}{2\sigma_j^2}\right) + b$$
• The model is similar to a Gaussian mixture model; the main difference is that the weights $\{w_j\}$ can be negative.
Training of Generalized RBN
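The parameters $\theta = \{w_j, t_j, \sigma_j, b\}$ are to be chosen to minimize the approximation error
$$E(\theta) = \frac{1}{2}\sum_{i=1}^{K} e_i^2 = \frac{1}{2}\sum_{i=1}^{K} \left[F(x_i|\theta) - d_i\right]^2$$
where
$$F(x|\theta) = \sum_{j=1}^{J} w_j \exp\left(-\frac{\|x - t_j\|^2}{2\sigma_j^2}\right) + b$$
The steepest descent gradient method leads to:
$$\theta(n+1) = \theta(n) - \eta\,\frac{\partial E}{\partial \theta} = \theta(n) - \eta \sum_{i=1}^{K} e_i(n)\,\frac{\partial F(x_i|\theta)}{\partial \theta}$$
where $\eta$ is the learning rate. Specifically, for $1 \le m \le J$,
$$\frac{\partial F(x_i|\theta)}{\partial t_m} = w_m \exp\left(-\frac{\|x_i - t_m\|^2}{2\sigma_m^2}\right) \frac{(x_i - t_m)}{\sigma_m^2}$$
$$\frac{\partial F(x_i|\theta)}{\partial \sigma_m^2} = w_m \exp\left(-\frac{\|x_i - t_m\|^2}{2\sigma_m^2}\right) \frac{\|x_i - t_m\|^2}{2\sigma_m^4}$$
$$\frac{\partial F(x_i|\theta)}{\partial w_m} = \exp\left(-\frac{\|x_i - t_m\|^2}{2\sigma_m^2}\right), \quad \text{and} \quad \frac{\partial F(x_i|\theta)}{\partial b} = 1$$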
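Training …
Note that
$$G_{ij} = \exp\left(-\frac{\|x_i - t_j\|^2}{2\sigma_j^2}\right)$$
Hence
$$\frac{\partial E}{\partial t_m} = \sum_{i=1}^{K} e_i\, G_{im}\, w_m\, \frac{(x_i - t_m)}{\sigma_m^2}, \qquad \frac{\partial E}{\partial \sigma_m^2} = \sum_{i=1}^{K} e_i\, G_{im}\, w_m\, \frac{\|x_i - t_m\|^2}{2\sigma_m^4},$$
$$\frac{\partial E}{\partial w_m} = \sum_{i=1}^{K} e_i\, G_{im}, \quad \text{and} \quad \frac{\partial E}{\partial b} = \sum_{i=1}^{K} e_i$$
Thus, the individual parameters' on-line learning formulas, with learning rate $\eta$, are:
$$e_i(n) = \sum_{j=1}^{J} w_j(n)\, G_{ij}(n) + b(n) - d_i$$
$$t_m(n+1) = t_m(n) - \eta \sum_{i=1}^{K} e_i(n)\, G_{im}(n)\, w_m(n)\, \frac{(x_i - t_m(n))}{\sigma_m^2(n)}$$
$$\sigma_m^2(n+1) = \sigma_m^2(n) - \eta \sum_{i=1}^{K} e_i(n)\, G_{im}(n)\, w_m(n)\, \frac{\|x_i - t_m(n)\|^2}{2\sigma_m^4(n)}$$
$$w_m(n+1) = w_m(n) - \eta \sum_{i=1}^{K} e_i(n)\, G_{im}(n), \quad \text{and} \quad b(n+1) = b(n) - \eta \sum_{i=1}^{K} e_i(n)$$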
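The gradient formulas above translate fairly directly into array operations. The sketch below is one possible NumPy rendering, with $e_i = F(x_i|\theta) - d_i$ and the width stored as $\sigma_m^2$. The names `rbf_gradients` and `train`, the single shared step size `eta`, and the lower bound `sigma2_min` are assumptions made for illustration; in practice the step size usually needs tuning.

```python
import numpy as np

def rbf_gradients(X, d, w, T, sigma2, b):
    """Return E and its gradients with respect to w, T, sigma2, b.

    X: (K, n) inputs, d: (K,) targets, w: (J,), T: (J, n), sigma2: (J,).
    """
    diff = X[:, None, :] - T[None, :, :]          # x_i - t_m, shape K x J x n
    d2 = (diff ** 2).sum(axis=2)                  # ||x_i - t_m||^2, K x J
    G = np.exp(-0.5 * d2 / sigma2[None, :])       # G_im
    e = G @ w + b - d                             # e_i = F(x_i) - d_i
    A = e[:, None] * G * w[None, :]               # e_i * G_im * w_m
    grad_T = (A[:, :, None] * diff).sum(axis=0) / sigma2[:, None]
    grad_s2 = (A * d2).sum(axis=0) / (2.0 * sigma2 ** 2)
    grad_w = G.T @ e
    grad_b = e.sum()
    E = 0.5 * np.sum(e ** 2)
    return E, grad_w, grad_T, grad_s2, grad_b

def train(X, d, w, T, sigma2, b, eta=1e-3, n_iter=100, sigma2_min=1e-4):
    """Batch steepest descent; sigma2 is clipped below to keep it positive."""
    history = []
    for _ in range(n_iter):
        E, gw, gT, gs2, gb = rbf_gradients(X, d, w, T, sigma2, b)
        history.append(E)                         # error before this update
        w = w - eta * gw
        T = T - eta * gT
        sigma2 = np.maximum(sigma2 - eta * gs2, sigma2_min)
        b = b - eta * gb
    return w, T, sigma2, b, history
```

A reasonable starting point is to initialize `T` and `sigma2` from a clustering step and `w`, `b` from the least-squares solution, then call `train` to fine-tune all parameters jointly.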
Implementation Details
• The cost function may be augmented with additional smoothing terms for the purpose of regularization. For example, the derivative of $F(x|\theta)$ may be bounded by a user-specified constant. However, this will make the training formulas more complicated.
• Initialization of RBF centers and variances can be accomplished using the K-means clustering algorithm.
• Selection of the number of RBF functions is part of the regularization process and often needs to be done using trial-and-error or heuristics. Cross-validation may also be used to give a more objective criterion.
• A feasible range may be imposed on each parameter to prevent numerical problems, e.g. $\sigma^2 > 0$.
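To make the cross-validation suggestion concrete, here is a small sketch that scores several candidate values of $J$ by k-fold validation error and picks the lowest. It makes simplifying assumptions that are not from the lecture: centers are a random subset of the training points rather than K-means centers, a single fixed width `sigma2` is used, and the names `design`, `cv_error`, and `select_J` are hypothetical.

```python
import numpy as np

def design(X, T, sigma2):
    """Gaussian design matrix with one shared width (assumption)."""
    d2 = ((X[:, None, :] - T[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-0.5 * d2 / sigma2)

def cv_error(X, d, J, sigma2=0.02, n_folds=5, seed=0):
    """Mean squared validation error of a J-neuron network over n_folds folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    errs = []
    for k in range(n_folds):
        te = folds[k]
        tr = np.concatenate([folds[m] for m in range(n_folds) if m != k])
        # Simplification: centers are random training points, not K-means centers.
        T = X[tr][rng.choice(len(tr), size=J, replace=False)]
        w = np.linalg.lstsq(design(X[tr], T, sigma2), d[tr], rcond=None)[0]
        pred = design(X[te], T, sigma2) @ w
        errs.append(np.mean((pred - d[te]) ** 2))
    return float(np.mean(errs))

def select_J(X, d, candidates, **kw):
    """Return the candidate J with the lowest cross-validation error, plus all scores."""
    scores = {J: cv_error(X, d, J, **kw) for J in candidates}
    return min(scores, key=scores.get), scores
```

Calling `select_J(X, d, [2, 4, 8])` returns the chosen `J` together with a dictionary of scores, which can be inspected to see how quickly the validation error flattens as neurons are added.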