TRANSCRIPT
Soft Computing and its Application
B. M. ‘Dan’ Wilamowski
University of Idaho
Introduction
Neural networks
Learning Algorithms
Advanced Neural Network Architectures
Pulse Coded Neural Networks
Fuzzy Systems
Genetic Algorithms
Hardware implementation of neuro-fuzzy systems
Conclusion
nn.uidaho.edu
An example of a three-layer feedforward neural network, which is sometimes also known as a backpropagation network.
An example of the bidirectional associative memory (BAM): (a) drawn as a two-layer network with circulating signals, (b) drawn as a two-layer network with bidirectional signal flow.
Several logical operations using networks with McCulloch-Pitts neurons.
The McCulloch-Pitts neuron computes the weighted sum of its inputs

net = \sum_{i=1}^{n} w_i x_i      or, with a bias weight and a constant +1 input,      net = \sum_{i=1}^{n} w_i x_i + w_{n+1}

and produces the output

o = f(net) = \begin{cases} 1 & \text{if } net \ge 0 \\ 0 & \text{if } net < 0 \end{cases}
Threshold implementation
Threshold implementation with an additional weight and a constant +1 input: (a) neuron with threshold T, (b) modified neuron with threshold T = 0 and an additional weight equal to -T.
Typical activation functions: (a) hard threshold unipolar, (b) hard threshold bipolar, (c) continuous unipolar, (d) continuous bipolar.
(a) hard threshold unipolar:   o = f(net) = \begin{cases} 1 & \text{if } net \ge 0 \\ 0 & \text{if } net < 0 \end{cases}

(b) hard threshold bipolar:   o = f(net) = \mathrm{sgn}(net) = \begin{cases} +1 & \text{if } net \ge 0 \\ -1 & \text{if } net < 0 \end{cases}

(c) continuous unipolar:   o = f(net) = \frac{1}{1 + \exp(-\lambda\, net)}

(d) continuous bipolar:   o = f(net) = \frac{2}{1 + \exp(-\lambda\, net)} - 1 = \tanh\left(\frac{\lambda\, net}{2}\right)
Activation functions
bipolar:   o = f(net) = \frac{2}{1 + \exp(-2k\, net)} - 1 = \tanh(k\, net),      f' = k\,(1 - o^2)

unipolar:   o = f(net) = \frac{1}{1 + \exp(-4k\, net)},      f' = 4k\, o\,(1 - o)
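As an illustrative sketch (not part of the original slides), these two activation functions and their derivatives can be coded directly from the formulas above; the gain k is the slope of both functions at net = 0:

```python
import numpy as np

def bipolar(net, k=1.0):
    """Bipolar activation o = tanh(k*net) and its derivative f' = k*(1 - o**2)."""
    o = np.tanh(k * net)
    return o, k * (1.0 - o ** 2)

def unipolar(net, k=1.0):
    """Unipolar activation o = 1/(1 + exp(-4k*net)) and its derivative f' = 4k*o*(1 - o)."""
    o = 1.0 / (1.0 + np.exp(-4.0 * k * net))
    return o, 4.0 * k * o * (1.0 - o)

# Both derivatives equal k at net = 0:
print(bipolar(0.0, k=0.5)[1], unipolar(0.0, k=0.5)[1])   # 0.5 0.5
```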
Learning rules for a single neuron (general weight update \Delta w_i = \alpha\, \delta\, x_i):

Hebb rule (unsupervised):   \delta = o
correlation rule (supervised):   \delta = d
perceptron fixed rule:   \delta = d - o
perceptron adjustable rule: as above, but the learning constant is modified to   \alpha^{*} = \lambda \frac{net}{x^{T}x} = \lambda \frac{w^{T}x}{x^{T}x}
LMS (Widrow-Hoff) rule:   \delta = d - net
delta rule:   \delta = (d - o)\, f'
pseudoinverse rule (for a linear system):   w = (x^{T}x)^{-1} x^{T} d
iterative pseudoinverse rule (for a nonlinear system):   w = (x^{T}x)^{-1} x^{T} (d - o)\, f'
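A minimal sketch of training a single unipolar neuron with the delta rule from the list above; the two-input data set and the learning constant are invented for illustration:

```python
import numpy as np

def train_single_neuron(X, d, alpha=0.5, epochs=500, k=1.0):
    """Delta rule: delta = (d - o) * f'(net); weight update dw = alpha * delta * x."""
    X = np.hstack([X, np.ones((X.shape[0], 1))])     # augment with a constant +1 input
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, target in zip(X, d):
            net = w @ x
            o = 1.0 / (1.0 + np.exp(-4.0 * k * net)) # unipolar activation
            fprime = 4.0 * k * o * (1.0 - o)
            w += alpha * (target - o) * fprime * x   # delta rule update
    return w

# Hypothetical AND-like training data
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
d = np.array([0, 0, 0, 1], dtype=float)
print(train_single_neuron(X, d))
```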
LMS AND REGRESSION ALGORITHMS
If a single layer of neurons is considered, error backpropagation type algorithms minimize the global error shown below:
TotalError = \sum_{p=1}^{P} \sum_{j=1}^{J} \left( d_{pj} - o_{pj} \right)^2
where P is the number of patterns and J is the number of outputs. A similar approach is taken in the Widrow-Hoff (LMS) algorithm:
TotalError = \sum_{p=1}^{P} \sum_{j=1}^{J} \left( d_{pj} - net_{pj} \right)^2

where

net_{pj} = \sum_{i=1}^{I} w_{ji}\, x_{ip}

and I is the size of the augmented input vector, i.e. x_{Ip} = +1.
For any given neuron the training data is given in two arrays:
\begin{bmatrix}
x_{11} & x_{12} & x_{13} & \cdots & x_{1I} \\
x_{21} & x_{22} & x_{23} & \cdots & x_{2I} \\
x_{31} & x_{32} & x_{33} & \cdots & x_{3I} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
x_{P1} & x_{P2} & x_{P3} & \cdots & x_{PI}
\end{bmatrix}
\times
\begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_I \end{bmatrix}
=
\begin{bmatrix} d_1 \\ d_2 \\ d_3 \\ \vdots \\ d_P \end{bmatrix}
where I is the number of augmented inputs. The overdetermined set of equations can be solved in the least mean square sense:
w_j = (x^T x)^{-1} x^T d
where w_j is the unknown weight vector of the j-th neuron. The pseudoinverse of the matrix x must be computed only once, and the weights for all the neurons (j = 1 to N) can then be found. Regardless of whether the regression algorithm or the LMS algorithm is used, the outcome will be the same.
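A small numpy sketch of this regression (pseudoinverse) solution for one neuron; the augmented training patterns and desired outputs below are made up for illustration:

```python
import numpy as np

# Hypothetical training patterns (P x I), already augmented with a +1 column
X = np.array([[1.0, 2.0, 1.0],
              [2.0, 1.0, 1.0],
              [3.0, 3.0, 1.0],
              [4.0, 1.0, 1.0]])
d = np.array([-1.0, -1.0, 1.0, 1.0])        # desired outputs, one per pattern

# Least mean square solution of the overdetermined system X w = d,
# equivalent to w = (X^T X)^-1 X^T d
w, *_ = np.linalg.lstsq(X, d, rcond=None)
print(w)
```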
Two examples of non-optimum solutions, where line A is the result of the regression (LMS) algorithm and line B is the separation generated by the minimum distance classifier.
LMS algorithm

AW ALGORITHM
The total error for one neuron j and pattern p is now defined by a simple difference:

E_{jp} = D_{jp} - O_{jp}(net),   where   net = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n

The derivative of this error with respect to the i-th weight of the j-th neuron can be written as

\frac{dE_{jp}}{dw_i} = -\frac{dO_{jp}}{d\,net}\,\frac{d\,net}{dw_i} = -f'\, x_i
The error function can then be approximated by the first two terms of its Taylor (linear) approximation around a given point:
E_{jp} \approx E_{jpo} + \frac{dE_{jp}}{dw_1}\,\Delta w_1 + \frac{dE_{jp}}{dw_2}\,\Delta w_2 + \cdots + \frac{dE_{jp}}{dw_n}\,\Delta w_n
Therefore:
\begin{bmatrix}
x_{11} & x_{12} & x_{13} & \cdots & x_{1I} \\
x_{21} & x_{22} & x_{23} & \cdots & x_{2I} \\
\vdots & \vdots & \vdots & & \vdots \\
x_{p1} & x_{p2} & x_{p3} & \cdots & x_{pI} \\
\vdots & \vdots & \vdots & & \vdots \\
x_{P1} & x_{P2} & x_{P3} & \cdots & x_{PI}
\end{bmatrix}
\begin{bmatrix} \Delta w_1 \\ \Delta w_2 \\ \vdots \\ \Delta w_I \end{bmatrix}
=
\begin{bmatrix}
\frac{D_1 - O_1}{f'_1} \\
\frac{D_2 - O_2}{f'_2} \\
\vdots \\
\frac{D_p - O_p}{f'_p} \\
\vdots \\
\frac{D_P - O_P}{f'_P}
\end{bmatrix}
AW ALGORITHM 2
\Delta w = (X^T X)^{-1} X^T\, del = Y\, del
The Y matrix depends only on the input patterns and must be computed only once!
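A rough Python sketch of the idea (my own illustration, not code from the tutorial): the pseudoinverse Y = (X^T X)^{-1} X^T of the pattern matrix is computed once, and every iteration only recomputes the del vector (D - O)/f' before applying the update:

```python
import numpy as np

def aw_train(X, D, k=1.0, iters=20):
    """Iteratively adjust the weights of one neuron: dw = Y @ del."""
    Y = np.linalg.pinv(X)                   # (X^T X)^-1 X^T, computed only once
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        net = X @ w
        O = np.tanh(k * net)                # bipolar activation
        fprime = k * (1.0 - O ** 2) + 1e-6  # small floor avoids division by zero
        delta = (D - O) / fprime            # del vector, one entry per pattern
        w = w + Y @ delta
    return w
```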
Comparison of algorithms, where the algorithms can be identified by the labels: A - regression, B - minimum distance, C - modified minimum distance, and D - modified regression and delta (backpropagation) algorithm.
AW ALGORITHM 3
An example of a three-layer feedforward neural network, which is sometimes also known as a backpropagation network.
Illustration of the property of linear separation of patterns in the two-dimensional space by a single neuron.
Figure: a single neuron with inputs x1, x2, weights w1, w2 and a threshold weight w3 (for example w1 = 1, w2 = 1, w3 = -1) divides the x1-x2 plane with a straight line.

Figure: the inputs feed hidden layer #1, hidden layer #2 performs AND operations, and the output neuron performs an OR operation (with +1 bias inputs throughout).
An example of a three-layer neural network with two inputs for separating three clusters into one category. This network can be generalized and used to solve all classification problems.
An example with a comparison of results obtained using the LMS and Delta training algorithms. Note that LMS is not able to find the proper solution.
Figure: error backpropagation in a two-layer network. The input vector z feeds layer j (weights V, outputs y = f(net_j)), which feeds layer k (weights W, outputs o = f(net_k)). In the feedforward pass the outputs o are compared with the desired output vector d; in the back-propagation pass

\delta_o = (d_k - o_k)\, f'(net_k),      \Delta W = \eta\, \delta_o\, y

\delta_y = f'(net_j)\, w_j\, \delta_o,      \Delta V = \eta\, \delta_y\, z

where \eta is the learning constant and w_j is the initial weight vector of the j-th hidden neuron.
Error Backpropagation
Illustration of the concept of the gain computation in neural networks: inputs x_i with weights w_{ij} feed the j-th neuron, and the gain toward the k-th output is A_{ij} = \frac{d\,o_k}{d\,net_j}.
• Although the error backpropagation algorithm (EBP) was a significant breakthrough in neural network research, it is known as an algorithm with a very poor convergence rate.
• Many attempts have been made to speed up the EBP algorithm:
  – heuristic approaches such as momentum,
  – variable learning rate,
  – stochastic learning,
  – artificial enlarging of errors for neurons operating in the saturation region.
• More significant improvement was possible by using various second order approaches:
  – Newton,
  – conjugate gradient,
  – Levenberg-Marquardt (LM) method.
The LM algorithm is now considered the most efficient one. It combines the speed of the Newton algorithm with the stability of the steepest descent method.
Levenberg-Marquardt

Steepest descent method:   w_{k+1} = w_k - \alpha\, g

Newton method:   w_{k+1} = w_k - A_k^{-1} g

where A_k is the Hessian and g is the gradient vector.

Assuming A \approx 2 J^T J and g \approx 2 J^T v, where J is the Jacobian and v is the error vector:

w_{k+1} = w_k - \left( 2 J_k^T J_k \right)^{-1} 2 J_k^T v      or      w_{k+1} = w_k - \left( J_k^T J_k \right)^{-1} J_k^T v

Levenberg-Marquardt method:   w_{k+1} = w_k - \left( J_k^T J_k + \mu I \right)^{-1} J_k^T v
The Hessian contains the second derivatives of the error with respect to the variables:

A \Leftrightarrow \begin{bmatrix}
\frac{\partial^2 v}{\partial x_1^2} & \frac{\partial^2 v}{\partial x_1 \partial x_2} & \frac{\partial^2 v}{\partial x_1 \partial x_3} & \cdots \\
\frac{\partial^2 v}{\partial x_2 \partial x_1} & \frac{\partial^2 v}{\partial x_2^2} & \frac{\partial^2 v}{\partial x_2 \partial x_3} & \cdots \\
\frac{\partial^2 v}{\partial x_3 \partial x_1} & \frac{\partial^2 v}{\partial x_3 \partial x_2} & \frac{\partial^2 v}{\partial x_3^2} & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{bmatrix}    (Hessian)

while the Jacobian contains only the first derivatives of the individual errors:

J \Leftrightarrow \begin{bmatrix}
\frac{\partial v_1}{\partial x_1} & \frac{\partial v_1}{\partial x_2} & \frac{\partial v_1}{\partial x_3} & \cdots \\
\frac{\partial v_2}{\partial x_1} & \frac{\partial v_2}{\partial x_2} & \frac{\partial v_2}{\partial x_3} & \cdots \\
\frac{\partial v_3}{\partial x_1} & \frac{\partial v_3}{\partial x_2} & \frac{\partial v_3}{\partial x_3} & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{bmatrix}    (Jacobian)
The LM algorithm combines the speed of the Newton algorithm with the stability of the steepest descent method. The LM algorithm uses the following formula to calculate the weights in subsequent iterations:
w_{k+1} = w_k - \left( J_k^T J_k + \mu I \right)^{-1} J_k^T E
where E is the cumulative (over all patterns) error vector, I is the identity matrix, µ is a learning parameter, and J is the Jacobian of the m output errors with respect to the n weights of the neural network. For µ = 0 it becomes the Gauss-Newton method. For very large µ the LM algorithm becomes the steepest descent (EBP) algorithm. The µ parameter is automatically adjusted at each iteration in order to secure convergence.
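A compact numpy sketch of this update rule; the way µ is multiplied or divided by 10 and the simple accept/reject loop are common illustrative choices, not necessarily the tutorial's exact procedure:

```python
import numpy as np

def lm_step(w, error_fn, jac_fn, mu):
    """One trial LM step: w_new = w - (J^T J + mu*I)^-1 J^T E."""
    E = error_fn(w)                        # vector of output errors (all patterns)
    J = jac_fn(w)                          # Jacobian of the errors w.r.t. the weights
    A = J.T @ J + mu * np.eye(len(w))
    return w - np.linalg.solve(A, J.T @ E)

def lm_train(w, error_fn, jac_fn, mu=0.01, iters=100):
    """Decrease mu after a successful step, increase it after a failed one."""
    sse = np.sum(error_fn(w) ** 2)
    for _ in range(iters):
        w_new = lm_step(w, error_fn, jac_fn, mu)
        sse_new = np.sum(error_fn(w_new) ** 2)
        if sse_new < sse:                  # error decreased: accept, act more like Newton
            w, sse, mu = w_new, sse_new, mu / 10.0
        else:                              # error increased: reject, act more like EBP
            mu *= 10.0
    return w
```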
Levenberg-Marquardt 2

Error Back Propagation Algorithm
Sum of squared errors as a function of the number of iterations for the "XOR" problem using the EBP algorithm with Nguyen-Widrow weight initialization.
Sum of squared errors as a function of the number of iterations for the "XOR" problem using the EBP algorithm with unfavorable weight initialization.
Levenberg-Marquardt Algorithm

Sum of squared errors as a function of the number of iterations for the "XOR" problem using the LM algorithm with Nguyen-Widrow weight initialization. The algorithm failed in 15% to 25% of cases.
When the initial weights were chosen purposely very far from the solution, the LM algorithm failed in 100% of cases.
Illustration of the modified derivative calculation for faster convergence of the error backpropagation algorithm (the figure compares the actual derivative of f(net) with the modified derivative near the output saturation levels -1 and +1).
The poor convergence of the EBP algorithm is not caused by local minima but by plateaus on the error surface. This problem is also known as the "flat spot" problem. The prime reason for the plateau formation is the characteristic shape of the sigmoidal activation functions.
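One simple way to realize such a modified derivative (a sketch of the general idea, not necessarily the exact modification used in the tutorial) is to keep the derivative from reaching zero for saturated neurons:

```python
import numpy as np

def actual_derivative(o, k=1.0):
    return k * (1.0 - o ** 2)               # tends to 0 when the bipolar neuron saturates

def modified_derivative(o, k=1.0, flat_spot=0.1):
    # A small additive constant keeps the error signal flowing through
    # saturated neurons, removing the plateaus ("flat spots").
    return k * (1.0 - o ** 2) + flat_spot

o = np.tanh(5.0)                             # output of a deeply saturated neuron
print(actual_derivative(o), modified_derivative(o))
```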
Modified EBP Algorithm

Sum of squared errors as a function of the number of iterations for the "XOR" problem using the modified EBP algorithm with unfavorable weight initialization.
Sum of squared errors as a function of the number of iterations for the "XOR" problem using the modified EBP algorithm with Nguyen-Widrow weight initialization.
Results of flat spot elimination
The counterpropagation network: normalized inputs feed a Kohonen layer of unipolar neurons (only one of which is active), followed by a Grossberg layer with summing circuits at the outputs.
A winner-take-all (WTA) architecture for cluster extraction in the unsupervised training mode: (a) network connections with normalized inputs and a single winner in the Kohonen layer, (b) single-layer network arranged into a hexagonal shape.
Example: find the number and location of clusters in 4-dimensional space. The training set is a long list of 4-dimensional vectors (data table omitted here); the clusters are located approximately at

(-2, 3, -3, 4),   (4, -4, 6, -6),   (2, -2, 4, 2)
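The tutorial gives no code for this task; as an illustration only, a simple k-means style procedure can recover such cluster centers (the synthetic data and k = 3 below are assumptions):

```python
import numpy as np

def kmeans(data, k=3, iters=50, seed=0):
    """Assign each vector to its nearest center, then move each center to the mean."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(iters):
        dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        centers = np.array([data[labels == j].mean(axis=0) for j in range(k)])
    return centers

# Hypothetical 4-dimensional samples scattered around three centers
rng = np.random.default_rng(1)
true_centers = np.array([[-2, 3, -3, 4], [4, -4, 6, -6], [2, -2, 4, 2]], dtype=float)
data = np.vstack([c + rng.normal(0.0, 1.0, size=(20, 4)) for c in true_centers])
print(kmeans(data))
```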
The cascade correlation architecture: the input weights of the hidden neurons, once adjusted, are frozen, while the weights of the output neurons are adjusted at every step (+1 bias inputs are used throughout).
A typical structure of the radial basis function network: hidden "neurons" compute the distance (D) between the input and the stored pattern vectors s1 ... s4 (for example, the output indicating that x is closest to s2), and the outputs are formed by summing circuits with output normalization.
A Hopfield network, or autoassociative memory: a single layer of neurons v1 ... v5 with feedback through the weight matrix W.
An example of the bidirectional associative memory (BAM): (a) drawn as a two-layer network with circulating signals, (b) drawn as a two-layer network with bidirectional signal flow (weight matrices W and W^T between layers a and b).
The functional link network: the inputs are expanded by nonlinear elements (plus a +1 bias) before a single layer of neurons produces the outputs.
Functional link networks for the solution of the XOR problem: (a) using unipolar signals, (b) using bipolar signals. In both cases the product x1*x2 is supplied as an additional input.
Sarajedini and Hecht-Nielsen network
Let us consider a stored vector w and an input pattern x. Both the input and the stored pattern have the same dimension n. The squared Euclidean distance between x and w is:
\| x - w \|^2 = (x_1 - w_1)^2 + (x_2 - w_2)^2 + \cdots + (x_n - w_n)^2

After defactorization:

\| x - w \|^2 = x_1^2 + x_2^2 + \cdots + x_n^2 + w_1^2 + w_2^2 + \cdots + w_n^2 - 2\,( x_1 w_1 + x_2 w_2 + \cdots + x_n w_n )

and finally:

\| x - w \|^2 = x^T x + w^T w - 2\, x^T w = \| x \|^2 + \| w \|^2 - 2\, net

Figure: a single summing neuron with inputs x_1 ... x_n (weights -2w_1 ... -2w_n), an extra input \| x \|^2, and a constant +1 input can therefore compute \| x - w \|^2 at its output.
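A tiny numpy check of this identity (illustration only); it shows why a conventional summing neuron can output the squared distance once it is given ||x||^2 as an additional input:

```python
import numpy as np

rng = np.random.default_rng(0)
x, w = rng.normal(size=5), rng.normal(size=5)

direct = np.sum((x - w) ** 2)              # ||x - w||^2 computed directly
net = w @ x
via_neuron = x @ x + w @ w - 2.0 * net     # ||x||^2 + ||w||^2 - 2*net
print(np.isclose(direct, via_neuron))      # True
```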
Input pattern transformation on a sphere

Figure: the n-dimensional input pattern x = (x_1, ..., x_n) is extended with an additional component z_{n+1} = \sqrt{R^2 - \|x\|^2}, so that the transformed patterns z = (z_1, ..., z_{n+1}) lie on a sphere of radius R.
Input pattern transformation on a sphere 2
Network with two neurons capable of separating a crescent-shaped region of patterns: (a) input-output mapping, (b) network diagram.
Figure: the two inputs x_1 and x_2 are supplemented by the spherical component \sqrt{r^2 - x_1^2 - x_2^2}; panel (a) shows the resulting crescent-shaped input-output mapping.
Input pattern transformation on a sphere 3
Spiral problem solved with sigmoidal type neurons (a) network diagram, (b) input-output mapping.
Figure: network diagram for the spiral problem (inputs x_1, x_2, +1 bias inputs, and weights of +1 and -1) and the resulting input-output mapping.

Figure: CMOS circuit of the pulse-coded neural cell (NC): transistors M1-M4, MP and MN, resistor-capacitor pairs R1 C1 and R2 C2, an input node, and weighted output currents.
Pulse Coded Neural Networks
Transient graph: simulated voltages VC1 and VC2 over 0 to 0.2 ms.
Pulse Coded Neural Networks 2
Figure: two neural cells (NC) coupled through an RC network, with simulated pulse trains at nodes X and Y.
Pulse Coded Neural Networks 3

Pulse Coded Neural Networks 4

Figure: an array of mutually coupled pulse-coded neurons (each cell with VDD, R1 C1, R2 C2, and coupling with its neighbors) converts an input spatial image into an output temporal pattern.
Plot: number of events versus time for the complete "Burning Fire" image.
Pulse Coded Neural Networks 5

Fuzzy systems
• Inputs can be any value from 0 to 1.
• The basic fuzzy principle is similar to Boolean logic.
• Max and min operators are used instead of AND and OR. The NOT operator becomes 1 - A.
A ∩ B ∩ C ⇒ min{A, B, C}   (the smallest value of A, B, or C)
A ∪ B ∪ C ⇒ max{A, B, C}   (the largest value of A, B, or C)
NOT A ⇒ 1 - A

Boolean:
  A  B | A ∩ B | A ∪ B
  1  1 |   1   |   1
  0  1 |   0   |   1
  1  0 |   0   |   1
  0  0 |   0   |   0

Fuzzy:
   A    B  | A ∩ B | A ∪ B
  0.8  0.7 |  0.7  |  0.8
  0.3  0.7 |  0.3  |  0.7
  0.8  0.2 |  0.2  |  0.8
  0.3  0.2 |  0.2  |  0.3
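A minimal Python sketch of these operators, using the min/max/complement definitions above (illustration only):

```python
def fuzzy_and(*values):    # intersection: smallest membership value
    return min(values)

def fuzzy_or(*values):     # union: largest membership value
    return max(values)

def fuzzy_not(a):          # complement
    return 1.0 - a

a, b = 0.8, 0.7
print(fuzzy_and(a, b), fuzzy_or(a, b), fuzzy_not(a))   # 0.7 0.8 0.2
```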
Fuzzy systems 2
Block diagram of the Zadeh fuzzy controller: the two analog inputs pass through fuzzifiers with n and m membership functions, giving n+m fuzzy signals; a rule block selects n*m areas; MAX operators segregate the n*m fuzzy signals into k categories; and a defuzzifier with k membership functions produces the single analog output.

Takagi-Sugeno type defuzzifier: the k fuzzy signals are combined by a weighted sum with normalization to produce the single analog output.
Fuzzification
• There are three major types of membership functions:
  – Gaussian, triangular, and trapezoidal.
• Three basic membership function rules:
  1. Each point of an input should belong to one membership function.
  2. The sum of two overlapping functions should never be greater than 1.
  3. For higher accuracy, more membership functions can be used, but this can lead to system instability and will require a larger fuzzy table.
Fuzzification
Example: five trapezoidal membership functions (Cold, Cool, Normal, Warm, Hot) cover the temperature range from 20 to 120 degrees F. Fuzzifying 38 deg F gives cold = 0.7 and cool = 0.3 (all other memberships are 0); fuzzifying 83 deg F gives normal = 0.2 and warm = 0.8.

Trapezoidal membership function results from fuzzification.
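A small Python sketch of a trapezoidal fuzzifier; the breakpoints below are my own assumptions, chosen only so that they reproduce the example values above (38 deg F -> cold 0.7, cool 0.3; 83 deg F -> normal 0.2, warm 0.8):

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: rises from a to b, stays at 1 until c, falls to d."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

# Hypothetical breakpoints (deg F) for the five temperature categories
membership = {
    "cold":   (-1e9, -1e9, 32, 52),    # saturated on the left
    "cool":   (32, 52, 55, 75),
    "normal": (55, 75, 75, 85),
    "warm":   (75, 85, 95, 110),
    "hot":    (95, 110, 1e9, 1e9),     # saturated on the right
}

def fuzzify(temperature):
    return {name: round(trapezoid(temperature, *pts), 2) for name, pts in membership.items()}

print(fuzzify(38))   # {'cold': 0.7, 'cool': 0.3, 'normal': 0.0, 'warm': 0.0, 'hot': 0.0}
print(fuzzify(83))   # {'cold': 0.0, 'cool': 0.0, 'normal': 0.2, 'warm': 0.8, 'hot': 0.0}
```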
Rule Evaluation

Rule table (output category as a function of the two inputs):

              Input 2
Input 1     cold  cool  normal  warm  hot
cold         A     A     A       B     A
cool         A     A     B       C     B
normal       A     B     C       C     C
warm         A     B     C       D     D
hot          B     C     D       E     E

With the fuzzified inputs (Input 1: cold = 0.7, cool = 0.3; Input 2: normal = 0.2, warm = 0.8), the MIN of the row and column membership values is applied to each cell:

              Input 2
Input 1     0 cold  0 cool  0.2 normal  0.8 warm  0 hot
0.7 cold      0       0      0.2 A       0.7 B      0
0.3 cool      0       0      0.2 B       0.3 C      0
0 normal      0       0       0           0         0
0 warm        0       0       0           0         0
0 hot         0       0       0           0         0
Zadeh fuzzy tables
In the Takagi-Sugeno approach each rule has its own output value O1 ... O25:

              Input 2
Input 1     cold  cool  normal  warm  hot
cold         O1    O2    O3      O4    O5
cool         O6    O7    O8      O9    O10
normal       O11   O12   O13     O14   O15
warm         O16   O17   O18     O19   O20
hot          O21   O22   O23     O24   O25

With the same fuzzified inputs, the active cells become 0.2*O3, 0.7*O4, 0.2*O8, and 0.3*O9; all other cells are 0.

Takagi-Sugeno fuzzy tables
Defuzzification
• The equation below describes the defuzzification process:
  – n : number of membership functions
  – z_k : fuzzy output variables
  – zc_k : analog values from the table
• Outputs:
  – Zadeh
  – Takagi-Sugeno
Output = \frac{\sum_{k=1}^{n} zc_k\, z_k}{\sum_{k=1}^{n} z_k}

Zadeh (with the category strengths obtained above):

Output = \frac{0.2\,A + 0.7\,B + 0.3\,C}{0.2 + 0.7 + 0.3}

Takagi-Sugeno:

Output = \frac{0.2\,O3 + 0.7\,O4 + 0.2\,O8 + 0.3\,O9}{0.2 + 0.7 + 0.2 + 0.3}
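A short Python sketch of this weighted-average defuzzification; the analog output values assigned to the categories are invented for illustration:

```python
def defuzzify(strengths, analog_values):
    """Output = sum(zc_k * z_k) / sum(z_k)."""
    num = sum(analog_values[k] * z for k, z in strengths.items())
    den = sum(strengths.values())
    return num / den

# Zadeh case: category strengths after the MAX step, with assumed analog values
strengths = {"A": 0.2, "B": 0.7, "C": 0.3}
analog_values = {"A": 10.0, "B": 50.0, "C": 80.0}    # hypothetical table entries
print(defuzzify(strengths, analog_values))           # (0.2*10 + 0.7*50 + 0.3*80) / 1.2
```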
Fuzzy systems VLSI implementation
Block diagram of the fuzzy VLSI chip: the two analog inputs are fuzzified into 8 + 8 fuzzy signals, 64 MIN operators evaluate the rules, and a normalization and weighted-sum stage produces the single analog output.
Control surfaces: (a) desired control surface, (b) information stored in the defuzzifier as weights, and (c) measured control surface of the VLSI chip.
Figure: chip architecture with two fuzzifiers (inputs X and Y) driving an array of cluster cells; the cells produce weighted currents that are summed to form the output.
Fuzzy systems VLSI implementation 2
Fuzzifier with six different membership functions: (a) circuit diagram of the fuzzifier (reference voltages, input VIN, fuzzy output currents I1-I6), (b) example of the SPICE simulation.
Fuzzy systems VLSI implementation 3
Defuzzifier using normalization and weighted sum: the fuzzy inputs drive transistors whose W/L ratios set the weights applied to the output summing node.
Selection circuits: (a) MIN circuit in the voltage mode, (b) neuron circuit with threshold in the current mode.
Fuzzy systems VLSI implementation 4
MAX operators (a) concept diagram and (b) simulation results for MAX1 and for the proposed MAX2.
Comparison of MAX circuits: simulated output currents of MAX1 and the proposed MAX2 for time-varying input currents I1 and I2.
Fuzzy systems VLSI implementation 5
The cluster cell with rule selection (transistors M1-M4) and defuzzification (source I0 and
transistors M4-M6)
Six bit programmable current sources
Figure: the cluster cell receives the i-th, j-th, and k-th fuzzy voltages (X, Y, Z) on transistors connected to a common node supplied by a single constant current source I0; the W/L ratio of the output transistor sets the output value for the cluster, which is sent to the global summing node. The six-bit programmable current source uses binary-weighted (1, 2, 4, 8, 16, 32) W/L ratios.
Fuzzy systems VLSI implementation 6
Normalization circuit (a) circuit diagram and (b) characteristics
Figure: normalization circuit with input currents I1, I2, I3 and normalized output currents I1N, I2N, I3N; the simulated characteristics show the normalized currents as a function of I1.
Fuzzy systems - microprocessor implementation
Panels: required control surface; Zadeh with trapezoidal membership functions; Zadeh with triangular membership functions; Zadeh with Gaussian membership functions; Takagi-Sugeno with trapezoidal membership functions; Takagi-Sugeno with triangular membership functions.
Fuzzy systems - microprocessor implementation 2
   Approach used                                                          Error SSE   Error MSE
1  Zadeh fuzzy controller with trapezoidal membership functions              908.4       0.945
2  Zadeh fuzzy controller with triangular membership functions               644.4       0.671
3  Zadeh fuzzy controller with Gaussian membership functions                 562.0       0.585
4  Takagi-Sugeno fuzzy controller with trapezoidal membership functions      296.5       0.309
5  Takagi-Sugeno fuzzy controller with triangular membership functions       210.8       0.219
6  Takagi-Sugeno fuzzy controller with Gaussian membership functions         294.2       0.306
Neural systems - microprocessor implementation
Required control surface
Neural systems - microprocessor implementation 2
   Approach used                                        Error SSE   Error MSE
1  Neural network with 3 neurons in cascade               0.5559      0.000578
2  Neural network with 5 neurons in cascade               0.0895      0.000093
3  Neural network with 6 neurons in one hidden layer      0.2902      0.000302
Comparison of various fuzzy and neural controllers
Type of controller                                   Error MSE   processing time (ms)   length of code
Neural network with 6 neurons in one hidden layer      0.00030          3.8                   660
Neural network with 5 neurons in cascade               0.00009          3.3                  1070
Neural network with 3 neurons in cascade               0.00057          1.7                  2680
Takagi-Sugeno with Gaussian                            0.306           52.3                  2845
Takagi-Sugeno with triangular                          0.219           28.5                  1502
Takagi-Sugeno with trapezoidal                         0.309           28.5                  1502
Zadeh with Gaussian                                    0.585           39.8                  3245
Zadeh with triangular                                  0.671            1.95                 2324
Zadeh with trapezoidal                                 0.945            1.95                 2324
Genetic Algorithms

Genetic algorithms follow the process of evolution in nature to find better solutions to complicated problems. The foundations of genetic algorithms are given in the books by Holland (1975) and Goldberg (1989).

Genetic algorithms consist of the following steps:
Initialization
Selection
Reproduction with crossover and mutation

Selection and reproduction are repeated for each generation until a solution is reached.
During this procedure, strings of symbols known as chromosomes evolve toward better solutions.
Genetic Algorithms 2
All significant steps of the genetic algorithm will be explained using a simple example of finding the maximum of the function (sin²(x) - 0.5x)² for x in the range from 0 to 1.6. Note that in this range the function has a global maximum at x = 1.309 and a local maximum at x = 0.262.

Coding and initialization
At first, the variable x has to be represented as a string of symbols. With longer strings the process usually converges faster, so the fewer distinct symbols used for each string field, the better. While this string may be a sequence of any symbols, the binary symbols "0" and "1" are usually used. In our example, let us code x as a six-bit binary number whose decimal value equals 40x. The process starts with a random generation of the initial population given in the table below.
Genetic Algorithms 3
Initial Population
   string   decimal value   variable value   function value   fraction of total
1  101101        45             1.125            0.0633             0.2465
2  101000        40             1.000            0.0433             0.1686
3  010100        20             0.500            0.0004             0.0016
4  100101        37             0.925            0.0307             0.1197
5  001010        10             0.250            0.0041             0.0158
6  110001        49             1.225            0.0743             0.2895
7  100111        39             0.975            0.0390             0.1521
8  000100         4             0.100            0.0016             0.0062
Total                                            0.2568             1.0000
Genetic Algorithms 4
Selection and reproduction
Selection of the best members of the population is an important step in the genetic algorithm. Many different approaches can be used to rank individuals. In our example the function value itself is used for ranking. Member number 6 has the highest rank and member number 3 the lowest. Members with a higher rank should have a higher chance to reproduce. The probability of reproduction for each member can be obtained as its fraction of the sum of all objective function values. This fraction is shown in the last column of the table.

Using a random reproduction process, the following population, arranged in pairs, could be generated:

101101 -> 45    110001 -> 49
100101 -> 37    110001 -> 49
100111 -> 39    101101 -> 45
110001 -> 49    101000 -> 40
Genetic Algorithms 5
Reproduction
101101 -> 45    110001 -> 49
100101 -> 37    110001 -> 49
100111 -> 39    101101 -> 45
110001 -> 49    101000 -> 40

If the size of the population is to remain the same from one generation to another, then two parents should generate two children. By combining two strings, two other strings should be generated. The simplest way to do this is to split each parent string in half and exchange the substrings between the parents. For example, from the parent strings 010100 and 100111 the following child strings will be generated: 010111 and 100100. This process is known as crossover, and the resulting children are shown below:

101111 -> 47    110101 -> 53
100001 -> 33    110000 -> 48
100101 -> 37    101001 -> 41
110101 -> 53    101001 -> 41
Genetic Algorithms 6
Mutation
On top of the properties inherited from their parents, children also acquire some new random properties. This process is known as mutation. In most cases mutation generates low-ranked children, which are eliminated in the reproduction process. Sometimes, however, mutation may introduce a better individual with a new property into the population. This prevents the reproduction process from degenerating. In genetic algorithms mutation usually plays a secondary role, and the mutation rate is usually assumed to be well below 1%. In our example mutation is equivalent to a random bit change of a given pattern. In this simple example, with short strings, a small population, and a typical mutation rate of 0.1%, our patterns remain practically unchanged by the mutation process. The second generation for our example is shown in the table below.
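A compact Python sketch of the whole procedure for this example, with roulette-wheel selection, mid-point crossover, and bit-flip mutation; the population size, mutation rate, number of generations, and random seed are illustrative choices:

```python
import numpy as np

def fitness(x):
    return (np.sin(x) ** 2 - 0.5 * x) ** 2            # function being maximized

def decode(bits):
    return int(bits, 2) / 40.0                        # six-bit string, x = decimal value / 40

rng = np.random.default_rng(0)
population = ["".join(rng.choice(list("01"), size=6)) for _ in range(8)]

for generation in range(10):
    f = np.array([fitness(decode(s)) for s in population])
    prob = f / f.sum()                                # "fraction of total" column
    parents = rng.choice(population, size=8, p=prob)  # roulette-wheel selection
    children = []
    for a, b in zip(parents[0::2], parents[1::2]):
        children += [a[:3] + b[3:], b[:3] + a[3:]]    # split in half and exchange (crossover)
    population = ["".join(bit if rng.random() > 0.001 else str(1 - int(bit)) for bit in s)
                  for s in children]                  # mutation with 0.1% rate

best = max(population, key=lambda s: fitness(decode(s)))
print(best, decode(best), fitness(decode(best)))
```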
Genetic Algorithms 7
Population of Second Generation
string number   string   decimal value   variable value   function value   fraction of total
1               010111        47             1.175            0.0696             0.1587
2               100100        37             0.925            0.0307             0.0701
3               110101        53             1.325            0.0774             0.1766
4               010001        41             1.025            0.0475             0.1084
5               100001        33             0.825            0.0161             0.0368
6               110101        53             1.325            0.0774             0.1766
7               110000        48             1.200            0.0722             0.1646
8               101001        41             1.025            0.0475             0.1084
Total                                                         0.4387             1.0000
Genetic Algorithms 8

Note that the two identical highest-ranking members of the second generation are very close to the solution x = 1.309. The randomly chosen parents for the third generation are:

010111 -> 47    110101 -> 53
110000 -> 48    101001 -> 41
110101 -> 53    110000 -> 48
101001 -> 41    110101 -> 53

which produce the following children:

010101 -> 21    110000 -> 48
110001 -> 49    101101 -> 45
110111 -> 55    110101 -> 53
101000 -> 40    110001 -> 49

The best result in the third population is the same as in the second one. By careful inspection of all the strings from the second or third generation one may conclude that, using a crossover where strings are always split in half, the best solution 110100 -> 52 will never be reached, no matter how many generations are created.

The genetic algorithm is very rapid and leads to a good solution within a few generations. This solution is usually close to the global maximum, but not the best.
Genetic Algorithms 9

Genetic Algorithms 10
Genetic Algorithms 11

Genetic Algorithms 12
Soft Computing and its Application
B. M. ‘Dan’ Wilamowski
University of Idaho
nn.uidaho.edu