The 2005 UK Workshop on Computational Intelligence
5-7 September 2005, London
L2-SVM Based Fuzzy Classifier with Automatic Model Selection and
Fuzzy Rule Ranking
Shang-Ming Zhou and John Q. Gan
Department of Computer Science, University of Essex, UK
Background and Objectives(1/4)
The challenges:
  To apply SVM techniques to parsimonious fuzzy system modelling for regression and classification.
  It is difficult to link the kernel functions in SVM to the basis functions in a fuzzy system.
Advantage of SVM:
  Parsimonious solutions based on quadratic programming.
Background and Objectives(2/4)
Chen and Wang’s work [Chen and Wang 2003]:
  Established this kind of relation for fuzzy classification based on L1-SVM techniques.
  The kernel parameters and the regularization parameter are not updated optimally from data for fuzzy rule induction.
One objective:
  To apply L2-SVM techniques to fuzzy system modelling so that the parameters are learned optimally from data in terms of the radius-margin bound $J$.
  The radius-margin bound $J$ is the ratio of $S$, the squared radius of the smallest sphere enclosing the training data, to the squared margin; it does not hold for L1-SVM.
Background and Objectives(3/4)
Rule ranking and rule selection:
  Based on rule base structure [Setnes and Babuska 2001]: the SVD-QR with column pivoting algorithm and the pivoted QR decomposition method [Yen and Wang 1998, 1999; Setnes and Babuska 2001].
  Based on contribution of fuzzy rule consequents: more effective [Setnes and Babuska 2001]; OLS [Chen et al 1991].
  Based on both rule base structure and contribution of fuzzy rule consequents: highly desired [Setnes and Babuska 2001], but not yet reported in the literature.
Background and Objectives(4/4)
Another objective: two indices for fuzzy rule ranking.
  $\alpha$-values of fuzzy rules: contribution of rule consequents;
  $\omega$-values of fuzzy rules: rule base structure and contribution of rule consequents.
L2-SVM based Fuzzy Classifier Construction (1/10)
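Fuzzy rules:
\[ \text{Rule}_i:\ \text{if } x_1 \text{ is } A_i^1 \text{ and } \dots \text{ and } x_n \text{ is } A_i^n \text{ then } z = b_i \]
\[ \text{Rule}_0:\ \text{if } x_1 \text{ is } A_0^1 \text{ and } \dots \text{ and } x_n \text{ is } A_0^n \text{ then } z = b_0 \]
Firing strength of rule $i$:
\[ r_i(x) = \prod_{j=1}^{n} A_i^j(x_j) \]
Fuzzy Classifier:
\[ y = \frac{\sum_{i=1}^{L} b_i r_i(x) + b_0}{1 + \sum_{i=1}^{L} r_i(x)} \]
\[ f(x) = \operatorname{sgn}\Big( \sum_{i=1}^{L} b_i r_i(x) + b_0 \Big) \]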
L2-SVM based Fuzzy Classifier Construction (2/10)
Conditions of Applying SVM to Fuzzy Classifier Construction:
  The firing strengths $r_i(x)$ are Mercer kernels;
  If the membership functions $A_i^j(x_j)$ are generated from a reference function $a^j$ through location shift, then the $r_i(x)$ are Mercer kernels [Chen and Wang 2003];
  The reference function $a^j(z) = e^{-\sigma_j^2 z^2}$ ($\sigma_j > 0$) leads to Gaussian MFs;
  Kernel parameters were manually selected in [Chen and Wang 2003].
L2-SVM based Fuzzy Classifier Construction (3/10)
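L2-SVM based Fuzzy Classifier:
\[ f(x) = \operatorname{sgn}\Big( \sum_{i=1}^{L} b_i K(x, m_i) + b_0 \Big) \]
\[ K(x, m_i) = \prod_{j=1}^{n} a^j(x_j - m_i^j) = e^{-\sum_{j=1}^{n} \sigma_j^2 (x_j - m_i^j)^2} \]
Parameters optimally updated in terms of the radius-margin bound: the number of rules $L$, prototypes $m_i$, weights $b_i$, bias $b_0$, and scaling parameters $\sigma_j$.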
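The decision function above can be illustrated with a minimal Python sketch. It assumes the Gaussian form $e^{-\sigma_j^2 z^2}$ for the membership functions; the function names and the two-rule example values are hypothetical and are not taken from the slides.

```python
import math

def firing_strength(x, prototype, sigma):
    """r_i(x) = prod_j exp(-sigma_j^2 (x_j - m_i^j)^2): a product of Gaussian MFs."""
    return math.exp(-sum((s * (xj - mj)) ** 2
                         for xj, mj, s in zip(x, prototype, sigma)))

def classify(x, prototypes, weights, bias, sigma):
    """f(x) = sgn(sum_i b_i r_i(x) + b_0); returns +1 or -1 (a zero score maps to +1)."""
    score = bias + sum(b * firing_strength(x, m, sigma)
                       for m, b in zip(prototypes, weights))
    return 1 if score >= 0 else -1

# Hypothetical two-rule classifier in two dimensions.
prototypes = [(0.0, 0.0), (2.0, 2.0)]   # m_1, m_2
weights = [1.0, -1.0]                   # b_1, b_2
sigma = (0.5, 0.5)                      # scaling parameters sigma_j

label_near_first = classify((0.1, -0.1), prototypes, weights, 0.0, sigma)
label_near_second = classify((1.9, 2.1), prototypes, weights, 0.0, sigma)
```

A point close to a prototype fires that rule most strongly, so the sign of that rule's weight dominates the decision.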
L2-SVM based Fuzzy Classifier Construction (4/10)
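Two quadratic programming problems:
1) Maximise over $\alpha$
\[ W(\alpha, \theta) = \sum_{l=1}^{N} \alpha^{(l)} - \frac{1}{2} \sum_{l,k=1}^{N} \alpha^{(l)} \alpha^{(k)} y^{(l)} y^{(k)} \tilde{K}(x^{(l)}, x^{(k)}) \]
\[ \text{s.t.} \quad \sum_{l=1}^{N} \alpha^{(l)} y^{(l)} = 0, \quad \alpha^{(l)} \ge 0 \]
where $\alpha = [\alpha^{(1)}, \dots, \alpha^{(N)}]^T$ are Lagrangian multipliers, $\theta$ collects the kernel parameters $\sigma_j$ and the regularization parameter $C$, and
\[ \tilde{K}(x^{(k)}, x^{(l)}) = K(x^{(k)}, x^{(l)}) + \delta_{kl} / C \]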
L2-SVM based Fuzzy Classifier Construction (5/10)
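2)
\[ S(\theta) = \max_{\beta} \Big[ \sum_{l=1}^{N} \beta^{(l)} \tilde{K}(x^{(l)}, x^{(l)}) - \sum_{l,k=1}^{N} \beta^{(l)} \beta^{(k)} \tilde{K}(x^{(l)}, x^{(k)}) \Big] \]
\[ \text{s.t.} \quad \sum_{l=1}^{N} \beta^{(l)} = 1, \quad \beta^{(l)} \ge 0 \]
Radius-margin bound:
\[ J(\theta) = 2\, W(\alpha_0, \theta)\, S(\theta) \]
where $\alpha_0$ is the solution of problem 1) and $\theta$ collects the kernel parameters $\sigma_j$ and the regularization parameter $C$.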
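As a worked illustration of the two objectives and the bound, the sketch below evaluates them for given multipliers rather than solving the quadratic programs. It assumes the modified kernel has the form K + delta/C and a Gaussian kernel with scaling parameters; all function names are hypothetical. The two-point example is small enough that the optimal multipliers follow by symmetry.

```python
import math

def gaussian_kernel(x, z, sigma):
    """K(x, z) = exp(-sum_j sigma_j^2 (x_j - z_j)^2) (assumed kernel form)."""
    return math.exp(-sum((s * (a - b)) ** 2 for a, b, s in zip(x, z, sigma)))

def modified_gram(X, sigma, C):
    """Gram matrix of the L2-SVM kernel K~ = K + delta_kl / C."""
    n = len(X)
    return [[gaussian_kernel(X[l], X[k], sigma) + (1.0 / C if l == k else 0.0)
             for k in range(n)] for l in range(n)]

def dual_objective(alpha, y, Kt):
    """W = sum_l alpha_l - 1/2 sum_{l,k} alpha_l alpha_k y_l y_k K~_lk."""
    n = len(alpha)
    quad = sum(alpha[l] * alpha[k] * y[l] * y[k] * Kt[l][k]
               for l in range(n) for k in range(n))
    return sum(alpha) - 0.5 * quad

def radius_objective(beta, Kt):
    """S = sum_l beta_l K~_ll - sum_{l,k} beta_l beta_k K~_lk."""
    n = len(beta)
    lin = sum(beta[l] * Kt[l][l] for l in range(n))
    quad = sum(beta[l] * beta[k] * Kt[l][k] for l in range(n) for k in range(n))
    return lin - quad

def radius_margin_bound(alpha0, beta0, y, Kt):
    """J = 2 * S * W, evaluated at the optimal multipliers alpha0 and beta0."""
    return 2.0 * radius_objective(beta0, Kt) * dual_objective(alpha0, y, Kt)

# Toy problem: two 1-D points with opposite labels.
X = [(0.0,), (1.0,)]
y = [1, -1]
Kt = modified_gram(X, sigma=(1.0,), C=1.0)
k = Kt[0][1]
alpha0 = [1.0 / (Kt[0][0] - k)] * 2   # maximiser of W under alpha_1 = alpha_2
beta0 = [0.5, 0.5]                    # maximiser of S by symmetry
J = radius_margin_bound(alpha0, beta0, y, Kt)
```

For this symmetric two-point problem the enclosing sphere has radius equal to half the distance between the points in feature space, which is exactly the margin, so J evaluates to 1 up to rounding.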
L2-SVM based Fuzzy Classifier Construction (6/10)
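Automatic Model Selection Algorithm
Gradient-descent updates of the kernel parameters and the regularization parameter at iteration $t$, with learning rates $\eta_\sigma$ and $\eta_C$:
\[ \sigma_j(t+1) = \sigma_j(t) - \eta_\sigma \frac{\partial J}{\partial \sigma_j(t)}, \qquad C(t+1) = C(t) - \eta_C \frac{\partial J}{\partial C(t)} \]
where
\[ \frac{\partial J}{\partial \sigma_j(t)} = 2\, W(\alpha_0, \theta(t)) \frac{\partial S(\theta(t))}{\partial \sigma_j(t)} + 2\, S(\theta(t)) \frac{\partial W(\alpha_0, \theta(t))}{\partial \sigma_j(t)} \]
\[ \frac{\partial J}{\partial C(t)} = 2\, W(\alpha_0, \theta(t)) \frac{\partial S(\theta(t))}{\partial C(t)} + 2\, S(\theta(t)) \frac{\partial W(\alpha_0, \theta(t))}{\partial C(t)} \]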
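The update loop can be sketched in Python under simplifying assumptions: a single scalar parameter, plain gradient descent, and a stopping test on the change in J. The `evaluate` callable is hypothetical; in the setting of the slides it would re-solve the two quadratic programs at the current parameters and return S, W and their derivatives. A toy function stands in for it here.

```python
def bound_gradient(S, W, dS, dW):
    """Product rule for J = 2*S*W: dJ/dtheta = 2*W*dS + 2*S*dW."""
    return 2.0 * W * dS + 2.0 * S * dW

def minimise_bound(evaluate, theta, rate, tol=5e-5, max_iter=1000):
    """Gradient descent on J; stops when |J(t+1) - J(t)| falls below tol.

    evaluate(theta) must return (S, W, dS/dtheta, dW/dtheta).
    """
    S, W, dS, dW = evaluate(theta)
    J = 2.0 * S * W
    for _ in range(max_iter):
        theta = theta - rate * bound_gradient(S, W, dS, dW)
        S, W, dS, dW = evaluate(theta)
        J_new = 2.0 * S * W
        converged = abs(J_new - J) < tol
        J = J_new
        if converged:
            break
    return theta, J

def toy_evaluate(theta):
    """Toy stand-in (not an SVM): S = theta^2 + 1, W = 1/2, so J = theta^2 + 1."""
    return theta ** 2 + 1.0, 0.5, 2.0 * theta, 0.0

theta_final, J_final = minimise_bound(toy_evaluate, theta=1.0, rate=0.1)
```

With several parameters (each scaling parameter and C), the same step would be applied per parameter with its own learning rate.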
L2-SVM based Fuzzy Classifier Construction (7/10)
Extraction of Fuzzy Rules from L2-SVM Learning Results
The number of fuzzy rules $L$ is the number of support vectors.
The premise parts of fuzzy rules:
\[ A_i^j(x_j) = a^j(x_j - m_i^j) \]
where $m_i^j$ is the $j$th element of the $i$th support vector $m_i$.
The consequent parts of fuzzy rules:
\[ b_i = \alpha_0^{(i)} y^{(i)}, \quad i = 1, \dots, L \]
where $\alpha_0^{(i)}$ are the non-zero Lagrangian multipliers.
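The extraction step amounts to keeping the training points with non-zero multipliers. The sketch below is a minimal illustration; the function name, the dictionary layout and the tolerance `eps` used to decide "non-zero" are assumptions.

```python
def extract_rules(X, y, alpha, eps=1e-12):
    """Turn L2-SVM results into fuzzy rules.

    Each support vector (alpha > eps) becomes one rule: its coordinates are the
    centres m_i^j of the membership functions, and its consequent is
    b_i = alpha_i * y_i.
    """
    rules = []
    for x, label, a in zip(X, y, alpha):
        if a > eps:
            rules.append({"prototype": tuple(x),
                          "consequent": a * label,
                          "alpha": a})
    return rules

# Hypothetical training results: four points, two of them support vectors.
X = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
y = [1, -1, 1, -1]
alpha = [0.5, 0.5, 0.0, 0.0]

rules = extract_rules(X, y, alpha)
```

Here two rules are produced, centred at the first two points, with consequents 0.5 and -0.5.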
L2-SVM based Fuzzy Classifier Construction (8/10)
Fuzzy rule ranking based on L2-SVM learning
R-values of fuzzy rules [Setnes and Babuska 2001]: $|R_{ii}|$, the absolute values of the diagonal elements of the matrix $R$ in the QR decomposition of the firing strength matrix.
$\alpha$-values of fuzzy rules: $\alpha_0^{(i)}$, determining the depth of the effect of the rule consequent.
$\omega$-values of fuzzy rules, considering both rule base structure and the effect of the rule consequent:
\[ \omega_i = \frac{\alpha_0^{(i)}\, |R_{ii}|}{\max_i \alpha_0^{(i)} \cdot \max_i |R_{ii}|} \]
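A small Python sketch can show how the three indices may rank the same rules differently. It assumes the combined index is the product of the multiplier and |R_ii|, normalised by the two maxima, and that the R-values have already been computed and aligned with the rules (a pivoted QR decomposition permutes the columns, so that alignment is the caller's responsibility). Function names and numbers are hypothetical.

```python
def omega_values(alpha, r_values):
    """Combined index: alpha_i * |R_ii| / (max_i alpha_i * max_i |R_ii|)."""
    a_max = max(alpha)
    r_max = max(abs(r) for r in r_values)
    return [a * abs(r) / (a_max * r_max) for a, r in zip(alpha, r_values)]

def rank_rules(scores):
    """Rule indices sorted from most to least influential."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

# Hypothetical multipliers and R-values for three rules.
alpha = [0.1, 0.2, 0.05]
r_values = [1.0, 0.25, 0.5]

omega = omega_values(alpha, r_values)
ranking_by_alpha = rank_rules(alpha)
ranking_by_r = rank_rules([abs(r) for r in r_values])
ranking_by_omega = rank_rules(omega)
```

In this example the multiplier-based ranking puts the second rule first, the R-value ranking puts it last, and the combined index places it in between.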
L2-SVM based Fuzzy Classifier Construction (9/10)
Fuzzy rule selection procedure
1. Evaluate the misclassification rates (MRs) of $SVMFC^{(0)}$ on the validation data set V and the test data set T separately: $err_V^{(0)}$ and $err_T^{(0)}$;
2. Select the most influential fuzzy rules
\[ \{ \text{Rule}_{i^*} : \alpha_0^{(i^*)} \ge h_s \ \text{or}\ \omega_{i^*} \ge h_s \} \]
where $h_s$ ($h_s > 0$) is the threshold.
3. Construct a fuzzy classifier $SVMFC^{(s)}$ by using the influential fuzzy rules selected.
L2-SVM based Fuzzy Classifier Construction (10/10)
Fuzzy rule selection procedure (cont.)
4. Apply $SVMFC^{(s)}$ to the validation data set V and the test data set T to obtain new MRs $err_V^{(s)}$ and $err_T^{(s)}$;
5. If $err_V^{(s)} > err_V^{(0)}$, stop selection; otherwise, assign a higher threshold value and go to step 2.
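The selection procedure can be sketched as a loop over increasing thresholds. This is a minimal sketch, assuming that the last rule set whose validation error did not exceed the baseline is the one kept; the `validation_error` callable and the toy numbers are hypothetical.

```python
def select_rules(scores, thresholds, validation_error):
    """Threshold-based rule selection.

    scores: one ranking index value per rule.
    thresholds: candidate thresholds h_s in increasing order.
    validation_error: callable mapping a list of rule indices to the
        misclassification rate of the classifier built from those rules.
    """
    all_rules = list(range(len(scores)))
    baseline = validation_error(all_rules)          # err_V of the full classifier
    best = all_rules
    for h in thresholds:
        selected = [i for i, s in enumerate(scores) if s >= h]
        if validation_error(selected) > baseline:   # stop once validation error rises
            break
        best = selected
    return best

# Toy stand-in: the error stays at its baseline only while rules 0 and 1 are kept.
scores = [0.5, 0.25, 0.125, 0.01]

def toy_validation_error(selected):
    return 0.0145 if {0, 1} <= set(selected) else 0.05

kept = select_rules(scores, [0.05, 0.2, 0.4], toy_validation_error)
```

Only the validation set drives the stopping decision; the test set error is recorded for reporting.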
Experimental Results(1/6)
Benchmark problem: ringnorm
  2 classes; 7400 samples; 20 attributes;
  Theoretically expected MR: 1.3% [Breiman 1998];
  400 training samples; 5000 testing samples; 2000 validation samples.
Initial conditions:
  $C = 1$; $\sigma_j = 0.5$ for all $j$;
  Learning rates for updating $C$ and $\sigma_j$: 0.0001 and 0.01 respectively;
  Threshold on the change in the radius-margin bound $J$ for stopping the updates: $5 \times 10^{-5}$.
Experimental Results(2/6)
L2-SVM Induced Fuzzy Classifier: 249 fuzzy rules generated; MR: 1.32% on test data set.
Comparison with the well-known methods on generalization performance:

Algorithm                    MR
LDA                          24.54%
QDA                          2.6%
OLS-RBF with Gaussian BFs    2.52%
OLS-RBF with Cauchy BFs      3.12%
MLP                          13.0%
The proposed                 1.32%
Experimental Results(3/6)
Fuzzy rule ranking results:
[Figure: R values of the fuzzy rules; horizontal axis 0 to 250, vertical axis 0 to 0.16]
Experimental Results(4/6)
[Figure: $\alpha$ values of the fuzzy rules; horizontal axis 0 to 250, vertical axis 0 to 0.25]
Experimental Results(5/6)
[Figure: $\omega$ values of the fuzzy rules; horizontal axis 0 to 250, vertical axis 0 to 0.5]
Experimental Results(6/6)
Fuzzy rule selection results:

Using R-value index
h_s       No. of rules selected   err_V    err_T
0         249                     1.45%    1.32%
0.001     242                     1.45%    1.32%
0.002     214                     1.45%    1.32%
0.003     193                     1.80%    1.5%

Using $\alpha$-value index
h_s       No. of rules selected   err_V    err_T
0         249                     1.45%    1.32%
0.001     90                      1.45%    1.32%
0.002     89                      1.50%    1.32%
0.005     88                      1.55%    1.38%

Using $\omega$-value index
h_s       No. of rules selected   err_V    err_T
0         249                     1.45%    1.32%
0.0001    90                      1.45%    1.32%
0.0006    89                      1.45%    1.32%
0.0008    88                      1.50%    1.34%
Conclusions and Discussions(1/1)
L2-SVM has been applied to fuzzy rule induction for classification:
  Fuzzy rules are optimally generated in terms of the radius-margin bound.
  An efficient way of avoiding the “curse of dimensionality” in high dimensional space.
Two novel indices for fuzzy rule ranking:
  Shown experimentally to be very effective in producing parsimonious fuzzy classifiers.