knowledge extraction from support vector machines
TRANSCRIPT
Knowledge Extraction from
Support Vector Machines:
A Fuzzy Logic Approach
“Certain class of SVMs is mathematically equivalent to FARB”
What is an SVM?
What does it do?
Learns a hyperplane to classify data into two classes.
What is a hyperplane?
A hyperplane is defined by an equation much like that of a line,
𝑦 = 𝑚𝑥 + 𝑏
In fact, for a simple classification task with just 2 features, the hyperplane can be a line.
SVM finds the optimal solution.
Support Vector Machine
SVM attempts to maximize the margin, so that the hyperplane is just as far from the red balls as from the blue balls. In this way, it decreases the chance of misclassification.
More Formally
Input:
set of (input, output) training pair samples.
Output:
set of weights w (or 𝑤𝑖), one for each feature, whose linear combination predicts the value of y.
We use the optimization of maximizing the margin (the 'street width') to reduce the number of nonzero weights to just a few, corresponding to the important features that 'matter' in deciding the separating line (hyperplane). These nonzero weights correspond to the support vectors (because they 'support' the separating hyperplane).
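A sketch of this point (the slides name no library; this uses scikit-learn's SVC on made-up 2-D data): after training, only the points nearest the boundary are kept as support vectors, each carrying a nonzero multiplier.

```python
# Sketch: fit a linear SVM on made-up 2-D data and inspect which training
# points become support vectors (scikit-learn assumed, not from the slides).
import numpy as np
from sklearn.svm import SVC

X = np.array([[0.0, 0.0], [0.5, 0.5], [1.0, 0.0],   # class -1
              [3.0, 3.0], [3.5, 2.5], [4.0, 4.0]])  # class +1
y = np.array([-1, -1, -1, 1, 1, 1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)  # large C approximates a hard margin

# Only the points closest to the separating hyperplane are kept as
# support vectors; dual_coef_ holds their nonzero multipliers alpha_i * y_i.
print(clf.support_vectors_)
print(clf.dual_coef_)
```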
The optimization problem

minimize f(w) ≡ (1/2) ‖w‖²
subject to gi(w, b) ≡ −yi (w · xi + b) + 1 ≤ 0,  i = 1…m

We use Lagrange multipliers to get this problem into a form that can be solved analytically.
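A minimal numeric sketch of the constraint gi(w, b) ≤ 0 (the 1-D data and candidate hyperplane are made up): points on the margin give exactly 0, points beyond it give negative values.

```python
# Sketch: checking the margin constraints g_i(w, b) <= 0 for a candidate
# hyperplane on a tiny 1-D dataset (hypothetical numbers).
import numpy as np

X = np.array([[-2.0], [-1.0], [1.0], [3.0]])
y = np.array([-1, -1, 1, 1])

w, b = np.array([1.0]), 0.0  # candidate separating hyperplane x = 0

g = -y * (X @ w + b) + 1     # g_i <= 0 means x_i is on or outside the margin
print(g)                     # [-1.  0.  0. -2.]
```

The points at x = −1 and x = +1 sit exactly on the margin (g = 0); they would be the support vectors.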
What if things get more complicated?
Throw the balls into the air. While the balls are in the air, thrown up in just the right way, you can use a large sheet of paper to divide them.
This is mapping the data to a high-dimensional space.
Kernel
polynomial: (xi · xj + c)^p
Gaussian radial basis function: exp(−‖xi − xj‖² / (2σ²))
SVM does its thing: it maps the data into a higher dimension and then finds the hyperplane to separate the classes.
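A sketch of the kernel trick (hypothetical XOR-style data; scikit-learn's SVC is assumed): no line separates these four points in 2-D, but an RBF-kernel SVM classifies them perfectly by implicitly working in a higher-dimensional space.

```python
# Sketch: data that no line separates (an XOR pattern) becomes separable
# after the RBF kernel implicitly maps it to a higher-dimensional space.
import numpy as np
from sklearn.svm import SVC

X = np.array([[0, 0], [1, 1], [0, 1], [1, 0]], dtype=float)
y = np.array([-1, -1, 1, 1])  # XOR labels: not linearly separable in 2-D

clf = SVC(kernel="rbf", gamma=1.0, C=1e6).fit(X, y)
print(clf.predict(X))  # recovers all four labels
```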
Where does SVM get its name from?
• The decision function is fully specified by a (usually very small) subset of training samples, the support vectors.
• Support vectors are the data points that lie closest to the decision surface (or hyperplane)
• They are the data points most difficult to classify
• They have direct bearing on the optimum location of the decision surface
• they ‘support’ the separating hyperplane
Knowledge Extraction
Extracting the knowledge learned by a black-box classifier and representing it in a comprehensible form
Knowledge Extraction
Benefits :
• Validation
• Feature extraction
• Knowledge refinement and improvement
• Knowledge acquisition for symbolic AI systems
• Scientific discovery
Knowledge Extraction: Rule Extraction
• Methods for rule extraction (RE) from ANNs have been classified into three categories:
Decompositional
Pedagogical
Eclectic
decompositional approach for KE
SVM: The I/O mapping of the trained SVM f : Rⁿ → {−1, 1} is given by
f(x) = sign(h(x)),  where h(x) = b* + Σ_{i=1}^{Nsv} αi* yi K(x, s^i)
decompositional approach for KE
“Certain class of SVMs is mathematically equivalent to FARB”
What is a FARB (fuzzy all-permutations rule base) !?
Let’s take an example first !
Example:
Input: q ∈ R ,
Output: O ∈ R ,
And: a0, a1, k ∈ R, with k > 0.
Rules:
R1: If q is equal to k Then O = a0 + a1,
R2: If q is equal to −k Then O = a0 − a1,
- Linguistic terms: equal to k , equal to −k
- To express fuzziness, Gaussian membership functions are used:
  μk(q) = exp(−(q − k)²/(2σ²)),  μ−k(q) = exp(−(q + k)²/(2σ²))
These functions satisfy:
(μk(q) − μ−k(q)) / (μk(q) + μ−k(q)) = tanh(kq/σ²)
Applying the singleton fuzzifier and center-of-gravity defuzzifier yields:
O(q) = a0 + a1 tanh(kq/σ²)
But what does this output mean !??
Take a deeper look!
It is a feedforward ANN with a single neuron, employing the activation function tanh()!
So: this FRB is equivalent to an ANN.
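The equivalence can be checked numerically. A minimal sketch (the parameter values are made up) comparing the two-rule FARB output with the single tanh neuron:

```python
# Sketch: the two-rule FARB above, with Gaussian MFs and center-of-gravity
# defuzzification, equals a single tanh neuron (parameter values made up).
import numpy as np

a0, a1, k, sigma = 0.5, 2.0, 1.5, 1.0

def farb_output(q):
    # Rule 1 ("q equal to k")  -> O = a0 + a1, fired to degree mu_pos
    # Rule 2 ("q equal to -k") -> O = a0 - a1, fired to degree mu_neg
    mu_pos = np.exp(-(q - k) ** 2 / (2 * sigma ** 2))
    mu_neg = np.exp(-(q + k) ** 2 / (2 * sigma ** 2))
    return (mu_pos * (a0 + a1) + mu_neg * (a0 - a1)) / (mu_pos + mu_neg)

q = np.linspace(-4.0, 4.0, 101)
neuron = a0 + a1 * np.tanh(k * q / sigma ** 2)  # single tanh neuron
assert np.allclose(farb_output(q), neuron)
```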
This FRB, in particular, satisfies the definition of a FARB, which is:
To get the same output, apply the same steps as in the example.
But how is this output anything like the one in the example ?!?
And many other MFs produce this output form, given specific values of z, u, v, r, and g, such as the logistic function and others.
Apply: zi = ui = 1, vi = ri = 0, and gi(x) = tanh(x).
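A quick check of why the logistic function also fits the template (a standard identity, not from the slides): it is an affine rescaling of tanh, so it is covered by suitable choices of z, u, v, r, and g.

```python
# Standard identity: logistic(x) = 1/2 + (1/2) * tanh(x/2), so logistic
# neurons fit the same FARB template as tanh neurons.
import numpy as np

x = np.linspace(-6.0, 6.0, 201)
logistic = 1.0 / (1.0 + np.exp(-x))
assert np.allclose(logistic, 0.5 + 0.5 * np.tanh(x / 2))
```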
Result
Kolman and Margaliot: every standard ANN has a corresponding FARB. There is a transformation T:
This work extends that result: a certain class of SVMs satisfies the transformation P:
“Certain class of SVMs is mathematically equivalent to FARB”
The SVM-FARB Equivalence

h(x) = b* + Σ_{i=1}^{Nsv} αi* yi K(x, s^i)    (2)

O(q) = a0 + Σ_{i=1}^{m} ri ai + Σ_{i=1}^{m} zi ai gi(ui qi + vi)    (8)
Theorem 2 (SVM-FARB equivalence). Find a FARB with m = Nsv, inputs qi, and parameters a0, ai, zi, ri, ui, vi, gi so that these conditions hold:

a0 + Σ_{i=1}^{m} ri ai = b*,
zi ai = αi* yi,
gi(ui qi + vi) = K(x, s^i)    (15)
Pause and Think
• Let's say we have a FARB
• How many rules have we got?

If q1 is term¹± and … and qm is termᵐ±,
Then O = a0 ± a1 ± … ± am    (7)

With two linguistic terms per input, there is one rule for every sign pattern, so a FARB over m inputs has 2^m rules.
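A sketch of the rule count: enumerating one rule per choice of linguistic term for each of the m inputs gives 2^m rules.

```python
# Sketch: one FARB rule per choice of linguistic term for each of the
# m inputs, so a FARB over m inputs has 2**m rules.
from itertools import product

m = 3
rules = list(product(["equal to k", "equal to -k"], repeat=m))
print(len(rules))  # 8
```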
Famous SVM Kernels

K(x, y) = xᵀy    (linear kernel)
K(x, y) = (1 + xᵀy/c)^d,  c ∈ R, d ∈ N    (polynomial kernel)
K(x, y) = tanh(ρ xᵀy − δ),  ρ > 0, δ > 0    (MLP kernel)
K(x, y) = exp(−‖x − y‖² / (2σ̂²)),  σ̂ > 0    (RBF or Gaussian kernel)
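The four kernels can be written down directly. A sketch (parameter names c, d, rho, delta, sigma follow the formulas above; the values below are illustrative):

```python
# Sketch of the four kernels listed above (illustrative parameter values).
import numpy as np

def linear(x, y):
    return x @ y

def polynomial(x, y, c=1.0, d=3):
    return (1.0 + x @ y / c) ** d

def mlp(x, y, rho=0.5, delta=0.1):
    return np.tanh(rho * (x @ y) - delta)

def rbf(x, y, sigma=1.0):
    return np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))

x, y = np.array([1.0, 2.0]), np.array([0.5, -1.0])
for K in (linear, polynomial, mlp, rbf):
    assert np.isclose(K(x, y), K(y, x))  # all four are symmetric in x and y
```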
Corollary 1 {MLP kernel}
These parameters satisfy the conditions in (15):

h(x) = Σ_{i=1}^{Nsv} αi* yi tanh(ρ xᵀs^i − δ) + b*    (17)

O(q) = a0 + Σ_{i=1}^{m} ri ai + Σ_{i=1}^{m} zi ai gi(ui qi + vi)    (8)

For Gaussian MFs centered at ±k:
(μk(q) − μ−k(q)) / (μk(q) + μ−k(q)) = tanh(kq/σ²)

qi = xᵀs^i,
zi = 1, ri = 0,
ui = ρ, vi = −δ,
gi(x) = tanh(x),
a0 = b*,  ai = αi* yi
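A numerical sketch of Corollary 1 (all numbers are made up): the MLP-kernel SVM output and the FARB computed through Gaussian membership-function ratios coincide.

```python
# Sketch (hypothetical numbers): the MLP-kernel SVM output, eq. (17), is
# reproduced by a FARB whose firing strengths come from Gaussian MFs.
import numpy as np

rng = np.random.default_rng(0)
Nsv, n = 4, 3
S = rng.normal(size=(Nsv, n))        # support vectors s^i
alpha = rng.uniform(0.1, 1.0, Nsv)   # alpha_i^*
ysv = np.array([1.0, -1.0, 1.0, -1.0])
b, rho, delta = 0.3, 0.7, 0.2
x = rng.normal(size=n)

# SVM side, eq. (17):
h = b + np.sum(alpha * ysv * np.tanh(rho * (S @ x) - delta))

# FARB side: the ratio of Gaussian MFs centered at +/-k (here k = sigma = 1)
# equals tanh, so the fuzzy inference recovers h(x).
def mf_ratio(t, k=1.0, sigma=1.0):
    mu_pos = np.exp(-(t - k) ** 2 / (2 * sigma ** 2))
    mu_neg = np.exp(-(t + k) ** 2 / (2 * sigma ** 2))
    return (mu_pos - mu_neg) / (mu_pos + mu_neg)  # == tanh(k*t/sigma^2)

q = S @ x                                  # q_i = x^T s^i
O = b + np.sum(alpha * ysv * mf_ratio(rho * q - delta))
assert np.isclose(h, O)
```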
Pause and Think
qi = xᵀs^i appears in the FARB if-part. What could this mean?

qi = xᵀs^i = ‖x‖ ‖s^i‖ cos θi
qi = cos θi, if x and s^i are normalized
Corollary 2 {RBF kernel}
These parameters satisfy the conditions in (15):

h(x) = Σ_{i=1}^{Nsv} αi* yi exp(−‖x − s^i‖² / (2σ̂²)) + b*    (18)

O(q) = a0 + Σ_{i=1}^{m} ri ai + Σ_{i=1}^{m} zi ai gi(ui qi + vi)    (8)

For Gaussian MFs with the terms equal to k and not equal to k (μ≠k = 1 − μ=k):
(μ=k(q) − μ≠k(q)) / (μ=k(q) + μ≠k(q)) = 2 exp(−(q − k)²/(2σ²)) − 1

qi = ‖x − s^i‖,  ki = 0,  σi = σ̂,
zi = 2, ri = −1,
ui = 1/(√2 σ̂), vi = 0,
gi(x) = exp(−x²),
a0 = b* + (1/2) Σ_{i=1}^{Nsv} αi* yi,
ai = αi* yi / 2
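A numerical sketch of Corollary 2 (all numbers are made up): the RBF-kernel SVM output h(x), eq. (18), equals the FARB output, eq. (8), under the parameter mapping above.

```python
# Sketch (hypothetical numbers): RBF-kernel SVM output vs. the FARB
# obtained from the Corollary 2 parameter mapping.
import numpy as np

rng = np.random.default_rng(1)
Nsv, n = 5, 4
S = rng.normal(size=(Nsv, n))            # support vectors s^i
alpha = rng.uniform(0.1, 1.0, Nsv)       # alpha_i^*
ysv = np.array([1.0, -1.0, 1.0, 1.0, -1.0])
b, sigma_hat = -0.2, 1.3
x = rng.normal(size=n)

# SVM side, eq. (18):
h = b + np.sum(alpha * ysv *
               np.exp(-np.sum((x - S) ** 2, axis=1) / (2 * sigma_hat ** 2)))

# FARB side: z_i = 2, r_i = -1, u_i = 1/(sqrt(2)*sigma_hat), v_i = 0,
# g_i(t) = exp(-t^2), q_i = ||x - s^i||, a_i = alpha_i^* y_i / 2,
# a_0 = b^* + sum(alpha_i^* y_i) / 2.
a = alpha * ysv / 2
a0 = b + np.sum(alpha * ysv) / 2
q = np.linalg.norm(x - S, axis=1)
g = np.exp(-(q / (np.sqrt(2) * sigma_hat)) ** 2)
O = a0 + np.sum(-1.0 * a) + np.sum(2.0 * a * g)
assert np.isclose(h, O)
```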
Experimentation {Iris data set}
• 150 examples
• 4 features:
  • sepal length
  • sepal width
  • petal length
  • petal width
• 3 classes:
  • Setosa
  • Versicolor
  • Virginica
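As a sketch (the slides do not say which software was used; this uses scikit-learn's built-in copy of the Iris data), a linear SVM handles the 150-example, 4-feature, 3-class dataset well:

```python
# Sketch: a linear SVM on scikit-learn's copy of the Iris dataset
# (150 examples, 4 features, 3 classes, as on this slide).
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
clf = SVC(kernel="linear").fit(X, y)
acc = clf.score(X, y)   # training accuracy
print(acc)
```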
SVM
Results {SVM1}
Results {SVM2}