
Page 1: Data Selection For Support Vector Machine Classifier

Glenn Fung and Olvi L. Mangasarian

August 2000

2008-10-21, Kuan-Chi-I

Page 2: Data Selection For Support Vector Machine Classifier

Outline

- Introduction
- SVM
- MSVM
- Comparisons
- Conclusion

Page 3: Data Selection For Support Vector Machine Classifier

Introduction

A method for selecting a small set of support vectors which determines a separating-plane classifier. Useful for applications containing millions of data points.

Page 4: Data Selection For Support Vector Machine Classifier

SVM

A method for classification.

Page 5: Data Selection For Support Vector Machine Classifier

SVM (Linearly Separable Case)

Page 6: Data Selection For Support Vector Machine Classifier

SVM

Finding the maximum margin is equivalent to minimizing ½‖w‖². We can recast this problem as a quadratic program with parameter ν > 0.

A : a real m×n matrix.
e : a column vector of ones of appropriate dimension.
e′ : the transpose of e.
y : nonnegative slack variables.
D : an m×m diagonal matrix with diagonal entries +1 or −1.
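The quadratic program the slide refers to did not survive the transcript; a reconstruction consistent with the notation above (ν weights the slack term) is:

```latex
\min_{w,\gamma,y}\; \nu\, e'y + \tfrac{1}{2}\, w'w
\quad \text{s.t.} \quad D(Aw - e\gamma) + y \ge e, \quad y \ge 0
```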

Page 7: Data Selection For Support Vector Machine Classifier

SVM

Written in individual-component notation.

Ai : the i-th row of matrix A.
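The componentwise form of the constraints, reconstructed from the definitions above (Ai the i-th row of A, Dii the class label of point i):

```latex
D_{ii}\,(A_i w - \gamma) + y_i \ge 1, \qquad y_i \ge 0, \qquad i = 1, \dots, m
```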

Page 8: Data Selection For Support Vector Machine Classifier

SVM

x′w = γ + 1 bounds the class A+ points. x′w = γ − 1 bounds the class A− points.

γ : the location of the planes relative to the origin.
w : the normal to the bounding planes.

The linear separating surface is the plane x′w = γ.

Page 9: Data Selection For Support Vector Machine Classifier

SVM (Linearly Inseparable Case)


Page 10: Data Selection For Support Vector Machine Classifier

SVM (Inseparable)

If the classes are inseparable, then the two planes bound the two classes with a "soft margin".

Page 11: Data Selection For Support Vector Machine Classifier

MSVM (1-Norm SVM)

A minimal support vector machine (MSVM). To make use of a faster linear-programming-based approach, we reformulate (1) by replacing the 2-norm with a 1-norm as follows:
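The 1-norm reformulation itself was dropped from the transcript; a reconstruction in which only the objective changes from the quadratic program is:

```latex
\min_{w,\gamma,y}\; \nu\, e'y + \|w\|_1
\quad \text{s.t.} \quad D(Aw - e\gamma) + y \ge e, \quad y \ge 0
```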

Page 12: Data Selection For Support Vector Machine Classifier

MSVM

The mathematical program (7) is easily converted to a linear program as follows:

v : represents the absolute value |w| of w, with vi ≥ |wi|.
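With v bounding |w| componentwise, the equivalent linear program (a reconstruction, since the formula image did not survive) reads:

```latex
\min_{w,\gamma,y,v}\; \nu\, e'y + e'v
\quad \text{s.t.} \quad D(Aw - e\gamma) + y \ge e, \quad -v \le w \le v, \quad y \ge 0
```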

Page 13: Data Selection For Support Vector Machine Classifier

MSVM

If we define nonnegative multipliers u ∈ R^m associated with the first set of constraints of the linear program (8), and multipliers (r, s) ∈ R^(n+n) for the second set of constraints of (8), then the dual linear program associated with the linear SVM formulation (8) is the following:
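The dual itself was lost in the transcript. Applying LP duality to the 1-norm linear program (8), with u for the margin constraints and (r, s) for the second constraint set written as w + v ≥ 0 and v − w ≥ 0 (an assumption about how (8) is posed), gives a dual of the form:

```latex
\max_{u,r,s}\; e'u \quad \text{s.t.} \quad
A'Du + r - s = 0, \quad e'Du = 0, \quad r + s = e, \quad 0 \le u \le \nu e, \quad u, r, s \ge 0
```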

Page 14: Data Selection For Support Vector Machine Classifier

MSVM

We modify the linear program to generate an SVM with as few support vectors as possible by adding an error term e′y*. The term e′y* suppresses misclassified points and results in our minimal support vector machine (MSVM):

y* : the vector in R^m with components (y*)i = 1 if yi > 0 and 0 otherwise.
μ : a positive parameter, chosen by a tuning set.
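A reconstruction of the resulting MSVM program, adding the suppression term μ e′y* to the linear program's objective:

```latex
\min_{w,\gamma,y,v}\; \nu\, e'y + \mu\, e'y_* + e'v
\quad \text{s.t.} \quad D(Aw - e\gamma) + y \ge e, \quad -v \le w \le v, \quad y \ge 0
```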

Page 15: Data Selection For Support Vector Machine Classifier

MSVM

We approximate e′y* by a smooth concave exponential on the nonnegative real line, as was done in an earlier feature-selection approach. For y ≥ 0, the step vector y* of (9) is approximated componentwise by the concave exponential (y*)i ≈ 1 − exp(−α yi), α > 0, i = 1, . . . , m, that is:
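A minimal numeric illustration of the concave-exponential approximation of the step function; the function name step_approx and the sample values are illustrative, not from the slides:

```python
# Concave-exponential approximation of the step function t* on t >= 0:
# t* ~ 1 - exp(-alpha * t). It equals 0 at t = 0 and approaches 1 as
# alpha * t grows, so larger alpha gives a sharper approximation of the step.
import math

def step_approx(t, alpha=5.0):
    """Smooth approximation of the step: 0 at t == 0, tends to 1 for t > 0."""
    return 1.0 - math.exp(-alpha * t)

# The approximation is monotone in t, so larger slacks are suppressed more.
values = [step_approx(t) for t in (0.0, 0.5, 1.0, 2.0)]
```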

Page 16: Data Selection For Support Vector Machine Classifier

MSVM

The smooth MSVM:
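Substituting the concave-exponential approximation for e′y* gives the smooth MSVM (a reconstruction; exp(−αy) is taken componentwise):

```latex
\min_{w,\gamma,y,v}\; \nu\, e'y + \mu\, e'\bigl(e - \exp(-\alpha y)\bigr) + e'v
\quad \text{s.t.} \quad D(Aw - e\gamma) + y \ge e, \quad -v \le w \le v, \quad y \ge 0
```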

Page 17: Data Selection For Support Vector Machine Classifier

MSVM (SLA: Successive Linearization Algorithm)
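The SLA slide's content did not survive the transcript. Below is a minimal sketch of a successive linearization algorithm for the smooth MSVM, assuming the 1-norm LP formulation with variables (w, γ, y, v) described on the earlier slides; at each step the concave term is linearized around the current slacks and the resulting LP is solved. SciPy's HiGHS solver stands in for the paper's LP solver, and the function name msvm_sla and the toy data are illustrative, not from the paper:

```python
# Successive Linearization Algorithm (SLA) sketch for the smooth MSVM.
import numpy as np
from scipy.optimize import linprog

def msvm_sla(A, d, nu=1.0, mu=0.1, alpha=5.0, max_iter=20, tol=1e-6):
    """Return (w, gamma) defining the separating plane x'w = gamma."""
    m, n = A.shape
    DA = d[:, None] * A  # rows of A signed by their class labels (D @ A)
    # Variable layout: x = [w (n) | gamma (1) | y (m) | v (n)].
    # Margin constraints D(Aw - e*gamma) + y >= e, written as A_ub x <= b_ub:
    margin = np.hstack([-DA, d[:, None], -np.eye(m), np.zeros((m, n))])
    # 1-norm linearization constraints: w <= v and -w <= v.
    up = np.hstack([np.eye(n), np.zeros((n, 1)), np.zeros((n, m)), -np.eye(n)])
    lo = np.hstack([-np.eye(n), np.zeros((n, 1)), np.zeros((n, m)), -np.eye(n)])
    A_ub = np.vstack([margin, up, lo])
    b_ub = np.concatenate([-np.ones(m), np.zeros(2 * n)])
    bounds = ([(None, None)] * (n + 1)      # w and gamma are free
              + [(0, None)] * (m + n))      # y and v are nonnegative
    y_k = np.zeros(m)
    for _ in range(max_iter):
        # Linearize mu * e'(e - exp(-alpha*y)) around y_k: its gradient,
        # mu * alpha * exp(-alpha * y_k), joins the nu coefficient on y.
        c = np.concatenate([np.zeros(n + 1),
                            nu + mu * alpha * np.exp(-alpha * y_k),
                            np.ones(n)])
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
        y_new = res.x[n + 1:n + 1 + m]
        if np.abs(y_new - y_k).sum() < tol:
            y_k = y_new
            break
        y_k = y_new
    return res.x[:n], res.x[n]

# Toy separable data: class +1 near (2, 2), class -1 near (-2, -2).
A = np.array([[2., 2.], [3., 2.], [2., 3.], [3., 3.],
              [-2., -2.], [-3., -2.], [-2., -3.], [-3., -3.]])
d = np.array([1., 1., 1., 1., -1., -1., -1., -1.])
w, gamma = msvm_sla(A, d)
preds = np.sign(A @ w - gamma)
```

On this trivially separable toy set the algorithm converges in one or two LP solves, since the optimal slacks are zero.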

Page 18: Data Selection For Support Vector Machine Classifier

Comparison


Page 19: Data Selection For Support Vector Machine Classifier

Observations of Comparisons

1. For all test problems, MSVM had the fewest support vectors.

2. For the Ionosphere problem, the reduction in the number of support vectors of MSVM over SVM|·|1 is 81%, and the average reduction in the number of support vectors of MSVM over SVM|·| is 65.8%.

3. Tenfold testing-set correctness of MSVM was good.

4. Computing times were higher for MSVM than for the other classifiers.

Page 20: Data Selection For Support Vector Machine Classifier

Conclusion

- We proposed a minimal support vector machine (MSVM).
- It is useful for classifying very large datasets using only a fraction of the data.
- It improves generalization over other classifiers that use a higher number of data points.
- MSVM requires the solution of a few linear programs to determine a separating surface.