Spoken Language Group
Chinese Information Processing Lab.
Institute of Information Science
Academia Sinica, Taipei, Taiwan
http://sovideo.iis.sinica.edu.tw/SLG/index.htm

Multiple Parameter Selection of Support Vector Machine

Hung-Yi Lo

2007/07/11

Outline

Phonetic Boundary Refinement Using Support Vector Machine (ICASSP’07, ICSLP’07)

Automatic Model Selection for Support Vector Machine (Distance Metric Learning for Support Vector Machine)

Automatic Model Selection for Support Vector Machine (Distance Metric Learning for Support Vector Machine)

Automatic Model Selection for SVM

Model selection is the problem of choosing parameter or model settings that give better generalization ability.

The support vector machine has two parameters: the regularization parameter C and the Gaussian kernel width parameter γ.

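Support vector machine formulation:

$$\min_{(w, b, \xi) \in \mathbb{R}^{n+1+m}} \; C e^\top \xi + \frac{1}{2} w^\top w$$

$$\text{s.t.} \quad D(Aw + eb) + \xi \ge e, \quad \xi \ge 0 \qquad \text{(QP)}$$

In the usual notation for this formulation, the rows of $A \in \mathbb{R}^{m \times n}$ are the $m$ training points, $D$ is the diagonal matrix of $\pm 1$ class labels, $e$ is a vector of ones, and $\xi$ is the slack vector.

Gaussian kernel:

$$K(x, y) = e^{-\gamma \sum_{i=1}^{n} (x_i - y_i)^2}$$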
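As a small illustration of the Gaussian kernel K(x, y) = exp(-γ Σ (x_i - y_i)²), the sketch below evaluates it for a few toy points and builds the kernel (Gram) matrix an SVM solver would consume. The function names `gaussian_kernel` and `kernel_matrix` and the sample points are assumptions made for this example; they are not from the talk.

```python
import math

def gaussian_kernel(x, y, gamma):
    """Return exp(-gamma * sum_i (x_i - y_i)^2)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same dimension")
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * sq_dist)

def kernel_matrix(points, gamma):
    """Gram matrix with entries K(points[i], points[j])."""
    return [[gaussian_kernel(p, q, gamma) for q in points] for p in points]

# Toy data: three points in the plane (illustrative values only).
points = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
K = kernel_matrix(points, gamma=0.5)
```

A larger γ makes the kernel decay faster with distance, which is why γ has to be tuned together with C.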

Automatic Model Selection for SVM

C.-M. Huang, Y.-J. Lee, Dennis K. J. Lin and S.-Y. Huang, "Model Selection for Support Vector Machines via Uniform Design", Computational Statistics and Data Analysis, special issue on Machine Learning and Robust Data Mining (to appear).

Automatic Model Selection for SVM

Strengths:

Automates the SVM training process; almost no human effort is needed.

The objective of the model selection procedure is directly related to testing performance. In my experimental experience, the test accuracy has always been better than that obtained by manual tuning.

The nested uniform-design-based method is much faster than exhaustive grid search.

Weaknesses:

No closed-form solution; an experimental search is required.

Time-consuming.
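To make the cost argument concrete, here is a minimal sketch of a nested (coarse-to-fine) search over (log2 C, log2 γ). It uses a plain evenly spaced grid at each stage as a stand-in for a uniform-design table, so it is not the method of the cited paper. The `score` callback, the `toy_score` function, and all default ranges are assumptions for illustration; in practice `score` would return something like cross-validated accuracy.

```python
import math

def nested_search(score, center=(0.0, 0.0), half_width=(8.0, 8.0),
                  points_per_axis=5, stages=3):
    """Coarse-to-fine search over (log2 C, log2 gamma).

    `score` is a caller-supplied function mapping (C, gamma) to a
    validation score (higher is better).
    """
    best_point, best_score = None, -math.inf
    evaluations = 0
    for _ in range(stages):
        # Evenly spaced candidates inside the current search box.
        axes = []
        for c, h in zip(center, half_width):
            step = 2.0 * h / (points_per_axis - 1)
            axes.append([c - h + k * step for k in range(points_per_axis)])
        for log_c in axes[0]:
            for log_g in axes[1]:
                s = score(2.0 ** log_c, 2.0 ** log_g)
                evaluations += 1
                if s > best_score:
                    best_point, best_score = (log_c, log_g), s
        # Re-center on the best point so far and halve the box.
        center = best_point
        half_width = tuple(h / 2.0 for h in half_width)
    best_c, best_gamma = 2.0 ** best_point[0], 2.0 ** best_point[1]
    return best_c, best_gamma, best_score, evaluations

# Toy stand-in for validation accuracy, peaked at C = 2^3, gamma = 2^-5.
def toy_score(C, gamma):
    return -((math.log2(C) - 3.0) ** 2 + (math.log2(gamma) + 5.0) ** 2)

best_c, best_gamma, best_score, evaluations = nested_search(toy_score)
```

With these defaults the search uses 75 evaluations and ends at a resolution of 1 in log2 units; an exhaustive grid at that resolution over the same box would need 17 × 17 = 289 evaluations. On a real, non-smooth validation surface a nested search can of course miss the global optimum.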

Distance Metric Learning

L. Yang, "Distance Metric Learning: A Comprehensive Survey", Ph.D. survey

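Much work has been done on learning a quadratic (Mahalanobis) distance measure:

$$d_{ij} = (x_i - x_j)^\top Q (x_i - x_j)$$

where $x_i$ is the input vector for the $i$th training case and $Q$ is a symmetric, positive semi-definite matrix.

Distance metric learning is equivalent to a feature transformation. Writing $Q = A^\top A$ and $y_i = A x_i$:

$$\begin{aligned}
d_{ij} &= (x_i - x_j)^\top A^\top A (x_i - x_j) \\
       &= (A x_i - A x_j)^\top (A x_i - A x_j) \\
       &= (y_i - y_j)^\top (y_i - y_j)
\end{aligned}$$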
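The equivalence between a quadratic distance with Q = AᵀA and a plain squared Euclidean distance after the transformation y = Ax can be checked numerically. The sketch below does this in plain Python; the matrix `A`, the two vectors, and the helper names are arbitrary choices made for this example.

```python
def mat_vec(A, x):
    """Matrix-vector product A x for a list-of-rows matrix."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def gram(A):
    """Return Q = A^T A."""
    n = len(A[0])
    return [[sum(row[i] * row[j] for row in A) for j in range(n)]
            for i in range(n)]

def quad_distance(x_i, x_j, Q):
    """(x_i - x_j)^T Q (x_i - x_j)."""
    diff = [a - b for a, b in zip(x_i, x_j)]
    return sum(d * q for d, q in zip(diff, mat_vec(Q, diff)))

def squared_euclidean(u, v):
    """(u - v)^T (u - v)."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

# Illustrative values only.
A = [[2.0, 0.0], [1.0, 1.0]]
x_i, x_j = [1.0, 2.0], [3.0, -1.0]

Q = gram(A)
d_metric = quad_distance(x_i, x_j, Q)                    # distance under Q
d_transformed = squared_euclidean(mat_vec(A, x_i),       # distance after y = A x
                                  mat_vec(A, x_j))
```

Both `d_metric` and `d_transformed` evaluate to 17.0 here, so learning Q and learning the linear map A are two views of the same problem.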

Supervised Distance Metric Learning
  Global Distance Metric Learning by Convex Programming
  Local: Local Adaptive Distance Metric Learning; Neighborhood Components Analysis; Relevant Component Analysis

Unsupervised Distance Metric Learning
  Linear embedding: PCA, MDS
  Nonlinear embedding: LLE, ISOMAP, Laplacian Eigenmaps

Distance Metric Learning based on SVM
  Large Margin Nearest Neighbor Based Distance Metric Learning
  Cast Kernel Margin Maximization into an SDP problem

Kernel Methods for Distance Metrics Learning
  Kernel Alignment with SDP
  Learning with Idealized Kernel

Distance Metric Learning

Strength: Usually has a closed-form solution.

Weakness: The objective of distance metric learning is based on some data distribution criterion, not on the evaluation performance.

Automatic Multiple Parameter Selection for SVM

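Gaussian kernel:

$$K(x, y) = e^{-\gamma \sum_{i=1}^{n} (x_i - y_i)^2}$$

Traditionally, each dimension of the feature vector is normalized to zero mean and unit standard deviation, so every dimension contributes equally to the kernel.

However, some features should be more important. Giving each dimension its own width parameter $\gamma_i$ yields

$$K(x, y) = e^{-\sum_{i=1}^{n} \gamma_i (x_i - y_i)^2}$$

which is equivalent to diagonal distance metric learning, with $Q = \mathrm{diag}(\gamma_1, \dots, \gamma_n)$:

$$K(x, y) = e^{-(x - y)^\top Q (x - y)}$$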
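The claim that a per-feature Gaussian kernel is the same as a quadratic-form kernel with a diagonal Q can be checked directly. In the sketch below, the function names and the sample values of `x`, `y`, and `gammas` are assumptions for illustration.

```python
import math

def weighted_gaussian_kernel(x, y, gammas):
    """Return exp(-sum_i gamma_i * (x_i - y_i)^2)."""
    if not (len(x) == len(y) == len(gammas)):
        raise ValueError("x, y and gammas must have the same length")
    return math.exp(-sum(g * (xi - yi) ** 2
                         for g, xi, yi in zip(gammas, x, y)))

def quadratic_form_kernel(x, y, Q):
    """Return exp(-(x - y)^T Q (x - y)) for a full matrix Q."""
    diff = [xi - yi for xi, yi in zip(x, y)]
    n = len(diff)
    quad = sum(diff[i] * Q[i][j] * diff[j]
               for i in range(n) for j in range(n))
    return math.exp(-quad)

# Illustrative values only.
x, y = [1.0, 0.0, 2.0], [0.0, 1.0, 0.0]
gammas = [0.5, 0.25, 0.125]

# Diagonal matrix Q = diag(gamma_1, ..., gamma_n).
Q = [[gammas[i] if i == j else 0.0 for j in range(3)] for i in range(3)]

k_weighted = weighted_gaussian_kernel(x, y, gammas)
k_quadratic = quadratic_form_kernel(x, y, Q)
# Setting all gamma_i equal recovers the ordinary single-gamma kernel.
k_uniform = weighted_gaussian_kernel(x, y, [0.5, 0.5, 0.5])
```

Moving from one γ to n values γ_i turns a two-parameter search (C, γ) into an (n + 1)-parameter search, which is the source of the extra cost.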

Automatic Multiple Parameter Selection for SVM

$$K(x, y) = e^{-(x - y)^\top Q (x - y)}$$

I would like to approach this task by experimental search, incorporating a data distribution criterion as a heuristic. This is much more time-consuming and might only be applicable to small data sets.

Feature selection is a similar task that can also be solved by experimental search, with each diagonal entry of the matrix Q restricted to zero or one. It is applicable to large data sets, but many publications on it already exist.
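The link to feature selection can be illustrated as follows: when the diagonal of Q contains only zeros and ones, the kernel exp(-(x - y)ᵀ Q (x - y)) equals the same kernel computed on the selected features alone. The sketch below enumerates every 0/1 diagonal for a three-dimensional toy example and checks this; the function names and data values are assumptions for illustration.

```python
import math
from itertools import product

def masked_kernel(x, y, mask):
    """Kernel with a diagonal Q whose entries mask[i] are 0 or 1."""
    return math.exp(-sum(m * (xi - yi) ** 2
                         for m, xi, yi in zip(mask, x, y)))

def subset_kernel(x, y, mask):
    """The same kernel evaluated on the selected features only."""
    xs = [xi for m, xi in zip(mask, x) if m]
    ys = [yi for m, yi in zip(mask, y) if m]
    return math.exp(-sum((a - b) ** 2 for a, b in zip(xs, ys)))

# Illustrative values only.
x, y = [1.0, 5.0, 2.0], [0.0, -3.0, 2.5]

# All 2^n candidate diagonals for n = 3 features.
all_masks = list(product([0, 1], repeat=len(x)))
agree = all(abs(masked_kernel(x, y, m) - subset_kernel(x, y, m)) < 1e-12
            for m in all_masks)
```

The search space here is discrete, with 2^n candidate masks for n features, whereas real-valued diagonal entries give a continuous n-dimensional search.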

Thank you!
