
Learning with Augmented Features for Heterogeneous Domain Adaptation

Lixin Duan, Dong Xu, Ivor Tsang

Nanyang Technological University, Singapore


Outline

• Background

• Heterogeneous Feature Augmentation (HFA)

– Learning Augmented Feature Representations

– Formulation with Support Vector Machine

– Kernelized HFA

• Experiments

– Object Recognition

– Multilingual Text Classification

• Conclusion


Background

• Domain Adaptation

– To retain and apply previous knowledge learned from some existing domain (a.k.a. the source domain) to improve learning in the new domain (a.k.a. the target domain).

– Traditional machine learning: training and test data are from the same domain

– Domain adaptation: training and test data are from different domains

[Figure legend: samples in domain A; samples in domain B]


Background

• Heterogeneous Domain Adaptation

– Source and target data have different feature dimensions

• English vs. Chinese

• Text vs. Images

• SIFT features vs. SURF features

• others



Heterogeneous Feature Augmentation

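• Mapping heterogeneous features into a common subspace

– $\mathbf{x}^s \in \mathbb{R}^{d_s}$ and $\mathbf{x}^t \in \mathbb{R}^{d_t}$, $d_s \neq d_t$

• Projection $P \in \mathbb{R}^{d_c \times d_s}$: $\mathbf{x}^s \to P\,\mathbf{x}^s$

• Projection $Q \in \mathbb{R}^{d_c \times d_t}$: $\mathbf{x}^t \to Q\,\mathbf{x}^t$

[Figure: $\mathbf{x}^s$ and $\mathbf{x}^t$ are mapped by $P$ and $Q$ into the common subspace $\mathbb{R}^{d_c}$]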


Heterogeneous Feature Augmentation

• Feature Augmentation

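– New feature mappings $\varphi_s(\mathbf{x}^s)$ and $\varphi_t(\mathbf{x}^t)$:

$$\varphi_s(\mathbf{x}^s) = \begin{bmatrix} P\mathbf{x}^s \\ \mathbf{x}^s \\ \mathbf{0}_{d_t} \end{bmatrix} \quad \text{and} \quad \varphi_t(\mathbf{x}^t) = \begin{bmatrix} Q\mathbf{x}^t \\ \mathbf{0}_{d_s} \\ \mathbf{x}^t \end{bmatrix}$$

– Data with heterogeneous features can be compared in the common subspace

– Similarities between data in the same domain can be enhanced by incorporating the original features [Daumé III, 2007]:

$$\breve{k}(\mathbf{x}_i, \mathbf{x}_j) = \begin{cases} 2 \cdot k(\mathbf{x}_i, \mathbf{x}_j), & \text{if } \mathbf{x}_i \text{ and } \mathbf{x}_j \text{ are from the same domain} \\ k(\mathbf{x}_i, \mathbf{x}_j), & \text{if } \mathbf{x}_i \text{ and } \mathbf{x}_j \text{ are from different domains} \end{cases}$$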
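The effect of the augmentation on similarities can be checked numerically. The sketch below is only an illustration: it assumes the augmented maps stack the projected features, the original features, and a zero block for the other domain, and it uses random matrices as stand-ins for the learned projections `P` and `Q`. The dimensions and the helper names `augment_source` and `augment_target` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d_s, d_t, d_c = 5, 3, 2                 # illustrative dimensions (assumed)
P = rng.standard_normal((d_c, d_s))     # source projection, random stand-in
Q = rng.standard_normal((d_c, d_t))     # target projection, random stand-in

def augment_source(x_s):
    """phi_s(x_s) = [P x_s; x_s; 0_{d_t}]."""
    return np.concatenate([P @ x_s, x_s, np.zeros(d_t)])

def augment_target(x_t):
    """phi_t(x_t) = [Q x_t; 0_{d_s}; x_t]."""
    return np.concatenate([Q @ x_t, np.zeros(d_s), x_t])

x_s1, x_s2 = rng.standard_normal(d_s), rng.standard_normal(d_s)
x_t1 = rng.standard_normal(d_t)

# Cross-domain similarity only involves the common subspace,
# because the original-feature blocks never overlap.
cross = augment_source(x_s1) @ augment_target(x_t1)

# Same-domain similarity adds the original inner product on top
# of the common-subspace term.
same = augment_source(x_s1) @ augment_source(x_s2)

print(np.isclose(cross, (P @ x_s1) @ (Q @ x_t1)))
print(np.isclose(same, (P @ x_s1) @ (P @ x_s2) + x_s1 @ x_s2))
```

Both augmented vectors have length d_c + d_s + d_t, so samples from the two domains become directly comparable even though d_s ≠ d_t.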


Linear HFA

• HFA Formulation with SVM

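– $\mathbf{w} = [\mathbf{w}_c', \mathbf{w}_s', \mathbf{w}_t']'$ is a feature weight vector

– $\xi_i^s$'s and $\xi_i^t$'s are slack variables

– $\lambda_p$ and $\lambda_q$ are pre-defined parameters

$$\begin{aligned} \min_{P,Q,\mathbf{w},b,\xi_i^s,\xi_i^t} \quad & \frac{1}{2}\|\mathbf{w}\|^2 + C\left(\sum_{i=1}^{n_s}\xi_i^s + \sum_{i=1}^{n_t}\xi_i^t\right) \\ \text{s.t.} \quad & y_i^s\left(\mathbf{w}'\varphi_s(\mathbf{x}_i^s) + b\right) \geq 1 - \xi_i^s, \ \xi_i^s \geq 0; \\ & y_i^t\left(\mathbf{w}'\varphi_t(\mathbf{x}_i^t) + b\right) \geq 1 - \xi_i^t, \ \xi_i^t \geq 0; \\ & \|P\|_F^2 \leq \lambda_p, \ \|Q\|_F^2 \leq \lambda_q \end{aligned}$$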


Linear HFA

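• Taking the dual w.r.t. $\mathbf{w}$, $b$, $\xi_i^s$ and $\xi_i^t$:

$$\begin{aligned} \min_{P,Q} \ \max_{\boldsymbol{\alpha}} \quad & \mathbf{1}_{n_s+n_t}'\boldsymbol{\alpha} - \frac{1}{2}(\boldsymbol{\alpha} \circ \mathbf{y})' K_{P,Q} (\boldsymbol{\alpha} \circ \mathbf{y}) \\ \text{s.t.} \quad & \mathbf{y}'\boldsymbol{\alpha} = 0, \ \mathbf{0}_{n_s+n_t} \leq \boldsymbol{\alpha} \leq C\,\mathbf{1}_{n_s+n_t}; \\ & \|P\|_F^2 \leq \lambda_p, \ \|Q\|_F^2 \leq \lambda_q \end{aligned}$$

– $\boldsymbol{\alpha}$: a vector of dual variables; $\mathbf{y}$: a vector of training labels

– $K_{P,Q} = \begin{bmatrix} X_s'(I_{d_s} + P'P)X_s & X_s'P'QX_t \\ X_t'Q'PX_s & X_t'(I_{d_t} + Q'Q)X_t \end{bmatrix}$

– It is nontrivial to determine the optimal dimension $d_c$ for the common subspace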
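As a sanity check on the block kernel in the dual, the following sketch builds it from data matrices and compares it with the Gram matrix of explicitly augmented features. It assumes samples are stored as columns of `X_s` and `X_t`; the random data, the dimensions, and the function names `kernel_pq` and `dual_objective` are illustrative assumptions, and no SVM solver is included.

```python
import numpy as np

rng = np.random.default_rng(0)
d_s, d_t, d_c, n_s, n_t = 4, 3, 2, 6, 5   # illustrative sizes (assumed)
X_s = rng.standard_normal((d_s, n_s))     # columns are source samples
X_t = rng.standard_normal((d_t, n_t))     # columns are target samples
P = rng.standard_normal((d_c, d_s))       # random stand-ins for P and Q
Q = rng.standard_normal((d_c, d_t))

def kernel_pq(X_s, X_t, P, Q):
    """Block kernel over all n_s + n_t training samples."""
    K_ss = X_s.T @ (np.eye(X_s.shape[0]) + P.T @ P) @ X_s
    K_st = X_s.T @ P.T @ Q @ X_t
    K_tt = X_t.T @ (np.eye(X_t.shape[0]) + Q.T @ Q) @ X_t
    return np.block([[K_ss, K_st], [K_st.T, K_tt]])

def dual_objective(alpha, y, K):
    """1' alpha - 0.5 * (alpha o y)' K (alpha o y) for a fixed kernel K."""
    ay = alpha * y
    return alpha.sum() - 0.5 * ay @ K @ ay

# Explicit augmented features: [P x; x; 0] for source, [Q x; 0; x] for target.
Phi_s = np.vstack([P @ X_s, X_s, np.zeros((d_t, n_s))])
Phi_t = np.vstack([Q @ X_t, np.zeros((d_s, n_t)), X_t])
Phi = np.hstack([Phi_s, Phi_t])

K = kernel_pq(X_s, X_t, P, Q)
print(np.allclose(K, Phi.T @ Phi))
```

The kernel only depends on `P` and `Q` through the products P'P, P'Q and Q'Q, which is what makes the subspace dimension d_c awkward to choose directly.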


Linear HFA

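• Transformation Metric

– Get rid of $d_c$:

$$H = [P, Q]'[P, Q] \in \mathbb{R}^{(d_s+d_t)\times(d_s+d_t)}$$

• Linear Formulation with SVM

$$\begin{aligned} \min_{H \succeq 0} \ \max_{\boldsymbol{\alpha}} \quad & \mathbf{1}_{n_s+n_t}'\boldsymbol{\alpha} - \frac{1}{2}(\boldsymbol{\alpha} \circ \mathbf{y})' K_H (\boldsymbol{\alpha} \circ \mathbf{y}) \\ \text{s.t.} \quad & \mathbf{y}'\boldsymbol{\alpha} = 0, \ \mathbf{0}_{n_s+n_t} \leq \boldsymbol{\alpha} \leq C\,\mathbf{1}_{n_s+n_t}; \ \operatorname{trace}(H) \leq \lambda \end{aligned}$$

– $K_H = \begin{bmatrix} X_s'X_s + O_s'HO_s & O_s'HO_t \\ O_t'HO_s & X_t'X_t + O_t'HO_t \end{bmatrix}$

– $O_s = \begin{bmatrix} I_{d_s} \\ \mathbf{0}_{d_t \times d_s} \end{bmatrix} X_s$, $O_t = \begin{bmatrix} \mathbf{0}_{d_s \times d_t} \\ I_{d_t} \end{bmatrix} X_t$, $\lambda = \lambda_p + \lambda_q$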
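The reparameterization can be verified numerically. The sketch below is an illustration under assumed sizes and random stand-ins for `P` and `Q`: it forms H = [P, Q]'[P, Q], rebuilds the kernel from H alone, and checks that it matches the kernel written in terms of P and Q. It also checks that trace(H) equals the sum of the two squared Frobenius norms, which is why the two norm constraints can be merged into one trace constraint with λ = λ_p + λ_q.

```python
import numpy as np

rng = np.random.default_rng(1)
d_s, d_t, d_c, n_s, n_t = 4, 3, 2, 6, 5   # illustrative sizes (assumed)
X_s = rng.standard_normal((d_s, n_s))     # columns are source samples
X_t = rng.standard_normal((d_t, n_t))     # columns are target samples
P = rng.standard_normal((d_c, d_s))
Q = rng.standard_normal((d_c, d_t))

# Transformation metric: d_c no longer appears in its shape.
PQ = np.hstack([P, Q])                    # d_c x (d_s + d_t)
H = PQ.T @ PQ                             # (d_s + d_t) x (d_s + d_t)

# O_s = [I; 0] X_s and O_t = [0; I] X_t, i.e. zero-padded data matrices.
O_s = np.vstack([X_s, np.zeros((d_t, n_s))])
O_t = np.vstack([np.zeros((d_s, n_t)), X_t])
O = np.hstack([O_s, O_t])

# K_H = K_0 + O' H O, where K_0 holds the original within-domain kernels.
K_0 = np.block([[X_s.T @ X_s, np.zeros((n_s, n_t))],
                [np.zeros((n_t, n_s)), X_t.T @ X_t]])
K_H = K_0 + O.T @ H @ O

# The same kernel written directly in terms of P and Q.
K_PQ = np.block([
    [X_s.T @ (np.eye(d_s) + P.T @ P) @ X_s, X_s.T @ P.T @ Q @ X_t],
    [X_t.T @ Q.T @ P @ X_s, X_t.T @ (np.eye(d_t) + Q.T @ Q) @ X_t],
])

print(np.allclose(K_H, K_PQ))
print(np.isclose(np.trace(H), np.linalg.norm(P, "fro") ** 2
                 + np.linalg.norm(Q, "fro") ** 2))
```

Writing the kernel as K_0 + O'HO also makes it visible that, for fixed dual variables, the objective is affine in H.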


Linear HFA

• Solution: Iteratively update the transformation metric $H$ using semidefinite programming (SDP) and the dual variable $\boldsymbol{\alpha}$ using SVM

• Issues in Linear HFA

– $H$ is linear, which may not be effective for some tasks

– Infeasible for applications with very high dimensional data

• Can we learn $H$ with its size dependent on the number of training samples?

→ Kernelization


Kernelized HFA

• Notations

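– Nonlinear mapping function $\phi: \mathbf{x} \to \phi(\mathbf{x})$

– Kernel function $k(\mathbf{x}_i, \mathbf{x}_j) = \phi(\mathbf{x}_i)'\phi(\mathbf{x}_j)$

– Kernel matrices $K_s = \Phi_s'\Phi_s$ and $K_t = \Phi_t'\Phi_t$

– Projection matrices $P_\phi$ and $Q_\phi$

– Nonlinear feature transformation $\tilde{H}$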


Kernelized HFA

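• Learn $\tilde{P} \in \mathbb{R}^{d_c \times n_s}$ and $\tilde{Q} \in \mathbb{R}^{d_c \times n_t}$ instead of $P_\phi$ and $Q_\phi$

• Nonlinear Transformation Metric

$$\tilde{H} = [\tilde{P}, \tilde{Q}]'[\tilde{P}, \tilde{Q}] \in \mathbb{R}^{(n_s+n_t)\times(n_s+n_t)}$$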


Kernelized HFA

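• Kernelized Formulation with SVM

$$\begin{aligned} \min_{\tilde{H} \succeq 0} \ \max_{\boldsymbol{\alpha}} \quad & \mathbf{1}_{n_s+n_t}'\boldsymbol{\alpha} - \frac{1}{2}(\boldsymbol{\alpha} \circ \mathbf{y})' K_{\tilde{H}} (\boldsymbol{\alpha} \circ \mathbf{y}) \\ \text{s.t.} \quad & \mathbf{y}'\boldsymbol{\alpha} = 0, \ \mathbf{0}_{n_s+n_t} \leq \boldsymbol{\alpha} \leq C\,\mathbf{1}_{n_s+n_t}; \ \operatorname{trace}(\tilde{H}) \leq \lambda \end{aligned}$$

– $K_{\tilde{H}} = \begin{bmatrix} K_s + \tilde{O}_s'\tilde{H}\tilde{O}_s & \tilde{O}_s'\tilde{H}\tilde{O}_t \\ \tilde{O}_t'\tilde{H}\tilde{O}_s & K_t + \tilde{O}_t'\tilde{H}\tilde{O}_t \end{bmatrix}$

– $\tilde{O}_s = \begin{bmatrix} I_{n_s} \\ \mathbf{0}_{n_t \times n_s} \end{bmatrix} K_s^{1/2}$, $\tilde{O}_t = \begin{bmatrix} \mathbf{0}_{n_s \times n_t} \\ I_{n_t} \end{bmatrix} K_t^{1/2}$

• Solution: Iteratively update the transformation metric $\tilde{H}$ and the dual variable $\boldsymbol{\alpha}$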
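To make the kernelized quantities concrete, here is a minimal sketch that assembles the kernelized block kernel from two kernel matrices. Several choices are assumptions made only for illustration: the RBF kernel and its `gamma`, the random data (stored with samples as rows inside `rbf_kernel`), the random stand-ins for the learned matrices, and the helper names `psd_sqrt` and `rbf_kernel`. The matrix square root is computed by eigendecomposition.

```python
import numpy as np

def psd_sqrt(K):
    """Symmetric square root of a PSD matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(K)
    vals = np.clip(vals, 0.0, None)        # guard against tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T

def rbf_kernel(X, gamma):
    """RBF kernel matrix; rows of X are samples (an illustrative kernel choice)."""
    sq = np.sum(X ** 2, axis=1)
    dist2 = np.clip(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0, None)
    return np.exp(-gamma * dist2)

rng = np.random.default_rng(2)
n_s, n_t, d_s, d_t, d_c = 6, 4, 5, 3, 2    # illustrative sizes (assumed)
K_s = rbf_kernel(rng.standard_normal((n_s, d_s)), gamma=0.5)
K_t = rbf_kernel(rng.standard_normal((n_t, d_t)), gamma=0.5)

# Random stand-ins for the learned matrices; H_tilde is PSD by construction.
P_tilde = rng.standard_normal((d_c, n_s))
Q_tilde = rng.standard_normal((d_c, n_t))
PQ = np.hstack([P_tilde, Q_tilde])
H_tilde = PQ.T @ PQ                        # (n_s + n_t) x (n_s + n_t)

# O_s = [I; 0] K_s^{1/2} and O_t = [0; I] K_t^{1/2}.
O_s = np.vstack([psd_sqrt(K_s), np.zeros((n_t, n_s))])
O_t = np.vstack([np.zeros((n_s, n_t)), psd_sqrt(K_t)])
O = np.hstack([O_s, O_t])

K_0 = np.block([[K_s, np.zeros((n_s, n_t))],
                [np.zeros((n_t, n_s)), K_t]])
K_H = K_0 + O.T @ H_tilde @ O              # kernel passed to the SVM step

print(H_tilde.shape, K_H.shape)
```

The size of the metric now depends on n_s + n_t rather than d_s + d_t, and the original features enter only through the kernel matrices.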


Related Work

• ARC-t [Kulis et al., 2011]

– It learns an asymmetric transformation metric between the different feature spaces.

• DAMA [Wang and Mahadevan, 2011]

– It learns a common feature subspace by utilizing the class labels of the source and target training data for manifold alignment.

• HeMap [Shi et al., 2010]

– Unsupervised. It finds the projection matrices for a common feature subspace and learns the optimal projected data from both domains.


Experiments

• Object Dataset [Kulis et al., 2011]

– 4,106 images with 31 categories from three sources

– Source: amazon or webcam; Target: dslr

– Test data: The remaining dslr images are used

– Classification accuracy

$$\mathrm{ACC} = \frac{\#\,\text{correctly classified samples}}{\#\,\text{total test samples}}$$
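The accuracy measure is a simple ratio; a minimal sketch follows. The function name `accuracy` and the toy labels are hypothetical and are not results from the experiments.

```python
def accuracy(y_true, y_pred):
    """Fraction of test samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("accuracy is undefined for an empty test set")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Toy labels, for illustration only: 2 of the 4 predictions match.
print(accuracy([1, -1, 1, 1], [1, 1, 1, -1]))
```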


Experiments

• Reuters Multilingual Dataset [Amini et al., 2009]

– 11,547 documents with 6 classes from 5 sources

– Source: English, French, German or Italian; Target: Spanish

– Test data: The remaining Spanish documents are used

– Classification accuracy

$$\mathrm{ACC} = \frac{\#\,\text{correctly classified samples}}{\#\,\text{total test samples}}$$


Experiments

• Results

– Object dataset [Kulis et al., 2011]

– Reuters multilingual dataset [Amini et al., 2009]

– Our HFA method is significantly better than the other methods under both settings, judged by the t-test at a significance level of 0.05


Experiments

• Results w.r.t. # Target Training Samples Per Class

– Reuters multilingual dataset [Amini et al., 2009]

• Convergence Analysis

– “back_pack” on the object dataset

– “C15” on the Reuters multilingual dataset

– Fewer than 80 and 40 iterations on the two datasets, respectively


References

Amini, M., Usunier, N., and Goutte, C. Learning from multiple partially observed views – an application to multilingual text categorization. In NIPS, 2009.

Daumé III, H. Frustratingly easy domain adaptation. In ACL, 2007.

Kulis, B., Saenko, K., and Darrell, T. What you saw is not what you get: Domain adaptation using asymmetric kernel transforms. In CVPR, 2011.

Shi, X., Liu, Q., Fan, W., Yu, P. S., and Zhu, R. Transfer learning on heterogenous feature spaces via spectral transformation. In ICDM, 2010.

Wang, C. and Mahadevan, S. Heterogeneous domain adaptation using manifold alignment. In IJCAI, 2011.


Thank you!

Q & A