Knowledge Transfer via Multiple Model Local Structure Mapping
Jing Gao†  Wei Fan‡  Jing Jiang†  Jiawei Han†
†University of Illinois at Urbana-Champaign  ‡IBM T. J. Watson Research Center
2/17
Standard Supervised Learning
• training (labeled): New York Times
• test (unlabeled): New York Times
• Classifier accuracy: 85.5%
3/17
In Reality…
• training (labeled): Reuters — labeled New York Times data not available!
• test (unlabeled): New York Times
• Classifier accuracy: 64.1%
4/17
Domain Difference → Performance Drop

train            test              accuracy
New York Times → New York Times    85.5%    (ideal setting)
Reuters        → New York Times    64.1%    (realistic setting)
5/17
Other Examples
• Spam filtering
  – public email collection → personal inboxes
• Intrusion detection
  – existing types of intrusions → unknown types of intrusions
• Sentiment analysis
  – expert review articles → blog review articles
• The aim
  – to design learning methods that are aware of the difference between the training and test domains
• Transfer learning
  – adapt the classifiers learnt from the source domain to the new domain
6/17
All Sources of Labeled Information
• training (labeled): New York Times, Reuters, Newsgroup, ……
• test (completely unlabeled)
• Classifier → ?
7/17
A Synthetic Example
• Training sets (contain conflicting concepts)
• Test set
• The training and test distributions are partially overlapping
8/17
Goal
• Multiple source domains → one target domain
• To unify the knowledge from multiple source domains that is consistent with the target domain
9/17
Summary of Contributions
• Transfer from multiple source domains
  – Target domain has no labeled examples
• Do not need to re-train
  – Rely on base models trained from each domain
  – The base models are not necessarily developed for transfer learning applications
10/17
Locally Weighted Ensemble
• k base models C1, C2, …, Ck, each trained on its own training set (training set 1, …, training set k)
• Each model outputs a class-probability estimate for a test example x:
    f_i(x, y) = P(y = Y | x, C_i)
• The ensemble combines them with per-example local weights w_1(x), …, w_k(x):
    f_E(x, y) = Σ_{i=1}^{k} w_i(x) f_i(x, y),   with   Σ_{i=1}^{k} w_i(x) = 1
• Prediction: y | x = argmax_y f_E(x, y)
• x: feature value;  y: class label
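The combination rule on this slide can be sketched in a few lines of Python (a minimal illustration, not the authors' code; `lwe_predict` and the `Dummy` models are made up for the example):

```python
import numpy as np

def lwe_predict(models, weights, x):
    """Locally weighted ensemble: combine per-model class-probability
    estimates f_i(x, y) = P(y | x, C_i) with per-example weights w_i(x),
    then predict the class maximizing the weighted sum f_E(x, y)."""
    # probs has shape (k, n_classes); row i is model C_i's estimate at x
    probs = np.array([m.predict_proba(x)[0] for m in models])
    fE = weights @ probs          # f_E(x, y) = sum_i w_i(x) * f_i(x, y)
    return int(np.argmax(fE)), fE # argmax_y f_E(x, y)

# Toy stand-ins for trained base models (each just returns fixed probabilities)
class Dummy:
    def __init__(self, p): self.p = np.array([p])
    def predict_proba(self, x): return self.p

m1, m2 = Dummy([0.9, 0.1]), Dummy([0.4, 0.6])
label, fE = lwe_predict([m1, m2], np.array([0.7, 0.3]), None)
# fE = [0.7*0.9 + 0.3*0.4, 0.7*0.1 + 0.3*0.6] = [0.75, 0.25], so label = 0
```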
11/17
Optimal Local Weights
• For a test example x, suppose the models output:
    C1:   P(y=0 | x) = 0.9,  P(y=1 | x) = 0.1
    C2:   P(y=0 | x) = 0.4,  P(y=1 | x) = 0.6
  while the true conditional is
    true: P(y=0 | x) = 0.8,  P(y=1 | x) = 0.2
  → C1 should receive the higher weight
• Optimal weights
  – Solution to a regression problem
  – Impossible to obtain, since the true f is unknown!
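For two models, the regression problem has a closed form, which reproduces the numbers on this slide (a hedged sketch under the unrealistic assumption that the true conditional is known; `optimal_weight_pair` is an illustrative helper, not from the paper):

```python
import numpy as np

def optimal_weight_pair(f1, f2, f_true):
    """Minimize ||w*f1 + (1-w)*f2 - f_true||^2 over w, clipped to [0, 1].
    f1, f2 are the two models' probability vectors at x; f_true is the
    (in practice unknown) true conditional distribution at x."""
    d = f1 - f2
    w = float(d @ (f_true - f2)) / float(d @ d)
    return min(max(w, 0.0), 1.0)

# Numbers from the slide: C1 -> [0.9, 0.1], C2 -> [0.4, 0.6], true -> [0.8, 0.2]
w1 = optimal_weight_pair(np.array([0.9, 0.1]), np.array([0.4, 0.6]),
                         np.array([0.8, 0.2]))
# w1 = 0.8: the model closer to the truth at x gets the higher weight
```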
12/17
Graph-based Heuristics
• Graph-based weight approximation
  – Map the structures of a model onto the structures of the test domain
  – The weight of a model is proportional to the similarity between its neighborhood graph and the clustering structure around x
  – A model whose local structure matches the test-domain clustering receives a higher weight
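One simple way to realize this heuristic can be sketched as follows (a rough illustration of the idea, not the paper's exact algorithm; the function name and toy labels are made up): score each model by how often, among x's nearest test-set neighbors, its predicted-label graph agrees with the clustering graph, then normalize the scores into weights.

```python
import numpy as np

def local_graph_weights(model_labels, cluster_labels, neighbors, idx):
    """For each model, count neighbors u of x where 'same predicted label'
    agrees with 'same cluster', and normalize the scores into weights.

    model_labels   - list of k arrays: each model's predictions on the test set
    cluster_labels - array: cluster assignment of each test point
    neighbors      - indices of x's nearest neighbors in the test set
    idx            - index of the test example x
    """
    scores = []
    for labels in model_labels:
        agree = sum(
            (labels[idx] == labels[u]) == (cluster_labels[idx] == cluster_labels[u])
            for u in neighbors
        )
        scores.append(agree / len(neighbors))
    scores = np.array(scores, dtype=float)
    total = scores.sum()
    # fall back to uniform weights when no model matches the local structure
    return scores / total if total > 0 else np.full(len(scores), 1.0 / len(scores))

# Toy illustration (made-up labels): model 1's neighborhood graph matches the
# clustering around x (index 0); model 2's does not.
w = local_graph_weights(
    [np.array([0, 0, 0, 1]), np.array([0, 1, 1, 1])],  # per-model predictions
    np.array([0, 0, 0, 1]),                            # cluster assignments
    neighbors=[1, 2], idx=0)
# w gives all the weight to model 1 at this x
```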
13/17
Experiments Setup
• Data Sets
  – Synthetic data sets
  – Spam filtering: public email collection → personal inboxes (u01, u02, u03) (ECML/PKDD 2006)
  – Text classification: same top-level classification problems with different sub-fields in the training and test sets (Newsgroup, Reuters)
  – Intrusion detection data: different types of intrusions in the training and test sets
• Baseline Methods
  – One source domain: single models (WNN, LR, SVM)
  – Multiple source domains: SVM on each of the domains
  – Merge all source domains into one: ALL
  – Simple averaging ensemble: SMA
  – Locally weighted ensemble: LWE
14/17
Experiments on Synthetic Data
15/17
Experiments on Real Data
[Bar charts, accuracy on a 0.5–1.0 scale: WNN, LR, SVM, SMA, and LWE compared on the Spam, Newsgroup, and Reuters data sets; Set 1, Set 2, ALL, SMA, and LWE compared on the intrusion detection categories DOS, Probing, and R2L]
16/17
Conclusions
• Locally weighted ensemble framework
  – transfers useful knowledge from multiple source domains
• Graph-based heuristics to compute weights
  – make the framework practical and effective