real-time ranking with concept drift using expert advice

Real-time Ranking with Concept Drift Using Expert Advice. Hila Becker and Marta Arias, Center for Computational Learning Systems, Columbia University

Upload: hila-becker

Post on 23-Dec-2014

DESCRIPTION

Hila Becker, Marta Arias, "Real-time ranking with concept drift using expert advice", in Proceedings of the Thirteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '07), 86-94

TRANSCRIPT

Page 1: Real-time ranking with concept drift using expert advice

Real-time Ranking with Concept Drift Using Expert Advice

Hila Becker and Marta Arias

Center for Computational Learning Systems, Columbia University

Page 2: Real-time ranking with concept drift using expert advice

Dynamic Ranking

Continuous arrival of data over time
Set of items to rank
  Dynamic features
  Adapt to changes

Given a list of electrical grid components, produce a ranking according to failure susceptibility

Page 3: Real-time ranking with concept drift using expert advice

Problem Setting

[Figure: batches of examples labeled + or - arrive at times 1, 2, 3, ..., t-1; the examples at time t are unlabeled (?). A model M assigns a predicted label (+). Each example has a feature vector x = (x1, x2, …, xn) and a label y.]

Page 4: Real-time ranking with concept drift using expert advice

Challenges

Changes in underlying distribution
  Hidden
  Concept drift
  Adapt learning model to improve predictions
Finite storage space
  Sample from the data
  Discard old or irrelevant information

Page 5: Real-time ranking with concept drift using expert advice

Concept Drift

[Figure: positive (+) and negative (-) examples plotted along a time axis]

Page 6: Real-time ranking with concept drift using expert advice

Ensemble Methods

[Figure: timeline; axis label: time]

Page 7: Real-time ranking with concept drift using expert advice

Weighted Expert Ensembles

Associate a weight with each expert
  Measure of belief in expert performance
Weights used in final prediction
  Use only the best expert
  Weighted average of predictions
Update the weights after every prediction
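The two ways of using the weights in the final prediction can be sketched as follows. This is only an illustration: the function names and the example numbers are assumptions, not taken from the slides.

```python
from typing import Sequence


def best_expert_prediction(weights: Sequence[float],
                           predictions: Sequence[float]) -> float:
    """Return the prediction of the single highest-weight expert."""
    best = max(range(len(weights)), key=lambda i: weights[i])
    return predictions[best]


def weighted_average_prediction(weights: Sequence[float],
                                predictions: Sequence[float]) -> float:
    """Return the average of all predictions, weighted by expert weight."""
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, predictions)) / total


# Hypothetical ensemble of three experts.
weights = [0.5, 0.3, 0.2]
predictions = [1.0, 0.0, 1.0]

best = best_expert_prediction(weights, predictions)
average = weighted_average_prediction(weights, predictions)
```

Using only the best expert ignores the rest of the ensemble, while the weighted average lets every expert contribute in proportion to its weight.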

Page 8: Real-time ranking with concept drift using expert advice

Weighted Majority Algorithm

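[Diagram: N experts e1, e2, e3, ..., eN each give a binary prediction for an unlabeled example (?), for example 1, 0, 0, ..., 1. The weighted sum w1*1 + w2*0 + w3*0 + ... + wN*1 is compared with 0.5: the ensemble predicts 1 if the sum is > 0.5 and 0 if it is < 0.5.]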
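A minimal sketch of the voting step in the diagram, together with the classic multiplicative update of the Weighted Majority algorithm. The weights are normalized so that the 0.5 threshold applies; the penalty factor `beta` and the tie-breaking rule (a tie predicts 0) are assumptions, since the slide does not state them.

```python
from typing import List, Sequence


def wm_predict(weights: Sequence[float], votes: Sequence[int]) -> int:
    """Predict 1 if the normalized weighted vote for 1 exceeds 0.5, else 0."""
    total = sum(weights)
    vote_for_one = sum(w for w, v in zip(weights, votes) if v == 1)
    return 1 if vote_for_one / total > 0.5 else 0


def wm_update(weights: Sequence[float], votes: Sequence[int],
              label: int, beta: float = 0.5) -> List[float]:
    """Multiply the weight of every expert that voted wrongly by beta."""
    return [w * beta if v != label else w for w, v in zip(weights, votes)]


# Four hypothetical experts voting 1, 0, 0, 1 with equal weights.
weights = [1.0, 1.0, 1.0, 1.0]
votes = [1, 0, 0, 1]

first_prediction = wm_predict(weights, votes)   # tie, so the sketch predicts 0
weights = wm_update(weights, votes, label=1)    # true label turns out to be 1
second_prediction = wm_predict(weights, votes)  # the correct experts now dominate
```

After one update the two experts that voted 0 carry half their previous weight, so the same votes now yield a prediction of 1.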

Page 9: Real-time ranking with concept drift using expert advice

Modified Weighted Majority

Different constraints for data streams
  Incorporate new data
  Static vs. dynamic set of experts
Ranking algorithm
  Loss function: 1 - normalized average rank of positive examples
  Combine predictions: weighted average rank
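Combining predictions by weighted average rank can be sketched as below. The sketch assumes every expert ranks the same set of items and that ties are broken by item name; the item names and weights are illustrative only.

```python
from typing import Dict, List, Sequence


def combine_rankings(rankings: Sequence[List[str]],
                     weights: Sequence[float]) -> List[str]:
    """Order items by their weighted average rank (rank 1 is the top)."""
    total = sum(weights)
    average_rank: Dict[str, float] = {}
    for ranking, weight in zip(rankings, weights):
        for position, item in enumerate(ranking, start=1):
            average_rank[item] = average_rank.get(item, 0.0) + weight * position / total
    # Lower average rank means higher in the combined ranking.
    return sorted(average_rank, key=lambda item: (average_rank[item], item))


# Three hypothetical experts ranking five items.
rankings = [
    ["F4", "F3", "F2", "F5", "F1"],
    ["F1", "F3", "F5", "F4", "F2"],
    ["F1", "F3", "F4", "F2", "F5"],
]
weights = [0.2, 0.3, 0.5]

combined = combine_rankings(rankings, weights)
```

Here F1 has a weighted average rank of 1.8 and F3 of 2.0, so the combined ranking starts with F1 and F3 even though the first expert placed F1 last.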

Page 10: Real-time ranking with concept drift using expert advice

Online Ranking Algorithm

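[Diagram: experts e1, e2, e3, ..., eB with weights w1, w2, w3, ..., wB each produce a ranking of the items F1 to F5 (for example F4, F2, F1, F3, F5 or F1, F3, F5, F4, F2); the rankings are combined into a single ranking of the unlabeled items (?). New experts eB+1, eB+2 with weights wB+1, wB+2 are added to the ensemble.]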

Page 11: Real-time ranking with concept drift using expert advice

Performance – Summer 05

Page 12: Real-time ranking with concept drift using expert advice

Performance – Winter 06

Page 13: Real-time ranking with concept drift using expert advice

Contributions

Additive weighted ensemble based on the Weighted Majority algorithm
Algorithm adapted to ranking
Experiments on a real-world data stream
  Outperform traditional approaches
  Explore performance/complexity tradeoffs

Page 14: Real-time ranking with concept drift using expert advice

Future Work

Ensemble diversity control
Exploit re-occurring contexts
  Use knowledge of cyclic patterns
  Revive old experts
Change detection
Statistical estimation of predicting ensemble size

Page 15: Real-time ranking with concept drift using expert advice

Ensemble Methods

Static ensemble with online learners [Hulten ’01]
Use batch learners as experts
  Can use many learning algorithms
  Loses interpretability
Additive ensembles
  Train an expert at constant intervals [Street and Kim ’01]
  Train an expert when performance declines [Kolter ’05]

Page 16: Real-time ranking with concept drift using expert advice

Ensemble Pruning

Additive ensembles can grow without bound
Criteria for removing experts
  Age: retire oldest model [Chu and Zaniolo ’04]
  Performance
    • Worst in the ensemble
    • Below a minimal threshold [Stanley ’01]
  Instance-based pruning [Wang et al. ’03]

Page 17: Real-time ranking with concept drift using expert advice

Dealing with a moving set of experts

Introduce new parameters
  B: “budget” (max number of models), set to 100
  p: new model's weight percentile, in [0, 100]
  an age penalty, in (0, 1]

If there are too many models (more than B), drop the models with poor q-score, where
  q_i = w_i • pow(age penalty, age_i)

I.e., the age penalty is the rate of exponential decay
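The budget rule can be sketched as follows. The `Expert` class, the parameter name `age_penalty`, and the choice to keep exactly the `budget` highest-scoring models are assumptions made for illustration; the slide only says that models with a poor q-score are dropped.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Expert:
    name: str
    weight: float
    age: int  # number of rounds since the expert was added


def q_score(expert: Expert, age_penalty: float) -> float:
    """q_i = w_i * age_penalty ** age_i: weight discounted exponentially by age."""
    return expert.weight * age_penalty ** expert.age


def prune(experts: List[Expert], budget: int, age_penalty: float) -> List[Expert]:
    """Keep at most `budget` experts, dropping those with the lowest q-score."""
    if len(experts) <= budget:
        return list(experts)
    ranked = sorted(experts, key=lambda e: q_score(e, age_penalty), reverse=True)
    return ranked[:budget]


# Hypothetical ensemble that exceeds a budget of 2.
experts = [
    Expert("old_strong", weight=0.9, age=10),
    Expert("new_weak", weight=0.3, age=0),
    Expert("mid", weight=0.5, age=2),
]
kept = prune(experts, budget=2, age_penalty=0.8)
kept_names = [e.name for e in kept]
```

With an age penalty of 0.8, the ten-round-old expert scores about 0.097 despite its high weight, so it is the one dropped. An age penalty of 1 disables the decay and reduces the rule to pruning by weight alone.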

Page 18: Real-time ranking with concept drift using expert advice

Performance Metric

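$$1 - \frac{\sum_i \mathrm{rank}(\mathrm{failure}_i)}{\#\mathrm{failures} \cdot \#\mathrm{feeders}}$$

Example: 1 - (2 + 3 + 5) / (3 * 8) = 0.5833

ranking  outages
1        0
2        1
3        1
4        0
5        1
6        0
7        0
8        0

[Plot: cumulative outages (1 to 3) against ranking position (1 to 8); pAUC = 17/24 ≈ 0.71]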
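The worked example on this slide can be reproduced with a short sketch. The function names are assumptions, and the pAUC definition used here (the sum of cumulative outage counts over all ranking positions, divided by #failures times #feeders) is inferred from the 17/24 figure rather than stated in the slides.

```python
from typing import Sequence


def normalized_rank_score(outages: Sequence[int]) -> float:
    """1 - (sum of failure ranks) / (#failures * #feeders).

    outages[i] is 1 if the feeder ranked at position i + 1 failed, else 0.
    """
    n_feeders = len(outages)
    failure_ranks = [rank for rank, failed in enumerate(outages, start=1) if failed]
    if not failure_ranks:
        raise ValueError("at least one failure is required")
    return 1 - sum(failure_ranks) / (len(failure_ranks) * n_feeders)


def pauc(outages: Sequence[int]) -> float:
    """Area under the cumulative-outage curve, normalized to [0, 1]."""
    n_failures = sum(outages)
    if n_failures == 0:
        raise ValueError("at least one failure is required")
    cumulative = 0
    area = 0
    for failed in outages:
        cumulative += failed
        area += cumulative
    return area / (n_failures * len(outages))


# Ranking from the slide: failures at positions 2, 3 and 5 out of 8 feeders.
outages = [0, 1, 1, 0, 1, 0, 0, 0]
score = normalized_rank_score(outages)  # 1 - 10/24
area = pauc(outages)                    # 17/24
```

Both quantities grow as failures move toward the top of the ranking; a ranking that places all three failures first would give a score of 1 - 6/24 = 0.75.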

Page 19: Real-time ranking with concept drift using expert advice

Budget Variation

Page 20: Real-time ranking with concept drift using expert advice

Data Streams

Continuous arrival of data over time
Real-world applications
  Consumer shopping patterns
  Weather prediction
  Electricity load forecasting
Increased attention
  Companies collect data
  Traditional approaches do not apply