Dynamic Classifier Selection for Effective Mining from Noisy Data Streams

Xingquan Zhu, Xindong Wu, and Ying Yang, Proc. of KDD 2003. Presented 2005/3/25 by 董原賓.

Page 1: Dynamic Classifier Selection for Effective Mining from Noisy Data Streams

Dynamic Classifier Selection for Effective Mining from Noisy Data Streams

Xingquan Zhu, Xindong Wu, and Ying Yang, Proc. of KDD 2003

2005/3/25, presenter: 董原賓

Page 2

Problem

Problem: Many existing data stream mining efforts are based on Classifier Combination techniques, which have difficulty with dramatic concept drift and a significant amount of noise.

Solution: Choose the most reliable classifier.

Page 3

Multiple Classifier System (MCS)

MCS assumption: each base classifier has a particular sub-domain in which it is most reliable.

Two categories of MCS integration techniques:

Classifier Combination (CC) techniques: all base classifiers are combined to work out the final decision. Example: SAM (Select All Majority).

Classifier Selection (CS) techniques: the single best classifier is selected from the base classifiers to make the final decision.
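The contrast between the two categories can be illustrated with a minimal Python sketch. It assumes base classifiers are plain callables and that a reliability score per classifier is already available; the function names, the toy classifiers, and the scores are hypothetical and not taken from the paper.

```python
from collections import Counter

def combine_by_majority(classifiers, instance):
    """Classifier Combination: every base classifier votes; the majority label wins."""
    votes = Counter(clf(instance) for clf in classifiers)
    return votes.most_common(1)[0][0]

def select_best(classifiers, reliability, instance):
    """Classifier Selection: only the single most reliable classifier decides."""
    best = max(range(len(classifiers)), key=lambda i: reliability[i])
    return classifiers[best](instance)

# Three toy base classifiers (hypothetical) on a one-dimensional input.
classifiers = [
    lambda x: "pos" if x > 0 else "neg",
    lambda x: "pos" if x > 5 else "neg",
    lambda x: "pos" if x > 10 else "neg",
]
# Assumed reliability scores, e.g. accuracy measured on an evaluation set.
reliability = [0.9, 0.6, 0.7]

cc_label = combine_by_majority(classifiers, 3)       # two of the three vote "neg"
cs_label = select_best(classifiers, reliability, 3)  # the first classifier decides
print(cc_label, cs_label)
```

With these toy inputs the two strategies disagree, which is the situation where the choice between combination and selection matters.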

Page 4

Classifier Selection techniques

Two types of CS techniques:

Static Classifier Selection: the classifier is selected during the training phase. Example: CVM (Cross Validation Majority).

Dynamic Classifier Selection: the classifier is selected during the classification phase. It is called "dynamic" because the classifier used depends critically on the test instance itself. Example: DCS_LA (Dynamic Classifier Selection by Local Accuracy).
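A rough Python sketch of the two selection styles follows. It is a simplification under stated assumptions: the static variant scores classifiers on a single labelled validation set rather than by cross-validation, and the dynamic variant estimates local accuracy on the k nearest validation instances under Euclidean distance. All names and the toy data are hypothetical.

```python
import math

def accuracy(clf, labelled):
    """Fraction of (x, y) pairs in `labelled` that clf classifies correctly."""
    return sum(clf(x) == y for x, y in labelled) / len(labelled)

def static_select(classifiers, validation):
    """Static selection: one classifier is fixed before any test instance is seen."""
    return max(classifiers, key=lambda clf: accuracy(clf, validation))

def dynamic_select(classifiers, validation, x_test, k=3):
    """Dynamic selection: pick the classifier that is most accurate on the
    k validation instances nearest to x_test (a local-accuracy estimate)."""
    neighbours = sorted(validation, key=lambda xy: math.dist(xy[0], x_test))[:k]
    return max(classifiers, key=lambda clf: accuracy(clf, neighbours))

# Toy classifiers: each one is only right in part of the input space.
def always_a(x):
    return "a"

def always_b(x):
    return "b"

classifiers = [always_a, always_b]
validation = [((-3.0,), "a"), ((-2.0,), "a"), ((-1.0,), "a"),
              ((1.0,), "b"), ((2.0,), "b"), ((3.0,), "b"), ((4.0,), "b")]

static_choice = static_select(classifiers, validation)
left_choice = dynamic_select(classifiers, validation, (-2.5,))
right_choice = dynamic_select(classifiers, validation, (2.5,))
print(static_choice.__name__, left_choice.__name__, right_choice.__name__)
```

The static choice is the same for every test instance, while the dynamic choice changes with the region the test instance falls in.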

Page 5

Definition

Dataset $D$, training set $X$, test set $Y$, and evaluation set $Z$.

$N_x$, $N_y$, and $N_z$ denote the numbers of instances in $X$, $Y$, and $Z$ respectively.

$C_1, C_2, \dots, C_L$ are the $L$ base classifiers learned from $X$.

$C^*$ is the best classifier selected to classify each instance $I_x$ in $Y$.

Page 6

Definition

The instances in $D$ have $M$ attributes $A_1, A_2, \dots, A_M$, and each attribute $A_i$ contains $n_i$ values $V_1^{A_i}, \dots, V_{n_i}^{A_i}$.

For an attribute $A_i$, use its values to partition $Z$ into $n_i$ subsets $S_1^{A_i}, \dots, S_{n_i}^{A_i}$, where $S_1^{A_i} \cup \dots \cup S_{n_i}^{A_i} = Z$.

$I_k^{A_i}$ denotes instance $I_k$'s value on attribute $A_i$.

Page 7

Attribute-Oriented Dynamic Classifier Selection (AO-DCS)

Three steps of AO-DCS:

1. Partition the evaluation set into subsets by using the attribute values of the instances.

2. Evaluate the classification accuracy of each base classifier on all subsets.

3. For a test instance, use its attribute values to select the corresponding subsets, and select the base classifier that has the highest classification accuracy on them.
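The three steps can be written compactly. The notation below is a sketch, assuming (as the worked example on Page 13 suggests) that a classifier's score for a test instance is the plain average of its accuracies on the matching subsets. The symbols $\mathrm{Acy}$, $c(I_k)$, and $j_i(I_x)$ are introduced here for illustration and are not taken from the slides.

```latex
% Step 2: accuracy of base classifier C_l on subset S_j^{A_i} of Z,
% where c(I_k) is the true class label of I_k (assumed notation).
\mathrm{Acy}\bigl(C_l, S_j^{A_i}\bigr)
  = \frac{\bigl|\{\, I_k \in S_j^{A_i} : C_l(I_k) = c(I_k) \,\}\bigr|}
         {\bigl|S_j^{A_i}\bigr|}

% Step 3: j_i(I_x) is the index of the subset of attribute A_i
% that matches the test instance I_x (assumed notation).
\mathrm{AverageAcy}[l]
  = \frac{1}{M} \sum_{i=1}^{M} \mathrm{Acy}\bigl(C_l, S_{j_i(I_x)}^{A_i}\bigr),
\qquad
C^{*} = C_{l^{*}}, \quad
l^{*} = \arg\max_{1 \le l \le L} \mathrm{AverageAcy}[l]
```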

Page 8

Partition by attributes

Page 9

Partition By Attributes

Name    Gender  Age  Height
Mary    Female  29   163
Dave    Male    51   170
Martha  Female  63   149
Nancy   Female  35   157
John    Male    18   182

Base classifiers: $C_1$, $C_2$, $C_3$

Instance notation, e.g. $I_{Mary}$

Gender: Male ($S_1^G$), Female ($S_2^G$)
Age: < 30 ($S_1^A$), ≥ 30 ($S_2^A$)
Height: ≤ 160 ($S_1^H$), 161 ~ 180 ($S_2^H$), ≥ 181 ($S_3^H$)

$S_1^G$: $I_{Dave}$, $I_{John}$
$S_2^G$: $I_{Mary}$, $I_{Martha}$, $I_{Nancy}$
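The partition in this example can be reproduced with a short Python sketch. The data and the split points come from the slide; the tuple layout, the function names, and the plain-string subset labels such as `"S1G"` are illustrative choices.

```python
from collections import defaultdict

# Instances from the table: (name, gender, age, height).
Z = [
    ("Mary", "Female", 29, 163),
    ("Dave", "Male", 51, 170),
    ("Martha", "Female", 63, 149),
    ("Nancy", "Female", 35, 157),
    ("John", "Male", 18, 182),
]

# One function per attribute, mapping an instance to its subset label.
def gender_subset(inst):
    return "S1G" if inst[1] == "Male" else "S2G"

def age_subset(inst):
    return "S1A" if inst[2] < 30 else "S2A"

def height_subset(inst):
    if inst[3] <= 160:
        return "S1H"
    return "S2H" if inst[3] <= 180 else "S3H"

def partition(instances, subset_of):
    """Group instance names by the subset label that subset_of assigns."""
    subsets = defaultdict(list)
    for inst in instances:
        subsets[subset_of(inst)].append(inst[0])
    return dict(subsets)

by_gender = partition(Z, gender_subset)
by_age = partition(Z, age_subset)
by_height = partition(Z, height_subset)
print(by_gender)
print(by_age)
print(by_height)
```

Every instance lands in exactly one subset per attribute, so the subsets of each attribute together cover the whole set.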

Page 10

Evaluate the classification accuracy

[Figure labels: Partition by attributes; L base classifiers; Subsets from attribute $A_i$]

Page 11

The classification accuracy

Page 12

Dynamic Classifier Selection

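Page 13

        $S_1^G$  $S_2^G$  $S_1^A$  $S_2^A$  $S_1^H$  $S_2^H$  $S_3^H$
$C_1$   0.8      0.5      0.6      0.4      0.2      0.4      0.6
$C_2$   0.4      0.7      0.6      0.3      0.5      0.9      0.8
$C_3$   0.6      0.9      0.3      0.5      0.7      0.8      0.4

Name  Gender  Age  Height
Alex  Male    24   177

Alex falls in $S_1^G$ (Male), $S_1^A$ (Age < 30), and $S_2^H$ (Height 161 ~ 180), so each classifier's accuracy is averaged over these three subsets.

The accuracy of $C_1$: AverageAcy[1] = (0.8 + 0.6 + 0.4) / 3 = 0.6

AverageAcy[2] = (0.4 + 0.6 + 0.9) / 3 ≈ 0.63

AverageAcy[3] = (0.6 + 0.3 + 0.8) / 3 ≈ 0.57

$C_2$ has the highest average accuracy, so it is selected to classify Alex.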
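The arithmetic of this example can be checked with a few lines of Python. The accuracy values are copied from the table above; the dictionary layout and the function names are illustrative, and the selection rule (highest plain average over the matching subsets) is the one the example itself uses.

```python
# Accuracy of each base classifier on each subset (values from the table above).
accuracy = {
    "C1": {"S1G": 0.8, "S2G": 0.5, "S1A": 0.6, "S2A": 0.4, "S1H": 0.2, "S2H": 0.4, "S3H": 0.6},
    "C2": {"S1G": 0.4, "S2G": 0.7, "S1A": 0.6, "S2A": 0.3, "S1H": 0.5, "S2H": 0.9, "S3H": 0.8},
    "C3": {"S1G": 0.6, "S2G": 0.9, "S1A": 0.3, "S2A": 0.5, "S1H": 0.7, "S2H": 0.8, "S3H": 0.4},
}

def matching_subsets(gender, age, height):
    """Return the subset label of each attribute for one test instance."""
    g = "S1G" if gender == "Male" else "S2G"
    a = "S1A" if age < 30 else "S2A"
    h = "S1H" if height <= 160 else ("S2H" if height <= 180 else "S3H")
    return [g, a, h]

def select_classifier(accuracy, subsets):
    """Average each classifier's accuracy over the subsets and pick the highest."""
    averages = {
        name: sum(acc[s] for s in subsets) / len(subsets)
        for name, acc in accuracy.items()
    }
    best = max(averages, key=averages.get)
    return best, averages

alex_subsets = matching_subsets("Male", 24, 177)
best, averages = select_classifier(accuracy, alex_subsets)
print(alex_subsets)
print(best, {name: round(avg, 2) for name, avg in averages.items()})
```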

Page 14

Applying AO-DCS in Data Stream Mining

Steps:

Partition the streaming data into a series of chunks $S_1, S_2, \dots, S_i, \dots$, each of which is small enough to be processed by the algorithm at one time.

Then learn a base classifier $C_i$ from each chunk $S_i$.
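A minimal Python sketch of this chunk-and-learn loop is shown below. The chunk size, the toy stream, and the majority-label learner are hypothetical stand-ins; the slides do not say which base learner is used, and any real learner could take its place.

```python
from collections import Counter
from itertools import islice

def chunks(stream, chunk_size):
    """Yield successive lists of at most chunk_size items from an iterable stream."""
    it = iter(stream)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk

def learn_majority_classifier(chunk):
    """Hypothetical stand-in learner: always predicts the chunk's majority label."""
    majority = Counter(label for _, label in chunk).most_common(1)[0][0]
    return lambda x: majority

# Toy stream of (instance, label) pairs whose concept changes after instance 4.
stream = [(i, "a" if i < 5 else "b") for i in range(10)]

# One base classifier per chunk.
base_classifiers = [learn_majority_classifier(c) for c in chunks(stream, 4)]
predictions = [clf(None) for clf in base_classifiers]
print(predictions)
```

Because `chunks` pulls from an iterator, it never needs the whole stream in memory, which matches the requirement that each chunk be small enough to process at one time.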

Page 15

Applying AO-DCS in Data Stream Mining (cont.)

Evaluate all base classifiers and determine the "best" one for each test instance. If the number of base classifiers is too large, keep only the most recent K classifiers.

Note: the evaluation set $Z$ is constructed dynamically from the most recent instances, because they are likely to be consistent with the current test instances.
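Both "most recent" rules map naturally onto bounded queues. The sketch below is one possible bookkeeping scheme, not the authors' implementation: the values of `K` and `Z_SIZE`, the helper `on_new_chunk`, and the use of strings as placeholder classifiers are all assumptions, as is the idea that Z is filled with labelled instances from the latest chunks.

```python
from collections import deque

K = 3        # number of most recent base classifiers to keep (assumed value)
Z_SIZE = 5   # number of most recent labelled instances kept as Z (assumed value)

recent_classifiers = deque(maxlen=K)   # the oldest classifier is dropped automatically
evaluation_set = deque(maxlen=Z_SIZE)  # the oldest instances are dropped automatically

def on_new_chunk(classifier, chunk):
    """Register the classifier learned from a chunk and refresh Z with its instances."""
    recent_classifiers.append(classifier)
    evaluation_set.extend(chunk)

# Simulate five chunks of four labelled instances each; strings stand in for classifiers.
for chunk_id in range(5):
    chunk = [(chunk_id * 4 + j, "label") for j in range(4)]
    on_new_chunk(f"C{chunk_id + 1}", chunk)

print(list(recent_classifiers))
print([instance_id for instance_id, _ in evaluation_set])
```

After the loop, only the last three classifiers and the last five instances remain, so selection always runs against a bounded, recent view of the stream.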

Pages 16-22

Experiment