Intelligent Database Systems Lab
Presenter: CHANG, SHIH-JIE
Authors: Luca Cagliero, Paolo Garza
2013.DKE.
Improving classification models with taxonomy information
Outline
• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments
Motivation
• A number of different approaches to building accurate classifiers have been proposed, but the integration of taxonomy information into the data used for classifier training has not been investigated so far.
Objectives
• This paper presents a general-purpose strategy to improve the accuracy of structured data classifiers by enriching the data with knowledge provided by a taxonomy built over data items.
Definition. Aggregation tree
Definition. Multiple-taxonomy
Let $T = \{t_1, \ldots, t_n\}$ be a set of attributes. A multiple-taxonomy $\Theta = \{AT_1, \ldots, AT_m\}$ is a forest of aggregation trees defined on the domains of attributes in $T$.
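As a rough illustration of the definition above, an aggregation tree can be stored as a child-to-parent map over an attribute's values, and a multiple-taxonomy as a dictionary of such maps keyed by attribute. The attribute names and values below (Location, Product, Turin, and so on) and the helper `ancestors` are hypothetical, chosen only for this sketch.

```python
# Hypothetical multiple-taxonomy: one aggregation tree per attribute,
# each stored as a child -> parent map (root values have no entry).
multiple_taxonomy = {
    "Location": {
        "Turin": "Italy",
        "Rome": "Italy",
        "Paris": "France",
        "Italy": "Europe",
        "France": "Europe",
    },
    "Product": {
        "Laptop": "Electronics",
        "Phone": "Electronics",
    },
}


def ancestors(attribute, value, taxonomy):
    """Return all generalizations of `value`, from nearest parent up to the root."""
    tree = taxonomy.get(attribute, {})
    result = []
    while value in tree:
        value = tree[value]
        result.append(value)
    return result


print(ancestors("Location", "Turin", multiple_taxonomy))  # ['Italy', 'Europe']
```

Attributes without an aggregation tree simply return an empty list, so they stay at their original granularity.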
Methodology
Methodology – Multiple-taxonomy over data items in D
Two-step process:
(i) Generalized classification rule mining.
Example: {(Location, Italy)} ⇒ {(User category, Entrepreneur)} (s = 50%, c = 100%)
(1) An extended version of the training dataset is generated first.
(2) An FP-tree-like representation of the extended dataset is generated. Only frequent items are included in the FP-tree.
(ii) Rule selection by means of lazy pruning.
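The first step can be sketched as follows, under loud assumptions: the training records, the Location taxonomy, and the helper names are all hypothetical, and the FP-tree and the frequent-item filter are skipped in favour of a direct count. The sketch only shows how extending records with their generalizations lets a rule such as {(Location, Italy)} ⇒ Entrepreneur be evaluated for support (s) and confidence (c).

```python
# Hypothetical taxonomy on the Location attribute (child -> parent).
TAXONOMY = {"Location": {"Turin": "Italy", "Rome": "Italy", "Paris": "France"}}

# Hypothetical training records: (attribute/value items, class label).
TRAINING = [
    ({"Location": "Turin"}, "Entrepreneur"),
    ({"Location": "Rome"}, "Entrepreneur"),
    ({"Location": "Paris"}, "Student"),
    ({"Location": "Paris"}, "Entrepreneur"),
]


def extend_record(items, taxonomy):
    """Return the record's items plus every ancestor of each item's value."""
    extended = set()
    for attribute, value in items.items():
        extended.add((attribute, value))
        tree = taxonomy.get(attribute, {})
        while value in tree:
            value = tree[value]
            extended.add((attribute, value))
    return extended


def support_confidence(antecedent, label, extended_data):
    """Support and confidence of the rule antecedent => label."""
    matched = [cls for items, cls in extended_data if antecedent <= items]
    correct = sum(1 for cls in matched if cls == label)
    support = correct / len(extended_data)
    confidence = correct / len(matched) if matched else 0.0
    return support, confidence


extended = [(extend_record(items, TAXONOMY), cls) for items, cls in TRAINING]
s, c = support_confidence({("Location", "Italy")}, "Entrepreneur", extended)
print(f"s={s:.0%}, c={c:.0%}")  # s=50%, c=100%
```

With these four made-up records, neither Turin nor Rome alone covers half of the data, while the generalized item (Location, Italy) does.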
Methodology – Lazy pruning
(1) Rules that only misclassify training data are pruned.
(2) Rules that correctly classify at least one training record are grouped in the Level I rule set, while rules that remain unused during the training phase are kept in Level II.
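A simplified sketch of this split is given below. It assumes the rules arrive already sorted best-first, that each rule is tested only on training records not yet covered by a higher-ranked Level I rule, and that records are sets of (attribute, value) items; these details and the name `lazy_prune` are assumptions of the sketch, not taken from the slides.

```python
def lazy_prune(sorted_rules, training):
    """Split rank-ordered rules into Level I and Level II (simplified sketch).

    sorted_rules: list of (antecedent, label), best rule first.
    training: list of (items, label), where items is a set of (attribute, value).
    """
    remaining = list(training)
    level1, level2 = [], []
    for antecedent, label in sorted_rules:
        covered = [rec for rec in remaining if antecedent <= rec[0]]
        if not covered:
            # Rule is never used on the training data: keep it as a spare.
            level2.append((antecedent, label))
        elif any(cls == label for _, cls in covered):
            # Rule classifies at least one record correctly: Level I.
            level1.append((antecedent, label))
            remaining = [rec for rec in remaining if rec not in covered]
        # Otherwise the rule only misclassifies training data and is pruned.
    return level1, level2


# Hypothetical training data and rules.
training = [
    ({("Location", "Turin"), ("Location", "Italy")}, "Entrepreneur"),
    ({("Location", "Paris"), ("Location", "France")}, "Student"),
]
rules = [
    ({("Location", "Italy")}, "Entrepreneur"),   # correct on record 1
    ({("Location", "France")}, "Entrepreneur"),  # only misclassifies record 2
    ({("Location", "France")}, "Student"),       # correct on record 2
    ({("Location", "Spain")}, "Student"),        # matches nothing
]
level1, level2 = lazy_prune(rules, training)
print(len(level1), len(level2))  # 2 1
```

Keeping the unused rules in Level II, rather than discarding them, is what makes the pruning "lazy": they remain available for test cases that no Level I rule covers.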
Methodology – The G−L3 algorithm
Methodology
Methodology – G−L3 class prediction
When a new test case rt has to be classified, G−L3 considers the sorted rule sets in Level I and Level II.
If none of the Level I rules match rt, then the top-ranked rule in Level II matching rt is considered.
If none of the rules belonging to the two model sets match rt, the default class label is assigned to rt.
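The prediction procedure described above might look like the following sketch. It assumes both rule lists are already sorted best-first and that the test record has been extended with its generalized items so that generalized rules can match; the function name `predict` and the sample rules are hypothetical.

```python
def predict(record_items, level1, level2, default_label):
    """Return the label of the first matching rule, trying Level I before Level II."""
    for rules in (level1, level2):
        for antecedent, label in rules:
            if antecedent <= record_items:
                return label
    # No rule in either level matches: fall back to the default class.
    return default_label


# Hypothetical rule sets, each a list of (antecedent, label), best rule first.
level1 = [({("Location", "Italy")}, "Entrepreneur")]
level2 = [({("Location", "France")}, "Student")]

italian = {("Location", "Rome"), ("Location", "Italy")}
french = {("Location", "Paris"), ("Location", "France")}
unknown = {("Location", "Oslo")}

print(predict(italian, level1, level2, "Other"))  # Entrepreneur
print(predict(french, level1, level2, "Other"))   # Student
print(predict(unknown, level1, level2, "Other"))  # Other
```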
Experiments – Dataset characteristics
Experiments – Accuracy comparison (baseline vs. extended)
Experiments – Accuracy comparison
Experiments –
Experiments – Execution time comparison
Conclusions
– Taxonomy integration is shown to yield significant accuracy improvements.
Comments
• Advantages
– More accurate.
• Applications
– Classification, data mining.