CSc288 Term Project: Data Mining to Predict the Voice-over-IP Phone Market
Huaqin Xu
Agenda
- Abstract
- Introduction
- Methodology
- Result
- Conclusion
- Learning Experience
- References
Abstract
This project is based on VoIP survey data sets. Classifiers from the Weka Explorer were chosen as the data mining tool to build models that predict potential customers of VoIP phones and identify the most important features and services of two VoIP phone models.
Introduction
Background
- VoIP phones have a potential market opportunity given the wide use of Internet service.
- Two VoIP phone models: Basic and Deluxe
Data mining scope
- Customer
- Product features and services
Methodology
Data mining tools
- C4.5/C5.0, Cubist
- Weka
- Microsoft SQL Server
- SPSS
Chosen: Weka Explorer
Why? Free, easy to use, good interface, more choices...
Methodology
Explorer vs. KnowledgeFlow
Methodology
Datasets: 94 instances in total
Methodology: Preprocessing
Split table
- Customer: 17 attributes
- Basic-model: 14 attributes
- Deluxe-model: 10 attributes
Processing missing data
- Delete
- Replace with "?"
Convert data format
- SPSS to Excel to Weka
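The missing-data step can be illustrated with a minimal sketch that writes rows in Weka's ARFF text format, where a missing value is written as "?". The relation name, attribute names, the `Would_Not_Buy` label, and the sample rows below are hypothetical and are not taken from the survey data; only the `Would_Buy` label appears in the slides.

```python
def to_arff(relation, attributes, rows):
    """Render nominal data as ARFF text; None or empty cells become '?'."""
    lines = [f"@relation {relation}", ""]
    for name, values in attributes:
        # Nominal attributes list their allowed values in braces.
        lines.append(f"@attribute {name} {{{','.join(values)}}}")
    lines += ["", "@data"]
    for row in rows:
        cells = ["?" if cell in (None, "") else str(cell) for cell in row]
        lines.append(",".join(cells))
    return "\n".join(lines) + "\n"


# Hypothetical attributes and rows, for illustration only.
attributes = [
    ("gender", ["Male", "Female"]),
    ("know_voip", ["yes", "no"]),
    ("class", ["Would_Buy", "Would_Not_Buy"]),
]
rows = [
    ("Male", "yes", "Would_Buy"),
    ("Female", None, "Would_Not_Buy"),  # missing answer becomes "?"
]

arff_text = to_arff("voip_customer", attributes, rows)
print(arff_text)
```

Writing "?" keeps the instance in the dataset, whereas the "delete" option removes the whole row; with only 94 instances, that trade-off matters.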
Methodology
Algorithm selection
- Classification
- Clustering
- Association
Chosen: NNge
Why?
- High accuracy rate
- Simple, clear rules

| Algorithm | Correctly classified instances (%) |
| --- | --- |
| NaiveBayes | 63.82 |
| DecisionStump | 65.95 |
| Id3 | 84.04 |
| J48 | 75.53 |
| NBTree | 79.78 |
| ConjunctiveRule | 69.14 |
| DecisionTable | 80.85 |
| NNge | 87.23 |
| OneR | 71.27 |
| PART | 72.34 |
| Prism | 88.29 |
| Ridor | 71.27 |
| JRip | 74.46 |
| ZeroR | 63.83 |
| AdaBoostM1 | 65.95 |
| BayesNet | 60.63 |
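As a rough cross-check of the table, the percentages can be converted back into instance counts, assuming every algorithm was evaluated on all 94 instances reported on the dataset slide. The sketch below does that arithmetic and ranks the algorithms. ZeroR always predicts the majority class, so its row serves as the baseline.

```python
TOTAL_INSTANCES = 94  # dataset size reported in the slides

# Percent correctly classified, copied from the table above.
accuracy_percent = {
    "NaiveBayes": 63.82, "DecisionStump": 65.95, "Id3": 84.04, "J48": 75.53,
    "NBTree": 79.78, "ConjunctiveRule": 69.14, "DecisionTable": 80.85,
    "NNge": 87.23, "OneR": 71.27, "PART": 72.34, "Prism": 88.29,
    "Ridor": 71.27, "JRip": 74.46, "ZeroR": 63.83, "AdaBoostM1": 65.95,
    "BayesNet": 60.63,
}


def correct_count(percent, total=TOTAL_INSTANCES):
    """Convert a percentage into the nearest whole number of instances."""
    return round(percent / 100 * total)


correct_counts = {name: correct_count(p) for name, p in accuracy_percent.items()}
ranking = sorted(accuracy_percent, key=accuracy_percent.get, reverse=True)

for name in ranking:
    print(f"{name:16s} {accuracy_percent[name]:6.2f}%  "
          f"{correct_counts[name]:3d}/{TOTAL_INSTANCES}")
```

Under that assumption, NNge's 87.23% corresponds to 82 of 94 instances, which matches the 12 misclassified instances reported on the evaluation slide.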
Methodology
NNge classifier
- Nearest-neighbor-like algorithm using non-nested generalized exemplars
- A rule-based classifier that builds a sort of "hyperrectangle" model
- Shows promise as an ML method that performs well on a wide range of datasets
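The idea of classifying by the nearest generalized exemplar can be sketched for nominal attributes as follows. This is a simplified illustration, not Weka's NNge implementation: it omits how exemplars are generalized during training and any weighting, and the two exemplars and the `Would_Not_Buy` label are hypothetical.

```python
def distance(conditions, instance):
    """Count the attributes whose value falls outside the exemplar's allowed set."""
    return sum(
        1 for attr, allowed in conditions.items()
        if instance.get(attr) not in allowed
    )


def classify(exemplars, instance):
    """Return the class label of the exemplar closest to the instance."""
    nearest = min(exemplars, key=lambda ex: distance(ex["conditions"], instance))
    return nearest["label"]


# Hypothetical generalized exemplars: each attribute maps to a set of values.
exemplars = [
    {"label": "Would_Buy",
     "conditions": {"cost": {"10-20"}, "phone": {"yes"}, "gender": {"Male"}}},
    {"label": "Would_Not_Buy",
     "conditions": {"cost": {"0-10"}, "phone": {"no"},
                    "gender": {"Male", "Female"}}},
]

instance = {"cost": "10-20", "phone": "yes", "gender": "Female"}
predicted = classify(exemplars, instance)
print(predicted)
```

An instance that falls inside an exemplar has distance 0, so each exemplar can also be read as an if-then rule.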
Result
Result: Rules
One of the customer rules:
class Would_Buy IF:
cost in {10-20}
^ phone in {yes}
^ email in {yes}
^ fax in {no}
^ chat in {yes,no}
^ other in {no}
^ service type in {Phone_cards_only}
^ price in {Somewhat_Dissatisfied, Somewhat_Satisfied}
^ voice_quality in {Somewhat_Dissatisfied, Somewhat_Satisfied}
^ service in {Somewhat_Dissatisfied}
^ convenience in {Somewhat_Satisfied}
^ promotion in {Somewhat_Dissatisfied}
^ Know VoIP in {yes,no}
^ marital status in {Single}
^ gender in {Male}
(11)
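To make the rule concrete, the sketch below encodes it as a mapping from attribute name to allowed values and checks whether a respondent satisfies every condition. The attribute names and value sets are copied from the rule; the sample respondent is hypothetical.

```python
# The Would_Buy rule from the slide: attribute -> set of allowed values.
WOULD_BUY_RULE = {
    "cost": {"10-20"},
    "phone": {"yes"},
    "email": {"yes"},
    "fax": {"no"},
    "chat": {"yes", "no"},
    "other": {"no"},
    "service type": {"Phone_cards_only"},
    "price": {"Somewhat_Dissatisfied", "Somewhat_Satisfied"},
    "voice_quality": {"Somewhat_Dissatisfied", "Somewhat_Satisfied"},
    "service": {"Somewhat_Dissatisfied"},
    "convenience": {"Somewhat_Satisfied"},
    "promotion": {"Somewhat_Dissatisfied"},
    "Know VoIP": {"yes", "no"},
    "marital status": {"Single"},
    "gender": {"Male"},
}


def covers(rule, instance):
    """True when every attribute value lies in the rule's allowed set."""
    return all(instance.get(attr) in allowed for attr, allowed in rule.items())


# Hypothetical respondent, for illustration only.
respondent = {
    "cost": "10-20", "phone": "yes", "email": "yes", "fax": "no",
    "chat": "no", "other": "no", "service type": "Phone_cards_only",
    "price": "Somewhat_Satisfied", "voice_quality": "Somewhat_Satisfied",
    "service": "Somewhat_Dissatisfied", "convenience": "Somewhat_Satisfied",
    "promotion": "Somewhat_Dissatisfied", "Know VoIP": "yes",
    "marital status": "Single", "gender": "Male",
}

matches = covers(WOULD_BUY_RULE, respondent)
print(matches)
```

Conditions such as `chat in {yes,no}` allow every listed value, so they do not restrict the rule in practice.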
Result: Statistics
- Class allocation
- Feature weights
Result: Basic-model and Deluxe-model
Scheme: meta.AttributeSelectedClassifier
Sub-scheme: rules.NNge
Selected attributes: 3,6,8,10,11,12 (6 attributes)
Why? To avoid overfitting
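The effect of attribute selection can be pictured as projecting each instance onto the selected columns before the classifier sees it. The sketch below assumes the indices are 1-based, that the table is the 14-attribute Basic-model table, and that the class value sits in the last column; the placeholder row is hypothetical.

```python
# Attribute indices reported on the slide (assumed to be 1-based).
SELECTED_ATTRIBUTES = [3, 6, 8, 10, 11, 12]


def project(row, selected=SELECTED_ATTRIBUTES):
    """Keep only the selected attributes plus the class value (last column)."""
    kept = [row[i - 1] for i in selected]
    return kept + [row[-1]]


# Hypothetical 14-column row: 13 placeholder attributes and a class value.
row = [f"a{i}" for i in range(1, 14)] + ["class_value"]
reduced = project(row)
print(reduced)
```

With fewer attributes, each exemplar has fewer conditions to fit, which is one way a smaller attribute set can reduce overfitting on a 94-instance dataset.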
Result
Evaluation
- Ten-fold cross-validation
- Summary: correctly classified instances > 85%
- Detailed accuracy by class: TP, FP, precision, recall, F-measure
- Confusion matrix: 12 of 94 instances misclassified
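The evaluation measures listed above can all be derived from a confusion matrix. The sketch below uses a hypothetical two-class matrix chosen only so that it totals 94 instances with 12 misclassified; the individual cell values are not from the project's results.

```python
# Hypothetical confusion matrix cells (positive class = Would_Buy).
tp, fn = 55, 5   # actual positives: predicted positive / predicted negative
fp, tn = 7, 27   # actual negatives: predicted positive / predicted negative

total = tp + fn + fp + tn
misclassified = fn + fp

accuracy = (tp + tn) / total
tp_rate = tp / (tp + fn)          # also called recall
fp_rate = fp / (fp + tn)
precision = tp / (tp + fp)
recall = tp_rate
f_measure = 2 * precision * recall / (precision + recall)

print(f"instances: {total}, misclassified: {misclassified}")
print(f"accuracy:  {accuracy:.4f}")
print(f"TP rate:   {tp_rate:.4f}  FP rate: {fp_rate:.4f}")
print(f"precision: {precision:.4f}  recall: {recall:.4f}  F-measure: {f_measure:.4f}")
```

Whatever the real cell values are, 12 misclassified instances out of 94 give an accuracy of 82/94, about 87.23%, consistent with the "> 85%" summary.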
Conclusion
Limitations
- Small datasets
- Incomplete data source
Models
- High accuracy rate
- Help further market analysis
- Help product design
Learning Experience
- Processed a real data mining problem
- Know classification algorithms better: numeric and nominal attributes, missing data, overfitting
- Know evaluation methods better: how to compare algorithms, evaluation factors
Learning Experience
- Learned how to use Weka
- Future work: learn how to modify the source to perform better data mining
- Learned from classmates
References
- Jiawei Han and Micheline Kamber, "Data Mining: Concepts and Techniques", Morgan Kaufmann, 2001.
- Ian H. Witten and Eibe Frank, "Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations", Morgan Kaufmann, 2000.
- Machine Learning, Weka home page: http://www.cs.waikato.ac.nz/~ml/index.html
- David A. Aaker, V. Kumar, and George S. Day, "Marketing Research", 8th edition, Wiley, 2004.
Thank you