Data Mining Tools (R, WEKA, RapidMiner, Orange)


Upload: mayursurani

Post on 16-Jul-2015


TRANSCRIPT

Page 1: Data mining tools (R , WEKA, RAPID MINER, ORANGE)

Data Mining Tools

Kowshik

Madhumati

Mayur

Mohamed Sharique

Vidyashankar

Page 2: Data mining tools (R , WEKA, RAPID MINER, ORANGE)

Orange

Product Details

• Open source

• Data visualization and analysis

• Suitable for novices and experts

• Scriptable through Python

• Available for all popular platforms, including Windows, Mac OS X and variants of Linux.

Company Details

• Founded in 1996

• Orange is distributed free under the GPL.

• Maintained and developed at the Bioinformatics Laboratory of the Faculty of Computer and Information Science, University of Ljubljana, Slovenia.

Python is a widely used general-purpose, high-level programming language. The GNU General Public License is the most widely used free software license.

Page 3: Data mining tools (R , WEKA, RAPID MINER, ORANGE)

Features

• Visual Programming

• Visualization

• Interaction and Data Analytics

• Large Toolbox

• Scripting Interface

• Extendable

• Documentation

• Open Source

• Platform Independence

Success Stories

• AstraZeneca, a pharmaceutical giant, uses Orange in drug development and sponsors the development of several related parts of Orange

• At Jožef Stefan Institute, the visual programming interface has been upgraded in Orange4WS to support service-oriented architectures

Page 4: Data mining tools (R , WEKA, RAPID MINER, ORANGE)

Screenshot

Page 5: Data mining tools (R , WEKA, RAPID MINER, ORANGE)

R

Product Details

• Latest R-language engine for statistical computing

• Open source, R-Enterprise, R-Cloud (paid version)

• Data visualization and analysis up to 16 TB

• Extended capabilities with reproducible R toolkits

• Runs on Windows, Mac OS and variants of Linux.

Company Details

• Created in 1993 in New Zealand

• Ross Ihaka and Robert Gentleman pioneered R language development.

• R is released under the GNU General Public License.

• Many large multinational companies use R.

Page 6: Data mining tools (R , WEKA, RAPID MINER, ORANGE)

Useful Functions

• Graphics Visualization

• Spatial Data Analysis

• Clustering

• Text Mining

• Social Network Analysis and Graph mining

• Statistics

• Graphics

• Data Manipulation

Success Stories

• Bank of America

• Bing

• Facebook

• Ford

• Google

Page 7: Data mining tools (R , WEKA, RAPID MINER, ORANGE)

Screenshot

Page 8: Data mining tools (R , WEKA, RAPID MINER, ORANGE)

WEKA

Product Details

• Open source

• A collection of machine learning algorithms

• Data visualization and analysis

• Java-based platform

• Widely used by researchers and practitioners

Company Details

• Founded in 1997

• University of Waikato

The GNU General Public License is the most widely used free software license.

Page 9: Data mining tools (R , WEKA, RAPID MINER, ORANGE)

Features

• Released under the GNU General Public License

• GUI for interaction

• Explorer is the main user interface of WEKA

• Supports core tasks including data pre-processing, classification, regression, clustering, association rules and visualization

• Reads data files in multiple formats

• One exceptional feature of WEKA is database connectivity through JDBC with any RDBMS package

Success Stories

• The Weka mailing list has over 1100 subscribers in 50 countries, including subscribers from many major companies such as Rechtsportal

Page 10: Data mining tools (R , WEKA, RAPID MINER, ORANGE)

Screenshot

Page 11: Data mining tools (R , WEKA, RAPID MINER, ORANGE)

RapidMiner

Product Details

• Open source.

• Data visualization and analysis

• Machine learning

• Data mining and text mining

• Business intelligence

• Runs on the Java runtime

• Available on all major operating systems and platforms

Company Details

• Started as YALE in 2001 by Ralf Klinkenberg, Ingo Mierswa, and Simon Fischer

• From 2006 it was developed by Rapid-I, a company founded by Ralf Klinkenberg and Ingo Mierswa, and was renamed RapidMiner

• Licensed under the AGPL.

Page 12: Data mining tools (R , WEKA, RAPID MINER, ORANGE)

Features

• A visual, code-free environment, so no programming is needed

• Design of analysis processes

• Predictive analytics (with pre-made templates)

• Data loading

• Data transformation

• Data Modelling

• Data visualization (with lots of visualizations)

• Allows you to work with different types and sizes of data sources

• Platform Independence.

• Acts as a powerful scripting language engine along with a graphical user interface

• Modular operator concept.

• Multi-layered data view.

Success Stories

• CISCO

• PAYPAL

• EBAY

• MIELE

• VOLKSWAGEN

Page 13: Data mining tools (R , WEKA, RAPID MINER, ORANGE)

Screenshot

Page 14: Data mining tools (R , WEKA, RAPID MINER, ORANGE)

Overall Comparison

Procedure | R-Programming | RapidMiner | Weka | Orange
Partitioning of dataset into training and testing sets | Pass (but limited partitioning methods) | Pass (but limited partitioning methods) | Pass (but limited partitioning methods) | Pass (but limited partitioning methods)
Descriptor scaling | Pass | Pass | Fail (cannot save parameters for scaling to apply to future datasets) | Fail (no scaling methods)
Descriptor selection | Fail (no wrapper methods) | Pass | Pass (but is not part of KnowledgeFlow) | Fail (no wrapper methods)
Parameter optimization of machine learning/statistical methods | Fail (not automatic) | Pass | Fail (not automatic) | Fail (not automatic)
Model validation using cross-validation and/or independent validation set | Pass (but limited error measurement methods) | Pass | Pass (but cannot save model so have to rebuild model for every future dataset) | Pass (but cannot save model so have to rebuild model for every future dataset)
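To make the compared procedures concrete, here is a minimal plain-Python sketch of three of them: partitioning a dataset, scaling descriptors with parameters that are saved for future datasets, and generating cross-validation folds. It is an illustration only and does not show how R, RapidMiner, Weka or Orange implement these steps. The helper names (train_test_split, fit_scaler, apply_scaler, k_fold_indices) and the sample values are hypothetical and were introduced for this example.

```python
import random
import statistics


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and return (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_scaler(values):
    """Learn scaling parameters (mean, population std) from one dataset."""
    std = statistics.pstdev(values) or 1.0  # fall back to 1.0 for constant columns
    return {"mean": statistics.mean(values), "std": std}


def apply_scaler(params, values):
    """Apply previously saved scaling parameters to any dataset."""
    return [(v - params["mean"]) / params["std"] for v in values]


def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


# Hypothetical descriptor values, used only to exercise the helpers.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
train, test = train_test_split(values)
params = fit_scaler(train)         # learned once, from the training set only
print(apply_scaler(params, test))  # same parameters reused on held-out data
```

Keeping the scaling parameters in a separate object is what the "cannot save parameters for scaling" entry in the comparison refers to: without it, a future dataset cannot be put on the same scale as the training data.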
