The University of Iowa Intelligent Systems Laboratory

Data Mining: STATISTICA

Outline

• Prepare the data
• Classification and regression

Prepare the Data
• Statistica can read from Excel, .txt, and many other file types
• Compared with WEKA, data preparation is much easier in Statistica

Open an Excel File
• Click “Import selected sheet to Spreadsheet”
• Select the Excel sheet where your data is stored
• Get variable names from the first row

Open an Excel File
• Change variable type
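
For readers who want to reproduce the import conventions outside Statistica, the two ideas on these slides (variable names taken from the first row, then changing a variable's type) can be sketched in Python. This is only an illustration using the standard csv module on a small inline tab-delimited sample. The column names, the values, and the helper names read_table and to_float are made up for the example and are not part of Statistica.

```python
import csv
import io

# Made-up tab-delimited sample standing in for an exported sheet (assumption).
SAMPLE = "length\twidth\tlabel\n5.1\t3.5\ta\n4.9\t3.0\tb\n"


def read_table(text, delimiter="\t"):
    """Return (variable_names, rows); the first row supplies the names."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    names = next(reader)
    rows = [dict(zip(names, record)) for record in reader]
    return names, rows


def to_float(rows, column):
    """Change one variable's type from text to float, returning new rows."""
    return [{**row, column: float(row[column])} for row in rows]


names, rows = read_table(SAMPLE)
rows = to_float(rows, "length")  # "width" is left as text on purpose
print(names)
print(rows[0])
```

Every value arrives as text, so the type change is an explicit second step, much as it is a separate dialog in the slides.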

Classification and Regression

• C&RT

• Boosting tree

• Neural Networks

C&RT Classification
• The Iris data set is used as an example

C&RT Classification
• Open the “Data Mining” menu and select “Interactive Trees”

C&RT Classification
• View the final tree and understand the results
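
To help read the final tree, it may be useful to see how a single classification split can be chosen. The sketch below searches one numeric predictor for the threshold that minimises the size-weighted Gini impurity of the two child nodes, which is a common criterion for classification trees. Whether Statistica is configured to use Gini in this run is an assumption. The values are toy numbers made up for illustration, not the real Iris measurements, and best_split is a hypothetical helper name.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, impurity) for the best split on one predictor.

    Candidate thresholds are midpoints between consecutive distinct values.
    If no split improves on the parent node, the threshold is None.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_threshold, best_impurity = None, gini(labels)
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # cannot split between identical values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_impurity:
            best_threshold, best_impurity = (lo + hi) / 2, score
    return best_threshold, best_impurity


# Toy data, made up for illustration only.
petal_length = [1.4, 1.5, 1.3, 4.5, 4.7, 5.9, 6.1]
species = ["A", "A", "A", "B", "B", "C", "C"]
threshold, impurity = best_split(petal_length, species)
print(threshold, round(impurity, 3))
```

A full tree repeats this search over every predictor and then recurses into each child node until a stopping rule is met.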

C&RT Regression
• Use the CPU data set and select regression analysis
[Screenshot callout on one dialog option: “Don’t check it”]

C&RT Regression
• Regression tree structure
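
The regression tree structure can be understood through a single split. In the sketch below, a node is split at the threshold that minimises the total squared error around the two child means, and each child then predicts its own mean. This is a minimal illustration of the usual least-squares criterion, assumed here rather than taken from Statistica's settings. The numbers are made up and are not the CPU data set.

```python
def sse(ys):
    """Sum of squared deviations from the mean (0 for an empty node)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def best_regression_split(xs, ys):
    """Return (threshold, error, left_mean, right_mean) for one predictor.

    The threshold is None if no split reduces the squared error.
    """
    pairs = sorted(zip(xs, ys))
    overall_mean = sum(ys) / len(ys)
    best = (None, sse(ys), overall_mean, overall_mean)
    for i in range(1, len(pairs)):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # cannot split between identical values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        error = sse(left) + sse(right)
        if error < best[1]:
            best = ((lo + hi) / 2, error,
                    sum(left) / len(left), sum(right) / len(right))
    return best


# Toy data, made up for illustration only.
xs = [1, 2, 3, 10, 11, 12]
ys = [10, 12, 11, 50, 52, 51]
split_at, split_error, left_mean, right_mean = best_regression_split(xs, ys)
print(split_at, split_error, left_mean, right_mean)
```

Each leaf of a regression tree reports a mean like left_mean or right_mean, which is why the predictions of a single tree take only a few distinct values.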

C&RT Regression
[Figure: plot; recoverable axis label: “Predicted values”]

Boosting tree Classification
• In the “Data Mining” menu, select “Boosted Trees”

Boosting tree Classification
• See the results and predictor importance

Boosting tree Regression
• CPU data set

Boosting tree Regression
• See the results and predictor importance
[Figure: plot; recoverable axis label: “Predicted values”]

Boosting tree Regression
• See the results: observed values vs. predicted values
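
As a rough picture of where the predicted values come from, the sketch below boosts one-split regression trees (stumps) on a single predictor: each round fits a stump to the current residuals and adds a shrunken copy of it to the running prediction. This is a simplified form of gradient boosting for squared error, not Statistica's implementation. The number of rounds, the learning rate, and the data are assumptions chosen for illustration, and the CPU data set itself has several predictors rather than one.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression tree; return (threshold, left, right).

    Assumes xs contains at least two distinct values.
    """
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    best = None
    for k in range(1, len(order)):
        lo, hi = xs[order[k - 1]], xs[order[k]]
        if lo == hi:
            continue
        left = [residuals[i] for i in order[:k]]
        right = [residuals[i] for i in order[k:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, (lo + hi) / 2, left_mean, right_mean)
    return best[1], best[2], best[3]


def boost(xs, ys, rounds=50, rate=0.1):
    """Return (base, stumps, predictions) after boosting on residuals."""
    base = sum(ys) / len(ys)  # start from the overall mean
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left, right = fit_stump(xs, residuals)
        stumps.append((threshold, left, right))
        predictions = [p + rate * (left if x <= threshold else right)
                       for x, p in zip(xs, predictions)]
    return base, stumps, predictions


def mse(observed, predicted):
    """Mean squared error between two equal-length sequences."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


# Toy data, made up for illustration only.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
base, stumps, predictions = boost(xs, ys)
for observed, predicted in zip(ys, predictions):
    print(observed, round(predicted, 2))
```

Printing the pairs side by side is the tabular counterpart of an observed vs. predicted plot. Points near the diagonal of such a plot correspond to rows where the two columns nearly agree.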

Neural Networks Classification
• In the “Data Mining” menu, select “Automated Neural Networks”

Neural Networks Classification
• Choose “Classification”, then select variables

Neural Networks Classification
• Statistica tries a set of different neural networks and keeps the best ones

Neural Networks Classification
• See the classification results

Neural Networks Classification
• See the classification results: predictions

Neural Networks Classification
• See the classification results: confusion matrix
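
A confusion matrix simply counts how often each observed class was assigned to each predicted class. The sketch below builds one from two label lists, with rows as observed classes and columns as predicted classes. That orientation is an assumption and may differ from the layout Statistica displays. The class names are the three Iris species, but the predictions are made up for illustration.

```python
from collections import Counter


def confusion_matrix(observed, predicted):
    """Return (labels, matrix) with rows = observed, columns = predicted."""
    labels = sorted(set(observed) | set(predicted))
    counts = Counter(zip(observed, predicted))
    matrix = [[counts[(o, p)] for p in labels] for o in labels]
    return labels, matrix


def accuracy(matrix):
    """Share of cases on the diagonal, i.e. classified correctly."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Made-up predictions for illustration only.
observed = ["setosa", "setosa", "versicolor",
            "versicolor", "virginica", "virginica"]
predicted = ["setosa", "setosa", "versicolor",
             "virginica", "virginica", "virginica"]

labels, matrix = confusion_matrix(observed, predicted)
print(labels)
for label, row in zip(labels, matrix):
    print(label, row)
print(round(accuracy(matrix), 3))
```

Off-diagonal cells show which classes get confused with each other, which a single accuracy figure hides.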

Neural Networks Regression
• CPU data set

Neural Networks Regression
• CPU data set, select variables

Neural Networks Regression
• Training and results

Neural Networks Regression
• Predictions

Neural Networks Regression
• Some statistics about the predictions
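
The slide does not list which statistics Statistica reports, so the sketch below computes three common ones as an assumption: mean absolute error, root mean squared error, and the Pearson correlation between observed and predicted values. The numbers are made up for illustration and prediction_stats is a hypothetical helper name.

```python
import math


def prediction_stats(observed, predicted):
    """Return MAE, RMSE and Pearson correlation for paired values."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted))
    spread_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed))
    spread_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted))
    # Correlation is undefined when either series is constant.
    corr = cov / (spread_o * spread_p) if spread_o and spread_p else float("nan")
    return {"mae": mae, "rmse": rmse, "corr": corr}


# Made-up values for illustration only.
observed = [10, 20, 30, 40]
predicted = [12, 18, 33, 39]
stats = prediction_stats(observed, predicted)
for name, value in stats.items():
    print(name, round(value, 3))
```

MAE and RMSE are in the units of the target variable, while the correlation is unitless, so the three are best read together.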