Data Mining & SPSS Modeler
DESCRIPTION
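Data Mining & SPSS Modeler. Day 1. Introduction to Data Mining. 1.1 Data Mining Concepts. Definition of data mining: using validated methods to discover actionable, intrinsic knowledge in large volumes of data in order to improve business operations. Business data volumes are growing exponentially (GB/hour). Traditional techniques struggle to find valuable patterns in data at this scale; data mining helps uncover them. Data mining tasks: prediction and classification, clustering, association analysis, sequence analysis, anomaly detection, time series.
TRANSCRIPT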
Data Mining & SPSS Modeler
Day 1
1.1 Data Mining Concepts
[Figure: clustering into Group 1, Group 2, Group 3, ..., Group n]
(Antecedent) → (Consequent)
(1) & (2) & ... & (m) [Antecedents] → [Consequent]: Buying Pattern
[Figure: Pattern over Mail, Directory, café, Community, banner. Model 1: Main; Model 2: café, café; Model 3: Main; Model 4: Community]
Transaction ID: 200902131224350001, 200902131224350002
Day 1
1.2
CRM, Risk
Scoring; LTV (Life Time Value); Risk (Warranty)
Warranty; Cross-selling, Up-selling
1.2
SPSS; 2~3%
Intervention: IMF, 911; Data
1.2
Cross-selling, Up-selling; (FDS); Underwriting; List
1.2
SPSS; CLTV; RFM
CLV; Life Stage; VIP Gold, VIP Silver
1.2
SPSS
ex) 78.5 (24%)
ex) 85.8 (18%)
Scoring
1.2
(Cross/Up-selling), (Re-selling); Shopping Mall
1.2
Mining; SPSS
6 Set
1.2
e-MAIL & SMS; SPSS; Mining
1.2
Up-selling
1.2
APP, MOVIE, MUSIC, LIFE; Gain Chart; Rule Description
Needs; SPSS; Collaborative Filtering
1.2
CDMA, W-CDMA
SPSS
1.2
1.2
SPSS; Decision
1.2
Segment
(Time Series)
ARIMA; AR
SPSS
1.2
Up-selling
1.2
DB
SPSS DB
Guide > TV > Rating
TV Program Table; TV Table; ETCL; ODBC; Data Mart INSERT; .net; Data Mart
SPSS
Day 1
2.1 SPSS Modeler
IBM SPSS Modeler 14.2 (GUI)
(Open Architecture)
IBM SPSS Modeler Data Mining
SPSS Education. 2011 SPSS Korea, Data Solution Inc. All rights reserved
Modeler C/S: SPSS Modeler Server (SQL Pushback), SPSS Modeler Client
Client, Server
2.1 C/S
Mining
(Open Architecture)
Data Mining
S/W; DBMS; Data Mining S/W; Source; Middle Ware; S/W Group; S/W Up-Grade
2.1
(Open Architecture)
Data Mining DB: DB Bulk Loading (DB Bulk Loader, IBM SPSS Modeler)
1. txt; Data Base
2. Export; SPSS Modeler (GUI); Loader
3. DB Table; Bulk Loading
4.
Client PC: IBM SPSS Modeler Client; Windows XP/Vista/7; TCP/IP
Mining Mart
Web: Java
SPSS Modeler scripts: Python 2.6; IBM SPSS Modeler Server options.cfg, python_exe_path; Python scripts
CRISP-DM (Cross Industry Standard Process for Data Mining)
CRISP-DM: NCR, OHRA, SPSS, Daimler-Benz
Modeling
CRISP-DM
Day 1
2.2 SPSS Modeler
IBM SPSS Modeler 14.2 nodes:
Source Node: Statistics, SAS, Excel, ODBC (Microsoft SQL Server, DB2, Oracle)
Operation Node
Graph Node
Modeling Node: Decision Tree, Regression, Neural Network, Clustering, Association
Output Node
IBM SPSS Modeler; CRISP-DM; Data Mining
2.2 SPSS Modeler
Automated, Classification, Association, Segmentation
IBM SPSS Modeler: IBM SPSS Statistics, Excel, SAS, ODBC
Source Node: Oracle, SQL Server, DB2 (ODBC); ASCII; Statistics; IBM Cognos BI; Excel; SAS 6, 7, 8
Operation Node: RFM
Graph Node: XY, Web, Multi plot
Modeling Node
Modeling Node
(Gains Chart), (Response Chart), (Lift Chart), (Profit Chart), ROI
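As a rough illustration of what a gains chart and a lift chart summarize, the sketch below ranks records by model score and computes cumulative gain and lift at each depth. The scores, labels, and the function name `cumulative_gains` are hypothetical; nothing here is taken from the slides or from SPSS Modeler's own implementation.

```python
def cumulative_gains(scores, labels):
    """Rank records by score (highest first) and return (depth, gain, lift) rows.

    depth: fraction of records selected so far
    gain:  fraction of all positives captured so far
    lift:  gain / depth (1.0 means no better than random selection)
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    rows, hits = [], 0
    for i, (_, label) in enumerate(ranked, start=1):
        hits += label
        depth = i / len(ranked)
        gain = hits / total_pos
        rows.append((depth, gain, gain / depth))
    return rows


# Hypothetical model scores and observed responses (1 = responded).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0, 0, 0]

for depth, gain, lift in cumulative_gains(scores, labels):
    print(f"depth={depth:.3f}  gain={gain:.3f}  lift={lift:.3f}")
```

Plotting gain against depth gives the gains chart; plotting lift against depth gives the lift chart.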
Output
Output Node
Statistics: html, txt
Data Quality: Null Value, Blank (html, txt)
Display
Export: Statistics (.sav), Excel, SAS 6, 7, 8, ODBC
Day 1
3.1
3.1
Statistics: tree_credit.sav (IBM SPSS Statistics)
CHAID
400; 200
CHAID
2; 18%; 1 (89%); 10; 1, 4, 5; 5; 89%, 97%
3; 6, 7; 5; 58%, 85%
Day 1
4.1
1. Class; $XF-Class; 90%
4.1
2. = 100%
$R-Credit rating; $R-, $RC-; 0.0, 1.0; CHAID; 16%, 100%
$R-Credit rating: 2464; 1960; 79%
Statistics
Copyright IBM Corporation 1994, 2011.
Statistics; PMML; IBM SPSS Collaboration and Deployment Services
Day 2
5.1 (Estimation)
SPSS Modeler: Linear Regression, Neural Network
1. (Method of Least Squares)
2. Variable selection methods: (Enter), (Stepwise), (Backwards), (Forwards)
Regression in SPSS Modeler
3. 31.sav
4.
5. (1)
5. (2): R
5. (3): F
5. (4): Sig > 0.05; = -139.099 + 0.469 *
Mean = 0.001; P-P; Y = x
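The fitted equation on the slide has the form intercept plus slope times a predictor, which is what the method of least squares produces for a single input. The sketch below is a minimal, assumed illustration of that calculation in plain Python; the data and the function name `fit_line` are hypothetical, and it does not reproduce the slide's coefficients because the underlying variables are not recoverable from the transcript.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (intercept, slope, r_squared).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared: share of the variance in y explained by the fitted line.
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1.0 - ss_res / ss_tot if ss_tot else 1.0
    return intercept, slope, r_squared


# Hypothetical data lying exactly on y = 1 + 2x.
intercept, slope, r_squared = fit_line([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
print(intercept, slope, r_squared)
```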
1. Framework of Neural Network
The Neurons in Human Body
2. (1): (Input Layer), (Hidden Layer), (Output Layer)
[Figure: network diagram with inputs X1, X2, ..., XP; hidden units Z1, Z2, ..., ZW; output Y*K; weights w1, w2, w3, ..., wp and w*K1, ..., w*Km]
2. (2): (Input Layer): 0~1; (Hidden Layer); (Output Layer): (target)
3. Neural Network: MLP (Multi Layer Perceptron), RBF (Radial Basis Function)
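To make the input, hidden, and output layers concrete, here is a minimal sketch of one forward pass through a single-hidden-layer MLP in plain Python. It reads the slide's "0~1" note as min-max scaling of the inputs, which is an assumption; the sigmoid activation, the weights, and all function names are likewise hypothetical and are not SPSS Modeler's implementation.

```python
import math


def min_max_scale(values):
    """Rescale a list of numbers to the range 0..1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, hidden_weights, hidden_bias, output_weights, output_bias):
    """Input layer -> hidden layer (sigmoid units) -> single linear output."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
        for weights, bias in zip(hidden_weights, hidden_bias)
    ]
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


# Hypothetical raw inputs, scaled to 0..1 before entering the network.
x = min_max_scale([10.0, 20.0, 30.0])
prediction = forward(
    x,
    hidden_weights=[[0.2, -0.4, 0.1], [0.5, 0.3, -0.2]],  # two hidden units
    hidden_bias=[0.0, 0.1],
    output_weights=[1.0, -1.0],
    output_bias=0.5,
)
print(prediction)
```

Training adjusts the weights so that the output approaches the target; that step is omitted here.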
MLP
K-means; MLP
RBF
4. SPSS Modeler Neural Network (1): Neural Network in SPSS Modeler
4. SPSS Modeler Neural Network (2): Neural Network in SPSS Modeler
Bagging & Boosting
5. Neural Network: RBF; GDP.xls
6.
7. (1)
7. (2)
8.
7000
1.
2. T, S, C, I (trend, seasonal, cyclical, and irregular components)
3.
Naive Method, Moving Average, Exponential Smoothing, Holt, Brown, Winter, ARIMA, Multivariate ARIMA
4.
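Three of the simpler methods in the list above can be sketched in a few lines each. The following is an assumed, minimal illustration in plain Python (function names and data are hypothetical); Holt, Brown, Winter, and ARIMA models add trend, seasonality, or autoregressive terms and are not shown.

```python
def naive_forecast(series):
    """Naive method: the next value is forecast as the last observed value."""
    if not series:
        raise ValueError("series must not be empty")
    return series[-1]


def moving_average(series, window):
    """Simple moving average over a fixed-size trailing window."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]


def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: s_t = alpha * y_t + (1 - alpha) * s_(t-1)."""
    if not series:
        raise ValueError("series must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical monthly values.
sales = [10, 20, 30, 40, 50]
print(naive_forecast(sales))
print(moving_average(sales, 3))
print(exponential_smoothing(sales, 0.5))
```

A larger `alpha` makes the smoothed series follow recent observations more closely; a smaller one smooths more heavily.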
Day 2
4.1.3 (Association)
1. Buying Pattern
(Antecedent) → (Consequent)
(1) & (2) & ... & (m) [Antecedents] → [Consequent]
2.
(Cross-Selling), (Up-Selling)
Negative Rule: ~A → B, A → ~B, ~A → ~B
(Fraud Detection): Negative Rule
(Shelf Planning)
3.
4. (Association) measures
A: Antecedent, C: Consequent
(Instances): number of records containing A
(Support): Pr(A) = (records containing A) / (all records); also called (coverage)
(Rule Support): Pr(A ∩ C) = (records containing A and C) / (all records)
(Confidence): Pr(C | A) = Pr(A ∩ C) / Pr(A); also called (accuracy)
(Lift): Pr(C | A) / Pr(C) = Pr(A ∩ C) / { Pr(A) * Pr(C) }
(Deployability): Pr(A) − Pr(A ∩ C), the share of records that contain A but not C
5. (Association): (Rule support), (Confidence), (Lift)
Transaction (e.g.); Association Analysis; Rules
___% (). ___% ().
6. SPSS Modeler: (Support), (Confidence), (Antecedent); Apriori, CARMA
4.1.3 Apriori
1. Apriori: 100%, 40%, 43%, 40%, 15%
2. Apriori: 10%
> 1; 10%; C5.0; 0, 1
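The core idea of Apriori is to grow frequent itemsets level by level, keeping only candidates whose support meets a minimum threshold and whose smaller subsets are all frequent. The sketch below is a minimal, assumed illustration in plain Python; the baskets, the 60% threshold, and the function name `apriori` are hypothetical, and this is not the algorithm as implemented inside SPSS Modeler's Apriori node.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset (frozenset): support} for all frequent itemsets."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)
    if n == 0:
        return {}

    def support(itemset):
        return sum(1 for basket in baskets if itemset <= basket) / n

    # Level 1: frequent single items.
    items = sorted({item for basket in baskets for item in basket})
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = dict(current)
    k = 2
    while current:
        previous = set(current)
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in previous for b in previous if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in previous for sub in combinations(c, k - 1))
        }
        current = {}
        for candidate in candidates:
            s = support(candidate)
            if s >= min_support:
                current[candidate] = s
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical baskets.
baskets = [
    ["milk", "bread"],
    ["milk", "bread", "butter"],
    ["bread", "butter"],
    ["milk", "butter"],
    ["milk", "bread", "butter"],
]
result = apriori(baskets, min_support=0.6)
for itemset, s in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

Rules are then generated from the frequent itemsets and filtered by confidence.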
3. SPSS Modeler Apriori (1): Apriori in SPSS Modeler
3. SPSS Modeler Apriori (2): Apriori in SPSS Modeler
4. Apriori: shopping.txt
5. Apriori
6. Apriori (1)
Rule: Milk and Frozen foods => Bakery goods
(Instances): 85 records contain {Milk, Frozen foods}
(Support): 85 / 786 * 100% = 10.814%
(Rule Support): 9.033% of records contain {Milk, Frozen foods, Bakery goods}
6. Apriori (2)
(Confidence): 9.033% / 10.814% = 83.529%
(Lift): confidence divided by the share of all records containing Bakery goods; the slide gives 83.529% / 42.69% and reports a lift of 1.948
(Deployability): 14 / 786 = 1.781% of records contain Milk and Frozen foods but not Bakery goods
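The percentages in this worked example can be checked from the record counts. The sketch below is a minimal, assumed illustration: the total of 786 and the 85 antecedent records come from the slide, while the count of 71 records containing all three items is inferred here from the 9.033% rule support (and from 85 − 14) rather than stated on the slide. The function name `rule_measures` is hypothetical. Lift is only computed when a consequent count is supplied, since the slide does not give one.

```python
def rule_measures(n_total, n_antecedent, n_both, n_consequent=None):
    """Association-rule measures as fractions (multiply by 100 for percent)."""
    support = n_antecedent / n_total        # Pr(A)
    rule_support = n_both / n_total         # Pr(A and C)
    confidence = n_both / n_antecedent      # Pr(C | A)
    measures = {
        "support": support,
        "rule_support": rule_support,
        "confidence": confidence,
        "deployability": support - rule_support,  # A holds but C does not
    }
    if n_consequent is not None:
        measures["lift"] = confidence / (n_consequent / n_total)  # Pr(C | A) / Pr(C)
    return measures


# 786 and 85 are from the slide; 71 is inferred (see the note above).
m = rule_measures(n_total=786, n_antecedent=85, n_both=71)
for name, value in m.items():
    print(f"{name}: {value * 100:.3f}%")
```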
7. Bakery goods
4.1.3 Carma
1. SPSS Modeler Carma: Carma in SPSS Modeler
2. Carma: BASKETS1n.txt
4.1.3
PC, CPU; ID; sequence.xls
1. ID
2. (1): ID
3.
$S-ID-1 = brioches, $SC-ID-1 = 0.700; $S-ID-2 = biscuits, $SC-ID-2 = 0.600 (brioches 70%, biscuits 60%)
4.
FailTelRepair.txt: ID field = ID; time field = Index1; content field = Stage
Day 2
5.1 Batch
5.1.1 SPSS Modeler Batch
1. Data Mining Process (CRISP-DM): Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, Deployment
Management of Mining Results: Modeler Batch
2. Modeler Batch
Modeler Batch Mode System Architecture: Client, Server, Batch Program, Batch Script
DOS: client, stream
DOS: Clemb, stream, Script
3. SPSS Modeler Batch
SPSS Modeler, Modeler Batch, Modeler Server: modeler, script, modeler batch, clemb
Command-line arguments:
-server
-hostname
-port
-username
-password
-epassword
-domain
-P =
@
-directory
-server_directory
-execute
-stream
-script
-model (.gm)
-state
-project
-output (.cou)
-help
4. SPSS Modeler Batch (1)
SPSS Modeler Stream for batch execution (Clem_Batch_Test.str)
Server: (C:\BATCH_TEST\Clem_Batch_Test.str)
Batch: (C:\BATCH_TEST\clem_batch.bat)
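As a hedged sketch of how the flags listed earlier fit together, the snippet below only assembles a `clemb` argument list for the stream path shown above; it does not run anything. The helper name `build_clemb_command`, the host name, port, and credentials are hypothetical, and the exact flag combination should be checked against the Modeler Batch documentation for the installed version. The contents of clem_batch.bat are not shown in the slides and are not reproduced here.

```python
def build_clemb_command(stream_path, hostname=None, port=None,
                        username=None, password=None, execute=True):
    """Assemble a clemb argument list from flags named on the slides."""
    args = ["clemb"]
    if hostname is not None:
        # Server mode: connect to a Modeler Server instead of running locally.
        args += ["-server", "-hostname", hostname]
        if port is not None:
            args += ["-port", str(port)]
        if username is not None:
            args += ["-username", username]
        if password is not None:
            args += ["-password", password]
    args += ["-stream", stream_path]
    if execute:
        args.append("-execute")
    return args


# Stream path from the slide; every connection detail below is a made-up placeholder.
cmd = build_clemb_command(
    r"C:\BATCH_TEST\Clem_Batch_Test.str",
    hostname="modeler-host.example",
    port=28052,
    username="analyst",
    password="secret",
)
print(" ".join(cmd))
```

The resulting list could be passed to `subprocess.run` on a machine where `clemb` is installed and on the PATH.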
SPSS Modeler node; Batch.
5.1.2
Q & A