All Powder Board and Ski
Oracle 9i Workbook
Chapter 8: Data Warehouses and Data Mining
Jerry Post
Copyright © 2003
Oracle Relational Approach
Relational Tables: Customer, Sale, SaleItem, Item
Star Design: a Fact Measure surrounded by four Dimensions
Meta-Data
Materialized Views: Sale + Customer
Desired Sales Cube Dimensions
Sales Dimensions
State (ship)
Month
Category
Style
SkillLevel
Size
Color
Manufacturer
BindingStyle
WeightMax?
ItemMaterial?
WaistWidth?
Early Data: Spreadsheets
External Tables: Attach to CSV
create or replace directory csv_dir as 'D:\students\BuildAllPowder\AllPowderSampleDataCSV';

create table OldSale_Ext
(
  SaleID INTEGER,
  SaleDate DATE,
  ShipState VARCHAR2(50),
  ShipZIP VARCHAR2(50),
  PaymentMethod VARCHAR2(50),
  SKU VARCHAR2(50),
  QuantitySold INTEGER,
  SalePrice NUMBER(10,2),
  ModelID VARCHAR2(250),
  ItemSize NUMBER,
  ManufacturerID INTEGER,
  Category VARCHAR2(50),
  Color VARCHAR2(50),
  ModelYear INTEGER,
  Graphics VARCHAR2(50),
  ItemMaterial VARCHAR2(50),
  ListPrice NUMBER(10,2),
  Style VARCHAR2(50),
  SkillLevel INTEGER,
  WeightMax NUMBER,
  WaistWidth NUMBER,
  BindingStyle VARCHAR2(50)
)
Continued on next slide
Warning: currency columns cannot have $ symbols or commas
External File Definition
organization external
(
  type oracle_loader
  default directory csv_dir
  access parameters
  (
    records delimited by newline
    fields terminated by ',' optionally enclosed by '"' lrtrim
    missing field values are null
    (
      SaleID,
      SaleDate char date_format date mask "mm/dd/yyyy",
      ShipState, ShipZIP, PaymentMethod, SKU, QuantitySold, SalePrice,
      ModelID, ItemSize, ManufacturerID, Category, Color, ModelYear,
      Graphics, ItemMaterial, ListPrice, Style, SkillLevel, WeightMax,
      WaistWidth, BindingStyle
    )
  )
  location ('Lab 08-01 Early Sales.csv')
)
reject limit unlimited;
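The warning about currency symbols could also be handled on the database side instead of in the spreadsheet. The following is only a sketch under an assumption that is not in the workbook: a variant of the external table in which SalePrice is declared as VARCHAR2(50) rather than NUMBER(10,2). It also assumes the session uses a period as the decimal separator, and that any value containing a comma is enclosed in double quotes in the CSV file so that the field still parses as one column.

```sql
-- Demonstrate the cleanup expression on a literal value.
SELECT TO_NUMBER(REPLACE(REPLACE('$1,234.50', '$', ''), ',', '')) AS SalePrice
FROM dual;

-- Hypothetical: applies only if SalePrice were loaded as text (VARCHAR2).
SELECT SaleID,
       TO_NUMBER(REPLACE(REPLACE(SalePrice, '$', ''), ',', '')) AS SalePrice
FROM OldSale_Ext;
```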
Create Customer and Employee
CustomerID and EmployeeID are missing from the old data.
Instead of relying on blank cell values, create a new customer called “Walk-in” and a new employee called “Staff”.
Write down the ID numbers generated for these anonymous entries.
If you use SQL, you can assign a value of zero to these entries.
INSERT INTO Customer (CustomerID, LastName)
VALUES (0, 'Walk-in');
INSERT INTO Employee (EmployeeID, LastName)
VALUES (0, 'Staff');
Extract Model Data
SELECT DISTINCT OldSale_ext.ModelID, OldSale_ext.ManufacturerID,
  OldSale_ext.Category, OldSale_ext.Color, OldSale_ext.ModelYear,
  OldSale_ext.Graphics, OldSale_ext.ItemMaterial, OldSale_ext.ListPrice,
  OldSale_ext.Style, OldSale_ext.SkillLevel, OldSale_ext.WeightMax,
  OldSale_ext.WaistWidth, OldSale_ext.BindingStyle
FROM OldSale_ext;
UNION Query for Models
SELECT DISTINCT ModelID, ManufacturerID, Category, …
FROM OldSale_ext
UNION
SELECT DISTINCT ModelID, ManufacturerID, Category, …
FROM OldRental_ext
Insert Model Data into ItemModel
INSERT INTO ItemModel (ModelID, ManufacturerID, Category, Color, ModelYear,
  Graphics, ItemMaterial, ListPrice, Style, SkillLevel, WeightMax,
  WaistWidth, BindingStyle)
SELECT DISTINCT qryOldModels.ModelID, qryOldModels.ManufacturerID,
  qryOldModels.Category, qryOldModels.Color, qryOldModels.ModelYear,
  qryOldModels.Graphics, qryOldModels.ItemMaterial, qryOldModels.ListPrice,
  qryOldModels.Style, qryOldModels.SkillLevel, qryOldModels.WeightMax,
  qryOldModels.WaistWidth, qryOldModels.BindingStyle
FROM qryOldModels;
Insert SKU Data into Inventory
CREATE VIEW qryOldInventory AS
SELECT DISTINCT ModelID, SKU, ItemSize
FROM OldSale_ext
UNION
SELECT DISTINCT ModelID, SKU, ItemSize
FROM OldRental_ext;
INSERT INTO Inventory (ModelID, SKU, ItemSize, QuantityOnHand)
SELECT DISTINCT qryOldInventory.ModelID, qryOldInventory.SKU,
  qryOldInventory.ItemSize, 0 AS QuantityOnHand
FROM qryOldInventory;
Note that the constant 0, given the column alias QuantityOnHand, forces a zero value for QuantityOnHand in each row.
Copy Sales Data
INSERT INTO Sale (SaleID, SaleDate, ShipState, ShipZIP, PaymentMethod)
SELECT DISTINCT OldSale_ext.SaleID, OldSale_ext.SaleDate,
  OldSale_ext.ShipState, OldSale_ext.ShipZIP, OldSale_ext.PaymentMethod
FROM OldSale_ext;
Note that if you have already added data to your Sale table, your existing SaleID values might conflict with these.
You can solve the problem by adding a number to the old SaleID values so they are all larger than your highest existing ID.
INSERT INTO Sale (SaleID, SaleDate, ShipState, ShipZIP, PaymentMethod)
SELECT DISTINCT OldSale_ext.SaleID+5000, OldSale_ext.SaleDate,
  OldSale_ext.ShipState, OldSale_ext.ShipZIP, OldSale_ext.PaymentMethod
FROM OldSale_ext;
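One way to choose the offset, sketched here rather than prescribed by the workbook, is to compare the highest SaleID already in the Sale table with the range of SaleID values in the old data. The value 5000 is safe only if the smallest old ID plus 5000 is larger than the highest existing ID.

```sql
-- Highest SaleID already in the Sale table (NULL if the table is empty).
SELECT MAX(SaleID) AS MaxExistingID
FROM Sale;

-- Range of SaleID values in the old data, before any offset is added.
SELECT MIN(SaleID) AS MinOldID, MAX(SaleID) AS MaxOldID
FROM OldSale_ext;
```

If MinOldID plus the offset is greater than MaxExistingID, the shifted values cannot collide with rows that are already present.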
Copy SaleItem Rows
INSERT INTO SaleItem (SaleID, SKU, QuantitySold, SalePrice)
SELECT DISTINCT OldSale_ext.SaleID+5000, OldSale_ext.SKU,
  OldSale_ext.QuantitySold, OldSale_ext.SalePrice
FROM OldSale_ext;
If you transformed the SaleID in the prior step for the Sale data, you must do the exact same calculation for SaleID in the SaleItem table
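A quick consistency check can confirm that the same calculation was used in both tables. This is a minimal sketch, not part of the workbook; if SaleItem has a foreign key to Sale, a mismatch would instead cause the INSERT itself to fail.

```sql
-- Count SaleItem rows whose SaleID has no matching row in Sale.
-- A result of zero means every copied item points at a copied sale.
SELECT COUNT(*) AS OrphanRows
FROM SaleItem
WHERE NOT EXISTS
  (SELECT 1 FROM Sale WHERE Sale.SaleID = SaleItem.SaleID);
```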
Copy Rental Data
INSERT INTO Rental (RentID, RentDate, ExpectedReturn, PaymentMethod)
SELECT DISTINCT OldRental_ext.RentID+5000, OldRental_ext.RentDate,
  OldRental_ext.ExpectedReturn, OldRental_ext.PaymentMethod
FROM OldRental_ext;
INSERT INTO RentItem (RentID, SKU, RentFee, ReturnDate)
SELECT DISTINCT OldRental_ext.RentID+5000, OldRental_ext.SKU,
  OldRental_ext.RentFee, OldRental_ext.ReturnDate
FROM OldRental_ext;
Discoverer Administrator: Load Business Area
Schema
Select tables
Tables and views
Load Wizard Options: LOV
Most options are selected by default
Select the LOV option to have Discoverer build lookup lists
Discoverer: Business Area
Tables shown as folders and named so managers understand them
Columns shown as items
Add a calculated item
Create a Data Hierarchy
Select Category and Style from the SkiBoardStyle lookup table
Discoverer Desktop: New Workbook
Select the dimensions and the fact item
Initial Crosstab Layout
Row area
Column area
Page area
Discoverer Crosstab Browser
Select all items
Format options
Totals
Time Series Analysis: Moving Average
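A moving average can also be computed in the database. The following is a minimal sketch, assuming a three-row trailing window (the window length is an assumption, not taken from the slide) and using Oracle's analytic AVG function over monthly totals from Sale and SaleItem.

```sql
-- Monthly sales totals with a trailing three-row moving average.
SELECT SaleMonth,
       MonthlySales,
       AVG(MonthlySales) OVER
         (ORDER BY SaleMonth
          ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS MovingAvg3
FROM (
  SELECT TRUNC(Sale.SaleDate, 'MM') AS SaleMonth,
         SUM(SaleItem.SalePrice * SaleItem.QuantitySold) AS MonthlySales
  FROM Sale INNER JOIN SaleItem
    ON Sale.SaleID = SaleItem.SaleID
  GROUP BY TRUNC(Sale.SaleDate, 'MM')
)
ORDER BY SaleMonth;
```

The window counts rows, not calendar months, so a month with no sales is skipped rather than averaged in as zero, and the first two rows average fewer than three values.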
Time Series Analysis: Discoverer
Sales by State for Regression
Note that some states are missing from the list.
Regression Data Query
CREATE VIEW StateSales2004 AS
SELECT StateName, Income2001, Pop2002, Sum(SalePrice*QuantitySold) AS Sales2004
FROM Sale INNER JOIN StateDemographics
  ON Sale.ShipState = StateDemographics.StateCode
INNER JOIN SaleItem
  ON Sale.SaleID = SaleItem.SaleID
WHERE ShipState IS NOT NULL
  AND SaleDate Between '01-Jan-2004' And '31-Dec-2004'
GROUP BY StateName, Income2001, Pop2002
ORDER BY StateName;
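Because the view uses inner joins, any state with no 2004 sales drops out of the result, which is why some states are missing from the list. A sketch of a query that lists those states is shown here. It uses explicit TO_DATE calls rather than relying on the session's default date format, and it ignores the corner case of a sale that has no SaleItem rows.

```sql
-- States in StateDemographics with no 2004 sales shipped to them.
SELECT StateDemographics.StateName
FROM StateDemographics
WHERE NOT EXISTS
  (SELECT 1
   FROM Sale
   WHERE Sale.ShipState = StateDemographics.StateCode
     AND Sale.SaleDate BETWEEN TO_DATE('01-01-2004', 'DD-MM-YYYY')
                           AND TO_DATE('31-12-2004', 'DD-MM-YYYY'))
ORDER BY StateDemographics.StateName;
```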
Regression Setup
Include the label row in the selected ranges, and be sure to check the box that indicates the ranges contain labels.
Regression Results
Relatively high R-square
Population is a significant predictor, Income is not
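As a rough cross-check inside the database, Oracle's linear regression aggregate functions can fit each predictor separately against the StateSales2004 view. This is only a sketch: these are one-predictor regressions, so the numbers will not match a multiple regression that includes both variables, and the functions report no significance tests.

```sql
-- Simple (one-predictor) regressions of 2004 sales on each variable.
-- In the REGR_ functions the first argument is the dependent variable.
SELECT REGR_SLOPE(Sales2004, Pop2002)     AS PopSlope,
       REGR_INTERCEPT(Sales2004, Pop2002) AS PopIntercept,
       REGR_R2(Sales2004, Pop2002)        AS PopRSquare,
       REGR_R2(Sales2004, Income2001)     AS IncomeRSquare
FROM StateSales2004;
```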
Association Rules/Market Basket
Locate folders
Item to find          Possible location
Data mining samples   D:\Oracle\ora92\dm\demo\sample
ORACLE_HOME           D:\Oracle\ora92
JAVA_HOME             C:\OracleData\Ora92DS\jdk
Copy Files to Protect Original
compileSampleCode.bat
executeSampleCode.bat
Sample_AssociationRules.java
Sample_AssociationRules_Transactional.property
Sample_Global.property
Edit Sample_Global.property File
miningServer.url=jdbc:oracle:thin:@YourServerName:1521:DBName
miningServer.userName=odm
miningServer.password=password
inputDataSchemaName=powder
outputSchemaName=powder
timeout=120
If necessary, use Enterprise Manager to unlock the odm and odm_mtr accounts and assign new passwords to them.
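The same unlock and password change could be done from SQL instead of Enterprise Manager. This is a sketch, assuming you are connected with DBA privileges; NewPassword1 is a placeholder, not a value from the workbook, and whatever password you choose for odm must match miningServer.password in the property file.

```sql
-- Run as a DBA account; NewPassword1 is a placeholder, not a real value.
ALTER USER odm IDENTIFIED BY NewPassword1 ACCOUNT UNLOCK;
ALTER USER odm_mtr IDENTIFIED BY NewPassword1 ACCOUNT UNLOCK;
```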
Create New Table To Hold Transaction Basket Data
CREATE TABLE MARKET_BASKET_TX_BINNED
( SEQUENCE_ID INTEGER,
ATTRIBUTE_NAME VARCHAR2(35),
VALUE NUMBER
);
GRANT SELECT ON MARKET_BASKET_TX_BINNED TO odm;
commit;
If you use these table and column names, you do not have to edit the Sample_AssociationRules_Transactional.property file.
Copy SaleItem Data
INSERT INTO MARKET_BASKET_TX_BINNED (SEQUENCE_ID, ATTRIBUTE_NAME, VALUE)
SELECT SaleID,
  ItemModel.Category || '_' || ItemModel.Style AS AName, 1 AS Value
FROM SaleItem INNER JOIN Inventory
  ON SaleItem.SKU = Inventory.SKU
INNER JOIN ItemModel
  ON Inventory.ModelID = ItemModel.ModelID
GROUP BY SaleID, ItemModel.Category || '_' || ItemModel.Style;
Copy Sale Data
INSERT INTO MARKET_BASKET_TX_BINNED
(SEQUENCE_ID, ATTRIBUTE_NAME, VALUE)
SELECT SaleID, 'ID', SaleID
FROM Sale;
commit;
Remove Dashes from Attribute
UPDATE MARKET_BASKET_TX_BINNED
SET ATTRIBUTE_NAME = substr(ATTRIBUTE_NAME,1,instr(ATTRIBUTE_NAME,'-')-1)
|| '_' || substr(ATTRIBUTE_NAME,instr(ATTRIBUTE_NAME,'-')+1)
WHERE instr(ATTRIBUTE_NAME,'-') > 0;
commit;
Run this statement at least twice, until it reports zero rows updated, because a value might contain more than one dash and each run replaces only the first one.
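Oracle's REPLACE function substitutes every occurrence of a string in one pass, so a single statement could stand in for the repeated runs. This is a sketch of that alternative, not the slide's method:

```sql
-- Replace every dash with an underscore in one pass.
UPDATE MARKET_BASKET_TX_BINNED
SET ATTRIBUTE_NAME = REPLACE(ATTRIBUTE_NAME, '-', '_')
WHERE INSTR(ATTRIBUTE_NAME, '-') > 0;
commit;
```

The WHERE clause is kept so that only rows that actually contain a dash are updated.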
Limit Size of Attribute_Name
UPDATE MARKET_BASKET_TX_BINNED
SET ATTRIBUTE_NAME = substr(ATTRIBUTE_NAME,1,20);
commit;
This step is critical, although the need for it is probably due to a bug in Oracle’s code. There is a slight chance it arises from Oracle’s 30-character limit on names.
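Truncating to 20 characters could make two different attribute names identical, which would merge them in the mining results. A sketch of a check to run before the UPDATE, assuming the names have not yet been shortened:

```sql
-- List 20-character prefixes shared by more than one distinct name.
-- No rows returned means truncation will not merge any attributes.
SELECT SUBSTR(ATTRIBUTE_NAME, 1, 20) AS ShortName,
       COUNT(DISTINCT ATTRIBUTE_NAME) AS NameCount
FROM MARKET_BASKET_TX_BINNED
GROUP BY SUBSTR(ATTRIBUTE_NAME, 1, 20)
HAVING COUNT(DISTINCT ATTRIBUTE_NAME) > 1;
```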
Compile and Run the Code
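SET ORACLE_HOME=D:\Oracle\ora92
SET JAVA_HOME=C:\OracleData\Ora92DS\jdk
compileSampleCode.bat Sample_AssociationRules.java
executeSampleCode.bat Sample_AssociationRules Sample_AssociationRules_Transactional.property
Type the executeSampleCode.bat command all on one line; do not press <Enter> until the end.
Do not put spaces around the = sign in the SET commands, because the Windows command prompt treats them as part of the variable name and value.
To redirect the output to a file, add this at the end of the command: >myfile.txt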
Sample Results
Getting top 5 rules for model: Sample_AR_Model_tx sorted by support.
Rule 124: If Boots_=1 then Clothes_=1 [support: 0.17285714, confidence: 0.44814816]
Rule 38: If Clothes_=1 then Boots_=1 [support: 0.17285714, confidence: 0.35276967]
Rule 101: If Board_Half_Pipe=1 then Clothes_=1 [support: 0.11357143, confidence: 0.4622093]
Rule 9: If Clothes_=1 then Board_Half_Pipe=1 [support: 0.11357143, confidence: 0.23177843]
Rule 100: If Ski_Freestyle=1 then Clothes_=1 [support: 0.09785714, confidence: 0.48070174]
Get rules by support: Sample_AR_Model_tx, with minimum support of 0.16.
Rule 124: If Boots_=1 then Clothes_=1 [support: 0.17285714, confidence: 0.44814816]
Rule 38: If Clothes_=1 then Boots_=1 [support: 0.17285714, confidence: 0.35276967]
Get rules by confidence: Sample_AR_Model_tx, with confidence of 0.56 or more.
Investigate and think about the results.
Do you have too many clothes targeted to half-pipe boards and freestyle skiers, or not enough?
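To interpret the numbers: support is the fraction of all baskets that contain both items, and confidence is that support divided by the fraction of baskets that contain the "if" item. For Rule 124, 0.17285714 / 0.44814816 ≈ 0.386, so about 39% of baskets contain Boots_; for Rule 38, 0.17285714 / 0.35276967 ≈ 0.49, so about 49% contain Clothes_. The following sketch recomputes one rule directly from MARKET_BASKET_TX_BINNED. It assumes every sale counts as a basket; the results may differ slightly from the Oracle Data Mining output if that tool counts transactions differently.

```sql
-- Support and confidence for the rule: If Boots_ then Clothes_.
SELECT SUM(HasBoots * HasClothes) / COUNT(*) AS Support,
       SUM(HasBoots * HasClothes) / NULLIF(SUM(HasBoots), 0) AS Confidence
FROM (
  -- One row per basket, with 0/1 flags for each item of interest.
  SELECT SEQUENCE_ID,
         MAX(CASE WHEN ATTRIBUTE_NAME = 'Boots_' THEN 1 ELSE 0 END) AS HasBoots,
         MAX(CASE WHEN ATTRIBUTE_NAME = 'Clothes_' THEN 1 ELSE 0 END) AS HasClothes
  FROM MARKET_BASKET_TX_BINNED
  GROUP BY SEQUENCE_ID
);
```

Swapping the two attribute names in the confidence denominator gives the reverse rule, which has the same support but a different confidence.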
GIS: Microsoft MapPoint
The Discoverer worksheet places the data into rows and columns
A dynamic copy of this sheet is used to remove the top rows
MapPoint Data Wizard
GIS Analysis of Sales