Global Business Services Business Intelligence Practice

Customer Interaction Analysis Project

Data Mart
Physical Data Model
Sample Deliverable

Version 1.0
01/27/2005

[Figure: business intelligence reference architecture (Data Sources, Data Integration, Data Repositories, Analytics, Access, and supporting layers), with the Data Repositories layer called out: Operational Data Stores, Data Warehouses, Data Marts, Staging Areas, Metadata.]

Trademarks

Trademarked names may appear throughout this document. Rather than list the names and entities that own the trademarks or insert a trademark symbol with each mention of the trademarked name, the names are used only for editorial purposes and to the benefit of the trademark owner with no intention of infringing upon that trademark.

Revision History

Date | Version | Revised By | Description
01/27/2006 | V 1.0 | IBM Consultant | Initial Version

Page i Data Mart Physical Data Model

Table of Contents

1. DOCUMENT OVERVIEW
1.1. CONTEXT
1.2. DOCUMENT AUDIENCE
1.3. PROJECT OVERVIEW

2. DATA MART PHYSICAL DATA MODEL
2.1. STAR SCHEMA
2.2. TABLE-COLUMN REPORT
2.2.1. Table: ARGMT_PRIM_PAY_TYP
2.2.2. Table: CHANNEL
2.2.3. Table: CUSTOMER
2.2.4. Table: CUST_INT_ANALYSIS
2.2.5. Table: CUST_PERF_STAT
2.2.6. Table: GEOGRAPHIC_AREA
2.2.7. Table: MEAS_PERIOD
2.2.8. Table: PRIM_PROD_ARGMT

3. DATABASE CONFIGURATION
3.1. TABLESPACE CONFIGURATIONS
3.2. BUFFERPOOL CONFIGURATIONS

4. DATA NAMING STANDARDS
4.1. Approved Abbreviations


1. Document Overview

This document has been developed by the Customer Interaction Analysis – Data Repository Thread project for the Financial Services Client Enterprise Data Warehouse Program.

This deliverable was developed from the IBM Business Intelligence Methodology deliverables, leveraging the IBM Banking Data Warehouse Model (BDWM), using the Customer Lifetime Value Application Solution Template.

BDWM Application Solution Templates (ASTs) provide a template of the data requirements for non-reporting applications to which the BDW is expected to deliver data, for example Customer Lifetime Value.

1.1. Context

The Data Mart Physical Data Model captures the database-specific requirements needed to implement the data mart, including its support for multi-layered aggregations.

For this deliverable, the data model consists of a Star Schema and a Table-Column Report.

1.2. Document Audience

The primary audience is:

Enterprise Data Warehouse Architects

Dimensional Data Modelers

Application Database Administrators

These are the individuals responsible for envisioning, modeling, and structuring the data repository environment.


1.3. Project Overview

The major objectives of the Financial Services Client Customer Interaction Analysis Project include:

Develop a reporting architecture to provide interactive analytics to loan, credit risk, and finance users leveraging an Analytics Package.

Develop the Customer Interaction Analysis Data Mart database to support the analytic requirements.

Develop the core Data Warehouse tables to supply the data to the Customer Interaction Analysis Data Mart database.

Develop the Customer Operational Data Store database as a staging area to consolidate customer information for operational reporting and to populate the Data Warehouse tables.

Develop the data integration processes required to load and update the Customer Operational Data Store, the core Data Warehouse tables, and the Customer Interaction Analysis Data Mart database.


2. Data Mart Physical Data Model

2.1. Star Schema

Model Name: CIA Data Warehouse Model

Project: Customer Interaction Analysis

Model Type: Physical Data Model

Structure Type: Data Mart

Author: IBM Consultant

Version: 1.0

Date: 01/01/06

Argmt_Prim_Pay_Typ

Arr_Pri_Py_Typ_Id: INTEGER

Arr_Pri_Py_Typ_Dsc: VARCHAR(20)

Channel

Chnl_Id: SMALLINT

Chnl_Cod: CHAR(6)
Chnl_Nam: CHAR(32)

Customer

Cust_Id: INTEGER

Eff_Cust_Dat: DATE
End_Cust_Dat: DATE
Provision_Flg: SMALLINT
Air_Travel_Flg: SMALLINT
Air_Clb_Pass_Flg: SMALLINT
Internet_Bnk_Flg: SMALLINT
Special_Term_Flg: SMALLINT
Phone_Bnk_Flg: SMALLINT

Cust_Int_Analysis

Meas_Period: SMALLINT
Cust_Id: INTEGER
Geo_Area_Id: INTEGER
Cust_Perf_Stat_Id: INTEGER
Prim_Prod_Argmt_Id: INTEGER
Argt_Prim_Py_Tp_Id: INTEGER
Chnl_Id: SMALLINT

Num_Act_Com_Thrds: SMALLINT
Num_Of_Thrds_Clsd: SMALLINT
Tot_FI_Proc_Time: INTEGER
Tot_Num_New_Argmt: SMALLINT
Avg_Num_Of_Comm: NUMERIC(15,5)
Avg_Num_Of_Chnl_Us: NUMERIC(15,5)
Ave_Thrd_Duration: SMALLINT
Meas_Per_Typ_Id: INTEGER
Season_Id: INTEGER
Chnl_Typ_Id: INTEGER

Cust_Perf_Stat

Cust_Perf_Stat_Id: INTEGER

Cust_Perf_Stat_Dsc: VARCHAR(20)

Geographic_Area

Geo_Area_Id: INTEGER

Time_Zone_Id: SMALLINT
Geo_Area_Typ_Id: SMALLINT
Geo_Area_Den_Ds_Id: SMALLINT
Unemp_Rate_Seg_Id: SMALLINT
Bnkrpt_Rate_Seg_Id: SMALLINT
Infl_Rate_Seg_Id: SMALLINT
Geo_Area_Nature_Id: SMALLINT
Telephonic_Code_Id: INTEGER
Geo_Area_Code: CHAR(6)

Prim_Prod_Argmt

Prim_Prod_Argmt_Id: INTEGER

Prim_Prod_Argmt_Ds: VARCHAR(20)

Meas_Period

Meas_Period: SMALLINT

Pop_Dat: DATE
Pop_Tim: TIME
Unq_Id_In_Src_Sys: CHAR(20)
Meas_Period_Typ_Id: SMALLINT
Eff_Date: DATE
End_Date: DATE
Meas_Period_Name: CHAR(32)
Par_Cal_Period_Id: SMALLINT
Cal_Year: SMALLINT
Cal_Quarter: SMALLINT
Cal_Month: SMALLINT
Wk_Of_Cal_Year: SMALLINT
Wk_Of_Cal_Quarter: SMALLINT
Wk_Of_Cal_Month: SMALLINT
Day_Of_Cal_Year: SMALLINT
Day_Of_Cal_Quarter: SMALLINT
Day_Of_Cal_Month: SMALLINT
Par_Fiscal_Per_Id: SMALLINT
Fiscal_Year: SMALLINT
Fiscal_Quarter: SMALLINT
Fiscal_Month: SMALLINT
Wk_Of_Fiscal_Year: SMALLINT
Wk_Of_Fiscal_Quart: SMALLINT
Wk_Of_Fiscal_Month: SMALLINT
Day_Of_Fiscal_Year: SMALLINT
Day_Of_Fiscal_Quar: SMALLINT
Day_Of_Fiscal_Mont: SMALLINT
Day_Of_Week: SMALLINT
Season_Id: INTEGER
Num_Of_Days: SMALLINT
Num_Of_Bus_Days: SMALLINT
Num_Of_Crd_Int_Day: SMALLINT
Num_Of_Dbt_Int_Day: SMALLINT
Public_Holidy_Flg: SMALLINT
Company_Holidy_Flg: SMALLINT
Business_Day_Flg: SMALLINT
Lst_Bs_Dy_In_Mn_Fg: SMALLINT
Description: VARCHAR(256)
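To illustrate how a star schema like this one is queried, the following is a minimal sketch using Python's built-in sqlite3 module rather than the project's target database. It builds only the Channel dimension and a cut-down Cust_Int_Analysis fact table; the sample rows and the reduced column set are assumptions made for illustration and are not part of the deliverable.

```python
import sqlite3

# In-memory database; SQLite stands in for the real target DBMS (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Channel (
    Chnl_Id  SMALLINT PRIMARY KEY,
    Chnl_Cod CHAR(6),
    Chnl_Nam CHAR(32)
);
-- Reduced fact table: only a few of the documented columns are kept.
CREATE TABLE Cust_Int_Analysis (
    Meas_Period       SMALLINT,
    Cust_Id           INTEGER,
    Chnl_Id           SMALLINT REFERENCES Channel (Chnl_Id),
    Num_Act_Com_Thrds SMALLINT,
    Num_Of_Thrds_Clsd SMALLINT
);
""")

# Hypothetical sample data.
conn.executemany(
    "INSERT INTO Channel VALUES (?, ?, ?)",
    [(1, "ATM", "ATM Network"), (2, "CALL", "Call Centre")],
)
conn.executemany(
    "INSERT INTO Cust_Int_Analysis VALUES (?, ?, ?, ?, ?)",
    [(1, 100, 1, 3, 1), (1, 101, 1, 2, 2), (1, 100, 2, 5, 4)],
)

# Typical star join: aggregate fact measures, grouped by a dimension attribute.
rows = conn.execute("""
    SELECT c.Chnl_Nam,
           SUM(f.Num_Act_Com_Thrds) AS active_threads,
           SUM(f.Num_Of_Thrds_Clsd) AS closed_threads
    FROM Cust_Int_Analysis AS f
    JOIN Channel AS c ON c.Chnl_Id = f.Chnl_Id
    GROUP BY c.Chnl_Nam
    ORDER BY c.Chnl_Nam
""").fetchall()

for name, active, closed in rows:
    print(name, active, closed)
```

The same pattern extends to the other dimensions: each additional join adds one more attribute to group or filter by.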


2.2. Table-Column Report

2.2.1. Table: ARGMT_PRIM_PAY_TYP

Table Type: Dimension

Logical Name: Arrangement Primary Payment Type

Description: Channel identifies the different delivery and communications mechanisms through which products and services are made available to a customer and by which the Financial Institution and customers communicate with each other.

A Channel is a role played by either an Involved Party (e.g. Employee, Organization Unit) or a Resource Item (e.g. an ATM, a Website).

The lowest granularity of Channel required will be a matter of choice for the Financial Institution. Some may wish to just identify the ATM Network (a Resource Item) as a Channel, whereas others will wish to be able to identify each individual ATM machine (each a Resource Item). A Call Centre (an Organization Unit) may be sufficient granularity as a Channel in some cases - others will require recording of each Call Centre operative (Employees).

Where a given Involved Party or Resource Item instance is capable of both receiving and distributing services, it may be appropriate to associate that instance with two Channels. For example, a Teller Employment Position may be part of the Teller Receipt Channel for Transactions, but part of the Teller Distribution Channel for Product Campaigns.

Domain: None

Columns:

Logical Attribute Name | Physical Column Name | Datatype | Required | Default Value
Arrangement Primary Payment Type Id | Arr_Pri_Py_Typ_Id | Integer | Yes |
Arrangement Primary Payment Type Dscr | Arr_Pri_Py_Typ_Dsc | Varchar | No |

Indices:

Index Name | Index Type | Columns
XPKArrangement_Pri | PK | Arr_Pri_Py_Typ_Id

Constraints:

Constraint Name | Constraint Type | Referenced Table | Columns
XPKArrangement_Pri | Primary Key | Argmt_Prim_Pay_Typ | Arr_Pri_Py_Typ_Id

2.2.2. Table: CHANNEL

Table Type: Dimension

Logical Name: Channel

Description: Channel identifies the different delivery and communications mechanisms through which products and services are made available to a customer and by which the Financial Institution and customers communicate with each other.

A Channel is a role played by either an Involved Party (e.g. Employee, Organization Unit) or a Resource Item (e.g. an ATM, a Website).

The lowest granularity of Channel required will be a matter of choice for the Financial Institution. Some may wish to just identify the ATM Network (a Resource Item) as a Channel, whereas others will wish to be able to identify each individual ATM machine (each a Resource Item). A Call Centre (an Organization Unit) may be sufficient granularity as a Channel in some cases - others will require recording of each Call Centre operative (Employees).

Where a given Involved Party or Resource Item instance is capable of both receiving and distributing services, it may be appropriate to associate that instance with two Channels. For example, a Teller Employment Position may be part of the Teller Receipt Channel for Transactions, but part of the Teller Distribution Channel for Product Campaigns.

Domain: None

Columns:

Logical Attribute Name | Physical Column Name | Datatype | Required | Default Value
Channel Id | Chnl_Id | Smallint | No |
Channel Type Id | Chnl_Typ_Id | Integer | Yes |
Channel Code | Chnl_Cod | Char | No |
Channel Name | Chnl_Nam | Char | No |

Indices:

Index Name | Index Type | Columns
XPKChannel | PK | Chnl_Id

Constraints:

Constraint Name | Constraint Type | Referenced Table | Columns
XPKChannel | Primary Key | Channel | Chnl_Id

2.2.3. Table: CUSTOMER

Table Type: Dimension

Logical Name: Customer

Description: A Customer is a role played by an Involved Party that is considered to be receiving services or products from the Financial Institution or one of its Organization Units, or who is a potential recipient of such services or products.

Domain: None

Columns:

Logical Attribute Name | Physical Column Name | Datatype | Required | Default Value
Customer Id | Cust_Id | Integer | Yes |
Effective Customer Date | Eff_Cust_Dat | Date | No |
End Customer Date | End_Cust_Dat | Date | No |
Provision Flag | Provision_Flg | Smallint | Yes |
Air Travel Flag | Air_Travel_Flg | Smallint | No |
Airline Club Pass Flag | Air_Clb_Pass_Flg | Smallint | No |
Internet Banking Flag | Internet_Bnk_Flg | Smallint | No |
Special Terms Flag | Special_Term_Flg | Smallint | No |
Telephone Banking Flag | Phone_Bnk_Flg | Smallint | No |

Indices:

Index Name | Index Type | Columns
XPKCustomer | PK | Cust_Id

Constraints:

Constraint Name | Constraint Type | Referenced Table | Columns
XPKCustomer | Primary Key | Customer | Cust_Id


2.2.4. Table: CUST_INT_ANALYSIS

Table Type: Fact

Logical Name: Customer Interaction Analysis

Description:

Domain: None

Columns:

Logical Attribute Name | Physical Column Name | Datatype | Required | Default Value
Meas_Period | Meas_Period | Smallint | Yes |
Customer Id | Cust_Id | Integer | Yes |
Geographic Area Id | Geo_Area_Id | Integer | Yes |
Customer Performance Status Id | Cust_Perf_Stat_Id | Integer | Yes |
Primary Product Arrangement Id | Prim_Prod_Argmt_Id | Integer | No |
Arrangement Primary Payment Type Id | Argt_Prim_Py_Tp_Id | Integer | No |
Channel Id | Chnl_Id | Smallint | No |
Interaction Type | Int_Typ_Id | Smallint | No |
Number of Active Communication Threads | Num_Act_Com_Thrds | Smallint | No |
Number Of Threads Closed | Num_Of_Thrds_Clsd | Smallint | No |
Total FI Processing Time | Tot_FI_Proc_Time | Integer | No |
Tot No of New Arrangements From Communications | Tot_Num_New_Argmt | Smallint | No |
Average Number of Communications | Avg_Num_Of_Comm | Numeric | No |
Average Number of Channels Used | Avg_Num_Of_Chnl_Us | Numeric | No |
Average Thread Duration | Ave_Thrd_Duration | Smallint | No |
Measurement Period Type Id | Meas_Per_Typ_Id | Integer | No |
Season Id | Season_Id | Integer | No |
Channel Type Id | Chnl_Typ_Id | Integer | No |

Indices:

Index Name | Index Type | Columns
XPKCustomer_Intera | PK | Meas_Period, Cust_Id, Geo_Area_Id, Cust_Perf_Stat_Id, Prim_Prod_Argmt_Id, Argt_Prim_Py_Tp_Id, Chnl_Id
XIF1Cust_Int_Analy | FK | Chnl_Typ_Id
XIF10Customer_Inte | FK | Meas_Per_Typ_Id
XIF11Cust_Int_Anal | FK | Meas_Period
XIF2Cust_Int_Analy | FK | Chnl_Id
XIF3Cust_Int_Analy | FK | Argt_Prim_Py_Tp_Id
XIF4Cust_Int_Analy | FK | Prim_Prod_Argmt_Id
XIF5Cust_Int_Analy | FK | Cust_Perf_Stat_Id
XIF6Cust_Int_Analy | FK | Geo_Area_Id
XIF7Cust_Int_Analy | FK | Cust_Id
XIF8Cust_Int_Analy | FK | Season_Id

Constraints:

Constraint Name | Constraint Type | Referenced Table | Columns
XPKCustomer_Intera | Primary Key | Cust_Int_Analysis | Meas_Period, Cust_Id, Geo_Area_Id, Cust_Perf_Stat_Id, Prim_Prod_Argmt_Id, Argt_Prim_Py_Tp_Id, Chnl_Id
R_1 | Foreign Key | Meas_Period | Meas_Period
R_7 | Foreign Key | Customer | Cust_Id
R_8 | Foreign Key | Geographic_Area | Geo_Area_Id
R_9 | Foreign Key | Cust_Perf_Stat | Cust_Perf_Stat_Id
R_10 | Foreign Key | Prim_Prod_Argmt | Prim_Prod_Argmt_Id
R_11 | Foreign Key | Argmt_Prim_Pay_Typ | Arr_Pri_Py_Typ_Id
R_13 | Foreign Key | Channel | Chnl_Id
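The effect of the primary key and foreign key constraints listed above can be sketched as follows. This is an illustrative example only: it uses SQLite through Python's sqlite3 module instead of the project's database, and it shortens the composite key to three of the seven documented columns (an assumption made to keep the example small). The sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Customer (Cust_Id INTEGER PRIMARY KEY);
CREATE TABLE Channel  (Chnl_Id SMALLINT PRIMARY KEY);
-- Simplified fact table: the documented key has seven columns, three are shown.
CREATE TABLE Cust_Int_Analysis (
    Meas_Period      SMALLINT NOT NULL,
    Cust_Id          INTEGER  NOT NULL REFERENCES Customer (Cust_Id),
    Chnl_Id          SMALLINT NOT NULL REFERENCES Channel (Chnl_Id),
    Tot_FI_Proc_Time INTEGER,
    PRIMARY KEY (Meas_Period, Cust_Id, Chnl_Id)
);
INSERT INTO Customer VALUES (100);
INSERT INTO Channel  VALUES (1);
""")


def try_insert(row):
    """Insert one fact row and report whether the constraints accepted it."""
    try:
        conn.execute("INSERT INTO Cust_Int_Analysis VALUES (?, ?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


results = [
    try_insert((1, 100, 1, 42)),  # valid row
    try_insert((1, 100, 1, 7)),   # same key values again: primary key violation
    try_insert((1, 100, 9, 7)),   # channel 9 does not exist: foreign key violation
]
print(results)
```

Declaring the dimension keys as foreign keys means a fact row can only be loaded after its dimension rows exist, which constrains the load order of the data integration processes.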


2.2.5. Table: CUST_PERF_STAT

Table Type: Dimension

Logical Name: Customer Performance Status

Description: None.

Domain: None

Columns:

Logical Attribute Name | Physical Column Name | Datatype | Required | Default Value
Customer Performance Status Id | Cust_Perf_Stat_Id | Integer | Yes |
Customer Performance Status Dscr | Cust_Perf_Stat_Dsc | Varchar | No |

Indices:

Index Name | Index Type | Columns
XPKCust_Perf_Stat | PK | Cust_Perf_Stat_Id

Constraints:

Constraint Name | Constraint Type | Referenced Table | Columns
XPKCust_Perf_Stat | Primary Key | Cust_Perf_Stat | Cust_Perf_Stat_Id


2.2.6. Table: GEOGRAPHIC_AREA

Table Type: Dimension

Logical Name: Geographic Area

Description: Geographic Area is a Location that identifies a bounded area or a combination of bounded areas that is defined by nature or society; for example, Africa, Germany, the Midwest, or Fairfax County.

Domain: None

Columns:

Logical Attribute Name | Physical Column Name | Datatype | Required | Default Value
Geographic Area Id | Geo_Area_Id | Integer | Yes |
Time Zone Id | Time_Zone_Id | Smallint | No |
Geographic Area Type Id | Geo_Area_Typ_Id | Smallint | Yes |
Geographic Area Density Designation Id | Geo_Area_Den_Ds_Id | Smallint | No |
Unemployment Rate Segment Id | Unemp_Rate_Seg_Id | Smallint | No |
Bankruptcy Rate Segment Id | Bnkrpt_Rate_Seg_Id | Smallint | No |
Inflation Rate Segment Id | Infl_Rate_Seg_Id | Smallint | No |
Geographic Area Nature Id | Geo_Area_Nature_Id | Smallint | No |
Telephonic Code Id | Telephonic_Code_Id | Integer | No |
Geographic Area Code | Geo_Area_Code | Char | No |

Indices:

Index Name | Index Type | Columns
XPKGeographic_Area | PK | Geo_Area_Id

Constraints:

Constraint Name | Constraint Type | Referenced Table | Columns
XPKGeographic_Area | Primary Key | Geographic_Area | Geo_Area_Id


2.2.7. Table: MEAS_PERIOD

Table Type: Dimension

Logical Name: Measurement Period

Description: Measurement Period records the intervals of time at which measurements are captured in the warehouse.

Given the importance and frequency of use of the temporal dimension in the warehouse, significant redundancy with regard to this entity is justified. The concept of the 'pre-computed structure of time' is appropriate in this regard.

Domain: None

Columns:

Logical Attribute Name | Physical Column Name | Datatype | Required | Default Value
Meas_Period | Meas_Period | Smallint | Yes |
Population Date | Pop_Dat | Date | Yes |
Population Time | Pop_Tim | Time | No |
Unique Id In Source System | Unq_Id_In_Src_Sys | Char | No |
Measurement Period Type Id | Meas_Period_Typ_Id | Smallint | Yes |
Effective Date | Eff_Date | Date | Yes |
End Date | End_Date | Date | Yes |
Measurement Period Name | Meas_Period_Name | Char | Yes |
Parent Calendar Period Id | Par_Cal_Period_Id | Smallint | No |
Calendar Year | Cal_Year | Smallint | No |
Calendar Quarter | Cal_Quarter | Smallint | No |
Calendar Month | Cal_Month | Smallint | No |
Week Of Calendar Year | Wk_Of_Cal_Year | Smallint | No |
Week Of Calendar Quarter | Wk_Of_Cal_Quarter | Smallint | No |
Week Of Calendar Month | Wk_Of_Cal_Month | Smallint | No |
Day Of Calendar Year | Day_Of_Cal_Year | Smallint | No |
Day Of Calendar Quarter | Day_Of_Cal_Quarter | Smallint | No |
Day Of Calendar Month | Day_Of_Cal_Month | Smallint | No |
Parent Fiscal Period Id | Par_Fiscal_Per_Id | Smallint | No |
Fiscal Year | Fiscal_Year | Smallint | No |
Fiscal Quarter | Fiscal_Quarter | Smallint | No |
Fiscal Month | Fiscal_Month | Smallint | No |
Week Of Fiscal Year | Wk_Of_Fiscal_Year | Smallint | No |
Week Of Fiscal Quarter | Wk_Of_Fiscal_Quart | Smallint | No |
Week Of Fiscal Month | Wk_Of_Fiscal_Month | Smallint | No |
Day Of Fiscal Year | Day_Of_Fiscal_Year | Smallint | No |
Day Of Fiscal Quarter | Day_Of_Fiscal_Quar | Smallint | No |
Day Of Fiscal Month | Day_Of_Fiscal_Mont | Smallint | No |
Day Of Week | Day_Of_Week | Smallint | No |
Season Id | Season_Id | Integer | No |
Number Of Days | Num_Of_Days | Smallint | No |
Number Of Business Days | Num_Of_Bus_Days | Smallint | No |
Number Of Credit Interest Days | Num_Of_Crd_Int_Day | Smallint | No |
Number Of Debit Interest Days | Num_Of_Dbt_Int_Day | Smallint | No |
Public Holiday Flag | Public_Holidy_Flg | Smallint | No |
Company Holiday Flag | Company_Holidy_Flg | Smallint | No |
Business Day Flag | Business_Day_Flg | Smallint | No |
Last Business Day In Month Flag | Lst_Bs_Dy_In_Mn_Fg | Smallint | No |
Description | Description | Varchar | No |

Indices:

Index Name | Index Type | Columns
XPKMeas_Per | PK | Meas_Period

Constraints:

Constraint Name | Constraint Type | Referenced Table | Columns
XPKMeas_Per | Primary Key | Meas_Period | Meas_Period
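The "pre-computed structure of time" mentioned in the description means the calendar attributes of each period are derived once at load time and stored as columns, so queries never recompute them. A minimal sketch of that derivation for a single day is shown below. The function name is hypothetical, only the calendar columns are covered (fiscal attributes depend on a fiscal calendar the document does not define), and the Day_Of_Week numbering (1 = Monday) is an assumption.

```python
from datetime import date


def calendar_attributes(d: date) -> dict:
    """Derive a subset of the MEAS_PERIOD calendar columns for one day."""
    quarter = (d.month - 1) // 3 + 1
    quarter_start = date(d.year, 3 * (quarter - 1) + 1, 1)
    return {
        "Cal_Year": d.year,
        "Cal_Quarter": quarter,
        "Cal_Month": d.month,
        "Day_Of_Cal_Year": d.timetuple().tm_yday,
        "Day_Of_Cal_Quarter": (d - quarter_start).days + 1,
        "Day_Of_Cal_Month": d.day,
        # Assumption: ISO numbering, 1 = Monday through 7 = Sunday.
        "Day_Of_Week": d.isoweekday(),
    }


row = calendar_attributes(date(2005, 5, 15))
print(row)
```

Storing these values redundantly trades a small amount of space for simpler, faster grouping and filtering on any calendar level.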


2.2.8. Table: PRIM_PROD_ARGMT

Table Type: Dimension

Logical Name: Primary Product Arrangement

Description: None.

Domain: None

Columns:

Logical Attribute Name | Physical Column Name | Datatype | Required | Default Value
Primary Product Arrangement Id | Prim_Prod_Argmt_Id | Integer | Yes |
Primary Product Arrangement Dscr | Prim_Prod_Argmt_Ds | Varchar | No |

Indices:

Index Name | Index Type | Columns
XPKPrim_Prod_Argmt | PK | Prim_Prod_Argmt_Id

Constraints:

Constraint Name | Constraint Type | Referenced Table | Columns
XPKPrim_Prod_Argmt | Primary Key | Prim_Prod_Argmt | Prim_Prod_Argmt_Id


3. Database Configuration

The basic database parameter settings for the CIA Operational Data Store are as follows.

3.1. Tablespace Configurations

Tablespace Name: TST_CST_INT_P08

Tablespace Type: Regular

Extent Size: 36

Prefetch Size: 108

3.2. Bufferpool Configurations

Bufferpool Name: BP_PART1_8K

NodeGroup: NG123

Size: 10000
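As a rough sizing check, the settings above can be converted into bytes. The sketch below rests on assumptions the document does not state: that extent, prefetch, and bufferpool sizes are expressed in pages, and that the page size is 8 KB, inferred only from the "8K" suffix in the bufferpool name.

```python
# Assumption: 8 KB pages, inferred from the bufferpool name BP_PART1_8K.
PAGE_SIZE_BYTES = 8 * 1024

# Values taken from the configuration listed above (assumed to be in pages).
extent_size_pages = 36
prefetch_size_pages = 108
bufferpool_size_pages = 10000

# Prefetch size as a whole multiple of the extent size.
extents_per_prefetch = prefetch_size_pages // extent_size_pages

# Bytes covered by one extent and by one prefetch request.
extent_bytes = extent_size_pages * PAGE_SIZE_BYTES
prefetch_bytes = prefetch_size_pages * PAGE_SIZE_BYTES

# Memory reserved by the bufferpool, in MiB.
bufferpool_mib = bufferpool_size_pages * PAGE_SIZE_BYTES / (1024 * 1024)

print(extents_per_prefetch, extent_bytes, prefetch_bytes, bufferpool_mib)
```

Under these assumptions each prefetch request spans three extents and the bufferpool holds a little under 80 MiB.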


4. Data Naming Standards

4.1. Approved Abbreviations

To keep physical names within the database's name-length limit, the following abbreviations are approved.

Logical Physical

Arrangement Argmt

Average Avg

Banking Bnk

Bankruptcy Bnkrpt

Business Bus

Channel Chnl

Calendar Cal

Customer Cust

Date Dat

Effective Eff

Flag Flg

Geography Geo

Inflation Infl

Interaction Interact

Measurement Meas

Number Num

Payment Pmt

Performance Perf

Primary Prim

Product Prod

Population Pop

Telephone Phone

Terms Term

Type Typ

Segment Seg

Source Src

Status Stat

System Sys

Week Wk
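The abbreviation list above can be applied mechanically to derive a candidate physical name from a logical name. The sketch below is one possible way to do that; the function name is hypothetical, and joining words with underscores is an assumption based on the column names used in this document. Words not in the list (such as "Id") are left unchanged.

```python
# Approved abbreviations, copied from the table above.
ABBREVIATIONS = {
    "Arrangement": "Argmt", "Average": "Avg", "Banking": "Bnk",
    "Bankruptcy": "Bnkrpt", "Business": "Bus", "Channel": "Chnl",
    "Calendar": "Cal", "Customer": "Cust", "Date": "Dat",
    "Effective": "Eff", "Flag": "Flg", "Geography": "Geo",
    "Inflation": "Infl", "Interaction": "Interact", "Measurement": "Meas",
    "Number": "Num", "Payment": "Pmt", "Performance": "Perf",
    "Primary": "Prim", "Product": "Prod", "Population": "Pop",
    "Telephone": "Phone", "Terms": "Term", "Type": "Typ",
    "Segment": "Seg", "Source": "Src", "Status": "Stat",
    "System": "Sys", "Week": "Wk",
}


def physical_name(logical_name: str) -> str:
    """Abbreviate each word where an approved form exists, then join with underscores."""
    words = logical_name.split()
    return "_".join(ABBREVIATIONS.get(word, word) for word in words)


print(physical_name("Customer Performance Status Id"))
print(physical_name("Telephone Banking Flag"))
print(physical_name("Effective Customer Date"))
```

The result is only a starting point: several physical names in the Table-Column Report use shorter forms than this list produces, so a generated name still needs to be checked against the length limit and adjusted by hand.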
