BIDW Roadmap Author : Dave Goyal

Upload: dave-goyal

Post on 08-Apr-2015


TRANSCRIPT

Page 1: BIDW Roadmap

BIDW Roadmap
Author: Dave Goyal

Page 2: BIDW Roadmap

BIDW Process Roadmap


Page 3: BIDW Roadmap

Overall Process
- Program / Project Planning and Management
- Business Process Definition
- Technical Architecture Design
- Product Selection and Installation
- Dimensional Modeling


Page 4: BIDW Roadmap

Overall Process (contd.)
- Physical Design
- ETL Design and Development
- BI Application Design
- BI Application Development
- Deployment
- Change Management and Maintenance


Page 5: BIDW Roadmap

Program / Project Planning and Management

- Define the Project
- Build the Business Case and Justification
- Plan the Project
- Manage the Project
- Manage the Program


Page 6: BIDW Roadmap

Business Process
- Define Business Process
- Define Requirements using Interviews
- Define Requirements using Facilitated Sessions


Page 7: BIDW Roadmap

Technical Architecture Design
- Back Room Architecture (Source, ETL)
- Presentation Server Architecture (Dimensional Architecture)
- Front Room Architecture (BI)
- Additional Architecture Features (Infrastructure, Metadata, Security)


Page 8: BIDW Roadmap

Product Selection and Installation
- Architecture Plan (DW Architecture Diagram and Application Architecture Document)
- Product Selection (Hardware/OS, DBMS, ETL, BI, Data Profiling, Data Cleansing, etc.)


Page 9: BIDW Roadmap

Dimensional Modeling
- Process Value Chain
- Business Process
- Choose the Business Process
- Declare the Grain
- Identify the Dimensions
- Identify the Facts
- Enterprise Bus Matrix
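The four design decisions on this slide (business process, grain, dimensions, facts) and the bus matrix that summarizes them can be sketched in a few lines. This is only an illustration: the two business processes, their grains, and the fact names are hypothetical insurance examples, and `ProcessModel`, `bus_matrix`, and `shared_dimensions` are names invented here, not part of the original deck.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessModel:
    """The four dimensional-modeling decisions for one business process."""
    business_process: str
    grain: str
    dimensions: tuple
    facts: tuple


# Hypothetical examples, for illustration only.
MODELS = [
    ProcessModel("Policy Sales", "one row per policy sold",
                 ("Date", "Agent", "PolicyHolder"), ("premium_amount",)),
    ProcessModel("Claims", "one row per claim payment",
                 ("Date", "PolicyHolder", "Claim"), ("paid_amount",)),
]


def bus_matrix(models):
    """Return (sorted dimension names, {process: {dimension: used?}})."""
    all_dims = sorted({d for m in models for d in m.dimensions})
    matrix = {
        m.business_process: {d: d in m.dimensions for d in all_dims}
        for m in models
    }
    return all_dims, matrix


def shared_dimensions(models):
    """Dimensions used by more than one business process."""
    dims, matrix = bus_matrix(models)
    return [d for d in dims if sum(row[d] for row in matrix.values()) > 1]


dims, matrix = bus_matrix(MODELS)
# Print the matrix: one row per business process, one column per dimension.
print("Process".ljust(14) + " ".join(d.ljust(12) for d in dims))
for process, row in matrix.items():
    marks = " ".join(("X" if row[d] else "-").ljust(12) for d in dims)
    print(process.ljust(14) + marks)
```

Dimensions that appear in more than one row of the matrix are the ones that have to be designed once and reused across fact tables.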


Page 10: BIDW Roadmap

Physical Design
- High Level Physical Design
- Develop Standards
- Develop the Physical Data Model
- Develop Initial Indexing Plan
- Design OLAP Database
- Design Aggregations


Page 11: BIDW Roadmap

ETL Table Naming Convention
- D_ : Dimension Table
- F_ : Fact Table
- S_ : Source Table - contains all data copied directly from a source file
- X_ : Extract Table - contains changed source data only; changes may come from an incremental extract or be derived from a full extract


Page 12: BIDW Roadmap

ETL Table Naming Convention 2
- C_ : Clean Table - contains source rows that have been cleaned
- E_ : Error Table - contains error rows found in source data
- M_ : Master Table - maintains history of all clean rows
- T_ : Transform Table - contains the data resulting from a transformation of source data


Page 13: BIDW Roadmap

ETL Table Naming Convention 3
- I_ : Insert Table - contains new data to be inserted in the dimension table
- U_ : Update Table - contains changed data to be inserted in the dimension table
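The prefixes from the three naming-convention slides can be kept in one lookup table so that ETL code builds and checks table names consistently. The sketch below is one possible way to do that; the helper names `table_name` and `describe` are assumptions made for this example, and the descriptions are paraphrased from the slides.

```python
# Prefix -> role, paraphrased from the naming-convention slides.
PREFIXES = {
    "D_": "Dimension table",
    "F_": "Fact table",
    "S_": "Source table: all data copied directly from a source file",
    "X_": "Extract table: changed source data only",
    "C_": "Clean table: source rows that have been cleaned",
    "E_": "Error table: error rows found in source data",
    "M_": "Master table: history of all clean rows",
    "T_": "Transform table: result of transforming source data",
    "I_": "Insert table: new data for the dimension table",
    "U_": "Update table: changed data for the dimension table",
}


def table_name(prefix, entity):
    """Build a table name such as S_Agents, rejecting unknown prefixes."""
    if prefix not in PREFIXES:
        raise ValueError(f"unknown table prefix: {prefix!r}")
    return prefix + entity


def describe(table):
    """Return the documented role of a table from its two-character prefix."""
    prefix = table[:2]
    if prefix not in PREFIXES:
        raise ValueError(f"unknown prefix in table name: {table!r}")
    return PREFIXES[prefix]


print(table_name("S_", "Agents"))
print(describe("T_Agent"))
```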


Page 14: BIDW Roadmap

Data Quality
- Avoid NULL strings in dimension tables
- Specify a default value for NOT NULL columns: 'N/A', 'Not Known', 'Invalid'
- Dimension primary keys should be auto-generated surrogate keys
- Allow data quality rows with keys 0, -1 and -2


Page 15: BIDW Roadmap

Surrogate Keys
- Always use auto-generated surrogate keys as dimension keys
- Use the SET IDENTITY_INSERT <table> ON and SET IDENTITY_INSERT <table> OFF SQL statements (SQL Server) to create the 0, -1 and -2 rows for each dimension when it is created:
    0 : INVALID
    -1 : UNKNOWN
    -2 : NOT APPLICABLE
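A minimal sketch of the reserved data-quality rows follows. It uses SQLite through Python's standard `sqlite3` module as a stand-in because it runs without a server. In SQLite an `INTEGER PRIMARY KEY` column accepts explicit key values directly, so no identity switch is needed; on SQL Server the explicit inserts would be wrapped in the identity-insert switch mentioned on the slide. The table `D_Agent` and its columns are hypothetical.

```python
import sqlite3

# Reserved data-quality rows from the slide: surrogate key -> meaning.
DATA_QUALITY_ROWS = {0: "INVALID", -1: "UNKNOWN", -2: "NOT APPLICABLE"}


def create_agent_dimension(conn):
    """Create a hypothetical D_Agent dimension and seed the reserved rows."""
    conn.execute("""
        CREATE TABLE D_Agent (
            agent_key  INTEGER PRIMARY KEY,              -- surrogate key
            agent_id   TEXT NOT NULL,                    -- natural key
            agent_name TEXT NOT NULL DEFAULT 'Not Known' -- default, never NULL
        )""")
    conn.executemany(
        "INSERT INTO D_Agent (agent_key, agent_id, agent_name) VALUES (?, ?, ?)",
        [(key, label, label) for key, label in DATA_QUALITY_ROWS.items()],
    )


def add_agent(conn, agent_id, agent_name=None):
    """Insert a real row; the surrogate key is generated by the database."""
    if agent_name is None:
        cur = conn.execute(
            "INSERT INTO D_Agent (agent_id) VALUES (?)", (agent_id,))
    else:
        cur = conn.execute(
            "INSERT INTO D_Agent (agent_id, agent_name) VALUES (?, ?)",
            (agent_id, agent_name))
    return cur.lastrowid


conn = sqlite3.connect(":memory:")
create_agent_dimension(conn)
first_key = add_agent(conn, "A100", "Jane Doe")
second_key = add_agent(conn, "A101")  # name falls back to the default
```

Fact rows whose agent is missing or invalid can then point at keys 0, -1 or -2 instead of carrying a NULL foreign key.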


Page 16: BIDW Roadmap

ETL Design and Development
- Round Up the Requirements
- Extract Data from Source (3 Steps)
- Clean and Conform Data (5 Steps)
- Delivering Data (13 Steps)
- Managing the ETL Environment (13 Steps)


Page 17: BIDW Roadmap

ETL Roadmap


Page 18: BIDW Roadmap

ETL Implementation Process
- Analyze data quality thoroughly and have options available to resolve the issues found
- Define the data source definitions
- Create the High Level S2T (source-to-target) Map
- Create the Detail Level S2T Map
- Create the Fact Worksheet


Page 19: BIDW Roadmap

ETL Process: Extract
- Extract data to the S_ table (full load)
- Compare S_ to the M_ table and load the difference into the X_ table
- Clean the X_ table by removing duplicate rows (de-duplication step)
- Move duplicate rows to the E_ table
- Move non-duplicate clean rows to the C_ table
- Compare C_ to M_; insert new rows into M_ and update M_ with changed rows
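The extract steps above can be sketched with plain Python lists and dictionaries standing in for the S_, X_, C_, E_ and M_ tables. Several points are assumptions made for the example: rows are dictionaries keyed by a natural key column named `id`, the first row seen for a key is treated as the clean one and later rows with the same key as duplicates, and M_ holds only the latest clean version of each row rather than full history.

```python
def extract_changes(s_rows, m_table, key="id"):
    """X_: rows from S_ that are new or differ from their M_ version."""
    return [row for row in s_rows if m_table.get(row[key]) != row]


def deduplicate(x_rows, key="id"):
    """Split X_ into C_ (first row per key) and E_ (later duplicates)."""
    seen, c_rows, e_rows = set(), [], []
    for row in x_rows:
        if row[key] in seen:
            e_rows.append(row)
        else:
            seen.add(row[key])
            c_rows.append(row)
    return c_rows, e_rows


def merge_into_master(c_rows, m_table, key="id"):
    """Insert new C_ rows into M_ and overwrite changed ones; return counts."""
    inserted = updated = 0
    for row in c_rows:
        if row[key] in m_table:
            updated += 1
        else:
            inserted += 1
        m_table[row[key]] = dict(row)
    return inserted, updated


# Hypothetical sample data.
m_agents = {
    "A1": {"id": "A1", "name": "Ann"},
    "A3": {"id": "A3", "name": "Cyrus"},
}
s_agents = [
    {"id": "A1", "name": "Ann"},   # unchanged, not extracted
    {"id": "A2", "name": "Bob"},   # new
    {"id": "A2", "name": "Bob"},   # duplicate of the row above
    {"id": "A3", "name": "Cy"},    # changed
]

x_agents = extract_changes(s_agents, m_agents)
c_agents, e_agents = deduplicate(x_agents)
inserted, updated = merge_into_master(c_agents, m_agents)
```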


Page 20: BIDW Roadmap

ETL Process: Transform
- Select and transform from C_ to T_
- Compare T_ with D_ for new and changed rows
- Insert new rows into I_ and changed rows into U_
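As a rough illustration of the transform stage, the sketch below turns C_ rows into T_ rows and then splits them into I_ (new) and U_ (changed) by comparing against D_ on the natural key. The transformation itself (trimming and title-casing a name) and the column names are hypothetical; only the C_ to T_ to I_/U_ flow comes from the slide.

```python
def transform(c_row):
    """Hypothetical C_ -> T_ transformation: rename columns, tidy the name."""
    return {
        "agent_id": c_row["id"],
        "agent_name": c_row["name"].strip().title(),
    }


def split_new_and_changed(t_rows, d_rows, natural_key="agent_id"):
    """Compare T_ with D_ and return (I_ rows, U_ rows)."""
    current = {row[natural_key]: row for row in d_rows}
    i_rows, u_rows = [], []
    for row in t_rows:
        existing = current.get(row[natural_key])
        if existing is None:
            i_rows.append(row)  # not in the dimension yet
        elif any(existing.get(col) != val for col, val in row.items()):
            u_rows.append(row)  # present, but at least one attribute differs
    return i_rows, u_rows


# Hypothetical sample data.
c_agents = [
    {"id": "A1", "name": " ann "},
    {"id": "A2", "name": "bob"},
    {"id": "A3", "name": "cy"},
]
d_agents = [
    {"agent_key": 1, "agent_id": "A1", "agent_name": "Ann"},
    {"agent_key": 2, "agent_id": "A3", "agent_name": "Cyrus"},
]

t_agents = [transform(row) for row in c_agents]
i_agents, u_agents = split_new_and_changed(t_agents, d_agents)
```

Only the columns present in T_ are compared, so the surrogate key in D_ does not cause every row to look changed.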


Page 21: BIDW Roadmap

ETL Process: Load
- (I_) Insert rows directly into the D_ table from I_
- Update rows in D_ from U_ when the dimension is SCD Type 1 or 3
- Insert rows from U_ into D_ when the dimension is SCD Type 2
- Please note: dimension (surrogate) keys are generated during the Load stage
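The load rules can be sketched against an in-memory SQLite table. This is an illustration under stated assumptions, not the author's implementation: the `D_Agent` table and its columns are hypothetical, the `is_current` flag used to retire the old SCD Type 2 row is an addition that the slide does not mention, and SCD Type 3 (which needs a previous-value column) is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE D_Agent (
        agent_key  INTEGER PRIMARY KEY,  -- surrogate key, generated at load
        agent_id   TEXT NOT NULL,        -- natural key
        agent_name TEXT NOT NULL,
        is_current INTEGER NOT NULL DEFAULT 1
    )""")


def load_inserts(conn, i_rows):
    """I_ rows go straight into D_; the database assigns agent_key."""
    conn.executemany(
        "INSERT INTO D_Agent (agent_id, agent_name) "
        "VALUES (:agent_id, :agent_name)", i_rows)


def load_updates(conn, u_rows, scd_type):
    """U_ rows overwrite in place (Type 1) or add a new version (Type 2)."""
    for row in u_rows:
        if scd_type == 1:
            conn.execute(
                "UPDATE D_Agent SET agent_name = :agent_name "
                "WHERE agent_id = :agent_id", row)
        elif scd_type == 2:
            conn.execute(
                "UPDATE D_Agent SET is_current = 0 "
                "WHERE agent_id = :agent_id AND is_current = 1", row)
            conn.execute(
                "INSERT INTO D_Agent (agent_id, agent_name) "
                "VALUES (:agent_id, :agent_name)", row)
        else:
            raise ValueError(f"SCD type {scd_type} is not covered by this sketch")


# Hypothetical sample data.
load_inserts(conn, [
    {"agent_id": "A1", "agent_name": "Ann"},
    {"agent_id": "A2", "agent_name": "Bob"},
])
load_updates(conn, [{"agent_id": "A2", "agent_name": "Robert"}], scd_type=2)
load_updates(conn, [{"agent_id": "A1", "agent_name": "Anne"}], scd_type=1)

rows = conn.execute(
    "SELECT agent_key, agent_id, agent_name, is_current "
    "FROM D_Agent ORDER BY agent_key").fetchall()
```

After these calls the Type 1 change has overwritten the existing row for A1, while the Type 2 change has left the old A2 row in place and added a new row with a new surrogate key.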


Page 22: BIDW Roadmap

ETL Process: To Remember
- S_, X_, M_, C_ and E_ tables should be named after source tables, such as S_Agents
- T_, I_, U_ and D_ tables should be named after target tables, such as T_Agent, T_PolicyHolder, etc.
- Source table data sizes should follow the source data formats, except that natural keys should be varchar to accommodate data quality issues


Page 23: BIDW Roadmap

High Level BIDW System Architecture Model


Page 24: BIDW Roadmap

BI Application Design
- Define the structure of the portal and its web pages
- Define high-level reporting requirements (Dashboards, Scorecards)
- Define analytical reporting requirements (Cubes, Interactive Reports, Ad hoc Queries)
- Define detailed reporting requirements (Filter-based Reports, Ad hoc Queries)


Page 25: BIDW Roadmap

BI Application Layers


Page 26: BIDW Roadmap

BI Application Development
- Set up the development environment
- Set up the issue management system
- Develop all reports
- Test and balance each report against the source system
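Balancing a report against the source system usually comes down to comparing totals for the same measure on both sides. The sketch below shows one way such a check might look; the grouping column, the measure column, and the sample figures are all hypothetical, and `totals_by` and `balance` are names invented for this example.

```python
from collections import defaultdict
from decimal import Decimal


def totals_by(rows, group_col, measure_col):
    """Sum one measure per group, using Decimal to avoid float rounding."""
    totals = defaultdict(Decimal)
    for row in rows:
        totals[row[group_col]] += Decimal(str(row[measure_col]))
    return dict(totals)


def balance(source_rows, report_rows, group_col, measure_col):
    """Return {group: (source_total, report_total)} for groups that differ."""
    src = totals_by(source_rows, group_col, measure_col)
    rpt = totals_by(report_rows, group_col, measure_col)
    zero = Decimal(0)
    return {
        group: (src.get(group, zero), rpt.get(group, zero))
        for group in sorted(set(src) | set(rpt))
        if src.get(group, zero) != rpt.get(group, zero)
    }


# Hypothetical sample data: detail rows from the source, totals from a report.
source = [
    {"agent": "A1", "premium": "100.10"},
    {"agent": "A1", "premium": "50.00"},
    {"agent": "A2", "premium": "75.25"},
]
report = [
    {"agent": "A1", "premium": "150.10"},
    {"agent": "A2", "premium": "70.25"},
]

differences = balance(source, report, "agent", "premium")
```

An empty result means the report balances for that measure; any entry names a group whose totals need investigating.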


Page 27: BIDW Roadmap

Deployment / Maintenance
- Design the version control system
- Define the change management process
- Define the documents needed to deploy changes from Dev through Test and QA to Production
- Manage and maintain the environments
