Best PracticesBest Practices
Building a Data Warehouse Quickly Building a Data Warehouse Quickly
October 16, 2009 | Florida Chapter October 16, 2009 | Florida Chapter Presented by Raphael Klebanov, WhereScape USAPresented by Raphael Klebanov, WhereScape USA
Copyright © 2009 by WhereScape Software
Key factors that influence a successful data warehouse task
•Implementing the True Development Approach
•Choosing a Rapid Development Product
•Ensuring Data Availability
•Involving Key Users throughout the whole project
•Relying on a Pragmatic Governance Framework
•Utilizing experienced Team Members
•Selecting the right Hardware, Infrastructure Technology
Copyright © 2009 by WhereScape Software | Slide # 2
AbstractAbstract
Copyright © 2009 by WhereScape Software | Slide # 3
Basic Architecture of a Data WarehouseBasic Architecture of a Data Warehouse
……for a intelligentfor a intelligent decision-making decision-making process?process?
……for data warehouse?for data warehouse?
Are you ready …Are you ready …
Copyright © 2009 by WhereScape Software | Slide # 4
Why do Data Warehouse projects fail?Why do Data Warehouse projects fail?
Copyright © 2009 by WhereScape Software | Slide # 5
• Unreliable or unattainable user requirements
Why do Data Warehouse projects fail?Why do Data Warehouse projects fail?
Copyright © 2009 by WhereScape Software | Slide # 6
• Unreliable or unattainable user requirements
• Quality of the data that feeds the source system
Why do Data Warehouse projects fail?Why do Data Warehouse projects fail?
Copyright © 2009 by WhereScape Software | Slide # 7
• Unreliable or unattainable user requirements
• Quality of the data that feeds the source system
• Changing source or target requirements
Why do Data Warehouse projects fail?Why do Data Warehouse projects fail?
Copyright © 2009 by WhereScape Software | Slide # 8
• Unreliable or unattainable user requirements
• Quality of the data that feeds the source system
• Changing source or target requirements
• Poor development productivity
Why do Data Warehouse projects fail?Why do Data Warehouse projects fail?
Copyright © 2009 by WhereScape Software | Slide # 9
• Unreliable or unattainable user requirements
• Quality of the data that feeds the source system
• Changing source or target requirements
• Poor development productivity
• High TCO (Total Cost of Ownership
Why do Data Warehouse projects fail?Why do Data Warehouse projects fail?
Copyright © 2009 by WhereScape Software | Slide # 10
• Unreliable or unattainable user requirements
• Quality of the data that feeds the source system
• Changing source or target requirements
• Poor development productivity
• High TCO (Total Cost of Ownership)
• Poor documentation
Why do Data Warehouse projects fail?Why do Data Warehouse projects fail?
Copyright © 2009 by WhereScape Software | Slide # 11
• Unreliable or unattainable user requirements
• Quality of the data that feeds the source system
• Changing source or target requirements
• Poor development productivity
• High TCO (Total Cost of Ownership)
• Poor documentation
“…over 50% of data warehouse projects fail or go wildly over budget – they blame data quality…” The real problem is project approach.
Source: Gartner. Magic Quadrant for Data Integration Tools, 2007
Why do Data Warehouse projects fail?Why do Data Warehouse projects fail?
Copyright © 2009 by WhereScape Software | Slide # 12
DW Project ComponentsDW Project Components
Copyright © 2009 by WhereScape Software | Slide # 13
• Strong sponsorship of the DW from the business
DW Project ComponentsDW Project Components
Copyright © 2009 by WhereScape Software | Slide # 14
• Strong sponsorship of the DW from the business
• Divide and Conquer approach
DW Project ComponentsDW Project Components
Copyright © 2009 by WhereScape Software | Slide # 15
• Strong sponsorship of the DW from the business
• Divide and Conquer approach
• Iterative Development approach
DW Project ComponentsDW Project Components
Copyright © 2009 by WhereScape Software | Slide # 16
• Strong sponsorship of the DW from the business
• Divide and Conquer approach
• Iterative Development approach
• Productive development tools
DW Project ComponentsDW Project Components
Copyright © 2009 by WhereScape Software | Slide # 17
• Strong sponsorship of the DW from the business
• Divide and Conquer approach
• Iterative Development approach
• Productive development tools
• Real data to populate the prototype
DW Project ComponentsDW Project Components
Copyright © 2009 by WhereScape Software | Slide # 18
• Strong sponsorship of the DW from the business
• Divide and Conquer approach
• Iterative Development approach
• Productive development tools
• Real data to populate the prototype
• Access to SME during development
DW Project ComponentsDW Project Components
Copyright © 2009 by WhereScape Software | Slide # 19
• Strong sponsorship of the DW from the business
• Divide and Conquer approach
• Iterative Development approach
• Productive development tools
• Real data to populate the prototype
• Access to SME during development
• Compact teams
DW Project ComponentsDW Project Components
Copyright © 2009 by WhereScape Software | Slide # 20
• Strong sponsorship of the DW from the business
• Divide and Conquer approach
• Iterative Development approach
• Productive development tools
• Real data to populate the prototype
• Access to SME during development
• Compact teams
• Sturdy development hardware
DW Project ComponentsDW Project Components
Copyright © 2009 by WhereScape Software | Slide # 21
Business OwnershipBusiness Ownership
Copyright © 2009 by WhereScape Software | Slide # 22
•The data warehouse should be owned by the business – not IT
Business OwnershipBusiness Ownership
Copyright © 2009 by WhereScape Software | Slide # 23
•The data warehouse should be owned by the business – not IT
•A successful project depends upon creating a partnership with the business
Business OwnershipBusiness Ownership
Copyright © 2009 by WhereScape Software | Slide # 24
•The data warehouse should be owned by the business – not IT
•A successful project depends upon creating a partnership with the business
•Prioritization of project phases or agreement on a data dictionary should be agreed by the business
Business OwnershipBusiness Ownership
Copyright © 2009 by WhereScape Software | Slide # 25
•The data warehouse should be owned by the business – not IT
•A successful project depends upon creating a partnership with the business
•Prioritization of project phases or agreement on a data dictionary should be agreed by the business
•Without a strong, high level business sponsor(s) the project is likely to hit problems
Business OwnershipBusiness Ownership
Copyright © 2009 by WhereScape Software | Slide # 26
•The data warehouse should be owned by the business – not IT
•A successful project depends upon creating a partnership with the business
•prioritization of project phases or agreement on a data dictionary to should be agreed by the business
•Without a strong, high level business sponsor(s) the project is likely to hit problems
•If sponsorship is present then the data warehouse project can be broken down into a set of smaller projects
Business OwnershipBusiness Ownership
Copyright © 2009 by WhereScape Software | Slide # 27
The Data Warehouse lifecycle The Data Warehouse lifecycle …as we know it…as we know it
Divide and ConquerDivide and Conquer
Copyright © 2009 by WhereScape Software | Slide # 29
• A ‘big bang’ approach to data warehousing has almost always ended in disaster
Divide and ConquerDivide and Conquer
Copyright © 2009 by WhereScape Software | Slide # 30
• A ‘big bang’ approach to data warehousing has almost always ended in disaster
• The project phases and the order in which they are developed should be decided by the data warehouse sponsors
Divide and ConquerDivide and Conquer
Copyright © 2009 by WhereScape Software | Slide # 31
• A ‘big bang’ approach to data warehousing has almost always ended in disaster
• The project phases and the order in which they are developed should be decided by the data warehouse sponsors
• Momentum is paramount for keeping the required focus
Divide and ConquerDivide and Conquer
Copyright © 2009 by WhereScape Software | Slide # 32
• A ‘big bang’ approach to data warehousing has almost always ended in disaster
• The project phases and the order in which they are developed should be decided by the data warehouse sponsors
• Momentum is paramount for keeping the required focus
• Rapid prototyping and tight development cycles are vital for successful warehouse
Divide and ConquerDivide and Conquer
Copyright © 2009 by WhereScape Software | Slide # 33
• A ‘big bang’ approach to data warehousing has almost always ended in disaster
• The project phases and the order in which they are developed should be decided by the data warehouse sponsors
• Momentum is paramount for keeping the required focus
• Rapid prototyping and tight development cycles are vital for successful warehouse
• Keep in view the bigger picture
Divide and ConquerDivide and Conquer
Copyright © 2009 by WhereScape Software | Slide # 34
• A ‘big bang’ approach to data warehousing has almost always ended in disaster
• The project phases and the order in which they are developed should be decided by the data warehouse sponsors
• Momentum is paramount for keeping the required focus
• Rapid prototyping and tight development cycles are vital for successful warehouse
• Keep in view the bigger picture
• Use smaller phases to fund the project adequately
Divide and ConquerDivide and Conquer
Copyright © 2009 by WhereScape Software | Slide # 35
The True Project ApproachThe True Project Approach
Copyright © 2009 by WhereScape Software | Slide # 36
• Getting the business reps to use working prototypes to share an understanding of the scope
The True Project ApproachThe True Project Approach
Copyright © 2009 by WhereScape Software | Slide # 37
• Getting the business reps to use working prototypes to share an understanding of the scope
• Collect detailed user requirements is exactly the wrong start to a data warehouse project
The True Project ApproachThe True Project Approach
Copyright © 2009 by WhereScape Software | Slide # 38
• Getting the business reps to use working prototypes to share an understanding of the scope
• Collect detailed user requirements is exactly the wrong start to a data warehouse project
• Showing business users the data and relationships that are available to them in a working, populated prototype
The True Project ApproachThe True Project Approach
Copyright © 2009 by WhereScape Software | Slide # 39
• Getting the business reps to use working prototypes to share an understanding of the scope
• Collect detailed user requirements is exactly the wrong start to a data warehouse project
• Showing business users the data and relationships that are available to them in a working, populated prototype
• A better place to start is to collect KPIs and source system technical documentation
The True Project ApproachThe True Project Approach
Copyright © 2009 by WhereScape Software | Slide # 40
• Getting the business reps to use working prototypes to share an understanding of the scope
• Collect detailed user requirements is exactly the wrong start to a data warehouse project
• Showing business users the data and relationships that are available to them in a working, populated prototype
• A better place to start is to collect KPIs and source system technical documentation
• OLAP technology and user workshops are key tools in allowing the business to get their hands on the data
The True Project ApproachThe True Project Approach
Copyright © 2009 by WhereScape Software | Slide # 41
• Getting the business reps to use working prototypes to share an understanding of the scope
• Collect detailed user requirements is exactly the wrong start to a data warehouse project
• Showing business users the data and relationships that are available to them in a working, populated prototype
• A better place to start is to collect KPIs and source system technical documentation
• OLAP technology and user workshops are key tools in allowing the business to get their hands on the data
• Data quality should not be addressed in the DW; problem should be fixed on the source system
The True Project ApproachThe True Project Approach
Copyright © 2009 by WhereScape Software | Slide # 42
Rapid Development ProductRapid Development Product
Copyright © 2009 by WhereScape Software | Slide # 43
• Scrutinize Extract/Transform and Load (ETL) tools when considering building a DW. ETL tools do not provide the ability to build a working prototype and work in short development cycles
Rapid Development Product and ETLRapid Development Product and ETL
Copyright © 2009 by WhereScape Software | Slide # 44
• Combining processing and design
• Ability to enable, manage fast iterations of the prototype
• Environment migration
• Version control
• Automatic documentation
Rapid Development Product Enables:Rapid Development Product Enables:
Copyright © 2009 by WhereScape Software | Slide # 45
The Features of a DWLC ToolThe Features of a DWLC Tool
Copyright © 2009 by WhereScape Software | Slide # 46
• Single Development Interface
The Features of a DWLC ToolThe Features of a DWLC Tool
Copyright © 2009 by WhereScape Software | Slide # 47
• Single Development Interface
• Documentation
The Features of a DWLC ToolThe Features of a DWLC Tool
Copyright © 2009 by WhereScape Software | Slide # 48
• Single Development Interface
• Documentation
• Automated Table Generation
The Features of a DWLC ToolThe Features of a DWLC Tool
Copyright © 2009 by WhereScape Software | Slide # 49
• Single Development Interface
• Documentation
• Automated Table Generation
• Automated Code Generation
The Features of a DWLC ToolThe Features of a DWLC Tool
Copyright © 2009 by WhereScape Software | Slide # 50
• Single Development Interface
• Documentation
• Automated Table Generation
• Automated Code Generation
• Metadata Migration
The Features of a DWLC ToolThe Features of a DWLC Tool
Copyright © 2009 by WhereScape Software | Slide # 51
• Single Development Interface
• Documentation
• Automated Table Generation
• Automated Code Generation
• Metadata Migration
• Version Control
The Features of a DWLC ToolThe Features of a DWLC Tool
Copyright © 2009 by WhereScape Software | Slide # 52
• Single Development Interface
• Documentation
• Automated Table Generation
• Automated Code Generation
• Metadata Migration
• Version Control
• Object Checkout
The Features of a DWLC ToolThe Features of a DWLC Tool
Copyright © 2009 by WhereScape Software | Slide # 53
• Single Development Interface
• Documentation
• Automated Table Generation
• Automated Code Generation
• Metadata Migration
• Version Control
• Object Checkout
• Leverage Existing Core Skills.
The Features of a DWLC ToolThe Features of a DWLC Tool
Copyright © 2009 by WhereScape Software | Slide # 54
• Single Development Interface
• Documentation
• Automated Table Generation
• Automated Code Generation
• Metadata Migration
• Version Control
• Object Checkout
• Leverage Existing Core Skills
• Consistent Framework
The Features of a DWLC ToolThe Features of a DWLC Tool
Copyright © 2009 by WhereScape Software | Slide # 55
• Single Development Interface
• Documentation
• Automated Table Generation
• Automated Code Generation
• Metadata Migration
• Version Control
• Object Checkout
• Leverage Existing Core Skills
• Consistent Framework
• Extensibility
The Features of a DWLC ToolThe Features of a DWLC Tool
Copyright © 2009 by WhereScape Software | Slide # 56
Copyright © 2009 by WhereScape Software | Slide # 57
Try to get it right first time: The SDLC approach
But:
• Tools and operators in silos – inflexible
• Hard to engage business users, no shared
understanding
• Locked in requirements that can’t be met
Redevelopment
120 day cycle
Risky, expensive, never OTTB & never finished
No documentation, so hard to support
The Traditional ApproachThe Traditional Approach
Copyright © 2009 by WhereScape Software | Slide # 58
Prototype and iterate to prove a design with users
Supported by:
• Integrated toolset and metadata repository –
maximum flexibility
• Continuous business engagement, shared
understanding
• No ambiguity or disagreement about scope
Successful phase completion
Complete, OTTB, user expectations exceeded
Documented solution that is easy to support
The Rapid Development ApproachThe Rapid Development Approach
5 day cycle
A DWLC Tool would save a huge amount of development time and effort, and would enable the approach required to deliver a successful outcome
The DWLC methodology is a child concept for Agile Development Methodology also known as Systems Development Life Cycle (SDLC)
The Features of a DWLC ToolThe Features of a DWLC Tool
Slide # 59
Ensuring Data Availability Ensuring Data Availability
Copyright © 2009 by WhereScape Software | Slide # 60
•The lack of good quality live data will have a major impact on the success of Iterative project approach
•The DW’s capacity to answer BI requirements is unworkable, without sufficient data to populate the DW
•If a new source system is integrated into the data warehouse, the “real” data is quite essential
•If no “real” data for new source is available, then the significant rework will be required once the source is up and running
Ensuring Data Availability Ensuring Data Availability
Copyright © 2009 by WhereScape Software | Slide # 61
Involving the BusinessInvolving the Business
Copyright © 2009 by WhereScape Software | Slide # 62
• Representatives from the Business provide the partnership with the DW development team
• These reps need to be able to articulate the needs of the business to the dev. team
• These reps have to trust the business department behind them when it comes to making any decisions
• The partnership during the iterative project approach provides a reliable, successful outcome
• The main forum for developers to show a working prototype and get user feedback is user workshop
• The business Involvement for the duration of the DW development will reduce the QA overheads
Involving the BusinessInvolving the Business
Copyright © 2009 by WhereScape Software | Slide # 63
Copyright © 2009 by WhereScape Software | Slide # 64
Project governance Project governance
Governance of the data warehouse project should operate at two levels:
•an enterprise level and
•a project level
Pragmatic Governance Framework Pragmatic Governance Framework
Copyright © 2009 by WhereScape Software | Slide # 65
Governance of the data warehouse project should operate at two levels:
•an enterprise level and
•a project level
Pragmatic Governance Framework Pragmatic Governance Framework
Copyright © 2009 by WhereScape Software | Slide # 66
Business Requirements
Technical Constraints
Governance of the data warehouse project should operate at two levels:
•an enterprise level and
•a project level
Pragmatic Governance Framework Pragmatic Governance Framework
Copyright © 2009 by WhereScape Software | Slide # 67
Business Requirements
Technical Constraints
Shared understanding, Prototype and Iterate, Best possible outcome
Sponsorship is sourced from a highly-placed executive
The steering committee provides:
+ Vision
+ Visibility
+ Priorities
+ Scope
+ Focus
+ Terminology
Copyright © 2009 by WhereScape Software | Slide # 68
The DW needs to be owned by the business The DW needs to be owned by the business
At a minimum project governance should include:At a minimum project governance should include:
•A project plan, detailing (high level) scope and timelines
•Regular status meetings to share information
•Change request process documentation
•Standards and procedures for building a consistent DW
•Version control and backup procedures
•Ownership of specific environments and project roles
Copyright © 2009 by WhereScape Software | Slide # 69
Project governance Project governance
Utilizing Experienced Team Members
Copyright © 2009 by WhereScape Software | Slide # 70
• Productivity within a data warehouse implementation is dependent on having experienced team members – both on business side and also on the technical side
• Experienced Subject Matter Experts (SME) provide a thorough understanding of the business and its needs
• Experienced data warehouse developers can take those requirements and turn them into a functioning data warehouse in a rapid timeframe
Utilizing Experienced Team Members Utilizing Experienced Team Members
Copyright © 2009 by WhereScape Software | Slide # 71
Selecting the Right InfrastructureSelecting the Right Infrastructure
Copyright © 2009 by WhereScape Software | Slide # 72
• Sufficient hardware and technology infrastructure during development
• Lower productivity can translate into slower development cycles and iterations, which stands the risk of losing project momentum
• Trade-off between having adequately sized hardware and the cost associated with purchasing that hardware
• One way to mitigate undersized hardware is to use smaller subsets of data during the prototyping phase
Selecting the Right InfrastructureSelecting the Right Infrastructure
Copyright © 2009 by WhereScape Software | Slide # 73
Treat the Warehousing as a process, not a project
ConclusionConclusion
Copyright © 2009 by WhereScape Software | Slide # 74
This means:
•focusing on iterative releases and rollouts that follow in quick succession
•keeping the warehouse in line with the ever changing needs of the business, instead of treating it as a one-time project
•In order to achieve this, a change in the development approach and tools utilized for building the data warehouse must be adopted
ConclusionConclusion
Copyright © 2009 by WhereScape Software | Slide # 75
The key factors to creating a successful data warehouse are:--------------------------------------------------------------------------------Implementing the True Development Approach
Choosing a Rapid Development Product
Ensuring Data Availability
Involving Key Users throughout the whole project
Relying on a Pragmatic Governance Framework
Utilizing experienced Team Members
Selecting the right hardware and other related Infrastructure Technology
Copyright © 2009 by WhereScape Software | Slide # 76
ConclusionConclusion
Raphael Klebanov, Analyst at WhereScape
Office Phone: 303.968.0703
Email address: [email protected]
Public Profile:
http://www.linkedin.com/in/raphaelklebanov
My personal information:My personal information:
Copyright © 2009 by WhereScape Software | Slide # 77