dw - chap 3

Upload: kwadwo-boateng

Post on 05-Apr-2018


  • 7/31/2019 DW - Chap 3


    Data Warehouse Development Methodology

    This chapter explains the methodology for building a data warehouse. In software engineering, the discipline that studies the process people use to develop an information system is called the system development life cycle (SDLC) or system development methodology. Two common methodologies are the waterfall methodology, also called the sequential methodology, and the iterative methodology, also known as the incremental methodology.

    Waterfall Methodology:

    1. Feasibility study
    2. Requirements
    3. Architecture
    4. Design
    5. Development
    6. Testing
    7. Deployment
    8. Operation
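The sequential nature of the steps above can be sketched in a few lines of Python. This is only an illustration: the `Phase` enum and the helper names `next_phase` and `can_start` are hypothetical, and the rule that a phase starts only after all earlier phases are complete is a simplifying assumption about how strictly the waterfall order is applied.

```python
from enum import IntEnum
from typing import Optional, Set


class Phase(IntEnum):
    """Waterfall phases, numbered in the order listed above."""
    FEASIBILITY_STUDY = 1
    REQUIREMENTS = 2
    ARCHITECTURE = 3
    DESIGN = 4
    DEVELOPMENT = 5
    TESTING = 6
    DEPLOYMENT = 7
    OPERATION = 8


def next_phase(current: Phase) -> Optional[Phase]:
    """Return the phase that follows `current`, or None after the last one."""
    if current is Phase.OPERATION:
        return None
    return Phase(current + 1)


def can_start(phase: Phase, completed: Set[Phase]) -> bool:
    """Assumed rule: a phase may start only when every earlier phase is complete."""
    return all(earlier in completed for earlier in Phase if earlier < phase)
```

For example, `can_start(Phase.TESTING, {Phase.FEASIBILITY_STUDY})` is false under this rule because requirements, architecture, design, and development have not been completed.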

    The step names vary. Other names in use include: proposal, analysis, business requirements definition, technical architecture, functional design, technical design, unit testing, system testing, integration testing, user acceptance testing, rollout, implementation, maintenance, and support.

    Feasibility study:

    You gather the requirements at a high level (for example, determining why you need a data warehouse and whether data warehousing is the right solution), take a quick look at the source systems to find out whether it is possible to get the data you need, get sample data to assess the data quality, and write a proposal (some people prefer to call this the business case). Important things to mention in the proposal or business case are:

    i. Benefits
    ii. How long it will take
    iii. How much it will cost
    iv. The business justification
    v. Requirements (summary)
    vi. The project organization
    vii. Project plan

    Feasibility:

    I find it best to set up the feasibility step as its own project with a separate budget. In this phase, the work is to analyze the requirements and data volume at a high level and determine a suitable architecture and toolset so you can produce a rough estimate. For this work you will need a business analyst to analyze the requirements, a data warehouse architect to determine the architecture, and a project manager to produce the estimate and the high-level project plan. Other resources that may be required during the feasibility study phase are a hardware and infrastructure person (to verify the details of the proposed architecture), a business owner (such as a customer service manager), and the source system DBA (to get information about data volume).
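Assessing the quality of sample data during the feasibility study often starts with simple column profiling. The sketch below is a minimal example under stated assumptions: the function name `profile_column`, the sample rows, and the choice of metrics (missing, distinct, and duplicated values) are all hypothetical and are not taken from the chapter.

```python
from collections import Counter


def profile_column(rows, column):
    """Summarise one column of sample rows: row, missing, and distinct counts."""
    values = [row.get(column) for row in rows]
    # Treat both None and the empty string as missing.
    present = [v for v in values if v is not None and v != ""]
    counts = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "duplicated_values": sorted(v for v, n in counts.items() if n > 1),
    }


# Hypothetical sample pulled from a source system during the feasibility study.
sample = [
    {"order_id": "PO-1", "supplier": "Acme"},
    {"order_id": "PO-2", "supplier": ""},
    {"order_id": "PO-2", "supplier": "Globex"},
    {"order_id": "PO-3", "supplier": None},
]

order_id_report = profile_column(sample, "order_id")
supplier_report = profile_column(sample, "supplier")
```

A duplicated `order_id` or a high share of missing supplier names in the sample is the kind of finding that would go into the proposal as a data quality risk.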

    Requirements:

    You talk to users to understand the details of the processes, the business, the data, and the issues. Arrange site visits to get firsthand experience. Discuss the meaning of the data, the user interface, and so on, and document them. You also have to list the nonfunctional requirements, such as performance.


    Architecture:

    You need to determine which data flow architecture and which system architecture you are going to use, in detail, including specifications for database servers, the type of network, the storage solution, and so on.

    Design:

    You need to design the three main parts of the data warehouse system: the data stores, the ETL system, and the front-end applications.

    Development:

    You need to build the three parts that you designed: the data stores, the ETL system, and the front-end applications.

    Testing:

    You need to test the data stores, the ETL system, and the front-end applications.
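One common way to test an ETL load is a reconciliation check that compares the source extract with what landed in the data store. The chapter does not prescribe a specific test, so the following is only a sketch: the function name `reconcile`, the row layout (order number, amount), and the sample rows are hypothetical.

```python
import math


def reconcile(source_rows, warehouse_rows, amount_index=1):
    """Compare row counts and summed amounts between a source extract and the warehouse."""
    source_total = math.fsum(row[amount_index] for row in source_rows)
    warehouse_total = math.fsum(row[amount_index] for row in warehouse_rows)
    return {
        "row_count_matches": len(source_rows) == len(warehouse_rows),
        "total_matches": math.isclose(source_total, warehouse_total, abs_tol=1e-9),
    }


# Hypothetical rows: (order number, amount).
source = [("PO-1", 250.0), ("PO-2", 90.0), ("PO-3", 40.0)]
loaded_ok = [("PO-1", 250.0), ("PO-2", 90.0), ("PO-3", 40.0)]
loaded_short = [("PO-1", 250.0), ("PO-2", 90.0)]

good = reconcile(source, loaded_ok)
bad = reconcile(source, loaded_short)
```

Here `good` reports a match on both checks, while `bad` flags the missing row and the resulting difference in totals.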

    Deployment:

    Once the system is ready, you put all the components on the production boxes: the ETL system, the data stores, and the front-end applications. You do the initial load; that is, you load the data from the source systems for the first time. You run a few tests to confirm that the front-end applications are producing the correct figures. You produce the user guide, the operational guide, and the troubleshooting guide for the operations team. You train the users and the operations team. You support the system, the users, and the operations team for a few weeks.

    Operation:

    The users continue to use the data warehouse and the application. The operations team continues to administer the data warehouse and to support the users.

    Infrastructure setup:

    One of the biggest tasks when you build an application is to prepare the production environment where you are going to run the application and to build the development and testing environments.

    The Amadeus Entertainment Case Study:

    Let's first assume that the solution for this project is to build four data marts: Sales, Purchasing, Inventory, and CRM.

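    We will be using the NDS + DDS data architecture (a normalized data store plus a dimensional data store). We have a project manager (PM), a data warehouse architect (DWA), a business analyst (BA), and two developers. We will deliver functionality by building the easiest data mart (Purchasing) first.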

    The BA and DWA will work together to design and create the part of the NDS and DDS databases required by the Purchasing data mart. They will also specify the front-end application for three reports and one OLAP cube.

    The two developers will build the ETL and the reports and set up the cube. The data warehouse system will extract the purchase order table from the Jupiter, Jade, and WebTower9 systems every day and load it into the NDS database. The DDS ETL will get its data from the NDS database.

    At this time, the NDS will contain perhaps seven tables, and the DDS will contain perhaps five tables: only the necessary entities, fact tables, and dimension tables required for the Purchasing data mart to function properly.
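The data flow described above (source systems into the NDS, then the NDS into the DDS) can be sketched with Python's built-in `sqlite3` module. This is a toy illustration under several assumptions: every table name, column name, and sample row is hypothetical, the real NDS and DDS would hold more tables than shown here, and the actual project would use whatever database and ETL tooling the architecture step selected.

```python
import sqlite3

# Hypothetical daily extracts of the purchase order table from each source system.
SOURCE_EXTRACTS = {
    "Jupiter": [("J-100", "Acme", 250.0)],
    "Jade": [("D-7", "Globex", 90.0), ("D-8", "Acme", 40.0)],
    "WebTower9": [("W-1", "Initech", 15.5)],
}


def create_schema(conn):
    """Create one NDS table plus one dimension and one fact table in the DDS."""
    conn.executescript(
        """
        CREATE TABLE nds_purchase_order (
            source_system TEXT NOT NULL,
            order_number  TEXT NOT NULL,
            supplier_name TEXT NOT NULL,
            amount        REAL NOT NULL,
            PRIMARY KEY (source_system, order_number)
        );
        CREATE TABLE dds_dim_supplier (
            supplier_key  INTEGER PRIMARY KEY AUTOINCREMENT,
            supplier_name TEXT NOT NULL UNIQUE
        );
        CREATE TABLE dds_fact_purchase_order (
            supplier_key  INTEGER NOT NULL REFERENCES dds_dim_supplier,
            source_system TEXT NOT NULL,
            order_number  TEXT NOT NULL,
            amount        REAL NOT NULL,
            PRIMARY KEY (source_system, order_number)
        );
        """
    )


def load_nds(conn, extracts):
    """Daily load: upsert each source system's purchase orders into the NDS."""
    for system, rows in extracts.items():
        conn.executemany(
            "INSERT OR REPLACE INTO nds_purchase_order VALUES (?, ?, ?, ?)",
            [(system, *row) for row in rows],
        )


def load_dds(conn):
    """DDS ETL: read only from the NDS, never directly from the source systems."""
    conn.execute(
        "INSERT OR IGNORE INTO dds_dim_supplier (supplier_name) "
        "SELECT DISTINCT supplier_name FROM nds_purchase_order"
    )
    conn.execute(
        "INSERT OR REPLACE INTO dds_fact_purchase_order "
        "SELECT d.supplier_key, n.source_system, n.order_number, n.amount "
        "FROM nds_purchase_order AS n "
        "JOIN dds_dim_supplier AS d USING (supplier_name)"
    )


conn = sqlite3.connect(":memory:")
create_schema(conn)
load_nds(conn, SOURCE_EXTRACTS)
load_dds(conn)
fact_rows = conn.execute("SELECT COUNT(*) FROM dds_fact_purchase_order").fetchone()[0]
dim_rows = conn.execute("SELECT COUNT(*) FROM dds_dim_supplier").fetchone()[0]
```

Keying both stores on the source system plus order number lets the daily job be rerun without creating duplicate rows, which is one possible design choice rather than a requirement stated in the chapter.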


    At the heart of the iterative methodology is that in each iteration you build all the system components (ETL, stage, NDS, DDS, DQ, control, cubes, reports), including the training and the handover to the operations team, but on a smaller scale.
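The idea that every iteration covers every component on a smaller scale can be written out as a simple plan. The sketch below assumes one data mart per iteration with Purchasing first, as in the case study; the order of the remaining marts and the function name `plan_iterations` are assumptions made for illustration.

```python
COMPONENTS = ["ETL", "stage", "NDS", "DDS", "DQ", "control", "cubes", "reports"]

# Purchasing comes first, as in the case study; the order of the rest is assumed.
DATA_MARTS = ["Purchasing", "Sales", "Inventory", "CRM"]


def plan_iterations(marts, components):
    """Return one iteration per data mart; each iteration touches every component."""
    plan = []
    for number, mart in enumerate(marts, start=1):
        tasks = [f"{component} for {mart}" for component in components]
        # Training and handover are part of every iteration, not only the last one.
        tasks += ["user training", "handover to operations"]
        plan.append({"iteration": number, "data_mart": mart, "tasks": tasks})
    return plan


plan = plan_iterations(DATA_MARTS, COMPONENTS)
```

Each entry in `plan` lists the same ten kinds of task, scoped to a single data mart, which is the contrast with the waterfall approach where each component is completed for the whole system before moving on.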
