Lecture 08 - External Data and the Data Warehouse - Building the Data Warehouse

Upload: bondaigia

Post on 03-Apr-2018


  • 7/29/2019 Lecture 08 - External Data and the Data Warehouse - Building the Data Warehouse


    Chapter 8: External Data and the Data Warehouse

    http://it-slideshares.blogspot.com/


    Agenda

    1. Introduction

    2. External Data in the Data Warehouse

    3. Metadata and External Data

    4. Storing External Data

    5. Different Components of External Data

    6. Modeling and External Data

    7. Secondary Reports

    8. Archiving External Data

    9. Comparing Internal Data to External Data

    10. Summary


    8.1 Introduction

    Most organizations build their first data warehouse efforts on data whose source is existing systems (that is, on data internal to the corporation).

    A whole host of other data is of legitimate use to a corporation that is not generated from the corporation's own systems. This class of data is called external data and usually enters the corporation in an unpredictable format (Figure 8.1).

    The data warehouse is the ideal place to store external data. If external data is not stored in a centrally located place, several problems are sure to arise (Figure 8.2).


    8.2 External Data in the Data Warehouse

    Several issues relate to the use and storage of external data in the data warehouse.

    The first problem is the frequency of availability.

    The second problem is that external data is totally undisciplined.

    The third problem is unpredictability.


    8.2 External Data in the Data Warehouse (cont)

    There are many methods to capture and store external information.

    One of the best places to locate external data, if it is voluminous, is on a bulk storage medium such as near-line storage.

    Another technique for handling external data that is sometimes effective is to create two stores of external data.

    The external data becomes an adjunct to the data warehouse.


    8.3 Metadata and External Data

    Metadata is vital when it comes to the issue of external data.


    8.3 Metadata and External Data (cont)

    Associated with metadata is another type of data: notification data.
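
    The slide does not spell out what notification data contains. As a rough sketch only, assume it records which classifications of data each user is interested in, so that users can be told when matching external data is registered in the metadata. Every name here (`notification_data`, `register_external_data`, the metadata fields) is hypothetical and chosen for illustration.

```python
# Assumed shape of notification data: user -> classifications of interest.
notification_data = {
    "analyst_a": {"competitor pricing", "market share"},
    "analyst_b": {"interest rates"},
}


def register_external_data(metadata, entry, interests):
    """Record a metadata entry and return the users interested in it."""
    metadata.append(entry)
    return sorted(
        user
        for user, topics in interests.items()
        if entry["classification"] in topics
    )


# Hypothetical metadata entry describing one piece of external data.
metadata = []
entry = {
    "description": "quarterly market share report",
    "source": "trade journal",
    "classification": "market share",
}
notified = register_external_data(metadata, entry, notification_data)
print("notify:", notified)
```

    The check runs at the moment the metadata entry is added, so the metadata stays the single place where both the index and the notification step are driven from.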


    8.4 Storing External Data


    8.5 Different Components of External Data

    One of the important design considerations of external data is that it often contains many different components, some of which are of more use than others.

    To manage the data, an experienced DSS analyst or industrial engineer must determine the most important units of data.


    8.6 Modeling and External Data

    The following question must be answered:

    What is the relationship between the data model and external data? (As described in Figure 8.6)


    8.7 Secondary Reports

    When data is repetitive in nature, secondary reports can be created from the detailed data over time.

    For example, take the month-end Dow Jones Industrial Average report shown in Figure 8.7.
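
    A minimal Python sketch of the idea: given repetitive month-end detail, a secondary report can be derived from it, here a month-over-month change and a trailing moving average. The index values are made-up placeholders, not actual Dow Jones figures, and the function name `secondary_report` and its `window` parameter are assumptions for illustration.

```python
from statistics import mean

# Placeholder month-end index values (illustrative only, not real DJIA data).
month_end = [
    ("Jan", 100.0),
    ("Feb", 110.0),
    ("Mar", 99.0),
    ("Apr", 121.0),
]


def secondary_report(detail, window=3):
    """Derive per-month change and a trailing moving average from detail rows."""
    rows = []
    for i, (month, value) in enumerate(detail):
        # Change versus the previous month; None for the first month.
        change = None if i == 0 else value - detail[i - 1][1]
        # Average over up to `window` months ending at the current one.
        recent = [v for _, v in detail[max(0, i - window + 1): i + 1]]
        rows.append({
            "month": month,
            "value": value,
            "change": change,
            "moving_avg": mean(recent),
        })
    return rows


report = secondary_report(month_end)
for row in report:
    print(row)
```

    Because the secondary report is computed from the stored detail, it can be regenerated with a different window or a different measure without going back to the external source.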


    8.9 Archiving External Data

    Every piece of information, external or otherwise, has a useful lifetime. Once that lifetime is past, it is not economical to keep the information.

    An essential part of managing external data is deciding what the useful lifetime of the data is. There remains the issue of whether the data should be discarded or put into archives.
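
    One way to make the lifetime decision concrete is to record a retention period for each item and separate the items still in use from those past their lifetime. The sketch below is illustrative only: the `ExternalItem` fields, the `retention_days` values, and the `partition_by_lifetime` helper are all hypothetical. Whether expired items are archived or discarded remains a policy choice that the code does not make.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ExternalItem:
    name: str
    received: date        # date the item entered the warehouse
    retention_days: int   # assumed useful lifetime, set by policy


def partition_by_lifetime(items, today):
    """Split items into (still useful, past useful lifetime)."""
    active, expired = [], []
    for item in items:
        expires_on = item.received + timedelta(days=item.retention_days)
        if today >= expires_on:
            expired.append(item)
        else:
            active.append(item)
    return active, expired


items = [
    ExternalItem("industry survey", date(2018, 1, 1), 30),
    ExternalItem("competitor price list", date(2018, 3, 1), 365),
]
active, expired = partition_by_lifetime(items, today=date(2018, 4, 3))
print("keep:", [i.name for i in active])
print("archive or discard:", [i.name for i in expired])
```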


    8.10 Comparing Internal Data to External Data

    One of the most useful things to do with external data is to compare it to internal data over a period of time. The comparison gives management a unique perspective.

    The following points must be kept in mind when comparing internal data to external data:

    The comparison is made on a common key.

    There needs to be a cleansing of the external data.
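
    The two points above can be sketched in Python: cleanse the external keys so they match the internal form, then compare on that common key. The data values, the cleansing rule (trim and upper-case a region code), and all names are assumptions made for illustration; real cleansing is usually more involved.

```python
# Internal figures keyed by region code (illustrative values).
internal_sales = {"NORTH": 120, "SOUTH": 95}

# External figures arrive with inconsistent keys (illustrative values).
external_market = [(" north ", 1000), ("South", 800), ("west", 400)]


def cleanse_key(raw):
    """Normalize an external key to the internal form: trimmed, upper case."""
    return raw.strip().upper()


def compare(internal, external):
    """Return {key: (internal value, external value)} for keys found in both."""
    cleaned = {cleanse_key(key): value for key, value in external}
    return {k: (internal[k], cleaned[k]) for k in internal if k in cleaned}


matched = compare(internal_sales, external_market)
for key, (ours, market) in matched.items():
    # Share of the external market total represented by the internal figure.
    print(key, ours, market, f"{ours / market:.1%}")
```

    Keys that exist on only one side (here the external "west" row) are left out of the comparison rather than guessed at, which keeps the result limited to data that genuinely shares a key.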


    8.11 Summary

    The data warehouse is capable of holding much more than internal, structured data. There is much information relevant to the running of the company that comes from sources outside the company.

    External data is captured, and information about the metadata is stored in the data warehouse metadata.

    External data often undergoes significant editing and transformation as the data is moved from the external environment to the data warehouse environment.

    The metadata that describes the external data and the unstructured data serves as an executive index to information.

    External and unstructured data may or may not actually be stored in the data warehouse.
