Data Warehouse Structure

Upload: syed-motasham

Post on 08-Apr-2018


  • 8/7/2019 Data Ware House Structure

    1/44

    Data Warehouse Architecture


    Problem

    No two data warehouse implementations are exactly alike.

    One of the worst data warehousing mistakes you can make is to try to force your business analysis and reporting needs to fit into an environment that you copied from somewhere else.

    Although a certain amount of analysis is standard across companies


    Data Warehouse Components

    A data warehouse is composed of many different components, each of which can be implemented in several ways.

    These components include:

    The number of different subjects and focus points (the number of different functional or regional organizations)

    The number of sources that will provide raw data


    Data Warehouse Components

    The methods by which data is moved from the source applications and loaded into the data warehouse

    The rules applied to the raw source data to produce high-quality data assets

    The target databases in which data assets are stored

    The business intelligence front-end tools used to access the data assets

    The overall architectural complexity of the environment


    Differences

    Even two seemingly identical companies probably have these differences:

    Different data sources, unique to each company

    Different data, as a result of the different sources

    Different source-to-warehouse movement techniques and rules applied along the way (for example, business rules for forecasting future revenue)


    Classifying the Data Warehouse

    Data warehouse lite:

    Data warehouse deluxe:

    Data warehouse supreme:


    The data warehouse lite

    A data warehouse lite is a no-frills, low-tech approach to providing data that can help with some of your business decision-making.

    No-frills means that you put together, wherever possible, proven capabilities and tools already within your organization to build your system.

    The term data mart is commonly used to refer to a data warehouse lite.


    Lite (Subject areas and data content)

    A data warehouse lite is focused on the reporting or analysis of only one or possibly two subject areas.

    A data warehouse lite has just enough data content to satisfy the primary purpose of the environment.

    Example


    Figure 1 Lite Subject Area


    Lite (Data Sources)

    A data warehouse lite has a limited set of data sources, typically one.

    The data warehouse lite acts as the restructuring agent for the application's data to make it more query- and report-friendly.

    The most common means of restructuring a single application's data is to denormalize the contents of the application's relational database tables, which eliminates relational joins at query time.

    You don't worry about duplicated data; you try to create rows of data in a single table that most likely mirrors the reports and queries that users run.
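    The denormalization described above can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module; the table names (customer, call_charge, charge_report) and the sample rows are assumptions made up for the example, not part of the original slides.

```python
import sqlite3


def create_source_tables(conn):
    """Build a tiny normalized source schema (hypothetical tables and data)."""
    conn.executescript(
        """
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,
            name TEXT,
            region TEXT
        );
        CREATE TABLE call_charge (
            charge_id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customer(customer_id),
            month TEXT,
            amount REAL
        );
        INSERT INTO customer VALUES (1, 'Acme', 'East'), (2, 'Bolt', 'West');
        INSERT INTO call_charge VALUES
            (10, 1, '2019-01', 40.0),
            (11, 1, '2019-02', 55.0),
            (12, 2, '2019-01', 20.0);
        """
    )


def denormalize(conn):
    """Join once at load time; customer name and region repeat on every row."""
    conn.execute(
        """
        CREATE TABLE charge_report AS
        SELECT c.name AS customer_name, c.region, ch.month, ch.amount
        FROM call_charge AS ch
        JOIN customer AS c ON c.customer_id = ch.customer_id
        """
    )


def revenue_by_region(conn):
    """A typical report query: it reads one flat table and needs no join."""
    query = (
        "SELECT region, SUM(amount) FROM charge_report "
        "GROUP BY region ORDER BY region"
    )
    return conn.execute(query).fetchall()
```

    The duplicated customer columns are deliberate: the report query stays a single-table scan, which is the trade-off the slide describes.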


    Figure 2 Lite Data source


    Lite (Business intelligence tools)

    The users of a data warehouse lite usually ask questions and create reports that reflect a "Tell me what happened" perspective.

    Because those users don't do much heavy-duty analytical processing


    Lite (Database)

    Data warehouse lite solutions are limited by users, data content, and the type of business intelligence tools utilized.

    These limitations are the primary reason that a data warehouse lite is usually built on a standard, general-purpose relational database management system.

    In some situations, though, a multidimensional database (MDB) is used.


    Data extraction, movement, and loading

    Simple file extracts from the run-the-business systems and file transfers that allow you to move data from its sources to the data warehouse lite

    Straightforward custom code (or perhaps an easy-to-use tool) that can extract and move the data

    If the data source for your data warehouse lite is built on a relational database, use SQL to easily handle data extraction and movement.
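    A simple file extract followed by a load might look like the sketch below. It assumes a relational source with a hypothetical orders table and a hypothetical warehouse table orders_wh; both names and the sample rows are invented for illustration. An in-memory text buffer stands in for the extract file that would normally be transferred between machines.

```python
import csv
import io
import sqlite3


def make_source():
    """Stand-in for a run-the-business database (hypothetical orders table)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "Acme", 10.0), (2, "Bolt", 20.5)],
    )
    return conn


def extract_to_csv(source, out):
    """Extract with plain SQL and write a CSV file with a header row."""
    cursor = source.execute(
        "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
    )
    writer = csv.writer(out)
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)


def load_from_csv(warehouse, src):
    """Load the extract file into the warehouse table; return rows loaded."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS orders_wh "
        "(order_id INTEGER, customer TEXT, amount REAL)"
    )
    rows = [
        (int(r["order_id"]), r["customer"], float(r["amount"]))
        for r in csv.DictReader(src)
    ]
    warehouse.executemany("INSERT INTO orders_wh VALUES (?, ?, ?)", rows)
    return len(rows)
```

    In practice the buffer would be a real file moved by a file transfer; the extract and load steps stay the same.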


    Figure 3 Lite Data movement


    Lite (Architecture)

    The architecture of a data warehouse lite is composed of

    the database used to store the data

    the front-end business intelligence tools used to access the data

    the way the data is moved

    and the number of subject areas.


    Lite (Architecture)

    The architecture of a data warehouse lite, as shown in the figure, contains these major component types:

    A single database contains the warehouse's data.

    That database is fed directly from each of the sources providing data to the warehouse.

    Users access data directly from the warehouse.


    Figure 4 Lite (Architecture)


    The data warehouse deluxe

    Data from many different sources converges in these real data warehouses.

    That makes available a wealth of architectural options that you can fit to meet your specific needs.


    Figure 5 The data warehouse deluxe


    Deluxe (Subject areas and data content)

    A data warehouse deluxe contains a broad range of related subject areas: everything (or most things) that would follow a natural way of thinking about and analyzing information.


    Example

    In a data-warehouse-deluxe version of the telephone-company example, the subject areas include:

    Consumer basic calling revenues and volumes

    Consumer long-distance calling revenues and volumes

    Consumer wireless calling revenues and volumes

    Business wireless services

    Business basic calling revenues and volumes

    Business long-distance calling revenues and volumes

    Business wireless calling revenues and volumes

    Internet access (DSL) services

    Internet revenues and volumes


    Data sources

    Although the exact number of data sources depends on the specifics of the implementation, an average of eight to ten applications and external databases provide data to the warehouse.


    Difficulty in Deluxe data sources

    Different encodings for similar information: Different sets of customer numbers come from different sources.

    Data integrity problems across multiple sources: The information in one source is different from the information in another when they should be the same.

    Different source platforms:
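    The first two difficulties can be illustrated with a small sketch. It assumes a hand-maintained cross-reference that maps each source's own customer number to one warehouse key; the source names ("billing", "crm"), the customer numbers, and the field names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical cross-reference: (source system, local customer number) -> warehouse key.
CROSS_REFERENCE = {
    ("billing", "C-001"): 1,
    ("crm", "10045"): 1,
    ("billing", "C-002"): 2,
    ("crm", "10046"): 2,
}


def to_warehouse_key(source, local_id):
    """Resolve a source-specific customer number to the shared warehouse key."""
    return CROSS_REFERENCE[(source, local_id)]


def find_conflicts(records, field):
    """Return warehouse keys for which the sources disagree on `field`."""
    seen = defaultdict(set)
    for rec in records:
        key = to_warehouse_key(rec["source"], rec["customer_no"])
        # Ignore differences in case and surrounding whitespace.
        seen[key].add(rec[field].strip().lower())
    return {key: sorted(values) for key, values in seen.items() if len(values) > 1}


records = [
    {"source": "billing", "customer_no": "C-001", "city": "Lahore"},
    {"source": "crm", "customer_no": "10045", "city": "lahore "},
    {"source": "billing", "customer_no": "C-002", "city": "Karachi"},
    {"source": "crm", "customer_no": "10046", "city": "Multan"},
]
conflicts = find_conflicts(records, "city")
```

    Here customer 1 agrees across sources once formatting is normalized, while customer 2 is flagged because the two sources hold genuinely different values.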


    Business intelligence tools

    Deluxe means that you usually have several different ways of looking at that warehouse's contents.

    This list shows the different ways that you can use a data warehouse:


    Business intelligence tools

    Simple reporting and querying:

    Business analysis: Tell me what happened and why.

    Dashboards and scorecards: In this model, a variety of information is gathered from the data warehouse and made available to users who don't want to mess around with the data warehouse; they want to see snapshots of many different things.

    Data mining or statistical analysis: In this area, statistical, artificial intelligence, and related techniques are used to mine through large volumes of data and provide knowledge
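    The first three styles of use can be contrasted in a short sketch. It uses an in-memory sqlite3 database with a hypothetical revenue table; the table, its columns, and the sample figures are assumptions for illustration only. Data mining is left out because it depends on the statistical technique chosen.

```python
import sqlite3

# Hypothetical sample data: (service, month, amount).
SAMPLE_ROWS = [
    ("basic", "2019-01", 100.0),
    ("wireless", "2019-01", 50.0),
    ("basic", "2019-02", 80.0),
    ("wireless", "2019-02", 40.0),
]


def make_warehouse():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE revenue (service TEXT, month TEXT, amount REAL)")
    conn.executemany("INSERT INTO revenue VALUES (?, ?, ?)", SAMPLE_ROWS)
    return conn


def what_happened(conn):
    """Simple reporting: total revenue per month."""
    return conn.execute(
        "SELECT month, SUM(amount) FROM revenue GROUP BY month ORDER BY month"
    ).fetchall()


def why(conn, month):
    """Business analysis: break one month down by service to see what drove it."""
    return conn.execute(
        "SELECT service, SUM(amount) FROM revenue WHERE month = ? "
        "GROUP BY service ORDER BY service",
        (month,),
    ).fetchall()


def scorecard(conn):
    """Dashboard snapshot: headline numbers prepared for the user in advance."""
    total = conn.execute("SELECT SUM(amount) FROM revenue").fetchone()[0]
    best_month = conn.execute(
        "SELECT month FROM revenue GROUP BY month "
        "ORDER BY SUM(amount) DESC LIMIT 1"
    ).fetchone()[0]
    return {"total_revenue": total, "best_month": best_month}
```

    The report answers "what happened", the breakdown gives a first look at "why", and the scorecard packages a few numbers so that the user never has to write a query.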


    Database

    Data warehouse deluxe implementations are big and getting bigger all the time.

    Implementations that use hundreds of gigabytes and even terabytes are increasingly common.

    To manage this volume of data and user access, you need a very robust server and database.


    Architecture

    A data warehouse deluxe can have three tiers:

    Data mart: Receives subsets of information from the data warehouse deluxe and serves as the primary access point for users.

    Interim transformation station: An area in which sets of data extracted from some of the sources undergo some type of transformation process before moving down the pipeline toward the warehouse's database.

    Quality assurance station: An area in which groups of data undergo intensive quality assurance checks before you let them move into the data warehouse.
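    The flow through these stations can be sketched as a small pipeline. The field names (customer, region, amount) and the two quality rules are assumptions chosen for illustration; a real quality assurance station would apply whatever checks the business defines.

```python
def transform(raw_rows):
    """Interim transformation station: standardize formats from the sources."""
    return [
        {
            "customer": row["customer"].strip(),
            "region": row["region"].strip().upper(),
            "amount": float(row["amount"]),
        }
        for row in raw_rows
    ]


def quality_check(rows):
    """Quality assurance station: split rows into accepted and rejected."""
    accepted, rejected = [], []
    for row in rows:
        # Assumed rules: a customer name is required and amounts are not negative.
        ok = bool(row["customer"]) and row["amount"] >= 0
        (accepted if ok else rejected).append(row)
    return accepted, rejected


def data_mart(warehouse, region):
    """Data mart: a subset of the warehouse that serves one group of users."""
    return [row for row in warehouse if row["region"] == region]


raw = [
    {"customer": " Acme ", "region": "east", "amount": "40.50"},
    {"customer": "", "region": "west", "amount": "10"},
    {"customer": "Bolt", "region": "West", "amount": "-5"},
]
warehouse, rejected = quality_check(transform(raw))
east_mart = data_mart(warehouse, "EAST")
```

    Only rows that pass the checks reach the warehouse, and the data mart is then derived from the warehouse rather than from the sources.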


    The data warehouse supreme

    Today's state-of-the-art data warehouse looks like a complicated data warehouse deluxe.

    The data warehouses of tomorrow will look like data warehouse supremes.

    Few enterprises have ventured in this direction; due to overall cost and capabilities, it is still rare to find data warehouse supremes.


    Subject areas and data content

    The number of subject areas in a data warehouse supreme is unlimited because the data warehouse is virtual.

    It isn't all contained in a single database or even within multiple databases that you personally load and maintain.


    Subject areas and data content

    Instead, only part of your warehouse (a small part) is physically located on some data warehouse server; the rest is out there in cyberspace somewhere, accessible through networking capabilities.

    Warehouse users have an infinite number of subject-area possibilities: anything that could possibly be of interest to them.


    Subject areas and data content

    Think of how you use the Internet today to access Web sites all over the world, sites that someone else creates and maintains.

    Each of those sites contains information about some specific area of interest to you.

    Also imagine that you can query and run reports by using the contents of one or more of these sites as your input.

    That's the model of the data warehouse supreme: opening up an unlimited number of possibilities to users.


    Data sources

    such as quality assurance processes or how frequently the data is refreshed.

    I have more good news, though: Because the most critical part of a data warehouse supreme is still internally acquired data (the data coming from your internal applications)

    you populate your data warehouse supreme with multimedia information in addition to traditional data, video servers, Web sites, and databases that store documents and text.


    Business intelligence tools

    Basic reporting and querying, business analysis, dashboards and scorecards, and data mining are all part of the data warehouse supreme environment.

    The business intelligence tools will enable users to pull information from the data warehouse supreme and integrate it with a better visualization, for instance, Google Earth or Microsoft Virtual Earth.


    Business intelligence tools

    The biggest difference between deluxe and supreme is the dramatically increased use of push technology.

    Using intelligent agents, you can have information fed back to you from the far ends of the Internet-based universe.
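    Push technology can be pictured as a subscription model: the user states an interest once, and matching information is delivered without further queries. The sketch below is only an illustration of that idea; the Agent and Feed classes, the event fields, and the topic names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Agent:
    """Watches incoming events on the user's behalf and keeps the matches."""
    is_interesting: Callable[[dict], bool]
    inbox: list = field(default_factory=list)

    def notify(self, event):
        if self.is_interesting(event):
            self.inbox.append(event)


class Feed:
    """Stand-in for an external information source that pushes events out."""

    def __init__(self):
        self._agents = []

    def subscribe(self, agent):
        self._agents.append(agent)

    def publish(self, event):
        # Push: the source calls the agents; the user never polls or queries.
        for agent in self._agents:
            agent.notify(event)


feed = Feed()
price_watcher = Agent(is_interesting=lambda e: e["topic"] == "competitor_price")
feed.subscribe(price_watcher)
feed.publish({"topic": "weather", "value": "rain"})
feed.publish({"topic": "competitor_price", "value": 9.99})
```

    After both events are published, only the competitor price event sits in the agent's inbox, ready to be shown to the user.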


    Figure 7 Intelligent Agent


    Database

    A data warehouse supreme most likely consists of a database environment that meets these requirements:

    It's distributed across many different platforms.

    It operates in a location-transparent manner.

    It has object-oriented capabilities to store images, videos, and text.

    For faster performance, it accesses data directly from transactional databases without having to copy the information to a separate data warehouse database.
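    Location transparency can be sketched with a small catalog that maps logical names to wherever the data physically lives. The VirtualWarehouse class below is a hypothetical illustration, with two in-memory sqlite3 databases standing in for separate transactional systems; a real environment would rely on a distributed or federated database product.

```python
import sqlite3


class VirtualWarehouse:
    """Routes a logical data set name to the database that actually holds it."""

    def __init__(self):
        self._catalog = {}

    def register(self, logical_name, conn, table):
        self._catalog[logical_name] = (conn, table)

    def rows(self, logical_name):
        conn, table = self._catalog[logical_name]
        # Table names come from the catalog set up by administrators,
        # never from end users, so quoting the identifier is sufficient here.
        return conn.execute(f'SELECT * FROM "{table}"').fetchall()


# Two separate databases stand in for two transactional systems.
billing_db = sqlite3.connect(":memory:")
billing_db.execute("CREATE TABLE invoices (customer_id INTEGER, amount REAL)")
billing_db.execute("INSERT INTO invoices VALUES (1, 40.0)")

crm_db = sqlite3.connect(":memory:")
crm_db.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
crm_db.execute("INSERT INTO customers VALUES (1, 'Acme')")

vw = VirtualWarehouse()
vw.register("invoices", billing_db, "invoices")
vw.register("customers", crm_db, "customers")
```

    A caller asks for vw.rows("invoices") or vw.rows("customers") without knowing which database answers, and the data is read in place rather than copied into a separate warehouse database.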


    Architecture

    The figure shows an example of what the architecture of a data warehouse supreme might look like.

    But with all the upcoming technology trends and improvements discussed in the preceding sections, your data warehouse supreme can look like (almost) anything you want.


    Figure 8 Supreme Architecture
