DW-1: Introduction to Data Warehousing

Upload: aquila-townsend

Post on 31-Dec-2015

Page 1: DW-1: Introduction to Data Warehousing

DW-1: Introduction to Data Warehousing

Page 2: DW-1: Introduction to Data Warehousing

Overview

What Is a Database

What Is Data Warehousing

Data Marts and Data Warehouses

The Data Warehousing Process

Data in a Data Warehouse

Page 3: DW-1: Introduction to Data Warehousing

What Is a Database

Before

Program = Algorithm + Data Structure

Now

Application (Weblication) = Visual I/F + SQL Query + Database

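A database is integrated data

consolidated from multiple file-system data sources to support OLTP (online transaction processing)

Terminology: Data Base (from "air base"?), DB; Korean 데이타베이스 (a transliteration of "database"); 자료기지 (literally "data base", the North Korean term)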

Page 4: DW-1: Introduction to Data Warehousing

Database and Data Model

Computer representation of data for efficient understanding and processing

Data models are based on relationship modeling

Relationships between records: one-to-one (1:1), one-to-many (1:N), many-to-many (N:M)

Hierarchical model: hierarchical relationships, 1:N

Network model: network-like relationships, N:M

Relational model: uses relations (tables) to represent relationships

Object-oriented data model: complex object modeling (set type, reference, list)
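
The relationship types above can be sketched in the relational model using Python's built-in sqlite3 module. The table names (department, employee, project, assignment) and the sample rows are hypothetical and are not from the slides. A 1:N relationship is carried by a foreign key on the "many" side, and an N:M relationship is resolved through a junction table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- 1:N: one department has many employees (foreign key on the "many" side)
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT,
        dept_id INTEGER REFERENCES department(dept_id)
    );
    -- N:M: employees and projects, resolved through a junction table
    CREATE TABLE project (proj_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE assignment (
        emp_id  INTEGER REFERENCES employee(emp_id),
        proj_id INTEGER REFERENCES project(proj_id),
        PRIMARY KEY (emp_id, proj_id)
    );
""")
conn.executemany("INSERT INTO department VALUES (?, ?)", [(1, "Sales"), (2, "IT")])
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(1, "Kim", 1), (2, "Lee", 1), (3, "Park", 2)])
conn.executemany("INSERT INTO project VALUES (?, ?)", [(1, "Warehouse"), (2, "Portal")])
conn.executemany("INSERT INTO assignment VALUES (?, ?)",
                 [(1, 1), (2, 1), (3, 1), (3, 2)])

# 1:N in action: count the employees that belong to each department.
employees_per_dept = conn.execute("""
    SELECT d.name, COUNT(e.emp_id)
    FROM department d
    LEFT JOIN employee e ON e.dept_id = d.dept_id
    GROUP BY d.dept_id
    ORDER BY d.name
""").fetchall()

# N:M in action: list every project that one employee works on.
projects_of_park = [row[0] for row in conn.execute("""
    SELECT p.title
    FROM project p
    JOIN assignment a ON a.proj_id = p.proj_id
    JOIN employee e ON e.emp_id = a.emp_id
    WHERE e.name = 'Park'
    ORDER BY p.title
""")]

print(employees_per_dept)  # [('IT', 1), ('Sales', 2)]
print(projects_of_park)    # ['Portal', 'Warehouse']
```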

Page 5: DW-1: Introduction to Data Warehousing

What Is Data Warehousing

Defining Data Warehousing

Operational Systems: A Transactional Solution

Analytical Systems: A Data Warehousing Solution

Comparing Transactional and Data Warehousing Solutions

Page 6: DW-1: Introduction to Data Warehousing

Defining Data Warehousing

Business Intelligence

Database Marketing: personalized products, especially S/W, cocoon business, etc.

Electronic Commerce

Data Warehouse: a data storehouse (Korean: 자료 창고) for OLAP, data mining, and DSS (decision support systems)

Knowledge Management

Data Warehousing: the process of building a data warehouse

Page 7: DW-1: Introduction to Data Warehousing

Defining Data Warehousing

A Data Warehouse Is a Database That Contains:

Enterprise data

Integrated sets of historical data

Subject-oriented, consolidated, consistent data

Data structured for distribution and querying

A Data Warehousing Solution Is a Process That:

Retrieves and transforms data

Manages the database

Uses tools for building and managing the data warehouse

Page 8: DW-1: Introduction to Data Warehousing

Operational Systems: A Transactional Solution

Track Individual Events

Used for Real-time Data Entry and Editing

Examples:

Order-tracking applications

Customer service applications

Point-of-sale applications

Service-based sales applications

Banking functions

Page 9: DW-1: Introduction to Data Warehousing

Analytical Systems: A Data Warehousing Solution

Assist with Strategic Decision Support

Provide Different Levels of Analysis

Allow Users to Navigate to Different Levels of Data

Allow System Searches to Find New Relationships

Examples:

Spreadsheet-based applications

Sales forecasting applications

Page 10: DW-1: Introduction to Data Warehousing

Comparing Transactional and Data Warehousing Solutions

                    Transactional solutions    Data warehousing solutions
Update frequency    Real-time                  Periodically
Structured for      Data integrity             Ease in querying
Optimized for       Transaction performance    Query performance
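
A minimal sketch of the two access patterns compared above, using Python's built-in sqlite3 module. The orders table and its rows are hypothetical. Both statements run against one table only to keep the example short; in a real deployment the analytical query would typically run against a separate, periodically loaded warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, amount REAL, status TEXT)"
)
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", [
    (1, "East", 100.0, "open"),
    (2, "East", 250.0, "open"),
    (3, "West", 75.0, "open"),
])

# Transactional style: change one row in a short transaction, in real time.
with conn:
    conn.execute("UPDATE orders SET status = 'shipped' WHERE order_id = ?", (2,))

# Analytical style: read many rows and summarize them for decision support.
sales_by_region = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()

print(sales_by_region)  # [('East', 350.0), ('West', 75.0)]
```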

Page 11: DW-1: Introduction to Data Warehousing

Data Marts and Data Warehouses

What Is a Data Mart

Moving Data from a Data Warehouse to Data Marts

Moving Data from Data Marts to a Data Warehouse

Page 12: DW-1: Introduction to Data Warehousing

What Is a Data Mart

What Is a Data Mart:

A subset of a data warehouse

Used in an enterprise

Specific to a particular subject or business activity

Why Build Data Marts:

Faster queries and fewer users

Faster deployment time

Integrated Data Marts:

Ensure consistent data

Require advance planning
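
The idea of a data mart as a subject-specific subset of a data warehouse can be sketched as follows. The warehouse_facts and sales_mart tables, their columns, and the figures are hypothetical. The mart is derived from the warehouse with a filtering CREATE TABLE ... AS SELECT, which is only one of several possible ways to populate it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warehouse_facts (subject TEXT, region TEXT, month TEXT, amount REAL)"
)
conn.executemany("INSERT INTO warehouse_facts VALUES (?, ?, ?, ?)", [
    ("sales", "Southeast", "1999-05", 62.5),
    ("sales", "Northeast", "1999-05", 40.0),
    ("finance", "Southeast", "1999-05", 12.0),
    ("customer_service", "Southeast", "1999-05", 3.0),
])

# The mart keeps only the rows for one subject (sales) and drops the rest.
conn.execute("""
    CREATE TABLE sales_mart AS
    SELECT region, month, amount
    FROM warehouse_facts
    WHERE subject = 'sales'
""")

mart_rows = conn.execute(
    "SELECT region, amount FROM sales_mart ORDER BY region"
).fetchall()
print(mart_rows)  # [('Northeast', 40.0), ('Southeast', 62.5)]
```

Queries against sales_mart scan fewer rows than queries against the full warehouse table, which is the "faster queries" benefit listed above.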

Page 13: DW-1: Introduction to Data Warehousing

Moving Data From a Data Warehouse to Data Marts

Advantages:

Shared fields

Common source

Distributed processing

Disadvantages:

Longer time to develop

[Diagram: Source 1, Source 2, Source 3 -> Data Warehouse -> Customer Service Mart, Sales Mart, Financial Mart]

Page 14: DW-1: Introduction to Data Warehousing

Moving Data from Data Marts to a Data Warehouse

Advantages:

Simpler and faster to implement

Department-specific data

Smaller hardware requirements

Disadvantages:

Data duplication

Incompatible data marts

[Diagram: Source 1, Source 2, Source 3 -> Sales Mart, Financial Mart, Customer Service Mart -> Data Warehouse]

Page 15: DW-1: Introduction to Data Warehousing

The Data Warehousing Process

Basic Elements of the Process

Tools to Manage the Process

Page 16: DW-1: Introduction to Data Warehousing

Basic Elements of the Process

Elements: Source OLTP Systems, Data Warehouse, Data Marts, Clients

Steps:

1. Retrieve Data

2. Transform Data

3. Populate Data Warehouse

4. Populate Data Marts

5. Query the Data

Page 17: DW-1: Introduction to Data Warehousing

Tools to Manage the Process

SQL Server

Data Transformation Services

SQL Server OLAP Services

Microsoft Repository

Microsoft English Query

PivotTable Service

Page 18: DW-1: Introduction to Data Warehousing

ETL process

Extraction, Transformation, Loading

Extraction (Korean: 추출):

Data retrieval from existing data sources such as files and tables

Transformation (Korean: 변환):

Data modification, sorting, calculation, etc.

Loading (Korean: 적재):

Bulk or incremental loading from the operational DB

A time-consuming process: may use special H/W
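
The three ETL stages above can be sketched with the Python standard library. The CSV content, column names, and the sales table are hypothetical. The source is an in-memory string standing in for a file or an operational table.

```python
import csv
import io
import sqlite3

# Hypothetical operational export; in practice this would be a file or a source table.
SOURCE_CSV = """order_id,city,units,unit_price
1,miami,2,5.50
2,Tampa,1,3.00
3,miami,4,5.50
"""

def extract(text):
    """Extraction: read raw rows from an existing data source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transformation: normalize values, calculate derived fields, sort."""
    cleaned = [
        (int(r["order_id"]), r["city"].title(), int(r["units"]) * float(r["unit_price"]))
        for r in rows
    ]
    return sorted(cleaned, key=lambda row: (row[1], row[0]))

def load(conn, rows):
    """Loading: bulk-insert the transformed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (order_id INTEGER PRIMARY KEY, city TEXT, amount REAL)"
    )
    with conn:
        # INSERT OR REPLACE lets the same batch be reloaded without duplicating rows.
        conn.executemany("INSERT OR REPLACE INTO sales VALUES (?, ?, ?)", rows)

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))

totals = conn.execute(
    "SELECT city, SUM(amount) FROM sales GROUP BY city ORDER BY city"
).fetchall()
print(totals)  # [('Miami', 33.0), ('Tampa', 3.0)]
```

Keeping the three stages as separate functions is a design choice for readability; an incremental load would call the same functions on only the rows that changed since the previous run.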

Page 19: DW-1: Introduction to Data Warehousing

Data in a Data Warehouse

Data Characteristics

Example of Organizing Data

Page 20: DW-1: Introduction to Data Warehousing

Data Characteristics

Data characteristic    Description
Consolidated           Enterprise-wide
Consistent             Within the data warehouse
Subject-oriented       Organized to user perspective
Historical             Snapshots over time
Read-only              Cannot update
Summarized             To appropriate level of detail

Page 21: DW-1: Introduction to Data Warehousing

Example of Organizing Data

Monthly Southeast Regional Sales Report - May 1999

State                   City      Units Sold    Sales $
FL                      Miami          2,500    $12,850
FL                      Tampa          2,750    $14,135
FL Totals                              5,250    $26,985
GA                      Atlanta        3,200    $16,800
GA                      Savannah       1,725    $ 9,143
GA Totals                              4,925    $25,943
SC                      Columbia       1,900    $ 9,595
SC Totals                              1,900    $ 9,595
Southeast Region Total                12,075    $62,523
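
The summarized layout of the report can be reproduced from its city-level rows. This is a minimal sketch in plain Python: the city figures are taken from the report, while the variable names are illustrative only.

```python
from collections import defaultdict

# City-level rows from the May 1999 report: (state, city, units_sold, sales_dollars)
city_rows = [
    ("FL", "Miami", 2500, 12850),
    ("FL", "Tampa", 2750, 14135),
    ("GA", "Atlanta", 3200, 16800),
    ("GA", "Savannah", 1725, 9143),
    ("SC", "Columbia", 1900, 9595),
]

# Summarize to the state level: state -> [units, sales]
state_totals = defaultdict(lambda: [0, 0])
for state, _city, units, sales in city_rows:
    state_totals[state][0] += units
    state_totals[state][1] += sales

# Summarize once more to the region level.
region_units = sum(units for units, _ in state_totals.values())
region_sales = sum(sales for _, sales in state_totals.values())

for state, (units, sales) in state_totals.items():
    print(f"{state} Totals  {units:>6,}  ${sales:,}")
print(f"Region     {region_units:>6,}  ${region_sales:,}")
```

Each level of the report is a sum over the level beneath it, which is what "summarized to the appropriate level of detail" means in practice.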

Page 22: DW-1: Introduction to Data Warehousing

Data Warehouse Schema Example: Star schema

Page 23: DW-1: Introduction to Data Warehousing

An Example of Cube Browsing

One fact table with four dimension tables: Sales_Fact (fact); Product, Store, Time, Customer (dimensions)
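
A star schema with this shape can be sketched in sqlite3 as shown here. Only the five table names come from the slide. Every column name and every row is an assumption made for illustration, since the original schema diagram is not available.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables (column names are assumed, not from the slides)
    CREATE TABLE Product  (product_id  INTEGER PRIMARY KEY, product_name  TEXT);
    CREATE TABLE Store    (store_id    INTEGER PRIMARY KEY, city          TEXT);
    CREATE TABLE Time     (time_id     INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
    CREATE TABLE Customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT);

    -- Fact table: one foreign key per dimension plus the numeric measures
    CREATE TABLE Sales_Fact (
        product_id   INTEGER REFERENCES Product(product_id),
        store_id     INTEGER REFERENCES Store(store_id),
        time_id      INTEGER REFERENCES Time(time_id),
        customer_id  INTEGER REFERENCES Customer(customer_id),
        units_sold   INTEGER,
        sales_amount REAL
    );

    INSERT INTO Product  VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO Store    VALUES (1, 'Miami'), (2, 'Atlanta');
    INSERT INTO Time     VALUES (1, 1999, 5), (2, 1999, 6);
    INSERT INTO Customer VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO Sales_Fact VALUES
        (1, 1, 1, 1, 10, 100.0),
        (2, 1, 1, 2,  5,  75.0),
        (1, 2, 1, 1,  4,  40.0),
        (1, 1, 2, 2,  6,  60.0);
""")

# A typical star-schema query: join the fact table to the dimensions it needs,
# filter on one dimension (Time) and group by another (Store).
may_sales_by_city = conn.execute("""
    SELECT s.city, SUM(f.sales_amount)
    FROM Sales_Fact f
    JOIN Store s ON s.store_id = f.store_id
    JOIN Time  t ON t.time_id  = f.time_id
    WHERE t.year = 1999 AND t.month = 5
    GROUP BY s.city
    ORDER BY s.city
""").fetchall()

print(may_sales_by_city)  # [('Atlanta', 40.0), ('Miami', 175.0)]
```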

Page 24: DW-1: Introduction to Data Warehousing

Drilling Down

Drilling Down to products

Page 25: DW-1: Introduction to Data Warehousing

Drilling Down

Drilling Down to the lowest level of Customer Dimension

Page 26: DW-1: Introduction to Data Warehousing

Rolling up

Rolling up
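
Drilling down and rolling up can both be viewed as the same aggregation run at different levels of a dimension hierarchy. The sketch below assumes a hypothetical two-level product hierarchy (category, then product) with made-up figures. The summarize helper is illustrative and is not part of any OLAP tool.

```python
from collections import defaultdict

# Hypothetical sales rows: (category, product, amount)
sales = [
    ("Beverages", "Coffee", 120),
    ("Beverages", "Tea", 80),
    ("Snacks", "Chips", 50),
    ("Snacks", "Nuts", 70),
    ("Snacks", "Chips", 30),
]

def summarize(rows, level):
    """Total the amount, grouped by the first `level` columns of the hierarchy."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[:level]] += row[-1]
    return dict(totals)

by_category = summarize(sales, 1)  # rolled up to the category level
by_product = summarize(sales, 2)   # drilled down to individual products

print(by_category)  # {('Beverages',): 200, ('Snacks',): 150}
print(by_product[("Snacks", "Chips")])  # 80
```

Moving from by_category to by_product is a drill-down; moving in the other direction is a roll-up.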

Page 27: DW-1: Introduction to Data Warehousing

Review

What Is Data Warehousing

Data Marts and Data Warehouses

The Data Warehousing Process

Data in a Data Warehouse

Will data warehouses become more popular than databases?