Post on 20-Dec-2015
Modeling the Data Warehouse
Chapter 7
Data Warehouse Database Design Phases
 - Defining the business model (conceptual model)
 - Creating the dimensional model (logical model)
 - Modeling summaries
 - Creating the physical model
[Diagram: design phases, from selecting a business process to the physical model]
Performing Strategic Analysis
Phase 1: Defining the Business Model
Select a business process
 - Performing strategic analysis
 - Creating the business (conceptual) model
Creating the Business Model
Phase 1: Defining the Business Model
Performing strategic analysis
Creating the business (conceptual) model
 - Defining business requirements
 - Identifying the business measures
 - Identifying the dimensions
 - Identifying the grain
 - Identifying the business definitions and rules
 - Verifying data sources
Business Requirements Drive the Design Process
Primary input:
 - Business requirements
Other inputs:
 - Existing metadata
 - Production ERD model
 - Research
 - Nonrelational legacy systems
Identifying Measures and Dimensions
Measures
The attribute varies continuously:
•Balance
•Units Sold
•Cost
•Sales
Dimensions
The attribute is perceived as a constant or discrete value:
•Description
•Location
•Color
•Size
Determining Granularity
YEAR?
QUARTER?
MONTH?
WEEK?
DAY?
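The grain question above can be made concrete with a small sketch. It assumes hypothetical sales rows stored at the DAY grain (the dates and amounts are invented for illustration) and shows that a fine grain can always be rolled up to month, quarter, or year, whereas data stored only at a coarse grain cannot be broken back down.

```python
from collections import defaultdict
from datetime import date

# Hypothetical facts stored at DAY grain: (sale date, sales dollars).
daily_sales = [
    (date(1999, 1, 4), 100.0),
    (date(1999, 1, 5), 150.0),
    (date(1999, 2, 1), 200.0),
    (date(1999, 4, 12), 50.0),
]

def roll_up(rows, key):
    """Aggregate day-grain rows to the coarser grain produced by `key`."""
    totals = defaultdict(float)
    for day, dollars in rows:
        totals[key(day)] += dollars
    return dict(totals)

# Coarser grains are derived from the day grain, never the other way round.
by_month = roll_up(daily_sales, lambda d: (d.year, d.month))
by_quarter = roll_up(daily_sales, lambda d: (d.year, (d.month - 1) // 3 + 1))
by_year = roll_up(daily_sales, lambda d: d.year)
```

Choosing a finer grain costs storage and load time but keeps every coarser question answerable.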
Identifying Business Rules
Location
Geographic proximity: 0 - 1 miles, 1 - 5 miles, > 5 miles
Product
Type     Monitor   Status
PC       15 inch   New
Server   17 inch   Rebuilt
         19 inch   Custom
         None
Time
Month > Quarter > Year
Store
Store > District > Region
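Business rules like these usually end up as small, explicit mappings. The sketch below is one possible encoding, not a prescribed one: the handling of the band boundaries (exactly 1 mile and exactly 5 miles) is an assumption, since the slide does not say which band they belong to, and the store, district, and region identifiers are hypothetical.

```python
def proximity_band(miles: float) -> str:
    """Map a distance to one of the geographic proximity bands.

    Assumption: 1 mile falls in the first band and 5 miles in the second.
    """
    if miles < 0:
        raise ValueError("distance cannot be negative")
    if miles <= 1:
        return "0 - 1 miles"
    if miles <= 5:
        return "1 - 5 miles"
    return "> 5 miles"

# Hypothetical Store > District > Region hierarchy.
STORE_TO_DISTRICT = {"S1": "D1", "S2": "D1", "S3": "D2"}
DISTRICT_TO_REGION = {"D1": "North", "D2": "South"}

def region_of(store_id: str) -> str:
    """Walk the hierarchy one level at a time: store, district, region."""
    return DISTRICT_TO_REGION[STORE_TO_DISTRICT[store_id]]
```

Writing the rule down this way forces the boundary cases to be decided with the business users rather than left implicit.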
Creating the Dimensional Model
Identify fact tables
 - Translate business measures into fact tables
 - Analyze source system information for additional measures
 - Identify base and derived measures
 - Document additivity of measures
Identify dimension tables
Link fact tables to the dimension tables
Create views for users
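Why additivity needs documenting is easiest to see with numbers. This sketch assumes two hypothetical fact rows with the base measures sales dollars and sales units, and a derived measure (unit price) that is a ratio of the two. Base measures can be summed across rows; the derived ratio cannot.

```python
# Hypothetical fact rows: (sales_dollars, sales_units). Both are base measures.
fact_rows = [(100.0, 10), (300.0, 20)]

# Base measures are additive: summing across rows gives a meaningful total.
total_dollars = sum(dollars for dollars, _ in fact_rows)
total_units = sum(units for _, units in fact_rows)

def unit_price(dollars: float, units: int) -> float:
    """Derived measure: computed from base measures, not stored independently."""
    return dollars / units

# Summing the derived measure row by row gives a number with no business meaning.
sum_of_prices = sum(unit_price(d, u) for d, u in fact_rows)

# The derived measure must be recomputed from the summed base measures instead.
overall_price = unit_price(total_dollars, total_units)
```

Here the per-row prices are 10 and 15, so their sum is 25, while the price over all units is 400 / 30, roughly 13.33.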
Dimension Tables
Dimension tables have the following characteristics:
 - Contain textual information that represents the attributes of the business
 - Contain relatively static data
 - Are joined to a fact table through a foreign key reference
[Diagram: Facts (units, price) surrounded by the Channel, Time, Product, and Customer dimensions]
Fact Tables
Fact tables have the following characteristics:
 - Contain numeric measures (metrics) of the business
 - May contain summarized (aggregated) data
 - May contain date-stamped data
 - Are typically additive
 - Have a key value that is typically a concatenated key composed of the primary keys of the dimensions
 - Are joined to dimension tables through foreign keys that reference primary keys in the dimension tables
[Diagram: the Facts (units, price) fact table joined to the Channel, Time, Product, and Customer dimension tables]
Star Schema Model
 - Central fact table
 - Radiating dimensions
 - Denormalized model
Sales Fact Table: Product_id, Store_id, Item_id, Day_id, Sales_dollars, Sales_units
Product Table: Product_id, Product_desc
Store Table: Store_id, District_id
Time Table: Day_id, Month_id, Year_id
Item Table: Item_id, Item_desc
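The star schema above can be tried out with Python's built-in sqlite3 module. This is a minimal sketch under stated assumptions: the column types, the sample rows, and the table name time_dim (used instead of "time" to avoid confusion with SQL's time function) are all illustrative choices, not part of the slide. The fact table's primary key is the concatenation of the dimension keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product  (product_id INTEGER PRIMARY KEY, product_desc TEXT);
CREATE TABLE store    (store_id   INTEGER PRIMARY KEY, district_id INTEGER);
CREATE TABLE item     (item_id    INTEGER PRIMARY KEY, item_desc TEXT);
CREATE TABLE time_dim (day_id     INTEGER PRIMARY KEY, month_id INTEGER, year_id INTEGER);
CREATE TABLE sales_fact (
    product_id    INTEGER REFERENCES product(product_id),
    store_id      INTEGER REFERENCES store(store_id),
    item_id       INTEGER REFERENCES item(item_id),
    day_id        INTEGER REFERENCES time_dim(day_id),
    sales_dollars REAL,
    sales_units   INTEGER,
    PRIMARY KEY (product_id, store_id, item_id, day_id)  -- concatenated key
);
""")

# Hypothetical sample rows.
conn.executemany("INSERT INTO product VALUES (?, ?)", [(1, "Monitor"), (2, "Server")])
conn.executemany("INSERT INTO store VALUES (?, ?)", [(10, 100)])
conn.executemany("INSERT INTO item VALUES (?, ?)", [(5, "Standard item")])
conn.executemany("INSERT INTO time_dim VALUES (?, ?, ?)",
                 [(19990104, 199901, 1999), (19990201, 199902, 1999)])
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?, ?)", [
    (1, 10, 5, 19990104, 200.0, 2),
    (1, 10, 5, 19990201, 100.0, 1),
    (2, 10, 5, 19990104, 900.0, 1),
])

# A typical star query: join the fact table to the dimensions it needs, then aggregate.
star_rows = conn.execute("""
    SELECT p.product_desc, t.year_id, SUM(f.sales_dollars), SUM(f.sales_units)
    FROM sales_fact f
    JOIN product  p ON p.product_id = f.product_id
    JOIN time_dim t ON t.day_id = f.day_id
    GROUP BY p.product_desc, t.year_id
    ORDER BY p.product_desc
""").fetchall()
```

Every dimension is one join away from the fact table, which is what keeps star queries simple to write.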
Star Schema Model
 - Easy for users to understand
 - Fast response to queries
 - Simple metadata
 - Supported by many front end tools
 - Less robust to change
 - Slower to build
 - Does not support history
Snowflake Schema Model
Sales Fact Table: Product_id, Store_id, Item_id, Day_id, Sales_dollars, Sales_units
Product Table: Product_id, Product_desc
Store Table: Store_id, District_id
Time Table: Day_id, Month_id, Year_id
Item Table: Item_id, Item_desc
District Table: District_id, District_desc
Dept Table: Dept_id, Dept_desc, Mgr_id
Mgr Table: Dept_id, Mgr_id, Mgr_name
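The practical effect of snowflaking is an extra join per normalized level. The sketch below, again using sqlite3, keeps only the Store and District tables plus a cut-down fact table; the reduced column set, the types, and the sample rows are assumptions made to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE district (district_id INTEGER PRIMARY KEY, district_desc TEXT);
CREATE TABLE store (
    store_id    INTEGER PRIMARY KEY,
    district_id INTEGER REFERENCES district(district_id)
);
-- Cut-down fact table: only the columns this example needs.
CREATE TABLE sales_fact (
    store_id      INTEGER REFERENCES store(store_id),
    day_id        INTEGER,
    sales_dollars REAL
);
""")

# Hypothetical sample rows.
conn.executemany("INSERT INTO district VALUES (?, ?)", [(100, "Downtown"), (200, "Suburbs")])
conn.executemany("INSERT INTO store VALUES (?, ?)", [(10, 100), (11, 100), (12, 200)])
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)",
                 [(10, 1, 100.0), (11, 1, 50.0), (12, 1, 70.0)])

# Reaching the district description takes two joins: fact -> store -> district.
snowflake_rows = conn.execute("""
    SELECT d.district_desc, SUM(f.sales_dollars)
    FROM sales_fact f
    JOIN store    s ON s.store_id = f.store_id
    JOIN district d ON d.district_id = s.district_id
    GROUP BY d.district_desc
    ORDER BY d.district_desc
""").fetchall()
```

In a star schema the district description would sit directly in the store dimension, saving the second join at the cost of repeating the description for every store in the district.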
Snowflake Schema Model
 - Direct use by some tools
 - More flexible to change
 - Provides for speedier data loading
 - May become large and unmanageable
 - Degrades query performance
 - More complex metadata
[Diagram: Country, State, County, City hierarchy]
Using Summary Data
Phase 3: Modeling summaries
 - Provides fast access to precomputed data
 - Reduces use of I/O, CPU, and memory
 - Is distilled from source systems and precalculated summaries
 - Usually exists in summary fact tables
Designing Summary Tables
Summary calculations: Average, Maximum, Total, Percentage
[Table: Units, Sales($), and Store columns with total rows for Product A, Product B, and Product C]
Summary Tables Example
SALES FACTS
Sales$   Region   Month
10,000   North    Jan 99
12,000   North    Feb 99
11,000   South    Jan 99
15,000   West     Mar 99
18,000   South    Feb 99
20,000   North    Jan 99
10,000   East     Jan 99
2,000    West     Mar 99

SALES BY MONTH/REGION
Month    Region   Tot_Sales$
Jan 99   North    30,000
Jan 99   South    11,000
Jan 99   East     10,000
Feb 99   North    12,000
Feb 99   South    18,000
Mar 99   West     17,000

SALES BY MONTH
Month    Tot_Sales
Jan 99   51,000
Feb 99   30,000
Mar 99   17,000
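The eight SALES FACTS rows can be rolled up in a few lines of Python. The sketch below computes a month/region summary directly from those rows, then derives the month-only summary from the first summary rather than from the facts. The function and variable names are illustrative.

```python
from collections import defaultdict

# The eight fact rows from the example: (sales dollars, region, month).
SALES_FACTS = [
    (10_000, "North", "Jan 99"),
    (12_000, "North", "Feb 99"),
    (11_000, "South", "Jan 99"),
    (15_000, "West", "Mar 99"),
    (18_000, "South", "Feb 99"),
    (20_000, "North", "Jan 99"),
    (10_000, "East", "Jan 99"),
    (2_000, "West", "Mar 99"),
]

def summarize_by_month_region(facts):
    """Precompute total sales for each (month, region) pair."""
    totals = defaultdict(int)
    for sales, region, month in facts:
        totals[(month, region)] += sales
    return dict(totals)

def summarize_by_month(month_region_totals):
    """Build the coarser summary from the finer summary, not from the facts."""
    totals = defaultdict(int)
    for (month, _region), sales in month_region_totals.items():
        totals[month] += sales
    return dict(totals)

sales_by_month_region = summarize_by_month_region(SALES_FACTS)
sales_by_month = summarize_by_month(sales_by_month_region)
```

Building one summary from another is the same idea as distilling summary data from precalculated summaries: the coarser table never needs to rescan the detail rows.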
Summary Management in Oracle8i
Summary advisor:
 - Summary recommendations
 - Space requirements
 - Summary usage
[Diagram: Sales summary with the Region, State, City hierarchy and the Product and Time dimensions]
Using Time in the Data Warehouse
The Time Dimension
 - Time is critical to the data warehouse
 - A consistent representation of time is required for extensibility
[Diagram: Sales fact joined to the Time dimension]
Where should the element of time be stored?
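One common answer is to store time in its own dimension table, generated once and shared by every fact table. The sketch below builds such a table in Python. The key formats (YYYYMMDD for day_id, YYYYMM for month_id) and the extra quarter and day_of_week attributes are assumptions chosen for illustration; the slides only name Day_id, Month_id, and Year_id.

```python
from datetime import date, timedelta

def build_time_dimension(start: date, end: date) -> list[dict]:
    """Generate one time-dimension row per calendar day from start to end inclusive."""
    rows = []
    day = start
    while day <= end:
        rows.append({
            "day_id": int(day.strftime("%Y%m%d")),   # assumed key format
            "month_id": int(day.strftime("%Y%m")),   # assumed key format
            "quarter": (day.month - 1) // 3 + 1,
            "year_id": day.year,
            "day_of_week": day.isoweekday(),         # 1 = Monday ... 7 = Sunday
        })
        day += timedelta(days=1)
    return rows

time_dim = build_time_dimension(date(1999, 1, 1), date(1999, 12, 31))
```

Because every fact row points at the same day_id values, month, quarter, and year are represented consistently no matter which fact table is queried.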
Creating the Physical Model
Phase 4: Creating the Physical Model
 - Translate the dimensional design to a physical model for implementation
 - Define storage strategy for tables and indexes
 - Perform database sizing
 - Define initial indexing strategy
 - Define partitioning strategy
 - Update metadata document with physical information
Physical Model Design Tasks
 - Define naming and database standards
 - Perform database sizing
 - Design tablespaces
 - Develop initial indexing strategy
 - Develop data partition strategy
 - Define storage parameters
 - Set initialization parameters
 - Use parallel processing
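Database sizing often starts as back-of-the-envelope arithmetic before any tool is involved. The sketch below is one such rough estimate, not a vendor formula: the function name, the 50% allowance for indexes, and the sample volumes are all assumptions.

```python
def estimate_fact_table_bytes(rows_per_day: int,
                              days_retained: int,
                              bytes_per_row: int,
                              index_overhead: float = 0.5) -> int:
    """Rough fact-table size: raw row data plus a fractional allowance for indexes.

    index_overhead is an assumed ratio of index size to data size.
    """
    raw_bytes = rows_per_day * days_retained * bytes_per_row
    return int(raw_bytes * (1 + index_overhead))

# Hypothetical volumes: 100,000 rows a day, two years retained, 40-byte rows.
estimated_bytes = estimate_fact_table_bytes(
    rows_per_day=100_000, days_retained=730, bytes_per_row=40
)
```

With these inputs the raw data is 2.92 billion bytes and the estimate with indexes is 4.38 billion bytes, which is the kind of figure that feeds tablespace design and the partitioning strategy.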
Using Data Modeling Tools
Tools with a GUI enable definition, modeling, and reporting
Avoid a mix of modeling techniques caused by:
 - Development pressure
 - Developers who lack knowledge
 - No strategy
Determine a strategy
Write and publish formally
Make available electronically
[Diagram: spreadsheets, CASE tools, paper and pencil]
Summary
This lesson discussed the following topics:
 - Creating a business model
 - Creating a dimensional model
 - Modeling the summaries
 - Creating a physical model
[Diagram: select among business processes, then business model, dimensional model, physical model]