(kajal maam(principles of dimensional modeling

31
Principles Of Dimensional Modeling

Upload: rahulmhatre26

Post on 29-Mar-2015

199 views

Category:

Documents


2 download

DESCRIPTION

Uploaded from Google Docs

TRANSCRIPT

Page 1: (Kajal Maam(Principles of Dimensional Modeling

Principles Of Dimensional Modeling

Page 2: (Kajal Maam(Principles of Dimensional Modeling

Design Requirements

Design of the DW must directly reflect the way the managers look at the business

Should capture the measurements of importance along with parameters by which these parameters are viewed

It must facilitate data analysis, i.e., answering business questions

Page 3: (Kajal Maam(Principles of Dimensional Modeling

3

What is Dimensional Modeling (DM)?

DM is a logical design technique that seeks to present the data in a standard, intuitive framework that allows for high-performance access.

Can be implemented using a relational or a multidimensional DBMS, with some restrictions.

It is different from ER modeling Every dimensional model is composed of one table with a

multipart key, called the fact table, and a set of smaller tables called dimension tables.

Each dimension table has a single-part primary key that corresponds exactly to one of the components of the multipart key in the fact table.

This characteristic "star-like" structure is often called a star schema.

Page 4: (Kajal Maam(Principles of Dimensional Modeling

ER Modeling

A logical design technique that seeks to eliminate data redundancy

Illuminates the microscopic relationships among data elements

Perfect for OLTP systems Responsible for success of transaction processing in

Relational Databases

Page 5: (Kajal Maam(Principles of Dimensional Modeling

Problems with ER Model

ER models are NOT suitable for DW? End user cannot understand or remember an ER

Model Many DWs have failed because of overly complex ER

designs Not optimized for complex, ad-hoc queries Data retrieval becomes difficult due to normalization Browsing becomes difficult

Page 6: (Kajal Maam(Principles of Dimensional Modeling

ER vs Dimensional Modeling

ER models are constituted to Remove redundant data (normalization) Facilitate retrieval of individual records having

certain critical identifiers Thereby optimizing OLTP performance

Dimensional model supports the reporting and analytical needs of a data warehouse system.

Page 7: (Kajal Maam(Principles of Dimensional Modeling

7

Comparison between the ER & Dimensional Model

Dimensional Modeling ER Model

Support adhoc querying for business analyses and complex analyses

Support for OLTP

The data model is multidimensional

The data model has two dimensions

It is asymmetric It is symmetric

Permit redundancy Removes redundancy

It is extensible, application is not changed

If the model is modified ,applications are modified

It can be done independent of expected query patterns

It is variable in structure and very vulnerable to changes in the user’s querying habits

Easy and understandable Hard for people to visualize

Models the business practically Models the micro relationships among data elements

Page 8: (Kajal Maam(Principles of Dimensional Modeling

8

Dimension Modeling Concepts

Design goals :user understanability,Query performance,resilience to change

Components of DM: Fact Tables Dimension Tables

Page 9: (Kajal Maam(Principles of Dimensional Modeling

9

Inside Dimension table

Dimensional table key Large no. of attributes Textual attributes Attributes not directly related Flattened table,not normalized Ability to drill down/roll up Multiple hierarchies Less number of records

Page 10: (Kajal Maam(Principles of Dimensional Modeling

10

Inside Fact Table

Concatenated fact table key Grain/level of data identified Fully-additive-all dimensions Semi-additive-some dimensions Large no. of records Few attributes Sparsity of data Degenerate dimensions

Page 11: (Kajal Maam(Principles of Dimensional Modeling

11

Factless Fact Table

Some fact tables have no measured facts Useful to describe events and

coverage ,tables contain information that something has/has not happened

Often used to represent many-to-many relationships

The only thing they contain is concatenated key

Page 12: (Kajal Maam(Principles of Dimensional Modeling

12

Star Schema keys

Primary keys Surrogate keys Foreign keys

Page 13: (Kajal Maam(Principles of Dimensional Modeling

Modeling Design Process

1. Identify the Business Process Source of “measurements”

2. Identify the Grain What does 1 row in the fact table represent

or mean?

3. Identify the Dimensions Descriptive context, true to the grain

4. Identify the Facts Numeric additive measurements, true to the

grain

Page 14: (Kajal Maam(Principles of Dimensional Modeling

Step 1 - Identify the Business Process

This is a business activity typically tied to a source system.

Not to be confused with a business department or function. An Orders dimensional model should support the activities of both Sales and Marketing.

“If we establish departmentally bound dimensional models, we’ll inevitably duplicate data with different labels and terminology.”

Page 15: (Kajal Maam(Principles of Dimensional Modeling

Step 2 - Identify the Grain

The level of detail associated with the fact table measurements.

A critical step necessary before steps 3 and 4. Preferably it should be at the most atomic

level possible. “How do you describe a single row in the fact

table?”

Page 16: (Kajal Maam(Principles of Dimensional Modeling

Step 3 - Identify the Dimensions

The list of all the discrete, text-like attributes that emanate from the fact table.

They are the “by” words used to describe the requirements.

Each dimension could be though of as an analytical “entry point” to the facts.

“How do business people describe the data that results from the business process?”

Page 17: (Kajal Maam(Principles of Dimensional Modeling

Step 4 - Identify the Facts

Must be true to the grain defined in step 2. Typical facts are numeric additive figures. Facts that belong to a different grain belong in

a separate fact table. Facts are determined by answering the

question, “What are we measuring?” Percentages and ratios, such as gross margin,

are non-additive. The numerator and denominator should be stored in the fact table.

Page 18: (Kajal Maam(Principles of Dimensional Modeling

18

Advantages of star schema

Easy for users to understand Optimizes navigation Most suitable for query processing

Page 19: (Kajal Maam(Principles of Dimensional Modeling

19

DM:Advanced Topics

Slowly Changing dimensionsType 1 changes: Correction of errorsIs used when

the old value of the attribute has no significance or can be discarded.

Easy and Fast

Type 2 changes: preservation history Partitions history so that fact tables properly

reflect original values. Requires use of Surrogate Keys Causes table growth due to additional history rows Users must be aware of the added complexity Effective Dates used secondary to cleaner fact joins

Page 20: (Kajal Maam(Principles of Dimensional Modeling

20

Type 3 changes: tentative soft revisions Additional attribute used to capture changes.

Used less frequently then Type 1 or 2. Relate to tentative changes in the source systems. Used to compare performances. Ability to track forward and backward

Page 21: (Kajal Maam(Principles of Dimensional Modeling

21

Large Dimensions Rapidly changing dimensions

Page 22: (Kajal Maam(Principles of Dimensional Modeling

22

Snowflake Schema

Snowflaking is a method of normalizing the dimension tables in STAR schema

Advantages: Small savings in storage space Normalized structures are easier to update and maintainDisadvantage: Schema less intuitive and end users are put off by

complexity Ability to browse through the contents difficult Degraded query performance because of additional joins

Page 23: (Kajal Maam(Principles of Dimensional Modeling

23

Star Schema

Page 24: (Kajal Maam(Principles of Dimensional Modeling

24

Flattened Star

Page 25: (Kajal Maam(Principles of Dimensional Modeling

25CSE 5331/7331

F'07

Normalized Star

Page 26: (Kajal Maam(Principles of Dimensional Modeling

26CSE 5331/7331

F'07

Snowflake Schema

Page 27: (Kajal Maam(Principles of Dimensional Modeling

27

Snowflake Schema

Star Schema

Joins: Higher number of Joins Fewer Joins

Ease of Use:

More complex queries and hence less easy to understand

Less no. of foreign keys and hence lesser query execution time

Query Performance:More foreign keys-and hence more query execution time

Less no. of foreign keys and hence lesser query execution time

Ease of maintenance/change:

No redundancy and hence more easy to maintain and change

Has redundant data and hence less easy to maintain/change

Type of Data warehouse:

Good to use for small data warehouses/datamarts

Good for large data warehouses

Dimension table:

It may have more than one dimension table for each dimension

Contains only single dimension table for each dimension

DimTable Normalization:

3 Normal Form2 Normal Denormalized Form

Page 28: (Kajal Maam(Principles of Dimensional Modeling

28

Fact Constellation schema

It is shaped like constellation of stars For each star schema or snowflake schema it is possible to

construct a fact constellation schema This schema is more complex than star or snowflake architecture,

which is because it contains multiple fact tables allows dimension tables to be shared amongst many fact tables. solution is very flexible, however it may be hard to manage and

support. The main disadvantage of the fact constellation schema is a more

complicated design because many variants of aggregation must be considered

Different fact tables are explicitly assigned to the dimensions, which are for given facts relevant. This may be useful in cases when some facts are associated with a given dimension level and other facts with a deeper dimension level.

Page 29: (Kajal Maam(Principles of Dimensional Modeling

29

Dimensional Model Star Schema

Page 30: (Kajal Maam(Principles of Dimensional Modeling

30

Snow-Flake Schema in Dimensional Modeling

Page 31: (Kajal Maam(Principles of Dimensional Modeling

31

Fact Constellation Schema