data warehouse

DATA WAREHOUSE by Sonali Chawla

Upload: sonali-chawla

Post on 14-Feb-2017

Category:

Data & Analytics


TRANSCRIPT

Page 1: Data warehouse

DATA WAREHOUSE

by Sonali Chawla

Page 2: Data warehouse

INTRODUCTION TO DATA WAREHOUSE
  Subject-oriented
  Integrated
  Time-variant
  Non-volatile

Page 3: Data warehouse

DATA WAREHOUSE CONCEPTS
  Operational Data and Informational Data
  Difference between Operational Data and Data Warehouse
  Why have a separate Data Warehouse?

Page 4: Data warehouse
Page 5: Data warehouse
Page 6: Data warehouse

DATA WAREHOUSE ARCHITECTURE
  Steps of the Design and Construction of Data Warehouse
    The design of data warehouse
    The process of data warehouse design
  Three-tier architecture
  Model
    Enterprise Warehouse
    Data Marts
    Virtual Warehouse

Page 7: Data warehouse

  Metadata Repository
  Data Warehouse Back-End Tools and Utilities
    Data Extraction
    Data Cleaning
    Data Transformation
    Load
    Refresh
  Advantages of Building Data Warehouse
  Data Warehouse Application
    Information Processing
    Analytical Processing
    Data Mining

Page 8: Data warehouse

DATABASE DATA MODELING
  Star Schema
  Snowflake Schema
  Fact Constellation Schema
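As an illustration of the star schema named on this slide, the following is a minimal sketch using Python's built-in sqlite3 module. One central fact table references two dimension tables, and a query joins the fact table to a dimension to aggregate a measure. The table names, column names, and sample rows (dim_time, dim_item, fact_sales) are assumptions made up for this example and do not come from the slides.

```python
import sqlite3


def build_star_schema():
    """Create a tiny in-memory star schema: one fact table, two dimensions."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
        CREATE TABLE dim_item (item_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
        -- The fact table holds the measures plus one key per dimension.
        CREATE TABLE fact_sales (
            time_id INTEGER REFERENCES dim_time(time_id),
            item_id INTEGER REFERENCES dim_item(item_id),
            units_sold INTEGER,
            revenue REAL
        );
        """
    )
    # Hypothetical sample rows, for illustration only.
    conn.executemany("INSERT INTO dim_time VALUES (?, ?, ?)",
                     [(1, 2016, "Q1"), (2, 2016, "Q2")])
    conn.executemany("INSERT INTO dim_item VALUES (?, ?, ?)",
                     [(1, "laptop", "computer"), (2, "phone", "mobile")])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                     [(1, 1, 3, 3000.0), (1, 2, 5, 2500.0), (2, 1, 2, 2000.0)])
    return conn


def revenue_by_category(conn):
    """Join the fact table to the item dimension and sum revenue per category."""
    rows = conn.execute(
        """
        SELECT i.category, SUM(f.revenue)
        FROM fact_sales AS f
        JOIN dim_item AS i ON f.item_id = i.item_id
        GROUP BY i.category
        ORDER BY i.category
        """
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    print(revenue_by_category(build_star_schema()))
```

A snowflake schema would, under the same assumptions, split a dimension such as dim_item further (for example into a separate category table), while a fact constellation would add a second fact table sharing these dimensions.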

Page 9: Data warehouse

OLAP AND DATA CUBE
  OLAP
  Data Cube
  Measures
    Distributive
    Algebraic
    Holistic
  Concept Hierarchies
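The three measure categories listed above (distributive, algebraic, holistic) can be told apart by how they behave when the data is split into partitions. The sketch below is one way to show this, with made-up partition values: a sum can be combined from per-partition sums, a mean can be derived from a fixed number of distributive sub-aggregates (sum and count), and a median generally needs all the values at once.

```python
from statistics import median

# Hypothetical data, already split into three partitions.
partitions = [[2, 4, 6], [10], [1, 3]]


def distributive_sum(parts):
    """Distributive: the total equals the sum of the per-partition sums."""
    return sum(sum(p) for p in parts)


def algebraic_mean(parts):
    """Algebraic: the mean is computed from two distributive measures."""
    total = sum(sum(p) for p in parts)
    count = sum(len(p) for p in parts)
    return total / count


def holistic_median(parts):
    """Holistic: the median is computed over all values taken together."""
    return median([x for p in parts for x in p])


def median_of_partition_medians(parts):
    """Combining per-partition medians does NOT give the true median in general."""
    return median([median(p) for p in parts])


if __name__ == "__main__":
    print(distributive_sum(partitions))             # 26
    print(holistic_median(partitions))              # 3.5
    print(median_of_partition_medians(partitions))  # 4
```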

Page 10: Data warehouse

  Operations on Cubes
    Roll Up
    Drill Down
    Slice and Dice
    Pivot (rotate)
    Drill Across
    Drill Through
  OLAP Server
    Relational OLAP Server
    Multidimensional OLAP Server
    Hybrid OLAP Server
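Three of the cube operations listed on this page (roll up, slice, dice) can be sketched on a toy cube stored as a Python dictionary keyed by (city, quarter, item). The cube contents and the city-to-country concept hierarchy are invented for illustration. An OLAP server would perform these operations on its own storage structures rather than on a dictionary.

```python
from collections import defaultdict

# Hypothetical cube: (city, quarter, item) -> units sold.
cube = {
    ("Delhi", "Q1", "phone"): 10,
    ("Delhi", "Q2", "phone"): 12,
    ("Mumbai", "Q1", "phone"): 7,
    ("Mumbai", "Q1", "laptop"): 4,
    ("Toronto", "Q1", "laptop"): 5,
}

# Hypothetical concept hierarchy for the location dimension.
city_to_country = {"Delhi": "India", "Mumbai": "India", "Toronto": "Canada"}


def roll_up(cells, hierarchy):
    """Roll up: climb the location hierarchy from city to country and re-aggregate."""
    result = defaultdict(int)
    for (city, quarter, item), units in cells.items():
        result[(hierarchy[city], quarter, item)] += units
    return dict(result)


def slice_cube(cells, quarter):
    """Slice: fix one dimension (quarter) to a single value, yielding a sub-cube."""
    return {(c, i): u for (c, q, i), u in cells.items() if q == quarter}


def dice(cells, cities, items):
    """Dice: select on two or more dimensions at once."""
    return {k: u for k, u in cells.items() if k[0] in cities and k[2] in items}


if __name__ == "__main__":
    print(roll_up(cube, city_to_country))
    print(slice_cube(cube, "Q1"))
    print(dice(cube, {"Mumbai", "Toronto"}, {"laptop"}))
```

Drill down is the reverse of roll up: it moves from the country level back to the city level, which requires the more detailed cells to still be available.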

Page 11: Data warehouse

DATA PROCESSING
  Data Cleaning
    Look for Missing Values
      Ignore the tuples
      Fill in the missing values manually
      Use a global constant to fill in the missing values
      Use the attribute mean to fill in the missing values
      Use the attribute mean for all samples belonging to the same class as the given tuple
      Use the most probable value to fill in the missing value
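Three of the missing-value strategies listed above (global constant, attribute mean, and attribute mean within the same class) can be sketched in plain Python. Missing values are represented as None here, and the sample incomes and class labels are made up for the example. The class-mean version assumes every class has at least one known value.

```python
def fill_with_constant(values, constant):
    """Replace every missing value with one global constant."""
    return [constant if v is None else v for v in values]


def fill_with_mean(values):
    """Replace missing values with the mean of the known values of the attribute."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]


def fill_with_class_mean(values, classes):
    """Replace a missing value with the attribute mean of the tuple's own class."""
    sums, counts = {}, {}
    for v, c in zip(values, classes):
        if v is not None:
            sums[c] = sums.get(c, 0) + v
            counts[c] = counts.get(c, 0) + 1
    # Assumes each class has at least one known value; otherwise KeyError.
    return [sums[c] / counts[c] if v is None else v
            for v, c in zip(values, classes)]


if __name__ == "__main__":
    incomes = [30, None, 50, 70, None]               # hypothetical attribute
    labels = ["low", "low", "high", "high", "high"]  # hypothetical class labels
    print(fill_with_constant(incomes, -1))        # [30, -1, 50, 70, -1]
    print(fill_with_mean(incomes))                # [30, 50.0, 50, 70, 50.0]
    print(fill_with_class_mean(incomes, labels))  # [30, 30.0, 50, 70, 60.0]
```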

Page 12: Data warehouse

  Noisy Data
    Binning
      Smoothing by bin mean
      Smoothing by bin boundaries
    Regression
      Linear
      Multiple Linear
    Clustering
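The two binning methods listed on this page can be sketched as follows. The example assumes equal-frequency bins (each bin holds the same number of sorted values), which is one common choice and is not stated in the slides. The price values are illustrative. In smoothing by bin boundaries, a value equally close to both boundaries is sent to the lower one, which is also an assumption of this sketch.

```python
def equal_frequency_bins(values, bin_size):
    """Sort the values and split them into consecutive bins of bin_size items."""
    ordered = sorted(values)
    return [ordered[i:i + bin_size] for i in range(0, len(ordered), bin_size)]


def smooth_by_bin_mean(bins):
    """Replace every value in a bin with the mean of that bin."""
    return [[sum(b) / len(b)] * len(b) for b in bins]


def smooth_by_bin_boundaries(bins):
    """Replace every value with the closer of its bin's minimum and maximum."""
    smoothed = []
    for b in bins:
        low, high = b[0], b[-1]  # bins are sorted, so these are the boundaries
        smoothed.append([low if v - low <= high - v else high for v in b])
    return smoothed


if __name__ == "__main__":
    prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]  # hypothetical sorted data
    bins = equal_frequency_bins(prices, 3)
    print(bins)                            # [[4, 8, 15], [21, 21, 24], [25, 28, 34]]
    print(smooth_by_bin_mean(bins))        # [[9.0, 9.0, 9.0], [22.0, 22.0, 22.0], [29.0, 29.0, 29.0]]
    print(smooth_by_bin_boundaries(bins))  # [[4, 4, 15], [21, 21, 24], [25, 25, 34]]
```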

Page 13: Data warehouse

  Data Integration
    Entity Identification Problem
    Redundancy
    Detection and Resolution of Data value conflicts
  Data Transformation
    Smoothing
    Aggregation
    Generalization
    Normalization
    Attribute Construction
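The slide lists normalization without naming a method. As a hedged illustration, the sketch below shows two widely used variants, min-max normalization and z-score normalization. Choosing these two is an assumption of this example, as are the sample values. Both functions assume the input values are not all identical, since that would cause a division by zero.

```python
from statistics import mean, pstdev


def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values from [min, max] to [new_min, new_max]."""
    low, high = min(values), max(values)
    scale = (new_max - new_min) / (high - low)  # assumes high != low
    return [(v - low) * scale + new_min for v in values]


def z_score_normalize(values):
    """Center on the mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)    # assumes sigma != 0
    return [(v - mu) / sigma for v in values]


if __name__ == "__main__":
    ages = [10, 20, 30]  # hypothetical attribute values
    print(min_max_normalize(ages))  # [0.0, 0.5, 1.0]
    print(z_score_normalize(ages))
```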

Page 14: Data warehouse

  Data Reduction
    Data cube aggregation
    Attribute Subset Selection
    Dimensionality Reduction
    Numerosity Reduction
    Discretization and concept hierarchy generation