Business Intelligence: A Review
Prof. Swanand [email protected]
Business Intelligence
• An information system that can be used to analyze large datasets
• Such analyses help in decision making because
  o Data relates directly to the business
  o Data provides an objective basis for decision making
  o Numbers do not lie!
• Common sectors where business intelligence is being employed
  o Organized retail (for example, market basket analysis)
  o BFSI (for example, credit ratings)
  o IT security (for example, detecting network intrusion)
  o Online marketing (for example, assessing consumers' opinions)
Data Warehouse
• A business intelligence exercise involves two parts
  o Data warehouse
  o Data mining
• A data warehouse compiles data from multiple sources and transforms it into a uniform collection
• The uniformity is in the structure of the data (for example, nomenclature of columns)
Data Mining
• Data mining involves analyzing the data collected in the data warehouse
• Business applications involve
  o Forecasting
  o Segmentation
  o Market basket analysis
  o Intrusion detection (IT security related)
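As an illustration of one of the applications listed above, market basket analysis can be sketched as counting how often pairs of items are bought together. This is only a minimal sketch: the baskets and the function name `pair_support` are hypothetical and do not come from the slides, and a real analysis would use a proper association-rule algorithm on warehouse data.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each set is one customer's basket.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

def pair_support(baskets):
    """Return the fraction of baskets that contain each pair of items."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}

support = pair_support(baskets)
# The pair that appears together in the largest share of baskets.
top_pair = max(support, key=support.get)
print(top_pair, support[top_pair])
```

The pair with the highest support is a candidate for joint promotion or shelf placement; deciding whether it is meaningful would need more data than this toy example.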
Business Intelligence Process
[Diagram: process flow with the labels Source Systems, ETL, Data Warehouse, Cubes, Analytics, Data Mining, and metadata]
Data Warehouse
• A data warehouse has two kinds of tables
  o Facts
  o Dimensions
• Facts
  o These tables record actual data
  o Most columns in fact tables can be subjected to mathematical operations
  o For example: actual sales, share prices, commodity prices, employee salaries etc.
• Dimensions
  o These tables provide a lens for examining the facts
  o These tables include descriptive and categorical data
  o A single table may not reflect multiple dimensions
  o For example: product dimension, time dimension, demographic dimension
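The distinction between facts and dimensions can be sketched in a few lines of Python. The table names, columns, and values below are hypothetical, chosen only to show numeric measures in a fact table being examined through a descriptive dimension.

```python
# Dimension table: descriptive, categorical attributes keyed by product_id.
dim_product = {
    1: {"name": "Running shoe", "category": "Footwear"},
    2: {"name": "T-shirt", "category": "Apparel"},
}

# Fact table: numeric measures plus a key pointing at the dimension.
fact_sales = [
    {"product_id": 1, "units": 3, "amount": 240.0},
    {"product_id": 2, "units": 5, "amount": 100.0},
    {"product_id": 1, "units": 1, "amount": 80.0},
]

def sales_by_category(facts, products):
    """Sum the 'amount' measure, grouped by the product's category."""
    totals = {}
    for row in facts:
        category = products[row["product_id"]]["category"]
        totals[category] = totals.get(category, 0.0) + row["amount"]
    return totals

totals = sales_by_category(fact_sales, dim_product)
print(totals)
```

The arithmetic happens only on fact columns; the dimension supplies the label to group by.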
Developing Data Warehouse
• Developing a data warehouse is a tedious task
• Common sub-tasks include
  o Extracting data from source systems
  o Defining facts and dimensions
  o Transforming data, leading to the creation of facts and dimensions (most cumbersome and costly)
  o Relating facts and dimensions tables
  o Loading data into the data warehouse
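The extract, transform, and load sub-tasks above can be sketched as follows. This is a simplified illustration under stated assumptions: the two source systems, their column names, and the single `fact_sales` table are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Extract: rows as they might arrive from two hypothetical source systems,
# each using its own column names.
store_rows = [{"prod": "Shoe", "qty": 2, "price": 80.0}]
web_rows = [{"item_name": "Shoe", "quantity": 1, "unit_price": 75.0}]

def transform(row, mapping):
    """Rename source columns to uniform warehouse names and derive the amount."""
    out = {target: row[source] for source, target in mapping.items()}
    out["amount"] = out["units"] * out["unit_price"]
    return out

# Column mappings (assumed) from each source to the uniform nomenclature.
store_map = {"prod": "product", "qty": "units", "price": "unit_price"}
web_map = {"item_name": "product", "quantity": "units", "unit_price": "unit_price"}

clean = [transform(r, store_map) for r in store_rows]
clean += [transform(r, web_map) for r in web_rows]

# Load: insert the uniform rows into the warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fact_sales (product TEXT, units INTEGER, unit_price REAL, amount REAL)"
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (:product, :units, :unit_price, :amount)", clean
)
total = conn.execute("SELECT SUM(amount) FROM fact_sales").fetchone()[0]
print(total)
```

The transform step carries most of the logic even in this toy case, which is consistent with the slide's remark that it is the most cumbersome part.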
Developing Data Warehouse
• Depending on how facts and dimensions are created and related, the data warehouse takes on a different look and feel

Criteria                              Nomenclature
One table per dimension               Star schema
More than one table per dimension     Snowflake schema
Star Schema
[Diagram: Sales (fact table) linked to five dimension tables: Customer, Supplier, Product, Geography, and Time]
Snowflake Schema
[Diagram: snowflake schema with the tables FactSales, DimProd, DimProdCat, and DimBrand]
How to Decide on DW Structure?
• The choice is performance versus robustness
• Star schema involves one table per dimension
• Snowflake schema involves multiple tables per dimension
• Example: trend in sales of Nike products over the last year
  o Fact: Sales
  o Dimensions in case of star schema: DimProd, DimTime
  o Dimensions in case of snowflake schema: DimProd, DimBrand, DimProdCat
    - DimProd would be split into DimProdCat and DimBrand
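The Nike example above can be sketched against both layouts using an in-memory SQLite database. Everything beyond the table names FactSales, DimProd, DimBrand, and DimProdCat is an assumption made for illustration: the columns, the sample rows, the name DimProdStar for the denormalized product dimension, and the simplification of storing the month directly in the fact table instead of using a separate DimTime.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star layout (assumed): one denormalized product dimension.
CREATE TABLE DimProdStar (prod_id INTEGER PRIMARY KEY, name TEXT, brand TEXT, category TEXT);
-- Snowflake layout: brand and category split out of DimProd.
CREATE TABLE DimBrand (brand_id INTEGER PRIMARY KEY, brand TEXT);
CREATE TABLE DimProdCat (cat_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE DimProd (prod_id INTEGER PRIMARY KEY, name TEXT, brand_id INTEGER, cat_id INTEGER);
-- Shared fact table; 'month' stands in for a time dimension.
CREATE TABLE FactSales (prod_id INTEGER, month TEXT, amount REAL);

INSERT INTO DimProdStar VALUES (1, 'Shoe A', 'Nike', 'Footwear'), (2, 'Shoe B', 'OtherBrand', 'Footwear');
INSERT INTO DimBrand VALUES (1, 'Nike'), (2, 'OtherBrand');
INSERT INTO DimProdCat VALUES (1, 'Footwear');
INSERT INTO DimProd VALUES (1, 'Shoe A', 1, 1), (2, 'Shoe B', 2, 1);
INSERT INTO FactSales VALUES (1, '2023-01', 100.0), (1, '2023-02', 150.0), (2, '2023-01', 90.0);
""")

# Star schema: a single join from the fact table to the dimension.
star_sql = """
SELECT f.month, SUM(f.amount)
FROM FactSales f
JOIN DimProdStar p ON p.prod_id = f.prod_id
WHERE p.brand = 'Nike'
GROUP BY f.month ORDER BY f.month
"""

# Snowflake schema: an extra join is needed to reach the brand name.
snowflake_sql = """
SELECT f.month, SUM(f.amount)
FROM FactSales f
JOIN DimProd p ON p.prod_id = f.prod_id
JOIN DimBrand b ON b.brand_id = p.brand_id
WHERE b.brand = 'Nike'
GROUP BY f.month ORDER BY f.month
"""

star_trend = conn.execute(star_sql).fetchall()
snowflake_trend = conn.execute(snowflake_sql).fetchall()
print(star_trend)
print(snowflake_trend)
```

Both queries return the same monthly trend. The star version touches fewer tables, while the snowflake version stores each brand name only once.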
How to Decide on DW Structure?
• Star schema gives better performance because reports draw on fewer tables and therefore need fewer joins.
• Snowflake schema provides better data management because the data is normalized.
OLAP Operations
• On a cube, the following types of operations can be performed:
  o Roll up: aggregate data for given conditions
  o Drill down: dig deeper into a cube for given conditions
  o Slice and dice: cut the data for a given condition
  o Pivot: orient data along a certain attribute
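The four operations above can be sketched on a tiny cube held in plain Python. The cube contents, the dimension names, and the helper `aggregate` are hypothetical; an actual OLAP engine would precompute and index these aggregates.

```python
from collections import defaultdict

# A tiny hypothetical cube: (year, quarter, region, product, sales).
cells = [
    (2023, "Q1", "North", "Shoes", 100),
    (2023, "Q1", "South", "Shoes", 80),
    (2023, "Q2", "North", "Shirts", 50),
    (2023, "Q2", "South", "Shoes", 70),
]
DIMS = ("year", "quarter", "region", "product")

def aggregate(rows, keep):
    """Sum sales over every dimension that is not listed in `keep`."""
    idx = [DIMS.index(d) for d in keep]
    out = defaultdict(int)
    for row in rows:
        out[tuple(row[i] for i in idx)] += row[-1]
    return dict(out)

# Roll up: aggregate everything to the year level.
by_year = aggregate(cells, ["year"])

# Drill down: go one level deeper, from year to quarter.
by_quarter = aggregate(cells, ["year", "quarter"])

# Slice: fix a single dimension (region = North).
north = [r for r in cells if r[2] == "North"]

# Dice: restrict several dimensions at once.
north_shoes_q1 = [
    r for r in cells if r[2] == "North" and r[3] == "Shoes" and r[1] == "Q1"
]

# Pivot: orient the data with regions as rows and products as columns.
pivot = defaultdict(dict)
for (region, product), total in aggregate(cells, ["region", "product"]).items():
    pivot[region][product] = total

print(by_year, by_quarter, dict(pivot))
```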
Additional Concepts
• Metadata of the data warehouse
  o Useful in describing the structure of different components of the data warehouse (e.g. cubes, data marts, facts, dimensions etc.)
  o Usually contains the following
    - Description of the structure of the data warehouse
    - Operational metadata (When was the last data migration done? What is the performance of the warehouse systems?)
    - Summary generation (How was the data aggregated and summarized? What reports are generated?)
    - Mapping (What were the source systems? What transformations were executed on the data? How is access being managed?)
    - Business terminology (What are the important business terms? How are they defined?)
Additional Concepts
• Data marts
  o Subsets of the data warehouse
  o For users belonging to a specific domain (marketing data mart, finance data mart etc.)