Sharif University: Data Warehouse

Post on 25-Dec-2015

TRANSCRIPT

<ul><li> Slide 1 </li> <li> 1 Sharif University Data Warehouse </li>
<li> Slide 2 </li> <li> Objectives: Need for a Data Warehouse. What is a Data Warehouse? Data Warehouse Properties. Data Warehouse Architectures. Data Marts. Corporate Information Factory. Extraction, Transportation, Loading and Transformation. Design in Data Warehouses. Data Warehousing Schemas. </li>
<li> Slide 3 </li> <li> Decision support questions that enterprises need to have answered: How did sales representatives perform over different periods of time? What are the popular products? What types of customers buy what types of products? How much are the various internal organizations spending on what products? </li>
<li> Slide 4 </li> <li> Decision support questions (cont.): What were the variances between the amounts budgeted and the amounts spent? What positions are being filled by people with what types of background? What is the average pay for people within different age brackets? </li>
<li> Slide 5 </li> <li> What is a Data Warehouse? A data warehouse is a relational database that is designed for query and analysis rather than for transaction processing. A common way of introducing data warehousing is to refer to the characteristics of a data warehouse as set forth by William Inmon: Subject Oriented, Integrated, Nonvolatile, Time Variant. </li>
<li> Slide 6 </li> <li> Data Warehouse Properties: Subject Oriented, Integrated, Nonvolatile, Time Variant. </li>
<li> Slide 7 </li> <li> Subject Oriented: For example, to learn more about your company's sales data, you can build a warehouse that concentrates on sales and answers questions such as "Who was our best customer for this item, in this region last year?" This ability to define a data warehouse by subject matter, sales in this case, makes the data warehouse subject oriented. 
Data is categorized and stored by business subject rather than by application. Figure labels: Operational Systems; Data Warehouse Subject Area; Region; Time; Customer; Product; Customer Financial Information. </li>
<li> Slide 8 </li> <li> Integrated: Data warehouses must put data from disparate sources into a consistent format. </li>
<li> Slide 9 </li> <li> Time Variant (time series): Data is stored as a series of snapshots, each representing a period of time. Figure labels: Data Warehouse; Time (Jan/03, Feb/03, Mar/03); Data (Data for January, Data for February, Data for March). </li>
<li> Slide 10 </li> <li> Nonvolatile: Typically data in the data warehouse is not updated or deleted. Figure labels: Operational Databases; Warehouse Database; Read; Load; INSERT; UPDATE; DELETE. Nonvolatile means that, once entered into the warehouse, data should not change. This is logical because the purpose of a warehouse is to enable you to analyze what has occurred. </li>
<li> Slide 11 </li> <li> Other Characteristics of a Data Warehouse: Summarized. Not Normalized. Meta Data. Sources (both operational and external data are present). </li>
<li> Slide 12 </li> <li> Summary Data: Provide fast access to pre-computed data. Reduce use of I/O, CPU, and memory. Distill from source systems (lightly summarized). Pre-calculated summaries (highly summarized). Determine requirements early. </li>
<li> Slide 13 </li> <li> Summary Data. Figure labels: Average; Maximum; Total; Percentage; Dimension Data; Fact Data; Units Sold; Sales ($); Store; Product A Total; Product B Total; Product C Total. </li>
<li> Slide 14 </li> <li> Summary Data. Figure labels: Time; Product; Store; Summary Fact (Derived). </li>
<li> Slide 15 </li> <li> Normalization: Normalized data contains no redundancy, no repeating data, and no key-independent columns. Denormalized data often: Improves efficiency in OLAP systems. Exists in data warehouse databases. 
Comprises derived or summary data. Star models are denormalized; snowflake models partially normalize the dimension tables. </li>
<li> Slide 16 </li> <li> Meta Data (Data about Data): Provides information about the content of the warehouse. Meta Data includes: a guide to moving data to the warehouse; rules for summarization; business terms used to describe data; technical terminology; rules for data extraction. </li>
<li> Slide 17 </li> <li> Data Warehouse Architectures: Data Warehouse Architecture (Basic). Data Warehouse Architecture (with a Staging Area). Data Warehouse Architecture (with a Staging Area and Data Marts). </li>
<li> Slide 18 </li> <li> Data Warehouse Architecture (Basic): End users directly access data derived from several source systems through the data warehouse. </li>
<li> Slide 19 </li> <li> Data Warehouse Architecture (with a Staging Area): You need to clean and process your operational data before putting it into the warehouse. You can do this programmatically, although most data warehouses use a staging area instead. </li>
<li> Slide 20 </li> <li> Data Warehouse Architecture (with a Staging Area and Data Marts): You may want to customize your warehouse's architecture for different groups within your organization. You can do this by adding data marts, which are systems designed for a particular line of business. </li>
<li> Slide 21 </li> <li> Data Marts: A data mart is a small warehouse designed for a strategic business unit or a department. Data Mart Advantages: The cost is low. Implementation time is shorter. They are controlled locally rather than centrally. They contain less information than the data warehouse and hence have more rapid response. They allow a business unit to build its own decision support system (DSS) without relying on a centralized information systems (IS) department. Data Mart Types: Replicated Data Marts. Stand-alone Data Marts. 
</li>
<li> Slide 22 </li> <li> Corporate Information Factory (CIF). Diagram labels: Information Workshop, Meta Data Management, Operation &amp; Administration, Library &amp; Toolbox, Workbench, Change Management, Service Management, Data Acquisition Management, Systems Management, Data Acquisition, CIF Data Management, Data Delivery, Information Feedback, API, DSI, TrI, Operational Systems, Operational Data Store, Data Warehouse, Exploration Warehouse, Data Mining Warehouse, OLAP Data Mart, Oper Mart, External, ERP, Internet, Legacy, Other. </li>
<li> Slide 23 </li> <li> Major Business Functions (overlaid on the CIF diagram): Business Operations, Business Intelligence, Business Management. </li>
<li> Slide 24 </li> <li> Operational Systems are the internal and external core systems that run the day-to-day business operations. They are accessed through application program interfaces (APIs) and are the source of data for the data warehouse and operational data store. 
</li>
<li> Slide 25 </li> <li> External Data is any data outside the normal data collected through an enterprise's internal applications. Generally, external data, such as demographic, credit, competitor, and financial information, is purchased by the enterprise from a vendor of such information. </li>
<li> Slide 26 </li> <li> Data Acquisition is the set of processes that capture, integrate, transform, cleanse, and load source data into the data warehouse and operational data store. 
</li>
<li> Slide 27 </li> <li> Data Problems </li>
<li> Slide 28 </li> <li> The Data Warehouse is a subject-oriented, integrated, time-variant, nonvolatile collection of data used to support the strategic decision-making process for the enterprise. </li>
<li> Slide 29 </li> <li> The Operational Data Store is a subject-oriented, integrated, current, volatile collection of data used to support the tactical decision-making process for the enterprise. 
</li>
<li> Slide 30 </li> <li> Comparing an Operational Data Store and a Data Warehouse </li>
<li> Slide 31 </li> <li> CIF Data Management is the set of processes that protect the integrity and continuity of the data within and across the data warehouse and operational data store. It may employ a staging area for cleansing and synchronizing data. </li>
<li> Slide 32 </li> <li> The Transactional Interface is an easy-to-use and intuitive interface for the end user to access and manipulate data in the operational data store. 
</li>
<li> Slide 33 </li> <li> Data Delivery is the set of processes that enables end users and their supporting IT groups to filter, format, and deliver data to data marts and oper-marts. </li>
<li> Slide 34 </li> <li> The Exploration Warehouse is a data mart whose purpose is to provide a safe haven for exploratory and ad hoc processing. An exploration warehouse may utilize specialized technologies to provide fast response times with the ability to access the entire database. 
</li>
<li> Slide 35 </li> <li> The Data Mining Warehouse includes tasks known as knowledge extraction, data archaeology, data exploration, data pattern processing, and data harvesting. </li>
<li> Slide 36 </li> <li> The OLAP (online analytical processing) Data Mart is aggregated and/or summarized data that is derived from the data warehouse and tailored to support the multidimensional requirements of a given business unit or business function. </li>
<li> Slide 37 </li> <li> 37 Sharif University Information Workshop Meta Data Management Operation &amp; Administration Library &amp; Toolbox Workbench Change Manageme...</li></ul>
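
The nonvolatile, time-variant, and summary-data properties described in the slides can be illustrated with a minimal sketch. It is not taken from the presentation: the table names (`sales_fact`, `sales_summary`), the column names, and the helper functions are all hypothetical, and SQLite stands in for whatever database a real warehouse would use. Each monthly load only appends rows (nonvolatile), every row carries its snapshot period (time variant), and a pre-computed summary table serves totals without rescanning the fact data.

```python
import sqlite3


def build_warehouse() -> sqlite3.Connection:
    """Create an in-memory database with a fact table and a summary table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sales_fact (
            snapshot_month TEXT    NOT NULL,  -- time variant: each row belongs to a period
            store          TEXT    NOT NULL,
            product        TEXT    NOT NULL,
            units_sold     INTEGER NOT NULL,
            sales_usd      REAL    NOT NULL
        );
        CREATE TABLE sales_summary (
            snapshot_month  TEXT    NOT NULL,
            product         TEXT    NOT NULL,
            total_units     INTEGER NOT NULL,
            total_sales_usd REAL    NOT NULL,
            PRIMARY KEY (snapshot_month, product)
        );
        """
    )
    return conn


def load_snapshot(conn, month, rows):
    """Append one month's rows and pre-compute that month's per-product totals.

    Only INSERT statements are issued, so earlier snapshots are never
    updated or deleted (the nonvolatile property).
    """
    with conn:  # commit on success, roll back if any statement fails
        conn.executemany(
            "INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?)",
            [(month, store, product, units, sales)
             for store, product, units, sales in rows],
        )
        conn.execute(
            """
            INSERT INTO sales_summary
            SELECT snapshot_month, product, SUM(units_sold), SUM(sales_usd)
            FROM sales_fact
            WHERE snapshot_month = ?
            GROUP BY snapshot_month, product
            """,
            (month,),
        )


def product_totals(conn, month):
    """Read pre-computed totals for one month from the summary table."""
    return conn.execute(
        "SELECT product, total_units, total_sales_usd "
        "FROM sales_summary WHERE snapshot_month = ? ORDER BY product",
        (month,),
    ).fetchall()


if __name__ == "__main__":
    warehouse = build_warehouse()
    load_snapshot(warehouse, "2003-01", [
        ("Store 1", "A", 10, 100.0),
        ("Store 2", "A", 5, 50.0),
        ("Store 1", "B", 2, 40.0),
    ])
    load_snapshot(warehouse, "2003-02", [("Store 1", "A", 7, 70.0)])
    print(product_totals(warehouse, "2003-01"))
```

In this sketch each month is assumed to be loaded exactly once; loading the same month again would violate the summary table's primary key and the whole load would be rolled back. A production design would more likely refresh summaries incrementally or use the database's materialized views.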
