Data Warehousing: Introduction to Data Warehousing


Posted on 22-Mar-2017






<ul>
<li><p>Introduction to Data Warehousing</p></li>
<li><p>From DBMS to Decision Support</p>
<ul>
<li>DBMSs are widely used to maintain transactional data.</li>
<li>Attempts to use these data for analysis, exploration, and identification of trends have led to decision support systems.</li>
<li>Rapid growth since the mid-1970s.</li>
<li>DBMS vendors have answered this trend by adding new features to existing products, which is rarely enough.</li>
</ul></li>
<li><p>DBs for Decision Support</p>
<ul>
<li>Trend towards data warehousing.</li>
<li>Data warehousing: consolidation of data from several databases, which are in turn maintained by individual business units, along with historical and summary information.</li>
</ul></li>
<li><p>Characteristics of TPSs</p>
<table>
<tr><th>Characteristic</th><th>OLTP</th></tr>
<tr><td>Typical operation</td><td>Update</td></tr>
<tr><td>Level of analytical requirements</td><td>Low</td></tr>
<tr><td>Screens</td><td>Unchanging</td></tr>
<tr><td>Amount of data per transaction</td><td>Small</td></tr>
<tr><td>Data level</td><td>Detailed</td></tr>
<tr><td>Age of data</td><td>Current</td></tr>
<tr><td>Orientation</td><td>Records</td></tr>
</table></li>
<li><p>TPS vs Decision Support</p>
<p>OLTP</p>
<ul>
<li>Information to support day-to-day service</li>
<li>Data stored at transaction level</li>
<li>Database design: normalized</li>
</ul>
<p>Complex analysis</p>
<ul>
<li>Historical information to analyze</li>
<li>Data needs to be integrated</li>
<li>Database design: denormalized, star schema</li>
</ul></li>
<li><p>MIS and Decision Support</p>
<ul>
<li>MIS systems provided business data.</li>
<li>Reports were developed on request.</li>
<li>Reports provided little analysis capability.</li>
<li>No personal ad hoc access to data.</li>
</ul>
<p>Diagram labels: production platforms, operational reports, decision makers, ad hoc access.</p></li>
<li><p>Analyzing Data from Operational Systems</p>
<ul>
<li>Data structures are complex.</li>
<li>Systems are designed for high performance and throughput.</li>
<li>Data is not meaningfully represented.</li>
<li>Data is dispersed.</li>
<li>TPS systems are unsuitable for intensive queries.</li>
</ul>
<p>Diagram labels: production platforms, ERP, operational reports.</p></li>
<li><p>Data Extract Processing</p>
<ul>
<li>End-user computing offloaded from the operational environment</li>
<li>Users own data</li>
</ul>
<p>Diagram labels: operational systems, extracts, decision makers.</p></li>
<li><p>Management Issues</p>
<ul>
<li>Extract explosion</li>
<li>Duplicated effort</li>
<li>Multiple technologies</li>
<li>Obsolete reports</li>
<li>No metadata</li>
</ul>
<p>Diagram labels: operational systems, extracts, decision makers.</p></li>
<li><p>Data Quality Issues</p>
<ul>
<li>No common time basis</li>
<li>Different calculation algorithms</li>
<li>Different levels of extraction</li>
<li>Different levels of granularity</li>
<li>Different data field names</li>
<li>Different data field meanings</li>
<li>Missing information</li>
<li>No data correction rules</li>
<li>No drill-down capability</li>
</ul></li>
<li><p>From Extract to Warehouse DSS</p>
<ul>
<li>Controlled</li>
<li>Reliable</li>
<li>Quality information</li>
<li>Single source of data</li>
</ul>
<p>Diagram labels: internal and external systems, data warehouse, decision makers.</p></li>
<li><p>Data Warehousing Architecture</p>
<p>Diagram labels: operational databases, data warehouse, OLAP, data mining.</p></li>
<li><p>Business Motivators</p>
<ul>
<li>Provide superior services and products</li>
<li>Know the business</li>
<li>New products</li>
<li>Invest in customers</li>
<li>Retain customers</li>
<li>Invest in technology</li>
<li>Reinvent to face new challenges</li>
</ul></li>
<li><p>Centralised data warehouse</p><p>Federated data warehouse</p></li>
<li><p>Tiered data warehouse</p></li>
<li><p>Data Warehouses vs Data Marts</p>
<table>
<tr><th>Property</th><th>Data Warehouse</th><th>Data Mart</th></tr>
<tr><td>Scope</td><td>Enterprise</td><td>Department</td></tr>
<tr><td>Subjects</td><td>Multiple</td><td>Single-subject</td></tr>
<tr><td>Data source</td><td>Many</td><td>Few</td></tr>
<tr><td>Size (typical)</td><td>100 GB to &gt; 1 TB</td><td>&lt; 100 GB</td></tr>
<tr><td>Implementation time</td><td>Months to years</td><td>Months</td></tr>
</table></li>
<li><p>End-user Access Tools</p>
<p>High performance is achieved by pre-planning the requirements for joins, summations, and periodic reports by end users.</p>
<p>There are five main groups of access tools:</p>
<ul>
<li>Data reporting and query tools</li>
<li>Application development tools</li>
<li>Executive information system (EIS) tools</li>
<li>Online analytical processing (OLAP) tools</li>
<li>Data mining tools</li>
</ul></li>
<li><p>Data Usage - $1000 questions</p>
<p>Need to complement RDBMS technology with a flexible, multidimensional view of data.</p>
<table>
<tr><th>Verification</th><th>Discovery</th></tr>
<tr><td>What is the average sale for in-store and catalog customers?</td><td>What is the best predictor of sales?</td></tr>
<tr><td>What is the average high school GPA of students who graduate from college compared to those who do not?</td><td>What are the best predictors of college graduation?</td></tr>
</table></li>
<li><p>The Functionality of OLAP</p>
<ul>
<li>Rotate and drill down.</li>
<li>Create and examine calculated data.</li>
<li>Determine comparative or relative differences.</li>
<li>Perform exception and trend analysis.</li>
<li>Perform advanced analytical functions.</li>
</ul></li>
<li><p>The star structure</p></li>
<li><p>Multidimensional Database Model</p>
<p>The data is found at the intersection of dimensions.</p>
<p>Diagram labels: Store, Time, FINANCE, Store, Product, Time, SALES, Customer.</p></li>
<li><p>Data Mining</p></li>
<li><p>Data mining functions</p>
<ul>
<li>Associations: 85 percent of customers who buy a certain brand of wine also buy a certain type of pasta.</li>
<li>Sequential patterns: 32 percent of female customers who order a red jacket within six months buy a gray skirt.</li>
<li>Classifying: frequent customers are those with incomes about $50,000 and having two or more children.</li>
<li>Clustering: market segmentation.</li>
<li>Predicting: predict the revenue value of a new customer based on that person's demographic variables.</li>
</ul></li>
<li><p>Thank You!</p></li>
</ul>
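<p>The star structure and the OLAP "drill down" operation named in the slides can be illustrated with a small sketch. This is not taken from the deck: the table names (<code>fact_sales</code>, <code>dim_store</code>, <code>dim_product</code>, <code>dim_time</code>), the column names, and every data value are hypothetical, and SQLite is used only because it ships with Python. A central fact table holds the measures, each dimension table describes one axis, and a value is found at the intersection of dimensions.</p>

```python
import sqlite3


def build_star_schema(conn):
    """Create a toy star schema: one fact table surrounded by dimension tables.

    All names and rows here are illustrative assumptions, not data from the slides.
    """
    conn.executescript("""
        CREATE TABLE dim_store   (store_id INTEGER PRIMARY KEY, city TEXT, region TEXT);
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
        CREATE TABLE dim_time    (time_id INTEGER PRIMARY KEY, month TEXT, year INTEGER);
        CREATE TABLE fact_sales (
            store_id   INTEGER REFERENCES dim_store(store_id),
            product_id INTEGER REFERENCES dim_product(product_id),
            time_id    INTEGER REFERENCES dim_time(time_id),
            amount     REAL
        );
    """)
    conn.executemany("INSERT INTO dim_store VALUES (?, ?, ?)",
                     [(1, "City A", "East"), (2, "City B", "West")])
    conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                     [(1, "Wine", "Drinks"), (2, "Pasta", "Food")])
    conn.executemany("INSERT INTO dim_time VALUES (?, ?, ?)",
                     [(1, "2017-01", 2017), (2, "2017-02", 2017)])
    # Each fact row sits at the intersection of store, product, and time.
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                     [(1, 1, 1, 100.0), (1, 2, 1, 40.0), (2, 1, 1, 70.0),
                      (1, 1, 2, 120.0), (2, 2, 2, 30.0)])


def sales_by_region(conn):
    """Summary view: total sales aggregated along the store dimension."""
    rows = conn.execute("""
        SELECT s.region, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_store AS s ON s.store_id = f.store_id
        GROUP BY s.region
        ORDER BY s.region
    """).fetchall()
    return dict(rows)


def sales_by_month_for_region(conn, region):
    """Drill down: break one region's total into monthly figures."""
    rows = conn.execute("""
        SELECT t.month, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_store AS s ON s.store_id = f.store_id
        JOIN dim_time  AS t ON t.time_id = f.time_id
        WHERE s.region = ?
        GROUP BY t.month
        ORDER BY t.month
    """, (region,)).fetchall()
    return dict(rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_star_schema(conn)
    print(sales_by_region(conn))
    print(sales_by_month_for_region(conn, "East"))
```

<p>Keeping the fact table narrow and joining only the dimensions a question needs is what makes the summary and drill-down queries short. A production warehouse would typically add indexes and pre-computed summaries, which this sketch leaves out.</p>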