it ready - dw: 1st day
Post on 20-May-2015
- 1. Data Warehousing (DAY 1) Siwawong W. Project Manager 2010.05.24
2. Agenda
- 09:00-09:15 Registration
- 09:15-09:30 Self-Introduction
- 09:30-10:30 Data Warehouse: Introduction
- 10:30-10:45 Break & Morning Refreshment
- 10:45-12:00 Data Warehouse: Introduction (Cont.)
- 12:00-13:00 Lunch Break
- 13:00-15:00 Review RDBMS & SQL Command
- 15:00-15:15 Break
- 15:15-16:00 Case Study and Q&A
3. SELF-INTRODUCTION
4. About Me
- My Name: Siwawong Wuttipongprasert
- Nick-name: Tae (You can call me by this name; it's easier)
- My Background:
- B.Eng (Computer Engineering), Chiang Mai University.
- My Career Profile:
- 10+ years in IT business
- 5+ years with Blue Ball Co., Ltd.
- Role: Programmer, System Analyst, Consultant & Project Manager
- Working Area: ERP, MRP, Retailing, Banking, Financial, E-Commerce, etc.
- Working with multi-cultures: Japanese, German and Vietnamese
- Know Me More..
5. My Company: Blue Ball
- Blue Ball Group is an offshoring company that focuses entirely on customer satisfaction. It combines Western management with Asian human resources to provide high-quality services.
- Thailand (Head Office)
- Mexico (Special Developments)
- Vietnam (Offshoring Center)
6. Services from My Company
- Offshoring Programmers & Testers: Blue Ball will get you ready to offshore successfully. We will not rush you into offshoring before you feel confident about how to send, organize, receive, test and accept jobs.
- System Development & Embedded Solutions: Solutions that combine technological expertise and deep business understanding. We only start coding once every detail, such as milestones, scheduling, contact points, communication, issue management and critical protocols, is in place.
- Web Design and E-commerce: Premium web design, CMS, e-commerce solutions and SEO services. Website maintenance and copy creation to develop marketing campaigns that SELL, helping discerning companies increase the quality and reach of their marketing.
7. My Clients
8. Data Warehouse: Introduction
9. Data Warehouse: Introduction
- Data Warehousing, OLAP and data mining:
- what and why (now)?
- Relation to OLTP
- Review RDBMS & SQL Command
- A case study
10. Data Warehouse: What & Why? Problem Statements
11. A producer wants to know...
- Which are our lowest/highest margin customers?
- Who are my customers and what products are they buying?
- Which customers are most likely to go to the competition?
- What impact will new products/services have on revenue and margins?
- What product promotions have the biggest impact on revenue?
- What is the most effective distribution channel?
12. Data, Data everywhere, yet ...
- I can't find the data I need
- data is scattered over the network
- many versions, subtle differences
- I can't get the data I need
- need an expert to get the data
- I can't understand the data I found
- available data poorly documented
- I can't use the data I found
- results are unexpected
- data needs to be transformed from one form to another
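The last complaint, that data must be transformed from one form to another before it can be used, can be illustrated with a minimal sketch. The two source record layouts, the field names, and the `to_common` helper are hypothetical, made up only for this illustration.

```python
from datetime import datetime

# Hypothetical records from two source systems that describe the same
# kind of sale in different forms (field names and formats are assumptions).
orders_system_a = [{"cust": "C001", "date": "24/05/2010", "amount": "1,250.00"}]
orders_system_b = [{"customer_id": "c001", "order_date": "2010-05-25", "total": 980.5}]


def to_common(record):
    """Convert a record from either source layout into one common form."""
    if "cust" in record:  # layout used by system A
        return {
            "customer_id": record["cust"].upper(),
            "order_date": datetime.strptime(record["date"], "%d/%m/%Y").date(),
            "amount": float(record["amount"].replace(",", "")),
        }
    # layout used by system B
    return {
        "customer_id": record["customer_id"].upper(),
        "order_date": datetime.strptime(record["order_date"], "%Y-%m-%d").date(),
        "amount": float(record["total"]),
    }


unified = [to_common(r) for r in orders_system_a + orders_system_b]
for row in unified:
    print(row)
```

Once both sources share one customer key, one date type and one numeric amount, the records can be compared and summed without an expert decoding each system's conventions.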
13. What is a Data Warehouse?
- A single, complete and consistent store of data obtained from a variety of different sources, made available to end users in a way they can understand and use in a business context.
- [Barry Devlin]
14. What are the users saying...
- Data should be integrated across the enterprise
- Summary data has a real value to the organization
- Historical data holds the key to understanding data over time
- What-if capabilities are required
15. What is Data Warehousing?
- A process of transforming data into information and making it available to users in a timely enough manner to make a difference [Forrester Research, April 1996]
- Data -> Information
16. Evolution
- 60s: Batch reports
- hard to find and analyze information
- inflexible and expensive, reprogram every new request
- 70s: Terminal-based DSS and EIS (executive information systems)
- still inflexible, not integrated with desktop tools
- 80s: Desktop data access and analysis tools
- query tools, spreadsheets, GUIs
- easier to use, but only access operational databases
- 90s: Data warehousing with integrated OLAP engines and tools
17. Very Large Data Bases
- Terabytes -- 10^12 bytes
- Petabytes -- 10^15 bytes
- Exabytes -- 10^18 bytes
- Zettabytes -- 10^21 bytes
- Yottabytes -- 10^24 bytes
- Examples: Walmart -- 24 terabytes; intelligence agency videos; geographic information systems; national medical records; weather images
18. Data Warehousing -- It is a process
- A technique for assembling and managing data from various sources for the purpose of answering business questions, thus enabling decisions that were not previously possible
- A decision support database maintained separately from the organization's operational database
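The idea of a decision support database kept apart from the operational one can be sketched with two in-memory SQLite databases. This is only an illustration under stated assumptions: the `orders` and `sales_fact` tables, their columns, and the sample rows are hypothetical.

```python
import sqlite3

# Operational database: a hypothetical order-processing table (assumption).
operational = sqlite3.connect(":memory:")
operational.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, product TEXT, qty INTEGER, unit_price REAL)"
)
operational.executemany(
    "INSERT INTO orders (product, qty, unit_price) VALUES (?, ?, ?)",
    [("pen", 10, 1.5), ("book", 2, 12.0), ("pen", 4, 1.5)],
)

# Warehouse: a separate database used only for decision support.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE sales_fact (order_id INTEGER, product TEXT, revenue REAL)"
)

# Extract from the operational store, transform (derive revenue), then load.
rows = operational.execute(
    "SELECT order_id, product, qty, unit_price FROM orders"
).fetchall()
warehouse.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?)",
    [(order_id, product, qty * price) for order_id, product, qty, price in rows],
)
warehouse.commit()

# The business question is answered against the warehouse, not the live system.
revenue_by_product = dict(
    warehouse.execute("SELECT product, SUM(revenue) FROM sales_fact GROUP BY product")
)
print(revenue_by_product)
```

Keeping the two stores separate means analytical queries do not compete with order entry for the same tables. A real load would run on a schedule rather than once.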
19. Data Warehouse
- A data warehouse is a
- subject-oriented: organized based on use
- integrated: inconsistencies removed
- time-varying: data are normally time-series
- non-volatile: stored in read-only format
- collection of data that is used primarily in organizational decision making.
- -- Bill Inmon, Building the Data Warehouse, 1996
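The four properties in this definition can be made concrete with a small sketch. It is not taken from the slides: the `customer_snapshot` table, the gender code mapping, and the `load_snapshot` helper are hypothetical names chosen for illustration.

```python
import sqlite3

wh = sqlite3.connect(":memory:")
# Subject-oriented: the table is organized around the subject "customer",
# not around one application. Time-varying: every row carries a snapshot date.
wh.execute(
    "CREATE TABLE customer_snapshot ("
    "snapshot_date TEXT, customer_id TEXT, gender TEXT, balance REAL)"
)

# Integrated: hypothetical source systems encode gender differently,
# so the codes are mapped to one convention before loading.
GENDER_CODES = {"m": "M", "male": "M", "1": "M", "f": "F", "female": "F", "0": "F"}


def load_snapshot(snapshot_date, source_rows):
    """Append one periodic snapshot; rows already loaded are never updated."""
    wh.executemany(
        "INSERT INTO customer_snapshot VALUES (?, ?, ?, ?)",
        [
            (snapshot_date, cid, GENDER_CODES[str(gender).lower()], balance)
            for cid, gender, balance in source_rows
        ],
    )
    wh.commit()


# Non-volatile: new data supplements earlier snapshots instead of replacing them.
load_snapshot("2010-04-30", [("C001", "male", 100.0), ("C002", 0, 250.0)])
load_snapshot("2010-05-31", [("C001", "M", 140.0), ("C002", "f", 210.0)])

history = wh.execute(
    "SELECT snapshot_date, balance FROM customer_snapshot "
    "WHERE customer_id = ? ORDER BY snapshot_date",
    ("C001",),
).fetchall()
print(history)
```

Because each load only appends rows with a new snapshot date, the balance history of a customer stays queryable over time.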
20. Data Warehouse: Subject-Oriented
- The warehouse is organized around the major subjects of the enterprise rather than the major application areas. This reflects the need to store decision-support data rather than application-oriented data.
- Subject-oriented: data warehouse (e.g. Sales)
- Application-oriented: operational DB (e.g. Order Processing)
21. Data Warehouse: Integrated
- The source data come together from different enterprise-wide application systems and are often inconsistent. The integrated data source must be made consistent to present a unified view of the data to the users.
22. Data Warehouse: Time-Varying
- The source data in the warehouse is only accurate and valid at some point in time or over some time interval.
- The time-variance of the data warehouse is also shown in the extended time that the data is held, the implicit or explicit association of time with all data, and the fact that the data represents a series of snapshots.
- Historical data is recorded.
23. Data Warehouse: Non-Volatile
- Data is NOT updated in real time but is refreshed from the operational systems (OS) on a regular basis.
- New data is always added as a supplement to the database, rather than as a replacement. The database continually absorbs this new data, incrementally integrating it with previous data.
- Anyone who is using the database can be confident that a query will always produce the same result no matter how often it is run.
24. Explorers, Farmers and Tourists
- Explorers: Seek out the unknown and previously unsuspected rewards hiding in the detailed data
- Farmers: Harvest information from known access paths
- Tourists: Browse information harvested by farmers
25. Data Warehouse Architecture
- Sources: Relational Databases, Legacy Data, Purchased Data, ERP Systems
- Extraction, Cleansing
- Optimized Loader
- Data Warehouse Engine
- Metadata Repository
- Analyze, Query
26. OLAP & Data Mining
27. Data Warehouse for DS & OLAP
- Putting information technology to work to help the knowledge worker make faster and better decisions
- Which of my customers are most likely to go to the competition?
- What product promotions have the biggest impact on revenue?
- How did the share price of software companies correlate with profits over the last 10 years?
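A question such as the one about product promotions is typically answered with an aggregate query over warehouse data. The sketch below assumes a hypothetical `sales` table with made-up rows. It only totals revenue per promotion, which is a descriptive summary and not a measure of causal impact.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Hypothetical warehouse table: one row per promotion and region (assumption).
db.execute("CREATE TABLE sales (promotion TEXT, region TEXT, revenue REAL)")
db.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("coupon", "north", 120.0),
        ("coupon", "south", 80.0),
        ("bundle", "north", 300.0),
        ("bundle", "south", 150.0),
        ("none", "north", 90.0),
    ],
)

# Total revenue per promotion, highest first.
impact = db.execute(
    "SELECT promotion, SUM(revenue) AS total "
    "FROM sales GROUP BY promotion ORDER BY total DESC"
).fetchall()
print(impact)
```

Adding `region` to the `GROUP BY` clause would break the same totals down by a second dimension, which is the kind of slicing OLAP tools automate.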
28. Decision Support (DS)
- Used to manage and control business
- Data is historical or point-in-time
- Optimized for inquiry rather than update
- Use of the system is loosely defined and can be ad-hoc
- Used by managers and end-users to understand the business and make judgements
29. Data Mining works with Warehouse Data
- Data Warehousing provides the Enterprise with a memory
- Data Mining provides the Enterprise with intelligence
30. We want to know ...
- Given a database of 100,000 names, which persons are the least likely to default on their credit cards?
- Which types of transactions are likely to be fraudulent given the demographics and transactional history of a particular customer?
- If I raise the price of my product by Rs. 2, what is the effect on my ROI?
- If I offer only 2,500 airline miles as an incentive to purchase rather than 5,000, how many lost responses will result?
- If I emphasize ea