ponniam1.ppt
04/19/23 Girish Tere, Lecturer (CS), TCSC
M. Sc. (CS/IT) Part I, Paper IV
Data Warehousing and Mining
Text Books:
1. Paulraj Ponniah, “Data Warehousing Fundamentals”, John Wiley
2. W.H. Inmon, “Building the Data Warehouse”, Wiley Dreamtech
3. R. Kimball, “The Data Warehouse Toolkit”, John Wiley
4. Ralph Kimball, “The Data Warehouse Lifecycle Toolkit”, John Wiley

The need for DW
- Understand the desperate need for strategic information
- Recognize the information crisis at every enterprise
- Distinguish between operational and informational systems
- Past attempts to provide strategic information
- The solution: Data Warehousing

Introduction
- What is your role in IT?
- Your IT experience
- Applications to run the business
- What do they do?
- What do they provide?
- What do executives require?
- Where is the strategic information required?

Organization’s use of DW
- Retail: Customer Loyalty, Market Planning
- Financial: Risk Management, Fraud Detection
- Airlines: Route Profitability, Yield Management
- Manufacturing: Cost Reduction, Logistics Management
- Utilities: Asset Management, Resource Management
- Government: Manpower Planning, Cost Control

Understand the desperate need for strategic information
- Who needs strategic information in an enterprise?
- What is strategic information?
- Examples of Business Objectives
  - Retain the present customer base
  - Increase the customer base by 15% over the next 5 years
  - Gain market share by 10% in the next 3 years

Examples of Business Objectives (cont…)
- Improve product quality levels in the top five product groups
- Enhance customer service level in shipments
- Bring three new products to market in 2 years
- Increase sales by 15% in the North East Division

Strategic Information (SI)
- Is it for running the day-to-day operation of the business?
- What is SI?
- Characteristics of SI

Characteristics of SI
- Integrated: Must have a single, enterprise-wide view
- Data Integrity: Information must be accurate and must conform to business rules
- Accessible: Easily accessible with intuitive access paths, and responsive for analysis
- Credible: Every business factor must have one and only one value
- Timely: Information must be available within the stipulated time period

The Information Crisis
- How much data is stored and available?
- Where is all this data? On which platforms? On one PC or across the network?
- The facts are:
  - Organizations have lots of data
  - IT resources and systems are not effective at turning this data into SI

Real Problem
- Most companies face an information crisis not because of a lack of sufficient data, but because the available data is not readily usable for strategic decision making.
- Why is this so?
  - We need information integrated from all systems.
  - Operational data is event driven
  - Operational data is not directly suitable for review from different viewpoints

Technology Trends
- Name of the computer department in a company: “DP”, “MIS”, “IS”, “IT”
- Phenomenal growth of IT in areas like:
  - Computing Technology
  - Human/Machine Interface
  - Processing Options
- What technology does SI need?

Technology Trends (cont…)
- The user will ask a question and get the results…
- This interactive process continues
- Why is providing SI feasible now?

Opportunities and Risks
- What opportunities are available to companies from the possible use of SI?
- What threats and risks result from a lack of SI in companies?

Some Opportunities …
- SI required for Reliance Telecommunication industry
- SI required for ICICI Bank
- SI required for Mediclaim companies
- SI required for Apna Bazar
- A community-based pharmacy company

Some Risks …
- A car rental company (fleet management)
- A multinational company, supplier of systems and components to the automobile industry (inconsistent data)

Failures of past DSS
- Example – A Chennai Branch is not … …
- You have to gather the data from multiple applications and start from scratch.
- In order to understand the reasons for the failures of IT to provide SI in the past, we need to consider how IT was attempting to do this all these years.

Past DSSs
- Ad hoc reports
- Special extract programs
- Small applications
- Information centers
- DSS
- EIS (only programmed screens and reports were available)

Inability to provide information
- Figure 1.4
- IT receives too many ad hoc requests, resulting in a large overload.
- Requests keep changing
- Users ask for more and more reports
- Users have to depend on IT to provide the information
- You need a very flexible and conducive environment for providing information for strategic decision making. IT has been unable to provide such an environment.

Operational vs DSS
- What is the basic reason for the failure of all the previous attempts by IT to provide SI?
- Do we need different types of systems?

Making the Wheels of Business Turn
- OLTP systems
- Used to run the day-to-day core business of the company

Get the data in: making the wheels of business turn
- Take an order
- Process a claim
- Make a shipment
- Generate an invoice
- Receive cash
- Reserve an airline ticket

Get the information out: watching the wheels of business turn
- Show me the top-selling products
- Show me the problem regions
- Tell me why (drill down)
- Let me see other data (drill across)
- Show me the highest margins
- Alert me when a district sells below target
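
Two of the "get the information out" requests above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows in `SALES`, the figures in `TARGETS`, and both function names are hypothetical and do not come from the slides or the text books.

```python
from collections import defaultdict

# Hypothetical sales rows: (product, district, units_sold). Illustrative data only.
SALES = [
    ("pen", "North", 120),
    ("pen", "South", 80),
    ("notebook", "North", 60),
    ("notebook", "South", 30),
    ("stapler", "North", 10),
    ("stapler", "South", 5),
]

# Hypothetical unit targets per district.
TARGETS = {"North": 150, "South": 150}


def top_selling_products(sales, n):
    """'Show me the top-selling products': the n products with most units, best first."""
    totals = defaultdict(int)
    for product, _district, units in sales:
        totals[product] += units
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


def districts_below_target(sales, targets):
    """'Alert me when a district sells below target': districts short of their target."""
    totals = defaultdict(int)
    for _product, district, units in sales:
        totals[district] += units
    return sorted(d for d, target in targets.items() if totals[d] < target)


print(top_selling_products(SALES, 2))
print(districts_below_target(SALES, TARGETS))
```

Both functions only read and summarize the rows, which is the typical shape of an informational request, in contrast to the single-transaction updates on the previous slide.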

We need to design and build informational systems
- That serve different purposes
- Whose scopes are different
- Whose data content is different
- Where the data usage patterns are different
- Where the data access types are different

Operational and Informational Systems

| | Operational | Informational |
|---|---|---|
| Data Content | Current values | Archived, derived, summarized |
| Data Structure | Optimized for transactions | Optimized for complex queries |
| Access Frequency | High | Medium to low |
| Access Type | Read, update, delete | Read |
| Usage | Predictable, repetitive | Ad hoc, random, heuristic |
| Response Time | Milliseconds | Many seconds |
| Users | Large numbers | Relatively small numbers |
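
The difference in access type can be illustrated with Python's standard `sqlite3` module. This is a minimal sketch, not a recommended design: the `orders` table, its columns, and its rows are hypothetical, and a single in-memory table stands in for what would normally be two separate environments.

```python
import sqlite3

# One in-memory table stands in for both environments (an assumption for brevity).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, "
    "order_year INTEGER, amount REAL, status TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [
        (1, "East", 2021, 100.0, "shipped"),
        (2, "East", 2022, 150.0, "shipped"),
        (3, "West", 2021, 200.0, "shipped"),
        (4, "West", 2022, 75.0, "open"),
    ],
)

# Operational access: predictable, touches one current row, and updates it.
conn.execute("UPDATE orders SET status = 'shipped' WHERE order_id = ?", (4,))

# Informational access: read-only, scans history, and returns a summary.
revenue_by_region = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(revenue_by_region)
```

The update is keyed on a single order, while the summary query reads every row and changes none, which mirrors the "Access Type" and "Usage" rows of the table.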

DW – The correct solution
- We need different types of DSS to provide SI
- Information required for strategic decision making is not available in operational systems
- A new environment is required for analysis, discerning trends, and monitoring performance

Features of the new environment:
- Database designed for analytical tasks
- Data from multiple applications
- Easy to use and conducive to long interactive sessions by users
- Read-intensive data usage
- Direct interaction with the system by the users without help from IT staff
- Content updated periodically and stable
- Content to include current and historical data
- Ability for users to run queries and get results online
- Ability for users to make reports

Processing requirements in the new environment (analytical processing requirements)
- Running simple queries and reports against current and historical data
- Ability to perform “what if” analysis
- Ability to query, analyze, and query again, repeating this process as many times as required
- Spot historical trends and past mistakes, and apply these lessons to future results
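
A “what if” analysis repeated over several scenarios can be sketched as follows. Everything here is an assumption made for illustration: the `HISTORY` figures, the scenario percentages, and the simple price-times-volume revenue model are hypothetical and not taken from the slides.

```python
# Hypothetical history: year -> (unit_price, units_sold). Illustrative data only.
HISTORY = {2021: (10.0, 1000), 2022: (10.0, 1100)}


def revenue(unit_price, units_sold):
    """Revenue under a simple price-times-volume model (an assumption)."""
    return unit_price * units_sold


def what_if(unit_price, units_sold, price_change_pct, volume_change_pct):
    """Projected revenue if price and volume change by the given percentages."""
    new_price = unit_price * (1 + price_change_pct / 100)
    new_units = units_sold * (1 + volume_change_pct / 100)
    return revenue(new_price, new_units)


price, units = HISTORY[2022]
baseline = revenue(price, units)

# Ask, look at the answer, then ask again with different assumptions.
SCENARIOS = [(5, -2), (10, -8), (-5, 4)]  # (price change %, volume change %)
results = []
for price_pct, volume_pct in SCENARIOS:
    projected = what_if(price, units, price_pct, volume_pct)
    results.append((price_pct, volume_pct, projected - baseline))
    print(f"price {price_pct:+d}%, volume {volume_pct:+d}%: change {projected - baseline:+.2f}")
```

The loop stands in for the interactive query, analyze, and query again cycle; in practice the user would choose each next scenario after seeing the previous answer.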

BI at DW
- The needed environment is DW
- It is kept separate from the system environment supporting the day-to-day operations
- DW contains BI.

[Figure: Operational Systems (basic business processes) → Extraction, Cleansing, Aggregation (data transformation) → Data Warehouse (key measurements, business dimensions)]

E.g. of BI at DW
- DW containing units of sales stored along business dimensions
- Important: Data staging area
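
One way to picture "units of sales stored along business dimensions" is a measure keyed by its dimension values. The sketch below is an assumption-laden illustration: the three dimensions, the data in `UNITS_SOLD`, and the `roll_up` helper are hypothetical, and a real warehouse would hold this in database tables instead of a dictionary.

```python
from collections import defaultdict

# Hypothetical business dimensions for the measure "units sold".
DIMENSIONS = ("product", "region", "month")

# Measure keyed by (product, region, month). Illustrative data only.
UNITS_SOLD = {
    ("pen", "East", "2023-01"): 40,
    ("pen", "East", "2023-02"): 50,
    ("pen", "West", "2023-01"): 30,
    ("ink", "East", "2023-01"): 20,
}


def roll_up(facts, keep):
    """Sum the measure over every dimension that is not named in keep."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for key, units in facts.items():
        totals[tuple(key[i] for i in positions)] += units
    return dict(totals)


print(roll_up(UNITS_SOLD, ["product"]))            # summary by product
print(roll_up(UNITS_SOLD, ["product", "region"]))  # drill down into regions
```

Keeping fewer dimensions gives a higher-level summary; adding a dimension back is the "tell me why (drill down)" step.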

Definition of DW
DW is an informational environment that
- Provides an integrated and total view of the enterprise
- Makes the enterprise’s current and historical information easily available for decision making
- Makes decision-support transactions possible without burdening operational systems
- Renders the organization’s information consistently
- Presents a flexible and interactive source of strategic information

DW concept
- Is not to generate fresh data
- Is to make use of the large volumes of existing data and transform it into forms suitable for providing SI
- Take all the data you already have in the organization, clean and transform it, and then use it to provide SI

DW – An Environment, Not a Product
- It is a user-centric and user-driven environment
- An ideal environment for data analysis and decision support
- Constantly changing, flexible and interactive
- Useful for the ask-answer-ask-again pattern
- Provides the ability to discover answers to complex, unpredictable questions

The basic concept of DW is:
- Take all the data from the operational systems
- Where necessary, include relevant data from outside, such as industry benchmark indicators
- Integrate all the data from the various sources
- Remove inconsistencies and transform the data
- Store the data in formats suitable for easy access for decision making
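
The steps above can be sketched as a tiny extract, transform, and load routine. This is a hedged illustration only: the two source systems, their field names, the gender coding, and the per-customer layout of the "warehouse" are all hypothetical choices made to show integration and inconsistency removal.

```python
from decimal import Decimal

# Hypothetical extracts from two operational systems that code the same facts differently.
BILLING_ROWS = [
    {"cust_id": "C001", "gender": "M", "amount_usd": "120.50"},
    {"cust_id": "C002", "gender": "F", "amount_usd": "80.00"},
]
ORDER_ROWS = [
    {"customer": "c001", "sex": "male", "amount_cents": 4500},
    {"customer": "c003", "sex": "female", "amount_cents": 1000},
]

# One agreed coding for gender across all sources (an assumed business rule).
GENDER_CODES = {"m": "M", "male": "M", "f": "F", "female": "F"}


def transform_billing(row):
    """Map a billing row onto the common layout, converting dollars to cents."""
    return {
        "customer_id": row["cust_id"].upper(),
        "gender": GENDER_CODES[row["gender"].lower()],
        "amount_cents": int(Decimal(row["amount_usd"]) * 100),
    }


def transform_order(row):
    """Map an order row onto the common layout, normalizing the customer id."""
    return {
        "customer_id": row["customer"].upper(),
        "gender": GENDER_CODES[row["sex"].lower()],
        "amount_cents": row["amount_cents"],
    }


def load(rows):
    """Store one integrated record per customer, summing the amounts."""
    warehouse = {}
    for row in rows:
        record = warehouse.setdefault(
            row["customer_id"], {"gender": row["gender"], "amount_cents": 0}
        )
        record["amount_cents"] += row["amount_cents"]
    return warehouse


cleaned = [transform_billing(r) for r in BILLING_ROWS]
cleaned += [transform_order(r) for r in ORDER_ROWS]
warehouse = load(cleaned)
print(warehouse)
```

Customer C001 appears in both sources under different spellings and units; after transformation the two rows land in a single record, which is the "integrated, enterprise-wide view" the earlier slides ask for.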

DW involves the following functions
- Data extraction
- Loading the data
- Transforming the data
- Storing the data
- Providing UI

Technologies used in DW
- Data Quality
- Data Modeling
- Data Acquisition
- Data Management
- Metadata Management
- Administration
- Analysis
- Applications
- Development Tools
- Storage Management

Match the columns
1. information crisis
2. SI
3. operational systems
4. information center
5. DW
6. order processing
7. EIS
8. data staging area
9. extract programs
10. IT

A. OLTP application
B. Produce ad hoc reports
C. explosive growth
D. despite lots of data
E. data cleaned and transformed
F. users go to get information
G. used for decision making
H. environment, not product
I. for day-to-day operations
J. Simple, easy to use

Class Test
1. What do you mean by SI? For a commercial bank, name five types of strategic objectives.
2. Do you agree that a typical retail store collects huge volumes of data through its operational systems? Name three types of transaction data likely to be collected by a retail store in large volumes during its daily operations.
3. Why were all the past attempts by IT to provide SI failures? List three concrete reasons and explain.
4. Differentiate between operational systems and informational systems.
5. List characteristics of the computing environment needed to provide SI.
6. What types of processing take place in a DW?
7. A DW is an environment, not a product. Discuss.

Class Test (cont…)
8. You are the IT Director of a nationwide insurance company. Write a memo to the VP explaining the types of opportunities that can be realized with SI.
9. For an airline company, how can SI increase the number of frequent flyers? Discuss giving specific details.