final exam thursday dec. 9, 19:30, bsb b154 lecture notes 20 true/false and 20 multiple choice...

48
Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application and conceptual): 40 points

Upload: randall-morrison

Post on 23-Dec-2015

215 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple

Choice questions: 40 points 4 Short answer questions

(application and conceptual): 40 points

Page 2: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Project Presentation Email me Power Point presentation the day before Presentation by all members: 15 minutes Question and answer: 5 minutes Peer evaluation:1)     Objective and the value of the project2)     Data modeling3)     Functions of the application (with demo)4)     Quality of presentation5) Overall

Page 3: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Data Warehouse

Page 4: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Data Warehouse The idea of a data warehouse is to put a

wide range of operational data from internal and external sources into one place so it can be better utilized by executives, line of business managers and other business analysts.

Once the information is gathered, OLAP (on-line analytical processing ) software comes into play by providing the desktop analysis tools for querying, manipulating and reporting the data from the data warehouse.

Page 5: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Data Warehouse environment the source systems from which

data is extracted the tools used to extract data for

loading the data warehouse the data warehouse database itself

where the data is stored the desktop query and reporting

tools used for decision support

Page 6: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

The Data Warehouse The Data Warehouse is an

integrated, subject-oriented, time-variant, non-volatile database that provides support for decision making.

Page 7: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Creating A Data Warehouse

Figure 13.3

Page 8: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Operational Vs. Multidimensional View Of Sales

Page 9: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

The Data Warehouse Integrated

The Data Warehouse is a centralized, consolidated database that integrates data retrieved from the entire organization.

Subject-Oriented The Data Warehouse data is arranged

and optimized to provide answers to questions coming from diverse functional areas within a company.

Page 10: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

The Data Warehouse Time Variant

The Warehouse data represent the flow of data through time. It can even contain projected data.

Non-Volatile Once data enter the Data Warehouse,

they are never removed. The Data Warehouse is always growing.

Page 11: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Operational Database vs. Data warehouse

Operational DB Similar data can have

different representations or meanings

Functional or process orientation

Current transaction Frequent updating

Data Warehouse Unified view of all

data elements Subject orientation

for decision support Historical

information with time dimension

Data are added without change

Page 12: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Data Mart A data mart is a small, single-

subject data warehouse subset that provides decision support to a small group of people.

Page 13: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Data Mart Data Marts can serve as a test vehicle

for companies exploring the potential benefits of Data Warehouses.

Data Marts address local or departmental problems, while a Data Warehouse involves a company-wide effort to support decision making at all levels in the organization.

Page 14: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Star Schema The star schema is a data modeling

technique used to map multidimensional decision support into a relational database.

Star schemas yield an easily implemented model for multidimensional data analysis while still preserving the relational structure of the operational database.

Page 15: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Star Schema Four Components:

Facts Dimensions Attributes Attribute hierarchies

Page 16: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Figure 13.14 A Three-Dimensional View of Sales

Page 17: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Figure 13.17 Attribute Hierarchies in Multidimensional Analysis

Page 18: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Figure 13.17 Star Schema For Sales

Page 19: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Star Schema Representation Facts and dimensions are normally

represented by physical tables in the data warehouse database.

The fact table is related to each dimension table in a many-to-one (M:1) relationship.

Fact and dimension tables are related by foreign keys and are subject to the primary/foreign key constraints.

Page 20: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Figure 13.18 Orders Star Schema

Page 21: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Star Schema Performance-Improving

Techniques Normalization of dimensional tables Multiple fact tables representing

different aggregation levels Denormalization of fact tables Table partitioning and replication

Page 22: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Figure 13.19 Normalized Dimension Tables

Page 23: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Multiple Fact Tables

Page 24: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Data Warehouse Implementation The Data Warehouse as an Active

Decision Support Network A Company-Wide Effort that

Requires User Involvement and Commitment at All Levels

Satisfy the Trilogy: Data, Analysis, and Users

Apply Database Design Procedures

Page 25: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Data Warehouse Implementation Road Map

Page 26: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

On-Line Analytical Processing On-Line Analytical Processing (OLAP) is an

advanced data analysis environment that supports decision making, business modeling, and operations research activities.

Four Main Characteristics of OLAP Use multidimensional data analysis techniques. Provide advanced database support. Provide easy-to-use end user interfaces. Support client/server architecture.

Page 27: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Figure 13.7 OLAP Server Arrangement

Page 28: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

http://www.dwinfocenter.org/

Page 29: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Data Mining The data warehouse that enterprises

are building until now have largely ignored

Factors make data mining feasible organizations are gathering more data

from on-line TPS with lower storage cost

high computation power allows using complex data mining algorithm

Page 30: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Data Mining With data mining, it is possible to

better manage product warranties, predict purchases of retail stock, unearth fraud, determine credit risk, and define new products and services.

Page 31: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Data-Mining Phases

Page 32: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Four Phases of Data Mining1.Data Preparation

Identify and cleanse data sets. Data Warehouse is usually used for data

mining operations.

2.Data Analysis and Classification Identify common data characteristics or

patterns using Data groupings, classifications, clusters, or

sequences. Data dependencies, links, or relationships. Data patterns, trends, and deviations.

Page 33: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Four Phases of Data Mining 3. Knowledge Acquisition

Select the appropriate modeling or knowledge acquisition algorithms.

Examples: neural networks, decision trees, rules induction, genetic algorithms, classification and regression tree, memory-based reasoning, or nearest neighbor and data visualization).

4. Prognosis Predict future behavior and forecast business

outcomes using the data mining findings.

Page 34: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Data Mining Data mining yields five basic type of

information: Association - occurrences are linked to a

single event. “beer purchasers also buy peanuts 70% of the time”

Sequences - events are linked over time. “a new carpet purchase linked to new curtains”

Classification - patterns are recognized that describe the characteristics of a group, such as customers who cancel credit cards

Page 35: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Data Mining Clustering - discovers

undiscovered groupings ``Buyers of expensive sport cars are typically young urban professionals whereas luxury sedans are bought by elderly wealthy persons.''

Forecasting - estimates future value such as inventory turnover

Page 36: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Database Marketing It seems a lot of companies are taking

a friendly interest in your life these days

Companies are collecting mountains of information about you, crunching it to predict how likely you are to buy a products, and using that knowledge to craft a marketing message precisely calibrated to get you to do so.

Page 37: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Database Marketing The trend:

Mass marketing Marketing segmentation Individual marketing

Nothing is more powerful than knowledge about customers’ individual practice and preferences.

Page 38: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Database Marketing Gathering massive quantity of data

about consumers from multiple sources Data are combined and analyzed using

powerful tools Has a primary goal of better

understanding current and potential customers in order to boost sales and build customer loyalty

American Express Reader’s Digest

Page 39: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Pioneers of New Marketing General Motors surveys 12 million

GM Card holders on their car preferences

Blockbuster has a database of 36 million households and 2 million daily transactions. It is testing a system that will recommend movies based on a customer’s past rentals

Page 40: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Pioneers of New Marketing Kraft amassed a list of 30 million users

from coupons and survey questions. It regularly send them tips on nutrition and recipes, as well as coupons for specific brands

56% of manufacturers and retailers are currently building a database for marketing

85% believe they will do database marketing in 2000.

Page 41: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Some concerns Private intelligence-gathering gives

some people the creeps Targeted marketing efforts are

intrusive and annoying The collection, manipulation, and

combination of lists of personal information amount to an ominous invasion of privacy

Page 42: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

A Sample Of Current Data Warehousing And Data Mining Vendors

Table 13.10

Page 43: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

http://www.irmac.ca/

Page 44: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

http://www.almaden.ibm.com/cs/quest/TECH.html

Page 45: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Deploying Data Mining for Competitive Advantage The act of building data-mining

models does not, by itself, guarantee any business value

To be used as competitive weapon, data mining must be part of a larger process that ensures that the information learned by data mining is transformed into actionable results

Page 46: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

A process of deploying data mining for competitive advantage

Problem definition Discovery Implementation Taking action Monitoring the results

Page 47: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Anderson 2001 survey: Data mining in retailer industry Using data mining: 52.5% in total

75% of very large retailers (>$500 million) 46.4% of large ($200-499 million) 34.8% of medium ($50-199 million) 20% of small (<$50 million)

Effect (contribution to the bottom line) 52.5% said "no contribution" 19.8% said "very little." 17.8% said "somewhat” 8.9% said "very much."

Page 48: Final Exam Thursday Dec. 9, 19:30, BSB B154 Lecture notes 20 True/False and 20 multiple Choice questions: 40 points 4 Short answer questions (application

Hunt for terrorists Banks probe credit, debit card records in

hunt for terrorists: 'Data mining': Any suspicious transactions handed over to RCMP

Canada's major financial institutions are reviewing thousands of their customers' confidential transactions as part of a probe into terrorist funding that has gone beyond a list of 27 suspects provided by U.S. law enforcement agencies. Source: Financial Post (National Post) September 27, 2001