Data Warehousing, OLAP, and Data Mining

Post on 09-Mar-2015


OLAP, and OLTP

2

Introduction

• Data, data, data… everywhere!
• Information… that's another story!
• Especially, the right information at the right time!
• Data warehousing's goal is to make the right information available at the right time
• Data warehousing is a data store (e.g., a database of some sort) and a process for bringing together disparate data from throughout an organization for decision-support purposes

3

Different Goal

• Aggregation, summarization and exploration
• Of historical data
• To help management make informed decisions

Product             Branch         Time                 Price
Coke (0.5 gallon)   Convoy Street  2006-03-01 09:00:01  $1.00
Pepsi (0.5 gallon)  UTC            2006-03-01 09:00:01  $1.03
Coke (1 gallon)     UTC            2006-03-01 09:00:02  $1.50
Altoids             Costa Verde    2006-03-01 09:01:33  $0.30
...

• Find the total sales for each product and month
• Find the percentage change in the total monthly sales for each product
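The two queries above can be sketched in a few lines of Python. This is only an illustration: the records, the helper names `monthly_totals` and `monthly_pct_change`, and the choice to compare each month with the previous month present in the data are all assumptions, not part of the slides.

```python
from collections import defaultdict

# Hypothetical sales records (product, timestamp, price); values are illustrative only.
sales = [
    ("Coke (0.5 gallon)", "2006-03-01 09:00:01", 1.00),
    ("Coke (0.5 gallon)", "2006-03-15 10:30:00", 1.00),
    ("Coke (0.5 gallon)", "2006-04-02 11:00:00", 1.00),
    ("Altoids", "2006-03-01 09:01:33", 0.30),
    ("Altoids", "2006-04-05 14:20:00", 0.30),
    ("Altoids", "2006-04-09 16:45:00", 0.30),
]

def monthly_totals(rows):
    """Sum prices per (product, 'YYYY-MM')."""
    totals = defaultdict(float)
    for product, timestamp, price in rows:
        totals[(product, timestamp[:7])] += price
    return dict(totals)

def monthly_pct_change(totals):
    """Percentage change versus the previous month present in the data, per product."""
    by_product = defaultdict(list)
    for (product, month), total in sorted(totals.items()):
        by_product[product].append((month, total))
    changes = {}
    for product, series in by_product.items():
        for (_, prev), (month, cur) in zip(series, series[1:]):
            changes[(product, month)] = (cur - prev) / prev * 100.0
    return changes

totals = monthly_totals(sales)
changes = monthly_pct_change(totals)
```

Both functions scan every record, which is the kind of aggregate-over-history workload that a transaction-oriented system is not tuned for.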

4

OLAP and OLTP

• OLTP: Online Transaction Processing system (relies solely on relational databases); works on one record at a time

• OLAP: Online Analytical Processing system (a class of technologies designed for ad hoc data access and analysis); deals with summarized data

5

6

Different Requirements

                     OLTP                                            OLAP
Tasks                Day-to-day operation                            High-level decision support
Size of database     Gigabytes                                       Terabytes
Time span            Recent, up-to-date                              Spanning over months / years
Size of working set  Tens of records, accessed through primary keys  Consolidated data from multiple databases
Workload             Structured / repetitive                         Ad-hoc, exploratory queries
Performance          Transaction throughput                          Query latency

• OLTP – On-Line Transaction Processing
• OLAP – On-Line Analytical Processing

7

Data Warehouse

[Figure: the enterprise "database" (customers, vendors, orders, transactions, etc.) is copied, organized, and summarized into the data warehouse, which is then used for data mining.]

Data miners:
• "Farmers" – they know
• "Explorers" – unpredictable

8

General Architecture for Data Warehousing

• Source systems

• Extraction, (Clean), Transformation, & Load (ETL)

• Central repository

• Metadata repository

• Data marts

• Operational feedback

• End users (business)
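The ETL step in the list above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions: the source rows, the cleaning rule (drop rows without a price), and the in-memory SQLite database standing in for the central repository are all hypothetical.

```python
import sqlite3

# Hypothetical source-system rows; the messy formatting is deliberate.
orders_system = [
    {"product": " coke ", "branch": "UTC", "price": "1.50"},
    {"product": "Pepsi", "branch": "utc", "price": "1.03"},
    {"product": "Altoids", "branch": "Costa Verde", "price": None},  # missing price
]

def extract(source):
    """Extraction: pull raw rows out of a source system."""
    return list(source)

def clean(rows):
    """Clean: drop rows that lack a price (an assumed rule)."""
    return [r for r in rows if r["price"] is not None]

def transform(rows):
    """Transformation: normalize names and convert types."""
    return [
        (r["product"].strip().title(), r["branch"].strip().upper(), float(r["price"]))
        for r in rows
    ]

def load(conn, rows):
    """Load: append the conformed rows to the central repository."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (product TEXT, branch TEXT, price REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

warehouse = sqlite3.connect(":memory:")  # stand-in for the central repository
load(warehouse, transform(clean(extract(orders_system))))
loaded = warehouse.execute(
    "SELECT product, branch, price FROM sales ORDER BY product"
).fetchall()
```

Keeping each stage as its own function mirrors the list: source systems feed extraction, and only conformed rows reach the repository that data marts and end users query.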

9

Where does OLAP fit in?

10

OLAP Overview

• Interactive, exploratory analysis of multidimensional data to discover patterns

[Figure: cube with axes age, accidents, and gender]

11

OLAP Architecture

12

Server Options

• Single processor

• Symmetric multiprocessor (SMP)

• Massively parallel processor (MPP)

13

OLAP Server Options

• Multi-dimensional OLAP (MOLAP)
  – ‘A k-dimensional matrix based on a non relational storage structure.’ [Agrawal et al]

• Relational OLAP (ROLAP)
  – ‘A relational back-end wherein operations of the data are translated to relational queries.’ [Agrawal et al]

• Hybrid OLAP (HOLAP)
  – Integration of MOLAP with ROLAP.

• Desktop OLAP (DOLAP)
  – Simplified versions of MOLAP or ROLAP.

• ZOLAP
  – Speak with your chemist (normally only prescribed for death march victims)
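To make the MOLAP/ROLAP distinction concrete, the sketch below answers the same question both ways: once from a dense array addressed by dimension positions, and once through a relational query. The data, the two-dimensional layout, and the use of SQLite as the relational back-end are assumptions chosen for illustration; real OLAP servers are far more elaborate.

```python
import sqlite3

# Hypothetical unit sales: (product, quarter, units).
facts = [("TV", "Q1", 10), ("TV", "Q2", 12), ("PC", "Q1", 7), ("PC", "Q2", 9)]

# MOLAP-style: a dense 2-dimensional array addressed by dimension positions.
products = ["TV", "PC"]
quarters = ["Q1", "Q2"]
cube = [[0] * len(quarters) for _ in products]
for product, quarter, units in facts:
    cube[products.index(product)][quarters.index(quarter)] = units
molap_tv_total = sum(cube[products.index("TV")])

# ROLAP-style: the same question translated to a relational query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, quarter TEXT, units INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", facts)
(rolap_tv_total,) = conn.execute(
    "SELECT SUM(units) FROM sales WHERE product = ?", ("TV",)
).fetchone()
```

The array lookup needs no query planning but stores a cell for every combination of dimension values; the relational form stores only the rows that exist and pays for aggregation at query time.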

14

OLAP – Online Analytical Processing

• A definition:

• Data representation is in the form of a CUBE
• OLAP goes beyond SQL with its analysis capabilities
• Key feature of OLAP: Relevant multi-dimensional views such as products, time, geography

15

OLAP Cube - 1

16

OLAP Cube - 2

17

OLAP Cube - 3

• Star Structure (quite common)

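[Figure: star schema with a central Facts table and surrounding dimension tables. Labels shown: Facts (Week, Product, Region, Channel, Revenue, Expenses, Units); Product (Model, Type, Color); Time (Year); Region (Nation, District, Dealer); Channel.]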
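A star structure can be sketched in SQL as one fact table whose foreign keys point at dimension tables. The example below is a guess at the shape only: the table names `product_dim`, `time_dim`, and `facts`, the key columns, and every row of data are hypothetical, and only two of the dimensions are modeled to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_dim (product_id INTEGER PRIMARY KEY, model TEXT, type TEXT, color TEXT);
CREATE TABLE time_dim (week_id INTEGER PRIMARY KEY, week INTEGER, year INTEGER);
CREATE TABLE facts (
    product_id INTEGER REFERENCES product_dim(product_id),
    week_id INTEGER REFERENCES time_dim(week_id),
    revenue REAL, expenses REAL, units INTEGER
);
""")
# Made-up dimension and fact rows.
conn.executemany("INSERT INTO product_dim VALUES (?, ?, ?, ?)",
                 [(1, "A100", "Sedan", "Red"), (2, "B200", "Truck", "Blue")])
conn.executemany("INSERT INTO time_dim VALUES (?, ?, ?)", [(1, 1, 1996), (2, 1, 1997)])
conn.executemany("INSERT INTO facts VALUES (?, ?, ?, ?, ?)", [
    (1, 1, 100.0, 60.0, 2),
    (1, 2, 150.0, 80.0, 3),
    (2, 1, 200.0, 120.0, 1),
])

# Join the fact table to its dimensions, then group by dimension attributes.
revenue_by_type_year = conn.execute("""
    SELECT p.type, t.year, SUM(f.revenue)
    FROM facts f
    JOIN product_dim p ON p.product_id = f.product_id
    JOIN time_dim t ON t.week_id = f.week_id
    GROUP BY p.type, t.year
    ORDER BY p.type, t.year
""").fetchall()
```

Every analytical query follows the same pattern: join the facts to the dimensions of interest and group by whichever attributes define the view.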

18

A Sample Data Cube

[Figure: data cube with dimensions Date (1Qtr, 2Qtr, 3Qtr, 4Qtr, sum), Product (TV, VCR, PC, sum), and Country (U.S.A, Canada, Mexico, sum). The highlighted cell is the total annual sales of TV in U.S.A.]
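One way to picture the "sum" members of a cube with date, product, and country dimensions is to precompute every roll-up. The sketch below does that for made-up figures; the function name `build_cube` and the dictionary representation are assumptions, not how any particular OLAP product stores its data.

```python
import itertools
from collections import defaultdict

# Made-up sales figures keyed by (date, product, country).
base = {
    ("1Qtr", "TV", "U.S.A"): 100,
    ("2Qtr", "TV", "U.S.A"): 120,
    ("3Qtr", "TV", "U.S.A"): 90,
    ("4Qtr", "TV", "U.S.A"): 140,
    ("1Qtr", "PC", "Canada"): 50,
    ("1Qtr", "TV", "Mexico"): 30,
}

def build_cube(cells):
    """Add a 'sum' member on every dimension so each cell also rolls up."""
    cube = defaultdict(int)
    for key, value in cells.items():
        # Each coordinate either stays as-is or collapses to "sum" (2**3 combinations).
        for mask in itertools.product((False, True), repeat=len(key)):
            rolled = tuple(
                "sum" if collapse else member for member, collapse in zip(key, mask)
            )
            cube[rolled] += value
    return dict(cube)

cube = build_cube(base)
# Total annual sales of TV in U.S.A.: collapse the date dimension only.
tv_usa_annual = cube[("sum", "TV", "U.S.A")]
```

With the roll-ups materialized, any total is a single lookup, at the cost of storing up to eight cells for each base cell in three dimensions.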

19

OLAP Cube - 5

Three-Dimensional Cube Display

[Figure: pivot display. Page: Region = North. Columns: Sales (Redblob, Blueblob, Total). Rows: Year (1996, 1997, Total).]

20

OLAP Cube - 6

Six-Dimensional Cube

Dimension         Example
Brand             Mt. Airy
Store             Atlanta
Customer segment  Business
Product group     Desks
Period            January
Variable          Units sold

21

Rotation (Pivot Table)

22

Drill Down
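Rotation and drill down can both be seen as re-aggregating the same facts with a different choice of row and column dimensions. The sketch below is an assumption-laden illustration: the rows are made up, and `aggregate` and `FIELDS` are hypothetical names.

```python
from collections import defaultdict

# Made-up sales: (year, quarter, product, amount).
rows = [
    (1996, "Q1", "Redblob", 10), (1996, "Q2", "Redblob", 15),
    (1996, "Q1", "Blueblob", 20), (1997, "Q1", "Redblob", 30),
    (1997, "Q3", "Blueblob", 25),
]
FIELDS = {"year": 0, "quarter": 1, "product": 2}

def aggregate(rows, row_dims, col_dims):
    """Sum amounts into table[row_key][col_key]; keys are tuples of dimension values."""
    table = defaultdict(lambda: defaultdict(int))
    for r in rows:
        row_key = tuple(r[FIELDS[d]] for d in row_dims)
        col_key = tuple(r[FIELDS[d]] for d in col_dims)
        table[row_key][col_key] += r[3]
    return {rk: dict(cols) for rk, cols in table.items()}

by_year = aggregate(rows, ["year"], ["product"])
# Rotation: swap which dimension runs down the rows and which across the columns.
rotated = aggregate(rows, ["product"], ["year"])
# Drill down: split each year row into its quarters.
drilled = aggregate(rows, ["year", "quarter"], ["product"])
```

Rotation changes only the layout, so every total is preserved; drill down moves to a finer level, so the quarter rows for a year add back up to that year's row.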

23

OLAP Examples

• http://perso.wanadoo.fr/bernard.lupin/english/example.htm

• Excel Pivot Table example (similar to OLAP cube)

24

Sample of OLAP products

Just a snippet from http://www.olapreport.com/ProductsIndex.htm ; not an endorsement

25

Data Mining versus OLAP

26

Data Mining versus OLAP

• OLAP - Online Analytical Processing
  – Provides you with a very good view of what is happening, but cannot predict what will happen in the future or why it is happening

27

Results of Data Mining Include:

• Forecasting what may happen in the future
• Classifying people or things into groups by recognizing patterns
• Clustering people or things into groups based on their attributes
• Associating what events are likely to occur together
• Sequencing what events are likely to lead to later events

28

Thanks for listening.
