
Upload: wilfrid-lee

Post on 18-Jan-2018


Data Resource Management Agenda

What types of data are stored by organizations?

How are different types of data stored?

What are the potential problems with stored data?


Purchase Order Database

Employee
PK: EmpID
Attributes: EmpLastName, EmpFirstName, EmpEmail, EmpPhone
FK1: EmpMgrID

PurchaseOrder
PK: PONumber
Attributes: PODatePlaced, PODateNeeded, Terms, Conditions
FK1: BuyerEmpID
FK2: VendorID

Vendor
PK: VendorID
Attributes: Name, Address1, Address2, City, State, Zip, Email, Contact, Phone, FirstBuyDate

PurchaseOrderLine
PK, FK2: PONumber
PK, FK1: ProductID
PK: DateNeeded
Attributes: QtyOrdered, Price

Product
PK: ProductID
Attributes: description, UOM, EOQ, QOH
FK1: ProductTypeID

ProductType
PK: ProductTypeID
Attributes: Description

Receiver
PK: ReceiverID
Attributes: DateReceived, QtyReceived
FK3: ConditionID
FK1: ReceiveEmpID
FK2: PONumber, ProductID, DateNeeded

Condition
PK: ConditionID
Attributes: Description

PurchaseHistory
PK, FK1: productID
PK: DatePurchased
Attributes: Qty, Price
FK2: VendorID

Relationship labels in the diagram: manages; is managed by; places; is placed with; contains (twice); is on (twice); is of; receives; is for; is received on; has; was purchased; was purchased from.
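As a rough sketch, part of the purchase order model can be written as table definitions. The example uses Python's built-in sqlite3 module with an in-memory database. The column types, the choice of only three entities, and the sample rows are assumptions for illustration; the slide gives entity and attribute names only.

```python
import sqlite3

# In-memory database; column types are assumptions (the slide lists names only).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE Vendor (
    VendorID INTEGER PRIMARY KEY,
    Name     TEXT NOT NULL
);
CREATE TABLE PurchaseOrder (
    PONumber     INTEGER PRIMARY KEY,
    PODatePlaced TEXT NOT NULL,
    PODateNeeded TEXT,
    VendorID     INTEGER NOT NULL REFERENCES Vendor(VendorID)
);
-- The Product table is omitted here, so ProductID carries no foreign key.
CREATE TABLE PurchaseOrderLine (
    PONumber   INTEGER NOT NULL REFERENCES PurchaseOrder(PONumber),
    ProductID  INTEGER NOT NULL,
    DateNeeded TEXT NOT NULL,
    QtyOrdered INTEGER NOT NULL,
    Price      REAL NOT NULL,
    PRIMARY KEY (PONumber, ProductID, DateNeeded)
);
""")

# Made-up sample rows.
conn.execute("INSERT INTO Vendor VALUES (1, 'Acme Supply')")
conn.execute("INSERT INTO PurchaseOrder VALUES (667, '2018-01-10', '2018-01-18', 1)")
conn.execute("INSERT INTO PurchaseOrderLine VALUES (667, 1224, '2018-01-18', 10, 4.25)")

# The foreign key rejects a purchase order for a vendor that does not exist.
try:
    conn.execute("INSERT INTO PurchaseOrder VALUES (668, '2018-01-11', NULL, 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The composite primary key on PurchaseOrderLine mirrors the three PK attributes in the diagram, so the same product can appear on one purchase order more than once if the dates needed differ.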


Data Stored in the Purchase Order Database

What is contained in the purchase order database on the previous page?

Operational (transaction) data generated “internally” or within the organization.

Historical data (PurchaseHistory entity).

What generates the data (where does it come from)?

How is it input into a computer?

Who is responsible for the accuracy of that data?


Problems with internal operational data

Not integrated.

Redundant with other systems in the organization.

Potentially of poor quality (“dirty data”):

Incomplete.

Not accurate.

Inconsistent.

The meaning of the data is not fully defined and/or understood by all stakeholders.
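The three kinds of dirty data listed above can be illustrated with a few simple checks. This is a minimal sketch in plain Python; the vendor rows, the rule that a ZIP code is exactly five digits, and the function name find_problems are assumptions made for the example, not part of the slides.

```python
# Hypothetical vendor rows; None marks a missing value.
vendors = [
    {"VendorID": 1, "Name": "Acme Supply", "State": "OH", "Zip": "45701"},
    {"VendorID": 2, "Name": "Baker Tools", "State": None, "Zip": "4570"},
    {"VendorID": 3, "Name": "Acme Supply", "State": "Ohio", "Zip": "45701"},
]

def find_problems(rows):
    """Return (VendorID, problem) pairs for three simple quality checks."""
    problems = []
    states_by_name = {}
    for row in rows:
        # Incomplete: a required field is missing.
        if row["State"] is None:
            problems.append((row["VendorID"], "incomplete: State missing"))
        # Not accurate: a ZIP code is assumed here to be exactly 5 digits.
        if not (row["Zip"].isdigit() and len(row["Zip"]) == 5):
            problems.append((row["VendorID"], "not accurate: bad Zip"))
        # Inconsistent: the same vendor name recorded with different State values.
        seen = states_by_name.setdefault(row["Name"], row["State"])
        if seen != row["State"]:
            problems.append((row["VendorID"], "inconsistent: State differs"))
    return problems

problems = find_problems(vendors)
```

Checks like these catch only what can be tested mechanically; whether "OH" and "Ohio" mean the same thing is the kind of question that depends on stakeholders agreeing on what the data means.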


More about the purchase order database

What happens when a purchase order is completely received and paid for?

Is the data deleted from the database?

Should the data be stored? Why or why not?

Where would the data be stored? Same database? Different database?

Should the data be stored in the same format or in a different format?


What about decision support data?

How does data used to help make decisions differ from transaction data?

Does decision making data need to be stored separately from transaction data?

Does decision making data require a different database structure?


We use data to answer management questions

Operational Questions

What products are due to be received today?

What products were received today?

What is the price for ProductID 1224 on PO#0667?

When is the next due date for ProductID 8992?

Which employee received the products for PO#0667?

Which employee placed the most purchase orders this month?

Decision Support Questions

Which vendor gives us the best price for ProductID 8992?

Which vendor delivers ProductID 8992 most reliably (on time and in best condition)?

Which Product Type is increasing in price most steeply?

Which Product Type is decreasing in price most quickly?

Do employees place purchase orders with the same vendors, or do they differentiate based on price or reliability?
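To make the contrast concrete, here is a sketch of one question of each kind as a query, using Python's sqlite3 module. The table layouts are simplified from the purchase order model, the rows are made up, and "best price" is assumed to mean the lowest quantity-weighted average unit price; other definitions are possible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PurchaseOrderLine (PONumber TEXT, ProductID INTEGER, QtyOrdered INTEGER, Price REAL);
CREATE TABLE PurchaseHistory (ProductID INTEGER, DatePurchased TEXT, Qty INTEGER, Price REAL, VendorID INTEGER);
""")
# Made-up sample rows.
conn.execute("INSERT INTO PurchaseOrderLine VALUES ('0667', 1224, 10, 4.25)")
conn.executemany(
    "INSERT INTO PurchaseHistory VALUES (?, ?, ?, ?, ?)",
    [
        (8992, "2017-03-01", 100, 2.10, 1),
        (8992, "2017-06-01", 100, 2.30, 1),
        (8992, "2017-04-15", 50, 1.90, 2),
        (8992, "2017-09-15", 50, 2.00, 2),
    ],
)

# Operational question: one row, looked up by key.
price = conn.execute(
    "SELECT Price FROM PurchaseOrderLine WHERE PONumber = ? AND ProductID = ?",
    ("0667", 1224),
).fetchone()[0]

# Decision support question: aggregate many historical rows per vendor.
best_vendor, best_avg = conn.execute(
    """
    SELECT VendorID, SUM(Qty * Price) / SUM(Qty) AS AvgUnitPrice
    FROM PurchaseHistory
    WHERE ProductID = ?
    GROUP BY VendorID
    ORDER BY AvgUnitPrice
    LIMIT 1
    """,
    (8992,),
).fetchone()
```

The operational query touches a single current row, while the decision support query scans and summarizes history, which is one reason the two workloads are often served by differently structured databases.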


Compare and Contrast TPS and DSS

Issue | TPS/MIS | DSS
Definition | Systems to support day-to-day operations. | Systems to support ad-hoc decision making.
Users | Clerks, data entry, low-level supervisors. | Managers, analysts, support staff, researchers.
Design goal | Performance. | Flexibility, ease of use, ease of access.
Transaction type | Updates. | Queries.
Query activity | Low; few joins. | High; many joins.


Operational vs. Decision making databases

Issue | Operational database | DSS database
Content | Internal data, process-oriented. | Internal and external data. Subject-oriented.
Data currency | Real time. Current. Volatile. | Batch. Historical. Non-volatile.
Summary level | Details of transactions; no (or very little) derived data. | Summarized; many aggregation levels.
Volume | Megabytes to gigabytes. | Gigabytes to terabytes.
Design | Normalized to prevent anomalies. | Denormalized to enhance query performance.
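The "Summary level" and "Design" rows can be illustrated with a small sketch in plain Python: normalized operational rows are first flattened (denormalized) and then summarized. The product names, dates, and amounts are made up, and DatePlaced is placed on the order line here only to keep the example short.

```python
from collections import defaultdict

# Normalized operational tables (made-up rows).
product_type = {10: "Fasteners", 20: "Adhesives"}
product = {
    1224: {"description": "Hex bolt", "ProductTypeID": 10},
    8992: {"description": "Epoxy", "ProductTypeID": 20},
}
order_lines = [
    {"ProductID": 1224, "DatePlaced": "2018-01-10", "QtyOrdered": 10, "Price": 4.0},
    {"ProductID": 1224, "DatePlaced": "2018-01-22", "QtyOrdered": 5, "Price": 4.0},
    {"ProductID": 8992, "DatePlaced": "2018-02-03", "QtyOrdered": 2, "Price": 12.5},
]

# Denormalize: copy the product and product type descriptions onto each line,
# so a later query needs no joins.
flat = [
    {
        **line,
        "ProductDescription": product[line["ProductID"]]["description"],
        "ProductType": product_type[product[line["ProductID"]]["ProductTypeID"]],
    }
    for line in order_lines
]

# Summarize: total spend per (month, product type), a derived value that the
# operational tables would not store.
spend = defaultdict(float)
for row in flat:
    month = row["DatePlaced"][:7]  # "YYYY-MM"
    spend[(month, row["ProductType"])] += row["QtyOrdered"] * row["Price"]
```

The flattened rows repeat descriptions on every line, which a normalized design avoids to prevent update anomalies; in a DSS database that is rarely updated in place, the repetition is traded for simpler and faster queries.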


Organizations also use external data.

What data NOT generated by the organization might be relevant to making decisions?

Who generates external data, and how does it get into my database?


So, can one database support both transaction processing and decision support applications? Should we just add a few tables to the transaction processing database, as shown in the prior slide?

Or should we create a separate database for decision making?

One database for both transaction processing and decision support

VS.

Transaction database separate from decision making database


A Business Intelligence “System”

Transaction processing systems vs. operational systems.

A business intelligence system encompasses all processes, hardware and software necessary to extract data, transform it, integrate it, store it, and provide information. The information is then made effective and accessible to users to support decision making.

Sounds like just another information system...

So what makes it different?


Big Data!

Data Use!

Data Input!


The “V’s” of Big Data

Volume: Scale of data.

Velocity: Frequency of change.

Variety: Different forms and sources of data.

Veracity: Uncertainty of the accuracy of data.


[Figure: business intelligence architecture. Data sources: ERP, legacy, POS, other OLTP/Web, external data. ETL process: select, extract, transform, integrate, load. Enterprise data warehouse, with metadata and replication. Data marts: engineering, marketing, finance, and others, with a "no data marts" option. Access through API/middleware. Applications (visualization): data/text mining, custom built applications, OLAP, dashboard, Web, routine business reporting.]


Components of a business intelligence system

Data store (called a “data warehouse”).

Extraction/transformation/loading processes.

End user query tools.

End user visualization tools.


What is a data warehouse?

A data warehouse is a database designed to support a business intelligence system.

A data warehouse is:

Integrated: It is a centralized, consolidated database integrating data from an entire organization.

Subject-oriented: Data warehouse data are organized around key subjects. The data are usually arranged by topic, such as customers, products, suppliers, etc.

Time-variant: Data in the warehouse contain a time dimension so that they may be used as a historical aggregation.

Non-volatile: Once data enter, they seldom leave. Data are appended rather than overwritten. Data are updated in batches.
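The time-variant and non-volatile properties can be sketched as an append-only batch load, again with Python's sqlite3 module. The table name ProductPriceFact, the function load_batch, and the rows are hypothetical and exist only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Each row carries the date of the batch that loaded it (the time dimension).
conn.execute("CREATE TABLE ProductPriceFact (ProductID INTEGER, Price REAL, LoadDate TEXT)")

def load_batch(conn, load_date, rows):
    """Append one batch; earlier rows are never updated or deleted."""
    conn.executemany(
        "INSERT INTO ProductPriceFact VALUES (?, ?, ?)",
        [(product_id, price, load_date) for product_id, price in rows],
    )
    conn.commit()

# Two month-end batches of made-up prices.
load_batch(conn, "2018-01-31", [(1224, 4.25), (8992, 2.10)])
load_batch(conn, "2018-02-28", [(1224, 4.40), (8992, 2.10)])

# Because old rows survive, the price history of a product can be read back.
history = conn.execute(
    "SELECT LoadDate, Price FROM ProductPriceFact WHERE ProductID = ? ORDER BY LoadDate",
    (1224,),
).fetchall()
```

An operational database would typically overwrite the price of ProductID 1224 in place; here both values remain, each tied to the batch in which it arrived.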


Issues in creating a data warehouse

How to get accurate and complete data?

How to consolidate data?

Differing data meanings.

Differing storage mechanisms.

Differing data formats.
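A small sketch of the consolidation problem: two source systems describe the same vendor with different field names, key formats, and date formats, and a transform step maps both to one shared format. The source layouts, field names, and functions from_erp and from_legacy are all hypothetical.

```python
from datetime import datetime

# Two hypothetical source systems that record the same vendor differently.
erp_rows = [{"vendor_id": "V-0007", "name": "ACME SUPPLY", "first_buy": "01/18/2016"}]
legacy_rows = [{"VendNo": 7, "VendName": "Acme Supply ", "FirstBuy": "2016-01-18"}]

def from_erp(row):
    """Map an ERP row to the shared warehouse format."""
    return {
        "VendorID": int(row["vendor_id"].split("-")[1]),
        "Name": row["name"].strip().title(),
        "FirstBuyDate": datetime.strptime(row["first_buy"], "%m/%d/%Y").date().isoformat(),
    }

def from_legacy(row):
    """Map a legacy row to the shared warehouse format."""
    return {
        "VendorID": int(row["VendNo"]),
        "Name": row["VendName"].strip().title(),
        "FirstBuyDate": row["FirstBuy"],
    }

# Integrate: after transformation, duplicates collapse on the shared key.
warehouse = {}
for row in [from_erp(r) for r in erp_rows] + [from_legacy(r) for r in legacy_rows]:
    warehouse[row["VendorID"]] = row
```

Differing formats are the easy part to automate. Differing meanings, such as whether "first buy" is the order date or the delivery date in each system, have to be settled with the people who own the data before any mapping is written.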


Data mart extraction data warehouse

[Figure: components shown are operational databases, an external data source, extract, transform and load processes, data marts, and user departments.]


Two-tier data warehouse architecture

[Figure: components shown are operational databases, an external data source, a transformation process, the data warehouse (EDM and summarized data), a data warehouse server, and user departments.]


Three-tier data warehouse architecture

[Figure: components shown are operational databases, an external data source, a transformation process, the data warehouse (EDM and summarized data), a data warehouse server, an extraction process, a data mart tier with data marts, and user departments.]


Issues in designing a data warehouse

Must have a predefined subject focus.

Has the potential to be very large – must define the “grain” or granularity level of storage.

Will always have a dimension of time.

May contain derived data.

May be a summary of data, rather than each detailed transaction.

Does not always adhere to standard normalization rules.


Potential Data Mart for Purchase Orders

Can be a “star” or a “snowflake” design – this one is a snowflake
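The slide's diagram is not reproduced in this transcript, so the following is only a sketch of what a snowflake design for purchase data might look like, written with Python's sqlite3 module. All table names (PurchaseFact, DimProduct, DimProductType, DimVendor) and rows are assumptions, not the design from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables (hypothetical names).
CREATE TABLE DimProductType (ProductTypeID INTEGER PRIMARY KEY, Description TEXT);
CREATE TABLE DimProduct (
    ProductID INTEGER PRIMARY KEY,
    Description TEXT,
    -- This reference to a second-level dimension makes the design a snowflake;
    -- a star would store the type description directly in DimProduct.
    ProductTypeID INTEGER REFERENCES DimProductType(ProductTypeID)
);
CREATE TABLE DimVendor (VendorID INTEGER PRIMARY KEY, Name TEXT);
-- Fact table: one row per product, vendor, and purchase date.
CREATE TABLE PurchaseFact (
    ProductID INTEGER REFERENCES DimProduct(ProductID),
    VendorID INTEGER REFERENCES DimVendor(VendorID),
    DatePurchased TEXT,
    Qty INTEGER,
    Price REAL
);
INSERT INTO DimProductType VALUES (10, 'Fasteners'), (20, 'Adhesives');
INSERT INTO DimProduct VALUES (1224, 'Hex bolt', 10), (8992, 'Epoxy', 20);
INSERT INTO DimVendor VALUES (1, 'Acme Supply'), (2, 'Baker Tools');
INSERT INTO PurchaseFact VALUES
    (1224, 1, '2017-03-01', 10, 4.0),
    (1224, 2, '2017-04-01', 20, 3.5),
    (8992, 1, '2017-05-01', 2, 12.5);
""")

# Total spend by product type: the snowflake needs two joins to reach the type.
spend_by_type = conn.execute("""
    SELECT t.Description, SUM(f.Qty * f.Price)
    FROM PurchaseFact f
    JOIN DimProduct p ON p.ProductID = f.ProductID
    JOIN DimProductType t ON t.ProductTypeID = p.ProductTypeID
    GROUP BY t.Description
    ORDER BY t.Description
""").fetchall()
```

In a star design the same question would need one join, at the cost of repeating the product type description on every product row.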


Accessing a data warehouse

Visualization tools.

Graphical.

Spreadsheet format - usually Excel look-and-feel.

Beyond the spreadsheet using discovery tools. Example: http://www.gapminder.org/

Dashboard. Examples: http://www.dundas.com/dashboard/online-examples/

Query tools.

OLAP: Online analytical processing.

Data mining: Artificial intelligence based query methods.


Online analytical processing

Provides multi-dimensional data analysis techniques.

Works primarily with data aggregation.

Provides advanced statistical analysis.

Supports access to very large databases.

Provides enhanced query optimization algorithms.

Lots of acronyms: OLAP, ROLAP, MOLAP, HOLAP.

Can be add-ons to existing products (Excel is one example) or can have their own user interfaces.
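Multi-dimensional aggregation can be sketched in a few lines of plain Python: the same facts are summed along different combinations of dimensions. The facts, the dimension names, and the function roll_up are made up for the example; real OLAP tools do this at far larger scale and with precomputed aggregates.

```python
from collections import defaultdict

# Made-up purchase facts: (vendor, product type, year, amount).
facts = [
    ("Acme Supply", "Fasteners", 2016, 100.0),
    ("Acme Supply", "Fasteners", 2017, 150.0),
    ("Acme Supply", "Adhesives", 2017, 40.0),
    ("Baker Tools", "Fasteners", 2017, 60.0),
]
DIMENSIONS = ("vendor", "product_type", "year")

def roll_up(facts, keep):
    """Sum the amount over every dimension not named in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(float)
    for *dims, amount in facts:
        totals[tuple(dims[i] for i in positions)] += amount
    return dict(totals)

by_vendor_year = roll_up(facts, ("vendor", "year"))  # two-dimensional view
by_year = roll_up(facts, ("year",))                  # rolled up further
grand_total = roll_up(facts, ())                     # fully aggregated
```

Moving from by_vendor_year to by_year is a roll-up; going the other way, from a yearly total back to the vendors behind it, is a drill-down.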


OLAP vs. Data Mining questions

OLAP | Data Mining
Which customers spent the most with us in the past year? | Which types of customers are likely to spend the most with us in the coming year?
How much did the bank lose from loan defaulters within the past two years? | What are the characteristics of the customers most likely to default on their loans before the year is over?
What were the highest selling fashion items in our London stores? | What additional products are most likely to be sold to customers who buy shorts?
Which store/location made the highest sales in the past year? | In which area should we open a new store next year?


Data mining

Data mining tools: analyze the data;

uncover patterns hidden in the data;

form computer models based on the findings; and

use the models to predict business behavior.

Proactive tools.

Based on artificial intelligence software such as decision trees, neural networks, fuzzy logic systems, inductive nets and classification networking.
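The four steps above (analyze, uncover a pattern, form a model, predict) can be sketched with the simplest possible decision tree: a single split, sometimes called a decision stump. The delivery history, the choice of order quantity as the only feature, and the function names are all made up for illustration; real data mining tools fit far richer models.

```python
# Made-up history: (order quantity, delivered late?).
history = [
    (10, False), (20, False), (35, False), (50, True),
    (80, True), (120, True), (40, True), (15, False),
]

def fit_stump(samples):
    """Pick the quantity threshold that misclassifies the fewest samples.

    The model predicts "late" when quantity >= threshold.
    """
    best_threshold, best_errors = None, len(samples) + 1
    for threshold in sorted({qty for qty, _ in samples}):
        errors = sum((qty >= threshold) != late for qty, late in samples)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold, best_errors

# Analyze the data and form the model.
threshold, training_errors = fit_stump(history)

def predict_late(qty):
    """Use the fitted model to predict the outcome of a new order."""
    return qty >= threshold
```

Unlike the OLAP questions, which summarize what already happened, predict_late answers a question about an order that has not been placed yet, which is what makes such tools proactive.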