1 business intelligence & data warehousing tom a. fürstenberg business intelligence consultant...

Business Intelligence & Data Warehousing. Tom A. Fürstenberg, Business Intelligence Consultant, Cap Gemini Ernst & Young

Upload: eliza-spindle

Post on 31-Mar-2015


TRANSCRIPT

Page 1: 1 Business Intelligence & Data Warehousing Tom A. Fürstenberg Business Intelligence Consultant Cap Gemini Ernst & Young

Business Intelligence & Data Warehousing
Tom A. Fürstenberg
Business Intelligence Consultant, Cap Gemini Ernst & Young

Page 2: Learning Objectives of the Lecture

• What are BI & DWH? (conceptual and technical)
• Application of BI & DWH
• The practice of a consultant in general, and at Cap Gemini Ernst & Young in particular

Page 3: Lecture Contents

• Performance Management
• Business Intelligence (Performance Measurement)
• OLAP
• Extranets
• Architecture
• Data Warehouse
• ETL
• Multidimensional Modeling
• CGE&Y Approach
• Data Mining

Page 4: Performance Management

Measuring business objectives in a goal-directed way and adjusting course accordingly

Page 5: In Control of a Company

Page 6: Overview

• Conceptual control model: strategy & mission; responsibilities & authorities
• Operational control model: control system (goals, means, preconditions); Key Performance Indicators, Critical Success Indicators, External Indicators
• Information model
• Information system
• Information provision
• Data warehouse (data, data, data)

Page 7: Governance Vision: Building Blocks for Steering Organisations

Diagram labels: Why?; Stakeholders; Who?; Objectives & performance indicators; What?; Methods, systems; How?; Values & norms; Strategy & mission; Organisation

Page 8: Methods

INK management model: leadership; personnel policy; policy & strategy; resource management; processes; appreciation by personnel; appreciation by customers; appreciation by society; business results

Balanced Scorecard (each perspective has objectives and performance indicators):
• Financial perspective
• Internal business perspective
• Customer perspective
• Innovation & learning perspective

Various financial models

Page 9: Towards an Operational Control Model

• First the goals and KPIs are established
• Then the Critical Success Indicators (CSIs) are determined
• Followed by the Environmental Indicators (OIs, from the Dutch "omgevingsindicatoren")

Page 10: Towards an Operational Control Model

Diagram: the KPIs, CSIs and OIs set against the dimensions Time, Region, Product, Department and Market

Page 11: From Model to Behavioural Change

Diagram labels: Management Charter; operational control model (KPIs, CSIs, OIs); assessment and steering; information provision; planning and commitment; multidimensional data structure; responsibilities and authorities

Page 12: Some Typical Management Questions

PRODUCT
• How much have we sold?
• Which product gives the best profit?
• Which product has the largest sales volume this quarter?
• Which product best meets market needs?
• How much to produce of each product?

CUSTOMER
• Who is the most profitable customer?
• What is the satisfaction level?
• Which are the best segments?
• Which service to improve?
• How many customers did we lose last year?
• Who are our biggest accounts?

CHANNEL
• Which retailer yields most by volume and which by profit?
• What promotions will yield most profit?
• What effect will discounts have on the turnover?

MARKETING
• What are the area coverage levels?
• How many contacted people became customers?
• Promotions' results?
• What is the competition doing?

Page 13: Key Performance Indicators Top 10

Purchasing (# hits)
1. development of purchase prices: 86
2. stock level by item or product line: 84
3. reliability of suppliers; quality (zero-defects): 81
4. financial position of suppliers: 81
5. reliability of suppliers; time to delivery: 79
6. reliability of suppliers; completeness: 74
7. purchasing department costs: 72
8. number of goods returned: 48
9. cash discounts: 47
10. accounts payable as a % of purchasing value: 39

Production (# hits)
1. overhead costs: 94
2. direct costs of materials: 93
3. direct costs of labour: 92
4. factory overhead expenses: 91
5. quality finished product: 91
6. sick days: 91
7. overtime: 90
8. stock of raw materials: 90
9. production volume: 89
10. maintenance costs: 89

Sales (# hits)
1. sales volume or sales volume growth: 95
2. depreciation of accounts receivable: 92
3. number of buyer complaints: 92
4. quality of deliveries: 88
5. marketing expenses: 87
6. freight costs: 86
7. value of new orders: 85
8. accounts receivable as a % of turnover: 85
9. reliability of deliveries; time to delivery: 84
10. market growth: 83

Finance (# hits)
1. Earnings Before Interest & Taxes (EBIT): 104
2. Gross margin: 103
3. Profit Before Tax (PBT): 102
4. Gross investments: 102
5. working capital: 101
6. cash flow: 94
7. liquidity position: 92
8. Return on Capital Employed: 77
9. Return on Sales: 75
10. turnover time of goods in stock: 75

Source: Results FIND! The Best benchmark study conducted in 1997/1998 by Ernst & Young Consulting and VU. 103 industrial companies participated in the study.

Page 14: And Now All That Is Left Is to Measure…

Business Intelligence (performance measurement)

Page 15: The Answers

The information is there, but spread everywhere!

Page 16: In Practice...

Page 17: Problems

• (Over)loading of the IT department (queries)
• Long lead time for report 'fabrication'
• High costs in man-hours
• Data sources are hard to integrate
• Non-standardised reports
• No unambiguous definitions
• Error-prone
• Open to manipulation
• Dependence on 'links' in the chain
• Discussions about differences in figures
• Limited analysis capabilities
• Wrong and late interpretations, conclusions and decisions
• ...

Page 18: At the Push of a Button...

Page 19: From Chaos... to Structure

Page 20: Why Now? Hype? Developments:

Market Pull
• Globalisation of markets
• Individualisation of customers
• Shorter life cycle of products
• Information overload
• Mergers

Technology Push
• Faster hardware
• Cheaper disk capacity
• Modern OLAP tools
• Any access: client/server, web, mobile

Page 21: OnLine Analytical Processing (OLAP)

• Based on the syntax of management information questions: <measure> per <dim1> per <dim2> per ...
• KPIs, CSIs and OIs are measures
• Product, Region, Customer, Time, etc. are dimensions (slice & dice)
• Dimensions have hierarchies (drill down)
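The question syntax "<measure> per <dim1> per <dim2>" maps naturally onto a grouped aggregation. The sketch below is only an illustration: the fact rows, the column names and the helper `measure_per` are hypothetical and do not come from the slides.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, region, year, quarter, revenue).
COLUMNS = ("product", "region", "year", "quarter", "revenue")
FACTS = [
    ("Shirts", "North", 2001, "Q1", 100.0),
    ("Shirts", "North", 2001, "Q2", 150.0),
    ("Shirts", "South", 2001, "Q1", 80.0),
    ("Shoes", "North", 2001, "Q1", 200.0),
    ("Shoes", "South", 2001, "Q2", 120.0),
]


def measure_per(facts, measure, dims, where=None):
    """Sum <measure> per <dim1> per <dim2> ..., optionally sliced by `where`."""
    where = where or {}
    totals = defaultdict(float)
    for row in facts:
        record = dict(zip(COLUMNS, row))
        # Slice: skip rows that do not match every fixed dimension value.
        if any(record[d] != v for d, v in where.items()):
            continue
        totals[tuple(record[d] for d in dims)] += record[measure]
    return dict(totals)


# "Revenue per product".
per_product = measure_per(FACTS, "revenue", ["product"])
# Drill down along the time hierarchy: year, then year and quarter.
per_year = measure_per(FACTS, "revenue", ["year"])
per_quarter = measure_per(FACTS, "revenue", ["year", "quarter"])
# Slice: fix region to "North", then look at revenue per product.
north = measure_per(FACTS, "revenue", ["product"], where={"region": "North"})
```

Adding a dimension to `dims` drills down; fixing a value in `where` slices.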

Page 22: OLAP

One cube (dimensions such as Product and Time), several views:
• Product Manager's View
• Financial Manager's View
• Regional Manager's View
• Ad Hoc View

Page 23: Introduction to Cubes

Measure: Sales
Dimensions:
• Time: Q1, Q2, Q3, Q4
• Product: Grapes, Apples, Melons, Cherries, Pears
• Location: Atlanta, Denver, Detroit
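The cube on this slide has three dimensions (Product, Location, Time) and one measure (Sales). A minimal way to picture it is a mapping from one member per dimension to a sales value. In the sketch below the dimension members are taken from the slide, but the sales figures are placeholders (every cell holds 10) and `slice_cube` is a hypothetical helper.

```python
from itertools import product as cartesian

PRODUCTS = ["Grapes", "Apples", "Melons", "Cherries", "Pears"]
LOCATIONS = ["Atlanta", "Denver", "Detroit"]
QUARTERS = ["Q1", "Q2", "Q3", "Q4"]

# One cell per (product, location, quarter); 10 is a placeholder value.
cube = {cell: 10 for cell in cartesian(PRODUCTS, LOCATIONS, QUARTERS)}


def slice_cube(cube, product=None, location=None, quarter=None):
    """Return the cells matching the fixed dimension values (None = all)."""
    wanted = (product, location, quarter)
    return {
        cell: sales
        for cell, sales in cube.items()
        if all(w is None or w == c for w, c in zip(wanted, cell))
    }


apples = slice_cube(cube, product="Apples")  # 3 locations x 4 quarters
apples_denver_q1 = slice_cube(cube, "Apples", "Denver", "Q1")  # one cell
total_detroit = sum(slice_cube(cube, location="Detroit").values())
```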

Page 24: Demo

• eFashion Case
• BusinessObjects Demo

Page 25: BusinessObjects: Semantic Layer

Page 26: Any Access

Page 27: Information and Analysis Needs of Third Parties

Page 28: e/m-Business Intelligence: Extranets

Diagram: customers, partners and suppliers each reach the data warehouse through an extranet

Page 29: Extranet Demos

Page 30: Business Intelligence Theory

Page 31: BI Definition

Business Intelligence is the process of collection, cleansing, combining, consolidation, analysis, interpretation and communication of all internal and available external data, relevant for the decision making process in the organisation

Page 32: BI Concept

Data -> Information -> Knowledge -> Action
Steps in the diagram: Collection, Integration, Analysis, Decisions, Feedback
Axis: Business Value

Page 33: BI Systems

• Reporting & Query
• DSS, MIS and EIS
• OLAP
• Data Mining

Page 34: The Five Functional Levels

Levels: reporting, querying, analysis, exploring, mining
Axes: number of users; complexity of the question; static versus dynamic
Annotations in the diagram:
• standard reports
• 'bunch of reports', 'cube'
• unique 'report' or question
• i.e. finding variables
• i.e. statistical analysis, testing a hypothesis

Page 35: The Five Functional Levels

Levels: reporting, querying, analysis, exploring, mining
Axes: number of users; complexity of the question; static/dynamic; interactive
Annotation: 80 % of all users

Page 36: Corporate Information Factory

Any Source -> Any Data -> Any Access
• Sources: applications, external data
• Load management
• Operational Data Store, Data Warehouse, Data Marts
• Query management
• Access: LAN/WAN, WWW

Page 37: Components of the CIF

• Data Warehouse
• Data Mart
• Operational Data Store
• ETL

Page 38: Data Warehouse

Page 39: Definition (Bill Inmon)

Characteristics of a data warehouse:
• Subject-oriented
• Integrated
• Time-variant
• Non-volatile
• Both summary and detailed data

Page 40: Data Warehouse

• Contains data that can be used to meet the information needs of (part of) the organisation
• Contains integrated data extracted from one or more sources
• Mostly contains large amounts of data
• Contains data that is clean and consistent
• May contain aggregated data
• Optimised for its use

Page 41: Data Warehouse

Database          Data Warehouse
Current           Historical
Internal          Internal and external
Isolated          Integrated
Transactions      Analysis
Normalised        Dimensional
Dirty             Clean and consistent
Detailed          Detailed and summary

Page 42: Data Warehouse

Advantages
• One point of contact
• Time savings
• No loss of historical data
• OLTP systems not hampered by BI activities
• Better consistency and quality of data
• Improvement of Business Intelligence

Page 43: Data Warehouse

Disadvantages
• Never quite up-to-date
• Requires a lot of storage space
• Requires a lot of communication, coordination and cooperation
• Large impact on the organisation
• A data warehouse is only the beginning

Page 44: Data Mart

• DW design does not optimise query performance
• Data is not stored in an optimal fashion for any given department in the DW
• Competition to get the resources required to get inside the DW
• Costs for DSS computing facilities are high because of the large volume in DW

Page 45: Data Mart

Characteristics:
• Customised for a specific department
• Limited amount of history
• Summarised
• Very flexible
• Elegant presentation
• Processor dedicated to the department

Page 46: Data Mart

Divided by:
• Business
• Geography
• Security
• Political (budget)
• Structure (data mining)

Page 47: Data Mart

Three different kinds of data marts:
• Subset/summary
• MOLAP
• ROLAP

Page 48: Operational Data Store

Characteristics:
• Subject-oriented
• Integrated
• Current-valued
• Volatile
• Detailed data

Page 49: ETL: Extraction

Source selection:
• Data model is the starting point: determine the data elements that are needed
• For each data element, determine the available data sources
• If more than one source is available, select on:
  – Quality, reliability and integrity
  – Scope of data
  – Location and availability of data
  – Location and availability of expertise

Page 50: ETL: Transformation

Processing:
• Aggregate records
• Encoding structures
• Simple reformatting
• Mathematical conversion
• Resequencing of data
• Default values
• Key conversion
• Cleansing
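A few of these processing steps can be sketched on a single record. Everything in the example is assumed for illustration: the field names, the gender codes, the date format, the conversion from cents, and the default country "NL" are not taken from the slides.

```python
from datetime import datetime

# Hypothetical source encodings mapped to one standard encoding.
GENDER_CODES = {"M": "male", "F": "female", "1": "male", "2": "female"}


def transform(record):
    """Apply a few typical transformation steps to one source record."""
    return {
        # Encoding structures: map source-specific codes to one standard.
        "gender": GENDER_CODES.get(record.get("gender", "").strip().upper(), "unknown"),
        # Simple reformatting: DD-MM-YYYY text to an ISO date string.
        "sale_date": datetime.strptime(record["sale_date"], "%d-%m-%Y").date().isoformat(),
        # Mathematical conversion: amount in cents to currency units.
        "amount": record["amount_cents"] / 100,
        # Default values: fill in a missing country (assumed default).
        "country": record.get("country") or "NL",
    }


def aggregate(records, key, measure):
    """Aggregate records: sum `measure` per distinct value of `key`."""
    totals = {}
    for rec in records:
        totals[rec[key]] = totals.get(rec[key], 0) + rec[measure]
    return totals


source = [
    {"gender": "m", "sale_date": "31-03-2001", "amount_cents": 1250, "country": None},
    {"gender": "2", "sale_date": "01-04-2001", "amount_cents": 750, "country": "BE"},
]
clean = [transform(r) for r in source]
per_country = aggregate(clean, "country", "amount")
```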

Page 51: ETL: Transformation

Key transformation: key structures A, B and C from the source systems are converted into one new key structure
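Key transformation could look roughly like the sketch below. The three key formats and the target format `CUST` plus eight digits are invented for the example, and it assumes that the numeric part identifies the same customer in every source system, which in practice has to be verified.

```python
def normalise_key(system, raw_key):
    """Convert a source-specific customer key to one new key structure."""
    if system == "A":      # hypothetical format, e.g. "C-000123"
        number = int(raw_key.split("-")[1])
    elif system == "B":    # hypothetical format, e.g. 123
        number = int(raw_key)
    elif system == "C":    # hypothetical format, e.g. "NL/123/X"
        number = int(raw_key.split("/")[1])
    else:
        raise ValueError(f"unknown source system: {system}")
    return f"CUST{number:08d}"


key_a = normalise_key("A", "C-000123")
key_b = normalise_key("B", 123)
key_c = normalise_key("C", "NL/123/X")
```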

Page 52: ETL: Cleansing

Data quality is critical for:
• Marketing communications
• Targeted marketing
• Customer matching
• Retail and commercial householding
• Combining information
• Tracking retail sales
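Customer matching and householding depend on records that agree after normalisation. The sketch below uses a deliberately crude matching key (lower-cased surname plus normalised zip code); the records and the matching rule are hypothetical, and real cleansing tools use far more robust rules.

```python
import re
from collections import defaultdict


def match_key(name, zip_code):
    """Build a crude matching key: normalised surname plus zip code."""
    surname = re.sub(r"[^a-z]", "", name.lower().split()[-1])
    zip_norm = re.sub(r"\s+", "", zip_code).upper()
    return (surname, zip_norm)


def household(customers):
    """Group customer ids that share the same matching key."""
    groups = defaultdict(list)
    for cust in customers:
        groups[match_key(cust["name"], cust["zip"])].append(cust["id"])
    return dict(groups)


customers = [
    {"id": 1, "name": "J. Jansen", "zip": "1234 AB"},
    {"id": 2, "name": "Jan JANSEN", "zip": "1234ab"},
    {"id": 3, "name": "P. de Vries", "zip": "5678 CD"},
]
households = household(customers)
```

Without the normalisation step, records 1 and 2 would not be recognised as belonging together.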

Page 53: ETL: Cleansing

Common excuses for not cleaning:
• The data in the operational systems seem to work just fine
• Data can be joined most of the time
• Cleansing will take place after population of the data warehouse
• Data entry will be improved
• The users will never agree to change their data

Page 54: Multi Dimensional Data Modeling

Page 55: MD Modeling: Contents

• E/R Modeling (Ex.)
• MD Modeling (Ex.)
• Star Schema
• Slowly Changing Dimensions (Ex.)
• Surrogate Keys
• Aggregation (Ex.)
• Measures & Dimensions reviewed
• Other important MDM aspects

Page 56: Exercise: E/R Modeling

What could the sales transaction database of the eFashion retailer look like?

Ticket: Ticket_nr, Store_nr, Card_nr, Employee_nr, Time_Stamp
Loyalty Card: Card_nr, Cust_name, Address, Zip_code, City, ...
Employee: Employee_nr, Emp_name, ...
Products Sold: Ticket_nr, Product_nr, #_products, price, discount
Products: Product_nr, Bar_Code, Prod_Desc, Actual_price, Weight, ...
Store: Store_nr, Store_name, Address, Zip_code, City, State, Manager, ...

Page 57: Management Questions

• Give me the annual revenue of all my product lines divided over all the sales regions over the last 3 years
• Give me the top 10 of most profitable products this year
• Give me the top 10 of most sold products of last year
• Give me the top 10 of most profitable customers
• Compare the YTD revenue with the one in the same period last year and the target

Page 58: Why Not E/R Modeling?

• End users cannot understand, remember, navigate an E/R model (not even with a GUI)
• Software cannot usefully query an E/R model
• Use of E/R modeling doesn't meet the DW purpose: intuitive and high performance querying

Page 59: Exercise: Model the eFashion DM

• Sales Revenue
• Time hierarchy (Year, Quarter, Month)
• Store hierarchy (Region, State, City, Store)
• Product hierarchy (Line, Category, SKU)

Page 60: eFashion Data Mart

Facts: Month_nr, Store_nr, SKU_nr, Sales_revenue, ...
Product: SKU_nr, SKU_desc, Category, Line
Time: Month_nr, Month_desc, Quarter, Year
Geography: Store_nr, Store_name, City, State, Region
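The eFashion data mart above can be tried out as a small star schema. The sketch uses Python's built-in `sqlite3` module with an in-memory database; the table and column names follow the slide (the Time table is called `time_dim` here), while the data types and all sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product   (sku_nr INTEGER PRIMARY KEY, sku_desc TEXT, category TEXT, line TEXT);
CREATE TABLE time_dim  (month_nr INTEGER PRIMARY KEY, month_desc TEXT, quarter TEXT, year INTEGER);
CREATE TABLE geography (store_nr INTEGER PRIMARY KEY, store_name TEXT, city TEXT, state TEXT, region TEXT);
CREATE TABLE facts (
    month_nr INTEGER REFERENCES time_dim(month_nr),
    store_nr INTEGER REFERENCES geography(store_nr),
    sku_nr   INTEGER REFERENCES product(sku_nr),
    sales_revenue REAL,
    PRIMARY KEY (month_nr, store_nr, sku_nr)
);
-- Made-up sample rows.
INSERT INTO product VALUES (1, 'Blue shirt', 'Shirts', 'Tops'), (2, 'Jeans', 'Trousers', 'Bottoms');
INSERT INTO time_dim VALUES (200101, 'Jan 2001', 'Q1', 2001), (200102, 'Feb 2001', 'Q1', 2001);
INSERT INTO geography VALUES (10, 'Store A', 'Austin', 'Texas', 'South'),
                             (20, 'Store B', 'Boston', 'Massachusetts', 'East');
INSERT INTO facts VALUES (200101, 10, 1, 100.0), (200102, 10, 1, 50.0),
                         (200101, 20, 2, 200.0), (200102, 20, 1, 25.0);
""")

# "Sales revenue per product line per region per year."
rows = conn.execute("""
    SELECT p.line, g.region, t.year, SUM(f.sales_revenue)
    FROM facts f
    JOIN product p   ON p.sku_nr = f.sku_nr
    JOIN time_dim t  ON t.month_nr = f.month_nr
    JOIN geography g ON g.store_nr = f.store_nr
    GROUP BY p.line, g.region, t.year
    ORDER BY p.line, g.region
""").fetchall()
```

Every management question of this shape becomes one join per dimension plus a GROUP BY on the hierarchy levels asked for.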

Page 61: DW Modeling Components

• Dimension tables: Geographic, Product, Time
• Fact table: facts with the measures Units and $

Page 62: Using a Star Schema

Fact table
Sales_Fact: TimeKey, EmployeeKey, ProductKey, CustomerKey, ShipperKey, $...

Dimension tables
Time_Dim: TimeKey, TheDate...
Employee_Dim: EmployeeKey, EmployeeID...
Product_Dim: ProductKey, ProductID...
Customer_Dim: CustomerKey, CustomerID...
Shipper_Dim: ShipperKey, ShipperID...

Page 63: Components of a Star Schema

Sales_Fact: TimeKey, EmployeeKey, ProductKey, CustomerKey, ShipperKey, $...
Time_Dim: TimeKey, TheDate...
Employee_Dim: EmployeeKey, EmployeeID...
Product_Dim: ProductKey, ProductID...
Customer_Dim: CustomerKey, CustomerID...
Shipper_Dim: ShipperKey, ShipperID...

Labels in the diagram:
• Multipart key: TimeKey, EmployeeKey, ProductKey, CustomerKey and ShipperKey in Sales_Fact
• Measures: $...
• Dimensional keys: the key of each dimension table

Page 64: Exercise: Slowly Changing Dimensions

Suppose the product categories change from time to time. Model the Data Mart when the manager wants to see historical reports against:
1. The present categories
2. The categories at the time of the sale
3. Both the present categories and the immediate previous categories
4. The categories at any specified time

Page 65: SCD Exercise 1

Facts: Month_nr, Store_nr, SKU_nr, Sales_revenue, ...
Product: SKU_nr, SKU_desc, Category, Line
Time: Month_nr, Month_desc, Quarter, Year
Geography: Store_nr, Store_name, City, State, Region

Page 66: SCD Exercise 2

Facts: Month_nr, Store_nr, Product_key, Sales_revenue, ...
Product: Product_key, SKU_nr, SKU_desc, Category, Line
Time: Month_nr, Month_desc, Quarter, Year
Geography: Store_nr, Store_name, City, State, Region
Most Recent Product Key Map: Product_key, SKU_nr

Page 67: SCD Exercise 3

Facts: Month_nr, Store_nr, Product_key, Sales_revenue, ...
Product: Product_key, SKU_nr, SKU_desc, Category, Category_old, Line
Time: Month_nr, Month_desc, Quarter, Year
Geography: Store_nr, Store_name, City, State, Region

Page 68: SCD Exercise 4

Facts: Month_nr, Store_nr, SKU_nr, Sales_revenue, ...
Product: SKU_nr, SKU_desc, Category, Line, Valid_from, Valid_until
Time: Month_nr, Month_desc, Quarter, Year
Geography: Store_nr, Store_name, City, State, Region
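With Valid_from and Valid_until on the Product dimension, "the categories at any specified time" becomes a date-range lookup. A minimal sketch, assuming closed date ranges, a far-future end date for the current version, and made-up category names:

```python
from datetime import date

# Hypothetical dimension rows: one row per version of a product.
PRODUCT_DIM = [
    {"sku_nr": 1, "category": "Casual",
     "valid_from": date(2000, 1, 1), "valid_until": date(2000, 12, 31)},
    {"sku_nr": 1, "category": "Sportswear",
     "valid_from": date(2001, 1, 1), "valid_until": date(9999, 12, 31)},
]


def category_at(sku_nr, on_date):
    """Return the category valid for `sku_nr` on `on_date`, or None."""
    for row in PRODUCT_DIM:
        if row["sku_nr"] == sku_nr and row["valid_from"] <= on_date <= row["valid_until"]:
            return row["category"]
    return None
```

For example, `category_at(1, date(2000, 6, 15))` picks the first version and a date in 2002 picks the second.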

Page 69: Slowly Changing Dimensions

• Type 1: Overwrite the dimension record
• Type 2: Create a new dimension record
• Type 3: Create an 'old' field in the dimension record
• Type 4: Add a valid_from and a valid_until field to the dimension record

Note on Type 2: requires surrogate keys, but in general one should always use these for performance and flexibility.
Note on Type 4: Kimball recognises only three types of SCD.
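The first three types differ only in what happens to the dimension row when a category changes. The sketch below applies each type to the same starting row; the function names, the row layout and the category values are hypothetical.

```python
import copy


def scd_type1(dim, sku_nr, new_category):
    """Type 1: overwrite the category in place; history is lost."""
    for row in dim:
        if row["sku_nr"] == sku_nr:
            row["category"] = new_category


def scd_type2(dim, sku_nr, new_category):
    """Type 2: add a new row under a new surrogate key; old rows stay."""
    current = max((r for r in dim if r["sku_nr"] == sku_nr),
                  key=lambda r: r["product_key"])
    new_row = dict(current, category=new_category,
                   product_key=max(r["product_key"] for r in dim) + 1)
    dim.append(new_row)
    return new_row["product_key"]


def scd_type3(dim, sku_nr, new_category):
    """Type 3: keep only the immediately previous value in an 'old' field."""
    for row in dim:
        if row["sku_nr"] == sku_nr:
            row["category_old"] = row["category"]
            row["category"] = new_category


base = [{"product_key": 1, "sku_nr": 100, "category": "Casual"}]
t1, t2, t3 = copy.deepcopy(base), copy.deepcopy(base), copy.deepcopy(base)
scd_type1(t1, 100, "Sportswear")
new_key = scd_type2(t2, 100, "Sportswear")
scd_type3(t3, 100, "Sportswear")
```

In the Type 2 case, facts loaded after the change would reference `new_key`, so older facts keep pointing at the category that applied at the time of the sale.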

Page 70: Always Use Surrogate Keys

• Allows the DWH to assign new key versions for SCDs (type 2)
• Higher performance with numeric keys than with long, alphanumeric keys

Page 71: Exercise: Aggregation

Suppose the manager queries frequently on product line level and finds the performance too low.

Question: How to model the data mart when we want to add aggregated measures on product line level?

Page 72: Exercise: Aggregation

Facts: Month_nr, Store_nr, Product_key, Sales_revenue, ...
Product: Product_key, SKU_nr, SKU_desc, Category, Line
Time: Month_nr, Month_desc, Quarter, Year
Geography: Store_nr, Store_name, City, State, Region
Aggregated Facts: Week_nr, Store_nr, Line_key, Sales_revenue, ...
Product_Line: Line_key, Line
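An aggregated fact table is simply the base facts summed to a coarser level and stored in advance. The sketch below rolls the product dimension up to Line; it deviates from the slide in two labelled ways: it keeps the month grain of the base facts and uses the line name directly instead of a Line_key. All rows are made up.

```python
from collections import defaultdict

# Hypothetical product dimension: product_key -> attributes.
PRODUCT = {1: {"line": "Tops"}, 2: {"line": "Tops"}, 3: {"line": "Bottoms"}}
# Hypothetical base facts: (month_nr, store_nr, product_key, sales_revenue).
FACTS = [
    (200101, 10, 1, 100.0),
    (200101, 10, 2, 40.0),
    (200101, 10, 3, 60.0),
    (200102, 10, 1, 30.0),
]


def build_line_aggregate(facts, product):
    """Pre-compute revenue per (month, store, product line)."""
    agg = defaultdict(float)
    for month_nr, store_nr, product_key, revenue in facts:
        agg[(month_nr, store_nr, product[product_key]["line"])] += revenue
    return dict(agg)


AGG_FACTS = build_line_aggregate(FACTS, PRODUCT)
```

Queries at product line level can then read `AGG_FACTS` instead of scanning and joining the detailed facts.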

Page 73: Exercise: Measures

Add the following measures to the eFashion Data Mart:
• Stock Quantity
• Product Price
• Promotion Costs (product-specific, store-independent)

Page 74: Exercise: Measures

Facts: Month_nr, Store_nr, Product_key, Sales_revenue, Stock_qty
Product: Product_key, SKU_nr, SKU_desc, Price, Category, Line, (Valid_from, Valid_until)
Time: Month_nr, Month_desc, Quarter, Year
Geography: Store_nr, Store_name, City, State, Region
Promotion Facts: Month_nr, SKU_nr, Promotion_cost, Duration, Promotion_type, ...
Q_Stock Facts: Quarter, Store_nr, SKU_nr, Stock_qty (av, eom)
Time levels: Month, Quarter, Year
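The "(av, eom)" next to Stock_qty hints at why stock needs its own quarterly fact table: revenue can be summed over months, stock cannot. A small sketch with made-up monthly figures for one store and one product, reading "av" as average and "eom" as end of month (an interpretation, not stated on the slide):

```python
# Hypothetical monthly figures for one store and one product.
MONTHLY = [
    {"month": 1, "sales_revenue": 100.0, "stock_qty": 50},
    {"month": 2, "sales_revenue": 120.0, "stock_qty": 40},
    {"month": 3, "sales_revenue": 90.0, "stock_qty": 30},
]

# Additive over time: quarterly revenue is the sum of the monthly revenues.
q_revenue = sum(m["sales_revenue"] for m in MONTHLY)

# Not additive over time: summing stock would count units that were never
# in the store together, so take the average or the end-of-period value.
q_stock_avg = sum(m["stock_qty"] for m in MONTHLY) / len(MONTHLY)
q_stock_end = max(MONTHLY, key=lambda m: m["month"])["stock_qty"]
```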

Page 75: Measures & Dimensions Reviewed

The most useful measures are:
• Numeric
• Additive

Dimensions are:
• The natural entry points of the facts
• I.e., used for constraints and report breaks
• Independent of each other, not hierarchically related

Page 76: Other Important MDM Aspects

• Cardinality
• Grain
• Referential Integrity
• Conformed Dimensions
• Drill Across
• Traps

Page 77: How to Make the CIF?

Any Source -> Any Data -> Any Access
• Sources: applications, external data
• Load management
• Operational Data Store, Data Warehouse, Data Marts
• Query management
• Access: LAN/WAN, WWW

Page 78: CGE&Y BI-Approach Overview

Diagram labels: Awareness; Strategy & objectives; DW blueprint; Source data; Metamodel; Data Warehouse Architecture I; Definition; Increments; Extraction, Transformation, Load Development; Implementation; Incremental Delivery; Evolutionary Strategy; Data Warehouse Architecture II; Project Management; Communication


Page 80: Data Mining

Page 81: Data Mining

Definition:

The process of digging intelligently into large volumes of data to discover and analyse previously unknown relationships or to validate hypotheses.

Page 82: Data Mining Versus OLAP

OLAP/Query: Are there some customers from large accounts with a high decrease in international calls?

Data Mining: Are there any common characteristics among these customers?

Data -> Information

Page 83: 1 Business Intelligence & Data Warehousing Tom A. Fürstenberg Business Intelligence Consultant Cap Gemini Ernst & Young

Applications

• Risk Analysis (granting credit, investment)
• Fraud Detection (telephone charges, bank withdrawals)
• Troubleshooting and Diagnosis
• Process Control (wafer fabrication)
• Promotion Analysis
• Bankruptcy Prediction (mortgage lending, business partners)
• Customer Churn (telco)
• CRM (next slides)

Page 84: 1 Business Intelligence & Data Warehousing Tom A. Fürstenberg Business Intelligence Consultant Cap Gemini Ernst & Young

Maximizing Customer Value

• Getting more prospects in
• Turning prospects into customers
• Selling more products to existing customers
• Getting fewer customers out

Page 85: 1 Business Intelligence & Data Warehousing Tom A. Fürstenberg Business Intelligence Consultant Cap Gemini Ernst & Young

Which ones in and which ones out?

[Figure: customer profitability chart. Axis: yield per individual customer, from highest to lowest. Curves: yield per customer; costs per customer; customer profitability. Segments: Migration; Keep; Growth; Out.]
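A minimal sketch of the calculation behind such a chart, assuming profitability is simply yield minus costs per customer. The customer names and amounts are hypothetical, and treating a negative result as a candidate for the "Out" segment is an assumption for illustration; the slide does not define where the segment boundaries lie.

```python
def rank_by_profitability(customers):
    """Return (name, profit) pairs sorted from highest to lowest profit."""
    profits = [(c["name"], c["yield"] - c["cost"]) for c in customers]
    return sorted(profits, key=lambda pair: pair[1], reverse=True)


# Hypothetical yearly figures per customer.
customers = [
    {"name": "A", "yield": 1200, "cost": 300},
    {"name": "B", "yield": 400, "cost": 450},
    {"name": "C", "yield": 800, "cost": 500},
]

ranking = rank_by_profitability(customers)

# Assumption: customers who cost more than they yield are reviewed first.
unprofitable = [name for name, profit in ranking if profit < 0]
print(ranking, unprofitable)
```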

Page 86: 1 Business Intelligence & Data Warehousing Tom A. Fürstenberg Business Intelligence Consultant Cap Gemini Ernst & Young

Example: One to One Marketing

• Treat different customers differently
  – differentiate message
  – differentiate product offer
  – differentiate channel
• Need for usable information => predict customer behavior from databases

Page 87: 1 Business Intelligence & Data Warehousing Tom A. Fürstenberg Business Intelligence Consultant Cap Gemini Ernst & Young
Page 88: 1 Business Intelligence & Data Warehousing Tom A. Fürstenberg Business Intelligence Consultant Cap Gemini Ernst & Young
Page 89: 1 Business Intelligence & Data Warehousing Tom A. Fürstenberg Business Intelligence Consultant Cap Gemini Ernst & Young


Page 90: 1 Business Intelligence & Data Warehousing Tom A. Fürstenberg Business Intelligence Consultant Cap Gemini Ernst & Young


Page 91: 1 Business Intelligence & Data Warehousing Tom A. Fürstenberg Business Intelligence Consultant Cap Gemini Ernst & Young


Page 92: 1 Business Intelligence & Data Warehousing Tom A. Fürstenberg Business Intelligence Consultant Cap Gemini Ernst & Young


Page 93: 1 Business Intelligence & Data Warehousing Tom A. Fürstenberg Business Intelligence Consultant Cap Gemini Ernst & Young

Example: clickstream analysis

• What parts of our Web site get the most visitors?
• What parts of the Web site do we associate most frequently with actual sales?
• What parts of the Web site are superfluous or visited infrequently?
• Which pages on our Web site seem to be "session killers," where the remote user stops the session and leaves?
• What is the new-visitor click profile on our site?
• What is the click profile of an existing customer? A profitable customer? A complaining customer that all too frequently returns our product?
• What is the click profile of a customer about to cancel our service, complain, or sue us?
• How can we induce the customer to register with our site so we learn some useful information about that customer?
• How many visits do unregistered customers typically make with us before they are willing to register? Before they buy a product or service?
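Two of the questions above (most-visited pages and "session killers") can be sketched with simple counting. This assumes the web logs have already been grouped into sessions, each an ordered list of page names; the session data and page names below are hypothetical. The exit rate used here (sessions ending on a page divided by views of that page) is one possible measure, not one prescribed by the slides.

```python
from collections import Counter

# Hypothetical sessions: each is the ordered list of pages one visitor viewed.
sessions = [
    ["home", "products", "cart", "checkout"],
    ["home", "products", "pricing"],
    ["home", "pricing"],
    ["products", "pricing"],
    ["home", "pricing", "products"],
]


def page_stats(sessions):
    """Count page views and how often each page was the last one in a session."""
    views = Counter()
    exits = Counter()
    for pages in sessions:
        if not pages:
            continue  # skip empty sessions
        views.update(pages)
        exits[pages[-1]] += 1
    return views, exits


def exit_rates(sessions):
    """Share of views of each page that ended the session."""
    views, exits = page_stats(sessions)
    return {page: exits[page] / views[page] for page in views}


views, exits = page_stats(sessions)
rates = exit_rates(sessions)
print(views.most_common(), rates)
```

A high exit rate is not always a problem: leaving after a checkout page is expected, so the result still needs interpretation by someone who knows the site.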

Page 94: 1 Business Intelligence & Data Warehousing Tom A. Fürstenberg Business Intelligence Consultant Cap Gemini Ernst & Young

Customized Customer Service

[Figure labels: Tele-sales; Service desk]

Page 95: 1 Business Intelligence & Data Warehousing Tom A. Fürstenberg Business Intelligence Consultant Cap Gemini Ernst & Young

Example: Contact Strategy

[Figure: contact strategy diagram. Labels: Customer Data; Datamining; Channel optimisation; Tele-sales; Direct Mail; Sales visit; Good; Bad.]

Page 96: 1 Business Intelligence & Data Warehousing Tom A. Fürstenberg Business Intelligence Consultant Cap Gemini Ernst & Young

The customer chooses the channel

[Figure: customer contact channels diagram. Labels: Organisation!; Complaint handling; Leaflet receipt; Status service question; Status order; Operational systems; Integration; Analysis; Service question; CC; App.; Leaflet request; Contact; Order; Complaint; Email.]

Page 97: 1 Business Intelligence & Data Warehousing Tom A. Fürstenberg Business Intelligence Consultant Cap Gemini Ernst & Young

Data Sources for Data Mining

[Figure labels: DATA; Collecting & Cleansing]

• Transactions (loyalty cards)
• Behaviour of existing customers
• Logfiles & cookies
• Market research
• Data suppliers
• Public data

Page 98: 1 Business Intelligence & Data Warehousing Tom A. Fürstenberg Business Intelligence Consultant Cap Gemini Ernst & Young

Example: Affinity Grouping

• Market Basket: what items are sold together?
• Market Basket: what categories are sold with what items?
• Market Basket: what is not sold with certain items?
• Event Correlations: what other services are bought in the first month after signing up for a satellite TV subscription?
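The market basket questions can be sketched as pair counting. The baskets and item names below are hypothetical, and the confidence measure (share of baskets containing one item that also contain another) is a standard market basket metric that the slides do not name explicitly. A pair count of zero answers the "what is not sold with certain items" question.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets: the set of items in each sales transaction.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
]


def pair_counts(baskets):
    """Count how often each unordered pair of items appears in the same basket."""
    counts = Counter()
    for basket in baskets:
        # Sorting gives each pair one canonical (alphabetical) key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


def confidence(baskets, antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)


counts = pair_counts(baskets)
print(counts.most_common(2))
print(confidence(baskets, "bread", "butter"))
```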

Page 99: 1 Business Intelligence & Data Warehousing Tom A. Fürstenberg Business Intelligence Consultant Cap Gemini Ernst & Young

Data Mining Techniques

• Decision Trees, Classification Trees, Rule Induction
• Neural Nets
• Visualisation
• Fuzzy Logic; Nearest Neighbour; Memory Based Reasoning; Case Based Reasoning
• Proprietary Logic
• Classical Statistics
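As an illustration of one technique from the list, here is a minimal nearest neighbour sketch. The training points, the two numeric features and the labels "stays" and "churns" are hypothetical; a practical version would also scale the features and validate the choice of `k`.

```python
import math
from collections import Counter


def nearest_neighbour_predict(training, query, k=3):
    """Predict a label by majority vote among the k training points closest to `query`."""
    by_distance = sorted(training, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training data: (feature vector, known outcome).
training = [
    ((1.0, 1.0), "stays"),
    ((1.5, 2.0), "stays"),
    ((2.0, 1.5), "stays"),
    ((8.0, 8.0), "churns"),
    ((8.5, 9.0), "churns"),
    ((9.0, 8.5), "churns"),
]

print(nearest_neighbour_predict(training, (1.2, 1.4)))
print(nearest_neighbour_predict(training, (8.7, 8.2)))
```

The method keeps all known cases and classifies a new case by looking up the most similar ones, which is why it appears alongside memory based and case based reasoning in the list.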

Page 100: 1 Business Intelligence & Data Warehousing Tom A. Fürstenberg Business Intelligence Consultant Cap Gemini Ernst & Young

Data Mining Techniques

[Figure: techniques plotted by predictive power against simplicity. Labels: Statistical analysis; Neural networks; Genetic algorithms; Decision trees; Intuition.]

Page 101: 1 Business Intelligence & Data Warehousing Tom A. Fürstenberg Business Intelligence Consultant Cap Gemini Ernst & Young

Critical Success Factors

• Data availability (large amounts of a wide variety of data)
• Data consistency
• Data quality
• Domain expertise
• Use of the required data is permitted by privacy laws

Page 102: 1 Business Intelligence & Data Warehousing Tom A. Fürstenberg Business Intelligence Consultant Cap Gemini Ernst & Young

Benefits

• Improved customer relationships
• More revenue from existing customers
• Market segmentation
• Differentiated products and services
• Differentiated sales channels
• More effective marketing programs
• Improved fraud detection
• Improved investments
• …

Page 103: 1 Business Intelligence & Data Warehousing Tom A. Fürstenberg Business Intelligence Consultant Cap Gemini Ernst & Young

Decision Tree with BusinessMiner from BusinessObjects

Demo

Page 104: 1 Business Intelligence & Data Warehousing Tom A. Fürstenberg Business Intelligence Consultant Cap Gemini Ernst & Young

Contact information

Tom A. Fürstenberg
Business Intelligence Consultant
Cap Gemini Ernst & Young
Sector Energy, Products & Transport
Tel +31 6 21 878 915
email: [email protected]