1
Business Intelligence & Data Warehousing
Tom A. Fürstenberg
Business Intelligence Consultant
Cap Gemini Ernst & Young
2
Learning objectives of this lecture
• What are BI & DWH? (conceptual and technical)
• Application of BI & DWH
• The practice of a consultant in general, and at Cap Gemini Ernst & Young in particular
3
Lecture contents
• Performance Management
• Business Intelligence (Performance Measurement)
• OLAP
• Extranets
• Architecture
• Data Warehouse
• ETL
• Multidimensional Modelling
• CGE&Y Approach
• Data Mining
4
Performance Management
Goal-oriented measurement and steering of business objectives
5
In control of a company
6
Overview
Conceptual control model
Strategy & mission
Responsibilities & authorities
Operational control model
Control system
Goals, means, constraints
Key Performance Indicators, Critical Success Indicators, External Indicators
Information model
Information system
Information provision
Data warehouse
data
7
Control vision: building blocks for controlling organisations
Why? Stakeholders
Who?
Objectives & performance indicators
What?
Methods, systems: How?
Values & norms
Strategy & mission
Organisation
8
Methods
INK management model
• Leadership
• Personnel policy
• Policy & strategy
• Resource management
• Processes
• Appreciation by personnel
• Appreciation by customers
• Appreciation by society
• Business results
Balanced Scorecard (an objective and a performance indicator per perspective)
• Financial Perspective
• Internal Business Perspective
• Customer Perspective
• Innovation & Learning Perspective
Various financial models
9
Towards an operational control model
[Diagram: one KPI, supported by four CSIs and two OIs]
After the goals and KPIs have been established, the Critical Success Indicators (CSIs) are determined, followed by the environmental indicators (OIs, from the Dutch "Omgevingsindicatoren").
10
Towards an operational control model
[Diagram: KPIs, CSIs and OIs, with the dimensions Time, Region, Product, Department and Market]
11
From model to behavioural change
[Diagram: two sets of KPIs, CSIs and OIs]
• Management charter
• Operational control model
• Assessment and steering
• Information provision
• Planning and commitment
• Multidimensional data structure
• Responsibilities and authorities
12
Some typical management questions

PRODUCT
• How much have we sold?
• Which product gives the best profit?
• Which product has the largest sales volume this quarter?
• Which product best meets market needs?
• How much to produce of each product?

CUSTOMER
• Who is the most profitable customer?
• What is the satisfaction level?
• Which are the best segments?
• Which service to improve?
• How many customers did we lose last year?
• Who are our biggest accounts?

CHANNEL
• Which retailer yields most by volume and which by profit?
• What promotions will yield most profit?
• What effect will discounts have on the turnover?

MARKETING
• What are the area coverage levels?
• How many contacted people became a customer?
• What were the results of promotions?
• What is the competition doing?
13
Key Performance Indicators Top 10
Source: Results FIND! The Best benchmark study conducted in 1997/1998 by Ernst & Young Consulting and VU. 103 industrial companies participated in the study.

Purchasing                                              # hits
 1  development of purchase prices                          86
 2  stock level by item or product line                     84
 3  reliability of suppliers; quality (zero-defects)        81
 4  financial position of suppliers                         81
 5  reliability of suppliers; time to delivery              79
 6  reliability of suppliers; completeness                  74
 7  purchasing department costs                             72
 8  number of goods returned                                48
 9  cash discounts                                          47
10  accounts payable as a % of purchasing value             39

Production                                              # hits
 1  overhead costs                                          94
 2  direct costs of materials                               93
 3  direct costs of labour                                  92
 4  factory overhead expenses                               91
 5  quality finished product                                91
 6  sick days                                               91
 7  overtime                                                90
 8  stock of raw materials                                  90
 9  production volume                                       89
10  maintenance costs                                       89

Sales                                                   # hits
 1  sales volume or sales volume growth                     95
 2  depreciation of accounts receivable                     92
 3  number of buyer complaints                              92
 4  quality of deliveries                                   88
 5  marketing expenses                                      87
 6  freight costs                                           86
 7  value of new orders                                     85
 8  accounts receivable as a % of turnover                  85
 9  reliability of deliveries; time to delivery             84
10  market growth                                           83

Finance                                                 # hits
 1  Earnings Before Interest & Taxes (EBIT)                104
 2  Gross margin                                           103
 3  Profit Before Tax (PBT)                                102
 4  Gross investments                                      102
 5  working capital                                        101
 6  cash flow                                               94
 7  liquidity position                                      92
 8  Return on Capital Employed                              77
 9  Return on Sales                                         75
10  turnover time of goods in stock                         75
14
And now all that remains is to measure…
Business Intelligence (performance measurement)
15
The Answers
The information is there, but spread everywhere!
16
In practice...
17
Problems
• IT department (over)loaded with queries
• Long lead time for report 'production'
• High cost in man-hours
• Data sources are hard to integrate
• Non-standardised reports
• No unambiguous definitions
• Error-prone
• Open to manipulation
• Dependence on 'links' in the chain
• Discussions about differences in figures
• Limited analysis options
• Wrong and late interpretations, conclusions and decisions
• ...
18
At the push of a button...
19
From chaos... to structure
20
Why now? Hype? Developments:
Market Pull
• Globalisation of markets
• Individualisation of customers
• Shorter life cycle of products
• Information overload
• Mergers
Technology Push
• Faster hardware
• Cheaper disk capacity
• Modern OLAP tools
• Any access: c/s, web, mobile
21
OnLine Analytical Processing
• Based on the syntax of management information questions:
<measure> per <dim1> per <dim2> per ...
• KPIs, CSIs and OIs are measures
• Product, Region, Customer, Time, etc. are dimensions (slice & dice)
• Dimensions have hierarchies (drill down)
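The question syntax "<measure> per <dim1> per <dim2>" can be illustrated with a few lines of plain Python. This is only a minimal sketch: the fact rows, the field names and the helper measure_per are invented for illustration and do not come from any OLAP product.

```python
from collections import defaultdict

# Hypothetical fact rows: dimension values plus one measure (revenue).
SALES = [
    {"product": "Apples", "region": "Atlanta", "quarter": "Q1", "revenue": 100},
    {"product": "Apples", "region": "Denver", "quarter": "Q1", "revenue": 50},
    {"product": "Pears", "region": "Atlanta", "quarter": "Q1", "revenue": 70},
    {"product": "Pears", "region": "Atlanta", "quarter": "Q2", "revenue": 30},
]


def measure_per(rows, measure, *dims, **slice_on):
    """Sum `measure` per combination of `dims`, using only rows that match `slice_on`."""
    totals = defaultdict(int)
    for row in rows:
        # Slice: keep only the rows with the requested dimension values.
        if all(row[dim] == value for dim, value in slice_on.items()):
            # Dice: group by the requested dimensions.
            totals[tuple(row[dim] for dim in dims)] += row[measure]
    return dict(totals)


# <revenue> per <product>
revenue_per_product = measure_per(SALES, "revenue", "product")
# <revenue> per <product> per <region>
revenue_per_product_region = measure_per(SALES, "revenue", "product", "region")
# <revenue> per <quarter>, sliced on region = Atlanta
atlanta_per_quarter = measure_per(SALES, "revenue", "quarter", region="Atlanta")
```

A drill down would follow the same pattern: replace a dimension level (for example year) with the next lower level of its hierarchy (quarter, then month) in the list of dimensions.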
22
OLAP
• Product Manager's View
• Financial Manager's View
• Regional Manager's View
• Ad Hoc View
[Figure labels: Product, Time]
23
Introduction to Cubes
[Figure: Sales cube with three dimensions: Time (Q1, Q2, Q3, Q4), Product (Grapes, Apples, Melons, Cherries, Pears) and Location (Atlanta, Denver, Detroit)]
24
Demo
• eFashion Case
• BusinessObjects Demo
25
BusinessObjects: Semantic Layer
26
Any Access
27
Info and analysis needs at third parties
28
e/m-Business Intelligence: Extranets
[Diagram: customers, partners and suppliers each connect to the data warehouse through an extranet]
29
Extranet demos
30
Business Intelligence Theory
31
BI Definition
Business Intelligence is the process of collection, cleansing, combining, consolidation, analysis, interpretation and communication of all internal and available external data relevant to the decision-making process in the organisation.
32
BI Concept
[Diagram: Data, Information, Knowledge, Action; process labels: Collection, Integration, Analysis, Decisions, Feedback; axis: Business Value]
33
BI Systems
• Reporting & Query
• DSS, MIS and EIS
• OLAP
• Data Mining
34
The Five Functional Levels
[Figure: complexity of the question plotted against number of users, ranging from static to dynamic]
Levels as listed on the slide: mining, exploring, analysis, reporting, querying
Annotations:
• standard reports
• 'bunch of reports', 'cube'
• unique 'report' or question
• i.e. finding variables
• i.e. statistical analysis, testing a hypothesis
35
The Five Functional Levels
[Figure: complexity of the question plotted against number of users; labels: static/dynamic, interactive]
Levels as listed on the slide: mining, exploring, analysis, reporting, querying
Annotation: 80% of all users
36
Corporate Information Factory
[Diagram labels: Any Source, Any Data, Any Access; Load Management, Query Management; External data, Operational Data Store, Data Warehouse, Data Marts; Applications; LAN/WAN, WWW]
37
Components of the CIF
• Data Warehouse
• Data Mart
• Operational Data Store
• ETL
38
Data Warehouse
39
Definition (Bill Inmon)
Characteristics of a data warehouse:
• Subject-oriented
• Integrated
• Time-variant
• Non-volatile
• Both summary and detailed data
40
Data Warehouse
• Contains data that can be used to meet the information needs of (part of) the organisation
• Contains integrated data extracted from one or more sources
• Mostly contains large amounts of data
• Contains data that is clean and consistent
• May contain aggregated data
• Optimised for its use
41
Data Warehouse
Database         Data Warehouse
Current          Historical
Internal         Internal and external
Isolated         Integrated
Transactions     Analysis
Normalised       Dimensional
Dirty            Clean and consistent
Detailed         Detailed and summary
42
Data Warehouse
Advantages
• One point of contact
• Time savings
• No loss of historical data
• OLTP systems not hampered by BI activities
• Better consistency and quality of data
• Improvement of Business Intelligence
43
Data Warehouse
Disadvantages
• Never quite up-to-date
• Requires a lot of storage space
• Requires a lot of communication, coordination and cooperation
• Large impact on the organisation
• A data warehouse is only the beginning
44
Data Mart
• DW design does not optimise query performance
• Data is not stored in an optimal fashion for any given department in the DW
• Competition to get the resources required to get inside the DW
• Costs for DSS computing facilities are high because of the large volume in DW
45
Data Mart
Characteristics:
• Customised for a specific department
• Limited amount of history
• Summarised
• Very flexible
• Elegant presentation
• Processor dedicated to the department
46
Data Mart
Divided by:
• Business
• Geography
• Security
• Political (budget)
• Structure (data mining)
47
Data Mart
Three different kinds of data marts:
• Subset/summary
• MOLAP
• ROLAP
48
Operational Data Store
Characteristics:
• Subject-oriented
• Integrated
• Current-valued
• Volatile
• Detailed data
49
ETL: Extraction
Source selection:
• Data model is the starting point: determine the data elements that are needed
• For each data element, determine the available data sources
• If more than one source is available, select on:
– Quality, reliability and integrity
– Scope of data
– Location and availability of data
– Location and availability of expertise
50
ETL: Transformation
Processing:
• Aggregate records
• Encoding structures
• Simple reformatting
• Mathematical conversion
• Resequencing of data
• Default values
• Key conversion
• Cleansing
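A few of the processing steps listed above (encoding structures, simple reformatting, mathematical conversion, default values) can be sketched as a single record-level function. The field names, the gender code table and the default country "NL" are assumptions made up for this illustration; real source systems will differ.

```python
from datetime import datetime

# Hypothetical encodings of the same attribute in different source systems.
GENDER_CODES = {"m": "M", "male": "M", "1": "M", "f": "F", "female": "F", "0": "F"}


def transform(record):
    """Return a cleaned copy of one source record (all field names are assumptions)."""
    # Encoding structures: map every source code onto one warehouse code.
    raw_gender = str(record.get("gender", "")).strip().lower()
    gender = GENDER_CODES.get(raw_gender, "U")  # "U" = unknown
    # Simple reformatting: DD-MM-YYYY text becomes an ISO date string.
    sale_date = datetime.strptime(record["date"], "%d-%m-%Y").date().isoformat()
    # Mathematical conversion: amounts in cents become amounts in whole units.
    amount = round(record["amount_cents"] / 100, 2)
    # Default values: fill in a country when the source leaves it empty.
    country = record.get("country") or "NL"
    return {"gender": gender, "sale_date": sale_date, "amount": amount, "country": country}


cleaned = transform({"gender": " Male ", "date": "31-12-2001", "amount_cents": 1999, "country": ""})
```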
51
ETL: Transformation
[Diagram: key transformation. Source keys in key structures A, B and C are converted into one new key structure]
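One possible way to picture key transformation is a mapping table that hands out one new warehouse key per source key. The sketch below assumes that a source key is identified by the pair (source system, key value) and that keys are normalised by trimming and upper-casing; the class name KeyMap and these rules are illustrative only. Recognising that two different source systems describe the same customer needs matching rules that are not shown here.

```python
import itertools


class KeyMap:
    """Assigns one new warehouse key per (source system, source key) pair."""

    def __init__(self):
        self._next = itertools.count(1)  # simple integer key generator
        self._keys = {}

    def warehouse_key(self, source, source_key):
        # Normalise the source key so formatting differences do not create duplicates.
        pair = (source, str(source_key).strip().upper())
        if pair not in self._keys:
            self._keys[pair] = next(self._next)
        return self._keys[pair]


key_map = KeyMap()
k1 = key_map.warehouse_key("A", "cust-001")   # key structure A: text code
k2 = key_map.warehouse_key("B", 1)            # key structure B: number
k3 = key_map.warehouse_key("A", "CUST-001 ")  # same source key as k1, formatted differently
```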
52
ETL: Cleansing
Data quality is critical for:
• Marketing communications
• Targeted marketing
• Customer matching
• Retail and commercial householding
• Combining information
• Tracking retail sales
53
ETL: Cleansing
Common excuses for not cleaning:
• The data in the operational systems seem to work just fine
• Data can be joined most of the time
• Cleansing will take place after population of the data warehouse
• Data entry will be improved
• The users will never agree to change their data
54
Multi Dimensional Data Modeling
55
MD Modeling: Contents
• E/R Modeling (Ex.)
• MD Modeling (Ex.)
• Star Schema
• Slowly Changing Dimensions (Ex.)
• Surrogate Keys
• Aggregation (Ex.)
• Measures & Dimensions reviewed
• Other important MDM aspects
56
Exercise: E/R Modeling
What could the sales transaction database of the eFashion retailer look like?
Ticket: Ticket_nr, Store_nr, Card_nr, Employee_nr, Time_Stamp
Loyalty Card: Card_nr, Cust_name, Address, Zip_code, City, ...
Employee: Employee_nr, Emp_name, ...
Products Sold: Ticket_nr, Product_nr, #_products, price, discount
Products: Product_nr, Bar_Code, Prod_Desc, Actual_price, Weight, ...
Store: Store_nr, Store_name, Address, Zip_code, City, State, Manager, ...
57
Management Questions
• Give me the annual revenue of all my product lines divided over all the sales regions over the last 3 years
• Give me the top 10 of most profitable products this year
• Give me the top 10 of most sold products of last year
• Give me the top 10 of most profitable customers
• Compare the YTD revenue with the one in the same period last year and the target
58
Why not E/R Modeling?
• End users cannot understand, remember or navigate an E/R model (not even with a GUI)
• Software cannot usefully query an E/R model
• Use of E/R modeling doesn't meet the DW purpose: intuitive and high-performance querying
59
Exercise: Model the eFashion DM
• Sales Revenue
• Time hierarchy (Year-Quarter-Month)
• Store hierarchy (Region, State, City, Store)
• Product hierarchy (Line, Category, SKU)
60
eFashion Data Mart
Facts: Month_nr, Store_nr, SKU_nr, Sales_revenue, ...
Product: SKU_nr, SKU_desc, Category, Line
Time: Month_nr, Month_desc, Quarter, Year
Geography: Store_nr, Store_name, City, State, Region
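To make the eFashion data mart concrete, the sketch below builds the four tables in an in-memory SQLite database and answers one management question: sales revenue per product line per region per year. The sample rows are invented, and the time dimension is named time_dim here purely as a choice for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product   (sku_nr INTEGER PRIMARY KEY, sku_desc TEXT, category TEXT, line TEXT);
CREATE TABLE time_dim  (month_nr INTEGER PRIMARY KEY, month_desc TEXT, quarter TEXT, year INTEGER);
CREATE TABLE geography (store_nr INTEGER PRIMARY KEY, store_name TEXT, city TEXT, state TEXT, region TEXT);
CREATE TABLE facts     (month_nr INTEGER, store_nr INTEGER, sku_nr INTEGER, sales_revenue REAL);

-- Invented sample data.
INSERT INTO product VALUES (1, 'Red dress', 'Dresses', 'Womenswear');
INSERT INTO product VALUES (2, 'Blue tie', 'Ties', 'Menswear');
INSERT INTO time_dim VALUES (200101, 'Jan 2001', 'Q1', 2001);
INSERT INTO time_dim VALUES (200201, 'Jan 2002', 'Q1', 2002);
INSERT INTO geography VALUES (10, 'Store A', 'Austin', 'Texas', 'South');
INSERT INTO geography VALUES (20, 'Store B', 'Boston', 'Massachusetts', 'East');
INSERT INTO facts VALUES (200101, 10, 1, 100.0);
INSERT INTO facts VALUES (200101, 20, 2, 40.0);
INSERT INTO facts VALUES (200201, 10, 1, 60.0);
INSERT INTO facts VALUES (200201, 10, 2, 25.0);
""")

# The fact table joins to each dimension on its key; the GROUP BY columns
# are the dimension levels the manager asks for.
rows = conn.execute("""
    SELECT p.line, g.region, t.year, SUM(f.sales_revenue)
    FROM facts f
    JOIN product p   ON p.sku_nr = f.sku_nr
    JOIN time_dim t  ON t.month_nr = f.month_nr
    JOIN geography g ON g.store_nr = f.store_nr
    GROUP BY p.line, g.region, t.year
    ORDER BY p.line, g.region, t.year
""").fetchall()
```

Every query against the star has this same shape, which is one reason the dimensional model is easier to navigate than an E/R model.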
61
DW Modeling Components
• Dimension tables: Geographic, Product, Time
• Fact table: measures (Units, $)
[Figure labels: Dimension, Facts]
62
Using a Star Schema
Fact table
Sales_Fact: TimeKey, EmployeeKey, ProductKey, CustomerKey, ShipperKey, $, ...
Dimension tables
Time_Dim: TimeKey, TheDate, ...
Employee_Dim: EmployeeKey, EmployeeID, ...
Product_Dim: ProductKey, ProductID, ...
Customer_Dim: CustomerKey, CustomerID, ...
Shipper_Dim: ShipperKey, ShipperID, ...
63
Components of a Star Schema
Dimension tables (each with its dimensional key)
Employee_Dim: EmployeeKey, EmployeeID, ...
Time_Dim: TimeKey, TheDate, ...
Product_Dim: ProductKey, ProductID, ...
Customer_Dim: CustomerKey, CustomerID, ...
Shipper_Dim: ShipperKey, ShipperID, ...
Fact table
Sales_Fact: TimeKey, EmployeeKey, ProductKey, CustomerKey, ShipperKey, $, ...
• Multipart key: TimeKey, EmployeeKey, ProductKey, CustomerKey, ShipperKey
• Measures: $, ...
64
Exercise: Slowly Changing Dimensions
Suppose the product categories change from time to time.
Model the Data Mart when the manager wants to see historical reports against:
1. The present categories
2. The categories at the time of the sale
3. Both the present categories and the immediately previous categories
4. The categories at any specified time
65
SCD Exercise 1
Facts: Month_nr, Store_nr, SKU_nr, Sales_revenue, ...
Product: SKU_nr, SKU_desc, Category, Line
Time: Month_nr, Month_desc, Quarter, Year
Geography: Store_nr, Store_name, City, State, Region
66
SCD Exercise 2
Facts: Month_nr, Store_nr, Product_key, Sales_revenue, ...
Product: Product_key, SKU_nr, SKU_desc, Category, Line
Time: Month_nr, Month_desc, Quarter, Year
Geography: Store_nr, Store_name, City, State, Region
Most Recent Product Key Map: Product_key, SKU_nr
67
SCD Exercise 3
Facts: Month_nr, Store_nr, Product_key, Sales_revenue, ...
Product: Product_key, SKU_nr, SKU_desc, Category, Category_old, Line
Time: Month_nr, Month_desc, Quarter, Year
Geography: Store_nr, Store_name, City, State, Region
68
SCD Exercise 4
Facts: Month_nr, Store_nr, SKU_nr, Sales_revenue, ...
Product: SKU_nr, SKU_desc, Category, Line, Valid_from, Valid_until
Time: Month_nr, Month_desc, Quarter, Year
Geography: Store_nr, Store_name, City, State, Region
69
Slowly Changing Dimensions
• Type 1: Overwrite the dimension record
• Type 2: Create a new dimension record
• Type 3: Create an 'old' field in the dimension record
• Type 4: Add a valid_from and a valid_until field to the dimension record
Note on Type 2: requires surrogate keys, but in general one should always use these because of performance and flexibility.
Note on Type 4: Kimball recognizes only three types of SCD.
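The valid_from/valid_until approach (called Type 4 on this slide) can be sketched as an "as of" lookup: given a SKU and a date, return the category that was valid on that date. The dimension rows, the category names and the far-future end date used for the open-ended current row are assumptions for this illustration.

```python
from datetime import date

# Hypothetical product dimension rows with validity intervals.
PRODUCT_DIM = [
    {"sku_nr": 1, "category": "Casual",
     "valid_from": date(2000, 1, 1), "valid_until": date(2000, 12, 31)},
    {"sku_nr": 1, "category": "Sport",
     "valid_from": date(2001, 1, 1), "valid_until": date(9999, 12, 31)},  # current row
]


def category_at(sku_nr, when):
    """Return the category of `sku_nr` that was valid on the date `when`, or None."""
    for row in PRODUCT_DIM:
        if row["sku_nr"] == sku_nr and row["valid_from"] <= when <= row["valid_until"]:
            return row["category"]
    return None


old_category = category_at(1, date(2000, 6, 1))
new_category = category_at(1, date(2002, 3, 15))
```

Reporting against "the categories at any specified time" then means joining each fact to the dimension row whose interval contains the chosen date, rather than the date of the sale.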
70
Always Use Surrogate Keys
• Allows DWH to assign new key versions for SCD’s (type 2)
• Higher performance with numeric keys than with long, alphanumeric keys
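How surrogate keys support a Type 2 change can be shown with a small in-memory dimension. This is a sketch under stated assumptions: the class ProductDimension, its upsert method and the sample categories are hypothetical, and the `current` dictionary plays the role of the "Most Recent Product Key Map" from SCD Exercise 2.

```python
import itertools


class ProductDimension:
    """Type 2 dimension: a category change creates a new row with a new surrogate key."""

    def __init__(self):
        self._next_key = itertools.count(1)  # numeric surrogate keys
        self.rows = {}     # product_key -> attributes of that version
        self.current = {}  # sku_nr -> most recent product_key

    def upsert(self, sku_nr, category):
        """Return the product_key to store on new fact rows for this SKU."""
        key = self.current.get(sku_nr)
        if key is not None and self.rows[key]["category"] == category:
            return key  # nothing changed: reuse the current version
        key = next(self._next_key)  # new version of the product
        self.rows[key] = {"sku_nr": sku_nr, "category": category}
        self.current[sku_nr] = key
        return key


dim = ProductDimension()
first = dim.upsert(100, "Casual")
same = dim.upsert(100, "Casual")
changed = dim.upsert(100, "Sport")
```

Facts loaded before the change keep pointing at the old key, so they still report against the category at the time of the sale.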
71
Exercise: Aggregation
Suppose the manager queries frequently on product line level and finds the performance too low.
Question: How to model the data mart when we want to add aggregated measures on product line level?
72
Exercise: Aggregation
Facts: Month_nr, Store_nr, Product_key, Sales_revenue, ...
Product: Product_key, SKU_nr, SKU_desc, Category, Line
Time: Month_nr, Month_desc, Quarter, Year
Geography: Store_nr, Store_name, City, State, Region
Aggregated Facts: Week_nr, Store_nr, Line_key, Sales_revenue, ...
Product_Line: Line_key, Line
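Filling an aggregated fact table amounts to summing the detailed facts up to the coarser grain. The minimal sketch below rolls revenue up from product to product line while keeping month and store as they are; the sample rows and the helper aggregate_by_line are invented, and the line name stands in for the Line_key of the slide.

```python
from collections import defaultdict

# Hypothetical product dimension: product_key -> product line.
PRODUCT_LINE = {1: "Womenswear", 2: "Womenswear", 3: "Menswear"}

# Hypothetical detailed facts: (month_nr, store_nr, product_key, sales_revenue).
FACTS = [
    (200101, 10, 1, 100.0),
    (200101, 10, 2, 50.0),
    (200101, 10, 3, 40.0),
    (200102, 10, 1, 60.0),
]


def aggregate_by_line(facts, product_line):
    """Sum revenue per (month, store, product line)."""
    totals = defaultdict(float)
    for month_nr, store_nr, product_key, revenue in facts:
        totals[(month_nr, store_nr, product_line[product_key])] += revenue
    return dict(totals)


aggregated_facts = aggregate_by_line(FACTS, PRODUCT_LINE)
```

Queries on product line level can then read the much smaller aggregated table instead of scanning every detailed fact row.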
73
Exercise: Measures
Add the following measures to the eFashion Data Mart:
• Stock Quantity
• Product Price
• Promotion Costs (product-specific, store-independent)
74
Exercise: Measures
Facts: Month_nr, Store_nr, Product_key, Sales_revenue, Stock_qty
Product: Product_key, SKU_nr, SKU_desc, Price, Category, Line, (Valid_from, Valid_until)
Time: Month_nr, Month_desc, Quarter, Year
Geography: Store_nr, Store_name, City, State, Region
Promotion Facts: Month_nr, SKU_nr, Promotion_cost, Duration, Promotion_type, ...
Q_Stock Facts: Quarter, Store_nr, SKU_nr, Stock_qty (av, eom)
Time levels: Month, Quarter, Year
75
Measures & Dimensions reviewed
The most useful measures are:
• Numeric
• Additive
Dimensions are:
• The natural entry points of the facts
• I.e., used for constraints and report breaks
• Independent of each other, not hierarchically related
76
Other Important MDM Aspects
• Cardinality
• Grain
• Referential Integrity
• Conformed Dimensions
• Drill Across
• Traps
77
How to make the CIF?
[Diagram labels: Any Source, Any Data, Any Access; Load Management, Query Management; External data, Operational Data Store, Data Warehouse, Data Marts; Applications; LAN/WAN, WWW]
78
CGE&Y BI-Approach Overview
[Diagram labels:]
• Awareness
• Strategy & objectives
• DW blueprint
• Source data
• Metamodel
• Data Warehouse Architecture I
• Definition Increments
• Extraction, Transformation, Load Development
• Implementation
• Incremental Delivery
• Evolutionary Strategy
• Data Warehouse Architecture II
• Project Management
• Communication
79
80
Data Mining
81
Data Mining
Definition:
The process of digging intelligently into large volumes of data to discover and analyse previously unknown relationships or to validate hypotheses.
82
Data Mining Versus OLAP
• OLAP/Query: Are there some customers from large accounts with a high decrease in international calls?
• Data Mining: Are there any common characteristics among these customers?
[Figure labels: Data, Information]
83
Applications
• Risk Analysis (grant credit, investment)
• Fraud Detection (telephone charge, bank withdrawals)
• Trouble Shooting and Diagnosis
• Process Controls (wafer fabrication)
• Promotion Analysis
• Bankruptcy Prediction (mortgage lending, business partners)
• Customer Churn (telco)
• CRM (next slides)
84
Maximizing Customer Value
• Getting more prospects in
• Turning prospects into customers
• Selling more products to existing customers
• Losing fewer customers
Which ones in and which ones out?
[Figure: yield per individual customer, ranked from highest to lowest, showing yield per customer, costs per customer and customer profitability; segments labelled Keep, Growth, Migration and Out]
86
Example: One to One Marketing
• Treat different customers differently
– differentiate message
– differentiate product offer
– differentiate channel
• Need for usable information => predict customer behavior out of databases
89
90
91
92
93
Example: clickstream analysis
• What parts of our Web site get the most visitors?
• What parts of the Web site do we associate most frequently with actual sales?
• What parts of the Web site are superfluous or visited infrequently?
• Which pages on our Web site seem to be "session killers," where the remote user stops the session and leaves?
• What is the new-visitor click profile on our site?
• What is the click profile of an existing customer? A profitable customer? A complaining customer that all too frequently returns our product?
• What is the click profile of a customer about to cancel our service, complain, or sue us?
• How can we induce the customer to register with our site so we learn some useful information about that customer?
• How many visits do unregistered customers typically make with us before they are willing to register? Before they buy a product or service?
94
Customized Customer Service
[Figure labels: Tele-sales, Service desk]
95
Example: Contact Strategy
[Diagram: data mining on customer data for channel optimisation across Tele-sales, Direct mail and Sales visit; outcomes labelled Good and Bad]
96
The customer chooses the channel
[Diagram labels: Organisation!; Complaint handling, Leaflet receipt, Status service question, Status order; Operational systems, Integration, Analysis; Service question, CC App., Leaflet request, Contact, Order, Complaint, Email]
97
Data Sources for Data Mining
[Figure label: DATA, collecting & cleansing]
• Transactions (loyalty cards)
• Behaviour of existing customers
• Logfiles & cookies
• Market research
• Data suppliers
• Public data
98
Example: Affinity Grouping
• Market Basket: what items are sold together?
• Market Basket: what categories are sold with what items?
• Market Basket: what is not sold with certain items?
• Event Correlations: what other services are bought in the first month after signing up for a satellite TV subscription?
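The first market-basket question, which items are sold together, can be approached by counting how often each pair of items appears on the same ticket. The sketch below uses invented baskets and a hypothetical helper pair_counts; it shows only raw co-occurrence counting, not a full association-rule algorithm.

```python
from collections import Counter
from itertools import combinations

# Hypothetical tickets: the set of items sold on each one.
BASKETS = [
    {"shirt", "tie"},
    {"shirt", "tie", "belt"},
    {"shirt", "belt"},
    {"socks"},
]


def pair_counts(baskets):
    """Count, for every pair of items, the number of baskets containing both."""
    counts = Counter()
    for basket in baskets:
        # Sorting gives each pair one canonical order, e.g. ("belt", "shirt").
        counts.update(combinations(sorted(basket), 2))
    return counts


counts = pair_counts(BASKETS)
```

Pairs with a count of zero among frequently sold items point at the third question on the slide: what is not sold together.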
99
Data Mining Techniques
• Decision Trees, Classification Trees, Rule Induction
• Neural Nets
• Visualisation
• Fuzzy Logic; Nearest Neighbour; Memory Based Reasoning; Case Based Reasoning
• Proprietary Logic
• Classical Statistics
100
Data Mining Techniques
[Figure: Statistical analysis, Neural networks, Genetic algorithms, Decision trees and Intuition plotted by Predictive Power against Simplicity]
101
Critical Success Factors
• Data availability (large amounts of a wide variety of data)
• Data consistency
• Data quality
• Domain expertise
• Data used/needed is allowed by privacy laws
102
Benefits
• Improved customer relationships
• More revenue from existing customers
• Market segmentation
• Differentiated products and services
• Differentiated sales channels
• More effective marketing programs
• Improved fraud detection
• Improved investments
• …
103
Decision Tree with BusinessMiner from BusinessObjects
Demo
104
Contact information
Tom A. Fürstenberg
Business Intelligence Consultant
Cap Gemini Ernst & Young
Sector Energy, Products & Transport
Tel +31 6 21 878 915
email: [email protected]