Corporate Data Architecture in a Federated World
Presented by Deborah Henderson, INERGI LP, to IRMAC Business Intelligence & Data Warehouse SIG
May 23 2002
© 2002 Inergi LP - All rights reserved 4/2/2002
You Know You Have a Problem When...
You Have a 'Dark Matter Schema'
We know that the data must be in the warehouse somewhere, but we can’t find it.
The 'Dark Matter Schema'
Two subsets, according to differing theories:
The WIMPs schema: we are just too overwhelmed by user requests to track down where in the data that particular element resides.
The MACHOs schema: we are too busy being the all-knowing DW experts to track the elements down.
Agenda
• Inergi and Architecture
• Corporate Data Architecture
• What supports this: the IT Business Model and Architectural Compliance
• Data Architecture Process and Procedure
• Models & modelling
• On the Horizon
INERGI LP
• Subsidiary of Cap Gemini Ernst & Young Canada Inc., along with New Horizons Solutions, another energy sector affiliate reporting into CGE&Y
• Created March 1 2002
• Multi-year deal for sustainment of Hydro One IT systems
• Supply, Finance, Pay, IT, Call Centre, Customer billing for Hydro One
• We are IT and business process outsourcing specialists for the energy sector with many years' experience
• We are open for business!
Enterprise IT Architecture
Why is it important?
• Reduces cost of operations through reuse of standard pieces of technology, application, data and network
Example: Financials
– disk farm participant (technology)
– official one source for ledger data (data)
– Java reporting environment (application)
– using network standards
Corporate Architecture
[Diagram: Corporate Architecture comprises Data, Application, Network and Technology; the data side comprises Data Architecture, Meta-data Architecture and Physical Data Architecture]
Data Architecture Idea
Data Architecture: Set of principles that defines an ‘organization-wide data resource of well-described, properly structured, high-quality data that are properly documented’ (Brackett, 1994).
Metadata Architecture: Set of principles that defines and describes the data resources in an organization.
Physical DBMS Architecture: Architecture component that defines physical data components.
Data Architecture Constructs
[Diagram labels: Data Partitioning; Data Placement; Data Usage; Physical, Metadata, Data Architectures]
Data Architecture Should be Principles Based
• Try to leverage the DW project: data, metadata, technical, ETL (extract, transform and load), EUT (end-use tools) and physical architectures
• Develop architecture changes and additions to the overall Enterprise Architecture
• High re-use of data and processes across the enterprise for next initiatives
A Data Architecture is Principles Based
EXAMPLES:
1. System of record will be established for all data
2. Corporate definitions of data will be resolved and maintained
3. Data will reside on database servers not on application servers or mail servers
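Principles like these can be checked mechanically once there is an inventory of where each data element lives. The following is a minimal sketch of such a check for the system-of-record and database-server examples, not a description of Inergi's actual tooling. The `DataStore` record, the `check_principles` function and every element and system name in `SAMPLE` are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DataStore:
    """One place where a data element is held (hypothetical inventory record)."""
    element: str               # business data element, e.g. "ledger_balance"
    system: str                # system holding a copy of the element
    server_type: str           # "database", "application" or "mail"
    is_system_of_record: bool  # True if this copy is the official source


def check_principles(stores):
    """Return a sorted list of human-readable principle violations."""
    violations = []
    records = defaultdict(list)
    for s in stores:
        if s.is_system_of_record:
            records[s.element].append(s.system)
        # Example principle 3: data resides on database servers only.
        if s.server_type != "database":
            violations.append(
                f"{s.element}: held on {s.server_type} server ({s.system})"
            )
    # Example principle 1: exactly one system of record per element.
    for element in {s.element for s in stores}:
        count = len(records[element])
        if count != 1:
            violations.append(f"{element}: {count} systems of record, expected 1")
    return sorted(violations)


# Made-up inventory: customer_address breaks both principles.
SAMPLE = [
    DataStore("ledger_balance", "financials", "database", True),
    DataStore("ledger_balance", "reporting_mart", "database", False),
    DataStore("customer_address", "billing", "database", True),
    DataStore("customer_address", "call_centre_app", "application", True),
]

if __name__ == "__main__":
    for line in check_principles(SAMPLE):
        print(line)
```

A check of this kind would typically run at the architectural compliance gate, with the inventory maintained alongside the metadata repository.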
Enterprise Data Warehouse as Driver for Data Architecture
• First data architecture effort often constrained by the EDW project
• Think enterprise scalable: Hardware, Software, Processes, Centre of Excellence in Data, Corporate Data Architecture compliance, stewardship & vitality process
Data Architecture Implemented: Tools & Expertise
• Modelling & metadata storage
• Data store
• Data mining
• Statistics
• Business reporting
• Environment modelling
• Keeping the number of tools to a minimum reduces lifetime ownership costs to the Company
• Establish the role of Product Specialist for all tools
Sample DW Data Architecture
[Diagram labels: Datamart; Local model; OLAP & Analysis; Historical & Benchmark purchased data; Operational Report server; Local Data; OLAP & Details; External Data & History; ODS source; 10%, 20%, 30%, 40%]
Sample DW Data Architecture
[Diagram labels: Local model; OLAP & Analysis; External data, history; ODS]
There are models for each component
Sometimes!! the models are linked/related
Metadata Architecture in Most Companies Today
[Diagram labels: Modelling CASE; RDBMS; ETL; BI End Use Tool; OLTP Applications; modeler; End user; legend: = repository]
Interfaces are usually proprietary
Metadata Arch
The Object Management Group (OMG) was established in 1989 and is the world's largest software consortium with a membership of over 700 vendors, developers, and end users.
In June 2000, OMG released an XML-based metadata standard.
OMG showcased XML metadata interchange in March 2001 at the DAMAI conference in Anaheim.
Physical DB Architecture
• ORACLE and DB2 outlook
• Impact of DW features
– partitions, instances and machines
• Referential integrity
– placement and enforcement
• Multi-dimensional cubes
• Metadata ‘repository’ through hooks or??
Business Intelligence: The Delivery Maturity Model
Infrastructure Development
Document-centric portal
DW/BI support portal
E-business portal
IT Compliance Business Model
For every IT project:
• Technical architecture
• Data architecture
• Map to Business model
Pre-requisite is a Policy on data sharing
Data Architecture Compliance Process
Modelling principles and procedures
Naming Conventions
Database design and implementation guidelines
Policies
Metadata principles
Data architecture principles
Data Architecture Compliance Process
Document processes that drive procedures, standards*
• Data models are a necessary deliverable of a project and should be noted in the Charter as a deliverable
• Architectural compliance and operational readiness gate
• Vitality process - keeping current
*Additional Documentation to Support the Corporate Data Architecture
• Metadata repository (or facsimile!)
• Business definitions
• Standard naming conventions
• Standard abbreviating procedures
• Standard domain structuring
• Standard translation schemes
• Conformed dimensions and facts - EDW
• Stewardship patterns
• CDA roles and responsibilities
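Standard naming conventions and abbreviating procedures lend themselves to automated checking. Below is a minimal sketch, assuming a convention of underscore-separated words ending in a class-word suffix. The suffix list is inferred from column names in the sample model shown later in this deck (`_CD`, `_DSC`, `_NM` and so on), and the 30-character limit is an assumption that matches the identifier limit of Oracle releases of that era. Neither is stated in the presentation as the official standard, and `check_column_name` is a hypothetical helper.

```python
import re

# Class-word suffixes inferred from the deck's sample model (an assumption,
# not a published standard).
CLASS_WORDS = {"CD", "DSC", "NM", "AMT", "ID", "DT", "IND", "QTY", "NUM"}

# Assumed maximum identifier length.
MAX_LENGTH = 30

# Words made of letters and digits, joined by single underscores.
NAME_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9]*(_[A-Za-z0-9]+)*$")


def check_column_name(name):
    """Return a list of naming problems for one column name (empty if none)."""
    problems = []
    if not NAME_PATTERN.match(name):
        problems.append("not underscore-separated words")
    if len(name) > MAX_LENGTH:
        problems.append(f"longer than {MAX_LENGTH} characters")
    if name.rsplit("_", 1)[-1].upper() not in CLASS_WORDS:
        problems.append("missing class-word suffix")
    return problems


if __name__ == "__main__":
    for column in ["Work_Order_Type_CD", "Start_Date", "Disposed_Waste_QTY"]:
        print(column, check_column_name(column))
```

Running a check like this against a model before it enters the model repository is one way to make the convention part of the compliance gate rather than a document nobody reads.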
Governance and Using the Architecture
Governance framework and repository maintenance
Using the Architecture in a project
Data Stewardship: the Business-side Responsibility
• Data will be controlled and managed throughout its life cycle as a resource, in the same manner as any other asset (capital, material, and people).
• Access to data will be facilitated, and/or controlled and limited, as required to provide the best performance at the least cost for all users while meeting functional, technological, regulatory and legal requirements.
Data Stewardship: the Business-side Responsibility cont’d
• Data will be shared except where exempted by Corporate Security Policy.
• Data will be standardized to avoid duplication and facilitate integration.
Building Models
1. Assess subject areas involved in a project and publish for reuse:
• Conceptual model bubbles
• Subject area models where complete
• Documentation standards for models
Building Models
2. Build on these models and submit for review and approval
3. Develop conformed data objects where required (as you go)
4. Add new models to the Model Repository
REFER TO PROCESS DOCUMENTATION !
Related Data Models ARE Your Quality Control
• Conceptual and Enterprise Data Models
– maintained by IT Architecture
• Logical Models
– assist in understanding, official definitions (OLTP physical = logical)
• Enterprise Data Warehouse Model
– a dimensional model that gets implemented in Oracle
– high-performance ‘read-only’ model
• Cube Designs - problem-centric dimensional models
– implemented in OLAP cubes
• Source Data Models (OLTP)
– informational for BI, source for ODS
More Physics of Schemas :-)
'Black Hole Schema': Systems where the query never returns
'Pulsar Schema': Only returns results every few queries or so
'Milky Way Schema': A central warehouse with many dozens of offspring that no one can keep track of
'SuperStrings Schema': Many measures, all built on top of each other, relating to each other, and all giving the same result
Conceptual Data Model
High-level model
Depiction of major Functional Areas in the Company
Each Functional Area defined
Enterprise Data Model
Limited number of high-level data sets (Subject Areas)
Global relationship cardinalities are shown
Data Sets fully defined
Definition is formalized at the Corporate level as official
Data Stewardship is established for each Subject Area
Logical Data Models
Logical Data Model developed on a per-project basis
LDM fully synchronized using the Corporate Data Architecture Principles
Objects fully defined and attributed
Re-usable domains implemented
Re-usable rules identified, documented and implemented consistently
[Diagram: logical model with entities Municipality_Type, Ministry_of_Environment_Distri, Work_Order_Type, Chemical, Disposal_Contractor, Hazard_Class, LAR_Fact_Remediation, LAR_Factless_Fact, Work_Order, LAR_Fact_Budget_Actual, Business_Unit, Site]
DW Physical Data Models
Implemented
Limited use of RI (load only) to keep the data integrity
Business rules implemented through ETL procedures
Model in-sync with the database
CASE tool used for model/database synchronization
Colors extensively used for better readability
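When referential integrity is enforced at load time rather than by database constraints, the ETL procedure has to check foreign keys itself. The following is a minimal sketch of that idea, not the ETL actually used on this project. The function `split_by_integrity` is hypothetical, the column names are borrowed from the LAR_Fact_Remediation entity in this deck, and all key values are made up.

```python
def split_by_integrity(fact_rows, dimension_keys):
    """Separate fact rows into loadable rows and rejects.

    fact_rows:      list of dicts, one per incoming fact row
    dimension_keys: {foreign_key_column: set of keys present in the dimension}

    Returns (accepted, rejected); each reject is a (row, failed_columns) pair.
    """
    accepted, rejected = [], []
    for row in fact_rows:
        # A foreign key fails if its value is absent from the dimension.
        failed = [
            column
            for column, keys in dimension_keys.items()
            if row.get(column) not in keys
        ]
        if failed:
            rejected.append((row, failed))
        else:
            accepted.append(row)
    return accepted, rejected


# Made-up dimension keys and incoming fact rows.
DIMENSION_KEYS = {
    "Site_ID": {"S001", "S002"},
    "Hazard_Class_CD": {"H1", "H2"},
}
FACT_ROWS = [
    {"Work_Order_NUM": "WO-1", "Site_ID": "S001",
     "Hazard_Class_CD": "H1", "Disposed_Waste_QTY": 12.5},
    {"Work_Order_NUM": "WO-2", "Site_ID": "S999",
     "Hazard_Class_CD": "H2", "Disposed_Waste_QTY": 3.0},
]

if __name__ == "__main__":
    accepted, rejected = split_by_integrity(FACT_ROWS, DIMENSION_KEYS)
    print("loadable:", [row["Work_Order_NUM"] for row in accepted])
    for row, failed in rejected:
        print("rejected:", row["Work_Order_NUM"], "unknown", failed)
```

Rejected rows would normally go to an error table for a data steward to resolve, so that the warehouse stays consistent without paying for constraint checks on every query-time table.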
[Diagram: physical model. Entities and their columns, in the order shown:]
Municipality_Type: Municipality_CD, Municipality_Description
Ministry_of_Environment_Distri: MOE_Region_CD, MOE_District_CD, MOE_Region_DSC, MOE_District_CD_DSC
Work_Order_Type: Work_Order_Type_CD, Work_Order_Type_CD_DSC
Chemical: Chemical_CD, Chemical_NM, Chemical_Measurement_CD, Chemical_Measurement_DSC, Chemical_Measurement_Limit, Chemical_Suite_CD, Chemical_Suit_DSC, Chemical_Suite_Measurement_CD, Measurement_DSC, Chemical_Suite_Limit, Chemical_Suite_Measurement_AMT
Disposal_Contractor: Disposal_Contractor_ID, Disposal_Contractor_NM, Hauling_Company_NM, Disposal_Facility_NM, Certificate_of_Approval
Hazard_Class: Hazard_Class_CD, Hazard_Class_Code_DSC, Hazard_Clas_CD_Measurement_AMT, Hazard_Class_CD_Measurement_CD, Hazard_Class_CD_Measure_CD_DSC
LAR_Fact_Remediation: Work_Order_NUM (FK), Disposal_Contractor_ID (FK), Start_Date, Hazard_Class_CD (FK), Site_ID (FK), Remediation_DSC, End_Date, Disposed_Waste_Measure_DSC, Disposed_Waste_QTY
LAR_Factless_Fact: Work_Order_NUM (FK), Chemical_CD (FK), Testing_Begin_DT, Work_Order_Type_CD (FK), Site_ID (FK), Business_Unit (FK), Testing_End_DT, Chemical_Concentration_AMT, Above_Guideline_IND, Testing_Status
Work_Order: Work_Order_NUM, Work_Order_DSC, Begin_DT, End_DT, Resource_CD, Resource_Code_DSC, Hand_Off_DT, Work_Program_Number, Work_Program_Start_DT, Work_Program_End_DT, Work_Program_Type, Work_Type_DSC, Project_CD, Project_DSC
LAR_Fact_Budget_Actual: Work_Order_NUM (FK), Municipality_CD (FK), MOE_Region_CD (FK), MOE_District_CD (FK), Work_Order_Type_CD (FK), Business_Unit (FK), Site_ID (FK), Budget_Credit_AMT, Budget_Debit_AMT, Actual_Total_Credit_AMT, Actual_Total_Debit_AMT
Business_Unit: Business_Unit
Site: Site_ID, Site_NM, GPS_North_Reading, GPS_West_Reading, GPS_Day, GPS_Time, CRA_Tier_Ranking_IND, CRA_Tier_Ranking_IND_DSC, CRA_Ranking_IND, CRA_Ranking_IND_DSC, Hydro_One_Ranking_IND, Hydro_One_Ranking_IND_DSC
Time for... More Physics of Schemas :-)
'Binary System Schema': Two datamarts that do the same thing and try to suck each other into themselves
'Chance Theory Schema': The results are always uncertain and questionable, as they change every time you run the report
'Event Horizon Schema': Has many dimensions, but if you ask for more than a certain number at a time, it converts to a Black Hole Schema
'Big Bang Schema': A data warehouse that we miraculously brought into existence and the user does not know why or how or how it’s useful
'Tachyon Schema': Miraculously fast and becomes faster as you add data. But cannot be implemented as it is theoretical. Most demos fall into this space.
[Diagram labels: OLTP Enterprise Data Model; Logical Data models 1, 2, 3 ... n; Subject area EXTRACTS; OLAP Enterprise Data Warehouse Model; DATA MARTS 1, 2]
[Diagram labels: LDM 1, LDM 2, LDM 3 ... LDM n; Enterprise Data Model; Extract, Extract; Enterprise Data Warehouse Model]
For DW Conformed Dimensions & Facts
Supports iterative/parallel build and aligns with Kimball’s bus structure
BUT
• Can get it wrong
• Can lose control
• Exponential complexity
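One way to keep control during parallel builds is to compare how each data mart defines the dimensions it shares with the others. The following is a minimal sketch of such a conformance check, assuming a dimension counts as conformed only when every mart declares the same column list for it. The function `find_nonconformed`, the mart names and the column lists in `MARTS` are hypothetical; real conformance would also compare data types, domains and key values.

```python
def find_nonconformed(marts):
    """Find dimensions whose definition differs between data marts.

    marts: {mart_name: {dimension_name: sequence of column names}}

    Returns {dimension_name: {mart_name: columns}} for each dimension that
    appears with more than one distinct column list.
    """
    definitions = {}
    for mart, dimensions in marts.items():
        for dimension, columns in dimensions.items():
            definitions.setdefault(dimension, {})[mart] = tuple(columns)
    return {
        dimension: by_mart
        for dimension, by_mart in definitions.items()
        if len(set(by_mart.values())) > 1
    }


# Made-up marts: Site has drifted, Work_Order_Type is still conformed.
MARTS = {
    "remediation_mart": {
        "Site": ("Site_ID", "Site_NM"),
        "Work_Order_Type": ("Work_Order_Type_CD", "Work_Order_Type_CD_DSC"),
    },
    "budget_mart": {
        "Site": ("Site_ID", "Site_NM", "GPS_Day"),
        "Work_Order_Type": ("Work_Order_Type_CD", "Work_Order_Type_CD_DSC"),
    },
}

if __name__ == "__main__":
    for dimension, by_mart in find_nonconformed(MARTS).items():
        print(dimension, "differs:", by_mart)
```

The number of pairwise comparisons grows quickly with the number of marts, which is one reason to hold the conformed definitions centrally and compare each mart against that single reference instead.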
On the Horizon
• Impact of XML
• Impact of Taxonomies in Business
• Corporate Reporting Strategies
• Overarching Mobile Data Strategies
• ODS for all ERP
• Information Architecture
• BAM - Business Activity Monitoring*
• Network Appliances
• Synergies with Application Architecture
*Gartner April 2002
CDA Together with Application Architecture
DATA APPL’N
• Business processes should be implemented in the application, not the database
• Logical workflow and data flow must align
• Applications must have owners just like data
• We must be able to identify the official source (aka system of record)
Corporate Data Architecture That is Process Driven
• Policies
• Architecture & Standards
• Addresses OLTP and OLAP “Federated” world
• Compliance
• Vitality
• Stewardship
• Can be applied to all new challenges on the horizon
With all the pieces... This could work!!