rev_3 components of a data warehouse

PRIMER FOR BUILDING A DATA WAREHOUSE

Upload: ryan-andhavarapu

Post on 13-Apr-2017


Vensai Consultants is an IT Consulting firm which specializes in providing Strategic leadership, Architectural direction, Resourcing ramp-up, IT Portfolio Management & Implementation of Data Warehouses.

WHO ARE WE?

We are a team of IT Architects who specialize in implementing Data Warehouses based on customers' needs. We build Reporting systems end to end, from understanding data sources to building a presentation layer.

SOME INDUSTRY FACTS

Data Warehousing Today: Facts

Information processing using Microsoft Office Productivity software is long gone.

Data is being collected in various formats - Tables, XML, Messages, Images, BLOBs, CLOBs etc.

Data Warehouses are driving the analytical and decision support systems.

99% of corporations have Data Warehouse shops attempting to process the data into useful information for the Company.

Vensai Consultants believes that building a Data Warehouse shouldn't be a daunting exercise once the roadmap is adopted. Most companies fail in their data warehouse efforts because:

Reporting requirements are not understood at the beginning of the project

Software Development life cycle is plagued with process deficiencies

Lack of Strategic direction & Monetary resources

WHAT IF?

The Client has access to a team of IT Architects who have versatile experience across several tools related to Data Warehousing, who have a sound foundation in Agile methodology for developing Data Warehouses, who have a sound strategy for overcoming the challenges related to requirements, and who cost a fraction of the typical IT development shops?

SAY HELLO TO VENSAI CONSULTANTS.

INTRODUCING VENSAI CONSULTANTS, LLC

A team of Data Warehouse consultants with diverse experience in all facets of the development life cycle. Our consultants have extensive experience in the Healthcare, Retail, Banking and Defense sectors. We have access to several IT Vendors who can assist the project in procuring and implementing the software/hardware required for building a Data Warehouse. Our team is also well-versed in Big Data technologies.

Vensai Consultants, LLC is a service-based company in the IT Development domain.

Vensai Consultants, LLC Presents

DATA WAREHOUSE ROADMAP


CLAIMS DATA WAREHOUSE

Data Acquisition Services

Data Integration Services

Data Repository Services

Data Reporting Services

Data Acquisition Services

Components: Data Source, Staging Area (Database), Archiving and Cleansing (Data Archiving, Data Cleaning)

Data Source and Staging Area (the same feeds appear in both)

Eligibility Source 1

Medical Claims Source 1

Medical Claims Source 2

Pharmacy Claims Source 1

Pharmacy Claims Source 2

Encounters Source 1

Assessment Source 1

Authorizations Source 1

Archiving and Cleansing

Data Archiving: Backing up input scores, securing data sources, etc.

Data Cleaning: Data Types, de-duplication, etc.
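As an illustration of the Data Cleaning step (data types, de-duplication), here is a minimal Python sketch. The field names (claim_id, member_id, service_date, paid_amount), the date format and the choice of claim_id as the duplicate key are assumptions made for the example; they are not taken from any actual source feed.

```python
from datetime import datetime

def clean_rows(rows, key_fields):
    """Coerce each field to its target type and drop duplicate rows.

    The first row seen for a given key is kept; later duplicates are skipped.
    Field names and formats are hypothetical.
    """
    seen = set()
    cleaned = []
    for row in rows:
        typed = {
            "claim_id": row["claim_id"].strip(),
            "member_id": row["member_id"].strip(),
            "service_date": datetime.strptime(row["service_date"], "%Y-%m-%d").date(),
            "paid_amount": round(float(row["paid_amount"]), 2),
        }
        key = tuple(typed[field] for field in key_fields)
        if key in seen:
            continue  # duplicate of a row already kept
        seen.add(key)
        cleaned.append(typed)
    return cleaned

# Hypothetical raw extract: the first two rows differ only by whitespace.
raw = [
    {"claim_id": " C100 ", "member_id": "M1", "service_date": "2017-01-05", "paid_amount": "125.50"},
    {"claim_id": "C100", "member_id": "M1", "service_date": "2017-01-05", "paid_amount": "125.50"},
    {"claim_id": "C101", "member_id": "M2", "service_date": "2017-01-06", "paid_amount": "40"},
]
cleaned = clean_rows(raw, key_fields=("claim_id",))
```

Trimming and typing before building the key matters here: without it, " C100 " and "C100" would be treated as two different claims.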

Data Integration Services

DATA PROFILING: Data Discovery, Data Validation

DATA MAPPING, TRANSFORMATION AND DERIVED DATA: Identity Mapping (Master Data Management), Generate Core Facts, Data Mining, Reference Data Taxonomies, Generate Dimensions (Type 1-3), Generate Fact Groups (Get Surrogates, Generate Stars)

AUDIT: Data Quality Measures, Event Logging & Auditing

DATA PROFILING

Data Discovery (Data Profiling, Metadata collection, statistics, histograms, Alerts, etc.)

Data Validation (RI, Conditionals, Data Types, Valid Values, Cleansing, Scrubbing)

Databases: Data Profiling Area, Integration & Transformation Area
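The Data Validation checks listed under Data Profiling (RI, Conditionals, Data Types, Valid Values) could be sketched roughly as below. This is only an illustration: the claim fields, the valid-values list and the "denied claims pay zero" conditional are assumptions invented for the example.

```python
VALID_CLAIM_TYPES = {"MEDICAL", "PHARMACY"}  # assumed valid-values list

def validate_claim(claim, known_member_ids):
    """Return a list of validation errors for one claim record (empty if clean)."""
    errors = []
    # RI: the claim must point at a member that exists in eligibility.
    if claim.get("member_id") not in known_member_ids:
        errors.append("RI: member_id not found in eligibility")
    # Valid values: claim_type must come from the agreed list.
    if claim.get("claim_type") not in VALID_CLAIM_TYPES:
        errors.append("Valid values: unknown claim_type")
    # Data type: paid_amount must be numeric (bool is excluded on purpose).
    amount = claim.get("paid_amount")
    if not isinstance(amount, (int, float)) or isinstance(amount, bool):
        errors.append("Data type: paid_amount is not numeric")
    # Conditional: a hypothetical rule that denied claims carry no payment.
    elif claim.get("status") == "DENIED" and amount != 0:
        errors.append("Conditional: denied claim has a non-zero paid_amount")
    return errors

members = {"M1", "M2"}
ok_errors = validate_claim(
    {"member_id": "M1", "claim_type": "MEDICAL", "paid_amount": 125.5, "status": "PAID"},
    members,
)
bad_errors = validate_claim(
    {"member_id": "M9", "claim_type": "DENTAL", "paid_amount": "12", "status": "PAID"},
    members,
)
denied_errors = validate_claim(
    {"member_id": "M1", "claim_type": "PHARMACY", "paid_amount": 10.0, "status": "DENIED"},
    members,
)
```

Returning every failed check, rather than stopping at the first, lets the exceptions be logged and remediated together.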

DATA MAPPING, TRANSFORMATION AND DERIVED DATA

Identity Mapping (Master Data Management): Matching Algorithms, Dimensional Authority, Data Domains Setup

Generate Core Facts: Natural Keys only (Type 1-3)

Data Mining: Grouper Generative Routines (MEG, PEG, etc.)

Reference Data Taxonomies: Develop Crosswalks, Taxonomies

Generate Dimensions (Type 1-3)

Generate Fact Groups (Get Surrogates, Generate Stars)
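To make "Generate Dimensions (Type 1-3)" and "Get Surrogates" more concrete, the sketch below shows a Type 2 dimension, where a changed attribute closes the current row and opens a new one under a new surrogate key. (Type 1 would overwrite the attribute in place; Type 3 would keep the prior value in an extra column.) The class name, the provider_id natural key and the specialty attribute are hypothetical, and a real warehouse would hold these rows in database tables instead of in memory.

```python
from datetime import date
from itertools import count

class ProviderDimension:
    """In-memory Type 2 dimension: one row per version of a natural key."""

    def __init__(self):
        self.rows = []
        self._next_key = count(1)  # surrogate key generator

    def current_row(self, natural_key):
        """Return the open (not yet expired) row for a natural key, if any."""
        for row in self.rows:
            if row["provider_id"] == natural_key and row["end_date"] is None:
                return row
        return None

    def upsert(self, natural_key, attributes, effective):
        """Return the surrogate key that facts dated `effective` should use."""
        current = self.current_row(natural_key)
        if current is not None:
            if current["attributes"] == attributes:
                return current["provider_key"]  # nothing changed
            current["end_date"] = effective     # expire the old version
        row = {
            "provider_key": next(self._next_key),
            "provider_id": natural_key,
            "attributes": dict(attributes),
            "start_date": effective,
            "end_date": None,
        }
        self.rows.append(row)
        return row["provider_key"]

dim = ProviderDimension()
k1 = dim.upsert("P001", {"specialty": "Cardiology"}, date(2017, 1, 1))
k2 = dim.upsert("P001", {"specialty": "Cardiology"}, date(2017, 2, 1))  # unchanged
k3 = dim.upsert("P001", {"specialty": "Oncology"}, date(2017, 3, 1))    # new version
```

The fact load would call upsert (or a lookup-only variant) to swap each natural key for its surrogate before writing the star.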

Databases: DQ & Exception Technical Audit Area, Audit, Balance & Control Area

Data Quality Measures (Define Quality Measures, Measure, Remediation, Load & Reload)

Event Logging & Auditing (System Event Logging, Auditing, Balance & Control)
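A simple form of Balance & Control is tallying row counts between consecutive layers and flagging any rows that are neither loaded nor rejected. The sketch below assumes the layers are row-for-row comparable (it would not apply to aggregated marts); the layer names and counts are made up for the example.

```python
def balance_report(layer_counts, rejected_counts):
    """Compare each layer's row count with the layer that feeds it.

    layer_counts: ordered list of (layer_name, rows_loaded).
    rejected_counts: {layer_name: rows rejected while loading that layer}.
    A hop is balanced when rows_in == rows_loaded + rows_rejected.
    """
    report = []
    for (prev_name, prev_rows), (name, rows) in zip(layer_counts, layer_counts[1:]):
        rejected = rejected_counts.get(name, 0)
        difference = prev_rows - (rows + rejected)
        report.append({
            "from": prev_name,
            "to": name,
            "rows_in": prev_rows,
            "rows_loaded": rows,
            "rows_rejected": rejected,
            "difference": difference,
            "balanced": difference == 0,
        })
    return report

# Hypothetical counts for one load cycle.
counts = [("source extract", 1000), ("staging", 990), ("canonical", 985)]
rejects = {"staging": 10}
report = balance_report(counts, rejects)
```

In this example the first hop balances (990 loaded plus 10 rejected), while the second hop leaves 5 rows unaccounted for and would be raised as an exception.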

Data Repository Services

Areas: Core Data Area, Data Marts, Operational Admin Data & Operational Decision Support

Core Data Area

Atomic (Core) Facts (Database)

Conformed Dimensions (Database)

ETL: Generate Pre-Stored Aggregates

Data Marts

Value Proposition KPIs (Database)

Aggregate Snapshots (Database)

Pre-Aggregated Measures/Metrics (Database)

Divisional Star-Schemas (Database)

Operational Admin Data & Operational Decision Support

Data Quality and Audit Logging (Database)

Metadata Repository (Database)

Process Dashboards, Event Monitoring, Warehouse Controls (Database)

Data Reporting Services

Areas: Monitoring, Planning, Analysis, Administration

Monitoring

Monitoring KPI-Dashboard, Value Levers, Scorecards

DQ & Exception Technical Audit Area

Planning

Predictive Modeling (Plans, Models, Forecasts)

Analysis

Multidimensional Report Cubes

Adhoc Analysis

Administration

Report Authority/Writers

Operation Monitoring

Report Administrative Functions (Framework/Catalog Generation, User Security, Prompts, Data Filters, etc.)

Requesting Applications, User Community, Staff Support

Web Services

BI Analytics

Batch Extracts

FUNCTIONAL TEAMS RESPONSIBLE FOR DATA WAREHOUSE

Change Management

Purpose: Provides a framework for creating and managing change requests on all aspects of application development.

IT Support

Purpose: Maintains the IT systems after the end of the development life cycle. Workflow, Scheduling, Process models, Process grouping, Interactions, etc.

Audit, Balance & Control

Purpose: Audit mechanism for tallying the processed data across Data Warehouse layers. Error Handling, Consolidation and Reporting.

IT Portfolio Management

Purpose: Systems, Projects, Vendors, Network Hardware, Software, etc.

Application Administration

Purpose: Administration access, configuration, release management, deployment of application servers.

Application Development

Purpose: Creates application code to move data from one layer to another.

Reference Data Management

Purpose: Provides, maintains and supports the authoritative reference data along with taxonomies.

Data Architecture

Purpose: Creates, maintains and supports the metadata artifacts like data models, data packages, App Dev Repositories, etc.

Data Security & Access Control

Purpose: Defines security policies and manages data access for individuals and processes.

Data Integration

Purpose: Creates and manages mapping documents and data transformation rules.

Data Quality

Purpose: Defines and manages the Data Quality KPIs to assure the quality of the Data Warehouse.

Metadata Management

Purpose: Creates a framework for storing & publishing all metadata content related to a Data Warehouse.

FUNCTIONAL CHARACTERISTICS OF DATA WAREHOUSE

DATA ACQUISITION LAYER

Houses only the changed records from the source system

Optimized for faster loading (Ex: No indexes or Constraints)

Truncate before each load

Only INSERTS into this layer

Access to ETL processes only

Apply business and project specific filter on data before this layer

Replica Structures of the Source System

NO User access at all

Exception: Data Retention period is limited to 2-6 months

Cleansed Data only

Data Load Frequency is requirement specific (Daily, Weekly, Monthly)

Minimize processing impact on application source database.
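The acquisition-layer rules above (truncate before each load, INSERTs only, no indexes or constraints) can be sketched with Python's built-in sqlite3 module. The table name stg_medical_claims and its columns are hypothetical; SQLite has no TRUNCATE statement, so an unqualified DELETE stands in for it here, whereas a warehouse RDBMS would normally use its own TRUNCATE.

```python
import sqlite3

def reload_staging(conn, changed_rows):
    """Empty the staging table, then insert only the changed source records."""
    with conn:  # commits on success, rolls back if any statement fails
        conn.execute("DELETE FROM stg_medical_claims")  # stand-in for TRUNCATE
        conn.executemany(
            "INSERT INTO stg_medical_claims (claim_id, member_id, paid_amount) "
            "VALUES (?, ?, ?)",
            changed_rows,
        )
    return conn.execute("SELECT COUNT(*) FROM stg_medical_claims").fetchone()[0]

conn = sqlite3.connect(":memory:")
# Replica of the (hypothetical) source structure: no indexes, no constraints.
conn.execute(
    "CREATE TABLE stg_medical_claims (claim_id TEXT, member_id TEXT, paid_amount REAL)"
)
first_load = reload_staging(conn, [("C100", "M1", 125.5), ("C101", "M2", 40.0)])
second_load = reload_staging(conn, [("C102", "M3", 10.0)])
```

After the second call the table holds only that cycle's single changed record, which is the point of truncating before each load.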

CANONICAL LAYER

Combine data from Multiple Sources

Denormalized Atomic Transaction Tables

Metadata Conformance

Transaction History accumulated for specific period of time (Ex: 7 years)

NO User access at all

Snapshots at specific changes to a transaction may be maintained

Purge Criteria will be established

DATAMART LAYER

Preferably STAR Schema Modeling of Data Structures

Lookup Reference data is loaded into the dimensions

Optimized for faster access (Ex: Many indexes, Partitioning)

Access to all downstream processes

Transaction History accumulated for specific period of time (Ex: 7 years)

Access to all privileged users

SUGGESTED TOOLS REQUIRED FOR DEVELOPING & MAINTAINING A DATA WAREHOUSE

1. Data Modeling

2. Relational Database Management Systems

3. Non-Relational Database Systems

4. ETL

5. Master Data Management

6. Analytics

7. Reporting

8. Data Profiling

9. Job Control

10. Performance Monitoring

11. Unix/Mainframe OS Access

12. Identity Management

13. Support Management

DATA MODELING

Purpose: Designing data structures.

ER/Studio

Erwin

SAP Power Designer

IBM Rational Rose

TOAD Data Modeler

RELATIONAL DATABASE MANAGEMENT SYSTEMS

Purpose: Storing Data.

Teradata

Oracle

Microsoft SQL Server

SAP Sybase

IBM DB2

NON-RELATIONAL DATABASE SYSTEMS

Purpose: Storing Data.

Oracle Exadata

Cloudera Big Data

Hortonworks Hadoop

MongoDB

ETL

Purpose: Extract, Transform & Load of data.

Informatica PowerCenter

SSIS

Ab Initio

Cognos DecisionStream

SAS DI Studio

Oracle Data Integrator

MASTER DATA MANAGEMENT

Purpose: Create, retrieve Master data.

IBM Initiate

NextGate MatchMetrix

Informatica MDM

ANALYTICS

Purpose: Analyzing Information, Predictive Analysis.

Statistica

SAS

Oracle Business Intelligence EE

IBM SPSS Modeler

REPORTING

Purpose: Writing Reports.

Oracle Business Intelligence EE

Business Objects

Cognos

SSRS

Pyramid Analytics

TIBCO Spotfire

MicroStrategy

SAP InfoMaker

Microsoft Access

DATA PROFILING

Purpose: Analyzing the source data.

SAP Data Quality Management

SAP Address Directories

Informatica IDQ

JOB CONTROL

Purpose: Manage job schedules.

CA7

Tivoli Workload Scheduler

CA Autosys

Maestro

PERFORMANCE MONITORING

Purpose: Analyzing the performance data.

TeamQuest CMIS

UNIX/MAINFRAME OS ACCESS

Purpose: Access to the OS for file operations.

WinSCP

Reflections

PuTTY

Telnet

IDENTITY MANAGEMENT

Purpose: Provisioning and managing access for users.

Oracle Id Management Suite

Tivoli Identity Management

IBM Identity Management

SUPPORT MANAGEMENT

Purpose: Incident management, Escalation management, Contact management.

BMC Remedy Action Request System

ABOVE ALL

Vensai Consultants, LLC is a woman-owned small business based out of Maryland.

We specialize in tailoring solutions to the client's needs & budget. We are available as a team as well as on an individual consultant basis. Given our technical acumen, we are sure that Vensai Consultants, LLC would be a value proposition to our clients.

HAPPY DATA WAREHOUSING!!

THANK YOU

Vensai Consultants, LLC

EMAIL: [email protected]

PHONE: 815-277 9201