Data Governance Data & Metadata Standards Antonio Amorin © 2011

Data Governance: Data & Metadata Standards

Antonio Amorin

© 2011

Abstract

• This data governance presentation focuses on data and metadata standards. The intention of the presentation is to identify new standards or modernize existing standards for both data and metadata.

Biography

• Antonio Amorin, President, Data Innovations, Inc.
  – Nineteen years of data modeling experience
  – Eleven years of data profiling experience
  – Delivered data modeling and data profiling solutions to numerous clients in the Midwest and East Coast
  – Presented at national and international conferences, user groups, webcasts, and at client sites
  – Founded Data Innovations, Inc. in 2002

Data Innovations, Inc.

• Established in 2002
• Based in northwest suburbs
• Professional Services:
  – Data Modeling
  – Data Profiling
  – Data Architecture
  – Metadata
  – Database Administration
  – ETL
• CA Service Partner in 2004
• CA Commercial Reseller in 2006
• CA Enterprise Solution Provider in 2007

Agenda

• Data Standards

• Metadata Standards

• Recommendations

• Summary

Data Standards

• Documented agreements on representations, formats, and definitions of business data

Data Standards

• Benefits
  – Improved data quality
  – Improved data compatibility
  – Improved consistency and efficiency of data collection, use, and sharing
  – Reduced data redundancy

Data Standards

• Data Stewards
  – Role or position
  – Responsible for overseeing stewardship of the data and metadata
  – Likely to be on both the business and IT sides of the organization
  – Gatekeepers

Data Standards

• Council or Board
  – Data stewards and representatives of the various business areas
  – Responsible and/or accountable for specific data for the organization

Data Standards

• Types of Standards
  – Data definitions
  – Data rules
  – Data values
  – Data quality
  – Data standardization
  – Data security

Data Standards

• Data Definitions and Rules
  – Provide a consistent, clear understanding of what data content is expected
  – Centralize or publish across the organization
  – Enterprise data dictionary or metadata repository

Data Standards

• Data Values
  – Valid values lists
    • Static or rarely changed data
    • Codes
    • Indicators
  – Master reference data
    • Customer
    • Product
    • Etc.
  – Centralize
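
As a minimal sketch of how a centralized valid values list might be applied, the Python below checks a column against a small code list. The status codes, their meanings, and the function name are hypothetical and chosen only for illustration.

```python
# Hypothetical valid values list for a status code column (illustration only).
VALID_STATUS_CODES = {"A": "Active", "I": "Inactive", "P": "Pending"}


def find_invalid_values(values, valid_values):
    """Return the distinct values that are not in the valid values list."""
    return sorted({v for v in values if v not in valid_values})


status_column = ["A", "I", "X", "P", "a", "A"]
invalid = find_invalid_values(status_column, VALID_STATUS_CODES)
print(invalid)
```

The check is deliberately case-sensitive, so a lower-case "a" is reported rather than silently accepted; whether that is the right rule is a decision for the standard itself.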

Data Standards

• Data Quality
  – Leverage data profiling
    • Column/Field
      – Value analysis
      – Pattern analysis
      – Data type analysis
    • Table/File
      – Validate key structure
      – Determine dependencies
    • Cross-table
      – Validate foreign keys
      – Valid values
    • Cross-system
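
The column-level analyses named above can be sketched in a few lines of Python. This is not how any particular profiling tool works; the pattern notation (9 for a digit, A for a letter), the three type labels, and the sample ZIP codes are assumptions made for the example.

```python
from collections import Counter


def to_pattern(value):
    """Map digits to 9 and letters to A, keeping other characters as-is."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)
    return "".join(out)


def infer_type(value):
    """Classify a raw string as integer, decimal, or text."""
    try:
        int(value)
        return "integer"
    except ValueError:
        pass
    try:
        float(value)
        return "decimal"
    except ValueError:
        return "text"


def profile_column(values):
    """Return value, pattern, and data type frequencies for one column."""
    return {
        "values": Counter(values),
        "patterns": Counter(to_pattern(v) for v in values),
        "types": Counter(infer_type(v) for v in values),
    }


# Hypothetical sample column.
zip_codes = ["60007", "60173", "6017", "N/A", "60007"]
profile = profile_column(zip_codes)
print(profile["patterns"].most_common())
```

Rare patterns and types in the output (here the four-digit value and the "N/A" placeholder) are the candidates a steward would review first.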

Data Standards

• Data Quality Assessments
  – Standardize the process through detailed analysis procedures
  – Identify the different data quality problems using standardized notation
  – Summarize the analysis in reports to communicate to others
  – Create detailed examples to coincide with the analysis procedures

Data Standards

• Data Standardization
  – Address
    • Leverage address standardization software
  – Phone and Email
    • Leverage data quality software to standardize
  – Business data
    • Leverage valid values and master reference data to standardize data across the organization
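
To illustrate what phone and email standardization means in practice, here is a small Python sketch. It is not a substitute for the data quality software the slide recommends; the target phone format, the assumption of 10-digit US numbers, and the sample inputs are all illustrative.

```python
import re


def standardize_phone(raw):
    """Format a 10-digit US phone number as (NNN)NNN-NNNN, else return None."""
    digits = re.sub(r"\D", "", raw)
    # Drop a leading country code of 1 if present.
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    if len(digits) != 10:
        return None
    return f"({digits[:3]}){digits[3:6]}-{digits[6:]}"


def standardize_email(raw):
    """Trim surrounding whitespace and lower-case an email address."""
    return raw.strip().lower()


print(standardize_phone("1-555-010-0199"))
print(standardize_email("  Jane.Doe@Example.COM "))
```

Returning None for values that cannot be standardized keeps the failures visible, so they can be counted as data quality problems instead of being passed through unchanged.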

Data Standards

• Data Security
  – Identify sensitive data
  – Clearly define and publish procedure for requesting access
  – Identify and maintain lists of users with access rights
  – Validate regularly that the user still needs access
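
The regular validation step could be supported by a simple check over the access list, sketched below in Python. The 90-day review window, the record layout, and the user names are assumptions for the example and not part of the presentation.

```python
from datetime import date, timedelta


def users_due_for_review(access_list, today, max_age_days=90):
    """Return users whose access was last validated before the review window."""
    cutoff = today - timedelta(days=max_age_days)
    return [entry["user"] for entry in access_list
            if entry["last_validated"] < cutoff]


# Hypothetical access list for one sensitive data set.
access_list = [
    {"user": "analyst_1", "last_validated": date(2011, 1, 15)},
    {"user": "analyst_2", "last_validated": date(2011, 5, 20)},
]
due = users_due_for_review(access_list, today=date(2011, 6, 1))
print(due)
```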

Metadata Standards

• Documented agreements on representations, formats, and definitions of Metadata

Metadata Standards

• Metadata Stewards
  – Generally IT resources fill this role or position
  – Responsible for overseeing stewardship of the metadata
  – Standards are generally integrated into the SDLC

Metadata Categories

Model Metadata

• Business metadata
  – Business requirements
  – Functional requirements
  – Data requirements
• Data profiling metadata
  – Column profiling
  – Table profiling
  – Cross-table profiling
  – Cross-system profiling
• Data quality metadata
  – Data quality statistics
• Data modeling metadata
  – Enterprise data models
  – Logical models
  – Physical models
• Mapping metadata
  – Source-to-target mapping
  – Data Flow Diagrams
• Database metadata
  – Data Definition Language

Metadata Standards

• Data Requirements
  – Align with the business requirements
  – Each business requirement is likely to have matching data requirements
  – Clearly define the data content to be captured
  – Profile existing data sources

Metadata Standards

• Data Profiling
  – Identify standards for utilization
    • Create a step-by-step process for preparing the data, profiling the data, and analyzing the results
    • Identify and document the communication method to the business and IT

Metadata Standards

• Data Profiling
  – Column Profiling
    • Identify both valid and invalid
      – Values
      – Patterns
      – Data types
      – Lengths
    • Standardize notation
      – Descriptions
      – Problems
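
One way to picture standardized notation for column problems is a short list of problem codes applied uniformly to every value. The Python sketch below assumes three hypothetical codes (NULL, LEN, PAT) and a two-letter state code rule; an actual standard would define its own notation and rules.

```python
import re

# Hypothetical problem codes; a real standard would define its own notation.
PROBLEM_CODES = {
    "NULL": "Missing value",
    "LEN": "Length outside the expected range",
    "PAT": "Value does not match the expected pattern",
}


def check_value(value, pattern, min_len, max_len):
    """Return the list of problem codes that apply to one column value."""
    if value is None or value == "":
        return ["NULL"]
    problems = []
    if not (min_len <= len(value) <= max_len):
        problems.append("LEN")
    if re.fullmatch(pattern, value) is None:
        problems.append("PAT")
    return problems


# Hypothetical state code column: exactly two upper-case letters expected.
state_codes = ["IL", "il", "Illinois", "", "NY"]
report = {v: check_value(v, r"[A-Z]{2}", 2, 2) for v in state_codes}
for value, problems in report.items():
    print(repr(value), problems)
```

Because every finding is expressed with the same codes, results from different columns and different analysts can be summarized in one report.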

Metadata Standards

• Data Profiling
  – Table Profiling
    • Validate key structure
    • Identify candidate keys
    • Identify natural keys
    • Identify and document exceptions or violations
  – Cross-Table Profiling
    • Identify redundant data
    • Validate foreign keys
    • Identify orphaned rows
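
Two of the checks above, identifying candidate keys and identifying orphaned rows, can be sketched in Python over in-memory rows. The table and column names are hypothetical, and the brute-force search over column combinations is only practical for small samples; profiling tools use more scalable methods.

```python
from itertools import combinations


def candidate_keys(rows, columns, max_width=2):
    """Return minimal column combinations whose values are unique across rows."""
    keys = []
    for width in range(1, max_width + 1):
        for combo in combinations(columns, width):
            # Skip supersets of keys already found to keep results minimal.
            if any(set(k) <= set(combo) for k in keys):
                continue
            seen = {tuple(row[c] for c in combo) for row in rows}
            if len(seen) == len(rows):
                keys.append(combo)
    return keys


def orphaned_rows(child_rows, fk_column, parent_rows, pk_column):
    """Return child rows whose foreign key has no matching parent key."""
    parent_keys = {row[pk_column] for row in parent_rows}
    return [row for row in child_rows if row[fk_column] not in parent_keys]


# Hypothetical sample tables.
customers = [
    {"customer_id": 1, "email": "a@example.com", "state": "IL"},
    {"customer_id": 2, "email": "b@example.com", "state": "IL"},
    {"customer_id": 3, "email": "c@example.com", "state": "NY"},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 4},
]

keys = candidate_keys(customers, ["customer_id", "email", "state"])
orphans = orphaned_rows(orders, "customer_id", customers, "customer_id")
print(keys)
print(orphans)
```

A column that is unique in a sample is only a candidate; whether it is a true natural key still has to be confirmed with the business.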

Metadata Standards

• Data Profiling
  – Cross-system Profiling
    • Identify redundant data
    • Identify inconsistent data
    • Identify common matching criteria
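
As a rough illustration of cross-system profiling, the Python sketch below matches records from two systems on a shared key and lists the fields that disagree. The system names, the choice of email as the matching criterion, and the sample records are assumptions for the example.

```python
def compare_systems(system_a, system_b, match_key, compare_fields):
    """Match records on a shared key and report fields whose values differ."""
    b_by_key = {rec[match_key]: rec for rec in system_b}
    differences = []
    for rec_a in system_a:
        rec_b = b_by_key.get(rec_a[match_key])
        if rec_b is None:
            continue  # No match in system B; unmatched records are not compared.
        for field in compare_fields:
            if rec_a[field] != rec_b[field]:
                differences.append(
                    (rec_a[match_key], field, rec_a[field], rec_b[field])
                )
    return differences


# Hypothetical extracts from two systems that both hold customer data.
crm = [
    {"email": "a@example.com", "state": "IL"},
    {"email": "b@example.com", "state": "NY"},
]
billing = [
    {"email": "a@example.com", "state": "IL"},
    {"email": "b@example.com", "state": "NJ"},
]

diffs = compare_systems(crm, billing, "email", ["state"])
print(diffs)
```

Every matched record is redundant data held in two places; the reported differences are the inconsistent subset.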

Metadata Standards

• Data Quality
  – Consider requiring as part of all profiling initiatives
  – Capture and store in metadata repository
  – Establish thresholds
  – Trend monitoring
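
Thresholds and trend monitoring can be pictured with a small Python sketch that compares the latest stored statistic with a threshold and with the previous measurement. The metric (percent of populated values), the 95.0 threshold, and the history are hypothetical.

```python
def assess_metric(history, threshold):
    """Compare the latest score with a threshold and the prior measurement."""
    latest = history[-1]
    if len(history) < 2 or latest == history[-2]:
        trend = "flat"
    elif latest > history[-2]:
        trend = "improving"
    else:
        trend = "declining"
    return {
        "latest": latest,
        "meets_threshold": latest >= threshold,
        "trend": trend,
    }


# Hypothetical completeness scores from three successive profiling runs.
completeness_pct = [97.5, 96.0, 94.2]
result = assess_metric(completeness_pct, threshold=95.0)
print(result)
```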

Metadata Standards

• Data Modeling
  – Enterprise Data Model
    • Identify high level view of where the data lives across the enterprise
    • Centralize to make accessible across the organization
    • Consider identifying enterprise-level entities for important data

Metadata Standards

• Data Modeling
  – Model Standards
    • Standardized development process
    • Model naming convention
    • Name standards
    • Data type standards
    • Clearly documented review process
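
Name standards are easier to enforce when they can be checked automatically during model review. The Python sketch below assumes a hypothetical convention (lower snake_case column names ending in an approved class word); the presentation does not prescribe any particular convention.

```python
import re

# Hypothetical naming standard: lower snake_case ending in an approved class word.
APPROVED_CLASS_WORDS = {"id", "name", "code", "date", "amount", "indicator"}
NAME_PATTERN = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*")


def check_column_name(name):
    """Return a list of naming standard violations for one column name."""
    problems = []
    if NAME_PATTERN.fullmatch(name) is None:
        problems.append("not lower snake_case")
    elif name.rsplit("_", 1)[-1] not in APPROVED_CLASS_WORDS:
        problems.append("missing approved class word suffix")
    return problems


for column in ["customer_id", "CustomerName", "order_total"]:
    print(column, check_column_name(column))
```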

Metadata Standards

• Data Modeling
  – Logical/Physical Models Standards
    • Model or project narrative
    • Subject area
    • Entity
    • Relationships
    • Attribute
    • Identifier
    • Derived and BI Elements

Metadata Standards

• Data Modeling
  – Metadata Validation
    • Column level
      – Values
      – Patterns
      – Data types
      – Lengths
    • Table level
      – Key validation
    • Cross-table level
      – Foreign key relationships
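
Column-level metadata validation amounts to comparing what the model declares with what the data actually contains. The following Python sketch assumes a hypothetical column definition with a data type and length and checks observed string values against it; the integer test is simplified to unsigned digits only.

```python
def validate_column(declared, observed_values):
    """Compare observed values with the type and length declared in the model."""
    findings = []
    max_len = max((len(v) for v in observed_values), default=0)
    if max_len > declared["length"]:
        findings.append(
            f"observed length {max_len} exceeds declared length {declared['length']}"
        )
    if declared["data_type"] == "integer":
        # Simplification: only unsigned digit strings count as integers here.
        bad = [v for v in observed_values if not v.isdigit()]
        if bad:
            findings.append(f"non-integer values found: {bad}")
    return findings


# Hypothetical model definition and observed data for one column.
declared = {"name": "zip_code", "data_type": "integer", "length": 5}
observed = ["60007", "60173-1234", "N/A"]
findings = validate_column(declared, observed)
for finding in findings:
    print(finding)
```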

Metadata Standards

• Mapping
  – Standardize mapping process
  – Standardize format of mapping document
  – Require data profiling as part of the mapping process or to validate mapping
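
A standardized mapping document format lends itself to simple completeness checks. The Python sketch below assumes a hypothetical set of required fields for each source-to-target mapping row; the field names and sample rows are illustrative only.

```python
# Hypothetical required fields for each row of a source-to-target mapping.
REQUIRED_FIELDS = (
    "source_table",
    "source_column",
    "target_table",
    "target_column",
    "transformation",
)


def incomplete_mappings(mapping_rows):
    """Return (row number, missing fields) for rows lacking required entries."""
    issues = []
    for number, row in enumerate(mapping_rows, start=1):
        missing = [field for field in REQUIRED_FIELDS if not row.get(field)]
        if missing:
            issues.append((number, missing))
    return issues


mapping_rows = [
    {"source_table": "crm_customer", "source_column": "cust_nm",
     "target_table": "customer", "target_column": "customer_name",
     "transformation": "trim and title-case"},
    {"source_table": "crm_customer", "source_column": "st_cd",
     "target_table": "customer", "target_column": "",
     "transformation": "lookup against valid values list"},
]
issues = incomplete_mappings(mapping_rows)
print(issues)
```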

Recommendations

• Publish or centralize data and metadata standards

• Integrate data and metadata standards into the SDLC

• Include standards review during onboarding

• Identify and publish the list of stewards

• Enforce standards with offshore teams

Summary

• Data and metadata standards need to be developed and supported by both IT and the business

• Well-defined standards will enhance the development of new applications and simplify the integration of data across the organization

Questions?

Thank You!

• Antonio C. Amorin
  – [email protected]
  – (847)975-0217
• Data Innovations, Inc.
  – www.dataprofilers.com
  – (888)438-3717
