ldm webinar: data modeling & metadata management
TRANSCRIPT
Data Modeling & Metadata Management
Donna Burbank
Global Data Strategy Ltd.
Lessons in Data Modeling DATAVERSITY Series
October 27th, 2016
Global Data Strategy, Ltd. 2016
Donna is a recognized industry expert in information management with over 20 years of experience in data strategy, information management, data modeling, metadata management, and enterprise architecture.
She is currently the Managing Director at Global Data Strategy, Ltd., an international information management consulting company that specialises in the alignment of business drivers with data-centric technology. Previously, she has served in a number of roles related to data modeling & metadata:
• Metadata consultant (US, Europe, Asia, Africa)
• Product Manager PLATINUM Metadata Repository
• Director of Product Management, ER/Studio
• VP of Product Marketing, Erwin
• Data modeling & data strategy implementation & consulting
• Author of 2 books on data modeling & contributor to 1 book on metadata management, plus numerous articles
• OMG committee member of the Information Management Metamodel (IMM)
As an active contributor to the data management community, she is a long-time DAMA International member and is the President of the DAMA Rocky Mountain chapter. She has worked with dozens of Fortune 500 companies worldwide in the Americas, Europe, Asia, and Africa and speaks regularly at industry conferences. She has co-authored two books, Data Modeling for the Business and Data Modeling Made Simple with ERwin Data Modeler, and is a regular contributor to industry publications such as DATAVERSITY, EM360, & TDAN. She can be reached at [email protected] and is based in Boulder, Colorado, USA.
Donna Burbank
Follow on Twitter @donnaburbank
Today’s hashtag: #LessonsDM
Lessons in Data Modeling Series
• July 28th Why a Data Model is an Important Part of your Data Strategy
• August 25th Data Modeling for Big Data
• September 22nd UML for Data Modeling – When Does it Make Sense?
• October 27th Data Modeling & Metadata Management
• December 6th Data Modeling for XML and JSON
This Year’s Line Up
Agenda
• How data modeling fits within a larger metadata management landscape
• When can data modeling provide “just enough” metadata management
• Key data modeling artifacts for metadata
• Organization, roles & implementation considerations
• Summary & questions
What we’ll cover today
Metadata is Hotter than ever
A Growing Trend
In a recent DATAVERSITY survey, over 80% of respondents stated that:
Metadata is as important, if not more important, than in the past.
What is Metadata?
Metadata is Data In Context
Metadata is the “Who, What, Where, Why, When & How” of Data
Who What Where Why When How
Who created this data?
What is the business definition of this data element?
Where is this data stored?
Why are we storing this data?
When was this data created?
How is this data formatted? (character, numeric, etc.)
Who is the Steward of this data?
What are the business rules for this data?
Where did this data come from?
What is its usage & purpose?
When was this data last updated?
How many databases or data sources store this data?
Who is using this data?
What is the security level or privacy level of this data?
Where is this data used & shared?
What are the business drivers for using this data?
How long should it be stored?
Who “owns” this data?
What is the abbreviation or acronym for this data element?
Where is the backup for this data?
When does it need to be purged/deleted?
Who is regulating or auditing this data?
What are the technical naming standards for database implementation?
Are there regional privacy or security policies that regulate this data?
Metadata is Part of a Larger Enterprise Landscape
A Successful Data Strategy Requires Many Inter-related Disciplines
“Top-Down” alignment with business priorities
“Bottom-Up” management & inventory of data sources
Managing the people, process, policies & culture around data
Coordinating & integrating disparate data sources
Leveraging & managing data for strategic advantage
Metadata Across & Beyond the Organization
• Metadata exists in many sources across & beyond the organization.
COBOL
Legacy Systems
JCL
Spreadsheets
Media
Social Media
IoT
Open Data
Databases
Data Models
Documents
Data In Motion
Types of Metadata
• The DATAVERSITY Emerging Trends in Metadata survey revealed some interesting findings about what types of metadata organizations will be managing now and in the future.
[Chart: types of metadata managed Now vs. in the Future; a marker indicates types supported by most data modeling tools]
Data Models are a Good Source of Metadata
• Data Models are another good source of both business & technical metadata for relational databases.
• They store structural metadata as well as business rules & definitions.
• Key relationships are also stored to provide lineage & impact analysis.
Customer
Customer_ID | CHAR(18) | NOT NULL
First Name | CHAR(18) | NOT NULL
Last Name | CHAR(18) | NOT NULL
City | CHAR(18) | NULL
Date Purchased | CHAR(18) | NULL
Technical Metadata
Business Metadata
Data vs. Metadata
Customer
Metadata: First Name | Last Name | Company | City | Year Purchased
Data:
Joe | Smith | Komputers R Us | New York | 1970
Mary | Jones | The Lord’s Store | London | 1999
Proful | Bishwal | The Lady’s Store | Mumbai | 1998
Ming | Lee | My Favorite Store | Beijing | 2001
Data vs. Metadata
Customer
Metadata?: STR01 | STR02 | TXT123 | TXT127 | DT01
Data:
Joe | Smith | Komputers R Us | New York | 1970
Mary | Jones | The Lord’s Store | London | 1999
Proful | Bishwal | The Lady’s Store | Mumbai | 1998
Ming | Lee | My Favorite Store | Beijing | 2001
Metadata adds Context & Definition
Customer
First Name | Last Name | Company | City | Year Purchased
Joe | Smith | Komputers R Us | New York | 1970
Mary | Jones | The Lord’s Store | London | 1999
Proful | Bishwal | The Lady’s Store | Mumbai | 1998
Ming | Lee | My Favorite Store | Beijing | 2001
Definition: Last Name represents the surname or family name of an individual.
Business Rules: In the Chinese market, family name is listed first in salutations.
Format: VARCHAR(30)
Abbreviation: LNAME
Required: YES
Etc.: Numerous technical & business metadata including security, privacy, nullability, primary key, etc.
Is this the city where the customer lives or where the store is located?
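As a rough illustration of how the Last Name metadata on this slide could be held in one record, here is a minimal Python sketch. The class name ElementMetadata, its field names, and the describe helper are assumptions made for this example; they do not come from any particular modeling tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ElementMetadata:
    """Business and technical metadata for one data element (illustrative fields only)."""
    name: str
    definition: str                                           # business metadata
    business_rules: List[str] = field(default_factory=list)   # business metadata
    data_format: str = ""                                     # technical metadata
    abbreviation: str = ""
    required: bool = False


# Values taken from the slide's Last Name example.
last_name = ElementMetadata(
    name="Last Name",
    definition="Last Name represents the surname or family name of an individual.",
    business_rules=["In the Chinese market, family name is listed first in salutations."],
    data_format="VARCHAR(30)",
    abbreviation="LNAME",
    required=True,
)


def describe(meta: ElementMetadata) -> str:
    """Render the metadata as a short plain-text block."""
    lines = [
        f"{meta.name} ({meta.abbreviation})",
        f"Definition: {meta.definition}",
        f"Format: {meta.data_format}",
        f"Required: {'YES' if meta.required else 'NO'}",
    ]
    lines += [f"Business rule: {rule}" for rule in meta.business_rules]
    return "\n".join(lines)


print(describe(last_name))
```

Keeping the definition and business rules next to the format and nullability is what lets a reader answer questions such as which city the City column refers to.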
Technical & Business Metadata
• Technical Metadata describes the structure, format, and rules for storing data
• Business Metadata describes the business definitions, rules, and context for data.
• Data represents actual instances (e.g. John Smith)
CREATE TABLE EMPLOYEE (
employee_id INTEGER NOT NULL,
department_id INTEGER NOT NULL,
employee_fname VARCHAR(50) NULL,
employee_lname VARCHAR(50) NULL,
employee_ssn CHAR(9) NULL);
CREATE TABLE CUSTOMER (
customer_id INTEGER NOT NULL,
customer_name VARCHAR(50) NULL,
customer_address VARCHAR(150) NULL,
customer_city VARCHAR(50) NULL,
customer_state CHAR(2) NULL,
customer_zip CHAR(9) NULL);
Technical Metadata: the table definitions above
Data: John Smith
Business Metadata:
Term | Definition
Employee | An employee is an individual who currently works for the organization or who has been recently employed within the past 6 months.
Customer | A customer is a person or organization who has purchased from the organization within the past 2 years and has an active loyalty card or maintenance contract.
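To make the idea of technical metadata concrete, the sketch below loads the CUSTOMER definition from this slide into an in-memory SQLite database and reads the column metadata back from the catalog. SQLite and the helper name technical_metadata are assumptions chosen for convenience; other databases expose the same information through their own catalog views.

```python
import sqlite3

# The CUSTOMER table as shown on the slide.
DDL = """
CREATE TABLE CUSTOMER (
    customer_id      INTEGER      NOT NULL,
    customer_name    VARCHAR(50)  NULL,
    customer_address VARCHAR(150) NULL,
    customer_city    VARCHAR(50)  NULL,
    customer_state   CHAR(2)      NULL,
    customer_zip     CHAR(9)      NULL
);
"""


def technical_metadata(ddl, table):
    """Create the table in a scratch database and return its column metadata."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(ddl)
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
        rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    finally:
        conn.close()
    return [
        {"column": name, "type": col_type, "nullable": not notnull}
        for (_cid, name, col_type, notnull, _default, _pk) in rows
    ]


for column in technical_metadata(DDL, "CUSTOMER"):
    print(column)
```

Note that nothing in this output says what a customer is; that business definition has to come from a glossary or a data model.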
Business vs. Technical Metadata
• The following are examples of types of business & technical metadata.
Business Metadata
• Definitions & Glossary
• Data Steward
• Organization
• Privacy Level
• Security Level
• Acronyms & Abbreviations
• Business Rules
• Etc.
Technical Metadata
• Column structure of a database table
• Data Type & Length (e.g. VARCHAR(20))
• Domains
• Standard abbreviations (e.g. CUSTOMER -> CUST)
• Nullability
• Keys (primary, foreign, alternate, etc.)
• Validation Rules
• Data Movement Rules
• Permissions
• Etc.
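Standard abbreviations are one piece of technical metadata that is easy to apply mechanically. The sketch below turns a business attribute name into a physical column name. Only the CUSTOMER -> CUST mapping comes from the slide; the IDENTIFIER -> ID entry and the function name physical_name are assumptions for illustration.

```python
# CUSTOMER -> CUST comes from the slide; IDENTIFIER -> ID is an assumed entry.
ABBREVIATIONS = {"CUSTOMER": "CUST", "IDENTIFIER": "ID"}


def physical_name(logical_name, abbreviations=ABBREVIATIONS):
    """Upper-case each word, apply the standard abbreviation if one exists, join with underscores."""
    words = logical_name.upper().split()
    return "_".join(abbreviations.get(word, word) for word in words)


print(physical_name("Customer Identifier"))
print(physical_name("Order Identifier"))
```

Words without an entry in the abbreviation list pass through unchanged, so the list can grow over time without breaking existing names.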
Levels of Data Modeling
Conceptual: Business Concepts
Logical: Data Entities
Physical: Physical Tables
Business Metadata
Technical Metadata
Business Definitions
From Data Modeling for the Business by Hoberman, Burbank, Bradley, Technics Publications, 2009
Non-Traditional Sources
Not all metadata is in a relational database
Human Metadata
• Much business metadata and the history of the business exist in employees’ heads.
• It is important to capture this metadata in an electronic format for sharing with others.
• Avoid the dreaded “I just know”
Part Number is what used to be called Component Number before the acquisition.
Business Glossary
Metadata Repository
Data Models
Etc.
Data Modeling in the Big Data Ecosystem
[Diagram labels: Data Sources (Structured Data, Semi-structured Data such as JSON / XML, Unstructured Data); Hadoop Framework (HDFS File System, MapReduce / Analytics); Hive (HQL); HBase]
COBOL Copybook Metadata
• What is a COBOL Copybook? – In COBOL, a copybook file is used to define data elements that can be referenced by many programs
• What is COBOL Copybook Metadata? – structure, definition
Metadata: Describes structure & format of data
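To show what structural metadata in a copybook looks like, here is a minimal Python sketch that pulls level numbers, field names, and PIC clauses out of a small copybook. The copybook text is a made-up example, not one from the webinar, and the parser handles only simple one-line entries; real copybooks also use clauses such as OCCURS and REDEFINES that this sketch ignores.

```python
import re

# Hypothetical copybook used only for this example.
COPYBOOK = """\
       01  CUSTOMER-RECORD.
           05  CUSTOMER-ID        PIC 9(8).
           05  CUSTOMER-NAME      PIC X(30).
           05  CUSTOMER-CITY      PIC X(20).
"""

# Level number, data name, and an optional PIC clause, ending with a period.
FIELD = re.compile(r"^\s*(\d{2})\s+([A-Z0-9-]+)(?:\s+PIC\s+(\S+))?\.\s*$")


def parse_copybook(text):
    """Return one dict per recognised entry; lines that do not match are skipped."""
    fields = []
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            level, name, picture = match.groups()
            fields.append({"level": int(level), "name": name, "picture": picture})
    return fields


for entry in parse_copybook(COPYBOOK):
    print(entry)
```

Group items such as CUSTOMER-RECORD have no PIC clause, so their picture is None; the elementary items carry the format, for example X(30) for a 30-character field.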
ERP/CRM and Packaged Application Metadata
• Packaged applications such as CRM and ERP systems (e.g. Salesforce, PeopleSoft, etc.) are typically based on a relational database system.
• Therefore, they hold important metadata about both the physical table structures and the business names & definitions.
Technical Metadata Business Metadata
Relationship Metadata
Showing How Information Interrelates
Data Lineage - Data Warehousing Example
• In the data warehouse example below, metadata for CUSTOMER exists in a number of tools & data stores.
• This lineage can be tracked in most data modeling tools.
[Diagram: lineage of CUSTOMER across a Business Glossary, a Logical Data Model, Physical Data Models, database tables (CUSTOMER, CUST, TBL_C1), ETL Tools, a Dimensional Data Model, a BI Tool, and a Sales Report]
Metadata Discovery Tools
• Metadata Discovery Tools extract metadata from source systems, and rationalize them to a common metamodel and storage facility.
Metadata Discovery Tools
Metamodel(s)
Metadata Storage (Database)
Metadata Storage (Repository)
Metadata Population
Impact Analysis & Where Used
• Impact Analysis shows the relationship between a piece of metadata and other sources that rely on that metadata to assess the impact of a potential change.
• For example, if I change the length & name of a field, what other systems that are referencing that field will be affected?
What happens if I change the name & length of the “Brand” field?
Brand CHAR(10) -> MyBrand VARCHAR(30)
Sales Application
Sales Database (DB2)
Staging Area (ETL)
Customer Database (Oracle)
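Impact analysis amounts to walking a dependency graph. The sketch below is one simple way to do that in Python with a breadth-first search. The edges in DEPENDENTS are hypothetical: the slide names the four systems but does not spell out the exact flows between them, so the direction shown here is an assumption.

```python
from collections import deque

# Hypothetical "is read by" edges between copies of the Brand field.
DEPENDENTS = {
    "Customer Database (Oracle).Brand": ["Staging Area (ETL).Brand"],
    "Staging Area (ETL).Brand": ["Sales Database (DB2).Brand"],
    "Sales Database (DB2).Brand": ["Sales Application.Brand"],
}


def impacted_by(start, dependents):
    """Return every downstream item reachable from start, in breadth-first order."""
    seen = set()
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in dependents.get(node, []):
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


for item in impacted_by("Customer Database (Oracle).Brand", DEPENDENTS):
    print("Affected:", item)
```

Reversing the edges and running the same walk answers the lineage question instead: where did this field come from?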
Design Layer Relationships
• In a data model there are several design layers that describe a given data concept.
Organization, Roles & Implementation Considerations
Ensuring that metadata is used effectively across the organization
Who Uses Metadata?
• In addition to sharing metadata between tools and via export, many users across both IT & the business want to view the metadata through reports, portals, etc.
Developer: If I change this field, what else will be affected?
Business Person (e.g. Finance): What’s the definition of “Regional Sales”?
Auditor: How was “Total Sales” calculated? Show me the lineage.
Data Architect: What is the approved data structure for storing customer data?
Data Warehouse Architect: What are the source-to-target mappings for the DW?
Business Person (e.g. HR): How can I get new staff up-to-speed on our company’s business terminology?
Metadata is Needed by Business Stakeholders
Making business decisions on accurate and well-understood data
80% of users of metadata are from the business, according to the recent DATAVERSITY survey.
Metadata Publication & Reporting – Business Glossary
• A Business Glossary is a common way to publish business terms & their definitions.
• When sourced from a common repository, these terms are integrated with the wider data landscape.
• Most data modeling tools can take the definitions from Logical and/or Conceptual data models and publish them to a Glossary-style format, via web portals or reports.
Business Term | Abbreviation | Definition | Data Steward | Security Level
BFPO Number | BFPO Num | BFPO Number is for British Forces Postal Office. It can be used in UK and overseas addresses. | Accounting | Unclassified
Interest | Int | The growth in capital of a monetary investment | Finance | Unclassified
PO Box | POB | A numbered box in a post office assigned to a person or organization, where mail for them is kept until collected | Accounting | Unclassified
A feedback mechanism is important to gather valuable input & updates from users.
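The publishing step can be pictured as a small transformation from model definitions to glossary entries. The Python sketch below uses the three sample terms from this slide; the dictionary layout and the function name publish_glossary are assumptions for this example, since each data modeling tool has its own export format.

```python
# Sample terms from the slide, held the way a model export might provide them (assumed layout).
MODEL_ATTRIBUTES = [
    {"term": "PO Box", "abbreviation": "POB", "steward": "Accounting",
     "definition": "A numbered box in a post office assigned to a person or organization, "
                   "where mail for them is kept until collected"},
    {"term": "Interest", "abbreviation": "Int", "steward": "Finance",
     "definition": "The growth in capital of a monetary investment"},
    {"term": "BFPO Number", "abbreviation": "BFPO Num", "steward": "Accounting",
     "definition": "BFPO Number is for British Forces Postal Office. "
                   "It can be used in UK and overseas addresses."},
]


def publish_glossary(attributes):
    """Return an alphabetical, one-line-per-term glossary as plain text."""
    lines = []
    for attr in sorted(attributes, key=lambda a: a["term"].lower()):
        lines.append(
            f'{attr["term"]} ({attr["abbreviation"]}): {attr["definition"]} '
            f'[Steward: {attr["steward"]}]'
        )
    return "\n".join(lines)


print(publish_glossary(MODEL_ATTRIBUTES))
```

Listing the steward next to each term gives readers an obvious person to contact, which supports the feedback mechanism mentioned on the slide.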
Metadata Publication & Reporting – Lineage
• Data Lineage can be visualized through a web portal or reports.
• With web-based reporting, users can drill-down into each data source and investigate further lineage.
Metadata Publication & Reporting – Data Structures
• Having a common view of standard data structures is helpful for data architects, developers, etc.
• This can all be sourced from a data model.
Table Name | Column Name | Attribute Name | Data Type | Nullability | Primary Key | Definition
CUSTOMER | CUST_ID | Customer Identifier | VARCHAR(20) | NOT NULL | Yes | Customer ID is the unique identifier that locates a customer
CUSTOMER | F_NAME | First Name | VARCHAR(30) | NOT NULL | No | The given name of an individual
CUSTOMER | L_NAME | Last Name | VARCHAR(40) | NOT NULL | No | The family name of an individual
ORDER | ORDER_ID | Order Identifier | VARCHAR(10) | NOT NULL | Yes | The number assigned to an order from the FIX10 system that locates a unique order.
Etc.
Data Models can provide “Just Enough” Metadata Management
Capabilities compared: Metadata Storage | Metadata Lifecycle & Versioning | Data Lineage Visualization | Business Glossary | Data Modeling | Metadata Discovery & Integration w/Other Tools | Customizable Metamodel
Data Modeling Tools (e.g. Erwin, SAP PowerDesigner, Idera ER/Studio): x X x X X x
Metadata Repositories (e.g. ASG, Adaptive): X X X X X X
Data Governance Tools (e.g. Collibra, Diaku): x x X x x
Spreadsheets: x x x
• While data modeling tools are not metadata repositories, nor designed to be, they offer many features shared with these repository solutions:
  • Metadata storage, Data lineage visualization, Business Glossary, Integration with BI tools, ETL tools, etc.
• Metadata repositories have a broader range of metadata sources & dedicated metadata management support.
• And Data Modeling tools, of course, have the added benefit of doing data modeling!
• And the benefit is that much of the needed metadata is in these data models.
Key Components of Metadata Management
Metadata Strategy
• Alignment with business goals & strategy
• Identification of & feedback from key stakeholders
• Prioritization of key activities aligned with business needs & technical capabilities
• Prioritization of key data elements/subject areas
• Communication Plan developed
Metadata Capture & Storage
• Identification of all internal & external metadata sources
• Population/import mechanism for all identified sources
• Identification of existing metadata storage
• Definition of enterprise metadata storage strategy
Metadata Integration & Publication
• Identification of all technical metadata sources
• Identification of key stakeholders & audiences (internal & external)
• Integration mechanism for key technologies (direct integration, export, etc.)
• Publication mechanism for each audience
• Feedback mechanism for each audience
Metadata Management & Governance
• Metadata roles & responsibilities defined
• Metadata standards created
• Metadata lifecycle management defined & implemented
• Metadata quality statistics defined & monitored
• Metadata integrated into operational activities & related data management projects
Implementing a Metadata Strategy
• A successful metadata strategy requires input from multiple factors.
Business Drivers & Motivation
Metadata Sources & Technology
Metadata Management Maturity
Stakeholders & Audience
Metadata Strategy
Stakeholder Feedback
• Determine key business issues & drivers through direct feedback.
• I didn’t know we had any documented data standards
• Where do I go to get the definition of “default banking standard”?
• $12m has been spent on projects to clean up the data over the past 2-3 years
• What are the data structures used in the application?
• We have 15 customer databases – with many duplications.
• There is limited ownership or enforcement of common practices and standards across the projects
• Key subject matter experts are relied upon to review detailed data from various systems to ensure accuracy.
• I just joined the company and don’t understand all of the acronyms!
• There was an error in reporting products by customer & region that was noticed by upper management.
• I need a central, accurate view of all my customers worldwide.
Mapping Business Drivers to Metadata Management Capabilities
Business Drivers
External Drivers: Digital Self Service; Increasing Regulatory Pressures; Online Community & Social Media; Community Building
Internal Drivers: Targeted Marketing; 360 View of Customer; Brand Reputation; Efficient IT
Stakeholder Challenges
1 Lack of Business Alignment
  • Data spend not aligned to Business Plans
  • Business users not involved with data
2 Integrating Data
  • Siloed systems
  • No common view of key information
3 Data Quality Issues
  • Bad customer info causing Brand damage
  • Completeness & Accuracy Needed
4 Cost of Data Management
  • Manual entry increases costs
  • System redundancy
  • No reuse or standards
5 No Audit Trails
  • No lineage of changes
  • Fines had been levied in past for lack of compliance
6 Big Data Exploitation
  • Exploiting Unstructured Data
  • Access to External & Social Data
Shows “Heat Map” of Priorities
Metadata Capability | Stakeholder Challenges Addressed
Metadata Strategy | 1 2 3 4 5 6
Metadata Capture & Storage | 2 3 4 5 6
Metadata Integration & Publication | 2 3 4
Metadata Management & Governance | 1 2 3 4 5 6
Inventory & Usage Mapping
• It’s also important to determine which teams are using these technologies to create a “heat map” of usage & priority.
Metadata Sources Leadership Sales Finance Marketing Support R&D HR Legal Compliance
Relational Databases
MySQL X
Oracle X X X X X X X X
SQL Server X X
Sybase X
Etc.
BI Tools
Tableau X X X X X X
Qlik X X X
Etc.
Open Data
Data.gov – agricultural data X X X
Etc.
Metadata Roles & Responsibilities
• It’s important to establish formal roles & responsibilities for your metadata effort.
• Some may be part-time, and some full-time, but they should be clearly defined and communicated so that staff has understanding of and accountability for their roles.
• Executive Sponsor/Champion: Understands & communicates the importance of metadata management across the organization.
• Steering Group: As part of a metadata management effort, or part of a larger data governance effort, the steering group prioritizes & sets direction for key activities.
• Data Stewards: Responsible for business definitions & rules for key data elements.
• Metadata Repository Administrator: Manages the administration, population, and interfaces of a metadata repository.
• Metadata Publicist: Establishes reports & publication methods to end users.
• Metadata Consumers: Actively use metadata as part of their daily jobs, and are held accountable for using published standards.
  • Data Modelers
  • Developers
  • Business Users
  • Report Developers
  • Etc.
Monitoring Metadata Quality & Metrics
• Metadata is a key driver of data quality, and to support this, the metadata itself must be of high quality.
• In order to ensure that quality metadata is maintained, it must be actively managed and monitored. Dashboards & Reports can be used to monitor key quality indicators.
• Key metadata quality indicators include:
• Completeness: e.g. Do definitions exist for all key data elements?
• Accuracy: e.g. Are current definitions correct? Do data types accurately represent currently implemented standards?
• Currency/ Timeliness: e.g. Are metadata definitions current or outdated?
• Consistency: e.g. Are metadata standards defined, published & implemented consistently across the organization?
• Accountability: e.g. Are data stewards or owners defined?
• Integrity: e.g. Are linkages and relationships established between critical metadata items?
• Privacy: e.g. Is any metadata subject to privacy restrictions?
• Usability: e.g. Are people actually using this metadata?
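Two of these indicators, Completeness and Accountability, are straightforward to compute once metadata is stored in a structured form. The Python sketch below shows one possible calculation. The element list is made up for illustration (the steward assignments in particular are assumptions), and the function name coverage is hypothetical.

```python
# Illustrative elements; definitions echo earlier slides, stewards are assumed values.
ELEMENTS = [
    {"name": "CUST_ID", "definition": "Customer ID is the unique identifier that locates a customer",
     "steward": "Sales"},
    {"name": "F_NAME", "definition": "The given name of an individual", "steward": None},
    {"name": "L_NAME", "definition": "", "steward": None},
    {"name": "ORDER_ID", "definition": "The number assigned to an order", "steward": "Finance"},
]


def coverage(elements, key):
    """Fraction of elements whose value for key is present and non-blank."""
    if not elements:
        return 0.0
    filled = sum(1 for element in elements if (element.get(key) or "").strip())
    return filled / len(elements)


completeness = coverage(ELEMENTS, "definition")    # do definitions exist?
accountability = coverage(ELEMENTS, "steward")     # are stewards defined?
print(f"Completeness: {completeness:.0%}")
print(f"Accountability: {accountability:.0%}")
```

Figures like these can feed the dashboards and reports mentioned above; indicators such as Accuracy and Usability need human review or usage logs and cannot be derived from the metadata alone.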
Summary
• Metadata is more important than ever
• Data models are a rich source of metadata
• While metadata repositories are valuable, data models & associated functionality can often provide “just enough” metadata management
  • Business definitions
  • Technical data structures
  • Data lineage & impact analysis
  • Visual models
• Organizational considerations are critical to achieve success
  • Understanding business drivers
  • Defining roles & responsibilities
  • Monitoring metadata quality & metrics
• Have fun! Metadata is for the cool kids.
About Global Data Strategy, Ltd
• Global Data Strategy is an international information management consulting company that specializes in the alignment of business drivers with data-centric technology.
• Our passion is data, and helping organizations enrich their business opportunities through data and information.
• Our core values center around providing solutions that are:
  • Business-Driven: We put the needs of your business first, before we look at any technology solution.
  • Clear & Relevant: We provide clear explanations using real-world examples.
  • Customized & Right-Sized: Our implementations are based on the unique needs of your organization’s size, corporate culture, and geography.
  • High Quality & Technically Precise: We pride ourselves in excellence of execution, with years of technical expertise in the industry.
Data-Driven Business Transformation
Business Strategy Aligned With Data Strategy
Visit www.globaldatastrategy.com for more information
Contact Info
• Email: [email protected]
• Twitter: @donnaburbank, @GlobalDataStrat
• Website: www.globaldatastrategy.com
• Company Linkedin: https://www.linkedin.com/company/global-data-strategy-ltd
• Personal Linkedin: https://www.linkedin.com/in/donnaburbank
White Paper: Emerging Trends in Metadata Management
• Download from www.dataversity.net
• Under ‘Whitepapers’
Free Download
DATAVERSITY Training Center
• Learn the basics of Metadata Management and practical tips on how to apply metadata management in the real world. This online course hosted by DATAVERSITY provides a series of six courses including:
• What is Metadata
• The Business Value of Metadata
• Sources of Metadata
• Metamodels and Metadata Standards
• Metadata Architecture, Integration, and Storage
• Metadata Strategy and Implementation
• Purchase all six courses for $399 or individually at $79 each.
• Other courses available on Data Governance & Data Quality
Online Training Courses
New Metadata Management Course
Visit: http://training.dataversity.net/lms/
Lessons in Data Modeling Series
• July 28th Why a Data Model is an Important Part of your Data Strategy
• August 25th Data Modeling for Big Data
• September 22nd UML for Data Modeling – When Does it Make Sense?
• October 27th Data Modeling & Metadata Management
• December 6th Data Modeling for XML and JSON
Join us next time
Questions?
Thoughts? Ideas?