Business Impacts of Poor Data Quality: Building the Business Case
David Loshin Knowledge Integrity, Inc.
www.knowledge-integrity.com
© 2010 Knowledge Integrity, Inc. www.knowledge-integrity.com
(301)754-6350
Data Quality Challenges
Data Challenges:
- Multiple sources
- Inconsistency
- Duplication
- Ambiguity
- Repurposing
- Process failures
Addressing the Problem
- To address data quality effectively, we must be able to manage the:
  - Identification of customer data quality expectations
  - Definition of contextual metrics
  - Assessment of levels of data quality
  - Tracking of issues for process management
  - Determination of best opportunities for improvement
  - Elimination of the sources of problems
  - Continuous measurement of improvement against baseline
Data Quality Processes
Diagram labels:
- DQ Assessment
- DQ Issue Reporting
- Resolution Workflow
- Performance Monitoring
- DQ Issues Tracking
- Identify the Problem
- Measure the Improvement
- Act on What is Learned
- Assess the Size and Scope
- DQ Inspection
- Acceptability Thresholds
- Remediation Actions
- Service Level Agreements
- Data Standards
- DQ Rules & Metrics
- Metadata Management
Understanding Business Impacts of Data Flaws
- Consider costs and risks related to data use
- Understand data quality expectations
- Define data validity rules
- Measure and report business-related data quality
Impact categories shown in the diagram: Financial, Risk, Productivity, Trust
Aligning Data Quality with Business Expectations
- Validity of data
  - Completeness
  - Duplicates
  - Consistency
  - Syntax errors
?
- Business process performance
  - Improved Financials
  - Reduced Risk
  - Increased Productivity
  - Increased Trust
Understanding Business Process Impacts
- For each perceived business problem:
  - What makes this a critical business problem?
  - What are the measurable impacts?
  - How is each impact classified?
  - How is the impact measured?
- Assess the relationship to flawed data:
  - How is the business problem related to an application data issue?
  - How often does the data issue occur?
  - When the data issue occurs, how is it identified?
  - How often is the data issue identified before the business impact is incurred?
Financial Impact Classification
- Financial
  - Overhead
  - Direct
  - Fees
  - Combined Ratio
  - Cash Flow
  - Depreciation
Productivity Impact Classification
- Productivity
  - Workloads
  - Throughput
  - Output Quality
  - Supply
  - Volume
  - Staffing
Risk Impact Classification
- Risk
  - Regulatory compliance
  - Bureau
  - Diversity
  - Model/Parameter risk
  - CAT Management
  - Fraud
Trust Impact Classification
- Trust
  - Forecasting
  - Reporting
  - Decisions
  - Employee satisfaction
  - Customer satisfaction
  - Credibility
Successive Refinement – Drilling Down
- Risk
  - Diversity
    - Property Locations/CAT Management
    - Variety of Core Products
    - Agency/Broker Distribution
    - Market Dislocations
    - Growth Potential/Risk
    - International Diversification
    - Cross-Industry Diversification
    - Rent
Examples – Insurance
- Health insurance company:
  - Incomplete diagnostic codes skew the calculation of premiums, leading to a significant decrease in profitability
- Health insurance company:
  - Missing and invalid data impair the ability to calculate amounts of reserves for risk assurance
- Property & Casualty insurance company:
  - Inconsistency of location data impacts assessment of potential expenses involved in insuring the client (regional/local taxes and fees)
- Property & Casualty insurance company:
  - Inconsistent data affects determination of changes in capacity based on exposure in a given geographic area
- Property & Casualty insurance company:
  - Difficulty in resolving unique customer identities impacts evaluation of overall corporate risk
Examples – Financial
- Energy services company:
  - Inconsistent supplier data results in early (and incorrect) payments
  - Increased effort for entering the same data multiple times
- DoD Guidelines on Data Quality:
  - “… the inability to match payroll records to the official employment record can cost millions in payroll overpayments to deserters, prisoners, and “ghost” soldiers.”
  - “… the inability to correlate purchase orders to invoices is a major problem in unmatched disbursements.”
- Telecommunications company:
  - Revenue assurance applied to detect underbilling indicated revenue leakage of just over 3 percent of total revenue due to poor data quality
  - Identified 49 misconfigured (but assumed to be unusable) high-bandwidth circuits that could be returned to productive use
Examples – Risk
- Pharmaceutical/medical device company:
  - Party database used to manage grantees
  - Grantees may also be providers
  - Inability to properly track grantees exposed the company to risk of violating the Federal Anti-Kickback statute
- Banking industry, credit risk:
  - Low-documentation and no-documentation loans
  - Risk models with vague/incorrect assumptions
Examples – Trust
- Pharmaceutical company:
  - Large investment made in creating a front-end sales application fed by a back-end database
  - Application clients refused to use the new application due to mistrust of the back-end database
- Agriculture company:
  - Multiple sales databases conflicted with accounting databases
  - Sales staff did not trust that their commissions were being properly calculated
Assessment and Building the Business Case
- Identify key business performance criteria related to data quality assurance
- Review how data problems contribute to each business impact
- Determine how frequently each impact occurs
- Sum the measurable impacts/costs associated with each impact incurred by a data quality issue
- Assign an average cost to each occurrence of the problem
- Validate the evaluation with subject matter experts
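
The frequency and average-cost steps above reduce to simple arithmetic: estimated annual cost = occurrences per year × average cost per occurrence, summed over the impacts attributed to data issues. The following is a minimal sketch of that calculation, not the author's worksheet; the `ImpactEstimate` class, the issue names, and every figure are hypothetical placeholders that subject matter experts would replace.

```python
from dataclasses import dataclass


@dataclass
class ImpactEstimate:
    issue: str                      # data quality issue (hypothetical label)
    impact: str                     # business impact it contributes to
    occurrences_per_year: float     # how frequently the impact occurs
    avg_cost_per_occurrence: float  # average cost assigned to each occurrence

    @property
    def annual_cost(self) -> float:
        return self.occurrences_per_year * self.avg_cost_per_occurrence


def total_annual_cost(estimates):
    """Sum the estimated annual costs across all impacts."""
    return sum(e.annual_cost for e in estimates)


# Illustrative numbers only; they are assumptions, not measured results.
estimates = [
    ImpactEstimate("Duplicate supplier records", "Incorrect payments", 120, 250.0),
    ImpactEstimate("Missing postal codes", "Returned shipments", 400, 18.5),
]

# Rank the impacts so the largest estimated costs are reviewed first.
for e in sorted(estimates, key=lambda e: e.annual_cost, reverse=True):
    print(f"{e.issue} -> {e.impact}: {e.annual_cost:,.2f} per year")
print(f"Total: {total_annual_cost(estimates):,.2f} per year")
```

Keeping frequency and unit cost as separate fields makes it easier to validate each assumption with subject matter experts independently.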
Data Quality Assessment – Goals & Objectives
- Data quality assessment uses data profiling and other analyses to:
  - Identify specific data issues related to known business impacts
  - Introduce a process for assessing objective data quality
  - Support the process of defining data quality dimensions and corresponding data quality validations and measures
  - Correlate discovered issues to business impacts
Data Quality Assessment – Process
- Plan
  - Select business process for review
  - Assess scope
  - Acquire system docs
  - Identify business impacts
  - Assess existing DQ process
  - Project plan
- Business Process
  - Review system docs
  - Review existing DQ issues
  - Collate business impacts
  - IP-MAP
- Prepare
  - List data sets
  - Critical data elements
  - Proposed measures
  - Prepare DQ tools
- Analyze
  - Data extraction
  - Data profiling
  - Data analysis
  - Drill-down
  - Note findings
- Synthesize
  - Review anomalies
  - Describe issues
  - Prepare report
- Review
  - Present anomalies
  - Verify criticality
  - Prioritize issues
  - Suggest action items
  - Review next steps
  - Develop action plan
Phase 1: Plan
Steps:
- Select a business process for review
- Assess scope
- Collect system documentation
- Identify business expectations
- Review existing data quality monitoring
- Prepare DQ assessment plan
Supporting tasks:
- Choose a business process impacted by poor data quality
- Assess scope of applications supplying data
- Identify data sets to be analyzed
- Note data demographics
- Identify resources
- Acquire business process flows
- Identify data quality problems
- Acquire metadata
- Acquire additional documentation
- Seek references to DQ-related impacts
- Gap review
- Prepare list of business DQ expectations
- Review resource, schedule requirements
- Identify staff resources
- Adjust project plan template
- Adjust schedule
- Meet with SMEs
- Document existing edits and validation rules
- Identify DQ metrics in use
Adjusting the Template Plan
Phase 2: Business Process Evaluation
Steps:
- Review application documentation
- Review existing issues
- Collate business impacts
- Develop IP-Map
- Review with Data Governance team
Supporting tasks:
- Research application architecture and information flows
- Review data dictionary, metadata, and reference data
- Review reported incidents, interview SMEs to identify business impacts
- Organize reported data issues by business impact, and prioritize by impact
- Construct Information-Product map (IP-Map)
- Present developed artifacts (data elements, business impact template, IP-Map)
- Prioritize issues for review
- Update plan
- Obtain resources
Business Impacts

| Impact Category | Examples of issues for review |
| --- | --- |
| Operational Efficiency | Time and costs of cleansing data or processing corrections; inaccurate performance measurements for employees; inability to identify suppliers for spend analysis |
| Risk/Compliance | Missing credit data leads to inaccurate credit risk; regulatory compliance violations; privacy violations |
| Revenue | Lost opportunity cost; identification of high net worth customers; increased value from matching against master customer database |
| Productivity | Decreased ability for straight-through processing via automated services |
| Satisfaction | Reduced ease-of-use for staff; inability to provide unified billing to customers |
| Performance | Impaired decision-making for setting prices |
Business Impact Template

| Issue ID | Data Issue | Business Impact | Measure | Severity |
| --- | --- | --- | --- | --- |
| Assigned identifier for the issue | Description of the issue | Description of the business impact attributable to the data issue; there may be more than one impact for each data issue | A means for measuring the degree of impact | An estimate of the quantification of the cumulative impacts |
Prepare for Data Quality Assessment
Steps:
- List data sets to be analyzed
- Identify critical data elements
- Define data quality measurements
- Ensure access to data
- Prepare data quality tools for use
Supporting tasks:
- Prepare a list of the data sets and the data elements that are to be analyzed
- List critical data elements associated with data sets and data quality issues
- Specify initial set of data quality rules
- Specify initial set of measurements and solicit acceptability thresholds from the business user
- Specify initial measurement processes to be used
- Check if there is direct access to the data
- Verify tool access to data
- Extract data if necessary and reformulate for use by profiling tools
- Verify access to data profiling tools
- Verify access to query tools
- Verify access to extraction tools
- Identify and verify access to alternate analysis tools if needed
Documenting Data Elements

| Data Set | Data Element | Comment |
| --- | --- | --- |
| Table name | Data element name | Description of data element; specifics of metadata implying business rules or potential data quality issues |
Classifying Data Quality Rules
- Standardizing classes of rules for data quality simplifies measurement
- These categories are intended to represent different measurable aspects of data quality
  - Used in characterizing relevance across different source data sets
  - Measurements are taken to review compliance with data quality rules
- Each group within the organization has the freedom to introduce its own data quality rules with its own priorities
Rule classes shown in the diagram (grouped as Intrinsic and Contextual): Timeliness, Accuracy, Lineage, Semantic, Structure, Currency, Completeness, Consistency, Identifiability, Reasonableness
Measuring Data Quality

| Data Set | Data Element | Rule Category | Measurement Process | Acceptability Threshold |
| --- | --- | --- | --- | --- |
| Table name | Data element name | The class of data quality rule being measured | Method used for measurement, one of: data profiling statistics; data profiling, validation rule; SQL query; other tools; combination of techniques; manual measurement process | Quantified level that demonstrates data meets business expectations |
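
One way to read the table above: each row pairs a rule with a measurement and a threshold that the measurement must meet. The sketch below, offered as an illustration under stated assumptions rather than a prescribed method, measures a completeness rule as the fraction of conforming records and compares it with an acceptability threshold. The `measure_rule` helper, the sample records, and the 0.95 threshold are all hypothetical; in practice the threshold is solicited from the business user.

```python
def measure_rule(records, rule):
    """Return the fraction of records that satisfy `rule` (a predicate)."""
    if not records:
        return 1.0  # assumption: an empty data set is treated as conformant
    passed = sum(1 for r in records if rule(r))
    return passed / len(records)


# Hypothetical sample of a customer data set.
customers = [
    {"id": 1, "postal_code": "20901"},
    {"id": 2, "postal_code": ""},
    {"id": 3, "postal_code": "10001"},
    {"id": 4, "postal_code": None},
]

# Completeness rule: postal_code must be present and non-empty.
completeness = measure_rule(customers, lambda r: bool(r["postal_code"]))
threshold = 0.95  # hypothetical acceptability threshold
acceptable = completeness >= threshold

print(f"postal_code completeness: {completeness:.0%} "
      f"(threshold {threshold:.0%}, acceptable: {acceptable})")
```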
Data Profiling and Analysis
Steps:
- Data extraction
- Column analysis
- Table and cross-table profiling
- Validate against reference data
- Exact duplicate analyses
- Statistical analysis
- Document results
- Complete data profiling report template
Supporting tasks:
- Extract the data if necessary
- Set up for tool access to data for drill-down, if necessary
- Conduct frequency-based analysis (max, min, high-frequency values, outliers, counts, nulls)
- Populate column profiling template
- Key analysis, dependency analysis
- Populate table profiling template
- Validate use of reference data
- Cross-table consistency validation using cross-table profiling
- Compute mean and standard deviation of numeric data, durations, counts
- Identify any outliers and anomalies
- Identify identifying attributes and measure exact duplicate records based on the set of identifying attributes
- Document potential anomalies in observation template
- Validate identified business rules and document measures in observation template
- Provide descriptive detail of potential anomalies
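
Two of the tasks above lend themselves to a short illustration: measuring exact duplicates over a set of identifying attributes, and flagging outliers from the mean and standard deviation. The sketch below assumes in-memory records as dictionaries and a cutoff of two standard deviations; the function names, sample data, and cutoff are assumptions for illustration, and a profiling tool would normally do this work.

```python
from collections import Counter
from statistics import mean, pstdev


def exact_duplicates(records, key_attrs):
    """Count records that share the same values for the identifying attributes."""
    counts = Counter(tuple(r[a] for a in key_attrs) for r in records)
    return {key: n for key, n in counts.items() if n > 1}


def outliers(values, z=2.0):
    """Return values more than `z` standard deviations from the mean."""
    if len(values) < 2:
        return []
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [v for v in values if abs(v - mu) > z * sigma]


# Hypothetical data; name plus date of birth are the identifying attributes.
people = [
    {"name": "Ann Lee", "dob": "1980-01-02", "city": "Boston"},
    {"name": "Ann Lee", "dob": "1980-01-02", "city": "Cambridge"},
    {"name": "Raj Patel", "dob": "1975-06-30", "city": "Austin"},
]
duplicates = exact_duplicates(people, ["name", "dob"])
amounts = [10, 12, 11, 13, 12, 95]
flagged = outliers(amounts)

print(duplicates)  # candidate duplicates to document in the observation template
print(flagged)     # potential anomalies to review, not confirmed errors
```

Flagged values are candidates for review only; whether they are errors depends on the business rules for the data element.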
Column Analysis Template

| Table and Column Name | Record Count | Inferred Data Type | # Distinct | # Null | % Null | Max | Min | Number of Patterns | Mean | Median | Standard Deviation |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
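
The template columns above can be populated with a few lines of code when a profiling tool is not at hand. The following is a minimal sketch using only the Python standard library. It assumes that `None` marks a null (empty strings count as values), that a "pattern" abstracts letters to `A` and digits to `9`, and that type inference only distinguishes numeric from text; `profile_column` and `pattern_of` are hypothetical helper names.

```python
import re
from statistics import mean, median, pstdev


def pattern_of(value):
    """Abstract a value into a format pattern: letters -> 'A', digits -> '9'."""
    text = re.sub(r"[A-Za-z]", "A", str(value))
    return re.sub(r"[0-9]", "9", text)


def profile_column(name, values):
    """Compute the Column Analysis Template measures for one column."""
    non_null = [v for v in values if v is not None]
    numeric = bool(non_null) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in non_null
    )
    # Compare text values as strings so mixed content does not raise TypeError.
    comparable = non_null if numeric else [str(v) for v in non_null]
    nulls = len(values) - len(non_null)
    profile = {
        "column": name,
        "record_count": len(values),
        "inferred_type": "numeric" if numeric else "text",
        "distinct": len(set(non_null)),
        "null": nulls,
        "pct_null": 100.0 * nulls / len(values) if values else 0.0,
        "max": max(comparable, default=None),
        "min": min(comparable, default=None),
        "patterns": len({pattern_of(v) for v in non_null}),
        "mean": None,
        "median": None,
        "std_dev": None,
    }
    if numeric:
        profile["mean"] = mean(non_null)
        profile["median"] = median(non_null)
        profile["std_dev"] = pstdev(non_null)
    return profile


# Hypothetical columns for illustration.
postal = profile_column("customer.postal_code",
                        ["20901", "10001", None, "2090A", "20901"])
amount = profile_column("order.amount", [10, 20, 30, None])
print(postal)
print(amount)
```

A pattern count greater than one for a column expected to have a single format (as with the postal codes here) is the kind of result that would be noted for review.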
Observation Template

| ID | Table/Column Name | Inspection | Reported Items | Issues for Review | Fitness Assessment |
| --- | --- | --- | --- | --- | --- |
| Assigned identifier for issue | Table name and column name(s) | What measure was reviewed | Result of measurement | What needs to be reviewed, next steps | Characterized based on business impact and severity |
Synthesis of Results
Steps:
- Review discovered anomalies
- Document potential data quality issues
- Evaluate data scope and need for additional analysis
- Prepare draft data quality assessment report
Supporting tasks:
- Review data profiling and analysis results
- Map potential issues to business impacts
- Provide details of anomalies and reasons for suspicion
- Complete “Issues for Review” column for each observation in the Observation template
- Determine if segmentation of tables by reference categories will provide additional insight
- Determine if other tables should be reviewed
- Perform any additional analyses on the data
- Prioritize discovered issues
- Document fitness review
- Recommendations for issue remediation, data quality improvements, inspection
- Populate the data quality analysis report
Recommendation Template

| ID | Priority | Recommendation |
| --- | --- | --- |
| Unique identifier | As assigned by business partner | Driver for the recommendation; reason for assigned priority; specific actions to take |
Data Quality Assessment Report
1. Executive Summary, providing a high-level overview of the task and the results.
2. Introduction, describing how data profiling and additional analyses were used to assess the quality of selected data sets.
3. Goals, enumerating the specific goals of the analysis, such as “reviewing the quality of data prior to integration in a data warehouse.”
4. Scope, detailing the results of task 1.2 and the business impacts identified in the tasks under phase 2.
5. Approach, describing the details of the outputs of phase 3, namely profiling and analyses to be performed, identified critical data elements, proposed measurements, and the techniques applied.
6. Data Analysis Results, providing the observations listed in the reasonableness template completed during phase 4.
7. Recommendations, detailing the suggestions resulting from the synthesis of phase 5.
8. Open Issues, in which any unresolved questions are listed.
9. Next Steps, providing the action items resulting from the recommendations review and any requirements to resolve any of the open issues.
10. Additional Supporting Material, such as raw statistics from the column, table, and cross-table templates and any other (non-profiling) analyses to support the recommendations.
Client Review
Steps:
- Present and verify discovered issues, recommendations
- Prioritize issues and suggestions
- Identify and document action items
- Develop plan for selected data quality improvement tasks
Supporting tasks:
- Conduct walk-through of the observations in the draft report
- Describe the measure, impact, expectation, and the actual measured value
- Provide justification of why the observation was noted
- Accumulate additional information from business clients and subject matter experts
- Note those discovered issues that are relevant to business impacts
- Prioritize the issues based on perceived business value from business clients
- Present concrete steps that can be taken to eliminate root causes of data issues
- Document action items to be incorporated into the data quality improvement plan
- Plan for data cleansing
- Define and enforce data standards
- Institute data validity inspection and reporting
- Plan modifications for business processes, applications, or data models
Data Profiling Concepts
- Column analysis
  - Review statistical aspects of values within a column
- Cross-column dependency analysis
  - Review relationships across sets of columns within a single view
- Cross-table redundancy analysis
  - Review overlapping data across columns in different tables
Diagram labels: Source Data; Profiler; Frequency Analysis (Frequency Distribution; Range Analysis; Distinction; Sparseness, Value Absence, Nulls; Format Evaluation; Cardinality and Uniqueness; Abstract Type Recognition; Overloading)
Column Profiling Techniques
- Range Analysis
- Sparseness
- Format Evaluation
- Cardinality and Uniqueness
- Frequency Distribution
- Value Absence
- Abstract Type Recognition
- Overloading
Cross-Column Analysis
- Key discovery
- Normalization & structure analysis
- Derived-value columns
- Business rule discovery
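
Key discovery and dependency analysis can both be illustrated with brute-force checks over a small sample. The sketch below searches for minimal column combinations whose values are unique (candidate keys) and tests whether one set of columns determines another (a functional dependency). It is an illustration under assumptions, not how any particular profiling tool works: the function names and sample rows are hypothetical, and a dependency observed in a sample is only a hypothesis to confirm with subject matter experts.

```python
from itertools import combinations


def is_unique(records, cols):
    """True if no two records share the same values for `cols`."""
    seen = set()
    for r in records:
        key = tuple(r[c] for c in cols)
        if key in seen:
            return False
        seen.add(key)
    return True


def candidate_keys(records, columns, max_size=2):
    """Find minimal column combinations (up to `max_size`) that are unique."""
    keys = []
    for size in range(1, max_size + 1):
        for cols in combinations(columns, size):
            # Skip supersets of keys already found; they are not minimal.
            if any(set(k) <= set(cols) for k in keys):
                continue
            if is_unique(records, cols):
                keys.append(cols)
    return keys


def holds_dependency(records, lhs, rhs):
    """True if each `lhs` value combination maps to exactly one `rhs` value."""
    mapping = {}
    for r in records:
        left = tuple(r[c] for c in lhs)
        right = tuple(r[c] for c in rhs)
        if mapping.setdefault(left, right) != right:
            return False
    return True


# Hypothetical sample rows.
orders = [
    {"order_id": 1, "zip": "20901", "city": "Silver Spring"},
    {"order_id": 2, "zip": "20901", "city": "Silver Spring"},
    {"order_id": 3, "zip": "10001", "city": "New York"},
]
keys = candidate_keys(orders, ["order_id", "zip", "city"])
zip_determines_city = holds_dependency(orders, ["zip"], ["city"])
print(keys, zip_determines_city)
```

A dependency such as zip determining city also hints at a normalization opportunity, which ties key discovery to the structure analysis listed above.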
Cross-Table Analysis
- Foreign key analysis
- Synonyms
- Reference data coordination
- Business rule discovery
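
Foreign key analysis, the first item above, often comes down to finding "orphan" values: values in a referencing column that do not appear in the referenced table. The following is a minimal sketch of that check; the `orphan_values` helper and the table and column names are hypothetical, and nulls in the referencing column are ignored here by assumption.

```python
def orphan_values(child_rows, child_col, parent_rows, parent_col):
    """Return child values that have no matching value in the parent table."""
    parent_keys = {r[parent_col] for r in parent_rows}
    return sorted(
        {
            r[child_col]
            for r in child_rows
            if r[child_col] is not None and r[child_col] not in parent_keys
        }
    )


# Hypothetical tables.
customers = [{"customer_id": 1}, {"customer_id": 2}]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 3},
    {"order_id": 12, "customer_id": 2},
    {"order_id": 13, "customer_id": 3},
    {"order_id": 14, "customer_id": None},
]
orphans = orphan_values(orders, "customer_id", customers, "customer_id")
print(orphans)  # order rows referencing these customers fail the foreign key check
```

The same comparison against a reference table, rather than a parent table, covers the reference data coordination item.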
Ongoing Monitoring Using Data Profiling
- Rule validation can be used to assert data quality expectations throughout the processing flow
- Use profiling jobs as “probes” across the information flow graph to identify where flaws are introduced
- Correlate occurrences of errors to documented business impact for prioritization
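
The "probe" idea above can be sketched by running the same validation rules against the data as it stands at each stage of a flow and reporting the first stage at which a rule starts to fail. This is a simplified illustration under assumptions: the stage names, rules, and records are hypothetical, and the stages are modeled as an ordered dictionary of snapshots rather than a real information flow graph.

```python
def probe(records, rules):
    """Count rule violations in one stage's data: {rule name: failures}."""
    return {
        name: sum(1 for r in records if not rule(r))
        for name, rule in rules.items()
    }


def first_failing_stage(report, rule_name):
    """Return the earliest stage (in flow order) where the rule is violated."""
    for stage, counts in report.items():
        if counts[rule_name] > 0:
            return stage
    return None


# Hypothetical validation rules asserting data quality expectations.
rules = {
    "amount_non_negative": lambda r: r["amount"] >= 0,
    "currency_present": lambda r: bool(r.get("currency")),
}

# Hypothetical snapshots of the same records at successive stages.
stages = {
    "extract": [{"amount": 10, "currency": "USD"}, {"amount": 5, "currency": "USD"}],
    "transform": [{"amount": 10, "currency": "USD"}, {"amount": -5, "currency": "USD"}],
    "load": [{"amount": 10, "currency": ""}, {"amount": -5, "currency": "USD"}],
}

report = {stage: probe(records, rules) for stage, records in stages.items()}
for rule_name in rules:
    print(rule_name, "first violated at:", first_failing_stage(report, rule_name))
```

Joining the violation counts to documented business impacts would then support the prioritization described in the last bullet.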
Finding Hidden Value with Data Profiling
Diagram labels:
- Data sources: Application, IMS, Flat File, RDBMS, VSAM
- Analyze/profile data
- Assess data quality dimensions
- Create monitoring system
- Recommend data transformations
- Data quality, validity, and transformation rules
Creating a Data Quality Scorecard
Reduced Risk
Financial Opportunities
Productivity
Improved Confidence
Data Quality Scorecard
42 © 2010 Knowledge Integrity, Inc. www.knowledge-integrity.com
(301)754-6350
Summary
- Standardized process for performing data quality assessment
- Can be adjusted to support operational and analytical business process consumers
- Allows for identification of key data quality metrics that can feed data stewardship activities, data monitoring, and a data quality scorecard
Questions and Open Discussion
- www.knowledge-integrity.com
- If you have questions, comments, or suggestions, please contact me: David Loshin, 301-754-6350, [email protected]