Post on 10-Feb-2016
Page 1: Quality Data An Improbable Dream?

Quality Data – An Improbable Dream?

Elizabeth Vannan
Centre for Education Information
Victoria, BC, Canada

Page 2: Quality Data An Improbable Dream?

Information quality is a journey, not a destination

- Larry P. English

Page 3: Quality Data An Improbable Dream?

Agenda

• Data Definitions and Standards Project
• What is Quality Data?
• The Cost of Poor-Quality Data
• Improving Data Quality – Our Process
• Questions?

Page 4: Quality Data An Improbable Dream?

BC Higher Education

• Canada’s westernmost province
• Population: 4.023 million
• Land area: 366,795 sq miles
• Publicly funded post-secondary system
  – 22 colleges
  – 6 universities

Page 5: Quality Data An Improbable Dream?

CEISS

The Centre for Education Information is an independent organization that provides research and technology services to improve the performance of the BC education system.

Page 6: Quality Data An Improbable Dream?

CEISS

• Implement and manage administrative systems

• Perform custom surveys, research and analysis

• Facilitate development and implementation of data standards

• Negotiate and manage province-wide software contracts (Oracle, SCT Banner, Datatel)

Page 7: Quality Data An Improbable Dream?

DDEF Project

The Problem
– Better data about the BC higher education sector needed for decision-making
– No infrastructure in place to facilitate the collection of data electronically

Data Definitions and Standards Project
Initiated in 1995

Page 8: Quality Data An Improbable Dream?

DDEF Project

The Solution
– Create data standards for all higher education information (Student, HR, Finance)
– Develop a data warehouse based on standards for reporting
– Implement a common technical infrastructure at all higher education institutions

Page 9: Quality Data An Improbable Dream?

DDEF Project

Project Goals
– Improve the quantity and QUALITY of data available
– Reduce the number of data and reporting requests
– Develop a business information system to support the management and evaluation of the BC Post-Secondary system

Page 10: Quality Data An Improbable Dream?

How Are We Doing?

• 16 institutions implemented/implementing
• Institutions using data warehouses for internal reporting
• Data requests reduced
• Ministry using data

Page 11: Quality Data An Improbable Dream?

Why Focus on Data Quality?

• Poor data quality in our data warehouse impacts:
  – Confidence
  – Decision making
  – Funding

Page 12: Quality Data An Improbable Dream?

Quality Data Are…

The Four Attributes of Data Quality

Page 13: Quality Data An Improbable Dream?

Quality Data Are…

• Accurate
  – Free from errors
  – Representative

Page 14: Quality Data An Improbable Dream?

Quality Data Are…

• Complete
  – All values are present

Page 15: Quality Data An Improbable Dream?

Quality Data Are…

• Timely
  – Recorded immediately
  – Available when required

Page 16: Quality Data An Improbable Dream?

Quality Data Are…

• Flexible
  – Data definitions understood
  – Can be used for multiple purposes

Page 17: Quality Data An Improbable Dream?

Quality Data…

• Don’t have to be perfect
• Good enough to fill the business need at a price you’re willing to pay

Our Challenge
Defining Quality Criteria for Higher Education Data

Page 18: Quality Data An Improbable Dream?

Cost of Poor-Quality Data

• Business Process Costs

Incorrect Registrations
Inaccurate Tuition Billings
Payroll Errors

Page 19: Quality Data An Improbable Dream?

Cost of Poor-Quality Data

• Rework

Re-collect Data
Correct Errors
Data Verification

Page 20: Quality Data An Improbable Dream?

Cost of Poor-Quality Data

• Missed Opportunities

Substandard Customer Service
Poor Decision Making
Loss of Reputation

Page 21: Quality Data An Improbable Dream?

Improving Data Quality

[Diagram: Improved Data Quality, with four contributing activities]
• Business Process Review
• Data Quality Assessment
• Data Cleansing
• Business Practice Change

Page 22: Quality Data An Improbable Dream?

Business Process Review

• When, where, how is data collected?
• Where is data stored?
• Who creates data?
• Who uses data?
• What outputs are required?
• What quality checks already exist?

Page 23: Quality Data An Improbable Dream?

Business Process Review

• Involve all stakeholders!
  – For student data we involve
    • Executive
    • Registrar’s office
    • IT Department
    • Institutional Research

Page 24: Quality Data An Improbable Dream?

Business Process Review

• Results
  – Understanding of business practices
  – Identification of data creators, custodians, users
  – Preliminary quality metrics
  – Problem business practices

Page 25: Quality Data An Improbable Dream?

Data Quality Assessment

• Establish metrics
• Apply metrics to data
• Review results

Page 26: Quality Data An Improbable Dream?

Establish Metrics

• For each element determine quality criteria
  – Acceptable range of values
  – Acceptable syntax
  – Comparison to known values
  – Business rules
  – Thresholds
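One way to picture per-element quality criteria is as a small record holding a check and a threshold. The sketch below is only an illustration in Python: the `ElementMetric` structure, the element names (`postal_code`, `birth_year`, `campus_code`), the code set and the threshold values are all assumptions made for the example, not definitions from the DDEF project. Business rules are not shown; they would be further `check` functions.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class ElementMetric:
    """Quality criteria for one data element (illustrative structure only)."""
    name: str
    check: Callable[[Optional[str]], bool]  # True when one value is acceptable
    threshold: float                        # minimum share of passing records


def pass_rate(values: Sequence[Optional[str]], metric: ElementMetric) -> float:
    """Share of values that pass the metric's check (0.0 for no values)."""
    if not values:
        return 0.0
    return sum(1 for v in values if metric.check(v)) / len(values)


def meets_threshold(values: Sequence[Optional[str]], metric: ElementMetric) -> bool:
    """True when the pass rate reaches the metric's threshold."""
    return pass_rate(values, metric) >= metric.threshold


# Acceptable syntax: hypothetical postal code element in "A1A 1A1" form.
postal_code = ElementMetric(
    name="postal_code",
    check=lambda v: v is not None
    and re.fullmatch(r"[A-Z][0-9][A-Z] [0-9][A-Z][0-9]", v) is not None,
    threshold=0.90,
)

# Acceptable range of values: hypothetical birth year element.
birth_year = ElementMetric(
    name="birth_year",
    check=lambda v: v is not None
    and re.fullmatch(r"[0-9]{4}", v) is not None
    and 1900 <= int(v) <= 2000,
    threshold=0.95,
)

# Comparison to known values: hypothetical campus code set.
KNOWN_CAMPUS_CODES = {"MAIN", "NORTH"}
campus_code = ElementMetric(
    name="campus_code",
    check=lambda v: v in KNOWN_CAMPUS_CODES,
    threshold=1.0,
)
```

Keeping the check and the threshold together means the same two helper functions can be applied to every element in turn.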

Page 27: Quality Data An Improbable Dream?

Quality Metrics

Page 28: Quality Data An Improbable Dream?

Applying Metrics

• Collect known information for comparison
• Develop queries to test each of your validation criteria
  – We use Oracle Discoverer, but other tools exist (MS Access, SQL)

Page 29: Quality Data An Improbable Dream?

Applying Metrics

Test 1
PEN (Personal Education Number) must be exactly 9 digits long. No non-digit characters and no shorter values are acceptable.
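The presentation runs this test as a query in Oracle Discoverer. As a rough equivalent, the Python sketch below applies the same rule to records held in memory. The record layout (`student_id`, `pen`) and the sample values are invented for illustration, and the check covers only the nine-digit syntax rule stated on the slide.

```python
import re

# Exactly nine ASCII digits, nothing before or after.
PEN_PATTERN = re.compile(r"[0-9]{9}")


def is_valid_pen(pen):
    """True only when pen is a string of exactly nine digits."""
    return isinstance(pen, str) and PEN_PATTERN.fullmatch(pen) is not None


def invalid_pen_records(records):
    """Return the records whose "pen" field fails the syntax test."""
    return [r for r in records if not is_valid_pen(r.get("pen"))]


# Hypothetical sample data.
students = [
    {"student_id": "A001", "pen": "123456789"},
    {"student_id": "A002", "pen": "12345678"},   # too short
    {"student_id": "A003", "pen": "12345678X"},  # contains a letter
]

for record in invalid_pen_records(students):
    print(record["student_id"], record["pen"])
```

Returning whole records rather than a count makes it possible to identify the specific students whose data needs cleansing.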

Page 30: Quality Data An Improbable Dream?

Test 1 Results

Two Student Records Contain Invalid PEN Numbers

Page 31: Quality Data An Improbable Dream?

Test 1 Results

Invalid PENs

Data Entry Error?

Can identify specific students for data cleansing

Page 32: Quality Data An Improbable Dream?

Applying Metrics

Test 2
At least 80% of student records must have a valid PEN
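A threshold test like this compares a pass rate with a target instead of flagging individual records. The following Python sketch assumes that "valid" means the nine-digit rule from Test 1; the function names and sample values are illustrative only.

```python
import re


def is_valid_pen(pen):
    """True only when pen is a string of exactly nine digits."""
    return isinstance(pen, str) and re.fullmatch(r"[0-9]{9}", pen) is not None


def valid_pen_rate(pens):
    """Share of records with a valid PEN (0.0 when there are no records)."""
    if not pens:
        return 0.0
    return sum(1 for p in pens if is_valid_pen(p)) / len(pens)


def meets_pen_threshold(pens, threshold=0.80):
    """True when at least `threshold` of the records have a valid PEN."""
    return valid_pen_rate(pens) >= threshold


# Hypothetical sample: four valid PENs and one missing value.
pens = ["123456789", "987654321", "555555555", "111222333", None]
print(f"{valid_pen_rate(pens):.0%} valid, meets threshold: {meets_pen_threshold(pens)}")
```

Missing values count against the rate here, so the same test also gives a rough completeness measure for the element.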

Page 33: Quality Data An Improbable Dream?

Test 2 Results

This Institution Meets the Quality Threshold

Page 34: Quality Data An Improbable Dream?

Applying Metrics

Test 3
No Duplicate PENs
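A duplicate check can be sketched as a count of records per PEN, keeping only the PENs that occur more than once. The Python below is a hedged illustration; the record fields and sample data are assumptions.

```python
from collections import Counter


def duplicate_pens(records):
    """Map each PEN found on more than one record to its number of occurrences."""
    counts = Counter(r["pen"] for r in records if r.get("pen"))
    return {pen: n for pen, n in counts.items() if n > 1}


def records_for_pen(records, pen):
    """Return every record carrying the given PEN, for a closer look."""
    return [r for r in records if r.get("pen") == pen]


# Hypothetical sample data: two records share one PEN.
students = [
    {"student_id": "A001", "pen": "123456789"},
    {"student_id": "A002", "pen": "987654321"},
    {"student_id": "A003", "pen": "123456789"},
]

for pen, n in duplicate_pens(students).items():
    ids = [r["student_id"] for r in records_for_pen(students, pen)]
    print(pen, n, ids)
```

Listing the records behind each duplicated PEN is the kind of additional detail that helps tell a genuine duplicate student apart from a loading problem.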

Page 35: Quality Data An Improbable Dream?

Test 3 Results

This institution has a BIG problem!

Can we see more details?

Page 36: Quality Data An Improbable Dream?

Test 3 Results

Additional information reveals data loading problems

Page 37: Quality Data An Improbable Dream?

Reviewing Results

• Systematic approach needed
• Develop strategy for data cleaning
• Identify source of data problems

Deal with Disparate Data Shock!

Page 38: Quality Data An Improbable Dream?

Reviewing Results

• Insert a quality review checklist

Page 39: Quality Data An Improbable Dream?

Reviewing Results

Page 40: Quality Data An Improbable Dream?

Data Cleansing

• Location
  – Administrative System?
  – Staging Area?
• Who
• Scope

Page 41: Quality Data An Improbable Dream?

Typical Data Cleansing

• Correcting data entry errors
• Removing or correcting nonsensical dates
• Deleting “garbage” records
• Combining or deleting duplicates
• Updating and applying code sets
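Two of the cleansing tasks listed above, flagging nonsensical dates and combining duplicates, can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the field names, the 1900 cut-off for a sensible date, and the merge rule (keep the first record and fill its empty fields from later ones). A real cleansing strategy would set these rules with the data's stakeholders.

```python
from datetime import date


def is_sensible_date(value, earliest=date(1900, 1, 1), latest=None):
    """True when value is a date between earliest and latest (default: today)."""
    if latest is None:
        latest = date.today()
    return isinstance(value, date) and earliest <= value <= latest


def merge_duplicates(records, key="pen"):
    """Combine records sharing a key; earlier records win, later ones fill gaps."""
    merged = {}
    for record in records:
        k = record[key]
        if k not in merged:
            merged[k] = dict(record)
            continue
        for field, value in record.items():
            if merged[k].get(field) in (None, ""):
                merged[k][field] = value
    return list(merged.values())


# Hypothetical sample data: one duplicated PEN and one implausible birth date.
records = [
    {"pen": "123456789", "birth_date": date(1980, 5, 17), "program": ""},
    {"pen": "123456789", "birth_date": date(1980, 5, 17), "program": "ARTS"},
    {"pen": "987654321", "birth_date": date(1880, 1, 1), "program": "SCI"},
]

cleaned = merge_duplicates(records)
flagged = [r["pen"] for r in cleaned if not is_sensible_date(r["birth_date"])]
print(len(cleaned), flagged)
```

The sketch only flags suspicious dates and leaves them unchanged, since deciding whether to correct or remove a value usually needs someone who knows the source system.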

Page 42: Quality Data An Improbable Dream?

Business Practice Change

• Two components
  – Implementing changes to improve data quality
  – Adopting ongoing data quality review process

Changing Business Practices is a Challenge
Get Stakeholder Support

Page 43: Quality Data An Improbable Dream?

Business Practice Change

• Education
• Centralizing responsibility for codes
• Consolidating data collection
• Implementing validation routines
• Changing business processes

Page 44: Quality Data An Improbable Dream?

Quality Review Process

• Review data regularly
• Make someone responsible
• Establish procedures for correcting data problems
• Communicate quality improvements

Page 45: Quality Data An Improbable Dream?

Some Changes in BC

• Creation of Data Manager position, responsible for code sets, data quality

• Regular education for registration clerks and other data creators

• Established relationships between data creators and users

• Re-engineered administrative systems

Page 46: Quality Data An Improbable Dream?

Improvements to BC Data

• Improved data quality and quantity
  – Nonsensical dates almost eliminated
  – Completeness of key elements improved (from 50% to 80-90%)
  – Data now being collected for CE in standard format

Page 47: Quality Data An Improbable Dream?

Final Thoughts…

• Quality Data are Probable if you are willing to…
  – Take a critical look at your existing data
  – Implement changes to how you collect and manage data
  – Invest the time to educate and communicate with data users and creators
  – Make data quality improvement an ongoing process

Page 48: Quality Data An Improbable Dream?

Recommended Reading

• Brackett, Michael H., Data Resource Quality: Turning Bad Habits into Good Practices (New York: Addison-Wesley, 2000)
• English, Larry P., Improving Data Warehouse and Business Information Quality (New York: John Wiley and Sons, 1999)
• Redman, Thomas C., Data Quality for the Information Age (Boston: Artech House, Inc., 1996)

Page 49: Quality Data An Improbable Dream?

Thank You!

Presentation Available At
www.ceiss.org

[email protected]