Quality Data – An Improbable Dream?

Post on 10-Feb-2016

Elizabeth Vannan
Centre for Education Information
Victoria, BC, Canada

Information quality is a journey, not a destination

- Larry P. English

Agenda

• Data Definitions and Standards Project
• What is Quality Data?
• The Cost of Poor-Quality Data
• Improving Data Quality – Our Process
• Questions?

BC Higher Education

• Canada’s westernmost province
• Population: 4.023 Million
• Land Area: 366,795 Sq Miles
• Publicly Funded Post-Secondary System
  – 22 Colleges
  – 6 Universities

CEISS

The Centre for Education Information is an independent organization that provides research and technology services to improve the performance of the BC education system.

CEISS

• Implement and manage administrative systems

• Perform custom surveys, research and analysis

• Facilitate development and implementation of data standards

• Negotiate and manage province-wide software contracts (Oracle, SCT Banner, Datatel)

DDEF Project

The Problem
  – Better data about the BC higher education sector needed for decision-making
  – No infrastructure in place to facilitate the collection of data electronically

Data Definitions and Standards Project
Initiated in 1995

DDEF Project

The Solution
  – Create data standards for all higher education information (Student, HR, Finance)
  – Develop a data warehouse based on standards for reporting
  – Implement a common technical infrastructure at all higher education institutions

DDEF Project

Project Goals
  – Improve the quantity and QUALITY of data available
  – Reduce the number of data and reporting requests
  – Develop business information system to support the management and evaluation of the BC Post-Secondary system

How Are We Doing?

• 16 institutions implemented/implementing

• Institutions using data warehouses for internal reporting

• Data requests reduced

• Ministry using data

Why Focus on Data Quality?

• Poor data quality in our data warehouse impacts:
  – Confidence
  – Decision making
  – Funding

Quality Data Are…

The Four Attributes of Data Quality

Quality Data Are…

• Accurate
  – Free from errors
  – Representative

Quality Data Are…

• Complete
  – All values are present
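
One way to put a number on completeness is the share of records in which a given element is actually filled in. The sketch below is illustrative only: the record layout, the field names, and the choice to treat empty strings as missing are assumptions, not part of the original presentation.

```python
def completeness(records, field):
    """Return the fraction of records with a non-empty value for `field`."""
    if not records:
        return 0.0
    present = 0
    for rec in records:
        value = rec.get(field)
        # Assumption: None and blank strings both count as "not present".
        if value is not None and str(value).strip() != "":
            present += 1
    return present / len(records)


# Hypothetical student records; field names are made up for illustration.
students = [
    {"pen": "123456789", "birth_date": "1980-04-12"},
    {"pen": "", "birth_date": "1979-11-30"},
    {"pen": None, "birth_date": None},
    {"pen": "987654321", "birth_date": "1981-01-05"},
]

print(completeness(students, "pen"))         # 2 of 4 records have a PEN
print(completeness(students, "birth_date"))  # 3 of 4 records have a birth date
```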

Quality Data Are…

• Timely
  – Recorded immediately
  – Available when required

Quality Data Are…

• Flexible
  – Data definitions understood
  – Can be used for multiple purposes

Quality Data…

• Don’t have to be perfect
• Good enough to fill the business need at a price you’re willing to pay

Our Challenge
Defining Quality Criteria for Higher Education Data

Cost of Poor-Quality Data

• Business Process Costs

  – Incorrect Registrations
  – Inaccurate Tuition Billings
  – Payroll Errors

Cost of Poor-Quality Data

• Rework

  – Re-collect Data
  – Correct Errors
  – Data Verification

Cost of Poor-Quality Data

• Missed Opportunities

  – Substandard Customer Service
  – Poor Decision Making
  – Loss of Reputation

Improving Data Quality

Improved Data Quality

• Business Process Review
• Data Quality Assessment
• Data Cleansing
• Business Practice Change

Business Process Review

• When, where, how is data collected?

• Where is data stored?

• Who creates data?

• Who uses data?

• What outputs are required?

• What quality checks already exist?

Business Process Review

• Involve all stakeholders!
  – For student data we involve:
    • Executive
    • Registrar’s office
    • IT Department
    • Institutional Research

Business Process Review

• Results
  – Understanding of business practices
  – Identification of data creators, custodians, users
  – Preliminary quality metrics
  – Problem business practices

Data Quality Assessment

• Establish metrics
• Apply metrics to data
• Review results

Establish Metrics

• For each element determine quality criteria
  – Acceptable range of values
  – Acceptable syntax
  – Comparison to known values
  – Business rules
  – Thresholds
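
As a rough illustration of how such criteria might be written down, the sketch below pairs each element with a pass/fail check and a threshold. Everything specific in it is an assumption made for the example: the `Metric` class, the field names, the code set, the ranges, and the threshold values are hypothetical and do not come from the presentation.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class Metric:
    element: str                    # data element the metric applies to
    criterion: str                  # which kind of quality criterion it is
    check: Callable[[dict], bool]   # returns True when a record passes
    threshold: float                # minimum acceptable share of passing records


KNOWN_GENDER_CODES = {"F", "M", "U"}  # hypothetical code set

METRICS = [
    Metric("birth_year", "acceptable range of values",
           lambda r: 1900 <= r["birth_year"] <= 2005, 0.95),
    Metric("postal_code", "acceptable syntax",
           lambda r: re.fullmatch(r"[A-Z][0-9][A-Z] [0-9][A-Z][0-9]",
                                  r["postal_code"]) is not None, 0.90),
    Metric("gender", "comparison to known values",
           lambda r: r["gender"] in KNOWN_GENDER_CODES, 1.0),
    Metric("end_term", "business rule: end term not before start term",
           lambda r: r["end_term"] >= r["start_term"], 1.0),
]


def evaluate(metric, records):
    """Return (pass rate, whether the rate meets the metric's threshold)."""
    passed = sum(1 for r in records if metric.check(r))
    rate = passed / len(records) if records else 0.0
    return rate, rate >= metric.threshold


# Made-up records: the first passes every check, the second fails every check.
records = [
    {"birth_year": 1980, "postal_code": "V8W 1P6", "gender": "F",
     "start_term": 199909, "end_term": 200004},
    {"birth_year": 1875, "postal_code": "V8W1P6", "gender": "X",
     "start_term": 200001, "end_term": 199909},
]

for m in METRICS:
    rate, ok = evaluate(m, records)
    print(f"{m.element} ({m.criterion}): {rate:.0%} pass, "
          f"{'meets' if ok else 'below'} threshold")
```

Keeping the threshold next to the check makes it possible to report both the individual failing records and whether the element as a whole is acceptable.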

Quality Metrics

Applying Metrics

• Collect known information for comparison

• Develop queries to test each of your validation criteria
  – We use Oracle Discoverer, but other tools exist (MS Access, SQL)

Applying Metrics

Test 1
PEN must be 9 digits long. No characters, no shorter values acceptable.
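
The presentation ran this test as a query in Oracle Discoverer. Purely as an illustration, the same rule could be sketched in Python as below; the record layout, the field names `student_id` and `pen`, and the sample values are made up, and the rule is read as "exactly nine digits".

```python
import re

# Exactly nine ASCII digits; [0-9] is used instead of \d so that
# non-ASCII digit characters are rejected as well.
PEN_PATTERN = re.compile(r"[0-9]{9}")


def is_valid_pen(value):
    """Return True when value is a string of exactly nine digits."""
    return isinstance(value, str) and PEN_PATTERN.fullmatch(value) is not None


def invalid_pen_records(records):
    """Return the records whose 'pen' field fails Test 1."""
    return [rec for rec in records if not is_valid_pen(rec.get("pen"))]


# Made-up sample data; 'student_id' and 'pen' are hypothetical field names.
sample = [
    {"student_id": 1, "pen": "123456789"},
    {"student_id": 2, "pen": "12345678"},    # too short
    {"student_id": 3, "pen": "12345678A"},   # contains a letter
    {"student_id": 4, "pen": "987654321"},
]

for rec in invalid_pen_records(sample):
    print(rec["student_id"], repr(rec["pen"]))
```

Returning the failing records, rather than only a count, is what allows specific students to be identified for data cleansing.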

Test 1 Results

Two Student Records Contain Invalid PEN Numbers

Test 1 Results

Invalid PENs

Data Entry Error?

Can identify specific students for data cleansing

Applying Metrics

Test 2
At least 80% of student records must have a valid PEN number
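
A threshold test like this one looks at the data set as a whole rather than at individual records. The following is a minimal sketch under assumptions: "valid" is taken to mean a string of exactly nine digits, the 80% figure is passed in as a default parameter, and the sample PEN values are invented.

```python
import re


def valid_pen_rate(pens):
    """Return the share of values that are strings of exactly nine digits."""
    if not pens:
        return 0.0
    valid = sum(
        1 for p in pens
        if isinstance(p, str) and re.fullmatch(r"[0-9]{9}", p) is not None
    )
    return valid / len(pens)


def meets_threshold(pens, threshold=0.80):
    """Return True when the valid-PEN rate reaches the threshold."""
    return valid_pen_rate(pens) >= threshold


# Invented data: nine valid PENs and one missing value.
institution_a = ["10000000" + str(i) for i in range(1, 10)] + [None]
# Invented data: seven valid PENs and three unusable values.
institution_b = ["20000000" + str(i) for i in range(1, 8)] + ["", "N/A", None]

print(valid_pen_rate(institution_a), meets_threshold(institution_a))
print(valid_pen_rate(institution_b), meets_threshold(institution_b))
```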

Test 2 Results

This Institution Meets the Quality Threshold

Applying Metrics

Test 3
No duplicate PENs
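
A duplicate check can be sketched by counting how often each PEN occurs and reporting those seen more than once. This is an illustration only, with invented values; it assumes missing PENs should be skipped here because they are covered by the validity tests.

```python
from collections import Counter


def duplicate_pens(pens):
    """Return {pen: occurrences} for every PEN that appears more than once."""
    # Assumption: None and empty strings are ignored, not treated as duplicates.
    counts = Counter(p for p in pens if p)
    return {pen: n for pen, n in counts.items() if n > 1}


# Invented sample values.
loaded_pens = [
    "111111111", "222222222", "111111111", None, "",
    "333333333", "222222222", "111111111",
]

for pen, n in sorted(duplicate_pens(loaded_pens).items()):
    print(pen, "appears", n, "times")
```

Reporting the occurrence count alongside each PEN gives a first hint about the cause: a value repeated many times points to a loading problem more than to a single data entry slip.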

Test 3 Results

This institution has a BIG problem!

Can we see more details?

Test 3 Results

Additional information reveals data loading problems

Reviewing Results

• Systematic approach needed
• Develop strategy for data cleaning
• Identify source of data problems

Deal with Disparate Data Shock!

Reviewing Results

• Insert a quality review checklist

Data Cleansing

• Location
  – Administrative System?
  – Staging Area?
• Who
• Scope

Typical Data Cleansing

• Correcting data entry errors
• Removing or correcting nonsensical dates
• Deleting “garbage” records
• Combining or deleting duplicates
• Updating and applying code sets
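
A few of these cleansing steps can be sketched in code. The example below is a hedged illustration, not the process used in BC: the field names, the definition of a "garbage" record (no PEN and no name), the 1900 cutoff for birth dates, and the decision to set duplicates aside for review instead of merging them are all assumptions.

```python
from datetime import date

EARLIEST_BIRTH_DATE = date(1900, 1, 1)  # hypothetical cutoff


def is_sensible_birth_date(value, today):
    """Return True for a date between the cutoff and today."""
    return value is not None and EARLIEST_BIRTH_DATE <= value <= today


def cleanse(records, today):
    """Return (cleaned records, rejected records with a reason for each)."""
    seen_pens = set()
    cleaned, rejected = [], []
    for rec in records:
        pen = (rec.get("pen") or "").strip()   # fix stray whitespace
        if not pen and not rec.get("name"):
            rejected.append((rec, "garbage record"))
            continue
        if pen and pen in seen_pens:
            rejected.append((rec, "duplicate"))
            continue
        if pen:
            seen_pens.add(pen)
        fixed = dict(rec, pen=pen)
        if not is_sensible_birth_date(rec.get("birth_date"), today):
            # Blank the value for follow-up; no replacement date is guessed.
            fixed["birth_date"] = None
        cleaned.append(fixed)
    return cleaned, rejected


# Invented records covering each case.
records = [
    {"pen": "123456789", "name": "A. Student", "birth_date": date(1980, 4, 12)},
    {"pen": "123456789", "name": "A. Student", "birth_date": date(1980, 4, 12)},
    {"pen": "", "name": "", "birth_date": None},
    {"pen": " 987654321 ", "name": "B. Student", "birth_date": date(1880, 1, 1)},
]

cleaned, rejected = cleanse(records, today=date(2000, 1, 1))
print(len(cleaned), "kept;", [reason for _, reason in rejected])
```

Keeping the rejected records together with a reason, instead of silently dropping them, leaves a trail that can be traced back to the source of each problem.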

Business Practice Change

• Two components
  – Implementing changes to improve data quality
  – Adopting ongoing data quality review process

Changing Business Practices is a Challenge
Get Stakeholder Support

Business Practice Change

• Education
• Centralizing responsibility for codes
• Consolidating data collection
• Implementing validation routines
• Change business processes

Quality Review Process

• Review data regularly
• Make someone responsible
• Establish procedures for correcting data problems
• Communicate quality improvements

Some Changes in BC

• Creation of Data Manager position, responsible for code sets, data quality

• Regular education for registration clerks and other data creators

• Established relationships between data creators and users

• Re-engineered administrative systems

Improvements to BC Data

• Improved data quality and quantity
  – Nonsensical dates almost eliminated
  – Completeness of key elements improved (from 50% to 80-90%)
  – Data now being collected for CE in standard format

Final Thoughts…

• Quality Data are Probable if you are willing to…
  – Take a critical look at your existing data
  – Implement changes to how you collect and manage data
  – Invest the time to educate and communicate with data users and creators
  – Make data quality improvement an ongoing process

Recommended Reading

• Brackett, Michael H., Data Resource Quality: Turning Bad Habits into Good Practices (New York: Addison-Wesley, 2000)

• English, Larry P., Improving Data Warehouse and Business Information Quality (New York: John Wiley and Sons, 1999)

• Redman, Thomas C., Data Quality for the Information Age (Boston: Artech House, Inc., 1996)

Thank You!

Presentation available at www.ceiss.org
or evannan@ceiss.org
