Data Quality: Issues and Fixes

Page 1: Data Quality: Issues and Fixes

ILCS Raking

Motivate Need and Illustrate Basic Approach

Dr. Ali Mushtaq

July 3, 2009 (for academic purposes only)

RCRC

Page 2: Data Quality: Issues and Fixes

What is Raking?

• A way to adjust survey totals “t” to independent controls “T”

• Takes existing survey weights, usually w_ij = 1/p_ij, where p_ij is the probability of selection

• Ratios them up to each total T in turn, until the results are as close as wanted
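
As a rough sketch of one raking cycle in two dimensions, the block below writes the ratio adjustment out explicitly. The notation is an assumption and not taken from the slides: i and j index the categories of the two control variables, t_ij is the weighted survey total in cell (i, j) after k cycles, and T_{i+} and T_{+j} are the independent row and column controls.

```latex
\begin{align*}
t_{ij}^{(k+1/2)} &= t_{ij}^{(k)} \cdot \frac{T_{i+}}{\sum_{j'} t_{ij'}^{(k)}}
  && \text{(ratio up to the row controls)} \\
t_{ij}^{(k+1)} &= t_{ij}^{(k+1/2)} \cdot \frac{T_{+j}}{\sum_{i'} t_{i'j}^{(k+1/2)}}
  && \text{(ratio up to the column controls)}
\end{align*}
```

Every unit weight in a cell is multiplied by the same factor as its cell total, and the cycle is repeated until both sets of margins are as close to the controls as wanted.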

Page 3: Data Quality: Issues and Fixes

What is the Value?

• Can increase stability of survey results: reduce sample variance

• Get results that are close to desired outcomes: reduce bias arising from minor operational errors

Page 4: Data Quality: Issues and Fixes

What Results to Expect?

• If controls are reasonable, the raking process will converge (“hit” all controls)

• And improve survey results related to Control Totals

Page 5: Data Quality: Issues and Fixes

More Information Quality

• Only Weights are Changed by Raking, not Survey Data

• Data Quality is thus unchanged

• But Information Quality is usually Improved

Page 6: Data Quality: Issues and Fixes

What Does Raking Cost?

• Usually done quickly on a PC

• Independent controls need to be consistent with each other

• Sample must be reasonably large for raking to be safely applied

• Some costs incurred to explain the method
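
One simple reading of "consistent with each other" is that every set of control totals adds up to the same grand total. The sketch below checks only that condition; the function name `controls_consistent` and all the numbers are hypothetical, invented for illustration, and are not ILCS figures.

```python
import math


def controls_consistent(margins, rel_tol=1e-9):
    """Return True if every set of control totals sums to the same grand total."""
    grand_totals = [sum(totals.values()) for totals in margins.values()]
    return all(
        math.isclose(total, grand_totals[0], rel_tol=rel_tol)
        for total in grand_totals
    )


# Hypothetical control totals, invented for illustration only.
controls = {
    "gender": {"male": 480.0, "female": 520.0},
    "area": {"urban": 640.0, "rural": 360.0},
}
print(controls_consistent(controls))
```

If the grand totals disagree, the raking process cannot hit all the controls at once, so this is worth checking before any weights are adjusted.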

Page 7: Data Quality: Issues and Fixes

Raking Made Simple

• “Fudge” Factor Intuition

• Develop a ratio of target total divided by sample total

• Repeat this process with each of the controls in turn
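
The steps above can be sketched in a few lines of Python for the two-dimensional case. This is a minimal illustration, not the procedure actually used for the ILCS: the function name `rake`, the tolerance, and the input table and targets are all assumptions made up for the example, and the sketch assumes every row and column has a positive total.

```python
def rake(table, row_targets, col_targets, tol=1e-8, max_iter=100):
    """Rake a two-way table of weighted totals to row and column controls.

    Returns (raked_table, iterations_used).
    """
    cells = [row[:] for row in table]  # copy so the input is not modified
    for iteration in range(1, max_iter + 1):
        # "Fudge" factor for each row: target total divided by current total.
        for i, target in enumerate(row_targets):
            factor = target / sum(cells[i])
            cells[i] = [value * factor for value in cells[i]]
        # Repeat the same ratio adjustment for each column.
        for j, target in enumerate(col_targets):
            factor = target / sum(row[j] for row in cells)
            for row in cells:
                row[j] *= factor
        # Columns now match exactly; stop once the rows are close enough too.
        worst = max(abs(sum(cells[i]) - t) for i, t in enumerate(row_targets))
        if worst <= tol:
            return cells, iteration
    raise RuntimeError("raking did not converge; check the controls")


# Hypothetical weighted survey totals (rows: gender, columns: urban/rural).
survey = [[20.0, 30.0], [25.0, 25.0]]
raked, iterations = rake(survey, row_targets=[480.0, 520.0],
                         col_targets=[640.0, 360.0])
print(iterations)
for row in raked:
    print([round(value, 2) for value in row])
```

Only the cell totals (and hence the weights) change; the underlying survey responses are untouched.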

Page 8: Data Quality: Issues and Fixes

NSS Example from ILCS

While the NSS RA survey is raked across four dimensions (age, gender, marz, and urban/rural), the example here uses just two.

Page 9: Data Quality: Issues and Fixes

Table 1. Raking Example – Source Survey Data

Page 10: Data Quality: Issues and Fixes

Table 2. Desired Marginals

Page 11: Data Quality: Issues and Fixes

First Ratio Adjustment

Page 12: Data Quality: Issues and Fixes

Second Ratio Adjustment

Page 13: Data Quality: Issues and Fixes

After Second Iteration

Page 14: Data Quality: Issues and Fixes

ILCS Benefits Achieved

• Reduction in Bias

• Reduction (hopefully) in Variance

• Survey Results are Consistent with Census Projections

Page 15: Data Quality: Issues and Fixes

Again Many Thanks

Data Quality and Record Linkage Techniques (Springer, 2007)