Linking Data with Action Part 2: Understanding Data Discrepancies

What are data discrepancies?

Different data sources with different estimates for the same indicators

Example – HIV prevalence among men and women of reproductive age, 2007: DHS, 4.1%; sentinel surveillance, 6.4%

Does the difference matter?

Which estimate should I use?

What contributes to data discrepancies? Population

Understand the study population:

Who is in the study?

Where are they located?

How were they selected?

How do they compare with the greater population?

How are study samples selected?

1. Probability Sampling – each individual has a known, non-zero chance of being chosen because they are randomly selected (in a simple random sample, that chance is equal for everyone).

2. Non-probability Sampling – chance of selecting any individual is not known because they are NOT randomly selected.
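The difference between the two sampling approaches can be illustrated with a small simulation. This is only a sketch under invented assumptions: the population, the subgroup shares, the prevalence rates, and the clinic attendance rates are all hypothetical and are not drawn from any survey discussed here. The function name `simulate_prevalence` is likewise made up for illustration.

```python
import random


def simulate_prevalence(seed=0, population_size=100_000, sample_size=5_000):
    """Compare a simple random sample with a clinic-based convenience sample."""
    rng = random.Random(seed)

    # Invented population: 20% belong to a higher-prevalence subgroup that is
    # also more likely to attend a clinic. None of these rates come from the
    # DHS or ANC surveillance data.
    population = []
    for _ in range(population_size):
        higher_risk = rng.random() < 0.20
        infected = rng.random() < (0.10 if higher_risk else 0.02)
        attends_clinic = rng.random() < (0.60 if higher_risk else 0.20)
        population.append((infected, attends_clinic))

    def prevalence(people):
        return sum(infected for infected, _ in people) / len(people)

    # Probability sampling: every individual has the same known chance.
    random_sample = rng.sample(population, sample_size)

    # Non-probability sampling: take whoever shows up at the clinic.
    clinic_sample = [p for p in population if p[1]][:sample_size]

    return prevalence(population), prevalence(random_sample), prevalence(clinic_sample)


true_prev, random_prev, clinic_prev = simulate_prevalence()
print(f"population {true_prev:.3f}, random {random_prev:.3f}, clinic {clinic_prev:.3f}")
```

In this made-up setting the random sample tends to land close to the population value, while the clinic sample tends to overstate it, because clinic attendance is tied to the higher-prevalence subgroup.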

Study Populations

RDHS: 4.1% HIV prevalence

ANC Surveillance: 6.4% HIV prevalence

| Questions | DHS 2007 | Antenatal Surveillance 2007 |
|---|---|---|
| Who is in the study? | 6,641 men & women of reproductive age | 13,321 pregnant women attending ANC |
| How do they compare? | Not all are sexually active, variable contraceptive use | All are sexually active, none using contraception |
| How were they selected? | Random sample | All women attending ANC |
| Where are they located? | Nationally representative household sample | 30/359 health centers – 2 capital, 12 other urban, 16 rural |

What contributes to data discrepancies? Error

Random Systematic

Definitions Indicators Terminology

Data Quality

2 Studies – Same indicator: No. of women who complete PMTCT services

Study A: 95%

Criteria: Counseled; Tested; Received test result

Study B: 75%

Criteria: Counseled; Tested; Received test result; Positive women & babies treated
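The effect of the two definitions can be sketched in code. The example below is a hypothetical illustration, not the method either study used: the `ClientRecord` fields, the 20 invented records, and the assumption that HIV-negative women count as "complete" under Study B once they receive their result are all made up here. The records were chosen only so that the output mirrors the 95% and 75% figures.

```python
from dataclasses import dataclass


@dataclass
class ClientRecord:
    counseled: bool
    tested: bool
    received_result: bool
    hiv_positive: bool
    treated: bool  # mother and baby treated; only meaningful if hiv_positive


def completed_study_a(r):
    # Study A criteria: counseled, tested, received test result.
    return r.counseled and r.tested and r.received_result


def completed_study_b(r):
    # Study B adds: positive women & babies treated.
    # Assumption: HIV-negative women need no treatment to count as complete.
    if not completed_study_a(r):
        return False
    return r.treated if r.hiv_positive else True


def completion_rate(records, criterion):
    return sum(criterion(r) for r in records) / len(records)


# Invented records: 13 negative, 2 positive and treated,
# 4 positive and untreated, 1 who never received a result.
records = (
    [ClientRecord(True, True, True, False, False)] * 13
    + [ClientRecord(True, True, True, True, True)] * 2
    + [ClientRecord(True, True, True, True, False)] * 4
    + [ClientRecord(True, True, False, False, False)] * 1
)

rate_a = completion_rate(records, completed_study_a)
rate_b = completion_rate(records, completed_study_b)
print(f"Study A definition: {rate_a:.0%}, Study B definition: {rate_b:.0%}")
```

The same clients produce two different values of the "same" indicator purely because the completion criteria differ.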

What contributes to data discrepancies? Terminology

2010 Behavioral Surveillance Survey: 51% HIV prevalence (n = 1,338 FCSW)

‘High Burden of Prevalent & Recently Acquired HIV among Sex Workers & Female HIV VCT clients in Kigali, Rwanda’ by Braunstein, et al.: 24% HIV prevalence (n = 800 FCSW)

What contributes to data discrepancies? Bias

Random Systematic

Definitions Indicators Terminology

Data Quality

Dimensions of Data Quality

Validity – Accuracy: Does the data reflect what it is intended to measure?

Reliability – Consistency: Does the data measure a concept or characteristic consistently?

Completeness – Complete: Is all the data collected and considered?

Precision – Precise: Is the data described in sufficient detail?

Timeliness – Current: Is the data current? Does it reflect actual program activities?

Integrity – Data are protected from deliberate bias or manipulation for political or personal reasons.

Program Outcome Errors

“False Positive”: concluding that the program had an effect when it did not (linked to significance level or p-value)

“False Negative”: failing to detect a true program effect (linked to statistical power and sample size)

“Implementation Error”: No program effect due to lack of or inappropriate implementation.

What can help you interpret data discrepancies?

Probe, question, investigate

Confidence intervals: the amount of uncertainty around an estimate. Example – 3% (1%, 5%), 95% confidence interval, OR 3% (±2%) at the .05 significance level

Compare to other data sources

Ask an expert
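As a rough sketch of where an interval such as 3% (1%, 5%) can come from, the snippet below uses the normal (Wald) approximation for a proportion. The sample size of 280 is an invented value chosen so the result resembles that example, and `proportion_ci` is a hypothetical helper name. Household surveys with complex sample designs generally need design-adjusted standard errors, which this simple formula ignores.

```python
import math


def proportion_ci(p_hat, n, z=1.96):
    """Approximate confidence interval for a proportion (Wald method).

    z=1.96 corresponds to a 95% interval. Assumes a simple random sample.
    """
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * standard_error
    # Clamp to [0, 1] because a proportion cannot fall outside that range.
    return max(0.0, p_hat - margin), min(1.0, p_hat + margin)


# Hypothetical study: estimate of 3% from 280 respondents.
low, high = proportion_ci(0.03, 280)
print(f"3% ({low:.1%}, {high:.1%})")
```

A larger sample narrows the interval: with the same 3% estimate, quadrupling the sample size roughly halves the margin of error.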

Confidence intervals

| Study | Estimate | 95% Confidence interval |
|---|---|---|
| Study A | 53% | 49-57 |
| Study B | 61% | 59-63 |

Confidence intervals

| Study | Estimate | 95% Confidence interval |
|---|---|---|
| Study A | 53% | 48-58 |
| Study B | 61% | 57-65 |
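One quick way to compare the two sets of intervals above is to check whether they overlap. The helper below is a minimal sketch with a made-up name, `intervals_overlap`. It is a rule of thumb rather than a formal test: intervals that do not overlap suggest the difference is unlikely to be due to sampling alone, while overlapping intervals do not by themselves prove that the estimates are the same.

```python
def intervals_overlap(a, b):
    """Return True if two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# First pair of intervals: Study A 49-57, Study B 59-63.
first_pair = intervals_overlap((49, 57), (59, 63))

# Second pair of intervals: Study A 48-58, Study B 57-65.
second_pair = intervals_overlap((48, 58), (57, 65))

print("first pair overlap:", first_pair)
print("second pair overlap:", second_pair)
```

With the same point estimates of 53% and 61%, the narrower intervals are clearly separated while the wider ones share the range 57 to 58, so the second comparison calls for more caution.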

What can help you interpret data discrepancies?

Probe, question, investigate

Confidence intervals

Comparison to other data sources

Ask an expert

Small Group Exercise: Data Discrepancies

2005 Demographic and Health Survey (DHS)

2005 Priorities for Local AIDS Control Efforts (PLACE) assessment

Small Group Exercise: Data Discrepancies

Who is in the study?

Where are they located?

How were they selected?

How does each study group compare to the greater population?

How are the 2 studies different?

What is the use of each type of data for program planners? For policy makers?

Small Group Exercise: Data Discrepancies

Who is in the study?

Where are they located?

How were they selected?

How does each study group compare to the greater population?

How is the data in the 2 studies different?

What is the use of each type of data for program planners? For policy makers?

This research has been supported by the President’s Emergency Plan for AIDS Relief (PEPFAR) through the United States Agency for International Development (USAID) under the terms of MEASURE Evaluation cooperative agreement GHA-A-00-08-00003-00 which is implemented by the Carolina Population Center, University of North Carolina at Chapel Hill with Futures Group, ICF International, John Snow, Inc., Management Sciences for Health, and Tulane University. Views expressed are not necessarily those of PEPFAR, USAID, or the United States government.
