TRANSCRIPT
Medicaid Underreporting in the CPS: Results from a Record Check Study
Joanne Pascale
Marc Roemer
Dean Resnick
US Census Bureau
DC-AAPOR
August 21, 2007
Medicaid Undercount
• Records show higher Medicaid enrollment levels than survey estimates (~10-30%)
• Undercount affects many different surveys of health insurance
• Non-reporting error sources contribute
• Under-reporting is the largest contributor to the undercount
Current Population Survey
• Focus is on under-reporting in CPS
  – Produces most widely-cited estimates on health insurance and the uninsured
  – Other surveys gauge estimates against CPS; mimic CPS design
• CPS = monthly survey on labor force and poverty; health insurance questions asked in annual supplement
CPS Health Insurance Questions: ‘Type by Type’ Structure
1. Job-based
2. Directly-purchased
3. Someone outside HH
4. Medicare
5. Medicaid
6. SCHIP
7. Military
8. Other
CPS Health Insurance Questions: Calendar Year Reference Period
• Survey is conducted in March
• Questions ask about coverage during previous calendar year
• “At any time during 2000, was anyone in this household covered by [plan type]?”
CPS Health Insurance Questions: Household-level Design
• Multi-person household:
  – At any time during 2000 was anyone in this household covered by [plan type]?
  – [if yes] Who was that?
• Single-person household:
  – At any time during 2000 were you covered by [plan type]?
CPS Cognitive Testing
• Three main sources of misreporting:
  1. Type-by-type structure: Rs 'pre-report' and try to 'fit' coverage into the earliest question
  2. 12-month reference period: some respondents focus on current coverage or 'spell'
  3. Household size and complexity
More on HH Size and Complexity
• Rs forgot about certain HH members
• Rs did not know enough detail about other HH members’ plan type
• Neither problem was related to 'closeness' between R and referent; it affected housemates and distant relatives, but also parents, siblings, and live-in partners
Shared Coverage Hypothesis
• Health insurance administered in 'units'
  – Private and military coverage: nuclear family
  – Medicaid and ~SCHIP: parent and children
  – Medicare: individual
• Any given HH may have a mix of units
  – E.g.: R on his union plan; mother on Medicare; sister and child on Medicaid; live-in partner and her child on her job plan
• R may be able to report more accurately for HH members who are in their same unit (i.e.: share the same coverage type)
Methods
• Linked CPS survey and ‘MSIS’ record data for year 2000
• Analysis dataset: CPS sample members…
  – known to be on Medicaid according to records
  – for whom a direct response to 'Medicaid' was reported in CPS (not edited or imputed)
    • Several items fed into 'Medicaid' indicator (Medicaid, SCHIP, other government plan, other)
  – n = 19,345
• Dependent var = whether Medicaid was reported for the known enrollees
Shared Coverage Variable
• Referent (person reported on) is R (self-report)
  A. in single-person HH
  B. in multi-person HH
• Referent is not R (proxy report)
  C. Both are on same Medicaid case
  D. Both are on Medicaid (different cases)
  E. Referent is on Medicaid; R is not
Logistic Regression Model
• Dependent var = Medicaid status reported in CPS
• Independent vars:
  • HH composition
    – Shared coverage var
    – Another HH member had Medicaid w/in year
  • Recency and intensity of coverage
    – Most recent month referent enrolled
    – Proportion of days covered from January till last month enrolled
    – Referent covered in survey month
  • Referent received Medicaid services w/in year
  • Demographics
    – Sex of R
    – Age and race/ethnicity of referent
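A model of this general shape can be sketched in code. This is an illustrative fit on SYNTHETIC data, not the authors' linked file: the predictor names, their codings, and the data-generating coefficients below are all invented, and the fit uses plain Newton-Raphson rather than whatever software the authors used.

```python
import numpy as np

# Illustrative logistic regression of "Medicaid reported in CPS" (1/0)
# on a few predictors like those listed above, fit to synthetic data.
rng = np.random.default_rng(0)
n = 5_000

months_since_enrolled = rng.integers(0, 12, n).astype(float)  # recency of coverage
received_services = rng.integers(0, 2, n).astype(float)       # Medicaid services w/in year
shared_coverage = rng.integers(0, 2, n).astype(float)         # R and referent share coverage

# Assumed data-generating model (invented): recency and service receipt
# raise the odds that Medicaid is reported for a known enrollee.
true_logit = 0.5 - 0.15 * months_since_enrolled + 0.8 * received_services + 0.3 * shared_coverage
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-true_logit))).astype(float)

# Newton-Raphson (IRLS) fit of the logistic model.
X = np.column_stack([np.ones(n), months_since_enrolled, received_services, shared_coverage])
beta = np.zeros(X.shape[1])
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    grad = X.T @ (y - p)
    hess = (X * (p * (1 - p))[:, None]).T @ X
    beta += np.linalg.solve(hess, grad)

print(np.round(beta, 2))  # estimates should land near the assumed coefficients
```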
Results: Overview of Linked Dataset
• Of 173,967 CPS hh members, 19,345 (11.1%) had Medicaid according to records
• Medicaid was reported in CPS for only 12,351 (7.1%) hh members
• => 36.2% under-reporting
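These figures can be reproduced directly from the counts on this slide:

```python
# Check of the undercount arithmetic; all inputs are the slide's own figures.
hh_members = 173_967
on_medicaid_per_records = 19_345
reported_in_cps = 12_351

print(round(100 * on_medicaid_per_records / hh_members, 1))             # 11.1
print(round(100 * reported_in_cps / hh_members, 1))                     # 7.1
print(round(100 * (1 - reported_in_cps / on_medicaid_per_records), 1))  # 36.2
```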
Results: Overall Regression
• Model is highly significant in explaining misreporting
• Effect of each variable is significant and highly discernible
• Ranked each of the 9 independent vars according to its importance to model
Ranking of Independent Vars
1. Most recent month enrolled
2. Proportion of days covered from January
3. Received Medicaid services w/in year
4. Race/ethnicity of referent
5. Sex of respondent
6. Another HH member had coverage w/in year
7. Age of referent
8. Covered in survey month
9. Shared coverage var
Categorization of Independent Vars
• Recency and intensity of coverage
  1. Most recent month enrolled
  2. Proportion of days covered from January till last month enrolled
  8. Covered in survey month
• Receipt of Medicaid services
  3. Received services w/in year
• Demographics
  4. Race/ethnicity of referent (white non-Hispanic)
  5. Sex of R
  7. Age of referent
• HH composition
  6. Another HH member had coverage w/in year
  9. Shared coverage var
Results: Shared Coverage Var
• Expected ranking:
  A. Self-report in single-person HH
  B. Self-report in multi-person HH
  C. Proxy report, same case
  D. Proxy report, different case
  E. Proxy report; R does not have Medicaid
• Actual ranking:
  A
  C
  D/B
  D/B
  E
Summary
• Recency, intensity of coverage
• Receipt of Medicaid services
• Shared coverage
All contribute to the saliency of Medicaid to the respondent, which could translate to more accurate reporting
• Rs in multi-person HHs forget to report their own coverage
Conclusions
1. Key components of wording are problematic:
   • "At any time during calendar year…"
   • "…was anyone in this household covered…"
   • Explore questionnaire design alternatives
2. Reporting accuracy goes up if R and referent both have Medicaid
   • Explore questionnaire designs to exploit this
   • See if results apply to other coverage types
Thoughts on Next Steps
1. Reference period:
   • Start with questions about current status
   • Ask when that coverage began
   • 'Walk' back in time to the beginning of the calendar year
2. Other HH members and shared coverage:
   • Start with R's coverage
   • For each plan type reported, ask if other HH members are also covered
   • Continue asking about other HH members by name
THANK YOU!!
Finding low-income telephone households and people who do not have health insurance using auxiliary sample frame information for a random digit dial survey
Tim Triplett, The Urban Institute
David Dutwin, ICR
Sharon Long, The Urban Institute
DC-AAPOR Seminar
August 21, 2007
Presentation Overview
Purpose: Obtain representative samples of adults without health insurance and adults in low (less than 300 percent of the federal poverty level (FPL)) and medium (between 300 and 500 percent FPL) income families while still being able to produce reliable estimates for the overall population.
Strategy: Telephone exchanges within Massachusetts were sorted in descending order by concentration of estimated household income. These exchanges were divided into three strata and we oversampled the low and middle income strata.
Results: Oversampling of low and medium income strata did increase the number of interviews completed with adults without health insurance as well as adults living at or below 300 percent FPL.
About the Study
• Telephone survey conducted in Massachusetts
• Collect baseline data prior to implementation of the Massachusetts universal health care coverage plan
• Started on October 16, 2006, ended on January 7, 2007
• 3,010 interviews with adults 18 to 64
• Key sub groups were low and middle income households and uninsured adults
• Overall response rate 49% (AAPOR RR3 formula)
Sample design features
• RDD list+2 exchanges stratified by income and grouped into high, middle, and low income strata
• Over-sampled the low-income stratum (n=1,381)
• Separate screening sample was used to increase the sample of uninsured (n=704)
• More aggressive over-sampling of the low-income stratum in the screening sample
• One adult interviewed per household
• In households with both insured and uninsured adults, the uninsured adults had a higher chance of selection
• No cell phone exchanges were sampled
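The stratify-and-oversample design can be sketched as follows, on entirely made-up exchange data. The 3:2:1 (low:middle:high) release rates mirror the main-study design described above, but the income values, stratum cutoffs, and per-stratum sample counts are illustrative.

```python
import random

# Sketch: sort exchanges by estimated household income, cut into three
# strata, and release sample at different rates per stratum.
random.seed(0)
exchanges = [{"exchange": i, "est_income": random.randint(20_000, 150_000)}
             for i in range(300)]

# Sort descending by estimated income and cut into three equal strata.
exchanges.sort(key=lambda e: e["est_income"], reverse=True)
third = len(exchanges) // 3
strata = {"high": exchanges[:third],
          "middle": exchanges[third:2 * third],
          "low": exchanges[2 * third:]}

# Release sample at 3:2:1 (low:middle:high), as in the main-study design.
rates = {"low": 3, "middle": 2, "high": 1}
release = {s: random.sample(members, 10 * rates[s]) for s, members in strata.items()}
print({s: len(v) for s, v in release.items()})  # {'high': 10, 'middle': 20, 'low': 30}
```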
Percentage of uninsured and low-income adults by income strata
Stratum    Uninsured   Low-income
High         3.5%        26.2%
Medium       8.9%        39.4%
Low         12.2%        57.9%
Overall      9.5%        45.9%
Alternate sampling strategies that could yield enough uninsured respondents without increasing survey costs
• None – no oversampling of strata; simply increase the amount of screening
• OS (2:2:1, 3:2:1) – release twice as much sample in the main study from the low and middle income strata and 3 times as much in the screener survey
• OS *(3:2:1, 5:3:1) – the strategy we used
• OS (5:3:1, 5:3:1) – same rates for main and screener
• OS (5:3:1, 8:4:1) – heavy oversample in screener
Simulation of sample sizes resulting from the various oversampling strategies
Strategy              Uninsured Sample   Overall Sample
None                        494              2,115
OS (2:2:1, 3:2:1)           659              2,674
OS *(3:2:1, 5:3:1)          704              3,010
OS (5:3:1, 5:3:1)           733              3,309
OS (5:3:1, 8:4:1)           798              3,381
Why not go for the largest sample?
• Design effects will increase as the sample becomes more clustered
• Larger design effects mean smaller effective sample sizes
• So when comparing different sampling strategies, you need to compare effective sample sizes
• We can only calculate the design effect (and effective sample size) for the sample strategy we employed
• Isolating the increase in the design effect due to the oversampling allows us to estimate the design effect for the other strategies
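The comparison rests on the identity n_eff = n / deff. As a quick illustration using the fielded strategy's own numbers (704 completed uninsured interviews, and a simulated effective size of 458 for that same design), the implied design effect is:

```python
# Effective sample size under a design effect: n_eff = n / deff.
# 704 and 458 come from the study's fielded design; the deff value
# below is implied by those two numbers, not reported directly.
n = 704
n_eff = 458
implied_deff = n / n_eff
print(round(implied_deff, 2))  # 1.54
```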
Average Design Effects
[Chart: average design effects for the uninsured and overall samples, decomposed into RDD design, income over-sample, and post-stratification components]
Simulation of effective sample sizes under various oversampling rules, taking design effects into consideration

Strategy              Uninsured Sample   Overall Sample
None                        420              1,946
OS (2:2:1, 3:2:1)           461              2,401
OS *(3:2:1, 5:3:1)          458              2,589
OS (5:3:1, 5:3:1)           432              2,713
OS (5:3:1, 8:4:1)           415              2,704
Conclusions
• Oversampling using exchange level information worked well
• Higher oversampling rate for the screener sample may not have been the best strategy
• Exchanges still cluster enough to use auxiliary information
• Except for the design we actually used, these are simulated estimates
Sampling in the next round
• Consider increasing (slightly) the oversampling rate for the main sample and decreasing (slightly) the rate for the screener sample, or using the same rate for both
• Need to sample cell phone exchanges
  • Health insurance coverage likely to be higher
• Conduct Portuguese interviews
Thank You
The survey was funded by the Blue Cross Blue Shield Foundation of Massachusetts, The Commonwealth Fund, and the Robert Wood Johnson Foundation.
The analysis of the survey design was funded by the Urban Institute's Statistical Methods Group.
Switching From Retrospective to Current Year Data Collection in the Medical Expenditure Panel Survey-Insurance Component (MEPS-IC)
Anne T. Kearney, U.S. Census Bureau
John P. Sommers, Agency for Healthcare Research and Quality
Important Terms
• Retrospective Design: collects data for the year prior to the collection period
• Current Year Design: collects data in effect at the time of collection
• Survey Year: the year of data being collected in the field
• Single Unit Establishment vs. Multi-Unit Establishment
Outline
• Background on MEPS-IC
• Why Switch to Current?/Barriers to Switching
• Impact on Frame and Reweighting Methodology
• Details of Current Year Trial Methods
• Results
• Summary
Background on MEPS-IC: General
• Annual establishment survey that provides estimates of insurance availability and costs
• Sample of 42,000 private establishments
• National and state-level estimates
• Retrospective design
Background on MEPS-IC: Timing Example
• Say the survey year is 2002 under the retrospective design:
  – Create frame/sample in March 2003 using 2001 data from the business register (BR)
  – Create SU birth frame with 2002 data from the BR
  – In the field from roughly July–December 2003
  – Reweighting in March–April 2004 using 2002 data from the BR
  – Estimation and publication in May–June 2004
Why Switch to a Current Year Design?
• Estimates published about 1 year sooner
• Some establishments report current data already; current data is at their fingertips
• Most survey estimates are conducive to a current year design
• Better coverage of businesses that closed after the survey year and before the field operation
• Some data users in favor of going current
Barriers to Switching to a Current Year Design
• One year older data for frame building
• One year older data for reweighting
These could possibly make our estimates very different, which we believe means worse
• Other data users believe the retrospective design is better for collecting certain items
Impact on Frame
Example: Let’s use 2002 survey year again:
Retrospective Current Year
Create Frame in March 2003 March 2002
SU data available 2001 2001
MU data available 2001 2000
Pick up SU Births? Yes, 2002 No
Drop SU Deaths? Yes, 2002 No
43
Impact on Reweighting: Nonresponse Adjustment
• We use an iterative raking procedure
• We do the NR adjustment using 3 sets of cells:
  – Sector Groups
  – SU/MU
  – State by Size Group
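A minimal sketch of iterative raking over several cell systems, on invented data: cycle through the cell sets, scaling weights so each set of weighted totals matches its control, until all margins agree. The three margins below merely stand in for the Sector Group, SU/MU, and State-by-Size cells; every cell assignment and target total is illustrative.

```python
import numpy as np

# Toy iterative raking (iterative proportional fitting) over three margins.
rng = np.random.default_rng(1)
m = 200
w = np.ones(m)                                  # base weights
margins = [
    (rng.integers(0, 3, m), np.array([80.0, 70.0, 90.0])),   # "sector group"
    (rng.integers(0, 2, m), np.array([120.0, 120.0])),       # "SU/MU"
    (rng.integers(0, 2, m), np.array([130.0, 110.0])),       # "state by size"
]   # note: all three sets of targets sum to the same grand total (240)

for _ in range(100):                            # iterate until margins converge
    for cells, targets in margins:
        for c, t in enumerate(targets):
            sel = cells == c
            w[sel] *= t / w[sel].sum()          # scale cell to hit its target

cells0, targets0 = margins[0]
print(np.round([w[cells0 == c].sum() for c in range(3)], 1))
```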
Impact on Reweighting: Poststratification
• We use an iterative raking procedure using 2 sets of cells:
  – State by Size Group and SU/MU
• Under the retrospective design for the 2002 survey:

\[
\mathrm{Adjwgt}_{\mathrm{PS},i} \;=\; \mathrm{Adjwgt}_{\mathrm{NR},i} \times
\frac{\sum_{j=1}^{N} \mathrm{EMP2002}_{j}}
     {\sum_{j \in R} \mathrm{Adjwgt}_{\mathrm{NR},j}\,\mathrm{EMP2002}_{j}}
\]

i.e., within each cell the NR-adjusted weights are ratio-adjusted so that weighted respondent (R) employment matches total 2002 frame employment.
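The poststratification adjustment is a ratio scaling within a cell. A small numeric sketch on invented data (all employment values and weights below are made up):

```python
import numpy as np

# Within one cell: scale NR-adjusted weights so that weighted respondent
# employment matches total frame employment.
frame_emp = np.array([12.0, 40.0, 7.0, 25.0, 60.0])  # EMP2002 for all N frame units
resp_idx = np.array([0, 1, 3])                       # respondents (R)
adjwgt_nr = np.array([1.4, 1.1, 1.6])                # NR-adjusted weights

factor = frame_emp.sum() / (adjwgt_nr * frame_emp[resp_idx]).sum()
adjwgt_ps = adjwgt_nr * factor

# Weighted respondent employment now equals the frame total.
print(round((adjwgt_ps * frame_emp[resp_idx]).sum(), 2))  # 144.0
```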
Details of Trial Methods
• One issue for frame:
  – What to do with the births
• One issue for nonresponse adjustment:
  – What employment data to use for cell assignments
• Three issues for poststratification:
  – What employment data to use for cell assignments
  – What employment data to use for total employment
  – What payroll data to use to create the list of establishments for total employment
Details of Trial Methods: 2002 Survey

            Employment Data for       Inscope List ID'd     Drop Births
Method #    Cells/Poststrat Totals    Using Data from..     from Sample?
              SU       MU               SU       MU           SU     MU
Production   2002     2002             2002     2002          No     No
1            2001     2001             2001     2001          No     No
2            2002     2001             2001     2001          No     No
3            2002     2001             2002     2001          No     No
4            2002     2001             2002     2001          Yes    No
5            2002     2001             2002     2001          Yes    Yes
Results: Definitions
• National level estimates
• Estimates by firm size
  – Establishments categorized by their firm employment

Size     Number of Employees
Large    1,000+
Medium   50 – 999
Small    1 – 49
Results: Survey Year 2002
Estimate: % Estabs that offer insurance

          Prod     Trial Method (Method minus Prod)
                     1        2        3        5
Natl      57.16    1.22*    1.07*    0.80*    0.45*
L Firm    98.82   -0.06    -0.06    -0.06    -0.04
M Firm    93.65   -0.07    -0.01     0.04     0.08
S Firm    44.72    0.84*    0.67*    0.41*    0.57*
* Indicates significant difference
Results: Survey Year 2003
Estimate: % Estabs that offer insurance

          Prod     Trial Method (Method minus Prod)
                     3        5
Natl      56.16    0.72*   -0.11
L Firm    98.68   -0.01     0.10
M Firm    90.80    0.10    -0.00
S Firm    43.49    0.64*    0.01
* Indicates significant difference
Results: Survey Year 2004
Estimate: % Estabs that offer insurance

          Prod     Trial Method (Method minus Prod)
                     3        5
Natl      55.05    0.46*    0.32
L Firm    98.81    0.05    -0.04
M Firm    91.47    0.02    -0.16
S Firm    41.97    0.41*    0.75*
* Indicates significant difference
Results: Survey Year 2005
Estimate: % Estabs that offer insurance

          Prod     Trial Method (Method minus Prod)
                     3        5
Natl      56.27   -0.25    -0.57*
L Firm    98.82    0.25     0.30*
M Firm    91.50    0.69     0.46
S Firm    43.42   -0.77*   -0.57*
* Indicates significant difference
Results: Survey Year 2002
Estimate: Avg. Single Premium

          Prod      Trial Method (Method minus Prod)
                      1       2       3       5
Natl      $3,191    -$5*    -$3     -$1     -$4
L Firm    $3,136    -$1      $1      $1     -$7
M Firm    $3,134     $2     -$4     -$2     -$6
S Firm    $3,374   -$25*    -$9*    -$4      $4
* Indicates significant difference
Results: Survey Year 2003
Estimate: Avg. Single Premium

          Prod      Trial Method (Method minus Prod)
                      3       5
Natl      $3,483     $2      $8*
L Firm    $3,428    $17*    $17*
M Firm    $3,458   -$10      $0
S Firm    $3,620    -$5      $7
* Indicates significant difference
Results: Survey Year 2004
Estimate: Avg. Single Premium

          Prod      Trial Method (Method minus Prod)
                      3       5
Natl      $3,707    -$1      $1
L Firm    $3,682    -$3     -$8
M Firm    $3,713     $5     $11
S Firm    $3,748    -$1     $10
* Indicates significant difference
Results: Survey Year 2005
Estimate: Avg. Single Premium

          Prod      Trial Method (Method minus Prod)
                      3       5
Natl      $3,992     $1      $3
L Firm    $3,933     $7      $2
M Firm    $3,972    $14     $18
S Firm    $4,134   -$24    -$14
* Indicates significant difference
Governments Sample: Need Survey Year Data
• For the Governments Sample, we need to wait until survey year data is available:
  – we don't collect employment from government units to use for our published employment estimates
  – we use data from the governments frame
Summary
• Many positives with going current – timing
• Possible frame and reweighting problems, but prior year data are a good substitute
• Tested 4 trial methods and found:
  – Estimates of premiums look good and rates look reasonable
  – Establishment and employment estimates are different, but these are not the most important estimates
Summary (cont.)
• We are planning to switch to a current year design for survey year 2008 using a methodology similar to Method 5.
• For the Governments Sample, we need to wait until survey year data is available:
  – we don't collect government unit employment to use for employment totals
DC-AAPOR Discussant Notes
AAPOR/ICES Encore: Issues in Health Insurance
David Kashihara
Agency for Healthcare Research and Quality (AHRQ)
August 21, 2007
Issues in Health Insurance
• Topic is at the forefront of American consciousness
• Surveys of health are vital to both policy-makers and researchers
• Improving these surveys should result in better policies and improved research
Medicaid Under-reporting (Pascale, Roemer & Resnick)
• The Problem:
  – Significant amount of Medicaid misreporting
    • 36.2% in the linked data set
  – Undercount probably present in other surveys
Medicaid Under-reporting (Pascale, Roemer & Resnick)
• Linking CPS records to MSIS:
  – Truth: MSIS records
  – Non-Truths?
    • MSIS "no" but CPS "yes" (over-reports)
    • Non-matching records (multiple state claims)
    • Duplicates – were removed in this study
    • How many? Impact?
Medicaid Under-reporting (Pascale, Roemer & Resnick)
• The Solution
• Good use of survey methodology
  – Cognitive testing
  – Methods
  – Analysis
• Confirmed the logical
  – Recency, intensity: salience plays a big part
• Found the not-so-logical
  – Rs in multi-person HHs sometimes forget to report own coverage
Medicaid Under-reporting (Pascale, Roemer & Resnick)
• Question:
  – If the MSIS is the Truth, how good is the truth?
• Important result:
  – Findings can hopefully help other surveys of health identify, reduce, or adjust for this misreporting
Low Income, No Insurance HHs (Triplett, Dutwin & Long)
• Lack of health insurance in U.S. a hot topic
  – 13.7% of the U.S., non-institutionalized, under-65 population (MEPS, 2004)
• Low income & no insurance are related
Low Income, No Insurance HHs (Triplett, Dutwin & Long)
• Medical Expenditure Panel Survey (MEPS)
– U.S., non-institutionalized, < 65 population
– % of persons lacking health insurance: Jan. – Dec. 2004 by income level
Income Level   % of FPL     % Psns (s.e.)
Poor           < 125        24.8 (0.75)
Low            125 – <200   22.8 (1.05)
Middle         200 – <400   13.5 (0.60)
High           400+          5.7 (0.42)
Low Income, No Insurance HHs (Triplett, Dutwin & Long)
• More info about stratification of exchanges based on income
  – What was used to determine income level?
  – How accurate is this?
  – Are the clusters homogenous? (yes)
• No cell phone exchanges sampled
  – Cell-only population
  – Increase or decrease # of uninsured?
    • My guess: increase # uninsured
    • Ages 18–24 are the highest uninsured group under 65 (22.5%)
Low Income, No Insurance HHs (Triplett, Dutwin & Long)
• Good use of design effects
– Measure provides info not always intuitive to the untrained population
– Some may always assume that more oversampling is better
– Let statistics work for you
Low Income, No Insurance HHs (Triplett, Dutwin & Long)
• If possible, try other factors that affect insurance coverage
– Age
– Race/Ethnicity
Low Income, No Insurance HHs (Triplett, Dutwin & Long)
• Medical Expenditure Panel Survey (MEPS)
– U.S., non-institutionalized, < 65 population
– % of persons lacking health insurance: Jan. – Dec. 2004 by age group
Age Group   % Psns No Ins (s.e.)
< 18         6.8 (0.44)
18 – 24     22.5 (1.03)
25 – 44     17.6 (0.63)
45 – 64     12.4 (0.53)
Low Income, No Insurance HHs (Triplett, Dutwin & Long)
• Medical Expenditure Panel Survey (MEPS)
– U.S., non-institutionalized, < 65 population
– % of persons lacking health insurance: Jan. – Dec. 2004 by race/ethnicity
Race/Ethnicity         % Psns No Ins (s.e.)
Hispanic               28.1 (0.94)
Black, Non-Hisp.       15.0 (1.09)
Asian/Oth, Non-Hisp.   10.3 (0.38)
Retrospective to Current Year Design (Kearney & Sommers)
• Decisions, Decisions, Decisions
– How close is good enough?
– Weighted pros & cons list
– Administrative barriers
Retrospective to Current Year Design (Kearney & Sommers)
• Good list of pros & cons
• On the balance:
  – Different data users prefer different designs
  – Best design to please the most data users?
  – Best design for accurate estimates?
  – What is most important?
    • What the users want
Retrospective to Current Year Design (Kearney & Sommers)
• How good is the Gold Standard (GS)?
  – "Survey-Year Data"
  – Reason it's a GS
  – GS may have flaws
  – Sometimes methodology changes correct or cancel biases
  – GS is nice to have, but many surveys don't have this luxury and still produce excellent estimates
Retrospective to Current Year Design (Kearney & Sommers)
• Well-devised study
  – Trials useful to tease out sources of problems
  – Results look promising – a convincing argument to move forward
• Impact of the "minor" estimates?
  – Found to be different
Retrospective to Current Year Design (Kearney & Sommers)
• Transition to new design – any contingency plans?
  – In case the new design doesn't work well in reality
  – Concurrent samples (old & new methods)
    • Draw 2nd sample (old method) when items become available
  – Estimate bias between methods
  – Not cost effective or efficient
Issues in Health Insurance
• Three very good studies
• Methods & findings could be applied to other surveys
• We should be constantly improving surveys & making them more useful