TRANSCRIPT
Medicaid Underreporting in the CPS: Results from a Record Check Study
Joanne Pascale
Marc Roemer
Dean Resnick
US Census Bureau
DC-AAPOR
August 21, 2007
Medicaid Undercount
• Records show higher Medicaid enrollment levels than survey estimates (~10-30%)
• Undercount affects many different surveys of health insurance
• Non-reporting error sources contribute
• Under-reporting is the largest contributor to the undercount
Current Population Survey
• Focus is on under-reporting in CPS
  – Produces most widely-cited estimates on health insurance and the uninsured
  – Other surveys gauge estimates against CPS; mimic CPS design
• CPS = monthly survey on labor force and poverty; health insurance questions asked in annual supplement
CPS Health Insurance Questions: ‘Type by Type’ Structure
1. Job-based
2. Directly-purchased
3. Someone outside HH
4. Medicare
5. Medicaid
6. SCHIP
7. Military
8. Other
CPS Health Insurance Questions: Calendar Year Reference Period
• Survey is conducted in March
• Questions ask about coverage during previous calendar year
• “At any time during 2000, was anyone in this household covered by [plan type]?”
CPS Health Insurance Questions: Household-level Design
• Multi-person household:
  – At any time during 2000 was anyone in this household covered by [plan type]?
  – [if yes] Who was that?
• Single-person household:
  – At any time during 2000 were you covered by [plan type]?
CPS Cognitive Testing
• Three main sources of misreporting:
  1. Type-by-type structure: Rs 'pre-report' and try to 'fit' coverage into the earliest question
  2. 12-month reference period: some respondents focus on current coverage or 'spell'
  3. Household size and complexity
More on HH Size and Complexity
• Rs forgot about certain HH members
• Rs did not know enough detail about other HH members’ plan type
• Neither problem was related to 'closeness' between R and referent; it affected housemates and distant relatives, but also parents, siblings, and live-in partners
Shared Coverage Hypothesis
• Health insurance administered in 'units'
  – Private and military coverage: nuclear family
  – Medicaid and ~SCHIP: parent and children
  – Medicare: individual
• Any given HH may have a mix of units
  – E.g.: R on his union plan; mother on Medicare; sister and child on Medicaid; live-in partner and her child on her job plan
• R may be able to report more accurately for HH members who are in their same unit (i.e.: share the same coverage type)
Methods
• Linked CPS survey and ‘MSIS’ record data for year 2000
• Analysis dataset: CPS sample members…
  – known to be on Medicaid according to records
  – for whom a direct response to 'Medicaid' was reported in CPS (not edited or imputed)
    • Several items fed into 'Medicaid' indicator (Medicaid, SCHIP, other government plan, other)
  – n = 19,345
• Dependent var = whether Medicaid was reported for the known enrollees
Shared Coverage Variable
• Referent (person reported on) is R (self-report)
  A. in single-person HH
  B. in multi-person HH
• Referent is not R (proxy report)
  C. Both are on same Medicaid case
  D. Both are on Medicaid (different cases)
  E. Referent is on Medicaid; R is not
Logistic Regression Model
• Dependent var = Medicaid status reported in CPS
• Independent vars:
  • HH composition
    – Shared coverage var
    – Another HH member had Medicaid w/in year
  • Recency and intensity of coverage
    – Most recent month referent enrolled
    – Proportion of days covered from January till last month enrolled
    – Referent covered in survey month
  • Referent received Medicaid services w/in year
  • Demographics
    – Sex of R
    – Age and race/ethnicity of referent
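A model of this general shape can be sketched in code. This is an illustrative fit on SYNTHETIC data, not the authors' linked file: the predictor names, their codings, and the data-generating coefficients below are all invented, and the fit uses plain Newton-Raphson rather than whatever software the authors used.

```python
import numpy as np

# Illustrative logistic regression of "Medicaid reported in CPS" (1/0)
# on a few predictors like those listed above, fit to synthetic data.
rng = np.random.default_rng(0)
n = 5_000

months_since_enrolled = rng.integers(0, 12, n).astype(float)  # recency of coverage
received_services = rng.integers(0, 2, n).astype(float)       # Medicaid services w/in year
shared_coverage = rng.integers(0, 2, n).astype(float)         # R and referent share coverage

# Assumed data-generating model (invented): recency and service receipt
# raise the odds that Medicaid is reported for a known enrollee.
true_logit = 0.5 - 0.15 * months_since_enrolled + 0.8 * received_services + 0.3 * shared_coverage
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-true_logit))).astype(float)

# Newton-Raphson (IRLS) fit of the logistic model.
X = np.column_stack([np.ones(n), months_since_enrolled, received_services, shared_coverage])
beta = np.zeros(X.shape[1])
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    grad = X.T @ (y - p)
    hess = (X * (p * (1 - p))[:, None]).T @ X
    beta += np.linalg.solve(hess, grad)

print(np.round(beta, 2))  # estimates should land near the assumed coefficients
```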
Results: Overview of Linked Dataset
• Of 173,967 CPS hh members, 19,345 (11.1%) had Medicaid according to records
• Medicaid was reported in CPS for only 12,351 (7.1%) hh members
• => 36.2% under-reporting
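These figures can be reproduced directly from the counts on this slide:

```python
# Check of the undercount arithmetic; all inputs are the slide's own figures.
hh_members = 173_967
on_medicaid_per_records = 19_345
reported_in_cps = 12_351

print(round(100 * on_medicaid_per_records / hh_members, 1))             # 11.1
print(round(100 * reported_in_cps / hh_members, 1))                     # 7.1
print(round(100 * (1 - reported_in_cps / on_medicaid_per_records), 1))  # 36.2
```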
Results: Overall Regression
• Model is highly significant in explaining misreporting
• Effect of each variable is significant and highly discernible
• Ranked each of the 9 independent vars according to its importance to model
Ranking of Independent Vars
1. Most recent month enrolled
2. Proportion of days covered from January
3. Received Medicaid services w/in year
4. Race/ethnicity of referent
5. Sex of respondent
6. Another HH member had coverage w/in year
7. Age of referent
8. Covered in survey month
9. Shared coverage var
Categorization of Independent Vars
• Recency and intensity of coverage
  1. Most recent month enrolled
  2. Proportion of days covered from January till last month enrolled
  8. Covered in survey month
• Receipt of Medicaid services
  3. Received services w/in year
• Demographics
  4. Race/ethnicity of referent (white non-Hispanic)
  5. Sex of R
  7. Age of referent
• HH composition
  6. Another HH member had coverage w/in year
  9. Shared coverage var
Results: Shared Coverage Var
• Expected ranking:
  A. Self-report in single-person HH
  B. Self-report in multi-person HH
  C. Proxy report, same case
  D. Proxy report, different case
  E. Proxy report; R does not have Medicaid
• Actual ranking:
  A
  C
  D/B
  D/B
  E
Summary
• Recency, intensity of coverage
• Receipt of Medicaid services
• Shared coverage
All contribute to the saliency of Medicaid to the respondent, which could translate to more accurate reporting
• Rs in multi-person HHs forget to report their own coverage
Conclusions
1. Key components of wording are problematic:
   • "At any time during calendar year…"
   • "…was anyone in this household covered…"
   • Explore questionnaire design alternatives
2. Reporting accuracy goes up if R and referent both have Medicaid
   • Explore questionnaire designs to exploit this
   • See if results apply to other coverage types
Thoughts on Next Steps
1. Reference period:
   • Start with questions about current status
   • Ask when that coverage began
   • 'Walk' back in time to the beginning of the calendar year
2. Other HH members and shared coverage:
   • Start with R's coverage
   • For each plan type reported, ask if other HH members are also covered
   • Continue asking about other HH members by name
THANK YOU!!
Finding low-income telephone households and people who do not have health insurance using auxiliary sample frame information for a random digit dial survey
Tim Triplett, The Urban Institute
David Dutwin, ICR
Sharon Long, The Urban Institute
DC-AAPOR Seminar
August 21, 2007
Presentation Overview
Purpose: Obtain representative samples of adults without health insurance and adults in low (less than 300 percent of the federal poverty level (FPL)) and medium (between 300 and 500 percent FPL) income families while still being able to produce reliable estimates for the overall population.
Strategy: Telephone exchanges within Massachusetts were sorted in descending order by concentration of estimated household income. These exchanges were divided into three strata and we oversampled the low and middle income strata.
Results: Oversampling of low and medium income strata did increase the number of interviews completed with adults without health insurance as well as adults living at or below 300 percent FPL.
About the Study
• Telephone survey conducted in Massachusetts
• Collect baseline data prior to implementation of the Massachusetts universal health care coverage plan
• Started on October 16, 2006, ended on January 7, 2007
• 3,010 interviews with adults 18 to 64
• Key sub groups were low and middle income households and uninsured adults
• Overall response rate 49% (AAPOR RR3 formula)
Sample design features
• RDD list+2 exchanges stratified by income and grouped into high, middle, and low income strata
• Over-sampled the low-income stratum (n=1,381)
• Separate screening sample was used to increase the sample of uninsured (n=704)
• More aggressive over-sampling of the low-income stratum in the screening sample
• One adult interviewed per household
• In households with both insured and uninsured adults, the uninsured adults had a higher chance of selection
• No cell phone exchanges were sampled
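The stratify-and-oversample design can be sketched as follows, on entirely made-up exchange data. The 3:2:1 (low:middle:high) release rates mirror the main-study design described above, but the income values, stratum cutoffs, and per-stratum sample counts are illustrative.

```python
import random

# Sketch: sort exchanges by estimated household income, cut into three
# strata, and release sample at different rates per stratum.
random.seed(0)
exchanges = [{"exchange": i, "est_income": random.randint(20_000, 150_000)}
             for i in range(300)]

# Sort descending by estimated income and cut into three equal strata.
exchanges.sort(key=lambda e: e["est_income"], reverse=True)
third = len(exchanges) // 3
strata = {"high": exchanges[:third],
          "middle": exchanges[third:2 * third],
          "low": exchanges[2 * third:]}

# Release sample at 3:2:1 (low:middle:high), as in the main-study design.
rates = {"low": 3, "middle": 2, "high": 1}
release = {s: random.sample(members, 10 * rates[s]) for s, members in strata.items()}
print({s: len(v) for s, v in release.items()})  # {'high': 10, 'middle': 20, 'low': 30}
```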
Percentage of uninsured and low-income adults by income strata
Stratum    Uninsured   Low-income
High         3.5%        26.2%
Medium       8.9%        39.4%
Low         12.2%        57.9%
Overall      9.5%        45.9%
Alternate sampling strategies that could yield enough uninsured respondents without increasing survey costs
• None – no oversampling of strata; simply increase the amount of screening
• OS (2:2:1, 3:2:1) – release twice as much sample in the main study from the low and middle income strata and 3 times as much in the screener survey
• OS *(3:2:1, 5:3:1) – the strategy we used
• OS (5:3:1, 5:3:1) – same rates for main and screener
• OS (5:3:1, 8:4:1) – heavy oversample in screener
Simulation of sample sizes resulting from the various oversampling strategies
Strategy              Uninsured Sample   Overall Sample
None                        494              2,115
OS (2:2:1, 3:2:1)           659              2,674
OS *(3:2:1, 5:3:1)          704              3,010
OS (5:3:1, 5:3:1)           733              3,309
OS (5:3:1, 8:4:1)           798              3,381
Why not go for the largest sample?
• Design effects will increase as the sample becomes more clustered
• Larger design effects mean smaller effective sample sizes
• So when comparing different sampling strategies, you need to compare effective sample sizes
• We can only calculate the design effect (and effective sample size) for the sample strategy we employed
• Isolating the increase in the design effect due to the oversampling allows us to estimate the design effect for the other strategies
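The comparison rests on the identity n_eff = n / deff. As a quick illustration using the fielded strategy's own numbers (704 completed uninsured interviews, and a simulated effective size of 458 for that same design), the implied design effect is:

```python
# Effective sample size under a design effect: n_eff = n / deff.
# 704 and 458 come from the study's fielded design; the deff value
# below is implied by those two numbers, not reported directly.
n = 704
n_eff = 458
implied_deff = n / n_eff
print(round(implied_deff, 2))  # 1.54
```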
Average Design Effects
[Chart: average design effects for the uninsured and overall samples, decomposed into RDD design, income over-sample, and post-stratification components]
Simulation of effective sample sizes under various oversampling rules, taking design effects into consideration

Strategy              Uninsured Sample   Overall Sample
None                        420              1,946
OS (2:2:1, 3:2:1)           461              2,401
OS *(3:2:1, 5:3:1)          458              2,589
OS (5:3:1, 5:3:1)           432              2,713
OS (5:3:1, 8:4:1)           415              2,704
Conclusions
• Oversampling using exchange level information worked well
• Higher oversampling rate for the screener sample may not have been the best strategy
• Exchanges still cluster enough to use auxiliary information
• Except for the design we actually used, these are simulated estimates
Sampling in the next round
• Consider increasing (slightly) the oversampling rate for the main sample and decreasing (slightly) the rate for the screener sample, or using the same rate for both
• Need to sample cell phone exchanges
  • Health insurance coverage likely to be higher
• Conduct Portuguese interviews
Thank You
The survey was funded by the Blue Cross Blue Shield Foundation of Massachusetts, The Commonwealth Fund, and the Robert Wood Johnson Foundation.
The analysis of the survey design was funded by the Urban Institute's Statistical Methods Group.
Switching From Retrospective to Current Year Data Collection in the Medical Expenditure Panel Survey-Insurance Component (MEPS-IC)
Anne T. Kearney, U.S. Census Bureau
John P. Sommers, Agency for Healthcare Research and Quality
Important Terms
• Retrospective Design: collects data for the year prior to the collection period
• Current Year Design: collects data in effect at the time of collection
• Survey Year: the year of data being collected in the field
• Single Unit Establishment vs. Multi-Unit Establishment
Outline
• Background on MEPS-IC
• Why Switch to Current?/Barriers to Switching
• Impact on Frame and Reweighting Methodology
• Details of Current Year Trial Methods
• Results
• Summary
Background on MEPS-IC: General
• Annual establishment survey that provides estimates of insurance availability and costs
• Sample of 42,000 private establishments
• National and state-level estimates
• Retrospective design
Background on MEPS-IC: Timing Example
• Say the survey year is 2002 under the retrospective design:
  – Create frame/sample in March 2003 using 2001 data from the business register (BR)
  – Create SU birth frame with 2002 data from the BR
  – In the field from roughly July–December 2003
  – Reweighting in March–April 2004 using 2002 data from the BR
  – Estimation and publication in May–June 2004
Why Switch to a Current Year Design?
• Estimates published about 1 year sooner
• Some establishments report current data already; current data is at their fingertips
• Most survey estimates are conducive to a current year design
• Better coverage of businesses that closed after the survey year and before the field operation
• Some data users in favor of going current
Barriers to Switching to a Current Year Design
• One year older data for frame building
• One year older data for reweighting
These could possibly make our estimates very different, which we believe means worse
• Other data users believe the retrospective design is better for collecting certain items
Impact on Frame
Example: Let’s use 2002 survey year again:
Retrospective Current Year
Create Frame in March 2003 March 2002
SU data available 2001 2001
MU data available 2001 2000
Pick up SU Births? Yes, 2002 No
Drop SU Deaths? Yes, 2002 No
43
Impact on Reweighting: Nonresponse Adjustment
• We use an iterative raking procedure
• We do the NR adjustment using 3 sets of cells:
  – Sector Groups
  – SU/MU
  – State by Size Group
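A minimal sketch of iterative raking over several cell systems, on invented data: cycle through the cell sets, scaling weights so each set of weighted totals matches its control, until all margins agree. The three margins below merely stand in for the Sector Group, SU/MU, and State-by-Size cells; every cell assignment and target total is illustrative.

```python
import numpy as np

# Toy iterative raking (iterative proportional fitting) over three margins.
rng = np.random.default_rng(1)
m = 200
w = np.ones(m)                                  # base weights
margins = [
    (rng.integers(0, 3, m), np.array([80.0, 70.0, 90.0])),   # "sector group"
    (rng.integers(0, 2, m), np.array([120.0, 120.0])),       # "SU/MU"
    (rng.integers(0, 2, m), np.array([130.0, 110.0])),       # "state by size"
]   # note: all three sets of targets sum to the same grand total (240)

for _ in range(100):                            # iterate until margins converge
    for cells, targets in margins:
        for c, t in enumerate(targets):
            sel = cells == c
            w[sel] *= t / w[sel].sum()          # scale cell to hit its target

cells0, targets0 = margins[0]
print(np.round([w[cells0 == c].sum() for c in range(3)], 1))
```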
Impact on Reweighting: Poststratification
• We use an iterative raking procedure using 2 sets of cells:
  – State by Size Group and SU/MU
• Under the retrospective design for the 2002 survey:

\[
\mathrm{Adjwgt}_{\mathrm{PS},i} \;=\; \mathrm{Adjwgt}_{\mathrm{NR},i} \times
\frac{\sum_{j=1}^{N} \mathrm{EMP2002}_{j}}
     {\sum_{j \in R} \mathrm{Adjwgt}_{\mathrm{NR},j}\,\mathrm{EMP2002}_{j}}
\]

i.e., within each cell the NR-adjusted weights are ratio-adjusted so that weighted respondent (R) employment matches total 2002 frame employment.
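The poststratification adjustment is a ratio scaling within a cell. A small numeric sketch on invented data (all employment values and weights below are made up):

```python
import numpy as np

# Within one cell: scale NR-adjusted weights so that weighted respondent
# employment matches total frame employment.
frame_emp = np.array([12.0, 40.0, 7.0, 25.0, 60.0])  # EMP2002 for all N frame units
resp_idx = np.array([0, 1, 3])                       # respondents (R)
adjwgt_nr = np.array([1.4, 1.1, 1.6])                # NR-adjusted weights

factor = frame_emp.sum() / (adjwgt_nr * frame_emp[resp_idx]).sum()
adjwgt_ps = adjwgt_nr * factor

# Weighted respondent employment now equals the frame total.
print(round((adjwgt_ps * frame_emp[resp_idx]).sum(), 2))  # 144.0
```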
Details of Trial Methods
• One issue for frame:
  – What to do with the births
• One issue for nonresponse adjustment:
  – What employment data to use for cell assignments
• Three issues for poststratification:
  – What employment data to use for cell assignments
  – What employment data to use for total employment
  – What payroll data to use to create the list of establishments for total employment
Details of Trial Methods: 2002 Survey

            Employment Data for       Inscope List ID'd     Drop Births
Method #    Cells/Poststrat Totals    Using Data from..     from Sample?
              SU       MU               SU       MU           SU     MU
Production   2002     2002             2002     2002          No     No
1            2001     2001             2001     2001          No     No
2            2002     2001             2001     2001          No     No
3            2002     2001             2002     2001          No     No
4            2002     2001             2002     2001          Yes    No
5            2002     2001             2002     2001          Yes    Yes
Results: Definitions
• National level estimates
• Estimates by firm size
  – Establishments categorized by their firm employment

Size     Number of Employees
Large    1,000+
Medium   50 – 999
Small    1 – 49
Results: Survey Year 2002
Estimate: % Estabs that offer insurance

          Prod     Trial Method (Method minus Prod)
                     1        2        3        5
Natl      57.16    1.22*    1.07*    0.80*    0.45*
L Firm    98.82   -0.06    -0.06    -0.06    -0.04
M Firm    93.65   -0.07    -0.01     0.04     0.08
S Firm    44.72    0.84*    0.67*    0.41*    0.57*
* Indicates significant difference
Results: Survey Year 2003
Estimate: % Estabs that offer insurance

          Prod     Trial Method (Method minus Prod)
                     3        5
Natl      56.16    0.72*   -0.11
L Firm    98.68   -0.01     0.10
M Firm    90.80    0.10    -0.00
S Firm    43.49    0.64*    0.01
* Indicates significant difference
Results: Survey Year 2004
Estimate: % Estabs that offer insurance

          Prod     Trial Method (Method minus Prod)
                     3        5
Natl      55.05    0.46*    0.32
L Firm    98.81    0.05    -0.04
M Firm    91.47    0.02    -0.16
S Firm    41.97    0.41*    0.75*
* Indicates significant difference
Results: Survey Year 2005
Estimate: % Estabs that offer insurance

          Prod     Trial Method (Method minus Prod)
                     3        5
Natl      56.27   -0.25    -0.57*
L Firm    98.82    0.25     0.30*
M Firm    91.50    0.69     0.46
S Firm    43.42   -0.77*   -0.57*
* Indicates significant difference
Results: Survey Year 2002
Estimate: Avg. Single Premium

          Prod      Trial Method (Method minus Prod)
                      1       2       3       5
Natl      $3,191    -$5*    -$3     -$1     -$4
L Firm    $3,136    -$1      $1      $1     -$7
M Firm    $3,134     $2     -$4     -$2     -$6
S Firm    $3,374   -$25*    -$9*    -$4      $4
* Indicates significant difference
Results: Survey Year 2003
Estimate: Avg. Single Premium

          Prod      Trial Method (Method minus Prod)
                      3       5
Natl      $3,483     $2      $8*
L Firm    $3,428    $17*    $17*
M Firm    $3,458   -$10      $0
S Firm    $3,620    -$5      $7
* Indicates significant difference
Results: Survey Year 2004
Estimate: Avg. Single Premium

          Prod      Trial Method (Method minus Prod)
                      3       5
Natl      $3,707    -$1      $1
L Firm    $3,682    -$3     -$8
M Firm    $3,713     $5     $11
S Firm    $3,748    -$1     $10
* Indicates significant difference
Results: Survey Year 2005
Estimate: Avg. Single Premium

          Prod      Trial Method (Method minus Prod)
                      3       5
Natl      $3,992     $1      $3
L Firm    $3,933     $7      $2
M Firm    $3,972    $14     $18
S Firm    $4,134   -$24    -$14
* Indicates significant difference
Governments Sample: Need Survey Year Data
• For the Governments Sample, we need to wait until survey year data is available:
  – we don't collect employment from government units to use for our published employment estimates
  – we use data from the governments frame
Summary
• Many positives with going current – timing
• Possible frame and reweighting problems, but prior year data are a good substitute
• Tested 4 trial methods and found:
  – Estimates of premiums look good and rates look reasonable
  – Establishment and employment estimates are different, but these are not the most important estimates
Summary (cont.)
• We are planning to switch to a current year design for survey year 2008 using a methodology similar to Method 5.
• For the Governments Sample, we need to wait until survey year data is available:
  – we don't collect government unit employment to use for employment totals
DC-AAPOR Discussant Notes
AAPOR/ICES Encore: Issues in Health Insurance
David Kashihara
Agency for Healthcare Research and Quality (AHRQ)
August 21, 2007
Issues in Health Insurance
• Topic is at the forefront of American consciousness
• Surveys of health are vital to both policy-makers and researchers
• Improving these surveys should result in better policies and improved research
Medicaid Under-reporting (Pascale, Roemer & Resnick)
• The Problem:
  – Significant amount of Medicaid misreporting
    • 36.2% in the linked data set
  – Undercount probably present in other surveys
Medicaid Under-reporting (Pascale, Roemer & Resnick)
• Linking CPS records to MSIS:
  – Truth: MSIS records
  – Non-Truths?
    • MSIS "no" but CPS "yes" (over-reports)
    • Non-matching records (multiple state claims)
    • Duplicates – were removed in this study
    • How many? Impact?
Medicaid Under-reporting (Pascale, Roemer & Resnick)
• The Solution
• Good use of survey methodology
  – Cognitive testing
  – Methods
  – Analysis
• Confirmed the logical
  – Recency, intensity: salience plays a big part
• Found the not-so-logical
  – Rs in multi-person HHs sometimes forget to report own coverage
Medicaid Under-reporting (Pascale, Roemer & Resnick)
• Question:
  – If the MSIS is the Truth, how good is the truth?
• Important result:
  – Findings can hopefully help other surveys of health identify, reduce, or adjust for this misreporting
Low Income, No Insurance HHs (Triplett, Dutwin & Long)
• Lack of health insurance in U.S. a hot topic
  – 13.7% of the U.S., non-institutionalized, under-65 population (MEPS, 2004)
• Low income & no insurance are related
Low Income, No Insurance HHs (Triplett, Dutwin & Long)
• Medical Expenditure Panel Survey (MEPS)
– U.S., non-institutionalized, < 65 population
– % of persons lacking health insurance: Jan. – Dec. 2004 by income level
Income Level   % of FPL     % Psns (s.e.)
Poor           < 125        24.8 (0.75)
Low            125 – <200   22.8 (1.05)
Middle         200 – <400   13.5 (0.60)
High           400+          5.7 (0.42)
Low Income, No Insurance HHs (Triplett, Dutwin & Long)
• More info about stratification of exchanges based on income
  – What was used to determine income level?
  – How accurate is this?
  – Are the clusters homogenous? (yes)
• No cell phone exchanges sampled
  – Cell-only population
  – Increase or decrease # of uninsured?
    • My guess: increase # uninsured
    • Ages 18–24 are the highest uninsured group under 65 (22.5%)
Low Income, No Insurance HHs (Triplett, Dutwin & Long)
• Good use of design effects
– Measure provides info not always intuitive to the untrained population
– Some may always assume that more oversampling is better
– Let statistics work for you
Low Income, No Insurance HHs (Triplett, Dutwin & Long)
• If possible, try other factors that affect insurance coverage
– Age
– Race/Ethnicity
Low Income, No Insurance HHs (Triplett, Dutwin & Long)
• Medical Expenditure Panel Survey (MEPS)
– U.S., non-institutionalized, < 65 population
– % of persons lacking health insurance: Jan. – Dec. 2004 by age group
Age Group   % Psns No Ins (s.e.)
< 18         6.8 (0.44)
18 – 24     22.5 (1.03)
25 – 44     17.6 (0.63)
45 – 64     12.4 (0.53)
Low Income, No Insurance HHs (Triplett, Dutwin & Long)
• Medical Expenditure Panel Survey (MEPS)
– U.S., non-institutionalized, < 65 population
– % of persons lacking health insurance: Jan. – Dec. 2004 by race/ethnicity
Race/Ethnicity         % Psns No Ins (s.e.)
Hispanic               28.1 (0.94)
Black, Non-Hisp.       15.0 (1.09)
Asian/Oth, Non-Hisp.   10.3 (0.38)
Retrospective to Current Year Design (Kearney & Sommers)
• Decisions, Decisions, Decisions
– How close is good enough?
– Weighted pros & cons list
– Administrative barriers
Retrospective to Current Year Design (Kearney & Sommers)
• Good list of pros & cons
• On the balance:
  – Different data users prefer different designs
  – Best design to please the most data users?
  – Best design for accurate estimates?
  – What is most important?
    • What the users want
Retrospective to Current Year Design (Kearney & Sommers)
• How good is the Gold Standard (GS)?
  – "Survey-Year Data"
  – Reason it's a GS
  – GS may have flaws
  – Sometimes methodology changes correct or cancel biases
  – GS is nice to have, but many surveys don't have this luxury and still produce excellent estimates
Retrospective to Current Year Design (Kearney & Sommers)
• Well-devised study
  – Trials useful to tease out sources of problems
  – Results look promising – a convincing argument to move forward
• Impact of the "minor" estimates?
  – Found to be different
Retrospective to Current Year Design (Kearney & Sommers)
• Transition to new design – any contingency plans?
  – In case the new design doesn't work well in reality
  – Concurrent samples (old & new methods)
    • Draw 2nd sample (old method) when items become available
  – Estimate bias between methods
  – Not cost effective or efficient
Issues in Health Insurance
• Three very good studies
• Methods & findings could be applied to other surveys
• We should be constantly improving surveys & making them more useful