Research Question
• Are nursing homes dangerous for seniors? Does admission to a nursing home increase the risk of death in adults over 65 years of age when controlling for age, gender, race, and number of emergency room visits?
Propensity Score Matching
or
Do nursing homes kill you?
ANNMARIA DE MARS, PH.D.
&
CHELSEA HEAVEN
THE JULIA GROUP
WHY YOU NEED IT
TWO NON-EQUIVALENT GROUPS
Patients in specialized units
People who attend a fundraising event
Any time you can ask the question:
Is there a difference in OUTCOME between levels of “treatment” A,
controlling for X, Y and Z?
Examples
OUTCOME     “TREATMENT” LEVELS                   COVARIATES
DROP OUT    PUBLIC, PRIVATE                      INCOME, PARENT EDUCATION, GR. 8 ACHIEVEMENT
BMI         DAILY SOFT DRINKS, NO SOFT DRINKS    GENDER, AGE, RACE, EXERCISE FREQ.
DEATH       LIVES AT HOME, NURSING HOME          AGE, GENDER, TOTAL ER VISITS
2a. Decide on covariates
• Are the differences pre-existing or could they possibly be due to the different “treatment” levels?
• Race and gender are good choices for covariates. If more students at private vs public schools are black or female, the schooling probably didn’t cause that.
• Differences in grade 10 math scores may be a result of the type of school, so they are a poor choice for a covariate.
3. Run logistic regression to generate propensity scores
PROC LOGISTIC DATA = datasetname ;
  CLASS categorical variables ;
  MODEL dependent = list-of-covariates ;
  OUTPUT OUT = newdataset
    PREDICTED = propensity-score ;
RUN ;
4. Select matching method
1. Quintiles
2. Nearest neighbors
3. Calipers
ALL OF THE ABOVE CAN BE DONE EITHER WITH OR WITHOUT REPLACEMENT
Our data
Kaiser Permanente Study of the Oldest Old, 1971-1979 and 1980-1988: [California]
DEPENDENT VARIABLE:
Dthflag = 1 if Died during study period
0 if alive at end of study period
Our data
TREATMENT VARIABLE
athome = 1 if lived at home continuously
0 if admitted to nursing home any time during study period
Before matching
Frequency (column %)
DIED     AT HOME: NO     AT HOME: YES    TOTAL
NO       184 (14.6)      2,486 (52.6)    2,670 (44.6)
YES      1,077 (85.4)    2,239 (47.4)    3,316 (55.4)
TOTAL    1,261           4,725           5,986
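As a rough sketch, the crosstab above can be reproduced from its four cell counts. The data set name before and the count variable n are made up for this illustration, and the CHISQ option is an addition that the slides do not show.

```sas
/* Cell counts copied from the "Before matching" table.          */
/* The data set name (before) and variable n are illustrative.   */
DATA before ;
  INPUT dthflag athome n ;
  DATALINES ;
0 0 184
0 1 2486
1 0 1077
1 1 2239
;
RUN ;

/* WEIGHT treats each row as n observations.                     */
/* NOROW NOPERCENT leaves only frequencies and column percents.  */
/* CHISQ (an assumption, not in the slides) tests association.   */
PROC FREQ DATA = before ;
  WEIGHT n ;
  TABLES dthflag * athome / NOROW NOPERCENT CHISQ ;
RUN ;
```

With the full data set, the same TABLES statement would run directly on saslib.old without the WEIGHT statement.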
Covariates *
•AGE
•RACE
•GENDER
•TOTAL Emergency Room VISITS **
* Three out of four were DEFINITELY pre-existing differences
** Proxy for health
PROC LOGISTIC
Create propensity scores
PROC LOGISTIC DATA = saslib.old ;
  CLASS athome race sex ;
  MODEL athome = race sex age_comp vissum1 ;
  OUTPUT OUT = study.allpropen PREDICTED = prob ;
RUN ;
NOTE: No DESCENDING option, so prob is the predicted probability of athome = 0 (admitted to a nursing home)
Yes, pre-existing differences
TYPE 3 ANALYSIS OF EFFECTS
Effect      DF    Wald Chi-Square    Pr > ChiSq
RACE        4     18.7017            0.0009
SEX         1     12.5424            0.0004
age_comp    1     412.8103           <.0001
VISSUM1     1     212.9695           <.0001
Part on creating quintiles blatantly copied (almost)
http://www.pauldickman.com/teaching/sas/quintiles.php
Calculate Quintile Cutpoints
PROC UNIVARIATE DATA = study.allpropen ;
  VAR prob ;
  OUTPUT OUT = quintile
    PCTLPTS = 20 40 60 80 PCTLPRE = pct ;
RUN ;
Remember the dataset we created with the predicted probabilities saved in it?
PROC UNIVARIATE DATA = study.allpropen ;
  VAR prob ;                 /* predicted probability as the analysis variable */
  OUTPUT OUT = quintile      /* output to a dataset named quintile */
    PCTLPTS = 20 40 60 80    /* create four variables at these percentiles */
    PCTLPRE = pct ;          /* with the prefix pct: pct20, pct40, pct60, pct80 */
RUN ;
/* write the quintiles to macro variables */
data _null_ ;
  set quintile ;
  call symput('q1', pct20) ;
  call symput('q2', pct40) ;
  call symput('q3', pct60) ;
  call symput('q4', pct80) ;
run ;
Just because I am too lazy to write down the percentiles
Create quintiles
data STUDY.AllPropen ;
  set STUDY.AllPropen ;
  if prob = . then quintile = . ;
  else if prob le &q1 then quintile = 1 ;
  else if prob le &q2 then quintile = 2 ;
  else if prob le &q3 then quintile = 3 ;
  else if prob le &q4 then quintile = 4 ;
  else quintile = 5 ;
run ;
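An alternative, not used in the slides, is to let PROC RANK form the five groups in one step. The sketch below runs on a toy data set of made-up probabilities (toy, toy_q and quintile0 are illustrative names). PROC RANK may place tied or boundary values differently from the percentile cutpoints above, so the two approaches should not be expected to agree exactly.

```sas
/* Toy data: ten made-up propensity scores, for illustration only */
DATA toy ;
  INPUT prob @@ ;
  DATALINES ;
0.05 0.12 0.20 0.33 0.41 0.48 0.56 0.67 0.79 0.93
;
RUN ;

/* GROUPS=5 splits the ranked values into five groups numbered 0-4 */
PROC RANK DATA = toy OUT = toy_q GROUPS = 5 ;
  VAR prob ;
  RANKS quintile0 ;
RUN ;

/* Shift to 1-5 to match the quintile coding used above;          */
/* a missing prob keeps a missing quintile.                       */
DATA toy_q ;
  SET toy_q ;
  IF quintile0 NE . THEN quintile = quintile0 + 1 ;
  DROP quintile0 ;
RUN ;

PROC PRINT DATA = toy_q ;
RUN ;
```

This skips the macro variables entirely, at the cost of not having the cutpoints on hand for later use.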
Quintiles
Quintile    Frequency    Percent    Cumulative Frequency    Cumulative Percent
1           1075         19.76      1075                    19.76
2           1101         20.24      2176                    40.00
3           1088         20.00      3264                    60.00
4           1088         20.00      4352                    80.00
5           1088         20.00      5440                    100.00
Create case & control data sets
DATA small large ;
  SET study.allpropen ;
  IF athome = 0 THEN OUTPUT small ;
  ELSE IF athome = 1 THEN OUTPUT large ;
RUN ;
Quintiles in smaller data set
Quintile    Frequency    Percent    Cumulative Frequency    Cumulative Percent
1           50           4.06       50                      4.06
2           115          9.33       165                     13.39
3           208          16.88      373                     30.28
4           338          27.44      711                     57.71
5           521          42.29      1232                    100.00
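The samp_pct data set used in the next step is never shown being created. One plausible way, assuming it came from the OUT= option of PROC FREQ (which writes the variables COUNT and PERCENT), is sketched here using the quintile counts from the table above. The names small_counts and n are made up for the illustration.

```sas
/* Quintile counts for the smaller (nursing home) group,          */
/* copied from the table above; small_counts and n are            */
/* illustrative names.                                            */
DATA small_counts ;
  INPUT quintile n ;
  DATALINES ;
1 50
2 115
3 208
4 338
5 521
;
RUN ;

/* OUT= writes one row per quintile with COUNT and PERCENT.       */
/* With the real data this would presumably be:                   */
/*   PROC FREQ DATA = small ; TABLES quintile / OUT = samp_pct ;  */
PROC FREQ DATA = small_counts ;
  WEIGHT n ;
  TABLES quintile / OUT = samp_pct ;
RUN ;

PROC PRINT DATA = samp_pct ;
RUN ;
```

COUNT then holds the number of nursing home residents in each quintile, which is what the sampling step needs as its per-stratum sample size.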
Create sampling data set
DATA samp_pct ;
  SET samp_pct ;
  _NSIZE_ = 1 ;                 /* multiplier applied to COUNT */
  _NSIZE_ = _NSIZE_ * COUNT ;   /* stratum sample size read by PROC SURVEYSELECT */
  DROP PERCENT ;
RUN ;
Just here to make it easy to modify
PROC SURVEYSELECT
The SAMPSIZE= input data set can provide stratum sample sizes in the _NSIZE_ variable.
STRATA groups should appear in the same order in the SAMPSIZE= data set as in the DATA= data set.
SELECT RANDOM SAMPLE
PROC SORT DATA = large ;
  BY quintile ;
RUN ;
PROC SURVEYSELECT DATA = large SAMPSIZE = samp_pct OUT = largesamp ;
  STRATA quintile ;
RUN ;
Did it work?
                  Before                             After
Variable     AT Home   NOT Home   Prob      AT Home    NOT Home    Prob
Age          75.0      79.3       .0001     79.2       79.3        .60
ER visits    4.5       2.4        .0001     4.5 ****   3.8 ****    .0001
Female       49%       54%        .01       52%        54%         .36
Race                              .0001                            .97
** P < .01 **** P < .0001
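The slides do not show how the "After" columns were produced. A minimal sketch is given below, assuming the data sets small and largesamp from the earlier steps already exist, and assuming t-tests for the continuous covariates and chi-square tests for the categorical ones. The data set name matched is made up for the illustration.

```sas
/* Assumes small and largesamp were created by the steps above.  */
/* Stack the nursing home group and the sampled at-home group.   */
DATA matched ;
  SET small largesamp ;
RUN ;

/* Continuous covariates: compare means by group                 */
/* (the choice of a t-test is an assumption).                    */
PROC TTEST DATA = matched ;
  CLASS athome ;
  VAR age_comp vissum1 ;
RUN ;

/* Categorical covariates: compare distributions by group        */
PROC FREQ DATA = matched ;
  TABLES (sex race) * athome / CHISQ NOROW NOPERCENT ;
RUN ;

/* Outcome: death by living arrangement in the matched sample    */
PROC FREQ DATA = matched ;
  TABLES dthflag * athome / CHISQ NOROW NOPERCENT ;
RUN ;
```

Running the first two procedures on study.allpropen instead of matched would give the corresponding "Before" comparison.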