basic epidemiologic analysis
TRANSCRIPT
![Page 1: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/1.jpg)
Basic epidemiologic analysis with Stata
Biostatistics 212
Lecture 5
![Page 2: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/2.jpg)
Housekeeping
• Questions about Lab 4?
– Extra credit puzzle
• Lab 3 issues
– Make sure your do file executes
– Make sure your do file opens the dataset
• Final Project – by the last session you should:
– Have dataset imported into Stata
– Clean up the variables you will use
– Sketch out (paper and pencil) a table and a figure
– Be ready to write analysis do files
![Page 3: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/3.jpg)
Today...
• What’s the difference between epidemiologic and statistical analysis?
• Interaction and confounding with 2 x 2’s
• Stata’s “Epitab” commands
• Adjusting for many things at once
• Logistic regression
• Testing for trends
![Page 4: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/4.jpg)
Epi vs. Biostats
• Statistical analysis
– Evaluating the role of chance
• Epidemiologic analysis
– Analyzing and interpreting clinical research data in the context of scientific knowledge
– Directionality of causes
– Mediation vs. confounding
– Prediction vs. causal inference
– Clinical importance of effect size
– “Cost” of a type I and type II error
![Page 5: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/5.jpg)
Epi vs. Biostats
• Epi
– Confounding, interaction, and causal diagrams.
– What to adjust for?
– What do the adjusted estimates mean?
[Causal diagrams relating A, B, and C]
![Page 6: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/6.jpg)
2 x 2 Tables
• “Contingency tables” are the traditional analytic tool of the epidemiologist
                    Outcome
                    +      -
Exposure    +       a      b
            -       c      d

OR = (a/b) / (c/d) = ad/bc
RR = [a/(a+b)] / [c/(c+d)]
![Page 7: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/7.jpg)
2 x 2 Tables
• Example

                        Coronary calcium
                        +        -      Total
Binge drinking   +     106      585       691
                 -     186     2165      2351
Total                  292     2750      3042

OR = 2.1 (1.6 – 2.7)
RR = 1.9 (1.6 – 2.4)
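The odds ratio and risk ratio quoted for this table can be checked by hand. The sketch below is in Python rather than Stata so that the arithmetic is visible. The function name `two_by_two` is made up for illustration, and the confidence intervals use the usual large-sample log-scale (Wald-type) formulas, which is an assumption about how the slide's intervals were obtained.

```python
import math


def two_by_two(a, b, c, d, z_crit=1.959964):
    """Risk ratio and odds ratio with log-scale (Wald-type) 95% intervals.

    Layout follows the slide: rows are exposure (+, -), columns are outcome (+, -),
    so a and b are the exposed with and without the outcome, c and d the unexposed.
    """
    rr = (a / (a + b)) / (c / (c + d))
    odds_ratio = (a * d) / (b * c)

    # Standard errors of log(RR) and log(OR); assumed large-sample formulas.
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

    def interval(estimate, se_log):
        centre = math.log(estimate)
        return (math.exp(centre - z_crit * se_log),
                math.exp(centre + z_crit * se_log))

    return {
        "rr": (rr, *interval(rr, se_log_rr)),
        "or": (odds_ratio, *interval(odds_ratio, se_log_or)),
    }


# Binge drinking (rows) by coronary calcium (columns), cell counts from the slide.
result = two_by_two(106, 585, 186, 2165)
for name, (estimate, low, high) in result.items():
    print(f"{name.upper()} = {estimate:.2f} ({low:.2f}-{high:.2f})")
```

To the precision shown on the slide, the results agree with OR = 2.1 (1.6 – 2.7) and RR = 1.9 (1.6 – 2.4).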
![Page 8: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/8.jpg)
Can we say that binge drinking CAUSES atherosclerosis?
![Page 9: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/9.jpg)
2 x 2 Tables
• There is a statistically significant association, but is it causal?
• Does male gender confound the association?
[Diagram: binge drinking and coronary calcium, with male gender as a possible confounder]
![Page 10: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/10.jpg)
2 x 2 Tables
• Men more likely to binge
– 34% of men, 14% of women
• Men have more coronary calcium
– 15% of men, 7% of women
![Page 11: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/11.jpg)
![Page 12: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/12.jpg)
2 x 2 Tables
• But what does confounding look like in a 2x2 table?
• And how do you adjust for it?
– Stratify
– Examine strata-specific estimates (for interaction)
– Combine estimates if appropriate (if no interaction)
• Weighted average of strata-specific estimates
![Page 13: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/13.jpg)
2 x 2 Tables
• First, stratify…

All participants
              CAC+   CAC-
Binge +        106    585
Binge -        186   2165
RR = 1.94 (1.55-2.42)

                 In men                      In women
              CAC+   CAC-                  CAC+   CAC-
Binge +         89    374      Binge +       17    211
Binge -        118    801      Binge -       68   1364
Binge drinkers: 34%            Binge drinkers: 14%
Coronary calcium: 15%          Coronary calcium: 7%
RR = 1.50 (1.16-1.93)          RR = 1.57 (0.94-2.62)
![Page 14: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/14.jpg)
2 x 2 Tables
• …compare strata-specific estimates…
• (they’re about the same)

                 In men                      In women
              CAC+   CAC-                  CAC+   CAC-
Binge +         89    374      Binge +       17    211
Binge -        118    801      Binge -       68   1364
Binge drinkers: 34%            Binge drinkers: 14%
Coronary calcium: 15%          Coronary calcium: 7%
RR = 1.50 (1.16-1.93)          RR = 1.57 (0.94-2.62)
![Page 15: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/15.jpg)
2 x 2 Tables
• …and then “combine” the estimates.

                 In men                      In women
              CAC+   CAC-                  CAC+   CAC-
Binge +         89    374      Binge +       17    211
Binge -        118    801      Binge -       68   1364
RR = 1.50 (1.16-1.93)          RR = 1.57 (0.94-2.62)

RRadj = 1.51 (1.21-1.89)
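The "combining" step is a weighted average of the stratum-specific risk ratios. The sketch below assumes the Mantel-Haenszel weighting that the `cs` output later in the deck labels "M-H combined"; it is written in Python so the arithmetic is explicit, and the function name `mantel_haenszel_rr` is illustrative only.

```python
def mantel_haenszel_rr(strata):
    """Mantel-Haenszel pooled risk ratio across 2x2 strata.

    Each stratum is (a, b, c, d): exposed with/without the outcome,
    then unexposed with/without the outcome (the slide's table layout).
    Returns the pooled RR and the per-stratum M-H weights.
    """
    numerator = 0.0
    weights = []
    for a, b, c, d in strata:
        n_exposed = a + b
        n_unexposed = c + d
        total = n_exposed + n_unexposed
        weight = c * n_exposed / total          # M-H weight for this stratum
        numerator += a * n_unexposed / total    # equals weight * stratum RR
        weights.append(weight)
    return numerator / sum(weights), weights


women = (17, 211, 68, 1364)
men = (89, 374, 118, 801)
rr_adj, weights = mantel_haenszel_rr([women, men])
print(f"RRadj = {rr_adj:.2f}, weights = {[round(w, 2) for w in weights]}")
```

Because each numerator term equals the stratum's weight times its RR, the pooled value is exactly a weighted average of 1.57 and 1.50, with men carrying most of the weight.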
![Page 16: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/16.jpg)
All participants
              CAC+   CAC-
Binge +        106    585
Binge -        186   2165
RR = 1.94 (1.55-2.42)

                 In men                      In women
              CAC+   CAC-                  CAC+   CAC-
Binge +         89    374      Binge +       17    211
Binge -        118    801      Binge -       68   1364
Binge drinkers: 34%            Binge drinkers: 14%
Coronary calcium: 15%          Coronary calcium: 7%
RR = 1.50 (1.16-1.93)          RR = 1.57 (0.94-2.62)

RRadj = 1.51 (1.21-1.89)
![Page 17: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/17.jpg)
2 x 2 Tables
• How do we do this with Stata?
– Tabulate – output not exactly what we want.
– The “epitab” commands
• Stata’s answer to stratified analyses

cs, cc
csi, cci
tabodds, mhodds
![Page 18: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/18.jpg)
2 x 2 Tables
• Example – demo using Stata
cs cac binge
cs cac binge, by(male)
cs cac modalc
cs cac modalc, by(racegender)
cc cac binge
![Page 19: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/19.jpg)
2 x 2 Tables
• Example of a crude association (unadjusted)

. cs cac binge

                 | Binge pattern [>5 drinks |
                 |       on occasion]       |
                 |   Exposed   Unexposed    |      Total
-----------------+--------------------------+------------
           Cases |       106         186    |        292
        Noncases |       585        2165    |       2750
-----------------+--------------------------+------------
           Total |       691        2351    |       3042
                 |                          |
            Risk |  .1534009    .0791153    |   .0959895
                 |                          |
                 |      Point estimate      |    [95% Conf. Interval]
                 |--------------------------+------------------------
 Risk difference |         .0742856         |    .0452852     .103286
      Risk ratio |         1.938954         |    1.551487    2.423187
 Attr. frac. ex. |          .484258         |     .355457    .5873203
 Attr. frac. pop |         .1757923         |
                 +---------------------------------------------------
                               chi2(1) =    33.96  Pr>chi2 = 0.0000
![Page 20: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/20.jpg)
2 x 2 Tables
• Example of Confounding
. cs cac binge, by(male)

            male |       RR       [95% Conf. Interval]   M-H Weight
-----------------+-------------------------------------------------
               0 |   1.570175     .9402789   2.622042     9.339759
               1 |   1.497071     1.164201   1.925117     39.53256
-----------------+-------------------------------------------------
           Crude |   1.938954     1.551487   2.423187
    M-H combined |   1.511042     1.205656    1.89378
-------------------------------------------------------------------
Test of homogeneity (M-H)    chi2(1) =     0.027  Pr>chi2 = 0.8700
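The homogeneity test asks whether the stratum-specific RRs differ from the pooled value by more than chance. As a rough check on the reasoning, the Python sketch below compares each stratum's log RR with the log of the M-H combined RR, weighting by the inverse of its large-sample variance. Whether Stata computes its statistic in exactly this way is an assumption, and the function names are illustrative.

```python
import math


def log_rr_and_variance(a, b, c, d):
    """Log risk ratio and its large-sample variance for one 2x2 stratum."""
    rr = (a / (a + b)) / (c / (c + d))
    variance = 1 / a - 1 / (a + b) + 1 / c - 1 / (c + d)
    return math.log(rr), variance


def mh_pooled_rr(strata):
    """Mantel-Haenszel pooled risk ratio (same layout as above)."""
    numerator = sum(a * (c + d) / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(c * (a + b) / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


def homogeneity_chi2(strata):
    """Sum of squared, variance-scaled deviations of log RR from the pooled log RR."""
    log_pooled = math.log(mh_pooled_rr(strata))
    return sum((log_rr - log_pooled) ** 2 / variance
               for log_rr, variance in (log_rr_and_variance(*s) for s in strata))


# Strata: women (male = 0), then men (male = 1); counts from the stratified tables.
strata = [(17, 211, 68, 1364), (89, 374, 118, 801)]
chi2 = homogeneity_chi2(strata)
# Upper tail of a chi-square with 1 degree of freedom (two strata).
p_value = math.erfc(math.sqrt(chi2 / 2))
print(f"chi2(1) = {chi2:.3f}, p = {p_value:.4f}")
```

A statistic this small says the two RRs are statistically indistinguishable, which is what justifies reporting a single combined estimate.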
![Page 21: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/21.jpg)
2 x 2 Tables
• Example of Effect Modification
. cs cac modalc, by(racegender)

      racegender |       RR       [95% Conf. Interval]   M-H Weight
-----------------+-------------------------------------------------
     Black women |     .75888     .3595892   1.601547     8.043758
     White women |   .8960739     .4971477    1.61511     11.07552
       Black men |   1.945668     1.114927     3.3954     8.304878
       White men |   .9279831       .66551   1.293974     29.45557
-----------------+-------------------------------------------------
           Crude |    1.30072     1.023022   1.653798
    M-H combined |   1.046446     .8225915   1.331218
-------------------------------------------------------------------
Test of homogeneity (M-H)    chi2(3) =     6.245  Pr>chi2 = 0.1003
![Page 22: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/22.jpg)
2 x 2 Tables
• Immediate commands – csi, cci
– No dataset required – just 2x2 cell frequencies
– Cell order for csi: exposed cases, unexposed cases, exposed noncases, unexposed noncases

csi a b c d
csi 106 186 585 2165 (for cac binge)
![Page 23: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/23.jpg)
Multivariable adjustment
• Binge drinking appears to be associated with coronary calcium
– Association partially due to confounding by gender
• What about race? Age? SES? Smoking?
![Page 24: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/24.jpg)
Multivariable adjustment: manual stratification

                                        # 2x2 tables
Crude association                              1
Adjust for gender                              2
Adjust for gender, race                        4
Adjust for gender, race, age                  68
Adjust for “” + income, education            816
Adjust for “” + “” + smoking                2448
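The table counts grow multiplicatively: each added covariate multiplies the number of strata by its number of levels. The Python sketch below reproduces the counts above. The level counts for age (17) and for income and education combined (12) are assumptions inferred from the ratios of successive rows, not values stated on the slide.

```python
from itertools import accumulate
from operator import mul

# (covariate added, number of levels). The counts for age and for
# income/education are ASSUMED, back-calculated from the slide's table.
covariates = [
    ("gender", 2),
    ("race", 2),
    ("age", 17),
    ("income, education", 12),
    ("smoking", 3),
]

# One crude table, then a running product of the level counts.
tables = [1] + list(accumulate((levels for _, levels in covariates), mul))

labels = ["crude"] + [f"+ {name}" for name, _ in covariates]
for label, count in zip(labels, tables):
    print(f"{label:<22}{count:>6}")
```

With roughly 3,000 participants spread over 2,448 strata, most tables would contain one person or none, which is why manual stratification breaks down quickly.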
![Page 25: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/25.jpg)
![Page 26: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/26.jpg)
Multivariable adjustment: cs command
• cs command
– Does manual stratification for you
• Lists results from every stratum
• Tests for overall homogeneity
• Adjusted and crude results
– Demo: cs cac binge, by(male black age)
– Can’t interpret interactions!
![Page 27: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/27.jpg)
![Page 28: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/28.jpg)
Multivariable adjustment: mhodds command
• mhodds allows you to look at specific interactions, adjusted for multiple covariates
– Does same stratification for you
– Adjusted results for each interaction variable
– P-value for specific interaction (homogeneity)
– Summary adjusted result
• Demo: mhodds cac binge age, by(racegender)
• But strata get thin!
![Page 29: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/29.jpg)
Multivariable adjustment: logistic command
• Assumes logit model
– Await biostats class for details!
– Coefficients estimated, no actual stratification
– Continuous variables used as they are
![Page 30: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/30.jpg)
Multivariable adjustment: logistic command
Basic syntax:
logistic outcomevar [predictorvar1 predictorvar2 predictorvar3…]
![Page 31: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/31.jpg)
Multivariable adjustment: logistic command
If using any categorical predictors:
logistic outcomevar [i.catvar var2…]
Creates “dummy variables” on the fly
If you forget, Stata won’t know they are categorical, and you’ll get the wrong answer!
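To see what "dummy variables" means here, the Python sketch below expands a categorical variable into one 0/1 indicator per non-baseline level. It only illustrates the idea and is not how Stata implements `i.`; the function name `dummy_code` and the example values are hypothetical.

```python
def dummy_code(values, prefix, baseline=None):
    """Expand a categorical variable into 0/1 indicators, omitting the baseline level."""
    levels = sorted(set(values))
    if baseline is None:
        baseline = levels[0]  # lowest level is the reference category by default
    return {
        f"{prefix}_{level}": [1 if value == level else 0 for value in values]
        for level in levels
        if level != baseline
    }


# Hypothetical smoking codes for five participants (three categories: 0, 1, 2).
smoke = [0, 1, 2, 1, 0]
dummies = dummy_code(smoke, prefix="smoke")
for name, column in dummies.items():
    print(name, column)
```

Entered as a single numeric predictor instead, the same variable would be forced to have one odds ratio per one-unit step, so the model would assume that moving from category 0 to 1 has the same effect as moving from 1 to 2.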
![Page 32: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/32.jpg)
Multivariable adjustment: logistic command
Demo

logistic cac binge
logistic cac binge male
logistic cac binge male black
logistic cac binge male black age
logistic cac binge male black age i.smoke
logistic cac binge##i.racegender age i.smoke
logistic cac modalc##racegender
![Page 33: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/33.jpg)
Multivariable adjustment: logistic command
Demo

. xi: logistic cac binge male black age i.smoke
i.smoke           _Ismoke_0-2         (naturally coded; _Ismoke_0 omitted)

Logistic regression                               Number of obs   =       3036
                                                  LR chi2(6)      =     211.95
                                                  Prob > chi2     =     0.0000
Log likelihood = -852.99988                       Pseudo R2       =     0.1105

------------------------------------------------------------------------------
         cac | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       binge |   1.387573   .1985355     2.29   0.022     1.048251    1.836736
        male |   3.253031   .4608839     8.33   0.000     2.464287    4.294226
       black |   .7282563   .0994953    -2.32   0.020     .5571756    .9518674
         age |    1.19833    .025771     8.41   0.000     1.148869     1.24992
   _Ismoke_1 |   1.357694   .2308651     1.80   0.072      .972886    1.894707
   _Ismoke_2 |   2.120925   .3302698     4.83   0.000     1.563063     2.87789
------------------------------------------------------------------------------
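The columns of this output are linked: the z statistic and the confidence interval are computed on the log-odds scale and then exponentiated. The Python sketch below recovers them for the `binge` row from the odds ratio and its standard error, assuming the reported standard error is the delta-method value (OR times the standard error of the log odds ratio). The function name is illustrative.

```python
import math


def odds_ratio_summary(odds_ratio, se_odds_ratio, z_crit=1.959964):
    """Recover z and the 95% CI from an odds ratio and its reported standard error."""
    log_or = math.log(odds_ratio)
    # Assumed delta-method relationship: SE(OR) = OR * SE(log OR).
    se_log_or = se_odds_ratio / odds_ratio
    z = log_or / se_log_or
    low = math.exp(log_or - z_crit * se_log_or)
    high = math.exp(log_or + z_crit * se_log_or)
    return z, low, high


# binge row of the output above
z, low, high = odds_ratio_summary(1.387573, 0.1985355)
print(f"z = {z:.2f}, 95% CI = ({low:.3f}, {high:.3f})")
```

This is also why the interval is not symmetric around the odds ratio: it is symmetric around the log odds ratio.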
![Page 34: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/34.jpg)
logistic command interaction demo

. logistic cac modalc##racegender age i.smoke

Logistic regression                               Number of obs   =       2795
                                                  LR chi2(10)     =     186.28
                                                  Prob > chi2     =     0.0000
Log likelihood = -739.54359                       Pseudo R2       =     0.1119

------------------------------------------------------------------------------
         cac | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    1.modalc |   .6024889   .2430813    -1.26   0.209     .2732258    1.328546
             |
  racegender |
          2  |   1.018361   .3137632     0.06   0.953     .5567262    1.862783
          3  |   1.601149    .519393     1.45   0.147     .8478374    3.023786
          4  |   4.119486   1.100853     5.30   0.000     2.439922    6.955209
             |
     modalc# |
  racegender |
        1 2  |   1.422897   .7314808     0.69   0.493     .5195041    3.897247
        1 3  |   2.867897   1.473405     2.05   0.040     1.047736    7.850102
        1 4  |   1.546468   .7057105     0.96   0.339     .6322751    3.782472
             |
         age |   1.184036   .0271845     7.36   0.000     1.131937    1.238534
             |
       smoke |
          1  |   1.438413   .2623889     1.99   0.046      1.00603    2.056629
          2  |   2.464978   .4157232     5.35   0.000     1.771154    3.430597
------------------------------------------------------------------------------
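Reading an interaction model takes one extra step: the `1.modalc` row is the odds ratio for modalc only in the reference racegender group, and each `modalc#racegender` row is a ratio of odds ratios. Multiplying the two gives the adjusted odds ratio for modalc within each group. The Python sketch below does this with the numbers from the output above. Treating racegender = 1 as the reference group follows from the output; which label (for example, Black men) goes with which code is not shown there and would be an assumption.

```python
# Odds ratio for modalc in the reference group (racegender = 1), from the output.
main_effect_or = 0.6024889

# Interaction terms: ratio of the modalc OR in each group to the reference-group OR.
interaction_ratio = {2: 1.422897, 3: 2.867897, 4: 1.546468}

# Stratum-specific adjusted OR = main effect OR x interaction ratio.
stratum_or = {1: main_effect_or}
stratum_or.update(
    {group: main_effect_or * ratio for group, ratio in interaction_ratio.items()}
)

for group, value in sorted(stratum_or.items()):
    print(f"racegender {group}: OR for modalc = {value:.2f}")
```

Confidence intervals for these products cannot be read off the table, because they depend on the covariance between the main-effect and interaction coefficients.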
![Page 35: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/35.jpg)
Multivariable adjustment: logistic command
• Pros
– Provides all ORs in the model
– Accepted approach (mhodds rarely used by statisticians)
– Can deal with continuous variables (like age)
– Better estimation for large models?
• Cons
– Interaction testing more cumbersome, less automatic
– More assumptions
– Harder to test for trends
![Page 36: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/36.jpg)
Multivariable adjustment
• The format for linear regression and other types of regression is the same as for logistic regression, except for the initial command:
regress outcomevar [predictorvar1 predictorvar2 predictorvar3…]
ologit outcomevar [predictorvar1 predictorvar2 predictorvar3…]
etc
![Page 37: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/37.jpg)
Testing for trend
• Test of trend with tabodds

. tabodds cac alccat

--------------------------------------------------------------------------
     alccat |      cases     controls       odds     [95% Conf. Interval]
------------+-------------------------------------------------------------
          0 |        110         1325    0.08302      0.06835    0.10084
         <1 |         90          933    0.09646      0.07770    0.11976
      1-1.9 |         46          295    0.15593      0.11429    0.21275
         2+ |         45          193    0.23316      0.16856    0.32252
--------------------------------------------------------------------------
Test of homogeneity (equal odds): chi2(3)  =    36.70
                                  Pr>chi2  =   0.0000

Score test for trend of odds:     chi2(1)  =    32.20
                                  Pr>chi2  =   0.0000
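The odds column is cases divided by controls in each category, and the trend test asks whether those odds rise steadily across ordered categories. The Python sketch below recomputes the odds and a one-degree-of-freedom score test for linear trend (a Mantel-extension, Cochran-Armitage type statistic). The equally spaced scores 0, 1, 2, 3 and the choice of this particular formula are assumptions; the output above does not show how alccat is coded or how Stata computes its statistic.

```python
cases = [110, 90, 46, 45]
controls = [1325, 933, 295, 193]
scores = [0, 1, 2, 3]  # ASSUMED equally spaced scores for the four alccat levels

# Odds of coronary calcium within each drinking category.
odds = [case / control for case, control in zip(cases, controls)]


def trend_chi2(cases, controls, scores):
    """Score test for linear trend across ordered categories (1 df)."""
    totals = [case + control for case, control in zip(cases, controls)]
    n_all = sum(totals)
    n_cases = sum(cases)
    n_controls = n_all - n_cases

    # Score-weighted sum of observed minus expected cases.
    statistic = sum(
        score * (case - total * n_cases / n_all)
        for score, case, total in zip(scores, cases, totals)
    )
    sum_x = sum(total * score for total, score in zip(totals, scores))
    sum_xx = sum(total * score * score for total, score in zip(totals, scores))
    # Hypergeometric (N - 1) variance of the statistic under no association.
    variance = (
        n_cases * n_controls * (n_all * sum_xx - sum_x ** 2)
        / (n_all ** 2 * (n_all - 1))
    )
    return statistic ** 2 / variance


print([round(value, 5) for value in odds])
print(f"trend chi2(1) = {trend_chi2(cases, controls, scores):.2f}")
```

Under these assumptions the trend statistic comes out close to the 32.20 reported above, and it uses one degree of freedom where the homogeneity test uses three, which is what makes it more sensitive to a steady dose-response pattern.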
![Page 38: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/38.jpg)
Testing for trends: tabodds command
• Adjustment for multiple variables possible
– tabodds cac alccat, adjust(age male black)
![Page 39: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/39.jpg)
Approaching your analysis
• Number of potential models/analyses is daunting
– Where do you start? How do you finish?
• My suggestion
– Explore
– Plan definitive analysis, make dummy tables/figures
– Do analysis (do/log files), fill in tables/figures
– Show to collaborators, repeat as needed
– Write paper
![Page 40: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/40.jpg)
Summary
• Make sure you understand confounding and interaction with 2x2 tables in Stata
• Epitab commands are a great way to explore your data
– Emphasis on interaction
• Logistic regression is a more general approach, ubiquitous, but testing for interactions and trends is more difficult
![Page 41: Basic epidemiologic analysis](https://reader034.vdocuments.mx/reader034/viewer/2022042508/586cfa871a28abf22f8b9e2a/html5/thumbnails/41.jpg)
In lab today…
• Lab 5
– Epi analysis of coronary calcium dataset
– Walks you through evaluation of confounding and interaction
• Judgment calls – often no right answer, just focus on reasoning.
• Reminder – put your answers as comments in the do file

* 15c – 15%, p<.001