The Epidemiologist’s Toolbox
Cari Olson, Bonnie Kerker
NYC Department of Health and Mental Hygiene
November 29, 2012
Is there a comparison?
◦ Are the groups really comparable?
Are the differences being reported real?
◦ Are they worth reporting?
◦ How much confidence do we have in them?
Can anything else explain this association?
What can (and can’t) this study tell us?
How should findings be accurately presented?
Questioning Health Data
Most data interpretation requires context – a comparison group.
◦ Same group compared over time;
◦ Different groups compared within same timeframe;
◦ Different groups compared over time.
Without a comparison, the likelihood that findings are due to factors other than the hypothesized cause cannot be assessed.
◦ Selection of study participants;
◦ Chance;
◦ Other factors or trends.
Comparison / Control Groups
Rates are a basic epidemiologic tool because they allow for appropriate comparisons.
◦ Comparing counts can be misleading.
\[
\text{Rate} = \frac{\text{number of events in a specific time period}}{\text{average population during that time period}} \times 10^{n}
\]
…per 100 (%) …per 1,000 …per 100,000
Rates
There were 1,765 heart disease deaths in Flushing, Queens in 2002 and 882 in Pelham Bay, Bronx.
Start with a Fact….
A . Flushing, Queens
B . Pelham Bay, Bronx
Where are residents at greater risk for dying from heart disease?
Where are residents at greater risk for dying from heart disease?
A. Flushing, Queens 354/100,000 pop
B. Pelham Bay, Bronx 361/100,000 pop
Because Flushing (n = 498,318) has a larger population than Pelham Bay (n = 244,452).
Same as saying 25 miles-per-hour is faster than 50 miles-per-day:
◦ 25 miles / 1 hour vs. 50 miles / 1 day (24 hours)
Why?
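The comparison above can be checked with a few lines of arithmetic. The sketch below applies the rate formula to the death counts and populations quoted on the slides; the function name `rate_per` is illustrative and does not come from the presentation.

```python
def rate_per(events, population, per=100_000):
    """Return events per `per` people: (events / population) * per."""
    return events / population * per

# Heart disease deaths (2002) and populations as quoted on the slides.
flushing = rate_per(1_765, 498_318)
pelham_bay = rate_per(882, 244_452)

print(f"Flushing, Queens:  {flushing:.0f} per 100,000")
print(f"Pelham Bay, Bronx: {pelham_bay:.0f} per 100,000")
```

Flushing has roughly twice as many deaths, but once both counts are divided by population the two neighborhoods are close, with Pelham Bay slightly higher.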
The process of inferring from your data whether an observed difference is likely due to chance.
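Commonly, significance is set at 0.05 (5%): if there were no real association, a difference this large would arise by chance less than 5% of the time (informally, "95% sure").
◦ sig = 0.01 (1%): informally, "99% sure."
◦ sig = 0.10 (10%): informally, "90% sure."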
The smaller the sample, the more difficult it is to find a significant difference.
◦ In larger samples, it is often easy to find significance – but is it meaningful?
What is Statistical Testing?
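To illustrate the point about sample size, here is a minimal sketch of one common test, a pooled two-proportion z-test. The presentation does not say which test it has in mind, and the percentages and sample sizes below are hypothetical.

```python
import math

def two_proportion_p_value(p1, n1, p2, n2):
    """Two-sided p-value for a difference between two proportions (pooled z-test)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical: the same 16% vs. 12% difference at two sample sizes per group.
p_small = two_proportion_p_value(0.16, 100, 0.12, 100)
p_large = two_proportion_p_value(0.16, 5_000, 0.12, 5_000)

print(f"n = 100 per group:   p = {p_small:.3f}")
print(f"n = 5,000 per group: p = {p_large:.2g}")
```

The difference is identical in both runs; only the sample size changes whether it crosses the 0.05 threshold.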
Statistical significance ≠ importance
Not significant ≠ no association
Statistical significance ≠ causation
Notes on Interpretation
An interval or range of values that reflects the precision of an estimate of a population parameter.
◦ Statistically, how confident are we that the number is real?
◦ E.g., Smoking prevalence (2010): 14.0% (12.9, 15.3)
The more confidence you want (90% vs. 95% vs. 99%), the wider the interval.
What is a Confidence Interval (CI)?
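The link between confidence level and interval width can be sketched with a simple normal-approximation interval for a proportion. The sample size of 3,000 is an assumption for illustration only; the survey behind the 14.0% estimate likely uses weighting and a different method, so this will not reproduce the interval quoted above.

```python
import math
from statistics import NormalDist

def proportion_ci(p, n, level=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # e.g. about 1.96 for 95%
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# Hypothetical sample: 14.0% prevalence observed among 3,000 respondents.
widths = []
for level in (0.90, 0.95, 0.99):
    low, high = proportion_ci(0.14, 3_000, level)
    widths.append(high - low)
    print(f"{level:.0%} CI: ({low:.1%}, {high:.1%})")
```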
What does it mean if 2 CIs overlap?
– Prevalence of smoking among:
• Men: 16.1% (14.3%-18.1%)
• Women: 12.2% (10.8%-13.7%)
– Prevalence of diabetes among:
• Men: 9.4% (8.3%-10.8%)
• Women: 9.1% (8.2%-10.2%)
What does it mean if a CI includes 0?
Applied Interpretation of a CI
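The overlap question can be checked mechanically using the intervals quoted above. The helper name `intervals_overlap` is illustrative. Treat overlap as a conservative rule of thumb rather than a formal test: non-overlapping intervals indicate a significant difference, while overlapping intervals do not by themselves prove there is none.

```python
def intervals_overlap(ci_a, ci_b):
    """Return True if two (low, high) intervals share any values."""
    (low_a, high_a), (low_b, high_b) = ci_a, ci_b
    return low_a <= high_b and low_b <= high_a

# Intervals (in percent) for men and women, as quoted on the slide.
smoking_overlap = intervals_overlap((14.3, 18.1), (10.8, 13.7))
diabetes_overlap = intervals_overlap((8.3, 10.8), (8.2, 10.2))

print("Smoking CIs overlap: ", smoking_overlap)
print("Diabetes CIs overlap:", diabetes_overlap)
```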
A third factor that influences the relationship between exposure and disease.
If you are interested in actual differences in prevalence across populations, confounders are not that important.
However, if you are interested in assessing risk differences, confounders can and should be controlled for in analyses.
Confounding
Example: When comparing cardiac disease between men and women, what other factor may confound the relationship between sex and illness?
◦ Age! If we don’t adjust for age, and find a higher prevalence among women, it might be due to the fact that in the general population, women are (on average) older than men.
Age-adjustment is one way to limit confounding.
◦ Ensures that any differences you see between groups are NOT due to age.
Confounding
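One way to see how age-adjustment works is direct standardization, sketched below. The presentation does not specify a method, and every number here is hypothetical: the two groups are given identical prevalence within each age band, and only their age mix differs.

```python
# Hypothetical counts by age group: (population, cases).
men = {"<45": (5_000, 50), "45-64": (3_000, 150), "65+": (2_000, 400)}
women = {"<45": (4_000, 40), "45-64": (3_000, 150), "65+": (3_000, 600)}

# Hypothetical standard population: share of people in each age group.
standard_weights = {"<45": 0.45, "45-64": 0.30, "65+": 0.25}

def crude_rate(groups):
    """Total cases divided by total population, ignoring age."""
    total_pop = sum(pop for pop, _ in groups.values())
    total_cases = sum(cases for _, cases in groups.values())
    return total_cases / total_pop

def age_adjusted_rate(groups, weights):
    """Weighted average of age-specific rates using the standard weights."""
    return sum(weights[age] * (cases / pop) for age, (pop, cases) in groups.items())

for name, groups in (("Men", men), ("Women", women)):
    print(f"{name}: crude {crude_rate(groups):.1%}, "
          f"age-adjusted {age_adjusted_rate(groups, standard_weights):.2%}")
```

The crude figures differ only because the hypothetical women skew older; once both groups are weighted to the same age distribution, the difference disappears.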
Cross-sectional
◦ Select a sample from the population and measure predictor and outcome variables at the same time.
◦ Yields prevalence;
◦ Cannot talk about incidence or risk of developing a disease;
◦ Cannot establish sequence of events;
◦ Cannot infer causation;
◦ Can be generalizable.
Case-control
◦ Select two samples from the population - one with disease and one without, then look back and measure predictor variable.
◦ Yields odds ratio (measure of association);
◦ Cannot talk about incidence or risk of developing a disease;
◦ Can be generalizable.
Types of Studies
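As a small illustration of the odds ratio that a case-control study yields, the sketch below uses a hypothetical 2x2 table; the counts and the function name are assumptions made for the example.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_cases = exposed_cases / unexposed_cases
    odds_controls = exposed_controls / unexposed_controls
    return odds_cases / odds_controls

# Hypothetical study: 100 people with the disease, 100 without.
or_estimate = odds_ratio(exposed_cases=40, unexposed_cases=60,
                         exposed_controls=20, unexposed_controls=80)
print(f"Odds ratio: {or_estimate:.2f}")
```

Because the researcher chooses how many cases and controls to sample, the table says nothing about how common the disease is, which is why this design cannot yield incidence.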
Prospective cohort
◦ Select a sample from the population, measure predictor variable (presence or absence), then follow up and measure the outcome variable.
◦ Yields incidence, relative risk;
◦ Can be generalizable.
Randomized Controlled Trial (RCT)
◦ Randomly assign people to treatment or control (exposure), then follow up and measure outcome.
◦ Can be generalizable;
◦ STRONGEST STUDY DESIGN FOR CAUSATION.
Types of Studies
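For a prospective cohort, incidence and relative risk follow directly from the follow-up counts. The sketch below uses hypothetical numbers and an illustrative function name.

```python
def incidence(new_cases, people_followed):
    """Proportion of the followed group that developed the outcome."""
    return new_cases / people_followed

# Hypothetical cohort followed over the same period.
incidence_exposed = incidence(30, 1_000)    # exposed group
incidence_unexposed = incidence(20, 2_000)  # unexposed group
relative_risk = incidence_exposed / incidence_unexposed

print(f"Incidence (exposed):   {incidence_exposed:.1%}")
print(f"Incidence (unexposed): {incidence_unexposed:.1%}")
print(f"Relative risk:         {relative_risk:.1f}")
```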
Ecologic Study
◦ Unit of analysis is a population, rather than an individual. For example, looking at rates of disease across countries.
◦ Can’t infer anything about individuals;
◦ Cannot infer causality.
Qualitative Study
◦ Aims to gather an in-depth understanding;
◦ Includes focus groups, in-depth interviews;
◦ Subjects are not systematically chosen to represent a target population.
◦ Data cannot be generalized.
Types of Studies
Time sequence of events
Biological plausibility
Consistency and replications
Rule out confounding
Causality
Size of study
◦ The bigger the study, the more power you have to detect findings and the more generalizable it will be.
New knowledge vs. replicated finding
◦ First study ever finding this result?
◦ Scientific method requires ability to replicate findings.
How meaningful is this study?
Provide clear context of literature base and importance of findings.
◦ How big is the population that these findings apply to and what population exactly is referenced?
Always source the data clearly, providing link to/information on original research for audience.
Question researchers on limitations to their data.
◦ Researcher “headlines” (titles/abstracts) can be misleading!
Presentation of Data Findings
Best answered by qualitative data (focus groups, interviews).
Speculation vs. Evidence.
Reporting “could be” rather than “is.”
The WHY Question
Anecdotes can make data come alive, but…
◦ “Anecdotal evidence” is an oxymoron.
Anecdotes should not be the only “counterfactual” argument against data.
◦ “Fairness” in reporting must insist on data (with stated limitations) from both sides.
Illustrating Data with “Human Interest” Stories
Anecdotes must be presented in the context of the data.
◦ Source says “Everyone does X” vs. data showing that 35% of people do X.
Illustrating Data with “Human Interest” Stories
EpiQuery
◦ Web-based, interactive data tool
◦ Multiple data sources
My Community’s Health: Data and Statistics
◦ www.nyc.gov/health
Remember Health Department Data Sources for NYC
THANK YOU!
Contact: [email protected]@health.nyc.gov