Cohort, case-control & cross- sectional studies
Source: Alain Moren, EPIET Introductory courses
Kostas DanisEPIET Introductory course
Menorca, Spain16/9-12/10/2012
Epidemiological studies
Two types Observation Experiment
Experiment
Exposed
Not exposed
Disease occurrence
Unethical to perform experiments on people
if exposure is harmful
Exposure assigned
RandomisedControlledTrial
BlindedDosesTime periodRisk - effectNo bias
If exposure not harmful
TreatmentPreventive measure (vaccination)
If RCT not possible Left with observation of experiments designed by Nature
Cohort studiesCross-sectional studies
Case control studies
Cohort studies marching towards outcomes
What is a cohort?
One of 10 divisions of a Roman legion Group of individuals
- sharing same experience - followed up for specified period of time
Examples- EPIET cohort 2012- birth cohort- cohort of guests at barbecue- occupational cohort of chemical plant workers- influenza vaccinated in 2011-12
follow-up period
Calculate measure of frequency
Cumulative incidence- incidence proportion- attack rate (outbreak)
Incidence rate
end of follow-up
Cohort studies
Purpose- Study if an exposure is associated with outcome(s)- Estimate risk of outcome in
exposed and unexposed cohorts- Compare risk of outcome in two cohorts
Cohort membership- Being at risk of outcome(s) studied- Being alive and - Being free of outcome at start of follow-up
unexposed
exposed
Cohort studies
unexposed
exposed
Incidence amongexposed
Incidence amongunexposed
Cohort studies
ate ham
did not eat ham
ill not ill Total
a b a+b
c d c+d
Presentation of cohort data: 2x2 table
Risk in exposed= a/a+b
Risk in unexposed= c/c+d
Incidence rate
Number of NEW cases of disease
Total person - time of observation
Incidence rate
Number of NEW cases of disease
Total person - time of observation
Rate
Denominator:
- is a measure of time
- the sum of each individual’s time at risk and free from disease
A
B
C
D
E
90 91 92 93 94 95 96 97 98 99 Time at risk
x
x
6.0
6.0
10.0
8.5
5.0
Total years at risk 35.5
-- time followed
x disease onset
Person-time
A
B
C
D
E
90 91 92 93 94 95 96 97 98 99 00 Time at risk
x
x
6.0
6.0
10.0
8.5
5.0
Total years at risk 35.5
-- time followed
x disease onset
Incidence rate (IR)(Incidence density)
IR = 2 / 35.5 person years
= 0.056 cases / person year
= 5.6 cases / 100 person years
= 56 cases / 1000 person years
Person-years Cases
Smoke 102,600 133
Do not smoke 42,800 3
Presentation of cohort data: Person-years at risk
Tobacco smoking and lung cancer, England & Wales, 1951
Source: Doll & Hill
Presentation of data: Various exposure levels
time
Exposure Study startsDisease
occurrence
Prospective cohort study
time
ExposureStudy startsDisease
occurrence
Retrospective cohort study
Exposure
time
Diseaseoccurrence Study starts
Recipe: Cohort study
Identify group of - exposed subjects- unexposed subjects
Measure incidence of disease Compare incidence between exposed and
unexposed group
unexposed
exposed
Incidence amongexposed
Incidence amongunexposed
Cohort studies
Absolute measures
- Risk difference (RD) Ie - Iue
Relative measures- Relative risk (RR)
Rate ratio Risk ratio
Effect measures in cohort studies
Ie
Iue
Ie = incidence in exposedIue= incidence in unexposed
Exposed
Not exposed
CasesNoncases Risk %
Cohort study
50 50 50 %
10 90 10 %
Risk ratio 50% / 10% = 5
Total
100
100
ate ham
did not eat ham
ill not ill Incidence
49 49 98 50 %
4 6 10 40 %
Risk difference 50% - 40% = 10%
Relative risk 50% / 40% = 1.25
Interpretation of Risk Ratios
RR>1
RR=1
RR<1 Protective factor
Risk factor
No association
Exposure Population (f/u 2 years)
Cases Incidence
(%) Relative
Risk
HIV +
215
8
3.7
11
HIV - 298 1 0.3
Does HIV infection increase risk of developing TB among drug users?
Vaccine efficacy (VE)
Status Pop. Cases Cases per
1,000 RR
Vaccinated 301,545 150 0.49 0.28
Unvaccinated 298,655 515 1.72 Ref.
Total 600,200 665 1.11
VE = 1 - RR = 1 - 0.28
= 72%
Population Cases Incidence
a1High N1 I1
cUnexposed NneIue
at riskExposurelevel
a2Medium N2 I2
a3Low N3 I3
Various exposure levels
Population Cases Incidence RR
a1High N1 I1
cUnexposed NneIue
at riskExposurelevel
a2Medium N2 I2
a3Low N3 I3
RR1
RR2
RR3
Reference
Various exposure levels
Cohort study: Tobacco smoking and lung cancer,
England & Wales, 1951
Source: Doll & Hill
Disadvantages of cohort studies
Large sample size
Latency period
Cost
Time-consuming
Loss to follow-up
Exposure can change
Multiple exposure = difficult
Ethical considerations
Strengths of cohort studies
Can directly measure - incidence in exposed and unexposed groups- true relative risk
Well suited for rare exposure Temporal relationship exposure-disease is
clear Less subject to selection biases
- outcome not known (prospective)
Can examine multiple effects for a single exposure
Population Outcome 1 Outcome 2 Outcome 3
exposed Ne Ie1 Ie2 Ie3
unexposed Nne Iue1 Iue2 Iue3
RR1 RR2 RR3
Cohort studies
A cohort study allows to calculate indicators which have a clear, precise meaning.
The results are immediately understandable.
Cross-sectional (prevalence) studies
Cross-sectional studies
Observation of a cross-section of a population at a single point in time
Recruitment of study participants Population Population sample
Observation for the presence of: One or more outcomes One or more exposures
Sampling
Sample
Target Population
Sampling
Population
Uses of cross-sectional surveys in public health
Estimate prevalence of disease or their risk factors
Estimate burden Measure health status in a defined population Plan health care services delivery Set priorities for disease control Generate hypotheses Examine evolving trends
- Before / after surveys- Iterative cross-sectional surveys
Potential objectives of a cross sectional study
Descriptive Estimate prevalence
Analytic Compare the prevalence of a disease in
various subgroups, exposed and unexposed Compare the prevalence of an exposure in
various subgroups, affected and unaffected
Ill Non ill Total
Exposed a b a+b
Non exposed c d c+d
Presentation of the data of an analytical cross sectional study in a 2
x 2 table
Simultaneous measurement of outcomes and exposures
Exposed
Not exposed
CasesNoncases Prevalence %
Cross-sectional study
500 500 50 %
100 900 10 %
Prevalence ratio (PR) 50% / 10% = 5
Total
1,000
1,000
Measuring association in analytical cross-sectional
surveys Prevalence among exposed / prevalence
among unexposed Prevalence ratio Formula equivalent to risk ratio Concept different
- No incidence- Only prevalence
• depends on both occurrence of new cases & duration of disease
Prevalence of West Nile virus (WNV) infection by place of residence, Central Macedonia,Greece, 2010
Infected Total Prevalence Prevalence
ratio
Rural 38 491 7.7% 5.9
Urban 3 232 1.3% Ref
Prevalence of HIV infection by socioeconomic status, African country X, 1999
Infected Total Prevalence Prevalence
ratio
High class
15 235 6.4% 2.6
Low class
11 450 2.4% Ref
Prevalence of hepatitis C (HCV) infection by quantity of therapeutic injections, Hazabad, Pakistan, 1993
No.of injections
Infected Total Prevalence Prevalence
ratio
>10 9 41 22% 22
0-10 4 52 8% 8
0 1 82 1% Ref
Advantages of cross-sectional surveys
Fairly quick Easy to perform Less expensive Adapted to chronic diseases
Limitations of cross-sectional surveys
Limited capacity to document causality (exposure and outcome measured at the same time -difficult to establish time sequence of events) Not useful to study disease etiology
Not suitable for the study of rare / short diseases Not adapted to severe / acute diseases
Not adapted to incidence measurement.
Limitations of causal inference in analytical cross sectional
studies
• Prevalent cases• Exposure and outcome examined at the
same time
Principle of case control studies
Our objective is to compare
the incidence rate in the exposed population
to the rate that would have been observed
in the same population, at the same time
if it had not been exposed
Exposed
Unexposed
Source population
CasesExposed
Unexposed
Source population
CasesExposed
Unexposed
Source population
Sample
Controls
CasesExposed
Unexposed
Source population
Controls:Sample of the denominator Representative with regard to exposure
Controls
Sample
Intuitively
if the frequency of exposure is higher among cases than controls
then the incidence rate will probably be higher among exposed than non-exposed
Case control study
DiseaseControls
Exposure
??
Retrospective nature
Cases Controls
Exposed a b
Not exposed c d
Total a + c b + d
% exposed
Distribution of cases and controls according to exposure in a case control study
Cases Controls
Exposed a b
Not exposed c d
Total a + c b + d
% exposed a/(a+c) b/(b+d)
Distribution of cases and controls according to exposure in a case control study
Oral Myocardialcontraceptives Infarction Controls
Yes 693 320
No 307 680
Total 1000 1000
% exposed 69.3% 32%
Distribution of myocardial infarction by oral contraceptive use
in cases and controls
Physical Myocardialactivity Infarction Controls
>= 2500 Kcal 190 230
< 2500 Kcal 176 136
Total 366 366
% exposed 51.9% 62.8%
Distribution of myocardial infarction by amount of physical activity
in cases and controls
Water Cases ControlsConsumption
YES 150 ?
NO 50 ?
Total 200 200
Volvo factory, Sweden, 3000 employees, Cohort study200 cases of gastroenteritis
Two types of case control studies
ExploratoryNew diseaseNew risk factorsSeveral exposures"Fishing expedition"
AnalyticalDefine a single hypothesisDose response
Cohort studies
Rate/risk Rate/risk difference Rate Ratio/Risk ratio (strength of association)
No calculation of rates/risks Proportion of exposure
Case control studies
Any way of estimating measures of association?
Odds
Probability that an event will happen
Probability that an event will not happen
Probability that cases/controls will be exposed
Probability that cases/controls will not be exposed
Exposed
Not exposed
Cases Controls Odds ratio
Case control study
a b
c d
Total a+c
OR= (a/c) / (b/d)
= ad / bc
a/c b/dOdds ofexposure
b+d
% exposed a/(a+c)
% unexposed c/(a+c)
b/(b+d)
d/(b+d)
Exposed
Not exposed
Cases Controls Odds ratio
Case control study
50 20 4
a b 50 80
c d
Total 100 100OR= (a/c) / (b/d)
= ad / bc
= (50x80) / (20x50)
= 4
50/50 20/80Odds ofexposure
Case control study design
Cases Controls
E
E
a b
c d
a b a x d---- --- = --- ---- c d b x c
Odds ratio
Frequency of chicken consumption in campylobacter cases and controls, Republic of Ireland and Northern
Ireland, 2003
Cases Controls Oddsratio
Ate chicken 181 251 2.1
Did not eat chicken
15 44 Ref
Frequency of contact with a dog in campylobacter cases and controls, Republic of Ireland and Northern
Ireland, 2003
Cases Controls Oddsratio
Contact with dog
Yes 29 93 0.40
No 158 201 Ref
Oral Myocardialcontraceptives Infarction Controls OR
Yes 693 320 4.8
No 307 680 Ref.
Total 1000 1000
Odds 693/307= 320/680=of exposure 2.2 0.5
Distribution of myocardial infarction by recent oral contraceptive use
in cases and controls
Physical Myocardialactivity Infarction Controls OR
>= 2500 Kcal 190 230 0.64
< 2500 Kcal 176 136 Ref.
Total 366 366
odds of 190/176= 230/136=exposure 1.1 1.7
Distribution of myocardial infarction by amount of physical activity
in cases and controls
Distribution of cases of endometrial cancer by oestrogen use in cases and controls
Oestrogen use Cases Controls Odds ratio
High a1 b1 a1d/b1c
Low a2 b2 a2d/b2c
None c d Reference
Relation of hepatocellular adenoma to duration of oral contraceptive use
in 79 cases and 220 controls
Months ofOC use Cases Controls Odds ratio
0-12 7 121 Ref.13-36 11 49 3.937-60 20 23 15.061-84 21 20 18.1>= 85 20 7 49.7Total 79 220
Source: Rooks et al. 1979
Advantages of case control studies
Rare diseases
Several exposures
Long latency
Rapidity
Low cost
Small sample size
Available data
No ethical problem
Limitations of case-control studies
Cannot compute directly risk Not suitable for rare exposure Temporal relationship exposure-disease
difficult to establish Biases +++
- control selection- recall biases when collecting data
Loss of precision due to sampling
The cohort study
is the gold-standard
of analytical
epidemiology
CASE-CONTROL STUDIES HAVE THEIR PLACE IN EPIDEMIOLOGY, but if cohort study possible,
do not settle for second best
Thank you!
Back-up slides
I1 = a / P1
I0 = c /P0
E
E
a
c
P1
P0
Populationdenominator
Cases
E
E
a
c
P1 /10
P0 /10
PopulationsampleCases
a/P1
I1/ I0 = ------
c/P0
}
aI1 = --------
P1/10
cI0 = --------
P0/10
} a/P1
I1/ I0 = ------
c/P0
I1 = a / P1
I0 = c /P0
Cases Controls
E
E
a b
c d
E
E
a
c
P1
P0
Source population
Pop.Cases
P1 b--- = ----P0 d
= sample
a/P1
I1/ I0 = ------
c/P0
}
I1 = a / P1
I0 = c /P0
Cases
= sample
E
E
a b
c d
Since d/b = P0 / P1
E
E
a
c
P1
P0
Source population
Pop.Cases
a/P1 a . P0 a . d I1 / I0 = ------ = ------- = ----- = c/P0 c . P1 c . b
}Controls
P1 b
--- = ----
P0 d
a / c------b / d