biostatistics course part 13 effect measures in 2 x 2 tables dr. sc. nicolas padilla raygoza...
Post on 31-Mar-2015
Biostatistics course
Part 13
Effect measures in 2 x 2 tables
Dr. Sc. Nicolas Padilla Raygoza
Department of Nursing and Obstetrics
Division of Health Sciences and Engineering
University of Guanajuato
Campus Celaya-Salvatierra
Biosketch
Medical Doctor, Autonomous University of Guadalajara.
Pediatrician, certified by the Mexican Council of Certification on Pediatrics.
Postgraduate Diploma in Epidemiology, London School of Hygiene and Tropical Medicine, University of London.
Master of Science with emphasis in Epidemiology, Atlantic International University.
Doctor of Science with emphasis in Epidemiology, Atlantic International University.
Associate Professor B, Department of Nursing and Obstetrics, Division of Health Sciences and Engineering, University of Guanajuato, Campus Celaya-Salvatierra, Mexico.
padillawarm@gmail.com
Competencies
The reader will obtain the Risk Ratio or Odds Ratio from a 2 x 2 table.
He (she) will calculate the 95% confidence interval for the RR or OR.
He (she) will identify potential confounders and/or interactions.
He (she) will apply the Mantel-Haenszel method for the RR, the OR and the Chi-squared test.
Introduction
In part 12 of the course, we tested the association between two categorical variables.
Now, we review the methods used to measure the association.
We will work with binary variables, so we will use 2 x 2 tables.
Example
A nurse in a poor area of Mexico was informed that many local children attending the nursery were sick with respiratory infections.
She designed a cohort study to investigate the problem.
During the following years, 1000 children were followed.
The main research question was: Is attending nursery associated with respiratory infection?
Example
Attending nursery   Respiratory infection: Yes   Respiratory infection: No   Total
                    n (%)                        n (%)
Yes                 37 (33.9)                    72 (66.1)                   109
No                  43 (4.8)                     848 (95.2)                  891
Total               80 (8.0)                     920 (92.0)                  1000
Risk Ratio (RR)
In health research, the term "risk" is used instead of proportion. For example: the risk of infection among children attending day care was 33.9%.
Thus, the risk ratio is the ratio of two proportions.
The risk of respiratory infection for children attending the nursery is: 37 / (37 + 72) = 37/109 = 0.339
The risk of respiratory infection for children not attending day care is: 43 / (43 + 848) = 43/891 = 0.048
The risk ratio (RR) is the ratio of these two risks.
Risk ratio = 0.339 / 0.048 = 7.06 (using the rounded risks; the unrounded risks give 7.03)
Risk Ratio (RR)
In general, the risk ratio can be obtained with the following formula, where a, b, c and d are the frequencies in the 2 x 2 table.
Exposure   Outcome: Yes   Outcome: No   Total
Yes        a              b             a + b
No         c              d             c + d
Total      a + c          b + d         N
Risk Ratio = [a / (a + b)] / [c / (c + d)]
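As a minimal sketch, the general formula can be written as a small Python function. The name `risk_ratio` and the argument order (a, b, c, d as in the table above) are illustrative choices, not part of the course. With unrounded risks the nursery data give about 7.03; the 7.06 quoted earlier comes from dividing the rounded risks 0.339 and 0.048.

```python
def risk_ratio(a, b, c, d):
    """Risk ratio from a 2 x 2 table.

    a, b: exposed with / without the outcome
    c, d: unexposed with / without the outcome
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed


# Nursery example: 37 and 72 among attenders, 43 and 848 among non-attenders.
rr = risk_ratio(37, 72, 43, 848)
print(f"RR = {rr:.2f}")
```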
Odds Ratio (OR)
The Odds Ratio (OR) is the ratio of the odds of the outcome among the exposed to the odds of the outcome among the non-exposed. The odds are the number with the outcome divided by the number without it.
The odds of infection among children attending the nursery are: 37 / 72 = 0.514
The odds of infection among children not attending day care are: 43 / 848 = 0.051
The Odds Ratio is the ratio of these two odds: OR = 0.514 / 0.051 = 10.08
In general, the Odds Ratio is found with the following formula: OR = (a / b) / (c / d) = ad / bc
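The same calculation can be sketched in Python. The function name `odds_ratio` is an illustrative assumption; the cell labels follow the 2 x 2 table used for the risk ratio. With unrounded odds the result is about 10.13, slightly above the 10.08 obtained from the rounded odds.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2 x 2 table: (a / b) / (c / d), which equals ad / bc."""
    odds_exposed = a / b
    odds_unexposed = c / d
    return odds_exposed / odds_unexposed


# Nursery example, computed both ways to show the two forms agree.
or_nursery = odds_ratio(37, 72, 43, 848)
cross_product = (37 * 848) / (72 * 43)
print(f"OR = {or_nursery:.2f}, cross-product form = {cross_product:.2f}")
```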
Confidence intervals
In the analysis of the data on children who do or do not attend day care, we can use either the RR or the OR to measure the effect of attendance at the nursery.
Each value is only an estimate, so it should be reported with a confidence interval.
An approximate 95% confidence interval for the RR is found using the following formula, where EF is the error factor:
Minimum value: RR / EF
Maximum value: RR x EF
EF = exp(1.96 x √(1/a - 1/(a+b) + 1/c - 1/(c+d)))
Confidence intervals
The CI for the data on children who do or do not attend day care is:
EF = exp(1.96 x √(1/37 - 1/109 + 1/43 - 1/891)) = 1.48
RR = 7.06
Minimum value: 7.06 / 1.48 = 4.77
Maximum value: 7.06 x 1.48 = 10.45
95% CI = 4.77 to 10.45
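A short Python sketch of the error-factor method for the RR follows. The function name and the returned tuple are assumptions made for illustration. Because it uses the unrounded RR (about 7.03), the limits come out at roughly 4.75 to 10.41 instead of the 4.77 to 10.45 obtained from the rounded RR of 7.06.

```python
import math


def rr_confidence_interval(a, b, c, d, z=1.96):
    """Approximate CI for the risk ratio using the error factor (EF)."""
    rr = (a / (a + b)) / (c / (c + d))
    # Standard error of log(RR), as inside the square root of the EF formula.
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    ef = math.exp(z * se_log_rr)
    return rr, ef, rr / ef, rr * ef


rr, ef, lower, upper = rr_confidence_interval(37, 72, 43, 848)
print(f"EF = {ef:.2f}, RR = {rr:.2f}, 95% CI = {lower:.2f} to {upper:.2f}")
```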
Confidence intervals
An approximate 95% confidence interval for the OR is found using the following formula:
Minimum value: OR / EF
Maximum value: OR x EF
EF = exp(1.96 x √(1/a + 1/b + 1/c + 1/d))
Confidence intervals
The CI for the data on children who do or do not attend day care is:
EF = exp(1.96 x √(1/37 + 1/72 + 1/43 + 1/848)) = 1.65
OR = 10.08
Minimum value: 10.08 / 1.65 = 6.11
Maximum value: 10.08 x 1.65 = 16.63
95% CI = 6.11 to 16.63
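The OR interval can be sketched the same way; only the term under the square root changes. The function name is an illustrative assumption. With the unrounded OR (about 10.13) the limits are roughly 6.14 to 16.73, close to the 6.11 to 16.63 obtained from the rounded OR.

```python
import math


def or_confidence_interval(a, b, c, d, z=1.96):
    """Approximate CI for the odds ratio using the error factor (EF)."""
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR): all four reciprocals are added.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ef = math.exp(z * se_log_or)
    return odds_ratio, ef, odds_ratio / ef, odds_ratio * ef


odds_ratio, ef, lower, upper = or_confidence_interval(37, 72, 43, 848)
print(f"EF = {ef:.2f}, OR = {odds_ratio:.2f}, 95% CI = {lower:.2f} to {upper:.2f}")
```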
Which measure is best?
Risk Ratios are calculated for cross-sectional and cohort studies.
The formula for the 95% confidence interval for the RR requires larger sample sizes than the one for the OR.
ORs are calculated for case-control and cross-sectional studies.
In case-control studies it is not possible to calculate risks, and therefore the RR cannot be calculated.
There is an advantage in using the OR: it is a consistent measure of effect, unlike the RR.
Example (Cont…)
The data on Mexican children showed a strong association between exposure (attending nursery) and outcome (respiratory infection).
However, such an association may be confounded by other factor(s).
For example, although children who attend day care seem to have a 7 times higher risk of respiratory infection, the cause of the infection may be something else that is associated with going to day care.
In other words, attending the nursery may be a marker of another exposure that causes respiratory infection.
If this is true, we say that the association between respiratory infection and attendance at the nursery is confounded.
How to identify a potential confounder?
To evaluate a potential confounder, we should consider three aspects:
The exposure
The outcome
The confounder
Example
The nurse is interested in the association between day care attendance and presence of respiratory infection, but is aware that children might be exposed to other factors that cause respiratory infection.
For example, overcrowding at home is a risk factor for respiratory infection.
It is therefore a potential confounder of the association between attendance at day care and respiratory infections.
Confounders
For a variable to be a potential confounder, it must meet three conditions. It must:
be an independent risk factor for the outcome of interest
be associated with the exposure of interest
not be on the causal pathway between exposure and outcome.
Confounders
How do we check these conditions in the study of Mexican children?
Condition 1 for confounding: risk factor for the outcome of interest.
Is there an association between overcrowding and respiratory infection (RI)?
Overcrowding at home   RI: Yes   RI: No   Risk of RI
Yes                    54        55       54/109 = 0.5
No                     21        870      21/891 = 0.02
RR = 25
95% CI = 15.72 to 39.75
X2 = 311.67
P << 0.05
Confounders
How do we check these conditions in the study of Mexican children?
Condition 2 for confounding: association with the exposure.
Is there an association between overcrowding and attendance at the nursery?
Overcrowding at home   Attendance at nursery: Yes   Attendance at nursery: No
Yes                    43                           66
No                     35                           856
X2 = 170.39
P << 0.05
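The X2 values reported for conditions 1 and 2 are consistent with a Pearson Chi-squared test without continuity correction; this is inferred from the numbers and is not stated in the slides. The sketch below, with illustrative function names, reproduces them to within rounding and obtains the p-value for 1 degree of freedom from the complementary error function in the standard library.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared for a 2 x 2 table, without continuity correction."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_1df(chi2):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Condition 1: overcrowding vs respiratory infection.
chi2_outcome = chi_squared_2x2(54, 55, 21, 870)
# Condition 2: overcrowding vs attendance at the nursery.
chi2_exposure = chi_squared_2x2(43, 66, 35, 856)

for label, value in [("outcome", chi2_outcome), ("exposure", chi2_exposure)]:
    print(f"Overcrowding vs {label}: X2 = {value:.2f}, p = {p_value_1df(value):.3g}")
```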
Confounders
How do we check these conditions in the study of Mexican children?
Condition 3 for confounding: Is the potential confounder on the causal pathway?
In this example, it is unlikely that overcrowding at home is caused by attending the nursery.
Do we have a confounder?
In this study, overcrowding has satisfied the three conditions necessary for a confounding variable:
It is an independent risk factor for the outcome of interest. Overcrowding is associated with respiratory infection.
It is associated with the exposure of interest. Overcrowding is associated with attendance at the nursery.
It is not on the causal pathway. Overcrowding is unlikely to be a consequence of attendance at the nursery.
Stratified tables
Now we know that the data must be analyzed further to take the effect of overcrowding into account.
To adjust for the confounding variable, we stratify the 2 x 2 table of interest.
The unstratified table is called the raw table. It can be divided into strata defined by the confounding variable.
The sample is divided into two groups; within each group, the overcrowding status is the same. The two groups are:
Overcrowding and without overcrowding
Stratified tables
We want to find out whether attendance at the nursery is associated with respiratory infection when comparing children within the same category of overcrowding.
The raw table for the relationship between respiratory infection and attendance at the nursery:
Attendance at nursery   Respiratory infection: Yes   Respiratory infection: No   Total
                        n (%)                        n (%)
Yes                     37 (33.9)                    72 (66.1)                   109
No                      43 (4.8)                     848 (95.2)                  891
Total                   80 (8.0)                     920 (92.0)                  1000
Stratified tables
The tables stratified by overcrowding status are shown below:
Overcrowding
Nursery   Respiratory infection: Yes   Respiratory infection: No   Total
Yes       61                           14                          75
No        5                            21                          26
Total     66                           35                          101
RR = 4.23   X2 = 32.88   p < 0.0001   95% CI 1.91 to 9.37
Without overcrowding
Nursery   Respiratory infection: Yes   Respiratory infection: No   Total
Yes       10                           24                          34
No        4                            861                         865
Total     14                           885                         899
RR = 63.6   X2 = 178.84   p < 0.0001   95% CI 21.01 to 192.56
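The stratum-specific RRs and confidence intervals above can be checked with a short loop over the two strata. This is only a sketch: the dictionary layout and the helper name `rr_with_ci` are assumptions, and the interval uses the error-factor approximation given earlier in the course.

```python
import math

# Cells (a, b, c, d) of each stratum: nursery yes/no by infection yes/no.
strata = {
    "Overcrowding": (61, 14, 5, 21),
    "Without overcrowding": (10, 24, 4, 861),
}


def rr_with_ci(a, b, c, d, z=1.96):
    """Risk ratio with an approximate CI from the error factor."""
    rr = (a / (a + b)) / (c / (c + d))
    ef = math.exp(z * math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d)))
    return rr, rr / ef, rr * ef


results = {name: rr_with_ci(*cells) for name, cells in strata.items()}
for name, (rr, lower, upper) in results.items():
    print(f"{name}: RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```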
Stratified tables
Do you think that attendance at nursery is a risk factor for respiratory infection among children without overcrowding?
Yes, children attending day care have about 63 times the risk of respiratory infection of those who do not attend nursery.
The p-value indicates a strong association between attendance at day care and respiratory infection in the group without overcrowding.
Stratified tables
Do you think that attendance at nursery is a risk factor for respiratory infection in the group with overcrowding?
Yes, children attending day care have more than 4 times the risk of respiratory infection of those not attending the nursery.
The p value indicates a strong association between attendance at daycare and respiratory infection in this group.
Within each stratum, the association between attendance at day care and respiratory infections is now independent of overcrowding at home.
Comparison of results
How do these results compare with those of the raw table?
The raw table shows a strong relationship between attendance at day care and respiratory infection. The RR differs between the two stratified tables, but a statistically significant association remains in both.
                       RR      95% CI            X2       P-value
Raw                    7.06    4.77 to 10.45     111.88   <0.05
Overcrowding           4.23    1.91 to 9.37      32.88    <0.05
Without overcrowding   63.6    21.01 to 192.56   178.84   <0.05
Adjusted Risk Ratios
The nurse does not want to show the data divided into strata; she prefers a global estimate of the effect of nursery attendance on respiratory infection, adjusted for overcrowding.
This can be done by calculating the RR with the Mantel-Haenszel method.
First, look at the 2 x 2 table in each stratum (the subscript e denotes the stratum):
Exposure   Disease: Yes   Disease: No   Total
Yes        a_e            b_e
No         c_e            d_e
Total                                   n_e
Risk Ratios from Mantel-Haenszel
The adjusted (summary) RR can be obtained with the following formula, where the sums run over the strata e:
RR Mantel-Haenszel = Σ [a_e (c_e + d_e) / n_e] / Σ [c_e (a_e + b_e) / n_e]
This gives a weighted average of the RRs estimated in each table, with more weight given to tables with larger sample sizes.
Adjusted Risk Ratio
We calculate the overcrowding-adjusted RR with the Mantel-Haenszel formula:
Overcrowding
Nursery   Respiratory infection: Yes   Respiratory infection: No   Total
Yes       61                           14                          75
No        5                            21                          26
Total     66                           35                          101
Non-overcrowding
Nursery   Respiratory infection: Yes   Respiratory infection: No   Total
Yes       10                           24                          34
No        4                            861                         865
Total     14                           885                         899
RR Mantel-Haenszel = [61 (5 + 21)/101 + 10 (4 + 861)/899] / [5 (61 + 14)/101 + 4 (10 + 24)/899] = (15.70 + 9.62) / (3.71 + 0.15) = 25.32 / 3.86 = 6.56
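The Mantel-Haenszel calculation above can be sketched in Python as a weighted sum over strata. The function name and the list-of-tuples input are illustrative assumptions. Without intermediate rounding the result is about 6.55, against 6.56 when the rounded terms are used.

```python
def mantel_haenszel_rr(strata):
    """Mantel-Haenszel summary risk ratio.

    strata: iterable of (a, b, c, d) tuples, one 2 x 2 table per stratum.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * (c + d) / n
        denominator += c * (a + b) / n
    return numerator / denominator


# Overcrowding and non-overcrowding strata from the nursery example.
nursery_strata = [(61, 14, 5, 21), (10, 24, 4, 861)]
rr_mh = mantel_haenszel_rr(nursery_strata)
print(f"Mantel-Haenszel RR = {rr_mh:.2f}")
```

With a single stratum the function reduces to the ordinary risk ratio, which is a quick sanity check on the weights.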
Adjusted Odds Ratio
The adjusted OR is calculated in a similar way to the adjusted RR, summing over the strata e:
OR Mantel-Haenszel = Σ (a_e d_e / n_e) / Σ (b_e c_e / n_e)
Exposure   Disease: Yes   Disease: No   Total
Yes        a_e            b_e
No         c_e            d_e
Total                                   n_e
Adjusted Odds Ratio
In a cross-sectional study on the use of quinfamide after amoebic dysentery, the number of carriers of Entamoeba histolytica was reported.
                 Non-carrier   Carrier   Total
Quinfamide       100           54        154
Non quinfamide   15            72        87
Total            115           126       241
Adjusted Odds Ratio
We calculate the OR adjusted for area of residence with the Mantel-Haenszel formula:
Urban
Quinfamide   Non-carrier   Carrier   Total
Yes          35            39        74
No           10            51        61
Total        45            90        135
Rural
Quinfamide   Non-carrier   Carrier   Total
Yes          65            14        79
No           5             21        26
Total        70            35        105
OR Mantel-Haenszel = [(35 x 51 / 135) + (65 x 21 / 105)] / [(39 x 10 / 135) + (14 x 5 / 105)] = (13.2 + 13) / (2.89 + 0.67) = 26.2 / 3.56 = 7.4
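A minimal Python sketch of the same adjusted OR follows. The function name and the input format are assumptions for illustration; the cells are ordered as in the urban and rural tables (quinfamide yes/no by non-carrier/carrier). The unrounded result is 7.375, which rounds to the 7.4 shown above.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio.

    strata: iterable of (a, b, c, d) tuples, one 2 x 2 table per stratum.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# Urban and rural strata from the quinfamide example.
quinfamide_strata = [(35, 39, 10, 51), (65, 14, 5, 21)]
or_mh = mantel_haenszel_or(quinfamide_strata)
print(f"Mantel-Haenszel OR = {or_mh:.1f}")
```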
Mantel-Haenszel X2
The nurse now knows that the association between respiratory infection and nursery attendance remains after adjusting for the confounding variable, overcrowding.
Now she wants a Chi-squared test of the significance of this association, adjusted for the confounder.
This can be done by calculating the Mantel-Haenszel X2 test.
Mantel-Haenszel X2
To calculate a Chi-squared test adjusted for the confounder, we calculate the Mantel-Haenszel Chi-squared. The null hypothesis is that there is no association between nursery attendance and respiratory infection.
Ho: OR = 1.
X2 Mantel-Haenszel = [Σ a_e - Σ E(a_e)]² / Σ V(a_e)
We go step by step, beginning with the 2 x 2 table of each stratum.
Exposure   Disease: Yes   Disease: No   Total
Yes        a_e            b_e
No         c_e            d_e
Total                                   n_e
Mantel-Haenszel X2
The Mantel-Haenszel Chi-squared test is a kind of weighted average of the individual Chi-squared tests of each table.
To calculate it, we need three values from each table:
a_e: the number of exposed individuals with the disease
E(a_e): the expected value of a_e
V(a_e): the variance (standard error squared) of a_e
where
E(a_e) = row total x column total / grand total = (a_e + b_e) x (a_e + c_e) / n_e
V(a_e) = (a_e + b_e) x (c_e + d_e) x (a_e + c_e) x (b_e + d_e) / [n_e² x (n_e - 1)]
Mantel-Haenszel X2
Overcrowding table:
a = 61
E(a) = 75 x 66 / 101 = 49.01
V(a) = (75 x 66 x 26 x 35) / (101² x (101 - 1)) = 4.42
Non-overcrowding table:
a = 10
E(a) = 34 x 14 / 899 = 0.53
V(a) = (34 x 14 x 865 x 885) / (899² x (899 - 1)) = 0.50
To obtain the Mantel-Haenszel Chi-squared test (Chi-squared adjusted for overcrowding), we add these values over the two strata, using the formula:
Example
X2 Mantel-Haenszel = [Σ a_e - Σ E(a_e)]² / Σ V(a_e)
                   a     E(a)    V(a)
Overcrowding       61    49.01   4.42
Non-overcrowding   10    0.53    0.50
Total              71    49.54   4.92
X2 Mantel-Haenszel = (71 - 49.54)² / 4.92 = 93.60
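The three sums and the final statistic can be reproduced with the sketch below. The function name and its return values are illustrative assumptions. Carrying full precision through the calculation gives about 93.65; the 93.60 above results from using the rounded values 49.54 and 4.92.

```python
def mantel_haenszel_chi2(strata):
    """Mantel-Haenszel chi-squared (1 degree of freedom) over several 2 x 2 tables.

    strata: iterable of (a, b, c, d) tuples, one table per stratum.
    Returns the statistic together with the three sums it is built from.
    """
    sum_a = 0.0
    sum_expected = 0.0
    sum_variance = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        sum_a += a
        sum_expected += (a + b) * (a + c) / n
        sum_variance += (a + b) * (c + d) * (a + c) * (b + d) / (n ** 2 * (n - 1))
    chi2 = (sum_a - sum_expected) ** 2 / sum_variance
    return chi2, sum_a, sum_expected, sum_variance


# Overcrowding and non-overcrowding strata from the nursery example.
chi2_mh, total_a, total_e, total_v = mantel_haenszel_chi2(
    [(61, 14, 5, 21), (10, 24, 4, 861)]
)
print(f"sum a = {total_a:.0f}, sum E(a) = {total_e:.2f}, sum V(a) = {total_v:.2f}")
print(f"Mantel-Haenszel X2 = {chi2_mh:.2f}")
```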
Confounding or no confounding
How do we decide whether there is confounding?
There are no statistical tests to demonstrate confounding.
We calculate the statistical tests and effect measures for the raw and stratified tables.
Then we calculate the summary (adjusted) estimates, compare them with the raw ones, and conclude whether or not there is confounding.
Confounding or no confounding
If there is an important difference between the raw and adjusted estimates, we say that the association of interest is confounded by another factor.
Consider the data on children's nursery attendance and respiratory infection.
After adjusting for overcrowding, the RR decreases from 7.06 to 6.56.
Possible effects of confounding
Generally there is more than one confounder. Confounders can have different effects:
The association under study may be significant before adjusting for a confounder and not significant afterwards.
The association may remain significant after adjusting for a confounder, but with a less significant p-value.
Strata can show opposite results; in this case it is better to show the stratified results. This is interaction, or effect modification.
A confounder can hide an existing relationship.