6. Categorical Data Analysis - Chi-Square & Fisher Exact Test
KNOWLEDGE FOR THE BENEFIT OF HUMANITY
BIOSTATISTICS (HFS3283)
CATEGORICAL DATA (CHI-SQUARE & FISHER EXACT TEST)
Dr. Mohd Razif Shahril
School of Nutrition & Dietetics
Faculty of Health Sciences
Universiti Sultan Zainal Abidin
S C H O O L O F N U T R I T I O N A N D D I E T E T I C S • U N I V E R S I T I S U L T A N Z A I N A L A B I D I N
Topic Learning Outcomes
At the end of this lecture, students should be able to:
• identify types of categorical data analysis and their use
• explain the assumptions to be met when using the chi-square and Fisher exact tests
• perform the chi-square and Fisher exact tests using SPSS
• explain how to interpret the SPSS outputs from the chi-square and Fisher exact tests
What is categorical data analysis?
• The independent (explanatory) variable is categorical (nominal or ordinal)
• The dependent (response) variable is categorical (nominal or ordinal)
• Most common forms:
– 2 x 2 (each variable has 2 levels)
– Nominal/Nominal
– Nominal/Ordinal
– Ordinal/Ordinal
• Data of this kind are presented in a CONTINGENCY TABLE
Contingency Table
• Tables representing all combinations of levels of the explanatory and response variables
• Numbers in the table represent counts of the number of cases in each cell
• Row and column totals are called marginal counts
Example of Contingency Table
• Response variable: Cognitive Level (Low, High)
• Explanatory variable: BMI (Underweight, Normal, Overweight, Obese)

BMI            Cognitive Level     Total
               Low       High
Underweight     59        232        291
Normal          54        367        421
Overweight     114        101        215
Obese          173         54        227
Total          400        754       1154

The cell entries are counts. The Total column and the Total row are the marginal counts.
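The table above can be sketched in Python to show how the marginal counts follow from the cell counts. This is only an illustration: the slides use SPSS, and the variable names below are assumptions, not part of the lecture.

```python
# Cell counts from the example: rows are BMI groups,
# columns are (Low, High) cognitive level.
bmi_groups = ["Underweight", "Normal", "Overweight", "Obese"]
counts = [
    [59, 232],
    [54, 367],
    [114, 101],
    [173, 54],
]

# Marginal counts: row totals, column totals, and the grand total.
row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
grand_total = sum(row_totals)

for name, row, total in zip(bmi_groups, counts, row_totals):
    print(f"{name:<12}{row[0]:>6}{row[1]:>6}{total:>7}")
print(f"{'Total':<12}{col_totals[0]:>6}{col_totals[1]:>6}{grand_total:>7}")
```

The printed totals should agree with the Total row and Total column of the slide.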
2 x 2 Contingency Table
• Each variable has 2 levels
– Explanatory variable: groups (typically based on demographics, exposure, or treatment)
– Response variable: outcome (typically presence or absence of a characteristic)

BMI        Cognitive Level     Total
           Low       High
≤ 24.9     113        599        712
> 24.9     287        155        442
Total      400        754       1154
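The 2 x 2 table is consistent with merging the four BMI groups of the earlier example into two. A minimal sketch of that collapse follows. The mapping of Underweight and Normal to "≤ 24.9" is inferred from the totals rather than stated on the slide, and the names are illustrative.

```python
# Counts per BMI group as (Low, High) cognitive level, from the 4 x 2 example.
counts = {
    "Underweight": (59, 232),
    "Normal": (54, 367),
    "Overweight": (114, 101),
    "Obese": (173, 54),
}

# Assumed grouping: inferred from the slide totals, not stated explicitly.
groups = {
    "<= 24.9": ["Underweight", "Normal"],
    "> 24.9": ["Overweight", "Obese"],
}

# Add the member rows together to obtain the collapsed 2 x 2 table.
collapsed = {}
for label, members in groups.items():
    low = sum(counts[m][0] for m in members)
    high = sum(counts[m][1] for m in members)
    collapsed[label] = (low, high)

for label, (low, high) in collapsed.items():
    total = low + high
    print(f"{label:<8} Low={low:>4} High={high:>4} Total={total:>4} "
          f"Low%={100 * low / total:.1f}")
```

Printing the percentage of Low cognitive level per group makes the comparison of two proportions explicit.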
Chi-Square Test (χ²)
The chi-square test for independence, also called Pearson's chi-square test or the chi-square test of association, is used to determine whether there is a relationship between two categorical variables.
• Hypothesis:
– Comparing two or more proportions
– H0 : P1 = P2
• Assumptions:
– Random samples (based on study design and method)
– Observations are independent (based on study design and method)
– The number of cells with an Expected Count (EC) less than 5 must be less than 20% of the total number of cells
– The smallest EC must be at least 2
• To check the two EC assumptions, calculate the expected count for each cell (SPSS will do it)
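SPSS reports the expected counts itself, but the calculation is simple enough to sketch. The example below assumes the usual formula, expected count = row total × column total / grand total, and applies the two EC rules exactly as stated on the slide. The function names are illustrative.

```python
def expected_counts(observed):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def check_ec_assumptions(observed):
    """Apply the two expected-count rules stated on the slide."""
    cells = [e for row in expected_counts(observed) for e in row]
    share_below_5 = sum(e < 5 for e in cells) / len(cells)
    smallest = min(cells)
    return {
        "share_below_5": share_below_5,
        "smallest_ec": smallest,
        "met": share_below_5 < 0.20 and smallest >= 2,
    }


# BMI by cognitive level table from the earlier slide, as (Low, High) rows.
bmi_table = [[59, 232], [54, 367], [114, 101], [173, 54]]
result = check_ec_assumptions(bmi_table)
print(result)
```

For this table every expected count is far above 5, so both rules are satisfied.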
Example Chi-Square Test (χ²) – (1)
• Hypothesis:
– Association between gender and Knowledge on Nutrition (KoN)
– Comparing the proportion of Low KoN between genders
– H0 : P(KoN)male = P(KoN)female
• Assumptions:
– Random samples [ √ ]
– Observations are independent [ √ ]
– The number of cells with an Expected Count (EC) less than 5 must be less than 20% of the total number of cells (calculated by SPSS)
– The smallest EC must be at least 2 (calculated by SPSS)
Chi-square using SPSS - procedure:
[SPSS screenshots showing steps 1 to 9 of the procedure]
Chi-square using SPSS - Output:
• Descriptive statistics for each group
• Chi-square statistic = 0.417; df = 1; P-value = 0.518
• EC check: the percentage of cells with EC less than 5 must be < 20%, and the smallest EC must be ≥ 2
• Both EC assumptions are met
Chi-square using SPSS – Table and Interpretation:
Table 1: Factors (categorical variables) associated with Knowledge on Nutrition

Variable           n    Low KoN      High KoN     χ² statistic a   P-value
                        Freq (%)     Freq (%)     (df)
Gender                                            0.417 (1)        0.518
  Male             39   19 (48.7)    20 (51.3)
  Female           34   14 (41.2)    20 (58.8)
Ethnicity
  Malay
  Others
Education Level
  Low
  High

a Chi-square test for independence

The prevalence (proportion) of Low Knowledge on Nutrition does not differ significantly between males and females (P = 0.518). Therefore, there is no significant association between gender and Knowledge on Nutrition.
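The reported statistic can be reproduced by hand from the gender counts. The sketch below assumes the Pearson chi-square without continuity correction and uses the fact that, for df = 1, the upper-tail probability equals erfc(√(χ²/2)). It is a cross-check written in Python, not the SPSS procedure itself, and the function name is illustrative.

```python
import math


def chi_square_2x2(observed):
    """Pearson chi-square (no continuity correction) and P-value, 2 x 2 table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (obs - expected) ** 2 / expected
    # For df = 1 the chi-square upper tail is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Rows: Male, Female; columns: Low KoN, High KoN (counts from Table 1).
gender_table = [[19, 20], [14, 20]]
statistic, p_value = chi_square_2x2(gender_table)
print(f"chi-square = {statistic:.3f}, df = 1, P = {p_value:.3f}")
# prints: chi-square = 0.417, df = 1, P = 0.518
```

The erfc shortcut only applies to a 2 x 2 table (df = 1). Larger tables need a general chi-square distribution function from a statistics package.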
What if assumptions were not met?
• Combine adjacent columns and/or rows to increase the EC, if possible.
• If the expected count assumptions are still not met, Fisher’s exact (FE) test can be applied (only for 2 x 2 tables in SPSS).
Example Chi-Square Test (χ²) – (2)
• Hypothesis:
– Association between ethnicity and Knowledge on Nutrition (KoN)
– Comparing the proportion of Low KoN between ethnic groups
– H0 : P(KoN)malay = P(KoN)chinese = P(KoN)indian = P(KoN)others
• Assumptions:
– Random samples [ √ ]
– Observations are independent [ √ ]
– The number of cells with an Expected Count (EC) less than 5 must be less than 20% of the total number of cells (calculated by SPSS)
– The smallest EC must be at least 2 (calculated by SPSS)
Chi-square using SPSS - Output:
• Descriptive statistics for each group
• EC check: the percentage of cells with EC less than 5 must be < 20%, and the smallest EC must be ≥ 2
• 4 (50%) cells have an EC less than 5, and the smallest EC is 1.36, so the two EC assumptions are not met
• One remedy may be to combine Indian and Others (or even combine three levels) and call the result “Others”. The combination should be interpretable and meaningful.
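The effect of merging sparse categories can be illustrated with a short sketch. The slides do not give the four-level ethnicity counts, so the Chinese, Indian and Others rows below are HYPOTHETICAL. They were chosen only so that they add up to the merged Others row reported in Table 1 and give an EC footnote like the one quoted above. The actual data may differ.

```python
def expected_counts(observed):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def ec_summary(observed):
    """Return (cells with EC < 5, total cells, smallest EC)."""
    cells = [e for row in expected_counts(observed) for e in row]
    return sum(e < 5 for e in cells), len(cells), min(cells)


def merge_rows(table, labels, new_label):
    """Add the named rows together and keep the remaining rows unchanged."""
    merged = {k: v for k, v in table.items() if k not in labels}
    merged[new_label] = tuple(sum(table[k][j] for k in labels) for j in range(2))
    return merged


# Counts as (Low KoN, High KoN). Only the Malay row comes from the slides.
four_level = {
    "Malay": (20, 23),
    "Chinese": (8, 10),  # hypothetical
    "Indian": (4, 5),    # hypothetical
    "Others": (1, 2),    # hypothetical
}
two_level = merge_rows(four_level, ["Chinese", "Indian", "Others"], "Others")

before = ec_summary(list(four_level.values()))
after = ec_summary(list(two_level.values()))
print(f"before: {before[0]} of {before[1]} cells < 5, smallest EC {before[2]:.2f}")
print(f"after:  {after[0]} of {after[1]} cells < 5, smallest EC {after[2]:.2f}")
```

With these illustrative counts, merging removes every cell with an expected count below 5, which is the point of the remedy. Whether a merged category is meaningful remains a subject-matter decision.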
Chi-square using SPSS - Output:
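• Descriptive statistics for each group
• EC check: the percentage of cells with EC less than 5 must be < 20%, and the smallest EC must be ≥ 2
• Both EC assumptions are met after combining categories
• Chi-square statistic = 0.072; df = 1; P-value = 0.788
• If the EC assumptions are still not met, use Fisher’s exact test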
Chi-square using SPSS – Table and Interpretation:
Table 1: Factors (categorical variables) associated with Knowledge on Nutrition

Variable           n    Low KoN      High KoN     χ² statistic a   P-value
                        Freq (%)     Freq (%)     (df)
Gender                                            0.417 (1)        0.518
  Male             39   19 (48.7)    20 (51.3)
  Female           34   14 (41.2)    20 (58.8)
Ethnicity                                         0.072 (1)        0.788
  Malay            43   20 (46.5)    23 (53.5)
  Others           30   13 (43.3)    17 (56.7)
Education Level
  Low
  High

a Chi-square test for independence

The prevalence (proportion) of Low Knowledge on Nutrition does not differ significantly between Malay and other ethnic groups (P = 0.788). Therefore, there is no significant association between ethnicity and Knowledge on Nutrition.
Fisher Exact Test
• Fisher’s exact test is a test for independence in a 2 x 2 table.
• It is most useful when the total sample size and the expected values are small.
– Useful when E(cell counts) < 5.
• The output contains more than one P-value:
– Choose Exact Sig. (2-sided)
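To show what an exact two-sided P-value involves, here is a minimal sketch of Fisher's exact test for a 2 x 2 table. It uses one common definition: with the marginal counts fixed, sum the hypergeometric probabilities of all tables that are no more likely than the observed one. This is assumed, not confirmed, to correspond to the value SPSS labels Exact Sig. (2-sided). The example counts are HYPOTHETICAL and not taken from the slides.

```python
from math import comb


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact P-value for a 2 x 2 table of counts."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x
        # when all marginal counts are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables no more likely than the observed
    # one; the small tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )


# HYPOTHETICAL small-sample table, not taken from the slides.
example = [[3, 1], [1, 3]]
p_value = fisher_exact_two_sided(example)
print(f"Exact Sig. (2-sided) = {p_value:.3f}")
```

Because the test enumerates every table compatible with the margins, it does not rely on the expected-count assumptions of the chi-square test.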
Thank You