Contingency Tables – Part II – Getting Past Chi-Square?

Measures of Association – A Review

1. What is the difference between a significance test statistic and a measure of association?
   • How are they related?
2. The basic questions about associations between variables:
   a) Does an association exist (vs. independence)?
   b) What is the form (& direction) of the association?
   c) What is the strength (“size”) of the association?

Strength of Association

C. What does “association” mean?
   1) Shared or common elements
   2) Degree of agreement
   3) Predictability (reduction in errors/ignorance)
D. Characteristics of association measures?
   1) Coefficient should range between -1 & +1
   2) Coefficient should not be directly affected by N
   3) Coefficient should be independent of a variable’s scale of measurement (its “metric”)
   4) Coefficient values should be interpretable (intuitively or mathematically)

Strength of Association (cont.)

E. A number of different measures of association (coefficients) are available:
   • Based on different levels of measurement
   • Based on different analytical models

How to choose among them?
   1) Identify the levels of measurement of both variables
   2) Identify whether you have a clear independent variable → use a directional or a nondirectional coefficient
   3) Identify which coefficients are most commonly used

Measurement Level Situations:

Association between 2 numerical variables?
   – Coefficient = Pearson’s r
      • r² = proportion of variance “in common”
   – May use Spearman’s r if data are ranks

Association between 1 categoric and 1 numeric variable? (as in ANOVA)
   – Coefficient of Association = eta (η)
      • eta-squared = proportion of variance “between groups”
      • In SPSS, use the Descriptives → Crosstabs or Compare Means → Means procedures
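As a rough illustration of "proportion of variance between groups", eta-squared can be sketched as the between-group sum of squares divided by the total sum of squares. The function name `eta_squared` and the scores below are made up for this example; they are not from the textbook or from SPSS.

```python
from statistics import mean

def eta_squared(groups):
    """Share of the total variation in a numeric variable that lies between groups.

    `groups` maps each category label to a list of numeric scores.
    """
    all_values = [v for scores in groups.values() for v in scores]
    grand_mean = mean(all_values)
    # Total variation: every score against the grand mean
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    # Between-group variation: each group mean against the grand mean, weighted by group size
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in groups.values())
    return ss_between / ss_total

# Hypothetical scores for three categories (illustrative data only)
scores = {"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]}
eta_sq = eta_squared(scores)   # 54 / 60 = 0.9
eta = eta_sq ** 0.5            # eta itself is the square root
print(round(eta_sq, 3), round(eta, 3))
```

Because eta-squared is a ratio of sums of squares, it cannot be negative, so eta carries no direction; it only describes strength.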

Association between 2 categoric variables

• Different approaches to nonparametric measures of association
   1) Chi-square-based → correct for degrees of freedom and sample size
   2) Uncertainty/Errors of Prediction → predictability of Y given knowledge of X
   3) Concordance/agreement → proportion of shared or correspondent values
• Note: coefficients for Ordinal and Nominal variables are different
   – Coefficient is limited to the lower measurement level
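To make the chi-square-based approach concrete, here is a minimal sketch of how the raw chi-square value can be rescaled for sample size and table size. It uses the standard textbook formulas for phi, the contingency coefficient C, and Cramér's V; the function names and the counts are illustrative assumptions.

```python
from math import sqrt

def chi_square(table):
    """Pearson chi-square and N for a table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n

def chi_square_measures(table):
    """Phi, contingency coefficient C, and Cramér's V from the same chi-square."""
    chi2, n = chi_square(table)
    k = min(len(table), len(table[0]))  # smaller of the number of rows and columns
    return {
        "phi": sqrt(chi2 / n),               # removes the effect of N
        "C": sqrt(chi2 / (chi2 + n)),        # always below 1
        "V": sqrt(chi2 / (n * (k - 1))),     # also adjusts for table size
    }

# Made-up 2x2 counts, then the same pattern with every cell doubled
small = [[30, 10], [10, 30]]
large = [[60, 20], [20, 60]]
print(chi_square(small)[0], chi_square_measures(small))
print(chi_square(large)[0], chi_square_measures(large))
```

Doubling every cell doubles chi-square (20 becomes 40 here) but leaves phi and V at 0.5, which is the sense in which these coefficients are not directly affected by N.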

Strength of Association (continued)

• Association between 2 Nominal variables (or 1 nominal + 1 ordinal variable)
   – Chi-square-derived:
      • Contingency coefficient, C
      • Cramér’s V coefficient → use this for 3x3 or larger tables
      • Phi coefficient, Φ → use this for 2x2 tables (or 2x3 tables)
   – PRE-derived:
      • Lambda (asymmetric) (λyx ≠ λxy)
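The PRE logic behind lambda, and why it is asymmetric, can be sketched as follows: count the prediction errors made by always guessing the modal category of Y, then the errors made by guessing the modal Y within each category of X. The function name and the counts are assumptions made for illustration.

```python
def goodman_kruskal_lambda(table):
    """Lambda for predicting the column variable from the row variable.

    `table` is a list of rows of counts. The result is undefined (division by
    zero) if every case falls in a single column category.
    """
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(col_totals)
    # Errors when guessing the overall modal column category for everyone
    errors_without_x = n - max(col_totals)
    # Errors when guessing the modal column category within each row
    errors_with_x = sum(sum(row) - max(row) for row in table)
    return (errors_without_x - errors_with_x) / errors_without_x

# Made-up counts: rows are categories of X, columns are categories of Y
table = [[30, 10, 5],
         [10, 30, 15]]
transposed = [list(col) for col in zip(*table)]

lambda_yx = goodman_kruskal_lambda(table)        # Y predicted from X: 20/60
lambda_xy = goodman_kruskal_lambda(transposed)   # X predicted from Y: 20/45
print(round(lambda_yx, 3), round(lambda_xy, 3))
```

The two values differ on the same table, so the choice of dependent variable has to be made before reading off the coefficient.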

Strength of Association (continued)

• Association between 2 Ordinal variables
   – Concordance-based (PRE) statistics:
      • Gamma, γ → most commonly used (note: in cases of 2x2 tables, gamma = Yule’s Q)
      • Others? Kendall’s tau; Somers’ d (less used)
   – Rank-order statistics:
      • Spearman’s rho
      • Use if many categories & few ties
      • Must convert scores to ranks
   – Can also use Chi-square-based measures
      • Computing Phi as ordered coefficient
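A minimal sketch of how gamma is built from concordant and discordant pairs, assuming both variables are ordered from low to high across the rows and columns. The helper names and the counts are illustrative, not taken from the textbook.

```python
def concordant_discordant(table):
    """Count concordant and discordant pairs in an ordered crosstab (list of rows)."""
    concordant = discordant = 0
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):      # only cells in lower rows
                for l in range(n_cols):
                    if l > j:                   # lower and to the right: same ordering
                        concordant += table[i][j] * table[k][l]
                    elif l < j:                 # lower and to the left: opposite ordering
                        discordant += table[i][j] * table[k][l]
    return concordant, discordant

def gamma(table):
    """Gamma = (C - D) / (C + D); tied pairs are ignored."""
    c, d = concordant_discordant(table)
    return (c - d) / (c + d)

# Made-up 3x3 ordinal crosstab
table_3x3 = [[20, 10, 5],
             [10, 20, 10],
             [5, 10, 20]]
print(concordant_discordant(table_3x3), round(gamma(table_3x3), 3))

# In a 2x2 table the same computation reduces to Yule's Q = (ad - bc) / (ad + bc)
table_2x2 = [[30, 10],
             [10, 30]]
print(gamma(table_2x2))
```

Ties are left out of both the numerator and the denominator, which is one reason gamma tends to run higher than coefficients that use every case.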

Nonparametric Measures of Association: Summary Recap

• Nominal variables
   – Phi, Φ → for 2x2 tables (or 2x3)
   – Cramér’s V → for 3x3 tables or larger
• Ordinal variables
   – Gamma, γ → most commonly used
   – Yule’s Q → same statistic in a 2x2 table
   – Spearman’s r → if many values & few ties
   – Can also use Phi and Cramér’s V

Nonparametric Measures of Association: Summary (continued)

• Different kinds of coefficients will not yield the same values on the same crosstab

• Gamma (& Yule’s Q) will almost always compute higher values than Cramér’s V (& Phi) on the same tables

• Note that 2x2 tables (with binary variables) are somewhat of a special case
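For the 2x2 case, the gap between the two families of coefficients is easy to see with the standard cell-based formulas. This is only a sketch on one made-up table, with cells labelled a, b (top row) and c, d (bottom row); it does not show that Q exceeds phi on every table.

```python
from math import sqrt

def yules_q(a, b, c, d):
    """Yule's Q (gamma for a 2x2 table): cross-product difference over cross-product sum."""
    return (a * d - b * c) / (a * d + b * c)

def phi(a, b, c, d):
    """Signed phi for a 2x2 table: cross-product difference over the root of the marginal product."""
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))

# Made-up cell counts
a, b, c, d = 30, 10, 10, 30
q_value = yules_q(a, b, c, d)    # 800 / 1000 = 0.8
phi_value = phi(a, b, c, d)      # 800 / 1600 = 0.5
print(q_value, phi_value)
```

Both coefficients share the numerator ad - bc, so they always agree on direction; only the denominators, and therefore the sizes, differ.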

Non-Parametric measures of association

a. How to compute them?
   – By hand: see formulas in the textbook
      • Chi-square-based = easiest to compute
      • Gamma = more laborious by hand
      • Note: X & Y variables in the crosstab must be formatted in the same direction for ordinal statistics (e.g., Gamma)
   – In SPSS: click the Statistics box in the Crosstabs pop-up menu, then select the appropriate coefficients (Note: do not select them all)
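The note about formatting X and Y in the same direction can be checked with a small sketch: if the categories of one variable are listed high-to-low while the other runs low-to-high, gamma keeps its size but flips its sign. The compact `gamma` function and the counts here are illustrative assumptions.

```python
def gamma(table):
    """Gamma for an ordered crosstab given as a list of rows of counts."""
    cells = [(i, j, count) for i, row in enumerate(table) for j, count in enumerate(row)]
    concordant = discordant = 0
    for i, j, n1 in cells:
        for k, l, n2 in cells:
            if k > i and l > j:       # second cell is higher on both variables
                concordant += n1 * n2
            elif k > i and l < j:     # higher on one variable, lower on the other
                discordant += n1 * n2
    return (concordant - discordant) / (concordant + discordant)

# Made-up table with both variables coded low-to-high
same_direction = [[20, 10, 5],
                  [10, 20, 10],
                  [5, 10, 20]]
# Same data with the column variable listed high-to-low instead
reversed_columns = [row[::-1] for row in same_direction]

print(round(gamma(same_direction), 3), round(gamma(reversed_columns), 3))
```

A negative gamma from a table like the second one reflects only the coding order, so it is worth checking category order before interpreting the sign.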

II. Multivariate analysis of associations

• Going beyond bivariate analysis to multivariate analyses
   – We often wish to consider more than two variables at a time because other variables may be involved in more complex patterns
   – Termed “Partialling” or “Elaborating”: statistically consider
      • Confounding effects of additional variables → “spurious relationships”
      • Complicating effects of additional variables → “contingent relationships”

Multivariate Analysis (continued)

   – In cross-tabulations, crosstabs are “nested” within levels of other variables
      • Compute separate sub-crosstabs within each category or level of the 3rd variable
      • See the example on the handout
   – Partialling is only useful when the extra variable is associated with both X and Y
      • Then we wish to remove the extra covariation
      • Otherwise, it’s a waste of time
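The idea of nesting crosstabs within levels of a third variable can be sketched with binary variables: build the X-by-Y table for everyone, then again within each level of Z, and compare a coefficient across them. The data below are invented so that Z is related to both X and Y, and the helper names are hypothetical; this is not the handout example.

```python
from collections import defaultdict

def crosstab(records):
    """2x2 table of counts from (x, y, z) records: rows are X (0/1), columns are Y (0/1)."""
    table = [[0, 0], [0, 0]]
    for x, y, _z in records:
        table[x][y] += 1
    return table

def yules_q(table):
    """Yule's Q for a 2x2 table."""
    (a, b), (c, d) = table
    return (a * d - b * c) / (a * d + b * c)

def partial_tables(records):
    """One X-by-Y sub-crosstab for each level of the third variable Z."""
    by_level = defaultdict(list)
    for record in records:
        by_level[record[2]].append(record)
    return {level: crosstab(rows) for level, rows in by_level.items()}

# Invented counts of (x, y, z) cases; Z is associated with both X and Y
counts = {(0, 0, 0): 64, (0, 1, 0): 16, (1, 0, 0): 16, (1, 1, 0): 4,
          (0, 0, 1): 4, (0, 1, 1): 16, (1, 0, 1): 16, (1, 1, 1): 64}
records = [case for case, n in counts.items() for _ in range(n)]

overall = crosstab(records)
partials = partial_tables(records)
print("overall", overall, round(yules_q(overall), 3))
for level, table in sorted(partials.items()):
    print("Z =", level, table, round(yules_q(table), 3))
```

In this constructed case the overall table shows a sizeable association, but it vanishes within each level of Z, which is the pattern the slides call a spurious relationship. If the sub-table coefficients had differed from each other instead, that would point toward a contingent relationship.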
