test for independence
Test for Independence=)
Age Groups vs. Number Voting in the 2008 Presidential Election

Age Group       Voted   Did not Vote   Total
18-20               5              7      12
21-24               8              9      17
25-34              21             21      42
35-44              22             18      40
45-64              52             28      80
65 and older       27             12      39
Total             135             95     230

Number reporting that they voted (in millions)
Test for Independence
A Test for Independence is used to assess whether or not paired observations (expressed in a contingency/two-way table) are independent.
For Example: Is voting in the 2008 Presidential Election independent of age?
Graphic Display
You should always look at a graph of your data in order to get an idea of what the relationship might be.
For χ2 tests, you should convert the counts to percentages and create a bar chart.
[Bar chart: Reported Voting Behavior by Age Group. Horizontal axis: Age Groups (18-20, 21-24, 25-34, 35-44, 45-64, 65 and older). Vertical axis: Percent (0 to 80). Series: Voted, Did not Vote.]
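The conversion from counts to percentages can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the names `observed` and `row_percentages` are made up for the example, and the text "bar" is a rough stand-in for a real bar chart.

```python
# Observed counts (in millions) from the voting table: (voted, did not vote).
observed = {
    "18-20": (5, 7),
    "21-24": (8, 9),
    "25-34": (21, 21),
    "35-44": (22, 18),
    "45-64": (52, 28),
    "65 and older": (27, 12),
}


def row_percentages(table):
    """Convert each row of counts to percentages of that row's total."""
    result = {}
    for group, counts in table.items():
        total = sum(counts)
        result[group] = tuple(100 * c / total for c in counts)
    return result


percentages = row_percentages(observed)
for group, (voted, not_voted) in percentages.items():
    # Crude text bar: one '#' for every 2 percentage points of "voted".
    bar = "#" * round(voted / 2)
    print(f"{group:>12}  voted {voted:5.1f}%  did not vote {not_voted:5.1f}%  {bar}")
```

If voting were independent of age, the "voted" percentage would be roughly the same in every row, so bars of very different lengths hint at a possible relationship.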
Step #1: State the Hypotheses
Null Hypothesis (H0): the data sets are independent. This is the “Dull Hypothesis”: nothing is happening.
Alternative Hypothesis (HA): the data sets are not independent. Something IS going on!
In this example:
Null Hypothesis (H0): Voting is independent of age.
Alternative Hypothesis (HA): Voting is not independent of age.
Step #2: Calculate the Chi-Squared Statistic
Expected counts: In order to perform the Test for Independence, we must know that all the expected counts are greater than 5.
Each expected count is (row total × column total) / grand total.
Totals used to compute the expected counts:

Age Group       Voted   Did not Vote   Total
18-20                                     12
21-24                                     17
25-34                                     42
35-44                                     40
45-64                                     80
65 and older                              39
Total             135             95     230
Expected counts (row total × column total / 230):

Age Group        Voted   Did not Vote   Total
18-20           7.0435         4.9565      12
21-24           9.9783         7.0217      17
25-34           24.652         17.348      42
35-44           23.478         16.522      40
45-64           46.957         33.043      80
65 and older    22.891         16.109      39
Total              135             95     230
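As a rough check on the expected counts, the calculation can be written out in Python. This is a sketch rather than the method used in the slides: it assumes the usual rule that each expected count is (row total × column total) / grand total, and the function name `expected_counts` is made up for the example.

```python
# Observed counts (in millions); rows are age groups, columns are
# (voted, did not vote), in the same order as the table.
observed = [
    [5, 7],    # 18-20
    [8, 9],    # 21-24
    [21, 21],  # 25-34
    [22, 18],  # 35-44
    [52, 28],  # 45-64
    [27, 12],  # 65 and older
]


def expected_counts(table):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
for row in expected:
    print("  ".join(f"{value:8.4f}" for value in row))

# Smallest expected count, to compare against the "greater than 5" condition.
smallest = min(min(row) for row in expected)
print(f"smallest expected count = {smallest:.4f}")
```

For the first row this gives 12 × 135 / 230 ≈ 7.04 for "Voted". Each row of expected counts adds up to the same row total as the observed data.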
$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$
where $f_o$ = observed frequencies
and $f_e$ = expected frequencies

Age Group       Voted (obs)   Voted (exp)   Did not Vote (obs)   Did not Vote (exp)
18-20                     5        7.0435                    7               4.9565
21-24                     8        9.9783                    9               7.0217
25-34                    21        24.652                   21               17.348
35-44                    22        23.478                   18               16.522
45-64                    52        46.957                   28               33.043
65 and older             27        22.891                   12               16.109

$\chi^2 = \frac{(5 - 7.0435)^2}{7.0435} + \frac{(7 - 4.9565)^2}{4.9565} + \dots + \frac{(12 - 16.109)^2}{16.109}$
$\chi^2 = 7.017$
Contribution of each cell to χ2:

Age Group       Voted   Did not Vote
18-20
21-24               …              …
25-34               …              …
35-44               …              …
45-64               …              …
65 and older        …              …
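The sum over all twelve cells is tedious by hand, so here is a minimal Python sketch of the same calculation. It assumes the standard formula, the sum of (observed - expected)² / expected over every cell, with expected counts taken as row total × column total / grand total. The function name `chi_squared` is made up for the example.

```python
# Observed counts (in millions); rows are age groups, columns are
# (voted, did not vote).
observed = [
    [5, 7],    # 18-20
    [8, 9],    # 21-24
    [21, 21],  # 25-34
    [22, 18],  # 35-44
    [52, 28],  # 45-64
    [27, 12],  # 65 and older
]


def chi_squared(table):
    """Sum (f_o - f_e)^2 / f_e over every cell of a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, f_o in enumerate(row):
            # Expected count for this cell under independence.
            f_e = row_totals[i] * col_totals[j] / grand_total
            statistic += (f_o - f_e) ** 2 / f_e
    return statistic


statistic = chi_squared(observed)
print(f"chi-squared = {statistic:.3f}")  # prints chi-squared = 7.017
```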
Step #3: Calculate the Critical Value
The importance of our test statistic depends on two things:
(1) The significance level required (1%, 5%, 10%)
(2) The degrees of freedom of the data
Degrees of freedom = (columns - 1)(rows - 1) = (2 - 1)(6 - 1) = (1)(5) = 5 degrees of freedom
Critical Value
The table you have in your information booklet allows you to determine the critical value.
Usually, the CV will be given to you on the EA.
Step #3 Continued
.90 = 10% level of significance
.95 = 5% level of significance
.99 = 1% level of significance
Let’s use the 5% level of significance. The critical value for 5 degrees of freedom is 11.070.
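The degrees-of-freedom rule and the table lookup can be sketched as follows. This is an illustration only: the dictionary `CRITICAL_5_PERCENT` is a hypothetical stand-in for the information booklet. Only the entries for 1 and 5 degrees of freedom appear in these slides; the other three are standard chi-squared table values added as an assumption.

```python
# Critical values at the 5% significance level, keyed by degrees of freedom.
# Hypothetical stand-in for the printed table; the entries for 2, 3 and 4
# are standard table values that do not appear in the slides.
CRITICAL_5_PERCENT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def degrees_of_freedom(rows, columns):
    """Degrees of freedom for a two-way table: (rows - 1)(columns - 1)."""
    return (rows - 1) * (columns - 1)


# The voting table has 6 age groups (rows) and 2 outcomes (columns).
df = degrees_of_freedom(rows=6, columns=2)
critical_value = CRITICAL_5_PERCENT[df]
print(f"degrees of freedom = {df}, critical value = {critical_value:.3f}")
```

The totals row and totals column are not counted when working out the number of rows and columns.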
Step #4: Conclusion
If χ2 calc is less than the Critical Value, do not reject the null hypothesis.
If χ2 calc is more than the Critical Value, reject the null hypothesis.
In our example:
o 7.017 < 11.070
o SO we do not reject the null hypothesis.
• Present your conclusion IN CONTEXT.
• Because our chi-squared statistic (7.017) is less than our critical value (11.070), we fail to reject the null hypothesis. There is not enough evidence that the decision to vote is dependent on age.
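The decision rule in Step #4 is a single comparison, sketched below in Python. The function name `decide` is made up for the example. The slides do not say what to do when the statistic exactly equals the critical value, so treating that case as "do not reject" is an assumption.

```python
def decide(statistic, critical_value):
    """Compare the calculated chi-squared statistic with the critical value."""
    if statistic > critical_value:
        return "reject H0"
    # Assumption: a statistic exactly equal to the critical value is
    # treated the same as one that is less than it.
    return "do not reject H0"


# Voting example: chi-squared = 7.017, critical value = 11.070 (5% level, 5 d.f.).
decision = decide(7.017, 11.070)
print(decision)  # prints do not reject H0
```

The code only gives the statistical decision. The conclusion still has to be written in context, in terms of voting and age.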
Example 2:
Volunteers are testing a new drug in a clinical trial. It is claimed that the new drug will result in a more rapid improvement rate for sick patients.
The company wants to test its claim at the 5% significance level.

              Improved   Not Improved   Total
Given Drug          65             30      95
No Drug             42             43      85
Total              107             73     180
Example 2:
1) H0: Drug administration and improvement are independent.
   HA: Drug administration and improvement are dependent.
2) Expected Counts

              Improved   Not Improved   Total
Given Drug      56.472         38.528      95
No Drug         50.528         34.472      85
Total              107             73     180

3) $\chi^2 = \frac{(65 - 56.472)^2}{56.472} + \frac{(30 - 38.528)^2}{38.528} + \frac{(42 - 50.528)^2}{50.528} + \frac{(43 - 34.472)^2}{34.472}$
   $\chi^2 = 6.724$
4) D.f. = (2-1)(2-1) = (1)(1) = 1
5) Critical Value: 3.841
   * 6.724 > 3.841
6) Because our calculated value of χ2 (6.724) is greater than our critical value (3.841), we can reject the null hypothesis. This suggests that drug administration and improvement are dependent. It appears that the drug does aid in patients’ improvement.
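Example 2 can be run from start to finish in a short Python sketch. It is an illustration under stated assumptions, not the slides' own method: the function name `independence_test` is made up, expected counts are taken as row total × column total / grand total, and the critical value 3.841 (5% level, 1 degree of freedom) is typed in by hand rather than computed.

```python
# Observed counts from the clinical trial:
# rows = (given drug, no drug), columns = (improved, not improved).
observed = [
    [65, 30],
    [42, 43],
]


def independence_test(table, critical_value):
    """Return (chi-squared statistic, degrees of freedom, decision)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, f_o in enumerate(row):
            f_e = row_totals[i] * col_totals[j] / grand_total
            statistic += (f_o - f_e) ** 2 / f_e
    df = (len(table) - 1) * (len(table[0]) - 1)
    decision = "reject H0" if statistic > critical_value else "do not reject H0"
    return statistic, df, decision


# 3.841 is the 5% critical value for 1 degree of freedom.
statistic, df, decision = independence_test(observed, critical_value=3.841)
print(f"chi-squared = {statistic:.3f}, d.f. = {df}, decision: {decision}")
```

The output should agree with steps 3 to 6 above: a statistic of about 6.724 with 1 degree of freedom, which exceeds 3.841.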
Next Class
We will spend more time on chi-squared tests of independence. We will learn how to do them on our GDC!
Using p-values