Chapter Thirteen
Hypothesis Testing
Copyright © Houghton Mifflin Company. All rights reserved. 13 | 2
Hypothesis Testing
• Oversimplified or incorrect assumptions must be subjected to more formal hypothesis testing
Interesting Hypotheses
• Bankers assumed high-income earners are more profitable than low-income earners
• Clients who carefully balance their checkbooks every month and minimize fees due to overdrafts are unprofitable checking account customers
• Older clients were more likely than younger clients to diminish CD balances by large amounts
– This was nonintuitive because conventional wisdom suggested that older clients have a larger portfolio of assets and seek less risky investments
Data Analysis
• Descriptive
– Computing measures of central tendency and dispersion, as well as constructing one-way tables
• Inferential
– Data analysis aimed at testing specific hypotheses is usually called inferential analysis
Null and Alternative Hypotheses
H0: null hypothesis
Ha: alternative hypothesis
• Hypotheses always pertain to population parameters or characteristics rather than to sample characteristics. It is the population, not the sample, that we want to make an inference about from limited data
Steps in Conducting a Hypothesis Test
• Step 1. Set up H0 and Ha
• Step 2. Identify the nature of the sampling distribution curve and specify the appropriate test statistic
• Step 3. Determine whether the hypothesis test is one-tailed or two-tailed
• Step 4. Taking into account the specified significance level, determine the critical value (two critical values for a two-tailed test) for the test statistic from the appropriate statistical table
• Step 5. State the decision rule for rejecting H0
• Step 6. Compute the value for the test statistic from the sample data
• Step 7. Using the decision rule specified in step 5, either reject H0 or do not reject H0
Launching a Product Line Into a New Market Area
• Karen, product manager for a line of apparel, is deciding whether to introduce the product line into a new market area
• Survey of a random sample of 400 households in that market showed a mean income per household of $30,000. Karen strongly believes the product line will be adequately profitable only in markets where the mean household income is greater than $29,000. Should Karen introduce the product line into the new market?
Karen’s Criterion for Decision Making
• To reach a final decision, Karen has to make a general inference (about the population) from the sample data
• Criterion: mean income across all households in the market area under consideration
• If the mean population household income is greater than $29,000, then Karen should introduce the product line into the new market
Karen’s Hypothesis
• Karen’s decision making is equivalent to either accepting or rejecting the hypothesis:
– The population mean household income in the new market area is greater than $29,000
One-Tailed Hypothesis Test
• The term one-tailed signifies that all x̄- or z-values that would cause Karen to reject H0 are in just one tail of the sampling distribution
μ = population mean
H0: μ ≤ $29,000
Ha: μ > $29,000
Type I and Type II Errors
• Type I error occurs if the null hypothesis is rejected when it is true
• Type II error occurs if the null hypothesis is not rejected when it is false
Significance Level
• α = significance level: the upper-bound probability of a Type I error
• 1 - α = confidence level: the complement of the significance level
Summary of Errors Involved in Hypothesis Testing
| Inference Based on Sample Data | Real State of Affairs: H0 is True | Real State of Affairs: H0 is False |
| --- | --- | --- |
| H0 is True | Correct decision; confidence level = 1 - α | Type II error; P(Type II error) = β |
| H0 is False | Type I error; significance level = α* | Correct decision; power = 1 - β |
*Term represents the maximum probability of committing a Type I error
Level of Risk
• Two firms are considering introducing a new product that radically differs from their current product line
– Firm ABC
• Well-established customer base, distinct reputation for its existing product line
– Firm XYZ
• No loyal clientele, no distinct image for its present products
• Which of these two firms should be more cautious in making a decision to introduce the new product?
Scenario - Firms ABC & XYZ
• Firm ABC– ABC should be more cautious
• Firm XYZ– XYZ should be less cautious
Sample mean (x̄) values greater than $29,000, that is, x̄-values on the right-hand side of the sampling distribution centered on μ = $29,000, suggest that H0 may be false. More important, the farther to the right x̄ is, the stronger is the evidence against H0.
Exhibit 13.1 Identifying the Critical Sample Mean Value – Sampling Distribution
Karen’s Decision Rule for Rejecting the Null Hypothesis
• Reject H0 if the sample mean exceeds the critical sample mean, x̄c
Criterion Value
Every sample mean x̄ has a corresponding standard normal deviate z. The expression for z is
\[ z = \frac{\bar{x} - \mu}{s_{\bar{x}}} \]
which rearranges to
\[ \bar{x} = \mu + z\,s_{\bar{x}} \]
Substituting x̄c for x̄ and zc for z gives
\[ \bar{x}_c = \mu + z_c\,s_{\bar{x}} \]
where zc is the standard normal deviate corresponding to the critical sample mean, x̄c.
Computing the Criterion Value
The standard deviation for the sample of 400 households is $8,000. The standard error of the mean is given by
\[ s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{8{,}000}{\sqrt{400}} = \$400 \]
The critical mean household income x̄c is found through the following two steps:
1. Determine the critical z-value, zc. For α = .05, from Appendix 1, zc = 1.645.
2. Substitute the values of zc, the standard error, and μ (under the assumption that H0 is "just" true): x̄c = $29,000 + 1.645($400) = $29,658.
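The two steps for the criterion value can be sketched in a few lines of Python. This is only an illustration: the variable names are made up here, and `statistics.NormalDist` from the standard library stands in for the normal table in Appendix 1.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 29_000   # population mean when H0 is "just" true
s = 8_000       # sample standard deviation
n = 400         # sample size
alpha = 0.05    # significance level

std_error = s / sqrt(n)                 # standard error of the mean
z_c = NormalDist().inv_cdf(1 - alpha)   # critical z for an upper-tail test
x_c = mu_0 + z_c * std_error            # critical sample mean

print(f"standard error = {std_error:.0f}")
print(f"z_c = {z_c:.3f}")
print(f"x_c = {x_c:.0f}")
```

The computed critical mean rounds to the $29,658 used in Karen's decision rule.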
Karen’s Decision Rule
• If the sample mean household income is greater than $29,658, reject the null hypothesis and introduce the product line into the new market area.
Test Statistic
The value of the test statistic is simply the z-value corresponding to x̄ = $30,000.
\[ z = \frac{\bar{x} - \mu}{s_{\bar{x}}} = \frac{30{,}000 - 29{,}000}{400} = 2.5 \]
Exhibit 13.2 Critical Value for Rejecting the Null Hypothesis
P-Value – Actual Significance Level
• The probability of obtaining an x̄-value as high as $30,000 or more when μ is only $29,000 is .0062
• This value is sometimes called the actual significance level, or the p-value
• The actual significance level of .0062 in this case means the odds are less than 62 out of 10,000 that the sample mean income of $30,000 would have occurred entirely due to chance (when the population mean income is $29,000 or less)
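As a rough check on the z-value and the actual significance level, the calculation can be reproduced as follows. The function name `one_sample_z` is hypothetical, and the standard library's normal distribution replaces the table lookup.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(sample_mean, mu_0, s, n):
    """Return (z, upper-tail p-value) for testing H0: mu <= mu_0."""
    std_error = s / sqrt(n)
    z = (sample_mean - mu_0) / std_error
    p_value = 1 - NormalDist().cdf(z)   # area to the right of z
    return z, p_value

z, p_value = one_sample_z(30_000, 29_000, 8_000, 400)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

Because the p-value is well below the .05 significance level, it leads to the same decision as comparing the sample mean with the critical value.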
T-test
Conduct a t-test when the sample is small. Let the sample size be n = 25, with x̄ = $30,000 and s = $8,000.
From the t-table in Appendix 3, tc = 1.71 for α = .05 and d.f. = 24.
Decision rule: “Reject H0 if t > 1.71.”
T-test (Cont’d)
The value of t from the sample data:
\[ s_{\bar{x}} = \frac{8{,}000}{\sqrt{25}} = \$1{,}600 \]
\[ t = \frac{\bar{x} - \mu}{s_{\bar{x}}} = \frac{30{,}000 - 29{,}000}{1{,}600} = 0.625 \]
Because the computed value of t is less than 1.71, H0 cannot be rejected. Karen should not introduce the product line into the new market area.
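The small-sample version of the test can be sketched the same way. The helper name `one_sample_t` is an assumption for illustration, and the critical value is simply the table value quoted on the slide rather than something computed here.

```python
from math import sqrt

def one_sample_t(sample_mean, mu_0, s, n):
    """Return the t-statistic for a one-sample test of the mean."""
    std_error = s / sqrt(n)
    return (sample_mean - mu_0) / std_error

t = one_sample_t(30_000, 29_000, 8_000, 25)
t_c = 1.71              # table value for alpha = .05 and d.f. = 24
reject_h0 = t > t_c     # one-tailed (upper-tail) decision rule

print(f"t = {t:.3f}, reject H0: {reject_h0}")
```

With the same sample mean and standard deviation, shrinking the sample from 400 to 25 quadruples the standard error, which is why the evidence is no longer strong enough to reject H0.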
Two-Tailed Hypothesis Test
• A two-tailed test is one in which values of the test statistic leading to rejection of the null hypothesis fall in both tails of the sampling distribution curve
H0: μ = $29,000
Ha: μ ≠ $29,000
Test of Two Means
• A health service agency has designed a public service campaign to promote physical fitness and the importance of regular exercise. Since the campaign is a major one, the agency wants to make sure of its potential effectiveness before running it on a national scale
– To conduct a controlled test of the campaign’s effectiveness, the agency needs two similar cities
– The agency identified two similar cities
• City 1 will serve as the test city
• City 2 will serve as a control city
Test of Two Means (Cont’d)
• A random survey was conducted to measure the average time per day a typical adult in each city spent on some form of exercise
– 300 adults in city 1
– 200 adults in city 2
• Results of the survey:
– Average was 30 minutes per day (with a standard deviation of 22 minutes) in city 1
– Average was 35 minutes per day (with a standard deviation of 25 minutes) in city 2
• Question
– From these results, can the agency conclude confidently that the two cities are well matched for the controlled test?
Basic Statistics and Hypotheses
City 1: n1 = 300, x̄1 = 30, s1 = 22
City 2: n2 = 200, x̄2 = 35, s2 = 25
The hypotheses are
H0: μ1 = μ2 or μ1 - μ2 = 0
Ha: μ1 ≠ μ2 or μ1 - μ2 ≠ 0
Test Statistic
The test statistic is the z-statistic, given by
\[ z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_1^2/n_1 + s_2^2/n_2}} \]
Both n1 and n2 are greater than 30, so the z-statistic can be used as the test statistic.
Decision – Two-Tailed Test
• For two-tailed tests
– Identify two critical values of z, one for each tail of the sampling distribution
– The probability corresponding to each tail is .025, since α = .05
– From the Normal Table, the z-value for α/2 = .025 is 1.96
• Decision rule: “Reject H0 if z < -1.96 or if z > 1.96.”
Computing Z-value – Two-Tailed Test
Computing the value of z from the survey results and under the customary assumption that the null hypothesis is true (i.e., μ1 - μ2 = 0):
\[ z = \frac{(30 - 35) - 0}{\sqrt{(22)^2/300 + (25)^2/200}} = -2.29 \]
Since z < -1.96, we should reject H0.
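A minimal sketch of the two-city z-test follows. The function name `two_sample_z` is hypothetical, and the critical value is computed from the standard library's normal distribution instead of being read from the Normal Table. Without intermediate rounding the statistic comes out near -2.30; the -2.29 reported above appears to reflect rounding of the standard error.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(x1, s1, n1, x2, s2, n2, diff_0=0.0):
    """z-statistic for the difference between two independent sample means."""
    std_error = sqrt(s1**2 / n1 + s2**2 / n2)
    return ((x1 - x2) - diff_0) / std_error

alpha = 0.05
z = two_sample_z(30, 22, 300, 35, 25, 200)
z_c = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
reject_h0 = abs(z) > z_c                    # reject in either tail

print(f"z = {z:.3f}, critical value = {z_c:.2f}, reject H0: {reject_h0}")
```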
Exhibit 13.5 Hypothesis Test Related to Mean Exercising in Two Cities
T-Test for Independent Samples
The test statistic is
\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s^* \sqrt{1/n_1 + 1/n_2}} \]
with d.f. = n1 + n2 - 2. In this expression, s* is the pooled standard deviation, given by
\[ s^* = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} \]
T-Test for Independent Samples - Two Cities
n1 = 20, x̄1 = 30, s1 = 22
n2 = 10, x̄2 = 35, s2 = 25
The degrees of freedom for the t-statistic are d.f. = 20 + 10 - 2 = 28.
The critical value of t with 28 d.f. for a tail probability of .025 is 2.05.
Decision rule: “Reject H0 if t < -2.05 or if t > 2.05.”
The pooled standard deviation is
\[ s^* = \sqrt{\frac{19(22)^2 + 9(25)^2}{28}} \approx \sqrt{529} = 23 \]
The test statistic is
\[ t = \frac{(30 - 35) - 0}{23\sqrt{1/20 + 1/10}} = -.56 \]
Since t is neither less than -2.05 nor greater than 2.05, we cannot reject H0.
The sample evidence is not strong enough to conclude that the two cities differ in terms of levels of exercising activity of their residents.
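The pooled-variance calculation can be checked with a short sketch. The function name `pooled_t` is an illustrative assumption, and the critical value of 2.05 is the table value quoted above.

```python
from math import sqrt

def pooled_t(x1, s1, n1, x2, s2, n2, diff_0=0.0):
    """Return (t, pooled standard deviation, d.f.) for two independent samples."""
    df = n1 + n2 - 2
    s_pooled = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)
    t = ((x1 - x2) - diff_0) / (s_pooled * sqrt(1 / n1 + 1 / n2))
    return t, s_pooled, df

t, s_pooled, df = pooled_t(30, 22, 20, 35, 25, 10)
t_c = 2.05                  # table value for 28 d.f., tail probability .025
reject_h0 = abs(t) > t_c    # two-tailed decision rule

print(f"d.f. = {df}, s* = {s_pooled:.1f}, t = {t:.2f}, reject H0: {reject_h0}")
```

The same difference of 5 minutes that was significant with samples of 300 and 200 is not significant with samples of 20 and 10.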
National Insurance Company Study – Perceived Service Quality Differences Between Males and Females
• Test of Two Means Using the SPSS T-TEST Program
– On the 10-point scale,
• males gave a mean rating of approximately 7.87
• females gave a mean rating of approximately 7.83
National Insurance Company Study – Perceived Service Quality Differences Between Males and Females
• In SPSS:
1. Select ANALYZE from the menu
2. Click COMPARE MEANS
3. Select INDEPENDENT-SAMPLES T-TEST
4. Move “OQ – Overall Service Quality” to the “TEST VARIABLE(S)” box
5. Move “gender” to the “GROUPING VARIABLE” box
6. DEFINE GROUPS (SEX = 1 for male and 2 for female)
7. Click OK
OQ – Overall Perceived Service Quality
Gender – Sex = 1 for male, Sex = 2 for female
Group Statistics
| | gender | N | Mean | Std. Deviation | Std. Error Mean |
| --- | --- | --- | --- | --- | --- |
| OQ | male | 137 | 7.87 | 2.26 | .19 |
| OQ | female | 126 | 7.83 | 2.31 | .21 |
F-test (checks whether the variances of the two groups can be assumed equal): p-value = .210, so the null hypothesis of equal variances cannot be rejected at α = 0.05.
Because the p-value is greater than α = 0.05, do not reject; reading the “equal variances assumed” row is correct.
Use the other row (equal variances not assumed) when the null hypothesis of equality of variances is rejected.
The p-value of .88 is greater than α = 0.05. Do not reject H0.
The p-value implies that the odds are 88 in 100 that a difference of magnitude .04 (i.e., 7.87 - 7.83) could have occurred from chance. The null hypothesis cannot be rejected at the customary significance level of .05.
Test of Two Means When Samples Are Dependent
• The need to check for significant differences between two mean values when the samples are not independent
Test of Two Means When Samples Are Dependent (Cont’d)
• A retail chain ran a special promotion in a representative sample of 10 of its stores to boost sales
• Weekly sales per store before and after the introduction of the special promotion are shown
• Did the special promotion lead to a significant increase in sales?
Sales Per Store Before and After a Promotional Campaign
Sales per Store (In Thousands)
| Store Number (i) | Before Promotion (xbi) | After Promotion (xai) | Change in Sales (In Thousands) xdi = xai - xbi |
| --- | --- | --- | --- |
| 1 | 250 | 260 | 10 |
| 2 | 235 | 240 | 5 |
| 3 | 150 | 151 | 1 |
| 4 | 145 | 140 | -5 |
| 5 | 120 | 124 | 4 |
| 6 | 98 | 100 | 2 |
| 7 | 75 | 70 | -5 |
| 8 | 85 | 95 | 10 |
| 9 | 180 | 200 | 20 |
| 10 | 212 | 220 | 8 |
| Total | | | 50 |
Test of Two Means When Samples Are Dependent (Cont’d)
One-tailed hypothesis test: H0: μd ≤ 0; Ha: μd > 0.
The sample estimate of μd is x̄d, given by
\[ \bar{x}_d = \frac{\sum_{i=1}^{n} x_{di}}{n} \]
where n is the sample size.
x̄d = 50/10 = 5
The test statistic is
\[ t = \frac{\bar{x}_d - \mu_d}{s/\sqrt{n}} = \frac{5 - 0}{7.53/\sqrt{10}} = 2.10 \]
Standard deviation (s) = 7.53, α = 0.05, and tc for 9 d.f. = 1.83 from Appendix 3.
Decision rule: “Reject H0 if t > 1.83.”
Since the test statistic t = 2.10 is greater than 1.83, we reject H0 and conclude that the mean change in sales per store was significantly greater than zero.
The special promotion was indeed effective.
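The paired calculation can be reproduced directly from the ten before and after sales figures in the table. This is a sketch using only the standard library; the variable names are illustrative, and the critical value is the table value quoted above.

```python
from math import sqrt
from statistics import mean, stdev

# Weekly sales per store (in thousands), stores 1 through 10
before = [250, 235, 150, 145, 120, 98, 75, 85, 180, 212]
after = [260, 240, 151, 140, 124, 100, 70, 95, 200, 220]

diffs = [a - b for a, b in zip(after, before)]   # change in sales per store
mean_diff = mean(diffs)                          # sample estimate of mu_d
s_d = stdev(diffs)                               # sample standard deviation of the changes
t = (mean_diff - 0) / (s_d / sqrt(len(diffs)))   # test statistic under H0: mu_d = 0

t_c = 1.83              # table value for alpha = .05 and 9 d.f.
reject_h0 = t > t_c     # one-tailed (upper-tail) decision rule

print(f"mean change = {mean_diff}, s = {s_d:.2f}, t = {t:.2f}, reject H0: {reject_h0}")
```

Working with the per-store differences is what distinguishes this test from the independent-samples case: the large store-to-store variation in sales drops out.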
Exhibit 13.6 Hypothesis Test Related to Change in Weekly Sales Per Store
Test for a Single Proportion
• Ms. Jones wants to substantially increase her firm's advertising budget
• The firm sells a variety of personal computer accessories
• Random sample: 20 of 100 personal computer owners (20%) know the brand name
• Criterion: the increase is justified only if the true awareness rate for the brand name across all personal computer owners is less than .3
• Should Ms. Jones increase the advertising budget on the basis of survey results?
Test for a Single Proportion (Cont’d)
• Need to test the population proportion of personal computer owners who are aware of the brand:
H0: π ≥ .3
Ha: π < .3
(π is the symbol for population proportion)
Test for a Single Proportion (Cont’d)
The test statistic:
\[ z = \frac{p - \pi}{\sqrt{\pi(1 - \pi)/n}} \]
where p is the sample proportion. From the Normal Table, zc = -1.645 for α = .05.
Decision rule here is: “Reject H0 if z < -1.645.”
With p = .2, π = .3, and n = 100, z = -2.174 (using a standard error rounded to .046).
Since -2.174 < -1.645, we reject H0. The sample awareness rate of .2 is too low to support the hypothesis that the population awareness rate is .3 or more.
The actual significance level (p-value) corresponding to z = -2.174 is approximately .015 (from Appendix 1).
This level of significance implies that the odds are lower than 15 in 1,000 that the sample awareness rate of .2 would have occurred entirely by chance (that is, when the population awareness rate is .3 or higher).
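The single-proportion test can be sketched as below. The function name `one_proportion_z` is hypothetical. Without rounding the standard error, z comes out near -2.18 rather than the slide's -2.174; the p-value still rounds to about .015 and the decision is unchanged.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z(p, pi_0, n):
    """Return (z, lower-tail p-value) for testing H0: pi >= pi_0."""
    std_error = sqrt(pi_0 * (1 - pi_0) / n)   # standard error when H0 is "just" true
    z = (p - pi_0) / std_error
    p_value = NormalDist().cdf(z)             # area to the left of z
    return z, p_value

z, p_value = one_proportion_z(0.2, 0.3, 100)
z_c = -1.645            # table value for alpha = .05, lower tail
reject_h0 = z < z_c

print(f"z = {z:.3f}, p-value = {p_value:.3f}, reject H0: {reject_h0}")
```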
Exhibit 13.4 Hypothesis Test Related to Proportion of Personal Computer Owners
Test of Two Proportions: Choosing Between Commercial X & Commercial Y For a New Product
• Tom, advertising manager for a frozen-foods company, is in the process of deciding between two TV commercials (X and Y) for a new frozen food to be introduced
– Commercial X
• Runs for 20 seconds
• Random sample: 20% awareness out of 200 respondents
– Commercial Y
• Runs for 30 seconds
• Random sample: 25% awareness out of 200 respondents
Test of Two Proportions (Cont’d)
• Question
– Can Tom conclude that commercial Y will be more effective in the total market for the new product?
Criterion for Decision Making
• To reach a final decision, Tom has to make a general inference (about the population) from the sample data
• Criterion: relative degrees of awareness likely to be created by the 2 commercials in the population of all adult consumers
• Tom should conclude that commercial Y is more effective than commercial X only if the anticipated population awareness rate for commercial Y is greater than that for X
Hypothesis
• Tom’s decision making is equivalent to either accepting or rejecting the hypothesis:
– The potential awareness rate that commercial Y can generate among the population of consumers is greater than that which commercial X can generate
Null and Alternative Hypotheses
| | Commercial Y | Commercial X |
| --- | --- | --- |
| Sample sizes | n1 = 200 | n2 = 200 |
| Sample proportions | p1 = .25 | p2 = .20 |
(Subscript 1 refers to commercial Y and subscript 2 to commercial X, matching the survey results of 25% and 20% awareness.)
The hypotheses are:
H0: π1 ≤ π2 or π1 - π2 ≤ 0
Ha: π1 > π2 or π1 - π2 > 0
Test of Two Proportions – Sample Standard Error
\[ z = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sigma_{p_1 - p_2}} \]
The standard error in the denominator is estimated by the sample standard error formula
\[ s_{p_1 - p_2} = \sqrt{PQ\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \]
where
\[ P = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}, \qquad Q = 1 - P \]
Test of Two Proportions
For α = .05, the critical value of z (from Appendix 1) is 1.645.
Decision rule: “Reject H0 if z > 1.645.”
First compute P and Q, then the sample standard error and z:
\[ P = \frac{200(.25) + 200(.2)}{200 + 200} = .225 \]
\[ Q = 1 - .225 = .775 \]
\[ s_{p_1 - p_2} = \sqrt{(.225)(.775)\left(\frac{1}{200} + \frac{1}{200}\right)} = .042 \]
\[ z = \frac{(.25 - .20) - 0}{.042} = 1.19 \]
Since z < 1.645, we cannot reject H0.
The sample evidence is not strong enough to suggest that commercial Y will be more effective than commercial X.
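The pooled-proportion arithmetic can be sketched as follows. The function name `two_proportion_z` is hypothetical, and sample 1 is taken to be commercial Y (25% awareness) and sample 2 commercial X (20%). Carrying full precision gives z of about 1.20 instead of 1.19, which does not affect the decision.

```python
from math import sqrt

def two_proportion_z(p1, n1, p2, n2):
    """Return (z, pooled proportion P, standard error) under H0: pi1 - pi2 = 0."""
    pooled = (n1 * p1 + n2 * p2) / (n1 + n2)   # P
    q = 1 - pooled                             # Q
    std_error = sqrt(pooled * q * (1 / n1 + 1 / n2))
    z = (p1 - p2) / std_error
    return z, pooled, std_error

z, pooled, std_error = two_proportion_z(0.25, 200, 0.20, 200)
z_c = 1.645             # table value for alpha = .05, upper tail
reject_h0 = z > z_c

print(f"P = {pooled:.3f}, s = {std_error:.3f}, z = {z:.2f}, reject H0: {reject_h0}")
```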
Hypothesis Test Related to Awareness Generated by Two Commercials
Cross-Tabulations: Chi-square Contingency Test
• Technique used for determining whether there is a statistically significant relationship between two categorical (nominal or ordinal) variables
Telecommunications Company
• Marketing manager of a telecommunications company is reviewing the results of a study of potential users of a new cell phone
– Random sample of 200 respondents
• A cross-tabulation of data on whether target consumers would buy the phone (Yes or No) and whether the cell phone had Bluetooth wireless technology (Yes or No)
• Question
– Can the marketing manager infer that an association exists between Bluetooth technology and buying the cell phone?
Table 13.3 Two-Way Tabulation of Bluetooth Technology and Whether Customers Would Buy Cell Phone
Cross Tabulations - Hypotheses
H0: There is no association between Bluetooth wireless technology and buying the cell phone (the two variables are independent of each other).
Ha: There is some association between the Bluetooth feature and buying the cell phone (the two variables are not independent of each other).
Conducting the Test
• Test involves comparing the actual, or observed, cell frequencies in the cross-tabulation with a corresponding set of expected cell frequencies (Eij)
Expected Values
\[ E_{ij} = \frac{n_i n_j}{n} \]
where ni and nj are the marginal frequencies, that is, the total number of sample units in category i of the row variable and category j of the column variable, respectively, and n is the total sample size
Computing Expected Values
The expected frequency for the first-row, first-column cell is given by
\[ E_{11} = \frac{100 \times 100}{200} = 50 \]
Table 13.4 Observed and Expected Cell Frequencies
Chi-square Test Statistic
\[ \chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} = 72.00 \]
where r and c are the number of rows and columns, respectively, in the contingency table. The number of degrees of freedom associated with this chi-square statistic is given by the product (r - 1)(c - 1).
Chi-square Test Statistic in a Contingency Test
For d.f. = 1 and assuming α = .05, the critical chi-square value (χ²c) from Appendix 2 is 3.84.
Decision rule is: “Reject H0 if χ² > 3.84.”
Computed χ² = 72.00
Since the computed chi-square value is greater than the critical value of 3.84, reject H0.
The apparent relationship between "Bluetooth technology" and "would buy the cellular phone" revealed by the sample data is unlikely to have occurred because of chance
Interpretation
• The actual significance level associated with a chi-square value of 72 is less than .001 (from Appendix 2). Thus, the chances of getting a chi-square value as high as 72 when there is no relationship between Bluetooth technology and purchase of cell phones are less than 1 in 1,000.
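The expected-frequency and chi-square formulas can be sketched for any two-way table. Note the assumption: Table 13.3 is not reproduced in this transcript, so the observed counts below are hypothetical values chosen only to be consistent with the marginal totals of 100 and the reported statistic of 72.00. They are not taken from the source, and the function name `chi_square` is likewise illustrative.

```python
def chi_square(observed):
    """Return (chi-square statistic, d.f., expected frequencies) for a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # E_ij = (row total i) * (column total j) / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected

# HYPOTHETICAL observed counts (rows: Bluetooth yes/no; columns: would buy yes/no)
observed = [[80, 20],
            [20, 80]]

chi2, df, expected = chi_square(observed)
chi2_c = 3.84               # table value for alpha = .05 and 1 d.f.
reject_h0 = chi2 > chi2_c

print(f"E11 = {expected[0][0]:.0f}, chi-square = {chi2:.2f}, d.f. = {df}, reject H0: {reject_h0}")
```

With every expected frequency equal to 50, each cell contributes the same amount to the statistic, so the size of the gap between observed and expected counts drives the result.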
Cross-Tabulation Using SPSS for National Insurance Company
• One crucial issue in the customer survey of National Insurance Company was how a customer's education was associated with whether or not she or he would recommend National to a friend.
Need to Conduct Chi-square Test to Reach a Conclusion
• The hypotheses are
– H0: There is no association between educational level and willingness to recommend National to a friend (the two variables are independent of each other)
– Ha: There is some association between educational level and willingness to recommend National to a friend (the two variables are not independent of each other)
Association Between Education and Customer’s Willingness to Recommend National to a Friend
For two-way tabulation:
1. Select ANALYZE on the SPSS menu
2. Click on DESCRIPTIVE STATISTICS
3. Select CROSS-TABS
4. Move the “highest level of schooling” variable to the ROW(S) box
5. Move the “rec” variable to the COLUMN(S) box
6. Click on CELLS
7. Select OBSERVED and ROW PERCENTAGES
8. Click CONTINUE
9. Click OK
COUNT represents the actual number of customers in each cell. The percentages are based on the corresponding row totals.
National Insurance Company Study - Chi-Square Test
For Chi-Square Assessment:
1. Select ANALYZE
2. Click on DESCRIPTIVE STATISTICS
3. Select CROSS-TABS
4. Move the variable “highest level of schooling” to the ROW(S) box
5. Move “rec” to the COLUMN(S) box
6. Click on STATISTICS
7. Select CHI-SQUARE, CONTINGENCY COEFFICIENT, and CRAMER’S V
8. Click on CELLS
9. Select OBSERVED and EXPECTED FREQUENCIES
10. Click CONTINUE
11. Click OK
Interpret the Table
National Insurance Company Study – Expected Frequency Table
National Insurance Company Study: SPSS chi-square output, with callouts marking the computed chi-square value and the p-value
National Insurance Company Study – P-Value Significance
• The actual significance level (p-value) = 0.019
• The chances of getting a chi-square value as high as 10.007 when there is no relationship between education and recommendation are less than 19 in 1,000
• The apparent relationship between education and recommendation revealed by the sample data is unlikely to have occurred because of chance
• Jill and Tom can safely reject the null hypothesis
Precautions in Interpreting Cross Tabulation Results
• Two-way tables cannot show conclusive evidence of a causal relationship
• Watch out for small cell sizes
• The risk of drawing erroneous inferences increases when more than two variables are involved
Two-way Table Based on a Survey of 200 Hospital Patients:
| | Patients who jog | Patients who do not jog |
| --- | --- | --- |
| Patients with heart disease | 20 | 40 |
| Patients without heart disease | 80 | 60 |
| Total | 100 | 100 |
Is there a causal relationship between patients who jog and patients with heart disease?