12. inference about two populations
TRANSCRIPT
-
7/28/2019 12. Inference About Two Populations
Inference about Two Populations
-
Introduction
A variety of techniques are presented whose objective is to compare two populations.
We are interested in:
The difference between two means.
The difference between two proportions.
-
INFERENCE ABOUT THE DIFFERENCE BETWEEN TWO SAMPLES: INDEPENDENT SAMPLES
POPULATION 1: parameters $\mu_1, \sigma_1^2$; sample size $n_1$; statistics $\bar{x}_1, s_1^2$
POPULATION 2: parameters $\mu_2, \sigma_2^2$; sample size $n_2$; statistics $\bar{x}_2, s_2^2$
-
Inference about the Difference between Two Means: Independent Samples
Two random samples are drawn from the two populations of interest.
Because we compare two population means, we use the statistic $\bar{X}_1 - \bar{X}_2$.
-
The Sampling Distribution of $\bar{X}_1 - \bar{X}_2$
1. $\bar{X}_1 - \bar{X}_2$ is normally distributed if the (original) population distributions are normal.
2. $\bar{X}_1 - \bar{X}_2$ is approximately normally distributed if the (original) populations are not normal, but the sample sizes are sufficiently large (greater than 30).
3. The expected value of $\bar{X}_1 - \bar{X}_2$ is $\mu_1 - \mu_2$.
4. The variance of $\bar{X}_1 - \bar{X}_2$ is $\sigma_1^2/n_1 + \sigma_2^2/n_2$.
-
Making an inference about $\mu_1 - \mu_2$
If the sampling distribution of $\bar{X}_1 - \bar{X}_2$ is normal or approximately normal, we can write:
$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
Z can be used to build a test statistic or a confidence interval for $\mu_1 - \mu_2$.
-
Making an inference about $\mu_1 - \mu_2$
$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
In practice, the Z statistic is hardly used, because the population variances $\sigma_1^2$ and $\sigma_2^2$ are not known.
Instead, we construct a t statistic using the sample variances ($s_1^2$ and $s_2^2$) in their place.
-
Making an inference about $\mu_1 - \mu_2$: $\sigma_1^2$ and $\sigma_2^2$ unknown case
Two cases are considered when producing the t-statistic:
The two unknown population variances are equal.
The two unknown population variances are not equal.
-
Inference about $\mu_1 - \mu_2$: Equal variances
Calculate the pooled variance estimator by:
$$S_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
Example: $s_1^2 = 25$; $s_2^2 = 30$; $n_1 = 10$; $n_2 = 15$. Then,
$$S_p^2 = \frac{(10 - 1)(25) + (15 - 1)(30)}{10 + 15 - 2} = 28.04347$$
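The pooled variance is a weighted average of the two sample variances, with the degrees of freedom as weights. A minimal sketch of that calculation follows; the function name `pooled_variance` is an illustrative choice, not something from the slides.

```python
def pooled_variance(s1_sq, n1, s2_sq, n2):
    """Pooled variance: sample variances weighted by their degrees of freedom."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)


# Slide example: s1^2 = 25, s2^2 = 30, n1 = 10, n2 = 15
sp_sq = pooled_variance(25, 10, 30, 15)
print(round(sp_sq, 4))  # 28.0435
```

Because $n_2 > n_1$, the result sits closer to $s_2^2 = 30$ than to $s_1^2 = 25$.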
-
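Inference about $\mu_1 - \mu_2$: Equal variances
Construct the t-statistic as follows:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}, \qquad \text{d.f.} = n_1 + n_2 - 2$$
Perform a hypothesis test
$H_0: \mu_1 - \mu_2 = 0$
$H_1: \mu_1 - \mu_2 > 0$, or $< 0$, or $\neq 0$
Build a confidence interval
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\, n_1 + n_2 - 2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
where $1 - \alpha$ is the confidence level.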
-
EXAMPLE
The statistics obtained from random sampling are given as
$n_1 = 8$, $\bar{x}_1 = 93$, $s_1 = 20$
$n_2 = 9$, $\bar{x}_2 = 129$, $s_2 = 24$
It is thought that $\mu_1 < \mu_2$. Test the appropriate hypothesis assuming normality with $\alpha = 0.01$.
-
SOLUTION
$\sigma_1$ and $\sigma_2$ are unknown, so we use a t-test.
Because $s_1$ and $s_2$ are not much different from each other, use the equal-variance t-test.
$H_0: \mu_1 = \mu_2$
$H_A: \mu_1 < \mu_2$ (or $\mu_1 - \mu_2 < 0$)
-
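$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(7)20^2 + (8)24^2}{8 + 9 - 2} \approx 494$$
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{(93 - 129) - 0}{\sqrt{494\left(\dfrac{1}{8} + \dfrac{1}{9}\right)}} = -3.33$$
Decision Rule: Reject $H_0$ if $t < -t_{0.01,\,8+9-2} = -2.602$.
Conclusion: Since $t = -3.33 < -t_{0.01,\,8+9-2} = -2.602$, reject $H_0$ at $\alpha = 0.01$.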
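The equal-variance test above can be reproduced from the summary statistics alone. The sketch below is illustrative: the helper `pooled_t` is a made-up name, and the critical value 2.602 is copied from the slide rather than computed.

```python
import math


def pooled_t(xbar1, s1, n1, xbar2, s2, n2, diff0=0.0):
    """Equal-variance two-sample t statistic from summary statistics."""
    sp_sq = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    se = math.sqrt(sp_sq * (1 / n1 + 1 / n2))
    t = ((xbar1 - xbar2) - diff0) / se
    return t, n1 + n2 - 2, sp_sq


t_stat, df, sp_sq = pooled_t(93, 20, 8, 129, 24, 9)

T_CRIT = 2.602             # t_{0.01, 15}, taken from the slide
reject = t_stat < -T_CRIT  # lower-tail test of H_A: mu1 < mu2
print(round(t_stat, 2), df, reject)
```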
-
Test Statistic for $\mu_1 - \mu_2$ when $\sigma_1 \neq \sigma_2$ and unknown
Test Statistic:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
with degrees of freedom
$$\nu = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$$
-
Inference about $\mu_1 - \mu_2$: Unequal variances
Conduct a hypothesis test as needed, or build a confidence interval.
Confidence interval:
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,\nu}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
where $1 - \alpha$ is the confidence level.
-
Which case to use: Equal variance or unequal variance?
Whenever there is insufficient evidence that the variances are unequal, it is preferable to perform the equal variances t-test.
This is so because, for any two given samples,
(the number of degrees of freedom for the equal variances case) $\geq$ (the number of degrees of freedom for the unequal variances case).
-
Example: Making an inference about $\mu_1 - \mu_2$
Do people who eat high-fiber cereal for breakfast consume, on average, fewer calories for lunch than people who do not eat high-fiber cereal for breakfast?
A sample of 30 people was randomly drawn. Each person was identified as a consumer or a non-consumer of high-fiber cereal.
For each person the number of calories consumed at lunch was recorded.
-
Example: Making an inference about $\mu_1 - \mu_2$
Consumers  Non-cmrs
568        705
498        819
589        706
681        509
540        613
646        582
636        601
739        608
539        787
596        573
607        428
529        754
637        741
617        628
633        537
555        748
.          .
.          .
Solution:
The data are interval.
The parameter to be tested is the difference between two means.
The claim to be tested is: The mean caloric intake of consumers ($\mu_1$) is less than that of non-consumers ($\mu_2$).
-
Example: Making an inference about $\mu_1 - \mu_2$
The hypotheses are:
$H_0: (\mu_1 - \mu_2) = 0$
$H_1: (\mu_1 - \mu_2) < 0$
To check whether the population variances are equal, we use computer output to find the sample variances.
We have $s_1^2 = 1274.49$ and $s_2^2 = 13{,}386.49$.
It appears that the variances are unequal.
-
Example: Making an inference about $\mu_1 - \mu_2$
Compute: Manually
From the data we have:
$\bar{x}_1 = 595.8$; $\bar{x}_2 = 661.1$
$s_1 = 35.7$; $s_2 = 115.7$
$$df = \frac{\left(35.7^2/10 + 115.7^2/20\right)^2}{\dfrac{\left(35.7^2/10\right)^2}{10 - 1} + \dfrac{\left(115.7^2/20\right)^2}{20 - 1}} = 25.01$$
-
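Example: Making an inference about $\mu_1 - \mu_2$
Compute: Manually
The rejection region is $t < -t_{\alpha,\nu} = -t_{.05,25} \approx -1.708$.
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{(595.8 - 661.1) - 0}{\sqrt{\dfrac{35.7^2}{10} + \dfrac{115.7^2}{20}}} = -2.31$$
Since $-2.31 < -1.708$, the test statistic falls in the rejection region.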
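As a check on the manual work, the unequal-variances statistic and its degrees of freedom can be computed together from the summary statistics ($n_1 = 10$, $n_2 = 20$). This is a sketch only; `welch_t` is an illustrative name, not part of the lecture.

```python
import math


def welch_t(xbar1, s1, n1, xbar2, s2, n2, diff0=0.0):
    """Unequal-variances t statistic and its approximate degrees of freedom."""
    v1, v2 = s1**2 / n1, s2**2 / n2          # per-sample variance of the mean
    t = ((xbar1 - xbar2) - diff0) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


# Consumers: mean 595.8, sd 35.7, n 10; non-consumers: mean 661.1, sd 115.7, n 20
t_stat, df = welch_t(595.8, 35.7, 10, 661.1, 115.7, 20)
print(round(t_stat, 2), round(df, 2))
```

The degrees of freedom are generally not an integer; the slides use 25 when reading the t table.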
-
MINITAB OUTPUT
Two Sample T-Test and Confidence Interval
Twosample T for Consumers vs Non-cmrs
            N    Mean   StDev  SE Mean
Consumers   10   595.8  35.7   11
Non-cmrs    20   661    116    26
95% C.I. for mu Consumers - mu Non-cmrs: ( -123, -7)
T-Test mu Consmers = mu Non-cmrs (vs
-
Example: Making an inference about $\mu_1 - \mu_2$
Compute: Manually
The confidence interval estimator for the difference between two means is
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,\nu}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (604.02 - 633.239) \pm 1.9796\sqrt{\frac{4103}{43} + \frac{10670}{107}} = -29.21 \pm 27.65 = [-56.86, -1.56]$$
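The interval arithmetic on this slide can be sketched as follows, using the figures exactly as they appear in the formula (sample sizes 43 and 107) and the critical value 1.9796 copied from the slide. The function name `unequal_var_ci` is an assumption made for illustration.

```python
import math


def unequal_var_ci(xbar1, s1_sq, n1, xbar2, s2_sq, n2, t_crit):
    """Confidence interval for mu1 - mu2 without pooling the variances."""
    diff = xbar1 - xbar2
    margin = t_crit * math.sqrt(s1_sq / n1 + s2_sq / n2)
    return diff - margin, diff + margin


lo, hi = unequal_var_ci(604.02, 4103, 43, 633.239, 10670, 107, t_crit=1.9796)
print(round(lo, 2), round(hi, 2))
```

Both limits are negative, so zero is not inside the interval.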
-
Example
An ergonomic chair can be assembled using two different sets of operations (Method A and Method B).
The operations manager would like to know whether the assembly times under the two methods differ.
-
Example
Two samples are randomly and independently selected.
A sample of 25 workers assembled the chair using method A.
A sample of 25 workers assembled the chair using method B.
The assembly times were recorded.
Do the assembly times of the two methods differ?
-
Example: Making an inference about $\mu_1 - \mu_2$
Assembly times in Minutes
Method A  Method B
6.8       5.2
5.0       6.7
7.9       5.7
5.2       6.6
7.6       8.5
5.0       6.5
5.9       5.9
5.2       6.7
6.5       6.6
.         .
.         .
Solution
The data are interval.
The parameter of interest is the difference between two population means.
The claim to be tested is whether a difference between the two methods exists.
-
Solution: Making an inference about $\mu_1 - \mu_2$
Compute: Manually
The hypotheses are:
$H_0: (\mu_1 - \mu_2) = 0$
$H_1: (\mu_1 - \mu_2) \neq 0$
To check whether the two unknown population variances are equal we calculate $s_1^2$ and $s_2^2$.
We have $s_1^2 = 0.8478$ and $s_2^2 = 1.3031$.
The two population variances appear to be equal.
-
Solution: Making an inference about $\mu_1 - \mu_2$
Compute: Manually
To calculate the t-statistic we have:
$\bar{x}_1 = 6.288$, $\bar{x}_2 = 6.016$, $s_1^2 = 0.8478$, $s_2^2 = 1.3031$
$$S_p^2 = \frac{(25 - 1)(0.848) + (25 - 1)(1.303)}{25 + 25 - 2} = 1.076$$
$$t = \frac{(6.288 - 6.016) - 0}{\sqrt{1.076\left(\dfrac{1}{25} + \dfrac{1}{25}\right)}} = 0.93, \qquad \text{d.f.} = 25 + 25 - 2 = 48$$
-
Solution
For $\alpha = 0.05$, the rejection region is $t < -t_{\alpha/2,\nu} = -t_{.025,48} = -2.009$ or $t > t_{\alpha/2,\nu} = t_{.025,48} = 2.009$.
CONCLUSION: Since $-2.009 < t = 0.93 < 2.009$, there is insufficient evidence to reject the null hypothesis.
[Figure: t distribution with rejection regions below -2.009 and above 2.009; the observed value 0.93 lies between them.]
-
Solution: Making an inference about $\mu_1 - \mu_2$
t-Test: Two-Sample Assuming Equal Variances
                              Method A  Method B
Mean                          6.29      6.02
Variance                      0.8478    1.3031
Observations                  25        25
Pooled Variance               1.08
Hypothesized Mean Difference  0
df                            48
t Stat                        0.93
P(T
.3584 > .05
-2.0106 < .93 < +2.0106
-
Solution: Making an inference about $\mu_1 - \mu_2$
Conclusion: There is no evidence to infer at the 5% significance level that the two assembly methods are different in terms of assembly time.
-
Solution: Making an inference about $\mu_1 - \mu_2$
A 95% confidence interval for $\mu_1 - \mu_2$ is calculated as follows:
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,n_1+n_2-2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = (6.288 - 6.016) \pm 2.0106\sqrt{1.075\left(\frac{1}{25} + \frac{1}{25}\right)} = 0.272 \pm 0.5896 = [-0.3176, 0.8616]$$
Thus, at the 95% confidence level, $-0.3176 < \mu_1 - \mu_2 < 0.8616$.
Notice: Zero is included in the confidence interval.
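The same interval can be reproduced in a few lines. This is a sketch under the slide's own figures: the pooled variance 1.075 and the critical value 2.0106 are copied from the slide, and `pooled_ci` is an illustrative name.

```python
import math


def pooled_ci(xbar1, xbar2, sp_sq, n1, n2, t_crit):
    """Equal-variance confidence interval for mu1 - mu2."""
    diff = xbar1 - xbar2
    margin = t_crit * math.sqrt(sp_sq * (1 / n1 + 1 / n2))
    return diff - margin, diff + margin


lo, hi = pooled_ci(6.288, 6.016, sp_sq=1.075, n1=25, n2=25, t_crit=2.0106)
contains_zero = lo < 0 < hi  # matches the two-tail test failing to reject H0
print(round(lo, 4), round(hi, 4), contains_zero)
```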
-
Checking the required conditions for the equal variances case
The data appear to be approximately normal.
[Figure: histograms of assembly times for Design A (bins 5, 5.8, 6.6, 7.4, 8.2, More) and Design B (bins 4.2, 5, 5.8, 6.6, 7.4, More).]
-
ANALYSIS OF PAIRED DATA
What is a matched pair experiment?
Why are matched pairs experiments needed?
How do we deal with data produced in this way?
The following example demonstrates a situation where a matched pair experiment is the correct approach to test the difference between two population means.
-
ANALYSIS OF PAIRED DATA
Solution
Compare two populations of interval data.
The parameter tested is $\mu_1 - \mu_2$.
$\mu_1$: The mean of the highest salary offered to Finance MBAs
$\mu_2$: The mean of the highest salary offered to Marketing MBAs
$H_0: (\mu_1 - \mu_2) = 0$
$H_1: (\mu_1 - \mu_2) > 0$
Finance  Marketing
61,228   73,361
51,836   36,956
20,620   63,627
73,356   71,069
84,186   40,203
.        .
.        .
-
ANALYSIS OF PAIRED DATA
Solution continued
From the data we have:
$\bar{x}_1 = 65{,}624$, $\bar{x}_2 = 60{,}423$
$s_1^2 = 360{,}433{,}294$, $s_2^2 = 262{,}228{,}559$
Let us assume equal variances.
Equal Variances
                              Finance    Marketing
Mean                          65624      60423
Variance                      360433294  262228559
Observations                  25         25
Pooled Variance               311330926
Hypothesized Mean Difference  0
df                            48
t Stat                        1.04
P(T
-
The effect of a large sample variability
Question
The difference between the sample means is 65,624 - 60,423 = 5,201. So why could we not reject $H_0$ and favor $H_1$, where $\mu_1 - \mu_2 > 0$?
-
Reducing the variability
The values each sample consists of might markedly vary...
[Figure: the range of observations in sample A and the range of observations in sample B.]
-
Reducing the variability
...but the differences between pairs of observations might be quite close to one another, resulting in a small variability of the differences.
[Figure: the range of the differences, plotted around 0.]
-
Analysis of Paired Data
Since the difference of the means is equal to the mean of the differences, we can rewrite the hypotheses in terms of $\mu_D$ (the mean of the differences) rather than in terms of $\mu_1 - \mu_2$.
This formulation has the benefit of a smaller variability.
Group 1  Group 2  Difference
10       12       -2
15       11       +4
Mean1 = 12.5; Mean2 = 11.5; Mean1 - Mean2 = 1; Mean of differences = 1
-
Analysis of Paired Data
Data are generated from matched pairs, not independent samples.
Let $X_i$ and $Y_i$ denote the measurements for the i-th subject. Thus, $(X_i, Y_i)$ is a matched pair of observations.
Denote $D_i = Y_i - X_i$ or $X_i - Y_i$.
If there are n subjects studied, we have $D_1, D_2, \ldots, D_n$. Then,
$$\bar{D} = \frac{\sum_{i=1}^{n} D_i}{n}, \qquad s_D^2 = \frac{\sum_{i=1}^{n} D_i^2 - n\bar{D}^2}{n - 1}, \qquad \text{and} \qquad s_{\bar{D}}^2 = \frac{s_D^2}{n}$$
-
CONFIDENCE INTERVAL FOR $\mu_D = \mu_1 - \mu_2$
A $100(1-\alpha)\%$ C.I. for $\mu_D = \mu_1 - \mu_2$ is given by:
$$\bar{x}_D \pm t_{\alpha/2,\,n-1}\frac{s_D}{\sqrt{n}}$$
For $n \geq 30$, we can use z instead of t.
-
HYPOTHESIS TESTS FOR $\mu_D = \mu_1 - \mu_2$
The test statistic for testing a hypothesis about $\mu_D$ is given by
$$t = \frac{\bar{x}_D - \mu_D}{s_D/\sqrt{n}}$$
with $n - 1$ degrees of freedom.
-
EXAMPLE
Sample data on attitudes before and after viewing an informational film.
Subject i  Before X_i  After Y_i  Difference D_i = Y_i - X_i
1          41          46.9       5.9
2          60.3        64.5       4.2
3          23.9        33.3       9.4
4          36.2        36         -0.2
5          52.7        43.5       -9.2
6          22.5        56.8       34.3
7          67.5        60.7       -6.8
8          50.3        57.3       7
9          50.9        65.4       14.5
10         24.6        41.9       17.3
-
90% CI for $\mu_D = \mu_1 - \mu_2$:
$\bar{D} = 7.64$, $s_D = 12.57$, $t_{\alpha/2,\,n-1} = t_{0.05,\,9} = 1.833$
$$\bar{D} \pm t_{\alpha/2,\,n-1}\frac{s_D}{\sqrt{n}} = 7.64 \pm 1.833\,\frac{12.57}{\sqrt{10}}$$
$$0.36 \leq \mu_D \leq 14.92$$
With 90% confidence, the mean attitude measurement after viewing the film exceeds the mean attitude measurement before viewing by between 0.36 and 14.92 units.
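The paired interval can be recomputed directly from the before and after scores in the attitude example. The sketch below uses only the Python standard library and takes the critical value 1.833 from the slide; small differences in the last digit against the slide's limits come from rounding $s_D$ to 12.57.

```python
import math
import statistics

before = [41, 60.3, 23.9, 36.2, 52.7, 22.5, 67.5, 50.3, 50.9, 24.6]
after = [46.9, 64.5, 33.3, 36, 43.5, 56.8, 60.7, 57.3, 65.4, 41.9]

# Work with the per-subject differences D_i = Y_i - X_i, not the two columns.
diffs = [y - x for x, y in zip(before, after)]
n = len(diffs)
d_bar = statistics.mean(diffs)
s_d = statistics.stdev(diffs)  # sample standard deviation (divides by n - 1)

T_CRIT = 1.833                 # t_{0.05, 9}, taken from the slide
margin = T_CRIT * s_d / math.sqrt(n)
ci = (d_bar - margin, d_bar + margin)
print(round(d_bar, 2), round(s_d, 2), ci)
```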
-
EXAMPLE
How can we design an experiment to show which of two types of tires is better? Install one type of tire on one wheel and the other on the other (front) wheels. The average tire (lifetime) distance (in 1000s of miles) is $\bar{X}_D = 4.55$, with a sample difference s.d. of $s_D = 7.22$. There are a total of n = 20 observations.
-
SOLUTION
$H_0: \mu_D = 0$
$H_A: \mu_D > 0$
Test statistic:
$$t = \frac{\bar{x}_D - \mu_D}{s_D/\sqrt{n}} = \frac{4.55 - 0}{7.22/\sqrt{20}} = 2.82$$
Reject $H_0$ if $t > t_{.05,19} = 1.729$.
Conclusion: Since $2.82 > 1.729$, reject $H_0$ at $\alpha = 0.05$.
-
EXAMPLE
It is claimed that an industrial safety program is effective in reducing the loss of working hours due to factory accidents. The following data are collected concerning the weekly loss of working hours due to accidents in six plants both before and after the safety program is instituted.
-
Loss of working hours
Plant   1   2   3   4   5   6
Before  12  30  15  37  29  15
After   10  29  16  35  26  16
Do the data substantiate the claim? Use $\alpha = 0.05$.
-
ANSWER
This is a matched pair experiment because the samples from the two populations are not independent.
Loss of working hours
Plant       1   2   3   4   5   6
Difference  2   1   -1  2   3   -1
$\bar{x}_D = 1$, $s_D = 1.67$, $n = 6$
-
Let $\mu_1$ denote the average loss of working hours due to factory accidents before the safety program.
Let $\mu_2$ denote the average loss of working hours due to factory accidents after the safety program.
Also let $\mu_D = \mu_1 - \mu_2$. Then,
$H_0: \mu_D = 0$
$H_A: \mu_D > 0$
-
Test statistic:
$$t = \frac{\bar{x}_D}{s_D/\sqrt{n}} = \frac{1}{1.67/\sqrt{6}} = 1.47$$
Rejection region: $t > t_{\alpha,\,n-1} = t_{0.05,\,5} = 2.015$
Conclusion: Do not reject $H_0$ at $\alpha = 0.05$ because $t = 1.47 < t_{0.05,\,5} = 2.015$. There is not sufficient evidence to conclude that the mean loss of working hours due to factory accidents is reduced after the safety program.
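The safety-program test can be sketched from the raw before and after figures. Only the standard library is used, and the critical value 2.015 is copied from the slide; the unrounded statistic differs slightly from the slide's 1.47, which uses $s_D$ rounded to 1.67.

```python
import math
import statistics

before = [12, 30, 15, 37, 29, 15]
after = [10, 29, 16, 35, 26, 16]

# D = before - after, so a positive mean difference means fewer hours lost.
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)
d_bar = statistics.mean(diffs)
s_d = statistics.stdev(diffs)

t_stat = (d_bar - 0) / (s_d / math.sqrt(n))
T_CRIT = 2.015             # t_{0.05, 5}, taken from the slide
reject = t_stat > T_CRIT   # upper-tail test of H_A: mu_D > 0
print(diffs, round(t_stat, 2), reject)
```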
-
PAIRED DATA AND TWO SAMPLE t PROCEDURE
The two-sample t test is based on the assumption of independence.
In many paired experiments, there is a strong dependence between variables.
-
Inference About the Difference of Two Population Proportions
Population 1: parameter $p_1$; sample size $n_1$; statistic $\hat{p}_1$
Population 2: parameter $p_2$; sample size $n_2$; statistic $\hat{p}_2$
-
Inference about the difference between two population proportions
In this section we deal with two populations whose data are nominal.
For nominal data we compare the population proportions of the occurrence of a certain event.
Examples
Comparing the effectiveness of a new drug versus an older one
Comparing market share before and after an advertising campaign
Comparing defective rates between two machines
-
Parameter and Statistic
Parameter
When the data are nominal, we can only count the occurrences of a certain event in the two populations, and calculate proportions.
The parameter is therefore $p_1 - p_2$.
Statistic
An unbiased estimator of $p_1 - p_2$ is $\hat{p}_1 - \hat{p}_2$ (the difference between the sample proportions).
-
Sampling Distribution of $\hat{p}_1 - \hat{p}_2$
Two random samples are drawn from two populations.
The number of successes in each sample is recorded.
The sample proportions are computed.
Sample 1: sample size $n_1$; number of successes $x_1$; sample proportion $\hat{p}_1 = x_1/n_1$
Sample 2: sample size $n_2$; number of successes $x_2$; sample proportion $\hat{p}_2 = x_2/n_2$
-
SAMPLING DISTRIBUTION OF $\hat{p}_1 - \hat{p}_2$
A point estimator of $p_1 - p_2$ is
$$\hat{p}_1 - \hat{p}_2 = \frac{x_1}{n_1} - \frac{x_2}{n_2}$$
The sampling distribution of $\hat{p}_1 - \hat{p}_2$ is
$$\hat{p}_1 - \hat{p}_2 \sim N\left(p_1 - p_2,\ \frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}\right)$$
if $n_i p_i \geq 5$ and $n_i(1 - p_i) \geq 5$, $i = 1, 2$.
-
The z-statistic
$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}}$$
Because $p_1$ and $p_2$ are unknown, the standard error must be estimated using the sample proportions. The method depends on the null hypothesis.
-
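Testing $p_1 - p_2$
There are two cases to consider:
Case 1: $H_0: p_1 - p_2 = 0$
Calculate the pooled proportion
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$
Then
$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
Case 2: $H_0: p_1 - p_2 = D$ (D is not equal to 0)
Do not pool the data:
$$\hat{p}_1 = \frac{x_1}{n_1}, \qquad \hat{p}_2 = \frac{x_2}{n_2}$$
Then
$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - D}{\sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}}$$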
-
EXAMPLE (CASE 1)
A manufacturer claims that, compared with his closest competitor, fewer of his employees are union members. Out of 318 of his employees, 117 are unionists. From a sample of 255 of the competitor's labor force, 109 are union members. Perform a test at $\alpha = 0.05$.
$p_1$: the proportion of the manufacturer's employees that are union members.
$p_2$: the proportion of his closest competitor's employees that are union members.
-
SOLUTION
$H_0: p_1 - p_2 = 0$
$H_A: p_1 - p_2 < 0$
$\hat{p}_1 = \dfrac{x_1}{n_1} = \dfrac{117}{318}$ and $\hat{p}_2 = \dfrac{x_2}{n_2} = \dfrac{109}{255}$, so the pooled sample proportion is
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{117 + 109}{318 + 255} = 0.39$$
Test Statistic:
$$z = \frac{(117/318 - 109/255) - 0}{\sqrt{(0.39)(1 - 0.39)\left(\dfrac{1}{318} + \dfrac{1}{255}\right)}} = -1.4518$$
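A short sketch of the Case 1 (pooled) calculation for this example is given below. The name `pooled_z` is illustrative. The code keeps the pooled proportion unrounded, so its statistic is about -1.45 rather than exactly the slide's -1.4518, which uses $\hat{p}$ rounded to 0.39.

```python
import math


def pooled_z(x1, n1, x2, n2):
    """z statistic for H0: p1 - p2 = 0, using the pooled proportion."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se, p_pool


z, p_pool = pooled_z(117, 318, 109, 255)

Z_CRIT = 1.645        # z_{0.05}
reject = z < -Z_CRIT  # lower-tail test of H_A: p1 - p2 < 0
print(round(z, 2), round(p_pool, 2), reject)
```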
-
Decision Rule: Reject $H_0$ if $z < -z_{0.05} = -1.645$.
Conclusion: Because $z = -1.4518 > -z_{0.05} = -1.645$, do not reject $H_0$ at $\alpha = 0.05$. There is insufficient evidence to support the manufacturer's claim.
-
Testing $p_1 - p_2$ (Case 1)
The marketing manager needs to decide which of two new packaging designs to adopt, to help improve sales of his company's soap.
A study is performed in two supermarkets:
Brightly-colored packaging is distributed in supermarket 1.
Simple packaging is distributed in supermarket 2.
The first design is more expensive; therefore, to be financially viable it has to outsell the second design.
-
Testing $p_1 - p_2$ (Case 1)
Summary of the experiment results
Supermarket 1: 180 purchasers of Johnson Brothers soap out of a total of 904
Supermarket 2: 155 purchasers of Johnson Brothers soap out of a total of 1,038
Use a 5% significance level and perform a test to find which type of packaging to use.
-
Testing $p_1 - p_2$ (Case 1)
Solution
The problem objective is to compare the population of sales of the two packaging designs.
The data are nominal (Johnson Brothers or other soap).
Population 1: purchases at supermarket 1
Population 2: purchases at supermarket 2
The hypotheses are
$H_0: p_1 - p_2 = 0$
$H_1: p_1 - p_2 > 0$
We identify this application as case 1.
-
Testing $p_1 - p_2$ (Case 1)
Compute: Manually
For a 5% significance level the rejection region is $z > z_\alpha = z_{.05} = 1.645$.
The sample proportions are $\hat{p}_1 = 180/904 = .1991$ and $\hat{p}_2 = 155/1{,}038 = .1493$.
The pooled proportion is
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{180 + 155}{904 + 1{,}038} = .1725$$
The z statistic becomes
$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{.1991 - .1493}{\sqrt{.1725(1 - .1725)\left(\dfrac{1}{904} + \dfrac{1}{1{,}038}\right)}} = 2.90$$
-
Testing $p_1 - p_2$ (Case 1)
Excel (Data Analysis Plus)
z-Test: Two Proportions
                         Supermarket 1  Supermarket 2
Sample Proportions       0.1991         0.1493
Observations             904            1038
Hypothesized Difference  0
z Stat                   2.90
P(Z
Conclusion: There is sufficient evidence to conclude at the 5% significance level that the brightly-colored design will outsell the simple design.
-
Testing $p_1 - p_2$ (Case 2)
The bath soap of Johnson Brother Company is not selling well. Hoping to improve sales, the company's advertising agency developed two new designs. The first design features several bright colors and the second design is light green in color with the company's logo on it. Management needs to decide which of two new packaging designs to adopt, to help improve sales of a certain soap.
A study is performed in two supermarkets.
For the brightly-colored design to be financially viable, it has to outsell the simple design by at least 3%.
-
Testing $p_1 - p_2$ (Case 2)
Summary of the experiment results
Supermarket 1: 180 purchasers of Johnson Brothers soap out of a total of 904
Supermarket 2: 155 purchasers of Johnson Brothers soap out of a total of 1,038
Use a 5% significance level and perform a test to find which type of packaging to use.
-
Testing $p_1 - p_2$ (Case 2)
Solution
The hypotheses to test are
$H_0: p_1 - p_2 = .03$
$H_1: p_1 - p_2 > .03$
We identify this application as case 2 (the hypothesized difference is not equal to zero).
-
Testing $p_1 - p_2$ (Case 2)
Compute: Manually
$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - D}{\sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}} = \frac{\left(\dfrac{180}{904} - \dfrac{155}{1{,}038}\right) - .03}{\sqrt{\dfrac{.1991(1 - .1991)}{904} + \dfrac{.1493(1 - .1493)}{1{,}038}}} = 1.15$$
The rejection region is $z > z_\alpha = z_{.05} = 1.645$.
Conclusion: Since $1.15 < 1.645$, do not reject the null hypothesis. There is insufficient evidence to infer that the brightly-colored design will outsell the simple design by 3% or more.
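The Case 2 statistic does not pool the samples, because the null hypothesis no longer says the two proportions are equal. A minimal sketch with the supermarket counts follows; `unpooled_z` and `d0` are illustrative names. Unrounded, the statistic is about 1.145, which the slides report as 1.15 by hand and 1.14 in Excel.

```python
import math


def unpooled_z(x1, n1, x2, n2, d0):
    """z statistic for H0: p1 - p2 = d0 with d0 != 0 (no pooling)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    return ((p1_hat - p2_hat) - d0) / se


z = unpooled_z(180, 904, 155, 1038, d0=0.03)

Z_CRIT = 1.645       # z_{0.05}
reject = z > Z_CRIT  # upper-tail test of H1: p1 - p2 > 0.03
print(round(z, 3), reject)
```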
-
Testing $p_1 - p_2$ (Case 2)
Using Excel (Data Analysis Plus)
z-Test: Two Proportions
                         Supermarket 1  Supermarket 2
Sample Proportions       0.1991         0.1493
Observations             904            1038
Hypothesized Difference  0.03
z Stat                   1.14
P(Z
-
ESTIMATING $p_1 - p_2$
$100(1 - \alpha)\%$ Confidence Interval for $p_1 - p_2$:
$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$$
where $\hat{q}_i = 1 - \hat{p}_i$.
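To illustrate the interval formula, the sketch below applies it to the supermarket counts from the packaging example (180 of 904 and 155 of 1,038). The slides do not work this interval out, so treat the call as an illustration of the formula only. The function name is made up, and 1.96 is the standard $z_{.025}$ value for 95% confidence.

```python
import math


def proportion_diff_ci(x1, n1, x2, n2, z_crit=1.96):
    """Confidence interval for p1 - p2 from unpooled sample proportions."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z_crit * se, diff + z_crit * se


lo, hi = proportion_diff_ci(180, 904, 155, 1038)
print(round(lo, 4), round(hi, 4))
```

The standard error here is the same unpooled one used in the Case 2 test statistic.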
-
EXAMPLE
An antibiotic for pneumonia was injected into 100 patients with kidney malfunctions (called uremic patients) and 100 patients with no kidney malfunctions (called normal patients). Some allergic reaction developed in 38 of the uremic patients and 21 of the normal patients.
-
a) Do the data provide strong evidence that the rate of incidence of allergic reaction to the antibiotic is higher in uremic patients than in normal patients?
Let $p_1$: the rate of incidence of allergic reaction to the antibiotic in uremic patients and
$p_2$: the rate of incidence of allergic reaction to the antibiotic in normal patients.
b) Construct a 95% confidence interval for the difference between the population proportions and interpret the result.