seminar 3 solution 2015

Seminar 3

Question 1 (Chapter 9: one-tail test, form of hypothesis, mean)

The owner of a local nightclub has recently surveyed a random sample of n =

250 customers of the club. She would now like to determine whether or not

the mean age of her customers is greater than 30. If so, she plans to alter the

entertainment to appeal to an older crowd. If not, no entertainment changes

will be made.

1.1 The appropriate hypotheses to test are

Answer: H0: μ ≤ 30 versus H1: μ > 30.

1.2 Using the sample information provided, calculate the value of the test

statistic.

Answer:

t = (30.45 - 30) / (5 / SQRT(250)) = 1.42

1.3 Suppose she found that the sample mean was 30.45 years and the sample standard deviation was 5 years. If she wants a level of significance of 0.01, what is the rejection region? (One-tail test)

Answer :

Reject H0 if t > 2.3263 (use Table E.3)

1.4 Suppose the test statistic does fall in the rejection region at α = 0.05. What decision should she make?

Answer :

Do not reject H0, because the t statistic is less than the critical value 2.3263.
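The arithmetic behind 1.2 to 1.4 can be checked with a short Python sketch. It assumes the test statistic is formed as the sample mean minus the hypothesized mean (matching the upper-tail rejection rule) and takes the critical value 2.3263 from the answer to 1.3 rather than recomputing it; the variable names are illustrative only.

```python
import math

# Values taken from Question 1; variable names are illustrative.
n = 250
sample_mean = 30.45
sample_sd = 5.0
mu_0 = 30.0          # hypothesized mean under H0
t_critical = 2.3263  # critical value quoted in the answer to 1.3

# Standard error of the mean and the one-sample t statistic.
standard_error = sample_sd / math.sqrt(n)
t_stat = (sample_mean - mu_0) / standard_error

# Upper-tail test: reject H0 only if t exceeds the critical value.
reject_h0 = t_stat > t_critical

print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```

With these inputs the statistic stays below the critical value, which agrees with the decision not to reject H0.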

Question 2 (Chapter 9: one-tail test, proportion, form of hypothesis)

A major Blu-ray rental chain is considering opening a new store in an area

that currently does not have any such stores. The chain will open if there is

evidence that more than 5,000 of the 20,000 households in the area are

equipped with Blu-ray players. It conducts a telephone poll of 300 randomly

selected households in the area and finds that 96 have Blu-ray players.

2.1 State the test of the hypothesis that is of interest to the rental chain.

Answer:

Proportion = 5000/20000 = 0.25

H0: π ≤ 0.25 versus H1: π > 0.25

2.2 The value of the test statistic in this problem is approximately equal to

Answer: 2.8

2.3 Given the level of significance α = 0.05, the p-value associated with the test statistic in this problem is approximately equal to

Answer: 0.0026

Note: The cumulative probability for Z = 2.8 from Table E.2 is 0.9974, so the p-value = 1 - 0.9974 = 0.0026

2.4 The rental chain's conclusion from the hypothesis test using a 5% level of

significance is____. Why?

Answer :

The rental chain's conclusion is to open a new store: the p-value for Z = 2.8 (0.0026) is less than 0.05, so H0 is rejected in favor of H1.
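A minimal Python sketch of the Question 2 calculation follows. It assumes the usual large-sample Z test for a proportion, with the standard error computed under the hypothesized proportion, and uses the standard library's NormalDist in place of Table E.2; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Values taken from Question 2; variable names are illustrative.
households_with_player = 96
sample_size = 300
pi_0 = 5000 / 20000  # hypothesized proportion under H0 (0.25)

# Sample proportion and Z statistic, using the standard error under H0.
p_hat = households_with_player / sample_size
standard_error = math.sqrt(pi_0 * (1 - pi_0) / sample_size)
z_stat = (p_hat - pi_0) / standard_error

# Upper-tail p-value from the standard normal distribution.
p_value = 1 - NormalDist().cdf(z_stat)

print(f"Z = {z_stat:.2f}, p-value = {p_value:.4f}")
```

The computed p-value is well below 0.05, in line with the decision to reject H0.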

Question 3 (Chapter 10: pooled variance, t test, difference between two means)

Are Japanese managers more motivated than American managers? A

randomly selected group of each were administered the Sarnoff Survey of

Attitudes Toward Life (SSATL), which measures motivation for upward mobility.

The SSATL scores are summarized below. Given the level of significance at

0.05.

                         American   Japanese
Sample Size              211        100
Sample Mean SSATL Score  65.75      79.83
Sample Std. Dev.         11.07      6.41

3.1 Referring to the table above, the researcher was attempting to examine whether the Japanese managers are more motivated than American managers. What is an appropriate alternative hypothesis?

Answer:

H1: μ(Japanese) > μ(American)

3.2 From the analysis in table above, the correct test statistic is

Answer: -11.76

3.3 The proper conclusion for this test is

Answer:

At the α = 0.05 level, the p-value (6.150883E-27) is lower than 0.05, so the null hypothesis is rejected. The proper conclusion is that the evidence indicates Japanese managers are more motivated than American managers.
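The pooled-variance t statistic for Question 3 can be sketched in Python from the summary table. This assumes equal population variances and the American-minus-Japanese ordering implied by the negative sign of the reported statistic. Computed from the rounded summary values, the result is roughly -11.8, close to the reported -11.76; the small gap may come from rounding in the published summary statistics, which is an assumption rather than something the source states.

```python
import math

# Summary statistics from the Question 3 table; names are illustrative.
n_am, mean_am, sd_am = 211, 65.75, 11.07   # American managers
n_jp, mean_jp, sd_jp = 100, 79.83, 6.41    # Japanese managers

# Pooled variance: weighted average of the two sample variances.
df = n_am + n_jp - 2
pooled_var = ((n_am - 1) * sd_am**2 + (n_jp - 1) * sd_jp**2) / df

# Standard error of the difference in means and the t statistic
# (American minus Japanese, an assumed ordering).
standard_error = math.sqrt(pooled_var * (1 / n_am + 1 / n_jp))
t_stat = (mean_am - mean_jp) / standard_error

print(f"df = {df}, pooled variance = {pooled_var:.2f}, t = {t_stat:.2f}")
```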

Question 4 (Chapter 10: Z test, difference between two proportions, form of

hypothesis)

The Wall Street Journal recently published an article indicating differences in

perception of sexual harassment on the job between men and women. The

article claimed that women perceived the problem to be much more

prevalent than did men. One question asked of both men and women was:

"Do you think sexual harassment is a major problem in the American

workplace?" 24% of the men compared to 62% of the women responded

"Yes." Assuming W designates women's responses and M designates men's,

4.1 What hypothesis should The Wall Street Journal test in order to show that its claim is true?

Answer:

H0: πW - πM ≤ 0 versus H1: πW - πM > 0

4.2 Suppose that 150 women and 200 men were interviewed. What is the

value of the test statistic?

Answer: 7.173 (see more details in the table below)

4.3 Suppose that 150 women and 200 men were interviewed. For a 0.01 level

of significance, what is the critical value for the rejection region?

Answer: 2.33 (see more details in the table below)

4.4 Construct a 99% confidence interval estimate of the difference between

the proportion of women and men who think sexual harassment is a major

problem in the American workplace.

Answer: 0.25 to 0.51 (see more details in the table below)

4.5 Construct a 95% confidence interval estimate of the difference between

the proportion of women and men who think sexual harassment is a major

problem in the American workplace.

Answer: 0.28 to 0.48
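The answers to 4.2 through 4.5 can be reproduced with the Python sketch below. It assumes the standard approach: a pooled proportion for the test statistic and unpooled standard errors for the confidence intervals, with the standard library's NormalDist supplying the critical values. The function and variable names are illustrative.

```python
import math
from statistics import NormalDist

# Sample sizes and proportions from Question 4; names are illustrative.
n_w, p_w = 150, 0.62   # women
n_m, p_m = 200, 0.24   # men
diff = p_w - p_m

# Z test statistic uses the pooled proportion under H0.
pooled = (n_w * p_w + n_m * p_m) / (n_w + n_m)
se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n_w + 1 / n_m))
z_stat = diff / se_pooled

# Upper-tail critical value at the 0.01 level of significance.
z_crit = NormalDist().inv_cdf(1 - 0.01)

# Confidence intervals use the unpooled standard error.
se_unpooled = math.sqrt(p_w * (1 - p_w) / n_w + p_m * (1 - p_m) / n_m)


def confidence_interval(level):
    """Two-sided interval for the difference in proportions (women minus men)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return diff - z * se_unpooled, diff + z * se_unpooled


ci_99 = confidence_interval(0.99)
ci_95 = confidence_interval(0.95)

print(f"Z = {z_stat:.3f}, critical value = {z_crit:.2f}")
print(f"99% CI: {ci_99[0]:.2f} to {ci_99[1]:.2f}")
print(f"95% CI: {ci_95[0]:.2f} to {ci_95[1]:.2f}")
```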

Question 5 (Chapter 11: One way ANOVA, Mean squares in One way

ANOVA, F statistics, Tukey-Kramer procedure)

5.1 In a one-way ANOVA, if the computed F statistic exceeds the critical F value, what decision should you make?

Answer :

The computed F statistic exceeds the critical F value, so it falls into the region of rejection. Reject H0, since there is evidence of a treatment effect.

5.2 Why would you use the Tukey-Kramer procedure?

Answer: The Tukey-Kramer procedure is used to make comparisons between all pairs of groups, that is, to test which pairwise differences in means are significant.

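5.3 How do you calculate the F test statistic in a one-way ANOVA?

Answer:

F test statistic = MSA / MSW

Or

F test statistic = [SSA / (c - 1)] / [SSW / (n - c)]

where MSA is the mean square among groups and MSW is the mean square within groups.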

5.4 How do you calculate the degrees of freedom for the F test in a one-way ANOVA?

Answer: Among groups: (c - 1)

Within groups: (n - c)

Total: (n - 1)

where c is the number of groups and n is the total number of observations.

Question 6 (Chapter 11: Analysis of variance summary table page 423)

A research company wants to compare the mean sales-to-appraisal ratios of

residential properties sold in four neighborhoods (G1, G2, G3, and G4). Four

properties are randomly selected from each neighborhood and the ratios

recorded for each, as shown below.

G1: 1.2, 1.1, 0.9, 0.4
G2: 1.0, 1.5, 1.1, 1.3
G3: 2.5, 2.1, 1.9, 1.6
G4: 0.8, 1.3, 1.1, 0.7

Interpret the results of the analysis summarized in the following table (Hint: Table 11.1, page 423):

Source          df    SS        MS        F       P-value
Neighborhoods         3.1819    1.0606    10.76   0.001
Error           12
Total                 4.3644

6.1 Referring to table above, the among group degrees of freedom is

Answer: 3

6.2 Referring to table above, the within group sum of squares is

Answer: 4.3644-3.1819=1.1825

6.3 Referring to table above, the within group mean squares is

Answer: 1.1825/12 = 0.098542

6.4 At the 0.05 level of significance, what conclusion can you make?

Answer: The P-value (0.001) is less than the significance level (0.05), so the null hypothesis is rejected. You can conclude that the mean ratios for the 4 neighborhoods are not all the same.
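The missing cells of the ANOVA table can be rebuilt from the raw ratios with a short Python sketch. It assumes the usual one-way ANOVA decomposition (among-group and within-group sums of squares, with c - 1 and n - c degrees of freedom); the variable names are illustrative.

```python
# Sales-to-appraisal ratios from Question 6, keyed by neighborhood.
groups = {
    "G1": [1.2, 1.1, 0.9, 0.4],
    "G2": [1.0, 1.5, 1.1, 1.3],
    "G3": [2.5, 2.1, 1.9, 1.6],
    "G4": [0.8, 1.3, 1.1, 0.7],
}

all_values = [x for values in groups.values() for x in values]
n = len(all_values)   # total number of observations
c = len(groups)       # number of groups
grand_mean = sum(all_values) / n

# Among-group sum of squares: group size times squared deviation of the
# group mean from the grand mean.
ssa = sum(
    len(values) * (sum(values) / len(values) - grand_mean) ** 2
    for values in groups.values()
)

# Within-group sum of squares: squared deviations from each group's own mean.
ssw = sum(
    (x - sum(values) / len(values)) ** 2
    for values in groups.values()
    for x in values
)

sst = ssa + ssw
df_among, df_within = c - 1, n - c
msa = ssa / df_among
msw = ssw / df_within
f_stat = msa / msw

print(f"df: among = {df_among}, within = {df_within}, total = {n - 1}")
print(f"SSA = {ssa:.4f}, SSW = {ssw:.4f}, SST = {sst:.4f}")
print(f"MSA = {msa:.4f}, MSW = {msw:.4f}, F = {f_stat:.2f}")
```

The sums of squares, among-group mean square, and F value printed here match the entries given in the table.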

Question 7 (Chapter 13 : Regression)

7.1 What is the Y-intercept (b0)? What does it represent?

Answer :

The Y-intercept (b0) is the predicted value of Y when X = 0; that is, it represents the estimated average Y when X = 0.

7.2 The managers of a brokerage firm are interested in finding out if the

number of new clients a broker brings into the firm affects the sales

generated by the broker. They sample 12 brokers and determine the number

of new clients they have enrolled in the last year and their sales amounts in

thousands of dollars. These data are presented in the table that follows.

Broker  Clients  Sales
1       27       52
2       11       37
3       42       64
4       33       55
5       15       29
6       15       34
7       25       58
8       36       59
9       28       44
10      30       48
11      17       31
12      22       38

Referring to the table, what is the estimated slope parameter for the sales

generated by the broker?

Answer : The estimated slope parameter for the sales generated by the broker is b1 = 1.1186 (Topics: Section 13.2)

The link below is a quick guide to running a regression with MS Excel:
https://faculty.fuqua.duke.edu/~pecklund/ExcelReview/Use%20Excel%202007%20Regression.pdf

7.3 Referring to the table, what is the estimated average change in sales if the number of clients goes up by 1?

Answer : Ŷi = 17.6919 + 1.1186 Xi

If the number of clients goes up by 1, the estimated average sales increase by 1.1186 (thousand dollars).
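The slope and intercept quoted in 7.2 and 7.3 can be reproduced with a minimal least-squares sketch in Python. It assumes simple linear regression of Sales on Clients using the textbook sums-of-squares formulas, instead of the Excel output the seminar refers to; the variable names are illustrative.

```python
# Broker data from the Question 7 table.
clients = [27, 11, 42, 33, 15, 15, 25, 36, 28, 30, 17, 22]
sales = [52, 37, 64, 55, 29, 34, 58, 59, 44, 48, 31, 38]  # thousands of dollars

n = len(clients)
mean_x = sum(clients) / n
mean_y = sum(sales) / n

# Sum of cross-products and sum of squares of X about their means.
ssxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(clients, sales))
ssx = sum((x - mean_x) ** 2 for x in clients)

# Least-squares estimates of the slope and the Y-intercept.
b1 = ssxy / ssx
b0 = mean_y - b1 * mean_x

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
```

The slope b1 is the estimated change in sales for one additional client, which is why 7.3 reads its answer directly off the fitted equation.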

7.4 Referring to the table, what is the coefficient of correlation for these

data?

Answer :

r = ±SQRT(r²) = ±SQRT(0.7846) = ±0.886, and r takes the sign of the slope b1:

r = +0.886 if b1 > 0

r = -0.886 if b1 < 0

Here b1 = 1.1186 > 0, so r = 0.886

(More detail on page 530)

7.5 Referring to the table, what percentage of the total variation in sales is

explained by clients?

Topics: Section 13.3

Answer : The percentage of the total variation in sales that is explained by clients is R square, which equals 78.46%. For multiple regression, the corresponding measure is the coefficient of multiple determination, usually reported alongside the Adjusted R square (see more details in section 14.2).

7.6 Referring to the table, what is the standard error of the estimate, SYX, for the data? (Formula is on page 516)

Answer : SYX = SQRT(SSE / (n - 2)) = SQRT(336.869 / 10) = 5.804

7.7 Referring to the table, what is the standard error of the regression slope

estimate, Sb1?

Answer:

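The standard error of the regression slope estimate is Sb1 = SYX / SQRT(SSX) = 5.804 / SQRT(980.92) = 0.1853, where SSX is the sum of squared deviations of Clients from its mean.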

7.8 How do you measure the variation in regression? (Hint: See more details on page 514)

Answer:

The measures of variation in regression are related by

SST = SSR + SSE

(total sum of squares = regression sum of squares + error sum of squares)

SST = 1227.38 + 336.869 = 1564.25

(See more details on page 514)
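The quantities in 7.4 through 7.8 all follow from the same sums of squares, as the Python sketch below illustrates. It assumes the standard simple-regression formulas (r² = SSR/SST, SYX = SQRT(SSE/(n - 2)), Sb1 = SYX/SQRT(SSX)) and refits the line from the broker data so that it runs on its own; the variable names are illustrative.

```python
import math

# Broker data from the Question 7 table.
clients = [27, 11, 42, 33, 15, 15, 25, 36, 28, 30, 17, 22]
sales = [52, 37, 64, 55, 29, 34, 58, 59, 44, 48, 31, 38]

n = len(clients)
mean_x = sum(clients) / n
mean_y = sum(sales) / n
ssx = sum((x - mean_x) ** 2 for x in clients)
ssxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(clients, sales))

# Least-squares line and its predicted values.
b1 = ssxy / ssx
b0 = mean_y - b1 * mean_x
predicted = [b0 + b1 * x for x in clients]

# Measures of variation: total, regression, and error sums of squares.
sst = sum((y - mean_y) ** 2 for y in sales)
ssr = sum((y_hat - mean_y) ** 2 for y_hat in predicted)
sse = sum((y - y_hat) ** 2 for y, y_hat in zip(sales, predicted))

# Coefficient of determination, and correlation carrying the sign of b1.
r_squared = ssr / sst
r = math.copysign(math.sqrt(r_squared), b1)

# Standard error of the estimate and of the slope.
s_yx = math.sqrt(sse / (n - 2))
s_b1 = s_yx / math.sqrt(ssx)

print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, SSE = {sse:.3f}")
print(f"r^2 = {r_squared:.4f}, r = {r:.3f}")
print(f"S_YX = {s_yx:.3f}, S_b1 = {s_b1:.4f}")
```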

Question 8 (Chapter 14: r-square, adjusted r-square)

8.1 In a multiple regression, what is the coefficient of multiple determination?

What is the value of the coefficient of multiple determination? How to

interpret this value?


Answer :

The coefficient of multiple determination represents the proportion of the variation in Y that is explained by the set of independent variables (X1, X2, ..., Xk).

The value of the coefficient of multiple determination must fall between 0 and +1.

r² = SSR / SST = (regression sum of squares) / (total sum of squares)

If r² = 0.78, the interpretation is: the coefficient of multiple determination indicates that 78% of the variation in Y (e.g., sales) is explained by the variation in the set of Xs (e.g., price, promotional expenditures).

8.2 What are the differences between R-squared and Adjusted R-squared?


Answer :

R-squared assumes that every X (independent variable) in the model helps to explain variation in Y (the dependent variable). It gives the percentage of variation in Y that is explained by the prediction equation (the set of Xs).

Adjusted R-squared adjusts that percentage so that it reflects only the Xs (independent variables, IVs) that truly affect Y. It does so by taking into account both the sample size n and the number of IVs k (formula: Adjusted R-squared = 1 - [(1 - r²)(n - 1) / (n - k - 1)]).

With a sufficiently large sample size and a sufficiently small number of IVs, the Adjusted R-squared and R-squared will be nearly equal. When the sample size is small and/or there are a large number of IVs, the Adjusted R-squared will be noticeably smaller, because it penalizes you for adding independent variables that do not belong in the model. So you can expect the value of the Adjusted R-squared to be less than or equal to the value of R-squared.

(See more details: http://www.bus.ucf.edu/faculty/rhofler/file.axd?file=2012%2F2%2FR2+vs+adj+R2.pdf)
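The effect of sample size on the adjustment can be illustrated with a small Python sketch. It assumes the standard formula, Adjusted R-squared = 1 - (1 - r²)(n - 1)/(n - k - 1), where n is the sample size and k the number of independent variables. The value r² = 0.78 comes from the example in 8.1, while the sample sizes and k below are hypothetical, chosen only to show the pattern.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R-squared for n observations and k independent variables."""
    if n - k - 1 <= 0:
        raise ValueError("n must be greater than k + 1")
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


# r_squared = 0.78 is the example value from 8.1; n and k are hypothetical.
small_sample = adjusted_r_squared(0.78, n=15, k=2)
large_sample = adjusted_r_squared(0.78, n=500, k=2)

print(f"n = 15:  adjusted R-squared = {small_sample:.4f}")
print(f"n = 500: adjusted R-squared = {large_sample:.4f}")
```

With the small sample the adjusted value drops visibly below 0.78, while with the large sample it stays very close to it, which is the behavior described above.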
