chapter1 notes - applied statistics

CHAPTER 1 PARAMETER ESTIMATION Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 1 PARAMETER ESTIMATION

Upload: secular-partisan

Post on 18-Apr-2015


TRANSCRIPT

Page 1: Chapter1 Notes - Applied Statistics

CHAPTER 1

PARAMETER ESTIMATION

Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc.


Page 2: Chapter1 Notes - Applied Statistics


INTRODUCTION

Parameter estimation is the first step in inferential statistics. In other words, it is the process of estimating the value of a parameter using information obtained from a sample.

The process that acquires information from samples and uses that information to draw conclusions about populations is called statistical inference.

To do statistical inference, we need the skills and knowledge of descriptive statistics, probability distributions, and sampling distributions. The process is summarized in Figure 1.1.

Page 3: Chapter1 Notes - Applied Statistics

INTRODUCTION (cont..)

The objective of estimation is to determine the approximate value of a population parameter on the basis of a sample statistic.

There are two approaches to parameter estimation:

i) Point estimation

• A point estimate is a single value, so it either matches the true value exactly or misses it.

• Note that true value = parameter value.

ii) Interval estimation

Page 4: Chapter1 Notes - Applied Statistics

INTRODUCTION (cont..)

An estimator is the statistic used to obtain the point estimate.

An estimate is a specific value or range of values used to approximate some population parameter.

Why do we estimate? Can we get the exact value from the population?

Page 5: Chapter1 Notes - Applied Statistics

INTRODUCTION (cont..)

Sampling process

Figure 1.1: Relationship between parameter and statistic. A census of the whole population (e.g. all UUM students) collects data that give the population measurement, the PARAMETER. A survey of a sample taken at random (part of the population units, e.g. a number of UUM students) collects data that give the sample measurement, the STATISTIC, which is used to estimate the parameter.

Page 6: Chapter1 Notes - Applied Statistics

POINT ESTIMATION

A point estimate is a specific numerical value, a single value (or point), used to approximate a population parameter.

A point estimator draws inferences about a population by estimating the value of an unknown parameter using a single value or point.

Page 7: Chapter1 Notes - Applied Statistics

POINT ESTIMATION (CONT…)

Table 1.1: Symbols for parameters and statistics

Parameter                        Statistic / estimator
mean, $\mu$                      mean, $\bar{x}$
variance, $\sigma^2$             variance, $s^2$
standard deviation, $\sigma$     standard deviation, $s$
proportion, $p$                  proportion, $\hat{p}$

Page 8: Chapter1 Notes - Applied Statistics

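POINT ESTIMATION (CONT…)

Table 1.2: Formulas for statistics

Statistic / estimator          Formula
Sample mean                    $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
Sample variance                $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$
Sample standard deviation      $s = \sqrt{s^2}$
Sample proportion              $\hat{p} = \frac{x}{n}$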

Page 9: Chapter1 Notes - Applied Statistics

POINT ESTIMATION (CONT…)

Characteristics of a Good Estimator

The aim of these characteristics is to obtain an estimator whose sampling distribution is centered on the parameter being estimated.

The characteristics include:

• unbiasedness
• consistency
• relative efficiency

Page 10: Chapter1 Notes - Applied Statistics

POINT ESTIMATION (CONT…)

An unbiased estimator of a population parameter is an estimator whose expected value is equal to that parameter.

• An estimator $\hat{\theta}$ is an unbiased estimator of the parameter $\theta$ if $E(\hat{\theta}) = \theta$.

• E.g. the sample mean $\bar{x}$ is an unbiased estimator of the population mean $\mu$, since $E(\bar{x}) = \mu$.
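The unbiasedness of the sample mean follows from linearity of expectation. A short derivation, sketched under the assumption (not stated on the slide) that $x_1, \dots, x_n$ is a random sample from a population with mean $\mu$:

```latex
E(\bar{x}) = E\!\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right)
           = \frac{1}{n}\sum_{i=1}^{n} E(x_i)
           = \frac{1}{n}\,(n\mu)
           = \mu
```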

Page 11: Chapter1 Notes - Applied Statistics

POINT ESTIMATION (CONT…)

An unbiased estimator is said to be consistent if the difference between the estimator and the parameter grows smaller as the sample size grows larger.

E.g. $\bar{x}$ is a consistent estimator of $\mu$ because $V(\bar{x}) = \sigma^2 / n$.

That is, as n grows larger, the variance of $\bar{x}$ grows smaller.

Page 12: Chapter1 Notes - Applied Statistics

POINT ESTIMATION (CONT…)

If there are two unbiased estimators of a parameter, the one whose variance is smaller is said to be relatively efficient.

E.g. for a normal population, both the sample median $\tilde{x}$ and the sample mean $\bar{x}$ are unbiased estimators of the population mean. However, comparing their variances, $V(\bar{x}) < V(\tilde{x})$,

so we choose the mean $\bar{x}$, since it is relatively efficient compared to the sample median $\tilde{x}$.

Page 13: Chapter1 Notes - Applied Statistics

POINT ESTIMATION (CONT…)

Example 1:

A sample of 10 frogs was taken at random and the weight (in grams) of each frog was recorded as below:

250 230 200 210 195 225 200 230 240 190

1. Compute the point estimate for the mean weight of the frogs.

2. Estimate the standard deviation of the weight of the frogs.

3. Estimate the proportion of frogs that weigh not more than 200 grams.

Page 14: Chapter1 Notes - Applied Statistics

Solution:

i. Let $\bar{x}$ represent the mean weight of the frogs:

$\bar{x} = \frac{\sum x_i}{n} = \frac{2170}{10} = 217$ grams

ii. The estimate of the standard deviation of the weight of the frogs is given by

$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}} = \sqrt{\frac{3860}{9}} \approx 20.71$ grams

Page 15: Chapter1 Notes - Applied Statistics

Solution:

iii. Let $x$ be the number of frogs with weight not more than 200 grams and $\hat{p}$ be the point estimate of the proportion of frogs with weight not more than 200 grams.

Frogs with weight not more than 200 grams are: 200, 195, 200, 190. Then $x = 4$.

Thus $\hat{p} = \frac{x}{n} = \frac{4}{10} = 0.4$.

Note: the answers for questions i) and ii) can be found directly from your calculator using the SD mode. Those who use a Casio calculator can refer to Appendix 1 for the complete procedure.
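As an alternative to the calculator, the three point estimates of Example 1 can be checked with a few lines of Python. This is only a sketch using the standard library; the variable names are illustrative and not part of the original notes.

```python
import statistics

# Frog weights in grams, as listed in Example 1
weights = [250, 230, 200, 210, 195, 225, 200, 230, 240, 190]

n = len(weights)
mean_weight = sum(weights) / n           # point estimate of the population mean
std_weight = statistics.stdev(weights)   # sample standard deviation (divides by n - 1)

# Point estimate of the proportion of frogs weighing not more than 200 grams
count_light = sum(1 for w in weights if w <= 200)
p_hat = count_light / n

print(mean_weight)            # 217.0
print(round(std_weight, 2))   # 20.71
print(p_hat)                  # 0.4
```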

Page 16: Chapter1 Notes - Applied Statistics

Example 2:

The ages of 15 students who came to the recreational club during last weekend are given below:

8 15 10 16 17
17 13 12 12 15
15 15 16 16 18

Calculate the point estimate of the:

i. average age of the students
ii. variance of the age of the students
iii. proportion of students aged more than 15 years.

Page 17: Chapter1 Notes - Applied Statistics

Example 3:

A study was done to determine the percentage of UUM’s staff living in Jitra. From a sample of 200 randomly chosen people, 88 live in Jitra.

i. Obtain the point estimate for the percentage of UUM’s staff living in Jitra.

ii. Estimate the mean and the standard deviation of the proportion.

Page 18: Chapter1 Notes - Applied Statistics

Interval Estimation

No matter how good the point estimator is, we have to admit that a point estimate can sometimes give a value that is completely different from the true value.

Besides that, point estimators do not reflect the effect of larger sample sizes.

Thus, it is recommended to use an interval estimator to estimate population parameters, which is less precise but safer.

An interval estimator draws inferences about a population by estimating the value of an unknown parameter using an interval.

Page 19: Chapter1 Notes - Applied Statistics

Interval Estimation (cont…)

The value of an interval estimate lies between lower and upper boundaries. Generally, we write it as

Lower bound < population parameter < Upper bound

If $\hat{\theta}$ is the point estimate of the parameter $\theta$, then the interval estimate is given by

$\hat{\theta} - k\,S(\hat{\theta}) < \theta < \hat{\theta} + k\,S(\hat{\theta})$

where

$S(\hat{\theta})$ is the standard deviation of the estimator, and
$k$ is the critical value from the sampling distribution of the estimator (the distribution can be determined based on the Central Limit Theorem).

Page 20: Chapter1 Notes - Applied Statistics

Interval Estimation (cont…)

Once the interval estimate is obtained, we can conclude (with some ___% of certainty) that the population parameter of interest is between some lower and upper bounds.

In this section we will discuss the interval estimates for the mean and the proportion, as summarized in Figure 1.2.

Page 21: Chapter1 Notes - Applied Statistics

Interval Estimation (cont…)


Figure 1.2: Interval estimation

Page 22: Chapter1 Notes - Applied Statistics

Interval Estimation (cont…)

Interval estimation for mean

Generally, the interval estimator for one population mean is given by

$\bar{x} - Z_{\alpha/2}\,S(\bar{x}) < \mu < \bar{x} + Z_{\alpha/2}\,S(\bar{x})$

or

$\bar{x} \pm Z_{\alpha/2}\,S(\bar{x})$

Note: the Z distribution can be replaced by the t distribution if the conditions for using the Z distribution are not satisfied.

Page 23: Chapter1 Notes - Applied Statistics

Interval Estimation (cont…)

To determine whether to use the Z or t distribution, we follow the Central Limit Theorem.

Figure 1.3: Central Limit Theorem

Page 24: Chapter1 Notes - Applied Statistics

Interval Estimation (cont…)

Characteristics of the Z Distribution

When the population standard deviation is known, or the sample size is greater than or equal to 30, the normal Z distribution can be used.

Figure 1.4: Conditions for using the normal Z distribution

Page 25: Chapter1 Notes - Applied Statistics

Interval Estimation (cont…)

Characteristics of the t Distribution

1. When the population standard deviation is unknown and the sample size is less than 30, the t distribution with $n - 1$ degrees of freedom must be used instead of the Z distribution.

2. The degrees of freedom are the number of values that are free to vary after a sample statistic has been computed.

Page 26: Chapter1 Notes - Applied Statistics

Interval Estimation (cont…)

The t distribution differs from the standard normal distribution in the following ways.

i. The variance is greater than 1.

ii. The t distribution is actually a family of curves based on the concept of degrees of freedom, which is related to sample size. As the sample size increases, the t distribution approaches the standard normal distribution.

Page 27: Chapter1 Notes - Applied Statistics

Interval Estimation (cont…)


Figure 1.5: The Z Normal and t distribution

Page 28: Chapter1 Notes - Applied Statistics

Interval Estimation (cont…)


Figure 1.6: t distribution with different degrees of freedom.

Page 29: Chapter1 Notes - Applied Statistics

Interval Estimation (cont…)

When to use the Z or t distribution?

1. Is the population standard deviation σ known?
   Yes: use the Z distribution no matter what the sample size is.*
   No: go to question 2.

2. Is the sample size n ≥ 30?
   Yes: use the Z distribution and s in place of σ.
   No: use the t distribution and s in the formula.**

* The variable must be normally distributed when n < 30.
** The variable must be approximately normally distributed.

Figure 1.7: Criteria for choosing Z or t distribution
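The decision rule of Figure 1.7 can be written as a small function. This is a sketch only: the function name `choose_distribution` is hypothetical, and the threshold n ≥ 30 is taken from the wording of the text ("more than or equal to 30").

```python
def choose_distribution(sigma_known: bool, n: int) -> str:
    """Return 'Z' or 't' following the decision rule of Figure 1.7."""
    if sigma_known:
        return "Z"   # sigma known: Z distribution for any sample size
    if n >= 30:
        return "Z"   # sigma unknown, large sample: Z with s in place of sigma
    return "t"       # sigma unknown, small sample: t with n - 1 degrees of freedom


print(choose_distribution(True, 10))    # Z
print(choose_distribution(False, 60))   # Z
print(choose_distribution(False, 10))   # t
```

The normality footnotes of the figure are not checked by this function; they remain assumptions about the data.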

Page 30: Chapter1 Notes - Applied Statistics

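Interval Estimation (cont…)

Therefore, the confidence interval for a mean has three formulas:

1. Confidence interval for a mean with known population standard deviation

$\bar{x} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$

or

$\bar{x} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$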

Page 31: Chapter1 Notes - Applied Statistics

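Interval Estimation (cont…)

2. Confidence interval for a mean with unknown population standard deviation, sample size greater than or equal to 30 (n ≥ 30)

$\bar{x} - Z_{\alpha/2}\frac{S}{\sqrt{n}} < \mu < \bar{x} + Z_{\alpha/2}\frac{S}{\sqrt{n}}$

or

$\bar{x} \pm Z_{\alpha/2}\frac{S}{\sqrt{n}}$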

Page 32: Chapter1 Notes - Applied Statistics

Interval Estimation (cont…)

3. Confidence interval for a mean with unknown population standard deviation, sample size less than 30 (n < 30)

$\bar{x} - t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}}$

or

$\bar{x} \pm t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}}$

Page 33: Chapter1 Notes - Applied Statistics

Interval Estimation (cont…)

The graphical view of an interval estimate:

Figure 1.8: Graphical view of confidence interval, showing the width of the interval between the lower confidence limit (LCL) and the upper confidence limit (UCL)

Page 34: Chapter1 Notes - Applied Statistics

Interval Estimation (cont…)

The probability $(1-\alpha)$ is called the confidence level (or degree of confidence).

$\alpha$ is called the significance level, or the probability that a Type I error will occur.

The confidence level is the relative frequency of times the confidence interval actually does contain the population parameter, assuming that the estimation process is repeated a large number of times.

There are four commonly used confidence levels…

Page 35: Chapter1 Notes - Applied Statistics

Interval Estimation (cont…)

Confidence level $(1-\alpha)$    $\alpha$    $\alpha/2$    $Z_{\alpha/2}$
0.90                             0.10        0.05          1.6449
0.95                             0.05        0.025         1.9600
0.98                             0.02        0.01          2.3263
0.99                             0.01        0.005         2.5758
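The Z critical values for any confidence level can be computed instead of read from a table. A minimal sketch using Python's standard library `statistics.NormalDist`; the helper name `z_critical` is illustrative, not from the notes.

```python
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Two-sided critical value Z_{alpha/2} of the standard normal distribution."""
    alpha = 1 - confidence
    # Z_{alpha/2} leaves an area of alpha/2 in the upper tail
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.98, 0.99):
    print(level, round(z_critical(level), 4))
```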

Page 36: Chapter1 Notes - Applied Statistics

These are some critical values for the t distribution.

Confidence level $(1-\alpha)$    df    $\alpha$    $\alpha/2$    $t_{\alpha/2,\,df}$
0.90                             3     0.10        0.05          2.3534
0.95                             5     0.05        0.025         2.5706
0.98                             7     0.02        0.01          2.9980
0.99                             9     0.01        0.005         3.2498

Page 37: Chapter1 Notes - Applied Statistics

Example 4:

A computer company samples demand during lead time over 25 time periods:

235 374 309 499 253

421 361 514 462 369

394 439 348 344 330


261 374 302 466 535

386 316 296 332 334

It is known that the standard deviation of demand over lead time is 75 computers. Estimate the mean demand over lead time with 95% confidence level in order to set inventory levels.
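One way to work Example 4 is with formula 1 (known population standard deviation). The sketch below uses only the data given in the example; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Demand during lead time over 25 periods (Example 4)
demand = [235, 374, 309, 499, 253,
          421, 361, 514, 462, 369,
          394, 439, 348, 344, 330,
          261, 374, 302, 466, 535,
          386, 316, 296, 332, 334]
sigma = 75           # known population standard deviation
confidence = 0.95

n = len(demand)
x_bar = sum(demand) / n                              # point estimate of the mean
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # Z_{alpha/2}
margin = z * sigma / sqrt(n)                         # Z_{alpha/2} * sigma / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"{x_bar:.2f} +/- {margin:.2f} -> ({lower:.2f}, {upper:.2f})")
# 370.16 +/- 29.40 -> (340.76, 399.56)
```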

Page 38: Chapter1 Notes - Applied Statistics

Example 5:

The president of a large university wishes to estimate the average age of the students presently enrolled. From past studies, the standard deviation is known to be 2 years. A sample of 50 students is selected, and the mean is found to be 23.2 years. Find the 95% confidence interval of the population mean.


Example 6:

A survey of 30 adults found that the mean age of a person’s primary vehicle is 5.6 years. Assuming the standard deviation of the population is 0.8 year; find the 99% confidence interval of the population mean.

Page 39: Chapter1 Notes - Applied Statistics

Example 7:

A cereal company selects twenty-five 12-ounce boxes of corn flakes every 10 minutes and weighs the boxes. Suppose the weights have a normal distribution with a variance of 0.04. One such sample yields a sample mean of ___ ounces. Calculate the 90% confidence interval of the population mean.

Example 8:

Ten randomly selected automobiles were stopped, and tread depth of the right front tire was measured. The mean was 0.32 inch and the standard deviation was 0.08 inch. Find the 95% confidence interval of the mean depth. Assume that the variable is approximately normally distributed.
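Example 8 has an unknown population standard deviation and n < 30, so formula 3 (t distribution) applies. The sketch below assumes the critical value $t_{0.025,\,9} = 2.262$ read from a standard t table; the standard library has no t quantile function, so the value is passed in by hand. The function name is hypothetical.

```python
from math import sqrt


def t_interval(x_bar, s, n, t_crit):
    """Confidence interval x_bar +/- t_crit * s / sqrt(n)."""
    margin = t_crit * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# Assumption: t_{0.025, 9} = 2.262, taken from a t table (df = n - 1 = 9)
T_CRIT_95_DF9 = 2.262

lower, upper = t_interval(x_bar=0.32, s=0.08, n=10, t_crit=T_CRIT_95_DF9)
print(round(lower, 3), round(upper, 3))   # 0.263 0.377
```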

Page 40: Chapter1 Notes - Applied Statistics

Example 9:

The average production of peanuts in the state of Virginia is 3000 pounds per acre. A new plant food has been developed and is tested on 60 individual plots of land. The mean yield with the new plant food is 3120 pounds of peanuts per acre with a standard deviation of 578 pounds. Find the 95% confidence interval for the mean yield of peanuts per acre with the new plant food. Interpret the interval.

Page 41: Chapter1 Notes - Applied Statistics

Example 10:

The following daily highs were recorded in the city of Chicago on 20 randomly selected December days.

32 21 25 25 31 27 22 44 39 18 49 32 34 36 38 40 30 28 36 38

To find a confidence interval for the mean daily high temperature, should we use the t or Z distribution? Explain.

Page 42: Chapter1 Notes - Applied Statistics

Interval estimation for proportion

Whenever the information is given as a percentage, a proportion, or a number of successes for a specific event, the problem being investigated has something to do with proportion.

The procedures for drawing inferences about a proportion involve the nominal and sometimes the ordinal scale (i.e. categorical data).

Examples of categorical data: gender (male and female), job satisfaction (satisfied and unsatisfied), opinion (poor and good), attendance (absent, present), examination result (pass and fail), etc.

Page 43: Chapter1 Notes - Applied Statistics

Interval estimation for proportion

The point estimate for the proportion is given by

$\hat{p} = \frac{x}{n}$

where
$\hat{p}$ = symbol for the sample proportion,
$x$ = number of sample units that possess the characteristic of interest,
$n$ = sample size.

Page 44: Chapter1 Notes - Applied Statistics

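Interval estimation for proportion

Knowing that:

• the sample size n is big, and
• both $n\hat{p}$ and $n\hat{q}$ are greater than or equal to 5,

then the formula to estimate the confidence interval for a proportion is given by

$\hat{p} - Z_{\alpha/2}\,S(\hat{p}) < p < \hat{p} + Z_{\alpha/2}\,S(\hat{p})$

$= \hat{p} - Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} < p < \hat{p} + Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$

$= \hat{p} - Z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + Z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$

where $\hat{p} + \hat{q} = 1$.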

Page 45: Chapter1 Notes - Applied Statistics

Example 11:

A recent study of 100 people in Miami found that 27 were obese. What is the proportion of individuals living in Miami who are obese? Obtain the 95% confidence interval for the proportion of individuals living in Miami who are obese and interpret it.
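The numerical part of Example 11 can be sketched as follows, using the proportion interval formula with $\hat{q} = 1 - \hat{p}$. The function name `proportion_interval` is illustrative; the check that both $n\hat{p}$ and $n\hat{q}$ are at least 5 mirrors the conditions stated for the formula.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(x, n, confidence):
    """Return (p_hat, lower, upper) for a one-sample proportion interval."""
    p_hat = x / n
    q_hat = 1 - p_hat
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("n*p_hat and n*q_hat must both be at least 5")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # Z_{alpha/2}
    margin = z * sqrt(p_hat * q_hat / n)
    return p_hat, p_hat - margin, p_hat + margin


p_hat, lower, upper = proportion_interval(x=27, n=100, confidence=0.95)
print(p_hat, round(lower, 3), round(upper, 3))   # 0.27 0.183 0.357
```

The interpretation asked for in the example is left to the reader.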


Example 12:

A survey found that out of 200 workers, 168 said they were interrupted three or more times an hour by phone calls, messages, faxes, etc. Estimate, with a 90% confidence level, the percentage of workers who are not interrupted three or more times an hour.

Page 46: Chapter1 Notes - Applied Statistics

Example 13:

In a random sample of 500 observations, we found the proportion of successes to be 48%. Estimate with 95% confidence the population proportion of successes.

Example 14:

A random sample of 1500 pine trees was tested for traces of Bark Beetle infestation. The results showed that 153 of the trees had such traces. Assuming the data are approximately normally distributed, calculate the point estimate of the proportion of pine trees that have been infested, and find a 95% confidence interval for the proportion of pine trees that have been infested.

Page 47: Chapter1 Notes - Applied Statistics

Example 15:

The quality control manager at Ameen Company considers the production of model A telephones to be ‘out of control’ when the overall rate of defects exceeds 4%. A test of a random sample of 150 telephones revealed that 9 of them are defective. Construct a 98% confidence interval for the proportion of defective telephones.

Example 16:

A statistics practitioner working for major league baseball wants to supply radio and television commentators with interesting statistics. He observed several hundred games and counted the number of times a runner on first base attempted to steal second base. He found there were 373 such events, of which 259 were successful. Estimate with 95% confidence the population proportion of all attempted thefts of second base that are successful.

Page 48: Chapter1 Notes - Applied Statistics

Sample size

Sample size for Mean

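Recall: the interval formula for estimating the population mean is

$\bar{x} - \underbrace{Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}}_{\text{Error }(E)} < \mu < \bar{x} + \underbrace{Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}}_{\text{Error }(E)}$

Note that the maximum error is

$E = Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$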

Page 49: Chapter1 Notes - Applied Statistics

Sample size (cont…)

Sample size for Mean

Using the maximum error formula, we can then calculate the sample size n, which is given by

$n = \left(\frac{Z_{\alpha/2}\,\sigma}{E}\right)^2 = \frac{Z_{\alpha/2}^2\,\sigma^2}{E^2}$
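A minimal sketch of the sample size calculation for a mean. The inputs (σ = 75, E = 10, 95% confidence) are hypothetical numbers chosen for illustration, not taken from the notes; the result is rounded up because a sample size must be a whole number.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, error, confidence):
    """n = (Z_{alpha/2} * sigma / E)^2, rounded up to the next whole number."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / error) ** 2)


# Hypothetical inputs: sigma = 75, maximum error E = 10, 95% confidence
print(sample_size_for_mean(sigma=75, error=10, confidence=0.95))   # 217
```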

Page 50: Chapter1 Notes - Applied Statistics

Sample size (cont…)

Sample size for proportion

Recall: the interval formula for estimating the population proportion is

$\hat{p} - \underbrace{Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}}_{\text{Error }(E)} < p < \hat{p} + \underbrace{Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}}_{\text{Error }(E)}$

Page 51: Chapter1 Notes - Applied Statistics

Sample size (cont…)

Sample size for proportion

Note that the maximum error is

$E = Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$

Using the maximum error formula, we can then calculate the sample size n, which is given by

$n = \hat{p}(1-\hat{p})\left(\frac{Z_{\alpha/2}}{E}\right)^2 = \hat{p}\hat{q}\left(\frac{Z_{\alpha/2}}{E}\right)^2$
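The same idea for a proportion, again as a sketch with hypothetical inputs ($\hat{p} = 0.5$, E = 0.03, 95% confidence) that do not come from the notes.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(p_hat, error, confidence):
    """n = p_hat * q_hat * (Z_{alpha/2} / E)^2, rounded up."""
    q_hat = 1 - p_hat
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(p_hat * q_hat * (z / error) ** 2)


# Hypothetical inputs: p_hat = 0.5, maximum error E = 0.03, 95% confidence
print(sample_size_for_proportion(p_hat=0.5, error=0.03, confidence=0.95))   # 1068
```

Since $\hat{p}\hat{q}$ is largest at $\hat{p} = 0.5$, that value gives the most conservative (largest) sample size when no prior estimate of the proportion is available.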

Page 52: Chapter1 Notes - Applied Statistics

Conclusion

In conclusion, the width of the confidence interval estimate is affected by

• the population standard deviation,
• the confidence level,
• the sample size.

Page 53: Chapter1 Notes - Applied Statistics

Conclusion

The width of the confidence interval is a function of the confidence level, the population standard deviation, and the sample size…

$\bar{x} - Z_{\alpha/2}\frac{S}{\sqrt{n}} < \mu < \bar{x} + Z_{\alpha/2}\frac{S}{\sqrt{n}}$, that is, $\bar{x} \pm Z_{\alpha/2}\frac{S}{\sqrt{n}}$

A larger confidence level produces a wider confidence interval:

Figure 1.9: Relationship between width and confidence level

Page 54: Chapter1 Notes - Applied Statistics

Conclusion

Larger values of the standard deviation produce wider confidence intervals.

Figure 1.10: Relationship between width and standard deviation

Increasing the sample size decreases the width of the confidence interval while the confidence level can remain unchanged.
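The three effects on the interval width can be checked numerically. The sketch below uses the width $2 Z_{\alpha/2}\,\sigma/\sqrt{n}$ with hypothetical values (σ = 10, n = 100, 95% confidence as the baseline); only the direction of each change matters.

```python
from math import sqrt
from statistics import NormalDist


def interval_width(sigma, n, confidence):
    """Width of the Z interval for a mean: 2 * Z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * sigma / sqrt(n)


base = interval_width(sigma=10, n=100, confidence=0.95)   # hypothetical baseline

print(interval_width(10, 100, 0.99) > base)   # True: higher confidence level, wider
print(interval_width(20, 100, 0.95) > base)   # True: larger standard deviation, wider
print(interval_width(10, 400, 0.95) < base)   # True: larger sample, narrower
```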

Page 55: Chapter1 Notes - Applied Statistics

INTERVAL ESTIMATION FOR TWO MEANS

Previously, we discussed the techniques for estimating the mean of one population.

Now, consider the same parameter but with two populations. With two populations, our interest is in the difference between the two population means.

Page 56: Chapter1 Notes - Applied Statistics

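INTERVAL ESTIMATION FOR TWO MEANS

Figure 1.11: Independent populations and samples. Population 1 has parameters $\mu_1$ and $\sigma_1^2$; a sample of size $n_1$ drawn from it gives the statistics $\bar{x}_1$ and $s_1^2$. Population 2 has parameters $\mu_2$ and $\sigma_2^2$; a sample of size $n_2$ drawn from it gives the statistics $\bar{x}_2$ and $s_2^2$.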

Page 57: Chapter1 Notes - Applied Statistics

INTERVAL ESTIMATION FOR TWO MEANS

There are two different types of samples:

Dependent samples, also called related (or paired) samples, occur when the response of the nth person in the second sample is partly a function of the response of the nth person in the first sample. There are two common forms of sample dependency:

• before-after and other studies in which the same people are surveyed at different points in time, including panel studies;

• matched-pairs studies in which similar people are surveyed at different points in time.

Independent samples are samples that are completely unrelated to one another.

Page 58: Chapter1 Notes - Applied Statistics

INTERVAL ESTIMATION FOR TWO MEANS

Interval estimation for difference of two independent means

In order to test and estimate the difference between two means, we draw random samples from each of two populations. Initially, we will consider independent samples, that is, samples that are completely unrelated to one another.

The statistic used is $\bar{x}_1 - \bar{x}_2$ or $\bar{x}_2 - \bar{x}_1$.

Page 59: Chapter1 Notes - Applied Statistics

INTERVAL ESTIMATION FOR TWO MEANS

Interval estimation for difference of two independent means

Two assumptions need to be fulfilled in order to estimate the difference between two independent means:

• The samples must be independent of each other; that is, there can be no relationship between the subjects in each sample.

• The populations from which the samples were obtained must be normally distributed.

Page 60: Chapter1 Notes - Applied Statistics

Interval estimation for difference of two independent means (cont..)

There are four different formulas for the confidence interval for the difference between two independent means:

i. Confidence interval when both population variances (or standard deviations) are known

$(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$

Page 61: Chapter1 Notes - Applied Statistics

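Interval estimation for difference of two independent means (cont..)

ii. Confidence interval when both population variances (or standard deviations) are unknown but both sample sizes are greater than or equal to 30

$(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

iii. Confidence interval when both population variances (or standard deviations) are unknown, one or both sample sizes are less than 30, and both population variances are assumed equal

$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,n_1+n_2-2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$, where $s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}$ is the pooled variance estimator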

Page 62: Chapter1 Notes - Applied Statistics

Interval estimation for difference of two independent means (cont..)

iv. Confidence interval when both population variances (or standard deviations) are unknown, one or both sample sizes are less than 30, and both population variances are assumed unequal

$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,\nu}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$, where $\nu$ is the degrees of freedom for the unequal-variances case

As with the interval estimator for one mean, the same considerations apply in deciding which formula to use for the difference between two means.

Page 63: Chapter1 Notes - Applied Statistics

1. Are both $\sigma_1$ and $\sigma_2$ known?
   Yes: use $z_{\alpha/2}$ values no matter what the sample size is.*
   No: go to question 2.

2. Are both $n_1$ and $n_2$ ≥ 30?
   Yes: use $z_{\alpha/2}$ values and s in place of σ.
   No: use $t_{\alpha/2}$ values and s in the formula.** Go to question 3.

3. Conduct an equal-variances test. Is $\sigma_1^2 = \sigma_2^2$?
   Yes: use $t_{\alpha/2}$ values with the pooled variance estimator, $s_p^2$.
   No: use $t_{\alpha/2}$ values with the separate sample variances $s_1^2$ and $s_2^2$.

* The variable must be normally distributed when n < 30.
** The variable must be approximately normally distributed.

Figure 1.12: Flow diagram for choosing the correct distribution

Page 64: Chapter1 Notes - Applied Statistics

Figure 1.13: Flow diagram for choosing the correct confidence interval formula. The diagram asks the same questions as Figure 1.12 (Are both $\sigma_1$ and $\sigma_2$ known? Are both $n_1$ and $n_2$ ≥ 30? Are the variances equal?), with each branch ending in the corresponding confidence interval formula.

* The variable must be normally distributed when n < 30.
** The variable must be approximately normally distributed.

Page 65: Chapter1 Notes - Applied Statistics

Example 17:

Two random samples of 40 students were drawn independently from two normal populations. The following statistics regarding their scores in a final exam were obtained;


Construct a 95% confidence interval for the difference between the means.

Page 66: Chapter1 Notes - Applied Statistics

Solution

The population standard deviations are unknown. However, since both sample sizes are large enough (both $n_1, n_2 \geq 30$), according to the Central Limit Theorem, the sample means follow a normal distribution.

The 95% confidence interval for the difference between the means is

Page 67: Chapter1 Notes - Applied Statistics

Example 18:

A random sample of 22 male customers who shopped at a supermarket showed that they spent an average of RM80 with a standard deviation of RM17.50, while a random sample of 20 female customers who shopped at the same supermarket showed that they spent an average of RM96 with a standard deviation of RM14.40. Assume that the amounts spent at this supermarket by all male and all female customers are normally distributed with equal but unknown standard deviations.

Construct a 99% confidence interval for the difference between the mean amounts spent by all male and all female customers at this supermarket and interpret the interval.
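Example 18 has small samples and equal but unknown standard deviations, so the pooled-variance t interval applies. The sketch below assumes the critical value $t_{0.005,\,40} = 2.704$ read from a standard t table (df = 22 + 20 − 2 = 40); the function name is hypothetical, and the difference is taken as male minus female.

```python
from math import sqrt


def pooled_t_interval(x1, s1, n1, x2, s2, n2, t_crit):
    """Interval for mu1 - mu2 using the pooled variance estimator."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df   # pooled variance
    margin = t_crit * sqrt(sp2 * (1 / n1 + 1 / n2))
    diff = x1 - x2
    return diff - margin, diff + margin


# Assumption: t_{0.005, 40} = 2.704, taken from a t table
T_CRIT_99_DF40 = 2.704

lower, upper = pooled_t_interval(80, 17.50, 22, 96, 14.40, 20, T_CRIT_99_DF40)
print(round(lower, 2), round(upper, 2))
```

Both limits come out negative, so the interval does not contain 0; the interpretation asked for in the example is left to the reader.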

Page 68: Chapter1 Notes - Applied Statistics

Example 19:

Because of the rising costs of industrial accidents, many chemical, mining, and manufacturing firms have instituted safety courses. Employees are encouraged to take these courses, which are designed to heighten safety awareness. A company is trying to decide which one of two courses to institute. To help make a decision, eight employees take Course 1 and another eight take Course 2. Each employee takes a test, which is graded out of a possible 25. The safety test results are shown below. Assume that the scores are normally distributed. Construct a 90% confidence interval for the difference between the means.

Course 1 14 21 17 14 17 19 20 16

Course 2 20 18 22 15 23 21 19 15

Page 69: Chapter1 Notes - Applied Statistics

Example 20:

Random samples of children aged 4 to 6 years sent to kindergarten in Bandar A and Bandar B were taken to find the number of hours spent daily on outdoor activities in the kindergarten. A sample of 321 children in Bandar B and 94 children in Bandar A gave means of 3.01 hours and 2.88 hours, respectively. From past studies, the population standard deviation for the children in Bandar B is assumed to be 1.09, while the population standard deviation for the children in Bandar A is 1.01. Find a 95% confidence interval for the difference between the two population means.
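In Example 20 both population standard deviations are known, so the Z interval for two means applies. A sketch using only the numbers given in the example; the function name is illustrative and the difference is taken as Bandar B minus Bandar A.

```python
from math import sqrt
from statistics import NormalDist


def two_mean_z_interval(x1, sigma1, n1, x2, sigma2, n2, confidence):
    """Interval for mu1 - mu2 when both population standard deviations are known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    diff = x1 - x2
    return diff - margin, diff + margin


# Population 1 = Bandar B, population 2 = Bandar A
lower, upper = two_mean_z_interval(3.01, 1.09, 321, 2.88, 1.01, 94, 0.95)
print(round(lower, 3), round(upper, 3))
```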

Page 70: Chapter1 Notes - Applied Statistics

Interval estimation for the difference between two proportions

We will now look at procedures for drawing inferences about the difference between populations whose data are nominal (i.e. categorical).

With nominal data, we can calculate the proportions of occurrences of each type of outcome. Thus, the parameter to be estimated in this section is the difference between two population proportions: $p_1 - p_2$.

Page 71: Chapter1 Notes - Applied Statistics

Interval estimation for the difference between two proportions (cont…)

Assumptions for making inferences about two proportions

i. We have proportions from two independent simple random samples.

ii. In order to use the normal Z distribution, for both samples the conditions

$n_1\hat{p}_1 \geq 5,\quad n_1(1-\hat{p}_1) \geq 5,\quad n_2\hat{p}_2 \geq 5 \quad\text{and}\quad n_2(1-\hat{p}_2) \geq 5$

must be satisfied.

Page 72: Chapter1 Notes - Applied Statistics

Interval estimation for the difference between two proportions (cont…)

To draw inferences about the parameter $p_1 - p_2$, we take samples from each population, calculate the sample proportions

$\hat{p}_1 = \frac{x_1}{n_1}$ and $\hat{p}_2 = \frac{x_2}{n_2}$

and look at their difference.

$(\hat{p}_1 - \hat{p}_2)$ is an unbiased estimator of $(p_1 - p_2)$.

Page 73: Chapter1 Notes - Applied Statistics

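Interval estimation for the difference between two proportions (cont…)

The confidence interval estimator for $(p_1 - p_2)$ is given by:

$(\hat{p}_1 - \hat{p}_2) - z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} \leq p_1 - p_2 \leq (\hat{p}_1 - \hat{p}_2) + z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$

or

$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$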

Page 74: Chapter1 Notes - Applied Statistics

Example 21:

A Consumer Packaged Goods (CPG) company is test marketing two new versions of soap packaging. Version one (bright colors) was distributed in one supermarket, while version two (simple colors) was distributed in another. Construct a 95% confidence interval for the difference between the two proportions of successes of packaged soap sales.

Page 75: Chapter1 Notes - Applied Statistics

Example 22:

A random sample of 500 respondents was selected in a large city to determine information concerning consumer behavior. Among the questions asked was, “Do you enjoy shopping?” Of 240 male respondents, 136 answered yes. Of 260 female respondents, 224 answered yes. Construct a 95% confidence interval estimate of the difference between the proportion of males and females who enjoy shopping.
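The interval asked for in Example 22 can be sketched with the two-proportion formula. The function name is illustrative; the difference is taken as males minus females, and the check on the four counts mirrors the assumptions for using the Z distribution.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_interval(x1, n1, x2, n2, confidence):
    """Interval for p1 - p2 from two independent samples."""
    p1, p2 = x1 / n1, x2 / n2
    q1, q2 = 1 - p1, 1 - p2
    # Conditions for using the normal Z distribution
    if min(n1 * p1, n1 * q1, n2 * p2, n2 * q2) < 5:
        raise ValueError("each of n*p_hat and n*q_hat must be at least 5")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p1 * q1 / n1 + p2 * q2 / n2)
    diff = p1 - p2
    return diff - margin, diff + margin


# Sample 1 = males (136 of 240), sample 2 = females (224 of 260)
lower, upper = two_proportion_interval(136, 240, 224, 260, 0.95)
print(round(lower, 4), round(upper, 4))
```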

Page 76: Chapter1 Notes - Applied Statistics

SPSS NOTES FOR OBTAINING THE CONFIDENCE INTERVAL OF MEAN

Step 1 : Select Analyze Menu → Select Descriptive Statistics


Page 77: Chapter1 Notes - Applied Statistics

Step 2 : Click on Explore → Select the appropriate variable from the list of variable(s)

Step 3 : Click on the arrow button to move the variable into the Dependent List box

Page 78: Chapter1 Notes - Applied Statistics

Step 4 : Click on Statistics → Select the appropriate statistics, e.g. Descriptives

You can change the degree of confidence (usually 90% and above).

Step 5 : Then, click on Continue → Click on OK

Page 79: Chapter1 Notes - Applied Statistics

Example

A random sample of 10 university students was surveyed to determine the amount of time spent weekly using a personal computer. The times are: 13, 14, 5, 6, 8, 10, 7, 12, 15, and 3. If the times are normally distributed with a standard deviation of 5.2 hours, estimate with 90% confidence the mean weekly time spent using a personal computer by all university students.

Page 80: Chapter1 Notes - Applied Statistics

Descriptives

times                                            Statistic    Std. Error
Mean                                             9.30         1.300
90% Confidence Interval for Mean  Lower Bound    6.92
                                  Upper Bound    11.68
5% Trimmed Mean                                  9.33
Median                                           9.00
Variance                                         16.900
Std. Deviation                                   4.111
Minimum                                          3
Maximum                                          15
Range                                            12
Interquartile Range                              8
Skewness                                         -.040        .687
Kurtosis                                         -1.396       1.334

At the 90% confidence level, the mean weekly time spent using a personal computer by all university students is between 6.92 and 11.68 hours.
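The figures in the SPSS output can be reproduced by hand. They are consistent with an interval built from the sample standard deviation and the t distribution with 9 degrees of freedom, rather than from the standard deviation of 5.2 hours stated in the example. The sketch below assumes the critical value $t_{0.05,\,9} = 1.833$ from a standard t table.

```python
from math import sqrt
import statistics

# Weekly hours of personal computer use for the 10 students
times = [13, 14, 5, 6, 8, 10, 7, 12, 15, 3]

n = len(times)
x_bar = sum(times) / n
s = statistics.stdev(times)    # sample standard deviation (n - 1 in the denominator)
std_error = s / sqrt(n)        # standard error of the mean

# Assumption: t_{0.05, 9} = 1.833, taken from a t table
T_CRIT_90_DF9 = 1.833
lower = x_bar - T_CRIT_90_DF9 * std_error
upper = x_bar + T_CRIT_90_DF9 * std_error

print(round(x_bar, 2), round(s, 3), round(std_error, 3))   # 9.3 4.111 1.3
print(round(lower, 2), round(upper, 2))                    # 6.92 11.68
```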

Page 81: Chapter1 Notes - Applied Statistics

SUMMARY

A point estimator is a good estimator if it has the qualities of a good estimator, which are unbiasedness, consistency and relative efficiency.

Unlike point estimation, interval estimation involves an interval constructed around the point estimate with a probability of $(1-\alpha)$.

To construct the interval, information regarding the sampling distribution of the statistic is important.

Page 82: Chapter1 Notes - Applied Statistics

SUMMARY (CONT…)

The Central Limit Theorem enables us to determine the sampling distribution of the sample statistic based on the sample size and knowledge of the population variance.

If we want to know whether the population mean/proportion equals a certain value k, and the confidence interval for the mean/proportion includes k, we can conclude that k is a plausible value for the mean/proportion (we fail to reject the claim that it equals k), at the given level of confidence.

Page 83: Chapter1 Notes - Applied Statistics

SUMMARY (CONT…)

If the confidence interval for the difference between two means/proportions includes 0, we can say that there is no significant difference (we fail to reject) between the means/proportions of the two populations, at the given level of confidence.

END OF CHAPTER 1