inference about means and mean differences


Upload: andi-koentary

Post on 26-Jul-2015


TRANSCRIPT

Page 1: Inference about means and mean differences

© aSup-2007

Inference about Means and Mean Differences

1

PART III: Inference about Means and Mean Differences


Chapter 8: INTRODUCTION TO HYPOTHESIS TESTING


The Logic of Hypothesis Testing

It is usually impossible or impractical for a researcher to observe every individual in a population.

Therefore, researchers usually collect data from a sample and then use the sample data to answer questions about the population.

Hypothesis testing is a statistical method that uses sample data to evaluate a hypothesis about the population.


The Hypothesis Testing Procedure

1. State a hypothesis about a population. Usually the hypothesis concerns the value of a population parameter.

2. Before selecting a sample, use the hypothesis to predict the characteristics that the sample should have. The sample should be similar to the population.

3. Obtain a sample from the population (sampling).

4. Compare the obtained sample data with the prediction that was made from the hypothesis.


PROCESS OF HYPOTHESIS TESTING

It is assumed that the parameter μ is known for the population before treatment.

The purpose of the experiment is to determine whether or not the treatment has an effect on the population mean.

[Diagram: Known population before treatment (μ = 30) → TREATMENT → Unknown population after treatment (μ = ?)]


EXAMPLE

It is known from national health statistics that the mean weight for 2-year-old children is μ = 26 pounds, with σ = 4 pounds.

The researcher’s plan is to obtain a sample of n = 16 newborn infants and give their parents detailed instructions for giving their children increased handling and stimulation.

NOTICE that the population after treatment is unknown.


STEP-1: State the Hypothesis

H0 : μ = 26 (even with extra handling, the mean at 2 years is still 26 pounds)

H1 : μ ≠ 26 (with extra handling, the mean at 2 years will be different from 26 pounds)

In this example we use α = .05, two-tailed.


STEP-2: Set the Criteria for a Decision

The possible sample means are divided into two sets:

Sample means that are likely to be obtained if H0 is true; that is, sample means that are close to the value stated in the null hypothesis.

Sample means that are very unlikely to be obtained if H0 is true; that is, sample means that are very different from the value stated in the null hypothesis.

The alpha level, or level of significance, is a probability value that is used to define the very unlikely sample outcomes if the null hypothesis is true.


The location of the critical region boundaries for three different levels of significance:

α = .05: z = ±1.96
α = .01: z = ±2.58
α = .001: z = ±3.30
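The boundaries can be recomputed from the standard normal distribution. The sketch below is illustrative only: the function name `critical_z` is an assumption, and it uses Python's standard-library `statistics.NormalDist`. For α = .001 the unrounded boundary is about 3.29, which is commonly rounded up to 3.30.

```python
from statistics import NormalDist


def critical_z(alpha: float, two_tailed: bool = True) -> float:
    """Return the positive boundary of the critical region for a z test."""
    # A two-tailed test splits alpha equally between the two tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


for alpha in (0.05, 0.01, 0.001):
    print(f"alpha = {alpha}: z = ±{critical_z(alpha):.2f}")
```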


STEP-3: Collect Data and Compute Sample Statistics

After obtaining the sample data, compute the appropriate summary statistic:

$$\sigma_M = \frac{\sigma}{\sqrt{n}} \qquad z = \frac{M - \mu}{\sigma_M}$$

NOTICE that the top of the z-score formula measures how much difference there is between the data and the hypothesis.

The bottom of the formula measures the standard distance that ought to exist between the sample mean and the population mean.


STEP-4: Make a Decision

Whenever the sample data fall in the critical region, reject the null hypothesis.

This indicates that there is a big discrepancy between the sample and the null hypothesis (the sample is in the extreme tail of the distribution).


HYPOTHESIS TEST WITH z

Scores on a standardized test are normally distributed with μ = 65 and σ = 15. The researcher suspects that special training in reading skills will produce a change in scores for individuals in the population. A sample of n = 25 individuals is selected, and the average for this sample is M = 70.

Is there evidence that the training has an effect on test scores?

LEARNING CHECK
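One way to check an answer to this exercise is to run the four steps in code. This is a minimal sketch, assuming a two-tailed test with α = .05 (the exercise does not state an alpha level); the function name `z_test` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu, sigma, n, alpha=0.05):
    """Two-tailed z test; returns (z, critical value, reject H0?)."""
    standard_error = sigma / sqrt(n)              # sigma_M = sigma / sqrt(n)
    z = (sample_mean - mu) / standard_error       # z = (M - mu) / sigma_M
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return z, critical, abs(z) > critical


z, critical, reject = z_test(sample_mean=70, mu=65, sigma=15, n=25)
print(f"z = {z:.2f}, critical z = ±{critical:.2f}, reject H0: {reject}")
```

With these numbers the standard error is 3, so z is about 1.67, which does not reach the ±1.96 boundary.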


FACTORS THAT INFLUENCE A HYPOTHESIS TEST

The size of the difference between the sample mean and the original population mean

The variability of the scores, which is measured by either the standard deviation or the variance

The number of scores in the sample

$$\sigma_M = \frac{\sigma}{\sqrt{n}} \qquad z = \frac{M - \mu}{\sigma_M}$$
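The effect of each factor can be seen by changing one input at a time. In this sketch the baseline values (M = 70, μ = 65, σ = 15, n = 25) are borrowed from the reading-skills exercise; the altered values are hypothetical and chosen only for illustration.

```python
from math import sqrt


def z_score(sample_mean, mu, sigma, n):
    """z = (M - mu) / (sigma / sqrt(n))"""
    return (sample_mean - mu) / (sigma / sqrt(n))


baseline = z_score(70, 65, 15, 25)
bigger_difference = z_score(75, 65, 15, 25)   # larger M - mu gives a larger z
less_variability = z_score(70, 65, 5, 25)     # smaller sigma gives a larger z
larger_sample = z_score(70, 65, 15, 100)      # larger n gives a larger z

for label, value in [
    ("baseline", baseline),
    ("bigger difference", bigger_difference),
    ("less variability", less_variability),
    ("larger sample", larger_sample),
]:
    print(f"{label:>18}: z = {value:.2f}")
```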


DIRECTIONAL (ONE-TAILED) HYPOTHESIS TESTS

Usually a researcher begins an experiment with a specific prediction about the direction of the treatment effect.

For example, a special training program is expected to increase student performance.

In this situation, it is possible to state the statistical hypotheses in a manner that incorporates the directional prediction into the statements of H0 and H1.
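A directional test places the whole of α in one tail, which moves the critical boundary closer to the center. The sketch below assumes the hypotheses H0: μ ≤ 65 and H1: μ > 65 and reuses the reading-skills numbers (M = 70, σ = 15, n = 25) purely as an illustration; that the researcher predicted an increase is an assumption here.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(sample_mean, mu, sigma, n):
    return (sample_mean - mu) / (sigma / sqrt(n))


alpha = 0.05
z = z_statistic(sample_mean=70, mu=65, sigma=15, n=25)

# Two-tailed: alpha is split between both tails.
two_tailed_cutoff = NormalDist().inv_cdf(1 - alpha / 2)
# One-tailed (predicted increase): all of alpha sits in the upper tail.
one_tailed_cutoff = NormalDist().inv_cdf(1 - alpha)

reject_two_tailed = abs(z) > two_tailed_cutoff
reject_one_tailed = z > one_tailed_cutoff

print(f"z = {z:.3f}")
print(f"two-tailed cutoff ±{two_tailed_cutoff:.3f}: reject H0 = {reject_two_tailed}")
print(f"one-tailed cutoff +{one_tailed_cutoff:.3f}: reject H0 = {reject_one_tailed}")
```

The same sample can therefore be significant in a one-tailed test and not significant in a two-tailed test, which is why the direction has to be chosen before the data are examined.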


A psychologist has developed a standardized test for measuring the vocabulary skills of 4-year-old children. The scores on the test form a normal distribution with μ = 60 and σ = 10.

A researcher would like to use this test to investigate the hypothesis that children who grow up as an only child develop vocabulary skills at a different rate than children in large families. A sample of n = 25 only children is obtained, and the mean test score for this sample is M = 63.

LEARNING CHECK


UNCERTAINTY AND ERRORS IN HYPOTHESIS TESTING

Hypothesis testing is an inferential process, which means that it uses limited information as the basis for reaching a general conclusion.

Although sample data are usually representative of the population, there is always a chance that the sample is misleading and will cause a researcher to make the wrong decision about the research results.


Type I Error

… occurs when a researcher rejects a null hypothesis that is actually true.

In a typical research situation, a Type I Error means the researcher concludes that a treatment does have an effect when in fact it has no effect.


Type II Error

… occurs when a researcher fails to reject a null hypothesis that is really false.

In a typical research situation, a Type II Error means that the hypothesis test has failed to detect a real treatment effect.


                Actual Situation
                No Effect, H0 True      Effect Exists, H0 False
Reject H0       Type I Error            Decision Correct
Retain H0       Decision Correct        Type II Error
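The two error probabilities can be computed directly for a z test. This sketch uses the infant-handling setup (μ = 26, σ = 4, n = 16, α = .05, two-tailed); the treated-population mean of 28 is HYPOTHETICAL, introduced only so that a Type II error rate can be calculated.

```python
from math import sqrt
from statistics import NormalDist

mu_null = 26      # population mean stated in H0
sigma = 4
n = 16
alpha = 0.05
mu_true = 28      # HYPOTHETICAL mean after treatment, for illustration only

standard_error = sigma / sqrt(n)
z_critical = NormalDist().inv_cdf(1 - alpha / 2)
lower_bound = mu_null - z_critical * standard_error
upper_bound = mu_null + z_critical * standard_error


def probability_of_rejecting(mu_actual):
    """P(sample mean falls in the critical region) when the real mean is mu_actual."""
    sampling_dist = NormalDist(mu_actual, standard_error)
    return sampling_dist.cdf(lower_bound) + (1 - sampling_dist.cdf(upper_bound))


type_i_rate = probability_of_rejecting(mu_null)        # H0 true but rejected
type_ii_rate = 1 - probability_of_rejecting(mu_true)   # H0 false but retained

print(f"P(Type I error)  = {type_i_rate:.3f}")
print(f"P(Type II error) = {type_ii_rate:.3f} when the true mean is {mu_true}")
```

The Type I error rate equals α by construction, whereas the Type II error rate depends on how large the real effect is.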


Chapter 9: INTRODUCTION TO THE t STATISTIC


THE t STATISTIC: AN ALTERNATIVE TO z

In the previous chapter, we presented the statistical procedures that permit researchers to use a sample mean to test hypotheses about an unknown population.

Remember that the expected value of the distribution of sample means is μ, the population mean.


The statistical procedures were based on a few basic concepts:

1. A sample mean (M) is expected to approximate its population mean (μ). This permits us to use the sample mean to test a hypothesis about the population mean.

2. The standard error provides a measure of how well a sample mean approximates the population mean. Specifically, the standard error determines how much difference between M and μ is reasonable to expect just by chance.


The statistical procedures were based on a few basic concepts (continued):

3. To quantify our inferences about the population, we compare the obtained sample mean (M) with the hypothesized population mean (μ) by computing a z-score test statistic.


THE t STATISTIC: AN ALTERNATIVE TO z

The goal of the hypothesis test is to determine whether or not the obtained result is significantly greater than would be expected by chance.


THE PROBLEM WITH z-SCORE

A z-score requires that we know the value of the population standard deviation (or variance), which is needed to compute the standard error.

In most situations, however, the standard deviation for the population is not known.

In this case, we cannot compute the standard error and z-score for the hypothesis test. We use the t statistic for hypothesis testing when the population standard deviation is unknown.


Introducing the t Statistic

$$\sigma_M = \frac{\sigma}{\sqrt{n}}$$

Now we will estimate the standard error by simply substituting the sample variance or standard deviation in place of the unknown population value:

$$s_M = \frac{s}{\sqrt{n}}$$

Notice that the symbol for the estimated standard error of M is $s_M$ instead of $\sigma_M$, indicating that the estimated value is computed from sample data rather than from the actual population parameter.


z-score and t statistic

$$\sigma_M = \frac{\sigma}{\sqrt{n}} \qquad z = \frac{M - \mu}{\sigma_M}$$

$$s_M = \frac{s}{\sqrt{n}} \qquad t = \frac{M - \mu}{s_M}$$


The t Distribution

Every sample from a population can be used to compute a z-score or a t statistic.

If you select all possible samples of a particular size (n), then the entire set of resulting z-scores will form a z-score distribution.

In the same way, the set of all possible t statistics will form a t distribution.


The Shape of the t Distribution

The exact shape of a t distribution changes with degrees of freedom.

There is a different sampling distribution of t (a distribution of all possible sample t values) for each possible number of degrees of freedom.

As df gets very large, the t distribution gets closer in shape to a normal z-score distribution.
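The convergence toward the normal curve can be checked numerically. The sketch below evaluates the standard Student t density formula with `math.lgamma` and compares it with the normal density at the center (t = 0) and in the tail (t = 3); the function name `t_density` and the chosen df values are illustrative.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist


def t_density(t, df):
    """Probability density of the Student t distribution with df degrees of freedom."""
    log_constant = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_constant - (df + 1) / 2 * log(1 + t * t / df))


normal = NormalDist()
print(f"{'df':>6} {'density at 0':>14} {'density at 3':>14}")
for df in (1, 5, 30, 1000):
    print(f"{df:>6} {t_density(0, df):>14.4f} {t_density(3, df):>14.4f}")
print(f"{'normal':>6} {normal.pdf(0):>14.4f} {normal.pdf(3):>14.4f}")
```

With small df the t distribution is flatter in the middle and heavier in the tails, which is why its critical values are larger than the corresponding z values.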


HYPOTHESIS TESTS WITH t STATISTIC

The goal is to use a sample from the treated population (a treated sample) as the basis for determining whether or not the treatment has any effect.

[Diagram: Known population before treatment (μ = 30) → TREATMENT → Unknown population after treatment (μ = ?)]


HYPOTHESIS TESTS WITH t STATISTIC

As always, the null hypothesis states that the treatment has no effect; specifically, H0 states that the population mean is unchanged.

The sample data provide a specific value for the sample mean; the variance and estimated standard error are computed.

$$t = \frac{\text{sample mean (from data)} - \text{population mean (hypothesized from } H_0\text{)}}{\text{estimated standard error (computed from the sample data)}}$$


A psychologist has prepared an “Optimism Test” that is administered yearly to graduating college seniors. The test measures how each graduating class feels about its future. The higher the score, the more optimistic the class. Last year’s class had a mean score of μ = 19. A sample of n = 9 seniors from this year’s class was selected and tested. The scores for these seniors are as follows:

19 24 23 27 19 20 27 21 18

On the basis of this sample, can the psychologist conclude that this year’s class has a different level of optimism than last year’s class?

LEARNING CHECK


STEP-1: State the Hypothesis, and select an alpha level

H0 : μ = 19 (there is no change)

H1 : μ ≠ 19 (this year’s mean is different)

In this example we use α = .05, two-tailed.


STEP-2: Locate the critical region

Remember that for a hypothesis test with the t statistic, we must consult the t distribution table to find the critical t value. With a sample of n = 9 students, the t statistic will have degrees of freedom equal to

df = n – 1 = 9 – 1 = 8

For a two-tailed test with α = .05 and df = 8, the critical values are t = ± 2.306. The obtained t value must be more extreme than either of these critical values to reject H0.


STEP-3: Obtain the sample data, and compute the test statistic

Find the sample mean

Find the sample variance

Find the estimated standard error $s_M$

Find the t statistic

$$s_M = \frac{s}{\sqrt{n}} \qquad t = \frac{M - \mu}{s_M}$$
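These four computations can be sketched in a few lines for the nine optimism scores listed in the exercise. The critical value 2.306 is the table value for df = 8 (two-tailed, α = .05); the variable names are illustrative. For these scores the sample mean is 22 and t comes out at roughly 2.63.

```python
from math import sqrt
from statistics import mean, variance

scores = [19, 24, 23, 27, 19, 20, 27, 21, 18]
mu_0 = 19                      # population mean stated in H0
t_critical = 2.306             # t table: df = 8, alpha = .05, two-tailed

n = len(scores)
df = n - 1
sample_mean = mean(scores)
sample_variance = variance(scores)                # s^2 = SS / (n - 1)
estimated_standard_error = sqrt(sample_variance / n)   # s_M = s / sqrt(n)
t = (sample_mean - mu_0) / estimated_standard_error

reject_h0 = abs(t) > t_critical
print(f"M = {sample_mean}, s^2 = {sample_variance:.2f}, s_M = {estimated_standard_error:.2f}")
print(f"t({df}) = {t:.2f}, reject H0: {reject_h0}")
```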


STEP-4: Make a decision about H0, and state conclusion

For the nine scores listed, M = 22 and the estimated standard error is 1.14, so the obtained t statistic (t = 2.63) is in the critical region. Thus our sample data are unusual enough to reject the null hypothesis at the .05 level of significance.

We can conclude that there is a significant difference in level of optimism between this year’s and last year’s graduating classes, t(8) = 2.63, p < .05, two-tailed.


[Figure: The critical region in the t distribution for α = .05 and df = 8. Reject H0 in either tail, beyond t = -2.306 or t = +2.306; fail to reject H0 between these values.]


DIRECTIONAL HYPOTHESES AND ONE-TAILED TEST

The nondirectional (two-tailed) test is more commonly used than the directional (one-tailed) alternative.

On the other hand, a directional test may be used in some research situations, such as exploratory investigations or pilot studies, or when there is a priori justification (for example, a theory or previous findings).


A fund raiser for a charitable organization has set a goal of averaging at least $25 per donation. To see if the goal is being met, a random sample of recent donations is selected. The data for this sample are as follows:

20 50 30 25 15 20 40 50 10 20

LEARNING CHECK


[Figure: The critical region in the t distribution for a one-tailed test with α = .05 and df = 9. Reject H0 beyond t = 1.833; fail to reject H0 otherwise.]


Chapter 10: THE t TEST FOR TWO INDEPENDENT SAMPLES


OVERVIEW

Although single-sample techniques are used occasionally in real research, most research studies require the comparison of two (or more) sets of data.

There are two general research strategies that can be used to obtain the two sets of data to be compared:

○ The two sets of data come from two completely separate samples (independent-measures or between-subjects design)

○ The two sets of data both come from the same sample (repeated-measures or within-subjects design)


Do the achievement scores for students taught by method A differ from the scores for students taught by method B? In statistical terms, are the two population means the same or different?

[Diagram: Population taught by Method A (unknown µ = ?) → Sample A; population taught by Method B (unknown µ = ?) → Sample B]


THE HYPOTHESES FOR AN INDEPENDENT-MEASURES TEST

The goal of an independent-measures research study is to evaluate the mean difference between two populations (or between two treatment conditions).

H0: µ1 - µ2 = 0 (No difference between the population means)

H1: µ1 - µ2 ≠ 0 (There is a mean difference)


THE FORMULA FOR AN INDEPENDENT-MEASURES HYPOTHESIS TEST

$$t = \frac{\text{sample mean difference} - \text{population mean difference}}{\text{estimated standard error}} = \frac{(M_1 - M_2) - (\mu_1 - \mu_2)}{s_{(M_1 - M_2)}}$$

In this formula, the value of M1 – M2 is obtained from the sample data and the value for µ1 - µ2 comes from the null hypothesis.

The null hypothesis sets the population mean difference equal to zero, so the independent-measures t formula can be simplified further:

$$t = \frac{M_1 - M_2}{s_{(M_1 - M_2)}}$$


THE STANDARD ERROR

To develop the formula for $s_{(M_1 - M_2)}$ we will consider the following points:

Each of the two sample means represents its own population mean, but in each case there is some error.

$$s_M = \sqrt{\frac{s^2}{n}} \qquad s_{(M_1 - M_2)} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$


POOLED VARIANCE

This standard error formula is limited to situations in which the two samples are exactly the same size (that is, n1 = n2).

In situations in which the two sample sizes are different, the formula is biased and, therefore, inappropriate.

The bias comes from the fact that the formula treats the two sample variances equally.


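POOLED VARIANCE

For the independent-measures t statistic, there are two SS values and two df values (one from each sample). They are combined into the pooled variance, which replaces the individual sample variances in the standard error:

$$s_p^2 = \frac{SS_1 + SS_2}{df_1 + df_2} \qquad s_{(M_1 - M_2)} = \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$$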
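A short sketch shows why pooling matters when the samples differ in size. It assumes the usual pooled-variance definition, (SS1 + SS2) / (df1 + df2); the unequal-sample numbers are HYPOTHETICAL and the function names are illustrative.

```python
from math import sqrt


def pooled_variance(ss1, n1, ss2, n2):
    """Weight each sample's variability by its degrees of freedom."""
    return (ss1 + ss2) / ((n1 - 1) + (n2 - 1))


def standard_error_of_difference(ss1, n1, ss2, n2):
    sp2 = pooled_variance(ss1, n1, ss2, n2)
    return sqrt(sp2 / n1 + sp2 / n2)


# HYPOTHETICAL samples: n1 = 4 with SS1 = 30 (s1^2 = 10), n2 = 10 with SS2 = 36 (s2^2 = 4)
simple_average = (30 / 3 + 36 / 9) / 2     # treats both variances equally
pooled = pooled_variance(30, 4, 36, 10)    # leans toward the larger sample

print(f"simple average of variances = {simple_average}")
print(f"pooled variance             = {pooled}")
```

The pooled value sits closer to the variance of the larger sample, which carries more information about the population.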


HYPOTHESIS TEST WITH THE INDEPENDENT-MEASURES t STATISTIC

In a study of jury behavior, two samples of participants were provided details about a trial in which the defendant was obviously guilty. Although Group-2 received the same details as Group-1, the second group was also told that some evidence had been withheld from the jury by the judge. Later, participants were asked to recommend a jail sentence. The length of term suggested by each participant is presented. Is there a significant difference between the two groups in their responses?


THE LENGTH OF TERM SUGGESTED BY EACH PARTICIPANT

Group-1 scores: 4 4 3 2 5 1 1 4
Group-2 scores: 3 7 8 5 4 7 6 8

There are two separate samples in this study. Therefore the analysis will use the independent-measures t test.


STEP-1: State the Hypothesis, and select an alpha level

H0 : μ1 - μ2 = 0 (for the population, knowing evidence has been withheld has no effect on the suggested sentence)

H1 : μ1 - μ2 ≠ 0 (for the population, knowledge of withheld evidence has an effect on the jury’s response)

We will set α = .05, two-tailed.


STEP-2: Identify the critical region

For the independent-measures t statistic, degrees of freedom are determined by

df = n1 + n2 – 2 = 8 + 8 – 2 = 14

Consulting the t distribution table for a two-tailed test with α = .05 and df = 14, the critical values are t = ± 2.145.

The obtained t value must be more extreme than either of these critical values to reject H0


STEP-3: Compute the test statistic

Find the sample mean for each group: M1 = 3 and M2 = 6

Find the SS for each group: SS1 = 16 and SS2 = 24

Find the pooled variance: $s_p^2 = 2.86$

Find the estimated standard error: $s_{(M_1 - M_2)} = 0.85$


STEP-3: Compute the t statistic

$$t = \frac{M_1 - M_2}{s_{(M_1 - M_2)}} = \frac{-3}{0.85} = -3.53$$
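The whole computation can be reproduced from the raw jury scores. This is a sketch under the pooled-variance assumptions used in this chapter; the function name `independent_measures_t` is illustrative. Carrying full precision gives a standard error of about 0.845 and t of about -3.55; the value -3.53 results from rounding the standard error to 0.85 first.

```python
from math import sqrt
from statistics import mean

group_1 = [4, 4, 3, 2, 5, 1, 1, 4]
group_2 = [3, 7, 8, 5, 4, 7, 6, 8]


def sum_of_squares(scores):
    """SS: sum of squared deviations from the sample mean."""
    m = mean(scores)
    return sum((x - m) ** 2 for x in scores)


def independent_measures_t(sample_1, sample_2):
    n1, n2 = len(sample_1), len(sample_2)
    ss1, ss2 = sum_of_squares(sample_1), sum_of_squares(sample_2)
    df = n1 + n2 - 2
    pooled_variance = (ss1 + ss2) / df
    standard_error = sqrt(pooled_variance / n1 + pooled_variance / n2)
    t = (mean(sample_1) - mean(sample_2)) / standard_error
    return {"ss1": ss1, "ss2": ss2, "df": df, "pooled_variance": pooled_variance,
            "standard_error": standard_error, "t": t}


result = independent_measures_t(group_1, group_2)
t_critical = 2.145   # t table: df = 14, alpha = .05, two-tailed
print(f"t({result['df']}) = {result['t']:.2f}, reject H0: {abs(result['t']) > t_critical}")
```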


STEP-4: Make a decision about H0, and state conclusion

The obtained t statistic (t = -3.53) is in the critical region in the left tail (critical t = ± 2.145). Therefore, the null hypothesis is rejected.

The participants who were informed about the withheld evidence gave significantly longer sentences, t(14) = -3.53, p < .05, two-tailed.


[Figure: The critical region in the t distribution for α = .05 and df = 14. Reject H0 in either tail, beyond t = -2.145 or t = +2.145; fail to reject H0 between these values.]


LEARNING CHECK

The following data are from two separate independent-measures experiments. Without doing any calculation, which experiment is more likely to demonstrate a significant difference between treatment A and B? Explain your answer.

        EXPERIMENT A                  EXPERIMENT B
        Treatment A   Treatment B     Treatment A   Treatment B
n       10            10              10            10
M       42            52              61            71
SS      180           120             986           1042


A psychologist studying human memory would like to examine the process of forgetting. One group of participants is required to memorize a list of words in the evening just before going to bed. Their recall is tested 10 hours later in the morning. Participants in the second group memorize the same list of words in the morning, and then their memories are tested 10 hours later after being awake all day.

LEARNING CHECK


LEARNING CHECK

The psychologist hypothesizes that there will be less forgetting during sleep than during a busy day. The recall scores for two samples of college students are as follows:

Asleep Scores      Awake Scores
15 13 14 14        15 13 14 12
16 15 16 15        14 13 11 12
16 15 17 14        13 13 12 14


Sketch a frequency distribution for the ‘asleep’ group. On the same graph (in a different color), sketch the distribution for the ‘awake’ group. Just by looking at these two distributions, would you predict a significant difference between the two treatment conditions?

Use the independent-measures t statistic to determine whether there is a significant difference between the treatments. Conduct the test with α = .05.

LEARNING CHECK
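One way to check both parts of this exercise is sketched below. It assumes that the first four scores in each row of the table belong to the asleep group and the last four to the awake group, and that the test is two-tailed (the exercise gives only α = .05); 2.074 is the t table value for df = 22. The text histogram stands in for the hand-drawn sketch.

```python
from collections import Counter
from math import sqrt
from statistics import mean

asleep = [15, 13, 14, 14, 16, 15, 16, 15, 16, 15, 17, 14]
awake = [15, 13, 14, 12, 14, 13, 11, 12, 13, 13, 12, 14]


def text_histogram(label, scores, low, high):
    """Print one row of asterisks per score value."""
    counts = Counter(scores)
    print(label)
    for value in range(low, high + 1):
        print(f"  {value:>2} | {'*' * counts[value]}")


def sum_of_squares(scores):
    m = mean(scores)
    return sum((x - m) ** 2 for x in scores)


low, high = min(asleep + awake), max(asleep + awake)
text_histogram("Asleep", asleep, low, high)
text_histogram("Awake", awake, low, high)

n1, n2 = len(asleep), len(awake)
df = n1 + n2 - 2
pooled_variance = (sum_of_squares(asleep) + sum_of_squares(awake)) / df
standard_error = sqrt(pooled_variance / n1 + pooled_variance / n2)
t = (mean(asleep) - mean(awake)) / standard_error

t_critical = 2.074   # t table: df = 22, alpha = .05, two-tailed (assumed)
print(f"t({df}) = {t:.2f}, reject H0: {abs(t) > t_critical}")
```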