QNT 531 Advanced Problems in Statistics and Research Methods WORKSHOP 2 By Dr. Serhat Eren University OF PHOENIX

Upload: theresa-osborne

Post on 11-Jan-2016


SECTION 2

ANALYSIS OF VARIANCE AND EXPERIMENTAL DESIGN

SECTION OBJECTIVES

An Introduction to Analysis of Variance
Analysis of Variance: Testing for the Equality of k Population Means
Multiple Comparison Procedures
An Introduction to Experimental Design
Completely Randomized Designs
Randomized Block Design
Factorial Experiment

ANALYSIS OF DATA FROM ONE-WAY DESIGNS

One-Way Designs: The Basics

A factor is a variable that can be used to differentiate one group or population from another. It is a variable that may be related to the variable of interest.

A level is one of several possible values or settings that the factor can assume.

The response variable is a quantitative variable that you are measuring or observing.

These are all examples of one-way or completely randomized designs.

An experiment has a one-way or completely randomized design if there are several different levels of one factor being studied and the objects or people being observed/measured are randomly assigned to one of the levels of the factor.

The term one-way refers to the fact that the groups differ with regard to the one factor being studied.

The term completely randomized refers to the fact that individual observations are assigned to the groups in a random manner.


Understanding the Total Variation

Analysis of variance (ANOVA) is the technique used to analyze the variation in the data to determine whether the means of more than two populations are equal.

A treatment is a particular setting or combination of settings of the factor(s).

The grand mean or the overall mean is the sample average of all the observations in the experiment. It is labeled $\bar{\bar{x}}$ (x-bar-bar).

Now we can rewrite the variance calculations as follows:

$$s^2 = \frac{\sum_{j=1}^{c}\sum_{i=1}^{n_j}\left(x_{ij}-\bar{\bar{x}}\right)^2}{n-1}$$

Here $x_{ij}$ is the $i$th observation in group $j$, $c$ is the number of groups, $n_j$ is the number of observations in group $j$, and $n$ is the total number of observations.

The total variation or sum of squares total (SST) is a measure of the variability in the entire data set considered as a whole.

SST is calculated as follows:

$$SST = \sum_{j=1}^{c}\sum_{i=1}^{n_j}\left(x_{ij}-\bar{\bar{x}}\right)^2$$
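As a minimal sketch, the grand mean and SST can be computed directly from their definitions. The data values and the function names `grand_mean` and `sst` are hypothetical, chosen only for illustration; they do not come from the workshop.

```python
from statistics import mean

# Hypothetical data: response values for c = 3 treatment groups.
groups = [[4, 5, 6], [7, 8, 9], [10, 11, 12]]

def grand_mean(groups):
    """Average of all observations, ignoring group membership."""
    return mean(x for g in groups for x in g)

def sst(groups):
    """Total variation: squared deviations of every observation from the grand mean."""
    gm = grand_mean(groups)
    return sum((x - gm) ** 2 for g in groups for x in g)

print("grand mean =", grand_mean(groups))
print("SST =", sst(groups))
```

Dividing this SST by n - 1 gives the overall sample variance shown in the variance formula.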

Components of Total Variation

The between groups variation is also called the sum of squares between or the sum of squares among, and it measures how much of the total variation comes from actual differences in the treatments.

The dot-plot shown in Figure 14.3 displays the sample average for each of the four time treatments. These are called treatment means.

A treatment mean is the average of the response variable for a particular treatment.

Between Groups Variation measures how different the individual treatment means are from the overall grand mean. It is often called the sum of squares between or the sum of squares among (SSA).


The formula for sum of squares among (SSA) is:

$$SSA = \sum_{j=1}^{c} n_j\left(\bar{x}_j-\bar{\bar{x}}\right)^2$$

Within groups variation measures the variability in the measurements within the groups. It is often called sum of squares within or sum of squares error (SSE).

$$SSE = \sum_{j=1}^{c}\sum_{i=1}^{n_j}\left(x_{ij}-\bar{x}_j\right)^2$$
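The two components can be checked numerically. The sketch below uses hypothetical data and hypothetical function names (`ssa`, `sse`, `sst`), and confirms that for a one-way design the between groups and within groups variation add up to the total variation.

```python
from statistics import mean

# Hypothetical data: one inner list per treatment (c = 3 treatments).
groups = [[4, 5, 6], [7, 8, 9], [10, 11, 12]]

def ssa(groups):
    """Between groups variation: n_j * (treatment mean - grand mean)^2."""
    gm = mean(x for g in groups for x in g)
    return sum(len(g) * (mean(g) - gm) ** 2 for g in groups)

def sse(groups):
    """Within groups variation: squared deviations from each treatment mean."""
    total = 0
    for g in groups:
        m = mean(g)
        total += sum((x - m) ** 2 for x in g)
    return total

def sst(groups):
    """Total variation around the grand mean."""
    gm = mean(x for g in groups for x in g)
    return sum((x - gm) ** 2 for g in groups for x in g)

print("SSA =", ssa(groups), "SSE =", sse(groups), "SST =", sst(groups))
```

With these numbers most of the total variation sits between the treatments, which is the pattern that leads to a large F statistic.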

The Mean Square Terms in the ANOVA Table

The mean square among is labeled MSA, the mean square error is labeled MSE, and the mean square total is labeled MST.

The formulas for the mean squares are:

$$MSA = \frac{SSA}{c-1} \qquad MSE = \frac{SSE}{n-c} \qquad MST = \frac{SST}{n-1}$$
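A short sketch of the mean square calculations follows. The function name `mean_squares` and the input values are hypothetical; each sum of squares is simply divided by its degrees of freedom.

```python
def mean_squares(ssa, sse, c, n):
    """Return MSA, MSE and MST for c treatments and n total observations."""
    sst = ssa + sse  # one-way partition of the total variation
    return {
        "MSA": ssa / (c - 1),
        "MSE": sse / (n - c),
        "MST": sst / (n - 1),
    }

# Hypothetical sums of squares for c = 3 treatments and n = 9 observations.
ms = mean_squares(ssa=54, sse=6, c=3, n=9)
print(ms)
```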

Testing the Hypothesis of Equal Means

In general, the null and alternative hypotheses for a one-way designed experiment are shown below:

$$H_0: \mu_1 = \mu_2 = \cdots = \mu_c$$

$H_A$: At least one of the population means is different from the others.

The F test statistic is calculated by taking the ratio of two sample variances:

$$F = \frac{s_1^2}{s_2^2}$$

In ANOVA, MSA and MSE are our two sample variances. So the F statistic is calculated as:

$$F = \frac{MSA}{MSE}$$
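Putting the pieces together, the F statistic can be computed from raw data in a few lines. This is only a sketch with hypothetical data and a hypothetical function name `f_statistic`; deciding whether to reject the null hypothesis would still require comparing the result with a critical value of the F distribution with c - 1 and n - c degrees of freedom, which is not shown here.

```python
from statistics import mean

def f_statistic(groups):
    """One-way ANOVA F statistic, F = MSA / MSE."""
    c = len(groups)                      # number of treatments
    n = sum(len(g) for g in groups)      # total number of observations
    gm = mean(x for g in groups for x in g)
    ssa = sum(len(g) * (mean(g) - gm) ** 2 for g in groups)
    sse = 0
    for g in groups:
        m = mean(g)
        sse += sum((x - m) ** 2 for x in g)
    return (ssa / (c - 1)) / (sse / (n - c))

# Hypothetical data for three treatments.
groups = [[4, 5, 6], [7, 8, 9], [10, 11, 12]]
print("F =", f_statistic(groups))
```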

ASSUMPTIONS OF ANOVA

The three major assumptions of ANOVA are as follows:

1. The errors are random and independent of each other.
2. Each population has a normal distribution.
3. All of the populations have the same variance.

ANALYSIS OF DATA FROM BLOCKED DESIGNS

A block is a group of objects or people that have been matched. An object or person can be matched with itself, meaning that repeated observations are taken on that object or person and these observations form a block.

If the realities of data collection lead you to use blocks, then you must take this into account in your analysis. Your experimental design is called a randomized block design. Instead of using a one-way ANOVA, you must use a block ANOVA.

An experiment has a randomized block design if several different levels of one factor are being studied and the objects or people being observed/measured have been matched.

Each object or person is randomly assigned to one of the c levels of the factor.

Partitioning the Total Variation

Like the approach we took with data from a one-way design, the idea is to take the total variability as measured by SST and break it down into its components.

With a block design there is one additional component: the variability between the blocks. It is called the sum of squares blocks and is labeled SSBL.

The sum of squares blocks measures the variability between the blocks. It is labeled SSBL.

For a block design, the variation we see in the data is due to one of three things: the level of the factor, the block, or the error.


Thus, the total variation is divided into three components:

SST = SSA + SSBL + SSE
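The three-way partition can be verified on a small example. The sketch below assumes one observation per block and factor level, and uses the standard textbook formulas for SSA, SSBL and SSE, which are not spelled out in the extracted slides. The data and the function name `block_anova_sums` are hypothetical.

```python
from statistics import mean

# Hypothetical data: rows are blocks, columns are the c levels of the factor.
data = [
    [10, 12, 14],
    [11, 14, 14],
    [15, 16, 20],
]

def block_anova_sums(data):
    """Sums of squares for a randomized block design, one observation per cell."""
    b, c = len(data), len(data[0])
    gm = mean(x for row in data for x in row)
    block_means = [mean(row) for row in data]
    level_means = [mean(row[j] for row in data) for j in range(c)]
    sst = sum((x - gm) ** 2 for row in data for x in row)
    ssa = b * sum((m - gm) ** 2 for m in level_means)
    ssbl = c * sum((m - gm) ** 2 for m in block_means)
    # Error: what is left after removing the level and block effects.
    sse = sum(
        (data[i][j] - block_means[i] - level_means[j] + gm) ** 2
        for i in range(b)
        for j in range(c)
    )
    return {"SST": sst, "SSA": ssa, "SSBL": ssbl, "SSE": sse}

sums = block_anova_sums(data)
print(sums)
```

In this example a large share of the total variation is absorbed by the blocks, which leaves a much smaller SSE than a one-way analysis of the same numbers would.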


Using the ANOVA Table in a Block Design

The ANOVA table for such a block design looks just like the ANOVA table for a one-way design with an additional row.

ANALYSIS OF DATA FROM TWO-WAY DESIGNS

Motivation for a Factorial Design Model

An experimental design is called a factorial design with two factors if there are several different levels of two factors being studied.

The first factor is called factor A and there are r levels of factor A. The second factor is called factor B and there are c levels of factor B.


The design is said to have equal replication if the same number of objects or people being observed/measured are randomly selected from each population.

The population is described by a specific level for each of the two factors. Each observation is called a replicate.

There are n' observations or replicates observed from each population. There are n = n'rc observations in total.

Partitioning the Variation

The sum of squares due to factor A is labeled SSA. It measures the squared differences between the mean of each level of factor A and the grand mean.

The sum of squares due to factor B is labeled SSB. It measures the squared differences between the mean of each level of factor B and the grand mean.

The sum of squares due to the interacting effect of A and B is labeled SSAB. It measures the effect of combining factor A and factor B.

The sum of squares error is labeled SSE. It measures the variability in the measurements within the groups.

Thus, the total variation is divided into four components:

SST = SSA + SSB + SSAB + SSE
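A small numerical sketch of the four-way partition follows. It assumes equal replication (the same n' in every cell) and uses the standard textbook formulas for each component, which are not written out in the extracted slides. The data and the function name `two_way_sums` are hypothetical.

```python
from statistics import mean

# Hypothetical data: cells[i][j] holds the n' replicates for
# level i of factor A (r = 2 rows) and level j of factor B (c = 2 columns).
cells = [
    [[1, 3], [5, 7]],
    [[4, 6], [12, 14]],
]

def two_way_sums(cells):
    """Sums of squares for a two-factor design with equal replication."""
    r, c, n_rep = len(cells), len(cells[0]), len(cells[0][0])
    all_obs = [x for row in cells for cell in row for x in cell]
    gm = mean(all_obs)
    cell_means = [[mean(cell) for cell in row] for row in cells]
    a_means = [mean(cell_means[i]) for i in range(r)]
    b_means = [mean(cell_means[i][j] for i in range(r)) for j in range(c)]
    ssa = c * n_rep * sum((m - gm) ** 2 for m in a_means)
    ssb = r * n_rep * sum((m - gm) ** 2 for m in b_means)
    # Interaction: the part of each cell mean not explained by A and B alone.
    ssab = n_rep * sum(
        (cell_means[i][j] - a_means[i] - b_means[j] + gm) ** 2
        for i in range(r)
        for j in range(c)
    )
    sse = sum(
        (x - cell_means[i][j]) ** 2
        for i in range(r)
        for j in range(c)
        for x in cells[i][j]
    )
    sst = sum((x - gm) ** 2 for x in all_obs)
    return {"SSA": ssa, "SSB": ssb, "SSAB": ssab, "SSE": sse, "SST": sst}

sums = two_way_sums(cells)
print(sums)
```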


Using the ANOVA Table in a Two-Way Design

The ANOVA table for such a design looks just like the ANOVA table for a one-way design with two additional rows.

In a two-way ANOVA, three hypothesis tests should be done.

1. To test the hypothesis of no difference due to factor A we would have the following null and alternative hypotheses:

H0: There is no difference in the population means due to factor A.
HA: There is a difference in the population means due to factor A.

2. To test the hypothesis of no difference due to factor B we would have the following null and alternative hypotheses:

H0: There is no difference in the population means due to factor B.
HA: There is a difference in the population means due to factor B.

3. To test the hypothesis of no difference due to the interaction of factors A and B we would have the following null and alternative hypotheses:

H0: There is no difference in the population means due to the interaction of factors A and B.
HA: There is a difference in the population means due to the interaction of factors A and B.

Understanding the Interaction Effect

The easiest way to understand this effect is to look at a graph of the sample averages for each of the possible combinations of the two factors.

The line graph shown in Figure 14.7 displays the 20 sample means for airspace.

From this graph you can see that the mean airspace decreases the longer the box sits on the shelf, regardless of the position in the hardroll from which the box was made.

The airspace behavior is affected by the interaction of the time on the shelf and the position in the hardroll from which it was made.

If there were no interaction effect, the lines connecting the sample means would be parallel as in Figure 14.8.
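The idea of parallel lines can be expressed as a simple check on a table of cell means: if the gap between two lines is the same at every level of the other factor, the lines are parallel. The sketch below uses hypothetical cell means and hypothetical function names (`row_gaps`, `lines_parallel`). With real sample means the lines are rarely exactly parallel, so this is only a descriptive aid; the formal decision still rests on the F test for interaction.

```python
def row_gaps(cell_means):
    """Differences between consecutive rows of cell means, column by column."""
    return [
        [lower - upper for upper, lower in zip(row_a, row_b)]
        for row_a, row_b in zip(cell_means, cell_means[1:])
    ]

def lines_parallel(cell_means, tol=1e-9):
    """True when the gap between each pair of adjacent lines is constant."""
    return all(max(g) - min(g) <= tol for g in row_gaps(cell_means))

# Each row is one line on the graph (one level of the first factor);
# each column is one level of the second factor.
no_interaction = [[2, 6], [5, 9]]     # gap is 3 at both columns
interaction = [[2, 6], [5, 13]]       # gap grows from 3 to 7

print(lines_parallel(no_interaction), lines_parallel(interaction))
```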


MULTIPLE COMPARISON PROCEDURE

When we use analysis of variance to test whether the means of k populations are equal, rejection of the null hypothesis allows us to conclude only that the population means are not all equal.

In some cases we will want to go a step further and determine where the differences among means occur.

The purpose of this section is to introduce two multiple comparison procedures that can be used to conduct statistical comparisons between pairs of population means.


2.3.1 FISHER’S LSD

Suppose that analysis of variance has provided statistical evidence to reject the null hypothesis of equal population means.

In this case, Fisher’s least significant difference (LSD) procedure can be used to determine where the differences occur.

Confidence Interval Estimate of the Difference Between Two Population Means Using Fisher’s LSD Procedure:

$$(\bar{x}_1 - \bar{x}_2) \pm LSD$$

where

$$LSD = t_{\alpha/2}\sqrt{MSE\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}$$
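The interval can be sketched in code as follows. The function names `fisher_lsd` and `lsd_interval` and all input values are hypothetical. The critical value is passed in as an argument because it has to be looked up in a t table; the sketch assumes it is taken with the error degrees of freedom from the ANOVA table.

```python
from math import sqrt

def fisher_lsd(t_crit, mse, n1, n2):
    """Least significant difference for one pair of treatment means."""
    return t_crit * sqrt(mse * (1 / n1 + 1 / n2))

def lsd_interval(mean1, mean2, t_crit, mse, n1, n2):
    """Confidence interval for the difference mean1 - mean2."""
    diff = mean1 - mean2
    lsd = fisher_lsd(t_crit, mse, n1, n2)
    return diff - lsd, diff + lsd

# Hypothetical inputs: MSE = 1 with 6 error degrees of freedom, so the
# two-tailed 5% critical value from a t table is about 2.447.
low, high = lsd_interval(mean1=5, mean2=8, t_crit=2.447, mse=1, n1=3, n2=3)
print(low, high)
```

If the interval does not contain zero, as in this example, the two population means are judged to be different at the chosen level of significance.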

TYPE I ERROR RATES

We showed how Fisher’s LSD procedure can be used in such cases to determine where the differences occur.

Technically, it is referred to as a protected or restricted LSD test because it is employed only if we first find a significant F value by using analysis of variance.

To see why this distinction is important in multiple comparison tests, we need to explain the difference between a comparisonwise Type I error rate and an experimentwise Type I error rate.


For example, in the NCP example Fisher’s LSD procedure was used to make three pairwise comparisons.

TEST 1: $H_0: \mu_1 = \mu_2$ versus $H_a: \mu_1 \neq \mu_2$
TEST 2: $H_0: \mu_1 = \mu_3$ versus $H_a: \mu_1 \neq \mu_3$
TEST 3: $H_0: \mu_2 = \mu_3$ versus $H_a: \mu_2 \neq \mu_3$

In each case, we used a level of significance of $\alpha$ = 0.05.

Therefore, for each test, if the null hypothesis is true, the probability that we will make a Type I error is $\alpha$ = 0.05; hence, the probability that we will not make a Type I error on each test is 1 - 0.05 = 0.95.

In discussing multiple comparison procedures we refer to this probability of a Type I error ($\alpha$ = 0.05) as the comparisonwise Type I error rate; comparisonwise Type I error rates indicate the level of significance associated with a single pairwise comparison.

Let us now consider a slightly different question. What is the probability that in making three pairwise comparisons, we will commit a Type I error on at least one of the three tests?

To answer this question, note that the probability that we will not make a Type I error on any of the three tests is:

(0.95)(0.95)(0.95) = 0.8574

Therefore, the probability of making at least one Type I error is:

1 - 0.8574 = 0.1426

Thus, when we use Fisher’s LSD procedure to make all three pairwise comparisons, the Type I error rate associated with this approach is not 0.05, but actually 0.1426; we refer to this error rate as the overall or experimentwise Type I error rate.
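The same arithmetic generalizes to any number of comparisons. The sketch below uses a hypothetical function name, `experimentwise_error_rate`, and follows the calculation in the text, which multiplies the per-test probabilities and so treats the tests as independent.

```python
from math import comb

def experimentwise_error_rate(alpha, num_tests):
    """P(at least one Type I error) across num_tests tests at level alpha."""
    return 1 - (1 - alpha) ** num_tests

k = 3                      # number of population means being compared
num_tests = comb(k, 2)     # all pairwise comparisons: k(k - 1) / 2
rate = experimentwise_error_rate(0.05, num_tests)
print(num_tests, round(rate, 4))
```

Because the number of pairwise comparisons grows quickly with the number of populations, the experimentwise rate rises well above the comparisonwise rate as more means are compared.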
