National Guard Black Belt Training Module 33: Hypothesis Testing Basics

Upload: leanleadersorg

Post on 29-Jan-2015


Page 1: NG BB 33 Hypothesis Testing Basics

UNCLASSIFIED / FOUO

National Guard Black Belt Training

Module 33

Hypothesis Testing Basics

Page 2: NG BB 33 Hypothesis Testing Basics

CPI Roadmap – Analyze

Note: Activities and tools vary by project. Lists provided here are not necessarily all-inclusive.

TOOLS

• Value Stream Analysis

• Process Constraint ID

• Takt Time Analysis

• Cause and Effect Analysis

• Brainstorming

• 5 Whys

• Affinity Diagram

• Pareto

• Cause and Effect Matrix

• FMEA

• Hypothesis Tests

• ANOVA

• Chi Square

• Simple and Multiple Regression

ACTIVITIES

• Identify Potential Root Causes

• Reduce List of Potential Root Causes

• Confirm Root Cause to Output Relationship

• Estimate Impact of Root Causes on Key Outputs

• Prioritize Root Causes

• Complete Analyze Tollgate

8-STEP PROCESS

1. Validate the Problem

2. Identify Performance Gaps

3. Set Improvement Targets

4. Determine Root Cause

5. Develop Counter-Measures

6. See Counter-Measures Through

7. Confirm Results & Process

8. Standardize Successful Processes

DMAIC phases: Define, Measure, Analyze, Improve, Control

Page 3: NG BB 33 Hypothesis Testing Basics

Learning Objectives

Review the terms “Parameters” and “Statistics” as they relate to Populations and Samples.

Introduce Confidence Intervals (CIs) for expressing the uncertainty when estimating a population parameter from a sample statistic, and show how to calculate CIs for some common situations and different sample sizes.

Show how the Central Limit Theorem and the Standard Error of the Mean apply to the use of Confidence Intervals and Tests.

Page 4: NG BB 33 Hypothesis Testing Basics

Learning Objectives (Cont.)

Introduce some common statistical tests and the t-distribution used in testing.

Learn how to use Hypothesis Testing, applied in Minitab, to demonstrate a statistically significant difference in process performance.

Understand the tradeoffs and influences of sample sizes on statistical tests.

Apply knowledge of different classes of statistical errors to the decisions used in process improvement to minimize risk.

Page 5: NG BB 33 Hypothesis Testing Basics

Application Examples

Transactional – A Black Belt has just finished a pilot of a new process for handling blanket Purchase Orders and wants to know if it has a statistically significant: a) shorter cycle time and b) increased accuracy over the old process.

Administrative – The manager of an AAFES¹ order entry department wants to compare two order entry procedures to see if one is faster than the other.

Service – Medical diagnostic imaging services are provided from two different medical treatment facilities to a central hospital which wants to know if there are differences in the quality of service, particularly: a) the number of lost records and re-takes, and b) average waiting time for MRIs and X-rays.

¹ AAFES: Army and Air Force Exchange Service

Page 6: NG BB 33 Hypothesis Testing Basics

Since it is not always practical or possible to measure/query every item/person in the population, you take a random sample.

Population vs. Sample

• Sample: 25 appraisals chosen at random from a given month. Population: all appraisals completed that month.

• Sample: 3,000 people are given a new treatment in a clinical study. Population: all sufferers of a certain disease who might be given the new treatment.

• Sample: 10,000 people are asked who they will vote for as President. Population: all U.S. registered voters.

Page 7: NG BB 33 Hypothesis Testing Basics

Terms and Labels: Population vs. Sample

Term              Population = Parameter    Sample = Statistic
Count of items    N                         n
Mean              μ                         x-bar
Median            μ-tilde                   x-tilde
Standard Dev.     σ                         s

Estimators: x-bar estimates μ, and s estimates σ.

Page 8: NG BB 33 Hypothesis Testing Basics

Population Parameters vs. Sample Statistics

Population Parameters: Mean, μ (mu), and Standard Deviation, σ (sigma)

Sample Statistics: Mean, x-bar, and Standard Deviation, s

[Figure: a population with parameters (μ, σ) and four random samples of size n = 4, each with its own statistics: (x-bar1, s1), (x-bar2, s2), (x-bar3, s3), (x-bar4, s4).]

Page 9: NG BB 33 Hypothesis Testing Basics

Central Limit Theorem

If: x1, x2, …, xn are independent measurements (i.e., a random sample of size n) from a population where the mean of x is μ and the standard deviation of x is σ,

Then: the distribution of x-bar, where

$$\bar{X} = \frac{X_1 + X_2 + X_3 + \cdots + X_n}{n}$$

has mean and standard deviation given by:

$$\mu_{\bar{X}} = \mu \quad \text{and} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

The quantity σ/√n is the Standard Error of the Mean.

In addition, when n is sufficiently large, the distribution of x-bar is approximately normal (“bell-shaped curve”). More on sample sizes later...
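The Standard Error of the Mean can be checked with a small simulation. The sketch below is only an illustration: the population values (mean 65, standard deviation 4), the sample size of 25, the number of trials, and the function name `simulate_sample_means` are all assumptions chosen for the example, not part of this slide.

```python
import math
import random
import statistics


def simulate_sample_means(mu, sigma, n, trials, seed=0):
    """Draw `trials` random samples of size n from a normal population
    and return the list of their sample means (x-bars)."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(trials)
    ]


# Illustrative population and sample size (assumed for this sketch).
mu, sigma, n = 65.0, 4.0, 25
means = simulate_sample_means(mu, sigma, n, trials=5000)

observed_se = statistics.stdev(means)   # spread of the simulated x-bars
predicted_se = sigma / math.sqrt(n)     # sigma / sqrt(n) from the theorem

print(f"mean of x-bars: {statistics.fmean(means):.3f} (population mean {mu})")
print(f"observed standard error:  {observed_se:.3f}")
print(f"predicted standard error: {predicted_se:.3f}")
```

With enough trials, the observed spread of the x-bars should land close to σ/√n, which is what the theorem predicts.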

Page 10: NG BB 33 Hypothesis Testing Basics

Variability of Means

Sample statistics estimate population parameters by inference:

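For a given sample (x-bar, s, n), we can estimate the population parameters μ and σ by inference.

As the sample size increases we are more confident that our sample statistic is a more valid estimator of the population parameter.

[Figure: distributions of the sample mean for n = 1, n = 3, and n = 5, narrowing as n increases.]

$$\sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{n}}$$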

Page 11: NG BB 33 Hypothesis Testing Basics

Confidence Intervals

Page 12: NG BB 33 Hypothesis Testing Basics

What Is a Confidence Interval?

We know that when we take the average of a sample, it is probably not exactly the same as the average of the population.

Confidence intervals help us determine the likely range of the population parameter.

For example, if my 95% confidence interval is 5 +/-2, then I have 95% confidence that the mean of the population is between 3 and 7.

Page 13: NG BB 33 Hypothesis Testing Basics

What Is a Confidence Interval? (Cont.)

Usually, confidence intervals have an additive uncertainty:

Estimate ± Margin of Error

Sample Statistic ± [Confidence Factor × Measure of Variability]

Examples of sample statistics: x-bar, s

Note: Detailed formulas may be found in the appendix.

Page 14: NG BB 33 Hypothesis Testing Basics

Why Do We Need Confidence Intervals?

Sample statistics, such as Mean and Standard Deviation, are only estimates of the population’s parameters.

Because there is variability in these estimates from sample to sample, we can quantify our uncertainty using statistically-based confidence intervals. Confidence intervals provide a range of plausible values for the population parameters (μ and σ).

Any sample statistic will vary from one sample to another and, therefore, from the true population or process parameter value.

Page 15: NG BB 33 Hypothesis Testing Basics

Exercise

Let’s look at a population that has a normal distribution with:

known mean value = 65

standard deviation = 4

(This has been generated in dataset Confidence.mtw)

Each member in the class will randomly sample 25 data points from this population. (In Minitab, use Calc>Random Data>Sample from Columns.)

Sample 25 rows of data from C1 and store the results in C2.

Use graphical descriptive statistics to calculate the 95% confidence interval for the mean and sigma based on your sample of 25 data points. Do they include the mean, 65, and the sigma, 4?

With a class size of 25, we would expect about 1 of the 95% confidence intervals (25 × 0.05 ≈ 1) to not contain 65 for the mean, and about 1 to not contain 4 for sigma.
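For readers without Minitab, the exercise can be approximated in a few lines of Python. This is a rough sketch under stated assumptions: it draws directly from a normal distribution with mean 65 and standard deviation 4 instead of sampling the column in Confidence.mtw, it covers only the interval for the mean, and the critical value 2.064 for 24 degrees of freedom comes from a standard t-table rather than from the slides. The names `mean_ci` and `coverage` are hypothetical helpers.

```python
import math
import random
import statistics

# t(.025, 24) from a standard t-table (assumption: not listed on the slides).
T_CRIT_DF24 = 2.064


def mean_ci(sample, t_crit):
    """Return the (low, high) confidence interval for the mean of `sample`."""
    x_bar = statistics.fmean(sample)
    margin = t_crit * statistics.stdev(sample) / math.sqrt(len(sample))
    return x_bar - margin, x_bar + margin


def coverage(mu, sigma, n, students, rng):
    """Fraction of simulated students whose interval contains the true mean."""
    hits = 0
    for _ in range(students):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        low, high = mean_ci(sample, T_CRIT_DF24)
        hits += low <= mu <= high
    return hits / students


rng = random.Random(33)
class_rate = coverage(65.0, 4.0, n=25, students=25, rng=rng)
long_run_rate = coverage(65.0, 4.0, n=25, students=4000, rng=rng)

print(f"one class of 25:  {class_rate:.0%} of intervals contain 65")
print(f"4000 repetitions: {long_run_rate:.1%} of intervals contain 65")
```

A single class of 25 can easily show zero, one, or two misses; only over many repetitions does the hit rate settle near 95%.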

Page 16: NG BB 33 Hypothesis Testing Basics

Confidence Interval for the Mean (μ) with Population Standard Deviation (σ) Known

Example

A random sample of size n = 36 is taken, and the distribution of x is normal. We are given that the population standard deviation (σ) is 18.0. The value of x-bar is an estimator of the population mean (μ), and the standard error of x-bar is:

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{18.0}{\sqrt{36}} = 3.0$$

This is known as the Standard Error of the Mean.

From the properties of the standardized normal distribution, there is a 95% chance that μ is within the range x-bar ± 1.96 times the Standard Error of x-bar.

Page 17: NG BB 33 Hypothesis Testing Basics

What Values of x-bar Can I Expect?

[Figure: distribution of x-bar centered at μ, with the region from μ − 1.96(3.0) to μ + 1.96(3.0) shaded (area .95) and an area of .025 in each tail.]

95% of all x-bars will fall into the shaded region, defined by μ ± 1.96(3.0), where 3.0 is the Standard Error of the Mean.

Page 18: NG BB 33 Hypothesis Testing Basics

But I Don’t Know μ, I Only Know x-bar!

We can turn it around.

x-bar lying in the interval μ ± 1.96(3.0) is the same thing as μ lying in the interval x-bar ± 1.96(3.0).

Because there is a 95% chance that x-bar lies in the interval μ ± 1.96(3.0), there is a 95% chance that the interval x-bar ± 1.96(3.0) encloses μ.

The interval we construct using the observed sample mean is called a 95% confidence interval for μ.

[Figure: distribution of x-bar from μ − 1.96(3.0) to μ + 1.96(3.0), with the intervals x-bar ± 1.96(3.0) drawn around the observed sample means of samples A, B, and C.]

Page 19: NG BB 33 Hypothesis Testing Basics

Confidence Interval for the Mean (μ) with Population Standard Deviation (σ) Known

Another Example

An airline needs an estimate of the average number of passengers on a newly scheduled flight. Its experience is that data for the first month of flights is unreliable, but thereafter the passenger loading settles down.

Therefore, the mean passenger load is calculated for the first 20 weekdays of the second month after initiation of this particular new flight. If the sample mean (x-bar) is 112.0 and the population standard deviation (σ) is assumed to be 25, find a 95% confidence interval for the true, long-run average number of passengers on this flight.

Page 20: NG BB 33 Hypothesis Testing Basics

Confidence Interval for the Mean (μ) with Standard Deviation (σ) Known

Solution

We assume that the hypothetical population of daily passenger loads for weekdays is not badly skewed. Therefore, the sampling distribution of x-bar is approximately normal and the confidence interval results are approximately correct, even for a sample size of only 20 weekdays.

$$\bar{x} = 112.0, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{25}{\sqrt{20}} = 5.59$$

For a 95% confidence interval, we use z.025 = 1.96 in the formula to obtain

$$112 \pm 1.96\,(5.59) \quad \text{or} \quad 101.04 \text{ to } 122.96$$

We are 95% confident that the long-run mean, μ, lies in this interval.
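The airline calculation can be reproduced with Python's standard library. This is a minimal sketch, not the course's method (the module uses Minitab); the function name `z_confidence_interval` is a hypothetical helper, and the z-value is taken from `statistics.NormalDist` rather than a table.

```python
import math
from statistics import NormalDist


def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when the population sigma is known."""
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = sigma / math.sqrt(n)
    margin = z * standard_error
    return x_bar - margin, x_bar + margin


# Values from the airline example: x-bar = 112.0, sigma = 25, n = 20.
low, high = z_confidence_interval(x_bar=112.0, sigma=25.0, n=20)
print(f"95% CI: {low:.2f} to {high:.2f}")  # 95% CI: 101.04 to 122.96
```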

Page 21: NG BB 33 Hypothesis Testing Basics

Confidence Interval for the Mean (μ) with Population Standard Deviation (σ) Unknown

A very important point to remember is that for this example we assumed that we knew the population standard deviation, and many times that is not the case. Often, we have to estimate both the mean and the standard deviation from the sample.

When σ is not known, we use the t-distribution rather than the normal (z) distribution. The t-distribution will be explained next.

In many cases, the true population σ is not known, so we must use our sample standard deviation (s) as an estimate for the population standard deviation (σ).

Page 22: NG BB 33 Hypothesis Testing Basics

Confidence Interval for the Mean (μ) with Standard Deviation (σ) Unknown (Cont.)

Since there is less certainty (not knowing μ or σ), the t-distribution essentially “relaxes” or “expands” our confidence intervals to allow for this additional uncertainty.

In other words, for a 95% confidence interval, you would multiply the standard error by a number greater than 1.96, depending on the sample size.

1.96 comes from the normal distribution, but the number we will use in this case will come from the t-distribution.

Page 23: NG BB 33 Hypothesis Testing Basics

What Is This t-Distribution?

The t-distribution is actually a family of distributions.

They are similar in shape to the normal distribution (symmetric and bell-shaped), although wider, and flatter in the tails.

How wide and flat the specific t-distribution is depends on the sample size. The smaller the sample size, the wider and flatter the distribution tails.

As sample size increases, the t-distribution approaches the exact shape of the normal distribution.

Page 24: NG BB 33 Hypothesis Testing Basics

An Example of a t-Distribution

[Figure: t-distribution for n = 5; horizontal axis t from -3 to 3, vertical axis frequency from 0.0 to 0.4; the area in the upper tail beyond t = 2.78 is 0.025.]

Page 25: NG BB 33 Hypothesis Testing Basics

Some Selected t-Values

Here are values from the t-distribution for various sample sizes (for 95% confidence intervals):

Sample Size    t-value (.025)*
2              12.71
3              4.30
5              2.78
10             2.26
20             2.09
30             2.05
100            1.98
1000           1.96

* For a 95% CI, α = .05. Therefore, for a two-tailed distribution: α/2 = .05/2 = .025

Page 26: NG BB 33 Hypothesis Testing Basics

Confidence Interval for the Mean (μ) with Population Standard Deviation (σ) Unknown

Example

The customer expectation when phoning an order-out pizza shop is that the average amount of time from completion of dialing until they hear the message indicating the time in queue is equal to 55.0 seconds (less than a minute was the response from customers surveyed, so the standard was established at about 10% less than a minute). You decide to sample calls at 20 randomly chosen times between 11:30 am and 9:30 pm over 2 days to determine what the actual average is. In your sample of 20 calls, you find that the sample mean, x-bar, is equal to 54.86 seconds and the sample standard deviation, s, is equal to 1.008 seconds.

The actual data was as follows:

54.1, 53.3, 56.1, 55.7, 54.0, 54.1, 54.5, 57.1, 55.2, 53.8, 54.1, 54.1, 56.1, 55.0, 55.9, 56.0, 54.9, 54.3, 53.9, 55.0

What is a 95% confidence interval for the true mean call completion time?

Page 27: NG BB 33 Hypothesis Testing Basics

95% Confidence Interval for Mean Call Completion Time

x-bar = 54.860

s = 1.008

n = 20

t.025,19 = 2.09

$$\bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} = 54.860 \pm 2.09\,\frac{1.008}{\sqrt{20}} = (54.389,\ 55.331)$$

We’re 95% confident that the actual mean call completion time is somewhere between 54.389 seconds and 55.331 seconds, based on our sample of 20 calls.

Luckily, we don’t have to worry about the details of how to calculate the t-value. Minitab takes care of that for us.
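The same interval can be recomputed from the raw call data. The sketch below is an illustration in Python rather than Minitab; the critical value 2.09 is copied from the slide's t-table for a sample size of 20, and the variable names are assumptions made for the example.

```python
import math
import statistics

# Call completion times (seconds) from the pizza shop example.
calls = [
    54.1, 53.3, 56.1, 55.7, 54.0, 54.1, 54.5, 57.1, 55.2, 53.8,
    54.1, 54.1, 56.1, 55.0, 55.9, 56.0, 54.9, 54.3, 53.9, 55.0,
]

T_CRIT = 2.09  # t(.025, 19), taken from the slide's table for n = 20

n = len(calls)
x_bar = statistics.mean(calls)   # sample mean
s = statistics.stdev(calls)      # sample standard deviation (n - 1 divisor)
margin = T_CRIT * s / math.sqrt(n)

low, high = x_bar - margin, x_bar + margin
print(f"x-bar = {x_bar:.3f}, s = {s:.3f}, n = {n}")
print(f"95% CI: {low:.3f} to {high:.3f}")  # 95% CI: 54.389 to 55.331
```

Minitab reports 54.388 to 55.332 for the same data because it uses a t-value with more decimal places than the rounded 2.09.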

Page 28: NG BB 33 Hypothesis Testing Basics

Now Let Minitab Calculate the Confidence Interval

1. Open the Minitab file PizzaCall.mtw

Page 29: NG BB 33 Hypothesis Testing Basics

Now Let Minitab Calculate the Confidence Interval (Cont.)

2. Select Stat> Basic Statistics> Graphical Summary

Page 30: NG BB 33 Hypothesis Testing Basics

Now Let Minitab Calculate the Confidence Interval (Cont.)

3. Double click on C-1 to place it in the Variables box

4. Click on OK

Page 31: NG BB 33 Hypothesis Testing Basics

Now Let Minitab Calculate the Confidence Interval (Cont.)

Summary for C1 (Minitab Graphical Summary output)

Anderson-Darling Normality Test
A-Squared       0.60
P-Value         0.105

Mean            54.860
StDev           1.008
Variance        1.016
Skewness        0.560026
Kurtosis        -0.509797
N               20

Minimum         53.300
1st Quartile    54.100
Median          54.700
3rd Quartile    55.850
Maximum         57.100

95% Confidence Interval for Mean (μ):                 54.388    55.332
95% Confidence Interval for Median:                   54.100    55.582
95% Confidence Interval for Standard Deviation (σ):   0.767     1.472

We’re 95% confident that the actual mean is between 54.388 and 55.332.

We’re also taking a 5% chance that we’re wrong.

Page 32: NG BB 33 Hypothesis Testing Basics

Other Types of Confidence Intervals

There are other types of confidence intervals that are based on the same principles we have learned:

Standard Deviation

Proportions

Median

Others

We will discuss some of these later.

Page 33: NG BB 33 Hypothesis Testing Basics

Hypothesis Testing

Page 34: NG BB 33 Hypothesis Testing Basics

Extending the Concept of Confidence Intervals

Extending the concept of confidence intervals allows us to set up and interpret statistical tests.

We refer to these tests as Hypothesis Tests.

One way to describe a hypothesis test:

Determining whether or not a particular value of interest is contained within a confidence interval.

Hypothesis testing also gives us the ability to calculate the probability that our conclusion is wrong.
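The "value inside the interval" description can be sketched with the pizza call example from the confidence interval section, where the 95% interval for the mean was 54.389 to 55.331 seconds and the customer expectation was 55.0 seconds. The helper name `value_in_interval` is hypothetical, and the decision wording is a simplified reading of the idea rather than a full test procedure.

```python
def value_in_interval(value, interval):
    """Return True when `value` lies inside the (low, high) interval."""
    low, high = interval
    return low <= value <= high


# 95% confidence interval for mean call completion time (pizza example).
ci_mean = (54.389, 55.331)
target = 55.0  # customer expectation in seconds

if value_in_interval(target, ci_mean):
    decision = "no evidence that the true mean differs from the target"
else:
    decision = "evidence that the true mean differs from the target"

print(f"Target {target} vs. CI {ci_mean}: {decision}")
```

Because 55.0 lies inside the interval, the sample gives no evidence at the 5% level that the true mean call completion time differs from 55.0 seconds.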

Page 35: NG BB 33 Hypothesis Testing Basics

The New Car

You buy a one-year-old car from the Lemon Lot in order to save money on gas. The previous owner still had the original features sticker, and you were pleased to note that the EPA mileage estimate indicated that the car should get 31 miles per gallon overall.

Page 36: NG BB 33 Hypothesis Testing Basics

The New Car (Cont.)

As soon as you buy the car, you fill up the tank so that you’ll be ready to take the family for a drive and to go to work the next day. A few days later, you fill up again and calculate your gas mileage for that tank. After you push the “=“ key on your calculator, the number 27.1 appears.

Page 37: NG BB 33 Hypothesis Testing Basics

The New Car (Cont.)

Should you send the car to a mechanic to check for problems?

Do you conclude that the EPA estimate is simply wrong?

Do you leave cruel messages on the seller’s answering machine?

What ARE your conclusions?

Page 38: NG BB 33 Hypothesis Testing Basics

Continuing the Car Situation

At what value of gas consumption should you become alarmed that you are experiencing anything more than just random variation?

Page 39: NG BB 33 Hypothesis Testing Basics

The Car Situation (Cont.)

What if we knew this?

σ = 3.46

[Figure: distribution of gas consumption for this car, with a tail area of 12.8% marked.]
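A tail area like the one on this slide can be estimated with a normal model. The sketch below assumes that tank-to-tank mileage is normally distributed around the EPA figure of 31 mpg with a standard deviation of 3.46; the slide gives the 3.46 but does not state the center or the shape of its curve, so both are assumptions here.

```python
from statistics import NormalDist

# Assumed model: mileage ~ Normal(mean = 31 mpg, sd = 3.46 mpg).
mileage = NormalDist(mu=31.0, sigma=3.46)

observed = 27.1  # mpg calculated for the first tank
z_score = (observed - mileage.mean) / mileage.stdev
p_at_or_below = mileage.cdf(observed)

print(f"z = {z_score:.2f}")
print(f"P(mileage <= {observed}) = {p_at_or_below:.1%}")
```

Under these assumptions the probability comes out at roughly 13%, in line with the 12.8% shown on the slide: a tank this low would occur about one time in eight from random variation alone, so a single reading of 27.1 is weak grounds for alarm.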

Page 40: NG BB 33 Hypothesis Testing Basics

Hypothesis Testing

Hypothesis Testing:

Allows us to determine statistically whether or not a value is cause for alarm (or is simply due to random variation)

Tells us whether or not two sets of data are different

Tells us whether or not a statistical parameter (mean, standard deviation, etc.) is statistically different from a test value of interest

Allows us to assess the “strength” of our conclusion (our probability of being correct or wrong)

Page 41: NG BB 33 Hypothesis Testing Basics

Hypothesis Testing (Cont.)

Hypothesis Testing Enables Us to:

Handle uncertainty using a commonly accepted approach

Be more objective (2 persons will use the same techniques and come to similar conclusions almost all of the time)

Disprove or “fail to disprove” assumptions

Control our risk of making wrong decisions or coming to wrong conclusions

Page 42: NG BB 33 Hypothesis Testing Basics

Hypothesis Testing (Cont.)

[Figure: the true population distribution with population mean μ, and some possible samples (Sample A, Sample B, Sample C, Sample D) drawn from it.]

Page 43: NG BB 33 Hypothesis Testing Basics

Sample Size Concerns

If we sample only one item, how close do we expect to get to the true population mean?

How well do you think this one item represents the true mean?

How much ability do we have to draw conclusions about the mean?

What if we sample 900 items? Now, how close would we expect to get to the true population mean?

Page 44: NG BB 33 Hypothesis Testing Basics

Sample Size (Cont.)

[Figure: population distribution centered at μ, showing the likely range of x-bar with a small sample size (wide) and with a large sample size (narrow).]

The larger our sample, the closer x-bar is likely to be to the true population mean.

Page 45: NG BB 33 Hypothesis Testing Basics

Standard Deviation

What effect would a lot of variation in the population have on our estimate of the population mean from a sample?

How would this affect our ability to draw conclusions about the mean?

What if there is very little variation in the population?

Page 46: NG BB 33 Hypothesis Testing Basics

Standard Deviation (Cont.)

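[Figure: two populations, each centered at its mean μ. For the population with a lot of variation, the likely range of x-bar with sample size n is wide; for the population with less variation, the likely range of x-bar with the same sample size n is narrow.]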

Page 47: NG BB 33 Hypothesis Testing Basics

Statistical Inferences and Confidence

How much confidence do we have in our estimates?

How close do you think the true mean, μ, is to our estimate of the mean, x-bar?

How certain do we want/need to be about conclusions we make from our estimates?

If we want to be more confident about our sample estimate (i.e., we want a lower risk of being wrong), then we must relax our statement of how close we are to the true value.

Page 48: NG BB 33 Hypothesis Testing Basics

Statistical Inferences and Confidence (Cont.)

[Figure: population distribution centered at μ, with a wide range and a narrow range drawn around x-bar.]

If we want to have high confidence in our conclusions, we must relax the range in which we say the true mean lies.

As we tighten our estimate of the mean, our risk of being wrong increases. Thus, our confidence decreases.

Page 49: NG BB 33 Hypothesis Testing Basics

Three Factors Drive Sample Sizes

Three concepts affect the conclusions drawn from a single sample data set of (n) items:

Variation in the underlying population (sigma)

Risk of drawing the wrong conclusions (alpha, beta)

How small a Difference is significant (delta)

[Figure: sample size (n) at the center, driven by three factors: Variation, Risk, and Difference.]

Page 50: NG BB 33 Hypothesis Testing Basics

Three Factors: Variation, Risk, Difference

These 3 factors work together. Each affects the others.

Variation: When there’s greater variation, a larger sample is needed to have the same level of confidence that the test will be valid. More variation widens our confidence interval.

Risk: If we want to be more confident that we are not going to make a decision error or miss a significant event, we must increase the sample size.

Difference: If we want to be confident that we can identify a smaller difference between two test samples, the sample size must increase.

Page 51: NG BB 33 Hypothesis Testing Basics

Three Factors (Cont.)

Larger samples narrow our confidence interval.

Lower confidence levels allow smaller samples.

All of these translate into a specific confidence interval for a given parameter, set of data, confidence level and sample size.

They also translate into what types of conclusions result from hypothesis tests.

Testing for larger differences between the samples reduces the required sample size. The size of the difference to be detected is known as delta (Δ).
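How the three factors pull on sample size can be sketched with a common textbook approximation for a test on one mean with known sigma: n ≈ ((z for α/2 + z for β) × σ / Δ)². This formula is not given in the module and is used here only as an assumption to illustrate the direction of each effect; the function name `sample_size_for_mean` and all input values are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, delta, alpha=0.05, beta=0.10):
    """Approximate sample size needed to detect a shift of `delta` in a mean.

    Textbook normal approximation (assumed, not from the slides):
        n = ((z_{alpha/2} + z_{beta}) * sigma / delta) ** 2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided alpha risk
    z_beta = NormalDist().inv_cdf(1 - beta)        # beta risk
    return math.ceil(((z_alpha + z_beta) * sigma / delta) ** 2)


baseline = sample_size_for_mean(sigma=2.0, delta=1.0)
more_variation = sample_size_for_mean(sigma=4.0, delta=1.0)
lower_risk = sample_size_for_mean(sigma=2.0, delta=1.0, alpha=0.01, beta=0.05)
larger_difference = sample_size_for_mean(sigma=2.0, delta=2.0)

print(f"baseline:            n = {baseline}")
print(f"more variation:      n = {more_variation}")
print(f"lower risk:          n = {lower_risk}")
print(f"larger difference:   n = {larger_difference}")
```

The printed values should move the way the slides describe: more variation or lower risk raises n, and a larger difference to detect lowers it.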

Page 52: NG BB 33 Hypothesis Testing Basics

An Example

A unit has several quick response forces (QRFs). Some forces have over 700 members, with at least 300 on the site at any time.

By regulation, all forces must have a quick response plan, the critical first phase of which is required to be completed in 10 minutes (600 seconds) or less.

There are two teams that are vying for “most responsive.” They have taken somewhat different approaches to implementing their quick response plans and management wants to know which approach is better: Team 1 or Team 2

Each one has 100 data points for actual responses and drills (Minitab file Response.mtw)

Page 53: NG BB 33 Hypothesis Testing Basics

The Data from Team 1

598.0 598.8 600.2 599.4 599.6

599.8 598.8 599.6 599.0 601.2

600.0 599.8 599.6 598.4 599.6

599.8 599.2 599.6 599.0 600.2

600.0 599.4 600.2 599.6 600.0

600.0 600.0 599.2 598.8 600.0

598.8 600.2 599.0 599.2 599.4

598.2 600.2 599.6 599.6 599.8

599.4 599.6 600.4 598.6 599.2

599.6 599.0 600.0 599.8 599.6

599.4 599.0 599.0 599.6 599.4

599.4 599.8 599.6 599.2 600.0

600.0 600.8 599.4 599.6 600.0

598.8 598.8 599.2 600.2 599.2

599.2 598.2 597.8 599.8 599.4

599.4 600.0 600.4 599.6 599.6

599.6 599.2 599.6 600.0 599.8

599.0 599.8 600.0 599.6 599.0

599.2 601.2 600.8 599.2 599.6

600.6 600.4 600.4 598.6 599.4

Page 54: NG BB 33 Hypothesis Testing Basics

The Data from Team 2

601.6 600.8 599.4 599.8 601.6

600.4 598.6 598.0 602.8 603.4

598.4 600.0 597.6 600.0 597.0

600.0 600.4 598.0 599.6 599.8

596.8 600.8 597.6 602.2 597.8

602.8 600.8 601.2 603.8 602.4

600.8 597.2 599.0 603.6 602.2

603.6 600.4 600.4 601.8 600.6

604.2 599.8 600.6 602.0 596.2

602.4 596.4 599.0 603.6 602.4

598.4 600.4 602.2 600.8 601.4

599.6 598.2 599.8 600.2 599.2

603.4 598.6 599.8 600.4 601.6

600.6 599.6 601.0 600.2 600.4

598.4 599.0 601.6 602.2 598.0

598.2 598.2 601.6 598.0 601.2

602.0 599.4 600.2 598.4 604.2

599.4 599.4 601.8 600.8 600.2

599.4 600.2 601.2 602.8 600.0

600.8 599.0 597.6 597.6 596.8

Page 55: NG BB 33 Hypothesis Testing Basics

Descriptive Statistics – Team 1

Summary for Team 1 (Minitab Graphical Summary output)

Anderson-Darling Normality Test
A-Squared       0.84
P-Value         0.029

Mean            599.55
StDev           0.62
Variance        0.38
Skewness        -0.082566
Kurtosis        0.745102
N               100

Minimum         597.80
1st Quartile    599.20
Median          599.60
3rd Quartile    600.00
Maximum         601.20

95% Confidence Interval for Mean:      599.43    599.67
95% Confidence Interval for Median:    599.40    599.60
95% Confidence Interval for StDev:     0.54      0.72

Page 56: NG BB 33 Hypothesis Testing Basics

Descriptive Statistics – Team 2

Summary for Team 2 (Minitab Graphical Summary output)

Anderson-Darling Normality Test
A-Squared       0.29
P-Value         0.615

Mean            600.23
StDev           1.87
Variance        3.51
Skewness        0.051853
Kurtosis        -0.518286
N               100

Minimum         596.20
1st Quartile    599.00
Median          600.20
3rd Quartile    601.60
Maximum         604.20

95% Confidence Interval for Mean:      599.86    600.60
95% Confidence Interval for Median:    599.80    600.60
95% Confidence Interval for StDev:     1.65      2.18

Page 57: NG BB 33 Hypothesis Testing Basics

Example

The average cycle time for Team 1 is 599.55 seconds.

The average cycle time for Team 2 is 600.23 seconds.

The target cycle time for Phase 1 response is 600 seconds.

Is the difference between the two average cycle times statistically significant?

Page 58: NG BB 33 Hypothesis Testing Basics

Example (Cont.)

The unit wants to determine if the true averages of the two teams are really different.

The unit thinks that the 600.23 average of Team 2 is a little too high, so there is a need to determine if the data indicates that the true average is really not equal to the target of 600 seconds.

The unit will use hypothesis testing to answer these questions.

Page 59: NG BB 33 Hypothesis Testing Basics

Example

The first hypothesis test to be performed is to determine whether there is a statistically significant difference between the means of the two teams. This is called a 2-Sample t Test.

The real question is whether or not the means are different enough to indicate that the approaches taken by the two teams really are centered differently, or are they close enough that the difference could simply be a result of random variation?

After that, hypothesis testing can tell us if there is evidence indicating whether or not each team’s average is different from the target of 600 seconds.

First, we need to introduce some terminology.

Page 60: NG BB 33 Hypothesis Testing Basics

The Null Hypothesis for a 2-Sample t Test

The 2-Sample t Test is used to test whether or not the means of two populations are the same.

The null hypothesis is a statement that the population means for the two samples are equal.

Ho: μ1 = μ2

We assume the null hypothesis is true unless we have enough evidence to prove otherwise. If we do not have enough evidence, we say that we “fail to reject the null”.

If we can prove otherwise, then we “reject the null” hypothesis and accept the Alternative Hypothesis.

HA: μ1 ≠ μ2

Page 61: NG BB 33 Hypothesis Testing Basics

Null Hypothesis for 2-Sample t Test (Cont.)

This is analogous to our judicial system principle of “innocent until proven guilty”

The symbol used for the null hypothesis is Ho:

$$H_o: \mu_1 = \mu_2 \quad \text{OR} \quad H_o: \mu_1 - \mu_2 = 0$$

Page 62: NG BB 33 Hypothesis Testing Basics

The Alternative Hypothesis for a 2-Sample t Test

The alternative hypothesis is a statement that represents reality if there is enough evidence to reject Ho.

If we reject the null hypothesis then we accept the alternative hypothesis.

This is analogous to being found “guilty” in a court of law.

The symbol used for the alternative hypothesis is Ha:

Ha: μ1 ≠ μ2   OR   Ha: μ1 − μ2 ≠ 0

Page 63: NG BB 33 Hypothesis Testing Basics

Our Emergency Response Team Example

In our example, the first hypothesis test will take this form:

Ho: μ1 = μ2
Ha: μ1 ≠ μ2

We can rewrite it in this form:

Ho: μ1 − μ2 = 0
Ha: μ1 − μ2 ≠ 0

Reminder: We are conducting a 2-Sample t test to determine if the average cycle times of the Phase 1 response from our two teams are different.

Page 64: NG BB 33 Hypothesis Testing Basics

Our Emergency Response Team Example (Cont.)

If we wanted to specifically test only whether or not there was enough evidence to indicate that team 2’s average was greater than team 1’s, it would take this form:

Ho: μ1 − μ2 ≥ 0
Ha: μ1 − μ2 < 0

This is still a 2-Sample t-Test

Page 65: NG BB 33 Hypothesis Testing Basics

Our Emergency Response Team Example (Cont.)

The second hypothesis test will be a 1-Sample t. It will take this form for each team:

Ho: μ1 = 600
Ha: μ1 ≠ 600

When you are testing whether or not a population mean is equal to a given or target value, you use a 1-Sample t.
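As a rough illustration of what a 1-Sample t calculates, the sketch below computes the t statistic for Ho: μ = target from raw data. The function name `one_sample_t` and the eight cycle times are hypothetical (they are not taken from Response.mtw); only the target of 600 seconds comes from the example.

```python
import math
import statistics

def one_sample_t(data, target):
    """Return (t statistic, degrees of freedom) for Ho: mu = target."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)      # sample standard deviation (n - 1 divisor)
    se = s / math.sqrt(n)           # standard error of the mean
    return (mean - target) / se, n - 1

# Hypothetical cycle times in seconds; NOT the course data set.
times = [599.2, 600.4, 598.9, 600.1, 599.5, 599.8, 600.6, 599.1]
t_stat, df = one_sample_t(times, target=600.0)
print(f"t = {t_stat:.3f}, df = {df}")
```

A statistics package such as Minitab then converts the t statistic and degrees of freedom into a p-value using the t distribution.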

Page 66: NG BB 33 Hypothesis Testing Basics

Hypothesis Test in Minitab

We will use Minitab to conduct our hypothesis tests.

Open the Minitab file Response.mtw

Page 67: NG BB 33 Hypothesis Testing Basics

Hypothesis Test in Minitab: 2-Sample t-Test

Select Stat > Basic Statistics > 2-Sample t to compare Team 1 to Team 2

Page 68: NG BB 33 Hypothesis Testing Basics

Hypothesis Test in Minitab (Cont.)

Team 1 and Team 2 are in different columns, so select Samples in different columns

Double click on C1-Supp1. Then double click on C2-Supp2 to place them in the First and Second boxes

Select Graphs to get the Graphs dialog box

Page 69: NG BB 33 Hypothesis Testing Basics

Hypothesis Test in Minitab (Cont.)

In the Graphs dialog box, check both Boxplots of data and Dotplots of data

Click OK here, and then click on OK in the previous dialog box

Page 70: NG BB 33 Hypothesis Testing Basics

Hypothesis Test in Minitab (Cont.)

[Figure: Boxplot of Team 1, Team 2; y-axis: Data, ticks from 596 to 605]

Page 71: NG BB 33 Hypothesis Testing Basics

Hypothesis Test in Minitab (Cont.)

[Figure: Individual Value Plot of Team 1, Team 2; y-axis: Data, ticks from 596 to 605]

Page 72: NG BB 33 Hypothesis Testing Basics

Hypothesis Test in Minitab (Cont.)

This descriptive output shows up in your Session Window

Page 73: NG BB 33 Hypothesis Testing Basics

Hypothesis Test in Minitab (Cont.)

The null hypothesis states that the difference between the two means is zero

Page 74: NG BB 33 Hypothesis Testing Basics

Hypothesis Test in Minitab (Cont.)

We will cover p-values in more detail a little later

The p-value here is less than 0.05, so we can reject the null hypothesis
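For readers without Minitab, the sketch below recomputes an unequal-variance (Welch) 2-sample statistic from the summary values quoted later in this module (sample means 600.230 and 599.548, standard deviations 0.619 and 1.870, sample size 100). Several things here are assumptions: that 100 is the size of each team's sample, the pairing of each mean with a standard deviation (the slides do not state it, and with equal sample sizes it does not change the size of the statistic), and the normal approximation used for the p-value in place of the t distribution. The function name `welch_from_summary` is hypothetical.

```python
import math
from statistics import NormalDist

def welch_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch (unequal-variance) 2-sample statistic from summary values."""
    var_a, var_b = sd_a**2 / n_a, sd_b**2 / n_b
    se = math.sqrt(var_a + var_b)               # standard error of the difference
    t = (mean_a - mean_b) / se
    # Welch-Satterthwaite approximate degrees of freedom
    df = (var_a + var_b) ** 2 / (var_a**2 / (n_a - 1) + var_b**2 / (n_b - 1))
    # ASSUMPTION: df is large, so the normal curve approximates the t curve
    p_approx = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, df, p_approx

t, df, p = welch_from_summary(600.230, 1.870, 100, 599.548, 0.619, 100)
print(f"t = {t:.2f}, df = {df:.0f}, approximate p = {p:.4f}")
```

Under these assumptions the approximate p-value falls well below 0.05, which agrees with the conclusion on the slide.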

Page 75: NG BB 33 Hypothesis Testing Basics

Assumptions

The Hypothesis Tests we have discussed make certain assumptions:

Independence between and within samples

Random samples

Normally distributed data

Unknown Variance

In our example, we did not assume equal variances. This is the safe choice. However, if we had reason to believe the variances were equal, then we could have checked the “Assume equal variances” box in the dialogue box.

Page 76: NG BB 33 Hypothesis Testing Basics

The Risks of Being Wrong

Error Matrix

                 | Conclusion Drawn
The True State   | Accept Ho               | Reject Ho
Ho True          | Correct                 | Type I Error (α-Risk)
Ho False         | Type II Error (β-Risk)  | Correct

Page 77: NG BB 33 Hypothesis Testing Basics

Type I and Type II Errors

Type I Error

Alpha Risk

Producer Risk

The risk of rejecting the null, and taking action, when no action was necessary.

“I’ve discovered something that really isn’t here!”

Type II Error

Beta Risk

Consumer Risk

The risk of failing to reject the null when you should have rejected it.

No action is taken when there should have been action.

“I’ve missed a significant effect!”

Page 78: NG BB 33 Hypothesis Testing Basics

Type I and Type II Errors (Cont.)

The Type I Error risk is determined up front.

It is the alpha value you choose.

The confidence level is one minus the alpha level.

The Type II Error risk (beta) is determined from the circumstances of the situation.

If alpha is made very small, then beta increases (all else being equal).

Requiring overwhelming evidence to reject the null increases the chances of a type II error.

To minimize beta, while holding alpha constant, requires increased sample sizes.

One minus beta is the probability of rejecting the null hypothesis when it is false. This is referred to as the Power of the test.
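The relationships above can be checked with a small simulation. The sketch below is only an illustration under simplifying assumptions: it uses a 2-sample z-test with a known, common sigma rather than the t-test, and the sample size, shift, trial count, and seed are hypothetical choices. When Ho is true, the rejection rate should land near alpha; when Ho is false, the rejection rate estimates the power, and one minus that is beta.

```python
import math
import random
from statistics import NormalDist

def rejection_rate(true_shift, alpha=0.05, n=30, sigma=1.0, trials=2000, seed=1):
    """Fraction of simulated 2-sample z-tests that reject Ho: mu1 = mu2."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    se = sigma * math.sqrt(2 / n)                  # known-sigma standard error
    rejections = 0
    for _ in range(trials):
        a = [rng.gauss(0.0, sigma) for _ in range(n)]
        b = [rng.gauss(true_shift, sigma) for _ in range(n)]
        z = (sum(b) / n - sum(a) / n) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

type_1_rate = rejection_rate(true_shift=0.0)   # Ho true: every rejection is a Type I error
power = rejection_rate(true_shift=0.8)         # Ho false: every rejection is correct
beta = 1 - power                               # Type II error rate
print(type_1_rate, power, beta)
```

Rerunning with a smaller alpha, or a smaller n, should show beta growing, as the bullets above describe.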

Page 79: NG BB 33 Hypothesis Testing Basics

Type I and Type II Errors (Cont.)

What type of error occurs when an innocent man is convicted?

What about when a guilty man is set free?

Does the American justice system place more emphasis on the alpha or beta risk?

Page 80: NG BB 33 Hypothesis Testing Basics

Exercise

Draw the Type I & II error matrix for airport security.

Do you think the security system at most airports places more emphasis on the alpha or beta risk?

Page 81: NG BB 33 Hypothesis Testing Basics

The p-Value

The p-value is the probability, calculated assuming the null hypothesis is true, of getting a sample result at least as extreme as the one we observed.

In other words, a small p-value means our data would be unlikely if the null hypothesis were true. It is not the probability that the null hypothesis is true.

It is the smallest alpha value at which the null hypothesis would be rejected.

If we don’t want alpha to be more than 0.05, then we simply reject the null hypothesis when the p-value is 0.05 or less.

As we will learn later, it isn’t always this simple.

Page 82: NG BB 33 Hypothesis Testing Basics

Power, Delta and Sample Size

Page 83: NG BB 33 Hypothesis Testing Basics

Beta, Power, and Sample Size

If two populations truly have different means, but only by a very small amount, then you are more likely to conclude they are the same. This means that the beta risk is greater.

Beta only comes into play if the null hypothesis truly is false. The “more” false it is, the greater your chances of detecting it, and the lower your beta risk.

The power of a hypothesis test is its ability to detect an effect of a given magnitude.

Minitab will calculate beta for us for a given sample size, but first let’s show it graphically….

Power = 1 − β

Page 84: NG BB 33 Hypothesis Testing Basics

Beta and Alpha

Here is our first population and its corresponding alpha risk.

[Figure: distribution centered at μ1; the 95% Confidence Limit (alpha = .05) for the mean, μ1, is marked as the critical value, and the tail beyond it is labeled Alpha Risk]

Page 85: NG BB 33 Hypothesis Testing Basics

Beta and Alpha (Cont.)

We want to compare these two populations. Do you think that we will easily be able to determine if they are different?

[Figure: two distributions centered at μ1 and μ2, separated by Δ; the 95% Confidence Limit (alpha = .05) for the mean, μ1, is marked as the critical value]

Page 86: NG BB 33 Hypothesis Testing Basics

Beta and Alpha (Cont.)

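If our sample from population 2 is in this grey area, we will not be able to see the difference. This is called Beta Risk.

[Figure: two distributions centered at μ1 and μ2, separated by Δ; the grey Beta Risk area is the part of the μ2 distribution on the μ1 side of the 95% Confidence Limit (alpha = .05) for the mean, μ1 (critical value)]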

Page 87: NG BB 33 Hypothesis Testing Basics

Beta and Delta

If we are trying to see a larger change, we have less Beta Risk.

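[Figure: distributions centered at μ1 and μ2 with a larger separation Δ; the Beta Risk area is shown relative to the 95% Confidence Limit (alpha = .025) for the mean, μ1 (critical value)]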

Page 88: NG BB 33 Hypothesis Testing Basics

Beta and Sigma

Now we’re back to our original graphic. What do you think happens to Beta Risk if the standard deviations of the populations decrease?

[Figure: the original two distributions centered at μ1 and μ2, separated by Δ, with the Beta Risk area and the 95% Confidence Limit (alpha = .05) for the mean, μ1 (critical value)]

Page 89: NG BB 33 Hypothesis Testing Basics

Beta and Sigma (Cont.)

If the standard deviation decreases, Beta Risk decreases.

Reducing variability has the same effect on Beta Risk as increasing sample size.

[Figure: two narrower distributions centered at μ1 and μ2, separated by Δ, with the Beta Risk area and the 95% Confidence Limit (alpha = .05) for the mean, μ1 (critical value)]

Page 90: NG BB 33 Hypothesis Testing Basics

How Can Power Be Increased?

Power is related to risk, variation, sample size, and the size of change that we want to detect.

If we want to detect a smaller delta (effect), we typically must increase our sample size.

Page 91: NG BB 33 Hypothesis Testing Basics

Example: Power

Let’s use Minitab to determine the beta risk of the hypothesis test we performed on the two teams.

First, we’ll have to make some assumptions.

We don’t know the TRUE difference in the means, so we’ll assume that it’s 0.682, the difference in the sample averages.

A variance hypothesis test shows that the variances are not equal.

We will average the variances from Minitab to determine the combined standard deviation using the following formula:

s = √((s1² + s2²) / 2)

Page 92: NG BB 33 Hypothesis Testing Basics

Example: Power (Cont.)

Select Stat > Power and Sample Size > 2-Sample t...

Page 93: NG BB 33 Hypothesis Testing Basics

Example: Power (Cont.)

To calculate Power, we need three things:
1. Sample Size
2. The Difference between the two Means
3. The Average Standard Deviation of the two samples

We can get all this information from our 2-Sample t-Test conducted earlier:

1. Sample Size = 100

2. Difference Between Means = 0.682 (600.230 − 599.548 = 0.682)

3. Average Standard Deviation: ?? (See Next Slide)

Page 94: NG BB 33 Hypothesis Testing Basics

Example: Power (Cont.)

To Calculate Average Standard Deviation

Remember that Standard Deviations are the Square Roots of the Variance. Since square roots are not additive (we cannot add them and divide by two) we have to convert them back to Variances which are additive.

StDev Squared = Variance
Team 1: 0.619 squared = 0.3832
Team 2: 1.870 squared = 3.4969

Sum = 3.8801
Divide by 2 to get Average = 1.9401

And Square Root of Average = 1.3929

So the Average Standard Deviation for the two samples is 1.3929
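The same arithmetic can be written as a short function. This is only a sketch of the calculation on this slide; the function name `average_std_dev` is hypothetical. Because the slide rounds at each intermediate step, the last digit may differ slightly from an unrounded calculation, but both agree at 1.393.

```python
import math

def average_std_dev(sd_1, sd_2):
    """Average the two variances, then take the square root."""
    return math.sqrt((sd_1**2 + sd_2**2) / 2)

s_avg = average_std_dev(0.619, 1.870)   # Team 1 and Team 2 standard deviations
print(round(s_avg, 3))
```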

Page 95: NG BB 33 Hypothesis Testing Basics

Example: Power (Cont.)

1. Type in Sample Size of 100 here

2. Type in Difference Between Means of 0.682 here

3. Type in Average Standard Deviation of 1.393 here

4. Click on OK

Page 96: NG BB 33 Hypothesis Testing Basics

Example: Power (Cont.)

The Power = 0.9312

And since Beta = (1 − Power), Beta = 0.0688.

If the TRUE difference between the two support orgs. was 0.682, we would have a 6.88% chance of not observing this and therefore concluding that they are the same.
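The power figure can be roughly reproduced outside Minitab. The sketch below is an approximation, not Minitab's method: it uses the normal distribution in place of the noncentral t distribution, so it returns a value close to, but not exactly, 0.9312. The function name `two_sample_power` and its parameter names are hypothetical; the inputs (0.682, 1.393, 100) are the values entered on the previous slides, with 100 assumed to be the sample size per team.

```python
import math
from statistics import NormalDist

def two_sample_power(delta, sigma, n_per_group, alpha=0.05):
    """Approximate power of a two-sided 2-sample test (normal approximation)."""
    std_normal = NormalDist()
    se = sigma * math.sqrt(2 / n_per_group)        # standard error of the difference
    z_crit = std_normal.inv_cdf(1 - alpha / 2)     # two-sided critical value
    shift = abs(delta) / se                        # true difference in standard errors
    # Probability that the test statistic falls beyond either critical value
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)

power = two_sample_power(delta=0.682, sigma=1.393, n_per_group=100)
beta = 1 - power
print(f"power = {power:.3f}, beta = {beta:.3f}")
```

Calling the same function with a smaller `delta` shows how quickly power drops when the difference to be detected shrinks.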

Page 97: NG BB 33 Hypothesis Testing Basics

Example: Power (Cont.)

In practice, we evaluate the power of a test to determine its ability to detect a difference of a given magnitude that we deem important, or practically significant.

For example, we could calculate the power of a hypothesis test to see if we could measure a one minute difference in responsiveness between the two teams.

Page 98: NG BB 33 Hypothesis Testing Basics

Example: Power (Cont.)

Let’s say that if the two support organizations’ cycle times differ by as little as 0.4 seconds, then we need to analyze the reasons for the differences.

What is the power of our test to detect this difference?

What is the probability of making a type II error (concluding that there is no difference when one exists)?

Use Minitab to individually answer these questions.

Page 99: NG BB 33 Hypothesis Testing Basics

Exercise: Sample Size

Now that we understand the relationship between Beta, Power, Delta, and Sample Size, we can use this information to calculate the sample size necessary to give us the information we want.

We simply use the same function in Minitab to solve for sample size rather than power.

This is a very useful and common extension of Hypothesis Testing.

Page 100: NG BB 33 Hypothesis Testing Basics

Exercise: Sample Size (Cont.)

Here we enter the Difference (delta) we wish to detect, and the minimum Power value that we are willing to live with.

We leave Sample sizes blank.

Page 101: NG BB 33 Hypothesis Testing Basics

Exercise: Sample Size

Let’s extend our response team cycle time example

Determine what sample size we would need to detect a difference of 0.4 seconds at a power of 0.90.

What about at a power of 0.95?

What about at a power of 0.95 and an alpha of 0.025?

Hint: Click the Options button in the Power and Sample Size dialogue box.
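As a cross-check on the Minitab results, the sketch below uses the textbook normal-approximation formula n = 2((z_α/2 + z_β)·σ/Δ)² for the sample size per group. This is an assumption-laden approximation: Minitab works with the t distribution and will typically report a slightly larger n. The function name `n_per_group` is hypothetical, and the standard deviation of 1.393 is carried over from the power example.

```python
import math
from statistics import NormalDist

def n_per_group(delta, sigma, power, alpha=0.05):
    """Approximate sample size per group for a two-sided 2-sample test."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)   # controls Type I risk
    z_beta = std_normal.inv_cdf(power)            # controls Type II risk
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)

# The three scenarios from the exercise (difference of 0.4 seconds)
for power, alpha in [(0.90, 0.05), (0.95, 0.05), (0.95, 0.025)]:
    print(power, alpha, n_per_group(0.4, 1.393, power, alpha))
```

Raising the power or lowering alpha both increase the required sample size, which is the relationship the exercise is meant to show.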

Page 102: NG BB 33 Hypothesis Testing Basics

Other Power and Sample Size Scenarios

We can perform these calculations not only for the difference between two means, but for other tests as well.

Page 103: NG BB 33 Hypothesis Testing Basics

1-Sample t-test in Minitab

Now, we will return to Minitab to test the following hypothesis about our two support organizations’ cycle times:

Ho: μ1 = 600
Ha: μ1 ≠ 600

Page 104: NG BB 33 Hypothesis Testing Basics

Back to the Support Organization Example: One Sample t-Test

1-Sample t-test in Minitab

Choose Stat > Basic Statistics > 1-Sample t to test the mean of each response team against a standard or spec

Page 105: NG BB 33 Hypothesis Testing Basics

1-Sample t-test in Minitab

Double click on C1 Team 1 and C2 Team 2 to place them in the dialog box here.

Type in the Hypothesized mean, or standard we are comparing to. Here it is 600.

Click the Graphs button to get to the Graphs dialog box.

Page 106: NG BB 33 Hypothesis Testing Basics

1-Sample t-test in Minitab (Cont.)

Select Histogram of data and Boxplot of data

Click OK here and on the previous screen

Page 107: NG BB 33 Hypothesis Testing Basics

1-Sample t-test in Minitab

[Figure: Histogram of Team 1 (with Ho and 95% t-confidence interval for the mean); x-axis: Team 1, ticks from 598.0 to 601.0; y-axis: Frequency]

This shows the Target we are testing, along with the Average and the Confidence Interval from the data.

Page 108: NG BB 33 Hypothesis Testing Basics

1-Sample t-test in Minitab (Cont.)

[Figure: Boxplot of Team 1 (with Ho and 95% t-confidence interval for the mean); axis: Team 1, ticks from 598.0 to 601.5]

Page 109: NG BB 33 Hypothesis Testing Basics

1-Sample t-test in Minitab (Cont.)

[Figure: Histogram of Team 2 (with Ho and 95% t-confidence interval for the mean); x-axis: Team 2, ticks from 597.0 to 603.0; y-axis: Frequency]

Page 110: NG BB 33 Hypothesis Testing Basics

1-Sample t-test in Minitab (Cont.)

[Figure: Boxplot of Team 2 (with Ho and 95% t-confidence interval for the mean); axis: Team 2, ticks from 596 to 605]

Page 111: NG BB 33 Hypothesis Testing Basics

1-Sample t-test in Minitab (Cont.)

Here is the descriptive output for the 1-Sample t-Test found in Session Window

Page 112: NG BB 33 Hypothesis Testing Basics

2-Sided and 1-Sided Hypothesis Tests

We have concentrated on 2-sided hypothesis tests.

2-Sided tests determine whether or not two items are equal or whether a parameter is equal to some value.

Whether an item is less than or greater than another item or a value is not sought up front. A 2-sided test is a less specific test.

The alternative hypothesis is “Not Equal”.

Everything we have learned also applies to 1-sided tests.

1-Sided tests determine whether or not an item is less than (<) or greater than (>) another item or value.

The alternative hypothesis is either (<) or (>).

This makes for a more powerful test (lower beta at a given alpha and sample size).
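The gain from a 1-sided test can be seen in the p-values themselves. The sketch below is an illustration only: it uses a standard normal test statistic, and the value z = 1.8 is hypothetical. When the observed effect lies in the direction named by the alternative hypothesis, the 1-sided p-value is half the 2-sided one, so the same data can be significant at alpha = 0.05 in a 1-sided test and not in a 2-sided test.

```python
from statistics import NormalDist

def p_values(z):
    """Return (2-sided, upper 1-sided) p-values for a standard normal statistic z."""
    std_normal = NormalDist()
    upper_tail = 1 - std_normal.cdf(z)                 # Ha: parameter > value
    two_sided = 2 * (1 - std_normal.cdf(abs(z)))       # Ha: parameter != value
    return two_sided, upper_tail

two_sided, one_sided = p_values(1.8)   # hypothetical test statistic
print(f"2-sided p = {two_sided:.4f}, 1-sided p = {one_sided:.4f}")
```

The trade-off is that a 1-sided test cannot detect an effect in the opposite direction, so the direction should be chosen before looking at the data.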

Page 113: NG BB 33 Hypothesis Testing Basics

More Detailed Information

Remember to use the Stat Guide button to learn more about the results and to help you interpret them.

Page 114: NG BB 33 Hypothesis Testing Basics

Hypothesis Test Summary Template

Hypothesis Test (ANOVA, 1 or 2 sample t-test, Chi Squared, Regression, Test of Equal Variance, etc.) | Factor (x) Tested | p Value | Observations/Conclusion
Example: ANOVA | Location | 0.030 | Significant factor - 1 hour driving time from DC to Baltimore office causes ticket cycle time to generally be longer for the Baltimore site
Example: ANOVA | Part vs. No Part | 0.004 | Significant factor - on average, calls requiring parts have double the cycle time (22 vs 43 hours)
Example: Chi Squared | Department | 0.000 | Significant factor - Department 4 has digitized addition of customer info to ticket and less human intervention, resulting in fewer errors
Example: Pareto | Region | n/a | South region accounted for 59% of the defects due to their manual process and distance from the parts warehouse

Describe any other observations about the root cause (x) data

Optional BB Deliverable - Example -

Page 115: NG BB 33 Hypothesis Testing Basics

[Figure: Boxplots of Net Hour by Part/No (means are indicated by solid circles); x-axis: Part/No Part (No Part, Part); y-axis: Net Hours Call Open, ticks from 0 to 150]

Analysis of Variance for Net Hour

Source    DF     SS      MS     F      P
Part/No    1    7421    7421   8.65   0.004
Error     69   59194     858
Total     70   66615

Individual 95% CI's For Mean

Level      N    Mean    StDev
No Part   27   21.99    19.95
Part      44   43.05    33.70

[Interval plot of the two 95% confidence intervals; axis ticks at 12, 24, 36, 48]

Pooled StDev = 29.29
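The entries in the ANOVA table are linked by simple arithmetic, which the sketch below re-derives from the sums of squares and degrees of freedom shown above. It is only a consistency check on the printed table; the variable names are hypothetical, and the p-value itself is not recomputed because that requires the F distribution.

```python
import math

# Values copied from the ANOVA table above
ss_factor, df_factor = 7421, 1
ss_error, df_error = 59194, 69

ms_factor = ss_factor / df_factor      # mean square for Part/No
ms_error = ss_error / df_error         # mean square for Error
f_stat = ms_factor / ms_error          # F = MS(factor) / MS(error)
pooled_sd = math.sqrt(ms_error)        # pooled standard deviation

print(f"MS error = {ms_error:.0f}, F = {f_stat:.2f}, pooled StDev = {pooled_sd:.2f}")
```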

After further investigation, possible reasons proposed by the team are OEM backorders, lack of technician certifications and the distance from the OEM to the client site. It is also caused by the need for technicians to make a second visit to the end user to complete the part replacement. Next step will be for the team to confirm these suspected root causes.

Boxplot: Part/ No Part Impact on Ticket Cycle Time

Because the p-value <= 0.05, we can be confident that calls requiring parts do have an impact on the ticket cycle time.

One-Way ANOVA Template

Optional BB Deliverable

- Example -

Page 116: NG BB 33 Hypothesis Testing Basics

Linear Regression Template

R-Sq = 94.1%: the fitted line accounts for 94.1% of the variation in “Wait Time” using the “Qty of Deliveries”

[Figure: Fitted Line Plot; x-axis: Deliveries, ticks from 10 to 35; y-axis: Wait Time, ticks from 35 to 55]

Wait Time = 32.05 + 0.5825 Deliveries

S = 1.11885
R-Sq = 94.1%
R-Sq(adj) = 93.9%
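Once a fitted line is available, it can be used to predict the response for a given input. The sketch below simply evaluates the equation from the fitted line plot; the function name `predicted_wait_time` is hypothetical, and predictions are only reasonable inside the range of deliveries shown on the plot (roughly 10 to 35).

```python
def predicted_wait_time(deliveries):
    """Evaluate the fitted line: Wait Time = 32.05 + 0.5825 * Deliveries."""
    return 32.05 + 0.5825 * deliveries

# Example: predicted wait time for 20 deliveries
print(round(predicted_wait_time(20), 2))
```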

Optional BB Deliverable

- Example -

Page 117: NG BB 33 Hypothesis Testing Basics

Takeaways

Since it is not always practical or possible to measure every item in the population, you take a random sample.

A basic understanding of the terms: Population, Sample, Population Parameter, Sample Statistic, Sample Mean, and Sample Standard Deviation

How to calculate a confidence interval with the population standard deviation known

How to calculate a confidence interval with the population standard deviation unknown

Page 118: NG BB 33 Hypothesis Testing Basics

Takeaways (Cont.)

How Hypothesis tests help us handle uncertainty

The role of sample size, variation, and confidence level

The null and alternative hypotheses

Type I and Type II errors

Hypothesis tests in Minitab

Stat Guide

p-value

Page 119: NG BB 33 Hypothesis Testing Basics

Takeaways (Cont.)

How to conduct a 1-Sample and 2-Sample t-test

How to conduct a Variance test (see Appendix)

How to conduct a Paired t-test (see Appendix)

Understanding of 1-sample and 2-sample tests of proportions (see Appendix)

Understanding the relationship between Power and sample size and detectable difference (delta)

Page 120: NG BB 33 Hypothesis Testing Basics

What other comments or questions do you have?

Page 121: NG BB 33 Hypothesis Testing Basics

References

Hildebrand and Ott, Statistical Thinking for Managers, 4th Edition

Kiemele, Schmidt, and Berdine, Basic Statistics, 4th Edition

Page 122: NG BB 33 Hypothesis Testing Basics

APPENDIX

Page 123: NG BB 33 Hypothesis Testing Basics

Hypothesis Testing - Steps

Step 1: Define the problem objective

Step 2: Determine what data to collect (continuous or attribute)

Step 3: Based on data type, determine the appropriate hypothesis test to use

Step 4: Specify the null (H0) hypothesis and the alternative (H1) hypothesis

Step 5: Select a significance level (degree of risk acceptable), usually 0.05

Step 6: Execute the data collection plan from Step 2

Step 7: From the sample, conduct the hypothesis test using a statistical tool

Step 8: Identify the p-value

Step 9: Compare the p-value to the significance level - if the p-value is less than or equal to your acceptable risk (your alpha), then the null hypothesis is rejected

Step 10: Translate the decision to the situation

Page 124: NG BB 33 Hypothesis Testing Basics

Decision Tree Matrix

Data Type (Step 2) | Hypothesis to be Tested (Step 3) | Tree
Variable | Testing equality of population MEAN (average) to a specific value | 1
Variable | Testing equality of population MEANS (averages) from two populations | 2
Variable | Testing equality of population MEANS (averages) from more than two populations | 3
Variable | Testing equality of population VARIANCES (standard deviation) from more than two populations | 4
Attribute - Binomial "Go/No-Go", "Pass/Fail" or "Defective" Data | Testing equality of population PROPORTIONS (binomial data; e.g., pass/fail, go/no go, is/is not, etc.) from one or more populations | 5
Attribute - Poisson "Count" or "Defects" data | Testing equality of population PROPORTIONS (Poisson data; i.e., frequency of occurrence in time or space) from two or more populations | 6
Attribute (Contingency Table Data) | Testing for ASSOCIATION (not necessarily causal). Note: For use with attribute data only. For variable data, use correlation or regression. No decision tree required. | 7

Page 125: NG BB 33 Hypothesis Testing Basics

Decision Tree # 1

Application: Testing Equality of Population Mean to a Specific Value
Type of Data: Variable (Continuous)
Example: Has the average button diameter from the welder changed from its historical value?

Start: Is n > 30?
  Yes: 1-Sample Z-test
       Stat > Basic Statistics > 1-Sample Z
  No: Is the population normally distributed? (Anderson-Darling)
    Yes: 1-Sample t-test (reasonably robust against normality assumption)
         Stat > Basic Statistics > 1-Sample t
    No: 1-Sample Wilcoxon test (random sample from a continuous, symmetric population)
        Stat > Nonparametrics > 1-Sample Wilcoxon

Page 126: NG BB 33 Hypothesis Testing Basics

Decision Tree # 2

Application: Testing Equality of Means from Two Populations
Type of Data: Variable (Continuous)
Example: Is the average button diameter from Welder A different from that of Welder B?

Start: Are the two samples dependent?
  Yes: Paired t-test (samples from normal distribution)
       Stat > Basic Statistics > Paired t
  No: Do n1 and n2 both exceed 30?
    Yes: 2-Sample Z-test
         Stat > Basic Statistics > 2-Sample t
    No: Are both populations normally distributed? (Anderson-Darling)
      No: 2-Sample Mann-Whitney (independent, random variables from two populations with same shape, same variance)
          Stat > Nonparametrics > Mann-Whitney
          Note: If the two populations have different shapes or different standard deviations, then use: 2-Sample t-test without pooling variances
      Yes: Equal Variances? (F-test)
        Yes: 2-Sample t-test with pooled variances (reasonably robust against normality assumption)
             Stat > Basic Statistics > 2-Sample t (Assume equal variances)
        No: 2-Sample t-test without pooling variances
            Stat > Basic Statistics > 2-Sample t (Do not assume equal variances)

Page 127: NG BB 33 Hypothesis Testing Basics

Decision Tree # 3

Application: Testing Equality of Means from More than Two Populations
Type of Data: Variable (Continuous)
Example: Do the average button diameters from Welders A, B and C differ from one another?

Start: Do samples contain outliers? (Box Plot)
  Yes: Mood's Median test (independent, random samples from continuous distributions having same shape)
       Stat > Nonparametrics > Mood's Median test
  No: Are the populations normally distributed? (Anderson-Darling)
    Yes: One-Way Analysis of Variance (ANOVA) (reasonably robust against assumptions of normality and equal variances)
         Stat > ANOVA > One-way
    No: Kruskal-Wallis (independent, random samples from continuous distributions having same shape)
        Stat > Nonparametrics > Kruskal-Wallis

Did the test show significance?
  Yes: Tukey's test to conduct pairwise comparisons
       Stat > ANOVA > One-way, Comparisons: Tukey's
  No: Stop

Note: Use Dunnett's Method if comparing treatments to a control.

Page 128: NG BB 33 Hypothesis Testing Basics

Decision Tree # 4

Application: Testing Equality of Variances
Type of Data: Variable (Continuous)
Example: Do the variances in button diameter from the three welders differ from one another?

Start: How many populations are being compared?
  2: Are the populations normally distributed? (Anderson-Darling)
    Yes: F-test
         Stat > Basic Statistics > 2 Variances
    No: Levene's test
        Stat > Basic Statistics > 2 Variances
  More than 2: Are the populations normally distributed? (Anderson-Darling)
    Yes: Bartlett's test
         Stat > ANOVA > Test for Equal Variances
    No: Levene's test
        Stat > ANOVA > Test for Equal Variances

Note: The F-test and Bartlett's test are not robust against the normality assumption.

Page 129: NG BB 33 Hypothesis Testing Basics

Decision Tree # 5

Application: Testing Equality of Population Proportions
Type of Data: Attribute (Discrete) - Binomial Distribution

Case 1: Testing Population Proportion Against a Specific Value
  Example: Has the % defective rate on Line 1 changed from its historical value?
  Stat > Basic Statistics > 1-Proportion

Case 2: Testing Equality of Proportions from Two Populations
  Example: Are Lines 1 and 2 running at the same % defective rate?
  Stat > Basic Statistics > 2-Proportions
  Ho: P1 = P2 (no difference in population proportions)
  Minitab - Options: select pooled p

Case 3: Testing Equality of Proportions from More than Two Populations
  Example: Are Lines 1, 2 and 3 running at the same % defective rate?
  Use Chi-Square test
  Minitab: Stat > Tables > Chi-square test

Page 130: NG BB 33 Hypothesis Testing Basics

Decision Tree # 6

Application: Testing Equality of Population Defect Rates
Type of Data: Attribute (Discrete) - Poisson Distribution
Examples:
1) Is the number of errors on invoices different between Dept. A and Dept. B?
2) Does the number of seat defects differ among shifts 1, 2 and 3?

Comparing two Poisson Distributions:
  Use 2 Sample t-test
  Stat > Basic Stat > 2 Sample t
  Stat > Basic Stats > 2-sample Poisson Rate

Comparing more than two Poisson Distributions:
  Use One-Way Analysis of Variance
  Stat > ANOVA > One-way

Caution: No Extreme Outliers

Page 131: NG BB 33 Hypothesis Testing Basics

Decision Tree # 7

Application: Testing for Association
Type of Data: Attribute (Contingency Table Data)
Example: Does the type of defect that occurs depend on which product is being produced?

Chi-square test
Minitab: Stat > Tables > Chi-square test

Page 132: NG BB 33 Hypothesis Testing Basics

Group 1: Hypothesis Tests for Variation

Use the Minitab electronic docs, stat guide, and help to learn about performing hypothesis tests for equality of variance among two populations. You may also use your textbooks if you wish.

Prepare a 10-15 minute teachback on hypothesis tests for variation.

Be sure to work an example in your teachback.

Hint: Using Minitab to conduct a hypothesis test to determine if there is a difference in the amount of variation exhibited by each support organization would be a good example to use.

Page 133: NG BB 33 Hypothesis Testing Basics

Group 2:

Paired t-Test

Use the Minitab electronic docs, stat guide, and help to learn about performing paired t-tests. You may also use your textbooks if you wish.

Prepare a 10-15 minute teachback on paired t-tests.

Be sure to illustrate the difference between a standard 2-sample t-test and a paired t-test.

Be sure to work an example in your teachback.

Go through a sample size calculation in your example.

Page 134: NG BB 33 Hypothesis Testing Basics

Group 3:

Hypothesis Tests with Proportions

Use the Minitab electronic docs, stat guide, and help to learn about performing hypothesis tests with proportions. You may also use your textbooks if you wish.

Prepare a 10-15 minute teachback on hypothesis tests with proportions.

Be sure to illustrate the main difference between hypothesis tests of proportions and the other hypothesis tests we have talked about.

Include both 1-proportion and 2-proportion hypothesis tests.

Be sure to work an example in your teachback.

Go through a sample size calculation in your example.

Page 135: NG BB 33 Hypothesis Testing Basics

Confidence Interval Formulas

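Confidence Intervals for:

Mean (σ Known):

$$\bar{x} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

Mean (σ Unknown):

$$\bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$$

Standard Deviation:

$$s\sqrt{\frac{n-1}{\chi^2_{\alpha/2,\,n-1}}} \;\le\; \sigma \;\le\; s\sqrt{\frac{n-1}{\chi^2_{1-\alpha/2,\,n-1}}}$$

Proportions (Approximate):

$$\hat{p} - Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \;\le\; p \;\le\; \hat{p} + Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

Here the subscript on $Z$, $t$ and $\chi^2$ is taken to be the upper-tail area, so $\chi^2_{\alpha/2,\,n-1}$ is the larger of the two chi-square values.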
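The Z-based intervals on this slide (mean with σ known, and the approximate interval for a proportion) can be sketched in Python using only the standard library. The function names and the numbers in the usage lines are assumptions made for illustration; the t and chi-square intervals need critical values from tables or from statistical software and are not covered here.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(confidence):
    """Z value leaving alpha/2 in the upper tail, e.g. about 1.96 for 95%."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


def mean_ci_sigma_known(xbar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is known."""
    half_width = z_critical(confidence) * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


def proportion_ci(defectives, n, confidence=0.95):
    """Approximate (normal-based) confidence interval for a proportion."""
    p_hat = defectives / n
    half_width = z_critical(confidence) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Hypothetical inputs, for illustration only
print(mean_ci_sigma_known(xbar=50.0, sigma=4.0, n=16))
print(proportion_ci(defectives=20, n=100))
```

The proportion interval relies on the normal approximation, so it is best treated as a rough guide when the sample is small or the proportion is close to 0 or 1.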

Page 136: NG BB 33 Hypothesis Testing Basics

Table of Normal Curve Areas

Source: Statistical Thinking for Managers, Hildebrand and Ott, 4th Edition, page 800.

z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09

0.00 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359

0.10 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753

0.20 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141

0.30 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517

0.40 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879

0.50 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224

0.60 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549

0.70 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852

0.80 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133

0.90 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389

1.00 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621

1.10 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830

1.20 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015

1.30 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177

1.40 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319

1.50 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441

1.60 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545

1.70 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633

1.80 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706

1.90 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767

2.00 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817

2.10 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857

2.20 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890

2.30 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916

2.40 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936

2.50 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952

2.60 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964

2.70 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974

2.80 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981

2.90 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986

3.00 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990

z area

3.5 0.49976737

4.0 0.49996833

4.5 0.49999660

5.0 0.49999971

Source: Computed by P.J. Hildebrand.
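The tabled values are areas under the standard normal curve between 0 and z. As a cross-check, a minimal sketch using Python's standard library reproduces a few entries; the helper name `area_zero_to_z` is an assumption made for illustration.

```python
from statistics import NormalDist


def area_zero_to_z(z):
    """Area under the standard normal curve between 0 and z."""
    return NormalDist().cdf(z) - 0.5


for z in (1.00, 1.96, 2.50):
    print(f"z = {z:.2f}  area = {area_zero_to_z(z):.4f}")
# z = 1.00  area = 0.3413
# z = 1.96  area = 0.4750
# z = 2.50  area = 0.4938
```

To read the table by hand, take the row for the first decimal of z and the column for the second decimal: z = 1.96 is row 1.90, column 0.06.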

[Figure: standard normal curve; the tabled value is the area under the curve between 0 and z.]

Page 137: NG BB 33 Hypothesis Testing Basics

Calculation of t Test Statistic

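The t test statistic is calculated as follows:

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - D_0}{s_{\bar{X}_1 - \bar{X}_2}}$$

where $D_0$ is the hypothesized difference between the two population means.

For an assumption of unequal variances:

$$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$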
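To make the unequal-variances calculation concrete, here is a minimal Python sketch of the statistic. The function name `t_unequal_variances` and the sample values are hypothetical; looking up the p-value (and the degrees of freedom Minitab uses for this case) is left to the software.

```python
from math import sqrt
from statistics import mean, variance


def t_unequal_variances(sample1, sample2, d0=0.0):
    """t statistic for two independent samples, variances not assumed equal."""
    n1, n2 = len(sample1), len(sample2)
    # Standard error of the difference between the two sample means
    se = sqrt(variance(sample1) / n1 + variance(sample2) / n2)
    return (mean(sample1) - mean(sample2) - d0) / se


# Hypothetical measurements from two groups
group_a = [10.0, 12.0, 14.0]
group_b = [8.0, 9.0, 10.0]
print(f"t = {t_unequal_variances(group_a, group_b):.3f}")
```

Here `variance` is the sample variance (divisor n - 1), which matches the $s^2$ terms in the standard error.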

Page 138: NG BB 33 Hypothesis Testing Basics

Calculation of t Test Statistic

For an assumption of equal variances:

$$s_{\bar{X}_1 - \bar{X}_2} = s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

where

$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$
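The equal-variances case can be sketched the same way, with the pooled standard deviation computed first. The function names and data below are hypothetical and only illustrate the arithmetic.

```python
from math import sqrt
from statistics import mean, variance


def pooled_std_dev(sample1, sample2):
    """Pooled standard deviation s_p of two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    weighted = (n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)
    return sqrt(weighted / (n1 + n2 - 2))


def t_equal_variances(sample1, sample2, d0=0.0):
    """t statistic for two independent samples, assuming equal variances."""
    n1, n2 = len(sample1), len(sample2)
    se = pooled_std_dev(sample1, sample2) * sqrt(1 / n1 + 1 / n2)
    return (mean(sample1) - mean(sample2) - d0) / se


# Hypothetical measurements from two groups
group_a = [10.0, 12.0, 14.0]
group_b = [8.0, 9.0, 10.0]
print(f"s_p = {pooled_std_dev(group_a, group_b):.3f}")
print(f"t   = {t_equal_variances(group_a, group_b):.3f}")
```

When the two samples are the same size, this statistic equals the unequal-variances one; the two differ only when the sample sizes differ.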