Hypothesis Testing


Page 1

Hypothesis Testing

Page 2

Hypothesis Testing

Why do we need it?

– Simply, we are looking for a statistical measure that will allow us to conclude there is truly a difference between the data from two samples. Mathematically, we draw this inference with a stated degree of confidence in our decision.

------------------------------------------------------------------------------

What does it prove?

– it helps us determine whether observed differences are:

statistically significant

or

due to chance (random or common cause variation)

Page 3

Test a Hypothesis – (Ho and Ha)

Null Hypothesis, (Ho)

• comes from word nullify (to negate)

• associated with distribution of chance events

• typically, the null hypothesis is: “2 samples are the same, except for variation caused by chance”

------------------------------------------------------------------------------

Alternative Hypothesis, (Ha)

• used as an alternative to the null hypothesis

• identifies a distribution of events that is not a chance distribution

• typically, the alternative hypothesis is: “2 samples are fundamentally different”

Page 4

Null Hypothesis and Risk

Alpha Risk -

finding a difference when one doesn’t really exist [A FALSE REJECT]

[ the probability of rejecting Ho when it is really true ] usually set at 5% or less

e.g. – jury returned a GUILTY verdict when the person was really INNOCENT

- rejecting a good part on the assembly line (aka: producer’s risk)

-------------------------------------------------------------------------------

Beta Risk - NOT finding a difference when there is one [A FALSE ACCEPT]

[ the probability of accepting Ho when it is really false ]

usually set at 10% or less

e.g. - taxi driver thought the corner was safe when it was really dangerous

- accepting a defective part from the assembly line (aka: consumer’s risk)

NOTE: Statistically, the p-value is the probability that the observed result occurred by “chance only”

(if Ho is true (no difference), a “high” p-value, > .05, is the typical outcome)
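The alpha and beta risks above can be made concrete with a small simulation. The sketch below is a hypothetical illustration (not part of the original slides, and it uses Python with numpy/scipy rather than the Minitab the later pages assume): it draws many pairs of samples, first from identical populations and then from populations whose means really differ, and counts how often a two-sample t-test at the .05 level gives a false reject (alpha) or a false accept (beta).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_trials, n = 5_000, 30          # number of simulated studies, sample size per group
alpha_level = 0.05

# Alpha risk: both samples come from the SAME population (Ho is true),
# so every "p < .05" result is a false reject.
false_rejects = 0
for _ in range(n_trials):
    a = rng.normal(10, 2, n)
    b = rng.normal(10, 2, n)
    if stats.ttest_ind(a, b).pvalue < alpha_level:
        false_rejects += 1
print(f"Observed alpha risk: {false_rejects / n_trials:.3f}  (expected ~0.05)")

# Beta risk: the populations really DO differ (Ho is false),
# so every "p >= .05" result is a false accept (a missed difference).
false_accepts = 0
for _ in range(n_trials):
    a = rng.normal(10, 2, n)
    b = rng.normal(11.5, 2, n)   # true shift of 1.5 units
    if stats.ttest_ind(a, b).pvalue >= alpha_level:
        false_accepts += 1
print(f"Observed beta risk:  {false_accepts / n_trials:.3f}  (depends on the shift and on n)")
```

The observed alpha rate hovers near the chosen 5% regardless of sample size, while the beta rate shrinks as the true shift or the sample size grows.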

Page 5

Alpha & Beta Risk (GENERAL RULES)

Hypothesis testing tests the “NULL” hypothesis [Ho = NO difference] against an alternative hypothesis [Ha = groups (data) are different]

----------------------------------------------------------------------------------

If p-value < .05: reject Ho and conclude Ha – the groups truly are different.

If p-value > .05: we cannot reject Ho – no true difference has been detected.

-----------------------------------------------------------------------------------

Why use it? To detect differences that may be important to the business. Is a minor difference in averages due to random variation, or does it reflect a true difference? We want to see the impact of our intervention.
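A minimal worked example of the p < .05 rule is sketched below, with made-up cycle-time numbers and Python’s scipy.stats standing in for the Minitab commands the later pages assume.

```python
from scipy import stats

# Hypothetical cycle times (minutes) before and after a process change.
before = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7]
after  = [11.2, 11.5, 11.0, 11.4, 11.6, 11.1, 11.3, 11.5]

p = stats.ttest_ind(before, after).pvalue
if p < 0.05:
    print(f"p = {p:.4f} < .05: reject Ho, conclude Ha - the groups truly differ")
else:
    print(f"p = {p:.4f} > .05: cannot reject Ho - no difference detected")
```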

Page 6

Statistical Difference vs. Practical Importance

You Decide …

• If there are large amounts of data, or the variation within the data is very small, hypothesis tests can detect very small differences between samples

• While the samples are statistically different, the differences may not mean much in the PRACTICAL world.

• DOES IT MAKE BUSINESS SENSE?

• DOES IT PASS THE COMMON-SENSE TEST?
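The first bullet above can be illustrated with a sketch (hypothetical numbers in arbitrary units; Python/scipy used here as a stand-in for Minitab): with very large samples, even a 0.01-unit shift in the mean comes out “statistically significant”, yet it may be far too small to matter in practice.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n = 500_000                                  # very large samples
a = rng.normal(50.00, 1, n)
b = rng.normal(50.01, 1, n)                  # true difference of only 0.01 units

res = stats.ttest_ind(a, b)
print(f"difference in means = {b.mean() - a.mean():.3f}")
print(f"p-value = {res.pvalue:.4g}")         # with these settings, typically far below .05
print("Statistically different - but is a 0.01-unit shift worth acting on?")
```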

Page 7

What are the Data Assumptions?

• If the data is continuous, we assume the underlying distribution is Normal.

• You may need to transform non-Normal data (e.g., cycle times); see the sketch at the end of this page.

When comparing groups from different populations we assume:

• independent samples

• achieved through random sampling

• samples are representative (unbiased) of the population

When comparing groups from different processes we assume:

• each process is stable

• there are no special causes or shifts over time

• samples are representative of the process (unbiased)

Also note:

Pre-test and post-test measurements would violate independence: knowing one makes it possible to predict the other. Any repeated measures of the same individuals would also violate independence.
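As a small sketch of the normality check and transform mentioned above (hypothetical cycle-time data; Python’s scipy Anderson-Darling routine used in place of Minitab’s normality test): right-skewed cycle times typically fail the check, while their logarithms typically pass, so the transformed data can then be used with the Normal-theory tests.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
cycle_times = rng.lognormal(mean=1.0, sigma=0.6, size=80)   # right-skewed, like many cycle times

def anderson_normal(x, label):
    """Anderson-Darling check: a statistic above the 5% critical value suggests non-normal data."""
    res = stats.anderson(x, dist="norm")
    crit_5pct = res.critical_values[list(res.significance_level).index(5.0)]
    verdict = "looks non-normal" if res.statistic > crit_5pct else "consistent with normal"
    print(f"{label:>18}: A2 = {res.statistic:.2f}, 5% critical value = {crit_5pct:.2f} -> {verdict}")

anderson_normal(cycle_times, "raw cycle times")          # typically fails the check
anderson_normal(np.log(cycle_times), "log cycle times")  # typically passes after the transform
```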

Page 8

Two-Sample T-Test – “Are the means of these two normally distributed groups really different from each other?” If the p-value is .05 or less, it is usually accepted that the groups are different.

One-Way ANOVA – Similar to the two-sample t-test, except that it can handle more than two groups. Again, the groups must be normal. We also have the added requirement that the variances (and hence the standard deviations) of all the groups are approximately equal. A p-value of .05 or less indicates that the mean of at least one group is different from the rest.

Mann-Whitney – Similar to, but less powerful than, the two-sample t-test; it does not require normally distributed data.

Homogeneity of Variance – “Are the variances (and hence the standard deviations) of these groups of data equal?” Often used in preparation for ANOVA. If the p-value of Levene’s test is .05 or less, the variances are assumed to be unequal.

Normality Test – “Is this data normally distributed?” If the p-value of the Anderson-Darling test is .05 or less, the data is presumed to be not normal. For small groups of data, the “fat pencil test” is more meaningful.
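For reference, the tests described on this page have common equivalents in Python’s scipy.stats; the sketch below uses made-up groups (the slides themselves assume Minitab). ttest_ind plays the role of the two-sample t-test, f_oneway of one-way ANOVA, mannwhitneyu of Mann-Whitney, and levene of the homogeneity-of-variance check.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
g1 = rng.normal(20.0, 2.0, 25)      # three hypothetical groups of measurements
g2 = rng.normal(21.5, 2.0, 25)
g3 = rng.normal(20.2, 2.0, 25)

# Two-sample t-test: are the means of g1 and g2 different?
print("2-sample t   p =", round(stats.ttest_ind(g1, g2).pvalue, 4))

# One-way ANOVA: is the mean of at least one of the three groups different?
print("ANOVA        p =", round(stats.f_oneway(g1, g2, g3).pvalue, 4))

# Mann-Whitney: like the t-test but without assuming normality.
print("Mann-Whitney p =", round(stats.mannwhitneyu(g1, g2).pvalue, 4))

# Levene's test: are the group variances equal (checked before ANOVA)?
print("Levene       p =", round(stats.levene(g1, g2, g3).pvalue, 4))
```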

Page 9

Multi-Vari Study – A passive examination of the process as it runs in its normal state. By noting the state of key input variables, and the simultaneous state of output variables, useful correlations can often be found. Sometimes a Multi-Vari study will reveal the sources of problems; in other cases, the outputs of a Multi-Vari study become the inputs to a designed experiment. Outputs are often shown as Main Effects Plots and/or Boxplots.

Chi-Square Test – Used with count data arranged in a matrix of rows and columns; for example, TREATED and UNTREATED columns, and LIVED and DIED rows, in a 2x2 matrix. The counts entered into each cell are the number of people in each category. The p-value of the chi-square test indicates whether or not the rows and columns are statistically independent, i.e., does “treatment” or the lack of it influence survival?

Regression – Used with interval/ratio (variable) inputs and outputs. It answers the questions, “Are the inputs and outputs linearly correlated?” and “If they are linearly correlated, what is the formula that connects them?” One output is an equation of the form Y = mX + b, where Y is the output variable, m is the slope of the line, X is the input variable, and b is a constant (the Y-intercept). Another output is an R² value: an R² of 86% says that 86% of the observed variation is explained by the straight-line model, and 14% is not. Regression with more than one input variable is called Multiple Linear Regression.
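Both of these can be reproduced with scipy as a sketch (hypothetical counts and measurements; the slides assume Minitab): chi2_contingency tests independence of the treated/untreated and lived/died factors, and linregress returns the slope m, intercept b, and R² of the straight-line fit.

```python
import numpy as np
from scipy import stats

# Chi-square test of independence on a hypothetical 2x2 table:
# rows = LIVED / DIED, columns = TREATED / UNTREATED.
table = np.array([[90, 60],    # LIVED
                  [10, 40]])   # DIED
chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"chi-square p = {p:.4f}  (p < .05 suggests treatment and survival are NOT independent)")

# Simple linear regression: fit Y = mX + b and report R^2.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])
fit = stats.linregress(x, y)
print(f"Y = {fit.slope:.2f}X + {fit.intercept:.2f},  R^2 = {fit.rvalue**2:.2%}")
```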

Page 10

Hypothesis Testing “Roadmap”

For all tests: p > 0.05 – fail to reject Ho (null); p < 0.05 – reject Ho.

Continuous Data

• Normality Test
Ho: data are normal; Ha: data are NOT normal
Minitab: Stat > Basic Stat > Normality Test (use Anderson-Darling)

Normal data:

• One Sample – One-Sample T-Test
Ho: μ1 = μ target; Ha: μ1 ≠ μ target
Minitab: Stat > Basic Stat > 1 Sample T

• Two Samples – Two-Sample T-Test (variances equal)
Ho: μ1 = μ2; Ha: μ1 ≠ μ2
Minitab: Stat > Basic Stat > 2 Sample T (check box for equal variance)

• Two Samples – Two-Sample T-Test (variances not equal)
Ho: μ1 = μ2; Ha: μ1 ≠ μ2
Minitab: Stat > Basic Stat > 2 Sample T (check box for unequal variance)

• Three or More Samples – Equal Variance (Bartlett’s Test / Levene’s Test)
Ho: σ1 = σ2 = σ3 …; Ha: at least one is different
Minitab: Stat > ANOVA > Homog of Variance

• Three or More Samples – One-Way ANOVA
Ho: μ1 = μ2 = μ3 …; Ha: at least one is different
Minitab: Stat > ANOVA > One-Way

Non-Normal data:

• One Sample
Ho: M1 = M target; Ha: M1 ≠ M target
Minitab: Stat > Nonparametric > 1-Sample Sign (OR) Stat > Nonparametric > 1-Sample Wilcoxon

• Two or More Samples
Ho: M1 = M2 = M3 …; Ha: at least one is different
Minitab: Stat > Nonparametric > Mann-Whitney (OR) Kruskal-Wallis (OR) Mood’s Median (OR) Friedman

Attribute Data

• Contingency Table – Chi-Square Test
Ho: the two factors are INDEPENDENT; Ha: the two factors are DEPENDENT
Minitab: Stat > Tables > Chi-Square Test
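The roadmap can also be expressed as a small selection routine. The sketch below is a hypothetical helper (not from the slides, covering only the broad branches, with scipy equivalents named in place of the Minitab commands); it recommends a test from the data type, a normality check, and the number of samples.

```python
from scipy import stats

def recommend_test(samples, data_type="continuous", alpha=0.05):
    """Rough sketch of the roadmap: suggest a test from data type, normality, and sample count."""
    if data_type == "attribute":
        return "Attribute data: contingency table / chi-square test (stats.chi2_contingency)"

    # Continuous data: check normality of each sample first
    # (normaltest stands in for Anderson-Darling; it needs roughly 8+ points per sample).
    normal = all(stats.normaltest(s).pvalue > alpha for s in samples)

    if not normal:
        if len(samples) == 1:
            return "Non-normal, one sample: 1-sample sign / Wilcoxon (stats.wilcoxon)"
        return "Non-normal, 2+ samples: Mann-Whitney / Kruskal-Wallis (stats.mannwhitneyu / stats.kruskal)"

    if len(samples) == 1:
        return "Normal, one sample: one-sample t-test (stats.ttest_1samp)"
    if len(samples) == 2:
        equal_var = stats.levene(*samples).pvalue > alpha      # equal-variance check
        return f"Normal, two samples: two-sample t-test (equal_var={equal_var})"
    return "Normal, 3+ samples: check equal variances (stats.levene), then one-way ANOVA (stats.f_oneway)"
```

For example, calling recommend_test([before, after]) on two reasonably normal samples would point to the two-sample t-test branch of the roadmap.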