statistics hypothesis testing & confidence intervals
TRANSCRIPT
-
7/29/2019 Statistics Hypothesis Testing & Confidence Intervals
HYPOTHESIS TESTING & CONFIDENCE INTERVALS
DEPARTMENT OF STATISTICS
DR. RICK EDGEMAN, PROFESSOR & CHAIR, SIX SIGMA BLACK BELT
[email protected] OFFICE: +1-208-885-4410
-
The Hypothesis Testing Approach
[Diagram: Conjectures (Hypotheses); Gather & Evaluate Facts; Evaluation (Test Method); Zone of Belief; Consequences: A or B]
-
The Scientific Method
Scientific Method of Investigation
Observer \ Event: Noninformative Event | Informative Event
No Observer or Uninformed Observer: Nothing Learned | Little or Nothing Learned
Informed Observer: Little or Nothing Learned | Discovery!
-
Motivation for Hypothesis Testing
The intent of hypothesis testing is to formally examine two opposing conjectures (hypotheses), H0 and HA.
These two hypotheses are mutually exclusive and exhaustive, so that one is true to the exclusion of the other.
We accumulate evidence - collect and analyze sample information - for the purpose of determining which of the two hypotheses is true and which is false.
Beyond the issue of truth, addressed statistically, is the issue of justice. Justice is beyond the scope of statistical investigation.
-
The American Trial System
In Truth, the Defendant is: H0: Innocent | HA: Guilty
Verdict Innocent: Correct Decision (Innocent Individual Goes Free) | Incorrect Decision (Guilty Individual Goes Free)
Verdict Guilty: Incorrect Decision (Innocent Individual Is Disciplined) | Correct Decision (Guilty Individual Is Disciplined)
-
Hypothesis Testing & the American Justice System
State the Opposing Conjectures, H0 and HA.
Determine the amount of evidence required, n, and the risk of committing a type I error, $\alpha$.
What sort of evaluation of the evidence is required and what is the justification for this? (Type of Test)
What are the conditions which proclaim guilt and those which proclaim innocence? (Decision Rule)
Gather & evaluate the evidence.
What is the verdict? (H0 or HA?)
Determine Zone of Belief: Confidence Interval.
What is appropriate justice? --- Conclusions
-
True, But Unknown State of the World: H0 is True | HA is True
Decision H0 is True: Correct Decision | Incorrect Decision (Type II Error, Probability = $\beta$)
Decision HA is True: Incorrect Decision (Type I Error, Probability = $\alpha$) | Correct Decision
-
Hypothesis Testing Algorithm
1) Specify H0 and HA
2) Specify n and $\alpha$
3) What Type of Test and Why?
4) Critical Value(s) and Decision Rule (DR)
5) Collect Pertinent Data and Determine the Calculated Value of the Test Statistic (e.g. $Z_{calc}$, $t_{calc}$, $\chi^2_{calc}$, etc.)
6) Make a Decision to Either Reject H0 in Favor of HA or to Fail to Reject (FTR) H0.
7) Construct & Interpret the Appropriate Confidence Interval
8) Conclusions? Implications & Actions
-
Z-test & C.I. for $\mu$
H0: $\mu = \mu_0$ (or $\le$, $\ge$) vs. HA: $\mu \ne \mu_0$ (or $>$, $<$)
n = _______ $\alpha$ = _______
Testing a Hypothesis About a Mean;
Process Performance Measure is Approximately Normally Distributed;
We Know $\sigma$;
Therefore this is a Z-test - Use the Normal Distribution.
DR: ($\ne$ in HA) Reject H0 in favor of HA if $Z_{calc} < -Z_{\alpha/2}$ or if $Z_{calc} > +Z_{\alpha/2}$. Otherwise, FTR H0.
DR: ($>$ in HA) Reject H0 in favor of HA iff $Z_{calc} > +Z_{\alpha}$. Otherwise, FTR H0.
DR: ($<$ in HA) Reject H0 in favor of HA iff $Z_{calc} < -Z_{\alpha}$. Otherwise, FTR H0.
-
Z-test Algorithm (Continued)
$Z_{calc} = (\bar{X} - \mu_0)/(\sigma/\sqrt{n})$
_____ Reject H0 in Favor of HA. _______ FTR H0.
The Confidence Interval for $\mu$ is Given by: $\bar{X} \pm Z_{\alpha/2}(\sigma/\sqrt{n})$
Interpretation
-
t-test and Confidence Interval for $\mu$
H0: $\mu = \mu_0$ (or $\le$, $\ge$) vs. HA: $\mu \ne \mu_0$ (or $>$, $<$)
n = _______ $\alpha$ = _______
Testing a Hypothesis About a Mean;
Process Performance Measure is Approximately Normally Distributed or We Have a Large Sample;
We Do Not Know $\sigma$, Which Must be Estimated by S;
Therefore this is a t-test - Use Student's t Distribution.
DR: ($\ne$ in HA) Reject H0 in favor of HA if $t_{calc} < -t_{\alpha/2}$ or if $t_{calc} > +t_{\alpha/2}$. Otherwise, FTR H0.
DR: ($>$ in HA) Reject H0 in favor of HA iff $t_{calc} > +t_{\alpha}$. Otherwise, FTR H0.
DR: ($<$ in HA) Reject H0 in favor of HA iff $t_{calc} < -t_{\alpha}$. Otherwise, FTR H0.
-
t-test Algorithm (Continued)
$t_{calc} = (\bar{X} - \mu_0)/(s/\sqrt{n})$
_____ Reject H0 in Favor of HA. _______ FTR H0.
The Confidence Interval for $\mu$ is Given by: $\bar{X} \pm t_{\alpha/2}(s/\sqrt{n})$
Interpretation
-
Z-test & C.I. for p
H0: $p = p_0$ (or $\le$, $\ge$) vs. HA: $p \ne p_0$ (or $>$, $<$)
n = _______ $\alpha$ = _______
Testing a Hypothesis About a Proportion;
We have a large sample, that is, both $np_0$ and $n(1-p_0)$ > 5;
Therefore this is a Z-test - Use the Normal Distribution.
DR: ($\ne$ in HA) Reject H0 in favor of HA if $Z_{calc} < -Z_{\alpha/2}$ or if $Z_{calc} > +Z_{\alpha/2}$. Otherwise, FTR H0.
DR: ($>$ in HA) Reject H0 in favor of HA iff $Z_{calc} > +Z_{\alpha}$. Otherwise, FTR H0.
DR: ($<$ in HA) Reject H0 in favor of HA iff $Z_{calc} < -Z_{\alpha}$. Otherwise, FTR H0.
-
Z-test for a proportion
$Z_{calc} = (\hat{p} - p_0)/\sqrt{p_0(1-p_0)/n}$
_____ Reject H0 in Favor of HA. _______ FTR H0.
The Confidence Interval for p is Given by: $\hat{p} \pm Z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$
Interpretation
-
Advance, Inc.
Integrated Circuit Manufacturing
Methods & Materials
-
Z-Test & Confidence Interval: Training Effect Example
Interested in increasing the productivity rating in the integrated circuit (IC) division, Advance, Inc. determined that a methods review course would be of value to employees in the IC division.
To determine the impact of this measure they reviewed historical productivity records for the division and determined that the average level was 100 with a standard deviation of 10.
Fifty IC division employees participated in the course and the post-course productivity of these employees was measured, on average, to be 105.
Assume that productivity ratings are approximately normally distributed. Did the course have a beneficial effect? Test the appropriate hypothesis at the $\alpha$ = .05 level of significance.
-
Training Effect Example
H0: $\mu \le 100$ HA: $\mu > 100$
n = 50 $\alpha$ = .05
(i) testing a mean (ii) normal distribution (iii) $\sigma$ = 10 is known, so that this is a Z-test
DR: Reject H0 in favor of HA iff $Z_{calc} > 1.645$. Otherwise, FTR H0.
$Z_{calc} = (\bar{X} - \mu_0)/(\sigma/\sqrt{n}) = (105 - 100)/(10/\sqrt{50}) = 5/1.414 = 3.536$
X Reject H0 in favor of HA. _______ FTR H0
The 95% Confidence Interval is Given by: $\bar{X} \pm Z_{\alpha/2}(\sigma/\sqrt{n})$, which is $105 \pm 1.96(1.414) = 105 \pm 2.77$, or $102.23 < \mu < 107.77$.
Thus the course appears to have helped improve IC division employee productivity from an average level of 100 to a level that is at least 102.23 and at most 107.77.
A follow-up question: is this increase worth the investment?
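The arithmetic of the training effect example can be checked with a short Python sketch. It is only an illustration: it uses the standard library's `statistics.NormalDist` for the normal critical values, and the variable names are choices made here, not part of the slides.

```python
from math import sqrt
from statistics import NormalDist

# Values from the training effect example
xbar, mu0, sigma, n, alpha = 105, 100, 10, 50, 0.05

se = sigma / sqrt(n)                          # standard error, about 1.414
z_calc = (xbar - mu0) / se                    # about 3.536
z_crit = NormalDist().inv_cdf(1 - alpha)      # upper-tail critical value, about 1.645
reject_h0 = z_calc > z_crit                   # decision rule for ">" in HA

# Two-sided 95% confidence interval for the mean
z_half = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96
ci = (xbar - z_half * se, xbar + z_half * se)

print(f"Zcalc = {z_calc:.3f}, reject H0: {reject_h0}")
print(f"95% CI: {ci[0]:.2f} to {ci[1]:.2f}")
```

Run as written, this should reproduce Zcalc = 3.536 and the interval 102.23 to 107.77.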
-
Loan Application Processing
-
First Peoples Bank of Central City
First Peoples Bank of Central City would like to improve their loan application process. In particular, the amount of time currently required to process loan applications is approximately normally distributed with a mean of 18 days.
Measures intended to simplify and speed the process have been identified and implemented. Were they effective? Test the appropriate hypothesis at the $\alpha$ = .05 level of significance if a sample of 25 applications submitted after the measures were implemented gave an average processing time of 15.2 days and a standard deviation of 2.0 days.
-
First Peoples Bank of Central City
H0: $\mu \ge 18$ HA: $\mu < 18$
n = 25 $\alpha$ = .05
(i) testing a mean (ii) normal distribution (iii) $\sigma$ is unknown and must be estimated, so that this is a t-test
DR: Reject H0 in favor of HA iff $t_{calc} < -1.711$. Otherwise, FTR H0.
$t_{calc} = (\bar{X} - \mu_0)/(s/\sqrt{n}) = (15.2 - 18)/(2/\sqrt{25}) = -2.8/.4 = -7.00$
X Reject H0 in favor of HA. _______ FTR H0
The 95% Confidence Interval is Given by: $\bar{X} \pm t_{\alpha/2}(s/\sqrt{n})$, which is $15.2 \pm 2.064(.4) = 15.2 \pm .83$, or $14.37 < \mu < 16.03$.
Thus the measures appear to have helped decrease the average time required to process a loan application from 18 days to a level that is at least 14.37 days and at most 16.03 days.
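The loan processing calculation can be sketched the same way in Python. The standard library has no Student's t quantile function, so this sketch simply reuses the two critical values quoted in the example (1.711 and 2.064, for 24 degrees of freedom) as table look-ups; the variable names are illustrative.

```python
from math import sqrt

# Values from the loan processing example
xbar, mu0, s, n = 15.2, 18, 2.0, 25

# Critical values for 24 degrees of freedom, as quoted in the example
# (read from a t table rather than computed here)
t_crit_one_tail = 1.711   # alpha = .05 in one tail
t_crit_two_tail = 2.064   # alpha/2 = .025 in each tail

se = s / sqrt(n)                        # estimated standard error, 0.4
t_calc = (xbar - mu0) / se              # about -7.00
reject_h0 = t_calc < -t_crit_one_tail   # decision rule for "<" in HA

# Two-sided 95% confidence interval for the mean
half_width = t_crit_two_tail * se
ci = (xbar - half_width, xbar + half_width)

print(f"tcalc = {t_calc:.2f}, reject H0: {reject_h0}")
print(f"95% CI: {ci[0]:.2f} to {ci[1]:.2f}")
```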
-
Small Business
Loan Defaults
-
First Peoples Bank of Central City: Small Business Loan Defaults
Historically, 12% of Small Business Loans granted result in default. Three years ago, FPB of Central City purchased software which they hope will assist in reducing the default rate by more effectively discriminating between small business loan applicants who are likely to default and those who are not likely to do so.
After adequately training their loan officers in use of the software, FPB sampled 150 small business loan applications processed using the software and found 9 to be in default at the end of two years.
Using $\alpha$ = .10, does it appear that the software is of value?
-
Small Business Loan Default Rate
H0: $p \ge .12$ HA: $p < .12$
n = 150 $\alpha$ = .10
(i) testing a proportion (ii) $np_0$ = 150(.12) = 18 and $n(1-p_0)$ = 132, both > 5, so that this is a Z-test
DR: Reject H0 in favor of HA iff $Z_{calc} < -1.282$. Otherwise, FTR H0.
$Z_{calc} = (\hat{p} - p_0)/\sqrt{p_0(1-p_0)/n} = (.06 - .12)/\sqrt{.12(.88)/150} = -.06/.026533 = -2.261$
X Reject H0 in favor of HA. _______ FTR H0
The 90% Confidence Interval (since $\alpha$ = .10 gives $Z_{\alpha/2}$ = 1.645) is Given by: $\hat{p} \pm Z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$, which is $.06 \pm 1.645\sqrt{.06(.94)/150} = .06 \pm 1.645(.0194)$, or $.06 \pm .032$, or $.028 < p < .092$.
Thus the software appears to have helped decrease the small business loan default rate from a level of 12% to a level that is between 2.8% and 9.2%, with a best estimate of 6%.
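As a check on the loan default example, the following Python sketch recomputes the test statistic and the interval. It assumes only the standard library's `statistics.NormalDist`; the variable names are illustrative. Note that the multiplier 1.645 is $Z_{\alpha/2}$ for $\alpha$ = .10, so the interval it produces has 90% confidence.

```python
from math import sqrt
from statistics import NormalDist

# Values from the small business loan default example
defaults, n, p0, alpha = 9, 150, 0.12, 0.10

p_hat = defaults / n                          # sample proportion, 0.06
se_null = sqrt(p0 * (1 - p0) / n)             # standard error under H0, about .026533
z_calc = (p_hat - p0) / se_null               # about -2.261
z_crit = NormalDist().inv_cdf(1 - alpha)      # about 1.282
reject_h0 = z_calc < -z_crit                  # decision rule for "<" in HA

# The interval uses the sample proportion in the standard error
z_half = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.645
se_hat = sqrt(p_hat * (1 - p_hat) / n)        # about .0194
ci = (p_hat - z_half * se_hat, p_hat + z_half * se_hat)

print(f"Zcalc = {z_calc:.3f}, reject H0: {reject_h0}")
print(f"90% CI: {ci[0]:.3f} to {ci[1]:.3f}")
```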
-
$\chi^2$-test & C.I. for $\sigma$
H0: $\sigma = \sigma_0$ (or $\le$, $\ge$) vs. HA: $\sigma \ne \sigma_0$ (or $>$, $<$)
n = _______ $\alpha$ = _______
Testing a Hypothesis About a Standard Deviation (or Variance);
The Measured Trait (e.g. the PPM) is Approximately Normal;
Therefore this is a $\chi^2$-test - Use the Chi-Square Distribution.
DR: ($\ne$ in HA) Reject H0 in favor of HA if $\chi^2_{calc} < \chi^2_{small,\alpha/2}$ or if $\chi^2_{calc} > \chi^2_{large,\alpha/2}$. Otherwise, FTR H0.
DR: ($>$ in HA) Reject H0 in favor of HA iff $\chi^2_{calc} > \chi^2_{large,\alpha}$. Otherwise, FTR H0.
DR: ($<$ in HA) Reject H0 in favor of HA iff $\chi^2_{calc} < \chi^2_{small,\alpha}$. Otherwise, FTR H0.
-
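$\chi^2$-test & C.I. (continued)
$\chi^2_{calc} = (n-1)s^2/\sigma_0^2$
_____ Reject H0 in Favor of HA. _______ FTR H0.
The Confidence Intervals for $\sigma^2$ and $\sigma$ are Given by:
$(n-1)s^2/\chi^2_{large,\alpha/2} < \sigma^2 < (n-1)s^2/\chi^2_{small,\alpha/2}$
and
$\sqrt{(n-1)s^2/\chi^2_{large,\alpha/2}} < \sigma < \sqrt{(n-1)s^2/\chi^2_{small,\alpha/2}}$
Interpretation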
-
Fast Facts Financial, Inc.
Fast Facts Financial (FFF), Inc. provides credit reports to lending institutions that evaluate applicants for home mortgages, vehicle, home equity, and other loans.
A pressure faced by FFF Inc. is that several competing credit reporting companies provide reports in about the same average amount of time, but are able to promise a lower time than FFF Inc., the reason being that their variation in the time required to compile and summarize credit data is smaller than that of FFF.
FFF has identified & implemented procedures which they believe will reduce this variation. If the historic standard deviation is 2.3 days, and the standard deviation for a sample of 25 credit reports under the new procedures is 1.8 days, then test the appropriate hypothesis at the $\alpha$ = .05 level of significance. Assume that the time factor is approximately normally distributed.
-
FFF Example
H0: $\sigma \ge \sigma_0$ vs. HA: $\sigma < \sigma_0$, where $\sigma_0$ = 2.3
n = 25 $\alpha$ = .05
Testing a Hypothesis About a Standard Deviation (or Variance);
The Measured Trait (e.g. the PPM) is Approximately Normal;
Therefore this is a $\chi^2$-test - Use the Chi-Square Distribution.
DR: ($<$ in HA) Reject H0 in favor of HA iff $\chi^2_{calc} < \chi^2_{small,\alpha} = 13.8484$. Otherwise, FTR H0.
$\chi^2_{calc} = (n-1)s^2/\sigma_0^2 = (24)(1.8^2)/(2.3^2) = 77.76/5.29 = 14.70$
_____ Reject H0 in favor of HA. X FTR H0.
$77.76/39.3641 < \sigma^2 < 77.76/12.4011$, or $1.975 < \sigma^2 < 6.27$, so that 1.405 days < $\sigma$ < 2.50 days
Evidence is inconclusive. Work should continue on this.
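The FFF calculation can be reproduced with a few lines of Python. As a sketch, it takes the chi-square critical values for 24 degrees of freedom directly from the example (13.8484, 12.4011 and 39.3641), because the standard library has no chi-square quantile function; the variable names are illustrative.

```python
from math import sqrt

# Values from the Fast Facts Financial example
n, s, sigma0 = 25, 1.8, 2.3

# Chi-square critical values for 24 degrees of freedom, as quoted in the
# example (read from a table rather than computed here)
chi2_small_alpha = 13.8484   # lower 5% point, used in the decision rule
chi2_small_half = 12.4011    # lower 2.5% point, used in the interval
chi2_large_half = 39.3641    # upper 2.5% point, used in the interval

sum_sq = (n - 1) * s**2                    # (n-1)s^2 = 77.76
chi2_calc = sum_sq / sigma0**2             # about 14.70
reject_h0 = chi2_calc < chi2_small_alpha   # decision rule for "<" in HA

# 95% intervals for the variance and the standard deviation
var_ci = (sum_sq / chi2_large_half, sum_sq / chi2_small_half)
sd_ci = (sqrt(var_ci[0]), sqrt(var_ci[1]))

print(f"chi2calc = {chi2_calc:.2f}, reject H0: {reject_h0}")
print(f"95% CI for sigma: {sd_ci[0]:.3f} to {sd_ci[1]:.2f} days")
```

Because 14.70 is not below 13.8484, `reject_h0` comes out false, matching the FTR H0 decision in the example.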
-
Two Sample Tests and Confidence Intervals
-
Tests and Intervals for Two Means
H0: $\mu_1 - \mu_2 = \mu_d$ HA: $\mu_1 - \mu_2 \ne \mu_d$ (or $<$, $>$)
n1 = _____ n2 = _____ $\alpha$ = _____
Comparison of Means from Two Processes
Normality Can Be Reasonably Assumed
Are the two variances known or unknown?
(a) Known: Z-test
(b) Unknown but Similar in Value: t-test with $n_1 + n_2 - 2$ df
(c) Unknown and Unequal: t-test with complicated df
Critical Values and Decision Rules are the same as for any Z-test or t-test.
-
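C.I. for $\mu_1 - \mu_2$
$(\bar{X}_1 - \bar{X}_2) \pm Z_{\alpha/2}\,\sigma_{\bar{X}_1 - \bar{X}_2}$
or
$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\,S_{\bar{X}_1 - \bar{X}_2}$
Decisions: Same as any other Z or t test.
Implications: Context Specific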
-
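(a) $Z = [(\bar{X}_1 - \bar{X}_2) - \mu_d] / [\sigma\sqrt{1/n_1 + 1/n_2}]$ (common $\sigma$)
or $Z = [(\bar{X}_1 - \bar{X}_2) - \mu_d] / \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$
(b) $t = [(\bar{X}_1 - \bar{X}_2) - \mu_d] / [S_p\sqrt{1/n_1 + 1/n_2}]$ (assume equal variances)
where df = $n_1 + n_2 - 2$ and $S_p^2 = [(n_1-1)S_1^2 + (n_2-1)S_2^2] / (n_1 + n_2 - 2)$
(c) $t = [(\bar{X}_1 - \bar{X}_2) - \mu_d] / \sqrt{S_1^2/n_1 + S_2^2/n_2}$ (do not assume equal variances)
where df = $[(s_1^2/n_1) + (s_2^2/n_2)]^2 / [(s_1^2/n_1)^2/(n_1-1) + (s_2^2/n_2)^2/(n_2-1)]$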
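Cases (b) and (c) can be sketched in Python as follows. This is an illustration under stated assumptions, not the course's own code: the function names `pooled_t` and `welch_t` and the parameter `mu_d` (the hypothesized difference in means) are hypothetical names chosen here, and the inputs are summary statistics rather than raw data.

```python
from math import sqrt


def pooled_t(xbar1, xbar2, s1, s2, n1, n2, mu_d=0.0):
    """Case (b): t statistic and df assuming equal variances."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df   # pooled variance
    t = ((xbar1 - xbar2) - mu_d) / (sqrt(sp2) * sqrt(1 / n1 + 1 / n2))
    return t, df


def welch_t(xbar1, xbar2, s1, s2, n1, n2, mu_d=0.0):
    """Case (c): t statistic and approximate df without assuming equal variances."""
    v1, v2 = s1**2 / n1, s2**2 / n2                    # per-sample variance of the mean
    t = ((xbar1 - xbar2) - mu_d) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df
```

When the two sample sizes and the two sample standard deviations are equal, the two functions return the same t statistic and the same degrees of freedom, which is a convenient sanity check. The "complicated df" of case (c) is generally not an integer.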
-
Equality of Variances: The F-Test
H0: $\sigma_1 = \sigma_2$ vs. HA: $\sigma_1 \ne \sigma_2$ (or $<$, $>$)
n1 = _____ n2 = _____ $\alpha$ = _____
Test of equality of variances: F-test
___ $>$ in HA: reject H0 in favor of HA iff $F_{calc} > F_{\alpha,big}$. Otherwise, FTR H0.
___ $<$ in HA: reject H0 in favor of HA iff $F_{calc} < F_{\alpha,small}$. Otherwise, FTR H0.
___ $\ne$ in HA: reject H0 in favor of HA iff $F_{calc} < F_{\alpha/2,small}$ or if $F_{calc} > F_{\alpha/2,big}$. Otherwise, FTR H0.
-
$F_{calc} = S_1^2/S_2^2$
Make a decision.
$F_{calc}/F_{n_1-1,n_2-1,\alpha/2,large} < \sigma_1^2/\sigma_2^2 < F_{calc}/F_{n_1-1,n_2-1,\alpha/2,small}$
C.I. for $\sigma_1/\sigma_2$ is obtained by taking square roots of the endpoints of the above C.I. for $\sigma_1^2/\sigma_2^2$.
Conclusions / Implications: Context Specific.
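The F statistic and the two intervals can be sketched in Python as below. The function name `f_test_interval` is a hypothetical choice, and the two critical values are assumed to be read from an F table with (n1 - 1, n2 - 1) degrees of freedom, since the standard library does not provide F quantiles.

```python
from math import sqrt


def f_test_interval(s1, s2, f_large, f_small):
    """Return Fcalc and intervals for sigma1^2/sigma2^2 and sigma1/sigma2.

    f_large and f_small are the upper and lower alpha/2 critical values of
    the F distribution with (n1 - 1, n2 - 1) degrees of freedom, supplied by
    the caller from a table.
    """
    f_calc = s1**2 / s2**2
    var_ratio_ci = (f_calc / f_large, f_calc / f_small)
    # Square roots of the endpoints give the interval for sigma1/sigma2
    sd_ratio_ci = (sqrt(var_ratio_ci[0]), sqrt(var_ratio_ci[1]))
    return f_calc, var_ratio_ci, sd_ratio_ci
```

If the interval for the variance ratio contains 1, the data are consistent with equal variances at the chosen level.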
-
Tests & Intervals for Two Proportions
H0: $p_1 - p_2 = p_d$ HA: $p_1 - p_2 \ne p_d$ (or $<$, $>$)
n1 = _____ n2 = _____ $\alpha$ = _____
Comparison of Proportions from Two Processes
$n_1p_1$, $n_2p_2$, $n_1(1-p_1)$ and $n_2(1-p_2)$ all $\ge 5$: Z-test
Critical Values and Decision Rules are the same as for any Z-test.
-
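IF $p_d = 0$: $Z = (\hat{p}_1 - \hat{p}_2) / \sqrt{\hat{p}(1-\hat{p})(1/n_1 + 1/n_2)}$, where $\hat{p} = (X_1 + X_2)/(n_1 + n_2)$
IF $p_d \ne 0$: $Z = [(\hat{p}_1 - \hat{p}_2) - p_d] / \sqrt{\hat{p}_1(1-\hat{p}_1)/n_1 + \hat{p}_2(1-\hat{p}_2)/n_2}$
C.I. for $p_1 - p_2$ is $(\hat{p}_1 - \hat{p}_2) \pm Z_{\alpha/2}\sqrt{\hat{p}_1(1-\hat{p}_1)/n_1 + \hat{p}_2(1-\hat{p}_2)/n_2}$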
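The two-proportion formulas can be put together in a short Python sketch. The function name `two_proportion_z` and its parameter names are hypothetical; `x1` and `x2` are the success counts, and the only library call is the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2, p_d=0.0, alpha=0.05):
    """Return (z, ci) for p1 - p2 given success counts x1, x2 out of n1, n2."""
    p1, p2 = x1 / n1, x2 / n2
    # Unpooled standard error: used for the interval and when p_d != 0
    se_unpooled = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    if p_d == 0:
        p_pool = (x1 + x2) / (n1 + n2)       # pooled estimate under H0
        z = (p1 - p2) / sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    else:
        z = ((p1 - p2) - p_d) / se_unpooled
    half = NormalDist().inv_cdf(1 - alpha / 2) * se_unpooled
    return z, ((p1 - p2) - half, (p1 - p2) + half)
```

The pooled estimate is used only when the hypothesized difference is zero, because only then does H0 imply a single common proportion; the interval always uses the unpooled standard error.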
-
End of Session