statistics hypothesis testing & confidence intervals

Upload: ujwala512

Post on 03-Apr-2018


  • 7/29/2019 Statistics Hypothesis Testing & Confidence Intervals

    HYPOTHESIS TESTING & CONFIDENCE INTERVALS

    DEPARTMENT OF STATISTICS

    DR. RICK EDGEMAN, PROFESSOR & CHAIR, SIX SIGMA BLACK BELT

    [email protected] OFFICE: +1-208-885-4410

    The Hypothesis Testing Approach

    [Diagram labels: Conjectures (Hypotheses); Evaluation (Test Method); Gather & Evaluate Facts; Zone of Belief; Consequences: A or B]

    The Scientific Method of Investigation

                              Noninformative Event         Informative Event
    No Observer or            Nothing Learned              Little or Nothing Learned
    Uninformed Observer
    Informed Observer         Little or Nothing Learned    Discovery!

    Motivation for Hypothesis Testing

    The intent of hypothesis testing is to formally examine two opposing conjectures (hypotheses), H0 and HA.

    These two hypotheses are mutually exclusive and exhaustive, so that one is true to the exclusion of the other.

    We accumulate evidence (collect and analyze sample information) for the purpose of determining which of the two hypotheses is true and which is false.

    Beyond the issue of truth, addressed statistically, is the issue of justice. Justice is beyond the scope of statistical investigation.

    The American Trial System

                  In Truth, the Defendant is:
    Verdict       H0: Innocent                          HA: Guilty
    Innocent      Correct Decision:                     Incorrect Decision:
                  Innocent Individual Goes Free         Guilty Individual Goes Free
    Guilty        Incorrect Decision:                   Correct Decision:
                  Innocent Individual Is Disciplined    Guilty Individual Is Disciplined

    Hypothesis Testing & the American Justice System

    State the Opposing Conjectures, H0 and HA.

    Determine the amount of evidence required, n, and the risk of committing a Type I error, $\alpha$.

    What sort of evaluation of the evidence is required and what is the justification for this? (Type of Test)

    What are the conditions which proclaim guilt and those which proclaim innocence? (Decision Rule)

    Gather & evaluate the evidence.

    What is the verdict? (H0 or HA?)

    Determine Zone of Belief: Confidence Interval.

    What is appropriate justice? (Conclusions)

    True, But Unknown State of the World

    Decision      H0 is True                            HA is True
    H0 is True    Correct Decision                      Incorrect Decision:
                                                        Type II Error, Probability = $\beta$
    HA is True    Incorrect Decision:                   Correct Decision
                  Type I Error, Probability = $\alpha$

    Hypothesis Testing Algorithm

    1) Specify H0 and HA

    2) Specify n and $\alpha$

    3) What Type of Test and Why?

    4) Critical Value(s) and Decision Rule (DR)

    5) Collect Pertinent Data and Determine the Calculated Value of the Test Statistic (e.g. $Z_{calc}$, $t_{calc}$, $\chi^2_{calc}$, etc.)

    6) Make a Decision to Either Reject H0 in Favor of HA or to Fail to Reject (FTR) H0.

    7) Construct & Interpret the Appropriate Confidence Interval

    8) Conclusions? Implications & Actions
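
Steps 4 and 6 of the algorithm amount to comparing the calculated test statistic with one or two critical values. The following is a minimal Python sketch of that comparison. The function name `decide` and its parameters are illustrative assumptions, not part of the original slides, and the critical values are supplied by the caller (for example, from a table).

```python
def decide(stat, lower_crit=None, upper_crit=None):
    """Apply a decision rule to a calculated test statistic.

    lower_crit: reject H0 if stat falls below it (used for '<' in HA).
    upper_crit: reject H0 if stat falls above it (used for '>' in HA).
    Supply both for a two-sided alternative.
    """
    if lower_crit is not None and stat < lower_crit:
        return "Reject H0"
    if upper_crit is not None and stat > upper_crit:
        return "Reject H0"
    return "FTR H0"


# Upper-tailed test: reject when the statistic exceeds the critical value.
print(decide(2.5, upper_crit=1.645))
# Two-sided test: reject only when the statistic is outside both bounds.
print(decide(1.0, lower_crit=-1.96, upper_crit=1.96))
```

Passing separate lower and upper bounds, rather than a single magnitude, lets the same helper serve asymmetric distributions such as the chi-square and F.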

    Z-test & C.I. for $\mu$

    H0: $\mu \,(=, \le, \ge)\, \mu_0$ vs. HA: $\mu \,(\ne, >, <)\, \mu_0$

    n = _______ $\alpha$ = _______

    Testing a Hypothesis About a Mean;

    Process Performance Measure is Approximately Normally Distributed;

    We Know $\sigma$.

    Therefore this is a Z-test - Use the Normal Distribution.

    DR: ($\ne$ in HA) Reject H0 in favor of HA if $Z_{calc} < -Z_{\alpha/2}$ or if $Z_{calc} > +Z_{\alpha/2}$. Otherwise, FTR H0.

    DR: (> in HA) Reject H0 in favor of HA iff $Z_{calc} > +Z_{\alpha}$. Otherwise, FTR H0.

    DR: (< in HA) Reject H0 in favor of HA iff $Z_{calc} < -Z_{\alpha}$. Otherwise, FTR H0.

    Z-test Algorithm (Continued)

    $Z_{calc} = (\bar{X} - \mu_0)/(\sigma/\sqrt{n})$

    _____ Reject H0 in Favor of HA. _______ FTR H0.

    The Confidence Interval for $\mu$ is Given by: $\bar{X} \pm Z_{\alpha/2}(\sigma/\sqrt{n})$

    Interpretation
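
The critical values $Z_{\alpha}$ and $Z_{\alpha/2}$ used in the decision rules and in the interval can be read from a table or computed. As a hedged sketch, Python's standard-library `statistics.NormalDist` provides the inverse normal CDF. The helper name `z_critical` is an assumption made for illustration.

```python
from statistics import NormalDist


def z_critical(alpha, two_sided=False):
    """Positive standard normal critical value.

    One-sided: Z_alpha. Two-sided: Z_{alpha/2}.
    """
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)


print(round(z_critical(0.05), 3))                  # one-sided, alpha = .05
print(round(z_critical(0.05, two_sided=True), 3))  # two-sided, alpha = .05
print(round(z_critical(0.10), 3))                  # one-sided, alpha = .10
```

For a "<" alternative the rejection boundary is the negative of the one-sided value.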

    t-test and Confidence Interval for $\mu$

    H0: $\mu \,(=, \le, \ge)\, \mu_0$ vs. HA: $\mu \,(\ne, >, <)\, \mu_0$

    n = _______ $\alpha$ = _______

    Testing a Hypothesis About a Mean;

    Process Performance Measure is Approximately Normally Distributed or We Have a Large Sample;

    We Do Not Know $\sigma$, Which Must be Estimated by s.

    Therefore this is a t-test - Use Student's t Distribution.

    DR: ($\ne$ in HA) Reject H0 in favor of HA if $t_{calc} < -t_{\alpha/2}$ or if $t_{calc} > +t_{\alpha/2}$. Otherwise, FTR H0.

    DR: (> in HA) Reject H0 in favor of HA iff $t_{calc} > +t_{\alpha}$. Otherwise, FTR H0.

    DR: (< in HA) Reject H0 in favor of HA iff $t_{calc} < -t_{\alpha}$. Otherwise, FTR H0.

    t-test Algorithm (Continued)

    $t_{calc} = (\bar{X} - \mu_0)/(s/\sqrt{n})$

    _____ Reject H0 in Favor of HA. _______ FTR H0.

    The Confidence Interval for $\mu$ is Given by: $\bar{X} \pm t_{\alpha/2}(s/\sqrt{n})$

    Interpretation

    Z-test & C.I. for p

    H0: $p \,(=, \le, \ge)\, p_0$ vs. HA: $p \,(\ne, >, <)\, p_0$

    n = _______ $\alpha$ = _______

    Testing a Hypothesis About a Proportion;

    We have a large sample; that is, both $np_0$ and $n(1-p_0)$ > 5.

    Therefore this is a Z-test - Use the Normal Distribution.

    DR: ($\ne$ in HA) Reject H0 in favor of HA if $Z_{calc} < -Z_{\alpha/2}$ or if $Z_{calc} > +Z_{\alpha/2}$. Otherwise, FTR H0.

    DR: (> in HA) Reject H0 in favor of HA iff $Z_{calc} > +Z_{\alpha}$. Otherwise, FTR H0.

    DR: (< in HA) Reject H0 in favor of HA iff $Z_{calc} < -Z_{\alpha}$. Otherwise, FTR H0.

    Z-test for a proportion

    $Z_{calc} = (\hat{p} - p_0)/\sqrt{p_0(1-p_0)/n}$

    _____ Reject H0 in Favor of HA. _______ FTR H0.

    The Confidence Interval for p is Given by: $\hat{p} \pm Z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$

    Interpretation

    Advance, Inc.

    Integrated Circuit Manufacturing

    Methods & Materials

    Z-Test & Confidence Interval: Training Effect Example

    Interested in increasing the productivity rating in the integrated circuit (IC) division, Advance Inc. determined that a methods review course would be of value to employees in the IC division.

    To determine the impact of this measure they reviewed historical productivity records for the division and determined that the average level was 100 with a standard deviation of 10.

    Fifty IC division employees participated in the course and the post-course productivity of these employees was measured, on average, to be 105.

    Assume that productivity ratings are approximately normally distributed. Did the course have a beneficial effect? Test the appropriate hypothesis at the $\alpha$ = .05 level of significance.

    Training Effect Example

    H0: $\mu \le 100$ HA: $\mu > 100$

    n = 50 $\alpha$ = .05

    (i) testing a mean (ii) normal distribution (iii) $\sigma$ = 10 is known so that this is a Z-test

    DR: Reject H0 in favor of HA iff $Z_{calc} > 1.645$. Otherwise, FTR H0

    $Z_{calc} = (\bar{X} - \mu_0)/(\sigma/\sqrt{n}) = (105 - 100)/(10/\sqrt{50}) = 5/1.414 = 3.536$

    X Reject H0 in favor of HA. _______ FTR H0

    The 95% Confidence Interval is Given by: $\bar{X} \pm Z_{\alpha/2}(\sigma/\sqrt{n})$ which is $105 \pm 1.96(1.414) = 105 \pm 2.77$ or $102.23 < \mu < 107.77$

    Thus the course appears to have helped improve IC division employee productivity from an average level of 100 to a level that is at least 102.23 and at most 107.77.

    A follow-up question: is this increase worth the investment?
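
The arithmetic in this example can be checked with a short Python sketch. The inputs (105, 100, 10, 50, .05) come from the slides; the function name `z_test_upper` is an assumption for illustration, and the critical values are computed with the standard library rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist


def z_test_upper(x_bar, mu0, sigma, n, alpha):
    """Upper-tailed Z-test for a mean with known sigma, plus a two-sided CI."""
    se = sigma / sqrt(n)                            # standard error of the mean
    z_calc = (x_bar - mu0) / se
    z_crit = NormalDist().inv_cdf(1 - alpha)        # Z_alpha
    z_half = NormalDist().inv_cdf(1 - alpha / 2)    # Z_{alpha/2}
    ci = (x_bar - z_half * se, x_bar + z_half * se)
    return z_calc, z_calc > z_crit, ci


z_calc, reject, ci = z_test_upper(105, 100, 10, 50, 0.05)
print(round(z_calc, 3), reject, [round(v, 2) for v in ci])
# 3.536 True [102.23, 107.77]
```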


    Loan Application Processing

    First Peoples Bank of Central City

    First Peoples Bank of Central City would like to improve their loan application process. In particular, the amount of time currently required to process loan applications is approximately normally distributed with a mean of 18 days.

    Measures intended to simplify and speed the process have been identified and implemented. Were they effective? Test the appropriate hypothesis at the $\alpha$ = .05 level of significance if a sample of 25 applications submitted after the measures were implemented gave an average processing time of 15.2 days and a standard deviation of 2.0 days.

    First Peoples Bank of Central City

    H0: $\mu \ge 18$ HA: $\mu < 18$

    n = 25 $\alpha$ = .05

    (i) testing a mean (ii) normal distribution (iii) $\sigma$ is unknown and must be estimated so that this is a t-test

    DR: Reject H0 in favor of HA iff $t_{calc} < -1.711$. Otherwise, FTR H0

    $t_{calc} = (\bar{X} - \mu_0)/(s/\sqrt{n}) = (15.2 - 18)/(2/\sqrt{25}) = -2.8/.4 = -7.00$

    X Reject H0 in favor of HA. _______ FTR H0

    The 95% Confidence Interval is Given by: $\bar{X} \pm t_{\alpha/2}(s/\sqrt{n})$ which is $15.2 \pm 2.064(.4) = 15.2 \pm .83$ or $14.37 < \mu < 16.03$

    Thus the measures appear to have helped decrease the average time required to process a loan application from 18 days to a level that is at least 14.37 days and at most 16.03 days.
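
A short Python sketch can reproduce this calculation. The sample figures and the two table values (1.711 and 2.064, for 24 degrees of freedom) are taken from the slides and passed in as arguments, since the Python standard library has no t distribution. The function name `t_test_lower` is an illustrative assumption.

```python
from math import sqrt


def t_test_lower(x_bar, mu0, s, n, t_crit, t_half):
    """Lower-tailed one-sample t-test plus a two-sided CI.

    t_crit: positive table value t_alpha; t_half: table value t_{alpha/2}.
    Both are looked up by the caller for n - 1 degrees of freedom.
    """
    se = s / sqrt(n)                                # estimated standard error
    t_calc = (x_bar - mu0) / se
    ci = (x_bar - t_half * se, x_bar + t_half * se)
    return t_calc, t_calc < -t_crit, ci


t_calc, reject, ci = t_test_lower(15.2, 18, 2.0, 25, t_crit=1.711, t_half=2.064)
print(round(t_calc, 2), reject, [round(v, 2) for v in ci])
# -7.0 True [14.37, 16.03]
```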

    Small Business Loan Defaults

    First Peoples Bank of Central City: Small Business Loan Defaults

    Historically, 12% of Small Business Loans granted result in default. Three years ago, FPB of Central City purchased software which they hope will assist in reducing the default rate by more effectively discriminating between small business loan applicants who are likely to default and those who are not likely to do so.

    After adequately training their loan officers in use of the software, FPB sampled 150 small business loan applications processed using the software and found 9 to be in default at the end of two years.

    Using $\alpha$ = .10, does it appear that the software is of value?

    Small Business Loan Default Rate

    H0: $p \ge .12$ HA: $p < .12$

    n = 150 $\alpha$ = .10

    (i) testing a proportion (ii) $np_0 = 150(.12) = 18$ and $n(1-p_0) = 132$, both > 5, so that this is a Z-test

    DR: Reject H0 in favor of HA iff $Z_{calc} < -1.282$. Otherwise, FTR H0

    $Z_{calc} = (\hat{p} - p_0)/\sqrt{p_0(1-p_0)/n} = (.06 - .12)/\sqrt{.12(.88)/150} = -.06/.026533 = -2.261$

    X Reject H0 in favor of HA. _______ FTR H0

    The 90% Confidence Interval is Given by: $\hat{p} \pm Z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$ which is $.06 \pm 1.645\sqrt{.06(.94)/150} = .06 \pm 1.645(.0194)$ or $.06 \pm .032$ or $.028 < p < .092$

    Thus the software appears to have helped decrease the small business loan default rate from a level of 12% to a level that is between 2.8% and 9.2% with a best estimate of 6%.
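
The default-rate test can be reproduced with a few lines of Python. The counts and $\alpha$ are from the slides; the function name `prop_test_lower` is an assumption for illustration. With $\alpha$ = .10 the two-sided multiplier is 1.645, which corresponds to a 90% interval.

```python
from math import sqrt
from statistics import NormalDist


def prop_test_lower(x, n, p0, alpha):
    """Lower-tailed large-sample Z-test for a proportion, plus a CI."""
    if not (n * p0 > 5 and n * (1 - p0) > 5):
        raise ValueError("large-sample condition np0, n(1-p0) > 5 not met")
    p_hat = x / n
    # The test statistic uses the hypothesized p0 in the standard error.
    z_calc = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    z_crit = NormalDist().inv_cdf(alpha)            # negative: -Z_alpha
    # The interval uses the sample proportion in the standard error.
    z_half = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_half * sqrt(p_hat * (1 - p_hat) / n)
    return z_calc, z_calc < z_crit, (p_hat - margin, p_hat + margin)


z_calc, reject, ci = prop_test_lower(9, 150, 0.12, 0.10)
print(round(z_calc, 3), reject, [round(v, 3) for v in ci])
# -2.261 True [0.028, 0.092]
```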

    $\chi^2$-test & C.I. for $\sigma$

    H0: $\sigma \,(=, \le, \ge)\, \sigma_0$ vs. HA: $\sigma \,(\ne, >, <)\, \sigma_0$

    n = _______ $\alpha$ = _______

    Testing a Hypothesis About a Standard Deviation (or Variance);

    The Measured Trait (e.g. the PPM) is Approximately Normal;

    Therefore this is a $\chi^2$-test - Use the Chi-Square Distribution.

    DR: ($\ne$ in HA) Reject H0 in favor of HA if $\chi^2_{calc} < \chi^2_{small,\alpha/2}$ or if $\chi^2_{calc} > \chi^2_{large,\alpha/2}$. Otherwise, FTR H0.

    DR: (> in HA) Reject H0 in favor of HA iff $\chi^2_{calc} > \chi^2_{large,\alpha}$. Otherwise, FTR H0.

    DR: (< in HA) Reject H0 in favor of HA iff $\chi^2_{calc} < \chi^2_{small,\alpha}$. Otherwise, FTR H0.

    $\chi^2$-Test & C.I. (continued)

    $\chi^2_{calc} = (n-1)s^2/\sigma_0^2$

    _____ Reject H0 in Favor of HA. _______ FTR H0.

    The Confidence Intervals for $\sigma^2$ and $\sigma$ are Given by:

    $(n-1)s^2/\chi^2_{large,\alpha/2} < \sigma^2 < (n-1)s^2/\chi^2_{small,\alpha/2}$

    and

    $\sqrt{(n-1)s^2/\chi^2_{large,\alpha/2}} < \sigma < \sqrt{(n-1)s^2/\chi^2_{small,\alpha/2}}$

    Interpretation

    Fast Facts Financial, Inc.

    Fast Facts Financial (FFF), Inc. provides credit reports to lending institutions that evaluate applicants for home mortgages, vehicle, home equity, and other loans.

    A pressure faced by FFF Inc. is that several competing credit reporting companies provide reports in about the same average amount of time, but are able to promise a lower time than FFF Inc. The reason is that the variation in the time they require to compile and summarize credit data is smaller than the variation at FFF.

    FFF has identified & implemented procedures which they believe will reduce this variation. If the historic standard deviation is 2.3 days, and the standard deviation for a sample of 25 credit reports under the new procedures is 1.8 days, then test the appropriate hypothesis at the $\alpha$ = .05 level of significance. Assume that the time factor is approximately normally distributed.

    FFF Example

    H0: $\sigma \ge \sigma_0$ vs. HA: $\sigma < \sigma_0$ where $\sigma_0 = 2.3$

    n = 25 $\alpha$ = .05

    Testing a Hypothesis About a Standard Deviation (or Variance);

    The Measured Trait (e.g. the PPM) is Approximately Normal;

    Therefore this is a $\chi^2$-test - Use the Chi-Square Distribution.

    DR: (< in HA) Reject H0 in favor of HA iff $\chi^2_{calc} < \chi^2_{small,\alpha} = 13.8484$. Otherwise, FTR H0.

    $\chi^2_{calc} = (n-1)s^2/\sigma_0^2 = (24)(1.8^2)/(2.3^2) = 77.76/5.29 = 14.70$

    _____ Reject H0 in favor of HA. X FTR H0.

    $77.76/39.3641 < \sigma^2 < 77.76/12.4011$ or $1.975 < \sigma^2 < 6.27$ so that 1.405 days $< \sigma <$ 2.50 days

    Evidence is inconclusive. Work should continue on this.
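
The FFF calculation can be verified with the sketch below. The sample figures and the three chi-square table values (13.8484, 12.4011 and 39.3641, all for 24 degrees of freedom) are those quoted in the slides and are passed in as arguments, because the Python standard library has no chi-square distribution. The function name `chi2_test_lower` and its parameter names are illustrative assumptions.

```python
from math import sqrt


def chi2_test_lower(s, sigma0, n, chi2_small, ci_small, ci_large):
    """Lower-tailed chi-square test for a standard deviation, plus CIs.

    chi2_small: lower critical value for the test (table lookup).
    ci_small, ci_large: lower and upper table values for the interval.
    """
    ss = (n - 1) * s ** 2                      # (n-1) s^2
    chi2_calc = ss / sigma0 ** 2
    var_ci = (ss / ci_large, ss / ci_small)    # interval for sigma^2
    sd_ci = (sqrt(var_ci[0]), sqrt(var_ci[1])) # interval for sigma
    return chi2_calc, chi2_calc < chi2_small, var_ci, sd_ci


chi2_calc, reject, var_ci, sd_ci = chi2_test_lower(
    1.8, 2.3, 25, chi2_small=13.8484, ci_small=12.4011, ci_large=39.3641
)
print(round(chi2_calc, 2), reject)
print([round(v, 3) for v in var_ci], [round(v, 3) for v in sd_ci])
```

Because 14.70 is not below 13.8484, the function reports no rejection, matching the FTR H0 decision on the slide.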

    Two Sample Tests and Confidence Intervals

    Tests and Intervals for Two Means

    H0: $\mu_1 - \mu_2 = \mu_d$ HA: $\mu_1 - \mu_2 \,(\ne, <, >)\, \mu_d$

    n1 = _____ n2 = _____ $\alpha$ = _____

    Comparison of Means from Two Processes

    Normality Can Be Reasonably Assumed

    Are the two variances known or unknown?

    (a) Known: Z-test

    (b) Unknown but Similar in Value: t-test with $n_1 + n_2 - 2$ df

    (c) Unknown and Unequal: t-test with complicated df

    Critical Values and Decision Rules are the same as for any Z-test or t-test.

    C.I. for $\mu_1 - \mu_2$

    $(\bar{X}_1 - \bar{X}_2) \pm Z_{\alpha/2}\,\sigma_{\bar{X}_1 - \bar{X}_2}$

    or

    $(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\,S_{\bar{X}_1 - \bar{X}_2}$

    Decisions: Same as any other Z or t test.

    Implications: Context Specific

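    (a) $Z = \frac{(\bar{X}_1 - \bar{X}_2) - \mu_d}{\sigma\sqrt{1/n_1 + 1/n_2}}$ or $Z = \frac{(\bar{X}_1 - \bar{X}_2) - \mu_d}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$

    (b) $t = \frac{(\bar{X}_1 - \bar{X}_2) - \mu_d}{S_p\sqrt{1/n_1 + 1/n_2}}$ (assume equal variances)

    where df = $n_1 + n_2 - 2$ and $S_p^2 = \frac{(n_1-1)S_1^2 + (n_2-1)S_2^2}{n_1 + n_2 - 2}$

    (c) $t = \frac{(\bar{X}_1 - \bar{X}_2) - \mu_d}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}$ (do not assume equal variances)

    where df = $\frac{\left[(S_1^2/n_1) + (S_2^2/n_2)\right]^2}{\frac{(S_1^2/n_1)^2}{n_1 - 1} + \frac{(S_2^2/n_2)^2}{n_2 - 1}}$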
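
Cases (b) and (c) can be sketched in Python as follows. The function names `pooled_t` and `welch_t` are assumptions for illustration, and the numbers in the usage lines are hypothetical inputs, not data from the slides. The critical value for the decision still comes from a t table at the returned degrees of freedom.

```python
from math import sqrt


def pooled_t(x1, x2, s1, s2, n1, n2, d=0.0):
    """Case (b): t statistic and df assuming equal variances."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df   # pooled variance
    t = ((x1 - x2) - d) / (sqrt(sp2) * sqrt(1 / n1 + 1 / n2))
    return t, df


def welch_t(x1, x2, s1, s2, n1, n2, d=0.0):
    """Case (c): t statistic and approximate df without assuming equal variances."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    t = ((x1 - x2) - d) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical inputs: means 12 and 10, both s = 2, both n = 10.
print(pooled_t(12, 10, 2, 2, 10, 10))
print(welch_t(12, 10, 2, 2, 10, 10))
```

With equal sample sizes and equal sample standard deviations the two cases give the same statistic and the same degrees of freedom, which is a convenient sanity check.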

    Equality of Variances: The F-Test

    H0: $\sigma_1 = \sigma_2$ vs. HA: $\sigma_1 \,(\ne, <, >)\, \sigma_2$

    n1 = _____ n2 = _____ $\alpha$ = _____

    Test of equality of variances: F-test

    ___ > in HA: reject H0 in favor of HA iff $F_{calc} > F_{\alpha,big}$. Otherwise, FTR H0.

    ___ < in HA: reject H0 in favor of HA iff $F_{calc} < F_{\alpha,small}$. Otherwise, FTR H0.

    ___ $\ne$ in HA: reject H0 in favor of HA iff $F_{calc} < F_{\alpha/2,small}$ or if $F_{calc} > F_{\alpha/2,big}$. Otherwise, FTR H0.

    $F_{calc} = S_1^2/S_2^2$

    Make a decision.

    $F_{calc}/F_{n_1-1,\,n_2-1,\,\alpha/2,\,large} \le \sigma_1^2/\sigma_2^2 \le F_{calc}/F_{n_1-1,\,n_2-1,\,\alpha/2,\,small}$

    C.I. for $\sigma_1/\sigma_2$ is obtained by taking square roots of the endpoints of the above C.I. for $\sigma_1^2/\sigma_2^2$

    Conclusions / Implications: Context Specific.
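
As a rough sketch, the F statistic and both intervals can be computed as below. The function name `f_test_ci` is an assumption, and the critical values in the usage line are placeholders chosen only to make the arithmetic easy to follow; real values would be looked up in an F table for $n_1 - 1$ and $n_2 - 1$ degrees of freedom.

```python
from math import sqrt


def f_test_ci(s1, s2, f_large, f_small):
    """F statistic plus intervals for the variance ratio and the sd ratio.

    f_large, f_small: upper and lower alpha/2 critical values (table lookups).
    """
    f_calc = s1 ** 2 / s2 ** 2
    var_ratio_ci = (f_calc / f_large, f_calc / f_small)
    sd_ratio_ci = (sqrt(var_ratio_ci[0]), sqrt(var_ratio_ci[1]))
    return f_calc, var_ratio_ci, sd_ratio_ci


# Placeholder critical values (2.0 and 0.5), not taken from an F table.
print(f_test_ci(3.0, 2.0, f_large=2.0, f_small=0.5))
```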

    Tests & Intervals for Two Proportions

    H0: $p_1 - p_2 = p_d$

    HA: $p_1 - p_2 \,(\ne, <, >)\, p_d$

    n1 = _____ n2 = _____ $\alpha$ = _____

    Comparison of Proportions from Two Processes

    $n_1p_1$, $n_2p_2$, $n_1(1-p_1)$ and $n_2(1-p_2)$ all $\ge 5$: Z-test

    Critical Values and Decision Rules are the same as for any Z-test.

    $Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})(1/n_1 + 1/n_2)}}$ IF $p_d = 0$, where $\hat{p} = (X_1 + X_2)/(n_1 + n_2)$

    $Z = \frac{(\hat{p}_1 - \hat{p}_2) - p_d}{\sqrt{\hat{p}_1(1-\hat{p}_1)/n_1 + \hat{p}_2(1-\hat{p}_2)/n_2}}$ IF $p_d \ne 0$

    C.I. for $p_1 - p_2$ is $(\hat{p}_1 - \hat{p}_2) \pm Z_{\alpha/2}\sqrt{\hat{p}_1(1-\hat{p}_1)/n_1 + \hat{p}_2(1-\hat{p}_2)/n_2}$
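
The two-proportion statistic and interval can be sketched in Python as follows. The function names `two_prop_z` and `two_prop_ci` are assumptions for illustration, and the counts in the usage lines are hypothetical, not data from the slides.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z(x1, n1, x2, n2, p_d=0.0):
    """Z statistic for H0: p1 - p2 = p_d."""
    p1, p2 = x1 / n1, x2 / n2
    if p_d == 0:
        # Pooled estimate is used when the hypothesized difference is zero.
        p = (x1 + x2) / (n1 + n2)
        se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    else:
        se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return ((p1 - p2) - p_d) / se


def two_prop_ci(x1, n1, x2, n2, alpha):
    """Two-sided confidence interval for p1 - p2 (unpooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    z_half = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_half * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) - margin, (p1 - p2) + margin


# Hypothetical counts: 30 of 100 versus 20 of 100.
print(round(two_prop_z(30, 100, 20, 100), 3))
print(two_prop_ci(30, 100, 20, 100, alpha=0.05))
```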


    End of Session