testing hypotheses

Upload: adena-mendoza

Post on 01-Jan-2016

TRANSCRIPT

  • TESTING HYPOTHESES

  • Two ways of arriving at a conclusion:
    1. Deductive inference: population → sample
    2. Inductive inference: sample → population

  • IF YOUR DATA ARE:
    1. Continuous data
    2. Ratio or interval
    3. Approximately normal distribution
    4. Equal variance (F-test)
    5. Conclusions about population based on sample (inductive): sample → population
    6. Sample size > 10

  • Imagine the following experiment: 2 groups of crickets
    Group 1 fed a diet with extra supplements
    Group 2 fed a diet with no supplements

    Weights, Group 1 (mean = 12.8):
    12.1 13.9 13.0 12.1 14.9 12.2 12.9 14.9 13.6 12.0 13.5 13.6 12.0 15.9 12.4 12.0 10.9 12.1 11.0 10.9

    Weights, Group 2 (mean = 9.49):
    9.1 8.9 11.0 10.1 9.9 9.2 8.0 11.9 8.6 9.0 8.5 9.6 10.0 10.9 9.4 8.0 11.9 7.1 10.0 8.9

  • What you're doing here is comparing two samples that, because you've not violated any of the assumptions we saw before, should represent populations that look like this:
    [Figure: two frequency distributions of weight, with means 9.49 and 12.8]
    Are the means of these populations different?

  • Are the means of these populations different?
    To answer this question, use a statistical test.
    A statistical test is just a method of determining mathematically whether you can say yes or no to this question.
    What test should I use?

  • IF YOU HAVEN'T VIOLATED ANY OF THE ASSUMPTIONS WE MENTIONED BEFORE
    Number of groups compared:
    - 2: Are the means of two populations the same? → t-test
        - Direction of difference specified? Yes → one-tailed; No → two-tailed
        - Does each data point in one data set (population) have a corresponding one in the other data set? Yes → paired t-test; No → unpaired t-test
    - Other than 2: Are the means of more than two populations the same? → ANOVA
        - Number of factors being tested:
            - 1: Does each data point in one data set (population) have a corresponding one in the other data sets? Yes → repeated measures ANOVA; No → one-way ANOVA
            - 2: two-way ANOVA
            - >2: other tests
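The flow chart can be read as a small decision function. The sketch below is one reading of the chart, not a definitive rule set; the function name `choose_parametric_test` and its parameter names are hypothetical and chosen only for illustration.

```python
def choose_parametric_test(n_groups, paired=False, directional=False, n_factors=1):
    """Return the test suggested by one reading of the flow chart.

    Assumes none of the parametric assumptions has been violated.
    All names here are illustrative, not taken from the slides.
    """
    if n_groups == 2:
        # Two groups: a t-test; tails and pairing are separate questions.
        tails = "one-tailed" if directional else "two-tailed"
        kind = "paired" if paired else "unpaired"
        return f"{tails} {kind} t-test"
    # Other than two groups: the ANOVA family, split by number of factors.
    if n_factors == 1:
        return "repeated measures ANOVA" if paired else "one-way ANOVA"
    if n_factors == 2:
        return "two-way ANOVA"
    return "other tests"


# The cricket experiment: 2 independent groups, no direction specified.
cricket_choice = choose_parametric_test(n_groups=2)
print(cricket_choice)
```

For the cricket experiment this lands on the unpaired, two-tailed t-test, which is the test used in the rest of the deck.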

  • A simple t-test
    1. State hypotheses
    Ho: there is no difference between the means of the two populations of crickets (i.e. the extra nutrients had no effect on weight)
    H1: there is a difference between the means of the two populations of crickets (i.e. the extra nutrients had an effect on weight)

  • A simple t-test
    2. Calculate a t-value (any stats program does this for you)
    3. Use a probability table for the test you used to determine the probability that corresponds to the t-value that was calculated (for the truly masochistic).
    Data → Test statistic → Probability

  • Unpaired t test
    Do the means of Nutrient fed and No nutrient differ significantly?

    P value
    The two-tailed P value is < 0.0001, considered extremely significant.
    t = 7.941 with 38 degrees of freedom.

    95% confidence interval
    Mean difference = -3.307 (Mean of No nutrient minus mean of Nutrient fed)
    The 95% confidence interval of the difference: -4.150 to -2.464

    Assumption test: Are the standard deviations equal?
    The t test assumes that the columns come from populations with equal SDs. The following calculations test that assumption.
    F = 1.192
    The P value is 0.7062.
    This test suggests that the difference between the two SDs is not significant.

    Assumption test: Are the data sampled from Gaussian distributions?
    The t test assumes that the data are sampled from populations that follow Gaussian distributions. This assumption is tested using the method of Kolmogorov and Smirnov:

    Group            KS       P Value   Passed normality test?
    ===============  ======   ========  =======================
    Nutrient fed     0.1676   >0.10     Yes
    No nutrient      0.1279   >0.10     Yes
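For anyone who wants to see what the stats program is doing, here is a minimal sketch of the pooled-variance (unpaired) t statistic and the F ratio, using only the Python standard library. The data are the cricket weights as transcribed from the earlier slide, so the results may differ slightly from the program output quoted above (t = 7.941, F = 1.192). The function name `unpaired_t` is hypothetical, and the critical value 2.024 (two-tailed, p = .05, 38 df) is taken from a standard t table rather than from the slides.

```python
import math
from statistics import mean, variance

nutrient_fed = [12.1, 13.9, 13.0, 12.1, 14.9, 12.2, 12.9, 14.9, 13.6, 12.0,
                13.5, 13.6, 12.0, 15.9, 12.4, 12.0, 10.9, 12.1, 11.0, 10.9]
no_nutrient = [9.1, 8.9, 11.0, 10.1, 9.9, 9.2, 8.0, 11.9, 8.6, 9.0,
               8.5, 9.6, 10.0, 10.9, 9.4, 8.0, 11.9, 7.1, 10.0, 8.9]


def unpaired_t(a, b):
    """Pooled-variance t statistic and degrees of freedom (assumes equal variances)."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # statistics.variance is the sample variance (divides by n - 1).
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / std_err, df


t_value, df = unpaired_t(nutrient_fed, no_nutrient)

# F ratio for the equal-variance check: larger variance over smaller.
variances = [variance(nutrient_fed), variance(no_nutrient)]
f_ratio = max(variances) / min(variances)

T_CRITICAL = 2.024  # assumed table value: two-tailed, p = .05, 38 df
print(f"t = {t_value:.3f} with {df} df, F = {f_ratio:.3f}")
print("reject Ho" if abs(t_value) > T_CRITICAL else "fail to reject Ho")
```

Turning t into an exact probability needs the t distribution itself, which is the part a probability table (or a stats package) supplies.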

  • Interpretation of p < .0001?
    This means that, if the two samples really came from populations with the same mean, there would be less than 1 chance in 10,000 of seeing a difference this large.
    In the world of statistics, that is too small a chance to have happened randomly, and so the Ho is rejected and the H1 accepted.

  • For all statistical tests that you'll use, the conventional cutoff is 5%, or p = .05: if the probability of getting a difference this large from a single population is below .05, the Ho is rejected.

  • Nonparametric Statistics (Nominal Data) & Goodness-of-Fit Tests

  • What happens if you violate any of the assumptions?
    Step 1 - Panic
    Step 2 - It depends on what assumptions have been violated.

    Assumption                 Other tests?     Another solution?
    1. Continuous data         Yes
    2. Ratio/interval          Yes
    3. Normal distribution     Yes              Transform the data
    4. Equal variance          Yes - Welch's
    5. Sample → population     Yes
    6. N
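As an illustration of the "Welch's" entry for unequal variances, here is a minimal sketch of Welch's t statistic with the Welch-Satterthwaite degrees of freedom, again on the cricket weights transcribed from the earlier slide. The slides do not run this test themselves; the function name `welch_t` is hypothetical.

```python
import math
from statistics import mean, variance

nutrient_fed = [12.1, 13.9, 13.0, 12.1, 14.9, 12.2, 12.9, 14.9, 13.6, 12.0,
                13.5, 13.6, 12.0, 15.9, 12.4, 12.0, 10.9, 12.1, 11.0, 10.9]
no_nutrient = [9.1, 8.9, 11.0, 10.1, 9.9, 9.2, 8.0, 11.9, 8.6, 9.0,
               8.5, 9.6, 10.0, 10.9, 9.4, 8.0, 11.9, 7.1, 10.0, 8.9]


def welch_t(a, b):
    """Welch's t statistic and approximate df (does not assume equal variances)."""
    # Per-group squared standard errors: sample variance over sample size.
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation; usually not a whole number.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df


welch_t_value, welch_df = welch_t(nutrient_fed, no_nutrient)
print(f"Welch t = {welch_t_value:.3f}, df = {welch_df:.1f}")
```

With equal group sizes the Welch t equals the pooled t; only the degrees of freedom shrink, and only slightly here because the two variances are so similar.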

  • Nonparametric Tests
    These tests are used when the assumptions of t-tests and ANOVA have been violated.

    They are called nonparametric because there is no estimation of parameters (means, standard deviations or variances) involved.
    Several kinds:
    Goodness-of-Fit tests - when you calculate an expected value
    Non-parametric equivalents of parametric tests

  • Goodness-of-Fit Tests
    Use with nominal scale data, e.g. results of genetic crosses.
    Also, you're using the population to deduce what the sample should look like.

  • Classic example - genetic crosses

    Do they conform to an expected Mendelian ratio?
    Back to our little ball creatures - Critterus sphericales
    Phenotypes: A_B_, A_bb, aaB_, aabb
    Mendelian inheritance - predict a 9:3:3:1 ratio

  • -sampled 320 animals

                       A_B_     A_bb     aaB_     aabb
    Observed (o)        194       53       67        6
    Expected (e)        180       60       60       20
    o - e                14       -7        7      -14
    (o - e)^2           196       49       49      196
    (o - e)^2 / e     1.089    0.817    0.817    9.800

    $\chi^2 = \sum \frac{(o - e)^2}{e} = 1.089 + 0.817 + 0.817 + 9.800 \approx 12.52$
    df = number of classes - 1 = 3

  • $\chi^2$ = 12.52
    Critical value for 3 degrees of freedom at the .05 level is 7.82 ($\chi^2$ table).
    Conclusion: probability of these data fitting the expected distribution is < .05, therefore they are not from a Mendelian population.
    The actual probability of $\chi^2$ = 12.52 and df = 3 is .01 > p > .001
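The goodness-of-fit arithmetic for the 320 sampled animals can be sketched in a few lines of standard-library Python. The expected counts come from the 9:3:3:1 ratio and the critical value 7.82 is the one quoted on the slide. The helper `chi2_sf_df3` is a hypothetical name; it uses a closed form for the upper-tail probability that holds only for 3 degrees of freedom, so it is not a general chi-square routine.

```python
import math

phenotypes = ["A_B_", "A_bb", "aaB_", "aabb"]
observed = [194, 53, 67, 6]
ratio = [9, 3, 3, 1]  # expected Mendelian ratio

total = sum(observed)
expected = [total * r / sum(ratio) for r in ratio]

# Sum of (o - e)^2 / e over all phenotype classes.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


p_value = chi2_sf_df3(chi2)
CRITICAL_05_DF3 = 7.82  # value quoted on the slide

for name, o, e in zip(phenotypes, observed, expected):
    print(f"{name}: observed {o}, expected {e:.0f}")
print(f"chi-square = {chi2:.2f}, df = {df}, p = {p_value:.4f}")
print("reject Ho" if chi2 > CRITICAL_05_DF3 else "fail to reject Ho")
```

The computed probability falls inside the .01 > p > .001 bracket given on the slide.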

  • A little $\chi^2$ wrinkle - the Yates correction
    Formula is $\chi^2 = \sum \frac{(o - e)^2}{e}$
    Except if df = 1 (i.e. you're using two categories of data)
    Then the formula becomes

    $\chi^2 = \sum \frac{(|o - e| - 0.5)^2}{e}$
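A minimal sketch of the correction, assuming a two-class case. The counts below (84 and 16 against a 3:1 expectation) are hypothetical and are not from the slides; the function name `chi_square_gof` is also just illustrative.

```python
def chi_square_gof(observed, expected, yates=None):
    """Chi-square goodness-of-fit statistic.

    By default the Yates correction is applied only when df = 1
    (two classes), as described above. Pass yates=False to switch it off.
    """
    df = len(observed) - 1
    if yates is None:
        yates = (df == 1)
    shift = 0.5 if yates else 0.0
    return sum((abs(o - e) - shift) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical two-class example: 100 animals, 3:1 expected ratio.
observed = [84, 16]
expected = [75.0, 25.0]

uncorrected = chi_square_gof(observed, expected, yates=False)
corrected = chi_square_gof(observed, expected)  # df = 1, so Yates applies
print(f"uncorrected = {uncorrected:.3f}, Yates-corrected = {corrected:.3f}")
```

The correction always shrinks the statistic, which makes the test a little more conservative when there are only two classes.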

  • A second goodness-of-fit test

    G-test or Log-Likelihood Ratio

    Use if |o - e| < e, e.g. if o is 12 and e is 7

    $G = 2 \sum o \ln\frac{o}{e} = 4.60517 \sum o \log_{10}\frac{o}{e}$
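The two forms of the G formula are the same quantity, because 4.60517 is 2 × ln 10. The sketch below checks that numerically on the Critterus counts from the chi-square example; applying G to those particular data is an illustration added here, not something the slides do.

```python
import math

observed = [194, 53, 67, 6]
expected = [180.0, 60.0, 60.0, 20.0]  # 9:3:3:1 ratio applied to 320 animals

# Natural-log form: G = 2 * sum(o * ln(o / e)).
g_natural = 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected))

# Base-10 form from the slide: G = 4.60517 * sum(o * log10(o / e)).
g_log10 = 4.60517 * sum(o * math.log10(o / e) for o, e in zip(observed, expected))

df = len(observed) - 1
print(f"G (ln form) = {g_natural:.3f}, G (log10 form) = {g_log10:.3f}, df = {df}")
```

G is compared against the same $\chi^2$ table, with the same degrees of freedom, as the chi-square statistic.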

  • Summary!

    All of the parametric tests (remember the big flow chart!) have non-parametric equivalents (or analogues)

    Type of data   Number of samples   Are data related?   Test to use
    Nominal        2                   Yes                 McNemar
    Nominal        2                   No                  Fisher's Exact
    Nominal        >2                  Yes                 Cochran's Q
    Ordinal        1                   No                  Kolmogorov-Smirnov
    Ordinal+       2                   Yes                 Wilcoxon (paired t-test analogue)
    Ordinal+       2                   No                  Mann-Whitney U (unpaired t-test analogue)
    Ordinal+       >2                  No                  Kruskal-Wallis (analogue of one-way ANOVA)
    Ordinal        >2                  Yes                 Friedman two-way ANOVA