
Page 1: Inferential Statistics


INFERENTIAL STATISTICS

BY:

ADIBARAWIAH BT ABDUL RAHIM (D20102040002)
NIK NURHAFIZAH BT NIK DAUD (D20102039968)
NOOR AZIE HARYANIS BT ABDUL AZIZ (D20102039969)

Page 2: Inferential Statistics

WHAT ARE INFERENTIAL STATISTICS?

› Inferential statistics refer to certain procedures that allow researchers to make inferences about a population based on data obtained from a sample.

Page 3: Inferential Statistics


THE LOGIC OF INFERENTIAL STATISTICS

Page 4: Inferential Statistics

Sampling Error

› Definition
– The difference between a sample and its population (based on the data obtained).
– It arises as a result of taking a sample from the population rather than using the whole population.

Page 5: Inferential Statistics

Distribution of Sample Means

› Known as the sampling distribution.
– It has its own mean and standard deviation.
– Its mean is the mean of the means (from the samples).
– This mean is equal to the mean of the population.

Page 6: Inferential Statistics

› Large collections of random samples
– Pattern themselves in such a way that it is possible for researchers to predict accurately some characteristics of the population from which the sample was selected.
– Their means tend to be normally distributed (provided the size of each sample is large; more than 30).

Page 7: Inferential Statistics

Standard Error Of The Mean (SEM)

› The standard deviation of the sampling distribution of means is called the standard error of the mean (SEM).

› Formula: SEM = SD / √n, where SD = the standard deviation and n = sample size.
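As an illustration, a minimal Python sketch of this formula; the scores array is hypothetical, and scipy.stats.sem would give the same result:

```python
import numpy as np

scores = np.array([72, 85, 90, 68, 77, 81, 95, 70, 88, 79])  # hypothetical sample

sd = scores.std(ddof=1)          # sample standard deviation
sem = sd / np.sqrt(len(scores))  # standard error of the mean: SD / sqrt(n)

print(f"mean = {scores.mean():.2f}, SD = {sd:.2f}, SEM = {sem:.2f}")
```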

Page 8: Inferential Statistics

CONFIDENCE INTERVALS

› A confidence interval is a region extending both above and below a sample statistic (such as a sample mean) within which a population parameter (such as the population mean) may be said to fall with a specified probability of being wrong.

› Limits or boundaries within which the population mean lies:
– 68% fall within ± 1 SEM of the mean
– 95% fall within ± 2 SEM of the mean
– 99% fall within ± 3 SEM of the mean
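A quick Python sketch of these boundaries, using the slide's rule-of-thumb multipliers (the sample values are hypothetical):

```python
import numpy as np
from scipy import stats

scores = np.array([72, 85, 90, 68, 77, 81, 95, 70, 88, 79])  # hypothetical sample
mean = scores.mean()
sem = stats.sem(scores)  # SD / sqrt(n), with ddof=1 by default

# Rule-of-thumb intervals from the slide: 1, 2, and 3 SEMs around the mean
for k, level in [(1, "68%"), (2, "95%"), (3, "99%")]:
    print(f"{level} interval: {mean - k * sem:.2f} to {mean + k * sem:.2f}")
```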

Page 12: Inferential Statistics

The Standard Error Of The Difference Between Sample Means

› The standard error of the difference (SED)
– is the standard deviation of the distribution of differences between sample means.
– Formula: SED = √(SEM₁² + SEM₂²), where SEM₁ and SEM₂ are the standard errors of the means of the two samples.
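A minimal sketch of this formula in Python (the two samples are hypothetical):

```python
import numpy as np
from scipy import stats

group_a = np.array([85, 90, 78, 92, 88, 75, 84, 91])  # hypothetical samples
group_b = np.array([80, 74, 86, 79, 72, 83, 77, 81])

sem_a = stats.sem(group_a)
sem_b = stats.sem(group_b)

# Standard error of the difference between two independent sample means
sed = np.sqrt(sem_a**2 + sem_b**2)
print(f"SEM A = {sem_a:.2f}, SEM B = {sem_b:.2f}, SED = {sed:.2f}")
```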

Page 13: Inferential Statistics


HYPOTHESIS TESTING

Page 14: Inferential Statistics

RESEARCH HYPOTHESIS

NULL HYPOTHESIS

HYPOTHESIS TESTING

Page 15: Inferential Statistics

HYPOTHESIS TESTING

› Statistical hypothesis testing is a way of determining the probability that an obtained sample statistic will occur, given a hypothetical population parameter.

Page 16: Inferential Statistics

RESEARCH HYPOTHESIS

› A research hypothesis specifies the nature of the relationship the researcher thinks exists in the population.

› E.g.:

“The population mean of students using method A is greater than the population mean of students using method B.”

Page 17: Inferential Statistics

NULL HYPOTHESIS

› The null hypothesis typically specifies that there is no relationship in the population.

› E.g.:

“There is no difference between the population mean of students using method A and the population mean of students using method B.”

› (This is the same thing as saying the difference between the means of the two populations is zero.)

Page 18: Inferential Statistics

Steps in hypothesis testing

Page 19: Inferential Statistics

1. State the research hypothesis. "There is a difference between the population mean of students using method A and the population mean of students using method B."

2. State the null hypothesis. "There is no difference between the population mean of students using method A and the population mean of students using method B."

3. Determine the sample statistics pertinent to the hypothesis: the mean of sample A and the mean of sample B.

4. Determine the probability of obtaining the sample results: the difference between the mean of sample A and the mean of sample B.

5. If the probability is small, reject the null hypothesis, thus affirming the research hypothesis. If the probability is large, do not reject the null hypothesis, which means you cannot affirm the research hypothesis.
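A compact Python sketch of this five-step logic, using a two-sample t-test (the score arrays are hypothetical):

```python
import numpy as np
from scipy import stats

# Steps 1-2: research hypothesis = the means differ; null hypothesis = no difference
method_a = np.array([85, 90, 78, 92, 88, 75, 84, 91])  # hypothetical scores
method_b = np.array([80, 74, 86, 79, 72, 83, 77, 81])

# Steps 3-4: sample statistics and the probability of the observed difference under the null
t_stat, p_value = stats.ttest_ind(method_a, method_b)

# Step 5: compare the probability with the significance level
alpha = 0.05
if p_value < alpha:
    print(f"p = {p_value:.3f} < {alpha}: reject the null hypothesis")
else:
    print(f"p = {p_value:.3f} >= {alpha}: do not reject the null hypothesis")
```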

Page 20: Inferential Statistics


PRACTICAL VS. STATISTICAL SIGNIFICANCE

Page 21: Inferential Statistics

PRACTICAL SIGNIFICANCE

› A calculated difference is practically significant if the actual difference it is estimating will affect a decision to be made.

› Practical significance is more subjective and is based on other factors like cost, requirements, program goals, etc.

Page 22: Inferential Statistics

› When determining practical significance, the researcher must consider the following:
– The quality of the research questions
– The relative size of the effect
– The size of the sample
– The importance of the finding
– Confidence intervals
– The link to previous research
– The strength of correlation

Page 23: Inferential Statistics

STATISTICAL SIGNIFICANCE

› Statistical significance only means that one’s results are likely to occur by chance less than a certain percentage of the time, say 5 percent.

› It reflects the degree of risk that you are willing to take of rejecting a null hypothesis when it is actually true.

› Statistical significance is mathematical - it comes from the data (sample size) and from your confidence (how confident you want to be in your results).

Page 24: Inferential Statistics

TESTS OF STATISTICAL SIGNIFICANCE

› A one-tailed test of significance involves the use of probabilities based on one-half of a sampling distribution because the research hypothesis is a directional hypothesis.

› A two-tailed test, on the other hand, involves the use of probabilities based on both sides of a sampling distribution because the research hypothesis is a nondirectional hypothesis.

Page 25: Inferential Statistics

TWO-TAILED TESTS

› If you are using a significance level of 0.05, a two-tailed test allots half of your alpha to testing the statistical significance in one direction and half of your alpha to testing statistical significance in the other direction.

› This means that .025 is in each tail of the distribution of your test statistic.

› When using a two-tailed test, regardless of the direction of the relationship you hypothesize, you are testing for the possibility of the relationship in both directions.

› For example, we may wish to compare the mean of a sample to a given value x using a t-test. Our null hypothesis is that the mean is equal to x. A two-tailed test will test both whether the mean is significantly greater than x and whether the mean is significantly less than x. The mean is considered significantly different from x if the test statistic is in the top 2.5% or bottom 2.5% of its probability distribution, resulting in a p-value less than 0.05.
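A minimal sketch of this two-tailed one-sample test in Python (the sample and the value x are hypothetical):

```python
import numpy as np
from scipy import stats

scores = np.array([72, 85, 90, 68, 77, 81, 95, 70, 88, 79])  # hypothetical sample
x = 75  # hypothesized population mean

# Two-tailed one-sample t-test: is the sample mean different from x in either direction?
t_stat, p_value = stats.ttest_1samp(scores, popmean=x)
print(f"t = {t_stat:.2f}, two-tailed p = {p_value:.3f}")
```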

Page 27: Inferential Statistics

ONE-TAILED TESTS

› If you are using a significance level of .05, a one-tailed test allots all of your alpha to testing the statistical significance in the one direction of interest.

› This means that .05 is in one tail of the distribution of your test statistic.

› When using a one-tailed test, you are testing for the possibility of the relationship in one direction and completely disregarding the possibility of a relationship in the other direction.

› Our null hypothesis is that the mean is equal to x. A one-tailed test will test either if the mean is significantly greater than x or if the mean is significantly less than x, but not both.

› Then, depending on the chosen tail, the mean is significantly greater than or less than x if the test statistic is in the top 5% of its probability distribution or bottom 5% of its probability distribution, resulting in a p-value less than 0.05. 

› The one-tailed test provides more power to detect an effect in one direction by not testing the effect in the other direction.
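A one-tailed version of the same sketch; the alternative argument is available in scipy >= 1.6:

```python
import numpy as np
from scipy import stats

scores = np.array([72, 85, 90, 68, 77, 81, 95, 70, 88, 79])  # hypothetical sample
x = 75

# One-tailed test: is the mean significantly GREATER than x? (requires scipy >= 1.6)
t_stat, p_value = stats.ttest_1samp(scores, popmean=x, alternative="greater")
print(f"t = {t_stat:.2f}, one-tailed p = {p_value:.3f}")
```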

Page 29: Inferential Statistics

TYPE I AND TYPE II ERRORS

› Type I error: rejecting the null hypothesis when it is true.

› The probability of making a Type I error
– Set by the researcher
– E.g., .01 = 1% chance of rejecting the null when it is true.
– E.g., .05 = 5% chance of rejecting the null when it is true.
– Not the probability of making one or more Type I errors on multiple tests of the null.

› Type II error: not rejecting the null hypothesis when it is false.

› The probability of making a Type II error
– Not directly controlled by the researcher
– Reduced by increasing sample size

Page 30: Inferential Statistics

Making a decision

If you...                    When the null hypothesis is actually...    Then you have...
Reject the null hypothesis   True (there really are no differences)     Made a Type I error
Reject the null hypothesis   False (there really are differences)       Made a correct decision
Accept the null hypothesis   False (there really are differences)       Made a Type II error
Accept the null hypothesis   True (there really are no differences)     Made a correct decision

Page 31: Inferential Statistics

SIGNIFICANCE LEVELS

› The term significance level (or level of significance), as used in research, refers to the probability of a sample statistic occurring as a result of sampling error.

› The significance levels most commonly used in educational research are the .05 and .01 levels.

› Statistical significance and practical significance are not necessarily the same. Even if a result is statistically significant, it may not be practically (i.e., educationally) significant.

Page 32: Inferential Statistics

Probability Values

› p > .05 (deemed likely to be a result of chance)

› p < .05 (not likely to be a result of chance)

› p < .01 (less likely to be a result of chance)

› p < .001 (even less likely to be a result of chance)

Researchers now more often report the actual probability value rather than using < or > signs (e.g., p = .063).

Page 33: Inferential Statistics


INFERENCE TECHNIQUES

Page 34: Inferential Statistics

Commonly Used Inferential Techniques

Quantitative data
– Parametric: t-test for means; analysis of variance (ANOVA); analysis of covariance (ANCOVA); multivariate analysis of variance (MANOVA); t-test for r
– Nonparametric: Mann-Whitney U test; Kruskal-Wallis one-way analysis of variance; sign test; Friedman two-way analysis of variance

Categorical data
– Parametric: t-test for difference in proportions
– Nonparametric: chi-square test

Page 35: Inferential Statistics

PARAMETRIC TESTS FOR QUANTITATIVE DATA

› A parametric statistical test requires various kinds of assumptions about the nature of the population from which the samples involved in the research study were taken.

Page 36: Inferential Statistics

The t-Test for Means

› Used to see whether a difference between the means of two samples is significant.

› Produces a value for t (called an obtained t), which is used to determine whether the chosen level of significance (e.g., p = .05) has been reached.
– The researcher rejects the null hypothesis and concludes that a real difference exists when the level of significance is reached.

› Two forms of t-test:
– A t-test for independent means
– A t-test for correlated means

Page 37: Inferential Statistics

t-test for independent means

› Used to compare the mean scores of two different, or independent, groups.

› Example: two randomly selected groups of 8th graders (31 in each group) were exposed to two different methods of teaching for a semester and were given the same achievement test at the end of the semester. Their achievement scores could be compared using a t-test.
– Null hypothesis: population mean of method A = population mean of method B
– Research hypothesis: population mean of method A > population mean of method B

› The mean score of the achievement test for method A = 85
› The mean score of the achievement test for method B = 80
› A one-tailed t-test is conducted on the difference between the two methods (85 − 80 = 5) to conclude whether the difference is statistically significant.
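A hedged sketch of this example in Python; the scores are simulated rather than real data, and the one-tailed alternative argument requires scipy >= 1.6:

```python
import numpy as np
from scipy import stats

# Hypothetical achievement scores for two independent groups (31 students each)
rng = np.random.default_rng(0)
method_a = rng.normal(loc=85, scale=10, size=31)
method_b = rng.normal(loc=80, scale=10, size=31)

# One-tailed test of the research hypothesis: mean(A) > mean(B)
t_stat, p_value = stats.ttest_ind(method_a, method_b, alternative="greater")
print(f"t = {t_stat:.2f}, one-tailed p = {p_value:.3f}")
```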

Page 38: Inferential Statistics

t-test for correlated means

› Used to compare the mean scores of the same group before and after a treatment of some sort is given.

› Used when the same subjects receive two different treatments in a study.

› Example: a researcher wants to investigate the effectiveness of relaxation training for reducing the level of anxiety athletes experience and thus improving their performance at the free throw line. She formulates these hypotheses:
– Null hypothesis: there will be no change in performance at the free throw line.
– Research hypothesis: performance at the free throw line will improve.
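A minimal sketch of this paired design in Python (the free-throw counts are hypothetical; alternative requires scipy >= 1.6):

```python
import numpy as np
from scipy import stats

# Hypothetical free-throw successes (out of 20) before and after relaxation training
before = np.array([11, 9, 14, 8, 12, 10, 13, 7, 11, 9])
after  = np.array([13, 10, 15, 10, 12, 13, 14, 9, 12, 11])

# Paired (correlated-means) t-test; one-tailed since improvement is hypothesized
t_stat, p_value = stats.ttest_rel(after, before, alternative="greater")
print(f"t = {t_stat:.2f}, one-tailed p = {p_value:.3f}")
```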

Page 39: Inferential Statistics

Analysis of Variance (ANOVA)

› Tests for differences among more than two means by forming a ratio of the variation between groups to the variation within groups.

› It is more versatile than a t-test and should be used in most cases instead of the t-test.

› The analysis allows comparison of the means of the samples and testing of the null hypothesis that there is no significant difference between the means of the samples.
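A minimal one-way ANOVA sketch in Python (the three groups are hypothetical):

```python
import numpy as np
from scipy import stats

# Hypothetical test scores for three independent groups
group_1 = np.array([85, 90, 78, 92, 88])
group_2 = np.array([80, 74, 86, 79, 72])
group_3 = np.array([70, 75, 68, 73, 77])

# One-way ANOVA: null hypothesis = all three population means are equal
f_stat, p_value = stats.f_oneway(group_1, group_2, group_3)
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")
```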

Page 40: Inferential Statistics

Analysis of Covariance (ANCOVA)

› Used when groups are given a pretest related in some way to the dependent variable and their mean scores on this pretest are found to differ.

› Enables the researcher to adjust the posttest mean scores on the dependent variable for each group to compensate for the initial differences between the groups on the pretest. The pretest is called the covariate.

› How much the posttest mean scores must be adjusted depends on how large the difference between the pretest means is and on the degree of relationship between the covariate and the dependent variable.
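One common way to run an ANCOVA in Python is as a linear model with the pretest as a covariate; a sketch using statsmodels, in which the data frame and column names are hypothetical:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: group membership, pretest (covariate), and posttest scores
data = pd.DataFrame({
    "group":    ["A"] * 5 + ["B"] * 5,
    "pretest":  [60, 65, 70, 55, 62, 72, 68, 75, 70, 66],
    "posttest": [75, 80, 85, 70, 78, 82, 79, 88, 83, 77],
})

# ANCOVA as a linear model: posttest adjusted for pretest, testing the group effect
model = smf.ols("posttest ~ pretest + C(group)", data=data).fit()
print(model.summary())
```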

Page 41: Inferential Statistics

Multivariate Analysis of Variance (MANOVA)

› Differs from ANOVA in only one respect: it incorporates two or more dependent variables in the same analysis, thus permitting a more powerful test of differences among means.

› It is justified only when the researcher has reason to believe correlations exist among the dependent variables.

Page 42: Inferential Statistics

The t-Test for r

› Used to see whether a correlation coefficient calculated on sample data is significant; that is, whether it represents a nonzero correlation in the population from which the sample was drawn.

› The statistic being dealt with is a correlation coefficient (r) rather than a difference between means. The test produces a value for t (again called an obtained t), which the researcher checks in a statistical probability table to see whether it is statistically significant. As with the other parametric tests, the larger the obtained value for t, the greater the likelihood that significance has been achieved.
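A minimal sketch in Python; pearsonr reports the p-value from the equivalent t-test of the null hypothesis that the population correlation is zero (the paired data are hypothetical):

```python
import numpy as np
from scipy import stats

# Hypothetical paired observations, e.g., hours studied vs. test score
hours  = np.array([2, 5, 1, 7, 4, 6, 3, 8, 5, 2])
scores = np.array([60, 75, 55, 85, 70, 80, 62, 90, 72, 58])

# r and the p-value for H0: population correlation = 0
r, p_value = stats.pearsonr(hours, scores)
print(f"r = {r:.2f}, p = {p_value:.4f}")
```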

Page 43: Inferential Statistics

NONPARAMETRIC TESTS FOR QUANTITATIVE DATA

› A nonparametric statistical technique makes few, if any, assumptions about the nature of the population from which the samples in the study were taken.

› Some of the commonly used nonparametric techniques for analyzing quantitative data are the Mann-Whitney U test, the Kruskal-Wallis one-way analysis of variance, the sign test, and the Friedman two-way analysis of variance.

Page 44: Inferential Statistics

The Mann-Whitney U Test

› A nonparametric alternative to the t-test, used when a researcher wishes to analyze ranked data. The researcher intermingles the scores of the two groups and then ranks them as if they were all from just one group.

› The test produces a value (U), whose probability of occurrence is then checked by the researcher in the appropriate statistical table.

› The logic of the test is as follows:
– If the parent populations are identical, then the sum of the pooled rankings for each group should be about the same.
– If the summed ranks are markedly different, on the other hand, then this difference is likely to be statistically significant.
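A minimal sketch in Python (the two groups are hypothetical):

```python
import numpy as np
from scipy import stats

# Hypothetical scores for two independent groups
group_a = np.array([85, 90, 78, 92, 88, 75])
group_b = np.array([70, 74, 66, 79, 72, 63])

# Mann-Whitney U test: compares the pooled rankings of the two groups
u_stat, p_value = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
print(f"U = {u_stat:.1f}, p = {p_value:.4f}")
```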

Page 45: Inferential Statistics

The Kruskal-Wallis One-Way Analysis of Variance

› Used when researchers have more than two independent groups to compare.

› The procedure is quite similar to the Mann-Whitney U test, except that the sums of the ranks added together for each of the separate groups are compared.

› This analysis produces a value (H), whose probability of occurrence is checked by the researcher in the appropriate statistical table.
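A minimal sketch in Python (the three groups are hypothetical):

```python
import numpy as np
from scipy import stats

# Hypothetical scores for three independent groups
group_1 = np.array([85, 90, 78, 92, 88])
group_2 = np.array([80, 74, 86, 79, 72])
group_3 = np.array([70, 75, 68, 73, 77])

# Kruskal-Wallis test: a rank-based analogue of one-way ANOVA
h_stat, p_value = stats.kruskal(group_1, group_2, group_3)
print(f"H = {h_stat:.2f}, p = {p_value:.4f}")
```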

Page 46: Inferential Statistics

The Sign Test

› Used when a researcher wants to analyze two related (as opposed to independent) samples. Related samples are connected in some way.

› For example, often a researcher will try to equalize groups on IQ, gender, age, or some other variable.

› Another example of a related sample is when the same group is both pre- and posttested (that is, tested twice). Each individual, in other words, is tested on two different occasions (as with the t-test for correlated means).

› Procedure:
– The researcher simply lines up the pairs of related subjects and then determines how many times the paired subjects in one group scored higher than those in the other group. If the groups do not differ significantly, the totals for the two groups should be about equal. If there is a marked difference in scoring (such as many more in one group scoring higher), the difference may be statistically significant.
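scipy has no dedicated sign-test function, but the test reduces to a binomial test on the number of positive differences; a sketch under that assumption, with hypothetical paired scores (binomtest requires scipy >= 1.7):

```python
import numpy as np
from scipy import stats

# Hypothetical paired scores (e.g., pretest and posttest for the same subjects)
pre  = np.array([11, 9, 14, 8, 12, 10, 13, 7, 11, 9])
post = np.array([13, 10, 15, 10, 12, 13, 14, 9, 12, 11])

diffs = post - pre
n_pos = int(np.sum(diffs > 0))  # pairs where the second score is higher
n = int(np.sum(diffs != 0))     # ties are dropped

# Under the null, higher scores are equally likely in either member of the pair (p = 0.5)
result = stats.binomtest(n_pos, n=n, p=0.5, alternative="two-sided")
print(f"{n_pos} of {n} pairs positive, p = {result.pvalue:.4f}")
```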

Page 47: Inferential Statistics

The Friedman Two-Way Analysis of Variance

› If more than two related groups are involved, then this test can be used.

› Example: this test would be appropriate if a researcher employs four matched groups.
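A minimal sketch in Python with four matched conditions (the scores are hypothetical):

```python
import numpy as np
from scipy import stats

# Hypothetical scores for the same subjects measured under four matched conditions
cond_1 = np.array([10, 12, 9, 14, 11, 13])
cond_2 = np.array([12, 13, 10, 15, 12, 14])
cond_3 = np.array([9, 11, 8, 13, 10, 12])
cond_4 = np.array([11, 14, 10, 16, 13, 15])

# Friedman test: rank-based comparison of more than two related groups
chi2_stat, p_value = stats.friedmanchisquare(cond_1, cond_2, cond_3, cond_4)
print(f"chi-square = {chi2_stat:.2f}, p = {p_value:.4f}")
```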

Page 48: Inferential Statistics

PARAMETRIC TESTS FOR CATEGORICAL DATA

› The most common parametric technique for analyzing categorical data is the t-test for differences in proportions.
– The t-Test for Proportions

› Used to analyze whether the proportion in one category (e.g., males) is different from the proportion in another category (e.g., females).

› Two forms, similar to the t-test for means for quantitative data:
– t-test for independent proportions
– t-test for correlated proportions

Page 49: Inferential Statistics

NONPARAMETRIC TESTS FOR CATEGORICAL DATA

› The chi-square test is the nonparametric technique most commonly used to analyze categorical data.
– The Chi-Square Test

› The chi-square statistic can be used to determine the strength of the relationship (i.e., does knowing someone's gender help you predict their outcome score/value?).

› The test statistic is:

χ² = Σ (O − E)² / E

where
χ² = chi-square value
O = observed frequency for each category
E = expected frequency for each category

Page 50: Inferential Statistics

Chi-square example

› We are interested in whether male students vs. female students are more likely to own cats vs. dogs.

› Notice that both variables are categorical.
– Kind of pet: people are classified as owning cats or dogs (or both or neither). We can count the number of people belonging to each category; we don't scale them along a dimension of pet ownership.
– Sex: people are male or female. We count the number of people in each category; we don't scale each person along a sex dimension.

Page 51: Inferential Statistics

Example Data

› Males are more likely to have dogs as opposed to cats.
› Females are more likely to have cats than dogs.

         Cat   Dog   Total
Male      20    30      50
Female    30    20      50
Total     50    50     100

NHST question: are these differences best accounted for by the null hypothesis, or by the hypothesis that there is a real relationship between gender and pet ownership?

Page 52: Inferential Statistics

› To answer this question, we need to know what we would expect to observe if the null hypothesis were true (i.e., that there is no relationship between these two variables, and any observed relationship is due to sampling error).

Page 53: Inferential Statistics

Example Data

› To find the expected value for a cell of the table, multiply the corresponding row total by the column total and divide by the grand total.

› For the first cell (and all other cells), (50 × 50) / 100 = 25.

› Thus, if the two variables are unrelated, we would expect to observe 25 people in each cell.

         Cat   Dog   Total
Male      25    25      50
Female    25    25      50
Total     50    50     100

Page 54: Inferential Statistics

Example Data

› The differences between these expected values and the observed values are aggregated according to the chi-square formula:

χ² = Σ (O − E)² / E
   = (20 − 25)²/25 + (30 − 25)²/25 + (30 − 25)²/25 + (20 − 25)²/25
   = 1 + 1 + 1 + 1
   = 4
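This calculation can be checked in Python; note that scipy's chi2_contingency applies the Yates continuity correction to 2×2 tables by default, so correction=False is needed to reproduce the hand computation:

```python
import numpy as np
from scipy import stats

# Observed frequencies from the example: rows = male/female, columns = cat/dog
observed = np.array([[20, 30],
                     [30, 20]])

# correction=False disables the Yates correction so the result matches the
# hand-computed chi-square of 4.0
chi2, p_value, dof, expected = stats.chi2_contingency(observed, correction=False)
print(f"chi-square = {chi2:.1f}, df = {dof}, p = {p_value:.4f}")
print("expected counts:\n", expected)
```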

Page 55: Inferential Statistics

Null Hypothesis Significance Testing (NHST) and chi-square

› Once you have the chi-square statistic, it can be evaluated against a chi-square sampling distribution.

› The sampling distribution characterizes the range of chi-square values we might observe if the null hypothesis is true but sampling error is giving rise to deviations from the expected values.

› You can look up the probability value associated with a chi-square statistic in a table or by using a computer.

› In our example, the chi-square value of 4.0 has an associated p-value of less than .05: with 1 degree of freedom, the chi-square statistic must be larger than 3.84 to be significant at the .05 level, and 4.0 exceeds this.

Page 56: Inferential Statistics

POWER OF A STATISTICAL TEST

› The power of a statistical test for a particular set of data is the likelihood of identifying a difference between population parameters when, in fact, it exists.

› Parametric tests are generally, but not always, more powerful than nonparametric tests.
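As an illustration, a sketch of a power calculation for an independent-samples t-test using statsmodels; the effect size and sample size are hypothetical:

```python
from statsmodels.stats.power import TTestIndPower

# Power of a two-sided independent-samples t-test, alpha = .05,
# for a medium effect size (Cohen's d = 0.5) and 31 subjects per group
analysis = TTestIndPower()
power = analysis.power(effect_size=0.5, nobs1=31, alpha=0.05, alternative="two-sided")
print(f"power = {power:.2f}")

# Sample size needed per group to reach 80% power for the same effect
n_needed = analysis.solve_power(effect_size=0.5, power=0.8, alpha=0.05)
print(f"n per group for 80% power = {n_needed:.1f}")
```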

Page 57: Inferential Statistics

REFERENCES

Fraenkel, J. R., Wallen, N. E., & Hyun, H. H. (2012). How to design and evaluate research in education. New York: McGraw-Hill.

IDRE. (2013). Tail tests. Retrieved 6 November 2013 from http://www.ats.ucla.edu/stat/mult_pkg/faq/general/tail_tests.htm

Sauro, J. (2004-2013). The standard error of the mean. Retrieved from http://www.usablestats.com/lessons/sem