Fundamentals of Statistical Reasoning in Education
7/17/2019 Fundamentals of Statistical Reasoning in Education
http://slidepdf.com/reader/full/fundamentals-of-statistical-reasoning-in-education 1/3
Reading the Research: Independent-Samples t Test
Santa and Hoien (1999, p. 65) examined the effects of an early-intervention program on a sample of students at risk for reading failure:
A t-test analysis showed that the post-intervention spelling performance in the experimental group (M = 59.6, SD = 5.95) was statistically significantly higher than in the control group (M = 53.7, SD = 12.4), t(47) = 2.067, p < .05.
Notice that an exact p value is not reported; rather, probability is reported relative to the significance level of .05. The result of this independent-samples t test is therefore deemed significant at the .05 level.
Source: Santa, C. M., & Hoien, T. (1999). An assessment of early steps: A program for early intervention of reading problems. Reading Research Quarterly, 34(1), 54–79.
Case Study: Doing Our Homework
This case study demonstrates the application of the independent-samples t test. We compared the academic achievement of students who, on average, spend two hours a day on homework to students who spend about half that amount of time on homework. Does that extra hour of homework (in this case, double the time) translate into a corresponding difference in achievement?
The sample of nearly 500 students was randomly selected from a population of seniors enrolled in public schools located in the northeastern United States. (The data are courtesy of the National Center for Education Statistics’ National Education Longitudinal Study of 1988.) We compared two groups of students: those reporting 4–6 hours of homework per week (Group 1) and those reporting 10–12 hours per week (Group 2). The criterion measures were reading achievement, mathematics achievement, and grade-point average.
One could reasonably expect that students who did more homework would score higher on measures of academic performance. We therefore chose the directional alternative hypothesis, $H_1: \mu_1 - \mu_2 < 0$, for each of the three t tests below. (The "less than" symbol simply reflects the fact that we are subtracting the hypothetically larger mean from the smaller mean.) For all three tests, the null hypothesis stated no difference, $H_0: \mu_1 - \mu_2 = 0$. The level of significance was set at .05.
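For reference, the hypotheses and the pooled-variance t ratio used for an independent-samples test can be written out as below. This is the standard textbook form, not a quotation from this section; the subscript "pooled" and the symbol $s_{\bar{X}_1 - \bar{X}_2}$ for the standard error of the difference are notational assumptions.

```latex
\begin{align*}
H_0 &: \mu_1 - \mu_2 = 0, \qquad H_1 : \mu_1 - \mu_2 < 0 \\
s^2_{\text{pooled}} &= \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2} \\
s_{\bar{X}_1 - \bar{X}_2} &= \sqrt{s^2_{\text{pooled}}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \\
t &= \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{X}_1 - \bar{X}_2}}, \qquad df = n_1 + n_2 - 2
\end{align*}
```

With a directional $H_1$ of this form, only a sufficiently large negative t ratio leads to rejecting $H_0$.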
of significant differences between means considerably easier than when the groups are already formed on the basis of some characteristic of the participants (e.g., sex, ethnicity).
The assumption of random sampling underlies nearly all the statistical inference techniques used by educational researchers, including the t test and other procedures described in this book. Inferences to populations from which the samples have been randomly selected are directly backed by the laws of probability and statistics and are known as statistical inferences; inferences or generalizations to all other groups are nonstatistical in nature and involve judgment and interpretation.
294 Chapter 14 Comparing the Means of Two Populations: Independent Samples
Our first test examined the mean difference between the two groups in reading performance. Scores on the reading exam are represented by T scores, which, you may recall from Chapter 6, have a mean of 50 and a standard deviation of 10. (Remember not to confuse T scores, which are standard scores, with t ratios, which make up the t distribution and are used for significance testing.) The mean scores are shown in Table 14.3. As expected, the mean reading achievement of Group 2 ($\bar{X}_2 = 54.34$) exceeded that of Group 1 ($\bar{X}_1 = 52.41$). An independent-samples t test revealed that this mean difference was statistically significant at the .05 level (see Table 14.4). Because large sample sizes can produce statistical significance for small (and possibly trivial) differences, we also determined the effect size in order to capture the magnitude of this mean difference. From Table 14.4, we see that the raw mean difference of -1.93 points corresponds to an effect size of -.21. Remember, we are subtracting $\bar{X}_2$ from $\bar{X}_1$ (hence the negative signs). This effect size indicates that the mean reading achievement of Group 1 students was roughly one-fifth of a standard deviation below that of Group 2 students, a rather small effect.
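The reading result can be checked from the summary statistics in Table 14.3 alone. The sketch below is a minimal illustration, assuming the pooled-variance form of the t test and an effect size d defined as the mean difference divided by the pooled standard deviation; the function name `pooled_t_and_d` is hypothetical.

```python
import math

def pooled_t_and_d(n1, mean1, sd1, n2, mean2, sd2):
    """Pooled-variance t ratio and effect size d from summary statistics."""
    df = n1 + n2 - 2
    # Pooled variance estimate: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    # Standard error of the difference between means.
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    diff = mean1 - mean2
    t = diff / se_diff
    # Effect size: mean difference in pooled standard deviation units (assumed definition).
    d = diff / math.sqrt(pooled_var)
    return diff, t, df, d

# Reading scores from Table 14.3: Group 1 first, then Group 2.
diff, t, df, d = pooled_t_and_d(332, 52.41, 9.17, 163, 54.34, 9.08)
print(f"diff = {diff:.2f}, t = {t:.2f}, df = {df}, d = {d:.2f}")
```

Rounded to two decimals, these values agree in magnitude with the READ row of Table 14.4; the negative signs follow from subtracting the Group 2 mean from the Group 1 mean.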
We obtained similar results on the mathematics measure. The difference again was statistically significant, in this case satisfying the more stringent .001 significance level. The effect size, d = -.31, suggests that the difference between the two groups in mathematics performance is roughly one-third of a standard deviation. (It is tempting to conclude that the mathematics difference is larger than the reading difference, but this would require an additional analysis: testing the
Table 14.3 Statistics for Reading, Mathematics, and GPA
                  n      $\bar{X}$    s       $s_{\bar{X}}$
READ   Group 1    332    52.41        9.17    .500
       Group 2    163    54.34        9.08    .710
MATH   Group 1    332    52.44        9.57    .530
       Group 2    163    55.34        8.81    .690
GPA    Group 1    336    2.46         .58     .030
       Group 2    166    2.54         .58     .050
Table 14.4 Independent-Samples t Tests and Effect Sizes
        $\bar{X}_1 - \bar{X}_2$    t        df     p (one-tailed)    d
READ    -1.93                      -2.21    493    .014              -.210
MATH    -2.90                      -3.24    493    .001              -.310
GPA     -.08                       -1.56    500    .059              -.140
statistical significance of the difference between two differences. We have not done that here.)
Finally, the mean difference in GPA was $\bar{X}_1 - \bar{X}_2 = -.08$, with a corresponding effect size of -.14. This difference was not statistically significant (p = .059). Even if it were, its magnitude is rather small (d = -.14) and arguably of little practical significance. Nevertheless, the obtained p value of .059 raises an important point. Although, strictly speaking, this p value failed to meet the .05 criterion, it is important to remember that ".05" (or any other value) is entirely arbitrary. Should this result, p = .059, be declared "statistically significant"? Absolutely not. But nor should it be dismissed entirely. When a p value is tantalizingly close to $\alpha$ but nonetheless fails to meet this criterion, researchers sometimes use the term marginally significant. Although no convention exists (that we know of) for deciding between a "marginally significant" result and one that is patently nonsignificant, we believe that it is important to not categorically dismiss results that, though exceeding the announced level of significance, nonetheless are highly improbable. (In the present case, for example, the decision to retain the null hypothesis rests on the difference in probability between 50/1000 and 59/1000.) This also is a good reason for reporting exact p values in one’s research: It allows readers to make their own judgments regarding statistical significance. By considering the exact probability in conjunction with effect size, readers draw a more informed conclusion about the importance of the reported result.
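To see how narrow the margin is, the one-tailed probabilities can be approximated from the three t ratios. The sketch below is only illustrative: it takes the t ratios as negative (because $\bar{X}_1 - \bar{X}_2$ is negative in each case) and substitutes the standard normal curve for the t distribution, an assumption that is reasonable only because df is near 500. The helper name `one_tailed_p_lower` is hypothetical.

```python
from statistics import NormalDist

def one_tailed_p_lower(t):
    """Approximate P(T <= t) with the standard normal curve (large-df assumption)."""
    return NormalDist().cdf(t)

alpha = 0.05
# t ratios for reading, mathematics, and GPA, taken as negative (see lead-in).
t_ratios = {"READ": -2.21, "MATH": -3.24, "GPA": -1.56}

results = {}
for label, t in t_ratios.items():
    p = one_tailed_p_lower(t)
    results[label] = p
    decision = "reject H0" if p < alpha else "retain H0"
    print(f"{label}: t = {t:.2f}, approx. p = {p:.3f}, {decision}")
```

Under this approximation the GPA probability sits just above the .05 cutoff while the other two fall below it, which is the situation the paragraph above describes. An exact p value would come from the t distribution with the appropriate df.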
Suggested Computer Exercises
Exercises
1. Access the students data set, which contains grade-point averages (GPA) and television viewing information (TVHRSWK) for a random sample of 75 tenth-grade students. Test whether there is a statistically significant difference in GPA between students who watch less than two hours of television per weekday and those who watch two or more hours of television. In doing so,
(a) set up the appropriate statistical hypotheses,
(b) perform the test ($\alpha = .05$), and
(c) draw final conclusions.
2. Repeat the process above, but instead of GPA as the dependent variable, use performance on the reading and mathematics exams.
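One way to approach these exercises in code is sketched below. The records are made-up values for illustration only, not the actual students data set, and the sketch assumes that TVHRSWK holds hours of television per weekday, as the exercise wording implies. The function name `independent_t` is hypothetical.

```python
import math
from statistics import mean, variance

# Hypothetical (GPA, TVHRSWK) records; NOT the real students data set.
records = [(3.4, 1.0), (2.9, 3.0), (3.1, 0.5), (2.5, 4.0),
           (3.6, 1.5), (2.2, 2.0), (3.0, 2.5), (3.3, 0.0)]

# Form the two independent groups from the viewing variable.
low_tv = [gpa for gpa, tv in records if tv < 2]    # less than two hours
high_tv = [gpa for gpa, tv in records if tv >= 2]  # two or more hours

def independent_t(x, y):
    """Pooled-variance t ratio and df for two independent samples."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    # statistics.variance uses the n - 1 denominator (sample variance estimate).
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(x) - mean(y)) / se_diff, df

t, df = independent_t(low_tv, high_tv)
print(f"t = {t:.2f}, df = {df}")
```

Because the exercise asks only whether a difference exists, the obtained t would be compared with the two-tailed critical value for the resulting df at $\alpha = .05$. For Exercise 2, the same function could be applied with reading or mathematics scores in place of GPA.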
Identify, Define, or Explain
Terms and Concepts
independent samples
dependent samples
sampling distribution of differences between means
standard error of the difference between means
population variance
assumption of homogeneity of variance
variance estimate
pooled variance estimate
assumption of population normality