Interpreting Research on School Resources and Student Achievement: A Rejoinder to Hanushek
Author(s): Rob Greenwald, Larry V. Hedges, and Richard D. Laine
Source: Review of Educational Research, Vol. 66, No. 3 (Autumn, 1996), pp. 411-416
Published by: American Educational Research Association
Stable URL: http://www.jstor.org/stable/1170530



Review of Educational Research Fall 1996, Vol. 66, No. 3, pp. 411-16

Interpreting Research on School Resources and Student Achievement: A Rejoinder to Hanushek

Rob Greenwald
University of Chicago

Larry V. Hedges
University of Chicago

Richard D. Laine
Illinois State Board of Education

Hanushek's (1996) comment illuminates some of the ways in which he differs from us in approach and interpretation. Hanushek misunderstands the interpretation of meta-analytic results when (as in the present case) studies produce a range of different, but positive, effects. We disagree with Hanushek on the role of statistical independence. While we do not regard multiple analyses of data on the same individuals to be as informative as analyses of independent data sets, Hanushek treats the two as equivalent. While publication bias remains a concern in research, we show that even overcompensating for its likely effects does not substantially influence our results. While disagreements persist, scholarly debate should not obscure the fact that the best evidence, upon close inspection and the application of appropriate statistical methodology, demonstrates that student achievement is related to resource availability.

The question of allocation of resources to schools has indeed been marked by controversy, with various factions defending entrenched positions. All too often, research evidence has been used as part of the political rhetoric with little regard for how it might aid in the comprehension of the difficult problems facing our nation's schools. Hanushek's writings over the last two decades leave little doubt as to his position. Our own work in this area was motivated by the belief that methodological issues in the review of research on education production functions should be addressed. Our conclusions have been tempered by a skepticism of education production function research and of the utility of general policy prescriptions based on it.

Hanushek (1996) attributes to us four substantive positions that we do not endorse: that schools are currently working well, that they are providing a good return on investment in education, that performance problems are attributable to poorer students, and that investing more money in current schools would be wise. Since we have not stated or implied any of these positions, Hanushek's assertion is puzzling, as is his claim that our "manipulations and interpretations systematically distort the conclusions that should be drawn from the evidence" (p. 397).1 These are serious charges, and it is important to examine the claims in detail.

We thank Harris Cooper for a helpful discussion concerning this rejoinder.


Analysis Issues

Hanushek (1996) claims that meta-analytic methods require that the characteristics of the studies to be combined (in this case, "all of the schooling situations" [p. 398]) are identical and that meta-analysis is therefore not applicable to production function studies. This is contrary to the consensus among meta-analytic experts (e.g., National Research Council, 1992). Examples of meta-analyses of studies with very different characteristics include reviews of studies of prediction of performance in all U.S. law schools (Rubin, 1980), prediction of performance in 755 very diverse jobs in the U.S. economy (Hartigan & Wigdor, 1989), dose-response functions in different species for the purposes of cross-species generalization (DuMouchel & Harris, 1983), and estimates of the value of air quality using different econometric modeling strategies (Smith & Huang, 1995). All of these examples combine information from regression analyses such as those employed in production function studies.

Hanushek is critical of combined significance tests because they are not intended to yield stronger conclusions. But he agrees with our interpretation: The tests show that at least some studies show real positive effects of resources, and there is no evidence of any real negative effects of resources. Hanushek may be correct that there is a "distribution of underlying [resource effect] parameters" (p. 402), but the combined significance tests demonstrate that the distribution is composed of nonnegative values. Thus the question is how to interpret the fact that the data imply a distribution of nonnegative resource effects.
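The logic of combined significance tests can be sketched with Fisher's method, a standard procedure of this kind. This is a minimal illustration only, not a reproduction of the specific tests in our analysis, and the p-values are hypothetical:

```python
import math

def fisher_combined_p(p_values):
    """Fisher's method: X = -2 * sum(ln p_i) follows a chi-square
    distribution with 2k degrees of freedom under the joint null
    hypothesis that every study's true effect is zero."""
    k = len(p_values)
    x = -2.0 * sum(math.log(p) for p in p_values)
    # The chi-square survival function has a closed form for even
    # df = 2k:  P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = x / 2.0
    term = math.exp(-half)
    total = term
    for i in range(1, k):
        term *= half / i
        total += term
    return total

# Four studies, none individually significant at the .05 level,
# are jointly strong evidence against the hypothesis that all
# underlying effects are zero.
print(fisher_combined_p([0.08, 0.12, 0.06, 0.10]))  # about 0.012
```

The point of such a test is exactly the interpretation given above: rejecting the joint null establishes that at least some studies have real nonzero effects, not that every effect is the same.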

Hanushek misinterprets the meaning of variation in resource effects across studies as indicating that resources produce effects in some schools but not in others. Such between-school variation in effects reflects variation about the achievement-on-resource regression line; it contributes to the error term within a study and reduces the R-square, but it need not alter the net resource effect. If the resource effect is positive for a study, it means that the net effect of resources is positive for the collection of schools in the study.

Variation across studies in resource effects implies that the net resource effect is not the same in all studies. This might be because the studies differ in the composition of schools that they include. More likely, variation is driven by the differences in study design and model specification. The critical fact is that these variations take place in the context of a generally positive distribution of resource effects. Given Hanushek's concern about the distribution of resource effects, we are surprised that he ignored the most direct evidence about it, the plots of those distributions. They verify that estimated resource effects are overwhelmingly positive.

It is gratifying to see how much Hanushek's position has changed since our earlier work (Hedges, Laine, & Greenwald, 1994). He has moved from the position that there is no systematic effect of school resources on student achievement to agreement that there is a distribution of results, with some, perhaps most, of the studies finding a preponderance of schools in which greater resources are associated with greater achievement. Although he is still reluctant to examine the coefficients themselves (preferring to interpret them via their statistical significance alone), he agrees in principle that examining the distribution of effects is a reasonable thing to do. Perhaps he will someday move beyond statistical significance in his interpretation of regression equations, as has been urged by other economists (McCloskey & Ziliak, 1996).

Sample Selection

Hanushek argues that we used a "very selective sampling of available results" (p. 400) because we were "out to show that there is a statistically significant relationship" (p. 400) between resource variables and achievement. In fact, we selected studies on the basis of methodological quality and completeness of reporting according to six criteria which were explicitly stated in our article, none of which depended on the outcome of the study. In his response, Hanushek provides a table (his Table 1) which he asserts is "a summary of results from a complete set of studies published through the end of 1994" (p. 399). We have described our criteria for choosing coefficients, but Hanushek has not. It is difficult to consider Hanushek's new "data" as scientific evidence until he reveals something about the procedures used to obtain publications and extract information from them.2

For the sake of argument, accepting that Hanushek's unspecified data collection procedures are reasonable permits some revealing analyses. In the past, Hanushek has used a different definition of study from ours. We require that studies be independent, but Hanushek does not. Hanushek is content to count results based on the same individuals as many times as they happen to be reported, giving each weight equal to that of an independent replication.3 Thus if there are 10 coefficient estimates derived from the same individuals, Hanushek would count these as 10 "studies," but we would count them as 1 study. By counting results based on the same individuals more than once, Hanushek appears to have many more studies than we do. His Table 2 juxtaposes the number of coefficients he obtains with the number we used, and he describes the ratio as an indication of sampling rate in his Table 3.

There is another interpretation of this comparison. Since the principal difference between our data set and Hanushek's is Hanushek's decision to count results based on the same data more than once, the inverse of his sampling rate (given here in Table 1) reflects the average number of times Hanushek counts each data set. That is, if there are 2 independent data sets, but Hanushek has 10 "studies," Hanushek is counting results based on each data set an average of 10/2 = 5 times.
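The consequence of the two counting rules can be shown with a toy example. The data-set identifiers and signs below are hypothetical, chosen only to illustrate how counting every reported coefficient can reverse the apparent balance of evidence:

```python
from collections import Counter

# Hypothetical records: (data_set_id, sign of the estimated resource
# effect). Regressions run on the same individuals share a data_set_id.
estimates = [
    ("A", "+"), ("A", "+"),
    ("B", "-"), ("B", "-"), ("B", "-"), ("B", "-"),
    ("C", "+"),
]

# Tally that counts every reported coefficient once, however many
# times the same individuals are reanalyzed:
per_estimate = Counter(sign for _, sign in estimates)

# Tally that counts each independent data set once (in this
# illustration the signs agree within a data set, so one sign per
# data set suffices):
per_dataset = Counter({ds: sign for ds, sign in estimates}.values())

print(per_estimate)  # negatives appear to outnumber positives, 4 to 3
print(per_dataset)   # but positive data sets outnumber negative, 2 to 1
```

One heavily reanalyzed negative data set is enough to make the coefficient-level tally look evenly divided even when most independent data sets point the other way.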

Table 1 (in this rejoinder) shows that Hanushek generally counts results based on each data set more than once and that he counts the results that give negative (and negative statistically significant) results more often than those that give positive (and positive significant) results. By systematically overcounting negative results, Hanushek is able to achieve the appearance that the evidence is more evenly divided than we found it to be.

TABLE 1
Number of times Hanushek's Tables 1-3 count results from each data set

                         Total       Statistically significant   Statistically insignificant
Resource                 estimates   Positive     Negative       Positive     Negative
Teacher/pupil ratio      4.3         3.1          5.3            2.3          5.8
Teacher education        4.5         2.3          1.5            3.7          4.6
Teacher experience       3.4         3.3          5.0            2.5          3.3
PPE                      6.0         2.9          11.0           11.2         5.2

Note. PPE = per-pupil expenditure.

The issue of independence is not merely an arcane assumption required for some "specialized [statistical] procedure" (p. 407) but is fundamental to what we think we can learn from empirical evidence. Information content is primarily a property of the data set, not the analysis. True, one can do a more or less informative analysis, but one cannot gain as much information by repeated analyses of the same data (from the same people) as from new data sets. If that were possible, there would be no need for replication with new data; we could simply reanalyze a data set we already have. To put it another way, if your life depended on knowing that a medical procedure worked, would you rather have evidence from 50 independent studies or 50 different analyses of the same (single) study?

Publication Bias

Hanushek raises the issue of publication bias. As we point out, the best available empirical evidence on publication bias (albeit from another field) suggests that more than half of the studies with statistically nonsignificant results are likely to be published. Redoing our analysis and giving twice as much weight to each statistically nonsignificant result, which should more than compensate for the effects of publication bias, did not fundamentally alter either the direction or the magnitude of results.
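The reweighting idea can be sketched as follows, using hypothetical effect sizes and significance flags rather than our data:

```python
def weighted_mean_effect(effects, significant, nonsig_weight=2.0):
    """Weighted mean of study effect sizes, upweighting statistically
    nonsignificant results to (over)compensate for publication bias,
    which tends to leave nonsignificant studies underrepresented.
    With nonsig_weight=1.0 this reduces to the ordinary mean."""
    weights = [1.0 if sig else nonsig_weight for sig in significant]
    return sum(w * e for w, e in zip(weights, effects)) / sum(weights)

# Hypothetical effect sizes; True marks a statistically significant study.
effects     = [0.25, 0.40, 0.10, 0.05, 0.30]
significant = [True, True, False, False, True]

plain    = weighted_mean_effect(effects, significant, nonsig_weight=1.0)
adjusted = weighted_mean_effect(effects, significant, nonsig_weight=2.0)
print(plain, adjusted)  # the adjusted mean shrinks but stays positive
```

In this toy example, doubling the weight on the (typically smaller) nonsignificant effects pulls the mean down somewhat but leaves its sign and rough magnitude intact, which is the pattern we report for our actual reanalysis.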

Resource Effects May Vary Across States

Hanushek suggests that resource effects may differ in different states. The possibility of this or other interactions of resource effects with particular study characteristics is intriguing. It should be considered as a possible explanation of the variation within the generally positive distribution of resource effects as we refine our models of where resources are most effective. Such refinements will require analyses of the resource coefficients themselves, however, not tabulations of partially redundant counts of p-values, which Hanushek favors.

Social Capital

Hanushek argues that changes over the last few decades in social capital and other social context factors have had little effect or perhaps a positive effect on educational outcomes. Although some indicators of social capital (such as family size) have improved in the last few decades, other indicators that are important, if difficult to measure, have decreased. We believe that the net effect of changes in social capital, especially for certain minority groups, has been negative over this period. If Hanushek really believes that times are as good for America's children today as they have ever been, he is surely in the minority.

Longitudinal Studies

Hanushek criticizes us for not separating longitudinal from quasi-longitudinal studies and for not providing "a complete description of the studies" (p. 406). This is an odd complaint. Hanushek does not even cite the studies on which he relies, but our Appendix 2 describes each of our studies, and Table 6 separately reports the median effect sizes from longitudinal and quasi-longitudinal studies. Although the number of truly longitudinal studies is very small, and there are no truly longitudinal studies of the effects of teacher salary and only one of per-pupil expenditure, the median effect sizes for teacher experience, teacher education, and teacher ability are positive.

Conclusion

No one in this country is arguing that all of our public schools are doing well enough to meet today's standards, let alone the challenges facing our children in the 21st century. We recognize that we must look for greater returns on the public's investment in our children, and we must find policies and practices which improve the quality of teaching and learning, especially in light of the declining growth expected in future education spending. All of this necessitates an understanding of how money matters in education. However, before this is addressed, the fact that so many children attend schools with limited resources demands that policymakers examine empirical evidence about the question of whether money matters. Our findings demonstrate that money, and the resources those dollars buy, do matter to the quality of a child's education. Thus policies must change to ensure that all children have sufficient resources and that incentives to spend those resources wisely are in place. Even Hanushek now appears to concede this point.

Notes

1We suspect that he interprets our conclusions as being consistent with these positions (which he does not endorse, either) and is using them as a straw man in his reply.

2We believe that the state of the art in educational research has moved to a point that requires adherence to minimal standards in reporting of data collection in research reviews.

3This is particularly a problem in econometric studies, where several different regression models, only one of which the researcher may believe is correct, may be investigated in a search for the eventual model. In such cases we doubt that the primary researcher would condone the practice of treating estimates from all models as equally valid.

References

DuMouchel, W. H., & Harris, J. E. (1983). Bayes methods for combining the results of cancer studies in humans and other species. Journal of the American Statistical Association, 78, 293-308.

Hanushek, E. A. (1996). A more complete picture of school resource policies. Review of Educational Research, 66, 397-409.

Hartigan, J. A., & Wigdor, A. K. (1989). Fairness in employment testing: Validity generalization, minority issues, and the General Aptitude Test Battery. Washington, DC: National Academy Press.

Hedges, L. V., Laine, R. D., & Greenwald, R. (1994). Does money matter?: A meta-analysis of studies of the effects of differential school inputs on student outcomes. Educational Researcher, 23(3), 5-14.

McCloskey, D. N., & Ziliak, S. T. (1996). The standard error of regressions. Journal of Economic Literature, 34, 97-114.


National Research Council. (1992). Combining information: Statistical issues and opportunities for research. Washington, DC: National Academy Press.

Rubin, D. B. (1980). Using empirical Bayes techniques in the law school validity studies. Journal of the American Statistical Association, 75, 801-816.

Smith, V. K., & Huang, J. C. (1995). Can markets value air quality? A meta-analysis of hedonic property value models. Journal of Political Economy, 103, 209-227.

Received June 19, 1996
Accepted June 20, 1996
