
Need-Based Evaluation Theory: What Do You Need To Know To Do Good Evaluation?

WILLIAM R. SHADISH

INTRODUCTION

I want to propose a simple idea: Good evaluation responds to important social needs. The idea seems reasonable enough. But if it is accepted, it challenges some popular notions. For example, it challenges the idea that evaluation is applied social science methodology, or that evaluation is judging merit or worth. I believe these ideas are too narrow. Evaluation is far more than these because it responds to important social needs that extend beyond methods or valuing. In this paper, I propose a need-based theory of evaluation that provides a more complete and accurate picture of what evaluators need to know to do good work.

WHY SHOULD WE CARE ABOUT SUCH ABSTRACT MATTERS?

Every year, graduate students studying evaluation at Cornell University, Syracuse University, and University at Albany/SUNY host the Edward F. Kelly Evaluation Conference. The conference activities are dominated by evaluation students, who organize the program, give most of the talks, and contribute most to the proceedings (Lanahan & Eads, 1993). The seventh conference, held March 29, 1993 at Cornell, was devoted to the theme "Evaluation 2000: Perspectives on the Future". Consistent with this theme, many of the contributions aimed at identifying and discussing what students thought were the key issues facing evaluation in the future.

In the present context, the proceedings of that conference offer a unique opportunity. Being at such an early stage of their career, students are seeing much of evaluation for the first time, exploring ideas and methods that are new to them, and trying to determine what evaluation is and what they need to know to do it. What they write about evaluation can tell us a lot about both the future and the current condition of the profession. To use the old adage, they help us see ourselves as others see us.

William R. Shadish, Department of Psychology, The University of Memphis, Memphis, TN 38152.

Evaluation Practice, Vol. 15, No. 3, 1994, pp. 347-358. Copyright © 1994 by JAI Press, Inc. ISSN: 0886-1633. All rights of reproduction in any form reserved.


What students at the 1993 conference reported seeing should give us pause. They identified six key issues. At the top of the list was the issue of professional identity and practice. Consider these student comments:

It is often said, and proudly so, that there is diversity in evaluation, yet it seems more like fragmentation. Everyone seems to define evaluation differently. What is it?

Evaluation needs to increase its questioning of what constitutes evaluation. How is evaluation to define itself? What is its character?

Who are we? What is our real purpose? We are off in 10 different directions.

We don't seem to know who we are. And if we don't know that, then how can we even begin to critique my own practice and that of others? (Fournier, 1993, pp. 3-4).

These questions point to a serious problem. Apparently we do poorly the most essential things that any profession must do: clearly identify its tasks and functions, and the knowledge needed to do them. Physicians aim to prevent or cure physical ailments; to do this well, they need to know both academic topics like anatomy and chemistry, and also the daily functioning of the health care service delivery system. Lawyers aim to intervene at all parts of the legal system, from writing legislation to representing clients in court proceedings; and they need to know the law and the legal system to do that well. Without overlooking the legitimate debates that occur in such fields, students in those fields don't seem to express as much confusion about these tasks and functions, nor about the education they need to be a part of them, as do students in evaluation. By contrast, if we are to judge from these student observers, we have yet to convince students that we know what we are about in evaluation, and that we have a body of knowledge they need to learn to be one of us.

The purpose of this article is to address this problem. First, I will discuss significant problems with the two definitions of evaluation that are probably most prevalent today: evaluation as applied social science methodology, and evaluation as valuing. Second, I will propose that the definition of our field's tasks and functions should properly stem from the needs we are called to meet as a profession. I then suggest a definition of evaluation that flows from those needs, and describe the needs in more detail. These needs point directly to the knowledge base of the profession of evaluation, and to the coursework that would be useful in training evaluators.

Problematic Definitions of Evaluation

Most evaluators don't worry too much about how to define the field. After all, evaluation is mostly practice-driven, and once evaluators find employment they become more concerned with doing the tasks set before them than with theoretical debates about the nature of the tasks they do. They don't talk about doing; they just do. Even so, two definitions of evaluation probably dominate their implicit understandings of the nature of the field, and both are, at best, seriously incomplete.

Evaluation as Applied Social Science Methodology. The first definition of evaluation is that it is applied social science methodology. The origins of this definition are pretty obvious. During the modern era starting in the 1960s, early evaluators were mostly social scientists who took the methodologies they learned in their home disciplines (e.g., psychology, education, economics, sociology, anthropology, statistics) and applied them to whatever evaluative problem was at hand. So economists applied econometric models to evaluating job training programs; educators applied testing technologies to evaluating curricula and classroom interventions; and psychologists applied experiments to evaluating the outcomes of practically everything.

But problems with this definition of evaluation quickly became apparent. It is certainly true that evaluators apply social science methodologies to evaluation tasks. But that is not all they do, nor all they are called on to do. For example, evaluators also choose to enter the policy making fray by making recommendations about how programs, curricula, or other things can be improved, or even whether they should be terminated. They often hope that their work might be used by someone, somewhere, to make better decisions; and they take sides in discussions about values over the choice of dependent variables, interventions to study or not study, or whose views to consult prior to and during the evaluation. Sometimes they do these things on their own initiative, but other times they are asked to do so by others in positions of social authority, as when they are asked to make recommendations about program, product, or personnel improvement, to judge whether the program was worth continuing, to provide specific information in a usable time frame for a particular purpose, and to provide information that might help adjudicate disputes among stakeholders with very different values (e.g., Republicans versus Democrats). Social science methodology has no particular resources to meet these needs. Hence, on such grounds the definition of evaluation as applied social science methodology has been relegated to the dustheap of evaluation's intellectual history, although many practicing evaluators probably still endorse it, either because they have not thought much about the issues or because we have not given them a better alternative. Evaluation is applied social science methodology in many cases, but it is always much more than that.

Evaluation as Valuing. Evaluators who think about it realize that there is a major alternative to the social science methodology definition of evaluation. This definition has been championed most forcefully by Michael Scriven for nearly 30 years (e.g., Scriven, 1966, 1967). It says that evaluation is about valuing, about determining the merit or worth of whatever is being evaluated (the evaluand). This definition has much to commend it.

1. It appeals to the dictionary definition of the word evaluation, and to the fact that most of the word value is contained in the word evaluation. After all, if you look up evaluation in the dictionary, you do not find "applied social science methodology" as a definition; but you do find the word "value" included in the definition.

2. A little thought shows that the value definition of evaluation can incorporate much of the applied social science methodology definition. The reason is that the gathering of data from which to construct value statements mostly uses traditional social science methods. So thinking of evaluation as valuing can incorporate many of the good things we have learned about applied social science methodology.

3. It helps remedy at least one of the salient weaknesses of the applied social science methodology approach: its failure to address explicitly how evaluators can best relate their data to the values that almost inevitably bear on evaluative discussions in any applied arena. Indeed, it makes clear that those values cannot be avoided even if we tried, and so suggests ways that we can do a more explicitly thoughtful job of something we are already doing implicitly.

4. In his many works over the years, Scriven (1980, 1991) has elaborated this definition into a full-scale theory of valuing with extraordinary conceptual richness, detail, and fecundity. In fact, there is nothing else remotely like Scriven's theory of valuing in evaluation; and it is too little known and used in evaluation today. We can only gain, not lose, by adding Scriven's conceptualization of evaluation to the fund of knowledge about applied social science methodology that already exists.

But this definition of evaluation is also incomplete if viewed as a complete description of what evaluators need to do in their work. One has only to run through the previous litany of things that evaluators do, or are called on to do by others, to see that this definition has not addressed all the problems that evaluators face. Left out are such important issues as how the things we evaluate start, change, and end; and how evaluative information is used in that process. Scriven's answer to this challenge is both blunt and subtle. Bluntly, he says that these things are not the main business of the evaluator, whose main job is to make value judgements. If you doubt that, he says, you do not understand logically what it is to evaluate. More subtly, he argues that the profession of evaluation has the chance to occupy a unique niche in the world, that of being the specialists to whom others turn when they want scientific information about whether something is good or bad. The argument is subtle because it is right (there is no profession currently in that niche), and because it appeals to the need in all of us to have a unique place in the world. Such a task would indeed make the profession of evaluation unique. And it would also give us an agenda for training, lifted largely from both applied social science methodology and from the philosophical logic of the process of valuing. All we need do is add a few courses and practicum training in value theory and ethics, both from philosophy and other applied disciplines.

But both the subtle and the blunt answers are flawed. The blunt answer is flawed because it confuses philosophical logic with a profession. To say that such things as use or social change or product improvement are not the business of evaluators is literally wrong. In fact, those who are in the business of evaluation, in the literal sense of making a living by doing it, often do need to know about use, about social change, or about product improvement. If they don’t, they won’t get much business. If evaluation is a profession, then it is also a business, and its customers ask it to go beyond applying methods and making value judgements about merit or worth. To limit the profession of evaluation to valuing is to limit it to the philosopher’s task. Who will pay us to do that? Wouldn’t they be better off paying a philosopher?

The subtle answer is flawed because it assumes that in order to occupy this unique social niche of being the professionals who do scientific valuing, professional evaluators must give up all other tasks, such as a concern for use. To oversimplify, this is akin to saying that lawyers should only render judgements about whether something is legal, but should not write contracts, create or improve legislation, participate in arbitration, or be in any way concerned with whether or not their work is useful to the clients who pay them. It is akin to saying that physicians should only tell you if you are sick, but not tell you how to get better, not be concerned with the place of medicine in the political-economic system, and not be concerned with things in the doctor-patient relationship that might affect whether you use the physician's advice. But lawyers and physicians do all these things, and we are glad they do. Lawyers and physicians do not lose their niche in the social world because they take on these extra tasks, and neither will we. In fact, to the extent that lawyers or physicians fail to address these tasks, they risk losing their niche to those who will do so. And so will we.

Recently, however, Scriven (1994) suggested a way of resolving this entire set of criticisms. He began by distinguishing between theories of evaluation versus theories about evaluation. Theories of evaluation are general accounts of valuing, such as what the nature of valuing is, how it is done, or what its logic is. Theories about evaluation concern the sociological, psychological, economic, and political aspects of the practice of evaluation. Most of his work has concerned theories of evaluation, not theories about evaluation. In this light, Scriven's work on valuing may not have been intended as a complete description of the practice of evaluation, nor as a denial of the importance to professional practice of such things as evaluation use. To admit this is not in any way to diminish the importance of valuing in evaluation; it is just to see it as one of several important needs we fill in professional evaluation practice.

Summary

Put simply, both the notion of evaluation as valuing, and the notion of evaluation as applied social science methodology, share the same problem. Both are insufficient to meet the needs that the profession of evaluation is called on to address. If we train students to rely on such notions as their primary resource, they will not be able to do the full job that society asks them to do. Both society and the profession of evaluation will be worse off as a result. Which brings me to my central point.

NEED-BASED EVALUATION

If we wanted to know what good evaluation would be, one approach (the one I advocate) takes advantage of what we know about the logic of making value statements. Scriven (1980) has proposed a logic of valuing that consists of four steps:

1) selecting criteria of merit on which the thing being evaluated should do well;
2) setting standards of performance on those criteria for how well the thing must do;
3) gathering data on the criteria relative to the standards; and
4) synthesizing the result into a statement of whether the thing is good or not.

He proposes that the first step, criteria selection, be done based on a needs assessment. That is, a thing is good if it meets needs; a need, in turn, exists if harm would be done if the need is not met. If we use this to evaluate the profession of evaluation, evaluation would be good if it met important needs, and if a failure to meet those needs would result in harm. In this spirit, the best definition of evaluation would be one that is based on a thorough understanding of the needs that evaluation meets. Otherwise, the definition risks leading evaluators away from that which they need to be doing.
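To make the logic concrete, here is a minimal sketch in Python of the four steps as just summarized. Everything in it is my own illustrative simplification rather than Scriven's formal apparatus: the names (Need, evaluate, measure) are invented, the harm test is reduced to a boolean flag, and the synthesis step is crudely modeled as meeting every standard, whereas Scriven's treatment of synthesis is far richer.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Need:
    criterion: str       # something the evaluand should do well on
    harm_if_unmet: bool  # Scriven: a need exists if harm follows from not meeting it
    standard: float      # minimum acceptable performance on this criterion

def evaluate(evaluand: str, needs: List[Need],
             measure: Callable[[str, str], float]) -> Dict[str, object]:
    # Step 1: select criteria of merit, here via a needs assessment.
    criteria = [n for n in needs if n.harm_if_unmet]
    # Step 2: standards of performance travel with each criterion.
    # Step 3: gather data on each criterion relative to its standard.
    scores = {n.criterion: measure(evaluand, n.criterion) for n in criteria}
    # Step 4: synthesize the results into a statement of merit
    # (crudely here: "good" only if every standard is met).
    good = all(scores[n.criterion] >= n.standard for n in criteria)
    return {"scores": scores, "good": good}
```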


A Need-Based Definition of Evaluation

In general, then, my argument is this: the need to which evaluators respond is to use feasible practices to construct knowledge of the value of the evaluand that can be used to ameliorate the problems to which the evaluand is relevant. This description refers to five components: practice, knowledge construction, valuing, knowledge use, and the evaluand. First let me describe each component in slightly more detail (see Shadish, Cook, & Leviton, 1991, for elaboration), and then I will return to arguing that each refers to a need.

The knowledge construction component is concerned with what counts as acceptable knowledge about the evaluand, with methods to produce credible evidence, and with philosophical assumptions about the kinds of knowledge most worth studying. Methodology falls here, as do more philosophical debates in epistemology and ontology about why various methods are more or less preferable.

The evaluation practice component concerns the things evaluators do as they practice their profession. It deals with the role of evaluators in relating to stakeholders; the sources of questions; how to decide which questions to ask; and what methods to use given priorities among questions, the issues about which uncertainty is greatest, and constraints of time, financial resources, staff skills, and procedural standards.

The use component concerns how social science information may be applied in working with the evaluand. It deals with the possible kinds of use, the relative weight to be given to each kind of use, and what evaluators can do to increase use.

The component that refers to the evaluand concerns the nature of the thing being evaluated, and its role in problem solving. It deals with the internal structure and functioning of the evaluand, its relationship to other parts of society, and the processes through which the evaluand and its constituent parts can be changed to improve its performance. If we are evaluating a social program, then we need to know how that program operates, its environment, and how it changes; if we are evaluating a product, we need to know the same things about it.

The valuing component deals with which values ought to be represented in an evaluation, and how to construct judgements of the worth of the evaluand. Debates about ethical issues like justice fall here, but so do discussions of stakeholder roles in constructing and critiquing the evaluation.
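As a summary device only, the five components can be rendered as a checklist an evaluator might review before starting work; the field names and prompts below are my paraphrases of the descriptions above, not part of the theory itself.

```python
from dataclasses import dataclass, fields

@dataclass
class NeedBasedEvaluation:
    knowledge_construction: str  # what counts as credible knowledge of the evaluand?
    practice: str                # which questions and methods, under real constraints?
    use: str                     # how might results be used, by whom, and when?
    evaluand: str                # how does the thing being evaluated work and change?
    valuing: str                 # whose values, which criteria, what final judgement?

def unmet_components(plan: NeedBasedEvaluation) -> list:
    """Return the components a plan leaves blank; on the argument that
    follows, each blank component is a need whose neglect risks harm."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name)]
```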

If the need-based definition of evaluation is to hold, all five of these components must be needs; that is, harm must be done if they are not met. We can probably get quick consensus that two of these components are plausible needs: knowledge construction and evaluation practice.

The Need to Construct Knowledge. It is hard to imagine any evaluator objecting to the knowledge construction requirement. What harm would be done if professional evaluation ignored this component? The result simply would not be evaluation, because an evaluative conclusion is based on knowledge of the evaluand. A more interesting question would have to be more specific. For example, what harm would be done if professional evaluators generated knowledge that is not empirical in the broad sense of the word, but relied mostly or entirely on, say, gossip or professional lore? Whether we are quantitative or qualitative evaluators, whether we use social science methods or technology assessment methods, we probably all agree that the methods we use meet a need that gossip or professional lore do not. And we also probably agree that society would be worse off without this knowledge, because gossip and professional lore may be less accurate under many circumstances. So while we can and should actively debate the exact kind of knowledge that evaluation needs to produce, the need for accurate empirical knowledge itself is probably the most basic need.

The Need for Evaluation Practice. Similarly, it is hard to imagine any evaluator objecting to the requirement that evaluators need to construct knowledge in a resource environment characterized by a diversity of options coupled with severe constraints. Almost anyone who has practiced evaluation will endorse this need. After all, we cannot interview every single stakeholder, ask every question, or use every method. We are usually lucky to get an answer to one or two questions, using one or two methods. The world of evaluation practice is a world characterized by such constraints, and so by choices that have to be made. What harm would be done if evaluators ignored these constraints? In many cases, the resulting evaluative information would fail to be responsive, timely, cost-effective, or relevant; and such information would, in turn, waste time, money, and other scarce social resources.

The Need for Use. Now we get to the three components that some evaluators have argued explicitly are not the business of the evaluator. To take the first issue, what harm is done if professional evaluators fail to consider how evaluation results might be used? First, it threatens the very economic basis of the profession itself. Society spends large amounts of money on evaluation, and we have every reason to think that it does so in expectation of receiving a useful result in return. If we fail to live up to this social contract, we may lose the professional basis of the field, although the academic interests might still survive. But that is a self-serving concern, so we might note a second important harm. Imagine that an evaluation finds, say, that a product causes harm to consumers or to the environment, or that program managers are failing to spend federal money as authorized in legislation. If that information is not used, the product harm or the program mismanagement may continue, causing further harm that could have been prevented. If it is not used because some evaluators simply decide that use is none of their business even though they could have done something constructive, those evaluators are at least partly responsible for that harm.

The money that society gives to evaluators could be spent for other purposes, including for more social programs, for product advertising, for personnel development activities, for reducing the deficit, or for any of a host of other alternatives that might yield socially valued results. For example, in the 1970s Congress mandated that local community mental health centers (CMHCs) set aside two percent of their funds to do self-evaluation (Cook & Shadish, 1982). No new money was provided, so the funds came from all the other activities that CMHCs do in providing services to ameliorate mental health problems. A good many of those problems cause real emotional and even physical harm, as when some chronically mentally ill patients cannot care properly for themselves some of the time. If evaluators ignore use because they think it is not part of their job, then they are partly responsible for a waste of money that could have been spent remedying these harms.

In the end, of course, it is the user who must decide whether to use results, not the evaluator. But this is not license for the evaluator to ignore the issue. By outlining the kinds of use that may be possible, the time frames in which such uses might occur, the steps that need to be taken to facilitate use, and the tradeoffs involved in these choices, the evaluator both educates the client about the client's choices, and ensures that harms don't occur needlessly when they could have been prevented. The argument is not that the main job of evaluators is to get their results used. Nor is it that evaluators must focus on short-term instrumental use rather than long-term conceptual use. The argument is simply that evaluators need to consider the potential usefulness of their work, to communicate about these matters with stakeholders of evaluation, and to ensure that the usefulness of the resulting evaluation is consistent both with the expectations of concerned parties and with the need for informed social action that may be implicit in the problem being investigated.

The Need to Know the Evaluand. At first blush, this need may seem obvious, for how else could one evaluate if one did not have knowledge about what one was evaluating? But the present claim is stronger than that. The need is to know the internal structure and functioning of the evaluand, its relationship to other parts of society, the socioeconomic environment in which it exists, its role in problem solving, and the processes through which the evaluand and its constituent parts can be changed to improve its performance. So again we ask, what harm is done if these matters are ignored? The answer is that such knowledge is needed for evaluation to be connected to problem solving. Without such knowledge, the problems and the harms caused by those problems live on. After all, professional evaluation is not a mere academic enterprise. Rather, it is a key element in problem solving, whether those problems are economic, social, commercial, or political, and whether they occur in the public or private sector. If evaluation is not connected to the problem solving process, its results will, at best, be relevant to that process by chance.

An example concerns the evaluation of social programs. Many social programs are very large administrative entities that themselves fund and administer smaller projects scattered throughout the nation. The national Head Start program is an example, administering many local Head Start projects. Over the last three decades, we have learned that such programs start and end infrequently, and when they do end, it is usually due to economic or political reasons rather than to a negative evaluation of program results. Evaluations are disconnected from this reality when it is assumed that social programs will be ended if evaluation results suggest the program is ineffective. Yet this very assumption was prevalent in some approaches to evaluation in the 1960s. There is, of course, a role for such evaluations in the social problem solving process; but it is a role that should be thoughtfully chosen on some occasions rather than blindly realized through ignorance of its limitations.

Similarly, evaluations that ignore this need are more likely to produce conclusions that have no place in the current socioeconomic system. For example, a school evaluation that recommends firing teachers for poor performance without thinking through the politics and economics of the teaching profession in school systems is, at best, incomplete, and at worst, harmful to the problem solving process. Similarly, an evaluation of a model program for treating the chronically mentally ill that ignores the politics of putting small group homes in single family residential neighborhoods risks harming patients if local backlash results in less support for this needy population. If professional evaluators can contribute to these harms by their work, then they are also responsible for doing what they can to prevent such harms from occurring. To do this, they must know the evaluand.

The Need for Valuing. For all the reasons previously outlined, Scriven would no doubt agree that valuing is something evaluators need to do; some other evaluators give lip service to this need, too (Shadish, Cook, & Leviton, 1991). But some evaluators bridle at this suggestion, seeing their job as being the provision of information rather than the discussion of values, the making of value judgements, or the consideration of ethical issues like justice. So what harm would be done if evaluators ignored values? In one sense, no harm would be done, because evaluations would still have values implicit in them. Values inevitably permeate the selection of independent and dependent variables, the choice of questions and stakeholders, and the social and political context from which many evaluations arise. Evaluators cannot avoid values even if they try.

But in another sense, real harm is done if evaluators deal with values naively or poorly through their implicit choices. We learned this early in evaluation. For example, if the values of a particular stakeholder group are not considered, they may feel morally and politically slighted, and as a result be uncooperative during the course of the work and be critical in subsequent debates about evaluation results. If their values are misunderstood, they may see evaluation as less relevant than otherwise. House, Glass, McLean, and Walker (1978) claimed this happened with the Follow Through evaluation. Program developers said evaluation measures did not tap the constructs they thought the program would change. Parents said they had been excluded from decisions about evaluation so it did not reflect their interests. These interests, these constructs, reflect stakeholder values. As a consequence of dealing poorly with these values, the evaluation had less credibility with these stakeholders, and they undermined it in subsequent debates. Today we take it for granted that we must understand the value context in which we work, and the stake that various stakeholders have in the evaluation.

But the argument for consideration of values is stronger than this. Scriven (1980) argues that harm is done if evaluators do not explicitly consider the four steps of the logic of valuing; that is, if they do not:

1) try to surface all the criteria on which the evaluand must do well to be good or bad;
2) consider both absolute and comparative standards of performance for how well the evaluand must do on those criteria;
3) gather data well on these things; and
4) combine the results into a final value judgement.

On the third step, few evaluators would disagree, for this speaks mostly to the knowledge construction and evaluation practice issues about which some consensus exists. On the first two steps, Scriven has a point not sufficiently appreciated by critics. If one accepts the basic premise that we are involved in a process in which either evaluators or stakeholders are making value judgements about the evaluand, then harm may occur when we do not do Scriven's first two steps explicitly and thoughtfully. For example, we may ask whether deinstitutionalization results in greater patient integration into society; but for deinstitutionalization to be good, we also need to know if patients are happy, if they are floridly symptomatic, if costs exceed benefits, if crime increases, if property values are affected when deinstitutionalized patients move into small group homes in single family neighborhoods, and so on. On all these criteria, we may ask if patients do better than they would have in the hospital; but for deinstitutionalization to be good, we also need to know if patients do better in nursing homes, in board-and-care homes, in jails, or with their families; and we need to know whether their actions on each criterion violate accepted standards such as the law. Explicit use of these first two steps in Scriven's logic helps surface such options. If we systematically overlook some options, real harm may be done. With deinstitutionalization, for example, most evaluations systematically overlooked nursing home placements for patients, even though more deinstitutionalized patients resided there than in any other setting. Those who think nursing homes harm patients believe this oversight causes real harm to deinstitutionalized patients.

Many evaluators might agree to this much, but point out that one need not invoke values to do so. After all, a thorough consideration of the first two steps in Scriven's logic can be thought of as (a) considering all relevant dependent variables, and (b) considering which alternative interventions might affect those dependent variables. But such a response would be a welcome development, because it would be an acknowledgement that explicit consideration of these steps, whether or not we call them part of valuing, is better than casual consideration of them. Scriven's point is that if you do these steps, you are doing the things that people do when they make a value judgement. It probably is better that these steps be done explicitly rather than haphazardly, using the best that philosophy and other disciplines have to offer.

Even if evaluators might agree to these arguments about the first three steps, they might disagree that attention ought to proceed to the fourth step in Scriven's logic: making value judgements that the evaluand is good or bad. If evaluators simply present value-relevant evidence and let readers draw their own value judgements, would that harm the profession or society? Only if stakeholders are poor at combining evidence, so that they reach the wrong conclusion and then do something harmful as a result. But with a few exceptions, stakeholders are probably reasonably good at combining evidence from the perspective of what is best for them. Their weakness might be in taking the perspective of what would be good for others, or for society as a whole. Depending on the power and responsibilities of the stakeholder, this might cause real harm. A stakeholder who is a policymaker, for example, probably should be able to consider multiple points of view, including the social good. To ensure these points of view are represented, therefore, the evaluator probably should attempt to show how different perspectives lead to different final value judgements; better to make this explicit than to risk its being overlooked. For instance, Cook, Appleton, Conner, Shaffer, Tamkin, and Weber (1975) found that Sesame Street teaches some alphabet skills to economically disadvantaged children who view the show regularly. But the disadvantaged watched the show less often than affluent cohorts, who learned even more skills. So Sesame Street increased the gap in skills between these two groups. Are the gains of disadvantaged children worth widening the gap between them and advantaged children? It depends on your perspective, and evaluations that surface both perspectives are better than those that do not. Conversely, if the evaluator draws just one value judgement, the interests of those with other perspectives may be harmed to the extent that they are deprived of the information they need to make a judgement from their own perspective.
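The Sesame Street case can be turned into a toy illustration of why the synthesis step is perspective-dependent. The evidence scores and weights below are invented for illustration only; they come from no data in Cook et al. (1975), and neither hypothetical stakeholder is meant to characterize any real group.

```python
# Same evidence, two hypothetical stakeholder weightings.
evidence = {
    "skill_gain_disadvantaged": +1.0,  # disadvantaged regular viewers gained skills
    "gap_between_groups":       -1.0,  # the gap between groups widened
}

perspectives = {
    # A parent of a disadvantaged viewer might weight the child's gain heavily.
    "parent": {"skill_gain_disadvantaged": 0.9, "gap_between_groups": 0.1},
    # A policymaker focused on equity might weight the widening gap heavily.
    "equity_policymaker": {"skill_gain_disadvantaged": 0.3, "gap_between_groups": 0.7},
}

for name, weights in perspectives.items():
    score = sum(weights[k] * v for k, v in evidence.items())
    print(name, "judges the program", "favorably" if score > 0 else "unfavorably")
```

Same evidence, opposite verdicts; surfacing both weightings explicitly is what the fourth step, done thoughtfully, buys us.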

The last value issue we must consider concerns ethical topics like social justice. They are part of valuing, but is any harm done if we fail to consider them? Lack of awareness of such topics might lead to potential harm if, for example, it leads the evaluator to overlook a patent injustice; by definition, things that are unjust often cause real harm to people. For this reason alone, we must continue to use discussions of ethical issues to ensure that such problems are minimized. Fortunately, however, it seems likely that a thorough stakeholder analysis will surface such problems, and will have the added advantage of providing stakeholders with the information they need to participate most effectively in decision making in a pluralistic interest group democracy. In this sense, social justice is best served not by imposing some ethical theory from philosophy, but by allowing the ethical beliefs of stakeholders themselves to guide selection of criteria of merit in evaluations.

DISCUSSION

At this point, the reader should go back and review the questions asked by students at the start of the article. Have those questions been answered? Only in part. That part answered concerns defining the nature of the evaluation enterprise, and what we need to know to do good evaluation. The claim is that the profession of evaluation should meet important social needs captured in the definition: to use feasible practices to construct knowledge of the value of the evaluand that can be used to ameliorate the problems to which the evaluand is relevant. If so, then evaluators need to know about all five of these topics: about knowledge construction, about making the difficult choices of evaluation practice, about the use of evaluation, about the evaluand, and about the role of valuing in their work. These five components suggest a curriculum for the evaluation student, with coursework and readings on each of the five, although given the centrality of practice to the profession, the practice component should undoubtedly receive the most attention in the training of students. The curriculum is interdisciplinary and demanding, but justified in the needs that the profession meets. These five topics also suggest the domain that evaluation theory should address: the knowledge base of the profession (Shadish et al., 1991). These are the five needs we meet as professional evaluators, and therefore these are the five areas evaluators need to know about to do good evaluation.

But the students' questions at the start of this paper are not completely answered. The part not answered concerns how we choose among the various options that practitioners face: the part of the student questions that concerns fragmentation, and riding off in 10 directions at once. In a field as large as evaluation, encompassing such an enormous array of options, students must indeed be bewildered at the possibility of narrowing these options to the best one: choosing among randomized and quasi-experiments, quantitative and qualitative methods, formative and summative emphases, instrumental and conceptual use, short- and long-term use, and the like. I have presented the outlines of an answer to this problem elsewhere (Shadish et al., 1991): all these options are legitimate choices to make depending on the situation. Selections should be guided by the use of contingency devices that tell us when, where, and why certain choices might be better than others in certain circumstances. Contingency devices are theoretical concepts that identify a general choice point and the options at that point, and then show how and why evaluation practice should differ depending on which option is present or chosen. Examples of contingency devices are the stage of program development (e.g., new intervention, existing intervention), the structural characteristics of the thing being evaluated (e.g., policy, program, project, element), and the employment situation of the evaluator (e.g., academic, public sector, private sector; see Shadish et al., 1991, pp. 426-430). This latter reference briefly discusses this answer in outline form, but it deserves more elaboration than the present paper would allow. So to flesh out this answer properly requires another paper, and I'll end this article with its title: "Contingency Devices and the Future of Evaluation Theory."


REFERENCES

Cook, T. D., Appleton, H., Conner, R. F., Shaffer, A., Tamkin, G., & Weber, S. J. (1975). "Sesame Street" revisited. New York: Russell Sage Foundation.

Cook, T. D., & Shadish, W. R. (1982). Metaevaluation: An assessment of the congressionally mandated evaluation system for Community Mental Health Centers. In G. J. Stahler & W. R. Tash (Eds.), Innovative approaches to mental health evaluation. New York: Academic Press.

Fournier, D. (1993). Evaluation 2000: A summary of Syracuse University student perspectives on the future. In M. P. Lanahan & C. Eads (Eds.), Proceedings of the 1993 Edward F. Kelly Evaluation Conference (pp. 1-13). Albany, New York: University at Albany/SUNY.

House, E. R., Glass, G. V., McLean, L. D., & Walker, D. F. (1978). No simple answer: Critique of the Follow Through evaluation. Harvard Educational Review, 48, 128-160.

Lanahan, M. P. & Eads, C. (Eds.). (1993). Proceedings of the 1993 Edward F. Kelly Evaluation Conference. Albany, New York: University at Albany/SUNY.

Scriven, M. (1966). Value claims in the social sciences. Publication 123 of the Social Science Education Consortium, Purdue University, Lafayette, Indiana.

Scriven, M. (1967). The methodology of evaluation. In R. W. Tyler, R. M. Gagne, & M. Scriven (Eds.), Perspectives on curriculum evaluation. Chicago, IL: Rand-McNally.

Scriven, M. (1980). The logic of evaluation. Inverness, CA: Edgepress.

Scriven, M. (1991). Evaluation thesaurus (4th ed.). Newbury Park, CA: Sage Publications.

Scriven, M. (1994). Types of theory in evaluation. Theories of evaluation (Vol. 2, pp. 1-4). Newsletter of the AEA topical interest group on theories of evaluation: Instructional Design, Development, and Evaluation Program, School of Education, Syracuse University.

Shadish, W. R., Cook, T. D., & Leviton, L. C. (1991). Foundations of program evaluation: Theories of practice. Newbury Park, CA: Sage Publications.