
Journal of Risk and Uncertainty, 3:315-330 (1990) © 1990 Kluwer Academic Publishers

Understanding Long-Term Environmental Risks

BARUCH FISCHHOFF*
Department of Social and Decision Sciences, Department of Engineering and Public Policy, Carnegie Mellon University, Pittsburgh, PA 15213

Key words: long-term environmental risk, judgment, risk estimating, learning facilitation

Abstract

How well we manage long-term environmental risks depends on how well we understand them. Whether the risk managers are experts or laypeople, that understanding is typically limited. As a result, people must rely on judgment when making decisions about risks. Estimating how big risks are and how much reducing them is worth is an intellectual skill. After reviewing the behavioral principles that govern how people acquire such skills, this article offers several proposals for facilitating learning about risks by improving the ways in which scientific data are created or presented. It also describes some pitfalls facing attempts to determine the quality of other people's understanding of risks, whether through direct study or more casual observation.

How people respond to risks depends on how they perceive those risks, and especially on their perceptions regarding how large those risks are, how painful their realization would be, what opportunities exist for controlling them, and how costly control would be. These perceptions are critical to the management of both immediate and long-term risks. For example, drivers will slow down at a curve only if they see it coming, judge it to be sharper than they can comfortably manage at their current speed, and decide that prudence will not be viewed as timidity or incompetence. Analogous thoughts shape the actions of professional risk managers, such as a nuclear power plant operator responding to errant warning lights on a control panel, a civil defense official responding to the derailment of a tanker train several miles from a residential neighborhood, or a captain responding to changes in the wind across a shipping lane.

Although they play themselves out at a much slower pace, decisions regarding long-term risks are also based on perceptions of risks, benefits, and control options. The defining property of such decisions is that the ultimate consequences will not be fully realized for some (long) period of time. One common corollary of this property is that decision makers receive relatively little direct feedback regarding how wise their decisions have been. For example, constituents may complain about regulators' failure to close an incinerator. Nonetheless, it will be a long time before the casualties will begin to (or fail to) mount, confirming (or disconfirming) their claim. Conversely, if a facility is closed, one may never know whether there really was a risk.

*Preparation of this article was supported by National Science Foundation Grant SES-8846459. It is gratefully acknowledged. The opinions expressed are those of the author.

The same obstacles to learning about long-term risks face decisions made at the individual level. For example, smokers may regret or relish their habit for a long time before discovering whether their bodies are sensitive enough to show ill effects. And similar obstacles face technical experts, who know much but still must guess at exactly how things will work out. They, too, make repeated decisions, each based on the same imperfect beliefs, hoping that their whole enterprise is not threatened by some intellectual common-mode failure. In all these cases, critical lessons may be learned not only late, but too late.

A second common correlate of decisions about long-term risks is a relatively diffuse moment of decision. For example, most facilities are never closed nor opened for good; most bad habits are neither adopted nor foresworn forever. As a result, it may be hard to reconstruct exactly what decisions produced whatever consequences were experienced, rendering the lessons to be learned ever murkier.

In terms of the psychology of risk decisions, these features prove to be critical correlates. Unless people learn risk facts directly, they will come to understand risk issues only if their experience provides useful feedback, so that they can learn by themselves what they have not been taught. Thus, the delayed consequences of long-term decisions mean that these decisions will be particularly difficult to understand. Because people dislike decisions that they do not understand, it may be tempting to avoid making them, or for the resulting choices to be wobbly (Janis and Mann, 1977).

Furthermore, not only are the fact issues underlying long-term decisions hard to understand, but so are the value issues. For example, smoking decisions require balancing certain short-term benefits against uncertain long-term risks. Facility siting decisions often require balancing uncertain effects on the health of one's children against uncertain effects on their economic welfare. It must be tempting to avoid making such trade-offs or to find some way of restating them so that they seem less harsh, perhaps by ignoring some of the difficult issues. People acquire their values through experience, learning what they like. As a result, here, too, the slow pace at which long-term decisions play out poses a barrier to learning.

The obvious alternative to learning from experience is direct instruction. To that end, various experts have filled our world with cautions and reassurances regarding various risks. Yet, even here judgment is necessary. Recipients must decide whether the risk communicators know what they are talking about, and whether they have the recipients' best interests at heart.

The present analysis looks first at the general processes by which people master judgmental skills, and then at how those processes emerge in the specific context of long-term environmental risks. The next section discusses the obstacles to evaluating the performance of individuals making risk judgments. The final section draws on these analyses to offer some proposals for improving the understanding of long-term environmental risks, by changing the way that data are created and presented. Some of these proposals are designed to speed the learning process, while others circumvent it by providing more effective direct instruction.

1. Conditions for learning

Few aspects of psychological theory are as well established as the basic principles of learning from experience (Hearst, 1988). These principles state that it is easiest to acquire a skill when one receives large quantities of prompt, unambiguous feedback in a setting that rewards performance of that skill. This is true for psychomotor skills, such as riding a bike and steering a ship, and for intellectual skills, such as estimating the rate at which risks accumulate over repeated exposures or the strength of an opponent's bargaining position.

These principles can be illustrated by looking at the evidentiary record regarding how people acquire one skill that is essential to most decisions regarding long-term environmental risks, namely, estimating the accuracy of one's own beliefs (Lichtenstein, Fischhoff, and Phillips, 1982; Yates, 1990). That ability is essential in order to know when to take action, when to hedge one's bets, and when to acquire more information. One measure of accuracy is calibration. Well-calibrated individuals hold correct beliefs XX% of the time that they are XX% confident of being correct. People inevitably make many confidence judgments in the course of a day.1 However, they typically do so under conditions unfavorable to learning. For many confidence judgments, the feedback is long delayed, so that by the time it arrives, one no longer remembers what the prediction was or what considerations motivated it. The value of feedback is also diluted whenever the predictions are vague, because either the belief or the degree of confidence in it have not been stated sharply. Finally, even prompt, clear feedback will have little effect if people are rewarded not for being accurate, but for exuding confidence or for avoiding commitments.

Many studies have found what one would expect, given such partial and imperfect feedback. People typically have some, but incomplete, mastery of this skill. Individuals tend to know more when they express more confidence. However, their absolute levels of confidence do not match their absolute levels of knowledge. In a typical experimental study, subjects might be correct 50% of the time when they are 50% confident, but only 80% of the time when they are 100% confident. The overall trend in such a study would be overconfidence, shown, say, by a mean confidence of 80% over a set of beliefs, only 70% of which prove to be correct.
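
To make the calibration and overconfidence measures concrete, here is a minimal illustrative sketch in Python. It is not from the article; the function names, the one-decimal binning, and the sample data are assumptions, with the data chosen to echo the hypothetical pattern just described (mean confidence of 80% against 70% correct).

```python
from collections import defaultdict

def calibration_table(judgments):
    """judgments: list of (confidence, correct) pairs, e.g., (0.8, True)."""
    bins = defaultdict(list)
    for confidence, correct in judgments:
        bins[round(confidence, 1)].append(1.0 if correct else 0.0)
    # For each stated confidence level, report the proportion of beliefs
    # that proved correct and the number of judgments in the bin.
    return [(conf, sum(bins[conf]) / len(bins[conf]), len(bins[conf]))
            for conf in sorted(bins)]

def overconfidence(judgments):
    """Mean confidence minus overall proportion correct (positive = overconfident)."""
    mean_conf = sum(c for c, _ in judgments) / len(judgments)
    hit_rate = sum(1.0 for _, k in judgments if k) / len(judgments)
    return mean_conf - hit_rate

# Hypothetical data: 20 judgments, mean confidence 0.8, 70% correct overall.
data = [(0.7, True)] * 5 + [(0.7, False)] * 5 + [(0.9, True)] * 9 + [(0.9, False)]
print(calibration_table(data))   # [(0.7, 0.5, 10), (0.9, 0.9, 10)]
print(overconfidence(data))      # about 0.10
```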

There are relatively few studies of experts' calibration in situations where they have run out of hard data and must rely on judgment. In those studies, the same picture emerges--unless the experts have enjoyed the conditions needed for learning this intellectual skill. The best-documented example of success is that of weather forecasters assessing the probability of near-term precipitation. Forecasting organizations, such as the U.S. National Weather Service, have done much of what they could to facilitate learning. They elicit explicit quantitative predictions for well-specified events (e.g., receiving more than .01 inch of precipitation between 0600 and 1200). Forecasters receive both the immediate feedback of seeing what the weather does and statistical summaries of how well they are doing.2 The performance of forecasters has rewarded that effort.

Forecasters not only know a lot, but they also know how much they know. Their calibration is, however, poorer in cases where feedback is poorer (e.g., in predicting severe storms or wind velocities) or where they have just begun to receive feedback. Thus, it is not something about forecasters as people, but something about their working conditions that seems to make the difference in their performance (Murphy and Brown, 1984; Murphy and Daan, 1985). Similar patterns of success have been observed with other experts enjoying favorable conditions for learning (e.g., "professional" bridge and horse players) (Keren, 1987). Laypeople, too, have improved substantially when favorable conditions are created for them artificially (Lichtenstein and Fischhoff, 1980).

Thus, confidence assessment seems to be a learnable skill, where people's level of mastery depends on the opportunities provided by their life experiences. In the extremely good conditions enjoyed by some select experts, performance seems to be so good that further improvements would do little to improve any decisions based on their confidence assessments. Unfortunately, such documented performance records are unusual, leaving one to guess at how good other experts' judgments are. One way to guess is to ask whether they have had the opportunity to learn that skill. Thus, one might put the greatest trust in the confidence expressed by economists, toxicologists, homeowners, parents, and politicians3 who routinely make explicit quantitative predictions about well-defined events that transpire soon, who systematically review their experience, and who live in a world that rewards them for accuracy (Henrion and Fischhoff, 1986).

Of course, confidence assessments are only needed when people do not know the right answer with complete confidence.4 We send people to school (or, more likely, pay for the services of those who have sent themselves) to know the answers to questions of risk and benefit. Unfortunately, in the case of long-term environmental risks, schools cannot teach the answers to many critical questions. All that anyone knows with any confidence is the answers to subquestions of the big questions. Identifying the correct subquestions, assembling them into an analytical structure, and determining how much confidence to place in the conclusions of that analysis are all tasks requiring the exercise of judgment. Clearly, experts begin their thinking about long-term environmental problems at a much higher level of knowledge than do laypeople. However, their tasks also require them to go from the domain of what is unknown to laypeople to the domain of what is unknown to experts as well. That step into the unknown requires the exercise of judgment--expert judgment, but still judgment (Fischhoff, 1989a).

After examining the opportunities that people have had to acquire a skill, one must ask about the rate at which (even the best) experience translates into improved performance, as well as about any overall limits to the learning process. Going back to the example of calibration, performance has been found to jump up quickly with a first dose of organized feedback, followed by very slow improvement (Lichtenstein and Fischhoff, 1980). As mentioned, there is some chance that, over time, calibration will become effectively perfect.

The opportunities for acquiring many other skills seem much more limited. Some tasks seem to require greater cognitive capacity than people can muster; others demand cognitive structures that few people have (e.g., anticipating the operation of complex technical systems; integrating diverse and competing desires) (Sternberg and Smith, 1988). Unless the correct behavior is in people's "repertoire," there is no way that it can be reinforced by their experiences.5 In other cases, the cognitive task is tractable, but critical feedback is hidden from view. For these reasons, it is hard, for example, to learn just what motivates other people's behavior or just how toxins work on the body (Nisbett and Ross, 1980).

Where raw experience is unrevealing, direct instruction is needed. Anecdotal observation of risk management controversies suggests quite a complex picture of how well people can learn risk facts directly (Fischhoff, 1989b; Krimsky and Plough, 1988). On the one hand, there are cases where citizens or legislators seem to be buffeted helplessly by the opinions of conflicting experts, perhaps siding, in the end, with a minority that advocates more caution or more alarm than most experts judge appropriate. At the same extreme, it is easy to point to cases in which the decisions of laypeople seem to reflect gross misestimates of risk or benefit.

On the other hand, at the heart of most local controversies, one finds ordinary citizens who have, at some level, mastered complex research literatures, despite having little scientific (or even formal) education. At the same extreme, one finds consumers who shrug off expert exhortations to alarm or caution with an arguably sophisticated appreciation of how much (or little) faith these experts deserve.6 One currently debated theory of risk behavior claims, in fact, that people understand risks so well that they routinely find ways to get greater benefit out of systems that have been engineered to be safer. As a result, they keep risk levels constant (e.g., they drive faster on improved roads) (Slovic and Fischhoff, 1983; Wilde, 1988), frustrating the schemes of safety officials.

The parsimonious account of these anecdotes, too, might be in terms of learning theory: people can be expected to know those facts that they have had the opportunity to master. To take a simple and well-studied example, people have been found to be quite good at estimating the relative frequency of repeated events that they have observed directly, even if the repetitions are distributed in time and the estimation task comes as a surprise (Hasher and Zacks, 1984). They are, however, less proficient at estimating the population frequency of such events. That inference requires discerning systematic biases in the sampling process that brings events to their attention and understanding the role of sample size in sample variability, two judgmental skills that people have trouble acquiring (Kahneman, Slovic, and Tversky, 1982; Tversky and Kahneman, 1974).
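
The role of sample size in sample variability, mentioned above, can be spelled out with a short simulation. This is an illustrative sketch only (the scenario and numbers are assumptions, not the article's): the spread of sample proportions around a true rate shrinks as samples grow, which is the statistical regularity that intuitive judgments of population frequency are said to neglect.

```python
import random
import statistics

def spread_of_sample_proportions(true_rate=0.1, n=10, trials=2000, seed=1):
    """Standard deviation of estimated proportions across many samples of size n."""
    rng = random.Random(seed)
    estimates = [
        sum(1 for _ in range(n) if rng.random() < true_rate) / n
        for _ in range(trials)
    ]
    return statistics.pstdev(estimates)

# Larger samples give tighter estimates of the same underlying rate.
for n in (10, 100, 1000):
    print(n, round(spread_of_sample_proportions(n=n), 3))
# roughly 0.09, 0.03, 0.01 -- close to sqrt(p * (1 - p) / n)
```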

Laypeople's understanding of risk controversies might be expected to show analogous patterns: they should have a good feeling for the frequency of risks that are openly reported or easily observed, but should do more poorly when risks are hidden (e.g., suicides) or are matters of conjecture (e.g., the carcinogenicity of new chemicals) (Lichtenstein et al., 1978).7 Similarly, laypeople should understand something about the sociology of those scientific controversies that they can see (e.g., who trusts whom, who pays whom, who has been trustworthy in the past, who seems to have an opinion on everything), but less about the deeper structural principles of science (e.g., peer review, the role of sample size in weighting conflicting studies).

For their part, experts may remember every word that the public has said about them (and their technologies), yet still maintain quite inaccurate beliefs regarding the social, psychological, and political processes generating those words. A likely misdirection is assuming that what laypeople say reflects what they believe, rather than, say, posturing or frustration. Without direct, candid interaction between experts and laypeople, there is relatively little opportunity to test (and correct) either community's misconceptions about the other.

The successful management of long-term environmental risks depends on the wisdom of both experts and laypeople (including citizens, consumers, legislators, and many regulators). From a learning perspective, what these people see is what society will get, in terms of reasoned public debate about risk issues. That same perspective suggests several procedures that might improve the quality of the judgment that is brought to bear on long-term environmental risk decisions. These procedures include ways to make better use of what people already know, as well as ways to speed the learning process. The need for such interventions depends on the quality of people's existing judgment in particular situations. The following section discusses obstacles to determining current understanding. Those obstacles may not only misdirect interventions, but also may lead to giving experts or laypeople too broad or too restricted a role in risk management (relative to their competence).

2. The quality of judgment

Although the principles of learning are, to a first approximation, fairly simple, their application is not. It requires a detailed look at the specific intellectual skills needed for a particular decision, and at the opportunities for their acquisition. It implies finding a patchwork of strengths and weaknesses. People know some things, but not others; they can do some things, but not others. That expectation itself puts the lie to the sweeping generalizations that often pepper risk debates, be they about all-knowing experts or no-knowing laypeople. As a result, people's beliefs and levels of understanding must be measured. The task of doing so faces serious methodological obstacles. Failure to master these problems can distort both the assessment of laypeople's prowess and the elicitation of experts' opinions (e.g., as inputs to probabilistic risk analyses). These threats can be divided into asking poor questions and evoking poor answers.

2.1. Posing poor questions

A common form of dismissal begins, "They don't even know . . ." Completions might include "that AIDS causes cancer," "that they should test their houses for radon," and "that we're running out of places for their garbage." The policy implication is that "they" are incompetent to manage their own affairs. For their own good, they need educational campaigns or coercive legislation or deference to expert opinion. Following such recommendations could markedly change our society through the practices and institutions that are created to deal with risks.

Often, however, these conclusions about laypeople are based on how they respond to questions whose answers are not particularly relevant to their lives. For example, in the Institute of Medicine's (1986) important and insightful study, Confronting AIDS, the public is derided because only 41% of respondents to a survey knew that AIDS is caused by a virus. Even if one trusts this result,8 one must ask why anyone should know this fact. What decision could such knowledge (or ignorance) affect? It is all too easy to conclude that a person who does not know one thing that we know does not know much of anything (Nisbett and Ross, 1980). A more rigorous approach is needed to determine what people really need to know. Conceptually, that criterion should be a value-of-information analysis, identifying those facts that can most affect their decisions (Raiffa, 1968). Thus, when reading a survey result, one might routinely ask, "Just what difference does the answer make to how respondents conduct their lives?"
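
Value-of-information reasoning can be made concrete with a toy calculation. The following sketch is illustrative only; the mitigate/ignore framing, the costs, and the prior probability are hypothetical, not the article's. The point is simply that a fact is worth learning to the extent that knowing it could change the decision and improve its expected outcome.

```python
def expected_cost(action, p_hazard, costs):
    """Expected cost of an action, given the probability that the hazard is present."""
    return (p_hazard * costs[(action, "hazard")]
            + (1 - p_hazard) * costs[(action, "no_hazard")])

# Hypothetical costs (arbitrary units) for a two-action, two-state decision.
costs = {
    ("mitigate", "hazard"): 10, ("mitigate", "no_hazard"): 10,
    ("ignore", "hazard"): 100, ("ignore", "no_hazard"): 0,
}
actions = ("mitigate", "ignore")
p = 0.05  # prior probability that the hazard is present

# Expected cost of the best action chosen on current beliefs alone.
prior_cost = min(expected_cost(a, p, costs) for a in actions)

# Expected cost if the true state were revealed before choosing.
informed_cost = (p * min(costs[(a, "hazard")] for a in actions)
                 + (1 - p) * min(costs[(a, "no_hazard")] for a in actions))

# Expected value of perfect information: how much resolving this fact is worth.
print("EVPI:", prior_cost - informed_cost)  # 5.0 - 0.5 = 4.5
```

On this criterion, a fact whose resolution would never change the preferred action has zero value, which is one way of asking what difference a survey answer makes to how respondents conduct their lives.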

One strategy for ensuring relevance is to focus on people's responses to actual risks, treating their behavior as an answer to some question posed by their environment. Unfortunately, interpreting behavior requires more detailed knowledge of how people interpret the question than is typically available (Fischhoff and Cox, 1985). For example, people who fail to test for radon may be demonstrating their ignorance of its risks, or their knowledge of the costs of remediation. Why should one perform a test that could produce an incontrovertible sign of risk that one would then have to ignore (Svenson and Fischhoff, 1985)?

Even when investigators pose questions that are worth answering, they often do so inadequately. For example, a National Center for Health Statistics (1987) survey that was intended to guide U.S. policy about AIDS asked, "How likely do you think it is that the AIDS virus will be transmitted by sharing plates and other eating utensils?" Interpreting people's answers requires knowing how they interpret "sharing." We posed this question to a reasonably homogeneous group of laypeople (undergraduates at an Ivy League college) and found considerable disagreement about both the intensity and the frequency of sharing (Linville, Fischhoff, and Fischer, 1990). Analogous problems may be found in people's responses to everyday risk communications. For example, when adolescents are admonished "don't do drugs," how do they interpret "do" and "drugs?" When adults see public service announcements asking, implicitly, "Do you drink and drive?", how do they interpret "drink" and "drive" (Quadrel, 1990)?

Decisions about risks are seldom about risks alone, but also pose questions about the benefits that accompany those risks and that may be foregone if the risks are controlled. These questions, too, must be asked precisely if responses to risks are to be interpreted meaningfully. However, even systematic studies of people's values regarding long-term environmental risk issues often rely on survey questions that are too vague to get at these perceptions (Freudenberg and Rosa, 1984; Peterson, Driver and Gregory, 1988). For example, in order to answer a question like, "Do you support nuclear power?", people must first have answered such subsidiary questions as "What are my other options--just fossil fuels, or is aggressive conservation a possibility?" and "What consequences should I consider--just efficiency and environmental impact, or also questions of equity and centralization of power?" Respondents who have wrestled with nuclear energy questions long enough to have evolved a stable perspective would, presumably, complete these missing details in a consistent way and provide reliable answers.9

On the other hand, if people have not learned how to think about a problem, then they may interpret it differently each time it arises. The particular interpretation that they adopt will depend on the particular cues and reminders that accompany the question (e.g., are conservation and equity even mentioned?). When problem definitions are unstable, whether this occurs in political debates or surveys, then the public may look confused and untrustworthy. However, their difficulties may be with understanding the question, rather than with producing the answer. The variety and power of subtle hints to define ambiguous evaluation questions is a staple of cognitive and social psychology research (Fiske and Taylor, 1985; Hogarth, 1982).10 It is also a standard device of successful politicians, who can leave the audiences to their debates shifting back and forth like fans at a tennis match, as opposing speakers define and redefine the issues.

Even when people have learned how they want to think about an issue, observers still need to determine what their perspectives are. Inferring people's benefit perceptions from their choices is as difficult as inferring their risk perceptions. Elected politicians exploit this ambiguity when they infer complex, detailed mandates from the simple choices that voters have made in choosing them.

2.2. Offering poor answers

Survey research, indeed any social research, often looks deceptively easy. It may seem as though anyone who can answer questions can ask them. After all, we ask people questions in our everyday lives, just as we routinely interpret their behavior. However, both prove to be difficult enterprises. Many of the methodological flaws found in studies have direct analogues in everyday behavior.

One such pitfall is the tendency to exaggerate the extent to which other people make responses based on stable, overriding personality characteristics, rather than responding variably as a function of the specific situations in which they find themselves. This pitfall is fed by the tendency to see individuals primarily in one kind of situation which evokes one kind of behavior. For example, some technologists claim that laypeople are unreasonably cautious (e.g., about nuclear power, pesticides), while some public health officials claim that people are unduly complacent (e.g., about radon, skin cancer).

Those who exaggerate the generality of behavior observed in particular circumstances are in good company. The same bias seems to have helped fuel interest in personality tests, which have had rather modest success, considering the effort invested in them (Bromily and Curley, in press; Mischel, 1968; Nisbett and Ross, 1980). The general problem is that behavior is the wrong answer for the inferences that observers would like to make, even when people are responding to well-formulated questions. There are just too many sets of beliefs and values that could lead to a particular behavior (Dawes, 1979; Goldberg, 1968; Nisbett and Wilson, 1977).11 One must directly study the quantitative perceptions of risk and benefit that underlie behaviors.

Unfortunately, eliciting more appropriate answers is fraught with its own methodological difficulties. Quantitative judgments are hard whenever people are uncomfortable with the needed numbers (especially very large or very small numbers) or with the units to which the numbers are attached (e.g., reactor years, lost days of life expectancy). As a result, such judgments are often unreliable and disturbingly sensitive to the precise way in which questions are asked. For example, just mentioning a high number can inflate subsequent estimates of risk or benefit (relative to mentioning a low number or no number at all). Changing the format of a question can produce similar changes (e.g., from "Of those afflicted with a disease, how many die?" to "Of those who die from a disease, how many have it and survive?"). There are comparable response mode effects for benefit assessment as well. As a result, how much something is said to be worth depends on how the possible answers are phrased (Fischhoff, in press; Fischhoff and MacGregor, 1983; Hogarth, 1982; Poulton, 1989; Turner and Martin, 1985).

These phenomena can be observed in everyday behavior, as well as in laboratory experiments. They are possible where people have not thought through their beliefs, and hence can be influenced by hints (intended or inadvertent) regarding what to say, or where people are unfamiliar with the response mode, and hence can be influenced by hints regarding how to express themselves. Thus, one would expect greater stability when people decide what to pay for a new kind of toothpaste than when they assign a dollar value to a change in the environment (which they have never thought of as a good with a price tag). One would expect similar problems with experts, when they must express themselves in an unfamiliar way. Thus, one should be wary of asking even a devoted conservationist what a wilderness is worth or a veteran nuclear power plant operator to estimate the co-occurrence rates for various control room problems.

However, even when people have expressed their beliefs about the size of risks and benefits, they may still not have expressed all their relevant perceptions. People also care about how these consequences are created and controlled. Such issues will affect qualitative perceptions regarding, for example, how predictable these consequences seem to be and what control strategies are even considered. Eliciting such perceptions poses additional methodological challenges. Nonetheless, such research is being conducted in areas as diverse as interface design for hazardous technologies, science education, economic education, and risk communication. For example, we have found people who believe that radon permanently contaminates houses (as if it were a long-lived isotope), meaning that treatment must be prohibitively expensive (Bostrom, Fischhoff and Morgan, in press). Individuals who held such beliefs might see no point in testing, even if they knew that tests cost $10.95 and that the EPA has attributed 20,000 deaths annually to radon. Other investigators have found strange intuitive rules regarding the workings of the economy, thermostats, and physiological processes (Furnham, 1987; Gentner and Stevens, 1983; Rouse and Morris, 1986). Such misconceptions frustrate people's attempts to manage their own lives or to make sense out of the actions of risk managers. As elsewhere, if it is difficult for investigators to discern how people interpret the processes that create risks and benefits, then it is even harder for casual observers to do so.

3. Some proposals

Perhaps the most general conclusion that follows from adopting a learning perspective is a hortatory one: caution is needed when interpreting or evaluating other people's understanding of risk issues. People learn facts about the world and about themselves piece by piece, at differing rates, and with varying degrees of mastery. Even when they have adopted a general principle (e.g., a scientific theory, a moral precept, an inferential rule), people may not invoke it in all places where it might be relevant, may not interpret it appropriately where it is invoked, and may not combine it properly with other principles. As a result, they cannot be expected to be wholly expert even in the areas closest to them (e.g., their own values, their professional training).

Even when stated so generally, these realizations are important to the tenor of risk debates, in which people are often reduced to caricatures (National Research Council, 1989). Some more concrete recommendations follow, intended to improve people's understanding of long-term environmental risks. Some aim at the short run, others at the long run. Some are directed at laypeople, others at technical experts. These recommendations fall into two categories: how data are presented and how data are created. Each recommendation could be implemented routinely by the various institutions responsible for risk management. Other good reasons may exist for taking these steps (e.g., it's the law). They appear here because of their potential contribution to learning.

3.1. Recommendations for how data are presented

1. Give the big picture. Lay out the basic issues relevant to managing a risk before getting into any details (e.g., what are the key consequences, the key uncertainties, the major players, the essential technical and legal constraints). Factors that are out of sight tend to be out of mind. People cannot respond reasonably or responsibly to problems that are revealed only partially.12

2. Prioritize information by importance. People have limited time, attention, and information-processing capacity. They should receive first the information that they need most. Doing otherwise means, in effect, breaking the faith with them, not even showing enough sensitivity to their needs to organize one's presentation. Where information is intended to serve a particular decision, a value-of-information analysis could formalize the prioritization process.13

3. Present quantitative information comprehensibly. Although this should go without saying, it is all too easy to find risk communications that show little awareness of the documented obstacles to risk communication.

4. Make the qualifications to quantitative information readily accessible. Knowing what the experts think is meaningless without knowing how much the experts know. Laypeople need statistical summaries of experts' performance, as well as qualitative summaries of the state of their science. For example: Are there conflicting schools of thought? Have the experts received formal training or just gravitated to the area? Where does judgment enter analyses?

5. Summarize information from multiple sources systematically. Making sense of risk issues often requires bringing some order to a welter of diverse and conflicting studies. A typical response is an intuitive reading of what it all means. Clinical psychologists face the analogous task of integrating multiple studies evaluating the effectiveness of clinical treatment programs (e.g., for schizophrenia). To that end, they have developed a family of procedures called meta-analyses (Hunter, Schmidt, and Jackson, 1982), which weight studies according to such features as their sample size and measurement reliability. By applying such procedures, one can avoid such natural but flawed tendencies as just counting the number of studies that did and did not observe a statistically significant difference. The consumers of risk studies need equally systematic summaries (one common pooling rule is sketched after this list).

6. Present alternative perspectives. People's choices can be powerfully affected by how problems are presented. That manipulation may be deliberate or inadvertent. Because it is difficult to generate alternative perspectives spontaneously, risk communicators should present them explicitly. For example, messages might be phrased in terms of both (1) the risks that a measure eliminates and the (complementary) risks that it does not; (2) the relative and absolute differences in the risks faced with and without an intervention; and (3) the probability of an accident from single and multiple exposures to a risk (e.g., McNeil et al., 1982; Slovic, Fischhoff, and Lichtenstein, 1978; Tversky and Kahneman, 1981).14

7. Present qualitative as well as quantitative information. People facing specific decisions need quantitative risk and benefit estimates. Often, though, people are just trying to follow the debate over a risk. To that end, they may need qualitative information regarding how risks are created and managed. Indeed, such descriptions might not be a bad way to convey a feeling for how big risks are.

8. Evaluate communications. Especially when risks are small, misleading risk communications can have worse health effects than the risk. In such cases, it seems no more justified to subject the public to an untested communication than to an untested drug (Fischhoff, 1987).
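
As a concrete illustration of recommendation 5, here is one standard pooling rule, fixed-effect inverse-variance weighting, sketched in Python. The article cites meta-analysis generally rather than this particular formula, and the study numbers below are hypothetical.

```python
import math

def pool_fixed_effect(studies):
    """studies: list of (effect_estimate, standard_error) pairs, one per study."""
    weights = [1.0 / se ** 2 for _, se in studies]  # precise studies count more
    pooled = sum(w * est for (est, _), w in zip(studies, weights)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Hypothetical studies: one large, precise study and two small, noisy ones.
studies = [(0.10, 0.02), (0.40, 0.20), (-0.05, 0.25)]
estimate, se = pool_fixed_effect(studies)
print(round(estimate, 3), round(se, 3))  # about 0.102 and 0.02
```

Unlike a count of significant versus nonsignificant studies, a rule of this kind lets a large, reliable study outweigh several small, noisy ones.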

3.2. Recommendations for how data are created

The following suggestions involve modest to radical changes in the production of scientific information for risk management. They apply to cases in which research is conducted to guide practical decisions, rather than "just" to advance basic scientific understanding.

1. Create the big picture. Because responsible decision making requires at least some knowledge of all significant issues, the scientific research agenda must ensure at least some treatment of all such issues. This means addressing processes that are hard to quantify (e.g., operator error, impairment from electromagnetic fields) or effects that are hard to monetize (e.g., loss of noncommercial species, immune system insults). Indeed, the "best buys" in research may be found in rudimentary studies of the most poorly understood issues. Such investigations can pick up (or even just suggest) only large effects. If they find something, then its practical importance should be obvious. If they do not, then decision makers have reduced their uncertainty about what surprises might be lurking, if only to document that the effects are too small to emerge in such a study.15

2. Create the information that is important. Value-of-information analysis can show how precisely various facts need to be known. Doing so provides the basis for zero-based budgeting of applied research. Applying such an approach systematically might reveal many research problems that have been pursued beyond the point of practical significance (even though they may still address important theoretical issues), as well as cases where the information yield would be larger if resources were spread over a modest variety of methods rather than being concentrated on a single favored method. Basic research needs are a matter of taste; policy-related research should be chosen in a more focused way (National Research Council, 1983).

3. Create performance records for experts. Interpreting a communication requires an inference regarding the credibility of its source. Those inferences are most solid when based on systematic empirical evidence. To create that evidence, experts must make explicit predictions that can then be evaluated in the light of subsequent experience. This process also generates the feedback that the experts need in order to learn from experience. Although its adaptation to other contexts is far from trivial, the approach adopted by weather forecasters should be the norm in any research that is supported in return for its contribution to policymaking. At the very minimum, experts should disclose explicitly where judgment entered their analyses and what elicitation procedures were used, so that recipients can consider the credibility of such judgments.

4. Create alternative perspectives. The accepted way of expressing uncertainty is in terms of a probability distribution over possible values. One great challenge to creating such assessments is capturing the structural uncertainty produced by mistaken assumptions in scientific theories or measurement procedures. Where there are many such assumptions, recipients are left wondering, "What else might be true, if the dominant theory is so uncertain?" One potentially useful response is creating full-blown alternative analyses, reflecting minority approaches. Doing so would mean diverting resources from creating a better conventional analysis, in the hopes of both informing readers and reconciling competing views.16

5. Create proper incentives. Decision makers must worry not only about what experts know, but also about what they can say. In order to encourage candor, information recipients must create (and demonstrate) an incentive structure that rewards experts for saying what they really believe, rather than for exuding confidence, avoiding responsibility, generating alarm, or allaying fears. Creating explicit performance records is one step in this direction. However, experts also need structural protections. One radical proposal is to ban the use of risk analysis for probative purposes, under the assumption that proving how large (or small) risks are is incompatible with discovering the truth about them.

4. Conclusions

By definition, the consequences of decisions regarding long-term environmental risks will play themselves out over a long time. Given the relative novelty of trying to make such decisions in a deliberate fashion, we are probably not doing very well. To improve, we must attend to the learning process. Enough is known about the general dimensions of that process to suggest places where help is particularly needed and also which interventions might make a difference. Specific suggestions are provided here for both presenting and creating risk information. Some of these interventions require abandoning customary practices, perhaps sacrificing some short-term efficiency for a chance at greater long-term efficacy. Doing so requires a commitment to humility (i.e., we experts still have something to learn) and to democracy (i.e., the public needs to grow as we do).

Notes

1. These confidence assessments may be attached to beliefs as diverse as "This report will satisfy the boss," "This valentine won't offend my partner," "This water is safe to drink," or "I remembered to turn off the stove."

2. The goal of forecasters, as set by management, is to provide accurate probabilities (and not, for example, to tell people what they want to hear about the weather, or to never fail to predict a rainy day even if that means often suggesting that people carry umbrellas on dry days).

3. Psychologists could also be included in this group.

4. At the least, they should have sufficient confidence that additional knowledge would not affect contingent decisions.

5. For example, if people do not have at least a weak notion of regression toward the mean, then it will be very difficult for them to extract the proper lessons when their nonregressive predictions fail (Tversky and Kahneman, 1974).

6. Without a detailed empirical description of such episodes, it is hard to know exactly what is happening. At the time of this writing, two recent episodes might fit this description: 1) in the Alar controversy, some consumers backed off from a product with ready substitutes in response to the willingness of a major environmental organization to risk some of its reputation by raising a flag of alarm (they may also have learned how little nutritional value apple juice provides); and 2) many homeowners resisted testing for radon despite the remonstrations of government agencies, who could promise that the tests were fairly accurate, but not that the problem could be managed at an acceptable cost (why risk confirming a fear that one cannot address and, thereby, forego the benefits of denial?) (Smith, Desvousges, Johnson, and Fisher, 1990).

7. For example, in studies examining lay understanding of the risks of radon, we have found people who believe (erroneously) that it causes skin cancer, that it permanently contaminates surfaces like rugs and drapes, or that radioactive decay is like organic decay.

8. One possible reason for mistrust is that some respondents will get a multiple-choice question correct by chance. A second reason is concern over whether even those who answer correctly know what a virus really is.

9. An interesting example of such incompleteness may be found in contingent valuation studies, asking people how much they would be willing to pay for environmental goods (e.g., improved visibility, preserving endangered species). Although these studies are sophisticated in many ways, the investigators themselves do not seem to have evolved a stable perspective on what issues need to be addressed in creating a meaningful evaluation question (Fischhoff and Furby, 1988; Mitchell and Carson, 1989).

10. For example, most people seem willing to stick with an unattractive gamble, like a .25 chance of losing $200, when the alternative is a "sure loss" of equal expected value. However, their preferences reverse when the alternative is described as an "insurance premium" (Fischhoff, Slovic, and Lichtenstein, 1980; Hershey and Schoemaker, 1980).

11. A particularly common, and unfortunate, version of these inferences is found in risk comparisons, wherein laypeople are derided for tolerating a fixed risk (e.g., .000001 of premature death) from one source (e.g., 10 miles of canoeing) and not from another (e.g., 50 years of living within 5 miles of a nuclear power plant). Even if one believes the risk estimates, the two actions are so different in their various risks, benefits, and control options that little can be inferred from the comparison (Cohen and Lee, 1979; Covello, Sandman, and Slovic, 1988; Fischhoff, Lichtenstein, Slovic, Derby, and Keeney, 1981).

12. One example from the anecdotal observations of this author is the controversy over EDB several years ago (Sharlin, 1987). The accounts emerging in the news media never mentioned the issues of what chemicals would replace this fumigant were it banned, or who received the economic benefits of its use. The public was chastised for its aversion to EDB. However, their response might have been logically defensible had they guessed that the alternatives to EDB were demonstrably safer and that the profits for its use went to the producers while the risks went to the consumers. Guessing wrong is a rather different problem than reasoning inadequately. It puts the onus on the technical experts who have failed to clarify the basic structure of the risk problem.

13. Like most of the recommendations here, this one applies to experts as well as laypeople. Although they may, through experience, have become accustomed to jumbled presentations of technical information, that fact cannot improve their performance.

14. As an example of the need for point 3, we recently asked some college students to estimate the risk of AIDS being transmitted during 1, 10, and 100 sexual exposures to someone with the virus. For male-to-female transmission, subjects estimated the respective risks to be, on average, about 5%, 10%, and 25%. If one assumes independent chances of exposure, these are wildly inconsistent estimates. (If the risk is 5% on one exposure, then it should be a virtual certainty for 100.) For people whose statistical intuitions lack a clear notion of how risks compound, presenting one perspective would create a misconception about the other. So would describing a drug as doubling the chances of a particular side effect, without noting that this means increasing it from one in a million to two in a million (Linville et al., 1990).
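
The arithmetic behind this footnote can be spelled out in a brief illustrative sketch (not from the article): if each exposure independently carries probability p of transmission, the chance of at least one transmission over n exposures is 1 - (1 - p)^n.

```python
def cumulative_risk(p, n):
    """Probability of at least one occurrence over n independent exposures of risk p."""
    return 1 - (1 - p) ** n

for n in (1, 10, 100):
    print(n, round(cumulative_risk(0.05, n), 3))
# 0.05, 0.401, 0.994 -- far above the averages of 5%, 10%, and 25% reported
```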

15. For example, at the time of this writing, there is considerable alarm about the apparent decline in frog populations. No one seems to know why this is happening or what it means, but here even a crude frog census has suggested (or at least emphasized) a dimension of concern that would otherwise have been neglected.

16. For example, a toxicological risk analysis might divert a portion of its resources to those scientists who believe that we are sitting on a cancer time bomb, letting them make their case. A cost-benefit analysis might be complemented by studies focused on those effects that are hard to monetize.

References

Bostrom, A., B. Fischhoff, and M.G. Morgan. (In press). "Eliciting Mental Models of Hazardous Processes: A Methodology and an Application to Radon," Journal of Social Issues.
Bromily, P. and S.P. Curley. (In press). "Personality and Individual Differences in Risk Taking." In J.F. Yates (ed.), Risk Taking. Chichester, England: Wiley.
Cohen, B. and I.S. Lee. (1979). "A Catalog of Risks," Health Physics 36, 707-722.
Covello, V.T., P.M. Sandman, and P. Slovic. (1988). Risk Communication, Risk Statistics, and Risk Comparisons: A Manual for Plant Managers. Washington, D.C.: Chemical Manufacturers Association.
Dawes, R.M. (1979). "The Robust Beauty of Improper Linear Models in Decision Making," American Psychologist 34, 571-582.
Fischhoff, B. (In press). "Value Elicitation: Is There Anything in There?" In M. Hechter, L. Cooper, and L. Nadel (eds.), Values. Stanford, CA: Stanford University Press.
Fischhoff, B. (1989a). "Eliciting Knowledge for Analytical Representation," IEEE Transactions on Systems, Man and Cybernetics 13(3), 448-461.
Fischhoff, B. (1989b). "Risk: A Guide to Controversy." In National Research Council, Improving Risk Communication. Washington, D.C.: National Academy of Sciences Press, pp. 211-319.
Fischhoff, B. (1987). "Treating the Public with Risk Communications: A Public Health Perspective," Science, Technology, and Human Values 12(3&4), 13-19.
Fischhoff, B. and L.A. Cox, Jr. (1985). "Conceptual Framework for Benefits Assessment." In J.D. Bentkover, V.T. Covello, and J. Mumpower (eds.), Benefits Assessment: The State of the Art. Dordrecht, The Netherlands: D. Reidel.
Fischhoff, B. and L. Furby. (1988). "Measuring Values: A Conceptual Framework for Interpreting Transactions," Journal of Risk and Uncertainty 1, 147-184.
Fischhoff, B., S. Lichtenstein, P. Slovic, S.L. Derby, and R.L. Keeney. (1981). Acceptable Risk. New York: Cambridge University Press.
Fischhoff, B. and D. MacGregor. (1983). "Judged Lethality: How Much People Seem to Know Depends Upon How They Are Asked," Risk Analysis 3, 229-236.
Fischhoff, B., P. Slovic, and S. Lichtenstein. (1980). "Knowing What You Want: Measuring Labile Values." In T. Wallsten (ed.), Cognitive Processes in Choice and Decision Behavior. Hillsdale, NJ: Erlbaum.
Fiske, S. and S. Taylor. (1985). Social Cognition. Reading, MA: Addison Wesley.
Freudenberg, W.A. and E.A. Rosa (eds.). (1984). Public Reactions to Nuclear Power: Are There Critical Masses? Boulder, CO: Westview.
Furnham, A.F. (1987). Lay Theories. London: Pergamon Press.
Gentner, D. and A.L. Stevens. (1983). Mental Models. Hillsdale, NJ: Erlbaum.
Goldberg, L.R. (1968). "Simple Models or Simple Processes? Some Research on Clinical Judgments," American Psychologist 23, 483-496.
Hasher, L. and R.T. Zacks. (1984). "Automatic and Effortful Processes in Memory," Journal of Experimental Psychology: General 108, 356-388.
Hearst, E. (1988). "Fundamentals of Learning and Conditioning." In R.C. Atkinson, R.J. Herrnstein, G. Lindzey, and R.D. Luce (eds.), Stevens' Handbook of Experimental Psychology, Volume 2. New York: Wiley-Interscience, pp. 1-109.
Henrion, M. and B. Fischhoff. (1986). "Assessing Uncertainty in Physical Constants," American Journal of Physics 54(9), 791-798.
Hershey, J.C. and P.J.H. Schoemaker. (1980). "Risk Taking and Problem Context in the Domain of Losses," Journal of Risk and Insurance 47, 111-132.
Hogarth, R.M. (ed.). (1982). New Directions for Methodology of the Social Sciences: Question Framing and Response Consistency. San Francisco: Jossey-Bass.
Hunter, J.E., F.L. Schmidt, and G.B. Jackson. (1982). Meta-analysis. Beverly Hills, CA: Sage.
Institute of Medicine. (1986). Confronting AIDS. Washington, D.C.: Institute of Medicine.
Janis, I. and L. Mann. (1977). Decision Making. New York: Free Press.
Kahneman, D., P. Slovic, and A. Tversky (eds.). (1982). Judgment under Uncertainty: Heuristics and Biases. New York: Cambridge University Press.
Keren, G. (1987). "Facing Uncertainty in the Game of Bridge: A Calibration Study," Organizational Behavior and Human Decision Processes 39, 98-114.
Krimsky, S. and A. Plough. (1988). Environmental Hazards. Dover, MA: Auburn House.
Lichtenstein, S. and B. Fischhoff. (1980). "Training for Calibration," Organizational Behavior and Human Performance 26, 149-171.
Lichtenstein, S., B. Fischhoff, and L.D. Phillips. (1982). "Calibration of Probabilities: State of the Art to 1980." In D. Kahneman, P. Slovic, and A. Tversky (eds.), Judgment under Uncertainty: Heuristics and Biases. New York: Cambridge University Press.
Lichtenstein, S., P. Slovic, B. Fischhoff, M. Layman, and B. Combs. (1978). "Judged Frequency of Lethal Events," Journal of Experimental Psychology: Human Learning and Memory 4, 551-578.
Linville, P., B. Fischhoff, and G. Fischer. (Submitted). "Judging the Risks of AIDS." Carnegie Mellon University.
McNeil, B.J., S. Pauker, H. Sox, Jr., and A. Tversky. (1982). "On the Elicitation of Preferences for Alternative Therapies," New England Journal of Medicine 293, 216-221.
Mischel, W. (1968). Personality and Assessment. New York: Wiley.
Mitchell, R.C. and R.T. Carson. (1989). Using Surveys to Value Public Goods: The Contingent Valuation Method. Washington, D.C.: Resources for the Future.
Murphy, A.H. and B.G. Brown. (1984). "A Comparative Evaluation of Objective and Subjective Weather Forecasts in the United States," Journal of Forecasting 3, 369-393.
Murphy, A.H. and H. Daan. (1985). "Forecast Evaluation." In A.H. Murphy and R.W. Katz (eds.), Probability, Statistics and Decision Making in the Atmospheric Sciences. Boulder, CO: Westview Press, pp. 379-437.
National Center for Health Statistics. (1987). "Knowledge and Attitudes about AIDS: Data from the National Health Interview Survey, August 10-30, 1987," Advance Data 146.
National Research Council. (1989). Improving Risk Communication. Washington, D.C.: National Academy of Sciences Press.
National Research Council. (1983). Priority Mechanisms for Toxic Chemicals. Washington, D.C.: National Research Council.
Nisbett, R.E. and L. Ross. (1980). Human Inference: Strategies and Shortcomings of Social Judgment. Englewood Cliffs, NJ: Prentice-Hall.
Nisbett, R.E. and T.D. Wilson. (1977). "Telling More than We Can Know," Psychological Review 84, 231-259.
Peterson, G.L., B.L. Driver, and R. Gregory (eds.). (1988). Amenity Resource Valuation. State College, PA: Venture Publishing.
Poulton, E.C. (1989). Bias in Quantifying Judgments. London: Lawrence Erlbaum.
Quadrel, M.J. (1990). Subjective Definitions of Probability Events. Unpublished doctoral dissertation, Department of Social and Decision Sciences, Carnegie Mellon University.
Raiffa, H. (1968). Decision Analysis. Reading, MA: Addison Wesley.
Rouse, W.B. and N.M. Morris. (1986). "On Looking into the Black Box: Prospects and Limits in the Search for Mental Models," Psychological Bulletin 100, 349-363.
Sharlin, H.I. (1987). "Macro-risks, Micro-risks, and the Media: The EDB Case." In B.B. Johnson and V.T. Covello (eds.), The Social and Cultural Construction of Risk. Dordrecht, The Netherlands: D. Reidel.
Slovic, P. and B. Fischhoff. (1983). "Targeting Risks: Comments on Wilde's 'Theory of Risk Homeostasis,'" Risk Analysis 2, 231-238.
Slovic, P., B. Fischhoff, and S. Lichtenstein. (1978). "Accident Probabilities and Seat-belt Usage: A Psychological Perspective," Accident Analysis and Prevention 10, 281-285.
Smith, V.K., W.H. Desvousges, F.R. Johnson, and A. Fisher. (1990). "Can Public Information Programs Affect Risk Perceptions," Journal of Policy Analysis and Management 9(1), 41-59.
Sternberg, R.J. and E.E. Smith (eds.). (1988). The Psychology of Human Thought. New York: Cambridge University Press.
Svenson, O. and B. Fischhoff. (1985). "Levels of Environmental Decisions," Journal of Environmental Psychology 5, 55-68.
Turner, C.F. and E. Martin (eds.). (1985). Survey Measure of Subjective Phenomena. Beverly Hills, CA: Russell Sage.
Tversky, A. and D. Kahneman. (1981). "The Framing of Decisions and the Psychology of Choice," Science 211, 453-458.
Tversky, A. and D. Kahneman. (1974). "Judgment under Uncertainty: Heuristics and Biases," Science 185, 1124-1131.
Wilde, G.J.S. (1988). "Risk Homeostasis Theory and Traffic Accidents," Ergonomics 31, 441-465.
Yates, J.F. (1990). Judgment and Decision Making. Englewood Cliffs, NJ: Prentice Hall.