
Two Causal Theories of Counterfactual Conditionals

Lance J. Rips
Psychology Department, Northwestern University, Evanston

Received 17 May 2009; received in revised form 8 August 2009; accepted 12 August 2009

Abstract

Bayes nets are formal representations of causal systems that many psychologists have claimed as

plausible mental representations. One purported advantage of Bayes nets is that they may provide a

theory of counterfactual conditionals, such as If Calvin had been at the party, Miriam would have left early. This article compares two proposed Bayes net theories as models of people’s understanding of

counterfactuals. Experiments 1–3 show that neither theory makes correct predictions about back-

tracking counterfactuals (in which the event of the if-clause occurs after the event of the then-clause),

and Experiment 4 shows the same is true of forward counterfactuals. An amended version of one of

the approaches, however, can provide a more accurate account of these data.

Keywords: Counterfactuals; Conditionals; Causal reasoning; Bayes nets

1. Introduction

Counterfactual conditionals are sentences of the form If A were the case, then C would be the case, and they commonly appear in both everyday talk (If Calvin were at the party, Miriam would have left early) and scientific discourse (If the sphere were released at 50 m, it would have hit the ground in 3 s). Part of the interest in counter-

factual conditionals stems from their relation to causal principles. If the sentence about

Calvin is true, then it is likely that causal facts about the interaction—physical and psy-

chological—made Miriam’s departure depend on Calvin’s presence. If the sentence

about the sphere is true, then causal factors made its time in flight depend on the

height at which it was released. Not all counterfactuals imply causal relations. We can

meaningfully say that if Axiom 9 were omitted, then Theorem 20.6 would still be true,

although the axiom is not a cause of the theorem. Nevertheless, the relation between

Correspondence should be sent to Lance J. Rips, Department of Psychology, Northwestern University, 2029

Sheridan Road, Evanston, IL 60208. E-mail: [email protected]

Cognitive Science 34 (2010) 175–221
Copyright © 2009 Cognitive Science Society, Inc. All rights reserved.
ISSN: 0364-0213 print / 1551-6709 online
DOI: 10.1111/j.1551-6709.2009.01080.x


counterfactual and causal connections is close enough to think that a theory of one

would illuminate the other.

A good theory of counterfactuals might shed light on causality. Philosophers have pro-

posed that an event C causally depends on an event A just in case two counterfactual condi-

tionals are true: If A were the case, then C would be the case, and If A were not the case, then C would not be the case (Lewis, 1973). For example, Miriam’s departure causally

depends on Calvin’s presence provided that: if Calvin were present, then Miriam would

leave early, and if Calvin were not present, then Miriam would not leave early. Psycholo-

gists have used similar formulations to decide whether two events in a story are causally

related (Trabasso & Sperry, 1985) and to predict whether people will attribute causality to

an action (Wells & Gavanski, 1989). However, counterfactual theories of causality face a

variety of counterexamples (see Collins, 2007, for a review of philosophical issues, and

Spellman & Mandel, 1999, for a review of psychological ones). Although proponents have

attempted to revise the theory to avoid the counterexamples (e.g., Lewis, 2000), it seems

safe to say that there are no completely satisfactory formulations. We will not pursue these

attempts here.

A good theory of causality might shed light on counterfactuals. The intuition is that if we

know the causal structure of a system in enough detail to simulate it, then we can use the

simulation to decide the truth of relevant counterfactuals (see Isard, 1974, for an early

instance of this idea in AI, and Jackson, 1977, for an example from philosophy). Consider,

for example, the hypothetical device pictured in Fig. 1, which consists of just four compo-

nents, A, B, C, and D. The arrows in the diagram indicate direct causal connections. The arc

joining the arrows from A to C and from B to C means that A and B jointly cause C (neither

alone is sufficient to cause C). Thus, when A and B operate together then C operates, and

when C operates then D does. Assume that all four components are operating at present.

Then it seems likely that if C were not operating D would not be operating. The causal facts

determine the counterfactual’s truth or probability. This article examines theories based on

this idea.1

Fig. 1. Sample causal system (the deterministic jointly caused device) from Experiments 1–3. Circles represent

components of the device, and arrows represent direct causal connections between them. The arc connecting two

arrows indicates that both components A and B must be operating in order for component C to operate.


Some recent causal theories of counterfactuals have used Bayes nets as the underlying

representation of causality (Hiddleston, 2005; Pearl, 2000). In these formulations, Bayes

nets supply a functional description of how the events within the system depend on their

immediate causes. It is common to depict the nets in graphical form with arrows directed

from the immediate causes to their effects, as in Fig. 1. Usually, no loops are allowed, that

is, no directed pathways from event A to event B to … to event A. In earlier versions of

Bayes nets (e.g., Pearl, 1988), statistical dependence or independence between the events

determined the presence or absence of a causal arrow. To handle counterfactual condition-

als, however, theorists have proposed stronger versions in which Bayes nets specify the

causal relations more directly (i.e., in a more top-down fashion).2

Cognitive psychologists have recently made strong claims on behalf of Bayes nets as

mental representations of causality. Evidence from people’s judgments about cause and

effect supports the hypothesis that people use Bayes nets to reason about interventions on

causal systems (Gopnik et al., 2004; Sloman & Lagnado, 2005; Steyvers, Tenenbaum,

Wagenmakers, & Blum, 2003; Waldmann & Hagmayer, 2005), the underlying structure of

categories (Rehder & Burnett, 2005), and the outcome of hypothesis testing (Gopnik et al.,

2004). Can Bayes nets also account for people’s judgments about counterfactuals? In

Sections 2 and 3, I outline two proposed theories of counterfactuals based on Bayes nets

(Hiddleston, 2005; Pearl, 2000). The experiments that follow are tests of whether either

theory is able to describe people’s judgments about counterfactual conditionals. The two

theories yield opposite predictions about the truth of certain counterfactuals, and the experi-

ments compare these predictions with people’s decisions about the same sentences. We will

see that although neither theory provides a complete account of the results, one of them

yields a better overall picture of the data’s trends. Some of the discrepancies between theory

and data are likely due to processing limits, such as limited memory or search time, and Sec-

tion 10 considers ways to modify the theory to bring it into line with these psychological

constraints.

Interest in the present theories, however, extends beyond theories of conditional sentences.

People do counterfactual reasoning whenever they contemplate a nonfactual state of affairs,

whether or not this takes a specifically conditional form (e.g., Byrne, 2007; Kahneman &

Varey, 1990; Roese, 1997; Tetlock & Henik, 2005). Many instances of wishing, decision

making, planning, inductive reasoning, and prediction involve thinking about hypothetical

consequences of a contemplated but nonactual event. A successful theory of counterfactual

conditionals should carry over to this wider domain. Section 11 takes a more general view

of counterfactual thinking: Envisioning a counterfactual event often requires imagining a

plausible way in which the event could have come about. Once we have such an explanation,

we can then reason forward to fill in the probable consequences of the initial event.

2. A pruning theory of counterfactuals

To get a Bayes net to determine the truth of a counterfactual sentence, a theory must

specify what would happen if the if-part of the counterfactual were true. For example,


suppose the counterfactual is If component C were not operating, then component D would not be operating. To find out whether this sentence is true of the Fig. 1 device,

we must know how to use the network to simulate a situation in which C is not

operating. Many changes to the network could bring this about, and these changes have

different implications for the counterfactual’s truth. Which one should we choose? One

appealing possibility is perhaps the simplest one: First, set the Bayes net so that infor-

mation directly specified in the conditional’s if-part is true. Then make any further

changes that are effects of the if-part, holding constant all remaining parts of the

network. If the then-part is true in this modified state, so is the entire counterfactual. In

the case of If C were not operating, then D would not be operating, we set C so it is

not operating, holding the values of A and B constant (as they are not effects of C).

We then let the network run and observe the result for D. In this case, D will not be

operating (as D is an effect of C). The conditional’s then-part is true in this situation,

and so is the conditional itself.

This Bayes net approach to counterfactual conditionals treats them on an analogy to what

might occur if someone were to manipulate a causal system to bring about the state

described in the if-part of the counterfactual (Pearl, 2000). For example, the counterfactual

If C were not operating, then D would not be operating would be true of the Fig. 1 device in

case directly stopping C in turn stops D.

This approach, which I will call pruning theory, assumes that each variable in the

Bayes net—for example, the state of each component in Fig. 1 as operating or not

operating—depends completely on two sets of factors. One set is the state of its

immediate (parent) causes. In Fig. 1, for example, the state of component C depends

on the state of both components A and B, and the state of component D depends on

that of C. A and B are parents of C, and C is the parent of D. The second set consists

of unobserved causes, not shown in Fig. 1, whose purpose is to account for variation

in the state of an observed variable that is not

explained by its parents. These unobserved causes are typically assumed statistically

independent of each other. Their role is similar to error terms in familiar statistical

models, such as regression or ANOVA, in the sense that they model residual variability.

We will use UX to denote an error term that directly affects variable X. The state of

its parent variables and its error variable completely determine the state of any vari-

able in the system.

In describing Bayes nets for present purposes, we can treat each observed or unob-

served event as having two states, on or off, which we will denote 1 and 0, respec-

tively. The Bayes net in Fig. 1 might specify, for example, that component A operates

(A = 1) if its error variable, which represents outside forces, has a value equal to 1

(UA = 1), and otherwise does not operate (A = 0 when UA = 0). Similarly, for compo-

nent B. Let us assume that C always operates whenever both A and B are operating

but not when A or B is operating alone. (This is how we described the device in the

experiments that follow.) Similarly, D always operates when C does. We can then fully

describe the system in Eq. (1):


a. A = UA
b. B = UB
c. C = A × B
d. D = C   (1)

For example, C operates (C = 1) only when both A operates (A = 1) and B operates (B = 1).

If either variable is 0, C will not operate (multiplying by 0 will mean C = 0).3
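To make Eq. (1) concrete, here is a minimal sketch in Python (mine, not the article’s; the function name run_device is illustrative) of the structural equations for the Fig. 1 device:

```python
# A minimal sketch of the structural equations in Eq. (1) for the deterministic
# jointly caused device of Fig. 1. All variables take the values 0 or 1.

def run_device(u_a, u_b):
    """Compute the states of components A-D from the error variables UA and UB."""
    a = u_a      # Eq. (1a): A = UA
    b = u_b      # Eq. (1b): B = UB
    c = a * b    # Eq. (1c): C = A x B, so C operates only when A and B both do
    d = c        # Eq. (1d): D = C
    return {"A": a, "B": b, "C": c, "D": d}

print(run_device(1, 1))  # {'A': 1, 'B': 1, 'C': 1, 'D': 1}: all components operating
print(run_device(1, 0))  # C and D come out 0 because B is not operating
```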

To find out whether a counterfactual conditional is true or false in such a system, pruning

theory recommends following three steps: We first determine the values of the error

variables (the UXs) based on the current state of the system, and we fix the variables at those

values. Second, we set any variables mentioned in the if-part of the counterfactual (the ante-

cedent) to the values specified there, ignoring the values of parents or error variables. This

amounts to removing or pruning the incoming links to these antecedent variables, as they no

longer determine the variables’ state. Third, we figure out the values of any variables in the

then-part of the counterfactual (the consequent) within the pruned-down system. To see

how these steps work, consider once again the counterfactual If C were not operating, then D would not be operating, applied to the Fig. 1 machine, and assume that all four compo-

nents are currently operating (i.e., A = B = C = D = 1). Equations (1a–b) imply that the

error variables must be equal to 1 in this current state (i.e., because A = 1, UA = 1; because

B = 1, UB = 1), and we freeze the state of these variables at 1. Second, as the antecedent

stipulates that C is not operating, we set C = 0, ignoring the incoming effects of A and B.

Finally, because C = 0, Eq. (1d) yields D = 0. Pruning theory, therefore, predicts that the

counterfactual is true: If C were not operating, then D would not be operating.
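The three steps just described can be written out as a short procedure. The sketch below is mine rather than the article’s (the name pruning_counterfactual and the dictionary representation are illustrative), and it hard-codes the Eq. (1) equations for the Fig. 1 device:

```python
# A sketch of pruning theory's three steps, assuming the structural equations of
# Eq. (1) and a two-valued (0/1) state for each component.

def pruning_counterfactual(actual, antecedent, consequent_var, consequent_val):
    """Evaluate a counterfactual by (1) fixing the error variables from the actual
    state, (2) clamping the antecedent variables and pruning their incoming links,
    and (3) reading off the consequent in the modified system."""
    # Step 1: infer the error variables from the actual state (Eqs. 1a-b).
    u_a, u_b = actual["A"], actual["B"]

    # Step 2: clamp any antecedent variables, ignoring their parents and errors.
    state = {"A": u_a, "B": u_b}
    state.update(antecedent)                  # e.g., {"C": 0} prunes the A,B -> C links

    # Step 3: let the remaining equations run downstream of the clamped variables.
    if "C" not in antecedent:
        state["C"] = state["A"] * state["B"]  # Eq. (1c)
    if "D" not in antecedent:
        state["D"] = state["C"]               # Eq. (1d)

    return state[consequent_var] == consequent_val

actual = {"A": 1, "B": 1, "C": 1, "D": 1}     # all four components currently operating
# "If C were not operating, then D would not be operating" comes out true:
print(pruning_counterfactual(actual, {"C": 0}, "D", 0))   # True
```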

3. A minimal-networks theory of counterfactuals

A second Bayes net theory, minimal-networks theory, requires less disruption to the sys-

tem in simulating the counterfactual situation (Hiddleston, 2005). The central idea of this

second approach is to consider all revisions to the system in which the counterfactual’s ante-

cedent is true, but which are (in a sense to be described) minimally different from the sys-

tem’s actual state. If the counterfactual’s consequent is true in all such revisions, then the

counterfactual itself is true; otherwise, false. The theory’s underlying intuition is that to

bring about a counterfactual state, any changes to a causal system should preserve the sys-

tem’s internal causal principles. Deviations from the actual state should occur only where

either (a) causal links are probabilistic rather than deterministic, or (b) external factors affect

the system’s operation. This idea differs from the one that pruning theory embodies. In that

theory, counterfactual changes are concentrated in the causal inputs to the antecedent vari-

ables, as we have seen. The two theories will therefore make different predictions about the

truth of a counterfactual, given certain causal set-ups.

In comparing an alternative state of a causal system to its actual state, we can sort its vari-

ables into three types: One class consists of variables whose value in the alternative state is


the same as in the actual state and all of whose parents (if any) also retain their actual val-

ues. Variables in this class are called intact variables. A second class consists of variables

whose value in the alternative state is different from that in the actual state but all of whose

parent variables (if any) again have the same values as in the actual state. These are break variables. All remaining variables are said to be up for grabs. For example, suppose, as

before, that components A, B, C, and D in our sample system are currently operating—each

actual value is 1—and consider an alternative in which A is operating (A = 1) but B, C, and

D are not (B = C = D = 0). Fig. 2 shows the actual state at the bottom and the alternative in

question at the middle left. Then A retains its actual value in the alternative, and as it has no

parents, A is intact. B, however, has a new value, and as it too has no parents, it constitutes a

break. Both C and D have new values, but the value of a parent of each has changed; so C and D are up for grabs. Fig. 2 shows intact variables in unshaded circles with intact borders,

break variables in unshaded circles with broken borders, and up-for-grab variables in shaded

circles.

To evaluate a counterfactual conditional, we look at all alternative states of the system in

which the conditional’s antecedent is true, but which are otherwise minimally different from

the current state. The counterfactual will be true just in case the consequent is true in all

such minimal networks. To flesh this out, let us say that a state is legal if it obeys all the cau-

sal laws that apply to the system. These causal laws are the rules that dictate how the system

works, as given by its operating principles. For example, in specifying the way the Fig. 1

device works, we noted that C operates whenever both A and B operate, and D operates

whenever C does. (These laws constrain which combination of values are possible in a state

of the system [e.g., D must be operating whenever C is], but they do not typically dictate the

value of any particular variable [C can be either on or off in a legal state]. Thus, the anteced-

ent of a counterfactual [e.g., If C were not operating] will normally be true in some legal

states and false in others.)

We can then say that a state S is minimal with respect to the actual state and a given

counterfactual if S meets the following conditions:

a. The antecedent is true in S.

b. S is legal.

c. S has as few breaks as possible.

d. S has as many intact variables as possible among those variables that are not effects of the antecedent.   (2)

To see what this amounts to, let us go back to our earlier example If C were not operating, then D would not be operating. We again assume that all components are operating in

the actual state, as in the bottom diagram of Fig. 2. There are eight possible states in which

C is not operating, which are given by the combinations of values of the remaining vari-

ables, but the causal laws of the system eliminate all but three of these as illegal. For

instance, any state in which C is not operating but D is operating will be illegal, because

according to our original specification of the system, D operates whenever C is operating.

Fig. 2 shows the three remaining possible states at the top and in the middle. The two states


Fig. 2. Minimal networks for the deterministic jointly caused device. Unshaded circles with broken borders rep-

resent break variables, unshaded circles with intact borders represent intact variables, and shaded circles represent

variables that are ‘‘up for grabs’’ (see text for an explanation of these variable types). Thick arrows connecting the

networks indicate that the network at the head of the arrow is minimal relative to the network at its tail.


in the middle of Fig. 2 are the minimal networks. The state at the top is not minimal, as it

has more break variables and fewer intact variables than either of the two middle states. In

both the minimal states, D is not operating; that is, the consequent of the conditional (D is not operating) is true in each of these states. Thus, the entire counterfactual conditional is

true. (D also happens not to be operating in the top state, but this state does not count in

determining the counterfactual’s truth.)
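The enumeration just illustrated can also be sketched in code. The version below is my own rough rendering, not the article’s: it treats conditions (c) and (d) of (2) as simple counts, which is enough to reproduce the worked example for the deterministic jointly caused device, although the theory’s minimality criterion is subtler for the probabilistic devices discussed later. The names (PARENTS, legal_joint, minimal_networks) are illustrative.

```python
from itertools import product

# Sketch of the minimal-networks evaluation for the deterministic jointly caused
# device of Fig. 1, with conditions (2c)-(2d) implemented as simple counts.

PARENTS = {"A": [], "B": [], "C": ["A", "B"], "D": ["C"]}   # Fig. 1 structure

def legal_joint(state):
    # Causal laws of the device: C operates iff both A and B do; D operates iff C does.
    return state["C"] == state["A"] * state["B"] and state["D"] == state["C"]

def breaks_and_intact(state, actual):
    # A break keeps its parents' actual values but changes its own value;
    # an intact variable keeps both. Everything else is "up for grabs".
    breaks, intact = set(), set()
    for var, parents in PARENTS.items():
        if all(state[p] == actual[p] for p in parents):
            (breaks if state[var] != actual[var] else intact).add(var)
    return breaks, intact

def minimal_networks(actual, antecedent, effects_of_antecedent, legal):
    states = [dict(zip("ABCD", vals)) for vals in product((0, 1), repeat=4)]
    candidates = [s for s in states
                  if legal(s) and all(s[v] == val for v, val in antecedent.items())]
    scored = []
    for s in candidates:
        brk, intact = breaks_and_intact(s, actual)
        scored.append((len(brk), -len(intact - set(effects_of_antecedent)), s))
    best = min((n_breaks, neg_intact) for n_breaks, neg_intact, _ in scored)
    return [s for n_breaks, neg_intact, s in scored if (n_breaks, neg_intact) == best]

actual = {"A": 1, "B": 1, "C": 1, "D": 1}
nets = minimal_networks(actual, {"C": 0}, {"D"}, legal_joint)
print(len(nets))                          # 2: the two middle states of Fig. 2
print(all(s["D"] == 0 for s in nets))     # True: the counterfactual is true
```

Replacing the final check with all(s["A"] == 1 for s in nets) returns False for the same two networks, which anticipates the backtracking case taken up in the next section.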

4. Some differences between the pruning and minimal-networks theories

Both pruning and minimal-networks theories yield the same conclusion about the count-

erfactual If C were not operating, then D would not be operating for the system in Fig. 1. Both predict that this counterfactual is true. But the predictions are not always identical.

These theories agree that the truth of a counterfactual depends on whether the consequent is

true in states similar to the actual one but in which the antecedent is true; they differ, how-

ever, in the way they compute similarity. Pruning theory imposes similarity through its error

variables. By clamping the value of these UX variables to their actual values, the resulting

states will be close to the actual state, consistent with the changes that the antecedent brings

about. Minimal-networks theory, however, dispenses with error variables. Instead, the the-

ory ensures similarity by minimizing breaks and maximizing intact values of observed vari-

ables. The theories also differ in the revisions they allow to the actual state when evaluating

a counterfactual. Both theories permit changes (pruning or breaks) in the causal stream; but

pruning theory locates these changes just before the event in the antecedent, whereas

minimal-networks theory locates its changes only where they are consistent with the causal

laws governing the system.4

These differences in the theories can lead to different predictions. This can occur, for

example, when the antecedent of a counterfactual describes an effect and the consequent

describes one of its causes. Within the sample system of Fig. 1, the counterfactual If C were not operating, then A would be operating is a case of this sort. The minimal networks for

this counterfactual are the same as in our earlier example: the two middle states of Fig. 2.

As A is operating in one of these states, but not in the other, the minimal-networks approach

deems this counterfactual false. (Recall that, according to minimal-networks theory, a

counterfactual is true if its consequent is true in all minimal networks; otherwise the count-

erfactual is false.) Intuitively, if either A is not operating or B is not operating, then C will

not operate; so we cannot tell whether A is operating if all we know is that C is not.

Pruning theory evaluates If C were not operating, then A would be operating by first fix-

ing the values of the U variables at 1, as before. Because the antecedent specifies that C is

not operating, C is set to 0 and the inputs to this variable are removed: The arrows from A and B are pruned, or equivalently, Eq. (1c) is replaced by C = 0. This is the same proce-

dure we followed in evaluating If C were not operating, D would not be operating. In the

present case, however, the value of A in the resulting system is 1, according to Eq. (1a).

Thus, the counterfactual is true. In effect, removing the arrows or equations relating the

behavior of A and B to C means that C’s status is no longer diagnostic of A’s or B’s status.


Because A is operating in the actual state and because C casts no light on A, pruning theory

yields the result that A would still be operating. This verdict is the opposite of the one we

reached in minimal-networks theory.
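The contrast can be checked directly. Assuming the pruning_counterfactual and minimal_networks sketches from the earlier sections are in scope (both are my illustrative helpers, not the article’s), the two theories return opposite verdicts on this backtracking counterfactual:

```python
actual = {"A": 1, "B": 1, "C": 1, "D": 1}
# Pruning theory: clamping C = 0 leaves A = UA = 1, so the counterfactual is true.
print(pruning_counterfactual(actual, {"C": 0}, "A", 1))                  # True
# Minimal-networks theory: A is off in one of the two minimal networks, so it is false.
nets = minimal_networks(actual, {"C": 0}, {"D"}, legal_joint)
print(all(s["A"] == 1 for s in nets))                                    # False
```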

Pruning theory approaches counterfactuals in much the same way it approaches state-

ments about explicit interventions on a causal system. This framework models a direct

manipulation on a variable by removing incoming arrows and setting the value of the vari-

able to a constant, the result of the manipulation. For example, if someone acts on C to

stop it from operating, he or she removes the causal connections from A and B, setting C to 0. This means that pruning theory’s decision about a counterfactual such as If C were not operating, would A be operating? should be similar to that for a corresponding

question about an intervention: Suppose someone stopped C from operating, would A be operating? Sloman and Lagnado (2005) point out this similarity, and some of their experi-

ments contain parallel forms of counterfactual and intervention questions. Their Experi-

ment 5, for example, described a simple two-variable device: ‘‘All rocket ships have two

components, A and B. Component A causes Component B to operate. In other words, if A,

then B.’’ One group of participants had to answer a counterfactual question about the

device (‘‘Suppose Component B were not operating, would Component A still operate?’’).

A second group answered a question about an explicit intervention (‘‘Suppose Component

B were prevented from operating, would Component A still operate?’’). The results

showed that although 89% of participants answered ‘‘yes’’ to the intervention item, only

68% of participants answered ‘‘yes’’ to the counterfactual (Sloman & Lagnado, 2005,

Table 5).5

The aim of the present experiments is to determine whether either pruning theory or

minimal-networks theory can predict how people answer counterfactual questions, such as

If C were not operating, would A be operating? Counterfactuals can specify more precisely

the way in which the antecedent event can come about. Sloman and Lagnado’s (2005) pre-

vention counterfactuals are a case in point. So are observation counterfactuals, such as If someone observed C not operating, would A be operating? which Sloman and Lagnado

also used in their experiments. An unlimited number of further specifications are also pos-

sible: If someone stopped C from operating by disabling the connection from A to C, would A be operating? If someone stopped C from operating by disabling A, would A be operating? or If someone observed C not operating because A was not operating, would A be operating? and so on. The answers to these questions are likely to be different because

they envision distinct ways of disrupting the normal workings of the device. Because the

goal of the present experiments is to determine the normal or natural possibilities people

rely on in dealing with counterfactuals, the experiments employ unmarked versions of such

questions: If C were not operating…? or If C had not operated…? Pruning theory and

minimal-networks theory make different predictions about the answers to these unmarked

counterfactuals, and the issue is which set of predictions is correct. Thus, the unmarked

case seems the best (most neutral) test bed for the theories. As Sloman and Lagnado

(2005, p. 27) note, ‘‘The data clearly show that counterfactual statements are sometimes

interpreted as causal interventions, but the conditions favoring this interpretation are still

not entirely clear.’’


5. An overview of the experiments

This study compares the predictions of the pruning and minimal-networks theories for

counterfactual conditionals. Experiments 1–3 examine participants’ judgments about the

truth of counterfactuals like our example If C were not operating, would A be operating? for

which the two theories yield contrasting predictions. These studies vary the underlying cau-

sal structure (Experiments 1–3), the phrasing of the counterfactual question (Experiment 2),

and the base rates of the component variables (Experiment 3), as these factors help differen-

tiate the theories.

For the causal system of Fig. 1, the sentence If C were not operating, A would be operating is a backtracking counterfactual, one in which the antecedent describes an effect and the

consequent describes its cause. Some theories exclude backtracking counterfactuals from

analysis (e.g., Lewis, 1973, 1979), and these counterfactuals are sometimes clumsy to

express because of their combination of tense and mood. It is a mouthful to say If Calvin had gotten an F in the course, then he would have had to have forgotten to have turned in his assignments last month. However, as Bennett (2003) argues, there is no reason to think

that such backtracking counterfactuals are incoherent or useless, and a successful theory of

counterfactuals should be able to cover both forward (predictive) and backtracking (diag-

nostic) conditionals. Nevertheless, backtracking counterfactuals may be something of a

special case or may involve a special type of ambiguity. Experiment 4, therefore, extends

the comparison of the theories to forward counterfactuals.

6. Experiment 1: Structural complexity and backtracking

Participants in this experiment read short descriptions of four-component devices, and for

each device, they decided whether a counterfactual conditional was true or false. The

devices were similar to Fig. 1, but they varied in two respects. The instructions described

the C component of two of the devices as operating only when both A and B were operating,

but they described the C component of the other two devices as operating when either A or

B was operating. Table 1 shows the first two jointly caused devices at the top and the latter

two separately caused devices at the bottom. (Table 1 adopts our earlier convention of using

an arc to indicate that the two connected causal arrows are both necessary to produce the

effect; see note 2.) The other variation among the devices is that, for two of them, compo-

nent C operates on a deterministic basis but operates on a probabilistic basis for the remain-

ing two. Table 1 indicates this difference with solid arrows for deterministic and dashed

arrows for probabilistic connections. For example, the description of the device in Table 1

A—the deterministic jointly caused device—mentioned that component A’s operating and

component B’s operating together always cause component C to operate, whereas the

description of the probabilistic jointly caused device in Table 1B mentioned that component

A’s operating and component B’s operating together usually cause component C to operate.

For each device, the instructions stated that all four components were currently operating

and then immediately asked the critical question: If component C were not operating, would


component A be operating? Pruning theory makes exactly the same prediction for all four

devices in Table 1: If C were not operating, then A would still be operating. In each case,

the theory sets the error variable for component A, UA, to 1, as A is operating in the actual

situation. The procedure then removes causal connections from A to C and from B to C, and

it sets the value of C to 0. Under these conditions, A would still be operating: as UA is 1, so

is A, by Eq. (1a). This is equivalent to Sloman and Lagnado’s (2005) ‘‘undoing’’ prediction.

The equation for C will differ for the four devices because of the change in structure. How-

ever, these differences do not affect the truth of the counterfactual.6

The minimal-networks approach makes predictions that are also the same for all four

devices in Table 1, but predictions opposite those of pruning theory: The counterfactual If C were not operating, A would be operating is false for all four devices. We have already

found that the counterfactual is false for the deterministic jointly caused device in Table 1A,

because A is not operating in one of the minimal networks (see Fig. 2). For the deterministic

separately caused device (Table 1C), either A or B can cause C. Thus, the only way C could

not be operating, according to this theory, is if both A and B are not operating. All other pos-

sibilities violate the causal laws governing this device and are excluded from the range of

possible networks. The only minimal network that satisfies these causal constraints is the

one in which all four components are off, making the counterfactual false.
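The same enumeration, rerun with the disjunctive law of the separately caused device, bears this out. The snippet below is again my own sketch; it assumes the minimal_networks helper defined earlier is in scope, and only the legality test changes:

```python
# Deterministic separately caused device: C operates iff A or B does; D iff C does.
def legal_separate(state):
    return state["C"] == max(state["A"], state["B"]) and state["D"] == state["C"]

actual = {"A": 1, "B": 1, "C": 1, "D": 1}
nets = minimal_networks(actual, {"C": 0}, {"D"}, legal_separate)
print(nets)                               # [{'A': 0, 'B': 0, 'C': 0, 'D': 0}]
print(all(s["A"] == 1 for s in nets))     # False: the counterfactual is false
```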

The minimal-networks approach deals with the two probabilistic devices by assuming

that the individual causal connections produce their effects on a probabilistic basis. In

the two devices in Table 1B and 1D, when A and B are operating, C need not operate, as the

Table 1

Percentage of ‘‘Yes’’ responses to the question ‘‘If component C were not operating, would component A be operating?’’ for the four device types in Experiment 1

                          Deterministic        Probabilistic
Jointly caused         a. 55.3 ± 7.2        b. 59.6 ± 7.2
Separately caused      c. 12.8 ± 4.9        d. 42.6 ± 7.2

Note. Percentages are given as ±1 SD, based on 47 observations.


connections from the causes to the effect do not always run their course. Thus, a state in

which both A and B are operating but C is not operating no longer violates these probabilis-

tic devices’ causal laws. The minimal networks therefore include the states in which both A and B are operating, as well as the two states in which one is operating and the other is not,

as shown in Fig. 3 for the probabilistic jointly caused device. Because A is not operating in

one of these minimal networks, the counterfactual is once again false.7

In short, pruning theory predicts that, for all four devices in Table 1, participants should

answer ‘‘yes’’ to the question If component C were not operating, would component A be operating? whereas minimal-networks theory predicts they should answer ‘‘no.’’

6.1. Method

Participants in this experiment read descriptions of each of the Table 1 devices. After

each description, they answered a counterfactual question about components A and C.

6.1.1. Procedure and materials

The participants received a booklet containing a page of instructions and four additional

pages. The instructions mentioned that they would be reading about hypothetical devices

and ‘‘would make some decisions about the way the devices work under certain condi-

tions.’’ Participants were to read the remaining pages of the booklet in order and were not to

return to earlier pages once they were done with later ones.

Each page contained a description of one of the devices in Table 1. For example, the

description of the device in Table 1A stated:

Fig. 3. Minimal networks for the probabilistic jointly caused device. For the probabilistic separately caused

device, the networks are the same (but without the connecting arcs).


Professor McNutt of the Department of Engineering has designed a device called a

blicket. The blicket has only four components, labeled A, B, C, and D. The device works

in the following way:

Component A’s operating and component B’s operating together always cause compo-

nent C to operate.

Component A’s operating alone never causes component C to operate.

Component B’s operating alone never causes component C to operate.

Component C’s operating always causes component D to operate.

The description of the Table 1B device was the same, except for a change in its name

(from blicket to philp) and a change to the top sentence in the above list. The new sentence

read, ‘‘Component A’s operating and component B’s operating together usually cause com-

ponent C to operate.’’

The description of the device in Table 1C was similar, but in place of the first three sen-

tences, participants learned that ‘‘Component A’s operating always causes component C to

operate’’ and ‘‘Component B’s operating always causes component C to operate.’’ This

device also had a new name (a glux). For the probabilistic version of this device in

Table 1D, the word ‘‘usually’’ replaced ‘‘always’’ in these two sentences, and the name of

the device was a flam. No diagrams accompanied the descriptions.

After the device’s description, participants read: ‘‘Imagine that just now components A,

B, C, and D are all operating. If component C were not operating, would component A be

operating?’’ Participants circled a response, ‘‘yes’’ or ‘‘no,’’ in their booklet and then rated

their confidence in their answer by circling a number on a 0–9 scale. The scale appeared as

a horizontal line of numerals, with 0 labeled ‘‘not at all confident’’ and 9 labeled ‘‘couldn’t

be more confident.’’ Two booklets were constructed for each of the 24 possible orders of

the devices by arranging the booklet’s pages in the appropriate sequence. After reading the

instructions, the participants proceeded through the booklets at their own pace.

6.1.2. Participants

The booklets were assigned randomly to 48 participants. All were undergraduates at

Northwestern University, and they received credit in their introductory psychology class for

taking part. During the same experimental session, they also filled out a number of addi-

tional questionnaires from unrelated studies. Participants took about 10 min to complete the

booklet, and the entire session lasted approximately 25 min. We tested the participants in

groups of two to six individuals.

6.2. Results and discussion

The data of most interest in this experiment are the percentages of ‘‘yes’’ responses to

the question If component C were not operating, would component A be operating? These

percentages varied across the different devices. Table 1 displays the relevant percentages

and shows that most participants thought the conditional was false of the deterministic sepa-

rately caused device, but they were more evenly divided about the remaining devices.


According to pruning theory, participants should answer ‘‘yes’’ to the counterfactual

question for all four devices, whereas minimal-networks theory predicts they should answer

‘‘no.’’ The percentage of ‘‘yes’’ responses, however, was 42% overall, and the responses

were far from uniform. Participants produced more ‘‘yes’’ responses for probabilistic

devices (51%) than for deterministic ones (33%), and they also produced more ‘‘yes’’

responses for the jointly caused devices (56%) than for the separately caused ones (28%).

However, these main effects and the interaction between them are due to the low percentage

of ‘‘yes’’ responses for the deterministic separately caused device. Because these responses

are binary ‘‘yes’’ or ‘‘no’’ answers, the analysis employed a repeated-measures model for

categorical data (Grizzle, Starmer, & Koch, 1969), which yields the Wald statistic (QW) as a

test of the model’s effects. In these terms, the results produced significant effects of

determinism (probabilistic vs. deterministic connections) [QW(1) = 10.43, p = .0012], struc-

ture (joint vs. separate causes) [QW(1) = 23.53, p < .0001], and an interaction between these

factors [QW(1) = 7.60, p = .0058]. One participant skipped an answer to one of the ques-

tions, and the analysis omitted this participant’s data.

Responses fell in the 40–60% range for all but the deterministic separately caused device,

and this raises the question of whether the intermediate values reflect participants’

uncertainty about the correct answer or, instead, a mixture of strategies about each of which

participants were relatively certain. We can assess their degree of certainty from the confi-

dence ratings they provided. These data show that mean confidence was fairly high for the

deterministic separately caused device (6.8 on our 0–9 scale) but lower for the remaining

devices (5.4 for the deterministic jointly caused device, 5.4 for the probabilistic jointly

caused device, and 5.0 for the probabilistic separately caused device). ANOVA of the confi-

dence ratings confirmed this general pattern. Confidence was higher for deterministic than

probabilistic devices [F(1,47) = 18.75, MSe = 2.27, p < .0001] and for separately caused

than jointly caused devices [F(1,47) = 4.16, MSe = 3.45, p = .04], but these were a function

of the interaction between determinism and structure [F(1,47) = 15.43, MSe = 2.76,

p = .0003]. This suggests that participants were relatively uncertain for all but the determi-

nistic separately caused device. It is still possible, of course, that individual differences exist

in participants’ approach to the problems; but such differences do not obviously favor either

pruning or minimal-networks theory. Only three of the 48 participants consistently

responded ‘‘yes’’ in the way pruning theory predicts, and only 13 consistently responded

‘‘no’’ in the way minimal-networks theory predicts.

Each participant in this experiment evaluated the counterfactual for all four devices.

Although the order with which they did so was balanced, it is possible that earlier answers

influenced later ones in a way that makes the overall percentages misleading. However, lim-

iting the analysis to just the first answer from each participant produces a pattern of results

very close to that in Table 1. The percentage of ‘‘yes’’ answers was 50% for the determinis-

tic jointly caused device, 17% for the deterministic separately caused device, 58% for the

probabilistic jointly caused device, and 50% for the probabilistic separately caused device.

The differences between these percentages and those in Table 1 are all within eight percent-

age points. Analyses similar to those above produced no significant effects, however,

presumably because of the small number of observations (only 12 per device).


Both theories have difficulty explaining the variation in responses, especially the notice-

ably lower percentage for the deterministic separately caused device. In Experiment 3, we will

explore modifications to the theories that may bring them in line with the data. But perhaps

the problem is with the data rather than with the theories. Participants may have failed to

interpret the question as one involving counterfactual circumstances, instead understanding

the question to be about what is true in the actual situation. If so, the pruning and minimal-

networks theories no longer apply, and all bets are off about their predictions. The instructions

in Experiment 1 stated that ‘‘just now components A, B, C, and D are all operating’’ immedi-

ately before putting the key question ‘‘If component C were not operating, would component

A be operating?’’ But although this wording seems to underline the counterfactual nature of

the question, participants may have missed it, and we can explore whether stronger signals

about the contrary-to-fact status of the antecedent will boost support for either theory.

7. Experiment 2: Effects of question wording

Conditionals with ‘‘would’’ in the consequent—such as our example If component C were not operating, would component A be operating?—sometimes imply or presuppose

that the antecedent and consequent are both false. In that case, they are truly counterfactual.

But we can also use these constructions when we are uncertain about the antecedent’s and

consequent’s status. Suppose one of the Table 1 devices is in some distant location, and we

do not know whether its components are currently off or on. We can still meaningfully use

If component C were not operating, then component A would be operating to describe what

is going on, based on our understanding of the device’s workings. Listeners might assume

under these conditions that we are not envisioning any special means to get C to stop operat-

ing. In particular, we are not imagining the sort of intervention that pruning theory dictates.

Instead, we are assuming that the machine is working normally and simply describing one

of its possible states. If participants in Experiment 1 were interpreting our question in this

noncounterfactual way, then pruning theory’s predictions would not apply, and our failure

to find evidence for pruning theory would not be surprising. As the previous sentence illus-

trates, these conditionals are common in stating hypotheses, guesses, and predictions where

the speaker’s or writer’s uncertainty is important. Let us call them ‘‘hypothetical condition-

als’’ to distinguish them from true counterfactual conditionals (following Dawid, 2007).

One way to emphasize the counterfactual over the hypothetical interpretation is to alter

the tense of a conditional. Linguistic analysis of counterfactual constructions suggests that

the tense system is ordinarily responsible for signaling, not only the time but also the

(counter)factual status of an event. Past tense marks the time of an event to be other than the

time of the utterance, but it can also indicate that the circumstances of an event are other

than the actual circumstances (Iatridou, 2000). For example, Fred wishes he had a drink uses the past tense had to convey the fact that Fred lacks the drink. (The sentence means that

Fred wishes for a drink now; so had denotes counterfactuality, not past time.) That is, had (in combination with wishes) implies that Fred’s having a drink now is counterfactual. To

convey a counterfactual state in the past, English and other languages use a double


past tense (past perfect or pluperfect) form. Fred wishes he had had a drink implies

that sometime in the past Fred lacked the drink he wanted. Likewise, we can bring out the

counterfactual interpretation of a conditional by employing an extra layer of past tense. For

example, compare the questions in Sentences (3a,b):

a. If component C had not operated, would component A have operated?

b. If component C were not operating, would component A be operating?   (3)

In both questions, would provides one layer of past tense (would is the past tense form of

will; see Iatridou, 2000), but in Sentence (3a) have operated supplies an extra past tense

operator. Because one of the past tense markers in Sentence (3a) must imply counterfactual-

ity, (3a) conveys a counterfactual interpretation more strongly than does (3b).

The present experiment takes advantage of these linguistic facts to see whether more

explicitly counterfactual questions will produce support for the pruning or minimal-networks

theories. As in Experiment 1, participants read descriptions of the four devices in Table 1,

and they learn that components A, B, C, and D are all operating. However, participants in the

present experiment then answer (3a) on some trials. This question should clarify the counter-

factual nature of the question. If the failure of the theories in Experiment 1 was due to uncer-

tainty about the factual or counterfactual status of the question, this shift in wording should

bring the results more in line with the theories’ predictions. To compare the effects of the old

and new wording, we used the (3a) form on some trials and the (3b) form on the others.

7.1. Method

Participants received a nine-page booklet—one page of instructions followed by eight

pages of problems. The instructions duplicated those of Experiment 1. The eight problems

combined the two questions in (3a–b) with the four devices of Table 1. The participants

filled out each problem page as in Experiment 1 by circling an answer to the question

(‘‘yes’’ or ‘‘no’’) and rating their confidence in their answer (on a 0–9 scale).

In half the booklets, the first four problems formed a block and asked (3a) for each

device; the second block of four problems asked (3b). In the remaining booklets, the first

block used (3b), and the second (3a). For a particular booklet, the order of the devices in

the two blocks was the same. Across booklets, however, all 24 possible orders of the

devices appeared. Thus, there were 48 booklets in all, which were formed by varying

question order and device order. The problems within a booklet used eight different

nonsense words (e.g., glux) to name the devices. But except for these names and the

alternative question wording, the problems’ phrasing was the same as in Experiment 1.

The booklets were randomly assigned to 48 participants. These participants were from

the same population as those in Experiment 1, but they had not taken part in the earlier

study. They were tested under the same conditions as in Experiment 1. Four participants

in this group failed to complete their booklets (e.g., by failing to circle some of the

answers), and the data from these participants do not appear in the analyses reported

here.


7.2. Results and discussion

If the reason for the theories’ failure in Experiment 1 was that participants failed to grasp

the counterfactual status of the question, then the more explicit wording in the present study

should lend more support to the models. On critical trials, participants learned that compo-

nents A, B, C, and D were all currently operating and were immediately asked, If component C had not operated, would component A have operated? This strongly implies that C’s not

operating is contrary to fact. However, results based on this wording are quite similar to

those of Experiment 1 and provide no further warrant for the theories. Pruning theory

predicts consistent ‘‘yes’’ responses for the four types of machines, whereas minimal-net-

works theory predicts consistent ‘‘no’’ responses. Table 2 (columns headed had/would have) reveals, however, that neither pattern matches the data. Under the new had/would have wording, participants gave ‘‘yes’’ responses to the deterministic separately caused

machine on 16% of trials, comparable to the 13% in Experiment 1. For the rest of the

machines, responses were at an intermediate level: 57% ‘‘yes’’ overall, similar to the 52%

figure from the first experiment.

The present experiment also contains an internal comparison between the new wording

and the old (If component C were not operating, would component A be operating?), as

participants received the old wording on one block of trials and the new wording on another.

Table 2

Percentage of ‘‘Yes’’ responses to the questions ‘‘If component C were not operating, would component A be operating?’’ and ‘‘If component C had not operated, would component A have operated?’’ for the four device types in Experiment 2

                               Deterministic                        Probabilistic
                          Were/would   Had/would have          Were/would   Had/would have
Jointly caused         a. 61.4 ± 7.3   54.5 ± 7.5           b. 65.9 ± 7.1   65.9 ± 7.1
Separately caused      c. 15.9 ± 5.5   15.9 ± 5.5           d. 40.9 ± 7.4   50.0 ± 7.5

Note. Percentages are given as ±1 SD, based on 44 observations.


This comparison appears in Table 2 and shows comparable results for the more explicitly

(had/would have) and less explicitly (were/would) counterfactual phrasing. For the were/would items, ‘‘yes’’ responses were low for the deterministic separately caused device

(16%) and intermediate for the three other devices (56%). A categorical analysis, similar to

that of Experiment 1, confirms this impression. The analysis yielded significant effects of

device structure [QW(1) = 33.52, p < .0001] and determinism [QW(1) = 17.49, p < .0001],

as well as an interaction between them [QW(1) = 8.44, p = .0037]. These effects echo those

of the first study. But there is no significant effect of wording [QW(1) = 0.01, p = .9027] nor

interaction of wording with the other variables [QW(1) < 1 and p > .34 for the three inter-

action terms].

Participants were somewhat more confident of their response with the new had/would have wording than with the old were/would question. Confidence on the 0–9 scale was

6.53 for the former and 5.33 for the latter [F(1,43) = 4.38, MSe = 3.84, p = .0423]. This

effect suggests that participants were considering the wording and may have found the

new wording more clear-cut in its meaning. However, wording had no differential impact

on participants’ confidence about the devices. As in Experiment 1, confidence was higher

for the deterministic separately caused device (7.4) than for the other three (5.6 for the

deterministic jointly caused device, 5.3 for the probabilistic jointly caused device, and 5.4

for the probabilistic separately caused device). This difference produced main effects for

structure [F(1,43) = 9.46, MSe = 7.59, p = .0036] and determinism [F(1,43) = 31.05,

MSe = 4.07, p < .0001], as well as for the interaction between them [F(1,43) = 15.69,

MSe = 3.91, p = .0003]. But type of wording did not affect these differences, as wording

did not interact with either of the other variables (Fs for all three possible interactions

were <1).

Participants in this experiment answered questions with were/would and had/would have in separate blocks of trials. Carry-over effects may, therefore, have reduced the dif-

ferences due to wording. If we examine just the first block of trials, however, we find

similar patterns in participants’ answers. Participants who answered the had/would have questions produced ‘‘yes’’ answers on 26% of trials for the deterministic separately caused

device, but answered ‘‘yes’’ more often for the remaining devices (52% for the determin-

istic jointly caused device, 70% for the probabilistic jointly caused device, and 48% for

the probabilistic separately caused device). For participants who answered the were/would questions, the corresponding figures were 19% for the deterministic separately caused

device, 67% for the deterministic jointly caused device, 71% for the probabilistic jointly

caused device, and 38% for the probabilistic separately caused device. A categorical

analysis of these data produced only effects of structure [QW(1) = 19.75, p < .0001] and

determinism [QW(1) = 7.17, p = .0074]. In particular, no main effect of wording

materialized nor any interaction of wording with the remaining variables [QW(1) < 1.3,

p > .25].

In short, revising the wording of the question to make it more clearly counterfactual did

not alter the pattern of answers we found in Experiment 1. Participants made more confident

responses under the new wording, but they continued to give a low rate of ‘‘yes’’ responses

to the deterministic separately caused device and a medium rate of ‘‘yes’’ to the other


devices. This suggests that failure to recognize the counterfactual nature of the question was

not the reason for the data's departure from the predictions of the pruning and minimal-networks theories.

The consistently low rate of ‘‘yes’’ answers for the deterministic separately caused

device, however, hints at a different way to square minimal-networks theory with these

results. This machine stands out, according to this theory, because it is the only device in

which all minimal networks have component A not operating. The probabilistic devices

have three minimal networks each, with A operating in two of them (see Fig. 3). The deterministic jointly caused device has two minimal networks (the two in the middle row of Fig. 2), with A operating in one. The deterministic separately caused device, however, has just one minimal network, in which A is not operating. As we noticed earlier, both A and B must be off to turn C off, according to the laws governing this device; hence, if C is off, both A and B must be off as well. A variation on the minimal networks account might take advantage of this difference to give a more accurate explanation of the results. Perhaps participants consulted the correct minimal networks—those specified by the theory—but

employed an alternative response strategy, answering ‘‘no’’ when the consequent of the

conditional is false in all minimal networks and ‘‘yes’’ when the consequent is true in all

minimal networks, and splitting their vote otherwise. Perhaps more explicit response

instructions could garner more support for minimal networks.
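As a concrete illustration of this alternative response rule, the following sketch (not part of the article) computes the predicted rate of ''yes'' answers from the truth value of the consequent in each minimal network; representing a device simply as a list of such truth values is an assumption made only for the example.

```python
# Illustrative sketch (not from the article) of the alternative response strategy:
# answer "no" if the consequent is false in every minimal network, "yes" if it is
# true in every one, and split the vote otherwise.

def predicted_yes_rate(consequent_values):
    """consequent_values: truth of the consequent in each minimal network."""
    if all(consequent_values):
        return 1.0      # consequent true everywhere -> "yes"
    if not any(consequent_values):
        return 0.0      # consequent false everywhere -> "no"
    return 0.5          # mixed evidence -> split the vote

# Deterministic separately caused device: A is off in its single minimal network.
print(predicted_yes_rate([False]))              # -> 0.0
# Probabilistic devices: A operates in two of their three minimal networks (Fig. 3).
print(predicted_yes_rate([True, True, False]))  # -> 0.5
```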

8. Experiment 3: Effects of necessity and base rates

One explanation for the results of Experiments 1 and 2 is that participants were uncertain how to respond to the counterfactual unless it was causally necessary or causally impossible. As just noted, a response rule of this sort could obscure evidence for the minimal-networks approach. To check this possibility, the present experiment retains the four devices from Table 1 but compares two different questions about them. One group of participants, the yes/no group, answered the same yes/no question as in Experiment 1 (e.g., If component C were not operating, would component A be operating?). A second group, the necessity group, instead decided whether an equivalent counterfactual sentence (If component C were not operating, component A would be operating) necessarily followed from given facts about the device and ''the causal laws that govern our world.'' Minimal-networks theory predicts that participants in the necessity group should say that the counterfactual follows only if the consequent is true in all minimal networks of the antecedent

and should otherwise say that the counterfactual does not follow. Because the consequent

(A is operating) is false in at least one minimal network for each device, minimal-networks

theory predicts ‘‘doesn’t follow’’ responses to all devices under this more conservative

response rule.

The problems included one further variation as an additional probe of how participants

understood the instructions. For each device, the description specified that one of the

components, A or B, operated 95% of the time and that the other component independently

operated 5% of the time. These base rates could affect participants’ certainty about whether


the relevant components are operating when the antecedent is true. If so, they may affect

answers to the yes/no question. Base rates, however, should not influence participants' decisions about whether the counterfactual necessarily follows. Consider, for example, the minimal networks in the middle row of Fig. 2. If component A operates 95% of the time and

component B operates 5% of the time, then the state shown at the left will occur more often

than the state at the right. Both states, however, have non-zero probability, and the rightmost

state rules out the possibility that A operates in all minimal networks. Thus, participants

who are asked whether it necessarily follows that If C were not operating, then A would be operating should respond that it does not, despite the variation in base rates.

8.1. Method

As in Experiment 1, participants received a booklet that contained a page of instructions,

followed by four additional pages, one for each of the Table 1 devices. Participants in

the yes/no condition answered two questions about each device: If component C were not operating, would component A be operating? and If component C were not operating, would component B be operating? Participants in the necessity condition decided for each device whether it necessarily followed that: If component C were not operating, component A would be operating and If component C were not operating, component B would be operating.

8.1.1. Procedure and materials
Booklets for the yes/no condition were quite similar to those of Experiment 1. However,

descriptions of the devices also included information about the base rates of operation for

components A and B. Half the booklets specified that component A operated 95% of the

time and component B operated 5% of the time. For example, the description of the deterministic jointly caused device appeared as follows:

Professor McNutt of the Department of Engineering has designed a device called a blicket. The blicket has only four components, labeled A, B, C, and D. The device works in the following way:

Component A's operating and component B's operating together always cause component C to operate.
Component A's operating alone never causes component C to operate.
Component B's operating alone never causes component C to operate.
Component C's operating always causes component D to operate.
Component A operates 95% of the time.
Component B operates 5% of the time.
Components A and B operate independently: sometimes both operate, sometimes neither, sometimes only one.

The remaining booklets in this condition were the same, except that the percentages were

reversed (i.e., A operated 5% of the time, and B 95% of the time). For both versions of the


booklet, participants were told to ‘‘Imagine that just now components A, B, C, and D are all

operating,’’ and they then answered the question about A and rated their confidence in their

answer. Next, they answered the question about B and rated their confidence. For each

version of the booklet, the order in which the devices appeared was balanced by means of a

random Latin square.

In the necessity condition, the instructions stated that participants were to decide

whether the key sentences (If component C were not operating, component A would be operating and If component C were not operating, component B would be operating) followed or did not follow. Participants were told, ''You should answer 'follows' if you think

the statement will be true in all situations that obey the causal laws that govern our world

and in which the description of the device is also true. Otherwise, answer 'doesn't follow.''' The only other difference between conditions was that participants in the necessity

condition circled either ‘‘follows’’ or ‘‘doesn’t follow’’ to record their choice rather than

‘‘yes’’ or ‘‘no.’’

8.1.2. Participants
There were 48 participants in this experiment, 24 in each condition. All participants

were from the same pool as those of Experiments 1 and 2, but none had participated

in the earlier studies. The procedure within the test session was the same as that

in the earlier experiments. Data from one participant in the necessity condition

were omitted from the analyses that follow because the participant failed to follow

instructions.

8.2. Results and discussion

The stronger instructions to decide whether the counterfactual necessarily followed failed

to eliminate the differences among the devices. Participants were divided about whether the

counterfactual followed for the probabilistic devices and for the deterministic jointly caused

device. The overall percentage of ‘‘follows’’ responses was 48% for these items. For the

deterministic separately caused device, however, the percentage dipped to 13. This pattern

is similar to that of Experiments 1 and 2 and to the responses from participants in the yes/no

condition in this experiment. The latter participants gave 64% ‘‘yes’’ responses for the first

three devices but 10% ‘‘yes’’ for the deterministic separately caused device.

However, participants were not simply ignoring the instructions. We would expect those

in the necessity condition to downplay the differences in base-rate frequency for components A and B. Recall that the instructions described one of these components as operating

95% of the time and the other as operating 5% of the time. Because these base rates were

neither 0 nor 1, they should not affect whether the components were necessarily on or off.

In line with this prediction, Table 3 shows that participants in the necessity condition produced about the same percentage of ''follows'' responses for the high and low base-rate components: 42% versus 37%, respectively. Participants in the yes/no condition, however, tended to give more positive answers when component A or B had a high base-rate frequency than when it had low frequency: 61% versus 40%.


As in Experiments 1 and 2, a categorical ANOVA showed significant effects of structure,

determinism, and the interaction between them [for structure, QW(1) = 40.02, p < .0001;

for determinism, QW(1) = 12.48, p = .0004; and for the interaction, QW(1) = 20.73,

p < .0001]. There was no significant main effect of instructions [i.e., yes ⁄ no vs. necessity,

QW(1) = 1.71, p = .19], although as you might expect, the percentage of ‘‘follows’’

responses was less than or approximately equal to the percentage of ‘‘yes’’ responses in

comparable conditions. There were no significant interactions of instructions with structure

or determinism [QW(1) < 2.32, p > .13 for each of these effects]. This pattern is consistent

with the impression that the stronger necessity instructions did not alter the impact of the

differences among the devices.

As the results in Table 3 suggest, higher base rates produced significantly more positive

responses [QW(1) = 11.30, p = .0008], but this effect depended on the instructions (yes/no

vs. necessity), in line with our earlier observations [QW(1) = 4.09, p = .04]. There were also

significant triple interactions of base rate and instructions with determinism [QW(1) = 5.25,

p = .02] and structure [QW(1) = 5.03, p = .02]. Appendix A takes up the reasons for these

latter effects.

Confidence ratings also followed the pattern of Experiments 1 and 2 in exhibiting higher

confidence for the deterministic separately caused device (7.20 on the 0–9 scale) than for

the other three devices (6.39 for the deterministic jointly caused device, 6.12 for the probabilistic jointly caused device, and 5.76 for the probabilistic separately caused device).

Table 3
Percentage of positive responses as a function of device type, response options (yes/no or follows/does not follow), and base rate in Experiment 3

                               Deterministic                     Probabilistic
Jointly caused:                a.                                b.
Frequency of operation:        ''Yes''       ''Follows''         ''Yes''       ''Follows''
High base rate                 70.8 ± 9.3    47.8 ± 10.2         83.3 ± 7.6    52.2 ± 10.2
Low base rate                  41.7 ± 10.1   34.8 ± 9.7          50.0 ± 10.2   56.6 ± 10.1

Separately caused:             c.                                d.
Frequency of operation:        ''Yes''       ''Follows''         ''Yes''       ''Follows''
High base rate                 12.5 ± 6.8    17.4 ± 7.7          79.2 ± 8.3    52.2 ± 10.2
Low base rate                  8.3 ± 5.6     8.7 ± 6.0           58.3 ± 10.1   47.8 ± 10.2

Note. Percentages are given as ± 1 SD, based on 44 observations.


This difference produced a significant interaction between device structure and determinism

[F(1,45) = 9.20, MSe = 3.42, p = .004]. The main effect of determinism was also significant

in this analysis [F(1,45) = 16.35, MSe = 4.22, p = .0002], although the effect of structure

was not [F(1,45) < 1, MSe = 5.53]. Participants were more confident in deciding whether

the counterfactuals necessarily followed (mean confidence = 7.03) than whether they were

true or false (5.73), producing a main effect of instruction [F(1,45) = 4.03, MSe = 31.47,

p = .03]. This suggests that the yes/no questions were more difficult to answer. But there was no support for the idea that participants found the yes/no responses especially problematic when the minimal networks exhibited mixed evidence relative to when they presented

consistent evidence. That hypothesis would predict an interaction between instructions,

structure, and determinism, but no such interaction materialized in the analysis

[F(1,45) < 1, MSe = 3.42]. There were no further significant interactions with instructions,

and no main effects or interactions for base-rate frequency.

Asking the participants to decide whether the counterfactual necessarily followed did

not enhance support for either pruning theory or minimal-networks theory. Participants in

this condition—like those in the yes/no condition and those in Experiments 1 and

2—tended to distinguish among the four devices rather than giving uniform responses.

Only six participants in the necessity condition gave all negative responses to the

devices, and none gave all positive responses. In the yes/no condition, the comparable

figures were two and one. It seemed possible that the results of Experiments 1 and 2

failed to do justice to minimal-networks theory, as participants may have been uncertain

how to respond in case the conditional’s consequent was true in some minimal networks

and false in others. By stressing that participants should answer ‘‘follows’’ only if the

conditional necessarily followed from the device description and causal laws, the necessity instructions should have given the minimal-networks approach its best chance. Participants should have consulted all the minimal networks and responded ''doesn't

follow’’ if the counterfactual’s consequent was false in any of them. This strategy would

have led to unambiguously negative answers for all four devices, as the consequent is

false in at least one minimal network for each. But although participants in this condition

took the instructions seriously, they still produced fewer ‘‘follows’’ responses for the

deterministic separately caused device in Table 3C than for the other items, just as they

had in the previous experiments.

9. Experiment 4: Forward counterfactuals

The counterfactual conditionals in Experiments 1–3 were backtracking counterfactuals,

such as If component C were not operating, component A would be operating, in which the

effect (C’s operation) appears in the antecedent and the cause (A’s operation) appears in the

consequent. Backtracking counterfactuals like these pose problems for some theories. In a

deterministic world, if all causes of a particular effect occur just as they do in the actual situation, the same effect must also occur. This means that if we hypothesize a counterfactual

situation in which the effect does not occur, the sequence of events leading up to the effect


must also be altered. This is pruning theory’s motivation for cutting the causal links between

the immediate causes of an effect and the effect itself, and it is the motivation for minimal-

networks theory allowing breaks in the causal stream. It may be unclear, however, which

upstream changes are necessary to block the effect and, thus, which backtracking counterfactuals (if any) are true.

As noted earlier, however, backtracking counterfactuals are often true and useful. In troubleshooting, for example, we may have to reason that if some component of a device were

not working then some prior component would have been at fault (e.g., Hale & Barsalou,

1995). Moreover, the truth of some forward counterfactuals seems to depend on the state of

upstream causes. Consider, for example, the simple system in Fig. 4 and the forward counterfactual If component B were not operating, then component C would be operating. In this device, component A always causes components B and C to operate, and component B always causes component C to operate. If we suppose that component B is not operating,

then whether C is operating will depend on the state of A. It is reasonable to think that B’s

not operating implies that A is not operating and, therefore, that C is not operating either. So

it is false that If component B were not operating, then component C would be operating.

But if this reasoning is correct, determining the truth of the (forward) counterfactual

involves considering the cause of the antecedent.

Pruning theory makes reasoning of this sort impossible.8 The equations for the Fig. 4

device should be those in (4), given the description in the previous paragraph:

a. A = UA
b. B = A
c. C = A + B - A · B                                               (4)

Suppose all three components are currently operating (A = B = C = 1). Pruning theory

first updates the variables to reflect this state of affairs, fixing UA = 1. To simulate the antecedent, If B were not operating, we prune the connection between A and B, and then turn B off. This is equivalent to substituting B = 0 for Eq. (4b). Solving the remaining equations, we find A = 1 and C = 1. The theory, therefore, predicts that it is true that If component B were not operating, then component C would be operating, contrary to the intuition just

discussed.9
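To spell the procedure out, here is a minimal sketch (not from the article) of the pruning computation applied to the equations in (4); the encoding of components as 0/1 and the function name are illustrative assumptions.

```python
# Illustrative sketch (not from the article): pruning-theory evaluation of
# "If B were not operating, would A (or C) be operating?" for the Fig. 4 device.

def pruned_counterfactual(antecedent_b=0):
    # Structural equations (4): A = UA, B = A, C = A + B - A*B.
    u_a = 1              # update the exogenous variable from the actual state A = B = C = 1
    a = u_a              # Eq. (4a) is left intact
    b = antecedent_b     # Eq. (4b) is pruned and replaced by the antecedent value B = 0
    c = a + b - a * b    # Eq. (4c): C operates if A or B operates
    return a, c

print(pruned_counterfactual(0))  # -> (1, 1): pruning predicts "yes" to both questions
```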

Fig. 4. Sample causal system from Experiment 4.


Minimal-networks theory, however, goes along with our reasoning. The only network

consistent with B not operating is the one in which all three components are not operating.

This network has one break variable (A) and no intact variables (similar to the net at the

upper left of Fig. 5). The deterministic causal connections of the device eliminate all other

potential states in which B is off. The two theories again make opposite predictions but this

time about forward, as well as backtracking, counterfactuals. Experiment 4 tests this contrast, among other differences between the two approaches.

In this experiment, participants evaluated two types of counterfactuals: the backtracking

counterfactual If component B were not operating, would component A be operating? and

the forward counterfactual If component B were not operating, would component C be operating? In addition, the experiment varied whether the causal link between components A and B was deterministic or probabilistic. Table 4 shows the deterministic version on the left

and the probabilistic version on the right. There was one final variation: In two of the

devices, component A had a 95% base rate of operating, whereas in the remaining two, it

had a 5% base rate. For reasons similar to those mentioned earlier (in the introductions to

Experiments 1 and 3), neither variation affects the predictions of either pruning theory or minimal-networks theory. According to pruning theory, participants should answer ''yes''

to both conditional questions for all four devices. According to minimal-networks theory,

participants should answer ‘‘no.’’ As we have just seen, there is only one minimal network

for the deterministic devices—one in which all components are off—and the theory therefore predicts a ''no'' response to both questions. For the probabilistic devices, there are two

Fig. 5. Minimal networks for the probabilistic system in Experiment 4.


minimal networks: the one with all three components off and the other with A and C on but

B off. These networks appear at the top of Fig. 5. Because A (C) is off in one of the minimal

networks, the answer to the counterfactual questions is ‘‘no’’ once again.10
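For concreteness, the following sketch (not part of the article) enumerates the causally consistent states in which B is off for the two Experiment 4 devices and applies the minimal-networks response rule. For these simple devices the consistent states coincide with the minimal networks just described; the theory's full apparatus of break and intact variables is not modeled here.

```python
# Illustrative sketch (not from the article): states with B off that respect the
# device's causal laws, and the resulting minimal-networks answers.
from itertools import product

def consistent_b_off_states(a_to_b_deterministic):
    states = []
    for a, b, c in product([0, 1], repeat=3):
        if b != 0:
            continue                                 # antecedent: B is not operating
        if a_to_b_deterministic and a == 1:
            continue                                 # A always causes B, so A on would force B on
        if c != (1 if (a or b) else 0):
            continue                                 # C operates iff one of its causes operates
        states.append((a, b, c))
    return states

for deterministic in (True, False):
    states = consistent_b_off_states(deterministic)
    answer_a = all(a == 1 for a, _, _ in states)     # "If not B, A?"
    answer_c = all(c == 1 for _, _, c in states)     # "If not B, C?"
    print("deterministic" if deterministic else "probabilistic", states, answer_a, answer_c)

# Deterministic device: only (0, 0, 0) survives; probabilistic device: (0, 0, 0) and
# (1, 0, 1). A and C are off in at least one state either way, so the predicted answer
# to both questions is "no".
```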

9.1. Method

Each participant evaluated two counterfactuals for each of four devices. The devices were

(a) the deterministic device in Table 4A with a 95% base rate for component A; (b) the same

device with a 5% base rate for component A; (c) the probabilistic device in Table 4B with a

95% base rate for A; and (d) the same device with a 5% base rate for A. The descriptions of

the devices were similar to those in the earlier experiments but with changes to accommodate the new structures. For example, the booklets described the deterministic, high base-rate device as follows:

Professor McNutt of the Department of Engineering has designed a device called a

glux. The glux has only three components, labeled A, B, and C. The device works in

the following way:

Component A’s operating always causes component B to operate.

Component A’s operating always causes component C to operate.

Component B’s operating always causes component C to operate.

Component A operates 95% of the time.

For the probabilistic devices, ‘‘sometimes’’ substituted for ‘‘always’’ in the first line.

(The italics appeared in the booklets to emphasize the difference between the deterministic

and probabilistic devices.) For the low base-rate devices, ‘‘5%’’ substituted for ‘‘95%’’ in

the fourth line.

9.1.1. Procedure and materials
Participants again received booklets with a page of instructions, followed by four pages,

each devoted to one of the devices. The instructions were like those of Experiment 1,

Table 4
Percentage of ''yes'' responses to forward and backward counterfactuals as a function of device type and base rate in Experiment 4

                               Deterministic                     Probabilistic
                               a.                                b.
Frequency of A's operation:    If not B, A?   If not B, C?       If not B, A?   If not B, C?
High base rate                 37.5 ± 6.7     43.8 ± 7.2         79.2 ± 5.9     79.2 ± 5.9
Low base rate                  25.0 ± 6.2     35.4 ± 6.9         60.4 ± 7.1     62.5 ± 7.0

Note. Percentages are given as ± 1 SD, based on 48 observations.


but in this case, a diagram similar to Fig. 4 accompanied the instructions to illustrate the structure

of the devices. Each of the following pages contained a device description, then one of the

conditional questions (either If component B were not operating, would component A be operating? or If component B were not operating, would component C be operating?) and its associated confidence rating scale, the remaining question, and its rating scale. The problem asked participants to ''imagine that just now components A, B, and C are all operating''

just before posing the questions. In half the booklets, the question about A was first; in the

remaining booklets, the question about C was first. There were 24 booklets of each type,

with their pages arranged to correspond to the 24 possible orders of the devices. Participants

were randomly assigned to one of the booklets. The experimental session followed the procedure of Experiments 1–3.

9.1.2. Participants
Forty-eight participants took part in this experiment. All were university undergraduates,

and they were either paid or received credit in introductory psychology for their role in the

study. One participant skipped a page in the booklet, and the data from this participant are

omitted from the categorical analysis described in the following section.

9.2. Results and discussion

Consider, first, the forward counterfactual, If component B were not operating, would component C be operating? Pruning theory predicts that participants should answer ''yes'' for both the deterministic device in Table 4A and the probabilistic device in Table 4B. Minimal-networks theory predicts ''no'' for both. Table 4 shows, however, that the percentages

of ‘‘yes’’ responses were far from equal. Participants gave 40% ‘‘yes’’ answers for the

deterministic device, whereas they gave 71% ''yes'' answers for the probabilistic one. Intuitively, the deterministic connection between A and B means that when B is not operating,

neither is A. But if both these causes of C are not operating, C itself should not be operating

(see the quoted explanations from Participants A and B in Section 10 for evidence of this

kind of thinking). A probabilistic connection between A and B, however, leaves it open that

A is still operating when B is not; hence, one of C’s causes may still be active, and C itself

may be active. The observed difference between device types is in the direction this line of

reasoning predicts, and this finding suggests that the difficulties for pruning and minimal-

networks theories extend to forward counterfactuals. The difference in base rates for component A also affected responses to the forward counterfactual. Participants gave positive

answers on 62% of trials when A often occurs but on 49% of trials when A seldom occurs.

Results for the backtracking counterfactual If component B were not operating, would component A be operating? show quite similar effects and also largely agree with the data

from Experiments 1–3. The percentage of ‘‘yes’’ responses to this question is higher for the

probabilistic device (70%) than for the deterministic one (31%). For the deterministic device

in the present experiment and for the deterministic separately caused device in the previous

ones, the effect not operating provides good evidence that the cause is not operating.

This is not the case, however, for the probabilistic devices in the present study, nor for the


probabilistic or jointly caused devices in Experiments 1–3. In all these latter cases, the effect

not operating is an uncertain guide to whether the cause is operating, in accord with these

differences. The backtracking counterfactual also shows the same effect of base rates as

does the forward counterfactual. Participants made 58% ‘‘yes’’ responses when A’s base

rate is high but 43% ‘‘yes’’ responses when it is low.

A categorical repeated-measures analysis confirmed the effect of determinism

[QW(1) = 34.67, p < .0001] and base-rate frequency [QW(1) = 6.76, p = .0094]. These

effects were approximately the same for forward as for backtracking counterfactuals, as the

direction of the counterfactual did not significantly interact with either determinism or base

rate [for the interaction with determinism, QW(1) = 2.23, p = .135; for the interaction with

base rate, QW(1) = 0.26, p = .611]. The effect of base rate is somewhat larger for the proba-

bilistic device than for the deterministic one. In the probabilistic case, ‘‘yes’’ responses are

18 percentage points greater when A’s base rate is high than when it is low. But in the deter-

ministic case, this advantage is 10 percentage points. This difference is what you might

expect from the causal set up: When the link from A to B is probabilistic, A could still be

operating if B is off, and A’s base rate could affect the likelihood that both A and C are on.

When the link from A to B is deterministic, however, A should be off if B is off, and A’s base

rate should have no effect. Although the interaction between determinism and base rate is

not significant in the present experiment [QW(1) = 0.70, p = .403], a separate analysis of

the probabilistic device shows a significant effect of base rate [QW(1) = 8.26, p = .004],

whereas a similar analysis of the deterministic device does not [QW(1) = 1.85, p = .173].

No other significant effects appeared in either the omnibus analysis or the analyses of the

individual devices. Only five of the 48 participants gave positive responses to all questions,

and only two gave all negative responses.

Participants should be more certain about the workings of the deterministic device than

of the probabilistic device. If the causal connection between A and B is only ‘‘sometimes’’

effective, then there is no way to be sure whether A or C will be on when B is off. Confidence ratings reflected this difference. Mean confidence was 7.6 on the 0–9 scale for the

deterministic device but 6.7 for the probabilistic one [F(1,47) = 13.14, MSe = 5.73,

p = .0007]. This is consistent with the results of Experiments 1–3. When the deterministic

nature of the connections and the configuration of the components lead to an unambiguous

answer (as they do for the deterministic device in this experiment and the deterministic

separately caused device in the previous ones), confidence goes up. But if the probabilistic

connections or the configuration leads to ambiguities (probabilistic devices in all four experiments and the deterministic jointly caused device in Experiments 1–3), confidence goes

down.

The confidence ratings also exhibited an interaction between question type (forward vs.

backward) and determinism [F(1,47) = 5.63, MSe = 2.01, p = .022]. Mean confidence for

the backtracking question was 7.8 for the deterministic device and 6.5 for the probabilistic

one, but this difference narrowed slightly for the forward question (mean confidence was

7.4 for the deterministic device and 6.9 for the probabilistic one). The backtracking question

If not B, A? may have highlighted the difference between the deterministic and the

probabilistic system, as these systems vary precisely in the status of the causal connection


between A and B. The forward question If not B, C? may have made the difference less salient, as the B–C link is deterministic in both devices.

Confidence ratings also produced an effect of base-rate frequency [F(1,47) = 6.47,

MSe = 2.08, p = .014], with slightly higher ratings when A operated more often (7.3 vs.

7.0). Because the given frequencies were complementary—A’s base rate was either .95 or

.05—the reason for this difference is unclear. Some participants may have confused confidence in their previous ''yes'' or ''no'' answer with confidence that the answer was ''yes.''

The general trend, however, in this and the previous experiments is for confidence to be negatively correlated with the percentage of ''yes'' responses (e.g., the Spearman rank-order correlation rS = -.64 across the eight main conditions in the present study). This suggests that

participants were not always misinterpreting the confidence rating task as calling for ratings

of the truth of the counterfactual, as this would have produced a positive correlation. No

further effects were significant in the analysis.

In sum, forward counterfactuals produced approximately the same results as did backtracking counterfactuals for the Table 4 devices. Both types of question are sensitive to the

deterministic or probabilistic nature of the causal relations and to the base rate for the

devices’ components. As a result, the questions produced answers that varied greatly across

conditions. The range is 35–79% ‘‘yes’’ responses for the forward counterfactual and

25–79% for the backtracking counterfactual. Any theory about such sentences must account

for this large variation.

10. General discussion

The four experiments in this article examined answers to counterfactual questions of the

form If component X were not operating (had not operated), would component Y be operating (have operated)? where X and Y were parts of a simple machine. Backtracking counterfactuals are those in which Y is a cause of X, and all four experiments provided evidence

that answers to these questions depended on the nature of Y and possibly other of X’s causes.

Participants' answers depended on whether the causal connection from Y to X was deterministic or probabilistic (Experiments 1–4). When deterministic, the answers also depended on

whether Y was individually sufficient for X or only jointly sufficient (Experiments 1–3).

When the connection was probabilistic, the answers also depended on Y's base rate (Experiments 3 and 4). Some of the same effects appeared when participants decided whether the

sentence If component X were not operating, component Y would be operating necessarily

followed from the device’s description and causal laws (Experiment 3). The device’s

configuration—whether it was deterministic or probabilistic and whether it was jointly

caused or separately caused—still affected ‘‘follows’’ responses, but mere base rates did

not. Experiment 4 found related effects for forward counterfactuals. X is a cause of Y for

these questions, but the results again showed effects of both determinism and base rates.

Finally, the explicitness of the counterfactual wording—If X had not operated, would Y have operated? versus If X were not operating, would Y be operating?—did not change the size

of these effects (Experiment 2). Participants who read, ‘‘Imagine that just now components


A, B, C, and D are all operating. If component C had not operated, would component A have

operated?’’ would have difficulty avoiding the question’s counterfactual import, but their

answers were quite similar to those who received less explicit wording.

These effects suggest that events causally upstream from the antecedent event, X, can

have an impact on answers to counterfactual questions. These questions lead people to consider ways in which the antecedent could have come about, and variations in what brings

about the antecedent can influence the counterfactual’s truth. Given the simple device in

Fig. 4, for example, contemplating how component B could have failed to operate leads us

to consider the causes of B’s performance. If B fails because of a faulty connection between

A and B, then A may still be working and may also cause C to work. Whether it does so may

depend on A’s base-rate frequency. By contrast, if B fails because of a failure of A itself,

then C will not occur, no matter what A’s usual base rate.

10.1. Pruning theory’s advantages and limitations

Inferences about the upstream causes of the antecedent are difficult to square with pruning theory. Because pruning theory severs connections between the antecedent event and its

causal parents, properties of the parents, such as their base rates, should not affect the

answer to counterfactual questions. Neither should properties of the parents’ links to the

antecedent event, such as whether these connections are singly or jointly sufficient. Pruning

theory has a clear motivation for severing the connections into the antecedent event. In order

for X not to occur in an envisioned counterfactual situation, something must change prior to

X. The prior change, however, cannot be an arbitrary one. Event X can have an enormous

number of events leading up to it, including some from remote times. Changing some of

these remote events may cause X not to happen but may also have many further ramifications. Arbitrary changes to these preceding events give us no basis for deciding the answer

to the counterfactual question If X had not occurred, would Y have occurred? as some may

alter Y whereas others may not. A sensible answer requires that the change to actual conditions be as small as possible, consistent with X not occurring. One way to implement this minimal change is to imagine all the events causally prior to X happening just as they actually did, with changes confined to the connections between X's parents and X itself. Such a

change is similar to what happens when we directly manipulate X. We remove the usual

causal inputs to X, directly change its value, and assess downstream effects on other

variables.

Pruning theory makes a convincing case that the changes needed to bring about the counterfactual's antecedent should be minimal ones. But the right minimal change might not always be a change to the immediate connections into the antecedent event. If you are wondering what would have happened if you had not gone to college, for example, then it would

be odd to imagine this as a case in which all the immediate causes of your actually going to

college were erased and you were whisked into a situation in which you did not go. This

would be similar to finding yourself in an experiment in which you had been randomly

assigned to a No-college control group. Instead, it is much more natural to consider which

changes to preceding events would most plausibly have led to your not attending college


and then to contemplate the effects of those causes. If the likely explanation for not attending is family financial difficulties, then consequences like you taking a job loom large in

your counterfactual future. If the likely explanation for not attending college is personal

health problems, however, other events, prominently medical ones, become likely, whereas

work-related events may become less likely. Which of these possible histories is more probable will determine your judgment about the truth of counterfactuals, such as If I had not gone to college, I would have taken a job as a construction worker. Considerations of this

sort involve backtracking to the most likely prior causes of the counterfactual’s antecedent

and then reasoning forward to determine the truth of the event in the consequent (Rips,

2008). This back-and-forth inferencing means that properties of events causally prior to the

antecedent—for example, the events’ frequency and sufficiency—will influence people’s

evaluation of counterfactuals. This type of reasoning accords with the results of the present

studies.

Should we say that pruning theory provides a normative account against which participants' responses fall short? To make this case, pruning theory would have to marshal arguments that wider considerations outweigh people's intuitions about counterfactuals, but it is

difficult to see what these factors could be. What normative aspects of counterfactuals could

dictate, for example, that you should disregard the connections from the immediate causes

of your going to college in the preceding example? (See Cartwright, 2007, for a theoretical

critique of pruning along these lines.)

Proponents of pruning theory could contend that participants in these experiments were

not really interpreting the stimulus sentences as counterfactuals and were therefore performing some alternative computations to answer the questions. But it is unclear why participants

would ignore the sentences’ counterfactual form. In all four experiments, participants

learned that the components were currently operating just before being asked the conditional

question in which one component was not operating. Experiment 2 altered the question’s

wording to highlight its counterfactual status, but without materially changing the results.

A better defense of pruning would be to portray the theory as an idealized model whose

advantages are simplicity and computational specificity in determining counterfactuals’

truth. To determine the truth value, all that is required is a representation of the causal network, including the functional equations and the current state of the variables. But although

there are merits to pruning theory’s simplicity and definiteness, the trouble with such a

defense is that there may be better idealizations, including minimal-networks theory.

We should notice that problems with pruning theory’s treatment of counterfactuals do not

imply there is anything wrong with this theory's handling of explicit intervention. For example, direct intervention on component C of the Fig. 1 device may indeed involve severing

connections into C, and this may block inferences from C’s state to the states of C’s parents.

The experiments do not assess this aspect of the theory. What the experiments do show is

that participants do not treat counterfactuals in this interventional way, at least for the simple devices we have considered here. Proponents of pruning could reserve the pruning operation only for counterfactuals that are explicitly about interventions. For example, pruning

could apply to counterfactuals such as If someone intervened on component C to keep it from operating, component A would be operating, but more standard Bayesian revision could


apply to other counterfactuals, such as the ones in these experiments. Evidence for pruning

in the case of conditionals about explicit interventions comes from Sloman and Lagnado

(2005), as mentioned in the introduction. However, pruning theory, in its original form, was

clearly not confined to counterfactuals that directly describe interventions (e.g., no such

qualification appears in the formal presentation of pruning theory in Pearl, 2000, chap. 7).

Such a confined theory would merely assert that the model for interventions carries over to

counterfactuals mentioning these same interventions. Instead, pruning theory is supposed to

be a general theory of counterfactuals. Hence, we need some explanation for why the theory

does not predict the results of the present experiments. Defenders of pruning theory might

be able to relax the theory’s assumptions to allow pruning to occur at places other than the

inputs to the node mentioned in the antecedent, but the defenders would need to describe

the conditions for relocating the pruned links.

Likewise, problems with pruning theory’s treatment of counterfactuals do not imply that

other versions of Bayes nets cannot handle the present findings. As just hinted, standard

Bayesian revision does allow the kind of diagnostic reasoning from effect to cause that

seems evident in the data. Thus, standard Bayes nets may be able to predict the data where

pruning fails. To evaluate this possibility, I fit such a model to the percentage of ‘‘yes’’

responses in Experiments 3 and 4, using Bayes net software (HUGIN; Andersen, Olesen,

Jensen, & Jensen, 1989). In general, however, the results of this modeling show that the

obtained data are far less extreme than what standard (i.e., noninterventionist) Bayes nets

predict. Consider, for example, the deterministic jointly caused device (Table 3A) in Experiment 3, and assume (without loss of generality) that component A has the high base rate

(95%) and component B the low base rate (5%). With prior probabilities Pr(A = 1) = .95

and Pr(B = 1) = .05, the probability that C is on is Pr(C = 1) = .95 * .05 = .048, as A and B were said to operate independently and as both have to be on in order for C to be on. Similarly, A will be on and C will be off if and only if A is on and B is off. Thus, Pr(A = 1 and

C = 0) = .95 * (1 - .05) = .902. This means that the conditional probability that A is on

given C is off is Pr(A = 1 | C = 0) = .902 / (1 - .048) = .947. A similar calculation shows

that Pr(B = 1 | C = 0) = .003. We can take these last two conditional probabilities as the

standard Bayesian model's predictions for the questions If component C were not operating, would component A be operating? and If component C were not operating, would component B be operating? As Table 3A reveals, however, the proportion of subjects who

answered ‘‘yes’’ to the first question was .708, and the proportion ‘‘yes’’ for the second

question is .417.
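The arithmetic just described can be verified by brute-force enumeration. The short sketch below (mine, not the article's) conditions on C being off for the deterministic jointly caused device with the stated base rates.

```python
# Illustrative check (not from the article): for the deterministic jointly caused
# device, C is on only when both A and B are on. Condition on C being off and
# compute the probability that A (or B) is still on.
from itertools import product

p_a, p_b = 0.95, 0.05                      # base rates of components A and B

def joint(a, b):
    """Joint probability of an on/off state of A and B (they operate independently)."""
    return (p_a if a else 1 - p_a) * (p_b if b else 1 - p_b)

c_off_states = [(a, b) for a, b in product([0, 1], repeat=2) if not (a and b)]
p_c_off = sum(joint(a, b) for a, b in c_off_states)
p_a_on = sum(joint(a, b) for a, b in c_off_states if a) / p_c_off
p_b_on = sum(joint(a, b) for a, b in c_off_states if b) / p_c_off
print(p_a_on, p_b_on)  # about 0.9475 and 0.0026, i.e., the .947 and .003 derived in the text
```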

Much the same is true for the remaining devices in Experiments 3 and 4. For example,

the deterministic device in Experiment 4 (Table 4A) is such that the relevant conditional

probabilities are uniformly zero across conditions. However, the overall proportion of

‘‘yes’’ answers is .354, and it is not clear how the model could account for these responses.

In the case of the probabilistic devices, the model requires a parameter to estimate the likelihood that the causal parents (A and B in Experiment 3 and A in Experiment 4) will produce

their effect. (Participants in these experiments knew only that the parents ‘‘usually’’ or

‘‘sometimes’’ caused the effect to occur.) A search of parameter space sometimes found

values that would produce correct predictions for the high base rate conditions, but the


model systematically underpredicted the results from the low base rate conditions.11 To give

a convincing account of the data, then, standard Bayes nets would have to include additional

assumptions. But although extending the model in this way might be a worthwhile endeavor,

I will not attempt to do so here.

10.2. Advantages and limitations of minimal-networks theory

Minimal-networks theory has more room to maneuver in explaining the results of

these experiments. Like pruning theory, the minimal-networks approach revises the

course of history to bring about the event mentioned in the counterfactual’s antecedent.

But the latter approach is more flexible in the revisions it countenances. Instead of simply orphaning the antecedent event from its causal parents, this theory allows the values

of earlier events to change, provided that the changes respect the causal laws governing

the system. This policy often means that the changes will occur at the initial events or

root nodes of the Bayes net, as in Fig. 2, and one could argue that any predictive advantage for the theory thereby depends on which events the modeler chooses to represent at

the roots. However, this choice is not necessarily an arbitrary matter. The events included

in a Bayes net are usually selected to minimize effects of external or exogenous factors

and to reduce the correlations among those external factors that operate on the observed

variables. This choice allows the networks to explain the system’s behavior with as few

confounding influences as possible, and the choice also helps determine which events are

roots. The root nodes for the devices in Experiments 1–4, for example, seem sensible in

light of the devices’ description.

Minimal-networks theory also permits the type of back-and-forth reasoning mentioned in

Section 10.1. The theory allows us to backtrack from a component that is not working to

earlier components that might explain the failure. We can then use the changed values of

those components in a forward direction to predict other changes in the system. If the consequent of the counterfactual is true in all networks that are minimally altered in this way, then

the counterfactual itself is true.

We have seen, however, that a strict application of minimal-networks theory fails to

predict the results of Experiments 1–4. For example, the minimal networks of all four

devices in Table 1 contain at least one network in which component A is not operating;

hence, the question If component C were not operating, would component A be operating? should get a negative answer. But although responses to the deterministic separately

caused device in Table 1C were mostly negative, responses to the remaining devices split

more evenly. The easiest way out for minimal networks is to assume that participants

were uncertain how to respond when the evidence was mixed, that is, when the consequent was true in some minimal networks but false in others. However, the difference among devices remained in Experiment 3 when participants were told to respond positively only if the counterfactual necessarily followed. These instructions should have

forced negative answers when the consequent was false in any of the minimal networks,

contrary to the obtained results. Could other changes within the minimal-networks framework explain these deviations?


To explore this possibility, Appendix A considers introducing a number of additional

assumptions to deal with the results of Experiment 3, allowing for limitations in people’s

ability to process the networks. In particular, we can suppose:

(a) Participants sometimes give up on a problem because of failures of attention, lack of

motivation, errors in understanding the instructions or descriptions, or other factors.

In these situations, the participants simply choose randomly between ‘‘follows’’ and

‘‘doesn’t follow’’ (or ‘‘yes’’ and ‘‘no’’) with equal probability. This assumption is

needed to explain why positive responses sometimes occur where there are no mini-

mal networks in which both the antecedent and consequent are true (i.e., for the deter-

ministic separately caused device).

(b) Even when participants work through the problem, they do not consult all relevant

networks for a given device but instead sample or construct just one of them. They

then base their answer on whether the consequent is true or false in this network.

Because the number of minimal models can be large for some causal set ups, we

could justify this strategy on the grounds of psychological plausibility. Participants

may use a sample rather than the complete set of networks because of processing lim-

itations, for example, constraints on working-memory or time. This idea is also quite

similar to that offered in some versions of mental-model theory for deductive reason-

ing, in which people assess the validity of an argument by inspecting a single mental

model of the argument’s premises (e.g., Evans, Over, & Handley, 2007). This strategy

will sometimes lead participants to make a ‘‘yes’’ response even when the consequent

is false in one or more of the networks, as the sample they choose may happen to be

one in which the consequent is true.

(c) Finally, we assume that participants in the yes/no condition sample among the minimal networks in which the antecedent is true, but participants in the necessity condition sample among all causally legal networks [in the sense of (2)] in which the

antecedent is true. This use of all legal networks may be due to the instructions in the

necessity condition, which emphasized that the counterfactual should be true ‘‘in all

situations that obey the causal laws that govern our world and in which the description

of the device is also true.’’ The injunction to consider all situations may have pushed

participants to take into account all legal networks without regard for their similarity

or minimality with respect to the actual situation.

The Appendix A model is based on these assumptions and provides a good fit to the

data of Experiment 3. It also seems compatible with the data from Experiments 1 and 2.

Participants made yes/no responses in those experiments, and the data tend to fall near the mean predicted values from the high and low base rates for the comparable conditions

in Experiment 3. This suggests that participants may have assumed an intermediate base

rate for these devices. The model has more difficulty with the results of Experiment 4,

however, particularly for the deterministic device. This device has just a single minimal

network in which the consequent is false, both for the forward question (If not B, C?)

and the backtracking question (If not B, A?). Because there are no minimal (or even

legal) antecedent-and-consequent networks, the model has to account for ‘‘yes’’


responses solely through guessing via Assumption (a) above. Table 4 shows, however,

that the obtained percentages range from 25% to 44%, so guessing would have to occur

on 50–88% of trials.
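To give these assumptions an executable shape, here is a minimal sketch (not from the article) of a sampling-plus-guessing response rule of the kind just described. The guessing rate and the sampling weights are hypothetical placeholders, not the fitted values reported in Appendix A.

```python
# Illustrative sketch (not from the article) of assumptions (a)-(c): with some
# probability the participant gives up and guesses; otherwise she samples one
# network in which the antecedent holds and answers from the consequent's value there.

def predicted_positive_rate(networks, consequent_holds, guess_rate=0.2):
    """networks: list of (state, sampling_weight) pairs; consequent_holds: function
    from a state to True/False; guess_rate: probability of guessing "yes"/"no" 50/50."""
    total = sum(weight for _, weight in networks)
    p_true = sum(weight for state, weight in networks if consequent_holds(state)) / total
    return guess_rate * 0.5 + (1 - guess_rate) * p_true

# Hypothetical example: two candidate networks sampled with weights .7 and .3,
# with the consequent (A operating) true only in the first.
networks = [({"A": 1}, 0.7), ({"A": 0}, 0.3)]
print(round(predicted_positive_rate(networks, lambda s: s["A"] == 1), 2))  # -> 0.66
```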

10.3. Searching for causes

From a cognitive point of view, our survey of pruning theory and minimal-networks

theory presents a complementary pattern of advantages and disadvantages. Pruning

theory provides a procedure that is easy to execute within a single model of the causal

situation, but it is inflexible with respect to the modifications it can make within the

model. Minimal-networks theory allows more flexibility (including limited backtracking)

but at the expense of positing a potentially large number of minimal models. We can

think of the Appendix A theory as a start on reconciling these ideas. Assumption (b) of

the previous section uses sampling to cut down the number of minimal models to

one, simplifying the resulting operations. This assumption also explains why participants

in Experiment 3 sometimes responded ‘‘follows’’ even when some of the minimal

networks warranted a ‘‘doesn’t follow’’ response. Sampling is useless, however, in

explaining the ‘‘yes’’ responses for the deterministic device in Experiment 4, which we

were just considering. No possible antecedent network exists for this device in which

the consequent is true.

An explanation for these responses might be couched more easily in terms of the way

people operate at a micro level rather than in terms of how they deal with fully constructed

networks. A theory along these lines might detail the procedures people use to go from the

values of given variables to the values of the remaining variables. In this respect, pruning

theory’s step-by-step method may be closer to the mark, although the specific procedures

would differ, of course, from the theory’s officially sanctioned ones. In the network of

Fig. 4, for example, such a theory might specify the way people decide that component Amust be off when component B is off. These procedures could involve solving equations,

similar to those in (1) and (4), or perhaps more likely, reasoning from general principles

governing causal necessity and sufficiency (e.g., Ahn & Graham, 1999; Cummins, 1995;

Staudenmayer, 1975).

Some hints about such a process come from a pilot study in which 12 participants

received the problems from Experiment 4. After they answered the questions about each

device, however, these participants wrote a description of how they had arrived at their

answers. When participants answered ‘‘no’’ to the questions (If not B, A? and If not B, C?)

about the deterministic device, they tended to give straightforward lines of reasoning to support their decisions:

Participant A: ‘‘If B is not operating, it is not causing C to operate. This also means that

A is not operating because it always causes B to operate. Therefore, none are operating.’’

Participant B: ‘‘For component C to operate, either A, B, or both must be operating. It is

given that B is not, and this means that A must also not be operating. Therefore, C could

not operate.’’


However, participants who answered ‘‘yes’’ to the same questions about the deterministic

device produced less coherent answers, possibly reflecting an incomplete understanding of

that device:

Participant C: [For If not B, then C?] ‘‘I figured that since A always caused C to operate

that just b/c B was not did not mean that C couldn't run. B just has a better chance at running.'' [For If not B, then A?] ''B and A didn't have to rely on each other at all.''

Participant D: [For If not B, then C?] ‘‘Component C will still operate even if B doesn’t

operate as long as Component A continues to operate.'' [For If not B, then A?] ''Component A isn't dependent on B.''

The responses from Participants C and D could be due to pruning, although the small

number of consistent pruners makes this somewhat unlikely. Instead, it is possible that these

participants failed to put together the separate implications of B’s state for those of A and C.

Let us consider a theory along these lines in which people use principles of causal neces-

sity and sufficiency to determine the values of causal variables. These principles will some-

times leave the values of certain variables undetermined because the given information may

not supply enough information, and in such cases, people may guess at the values. They may

also rely on base rates to fill in missing values if the task requires yes/no responses rather than follows/does not follow responses. Values may also be undetermined because participants fail to apply some of the principles, owing to processing limitations (constrained working memory, time, or motivation); in these cases, too, they may fill in the values

from base rates or by guessing. This would help explain the ‘‘yes’’ responses in Experiment

4, which pose problems for minimal networks. The principles themselves might take the fol-

lowing form, which simply spell out the consequences of causal necessity and sufficiency:

a. If {X1, X2, …, Xn} are jointly causally sufficient for Y,
   then if (X1 is on and X2 is on and … and Xn is on), then Y is on.

b. If {X1, X2, …, Xn} are jointly causally sufficient for Y,
   then if Y is off, then (X1 is off or X2 is off or … or Xn is off).

c. If {X1, X2, …, Xn} are jointly causally necessary for Y,
   then if Y is on, then (X1 is on or X2 is on or … or Xn is on).

d. If {X1, X2, …, Xn} are jointly causally necessary for Y,
   then if (X1 is off and X2 is off and … and Xn is off), then Y is off. (5)

The principles in Eq. (5a–d) deal with the case in which a set of variables are jointly nec-

essary or jointly sufficient to cause another, but we can use the same principles to deal with

individually necessary or sufficient causes as the special case in which the set contains just

one variable. No new principles are required.12
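
To make the rule set concrete, the following sketch applies principles (5a–d) as a simple value-propagation procedure. It is an illustration only, not the model fit in Appendix B; the data structures and function name are assumptions, and rules (5b) and (5c) are applied only when their disjunctions reduce to a single variable (multi-member disjunctions are left indeterminate, as in the text's treatment of the jointly caused devices).

```python
# A minimal sketch (not the fitted model) of principles (5a-d): propagate
# on/off values from causal necessity and sufficiency relations.
def propagate(values, sufficient_for, necessary_for):
    """values: dict mapping a variable to True (on), False (off), or None (unknown).
    sufficient_for / necessary_for: lists of (causes, effect) pairs, where causes
    is a set of variables that is jointly sufficient / necessary for the effect.
    Applies rules (5a-d) repeatedly until no new value is determined."""
    changed = True
    while changed:
        changed = False
        for causes, effect in sufficient_for:
            # (5a): all of the jointly sufficient causes are on, so the effect is on.
            if values.get(effect) is None and all(values.get(x) is True for x in causes):
                values[effect], changed = True, True
            # (5b): effect off implies some cause is off; determinate only when the
            # set contains a single cause.
            if values.get(effect) is False and len(causes) == 1:
                (x,) = tuple(causes)
                if values.get(x) is None:
                    values[x], changed = False, True
        for causes, effect in necessary_for:
            # (5d): all of the jointly necessary causes are off, so the effect is off.
            if values.get(effect) is None and all(values.get(x) is False for x in causes):
                values[effect], changed = False, True
            # (5c): effect on implies some cause is on; determinate only for one cause.
            if values.get(effect) is True and len(causes) == 1:
                (x,) = tuple(causes)
                if values.get(x) is None:
                    values[x], changed = True, True
    return values
```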

To predict the quantitative pattern of responses, we need to specify a processing frame-

work for using these rules. Let us suppose that, with some probability p, people successfully


apply these rules to the conditional’s antecedent and to their representation of the causally

necessary and sufficient relations, as given by the device’s description. If one of the rules

produces a determinate value for the consequent, then the reasoner will respond ‘‘yes’’ or

‘‘no’’ (or ‘‘follows’’ or ‘‘doesn’t follow’’) according to this value. If a rule produces a value

for a variable other than the consequent, then the reasoner can apply the rules once more

(again with probability p). This process can continue until the reasoner gives up trying (with

probability 1 – p) or until no new results are forthcoming. At this stage, she must make a

response based on base rates if they’re available or by guessing. Although a number of

response rules are possible here, preliminary model fitting suggests that if the reasoner has

to decide whether the conditional necessarily follows or does not follow, she will consider

the base rates irrelevant and guess randomly between the alternatives.13 If the choice is

between answering ‘‘yes’’ or ‘‘no’’ to the conditional question, the reasoner will either

respond ''yes'' if the relevant base rate is high (with probability q) or guess randomly (with probability 1 - q).
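
The processing assumptions just stated can be read as a small stochastic procedure wrapped around the rule applications. The sketch below is one way to simulate a single trial under those assumptions; apply_rules_once is a hypothetical callable standing in for one round of rules (5a–d) that returns an updated copy of the value assignment, and all names are illustrative.

```python
import random

def simulate_trial(apply_rules_once, initial_values, consequent, p, q,
                   task="yes/no", base_rate_high=True):
    """Return True for a 'yes'/'follows' response, False otherwise.
    Each round of rule application succeeds with probability p; when the
    consequent stays undetermined, the response falls back on the base rate
    (yes/no task only) or on a coin flip."""
    values = dict(initial_values)
    while True:
        if random.random() > p:            # reasoner gives up with probability 1 - p
            break
        new_values = apply_rules_once(values)
        if new_values == values:           # no new results are forthcoming
            break
        values = new_values
        if values.get(consequent) is not None:
            return values[consequent]      # determinate value for the consequent
    # Fallback: a high base rate yields 'yes' with probability q in the
    # yes/no task; otherwise the reasoner guesses randomly.
    if task == "yes/no" and base_rate_high and random.random() < q:
        return True
    return random.random() < 0.5
```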

As an example, consider the deterministic device of Table 4A. Component A’s operating

is sufficient for B and is also sufficient for C. Similarly, components A and B are jointly nec-

essary for C. In answering the question If B were not operating, would C be operating? we

look for an applicable rule and find (with probability p) that we can instantiate Eq. (5b) with

{A} causally sufficient for B and B off. The rule tells us that A is off. Applying the rules on a

second round (again with probability p) finds (5d) available: {A, B} is necessary for C, and

both A and B are off. This guarantees that C is also off, which yields a negative (‘‘no’’ or

‘‘doesn’t follow’’) answer to the problem. At either step in this process, reasoners may fail

to deploy the rule (with probability 1 – p) because of processing limitations, and they will

then guess at an answer if they must decide whether the conditional necessarily follows. If

they must decide instead whether the conditional question has a ‘‘yes’’ or ‘‘no’’ answer, they

will respond ‘‘yes’’ if the base rate is high (with probability q) and will guess otherwise.
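
Continuing the propagate sketch given earlier (with the same caveat that the representation is an assumption), the two-step derivation just described looks like this:

```python
# Table 4A deterministic device: {A} is sufficient for B and for C;
# {A, B} is jointly necessary for C.  Antecedent: component B is not operating.
values = {"A": None, "B": False, "C": None}
sufficient_for = [({"A"}, "B"), ({"A"}, "C")]
necessary_for = [({"A", "B"}, "C")]

propagate(values, sufficient_for, necessary_for)
# First round:  rule (5b) with {A} sufficient for B and B off   ->  A is off.
# Second round: rule (5d) with {A, B} necessary for C, both off ->  C is off.
print(values)   # {'A': False, 'B': False, 'C': False}
```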

Appendix B contains the modeling details and shows that the goodness of fit to the Exper-

iment 3 data is nearly the same as for the modified minimal-networks theory of Appendix

A. Like the earlier theory, this one is consistent with the data from Experiments 1 and 2. But

it also extends to the results of Experiment 4, as the example in the previous paragraph sug-

gests. The model’s fits to the latter experiment are best, however, when the values of the

parameters p and q are lower than in Experiment 3 (see Appendix B). This difference could

be put down to extrinsic factors that differ between the experiments, such as differences

between the two groups of participants. But it is possible that the greater connectivity of the

Experiment 4 devices—in which each component is linked to every other—makes it diffi-

cult to isolate the effect of individual variables. The devices of Experiments 1–3 have more

easily isolable components, potentially advancing participants’ causal reasoning.

11. Conclusion

According to some prominent theories (e.g., Edgington, 2004; Lewis, 1979), counter-

factual conditionals envision a world that is the same as the actual world until shortly


before the event mentioned in the conditional’s antecedent. At that point, the envisioned

world deviates from the actual one to accommodate the antecedent. If we assume this

view is correct, what determines where this fork in the causal stream occurs? The

results of all four experiments suggest that the answer is not necessarily at the last

causally possible moment. Although participants could have imagined a situation in

which all causal ancestors of the antecedent remain intact and in which only the ante-

cedent and its descendants change, they preferred to place the locus of change earlier

to preserve the causally necessary and sufficient relations governing the causal system.

In accord with the minimal-networks approach, this policy tends to locate the change at

the boundaries of the causal system or at the places within the system that have proba-

bilistic rather than deterministic connections to their effects. You can think of this pol-

icy in more general terms as one in which people look backward for the best

explanation for the antecedent and then forward to see whether the explanation would

imply the consequent.

To provide a quantitative account of the participants’ judgments about counterfactuals,

however, we need to consider psychological limits on people’s ability to represent alterna-

tive situations. Participants sometimes said that a counterfactual necessarily followed from

the given information even when a causally legal (and minimally altered) state of the system

was inconsistent with the conditional. We can account for these answers by assuming that

people sample from the relevant states rather than systematically considering all of them.

But sampling fails to account for cases in which participants respond positively to a counter-

factual when no causally legal state of the system exists. To explain these responses, we

need to examine the step-by-step procedure people use to search for the mandated values of

the causal variables. If people fail to integrate information in the conditional with informa-

tion about the structure of the system, they may end up producing answers that clash with

normatively appropriate responses.

The model of Section 10.3 provides a start on such a theory, and it goes along with

evidence from our protocols about difficulties participants encounter as they reason

about causal evidence. The appendices treat this model as a rival to the sampling ver-

sion of minimal-networks theory for purposes of comparing their predictions. But we

can also view the model as a psychological amendment or elaboration that allows for

short cuts and errors in understanding the causal constraints. Instead of sampling ran-

domly from the space of minimal legal networks, people may use principles like those

in Eq. (5) to construct a single, partially specified representation of the system. An

open question is to what extent we should still regard such a representation as a type

of Bayes net. Similarly, the model of Section 10.3 makes only the simplest use of the

probabilistic information included in the problems. Tasks that require direct estimates

of probabilities may require more sophisticated methods for integrating probabilities and causal structure.

The present results are consistent with the intuitively appealing idea that knowledge of

causal structure can underwrite judgments about counterfactual situations. But they high-

light the importance of people’s beliefs about how these situations can come about and of

the difficulties people have in envisioning them. Students of counterfactuals have long


recognized that these conditionals are evaluated in a context that includes not only the ante-

cedent but also further changes that maintain causal and logical consistency with the

antecedent. The present study adds that the context also depends on events leading up to the

antecedent, events that plausibly explain it.

Notes

1. Some approaches combine the strategies of using causality to explain counterfactuals

and of using counterfactuals to explain causality. Halpern and Pearl (2005), for

example, posit Bayes nets to represent general causal relations (type causality)—for

example, events of type C cause events of type E. They then use the representation

to define both the truth of counterfactuals and the notion of an actual (or token)

cause—for example, that a particular event C caused a particular event E. The notion

of an actual cause is of great interest, but because this article deals with judgments

about counterfactuals, it will focus on the link between type causality and counterfactual conditionals. (See Hall, 2007, for a critique of the Bayes net treatment of

actual or token cause.)

2. Graphical representations of Bayes nets usually do not include arcs like the one in

Fig. 1, although they are standard in and-or graphs in other areas of computer science

(see, e.g., Nilsson, 1980). I include them here to clarify the factors in the following

experiments, but they introduce no new assumptions to Bayes net theories. These the-

ories capture joint versus separate causes in other ways (e.g., as part of the probability

distributions that accompany such systems or as part of the functional specification of

the causal dependencies). The prohibition against looping in Bayes nets means that

they are not able to represent systems with positive or negative feedback, but some

extensions to the theory drop this prohibition (see Pearl, 2000).

3. There is some flexibility in how we choose to represent the device in Fig. 1. For exam-

ple, our description of the device specified that C’s operating always causes D to oper-

ate, but Eq. (1d) means that D operates if and only if C does. We could allow for the

possibility that D could operate even in the absence of C by replacing the fourth equa-

tion with D = C + UD - C * UD. Unobserved causes (UD) can turn D on. A similar

replacement would allow C to operate even when A or B is off. These changes, how-

ever, do not affect the predictions from pruning theory in the following experiments.

The original equations in (1) also seem the more natural way to represent the device

as we described it to participants (see Methods, Experiment 1), as we would normally regard it as odd to find, for example, D operating because of unobserved causes in such a

simple causal system.

4. A potential difficulty that affects both theories has to do with the basis for the Bayes net's causal relations. According to some theories (e.g., Woodward, 2003, 2007), these relations themselves depend on certain counterfactuals. Woodward's theory, for

example, posits that ‘‘If (a) there are possible interventions (ideal manipulations) that

change the value of X such that (b) if such an intervention (and no other) were to occur


X and Y would be correlated, then X causes Y'' (Woodward, 2007, p. 20). The prob-

lem is that if Bayes nets are founded on counterfactuals like these, then using Bayes

nets to determine the truth of counterfactuals may be circular (as Hall, 2007, suggests).

The theory has not succeeded in explaining counterfactuals in noncounterfactual

terms. For this reason, we will assume for the time being that the causal relations or

functional equations are given in some other way (e.g., by exploiting known laws

governing systems of this type). Of course, this objection does not affect our compari-

son of pruning and minimal-networks theory in the experiments that follow, as it

impacts the theories equally.

5. A similar difference appeared in Sloman and Lagnado’s (2005) Experiment 2 in the

context of a slightly more complex three-variable system. One group of participants

rated the answer to a straight counterfactual (e.g., What is the probability that A would have happened if B had not happened?), whereas a second group rated an explicit prevention counterfactual (Someone intervened directly on B, preventing it from happening. What is the probability that A would have happened?). The average probability

rating for the straight counterfactual was 3.2 on a 1–5 response scale (1 = very low

probability, 5 = very high probability), whereas the average was 3.9 for the preven-

tion version. Although Sloman and Lagnado do not compare these means statistically,

they do report that the first was not significantly higher than the scale midpoint (3.0),

whereas the second was significantly higher than the midpoint. (I am grateful to Mor-

teza Dehghani and Rumen Iliev for pointing out these differences; see Dehghani, Iliev,

& Kaufmann, 2007.)

6. The equation in (1c) describes the deterministic jointly caused device. For the

deterministic separately caused device, the simplest representation is C = A + B -

(A * B), which will be 1 when A or B or both equal 1, and 0 otherwise. Pruning

theory handles the probabilistic devices by means of the U variables. Thus, compo-

nent C in the probabilistic jointly caused device will have the equation

C = A * B * UC, and the probabilistic separately caused device the equation

C = [A + B - (A * B)] * UC, where UC is a binary random variable. Here, UC will

have the effect of turning off C on some proportion of trials (when UC = 0), even

when A and B are operating. This accords with our description that A and B ‘‘usu-

ally’’ cause C to operate.

7. Fig. 3 shows that the state in which both A and B are operating but C and D are not

has two intact variables (A and B) and one break variable (C). The state in which A is

operating but B, C, and D are not has only one intact variable (A) and one break vari-

able (B). Similarly for the state in which B is operating but A, C, and D are not. This

suggests that the first of these states should be minimal relative to the latter two, as it

has more intact variables (i.e., two) and the same number of break variables (i.e., one).

However, minimal-networks theory calculates minimality in terms of subset relations,

not counts of the number of intact or break nodes (Hiddleston, 2005). Because the set

of break variables in the first state, {C}, is not a subset of the set of break variables in

either of the latter two states, {A} or {B}, these networks are incomparable, and all

three states are minimal.


8. Pruning theory does allow a type of backward or abductive reasoning in determining

and then clamping the values of the U variables based on the actual state of affairs.

However, clamping UA (instead of changing A itself) leads to the wrong prediction.

Other (nonpruning) Bayes net models are in better agreement with the pattern of reasoning just described; see the discussion of ''transduction'' in Pearl (2000, p.

208), and Section 10 of the present article.

9. Pruning theory again gives us some leeway in representing the device in Fig. 4. It

is not possible for C to be off when either A or B is on, given our description

(i.e., A and B always turn on C). But it might be possible for C to be on when

both A and B are off, provided there are unobserved causes of C. Similarly, it

might be possible for B to be on when A is off, given unobserved causes of B.

We can allow for these possibilities by writing B = A + UB - A * UB and C = 1 -

(1 - A)(1 - B)(1 - UC) for the corresponding Eqs. (4). These changes, however, do

not affect the predictions in this experiment. To handle the counterfactuals, we

must set UA = 1 and unplug B, setting its value to 0. Then A = 1. Substituting in

the new equation for C still yields C = 1.

10. Predictions from minimal-networks theory for the probabilistic devices are subject to

the points mentioned in note 7. The state in which all components are off has one

break variable (A) and no intact variables. But the state in which A and C are on and

B is off has one break variable (B) and one intact variable (A). See Fig. 5. Because of

the way the theory calculates minimality, however, these states are incomparable and

both are minimal.

11. To see why this is so, take the probabilistic device in Table 4B for the low base rate

condition, Pr(A = 1) = .05. Let the conditional probability Pr(B = 1 | A = 1) =

p. (This gives the value of ‘‘sometimes’’ in the statement, ‘‘Component A’s

operating sometimes causes component B to operate.’’) The probability that B is on,

Pr(B = 1), is then .05p, assuming that B never comes on spontaneously. By Bayes

theorem, Pr(A = 1 | B = 0) is Pr(A = 1) * Pr(B = 0 | A = 1) / Pr(B = 0) = .05(1 - p) / (1 - .05p). This last quantity is never greater than .05 (which occurs when p = 0).

However, Table 4B shows that the corresponding proportion of ‘‘yes’’ responses

is .604 for the question If component B were not operating, would component A be operating?

12. The principles in Eq. (5a–d) run from causal necessity and sufficiency to relations on

the states of the variables. The converse generally does not hold. That is, observa-

tions about the states of a system typically do not entail facts about causal necessity

or sufficiency. This means that Eq. (5a,b) do not reduce causal sufficiency to the

material conditional, and Eq. (5a–d) do not reduce causal necessity and sufficiency

to the material biconditional.

13. Why would participants guess between ‘‘follows’’ and ‘‘doesn’t follow’’ when the

outcome of the reasoning process is indeterminate? Why wouldn’t they simply

respond ‘‘doesn’t follow’’? One answer is that participants may be unsure whether

their failure to find a determinate answer is due to the nature of the problem or to

their own inability to find a solution. If they are unsure of their own ability, then


guessing may be a reasonable last-resort response. (Imagine you are taking a multiple-

choice math test and are unable to compute an exact solution to a particular problem.

You could check ‘‘none of the above’’ if such an option is available. But if you are

unsure of your ability to find such a solution, then guessing among the sensible

response alternatives is a rational approach to your predicament.)

Acknowledgments

I thank Jennifer Asmuth, Rita Baglioli, Annum Bhullar, Eryka Nosal, and Eyal Sagi for

their help with these experiments. For helpful comments on this research, I am grateful to

Dan Bartels, Eric Hiddleston, Rumen Iliev, Barry Loewer, Andrea Proctor, Steven Sloman,

and to audiences at Brown, Indiana, Northwestern, Princeton, and Rutgers Universities. IES

grant R305A080341 and a fellowship from the Guggenheim Foundation helped support this

project.

References

Ahn, W.-K., & Graham, L. M. (1999). The impact of necessity and sufficiency in the Wason four-card selection

task. Psychological Science, 10, 237–242.

Andersen, S. K., Olesen, K. G., Jensen, F. V., & Jensen, E. (1989). HUGIN: a shell for building belief universes

for expert systems. Proceedings of the International Joint Conference on Artificial Intelligence, 2, 1080–

1085.

Bennett, J. (2003). A philosophical guide to conditionals. Oxford, England: Oxford University Press.

Byrne, R. M. J. (2005). The rational imagination: How people create alternatives to reality. Cambridge, MA:

MIT Press.

Cartwright, N. (2007). Counterfactuals in economics: A commentary. In J. K. Campbell, M. O’Rourke, &

H. Silverstein (Eds.), Causation and explanation (pp. 191–216). Cambridge, MA: MIT Press.

Collins, J. (2007). Counterfactuals, causation, and preemption. In D. Jacquette (Ed.), Philosophy of logic (pp. 1127–1144). Amsterdam: Elsevier.

Cummins, D. D. (1995). Naive theories and causal deduction. Memory & Cognition, 23, 646–658.

Dawid, A. P. (2007). Counterfactuals, hypotheticals and potential responses: A philosophical examination of

statistical causality. In F. Russo & J. Williamson (Eds.), Causality and probability in the sciences (pp. 503–

532). London: College Publications.

Dehghani, M., Iliev, R., & Kaufmann, S. (2007). Effects of fact mutability in the interpretation of counterfactu-

als. In D. S. Macnamara & J. G. Trafton (Eds.), Proceedings of the 29th Annual Conference of the Cognitive Science Society (pp. 941–946). Austin, TX: Cognitive Science Society.

Edgington, D. (2004). Counterfactuals and the benefit of hindsight. In P. Dowe & P. Noordhof (Eds.), Cause and chance: Causation in an indeterministic world (pp. 12–27). Abingdon, UK: Routledge.

Evans, J. St. B. T., Over, D. E., & Handley, S. J. (2007). Rethinking the model theory of conditionals. In

W. Schaeken, A. Vandierendonck, W. Schroyens, & G. d'Ydewalle (Eds.), The mental models theory of reasoning: Refinements and extensions (pp. 63–83). Mahwah, NJ: Erlbaum.

Gopnik, A., Glymour, C., Sobel, D. M., Schulz, L., Kushnir, T., & Danks, D. (2004). A theory of causal learning

in children: Causal maps and Bayes nets. Psychological Review, 111, 3–32.

Grizzle, J. E., Starmer, C. F., & Koch, G. G. (1969). Analysis of categorical data by linear models. Biometrics,

25, 489–504.


Hale, C. R., & Barsalou, L. W. (1995). Explanation content and construction during system learning and trouble-

shooting. Journal of the Learning Sciences, 4, 385–436.

Hall, N. (2007). Structural equations and causation. Philosophical Studies, 132, 109–136.

Halpern, J. Y., & Pearl, J. (2005). Causes and explanations: A structural-model approach. Part I: Causes. British Journal for the Philosophy of Science, 56, 843–887.

Hiddleston, E. (2005). A causal theory of counterfactuals. Nous, 39, 632–657.

Iatridou, S. (2000). The grammatical ingredients of counterfactuality. Linguistic Inquiry, 31, 231–270.

Isard, S. D. (1974). What would you have done if...? Theoretical Linguistics, 1, 233–255.

Jackson, F. (1977). A causal theory of counterfactuals. Australasian Journal of Philosophy, 55, 3–21.

Kahneman, D., & Varey, C. A. (1990). Propensities and counterfactuals: The loser that almost won. Journal of Personality and Social Psychology, 59, 1101–1110.

Lewis, D. (1973). Causation. Journal of Philosophy, 70, 556–567.

Lewis, D. (1979). Counterfactual dependence and time’s arrow. Nous, 13, 455–476.

Lewis, D. (2000). Causation as influence. Journal of Philosophy, 97, 182–197.

Nilsson, N. J. (1980). Principles of artificial intelligence. Palo Alto, CA: Tioga.

Pearl, J. (1988). Probabilistic reasoning in intelligent systems. San Mateo, CA: Morgan Kaufmann.

Pearl, J. (2000). Causality. Cambridge, England: Cambridge University Press.

Rehder, B., & Burnett, R. C. (2005). Feature inference and the causal structure of categories. Cognitive Psychology, 50, 264–314.

Rips, L. J. (2008). Causal thinking. In J. E. Adler & L. J. Rips (Eds.), Reasoning: Studies of human inference and its foundation (pp. 597–631). Cambridge, England: Cambridge University Press.

Roese, N. J. (1997). Counterfactual thinking. Psychological Bulletin, 121, 133–148.

Sloman, S. A., & Lagnado, D. A. (2005). Do we ‘‘do’’? Cognitive Science, 29, 5–39.

Spellman, B. A., & Mandel, D. R. (1999). When possibility informs reality: Counterfactual thinking as a cue to

causality. Current Directions in Psychological Science, 8, 120–123.

Staudenmayer, H. (1975). Understanding conditional reasoning with meaningful propositions. In R. J. Falmagne

(Ed.), Reasoning: Representation and process in children and adults (pp. 55–79). Hillsdale, NJ: Erlbaum.

Steyvers, M., Tenenbaum, J. B., Wagenmakers, E., & Blum, B. (2003). Inferring causal networks from observa-

tion and interventions. Cognitive Science, 27, 453–489.

Tetlock, P. E., & Henik, E. (2005). Theory- versus imagination-driven thinking about historical counterfactuals.

In D. R. Mandel, D. J. Hilton, & P. Catellani (Eds.), The psychology of counterfactual thinking (pp. 199–

216). London: Routledge.

Trabasso, T., & Sperry, L. L. (1985). Causal relatedness and importance of story events. Journal of Memory and Language, 24, 595–611.

Waldmann, M. R., & Hagmayer, Y. (2005). Seeing versus doing: Two modes of accessing causal knowledge.

Journal of Experimental Psychology: Learning, Memory, and Cognition, 31, 216–227.

Wells, G. L., & Gavanski, I. (1989). Mental simulation of causality. Journal of Personality and Social Psychology, 56, 161–169.

Woodward, J. (2003). Making things happen: A theory of causal explanation. Oxford, England: Oxford Univer-

sity Press.

Woodward, J. (2007). Interventionist theories of causation in psychological perspective. In A. Gopnik &

L. Schulz (Eds.), Causal learning (pp. 19–36). Oxford, England: Oxford University Press.

Appendix A: A psychological extension of the minimal-networks theory

We can modify minimal-networks theory by incorporating Assumptions (a–c) of Section

10 and then see whether this modified approach correctly predicts the results of Experiment

3. According to the assumptions, when participants in the necessity condition sample among


the legal antecedent networks, their likelihood of finding one in which the consequent is true

is given by the probability in (A1):

Prlegal,num(Cons | Ant) = num(legal antecedent-and-consequent nets) / num(legal antecedent nets),   (A1)

where num counts the number of legal networks in question. In the case of the deterministic

jointly caused device in Fig. 2, for example, there are three legal networks in which the

antecedent of If not C, then A is true, the three shown in the top and middle of the figure. Of

these networks, just one also makes the consequent true (the network at the middle left). So

the value of Prlegal,num(Cons|Ant) is one-third.

To accommodate guessing under Assumption (a) above, we can let pg represent the pro-

portion of trials on which guessing occurred. The predicted probability of a ‘‘follows’’

response can then be expressed as in (A2):

Pr(''follows'') = .5 pg + (1 - pg) Prlegal,num(Cons | Ant)   (A2)

The first term represents the ‘‘follows’’ responses due to guessing (assuming a

.5 chance of choosing ‘‘follows’’ when guessing occurs), and the second term, the

‘‘follows’’ responses due to participants relying instead on the conditional probability in

(A1).
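
As a concrete illustration, Eqs. (A1) and (A2) can be computed directly from the network counts worked out above for the deterministic jointly caused device and If not C, then A (three legal antecedent networks, one of which also satisfies the consequent). The short sketch below is an assumption-laden illustration of that calculation, evaluated at the guessing rate estimated later in this appendix.

```python
def pr_follows(n_ant_and_cons_nets, n_ant_nets, p_g):
    """Predicted probability of a 'follows' response, Eqs. (A1) and (A2)."""
    pr_legal_num = n_ant_and_cons_nets / n_ant_nets        # Eq. (A1)
    return 0.5 * p_g + (1 - p_g) * pr_legal_num            # Eq. (A2)

print(pr_follows(1, 3, p_g=0.197))
# ~0.366, i.e., the 36.6% 'follows' prediction for this device in Table A1.
```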

Although Eq. (A2) provides a reasonable account of the ‘‘follows’’ responses, as we will

see momentarily, we have already found that the ''yes/no'' responses depend on base rates, which are not reflected in (A1) or (A2). This suggests that participants in the yes/no condi-

tion tended to consider the probability of a network in their sampling; that is, they were

more likely to sample a network that base rates make more probable. We can estimate these

probabilities in Experiment 3 from the fact that the base rates were said to be independent.

For example, when the base rate of component A is high (.95) and the base rate of compo-

nent B is low (.05), the likelihood of a network in which component A is on and component

B is on is .95 · .05 = .0475; the likelihood that A is on and B is off is .95 · (1 - .05) = .9025; and so on. We can then formulate a counterpart to Eq. (A1) in which we sub-

stitute the probabilities of the relevant networks for simple counts of these networks. In

accord with Assumption (c), these yes/no participants focused on just the minimal networks.

If so, the relevant conditional probability that a minimal network of the antecedent is also a

minimal network of the consequent is (A3):

Prmin,prob(Cons | Ant) = Σi Pr(minimal antecedent-and-consequent neti) / Σj Pr(minimal antecedent netj)   (A3)

Equation (A3), however, predicts much more extreme dependence on base rates than

appears in the data, a problem we also encountered with standard Bayesian revision in


Section 10. For example, the deterministic jointly caused device has the two mini-

mal antecedent networks in the middle of Fig. 2, with the leftmost being a minimal

antecedent-and-consequent network (for the counterfactual If not C, then A). When the

base rate of A is low and that of B is high, the probability of the leftmost network

(where A is on and B is off) is .05 * (1 - .95) = .0025, and the probability of the right-

most network (where A is off and B is on) is (1 - .05) * .95 = .9025. The value of

Prmin,prob(Cons | Ant) is therefore .0025 / (.0025 + .9025) = .0028. But Table 3 shows that

the proportion of trials on which participants said ‘‘yes’’ was .417 in this condition. By

Assumption (a), participants may be guessing on some trials. But unless they are doing

so on virtually every trial, guessing will not explain this discrepancy. A more likely

explanation is that participants in the yes/no condition considered the base rates only

sometimes (using a measure like Eq. [A3]), whereas at other times they relied on sim-

pler counts of the networks (using a measure like Eq. [A1]). According to Assumption

(c), the analog to (A1) should sum the number of minimal networks rather than the

legal networks, a measure we can call ''Prmin,num(Cons | Ant).'' This leads to (A4) as the predicted probability of ''yes'' responses:

Pr(''Yes'') = .5 pg + (1 - pg)[h Prmin,num(Cons | Ant) + (1 - h) Prmin,prob(Cons | Ant)],   (A4)

where h is the probability of using the simpler rather than the more complex measure of

conditional probability and pg is again the probability of guessing.
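
The same worked example can be pushed through Eqs. (A3) and (A4). The sketch below is an illustration under the assumptions just stated: it uses the two minimal antecedent networks described above (A on and B off, probability .0025; A off and B on, probability .9025) and the parameter estimates reported in the next paragraph.

```python
def pr_yes(min_ant_cons_probs, min_ant_probs, n_min_ant_cons, n_min_ant, p_g, h):
    """Predicted probability of a 'yes' response, Eqs. (A3) and (A4)."""
    pr_min_prob = sum(min_ant_cons_probs) / sum(min_ant_probs)   # Eq. (A3)
    pr_min_num = n_min_ant_cons / n_min_ant                      # count-based analog of (A1)
    return 0.5 * p_g + (1 - p_g) * (h * pr_min_num + (1 - h) * pr_min_prob)   # Eq. (A4)

# Deterministic jointly caused device, A's base rate low and B's high,
# counterfactual If not C, then A: only the A-on/B-off network satisfies
# the consequent.
print(pr_yes([0.0025], [0.0025, 0.9025], 1, 2, p_g=0.197, h=0.681))
# ~0.373, i.e., the 37.3% 'yes' prediction for this condition in Table A1.
```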

To illustrate the result of these assumptions, I fit Eqs. (A2) and (A4) simultaneously to the

data from Experiment 3. This model fitting used a nonlinear least-squares procedure to pre-

dict the percentage of ‘‘yes’’ and ‘‘follows’’ responses in Table 3, treating pg and h as free

parameters. The predicted values from the equations (converted to percentages) appear in

Table A1 (in the rows labeled ''predicted minimal nets''), and they compare well with the

obtained results. The root mean square deviation (RMSD), corrected for the number of

parameters, is 6.78, and R2 is .97. (RMSD is calculated as the square root of the sum of squared error divided by 14, as there are 16 data points and two free parameters.) The estimated parameters were pg = .197

and h = .681. Thus, according to these values, guessing occurred about 20% of the time,

and participants in the yes/no condition used the simpler measure of conditional probability

on about 70% of trials.
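
The fitting procedure is not spelled out beyond being nonlinear least squares, so the sketch below is only one plausible reconstruction. It assumes a hypothetical predict(pg, h) function that returns the 16 percentages implied by Eqs. (A2) and (A4) and an array of the 16 observed percentages, and it uses scipy.optimize.least_squares in place of whatever routine was actually employed.

```python
import numpy as np
from scipy.optimize import least_squares

def fit_minimal_networks(observed, predict):
    """observed: NumPy array of the 16 percentages from Experiment 3 (Table A1).
    predict(pg, h): hypothetical function returning the 16 predicted percentages
    from Eqs. (A2) and (A4). Returns the estimates and the corrected RMSD."""
    def residuals(params):
        pg, h = params
        return predict(pg, h) - observed

    result = least_squares(residuals, x0=[0.5, 0.5], bounds=([0.0, 0.0], [1.0, 1.0]))
    pg_hat, h_hat = result.x
    # RMSD corrected for the two free parameters: sqrt(SSE / (16 - 2)).
    rmsd = np.sqrt(np.sum(result.fun ** 2) / (observed.size - 2))
    return pg_hat, h_hat, rmsd
```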

The theory of Eqs. (A2) and (A4) also helps explain some results from Experiment 3

that we noticed earlier. Recall that the analysis of this experiment found reliable

triple interactions of base rate, instructions, and structure, and of base rate, instructions,

and determinism. Base rates only matter for the yes/no condition, according to the

model, because base rates influence only the probability of particular networks via Eqs.

(A3) and (A4). To explain the interaction of base rates and instruction with structure,

note that any effect of base rate will be eliminated for the deterministic separately caused

device. Equation (A3) is always 0 here, as there are no minimal antecedent-and-conse-

quent networks. This will reduce the effect of base rates on ‘‘yes’’ responses for the

separately caused devices relative to the jointly caused ones, producing the interaction.


The same factor also serves to explain the interaction of base rates and instructions with

determinism.

Appendix B: Reasoning from causal principles

In Appendix A, we examined predictions of the minimal-networks theory for the

results of Experiment 3. To do the same for the principle-based model, we can develop

a set of equations for that experiment, based on the procedure outlined in the text. As

an example, take the deterministic separately caused device of Table 3C. Participants

should represent this device as one in which {A} is causally sufficient for C and {B}

is also sufficient for C. As the conditional's antecedent contains the information that C is not operating, the reasoner can apply rule (5b) with probability p and conclude that

A is not operating. Such a reasoner will conclude that the answer to the conditional

question (If C were not operating, would A be operating?) is ‘‘no’’ or that the condi-

tional assertion (If C were not operating, A would be operating) does not follow. With

probability 1 - p, however, the reasoner fails to apply the rule. If the response options

are follows/does not follow, the reasoner will ignore base-rate information and guess. Hence, the probability of a ''follows'' response will be .5(1 - p). In the yes/no condition, the reasoner will respond ''yes'' if the base rate is high with probability q and will guess otherwise. The probability of a ''yes'' response in the high base rate condition is therefore (1 - p)q + .5(1 - p)(1 - q), and the probability of ''yes'' in the low base rate condition is .5(1 - p). Similar equations can be developed for the other

conditions.
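
For the deterministic separately caused device just analyzed, those equations can be written down directly. The sketch below is an illustration only, evaluated at the rounded parameter estimates reported later in this appendix, so the values only approximate the Table A1 entries.

```python
# Closed-form predictions for the deterministic separately caused device of
# Table 3C (If C were not operating, would A be operating?), as derived above.
def separately_caused_predictions(p, q):
    follows = 0.5 * (1 - p)                            # rule fails, then guess
    yes_high = (1 - p) * q + 0.5 * (1 - p) * (1 - q)   # rule fails, base rate or guess
    yes_low = 0.5 * (1 - p)                            # rule fails, then guess
    return follows, yes_high, yes_low

print(separately_caused_predictions(p=0.82, q=0.78))
# ~ (0.09, 0.16, 0.09): close to the 9.2%, 16.3%, and 9.2% entries in Table A1
# (the reported p and q are rounded to two decimal places).
```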

This set of equations was fit to the data of Experiment 3 using the same nonlinear least-

squares technique employed in Appendix A, with p and q as free parameters. The model’s

predictions appear in Table A1 (rows labeled ‘‘predicted causal principles’’), where they

can be compared with the obtained data and to the predictions from the minimal-networks

theory. Overall, RMSD = 6.50, slightly better than RMSD (6.78) for the minimal-nets the-

ory. R2 = .93 for the current model, slightly worse than the minimal-networks theory's .97. These sum-

mary statistics and the comparison in Table A1 both suggest comparable fits for the two

theories (which have the same number of parameters). The obtained value for p is .82 and

the value for q is .78.

One advantage of the present model over that of Appendix A is that it also applies to the

results from Experiment 4. Because the single minimal network for the deterministic device

in Table 4A falsifies the conditional’s consequent, minimal-networks theory predicts low

rates of ‘‘yes’’ responses. In fact, participants responded ‘‘yes’’ on 35% of trials. The

present model is consistent with this result because participants sometimes fail to apply the

correct rules. An attempt to fit the model to the data in Table 4 produced reasonable results,

with RMSD = 7.89 and R2 = .91. However, the best fitting version has parameters substan-

tially lower than those of Experiment 3: p = .58 and q = .55. A potential reason for this

difference is mentioned in the text.


Table A1
Percentage of positive responses in Experiment 3 and predictions from the minimal-networks theory and the causal-principles theory

Jointly caused:                     a. Deterministic              b. Probabilistic
Frequency of operation:             ''Yes''       ''Follows''     ''Yes''       ''Follows''
High base rate (observed)           70.8 ± 9.3    47.8 ± 10.2     83.3 ± 7.6    52.2 ± 10.2
  (predicted minimal nets)          62.7          36.6            71.9          50.0
  (predicted causal principles)     77.6          50.0            77.6          50.0
Low base rate (observed)            41.7 ± 10.1   34.8 ± 9.7      50.0 ± 10.2   56.6 ± 10.1
  (predicted minimal nets)          37.3          36.6            47.6          50.0
  (predicted causal principles)     50.0          50.0            50.0          50.0

Separately caused:                  c. Deterministic              d. Probabilistic
Frequency of operation:             ''Yes''       ''Follows''     ''Yes''       ''Follows''
High base rate (observed)           12.5 ± 6.8    17.4 ± 7.7      79.2 ± 8.3    52.2 ± 10.2
  (predicted minimal nets)          9.8           9.8             71.9          50.0
  (predicted causal principles)     16.3          9.2             77.6          50.0
Low base rate (observed)            8.3 ± 5.6     8.7 ± 6.0       58.3 ± 10.1   47.8 ± 10.2
  (predicted minimal nets)          9.8           9.8             47.6          50.0
  (predicted causal principles)     9.2           9.2             50.0          50.0
